delta/ceph.git - github.com: ceph/ceph.git

	Commit message	Author	Age	Files	Lines
*	ceph, config, auth: better messages on failure to open keyring/ceph.conf (wip-5634)	Dan Mick	2013-07-16	7	-15/+31
\| \| \| \| \| \| \| \| \| \|	If something as simple as file ownership is wrong, Ceph commands and daemons can fail to run, and the diagnostics are not great. Improve that for at least the specific cases of unopenable keyring and ceph.conf files. Fixes: #5634 Signed-off-by: Dan Mick <dan.mick@inktank.com>
*	mon/MDSMonitor: make 'mds cluster_{up,down}' idempotent	Sage Weil	2013-07-16	1	-6/+2
\| \| \| \|	Signed-off-by: Sage Weil <sage@inktank.com>
*	osdmaptool: fix cli tests	Sage Weil	2013-07-16	2	-9/+9
\| \| \| \| \| \|	From the HASHPSPOOL change in acbc2f0bc0b4266125403aebb28e6e3a2365394d. Signed-off-by: Sage Weil <sage@inktank.com>
*	Merge branch 'wip-ceph-disk' into next	Sage Weil	2013-07-16	1	-47/+75
\|\ \| \| \| \| \| \| \| \|	Reviewed-by: Gary Lowell <gary.lowell@inktank.com> Tested-by: Jing Yuan Luke <jyluke@gmail.com>
\| *	ceph-disk: use /sys/block to determine partition device names	Sage Weil	2013-07-16	1	-1/+24
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Not all devices are basename + number; some have intervening character(s), like /dev/cciss/c0d1p2. Signed-off-by: Sage Weil <sage@inktank.com>
\| *	ceph-disk: reimplement is_partition() using /sys/block	Sage Weil	2013-07-16	1	-16/+9
\| \| \| \| \| \| \| \|	Signed-off-by: Sage Weil <sage@inktank.com>
\| *	ceph-disk: use get_dev_name() helper throughout	Sage Weil	2013-07-16	1	-4/+4
\| \| \| \| \| \| \| \| \| \| \| \|	This is more robust than the broken split trick. Signed-off-by: Sage Weil <sage@inktank.com>
\| *	ceph-disk: refactor list_[all_]partitions	Sage Weil	2013-07-16	1	-32/+19
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Make these methods work in terms of device names, not paths, and fix up the only direct list_partitions() caller to do the same. Signed-off-by: Sage Weil <sage@inktank.com>
\| *	ceph-disk: add get_dev_name, path helpers	Sage Weil	2013-07-16	1	-0/+25
\|/ \| \| \|	Signed-off-by: Sage Weil <sage@inktank.com>
*	mon/OSDMonitor: fix typo	Sage Weil	2013-07-16	1	-1/+1
\| \| \| \| \| \|	From 5eac38797d9eb5a59fcff1d81571cff7a2f10e66 Signed-off-by: Sage Weil <sage@inktank.com>
*	osd/OSDMonitor: make 'osd pool rmsnap ...' not racy/crashy	Sage Weil	2013-07-16	1	-24/+26
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Ensure that the snap does in fact exist before we try to remove it. This avoids a crash where we get two dup rmsnap requests (due to thrashing, or a reconnect, or something), the committed (p) value does have the snap, but the uncommitted (pp) does not. This fails the old test such that we try to remove it from pp again, and assert. Restructure the flow so that it is easier to distinguish the committed short return from the uncommitted return (which must still wait for the commit). 0> 2013-07-16 14:21:27.189060 7fdf301e9700 -1 osd/osd_types.cc: In function 'void pg_pool_t::remove_snap(snapid_t)' thread 7fdf301e9700 time 2013-07-16 14:21:27.187095 osd/osd_types.cc: 662: FAILED assert(snaps.count(s)) ceph version 0.66-602-gcd39d8a (cd39d8a6727d81b889869e98f5869e4227b50720) 1: (pg_pool_t::remove_snap(snapid_t)+0x6d) [0x7ad6dd] 2: (OSDMonitor::prepare_command(MMonCommand)+0x6407) [0x5c1517] 3: (OSDMonitor::prepare_update(PaxosServiceMessage)+0x1fb) [0x5c41ab] 4: (PaxosService::dispatch(PaxosServiceMessage)+0x937) [0x598c87] 5: (Monitor::handle_command(MMonCommand)+0xe56) [0x56ec36] 6: (Monitor::_ms_dispatch(Message)+0xd1d) [0x5719ad] 7: (Monitor::handle_forward(MForward)+0x821) [0x572831] 8: (Monitor::_ms_dispatch(Message)+0xe44) [0x571ad4] 9: (Monitor::ms_dispatch(Message)+0x32) [0x588c52] 10: (DispatchQueue::entry()+0x549) [0x7cf1d9] 11: (DispatchQueue::DispatchThread::entry()+0xd) [0x7060fd] 12: (()+0x7e9a) [0x7fdf35165e9a] 13: (clone()+0x6d) [0x7fdf334fcccd] NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this. Signed-off-by: Sage Weil <sage@inktank.com> Reviewed-by: Joao Eduardo Luis <joao.luis@inktank.com>
*	ObjectStore: add omap_rmkeyrange to dump	Samuel Just	2013-07-16	1	-0/+14
\| \| \| \| \|	Signed-off-by: Samuel Just <sam.just@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com>
*	OSD: add perfcounter tracking messages delayed pending a map	Samuel Just	2013-07-16	2	-0/+4
\| \| \| \| \|	Signed-off-by: Samuel Just <sam.just@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com>
*	FileStore: add a perf counter for time spent acquiring op queue throttle	Samuel Just	2013-07-16	2	-0/+5
\| \| \| \| \|	Signed-off-by: Samuel Just <sam.just@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com>
*	Merge branch 'wip-4779' into next	Sage Weil	2013-07-16	5	-36/+85
\|\ \| \| \| \| \| \|	Reviewed-by: Sage Weil <sage@inktank.com>
\| *	mon/OSDMonitor: return error if we can't set the new bucket's name	Sage Weil	2013-07-16	1	-1/+5
\| \| \| \| \| \| \| \| \| \|	Signed-off-by: Sage Weil <sage@inktank.com> Reviewed-by: Dan Mick <dan.mick@inktank.com>
\| *	crush: return EINVAL on invalid name from {insert,update,create_or_move}_item, set_item_name	Sage Weil	2013-07-16	2	-1/+14
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Signed-off-by: Sage Weil <sage@inktank.com> Reviewed-by: Dan Mick <dan.mick@inktank.com>
\| *	crush: add is_valid_crush_name() helper	Sage Weil	2013-07-16	2	-0/+20
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	[A-Za-z0-9-_.]+ Signed-off-by: Sage Weil <sage@inktank.com> Reviewed-by: Dan Mick <dan.mick@inktank.com>
\| *	MonCommands.h: use new validation for crush names (CephString goodchars)	Dan Mick	2013-07-12	1	-16/+17
\| \| \| \| \| \| \| \|	Signed-off-by: Dan Mick <dan.mick@inktank.com>
\| *	ceph_argparse.py: allow valid char RE arg to CephString	Dan Mick	2013-07-12	1	-8/+19
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Change badchars to goodchars (no one was using badchars); allow goodchars to be a RE character class of valid characters for the param. First use: crush item names. Signed-off-by: Dan Mick <dan.mick@inktank.com>
\| *	ceph_argparse: ignore prefix mismatches, but quit if non-prefix	Dan Mick	2013-07-12	1	-9/+9
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	I don't know what I was thinking; this was always the right validation algorithm, and I broke it trying to simplify. Signed-off-by: Dan Mick <dan.mick@inktank.com>
\| *	ceph_argparse.py: validate's 3rd arg is not verbose, it's partial	Dan Mick	2013-07-12	1	-1/+1
\| \| \| \| \| \| \| \|	Signed-off-by: Dan Mick <dan.mick@inktank.com>
* \|	Merge pull request #439 from yehudasa/wip-rgw-next	Gregory Farnum	2013-07-16	1	-0/+5
\|\ \ \| \| \| \| \| \| \| \| \|	rgw: quiet down ECANCELED on put_obj_meta() Reviewed-by: Greg Farnum <greg@inktank.com>
\| * \|	rgw: quiet down ECANCELED on put_obj_meta()	Yehuda Sadeh	2013-07-16	1	-0/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Fixes: #5439 ECANCELED there means that we lost in a race to write the object. We should treat it as a successful write. This is reviving an old behavior that was changed inadvertently. Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
* \| \|	mon: OSDMonitor: only thrash and propose if we are the leader	Joao Eduardo Luis	2013-07-16	1	-2/+8
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	'thrash_map' is only set if we are the leader, so we would thrash and propose the pending value if we are the leader. However, we should keep the 'is_leader()' check not only for clarity's sake (an unfamiliar reader may cry OMGBUG, prompting a patch much like this), but also because we may lose a subsequent election and become a peon instead, while still holding a 'thrash_map' value > 0 -- and we really don't want to propose while being a peon. Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com>
* \| \|	mon/MDSMonitor: make 'ceph mds remove_data_pool ...' idempotent	Sage Weil	2013-07-16	1	-0/+2
\| \| \| \| \| \| \| \| \| \| \| \|	Signed-off-by: Sage Weil <sage@inktank.com>
* \| \|	mon/OSDMonitor: clean up waiting_for_map messages on shutdown	Sage Weil	2013-07-16	4	-0/+23
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Do not leak these. Fixes: #5643 Signed-off-by: Sage Weil <sage@inktank.com> Reviewed-by: Greg Farnum <greg@inktank.com> Reviewed-by: Joao Eduardo Luis <joao.luis@inktank.com>
* \| \|	mon/OSDMonitor: send_to_waiting() in on_active()	Sage Weil	2013-07-16	1	-5/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The send_latest() helper may put a message in the waiting_for_map list if we are not readable, but currently send_to_waiting() is only called from update_from_paxos(), and it is possible that we may be unreadable but not get a map update. Instead, share the map when we are active. Do the same for check_subs(), which is also about sharing the new map. Leave share_map_with_random_osd() and process_failures() which are not concerned with whether this is the latest map or not. This problem surfaced when we changed the timing of refresh relative to paxos commit, since update_from_paxos() is now not normally called while readable; see f1ce8d7c955a2443111bf7d9e16b4c563d445712 and c711203c0d4b924e5951aa808b243bf06e7ad23a. Fixes: #5643 Signed-off-by: Sage Weil <sage@inktank.com> Reviewed-by: Greg Farnum <greg@inktank.com> Reviewed-by: Joao Eduardo Luis <joao.luis@inktank.com>
* \| \|	osd: do not enable HASHPSPOOL pool feature by default	Sage Weil	2013-07-16	1	-2/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This was added in kernel 3.9 and should not yet be enabled by default. Signed-off-by: Sage Weil <sage@inktank.com>
* \| \|	ceph-disk: rely on /dev/disk/by-partuuid instead of special-casing journal symlinks	Sage Weil	2013-07-16	1	-34/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This was necessary when ceph-disk-udev didn't create the by-partuuid (and other) symlinks for us, but now it is fragile and error-prone. (It also appears to be broken on a certain customer RHEL VM.) See d7f7d613512fe39ec883e11d201793c75ee05db1. Instead, just use the by-partuuid symlinks that we spent all that ugly effort generating. Backport: cuttlefish Signed-off-by: Sage Weil <sage@inktank.com> Reviewed-by: Dan Mick <dan.mick@inktank.com>
* \| \|	PendingReleaseNotes: formatted ceph CLI output and ceph-rest-api	Dan Mick	2013-07-16	1	-0/+11
\|/ / \| \| \| \| \| \|	Signed-off-by: Dan Mick <dan.mick@inktank.com>
* \|	mon: Monitor: StoreConverter: clearer debug message on 'needs_conversion()'	Joao Eduardo Luis	2013-07-16	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The previous debug message outputted the function's name, as often our functions do. This was however a source of bewilderment, as users would see those in logs and think their stores would need conversion. Changing this message is trivial enough and it will make ceph users happier log readers. Backport: cuttlefish Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com>
* \|	mon: Monitor: StoreConverter: sanitize 'store' pointer on init	Joao Eduardo Luis	2013-07-16	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We are supposed to have umount'ed the store and set the pointer to NULL. We should not tolerate any other case on init(). Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com>
* \|	mon: Monitor: do not reopen MonitorDBStore during conversion	Joao Eduardo Luis	2013-07-16	1	-1/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We already open the store in ceph_mon.cc, before we start the conversion. Given we are unable to reproduce this every time a conversion is triggered, we are led to believe that this causes a race in leveldb that will lead to 'store.db/LOCK' being locked upon the open this patch removes. Regardless, reopening the db here is pointless as we have already done so by the time we reach Monitor::StoreConverter::convert(). Fixes: #5640 Backport: cuttlefish Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com>
* \|	Merge pull request #438 from yehudasa/wip-rgw-next	Gregory Farnum	2013-07-16	3	-24/+45
\|\ \ \| \| \| \| \| \| \| \| \| \| \| \|	Fix an issue with bucket placements and with listing on new installations. Reviewed-by: Greg Farnum <greg@inktank.com>
\| * \|	rgw: handle ENOENT when listing bucket metadata entries	Yehuda Sadeh	2013-07-15	1	-2/+12
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Just return success (with an empty list) Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
\| * \|	rgw: fix bucket placement assignment	Yehuda Sadeh	2013-07-15	2	-22/+33
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When we set bucket.instance meta, we need to set the correct bucket placement to the bucket (according to the specific placement rule). However, it might be that bucket placement was never configured and we just go by the defaults, using the old legacy pools selection. Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
* \| \|	OSD: add config option for peering_wq batch size	Samuel Just	2013-07-15	3	-5/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Large peering_wq batch sizes may excessively delay peering messages, resulting in unreasonably long peering. Making the batch size configurable allows it to be reduced, which may speed up peering. Backport: cuttlefish Related: #5084 Signed-off-by: Samuel Just <sam.just@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com>
* \| \|	mon: make report pure json	Sage Weil	2013-07-15	1	-8/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Put the crc in the status string and drop the header and footer. If users want to capture it, ceph report 2>&1 > foo.txt Signed-off-by: Sage Weil <sage@inktank.com>
* \| \|	Merge remote-tracking branch 'gh/wip-mon-report' into next	Sage Weil	2013-07-15	10	-1/+42
\|\ \ \
\| * \| \|	mon: include some (basic) auth info in report	Sage Weil	2013-07-14	4	-0/+16
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Nothing privileged! Signed-off-by: Sage Weil <sage@inktank.com>
\| * \| \|	mon: include paxos info in report	Sage Weil	2013-07-14	3	-0/+18
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Signed-off-by: Sage Weil <sage@inktank.com>
\| * \| \|	mon: move quorum out of monmap	Sage Weil	2013-07-14	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Signed-off-by: Sage Weil <sage@inktank.com>
\| * \| \|	mon: include service first_committed in report	Sage Weil	2013-07-14	4	-0/+7
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Signed-off-by: Sage Weil <sage@inktank.com>
* \| \| \|	ceph: drop --threshold hack for 'pg dump_stuck'	Sage Weil	2013-07-15	4	-8/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We can live with the incompatibility here; the hack is currently not working anyway (see #5623). Signed-off-by: Sage Weil <sage@inktank.com> Reviewed-by: Dan Mick <dan.mick@inktank.com>
* \| \| \|	msg/Pipe: be a bit more explicit about encoding outgoing messages	Sage Weil	2013-07-15	1	-2/+8
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Signed-off-by: Sage Weil <sage@inktank.com>
* \| \| \|	messages/MClientReconnect: clear data when encoding	Sage Weil	2013-07-15	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The MClientReconnect puts everything in the data payload portion of the message and nothing in the front portion. That means that if the message is resent (socket failure or something), the messenger thinks it hasn't been encoded yet (front empty) and reencodes, which means everything gets added (again) to the data portion. Decoding keeps decoding until it runs out of data, so the second copy means we decode garbage snap realms, leading to the crash in the bug. Clearing data each time around resolves the problem, although it does mean we do the encoding work multiple times. We could alternatively (or also) stick some data in the front portion of the payload (ignored), but that changes the wire protocol and I would rather not do that. Fixes: #4565 Backport: cuttlefish Signed-off-by: Sage Weil <sage@inktank.com> Reviewed-by: Greg Farnum <greg@inktank.com>
* \| \| \|	Merge pull request #436 from ceph/wip-mon-fixes	Sage Weil	2013-07-15	6	-74/+103
\|\ \ \ \ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Wip mon fixes Reviewed-by: Greg Farnum <greg@inktank.com>
\| * \| \| \|	mon: set forwarded message recv stamp	Sage Weil	2013-07-15	1	-0/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Set it to the stamp of the MForward that carried us. One could argue we really want the original receive stamp on the origin, but that is not available to us, and this is better than nothing. In particular, this gives 'ceph log ...' commands a timestamp when they are forwarded via a peon. The stamp is still between when the request is sent and when it is committed/acked, so all is well from the client's perspective. Signed-off-by: Sage Weil <sage@inktank.com>
\| * \| \| \|	mon: drop win_election() _reset() kludge and strengthen assertions	Sage Weil	2013-07-15	1	-8/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This is only there for the benefit of win_standalone_election(), but it doesn't need it, it clutters the code, and weakens our assertions. Now the only win_election() callers are win_standalone_election() (which is a single path that just did _reset()) and from the elector. Signed-off-by: Sage Weil <sage@inktank.com>