delta/ceph.git - github.com: ceph/ceph.git

	Commit message	Author	Age	Files	Lines
*	Message,OSD,PG: make Connection::features private	Sage Weil	2013-07-19	1	-0/+2
\| \| \| \| \| \| \|	Use has_feature() method too. Signed-off-by: Samuel Just <sam.just@inktank.com> Signed-off-by: Sage Weil <sage@inktank.com>
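A minimal sketch of the encapsulation this commit describes, assuming a simplified `Connection` stand-in. The `set_features` setter and the mask values used with it are hypothetical and not taken from Ceph; only the idea of a private mask queried through `has_feature()` comes from the message above.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical, simplified stand-in for a connection object; not the Ceph class.
class Connection {
  uint64_t features = 0;  // private: callers cannot read or modify the raw mask

public:
  // Hypothetical setter, e.g. called once a handshake has negotiated features.
  void set_features(uint64_t f) { features = f; }

  // True only if every bit in `f` is present in the negotiated mask.
  bool has_feature(uint64_t f) const {
    assert(f != 0);  // asking about "no feature" is a caller bug
    return (features & f) == f;
  }
};
```

Keeping the mask private means a later change to its representation only touches this class, not every call site that used to test bits directly.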
*	Merge branch 'wip-small-object-recovery'	Samuel Just	2013-07-08	1	-0/+4
\|\ \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Conflicts: src/include/ceph_features.h Reviewed-by: Sage Weil <sage@inktank.com> Fixes: #5278
\| *	messages/,osd_types: add messages for Push, PushReply, Pull	Samuel Just	2013-07-08	1	-0/+4
\| \| \| \| \| \| \| \|	Signed-off-by: Samuel Just <sam.just@inktank.com>
* \|	mon: implement simple 'scrub' command	Sage Weil	2013-07-08	1	-0/+1
\|/ \| \| \| \| \| \| \| \| \| \| \| \| \|	Compare all keys within the sync'ed prefixes across members of the quorum and compare the key counts and CRC for inconsistencies. Currently this is a one-shot inefficient hammer. We'll want to make this work in chunks before it is usable in production environments. Protect with a feature bit to avoid sending MMonScrub to mons who can't decode it. Signed-off-by: Sage Weil <sage@inktank.com> Reviewed-by: Greg Farnum <greg@inktank.com>
*	msgr: queue reset exactly once on any connection	Sage Weil	2013-06-13	1	-1/+3
\| \| \| \| \| \| \| \| \| \| \| \|	Use the atomic pipe link removal as a signal that we are the one failing the con and use that to queue the reset event. This fixes the case where we have an open, the session gets set up via the handle_accept callback, and then race with another connection and go into wait + close, or just close. In that case, fault() needs to queue a reset event to match the accept. Signed-off-by: Sage Weil <sage@inktank.com>
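The "exactly once" idea can be sketched as follows, assuming the link is modelled as a `std::atomic` pointer (an assumption for illustration; the message does not say how the atomicity is achieved). `Conn`, `clear_pipe`, `fault`, and `resets_queued` are hypothetical names: whichever caller actually removes the link is the one that queues the reset.

```cpp
#include <atomic>
#include <cassert>

struct Pipe {};

// Hypothetical stand-in for a connection that links to at most one Pipe.
struct Conn {
  std::atomic<Pipe*> pipe{nullptr};
  std::atomic<int> resets_queued{0};

  // Atomically unlink `p`, but only if it is still the linked Pipe.
  // Returns true for exactly one caller per linked Pipe.
  bool clear_pipe(Pipe* p) {
    assert(p != nullptr);
    Pipe* expected = p;
    return pipe.compare_exchange_strong(expected, nullptr);
  }

  // Called when Pipe `p` fails; only the caller that wins the unlink
  // queues the reset event (modelled here as a counter).
  void fault(Pipe* p) {
    if (clear_pipe(p))
      ++resets_queued;
  }
};
```

A second `fault()` on the same Pipe, or a `fault()` from a Pipe that is not the linked one, loses the compare-and-swap and queues nothing.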
*	msgr: use ConnectionRef throughout	Sage Weil	2013-06-13	1	-12/+9
\| \| \| \| \| \| \| \| \| \| \| \|	Make RefCountedObject a private parent of Connection so that users are forced to use ConnectionRef whenever references are taken. Many methods can still take a raw Connection* when they are using the caller's reference but not taking their own; this is cheaper than twiddling the reference count, and the lifetime is still well defined. Local variables generally use ConnectionRef, though. Signed-off-by: Sage Weil <sage@inktank.com>
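A sketch of why a private ref-counting parent forces callers through a handle type. Everything here is a simplified, hypothetical stand-in: `RefCounted` mimics a `get()`/`put()` counter, and `ConnectionRef` is a hand-written handle rather than the smart pointer type the real code uses.

```cpp
#include <atomic>
#include <cassert>
#include <utility>

// Hypothetical minimal ref-count base with get()/put().
class RefCounted {
  std::atomic<int> nref{0};

public:
  virtual ~RefCounted() = default;
  void get() { ++nref; }
  void put() {
    assert(nref > 0);
    if (--nref == 0)
      delete this;
  }
  int refs() const { return nref.load(); }
};

// Private inheritance hides get()/put() from ordinary users of Connection.
class Connection : private RefCounted {
  friend class ConnectionRef;  // only the handle may touch the count

public:
  int peer_id = 0;
};

// Hypothetical handle: the only way to take or drop a reference.
class ConnectionRef {
  Connection* p = nullptr;

public:
  ConnectionRef() = default;
  explicit ConnectionRef(Connection* c) : p(c) { if (p) p->get(); }
  ConnectionRef(const ConnectionRef& o) : p(o.p) { if (p) p->get(); }
  ConnectionRef& operator=(ConnectionRef o) { std::swap(p, o.p); return *this; }
  ~ConnectionRef() { if (p) p->put(); }

  Connection* raw() const { return p; }  // borrow without touching the count
  Connection* operator->() const { return p; }
  int use_count() const { return p ? p->refs() : 0; }
};
```

A function that only borrows the caller's reference can accept the result of `raw()`, which matches the message's point that passing a raw pointer is cheaper than adjusting the count when the lifetime is already guaranteed.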
*	Merge remote-tracking branch 'yan/wip-mds'	Sage Weil	2013-05-29	1	-0/+2
\|\ \| \| \| \| \| \| \| \| \| \| \| \|	Reviewed-by: Sage Weil <sage@inktank.com> Conflicts: src/mds/MDCache.cc
\| *	mds: open inode by ino	Yan, Zheng	2013-05-28	1	-0/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch adds "open-by-ino" helper. It utilizes backtrace to find inode's path and open the inode. The algorithm looks like: 1. Check MDS peers. If any MDS has the inode in its cache, goto step 6. 2. Fetch backtrace. If backtrace was previously fetched and get the same backtrace again, return -EIO. 3. Traverse the path in backtrace. If the inode is found, goto step 6; if non-auth dirfrag is encountered, goto next step. If fail to find the inode in its parent dir, goto step 1. 4. Request MDS peers to traverse the path in backtrace. If the inode is found, goto step 6. If MDS peer encounters non-auth dirfrag, it stops traversing. If any MDS peer fails to find the inode in its parent dir, goto step 1. 5. Use the same algorithm to open the inode's parent. Goto step 3 if succeeds; goto step 1 if fails. 6. return the inode's auth MDS ID. The algorithm has two main assumptions: 1. If an inode is in its auth MDS's cache, its on-disk backtrace can be out of date. 2. If an inode is not in any MDS's cache, its on-disk backtrace must be up to date. Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>

* \|	osd: ping both front and back interfaces	Sage Weil	2013-05-22	1	-0/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Send ping requests to both the front and back hb addrs for peer osds. If the front hb addr is not present, do not send it and interpret a reply as coming from both. This handles the transition from old to new OSDs seamlessly. Note both the front and back rx times. Both need to be up to date in order for the peer to be healthy. Signed-off-by: Sage Weil <sage@inktank.com>
* \|	msgr: add Messenger reference to Connection	Sage Weil	2013-05-22	1	-1/+4
\|/ \| \| \| \| \|	This allows us to get the messenger associated with a connection. Signed-off-by: Sage Weil <sage@inktank.com>
*	msgr: add second per-message throttler to message policy	Sage Weil	2013-04-06	1	-16/+34
\| \| \| \| \| \| \| \| \| \|	We already have a throttler that lets us limit the amount of memory consumed by messages from a given source. Currently this is based only on the size of the message payload. Add a second throttler that limits the number of messages so that we can effectively throttle small requests as well. Signed-off-by: Sage Weil <sage@inktank.com>
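A non-blocking sketch of the two-throttler idea, assuming a simplified `Throttle` with `try_get`/`put` and a hypothetical `Policy` holding one throttle for bytes and one for message count. A real throttler would typically block the reader until budget is available; this version just reports admit or reject.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical counting throttle: tracks how much of a budget is in use.
class Throttle {
  uint64_t max;
  uint64_t cur = 0;

public:
  explicit Throttle(uint64_t m) : max(m) {}

  // Take `n` units if they fit in the remaining budget.
  bool try_get(uint64_t n) {
    if (cur + n > max)
      return false;
    cur += n;
    return true;
  }

  // Return `n` units once the message has been processed.
  void put(uint64_t n) {
    assert(cur >= n);
    cur -= n;
  }
};

// Hypothetical per-source policy: one budget for bytes, one for message count.
struct Policy {
  Throttle bytes;
  Throttle messages;
};

// Admit a message only if both budgets have room; undo the first on failure.
bool admit(Policy& p, uint64_t payload_len) {
  if (!p.messages.try_get(1))
    return false;
  if (!p.bytes.try_get(payload_len)) {
    p.messages.put(1);
    return false;
  }
  return true;
}
```

With only the byte throttle, a flood of one-byte requests would barely dent the budget; the count throttle is what stops it.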
*	messages: add MOSDMarkMeDown	Samuel Just	2013-03-21	1	-0/+1
\| \| \| \|	Signed-off-by: Samuel Just <sam.just@inktank.com>
*	mon: HealthMonitor: Keep track of monitor cluster's health	Joao Eduardo Luis	2013-03-18	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The HealthMonitor builds upon the QuorumService interface, and should be used to keep track of all and any relevant information about the monitor cluster (maybe even about all the cluster if need be). This patch also introduces the HealthService interface, used to define a HealthMonitor service, responsible for dispatching 'MMonHealth' messages (the QuorumService interface dispatches generic 'Message'). Based on the HealthService interface, we introduce the DataHealthService class, a service that will track disk space consumption by the monitors, warn when a given threshold is crossed, and gracefully shutdown the monitor if disk space usage hits critical levels that might affect the correct monitor behavior. Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com>
*	Merge branch 'wsp.bobtail.2merge' into wsp.bobtail.master	Joao Eduardo Luis	2013-02-21	1	-0/+1
\|\ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Conflicts: src/.gitignore src/Makefile.am src/include/ceph_features.h src/mon/MDSMonitor.cc src/mon/PGMonitor.cc
\| *	message: MMonSync: Monitor Synchronization message	Joao Eduardo Luis	2013-02-21	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The monitor's synchronization process requires a specific message type to carry the required information. Since this process significantly differs from slurping, reusing the MMonProbe message is not an option as it would require major changes and, for all intents and purposes, it would be far outside the scope of the MMonProbe message. Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com>
* \|	msg/Message.h: fix C-style pointer casting	Danny Al-Gaaf	2013-02-06	1	-2/+2
\|/ \| \| \| \| \|	Replace C-style pointer casting with correct static_cast<>(). Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
*	osd: use Message::get_cost() function for queueing	Sage Weil	2013-01-22	1	-0/+4
\| \| \| \| \| \|	The data payload is a decent proxy for cost in most cases, but not all. Signed-off-by: Sage Weil <sage@inktank.com>
*	messages: add MTimeCheck	Joao Eduardo Luis	2013-01-11	1	-0/+2
\| \| \| \| \|	Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com>
*	features is uint64_t	Sage Weil	2012-12-27	1	-5/+5
\| \| \| \| \| \| \|	This won't bite us for a while yet (we're on bit 26), but it will soon! Signed-off-by: Sage Weil <sage@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
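A small illustration of the hazard the message alludes to: once a feature bit at position 32 or above exists, any code path that stores the mask in a 32-bit integer silently drops it. The helper names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Build a single-bit feature mask in a 64-bit type.
inline uint64_t feature_bit(unsigned n) {
  assert(n < 64);
  return uint64_t{1} << n;
}

// True if the mask is unchanged after a round trip through a 32-bit variable.
inline bool survives_32bit(uint64_t mask) {
  return static_cast<uint32_t>(mask) == mask;
}
```

Bit 26 fits in 32 bits, so narrowing is harmless today; bit 32 would be lost, which is why every variable that carries the mask has to be `uint64_t`.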
*	osd/: make OSDService messenger helpers return ConnectionRef	Samuel Just	2012-11-30	1	-0/+6
\| \| \| \|	Signed-off-by: Samuel Just <sam.just@inktank.com>
*	common: add RefCountedObj.cc with intrusive_ptr hooks	Samuel Just	2012-11-13	1	-4/+0
\| \| \| \|	Signed-off-by: Samuel Just <sam.just@inktank.com>
*	message: add MRecoveryReserve	Mike Ryan	2012-11-01	1	-0/+3
\| \| \| \| \| \| \|	This message will be used to reserve and release recovery slots on replica PGs. Signed-off-by: Mike Ryan <mike.ryan@inktank.com>
*	osd/: add backfill reservations	Samuel Just	2012-09-25	1	-0/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Previously, a new osd would be bombarded by backfills from many osds simultaneously, resulting in excessively high load. Instead, we want to limit the number of backfills coming into and going out from a single osd. To that end, each OSDService now has two AsyncReserver instances: one for backfills going from the osd (local_reserver) and one for backfills going to the osd (remote_reserver). For a primary to initiate a backfill, it must first obtain a reservation from its own local_reserver. Then, it must obtain a reservation from the backfill target's remote_reserver via an MBackfillReserve message. This process is managed by substates of Active and ReplicaActive (see the changes in PG.h). The reservations are dropped either on the Backfilled event (which is sent on the primary before calling recovery_complete, and on the replica on receipt of the BackfillComplete progress message), or upon leaving Active or ReplicaActive. It's important that we always grab the local reservation before the remote reservation in order to prevent a circular dependency. Signed-off-by: Samuel Just <sam.just@inktank.com>
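The local-then-remote ordering can be sketched with a synchronous toy model. This is an assumption-laden simplification: the reservers described above are asynchronous and the remote request travels in a message, whereas the hypothetical `Reserver`, `Osd`, `start_backfill`, and `finish_backfill` below simply succeed or fail immediately.

```cpp
#include <cassert>

// Hypothetical synchronous slot counter standing in for an async reserver.
class Reserver {
  unsigned max;
  unsigned in_use = 0;

public:
  explicit Reserver(unsigned m) : max(m) {}
  bool try_reserve() {
    if (in_use >= max)
      return false;
    ++in_use;
    return true;
  }
  void release() {
    assert(in_use > 0);
    --in_use;
  }
  unsigned used() const { return in_use; }
};

// Hypothetical OSD with one reserver per backfill direction.
struct Osd {
  Reserver local_reserver;   // backfills going out from this osd
  Reserver remote_reserver;  // backfills coming into this osd
};

// Always take the primary's local slot first, then the target's remote slot.
// If the remote side is full, give the local slot back.
bool start_backfill(Osd& primary, Osd& target) {
  if (!primary.local_reserver.try_reserve())
    return false;
  if (!target.remote_reserver.try_reserve()) {
    primary.local_reserver.release();
    return false;
  }
  return true;
}

// Drop both reservations once the backfill completes or is abandoned.
void finish_backfill(Osd& primary, Osd& target) {
  target.remote_reserver.release();
  primary.local_reserver.release();
}
```

Because every primary acquires in the same order, no primary ever holds a remote slot while waiting for its own local slot, which is the circular wait the message warns about.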
*	Merge branch 'next'	Samuel Just	2012-08-15	1	-4/+2
\|\
\| *	PG,Message: move intrusive_ptr_* into top namespace	Samuel Just	2012-08-15	1	-4/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	gcc 4.7 requires that the intrusive_ptr_* functions be in the same namespace as the templated class. Signed-off-by: Samuel Just <sam.just@inktank.com>
* \|	Merge branch 'wip-msgr'	Sage Weil	2012-08-13	1	-3/+24
\|\ \ \| \|/ \|/\|
\| *	msgr: do not reopen failed lossy Connections	Sage Weil	2012-07-20	1	-0/+12
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	There was a race where: - sending stuff to a lossy Connection - it fails, and queues itself for reap, queues a RESET event - reaper clears the Pipe - some thread queues new messages and the Pipe is reopened, messages sent - RESET event delivered to dispatch, connection is closed and reopened. The result was that messages got sent to the OSD out of order during the window between the fault() and ms_handle_reset() getting called. This will prevent that. Signed-off-by: Sage Weil <sage@inktank.com>
\| *	msg/Connection: add failed flag for lossy Connections	Sage Weil	2012-07-20	1	-3/+12
\| \| \| \| \| \| \| \| \| \| \| \|	If a lossy Connection fails and we disconnect the Pipe, set a failed flag. Signed-off-by: Sage Weil <sage@inktank.com>
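A sketch of how a failed flag closes the reordering window described in the two commits above. The `Connection` fields and the `fault`/`send` helpers are hypothetical; the point is only that a lossy connection that has faulted refuses to reopen, so nothing can be sent between the fault and the reset being handled.

```cpp
#include <cassert>

// Hypothetical, simplified connection state.
struct Connection {
  bool lossy;
  bool failed = false;   // set once a lossy connection loses its pipe
  bool has_pipe = true;
  int sent = 0;

  explicit Connection(bool l) : lossy(l) {}
};

// The pipe failed: disconnect it and, for lossy connections, mark the failure.
void fault(Connection& c) {
  c.has_pipe = false;
  if (c.lossy)
    c.failed = true;
}

// Queue a message. A failed lossy connection drops it instead of reopening;
// other connections reopen their pipe on demand.
bool send(Connection& c) {
  if (c.failed)
    return false;
  if (!c.has_pipe)
    c.has_pipe = true;
  assert(c.has_pipe);  // messages only ever go out over a live pipe
  ++c.sent;
  return true;
}
```

The caller is expected to obtain a fresh connection after handling the reset rather than keep using the failed one.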
* \|	PG: compound messages must carry epoch_sent for each part	Samuel Just	2012-07-05	1	-0/+6
\|/ \| \| \| \| \| \| \|	Query and Notify messages include logical messages from multiple pgs. Each logical message (pg_query_t and pg_notify_t) now contains an epoch_sent. Signed-off-by: Samuel Just <sam.just@inktank.com>
*	misc assert #include cleanup, hackery	Sage Weil	2012-06-07	1	-19/+10
\| \| \| \|	Signed-off-by: Sage Weil <sage@inktank.com>
*	msg: make clear_pipe work only on a given Pipe, rather than the current one.	Greg Farnum	2012-06-03	1	-3/+3
\| \| \| \| \| \| \| \| \| \| \|	This way old Pipes that have been replaced can't clear the new Pipe out of a Connection's link. We might attempt to instead sever the link between CLOSED Pipes and their Connections more completely (eg, when the Connection gets a new Pipe), but that will require more work to handle all the cases, and this works for now. Signed-off-by: Greg Farnum <greg@inktank.com>
*	src: get rid of the Observers throughout the code base.	Joao Eduardo Luis	2012-05-17	1	-2/+2
\| \| \| \| \| \| \| \|	This is a big patch that will remove all references to the observers throughout the code, including a complete removal of the Observer-related messages' source files. Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com>
*	RefCountedObject: relocate from msg/Message.h to common/RefCountedObj.h	Yehuda Sadeh	2012-04-25	1	-80/+2
\| \| \| \| \| \|	Following a popular request. Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>
*	librados: call notification under different thread context	Yehuda Sadeh	2012-04-25	1	-0/+63
\| \| \| \| \| \| \| \| \| \|	This fixes #2342. We shouldn't call notify on the dispatcher context. We should also make sure that we don't hold the client lock while waiting for the responses. Also, pushed the client_lock locking into the ctx->notify(). Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>
*	OSD: improve information and format of OSDTracker messages	Samuel Just	2012-04-13	1	-5/+8
\| \| \| \| \| \| \|	Also, Message now has a timestamp indicating when the message was fully received for use by OSDTracker. Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
*	messages: populate header.version in constructor	Sage Weil	2012-02-10	1	-5/+10
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Define a HEAD_VERSION and COMPAT_VERSION for any versioned message. Pass to Message constructor so that it is always initialized, even from the default constructor. That's needed because we use that to check decoding compatibility when receiving/decoding messages. If we are conditionally encoding an old version, explicitly set header.version in encode_payload(). We also set compat_version to demonstrate what will happen for future revisions. In this case, it's moot, because no old code understands compat_version yet: nobody with old decode code will see these values anyway. But use this opportunity to demonstrate how it would be used in the future. Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
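A sketch of the pattern, assuming a simplified header and a hypothetical `MExample` message type. The `can_decode` rule shown (the receiver must understand at least `compat_version`) is one plausible reading of "check decoding compatibility", not a statement of the exact check in the source tree.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical, simplified message header.
struct Header {
  uint16_t version = 0;
  uint16_t compat_version = 0;
};

class Message {
protected:
  Header header;

public:
  // Versions are passed at construction so the header is never uninitialized.
  Message(uint16_t version, uint16_t compat_version) {
    assert(compat_version <= version);
    header.version = version;
    header.compat_version = compat_version;
  }
  const Header& get_header() const { return header; }
};

// Hypothetical versioned message type.
class MExample : public Message {
public:
  static constexpr uint16_t HEAD_VERSION = 3;    // newest encoding we produce
  static constexpr uint16_t COMPAT_VERSION = 2;  // oldest decoder that can read it

  MExample() : Message(HEAD_VERSION, COMPAT_VERSION) {}
};

// A receiver can decode if it understands at least the sender's compat_version.
bool can_decode(const Header& h, uint16_t my_head_version) {
  return h.compat_version <= my_head_version;
}
```

Bumping only `HEAD_VERSION` signals an addition older decoders can skip; bumping `COMPAT_VERSION` as well signals a change they cannot read.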
*	Merge branch 'wip-encoding'	Sage Weil	2012-02-02	1	-19/+21
\|\ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Reviewed-by: Josh Durgin <josh.durgin@dreamhost.com> Conflicts: src/msg/Message.h src/osd/OSD.cc src/osd/ReplicatedPG.cc src/osd/ReplicatedPG.h
\| *	msg: implement Message::dump()	Sage Weil	2012-01-30	1	-0/+2
\| \| \| \| \| \| \| \| \| \| \| \|	Just wrap print() for now. Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
\| *	msg: go const-crazy on messages	Sage Weil	2012-01-30	1	-13/+13
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	- get_type_name() - print() and all the random crap they call. Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
\| *	msg: no cct for decode_payload	Sage Weil	2012-01-30	1	-1/+1
\| \| \| \| \| \| \| \|	Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
\| *	msg: no cct needed for message encoding	Sage Weil	2012-01-30	1	-3/+3
\| \| \| \| \| \| \| \|	Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
\| *	msg: pass features explicitly into message encoders	Sage Weil	2012-01-30	1	-3/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Avoid using the connection reference; pass it in explicitly instead. This will make ceph-dencoder's life a bit easier. Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
* \|	msgr: Document recv_stamp and add a dispatch_stamp and throttle_wait.	Greg Farnum	2012-01-31	1	-1/+14
\|/ \| \| \|	Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
*	osd: add MOSDPGBackfill message	Sage Weil	2011-12-14	1	-0/+1
\| \| \| \|	Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
*	osd: MOSDPGScan	Sage Weil	2011-12-14	1	-0/+1
\| \| \| \| \| \|	Message to query hash ranges of a PG. Signed-off-by: Sage Weil <sage@newdream.net>
*	mon: allow monitor to automagically join cluster	Sage Weil	2011-11-11	1	-0/+1
\| \| \| \| \| \| \| \|	If a monitor starts up with the correct fsid and auth keys, it will now add itself to the monmap (and subsequently try to join the quorum) if it is not already in the monmap. Signed-off-by: Sage Weil <sage@newdream.net>
*	mon: slurp latest state from active monitors before joining quorum	Sage Weil	2011-11-07	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	If a monitor has been down and is behind, and joins the quorum, the other nodes will try to send it all of the needed state, which can bring the cluster to a halt. Instead, implement a new bootstrap() procedure: - probe the cluster nodes - if there is an existing quorum, - and it is not too far ahead of me, join it (call an election) - otherwise, slurp down all the newer state and then restart (bootstrap) - if we see enough online nodes that are not part of the quorum, call an election. We still need to add some timeouts. Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
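The decision made after probing can be sketched as a pure function. All names and parameters here are hypothetical: in particular `max_lag` stands in for whatever "not too far ahead of me" means, and `needed` for "enough online nodes", neither of which the message quantifies.

```cpp
#include <cassert>
#include <cstdint>

enum class Action { JoinQuorum, Slurp, CallElection, KeepProbing };

// Decide what a bootstrapping monitor does once probe replies are in.
//   quorum_exists    - some probed peer reported an existing quorum
//   my_version       - newest state this monitor holds
//   quorum_version   - newest state the quorum holds
//   max_lag          - hypothetical threshold for "not too far ahead of me"
//   outside_quorum   - online monitors seen that are not in a quorum
//   needed           - hypothetical count that is "enough" to call an election
Action decide(bool quorum_exists, uint64_t my_version, uint64_t quorum_version,
              uint64_t max_lag, unsigned outside_quorum, unsigned needed) {
  assert(needed > 0);
  if (quorum_exists) {
    if (quorum_version <= my_version + max_lag)
      return Action::JoinQuorum;  // close enough: join by calling an election
    return Action::Slurp;         // too far behind: fetch state, then restart
  }
  if (outside_quorum >= needed)
    return Action::CallElection;
  return Action::KeepProbing;
}
```

Slurping before joining keeps the catch-up traffic off the quorum's critical path, which is the stall the message describes.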
*	msg: add MCommand, MCommandReply message types	Sage Weil	2011-10-14	1	-0/+2
\| \| \| \| \| \| \| \|	These are similar to MMonCommand[Ack], but aren't PaxosServiceMessage children, don't include the command in the reply (useless), have a more generic name. Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
*	osd: discard requests from disconnected clients	Sage Weil	2011-10-07	1	-0/+4
\| \| \| \| \| \|	If we can't reply, throw out the request; they'll need to resend it anyway. Signed-off-by: Sage Weil <sage@newdream.net>
*	msg: remove globals	Colin Patrick McCabe	2011-06-20	1	-7/+8
\| \| \| \|	Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>