path: root/src/msg/Message.h
Commit message  [Author, Age, Files, Lines]
* Message,OSD,PG: make Connection::features private  [Sage Weil, 2013-07-19, files: 1, lines: -0/+2]
|   Use has_feature() method too.
|   Signed-off-by: Samuel Just <sam.just@inktank.com>
|   Signed-off-by: Sage Weil <sage@inktank.com>
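The effect of this change can be sketched roughly as follows. This is a minimal illustration, not the actual Ceph class: the feature constants, `set_features()`, and the exact test inside `has_feature()` are assumptions made for the example.

```cpp
#include <cstdint>

// Hypothetical feature bits for illustration; not Ceph's real values.
constexpr uint64_t FEATURE_EXAMPLE_A = 1ULL << 0;
constexpr uint64_t FEATURE_EXAMPLE_B = 1ULL << 1;

class Connection {
  uint64_t features = 0;  // private: callers can no longer test bits directly

public:
  void set_features(uint64_t f) { features = f; }  // assumed setter

  // True only if every bit in f is set on this connection (assumed semantics).
  bool has_feature(uint64_t f) const { return (features & f) == f; }
};
```

Callers would then write `con.has_feature(FEATURE_EXAMPLE_A)` instead of masking a public member themselves.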
* Merge branch 'wip-small-object-recovery'  [Samuel Just, 2013-07-08, files: 1, lines: -0/+4]
|\
| |   Conflicts: src/include/ceph_features.h
| |   Reviewed-by: Sage Weil <sage@inktank.com>
| |   Fixes: #5278
| * messages/,osd_types: add messages for Push, PushReply, Pull  [Samuel Just, 2013-07-08, files: 1, lines: -0/+4]
| |   Signed-off-by: Samuel Just <sam.just@inktank.com>
* | mon: implement simple 'scrub' command  [Sage Weil, 2013-07-08, files: 1, lines: -0/+1]
|/
|   Compare all keys within the sync'ed prefixes across members of the quorum and compare the key counts and CRC for inconsistencies.
|   Currently this is a one-shot inefficient hammer. We'll want to make this work in chunks before it is usable in production environments.
|   Protect with a feature bit to avoid sending MMonScrub to mons who can't decode it.
|   Signed-off-by: Sage Weil <sage@inktank.com>
|   Reviewed-by: Greg Farnum <greg@inktank.com>
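The comparison described in this commit might look something like the sketch below. The types and the function name are hypothetical, and each monitor's result is reduced to a key count and CRC per prefix, as the message describes.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Hypothetical per-prefix summary reported by one monitor.
struct PrefixSummary {
  uint64_t key_count = 0;
  uint32_t crc = 0;
  bool operator==(const PrefixSummary& o) const {
    return key_count == o.key_count && crc == o.crc;
  }
};

// One monitor's scrub result: prefix name -> summary.
using ScrubResult = std::map<std::string, PrefixSummary>;

// Returns the indices of quorum members whose result differs from member 0.
std::vector<std::size_t> find_mismatched(const std::vector<ScrubResult>& members) {
  std::vector<std::size_t> bad;
  for (std::size_t i = 1; i < members.size(); ++i) {
    if (!(members[i] == members[0])) bad.push_back(i);
  }
  return bad;
}
```

Comparing everything in one pass mirrors the "one-shot" nature the commit message mentions; a chunked version would compare one key range at a time.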
* msgr: queue reset exactly once on any connection  [Sage Weil, 2013-06-13, files: 1, lines: -1/+3]
|   Use the atomic pipe link removal as a signal that we are the one failing the con and use that to queue the reset event.
|   This fixes the case where we have an open, the session gets set up via the handle_accept callback, and then race with another connection and go into wait + close, or just close. In that case, fault() needs to queue a reset event to match the accept.
|   Signed-off-by: Sage Weil <sage@inktank.com>
* msgr: use ConnectionRef throughout  [Sage Weil, 2013-06-13, files: 1, lines: -12/+9]
|   Make RefCountedObject a private parent of Connection so that users are forced to use ConnectionRef whenever references are taken.
|   Many methods can still take a raw Connection* when they are using the caller's reference but not taking their own; this is cheaper than twiddling the reference count, and the lifetime is still well defined. Local variables generally use ConnectionRef, though.
|   Signed-off-by: Sage Weil <sage@inktank.com>
* Merge remote-tracking branch 'yan/wip-mds'  [Sage Weil, 2013-05-29, files: 1, lines: -0/+2]
|\
| |   Reviewed-by: Sage Weil <sage@inktank.com>
| |   Conflicts: src/mds/MDCache.cc
| * mds: open inode by ino  [Yan, Zheng, 2013-05-28, files: 1, lines: -0/+2]
| |   This patch adds an "open-by-ino" helper. It uses the backtrace to find the inode's path and open the inode. The algorithm looks like:
| |     1. Check MDS peers. If any MDS has the inode in its cache, go to step 6.
| |     2. Fetch the backtrace. If the backtrace was previously fetched and the same backtrace is returned again, return -EIO.
| |     3. Traverse the path in the backtrace. If the inode is found, go to step 6; if a non-auth dirfrag is encountered, go to the next step. If the inode cannot be found in its parent dir, go to step 1.
| |     4. Request MDS peers to traverse the path in the backtrace. If the inode is found, go to step 6. If an MDS peer encounters a non-auth dirfrag, it stops traversing. If any MDS peer fails to find the inode in its parent dir, go to step 1.
| |     5. Use the same algorithm to open the inode's parent. Go to step 3 if this succeeds; go to step 1 if it fails.
| |     6. Return the inode's auth MDS ID.
| |   The algorithm has two main assumptions:
| |     1. If an inode is in its auth MDS's cache, its on-disk backtrace can be out of date.
| |     2. If an inode is not in any MDS's cache, its on-disk backtrace must be up to date.
| |   Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
* | osd: ping both front and back interfaces  [Sage Weil, 2013-05-22, files: 1, lines: -0/+4]
| |   Send ping requests to both the front and back hb addrs for peer osds. If the front hb addr is not present, do not send it and interpret a reply as coming from both. This handles the transition from old to new OSDs seamlessly.
| |   Note both the front and back rx times. Both need to be up to date in order for the peer to be healthy.
| |   Signed-off-by: Sage Weil <sage@inktank.com>
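The health rule in this commit can be sketched as below. All names are hypothetical and timestamps are plain seconds for brevity; the actual OSD heartbeat code is not shown in this log.

```cpp
// Hypothetical heartbeat bookkeeping for one peer OSD.
struct HeartbeatPeer {
  bool has_front_addr = true;  // false for an old peer with no front hb addr
  double last_rx_front = 0.0;  // time of last reply on the front interface
  double last_rx_back = 0.0;   // time of last reply on the back interface

  void on_back_reply(double now) {
    last_rx_back = now;
    // With no front addr, a single reply is treated as coming from both.
    if (!has_front_addr) last_rx_front = now;
  }

  void on_front_reply(double now) { last_rx_front = now; }

  // Both timestamps must be recent for the peer to count as healthy.
  bool is_healthy(double now, double grace) const {
    return now - last_rx_front <= grace && now - last_rx_back <= grace;
  }
};
```

A peer that answers only on the back interface is reported unhealthy unless it is an old OSD without a front address, which is how the transition case stays seamless.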
* | msgr: add Messenger reference to Connection  [Sage Weil, 2013-05-22, files: 1, lines: -1/+4]
|/
|   This allows us to get the messenger associated with a connection.
|   Signed-off-by: Sage Weil <sage@inktank.com>
* msgr: add second per-message throttler to message policy  [Sage Weil, 2013-04-06, files: 1, lines: -16/+34]
|   We already have a throttler that lets us limit the amount of memory consumed by messages from a given source. Currently this is based only on the size of the message payload. Add a second throttler that limits the number of messages so that we can effectively throttle small requests as well.
|   Signed-off-by: Sage Weil <sage@inktank.com>
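A rough sketch of how two throttlers combine is shown below. The `Throttle` class here is a simplified, non-blocking stand-in written for this example, and the `Policy` field names and `admit()` helper are assumptions rather than the code from this commit.

```cpp
#include <cstdint>

// Simplified counter with a ceiling; try_get() fails instead of blocking.
class Throttle {
  uint64_t max;
  uint64_t cur = 0;

public:
  explicit Throttle(uint64_t m) : max(m) {}
  bool try_get(uint64_t n) {
    if (cur + n > max) return false;
    cur += n;
    return true;
  }
  void put(uint64_t n) { cur -= n; }
  uint64_t current() const { return cur; }
};

// Hypothetical policy carrying both limits.
struct Policy {
  Throttle throttler_bytes;     // limits total payload bytes in flight
  Throttle throttler_messages;  // limits number of messages in flight
};

// A message is admitted only if both throttlers have room.
bool admit(Policy& p, uint64_t payload_bytes) {
  if (!p.throttler_messages.try_get(1)) return false;
  if (!p.throttler_bytes.try_get(payload_bytes)) {
    p.throttler_messages.put(1);  // roll back the message slot
    return false;
  }
  return true;
}
```

With a message limit of 2, a third one-byte request is refused even though the byte budget is almost untouched, which is the small-request case the commit targets.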
* messages: add MOSDMarkMeDown  [Samuel Just, 2013-03-21, files: 1, lines: -0/+1]
|   Signed-off-by: Samuel Just <sam.just@inktank.com>
* mon: HealthMonitor: Keep track of monitor cluster's health  [Joao Eduardo Luis, 2013-03-18, files: 1, lines: -0/+1]
|   The HealthMonitor builds upon the QuorumService interface, and should be used to keep track of all and any relevant information about the monitor cluster (maybe even about all the cluster if need be).
|   This patch also introduces the HealthService interface, used to define a HealthMonitor service, responsible for dispatching 'MMonHealth' messages (the QuorumService interface dispatches generic 'Message').
|   Based on the HealthService interface, we introduce the DataHealthService class, a service that will track disk space consumption by the monitors, warn when a given threshold is crossed, and gracefully shutdown the monitor if disk space usage hits critical levels that might affect the correct monitor behavior.
|   Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com>
* Merge branch 'wsp.bobtail.2merge' into wsp.bobtail.master  [Joao Eduardo Luis, 2013-02-21, files: 1, lines: -0/+1]
|\
| |   Conflicts:
| |     src/.gitignore
| |     src/Makefile.am
| |     src/include/ceph_features.h
| |     src/mon/MDSMonitor.cc
| |     src/mon/PGMonitor.cc
| * message: MMonSync: Monitor Synchronization message  [Joao Eduardo Luis, 2013-02-21, files: 1, lines: -0/+1]
| |   The monitor's synchronization process requires a specific message type to carry the required information. Since this process significantly differs from slurping, reusing the MMonProbe message is not an option as it would require major changes and, for all intents and purposes, it would be far outside the scope of the MMonProbe message.
| |   Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com>
* | msg/Message.h: fix C-style pointer casting  [Danny Al-Gaaf, 2013-02-06, files: 1, lines: -2/+2]
|/
|   Replace C-style pointer casting with correct static_cast<>().
|   Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
* osd: use Message::get_cost() function for queueing  [Sage Weil, 2013-01-22, files: 1, lines: -0/+4]
|   The data payload is a decent proxy for cost in most cases, but not all.
|   Signed-off-by: Sage Weil <sage@inktank.com>
* messages: add MTimeCheck  [Joao Eduardo Luis, 2013-01-11, files: 1, lines: -0/+2]
|   Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com>
|   Reviewed-by: Sage Weil <sage@inktank.com>
* features is uint64_t  [Sage Weil, 2012-12-27, files: 1, lines: -5/+5]
|   This won't bite us for a while yet (we're on bit 26), but it will soon!
|   Signed-off-by: Sage Weil <sage@inktank.com>
|   Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
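The hazard this commit guards against can be illustrated with a small sketch. It assumes the feature mask previously passed through a 32-bit type somewhere, which the commit message implies but does not state; the helper names are hypothetical.

```cpp
#include <cstdint>

// Hypothetical helper: mask for feature bit n, valid for n in 0..63.
constexpr uint64_t feature_bit(unsigned n) { return uint64_t{1} << n; }

// Simulates passing a feature mask through a 32-bit variable and then
// testing bit n: any bit at position 32 or above is silently lost.
constexpr bool survives_32bit(uint64_t features, unsigned n) {
  return (static_cast<uint32_t>(features) & feature_bit(n)) != 0;
}

static_assert(survives_32bit(feature_bit(26), 26), "bit 26 fits in 32 bits");
static_assert(!survives_32bit(feature_bit(32), 32), "bit 32 is truncated");
```

Bit 26 is safe either way, which is why the problem would not have shown up until the feature list grew past 32 entries.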
* osd/: make OSDService messenger helpers return ConnectionRef  [Samuel Just, 2012-11-30, files: 1, lines: -0/+6]
|   Signed-off-by: Samuel Just <sam.just@inktank.com>
* common: add RefCountedObj.cc with intrusive_ptr hooks  [Samuel Just, 2012-11-13, files: 1, lines: -4/+0]
|   Signed-off-by: Samuel Just <sam.just@inktank.com>
* message: add MRecoveryReserve  [Mike Ryan, 2012-11-01, files: 1, lines: -0/+3]
|   This message will be used to reserve and release recovery slots on replica PGs.
|   Signed-off-by: Mike Ryan <mike.ryan@inktank.com>
* osd/: add backfill reservations  [Samuel Just, 2012-09-25, files: 1, lines: -0/+2]
|   Previously, a new osd would be bombarded by backfills from many osds simultaneously, resulting in excessively high load. Instead, we want to limit the number of backfills coming into and going out from a single osd.
|   To that end, each OSDService now has two AsyncReserver instances: one for backfills going from the osd (local_reserver) and one for backfills going to the osd (remote_reserver). For a primary to initiate a backfill, it must first obtain a reservation from its own local_reserver. Then, it must obtain a reservation from the backfill target's remote_reserver via a MBackfillReserve message. This process is managed by substates of Active and ReplicaActive (see the changes in PG.h). The reservations are dropped either on the Backfilled event (which is sent on the primary before calling recovery_complete and on the replica on receipt of the BackfillComplete progress message), or upon leaving Active or ReplicaActive.
|   It's important that we always grab the local reservation before the remote reservation in order to prevent a circular dependency.
|   Signed-off-by: Samuel Just <sam.just@inktank.com>
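The local-then-remote ordering can be sketched with a simplified, synchronous stand-in for the reservers. The `Reserver` class and `start_backfill()` below are hypothetical; the real AsyncReserver is asynchronous and the remote step goes through a MBackfillReserve message, neither of which is modelled here.

```cpp
#include <cstddef>

// Hypothetical synchronous slot pool standing in for AsyncReserver.
class Reserver {
  std::size_t max_slots;
  std::size_t in_use = 0;

public:
  explicit Reserver(std::size_t m) : max_slots(m) {}
  bool try_reserve() {
    if (in_use >= max_slots) return false;
    ++in_use;
    return true;
  }
  void release() {
    if (in_use > 0) --in_use;
  }
  std::size_t used() const { return in_use; }
};

// Always take the primary's local slot first, then the target's remote slot.
bool start_backfill(Reserver& local_reserver, Reserver& target_remote_reserver) {
  if (!local_reserver.try_reserve()) return false;
  if (!target_remote_reserver.try_reserve()) {
    local_reserver.release();  // sketch only: give the local slot back
    return false;
  }
  return true;
}
```

Because every primary acquires in the same order, two OSDs cannot each hold the slot the other is waiting for.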
* Merge branch 'next'  [Samuel Just, 2012-08-15, files: 1, lines: -4/+2]
|\
| * PG,Message: move intrusive_ptr_* into top namespace  [Samuel Just, 2012-08-15, files: 1, lines: -4/+2]
| |   gcc 4.7 requires that the intrusive_ptr_* functions be in the same namespace as the templated class.
| |   Signed-off-by: Samuel Just <sam.just@inktank.com>
* | Merge branch 'wip-msgr'  [Sage Weil, 2012-08-13, files: 1, lines: -3/+24]
|\ \
| |/
|/|
| * msgr: do not reopen failed lossy Connections  [Sage Weil, 2012-07-20, files: 1, lines: -0/+12]
| |   There was a race where:
| |     - sending stuff to a lossy Connection
| |     - it fails, and queues itself for reap, queues a RESET event
| |     - reaper clears the Pipe
| |     - some thread queues new messages and the Pipe is reopened, messages sent
| |     - RESET event delivered to dispatch, connection is closed and reopened.
| |   The result was that messages got sent to the OSD out of order during the window between the fault() and ms_handle_reset() getting called. This will prevent that.
| |   Signed-off-by: Sage Weil <sage@inktank.com>
| * msg/Connection: add failed flag for lossy Connections  [Sage Weil, 2012-07-20, files: 1, lines: -3/+12]
| |   If a lossy Connection fails and we disconnect the Pipe, set a failed flag.
| |   Signed-off-by: Sage Weil <sage@inktank.com>
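Taken together, the failed flag and the no-reopen rule from these two wip-msgr commits might be modelled as follows. The struct and its members are hypothetical simplifications; the real Connection and Pipe classes carry much more state.

```cpp
// Hypothetical model of a connection and its underlying pipe.
struct LossyConnection {
  bool lossy = true;
  bool failed = false;   // set once a lossy connection has faulted
  bool has_pipe = true;
  int sent = 0;

  // Called when the pipe faults: drop the pipe and, if lossy, mark failed.
  void fault() {
    has_pipe = false;
    if (lossy) failed = true;
  }

  // Returns false (message dropped) rather than reopening a failed lossy link.
  bool send() {
    if (failed) return false;
    if (!has_pipe) has_pipe = true;  // lossless links may reconnect
    ++sent;
    return true;
  }
};
```

Refusing to reopen closes the window in which new messages could overtake the pending RESET event.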
* | PG: compound messages must carry epoch_sent for each part  [Samuel Just, 2012-07-05, files: 1, lines: -0/+6]
|/
|   Query and Notify messages include logical messages from multiple pgs. Each logical message (pg_query_t and pg_notify_t) now contains an epoch_sent.
|   Signed-off-by: Samuel Just <sam.just@inktank.com>
* misc assert #include cleanup, hackery  [Sage Weil, 2012-06-07, files: 1, lines: -19/+10]
|   Signed-off-by: Sage Weil <sage@inktank.com>
* msg: make clear_pipe work only on a given Pipe, rather than the current one.  [Greg Farnum, 2012-06-03, files: 1, lines: -3/+3]
|   This way old Pipes that have been replaced can't clear the new Pipe out of a Connection's link.
|   We might attempt to instead sever the link between CLOSED Pipes and their Connections more completely (eg, when the Connection gets a new Pipe), but that will require more work to handle all the cases, and this works for now.
|   Signed-off-by: Greg Farnum <greg@inktank.com>
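The idea of clearing only a specific Pipe can be sketched as a compare-then-clear. The types and the exact signature below are assumptions for illustration, and no locking is shown.

```cpp
// Hypothetical stand-ins for the real Pipe and Connection classes.
struct Pipe {};

struct Connection {
  Pipe* pipe = nullptr;

  // Clears the link only if it still points at old_p. A stale Pipe that has
  // already been replaced therefore cannot detach the newer one.
  bool clear_pipe(Pipe* old_p) {
    if (pipe != old_p) return false;
    pipe = nullptr;
    return true;
  }
};
```

A replaced Pipe calling `clear_pipe(this)` becomes a harmless no-op, which is the behaviour the commit message asks for.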
* src: get rid of the Observers throughout the code base.  [Joao Eduardo Luis, 2012-05-17, files: 1, lines: -2/+2]
|   This is a big patch that will remove all references to the observers throughout the code, including a complete removal of the Observer-related messages' source files.
|   Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com>
* RefCountedObject: relocate from msg/Message.h to common/RefCountedObj.h  [Yehuda Sadeh, 2012-04-25, files: 1, lines: -80/+2]
|   Following a popular request.
|   Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>
* librados: call notification under different thread context  [Yehuda Sadeh, 2012-04-25, files: 1, lines: -0/+63]
|   This fixes #2342. We shouldn't call notify on the dispatcher context. We should also make sure that we don't hold the client lock while waiting for the responses.
|   Also, pushed the client_lock locking into the ctx->notify().
|   Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>
* OSD: improve information and format of OSDTracker messages  [Samuel Just, 2012-04-13, files: 1, lines: -5/+8]
|   Also, Message now has a timestamp indicating when the message was fully received for use by OSDTracker.
|   Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
* messages: populate header.version in constructor  [Sage Weil, 2012-02-10, files: 1, lines: -5/+10]
|   Define a HEAD_VERSION and COMPAT_VERSION for any versioned message. Pass to Message constructor so that it is always initialized, even from the default constructor. That's needed because we use that to check decoding compatibility when receiving/decoding messages.
|   If we are conditionally encoding an old version, explicitly set header.version in encode_payload().
|   We also set compat_version to demonstrate what will happen for future revisions. In this case, it's moot, because no old code understands compat_version yet: nobody with old decode code will see these values anyway. But use this opportunity to demonstrate how it would be used in the future.
|   Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
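The pattern described here might look roughly like the sketch below. `Header`, `MExample`, the type code, the version numbers, and `can_decode()` are all hypothetical; only the HEAD_VERSION and COMPAT_VERSION names and the idea of passing them to the Message constructor come from the commit message.

```cpp
#include <cstdint>

// Hypothetical, trimmed-down message header.
struct Header {
  uint16_t type = 0;
  uint16_t version = 0;
  uint16_t compat_version = 0;
};

class Message {
protected:
  Header header;

public:
  // Versions are always set here, so even a default-constructed subclass
  // carries a meaningful header.version.
  Message(int type, int version = 1, int compat_version = 0) {
    header.type = static_cast<uint16_t>(type);
    header.version = static_cast<uint16_t>(version);
    header.compat_version = static_cast<uint16_t>(compat_version);
  }
  const Header& get_header() const { return header; }
};

class MExample : public Message {
  static constexpr int HEAD_VERSION = 3;    // newest encoding we produce
  static constexpr int COMPAT_VERSION = 2;  // oldest decoder that can read it

public:
  MExample() : Message(0x999, HEAD_VERSION, COMPAT_VERSION) {}
};

// A receiver can decode if it understands at least compat_version.
bool can_decode(const Header& h, int my_head_version) {
  return h.compat_version <= my_head_version;
}
```

A decoder at version 2 could still read an `MExample`, while one at version 1 would have to reject it.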
* Merge branch 'wip-encoding'  [Sage Weil, 2012-02-02, files: 1, lines: -19/+21]
|\
| |   Reviewed-by: Josh Durgin <josh.durgin@dreamhost.com>
| |   Conflicts:
| |     src/msg/Message.h
| |     src/osd/OSD.cc
| |     src/osd/ReplicatedPG.cc
| |     src/osd/ReplicatedPG.h
| * msg: implement Message::dump()  [Sage Weil, 2012-01-30, files: 1, lines: -0/+2]
| |   Just wrap print() for now.
| |   Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
| * msg: go const-crazy on messages  [Sage Weil, 2012-01-30, files: 1, lines: -13/+13]
| |     - get_type_name()
| |     - print()
| |   and all the random crap they call.
| |   Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
| * msg: no cct for decode_payload  [Sage Weil, 2012-01-30, files: 1, lines: -1/+1]
| |   Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
| * msg: no cct needed for message encoding  [Sage Weil, 2012-01-30, files: 1, lines: -3/+3]
| |   Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
| * msg: pass features explicitly into message encoders  [Sage Weil, 2012-01-30, files: 1, lines: -3/+3]
| |   Avoid using the connection reference; pass it in explicitly instead. This will make ceph-dencoder's life a bit easier.
| |   Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
* | msgr: Document recv_stamp and add a dispatch_stamp and throttle_wait.  [Greg Farnum, 2012-01-31, files: 1, lines: -1/+14]
|/
|   Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
* osd: add MOSDPGBackfill message  [Sage Weil, 2011-12-14, files: 1, lines: -0/+1]
|   Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
* osd: MOSDPGScan  [Sage Weil, 2011-12-14, files: 1, lines: -0/+1]
|   Message to query hash ranges of a PG.
|   Signed-off-by: Sage Weil <sage@newdream.net>
* mon: allow monitor to automagically join cluster  [Sage Weil, 2011-11-11, files: 1, lines: -0/+1]
|   If a monitor starts up with the correct fsid and auth keys, it will now add itself to the monmap (and subsequently try to join the quorum) if it is not already in the monmap.
|   Signed-off-by: Sage Weil <sage@newdream.net>
* mon: slurp latest state from active monitors before joining quorum  [Sage Weil, 2011-11-07, files: 1, lines: -0/+1]
|   If a monitor has been down and is behind, and joins the quorum, the other nodes will try to send it all of the needed state, which can bring the cluster to a halt.
|   Instead, implement a new bootstrap() procedure:
|     - probe the cluster nodes
|     - if there is an existing quorum,
|       - and it is not too far ahead of me, join it (call an election)
|       - otherwise, slurp down all the newer state and then restart (bootstrap)
|     - if we see enough online nodes that are not part of the quorum, call an election.
|   We still need to add some timeouts.
|   Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
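The decision made after probing can be sketched as a small pure function. Everything here is hypothetical: the struct fields, the `max_lag` parameter used for "not too far ahead", and the reading of "enough online nodes" as a strict majority are assumptions, not details given in the commit message.

```cpp
#include <cstddef>
#include <cstdint>

enum class Action { JoinQuorum, SlurpThenRestart, CallElection, KeepProbing };

// Hypothetical summary of what the probe replies revealed.
struct ProbeSummary {
  bool quorum_exists;
  uint64_t quorum_version;            // latest state version held by the quorum
  uint64_t my_version;                // latest state version held locally
  std::size_t online_outside_quorum;  // reachable mons not in any quorum
  std::size_t total_mons;
};

Action decide(const ProbeSummary& p, uint64_t max_lag) {
  if (p.quorum_exists) {
    // Close enough: join by calling an election.
    if (p.quorum_version <= p.my_version + max_lag) return Action::JoinQuorum;
    // Too far behind: fetch newer state first, then bootstrap again.
    return Action::SlurpThenRestart;
  }
  // No quorum yet: assume a strict majority counts as "enough" nodes.
  if (p.online_outside_quorum > p.total_mons / 2) return Action::CallElection;
  return Action::KeepProbing;
}
```

Keeping the decision separate from the probing I/O makes the lag threshold and the majority rule easy to adjust once timeouts are added.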
* msg: add MCommand, MCommandReply message types  [Sage Weil, 2011-10-14, files: 1, lines: -0/+2]
|   These are similar to MMonCommand[Ack], but aren't PaxosServiceMessage children, don't include the command in the reply (useless), and have a more generic name.
|   Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
* osd: discard requests that from disconnected clients  [Sage Weil, 2011-10-07, files: 1, lines: -0/+4]
|   If we can't reply, throw out the request; they'll need to resend it anyway.
|   Signed-off-by: Sage Weil <sage@newdream.net>
* msg: remove globals  [Colin Patrick McCabe, 2011-06-20, files: 1, lines: -7/+8]
|   Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>