author | Yan, Zheng <zheng.z.yan@intel.com> | 2013-01-27 15:14:55 +0800 |
---|---|---|
committer | Yan, Zheng <zheng.z.yan@intel.com> | 2013-01-29 10:17:37 +0800 |
commit | e69e7e5d0eae6581049d22c07ea3cda773c80f13 (patch) | |
tree | 32812e665d6c093e07de1a83420e0617f461bc1b | |
parent | 0e9c8124a1acfcd52cf2712e59dc8493209b42c6 (diff) | |
download | ceph-e69e7e5d0eae6581049d22c07ea3cda773c80f13.tar.gz |
mds: fix 'discover' handling in the rejoin stage
If the MDS is in the rejoin stage, the current MDCache::handle_discover() only
handles 'discover' from an MDS from which it has already received a rejoin
acknowledgement. This can cause a circular wait, because
MDCache::rejoin_gather_finish() fetches reconnected inodes before sending
rejoin acknowledgements, and fetching a reconnected inode may trigger a
'discover'. The fix is to not delay handling 'discover' from MDSs that are
also in the rejoin stage.
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
-rw-r--r-- | src/mds/MDCache.cc | 6 |
1 files changed, 4 insertions, 2 deletions
diff --git a/src/mds/MDCache.cc b/src/mds/MDCache.cc
index 84502b0918a..545ffdca2e1 100644
--- a/src/mds/MDCache.cc
+++ b/src/mds/MDCache.cc
@@ -8873,10 +8873,12 @@ void MDCache::handle_discover(MDiscover *dis)
 
   assert(from != whoami);
 
-  if (mds->get_state() < MDSMap::STATE_CLIENTREPLAY) {
+  if (mds->get_state() <= MDSMap::STATE_REJOIN) {
     int from = dis->get_source().num();
+    // proceed if requester is in the REJOIN stage, the request is from parallel_fetch().
+    // delay processing request from survivor because we may not yet choose lock states.
     if (mds->get_state() < MDSMap::STATE_REJOIN ||
-        rejoin_ack_gather.count(from)) {
+        !mds->mdsmap->is_rejoin(from)) {
       dout(0) << "discover_reply not yet active(|still rejoining), delaying" << dendl;
       mds->wait_for_active(new C_MDS_RetryMessage(mds, dis));
       return;
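
The change to the delay condition can be illustrated with a small standalone model. This is only a sketch and not Ceph code: the `State` enum is a simplified stand-in that assumes the same relative ordering as the `MDSMap` states, and `DiscoverGate`, `peers_in_rejoin`, `delay_before` and `delay_after` are hypothetical names introduced here. `peers_in_rejoin` stands in for `mds->mdsmap->is_rejoin(from)`, and `rejoin_ack_gather` is assumed to hold the peers whose rejoin acknowledgement is still outstanding.

```cpp
#include <set>

// Simplified stand-in for the MDSMap states (assumption: only the
// relative order matters for the comparisons in the patch).
enum State {
  STATE_RESOLVE,
  STATE_RECONNECT,
  STATE_REJOIN,
  STATE_CLIENTREPLAY,
  STATE_ACTIVE
};

// Hypothetical model of the check at the top of handle_discover().
struct DiscoverGate {
  State my_state;                   // state of the MDS receiving the 'discover'
  std::set<int> rejoin_ack_gather;  // peers whose rejoin ack is still awaited
  std::set<int> peers_in_rejoin;    // stand-in for mdsmap->is_rejoin(from)

  // Condition before the patch: while rejoining, delay a 'discover'
  // from any peer that has not yet acknowledged our rejoin.
  bool delay_before(int from) const {
    if (my_state < STATE_CLIENTREPLAY)
      return my_state < STATE_REJOIN || rejoin_ack_gather.count(from) > 0;
    return false;
  }

  // Condition after the patch: while rejoining, delay only a 'discover'
  // from a peer that is not itself in the rejoin stage (a survivor).
  bool delay_after(int from) const {
    if (my_state <= STATE_REJOIN)
      return my_state < STATE_REJOIN || peers_in_rejoin.count(from) == 0;
    return false;
  }
};
```

In this model, with the receiver in `STATE_REJOIN`, peer 1 also rejoining, and peer 1's acknowledgement still outstanding, `delay_before(1)` returns true while `delay_after(1)` returns false. That is the case the commit message describes: two rejoining MDSs no longer wait on each other's acknowledgements before answering a 'discover'.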