Commit message | Author | Age | Files | Lines
* mon: make max_global_id increase a bit more robustwip-mon-skip-auth-cuttlefishSage Weil2013-08-042-8/+10
| | | | | | | - Increase it in one go, not with lots of additions - Deal with weird edge cases where max_global_id is not a sane value Signed-off-by: Sage Weil <sage@inktank.com>
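A minimal sketch of the idea in this commit, written in Python rather than taken from the monitor code: compute the new ceiling in a single step and treat an unusable stored value as zero. The function name, the parameter names, and the preallocation size are assumptions made for illustration; none of them come from the commit.

```python
def next_max_global_id(current_max, last_allocated, prealloc=10000):
    """Return a new ceiling that covers last_allocated plus one preallocation window.

    All names and the default window size are hypothetical.
    """
    # Edge case: a stored maximum that is zero or negative is treated as zero.
    base = current_max if current_max > 0 else 0
    # Jump past the highest id already handed out in one go,
    # instead of adding small increments in a loop.
    base = max(base, last_allocated)
    return base + prealloc
```

For example, `next_max_global_id(-5, 50, prealloc=10)` ignores the bogus stored value and returns 60.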
* skip missing auth incrementals. hacky workaround.Sage Weil2013-08-031-0/+5
|
* upstart: stop ceph-create-keys when the monitor stopsSage Weil2013-07-261-0/+1
| | | | | | | | This avoids lingering ceph-create-keys tasks. Backport: cuttlefish Signed-off-by: Sage Weil <sage@inktank.com> (cherry picked from commit a90a2b42db8de134b8ea5d81cab7825fb9ec50b4)
* FileStore: fix fd leak in _check_global_replay_guardSamuel Just2013-07-261-0/+1
| | | | | | | | | | Bug introduced in f3f92fe21061e21c8b259df5ef283a61782a44db. Fixes: #5766 Backport: cuttlefish Signed-off-by: Samuel Just <sam.just@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com> (cherry picked from commit c562b72e703f671127d0ea2173f6a6907c825cd1)
* mon/Paxos: share uncommitted value when leader is/was behindSage Weil2013-07-251-1/+1
| | | | | | | | | | | | | | | | | | If the leader has an older lc than we do, and we are sharing states to bring them up to date, we still want to also share our uncommitted value. This particular case was broken by b26b7f6e, which was only contemplating the case where the leader was ahead of us or at the same point as us, but not the case where the leader was behind. Note that the call to share_state() a few lines up will bring them fully up to date, so after they receive and store_state() for this message they will be at the same lc as we are. Fixes: #5750 Backport: cuttlefish Signed-off-by: Sage Weil <sage@inktank.com> Reviewed-by: Greg Farnum <greg@inktank.com> Reviewed-by: Joao Eduardo Luis <joao.luis@inktank.com> (cherry picked from commit 05b6c7e8645081f405c616735238ae89602d3cc6)
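The condition described in this commit can be modelled roughly as below. This is an illustrative Python sketch, not the Paxos code itself; `should_share_uncommitted` and its parameters are hypothetical names. It assumes that share_state() has already been sent whenever the collector is behind, so the collector ends up at our lc before it looks at the uncommitted value.

```python
def should_share_uncommitted(my_lc, collector_lc, have_uncommitted):
    """Decide whether a peer attaches its uncommitted value to its reply.

    my_lc: our last committed version.
    collector_lc: the last committed version reported by the collecting leader.
    have_uncommitted: True if we hold a value for version my_lc + 1.
    """
    if not have_uncommitted:
        return False
    # The leader is ahead of us: our "next" value is from a round it has
    # already moved past, so it is not shared.
    if collector_lc > my_lc:
        return False
    # The leader is level with us, or behind and about to be caught up by
    # the shared state: in both cases our value is its next value.
    return True
```

The case restored by this commit is the last one with `collector_lc < my_lc`, for example `should_share_uncommitted(10, 7, True)`.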
* Merge remote-tracking branch 'gh/cuttlefish-next' into cuttlefishSage Weil2013-07-2517-131/+412
|\
| * HashIndex: reset attr upon split or merge completionSamuel Just2013-07-251-13/+2
| | | | | | | | | | | | | | | | | | | | A replay of an in progress merge or split might make our counts unreliable. Fixes: #5723 Signed-off-by: Samuel Just <sam.just@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com> (cherry picked from commit 0dc3efdd885377a07987d868af5bb7a38245c90b)
| * test/filestore/store_test: add test for 5723Samuel Just2013-07-252-5/+77
| | | | | | | | | | | | | | | | | | | | Signed-off-by: Samuel Just <sam.just@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com> (cherry picked from commit 37a4c4af54879512429bb114285bcb4c7c3488d5) Conflicts: src/os/LFNIndex.cc src/test/filestore/store_test.cc
| * FileStore::_collection_rename: fix global replay guardSamuel Just2013-07-251-3/+10
| | | | | | | | | | | | | | | | | | | | | | If the rename is being replayed, we might have already performed it, so skip it. Also, we must set the collection replay guard only after we have done the rename. Signed-off-by: Samuel Just <sam.just@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com> (cherry picked from commit 870c474c5348831fcb13797d164f49682918fb30)
| * PGLog::rewind_divergent_log: unindex only works from tail, index() insteadSamuel Just2013-07-241-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Fixes: #5714 Signed-off-by: Samuel Just <sam.just@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com> (cherry picked from commit 6957dbc75cc2577652b542aa3eae69f03060cb63) The original patch covered the same code in PGLog.cc. Conflicts: src/osd/PGLog.cc src/osd/PG.cc
| * msg/Pipe: do not hold pipe_lock for verify_authorizer()Sage Weil2013-07-241-3/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We shouldn't hold the pipe_lock while doing the ms_verify_authorizer upcalls. Fix by unlocking a bit earlier, and verifying our state is still correct in the failure path. This regression was introduced by ecab4bb9513385bd765cca23e4e2fadb7ac4bac2. Signed-off-by: Sage Weil <sage@inktank.com> Reviewed-by: Greg Farnum <greg@inktank.com> (cherry picked from commit 723d691f7a1f53888618dfc311868d1988f61f56) Conflicts: src/msg/Pipe.cc
| * msg/Pipe: a bit of additional debug outputSage Weil2013-07-241-2/+7
| | | | | | | | | | Signed-off-by: Sage Weil <sage@inktank.com> (cherry picked from commit 16568d9e1fb8ac0c06ebaa1e1dc1d6a432a5e4d4)
| * msg/Pipe: hold pipe_lock during important parts of accept()Sage Weil2013-07-241-14/+28
| | | | | | | | | | | | | | | | | | | | | | | | | | Previously we did not bother with locking for accept() because we were not visible to any other threads. However, we need to close accepting Pipes from mark_down_all(), which means we need to handle interference. Fix up the locking so that we hold pipe_lock when looking at Pipe state and verify that we are still in the ACCEPTING state any time we retake the lock. Signed-off-by: Sage Weil <sage@inktank.com> (cherry picked from commit ecab4bb9513385bd765cca23e4e2fadb7ac4bac2)
| * msgr: fix a typo/goto-cross from dd4addef2dGreg Farnum2013-07-241-1/+2
| | | | | | | | | | | | | | | | We didn't build or review carefully enough! Signed-off-by: Greg Farnum <greg@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com> (cherry picked from commit 1a84411209b13084b3edb87897d5d678937e3299)
| * msgr: close accepting_pipes from mark_down_all()Sage Weil2013-07-241-0/+11
| | | | | | | | | | | | | | | | We need to catch these pipes too, particularly when doing a rebind(), to avoid them leaking through. Signed-off-by: Sage Weil <sage@inktank.com> (cherry picked from commit 687fe888b32ac9d41595348dfc82111c8dbf2fcb)
| * msgr: maintain list of accepting pipesSage Weil2013-07-243-0/+11
| | | | | | | | | | | | | | | | | | New pipes exist in a sort of limbo before we know who the peer is and add them to rank_pipe. Keep a list of them in accepting_pipes for that period. Signed-off-by: Sage Weil <sage@inktank.com> (cherry picked from commit dd4addef2d5b457cc9a58782fe42af6b13c68b81)
| * msgr: adjust nonce on rebind()Sage Weil2013-07-241-0/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We can have a situation where: - we have a pipe to a peer - pipe goes to standby (on peer) - we rebind to a new port - .... - we rebind again to the same old port - we connect to peer and get reattached to the ancient pipe from two instances back. Avoid that by picking a new nonce each time we rebind. Add 1,000,000 each time so that the port is still legible in the printed output. Signed-off-by: Sage Weil <sage@inktank.com> (cherry picked from commit 994e2bf224ab7b7d5b832485ee14de05354d2ddf) Conflicts: src/msg/Accepter.cc
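The nonce adjustment can be illustrated with a few lines of Python. This is a sketch of the arithmetic only, not the messenger code; the function names are hypothetical, and it assumes the starting nonce is smaller than one million so that the low digits stay readable.

```python
NONCE_STEP = 1_000_000  # increment taken from the commit message


def nonce_after_rebind(nonce):
    """Return a fresh nonce for the next bind (hypothetical helper)."""
    return nonce + NONCE_STEP


def readable_part(nonce):
    """Low-order digits, unchanged by any number of rebinds if the
    starting nonce was below NONCE_STEP (an assumption of this sketch)."""
    return nonce % NONCE_STEP
```

Starting from 4242, two rebinds give 1004242 and then 2004242. All three values differ, so a peer cannot mistake the newest instance for one from two binds back, while the trailing 4242 remains visible in printed output.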
| * msgr: mark_down_all() after, not before, rebindSage Weil2013-07-241-2/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If we are shutting down all old connections and binding to new ports, we want to avoid a sequence like: - close all previous connections - new connection comes in on old port - rebind to new ports -> connection from old port leaks through As a first step, close all connections after we shut down the old accepter and before we start the new one. Signed-off-by: Sage Weil <sage@inktank.com> (cherry picked from commit 07a0860a1899c7353bb506e33de72fdd22b857dd) Conflicts: src/msg/SimpleMessenger.cc
| * msg/Pipe: unlock msgr->lock earlier in accept()Sage Weil2013-07-241-4/+1
| | | | | | | | | | | | | | | | Small cleanup. Nothing needs msgr->lock for the previously larger window. Signed-off-by: Sage Weil <sage@inktank.com> (cherry picked from commit ad548e72fd94b4a16717abd3b3f1d1be4a3476cf)
| * msg/Pipe: avoid creating empty out_q entrySage Weil2013-07-241-0/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We need to maintain the invariant that all sub queues in out_q are never empty. Fix discard_requeued_up_to() to avoid creating an entry unless we know it is already present. This bug leads to an incorrect reconnect attempt when - we accept a pipe (lossless peer) - they send some stuff, maybe - fault - we initiate reconnect, even though we have nothing queued In particular, we shouldn't reconnect because we aren't checking for resets, and the fact that our out_seq is 0 while the peer's might be something else entirely will trigger asserts later. This fixes at least one source of #5626, and possibly #5517. Backport: cuttlefish Signed-off-by: Sage Weil <sage@inktank.com> (cherry picked from commit 9f1c27261811733f40acf759a72958c3689c8516)
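The class of bug fixed here, a lookup that silently creates an empty sub-queue, can be reproduced in Python with `collections.defaultdict`. The sketch below is an analogy and not the Pipe code: the message representation (a `(seq, payload)` tuple), the priority key, and both function bodies are assumptions. Deleting a sub-queue once it is drained is an addition of this sketch to keep the invariant; the commit message does not mention it.

```python
from collections import defaultdict, deque


def discard_requeued_up_to_buggy(out_q, priority, seq):
    """Indexing a defaultdict inserts an empty deque when the key is missing."""
    queue = out_q[priority]
    while queue and queue[0][0] <= seq:
        queue.popleft()


def discard_requeued_up_to_fixed(out_q, priority, seq):
    """Only touch the sub-queue if it already exists."""
    if priority not in out_q:
        return
    queue = out_q[priority]
    while queue and queue[0][0] <= seq:
        queue.popleft()
    # Sketch-only step: drop a drained sub-queue so no empty entry remains.
    if not queue:
        del out_q[priority]
```

With an empty `out_q`, the first version leaves one empty entry behind, so a check such as `bool(out_q)` would wrongly report queued data; the second version leaves `out_q` empty.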
| * msg/Pipe: assert lock is held in various helpersSage Weil2013-07-241-0/+3
| | | | | | | | | | | | | | These all require that we hold pipe_lock. Signed-off-by: Sage Weil <sage@inktank.com> (cherry picked from commit 579d858aabbe5df88543d096ef4dbddcfc023cca)
| * msg/Pipe: be a bit more explicit about encoding outgoing messagesSage Weil2013-07-241-2/+8
| | | | | | | | | | Signed-off-by: Sage Weil <sage@inktank.com> (cherry picked from commit 4282971d47b90484e681ff1a71ae29569dbd1d32)
| * msg/Pipe: fix RECONNECT_SEQ behaviorSage Weil2013-07-241-1/+9
| | | | | | | | | | | | | | | | | | | | Calling handle_ack() here has no effect because we have already spliced sent messages back into our out queue. Instead, pull them out of there and discard. Add a few assertions along the way. Signed-off-by: Sage Weil <sage@inktank.com> Reviewed-by: Greg Farnum <greg@inktank.com> (cherry picked from commit 495ee108dbb39d63e44cd3d4938a6ec7d11b12e3)
| * msgr: reaper: make sure pipe has been cleared (under pipe_lock)Sage Weil2013-07-241-2/+2
| | | | | | | | | | | | | | | | | | | | All paths to pipe shutdown should have cleared the con->pipe reference already. Assert as much. Also, do it under pipe_lock! Signed-off-by: Sage Weil <sage@inktank.com> (cherry picked from commit 9586305a2317c7d6bbf31c9cf5b67dc93ccab50d)
| * msg/Pipe: goto fail_unlocked on early failures in accept()Sage Weil2013-07-241-55/+46
| | | | | | | | | | | | | | | | | | Instead of duplicating an incomplete cleanup sequence (that does not clear_pipe()), goto fail_unlocked and do the cleanup in a generic way. s/rc/r/ while we are here. Signed-off-by: Sage Weil <sage@inktank.com> (cherry picked from commit ec612a5bda119cea52bbac9b2a49ecf1e83b08e5)
| * msgr: clear con->pipe inside pipe_lock on mark_downSage Weil2013-07-241-0/+10
| | | | | | | | | | | | | | We need to do this under protection of the pipe_lock. Signed-off-by: Sage Weil <sage@inktank.com> (cherry picked from commit afafb87e8402242d3897069f4b94ba46ffe0c413)
| * msgr: clear_pipe inside pipe_lock on mark_down_allSage Weil2013-07-241-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Observed a segfault in rebind -> mark_down_all -> clear_pipe -> put that may have been due to a racing thread clearing the connection_state pointer. Do the clear_pipe() call under the protection of pipe_lock, as we do in all other contexts. Signed-off-by: Sage Weil <sage@inktank.com> (cherry picked from commit 5fc1dabfb3b2cbffdee3214d24d7769d6e440e45) Conflicts: src/msg/SimpleMessenger.cc
| * ReplicatedPG: track temp collection contents, clear during on_changeSamuel Just2013-07-242-0/+29
| | | | | | | | | | | | | | | | | | | | | | | | | | | | We also assert in on_flushed() that the temp collection is actually empty. Fixes: #5670 Signed-off-by: Samuel Just <sam.just@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com> (cherry picked from commit 47516d9c4b7f023f3a16e166749fa7b1c7b3b24c) Conflicts: src/osd/ReplicatedPG.cc
| * PG, ReplicatedPG: pass a transaction down to ReplicatedPG::on_changeSamuel Just2013-07-244-7/+10
| | | | | | | | | | | | Signed-off-by: Samuel Just <sam.just@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com> (cherry picked from commit 9f56a7b8bfcb63cb4fbbc0c9b8ff01de9e518c57)
| * PG: start flush on primary only after we process the master logSamuel Just2013-07-241-5/+15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Once we start serving reads, stray objects must have already been removed. Therefore, we have to flush all operations up to the transaction writing out the authoritative log. On replicas, we flush in Stray() if we will not eventually be activated and in ReplicaActive if we are in the acting set. This way a replica won't serve a replica read until the store is consistent. Signed-off-by: Samuel Just <sam.just@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com> (cherry picked from commit b41f1ba48563d1d3fd17c2f62d10103b5d63f305)
| * ReplicatedPG: replace clean_up_local with a debug checkSamuel Just2013-07-245-13/+24
| | | | | | | | | | | | | | | | | | | | | | Stray objects should have been cleaned up in the merge_log transactions. Only on the primary have those operations necessarily been flushed at activate(). Fixes: #5084 Signed-off-by: Samuel Just <sam.just@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com> (cherry picked from commit 278c7b59228f614addf830cb0afff4988c9bc8cb)
| * FileStore: add global replay guard for split, collection_renameSamuel Just2013-07-242-0/+84
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In the event of a split or collection rename, we need to ensure that we don't replay any operations on objects within those collections prior to that point. Thus, we mark a global replay guard on the collection after doing a syncfs and make sure to check that in _check_replay_guard() for all object operations. Fixes: #5154 Signed-off-by: Samuel Just <sam.just@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com> (cherry picked from commit f3f92fe21061e21c8b259df5ef283a61782a44db) Conflicts: src/os/FileStore.cc
| * OSD: add config option for peering_wq batch sizeSamuel Just2013-07-243-5/+6
| | | | | | | | | | | | | | | | | | | | | | | | Large peering_wq batch sizes may excessively delay peering messages, resulting in unreasonably long peering. Making the batch size configurable may speed up peering. Backport: cuttlefish Related: #5084 Signed-off-by: Samuel Just <sam.just@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com> (cherry picked from commit 39e5a2a406b77fa82e9a78c267b679d49927e3c3)
* | v0.61.7v0.61.7Gary Lowell2013-07-242-1/+7
|/
* ceph-disk: use new get_dev_path helper for listSage Weil2013-07-241-4/+4
| | | | | | | | Backport: cuttlefish Signed-off-by: Sage Weil <sage@inktank.com> Reviewed-by: Dan Mick <dan.mick@inktank.com> Tested-by: Olivier Bonvalet <ob.ceph@daevel.fr> (cherry picked from commit fd1fd664d6102a2a96b27e8ca9933b54ac626ecb)
* ceph-disk: use /sys/block to determine partition device namesSage Weil2013-07-241-1/+24
| | | | | | | | Not all devices are basename + number; some have intervening character(s), like /dev/cciss/c0d1p2. Signed-off-by: Sage Weil <sage@inktank.com> (cherry picked from commit 2ea8fac441141d64ee0d26c5dd2b441f9782d840)
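A rough Python sketch of the approach named in this commit: list the partitions of a device by reading its directory under /sys/block instead of appending a number to the base name. This is not the ceph-disk implementation; the function name and the `sys_block` parameter (added so the function can be pointed at a test directory) are assumptions.

```python
import os


def list_partition_names(dev_name, sys_block="/sys/block"):
    """Return the partition names found under <sys_block>/<dev_name>.

    A partition appears as a subdirectory whose name starts with the
    device name, whatever characters come between name and number.
    """
    dev_dir = os.path.join(sys_block, dev_name)
    return sorted(
        entry
        for entry in os.listdir(dev_dir)
        if entry.startswith(dev_name)
        and os.path.isdir(os.path.join(dev_dir, entry))
    )
```

Because the names come from the directory listing, a partition such as c0d1p2 is found for device c0d1 even though it is not simply the base name followed by a digit.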
* ceph-disk: reimplement is_partition() using /sys/blockSage Weil2013-07-241-16/+9
| | | | | Signed-off-by: Sage Weil <sage@inktank.com> (cherry picked from commit 5b031e100b40f597752b4917cdbeebb366eb98d7)
* ceph-disk: use get_dev_name() helper throughoutSage Weil2013-07-241-4/+4
| | | | | | | This is more robust than the broken split trick. Signed-off-by: Sage Weil <sage@inktank.com> (cherry picked from commit 3359aaedde838c98d1155611e157fd2da9e8b9f5)
* ceph-disk: refactor list_[all_]partitionsSage Weil2013-07-241-32/+19
| | | | | | | | Make these methods work in terms of device *names*, not paths, and fix up the only direct list_partitions() caller to do the same. Signed-off-by: Sage Weil <sage@inktank.com> (cherry picked from commit 35d3f2d84808efda3d2ac868afe03e6959d51c03)
* ceph-disk: add get_dev_name, path helpersSage Weil2013-07-241-0/+25
| | | | | Signed-off-by: Sage Weil <sage@inktank.com> (cherry picked from commit e0401591e352ea9653e3276d66aebeb41801eeb3)
* ceph-disk: handle /dev/foo/bar devices throughoutSage Weil2013-07-241-4/+4
| | | | | | | | Assume the last component is the unique device name, even if it appears under a subdir of /dev. Signed-off-by: Sage Weil <sage@inktank.com> (cherry picked from commit cb97338b1186939deecb78e9d949c38c3ef59026)
* ceph-disk: make is_held() smarter about full disksSage Weil2013-07-241-7/+14
| | | | | | | | | Handle the case where the device is a full disk. Make the partition check a bit more robust (don't make assumptions about naming aside from the device being a prefix of the partition). Signed-off-by: Sage Weil <sage@inktank.com> (cherry picked from commit e082f1247fb6ddfb36c4223cbfdf500d6b45c978)
* mon/OSDMonitor: search for latest full osdmap if record version is missingSage Weil2013-07-241-0/+10
| | | | | | | | | | | | | | In 97462a3213e5e15812c79afc0f54d697b6c498b1 we tried to search for a recent full osdmap but were looking at the wrong key. If full_0 was present we could record that the latest full map was last_committed even though it wasn't present. This is fixed in 76cd7ac1c, but we need to compensate for when get_version_latest_full() gives us a back version number by repeating the search. Fixes: #5737 Signed-off-by: Sage Weil <sage@inktank.com> Reviewed-by: Joao Eduardo Luis <joao.luis@inktank.com> (cherry picked from commit c2131d4047156aa2964581c9dbd93846382a07e7)
* test: test_store_tool: global init before using LevelDBStoreJoao Eduardo Luis2013-07-241-4/+22
| | | | | | | | Fixes a segfault Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com> (cherry picked from commit a7a7d3fc8a2ba4a30ef136a32f2903d157b3e19a)
* mon: OSDMonitor: fix a bug introduced on 97462a32Joao Eduardo Luis2013-07-241-1/+1
| | | | | | | | | Fixes: #5737 Backport: cuttlefish Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com> (cherry picked from commit 76cd7ac1c2094b34ad36bea89b2246fa90eb2f6d)
* mon/Paxos: fix pn for uncommitted value during collect/last phaseSage Weil2013-07-241-2/+33
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | During the collect/last exchange, peers share any uncommitted values with the leader. They are supposed to also share the pn under which that value was accepted, but were instead using the just-accepted pn value. This effectively meant that we *always* took the uncommitted value; if there were multiples, which one we accepted depended on what order the LAST messages arrived, not which pn the values were generated under. The specific failure sequence I observed: - collect - learned uncommitted value for 262 from myself - send collect with pn 901 - got last with pn 901 (incorrect) for 200 (old) from peer - discard our own value, remember the other - finish collect phase - ignore old uncommitted value Fix this by storing a pending_v and pending_pn value whenever we accept a value. Use this to send an appropriate pn value in the LAST reply so that the leader can make its decision about which uncommitted value to accept based on accurate information. Also use it when we learn the uncommitted value from ourselves. We could probably be more clever about storing less information here, for example by omitting pending_v and clearing pending_pn at the appropriate point, but that would be more fragile. Similarly, we could store a pn for *every* commit if we wanted to lay some groundwork for having multiple uncommitted proposals in flight, but I don't want to speculate about what is necessary or sufficient for a correct solution there. Fixes: #5698 Backport: cuttlefish, bobtail Signed-off-by: Sage Weil <sage@inktank.com> (cherry picked from commit 20baf662112dd5f560bc3a2d2114b469444c3de8)
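Why the reported pn matters can be shown with a small model of the leader's choice. The Python below is a simplified sketch, not the Paxos implementation: the `Uncommitted` type, the `choose_uncommitted` function, the pn values 900 and 300, and the tie-breaking rule (a later reply with an equal pn replaces the earlier one) are all assumptions chosen for illustration.

```python
from typing import Iterable, NamedTuple, Optional


class Uncommitted(NamedTuple):
    version: int   # version the value was proposed for
    pn: int        # proposal number reported with the value
    value: bytes


def choose_uncommitted(replies: Iterable[Uncommitted]) -> Optional[Uncommitted]:
    """Keep the reply with the highest pn; on a tie the later arrival wins."""
    best = None
    for reply in replies:
        if best is None or reply.pn >= best.pn:
            best = reply
    return best


# Replies carrying the pn each value was really accepted under (made-up numbers).
accurate = [Uncommitted(262, 900, b"new"), Uncommitted(200, 300, b"old")]
# Replies all stamped with the just-accepted pn, as in the failure described above.
stamped = [Uncommitted(262, 901, b"new"), Uncommitted(200, 901, b"old")]
```

With `accurate`, the value for 262 is chosen whichever order the replies arrive in. With `stamped`, every pn ties, so the last reply to arrive wins, which matches the order dependence described in the commit message.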
* mon/Paxos: debug ignored uncommitted valuesSage Weil2013-07-241-11/+17
| | | | | Signed-off-by: Sage Weil <sage@inktank.com> (cherry picked from commit 19b29788966eb80ed847630090a16a3d1b810969)
* mon/Paxos: only learn uncommitted value if it is in the futureSage Weil2013-07-241-1/+3
| | | | | | | | | | | | If an older peer sends an uncommitted value, make sure we only take it if it is in the future, and at least as new as any current uncommitted value. (Prior to the previous patch, peers could send values from long-past rounds. The pn values are also bogus.) Signed-off-by: Sage Weil <sage@inktank.com> (cherry picked from commit b3253a453c057914753846c77499f98d3845c58e)
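The two conditions in this commit message can be written as a small predicate. This is an illustrative Python sketch, not the Paxos code; `should_learn_uncommitted` and its parameter names are hypothetical.

```python
def should_learn_uncommitted(peer_v, last_committed, current_uncommitted_v=None):
    """Return True if a peer's uncommitted value is worth remembering.

    peer_v: version the peer's uncommitted value belongs to.
    last_committed: our last committed version.
    current_uncommitted_v: version of the uncommitted value we already hold, if any.
    """
    # "In the future": strictly newer than anything already committed.
    if peer_v <= last_committed:
        return False
    # At least as new as the uncommitted value we already know about.
    if current_uncommitted_v is not None and peer_v < current_uncommitted_v:
        return False
    return True
```

For instance, with `last_committed` at 261, a peer value for version 200 is rejected and a value for version 262 is accepted.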
* mon/Paxos: only share uncommitted value if it is nextSage Weil2013-07-241-1/+2
| | | | | | | | | | We may have an uncommitted value from our perspective (it is our lc + 1) when the collector has a much larger lc (because we have been out for the last few rounds). Only share an uncommitted value if it is in fact the next value. Signed-off-by: Sage Weil <sage@inktank.com> (cherry picked from commit b26b7f6e5e02ac6beb66e3e34e177e6448cf91cf)
* v0.61.6v0.61.6Gary Lowell2013-07-232-1/+7
|