| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
| |
Signed-off-by: Samuel Just <sam.just@inktank.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Fixes valgrind warning:
==14803== Use of uninitialised value of size 8
==14803== at 0x12E7614: sctp_crc32c_sb8_64_bit (sctp_crc32.c:567)
==14803== by 0x12E76F8: update_crc32 (sctp_crc32.c:609)
==14803== by 0x12E7720: ceph_crc32c_le (sctp_crc32.c:733)
==14803== by 0x105085F: ceph::buffer::list::crc32c(unsigned int) (buffer.h:427)
==14803== by 0x115D7B2: Message::calc_front_crc() (Message.h:441)
==14803== by 0x1159BB0: Message::encode(unsigned long, bool) (Message.cc:170)
==14803== by 0x1323934: Pipe::writer() (Pipe.cc:1524)
==14803== by 0x13293D9: Pipe::Writer::entry() (Pipe.h:59)
==14803== by 0x120A398: Thread::_entry_func(void*) (Thread.cc:41)
==14803== by 0x503BE99: start_thread (pthread_create.c:308)
==14803== by 0x6C6E4BC: clone (clone.S:112)
Backport: cuttlefish
Signed-off-by: Sage Weil <sage@inktank.com>
(cherry picked from commit eb91f41042fa31df2bef9140affa6eac726f6187)
|
|\
| |
| |
| | |
Reviewed-by: Samuel Just <sam.just@inktank.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
On XFS this call is problematic because it directly calls the filemap
writeback without vectoring through xfs. This can break the delicate
ordering of writeback and range zeroing; see #4976 and this thread
http://oss.sgi.com/archives/xfs/2013-06/msg00066.html
Drop this behavior for now to avoid subtle data corruption.
Signed-off-by: Sage Weil <sage@inktank.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
The use of sync_file_range(2) on XFS screws up XFS' delicate ordering
of writeback and range zeroing; see #4976 and this thread:
http://oss.sgi.com/archives/xfs/2013-06/msg00066.html
Instead, replace all sync_file_range(2) calls with fdatasync(2), which
*does* do ordered writeback and should not leak unzeroed blocks.
Signed-off-by: Sage Weil <sage@inktank.com>
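
To illustrate the replacement described above, here is a minimal sketch of a write followed by fdatasync(2) on the whole file, rather than a ranged sync_file_range(2) call. It is not the code from this commit: the helper name write_durably is hypothetical, and it assumes a Linux host where Python exposes os.fdatasync.

```python
import os
import tempfile


def write_durably(path, data):
    """Write data to path, then flush file data with fdatasync(2).

    Hypothetical helper for illustration only; assumes os.fdatasync is
    available (Linux and most other Unix systems, but not macOS).
    """
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        view = memoryview(data)
        # os.write may write fewer bytes than requested, so loop.
        while view:
            written = os.write(fd, view)
            view = view[written:]
        # Flush the file's data (and the metadata needed to read it back)
        # through the filesystem's own ordered writeback path.
        os.fdatasync(fd)
    finally:
        os.close(fd)


if __name__ == "__main__":
    target = os.path.join(tempfile.mkdtemp(), "object")
    write_durably(target, b"payload")
```

The trade-off is that fdatasync(2) flushes the whole file rather than a byte range, so it may do more I/O than a ranged call would.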
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
common/Preforker.h: In member function ‘int Preforker::signal_exit(int)’:
warning: common/Preforker.h:82:45: ignoring return value of ‘ssize_t safe_write(int, const void*, size_t)’, declared with attribute warn_unused_result [-Wunused-result]
This is harder than it should be to fix. :(
http://stackoverflow.com/questions/3614691/casting-to-void-doesnt-remove-warn-unused-result-error
Whatever, I guess we can do something useful with this return value.
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: David Zafman <david.zafman@inktank.com>
(cherry picked from commit ce7b5ea7d5c30be32e4448ab0e7e6bb6147af548)
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
First of all, we must find a monmap to back up: the newest version.
Secondly, we must make sure we back it up before clearing the store.
Finally, we must make sure that we don't remove said backup while
clearing the store; otherwise, we would be out of a backup monmap if the
sync happened to fail (and if the monitor happened to be killed before a
new sync had finished).
This patch makes sure these conditions are met.
Fixes: #5256 (partially)
Backport: cuttlefish
Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
(cherry picked from commit 5e6dc4ea21b452e34599678792cd36ce1ba3edb3)
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Always use the highest version amongst all the typically available
monmaps: whatever we have in memory, whatever we have under the
MonmapMonitor's store, and whatever we have backed up from a previous
sync. This ensures we always use the newest version we have come
across.
Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
(cherry picked from commit 6284fdce794b73adcc757fee910e975b6b4bd054)
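
The selection rule above can be sketched as taking the maximum version over whichever candidates exist. This is an illustration only, not the monitor's code: the MonMap tuple, its field names, and newest_monmap are hypothetical.

```python
from typing import NamedTuple, Optional


class MonMap(NamedTuple):
    """Hypothetical stand-in for a monmap: a version plus where it came from."""
    version: int
    source: str


def newest_monmap(*candidates: Optional[MonMap]) -> Optional[MonMap]:
    """Return the candidate with the highest version, or None if none exist."""
    present = [m for m in candidates if m is not None]
    if not present:
        return None
    return max(present, key=lambda m: m.version)


if __name__ == "__main__":
    in_memory = MonMap(3, "memory")
    in_store = None  # e.g. the store was cleared by a sync
    backup = MonMap(5, "backup")
    print(newest_monmap(in_memory, in_store, backup))
```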
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Otherwise, we will end up losing the monmap we backed up when we started
the sync, and the monitor may be unable to start if it is killed or
crashes in-between the sync abort and finishing a new sync.
Fixes: #5256 (partially)
Backport: cuttlefish
Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
(cherry picked from commit af5a9861d7c6b4527b0d2312d0efa792910bafd9)
|
| |
| |
| |
| |
| |
| |
| | |
From 654299108bfb11e7dce45f54946d1505f71d2de8.
Signed-off-by: Sage Weil <sage@inktank.com>
(cherry picked from commit e9689ac6f5f50b077a6ac874f811d204ef996c96)
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
The %ghost %dir ... line will make this get cleaned up but won't install
it.
Reported-by: Derek Yarnell <derek@umiacs.umd.edu>
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Gary Lowell <gary.lowell@inktank.com>
(cherry picked from commit 64ee0148a5b7324c7df7de2d5f869b880529d452)
|
| |
| |
| |
| |
| |
| |
| |
| | |
This handles cases where the daemon is started without the benefit of
sysvinit or upstart (as with teuthology or ceph-fuse).
Signed-off-by: Sage Weil <sage@inktank.com>
(cherry picked from commit 654299108bfb11e7dce45f54946d1505f71d2de8)
|
| |
| |
| |
| |
| |
| |
| |
| |
| | |
merge_log and friends all take care of dirtying the log
as necessary.
Fixes: #5238
Signed-off-by: Samuel Just <sam.just@inktank.com>
(cherry picked from commit 5deece1d034749bf72b7bd04e4e9c5d97e5ad6ce)
|
|/
|
|
|
|
|
|
|
|
|
| |
apply_incremental() may return -EINVAL. Don't ignore it.
[1] UfP = Update from Paxos
Fixes: #5343
Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com>
(cherry picked from commit e3c33f4315cbf8718f61eb79e15dd6d44fc908b7)
|
|
|
|
|
|
|
|
|
|
| |
If we get a reset during our attempt to open an MDS session, close out the
Connection* and retry to open the session, moving the waiters over.
Fixes: #5379
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Greg Farnum <greg@inktank.com>
(cherry picked from commit df8a3e5591948dfd94de2e06640cfe54d2de4322)
|
|
|
|
|
| |
Signed-off-by: Sage Weil <sage@inktank.com>
(cherry picked from commit 8c6b24e9039079e897108f28d6af58cbc703a15a)
|
|
|
|
|
|
|
| |
The weird output from libreadline users is related to the TERM variable.
Signed-off-by: Sage Weil <sage@inktank.com>
(cherry picked from commit e538829f16ce19d57d63229921afa01cc687eb86)
|
|
|
|
|
|
|
|
| |
Make the ancient-udev/blkid workaround script for RHEL/CentOS create the
symlinks for us too.
Signed-off-by: Sage Weil <sage@inktank.com>
(cherry picked from commit d7f7d613512fe39ec883e11d201793c75ee05db1)
|
|
|
|
|
|
|
|
|
| |
Keep going even if we hit one activation error. This avoids failing to
start some disks when only one of them won't start (e.g., because it
doesn't belong to the current cluster).
Signed-off-by: Sage Weil <sage@inktank.com>
(cherry picked from commit c9074375bfbe1e3757b9c423a5ff60e8013afbce)
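
A minimal sketch of the "keep going" behaviour described above: attempt every device, record failures, and report them at the end instead of stopping at the first error. The function and parameter names are hypothetical and this is not the actual ceph-disk code.

```python
def activate_all(devices, activate):
    """Try to activate every device; return a list of (device, error) pairs.

    `activate` is any callable that raises on failure (an assumption made
    for this sketch).
    """
    failures = []
    for dev in devices:
        try:
            activate(dev)
        except Exception as exc:
            # Remember the failure but continue with the remaining devices.
            failures.append((dev, exc))
    return failures


if __name__ == "__main__":
    def fake_activate(dev):
        if dev == "/dev/sdb1":
            raise RuntimeError("does not belong to the current cluster")

    for dev, err in activate_all(["/dev/sda1", "/dev/sdb1", "/dev/sdc1"], fake_activate):
        print("failed to activate %s: %s" % (dev, err))
```

A caller would typically exit non-zero if the returned list is non-empty, so the error is still visible even though the other disks were started.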
|
|
|
|
|
|
|
| |
Commit f3234c147e083f2904178994bc85de3d082e2836 missed this.
Signed-off-by: Sage Weil <sage@inktank.com>
(cherry picked from commit 253069e04707c5bf46869f4ff5a47ea6bb0fde3e)
|
|
|
|
|
|
|
|
|
|
| |
This was commented out almost years ago in commit 9baf5ef4 but it is not
clear to me that it was correct to do so. In any case, we are not
installing the rc.d links for ceph, which means it does not start up after
a reboot.
Signed-off-by: Sage Weil <sage@inktank.com>
(cherry picked from commit cc9b83a80262d014cc37f0c974963cf7402a577a)
|
|
|
|
|
|
|
|
| |
On 'service ceph start' or 'service ceph start osd' or start ceph-osd-all
we should activate any osd GPT partitions.
Signed-off-by: Sage Weil <sage@inktank.com>
(cherry picked from commit 13680976ef6899cb33109f6f841e99d4d37bb168)
|
|
|
|
|
|
|
|
|
| |
Scan /dev/disk/by-parttypeuuid for ceph OSDs and activate them all. This
is useful when the event didn't trigger on the initial udev event for
some reason.
Signed-off-by: Sage Weil <sage@inktank.com>
(cherry picked from commit 5c7a23687a1a21bec5cca7b302ac4ba47c78e041)
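
The scan described above might look roughly like the sketch below. Two things are assumptions here, not facts taken from this commit: that entries in the directory are named "<partition-type-uuid>.<partition-uuid>", and that the OSD data partition type UUID is the value shown. The function name is hypothetical, and the sketch only lists candidates rather than activating them.

```python
import os

# Assumed GPT partition type UUID for ceph OSD data partitions.
OSD_PARTTYPE = "4fbd7e29-9d25-41b8-afd0-062c0ceff05d"


def find_osd_partitions(directory="/dev/disk/by-parttypeuuid"):
    """Return paths of entries whose type-UUID prefix matches OSD_PARTTYPE."""
    found = []
    for name in sorted(os.listdir(directory)):
        # Assumed entry format: "<type-uuid>.<partition-uuid>".
        tag, sep, _part_uuid = name.partition(".")
        if not sep:
            continue
        if tag.lower() == OSD_PARTTYPE:
            found.append(os.path.join(directory, name))
    return found
```

Each returned path could then be passed to whatever activation step the caller uses.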
|
|
|
|
|
|
|
| |
We need this to help trigger OSD activations.
Signed-off-by: Sage Weil <sage@inktank.com>
(cherry picked from commit d512dc9eddef3299167d4bf44e2018b3b6031a22)
|
|
|
|
|
|
|
|
|
|
|
|
| |
Fixes: #5362
When listing objects prefix needs to be escaped correctly (the
same as with the marker). Otherwise listing objects with prefix
that starts with underscore doesn't work.
Backport: bobtail, cuttlefish
Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
Reviewed-by: Greg Farnum <greg@inktank.com>
(cherry picked from commit d582ee2438a3bd307324c5f44491f26fd6a56704)
|
|
|
|
|
| |
Signed-off-by: Sage Weil <sage@inktank.com>
(cherry picked from commit cd1c289b96a874ff99a83a44955d05efc9f2765a)
|
|
|
|
|
|
|
|
|
|
| |
If we have dropped all references to a revoked capability, send the ack
to the MDS. This typo has been there since v0.7 (early 2009)!
Backport: cuttlefish
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Greg Farnum <greg@inktank.com>
(cherry picked from commit b7143c2f84daafbe2c27d5b2a2d5dc40c3a68d15)
|
|
|
|
|
|
|
|
| |
This was initialized in (one of) the ctor(s), but not encoded/decoded,
and not used. Remove it. This makes valgrind happy.
Signed-off-by: Sage Weil <sage@inktank.com>
(cherry picked from commit 08bb8d510b5abd64f5b9f8db150bfc8bccaf9ce8)
|
|
|
|
|
|
| |
Backport: cuttlefish
Signed-off-by: Sage Weil <sage@inktank.com>
(cherry picked from commit 4974b29e251d433101b69955091e22393172bcd8)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
If we exit via preforker, call exit(3) and not recursively back into
Preforker::exit(r). Otherwise you get a hang with the child blocked
at:
Thread 1 (Thread 0x7fa08962e7c0 (LWP 5419)):
#0 0x000000309860e0cd in write () from /lib64/libpthread.so.0
#1 0x00000000005cc906 in Preforker::exit(int) ()
#2 0x00000000005c8dfb in main ()
and the parent at
#0 0x000000309860eba7 in waitpid () from /lib64/libpthread.so.0
#1 0x00000000005cc87a in Preforker::parent_wait() ()
#2 0x00000000005c75ae in main ()
Backport: cuttlefish
Signed-off-by: Sage Weil <sage@inktank.com>
(cherry picked from commit 7e7ff7532d343c473178799e37f4b83cf29c4eee)
|
|
|
|
|
|
| |
Fixes #5342
Signed-off-by: Gary Lowell <gary.lowell@inktank.com>
|
|
|
|
|
|
|
|
|
| |
If ceph-mon segfaults, the socket file isn't removed.
By adding a remove in post-stop, upstart cleans the run directory properly.
Signed-off-by: Guilhem Lettron <guilhem@lettron.fr>
(cherry picked from commit 554b41b171eab997038e83928c462027246c24f4)
|
|
|
|
|
|
| |
Upstart tasks don't have the concept of 'stop on' as they
are not long running.
(cherry picked from commit 17f6fccabc262b9a6d59455c524b550e77cd0fe3)
|
|
|
|
|
| |
Signed-off-by: Dan Mick <dan.mick@inktank.com>
(cherry picked from commit f86b4e7a4831c684033363ddd335d2f3fb9a189a)
|
|
|
|
|
|
|
|
| |
Cast output of _check_output() to str() to be able to use
str.split().
Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
(cherry picked from commit 16ecae153d260407085aaafbad1c1c51f4486c9a)
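
As an illustration of why the output is normalised to a string before splitting, here is a small sketch. The helper name output_as_text is hypothetical. Note one assumption that differs from the commit: on Python 3, calling str() on bytes yields the repr (for example "b'...'"), so this sketch decodes bytes instead of casting.

```python
def output_as_text(raw):
    """Return command output as text so that str.split() behaves as expected."""
    if isinstance(raw, bytes):
        # On Python 3, subprocess output is bytes unless text mode is requested.
        return raw.decode("utf-8", errors="replace")
    return str(raw)


if __name__ == "__main__":
    import subprocess
    import sys

    raw = subprocess.check_output([sys.executable, "-c", "print('a b c')"])
    print(output_as_text(raw).split())
```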
|
|
|
|
|
| |
Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
(cherry picked from commit 9785478a2aae7bf5234fbfe443603ba22b5a50d2)
|
|
|
|
|
| |
Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
(cherry picked from commit 9429ff90a06368fc98d146e065a7b9d1b68e9822)
|
|
|
|
|
| |
Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
(cherry picked from commit c127745cc021c8b244d721fa940319158ef9e9d4)
|
|
|
|
|
|
|
| |
It doesn't mean anything anymore; drop it.
Signed-off-by: Sage Weil <sage@inktank.com>
(cherry picked from commit bcfd2f31a50d27038bc02e645795f0ec99dd3b32)
|
|
|
|
|
|
|
| |
Trigger 'ceph-disk activate-journal' from the alt udev rules.
Signed-off-by: Sage Weil <sage@inktank.com>
(cherry picked from commit b139152039bfc0d190f855910d44347c9e79b22a)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The kernel does not let you mount --move when the parent mount is
shared (see, e.g., https://bugzilla.redhat.com/show_bug.cgi?id=917008
for another person whom this also confused). We can't use --bind either
since that (on RHEL at least) screws up /etc/mtab so that the final
result looks like
/var/lib/ceph/tmp/mnt.HNHoXU /var/lib/ceph/osd/ceph-0 none rw,bind 0 0
Instead, mount the original dev in the final location and then umount
from the old location.
Signed-off-by: Sage Weil <sage@inktank.com>
(cherry picked from commit e5ffe0d2484eb6cbcefcaeb5d52020b1130871a5)
|
|
|
|
|
|
|
|
| |
These are needed for old or buggy udev. Having them for new and unbroken
udev is harmless.
Signed-off-by: Sage Weil <sage@inktank.com>
(cherry picked from commit f3234c147e083f2904178994bc85de3d082e2836)
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
parted on RHEL/CentOS prefixes the *machine readable output* with
1b 5b 3f 31 30 33 34 68
Note that the same thing happens when you 'import readline' in python.
Work around it!
Signed-off-by: Sage Weil <sage@inktank.com>
(cherry picked from commit 82ff72f827b9bd7f91d30a09d35e42b25d2a7344)
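
The byte sequence quoted above is ESC followed by "[?1034h". A workaround along the lines described could be sketched as follows. The helper name is hypothetical, the sample output is made up for the example, and this is not the actual ceph-disk change.

```python
# The eight bytes from the commit message: 1b 5b 3f 31 30 33 34 68
TERM_PREFIX = b"\x1b[?1034h"


def strip_term_prefix(output):
    """Drop the stray terminal escape sequence if the output starts with it."""
    if output.startswith(TERM_PREFIX):
        return output[len(TERM_PREFIX):]
    return output


if __name__ == "__main__":
    sample = TERM_PREFIX + b"BYT;\n"  # illustrative sample, not captured output
    print(strip_term_prefix(sample))
```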
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Activate an osd via its journal device. udev populates its symlinks and
triggers events in an order that is not related to whether the device is
an osd data partition or a journal. That means that triggering
'ceph-disk activate' can happen before the journal (or journal symlink)
is present and then fail.
Similarly, it may be that they are on different disks that are hotplugged
with the journal second.
This can be wired up to the journal partition type to ensure that osds are
started when the journal appears second.
Include the udev rules to trigger this.
Signed-off-by: Sage Weil <sage@inktank.com>
(cherry picked from commit a2a78e8d16db0a71b13fc15457abc5fe0091c84c)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
After we change the final partition type, sgdisk may or may not trigger a
udev event, depending on how well udev is behaving (it varies between
distros, it seems). The old code would often settle and wait for udev to
activate the device, and then partprobe would uselessly fail because it
was already mounted.
Call partprobe only at the very end, after prepare is done. This ensures
that if partprobe calls udevadm settle (which it sometimes does) we do not
get stuck.
Drop the udevadm settle. I'm not sure what this accomplishes; take it out,
at least until we determine we need it.
Signed-off-by: Sage Weil <sage@inktank.com>
(cherry picked from commit 8b3b59e01432090f7ae774e971862316203ade68)
|
|
|
|
|
| |
Signed-off-by: Sage Weil <sage@inktank.com>
(cherry picked from commit 10ba60cd088c15d4b4ea0b86ad681aa57f1051b6)
|
|
|
|
|
|
|
| |
Broken by 225fefe5e7c997b365f481b6c4f66312ea28ed61.
Signed-off-by: Sage Weil <sage@inktank.com>
(cherry picked from commit bcc8bfdb672654c6a6b48a2aa08267a894debc32)
|
|
|
|
|
|
|
|
|
|
| |
It is often useful to prepare but not activate a device, for example when
preparing a bunch of spare disks. This marks a device as 'do not
activate' so that it can be prepared without activating.
Fixes: #3255
Signed-off-by: Sage Weil <sage@inktank.com>
(cherry picked from commit 225fefe5e7c997b365f481b6c4f66312ea28ed61)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Starting when only one network interface has started breaks machines with
multiple nics in very problematic ways.
There may be an earlier trigger that we can use for cases where other
services on the local machine depend on ceph, but for now this is better
than the existing behavior.
See #5248
Signed-off-by: Sage Weil <sage@inktank.com>
(cherry picked from commit 7e08ed1bf154f5556b3c4e49f937c1575bf992b8)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We regularly have been observing a stall where the MDS is blocked waiting
for a cap revocation (Ls, in our case) and never gets a reply. We finally
tracked down the sequence:
- mds issues cap seq 1 to client
- mds does revocation (seq 2)
- client replies
- much time goes by
- client trims inode from cache, sends release with seq == 2
- mds ignores release because its issue_seq is 1
- mds later tries to revoke other caps
- client discards message because it doesn't have the inode in cache
The problem is simply that we are using seq instead of issue_seq in the
cap release message. Note that the other release call site in
encode_inode_release() is correct. That one is much more commonly
triggered by short tests, as compared to this case where the inode needs to
get pushed out of the client cache.
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Greg Farnum <greg@inktank.com>
(cherry picked from commit 9b012e234a924efd718826ab6a53b9aeb7cd6649)
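
The sequence above can be replayed with a toy model. Everything in this sketch is a simplification invented for illustration (class names, methods, and the rule that the MDS accepts a release only when it carries the current issue_seq); it is not the MDS or client code.

```python
class ToyMds:
    """Tracks the two counters for one cap, as described in the sequence above."""

    def __init__(self):
        self.seq = 0        # bumped on every message about the cap
        self.issue_seq = 0  # set only when the cap is (re)issued
        self.released = False

    def issue(self):
        self.seq += 1
        self.issue_seq = self.seq
        return self.seq, self.issue_seq

    def revoke(self):
        self.seq += 1
        return self.seq, self.issue_seq

    def handle_release(self, release_seq):
        # Simplified rule: ignore releases that do not match issue_seq.
        if release_seq != self.issue_seq:
            return False
        self.released = True
        return True


class ToyClient:
    def __init__(self):
        self.seq = 0
        self.issue_seq = 0

    def on_cap_message(self, seq, issue_seq):
        self.seq, self.issue_seq = seq, issue_seq

    def release_buggy(self):
        return self.seq        # the bug: sends seq

    def release_fixed(self):
        return self.issue_seq  # the fix: sends issue_seq


def replay(use_fix):
    """Replay issue -> revoke -> release; return True if the MDS accepts it."""
    mds, client = ToyMds(), ToyClient()
    client.on_cap_message(*mds.issue())   # mds issues cap seq 1
    client.on_cap_message(*mds.revoke())  # mds does revocation (seq 2)
    seq = client.release_fixed() if use_fix else client.release_buggy()
    return mds.handle_release(seq)
```

In this model, replay(False) sends seq 2 against an issue_seq of 1 and is ignored, while replay(True) sends 1 and is accepted.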
|