| Commit message (Collapse) | Author | Age | Files | Lines |
| |
This will eventually be used to tell the client to send their request
elsewhere.
Signed-off-by: Sage Weil <sage@inktank.com>
|
Users probably want get_pg_acting_rank(). If they don't, they probably
have the mapping and can calculate the rank themselves. Having this here
is asking for bugs like #2022.
Signed-off-by: Sage Weil <sage@inktank.com>
|
We want to look at the acting set here, nothing else. This was causing us
to erroneously queue ops for later (wasting memory) and to erroneously
print out a 'misdirected op' message in the cluster log (confusion and
incorrect [but ignored] -ENXIO reply).
Fixes: #2022
Signed-off-by: Sage Weil <sage@inktank.com>
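To illustrate what "look at the acting set, nothing else" can mean, here is a minimal Python sketch. It is not Ceph's C++ code. It assumes the acting set is an ordered list of OSD ids, and both function names (`acting_rank`, `is_misdirected`) are hypothetical.

```python
def acting_rank(acting, osd_id):
    """Return the position of osd_id in the acting set, or -1 if absent."""
    try:
        return acting.index(osd_id)
    except ValueError:
        return -1


def is_misdirected(acting, osd_id):
    """Assumption: an op is misdirected when this OSD is not in the acting set.

    Only the acting set is consulted; no other mapping is examined.
    """
    return acting_rank(acting, osd_id) < 0
```

With an acting set of `[3, 1, 4]`, OSD 1 has rank 1, and an op arriving at OSD 7 would be flagged under this assumption.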
|
Make the helper exclusively for the PG != NULL cases, and open-code the
one PG == NULL caller. This is simpler, and lets us include more useful
information in the log message.
Signed-off-by: Sage Weil <sage@inktank.com>
|
Stores absolute path to the generated keyring so that tests running in
other directories (e.g. src/java/test) can simply reference the
generated ceph.conf.
Signed-off-by: Noah Watkins <jawhawk@cs.ucsc.edu>
|
In order to test monitor and osd failure detection and false
positive correction, this patch adds the following options:
1. osd_debug_drop_ping_probability: probability of dropping
a string of pings from a client upon ping receipt.
2. osd_debug_drop_ping_duration: number of pings to drop in
a row.
This should help with replicating some wrongly-marked-down
thrashing cases.
Signed-off-by: Samuel Just <sam.just@inktank.com>
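The interaction of the two options can be sketched in Python as follows. This is a simplified model for a single peer, not the OSD implementation. The class name `PingDropper` and the choice to count the triggering ping as the first dropped one are assumptions.

```python
import random


class PingDropper:
    """Hypothetical model of the two debug options described above."""

    def __init__(self, drop_probability, drop_duration, rng=None):
        self.drop_probability = drop_probability  # chance of starting a drop run
        self.drop_duration = drop_duration        # number of pings dropped per run
        self.remaining = 0                        # pings left to drop in the current run
        self.rng = rng or random.Random()

    def should_drop(self):
        """Return True if the ping that just arrived should be discarded."""
        if self.remaining > 0:
            self.remaining -= 1
            return True
        if self.drop_duration > 0 and self.rng.random() < self.drop_probability:
            # Assumption: the ping that triggers the run is the first one dropped.
            self.remaining = self.drop_duration - 1
            return True
        return False
```

Passing a seeded `random.Random` as `rng` makes a test run reproducible.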
|
to a set of CSV files for off-line analysis.
Signed-off-by: caleb miles <caleb.miles@inktank.com>
|
Signed-off-by: John Wilkins <john.wilkins@inktank.com>
|
Signed-off-by: John Wilkins <john.wilkins@inktank.com>
|
When we get a ping reply, remove the peer from the failure_queue
and send a still alive message if the peer is in the failure_pending
map.
Otherwise, the monitor could slowly accumulate sporadic failure reports
leading to an osd being incorrectly marked out.
This bug may have been contributing to the wrongly-marked-down
thrashing observed on some systems.
Signed-off-by: Samuel Just <sam.just@inktank.com>
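The ping-reply handling described above might look roughly like this in Python. It is a sketch, not the OSD code: `failure_pending` is modelled as a set rather than a map, `send_still_alive` is a hypothetical callback, and clearing the pending entry after the message is sent is an assumption.

```python
class FailureTracker:
    """Hypothetical model of an OSD's bookkeeping for suspect peers."""

    def __init__(self, send_still_alive):
        self.failure_queue = set()     # peers we intend to report as failed
        self.failure_pending = set()   # peers already reported to the monitor
        self.send_still_alive = send_still_alive  # callback taking a peer id

    def handle_ping_reply(self, peer):
        # The peer answered, so it should not be reported as failed.
        self.failure_queue.discard(peer)
        # If a report already went out, tell the monitor the peer is alive.
        if peer in self.failure_pending:
            self.send_still_alive(peer)
            # Assumption: the pending entry is cleared once retracted.
            self.failure_pending.discard(peer)
```

Without the retraction step, stale reports in this model would never be withdrawn, which is the accumulation the commit message describes.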
|
If the osd receiving the log has divergent entries, it will
also have a "divergent" stat structure. In general, it suffices
to simply trust the stat structure shipped with the authoritative
log and info since merge_log is only used to merge an authoritative
log.
Probably fixes #2769.
In cases like #2769, this bug can result in a primary with a stat
structure which double counts an operation: once for the
divergent operation, and once for the replay. It turned up
in a regression suite run as a scrub stat mismatch.
Signed-off-by: Samuel Just <sam.just@inktank.com>
|
These Python tests aren't installed, so they need to be downloaded.
Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
|
Consider the following sequence:
1. issue, apply repop
2. replicas and primary commit
Here, repop->waitfor_(ack|disk) are empty, so we mark
repop->done and remove_repop.
3. interval change, repops still in queue are marked aborted
4. activate, last_update_applied = last_update
5. the repop from step 1 enters apply_repop, is not aborted,
and finds that last_update_applied has passed it by.
Fixes #2749
Signed-off-by: Samuel Just <sam.just@inktank.com>
|
test/test_librbd.cc: In member function ‘virtual void LibRBD_TestClone_Test::TestBody()’:
warning: test/test_librbd.cc:1040:111: format ‘%ld’ expects argument of type ‘long int’, but argument 2 has type ‘uint64_t {aka long long unsigned int}’ [-Wformat]
warning: test/test_librbd.cc:1040:111: format ‘%ld’ expects argument of type ‘long int’, but argument 3 has type ‘uint64_t {aka long long unsigned int}’ [-Wformat]
warning: test/test_librbd.cc:1040:111: format ‘%ld’ expects argument of type ‘long int’, but argument 4 has type ‘int64_t {aka long long int}’ [-Wformat]
Signed-off-by: Sage Weil <sage@inktank.com>
|
Signed-off-by: Samuel Just <sam.just@inktank.com>
|
Sum the quantized weights for each bucket, and check that for overflow.
This could change the results of a compile marginally if the map is using
non-divisible weight values that quantize funny. The old code might
calculate a bucket sum that is not the actual sum of the quantized weights.
Signed-off-by: Sage Weil <sage@inktank.com>
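A small Python sketch shows why summing quantized weights can differ from quantizing a sum. It assumes CRUSH's 16.16 fixed-point weight representation (0x10000 is 1.0), truncating conversion, and a 32-bit ceiling for the bucket total. The function names are hypothetical and this is not the crushtool compiler code.

```python
FIXED_ONE = 0x10000      # 1.0 in 16.16 fixed point
MAX_U32 = 0xFFFFFFFF     # assumed ceiling for a bucket's summed weight


def quantize(weight):
    """Convert a floating-point weight to 16.16 fixed point (truncating)."""
    return int(weight * FIXED_ONE)


def bucket_weight(item_weights):
    """Sum the quantized item weights and fail if the total overflows."""
    total = sum(quantize(w) for w in item_weights)
    if total > MAX_U32:
        raise OverflowError("bucket weight overflows 32 bits")
    return total
```

For three items of weight 0.1, `bucket_weight` gives 19659, while `quantize(0.3)` gives 19660. That is the kind of marginal difference the message warns about.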
|
Signed-off-by: caleb miles <caleb.miles@inktank.com>
|
Disallow setting OSD weights to a value over 10,000 and cap bucket weight
at 10,000,000 in a CRUSH map. Addresses issue #2101.
Signed-off-by: caleb miles <caleb.miles@inktank.com>
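The two limits can be expressed as a short Python sketch. The constants come from the message above. The function names are hypothetical, and so are two interpretations: that "cap" means clamping rather than rejecting, and that negative weights are rejected.

```python
MAX_OSD_WEIGHT = 10_000          # limit quoted in the commit message
MAX_BUCKET_WEIGHT = 10_000_000   # cap quoted in the commit message


def validate_osd_weight(weight):
    """Reject OSD weights above the limit (and, as an assumption, negative ones)."""
    if weight < 0 or weight > MAX_OSD_WEIGHT:
        raise ValueError(f"OSD weight {weight} is outside 0..{MAX_OSD_WEIGHT}")
    return weight


def capped_bucket_weight(item_weights):
    """Assumption: 'cap' means the bucket total is clamped, not rejected."""
    return min(sum(item_weights), MAX_BUCKET_WEIGHT)
```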
|
We'll add options for different features later.
Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
|
Signed-off-by: John Wilkins <john.wilkins@inktank.com>
|
This used to be conditional on config having osd_crush_location set,
but with that, minimal configuration left the OSD completely out of
the crush map, and prevented the OSD from starting properly.
Note: Ceph does not currently let this mechanism automatically move
hosts to another location in the CRUSH hierarchy. This means if you
let this run with defaults, setting osd_crush_location later will not
take effect. Set up your config file (or Chef environment) fully
before starting the OSDs the first time.
Signed-off-by: Tommi Virtanen <tv@inktank.com>
|
Signed-off-by: Sage Weil <sage@inktank.com>
|
Signed-off-by: Sage Weil <sage@inktank.com>
|
Signed-off-by: John Wilkins <john.wilkins@inktank.com>
|
Signed-off-by: John Wilkins <john.wilkins@inktank.com>
|
Signed-off-by: John Wilkins <john.wilkins@inktank.com>
|
Issue #2776. Allow the removal of multiple objects in a single
rados tool command:
# rados -p pool rm obj1 [obj2 [...]]
Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
|
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
Reviewed-by: Greg Farnum <greg@inktank.com>
|
get() in ctor, put() in dtor.
Signed-off-by: Sage Weil <sage@inktank.com>
|
When a cct is destroyed, tell lockdep so that it can shut down if it was
using that cct.
Signed-off-by: Sage Weil <sage@inktank.com>
|
Signed-off-by: Sage Weil <sage@inktank.com>
|
Drop this useless helper and call cct->put() directly. The comment that
this can't be used after global_init is no longer relevant as long as
nobody puts a reference they don't own... and nobody owns
g_ceph_context.
Signed-off-by: Sage Weil <sage@inktank.com>
|
Take ownership of the passed cct. Drop it when we destroy the
RadosClient.
Signed-off-by: Sage Weil <sage@inktank.com>
|
These get shared via the librados API.
Fixes: #845
Signed-off-by: Sage Weil <sage@inktank.com>
|
This was creating a new cluster connection/session per iteration, and
along with it a few service threads and sockets and so forth.
Unfortunately, librados leaks like a sieve, starting with CephContext
and ceph::crypto::init(). See #845 and #2067.
Signed-off-by: Sage Weil <sage@inktank.com>
|
Conflicts:
src/rados.cc
|
Signed-off-by: Sage Weil <sage@inktank.com>
|
Bad linebreaks, wrapping, stringification, missing doc for bench args
Signed-off-by: Dan Mick <dan.mick@inktank.com>
Reviewed-by: Samuel Just <sam.just@inktank.com>
|
Bug #2772. This fixes an issue that was introduced when we
added the 'rados cp' command. The -t param was already used
for rados bench. With this change the only way to specify
a target pool is using --target-pool.
Though this problem is post-argonaut, the 'rados cp' command
has been backported, so we need this fix there too.
Backport: argonaut
Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
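The option layout after this change can be illustrated with a small Python `argparse` sketch. It is not the rados tool's own parser: the program name, the `--concurrent-ios` long name for `-t`, and the default of 16 are assumptions for illustration.

```python
import argparse


def build_parser():
    """Build a parser where -t belongs to bench and cp uses --target-pool only."""
    parser = argparse.ArgumentParser(prog="rados-sketch")
    # -t stays reserved for the bench concurrency setting (assumed long name).
    parser.add_argument("-t", "--concurrent-ios", type=int, default=16,
                        help="number of concurrent operations for bench")
    # The copy target has no short form, so it cannot collide with -t.
    parser.add_argument("--target-pool",
                        help="destination pool for the cp command")
    return parser
```

With this layout, `-t 4 --target-pool dst` sets both values independently.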
|
This is leftover from when we built a libcrush.so. We can re-add when we
start doing that again.
Reported-by: Laszlo Boszormenyi <gcs@debian.hu>
Signed-off-by: Sage Weil <sage@inktank.com>
|
Reported-by: Laszlo Boszormenyi <gcs@debian.hu>
Signed-off-by: Sage Weil <sage@inktank.com>
|
Hit this limit with the rados api tests.
Signed-off-by: Sage Weil <sage@inktank.com>
|
Signed-off-by: Sage Weil <sage@inktank.com>
|
Introduce a private, already-locked version.
Signed-off-by: Sage Weil <sage@inktank.com>
|
The _impl() helper is only called from parse_config_files(); don't retake
the lock.
Signed-off-by: Sage Weil <sage@inktank.com>
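The pattern of a public locking method paired with a private helper that expects the lock to be held already can be sketched in Python as follows. The class and the method names other than `parse_config_files` are hypothetical. This models the idea, not Ceph's config code.

```python
import threading


class Config:
    """Hypothetical config store guarded by a non-reentrant lock."""

    def __init__(self):
        self._lock = threading.Lock()
        self._values = {}

    def set_val(self, key, value):
        """Public entry point: takes the lock, then delegates."""
        with self._lock:
            self._set_val_locked(key, value)

    def _set_val_locked(self, key, value):
        # Caller must already hold self._lock. Retaking it here would deadlock,
        # because threading.Lock is not reentrant.
        self._values[key] = value

    def parse_config_files(self, entries):
        """Take the lock once and use the already-locked helper for each entry."""
        with self._lock:
            for key, value in entries:
                self._set_val_locked(key, value)

    def get_val(self, key):
        with self._lock:
            return self._values.get(key)
```

Calling `set_val` from inside `parse_config_files` would hang in this model, which is why the helper does not retake the lock.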
|
This fixes various valgrind warnings triggered by the s3test
test_object_create_unreadable.
Signed-off-by: Sage Weil <sage@inktank.com>
|
Handle response-* params that set response header field values.
Fixes #2734, #2735.
Backport: argonaut
Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
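As a sketch of the behaviour described above, the following Python maps the S3 `response-*` query parameters onto the response headers they override. The parameter names are the ones defined by the S3 GET Object API. The function name and the dict-based request model are assumptions, not radosgw code.

```python
RESPONSE_PARAM_HEADERS = {
    "response-content-type": "Content-Type",
    "response-content-language": "Content-Language",
    "response-expires": "Expires",
    "response-cache-control": "Cache-Control",
    "response-content-disposition": "Content-Disposition",
    "response-content-encoding": "Content-Encoding",
}


def apply_response_params(query, headers):
    """Return a copy of headers with any response-* overrides applied."""
    result = dict(headers)
    for param, header in RESPONSE_PARAM_HEADERS.items():
        if param in query:
            result[header] = query[param]
    return result
```

For example, a request carrying `response-content-type=text/plain` would receive `Content-Type: text/plain` regardless of the stored object's content type.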
|