Commit message  Author  Age  Files  Lines
* rgw: quiet down warning message [wip-6123]  Yehuda Sadeh  2013-08-26  1  -1/+3
|  Fixes: #6123
|  We don't want to know about failing to read region map info if it's not found, only if it failed with some other error. In any case it's just a warning.
|  Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
* Merge pull request #531 from dmick/wip-6099  Sage Weil  2013-08-23  1  -0/+13
|\
| |  ceph_rest_api.py: create own default for log_file
| |  Reviewed-by: Sage Weil <sage@inktank.com>
| * ceph_rest_api.py: create own default for log_file  Dan Mick  2013-08-23  1  -0/+13
| |  common/config thinks the default log_file for non-daemons should be "". Override that so that the default is /var/log/ceph/{cluster}-{name}.{pid}.log, since ceph-rest-api is more of a daemon than a client.
| |  Fixes: #6099
| |  Backport: dumpling
| |  Signed-off-by: Dan Mick <dan.mick@inktank.com>
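The override described in this commit can be sketched as follows. This is only an illustration of the fallback logic, not the code in ceph_rest_api.py: the function name `default_log_file` and the example entity name `client.restapi` are assumptions made for the example.

```python
import os


def default_log_file(cluster: str, name: str, pid: int, configured: str = "") -> str:
    """Return the configured log_file, or a daemon-style default when it is empty.

    Hypothetical helper: an empty string stands for the non-daemon default
    that common/config reports.
    """
    if configured:
        # An explicit log_file setting always wins.
        return configured
    # Same pattern as quoted in the commit message.
    return "/var/log/ceph/{cluster}-{name}.{pid}.log".format(
        cluster=cluster, name=name, pid=pid
    )


# "client.restapi" is an assumed entity name, used here only for illustration.
print(default_log_file("ceph", "client.restapi", os.getpid()))
```

Keeping the explicit setting first means an operator who sets log_file in the configuration is never overridden by the new default.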
* | Merge pull request #535 from ceph/wip-readdir-r-sucks  Yehuda Sadeh  2013-08-23  4  -8/+11
|\ \
| | |  Fix readdir_r invocation
| | |  Reviewed-by: Yehuda Sadeh <yehuda@inktank.com>
| * | os: make readdir_r buffers larger  Sage Weil  2013-08-23  2  -4/+5
| | |  PATH_MAX isn't quite big enough.
| | |  Backport: dumpling, cuttlefish, bobtail
| | |  Signed-off-by: Sage Weil <sage@inktank.com>
| * | os: fix readdir_r buffer size  Sage Weil  2013-08-23  2  -4/+6
|/ /
| |  The buffer needs to be big, or else we'll walk all over the stack.
| |  Backport: dumpling, cuttlefish, bobtail
| |  Signed-off-by: Sage Weil <sage@inktank.com>
* | mon/Paxos: fix another uncommitted value corner case  Sage Weil  2013-08-23  1  -1/+10
| |  It is possible that we begin the paxos recovery with an uncommitted value for, say, commit 100. During last/collect we discover 100 has been committed already. But also, another node provides an uncommitted value for 101 with the same pn. Currently, we refuse to learn it, because the pn is not strictly greater than our current uncommitted pn... even though it is the next last_committed+1 value that we need.
| |  There are two possible fixes here:
| |   - make this a >= as we can accept newer values from the same pn.
| |   - discard our uncommitted value metadata when we commit the value.
| |  Let's do both!
| |  Fixes: #6090
| |  Signed-off-by: Sage Weil <sage@inktank.com>
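The corner case and the two fixes can be modelled with a small sketch. It is a simplified illustration under stated assumptions, not the monitor's C++ code: the class `PaxosRecovery`, its fields, and the two boolean switches are hypothetical names that exist only so each fix can be toggled separately.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PaxosRecovery:
    """Hypothetical, heavily simplified stand-in for Paxos recovery state."""
    last_committed: int
    uncommitted_v: int = 0             # version of the value held but not committed
    uncommitted_pn: int = 0            # proposal number that value was accepted under
    uncommitted_value: Optional[str] = None

    def commit(self, version: int, clear_on_commit: bool = True) -> None:
        """Record that `version` is committed."""
        self.last_committed = version
        if clear_on_commit and self.uncommitted_v <= version:
            # Second fix: the value is committed, so its metadata is stale.
            self.uncommitted_v = 0
            self.uncommitted_pn = 0
            self.uncommitted_value = None

    def learn_uncommitted(self, version: int, pn: int, value: str,
                          accept_equal_pn: bool = True) -> bool:
        """Learn a peer's uncommitted value if it is the next version needed."""
        if version != self.last_committed + 1:
            return False
        # First fix: >= accepts a newer value that carries the same pn.
        if accept_equal_pn:
            ok = pn >= self.uncommitted_pn
        else:
            ok = pn > self.uncommitted_pn
        if ok:
            self.uncommitted_v = version
            self.uncommitted_pn = pn
            self.uncommitted_value = value
        return ok


def scenario(accept_equal_pn: bool, clear_on_commit: bool) -> bool:
    """Replay the sequence from the commit message; return whether 101 is learned."""
    state = PaxosRecovery(last_committed=99, uncommitted_v=100,
                          uncommitted_pn=200, uncommitted_value="v100")
    # During collect we discover that 100 is already committed.
    state.commit(100, clear_on_commit=clear_on_commit)
    # A peer offers an uncommitted value for 101 under the same pn.
    return state.learn_uncommitted(101, 200, "v101", accept_equal_pn=accept_equal_pn)


print(scenario(accept_equal_pn=False, clear_on_commit=False))  # old behaviour
print(scenario(accept_equal_pn=True, clear_on_commit=True))    # both fixes
```

In this model either fix alone is enough to learn version 101; applying both, as the commit does, covers the case where only one of the two conditions would have helped.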
* | rgw: bucket meta remove don't overwrite entry point first  Yehuda Sadeh  2013-08-23  1  -1/+6
| |  Fixes: #6056
| |  When removing a bucket metadata entry we first unlink the bucket and then we remove the bucket entrypoint object. Originally when unlinking the bucket we first overwrote the bucket entrypoint entry marking it as 'unlinked'. However, this is not really needed as we're just about to remove it. The original version triggered a bug, as we needed to propagate the new header version first (which we didn't do, so the subsequent bucket removal failed).
| |  Reviewed-by: Greg Farnum <greg@inktank.com>
| |  Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
* | ceph-disk: specify the filetype when mounting  Alfredo Deza  2013-08-23  1  -0/+1
| |  Signed-off-by: Alfredo Deza <alfredo.deza@inktank.com>
| |  Reviewed-by: Sage Weil <sage@inktank.com>
* | Merge pull request #532 from dmick/next  Sage Weil  2013-08-22  1  -1/+0
|\ \
| | |  PGMonitor: pg dump_stuck should respect --format (plain works fine)
| | |  Reviewed-by: Sage Weil <sage@inktank.com>
| * | PGMonitor: pg dump_stuck should respect --format (plain works fine)  Dan Mick  2013-08-22  1  -1/+0
| | |  Signed-off-by: Dan Mick <dan.mick@inktank.com>
* | | QA: Compile fsstress if missing on machine.  Sandon Van Ness  2013-08-22  1  -0/+15
|/ /
| |  Some distros lack ltp-kernel packages, and all we need is fsstress. This modifies the shell script to download and compile fsstress from source and copy it to the right location if it doesn't already exist where it is expected. It is a very small, quick compile, and currently only SLES and Debian do not already have it.
| |  Reviewed-by: Sage Weil <sage@inktank.com>
| |  Signed-off-by: Sandon Van Ness <sandon@inktank.com>
* | rgw: fix crash when creating new zone on init  Yehuda Sadeh  2013-08-22  1  -8/+8
| |  Moving the watch/notify init before the zone init, as we might need to send a notification.
| |  Reviewed-by: Sage Weil <sage@inktank.com>
| |  Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
* | enable mds rejoin with active inodes' old parent xattrs  Alexandre Oliva  2013-08-22  1  -1/+1
| |  When the parent xattrs of active inodes that the mds attempts to open during rejoin lack pool info (struct_v < 5), this field will be filled in with -1, causing the mds to retry fetching a backtrace with a pool number that matches the expected value, which fails and causes the err==-ENOENT branch to be taken and retry pool 1, which succeeds, but with pool -1, and so keeps on bouncing between the two retry cases forever.
| |  This patch arranges for the mds to go along with pool -1 instead of insisting that it be refetched, enabling it to complete recovery instead of eating cpu, network bandwidth and metadata osd's resources like there's no tomorrow, in what AFAICT is an infinite and very busy loop.
| |  This is not a new problem: I've had it even before upgrading from Cuttlefish to Dumpling, I'd just never managed to track it down, and force-unmounting the filesystem and then restarting the mds was an easier (if inconvenient) work-around, particularly because it always hit when the filesystem was under active, heavy-ish use (or there wouldn't be much reason for caps recovery ;-)
| |  There are two issues not addressed in this patch, however. One is that nothing seems to proactively update the parent xattr when it is found to be outdated, so it remains out of date forever. Not even renaming top-level directories causes the xattrs to be recursively rewritten. AFAICT that's a bug.
| |  The other is that inodes that don't have a parent xattr (created by even older versions of ceph) are reported as non-existing in the mds rejoin message, because the absence of the parent xattr is signaled as a missing inode ("failed to reconnect caps for missing inodes"). I suppose this may cause more serious recovery problems.
| |  I suppose a global pass over the filesystem tree updating parent xattrs that are out-of-date would be desirable, if we find any parent xattrs still lacking current information; it might make sense to activate it as a background thread from the backtrace decoding function, when it finds a parent xattr that's too out-of-date, or as a separate client (ceph-fsck?).
| |  Backport: dumpling, cuttlefish
| |  Signed-off-by: Alexandre Oliva <oliva@gnu.org>
| |  Reviewed-by: Zheng, Yan <zheng.z.yan@intel.com>
* | ceph-monstore-tool: shut up coverity  Sage Weil  2013-08-21  1  -0/+1
| |  Signed-off-by: Sage Weil <sage@inktank.com>
* | store: fix issues reported by coverity  Yan, Zheng  2013-08-21  2  -12/+19
| |  Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
| |  Reviewed-by: Sage Weil <sage@inktank.com>
* | objecter: fix keys of dump_linger_ops  Josh Durgin  2013-08-21  1  -2/+1
| |  The registering flag no longer exists, and registered was using the wrong property due to a copy-paste error.
| |  Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
| |  Reviewed-by: Sage Weil <sage.weil@inktank.com>
* | objecter: resend unfinished lingers when osdmap is no longer paused  Josh Durgin  2013-08-21  1  -2/+12
|/
|  Plain Ops that haven't finished yet need to be resent if the osdmap transitions from full or paused to unpaused. If these Ops are triggered by LingerOps, they will be cancelled instead (since should_resend = false), but the LingerOps that triggered them will not be resent.
|  Fix this by checking the registered flag for all linger ops, and resending any of them that aren't paused anymore.
|  Fixes: #6070
|  Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
|  Reviewed-by: Sage Weil <sage.weil@inktank.com>
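The selection rule in this fix can be sketched as a filter over the linger ops. This is a rough model and not the Objecter code: `LingerOp`, its fields, and `lingers_to_resend` are hypothetical names, and the sketch assumes that an "unfinished" linger is one whose registered flag is still false.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class LingerOp:
    """Hypothetical minimal view of a linger op."""
    linger_id: int
    registered: bool   # assumed: True once the registration has completed
    paused: bool       # assumed: True while the op's target is still paused


def lingers_to_resend(linger_ops: List[LingerOp],
                      was_paused: bool, now_paused: bool) -> List[int]:
    """Return ids of linger ops to resend after a paused/full -> unpaused transition."""
    if not was_paused or now_paused:
        # Only the transition to unpaused triggers a resend.
        return []
    return [op.linger_id for op in linger_ops
            if not op.registered and not op.paused]


ops = [
    LingerOp(1, registered=True, paused=False),    # already registered: skip
    LingerOp(2, registered=False, paused=False),   # unfinished and unpaused: resend
    LingerOp(3, registered=False, paused=True),    # still paused: skip
]
print(lingers_to_resend(ops, was_paused=True, now_paused=False))
```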
* rgw: change cache / watch-notify init sequence  Yehuda Sadeh  2013-08-21  3  -5/+16
|  Fixes: #6046
|  We were initializing the watch-notify (through the cache init) before reading the zone info which was much too early, as we didn't have the control pool name yet. Now simplifying init/cleanup a bit, cache doesn't call watch/notify init and cleanup directly, but rather states its need through a virtual callback.
|  Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
|  Reviewed-by: Sage Weil <sage@inktank.com>
* Merge remote-tracking branch 'gh/wip-6004' into next  Sage Weil  2013-08-20  2  -12/+37
|\
| |  Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
| |  Reviewed-by: Yan, Zheng <zheng.z.yan@intel.com>
| * osdc/ObjectCacher: do not merge rx buffers  Sage Weil  2013-08-15  1  -0/+4
| |  We do not try to merge rx buffers currently. Make that explicit and documented in the code that it is not supported. (Otherwise the last_read_tid values will get lost and read results won't get applied to the cache properly.)
| |  Signed-off-by: Sage Weil <sage@inktank.com>
| * osdc/ObjectCacher: match reads with their original rx buffers  Sage Weil  2013-08-15  2  -12/+33
| |  Consider a sequence like:
| |   1- start read on 100~200
| |        100~200 state rx
| |   2- truncate to 200
| |        100~100 state rx
| |   3- start read on 200~200
| |        100~100 state rx
| |        200~200 state rx
| |   4- get 100~200 read result
| |  Currently this makes us crash on
| |    osdc/ObjectCacher.cc: 738: FAILED assert(bh->length() <= start+(loff_t)length-opos)
| |  when processing the second 200~200 bufferhead (it is too big). The larger issue, though, is that we should not be looking at this data at all; it has been truncated away.
| |  Fix this by marking each rx buffer with the read request that is sent to fill it, and only fill it from that read request. Then the first reply will fill the first 100~100 extent but not touch the other extent; the second read will do that.
| |  Signed-off-by: Sage Weil <sage@inktank.com>
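The sequence above can be replayed with a small model in which each rx buffer remembers the read that was issued to fill it. This is a sketch under assumptions, not the ObjectCacher implementation: `BufferHead`, `handle_read_reply`, and the field layout are hypothetical, and only `last_read_tid` is a name taken from the commit messages. Extents use the offset~length notation from the commit.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class BufferHead:
    """Hypothetical minimal buffer head: an extent plus the read meant to fill it."""
    start: int
    length: int
    last_read_tid: int
    state: str = "rx"
    data: Optional[bytes] = None


def handle_read_reply(bhs: List[BufferHead], tid: int, start: int, data: bytes) -> None:
    """Apply a read reply only to rx buffers that were tagged with the same tid."""
    end = start + len(data)
    for bh in bhs:
        if bh.state != "rx" or bh.last_read_tid != tid:
            continue  # this buffer is waiting for a different read
        if start <= bh.start and bh.start + bh.length <= end:
            # The reply covers the whole buffer: copy the matching slice.
            bh.data = data[bh.start - start: bh.start - start + bh.length]
            bh.state = "clean"


# 1- start read on 100~200 (tid 1)
bhs = [BufferHead(start=100, length=200, last_read_tid=1)]
# 2- truncate to 200: the rx buffer shrinks to 100~100
bhs[0].length = 100
# 3- start read on 200~200 (tid 2)
bhs.append(BufferHead(start=200, length=200, last_read_tid=2))
# 4- the reply for the first read arrives with 200 bytes starting at offset 100
handle_read_reply(bhs, tid=1, start=100, data=b"a" * 200)
print([(bh.start, bh.length, bh.state) for bh in bhs])
# the reply for the second read fills the second buffer
handle_read_reply(bhs, tid=2, start=200, data=b"b" * 200)
print([(bh.start, bh.length, bh.state) for bh in bhs])
```

Because the second buffer carries tid 2, the stale bytes from the first reply never reach it, which is the behaviour the commit message asks for.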
* | .gitignore: ignore test-driver  Sage Weil  2013-08-20  1  -0/+1
| |  Signed-off-by: Sage Weil <sage@inktank.com>
* | fuse: fix warning when compiled against old fuse versions  Sage Weil  2013-08-20  1  -1/+1
| |  client/fuse_ll.cc: In function 'void invalidate_cb(void*, vinodeno_t, int64_t, int64_t)':
| |  warning: client/fuse_ll.cc:540: unused variable 'fino'
| |  Signed-off-by: Sage Weil <sage@inktank.com>
* | json_spirit: remove unused typedef  Sage Weil  2013-08-20  1  -2/+0
| |  In file included from json_spirit/json_spirit_writer.cpp:7:0:
| |  json_spirit/json_spirit_writer_template.h: In function 'String_type json_spirit::non_printable_to_string(unsigned int)':
| |  json_spirit/json_spirit_writer_template.h:37:50: warning: typedef 'Char_type' locally defined but not used [-Wunused-local-typedefs]
| |  typedef typename String_type::value_type Char_type;
| |  (Also, ha ha, this file uses \r\n.)
| |  Signed-off-by: Sage Weil <sage@inktank.com>
* | gtest: add build-aux/test-driver to .gitignore  Sage Weil  2013-08-20  1  -0/+1
| |  Signed-off-by: Sage Weil <sage@inktank.com>
* | Merge pull request #517 from dmick/wip-6049  Dan Mick  2013-08-20  1  -3/+3
|\ \
| | |  mon/PGMap: OSD byte counts 4x too large (conversion to bytes overzealous)
| | |  Reviewed-by: Joao Eduardo Luis <joao.luis@inktank.com>
| * | mon/PGMap: OSD byte counts 4x too large (conversion to bytes overzealous)  Dan Mick  2013-08-20  1  -3/+3
| | |  Fixes: #6049
| | |  Signed-off-by: Dan Mick <dan.mick@inktank.com>
* | | mon/Paxos: always refresh after any store_state  Sage Weil  2013-08-20  1  -1/+7
| | |  If we store any new state, we need to refresh the services, even if we are still in the midst of Paxos recovery. This is because the subscription path will share any committed state even when paxos is still recovering. This prevents a race like:
| | |   - we have maps 10..20
| | |   - we drop out of quorum
| | |   - we are elected leader, paxos recovery starts
| | |   - we get one LAST with committed states that trim maps 10..15
| | |   - we get a subscribe for map 10..20
| | |   - we crash because 10 is no longer on disk because the PaxosService is out of sync with the on-disk state.
| | |  Fixes: #6045
| | |  Backport: dumpling
| | |  Signed-off-by: Sage Weil <sage@inktank.com>
| | |  Reviewed-by: Joao Eduardo Luis <joao.luis@inktank.com>
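The race listed above can be reproduced with a toy model of a service whose in-memory view lags behind the on-disk state. Everything here is an assumption made for illustration: the `Monitor` class, its fields, and the `refresh_after_store` switch are hypothetical, and a missing map surfaces as a `KeyError` where the real monitor would crash.

```python
from typing import Dict, List


class Monitor:
    """Hypothetical toy monitor: on-disk maps plus a cached view used by subscriptions."""

    def __init__(self, first: int, last: int) -> None:
        self.disk: Dict[int, str] = {v: "map%d" % v for v in range(first, last + 1)}
        self.cached_first = first
        self.cached_last = last

    def store_state(self, trim_to: int, new_maps: Dict[int, str]) -> bool:
        """Apply committed state from a LAST message; report whether anything changed."""
        changed = False
        for version in [v for v in self.disk if v < trim_to]:
            del self.disk[version]
            changed = True
        for version, value in new_maps.items():
            if version not in self.disk:
                self.disk[version] = value
                changed = True
        return changed

    def refresh(self) -> None:
        """Bring the cached view back in sync with the on-disk state."""
        self.cached_first = min(self.disk)
        self.cached_last = max(self.disk)

    def handle_last(self, trim_to: int, new_maps: Dict[int, str],
                    refresh_after_store: bool = True) -> None:
        # The fix: refresh whenever store_state stored something,
        # even though recovery is still in progress.
        if self.store_state(trim_to, new_maps) and refresh_after_store:
            self.refresh()

    def handle_subscribe(self, start: int) -> List[str]:
        """Serve maps from `start` onward, clamped to the cached range."""
        start = max(start, self.cached_first)
        return [self.disk[v] for v in range(start, self.cached_last + 1)]


fixed = Monitor(10, 20)
fixed.handle_last(trim_to=16, new_maps={})   # LAST trims maps 10..15
print(fixed.handle_subscribe(10))            # served from 16 onward
```

Passing `refresh_after_store=False` leaves the cached range at 10..20, and the same subscribe then fails on the missing map 10, which mirrors the crash described in the commit message.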
* | | mon/Paxos: return whether store_state stored anything  Sage Weil  2013-08-20  2  -2/+7
| | |  Signed-off-by: Sage Weil <sage@inktank.com>
| | |  Reviewed-by: Joao Eduardo Luis <joao.luis@inktank.com>
* | | mon/Paxos: cleanup: use do_refresh from handle_commit  Sage Weil  2013-08-20  1  -9/+3
| | |  This avoids duplicated code by using the helper created exactly for this purpose.
| | |  Signed-off-by: Sage Weil <sage@inktank.com>
| | |  Reviewed-by: Joao Eduardo Luis <joao.luis@inktank.com>
* | | pybind: fix Rados.conf_parse_env test  Sage Weil  2013-08-20  1  -6/+2
|/ /
| |  This happens after we connect, which means we always get ENOSYS. Instead, parse_env inside the normal setup method, which has the added benefit of letting us debug these tests.
| |  Backport: dumpling
| |  Signed-off-by: Sage Weil <sage@inktank.com>
* | PG: remove old log when we upgrade log version  Samuel Just  2013-08-19  2  -0/+9
| |  Otherwise the log_oid will be non-empty and the next boot will cause us to try to upgrade again.
| |  Fixes: #6057
| |  Signed-off-by: Samuel Just <sam.just@inktank.com>
| |  Reviewed-by: Sage Weil <sage@inktank.com>
* | PGLog: add a config to disable PGLog::check()  Samuel Just  2013-08-19  3  -3/+12
| |  This is a debug check which may be causing excessive cpu usage.
| |  Reviewed-by: Sage Weil <sage@inktank.com>
| |  Signed-off-by: Samuel Just <sam.just@inktank.com>
* | ceph: parse CEPH_ARGS environment variable  Sage Weil  2013-08-19  1  -0/+1
| |  Fixes: #6052
| |  Signed-off-by: Sage Weil <sage@inktank.com>
| |  Reviewed-by: Dan Mick <dan.mick@inktank.com>
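What it means for a command-line tool to honor CEPH_ARGS can be illustrated with a generic sketch. This is not the one-line change made in the ceph tool, which is not shown in this log: the helper `effective_args` is hypothetical, and placing the environment options before the explicit arguments is an assumption of the example.

```python
import os
import shlex
from typing import List, Mapping, Optional


def effective_args(argv: List[str],
                   environ: Optional[Mapping[str, str]] = None) -> List[str]:
    """Combine options from CEPH_ARGS with the explicit command-line arguments.

    Hypothetical helper: environment options come first, so that arguments
    given on the command line appear later in the combined list.
    """
    env = os.environ if environ is None else environ
    # shlex.split honors shell-style quoting inside the variable.
    env_args = shlex.split(env.get("CEPH_ARGS", ""))
    return env_args + list(argv)


print(effective_args(["health"], {"CEPH_ARGS": "--id admin --cluster test"}))
```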
* | rados pybind: add conf_parse_env()  Sage Weil  2013-08-19  2  -0/+19
| |  Signed-off-by: Sage Weil <sage@inktank.com>
| |  Reviewed-by: Dan Mick <dan.mick@inktank.com>
* | Merge remote-tracking branch 'gh/next'  Sage Weil  2013-08-19  3  -4/+10
|\ \
| * \ Merge pull request #513 from dalgaaf/fix/wip-da-documentation  Sage Weil  2013-08-19  2  -4/+8
| |\ \
| | | |  Fix documentation issues
| | * | filestore-config-ref.rst: mark some filestore keys as deprecated  Danny Al-Gaaf  2013-08-19  1  -0/+4
| | | |  Marked the following keys as deprecated since v0.65:
| | | |   - filestore flusher
| | | |   - filestore flusher max fds
| | | |   - filestore sync flush
| | | |  Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
| | * | auth-config-ref.rst: fix signature keys  Danny Al-Gaaf  2013-08-19  1  -4/+4
| |/ /
| | |  Fix names of cephx signature keys.
| | |  Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
| * | Merge pull request #509 from dmick/wip-rest-conf  Dan Mick  2013-08-16  1  -0/+2
| |\ \
| | | |  config_opts: add two ceph-rest-api-only variables for convenience
| | | |  Reviewed-by: Sage Weil <sage@inktank.com>
| | * | config_opts: add two ceph-rest-api-only variables for convenience  Dan Mick  2013-08-16  1  -0/+2
| | | |  These aren't used by the C++ code at all, but in order for rados_conf_get to find them, they need to be listed. They're consumed by ceph_rest_api.
| | | |  Signed-off-by: Dan Mick <dan.mick@inktank.com>
* | | | doc/release-notes: v0.61.8  Sage Weil  2013-08-19  2  -0/+842
| | | |  Signed-off-by: Sage Weil <sage@inktank.com>
* | | | Merge pull request #512 from ceph/wip-5988  Sage Weil  2013-08-19  4  -5/+11
|\ \ \ \
| | | | |  Reviewed-by: Sage Weil <sage@inktank.com>
| * | | | librados: synchronous commands should return on commit instead of ack  Greg Farnum  2013-08-19  3  -3/+10
| | | | |  This is unlikely to be noticed by anybody, but it is a big change. Document in the PendingReleaseNotes and bump up the librados minor version number to 68.
| | | | |  Signed-off-by: Greg Farnum <greg@inktank.com>
| * | | | mon: make MonMap error message about unspecified monitors less specific.  Greg Farnum  2013-08-19  1  -2/+1
| | | | |  The error message helpfully references the -m and -c CLI options for specifying monitors, but this code can be invoked from non-core librados client applications so that's unfortunately not kosher. Remove the reference.
| | | | |  Fixes #5979.
| | | | |  Signed-off-by: Greg Farnum <greg@inktank.com>
* | | | | Merge branch 'wip-erasure-coded-doc'  Samuel Just  2013-08-19  4  -0/+1050
|\ \ \ \ \
| |/ / / /
|/| | | |
| * | | | Merge pull request #493 from dachary/wip-erasure-coding-doc  athanatos  2013-08-19  5  -478/+1046
| |\ \ \ \
| | | | | |  rearrange erasure code documents
| | | | | |  Reviewed-by: Samuel Just <sam.just@inktank.com>
| | * | | | rearrange the documentation to be inserted and maintained in master  Loic Dachary  2013-08-09  5  -478/+478
| | | | | |  Signed-off-by: Loic Dachary <loic@dachary.org>
| | * | | | document the write / read path for erasure encoding  Loic Dachary  2013-08-09  1  -0/+568
| |/ / / /
| | | | |  Explains how objects are stored and used in erasure coded pools. It is the result of discussions that occurred on the ceph-devel mailing list around June 2013. The rationale behind each change can be found in the archive of the mailing list. Examples include the coding of the chunk number with the object, and the decision to decode using any K chunks instead of trying to fetch the data chunks when possible because it would allow simple concatenation when systematic codes are used.
| | | | |  http://tracker.ceph.com/issues/4929 refs #4929
| | | | |  Signed-off-by: Loic Dachary <loic@dachary.org>