path: root/ceilometer/compute
Commit message (author, age, files, lines)
* Fix OutgoingBytesDeltaPollster sample name (Arnaud Morin, 2022-11-29, 1 file, -1/+1)
| The correct name is network.outgoing.bytes.delta. Change-Id: I78e9fbe9d60b3e83761f6490d25e85ad54fcc7c4 Signed-off-by: Arnaud Morin <arnaud.morin@ovhcloud.com>
* Remove unicode prefixes (Takashi Kajinami, 2022-08-07, 1 file, -4/+4)
| A unicode prefix is meaningless in Python 3. Because ceilometer now supports only Python 3, we can remove the prefix. Change-Id: I7bc91be21df646d8bbc7793eec28a93179a3eefa
* Debug log libvirt metadata version fails (Tobias Urdin, 2022-03-29, 1 file, -2/+2)
| We don't need to log this as a warning or info. If you have upgraded from an older version, it just outputs that line once per instance on the compute node every minute. Change-Id: Ic0f4ca41d2e4114800aba9d02309f2e04e313752
* Support two nova metadata versions in instance XML (Pavlo Shchelokovskyy, 2022-01-11, 2 files, -11/+36)
| This is a follow-up to I2aa34cf142c6429a7a0a3b8f232c3ed83f7d9981. In order to support discovery of instances booted before Wallaby, ceilometer must support working with both the 1.0 (old, Victoria and before) and 1.1 (new, Wallaby and, for now, newer) versions of nova metadata. Change-Id: I93b6e92a5f46de5486f30a99fa3917a5932f7360 Related-Bug: #1930446
* Update compute.discovery to get nova domain meta (Christophe Useinovic, 2021-10-07, 1 file, -1/+1)
| Nova changed the namespace for nova-specific data to version 1.1 for the Wallaby cycle. Closes-Bug: #1930446 Change-Id: I2aa34cf142c6429a7a0a3b8f232c3ed83f7d9981
* Merge "Ceilometer compute `retry_on_disconnect` using `no-wait`" (Zuul, 2021-06-30, 1 file, -1/+2)
|\
| * Ceilometer compute `retry_on_disconnect` using `no-wait` (Rafael Weingärtner, 2021-05-04, 1 file, -1/+2)
| | A problem was discovered on a production setup of Ceilometer compute where metrics stopped being gathered. While troubleshooting, we found the following error message.
| | ```
| | ERROR ceilometer.polling.manager [-] Prevent pollster cpu from polling
| | ```
| | That error message appeared after the following message:
| | ```
| | WARNING ceilometer.compute.pollsters [-] Cannot inspect data of CPUPollster for <UUID>, non-fatal reason: Failed to inspect instance <UUID> stats, can not get info from libvirt: Unable to read from monitor: Connection reset by peer: NoDataException: Failed to inspect instance <UUID> stats, can not get info from libvirt: Unable to read from monitor: Connection reset by peer
| | ```
| | The instance was running just fine on the host. It seems to be a concurrency issue with some other process that made the instance locked/unavailable to the ceilometer compute pollsters. Ceilometer was unable to connect to libvirt (after 2 retries), and the code is designed to prevent Ceilometer from continuing to try. Therefore, the "CPU" metric pollster was put in permanent error. To fix the issue, we needed to restart Ceilometer on the affected hosts. However, until we discovered this issue, we lost 3 days of data.
| | ```
| | @libvirt_utils.raise_nodata_if_unsupported
| | @libvirt_utils.retry_on_disconnect
| | def inspect_instance(self, instance, duration=None):
| |     domain = self._get_domain_not_shut_off_or_raise(instance)
| | ```
| | It will try to retrieve the domain (VM) object (XML description) via libvirt. If it fails, it will retry via `@libvirt_utils.retry_on_disconnect`; if that fails, it marks the metric in permanent error with the annotation: `@libvirt_utils.raise_nodata_if_unsupported`. Other metrics continued working.
| | Therefore, I investigated a bit deeper, and the problem seems to be here:
| | ```
| | retry_on_disconnect = tenacity.retry(
| |     retry=tenacity.retry_if_exception(is_disconnection_exception),
| |     stop=tenacity.stop_after_attempt(2))
| | ```
| | The `retry_on_disconnect` annotation does not configure a wait for the "tenacity" retry library; the default is "no wait". Therefore, the retries have a bigger chance of being affected by very minor instabilities (connection issues lasting microseconds can cause a problem with this configuration). One alternative to avoid such problems in the future is to use a wait configuration such as the one being proposed. Then, the ceilometer compute pollsters would wait/sleep before retrying, which would give the system some time to become available for the compute pollsters. In this proposal, we would wait 2^x * 3 seconds between each retry starting with 1 second, then up to 60 seconds.
| | Change-Id: I9a2d46f870dc2d2791a7763177773dc0cf8aed9d
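The wait schedule described in that commit message can be illustrated with a standard-library-only sketch. It assumes the wait is `3 * 2**x` seconds clamped to the range [1, 60], with `x` counting retries from 0. The function name `backoff_seconds` and its parameters are hypothetical, and this is not the tenacity configuration from the change itself.

```python
def backoff_seconds(retry_index, multiplier=3, minimum=1, maximum=60):
    """Return the sleep time in seconds before retry `retry_index` (0-based).

    Assumed schedule: multiplier * 2**retry_index, clamped to [minimum, maximum].
    """
    raw = multiplier * (2 ** retry_index)
    return max(minimum, min(raw, maximum))


# Waits for the first six retries under the assumed schedule.
schedule = [backoff_seconds(x) for x in range(6)]
print(schedule)
```

Under these assumptions the waits grow geometrically until they reach the 60-second cap, which gives a transient libvirt disconnect time to clear before the next attempt.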
* | Remove Xen support (Takashi Kajinami, 2021-05-05, 4 files, -199/+3)
|/ This change removes the Xen support, which was deprecated during the previous cycle[1]. [1] fd0a561bea956f1b62f6ca5a27e762cb76ad9a90 Change-Id: If1675468095cbc1b9c065edb6b086e7f4afa2f3e
* Merge "Deprecate support for Xen" [tags: 16.0.0.0rc1, 16.0.0] (Zuul, 2021-02-22, 2 files, -1/+13)
|\
| * Deprecate support for Xen (Takashi Kajinami, 2021-02-19, 2 files, -1/+13)
| | Since Nova removed its XenAPI driver[1] and Xen support using libvirt, we no longer expect usage of Xen in OpenStack deployments. [1] adb28f503ca8c38bd7224ec0a335f730557d7ca9 [1] 3a390c2c8238409c00acc08fad725d46fa02c0ad Change-Id: Id79799541dfc8ec17d3ea1482c6b8ca4b58f7a92
* | Merge "Fix gnocchi create resource error when missing flavor" (Zuul, 2021-02-11, 1 file, -2/+3)
|\ \
| * | Fix gnocchi create resource error when missing flavor (liyi, 2021-02-03, 1 file, -2/+3)
| |/ When a server has been created and I then delete the related flavor, gnocchi cannot discover the server. This patch also makes discovery of private flavors possible. Change-Id: I0f570bddf0f2597808a0148d72c0ea2e5f900e23 Closes-Bug: #1777607
* | Using Iterable was deprecated in python 3.3 (Matthias Runge, 2021-02-02, 1 file, -1/+1)
|/ and one should use collections.abc.Iterable Change-Id: I52455c85b5cb50e9c4f1fdf5538965dc200065aa
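As a minimal illustration of the import that commit recommends, the sketch below checks iterability through `collections.abc.Iterable`. The helper name `is_iterable` is hypothetical and is not taken from the ceilometer code.

```python
from collections.abc import Iterable


def is_iterable(obj):
    """Return True if obj is recognised as iterable by the Iterable ABC."""
    return isinstance(obj, Iterable)


# Lists and strings are iterable; a plain integer is not.
checks = [is_iterable([1, 2]), is_iterable("abc"), is_iterable(42)]
print(checks)
```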
* Replace six with python3 code style (kuangcx, 2021-01-13, 3 files, -12/+5)
| Co-authored by: Matthias Runge <mrunge@redhat.com> Change-Id: I85a4d79396874670f1b36cb91cfba5da812c2839
* Implement some new meters for vnic delta (Arnaud Morin, 2020-10-29, 5 files, -4/+48)
| Add two new meters called network.incoming.bytes.delta and network.outgoing.bytes.delta that give the delta of bytes that were sent or received by a vNic. Change-Id: Icf45a8d185cdb4a7b00a83586c98f998cbc0e928 Signed-off-by: Arnaud Morin <arnaud.morin@gmail.com>
* Remove six.moves (wangzihao, 2020-09-27, 1 file, -1/+1)
| Remove six.moves. Replace the following items with Python 3 style code.
| - six.moves.urllib
| - six.moves.xrange
| - six.moves.range
| urlparse instead of url_parse
| Change-Id: I2a66e69d7c1401d0bbdb9d8e8b0a7b5400aee6d2
* Adding exception handling when inspect_disks (Seyeong Kim, 2020-07-09, 1 file, -9/+17)
| It raises an error even during live migration. As live migration normally uses a lock, it should not be an error. Story: #2007651 Task: #39715 Change-Id: I3c0f29f79dc3c73e7aec9c9035c94c0fdcf8ccfd
* Merge "Temporary failures should be treated as temporary." (Zuul, 2020-04-09, 1 file, -1/+0)
|\
| * Temporary failures should be treated as temporary. (Matthias Runge, 2020-04-01, 1 file, -1/+0)
| | There is no reason to treat, e.g., timeouts as permanent, and thus these sources should not be removed from polling. Change-Id: Ifcb8dc7ca2c91f3d2482264afbd81df6e51c5937
* | Update hacking for Python3 (Andreas Jaeger, 2020-03-31, 2 files, -7/+2)
|/ The repo is Python 3 now, so update hacking to version 3.0 which supports Python 3. Fix problems found. Update local hacking checks for new flake8. Change-Id: I129bc38e6663836e12610dd50a20c74dbc79891c
* Fix logging libvirt error on python 3 (Artem Vasilyev, 2020-02-11, 1 file, -2/+1)
| This patch fixes an error caused by the absence of the "message" attribute in libvirtError when running on python 3. The ceilometer-agent-compute service fails to fetch domain metadata and tries to log the libvirt error; during logging the following exception is raised and the service fails: AttributeError: 'libvirtError' object has no attribute 'message' Change-Id: I56e74d1cb30310db104c850e74b2d422d3aeb966
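The failure mode in that commit can be reproduced without libvirt. The sketch below uses a hypothetical `FakeLibvirtError` class as a stand-in for `libvirt.libvirtError` and formats the exception with `str()` semantics instead of reading `.message`. The helper name and log text are assumptions for illustration, not the actual ceilometer fix.

```python
class FakeLibvirtError(Exception):
    """Hypothetical stand-in for libvirt.libvirtError (illustration only)."""


def describe_error(exc):
    # Python 3 exceptions have no `.message` attribute, so format the
    # exception object itself, which uses str(exc) under the hood.
    return "Failed to fetch domain metadata: %s" % exc


try:
    raise FakeLibvirtError("metadata not found")
except FakeLibvirtError as err:
    has_message = hasattr(err, "message")
    text = describe_error(err)

print(has_message, text)
```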
* Only install monotonic on python2 (Jon Schlueter, 2019-08-23, 1 file, -2/+7)
| monotonic is a backport of time.monotonic from python3. Only install it for python2. Depends-On: https://review.openstack.org/615441 Change-Id: Id27cd748e883d54dd93dac2e6bd8caee6728f7e1
* Merge "disk capacity is less than disk usage" (Zuul, 2019-04-26, 1 file, -1/+5)
|\
| * disk capacity is less than disk usage (zhang-shaoman, 2019-04-23, 1 file, -1/+5)
| | If the virtual machine mounts a cd-rom, libvirt will align by 4K bytes, causing the disk capacity to be less than the disk usage, which does not seem reasonable. Maybe we should use the bigger one as the disk capacity. Change-Id: I25808856bce27483da0cb2583ae94e8dc162d647 Closes-Bug: #1819107
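The "use the bigger one" idea from that commit amounts to a single `max()` call. The sketch below is an assumption about the shape of the fix; the function name `effective_capacity` and the sample byte counts are hypothetical.

```python
def effective_capacity(capacity, allocation):
    """Report the larger of the two values so capacity is never below usage."""
    return max(capacity, allocation)


# Hypothetical cd-rom case: 4K alignment makes the usage exceed the capacity.
cdrom = effective_capacity(capacity=1000, allocation=4096)
# Ordinary disk: the capacity is already the larger value and is kept.
disk = effective_capacity(capacity=10_000, allocation=4096)
print(cdrom, disk)
```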
* | making inspector object singleton again by assigning to the base pollster class (Supreeth Shivanand, 2019-04-02, 1 file, -5/+5)
|/ Change-Id: I1afc02d595273f7b699afac9c12fcb17a777b4a8 Closes-Bug: #1819562
* Add interfaceid and bridges parameters [tags: 12.0.0.0rc1, 12.0.0] (Arnaud Morin, 2019-02-01, 1 file, -0/+17)
| Add interfaceid and bridge to parameters. These are sometimes needed to help figure out from which instance the meter is coming. Change-Id: Ic50adf5aee8d934d890ac90300e683c6762aefac Signed-off-by: Arnaud Morin <arnaud.morin@gmail.com>
* compute: remove deprecated disk meters (Julien Danjou, 2018-09-06, 1 file, -156/+0)
| The equivalent disk.device meters have been available for a while now. Change-Id: I6f1af3b8d0a1ec32b2722db62ab9cafe6309532f
* inspector: memory: use usable of memoryStats if available (Chen Hanxiao, 2018-05-14, 1 file, -1/+4)
| In kernel v4.6, virtio balloon driver commit 5057dcd0f introduced the metric VIRTIO_BALLOON_S_AVAIL, corresponding to 'Available' in /proc/meminfo. Libvirt exposes this metric as 'usable'. As 'Available' in meminfo is an estimate of how much memory is available for starting new applications without swapping, it is a better metric for calculating memory_usage. Change-Id: I3b935f1fc2ed74ca45b26990c4f2bd5996e1dfea Signed-off-by: Chen Hanxiao <chenhx@certusnet.com.cn>
* Merge "hyper-v: Converts all os-win exceptions" (Zuul, 2018-02-27, 1 file, -33/+43)
|\
| * hyper-v: Converts all os-win exceptions (Claudiu Belu, 2018-02-22, 1 file, -33/+43)
| | When os-win was first introduced, a decorator which converts all os-win exceptions to virt inspector exceptions was added to the HyperVInspector. However, the decorator works as intended only for methods that return values, not for the ones that yield (they return a generator). This patch makes sure that exceptions are converted properly for yielding methods as well. Closes-Bug #1751088 Change-Id: I7d09e1860c6940758f0d0965fedfe4dd285e0cae
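The return-versus-yield distinction in that commit can be sketched in a few lines. A plain try/except around the call never sees errors raised lazily inside a generator, so the generator has to be iterated inside the try block. All names below (`convert_exceptions`, `BackendError`, `InspectorException`, `inspect_vnics`) are hypothetical stand-ins, not the os-win or ceilometer classes.

```python
import functools
import inspect


class InspectorException(Exception):
    """Hypothetical target exception type."""


class BackendError(Exception):
    """Hypothetical stand-in for a backend (os-win style) exception."""


def convert_exceptions(func):
    if inspect.isgeneratorfunction(func):
        @functools.wraps(func)
        def gen_wrapper(*args, **kwargs):
            try:
                # Iterating inside the try block catches errors that are
                # raised lazily while the generator body runs.
                yield from func(*args, **kwargs)
            except BackendError as exc:
                raise InspectorException(str(exc)) from exc
        return gen_wrapper

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        try:
            return func(*args, **kwargs)
        except BackendError as exc:
            raise InspectorException(str(exc)) from exc
    return wrapper


@convert_exceptions
def inspect_vnics():
    yield "vnic-0"
    raise BackendError("lost connection")


results = []
try:
    for item in inspect_vnics():
        results.append(item)
except InspectorException as exc:
    results.append("converted: %s" % exc)

print(results)
```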
* | the previous patch was missing a 'continue' (Lars Kellogg-Stedman, 2018-02-17, 1 file, -0/+1)
| | This commit puts a try/except AttributeError block around all the code that is fetching attributes on the result of metadata_xml.find(...). Change-Id: I41aa76cf9def3e8c4bceef0280d15c1fd7c48e3d Closes-Bug: #1749960
* | Gracefully handle missing metadata in libvirt xml (Lars Kellogg-Stedman, 2018-02-16, 1 file, -22/+34)
|/ Missing metadata in the libvirt domain xml for a nova instance would cause ceilometer-compute to abort, leading to missing metrics for the current and any subsequent libvirt guests. This commit puts a try/except AttributeError block around all the code that is fetching attributes on the result of metadata_xml.find(...). Change-Id: I8adc609cc21c86de2daba326d24b73a80d6eb61f Closes-Bug: #1749960
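The try/except AttributeError pattern mentioned in that commit works because `find(...)` returns `None` when an element is absent, so the following attribute access raises `AttributeError`. The sketch below uses the standard-library ElementTree; the element path `metadata/name` and the function name `read_name` are hypothetical, and the real nova metadata layout is not reproduced here.

```python
import xml.etree.ElementTree as ET


def read_name(domain_xml):
    """Return the text of metadata/name, or None when the element is missing."""
    root = ET.fromstring(domain_xml)
    try:
        # find() returns None for a missing element, so .text raises
        # AttributeError instead of returning a value.
        return root.find("metadata/name").text
    except AttributeError:
        return None


present = read_name("<domain><metadata><name>vm1</name></metadata></domain>")
missing = read_name("<domain/>")
print(present, missing)
```

A caller looping over many guests can then skip the ones that return `None` rather than aborting the whole polling cycle.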
* Deprecate aggregated disk.* metrics on instance (Mehdi Abaakouk, 2018-02-02, 1 file, -0/+10)
| disk.* are just aggregates of disk.device.*. We basically build the same thing twice. It's up to the backend (i.e. Gnocchi) to aggregate them if someone wants the aggregate. Change-Id: I612b575004f65665f8630f19f56c2fb3637448fd
* Remove extra space between method parameters (Huachao Mao, 2018-01-12, 1 file, -1/+1)
| Change-Id: I8fbcb516febd5c9a0008c9bb727031015b3759de
* Do not check iterable objects before for loop (jing.liuqing, 2018-01-02, 1 file, -31/+29)
| get_VIFs and get_VBDs will return a set. For more details, see the XenAPI docs: https://docs.citrix.com/content/dam/docs/en-us/xenserver/xenserver-62/xenenterpriseapi.pdf Change-Id: Ic3f0e2eb18d5d6408c60979383465575e0a99d05
* Merge "fix ceilometer-compute invoke libvirt exception error" (Zuul, 2017-11-21, 1 file, -4/+13)
|\
| * fix ceilometer-compute invoke libvirt exception error (xiexianbin, 2017-11-21, 1 file, -4/+13)
| | When a nova compute host starts a VM that was not created by nova-compute, ceilometer-compute hits libvirtError: "metadata not found: Requested metadata element is not present", which causes all VM meters to go unreported. Change-Id: Id71788606bc0da9a7959831fb90d13c25c0b8dcb
* | separate base manager from pipeline (gord chung, 2017-11-16, 2 files, -2/+2)
| | common agent for all Change-Id: I19a83d3d0e5c91ab5cb6e792ab7389e36f8ede55
* | libvirt: share disk device listing (gord chung, 2017-11-06, 1 file, -22/+14)
| | they work on same part of xml Change-Id: I3d42695d89717b732ad0866116995565d429ecdf
* | Fix bug for ceilometer polling generates an exception (WenyanZhang, 2017-11-03, 1 file, -1/+2)
|/ When the "cdrom" device is the one associated with the configdrive, it no longer has a "source" element. This is a normal and expected situation which shouldn't generate any logs. Change-Id: Ia9910f6aec1b2cc8db99d8468e42b840b387130c Closes-Bug: #1729506
* Remove the wrap for skip inspect rbd disk info (Yaguang Tang, 2017-10-23, 1 file, -21/+13)
| Libvirt has supported getting rbd disk info since 2.0, so we can remove this wrap warning. Change-Id: Ie11f64ce5dd9ce60b574ef1f6445d60e60b1887b
* Merge "Remove deprecated compute.workload_partitioning" (Jenkins, 2017-10-13, 1 file, -10/+0)
|\
| * Remove deprecated compute.workload_partitioning (Julien Danjou, 2017-09-25, 1 file, -10/+0)
| | Change-Id: I9ba50ac8513afa4370f76921af05fbf5b86bd4a9
* | Merge "fix disk total_time metrics" (Jenkins, 2017-09-25, 2 files, -12/+9)
|\ \
| |/
|/|
| * fix disk total_time metrics (gord chung, 2017-08-29, 2 files, -12/+9)
| | The libvirt rd_total_time and wr_total_time metrics are in nanoseconds[1]. The current computation converts them to microseconds (us), not milliseconds (ms) as defined. Just skip any conversion completely, as we already capture cputime in nanoseconds. [1] https://linux.die.net/man/1/virsh Change-Id: I68951a2c7d08c58497952f2f2a448d813e17e2cb
* | Update description 'resource_update_interval' option (Huachao Mao, 2017-09-05, 1 file, -3/+5)
|/ Change-Id: I6e65f19cd0b993f225b35f27dd13e79de9df9cc2
* Merge "Add disk total duration of reads/writes metric" (Jenkins, 2017-08-29, 4 files, -3/+23)
|\
| * Add disk total duration of reads/writes metric (1iuwei, 2017-08-12, 4 files, -3/+23)
| | Add disk total duration of reads/writes (ms): disk.device.read.latency disk.device.write.latency Change-Id: I0235087af459278b9ad0a66f95c4e4c4ac72e112
* | Merge "Modify memory swap metric type" (Jenkins, 2017-08-25, 1 file, -0/+2)
|\ \
| * | Modify memory swap metric type (zhang-shaoman, 2017-08-11, 1 file, -0/+2)
| | | The metric type of memory.swap.in and memory.swap.out should be Cumulative, while the default is Gauge, so fix it. Change-Id: I4da715027b3dabb1ceed4640773b1ad64aa50e9c