path: root/ironic_python_agent/errors.py
Commit message | Author | Age | Files | Lines
* Guard shared device/cluster filesystemsJulia Kreger2022-07-191-0/+19
| Certain filesystems are sometimes used in specialty computing environments where a shared storage infrastructure or fabric exists. These filesystems allow for multi-host shared concurrent read/write access to the underlying block device by *not* locking the entire device for exclusive use. Generally ranges of the disk are reserved for each interacting node to write to, and locking schemes are used to prevent collisions.
|
| These filesystems are common for use cases where high availability is required or the ability for individual computers to collaborate on a given workload is critical, such as a group of hypervisors supporting virtual machines, because it can allow for nearly seamless transfer of workload from one machine to another.
|
| Similar technologies are also used for cluster quorum and cluster durable state sharing; however, that is not specifically considered in scope.
|
| Where things get difficult is that the entire device is not exclusively locked with the storage fabrics, and in some cases locking is handled by a Distributed Lock Manager on the network, or via special sector interactions amongst the cluster members which understand and support the filesystem.
|
| As a result of this IO/interaction model, an Ironic-Python-Agent performing cleaning can effectively destroy the cluster just by attempting to clean storage which it perceives as attached locally.
|
| This is not IPA's fault. Often this case occurs when a Storage Administrator forgot to update LUN masking or volume settings on a SAN as it relates to an individual host in the overall computing environment. The net result of one node cleaning the shared volume may include restoration from snapshot, backup storage, or may ultimately cause permanent data loss, depending on the environment and the usage of that environment.
|
| Included in this patch:
|
| - IBM GPFS - Can be used on a shared block device... apparently according to IBM's documentation. The standard use of GPFS is more Ceph like in design... however GPFS is also a specially licensed commercial offering, so it is a red flag if this is encountered, and should be investigated by the environment's systems operator.
| - Red Hat GFS2 - Is used with shared common block devices in clusters.
| - VMware VMFS - Is used with shared SAN block devices, as well as local block devices. With shared block devices, ranges of the disk are locked instead of the whole disk, and the ranges are mapped to virtual machine disk interfaces. It is unknown, due to lack of information, if this will detect and prevent erasure of VMFS logical extent volumes.
|
| Co-Authored-by: Jay Faulkner <jay@jvf.cc>
| Change-Id: Ic8cade008577516e696893fdbdabf70999c06a5b
| Story: 2009978
| Task: 44985
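
The guard described in this commit could be sketched roughly as follows. This is only an illustration: the exception name `SharedFilesystemError`, the helper `check_not_shared`, and the filesystem-type strings are assumptions, not the names or values used in errors.py.

```python
# Hypothetical sketch; the names and type strings below are assumptions.
SHARED_FS_TYPES = {"gfs2", "gpfs", "VMFS_volume_member"}


class SharedFilesystemError(Exception):
    """Raised when a device appears to hold a shared/cluster filesystem."""

    def __init__(self, device, fs_type):
        self.device = device
        self.fs_type = fs_type
        super().__init__(
            "Refusing to erase {}: detected shared filesystem {}".format(
                device, fs_type))


def check_not_shared(device, fs_type):
    """Raise if fs_type (as reported by a probing tool) is a shared type."""
    if fs_type in SHARED_FS_TYPES:
        raise SharedFilesystemError(device, fs_type)
```

Raising before any destructive step means a cleaning run stops at the first suspicious device instead of wiping storage other hosts may be using.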
* Remove the iscsi extensionDmitry Tantsur2021-05-101-19/+0
| Change-Id: I2f0e581575112d6c7ba0d211661cab3e0b6caca6
* When reporting that agent is busy, report the executed commandDmitry Tantsur2020-09-181-0/+9
| Also make this API return a proper HTTP code (409 instead of 500).
|
| Change-Id: I5d86878b5ed6142ed2630adee78c0867c49b663f
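
One way to picture this change, as a minimal sketch: the `RESTError` stand-in below is simplified, and the class name `AgentBusyError` and its message text are hypothetical rather than taken from errors.py.

```python
class RESTError(Exception):
    """Simplified stand-in for the base error class."""
    message = "An error occurred"
    status_code = 500

    def __init__(self, details=None):
        super().__init__(details)
        self.details = details or self.message


class AgentBusyError(RESTError):  # hypothetical name
    """Reports which command is running and maps to 409 Conflict."""
    message = "Agent is busy"
    status_code = 409

    def __init__(self, command_name):
        super().__init__("executing command {}".format(command_name))
```

A caller that serializes `status_code` and `details` would then return 409 along with the name of the running command, rather than a bare 500.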
* Clarify connection error on heartbeatsJulia Kreger2020-08-201-0/+10
| Heartbeat connection errors are often a sign of transitory network failures which may resolve themselves. But an operator looking at the screen doesn't necessarily know that. They don't understand that there could have been a network failure, or a misconfiguration that caused the connectivity failure, and sort of default to "well it failed" without further clarification.
|
| As such, this patch adds explicit catching of the requests ConnectionError exception and raises a new internal error with a more verbose error message in that event to provide operators with additional clarity.
|
| Change-Id: I4cb2c0d1f577df1c4451308bd86efa8f94390b0c
| Story: 2008046
| Task: 40709
* Add an ability to run in-band deploy stepsMark Goddard2020-04-061-10/+21
| Mostly adaptation of cleaning methods.
|
| Co-Authored-By: Dmitry Tantsur <dtantsur@redhat.com>
| Change-Id: Ife0502391bbece46d619a20a825dfdb191d5c2b4
| Story: 2006963
| Task: 37791
* Add NTP time syncJulia Kreger2020-03-071-0/+6
| Attempt to sync the clock and save it to the hardware clock.
|
| This feature supports use of chrony or ntpdate.
|
| Sem-Ver: feature
| Change-Id: I178d7614429d582e742d9cba6d0fa3ae099775e3
| Story: 1619054
| Task: 11591
* Merge "Software RAID: Create/delete configurations"Zuul2019-06-051-0/+9
|\
| * Software RAID: Create/delete configurationsArne Wiebalck2019-06-041-0/+9
| | This patch proposes to extend the IPA to be able to configure software RAID devices. For this, the {create,delete}_configuration methods of the GenericHardwareManager are implemented.
| |
| | Change-Id: Id20302537f7994982c7584af546a7e7520e9612b
| | Story: #2004581
| | Task: #29101
* | Allow image checksum to be a URLDmitry Tantsur2019-02-251-1/+1
|/
| We allow image_source to be a URL, let us also support URLs for checksums.
|
| This change copies handling of multi-file checksum files from metalsmith.
|
| Change-Id: Ie4d7e5c79b76bdd72d50eeb384cf10519278a80c
| Story: #2005061
| Task: #29605
* NUMA-topology collectorJaganathan Palanisamy2017-05-161-0/+6
| Implement the optional collector for fetching the NUMA topology details. It collects RAM, CPU cores, thread siblings and NICs data for each NUMA node and stores it under the "numa_topology" key.
|
| Closes-bug: #1635253
| Co-Authored-By: Jaganathan Palanisamy <jpalanis@redhat.com>
| Change-Id: I5a546c009d95f39b7af4d89cf785be8acb8ebc67
| Signed-off-by: karthik s <ksundara@redhat.com>
* Correct failure message output when downloadingGalyna Zholtkevych2017-03-101-0/+1
| This fixes unreadable output on image download failure. It adds a new instance variable to the `ImageDownloadError` exception class to avoid redundant logs.
|
| Change-Id: I51782abd572588adfc62745eeab9c559eb8346dd
| Closes-Bug: #1657691
* Fix two typos, "messsage" and "containg"Nam Nguyen Hoai2016-11-221-1/+1
| This patch set updated two wrong words:
|
| + In error.py file, it should be changed from "messsage" to "message"
| + In utils.py file, it should be changed from "containg" to "containing"
|
| Change-Id: I5ad121ec58ccc6e5f3cc499eca50d16e691f217e
* Use ironic-lib to create configdriveShivanand Tendulker2016-10-211-23/+0
| The shell script to create the config drive is being replaced with Python code in ironic-lib.
|
| Closes-Bug: #1493328
| Change-Id: I31108f1173db3fb585386b2949ec880a95305fb6
* Remove Python 2.6 format styleJohn L. Villalovos2016-10-061-16/+16
| In Python 2.6 it was required to use {0}, {1}...{n} when using the string format function. In Python 2.7 and Python 3 it is not required.
|
| Change {N} to {} in code.
|
| This brings the code in style alignment with other projects like ironic and ironic-lib.
|
| Change-Id: I81c4bb67b0974f73905f14b589b3dd0a7131650d
| Depends-On: I8f0e5405f3e2d6e35418c73f610ac6b779dd75e5
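
For illustration, the two styles produce identical output on Python 2.7 and Python 3. The message text and values below are made up for the example.

```python
device = "/dev/sda"
code = 1

# Python 2.6 required explicit positional indexes in each field.
old_style = "Error erasing {0}: exit code {1}".format(device, code)

# Python 2.7+ and Python 3 number empty fields automatically.
new_style = "Error erasing {}: exit code {}".format(device, code)
```

Explicit indexes are still needed when an argument is reused or reordered, which is why only the simple sequential cases can be converted mechanically.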
* Bind to interface routable to the ironic host, not a random oneDmitry Tantsur2016-03-211-9/+0
| Binding to the first interface that has an IP address is error-prone: there is no guarantee that ironic can reach us via this interface. It is much safer to detect the interface facing ironic and bind to it.
|
| Unused LookupAgentInterfaceError exception is deleted.
|
| The TinyIPA build also requires iptables dependency at build time to insert the required kernel modules.
|
| Closes-Bug: #1558956
| Change-Id: I9586805e6c7f52a50834bc03efeb72d1faa6cb65
* Change to use WARNING level for heartbeat conflict errorsZhenguo Niu2016-03-061-0/+9
| It's normal that ironic returns 409 Conflict from time to time, so it's a bit confusing that we report this with Exception level and traceback.
|
| Change-Id: I1627c61facc3fadd0f5d9d324150e7d2833c7fbc
| Closes-Bug: #1533113
* Support Linux-IO in addition to tgtdDmitry Tantsur2015-11-301-1/+9
| The iSCSI extension now tries to use Linux-IO first (via rtslib) and falls back to tgtd if Linux-IO can't be used (e.g. in the CoreOS-based image which uses containers).
|
| Change-Id: I9cc7a30d9c93c445a66d183146e9260c2b096d33
| Closes-Bug: #1504562
* avoid duplicate text in ISCSIError messagedparalen2015-10-201-2/+1
| The ISCSIError class defines a class-level message attribute with value: "Error starting iSCSI target".
|
| This attribute is further processed in RESTError.__init__ method, the ISCSIError super-class, to create an Exception message concatenating self.message with provided details argument.
|
| However, the ISCSIError.__init__ method provides a details attribute prefixed with the same text to the super(ISCSIError, self).__init__ method.
|
| As a result, the text appears twice: "ISCSIError: Error starting iSCSI target: Error starting iSCSI target: ISCSI daemon didn't initialize. Failed with exit code 107. stdout: . stderr: tgtadm: failed to send request hdr to tgt daemon, Transport endpoint is not connected"
|
| The patch purpose is to remove the details prefix to avoid duplicate text in the exception text while honouring ISCSIError.message.
|
| Change-Id: I9e1434ae17da5112527a841ac069ed2285566cca
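
The duplication can be reproduced with a small sketch. The `RESTError` below is a simplified stand-in that concatenates `message` and `details` as the commit describes; the `Before`/`After` class names are hypothetical and exist only to show the two behaviours side by side.

```python
class RESTError(Exception):
    """Simplified stand-in: joins the class message with the details."""
    message = "An error occurred"

    def __init__(self, details=None):
        self.details = details
        text = self.message
        if details:
            text = "{}: {}".format(self.message, details)
        super().__init__(text)


class ISCSIErrorBefore(RESTError):
    message = "Error starting iSCSI target"

    def __init__(self, error_msg):
        # Prefix repeats self.message, so the text appears twice.
        details = "Error starting iSCSI target: {}".format(error_msg)
        super().__init__(details)


class ISCSIErrorAfter(RESTError):
    message = "Error starting iSCSI target"

    def __init__(self, error_msg):
        # Pass the bare details; the base class adds the message once.
        super().__init__(error_msg)
```

Leaving the prefix to the base class keeps `message` as the single source of that text.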
* Add more info to checksum exceptionJosh Gachnang2015-09-211-4/+8
| If an image cannot be downloaded for some reason, it is helpful for operators to have the image path, checksum, and calculated checksums available easily from the API.
|
| Change-Id: I6a2fb46726245cebd730b5c51d4f25f8465f1658
* Add support for inspection using ironic-inspectorDmitry Tantsur2015-09-071-0/+6
| Adds a new module ironic_python_agent.inspector and new entry point for extensions, which will allow vendor-specific inspection.
|
| Inspection is run on service start up just before the lookup. Due to this early start, and due to the fact we don't even know MAC address of nodes on inspection (to say nothing about IP addresses), exception handling is a bit different from other agent features: we try hard not to error out until we send at least something to inspector.
|
| Change-Id: I00932463d41819fd0a050782e2c88eddf6fc08c6
* Fix printing of errors in IPAJosh Gachnang2015-08-111-35/+26
| Exception messages weren't being bubbled up to the API because the base exception class wasn't printing correctly. This adds a string and representation function to ensure they print properly and show up correctly when debugging interactively.
|
| Cleaned up the `message` attr on the exception classes. It looks like they started out all without a period, but started adding them later.
|
| Changed classes that were setting error `details` == `message` to use the default details provided in RESTError.
|
| Change-Id: I1ce256585c9a574e1d1f857c7dc4c417a56b913b
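
A minimal sketch of what adding string and representation functions to a base error class can look like. The attribute defaults and the exact output formats here are assumptions made for the example, not the ones in errors.py.

```python
class RESTError(Exception):
    """Simplified stand-in for the base error class."""
    message = "An error occurred"
    details = "No further details"  # placeholder default, assumed

    def __init__(self, details=None):
        super().__init__()
        if details is not None:
            self.details = details

    def __str__(self):
        # Used when the error is logged or rendered in an API response.
        return "{}: {}".format(self.message, self.details)

    def __repr__(self):
        # Used when inspecting the error in an interactive session.
        return "{}({!r})".format(type(self).__name__, self.details)
```

Without `__str__`, an exception constructed with no positional arguments prints as an empty string, which is the kind of silent output this commit describes.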
* Update hacking and fix hacking violationsJim Rollenhagen2015-06-031-23/+15
| This does a few things:
|
| * Update hacking to the version in global-requirements. Old hacking was installing a version of pbr that was breaking other packages.
| * Fix all the hacking/pep8 rules that updating hacking raised.
| * Do some general docstring cleanup, while already in there cleaning up a bunch of docstrings due to H405 violations.
|
| Change-Id: I1fc1e59d4c3d7b14631f8b576e3f3854bc452188
| Closes-Bug: #1461717
* Add cleaning/zapping support to IPAJosh Gachnang2015-03-171-0/+32
| This will add support for in band cleaning operations to IPA and replace the decom API that was unused.
|
| Adds API support for get_clean_steps, which returns a list of supported clean steps for the node, and execute_clean_step, to execute one of the steps returned by get_clean_steps.
|
| Adds versioning and naming for hardware managers, so if a new hardware manager version is deployed in the middle of cleaning/zapping, the cleaning/zapping will be restarted to avoid incompatibilities.
|
| blueprint implement-cleaning-states
| blueprint inband-raid-configuration
| blueprint implement-zaping-states
|
| Depends-On: Ia2500ed5afb72058b4c5e8f41307169381cbce48
| Change-Id: I750b80b9bf98b3ddc5643bb4c14a67d2052239af
* Add the image extension (for local boot)Lucas Alvares Gomes2015-03-041-0/+12
| Initially this extension supports installing a bootloader so the user image can boot from the local disk.
|
| Change-Id: Ia588aafc240b55119c02f1254addc0cf796f88c5
* Add iscsi extensionLucas Alvares Gomes2015-02-261-0/+12
| This extension allows IPA to be used with the PXE/iSCSI methodology of deployment in Ironic.
|
| Change-Id: I32ec9fa74182c0d03c7ef1b698b1d0c0e3007773
* Log required troubleshooting info on image dl failJay Faulkner2015-02-121-2/+2
| Currently, we only log the image ID and attempted URL. Now, we log the status code received and detailed information about how and when things failed.
|
| Change-Id: I718c7facbe1500d98be78b7b6137e92fdfb2fdf1
| Closes-bug: 1420981
| Depends-On: I69f6f6eef4ad573f406d64d579a9811c70ac5d28
* Make all IPA error classes inherit from RESTErrorMichael Turek2015-01-161-11/+15
| Currently several IPA error classes inherit from Exception. This patch makes the base class of those classes RESTError. These error classes are also restructured to initialize in the same manner as other classes which inherit from RESTError. Additionally test cases are added for these error classes.
|
| Change-Id: Ie6235e4cc25f072b789b2e72e4592d4cf02bfedc
| Closes-bug: #1410372
* Consistent way to set details for Error instancesRuby Loo2015-01-151-29/+23
| This fixes and cleans up (making it consistent) how error instances set their details value. The base class RESTError will set the details value; all subclasses should call their parent's __init__().
|
| Unit tests were added to test that the Error instances are initialized correctly.
|
| Change-Id: I2390fa0012f8e4e6d73cbfb188f1733dfe85e65a
| Closes-Bug: #1408817
* Merge "HardwareManagerMethodNotFound requires a method"Jenkins2015-01-151-5/+2
|\
| * HardwareManagerMethodNotFound requires a methodJay Faulkner2015-01-091-5/+2
| | If this exception is called, it should contain a method argument. If not, allow the incorrect call to bubble up rather than setting it to None. This will ensure this error is never called without the method argument.
| |
| | Change-Id: Iedc82b3446d1ee41d6ae94ee43391e12ef4899a7
* | Error classes invoke their parent's __init__()Ruby Loo2015-01-121-3/+3
|/
| This fixes some Error classes so that they are correctly invoking their parent's __init__() method instead of some other ancestor's method.
|
| Change-Id: I7cb2fc56792f7516222baf75f76b50509deefcf5
| Closes-Bug: 1408813
* Allow use of multiple simultaneous HW managersJay Faulkner2015-01-081-0/+40
| Currently we pick the most specific manager and use it. Instead, call each method on each hardware manager in priority order, and consider the call successful if the method exists and doesn't throw IncompatibleHardwareMethodError.
|
| This is an API breaking change for anyone with out-of-tree HardwareManagers.
|
| Closes-bug: 1408469
| Change-Id: I30c65c9259acd4f200cb554e7d688344b7486a58
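
The priority-ordered dispatch described here could be sketched along these lines. The two exception names appear elsewhere in this log; the `dispatch` helper, its signature, and the `(priority, manager)` pair layout are hypothetical.

```python
class IncompatibleHardwareMethodError(Exception):
    """Raised by a manager that cannot run a method on this hardware."""


class HardwareManagerMethodNotFound(Exception):
    """Raised when no manager could service the call."""


def dispatch(managers, method, *args, **kwargs):
    """Try `method` on each manager, highest priority first.

    `managers` is a list of (priority, manager) pairs (assumed layout).
    """
    ordered = sorted(managers, key=lambda pair: pair[0], reverse=True)
    for _priority, manager in ordered:
        func = getattr(manager, method, None)
        if func is None:
            continue  # this manager does not implement the method
        try:
            return func(*args, **kwargs)
        except IncompatibleHardwareMethodError:
            continue  # implemented, but not for this hardware
    raise HardwareManagerMethodNotFound(method)
```

Any other exception raised by a manager propagates unchanged, so only an explicit "not compatible" signal causes fallback to the next manager.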
* Fix exception that is not properly raisedJim Rollenhagen2014-09-101-0/+11
| This commit fixes an exception that was not properly raised, and also makes the exception more relevant.
|
| This also fixes an outstanding bug where, if the agent was not associated with a node, get_node_uuid() would fail in an unexpected manner.
|
| Change-Id: Ifca474a73dd50b5fd2242e5b7e938a5db04f27a8
* Add vmedia boot support in IPARamakrishnan G2014-09-021-0/+11
| This commit adds support for booting IPA from virtual media cdrom. When IPA is booted over virtual media cdrom, the parameters to the IPA are passed in a text file within the virtual media floppy.
|
| Change-Id: Ia04585416aada85022af73fb2b945bd3895606f0
| Closes-Bug: #1358723
* Improve Disk DetectionJosh Gachnang2014-07-181-0/+9
| The previous implementation of list_block_devices used blockdev, which would list partitions, software RAID and other devices as block devices. By switching to lsblk, the agent can filter down to only physical block devices, which is all the agent cares about for any of its operations.
|
| This change adds two new fields to the BlockDevice class: model, a string of the block device's reported model, and rotational, a boolean representing a spinning disk (True) or a solid state disk (False). This data can be useful for vendor hardware managers.
|
| Change-Id: I385c3bb378c2c49385bca14a1d7efa074933becf
| Closes-Bug: 1344351
* Better errors for execute() failuresJim Rollenhagen2014-06-241-10/+12
| Exceptions raised due to processutils.execute() failing now include stdout and stderr.
|
| Change-Id: Id5d1b5bc51d377f9f3c338cd7303ea800f76e5cd
* Tries to advertise valid default IPEllen Hui2014-06-101-0/+18
| During the first heartbeat, the heartbeater asks the agent to check its advertised address; if the advertised IP is still the default (None), the agent tries to replace it with the IP of the first network interface it finds. If it fails to find either a network interface or an IP address, the agent raises an exception.
|
| Change-Id: I6d435d39e99ed0ff5c8b4883b6aa0b356f6cb4ae
| Closes-Bug: #1309110
* Add a HardwareManager method to erase devicesRussell Haering2014-06-061-0/+10
| Add erase_devices method to the HardwareManager class. By default this method iterates block devices, and calls a new abstract erase_block_device method for each device.
|
| This patch includes a simple implementation of erase_block_device on the GenericHardwareManager which attempts to issue an ATA secure erase on supported devices.
|
| Change-Id: I81da065395b8785f636f1b0a0d60c9f1c045441e
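
The default iteration described above can be sketched as follows. The method names come from this log; the bodies are a simplified assumption and leave out anything hardware-specific such as the ATA secure erase.

```python
class HardwareManager:
    """Simplified sketch of the base class behaviour described above."""

    def list_block_devices(self):
        raise NotImplementedError

    def erase_block_device(self, block_device):
        # Abstract: each concrete manager decides how to erase one device.
        raise NotImplementedError

    def erase_devices(self):
        # Default behaviour: erase every block device the manager reports.
        for block_device in self.list_block_devices():
            self.erase_block_device(block_device)
```

Splitting the loop from the per-device step lets a vendor manager override only `erase_block_device` and inherit the iteration.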
* Flow extension uses extension manager from agentVladimir Kozhukalov2014-06-021-0/+4
| Removed the creation of a separate extension manager for the flow extension. Instead, the flow extension now uses the same extension manager instance which is initialized in the agent. This fixes circular extension loading in stevedore.
|
| Closes-Bug: #1316145
| Change-Id: Id339f1876168a41ca43ba7473f3ff6949a233ef3
* Make encoding.serialize() more programmaticalAlexander Gordeev2014-05-061-9/+5
| Introduce `serializable_fields` to express which class attributes are to be serialized.
|
| Get rid of OrderedDict, replacing it with a regular dict.
|
| Change-Id: I3f7639dab171d3d62e92d0d1bb6d7b071cf963ad
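
As a rough sketch of the idea, a class can declare the attribute names to serialize and a shared method can build the dict from that list. Only the `serializable_fields` name comes from this commit; the base class, the `serialize` method, and the example fields are assumptions.

```python
class Serializable:
    """Hypothetical base: subclasses list the attributes to serialize."""
    serializable_fields = ()

    def serialize(self):
        return {name: getattr(self, name)
                for name in self.serializable_fields}


class BlockDevice(Serializable):
    serializable_fields = ("name", "size")  # example fields, assumed

    def __init__(self, name, size):
        self.name = name
        self.size = size
        self._scratch = object()  # not listed, so never serialized
```

Declaring the fields once on the class avoids hand-writing a per-class serialization method.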
* Check configdrive size before writing to partitionJim Rollenhagen2014-04-251-0/+11
| Avoids writing a configdrive out to disk that is larger than the intended partition.
|
| Change-Id: I4e067ccb23ba528d96e4faad39219f67b4178e82
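
A check of this kind might look like the sketch below. The exception name, the helper name, and the byte-count parameters are all hypothetical; the commit does not show how the sizes are obtained.

```python
class ConfigDriveTooLargeError(Exception):  # hypothetical name
    """Raised when the configdrive would not fit in its partition."""

    def __init__(self, size, limit):
        self.size = size
        self.limit = limit
        super().__init__(
            "Configdrive is {} bytes but the partition holds {} bytes".format(
                size, limit))


def check_configdrive_fits(configdrive_size, partition_size):
    """Raise before any write if the configdrive exceeds the partition."""
    if configdrive_size > partition_size:
        raise ConfigDriveTooLargeError(configdrive_size, partition_size)
```

Running the comparison before writing means an oversized configdrive fails cleanly instead of being truncated or spilling past the partition.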
* Use # instead of """ for copyright blocksJim Rollenhagen2014-04-101-15/+13
| Reformats copyright messages to be comments rather than docstring-style blocks.
|
| Change-Id: I4d863f53b67bb49d03bda0952b9e6179b6d23c59
* Replacing teeth/overlord with ipa/ironicJosh Gachnang2014-03-191-8/+8
|
* Renaming to IPAJosh Gachnang2014-03-191-0/+179