path: root/nova/tests
Commit message  Author  Age  Files  Lines
* tests: Use GreenThreadPoolExecutor.shutdown(wait=True)  melanie witt  2023-05-17  1  -0/+35
| We are still having some issues in the gate where greenlets from previous tests continue to run while the next test starts, causing false negative failures in unit or functional test jobs. This adds a new fixture that will ensure GreenThreadPoolExecutor.shutdown() is called with wait=True, to wait for greenlets in the pool to finish running before moving on. In local testing, doing this does not appear to adversely affect test run times, which was my primary concern. As a baseline, I ran a subset of functional tests in a loop until failure without the patch and after 11 hours, I got a failure reproducing the bug. With the patch, running the same subset of functional tests in a loop has been running for 24 hours and has not failed yet. Based on this, I think it may be worth trying this out to see if it will help stability of our unit and functional test jobs. And if it ends up impacting test run times or causes other issues, we can revert it.
| Partial-Bug: #1946339
| Change-Id: Ia916310522b007061660172fa4d63d0fde9a55ac
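The idea of forcing shutdown(wait=True) can be sketched roughly as follows. This is not the fixture from the patch: it uses the standard library's ThreadPoolExecutor as a stand-in for GreenThreadPoolExecutor, and the helper name force_shutdown_wait is hypothetical.

```python
import contextlib
import time
from concurrent.futures import ThreadPoolExecutor
from unittest import mock


@contextlib.contextmanager
def force_shutdown_wait(executor_cls=ThreadPoolExecutor):
    """Patch executor_cls.shutdown so every call waits for running work.

    Hypothetical helper: the real fixture targets GreenThreadPoolExecutor.
    """
    original = executor_cls.shutdown

    def shutdown_and_wait(self, wait=True, **kwargs):
        # Ignore the caller's 'wait' value and always block until done.
        return original(self, wait=True, **kwargs)

    with mock.patch.object(executor_cls, "shutdown", shutdown_and_wait):
        yield


results = []


def slow_task():
    time.sleep(0.05)
    results.append("done")


with force_shutdown_wait():
    pool = ThreadPoolExecutor(max_workers=1)
    pool.submit(slow_task)
    # The caller asks not to wait, but the patched method waits anyway.
    pool.shutdown(wait=False)
    finished_at_shutdown = list(results)
```

Under this sketch, work submitted by one test has finished by the time shutdown() returns, so it cannot keep running into the next test.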
* Enable use of service user token with admin context  melanie witt  2023-05-10  3  -0/+37
| When the [service_user] section is configured in nova.conf, nova will have the ability to send a service user token alongside the user's token. The service user token is sent when nova calls other services' REST APIs to authenticate as a service, and service calls can sometimes have elevated privileges. However, nova currently does not have the ability to send a service user token with an admin context. This means that when nova makes REST API calls to other services with an anonymous admin RequestContext (such as in nova-manage or periodic tasks), it will not be authenticated as a service. This adds a keyword argument to service_auth.get_auth_plugin() to enable callers to provide a user_auth object instead of attempting to extract the user_auth from the RequestContext. The cinder and neutron client modules are also adjusted to make use of the new user_auth keyword argument so that nova calls made with anonymous admin request contexts can authenticate as a service when configured.
| Related-Bug: #2004555
| Change-Id: I14df2d55f4b2f0be58f1a6ad3f19e48f7a6bfcb4
* Use force=True for os-brick disconnect during delete  melanie witt  2023-05-10  11  -14/+168
| The 'force' parameter of os-brick's disconnect_volume() method allows callers to ignore flushing errors and ensure that devices are being removed from the host. We should use force=True when we are going to delete an instance to avoid leaving leftover devices connected to the compute host which could then potentially be reused to map volumes to an instance that should not have access to those volumes. We can use force=True even when disconnecting a volume that will not be deleted on termination because os-brick will always attempt to flush and disconnect gracefully before forcefully removing devices.
| Closes-Bug: #2004555
| Change-Id: I3629b84d3255a8fe9d8a7cea8c6131d7c40899e8
* Merge "Handle zero pinned CPU in a cell with mixed policy"  Zuul  2023-05-09  1  -23/+19
|\
| * Handle zero pinned CPU in a cell with mixed policy  Balazs Gibizer  2022-12-13  1  -23/+19
| | When cpu_policy is mixed, the scheduler tries to find a valid CPU pinning for each instance NUMA cell. However, if there is an instance NUMA cell that does not request any pinned CPUs, then such logic will calculate empty pinning information for that cell. The scheduler logic then wrongly assumes that an empty pinning result means there was no valid pinning. However, there is a difference between a None result, which means no valid pinning was found, and an empty result [], which means there was nothing to pin. This patch makes sure that pinning == None is differentiated from pinning == [].
| | Closes-Bug: #1994526
| | Change-Id: I5a35a45abfcfbbb858a94927853777f112e73e5b
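The distinction this commit describes can be illustrated with a minimal sketch. The function names are hypothetical and this is not the scheduler code; it only shows why a truthiness check conflates the two results.

```python
from typing import List, Optional


def cell_fits_buggy(pinning: Optional[List[int]]) -> bool:
    # Truthiness treats [] (nothing to pin) the same as None (no valid
    # pinning), which is the kind of mistake the commit describes.
    return bool(pinning)


def cell_fits(pinning: Optional[List[int]]) -> bool:
    # Only None signals that no valid pinning was found.
    return pinning is not None


nothing_to_pin: List[int] = []
no_valid_pinning = None
```

With these definitions, cell_fits(nothing_to_pin) accepts the cell while cell_fits_buggy(nothing_to_pin) rejects it; both reject no_valid_pinning.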
* | Merge "Reproduce asym NUMA mixed CPU policy bug"  Zuul  2023-05-05  1  -0/+74
|\ \
| |/
| * Reproduce asym NUMA mixed CPU policy bug  Balazs Gibizer  2022-12-13  1  -0/+74
| | Related-Bug: #1994526
| | Change-Id: I52ee068377cc48ef4b4cdcb4b05fdc8d926faddf
* | Merge "Fix get_segments_id with subnets without segment_id"  Zuul  2023-05-04  1  -1/+17
|\ \
| * | Fix get_segments_id with subnets without segment_id  Sylvain Bauza  2023-05-03  1  -1/+17
| | | Unfortunately, when we merged Ie166f3b51fddeaf916cda7c5ac34bbcdda0fd17a we forgot that subnets can have no segment_id field.
| | | Change-Id: Idb35b7e3c69fe8efe498abe4ebcc6cad8918c4ed
| | | Closes-Bug: #2018375
* | | Merge "Have host look for CPU controller of cgroupsv2 location."  Zuul  2023-05-04  8  -50/+140
|\ \ \
| * | | Have host look for CPU controller of cgroupsv2 location.  Jorge San Emeterio  2023-05-03  8  -50/+140
| | | | Make the host class look under '/sys/fs/cgroup/cgroup.controllers' for support of the cpu controller. The host will try searching through cgroupsv1 first, as it has done until now, and if that fails, it will then try cgroupsv2. The host will not support the feature if both checks fail. This new check needs to be mocked by all tests that focus on this piece of code, as it touches a system file that requires privileges. For this purpose, the CGroupsFixture is defined to easily add such mocking to all test cases that require it. I also removed the old mocking in test_driver.py in favor of the fixture above.
| | | | Partial-Bug: #2008102
| | | | Change-Id: I99b57c27c8a4425389bec2b7f05af660bab85610
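The v1-then-v2 fallback described above might look roughly like this. It is a sketch under assumptions: the function names are hypothetical, the cgroupsv1 check is passed in as a callable because the commit message does not say how it is done, and only the '/sys/fs/cgroup/cgroup.controllers' path comes from the message.

```python
import tempfile
from pathlib import Path
from typing import Callable

CGROUP_V2_CONTROLLERS = Path("/sys/fs/cgroup/cgroup.controllers")


def has_cpu_controller_v2(path: Path = CGROUP_V2_CONTROLLERS) -> bool:
    """Return True if 'cpu' is listed in the cgroups v2 controllers file."""
    try:
        controllers = path.read_text().split()
    except OSError:
        # Missing or unreadable file: treat as "not supported".
        return False
    # Compare whole tokens so that 'cpuset' does not count as 'cpu'.
    return "cpu" in controllers


def host_supports_cpu_control(
    v1_check: Callable[[], bool],
    v2_path: Path = CGROUP_V2_CONTROLLERS,
) -> bool:
    """Try cgroups v1 first, then fall back to cgroups v2."""
    return v1_check() or has_cpu_controller_v2(v2_path)


# Exercise the logic against a temporary file instead of the real sysfs.
with tempfile.TemporaryDirectory() as tmp:
    fake = Path(tmp) / "cgroup.controllers"
    fake.write_text("cpuset cpu io memory pids\n")
    v2_only = host_supports_cpu_control(lambda: False, fake)
    fake.write_text("cpuset io memory pids\n")
    cpuset_only = host_supports_cpu_control(lambda: False, fake)
    neither = host_supports_cpu_control(lambda: False, Path(tmp) / "missing")
```

Passing the path as a parameter plays the role that mocking plays in the tests the commit mentions: the check can run without touching a privileged system file.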
* | | | Merge "Save cell socket correctly when updating host NUMA topology"  Zuul  2023-05-04  5  -19/+52
|\ \ \ \
| * | | | Save cell socket correctly when updating host NUMA topology  Artom Lifshitz  2023-04-25  5  -19/+52
| | | | | Previously, in numa_usage_from_instance_numa(), any new NUMACell objects we created did not have the `socket` attribute. In some cases this was persisted all the way down to the database. Fix this by copying `socket` from the old_cell.
| | | | | Change-Id: I9ed3c31ccd3220b02d951fc6dbc5ea049a240a68
| | | | | Closes-Bug: 1995153
* | | | | | Merge "add hypervisor version weigher"  Zuul  2023-05-04  1  -0/+97
|\ \ \ \ \
| |_|_|/ /
|/| | | |
| * | | | add hypervisor version weigher  Sean Mooney  2023-04-20  1  -0/+97
| | | | | implements: blueprint weigh-host-by-hypervisor-version
| | | | | Change-Id: I36b16a388383c26bdf432030bc9e28b2fd75d120
* | | | | Merge "Stop ignoring missing compute nodes in claims"  Zuul  2023-04-26  3  -25/+41
|\ \ \ \ \
| * | | | | Stop ignoring missing compute nodes in claims  Dan Smith  2023-04-24  3  -25/+41
| | | | | | The resource tracker will silently ignore attempts to claim resources when the node requested is not managed by this host. The misleading "self.disabled(nodename)" check will fail if the nodename is not known to the resource tracker, causing us to bail early with a NopClaim. That means we also don't do additional setup like creating a migration context for the instance, claiming resources in placement, and handling PCI/NUMA things. This behavior is quite old, and clearly doesn't make sense in a world with things like placement. The bulk of the test changes here are due to the fact that a lot of tests were relying on this silent ignoring of a mismatching node, because they were passing node names that weren't even tracked. This change makes us raise an error if this happens so that we can actually catch it, and avoid silently continuing with no resource claim.
| | | | | | Change-Id: I416126ee5d10428c296fe618aa877cca0e8dffcf
* | | | | | | Merge "Remove silent failure to find a node on rebuild"  Zuul  2023-04-26  1  -6/+13
|\ \ \ \ \ \
| |/ / / / /
| | | / / /
| |_|/ / /
|/| | | |
| * | | | Remove silent failure to find a node on rebuild  Dan Smith  2023-04-24  1  -6/+13
| |/ / /
| | | | We have been ignoring the case where a rebuild or evacuate is triggered and we fail to find *any* node for our host. This appears to be a very old behavior, which I traced back ten years to this: https://review.opendev.org/c/openstack/nova/+/35851 which was merely fixing the failure to reset instance.node during an evacuate (which re-uses rebuild, which before that was a single-host operation). That patch intended to make a failure to find a node for our host a non-fatal error, but it just means we fall through that check with no node selected, which means we never update instance.node *and* run ResourceTracker code that will fail to find the node later. So, this makes it an explicit error, where we stop further processing, set the migration for the evacuation to 'failed', and send a notification for it. This is the same behavior as happens further down if we find that the instance has been deleted underneath us.
| | | | Change-Id: I88b962aaeaa0554da4ab00906ac4d9e6deb43589
* | | | Reproduce bug 1995153  Artom Lifshitz  2023-04-25  1  -0/+109
|/ / /
| | | If we first boot an instance with NUMA topology on a host, any subsequent attempts to boot instances with the `socket` PCI NUMA policy will fail with `Cannot load 'socket' in the base class`. Demonstrate this in a functional test.
| | | Change-Id: I63f4e3dfa38f65b73d0051b8e52b1abd0f027e9b
| | | Related-bug: 1995153
* | | Merge "db: Remove legacy migrations"  Zuul  2023-04-17  3  -270/+7
|\ \ \
| * | | db: Remove legacy migrations  Stephen Finucane  2023-02-01  3  -270/+7
| | | | sqlalchemy-migrate does not (and will not) support sqlalchemy 2.0. We need to drop these migrations to ensure we can upgrade our sqlalchemy version.
| | | | Change-Id: I7756e393b78296fb8dbf3ca69c759d75b816376d
| | | | Signed-off-by: Stephen Finucane <stephenfin@redhat.com>
* | | | Merge "Make scheduler lazy-load the placement client"  Zuul  2023-04-10  1  -0/+36
|\ \ \ \
| * | | | Make scheduler lazy-load the placement client  Dan Smith  2023-03-22  1  -0/+36
| | | | | Like we did for conductor, this makes the scheduler lazy-load the placement client instead of only doing it during __init__. This avoids a startup crash if keystone or placement are not available, but retains startup failures for other problems and errors likely to be a result of misconfigurations.
| | | | | Closes-Bug: #2012530
| | | | | Change-Id: I42ed876b84d80536e83d9ae01696b0a64299c9f7
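A rough sketch of the lazy-loading pattern described above follows. All names here (Scheduler, ServiceUnavailable, client_factory) are hypothetical stand-ins rather than nova's classes; the point is only that one kind of failure is tolerated at startup while other errors still propagate.

```python
class ServiceUnavailable(Exception):
    """Hypothetical stand-in for 'keystone or placement is not reachable'."""


class Scheduler:
    def __init__(self, client_factory):
        self._client_factory = client_factory
        self._client = None
        try:
            # Try eagerly so misconfiguration still fails at startup.
            self._load_client()
        except ServiceUnavailable:
            # Tolerate a temporarily unreachable service; retry on first use.
            pass

    def _load_client(self):
        self._client = self._client_factory()
        return self._client

    @property
    def placement_client(self):
        if self._client is None:
            self._load_client()
        return self._client


calls = {"count": 0}


def flaky_factory():
    # Fails on the first call (service down), succeeds afterwards.
    calls["count"] += 1
    if calls["count"] == 1:
        raise ServiceUnavailable("placement not reachable yet")
    return object()


scheduler = Scheduler(flaky_factory)   # does not crash at startup
client = scheduler.placement_client    # loaded lazily on first access
```

A factory that raises any other exception type would still abort construction, which mirrors the "retains startup failures for other problems" part of the message.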
* | | | | Merge "Update min support for Bobcat"  Zuul  2023-04-04  1  -0/+16
|\ \ \ \ \
| |/ / / /
|/| | | |
| * | | | Update min support for Bobcat  Sylvain Bauza  2023-03-08  1  -0/+16
| | | | | I needed to update some VDPA functests as they were verifying a Yoga compute service. NOTE(sbauza): For the moment, the grenade-skip-level is not voting but it will be done once I2b21e7d5f487f65ce4391f5c934046552d01a1e2 is merged.
| | | | | Change-Id: I8ef2a8f251a3142c359e14841459bffcc3b50ac9
* | | | | Merge "Unbind port when offloading a shelved instance"  Zuul  2023-03-13  3  -22/+47
|\ \ \ \ \
| * | | | | Unbind port when offloading a shelved instance  Arnaud Morin  2022-11-29  3  -22/+47
| | | | | | When offloading a shelved instance, the compute needs to remove the binding so the port will appear as "unbound" in neutron.
| | | | | | Closes-Bug: 1983471
| | | | | | Change-Id: Ia49271b126870c7936c84527a4c39ab96b6c5ea7
| | | | | | Signed-off-by: Arnaud Morin <arnaud.morin@ovhcloud.com>
* | | | | | Merge "fup for power management series"  Zuul  2023-03-09  1  -3/+3
|\ \ \ \ \ \
| |_|/ / / /
|/| | | | |
| * | | | | fup for power management series  Sylvain Bauza  2023-02-23  1  -3/+3
| | | | | | Emptying the cpu init file and directly calling the submodule API.
| | | | | | Relates to blueprint libvirt-cpu-state-mgmt
| | | | | | Change-Id: I1299ca4b49743f58bec6f541785dd9fbee0ae9e2
* | | | | | | Merge "Revert "Add logging to find test cases leaking libvirt threads""  (tags: 27.0.0.0rc1, 27.0.0)  Zuul  2023-03-04  1  -26/+0
|\ \ \ \ \ \
| * | | | | | | Revert "Add logging to find test cases leaking libvirt threads"  Sylvain Bauza  2023-02-14  1  -26/+0
| | | | | | | This reverts commit 1778a9c589cf24e17b44f556680b17af9577df11. Reason for revert: We said we wouldn't have it in RC1.
| | | | | | | Change-Id: Idf0c9a8adeac231f099b312fc24b9cf9726687e0
* | | | | | | | Merge "Fix logging in MemEncryption-related checks"  Zuul  2023-02-28  2  -2/+22
|\ \ \ \ \ \ \
| * | | | | | | | Fix logging in MemEncryption-related checks  Alexey Stupnikov  2023-02-11  2  -2/+22
| | |_|_|_|/ /
| |/| | | | |
| | | | | | | Currently Nova produces an ambiguous error when a volume-backed instance is started using a flavor with the hw:mem_encryption extra_specs flag: ImageMeta doesn't contain a name if it represents a Cinder volume. This fix slightly changes the steps to get image_meta.name for some MemEncryption-related checks where it could make a difference.
| | | | | | | Closes-bug: #2006952
| | | | | | | Change-Id: Ia69e7cb18cd862f01ecfdbdc358c87af1ab8fbf6
* | | | | | | Merge "Transport context to all threads"  Zuul  2023-02-27  1  -1/+7
|\ \ \ \ \ \ \
| |_|_|/ / / /
|/| | | | | |
| * | | | | | Transport context to all threads  Fabian Wiesel  2022-08-04  1  -1/+7
| | | | | | | The nova.utils.spawn and spawn_n methods transport the context (and profiling information) to the newly created threads. But the same isn't done when submitting work to thread-pools in the ComputeManager. The code doing that for spawn and spawn_n is extracted to a new function and called to submit the work to the thread-pools.
| | | | | | | Closes-Bug: #1962574
| | | | | | | Change-Id: I9085deaa8cf0b167d87db68e4afc4a463c00569c
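The general technique of carrying a caller's context into pool workers can be sketched with the standard library. Note the substitution: this uses contextvars and ThreadPoolExecutor rather than nova's own context handling, and submit_with_context and request_id are hypothetical names.

```python
import contextvars
from concurrent.futures import ThreadPoolExecutor

# Hypothetical per-request value standing in for a request context.
request_id = contextvars.ContextVar("request_id", default=None)


def submit_with_context(executor, func, *args, **kwargs):
    """Submit func so that it runs with a copy of the caller's context."""
    ctx = contextvars.copy_context()
    # ctx.run invokes func inside the captured context in the worker thread.
    return executor.submit(ctx.run, func, *args, **kwargs)


def current_request_id():
    return request_id.get()


request_id.set("req-123")
with ThreadPoolExecutor(max_workers=1) as pool:
    carried = submit_with_context(pool, current_request_id).result()
```

Factoring the capture step into one helper, as the commit does for spawn and spawn_n, means every submission path shares the same behavior.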
* | | | | | | | Merge "Enable cpus when an instance is spawning"  Zuul  2023-02-18  4  -0/+484
|\ \ \ \ \ \ \
| * | | | | | | | Enable cpus when an instance is spawning  Sylvain Bauza  2023-02-10  4  -0/+484
| | | | | | | | With this patch, we now automatically power cores down or up when an instance is stopped or started, respectively. Also, by default, we now powersave or offline dedicated cores when starting the compute service.
| | | | | | | | Implements: blueprint libvirt-cpu-state-mgmt
| | | | | | | | Change-Id: Id645fd1ba909683af903f3b8f11c7f06db3401cb
* | | | | | | | | Merge "libvirt: let CPUs be power managed"  Zuul  2023-02-18  2  -0/+34
|\ \ \ \ \ \ \ \
| |/ / / / / / /
| * | | | | | | | libvirt: let CPUs be power managed  Sylvain Bauza  2023-02-10  2  -0/+34
| | | | | | | | Before going further, we need to return the list of CPUs, including offline ones, if they are power managed by Nova.
| | | | | | | | Co-Authored-By: Sean Mooney <smooney@redhat.com>
| | | | | | | | Partially-Implements: blueprint libvirt-cpu-state-mgmt
| | | | | | | | Change-Id: I5dca10acde0eff554ed139587aefaf2f5fad2ca5
* | | | | | | | | Merge "libvirt: Add configuration options to set SPICE compression settings"  Zuul  2023-02-17  2  -1/+89
|\ \ \ \ \ \ \ \
| * | | | | | | | | libvirt: Add configuration options to set SPICE compression settings  Manuel Bentele  2023-01-11  2  -1/+89
| | | | | | | | | This patch adds the following SPICE-related options to the 'spice' configuration group of a Nova configuration:
| | | | | | | | | - image_compression
| | | | | | | | | - jpeg_compression
| | | | | | | | | - zlib_compression
| | | | | | | | | - playback_compression
| | | | | | | | | - streaming_mode
| | | | | | | | | These configuration options can be used to enable and set the SPICE compression settings for libvirt (QEMU/KVM) provisioned instances. Each configuration option is optional and can be set explicitly to configure the associated SPICE compression setting for libvirt. If none of the configuration options are set, then none of the SPICE compression settings will be configured for libvirt, which corresponds to the behavior before this change. In this case, the built-in defaults from the libvirt backend (e.g. QEMU) are used. Note that those options are only taken into account if SPICE support is enabled (and the VNC support is disabled).
| | | | | | | | | Implements: blueprint nova-support-spice-compression-algorithm
| | | | | | | | | Change-Id: Ia7efeb1b1a04504721e1a5bdd1b5fa7a87cdb810
* | | | | | | | | | Merge "cpu: interfaces for managing state and governor"  Zuul  2023-02-15  4  -0/+237
|\ \ \ \ \ \ \ \ \
| | |/ / / / / / /
| |/| | | | | | |
| * | | | | | | | | cpu: interfaces for managing state and governor  Sylvain Bauza  2023-02-09  4  -0/+237
| | | | | | | | | This is the first stage of the power management series. In order to be able to switch the CPU state or change the governor, we need a framework to access sysfs. As some bits can be reused, let's create a nova.filesystem helper module that will define read-write mechanisms for accessing sysfs-specific commands.
| | | | | | | | | Partially-Implements: blueprint libvirt-cpu-state-mgmt
| | | | | | | | | Change-Id: Icb913ed9be8d508de35e755a9c650ba25e45aca2
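A read-write helper for sysfs of the kind described above could be sketched as follows. This is not the nova.filesystem module: the function names and the root parameter are assumptions, and the demonstration runs against a temporary directory rather than the real /sys.

```python
import tempfile
from pathlib import Path

SYSFS_ROOT = Path("/sys")


def read_sys(relative_path: str, root: Path = SYSFS_ROOT) -> str:
    """Read a sysfs-style file and return its content without whitespace."""
    return (root / relative_path).read_text().strip()


def write_sys(relative_path: str, value: str, root: Path = SYSFS_ROOT) -> None:
    """Write a value to a sysfs-style file."""
    (root / relative_path).write_text(value)


# Use a fake tree so the example needs no privileges.
with tempfile.TemporaryDirectory() as tmp:
    fake_root = Path(tmp)
    (fake_root / "devices/system/cpu/cpu1/cpufreq").mkdir(parents=True)

    # Switch the CPU state, then change the governor.
    write_sys("devices/system/cpu/cpu1/online", "0", root=fake_root)
    write_sys(
        "devices/system/cpu/cpu1/cpufreq/scaling_governor",
        "powersave",
        root=fake_root,
    )

    online = read_sys("devices/system/cpu/cpu1/online", root=fake_root)
    governor = read_sys(
        "devices/system/cpu/cpu1/cpufreq/scaling_governor", root=fake_root
    )
```

Keeping the root configurable makes the helper reusable and lets tests avoid writing to the host's real CPU controls.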
* | | | | | | | | Merge "Add logging to find test cases leaking libvirt threads"  Zuul  2023-02-14  1  -0/+26
|\ \ \ \ \ \ \ \ \
| |_|_|_|/ / / / /
|/| | | | / / / /
| | |_|_|/ / / /
| |/| | | | | |
| * | | | | | | Add logging to find test cases leaking libvirt threads  Sylvain Bauza  2023-02-10  1  -0/+26
| |/ / / / / /
| | | | | | | We see functional test failures due to a leaked libvirt event handling thread waking up after its original test finished and importing libvirt. If this happens when the libvirt package import is poisoned, then the currently executing test will fail. This patch logs the name of the test case that leaked the libvirt event handling thread. We will revert this before RC1.
| | | | | | | Change-Id: I3146e9afb411056d004fc118ccfa31126a3c6b15
| | | | | | | Related-Bug: #1946339
* | | | | | | Stable compute uuid functional tests  Dan Smith  2023-02-01  1  -0/+85
| | | | | | | This adds a number of functional test cases for the stable-compute-uuid error cases. Specifically around checks and aborted startups to make sure we're catching what we expect, and failing in the appropriate ways.
| | | | | | | Related to blueprint stable-compute-uuid
| | | | | | | Change-Id: I8bcb93a6887ed06dbd4b7c28c93a20a3705a6077
* | | | | | | Abort startup if nodename conflict is detected  Dan Smith  2023-02-01  2  -0/+23
|/ / / / / /
| | | | | | We do run update_available_resource() synchronously during service startup, but we only allow certain exceptions to abort startup. This makes us abort for InvalidConfiguration, and makes the resource tracker raise that for the case where the compute node create failed due to a duplicate entry. This also modifies the object to raise a nova-specific error for that condition to avoid the compute node needing to import oslo_db stuff just to be able to catch it.
| | | | | | Change-Id: I5de98e6fe52e45996bc2e1014fa8a09a2de53682
* | | | | | Protect against a deleted node id file  Dan Smith  2023-02-01  2  -3/+24
| | | | | | If we are starting up for the first time, we expect to generate and write a node_uuid file if one does not exist. If we are upgrading, we expect to do the same. However, if we are starting up not after an upgrade and not for the first time, a missing compute_id file is an error, and we should abort. Because of the additional check that this adds, which is called from a variety of places that don't have the stable compute node singleton stubbed to make it happy, this mocks the check for any test that does not specifically aim to exercise it.
| | | | | | Related to blueprint stable-compute-uuid
| | | | | | Change-Id: If83ce14b96e7d84ae38eba9d798754557d5abdfd
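The three startup cases described above reduce to a small decision, sketched here under assumptions: resolve_node_uuid and MissingNodeIdentity are hypothetical names, and how first start or upgrade is detected is left to the caller because the message does not say.

```python
import uuid
from typing import Optional


class MissingNodeIdentity(Exception):
    """Hypothetical error for a node id file that should exist but does not."""


def resolve_node_uuid(
    stored_uuid: Optional[str], first_start: bool, upgrading: bool
) -> str:
    """Return the node uuid to use, generating one only when that is expected."""
    if stored_uuid is not None:
        return stored_uuid
    if first_start or upgrading:
        # A missing file is expected here, so generate a fresh identity.
        return str(uuid.uuid4())
    # Neither first start nor upgrade: the file was probably deleted.
    raise MissingNodeIdentity("node id file is missing; aborting startup")


existing = resolve_node_uuid("abc", first_start=False, upgrading=False)
generated = resolve_node_uuid(None, first_start=True, upgrading=False)
```

Calling resolve_node_uuid(None, first_start=False, upgrading=False) raises MissingNodeIdentity, which corresponds to the abort case in the message.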
* | | | | | Check our nodes for hypervisor_hostname changes  Dan Smith  2023-02-01  1  -12/+36
| |_|_|/ /
|/| | | |
| | | | | When we are loading our ComputeNode objects by UUID according to what the virt driver reported, we can sanity check the DB records against the virt driver's hostname. This covers the case where our CONF.host has not changed but the hostname reported by the virt driver has, assuming we can find the ComputeNode object(s) that match our persistent node uuid.
| | | | | Related to blueprint stable-compute-uuid
| | | | | Change-Id: I41635210d7d6f46b437b06d2570a26a80ed8676a