Commit messages
We are still having some issues in the gate where greenlets from
previous tests continue to run while the next test starts, causing
false negative failures in unit or functional test jobs.
This adds a new fixture that will ensure
GreenThreadPoolExecutor.shutdown() is called with wait=True, to wait
for greenlets in the pool to finish running before moving on.
In local testing, doing this does not appear to adversely affect test
run times, which was my primary concern.
As a baseline, I ran a subset of functional tests in a loop
until failure without the patch and after 11 hours, I got a failure
reproducing the bug. With the patch, running the same subset of
functional tests in a loop has been running for 24 hours and has not
failed yet.
Based on this, I think it may be worth trying this out to see if it
will help stability of our unit and functional test jobs. And if it
ends up impacting test run times or causes other issues, we can
revert it.
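As a rough sketch, such a fixture could look like this (assuming
futurist's GreenThreadPoolExecutor API; the fixture name is
illustrative, not necessarily the one merged):

    import fixtures

    class GreenThreadPoolShutdownWait(fixtures.Fixture):
        """Wait for pooled greenthreads to finish at test cleanup."""

        def __init__(self, executor):
            super().__init__()
            self.executor = executor

        def _setUp(self):
            # shutdown(wait=True) blocks until in-flight greenthreads
            # complete, so they cannot leak into the next test.
            self.addCleanup(self.executor.shutdown, wait=True)
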
Partial-Bug: #1946339
Change-Id: Ia916310522b007061660172fa4d63d0fde9a55ac
When the [service_user] section is configured in nova.conf, nova will
have the ability to send a service user token alongside the user's
token. The service user token is sent when nova calls other services'
REST APIs to authenticate as a service, and service calls can sometimes
have elevated privileges.
However, nova currently does not have the ability to send a service user
token with an admin context. This means that when nova makes REST API
calls to other services with an anonymous admin RequestContext (such as
in nova-manage or periodic tasks), it will not be authenticated as a
service.
This adds a keyword argument to service_auth.get_auth_plugin() to
enable callers to provide a user_auth object instead of attempting to
extract the user_auth from the RequestContext.
The cinder and neutron client modules are also adjusted to make use of
the new user_auth keyword argument so that nova calls made with
anonymous admin request contexts can authenticate as a service when
configured.
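As a hedged sketch of the new keyword argument (the real
nova.service_auth module differs in detail; loading _SERVICE_AUTH from
the [service_user] group is elided here):

    from keystoneauth1 import service_token

    _SERVICE_AUTH = None  # loaded from [service_user] config in real code

    def get_auth_plugin(context, user_auth=None):
        # Callers holding an anonymous admin context can now pass
        # user_auth directly instead of relying on the context's
        # (absent) user token.
        user_auth = user_auth or context.get_auth_plugin()
        if _SERVICE_AUTH is not None:
            # Send a service token alongside the user's auth.
            return service_token.ServiceTokenAuthWrapper(
                user_auth=user_auth, service_auth=_SERVICE_AUTH)
        return user_auth
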
Related-Bug: #2004555
Change-Id: I14df2d55f4b2f0be58f1a6ad3f19e48f7a6bfcb4
The 'force' parameter of os-brick's disconnect_volume() method allows
callers to ignore flushing errors and ensure that devices are being
removed from the host.
We should use force=True when we are going to delete an instance to
avoid leaving leftover devices connected to the compute host which
could then potentially be reused to map to volumes to an instance that
should not have access to those volumes.
We can use force=True even when disconnecting a volume that will not be
deleted on termination because os-brick will always attempt to flush
and disconnect gracefully before forcefully removing devices.
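For illustration, the call site amounts to something like this
(connection_properties and device_info stand in for the values nova
already tracks for the attachment):

    from os_brick.initiator import connector

    conn = connector.InitiatorConnector.factory(
        'iscsi', root_helper='sudo', use_multipath=True)
    # force=True still flushes gracefully first, but removes the device
    # even if the flush fails, so nothing is left behind to be reused.
    conn.disconnect_volume(connection_properties, device_info, force=True)
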
Closes-Bug: #2004555
Change-Id: I3629b84d3255a8fe9d8a7cea8c6131d7c40899e8
When cpu_policy is mixed, the scheduler tries to find a valid CPU pinning
for each instance NUMA cell. However, if an instance NUMA cell does not
request any pinned CPUs, this logic calculates empty pinning information
for that cell, and the scheduler then wrongly assumes that an empty
pinning result means no valid pinning was found. There is a difference
between a None result, meaning no valid pinning was found, and an empty
result [], which means there was nothing to pin.
This patch makes sure that pinning == None is differentiated from
pinning == [].
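A minimal sketch of the distinction (illustrative, not nova's exact
scheduler code):

    def cell_has_valid_pinning(pinning):
        # None: no valid pinning could be computed -> reject the host.
        # []:   the cell requested no pinned CPUs -> success.
        return pinning is not None

The buggy check was effectively 'if not pinning', which treats [] the
same as None.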
Closes-Bug: #1994526
Change-Id: I5a35a45abfcfbbb858a94927853777f112e73e5b
Related-Bug: #1994526
Change-Id: I52ee068377cc48ef4b4cdcb4b05fdc8d926faddf
Unfortunately, when we merged Ie166f3b51fddeaf916cda7c5ac34bbcdda0fd17a
we forgot that subnets can have no segment_id field.
Change-Id: Idb35b7e3c69fe8efe498abe4ebcc6cad8918c4ed
Closes-Bug: #2018375
Make the host class look under '/sys/fs/cgroup/cgroup.controllers' for
support of the cpu controller. The host will try searching through
cgroups v1 first, just like up until now, and if that fails it will then
try cgroups v2. The host will not support the feature if both checks
fail.
This new check needs to be mocked by all tests that focus on this piece
of code, as it touches a system file that requires privileges. For that
purpose, the CGroupsFixture is defined to easily add such mocking to all
test cases that require it.
I also removed the old mocking in test_driver.py in favor of the fixture
above.
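A rough sketch of the cgroups v2 fallback check, assuming the standard
sysfs path (the helper name is hypothetical):

    CGROUP_V2_CONTROLLERS = '/sys/fs/cgroup/cgroup.controllers'

    def _has_cgroupsv2_cpu_controller() -> bool:
        try:
            with open(CGROUP_V2_CONTROLLERS) as f:
                # The file lists space-separated controller names,
                # e.g. "cpuset cpu io memory pids".
                return 'cpu' in f.read().split()
        except OSError:
            # The file is absent on cgroups v1-only hosts.
            return False
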
Partial-Bug: #2008102
Change-Id: I99b57c27c8a4425389bec2b7f05af660bab85610
Previously, in numa_usage_from_instance_numa(), any new NUMACell
objects we created did not have the `socket` attribute. In some cases
this was persisted all the way down to the database. Fix this by
copying `socket` from the old_cell.
Change-Id: I9ed3c31ccd3220b02d951fc6dbc5ea049a240a68
Closes-Bug: 1995153
Implements: blueprint weigh-host-by-hypervisor-version
Change-Id: I36b16a388383c26bdf432030bc9e28b2fd75d120
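For reference, a hedged sketch of what such a weigher looks like (the
merged weigher differs in detail):

    from nova.scheduler import weights

    class HypervisorVersionWeigher(weights.BaseHostWeigher):
        def _weigh_object(self, host_state, weight_properties):
            # Prefer hosts reporting a newer hypervisor version;
            # treat an unknown version as lowest priority.
            return host_state.hypervisor_version or 0
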
The resource tracker will silently ignore attempts to claim resources
when the node requested is not managed by this host. The misleading
"self.disabled(nodename)" check will fail if the nodename is not known
to the resource tracker, causing us to bail early with a NopClaim.
That means we also don't do additional setup like creating a migration
context for the instance, claim resources in placement, and handle
PCI/NUMA things. This behavior is quite old, and clearly doesn't make
sense in a world with things like placement. The bulk of the test
changes here are due to the fact that a lot of tests were relying on
this silent ignoring of a mismatching node, because they were passing
node names that weren't even tracked.
This change makes us raise an error if this happens so that we can
actually catch it, and avoid silently continuing with no resource
claim.
Change-Id: I416126ee5d10428c296fe618aa877cca0e8dffcf
We have been ignoring the case where a rebuild or evacuate is triggered
and we fail to find *any* node for our host. This appears to be a very
old behavior, which I traced back ten years to this:
https://review.opendev.org/c/openstack/nova/+/35851
which was merely fixing the failure to reset instance.node during an
evacuate (which re-uses rebuild, which before that was a single-host
operation). That patch intended to make a failure to find a node for
our host a non-fatal error, but it just means we fall through that
check with no node selected, which means we never update instance.node
*and* run ResourceTracker code that will fail to find the node later.
So, this makes it an explicit error, where we stop further processing,
set the migration for the evacuation to 'failed', and send a
notification for it. This is the same behavior as happens further
down if we find that the instance has been deleted underneath us.
Change-Id: I88b962aaeaa0554da4ab00906ac4d9e6deb43589
If we first boot an instance with NUMA topology on a host, any
subsequent attempts to boot instances with the `socket` PCI NUMA
policy will fail with `Cannot load 'socket' in the base class`.
Demonstrate this in a functional test.
Change-Id: I63f4e3dfa38f65b73d0051b8e52b1abd0f027e9b
Related-bug: 1995153
sqlalchemy-migrate does not (and will not) support sqlalchemy 2.0. We
need to drop these migrations to ensure we can upgrade our sqlalchemy
version.
Change-Id: I7756e393b78296fb8dbf3ca69c759d75b816376d
Signed-off-by: Stephen Finucane <stephenfin@redhat.com>
Like we did for conductor, this makes the scheduler lazy-load the
placement client instead of only doing it during __init__. This avoids
a startup crash if keystone or placement are not available, but
retains startup failures for other problems and errors likely to be
a result of misconfigurations.
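A minimal sketch of the lazy-load pattern (attribute layout and client
construction here are assumptions, not nova's exact code):

    from nova.scheduler.client import report

    class SchedulerManager:
        def __init__(self):
            # Defer client construction past __init__ so the service
            # can start even if keystone/placement are down.
            self._placement_client = None

        @property
        def placement_client(self):
            if self._placement_client is None:
                # Raises on first use, not at service startup.
                self._placement_client = report.SchedulerReportClient()
            return self._placement_client
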
Closes-Bug: #2012530
Change-Id: I42ed876b84d80536e83d9ae01696b0a64299c9f7
I needed to update some VDPA functests as they were verifying a Yoga compute
service.
NOTE(sbauza): For the moment, the grenade-skip-level is not voting but it
will be done once I2b21e7d5f487f65ce4391f5c934046552d01a1e2 is merged.
Change-Id: I8ef2a8f251a3142c359e14841459bffcc3b50ac9
When offloading a shelved instance, the compute needs to remove the
binding so the port will appear as "unbound" in neutron.
Closes-Bug: 1983471
Change-Id: Ia49271b126870c7936c84527a4c39ab96b6c5ea7
Signed-off-by: Arnaud Morin <arnaud.morin@ovhcloud.com>
Emptying the cpu init file and directly calling the submodule API.
Relates to blueprint libvirt-cpu-state-mgmt
Change-Id: I1299ca4b49743f58bec6f541785dd9fbee0ae9e2
This reverts commit 1778a9c589cf24e17b44f556680b17af9577df11.
Reason for revert: We said we wouldn't have it in RC1.
Change-Id: Idf0c9a8adeac231f099b312fc24b9cf9726687e0
Currently Nova produces an ambiguous error when a volume-backed instance
is started using a flavor with the hw:mem_encryption extra_specs flag:
ImageMeta doesn't contain a name if it represents a Cinder volume.
This fix slightly changes the steps used to get image_meta.name for some
MemEncryption-related checks where it could make a difference.
Closes-bug: #2006952
Change-Id: Ia69e7cb18cd862f01ecfdbdc358c87af1ab8fbf6
The nova.utils.spawn and spawn_n methods transport the context (and
profiling information) to the newly created threads, but the same isn't
done when submitting work to the thread pools in the ComputeManager.
The code doing that for spawn and spawn_n is extracted into a new
function and called when submitting work to the thread pools.
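A hedged sketch of the extracted helper (names are illustrative;
oslo.context's update_store() re-associates a context with the current
thread):

    import functools

    from oslo_context import context as o_context

    def with_current_context(func):
        # Capture the caller's context at submit time...
        ctxt = o_context.get_current()

        @functools.wraps(func)
        def wrapped(*args, **kwargs):
            # ...and re-associate it with the pool's worker thread.
            if ctxt is not None:
                ctxt.update_store()
            return func(*args, **kwargs)
        return wrapped

Work is then submitted as pool.submit(with_current_context(task), ...).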
Closes-Bug: #1962574
Change-Id: I9085deaa8cf0b167d87db68e4afc4a463c00569c
With this patch, we now automatically power cores down or up when an
instance is stopped or started.
Also, by default, we now powersave or offline dedicated cores when
starting the compute service.
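Illustratively, onlining or offlining a dedicated core boils down to a
sysfs write like this (standard Linux paths; the helper name is an
assumption):

    def set_core_online(core_id: int, online: bool) -> None:
        # Note: cpu0 usually has no 'online' file and cannot be offlined.
        path = f'/sys/devices/system/cpu/cpu{core_id}/online'
        with open(path, 'w') as f:
            f.write('1' if online else '0')
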
Implements: blueprint libvirt-cpu-state-mgmt
Change-Id: Id645fd1ba909683af903f3b8f11c7f06db3401cb
Before going further, we need to be able to return the list of CPUs even
when they are offline, if they are power-managed by Nova.
Co-Authored-By: Sean Mooney <smooney@redhat.com>
Partially-Implements: blueprint libvirt-cpu-state-mgmt
Change-Id: I5dca10acde0eff554ed139587aefaf2f5fad2ca5
This patch adds the following SPICE-related options to the 'spice'
configuration group of a Nova configuration:
- image_compression
- jpeg_compression
- zlib_compression
- playback_compression
- streaming_mode
These configuration options can be used to enable and set the SPICE
compression settings for libvirt (QEMU/KVM) provisioned instances.
Each configuration option is optional and can be set explicitly to
configure the associated SPICE compression setting for libvirt. If none
of the configuration options are set, none of the SPICE compression
settings will be configured for libvirt, which corresponds to the
behavior before this change. In this case, the built-in defaults from
the libvirt backend (e.g. QEMU) are used.
Note that those options are only taken into account if SPICE support is
enabled (and the VNC support is disabled).
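An illustrative nova.conf snippet (the values shown are examples of
valid libvirt choices, not recommended defaults):

    [spice]
    enabled = true
    image_compression = auto_glz
    jpeg_compression = auto
    zlib_compression = auto
    playback_compression = true
    streaming_mode = filter
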
Implements: blueprint nova-support-spice-compression-algorithm
Change-Id: Ia7efeb1b1a04504721e1a5bdd1b5fa7a87cdb810
This is the first stage of the power management series.
In order to be able to switch the CPU state or change the
governor, we need a framework to access sysfs.
As some bits can be reused, let's create a nova.filesystem helper module
that will define read-write mechanisms for accessing sysfs-specific commands.
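A minimal sketch of such helpers, assuming plain file I/O against sysfs
(the merged module may differ in detail):

    def read_sys(path: str) -> str:
        with open(path, mode='r') as f:
            return f.read()

    def write_sys(path: str, data: str) -> None:
        with open(path, mode='w') as f:
            f.write(data)
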
Partially-Implements: blueprint libvirt-cpu-state-mgmt
Change-Id: Icb913ed9be8d508de35e755a9c650ba25e45aca2
We see functional test failures due to a leaked libvirt event handling
thread waking up after its original test finished and importing libvirt.
If this happens while the libvirt package import is poisoned, the
currently executing test will fail. This patch logs the name of the test
case that leaked the libvirt event handling thread.
We will revert this before RC1.
Change-Id: I3146e9afb411056d004fc118ccfa31126a3c6b15
Related-Bug: #1946339
|
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | | |
This adds a number of functional test cases for the stable-compute-uuid
error cases. Specifically around checks and aborted startups to make
sure we're catching what we expect, and failing in the appropriate
ways.
Related to blueprint stable-compute-uuid
Change-Id: I8bcb93a6887ed06dbd4b7c28c93a20a3705a6077
|
|/ / / / / /
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | | |
We do run update_available_resource() synchronously during service
startup, but we only allow certain exceptions to abort startup. This
makes us abort for InvalidConfiguration, and makes the resource
tracker raise that for the case where the compute node create failed
due to a duplicate entry.
This also modifies the object to raise a nova-specific error for that
condition to avoid the compute node needing to import oslo_db stuff
just to be able to catch it.
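Roughly, the translation looks like this (the nova-specific exception
name here is a stand-in, not necessarily the merged one):

    from oslo_db import exception as db_exc

    class DuplicateRecord(Exception):
        """Stand-in for the nova-specific error."""

    def create_compute_node(db_create, values):
        try:
            return db_create(values)
        except db_exc.DBDuplicateEntry:
            # Callers can catch a nova error without importing oslo_db.
            raise DuplicateRecord() from None
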
Change-Id: I5de98e6fe52e45996bc2e1014fa8a09a2de53682
|
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | | |
If we are starting up for the first time, we expect to generate and
write a node_uuid file if one does not exist. If we are upgrading,
we expect to do the same. However, if we are starting up not after an
upgrade and not for the first time, a missing compute_id file is an
error, and we should abort.
Because of the additional check that this adds, which is called from
a variety of places that don't have the stable compute node singleton
stubbed to make it happy, this mocks the check for any test that does
not specifically aim to exercise it.
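The decision tree reads roughly like this (a sketch based on the
commit's description; the function shape is assumed):

    import os
    import uuid

    def check_compute_id(path, first_start, upgrading):
        if os.path.exists(path):
            return  # steady-state startup
        if first_start or upgrading:
            # Generate and persist a stable node uuid.
            with open(path, 'w') as f:
                f.write(str(uuid.uuid4()))
        else:
            # Neither a fresh install nor an upgrade: the file should
            # already exist, so abort startup.
            raise RuntimeError('compute_id file missing; aborting')
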
Related to blueprint stable-compute-uuid
Change-Id: If83ce14b96e7d84ae38eba9d798754557d5abdfd
|
| |_|_|/ /
|/| | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
When we are loading our ComputeNode objects by UUID according to what
the virt driver reported, we can sanity check the DB records against
the virt driver's hostname. This covers the case where our CONF.host
has not changed but the hostname reported by the virt driver has,
assuming we can find the ComputeNode object(s) that match our
persistent node uuid.
Related to blueprint stable-compute-uuid
Change-Id: I41635210d7d6f46b437b06d2570a26a80ed8676a