| Commit message (Collapse) | Author | Age | Files | Lines |
|\ |
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
We are still having some issues in the gate where greenlets from
previous tests continue to run while the next test starts, causing
false negative failures in unit or functional test jobs.
This adds a new fixture that will ensure
GreenThreadPoolExecutor.shutdown() is called with wait=True, to wait
for greenlets in the pool to finish running before moving on.
In local testing, doing this does not appear to adversely affect test
run times, which was my primary concern.
As a baseline, I ran a subset of functional tests in a loop
until failure without the patch and after 11 hours, I got a failure
reproducing the bug. With the patch, running the same subset of
functional tests in a loop has been running for 24 hours and has not
failed yet.
Based on this, I think it may be worth trying this out to see if it
will help stability of our unit and functional test jobs. And if it
ends up impacting test run times or causes other issues, we can
revert it.
Partial-Bug: #1946339
Change-Id: Ia916310522b007061660172fa4d63d0fde9a55ac
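The idea behind the fixture can be sketched as follows. This is a minimal illustration, not the fixture added by this change: it uses the standard library's ThreadPoolExecutor as a stand-in for GreenThreadPoolExecutor, and the names TrackedExecutor and drain_executors are hypothetical.

```python
import concurrent.futures
import time


class TrackedExecutor(concurrent.futures.ThreadPoolExecutor):
    """Stand-in executor that records every instance that gets created."""

    instances = []

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        TrackedExecutor.instances.append(self)


def drain_executors():
    """Block until all work submitted to tracked executors has finished."""
    while TrackedExecutor.instances:
        executor = TrackedExecutor.instances.pop()
        # wait=True is the important part: do not return while work is running.
        executor.shutdown(wait=True)


results = []
pool = TrackedExecutor(max_workers=2)
pool.submit(lambda: (time.sleep(0.05), results.append("done")))
# In a test suite this call would be registered as a per-test cleanup.
drain_executors()
```

Because the cleanup blocks until the pool is idle, work started by one test cannot still be running when the next test begins.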
|
|\ \ |
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
The patch to remove legacy migrations merged during the Bobcat cycle,
not the Antelope cycle, so the docs need to be updated accordingly.
Change-Id: I0d164ff1aaaab8d84116a0210f668330d2f86e7e
Signed-off-by: Stephen Finucane <stephenfin@redhat.com>
|
| |/
|/|
| |
| |
| |
| |
| |
| | |
validate-backport job started to fail as only old stable branch naming
is accepted. This patch extends the script to allow numbers and dot as
well in the branch names (like stable/2023.1).
Change-Id: Icbdcd5d124717e195d55d9e42530611ed812fadd
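As an illustration only, the relaxed branch-name check might look like the sketch below. The pattern and the helper name is_stable_branch are assumptions made for this example; the actual script and its expression are not shown in this log.

```python
import re

# Hypothetical pattern: lowercase letters, digits and dots after "stable/".
STABLE_BRANCH = re.compile(r"^stable/[a-z0-9.]+$")


def is_stable_branch(name: str) -> bool:
    """Return True for both old (stable/zed) and new (stable/2023.1) names."""
    return STABLE_BRANCH.match(name) is not None
```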
|
|\ \ |
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
The recent change(s) to enable a lot more SSHABLE checks puts the
runtime of the ceph job really close to the 2h timeout even when
things are working. Sometimes it times out before it finishes even
though things are progressing. Bump the timeout to avoid that.
Also bump us to 8G swap to match what is set on the parent ceph job
when we upgraded to jammy. We could just unset this, but better to
pin it high in case that job (defined elsewhere) changes. Our job
is the largest ceph job, so it makes sense that it keeps its own
swap level high.
Change-Id: I6cefd87671614d87d92e4675fbc989fc9453c8b9
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
When the [service_user] section is configured in nova.conf, nova will
have the ability to send a service user token alongside the user's
token. The service user token is sent when nova calls other services'
REST APIs to authenticate as a service, and service calls can sometimes
have elevated privileges.
Currently, however, nova does not have the ability to send a service user
token with an admin context. This means that when nova makes REST API
calls to other services with an anonymous admin RequestContext (such as
in nova-manage or periodic tasks), it will not be authenticated as a
service.
This adds a keyword argument to service_auth.get_auth_plugin() to
enable callers to provide a user_auth object instead of attempting to
extract the user_auth from the RequestContext.
The cinder and neutron client modules are also adjusted to make use of
the new user_auth keyword argument so that nova calls made with
anonymous admin request contexts can authenticate as a service when
configured.
Related-Bug: #2004555
Change-Id: I14df2d55f4b2f0be58f1a6ad3f19e48f7a6bfcb4
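A simplified sketch of the new keyword argument follows. FakeContext and ServiceTokenWrapper are hypothetical stand-ins, and the real get_auth_plugin() consults configuration that is omitted here; only the "explicit user_auth wins over the context" behaviour described above is shown.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FakeContext:
    """Stand-in for a RequestContext; an anonymous admin one has no user auth."""

    user_auth: Optional[str] = None


@dataclass
class ServiceTokenWrapper:
    """Stand-in for a plugin that sends user auth plus a service token."""

    user_auth: str
    service_auth: str


def get_auth_plugin(context, user_auth=None, service_auth=None):
    # Prefer an explicitly supplied user_auth; otherwise use the context's.
    user_auth = user_auth or context.user_auth
    if user_auth is not None and service_auth is not None:
        return ServiceTokenWrapper(user_auth, service_auth)
    return user_auth
```

With an anonymous admin context, passing user_auth explicitly is what allows the service token wrapper to be built at all.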
|
|/ /
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
The 'force' parameter of os-brick's disconnect_volume() method allows
callers to ignore flushing errors and ensure that devices are being
removed from the host.
We should use force=True when we are going to delete an instance to
avoid leaving leftover devices connected to the compute host which
could then potentially be reused to map volumes to an instance that
should not have access to those volumes.
We can use force=True even when disconnecting a volume that will not be
deleted on termination because os-brick will always attempt to flush
and disconnect gracefully before forcefully removing devices.
Closes-Bug: #2004555
Change-Id: I3629b84d3255a8fe9d8a7cea8c6131d7c40899e8
|
|\ \ |
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
This reverts commit afb0f774841d30dcae9c074d524e7fa9be840678.
Reason for revert:
We unfortunately leak the token in the logs which is considered a security flaw, even if only provided on DEBUG level.
Change-Id: I52b52e65b689dadbdb08122c94652c491f850de6
Closes-Bug: #2012993
|
|\ \ \ |
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
When cpu_policy is mixed the scheduler tries to find a valid CPU pinning
for each instance NUMA cell. However if there is an instance NUMA cell
that does not request any pinned CPUs then such logic will calculate
empty pinning information for that cell. Then the scheduler logic
wrongly assumes that an empty pinning result means there was no valid
pinning. However, there is a difference between a None result, which
means no valid pinning was found, and an empty result [], which means
there was nothing to pin.
This patch makes sure that pinning == None is differentiated from
pinning == [].
Closes-Bug: #1994526
Change-Id: I5a35a45abfcfbbb858a94927853777f112e73e5b
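The distinction can be shown with a small sketch. The function names are hypothetical and this is not the scheduler code itself; it only contrasts a truthiness check with an explicit None check.

```python
from typing import List, Optional


def cell_fits_buggy(pinning: Optional[List[int]]) -> bool:
    # Truthiness conflates None (no valid pinning) with [] (nothing to pin).
    return bool(pinning)


def cell_fits(pinning: Optional[List[int]]) -> bool:
    # Only None means that no valid pinning was found.
    return pinning is not None
```

A cell that requests no pinned CPUs yields [], which the first check wrongly rejects and the second accepts.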
|
|\ \ \ \
| |/ / / |
|
| | | |
| | | |
| | | |
| | | |
| | | | |
Related-Bug: #1994526
Change-Id: I52ee068377cc48ef4b4cdcb4b05fdc8d926faddf
|
|\ \ \ \ |
|
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
Unfortunately, when we merged Ie166f3b51fddeaf916cda7c5ac34bbcdda0fd17a we
forgot that subnets can have no segment_id field.
Change-Id: Idb35b7e3c69fe8efe498abe4ebcc6cad8918c4ed
Closes-Bug: #2018375
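A defensive lookup along these lines avoids the failure. This is a sketch that assumes subnets are plain dicts; the helper name segment_ids is hypothetical.

```python
def segment_ids(subnets):
    """Collect segment IDs, skipping subnets without a segment_id field."""
    return {
        subnet["segment_id"]
        for subnet in subnets
        # .get() tolerates both a missing key and an explicit None value.
        if subnet.get("segment_id") is not None
    }
```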
|
|\ \ \ \ \ |
|
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | | |
Make the host class look under '/sys/fs/cgroup/cgroup.controllers' for support of the cpu controller. The host will try cgroups v1 first, as it has until now, and if that fails, it will then try cgroups v2. The host will not support the feature if both checks fail.
This new check needs to be mocked by all tests that exercise this piece of code, as it touches a system file that requires privileges. To that end, the CGroupsFixture is defined to make it easy to add such mocking to all test cases that require it.
I also removed the old mocking in test_driver.py in favor of the fixture above.
Partial-Bug: #2008102
Change-Id: I99b57c27c8a4425389bec2b7f05af660bab85610
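The fallback order can be sketched roughly as below. The cgroups v1 check shown here (a per-controller "cpu" directory) is an assumption made for illustration and may differ from what the host class does; the v2 check reads the space-separated controller list from cgroup.controllers. The root parameter exists only so the sketch can run against a temporary directory.

```python
import tempfile
from pathlib import Path


def has_cpu_controller(root="/sys/fs/cgroup"):
    base = Path(root)
    # Assumed v1 check: a per-controller "cpu" directory under the root.
    if (base / "cpu").is_dir():
        return True
    # v2 check: "cpu" must be listed in the cgroup.controllers file.
    try:
        controllers = (base / "cgroup.controllers").read_text().split()
    except OSError:
        # Neither layout was found, so the feature is unsupported.
        return False
    return "cpu" in controllers


# Exercise the v2 path against a throwaway directory, not the real /sys.
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "cgroup.controllers").write_text("cpuset cpu io memory\n")
    v2_supported = has_cpu_controller(tmp)
```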
|
|\ \ \ \ \ \ |
|
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | | |
Previously, in numa_usage_from_instance_numa(), any new NUMACell
objects we created did not have the `socket` attribute. In some cases
this was persisted all the way down to the database. Fix this by
copying `socket` from the old_cell.
Change-Id: I9ed3c31ccd3220b02d951fc6dbc5ea049a240a68
Closes-Bug: 1995153
|
|\ \ \ \ \ \ \
| |_|_|/ / / /
|/| | | | | | |
|
| | |_|_|_|/
| |/| | | |
| | | | | |
| | | | | |
| | | | | | |
implements: blueprint weigh-host-by-hypervisor-version
Change-Id: I36b16a388383c26bdf432030bc9e28b2fd75d120
|
|\ \ \ \ \ \
| | | | | | |
| | | | | | |
| | | | | | | |
https://docs.openstack.org/nova/latest/admin/availability-zones.html"
|
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | | |
https://docs.openstack.org/nova/latest/admin/availability-zones.html
Closes-Bug: #1956506
Change-Id: Iec536713923b17cfceb19f2382b7a10c8527705e
|
|\ \ \ \ \ \ \ |
|
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | | |
The resource tracker will silently ignore attempts to claim resources
when the node requested is not managed by this host. The misleading
"self.disabled(nodename)" check will fail if the nodename is not known
to the resource tracker, causing us to bail early with a NopClaim.
That means we also don't do additional setup like creating a migration
context for the instance, claiming resources in placement, and handling
PCI/NUMA things. This behavior is quite old, and clearly doesn't make
sense in a world with things like placement. The bulk of the test
changes here are due to the fact that a lot of tests were relying on
this silent ignoring of a mismatching node, because they were passing
node names that weren't even tracked.
This change makes us raise an error if this happens so that we can
actually catch it, and avoid silently continuing with no resource
claim.
Change-Id: I416126ee5d10428c296fe618aa877cca0e8dffcf
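In outline, the behaviour change looks like the sketch below. ResourceTrackerSketch and UnknownNodeError are hypothetical names, and the exception raised by the real resource tracker is not stated in this message.

```python
class UnknownNodeError(Exception):
    """Hypothetical error for a claim against a node this host does not manage."""


class ResourceTrackerSketch:
    def __init__(self, nodenames):
        self.nodenames = set(nodenames)

    def claim(self, nodename, instance):
        if nodename not in self.nodenames:
            # Previously this case silently returned a no-op claim.
            raise UnknownNodeError(nodename)
        return {"node": nodename, "instance": instance}
```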
|
|\ \ \ \ \ \ \ \
| |/ / / / / / / |
|
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | | |
We have been ignoring the case where a rebuild or evacuate is triggered
and we fail to find *any* node for our host. This appears to be a very
old behavior, which I traced back ten years to this:
https://review.opendev.org/c/openstack/nova/+/35851
which was merely fixing the failure to reset instance.node during an
evacuate (which re-uses rebuild, which before that was a single-host
operation). That patch intended to make a failure to find a node for
our host a non-fatal error, but it just means we fall through that
check with no node selected, which means we never update instance.node
*and* run ResourceTracker code that will fail to find the node later.
So, this makes it an explicit error, where we stop further processing,
set the migration for the evacuation to 'failed', and send a
notification for it. This is the same behavior as happens further
down if we find that the instance has been deleted underneath us.
Change-Id: I88b962aaeaa0554da4ab00906ac4d9e6deb43589
|
|\ \ \ \ \ \ \ \
| |/ / / / / / /
|/| | | / / / /
| | |_|/ / / /
| |/| | | | | |
|
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | | |
If we first boot an instance with NUMA topology on a host, any
subsequent attempts to boot instances with the `socket` PCI NUMA
policy will fail with `Cannot load 'socket' in the base class`.
Demonstrate this in a functional test.
Change-Id: I63f4e3dfa38f65b73d0051b8e52b1abd0f027e9b
Related-bug: 1995153
|
|/ / / / / /
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | | |
Neutron also flipped to python>=3.9 on all their repos this morning[1]
which means we can't install neutron on focal at all. I'm not sure if
that's going to get reverted at this point, but even if it is, it's
going to take a while to undo. As noted in the comments and the
original commit[2] that added this job, it was intended to be removed
when we dropped focal from the test interface, which we have now done.
1: https://review.opendev.org/q/topic:bug%252F2017478
2: https://review.opendev.org/c/openstack/nova/+/861111
Change-Id: I5be638a702629e07ec9c88bd67bb9b7f1212f7fc
|
|\ \ \ \ \ \
| |_|/ / / /
|/| | | | | |
|
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | | |
Cloudbase have changed priorities and will no longer be testing the
Hyper-V driver. We need to mark this as experimental and consider
removing it in the future.
Change-Id: I823fbf660948c062581d4e0aaaadc6a6983de2a3
Signed-off-by: Stephen Finucane <sfinucan@redhat.com>
|
|\ \ \ \ \ \ |
|
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | | |
sqlalchemy-migrate does not (and will not) support sqlalchemy 2.0. We
need to drop these migrations to ensure we can upgrade our sqlalchemy
version.
Change-Id: I7756e393b78296fb8dbf3ca69c759d75b816376d
Signed-off-by: Stephen Finucane <stephenfin@redhat.com>
|
|\ \ \ \ \ \ \ |
|
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | | |
Was a bit old, refreshed with more up-to-date information and links.
Change-Id: I5b5da4748238acda98f29570fa97d09d8aa8df82
|
|\ \ \ \ \ \ \ \ |
|
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | | |
Like we did for conductor, this makes the scheduler lazy-load the
placement client instead of only doing it during __init__. This avoids
a startup crash if keystone or placement are not available, but
retains startup failures for other problems and errors likely to be
a result of misconfigurations.
Closes-Bug: #2012530
Change-Id: I42ed876b84d80536e83d9ae01696b0a64299c9f7
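The lazy-load pattern can be sketched as follows. SchedulerSketch and client_factory are hypothetical; the real code also separates errors that should still be fatal at startup, which this sketch does not model.

```python
class SchedulerSketch:
    def __init__(self, client_factory):
        # Nothing is contacted during __init__, so startup cannot fail here.
        self._client_factory = client_factory
        self._placement_client = None

    @property
    def placement_client(self):
        # Build the client on first use. If the factory raises, nothing is
        # cached, so the next access simply tries again.
        if self._placement_client is None:
            self._placement_client = self._client_factory()
        return self._placement_client
```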
|
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | | |
This makes us able to run functional tests on Python 3.11. Without this,
tox will happily (and silently) run the default venv, which is unit
tests.
Change-Id: I544a29ae78814f9a454daba8c1978f7ab2c2505c
|
|\ \ \ \ \ \ \ \ \ |
|
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | | |
I needed to update some VDPA functests as they were verifying a Yoga compute
service.
NOTE(sbauza): For the moment, the grenade-skip-level is not voting but it
will be done once I2b21e7d5f487f65ce4391f5c934046552d01a1e2 is merged.
Change-Id: I8ef2a8f251a3142c359e14841459bffcc3b50ac9
|
| |_|_|_|_|/ / / /
|/| | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | | |
Current versions of mypy run with no-implicit-optional
by default. This change gets Nova's mypy test environment
to pass again.
Change-Id: Ie50c8d364ad9c339355cc138b560ec4df14fe307
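The kind of annotation fix this requires typically looks like the following. The function is a made-up example, not code from this change.

```python
from typing import Optional


# With no-implicit-optional, mypy rejects a signature such as
#     def find_host(name: str = None) -> str
# because a None default no longer implies Optional[str].
def find_host(name: Optional[str] = None) -> Optional[str]:
    return name.lower() if name is not None else None
```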
|
| |/ / / / / / /
|/| | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | | |
This makes us test N-2->N even for non-SLURP releases. Ideally we
would continue to keep this working, even though we don't have to.
But, even if this highlights some breaking change and we have to drop
this job, the sentinel will be useful.
Depends-On: https://review.opendev.org/c/openstack/grenade/+/875990
Change-Id: I2b21e7d5f487f65ce4391f5c934046552d01a1e2
|
|\ \ \ \ \ \ \ \ |
|
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | | |
When offloading a shelved instance, the compute needs to remove the
binding so the port will appear as "unbound" in neutron.
Closes-Bug: 1983471
Change-Id: Ia49271b126870c7936c84527a4c39ab96b6c5ea7
Signed-off-by: Arnaud Morin <arnaud.morin@ovhcloud.com>
|
|\ \ \ \ \ \ \ \ \ |
|
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | | |
Emptying the cpu init file and directly calling the submodule API.
Relates to blueprint libvirt-cpu-state-mgmt
Change-Id: I1299ca4b49743f58bec6f541785dd9fbee0ae9e2
|
|\ \ \ \ \ \ \ \ \ \ |
|