| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
| |
When a user sets zuul_console_disabled, we don't need to try to
connect to the streaming daemon. In fact, they may have set it
because they know it won't be running. Check for this and avoid
the connection step in that case and therefore avoid the extraneous
"Waiting on logger" messages and extra 30 second delay at the end
of each task.
Change-Id: I86af231f1ca1c5b54b21daae29387a8798190a58
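The check described above can be sketched roughly as follows. This is a minimal illustration, not the code from the change: the helper name should_stream and the string handling are assumptions; only the variable name zuul_console_disabled comes from the commit message.

```python
def should_stream(task_vars):
    """Return True if the callback should try to reach the console daemon.

    Hypothetical helper: only the variable name zuul_console_disabled is
    taken from the commit message; everything else is an assumption.
    """
    disabled = task_vars.get("zuul_console_disabled", False)
    # Inventory values may arrive as strings, so accept common spellings.
    if isinstance(disabled, str):
        disabled = disabled.strip().lower() in ("1", "true", "yes", "on")
    # When disabled, skip the connection step (and its retry delay) entirely.
    return not disabled
```

A caller would consult this before starting the streaming connection and skip it entirely when the function returns False.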
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Tox v4 behaves significantly differently than v3, and some of the
more complex things we do with tox would need an overhaul to
continue to use it. Meanwhile, nox is much simpler and more
flexible, so let's try using it.
This adds a noxfile which should be equivalent to our tox.ini file.
We still need to update the docs build (which involves changes to
base jobs) before we can completely remove tox.
Depends-On: https://review.opendev.org/868134
Change-Id: Ibebb0988d2702d310e46c437e58917db3f091382
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The warn parameter was removed in ansible-core 2.14 [1][2], so an error is
raised during the tutorial run:
FAILED! => {"changed": false, "msg": "Unsupported parameters for
(ansible.legacy.command) module: warn.
[1] https://github.com/ansible/ansible/issues/77394
[2] https://github.com/ansible/ansible/pull/77411
Change-Id: I7ee86f019eeac14ddb22abc7924d0a10b051750e
|
|\ |
|
| |
| |
| |
| |
| |
| |
| | |
This adds a tutorial for enabling tracing along with a simple
all-in-one Jaeger tracing server.
Change-Id: I2c0e9b63730e4981c1b9acb67f8a4f90c38395ed
|
|\ \
| |/
|/| |
|
| |
| | |
Upon further discussion we recently found another case of leaking
console streaming files; if the zuul_console is not running on port
19885, or can not be reached, the streaming spool files will still be
leaked.
The prior work in I823156dc2bcae91bd6d9770bd1520aa55ad875b4 has the
receiving side indicate to the zuul_console daemon that it should
remove the spool file.
If this doesn't happen, either because the daemon was never there, or
it is firewalled off, the streaming spool files are left behind.
This modifies the command action plugin to look for a variable
"zuul_console_disabled" which will indicate to the library running the
shell/command task not to write out the spool file at all, as it will
not be consumed.
It is expected this would be set at a host level in inventory for
nodes that you know can not or will not have access to zuul_console
daemon.
We already have a mechanism to disable this for commands running in a
loop; we expand this with a new string type. The advantage of this is
it leaves the library/command.py side basically untouched.
Documentation is updated, and we cover this with a new test.
Change-Id: I0273993c3ece4363098e4bf30bfc4308bb69a8b4
|
|\ \ |
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
This is a follow-on to Ia78ad9e3ec51bc47bf68c9ff38c0fcd16ba2e728 to
use a different loopback address for the local connection to the
Python 2.7 container. This way, we don't have to override the
existing localhost/127.0.0.1 matches that avoid the executor trying to
talk to a zuul_console daemon. These bits are removed.
The comment around the port settings is updated while we're here.
Change-Id: I33b2198baba13ea348052e998b1a5a362c165479
|
|\ \ \
| |/ / |
|
| | |
| | | |
Change Ief366c092e05fb88351782f6d9cd280bfae96237 introduced a bug in
the streaming daemons because it was using Python 3.6 features. The
streaming console needs to work on all Ansible managed nodes, which
includes back to Python 2.7 nodes (while Ansible supports that).
This introduces a regression test by building about the smallest
Python 2.7 container that can be managed by Ansible. We start this
container and modify the test inventory to include it, then run the
stream tests against it.
The existing testing runs against the "new" console but also tests
against the console OpenDev's Zuul starts to ensure
backwards-compatibility. Since this container wasn't started by Zuul
it doesn't have this, so that testing is skipped for this node.
It might be good to abstract all testing of the console daemons into
separate containers for each Ansible supported managed-node Python
version -- it's a bit more work than I want to take on right now.
This should ensure the lower-bound though and prevent regressions for
older platforms.
Change-Id: Ia78ad9e3ec51bc47bf68c9ff38c0fcd16ba2e728
|
|\ \ \
| |/ /
|/| | |
|
| |/
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
The keycloak tutorial incorrectly instructed users to run
"docker-compose-compose". Correct that.
Also, change the instructions to "stop" rather than "down" the
original containers so that the results of the quick-start tutorial
are still present.
Finally, verify that, and also add a verification that the intended
effect of the restart worked (by checking the available authn methods).
Change-Id: I43a17e27300126e8acdc1919ba2bbe98719ad604
|
|/
|
| |
With the default "linear" strategy (and likely others), Ansible will
send the on_task_start callback, and then fork a worker process to
execute that task. Since we spawn a thread in the on_task_start
callback, we can end up emitting a log message in this method while
Ansible is forking. If a forked process inherits a Python file object
(i.e., stdout) that is locked by a thread that doesn't exist in the
fork (i.e., this one), it can deadlock when trying to flush the file
object. To minimize the chances of that happening, we should avoid
using _display outside the main thread.
The Python logging module is supposed to use internal locks which are
automatically acquired and released across a fork. Assuming this is
(still) true and functioning correctly, we should be okay to issue
our Python logging module calls at any time. If there is a fault
in this system, however, it could have a potential to cause a similar
problem.
If we can convince the Ansible maintainers to lock _display across
forks, we may be able to revert this change in the future.
Change-Id: Ifc6b835c151539e6209284728ccad467bef8be6f
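As a rough illustration of the approach described above, a worker thread can restrict itself to the logging module and never touch stdout directly. The logger name and the functions stream_worker and run_stream are hypothetical; this is a sketch of the idea, not the zuul_stream callback itself.

```python
import logging
import threading

# Hypothetical logger name, used only for this sketch.
log = logging.getLogger("zuul.stream.sketch")


def stream_worker(lines, collected):
    # Runs outside the main thread: emit output via logging only, and
    # never write to stdout (or an Ansible display object) from here.
    for line in lines:
        log.info("%s", line)
        collected.append(line)


def run_stream(lines):
    # Start the worker the way a task-start callback might, then wait.
    collected = []
    worker = threading.Thread(target=stream_worker, args=(lines, collected))
    worker.start()
    worker.join()
    return collected
```

The design choice is that only the main thread ever holds the lock on the stdout file object, so a process forked while the worker is logging should not inherit that lock in a held state.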
|
|
|
|
|
|
|
|
| |
This is a small refactor to check the output of each node separately.
This should have no effect, but makes it easier to add more testing in
a follow-on change.
Change-Id: Ic5d490c54da968b23fed068253f5be0249ea953a
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When using protocol version 1, send a finalise message when streaming
is complete so that the zuul_console daemon can delete the temporary
file.
We test this by inspecting the Ansible console output, which logs a
message with the UUID of the streaming job. We dump the temporary
files on the remote side and make sure a console file for that job
isn't present.
Change-Id: I823156dc2bcae91bd6d9770bd1520aa55ad875b4
|
|
| |
A refresher on how this works, to the best of my knowledge:
1 Firstly, Zuul's Ansible has a library task "zuul_console:" which is
run against the remote node; this forks a console daemon, listening
on a default port.
2 We have an action plugin that runs for each task, and if that task
is a command/shell task, assigns it a unique id.
3 We then override library/command.py (which backs command/shell
tasks) with a version that forks and runs the process on the target
node as usual, but also saves the stdout/stderr output to a
temporary file named per the unique uuid from the step above.
4 At the same time we have the callback plugin zuul_stream.py, which
Ansible is calling as it moves through starting, running and
finishing the tasks. This looks at the task, and if it has a UUID
[2], sends a request to the zuul_console [1], which opens the
temporary file [3] and starts streaming it back.
5 We loop reading this until the connection is closed by [1],
eventually outputting each line.
In this way, the console log is effectively streamed and saved into
our job output.
We have established that we expect the console [1] is updated
asynchronously to the command/streaming [3,4] in situations such as
static nodes. This poses a problem if we ever want to update either
part -- for example we can not change the file-name that the
command.py file logs to, because an old zuul_console: will not know to
open the new file. You could imagine other fantasy things you might
like to do; e.g. negotiate compression etc. that would have similar
issues.
To provide the flexibility for these types of changes, implement a
simple protocol where the zuul_stream and zuul_console sides exchange
their respective version numbers before sending the log files. This
way they can both decide what operations are compatible both ways.
Luckily the extant protocol, which is really just sending a plain
uuid, can be adapted to this. When an old zuul_console server gets
the protocol request it will just look like an invalid log file, which
zuul_stream can handle and thus assume the remote end doesn't know
about protocols.
This bumps the testing timeout; it seems that the extra calls make for
random failures. The failures are random and not in the same place;
I've run this separately in 850719 several times and not seen any
problems with the higher timeout. This test already has a settle
timeout slightly higher, so I think it must have just been on the edge.
Change-Id: Ief366c092e05fb88351782f6d9cd280bfae96237
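The version exchange described above might be sketched like this. The wire format is an assumption made for illustration: the reply is taken to be either a bare integer or, from an old daemon, some non-numeric "log not found" style text. The names parse_version_reply and negotiate are hypothetical and are not taken from zuul_stream or zuul_console.

```python
def parse_version_reply(reply):
    """Return the daemon's protocol version, or 0 for a pre-protocol daemon.

    Assumption: a versioned daemon answers with a bare integer, while an
    old daemon treats the request as an unknown log file and answers with
    text that does not parse as an integer.
    """
    try:
        return int(reply.strip())
    except ValueError:
        return 0


def negotiate(reply, local_version=1):
    # Both sides can only rely on features of the lower of the two versions.
    return min(parse_version_reply(reply), local_version)
```

With a result of 0 the streaming side would fall back to the extant behaviour of sending a plain uuid; with 1 or higher it could use newer operations such as asking the daemon to remove the spool file.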
|
|
| |
The zuul-stream-functional tests currently run Ansible against two
hosts (node1 and node2) started by the "infrastructure" Zuul. However
the tests are working with the zuul_console running on the default
port 19885 -- which is the zuul_console started by the setup jobs at
[1].
The result of this is that this test is only talking to a zuul_console
instance started from the current executor's code, not the Zuul
checkout under test -- i.e. changes to zuul_console.py aren't tested by
this change.
This modifies the job playbook to have another step that can start its
own zuul_console service on the two hosts running at another port
(19887). This way we can test against the zuul_console code from the
Zuul checkout. A new step is added to run a playbook overriding to
this port.
We retain the existing test (against the already running port 19885)
as a backwards compatibility test; although we don't exactly know what
this is running (as it comes from OpenDev's production executors) we
want changes to the console log/stream callback to be able to run
against it, as it may represent what is running on a static node.
The test results are pulled apart a bit to be more explicit, logging
and testing each run separately.
[1] https://opendev.org/zuul/zuul-jobs/src/branch/master/roles/prepare-workspace/tasks/main.yaml#L2
Change-Id: Ib11b77cfdc6c59d12807c6d9684c3e653ccad863
|
|\ |
|
| |
| | |
This adds python3.10 testing on Jammy and switches the docker images to
python3.10 from 3.8.
We run sudo for postgres with -Hi to avoid non-fatal errors when
postgres' client attempts to write command history to Zuul's homedir (it
is running as the postgres user which can't write to zuul's homedir). We
also need to update the libffi package version for jammy to 8 in
bindep.txt. Finally, python_version values need to be quoted as "3.10"
is different than 3.10 which is equivalent to 3.1 when serialized by
yaml as a float.
Force setuptools to use stdlib (shipped by the distro) distutils to
avoid problems with virtualenvs not actually being virtualenvs.
Finally we switch the bulk of jobs over to using nodeset: ubuntu-jammy
as the default python there is 3.10.
Change-Id: I97b90bb7a23c90f108f23dda9fdd0e89f9f4dbca
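The quoting issue mentioned above can be reproduced without a YAML parser, since an unquoted 3.10 is loaded as a float. The sketch below uses Python's float() as a stand-in for the YAML loader; the helper name as_version_string is hypothetical.

```python
def as_version_string(value):
    # Render a loaded python_version value the way it would later be used.
    return str(value)


# An unquoted YAML scalar 3.10 is read as a float, dropping the trailing zero.
unquoted = float("3.10")
# A quoted scalar "3.10" stays a string and keeps both digits.
quoted = "3.10"
```

Here as_version_string(unquoted) gives "3.1" while as_version_string(quoted) gives "3.10", which is why the values need quoting.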
|
|\ \ |
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
formatStatusUrl() returns the buildset URL when
item.current_build_set.result exists (zuul/model.py).
Also updated the quick-start to continue to look for the
build URL instead of the buildset URL.
Change-Id: I5f8433e2926da5a8d14b966d89cc943be1ecfca9
|
| |/
|/|
| |
| |
| |
| |
| |
| | |
A couple of locations continue to reference actiongeneral which has been
removed. Update these locations to use action as the current location
for these plugins.
Change-Id: I71c03d2c0a84592be66fa0d84bc684684a392a27
|
| |
| | |
This adds support for Ansible 5. As mentioned in the reno, only
the major version is specified; that corresponds to major.minor in
Ansible core, so is approximately equivalent to our current regime.
The command module is updated to be based on the current code in
ansible core 2.12.4 (corresponding to community 5.6.0). The previous
version is un-symlinked and copied to the 2.8 and 2.9 directories
for easy deletion as they age out.
The new command module has corrected a code path we used to test
that the zuul_stream module handles python exceptions in modules,
so instead we now take advantage of the ability to load
playbook-adjacent modules to add a test fixture module that always
raises an exception. The zuul stream functional test validation is
adjusted to match the new values.
Similarly, in test_command in the remote tests, we relied on that
behavior, but there is already a test for module exceptions in
test_module_exception, so that check is simply removed.
Among our Ansible version tests, we occasionally had tests which
exercised 2.8 but not 2.9 because it is the default and is otherwise
tested. This change adds explicit tests for 2.9 even if they are
redundant in order to make future Ansible version updates easier and
more mechanical (we don't need to remember to add 2.9 later when
we change the default).
This is our first version of Ansible where the value of
job.ansible-version could be interpreted as an integer, so the
configloader is updated to handle that possibility transparently,
as it already does for floating point values.
Change-Id: I694b979077d7944b4b365dbd8c72aba3f9807329
|
| |
| | |
This has been pinned to a very old version of ARA for some time, and
newer versions of Ansible are no longer compatible with the old version
of ARA. Since this isn't receiving maintenance keeping it up to date,
remove it.
Note that if there is desire for support for this or other callback
plugins, it would be quite reasonable and relatively straightforward
to add the ability to generically configure additional callback plugins.
This would have the advantage of not requiring tight internal integration
between Zuul and other callback plugins. Such a change would likely
be welcome.
Change-Id: I733e48127f2b1cf7d2d52153844098163e48bae8
|
|/
|
|
|
|
|
|
|
|
|
|
|
| |
New versions of git refuse to run on git repos owned by a user
other than the caller. This is due to a fix for
https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2022-24765
The stream jobs install Zuul as root while the repo is owned by
zuul. This causes pbr to run git which runs afoul of that issue.
To correct this, run a wheel build as the zuul user first and
install the wheel.
Change-Id: Id48245f715760c436ae46415f057358e1b687181
|
|
|
|
|
|
|
|
|
|
|
| |
This leak in GitPython may have been a significant contributor to
our high open file counts in tests:
https://github.com/gitpython-developers/GitPython/commit/e16a0040d07afa4ac9c0548aa742ec18ec1395a8
Now that the fix is in a release, remove our ulimit workaround.
Change-Id: Ib61c0ec67d0d244939ee2da142faf03e791159d5
|
|
|
|
|
|
|
|
|
|
|
| |
We were running on Bionic because Zuul's inclusion of a pinned Gear
conflicted with TLS policy on Focal. With Gear gone we can bump up to
Focal safely now.
Followup changes can bump testing platforms ahead to 3.9 or newer as
well.
Change-Id: I4cfef79ebc97753994edaf36a1deca0d3b37ad17
|
|
|
|
|
|
|
|
|
|
|
| |
To save space in ZooKeeper compress the data of ZKObjects. This way we
can reduce the amount of data stored in some cases by a factor of 15x
(e.g. for some job znodes).
For data that is not yet compressed, the ZKObject will fall back
to loading the stored JSON data directly.
Change-Id: Ibb59d3dfc1db0537ff6d28705832f0717d45b632
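The compress-with-fallback behaviour described above could look roughly like the following. This is a sketch under assumptions: the use of zlib and the function names serialize and deserialize are illustrative choices, not confirmed details of ZKObject.

```python
import json
import zlib


def serialize(data):
    # Encode as JSON, then compress before storing in the znode.
    return zlib.compress(json.dumps(data, sort_keys=True).encode("utf8"))


def deserialize(raw):
    # Try to decompress first; data written before compression was
    # enabled is plain JSON and makes zlib raise an error.
    try:
        raw = zlib.decompress(raw)
    except zlib.error:
        pass  # fall back to treating the bytes as uncompressed JSON
    return json.loads(raw.decode("utf8"))
```

Reading therefore works for both old and new znodes, so existing data does not need to be migrated up front.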
|
|
|
|
|
|
|
|
|
| |
This adds a Zuul quick-start tutorial add-on that sets up a keycloak
server. This can be used by new users to demonstrate the admin api
capability, or developers for testing.
Change-Id: I7ce73ce499dd840ad43fd8d0c6544177d02a7187
Co-Authored-By: Matthieu Huin <mhuin@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is 2 changes:
* Run dstat in unit test jobs so that we can get an idea of
system performance/utilization.
* Remove the stestr concurrency cap. With 8 cores, we have
enough headroom to run the test dependencies (sql/zk) while
the tests are running too. Use all the CPU that's available.
Change-Id: I9f250865f7043fdbb1fa8a01f1bc9508490accc1
|
|\ |
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Now that the Zuul web server and database are both required, provide
a consistent user experience by always reporting the build page.
This means that success-url and failure-url are no longer useful,
so remove them.
Update the quick-start to reflect that the build page is always
reported.
Change-Id: I4ff108df3917c9b6f44e2f5b0ccc4a7adbda1677
|
|/
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We dropped installing ARA here when
I5ce1385245c76818777aa34230786a9dbaf723e5 converted things to using
the Zuul managed versions of Ansible (which installs ARA). Since we
dropped the required-projects, I think the bindep here has been
silently just not doing anything.
We also stop gzipping the output -- it makes it fairly useless as
you'd have to bulk download the whole thing to unzip it. I think this
was an artifact of the old logs.openstack.org days.
Needed-By: https://review.opendev.org/c/openstack/project-config/+/777675
Change-Id: I727a088c3176d13b483a0c28f39cbc0f51dad3f6
|
|
|
|
|
|
|
| |
This reverts commit fb595d0427c3597f82c3ba51e44a38df27f54ade. We
fixed the issue with I7b4221c1446b071f2b3a86af7d67ad55f92a68a1.
Change-Id: Ia812dae029eecc7f96757a9d930b3a9c098394bd
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
docker-compose in focal doesn't handle container names containing '-' the
same way as the version included in bionic.
Update the container name so that it is handled correctly by the newest
version of docker-compose.
This matches a recent update for opendev infra. People locally running the
tutorial with an old docker-compose will face an issue, but as this export
is mainly useful for opendev and is now flagged with
"ignore_errors: yes", it won't have a significant impact for those users.
Change-Id: I7b4221c1446b071f2b3a86af7d67ad55f92a68a1
|
|
|
|
|
|
|
|
| |
Something is failing in here, but we can't see it because the output
is all sent to logs which then aren't collected if the task fails.
Ignore errors here, that way we should get whatever logs we can.
Change-Id: Iba90c5b0b084b8740c40cb4e611eb7d46a83834e
|
|
|
|
|
|
|
|
|
|
|
| |
Something (?) changed which is causing ssh-keyscan to try to use
IPv6 lo on our test nodes which does not work. Use ssh-keyscan -4
to work around this.
There is no user impact since we expect the user to interactively
accept the keys in the tutorial.
Change-Id: Ibf033de7a3ed6cb41993c6f4adbddeeb53f4c09c
|
|
|
|
|
|
|
|
| |
This mirrors the configuration in Nodepool for using TLS-enabled
ZooKeeper in tests. We use the ensure-zookeeper role in order
to get a newer ZooKeeper than is supplied in bionic.
Change-Id: I14413fccbc9a6a7a75b6233d667e2a1d2856d894
|
|
|
|
|
|
|
|
|
| |
This was intended as a one-time helper program to help people upgrade
from Zuul v2 to v3. It did not cover all use cases, and has not been
kept up to date or improved. It's time to remove it before the v4
release.
Change-Id: I12cdcedb5baabd8fa0937a6ea21590259093ead1
|
|\ |
|
| |
| | |
This change is a common root for other
ZooKeeper-related changes regarding
scale-out-scheduler. ZooKeeper becoming
a central component requires increasing
"maxClientCnxns".
Since the ZooKeeper class is expected to grow
significantly (ZooKeeper is becoming a central part
of Zuul), the ZooKeeper class (zk.py) is split into a
zk module here to avoid the current god-class.
Also, the ZooKeeper log is copied to the "zuul_output_dir".
Change-Id: I714c06052b5e17269a6964892ad53b48cf65db19
Story: 2007192
|
|/
|
|
|
|
|
|
| |
The new cryptography release needs either a recent pip to be able to
use abi3 wheels or a rust compiler on the system to make the source
installation work. Thus upgrade pip to use the wheels.
Change-Id: Ied6007f9f834f313063e2e56a057a7082a71e5c4
|
|
| |
- quick-start steps are modified to better fit what a reader would do
- quick-start test code is mainly split into 2 files: one is a
setup part as a role; the second starts with cloning the test
repository, just as all following tutorials will do
- some elementary steps when manipulating or checking gerrit are
added as roles
tutorial ssh config: the test ssh configuration has been modified to allow
using a known_hosts file for both someone executing localtest and
opendev.org's zuul. A reader executing the tutorial would still have to
accept the fingerprint. To do so, the commit-msg hook is fetched manually;
otherwise it would be downloaded by git-review through scp. Alas,
git-review doesn't allow passing options to scp to provide a new
known_hosts file.
The user's ssh key is used if ~/.ssh/id_rsa.pub is available; otherwise a
generated one is used.
- "to_json | from_json | json_query" in the test is due to an issue between
ansible and jmespath [1]
[1] https://github.com/ansible-collections/community.general/issues/320
Change-Id: Id5c669537ff5afc7468352139980ebade167d534
|
|
|
|
|
|
|
| |
Zuul now requires at least python 3.6 so use the bionic image for the
zuul-stream-functional tests as well.
Change-Id: Iba47cd9733e97fba5b0471fdcfc69588f03f85d9
|
|
| |
Zuul was designed to block local code execution in untrusted
environments to not only rely on bwrap to contain a job. This got
broken since the creation of a command plugin that injects the
zuul_job_id which is required for log streaming. However this plugin
doesn't do a check if the task is a localhost task. Further it is
required in trusted and untrusted environments due to log
streaming. Thus we need to fork this plugin and restrict the variant
that is used in untrusted environments.
We do this by moving actiongeneral/command.py back to action/*. We
further introduce a new category actiontrusted which gets the
unrestricted version of this plugin.
Change-Id: If81cc46bcae466f4c071badf09a8a88469ae6779
Story: 2007935
Task: 40391
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
virtualenv 20.0.24 creates ~/.local/share/virtualenv with the
seed packages needed for making virtualenvs per-python version.
Creating empty virtualenvs is quick, so run those in sequence
to avoid race possibilities. Then, we can still run the
installs into the virtualenvs in parallel.
We also fix a bug in the console stream functional jobs and install pip
with the use of ensure-pip. This is necessary because the virtualenv
fix runs the stream functional jobs and the update to the stream
functional jobs relies on working docker images.
Change-Id: I3dec251d19dd7b3807848a54e6a20a8e89d30a4e
|
|
|
|
|
|
|
|
| |
Installing docker and pip are pre-reqs and we're using zuul roles
to do them, so they're not really testing explicit quick-start
steps. Move them to pre-run.
Change-Id: I374dac18b9b7e376d924b11f4661355ea7c4d149
|
|
|
|
|
|
|
|
| |
We assume pip exists and were getting it from opendev images before.
Add ensure-pip so that pip exists.
Depends-On: https://review.opendev.org/735904
Change-Id: Ifbd32f11f09c518062ffa350f9ebbe597493df67
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
v14 is the latest lts. Let's use it.
Also - rename the jobs to make it clear what they're doing, and
add a dashboard job that points at opendev's multi-tenant api too.
There are new jobs that default to latest node LTS and auto-detect
yarn vs npm. Update to use them.
Depends-On: https://review.opendev.org/728097
Depends-On: https://review.opendev.org/726547
Change-Id: I5717edea2cd09acc5bce673c38bbe7fa8057a376
|