| Commit message (Collapse) | Author | Age | Files | Lines |
|\ |
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
The span info for the different merger operations is stored on the
request and will be returned to the scheduler via the result event.
This also adds the request UUID to the "refstat" job so that we can
attach that as a span attribute.
Change-Id: Ib6ac7b5e7032d168f53fe32e28358bd0b87df435
|
|\ \
| |/ |
|
| |
| |
| |
| |
| |
| | |
Change-Id: Ib10452862e7aa1355502bb381d3ff07c65ac7187
Co-Authored-By: James E. Blair <jim@acmegating.com>
Co-Authored-By: Tristan Cacqueray <tdecacqu@redhat.com>
|
|\ \ |
|
| | |
| | |
| | |
| | |
| | |
| | | |
Versions 2.8 and 2.9 are no longer supported by the Ansible project.
Change-Id: I888ddcbecadd56ced83a27ae5a6e70377dc3bf8c
|
|\ \ \
| | |/
| |/| |
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
This adds methods to allow us to save and restore spans using
ZooKeeper data.
Additionally, we subclass the tracing.Span class so that we can
transparently handle timestamps which are stored as floating point
numbers rather than integer nanoseconds.
To exercise the new features, emit spans for QueueItems and BuildSets.
Because most of our higher-level (parent) spans may start on
one host and end on another, we save the full information about
the span in ZK and restore it whenever we do anything with it,
including starting child spans. This works well for starting
a Build span given a BuildSet, since both objects are used by
the executor client and so the span information for both is
available.
However, there are cases where we would like to have child spans
and we do not have the full information of the parent, such as
any children of the Build span on the executor. We could
duplicate all the information of the Build span in ZK and send
it along with the build request, but we really only need a few
bits of info to start a remote child span. In OpenTelemetry,
this is called trace propogation, and there are some tools for
this which assume that the implicit trace context is being used
and formats information for an HTTP header. We could use those
methods, but this change adds a simpler API that is well suited
to our typical json-serialization method of propogation.
To use it, we will add a small extra dictionary to build and merge
requests. This should serialize to about 104 bytes.
So that we can transparantly handle upgrades from having no
saved state for spans and span context in our ZK data, have our
tracing API return a NonRecordingSpan when we try to restore
from a None value. This code uses tracing.INVALID_SPAN or
tracing.INVALID_SPAN_CONTEXT which are suitable constants. They
are sufficiently real for the purpose of context managers, etc.
The only down side is that any child spans of these will be
real, actual reported spans, so in these cases, we will emit
what we intend to be child spans as actual parent traces.
Since this should only happen when a user first enables tracing
on an already existing system, that seems like a reasonable
trade-off. As new objects are populated, the spans will be emitted
as expected.
The trade off here is that our operational code can be much
simpler as we can avoid null value checks and any confusion regarding
context managers.
In particular, we can just assume that tracing spans and contexts
are always valid.
Change-Id: If55b06572b5e95f8c21611b2a3c23f7fd224a547
|
|\ \ \
| |/ / |
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
This adds support for configuring tracing in Zuul along with
basic documentation of the configuration.
It also adds test infrastructure that runs a gRPC-based collector
so that we can test tracing end-to-end, and exercises a simple
test span.
Change-Id: I4744dc2416460a2981f2c90eb3e48ac93ec94964
|
|\ \ \ |
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
This adds information about semaphores to the REST API.
It allows for inspection of the known semaphores in a tenant, the
current number of jobs holding the semaphore, and information about
each holder iff that holder is in the current tenant.
Followup changes will add zuul-client and zuul-web support for the
API, along with docs and release notes.
Change-Id: I6ff57ca8db11add2429eefcc8b560abc9c074f4a
|
|\ \ \ \
| |_|/ /
|/| | | |
|
| |/ /
| | |
| | |
| | |
| | |
| | |
| | | |
If Gerrit returns a 403 on submit, log the text we get in reply to
help diagnose the problem.
Change-Id: I8c9b286bb63ba1703a6a8f3cd6cd9a4b86e62cf2
|
|\ \ \ |
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
This feature instructs Zuul to attempt a second or more node request
with a different node configuration (ie, possibly different labels)
if the first one fails.
It is intended to address the case where a cloud provider is unable
to supply specialized high-performance nodes, and the user would like
the job to proceed anyway on lower-performance nodes.
Change-Id: Idede4244eaa3b21a34c20099214fda6ecdc992df
|
|\ \ \ \ |
|
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
This was deprecated quite some time ago and we should remove it as
part of the next major release.
Also remove a very old Zuul v1 layout.yaml from the test fixtures.
Change-Id: I40030840b71e95f813f028ff31bc3e9b3eac4d6a
|
|\ \ \ \ \
| |/ / / / |
|
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
This was previously deprecated and should be removed shortly before
we release Zuul v7.
Change-Id: Idbdfca227d2f7ede5583f031492868f634e1a990
|
|\ \ \ \ \
| |_|_|_|/
|/| | | | |
|
| | |_|/
| |/| |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
This adds an option to include result data from a job in the MQTT
reporter. It is off by default since it may be quite large for
some jobs.
Change-Id: I802adee834b60256abd054eda2db834f8db82650
|
|\ \ \ \
| | |/ /
| |/| | |
|
| | | |
| | | |
| | | |
| | | | |
Change-Id: I0d450d9385b9aaab22d2d87fb47798bf56525f50
|
|\ \ \ \
| |/ / / |
|
| | | |
| | | |
| | | |
| | | | |
Change-Id: I2576d0dcec7c8f7bbb76bdd469fd992874742edc
|
|\ \ \ \ |
|
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
I noticed in some of our testing a construct like
debug:
msg: '{{ ansible_version }}'
was actually erroring out; you'll see in the console output if you're
looking
Ansible output: b'TASK [Print ansible version msg={{ ansible_version }}] *************************'
Ansible output: b'[WARNING]: Failure using method (v2_runner_on_ok) in callback plugin'
Ansible output: b'(<ansible.plugins.callback.zuul_stream.CallbackModule object at'
Ansible output: b"0x7f502760b490>): 'dict' object has no attribute 'startswith'"
and the job-output.txt will be empty for this task (this is detected
by by I9f569a411729f8a067de17d99ef6b9d74fc21543).
This is because the msg value here comes in as a dict, and in several
places we assume it is a string. This changes places we inspect the
msg variable to use the standard Ansible way to make a text string
(to_text function) and ensures in the logging function it converts the
input to a string.
We test for this with updated tasks in the remote_zuul_stream tests.
It is slightly refactored to do partial matches so we can use the
version strings, which is where we saw the issue.
Change-Id: I6e6ed8dba2ba1fc74e7fc8361e8439ea6139279e
|
|\ \ \ \ \
| | |/ / /
| |/| | | |
|
| | |_|/
| |/| |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
Several of our tests which validate Ansible behavior with Zuul are
not versioned so that they test all supported versions of Ansible.
For those cases, add versioned tests and fix any descrepancies that
have been uncovered by the additional tests (fortunately all are
minor test syntax issues and do not affect real-world usage).
One of our largest versioned Ansible tests was not actually testing
multiple Ansible versions -- we just ran it 3 times on the default
version. Correct that and add validation that the version ran was
the expected version.
Change-Id: I26213f69fe844776408fce24322749a197e07551
|
|\ \ \ \
| | |/ /
| |/| | |
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
Currently the task in the test playbook
- hosts: compute1
tasks:
- name: Command Not Found
command: command-not-found
failed_when: false
is failing in the zuul_stream callback with an exception trying to
fill out the "delta" value in the message here. The result dict
(taken from the new output) shows us why:
2022-08-24 07:19:27.079961 | TASK [Command Not Found]
2022-08-24 07:19:28.578380 | compute1 | ok: ERROR (ignored)
2022-08-24 07:19:28.578622 | compute1 | {
2022-08-24 07:19:28.578672 | compute1 | "failed_when_result": false,
2022-08-24 07:19:28.578700 | compute1 | "msg": "[Errno 2] No such file or directory: b'command-not-found'",
2022-08-24 07:19:28.578726 | compute1 | "rc": 2
2022-08-24 07:19:28.578750 | compute1 | }
i.e. it has no start/stop/delta in the result (it did run and fail, so
you'd think it might ... but this is what Ansible gives us).
This checks for this path; as mentioned the output will now look like
above in this case.
This was found by the prior change
I9f569a411729f8a067de17d99ef6b9d74fc21543. This fixes the current
warning, so we invert the test to prevent further regressions.
Change-Id: I106b2bbe626ed5af8ca739d354ba41eca2f08f77
|
|\ \ \ \
| |/ / / |
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
We have a couple of places that can trigger errors in the zuul_stream
callback that this testing currently misses.
To try and catch this case better we grab the Ansible console output
(where the failure of the callback plugin is noted) and check it for
Ansible failure strings.
Just for review purposes, follow-on changes will correct current
errors so this test can be inverted.
Change-Id: I9f569a411729f8a067de17d99ef6b9d74fc21543
|
| |_|/
|/| |
| | |
| | |
| | |
| | |
| | | |
Timer unit test jobs should disable timer triggers before ending,
otherwise we may not shut down cleanly and will fail the test.
Change-Id: I2bbbfcaa7da50cd2daedb8f7dea11eb5725d56e4
|
|\ \ \
| |_|/
|/| | |
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
The Ansible version is sometimes used for selecting the correct linter
or for implementing feature switches to make roles/playbooks backward
compatible.
With the split of Ansible into an "ansible" and "ansible-core" package,
the `ansible_version` now contains the version of the core package.
There seems to be no other variable that contains the version of the
"Ansible community" package that Zuul is using.
In order to support this use-case for Ansible 5+ we will add the Ansible
version to the job's Zuul vars.
Change-Id: I3f3a3237b8649770a9b7ff488e501a97b646a4c4
|
|\ \ \ |
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
This adds a config-error pipeline reporter configuration option and
now also reports config errors and merge conflicts to the database
as buildset failures.
The driving use case is that if periodic pipelines encounter config
errors (such as being unable to freeze a job graph), they might send
email if configured to send email on merge conflicts, but otherwise
their results are not reported to the database.
To make this more visible, first we need Zuul pipelines to report
buildset ends to the database in more cases -- currently we typically
only report a buildset end if there are jobs (and so a buildset start),
or in some other special cases. This change adds config errors and
merge conflicts to the set of cases where we report a buildset end.
Because of some shortcuts previously taken, that would end up reporting
a merge conflict message to the database instead of the actual error
message. To resolve this, we add a new config-error reporter action
and adjust the config error reporter handling path to use it instead
of the merge-conflicts action.
Tests of this as well as the merge-conflicts code path are added.
Finally, a small debug aid is added to the GerritReporter so that we
can easily see in the logs which reporter action was used.
Change-Id: I805c26a88675bf15ae9d0d6c8999b178185e4f1f
|
|\ \ \ \
| |_|/ /
|/| | | |
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
The report message "This change depends on a change that failed to
merge" (and a similar change for circular dependency bundles) is
famously vague. To help users identify the actual problem, include
URLs for which change(s) caused the problem so that users may more
easily resolve the issue.
Change-Id: Id8b9f8cf2c108703e9209e30bdc9a3933f074652
|
| |/ /
|/| |
| | |
| | | |
Change-Id: I12e8a056a2e5cd1bb18c1f24ecd7db55405f0a8c
|
|\ \ \
| |_|/
|/| | |
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Sometimes Gerrit events may arrive in batches (for example, an
automated process modifies several related changes nearly
simultaneously). Because of our inbuilt delay (10 seconds by
default), it's possible that in these cases, many or all of the
updates represented by these events will have settled on the Gerrit
server before we even start processing the first event. In these
cases, we don't need to query the same changes multiple times.
Take for example a stack of 10 changes. Someone approves all 10
simultaneously. That would produce (at least) 10 events for Zuul
to process. Each event would cause Zuul to query all 10 changes in
the series (since they are related). That's 100 change queries
(and each change query requires 2 or 3 HTTP queries).
But if we know that all the event arrived before our first set of
change queries, we can reduce that set of 100 queries to 10 by
suppressing any queries after the first.
This change generates a logical timestamp (ltime) immediately
before querying Gerrit for a change, and stores that ltime in the
change cache. Whenever an event arrives for processing with an
ltime later than the query ltime, we assume the change is already
up to date with that event and do not perform another query.
Change-Id: Ib1b9245cc84ab3f5a0624697f4e3fc73bc8e03bd
|
|\ \ \ |
|
| | | |
| | | |
| | | |
| | | |
| | | | |
Depends-On: https://review.opendev.org/850110
Change-Id: Ifb3cdd27a3114cf0c0538eebefcb9b6fae948fee
|
|\ \ \ \
| |/ / / |
|
| | |/
| |/|
| | |
| | |
| | | |
Depends-On: https://review.opendev.org/850108
Change-Id: Ieb58888d06c910ad6529594999c09bd4feecd48d
|
|\ \ \ |
|
| |/ /
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
An executor can have a zone configured and at the same time allow
unzoned jobs. In this case the executor was not counted for the zoned
executor metric (online/accepting).
Change-Id: Ib39947e3403d828b595cf2479e64789e049e63cc
|