author | James E. Blair <jim@acmegating.com> | 2022-04-30 15:58:32 -0700 |
---|---|---|
committer | James E. Blair <jim@acmegating.com> | 2022-05-02 13:57:15 -0700 |
commit | 282182f7c22d3b9540663fb02cfad71fd5897d23 (patch) | |
tree | b9295fda90bcc1c13dc67126f9761a55a8f60c45 /tests/fixtures | |
parent | a89ce345c0fb5fb304713b0b0aa8097312be59f5 (diff) | |
Fix node request failures in dependency cycles
Node request failures cause a queue item to fail (naturally). In a normal
queue without cycles, that just means that we would cancel jobs behind and
wait for the current item to finish the remaining jobs. But with cycles,
the items in the bundle detect that items ahead (which are part of the bundle)
are failing and so they cancel their own jobs more aggressively. If they do
this before all the jobs have started (i.e., because we are waiting on an
unfulfilled node request), they can end up in a situation where they never
run builds, yet they don't report because they are still expecting
those builds.
This likely points to a larger problem in that we should probably not be
canceling those jobs so aggressively. However, the more serious and immediate
problem is the race condition that can cause items not to report.
To correct this immediate problem, tell the scheduler to create fake build
objects with a result of "CANCELED" when the pipeline manager cancels builds
and there is no existing build already. This will at least mean that all
expected builds are present regardless of whether the node request has been
fulfilled.
A later change can be made to avoid canceling jobs in the first place without
needing to change this behavior.
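
The race and the fix described in this message can be modeled in a few lines of Python. This is a minimal sketch under stated assumptions, not Zuul's actual code: `Build`, `QueueItem`, `cancel_jobs`, and `demo` are hypothetical names invented for illustration, and only the "CANCELED" result string and the job names come from this commit. The sketch assumes an item reports only once every expected job has a build with a result.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Build:
    """Hypothetical stand-in for a build record (not Zuul's real class)."""
    job_name: str
    result: Optional[str] = None  # None means the build has not finished


@dataclass
class QueueItem:
    """Hypothetical queue item that tracks the builds it expects."""
    expected_jobs: List[str]
    builds: Dict[str, Build] = field(default_factory=dict)

    def can_report(self) -> bool:
        # Assumption: reporting requires a finished build for every job.
        return all(
            name in self.builds and self.builds[name].result is not None
            for name in self.expected_jobs
        )


def cancel_jobs(item: QueueItem, add_fake_builds: bool) -> None:
    """Cancel every job of an item, optionally recording placeholder builds."""
    for name in item.expected_jobs:
        build = item.builds.get(name)
        if build is not None:
            if build.result is None:
                build.result = "CANCELED"
        elif add_fake_builds:
            # No build exists yet (for example, the node request is still
            # unfulfilled), so record a placeholder with a final result.
            item.builds[name] = Build(name, result="CANCELED")


def demo(add_fake_builds: bool) -> bool:
    """Return whether the item can report after its jobs are canceled."""
    item = QueueItem(expected_jobs=["common-job", "project2-job"])
    item.builds["common-job"] = Build("common-job")  # this job has started
    # project2-job is still waiting on a node request and has no build.
    cancel_jobs(item, add_fake_builds)
    return item.can_report()
```

In this model, `demo(False)` leaves the item waiting on a build that will never exist, while `demo(True)` lets it report because every expected job has a build with a result.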
Change-Id: I1e1150ef67c03452b9a98f9366434c53a5ad26fb
Diffstat (limited to 'tests/fixtures')
-rw-r--r-- | tests/fixtures/layouts/circular-deps-node-failure.yaml | 62 |
1 files changed, 62 insertions, 0 deletions
diff --git a/tests/fixtures/layouts/circular-deps-node-failure.yaml b/tests/fixtures/layouts/circular-deps-node-failure.yaml
new file mode 100644
index 000000000..244449b82
--- /dev/null
+++ b/tests/fixtures/layouts/circular-deps-node-failure.yaml
@@ -0,0 +1,62 @@
+- queue:
+    name: integrated
+    allow-circular-dependencies: true
+
+- pipeline:
+    name: gate
+    manager: dependent
+    success-message: Build succeeded (gate).
+    require:
+      gerrit:
+        approval:
+          - Approved: 1
+    trigger:
+      gerrit:
+        - event: comment-added
+          approval:
+            - Approved: 1
+    success:
+      gerrit:
+        Verified: 2
+        submit: true
+    failure:
+      gerrit:
+        Verified: -2
+    start:
+      gerrit:
+        Verified: 0
+    precedence: high
+
+- job:
+    name: base
+    parent: null
+    run: playbooks/run.yaml
+    nodeset:
+      nodes:
+        - label: debian
+          name: controller
+
+- job:
+    name: common-job
+
+- job:
+    name: project1-job
+
+- job:
+    name: project2-job
+
+- project:
+    name: org/project1
+    queue: integrated
+    gate:
+      jobs:
+        - common-job
+        - project1-job
+
+- project:
+    name: org/project2
+    queue: integrated
+    gate:
+      jobs:
+        - common-job
+        - project2-job