| Commit message | Author | Age | Files | Lines |
The new Elasticsearch does not support a custom field type [1].
[1] https://www.elastic.co/guide/en/elasticsearch/reference/7.17/removal-of-types.html#_custom_type_field
Change-Id: I0b154da0a4736c6b7758f9936356d5b7097c35ad
According to the removal-of-types[1] documentation, it is no longer
necessary to specify a document type.
[1] https://www.elastic.co/guide/en/elasticsearch/reference/7.17/removal-of-types.html
Change-Id: I02996ce328a48b5ae6493646abe08ebab31ec962
When a build result arrives for a non-current buildset, skip reporting
since we can no longer create the reference to the buildset. This
avoids the following traceback:
Traceback (most recent call last):
File "/opt/zuul/lib/python3.10/site-packages/zuul/scheduler.py", line 2654, in _doBuildCompletedEvent
self.sql.reportBuildEnd(
File "/opt/zuul/lib/python3.10/site-packages/zuul/driver/sql/sqlreporter.py", line 143, in reportBuildEnd
db_build = self._createBuild(db, build)
File "/opt/zuul/lib/python3.10/site-packages/zuul/driver/sql/sqlreporter.py", line 180, in _createBuild
tenant=buildset.item.pipeline.tenant.name, uuid=buildset.uuid)
AttributeError: 'NoneType' object has no attribute 'item'
Change-Id: Iccbe9ab8212fbbfa21cb29b84a17e03ca221d7bd
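A minimal sketch of the kind of guard this implies; the attribute and
function names here are illustrative, not Zuul's actual reporter code:
    def report_build_end(db, build):
        # (names are hypothetical stand-ins for the SQL reporter API)
        buildset = build.build_set
        if buildset is None or buildset.item is None:
            # The buildset is no longer current (e.g. the item was
            # dequeued or superseded), so there is nothing to attach
            # this build result to; skip the database report.
            return None
        tenant = buildset.item.pipeline.tenant.name
        return db.createBuild(build, tenant=tenant, uuid=buildset.uuid)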
When a job already fails during setup we never load the frozen hostvars.
Since the cleanup playbooks depend on those, we can skip the cleanup
runs if the dict is empty.
As we always add "localhost" to the hostlist, the frozen hostvars will
never be empty when loading was successful.
This will get rid of the following exception:
Traceback (most recent call last):
File "/opt/zuul/lib/python3.10/site-packages/zuul/executor/server.py", line 1126, in execute
self._execute()
File "/opt/zuul/lib/python3.10/site-packages/zuul/executor/server.py", line 1493, in _execute
self.runCleanupPlaybooks(success)
File "/opt/zuul/lib/python3.10/site-packages/zuul/executor/server.py", line 1854, in runCleanupPlaybooks
self.runAnsiblePlaybook(
File "/opt/zuul/lib/python3.10/site-packages/zuul/executor/server.py", line 3042, in runAnsiblePlaybook
self.writeInventory(playbook, self.frozen_hostvars)
File "/opt/zuul/lib/python3.10/site-packages/zuul/executor/server.py", line 2551, in writeInventory
inventory = make_inventory_dict(
File "/opt/zuul/lib/python3.10/site-packages/zuul/executor/server.py", line 913, in make_inventory_dict
node_hostvars = hostvars[node['name']].copy()
KeyError: 'node'
Change-Id: I33a6a9ab355482e471e79f3dd5d702589fee04b3
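A rough sketch of the resulting guard, with hypothetical names standing
in for the executor's actual methods:
    def run_cleanup_playbooks(self, success):
        # (method and attribute names are illustrative only)
        if not self.frozen_hostvars:
            # Setup failed before the frozen hostvars were written; the
            # cleanup playbooks depend on them, so skip the cleanup runs.
            return
        for playbook in self.cleanup_playbooks:
            self.run_ansible_playbook(playbook, success=success)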
Change-Id: I0d450d9385b9aaab22d2d87fb47798bf56525f50
This is a follow-on to Ia78ad9e3ec51bc47bf68c9ff38c0fcd16ba2e728 to
use a different loopback address for the local connection to the
Python 2.7 container. This way, we don't have to override the
existing localhost/127.0.0.1 matches that avoid the executor trying to
talk to a zuul_console daemon. These bits are removed.
The comment around the port settings is updated while we're here.
Change-Id: I33b2198baba13ea348052e998b1a5a362c165479
Change Ief366c092e05fb88351782f6d9cd280bfae96237 introduced a bug in
the streaming daemons because it used Python 3.6 features. The
streaming console needs to work on all Ansible-managed nodes, which
includes Python 2.7 nodes (for as long as Ansible supports them).
This introduces a regression test by building roughly the smallest
Python 2.7 container that can be managed by Ansible. We start this
container, modify the test inventory to include it, and then run the
stream tests against it.
The existing testing runs against the "new" console, but also tests
against the console that OpenDev's Zuul starts, to ensure backwards
compatibility. Since this container wasn't started by Zuul it doesn't
have that console, so that testing is skipped for this node.
It might be good to abstract all testing of the console daemons into
separate containers for each Python version Ansible supports on
managed nodes -- that's a bit more work than I want to take on right
now. This should ensure the lower bound, though, and prevent
regressions on older platforms.
Change-Id: Ia78ad9e3ec51bc47bf68c9ff38c0fcd16ba2e728
Change-Id: I2576d0dcec7c8f7bbb76bdd469fd992874742edc
I noticed in some of our testing a construct like
  debug:
    msg: '{{ ansible_version }}'
was actually erroring out; you'll see this in the console output if
you're looking:
Ansible output: b'TASK [Print ansible version msg={{ ansible_version }}] *************************'
Ansible output: b'[WARNING]: Failure using method (v2_runner_on_ok) in callback plugin'
Ansible output: b'(<ansible.plugins.callback.zuul_stream.CallbackModule object at'
Ansible output: b"0x7f502760b490>): 'dict' object has no attribute 'startswith'"
and the job-output.txt will be empty for this task (this is detected
by I9f569a411729f8a067de17d99ef6b9d74fc21543).
This is because the msg value here comes in as a dict, and in several
places we assume it is a string. This changes the places where we
inspect the msg variable to use the standard Ansible way of making a
text string (the to_text function), and ensures the logging function
converts its input to a string.
We test for this with updated tasks in the remote_zuul_stream tests.
It is slightly refactored to do partial matches so we can use the
version strings, which is where we saw the issue.
Change-Id: I6e6ed8dba2ba1fc74e7fc8361e8439ea6139279e
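As a minimal illustration of the normalization involved (to_text is
Ansible's standard helper; the surrounding function is made up):
    from ansible.module_utils._text import to_text

    def log_msg_lines(msg):
        # msg may arrive as a dict, list, bytes or str depending on the
        # task; convert it to text before using string methods on it.
        for line in to_text(msg).splitlines():
            print(line)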
Currently the task in the test playbook
  - hosts: compute1
    tasks:
      - name: Command Not Found
        command: command-not-found
        failed_when: false
is failing in the zuul_stream callback with an exception trying to
fill out the "delta" value in the message here. The result dict
(taken from the new output) shows us why:
2022-08-24 07:19:27.079961 | TASK [Command Not Found]
2022-08-24 07:19:28.578380 | compute1 | ok: ERROR (ignored)
2022-08-24 07:19:28.578622 | compute1 | {
2022-08-24 07:19:28.578672 | compute1 | "failed_when_result": false,
2022-08-24 07:19:28.578700 | compute1 | "msg": "[Errno 2] No such file or directory: b'command-not-found'",
2022-08-24 07:19:28.578726 | compute1 | "rc": 2
2022-08-24 07:19:28.578750 | compute1 | }
i.e. it has no start/stop/delta in the result (it did run and fail, so
you'd think it might ... but this is what Ansible gives us).
This change checks for that case; as mentioned, the output will now
look like the above when the timing information is missing.
This was found by the prior change
I9f569a411729f8a067de17d99ef6b9d74fc21543. This fixes the current
warning, so we invert the test to prevent further regressions.
Change-Id: I106b2bbe626ed5af8ca739d354ba41eca2f08f77
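A small sketch of the fallback, assuming a plain result dict (this is
illustrative, not the actual zuul_stream code):
    def format_timing(result):
        if all(k in result for k in ("start", "end", "delta")):
            return "Runtime: {delta} Start: {start} End: {end}".format(**result)
        # Some failures (such as the command binary not being found)
        # carry no timing information at all; fall back to the message.
        return result.get("msg", "")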
The Ansible version is sometimes used for selecting the correct linter
or for implementing feature switches to make roles/playbooks backward
compatible.
With the split of Ansible into an "ansible" and "ansible-core" package,
the `ansible_version` now contains the version of the core package.
There seems to be no other variable that contains the version of the
"Ansible community" package that Zuul is using.
To support this use case for Ansible 5+, we add the Ansible version to
the job's Zuul vars.
Change-Id: I3f3a3237b8649770a9b7ff488e501a97b646a4c4
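For illustration, a feature switch of the sort this enables might look
like the following; the exact key under the job's Zuul vars is an
assumption here, not a documented name:
    def pick_lint_rules(zuul_vars):
        # Assume the community Ansible version is exposed as a plain
        # "X.Y.Z" string in the job's Zuul vars (key name hypothetical).
        version = zuul_vars.get("ansible_version", "0")
        major = int(version.split(".")[0] or 0)
        return "rules-ansible5" if major >= 5 else "rules-legacy"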
With the default "linear" strategy (and likely others), Ansible will
send the on_task_start callback, and then fork a worker process to
execute that task. Since we spawn a thread in the on_task_start
callback, we can end up emitting a log message in this method while
Ansible is forking. If a forked process inherits a Python file object
(i.e., stdout) that is locked by a thread that doesn't exist in the
fork (i.e., this one), it can deadlock when trying to flush the file
object. To minimize the chances of that happening, we should avoid
using _display outside the main thread.
The Python logging module is supposed to use internal locks which are
automatically acquired and released across a fork. Assuming this is
(still) true and functioning correctly, we should be okay to issue
our Python logging module calls at any time. If there is a fault
in this system, however, it could potentially cause a similar
problem.
If we can convince the Ansible maintainers to lock _display across
forks, we may be able to revert this change in the future.
Change-Id: Ifc6b835c151539e6209284728ccad467bef8be6f
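A sketch of the idea, assuming a callback that holds an Ansible Display
object; the routing logic here is illustrative, not the real plugin:
    import logging
    import threading

    log = logging.getLogger("zuul.ansible.callback")

    def safe_emit(display, message):
        if threading.current_thread() is threading.main_thread():
            # In the main thread _display is safe to use directly.
            display.display(message)
        else:
            # Outside the main thread, avoid _display: a fork taken
            # while this thread holds the underlying file lock could
            # deadlock the child when it flushes the file object.
            log.info(message)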
This adds a config-error pipeline reporter configuration option and
now also reports config errors and merge conflicts to the database
as buildset failures.
The driving use case is that if periodic pipelines encounter config
errors (such as being unable to freeze a job graph), they might send
email if configured to send email on merge conflicts, but otherwise
their results are not reported to the database.
To make this more visible, first we need Zuul pipelines to report
buildset ends to the database in more cases -- currently we typically
only report a buildset end if there are jobs (and so a buildset start),
or in some other special cases. This change adds config errors and
merge conflicts to the set of cases where we report a buildset end.
Because of some shortcuts previously taken, that would end up reporting
a merge conflict message to the database instead of the actual error
message. To resolve this, we add a new config-error reporter action
and adjust the config error reporter handling path to use it instead
of the merge-conflicts action.
Tests of this as well as the merge-conflicts code path are added.
Finally, a small debug aid is added to the GerritReporter so that we
can easily see in the logs which reporter action was used.
Change-Id: I805c26a88675bf15ae9d0d6c8999b178185e4f1f
The report message "This change depends on a change that failed to
merge" (and a similar change for circular dependency bundles) is
famously vague. To help users identify the actual problem, include
URLs for which change(s) caused the problem so that users may more
easily resolve the issue.
Change-Id: Id8b9f8cf2c108703e9209e30bdc9a3933f074652
To avoid issues with outdated Github access tokens in the Git config,
we only update the remote URL on the repo object after the config
update has succeeded.
This also adds a missing repo lock when building the repo state.
Change-Id: I8e1b5b26f03cb75727d2b2e3c9310214a3eac447
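The ordering amounts to something like the following sketch, with
GitPython used for illustration and write_git_config standing in for a
hypothetical helper:
    def update_remote_url(repo, new_url, write_git_config):
        # Write the refreshed token to the Git config first; only if
        # that succeeds do we update the remote URL on the repo object,
        # so a failed update never leaves a stale URL behind.
        write_git_config(new_url)
        repo.remotes.origin.set_url(new_url)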
Change-Id: I12e8a056a2e5cd1bb18c1f24ecd7db55405f0a8c
Sometimes Gerrit events may arrive in batches (for example, an
automated process modifies several related changes nearly
simultaneously). Because of our inbuilt delay (10 seconds by
default), it's possible that in these cases, many or all of the
updates represented by these events will have settled on the Gerrit
server before we even start processing the first event. In these
cases, we don't need to query the same changes multiple times.
Take for example a stack of 10 changes. Someone approves all 10
simultaneously. That would produce (at least) 10 events for Zuul
to process. Each event would cause Zuul to query all 10 changes in
the series (since they are related). That's 100 change queries
(and each change query requires 2 or 3 HTTP queries).
But if we know that all the events arrived before our first set of
change queries, we can reduce that set of 100 queries to 10 by
suppressing any queries after the first.
This change generates a logical timestamp (ltime) immediately
before querying Gerrit for a change, and stores that ltime in the
change cache. Whenever an event arrives for processing with an
ltime later than the query ltime, we assume the change is already
up to date with that event and do not perform another query.
Change-Id: Ib1b9245cc84ab3f5a0624697f4e3fc73bc8e03bd
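A condensed sketch of the idea; the names are illustrative and the real
change cache API differs:
    def get_change(cache, connection, change_key, event_ltime):
        entry = cache.get(change_key)
        if entry is not None and event_ltime <= entry.query_ltime:
            # The cached copy was queried after this event happened, so
            # it already reflects the event; no further query is needed.
            return entry.change
        query_ltime = connection.getLtime()  # taken just before the query
        change = connection.queryChange(change_key)
        cache.set(change_key, change, query_ltime)
        return change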
In zuul_stream.py:v2_playbook_on_task_start() it checks for
"task.loop" and exits if the task is part of a loop.
However the library/command.py override still writes out the console
log despite it never being read. To avoid leaving this file around,
mark a sentinel uuid in the action plugin if the command is part of a
loop. In that case, for simplicity we just write to /dev/null -- that
way no other assumptions in the library/command.py have to change; it
just doesn't leave a file on disk.
This is currently difficult to test, as the zuul_console on the
infrastructure executors leaves /tmp/console-* files behind and we do
not know which of those come from production and which from testing.
After this and the related change
I823156dc2bcae91bd6d9770bd1520aa55ad875b4 are deployed to the
infrastructure executors, we can add a simple and complete test for
the future by just ensuring no /tmp/console-* files are left behind
after testing. I have tested this locally and no longer see files from
loops, which I was seeing before this change.
Change-Id: I4f4660c3c0b0f170561c14940cc159dc43eadc79
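A tiny sketch of the effect; the path and parameter names are made up:
    import os

    def console_log_path(log_uuid, in_loop):
        # (illustrative helper, not the real library/command.py code)
        if in_loop:
            # Loop iterations are never streamed by zuul_stream, so
            # write the output nowhere rather than leaving an unread
            # temporary file on disk.
            return os.devnull
        return "/tmp/console-{}.log".format(log_uuid)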
An executor can have a zone configured and at the same time allow
unzoned jobs. In this case the executor was not counted for the zoned
executor metric (online/accepting).
Change-Id: Ib39947e3403d828b595cf2479e64789e049e63cc
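The accounting fix amounts to something like this sketch; the counter
names are invented:
    def count_executor(stats, zone, allow_unzoned):
        if zone:
            # A zoned executor counts towards its zone even when it
            # also accepts unzoned jobs.
            stats["zoned_online"] = stats.get("zoned_online", 0) + 1
        if allow_unzoned or not zone:
            stats["unzoned_online"] = stats.get("unzoned_online", 0) + 1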
In Ief366c092e05fb88351782f6d9cd280bfae96237 I missed that this runs
in the context of the remote node; meaning that it must support all
the Python versions that might run there. f-strings are not 3.5
compatible.
I'm thinking about how to lint this better (a syntax check run?)
Change-Id: Ia4133b061800791196cd631f2e6836cb77347664
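For example, remote-node code has to stay off f-strings so it still
runs on the oldest managed-node interpreters (the function itself is
illustrative):
    def console_path(log_uuid):
        # f"/tmp/console-{log_uuid}.log" is valid only on Python 3.6+;
        # str.format() also works on the Python 2.7 nodes Ansible manages.
        return "/tmp/console-{}.log".format(log_uuid)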
When using protocol version 1, send a finalise message when streaming
is complete so that the zuul_console daemon can delete the temporary
file.
We test this by inspecting the Ansible console output, which logs a
message with the UUID of the streaming job. We dump the temporary
files on the remote side and make sure a console file for that job
isn't present.
Change-Id: I823156dc2bcae91bd6d9770bd1520aa55ad875b4
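Purely as an illustration of the flow; the real message format is
defined by zuul_console and is not reproduced here:
    def finish_streaming(sock, log_uuid):
        # (hypothetical message format) With protocol version 1, tell
        # the console daemon we are done so it can remove the temporary
        # log file for this command.
        sock.sendall("finalise:{}\n".format(log_uuid).encode("utf-8"))
        sock.close()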
A refresher on how this works, to the best of my knowledge:
1. Firstly, Zuul's Ansible has a library task "zuul_console:" which is
   run against the remote node; this forks a console daemon, listening
   on a default port.
2. We have an action plugin that runs for each task and, if that task
   is a command/shell task, assigns it a unique id.
3. We then override library/command.py (which backs command/shell
   tasks) with a version that forks and runs the process on the target
   node as usual, but also saves the stdout/stderr output to a
   temporary file named after the unique uuid from the step above.
4. At the same time we have the callback plugin zuul_stream.py, which
   Ansible calls as it moves through starting, running and finishing
   the tasks. This looks at the task and, if it has a UUID [2], sends
   a request to the zuul_console [1], which opens the temporary file
   [3] and starts streaming it back.
5. We loop reading this until the connection is closed by [1],
   eventually outputting each line.
In this way, the console log is effectively streamed and saved into
our job output.
We have established that the console [1] may be updated asynchronously
from the command/streaming side [3, 4], in situations such as static
nodes. This poses a problem if we ever want to update either part --
for example, we cannot change the file name that command.py logs to,
because an old zuul_console: will not know to open the new file. You
could imagine other fancy things you might like to do, e.g.
negotiating compression, that would have similar issues.
To provide the flexibility for these types of changes, implement a
simple protocol where the zuul_stream and zuul_console sides exchange
their respective version numbers before sending the log files. This
way they can both decide what operations are compatible both ways.
Luckily the extant protocol, which is really just sending a plain
uuid, can be adapted to this. When an old zuul_console server gets
the protocol request it will just look like an invalid log file, which
zuul_stream can handle and thus assume the remote end doesn't know
about protocols.
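Purely as an illustrative sketch of such a version exchange (not the
actual wire format used by zuul_console and zuul_stream):
    def negotiate_version(sock, my_version):
        # (hypothetical framing) announce our protocol version first
        sock.sendall("protocol:{}\n".format(my_version).encode("utf-8"))
        reply = sock.recv(64).decode("utf-8", "replace").strip()
        if not reply.startswith("protocol:"):
            # An old console daemon treats the request as an unknown
            # log file; fall back to the plain-uuid behaviour.
            return 0
        return min(my_version, int(reply.split(":", 1)[1]))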
This bumps the testing timeout; it seems that the extra calls make for
random failures. The failures are random and not in the same place;
I've run this separately in 850719 several times and not seen any
problems with the higher timeout. This test already has a slightly
higher settle timeout, so I think it must have just been on the edge.
Change-Id: Ief366c092e05fb88351782f6d9cd280bfae96237
Adds the pipeline of the change to the subject format. This makes it
easier to include information about the pipeline (e.g. its name) in
the e-mail subject.
Change-Id: I6ec973635543b4404c125589f23ffd1ba5504c17
This fixes pep8 E275 which wants whitespace after assert and del.
Change-Id: I1f8659f462aa91c3fdf8f7eb8b939b67c0ce9f55
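For reference, E275 flags a missing space after a keyword, e.g. (the
function is a made-up example):
    def check(value):
        # "assert(value is not None)" triggers E275; the spaced forms
        # below do not.
        assert value is not None
        del value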
So that operators can see in aggregate how long merge, files-changes,
and repo-state merge operations take in certain pipelines, add
metrics for the merge operations themselves (these exclude the
overhead of pipeline processing and job dispatching).
Change-Id: I8a707b8453c7c9559d22c627292741972c47c7d7
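The shape of such a metric is roughly the following sketch, using the
statsd timing call with an invented stat name:
    import time

    def timed_merge(statsd, merger, items):
        # (illustrative wrapper; not the real merger code path)
        start = time.monotonic()
        result = merger.merge(items)
        elapsed_ms = (time.monotonic() - start) * 1000
        statsd.timing("zuul.merger.merge_op", elapsed_ms)
        return result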
The pipeline change cache is used to avoid repeated cache lookups
for change dependencies, but is not as robust as the "real" source
change cache at managing the lifetime of change objects. If it is
used to store some change objects in one scheduler which are later
dequeued by a second scheduler, the next time those changes show
up in that pipeline on the first scheduler it may use old change
objects instead of new ones from the ZooKeeper cache.
To fix this, clear the pipeline change cache before we refresh
the pipeline state. This will cause some extra ZK ChangeCache
hits to repopulate the cache, but only in the case of commit
dependencies (the pipeline change cache isn't used for git
dependencies or the changes in the pipeline themselves unless they
are commit dependencies).
Change-Id: I0a20dc972917440d4f3e8bb59295b77c13913a48
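A sketch of the ordering only; the method names are hypothetical:
    def refresh_pipeline_state(manager, context):
        # Drop the per-pipeline change cache first so changes dequeued
        # by another scheduler are re-fetched from the ZooKeeper change
        # cache rather than reusing stale local objects.
        manager.clearPipelineChangeCache()
        manager.pipeline.state.refresh(context)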