author     James E. Blair <jim@acmegating.com>  2022-10-15 14:02:31 -0700
committer  James E. Blair <jim@acmegating.com>  2022-11-09 10:51:29 -0800
commit     3a981b89a844f9169abb47d0439c80538e5291c3 (patch)
tree       23b0c66c8ac977ce446669f59ba47f35376e5c36 /tests/unit/test_sos.py
parent     8a8502f661c55acb0ce1b8f9bfea9ff3e86f0be1 (diff)
download   zuul-3a981b89a844f9169abb47d0439c80538e5291c3.tar.gz
Parallelize some pipeline refresh ops
We may be able to speed up pipeline refreshes in cases where there are large numbers of items or jobs/builds by parallelizing ZK reads.

Quick refresher: the ZK protocol is async, and kazoo uses a queue to send operations to a single thread which manages IO. We typically call synchronous kazoo client methods which wait for the async result before returning. Since this is all thread-safe, we can attempt to fill the kazoo pipe by having multiple threads call the synchronous kazoo methods. If kazoo is waiting on IO for an earlier call, it will be able to start a later request simultaneously.

Quick aside: it would be difficult for us to use the async methods directly since our overall code structure is still ordered and effectively single threaded (we need to load a QueueItem before we can load the BuildSet and the Builds, etc).

Thus it makes the most sense for us to retain our ordering by using a ThreadPoolExecutor to run some operations in parallel. This change parallelizes loading QueueItems within a ChangeQueue, and also Builds/Jobs within a BuildSet. These are the points in a pipeline refresh tree which potentially have the largest number of children and could benefit the most from the change, especially if the ZK server has some measurable latency.

Change-Id: I0871cc05a2d13e4ddc4ac284bd67e5e3003200ad
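The parallelization idea described above can be illustrated with a minimal sketch, not taken from Zuul's implementation: it assumes an existing kazoo.client.KazooClient named `client` and a list of ZNode paths, and the helpers `read_item` and `read_items_in_parallel` are hypothetical names. Each worker thread blocks on a synchronous `client.get()` call, which lets kazoo's single IO thread keep multiple outstanding requests on the wire while the caller still receives results in order.

```python
# Sketch only: parallel synchronous ZK reads via a thread pool.
from concurrent.futures import ThreadPoolExecutor


def read_item(client, path):
    # Synchronous kazoo call; while this thread waits on the reply,
    # other threads can queue their own reads on the same connection.
    data, _stat = client.get(path)
    return path, data


def read_items_in_parallel(client, paths, max_workers=4):
    # Submit all reads up front so the kazoo IO thread always has work
    # queued, then collect results in the original order to preserve
    # the caller's effectively sequential logic.
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        futures = [executor.submit(read_item, client, p) for p in paths]
        return [f.result() for f in futures]
```

Keeping the callers synchronous and ordered (rather than switching to kazoo's async API) matches the constraint noted in the commit message: a QueueItem must be loaded before its BuildSet and Builds can be.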
Diffstat (limited to 'tests/unit/test_sos.py')
-rw-r--r--  tests/unit/test_sos.py  13
1 file changed, 8 insertions(+), 5 deletions(-)
diff --git a/tests/unit/test_sos.py b/tests/unit/test_sos.py
index f371a8064..37a47c6ae 100644
--- a/tests/unit/test_sos.py
+++ b/tests/unit/test_sos.py
@@ -163,8 +163,8 @@ class TestScaleOutScheduler(ZuulTestCase):
pipeline = tenant.layout.pipelines['check']
summary = zuul.model.PipelineSummary()
summary._set(pipeline=pipeline)
- context = self.createZKContext()
- summary.refresh(context)
+ with self.createZKContext() as context:
+ summary.refresh(context)
self.assertEqual(summary.status['change_queues'], [])
def test_config_priming(self):
@@ -322,7 +322,8 @@ class TestScaleOutScheduler(ZuulTestCase):
def new_summary():
summary = zuul.model.PipelineSummary()
summary._set(pipeline=pipeline)
- summary.refresh(context)
+ with context:
+ summary.refresh(context)
return summary
A = self.fake_gerrit.addFakeChange('org/project', 'master', 'A')
@@ -345,7 +346,8 @@ class TestScaleOutScheduler(ZuulTestCase):
self.assertTrue(context.client.exists(summary2.getPath()))
# Our earlier summary object should use its cached data
- summary1.refresh(context)
+ with context:
+ summary1.refresh(context)
self.assertNotEqual(summary1.status, {})
self.executor_server.hold_jobs_in_build = False
@@ -354,7 +356,8 @@ class TestScaleOutScheduler(ZuulTestCase):
# The scheduler should have written a new summary that our
# second object can read now.
- summary2.refresh(context)
+ with context:
+ summary2.refresh(context)
self.assertNotEqual(summary2.status, {})
@simple_layout('layouts/semaphore.yaml')