Running jobs
============

This chapter contains tests that verify that WEBAPP schedules jobs,
accepts job output, and lets the admin kill running jobs.

Run a job successfully
----------------------

To start with, when the run-queue is empty, nothing should be scheduled.

    SCENARIO run a job
    GIVEN a new git repository in CONFGIT
    AND an empty lorry-controller.conf in CONFGIT
    AND lorry-controller.conf in CONFGIT adds lorries *.lorry using prefix upstream
    AND WEBAPP uses CONFGIT as its configuration directory
    AND a running WEBAPP

We stop the queue first.

    WHEN admin makes request POST /1.0/stop-queue

Then make sure we don't get a job when we request one.

    WHEN admin makes request POST /1.0/give-me-job with host=testhost&pid=123
    THEN response has job_id set to null
    
    WHEN admin makes request GET /1.0/list-running-jobs
    THEN response has running_jobs set to []

Add a Lorry spec to the run-queue, and request a job. We still
shouldn't get a job, since the queue isn't set to run yet.

    GIVEN Lorry file CONFGIT/foo.lorry with {"foo":{"type":"git","url":"git://foo"}}

    WHEN admin makes request POST /1.0/read-configuration
    AND admin makes request POST /1.0/give-me-job with host=testhost&pid=123
    THEN response has job_id set to null

Enable the queue, and off we go.

    WHEN admin makes request POST /1.0/start-queue
    AND admin makes request POST /1.0/give-me-job with host=testhost&pid=123
    THEN response has job_id set to 1
    AND response has path set to "upstream/foo"

    WHEN admin makes request GET /1.0/lorry/upstream/foo
    THEN response has running_job set to 1
    
    WHEN admin makes request GET /1.0/list-running-jobs
    THEN response has running_jobs set to [1]

Requesting another job should now again return null.

    WHEN admin makes request POST /1.0/give-me-job with host=testhost&pid=123
    THEN response has job_id set to null

Inform WEBAPP the job is finished.

    WHEN MINION makes request POST /1.0/job-update with job_id=1&exit=0
    THEN response has kill_job set to false
    WHEN admin makes request GET /1.0/lorry/upstream/foo
    THEN response has running_job set to null
    WHEN admin makes request GET /1.0/list-running-jobs
    THEN response has running_jobs set to []

Cleanup.

    FINALLY WEBAPP terminates
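
The scheduling behaviour checked by this scenario can be modelled in a
few lines of Python. This is only a sketch, not WEBAPP's actual
implementation: the `ToyQueue` class, its attributes, and its method
names are hypothetical, and treating the queue as running by default
is an assumption (it matches the later scenarios, which get jobs
without calling `start-queue`). Only the response fields `job_id`,
`path`, and `kill_job` come from the scenario itself.

```python
class ToyQueue:
    """Hypothetical in-memory stand-in for WEBAPP's run-queue."""

    def __init__(self):
        self.running_queue = True  # assumption: the queue runs unless stopped
        self.pending = []          # Lorry paths waiting to be run
        self.running = {}          # job_id -> Lorry path
        self.next_job_id = 1

    def give_me_job(self):
        # No job is handed out while the queue is stopped or has no work.
        if not self.running_queue or not self.pending:
            return {"job_id": None}
        job_id = self.next_job_id
        self.next_job_id += 1
        self.running[job_id] = self.pending.pop(0)
        return {"job_id": job_id, "path": self.running[job_id]}

    def job_update(self, job_id, exit_code):
        # exit_code is None while the job still runs (exit=no in the
        # scenario); any other value means the job has finished.
        if exit_code is not None:
            self.running.pop(job_id, None)
        return {"kill_job": False}
```

With the queue stopped, `give_me_job()` returns a null `job_id` even
when a Lorry is pending; once the queue is started, the Lorry is
handed out exactly once.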


Limit number of jobs running at the same time
---------------------------------------------

WEBAPP can be told to limit the number of jobs running at the same
time.

Set things up. Note that we have two local Lorry files, so that we
could, in principle, run two jobs at the same time.

    SCENARIO limit concurrent jobs
    GIVEN a new git repository in CONFGIT
    AND an empty lorry-controller.conf in CONFGIT
    AND lorry-controller.conf in CONFGIT adds lorries *.lorry using prefix upstream
    AND Lorry file CONFGIT/foo.lorry with {"foo":{"type":"git","url":"git://foo"}}
    AND Lorry file CONFGIT/bar.lorry with {"bar":{"type":"git","url":"git://bar"}}
    AND WEBAPP uses CONFGIT as its configuration directory
    AND a running WEBAPP
    WHEN admin makes request POST /1.0/read-configuration

Check the current value of the `max_jobs` setting.

    WHEN admin makes request GET /1.0/get-max-jobs
    THEN response has max_jobs set to null

Set the limit to 1.

    WHEN admin makes request POST /1.0/set-max-jobs with max_jobs=1
    THEN response has max_jobs set to 1
    WHEN admin makes request GET /1.0/get-max-jobs
    THEN response has max_jobs set to 1

Get a job. This should succeed.

    WHEN MINION makes request POST /1.0/give-me-job with host=testhost&pid=1
    THEN response has job_id set to 1

Get a second job. This should not succeed.

    WHEN MINION makes request POST /1.0/give-me-job with host=testhost&pid=2
    THEN response has job_id set to null

Finish the first job. Then get a new job. This should succeed.

    WHEN MINION makes request POST /1.0/job-update with job_id=1&exit=0
    AND MINION makes request POST /1.0/give-me-job with host=testhost&pid=2
    THEN response has job_id set to 2

Cleanup.

    FINALLY WEBAPP terminates
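
The limit check exercised above amounts to comparing the number of
running jobs against `max_jobs`. A minimal sketch in Python follows;
the function name `may_start_job` is hypothetical, and reading a null
`max_jobs` as "no limit" is an assumption based on the initial value
the scenario observes.

```python
def may_start_job(max_jobs, running_job_ids):
    """Return True if one more job may start under the max_jobs limit.

    A max_jobs of None is treated as "no limit" (an assumption).
    """
    if max_jobs is None:
        return True
    return len(running_job_ids) < max_jobs
```

With `max_jobs=1`, the first request is allowed because nothing is
running, the second is refused while job 1 runs, and a new job is
allowed again once job 1 has finished.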

Stop job in the middle
----------------------

We need to be able to stop jobs while they're running as well. We
start by setting up everything so that a job is running, the same way
we did for the successful job scenario.

    SCENARIO stop a job while it's running
    GIVEN a new git repository in CONFGIT
    AND an empty lorry-controller.conf in CONFGIT
    AND lorry-controller.conf in CONFGIT adds lorries *.lorry using prefix upstream
    AND WEBAPP uses CONFGIT as its configuration directory
    AND a running WEBAPP
    AND Lorry file CONFGIT/foo.lorry with {"foo":{"type":"git","url":"git://foo"}}
    WHEN admin makes request POST /1.0/read-configuration
    AND admin makes request POST /1.0/start-queue
    AND admin makes request POST /1.0/give-me-job with host=testhost&pid=123
    THEN response has job_id set to 1
    AND response has path set to "upstream/foo"

Admin will now ask WEBAPP to kill the job. This only sets a field
in the STATEDB.

    WHEN admin makes request POST /1.0/stop-job with job_id=1
    AND admin makes request GET /1.0/lorry/upstream/foo
    THEN response has kill_job set to true

Now, when MINION updates the job, WEBAPP will tell it to kill the
job. MINION will do so, and then update the job again.

    WHEN MINION makes request POST /1.0/job-update with job_id=1&exit=no
    THEN response has kill_job set to true
    WHEN MINION makes request POST /1.0/job-update with job_id=1&exit=1

Admin will now see that the job has, indeed, been killed.

    WHEN admin makes request GET /1.0/lorry/upstream/foo
    THEN response has running_job set to null

    WHEN admin makes request GET /1.0/list-running-jobs
    THEN response has running_jobs set to []

Cleanup.

    FINALLY WEBAPP terminates
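
Seen from the MINION side, the exchange in this scenario could look
like the following Python sketch. It is not MINION's real code: the
`supervise` function and the three callables it takes (`post`,
`poll_exit`, `kill`) are hypothetical. Only the `/1.0/job-update`
path, the `job_id` and `exit` fields, and the `kill_job` response
field are taken from the scenario.

```python
def supervise(job_id, post, poll_exit, kill):
    """Report on a job until it ends, killing it if WEBAPP asks.

    post(path, fields) sends a request and returns the decoded response.
    poll_exit() returns the exit code, or None while the job still runs.
    kill() terminates the job and returns its exit code.
    All three are hypothetical callables supplied by the caller.
    """
    while True:
        exit_code = poll_exit()
        if exit_code is not None:
            # The job ended on its own: send the final update.
            post("/1.0/job-update", {"job_id": job_id, "exit": exit_code})
            return exit_code
        # Still running: report that, and check whether to kill the job.
        response = post("/1.0/job-update", {"job_id": job_id, "exit": "no"})
        if response["kill_job"]:
            exit_code = kill()
            post("/1.0/job-update", {"job_id": job_id, "exit": exit_code})
            return exit_code
        # A real loop would sleep here between updates.
```

The design point is that WEBAPP never kills anything itself: it only
answers `kill_job`, and the process reporting on the job acts on that
answer and then reports the resulting exit code.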

Stop a job that runs too long
-----------------------------

Sometimes a job gets "stuck" and should be killed. The
`lorry-controller.conf` has an optional `lorry-timeout` field for
this, to set the timeout, and WEBAPP will tell MINION to kill a job
when it has been running too long.

Some setup. Set the `lorry-timeout` to a known value. It doesn't
matter what it is since we'll be telling WEBAPP to fake its sense of
time, so that the test suite is not timing sensitive. We wouldn't want
to have the test suite fail when running on slow devices.

    SCENARIO stop stuck job
    GIVEN a new git repository in CONFGIT
    AND an empty lorry-controller.conf in CONFGIT
    AND lorry-controller.conf in CONFGIT adds lorries *.lorry using prefix upstream
    AND lorry-controller.conf in CONFGIT has lorry-timeout set to 1 for everything
    AND Lorry file CONFGIT/foo.lorry with {"foo":{"type":"git","url":"git://foo"}}
    AND WEBAPP uses CONFGIT as its configuration directory
    AND a running WEBAPP
    WHEN admin makes request POST /1.0/read-configuration

Pretend it is the start of time.

    WHEN admin makes request POST /1.0/pretend-time with now=0
    AND admin makes request GET /1.0/status
    THEN response has timestamp set to "1970-01-01 00:00:00 UTC"

Start the job.

    WHEN admin makes request POST /1.0/give-me-job with host=testhost&pid=123
    THEN response has job_id set to 1

Check that the job info contains a start time.

    WHEN admin makes request GET /1.0/job/1
    THEN response has job_started set

Pretend it is now much later, or at least later than the timeout specified.

    WHEN admin makes request POST /1.0/pretend-time with now=2

Pretend to be a MINION that reports an update on the job. WEBAPP
should now be telling us to kill the job.

    WHEN MINION makes request POST /1.0/job-update with job_id=1&exit=no
    THEN response has kill_job set to true

Cleanup.

    FINALLY WEBAPP terminates
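
The idea of faking WEBAPP's sense of time can be sketched in Python as
a clock object that returns a pretended value when one has been set.
The names `Clock`, `is_stuck`, and `format_timestamp` are hypothetical,
and the strict `>` comparison against the timeout is an assumption;
the scenario only shows that a job started at time 0 with a timeout of
1 is considered stuck at time 2.

```python
import time


class Clock:
    """Hypothetical clock whose notion of "now" can be overridden."""

    def __init__(self):
        self.pretended = None

    def pretend(self, now):
        self.pretended = now

    def now(self):
        # Use the real time unless a pretended time has been set.
        return time.time() if self.pretended is None else self.pretended


def is_stuck(job_started, now, lorry_timeout):
    # Assumption: a job is stuck once it has run longer than the timeout.
    return now - job_started > lorry_timeout


def format_timestamp(now):
    # Matches the timestamp format seen in the status response above.
    return time.strftime("%Y-%m-%d %H:%M:%S UTC", time.gmtime(now))
```

Because the test controls the clock, the outcome does not depend on
how fast the machine running the test suite is.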

Remove a terminated job
-----------------------

WEBAPP doesn't remove jobs automatically; it needs to be told to
remove them.

    SCENARIO remove job
    
Setup.

    GIVEN a new git repository in CONFGIT
    AND an empty lorry-controller.conf in CONFGIT
    AND lorry-controller.conf in CONFGIT adds lorries *.lorry using prefix upstream
    AND WEBAPP uses CONFGIT as its configuration directory
    AND a running WEBAPP
    GIVEN Lorry file CONFGIT/foo.lorry with {"foo":{"type":"git","url":"git://foo"}}
    WHEN admin makes request POST /1.0/read-configuration

Start job 1.

    WHEN admin makes request POST /1.0/give-me-job with host=testhost&pid=123
    THEN response has job_id set to 1

Try to remove job 1 while it is running. This should fail.

    WHEN admin makes request POST /1.0/remove-job with job_id=1
    THEN response has reason set to "still running"

Finish the job.

    WHEN MINION makes request POST /1.0/job-update with job_id=1&exit=0
    WHEN admin makes request GET /1.0/list-jobs
    THEN response has job_ids set to [1]

Remove it.

    WHEN admin makes request POST /1.0/remove-job with job_id=1
    AND admin makes request GET /1.0/list-jobs
    THEN response has job_ids set to []

Cleanup.

    FINALLY WEBAPP terminates
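
The rule this scenario checks, that a job can only be removed once it
has finished, can be sketched in Python as follows. The `remove_job`
function and the shape of the `jobs` mapping are hypothetical, and
returning `None` on success is an assumption; only the reason string
`"still running"` comes from the scenario.

```python
def remove_job(jobs, job_id):
    """Remove a finished job; refuse while it is still running.

    jobs maps job_id -> exit code, with None meaning "still running".
    Returns the refusal reason, or None if the job was removed.
    """
    if jobs[job_id] is None:
        return "still running"
    del jobs[job_id]
    return None
```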