| Commit message | Author | Age | Files | Lines |
|
We no longer serialise entire artifacts, so the output of deserialise_artifact
is an ArtifactReference. This commit updates distbuild to handle an
ArtifactReference rather than an Artifact.
Change-Id: I79b40d041700a85c25980e3bd70cd34dedd2a113
|
The controller no longer needs to know everything about an artifact
as the workers can calculate the build graph themselves quickly.
This reduces the amount of data which needs to be serialised by
serialise-artifact, making the yaml dump quicker.
Change-Id: I6bd0bed14c2efb2f499e9d6f0a97e6188353121a
|
This is mostly to check that the 'cancel entire subprocess tree' change
works as expected. Revert that patch and the test fails.
This commit also includes some tweaks.
Change-Id: If297522e6589ebb3a07dac66a39eb243789e53aa
|
Currently, running a distbuild that fails before it gets as far as
building something leaves behind empty directories called build-00,
build-01, etc., which is annoying.
Change-Id: Id3466e248c327dedaf973bc2fe22d42e5c5570d4
|
We discovered a case where a user of distbuild began a build of
'qtbase', then cancelled it 2 minutes in. The `morph worker-build`
process didn't exit for over an hour -- it ran right through until the
chunk artifacts had been created. Then it exited with code -9 (SIGKILL).
This seems to be because SIGKILL only kills the process it is sent to,
not its subprocesses, so any file descriptors the subprocesses have open
will remain open.
If we set up the `morph worker-build` process as a process group
leader, using os.setpgid(), then we can use os.killpg() to kill the
entire process group. This should ensure that the `morph worker-build`
command exits straight away, as all of its subprocesses will be killed
at the same time it is.
Change-Id: I38707d18004d8c5bc994fd0cb99e90fd5def58e4
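A minimal sketch of the process-group approach described above, not the
actual distbuild code. The helper names spawn_in_new_group() and
kill_group() are hypothetical, and the shell command is only a stand-in
for `morph worker-build` with a long-running subprocess.

```python
import os
import signal
import subprocess
import time


def spawn_in_new_group(argv):
    # os.setpgid(0, 0) runs in the child just before exec, making the
    # child the leader of a new process group. Its subprocesses inherit
    # that group.
    return subprocess.Popen(argv, preexec_fn=lambda: os.setpgid(0, 0))


def kill_group(proc):
    # The child is the group leader, so its pid is also the group id.
    # os.killpg() signals every process in the group at once.
    os.killpg(proc.pid, signal.SIGKILL)
    return proc.wait()


if __name__ == '__main__':
    # Hypothetical stand-in for `morph worker-build`: a shell that starts
    # a long-running grandchild and then waits for it.
    proc = spawn_in_new_group(['sh', '-c', 'sleep 600 & wait'])
    time.sleep(0.5)
    print(kill_group(proc))
```

With a plain os.kill() on proc.pid only the shell would die, and the
`sleep` would keep running with any inherited file descriptors still open.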
|
Previously it was only available in the distbuild-helper program. Moving
it to its own module means we can test it and reuse it.
This commit also adds a docstring to the class.
Change-Id: Iaf7854048cf0ff463a87894f1f500cdcb6a34d8b
|
A log message was printing the 'remote name' of a socket that was
listening for connections. There isn't one, so the message always
shows this:
2015-04-14 17:05:19 INFO Binding socket to sam-jetson-mason
2015-04-14 17:05:19 INFO Listening at None
Print the local name instead:
2015-04-14 17:05:19 INFO Binding socket to sam-jetson-mason
2015-04-14 17:05:19 INFO Listening at 10.24.2.125:7878
Change-Id: I22c1bbe8c9f78ef63e587b6ace516afc861fae0f
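For illustration, here is how the local name of a listening socket can be
obtained with the standard socket module. This is a sketch, not the
distbuild code: describe_listener() is a hypothetical helper, and with
plain Python sockets asking a listening socket for its peer raises an
error rather than yielding None.

```python
import socket


def describe_listener(sock):
    # A listening socket has no peer, so there is no remote name to
    # report. getsockname() gives the local address it is bound to.
    host, port = sock.getsockname()[:2]
    return '%s:%d' % (host, port)


listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(('127.0.0.1', 0))
listener.listen(1)
print('Listening at %s' % describe_listener(listener))
```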
|
Add the InitiatorListJobs class and the list-jobs message template, add
distbuild-list-jobs to the morph commandlist, send running job
information back to the initiator, split the handling of build request
and list-jobs messages into separate functions, and use a UUID instead
of a random integer for message identification.
Change-Id: Id02604f2c1201dbc10f6bbd7f501b8ce1ce0deae
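A short sketch of UUID-based message identification. The function names
and the message layout here are hypothetical and only illustrate the
idea; they are not taken from the distbuild protocol.

```python
import uuid


def new_message_id():
    # A random UUID makes collisions between concurrent requests
    # vanishingly unlikely, unlike a small random integer.
    return uuid.uuid4().hex


def make_list_jobs_request():
    # Hypothetical message layout; the field names are illustrative only.
    return {'type': 'list-jobs', 'id': new_message_id()}


print(make_list_jobs_request())
```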
|
A JsonMachine object can be set to log all messages that it sends, so we
don't need to handle it in the WorkerConnection class as well.
Change-Id: Idfdc06953363a016708b5dda50c978eb93b1113c
|
Worker log files are overly verbose with this enabled. Each message is
dumped 6 times:
2015-03-19 11:00:11 DEBUG JsonMachine: Received: '"{...}\\n"\n'
2015-03-19 11:00:11 DEBUG JsonMachine: line: '"{...}\\n"'
2015-03-19 11:00:11 DEBUG JsonRouter: got msg: {...}
2015-03-19 11:00:11 DEBUG JsonMachine: Sending message {...}
2015-03-19 11:00:11 DEBUG JsonMachine: As '"{...}\\n"'
2015-03-19 11:00:11 DEBUG JsonRouter: sent to client: {...}
With this setting disabled, the message is only logged by the JsonRouter
class, so appears only twice:
2015-03-19 11:00:11 DEBUG JsonRouter: got msg: {...}
2015-03-19 11:00:11 DEBUG JsonRouter: sent to client: {...}
We've not seen any issues with message encoding/decoding recently so I
think it's safe to disable this debugging output by default.
Change-Id: I7d22ed29e81d6c594cb2c639abf3b40bfb27e3ad
|
When reading morph-controller.log, it's good to know which jobs are in
progress and which are queued.
Old output:
2015-04-09 10:40:58 DEBUG Current jobs:
['3f647933a1effbb128c857225ba77e9aa775d92314ef0acf3e58e084a7248c73.chunk.stage1-binutils-misc',
'd7279e4179a31d8a3a98c27d5b01ad1bb7387c7fab623fee1086ab68af2784bb.chunk.stage2-fhs-dirs-misc']
New output:
2015-04-09 10:40:58 DEBUG Current jobs:
['3f647933a1effbb128c857225ba77e9aa775d92314ef0acf3e58e084a7248c73.chunk.stage1-binutils-misc (given to worker1:3434)',
'd7279e4179a31d8a3a98c27d5b01ad1bb7387c7fab623fee1086ab68af2784bb.chunk.stage2-fhs-dirs-misc (given to worker2:3434)']
Change-Id: Ie89e6723b0da5f930813591a3166301fd3966804
|
A cancel during the 'graphing' or 'annotating' stages would be ignored
as the BuildController was listening for the InitiatorDisconnect message
from the wrong event source. In the 'building' state the actual build
would be stopped, but the BuildController instance would stick around
because the message class was sent instead of an instance of the message.
Change-Id: I222a8aa39bf7fffab4d89e12997ffd18cd1b54fc
|
In addition to partial builds we also want to be able to do partial
distbuilds, and distbuild uses a different codepath.
This commit updates the distbuild code to know what to do if a partial
build is requested. It only builds up to the latest chunk/stratum that
was requested, and displays where to find the artifacts for each of
the chunks/strata requested upon completion of the build.
The usage is the same as for local builds.
Change-Id: I0537f74e2e65c7aefe5e71795f17999e2415fce5
|
Change-Id: Ibda7a938cd16e35517a531140f39ef4664d85c72
|
Change-Id: I992dc0c1d40f563ade56a833162d409b02be90a0
|
Reviewed-By: Adam Coldrick <adam.coldrick@codethink.co.uk>
Reviewed-By: Richard Maw <richard.maw@codethink.co.uk>
|
This makes it easier to spot if an incomplete build was due to the user
cancelling, or if it represents a dropped connection or internal error.
|
This message was hundreds of kilobytes in size, as it contained a
recursive list of dependencies for each artifact in the build graph. It
was used in the initiator only to print this message:
Build steps in total: 592
This message is now gone. The 'Need to build %d artifacts'
build-progress message now indicates the total build steps instead:
Need to build 300 artifacts, of 592 total
This is a compatible change to the distbuild protocol: old initiators
will continue to work as normal with new controllers that don't send
the build-steps message.
|
It gets messy having hundreds of build-step-xx.log files in the current
directory, and if two builds are run in parallel from the same directory
the logs for a given chunk will be mixed together in one file.
Now, a new directory named build-0, build-1, build-2 etc is created for
each new build.
If the user passes --initiator-step-output-dir the logs will be placed
in that directory, instead. This behaviour is the same as before.
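One way to pick a fresh build-N directory safely, sketched under the
assumption that the initiator simply claims the first free name. The
create_build_dir() helper is hypothetical and is not the function used in
morph.

```python
import errno
import os
import tempfile


def create_build_dir(parent, prefix='build-'):
    # Try build-0, build-1, ... and claim the first name that is free.
    # os.mkdir() either creates the directory or fails with EEXIST, so
    # two builds started in the same parent cannot claim the same name.
    n = 0
    while True:
        path = os.path.join(parent, '%s%d' % (prefix, n))
        try:
            os.mkdir(path)
            return path
        except OSError as e:
            if e.errno != errno.EEXIST:
                raise
            n += 1


parent = tempfile.mkdtemp()
first = create_build_dir(parent)
second = create_build_dir(parent)
print(first)
print(second)
```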
|
Users build sources, not artifacts. So the log files should be called
build-step-systemd.log and not build-step-systemd-misc.log.
Note that strata are a special case, so you will still see
build-step-foundation-runtime.log, build-step-foundation-devel.log etc.
|
For a while we have seen an issue where output from build A would end up
in the log file of some other random chunk.
The problem turns out to be that the WorkerConnection class in the
controller-daemon assumes cancellation is instantaneous. If a build was
cancelled, the WorkerConnection would send a cancel message for the job
it was running, and then start a new job. However, the worker-daemon
process would have a backlog of exec-output messages and a delayed
exec-response message from the old job. The controller would receive
these and would assume that they were for the new job, without checking
the job ID in the messages. Thus they would be sent to the wrong log
file.
To fix this, the WorkerConnection class now tracks jobs by job ID, and
the code should be generally more robust when unexpected messages are
received.
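The idea of tracking jobs by ID can be sketched as follows. JobTracker
and the message fields 'id' and 'stdout' are hypothetical names chosen
for illustration; the real WorkerConnection class is more involved.

```python
class JobTracker(object):
    def __init__(self):
        # Maps job id -> list of output chunks received for that job.
        self._output = {}

    def start(self, job_id):
        self._output[job_id] = []

    def cancel(self, job_id):
        # Forget the job immediately, even though the worker may still
        # have a backlog of messages for it.
        self._output.pop(job_id, None)

    def handle_exec_output(self, msg):
        # Route output by the job id in the message, instead of assuming
        # it belongs to whichever job is currently running.
        job_id = msg.get('id')
        if job_id not in self._output:
            return False  # stale or unexpected message: drop it
        self._output[job_id].append(msg['stdout'])
        return True

    def output_for(self, job_id):
        return ''.join(self._output[job_id])


tracker = JobTracker()
tracker.start('job-a')
tracker.cancel('job-a')
tracker.start('job-b')
# A delayed message from the cancelled job no longer lands in job-b's log.
print(tracker.handle_exec_output({'id': 'job-a', 'stdout': 'late output'}))
```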
|
Reviewed-By: Richard Maw <richard.maw@codethink.co.uk>
Reviewed-By: Francisco Redondo Marchena <francisco.marchena@codethink.co.uk>
Reviewed-By: Mike Smith <mike.smith@codethink.co.uk>
|
The logic to handle a worker disconnecting was broken. The
WorkerConnection object would remove itself from the main loop as soon
as the worker disconnected. But it would not get removed from the list
of available workers that the WorkerBuildQueue maintains. So the
controller would continue sending messages to this dead connection, and
the builds it sent would hang forever for a response.
|
'lauren/baserock/lauren/distbuild-invalid-input-crash'
Reviewed-By: Richard Maw <richard.maw@codethink.co.uk>
Reviewed-By: Sam Thursfield <sam.thursfield@codethink.co.uk>
|
Let the end-user see the URL that distbuild was attempting to talk to,
so they can more easily spot configuration errors. It's kind of silly
to say 'HTTP request failed' without saying where the request was going.
|
The previous error looked like this by the time it had reached the
initiator's console:
ERROR: Failed to build baserock:baserock/definitions
c7292b7c81cdd7e5b9e85722406371748453c44f
systems/base-system-x86_64-generic.morph.frodsham: Failed to compute
build graph. Problem with serialise-artifact: ERROR: Couldn't find
morphology: systems/base-system-x86_64-generic.morph.frodsham
The new message is at least a bit simpler:
ERROR: Failed to build baserock:baserock/definitions
c7292b7c81cdd7e5b9e85722406371748453c44f
systems/base-system-x86_64-generic.morph.frodsham: ERROR: Couldn't
find morphology: systems/base-system-x86_64-generic.morph.frodsham
|
If there was no distbuild-helper process running on the controller then
the controller would hang forever. This situation is unlikely, but it's
important to give the user feedback instead of silently hanging forever.
|
There's no need to handle failure differently at each stage of the
build. Simpler to use the BuildFailed message for all errors. This
then allows us to have a single self.fail() function that can be used
everywhere.
|
Knowing which worker built something is useful for debugging, and right
now that information is only present on the initiator's console. It's
good to have it in the build-step-xx.log file too so the information
doesn't get lost.
|
The recent changes to the BuildCommand.build() function caused distbuild
to break, because I didn't make the same change to the
InitiatorBuildCommand.build() function but did change how it was called.
This commit adds the ability to have optional fields in distbuild
messages. This is used to add an optional 'original_ref' field, which
will get passed to `morph serialise-artifact` by new distbuild
controllers, and will be ignored by older ones.
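A sketch of how optional message fields might be modelled. Only the
'original_ref' field name comes from this commit; the tables, the
make_message() helper and the required field names are hypothetical.

```python
# Hypothetical field tables; only 'original_ref' is named in the commit.
REQUIRED_FIELDS = {'build-request': ['id', 'repo', 'ref']}
OPTIONAL_FIELDS = {'build-request': ['original_ref']}


def make_message(message_type, **fields):
    required = REQUIRED_FIELDS[message_type]
    optional = OPTIONAL_FIELDS.get(message_type, [])
    # Required fields must all be present.
    missing = [name for name in required if name not in fields]
    if missing:
        raise ValueError('missing fields: %s' % ', '.join(missing))
    # Anything else must be a known optional field.
    unknown = [name for name in sorted(fields)
               if name not in required and name not in optional]
    if unknown:
        raise ValueError('unknown fields: %s' % ', '.join(unknown))
    msg = {'type': message_type}
    msg.update(fields)
    return msg


old_style = make_message('build-request', id='1', repo='r', ref='master')
new_style = make_message('build-request', id='2', repo='r', ref='abc123',
                         original_ref='master')
# A receiver reads the optional field with a default, so messages that
# lack it still work.
print(old_style.get('original_ref'), new_style.get('original_ref'))
```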
|
JSON can only handle unicode strings, but commands can write anything to
stdout/stderr, so we use the same trick as for serialisation, and
JSON-encode a YAML dump.
|
The horrible json.dumps of a yaml dump is there because we need it to be
both binary safe (which yaml gives us) and one line per message (which
json gives us).
|
You can bind to an ephemeral port by passing 0 as the port number.
To work out which port you actually got, you need to call getsockname().
To facilitate spawning multiple copies of the daemons for testing
environments, you can pass a -file option, which will make the daemon
write out which port it actually bound to.
If this path is a fifo, the spawner process can read from it to
synchronise start-up, so that services which require that port are only
spawned once it is ready.
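A minimal sketch of the ephemeral-port technique using the standard
socket module. bind_ephemeral() and write_port_file() are hypothetical
helper names, and the one-line port file format is an assumption.

```python
import os
import socket
import tempfile


def bind_ephemeral(host='127.0.0.1'):
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.bind((host, 0))  # port 0 asks the kernel for any free port
    sock.listen(5)
    # getsockname() reports the port that was actually assigned.
    return sock, sock.getsockname()[1]


def write_port_file(path, port):
    # Assumed format: the port number on a single line. If path is a
    # fifo, open() blocks until a reader opens the other end.
    with open(path, 'w') as f:
        f.write('%d\n' % port)


sock, port = bind_ephemeral()
port_file = os.path.join(tempfile.mkdtemp(), 'port')
write_port_file(port_file, port)
print('bound to port %d' % port)
```

Writing the file only after listen() means that a spawner which waits for
the port file knows the socket is already accepting connections.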