| Commit message (Collapse) | Author | Age | Files | Lines |
|
The controller no longer needs to know everything about an artifact
as the workers can calculate the build graph themselves quickly.
This reduces the amount of data which needs to be serialised by
serialise-artifact, making the yaml dump quicker.
Change-Id: I6bd0bed14c2efb2f499e9d6f0a97e6188353121a
|
Change-Id: I992dc0c1d40f563ade56a833162d409b02be90a0
|
The horrible json.dumps of a yaml dump is there because we need each
message to be both binary safe (which yaml gives us) and one line per
message (which json gives us).
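A minimal sketch of the one-line-per-message wrapping described above. To stay dependency-free it does not import a YAML library: the payload is a hand-written YAML document standing in for the real dump, and the names `encode_message` and `decode_message` are hypothetical, not taken from the distbuild code.

```python
import json

# Stand-in for the output of a YAML dump: multi-line text.
# (Assumption: in the real code this would come from a YAML library.)
yaml_text = "name: core-devel\nkind: stratum\nfiles:\n- usr/include/foo.h\n"


def encode_message(payload):
    # json.dumps escapes the newlines, so the whole YAML document
    # travels as a single line on the wire.
    return json.dumps(payload)


def decode_message(line):
    # Recover the original multi-line YAML text from one line of JSON.
    return json.loads(line)


line = encode_message(yaml_text)
restored = decode_message(line)
```

The receiver can then split its input stream on newlines and hand each line to `decode_message` before parsing the YAML inside it.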
|
The "unicode fix" worked for the relevant subset of cases, and only
broke distbuild because its tests have not been integrated with ./check,
so the fact that it broke for any string ending with a \ escaped notice,
if you will excuse the pun.
During json.load, the `encoding` option specifies the character
encoding of the file or string that is being loaded.
During json.dump, the `encoding` option specifies the encoding of `str`
keys and values.
The fact that it worked for the set of cases we cared about is a small
mystery, probably caused by the strings we happened to give it being
valid unicode-escape encoded `str`ings.
A full fix would require either converting all these cases to a
different format, such as YAML, which will handle input data not being
valid Unicode, or pre-processing the data that is passed to `json.dump`
to convert all `str` instances to an appropriately escaped `unicode`,
and converting back on `json.load`, but this is a quick fix to get the
distbuild code working again.
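A small sketch of the failure mode described above, written for Python 3 rather than the Python 2 `json` API the commit refers to. It assumes the breakage comes from decoding data with the `unicode_escape` codec; the helper names are hypothetical.

```python
def naive_unescape(data):
    # Treat the bytes as if they were unicode-escape encoded text.
    return data.decode("unicode_escape")


def trailing_backslash_breaks(data):
    # Report whether unicode-escape decoding rejects the input.
    try:
        naive_unescape(data)
    except UnicodeDecodeError:
        return True
    return False


# No backslashes at all: decodes unchanged.
plain = naive_unescape(b"plain/path")

# A backslash followed by a letter decodes "successfully", but the
# backslash-t pair is silently turned into a tab character.
mangled = naive_unescape(b"C:\\temp")

# A lone backslash at the end is not a complete escape sequence,
# so decoding fails outright.
broken = trailing_backslash_breaks(b"C:\\temp\\")
```

Under these assumptions, input without backslashes round-trips by accident, which is consistent with the fix appearing to work for the cases that were tested.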
|
json only accepts unicode. Various APIs such as file paths and environment
variables allow binary data, so we need to support this properly.
This patch changes every[1] use of json.load or json.dump to escape
non-unicode data strings. Input that was valid unicode appears exactly
as it used to; anything else gets \xabcd escapes inserted in place of
the non-unicode data.
When loading back in, if json.load is told to unescape it with
`encoding='unicode-escape'` then it will convert it back correctly.
This change was primarily to support file paths that weren't valid
unicode, where this would choke and die. Now it works, but any tools
that parsed the metadata need to unescape the paths.
[1]: The interface to the remote repo cache uses json data, but I haven't
changed its json.load calls to unescape the data, since the repo
caches haven't been made to escape the data.
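A sketch of explicit escaping around JSON for file paths that are not valid unicode. This is a Python 3 illustration, not the patch itself: it escapes each path by hand instead of relying on an `encoding` argument, and the function names and the `files` metadata layout are assumptions made for the example.

```python
import json


def escape_path(raw):
    # Map every byte to one code point (latin-1), then spell anything
    # non-ASCII or special as a backslash escape such as \xff.
    # Note: only plain ASCII comes out unchanged in this sketch.
    return raw.decode("latin-1").encode("unicode_escape").decode("ascii")


def unescape_path(text):
    # Exact inverse of escape_path.
    return text.encode("ascii").decode("unicode_escape").encode("latin-1")


def dump_metadata(name, paths):
    # Escape every path before handing the structure to json.
    return json.dumps({"name": name, "files": [escape_path(p) for p in paths]})


def load_metadata(text):
    # Tools reading the metadata must unescape the paths again.
    data = json.loads(text)
    data["files"] = [unescape_path(p) for p in data["files"]]
    return data


paths = [b"usr/bin/tool", b"caf\xe9/bad\xff", b"trailing\\"]
loaded = load_metadata(dump_metadata("example-chunk", paths))
```

Because backslashes in the data are themselves doubled during escaping, a path that ends in a backslash survives the round trip.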
|
We still log all messages sent to workers, which include the output of the serialise-artifact
code in full. There's no need for these status messages.
|
Serialisation was simple when we only had 1 artifact per source.
However, to allow smaller systems, we need artifact splitting to produce
multiple artifacts per chunk source.
So now the new serialisation format has a separate list of artifacts
and sources, rather than the Source being generated from the artifact's
serialisation.
Python's id() function is used to encode the references between the
various Sources and Artifacts. After deserialisation, each encoded id
is replaced with a reference to the new object.
Previously the cache-key was used, but this is no longer sufficient to
uniquely identify an Artifact.
The resultant build graph after deserialisation is a little different
to what went in: Strata end up with a different Source per Artifact,
so it _is_ a 1 to 1 mapping, as opposed to Chunks, where it's many to 1.
We serialise strata and chunks differently because stratum artifacts
from the same source can have different dependencies, for example
core-devel can have different dependencies to core-runtime.
Without intervention we would serialise core-devel and core-devel's
dependencies without including core-runtime's dependencies.
To solve this we've decided to encode stratum artifacts completely
independently: each stratum artifact has its own source. This is safe
because stratum artifacts can be constructed independently,
as opposed to Chunks where all the Artifacts for a Source
are produced together.
This is a little hacky in its current form, but it simplifies matters
later in distbuild with regards to how it handles expressing that
every Artifact that shares a Source is built together.
Arguably, this should be the output of producing the build graph
anyway, since it more helpfully represents which Artifacts are built
together than checking the morphology kind all the time, but more
assumptions need checking in morph before it's safe to make this
change across the whole of the morph codebase.
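The id()-based scheme described above can be sketched as follows. This is a simplified illustration, not morph's serialise-artifact code: the `Source` and `Artifact` classes, their fields, and the JSON layout are all hypothetical, and the per-artifact Source treatment for strata is not modelled.

```python
import json


class Source:
    def __init__(self, name):
        self.name = name
        self.artifacts = []


class Artifact:
    def __init__(self, name, source):
        self.name = name
        self.source = source
        self.dependencies = []
        source.artifacts.append(self)


def serialise(root):
    # Separate tables of artifacts and sources, keyed by id().
    artifacts, sources = {}, {}
    stack = [root]
    while stack:
        artifact = stack.pop()
        key = str(id(artifact))
        if key in artifacts:
            continue
        source = artifact.source
        artifacts[key] = {
            "name": artifact.name,
            "source": str(id(source)),
            "dependencies": [str(id(d)) for d in artifact.dependencies],
        }
        sources.setdefault(str(id(source)), {"name": source.name})
        # Visit dependencies and the siblings built from the same source.
        stack.extend(artifact.dependencies)
        stack.extend(source.artifacts)
    return json.dumps(
        {"root": str(id(root)), "artifacts": artifacts, "sources": sources})


def deserialise(text):
    data = json.loads(text)
    # Rebuild the objects first, then swap the encoded ids for references.
    sources = {k: Source(v["name"]) for k, v in data["sources"].items()}
    artifacts = {k: Artifact(v["name"], sources[v["source"]])
                 for k, v in data["artifacts"].items()}
    for key, value in data["artifacts"].items():
        artifacts[key].dependencies = [
            artifacts[d] for d in value["dependencies"]]
    return artifacts[data["root"]]


# One chunk source producing two artifacts, and a stratum depending on one.
chunk = Source("busybox")
runtime = Artifact("busybox-bins", chunk)
devel = Artifact("busybox-devel", chunk)
stratum = Artifact("core-runtime", Source("core"))
stratum.dependencies.append(runtime)

restored = deserialise(serialise(stratum))
```

The ids are only meaningful within one serialised document: they tie entries together and are discarded once the new objects have been linked up.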
|