| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
During the early design of FCOS and RHCOS, we chose a value of 384M
for the boot partition. This turned out to be too small: some arches
other than x86_64 have larger initrds, kernel binaries, or additional
artifacts (like device tree blobs). We'll likely bump the boot partition
size in the future, but we don't want to abandon all the nodes deployed
with the current size.[[1]]
Because stale entries in `/boot` are cleaned up after new entries are
written, there is a window in the update process during which the bootfs
temporarily must host all the `(kernel, initrd)` pairs for the union of
current and new deployments.
This patch determines if the bootfs is capable of holding all the
pairs. If it can't but it could hold all the pairs from just the new
deployments, the outgoing deployments (e.g. rollbacks) are deleted
*before* new deployments are written. This is done by updating the
bootloader in two steps to maintain atomicity.
Since this is a lot of new logic in an important section of the
code, this feature is gated for now behind an environment variable
(`OSTREE_ENABLE_AUTO_EARLY_PRUNE`). Once we gain more experience with
it, we can consider turning it on by default.
This strategy increases the fallibility of the update system since one
would no longer be able to rollback to the previous deployment if a bug
is present in the bootloader update logic after auto-pruning (see [[2]]
and following). This is however mitigated by the fact that the heuristic
is opportunistic: the rollback is pruned *only if* it's the only way for
the system to update.
[1]: https://github.com/coreos/fedora-coreos-tracker/issues/1247
[2]: https://github.com/ostreedev/ostree/issues/2670#issuecomment-1179341883
Closes: #2670
|
|
|
|
|
|
|
|
| |
When hacking and testing locally with `cosa build-fast` and `kola run`,
I prefer to leave testing framework stuff within the work directory
rather than installed in my pet container. Add a `localinstall` target
for this which puts the tests in `tests/kola`. Then a simple `kola run`
will pick it up.
|
|
|
|
|
|
|
| |
XFS now seems to want filesystems larger than 300MB, so switch
to ext4. Also use `20MiB` so we align to 512b sectors to squash
a `losetup` warning.
Also tweak some of the numbers to still work.
|
|
|
|
|
|
|
|
|
|
|
|
| |
Introduces an intermediate format for overlayfs storage, where
.wh-ostree. prefixed files will be converted into char 0:0
whiteout devices used by overlayfs to mark deletions across layers.
The CI scripts now uses a volume for the scratch directories
previously in /var/tmp otherwise we cannot create whiteout
devices into an overlayfs mounted filesystem.
Related-Issue: #2712
|
|\
| |
| | |
finalize-staged: Ensure /boot and /sysroot automounts don't expire
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
If `/boot` is an automount, then the unit will be stopped as soon as the
automount expires. That's would defeat the purpose of using systemd to
delay finalizing the deployment until shutdown. This is not uncommon as
`systemd-gpt-auto-generator` will create an automount unit for `/boot`
when it's the EFI System Partition and there's no fstab entry.
To ensure that systemd doesn't stop the service early when the `/boot`
automount expires, introduce a new unit that holds `/boot` open until
it's sent `SIGTERM`. This uses a new `--hold` option for
`finalize-staged` that loads but doesn't lock the sysroot. A separate
unit is used since we want the process to remain active throughout the
finalization run in `ExecStop`. That wouldn't work if it was specified
in `ExecStart` in the same unit since it would be killed before the
`ExecStop` action was run.
Fixes: #2543
|
|/
|
|
|
|
|
|
|
|
|
|
|
|
| |
are pending
This is to support pending deployments instead of rasing assertion.
For example:
```
$ sudo rpm-ostree kargs --append=foo=bar
$ sudo ostree admin kargs edit-in-place --append-if-missing=foobar
```
After reboot we get both `foo=bar foobar`.
Fix https://github.com/ostreedev/ostree/issues/2679
|
|
|
|
|
|
|
|
|
| |
Don't parse `rpm-ostree status` output, it's not meant for that. Use
`--json` output instead.
While we're here, fix an obsolete reference to Ansible.
Related: https://github.com/coreos/rpm-ostree/pull/3938
|
| |
|
|
|
|
|
|
|
|
|
| |
https://github.com/coreos/coreos-assembler/pull/2921 broke this
test which is intentionally causing a systemd unit to fail.
As they say, necessity is the mother of invention. They don't
say though that need always causes particularly *beautiful* things
to be invented...
|
|
|
|
| |
`tests/inst` became its own workspace.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Quite a while ago we added staged deployments, which solved
a bunch of issues around the `/etc` merge. However...a persistent
problem since then is that any failures in that process that
happened in the *previous* boot are not very visible.
We ship custom code in `rpm-ostree status` to query the previous
journal. But that has a few problems - one is that on systems
that have been up a while, that failure message may even get
rotated out. And second, some systems may not even have a persistent
journal at all.
A general thing we do in e.g. Fedora CoreOS testing is to check
for systemd unit failures. We do that both in our automated tests,
and we even ship code that displays them on ssh logins. And beyond
that obviously a lot of other projects do the same; it's easy via
`systemctl --failed`.
So to make failures more visible, change our `ostree-finalize-staged.service`
to have an internal wrapper around the process that "catches" any
errors, and copies the error message into a file in `/boot/ostree`.
Then, a new `ostree-boot-complete.service` looks for this file on
startup and re-emits the error message, and fails.
It also deletes the file. The rationale is to avoid *continually*
warning. For example we need to handle the case when an upgrade
process creates a new staged deployment. Now, we could change the
ostree core code to delete the warning file when that happens instead,
but this is trying to be a conservative change.
This should make failures here much more visible as is.
|
|
|
|
|
|
|
|
|
|
| |
Our CI started falling over because coreos-assembler checks
for units stuck activating as of https://github.com/coreos/coreos-assembler/pull/2810
Really need to centralize the code for this and fix the root
problem, but...not today.
xref https://github.com/coreos/coreos-assembler/pull/2814
|
|
|
|
|
|
|
|
| |
`kola` now follows symlinks when archiving an external test's `data/`
dir. So the recursive `data` symlink we have here breaks it.
Let's just move the shared files in its own directory and update the
symlinks.
|
|
|
|
|
|
|
|
|
| |
Support for that file was added previously, but the testing lived in
rpm-ostree only. Let's add it here too.
In the process add a hidden `--lock-finalization` to `ostree admin
deploy` to make testing easier (though it could also be useful to update
managers driving OSTree via the CLI).
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In fixing https://github.com/coreos/rpm-ostree/pull/3323
I felt that it was a bit ugly we're installing `/usr/bin/ostree-container`.
It's kind of an implementation detail. We want users to use
`ostree container`.
Let's support values outside of $PATH too.
For example, this also ensures that TAB completion for `ost` expands
to `ostree ` with a space.
|
| |
|
|
|
|
|
|
|
|
| |
This reworks the var-mount destructive test in order to properly use
the datadir for the current stateroot instead of a duplicated one.
In turn, it ensures that the resulting `var.mount` after reboot is
correctly pointing to the same location which hosted `/var` on the
previous boot.
|
|
|
|
|
|
|
|
|
|
|
| |
https://bugzilla.redhat.com/show_bug.cgi?id=1945274 is an issue where a privileged
kubernetes daemonset is writing a socket into `/etc`. This makes ostree upgrades barf.
Now, they should clearly move it to `/run`. However, one option is for us to
just ignore it instead of erroring out. Some brief investigation shows that
e.g. `git add somesocket` is a silent no-op, which is an argument in favor of ignoring it.
Closes: https://github.com/ostreedev/ostree/issues/2446
|
|
|
|
|
|
|
|
|
|
|
|
| |
The logic for `--selinux-policy` ended up in the `--tree=dir`
path, but there's no reason for that. Fix the imported
labeling with `--tree=tar`. Prep for use with containers.
We had this bug because the previous logic was trying to avoid
duplicating the code for generic `--selinux-policy` and
the case of `--selinux-policy-from-base --tree=dir`.
It's a bit more code, but it's cleaner if we dis-entangle them.
|
|
|
|
|
|
|
|
|
|
|
| |
Having to touch a global test counter when adding tests is
a recipe for conflicts between PRs.
The TAP protocol allows *ending* with the expected number of
tests, so the best way to do this is to have an explicit
API like our `tap_ok` which bumps a counter, then end with `tap_end`.
I ported one test as a demo.
|
|
|
|
|
|
|
|
|
|
| |
We're waaay overdue for this, it's been the default
in rpm-ostree for years, and solves several important bugs
around not capturing `/etc` while things are running.
Also, `ostree admin upgrade --stage` (should) become idempotent.
Closes: https://github.com/ostreedev/ostree/issues/2389
|
|\
| |
| | |
libtest-core: Add some improvements from bubblewrap
|
| |
| |
| |
| | |
Signed-off-by: Simon McVittie <smcv@collabora.com>
|
| |
| |
| |
| | |
Signed-off-by: Simon McVittie <smcv@collabora.com>
|
| |
| |
| |
| |
| |
| |
| |
| | |
If we fail as a result of `set -x`, It's often not completely obvious
which command failed or how. Use a trap on ERR to show the command that
failed, and its exit status.
Signed-off-by: Simon McVittie <smcv@collabora.com>
|
| |
| |
| |
| |
| |
| | |
[Originally from bubblewrap commits c5c999a7 "tests: test --userns"
and 3e5fe1bf "tests: Better error message if assert_files_equal fails";
separated into this commit by Simon McVittie.]
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
We struggled for a long time with enablement of our "internal units",
trying to follow the philosophy that units should only be enabled
by explicit preset.
See https://bugzilla.redhat.com/show_bug.cgi?id=1451458
and https://github.com/coreos/rpm-ostree/pull/1482
etc.
And I just saw chat (RH internal on a proprietary system sadly) where
someone hit `ostree-remount.service` not being enabled in CentOS8.
Thinking about this more, I realized we've shipped a systemd generator
for a long time and while its only role until now was to generate `var.mount`,
but by using it to force on our internal units, we don't require
people to deal with presets anymore.
Basically we're inverting things so that "if ostree= is on the kernel
cmdline, then enable our units" and not "enable our units, but have
them use ConditionKernelCmdline=ostree to skip".
Drop the weird gyrations we were doing around `ostree-finalize-staged.path`
too; forking `systemctl start` is just asking for bugs.
So after this, hopefully we won't ever again have to think about
distribution presets and our units.
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
This will be ignored, so let's make it very clear
people are doing something wrong. Motivated by a bug
in a build pipeline that injected `/var/lib/rpm` into an ostree
commit which ended up crashing rpm-ostree because it was an empty db
which it wasn't expecting.
It *also* turns out rpm-ostree is incorrectly dumping content in the
deployment `/var` today, which is another bug.
|
|/
|
|
|
| |
Yeah, we should stop parsing the text; I need to dig at that
at some point.
|
|
|
|
|
|
|
|
|
| |
rust-analyzer is happier with this because it understands
the project structure out of the box.
We aren't actually again adding a dependency on Rust/cargo in the core,
this is only used to make `cargo build` work out of the box to build
the Rust test code.
|
|
|
|
| |
See latest in https://github.com/coreos/fedora-coreos-tracker/blob/master/docs/ci-and-builds.md
|
|
|
|
|
|
| |
To help debug an issue we've seen where `/boot` isn't
in sync with the `/ostree/boot` dir, let's log to the journal
what we're doing.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In FCOS and RHCOS, the need to configure software in the initramfs has
come up multiple times. Sometimes, using kernel arguments suffices.
Other times, it really must be a configuration file. Rebuilding the
initramfs on the client-side however is a costly operation. Not only
does it add complexity to the update workflow, it also erodes a lot of
the value obtained from using the baked "blessed" initramfs from the
tree itself.
One elegant way to address this is to allow specifying multiple
initramfses. This is supported by most bootloaders (notably GRUB) and
results in each initrd being overlayed on top of each other.
This patch allows libostree clients to leverage this so that they can
avoid regenerating the initramfs entirely. libostree itself is agnostic
as to what kind and how much data overlay initrds contain. It's up to
the clients to enforce such boundaries.
To implement this, we add a new ostree_sysroot_stage_overlay_initrd
which takes a file descriptor and returns a checksum. Then users can
pass these checksums when calling the deploy APIs via the new array
option `overlay_initrds`. We copy these files into `/boot` and add them
to the BLS as another `initrd` entry.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This adds infrastructure to the Rust test suite for destructive
tests, and adds a new `transactionality` test which runs
rpm-ostree in a loop (along with `ostree-finalize-staged`) and
repeatedly uses either `kill -9`, `reboot` and `reboot -ff`.
The main goal here is to flush out any "logic errors".
So far I've validated that this passes a lot of cycles
using
```
$ kola run --qemu-image=fastbuild-fedora-coreos-ostree-qemu.qcow2 ext.ostree.destructive-rs.transactionality --debug --multiply 8 --parallel 4
```
a number of times.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
I was thinking a bit more recently about the "live" changes
stuff https://github.com/coreos/rpm-ostree/issues/639
(particularly since https://github.com/coreos/rpm-ostree/pull/2060 )
and I realized reading the last debates in that issue that
there's really a much simpler solution; do exactly the same
thing we do for `ostree admin unlock`, except mount it read-only
by default.
Then, anything that wants to modify it does the same thing
libostree does for `/sysroot` and `/boot` as of recently; create
a new mount namespace and do the modifications there.
The advantages of this are numerous. First, we already have
all of the code, it's basically just plumbing through a new
entry in the state enumeration and passing `MS_RDONLY` into
the `mount()` system call.
"live" changes here also naturally don't persist, unlike what
we are currently doing in rpm-ostree.
|
|
|
|
|
| |
Fixes the tests, see https://github.com/coreos/coreos-assembler/pull/1600
TODO: provide a webserver binary via virtio or so
|
|
|
|
|
|
|
|
| |
See https://github.com/coreos/coreos-assembler/pull/1528
I think we can drop the old cosa reboot APIs after this,
though I've already forgotten where else I might have written
tests using it.
|
|
|
|
| |
For the motivation for this see #2132.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
There's a lot going on here. First, this is intended to run
nicely as part of the new [cosa/kola ext-tests](https://github.com/coreos/coreos-assembler/pull/1252).
With Rust we can get one big static binary that we can upload,
and include a webserver as part of the binary. This way we don't
need to do the hack of running a container with Python or whatever.
Now, what's even better about Rust for this is that it has macros,
and specifically we are using [commandspec](https://github.com/tcr/commandspec/)
which allows us to "inline" shell script. I think the macros
could be even better, but this shows how we can intermix
pure Rust code along with using shell safely enough.
We're using my fork of commandspec because the upstream hasn't
merged [a few PRs](https://github.com/tcr/commandspec/pulls?q=is%3Apr+author%3Acgwalters+).
This model is intended to replace *both* some of our
`make check` tests as well.
Oh, and this takes the obvious step of using the Rust OSTree bindings
as part of our tests. Currently the "commandspec tests" and "API tests"
are separate, but nothing stops us from intermixing them if we wanted.
I haven't yet tried to write destructive tests with this but
I think it will go well.
|
|
Follow the precedent set in https://github.com/coreos/rpm-ostree/pull/2106
and rename the directory, to more clearly move away from the
"uninstalled" test model. Prep for Rust-based tests.
|