author | Colin Walters <walters@verbum.org> | 2022-01-17 11:46:04 -0500 |
---|---|---|
committer | Colin Walters <walters@verbum.org> | 2022-01-18 09:19:20 -0500 |
commit | cb731294837736e957ee595ce11ab115277dbb36 (patch) | |
tree | a77115c6b1f6b1883f0361b4324d0d01aa01653d /Makefile-switchroot.am | |
parent | 0095f7c472e237a10befeb02f300127f28880354 (diff) | |
download | ostree-cb731294837736e957ee595ce11ab115277dbb36.tar.gz |
deploy: Add a 5s max timeout on global filesystem `sync()`
https://bugzilla.redhat.com/show_bug.cgi?id=2003532
Basically there's a systemd bug where it's losing the `_netdev`
aspect of Ceph filesystem mounts. This means the network is taken
down before Ceph is unmounted. In turn, our invocation of `sync()`
blocks on Ceph, which won't succeed.
And this in turn manifests as a failure to transition to the new
deployment.
I initially did this patch to just rip out the global `sync()`. I
am pretty sure we don't need it anymore. We've been doing individual
`syncfs()` on `/sysroot` and `/boot` for a while now, and those
are the only filesystems we should be touching. But *proving* that
is a whole other thing of course.
To be conservative, let's instead just add a timeout of 5s on
our invocation of `sync()`. It doesn't return any information on
success/error anyways.
To allow testing without the `sync()` invocation, we also support
a new `OSTREE_SYSROOT_OPT_SKIP_SYNC=1` environment variable. For
staged deployments, this needs to be injected via e.g. systemd unit
overrides into `ostree-finalize-staged.service`.
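
As an illustration of the unit override mentioned above, a systemd drop-in along the following lines should inject the variable into the staged-finalization service. This is a sketch, not something taken from the patch: the drop-in file name `skip-sync.conf` is hypothetical, and the directory is the conventional location for administrator overrides.

```ini
# /etc/systemd/system/ostree-finalize-staged.service.d/skip-sync.conf
# (file name is an assumption; any *.conf in this directory is read)
[Service]
Environment=OSTREE_SYSROOT_OPT_SKIP_SYNC=1
```

After adding the file, `systemctl daemon-reload` makes systemd pick up the override.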
Implementing this is a bit hairy - we need to spawn a thread. I
debated blocking in a recursive mainloop, but I think `g_cond_wait_until()`
is also fine here.
Diffstat (limited to 'Makefile-switchroot.am')
0 files changed, 0 insertions, 0 deletions