| Commit message (Collapse) | Author | Age | Files | Lines |
| |
|
|
|
|
|
|
|
|
|
| |
Confexts should not contain code, so mount confexts with noexec.
We cannot mount invidial extensions as noexec, as the overlay ignores
it and bypasses it, we need to use the flag on the whole overlay for
it to be effective.
But given there are legacy scripts still shipped in /etc, allow to
override it with --noexec=false.
|
|
|
|
|
|
|
|
|
| |
Addresses
https://github.com/systemd/systemd/pull/25608/commits/84be0c710d9d562f6d2cf986cc2a8ff4c98a138b#r1060130312,
https://github.com/systemd/systemd/pull/25608/commits/84be0c710d9d562f6d2cf986cc2a8ff4c98a138b#r1067927293, and
https://github.com/systemd/systemd/pull/25608/commits/84be0c710d9d562f6d2cf986cc2a8ff4c98a138b#r1067926416.
Follow-up for 84be0c710d9d562f6d2cf986cc2a8ff4c98a138b.
|
| |
|
|\
| |
| | |
dissect: add dissection policies
|
| | |
|
| | |
|
| |
| |
| |
| |
| |
| | |
The confext concept is an extension of the existing sysext concept and
allows to extend the host's filesystem or a unit's filesystem with signed
images that add new files to the /etc/ directory using OverlayFS.
|
| |
| |
| |
| |
| |
| |
| |
| |
| | |
extension-release file
The release file that accompanies the confext images needs to be
host compatible to be able to be merged into the host /etc/ directory.
This commit checks for version compatibility between the image file and
the host file.
|
|/
|
|
|
|
|
|
|
| |
files
Adds a new image type called IMAGE_CONFEXT which is similar to IMAGE_SYSEXT but works
for the /etc/ directory instead of /usr/ and /opt/. This commit also adds the ability to
parse the release file that is present with the confext image in /etc/confext-release.d/
directory.
|
|
|
|
|
| |
sysexts are not supposed to ship os-release files, enforce this
when loading them
|
|
|
|
|
| |
It will be used for other extension DDI validation, not just for extension-release
validation
|
|
|
|
|
|
|
|
|
| |
Chasing symlinks is a core function that's used in a lot of places
so it deservers a less verbose names so let's rename it to chase()
and chaseat().
We also slightly change the pattern used for the chaseat() helpers
so we get chase_and_openat() and similar.
|
|
|
|
| |
Signed-off-by: Morten Linderud <morten@linderud.pw>
|
|
|
|
|
|
| |
Let's not leave the sector size unspecified: either set a user supplied
value, or auto-detect the right size by probing the disk image
accordingly.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
DISSECT_IMAGE_OPEN_PARTITION_DEVICES
Curently, these two flags were implied by dissect_loop_device(), but
that's not right, because this means systemd-gpt-auto-generator will
dissect the root block device with these flags set and that's not
desirable: the generator should not cause the partition devices to be
created (we don't intend to use them right-away after all, but expect
udev to find/probe them first, and then mount them though .mount units).
And there's no point in opening the partition devices, since we do not
intend to mount them via fds either.
Hence, rework this: instead of implying the flags, specify them
explicitly.
While we are at it, let's also rename the flags to make them more
descriptive:
DISSECT_IMAGE_MANAGE_PARTITION_DEVICES becomes
DISSECT_IMAGE_ADD_PARTITION_DEVICES, since that's really all this does:
add the partition devices via BLKPG.
DISSECT_IMAGE_OPEN_PARTITION_DEVICES becomes
DISSECT_IMAGE_PIN_PARTITION_DEVICES, since we not only open the devices,
but keep the devices open continously (i.e. we "pin" them).
Also, drop the DISSECT_IMAGE_BLOCK_DEVICE combination flag, since it is
misleading, i.e. it suggests it was appropriate to specify on all
dissected blocking devices, but that's precisely not the case, see the
systemd-gpt-auto-generator case. My guess is that the confusion around
this was actually the cause for this bug we are addressing here.
Fixes: #25528
|
| |
|
|
|
|
|
|
|
|
|
| |
I changed imports of util.h to initrd-util.h, or added an import of
initrd-util.h, to keep compilation working. It turns out that many files didn't
import util.h directly.
When viewing the patch, don't be confused by git rename detection logic:
a new .c file is added and two functions moved into it.
|
|\
| |
| | |
Adjust table n/a text in more places
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
All users were setting this to some static string (usually "-"), so let's
simplify things by not doing strdup, but instead limiting callers to a fixed
set of values. In preparation for the next commit, the function is renamed from
"empty" to "replacement", because it'll be used for more than empty fields. I
didn't do the whole string-table setup, because it's all used internally in one
file and this way we can immediately assert if an invalid value is passed in.
Some callers were (void)ing the error, others were ignoring it, and others
propagating. It's nicer to remove the boilerplate.
|
| | |
|
|/ |
|
|
|
|
|
|
| |
Note, currently, for each call of dissect_loop_device_and_warn(), the
specified name is equivalent to the path passed to loop_device_make_by_path().
Hence, this should not change the current behavios.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Let's rework how we lock loopback block devices in two ways:
1. Lock a separate fd, instead of the main block device fd. We already
did that for our internal locking when allocating loopback block
devices, but do so for the exposed locking (i.e.
loop_device_flock()), too, so that the lock is independent of the
main fd we actually use of IO.
2. Instead of locking the device during allocation of the loopback
device, then unlocking it (which will make udev run), and then
re-locking things if we need, let's instead just keep the lock the
whole time, to make things a bit safer and faster, and not have to
wait for udev at all. This is done by adding a "lock_op" parameter to
loop device allocation functions that declares the initial state of
the lock, and is one of LOCK_UN/LOCK_SH/LOCK_EX. This change also
shortens a lot of code, since we allocate + immediately lock loopback
devices pretty much everywhere.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This reverts a major chunk of 75d7e04eb4662a814c26010d447eed8a862f5ec1
Now that the loopback device code already destroys the partitions we
don't have to do this here anymore.
I am sure the right place to delete the partitions is in the loopback
code, since we really only should do that for loopback devices, see
bug #24431, and not on "real" block devices.
I am also not convinced dropping partitions the dissection logic doesn't
care about is a good idea, after all. The dissection stuff should
probably not consider itself the "owner" of the block devices it
analyzes, but take a more passive role: figure out what is what, but not
modify it.
Fixes: #24431
|
| |
|
|
|
|
|
|
|
|
| |
Fixes build with musl:
| ../git/src/shared/dissect-image.c: In function 'mount_image_privately_interactively':
| ../git/src/shared/dissect-image.c:2986:34: error: 'LOCK_SH' undeclared (first use in this function)
| 2986 | r = loop_device_flock(d, LOCK_SH);
| | ^~~~~~~
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When closing a loop device, the kernel will asynchronously remove
the probed partitions. This can lead to race conditions where we
try to reuse a partition device that still needs to be removed by
the kernel. To avoid such issues, let's explicitly try to remove
any partitions using BLKPG_DEL_PARTITION when we're done with an
image.
To make sure we don't try to remove partitions when we want them
to remain (e.g. systemd-dissect --mount), we add
dissected_image_relinquish() in a similar vein to loop_device_relinquish()
and decrypted_image_relinquish().
|
|
|
|
|
| |
Otherwise, the assertion in extension_release_validate() will be
triggered.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is a follow-up for f470cb6d13558fc06131dc677d54a089a0b07359 which in
turn is a follow-up for a068aceafbffcba85398cce636c25d659265087a.
The latter started to honour hidden files when deciding whether a
directory is empty. The former reverted to the old behaviour to fix
issue #23220.
It introduced a bug though: when a directory contains a larger number of
hidden entries the getdents64() buffer will not suffice to read them,
since we just allocate three entries for it (which is definitely enough
if we just ignore the . + .. entries, but not ig we ignore more).
I think it's a bit confusing that dir_is_empty() can return true even if
rmdir() on the dir would return ENOTEMPTY. Hence, let's rework the
function to make it optional whether hidden files are ignored or not.
After all, I looking at the users of this function I am pretty sure in
more cases we want to honour hidden files.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
No actual code changes, just splitting out of some dev_t handling
related calls from stat-util.[ch], they are quite a number already, and
deserve their own module now I think.
Also, try to settle on the name "devnum" as the name for the concept,
instead of "devno" or "dev" or "devid". "devnum" is the name exported in
udev APIs, hence probably best to stick to that. (this just renames a
few symbols to "devum", local variables are left untouched, to make the
patch not too invasive)
No actual code changes.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
So here's something we should always keep in mind:
systemd-udevd actually does *two* things with BSD file locks on block
devices:
1. While it probes a device it takes a LOCK_SH lock. Thus everyone else
taking a LOCK_EX lock will temporarily block udev from probing
devices, which is good when making changes to it.
2. Whenever a device is closed after write (detected via inotify), udevd
will issue BLKRRPART (requesting the kernel to reread the partition
table). It does this while holding a LOCK_EX lock on the block
device. Thus anyone else taking LOCK_SH or LOCK_EX will temporarily
block udevd from issuing that ioctl. And that's quite relevant, since
the kernel will temporarily flush out all partitions while re-reading
the partition table and then create them anew. Thus it is smart to
take LOCK_SH when dissecting a block device to ensure that no
BLKRRPART is issued in the background, until we mounted the devices.
|
|
|
|
| |
This also avoids multiple evaluations in STRV_FOREACH_BACKWARDS()
|
|
|
|
|
|
| |
Jan 17 12:34:59 myguest1 (sd-sysext)[486]: Device '/var/lib/extensions/myext.raw' is loopback block device with partition scanning turned off, please turn it on.
Fixes https://github.com/systemd/systemd/issues/22146
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
system extension is for
This should make things a bit more robust since it ensures system
extension can only applied to the right environments. Right now three
different "scopes" are defined:
1. "system" (for regular OS systems, after the initrd transition)
2. "initrd" (for sysext images that apply to the initrd environment)
3. "portable" (for sysext images that apply to portable images)
If not specified we imply a default of "system portable", i.e. any image
where the field is not specified is implicitly OK for application to OS
images and for portable services – but not for initrds.
|
|
|
|
|
|
|
|
| |
It's "sysext", not "sysexit".
The string passed here is pure decoration, and noone will see it, since
it's only in our private mount namespace. But still, it's a typo, let's
fix it
|
| |
|
| |
|
|
|
|
|
| |
This adds support for actually using embedded signature data from
partitions.
|
|\
| |
| | |
Use new diskseq block device property
|
| |
| |
| |
| |
| |
| |
| |
| | |
DISKSEQ is a reliable way to find out if we missed a uevent or not, as
it's monotonically increasing. If we parse an event with a smaller or
no sequence number, we know we need to wait longer. If we parse an
event with a greater sequence number, we know we missed it and the
device was reused.
|
|/
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In general we almost never hit those asserts in production code, so users see
them very rarely, if ever. But either way, we just need something that users
can pass to the developers.
We have quite a few of those asserts, and some have fairly nice messages, but
many are like "WTF?" or "???" or "unexpected something". The error that is
printed includes the file location, and function name. In almost all functions
there's at most one assert, so the function name alone is enough to identify
the failure for a developer. So we don't get much extra from the message, and
we might just as well drop them.
Dropping them makes our code a tiny bit smaller, and most importantly, improves
development experience by making it easy to insert such an assert in the code
without thinking how to phrase the argument.
|
| |
|
| |
|
|
|
|
|
|
|
|
|
| |
This tries to shorten the race of device reuse a bit more: let's ignore
udev database entries that are older than the time where we started to
use a loopback device.
This doesn't fix the whole loopback device raciness mess, but it makes
the race window a bit shorter.
|
|
|
|
|
|
|
|
|
|
|
| |
Let's drop all monitor uevent that were enqueued before we actually
started setting up the device.
This doesn't fix the race, but it makes the race window smaller: since
we cannot determine the uevent seqnum and the loopback attachment
atomically, there's a tiny window where uevents might be generated by
the device which we mistake for being associated with out use of the
loopback device.
|
|
|
|
|
|
|
|
|
|
|
|
| |
Previously, the flag did two things at once: enable support for using
generic partitions as root fs if there were only one/allow use of
partition-table-less images as root fs. And secondly, insist that there
was a rootfs, and fail if not. Let's split these two in two separate
options so that they can be used independently of each other.
There are cases where one wants to use one without the other (i.e. when
inspecting things with systemd-dissect tool it should be OK to do so
even if image has no root fs), and it's cleaner anyway.
|
|
|
|
|
|
|
|
| |
Let's make use of the new dissection in all tools where this makes
sense, which are all tools that dissect images, except for those which
inherently operate on state/configuraiton and thus where an image
without state nor configuration is useless (e.g.
systemd-tmpfiles/systemd-firstboot/… --image= switch).
|