path: root/units
Commit message | Author | Age | Files | Lines
...
* boot: implement kernel EFI RNG seed protocol with proper hashing | Jason A. Donenfeld | 2022-11-14 | 1 | -3/+0
| Rather than passing seeds up to userspace via EFI variables, pass seeds directly to the kernel's EFI stub loader, via LINUX_EFI_RANDOM_SEED_TABLE_GUID. EFI variables can potentially leak and suffer from forward secrecy issues, and processing these with userspace means that they are initialized much too late in boot to be useful. In contrast, LINUX_EFI_RANDOM_SEED_TABLE_GUID uses EFI configuration tables, and so is hidden from userspace entirely, and is parsed extremely early on by the kernel, so that every single call to get_random_bytes() by the kernel is seeded.
|
| In order to do this properly, we use a bit more robust hashing scheme, and make sure that each input is properly memzeroed out after use. The scheme is:
|
|     key = HASH(LABEL || sizeof(input1) || input1 || ... || sizeof(inputN) || inputN)
|     new_disk_seed = HASH(key || 0)
|     seed_for_linux = HASH(key || 1)
|
| The various inputs are:
|
| - LINUX_EFI_RANDOM_SEED_TABLE_GUID from prior bootloaders
| - 256 bits of seed from EFI's RNG
| - The (immutable) system token, from its EFI variable
| - The prior on-disk seed
| - The UEFI monotonic counter
| - A timestamp
|
| This also adjusts the secure boot semantics, so that the operation is only aborted if it's not possible to get random bytes from EFI's RNG or a prior boot stage. With the proper hashing scheme, this should make boot seeds safe even on secure boot.
|
| There is currently a bug in Linux's EFI stub in which if the EFI stub manages to generate random bytes on its own using EFI's RNG, it will ignore what the bootloader passes. That's annoying, but it means that either way, via systemd-boot or via EFI stub's mechanism, the RNG *does* get initialized in a good safe way. And this bug is now fixed in the efi.git tree, and will hopefully be backported to older kernels.
|
| As the kernel recommends, the resultant seeds are 256 bits and are allocated using pool memory of type EfiACPIReclaimMemory, so that it gets freed at the right moment in boot.
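The derivation described in this commit message can be sketched in a few lines of Python. This is only an illustration of the scheme as written above, not the code from the commit: SHA-256 as HASH, the label string, the 64-bit little-endian size prefix, and the single-byte encoding of the 0 and 1 suffixes are all assumptions made here.

```python
import hashlib
import struct

# Hypothetical domain-separation label; the real label is not given in the log.
LABEL = b"example boot seed label"


def derive_seeds(inputs):
    """Return (new_disk_seed, seed_for_linux) for a list of byte strings."""
    h = hashlib.sha256()  # assumption: HASH is SHA-256 (256-bit output)
    h.update(LABEL)
    for data in inputs:
        # sizeof(input) || input; the 64-bit little-endian size is an assumption.
        h.update(struct.pack("<Q", len(data)))
        h.update(data)
    key = h.digest()
    # Two independent outputs derived from the same key.
    new_disk_seed = hashlib.sha256(key + b"\x00").digest()
    seed_for_linux = hashlib.sha256(key + b"\x01").digest()
    return new_disk_seed, seed_for_linux
```

Prefixing each input with its size keeps the concatenation unambiguous: the inputs `[b"ab", b"c"]` and `[b"a", b"bc"]` hash to different keys even though their bytes are the same when joined.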
* unit: also prioritize input devices when triggering devices | Yu Watanabe | 2022-10-26 | 1 | -1/+1
| In most cases, a tty device without input devices is meaningless.
|
| This also swaps the priority of tty and net:
| - input devices are often connected via a USB bus, and hence may take slightly more time to be initialized. As described above, in most cases it is fine for tty devices to be initialized just before input devices,
| - network configuration usually takes a long time, e.g. DHCP or RA, hence it is better that network interfaces are initialized earlier. Network services can then start the DHCP client or friends earlier.
|
| Fixes #24026.
* pcrphase: add two additional phases | Lennart Poettering | 2022-10-17 | 3 | -1/+28
| This adds two more phases to the PCR boot phase logic: "sysinit" + "final".
|
| The "sysinit" one is placed between sysinit.target and basic.target. It's good to have a milestone in this place, since this is after all file systems/LUKS volumes are in place (which sooner or later should result in measurements of their own) and before services are started (where we should be able to rely on them to be complete).
|
| This is particularly useful to make certain secrets available for mounting secondary file systems, but making them unavailable later.
|
| This breaks API in a way (as measurements during runtime will change), but given that the pcrphase stuff wasn't released yet, this should be OK.
* meson: Fix pcrphase unit conditions | Daan De Meyer | 2022-10-11 | 1 | -2/+2
|
* units: udev: partially emulate ProtectClock= | Topi Miettinen | 2022-09-26 | 1 | -0/+2
| Drop CAP_SYS_TIME and CAP_WAKE_ALARM capabilities and block clock-related system calls. Update TODO.
* tmpfiles: add lines for provisioning ssh keys for root by default | Lennart Poettering | 2022-09-23 | 1 | -0/+1
| With this, I can now easily do:
|
|     systemd-nspawn --load-credential=ssh.authorized_keys.root:/home/lennart/.ssh/authorized_keys --image=… --boot
|
| To boot into an image with my SSH key copied in. Yay!
* units: add pcrphase units | Lennart Poettering | 2022-09-22 | 3 | -0/+51
|
* Merge pull request #24670 from keszybz/early-boot-ordering | Zbigniew Jędrzejewski-Szmek | 2022-09-17 | 14 | -31/+67
|\
| | Early boot ordering
| * units: drop path to executable in $PATH | Zbigniew Jędrzejewski-Szmek | 2022-09-15 | 1 | -1/+1
| | We don't have it in other places, so let's make things a bit simpler.
| * units: make sure that initrd-switch-root.service pulls in .target | Zbigniew Jędrzejewski-Szmek | 2022-09-15 | 1 | -0/+1
| | Normally we queue initrd-switch-root.target/isolate, which pulls in the service via Wants= in the .target unit file. But if the service is instead started directly, there may be nothing pulling in the target. Let's make sure that the reference exists.
| * units: add dependency ordering for emergency.service conflicts | Zbigniew Jędrzejewski-Szmek | 2022-09-15 | 3 | -0/+3
| | If we want to stop those services which would compete for access to the console, we need to have an ordering so that they are actually stopped before the other thing starts, not asynchronously.
| * units: add ordering dependencies on initrd-switch-root.target | Zbigniew Jędrzejewski-Szmek | 2022-09-15 | 9 | -18/+18
| | For shutdown, we queue shutdown.target/start, so in every unit which should be stopped *before* shutdown, we need both Conflicts and an ordering dependency with shutdown.target (either Before= or After= would work, because stop jobs are always ordered before start jobs).
| |
| | For initrd transition, we queue initrd-switch-root.service/isolate. This automatically creates a /stop job for every running unit without IgnoreOnIsolate. But no ordering dependency is created, unless the unit has a (possibly transitive) ordering dependency on initrd-switch-root.service. Since most units must stop before the transition, we should add the ordering dependency. It is nicer to use Before=initrd-switch-root.target for this. initrd-switch-root.target is ordered before initrd-switch-root.service, so the effect is the same when both are in a transaction.
| |
| | Fixes #23745.
| |
| | To also cover the case where somebody is in emergency mode in the initrd and queues initrd-switch-root.service/start (not isolate), also add Conflicts=initrd-switch-root.target, so various units are stopped properly. This extends 2525682565b372b9b83c848bfe89c025fed47a1d to cover all the other services that are touched. It could be considered "operator error", but it's an easy mistake to make and it's nicer if we can make this more foolproof.
| * units/systemd-network-generator.service: add forgotten ordering for shutdown | Zbigniew Jędrzejewski-Szmek | 2022-09-15 | 1 | -0/+2
| |
| * units: reorder/split unit dependency blocks | Zbigniew Jędrzejewski-Szmek | 2022-09-15 | 13 | -24/+54
| | The block is reordered and split to have:
| | 1. description + documentation
| | 2. (optionally) conditions
| | 3. all the dependencies
| |
| | I think it's easier to read the units this way.
| |
| | Also, the Conflicts+Before is separated out to separate lines. The ordering dependency is "fake": it could just as well be After=, since we are adding it only to force ordering wrt. shutdown.target, and it plays a different role than the other Before= lines, which are about a real ordering on boot.
* | add CAP_LINUX_IMMUTABLE to systemd-machined, so it can handle machinectl read-only requests | Dan Streetman | 2022-09-16 | 1 | -1/+1
| | Without this, the 'machinectl read-only ...' command always fails.
* | unit: drop ProtectClock=yes from systemd-udevd.service | Yu Watanabe | 2022-09-16 | 1 | -3/+0
| | This partially reverts cabc1c6d7adae658a2966a4b02a6faabb803e92b.
| |
| | The setting ProtectClock= implies DeviceAllow=, which is not suitable for udevd. Although we are slowly removing cgroupsv1 support, DeviceAllow= with cgroupsv1 is necessarily racy, and reloading PID1 during the early boot process may cause issues like #24668.
| |
| | Let's disable ProtectClock= for udevd. And, if necessary, let's explicitly drop CAP_SYS_TIME and CAP_WAKE_ALARM (and possibly others) by using CapabilityBoundingSet= later.
| |
| | Fixes #24668.
* | pstore: do not try to load all known pstore modules | Nick Rosbrook | 2022-09-14 | 1 | -2/+2
|/
| Commit 70e74a5997 ("pstore: Run after modules are loaded") added After= and Wants= entries for all known kernel modules providing a pstore.
|
| While adding these dependencies on systems where one of the modules is not present, or not configured, should not have a real effect on the system, it can produce annoying error messages in the kernel log. E.g. "mtd device must be supplied (device name is empty)" when the mtdpstore module is not configured correctly.
|
| Since dependencies cannot be removed with drop-ins, if a distro wants to remove some of these modules from systemd-pstore.service, they need to patch units/systemd-pstore.service.in. On the other hand, if they want to append to the dependencies this can be done by shipping a drop-in.
|
| Since the original intent of the previous commit was to fix [1], which only requires the efi_pstore module, remove all other kernel module dependencies from systemd-pstore.service, and let distros ship drop-ins to add dependencies if needed.
|
| [1] https://github.com/systemd/systemd/issues/18540
* units: prolong the stop timeout for homed | Lennart Poettering | 2022-09-05 | 1 | -0/+1
| Let's give IO/resizing/… more time than usual.
|
| Fixes: #22901
* Merge pull request #24054 from keszybz/initrd-no-reload | Frantisek Sumsal | 2022-08-18 | 3 | -25/+34
|\
| | Don't do daemon-reload in the initrd
| * initrd-parse-etc: override argv[0] to avoid dracut issue | Zbigniew Jędrzejewski-Szmek | 2022-08-18 | 1 | -1/+3
| | Quoting https://github.com/systemd/systemd/pull/24054#issuecomment-1210501631:
| |
| | > this would need a patch in dracut, specifically adding the
| | > systemd-sysroot-fstab-check to the list of installed stuff:
| | > https://github.com/dracutdevs/dracut/blob/fe8fa2b0cadbb33e27c8dd8b5851548dcd65835c/modules.d/00systemd/module-setup.sh#L47.
| | >
| | > I could do this manually in the CI (and I guess I'd have to do it anyway even
| | > if the patch lands in upstream, since it won't be available in C8S), but it
| | > should get there first before merging this PR, otherwise it's going to break
| | > Rawhide.
| * units/initrd-parse-etc.service: only start units that are required | Zbigniew Jędrzejewski-Szmek | 2022-07-23 | 2 | -6/+13
| | This makes use of the option switch that was added in the previous commit.
| |
| | We used a pretty big hammer on a relatively small nail: we would do daemon-reload and (in principle) allow any configuration to be changed. But in fact we only made use of this in systemd-fstab-generator. systemd-fstab-generator filters out all mountpoints except /usr and those marked with x-initrd.mount, i.e. on a big majority of systems it wouldn't do anything.
| |
| | Also, since systemd-fstab-generator first parses /proc/cmdline, and then initrd's /etc/fstab, and only then /sysroot/etc/fstab, configuration in the host would only matter if the same mountpoint wasn't configured "earlier". So the config in the host could be used for new mountpoints, but it couldn't be used to amend configuration for existing mountpoints. And we wouldn't actually remount anything, so mountpoints that were already mounted wouldn't be affected, even if we did change some config.
| |
| | In the new scheme, we will parse /sysroot/etc/fstab and explicitly start sysroot-usr.mount and other units that we just wrote. In most cases (as written above), this will actually result in no units being created or started.
| |
| | If the generator is invoked on a system with /sysroot/etc/fstab present, behaviour is not changed and we'll create units as before. This is needed so that if daemon-reload is later run at some point, we don't "lose" those units.
| |
| | There's a minor bugfix here: we honour x-initrd.mount for swaps, but we wouldn't restart swap.target, i.e. the new swaps wouldn't necessarily be pulled in immediately.
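The filtering rule mentioned in this commit message (keep only /usr and entries marked x-initrd.mount) can be illustrated with a small Python sketch. This is not systemd-fstab-generator's implementation; the function name and the simplified whitespace-based fstab parsing are assumptions made for the example.

```python
def initrd_relevant_entries(fstab_text):
    """Return (what, where, fstype) for entries the initrd would act on.

    Hypothetical helper: keeps /usr and anything with x-initrd.mount,
    mirroring the rule described in the commit message.
    """
    selected = []
    for line in fstab_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.split()
        if len(fields) < 4:
            continue  # malformed entry, ignored in this sketch
        what, where, fstype, options = fields[:4]
        if where == "/usr" or "x-initrd.mount" in options.split(","):
            selected.append((what, where, fstype))
    return selected


EXAMPLE_FSTAB = """\
# made-up example entries
/dev/sda1  /      ext4  defaults              0 1
/dev/sda2  /usr   ext4  ro                    0 2
/dev/sda3  /home  ext4  defaults              0 2
/dev/sda4  /var   ext4  x-initrd.mount        0 2
/dev/sda5  none   swap  sw,x-initrd.mount     0 0
"""
```

On the made-up example above, only the /usr, /var and swap entries are selected, which matches the observation that on most systems the host's fstab contributes few or no units in the initrd.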
* | network/tuntap: save tun or tap file descriptor in fd store | Yu Watanabe | 2022-08-16 | 1 | -0/+1
| |
* | units: Simplify container getty handling | Daan De Meyer | 2022-07-28 | 2 | -7/+5
| | Let's remove the baud settings for the container getty units since they don't have any effect there anyway. On top of that, when we're dealing with container TTYs, we can handle all the setup involved ourselves so let's prevent agetty/login from touching the container tty at all.
| |
| | One example where this helps is that it actually makes disabling TTYVHangup have an effect since before, login would unconditionally call vhangup() on the tty.
* | tmpfiles: automatically provision /etc/issue.d/ + /etc/motd.d/ + /etc/hosts from credentials | Lennart Poettering | 2022-07-21 | 1 | -0/+3
* | tmpfiles: accept additional tmpfiles lines via credential | Lennart Poettering | 2022-07-20 | 3 | -0/+3
| |
* | tree-wide: fix typo | Yu Watanabe | 2022-07-20 | 1 | -1/+1
| |
* | sysusers: allow defining additional sysusers lines via credentials | Lennart Poettering | 2022-07-16 | 1 | -1/+5
| |
* | sysctl: also process sysctl requests via the "sysctl.extra" credential | Lennart Poettering | 2022-07-14 | 1 | -0/+1
|/
* logind: don't delay login for root even if systemd-user-sessions.service is not activated yet | Franck Bui | 2022-07-12 | 6 | -3/+33
| If for any reason something goes wrong during the boot process (most likely due to a network issue), system admins should be allowed to log in to the system to debug the problem. However due to the login session barrier enforced by systemd-user-sessions.service for all users, logins for root will be delayed until a (dbus) timeout expires. Besides being confusing, it's not a nice user experience to wait for an indefinite period of time (no message is shown), and this also suggests that something went wrong in the background.
|
| The reason for this delay is that all units involved in the creation of a user session are ordered after systemd-user-sessions.service, which is subject to network issues. If root needs to log in at that time, logind is requested to create a new session (via pam_systemd), which ultimately ends up waiting for systemd-user-sessions.service to be activated. This has the bad side effect of blocking login for root until the dbus call done by pam_systemd times out and the PAM stack proceeds anyway.
|
| To solve this problem, this patch orders the session scope units and the user instances after systemd-user-sessions.service for unprivileged users only.
* user: delegate cpu controller, assign weights to user slices | Zbigniew Jędrzejewski-Szmek | 2022-07-05 | 4 | -1/+10
| So far we didn't enable the cpu controller because of overhead of the accounting. If I'm reading things correctly, delegation was enabled for a while for the units with user and pam context set, i.e. for user@.service too. a931ad47a8623163a29d898224d8a8c1177ffdaf added the explicit Delegate=yes|no switch, but it was initially set to 'yes'. acc8059129b38d60c1b923670863137f8ec8f91a disabled delegation for user@.service with the justification that CPU accounting is expensive, but half a year later a88c5b8ac4df713d9831d0073a07fac82e884fb3 changed DefaultCPUAccounting=yes for kernels >=4.15 with the justification that CPU accounting is inexpensive there.
|
| In my (very noncomprehensive) testing, I don't see a measurable overhead if the cpu controller is enabled for user slices. I tried some repeated compilations, and there was no statistical difference, but the noise level was fairly high. Maybe better benchmarking would reveal a difference.
|
| The goal of this change is very simple: currently all of the user session, including services like the display server and pipewire, is under user@.service. This means that when e.g. a compilation job is started in the session's app.slice, the processes in session.slice compete for CPU and can be starved. In particular, audio starts to stutter, etc. With the CPU controller enabled, I can start 'ninja -C build -j40' in a tab and this doesn't have any noticeable effect on audio.
|
| I don't think the particular values matter too much: the CPU controller is work-conserving, and presumably the session slice would never need more than e.g. one full CPU, i.e. half or a quarter of available CPU resources on even the smallest of today's machines. app.slice and session.slice are assigned equal weights, background.slice is assigned a smaller fraction. CPUWeight=100 is the default, but I wrote it explicitly to make it easier for users to see how the split is done.
|
| So effectively this should result in session.slice getting as much power as it needs.
|
| If it turns out that this does have a noticeable overhead, we could make it opt-in. But I think that the benefit to usability is important enough to enable it by default. W/o something like this the session is not really usable with background tasks.
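The effect of the weights can be illustrated with a short Python calculation. It is a simplified model of a work-conserving, weight-proportional split and not how the kernel scheduler is implemented. The weight of 100 for session.slice and app.slice follows the commit message; the value used for background.slice is a hypothetical placeholder, since the log only says it gets a smaller fraction.

```python
# Weights per slice. 100 is the default named in the commit message;
# the background.slice value is an assumed placeholder, not taken from the log.
WEIGHTS = {
    "session.slice": 100,
    "app.slice": 100,
    "background.slice": 30,
}


def cpu_shares(weights, busy):
    """Split CPU among the slices that currently want to run.

    Work-conserving: idle slices are left out, so their share is
    redistributed to the busy ones in proportion to their weights.
    """
    total = sum(weights[name] for name in busy)
    return {name: weights[name] / total for name in busy}
```

With only session.slice and app.slice busy, each is entitled to half of the CPU under contention; if app.slice is the only busy slice it can use all of it. Because the display server and audio typically need far less than their entitlement, a large compilation in app.slice should not be able to starve them.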
* tree-wide: link to docs.kernel.org for kernel documentation | nl6720 | 2022-07-04 | 8 | -8/+8
| https://www.kernel.org/ links to https://docs.kernel.org/ for the documentation. See https://git.kernel.org/pub/scm/docs/kernel/website.git/commit/?id=ebc1c372850f249dd143c6d942e66c88ec610520
|
| These URLs are shorter and nicer looking.
* tree-wide: use html links for kernel docs | Zbigniew Jędrzejewski-Szmek | 2022-07-02 | 5 | -5/+5
| Instead of using "*.txt" as reference name, use the actual destination title.
* unit: prioritize module devices | Yu Watanabe | 2022-07-01 | 1 | -1/+1
| Also, prioritize tty and network devices.
|
| Follow-up for 2336bde96420475ccb054326f27290fa0228f27d
|
| Fixes #23850.
* units: add IgnoreOnIsolate=yes to systemd-journald too | Zbigniew Jędrzejewski-Szmek | 2022-07-01 | 3 | -6/+11
| We already had it on the socket units, so it's possible that systemd-journald.service would be stopped and then restarted when traffic hits the sockets when something logs. Let's not try to stop it. It is supposed to run until the end and be eventually killed in the final killing spree.
|
| This might (or not) help with #23287.
* units: remove the restart limit on the modprobe@.service | Alban Bedel | 2022-06-21 | 1 | -0/+1
| There are various cases where the same module might be repeatedly loaded in a short time frame, for example if a service depending on a module keeps restarting, or if many instances of such a service get started at the same time. If this happens the modprobe@.service instance will be marked as failed because it hit the restart limit.
|
| Overall it doesn't seem to make much sense to have a restart limit on the modprobe service so just disable it.
|
| Fixes: #23742
* pstore: Run after modules are loaded | Alexander Graf | 2022-06-14 | 1 | -0/+2
| The systemd-pstore service takes pstore files on boot and transfers them to disk. It only does it once on boot and only if it finds any.
|
| The typical location of the pstore on modern systems is the UEFI variable store. Most distributions ship with CONFIG_EFI_VARS_PSTORE=m. That means, the UEFI variable store is only available on boot after the respective module is loaded.
|
| In most situations, the pstore service gets loaded before the UEFI pstore, so we don't get to transfer logs. Instead, they accumulate, filling up the pstore over time, potentially breaking the UEFI variable store.
|
| Let's add a service dependency on any kernel module that can provide a pstore to ensure we only scan for pstate after we can actually see pstate.
|
| I have seen live occurrences of systems breaking because we did not erase the pstates and ran out of UEFI nvram space.
|
| Fixes https://github.com/systemd/systemd/issues/18540
* tree-wide: replace obsolete wiki links with systemd.io/manpages | Benjamin Franzke | 2022-05-21 | 3 | -3/+3
| All wiki pages that contain a deprecation banner pointing to systemd.io or manpages are updated to point to their replacements directly.
|
| Helpful command for identification of available links:
|
|     git grep freedesktop.org/wiki | \
|         sed "s#.*\(https://www.freedesktop.org/wiki[^ $<'\\\")]*\)\(.*\)#\\1#" | \
|         sort | uniq
* units: remove spurious empty line | Lennart Poettering | 2022-05-04 | 1 | -1/+0
|
* meson: also allow setting GIT_VERSION via templates | Zbigniew Jędrzejewski-Szmek | 2022-04-05 | 1 | -1/+1
| GIT_VERSION is not available as a config.h variable, because it's rendered into version.h during builds. Let's rework jinja2 rendering to also parse version.h. No functional change, the new variable is so far unused.
|
| I guess this will make partial rebuilds a bit slower, but it's useful to be able to use the full version string.
* unit: make systemd-udev-trigger.service use --prioritized-subsystem | Yu Watanabe | 2022-03-22 | 1 | -2/+1
| Replaces #19637 and #22643.
* spelling: weekday names are capitalized | Zbigniew Jędrzejewski-Szmek | 2022-03-21 | 1 | -1/+1
|
* unit: add units for new "systemd-sysupdate" tool | Lennart Poettering | 2022-03-19 | 5 | -0/+108
| These units (if enabled) will try to update the OS at regular intervals. Moreover, every day in the early morning this will attempt to reboot the system if there's a newer version installed than running.
* udev: run the main process, workers, and spawned commands in /udev subcgroup | Yu Watanabe | 2022-03-17 | 1 | -0/+1
| And enable cgroup delegation for udevd. Then, processes invoked through ExecReload= are assigned to the .control subcgroup, and they are not killed by cg_kill().
|
| Fixes #16867 and #22686.
* units: fix factory-reset.target description | Vivien Didelot | 2022-03-14 | 1 | -1/+1
| The current description for the factory reset target does not add any value and doesn't respect the definition of the related property as described in systemd.unit(5).
|
| Starting the target currently results in the following log:
|
|     [ 11.139174] systemd[1]: Reached target Target that triggers factory reset. Does nothing by default..
|     [ OK ] Reached target Target that…set. Does nothing by default..
|
| Simply update the target description to "Factory Reset".
|
| Signed-off-by: Vivien Didelot <vivien.didelot@gmail.com>
* units: drop After=systemd-resolved.service from systemd-nspawn@.service | Lennart Poettering | 2022-02-24 | 1 | -1/+1
| resolved is now started as part of early boot, hence we need no explicit ordering anymore.
* units: move resolved to sysinit.target (from basic.target) | Lennart Poettering | 2022-02-24 | 1 | -2/+2
| 79a67f3ca4d32c37b5e754501852a85eae908a6a pulled systemd-resolved.service in from basic.target instead of multi-user.target, i.e. the idea is to make it an early boot service, instead of a regular service.
|
| However, early boot services are supposed to be in sysinit.target, not basic.target (the latter is just one that combines the early boot services in sysinit.target, the sockets in sockets.target, the mounts in local-fs.target and so on into one big target).
|
| Also, the commit actually didn't add a synchronization point, i.e. no Before=, so that the whole thing was racy.
|
| Let's fix all that.
|
| Follow-up for 79a67f3ca4d32c37b5e754501852a85eae908a6a
* unit: escape % | Yu Watanabe | 2022-02-23 | 1 | -1/+1
| Fixes #22601.
* units: drop After=systemd-networkd.service from systemd-resolved.service | Lennart Poettering | 2022-02-23 | 1 | -1/+1
| This ordering existed since resolved was first created, but there should not be any need to order the two services against each other, as resolved should be able to pick up networkd DNS metadata either way (as it works with inotify in /run).
|
| Let's hence drop this, and not cargo-cult it to eternity.
|
| Also see: https://github.com/systemd/systemd/pull/22389#issuecomment-1045978403
* units: we need systemd-journald.service from systemd-journal-flush.service | Lennart Poettering | 2022-02-02 | 1 | -0/+1
| This is a follow-up for d5ee050ffc9d413253932d9340ade8c8fb111092, and reintroduces a requirement dep from systemd-journal-flush.service onto systemd-journald.service, but a weaker one than originally: a Wants= one instead of a Requires= one.
|
| Why? Simply because the service issues an IPC call to journald, hence it should pull it in. (Note that socket activation doesn't happen for the Varlink socket it uses, hence we should pull in the service itself.)
* unit: introduce wait-online@.service for specific interface | Yu Watanabe | 2022-01-28 | 2 | -0/+26
| This should be useful when a host has multiple interfaces.
|
| Inspired by #22246.