| Commit message (Collapse) | Author | Age | Files | Lines |
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
All daemons use a similar scheme to read their main config files and theirs
drop-ins. The main config files are always stored in /etc/systemd directory and
it's easy enough to construct the name of the drop-in directories based on the
name of the main config file.
Hence the new helper does that internally, which allows to reduce and simplify
the args passed previously to config_parse_many_nulstr().
Besides the overall code simplification it results:
16 files changed, 87 insertions(+), 159 deletions(-)
it allows to identify clearly the locations in the code where configuration
files are parsed.
|
|
|
|
| |
Signed-off-by: Morten Linderud <morten@linderud.pw>
|
|
|
|
|
|
|
| |
Although this slightly more verbose it makes it much easier to reason
about. The code that produces the tests heavily benefits from this.
Test lists are also now sorted by test name.
|
| |
|
|
|
|
|
|
| |
Meson+ninja+compiler do this for us and are better at it.
https://mesonbuild.com/FAQ.html#do-i-need-to-add-my-headers-to-the-sources-list-like-in-autotools
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
-1 was used everywhere, but -EBADF or -EBADFD started being used in various
places. Let's make things consistent in the new style.
Note that there are two candidates:
EBADF 9 Bad file descriptor
EBADFD 77 File descriptor in bad state
Since we're initializating the fd, we're just assigning a value that means
"no fd yet", so it's just a bad file descriptor, and the first errno fits
better. If instead we had a valid file descriptor that became invalid because
of some operation or state change, the other errno would fit better.
In some places, initialization is dropped if unnecessary.
|
|
|
|
|
| |
Otherwise, the dry run isn't much use since it would be logged at debug
and not seen.
|
|
|
|
|
| |
Explicitly state that ManagedOOMPreference is always honored when the
unit's cgroup is owned by root.
|
|
|
|
|
|
| |
If the cgroup is owned by root there is no need to get prefix_uid. Only
check prefix_uid when uid != 0, and then set MANAGED_OOM_PREFERENCE_NONE
and return early if uid != prefix_uid.
|
|
|
|
|
|
|
|
|
|
| |
This conditional with !empty_or_root(ctx->path) always returns false
because the most recent oomd_cgroup_context_acquire() call was with the
root cgroup. Make sure this test case can be reached by checking cgroup
instead of ctx->path.
While here, use an unused uid (61183) instead of the nobody uid so the
test case does not fail in unprivileged LXD containers.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Commit 652a4efb66a ("oomd: loosen the restriction on ManagedOOMPreference")
made the change to allow ManagedOOMPreference on a cgroup candidate when
the monitored cgroup and cgroup candidate are owned by the same user.
The commit assumed that this check was sufficient to continue allowing
ManagedOOMPreference on all cgroups owned by root. However, it caused a
regression for unprivileged LXD containers where e.g. /sys/fs/cgroup is
owned by nobody (uid=65534).
Fix this by explicitly allowing the ManagedOOMPreference if uid == 0 in
oomd_fetch_cgroup_oom_preference().
|
| |
|
|
|
|
|
|
|
|
| |
(s) is just ugly with a vibe of DOS. In most cases just using the normal plural
form is more natural and gramatically correct.
There are some log_debug() statements left, and texts in foreign licenses or
headers. Those are not touched on purpose.
|
|
|
|
|
| |
We check the same list of error codes on various xattr operations, and
we should on some more. Add a common helper for this purpose.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Building with GCC 12.2 and binutils 2.39 fails on riscv64 Ubuntu Kinetic
with:
FAILED: systemd-oomd
/usr/bin/ld: systemd-oomd.p/src_oom_oomd-util.c.o:
in function `oomd_cgroup_context_acquire':
build/../src/oom/oomd-util.c:415:
undefined reference to `__atomic_exchange_1'
We have to link with -latomic.
Signed-off-by: Heinrich Schuchardt <heinrich.schuchardt@canonical.com>
|
| |
|
| |
|
|
|
|
|
|
|
| |
When we kill a cgroup that is towards the end of the sorted candidate
list (i.e. when we have to resort to killing a candidate with
ManagedOOMPreference=avoid), this cgroup is not logged in the candidate
list. This is due to an off-by-one error when assigning dump_until.
|
|
|
|
|
|
|
| |
Add a new test function, test_oomd_fetch_cgroup_oom_preference, to test
the ManagedOOMPreference logic. For starters, cut the relevant tests out
of test_oomd_cgroup_context_acquire_and_insert, and add them to the new
function. Then, expand these tests to cover the new behavior.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The ManagedOOMPreference property is only honored on cgroups which are
owned by root. This precludes anyone from setting ManagedOOMPreference
on cgroups managed by user managers.
Loosen this restriction in the following way: when processing a
monitored cgroup for kill candidates, honor the ManagedOOMPreference
setting if the monitored cgroup and cgroup candidate are owned by the
same user. This allows unprivileged users to configure
ManagedOOMPreference on their cgroups without affecting the kill
priority of ancestor cgroups.
N.B. that since swap kill operates globally to kill the largest
candidate, it is not appropriate to apply this logic to the swap kill
scenario. Therefore, the existing restriction on ManagedOOMPreference
will remain when calculating candidates for swap kill.
Add a new function, oomd_fetch_cgroup_oom_preference, to assist with
this new logic. To simplify things, move the `user.oomd_{avoid,omit}`
xattr reads to this function so that the xattr reads and uid checks are
performed all at once.
|
| |
|
|
|
|
| |
Fixes https://github.com/systemd/systemd/issues/23785#issuecomment-1210030100.
|
|
|
|
| |
grep -l -r http:// | xargs sed -E -i s'#http://(.*).freedesktop.org#https://\1.freedesktop.org#'
|
|
|
|
| |
The latter is the common spelling apparently.
|
|
|
|
| |
Follow-up for a858355e4a7168625ec1b9e5d17fdb6a11dfecb8.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The calculation for used memory in oomd_system_context_acquire is given
by MemTotal - MemFree from /proc/meminfo. This is too strict of a
calculation because it does not consider memory that is still available
for starting new applictions without swapping (MemAvailable). As a
result, systemd-oomd can start to kill processes before it is necessary.
This is more apparent on systems with low swap space.
Instead, compute 'used' memory as MemTotal - MemAvailable in
oomd_system_context_acquire and procfs_memory_get (which is used by
oomd_cgroup_context_acquire). And, rename oomd_mem_free_below to
oomd_mem_available_below for clarity.
|
|
|
|
|
| |
Check if `user.oomd_ooms` xattr is being set as part of `oomd_cgroup_kill()`
this xattr tracks OOM kills that were initiated by systemd-oomd.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
To notify user of kill events from systemd-oomd we now use
`SERVICE_FAILURE_OOM_KILL` as the failure result.
`unit_check_oomd_kill` now calls `notify_cgroup_oom` to
update the service result to `oom-kill`.
We add a new xattr `user.oomd_ooms` to keep track of the OOM kills
initiated by systemd-oomd, this helps us resolve a race between sending
SIGKILL to processes and checking for OOM kill status from the xattr.
Related to: #20649
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
The approach to use '''…'''.split() instead of a list of strings was initially
used when converting from automake because it allowed identical blocks of lines
to be used for both, making the conversion easier.
But over the years we have been using normal lists more and more, especially
when there were just a few filenames listed. This converts the rest.
No functional change.
|
|\
| |
| | |
test: fix fd leaks
|
| |
| |
| |
| | |
Fixes an issue reported in #22576.
|
| |
| |
| |
| | |
Fixes #22577.
|
|/ |
|
|
|
|
|
|
| |
This is a follow up to 29f4185a9cdc ("oomd: Dump top offenders after a
kill action") to clean up the code a bit for review comments that
happened after the code had been merged already.
|
|
|
|
|
| |
This hopefully makes it more transparent why a specific cgroup was
killed by systemd-oomd.
|
|
|
|
|
|
|
|
|
| |
Currently if systemd-oomd doesn't kill anything in a selected cgroup, it
selects a new candidate immediately. But if a selected cgroup wasn't killed,
it is likely due to it disappearing or getting cleaned up between the time
it was selected as a candidate and getting sent SIGKILL(s). We should handle
it as though systemd-oomd did perform a kill so that it will check
swap/pressure again before it tries to select a new candidate.
|
|
|
|
|
|
|
|
|
|
|
| |
There can be a situation where systemd-oomd would kill all of the processes
in a cgroup, pid1 would clean up that cgroup, and systemd-oomd would get
ENODEV trying to iterate the cgroup a final time to ensure it was empty.
systemd-oomd sees this as an error and immediately picks a new candidate even
though pressure may have recovered. To counter this, check and handle
path unavailability errnos specially.
Fixes: #22030
|
|
|
|
|
|
| |
Not having to provide the full path in the source tree is much
nicer and the produced lists can also be used anywhere in the source
tree.
|
|
|
|
| |
The end result is the same.
|
| |
|
| |
|
|
|
|
| |
Follow-up for #20839
|
|
|
|
|
|
|
|
|
|
| |
loadavg.h is an internal header of the Linux source repository, and as
such it is licensed as GPLv2-only, without syscall exception.
We use it only for 4 macros, which are simply doing some math calculations
that cannot thus be subject to copyright.
Reimplement the same calculations in another internal header and delete
loadavg.h from our tree.
|
|\
| |
| | |
oom: Support for user services
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Compared to PID1 where systemd-oomd has to be the client to PID1
because PID1 is a more privileged process than systemd-oomd, systemd-oomd
is the more privileged process compared to a user manager so we have
user managers be the client whereas systemd-oomd is now the server.
The same varlink protocol is used between user managers and systemd-oomd
to deliver ManagedOOM property updates. systemd-oomd now sets up a varlink
server that user managers connect to to send ManagedOOM property updates.
We also add extra validation to make sure that non-root senders don't
send updates for cgroups they don't own.
The integration test was extended to repeat the chill/bloat test using
a user manager instead of PID1.
|
| |
| |
| |
| |
| |
| | |
Gets rid of a few gotos, allows removing the extra ret variable and
will also be used in a future commit by the codepath that receives
cgroups from user instances of systemd.
|
| | |
|
|/
|
|
|
|
|
|
|
|
|
|
|
|
| |
LLVM 13 introduced `-Wunused-but-set-variable` diagnostic flag, which
trips over some intentionally set-but-not-used variables or variables
attached to cleanup handlers with side effects (`_cleanup_umask_`,
`_cleanup_(notify_on_cleanup)`, `_cleanup_(restore_sigsetp)`, etc.):
```
../src/basic/process-util.c:1257:46: error: variable 'saved_ssp' set but not used [-Werror,-Wunused-but-set-variable]
_cleanup_(restore_sigsetp) sigset_t *saved_ssp = NULL;
^
1 error generated.
```
|