path: root/src/basic/cgroup-util.c
Commit message (Author, Age, Files, Lines)
* cgroup: rework how we validate/escape cgroups (Lennart Poettering, 2023-04-27, 1 file, -44/+55)
|   Let's clean up validation/escaping of cgroup names, i.e. split out the code that tests whether a name needs escaping. Return proper error codes, and extend the test a bit.
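A minimal sketch of the split described above, with the "needs escaping" test separated from the escaping routine and errors reported as negative errno values. The names `name_needs_escape()` and `escape_name()`, the rule set and the `_` prefix are assumptions for illustration, not the code from this commit.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical rule set: names that start with the escape prefix or a dot,
 * or that look like kernel attribute files, need escaping. */
static bool name_needs_escape(const char *name) {
        if (name[0] == '_' || name[0] == '.')
                return true;

        return strncmp(name, "cgroup.", 7) == 0;
}

/* Returns 0 and a newly allocated string in *ret, or a negative errno. */
static int escape_name(const char *name, char **ret) {
        size_t n;
        char *s;

        if (!name || !ret || name[0] == '\0' || strchr(name, '/'))
                return -EINVAL;

        n = strlen(name);
        s = malloc(n + 2); /* room for the prefix and the trailing NUL */
        if (!s)
                return -ENOMEM;

        if (name_needs_escape(name)) {
                s[0] = '_';
                memcpy(s + 1, name, n + 1);
        } else
                memcpy(s, name, n + 1);

        *ret = s;
        return 0;
}
```

Keeping the predicate separate lets a caller test a name without allocating anything.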
* Include <threads.h> if possible to get thread_local definition (Cristian Rodríguez, 2023-03-06, 1 file, -0/+1)
|   In C23, thread_local is a reserved keyword and we shall therefore do nothing to redefine it. glibc has it defined for older standard versions with the right conditions.
|
|   v2 by Yu Watanabe: Move the definition to missing_threads.h, the same way we define e.g. missing syscalls or missing definitions, and include it where it is used.
|
|   Co-authored-by: Yu Watanabe <watanabe.yu+github@gmail.com>
* core: add cg_path_get_unit_path() (Quentin Deslandes, 2023-02-08, 1 file, -0/+22)
|   From a given cgroup path, cg_path_get_unit() retrieves the unit's name. However, it strips the path to the unit's cgroup, so the result cannot be used to fetch xattrs. Introduce cg_path_get_unit_path(), which provides the path to the unit's cgroup. This function behaves similarly to cg_path_get_unit() (checking the validity and escaping the unit's name).
* basic: fix hosed return value in skip_session() (Cristian Rodríguez, 2023-01-03, 1 file, -1/+1)
|   ../src/basic/cgroup-util.c: In function ‘skip_session’:
|   ../src/basic/cgroup-util.c:1241:32: error: incompatible types when returning type ‘_Bool’ but ‘const char *’ was expected
|    1241 |                         return false;
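The class of bug behind this error can be sketched as follows: a function that returns `const char *` has to signal "no match" with `NULL`, not `false`. The helper `skip_prefix()` is hypothetical and much simpler than skip_session(); it only illustrates the return-type issue.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns the text after a leading "session-" prefix,
 * or NULL when the prefix is absent. Writing `return false;` here is what
 * the compiler rejected in the diagnostic above, since a boolean value is
 * not a pointer. */
static const char *skip_prefix(const char *p) {
        static const char prefix[] = "session-";

        if (!p || strncmp(p, prefix, sizeof(prefix) - 1) != 0)
                return NULL;

        return p + sizeof(prefix) - 1;
}
```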
* Rename def.h to constants.h (Zbigniew Jędrzejewski-Szmek, 2022-11-08, 1 file, -1/+1)
|   The name "def.h" originates from before the rule of "no needless abbreviations" was established. Let's rename the file to clarify that it contains a collection of various semi-related constants.
* cgroup-util: Properly handle conditions where cgroup.threads is empty after SIGKILL but processes still remain (msizanoen1, 2022-05-31, 1 file, -3/+12)
|   After sending a SIGKILL to a process, the process might disappear from `cgroup.threads` but still show up in `cgroup.procs`. It then remains in the cgroup and causes migrating new processes to `Delegate=yes` cgroups to fail with `-EBUSY`. This is especially likely for heavyweight processes that consume more kernel CPU time to clean up.
|
|   Fix this by only returning 0 when both `cgroup.threads` and `cgroup.procs` are empty.
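The condition described above can be sketched as a check that treats a cgroup as drained only when both listings are empty. The helper names and the return convention (1 for drained, 0 for still populated, -1 on error) are assumptions for illustration and differ from the actual function in cgroup-util.c.

```c
#include <stdio.h>

/* Returns 1 if the file has no content, 0 if it has some, -1 if it cannot be opened. */
static int file_is_empty(const char *path) {
        FILE *f = fopen(path, "r");
        int c;

        if (!f)
                return -1;

        c = fgetc(f);
        fclose(f);
        return c == EOF;
}

/* Hypothetical check: the cgroup counts as drained only when both
 * cgroup.threads and cgroup.procs list nothing. */
static int cgroup_is_drained(const char *threads_path, const char *procs_path) {
        int t, p;

        t = file_is_empty(threads_path);
        if (t < 0)
                return -1;

        p = file_is_empty(procs_path);
        if (p < 0)
                return -1;

        return t && p;
}
```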
* cgroup-util: introduce cg_is_threaded() (Yu Watanabe, 2022-02-16, 1 file, -0/+24)
|
* cgroup-util: refuse the case that both path and suffix are empty strings (Yu Watanabe, 2022-02-12, 1 file, -1/+1)
|   Fixes CID#1322378.
* cgroup-util: minor modernizations (Lennart Poettering, 2022-02-11, 1 file, -27/+20)
|   Rename return parameters to "ret", use ternary op without second argument, rebreak comments, use isempty() more.
* tree-wide: make FOREACH_DIRENT_ALL define the iterator variable (Zbigniew Jędrzejewski-Szmek, 2021-12-15, 1 file, -2/+0)
|   The variable is not useful outside of the loop (it'll always be null after the loop is finished), so we can declare it inline in the loop. This saves one variable declaration and reduces the chances that somebody tries to use the variable outside of the loop.
|
|   For consistency, 'de' is used everywhere for the var name.
* tree-wide: use new RET_NERRNO() helper at various places (Lennart Poettering, 2021-11-16, 1 file, -8/+2)
|
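As a rough illustration of what a helper like RET_NERRNO() is for, the sketch below converts the libc convention of "-1 plus errno" into a negative errno return value. The function name `ret_nerrno_sketch()` and the `-EIO` fallback are assumptions; the real helper may behave differently in the details.

```c
#include <errno.h>
#include <unistd.h>

/* Hypothetical stand-in: maps a failed libc call to -errno and passes
 * successful return values through unchanged. */
static inline int ret_nerrno_sketch(int ret) {
        if (ret < 0)
                return errno > 0 ? -errno : -EIO; /* fallback is an assumption */

        return ret;
}

/* Example caller: access() wrapped so that it returns a negative errno. */
static int access_nerrno(const char *path, int mode) {
        return ret_nerrno_sketch(access(path, mode));
}
```

With such a helper, the usual "if (r < 0) return -errno; return 0;" sequence collapses into a single return statement, which fits the net line reduction (-8/+2) recorded for this commit.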
* cgroup-util: laccess() returns negative errno already (Lennart Poettering, 2021-11-16, 1 file, -4/+1)
|
* alloc-util: add strdupa_safe() + strndupa_safe() and use it everywhere (Lennart Poettering, 2021-10-14, 1 file, -1/+1)
|   Let's define two helpers strdupa_safe() + strndupa_safe() which do the same as their non-safe counterparts, except that they abort if called with allocations larger than ALLOCA_MAX. This should ensure that all our alloca() based allocations are subject to this limit.
|
|   afaics glibc offers three alloca() based APIs: alloca() itself, strndupa() + strdupa(). With this we have now replacements for all of them, that take the limit into account.
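A minimal sketch of the idea, assuming a GNU C compiler (statement expressions) and a platform that provides `<alloca.h>`. The macro name `strndupa_safe_sketch()` and the value of `ALLOCA_MAX_SKETCH` are made up for illustration and are not taken from alloc-util.

```c
#include <alloca.h>
#include <stdlib.h>
#include <string.h>

/* Assumed limit for illustration; the real ALLOCA_MAX may differ. */
#define ALLOCA_MAX_SKETCH (4U * 1024U * 1024U)

/* Hypothetical stand-in for strndupa_safe(): copies at most n bytes of s
 * onto the stack and aborts if the copy would reach the limit. */
#define strndupa_safe_sketch(s, n)                                      \
        ({                                                              \
                const char *_s = (s);                                   \
                size_t _n = (n);                                        \
                const char *_e = memchr(_s, '\0', _n);                  \
                size_t _len = _e ? (size_t) (_e - _s) : _n;             \
                char *_d;                                               \
                if (_len >= ALLOCA_MAX_SKETCH)                          \
                        abort();                                        \
                _d = alloca(_len + 1);                                  \
                memcpy(_d, _s, _len);                                   \
                _d[_len] = '\0';                                        \
                _d;                                                     \
        })
```

The copy lives on the caller's stack frame, so it must not be returned from the function that invokes the macro.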
* Merge pull request #20910 from poettering/nftw-no-more (Lennart Poettering, 2021-10-07, 1 file, -1/+0)
|\
| |   basic: add recurse_dir() function as modern replacement for nftw()
| * tree-wide: remove a few unnecessary inclusions of ftw.h (Lennart Poettering, 2021-10-07, 1 file, -1/+0)
| |
* | cgroup-util: add reusable union type for cgroupfs file_handle structs (Lennart Poettering, 2021-10-07, 1 file, -10/+3)
| |   That way we can easily call name_to_handle_at() on cgroupfs2 elsewhere.
* | xattr-util: merge various getxattr()/listxattr() helpers into getxattr_at_malloc() + listxattr_at_malloc() (Lennart Poettering, 2021-10-07, 1 file, -1/+1)
|/
|   Unfortunately fgetxattr() and flistxattr() don't work via O_PATH fds. Let's thus add fallbacks to go via /proc/self/fd/ in these cases.
|
|   Also, let's merge all the various flavours we have here into singular implementations that can do everything we need:
|
|   1. malloc() loop handling
|   2. by fd, by path, or combination (i.e. a proper openat() like API)
|   3. work on O_PATH
* cgroup-util: add cg_path_get_cgroupid() (Iago López Galeiras, 2021-10-06, 1 file, -0/+23)
|   It returns the cgroupID from a cgroup path.
* cgroup-util: use string_hash_ops_free (Yu Watanabe, 2021-09-11, 1 file, -6/+2)
|
* cgroup-util: use _cleanup_free_ attribute (Yu Watanabe, 2021-09-11, 1 file, -8/+4)
|
* core: implement RestrictNetworkInterfaces= (Mauricio Vásquez, 2021-08-18, 1 file, -0/+1)
|   This commit introduces all the logic to load and attach the BPF programs to restrict network interfaces when a unit specifying it is loaded.
|
|   Signed-off-by: Mauricio Vásquez <mauricio@kinvolk.io>
* pid1: add support for cgroup.kill (Albert Brox, 2021-08-09, 1 file, -23/+72)
|
* tree-wide: always drop unnecessary dot in path (Yu Watanabe, 2021-05-28, 1 file, -5/+5)
|
* alloc-util: simplify GREEDY_REALLOC() logic by relying on malloc_usable_size() (Lennart Poettering, 2021-05-19, 1 file, -2/+2)
|   We recently started making more use of malloc_usable_size() and rely on it (see the string_erase() story). Given that we don't really support systems where malloc_usable_size() cannot be trusted beyond statistics anyway, let's go fully in and rework GREEDY_REALLOC() on top of it: instead of passing around and maintaining the currently allocated size everywhere, let's just derive it automatically from malloc_usable_size().
|
|   I am mostly after this for the simplicity this brings. It also brings minor efficiency improvements I guess, but things become so much nicer to look at if we can avoid these allocation size variables everywhere.
|
|   Note that the malloc_usable_size() man page says relying on it wasn't "good programming practice", but I think it does this for reasons that don't apply here: the greedy realloc logic specifically doesn't rely on the returned extra size, beyond the fact that it is equal or larger than what was requested.
|
|   (This commit was supposed to be a quick patch btw, but apparently we use the greedy realloc stuff quite a bit across the codebase, so this ends up touching *a*lot* of code.)
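The approach described above can be sketched as a growth helper that asks the allocator for the current capacity instead of tracking it in a separate variable. This assumes glibc's `malloc_usable_size()` from `<malloc.h>`; the function name `greedy_realloc_sketch()`, the doubling policy and the 64-byte minimum are illustrative choices, not the actual GREEDY_REALLOC() implementation.

```c
#include <malloc.h>
#include <stdint.h>
#include <stdlib.h>

/* Ensures *p can hold at least `need` elements of `size` bytes each.
 * Returns *p on success, NULL on failure (leaving *p untouched). */
static void *greedy_realloc_sketch(void **p, size_t need, size_t size) {
        size_t bytes, newalloc;
        void *q;

        if (size == 0 || need > SIZE_MAX / size)
                return NULL; /* overflow or nonsensical element size */

        bytes = need * size;

        /* The current capacity comes from the allocator, not from the caller. */
        if (*p && malloc_usable_size(*p) >= bytes)
                return *p;

        /* Grow geometrically so that repeated appends stay cheap. */
        newalloc = bytes > SIZE_MAX / 2 ? bytes : bytes * 2;
        if (newalloc < 64)
                newalloc = 64;

        q = realloc(*p, newalloc);
        if (!q)
                return NULL;

        *p = q;
        return q;
}
```

As in the commit message, the sketch only relies on the usable size being at least what was requested.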
* core: add socket-bind cgroup mask harness (Julia Kartseva, 2021-04-26, 1 file, -0/+1)
|   Standard cgroup harness for bpf feature.
* core: add bpf-foreign cgroup mask and harness (Julia Kartseva, 2021-04-09, 1 file, -0/+1)
|   Add CGROUP_MASK_BPF_FOREIGN to CGROUP_MASK_BPF and standard cgroup context harness.
* tree-wide: use the same comment for work-around initializations (Zbigniew Jędrzejewski-Szmek, 2021-04-07, 1 file, -1/+1)
|   This should make it easier to remove those warnings when the compiler gets smarter. Not sure if I got them all... Double space before the comment start to make it easier to separate from the preceding line.
* basic/cgroup-util: silence gcc warning about uninitialized variable (Zbigniew Jędrzejewski-Szmek, 2021-03-31, 1 file, -1/+2)
|
* tree-wide: use read_full_virtual_file() where appropriate (Lennart Poettering, 2021-03-17, 1 file, -1/+1)
|   Wherever we read virtual files we better should use read_full_virtual_file(), to make sure we get a consistent response given how weird the kernel's handling with partial read on such file systems is.
* cg_unified_cached: return ENOMEDIUM if we cannot find a known hierarchy (Mike Gilbert, 2021-03-17, 1 file, -1/+7)
|   When the test suite is being run in a foreign environment, /sys/fs/cgroup might not be set up in a way that we recognize. Returning ENOMEDIUM causes the tests to be skipped in this case.
|
|   Bug: https://bugs.gentoo.org/771819
* basic/group-util: optimize alloca use (Zbigniew Jędrzejewski-Szmek, 2021-03-11, 1 file, -7/+5)
|   Follow-up for 0fa7b50053.
* Merge pull request #18553 from Werkov/cgroup-user-instance-controllers (Zbigniew Jędrzejewski-Szmek, 2021-03-10, 1 file, -42/+43)
|\
| |   Make (user) instance aware of delegated cgroup controllers
| * core: Make (user) instance aware of delegated cgroup controllers (Michal Koutný, 2021-02-11, 1 file, -14/+27)
| |   The systemd user instance assumed the same controllers are available to it as to PID 1. That is not true generally: in v1 (legacy, hybrid) we don't delegate any controllers to anyone, and in v2 (unified) we may delegate only a subset of controllers. The user instance would fail silently when the controller cgroup cannot be created or the controller cannot be enabled on the unified hierarchy.
| |
| |   The changes in 7b63961415 ("cgroup: Swap cgroup v1 deletion and migration") caused some attempts of operating on non-delegated controllers to be logged.
| |
| |   Make the user instance first check what controllers are available to it and narrow operations only to these controllers. The original checks are kept in place.
| |
| |   Note that daemon-reexec needs to be invoked in order to update the set of enabled controllers after a change.
| |
| |   Fixes: #18047
| |   Fixes: #17862
| * cgroup: Simplify cg_get_path_and_check (Michal Koutný, 2021-02-11, 1 file, -34/+22)
| |   The function controller_is_accessible() doesn't really do much in case of the unified hierarchy. Move common parts into cg_get_path_and_check and make the controller check v1 specific.
| |
| |   This is refactoring only.
* | basic: tighten two filename length checks (Lennart Poettering, 2021-03-08, 1 file, -1/+1)
| |   This fixes two checks where we compare string sizes when validating with FILENAME_MAX. In both cases the check apparently wants to check if the name fits in a filename, but that's not actually what FILENAME_MAX can be used for, as it — in contrast to what the name suggests — actually encodes the maximum length of a path.
| |
| |   In both cases the stricter change doesn't actually change much, but the use of FILENAME_MAX is still misleading and typically wrong.
* | tree-wide: use UINT64_MAX or friends (Yu Watanabe, 2021-03-05, 1 file, -1/+1)
| |
* | Merge pull request #18401 from anitazha/oomdxattr (Zbigniew Jędrzejewski-Szmek, 2021-02-13, 1 file, -0/+41)
|\ \
| |/
|/|
| |   oomd: implement avoid/omit support for cgroups
| * oom: implement avoid/omit xattr support (Anita Zhang, 2021-02-09, 1 file, -0/+33)
| |   There may be situations where a cgroup should be protected from killing or deprioritized as a candidate. In FB oomd, xattrs are used to bias oomd away from supervisor cgroups and towards worker cgroups in container tasks. On desktops this can be used to protect important units with unpredictable resource consumption.
| |
| |   The patch allows systemd-oomd to understand 2 xattrs: "user.oomd_avoid" and "user.oomd_omit". If systemd-oomd sees these xattrs set to 1 on a candidate cgroup (i.e. while attempting to kill something) AND the cgroup is owned by root, it will either deprioritize the cgroup as a candidate (avoid) or remove it completely as a candidate (omit).
| |
| |   Usage is restricted to root owned cgroups to prevent situations where an unprivileged user can set their own cgroups lower in the kill priority than another user's (and prevent them from omitting their units from systemd-oomd killing).
| * cgroup-util: add ManagedOOMPreference enum to use between pid1 and oomd (Anita Zhang, 2021-02-09, 1 file, -0/+8)
| |
* | tree-wide: replace strverscmp() and str_verscmp() with strverscmp_improved() (Yu Watanabe, 2021-02-09, 1 file, -1/+1)
|/
* treewide: tighten variable scope in loops (#18372) (Susant Sahani, 2021-01-27, 1 file, -5/+3)
|   Also use _cleanup_free_ in one more place.
* string-util: imply NULL termination of strextend() argument list (Lennart Poettering, 2021-01-06, 1 file, -2/+2)
|   The trailing NULL in the argument list is now implied (similar to what we already have in place in strjoin()).
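One common way to imply the terminator is a variadic macro that appends the sentinel itself. The sketch below shows that technique; `strextend_internal()` and `strextend_sketch()` are hypothetical names and the actual string-util code may be structured differently.

```c
#include <stdarg.h>
#include <stdlib.h>
#include <string.h>

/* Appends all strings up to the NULL sentinel to *x, reallocating as needed.
 * Returns the new string, or NULL on allocation failure (*x stays valid). */
static char *strextend_internal(char **x, ...) {
        size_t len = *x ? strlen(*x) : 0, total = len;
        const char *s;
        char *r, *p;
        va_list ap;

        /* First pass: compute the final length. */
        va_start(ap, x);
        while ((s = va_arg(ap, const char *)))
                total += strlen(s);
        va_end(ap);

        r = realloc(*x, total + 1);
        if (!r)
                return NULL;

        /* Second pass: copy the pieces behind the existing content. */
        p = r + len;
        va_start(ap, x);
        while ((s = va_arg(ap, const char *))) {
                size_t n = strlen(s);

                memcpy(p, s, n);
                p += n;
        }
        va_end(ap);

        *p = '\0';
        *x = r;
        return r;
}

/* The macro supplies the sentinel, so callers no longer pass NULL themselves. */
#define strextend_sketch(x, ...) \
        strextend_internal(x, __VA_ARGS__, (const char *) NULL)
```

A call then reads `strextend_sketch(&s, "/", "user.slice")`, with no trailing NULL to forget.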
* tree-wide: fix "a the" or "the a" (Yu Watanabe, 2020-11-13, 1 file, -3/+2)
|
* license: LGPL-2.1+ -> LGPL-2.1-or-later (Yu Watanabe, 2020-11-09, 1 file, -1/+1)
|
* cgroup-util: add cg_get_attribute_as_bool() helper (Anita Zhang, 2020-10-07, 1 file, -0/+20)
|
* core: add ManagedOOM*= properties to configure systemd-oomd on the unit (Anita Zhang, 2020-10-07, 1 file, -0/+7)
|   This adds the hook ups so it can be read with the usual systemd utilities. Used in later commits by systemd-oomd.
* basic/cgroup-util: port over to string_contains_word() (Zbigniew Jędrzejewski-Szmek, 2020-09-09, 1 file, -23/+13)
|
* core: introduce support for cgroup freezer (Michal Sekletár, 2020-04-30, 1 file, -3/+15)
|   With cgroup v2 the cgroup freezer is implemented as a cgroup attribute called cgroup.freeze. A cgroup can be frozen by writing "1" to the file, and the kernel will send us a notification through "cgroup.events" after the operation is finished and the processes in the cgroup have entered a quiescent state, i.e. they are not scheduled to run. Writing "0" to the attribute file does the inverse and process execution is resumed.
|
|   This commit exposes the above low-level functionality through systemd's DBus API. Each unit type must provide a specialized implementation for these methods, otherwise we return an error. So far only service, scope, and slice unit types provide the support. It is possible to check if a given unit has the support using the CanFreeze() DBus property.
|
|   Note that the DBus API has a synchronous behavior and we dispatch the reply to freeze/thaw requests only after the kernel has notified us that the requested operation was completed.
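The low-level part described above, writing "1" or "0" to cgroup.freeze, can be sketched as follows. The function name `cgroup_set_frozen()` and its error handling are assumptions for illustration. The sketch does not wait for the "cgroup.events" notification, so it only requests the state change.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical helper: requests freezing (true) or thawing (false) of the
 * cgroup whose directory is cgroup_dir. Returns 0 or a negative errno. */
static int cgroup_set_frozen(const char *cgroup_dir, bool freeze) {
        char path[4096];
        FILE *f;
        int n;

        n = snprintf(path, sizeof(path), "%s/cgroup.freeze", cgroup_dir);
        if (n < 0 || (size_t) n >= sizeof(path))
                return -ENAMETOOLONG;

        f = fopen(path, "w");
        if (!f)
                return -errno;

        if (fputs(freeze ? "1" : "0", f) == EOF) {
                fclose(f);
                return -EIO;
        }

        /* The write may only reach the kernel on close, so check that too. */
        if (fclose(f) != 0)
                return -errno;

        return 0;
}
```

A complete implementation would then watch "cgroup.events" until the kernel reports that the operation has finished.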
* basic/cgroup-util: introduce cg_get_keyed_attribute_full() (Michal Sekletár, 2020-04-29, 1 file, -3/+10)
|   Callers of cg_get_keyed_attribute_full() can now specify via the flag whether missing keys in the cgroup attribute file are OK or not. Wrappers for both the strict and the graceful version are provided as well.
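A minimal sketch of the strict versus graceful distinction, operating on attribute content that is already in memory in the "key value" per line layout. The function name `keyed_attribute_lookup()`, its signature and the `-ENXIO` error code are assumptions and do not mirror cg_get_keyed_attribute_full().

```c
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Looks up `key` in content made of "key value\n" lines.
 * Returns 1 and a newly allocated value in *ret when found. When the key is
 * missing, returns 0 if missing_ok is set, otherwise -ENXIO (assumed code). */
static int keyed_attribute_lookup(const char *content, const char *key,
                                  bool missing_ok, char **ret) {
        size_t klen = strlen(key);
        const char *p = content;

        *ret = NULL;

        while (*p) {
                const char *eol = strchr(p, '\n');
                size_t llen = eol ? (size_t) (eol - p) : strlen(p);

                if (llen > klen && strncmp(p, key, klen) == 0 && p[klen] == ' ') {
                        size_t vlen = llen - klen - 1;
                        char *v = malloc(vlen + 1);

                        if (!v)
                                return -ENOMEM;

                        memcpy(v, p + klen + 1, vlen);
                        v[vlen] = '\0';
                        *ret = v;
                        return 1;
                }

                p += llen + (eol ? 1 : 0);
        }

        return missing_ok ? 0 : -ENXIO;
}
```

Strict and graceful wrappers would then simply fix the `missing_ok` argument to false or true.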
* cgroup-util: check for SYSFS_MAGIC when detecting cgroup format (Dan Streetman, 2020-04-25, 1 file, -0/+3)
|   When nothing at all is mounted at /sys/fs/cgroup, the fs.f_type is SYSFS_MAGIC (0x62656572) which results in the confusing debug log: "Unknown filesystem type 62656572 mounted on /sys/fs/cgroup."
|
|   Instead, if the f_type is SYSFS_MAGIC, a more accurate message is: "No filesystem is currently mounted on /sys/fs/cgroup."
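The detection logic can be sketched as a classification of the `f_type` value reported by statfs(2), assuming Linux with `<linux/magic.h>` available. The enum, the function names and the mapping of tmpfs to a legacy or hybrid setup are illustrative assumptions; only the SYSFS_MAGIC case and the two log messages come from the commit message.

```c
#include <linux/magic.h>
#include <stdio.h>
#include <sys/vfs.h>

typedef enum {
        CG_FS_UNIFIED,      /* cgroup2 mounted directly */
        CG_FS_LEGACY,       /* tmpfs, as used by legacy/hybrid setups (assumption) */
        CG_FS_NOT_MOUNTED,  /* we are looking at plain sysfs */
        CG_FS_UNKNOWN,
} CgFsKind;

/* Hypothetical classifier for the f_type found at /sys/fs/cgroup. */
static CgFsKind classify_f_type(long f_type) {
        switch (f_type) {
        case CGROUP2_SUPER_MAGIC:
                return CG_FS_UNIFIED;
        case TMPFS_MAGIC:
                return CG_FS_LEGACY;
        case SYSFS_MAGIC:
                return CG_FS_NOT_MOUNTED;
        default:
                return CG_FS_UNKNOWN;
        }
}

/* Prints the message matching the detected state for the given path. */
static void report_cgroup_mount(const char *path) {
        struct statfs fs;

        if (statfs(path, &fs) < 0) {
                perror("statfs");
                return;
        }

        if (classify_f_type((long) fs.f_type) == CG_FS_NOT_MOUNTED)
                printf("No filesystem is currently mounted on %s.\n", path);
        else if (classify_f_type((long) fs.f_type) == CG_FS_UNKNOWN)
                printf("Unknown filesystem type %lx mounted on %s.\n",
                       (unsigned long) fs.f_type, path);
}
```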