summaryrefslogtreecommitdiff
path: root/src/nspawn/nspawn-cgroup.c
Commit message (Collapse)AuthorAgeFilesLines
* nspawn: port over to /supervisor/ subcgroup being delegated to nspawnLennart Poettering2023-04-271-5/+16
| | | | | | Let's make use of the new DelegateSubgroup= feature and delegate the /supervisor/ subcgroup already to nspawn, so that moving the supervisor process there is unnecessary.
* tree-wide: use -EBADF for fd initializationZbigniew Jędrzejewski-Szmek2022-12-191-1/+1
| | | | | | | | | | | | | | | | -1 was used everywhere, but -EBADF or -EBADFD started being used in various places. Let's make things consistent in the new style. Note that there are two candidates: EBADF 9 Bad file descriptor EBADFD 77 File descriptor in bad state Since we're initializating the fd, we're just assigning a value that means "no fd yet", so it's just a bad file descriptor, and the first errno fits better. If instead we had a valid file descriptor that became invalid because of some operation or state change, the other errno would fit better. In some places, initialization is dropped if unnecessary.
* tree-wide: use mode=0nnn for mount optionZbigniew Jędrzejewski-Szmek2022-12-141-4/+9
| | | | | | This is an octal number. We used the 0 prefix in some places inconsistently. The kernel always interprets in base-8, so this has no effect, but I think it's nicer to use the 0 to remind the reader that this is not a decimal number.
* basic: rename util.h to logarithm.hZbigniew Jędrzejewski-Szmek2022-11-081-1/+0
| | | | | util.h is now about logarithms only, so we can rename it. Many files included util.h for no apparent reason… Those includes are dropped.
* strv: declare iterator of FOREACH_STRING() in the loopZbigniew Jędrzejewski-Szmek2022-03-231-1/+0
| | | | | | | | | | | Same idea as 03677889f0ef42cdc534bf3b31265a054b20a354. No functional change intended. The type of the iterator is generally changed to be 'const char*' instead of 'char*'. Despite the type commonly used, modifying the string was not allowed. I adjusted the naming of some short variables for clarity and reduced the scope of some variable declarations in code that was being touched anyway.
* tree-wide: fix duplicated wordsMichael Biebl2022-03-181-1/+1
| | | | | | | the the in in not not we we
* cgroup-util: use string_hash_ops_freeYu Watanabe2021-09-111-1/+1
|
* license: LGPL-2.1+ -> LGPL-2.1-or-laterYu Watanabe2020-11-091-1/+1
|
* mount-util: rework umount_verbose() to take log level and flags argLennart Poettering2020-09-231-1/+1
| | | | | | Let's make umount_verbose() more like mount_verbose_xyz(), i.e. take log level and flags param. In particular the latter matters, since we typically don't actually want to follow symlinks when unmounting.
* mount-util: switch most mount_verbose() code over to not follow symlinksLennart Poettering2020-09-231-19/+19
|
* tree-wide: add size limits for tmpfs mountsTopi Miettinen2020-05-131-2/+2
| | | | | | | | | | | | | | | | | Limit size of various tmpfs mounts to 10% of RAM, except volatile root and /var to 25%. Another exception is made for /dev (also /devs for PrivateDevices) and /sys/fs/cgroup since no (or very few) regular files are expected to be used. In addition, since directories, symbolic links, device specials and xattrs are not counted towards the size= limit, number of inodes is also limited correspondingly: 4MB size translates to 1k of inodes (assuming 4k each), 10% of RAM (using 16GB of RAM as baseline) translates to 400k and 25% to 1M inodes. Because nr_inodes option can't use ratios like size option, there's an unfortunate side effect that with small memory systems the limit may be on the too large side. Also, on an extremely small device with only 256MB of RAM, 10% of RAM for /run may not be enough for re-exec of PID1 because 16MB of free space is required.
* basic/set: let set_put_strdup() create the set with string hash opsZbigniew Jędrzejewski-Szmek2020-05-061-7/+3
| | | | | | | | | | | | | | | | | | If we're using a set with _put_strdup(), most of the time we want to use string hash ops on the set, and free the strings when done. This defines the appropriate a new string_hash_ops_free structure to automatically free the keys when removing the set, and makes set_put_strdup() and set_put_strdupv() instantiate the set with those hash ops. hashmap_put_strdup() was already doing something similar. (It is OK to instantiate the set earlier, possibly with a different hash ops structure. set_put_strdup() will then use the existing set. It is also OK to call set_free_free() instead of set_free() on a set with string_hash_ops_free, the effect is the same, we're just overriding the override of the cleanup function.) No functional change intended.
* util-lib: move some functions from basic/cgroup-util to shared/cgroup-setupZbigniew Jędrzejewski-Szmek2019-09-161-0/+1
| | | | | | | | | This way less stuff needs to be in basic. Initially, I wanted to move all the parts of cgroup-utils.[ch] that depend on efivars.[ch] to shared, because efivars.[ch] is in shared/. Later on, I decide to split efivars.[ch], so the move done in this patch is not necessary anymore. Nevertheless, it is still valid on its own. If at some point we want to expose libbasic, it is better to to not have stuff that belong in libshared there.
* path-util: get rid of prefix_root()Lennart Poettering2019-06-211-3/+3
| | | | | | | | | | | | | | | | | | | prefix_root() is equivalent to path_join() in almost all ways, hence let's remove it. There are subtle differences though: prefix_root() will try shorten multiple "/" before and after the prefix. path_join() doesn't do that. This means prefix_root() might return a string shorter than both its inputs combined, while path_join() never does that. I like the path_join() semantics better, hence I think dropping prefix_root() is totally OK. In the end the strings generated by both functon should always be identical in terms of path_equal() if not streq(). This leaves prefix_roota() in place. Ideally we'd have path_joina(), but I don't think we can reasonably implement that as a macro. or maybe we can? (if so, sounds like something for a later PR) Also add in a few missing OOM checks
* tree-wide: make use of the new WRITE_STRING_FILE_MKDIR_0755 flagLennart Poettering2019-05-081-2/+1
|
* codespell: fix spelling errorsBen Boeckel2019-04-291-1/+1
|
* headers: remove unneeded includes from util.hZbigniew Jędrzejewski-Szmek2019-03-271-0/+1
| | | | | This means we need to include many more headers in various files that simply included util.h before, but it seems cleaner to do it this way.
* nspawn: (void)ify more stuffLennart Poettering2019-03-151-1/+1
|
* Split out part of mount-util.c into mountpoint-util.cZbigniew Jędrzejewski-Szmek2018-11-291-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The idea is that anything which is related to actually manipulating mounts is in mount-util.c, but functions for mountpoint introspection are moved to the new file. Anything which requires libmount must be in mount-util.c. This was supposed to be a preparation for further changes, with no functional difference, but it results in a significant change in linkage: $ ldd build/libnss_*.so.2 (before) build/libnss_myhostname.so.2: linux-vdso.so.1 (0x00007fff77bf5000) librt.so.1 => /lib64/librt.so.1 (0x00007f4bbb7b2000) libmount.so.1 => /lib64/libmount.so.1 (0x00007f4bbb755000) libpthread.so.0 => /lib64/libpthread.so.0 (0x00007f4bbb734000) libc.so.6 => /lib64/libc.so.6 (0x00007f4bbb56e000) /lib64/ld-linux-x86-64.so.2 (0x00007f4bbb8c1000) libblkid.so.1 => /lib64/libblkid.so.1 (0x00007f4bbb51b000) libuuid.so.1 => /lib64/libuuid.so.1 (0x00007f4bbb512000) libselinux.so.1 => /lib64/libselinux.so.1 (0x00007f4bbb4e3000) libpcre2-8.so.0 => /lib64/libpcre2-8.so.0 (0x00007f4bbb45e000) libdl.so.2 => /lib64/libdl.so.2 (0x00007f4bbb458000) build/libnss_mymachines.so.2: linux-vdso.so.1 (0x00007ffc19cc0000) librt.so.1 => /lib64/librt.so.1 (0x00007fdecb74b000) libcap.so.2 => /lib64/libcap.so.2 (0x00007fdecb744000) libmount.so.1 => /lib64/libmount.so.1 (0x00007fdecb6e7000) libpthread.so.0 => /lib64/libpthread.so.0 (0x00007fdecb6c6000) libc.so.6 => /lib64/libc.so.6 (0x00007fdecb500000) /lib64/ld-linux-x86-64.so.2 (0x00007fdecb8a9000) libblkid.so.1 => /lib64/libblkid.so.1 (0x00007fdecb4ad000) libuuid.so.1 => /lib64/libuuid.so.1 (0x00007fdecb4a2000) libselinux.so.1 => /lib64/libselinux.so.1 (0x00007fdecb475000) libpcre2-8.so.0 => /lib64/libpcre2-8.so.0 (0x00007fdecb3f0000) libdl.so.2 => /lib64/libdl.so.2 (0x00007fdecb3ea000) build/libnss_resolve.so.2: linux-vdso.so.1 (0x00007ffe8ef8e000) librt.so.1 => /lib64/librt.so.1 (0x00007fcf314bd000) libcap.so.2 => /lib64/libcap.so.2 (0x00007fcf314b6000) libmount.so.1 => /lib64/libmount.so.1 (0x00007fcf31459000) libpthread.so.0 => /lib64/libpthread.so.0 (0x00007fcf31438000) libc.so.6 => /lib64/libc.so.6 (0x00007fcf31272000) /lib64/ld-linux-x86-64.so.2 (0x00007fcf31615000) libblkid.so.1 => /lib64/libblkid.so.1 (0x00007fcf3121f000) libuuid.so.1 => /lib64/libuuid.so.1 (0x00007fcf31214000) libselinux.so.1 => /lib64/libselinux.so.1 (0x00007fcf311e7000) libpcre2-8.so.0 => /lib64/libpcre2-8.so.0 (0x00007fcf31162000) libdl.so.2 => /lib64/libdl.so.2 (0x00007fcf3115c000) build/libnss_systemd.so.2: linux-vdso.so.1 (0x00007ffda6d17000) librt.so.1 => /lib64/librt.so.1 (0x00007f610b83c000) libcap.so.2 => /lib64/libcap.so.2 (0x00007f610b835000) libmount.so.1 => /lib64/libmount.so.1 (0x00007f610b7d8000) libpthread.so.0 => /lib64/libpthread.so.0 (0x00007f610b7b7000) libc.so.6 => /lib64/libc.so.6 (0x00007f610b5f1000) /lib64/ld-linux-x86-64.so.2 (0x00007f610b995000) libblkid.so.1 => /lib64/libblkid.so.1 (0x00007f610b59e000) libuuid.so.1 => /lib64/libuuid.so.1 (0x00007f610b593000) libselinux.so.1 => /lib64/libselinux.so.1 (0x00007f610b566000) libpcre2-8.so.0 => /lib64/libpcre2-8.so.0 (0x00007f610b4e1000) libdl.so.2 => /lib64/libdl.so.2 (0x00007f610b4db000) (after) build/libnss_myhostname.so.2: linux-vdso.so.1 (0x00007fff0b5e2000) librt.so.1 => /lib64/librt.so.1 (0x00007fde0c328000) libpthread.so.0 => /lib64/libpthread.so.0 (0x00007fde0c307000) libc.so.6 => /lib64/libc.so.6 (0x00007fde0c141000) /lib64/ld-linux-x86-64.so.2 (0x00007fde0c435000) build/libnss_mymachines.so.2: linux-vdso.so.1 (0x00007ffdc30a7000) librt.so.1 => /lib64/librt.so.1 (0x00007f06ecabb000) libcap.so.2 => /lib64/libcap.so.2 (0x00007f06ecab4000) libpthread.so.0 => /lib64/libpthread.so.0 (0x00007f06eca93000) libc.so.6 => /lib64/libc.so.6 (0x00007f06ec8cd000) /lib64/ld-linux-x86-64.so.2 (0x00007f06ecc15000) build/libnss_resolve.so.2: linux-vdso.so.1 (0x00007ffe95747000) librt.so.1 => /lib64/librt.so.1 (0x00007fa56a80f000) libcap.so.2 => /lib64/libcap.so.2 (0x00007fa56a808000) libpthread.so.0 => /lib64/libpthread.so.0 (0x00007fa56a7e7000) libc.so.6 => /lib64/libc.so.6 (0x00007fa56a621000) /lib64/ld-linux-x86-64.so.2 (0x00007fa56a964000) build/libnss_systemd.so.2: linux-vdso.so.1 (0x00007ffe67b51000) librt.so.1 => /lib64/librt.so.1 (0x00007ffb32113000) libcap.so.2 => /lib64/libcap.so.2 (0x00007ffb3210c000) libpthread.so.0 => /lib64/libpthread.so.0 (0x00007ffb320eb000) libc.so.6 => /lib64/libc.so.6 (0x00007ffb31f25000) /lib64/ld-linux-x86-64.so.2 (0x00007ffb3226a000) I don't quite understand what is going on here, but let's not be too picky.
* cgroup: be more careful with which controllers we can enable/disable on a cgroupLennart Poettering2018-11-231-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This changes cg_enable_everywhere() to return which controllers are enabled for the specified cgroup. This information is then used to correctly track the enablement mask currently in effect for a unit. Moreover, when we try to turn off a controller, and this works, then this is indicates that the parent unit might succesfully turn it off now, too as our unit might have kept it busy. So far, when realizing cgroups, i.e. when syncing up the kernel representation of relevant cgroups with our own idea we would strictly work from the root to the leaves. This is generally a good approach, as when controllers are enabled this has to happen in root-to-leaves order. However, when controllers are disabled this has to happen in the opposite order: in leaves-to-root order (this is because controllers can only be enabled in a child if it is already enabled in the parent, and if it shall be disabled in the parent then it has to be disabled in the child first, otherwise it is considered busy when it is attempted to remove it in the parent). To make things complicated when invalidating a unit's cgroup membershup systemd can actually turn off some controllers previously turned on at the very same time as it turns on other controllers previously turned off. In such a case we have to work up leaves-to-root *and* root-to-leaves right after each other. With this patch this is implemented: we still generally operate root-to-leaves, but as soon as we noticed we successfully turned off a controller previously turned on for a cgroup we'll re-enqueue the cgroup realization for all parents of a unit, thus implementing leaves-to-root where necessary.
* coccinelle: make use of SYNTHETIC_ERRNOZbigniew Jędrzejewski-Szmek2018-11-221-2/+2
| | | | | | | | | | | Ideally, coccinelle would strip unnecessary braces too. But I do not see any option in coccinelle for this, so instead, I edited the patch text using search&replace to remove the braces. Unfortunately this is not fully automatic, in particular it didn't deal well with if-else-if-else blocks and ifdefs, so there is an increased likelikehood be some bugs in such spots. I also removed part of the patch that coccinelle generated for udev, where we returns -1 for failure. This should be fixed independently.
* tree-wide: set WRITE_STRING_FILE_DISABLE_BUFFER flag when we write files ↵Yu Watanabe2018-11-061-1/+1
| | | | under /proc or /sys
* nspawn: chown() the legacy hierarchy when it's used in a containerEvgeny Vereshchagin2018-09-261-1/+1
| | | | | | | This is a follow-up to 720f0a2f3c928cc9379501a52146be9fbb4d9be2. Closes https://github.com/systemd/systemd/issues/10026 Closes https://github.com/systemd/systemd/issues/9563
* fs-util: make symlink_idempotent() optionally create relative linkYu Watanabe2018-09-241-2/+2
|
* nspawn: sync_cgroup(): Rename arg_uid_shift -> uid_shiftLuke Shumaker2018-07-201-2/+2
| | | | | Naming it arg_uid_shift is confusing because of the global arg_uid_shift in nspawn.c
* nspawn: Move cgroup mount stuff from nspawn-mount.c to nspawn-cgroup.cLuke Shumaker2018-07-201-0/+417
|
* tree-wide: remove Lennart's copyright linesLennart Poettering2018-06-141-3/+0
| | | | | | | | | | | These lines are generally out-of-date, incomplete and unnecessary. With SPDX and git repository much more accurate and fine grained information about licensing and authorship is available, hence let's drop the per-file copyright notice. Of course, removing copyright lines of others is problematic, hence this commit only removes my own lines and leaves all others untouched. It might be nicer if sooner or later those could go away too, making git the only and accurate source of authorship information.
* tree-wide: drop 'This file is part of systemd' blurbLennart Poettering2018-06-141-2/+0
| | | | | | | | | | | | | | | | This part of the copyright blurb stems from the GPL use recommendations: https://www.gnu.org/licenses/gpl-howto.en.html The concept appears to originate in times where version control was per file, instead of per tree, and was a way to glue the files together. Ultimately, we nowadays don't live in that world anymore, and this information is entirely useless anyway, as people are very welcome to copy these files into any projects they like, and they shouldn't have to change bits that are part of our copyright header for that. hence, let's just get rid of this old cruft, and shorten our codebase a bit.
* nspawn: move nspawn cgroup hierarchy one level down unconditionallyLennart Poettering2018-05-031-25/+34
| | | | | | | | | | | | | | | | We need to do this in all cases, including on cgroupsv1 in order to ensure the host systemd and any systemd in the payload won't fight for the cgroup attributes of the top-level cgroup of the payload. This is because systemd for Delegate=yes units will only delegate the right to create children as well as their attributes. However, nspawn expects that the cgroup delegated covers both the right to create children and the attributes of the cgroup itself. Hence, to clear this up, let's unconditionally insert a intermediary cgroup, on cgroupsv1 as well as cgroupsv2, unconditionally. This is also nice as it reduces the differences in the various setups and exposes very close behaviour everywhere.
* tree-wide: drop license boilerplateZbigniew Jędrzejewski-Szmek2018-04-061-13/+0
| | | | | | | | | | Files which are installed as-is (any .service and other unit files, .conf files, .policy files, etc), are left as is. My assumption is that SPDX identifiers are not yet that well known, so it's better to retain the extended header to avoid any doubt. I also kept any copyright lines. We can probably remove them, but it'd nice to obtain explicit acks from all involved authors before doing that.
* nspawn: when in hybrid mode, chown() both the legacy and the unified ↵Lennart Poettering2017-12-051-1/+14
| | | | | | | | | | hierarchy to the root in the container If user namespacing is used, let's make sure that the root user in the container gets access to both /sys/fs/cgroup/systemd and /sys/fs/cgroup/unified. This matches similar logic in cg_set_access().
* cgroup: also include "cgroups.threads" in the list of files to chownLennart Poettering2017-12-051-5/+7
| | | | | | | | | Also, add "cgroups.stat". It's read-only anyway, hence its UID/GID ownership matters little, but it's probably a good idea to keep it ownership in sync with the other read-only files such as "cgroups.controllers". Also, order the list of files alphabetically.
* Add SPDX license identifiers to source files under the LGPLZbigniew Jędrzejewski-Szmek2017-11-191-0/+1
| | | | | This follows what the kernel is doing, c.f. https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=5fd54ace4721fc5ce2bb5aef6318fcf17f421460.
* Be slightly more verbose in error messageZbigniew Jędrzejewski-Szmek2017-07-021-1/+1
| | | | | | Including the full path is always useful. Also use PID_FMT in one more place.
* cgroup: rename cg_unified() → cg_unified_controller()Lennart Poettering2017-02-241-2/+2
| | | | | cg_unified() is a bit generic a name, let's make clear that it checks whether a specified controller is in unified mode.
* cgroup: change cg_unified() to possibly return errors againLennart Poettering2017-02-241-4/+10
| | | | | | | | | We use our cgroup APIs in various contexts, including from our libraries sd-login, sd-bus. As we don#t control those environments we can't rely that the unified cgroup setup logic succeeds, and hence really shouldn't assert on it. This more or less reverts 415fc41ceaeada2e32639f24f134b1c248b9e43f.
* core: simplify cg_[all_]unified()Tejun Heo2017-02-181-12/+5
| | | | | | | | | | | | | | | | | | | | | | | | | cg_[all_]unified() test whether a specific controller or all controllers are on the unified hierarchy. While what's being asked is a simple binary question, the callers must assume that the functions may fail any time, which unnecessarily complicates their usages. This complication is unnecessary. Internally, the test result is cached anyway and there are only a few places where the test actually needs to be performed. This patch simplifies cg_[all_]unified(). * cg_[all_]unified() are updated to return bool. If the result can't be decided, assertion failure is triggered. Error handlings from their callers are dropped. * cg_unified_flush() is updated to calculate the new result synchrnously and return whether it succeeded or not. Places which need to flush the test result are updated to test for failure. This ensures that all the following cg_[all_]unified() tests succeed. * Places which expected possible cg_[all_]unified() failures are updated to call and test cg_unified_flush() before calling cg_[all_]unified(). This includes functions used while setting up mounts during boot and manager_setup_cgroup().
* nspawn: remove unused variable (#4369)Thomas H. P. Andersen2016-10-141-1/+0
|
* nspawn: cleanup and chown the synced cgroup hierarchy (#4223)Evgeny Vereshchagin2016-10-131-15/+38
| | | Fixes: #4181
* nspawn,mount-util: add [u]mount_verbose and use it in nspawnZbigniew Jędrzejewski-Szmek2016-10-111-6/+7
| | | | | | | | | | | | | | | | | | | | This makes it easier to debug failed nspawn invocations: Mounting sysfs on /var/lib/machines/fedora-rawhide/sys (MS_RDONLY|MS_NOSUID|MS_NOEXEC|MS_NODEV "")... Mounting tmpfs on /var/lib/machines/fedora-rawhide/dev (MS_NOSUID|MS_STRICTATIME "mode=755,uid=1450901504,gid=1450901504")... Mounting tmpfs on /var/lib/machines/fedora-rawhide/dev/shm (MS_NOSUID|MS_NODEV|MS_STRICTATIME "mode=1777,uid=1450901504,gid=1450901504")... Mounting tmpfs on /var/lib/machines/fedora-rawhide/run (MS_NOSUID|MS_NODEV|MS_STRICTATIME "mode=755,uid=1450901504,gid=1450901504")... Bind-mounting /sys/fs/selinux on /var/lib/machines/fedora-rawhide/sys/fs/selinux (MS_BIND "")... Remounting /var/lib/machines/fedora-rawhide/sys/fs/selinux (MS_RDONLY|MS_NOSUID|MS_NOEXEC|MS_NODEV|MS_BIND|MS_REMOUNT "")... Mounting proc on /proc (MS_NOSUID|MS_NOEXEC|MS_NODEV "")... Bind-mounting /proc/sys on /proc/sys (MS_BIND "")... Remounting /proc/sys (MS_RDONLY|MS_NOSUID|MS_NOEXEC|MS_NODEV|MS_BIND|MS_REMOUNT "")... Bind-mounting /proc/sysrq-trigger on /proc/sysrq-trigger (MS_BIND "")... Remounting /proc/sysrq-trigger (MS_RDONLY|MS_NOSUID|MS_NOEXEC|MS_NODEV|MS_BIND|MS_REMOUNT "")... Mounting tmpfs on /tmp (MS_STRICTATIME "mode=1777,uid=0,gid=0")... Mounting tmpfs on /sys/fs/cgroup (MS_NOSUID|MS_NOEXEC|MS_NODEV|MS_STRICTATIME "mode=755,uid=0,gid=0")... Mounting cgroup on /sys/fs/cgroup/systemd (MS_NOSUID|MS_NOEXEC|MS_NODEV "none,name=systemd,xattr")... Failed to mount cgroup on /sys/fs/cgroup/systemd (MS_NOSUID|MS_NOEXEC|MS_NODEV "none,name=systemd,xattr"): No such file or directory
* core: use the unified hierarchy for the systemd cgroup controller hierarchyTejun Heo2016-08-171-7/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently, systemd uses either the legacy hierarchies or the unified hierarchy. When the legacy hierarchies are used, systemd uses a named legacy hierarchy mounted on /sys/fs/cgroup/systemd without any kernel controllers for process management. Due to the shortcomings in the legacy hierarchy, this involves a lot of workarounds and complexities. Because the unified hierarchy can be mounted and used in parallel to legacy hierarchies, there's no reason for systemd to use a legacy hierarchy for management even if the kernel resource controllers need to be mounted on legacy hierarchies. It can simply mount the unified hierarchy under /sys/fs/cgroup/systemd and use it without affecting other legacy hierarchies. This disables a significant amount of fragile workaround logics and would allow using features which depend on the unified hierarchy membership such bpf cgroup v2 membership test. In time, this would also allow deleting the said complexities. This patch updates systemd so that it prefers the unified hierarchy for the systemd cgroup controller hierarchy when legacy hierarchies are used for kernel resource controllers. * cg_unified(@controller) is introduced which tests whether the specific controller in on unified hierarchy and used to choose the unified hierarchy code path for process and service management when available. Kernel controller specific operations remain gated by cg_all_unified(). * "systemd.legacy_systemd_cgroup_controller" kernel argument can be used to force the use of legacy hierarchy for systemd cgroup controller. * nspawn: By default nspawn uses the same hierarchies as the host. If UNIFIED_CGROUP_HIERARCHY is set to 1, unified hierarchy is used for all. If 0, legacy for all. * nspawn: arg_unified_cgroup_hierarchy is made an enum and now encodes one of three options - legacy, only systemd controller on unified, and unified. The value is passed into mount setup functions and controls cgroup configuration. * nspawn: Interpretation of SYSTEMD_CGROUP_CONTROLLER to the actual mount option is moved to mount_legacy_cgroup_hierarchy() so that it can take an appropriate action depending on the configuration of the host. v2: - CGroupUnified enum replaces open coded integer values to indicate the cgroup operation mode. - Various style updates. v3: Fixed a bug in detect_unified_cgroup_hierarchy() introduced during v2. v4: Restored legacy container on unified host support and fixed another bug in detect_unified_cgroup_hierarchy().
* core: rename cg_unified() to cg_all_unified()Tejun Heo2016-08-151-2/+2
| | | | | | | | | | | A following patch will update cgroup handling so that the systemd controller (/sys/fs/cgroup/systemd) can use the unified hierarchy even if the kernel resource controllers are on the legacy hierarchies. This would require distinguishing whether all controllers are on cgroup v2 or only the systemd controller is. In preparation, this patch renames cg_unified() to cg_all_unified(). This patch doesn't cause any functional changes.
* treewide: fix typos and remove accidental repetition of wordsTorstein Husebø2016-07-111-1/+1
|
* core: update populated event handling in unified hierarchyTejun Heo2016-03-261-2/+1
| | | | | | | Earlier during the development of unified hierarchy, the populated event was reported through by the dedicated "cgroup.populated" file; however, the interface was updated so that it's reported through the "populated" field of "cgroup.events" file. Update populated event handling logic accordingly.
* cgroup2: use new fstype for unified hierarchyAlban Crequy2016-03-261-1/+1
| | | | | | | | | | | | | | Since Linux v4.4-rc1, __DEVEL__sane_behavior does not exist anymore and is replaced by a new fstype "cgroup2". With this patch, systemd no longer supports the old (unstable) way of doing unified hierarchy with __DEVEL__sane_behavior and systemd now requires Linux v4.4 for unified hierarchy. Non-unified hierarchy is still the default and is unchanged by this patch. https://github.com/torvalds/linux/commit/67e9c74b8a873408c27ac9a8e4c1d1c8d72c93ff
* nspawn: Fix two misspellings of "hierarchy" in error messagesTobias Klauser2016-03-161-2/+2
|
* tree-wide: remove Emacs lines from all filesDaniel Mack2016-02-101-2/+0
| | | | | This should be handled fine now by .dir-locals.el, so need to carry that stuff in every file.
* nspawn: userns and unified cgroup: chown cgroup.eventsAlban Crequy2015-12-281-0/+1
| | | | | | | When starting a container in a new user namespace, systemd-nspawn chowns the cgroup knob files so they are usable by the container. But the cgroup knob file "cgroup.events" was missing. This file exists when the unified hierarchy is used.
* util-lib: split out allocation calls into alloc-util.[ch]Lennart Poettering2015-10-271-0/+1
|
* util-lib: split out fd-related operations into fd-util.[ch]Lennart Poettering2015-10-251-1/+2
| | | | | There are more than enough to deserve their own .c file, hence move them over.