| Commit message | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The path may have unbounded length, for example through a fuse mount.
CVE-2021-33910: attacker-controlled alloca() leads to a crash in systemd and
ultimately a kernel panic. Systemd parses the content of /proc/self/mountinfo
and passes each mountpoint to mount_setup_unit(), which calls
unit_name_path_escape() underneath. A local attacker who is able to mount a
filesystem with a very long path can crash systemd and the whole system.
https://bugzilla.redhat.com/show_bug.cgi?id=1970887
The resulting string length is bounded by UNIT_NAME_MAX, which is 256. But we
can't easily check the length after simplification before doing the
simplification, which in turn uses a copy of the string we can write to.
So we can't reject paths that are too long before doing the duplication.
Hence the most obvious solution is to switch back to strdup(), as before
7410616cd9dbbec97cf98d75324da5cda2b2f7a2.
Resolves: #1974698
(cherry picked from commit 441e0115646d54f080e5c3bb0ba477c892861ab9)
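A minimal Python sketch of why the length check cannot come first. It assumes a reduced stand-in for the real simplification and escaping logic: the helper names `simplify_path` and `path_to_unit_prefix` are hypothetical, and escaping is reduced to mapping `/` to `-`. A raw path far longer than UNIT_NAME_MAX can still simplify to a short name, so only the simplified copy can be measured.

```python
UNIT_NAME_MAX = 256  # bound on the resulting string, as stated above


def simplify_path(path: str) -> str:
    """Collapse repeated slashes and drop a trailing slash (simplified stand-in)."""
    parts = [p for p in path.split("/") if p]
    return "/" + "/".join(parts)


def path_to_unit_prefix(path: str) -> str:
    """Hypothetical, reduced escaping: simplify a copy, then map '/' to '-'."""
    simplified = simplify_path(path)  # a new string; the caller's path is untouched
    if simplified == "/":
        return "-"
    escaped = simplified.strip("/").replace("/", "-")
    # The length is only meaningful after simplification
    if len(escaped) > UNIT_NAME_MAX:
        raise ValueError("escaped path exceeds UNIT_NAME_MAX")
    return escaped


# A very long raw path that simplifies to something short
raw = "/" * 100000 + "mnt//data/"
print(len(raw), path_to_unit_prefix(raw))
```

In Python the copy lives on the heap by construction; the point of the sketch is only the ordering of simplification and length check, not the allocation strategy.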
|
|
|
|
|
|
|
|
| |
This was introduced by commit d9ae3222cfbd5d2a48e6dbade6617085cc76f1c1.
(cherry picked from commit 573229efeb2c5ade25794deee8cfe2f967414ef7)
Resolves: #1934500
|
|
|
|
|
|
|
|
|
|
|
|
| |
A user instance of systemd is an optional feature, and if the user@.service
template is masked then the administrator most likely doesn't want --user
instances of systemd for logged-in users. We don't need to be verbose
about it.
(cherry picked from commit 03b6fa0c5b51b0d39334ff6ba183a3391443bcf6)
(cherry picked from commit 65e96327360ab41d44d5383dcecc82a19fad198c)
Resolves: #1894152
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Low-level cgroup freezer state manipulation is invoked directly from the
job engine when we are about to execute the job in order to make sure
the unit is not frozen and job execution is not blocked because of
that.
Currently with cgroup v1 we would needlessly do a bunch of work in the
function and even falsely update the freezer state. Don't do any of this
and skip the function silently when the v2 freezer is not available.
The following bug is fixed by this commit:
$ systemd-run --unit foo.service /bin/sleep infinity
$ systemctl restart foo.service
$ systemctl show -p FreezerState foo.service
Before (cgroup v1, i.e. full "legacy" mode):
FreezerState=thawing
After:
FreezerState=running
(cherry picked from commit 9a1e90aee556b7a30d87553a891a4175ae77ed68)
Resolves: #1868831
|
|
|
|
|
|
|
| |
Fixes: #15356
(cherry picked from commit e9da62b18af647bfa73807e1c7fc3bfa4bb4b2ac)
Resolves: #1829867
|
|
|
|
|
|
| |
(cherry picked from commit 82ea38258c0f4964c2f3ad3691c6e4554c4f0bb0)
Related: #1872243
|
|
|
|
|
|
| |
(cherry picked from commit 329d20db3cb02d789473b8f7e4a59526fcbf5728)
Resolves: #1872243
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Functions called from device_setup_unit() already make sure that the unit is
enqueued in case it is a new unit or properties exported on the bus have
changed.
This should prevent unnecessary DBus wakeups and associated DBus traffic
when device_setup_unit() is called while reparsing /proc/self/mountinfo
due to mountinfo notifications. Note that we parse
/proc/self/mountinfo quite often on busy systems (e.g. k8s container
hosts), but most of the time the existing mounts didn't change; only some
mount got added. Thus we don't need to generate PropertiesChanged for devices
associated with the mounts that didn't change.
Thanks to Renaud Métrich <rmetrich@redhat.com> for debugging the
problem and providing a draft version of the patch.
(cherry picked from commit 2e129d5d6bd6bd8be4b5359e81a880cbf72a44b8)
Resolves: #1793533
|
|
|
|
|
|
| |
(cherry picked from commit 7c4d139485139eae95b17a1d54cb51ae958abd70)
Related: #1793533
|
|
|
|
|
|
|
|
|
| |
freeze/thaw
Fixes: #16050
(cherry picked from commit a0d79df8e59c6bb6dc0382d71e835dec869a7df4)
Related: #1848421
|
|
|
|
| |
Resolves: #1848421
|
|
|
|
|
|
| |
(cherry picked from commit d446ae89c0168f17eed7135ac06df3b294b3fcc6)
Related: #1830861
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
return too early
Actually, it is the same kind of problem as in d910f4c. Basically, we
need to return 1 on the success code path in slice_freezer_action().
Otherwise we dispatch the DBus return message too soon.
Fixes: #16050
(cherry picked from commit 2884836e3c26fa76718319cdc6d13136bbc1354d)
Related: #1830861
|
|
|
|
|
|
|
|
|
|
|
|
| |
We should return 0 only if the current freezer state, as reported by the
kernel, is already the desired state. Otherwise, we would dispatch the
DBus return message prematurely in bus_unit_method_freezer_generic().
Thanks to Frantisek Sumsal for reporting the issue.
(cherry picked from commit d910f4c2b2542544d7b187a09605da7a0f220837)
Related: #1830861
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
With cgroup v2 the cgroup freezer is implemented as a cgroup
attribute called cgroup.freeze. A cgroup can be frozen by writing "1"
to the file, and the kernel will send us a notification through
"cgroup.events" after the operation is finished and the processes in the
cgroup have entered a quiescent state, i.e. they are not scheduled to
run. Writing "0" to the attribute file does the inverse and process
execution is resumed.
This commit exposes the above low-level functionality through systemd's DBus
API. Each unit type must provide a specialized implementation for these
methods; otherwise, we return an error. So far only the service, scope, and
slice unit types provide support. It is possible to check whether a
given unit has support using the CanFreeze DBus property.
Note that the DBus API behaves synchronously: we dispatch the reply
to freeze/thaw requests only after the kernel has notified us that the
requested operation was completed.
(cherry picked from commit d9e45bc3abb8adf5a1cb20816ba8f2d2aa65b17e)
Resolves: #1830861
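A small Python sketch of the low-level mechanism described above, run against a scratch directory rather than a real cgroup. The function names are hypothetical, and the `frozen` key in cgroup.events with `key value` lines is an assumption based on the kernel's cgroup v2 interface, not something stated in this commit message.

```python
import os
import tempfile


def set_frozen(cgroup_dir: str, freeze: bool) -> None:
    """Write "1" to freeze or "0" to thaw, as described above."""
    with open(os.path.join(cgroup_dir, "cgroup.freeze"), "w") as f:
        f.write("1" if freeze else "0")


def is_frozen(cgroup_dir: str) -> bool:
    """Read the 'frozen' key from cgroup.events (assumed 'key value' lines)."""
    with open(os.path.join(cgroup_dir, "cgroup.events")) as f:
        for line in f:
            key, _, value = line.strip().partition(" ")
            if key == "frozen":
                return value == "1"
    return False


# Demonstration against a scratch directory, not a real cgroup:
# we fake the kernel's side by writing cgroup.events ourselves.
demo = tempfile.mkdtemp()
set_frozen(demo, True)
with open(os.path.join(demo, "cgroup.events"), "w") as f:
    f.write("populated 1\nfrozen 1\n")
print(is_frozen(demo))
```

On a real system the reply to the caller would be held back until the events file reports the requested state, which is what makes the API synchronous from the client's point of view.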
|
|
|
|
|
|
|
|
| |
Follow-up for 9f65637308.
(cherry picked from commit d3d53e5cd143bf96d1eb0e254f16fa8d458d38ce)
Related: #1830861
|
|
|
|
|
|
|
| |
BugLink: https://bugs.launchpad.net/bugs/1870930
(cherry picked from commit 9f656373082cb13542b877b4f5cb917ef5ff329c)
Related: #1830861
|
|
|
|
|
|
|
|
| |
Fixup for 3572d3df8f8. Coverity CID#1403013.
(cherry picked from commit 60b17d6fcd988c9995b7d1476d3aba1c4cbbfddd)
Related: #1830861
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is a generic implementation of the client-side logic of waiting until
a unit enters or leaves some state.
This is a more generic implementation of the WaitContext logic currently
in systemctl.c, and is supposed to replace it (a later commit does
this). It's similar to bus-wait-for-jobs.c and we probably should fold
that one into it later on.
This code is more powerful and cleaner than the WaitContext logic,
however. In addition to waiting for a unit to exit, this also allows us
to wait for a unit to leave the "maintenance" state.
This commit only implements the generic logic, and adds no users of it
yet.
(cherry picked from commit 3572d3df8f822d4cf1601428401a837f723771cf)
Related: #1830861
|
|
|
|
|
|
|
|
|
|
| |
Callers of cg_get_keyed_attribute_full() can now specify via a flag whether
missing keys in the cgroup attribute file are OK or not. Wrappers for both the
strict and graceful versions are also provided.
(cherry picked from commit 25a1f04c682260bb9b96e25bdf33665d6172db98)
Related: #1830861
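The strict versus graceful distinction might look like this in a Python sketch. The function and its `graceful` parameter are hypothetical stand-ins, and the `key value` line format of the attribute file is an assumption.

```python
from typing import Dict, Iterable, Optional


def get_keyed_attribute(content: str, keys: Iterable[str],
                        graceful: bool = False) -> Dict[str, Optional[str]]:
    """Look up keys in 'key value' lines; missing keys fail unless graceful."""
    found = {}
    for line in content.splitlines():
        key, sep, value = line.partition(" ")
        if sep:
            found[key] = value.strip()
    result: Dict[str, Optional[str]] = {}
    for key in keys:
        if key in found:
            result[key] = found[key]
        elif graceful:
            result[key] = None  # missing key tolerated
        else:
            raise KeyError(key)  # strict mode: a missing key is an error
    return result


events = "populated 1\n"
print(get_keyed_attribute(events, ["populated", "frozen"], graceful=True))
```

The graceful form is useful when an attribute file may come from an older kernel that does not yet export a given key.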
|
|
|
|
|
|
|
|
|
|
|
| |
This has the advantage that mac_selinux_access_check() can be used as a
function in all contexts. For example, parameters passed to it won't be
reported as unused if the "function" call is replaced with 0 on SELinux
disabled builds.
(cherry picked from commit 08deac6e3e9119aeb966375f94695e4aa14ffb1c)
Related: #1830861
|
|
|
|
|
|
|
| |
Fixes: #16115
(cherry picked from commit bb9244781c6fc7608f7cac910269f8987b8adc01)
Related: #1737283
|
|
|
|
|
|
|
| |
Very loosely based on upstream commits e1ca734edd17a90a325d5b566a4ea96e66c206e5
and 681bd2c524ed71ac04045c90884ba8d55eee7b66.
Resolves: #1804252
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
A later version of the DefaultMemory{Low,Min} patch changed these to
require explicitly setting memory_foo_set, but we only set that in
load-fragment, not dbus-cgroup.
Without these, we may fall back to either DefaultMemoryFoo or
CGROUP_LIMIT_MIN when we really shouldn't.
(cherry picked from commit 184e989d7da4648bd36511ffa28a9f2b469589d1)
Related: #1763435
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is an oversight from https://github.com/systemd/systemd/pull/12332.
Sadly the tests didn't catch it since it requires a real cgroup
hierarchy to see, and it wasn't seen in prod since we're only currently
using DefaultMemoryLow, not DefaultMemoryMin. :-(
(cherry picked from commit 64fe532e90b3e99bf7821ded8a1107c239099e40)
Related: #1763435
|
|
|
|
|
|
|
|
|
| |
Otherwise we might not enable it when we should, i.e. DefaultMemoryMin is
set in a parent, but not MemoryMin in the current unit.
(cherry picked from commit 7c9d2b79935d413389a603918a711df75acd3f48)
Related: #1763435
|
|
|
|
|
|
|
|
|
| |
The previous commit fixes this up, and this should prevent it
regressing.
(cherry picked from commit 465ace74d9820824968ab5e82c81e42c2f1894b0)
Related: #1763435
|
|
|
|
|
|
|
|
|
|
|
|
| |
These make sense to be explicitly set at 0 (which has a different effect
than the default, since it can affect processing of `DefaultMemoryXXX`).
Without this, it's not easily possible to relinquish memory protection
for a subtree, which is not great.
(cherry picked from commit 22bf131be278b95a4a204514d37a4344cf6365c6)
Related: #1763435
|
|
|
|
|
|
|
|
|
| |
Somehow these got lost in the previous PR, rendering DefaultMemoryMin
not very useful.
(cherry picked from commit 7e7223b3d57c950b399352a92e1d817f7c463602)
Related: #1763435
|
|
|
|
|
|
|
|
|
|
|
| |
I missed adding a section in `systemd.resource-control` about
DefaultMemoryMin in #12332.
Also, add a NEWS entry going over the general concept.
(cherry picked from commit acdb4b5236f38bbefbcc4a47fdbb9cd558b4b5c5)
Related: #1763435
|
|
|
|
|
|
| |
(cherry picked from commit 7ad5439e0663e39e36619957fa37eefe8026bcab)
Related: #1763435
|
|
|
|
|
|
|
|
| |
This is in preparation for creating unit_get_ancestor_memory_min.
(cherry picked from commit 6264b85e92aeddb74b8d8808a08c9eae8390a6a5)
Related: #1763435
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In cgroup v2 we have protection tunables -- currently MemoryLow and
MemoryMin (there will be more in future for other resources, too). The
design of these protection tunables requires not only intermediate
cgroups to propagate protections, but also the units at the leaf of that
resource's operation to accept it (by setting MemoryLow or MemoryMin).
This makes sense from a low-level API design perspective, but it's a
good idea to also have a higher-level abstraction that can, by default,
propagate these resources to children recursively. In this patch, this
happens by having descendants set memory.low to N if their ancestor has
DefaultMemoryLow=N -- assuming they don't set a separate MemoryLow
value.
Any affected unit can opt out of this propagation by manually setting
`MemoryLow` to some value in its unit configuration. A unit can also
stop further propagation by setting `DefaultMemoryLow=` with no
argument. This removes further propagation in the subtree, but has no
effect on the unit itself (for that, use `MemoryLow=0`).
Our use case in production is simplifying the configuration of machines
which heavily rely on memory protection tunables, but currently require
tweaking a huge number of unit files to make that a reality. This
directive makes that significantly less fragile, and decreases the risk
of misconfiguration.
After this patch is merged, I will implement DefaultMemoryMin= using the
same principles.
(cherry picked from commit c52db42b78f6fbeb7792cc4eca27e2767a48b6ca)
Related: #1763435
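The propagation rules above can be sketched in Python as a walk up the unit's ancestors. This is a simplified model, not the patch's code: the `Unit` class, its field names, and the fallback value of 0 are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional

CGROUP_LIMIT_MIN = 0  # assumed fallback when no protection applies


@dataclass
class Unit:
    name: str
    parent: Optional["Unit"] = None
    memory_low: Optional[int] = None          # MemoryLow=N, None if unset
    default_memory_low: Optional[int] = None  # DefaultMemoryLow=N, None if unset
    default_memory_low_reset: bool = False    # DefaultMemoryLow= with no argument


def effective_memory_low(unit: Unit) -> int:
    """Resolve the memory.low value a unit would get under the rules above."""
    if unit.memory_low is not None:
        return unit.memory_low  # an explicit MemoryLow opts out of propagation
    ancestor = unit.parent  # the unit's own DefaultMemoryLow does not affect it
    while ancestor is not None:
        if ancestor.default_memory_low_reset:
            return CGROUP_LIMIT_MIN  # propagation stopped for this subtree
        if ancestor.default_memory_low is not None:
            return ancestor.default_memory_low
        ancestor = ancestor.parent
    return CGROUP_LIMIT_MIN


root = Unit("root.slice", default_memory_low=100)
mid = Unit("mid.slice", parent=root, default_memory_low_reset=True)
leaf = Unit("leaf.service", parent=mid)
print(effective_memory_low(mid), effective_memory_low(leaf))
```

Here `mid.slice` still inherits from its parent, while everything below it no longer does, matching the statement that an empty `DefaultMemoryLow=` has no effect on the unit itself.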
|
|
|
|
|
|
|
|
| |
Instead, use path_join() in callers wherever needed.
(cherry picked from commit 55890a40c3ec0c061c04d1395a38c26313132d12)
Related: #1763435
|
|
|
|
|
|
| |
(cherry picked from commit fd870bac25c2dd36affaed0251b5a7023f635306)
Related: #1763435
|
|
|
|
|
|
|
|
|
|
|
|
| |
The kernel added support for a new cgroup memory controller knob memory.min in
bf8d5d52ffe8 ("memcg: introduce memory.min") which was merged during v4.18
merge window.
Add MemoryMin to support memory.min.
(cherry picked from commit 484226357789991de0b3363beb69258be06b4c92)
Resolves: #1763435
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The dbus external authentication takes as an optional argument the UID the
sender wants to authenticate as. This uid is purely optional. The
AF_UNIX socket already conveys the same information through the
auxiliary socket data, so we really don't have to provide that
information.
Unfortunately, there is no way to send empty arguments, since they are
interpreted as "missing argument", which has a different meaning. The
SASL negotiation thus changes from:
AUTH EXTERNAL <uid>
NEGOTIATE_UNIX_FD (optional)
BEGIN
to:
AUTH EXTERNAL
DATA
NEGOTIATE_UNIX_FD (optional)
BEGIN
And thus the replies we expect as a client change from:
OK <server-id>
AGREE_UNIX_FD (optional)
to:
DATA
OK <server-id>
AGREE_UNIX_FD (optional)
Since the old sd-bus server implementation used the wrong reply for
"AUTH" requests that do not carry the arguments inlined, we decided to
make sd-bus clients accept this as well. Hence, sd-bus now allows
"OK <server-id>\r\n" replies instead of "DATA\r\n" replies.
Signed-off-by: David Rheinsberg <david.rheinsberg@gmail.com>
(cherry picked from commit 1ed4723d38cd0d1423c8fe650f90fa86007ddf55)
Resolves: #1838081
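As a rough illustration of the two client-side line sequences shown above, here is a Python sketch. The function names are hypothetical, and the hex encoding of the uid's decimal digits in the old form is an assumption taken from the D-Bus authentication protocol rather than from this commit message.

```python
from typing import Optional


def external_auth_lines(uid: Optional[int] = None, negotiate_fds: bool = True) -> bytes:
    """Build the client lines for the old (uid given) or new (uid None) form."""
    lines = []
    if uid is not None:
        # Old form: uid sent inline, hex-encoded ASCII digits (assumed encoding)
        lines.append("AUTH EXTERNAL " + str(uid).encode("ascii").hex())
    else:
        # New form: no inline argument, followed by an empty DATA line
        lines += ["AUTH EXTERNAL", "DATA"]
    if negotiate_fds:
        lines.append("NEGOTIATE_UNIX_FD")
    lines.append("BEGIN")
    return "".join(line + "\r\n" for line in lines).encode("ascii")


def first_reply_ok(reply: bytes) -> bool:
    """Accept "DATA" or, for older servers, "OK <server-id>" as the first reply."""
    return reply == b"DATA\r\n" or reply.startswith(b"OK ")


print(external_auth_lines())
```

Accepting both replies in `first_reply_ok` mirrors the compatibility decision described above for servers that answer "OK" where "DATA" would be expected.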
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The correct way to reply to "AUTH <protocol>" without any payload is to
send "DATA" rather than "OK". The "DATA" reply triggers the client to
respond with the requested payload.
In fact, adding the data as hex-encoded argument like
"AUTH <protocol> <hex-data>" is an optimization that skips the "DATA"
roundtrip. The standard way to perform an authentication is to send the
"DATA" line.
This commit fixes sd-bus to properly send the "DATA" line. Surprisingly
no existing implementation depends on this, as they all pass the data
directly as argument to "AUTH". This will not work if we want to pass
an empty argument, though.
Signed-off-by: David Rheinsberg <david.rheinsberg@gmail.com>
(cherry picked from commit 2010873b4b49b223e0cc07d28205b09c693ef005)
Related: #1838081
|
|
|
|
|
|
|
|
|
| |
Let's avoid magic numbers and use a constant `strlen()` instead.
Signed-off-by: David Rheinsberg <david.rheinsberg@gmail.com>
(cherry picked from commit 3cacdab925c40a5d9b7cf3f67719201bbaa17f67)
Related: #1838081
|
|
|
|
|
|
|
|
| |
After the first warning log, further messages are downgraded to LOG_DEBUG.
(cherry picked from commit 527ede0c638b47b62a87900438a8a09dea42889e)
Related: #1770379
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This new setting allows configuration of CFS period on the CPU cgroup, instead
of using a hardcoded default of 100ms.
Tested:
- Legacy cgroup + Unified cgroup
- systemctl set-property
- systemctl show
- Confirmed that the cgroup settings (such as cpu.cfs_period_us) were set
appropriately, including updating the CPU quota (cpu.cfs_quota_us) when
CPUQuotaPeriodSec= is updated.
- Checked that clamping works properly when either period or (quota * period)
are below the resolution of 1ms, or if period is above the max of 1s.
(cherry picked from commit 10f28641115733c61754342d5dcbe70b083bea4b)
Resolves: #1770379
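One way to satisfy the clamping constraints listed above is sketched below in Python. It is an illustration under assumptions, not the commit's code: the function names are hypothetical, all values are in microseconds, and the quota is expressed as CPU time per second of wall time.

```python
USEC_PER_MSEC = 1_000
USEC_PER_SEC = 1_000_000


def adjust_period(period_us: int, quota_per_sec_us: int,
                  resolution_us: int = USEC_PER_MSEC,
                  max_period_us: int = USEC_PER_SEC) -> int:
    """Clamp the period so that period and quota * period are >= 1ms and period <= 1s."""
    if quota_per_sec_us <= 0:
        raise ValueError("quota must be positive")
    # Smallest period for which the per-period quota still reaches the resolution
    lower = max(resolution_us, resolution_us * USEC_PER_SEC // quota_per_sec_us)
    return min(max(period_us, lower), max_period_us)


def quota_for_period(period_us: int, quota_per_sec_us: int) -> int:
    """CPU time allowed within one period for the given per-second quota."""
    return quota_per_sec_us * period_us // USEC_PER_SEC


# A 1% quota with a 10ms period would give only 0.1ms per period,
# so the period is raised until the per-period quota reaches 1ms.
period = adjust_period(10_000, 10_000)
print(period, quota_for_period(period, 10_000))
```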
|
|
|
|
|
|
| |
(cherry picked from commit de8a711a5849f9239c93aefa5554a62986dfce42)
Related: #1770379
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This works like parse_sec() but defaults to USEC_INFINITY when passed an
empty string or only whitespace.
Also introduce config_parse_sec_def_infinity, which can be used to parse
config options using this function.
This is useful for time options that use "infinity" as the default and that
can be reset by unsetting them.
Introduce a test case to ensure it works as expected.
(cherry picked from commit 7b61ce3c44ef5908e817009ce4f9d2a7a37722be)
Related: #1770379
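The behaviour described above can be sketched in Python. The `parse_sec` here is a tiny stand-in that only understands plain seconds and the word "infinity"; it is an assumption for illustration and far simpler than the real parser.

```python
import math


def parse_sec(text: str) -> float:
    """Tiny stand-in parser: plain seconds or the word "infinity"."""
    text = text.strip()
    if text == "infinity":
        return math.inf
    return float(text)  # raises ValueError on an empty string


def parse_sec_def_infinity(text: str) -> float:
    """Like parse_sec(), but an empty or whitespace-only string means infinity."""
    if text.strip() == "":
        return math.inf
    return parse_sec(text)


print(parse_sec_def_infinity(""), parse_sec_def_infinity("5"))
```

With this wrapper, writing an option with an empty value resets it to its "infinity" default instead of producing a parse error.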
|
|
|
|
|
|
|
|
|
|
|
| |
This adds support for the following proposed latency based IO control
mechanism.
https://lkml.org/lkml/2018/6/5/428
(cherry picked from commit 6ae4283cb14c4e4a895f4bbba703804e4128c86c)
Resolves: #1831519
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When a unit is in the INACTIVE or DEACTIVATING state, a job of type
JOB_TRY_RESTART or JOB_TRY_RELOAD is collapsed to JOB_NOP and uses
u->nop_job instead of u->job.
If a JOB_NOP job is in the waiting state, a parallel daemon-reload
just installs it during deserialization. Without a coldplug, the job will
not be in m->run_queue, which results in a hung try-restart or
try-reload process.
Reproduce:
run systemctl try-restart test.service (inactive) repeatedly in a terminal.
run systemctl daemon-reload repeatedly in other terminals.
After a successful reproduction, systemctl list-jobs will list the hung job.
Upstream:
systemd/systemd#13124
(cherry picked from commit b49e14d5f3081dfcd363d8199a14c0924ae9152f)
Resolves: #1829798
|
|
|
|
|
|
|
| |
This is a follow-up to #1619292.
rhel-only
Resolves: #1748840
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Also use compat_main() when called as `resolvconf`, since the interface
is closer to that of `systemd-resolve`.
Use a heap allocated string to set arg_ifname, since a stack allocated
one would be lost after the function returns. (This last one broke the
case where an interface name was suffixed with a dot, such as in
`resolvconf -a tap0.dhcp`.)
Tested:
$ build/resolvconf -a nonexistent.abc </etc/resolv.conf
Unknown interface 'nonexistent': No such device
Fixes #9423.
(cherry picked from commit 5a01b3f35d7b6182c78b6973db8d99bdabd4f9c3)
Resolves: #1835594
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When the root account is locked, sulogin will either inform you of
this and not allow you in or, if --force is used, hand
you passwordless root (if using a recent enough version of util-linux).
Not being allowed a shell is of course inconvenient, but at the same
time handing out passwordless root unconditionally is probably not
a good idea everywhere.
This patch thus lets you choose the behaviour you want by
setting the SYSTEMD_SULOGIN_FORCE environment variable to true
or false, e.g. by adding this via
'systemctl edit rescue.service' (or emergency.service):
[Service]
Environment=SYSTEMD_SULOGIN_FORCE=1
Distributions that use locked root accounts and want the passwordless
behaviour could thus simply drop in the override file in
/etc/systemd/system/rescue.service.d/override.conf
Fixes: #7115
Addresses: https://bugs.debian.org/802211
(cherry picked from commit 33eb44fe4a8d7971b5614bc4c2d90f8d91cce66c)
Resolves: #1625929
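How a launcher might act on the variable is sketched below in Python. Everything beyond the variable name and the `--force` flag is an assumption for illustration: the function name, the sulogin path, and the set of accepted boolean spellings are hypothetical.

```python
import os
from typing import List, Mapping

# Assumed spellings of "true"; anything else is treated as false here
TRUE_VALUES = {"1", "yes", "y", "true", "t", "on"}


def sulogin_argv(environ: Mapping[str, str] = os.environ,
                 sulogin: str = "/sbin/sulogin") -> List[str]:
    """Build the sulogin command line, adding --force only when requested."""
    value = environ.get("SYSTEMD_SULOGIN_FORCE", "").strip().lower()
    argv = [sulogin]
    if value in TRUE_VALUES:
        argv.append("--force")
    return argv


print(sulogin_argv({"SYSTEMD_SULOGIN_FORCE": "1"}))
```

Leaving the variable unset keeps the stricter default, so passwordless root is only handed out where an administrator or distribution has opted in.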
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The function that inserts replacement paths into the configuration file list was
broken. Apart from the crash with an empty root prefix, it would incorrectly handle the
case where root *was* set, and the replacement file was supposed to override
an existing file.
prefix_root is used instead of path_join because prefix_root removes duplicate
slashes (when --root=dir/ is used).
A test is added.
Fixes #11124.
(cherry picked from commit 082bb1c59bd4300bcdc08488c94109680cfadf57)
Resolves: #1836024
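The duplicate-slash point can be illustrated with a short Python sketch. This `prefix_root` is a hypothetical reimplementation for illustration only, contrasted with plain string concatenation; it is not the C helper itself.

```python
def prefix_root(root: str, path: str) -> str:
    """Join root and path with exactly one slash between them."""
    return root.rstrip("/") + "/" + path.lstrip("/")


def naive_concat(root: str, path: str) -> str:
    """Plain concatenation, which keeps any duplicate slashes."""
    return root + path


# With --root=dir/ the naive form yields a double slash
print(naive_concat("dir/", "/etc/foo.conf"), prefix_root("dir/", "/etc/foo.conf"))
```

A duplicate slash matters here because the replacement path is compared against existing entries in the file list; two spellings of the same path would not match as strings.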
|
|
|
|
|
|
|
|
| |
Loosely based on
https://github.com/systemd/systemd/pull/14032 and
https://github.com/systemd/systemd/pull/14268.
Related: #1843871
|