| Commit message (Collapse) | Author | Age | Files | Lines |
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Implement directives `NoExecPaths=` and `ExecPaths=` to control `MS_NOEXEC`
mount flag for the file system tree. This can be used to implement file system
W^X policies, and for example with allow-listing mode (NoExecPaths=/) a
compromised service would not be able to execute a shell, if that was not
explicitly allowed.
Example:
[Service]
NoExecPaths=/
ExecPaths=/usr/bin/daemon /usr/lib64 /usr/lib
Closes: #17942.
|
|
|
|
|
|
|
|
|
|
|
|
| |
The cmp in ExecStartPost= was actually failing – ExecStartPost= has the
same StandardOutput as the rest of the service, so the output file is
truncated before cmp can compare it with the expected output – but the
test still passed because test_exec_standardoutput_truncate() calls
test(), which only checks the main result, rather than test_service(),
which checks the result of the whole service. Fix the test by merging
the ExecStartPost= into the ExecStart= – the cmp has to be part of the
same command line as the cat so that the file is not truncated between
the two processes.
|
|
|
|
|
|
| |
This adds the ability to specify truncate:PATH for StandardOutput= and
StandardError=, similar to the existing append:PATH. The code is mostly
copied from the related append: code. Fixes #8983.
|
|
|
|
|
| |
echo is a built-in, so we were testing execve in our own code, and not in
the running child.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Define explicit action "kill" for SystemCallErrorNumber=.
In addition to errno code, allow specifying "kill" as action for
SystemCallFilter=.
---
v7: seccomp_parse_errno_or_action() returns -EINVAL if !HAVE_SECCOMP
v6: use streq_ptr(), let errno_to_name() handle bad values, kill processes,
init syscall_errno
v5: actually use seccomp_errno_or_action_to_string(), don't fail bus unit
parsing without seccomp
v4: fix build without seccomp
v3: drop log action
v2: action -> number
|
|
|
|
|
|
|
|
|
| |
All backslashes that should be single in shell syntax need to be written as "\\" because
our parser will remove one level of quoting. Also, single quotes were doubly nested, which
cannot work.
Should fix the following message:
test-execute/exec-dynamicuser-statedir.service:16: Ignoring unknown escape sequences: "test $$(find / \( -path /var/tmp -o -path /tmp -o -path /proc -o -path /dev/mqueue -o -path /dev/shm -o -path /sys/fs/bpf -o -path /dev/.lxc \) -prune -o -type d -writable -print 2>/dev/null | sort -u | tr -d \\n) = /var/lib/private/quux/pief/var/lib/private/waldo"
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
https://tools.ietf.org/html/draft-knodel-terminology-02
https://lwn.net/Articles/823224/
This gets rid of most but not occasions of these loaded terms:
1. scsi_id and friends are something that is supposed to be removed from
our tree (see #7594)
2. The test suite defines an API used by the ubuntu CI. We can remove
this too later, but this needs to be done in sync with the ubuntu CI.
3. In some cases the terms are part of APIs we call or where we expose
concepts the kernel names the way it names them. (In particular all
remaining uses of the word "slave" in our codebase are like this,
it's used by the POSIX PTY layer, by the network subsystem, the mount
API and the block device subsystem). Getting rid of the term in these
contexts would mean doing some major fixes of the kernel ABI first.
Regarding the replacements: when whitelist/blacklist is used as noun we
replace with with allow list/deny list, and when used as verb with
allow-list/deny-list.
|
|
|
|
|
| |
Both hostname and uname utilities boil down to uname(2) syscall. Reduce
tests dependency footprint by using uname for checking hostname too.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
libcap v2.33 introduces a new capability set called IAB[0] which is shown
in the output of `capsh --print` and interferes with the test checks. Let's
drop the IAB set from the output, for now, to mitigate this.
This could be (and probably should be) replaced in the future by the
newly introduced testing options[1][2] in libcap v2.32, namely:
--has-p=xxx
--has-i=xxx
--has-a=xxx
but this needs to wait until the respective libcap version gets a wider
adoption. Until then, let's stick with the relatively ugly sed.
Fixes: #15046
[0] https://git.kernel.org/pub/scm/libs/libcap/libcap.git/commit/?id=943b011b5e53624eb9cab4e96c1985326e077cdd
[1] https://git.kernel.org/pub/scm/libs/libcap/libcap.git/commit/?id=588d0439cb6495b03f0ab9f213f0b6b339e7d4b7
[2] https://git.kernel.org/pub/scm/libs/libcap/libcap.git/commit/?id=e7709bbc1c4712f2ddfc6e6f42892928a8a03782
|
|
|
|
|
|
|
|
| |
The man pages state that the '+' prefix in Exec* directives should
ignore filesystem namespacing options such as PrivateTmp. Now it does.
This is very similar to #8842, just with PrivateTmp instead of
PrivateDevices.
|
|
|
|
|
|
|
| |
Since libcap v2.29 the format of cap_to_text() has been changed which
makes certain `test-execute` subtest fail. Let's remove the offending
part of the output (dropped capabilities) to make it compatible with
both the old and the new libcap.
|
| |
|
|
|
|
|
|
| |
For root, group enforcement needs to come after PrivateDevices=y set up
according to 096424d1230e0a0339735c51b43949809e972430. Add a test to
verify this is the case.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In some containers unshare() is made unavailable entirely. Let's deal
with this that more gracefully and disable our sandboxing of services
then, so that we work in a container, under the assumption the container
manager is then responsible for sandboxing if we can't do it ourselves.
Previously, we'd insist on sandboxing as soon as any form of BindPath=
is used. With this change we only insist on it if we have a setting like
that where source and destination differ, i.e. there's a mapping
established that actually rearranges things, and thus would result in
systematically different behaviour if skipped (as opposed to mappings
that just make stuff read-only/writable that otherwise arent').
(Let's also update a test that intended to test for this behaviour with
a more specific configuration that still triggers the behaviour with
this change in place)
Fixes: #13955
(For testing purposes unshare() can easily be blocked with
systemd-nspawn --system-call-filter=~unshare.)
|
| |
|
| |
|
|
|
|
|
|
| |
It appears in nested LXC containers and broke the test in Ubuntu CI.
BugLink: https://bugs.launchpad.net/bugs/1845337
|
|
|
|
| |
Closes #10596
|
|
|
|
|
|
|
|
|
|
| |
Before only one comparison was allowed. Let's make this more flexible:
ConditionKernelVersion = ">=4.0" "<=4.5"
Fixes #12881.
This also fixes expressions like "ConditionKernelVersion=>" which would
evaluate as true.
|
| |
|
|
|
|
|
|
| |
These services are likely to coredump, and we expect that but aren't
interested in the coredump. Hence let's turn off processing by setting
RLIMIT_CORE to 0/0.
|
|
|
|
|
| |
As explained in the previous commit, blocking /proc can cause us
to go into a long loop or fail the test.
|
| |
|
| |
|
|
|
|
| |
Fixes #10934.
|
| |
|
| |
|
| |
|
|
|
|
|
|
|
| |
Those interfaces are created automatically when ip6_tunnel and ip6_gre loaded.
They break the test with exec-privatenetwork-yes.service.
C.f. 6b08180ca6f1ceb913f6a69ffcaf96e9818fbdf5.
|
|
|
|
| |
This adds a test for issue #9939 which is fixed by
a5404992cc7724ebf7572a0aa89d9fdb26ce0b62 (#9942).
|
|
|
|
| |
This is mainly for testing 1beab8b0d0ff2d7d1436b52d4a0c3d56dc908962.
|
|
|
|
| |
Fixes #9881.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
... when no mount options are passed.
Change the code, to avoid the following failure in the newly added tests:
exec-temporaryfilesystem-rw.service: Executing: /usr/bin/sh -x -c
'[ "$(stat -c %a /var)" == 755 ]'
++ stat -c %a /var
+ '[' 1777 == 755 ']'
Received SIGCHLD from PID 30364 (sh).
Child 30364 (sh) died (code=exited, status=1/FAILURE)
(And I spotted an opportunity to use TAKE_PTR() at the end).
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We can't remount the underlying superblocks, if we are inside a user
namespace and running Linux <= 4.17. We can only change the per-mount
flags (MS_REMOUNT | MS_BIND).
This type of mount() call can only change the per-mount flags, so we
don't have to worry about passing the right string options now.
Fixes #9914 ("Since 1beab8b was merged, systemd has been failing to start
systemd-resolved inside unprivileged containers" ... "Failed to re-mount
'/run/systemd/unit-root/dev' read-only: Operation not permitted").
> It's basically my fault :-). I pointed out we could remount read-only
> without MS_BIND when reviewing the PR that added TemporaryFilesystem=,
> and poettering suggested to change PrivateDevices= at the same time.
> I think it's safe to change back, and I don't expect anyone will notice
> a difference in behaviour.
>
> It just surprised me to realize that
> `TemporaryFilesystem=/tmp:size=10M,ro,nosuid` would not apply `ro` to the
> superblock (underlying filesystem), like mount -osize=10M,ro,nosuid does.
> Maybe a comment could note the kernel version (v4.18), that lets you
> remount without MS_BIND inside a user namespace.
This makes the code longer and I guess this function is still ugly, sorry.
One obstacle to cleaning it up is the interaction between
`PrivateDevices=yes` and `ReadOnlyPaths=/dev`. I've added a test for the
existing behaviour, which I think is now the correct behaviour.
|
|
|
|
| |
different
|
|
|
|
| |
Addresses part of #8983
|
| |
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
This corresponds nicely with the specifiers we already pass for
/var/lib, /var/cache, /run and so on.
This is particular useful to update the test-path service files to
operate without guessable files, thus allowing multiple parallel
test-path invocations to pass without issues (the idea is to set $TMPDIR
early on in the test to some private directory, and then only use the
new %T or %V specifier to refer to it).
|
|\
| |
| | |
core: allow to specify RestrictNamespaces= multiple times
|
| | |
|
| |
| |
| |
| | |
Fixes #8679.
|
|/
|
|
|
| |
systemd silently strips out backslashes in variables from environment
files. Add a testcase that explicitly tests for this behaviour.
|
| |
|
| |
|
|
|
| |
Follow-up for 250e9fadbcc0ca90e697d7efb40855b054ed3b8f.
|
|
|
|
|
| |
We synthesize the passwd record for UID 0, hence we need to compare with
our synthesized data and not with the data stored in /etc/passwd
|