| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
-1 was used everywhere, but -EBADF or -EBADFD started being used in various
places. Let's make things consistent in the new style.
Note that there are two candidates:
EBADF 9 Bad file descriptor
EBADFD 77 File descriptor in bad state
Since we're initializating the fd, we're just assigning a value that means
"no fd yet", so it's just a bad file descriptor, and the first errno fits
better. If instead we had a valid file descriptor that became invalid because
of some operation or state change, the other errno would fit better.
In some places, initialization is dropped if unnecessary.
|
| |
|
|
|
|
| |
Fixes #17078.
|
|
|
|
|
|
|
|
| |
We have to skip the W^X protections as we need executable
memory on PARISC for now. Kernel work is in progress (started
w/ 5.18).
Closes: https://github.com/systemd/systemd/issues/23180
|
|
|
|
|
|
|
|
|
|
| |
On ppc64le sanitizers disable ASLR (i.e. by setting ADDR_NO_RANDOMIZE),
which opinionated_personality() doesn't return. Let's tweak the current
personality ourselves in such cases.
See: https://github.com/llvm/llvm-project/commit/78f7a6eaa601bfdd6ae70ffd3da2254c21ff77f9
Resolves: #23666
|
|
|
|
|
|
| |
shmat() requires the CAP_IPC_OWNER capability. When running test-seccomp
in environments with root + CAP_SYS_ADMIN, but not CAP_IPC_OWNER,
memory_deny_write_execute_shmat would fail. This fixes it.
|
|
|
|
|
|
|
|
|
| |
This converts to TEST macro where it is trivial.
Some additional notable changes:
- simplify HAVE_LIBIDN #ifdef in test-dns-domain.c
- use saved_argc/saved_argv in test-copy.c, test-path-util.c,
test-tmpfiles.c and test-unit-file.c
|
| |
|
|
|
|
|
|
| |
Follow-up for 0643eb47a0418dc90d33853089dc9bc6ad63b0ca.
This also drops errnously introduced hashmap_put() in the commit.
|
|
|
| |
some newer architectures like riscv32 do not have __NR_ppoll from get go
|
|
|
|
|
|
|
| |
geteuid() without CAP_SYS_ADMIN is not enough to do unrestricted
seccomp(). Hence tighten the check.
See: #19746
|
| |
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This reverts the gist of da1921a5c396547261c8c7fcd94173346eb3b718 and
0d9fca76bb69e162265b2d25cb79f1890c0da31b (for ppc).
Quoting #17559:
> libseccomp 2.5 added socket syscall multiplexing on ppc64(el):
> https://github.com/seccomp/libseccomp/pull/229
>
> Like with i386, s390 and s390x this breaks socket argument filtering, so
> RestrictAddressFamilies doesn't work.
>
> This causes the unit test to fail:
> /* test_restrict_address_families */
> Operating on architecture: ppc
> Failed to install socket family rules for architecture ppc, skipping: Operation canceled
> Operating on architecture: ppc64
> Failed to add socket() rule for architecture ppc64, skipping: Invalid argument
> Operating on architecture: ppc64-le
> Failed to add socket() rule for architecture ppc64-le, skipping: Invalid argument
> Assertion 'fd < 0' failed at src/test/test-seccomp.c:424, function test_restrict_address_families(). Aborting.
>
> The socket filters can't be added so `socket(AF_UNIX, SOCK_DGRAM, 0);` still
> works, triggering the assertion.
Fixes #17559.
|
| |
|
| |
|
| |
|
|
|
|
| |
It seems that kernel 5.9 started returning that.
|
|\
| |
| | |
Return ENOSYS in nspawn for "unknown" syscalls
|
| | |
|
|/
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch adds seccomp support to the riscv64 architecture. seccomp
support is available in the riscv64 kernel since version 5.5, and it
has just been added to the libseccomp library.
riscv64 uses generic syscalls like aarch64, so I used that architecture
as a reference to find which code has to be modified.
With this patch, the testsuite passes successfully, including the
test-seccomp test. The system boots and works fine with kernel 5.4 (i.e.
without seccomp support) and kernel 5.5 (i.e. with seccomp support). I
have also verified that the "SystemCallFilter=~socket" option prevents a
service to use the ping utility when running on kernel 5.5.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
https://tools.ietf.org/html/draft-knodel-terminology-02
https://lwn.net/Articles/823224/
This gets rid of most but not occasions of these loaded terms:
1. scsi_id and friends are something that is supposed to be removed from
our tree (see #7594)
2. The test suite defines an API used by the ubuntu CI. We can remove
this too later, but this needs to be done in sync with the ubuntu CI.
3. In some cases the terms are part of APIs we call or where we expose
concepts the kernel names the way it names them. (In particular all
remaining uses of the word "slave" in our codebase are like this,
it's used by the POSIX PTY layer, by the network subsystem, the mount
API and the block device subsystem). Getting rid of the term in these
contexts would mean doing some major fixes of the kernel ABI first.
Regarding the replacements: when whitelist/blacklist is used as noun we
replace with with allow list/deny list, and when used as verb with
allow-list/deny-list.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Don't assume that 4MB can be allocated from stack since there could be smaller
DefaultLimitSTACK= in force, so let's use malloc(). NUL terminate the huge
strings by hand, also ensure termination in test_lz4_decompress_partial() and
optimize the memset() for the string.
Some items in /proc and /etc may not be accessible to poor unprivileged users
due to e.g. SELinux, BOFH or both, so check for EACCES and EPERM.
/var/tmp may be a symlink to /tmp and then path_compare() will always fail, so
let's stick to /tmp like elsewhere.
/tmp may be mounted with noexec option and then trying to execute scripts from
there would fail.
Detect and warn if seccomp is already in use, which could make seccomp test
fail if the syscalls are already blocked.
Unset $TMPDIR so it will not break specifier tests where %T is assumed to be
/tmp and %V /var/tmp.
|
|
|
|
|
|
|
|
|
|
| |
This improves the following debug log.
Before:
systemd[1162]: Restricting namespace to: .
After:
systemd[1162]: Restricting namespace to: n/a.
|
|
|
|
|
|
|
| |
Real syscall numbers start at 0. The fake seccomp values seem to be
strictly less than 0.
Fixes: 4df8fe8415eaf4abd5b93c3447452547c6ea9e5f
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Like with shmat already the actual results of the test
test_memory_deny_write_execute_mmap depend on kernel/libseccomp/glibc
of the platform it is running on.
There are known-good platforms, but on the others do not assert success
(which implies test has actually failed as no seccomp blocking was achieved),
but instead make the check dependent to the success of the mmap call
on that platforms.
Finally the assert of the munmap on that valid pointer should return ==0,
so that is what the check should be for in case of p != MAP_FAILED.
Signed-off-by: Christian Ehrhardt <christian.ehrhardt@canonical.com>
|
|\
| |
| | |
Add ProtectKernelLogs to systemd.exec
|
| | |
|
|/
|
|
|
|
|
|
|
| |
namespace invasion
A follow-up for 59b657296a2fe104f112b91bbf9301724067cc81, adding the
same conditioning for all cases of our __NR_xyz use.
Fixes: #14031
|
| |
|
| |
|
|
|
|
| |
This way, we know that it works as intended.
|
|
|
|
| |
It has no open().
|
| |
|
|
|
|
|
|
|
|
|
|
| |
multiplexer
After
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=0d6040d46817,
those syscalls have their separate numbers and we can block them.
But glibc might still use the old ones. So let's just do a best-effort
block and not assume anything about how effective it is.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
E.g. on i686:
(previously)
arch x86: SCMP_SYS(mmap) = 90
arch x86: SCMP_SYS(mmap2) = 192
arch x86: SCMP_SYS(shmat) = -221
arch x86: SCMP_SYS(shmat) = -221
arch x86: SCMP_SYS(shmdt) = -222
(now)
arch x86: SCMP_SYS(mmap) = 90
arch x86: SCMP_SYS(mmap2) = 192
arch x86: SCMP_SYS(shmat) = 397
arch x86: SCMP_SYS(shmat) = 397
arch x86: SCMP_SYS(shmdt) = 398
The relevant commit seems to be
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=0d6040d46817.
|
| |
|
|
|
|
| |
Just some source rearranging.
|
|
|
|
|
|
|
| |
Apparently on Debian LXC/AppArmor doesn't allow namespacing to container
payloads. Deal with it.
Fixes: #9700
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Our logs are full of:
Sep 19 09:22:10 autopkgtest systemd[690]: Failed to add rule for system call oldstat() / -10037, ignoring: Numerical argument out of domain
Sep 19 09:22:10 autopkgtest systemd[690]: Failed to add rule for system call get_thread_area() / -10076, ignoring: Numerical argument out of domain
Sep 19 09:22:10 autopkgtest systemd[690]: Failed to add rule for system call set_thread_area() / -10079, ignoring: Numerical argument out of domain
Sep 19 09:22:10 autopkgtest systemd[690]: Failed to add rule for system call oldfstat() / -10034, ignoring: Numerical argument out of domain
Sep 19 09:22:10 autopkgtest systemd[690]: Failed to add rule for system call oldolduname() / -10036, ignoring: Numerical argument out of domain
Sep 19 09:22:10 autopkgtest systemd[690]: Failed to add rule for system call oldlstat() / -10035, ignoring: Numerical argument out of domain
Sep 19 09:22:10 autopkgtest systemd[690]: Failed to add rule for system call waitpid() / -10073, ignoring: Numerical argument out of domain
...
This is pointless and makes debug logs hard to read. Let's keep the logs
in test code, but disable it in nspawn and pid1. This is done through a function
parameter because those functions operate recursively and it's not possible to
make the caller to log meaningfully.
There should be no functional change, except the skipped debug logs.
|
|
|
|
|
| |
Various tests produce similar output, and the function names make it
easier to see where the output is generated.
|
| |
|
| |
|
|
|
|
|
| |
The advantages are that we save a few lines, and that we can override
logging using environment variables in more test executables.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
system services
Currently we employ mostly system call blacklisting for our system
services. Let's add a new system call filter group @system-service that
helps turning this around into a whitelist by default.
The new group is very similar to nspawn's default filter list, but in
some ways more restricted (as sethostname() and suchlike shouldn't be
available to most system services just like that) and in others more
relaxed (for example @keyring is blocked in nspawn since it's not
properly virtualized yet in the kernel, but is fine for regular system
services).
|
|
|
|
|
|
|
|
|
|
|
| |
These lines are generally out-of-date, incomplete and unnecessary. With
SPDX and git repository much more accurate and fine grained information
about licensing and authorship is available, hence let's drop the
per-file copyright notice. Of course, removing copyright lines of others
is problematic, hence this commit only removes my own lines and leaves
all others untouched. It might be nicer if sooner or later those could
go away too, making git the only and accurate source of authorship
information.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This part of the copyright blurb stems from the GPL use recommendations:
https://www.gnu.org/licenses/gpl-howto.en.html
The concept appears to originate in times where version control was per
file, instead of per tree, and was a way to glue the files together.
Ultimately, we nowadays don't live in that world anymore, and this
information is entirely useless anyway, as people are very welcome to
copy these files into any projects they like, and they shouldn't have to
change bits that are part of our copyright header for that.
hence, let's just get rid of this old cruft, and shorten our codebase a
bit.
|
|
|
|
|
|
| |
This also drops namespace_flag_to_string_many_with_check(), and
renames namespace_flag_{from,to}_string_many() to
namespace_flags_{from,to}_string().
|
|
|
|
|
|
|
|
|
|
| |
Files which are installed as-is (any .service and other unit files, .conf
files, .policy files, etc), are left as is. My assumption is that SPDX
identifiers are not yet that well known, so it's better to retain the
extended header to avoid any doubt.
I also kept any copyright lines. We can probably remove them, but it'd nice to
obtain explicit acks from all involved authors before doing that.
|