From f09eb7688eeddffd6dab7f86501079f1142c860b Mon Sep 17 00:00:00 2001
From: Lennart Poettering
Date: Mon, 26 Feb 2018 11:48:46 +0100
Subject: doc: add a new doc/ directory, and move two markdown docs into them

I figure sooner or later we'll have more of these docs, hence let's give them a
clean place to be.

This leaves NEWS and README/README.md as well as the LICENSE texts in the root
directory of the project since that appears to be customary for Free Software
projects.
---
 .github/CONTRIBUTING.md   |   4 +-
 CODING_STYLE              | 442 --------------------------------------------
 DISTRO_PORTING            |  71 --------
 ENVIRONMENT.md            |  91 ----------
 HACKING                   | 116 ------------
 NEWS                      |   8 +-
 README.md                 |   4 +-
 TRANSIENT-SETTINGS.md     | 454 ----------------------------------------------
 TRANSLATORS               |  27 ---
 UIDS-GIDS.md              | 243 -------------------------
 doc/CODING_STYLE          | 442 ++++++++++++++++++++++++++++++++++++++++++++
 doc/DISTRO_PORTING        |  71 ++++++++
 doc/ENVIRONMENT.md        |  91 ++++++++++
 doc/HACKING               | 116 ++++++++++++
 doc/TRANSIENT-SETTINGS.md | 454 ++++++++++++++++++++++++++++++++++++++++++++++
 doc/TRANSLATORS           |  27 +++
 doc/UIDS-GIDS.md          | 243 +++++++++++++++++++++++++
 meson.build               |  18 +-
 src/basic/verbs.c         |   2 +-
 19 files changed, 1463 insertions(+), 1461 deletions(-)
 delete mode 100644 CODING_STYLE
 delete mode 100644 DISTRO_PORTING
 delete mode 100644 ENVIRONMENT.md
 delete mode 100644 HACKING
 delete mode 100644 TRANSIENT-SETTINGS.md
 delete mode 100644 TRANSLATORS
 delete mode 100644 UIDS-GIDS.md
 create mode 100644 doc/CODING_STYLE
 create mode 100644 doc/DISTRO_PORTING
 create mode 100644 doc/ENVIRONMENT.md
 create mode 100644 doc/HACKING
 create mode 100644 doc/TRANSIENT-SETTINGS.md
 create mode 100644 doc/TRANSLATORS
 create mode 100644 doc/UIDS-GIDS.md

diff --git a/.github/CONTRIBUTING.md b/.github/CONTRIBUTING.md
index 6f197441cb..2f266f2934 100644
--- a/.github/CONTRIBUTING.md
+++ b/.github/CONTRIBUTING.md
@@ -24,8 +24,8 @@ If you discover a security vulnerability, we'd appreciate a non-public
disclosur ## Posting Pull Requests * Make sure to post PRs only relative to a very recent git master. -* Follow our [Coding Style](https://raw.githubusercontent.com/systemd/systemd/master/CODING_STYLE) when contributing code. This is a requirement for all code we merge. -* Please make sure to test your change before submitting the PR. See [HACKING](https://raw.githubusercontent.com/systemd/systemd/master/HACKING) for details how to do this. +* Follow our [Coding Style](https://raw.githubusercontent.com/systemd/systemd/master/doc/CODING_STYLE) when contributing code. This is a requirement for all code we merge. +* Please make sure to test your change before submitting the PR. See [HACKING](https://raw.githubusercontent.com/systemd/systemd/master/doc/HACKING) for details how to do this. * Make sure to run the test suite locally, before posting your PR. We use a CI system, meaning we don't even look at your PR, if the build and tests don't pass. * If you need to update the code in an existing PR, force-push into the same branch, overriding old commits with new versions. * After you have pushed a new version, add a comment about the new version (no notification is sent just for the commits, so it's easy to miss the update without an explicit comment). If you are a member of the systemd project on GitHub, remove the `reviewed/needs-rework` label. diff --git a/CODING_STYLE b/CODING_STYLE deleted file mode 100644 index ae818126cb..0000000000 --- a/CODING_STYLE +++ /dev/null @@ -1,442 +0,0 @@ -- 8ch indent, no tabs, except for files in man/ which are 2ch indent, - and still no tabs - -- We prefer /* comments */ over // comments in code you commit, please. This - way // comments are left for developers to use for local, temporary - commenting of code for debug purposes (i.e. uncommittable stuff), making such - comments easily discernable from explanatory, documenting code comments - (i.e. committable stuff). - -- Don't break code lines too eagerly. 
We do *not* force line breaks at - 80ch, all of today's screens should be much larger than that. But - then again, don't overdo it, ~119ch should be enough really. - -- Variables and functions *must* be static, unless they have a - prototype, and are supposed to be exported. - -- structs in MixedCase (with exceptions, such as public API structs), - variables + functions in lower_case. - -- The destructors always unregister the object from the next bigger - object, not the other way around - -- To minimize strict aliasing violations, we prefer unions over casting - -- For robustness reasons, destructors should be able to destruct - half-initialized objects, too - -- Error codes are returned as negative Exxx. e.g. return -EINVAL. There - are some exceptions: for constructors, it is OK to return NULL on - OOM. For lookup functions, NULL is fine too for "not found". - - Be strict with this. When you write a function that can fail due to - more than one cause, it *really* should have "int" as return value - for the error code. - -- Do not bother with error checking whether writing to stdout/stderr - worked. - -- Do not log errors from "library" code, only do so from "main - program" code. (With one exception: it is OK to log with DEBUG level - from any code, with the exception of maybe inner loops). - -- Always check OOM. There is no excuse. In program code, you can use - "log_oom()" for then printing a short message, but not in "library" code. - -- Do not issue NSS requests (that includes user name and host name - lookups) from PID 1 as this might trigger deadlocks when those - lookups involve synchronously talking to services that we would need - to start up - -- Do not synchronously talk to any other service from PID 1, due to - risk of deadlocks - -- Avoid fixed-size string buffers, unless you really know the maximum - size and that maximum size is small. They are a source of errors, - since they possibly result in truncated strings. 
It is often nicer - to use dynamic memory, alloca() or VLAs. If you do allocate fixed-size - strings on the stack, then it is probably only OK if you either - use a maximum size such as LINE_MAX, or count in detail the maximum - size a string can have. (DECIMAL_STR_MAX and DECIMAL_STR_WIDTH - macros are your friends for this!) - - Or in other words, if you use "char buf[256]" then you are likely - doing something wrong! - -- Stay uniform. For example, always use "usec_t" for time - values. Do not mix usec and msec, and usec and whatnot. - -- Make use of _cleanup_free_ and friends. It makes your code much - nicer to read! - -- Be exceptionally careful when formatting and parsing floating point - numbers. Their syntax is locale dependent (i.e. "5.000" in en_US is - generally understood as 5, while on de_DE as 5000.). - -- Try to use this: - - void foo() { - } - - instead of this: - - void foo() - { - } - - But it is OK if you do not. - -- Single-line "if" blocks should not be enclosed in {}. Use this: - - if (foobar) - waldo(); - - instead of this: - - if (foobar) { - waldo(); - } - -- Do not write "foo ()", write "foo()". - -- Please use streq() and strneq() instead of strcmp(), strncmp() where applicable. - -- Please do not allocate variables on the stack in the middle of code, - even if C99 allows it. Wrong: - - { - a = 5; - int b; - b = a; - } - - Right: - - { - int b; - a = 5; - b = a; - } - -- Unless you allocate an array, "double" is always the better choice - than "float". Processors speak "double" natively anyway, so this is - no speed benefit, and on calls like printf() "float"s get promoted - to "double"s anyway, so there is no point. - -- Do not mix function invocations with variable definitions in one - line. Wrong: - - { - int a = foobar(); - uint64_t x = 7; - } - - Right: - - { - int a; - uint64_t x = 7; - - a = foobar(); - } - -- Use "goto" for cleaning up, and only use it for that. i.e. you may - only jump to the end of a function, and little else. 
Never jump - backwards! - -- Think about the types you use. If a value cannot sensibly be - negative, do not use "int", but use "unsigned". - -- Use "char" only for actual characters. Use "uint8_t" or "int8_t" - when you actually mean a byte-sized signed or unsigned - integers. When referring to a generic byte, we generally prefer the - unsigned variant "uint8_t". Do not use types based on "short". They - *never* make sense. Use ints, longs, long longs, all in - unsigned+signed fashion, and the fixed size types - uint8_t/uint16_t/uint32_t/uint64_t/int8_t/int16_t/int32_t and so on, - as well as size_t, but nothing else. Do not use kernel types like - u32 and so on, leave that to the kernel. - -- Public API calls (i.e. functions exported by our shared libraries) - must be marked "_public_" and need to be prefixed with "sd_". No - other functions should be prefixed like that. - -- In public API calls, you *must* validate all your input arguments for - programming error with assert_return() and return a sensible return - code. In all other calls, it is recommended to check for programming - errors with a more brutal assert(). We are more forgiving to public - users than for ourselves! Note that assert() and assert_return() - really only should be used for detecting programming errors, not for - runtime errors. assert() and assert_return() by usage of _likely_() - inform the compiler that he should not expect these checks to fail, - and they inform fellow programmers about the expected validity and - range of parameters. - -- Never use strtol(), atoi() and similar calls. Use safe_atoli(), - safe_atou32() and suchlike instead. They are much nicer to use in - most cases and correctly check for parsing errors. - -- For every function you add, think about whether it is a "logging" - function or a "non-logging" function. "Logging" functions do logging - on their own, "non-logging" function never log on their own and - expect their callers to log. 
All functions in "library" code, - i.e. in src/shared/ and suchlike must be "non-logging". Every time a - "logging" function calls a "non-logging" function, it should log - about the resulting errors. If a "logging" function calls another - "logging" function, then it should not generate log messages, so - that log messages are not generated twice for the same errors. - -- Avoid static variables, except for caches and very few other - cases. Think about thread-safety! While most of our code is never - used in threaded environments, at least the library code should make - sure it works correctly in them. Instead of doing a lot of locking - for that, we tend to prefer using TLS to do per-thread caching (which - only works for small, fixed-size cache objects), or we disable - caching for any thread that is not the main thread. Use - is_main_thread() to detect whether the calling thread is the main - thread. - -- Command line option parsing: - - Do not print full help() on error, be specific about the error. - - Do not print messages to stdout on error. - - Do not POSIX_ME_HARDER unless necessary, i.e. avoid "+" in option string. - -- Do not write functions that clobber call-by-reference variables on - failure. Use temporary variables for these cases and change the - passed in variables only on success. - -- When you allocate a file descriptor, it should be made O_CLOEXEC - right from the beginning, as none of our files should leak to forked - binaries by default. Hence, whenever you open a file, O_CLOEXEC must - be specified, right from the beginning. This also applies to - sockets. 
Effectively this means that all invocations to: - - a) open() must get O_CLOEXEC passed - b) socket() and socketpair() must get SOCK_CLOEXEC passed - c) recvmsg() must get MSG_CMSG_CLOEXEC set - d) F_DUPFD_CLOEXEC should be used instead of F_DUPFD, and so on - f) invocations of fopen() should take "e" - -- We never use the POSIX version of basename() (which glibc defines it in - libgen.h), only the GNU version (which glibc defines in string.h). - The only reason to include libgen.h is because dirname() - is needed. Every time you need that please immediately undefine - basename(), and add a comment about it, so that no code ever ends up - using the POSIX version! - -- Use the bool type for booleans, not integers. One exception: in public - headers (i.e those in src/systemd/sd-*.h) use integers after all, as "bool" - is C99 and in our public APIs we try to stick to C89 (with a few extension). - -- When you invoke certain calls like unlink(), or mkdir_p() and you - know it is safe to ignore the error it might return (because a later - call would detect the failure anyway, or because the error is in an - error path and you thus couldn't do anything about it anyway), then - make this clear by casting the invocation explicitly to (void). Code - checks like Coverity understand that, and will not complain about - ignored error codes. Hence, please use this: - - (void) unlink("/foo/bar/baz"); - - instead of just this: - - unlink("/foo/bar/baz"); - - Don't cast function calls to (void) that return no error - conditions. Specifically, the various xyz_unref() calls that return a NULL - object shouldn't be cast to (void), since not using the return value does not - hide any errors. - -- Don't invoke exit(), ever. It is not replacement for proper error - handling. Please escalate errors up your call chain, and use normal - "return" to exit from the main function of a process. 
If you - fork()ed off a child process, please use _exit() instead of exit(), - so that the exit handlers are not run. - -- Please never use dup(). Use fcntl(fd, F_DUPFD_CLOEXEC, 3) - instead. For two reason: first, you want O_CLOEXEC set on the new fd - (see above). Second, dup() will happily duplicate your fd as 0, 1, - 2, i.e. stdin, stdout, stderr, should those fds be closed. Given the - special semantics of those fds, it's probably a good idea to avoid - them. F_DUPFD_CLOEXEC with "3" as parameter avoids them. - -- When you define a destructor or unref() call for an object, please - accept a NULL object and simply treat this as NOP. This is similar - to how libc free() works, which accepts NULL pointers and becomes a - NOP for them. By following this scheme a lot of if checks can be - removed before invoking your destructor, which makes the code - substantially more readable and robust. - -- Related to this: when you define a destructor or unref() call for an - object, please make it return the same type it takes and always - return NULL from it. This allows writing code like this: - - p = foobar_unref(p); - - which will always work regardless if p is initialized or not, and - guarantees that p is NULL afterwards, all in just one line. - -- Use alloca(), but never forget that it is not OK to invoke alloca() - within a loop or within function call parameters. alloca() memory is - released at the end of a function, and not at the end of a {} - block. Thus, if you invoke it in a loop, you keep increasing the - stack pointer without ever releasing memory again. (VLAs have better - behaviour in this case, so consider using them as an alternative.) - Regarding not using alloca() within function parameters, see the - BUGS section of the alloca(3) man page. - -- Use memzero() or even better zero() instead of memset(..., 0, ...) - -- Instead of using memzero()/memset() to initialize structs allocated - on the stack, please try to use c99 structure initializers. 
It's - short, prettier and actually even faster at execution. Hence: - - struct foobar t = { - .foo = 7, - .bar = "bazz", - }; - - instead of: - - struct foobar t; - zero(t); - t.foo = 7; - t.bar = "bazz"; - -- When returning a return code from main(), please preferably use - EXIT_FAILURE and EXIT_SUCCESS as defined by libc. - -- The order in which header files are included doesn't matter too - much. systemd-internal headers must not rely on an include order, so - it is safe to include them in any order possible. - However, to not clutter global includes, and to make sure internal - definitions will not affect global headers, please always include the - headers of external components first (these are all headers enclosed - in <>), followed by our own exported headers (usually everything - that's prefixed by "sd-"), and then followed by internal headers. - Furthermore, in all three groups, order all includes alphabetically - so duplicate includes can easily be detected. - -- To implement an endless loop, use "for (;;)" rather than "while - (1)". The latter is a bit ugly anyway, since you probably really - meant "while (true)"... To avoid the discussion what the right - always-true expression for an infinite while() loop is our - recommendation is to simply write it without any such expression by - using "for (;;)". - -- Never use the "off_t" type, and particularly avoid it in public - APIs. It's really weirdly defined, as it usually is 64bit and we - don't support it any other way, but it could in theory also be - 32bit. Which one it is depends on a compiler switch chosen by the - compiled program, which hence corrupts APIs using it unless they can - also follow the program's choice. Moreover, in systemd we should - parse values the same way on all architectures and cannot expose - off_t values over D-Bus. To avoid any confusion regarding conversion - and ABIs, always use simply uint64_t directly. 
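To illustrate the off_t rule above, here is a minimal sketch, not code from the systemd tree. The helper name file_size_u64() is hypothetical. It converts the off_t reported by fstat() into a uint64_t right at the boundary, returns errors as negative errno values, and leaves the call-by-reference variable untouched on failure:

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <sys/stat.h>

/* Hypothetical helper, for illustration only: reports the size of the file
 * behind fd as uint64_t, so that callers never have to deal with off_t. */
static int file_size_u64(int fd, uint64_t *ret) {
        struct stat st;

        assert(ret);

        if (fstat(fd, &st) < 0)
                return -errno;

        /* st_size is an off_t, i.e. signed; refuse nonsensical values. */
        if (st.st_size < 0)
                return -EINVAL;

        /* Only write to the output parameter on success. */
        *ret = (uint64_t) st.st_size;
        return 0;
}
```

Callers then work with uint64_t throughout, and the same value can be serialized identically on every architecture.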
- -- Commit message subject lines should be prefixed with an appropriate - component name of some kind. For example "journal: ", "nspawn: " and - so on. - -- Do not use "Signed-Off-By:" in your commit messages. That's a kernel - thing we don't do in the systemd project. - -- Avoid leaving long-running child processes around, i.e. fork()s that - are not followed quickly by an execv() in the child. Resource - management is unclear in this case, and memory CoW will result in - unexpected penalties in the parent much much later on. - -- Don't block execution for arbitrary amounts of time using usleep() - or a similar call, unless you really know what you do. Just "giving - something some time", or so is a lazy excuse. Always wait for the - proper event, instead of doing time-based poll loops. - -- To determine the length of a constant string "foo", don't bother - with sizeof("foo")-1, please use STRLEN() instead. - -- If you want to concatenate two or more strings, consider using - strjoin() rather than asprintf(), as the latter is a lot - slower. This matters particularly in inner loops. - -- Please avoid using global variables as much as you can. And if you - do use them make sure they are static at least, instead of - exported. Especially in library-like code it is important to avoid - global variables. Why are global variables bad? They usually hinder - generic reusability of code (since they break in threaded programs, - and usually would require locking there), and as the code using them - has side-effects make programs non-transparent. That said, there are - many cases where they explicitly make a lot of sense, and are OK to - use. For example, the log level and target in log.c is stored in a - global variable, and that's OK and probably expected by most. Also - in many cases we cache data in global variables. If you add more - caches like this, please be careful however, and think about - threading. 
Only use static variables if you are sure that - thread-safety doesn't matter in your case. Alternatively consider - using TLS, which is pretty easy to use with gcc's "thread_local" - concept. It's also OK to store data that is inherently global in - global variables, for example data parsed from command lines, see - below. - -- If you parse a command line, and want to store the parsed parameters - in global variables, please consider prefixing their names with - "arg_". We have been following this naming rule in most of our - tools, and we should continue to do so, as it makes it easy to - identify command line parameter variables, and makes it clear why it - is OK that they are global variables. - -- When exposing public C APIs, be careful what function parameters you make - "const". For example, a parameter taking a context object should probably not - be "const", even if you are writing an otherwise read-only accessor function - for it. The reason is that making it "const" fixates the contract that your - call won't alter the object ever, as part of the API. However, that's often - quite a promise, given that this even prohibits object-internal caching or - lazy initialization of object variables. Moreover it's usually not too useful - for client applications. Hence: please be careful and avoid "const" on object - parameters, unless you are very sure "const" is appropriate. - -- Make sure to enforce limits on every user controllable resource. If the user - can allocate resources in your code, your code must enforce some form of - limits after which it will refuse operation. It's fine if it is hard-coded (at - least initially), but it needs to be there. This is particularly important - for objects that unprivileged users may allocate, but also matters for - everything else any user may allocated. - -- htonl()/ntohl() and htons()/ntohs() are weird. 
Please use htobe32() and - htobe16() instead, it's much more descriptive, and actually says what really - is happening, after all htonl() and htons() don't operate on longs and - shorts as their name would suggest, but on uint32_t and uint16_t. Also, - "network byte order" is just a weird name for "big endian", hence we might - want to call it "big endian" right-away. - -- You might wonder what kind of common code belongs in src/shared/ and what - belongs in src/basic/. The split is like this: anything that uses public APIs - we expose (i.e. any of the sd-bus, sd-login, sd-id128, ... APIs) must be - located in src/shared/. All stuff that only uses external libraries from - other projects (such as glibc's APIs), or APIs from src/basic/ itself should - be placed in src/basic/. Conversely, src/libsystemd/ may only use symbols - from src/basic, but not from src/shared/. To summarize: - - src/basic/ → may be used by all code in the tree - → may not use any code outside of src/basic/ - - src/libsystemd/ → may be used by all code in the tree, except for code in src/basic/ - → may not use any code outside of src/basic/, src/libsystemd/ - - src/shared/ → may be used by all code in the tree, except for code in src/basic/, src/libsystemd/ - → may not use any code outside of src/basic/, src/libsystemd/, src/shared/ - -- Our focus is on the GNU libc (glibc), not any other libcs. If other libcs are - incompatible with glibc it's on them. However, if there are equivalent POSIX - and Linux/GNU-specific APIs, we generally prefer the POSIX APIs. If there - aren't, we are happy to use GNU or Linux APIs, and expect non-GNU - implementations of libc to catch up with glibc. - -- Whenever installing a signal handler, make sure to set SA_RESTART for it, so - that interrupted system calls are automatically restarted, and we minimize - hassles with handling EINTR (in particular as EINTR handling is pretty broken - on Linux). 
- -- When applying C-style unescaping as well as specifier expansion on the same - string, always apply the C-style unescaping fist, followed by the specifier - expansion. When doing the reverse, make sure to escape '%' in specifier-style - first (i.e. '%' → '%%'), and then do C-style escaping where necessary. diff --git a/DISTRO_PORTING b/DISTRO_PORTING deleted file mode 100644 index d1a187aa41..0000000000 --- a/DISTRO_PORTING +++ /dev/null @@ -1,71 +0,0 @@ -Porting systemd To New Distributions - -HOWTO: - You need to make the follow changes to adapt systemd to your - distribution: - - 1) Find the right configure parameters for: - - -D rootprefix= - -D sysvinit-path= - -D sysvrcnd-path= - -D rc-local= - -D halt-local= - -D loadkeys-path= - -D setfont-path= - -D tty-gid= - -D ntp-servers= - -D dns-servers= - -D support-url= - - 2) Try it out. Play around (as an ordinary user) with - '/usr/lib/systemd/systemd --test --system' for a test run - of systemd without booting. This will read the unit files and - print the initial transaction it would execute during boot-up. - This will also inform you about ordering loops and suchlike. - -NTP POOL: - By default, systemd-timesyncd uses the Google Public NTP servers - time[1-4].google.com, if no other NTP configuration is available. They - serve time that uses a leap second smear, and can be up to .5s off from - servers that use stepped leap seconds. - - https://developers.google.com/time/smear - - If you prefer to use leap second steps, please register your own - vendor pool at ntp.org and make it the built-in default by - passing --with-ntp-servers= to configure. Registering vendor - pools is free: - - http://www.pool.ntp.org/en/vendors.html - - Use -D ntp-servers= to direct systemd-timesyncd to different fallback - NTP servers. 
- -DNS SERVERS: - By default, systemd-resolved uses the Google Public DNS servers - 8.8.8.8, 8.8.4.4, 2001:4860:4860::8888, 2001:4860:4860::8844 as - fallback, if no other DNS configuration is available. - - Use -D dns-servers= to direct systemd-resolved to different fallback - DNS servers. - -PAM: - The default PAM config shipped by systemd is really bare bones. - It does not include many modules your distro might want to enable - to provide a more seamless experience. For example, limits set in - /etc/security/limits.conf will not be read unless you load pam_limits. - Make sure you add modules your distro expects from user services. - - Pass -D pamconfdir=no to meson to avoid installing this file and - instead install your own. - -CONTRIBUTING UPSTREAM: - We generally do no longer accept distribution-specific patches to - systemd upstream. If you have to make changes to systemd's source code - to make it work on your distribution, unless your code is generic - enough to be generally useful, we are unlikely to merge it. Please - always consider adopting the upstream defaults. If that is not - possible, please maintain the relevant patches downstream. - - Thank you for understanding. diff --git a/ENVIRONMENT.md b/ENVIRONMENT.md deleted file mode 100644 index 581bf3c238..0000000000 --- a/ENVIRONMENT.md +++ /dev/null @@ -1,91 +0,0 @@ -# Known Environment Variables - -A number of systemd components take additional runtime parameters via -environment variables. Many of these environment variables are not supported at -the same level as command line switches and other interfaces are: we don't -document them in the man pages and we make no stability guarantees for -them. While they generally are unlikely to be dropped any time soon again, we -do not want to guarantee that they stay around for good either. - -Below is an (incomprehensive) list of the environment variables understood by -the various tools. 
Note that this list only covers environment variables not -documented in the proper man pages. - -All tools: - -* `$SYSTEMD_OFFLINE=[0|1]` — if set to `1`, then `systemctl` will - refrain from talking to PID 1; this has the same effect as the historical - detection of `chroot()`. Setting this variable to `0` instead has a similar - effect as `SYSTEMD_IGNORE_CHROOT=1`; i.e. tools will try to - communicate with PID 1 even if a `chroot()` environment is detected. - You almost certainly want to set this to `1` if you maintain a package build system - or similar and are trying to use a modern container system and not plain - `chroot()`. - -* `$SYSTEMD_IGNORE_CHROOT=1` — if set, don't check whether being invoked in a - `chroot()` environment. This is particularly relevant for systemctl, as it - will not alter its behaviour for `chroot()` environments if set. Normally it - refrains from talking to PID 1 in such a case; turning most operations such - as `start` into no-ops. If that's what's explicitly desired, you might - consider setting `SYSTEMD_OFFLINE=1`. - -* `$SD_EVENT_PROFILE_DELAYS=1` — if set, the sd-event event loop implementation - will print latency information at runtime. - -* `$SYSTEMD_PROC_CMDLINE` — if set, may contain a string that is used as kernel - command line instead of the actual one readable from /proc/cmdline. This is - useful for debugging, in order to test generators and other code against - specific kernel command lines. - -systemctl: - -* `$SYSTEMCTL_FORCE_BUS=1` — if set, do not connect to PID1's private D-Bus - listener, and instead always connect through the dbus-daemon D-bus broker. - -* `$SYSTEMCTL_INSTALL_CLIENT_SIDE=1` — if set, enable or disable unit files on - the client side, instead of asking PID 1 to do this. - -* `$SYSTEMCTL_SKIP_SYSV=1` — if set, do not call out to SysV compatibility hooks. - -systemd-nspawn: - -* `$UNIFIED_CGROUP_HIERARCHY=1` — if set, force nspawn into unified cgroup - hierarchy mode. 
- -* `$SYSTEMD_NSPAWN_API_VFS_WRITABLE=1` — if set, make /sys and /proc/sys and - friends writable in the container. If set to "network", leave only - /proc/sys/net writable. - -* `$SYSTEMD_NSPAWN_CONTAINER_SERVICE=…` — override the "service" name nspawn - uses to register with machined. If unset defaults to "nspawn", but with this - variable may be set to any other value. - -* `$SYSTEMD_NSPAWN_USE_CGNS=0` — if set, do not use cgroup namespacing, even if - it is available. - -* `$SYSTEMD_NSPAWN_LOCK=0` — if set, do not lock container images when running. - -systemd-logind: - -* `$SYSTEMD_BYPASS_HIBERNATION_MEMORY_CHECK=1` — if set, report that - hibernation is available even if the swap devices do not provide enough room - for it. - -installed systemd tests: - -* `$SYSTEMD_TEST_DATA` — override the location of test data. This is useful if - a test executable is moved to an arbitrary location. - -nss-systemd: - -* `$SYSTEMD_NSS_BYPASS_SYNTHETIC=1` — if set, `nss-systemd` won't synthesize - user/group records for the `root` and `nobody` users if they are missing from - `/etc/passwd`. - -* `$SYSTEMD_NSS_DYNAMIC_BYPASS=1` — if set, `nss-systemd` won't return - user/group records for dynamically registered service users (i.e. users - registered through `DynamicUser=1`). - -* `$SYSTEMD_NSS_BYPASS_BUS=1` — if set, `nss-systemd` won't use D-Bus to do - dynamic user lookups. This is primarily useful to make `nss-systemd` work - safely from within `dbus-daemon`. diff --git a/HACKING b/HACKING deleted file mode 100644 index e9a159ba9f..0000000000 --- a/HACKING +++ /dev/null @@ -1,116 +0,0 @@ -HACKING ON SYSTEMD - -We welcome all contributions to systemd. If you notice a bug or a missing -feature, please feel invited to fix it, and submit your work as a github Pull -Request (PR): - - https://github.com/systemd/systemd/pull/new - -Please make sure to follow our Coding Style when submitting patches. See -CODING_STYLE for details. 
Also have a look at our Contribution Guidelines: - - https://github.com/systemd/systemd/blob/master/.github/CONTRIBUTING.md - -When adding new functionality, tests should be added. For shared functionality -(in src/basic and src/shared) unit tests should be sufficient. The general -policy is to keep tests in matching files underneath src/test, -e.g. src/test/test-path-util.c contains tests for any functions in -src/basic/path-util.c. If adding a new source file, consider adding a matching -test executable. For features at a higher level, tests in src/test/ are very -strongly recommended. If that is no possible, integration tests in test/ are -encouraged. - -Please always test your work before submitting a PR. For many of the components -of systemd testing is straight-forward as you can simply compile systemd and -run the relevant tool from the build directory. - -For some components (most importantly, systemd/PID1 itself) this is not -possible, however. In order to simplify testing for cases like this we provide -a set of "mkosi" build files directly in the source tree. "mkosi" is a tool for -building clean OS images from an upstream distribution in combination with a -fresh build of the project in the local working directory. To make use of this, -please acquire "mkosi" from https://github.com/systemd/mkosi first, unless your -distribution has packaged it already and you can get it from there. After the -tool is installed it is sufficient to type "mkosi" in the systemd project -directory to generate a disk image "image.raw" you can boot either in -systemd-nspawn or in an UEFI-capable VM: - - # systemd-nspawn -bi image.raw - -or: - - # qemu-system-x86_64 -enable-kvm -m 512 -smp 2 -bios /usr/share/edk2/ovmf/OVMF_CODE.fd -hda image.raw - -Every time you rerun the "mkosi" command a fresh image is built, incorporating -all current changes you made to the project tree. 
- -Alternatively, you may install the systemd version from your git check-out -directly on top of your host system's directory tree. This mostly works fine, -but of course you should know what you are doing as you might make your system -unbootable in case of a bug in your changes. Also, you might step into your -package manager's territory with this. Be careful! - -And never forget: most distributions provide very simple and convenient ways to -install all development packages necessary to build systemd. For example, on -Fedora the following command line should be sufficient to install all of -systemd's build dependencies: - - # dnf builddep systemd - -Putting this all together, here's a series of commands for preparing a patch -for systemd (this example is for Fedora): - - $ sudo dnf builddep systemd # install build dependencies - $ sudo dnf install mkosi # install tool to quickly build images - $ git clone https://github.com/systemd/systemd.git - $ cd systemd - $ vim src/core/main.c # or wherever you'd like to make your changes - $ meson build # configure the build - $ ninja -C build # build it locally, see if everything compiles fine - $ ninja -C build test # run some simple regression tests - $ sudo mkosi # build a test image - $ sudo systemd-nspawn -bi image.raw # boot up the test image - $ git add -p # interactively put together your patch - $ git commit # commit it - $ git push REMOTE HEAD:refs/heads/BRANCH - # where REMOTE is your "fork" on github - # and BRANCH is a branch name. - -And after that, head over to your repo on github and click "Compare & pull request" - -Happy hacking! - - -FUZZERS - -systemd includes fuzzers in src/fuzz that use libFuzzer and are automatically -run by OSS-Fuzz (https://github.com/google/oss-fuzz) with sanitizers. To add a -fuzz target, create a new src/fuzz/fuzz-foo.c file with a LLVMFuzzerTestOneInput -function and add it to the list in src/fuzz/meson.build. 
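As a rough illustration of the fuzz target shape described above, here is a minimal, standalone sketch. It assumes nothing about systemd's internal headers or helpers: `parse_foo()` is a hypothetical stand-in for whatever function the target is meant to exercise, and the wiring into src/fuzz/meson.build is not shown. Only the `LLVMFuzzerTestOneInput` entry point and its signature come from libFuzzer itself.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical function under test (not a systemd API): returns the length
 * of the key in a "key=value" string, or -EINVAL if there is no key. */
static int parse_foo(const char *s) {
        const char *eq;

        eq = strchr(s, '=');
        if (!eq || eq == s)
                return -EINVAL;

        return (int) (eq - s);
}

/* Entry point that libFuzzer calls once per generated input */
int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
        char *str;

        assert(size == 0 || data);

        /* The input is not NUL-terminated, hence make a terminated copy */
        str = malloc(size + 1);
        if (!str)
                return 0;

        if (size > 0)
                memcpy(str, data, size);
        str[size] = 0;

        /* Only crashes and sanitizer reports matter, not the result */
        (void) parse_foo(str);

        free(str);
        return 0;
}
```

The entry point always returns 0 here; findings are reported through crashes and sanitizer failures rather than through the return value.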
- -Whenever possible, a seed corpus and a dictionary should also be added with new -fuzz targets. The dictionary should be named src/fuzz/fuzz-foo.dict and the seed -corpus should be built and exported as $OUT/fuzz-foo_seed_corpus.zip in -scripts/oss-fuzz.sh. - -The fuzzers can be built locally if you have libFuzzer installed by running -scripts/oss-fuzz.sh. You should also confirm that the fuzzer runs in the -OSS-Fuzz environment by checking out the OSS-Fuzz repo, and then running -commands like this: - - python infra/helper.py build_image systemd - python infra/helper.py build_fuzzers --sanitizer memory systemd ../systemd - python infra/helper.py run_fuzzer systemd fuzz-foo - -If you find a bug that impacts the security of systemd, please follow the -guidance in .github/CONTRIBUTING.md on how to report a security vulnerability. - -For more details on building fuzzers and integrating with OSS-Fuzz, visit: - - https://github.com/google/oss-fuzz/blob/master/docs/new_project_guide.md - - https://llvm.org/docs/LibFuzzer.html - - https://github.com/google/fuzzer-test-suite/blob/master/tutorial/libFuzzerTutorial.md - - https://chromium.googlesource.com/chromium/src/testing/libfuzzer/+/HEAD/efficient_fuzzer.md diff --git a/NEWS b/NEWS index 22372a8296..f5e348cc4c 100644 --- a/NEWS +++ b/NEWS @@ -373,8 +373,8 @@ CHANGES WITH 236: store again, ahead of POLLHUP or POLLERR when they are removed anyway. - * A new document UIDS-GIDS.md has been added to the source tree, that - documents the UID/GID range and assignment assumptions and + * A new document doc/UIDS-GIDS.md has been added to the source tree, + that documents the UID/GID range and assignment assumptions and requirements of systemd. 
* The watchdog device PID 1 will ping may now be configured through the @@ -1106,7 +1106,7 @@ CHANGES WITH 233: * Documentation has been added that lists all of systemd's low-level environment variables: - https://github.com/systemd/systemd/blob/master/ENVIRONMENT.md + https://github.com/systemd/systemd/blob/master/doc/ENVIRONMENT.md * sd-daemon gained a new API sd_is_socket_sockaddr() for determining whether a specific socket file descriptor matches a specified socket @@ -1817,7 +1817,7 @@ CHANGES WITH 231: booted up with "systemd-nspawn -b -i", qemu-kvm or on any physical UEFI PC. This functionality is particularly useful to easily test local changes made to systemd in a pristine, defined environment. See - HACKING for details. + doc/HACKING for details. * configure learned the --with-support-url= option to specify the distribution's bugtracker. diff --git a/README.md b/README.md index 06fd69142a..4b017faf1b 100644 --- a/README.md +++ b/README.md @@ -14,10 +14,10 @@ Information about build requirements are provided in the [README file](../master Consult our [NEWS file](../master/NEWS) for information about what's new in the most recent systemd versions. -Please see the [HACKING file](../master/HACKING) for information how to hack on systemd and test your modifications. +Please see the [HACKING file](../master/doc/HACKING) for information how to hack on systemd and test your modifications. Please see our [Contribution Guidelines](../master/.github/CONTRIBUTING.md) for more information about filing GitHub Issues and posting GitHub Pull Requests. -When preparing patches for systemd, please follow our [Coding Style Guidelines](../master/CODING_STYLE). +When preparing patches for systemd, please follow our [Coding Style Guidelines](../master/doc/CODING_STYLE). If you are looking for support, please contact our [mailing list](https://lists.freedesktop.org/mailman/listinfo/systemd-devel) or join our [IRC channel](irc://irc.freenode.org/%23systemd). 
diff --git a/TRANSIENT-SETTINGS.md b/TRANSIENT-SETTINGS.md deleted file mode 100644 index ca9e8387b7..0000000000 --- a/TRANSIENT-SETTINGS.md +++ /dev/null @@ -1,454 +0,0 @@ -# What settings are currently available for transient units? - -Our intention is to make all settings that are available as unit file settings -also available for transient units, through the D-Bus API. At the moment, some -unit types (device, swap, target) are not supported at all via unit types, -but most others are pretty well supported, with some notable omissions. - -The lists below contain all settings currently available in unit files. The -ones currently available in transient units are prefixed with `✓`. - -## Generic Unit Settings - -Most generic unit settings are available for transient units. - -``` -✓ Description= -✓ Documentation= -✓ SourcePath= -✓ Requires= -✓ Requisite= -✓ Wants= -✓ BindsTo= -✓ Conflicts= -✓ Before= -✓ After= -✓ OnFailure= -✓ PropagatesReloadTo= -✓ ReloadPropagatedFrom= -✓ PartOf= -✓ JoinsNamespaceOf= -✓ RequiresMountsFor= -✓ StopWhenUnneeded= -✓ RefuseManualStart= -✓ RefuseManualStop= -✓ AllowIsolate= -✓ DefaultDependencies= -✓ OnFailureJobMode= -✓ IgnoreOnIsolate= -✓ JobTimeoutSec= -✓ JobRunningTimeoutSec= -✓ JobTimeoutAction= -✓ JobTimeoutRebootArgument= -✓ StartLimitIntervalSec=SECONDS -✓ StartLimitBurst=UNSIGNED -✓ StartLimitAction=ACTION -✓ FailureAction= -✓ SuccessAction= -✓ AddRef= -✓ RebootArgument=STRING -✓ ConditionPathExists= -✓ ConditionPathExistsGlob= -✓ ConditionPathIsDirectory= -✓ ConditionPathIsSymbolicLink= -✓ ConditionPathIsMountPoint= -✓ ConditionPathIsReadWrite= -✓ ConditionDirectoryNotEmpty= -✓ ConditionFileNotEmpty= -✓ ConditionFileIsExecutable= -✓ ConditionNeedsUpdate= -✓ ConditionFirstBoot= -✓ ConditionKernelCommandLine= -✓ ConditionKernelVersion= -✓ ConditionArchitecture= -✓ ConditionVirtualization= -✓ ConditionSecurity= -✓ ConditionCapability= -✓ ConditionHost= -✓ ConditionACPower= -✓ ConditionUser= -✓ ConditionGroup= -✓ 
ConditionControlGroupController= -✓ AssertPathExists= -✓ AssertPathExistsGlob= -✓ AssertPathIsDirectory= -✓ AssertPathIsSymbolicLink= -✓ AssertPathIsMountPoint= -✓ AssertPathIsReadWrite= -✓ AssertDirectoryNotEmpty= -✓ AssertFileNotEmpty= -✓ AssertFileIsExecutable= -✓ AssertNeedsUpdate= -✓ AssertFirstBoot= -✓ AssertKernelCommandLine= -✓ AssertKernelVersion= -✓ AssertArchitecture= -✓ AssertVirtualization= -✓ AssertSecurity= -✓ AssertCapability= -✓ AssertHost= -✓ AssertACPower= -✓ AssertUser= -✓ AssertGroup= -✓ AssertControlGroupController= -✓ CollectMode= -``` - -## Execution-Related Settings - -All execution-related settings are available for transient units. - -``` -✓ WorkingDirectory= -✓ RootDirectory= -✓ RootImage= -✓ User= -✓ Group= -✓ SupplementaryGroups= -✓ Nice= -✓ OOMScoreAdjust= -✓ IOSchedulingClass= -✓ IOSchedulingPriority= -✓ CPUSchedulingPolicy= -✓ CPUSchedulingPriority= -✓ CPUSchedulingResetOnFork= -✓ CPUAffinity= -✓ UMask= -✓ Environment= -✓ EnvironmentFile= -✓ PassEnvironment= -✓ UnsetEnvironment= -✓ DynamicUser= -✓ RemoveIPC= -✓ StandardInput= -✓ StandardOutput= -✓ StandardError= -✓ StandardInputText= -✓ StandardInputData= -✓ TTYPath= -✓ TTYReset= -✓ TTYVHangup= -✓ TTYVTDisallocate= -✓ SyslogIdentifier= -✓ SyslogFacility= -✓ SyslogLevel= -✓ SyslogLevelPrefix= -✓ LogLevelMax= -✓ LogExtraFields= -✓ SecureBits= -✓ CapabilityBoundingSet= -✓ AmbientCapabilities= -✓ TimerSlackNSec= -✓ NoNewPrivileges= -✓ KeyringMode= -✓ SystemCallFilter= -✓ SystemCallArchitectures= -✓ SystemCallErrorNumber= -✓ MemoryDenyWriteExecute= -✓ RestrictNamespaces= -✓ RestrictRealtime= -✓ RestrictAddressFamilies= -✓ LockPersonality= -✓ LimitCPU= -✓ LimitFSIZE= -✓ LimitDATA= -✓ LimitSTACK= -✓ LimitCORE= -✓ LimitRSS= -✓ LimitNOFILE= -✓ LimitAS= -✓ LimitNPROC= -✓ LimitMEMLOCK= -✓ LimitLOCKS= -✓ LimitSIGPENDING= -✓ LimitMSGQUEUE= -✓ LimitNICE= -✓ LimitRTPRIO= -✓ LimitRTTIME= -✓ ReadWritePaths= -✓ ReadOnlyPaths= -✓ InaccessiblePaths= -✓ BindPaths= -✓ BindReadOnlyPaths= -✓ 
TemporaryFileSystem= -✓ PrivateTmp= -✓ PrivateDevices= -✓ ProtectKernelTunables= -✓ ProtectKernelModules= -✓ ProtectControlGroups= -✓ PrivateNetwork= -✓ PrivateUsers= -✓ ProtectSystem= -✓ ProtectHome= -✓ MountFlags= -✓ MountAPIVFS= -✓ Personality= -✓ RuntimeDirectoryPreserve= -✓ RuntimeDirectoryMode= -✓ RuntimeDirectory= -✓ StateDirectoryMode= -✓ StateDirectory= -✓ CacheDirectoryMode= -✓ CacheDirectory= -✓ LogsDirectoryMode= -✓ LogsDirectory= -✓ ConfigurationDirectoryMode= -✓ ConfigurationDirectory= -✓ PAMName= -✓ IgnoreSIGPIPE= -✓ UtmpIdentifier= -✓ UtmpMode= -✓ SELinuxContext= -✓ SmackProcessLabel= -✓ AppArmorProfile= -✓ Slice= -``` - -## Resource Control Settings - -All cgroup/resource control settings are available for transient units - -``` -✓ CPUAccounting= -✓ CPUWeight= -✓ StartupCPUWeight= -✓ CPUShares= -✓ StartupCPUShares= -✓ CPUQuota= -✓ MemoryAccounting= -✓ MemoryLow= -✓ MemoryHigh= -✓ MemoryMax= -✓ MemorySwapMax= -✓ MemoryLimit= -✓ DeviceAllow= -✓ DevicePolicy= -✓ IOAccounting= -✓ IOWeight= -✓ StartupIOWeight= -✓ IODeviceWeight= -✓ IOReadBandwidthMax= -✓ IOWriteBandwidthMax= -✓ IOReadIOPSMax= -✓ IOWriteIOPSMax= -✓ BlockIOAccounting= -✓ BlockIOWeight= -✓ StartupBlockIOWeight= -✓ BlockIODeviceWeight= -✓ BlockIOReadBandwidth= -✓ BlockIOWriteBandwidth= -✓ TasksAccounting= -✓ TasksMax= -✓ Delegate= -✓ IPAccounting= -✓ IPAddressAllow= -✓ IPAddressDeny= -``` - -## Process Killing Settings - -All process killing settings are available for transient units: - -``` -✓ SendSIGKILL= -✓ SendSIGHUP= -✓ KillMode= -✓ KillSignal= -``` - -## Service Unit Settings - -Most service unit settings are available for transient units. 
- -``` -✓ PIDFile= -✓ ExecStartPre= -✓ ExecStart= -✓ ExecStartPost= -✓ ExecReload= -✓ ExecStop= -✓ ExecStopPost= -✓ RestartSec= -✓ TimeoutStartSec= -✓ TimeoutStopSec= -✓ TimeoutSec= -✓ RuntimeMaxSec= -✓ WatchdogSec= -✓ Type= -✓ Restart= -✓ PermissionsStartOnly= -✓ RootDirectoryStartOnly= -✓ RemainAfterExit= -✓ GuessMainPID= -✓ RestartPreventExitStatus= -✓ RestartForceExitStatus= -✓ SuccessExitStatus= -✓ NonBlocking= -✓ BusName= -✓ FileDescriptorStoreMax= -✓ NotifyAccess= - Sockets= -✓ USBFunctionDescriptors= -✓ USBFunctionStrings= -``` - -## Mount Unit Settings - -All mount unit settings are available to transient units: - -``` -✓ What= -✓ Where= -✓ Options= -✓ Type= -✓ TimeoutSec= -✓ DirectoryMode= -✓ SloppyOptions= -✓ LazyUnmount= -✓ ForceUnmount= -``` - -## Automount Unit Settings - -All automount unit settings are available to transient units: - -``` -✓ Where= -✓ DirectoryMode= -✓ TimeoutIdleSec= -``` - -## Timer Unit Settings - -Most timer unit settings are available to transient units. - -``` -✓ OnCalendar= -✓ OnActiveSec= -✓ OnBootSec= -✓ OnStartupSec= -✓ OnUnitActiveSec= -✓ OnUnitInactiveSec= -✓ Persistent= -✓ WakeSystem= -✓ RemainAfterElapse= -✓ AccuracySec= -✓ RandomizedDelaySec= - Unit= -``` - -## Slice Unit Settings - -Slice units are fully supported as transient units, but they have no settings -of their own beyond the generic unit and resource control settings. - -## Scope Unit Settings - -Scope units are fully supported as transient units (in fact they only exist as -such). - -``` -✓ TimeoutStopSec= -``` - -## Socket Unit Settings - -Most socket unit settings are available to transient units.
- -``` -✓ ListenStream= -✓ ListenDatagram= -✓ ListenSequentialPacket= -✓ ListenFIFO= -✓ ListenNetlink= -✓ ListenSpecial= -✓ ListenMessageQueue= -✓ ListenUSBFunction= -✓ SocketProtocol= -✓ BindIPv6Only= -✓ Backlog= -✓ BindToDevice= -✓ ExecStartPre= -✓ ExecStartPost= -✓ ExecStopPre= -✓ ExecStopPost= -✓ TimeoutSec= -✓ SocketUser= -✓ SocketGroup= -✓ SocketMode= -✓ DirectoryMode= -✓ Accept= -✓ Writable= -✓ MaxConnections= -✓ MaxConnectionsPerSource= -✓ KeepAlive= -✓ KeepAliveTimeSec= -✓ KeepAliveIntervalSec= -✓ KeepAliveProbes= -✓ DeferAcceptSec= -✓ NoDelay= -✓ Priority= -✓ ReceiveBuffer= -✓ SendBuffer= -✓ IPTOS= -✓ IPTTL= -✓ Mark= -✓ PipeSize= -✓ FreeBind= -✓ Transparent= -✓ Broadcast= -✓ PassCredentials= -✓ PassSecurity= -✓ TCPCongestion= -✓ ReusePort= -✓ MessageQueueMaxMessages= -✓ MessageQueueMessageSize= -✓ RemoveOnStop= -✓ Symlinks= -✓ FileDescriptorName= - Service= -✓ TriggerLimitIntervalSec= -✓ TriggerLimitBurst= -✓ SmackLabel= -✓ SmackLabelIPIn= -✓ SmackLabelIPOut= -✓ SELinuxContextFromNet= -``` - -## Swap Unit Settings - -Swap units are currently not available at all as transient units: - -``` - What= - Priority= - Options= - TimeoutSec= -``` - -## Path Unit Settings - -Most path unit settings are available to transient units. - -``` -✓ PathExists= -✓ PathExistsGlob= -✓ PathChanged= -✓ PathModified= -✓ DirectoryNotEmpty= - Unit= -✓ MakeDirectory= -✓ DirectoryMode= -``` - -## Install Section - -The `[Install]` section is currently not available at all for transient units, and it probably doesn't even make sense. - -``` - Alias= - WantedBy= - RequiredBy= - Also= - DefaultInstance= -``` diff --git a/TRANSLATORS b/TRANSLATORS deleted file mode 100644 index 99c144eb12..0000000000 --- a/TRANSLATORS +++ /dev/null @@ -1,27 +0,0 @@ -Notes for translators -===================== - -Systemd depends on gettext for multilingual support. -In po/ directory you'll find the needed files. 
- -POT (Portable Object Template) ------------------------------- -A text file with .pot extension, with all the extracted labels from code. - -To update the template: - -$ cd systemd/ -$ ninja -C build systemd-pot - -To start a new translation: - -$ cd po/ -$ cp systemd.pot .po - -Replace with the two-letters codes of ISO 639 standard. - -PO (Portable Object) --------------------- -A text file with .po extension, with all the available labels and some additional -metadata fields. Any editor is ok, but a good standard is 'poedit', a graphical -application specifically designed for this kind of task. diff --git a/UIDS-GIDS.md b/UIDS-GIDS.md deleted file mode 100644 index e19cc88162..0000000000 --- a/UIDS-GIDS.md +++ /dev/null @@ -1,243 +0,0 @@ -# Users, Groups, UIDs and GIDs on `systemd` systems - -Here's a summary of the requirements `systemd` (and Linux) make on UID/GID -assignments and their ranges. - -Note that while in theory UIDs and GIDs are orthogonal concepts they really -aren't IRL. With that in mind, when we discuss UIDs below it should be assumed -that whatever we say about UIDs applies to GIDs in mostly the same way, and all -the special assignments and ranges for UIDs always have mostly the same -validity for GIDs too. - -## Special Linux UIDs - -In theory, the range of the C type `uid_t` is 32bit wide on Linux, -i.e. 0…4294967295. However, four UIDs are special on Linux: - -1. 0 → The `root` super-user - -2. 65534 → The `nobody` UID, also called the "overflow" UID or similar. It's - where various subsystems map unmappable users to, for example file systems - only supporting 16bit UIDs, NFS or user namespacing. (The latter can be - changed with a sysctl during runtime, but that's not supported on - `systemd`. If you do change it you void your warranty.) Because Fedora is a - bit confused the `nobody` user is called `nfsnobody` there (and they have a - different `nobody` user at UID 99). I hope this will be corrected eventually - though. 
(Also, some distributions call the `nobody` group `nogroup`. I wish - they didn't.) - -3. 4294967295, aka "32bit `(uid_t) -1`" → This UID is not a valid user ID, as - `setresuid()`, `chown()` and friends treat -1 as a special request to not - change the UID of the process/file. This UID is hence not available for - assignment to users in the user database. - -4. 65535, aka "16bit `(uid_t) -1`" → Before Linux kernel 2.4 `uid_t` used to be - 16bit, and programs compiled for that would hence assume that `(uid_t) -1` - is 65535. This UID is hence not usable either. - -The `nss-systemd` glibc NSS module will synthesize user database records for -the UIDs 0 and 65534 if the system user database doesn't list them. This means -that any system where this module is enabled works to some minimal level -without `/etc/passwd`. - -## Special Distribution UID ranges - -Distributions generally split the available UID range in two: - -1. 1…999 → System users. These are users that do not map to actual "human" - users, but are used as security identities for system daemons, to implement - privilege separation and run system daemons with minimal privileges. - -2. 1000…65533 and 65536…4294967294 → Everything else, i.e. regular (human) users. - -Note that most distributions allow changing the boundary between system and -regular users, even during runtime as user configuration. Moreover, some older -systems placed the boundary at 499/500, or even 99/100. In `systemd`, the -boundary is configurable only during compilation time, as this should be a -decision for distribution builders, not for users. Moreover, we strongly -discourage downstreams to change the boundary from the upstream default of -999/1000. - -Also note that programs such as `adduser` tend to allocate from a subset of the -available regular user range only, usually 1000..60000. And it's also usually -user-configurable, too. 
- -Note that systemd requires that system users and groups are resolvable without -networking available — a requirement that is not made for regular users. This -means regular users may be stored in remote LDAP or NIS databases, but system -users may not (except when there's a consistent local cache kept, that is -available during earliest boot, including in the initial RAM disk). - -## Special `systemd` GIDs - -`systemd` defines no special UIDs beyond what Linux already defines (see -above). However, it does define some special group/GID assignments, which are -primarily used for `systemd-udevd`'s device management. The precise list of the -currently defined groups is found in this `sysusers.d` snippet: -[basic.conf](https://raw.githubusercontent.com/systemd/systemd/master/sysusers.d/basic.conf.in) - -It's strongly recommended that downstream distributions include these groups in -their default group databases. - -Note that the actual GID numbers assigned to these groups do not have to be -constant beyond a specific system. There's one exception however: the `tty` -group must have the GID 5. That's because it must be encoded in the `devpts` -mount parameters during earliest boot, at a time where NSS lookups are not -possible. (Note that the actual GID can be changed during `systemd` build time, -but downstreams are strongly advised against doing that.) - -## Special `systemd` UID ranges - -`systemd` defines a number of special UID ranges: - -1. 61184…65519 → UIDs for dynamic users are allocated from this range (see the - `DynamicUser=` documentation in - [`systemd.exec(5)`](https://www.freedesktop.org/software/systemd/man/systemd.exec.html)). This - range has been chosen so that it is below the 16bit boundary (i.e. below - 65535), in order to provide compatibility with container environments that - assign a 64K range of UIDs to containers using user namespacing. 
This range - is above the 60000 boundary, so that its allocations are unlikely to be - affected by `adduser` allocations (see above). And we leave some room - upwards for other purposes. (And if you wonder why precisely these numbers: - if you write them in hexadecimal, they might make more sense: 0xEF00 and - 0xFFEF). The `nss-systemd` module will synthesize user records implicitly - for all currently allocated dynamic users from this range. Thus, NSS-based - user record resolving works correctly without those users being in - `/etc/passwd`. - -2. 524288…1879048191 → UID range for `systemd-nspawn`'s automatic allocation of - per-container UID ranges. When the `--private-users=pick` switch is used (or - `-U`) then it will automatically find a so far unused 16bit subrange of this - range and assign it to the container. The range is picked so that the upper - 16bit of the 32bit UIDs are constant for all users of the container, while - the lower 16bit directly encode the 65536 UIDs assigned to the - container. This mode of allocation means that the upper 16bit of any UID - assigned to a container are kind of a "container ID", while the lower 16bit - directly expose the container's own UID numbers. If you wonder why precisely - these numbers, consider them in hexadecimal: 0x00080000…0x6FFFFFFF. This - range is above the 16bit boundary. Moreover it's below the 31bit boundary, - as some broken code (specifically: the kernel's `devpts` file system) - erroneously considers UIDs signed integers, and hence can't deal with values - above 2^31. The `nss-mymachines` glibc NSS module will synthesize user - database records for all UIDs assigned to a running container from this - range. - -Note for both allocation ranges: when an UID allocation takes place NSS is -checked for collisions first, and a different UID is picked if an entry is -found. Thus, the user database is used as synchronization mechanism to ensure -exclusive ownership of UIDs and UID ranges. 
To ensure compatibility with other -subsystems allocating from the same ranges it is hence essential that they -ensure that whatever they pick shows up in the user/group databases, either by -providing an NSS module, or by adding entries directly to `/etc/passwd` and -`/etc/group`. For performance reasons, do note that `systemd-nspawn` will only -do an NSS check for the first UID of the range it allocates, not all 65536 of -them. Also note that while the allocation logic is operating, the glibc -`lckpwdf()` user database lock is taken, in order to make this logic race-free. - -## Figuring out the system's UID boundaries - -The most important boundaries of the local system may be queried with -`pkg-config`: - -``` -$ pkg-config --variable=systemuidmax systemd -999 -$ pkg-config --variable=dynamicuidmin systemd -61184 -$ pkg-config --variable=dynamicuidmax systemd -65519 -$ pkg-config --variable=containeruidbasemin systemd -524288 -$ pkg-config --variable=containeruidbasemax systemd -1878982656 -``` - -(Note that the latter encodes the maximum UID *base* `systemd-nspawn` might -pick — given that 64K UIDs are assigned to each container according to this -allocation logic, the maximum UID used for this range is hence -1878982656+65535=1879048191.) - -Note that systemd does not make any of these values runtime-configurable. All -these boundaries are chosen during build time. That said, the system UID/GID -boundary is traditionally configured in /etc/login.defs, though systemd won't -look there during runtime. - -## Considerations for container managers - -If you hack on a container manager, and wonder how and how many UIDs best to -assign to your containers, here are a few recommendations: - -1. Definitely, don't assign less than 65536 UIDs/GIDs. After all the `nobody` -user has magic properties, and hence should be available in your container, and -given that it's assigned the UID 65534, you should really cover the full 16bit -range in your container. 
Note that systemd will, as mentioned, synthesize -user records for the `nobody` user, and assumes its availability in various -other parts of its codebase, too, hence assigning fewer users means you lose -compatibility with running systemd code inside your container. And most likely -other packages make similar restrictions. - -2. While it's fine to assign more than 65536 UIDs/GIDs to a container, there's -most likely not much value in doing so, as Linux distributions won't use the -higher ranges by default (as mentioned neither `adduser` nor `systemd`'s -dynamic user concept allocate from above the 16bit range). Unless you actively -care for nested containers, it's hence probably a good idea to allocate exactly -65536 UIDs per container, and neither less nor more. A pretty side-effect is -that by doing so, you expose the same number of UIDs per container as Linux 2.2 -supported for the whole system, back in the day. - -3. Consider allocating UID ranges for containers so that the first UID you -assign has the lower 16bits all set to zero. That way, the upper 16bits become -a container ID of some kind, while the lower 16bits directly encode the -internal container UID. This is the way `systemd-nspawn` allocates UID ranges -(see above). Following this allocation logic ensures best compatibility with -`systemd-nspawn` and all other container managers following the scheme, as it -is sufficient then to check NSS for the first UID you pick regarding conflicts, -as that's what they do, too. Moreover, it makes `chown()`ing container file -system trees nicely robust to interruptions: as the external UID encodes the -internal UID in a fixed way, it's very easy to adjust the container's base UID -without the need to know the original base UID: to change the container base, -just mask away the upper 16bit, and insert the upper 16bit of the new container -base instead.
Here are the easy conversions to derive the internal UID, the -external UID, and the container base UID from each other: - - ``` - INTERNAL_UID = EXTERNAL_UID & 0x0000FFFF - CONTAINER_BASE_UID = EXTERNAL_UID & 0xFFFF0000 - EXTERNAL_UID = INTERNAL_UID | CONTAINER_BASE_UID - ``` - -4. When picking a UID range for containers, make sure to check NSS first, with -a simple `getpwuid()` call: if there's already a user record for the first UID -you want to pick, then it's already in use: pick a different one. Wrap that -call in a `lckpwdf()` + `ulckpwdf()` pair, to make allocation -race-free. Provide an NSS module that makes all UIDs you end up taking show up -in the user database, and make sure that the NSS module returns up-to-date -information before you release the lock, so that other system components can -safely use the NSS user database as allocation check, too. Note that if you -follow this scheme no changes to `/etc/passwd` need to be made, thus minimizing -the artifacts the container manager persistently leaves in the system. 
- -## Summary - -| UID/GID | Purpose | Defined By | Listed in | -|-----------------------|-----------------------|---------------|-------------------------------| -| 0 | `root` user | Linux | `/etc/passwd` + `nss-systemd` | -| 1…4 | System users | Distributions | `/etc/passwd` | -| 5 | `tty` group | `systemd` | `/etc/passwd` | -| 6…999 | System users | Distributions | `/etc/passwd` | -| 1000…60000 | Regular users | Distributions | `/etc/passwd` + LDAP/NIS/… | -| 60001…61183 | Unused | | | -| 61184…65519 | Dynamic service users | `systemd` | `nss-systemd` | -| 65520…65533 | Unused | | | -| 65534 | `nobody` user | Linux | `/etc/passwd` + `nss-systemd` | -| 65535 | 16bit `(uid_t) -1` | Linux | | -| 65536…524287 | Unused | | | -| 524288…1879048191 | Container UID ranges | `systemd` | `nss-mymachines` | -| 1879048192…4294967294 | Unused | | | -| 4294967295 | 32bit `(uid_t) -1` | Linux | | - -Note that "Unused" in the table above doesn't mean that these ranges are -really unused. It just means that these ranges have no well-established -pre-defined purposes between Linux, generic low-level distributions and -`systemd`. There might very well be other packages that allocate from these -ranges. diff --git a/doc/CODING_STYLE b/doc/CODING_STYLE new file mode 100644 index 0000000000..ae818126cb --- /dev/null +++ b/doc/CODING_STYLE @@ -0,0 +1,442 @@ +- 8ch indent, no tabs, except for files in man/ which are 2ch indent, + and still no tabs + +- We prefer /* comments */ over // comments in code you commit, please. This + way // comments are left for developers to use for local, temporary + commenting of code for debug purposes (i.e. uncommittable stuff), making such + comments easily discernable from explanatory, documenting code comments + (i.e. committable stuff). + +- Don't break code lines too eagerly. We do *not* force line breaks at + 80ch, all of today's screens should be much larger than that. But + then again, don't overdo it, ~119ch should be enough really.
+ +- Variables and functions *must* be static, unless they have a + prototype, and are supposed to be exported. + +- structs in MixedCase (with exceptions, such as public API structs), + variables + functions in lower_case. + +- The destructors always unregister the object from the next bigger + object, not the other way around + +- To minimize strict aliasing violations, we prefer unions over casting + +- For robustness reasons, destructors should be able to destruct + half-initialized objects, too + +- Error codes are returned as negative Exxx. e.g. return -EINVAL. There + are some exceptions: for constructors, it is OK to return NULL on + OOM. For lookup functions, NULL is fine too for "not found". + + Be strict with this. When you write a function that can fail due to + more than one cause, it *really* should have "int" as return value + for the error code. + +- Do not bother with error checking whether writing to stdout/stderr + worked. + +- Do not log errors from "library" code, only do so from "main + program" code. (With one exception: it is OK to log with DEBUG level + from any code, with the exception of maybe inner loops). + +- Always check OOM. There is no excuse. In program code, you can use + "log_oom()" for then printing a short message, but not in "library" code. + +- Do not issue NSS requests (that includes user name and host name + lookups) from PID 1 as this might trigger deadlocks when those + lookups involve synchronously talking to services that we would need + to start up + +- Do not synchronously talk to any other service from PID 1, due to + risk of deadlocks + +- Avoid fixed-size string buffers, unless you really know the maximum + size and that maximum size is small. They are a source of errors, + since they possibly result in truncated strings. It is often nicer + to use dynamic memory, alloca() or VLAs. 
If you do allocate fixed-size + strings on the stack, then it is probably only OK if you either + use a maximum size such as LINE_MAX, or count in detail the maximum + size a string can have. (DECIMAL_STR_MAX and DECIMAL_STR_WIDTH + macros are your friends for this!) + + Or in other words, if you use "char buf[256]" then you are likely + doing something wrong! + +- Stay uniform. For example, always use "usec_t" for time + values. Do not mix usec and msec, and usec and whatnot. + +- Make use of _cleanup_free_ and friends. It makes your code much + nicer to read! + +- Be exceptionally careful when formatting and parsing floating point + numbers. Their syntax is locale dependent (i.e. "5.000" in en_US is + generally understood as 5, while on de_DE as 5000.). + +- Try to use this: + + void foo() { + } + + instead of this: + + void foo() + { + } + + But it is OK if you do not. + +- Single-line "if" blocks should not be enclosed in {}. Use this: + + if (foobar) + waldo(); + + instead of this: + + if (foobar) { + waldo(); + } + +- Do not write "foo ()", write "foo()". + +- Please use streq() and strneq() instead of strcmp(), strncmp() where applicable. + +- Please do not allocate variables on the stack in the middle of code, + even if C99 allows it. Wrong: + + { + a = 5; + int b; + b = a; + } + + Right: + + { + int b; + a = 5; + b = a; + } + +- Unless you allocate an array, "double" is always the better choice + than "float". Processors speak "double" natively anyway, so this is + no speed benefit, and on calls like printf() "float"s get promoted + to "double"s anyway, so there is no point. + +- Do not mix function invocations with variable definitions in one + line. Wrong: + + { + int a = foobar(); + uint64_t x = 7; + } + + Right: + + { + int a; + uint64_t x = 7; + + a = foobar(); + } + +- Use "goto" for cleaning up, and only use it for that. i.e. you may + only jump to the end of a function, and little else. Never jump + backwards! + +- Think about the types you use. 
If a value cannot sensibly be + negative, do not use "int", but use "unsigned". + +- Use "char" only for actual characters. Use "uint8_t" or "int8_t" + when you actually mean a byte-sized signed or unsigned + integer. When referring to a generic byte, we generally prefer the + unsigned variant "uint8_t". Do not use types based on "short". They + *never* make sense. Use ints, longs, long longs, all in + unsigned+signed fashion, and the fixed size types + uint8_t/uint16_t/uint32_t/uint64_t/int8_t/int16_t/int32_t and so on, + as well as size_t, but nothing else. Do not use kernel types like + u32 and so on, leave that to the kernel. + +- Public API calls (i.e. functions exported by our shared libraries) + must be marked "_public_" and need to be prefixed with "sd_". No + other functions should be prefixed like that. + +- In public API calls, you *must* validate all your input arguments for + programming errors with assert_return() and return a sensible return + code. In all other calls, it is recommended to check for programming + errors with a more brutal assert(). We are more forgiving to public + users than to ourselves! Note that assert() and assert_return() + really only should be used for detecting programming errors, not for + runtime errors. assert() and assert_return() by usage of _likely_() + inform the compiler that it should not expect these checks to fail, + and they inform fellow programmers about the expected validity and + range of parameters. + +- Never use strtol(), atoi() and similar calls. Use safe_atoli(), + safe_atou32() and suchlike instead. They are much nicer to use in + most cases and correctly check for parsing errors. + +- For every function you add, think about whether it is a "logging" + function or a "non-logging" function. "Logging" functions do logging + on their own, "non-logging" functions never log on their own and + expect their callers to log. All functions in "library" code, + i.e.
in src/shared/ and suchlike must be "non-logging". Every time a + "logging" function calls a "non-logging" function, it should log + about the resulting errors. If a "logging" function calls another + "logging" function, then it should not generate log messages, so + that log messages are not generated twice for the same errors. + +- Avoid static variables, except for caches and very few other + cases. Think about thread-safety! While most of our code is never + used in threaded environments, at least the library code should make + sure it works correctly in them. Instead of doing a lot of locking + for that, we tend to prefer using TLS to do per-thread caching (which + only works for small, fixed-size cache objects), or we disable + caching for any thread that is not the main thread. Use + is_main_thread() to detect whether the calling thread is the main + thread. + +- Command line option parsing: + - Do not print full help() on error, be specific about the error. + - Do not print messages to stdout on error. + - Do not POSIX_ME_HARDER unless necessary, i.e. avoid "+" in option string. + +- Do not write functions that clobber call-by-reference variables on + failure. Use temporary variables for these cases and change the + passed in variables only on success. + +- When you allocate a file descriptor, it should be made O_CLOEXEC + right from the beginning, as none of our files should leak to forked + binaries by default. Hence, whenever you open a file, O_CLOEXEC must + be specified, right from the beginning. This also applies to + sockets. 
Effectively this means that all invocations to: + + a) open() must get O_CLOEXEC passed + b) socket() and socketpair() must get SOCK_CLOEXEC passed + c) recvmsg() must get MSG_CMSG_CLOEXEC set + d) F_DUPFD_CLOEXEC should be used instead of F_DUPFD, and so on + e) invocations of fopen() should take "e" + +- We never use the POSIX version of basename() (which glibc defines in + libgen.h), only the GNU version (which glibc defines in string.h). + The only reason to include libgen.h is because dirname() + is needed. Every time you need that please immediately undefine + basename(), and add a comment about it, so that no code ever ends up + using the POSIX version! + +- Use the bool type for booleans, not integers. One exception: in public + headers (i.e. those in src/systemd/sd-*.h) use integers after all, as "bool" + is C99 and in our public APIs we try to stick to C89 (with a few extensions). + +- When you invoke certain calls like unlink(), or mkdir_p() and you + know it is safe to ignore the error it might return (because a later + call would detect the failure anyway, or because the error is in an + error path and you thus couldn't do anything about it anyway), then + make this clear by casting the invocation explicitly to (void). Code + checkers like Coverity understand that, and will not complain about + ignored error codes. Hence, please use this: + + (void) unlink("/foo/bar/baz"); + + instead of just this: + + unlink("/foo/bar/baz"); + + Don't cast function calls to (void) that return no error + conditions. Specifically, the various xyz_unref() calls that return a NULL + object shouldn't be cast to (void), since not using the return value does not + hide any errors. + +- Don't invoke exit(), ever. It is not a replacement for proper error + handling. Please escalate errors up your call chain, and use normal + "return" to exit from the main function of a process.
If you + fork()ed off a child process, please use _exit() instead of exit(), + so that the exit handlers are not run. + +- Please never use dup(). Use fcntl(fd, F_DUPFD_CLOEXEC, 3) + instead. For two reasons: first, you want O_CLOEXEC set on the new fd + (see above). Second, dup() will happily duplicate your fd as 0, 1, + 2, i.e. stdin, stdout, stderr, should those fds be closed. Given the + special semantics of those fds, it's probably a good idea to avoid + them. F_DUPFD_CLOEXEC with "3" as parameter avoids them. + +- When you define a destructor or unref() call for an object, please + accept a NULL object and simply treat this as NOP. This is similar + to how libc free() works, which accepts NULL pointers and becomes a + NOP for them. By following this scheme a lot of if checks can be + removed before invoking your destructor, which makes the code + substantially more readable and robust. + +- Related to this: when you define a destructor or unref() call for an + object, please make it return the same type it takes and always + return NULL from it. This allows writing code like this: + + p = foobar_unref(p); + + which will always work regardless of whether p is initialized or not, and + guarantees that p is NULL afterwards, all in just one line. + +- Use alloca(), but never forget that it is not OK to invoke alloca() + within a loop or within function call parameters. alloca() memory is + released at the end of a function, and not at the end of a {} + block. Thus, if you invoke it in a loop, you keep increasing the + stack pointer without ever releasing memory again. (VLAs have better + behaviour in this case, so consider using them as an alternative.) + Regarding not using alloca() within function parameters, see the + BUGS section of the alloca(3) man page. + +- Use memzero() or even better zero() instead of memset(..., 0, ...) + +- Instead of using memzero()/memset() to initialize structs allocated + on the stack, please try to use C99 structure initializers.
It's + short, prettier and actually even faster at execution. Hence: + + struct foobar t = { + .foo = 7, + .bar = "bazz", + }; + + instead of: + + struct foobar t; + zero(t); + t.foo = 7; + t.bar = "bazz"; + +- When returning a return code from main(), please preferably use + EXIT_FAILURE and EXIT_SUCCESS as defined by libc. + +- The order in which header files are included doesn't matter too + much. systemd-internal headers must not rely on an include order, so + it is safe to include them in any order possible. + However, to not clutter global includes, and to make sure internal + definitions will not affect global headers, please always include the + headers of external components first (these are all headers enclosed + in <>), followed by our own exported headers (usually everything + that's prefixed by "sd-"), and then followed by internal headers. + Furthermore, in all three groups, order all includes alphabetically + so duplicate includes can easily be detected. + +- To implement an endless loop, use "for (;;)" rather than "while + (1)". The latter is a bit ugly anyway, since you probably really + meant "while (true)"... To avoid the discussion what the right + always-true expression for an infinite while() loop is our + recommendation is to simply write it without any such expression by + using "for (;;)". + +- Never use the "off_t" type, and particularly avoid it in public + APIs. It's really weirdly defined, as it usually is 64bit and we + don't support it any other way, but it could in theory also be + 32bit. Which one it is depends on a compiler switch chosen by the + compiled program, which hence corrupts APIs using it unless they can + also follow the program's choice. Moreover, in systemd we should + parse values the same way on all architectures and cannot expose + off_t values over D-Bus. To avoid any confusion regarding conversion + and ABIs, always use simply uint64_t directly. 
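A few of the conventions above (C99 structure initializers, destructors
that accept NULL and return NULL, "for (;;)" for endless loops, uint64_t
rather than off_t) can be sketched together roughly as follows. This is
only an illustration: "Foobar", foobar_new(), foobar_unref() and
count_chunks() are hypothetical names made up for this example and are
not systemd APIs.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical object, for illustration only. */
typedef struct Foobar {
        uint64_t size;          /* uint64_t, not off_t */
        const char *name;
} Foobar;

/* Constructor: returning NULL on OOM is OK for constructors. */
static Foobar *foobar_new(uint64_t size, const char *name) {
        Foobar *f;

        f = malloc(sizeof(Foobar));
        if (!f)
                return NULL;

        /* C99 initializer instead of memset() plus assignments */
        *f = (Foobar) {
                .size = size,
                .name = name,
        };

        return f;
}

/* Destructor: accepts NULL as a NOP, and always returns NULL. */
static Foobar *foobar_unref(Foobar *f) {
        free(f);
        return NULL;
}

/* Counts how many chunks of the given size are needed to cover f->size. */
static uint64_t count_chunks(const Foobar *f, uint64_t chunk) {
        uint64_t n = 0, covered = 0;

        assert(f);
        assert(chunk > 0);

        for (;;) {      /* endless loop, left via break */
                if (covered >= f->size)
                        break;

                covered += chunk;
                n++;
        }

        return n;
}
```

With this shape a caller can write "p = foobar_unref(p);" without a
preceding NULL check, and p is guaranteed to be NULL afterwards.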
+ +- Commit message subject lines should be prefixed with an appropriate + component name of some kind. For example "journal: ", "nspawn: " and + so on. + +- Do not use "Signed-Off-By:" in your commit messages. That's a kernel + thing we don't do in the systemd project. + +- Avoid leaving long-running child processes around, i.e. fork()s that + are not followed quickly by an execv() in the child. Resource + management is unclear in this case, and memory CoW will result in + unexpected penalties in the parent much much later on. + +- Don't block execution for arbitrary amounts of time using usleep() + or a similar call, unless you really know what you do. Just "giving + something some time", or so is a lazy excuse. Always wait for the + proper event, instead of doing time-based poll loops. + +- To determine the length of a constant string "foo", don't bother + with sizeof("foo")-1, please use STRLEN() instead. + +- If you want to concatenate two or more strings, consider using + strjoin() rather than asprintf(), as the latter is a lot + slower. This matters particularly in inner loops. + +- Please avoid using global variables as much as you can. And if you + do use them make sure they are static at least, instead of + exported. Especially in library-like code it is important to avoid + global variables. Why are global variables bad? They usually hinder + generic reusability of code (since they break in threaded programs, + and usually would require locking there), and as the code using them + has side-effects make programs non-transparent. That said, there are + many cases where they explicitly make a lot of sense, and are OK to + use. For example, the log level and target in log.c is stored in a + global variable, and that's OK and probably expected by most. Also + in many cases we cache data in global variables. If you add more + caches like this, please be careful however, and think about + threading. 
Only use static variables if you are sure that + thread-safety doesn't matter in your case. Alternatively consider + using TLS, which is pretty easy to use with gcc's "thread_local" + concept. It's also OK to store data that is inherently global in + global variables, for example data parsed from command lines, see + below. + +- If you parse a command line, and want to store the parsed parameters + in global variables, please consider prefixing their names with + "arg_". We have been following this naming rule in most of our + tools, and we should continue to do so, as it makes it easy to + identify command line parameter variables, and makes it clear why it + is OK that they are global variables. + +- When exposing public C APIs, be careful what function parameters you make + "const". For example, a parameter taking a context object should probably not + be "const", even if you are writing an otherwise read-only accessor function + for it. The reason is that making it "const" fixates the contract that your + call won't alter the object ever, as part of the API. However, that's often + quite a promise, given that this even prohibits object-internal caching or + lazy initialization of object variables. Moreover it's usually not too useful + for client applications. Hence: please be careful and avoid "const" on object + parameters, unless you are very sure "const" is appropriate. + +- Make sure to enforce limits on every user-controllable resource. If the user + can allocate resources in your code, your code must enforce some form of + limits after which it will refuse operation. It's fine if it is hard-coded (at + least initially), but it needs to be there. This is particularly important + for objects that unprivileged users may allocate, but also matters for + everything else any user may allocate. + +- htonl()/ntohl() and htons()/ntohs() are weird.
Please use htobe32() and + htobe16() instead, it's much more descriptive, and actually says what really + is happening, after all htonl() and htons() don't operate on longs and + shorts as their name would suggest, but on uint32_t and uint16_t. Also, + "network byte order" is just a weird name for "big endian", hence we might + want to call it "big endian" right-away. + +- You might wonder what kind of common code belongs in src/shared/ and what + belongs in src/basic/. The split is like this: anything that uses public APIs + we expose (i.e. any of the sd-bus, sd-login, sd-id128, ... APIs) must be + located in src/shared/. All stuff that only uses external libraries from + other projects (such as glibc's APIs), or APIs from src/basic/ itself should + be placed in src/basic/. Conversely, src/libsystemd/ may only use symbols + from src/basic, but not from src/shared/. To summarize: + + src/basic/ → may be used by all code in the tree + → may not use any code outside of src/basic/ + + src/libsystemd/ → may be used by all code in the tree, except for code in src/basic/ + → may not use any code outside of src/basic/, src/libsystemd/ + + src/shared/ → may be used by all code in the tree, except for code in src/basic/, src/libsystemd/ + → may not use any code outside of src/basic/, src/libsystemd/, src/shared/ + +- Our focus is on the GNU libc (glibc), not any other libcs. If other libcs are + incompatible with glibc it's on them. However, if there are equivalent POSIX + and Linux/GNU-specific APIs, we generally prefer the POSIX APIs. If there + aren't, we are happy to use GNU or Linux APIs, and expect non-GNU + implementations of libc to catch up with glibc. + +- Whenever installing a signal handler, make sure to set SA_RESTART for it, so + that interrupted system calls are automatically restarted, and we minimize + hassles with handling EINTR (in particular as EINTR handling is pretty broken + on Linux). 
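The SA_RESTART rule above could look roughly like this in practice. This
is a minimal sketch using plain sigaction(2); on_signal(),
install_handler() and the got_signal flag are hypothetical names chosen
for this example, not existing systemd helpers.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif

#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <stddef.h>

/* Hypothetical flag, set by the handler and polled by the main code. */
static volatile sig_atomic_t got_signal = 0;

/* Only sets a flag, which is safe to do from a signal handler. */
static void on_signal(int signo) {
        (void) signo;
        got_signal = 1;
}

/* Installs on_signal() for the given signal with SA_RESTART set, so that
 * interrupted system calls are restarted instead of failing with EINTR.
 * Returns 0 on success, a negative errno-style error code on failure. */
static int install_handler(int signo) {
        struct sigaction sa = {
                .sa_handler = on_signal,
                .sa_flags = SA_RESTART,
        };

        assert(signo > 0);

        if (sigemptyset(&sa.sa_mask) < 0)
                return -errno;

        if (sigaction(signo, &sa, NULL) < 0)
                return -errno;

        return 0;
}
```

A caller would invoke install_handler(SIGUSR1) once during start-up and
check got_signal from its main loop.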
+ +- When applying C-style unescaping as well as specifier expansion on the same + string, always apply the C-style unescaping first, followed by the specifier + expansion. When doing the reverse, make sure to escape '%' in specifier-style + first (i.e. '%' → '%%'), and then do C-style escaping where necessary. diff --git a/doc/DISTRO_PORTING b/doc/DISTRO_PORTING new file mode 100644 index 0000000000..d1a187aa41 --- /dev/null +++ b/doc/DISTRO_PORTING @@ -0,0 +1,71 @@ +Porting systemd To New Distributions + +HOWTO: + You need to make the following changes to adapt systemd to your + distribution: + + 1) Find the right configure parameters for: + + -D rootprefix= + -D sysvinit-path= + -D sysvrcnd-path= + -D rc-local= + -D halt-local= + -D loadkeys-path= + -D setfont-path= + -D tty-gid= + -D ntp-servers= + -D dns-servers= + -D support-url= + + 2) Try it out. Play around (as an ordinary user) with + '/usr/lib/systemd/systemd --test --system' for a test run + of systemd without booting. This will read the unit files and + print the initial transaction it would execute during boot-up. + This will also inform you about ordering loops and suchlike. + +NTP POOL: + By default, systemd-timesyncd uses the Google Public NTP servers + time[1-4].google.com, if no other NTP configuration is available. They + serve time that uses a leap second smear, and can be up to .5s off from + servers that use stepped leap seconds. + + https://developers.google.com/time/smear + + If you prefer to use leap second steps, please register your own + vendor pool at ntp.org and make it the built-in default by + passing -D ntp-servers= to meson. Registering vendor + pools is free: + + http://www.pool.ntp.org/en/vendors.html + + Use -D ntp-servers= to direct systemd-timesyncd to different fallback + NTP servers.
+ +DNS SERVERS: + By default, systemd-resolved uses the Google Public DNS servers + 8.8.8.8, 8.8.4.4, 2001:4860:4860::8888, 2001:4860:4860::8844 as + fallback, if no other DNS configuration is available. + + Use -D dns-servers= to direct systemd-resolved to different fallback + DNS servers. + +PAM: + The default PAM config shipped by systemd is really bare bones. + It does not include many modules your distro might want to enable + to provide a more seamless experience. For example, limits set in + /etc/security/limits.conf will not be read unless you load pam_limits. + Make sure you add modules your distro expects from user services. + + Pass -D pamconfdir=no to meson to avoid installing this file and + instead install your own. + +CONTRIBUTING UPSTREAM: + We generally no longer accept distribution-specific patches to + systemd upstream. If you have to make changes to systemd's source code + to make it work on your distribution, unless your code is generic + enough to be generally useful, we are unlikely to merge it. Please + always consider adopting the upstream defaults. If that is not + possible, please maintain the relevant patches downstream. + + Thank you for understanding. diff --git a/doc/ENVIRONMENT.md b/doc/ENVIRONMENT.md new file mode 100644 index 0000000000..581bf3c238 --- /dev/null +++ b/doc/ENVIRONMENT.md @@ -0,0 +1,91 @@ +# Known Environment Variables + +A number of systemd components take additional runtime parameters via +environment variables. Many of these environment variables are not supported at +the same level as command line switches and other interfaces are: we don't +document them in the man pages and we make no stability guarantees for +them. While they generally are unlikely to be dropped any time soon, we +do not want to guarantee that they stay around for good either. + +Below is an (incomplete) list of the environment variables understood by +the various tools.
Note that this list only covers environment variables not +documented in the proper man pages. + +All tools: + +* `$SYSTEMD_OFFLINE=[0|1]` — if set to `1`, then `systemctl` will + refrain from talking to PID 1; this has the same effect as the historical + detection of `chroot()`. Setting this variable to `0` instead has a similar + effect as `SYSTEMD_IGNORE_CHROOT=1`; i.e. tools will try to + communicate with PID 1 even if a `chroot()` environment is detected. + You almost certainly want to set this to `1` if you maintain a package build system + or similar and are trying to use a modern container system and not plain + `chroot()`. + +* `$SYSTEMD_IGNORE_CHROOT=1` — if set, don't check whether being invoked in a + `chroot()` environment. This is particularly relevant for systemctl, as it + will not alter its behaviour for `chroot()` environments if set. Normally it + refrains from talking to PID 1 in such a case; turning most operations such + as `start` into no-ops. If that's what's explicitly desired, you might + consider setting `SYSTEMD_OFFLINE=1`. + +* `$SD_EVENT_PROFILE_DELAYS=1` — if set, the sd-event event loop implementation + will print latency information at runtime. + +* `$SYSTEMD_PROC_CMDLINE` — if set, may contain a string that is used as kernel + command line instead of the actual one readable from /proc/cmdline. This is + useful for debugging, in order to test generators and other code against + specific kernel command lines. + +systemctl: + +* `$SYSTEMCTL_FORCE_BUS=1` — if set, do not connect to PID1's private D-Bus + listener, and instead always connect through the dbus-daemon D-bus broker. + +* `$SYSTEMCTL_INSTALL_CLIENT_SIDE=1` — if set, enable or disable unit files on + the client side, instead of asking PID 1 to do this. + +* `$SYSTEMCTL_SKIP_SYSV=1` — if set, do not call out to SysV compatibility hooks. + +systemd-nspawn: + +* `$UNIFIED_CGROUP_HIERARCHY=1` — if set, force nspawn into unified cgroup + hierarchy mode. 
+ +* `$SYSTEMD_NSPAWN_API_VFS_WRITABLE=1` — if set, make /sys and /proc/sys and + friends writable in the container. If set to "network", leave only + /proc/sys/net writable. + +* `$SYSTEMD_NSPAWN_CONTAINER_SERVICE=…` — override the "service" name nspawn + uses to register with machined. If unset defaults to "nspawn", but with this + variable may be set to any other value. + +* `$SYSTEMD_NSPAWN_USE_CGNS=0` — if set, do not use cgroup namespacing, even if + it is available. + +* `$SYSTEMD_NSPAWN_LOCK=0` — if set, do not lock container images when running. + +systemd-logind: + +* `$SYSTEMD_BYPASS_HIBERNATION_MEMORY_CHECK=1` — if set, report that + hibernation is available even if the swap devices do not provide enough room + for it. + +installed systemd tests: + +* `$SYSTEMD_TEST_DATA` — override the location of test data. This is useful if + a test executable is moved to an arbitrary location. + +nss-systemd: + +* `$SYSTEMD_NSS_BYPASS_SYNTHETIC=1` — if set, `nss-systemd` won't synthesize + user/group records for the `root` and `nobody` users if they are missing from + `/etc/passwd`. + +* `$SYSTEMD_NSS_DYNAMIC_BYPASS=1` — if set, `nss-systemd` won't return + user/group records for dynamically registered service users (i.e. users + registered through `DynamicUser=1`). + +* `$SYSTEMD_NSS_BYPASS_BUS=1` — if set, `nss-systemd` won't use D-Bus to do + dynamic user lookups. This is primarily useful to make `nss-systemd` work + safely from within `dbus-daemon`. diff --git a/doc/HACKING b/doc/HACKING new file mode 100644 index 0000000000..0682af27ba --- /dev/null +++ b/doc/HACKING @@ -0,0 +1,116 @@ +HACKING ON SYSTEMD + +We welcome all contributions to systemd. If you notice a bug or a missing +feature, please feel invited to fix it, and submit your work as a github Pull +Request (PR): + + https://github.com/systemd/systemd/pull/new + +Please make sure to follow our Coding Style when submitting patches. See +doc/CODING_STYLE for details. 
Also have a look at our Contribution Guidelines: + + https://github.com/systemd/systemd/blob/master/.github/CONTRIBUTING.md + +When adding new functionality, tests should be added. For shared functionality +(in src/basic and src/shared) unit tests should be sufficient. The general +policy is to keep tests in matching files underneath src/test, +e.g. src/test/test-path-util.c contains tests for any functions in +src/basic/path-util.c. If adding a new source file, consider adding a matching +test executable. For features at a higher level, tests in src/test/ are very +strongly recommended. If that is not possible, integration tests in test/ are +encouraged. + +Please always test your work before submitting a PR. For many of the components +of systemd testing is straightforward as you can simply compile systemd and +run the relevant tool from the build directory. + +For some components (most importantly, systemd/PID1 itself) this is not +possible, however. In order to simplify testing for cases like this we provide +a set of "mkosi" build files directly in the source tree. "mkosi" is a tool for +building clean OS images from an upstream distribution in combination with a +fresh build of the project in the local working directory. To make use of this, +please acquire "mkosi" from https://github.com/systemd/mkosi first, unless your +distribution has packaged it already and you can get it from there. After the +tool is installed it is sufficient to type "mkosi" in the systemd project +directory to generate a disk image "image.raw" you can boot either in +systemd-nspawn or in a UEFI-capable VM: + + # systemd-nspawn -bi image.raw + +or: + + # qemu-system-x86_64 -enable-kvm -m 512 -smp 2 -bios /usr/share/edk2/ovmf/OVMF_CODE.fd -hda image.raw + +Every time you rerun the "mkosi" command a fresh image is built, incorporating +all current changes you made to the project tree.
+ +Alternatively, you may install the systemd version from your git check-out +directly on top of your host system's directory tree. This mostly works fine, +but of course you should know what you are doing as you might make your system +unbootable in case of a bug in your changes. Also, you might step into your +package manager's territory with this. Be careful! + +And never forget: most distributions provide very simple and convenient ways to +install all development packages necessary to build systemd. For example, on +Fedora the following command line should be sufficient to install all of +systemd's build dependencies: + + # dnf builddep systemd + +Putting this all together, here's a series of commands for preparing a patch +for systemd (this example is for Fedora): + + $ sudo dnf builddep systemd # install build dependencies + $ sudo dnf install mkosi # install tool to quickly build images + $ git clone https://github.com/systemd/systemd.git + $ cd systemd + $ vim src/core/main.c # or wherever you'd like to make your changes + $ meson build # configure the build + $ ninja -C build # build it locally, see if everything compiles fine + $ ninja -C build test # run some simple regression tests + $ sudo mkosi # build a test image + $ sudo systemd-nspawn -bi image.raw # boot up the test image + $ git add -p # interactively put together your patch + $ git commit # commit it + $ git push REMOTE HEAD:refs/heads/BRANCH + # where REMOTE is your "fork" on github + # and BRANCH is a branch name. + +And after that, head over to your repo on github and click "Compare & pull request" + +Happy hacking! + + +FUZZERS + +systemd includes fuzzers in src/fuzz that use libFuzzer and are automatically +run by OSS-Fuzz (https://github.com/google/oss-fuzz) with sanitizers. To add a +fuzz target, create a new src/fuzz/fuzz-foo.c file with a LLVMFuzzerTestOneInput +function and add it to the list in src/fuzz/meson.build. 
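As a rough idea of what such a src/fuzz/fuzz-foo.c file might contain,
here is a minimal sketch of a libFuzzer target. Only the
LLVMFuzzerTestOneInput() entry point and its signature come from
libFuzzer; count_assignments() is a hypothetical stand-in for the code
under test, and real targets in the tree may rely on in-tree helpers
that are not shown here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical function under test: counts the '=' characters in a
 * NUL-terminated string. A real target would call into systemd code. */
static size_t count_assignments(const char *s) {
        size_t n = 0;

        assert(s);

        for (; *s; s++)
                if (*s == '=')
                        n++;

        return n;
}

/* libFuzzer entry point: called once for every generated input. */
int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
        char *str;

        assert(data || size == 0);

        /* The input is not NUL-terminated, hence make a terminated copy. */
        str = malloc(size + 1);
        if (!str)
                return 0;

        if (size > 0)
                memcpy(str, data, size);
        str[size] = 0;

        count_assignments(str);

        free(str);
        return 0;
}
```

The function always returns 0; crashes, assertion failures and sanitizer
reports triggered by the code under test are what the fuzzer looks for.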
+ +Whenever possible, a seed corpus and a dictionary should also be added with new +fuzz targets. The dictionary should be named src/fuzz/fuzz-foo.dict and the seed +corpus should be built and exported as $OUT/fuzz-foo_seed_corpus.zip in +scripts/oss-fuzz.sh. + +The fuzzers can be built locally if you have libFuzzer installed by running +scripts/oss-fuzz.sh. You should also confirm that the fuzzer runs in the +OSS-Fuzz environment by checking out the OSS-Fuzz repo, and then running +commands like this: + + python infra/helper.py build_image systemd + python infra/helper.py build_fuzzers --sanitizer memory systemd ../systemd + python infra/helper.py run_fuzzer systemd fuzz-foo + +If you find a bug that impacts the security of systemd, please follow the +guidance in .github/CONTRIBUTING.md on how to report a security vulnerability. + +For more details on building fuzzers and integrating with OSS-Fuzz, visit: + + https://github.com/google/oss-fuzz/blob/master/docs/new_project_guide.md + + https://llvm.org/docs/LibFuzzer.html + + https://github.com/google/fuzzer-test-suite/blob/master/tutorial/libFuzzerTutorial.md + + https://chromium.googlesource.com/chromium/src/testing/libfuzzer/+/HEAD/efficient_fuzzer.md diff --git a/doc/TRANSIENT-SETTINGS.md b/doc/TRANSIENT-SETTINGS.md new file mode 100644 index 0000000000..ca9e8387b7 --- /dev/null +++ b/doc/TRANSIENT-SETTINGS.md @@ -0,0 +1,454 @@ +# What settings are currently available for transient units? + +Our intention is to make all settings that are available as unit file settings +also available for transient units, through the D-Bus API. At the moment, some +unit types (device, swap, target) are not supported at all via unit types, +but most others are pretty well supported, with some notable omissions. + +The lists below contain all settings currently available in unit files. The +ones currently available in transient units are prefixed with `✓`. 
+ +## Generic Unit Settings + +Most generic unit settings are available for transient units. + +``` +✓ Description= +✓ Documentation= +✓ SourcePath= +✓ Requires= +✓ Requisite= +✓ Wants= +✓ BindsTo= +✓ Conflicts= +✓ Before= +✓ After= +✓ OnFailure= +✓ PropagatesReloadTo= +✓ ReloadPropagatedFrom= +✓ PartOf= +✓ JoinsNamespaceOf= +✓ RequiresMountsFor= +✓ StopWhenUnneeded= +✓ RefuseManualStart= +✓ RefuseManualStop= +✓ AllowIsolate= +✓ DefaultDependencies= +✓ OnFailureJobMode= +✓ IgnoreOnIsolate= +✓ JobTimeoutSec= +✓ JobRunningTimeoutSec= +✓ JobTimeoutAction= +✓ JobTimeoutRebootArgument= +✓ StartLimitIntervalSec=SECONDS +✓ StartLimitBurst=UNSIGNED +✓ StartLimitAction=ACTION +✓ FailureAction= +✓ SuccessAction= +✓ AddRef= +✓ RebootArgument=STRING +✓ ConditionPathExists= +✓ ConditionPathExistsGlob= +✓ ConditionPathIsDirectory= +✓ ConditionPathIsSymbolicLink= +✓ ConditionPathIsMountPoint= +✓ ConditionPathIsReadWrite= +✓ ConditionDirectoryNotEmpty= +✓ ConditionFileNotEmpty= +✓ ConditionFileIsExecutable= +✓ ConditionNeedsUpdate= +✓ ConditionFirstBoot= +✓ ConditionKernelCommandLine= +✓ ConditionKernelVersion= +✓ ConditionArchitecture= +✓ ConditionVirtualization= +✓ ConditionSecurity= +✓ ConditionCapability= +✓ ConditionHost= +✓ ConditionACPower= +✓ ConditionUser= +✓ ConditionGroup= +✓ ConditionControlGroupController= +✓ AssertPathExists= +✓ AssertPathExistsGlob= +✓ AssertPathIsDirectory= +✓ AssertPathIsSymbolicLink= +✓ AssertPathIsMountPoint= +✓ AssertPathIsReadWrite= +✓ AssertDirectoryNotEmpty= +✓ AssertFileNotEmpty= +✓ AssertFileIsExecutable= +✓ AssertNeedsUpdate= +✓ AssertFirstBoot= +✓ AssertKernelCommandLine= +✓ AssertKernelVersion= +✓ AssertArchitecture= +✓ AssertVirtualization= +✓ AssertSecurity= +✓ AssertCapability= +✓ AssertHost= +✓ AssertACPower= +✓ AssertUser= +✓ AssertGroup= +✓ AssertControlGroupController= +✓ CollectMode= +``` + +## Execution-Related Settings + +All execution-related settings are available for transient units. 
+ +``` +✓ WorkingDirectory= +✓ RootDirectory= +✓ RootImage= +✓ User= +✓ Group= +✓ SupplementaryGroups= +✓ Nice= +✓ OOMScoreAdjust= +✓ IOSchedulingClass= +✓ IOSchedulingPriority= +✓ CPUSchedulingPolicy= +✓ CPUSchedulingPriority= +✓ CPUSchedulingResetOnFork= +✓ CPUAffinity= +✓ UMask= +✓ Environment= +✓ EnvironmentFile= +✓ PassEnvironment= +✓ UnsetEnvironment= +✓ DynamicUser= +✓ RemoveIPC= +✓ StandardInput= +✓ StandardOutput= +✓ StandardError= +✓ StandardInputText= +✓ StandardInputData= +✓ TTYPath= +✓ TTYReset= +✓ TTYVHangup= +✓ TTYVTDisallocate= +✓ SyslogIdentifier= +✓ SyslogFacility= +✓ SyslogLevel= +✓ SyslogLevelPrefix= +✓ LogLevelMax= +✓ LogExtraFields= +✓ SecureBits= +✓ CapabilityBoundingSet= +✓ AmbientCapabilities= +✓ TimerSlackNSec= +✓ NoNewPrivileges= +✓ KeyringMode= +✓ SystemCallFilter= +✓ SystemCallArchitectures= +✓ SystemCallErrorNumber= +✓ MemoryDenyWriteExecute= +✓ RestrictNamespaces= +✓ RestrictRealtime= +✓ RestrictAddressFamilies= +✓ LockPersonality= +✓ LimitCPU= +✓ LimitFSIZE= +✓ LimitDATA= +✓ LimitSTACK= +✓ LimitCORE= +✓ LimitRSS= +✓ LimitNOFILE= +✓ LimitAS= +✓ LimitNPROC= +✓ LimitMEMLOCK= +✓ LimitLOCKS= +✓ LimitSIGPENDING= +✓ LimitMSGQUEUE= +✓ LimitNICE= +✓ LimitRTPRIO= +✓ LimitRTTIME= +✓ ReadWritePaths= +✓ ReadOnlyPaths= +✓ InaccessiblePaths= +✓ BindPaths= +✓ BindReadOnlyPaths= +✓ TemporaryFileSystem= +✓ PrivateTmp= +✓ PrivateDevices= +✓ ProtectKernelTunables= +✓ ProtectKernelModules= +✓ ProtectControlGroups= +✓ PrivateNetwork= +✓ PrivateUsers= +✓ ProtectSystem= +✓ ProtectHome= +✓ MountFlags= +✓ MountAPIVFS= +✓ Personality= +✓ RuntimeDirectoryPreserve= +✓ RuntimeDirectoryMode= +✓ RuntimeDirectory= +✓ StateDirectoryMode= +✓ StateDirectory= +✓ CacheDirectoryMode= +✓ CacheDirectory= +✓ LogsDirectoryMode= +✓ LogsDirectory= +✓ ConfigurationDirectoryMode= +✓ ConfigurationDirectory= +✓ PAMName= +✓ IgnoreSIGPIPE= +✓ UtmpIdentifier= +✓ UtmpMode= +✓ SELinuxContext= +✓ SmackProcessLabel= +✓ AppArmorProfile= +✓ Slice= +``` + +## Resource Control Settings + +All 
cgroup/resource control settings are available for transient units:
+
+```
+✓ CPUAccounting=
+✓ CPUWeight=
+✓ StartupCPUWeight=
+✓ CPUShares=
+✓ StartupCPUShares=
+✓ CPUQuota=
+✓ MemoryAccounting=
+✓ MemoryLow=
+✓ MemoryHigh=
+✓ MemoryMax=
+✓ MemorySwapMax=
+✓ MemoryLimit=
+✓ DeviceAllow=
+✓ DevicePolicy=
+✓ IOAccounting=
+✓ IOWeight=
+✓ StartupIOWeight=
+✓ IODeviceWeight=
+✓ IOReadBandwidthMax=
+✓ IOWriteBandwidthMax=
+✓ IOReadIOPSMax=
+✓ IOWriteIOPSMax=
+✓ BlockIOAccounting=
+✓ BlockIOWeight=
+✓ StartupBlockIOWeight=
+✓ BlockIODeviceWeight=
+✓ BlockIOReadBandwidth=
+✓ BlockIOWriteBandwidth=
+✓ TasksAccounting=
+✓ TasksMax=
+✓ Delegate=
+✓ IPAccounting=
+✓ IPAddressAllow=
+✓ IPAddressDeny=
+```
+
+## Process Killing Settings
+
+All process killing settings are available for transient units:
+
+```
+✓ SendSIGKILL=
+✓ SendSIGHUP=
+✓ KillMode=
+✓ KillSignal=
+```
+
+## Service Unit Settings
+
+Most service unit settings are available for transient units.
+
+```
+✓ PIDFile=
+✓ ExecStartPre=
+✓ ExecStart=
+✓ ExecStartPost=
+✓ ExecReload=
+✓ ExecStop=
+✓ ExecStopPost=
+✓ RestartSec=
+✓ TimeoutStartSec=
+✓ TimeoutStopSec=
+✓ TimeoutSec=
+✓ RuntimeMaxSec=
+✓ WatchdogSec=
+✓ Type=
+✓ Restart=
+✓ PermissionsStartOnly=
+✓ RootDirectoryStartOnly=
+✓ RemainAfterExit=
+✓ GuessMainPID=
+✓ RestartPreventExitStatus=
+✓ RestartForceExitStatus=
+✓ SuccessExitStatus=
+✓ NonBlocking=
+✓ BusName=
+✓ FileDescriptorStoreMax=
+✓ NotifyAccess=
+  Sockets=
+✓ USBFunctionDescriptors=
+✓ USBFunctionStrings=
+```
+
+## Mount Unit Settings
+
+All mount unit settings are available to transient units:
+
+```
+✓ What=
+✓ Where=
+✓ Options=
+✓ Type=
+✓ TimeoutSec=
+✓ DirectoryMode=
+✓ SloppyOptions=
+✓ LazyUnmount=
+✓ ForceUnmount=
+```
+
+## Automount Unit Settings
+
+All automount unit settings are available to transient units:
+
+```
+✓ Where=
+✓ DirectoryMode=
+✓ TimeoutIdleSec=
+```
+
+## Timer Unit Settings
+
+Most timer unit settings are available to transient units.
+
+```
+✓ OnCalendar=
+✓ OnActiveSec=
+✓ OnBootSec=
+✓ OnStartupSec=
+✓ OnUnitActiveSec=
+✓ OnUnitInactiveSec=
+✓ Persistent=
+✓ WakeSystem=
+✓ RemainAfterElapse=
+✓ AccuracySec=
+✓ RandomizedDelaySec=
+  Unit=
+```
+
+## Slice Unit Settings
+
+Slice units are fully supported as transient units, but they have no settings
+of their own beyond the generic unit and resource control settings.
+
+## Scope Unit Settings
+
+Scope units are fully supported as transient units (in fact they only exist as
+such).
+
+```
+✓ TimeoutStopSec=
+```
+
+## Socket Unit Settings
+
+Most socket unit settings are available to transient units.
+
+```
+✓ ListenStream=
+✓ ListenDatagram=
+✓ ListenSequentialPacket=
+✓ ListenFIFO=
+✓ ListenNetlink=
+✓ ListenSpecial=
+✓ ListenMessageQueue=
+✓ ListenUSBFunction=
+✓ SocketProtocol=
+✓ BindIPv6Only=
+✓ Backlog=
+✓ BindToDevice=
+✓ ExecStartPre=
+✓ ExecStartPost=
+✓ ExecStopPre=
+✓ ExecStopPost=
+✓ TimeoutSec=
+✓ SocketUser=
+✓ SocketGroup=
+✓ SocketMode=
+✓ DirectoryMode=
+✓ Accept=
+✓ Writable=
+✓ MaxConnections=
+✓ MaxConnectionsPerSource=
+✓ KeepAlive=
+✓ KeepAliveTimeSec=
+✓ KeepAliveIntervalSec=
+✓ KeepAliveProbes=
+✓ DeferAcceptSec=
+✓ NoDelay=
+✓ Priority=
+✓ ReceiveBuffer=
+✓ SendBuffer=
+✓ IPTOS=
+✓ IPTTL=
+✓ Mark=
+✓ PipeSize=
+✓ FreeBind=
+✓ Transparent=
+✓ Broadcast=
+✓ PassCredentials=
+✓ PassSecurity=
+✓ TCPCongestion=
+✓ ReusePort=
+✓ MessageQueueMaxMessages=
+✓ MessageQueueMessageSize=
+✓ RemoveOnStop=
+✓ Symlinks=
+✓ FileDescriptorName=
+  Service=
+✓ TriggerLimitIntervalSec=
+✓ TriggerLimitBurst=
+✓ SmackLabel=
+✓ SmackLabelIPIn=
+✓ SmackLabelIPOut=
+✓ SELinuxContextFromNet=
+```
+
+## Swap Unit Settings
+
+Swap units are currently not available at all as transient units:
+
+```
+  What=
+  Priority=
+  Options=
+  TimeoutSec=
+```
+
+## Path Unit Settings
+
+Most path unit settings are available to transient units.
+
+```
+✓ PathExists=
+✓ PathExistsGlob=
+✓ PathChanged=
+✓ PathModified=
+✓ DirectoryNotEmpty=
+  Unit=
+✓ MakeDirectory=
+✓ DirectoryMode=
+```
+
+## Install Section
+
+The `[Install]` section is currently not available at all for transient units, and it probably doesn't even make sense.
+
+```
+  Alias=
+  WantedBy=
+  RequiredBy=
+  Also=
+  DefaultInstance=
+```
diff --git a/doc/TRANSLATORS b/doc/TRANSLATORS
new file mode 100644
index 0000000000..873ec7b01c
--- /dev/null
+++ b/doc/TRANSLATORS
@@ -0,0 +1,27 @@
+Notes for translators
+=====================
+
+systemd depends on gettext for multilingual support.
+In po/ directory you'll find the needed files.
+
+POT (Portable Object Template)
+------------------------------
+A text file with .pot extension, with all the extracted labels from code.
+
+To update the template:
+
+$ cd systemd/
+$ ninja -C build systemd-pot
+
+To start a new translation:
+
+$ cd po/
+$ cp systemd.pot .po
+
+Replace with the two-letters codes of ISO 639 standard.
+
+PO (Portable Object)
+--------------------
+A text file with .po extension, with all the available labels and some additional
+metadata fields. Any editor is ok, but a good standard is 'poedit', a graphical
+application specifically designed for this kind of task.
diff --git a/doc/UIDS-GIDS.md b/doc/UIDS-GIDS.md
new file mode 100644
index 0000000000..e19cc88162
--- /dev/null
+++ b/doc/UIDS-GIDS.md
@@ -0,0 +1,243 @@
+# Users, Groups, UIDs and GIDs on `systemd` systems
+
+Here's a summary of the requirements `systemd` (and Linux) make on UID/GID
+assignments and their ranges.
+
+Note that while in theory UIDs and GIDs are orthogonal concepts they really
+aren't IRL. With that in mind, when we discuss UIDs below it should be assumed
+that whatever we say about UIDs applies to GIDs in mostly the same way, and all
+the special assignments and ranges for UIDs always have mostly the same
+validity for GIDs too.
+
+## Special Linux UIDs
+
+In theory, the range of the C type `uid_t` is 32bit wide on Linux,
+i.e. 0…4294967295. However, four UIDs are special on Linux:
+
+1. 0 → The `root` super-user
+
+2. 65534 → The `nobody` UID, also called the "overflow" UID or similar. It's
+   where various subsystems map unmappable users to, for example file systems
+   only supporting 16bit UIDs, NFS or user namespacing. (The latter can be
+   changed with a sysctl during runtime, but that's not supported on
+   `systemd`. If you do change it you void your warranty.) Because Fedora is a
+   bit confused the `nobody` user is called `nfsnobody` there (and they have a
+   different `nobody` user at UID 99). I hope this will be corrected eventually
+   though. (Also, some distributions call the `nobody` group `nogroup`. I wish
+   they didn't.)
+
+3. 4294967295, aka "32bit `(uid_t) -1`" → This UID is not a valid user ID, as
+   `setresuid()`, `chown()` and friends treat -1 as a special request to not
+   change the UID of the process/file. This UID is hence not available for
+   assignment to users in the user database.
+
+4. 65535, aka "16bit `(uid_t) -1`" → Before Linux kernel 2.4 `uid_t` used to be
+   16bit, and programs compiled for that would hence assume that `(uid_t) -1`
+   is 65535. This UID is hence not usable either.
+
+The `nss-systemd` glibc NSS module will synthesize user database records for
+the UIDs 0 and 65534 if the system user database doesn't list them. This means
+that any system where this module is enabled works to some minimal level
+without `/etc/passwd`.
+
+## Special Distribution UID ranges
+
+Distributions generally split the available UID range in two:
+
+1. 1…999 → System users. These are users that do not map to actual "human"
+   users, but are used as security identities for system daemons, to implement
+   privilege separation and run system daemons with minimal privileges.
+
+2. 1000…65533 and 65536…4294967294 → Everything else, i.e. regular (human) users.
+
+Note that most distributions allow changing the boundary between system and
+regular users, even during runtime as user configuration. Moreover, some older
+systems placed the boundary at 499/500, or even 99/100. In `systemd`, the
+boundary is configurable only during compilation time, as this should be a
+decision for distribution builders, not for users. Moreover, we strongly
+discourage downstreams from moving the boundary away from the upstream default
+of 999/1000.
+
+Also note that programs such as `adduser` tend to allocate from a subset of the
+available regular user range only, usually 1000…60000. That subset is usually
+user-configurable, too.
+
+Note that systemd requires that system users and groups are resolvable without
+networking available, a requirement that is not made for regular users. This
+means regular users may be stored in remote LDAP or NIS databases, but system
+users may not (except when there's a consistent local cache kept, that is
+available during earliest boot, including in the initial RAM disk).
+
+## Special `systemd` GIDs
+
+`systemd` defines no special UIDs beyond what Linux already defines (see
+above). However, it does define some special group/GID assignments, which are
+primarily used for `systemd-udevd`'s device management. The precise list of the
+currently defined groups is found in this `sysusers.d` snippet:
+[basic.conf](https://raw.githubusercontent.com/systemd/systemd/master/sysusers.d/basic.conf.in)
+
+It's strongly recommended that downstream distributions include these groups in
+their default group databases.
+
+Note that the actual GID numbers assigned to these groups do not have to be
+constant beyond a specific system. There's one exception however: the `tty`
+group must have the GID 5. That's because it must be encoded in the `devpts`
+mount parameters during earliest boot, at a time where NSS lookups are not
+possible.
(Note that the actual GID can be changed during `systemd` build time,
+but downstreams are strongly advised against doing that.)
+
+## Special `systemd` UID ranges
+
+`systemd` defines a number of special UID ranges:
+
+1. 61184…65519 → UIDs for dynamic users are allocated from this range (see the
+   `DynamicUser=` documentation in
+   [`systemd.exec(5)`](https://www.freedesktop.org/software/systemd/man/systemd.exec.html)). This
+   range has been chosen so that it is below the 16bit boundary (i.e. below
+   65535), in order to provide compatibility with container environments that
+   assign a 64K range of UIDs to containers using user namespacing. This range
+   is above the 60000 boundary, so that its allocations are unlikely to be
+   affected by `adduser` allocations (see above). And we leave some room
+   upwards for other purposes. (And if you wonder why precisely these numbers:
+   if you write them in hexadecimal, they might make more sense: 0xEF00 and
+   0xFFEF). The `nss-systemd` module will synthesize user records implicitly
+   for all currently allocated dynamic users from this range. Thus, NSS-based
+   user record resolving works correctly without those users being in
+   `/etc/passwd`.
+
+2. 524288…1879048191 → UID range for `systemd-nspawn`'s automatic allocation of
+   per-container UID ranges. When the `--private-users=pick` switch is used (or
+   `-U`) then it will automatically find a so far unused 16bit subrange of this
+   range and assign it to the container. The range is picked so that the upper
+   16bit of the 32bit UIDs are constant for all users of the container, while
+   the lower 16bit directly encode the 65536 UIDs assigned to the
+   container. This mode of allocation means that the upper 16bit of any UID
+   assigned to a container are kind of a "container ID", while the lower 16bit
+   directly expose the container's own UID numbers. If you wonder why precisely
+   these numbers, consider them in hexadecimal: 0x00080000…0x6FFFFFFF.
This
+   range is above the 16bit boundary. Moreover it's below the 31bit boundary,
+   as some broken code (specifically: the kernel's `devpts` file system)
+   erroneously considers UIDs signed integers, and hence can't deal with values
+   above 2^31. The `nss-mymachines` glibc NSS module will synthesize user
+   database records for all UIDs assigned to a running container from this
+   range.
+
+Note for both allocation ranges: when a UID allocation takes place NSS is
+checked for collisions first, and a different UID is picked if an entry is
+found. Thus, the user database is used as a synchronization mechanism to ensure
+exclusive ownership of UIDs and UID ranges. To ensure compatibility with other
+subsystems allocating from the same ranges it is hence essential that they
+ensure that whatever they pick shows up in the user/group databases, either by
+providing an NSS module, or by adding entries directly to `/etc/passwd` and
+`/etc/group`. For performance reasons, do note that `systemd-nspawn` will only
+do an NSS check for the first UID of the range it allocates, not all 65536 of
+them. Also note that while the allocation logic is operating, the glibc
+`lckpwdf()` user database lock is taken, in order to make this logic race-free.
+
+## Figuring out the system's UID boundaries
+
+The most important boundaries of the local system may be queried with
+`pkg-config`:
+
+```
+$ pkg-config --variable=systemuidmax systemd
+999
+$ pkg-config --variable=dynamicuidmin systemd
+61184
+$ pkg-config --variable=dynamicuidmax systemd
+65519
+$ pkg-config --variable=containeruidbasemin systemd
+524288
+$ pkg-config --variable=containeruidbasemax systemd
+1878982656
+```
+
+(Note that the latter encodes the maximum UID *base* `systemd-nspawn` might
+pick. Given that 64K UIDs are assigned to each container according to this
+allocation logic, the maximum UID used for this range is hence
+1878982656+65535=1879048191.)
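
The arithmetic in the note above can be checked mechanically. What follows is a minimal C sketch, assuming the values printed by `pkg-config` in the example; the macro and function names are hypothetical and are not part of any systemd API.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names; the values are the ones printed by pkg-config above. */
#define CONTAINER_UID_BASE_MIN UINT32_C(524288)     /* 0x00080000 */
#define CONTAINER_UID_BASE_MAX UINT32_C(1878982656) /* 0x6FFF0000 */
#define CONTAINER_UID_COUNT    UINT32_C(65536)      /* 64K UIDs per container */

/* Returns the last UID of the 64K range that starts at the given base. */
static uint32_t container_last_uid(uint32_t base) {
        /* Assumption taken from the text: a base lies inside the advertised
         * window and has its lower 16 bits cleared. */
        assert(base >= CONTAINER_UID_BASE_MIN && base <= CONTAINER_UID_BASE_MAX);
        assert((base & UINT32_C(0xFFFF)) == 0);

        return base + (CONTAINER_UID_COUNT - 1);
}
```

Called with `CONTAINER_UID_BASE_MAX`, the function returns 1879048191 (0x6FFFFFFF), which is the upper end of the container range given earlier.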
+
+Note that systemd does not make any of these values runtime-configurable. All
+these boundaries are chosen during build time. That said, the system UID/GID
+boundary is traditionally configured in `/etc/login.defs`, though systemd won't
+look there during runtime.
+
+## Considerations for container managers
+
+If you hack on a container manager, and wonder how and how many UIDs best to
+assign to your containers, here are a few recommendations:
+
+1. Definitely, don't assign fewer than 65536 UIDs/GIDs. After all the `nobody`
+user has magic properties, and hence should be available in your container, and
+given that it's assigned the UID 65534, you should really cover the full 16bit
+range in your container. Note that systemd will, as mentioned, synthesize
+user records for the `nobody` user, and assumes its availability in various
+other parts of its codebase, too, hence assigning fewer users means you lose
+compatibility with running systemd code inside your container. And most likely
+other packages make similar restrictions.
+
+2. While it's fine to assign more than 65536 UIDs/GIDs to a container, there's
+most likely not much value in doing so, as Linux distributions won't use the
+higher ranges by default (as mentioned neither `adduser` nor `systemd`'s
+dynamic user concept allocate from above the 16bit range). Unless you actively
+care for nested containers, it's hence probably a good idea to allocate exactly
+65536 UIDs per container, and neither fewer nor more. A pretty side-effect is
+that by doing so, you expose the same number of UIDs per container as Linux 2.2
+supported for the whole system, back in the day.
+
+3. Consider allocating UID ranges for containers so that the first UID you
+assign has the lower 16bits all set to zero. That way, the upper 16bits become
+a container ID of some kind, while the lower 16bits directly encode the
+internal container UID. This is the way `systemd-nspawn` allocates UID ranges
+(see above).
Following this allocation logic ensures best compatibility with
+`systemd-nspawn` and all other container managers following the scheme, as it
+is sufficient then to check NSS for the first UID you pick regarding conflicts,
+as that's what they do, too. Moreover, it makes `chown()`ing container file
+system trees nicely robust to interruptions: as the external UID encodes the
+internal UID in a fixed way, it's very easy to adjust the container's base UID
+without the need to know the original base UID: to change the container base,
+just mask away the upper 16bit, and insert the upper 16bit of the new container
+base instead. Here are the easy conversions to derive the internal UID, the
+external UID, and the container base UID from each other:
+
+   ```
+   INTERNAL_UID = EXTERNAL_UID & 0x0000FFFF
+   CONTAINER_BASE_UID = EXTERNAL_UID & 0xFFFF0000
+   EXTERNAL_UID = INTERNAL_UID | CONTAINER_BASE_UID
+   ```
+
+4. When picking a UID range for containers, make sure to check NSS first, with
+a simple `getpwuid()` call: if there's already a user record for the first UID
+you want to pick, then it's already in use: pick a different one. Wrap that
+call in a `lckpwdf()` + `ulckpwdf()` pair, to make allocation
+race-free. Provide an NSS module that makes all UIDs you end up taking show up
+in the user database, and make sure that the NSS module returns up-to-date
+information before you release the lock, so that other system components can
+safely use the NSS user database as allocation check, too. Note that if you
+follow this scheme no changes to `/etc/passwd` need to be made, thus minimizing
+the artifacts the container manager persistently leaves in the system.
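
Recommendations 3 and 4 can be sketched in C roughly as follows. This is an illustration under stated assumptions, not code taken from `systemd-nspawn`: all function names are hypothetical, the masks assume a container base whose lower 16 bits are zero, and the lookup treats any `NULL` result from `getpwuid()` as "no record" without inspecting `errno`.

```c
#include <assert.h>
#include <pwd.h>
#include <shadow.h>     /* lckpwdf(), ulckpwdf() */
#include <stdint.h>
#include <sys/types.h>

/* The three conversions from recommendation 3. */
static uid_t internal_uid(uid_t external) {
        return external & UINT32_C(0x0000FFFF);
}

static uid_t container_base_uid(uid_t external) {
        return external & UINT32_C(0xFFFF0000);
}

static uid_t external_uid(uid_t internal, uid_t base) {
        return internal | base;
}

/* Moves an external UID to a new container base without knowing the old one. */
static uid_t rebase_uid(uid_t external, uid_t new_base) {
        assert((new_base & UINT32_C(0x0000FFFF)) == 0);
        return external_uid(internal_uid(external), new_base);
}

/* Recommendation 4: probe the first UID of a candidate range under the user
 * database lock. Returns 0 if NSS has no record for it, 1 if it is taken and
 * -1 if the lock could not be acquired (which normally requires privileges). */
static int probe_base_uid(uid_t base) {
        int r;

        if (lckpwdf() < 0)
                return -1;

        /* Simplification: NULL is read as "no such user"; production code
         * would also look at errno to tell lookup failures apart. */
        r = getpwuid(base) ? 1 : 0;

        /* A real allocator would make the chosen range visible through its
         * NSS module at this point, before releasing the lock. */
        ulckpwdf();
        return r;
}
```

For example, `rebase_uid(0x000803E8, 0x00090000)` gives `0x000903E8`: internal UID 1000 moves from the container at base `0x00080000` to the one at base `0x00090000`, which is what makes an interrupted recursive `chown()` easy to resume.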
+
+## Summary
+
+| UID/GID               | Purpose               | Defined By    | Listed in                     |
+|-----------------------|-----------------------|---------------|-------------------------------|
+| 0                     | `root` user           | Linux         | `/etc/passwd` + `nss-systemd` |
+| 1…4                   | System users          | Distributions | `/etc/passwd`                 |
+| 5                     | `tty` group           | `systemd`     | `/etc/passwd`                 |
+| 6…999                 | System users          | Distributions | `/etc/passwd`                 |
+| 1000…60000            | Regular users         | Distributions | `/etc/passwd` + LDAP/NIS/…    |
+| 60001…61183           | Unused                |               |                               |
+| 61184…65519           | Dynamic service users | `systemd`     | `nss-systemd`                 |
+| 65520…65533           | Unused                |               |                               |
+| 65534                 | `nobody` user         | Linux         | `/etc/passwd` + `nss-systemd` |
+| 65535                 | 16bit `(uid_t) -1`    | Linux         |                               |
+| 65536…524287          | Unused                |               |                               |
+| 524288…1879048191     | Container UID ranges  | `systemd`     | `nss-mymachines`              |
+| 1879048192…4294967294 | Unused                |               |                               |
+| 4294967295            | 32bit `(uid_t) -1`    | Linux         |                               |
+
+Note that "Unused" in the table above doesn't mean that these ranges are
+really unused. It just means that these ranges have no well-established
+pre-defined purposes between Linux, generic low-level distributions and
+`systemd`. There might very well be other packages that allocate from these
+ranges.
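
The summary table translates directly into a lookup function. The C sketch below hard-codes the boundaries exactly as listed in the table, i.e. the upstream defaults; the enum and function names are hypothetical, and a real tool would rather query the local boundaries via `pkg-config` as described earlier.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical classification, one value per distinct purpose in the table. */
typedef enum IdClass {
        ID_ROOT,
        ID_SYSTEM,
        ID_TTY_GROUP,
        ID_REGULAR,
        ID_DYNAMIC,
        ID_NOBODY,
        ID_CONTAINER,
        ID_INVALID,
        ID_UNUSED,
} IdClass;

/* The container range in the table ends at 0x6FFFFFFF. */
static_assert(UINT32_C(1879048191) == UINT32_C(0x6FFFFFFF), "container range end");

/* Maps a UID/GID to its row in the summary table. */
static IdClass classify_id(uint32_t id) {
        if (id == 0)                    return ID_ROOT;
        if (id == 5)                    return ID_TTY_GROUP; /* special as a GID only */
        if (id <= 999)                  return ID_SYSTEM;
        if (id <= 60000)                return ID_REGULAR;
        if (id <= 61183)                return ID_UNUSED;
        if (id <= 65519)                return ID_DYNAMIC;
        if (id <= 65533)                return ID_UNUSED;
        if (id == 65534)                return ID_NOBODY;
        if (id == 65535)                return ID_INVALID;   /* 16bit (uid_t) -1 */
        if (id <= 524287)               return ID_UNUSED;
        if (id <= UINT32_C(1879048191)) return ID_CONTAINER;
        if (id <= UINT32_C(4294967294)) return ID_UNUSED;
        return ID_INVALID;                                   /* 32bit (uid_t) -1 */
}
```

As the note under the table says, `ID_UNUSED` only means that no purpose is pre-defined for the range, not that nothing allocates from it.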
diff --git a/meson.build b/meson.build index 96340bb134..45e5cdb8ff 100644 --- a/meson.build +++ b/meson.build @@ -2596,15 +2596,17 @@ install_data('xorg/50-systemd-user.sh', install_dir : xinitrcdir) install_data('modprobe.d/systemd.conf', install_dir : modprobedir) -install_data('README', - 'NEWS', - 'CODING_STYLE', - 'DISTRO_PORTING', - 'ENVIRONMENT.md', - 'LICENSE.GPL2', +install_data('LICENSE.GPL2', 'LICENSE.LGPL2.1', - 'TRANSIENT-SETTINGS.md', - 'UIDS-GIDS.md', + 'NEWS', + 'README', + 'doc/CODING_STYLE', + 'doc/DISTRO_PORTING', + 'doc/ENVIRONMENT.md', + 'doc/HACKING', + 'doc/TRANSIENT-SETTINGS.md', + 'doc/TRANSLATORS', + 'doc/UIDS-GIDS.md', 'src/libsystemd/sd-bus/GVARIANT-SERIALIZATION', install_dir : docdir) diff --git a/src/basic/verbs.c b/src/basic/verbs.c index 47644670da..4f3cd91465 100644 --- a/src/basic/verbs.c +++ b/src/basic/verbs.c @@ -41,7 +41,7 @@ bool running_in_chroot_or_offline(void) { /* Added to support use cases like rpm-ostree, where from %post scripts we only want to execute "preset", but * not "start"/"restart" for example. * - * See ENVIRONMENT.md for docs. + * See doc/ENVIRONMENT.md for docs. */ r = getenv_bool("SYSTEMD_OFFLINE"); if (r < 0 && r != -ENXIO) -- cgit v1.2.1