author | Lennart Poettering <lennart@poettering.net> | 2018-01-12 13:41:05 +0100 |
---|---|---|
committer | Lennart Poettering <lennart@poettering.net> | 2018-01-23 21:29:31 +0100 |
commit | 62a769136df4065ce0711625e1e78ec996447862 (patch) | |
tree | 360b89fcda490f4936cf9ed01c391d8b2223d1d3 /src/core/manager.h | |
parent | 11aef522c16d739653228ef3d5925b6fb25b9d8b (diff) | |
download | systemd-62a769136df4065ce0711625e1e78ec996447862.tar.gz |
core: rework how we track which PIDs to watch for a unit
Previously, we'd maintain two hashmaps keyed by PIDs, pointing to the
Units interested in SIGCHLD events for them. This scheme allowed a
specific PID to be watched by exactly 0, 1 or 2 units.
With this rework this is replaced by a single hashmap which is primarily
keyed by the PID and points to a Unit interested in it. However, it is
optionally also keyed by the negated PID, in which case it points to a
NULL-terminated array of additional Unit objects also interested. This
scheme means arbitrary numbers of Units may now watch the same PID.
Runtime and memory behaviour should not be impacted by this change, as
for the common case (i.e. each PID only watched by a single unit)
behaviour stays the same, but for the uncommon case (a PID watched by
more than one unit) we only pay with a single additional memory
allocation for the array.
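The encoding described above can be illustrated with a small, self-contained sketch. This is not the code from the patch: `ToyMap`, `toy_map_get()`, `toy_map_put()`, `watch_pid()` and `count_watchers()` are hypothetical names, and the fixed-size table is only a stand-in for systemd's `Hashmap`. It shows the first watcher being stored under `pid` and any further watchers in a NULL-terminated array stored under `-pid`.

```c
#include <assert.h>
#include <stdlib.h>
#include <sys/types.h>

typedef struct Unit { const char *id; } Unit;

/* Toy stand-in for a hashmap: a small table scanned linearly.
 * A NULL value marks a free slot. */
#define TOY_MAP_SLOTS 64
typedef struct ToyMap {
        pid_t keys[TOY_MAP_SLOTS];
        void *values[TOY_MAP_SLOTS];
} ToyMap;

static void *toy_map_get(const ToyMap *m, pid_t key) {
        for (size_t i = 0; i < TOY_MAP_SLOTS; i++)
                if (m->values[i] && m->keys[i] == key)
                        return m->values[i];
        return NULL;
}

/* Inserts or replaces; returns -1 only if a new key finds no free slot. */
static int toy_map_put(ToyMap *m, pid_t key, void *value) {
        for (size_t i = 0; i < TOY_MAP_SLOTS; i++)
                if (m->values[i] && m->keys[i] == key) {
                        m->values[i] = value;
                        return 0;
                }
        for (size_t i = 0; i < TOY_MAP_SLOTS; i++)
                if (!m->values[i]) {
                        m->keys[i] = key;
                        m->values[i] = value;
                        return 0;
                }
        return -1;
}

/* Registers u as a watcher of pid. The first watcher is stored under
 * pid itself; every further watcher goes into a NULL-terminated array
 * stored under -pid. */
static int watch_pid(ToyMap *m, Unit *u, pid_t pid) {
        assert(m && u && pid > 0);

        Unit *first = toy_map_get(m, pid);
        if (!first)
                return toy_map_put(m, pid, u);
        if (first == u)
                return 0; /* already watching */

        Unit **array = toy_map_get(m, -pid);
        size_t n = 0;
        if (array)
                for (; array[n]; n++)
                        if (array[n] == u)
                                return 0; /* already in the array */

        /* Grow by one entry, keeping room for the NULL terminator. */
        Unit **grown = realloc(array, (n + 2) * sizeof(Unit *));
        if (!grown)
                return -1;
        grown[n] = u;
        grown[n + 1] = NULL;

        if (toy_map_put(m, -pid, grown) < 0) {
                /* Only reachable when no array existed before. */
                free(grown);
                return -1;
        }
        return 0;
}

/* Counts the direct entry plus all entries of the optional array. */
static size_t count_watchers(const ToyMap *m, pid_t pid) {
        size_t n = toy_map_get(m, pid) ? 1 : 0;
        Unit **array = toy_map_get(m, -pid);
        if (array)
                for (Unit **p = array; *p; p++)
                        n++;
        return n;
}
```

In the common case (one watcher per PID) `watch_pid()` performs a single map insertion and no array allocation; the array only appears once a second unit registers for the same PID.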
Why this all? Primarily, because allowing exactly two units to watch a
specific PID is not sufficient for some niche cases, as processes can
belong to more than one unit these days:
1. sd_notify() with MAINPID= can be used to attach a process from a
different cgroup to multiple units.
2. Similarly, the PIDFile= setting in unit files can be used for
comparable setups.
3. By creating a scope unit a main process of a service may join a
different unit, too.
4. On cgroupsv1 we frequently end up watching all processes remaining in
a scope, and if a process opens lots of scopes one after the other it
might thus end up being watched by many of them.
This patch hence removes the 2-unit-per-PID limit. It also makes a
couple of other changes, some of them quite relevant:
- manager_get_unit_by_pid() (and the bus call wrapping it) when there's
ambiguity will prefer returning the Unit the process belongs to based on
cgroup membership, and only check the watch-pids hashmap if that
fails. This change in logic is probably more in line with what people
expect and makes things more stable as each process can belong to
exactly one cgroup only.
- Every SIGCHLD event is now dispatched to all units interested in its
PID. Previously, there was some magic conditionalization: the SIGCHLD
would only be dispatched to the unit if it was interested in a single
PID only, or the PID was the unit's control or main PID, or we hadn't
dispatched a single SIGCHLD to the unit in the current event loop
iteration yet. These rules were quite arbitrary and also redundant as
the per-unit handlers would filter the PIDs a second time anyway.
With this change we'll hence relax the rules: all we do now is
dispatch every SIGCHLD event exactly once to each unit interested in
it, and it's up to the unit to then use or ignore this. We use a
generation counter in the unit to ensure that we only invoke the unit
handler once for each event, protecting us from confusion if a unit is
both associated with a specific PID through cgroup membership and
through the "watch_pids" logic. It also protects us from being
confused if the "watch_pids" hashmap is altered while we are
dispatching to it (which is a very likely case).
- sd_notify() message dispatching has been reworked to be very similar
to SIGCHLD handling now. A generation counter is used for dispatching
as well.
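The generation-counter idea from the SIGCHLD item can be sketched as follows. This is an illustration under assumptions, not the patch itself: the `Manager.sigchldgen` field name follows the diff below, while the per-unit `sigchldgen` field, the `n_invoked` counter, and the functions `invoke_sigchld_event()` and `dispatch_sigchld()` are hypothetical. The sketch only shows how a unit that is reachable through several paths is still invoked once per event.

```c
#include <assert.h>
#include <stddef.h>

typedef struct Unit {
        const char *id;
        unsigned sigchldgen; /* generation of the last event this unit saw */
        unsigned n_invoked;  /* illustration only: number of handler calls */
} Unit;

typedef struct Manager {
        unsigned sigchldgen; /* bumped once per dispatched SIGCHLD event */
} Manager;

/* Calls the (here: simulated) unit handler unless the unit has already
 * been handed the event of the current generation. */
static void invoke_sigchld_event(Manager *m, Unit *u) {
        if (!u)
                return;
        if (u->sigchldgen == m->sigchldgen)
                return; /* already dispatched for this event */

        u->sigchldgen = m->sigchldgen;
        u->n_invoked++; /* stand-in for the per-unit SIGCHLD handler */
}

/* Dispatches one event to the unit found via cgroup membership, to the
 * first watcher, and to the optional NULL-terminated array of further
 * watchers. Any of the three may be NULL. */
static void dispatch_sigchld(Manager *m,
                             Unit *by_cgroup,
                             Unit *first_watcher,
                             Unit **more_watchers) {
        /* New event, new generation. Incrementing before the first
         * dispatch means zero-initialized units never match it
         * (ignoring unsigned wraparound in this sketch). */
        m->sigchldgen++;

        invoke_sigchld_event(m, by_cgroup);
        invoke_sigchld_event(m, first_watcher);
        if (more_watchers)
                for (Unit **p = more_watchers; *p; p++)
                        invoke_sigchld_event(m, *p);
}
```

If the same unit shows up as the cgroup owner, as the first watcher, and again in the array, its handler still runs exactly once for that event; whether the unit then acts on the PID or ignores it is left to the unit.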
This also adds a new test that validates that "watch_pid" registration
and unregistration work correctly.
Diffstat (limited to 'src/core/manager.h')
-rw-r--r-- | src/core/manager.h | 23 |
1 files changed, 14 insertions, 9 deletions
diff --git a/src/core/manager.h b/src/core/manager.h
index 3af780f866..90d5258b53 100644
--- a/src/core/manager.h
+++ b/src/core/manager.h
@@ -145,14 +145,14 @@ struct Manager {
 
         sd_event *event;
 
-        /* We use two hash tables here, since the same PID might be
-         * watched by two different units: once the unit that forked
-         * it off, and possibly a different unit to which it was
-         * joined as cgroup member. Since we know that it is either
-         * one or two units for each PID we just use to hashmaps
-         * here. */
-        Hashmap *watch_pids1;  /* pid => Unit object n:1 */
-        Hashmap *watch_pids2;  /* pid => Unit object n:1 */
+        /* This maps PIDs we care about to units that are interested in. We allow multiple units to he interested in
+         * the same PID and multiple PIDs to be relevant to the same unit. Since in most cases only a single unit will
+         * be interested in the same PID we use a somewhat special encoding here: the first unit interested in a PID is
+         * stored directly in the hashmap, keyed by the PID unmodified. If there are other units interested too they'll
+         * be stored in a NULL-terminated array, and keyed by the negative PID. This is safe as pid_t is signed and
+         * negative PIDs are not used for regular processes but process groups, which we don't care about in this
+         * context, but this allows us to use the negative range for our own purposes. */
+        Hashmap *watch_pids;  /* pid => unit as well as -pid => array of units */
 
         /* A set contains all units which cgroup should be refreshed after startup */
         Set *startup_units;
@@ -350,8 +350,13 @@ struct Manager {
 
         int first_boot; /* tri-state */
 
-        /* prefixes of e.g. RuntimeDirectory= */
+        /* Prefixes of e.g. RuntimeDirectory= */
         char *prefix[_EXEC_DIRECTORY_TYPE_MAX];
+
+        /* Used in the SIGCHLD and sd_notify() message invocation logic to avoid that we dispatch the same event
+         * multiple times on the same unit. */
+        unsigned sigchldgen;
+        unsigned notifygen;
 };
 
 #define MANAGER_IS_SYSTEM(m) ((m)->unit_file_scope == UNIT_FILE_SYSTEM)