path: root/src/core/unit.h
Commit message (Author, Age, Files, Lines)
* core: add new OnSuccess= dependency type (Lennart Poettering, 2021-05-25, 1 file, -2/+3)
| | | | | | | | | | | | | | This is similar to OnFailure= but is activated whenever a unit returns into inactive state successfully. I was always afraid of adding this, since it effectively allows building loops and makes our engine Turing complete, but it pretty much already was; it was just hidden. Given that we have per-unit ratelimits as well as an event loop global ratelimit I feel safe to add this finally, given it actually is useful. Fixes: #13386
* core: convert Slice= into a proper dependency (and add a back dependency) (Lennart Poettering, 2021-05-25, 1 file, -5/+2)
| | | | | | | | | | | | | | | | | | | | | | | | The slice a unit is assigned to is currently a UnitRef reference. Let's turn it into a proper dependency, to simplify and clean up code a bit. Now that new dep types are cheaper, deps should generally be preferable over everything else, if the concept applies. This brings one major benefit: we often have to iterate through all units a slice contains. So far we iterated through all Before= dependencies of the slice unit to achieve that, filtering out unrelated units, and taking advantage of the fact that slice units are implicitly ordered Before= the units they contain. By making Slice= a proper dependency, and having an accompanying SliceOf= dependency type, this is much simpler and nicer as we can directly enumerate the units a slice contains. The forward dependency is actually called InSlice internally, since we already used the UNIT_SLICE name as UnitType field. However, since we don't intend to expose the dependency to users as a dep anyway (we already have the regular Slice D-Bus property for this) this shouldn't matter. The SliceOf= implicit dependency type (the reverse of Slice=/InSlice=) is exported over the bus, to make things a bit nicer to debug and discoverable.
* core: add UNIT_GET_SLICE() helper (Lennart Poettering, 2021-05-25, 1 file, -0/+5)
| | | | | | | | In a later commit we intend to move the slice logic to use proper dependencies instead of a "UnitRef" object. This preparatory commit drops direct use of the slice UnitRef object for a static inline function UNIT_GET_SLICE() that is both easier to grok, and allows us to easily replace its internal implementation later on.
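The accessor idea described in this commit can be illustrated with a minimal sketch. The types below are hypothetical, heavily simplified stand-ins (not systemd's real `Unit` or `UnitRef`), and `unit_get_slice_sketch()` is an illustrative name, not the actual helper.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for illustration only. */
typedef struct UnitSketch UnitSketch;

typedef struct UnitRefSketch {
        UnitSketch *target;
} UnitRefSketch;

struct UnitSketch {
        const char *id;
        UnitRefSketch slice;   /* storage detail that callers should not touch directly */
};

/* Accessor in the spirit of UNIT_GET_SLICE(): callers ask for "the slice",
 * so the storage behind it can change later without touching call sites. */
static inline UnitSketch *unit_get_slice_sketch(const UnitSketch *u) {
        assert(u);
        return u->slice.target;
}
```

Because every caller goes through the accessor, swapping the reference for a dependency lookup would only require changing the function body.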
* core: split dependency types into atoms (Lennart Poettering, 2021-05-25, 1 file, -5/+75)
|
* core: apply LogLevelMax to messages about units (Ryan Hendrickson, 2021-05-03, 1 file, -5/+32)
| | | | | | | | | | This commit applies the filtering imposed by LogLevelMax on a unit's processes to messages logged by PID1 about the unit as well. The target use case for this feature is a service that runs on a timer many times an hour, where the system administrator decides that writing a generic success message to the journal every few minutes or seconds adds no diagnostic value and isn't worth the clutter or disk I/O.
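As a rough illustration of the filtering described above, the following sketch checks a message level against a unit's maximum level using the standard syslog priorities (lower number means more severe). The function name and the "negative means unset" convention are assumptions made for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <syslog.h>

/* Hypothetical helper: returns true if a message at 'level' about a unit
 * should be logged, given the unit's maximum level.
 * Assumption for this sketch: a negative log_level_max means "not configured". */
static bool unit_log_level_passes(int log_level_max, int level) {
        assert(LOG_PRI(level) == level);   /* expect a bare priority, no facility bits */

        if (log_level_max < 0)
                return true;

        return level <= log_level_max;
}
```

With a maximum of `LOG_NOTICE`, a `LOG_INFO` success message would be dropped while a `LOG_WARNING` would still be logged.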
* shared, bpf: add bpf link serialization (Julia Kartseva, 2021-04-26, 1 file, -0/+1)
| | | | core: serialize socket_bind bpf links
* core, bpf: add socket-bind feature to unit (Julia Kartseva, 2021-04-26, 1 file, -0/+8)
| Add supported and install unit interface for socket-bind feature.
|
| supported verifies that
| - unified cgroup hierarchy (cgroup v2) is used
| - BPF_FRAMEWORK (libbpf + clang + llvm + bpftool) was available in compile time
| - kernel supports BPF_PROG_TYPE_CGROUP_SOCK_ADDR
| - bpf programs can be loaded into kernel
| - bpf link can be used
|
| install:
| - load bpf_object from bpf skeleton
| - resize rules map to fit socket_bind_allow and socket_bind deny rules from cgroup context
| - populate cgroup-bpf maps with rules
| - get bpf programs from bpf skeleton
| - attach programs to unit cgroup using bpf link
| - save bpf link in the unit
* core: make log_unit_xxx_errno() refuse zero errno (Yu Watanabe, 2021-04-16, 1 file, -3/+10)
|
* core: add bpf-foreign unit helpers (Julia Kartseva, 2021-04-09, 1 file, -0/+4)
| - Introduce support of cgroup-bpf programs managed (i.e. compiled, loaded to and unloaded from kernel) externally. Systemd is only responsible for attaching programs to unit cgroup hence the name 'foreign'.
|
| Foreign BPF programs are identified by bpf program ID and attach type.
|
| systemd:
| - Gets kernel FD of BPF program;
| - Makes a unique identifier of BPF program from BPF attach type and program ID. Same program IDs mean the same program, i.e. the same chunk of kernel memory. Even if the same program is passed multiple times, identical (program_id, attach_type) instances are collapsed into one;
| - Attaches programs to unit cgroup.
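The collapsing of identical (program_id, attach_type) pairs can be sketched as a tiny set keyed on both fields. This is only an illustration under assumed names (`ForeignProgramKey`, `foreign_set_put()`); it uses a fixed-size array rather than whatever container the real code uses.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical key type: one foreign program as identified in the commit message. */
typedef struct ForeignProgramKey {
        uint32_t prog_id;      /* kernel-assigned BPF program ID */
        uint32_t attach_type;  /* cgroup attach type */
} ForeignProgramKey;

static bool foreign_key_equal(ForeignProgramKey a, ForeignProgramKey b) {
        return a.prog_id == b.prog_id && a.attach_type == b.attach_type;
}

/* Adds 'key' to a fixed-capacity array used as a set.
 * Returns 1 if added, 0 if an identical key was already present, -1 if the set is full. */
static int foreign_set_put(ForeignProgramKey *set, size_t *n, size_t capacity, ForeignProgramKey key) {
        assert(set);
        assert(n);

        for (size_t i = 0; i < *n; i++)
                if (foreign_key_equal(set[i], key))
                        return 0;

        if (*n >= capacity)
                return -1;

        set[(*n)++] = key;
        return 1;
}
```

Passing the same program twice with the same attach type leaves a single entry, while the same program with a different attach type is kept as a separate entry.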
* tree-wide: return NULL from freeing functions (Zbigniew Jędrzejewski-Szmek, 2021-02-16, 1 file, -2/+2)
| | | | | | I started working on this because I wanted to change how DEFINE_TRIVIAL_CLEANUP_FUNC is defined. Even independently of that change, it's nice to make things more consistent and predictable.
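The convention named in this commit title can be shown with a small sketch. `ThingSketch` and its functions are hypothetical names used only for illustration; the point is that a freeing function returning NULL lets the caller free and reset a pointer in one statement.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical object type used only for this illustration. */
typedef struct ThingSketch {
        int *values;
} ThingSketch;

/* Frees the object (NULL is accepted) and always returns NULL. */
static ThingSketch *thing_free(ThingSketch *t) {
        if (!t)
                return NULL;

        free(t->values);
        free(t);
        return NULL;
}

static ThingSketch *thing_new(size_t n) {
        assert(n > 0);

        ThingSketch *t = calloc(1, sizeof(ThingSketch));
        if (!t)
                return NULL;

        t->values = calloc(n, sizeof(int));
        if (!t->values)
                return thing_free(t);   /* error path: free and return NULL in one step */

        return t;
}
```

A caller would typically write `t = thing_free(t);`, which avoids leaving a dangling pointer behind.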
* core: add Unit.Markers property (Zbigniew Jędrzejewski-Szmek, 2021-02-15, 1 file, -0/+3)
| | | | | | | | | The property is never set by systemd, only reset after a stop or restart or reload. It may externally be set to mark the unit for a later restart/reload. I wasn't sure whether to configure the property only for the types where this makes sense (Service, Swap, etc). But Restart() method is defined on the unit, and also having this always under the same property name is more convenient.
* core: pahole optimization of struct Unit (Zbigniew Jędrzejewski-Szmek, 2021-02-12, 1 file, -10/+12)
| We had a lone 'bool job_running_timeout_set:1', which generated a hole. Let's move things around a bit. The structure is a tiny bit smaller and has fewer holes:
|
|     /* size: 1192, cachelines: 19, members: 149 */
|     /* sum members: 1175, holes: 3, sum holes: 11 */
|     /* sum bitfield members: 27 bits, bit holes: 1, sum bit holes: 7 bits */
|     /* bit_padding: 14 bits */
|     /* last cacheline: 40 bytes */
|
|     /* size: 1184, cachelines: 19, members: 149 */
|     /* sum members: 1175, holes: 1, sum holes: 4 */
|     /* sum bitfield members: 27 bits (3 bytes) */
|     /* bit_padding: 13 bits */
|     /* last cacheline: 32 bytes */
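The effect of a lone bitfield between pointer-sized members can be reproduced with two toy structs. These are hypothetical layouts, not the real `struct Unit`, and exact sizes depend on the ABI; on a typical 64-bit target the first struct is 32 bytes and the second 24.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Bitfields separated by pointers: each flag forces padding up to pointer alignment. */
struct BeforeSketch {
        void *a;
        bool flag1:1;
        void *b;
        bool flag2:1;
};

/* Same members, with the bitfields grouped at the end so they share one storage unit. */
struct AfterSketch {
        void *a;
        void *b;
        bool flag1:1;
        bool flag2:1;
};

/* Number of bytes saved by the reordering on the current ABI. */
static size_t bytes_saved_by_reordering(void) {
        assert(sizeof(struct AfterSketch) <= sizeof(struct BeforeSketch));
        return sizeof(struct BeforeSketch) - sizeof(struct AfterSketch);
}
```

Running pahole on a binary containing such structs would report the holes in the first layout.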
* core: split out a few funcs into unit-serialize.[ch] (Zbigniew Jędrzejewski-Szmek, 2021-02-12, 1 file, -6/+3)
| | | | Just a straightforward move and resulting include file adjustments.
* tree-wide: use -EINVAL for enum invalid values (Zbigniew Jędrzejewski-Szmek, 2021-02-10, 1 file, -2/+2)
| | | | | | | | | As suggested in https://github.com/systemd/systemd/pull/11484#issuecomment-775288617. This does not touch anything exposed in src/systemd. Changing the defines there would be a compatibility break. Note that tests are broken after this commit. They will be fixed in the next one.
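One practical consequence of using -EINVAL as the "invalid" enum value is that a failed lookup can be returned directly as an errno-style error. The enum and functions below are hypothetical examples, not systemd definitions.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical enum following the convention from the commit title. */
typedef enum ColorSketch {
        COLOR_RED,
        COLOR_GREEN,
        COLOR_MAX,
        COLOR_INVALID = -EINVAL,
} ColorSketch;

static const char *const color_table[COLOR_MAX] = {
        [COLOR_RED]   = "red",
        [COLOR_GREEN] = "green",
};

static ColorSketch color_from_string(const char *s) {
        if (!s)
                return COLOR_INVALID;

        for (int i = 0; i < COLOR_MAX; i++)
                if (strcmp(color_table[i], s) == 0)
                        return (ColorSketch) i;

        return COLOR_INVALID;
}

/* The negative enum value doubles as the error code, so no translation is needed. */
static int parse_color(const char *s, ColorSketch *ret) {
        assert(ret);

        ColorSketch c = color_from_string(s);
        if (c < 0)
                return c;

        *ret = c;
        return 0;
}
```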
* core: update settings on the unit and job as the result of ExecCondition= (Anita Zhang, 2021-01-22, 1 file, -1/+0)
| | | | | | | | Update ExecCondition= to set Unit->condition_result and return JOB_DONE in the Job results if the check fails so as to match the current behavior of ConditionXYZ= w.r.t units/jobs dependency checks. Fixes: #18207
* Merge pull request #16603 from benzea/benzea/special-app-slice (Lennart Poettering, 2020-11-11, 1 file, -0/+8)
|\ | | | | Use app.slice by default in user manager (and define special user slices)
| * pid1: expose "extrinsic" status of swaps and mounts (Zbigniew Jędrzejewski-Szmek, 2020-11-10, 1 file, -0/+8)
| | | | | | | | | | The only visible change from this is that we show Extrinsic: yes/no in dumps for swap units (this was already done for mount units).
* | license: LGPL-2.1+ -> LGPL-2.1-or-later (Yu Watanabe, 2020-11-09, 1 file, -1/+1)
|/
* core: systemd-oomd pid1 integration (Anita Zhang, 2020-10-07, 1 file, -1/+4)
|
* core: add ManagedOOM*= properties to configure systemd-oomd on the unit (Anita Zhang, 2020-10-07, 1 file, -0/+3)
| | | | | This adds the hook ups so it can be read with the usual systemd utilities. Used in later commits by systemd-oomd.
* core: propagate triggered unit in more load states (Lennart Poettering, 2020-09-14, 1 file, -0/+4)
| | | | | | | | | | | | | | | In 4c2ef3276735ad9f7fccf33f5bdcbe7d8751e7ec we enabled propagating triggered unit state to the triggering unit for service units in more load states, so that we don't accidentally stop tracking state correctly. Do the same for our other triggering unit states: automounts, paths, and timers. Also, make this an assertion rather than a simple test. After all it should never happen that we get called for half-loaded units or units of the wrong type. The load routines should already have made this impossible.
* core: make log_unit_error() or friends return void (Yu Watanabe, 2020-09-09, 1 file, -12/+14)
|
* Rework how we cache mtime to figure out if units changed (Zbigniew Jędrzejewski-Szmek, 2020-08-31, 1 file, -1/+1)
| | | | | | | | | | | | | | | | | Instead of assuming that more-recently modified directories have higher mtime, just look for any mtime changes, up or down. Since we don't want to remember individual mtimes, hash them to obtain a single value. This should help us behave properly in the case when the time jumps backwards during boot: various files might have mtimes that are in the future, but we won't care. This fixes the following scenario: We have /etc/systemd/system with T1. T1 is initially far in the past. We have /run/systemd/generator with time T2. The time is adjusted backwards, so T2 will be always in the future for a while. Now the user writes new files to /etc/systemd/system, and T1 is updated to T1'. Nevertheless, T1 < T1' << T2. We would consider our cache to be up-to-date, falsely.
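The scenario above can be replayed with a small sketch that compares the "newest mtime" approach with a hash over all mtimes. The FNV-1a hash here is a stand-in chosen for brevity (an assumption, not necessarily the hash the real code uses), and the function names are illustrative.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Old idea: remember only the newest mtime across all directories. */
static uint64_t newest_mtime(const uint64_t *mtimes, size_t n) {
        uint64_t newest = 0;

        assert(mtimes || n == 0);

        for (size_t i = 0; i < n; i++)
                if (mtimes[i] > newest)
                        newest = mtimes[i];

        return newest;
}

/* New idea: fold every mtime into one value, so any change (up or down) is noticed.
 * FNV-1a over the bytes of each mtime, used here purely as an example hash. */
static uint64_t hash_mtimes(const uint64_t *mtimes, size_t n) {
        uint64_t h = UINT64_C(0xcbf29ce484222325);

        assert(mtimes || n == 0);

        for (size_t i = 0; i < n; i++)
                for (unsigned b = 0; b < 8; b++) {
                        h ^= (mtimes[i] >> (8 * b)) & 0xff;
                        h *= UINT64_C(0x100000001b3);
                }

        return h;
}
```

With mtimes {T1, T2} changing to {T1', T2} where T1 < T1' << T2, the newest mtime stays at T2 and the change goes unnoticed, whereas the hash changes.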
* pid1: use the cache mtime not clock to "mark" load attempts (Zbigniew Jędrzejewski-Szmek, 2020-08-31, 1 file, -1/+1)
| | | | | | | | | | | | | | | | | | | | We really only care if the cache has been reloaded between the time when we last attempted to load this unit and now. So instead of recording the actual time we try to load the unit, just store the timestamp of the cache. This has the advantage that we'll notice if the cache mtime jumps forward or backward. Also rename fragment_loadtime to fragment_not_found_time. It only gets set when we failed to load the unit and the old name was suggesting it is always set. In https://bugzilla.redhat.com/show_bug.cgi?id=1871327 (and most likely https://bugzilla.redhat.com/show_bug.cgi?id=1867930 and most likely https://bugzilla.redhat.com/show_bug.cgi?id=1872068) we try to load a non-existent unit over and over from transaction_add_job_and_dependencies(). My understanding is that the clock was in the future during initial boot, so cache_mtime is always in the future (since we don't touch the fs after initial boot), so no matter how many times we try to load the unit and set fragment_loadtime / fragment_not_found_time, it is always higher than cache_mtime, so manager_unit_cache_should_retry_load() always returns true.
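The retry decision described in this commit can be sketched as follows. The struct and function names are hypothetical simplifications; the only point illustrated is that the unit remembers the cache timestamp seen at the failed attempt and retries only when the cache timestamp differs, regardless of the wall clock.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified unit state for this illustration. */
typedef struct LoadStateSketch {
        bool not_found;
        uint64_t fragment_not_found_timestamp;  /* cache timestamp seen at the last failed load */
} LoadStateSketch;

static void unit_mark_not_found(LoadStateSketch *u, uint64_t cache_timestamp) {
        assert(u);
        u->not_found = true;
        u->fragment_not_found_timestamp = cache_timestamp;
}

/* Retry only if the cache was rebuilt since the failed attempt, in either direction. */
static bool unit_should_retry_load(const LoadStateSketch *u, uint64_t cache_timestamp) {
        assert(u);

        if (!u->not_found)
                return false;

        return u->fragment_not_found_timestamp != cache_timestamp;
}
```

Since no clock reading is compared against the cache timestamp, a clock that was in the future during boot no longer causes endless retries in this model.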
* core: add credentials logic (Lennart Poettering, 2020-08-25, 1 file, -1/+1)
| | | | Fixes: #15778 #16060
* core: store timestamps of unit load attempts (Luca Boccassi, 2020-06-30, 1 file, -0/+1)
| | | | | | | | | | | | | | | | | | | When the system is under heavy load, it can happen that the unit cache is refreshed for an unrelated reason (in the test I simulate this by attempting to start a non-existing unit). The new unit is found and accounted for in the cache, but it's ignored since we are loading something else. When we actually look for it, by attempting to start it, the cache is up to date so no refresh happens, and starting fails although we have it loaded in the cache. When the unit state is set to UNIT_NOT_FOUND, mark the timestamp in u->fragment_loadtime. Then when attempting to load again we can check both if the cache itself needs a refresh, OR if it was refreshed AFTER the last failed attempt that resulted in the state being UNIT_NOT_FOUND. Update the test so that this issue reproduces more often.
* Merge pull request #15940 from keszybz/names-set-optimization (Lennart Poettering, 2020-06-10, 1 file, -2/+2)
|\ | | | | Try to optimize away Unit.names set
| * core: store unit aliases in a separate set (Zbigniew Jędrzejewski-Szmek, 2020-06-10, 1 file, -2/+2)
| | We allocated the names set for each unit, but in the majority of cases, we'd put only one name in the set:
| |
| |     $ systemctl show --value -p Names '*'|grep .|grep -v ' '|wc -l
| |     564
| |     $ systemctl show --value -p Names '*'|grep .|grep ' '|wc -l
| |     16
| |
| | So let's add a separate .id field, and only store aliases in the set, and only create the set if there's at least one alias. This requires a bit of gymnastics in the code, but I think this optimization is worth the trouble, because we save one object for many loaded units.
| |
| | In particular set_complete_move() wasn't very useful because the target unit would always have at least one name defined, i.e. the optimization to move the whole set over would never fire.
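The "separate id, lazily created alias container" idea can be sketched with a plain array in place of a set. All names below are hypothetical, strings are borrowed rather than copied, and duplicate detection is left out to keep the example short.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical, simplified unit naming state. */
typedef struct NamedUnitSketch {
        const char *id;         /* primary name, set by the first name added */
        const char **aliases;   /* stays NULL until a second name is added */
        size_t n_aliases;
} NamedUnitSketch;

/* Returns 0 on success, -1 on allocation failure. */
static int unit_add_name_sketch(NamedUnitSketch *u, const char *name) {
        assert(u);
        assert(name);

        if (!u->id) {
                u->id = name;   /* the common case: no extra allocation at all */
                return 0;
        }

        const char **p = realloc(u->aliases, (u->n_aliases + 1) * sizeof(const char *));
        if (!p)
                return -1;

        p[u->n_aliases++] = name;
        u->aliases = p;
        return 0;
}
```

For the many units that only ever have one name, the alias container is never allocated, which is the saving the commit message describes.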
* | core: make unit_set_invocation_id static (Zbigniew Jędrzejewski-Szmek, 2020-05-28, 1 file, -1/+0)
|/ | | | No functional change.
* core: also log about left-over processes during unit stop (Lennart Poettering, 2020-05-26, 1 file, -1/+3)
| | | | | | | | | Only log at LOG_INFO level, i.e. make this informational. During start let's leave it at LOG_WARNING though. Of course, it's ugly leaving processes around like that either in start or in stop, but at start it's more dangerous than on stop, so be tougher there.
* Merge pull request #15265 from fbuihuu/mount-fixes (Lennart Poettering, 2020-05-15, 1 file, -3/+0)
|\ | | | | Mount fixes
| * device: drop refuse_after (Franck Bui, 2020-04-01, 1 file, -3/+0)
| | | | | | | | | | | | | | Scheduling devices after a given unit can be useful to start device *jobs* at a specific time in the transaction, see commit 4195077ab4c823c. This (hidden) change was introduced by commit eef85c4a3f8054d2.
* | core: Update prototype of notify_message, tags list is read only (Benjamin Robin, 2020-05-10, 1 file, -1/+1)
| | | | | | | | | | | | Indicates that the tags list cannot be modified by the notify_message function, since the tags list is created only once for multiple calls to notify_message functions.
* | pid1: convert to the new scheme (Zbigniew Jędrzejewski-Szmek, 2020-05-05, 1 file, -3/+0)
| | | | | | | | | | | | | | | | In all the other cases, I think the code was clearer with the static table. Here, not so much. And because of the existing dump code, the vtables cannot be made static and need to remain exported. I still think it's worth making the change to have the cmdline introspection, but I'm disappointed with how this came out.
* | core: introduce support for cgroup freezer (Michal Sekletár, 2020-04-30, 1 file, -0/+20)
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | With cgroup v2 the cgroup freezer is implemented as a cgroup attribute called cgroup.freeze. A cgroup can be frozen by writing "1" to the file, and the kernel will send us a notification through "cgroup.events" after the operation is finished and the processes in the cgroup have entered a quiescent state, i.e. they are not scheduled to run. Writing "0" to the attribute file does the inverse and process execution is resumed. This commit exposes the above low-level functionality through systemd's DBus API. Each unit type must provide a specialized implementation for these methods, otherwise we return an error. So far only service, scope, and slice unit types provide the support. It is possible to check if a given unit has the support using the CanFreeze() DBus property. Note that the DBus API has a synchronous behavior and we dispatch the reply to freeze/thaw requests only after the kernel has notified us that the requested operation was completed.
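The low-level write described above can be sketched in a few lines. The function name is hypothetical and the caller-supplied cgroup directory is an assumption; the sketch only performs the write and does not wait for the completion notification in "cgroup.events" that the commit message mentions.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Writes "1" (freeze) or "0" (thaw) to <cgroup_path>/cgroup.freeze.
 * Returns 0 on success or a negative errno-style value on failure. */
static int cgroup_set_frozen_sketch(const char *cgroup_path, bool frozen) {
        char path[4096];
        FILE *f;
        int k, r = 0;

        assert(cgroup_path);

        k = snprintf(path, sizeof(path), "%s/cgroup.freeze", cgroup_path);
        if (k < 0 || (size_t) k >= sizeof(path))
                return -ENAMETOOLONG;

        f = fopen(path, "w");
        if (!f)
                return -errno;

        if (fputs(frozen ? "1" : "0", f) == EOF)
                r = -EIO;

        /* Buffered data is flushed on close, so a write error may only show up here. */
        if (fclose(f) == EOF && r == 0)
                r = -EIO;

        return r;
}
```

A complete implementation would additionally watch "cgroup.events" to learn when the freeze or thaw has actually finished.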
* | core: add log_get_max_level check optimization in log_unit_full (Luca Boccassi, 2020-04-21, 1 file, -2/+3)
|/ | | | | | Just as log_full already does, check if the log level would result in logging immediately in the macro in order to avoid doing unnecessary work that adds up in hot spots.
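The optimization described here relies on checking the level inside the macro, so that arguments are not even evaluated for messages that would be dropped. The following is a sketch under assumed names (`log_unit_full_sketch`, `log_internal_sketch`, `max_level_sketch`), not the real macro.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <syslog.h>

static int max_level_sketch = LOG_INFO;   /* stand-in for the configured maximum log level */
static char last_message[256];            /* stand-in for the real log sink */
static int expensive_calls;               /* counts how often arguments were evaluated */

static int log_internal_sketch(int level, const char *format, ...) {
        va_list ap;

        assert(format);
        (void) level;

        va_start(ap, format);
        vsnprintf(last_message, sizeof(last_message), format, ap);
        va_end(ap);
        return 0;
}

/* The level check happens in the macro, so when the message would be dropped
 * the arguments in __VA_ARGS__ are never evaluated. */
#define log_unit_full_sketch(level, ...)                                \
        (LOG_PRI(level) <= max_level_sketch                             \
         ? log_internal_sketch(level, __VA_ARGS__)                      \
         : 0)

/* Example of an argument that is costly to compute. */
static const char *expensive_description(void) {
        expensive_calls++;
        return "details";
}
```

Calling `log_unit_full_sketch(LOG_DEBUG, "%s", expensive_description())` with the maximum level at `LOG_INFO` skips the call to `expensive_description()` entirely.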
* Use Finished instead of Started for Type=oneshot services (#14851) (Zbigniew Jędrzejewski-Szmek, 2020-03-05, 1 file, -0/+3)
| | | | | | | | | | | UnitStatusMessageFormats.finished_job, if present, will be called with the same arguments as job_get_done_status_message_format() to provide a format string appropriate for the context. This commit replaces "Started" with "Finished" for started oneshot units, as mentioned in the referenced issue. Closes #2458.
* pid1: when showing error status, do not switch to status=temporary (Zbigniew Jędrzejewski-Szmek, 2020-03-01, 1 file, -1/+2)
| | | | | | | | | | | | | | We would flip to status=temporary mode on the first error, and then switch back to status=auto after the initial transaction was done. This isn't very useful, because usually all the messages about successfully started units are not related to the original failure. In fact, all those messages most likely cause the information about the prime error to scroll off screen. And if the user requested quiet boot, there's no reason to think that they care about those success messages. Also, when logging about dependency cycles, treat this similarly to a unit error and show the message even if the status is "soft disabled" (before we wouldn't show it in that case).
* core: unit_label_path(): take const unit (Christian Göttsche, 2020-02-04, 1 file, -1/+1)
|
* core: add implicit ordering dep on blockdev@.target from all mount units (Lennart Poettering, 2020-01-21, 1 file, -0/+1)
| | | | | | | | | | | | | | | | | | | | | | | | | | | This way we should be able to order mounts properly against their backing services in case complex storage is used (i.e. LUKS), even if the device path used for mounting the devices is different from the expected device node of the backing service. Specifically, if we have a LUKS device /dev/mapper/foo that is mounted by this name all is trivial as the relationship can be established a priori easily. But if it is mounted via a /dev/disk/by-uuid/ symlink or similar we only can relate the device node generated to the one mounted at the moment the device is actually established. That's because the UUID of the fs is stored inside the encrypted volume and thus not knowable until the volume is set up. This patch tries to improve on this situation: an implicit After=blockdev@.target dependency is generated for all mounts, based on the data from /proc/self/mountinfo, which should be the actual device node, with all symlinks resolved. This means that as soon as the mount is established the ordering via blockdev@.target will work, and that means during shutdown it is honoured, which is what we are looking for. Note that specifying /etc/fstab entries via UUID= for LUKS devices still sucks and shouldn't be done, because it means we cannot know which LUKS device to activate to make an fs appear, and that means unless the volume is set up at boot anyway we can't really handle things automatically when putting together transactions that need the mount.
* core: make a number of functions not used externally static (Lennart Poettering, 2020-01-21, 1 file, -6/+0)
|
* Merge pull request #14424 from poettering/watch-bus-name-rework (Lennart Poettering, 2020-01-15, 1 file, -1/+1)
|\ | | | | pid1: simplify drastically how we watch bus names for service's BusName= setting
| * core: drop initial ListNames() bus call from PID 1 (Lennart Poettering, 2020-01-06, 1 file, -1/+1)
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Previously, right after first connecting to the bus, we'd issue a ListNames() bus call to the driver to figure out which bus names are currently active. This information was then used to initialize the initial state for services that use BusName=. This change removes the whole code for this and replaces it with something vastly simpler. First of all, the ListNames() call was issued synchronously, which meant that if dbus was for some reason synchronously calling into PID 1 we'd deadlock. As it turns out there's now a good chance it does: the nss-systemd userdb hookup means that any user dbus-daemon resolves might result in a varlink call into PID 1, and dbus resolves quite a lot of users while parsing its policy. My original goal was to fix this deadlock. But as it turns out we don't need the ListNames() call at all anymore, since #12957 has been merged. That PR was supposed to fix a race where asynchronous installation of bus matches would cause us to miss the initial owner of a bus name when a service is first started. It fixed it (correctly) by enquiring with GetOwnerName() who currently owns the name, right after installing the match. But this means whenever we start watching a bus name we anyway issue a GetOwnerName() for it, and that means also when first connecting to the bus we don't need to issue ListNames() anymore since that just tells us the same info: which names are currently owned. Hence, let's drop ListNames() and instead make better use of the GetOwnerName() result: if it failed the name is not owned. Also, while we are at it, let's simplify the unit's owner_name_changed() callback: let's drop the "old_owner" argument. We never used that besides logging, and it's hard to synthesize from just the return of a GetOwnerName(), hence don't bother.
* | core: clearly refuse OnFailure= deps on units that can't fail (Lennart Poettering, 2020-01-09, 1 file, -0/+9)
|/ | | | | | | | Similarly, refuse triggering deps on units that cannot trigger. And rework how we ignore After= dependencies on device units, to work the same way. See: #14142
* core: drop 'wants' parameter from unit_add_node_dependency() (Franck Bui, 2019-10-28, 1 file, -1/+1)
| | | | | Since the Wants dependency is no longer automagically added to swap and mount units, this parameter is no longer used, hence this patch drops it.
* core: turn unit_load_fragment_and_dropin_optional() into a flag (Zbigniew Jędrzejewski-Szmek, 2019-10-11, 1 file, -2/+1)
| | | | | | | | | | unit_load_fragment_and_dropin() and unit_load_fragment_and_dropin_optional() are really the same, with one minor difference in behaviour. Let's drop the second function. "_optional" in the name suggests that it's the "dropin" part that is optional. (Which it is, but in this case, we mean the fragment to be optional.) I think the new version with a flag is easier to understand.
* core: add support for RestartKillSignal= to override signal used for restart jobs (Zbigniew Jędrzejewski-Szmek, 2019-10-02, 1 file, -0/+1)
| | | | | | | | v2: - if RestartKillSignal= is not specified, fall back to KillSignal=. This is necessary to preserve backwards compatibility (and keep KillSignal= generally useful).
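The fallback rule from the v2 note can be written as a one-line selection. The struct and function names are hypothetical, and representing "not specified" as 0 is an assumption made for this sketch.

```c
#include <assert.h>
#include <signal.h>

/* Hypothetical, simplified kill settings; 0 means "not specified" in this sketch. */
typedef struct KillContextSketch {
        int kill_signal;           /* KillSignal= */
        int restart_kill_signal;   /* RestartKillSignal= */
} KillContextSketch;

/* Picks the signal for a restart job: RestartKillSignal= if set, else KillSignal=. */
static int restart_kill_signal_sketch(const KillContextSketch *c) {
        assert(c);
        return c->restart_kill_signal != 0 ? c->restart_kill_signal : c->kill_signal;
}
```

Units that never set RestartKillSignal= therefore keep their existing behaviour, which is the backwards compatibility the note refers to.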
* core: add helper function to check job status (Zbigniew Jędrzejewski-Szmek, 2019-10-01, 1 file, -0/+4)
| | | | | Since job.h includes unit.h, and unit.h includes job.h, imports need to be adjusted to make sure unit.h is included first if the helper is used.
* tree-wide: say "ratelimit" not "rate_limit" (Zbigniew Jędrzejewski-Szmek, 2019-09-20, 1 file, -2/+2)
| | | | | | "ratelimit" is a real word, so we don't need to use the other form anywhere. We had both forms in various places, let's standardize on the shorter and more correct one.
* pid1: rename start_limit to start_ratelimit (Zbigniew Jędrzejewski-Szmek, 2019-09-20, 1 file, -1/+1)
| | | | | This way it is clearer what the type is. We also have auto_stop_ratelimit adjacent, and it feels ugly to have a different suffix for those two.