author | Zbigniew Jędrzejewski-Szmek <zbyszek@in.waw.pl> | 2022-04-26 22:04:31 +0200 |
---|---|---|
committer | Zbigniew Jędrzejewski-Szmek <zbyszek@in.waw.pl> | 2022-04-28 15:46:44 +0200 |
commit | 6f83ea60e90b18e44cc979834aae2947afa66834 (patch) | |
tree | 3737967c9eda6e1961ae4d0dbd66727d1227a17d | |
parent | c0a96b1b1d19a06a3828885b10a275c423a5e6f2 (diff) | |
download | systemd-6f83ea60e90b18e44cc979834aae2947afa66834.tar.gz |
man: beef up the description of systemd-oomd.service
The gist of the description is moved from systemd.resource-control
to systemd-oomd man page. Cross-references to OOMPolicy, memory.oom.group,
oomctl, ManagedOOMSwap and ManagedOOMMemoryPressure are added in all
places.
The descriptions are also more down-to-earth: instead of talking
about "taking action" let's just say "kill". We *might* add configuration
for different actions in the future, but we're not there yet, so let's
just describe what we do now.
-rw-r--r-- | man/systemd-oomd.service.xml | 79 |
-rw-r--r-- | man/systemd.resource-control.xml | 30 |
-rw-r--r-- | man/systemd.service.xml | 14 |
3 files changed, 71 insertions, 52 deletions
diff --git a/man/systemd-oomd.service.xml b/man/systemd-oomd.service.xml
index e87a753987..11c9237645 100644
--- a/man/systemd-oomd.service.xml
+++ b/man/systemd-oomd.service.xml
@@ -29,23 +29,36 @@
  <refsect1>
  <title>Description</title>
- <para><command>systemd-oomd</command> is a system service that uses cgroups-v2 and pressure stall information (PSI)
- to monitor and take action on processes before an OOM occurs in kernel space.</para>
-
- <para>You can enable monitoring and actions on units by setting <varname>ManagedOOMSwap=</varname> and/or
- <varname>ManagedOOMMemoryPressure=</varname> to the appropriate value. <command>systemd-oomd</command> will
- periodically poll enabled units' cgroup data to detect when corrective action needs to occur. When an action needs
- to happen, it will only be performed on the descendant cgroups of the enabled units. More precisely, only cgroups with
- <filename>memory.oom.group</filename> set to <constant>1</constant> and leaf cgroup nodes are eligible candidates.
- Action will be taken recursively on all of the processes under the chosen candidate.</para>
-
- <para>See
- <citerefentry><refentrytitle>oomd.conf</refentrytitle><manvolnum>5</manvolnum></citerefentry>
+ <para><command>systemd-oomd</command> is a system service that uses cgroups-v2 and pressure stall
+ information (PSI) to monitor and take corrective action before an OOM occurs in the kernel space.</para>
+
+ <para>You can enable monitoring and actions on units by setting <varname>ManagedOOMSwap=</varname> and
+ <varname>ManagedOOMMemoryPressure=</varname> in the unit configuration, see
+ <citerefentry><refentrytitle>systemd.resource-control</refentrytitle><manvolnum>5</manvolnum></citerefentry>.
+ <command>systemd-oomd</command> retrieves information about such units from <command>systemd</command>
+ when it starts and watches for subsequent changes.</para>
+
+ <para>Cgroups of units with <varname>ManagedOOMSwap=</varname> or
+ <varname>ManagedOOMMemoryPressure=</varname> set to <option>kill</option> will be monitored.
+ <command>systemd-oomd</command> periodically polls PSI statistics for the system and those cgroups to
+ decide when to take action. If the configured limits are exceeded, <command>systemd-oomd</command> will
+ select a cgroup to terminate, and send <constant>SIGKILL</constant> to all processes in it. Note that
+ only descendant cgroups are eligible candidates for killing; the unit with its property set to
+ <option>kill</option> is not a candidate (unless one of its ancestors set their property to
+ <option>kill</option>). Also only leaf cgroups and cgroups with <filename>memory.oom.group</filename> set
+ to <constant>1</constant> are eligible candidates; see <varname>OOMPolicy=</varname> in
+ <citerefentry><refentrytitle>systemd.service</refentrytitle><manvolnum>5</manvolnum></citerefentry>.
+ </para>
+
+ <para><citerefentry><refentrytitle>oomctl</refentrytitle><manvolnum>1</manvolnum></citerefentry> can
+ be used to list monitored cgroups and pressure information.</para>
+
+ <para>See <citerefentry><refentrytitle>oomd.conf</refentrytitle><manvolnum>5</manvolnum></citerefentry>
  for more information about the configuration of this service.</para>
  </refsect1>
  <refsect1>
- <title>Setup Information</title>
+ <title>System requirements and configuration</title>
  <para>The system must be running systemd with a full unified cgroup hierarchy for the expected cgroups-v2 features. Furthermore, memory accounting must be turned on for all units monitored by <command>systemd-oomd</command>.
@@ -53,23 +66,25 @@
  is set to <constant>true</constant> in <citerefentry><refentrytitle>systemd-system.conf</refentrytitle><manvolnum>5</manvolnum></citerefentry>.</para>
- <para>You will need a kernel compiled with PSI support. This is available in Linux 4.20 and above.</para>
+ <para>The kernel must be compiled with PSI support. This is available in Linux 4.20 and above.</para>

- <para>It is highly recommended for the system to have swap enabled for <command>systemd-oomd</command> to function
- optimally. With swap enabled, the system spends enough time swapping pages to let <command>systemd-oomd</command> react.
- Without swap, the system enters a livelocked state much more quickly and may prevent <command>systemd-oomd</command>
- from responding in a reasonable amount of time. See
- <ulink url="https://chrisdown.name/2018/01/02/in-defence-of-swap.html">"In defence of swap: common misconceptions"</ulink>
- for more details on swap. Any swap-based actions on systems without swap will be ignored. While
- <command>systemd-oomd</command> can perform pressure-based actions on a system without swap, the pressure increases
- will be more abrupt and may require more tuning to get the desired thresholds and behavior.</para>
+ <para>It is highly recommended for the system to have swap enabled for <command>systemd-oomd</command> to
+ function optimally. With swap enabled, the system spends enough time swapping pages to let
+ <command>systemd-oomd</command> react. Without swap, the system enters a livelocked state much more
+ quickly and may prevent <command>systemd-oomd</command> from responding in a reasonable amount of
+ time. See <ulink url="https://chrisdown.name/2018/01/02/in-defence-of-swap.html">"In defence of swap:
+ common misconceptions"</ulink> for more details on swap. Any swap-based actions on systems without swap
+ will be ignored. While <command>systemd-oomd</command> can perform pressure-based actions on such a
+ system, the pressure increases will be more abrupt and may require more tuning to get the desired
+ thresholds and behavior.</para>
  <para>Be aware that if you intend to enable monitoring and actions on <filename>user.slice</filename>,
- <filename>user-$UID.slice</filename>, or their ancestor cgroups, it is highly recommended that your programs be
- managed by the systemd user manager to prevent running too many processes under the same session scope (and thus
- avoid a situation where memory intensive tasks trigger <command>systemd-oomd</command> to kill everything under the
- cgroup). If you're using a desktop environment like GNOME, it already spawns many session components with the
- systemd user manager.</para>
+ <filename>user-$UID.slice</filename>, or their ancestor cgroups, it is highly recommended that your
+ programs be managed by the systemd user manager to prevent running too many processes under the same
+ session scope (and thus avoid a situation where memory intensive tasks trigger
+ <command>systemd-oomd</command> to kill everything under the cgroup). If you're using a desktop
+ environment like GNOME or KDE, it already spawns many session components with the systemd user manager.
+ </para>
  </refsect1>
  <refsect1>
@@ -79,11 +94,11 @@
  <filename>-.slice</filename>, and allowing all descendant cgroups to be eligible candidates may make the most sense.</para>
- <para><varname>ManagedOOMMemoryPressure=</varname> tends to work better on the cgroups below the root slice
- <filename>-.slice</filename>. For units which tend to have processes that are less latency sensitive (e.g.
- <filename>system.slice</filename>), a higher limit like the default of 60% may be acceptable, as those processes
- can usually ride out slowdowns caused by lack of memory without serious consequences. However, something like
- <filename>user@$UID.service</filename> may prefer a much lower value like 40%.</para>
+ <para><varname>ManagedOOMMemoryPressure=</varname> tends to work better on the cgroups below the root
+ slice. For units which tend to have processes that are less latency sensitive (e.g.
+ <filename>system.slice</filename>), a higher limit like the default of 60% may be acceptable, as those
+ processes can usually ride out slowdowns caused by lack of memory without serious consequences. However,
+ something like <filename>user@$UID.service</filename> may prefer a much lower value like 40%.</para>
  </refsect1>
  <refsect1>
diff --git a/man/systemd.resource-control.xml b/man/systemd.resource-control.xml
index d9edb6ab74..ce03a2f1a6 100644
--- a/man/systemd.resource-control.xml
+++ b/man/systemd.resource-control.xml
@@ -1108,24 +1108,24 @@ DeviceAllow=/dev/loop-control
  <citerefentry><refentrytitle>systemd-oomd.service</refentrytitle><manvolnum>8</manvolnum></citerefentry> will act on this unit's cgroups. Defaults to <option>auto</option>.</para>
- <para>When set to <option>kill</option>, <command>systemd-oomd</command> will actively monitor this unit's
- cgroup metrics to decide whether it needs to act. If the cgroup passes the limits set by
- <citerefentry><refentrytitle>oomd.conf</refentrytitle><manvolnum>5</manvolnum></citerefentry> or its
- overrides, <command>systemd-oomd</command> will send a <constant>SIGKILL</constant> to all of the processes
- under the chosen candidate cgroup. Note that only descendant cgroups can be eligible candidates for killing;
- the unit that set its property to <option>kill</option> is not a candidate (unless one of its ancestors set
- their property to <option>kill</option>). You can find more details on candidates and kill behavior at
+ <para>When set to <option>kill</option>, the unit becomes a candidate for monitoring by
+ <command>systemd-oomd</command>. If the cgroup passes the limits set by
+ <citerefentry><refentrytitle>oomd.conf</refentrytitle><manvolnum>5</manvolnum></citerefentry> or
+ the unit configuration, <command>systemd-oomd</command> will select a descendant cgroup and send
+ <constant>SIGKILL</constant> to all of the processes under it. You can find more details on
+ candidates and kill behavior at
  <citerefentry><refentrytitle>systemd-oomd.service</refentrytitle><manvolnum>8</manvolnum></citerefentry>
- and <citerefentry><refentrytitle>oomd.conf</refentrytitle><manvolnum>5</manvolnum></citerefentry>. Setting
- either of these properties to <option>kill</option> will also automatically acquire
+ and
+ <citerefentry><refentrytitle>oomd.conf</refentrytitle><manvolnum>5</manvolnum></citerefentry>.</para>
+
+ <para>Setting either of these properties to <option>kill</option> will also result in
  <varname>After=</varname> and <varname>Wants=</varname> dependencies on
- <filename>systemd-oomd.service</filename> unless <varname>DefaultDependencies=no</varname>.
- </para>
+ <filename>systemd-oomd.service</filename> unless <varname>DefaultDependencies=no</varname>.</para>

- <para>When set to <option>auto</option>, <command>systemd-oomd</command> will not actively use this cgroup's
- data for monitoring and detection. However, if an ancestor cgroup has one of these properties set to
- <option>kill</option>, a unit with <option>auto</option> can still be an eligible candidate for
- <command>systemd-oomd</command> to act on.</para>
+ <para>When set to <option>auto</option>, <command>systemd-oomd</command> will not actively use this
+ cgroup's data for monitoring and detection. However, if an ancestor cgroup has one of these
+ properties set to <option>kill</option>, a unit with <option>auto</option> can still be a candidate
+ for <command>systemd-oomd</command> to terminate.</para>
  </listitem>
  </varlistentry>
diff --git a/man/systemd.service.xml b/man/systemd.service.xml
index 4e4a9732e4..ad303d440b 100644
--- a/man/systemd.service.xml
+++ b/man/systemd.service.xml
@@ -1130,8 +1130,12 @@
  killed by the kernel's OOM killer this is logged but the service continues running. If set to <constant>stop</constant> the event is logged but the service is terminated cleanly by the service manager. If set to <constant>kill</constant> and one of the service's processes is killed by the OOM
- killer the kernel is instructed to kill all remaining processes of the service, too. Defaults to the
- setting <varname>DefaultOOMPolicy=</varname> in
+ killer the kernel is instructed to kill all remaining processes of the service too, by setting the
+ <filename>memory.oom.group</filename> attribute to <constant>1</constant>; also see <ulink
+ url="https://www.kernel.org/doc/html/latest/admin-guide/cgroup-v2.html">kernel documentation</ulink>.
+ </para>
+
+ <para>Defaults to the setting <varname>DefaultOOMPolicy=</varname> in
  <citerefentry><refentrytitle>systemd-system.conf</refentrytitle><manvolnum>5</manvolnum></citerefentry> is set to, except for services where <varname>Delegate=</varname> is turned on, where it defaults to <constant>continue</constant>.</para>
@@ -1142,9 +1146,9 @@
  <citerefentry><refentrytitle>systemd.exec</refentrytitle><manvolnum>5</manvolnum></citerefentry> for details.</para>
- <para>This setting also applies to <command>systemd-oomd</command>, similar to kernel OOM kills
- this setting determines the state of the service after <command>systemd-oomd</command> kills a cgroup associated
- with the service.</para></listitem>
+ <para>This setting also applies to <command>systemd-oomd</command>, similar to the kernel OOM kills
+ this setting determines the state of the service after <command>systemd-oomd</command> kills a cgroup
+ associated with the service.</para></listitem>
  </varlistentry>
  </variablelist>
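
For illustration only (not part of this commit): the settings described in the updated man pages could be applied with a drop-in along the following lines. This is a minimal sketch under stated assumptions. The drop-in path and file name are hypothetical, and the 40% figure is just the example value the man page text mentions for user@$UID.service. ManagedOOMMemoryPressureLimit= is the per-unit limit option documented in systemd.resource-control(5).

```ini
# Hypothetical drop-in: /etc/systemd/system/user@.service.d/10-oomd.conf
[Service]
# Ask systemd-oomd to monitor this unit's cgroup; only descendant
# cgroups are candidates for killing, as the man page text explains.
ManagedOOMMemoryPressure=kill
# Example threshold taken from the man page text; tune per workload.
ManagedOOMMemoryPressureLimit=40%
```

After reloading the manager configuration, oomctl(1) can be used to check whether the cgroup shows up among the monitored ones.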