summaryrefslogtreecommitdiff
path: root/Documentation/admin-guide
diff options
context:
space:
mode:
Diffstat (limited to 'Documentation/admin-guide')
-rw-r--r--Documentation/admin-guide/kernel-parameters.txt35
-rw-r--r--Documentation/admin-guide/mm/damon/index.rst1
-rw-r--r--Documentation/admin-guide/mm/damon/lru_sort.rst294
-rw-r--r--Documentation/admin-guide/mm/damon/reclaim.rst6
-rw-r--r--Documentation/admin-guide/mm/damon/usage.rst2
-rw-r--r--Documentation/admin-guide/mm/index.rst1
-rw-r--r--Documentation/admin-guide/mm/shrinker_debugfs.rst135
-rw-r--r--Documentation/admin-guide/mm/userfaultfd.rst40
-rw-r--r--Documentation/admin-guide/sysctl/vm.rst8
9 files changed, 499 insertions, 23 deletions
diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index 3ebc22e524b0..c408070f30a3 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -1667,6 +1667,19 @@
hlt [BUGS=ARM,SH]
+ hostname= [KNL] Set the hostname (aka UTS nodename).
+ Format: <string>
+ This allows setting the system's hostname during early
+ startup. This sets the name returned by gethostname.
+ Using this parameter to set the hostname makes it
+ possible to ensure the hostname is correctly set before
+ any userspace processes run, avoiding the possibility
+ that a process may call gethostname before the hostname
+ has been explicitly set, resulting in the calling
+ process getting an incorrect result. The string must
+ not exceed the maximum allowed hostname length (usually
+ 64 characters) and will be truncated otherwise.
+
hpet= [X86-32,HPET] option to control HPET usage
Format: { enable (default) | disable | force |
verbose }
@@ -1722,9 +1735,11 @@
Built with CONFIG_HUGETLB_PAGE_OPTIMIZE_VMEMMAP_DEFAULT_ON=y,
the default is on.
- This is not compatible with memory_hotplug.memmap_on_memory.
- If both parameters are enabled, hugetlb_free_vmemmap takes
- precedence over memory_hotplug.memmap_on_memory.
+ Note that the vmemmap pages may be allocated from the added
+ memory block itself when memory_hotplug.memmap_on_memory is
+ enabled, those vmemmap pages cannot be optimized even if this
+ feature is enabled. Other vmemmap pages not allocated from
+ the added memory block itself do not be affected.
hung_task_panic=
[KNL] Should the hung task detector generate panics.
@@ -3083,10 +3098,12 @@
[KNL,X86,ARM] Boolean flag to enable this feature.
Format: {on | off (default)}
When enabled, runtime hotplugged memory will
- allocate its internal metadata (struct pages)
- from the hotadded memory which will allow to
- hotadd a lot of memory without requiring
- additional memory to do so.
+ allocate its internal metadata (struct pages,
+ those vmemmap pages cannot be optimized even
+ if hugetlb_free_vmemmap is enabled) from the
+ hotadded memory which will allow to hotadd a
+ lot of memory without requiring additional
+ memory to do so.
This feature is disabled by default because it
has some implication on large (e.g. GB)
allocations in some configurations (e.g. small
@@ -3096,10 +3113,6 @@
Note that even when enabled, there are a few cases where
the feature is not effective.
- This is not compatible with hugetlb_free_vmemmap. If
- both parameters are enabled, hugetlb_free_vmemmap takes
- precedence over memory_hotplug.memmap_on_memory.
-
memtest= [KNL,X86,ARM,M68K,PPC,RISCV] Enable memtest
Format: <integer>
default : 0 <disable>
diff --git a/Documentation/admin-guide/mm/damon/index.rst b/Documentation/admin-guide/mm/damon/index.rst
index c4681fa69b9c..05500042f777 100644
--- a/Documentation/admin-guide/mm/damon/index.rst
+++ b/Documentation/admin-guide/mm/damon/index.rst
@@ -14,3 +14,4 @@ optimize those.
start
usage
reclaim
+ lru_sort
diff --git a/Documentation/admin-guide/mm/damon/lru_sort.rst b/Documentation/admin-guide/mm/damon/lru_sort.rst
new file mode 100644
index 000000000000..c09cace80651
--- /dev/null
+++ b/Documentation/admin-guide/mm/damon/lru_sort.rst
@@ -0,0 +1,294 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+=============================
+DAMON-based LRU-lists Sorting
+=============================
+
+DAMON-based LRU-lists Sorting (DAMON_LRU_SORT) is a static kernel module that
+aimed to be used for proactive and lightweight data access pattern based
+(de)prioritization of pages on their LRU-lists for making LRU-lists a more
+trusworthy data access pattern source.
+
+Where Proactive LRU-lists Sorting is Required?
+==============================================
+
+As page-granularity access checking overhead could be significant on huge
+systems, LRU lists are normally not proactively sorted but partially and
+reactively sorted for special events including specific user requests, system
+calls and memory pressure. As a result, LRU lists are sometimes not so
+perfectly prepared to be used as a trustworthy access pattern source for some
+situations including reclamation target pages selection under sudden memory
+pressure.
+
+Because DAMON can identify access patterns of best-effort accuracy while
+inducing only user-specified range of overhead, proactively running
+DAMON_LRU_SORT could be helpful for making LRU lists more trustworthy access
+pattern source with low and controlled overhead.
+
+How It Works?
+=============
+
+DAMON_LRU_SORT finds hot pages (pages of memory regions that showing access
+rates that higher than a user-specified threshold) and cold pages (pages of
+memory regions that showing no access for a time that longer than a
+user-specified threshold) using DAMON, and prioritizes hot pages while
+deprioritizing cold pages on their LRU-lists. To avoid it consuming too much
+CPU for the prioritizations, a CPU time usage limit can be configured. Under
+the limit, it prioritizes and deprioritizes more hot and cold pages first,
+respectively. System administrators can also configure under what situation
+this scheme should automatically activated and deactivated with three memory
+pressure watermarks.
+
+Its default parameters for hotness/coldness thresholds and CPU quota limit are
+conservatively chosen. That is, the module under its default parameters could
+be widely used without harm for common situations while providing a level of
+benefits for systems having clear hot/cold access patterns under memory
+pressure while consuming only a limited small portion of CPU time.
+
+Interface: Module Parameters
+============================
+
+To use this feature, you should first ensure your system is running on a kernel
+that is built with ``CONFIG_DAMON_LRU_SORT=y``.
+
+To let sysadmins enable or disable it and tune for the given system,
+DAMON_LRU_SORT utilizes module parameters. That is, you can put
+``damon_lru_sort.<parameter>=<value>`` on the kernel boot command line or write
+proper values to ``/sys/modules/damon_lru_sort/parameters/<parameter>`` files.
+
+Below are the description of each parameter.
+
+enabled
+-------
+
+Enable or disable DAMON_LRU_SORT.
+
+You can enable DAMON_LRU_SORT by setting the value of this parameter as ``Y``.
+Setting it as ``N`` disables DAMON_LRU_SORT. Note that DAMON_LRU_SORT could do
+no real monitoring and LRU-lists sorting due to the watermarks-based activation
+condition. Refer to below descriptions for the watermarks parameter for this.
+
+commit_inputs
+-------------
+
+Make DAMON_LRU_SORT reads the input parameters again, except ``enabled``.
+
+Input parameters that updated while DAMON_LRU_SORT is running are not applied
+by default. Once this parameter is set as ``Y``, DAMON_LRU_SORT reads values
+of parametrs except ``enabled`` again. Once the re-reading is done, this
+parameter is set as ``N``. If invalid parameters are found while the
+re-reading, DAMON_LRU_SORT will be disabled.
+
+hot_thres_access_freq
+---------------------
+
+Access frequency threshold for hot memory regions identification in permil.
+
+If a memory region is accessed in frequency of this or higher, DAMON_LRU_SORT
+identifies the region as hot, and mark it as accessed on the LRU list, so that
+it could not be reclaimed under memory pressure. 50% by default.
+
+cold_min_age
+------------
+
+Time threshold for cold memory regions identification in microseconds.
+
+If a memory region is not accessed for this or longer time, DAMON_LRU_SORT
+identifies the region as cold, and mark it as unaccessed on the LRU list, so
+that it could be reclaimed first under memory pressure. 120 seconds by
+default.
+
+quota_ms
+--------
+
+Limit of time for trying the LRU lists sorting in milliseconds.
+
+DAMON_LRU_SORT tries to use only up to this time within a time window
+(quota_reset_interval_ms) for trying LRU lists sorting. This can be used
+for limiting CPU consumption of DAMON_LRU_SORT. If the value is zero, the
+limit is disabled.
+
+10 ms by default.
+
+quota_reset_interval_ms
+-----------------------
+
+The time quota charge reset interval in milliseconds.
+
+The charge reset interval for the quota of time (quota_ms). That is,
+DAMON_LRU_SORT does not try LRU-lists sorting for more than quota_ms
+milliseconds or quota_sz bytes within quota_reset_interval_ms milliseconds.
+
+1 second by default.
+
+wmarks_interval
+---------------
+
+The watermarks check time interval in microseconds.
+
+Minimal time to wait before checking the watermarks, when DAMON_LRU_SORT is
+enabled but inactive due to its watermarks rule. 5 seconds by default.
+
+wmarks_high
+-----------
+
+Free memory rate (per thousand) for the high watermark.
+
+If free memory of the system in bytes per thousand bytes is higher than this,
+DAMON_LRU_SORT becomes inactive, so it does nothing but periodically checks the
+watermarks. 200 (20%) by default.
+
+wmarks_mid
+----------
+
+Free memory rate (per thousand) for the middle watermark.
+
+If free memory of the system in bytes per thousand bytes is between this and
+the low watermark, DAMON_LRU_SORT becomes active, so starts the monitoring and
+the LRU-lists sorting. 150 (15%) by default.
+
+wmarks_low
+----------
+
+Free memory rate (per thousand) for the low watermark.
+
+If free memory of the system in bytes per thousand bytes is lower than this,
+DAMON_LRU_SORT becomes inactive, so it does nothing but periodically checks the
+watermarks. 50 (5%) by default.
+
+sample_interval
+---------------
+
+Sampling interval for the monitoring in microseconds.
+
+The sampling interval of DAMON for the cold memory monitoring. Please refer to
+the DAMON documentation (:doc:`usage`) for more detail. 5ms by default.
+
+aggr_interval
+-------------
+
+Aggregation interval for the monitoring in microseconds.
+
+The aggregation interval of DAMON for the cold memory monitoring. Please
+refer to the DAMON documentation (:doc:`usage`) for more detail. 100ms by
+default.
+
+min_nr_regions
+--------------
+
+Minimum number of monitoring regions.
+
+The minimal number of monitoring regions of DAMON for the cold memory
+monitoring. This can be used to set lower-bound of the monitoring quality.
+But, setting this too high could result in increased monitoring overhead.
+Please refer to the DAMON documentation (:doc:`usage`) for more detail. 10 by
+default.
+
+max_nr_regions
+--------------
+
+Maximum number of monitoring regions.
+
+The maximum number of monitoring regions of DAMON for the cold memory
+monitoring. This can be used to set upper-bound of the monitoring overhead.
+However, setting this too low could result in bad monitoring quality. Please
+refer to the DAMON documentation (:doc:`usage`) for more detail. 1000 by
+defaults.
+
+monitor_region_start
+--------------------
+
+Start of target memory region in physical address.
+
+The start physical address of memory region that DAMON_LRU_SORT will do work
+against. By default, biggest System RAM is used as the region.
+
+monitor_region_end
+------------------
+
+End of target memory region in physical address.
+
+The end physical address of memory region that DAMON_LRU_SORT will do work
+against. By default, biggest System RAM is used as the region.
+
+kdamond_pid
+-----------
+
+PID of the DAMON thread.
+
+If DAMON_LRU_SORT is enabled, this becomes the PID of the worker thread. Else,
+-1.
+
+nr_lru_sort_tried_hot_regions
+-----------------------------
+
+Number of hot memory regions that tried to be LRU-sorted.
+
+bytes_lru_sort_tried_hot_regions
+--------------------------------
+
+Total bytes of hot memory regions that tried to be LRU-sorted.
+
+nr_lru_sorted_hot_regions
+-------------------------
+
+Number of hot memory regions that successfully be LRU-sorted.
+
+bytes_lru_sorted_hot_regions
+----------------------------
+
+Total bytes of hot memory regions that successfully be LRU-sorted.
+
+nr_hot_quota_exceeds
+--------------------
+
+Number of times that the time quota limit for hot regions have exceeded.
+
+nr_lru_sort_tried_cold_regions
+------------------------------
+
+Number of cold memory regions that tried to be LRU-sorted.
+
+bytes_lru_sort_tried_cold_regions
+---------------------------------
+
+Total bytes of cold memory regions that tried to be LRU-sorted.
+
+nr_lru_sorted_cold_regions
+--------------------------
+
+Number of cold memory regions that successfully be LRU-sorted.
+
+bytes_lru_sorted_cold_regions
+-----------------------------
+
+Total bytes of cold memory regions that successfully be LRU-sorted.
+
+nr_cold_quota_exceeds
+---------------------
+
+Number of times that the time quota limit for cold regions have exceeded.
+
+Example
+=======
+
+Below runtime example commands make DAMON_LRU_SORT to find memory regions
+having >=50% access frequency and LRU-prioritize while LRU-deprioritizing
+memory regions that not accessed for 120 seconds. The prioritization and
+deprioritization is limited to be done using only up to 1% CPU time to avoid
+DAMON_LRU_SORT consuming too much CPU time for the (de)prioritization. It also
+asks DAMON_LRU_SORT to do nothing if the system's free memory rate is more than
+50%, but start the real works if it becomes lower than 40%. If DAMON_RECLAIM
+doesn't make progress and therefore the free memory rate becomes lower than
+20%, it asks DAMON_LRU_SORT to do nothing again, so that we can fall back to
+the LRU-list based page granularity reclamation. ::
+
+ # cd /sys/modules/damon_lru_sort/parameters
+ # echo 500 > hot_thres_access_freq
+ # echo 120000000 > cold_min_age
+ # echo 10 > quota_ms
+ # echo 1000 > quota_reset_interval_ms
+ # echo 500 > wmarks_high
+ # echo 400 > wmarks_mid
+ # echo 200 > wmarks_low
+ # echo Y > enabled
diff --git a/Documentation/admin-guide/mm/damon/reclaim.rst b/Documentation/admin-guide/mm/damon/reclaim.rst
index a8bd3bd29959..4f1479a11e63 100644
--- a/Documentation/admin-guide/mm/damon/reclaim.rst
+++ b/Documentation/admin-guide/mm/damon/reclaim.rst
@@ -48,12 +48,6 @@ DAMON_RECLAIM utilizes module parameters. That is, you can put
``damon_reclaim.<parameter>=<value>`` on the kernel boot command line or write
proper values to ``/sys/modules/damon_reclaim/parameters/<parameter>`` files.
-Note that the parameter values except ``enabled`` are applied only when
-DAMON_RECLAIM starts. Therefore, if you want to apply new parameter values in
-runtime and DAMON_RECLAIM is already enabled, you should disable and re-enable
-it via ``enabled`` parameter file. Writing of the new values to proper
-parameter values should be done before the re-enablement.
-
Below are the description of each parameter.
enabled
diff --git a/Documentation/admin-guide/mm/damon/usage.rst b/Documentation/admin-guide/mm/damon/usage.rst
index 5540a3a40fc9..d52f572a9029 100644
--- a/Documentation/admin-guide/mm/damon/usage.rst
+++ b/Documentation/admin-guide/mm/damon/usage.rst
@@ -264,6 +264,8 @@ that can be written to and read from the file and their meaning are as below.
- ``pageout``: Call ``madvise()`` for the region with ``MADV_PAGEOUT``
- ``hugepage``: Call ``madvise()`` for the region with ``MADV_HUGEPAGE``
- ``nohugepage``: Call ``madvise()`` for the region with ``MADV_NOHUGEPAGE``
+ - ``lru_prio``: Prioritize the region on its LRU lists.
+ - ``lru_deprio``: Deprioritize the region on its LRU lists.
- ``stat``: Do nothing but count the statistics
schemes/<N>/access_pattern/
diff --git a/Documentation/admin-guide/mm/index.rst b/Documentation/admin-guide/mm/index.rst
index c21b5823f126..1bd11118dfb1 100644
--- a/Documentation/admin-guide/mm/index.rst
+++ b/Documentation/admin-guide/mm/index.rst
@@ -36,6 +36,7 @@ the Linux memory management.
numa_memory_policy
numaperf
pagemap
+ shrinker_debugfs
soft-dirty
swap_numa
transhuge
diff --git a/Documentation/admin-guide/mm/shrinker_debugfs.rst b/Documentation/admin-guide/mm/shrinker_debugfs.rst
new file mode 100644
index 000000000000..3887f0b294fe
--- /dev/null
+++ b/Documentation/admin-guide/mm/shrinker_debugfs.rst
@@ -0,0 +1,135 @@
+.. _shrinker_debugfs:
+
+==========================
+Shrinker Debugfs Interface
+==========================
+
+Shrinker debugfs interface provides a visibility into the kernel memory
+shrinkers subsystem and allows to get information about individual shrinkers
+and interact with them.
+
+For each shrinker registered in the system a directory in **<debugfs>/shrinker/**
+is created. The directory's name is composed from the shrinker's name and an
+unique id: e.g. *kfree_rcu-0* or *sb-xfs:vda1-36*.
+
+Each shrinker directory contains **count** and **scan** files, which allow to
+trigger *count_objects()* and *scan_objects()* callbacks for each memcg and
+numa node (if applicable).
+
+Usage:
+------
+
+1. *List registered shrinkers*
+
+ ::
+
+ $ cd /sys/kernel/debug/shrinker/
+ $ ls
+ dquota-cache-16 sb-devpts-28 sb-proc-47 sb-tmpfs-42
+ mm-shadow-18 sb-devtmpfs-5 sb-proc-48 sb-tmpfs-43
+ mm-zspool:zram0-34 sb-hugetlbfs-17 sb-pstore-31 sb-tmpfs-44
+ rcu-kfree-0 sb-hugetlbfs-33 sb-rootfs-2 sb-tmpfs-49
+ sb-aio-20 sb-iomem-12 sb-securityfs-6 sb-tracefs-13
+ sb-anon_inodefs-15 sb-mqueue-21 sb-selinuxfs-22 sb-xfs:vda1-36
+ sb-bdev-3 sb-nsfs-4 sb-sockfs-8 sb-zsmalloc-19
+ sb-bpf-32 sb-pipefs-14 sb-sysfs-26 thp-deferred_split-10
+ sb-btrfs:vda2-24 sb-proc-25 sb-tmpfs-1 thp-zero-9
+ sb-cgroup2-30 sb-proc-39 sb-tmpfs-27 xfs-buf:vda1-37
+ sb-configfs-23 sb-proc-41 sb-tmpfs-29 xfs-inodegc:vda1-38
+ sb-dax-11 sb-proc-45 sb-tmpfs-35
+ sb-debugfs-7 sb-proc-46 sb-tmpfs-40
+
+2. *Get information about a specific shrinker*
+
+ ::
+
+ $ cd sb-btrfs\:vda2-24/
+ $ ls
+ count scan
+
+3. *Count objects*
+
+ Each line in the output has the following format::
+
+ <cgroup inode id> <nr of objects on node 0> <nr of objects on node 1> ...
+ <cgroup inode id> <nr of objects on node 0> <nr of objects on node 1> ...
+ ...
+
+ If there are no objects on all numa nodes, a line is omitted. If there
+ are no objects at all, the output might be empty.
+
+ If the shrinker is not memcg-aware or CONFIG_MEMCG is off, 0 is printed
+ as cgroup inode id. If the shrinker is not numa-aware, 0's are printed
+ for all nodes except the first one.
+ ::
+
+ $ cat count
+ 1 224 2
+ 21 98 0
+ 55 818 10
+ 2367 2 0
+ 2401 30 0
+ 225 13 0
+ 599 35 0
+ 939 124 0
+ 1041 3 0
+ 1075 1 0
+ 1109 1 0
+ 1279 60 0
+ 1313 7 0
+ 1347 39 0
+ 1381 3 0
+ 1449 14 0
+ 1483 63 0
+ 1517 53 0
+ 1551 6 0
+ 1585 1 0
+ 1619 6 0
+ 1653 40 0
+ 1687 11 0
+ 1721 8 0
+ 1755 4 0
+ 1789 52 0
+ 1823 888 0
+ 1857 1 0
+ 1925 2 0
+ 1959 32 0
+ 2027 22 0
+ 2061 9 0
+ 2469 799 0
+ 2537 861 0
+ 2639 1 0
+ 2707 70 0
+ 2775 4 0
+ 2877 84 0
+ 293 1 0
+ 735 8 0
+
+4. *Scan objects*
+
+ The expected input format::
+
+ <cgroup inode id> <numa id> <number of objects to scan>
+
+ For a non-memcg-aware shrinker or on a system with no memory
+ cgrups **0** should be passed as cgroup id.
+ ::
+
+ $ cd /sys/kernel/debug/shrinker/
+ $ cd sb-btrfs\:vda2-24/
+
+ $ cat count | head -n 5
+ 1 212 0
+ 21 97 0
+ 55 802 5
+ 2367 2 0
+ 225 13 0
+
+ $ echo "55 0 200" > scan
+
+ $ cat count | head -n 5
+ 1 212 0
+ 21 96 0
+ 55 752 5
+ 2367 2 0
+ 225 13 0
diff --git a/Documentation/admin-guide/mm/userfaultfd.rst b/Documentation/admin-guide/mm/userfaultfd.rst
index 6528036093e1..9bae1acd431f 100644
--- a/Documentation/admin-guide/mm/userfaultfd.rst
+++ b/Documentation/admin-guide/mm/userfaultfd.rst
@@ -17,7 +17,10 @@ of the ``PROT_NONE+SIGSEGV`` trick.
Design
======
-Userfaults are delivered and resolved through the ``userfaultfd`` syscall.
+Userspace creates a new userfaultfd, initializes it, and registers one or more
+regions of virtual memory with it. Then, any page faults which occur within the
+region(s) result in a message being delivered to the userfaultfd, notifying
+userspace of the fault.
The ``userfaultfd`` (aside from registering and unregistering virtual
memory ranges) provides two primary functionalities:
@@ -34,12 +37,11 @@ The real advantage of userfaults if compared to regular virtual memory
management of mremap/mprotect is that the userfaults in all their
operations never involve heavyweight structures like vmas (in fact the
``userfaultfd`` runtime load never takes the mmap_lock for writing).
-
Vmas are not suitable for page- (or hugepage) granular fault tracking
when dealing with virtual address spaces that could span
Terabytes. Too many vmas would be needed for that.
-The ``userfaultfd`` once opened by invoking the syscall, can also be
+The ``userfaultfd``, once created, can also be
passed using unix domain sockets to a manager process, so the same
manager process could handle the userfaults of a multitude of
different processes without them being aware about what is going on
@@ -50,6 +52,38 @@ is a corner case that would currently return ``-EBUSY``).
API
===
+Creating a userfaultfd
+----------------------
+
+There are two ways to create a new userfaultfd, each of which provide ways to
+restrict access to this functionality (since historically userfaultfds which
+handle kernel page faults have been a useful tool for exploiting the kernel).
+
+The first way, supported by older kernels, is the userfaultfd(2) syscall.
+Access to this is controlled in several ways:
+
+- By default, the userfaultfd will be able to handle kernel page faults. This
+ can be disabled by passing in UFFD_USER_MODE_ONLY.
+
+- If vm.unprivileged_userfaultfd is 0, then the caller must *either* have
+ CAP_SYS_PTRACE, or pass in UFFD_USER_MODE_ONLY.
+
+- If vm.unprivileged_userfaultfd is 1, then no particular privilege is needed to
+ use this syscall, even if UFFD_USER_MODE_ONLY is *not* set.
+
+The second way, added to the kernel more recently, is by opening and issuing a
+USERFAULTFD_IOC_NEW ioctl to /dev/userfaultfd. This method yields equivalent
+userfaultfds to the userfaultfd(2) syscall; its benefit is in how access to
+creating userfaultfds is controlled.
+
+Access to /dev/userfaultfd is controlled via normal filesystem permissions
+(user/group/mode for example), which gives fine grained access to userfaultfd
+specifically, without also granting other unrelated privileges at the same time
+(as e.g. granting CAP_SYS_PTRACE would do).
+
+Initializing up a userfaultfd
+-----------------------------
+
When first opened the ``userfaultfd`` must be enabled invoking the
``UFFDIO_API`` ioctl specifying a ``uffdio_api.api`` value set to ``UFFD_API`` (or
a later API version) which will specify the ``read/POLLIN`` protocol
diff --git a/Documentation/admin-guide/sysctl/vm.rst b/Documentation/admin-guide/sysctl/vm.rst
index 4a440a7cfeb0..5a810ea6dc77 100644
--- a/Documentation/admin-guide/sysctl/vm.rst
+++ b/Documentation/admin-guide/sysctl/vm.rst
@@ -565,9 +565,8 @@ See Documentation/admin-guide/mm/hugetlbpage.rst
hugetlb_optimize_vmemmap
========================
-This knob is not available when memory_hotplug.memmap_on_memory (kernel parameter)
-is configured or the size of 'struct page' (a structure defined in
-include/linux/mm_types.h) is not power of two (an unusual system config could
+This knob is not available when the size of 'struct page' (a structure defined
+in include/linux/mm_types.h) is not power of two (an unusual system config could
result in this).
Enable (set to 1) or disable (set to 0) the feature of optimizing vmemmap pages
@@ -928,6 +927,9 @@ calls without any restrictions.
The default value is 0.
+An alternative to this sysctl / the userfaultfd(2) syscall is to create
+userfaultfds via /dev/userfaultfd. See
+Documentation/admin-guide/mm/userfaultfd.rst.
user_reserve_kbytes
===================