summaryrefslogtreecommitdiff
path: root/Documentation
diff options
context:
space:
mode:
Diffstat (limited to 'Documentation')
-rw-r--r--Documentation/ABI/testing/sysfs-kernel-livepatch8
-rw-r--r--Documentation/filesystems/proc.txt18
-rw-r--r--Documentation/livepatch/livepatch.txt214
3 files changed, 198 insertions, 42 deletions
diff --git a/Documentation/ABI/testing/sysfs-kernel-livepatch b/Documentation/ABI/testing/sysfs-kernel-livepatch
index da87f43aec58..d5d39748382f 100644
--- a/Documentation/ABI/testing/sysfs-kernel-livepatch
+++ b/Documentation/ABI/testing/sysfs-kernel-livepatch
@@ -25,6 +25,14 @@ Description:
code is currently applied. Writing 0 will disable the patch
while writing 1 will re-enable the patch.
+What: /sys/kernel/livepatch/<patch>/transition
+Date: Feb 2017
+KernelVersion: 4.12.0
+Contact: live-patching@vger.kernel.org
+Description:
+ An attribute which indicates whether the patch is currently in
+ transition.
+
What: /sys/kernel/livepatch/<patch>/<object>
Date: Nov 2014
KernelVersion: 3.19.0
diff --git a/Documentation/filesystems/proc.txt b/Documentation/filesystems/proc.txt
index c94b4675d021..9036dbf16156 100644
--- a/Documentation/filesystems/proc.txt
+++ b/Documentation/filesystems/proc.txt
@@ -44,6 +44,7 @@ Table of Contents
3.8 /proc/<pid>/fdinfo/<fd> - Information about opened file
3.9 /proc/<pid>/map_files - Information about memory mapped files
3.10 /proc/<pid>/timerslack_ns - Task timerslack value
+ 3.11 /proc/<pid>/patch_state - Livepatch patch operation state
4 Configuring procfs
4.1 Mount options
@@ -1887,6 +1888,23 @@ Valid values are from 0 - ULLONG_MAX
An application setting the value must have PTRACE_MODE_ATTACH_FSCREDS level
permissions on the task specified to change its timerslack_ns value.
+3.11 /proc/<pid>/patch_state - Livepatch patch operation state
+-----------------------------------------------------------------
+When CONFIG_LIVEPATCH is enabled, this file displays the value of the
+patch state for the task.
+
+A value of '-1' indicates that no patch is in transition.
+
+A value of '0' indicates that a patch is in transition and the task is
+unpatched. If the patch is being enabled, then the task hasn't been
+patched yet. If the patch is being disabled, then the task has already
+been unpatched.
+
+A value of '1' indicates that a patch is in transition and the task is
+patched. If the patch is being enabled, then the task has already been
+patched. If the patch is being disabled, then the task hasn't been
+unpatched yet.
+
------------------------------------------------------------------------------
Configuring procfs
diff --git a/Documentation/livepatch/livepatch.txt b/Documentation/livepatch/livepatch.txt
index 9d2096c7160d..ecdb18104ab0 100644
--- a/Documentation/livepatch/livepatch.txt
+++ b/Documentation/livepatch/livepatch.txt
@@ -72,7 +72,8 @@ example, they add a NULL pointer or a boundary check, fix a race by adding
a missing memory barrier, or add some locking around a critical section.
Most of these changes are self contained and the function presents itself
the same way to the rest of the system. In this case, the functions might
-be updated independently one by one.
+be updated independently one by one. (This can be done by setting the
+'immediate' flag in the klp_patch struct.)
But there are more complex fixes. For example, a patch might change
ordering of locking in multiple functions at the same time. Or a patch
@@ -86,20 +87,141 @@ or no data are stored in the modified structures at the moment.
The theory about how to apply functions a safe way is rather complex.
The aim is to define a so-called consistency model. It attempts to define
conditions when the new implementation could be used so that the system
-stays consistent. The theory is not yet finished. See the discussion at
-https://lkml.kernel.org/r/20141107140458.GA21774@suse.cz
-
-The current consistency model is very simple. It guarantees that either
-the old or the new function is called. But various functions get redirected
-one by one without any synchronization.
-
-In other words, the current implementation _never_ modifies the behavior
-in the middle of the call. It is because it does _not_ rewrite the entire
-function in the memory. Instead, the function gets redirected at the
-very beginning. But this redirection is used immediately even when
-some other functions from the same patch have not been redirected yet.
-
-See also the section "Limitations" below.
+stays consistent.
+
+Livepatch has a consistency model which is a hybrid of kGraft and
+kpatch: it uses kGraft's per-task consistency and syscall barrier
+switching combined with kpatch's stack trace switching. There are also
+a number of fallback options which make it quite flexible.
+
+Patches are applied on a per-task basis, when the task is deemed safe to
+switch over. When a patch is enabled, livepatch enters into a
+transition state where tasks are converging to the patched state.
+Usually this transition state can complete in a few seconds. The same
+sequence occurs when a patch is disabled, except the tasks converge from
+the patched state to the unpatched state.
+
+An interrupt handler inherits the patched state of the task it
+interrupts. The same is true for forked tasks: the child inherits the
+patched state of the parent.
+
+Livepatch uses several complementary approaches to determine when it's
+safe to patch tasks:
+
+1. The first and most effective approach is stack checking of sleeping
+ tasks. If no affected functions are on the stack of a given task,
+ the task is patched. In most cases this will patch most or all of
+ the tasks on the first try. Otherwise it'll keep trying
+ periodically. This option is only available if the architecture has
+ reliable stacks (HAVE_RELIABLE_STACKTRACE).
+
+2. The second approach, if needed, is kernel exit switching. A
+ task is switched when it returns to user space from a system call, a
+ user space IRQ, or a signal. It's useful in the following cases:
+
+ a) Patching I/O-bound user tasks which are sleeping on an affected
+ function. In this case you have to send SIGSTOP and SIGCONT to
+ force it to exit the kernel and be patched.
+ b) Patching CPU-bound user tasks. If the task is highly CPU-bound
+ then it will get patched the next time it gets interrupted by an
+ IRQ.
+ c) In the future it could be useful for applying patches for
+ architectures which don't yet have HAVE_RELIABLE_STACKTRACE. In
+ this case you would have to signal most of the tasks on the
+ system. However this isn't supported yet because there's
+ currently no way to patch kthreads without
+ HAVE_RELIABLE_STACKTRACE.
+
+3. For idle "swapper" tasks, since they don't ever exit the kernel, they
+ instead have a klp_update_patch_state() call in the idle loop which
+ allows them to be patched before the CPU enters the idle state.
+
+ (Note there's not yet such an approach for kthreads.)
+
+All the above approaches may be skipped by setting the 'immediate' flag
+in the 'klp_patch' struct, which will disable per-task consistency and
+patch all tasks immediately. This can be useful if the patch doesn't
+change any function or data semantics. Note that, even with this flag
+set, it's possible that some tasks may still be running with an old
+version of the function, until that function returns.
+
+There's also an 'immediate' flag in the 'klp_func' struct which allows
+you to specify that certain functions in the patch can be applied
+without per-task consistency. This might be useful if you want to patch
+a common function like schedule(), and the function change doesn't need
+consistency but the rest of the patch does.
+
+For architectures which don't have HAVE_RELIABLE_STACKTRACE, the user
+must set patch->immediate which causes all tasks to be patched
+immediately. This option should be used with care, only when the patch
+doesn't change any function or data semantics.
+
+In the future, architectures which don't have HAVE_RELIABLE_STACKTRACE
+may be allowed to use per-task consistency if we can come up with
+another way to patch kthreads.
+
+The /sys/kernel/livepatch/<patch>/transition file shows whether a patch
+is in transition. Only a single patch (the topmost patch on the stack)
+can be in transition at a given time. A patch can remain in transition
+indefinitely, if any of the tasks are stuck in the initial patch state.
+
+A transition can be reversed and effectively canceled by writing the
+opposite value to the /sys/kernel/livepatch/<patch>/enabled file while
+the transition is in progress. Then all the tasks will attempt to
+converge back to the original patch state.
+
+There's also a /proc/<pid>/patch_state file which can be used to
+determine which tasks are blocking completion of a patching operation.
+If a patch is in transition, this file shows 0 to indicate the task is
+unpatched and 1 to indicate it's patched. Otherwise, if no patch is in
+transition, it shows -1. Any tasks which are blocking the transition
+can be signaled with SIGSTOP and SIGCONT to force them to change their
+patched state.
+
+
+3.1 Adding consistency model support to new architectures
+---------------------------------------------------------
+
+For adding consistency model support to new architectures, there are a
+few options:
+
+1) Add CONFIG_HAVE_RELIABLE_STACKTRACE. This means porting objtool, and
+ for non-DWARF unwinders, also making sure there's a way for the stack
+ tracing code to detect interrupts on the stack.
+
+2) Alternatively, ensure that every kthread has a call to
+ klp_update_patch_state() in a safe location. Kthreads are typically
+ in an infinite loop which does some action repeatedly. The safe
+ location to switch the kthread's patch state would be at a designated
+ point in the loop where there are no locks taken and all data
+ structures are in a well-defined state.
+
+ The location is clear when using workqueues or the kthread worker
+ API. These kthreads process independent actions in a generic loop.
+
+ It's much more complicated with kthreads which have a custom loop.
+ There the safe location must be carefully selected on a case-by-case
+ basis.
+
+ In that case, arches without HAVE_RELIABLE_STACKTRACE would still be
+ able to use the non-stack-checking parts of the consistency model:
+
+ a) patching user tasks when they cross the kernel/user space
+ boundary; and
+
+ b) patching kthreads and idle tasks at their designated patch points.
+
+ This option isn't as good as option 1 because it requires signaling
+ user tasks and waking kthreads to patch them. But it could still be
+ a good backup option for those architectures which don't have
+ reliable stack traces yet.
+
+In the meantime, patches for such architectures can bypass the
+consistency model by setting klp_patch.immediate to true. This option
+is perfectly fine for patches which don't change the semantics of the
+patched functions. In practice, this is usable for ~90% of security
+fixes. Use of this option also means the patch can't be unloaded after
+it has been disabled.
4. Livepatch module
@@ -134,7 +256,7 @@ Documentation/livepatch/module-elf-format.txt for more details.
4.2. Metadata
-------------
+-------------
The patch is described by several structures that split the information
into three levels:
@@ -156,6 +278,9 @@ into three levels:
only for a particular object ( vmlinux or a kernel module ). Note that
kallsyms allows for searching symbols according to the object name.
+ There's also an 'immediate' flag which, when set, patches the
+ function immediately, bypassing the consistency model safety checks.
+
+ struct klp_object defines an array of patched functions (struct
klp_func) in the same object. Where the object is either vmlinux
(NULL) or a module name.
@@ -172,10 +297,13 @@ into three levels:
This structure handles all patched functions consistently and eventually,
synchronously. The whole patch is applied only when all patched
symbols are found. The only exception are symbols from objects
- (kernel modules) that have not been loaded yet. Also if a more complex
- consistency model is supported then a selected unit (thread,
- kernel as a whole) will see the new code from the entire patch
- only when it is in a safe state.
+ (kernel modules) that have not been loaded yet.
+
+ Setting the 'immediate' flag applies the patch to all tasks
+ immediately, bypassing the consistency model safety checks.
+
+ For more details on how the patch is applied on a per-task basis,
+ see the "Consistency model" section.
4.3. Livepatch module handling
@@ -188,8 +316,15 @@ section "Livepatch life-cycle" below for more details about these
two operations.
Module removal is only safe when there are no users of the underlying
-functions. The immediate consistency model is not able to detect this;
-therefore livepatch modules cannot be removed. See "Limitations" below.
+functions. The immediate consistency model is not able to detect this. The
+code just redirects the functions at the very beginning and it does not
+check if the functions are in use. In other words, it knows when the
+functions get called but it does not know when the functions return.
+Therefore it cannot be decided when the livepatch module can be safely
+removed. This is solved by a hybrid consistency model. When the system is
+transitioned to a new patch state (patched/unpatched) it is guaranteed that
+no task sleeps or runs in the old code.
+
5. Livepatch life-cycle
=======================
@@ -239,9 +374,15 @@ Registered patches might be enabled either by calling klp_enable_patch() or
by writing '1' to /sys/kernel/livepatch/<name>/enabled. The system will
start using the new implementation of the patched functions at this stage.
-In particular, if an original function is patched for the first time, a
-function specific struct klp_ops is created and an universal ftrace handler
-is registered.
+When a patch is enabled, livepatch enters into a transition state where
+tasks are converging to the patched state. This is indicated by a value
+of '1' in /sys/kernel/livepatch/<name>/transition. Once all tasks have
+been patched, the 'transition' value changes to '0'. For more
+information about this process, see the "Consistency model" section.
+
+If an original function is patched for the first time, a function
+specific struct klp_ops is created and an universal ftrace handler is
+registered.
Functions might be patched multiple times. The ftrace handler is registered
only once for the given function. Further patches just add an entry to the
@@ -261,6 +402,12 @@ by writing '0' to /sys/kernel/livepatch/<name>/enabled. At this stage
either the code from the previously enabled patch or even the original
code gets used.
+When a patch is disabled, livepatch enters into a transition state where
+tasks are converging to the unpatched state. This is indicated by a
+value of '1' in /sys/kernel/livepatch/<name>/transition. Once all tasks
+have been unpatched, the 'transition' value changes to '0'. For more
+information about this process, see the "Consistency model" section.
+
Here all the functions (struct klp_func) associated with the to-be-disabled
patch are removed from the corresponding struct klp_ops. The ftrace handler
is unregistered and the struct klp_ops is freed when the func_stack list
@@ -329,23 +476,6 @@ The current Livepatch implementation has several limitations:
by "notrace".
- + Livepatch modules can not be removed.
-
- The current implementation just redirects the functions at the very
- beginning. It does not check if the functions are in use. In other
- words, it knows when the functions get called but it does not
- know when the functions return. Therefore it can not decide when
- the livepatch module can be safely removed.
-
- This will get most likely solved once a more complex consistency model
- is supported. The idea is that a safe state for patching should also
- mean a safe state for removing the patch.
-
- Note that the patch itself might get disabled by writing zero
- to /sys/kernel/livepatch/<patch>/enabled. It causes that the new
- code will not longer get called. But it does not guarantee
- that anyone is not sleeping anywhere in the new code.
-
+ Livepatch works reliably only when the dynamic ftrace is located at
the very beginning of the function.