diff options
Diffstat (limited to 'Documentation')
-rw-r--r-- | Documentation/ABI/testing/sysfs-kernel-livepatch | 8 | ||||
-rw-r--r-- | Documentation/filesystems/proc.txt | 18 | ||||
-rw-r--r-- | Documentation/livepatch/livepatch.txt | 214 |
3 files changed, 198 insertions, 42 deletions
diff --git a/Documentation/ABI/testing/sysfs-kernel-livepatch b/Documentation/ABI/testing/sysfs-kernel-livepatch index da87f43aec58..d5d39748382f 100644 --- a/Documentation/ABI/testing/sysfs-kernel-livepatch +++ b/Documentation/ABI/testing/sysfs-kernel-livepatch @@ -25,6 +25,14 @@ Description: code is currently applied. Writing 0 will disable the patch while writing 1 will re-enable the patch. +What: /sys/kernel/livepatch/<patch>/transition +Date: Feb 2017 +KernelVersion: 4.12.0 +Contact: live-patching@vger.kernel.org +Description: + An attribute which indicates whether the patch is currently in + transition. + What: /sys/kernel/livepatch/<patch>/<object> Date: Nov 2014 KernelVersion: 3.19.0 diff --git a/Documentation/filesystems/proc.txt b/Documentation/filesystems/proc.txt index c94b4675d021..9036dbf16156 100644 --- a/Documentation/filesystems/proc.txt +++ b/Documentation/filesystems/proc.txt @@ -44,6 +44,7 @@ Table of Contents 3.8 /proc/<pid>/fdinfo/<fd> - Information about opened file 3.9 /proc/<pid>/map_files - Information about memory mapped files 3.10 /proc/<pid>/timerslack_ns - Task timerslack value + 3.11 /proc/<pid>/patch_state - Livepatch patch operation state 4 Configuring procfs 4.1 Mount options @@ -1887,6 +1888,23 @@ Valid values are from 0 - ULLONG_MAX An application setting the value must have PTRACE_MODE_ATTACH_FSCREDS level permissions on the task specified to change its timerslack_ns value. +3.11 /proc/<pid>/patch_state - Livepatch patch operation state +----------------------------------------------------------------- +When CONFIG_LIVEPATCH is enabled, this file displays the value of the +patch state for the task. + +A value of '-1' indicates that no patch is in transition. + +A value of '0' indicates that a patch is in transition and the task is +unpatched. If the patch is being enabled, then the task hasn't been +patched yet. If the patch is being disabled, then the task has already +been unpatched. + +A value of '1' indicates that a patch is in transition and the task is +patched. If the patch is being enabled, then the task has already been +patched. If the patch is being disabled, then the task hasn't been +unpatched yet. + ------------------------------------------------------------------------------ Configuring procfs diff --git a/Documentation/livepatch/livepatch.txt b/Documentation/livepatch/livepatch.txt index 9d2096c7160d..ecdb18104ab0 100644 --- a/Documentation/livepatch/livepatch.txt +++ b/Documentation/livepatch/livepatch.txt @@ -72,7 +72,8 @@ example, they add a NULL pointer or a boundary check, fix a race by adding a missing memory barrier, or add some locking around a critical section. Most of these changes are self contained and the function presents itself the same way to the rest of the system. In this case, the functions might -be updated independently one by one. +be updated independently one by one. (This can be done by setting the +'immediate' flag in the klp_patch struct.) But there are more complex fixes. For example, a patch might change ordering of locking in multiple functions at the same time. Or a patch @@ -86,20 +87,141 @@ or no data are stored in the modified structures at the moment. The theory about how to apply functions a safe way is rather complex. The aim is to define a so-called consistency model. It attempts to define conditions when the new implementation could be used so that the system -stays consistent. The theory is not yet finished. See the discussion at -https://lkml.kernel.org/r/20141107140458.GA21774@suse.cz - -The current consistency model is very simple. It guarantees that either -the old or the new function is called. But various functions get redirected -one by one without any synchronization. - -In other words, the current implementation _never_ modifies the behavior -in the middle of the call. It is because it does _not_ rewrite the entire -function in the memory. Instead, the function gets redirected at the -very beginning. But this redirection is used immediately even when -some other functions from the same patch have not been redirected yet. - -See also the section "Limitations" below. +stays consistent. + +Livepatch has a consistency model which is a hybrid of kGraft and +kpatch: it uses kGraft's per-task consistency and syscall barrier +switching combined with kpatch's stack trace switching. There are also +a number of fallback options which make it quite flexible. + +Patches are applied on a per-task basis, when the task is deemed safe to +switch over. When a patch is enabled, livepatch enters into a +transition state where tasks are converging to the patched state. +Usually this transition state can complete in a few seconds. The same +sequence occurs when a patch is disabled, except the tasks converge from +the patched state to the unpatched state. + +An interrupt handler inherits the patched state of the task it +interrupts. The same is true for forked tasks: the child inherits the +patched state of the parent. + +Livepatch uses several complementary approaches to determine when it's +safe to patch tasks: + +1. The first and most effective approach is stack checking of sleeping + tasks. If no affected functions are on the stack of a given task, + the task is patched. In most cases this will patch most or all of + the tasks on the first try. Otherwise it'll keep trying + periodically. This option is only available if the architecture has + reliable stacks (HAVE_RELIABLE_STACKTRACE). + +2. The second approach, if needed, is kernel exit switching. A + task is switched when it returns to user space from a system call, a + user space IRQ, or a signal. It's useful in the following cases: + + a) Patching I/O-bound user tasks which are sleeping on an affected + function. In this case you have to send SIGSTOP and SIGCONT to + force it to exit the kernel and be patched. + b) Patching CPU-bound user tasks. If the task is highly CPU-bound + then it will get patched the next time it gets interrupted by an + IRQ. + c) In the future it could be useful for applying patches for + architectures which don't yet have HAVE_RELIABLE_STACKTRACE. In + this case you would have to signal most of the tasks on the + system. However this isn't supported yet because there's + currently no way to patch kthreads without + HAVE_RELIABLE_STACKTRACE. + +3. For idle "swapper" tasks, since they don't ever exit the kernel, they + instead have a klp_update_patch_state() call in the idle loop which + allows them to be patched before the CPU enters the idle state. + + (Note there's not yet such an approach for kthreads.) + +All the above approaches may be skipped by setting the 'immediate' flag +in the 'klp_patch' struct, which will disable per-task consistency and +patch all tasks immediately. This can be useful if the patch doesn't +change any function or data semantics. Note that, even with this flag +set, it's possible that some tasks may still be running with an old +version of the function, until that function returns. + +There's also an 'immediate' flag in the 'klp_func' struct which allows +you to specify that certain functions in the patch can be applied +without per-task consistency. This might be useful if you want to patch +a common function like schedule(), and the function change doesn't need +consistency but the rest of the patch does. + +For architectures which don't have HAVE_RELIABLE_STACKTRACE, the user +must set patch->immediate which causes all tasks to be patched +immediately. This option should be used with care, only when the patch +doesn't change any function or data semantics. + +In the future, architectures which don't have HAVE_RELIABLE_STACKTRACE +may be allowed to use per-task consistency if we can come up with +another way to patch kthreads. + +The /sys/kernel/livepatch/<patch>/transition file shows whether a patch +is in transition. Only a single patch (the topmost patch on the stack) +can be in transition at a given time. A patch can remain in transition +indefinitely, if any of the tasks are stuck in the initial patch state. + +A transition can be reversed and effectively canceled by writing the +opposite value to the /sys/kernel/livepatch/<patch>/enabled file while +the transition is in progress. Then all the tasks will attempt to +converge back to the original patch state. + +There's also a /proc/<pid>/patch_state file which can be used to +determine which tasks are blocking completion of a patching operation. +If a patch is in transition, this file shows 0 to indicate the task is +unpatched and 1 to indicate it's patched. Otherwise, if no patch is in +transition, it shows -1. Any tasks which are blocking the transition +can be signaled with SIGSTOP and SIGCONT to force them to change their +patched state. + + +3.1 Adding consistency model support to new architectures +--------------------------------------------------------- + +For adding consistency model support to new architectures, there are a +few options: + +1) Add CONFIG_HAVE_RELIABLE_STACKTRACE. This means porting objtool, and + for non-DWARF unwinders, also making sure there's a way for the stack + tracing code to detect interrupts on the stack. + +2) Alternatively, ensure that every kthread has a call to + klp_update_patch_state() in a safe location. Kthreads are typically + in an infinite loop which does some action repeatedly. The safe + location to switch the kthread's patch state would be at a designated + point in the loop where there are no locks taken and all data + structures are in a well-defined state. + + The location is clear when using workqueues or the kthread worker + API. These kthreads process independent actions in a generic loop. + + It's much more complicated with kthreads which have a custom loop. + There the safe location must be carefully selected on a case-by-case + basis. + + In that case, arches without HAVE_RELIABLE_STACKTRACE would still be + able to use the non-stack-checking parts of the consistency model: + + a) patching user tasks when they cross the kernel/user space + boundary; and + + b) patching kthreads and idle tasks at their designated patch points. + + This option isn't as good as option 1 because it requires signaling + user tasks and waking kthreads to patch them. But it could still be + a good backup option for those architectures which don't have + reliable stack traces yet. + +In the meantime, patches for such architectures can bypass the +consistency model by setting klp_patch.immediate to true. This option +is perfectly fine for patches which don't change the semantics of the +patched functions. In practice, this is usable for ~90% of security +fixes. Use of this option also means the patch can't be unloaded after +it has been disabled. 4. Livepatch module @@ -134,7 +256,7 @@ Documentation/livepatch/module-elf-format.txt for more details. 4.2. Metadata ------------- +------------- The patch is described by several structures that split the information into three levels: @@ -156,6 +278,9 @@ into three levels: only for a particular object ( vmlinux or a kernel module ). Note that kallsyms allows for searching symbols according to the object name. + There's also an 'immediate' flag which, when set, patches the + function immediately, bypassing the consistency model safety checks. + + struct klp_object defines an array of patched functions (struct klp_func) in the same object. Where the object is either vmlinux (NULL) or a module name. @@ -172,10 +297,13 @@ into three levels: This structure handles all patched functions consistently and eventually, synchronously. The whole patch is applied only when all patched symbols are found. The only exception are symbols from objects - (kernel modules) that have not been loaded yet. Also if a more complex - consistency model is supported then a selected unit (thread, - kernel as a whole) will see the new code from the entire patch - only when it is in a safe state. + (kernel modules) that have not been loaded yet. + + Setting the 'immediate' flag applies the patch to all tasks + immediately, bypassing the consistency model safety checks. + + For more details on how the patch is applied on a per-task basis, + see the "Consistency model" section. 4.3. Livepatch module handling @@ -188,8 +316,15 @@ section "Livepatch life-cycle" below for more details about these two operations. Module removal is only safe when there are no users of the underlying -functions. The immediate consistency model is not able to detect this; -therefore livepatch modules cannot be removed. See "Limitations" below. +functions. The immediate consistency model is not able to detect this. The +code just redirects the functions at the very beginning and it does not +check if the functions are in use. In other words, it knows when the +functions get called but it does not know when the functions return. +Therefore it cannot be decided when the livepatch module can be safely +removed. This is solved by a hybrid consistency model. When the system is +transitioned to a new patch state (patched/unpatched) it is guaranteed that +no task sleeps or runs in the old code. + 5. Livepatch life-cycle ======================= @@ -239,9 +374,15 @@ Registered patches might be enabled either by calling klp_enable_patch() or by writing '1' to /sys/kernel/livepatch/<name>/enabled. The system will start using the new implementation of the patched functions at this stage. -In particular, if an original function is patched for the first time, a -function specific struct klp_ops is created and an universal ftrace handler -is registered. +When a patch is enabled, livepatch enters into a transition state where +tasks are converging to the patched state. This is indicated by a value +of '1' in /sys/kernel/livepatch/<name>/transition. Once all tasks have +been patched, the 'transition' value changes to '0'. For more +information about this process, see the "Consistency model" section. + +If an original function is patched for the first time, a function +specific struct klp_ops is created and an universal ftrace handler is +registered. Functions might be patched multiple times. The ftrace handler is registered only once for the given function. Further patches just add an entry to the @@ -261,6 +402,12 @@ by writing '0' to /sys/kernel/livepatch/<name>/enabled. At this stage either the code from the previously enabled patch or even the original code gets used. +When a patch is disabled, livepatch enters into a transition state where +tasks are converging to the unpatched state. This is indicated by a +value of '1' in /sys/kernel/livepatch/<name>/transition. Once all tasks +have been unpatched, the 'transition' value changes to '0'. For more +information about this process, see the "Consistency model" section. + Here all the functions (struct klp_func) associated with the to-be-disabled patch are removed from the corresponding struct klp_ops. The ftrace handler is unregistered and the struct klp_ops is freed when the func_stack list @@ -329,23 +476,6 @@ The current Livepatch implementation has several limitations: by "notrace". - + Livepatch modules can not be removed. - - The current implementation just redirects the functions at the very - beginning. It does not check if the functions are in use. In other - words, it knows when the functions get called but it does not - know when the functions return. Therefore it can not decide when - the livepatch module can be safely removed. - - This will get most likely solved once a more complex consistency model - is supported. The idea is that a safe state for patching should also - mean a safe state for removing the patch. - - Note that the patch itself might get disabled by writing zero - to /sys/kernel/livepatch/<patch>/enabled. It causes that the new - code will not longer get called. But it does not guarantee - that anyone is not sleeping anywhere in the new code. - + Livepatch works reliably only when the dynamic ftrace is located at the very beginning of the function. |