author:    Sebastian Andrzej Siewior <bigeasy@linutronix.de>  2021-09-22 22:52:01 +0200
committer: Sebastian Andrzej Siewior <bigeasy@linutronix.de>  2021-09-22 22:52:01 +0200
commit:    8448a7b76e99b98c3622848f32d418db931a7eac
tree:      230cf5eebc7f813cdd311b2fe277c9a69b0f4b25 /patches
parent:    d26e8a11d057e46922f191438a4d4ea65e65ce99
download:  linux-rt-8448a7b76e99b98c3622848f32d418db931a7eac.tar.gz
[ANNOUNCE] v5.15-rc2-rt3
Dear RT folks!
I'm pleased to announce the v5.15-rc2-rt3 patch set.
Changes since v5.15-rc2-rt2:
- Remove kernel_fpu_resched(). A few ciphers were restructured, so the
  function no longer has any users and can be removed.
- The cpuset code is using spinlock_t again. Since the mm/slub rework
  there is no need to use raw_spinlock_t.
- Allow enabling CONFIG_RT_GROUP_SCHED on RT again. The original
  issue cannot be reproduced. Please test and report any issues.
- The RCU warning fix by Valentin Schneider has been replaced with a
  patch by Thomas Gleixner. There is another open issue in that area
  and Frederic Weisbecker is looking into it.
- RCU lock accounting and checking has been reworked by Thomas
  Gleixner. A direct effect is that might_sleep() now produces a
  warning if invoked in an RCU read-side section. Previously it would
  only trigger a warning in schedule() in such a situation.
- The preempt_*_nort() macros have been removed.
- The preempt_enable_no_resched() macro should behave like
  preempt_enable() on PREEMPT_RT but was misplaced in v3.14-rt1
  and has now been corrected.
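To illustrate the reworked might_sleep() rule, here is a small user-space toy model (NOT kernel code; all names are hypothetical stand-ins): might_sleep() now warns inside an RCU read-side section, while RT's substituted spin/rw locks pass rcu_preempt_depth() as an allowed offset and therefore remain legal there, mirroring the rtlock_might_sleep() helper in the appended patch.

```c
#include <assert.h>

/* Toy model of rcu_preempt_depth(): current RCU read-side nesting. */
static int rcu_nesting;

static void model_rcu_read_lock(void)   { rcu_nesting++; }
static void model_rcu_read_unlock(void) { rcu_nesting--; }

/*
 * Models the might_sleep() debug check: returns 1 if the warning would
 * fire, i.e. the RCU nesting exceeds the caller's permitted offset.
 */
static int model_might_sleep(int allowed_offset)
{
    return rcu_nesting > allowed_offset;
}

/*
 * Models what an RT-substituted spin_lock() does: it may block, but it
 * passes the current RCU nesting as the permitted offset, so acquiring
 * it inside an RCU read-side section does not warn.
 */
static int model_rt_spin_lock_check(void)
{
    return model_might_sleep(rcu_nesting);
}
```

With this model, model_might_sleep(0) returns 1 between model_rcu_read_lock() and model_rcu_read_unlock() (the new warning), while model_rt_spin_lock_check() stays 0 there.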
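The preempt_enable_no_resched() correction can be sketched with another user-space toy model (NOT kernel code; the functions are hypothetical stand-ins): on PREEMPT_RT the macro must behave exactly like preempt_enable(), i.e. it may also hit a reschedule point, whereas on !RT it only drops the preempt count.

```c
/* Toy model of the preempt counter and of would-be reschedule points. */
static int preempt_count_model;
static int resched_checks;

static void model_preempt_disable(void)
{
    preempt_count_model++;
}

static void model_preempt_enable(void)
{
    /* Dropping to zero is a preemption point: a resched check happens. */
    if (--preempt_count_model == 0)
        resched_checks++;
}

static void model_preempt_enable_no_resched(int preempt_rt)
{
    if (preempt_rt)
        model_preempt_enable();   /* RT: identical to preempt_enable() */
    else
        preempt_count_model--;    /* !RT: no reschedule check */
}
```

Calling model_preempt_enable_no_resched(0) leaves resched_checks untouched, while model_preempt_enable_no_resched(1) behaves like a full model_preempt_enable().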
Known issues
- netconsole triggers WARN.
- The "Memory controller" (CONFIG_MEMCG) has been disabled.
- Valentin Schneider reported a few splats on ARM64, see
https://lkml.kernel.org/r/20210810134127.1394269-1-valentin.schneider@arm.com/
The delta patch against v5.15-rc2-rt2 is appended below and can be found here:
https://cdn.kernel.org/pub/linux/kernel/projects/rt/5.15/incr/patch-5.15-rc2-rt2-rt3.patch.xz
You can get this release via the git tree at:
git://git.kernel.org/pub/scm/linux/kernel/git/rt/linux-rt-devel.git v5.15-rc2-rt3
The RT patch against v5.15-rc2 can be found here:
https://cdn.kernel.org/pub/linux/kernel/projects/rt/5.15/older/patch-5.15-rc2-rt3.patch.xz
The split quilt queue is available at:
https://cdn.kernel.org/pub/linux/kernel/projects/rt/5.15/older/patches-5.15-rc2-rt3.tar.xz
Sebastian
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Diffstat (limited to 'patches')
27 files changed, 421 insertions, 1003 deletions
diff --git a/patches/Add_localversion_for_-RT_release.patch b/patches/Add_localversion_for_-RT_release.patch index d960d516454d..53b69a97ca19 100644 --- a/patches/Add_localversion_for_-RT_release.patch +++ b/patches/Add_localversion_for_-RT_release.patch @@ -15,4 +15,4 @@ Signed-off-by: Thomas Gleixner <tglx@linutronix.de> --- /dev/null +++ b/localversion-rt @@ -0,0 +1 @@ -+-rt2 ++-rt3 diff --git a/patches/cpuset__Convert_callback_lock_to_raw_spinlock_t.patch b/patches/cpuset__Convert_callback_lock_to_raw_spinlock_t.patch deleted file mode 100644 index 923fdbc85863..000000000000 --- a/patches/cpuset__Convert_callback_lock_to_raw_spinlock_t.patch +++ /dev/null @@ -1,362 +0,0 @@ -Subject: cpuset: Convert callback_lock to raw_spinlock_t -From: Mike Galbraith <efault@gmx.de> -Date: Sun Jan 8 09:32:25 2017 +0100 - -From: Mike Galbraith <efault@gmx.de> - -The two commits below add up to a cpuset might_sleep() splat for RT: - -8447a0fee974 cpuset: convert callback_mutex to a spinlock -344736f29b35 cpuset: simplify cpuset_node_allowed API - -BUG: sleeping function called from invalid context at kernel/locking/rtmutex.c:995 -in_atomic(): 0, irqs_disabled(): 1, pid: 11718, name: cset -CPU: 135 PID: 11718 Comm: cset Tainted: G E 4.10.0-rt1-rt #4 -Hardware name: Intel Corporation BRICKLAND/BRICKLAND, BIOS BRHSXSD1.86B.0056.R01.1409242327 09/24/2014 -Call Trace: - ? dump_stack+0x5c/0x81 - ? ___might_sleep+0xf4/0x170 - ? rt_spin_lock+0x1c/0x50 - ? __cpuset_node_allowed+0x66/0xc0 - ? ___slab_alloc+0x390/0x570 <disables IRQs> - ? anon_vma_fork+0x8f/0x140 - ? copy_page_range+0x6cf/0xb00 - ? anon_vma_fork+0x8f/0x140 - ? __slab_alloc.isra.74+0x5a/0x81 - ? anon_vma_fork+0x8f/0x140 - ? kmem_cache_alloc+0x1b5/0x1f0 - ? anon_vma_fork+0x8f/0x140 - ? copy_process.part.35+0x1670/0x1ee0 - ? _do_fork+0xdd/0x3f0 - ? _do_fork+0xdd/0x3f0 - ? do_syscall_64+0x61/0x170 - ? 
entry_SYSCALL64_slow_path+0x25/0x25 - -The later ensured that a NUMA box WILL take callback_lock in atomic -context by removing the allocator and reclaim path __GFP_HARDWALL -usage which prevented such contexts from taking callback_mutex. - -One option would be to reinstate __GFP_HARDWALL protections for -RT, however, as the 8447a0fee974 changelog states: - -The callback_mutex is only used to synchronize reads/updates of cpusets' -flags and cpu/node masks. These operations should always proceed fast so -there's no reason why we can't use a spinlock instead of the mutex. - -Cc: stable-rt@vger.kernel.org -Signed-off-by: Mike Galbraith <efault@gmx.de> -Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> -Signed-off-by: Thomas Gleixner <tglx@linutronix.de> - - ---- - kernel/cgroup/cpuset.c | 82 ++++++++++++++++++++++++------------------------- - 1 file changed, 41 insertions(+), 41 deletions(-) ---- ---- a/kernel/cgroup/cpuset.c -+++ b/kernel/cgroup/cpuset.c -@@ -358,7 +358,7 @@ void cpuset_read_unlock(void) - percpu_up_read(&cpuset_rwsem); - } - --static DEFINE_SPINLOCK(callback_lock); -+static DEFINE_RAW_SPINLOCK(callback_lock); - - static struct workqueue_struct *cpuset_migrate_mm_wq; - -@@ -1308,7 +1308,7 @@ static int update_parent_subparts_cpumas - * Newly added CPUs will be removed from effective_cpus and - * newly deleted ones will be added back to effective_cpus. 
- */ -- spin_lock_irq(&callback_lock); -+ raw_spin_lock_irq(&callback_lock); - if (adding) { - cpumask_or(parent->subparts_cpus, - parent->subparts_cpus, tmp->addmask); -@@ -1331,7 +1331,7 @@ static int update_parent_subparts_cpumas - if (old_prs != new_prs) - cpuset->partition_root_state = new_prs; - -- spin_unlock_irq(&callback_lock); -+ raw_spin_unlock_irq(&callback_lock); - notify_partition_change(cpuset, old_prs, new_prs); - - return cmd == partcmd_update; -@@ -1435,7 +1435,7 @@ static void update_cpumasks_hier(struct - continue; - rcu_read_unlock(); - -- spin_lock_irq(&callback_lock); -+ raw_spin_lock_irq(&callback_lock); - - cpumask_copy(cp->effective_cpus, tmp->new_cpus); - if (cp->nr_subparts_cpus && (new_prs != PRS_ENABLED)) { -@@ -1469,7 +1469,7 @@ static void update_cpumasks_hier(struct - if (new_prs != old_prs) - cp->partition_root_state = new_prs; - -- spin_unlock_irq(&callback_lock); -+ raw_spin_unlock_irq(&callback_lock); - notify_partition_change(cp, old_prs, new_prs); - - WARN_ON(!is_in_v2_mode() && -@@ -1588,7 +1588,7 @@ static int update_cpumask(struct cpuset - return -EINVAL; - } - -- spin_lock_irq(&callback_lock); -+ raw_spin_lock_irq(&callback_lock); - cpumask_copy(cs->cpus_allowed, trialcs->cpus_allowed); - - /* -@@ -1599,7 +1599,7 @@ static int update_cpumask(struct cpuset - cs->cpus_allowed); - cs->nr_subparts_cpus = cpumask_weight(cs->subparts_cpus); - } -- spin_unlock_irq(&callback_lock); -+ raw_spin_unlock_irq(&callback_lock); - - update_cpumasks_hier(cs, &tmp); - -@@ -1798,9 +1798,9 @@ static void update_nodemasks_hier(struct - continue; - rcu_read_unlock(); - -- spin_lock_irq(&callback_lock); -+ raw_spin_lock_irq(&callback_lock); - cp->effective_mems = *new_mems; -- spin_unlock_irq(&callback_lock); -+ raw_spin_unlock_irq(&callback_lock); - - WARN_ON(!is_in_v2_mode() && - !nodes_equal(cp->mems_allowed, cp->effective_mems)); -@@ -1868,9 +1868,9 @@ static int update_nodemask(struct cpuset - if (retval < 0) - goto done; - -- 
spin_lock_irq(&callback_lock); -+ raw_spin_lock_irq(&callback_lock); - cs->mems_allowed = trialcs->mems_allowed; -- spin_unlock_irq(&callback_lock); -+ raw_spin_unlock_irq(&callback_lock); - - /* use trialcs->mems_allowed as a temp variable */ - update_nodemasks_hier(cs, &trialcs->mems_allowed); -@@ -1961,9 +1961,9 @@ static int update_flag(cpuset_flagbits_t - spread_flag_changed = ((is_spread_slab(cs) != is_spread_slab(trialcs)) - || (is_spread_page(cs) != is_spread_page(trialcs))); - -- spin_lock_irq(&callback_lock); -+ raw_spin_lock_irq(&callback_lock); - cs->flags = trialcs->flags; -- spin_unlock_irq(&callback_lock); -+ raw_spin_unlock_irq(&callback_lock); - - if (!cpumask_empty(trialcs->cpus_allowed) && balance_flag_changed) - rebuild_sched_domains_locked(); -@@ -2054,9 +2054,9 @@ static int update_prstate(struct cpuset - rebuild_sched_domains_locked(); - out: - if (!err) { -- spin_lock_irq(&callback_lock); -+ raw_spin_lock_irq(&callback_lock); - cs->partition_root_state = new_prs; -- spin_unlock_irq(&callback_lock); -+ raw_spin_unlock_irq(&callback_lock); - notify_partition_change(cs, old_prs, new_prs); - } - -@@ -2471,7 +2471,7 @@ static int cpuset_common_seq_show(struct - cpuset_filetype_t type = seq_cft(sf)->private; - int ret = 0; - -- spin_lock_irq(&callback_lock); -+ raw_spin_lock_irq(&callback_lock); - - switch (type) { - case FILE_CPULIST: -@@ -2493,7 +2493,7 @@ static int cpuset_common_seq_show(struct - ret = -EINVAL; - } - -- spin_unlock_irq(&callback_lock); -+ raw_spin_unlock_irq(&callback_lock); - return ret; - } - -@@ -2811,14 +2811,14 @@ static int cpuset_css_online(struct cgro - - cpuset_inc(); - -- spin_lock_irq(&callback_lock); -+ raw_spin_lock_irq(&callback_lock); - if (is_in_v2_mode()) { - cpumask_copy(cs->effective_cpus, parent->effective_cpus); - cs->effective_mems = parent->effective_mems; - cs->use_parent_ecpus = true; - parent->child_ecpus_count++; - } -- spin_unlock_irq(&callback_lock); -+ raw_spin_unlock_irq(&callback_lock); - - if 
(!test_bit(CGRP_CPUSET_CLONE_CHILDREN, &css->cgroup->flags)) - goto out_unlock; -@@ -2845,12 +2845,12 @@ static int cpuset_css_online(struct cgro - } - rcu_read_unlock(); - -- spin_lock_irq(&callback_lock); -+ raw_spin_lock_irq(&callback_lock); - cs->mems_allowed = parent->mems_allowed; - cs->effective_mems = parent->mems_allowed; - cpumask_copy(cs->cpus_allowed, parent->cpus_allowed); - cpumask_copy(cs->effective_cpus, parent->cpus_allowed); -- spin_unlock_irq(&callback_lock); -+ raw_spin_unlock_irq(&callback_lock); - out_unlock: - percpu_up_write(&cpuset_rwsem); - cpus_read_unlock(); -@@ -2906,7 +2906,7 @@ static void cpuset_css_free(struct cgrou - static void cpuset_bind(struct cgroup_subsys_state *root_css) - { - percpu_down_write(&cpuset_rwsem); -- spin_lock_irq(&callback_lock); -+ raw_spin_lock_irq(&callback_lock); - - if (is_in_v2_mode()) { - cpumask_copy(top_cpuset.cpus_allowed, cpu_possible_mask); -@@ -2917,7 +2917,7 @@ static void cpuset_bind(struct cgroup_su - top_cpuset.mems_allowed = top_cpuset.effective_mems; - } - -- spin_unlock_irq(&callback_lock); -+ raw_spin_unlock_irq(&callback_lock); - percpu_up_write(&cpuset_rwsem); - } - -@@ -3014,12 +3014,12 @@ hotplug_update_tasks_legacy(struct cpuse - { - bool is_empty; - -- spin_lock_irq(&callback_lock); -+ raw_spin_lock_irq(&callback_lock); - cpumask_copy(cs->cpus_allowed, new_cpus); - cpumask_copy(cs->effective_cpus, new_cpus); - cs->mems_allowed = *new_mems; - cs->effective_mems = *new_mems; -- spin_unlock_irq(&callback_lock); -+ raw_spin_unlock_irq(&callback_lock); - - /* - * Don't call update_tasks_cpumask() if the cpuset becomes empty, -@@ -3056,10 +3056,10 @@ hotplug_update_tasks(struct cpuset *cs, - if (nodes_empty(*new_mems)) - *new_mems = parent_cs(cs)->effective_mems; - -- spin_lock_irq(&callback_lock); -+ raw_spin_lock_irq(&callback_lock); - cpumask_copy(cs->effective_cpus, new_cpus); - cs->effective_mems = *new_mems; -- spin_unlock_irq(&callback_lock); -+ raw_spin_unlock_irq(&callback_lock); - 
- if (cpus_updated) - update_tasks_cpumask(cs); -@@ -3126,10 +3126,10 @@ static void cpuset_hotplug_update_tasks( - if (is_partition_root(cs) && (cpumask_empty(&new_cpus) || - (parent->partition_root_state == PRS_ERROR))) { - if (cs->nr_subparts_cpus) { -- spin_lock_irq(&callback_lock); -+ raw_spin_lock_irq(&callback_lock); - cs->nr_subparts_cpus = 0; - cpumask_clear(cs->subparts_cpus); -- spin_unlock_irq(&callback_lock); -+ raw_spin_unlock_irq(&callback_lock); - compute_effective_cpumask(&new_cpus, cs, parent); - } - -@@ -3147,9 +3147,9 @@ static void cpuset_hotplug_update_tasks( - NULL, tmp); - old_prs = cs->partition_root_state; - if (old_prs != PRS_ERROR) { -- spin_lock_irq(&callback_lock); -+ raw_spin_lock_irq(&callback_lock); - cs->partition_root_state = PRS_ERROR; -- spin_unlock_irq(&callback_lock); -+ raw_spin_unlock_irq(&callback_lock); - notify_partition_change(cs, old_prs, PRS_ERROR); - } - } -@@ -3231,7 +3231,7 @@ static void cpuset_hotplug_workfn(struct - - /* synchronize cpus_allowed to cpu_active_mask */ - if (cpus_updated) { -- spin_lock_irq(&callback_lock); -+ raw_spin_lock_irq(&callback_lock); - if (!on_dfl) - cpumask_copy(top_cpuset.cpus_allowed, &new_cpus); - /* -@@ -3251,17 +3251,17 @@ static void cpuset_hotplug_workfn(struct - } - } - cpumask_copy(top_cpuset.effective_cpus, &new_cpus); -- spin_unlock_irq(&callback_lock); -+ raw_spin_unlock_irq(&callback_lock); - /* we don't mess with cpumasks of tasks in top_cpuset */ - } - - /* synchronize mems_allowed to N_MEMORY */ - if (mems_updated) { -- spin_lock_irq(&callback_lock); -+ raw_spin_lock_irq(&callback_lock); - if (!on_dfl) - top_cpuset.mems_allowed = new_mems; - top_cpuset.effective_mems = new_mems; -- spin_unlock_irq(&callback_lock); -+ raw_spin_unlock_irq(&callback_lock); - update_tasks_nodemask(&top_cpuset); - } - -@@ -3362,9 +3362,9 @@ void cpuset_cpus_allowed(struct task_str - { - unsigned long flags; - -- spin_lock_irqsave(&callback_lock, flags); -+ 
raw_spin_lock_irqsave(&callback_lock, flags); - guarantee_online_cpus(tsk, pmask); -- spin_unlock_irqrestore(&callback_lock, flags); -+ raw_spin_unlock_irqrestore(&callback_lock, flags); - } - - /** -@@ -3435,11 +3435,11 @@ nodemask_t cpuset_mems_allowed(struct ta - nodemask_t mask; - unsigned long flags; - -- spin_lock_irqsave(&callback_lock, flags); -+ raw_spin_lock_irqsave(&callback_lock, flags); - rcu_read_lock(); - guarantee_online_mems(task_cs(tsk), &mask); - rcu_read_unlock(); -- spin_unlock_irqrestore(&callback_lock, flags); -+ raw_spin_unlock_irqrestore(&callback_lock, flags); - - return mask; - } -@@ -3531,14 +3531,14 @@ bool __cpuset_node_allowed(int node, gfp - return true; - - /* Not hardwall and node outside mems_allowed: scan up cpusets */ -- spin_lock_irqsave(&callback_lock, flags); -+ raw_spin_lock_irqsave(&callback_lock, flags); - - rcu_read_lock(); - cs = nearest_hardwall_ancestor(task_cs(current)); - allowed = node_isset(node, cs->mems_allowed); - rcu_read_unlock(); - -- spin_unlock_irqrestore(&callback_lock, flags); -+ raw_spin_unlock_irqrestore(&callback_lock, flags); - return allowed; - } - diff --git a/patches/crypto__limit_more_FPU-enabled_sections.patch b/patches/crypto__limit_more_FPU-enabled_sections.patch deleted file mode 100644 index c09e6227c51f..000000000000 --- a/patches/crypto__limit_more_FPU-enabled_sections.patch +++ /dev/null @@ -1,67 +0,0 @@ -Subject: crypto: limit more FPU-enabled sections -From: Sebastian Andrzej Siewior <bigeasy@linutronix.de> -Date: Thu Nov 30 13:40:10 2017 +0100 - -From: Sebastian Andrzej Siewior <bigeasy@linutronix.de> - -Those crypto drivers use SSE/AVX/… for their crypto work and in order to -do so in kernel they need to enable the "FPU" in kernel mode which -disables preemption. -There are two problems with the way they are used: -- the while loop which processes X bytes may create latency spikes and - should be avoided or limited. 
-- the cipher-walk-next part may allocate/free memory and may use - kmap_atomic(). - -The whole kernel_fpu_begin()/end() processing isn't probably that cheap. -It most likely makes sense to process as much of those as possible in one -go. The new *_fpu_sched_rt() schedules only if a RT task is pending. - -Probably we should measure the performance those ciphers in pure SW -mode and with this optimisations to see if it makes sense to keep them -for RT. - -This kernel_fpu_resched() makes the code more preemptible which might hurt -performance. - -Cc: stable-rt@vger.kernel.org -Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> -Signed-off-by: Thomas Gleixner <tglx@linutronix.de> - - ---- - arch/x86/include/asm/fpu/api.h | 1 + - arch/x86/kernel/fpu/core.c | 12 ++++++++++++ - 2 files changed, 13 insertions(+) ---- ---- a/arch/x86/include/asm/fpu/api.h -+++ b/arch/x86/include/asm/fpu/api.h -@@ -28,6 +28,7 @@ extern void kernel_fpu_begin_mask(unsign - extern void kernel_fpu_end(void); - extern bool irq_fpu_usable(void); - extern void fpregs_mark_activate(void); -+extern void kernel_fpu_resched(void); - - /* Code that is unaware of kernel_fpu_begin_mask() can use this */ - static inline void kernel_fpu_begin(void) ---- a/arch/x86/kernel/fpu/core.c -+++ b/arch/x86/kernel/fpu/core.c -@@ -185,6 +185,18 @@ void kernel_fpu_end(void) - } - EXPORT_SYMBOL_GPL(kernel_fpu_end); - -+void kernel_fpu_resched(void) -+{ -+ WARN_ON_FPU(!this_cpu_read(in_kernel_fpu)); -+ -+ if (should_resched(PREEMPT_OFFSET)) { -+ kernel_fpu_end(); -+ cond_resched(); -+ kernel_fpu_begin(); -+ } -+} -+EXPORT_SYMBOL_GPL(kernel_fpu_resched); -+ - /* - * Sync the FPU register state to current's memory register state when the - * current task owns the FPU. The hardware register state is preserved. 
diff --git a/patches/genirq-Disable-irqfixup-poll-on-PREEMPT_RT.patch b/patches/genirq-Disable-irqfixup-poll-on-PREEMPT_RT.patch new file mode 100644 index 000000000000..1127849c2a89 --- /dev/null +++ b/patches/genirq-Disable-irqfixup-poll-on-PREEMPT_RT.patch @@ -0,0 +1,48 @@ +From: Ingo Molnar <mingo@kernel.org> +Date: Fri, 3 Jul 2009 08:29:57 -0500 +Subject: [PATCH] genirq: Disable irqfixup/poll on PREEMPT_RT. + +The support for misrouted IRQs is used on old / legacy systems and is +not feasible on PREEMPT_RT. + +Polling for interrupts reduces the overall system performance. +Additionally the interrupt latency depends on the polling frequency and +delays are not desired for real time workloads. + +Disable IRQ polling on PREEMPT_RT and let the user know that it is not +enabled. The compiler will optimize the real fixup/poll code out. + +[ bigeasy: Update changelog and switch to IS_ENABLED() ] + +Signed-off-by: Ingo Molnar <mingo@kernel.org> +Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> +Signed-off-by: Thomas Gleixner <tglx@linutronix.de> +Link: https://lore.kernel.org/r/20210917223841.c6j6jcaffojrnot3@linutronix.de +--- + kernel/irq/spurious.c | 8 ++++++++ + 1 file changed, 8 insertions(+) + +--- a/kernel/irq/spurious.c ++++ b/kernel/irq/spurious.c +@@ -447,6 +447,10 @@ MODULE_PARM_DESC(noirqdebug, "Disable ir + + static int __init irqfixup_setup(char *str) + { ++ if (IS_ENABLED(CONFIG_PREEMPT_RT)) { ++ pr_warn("irqfixup boot option not supported with PREEMPT_RT\n"); ++ return 1; ++ } + irqfixup = 1; + printk(KERN_WARNING "Misrouted IRQ fixup support enabled.\n"); + printk(KERN_WARNING "This may impact system performance.\n"); +@@ -459,6 +463,10 @@ module_param(irqfixup, int, 0644); + + static int __init irqpoll_setup(char *str) + { ++ if (IS_ENABLED(CONFIG_PREEMPT_RT)) { ++ pr_warn("irqpoll boot option not supported with PREEMPT_RT\n"); ++ return 1; ++ } + irqfixup = 2; + printk(KERN_WARNING "Misrouted IRQ fixup and polling support " + 
"enabled\n"); diff --git a/patches/genirq__Move_prio_assignment_into_the_newly_created_thread.patch b/patches/genirq-Move-prio-assignment-into-the-newly-created-t.patch index ef15ef6f6d99..fe3798fc7b46 100644 --- a/patches/genirq__Move_prio_assignment_into_the_newly_created_thread.patch +++ b/patches/genirq-Move-prio-assignment-into-the-newly-created-t.patch @@ -1,11 +1,10 @@ -Subject: genirq: Move prio assignment into the newly created thread -From: Thomas Gleixner <tglx@linutronix.de> -Date: Mon Nov 9 23:32:39 2020 +0100 - From: Thomas Gleixner <tglx@linutronix.de> +Date: Tue, 10 Nov 2020 12:38:48 +0100 +Subject: [PATCH] genirq: Move prio assignment into the newly created thread With enabled threaded interrupts the nouveau driver reported the following: + | Chain exists of: | &mm->mmap_lock#2 --> &device->mutex --> &cpuset_rwsem | @@ -20,23 +19,21 @@ following: The device->mutex is nvkm_device::mutex. -Unblocking the lockchain at `cpuset_rwsem' is probably the easiest thing -to do. -Move the priority assignment to the start of the newly created thread. +Unblocking the lockchain at `cpuset_rwsem' is probably the easiest +thing to do. Move the priority assignment to the start of the newly +created thread. 
Fixes: 710da3c8ea7df ("sched/core: Prevent race condition between cpuset and __sched_setscheduler()") Reported-by: Mike Galbraith <efault@gmx.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> [bigeasy: Patch description] Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> -Signed-off-by: Thomas Gleixner <tglx@linutronix.de> +Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/a23a826af7c108ea5651e73b8fbae5e653f16e86.camel@gmx.de - - --- kernel/irq/manage.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) ---- + --- a/kernel/irq/manage.c +++ b/kernel/irq/manage.c @@ -1259,6 +1259,8 @@ static int irq_thread(void *data) diff --git a/patches/genirq__Disable_irqpoll_on_-rt.patch b/patches/genirq__Disable_irqpoll_on_-rt.patch deleted file mode 100644 index e5c9c92dc214..000000000000 --- a/patches/genirq__Disable_irqpoll_on_-rt.patch +++ /dev/null @@ -1,41 +0,0 @@ -Subject: genirq: Disable irqpoll on -rt -From: Ingo Molnar <mingo@elte.hu> -Date: Fri Jul 3 08:29:57 2009 -0500 - -From: Ingo Molnar <mingo@elte.hu> - -Creates long latencies for no value - -Signed-off-by: Ingo Molnar <mingo@elte.hu> -Signed-off-by: Thomas Gleixner <tglx@linutronix.de> - - - ---- - kernel/irq/spurious.c | 8 ++++++++ - 1 file changed, 8 insertions(+) ---- ---- a/kernel/irq/spurious.c -+++ b/kernel/irq/spurious.c -@@ -447,6 +447,10 @@ MODULE_PARM_DESC(noirqdebug, "Disable ir - - static int __init irqfixup_setup(char *str) - { -+#ifdef CONFIG_PREEMPT_RT -+ pr_warn("irqfixup boot option not supported w/ CONFIG_PREEMPT_RT\n"); -+ return 1; -+#endif - irqfixup = 1; - printk(KERN_WARNING "Misrouted IRQ fixup support enabled.\n"); - printk(KERN_WARNING "This may impact system performance.\n"); -@@ -459,6 +463,10 @@ module_param(irqfixup, int, 0644); - - static int __init irqpoll_setup(char *str) - { -+#ifdef CONFIG_PREEMPT_RT -+ pr_warn("irqpoll boot option not supported w/ CONFIG_PREEMPT_RT\n"); -+ return 1; -+#endif - irqfixup = 2; - 
printk(KERN_WARNING "Misrouted IRQ fixup and polling support " - "enabled\n"); diff --git a/patches/genirq__update_irq_set_irqchip_state_documentation.patch b/patches/genirq__update_irq_set_irqchip_state_documentation.patch index 03cc8597dba1..c3b062d4fd3c 100644 --- a/patches/genirq__update_irq_set_irqchip_state_documentation.patch +++ b/patches/genirq__update_irq_set_irqchip_state_documentation.patch @@ -10,16 +10,14 @@ irq_set_irqchip_state() documentation to reflect this. Signed-off-by: Josh Cartwright <joshc@ni.com> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> -Signed-off-by: Thomas Gleixner <tglx@linutronix.de> - - +Link: https://lkml.kernel.org/r/20210917103055.92150-1-bigeasy@linutronix.de --- kernel/irq/manage.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) --- --- a/kernel/irq/manage.c +++ b/kernel/irq/manage.c -@@ -2827,7 +2827,7 @@ EXPORT_SYMBOL_GPL(irq_get_irqchip_state) +@@ -2833,7 +2833,7 @@ EXPORT_SYMBOL_GPL(irq_get_irqchip_state) * This call sets the internal irqchip state of an interrupt, * depending on the value of @which. 
* diff --git a/patches/kthread__Move_prio_affinite_change_into_the_newly_created_thread.patch b/patches/kthread-Move-prio-affinite-change-into-the-newly-cre.patch index e2ce09dcab53..f21ae2c9dbac 100644 --- a/patches/kthread__Move_prio_affinite_change_into_the_newly_created_thread.patch +++ b/patches/kthread-Move-prio-affinite-change-into-the-newly-cre.patch @@ -1,11 +1,11 @@ -Subject: kthread: Move prio/affinite change into the newly created thread -From: Sebastian Andrzej Siewior <bigeasy@linutronix.de> -Date: Mon Nov 9 21:30:41 2020 +0100 - From: Sebastian Andrzej Siewior <bigeasy@linutronix.de> +Date: Tue, 10 Nov 2020 12:38:47 +0100 +Subject: [PATCH] kthread: Move prio/affinite change into the newly created + thread With enabled threaded interrupts the nouveau driver reported the following: + | Chain exists of: | &mm->mmap_lock#2 --> &device->mutex --> &cpuset_rwsem | @@ -20,21 +20,19 @@ following: The device->mutex is nvkm_device::mutex. -Unblocking the lockchain at `cpuset_rwsem' is probably the easiest thing -to do. -Move the priority reset to the start of the newly created thread. +Unblocking the lockchain at `cpuset_rwsem' is probably the easiest +thing to do. Move the priority reset to the start of the newly +created thread. 
Fixes: 710da3c8ea7df ("sched/core: Prevent race condition between cpuset and __sched_setscheduler()") Reported-by: Mike Galbraith <efault@gmx.de> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> -Signed-off-by: Thomas Gleixner <tglx@linutronix.de> +Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/a23a826af7c108ea5651e73b8fbae5e653f16e86.camel@gmx.de - - --- kernel/kthread.c | 16 ++++++++-------- 1 file changed, 8 insertions(+), 8 deletions(-) ---- + --- a/kernel/kthread.c +++ b/kernel/kthread.c @@ -270,6 +270,7 @@ EXPORT_SYMBOL_GPL(kthread_parkme); diff --git a/patches/lockdep-Let-lock_is_held_type-detect-recursive-read-.patch b/patches/lockdep-Let-lock_is_held_type-detect-recursive-read-.patch index 4be135d95018..b100d046a5c9 100644 --- a/patches/lockdep-Let-lock_is_held_type-detect-recursive-read-.patch +++ b/patches/lockdep-Let-lock_is_held_type-detect-recursive-read-.patch @@ -13,9 +13,10 @@ as a read held lock. Fixes: e918188611f07 ("locking: More accurate annotations for read_lock()") Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> -Acked-by: Waiman Long <longman@redhat.com> +Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Boqun Feng <boqun.feng@gmail.com> -Link: https://lkml.kernel.org/r/20210910135312.4axzdxt74rgct2ur@linutronix.de +Acked-by: Waiman Long <longman@redhat.com> +Link: https://lkml.kernel.org/r/20210903084001.lblecrvz4esl4mrr@linutronix.de --- kernel/locking/lockdep.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/patches/locking-rt--Take-RCU-nesting-into-account-for-might_sleep--.patch b/patches/locking-rt--Take-RCU-nesting-into-account-for-might_sleep--.patch new file mode 100644 index 000000000000..dc3799062497 --- /dev/null +++ b/patches/locking-rt--Take-RCU-nesting-into-account-for-might_sleep--.patch @@ -0,0 +1,72 @@ +Subject: locking/rt: Take RCU nesting into account for might_sleep() +From: Thomas Gleixner 
<tglx@linutronix.de> +Date: Wed, 22 Sep 2021 12:28:19 +0200 + +The RT patches contained a cheap hack to ignore the RCU nesting depth in +might_sleep() checks, which was a pragmatic but incorrect workaround. + +The general rule that rcu_read_lock() held sections cannot voluntary sleep +does apply even on RT kernels. Though the substitution of spin/rw locks on +RT enabled kernels has to be exempt from that rule. On !RT a spin_lock() +can obviously nest inside a rcu read side critical section as the lock +acquisition is not going to block, but on RT this is not longer the case +due to the 'sleeping' spin lock substitution. + +Instead of generally ignoring the RCU nesting depth in might_sleep() +checks, pass the rcu_preempt_depth() as offset argument to might_sleep() +from spin/read/write_lock() which makes the check work correctly even in +RCU read side critical sections. + +The actual blocking on such a substituted lock within a RCU read side +critical section is already handled correctly in __schedule() by treating +it as a "preemption" of the RCU read side critical section. 
+ +Signed-off-by: Thomas Gleixner <tglx@linutronix.de> +--- + kernel/locking/spinlock_rt.c | 14 +++++++++++--- + 1 file changed, 11 insertions(+), 3 deletions(-) + +--- a/kernel/locking/spinlock_rt.c ++++ b/kernel/locking/spinlock_rt.c +@@ -24,6 +24,14 @@ + #define RT_MUTEX_BUILD_SPINLOCKS + #include "rtmutex.c" + ++/* ++ * Use ___might_sleep() which skips the state check and take RCU nesting ++ * into account as spin/read/write_lock() can legitimately nest into an RCU ++ * read side critical section: ++ */ ++#define rtlock_might_sleep() \ ++ ___might_sleep(__FILE__, __LINE__, rcu_preempt_depth()) ++ + static __always_inline void rtlock_lock(struct rt_mutex_base *rtm) + { + if (unlikely(!rt_mutex_cmpxchg_acquire(rtm, NULL, current))) +@@ -32,7 +40,7 @@ static __always_inline void rtlock_lock( + + static __always_inline void __rt_spin_lock(spinlock_t *lock) + { +- ___might_sleep(__FILE__, __LINE__, 0); ++ rtlock_might_sleep(); + rtlock_lock(&lock->lock); + rcu_read_lock(); + migrate_disable(); +@@ -210,7 +218,7 @@ EXPORT_SYMBOL(rt_write_trylock); + + void __sched rt_read_lock(rwlock_t *rwlock) + { +- ___might_sleep(__FILE__, __LINE__, 0); ++ rtlock_might_sleep(); + rwlock_acquire_read(&rwlock->dep_map, 0, 0, _RET_IP_); + rwbase_read_lock(&rwlock->rwbase, TASK_RTLOCK_WAIT); + rcu_read_lock(); +@@ -220,7 +228,7 @@ EXPORT_SYMBOL(rt_read_lock); + + void __sched rt_write_lock(rwlock_t *rwlock) + { +- ___might_sleep(__FILE__, __LINE__, 0); ++ rtlock_might_sleep(); + rwlock_acquire(&rwlock->dep_map, 0, 0, _RET_IP_); + rwbase_write_lock(&rwlock->rwbase, TASK_RTLOCK_WAIT); + rcu_read_lock(); diff --git a/patches/mm__workingset__replace_IRQ-off_check_with_a_lockdep_assert..patch b/patches/mm__workingset__replace_IRQ-off_check_with_a_lockdep_assert..patch index 156c8ce4e9ff..61d1089df4c9 100644 --- a/patches/mm__workingset__replace_IRQ-off_check_with_a_lockdep_assert..patch +++ b/patches/mm__workingset__replace_IRQ-off_check_with_a_lockdep_assert..patch @@ -17,7 +17,7 @@ held. 
Cc: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> -Link: https://https://lkml.kernel.org/r/.kernel.org/linux-mm/20190211113829.sqf6bdi4c4cdd3rp@linutronix.de/ +Link: https://lkml.kernel.org/r/20190211113829.sqf6bdi4c4cdd3rp@linutronix.de --- mm/workingset.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/patches/net__Remove_preemption_disabling_in_netif_rx.patch b/patches/net__Remove_preemption_disabling_in_netif_rx.patch index 0d5bfcb58185..648527035472 100644 --- a/patches/net__Remove_preemption_disabling_in_netif_rx.patch +++ b/patches/net__Remove_preemption_disabling_in_netif_rx.patch @@ -38,7 +38,7 @@ Signed-off-by: Thomas Gleixner <tglx@linutronix.de> --- --- a/net/core/dev.c +++ b/net/core/dev.c -@@ -4891,7 +4891,7 @@ static int netif_rx_internal(struct sk_b +@@ -4884,7 +4884,7 @@ static int netif_rx_internal(struct sk_b struct rps_dev_flow voidflow, *rflow = &voidflow; int cpu; @@ -47,7 +47,7 @@ Signed-off-by: Thomas Gleixner <tglx@linutronix.de> rcu_read_lock(); cpu = get_rps_cpu(skb->dev, skb, &rflow); -@@ -4901,14 +4901,14 @@ static int netif_rx_internal(struct sk_b +@@ -4894,14 +4894,14 @@ static int netif_rx_internal(struct sk_b ret = enqueue_to_backlog(skb, cpu, &rflow->last_qtail); rcu_read_unlock(); diff --git a/patches/powerpc__Add_support_for_lazy_preemption.patch b/patches/powerpc__Add_support_for_lazy_preemption.patch index 128c33985941..9864c18e2eec 100644 --- a/patches/powerpc__Add_support_for_lazy_preemption.patch +++ b/patches/powerpc__Add_support_for_lazy_preemption.patch @@ -75,7 +75,7 @@ Signed-off-by: Thomas Gleixner <tglx@linutronix.de> /* Don't move TLF_NAPPING without adjusting the code in entry_32.S */ --- a/arch/powerpc/kernel/interrupt.c +++ b/arch/powerpc/kernel/interrupt.c -@@ -303,7 +303,7 @@ interrupt_exit_user_prepare_main(unsigne +@@ -346,7 +346,7 @@ interrupt_exit_user_prepare_main(unsigne 
ti_flags = READ_ONCE(current_thread_info()->flags); while (unlikely(ti_flags & (_TIF_USER_WORK_MASK & ~_TIF_RESTORE_TM))) { local_irq_enable(); @@ -84,7 +84,7 @@ Signed-off-by: Thomas Gleixner <tglx@linutronix.de> schedule(); } else { /* -@@ -509,11 +509,15 @@ notrace unsigned long interrupt_exit_ker +@@ -552,11 +552,15 @@ notrace unsigned long interrupt_exit_ker /* Returning to a kernel context with local irqs enabled. */ WARN_ON_ONCE(!(regs->msr & MSR_EE)); again: diff --git a/patches/preempt__Provide_preempt__nort_variants.patch b/patches/preempt__Provide_preempt__nort_variants.patch deleted file mode 100644 index d6a0f4ba8347..000000000000 --- a/patches/preempt__Provide_preempt__nort_variants.patch +++ /dev/null @@ -1,51 +0,0 @@ -Subject: preempt: Provide preempt_*_(no)rt variants -From: Thomas Gleixner <tglx@linutronix.de> -Date: Fri Jul 24 12:38:56 2009 +0200 - -From: Thomas Gleixner <tglx@linutronix.de> - -RT needs a few preempt_disable/enable points which are not necessary -otherwise. Implement variants to avoid #ifdeffery. 
- -Signed-off-by: Thomas Gleixner <tglx@linutronix.de> - - - ---- - include/linux/preempt.h | 18 +++++++++++++++++- - 1 file changed, 17 insertions(+), 1 deletion(-) ---- ---- a/include/linux/preempt.h -+++ b/include/linux/preempt.h -@@ -188,7 +188,11 @@ do { \ - preempt_count_dec(); \ - } while (0) - --#define preempt_enable_no_resched() sched_preempt_enable_no_resched() -+#ifdef CONFIG_PREEMPT_RT -+# define preempt_enable_no_resched() sched_preempt_enable_no_resched() -+#else -+# define preempt_enable_no_resched() preempt_enable() -+#endif - - #define preemptible() (preempt_count() == 0 && !irqs_disabled()) - -@@ -282,6 +286,18 @@ do { \ - set_preempt_need_resched(); \ - } while (0) - -+#ifdef CONFIG_PREEMPT_RT -+# define preempt_disable_rt() preempt_disable() -+# define preempt_enable_rt() preempt_enable() -+# define preempt_disable_nort() barrier() -+# define preempt_enable_nort() barrier() -+#else -+# define preempt_disable_rt() barrier() -+# define preempt_enable_rt() barrier() -+# define preempt_disable_nort() preempt_disable() -+# define preempt_enable_nort() preempt_enable() -+#endif -+ - #ifdef CONFIG_PREEMPT_NOTIFIERS - - struct preempt_notifier; diff --git a/patches/ptrace__fix_ptrace_vs_tasklist_lock_race.patch b/patches/ptrace__fix_ptrace_vs_tasklist_lock_race.patch index 39115c9a8b65..93924c9805a8 100644 --- a/patches/ptrace__fix_ptrace_vs_tasklist_lock_race.patch +++ b/patches/ptrace__fix_ptrace_vs_tasklist_lock_race.patch @@ -49,7 +49,7 @@ Signed-off-by: Thomas Gleixner <tglx@linutronix.de> /* * Special states are those that do not use the normal wait-loop pattern. See * the comment with set_special_state(). 
-@@ -2014,6 +2010,81 @@ static inline int test_tsk_need_resched( +@@ -2015,6 +2011,81 @@ static inline int test_tsk_need_resched( return unlikely(test_tsk_thread_flag(tsk,TIF_NEED_RESCHED)); } diff --git a/patches/rcu-tree-Protect-rcu_rdp_is_offloaded-invocations-on.patch b/patches/rcu-tree-Protect-rcu_rdp_is_offloaded-invocations-on.patch new file mode 100644 index 000000000000..d9da124064af --- /dev/null +++ b/patches/rcu-tree-Protect-rcu_rdp_is_offloaded-invocations-on.patch @@ -0,0 +1,83 @@ +From: Thomas Gleixner <tglx@linutronix.de> +Date: Tue, 21 Sep 2021 23:12:50 +0200 +Subject: [PATCH] rcu/tree: Protect rcu_rdp_is_offloaded() invocations on RT + +Valentin reported warnings about suspicious RCU usage on RT kernels. Those +happen when offloading of RCU callbacks is enabled: + + WARNING: suspicious RCU usage + 5.13.0-rt1 #20 Not tainted + ----------------------------- + kernel/rcu/tree_plugin.h:69 Unsafe read of RCU_NOCB offloaded state! + + rcu_rdp_is_offloaded (kernel/rcu/tree_plugin.h:69 kernel/rcu/tree_plugin.h:58) + rcu_core (kernel/rcu/tree.c:2332 kernel/rcu/tree.c:2398 kernel/rcu/tree.c:2777) + rcu_cpu_kthread (./include/linux/bottom_half.h:32 kernel/rcu/tree.c:2876) + +The reason is that rcu_rdp_is_offloaded() is invoked without one of the +required protections on RT enabled kernels because local_bh_disable() does +not disable preemption on RT. + +Valentin proposed to add a local lock to the code in question, but that's +suboptimal in several aspects: + + 1) local locks add extra code to !RT kernels for no value. + + 2) All possible callsites have to be audited and amended when affected, + possibly at an outer function level, due to lock nesting issues. + + 3) As the local lock has to be taken at the outer functions it's required + to release and reacquire them in the inner code sections which might + voluntarily schedule, e.g. rcu_do_batch().
+ +Both callsites of rcu_rdp_is_offloaded() which trigger this check invoke +rcu_rdp_is_offloaded() in the variable declaration section right at the top +of the functions. But the actual usage of the result is either within a +section which provides the required protections or after such a section. + +So the obvious solution is to move the invocation into the code sections +which provide the proper protections, which solves the problem for RT and +does not have any impact on !RT kernels. + +Reported-by: Valentin Schneider <valentin.schneider@arm.com> +Signed-off-by: Thomas Gleixner <tglx@linutronix.de> +Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> +--- + kernel/rcu/tree.c | 7 ++++--- + 1 file changed, 4 insertions(+), 3 deletions(-) + +--- a/kernel/rcu/tree.c ++++ b/kernel/rcu/tree.c +@@ -2278,13 +2278,13 @@ rcu_report_qs_rdp(struct rcu_data *rdp) + { + unsigned long flags; + unsigned long mask; +- bool needwake = false; +- const bool offloaded = rcu_rdp_is_offloaded(rdp); ++ bool offloaded, needwake = false; + struct rcu_node *rnp; + + WARN_ON_ONCE(rdp->cpu != smp_processor_id()); + rnp = rdp->mynode; + raw_spin_lock_irqsave_rcu_node(rnp, flags); ++ offloaded = rcu_rdp_is_offloaded(rdp); + if (rdp->cpu_no_qs.b.norm || rdp->gp_seq != rnp->gp_seq || + rdp->gpwrap) { + +@@ -2446,7 +2446,7 @@ static void rcu_do_batch(struct rcu_data + int div; + bool __maybe_unused empty; + unsigned long flags; +- const bool offloaded = rcu_rdp_is_offloaded(rdp); ++ bool offloaded; + struct rcu_head *rhp; + struct rcu_cblist rcl = RCU_CBLIST_INITIALIZER(rcl); + long bl, count = 0; +@@ -2472,6 +2472,7 @@ static void rcu_do_batch(struct rcu_data + rcu_nocb_lock(rdp); + WARN_ON_ONCE(cpu_is_offline(smp_processor_id())); + pending = rcu_segcblist_n_cbs(&rdp->cblist); ++ offloaded = rcu_rdp_is_offloaded(rdp); + div = READ_ONCE(rcu_divisor); + div = div < 0 ? 7 : div > sizeof(long) * 8 - 2 ? 
sizeof(long) * 8 - 2 : div; + bl = max(rdp->blimit, pending >> div); diff --git a/patches/rcu__Delay_RCU-selftests.patch b/patches/rcu__Delay_RCU-selftests.patch index b91148176da4..b467716890ec 100644 --- a/patches/rcu__Delay_RCU-selftests.patch +++ b/patches/rcu__Delay_RCU-selftests.patch @@ -18,7 +18,7 @@ Signed-off-by: Thomas Gleixner <tglx@linutronix.de> --- --- a/include/linux/rcupdate.h +++ b/include/linux/rcupdate.h -@@ -101,6 +101,13 @@ void rcu_init_tasks_generic(void); +@@ -94,6 +94,13 @@ void rcu_init_tasks_generic(void); static inline void rcu_init_tasks_generic(void) { } #endif diff --git a/patches/rcu_nocb_protect_nocb_state_via_local_lock_under_preempt_rt.patch b/patches/rcu_nocb_protect_nocb_state_via_local_lock_under_preempt_rt.patch deleted file mode 100644 index 1225b82a7f0f..000000000000 --- a/patches/rcu_nocb_protect_nocb_state_via_local_lock_under_preempt_rt.patch +++ /dev/null @@ -1,304 +0,0 @@ -From: Valentin Schneider <valentin.schneider@arm.com> -Subject: rcu/nocb: Protect NOCB state via local_lock() under PREEMPT_RT -Date: Wed, 11 Aug 2021 21:13:53 +0100 - -Warning -======= - -Running v5.13-rt1 on my arm64 Juno board triggers: - -[ 0.156302] ============================= -[ 0.160416] WARNING: suspicious RCU usage -[ 0.164529] 5.13.0-rt1 #20 Not tainted -[ 0.168300] ----------------------------- -[ 0.172409] kernel/rcu/tree_plugin.h:69 Unsafe read of RCU_NOCB offloaded state! 
-[ 0.179920] -[ 0.179920] other info that might help us debug this: -[ 0.179920] -[ 0.188037] -[ 0.188037] rcu_scheduler_active = 1, debug_locks = 1 -[ 0.194677] 3 locks held by rcuc/0/11: -[ 0.198448] #0: ffff00097ef10cf8 ((softirq_ctrl.lock).lock){+.+.}-{2:2}, at: __local_bh_disable_ip (./include/linux/rcupdate.h:662 kernel/softirq.c:171) -[ 0.208709] #1: ffff80001205e5f0 (rcu_read_lock){....}-{1:2}, at: rt_spin_lock (kernel/locking/spinlock_rt.c:43 (discriminator 4)) -[ 0.217134] #2: ffff80001205e5f0 (rcu_read_lock){....}-{1:2}, at: __local_bh_disable_ip (kernel/softirq.c:169) -[ 0.226428] -[ 0.226428] stack backtrace: -[ 0.230889] CPU: 0 PID: 11 Comm: rcuc/0 Not tainted 5.13.0-rt1 #20 -[ 0.237100] Hardware name: ARM Juno development board (r0) (DT) -[ 0.243041] Call trace: -[ 0.245497] dump_backtrace (arch/arm64/kernel/stacktrace.c:163) -[ 0.249185] show_stack (arch/arm64/kernel/stacktrace.c:219) -[ 0.252522] dump_stack (lib/dump_stack.c:122) -[ 0.255947] lockdep_rcu_suspicious (kernel/locking/lockdep.c:6439) -[ 0.260328] rcu_rdp_is_offloaded (kernel/rcu/tree_plugin.h:69 kernel/rcu/tree_plugin.h:58) -[ 0.264537] rcu_core (kernel/rcu/tree.c:2332 kernel/rcu/tree.c:2398 kernel/rcu/tree.c:2777) -[ 0.267786] rcu_cpu_kthread (./include/linux/bottom_half.h:32 kernel/rcu/tree.c:2876) -[ 0.271644] smpboot_thread_fn (kernel/smpboot.c:165 (discriminator 3)) -[ 0.275767] kthread (kernel/kthread.c:321) -[ 0.279013] ret_from_fork (arch/arm64/kernel/entry.S:1005) - -In this case, this is the RCU core kthread accessing the local CPU's -rdp. Before that, rcu_cpu_kthread() invokes local_bh_disable(). - -Under !CONFIG_PREEMPT_RT (and rcutree.use_softirq=0), this ends up -incrementing the preempt_count, which satisfies the "local non-preemptible -read" of rcu_rdp_is_offloaded(). - -Under CONFIG_PREEMPT_RT however, this becomes - - local_lock(&softirq_ctrl.lock) - -which, under the same config, is migrate_disable() + rt_spin_lock(). 
As -pointed out by Frederic, this is not sufficient to safely access an rdp's -offload state, as the RCU core kthread can be preempted by a kworker -executing rcu_nocb_rdp_offload() [1]. - -Introduce a local_lock to serialize an rdp's offload state while the rdp's -associated core kthread is executing rcu_core(). - -rcu_core() preemptability considerations -======================================== - -As pointed out by Paul [2], keeping rcu_check_quiescent_state() preemptible -(which is the case under CONFIG_PREEMPT_RT) requires some consideration. - -note_gp_changes() itself runs with irqs off, and enters -__note_gp_changes() with rnp->lock held (raw_spinlock), thus is safe vs -preemption. - -rdp->core_needs_qs *could* change after being read by the RCU core -kthread if it then gets preempted. Consider, with -CONFIG_RCU_STRICT_GRACE_PERIOD: - - rcuc/x task_foo - - rcu_check_quiescent_state() - `\ - rdp->core_needs_qs == true - <PREEMPT> - rcu_read_unlock() - `\ - rcu_preempt_deferred_qs_irqrestore() - `\ - rcu_report_qs_rdp() - `\ - rdp->core_needs_qs := false; - -This would let rcuc/x's rcu_check_quiescent_state() proceed further down to -rcu_report_qs_rdp(), but if task_foo's earlier rcu_report_qs_rdp() -invocation would have cleared the rdp grpmask from the rnp mask, so -rcuc/x's invocation would simply bail. - -Since rcu_report_qs_rdp() can be safely invoked, even if rdp->core_needs_qs -changed, it appears safe to keep rcu_check_quiescent_state() preemptible. 
- -[1]: http://lore.kernel.org/r/20210727230814.GC283787@lothringen -[2]: http://lore.kernel.org/r/20210729010445.GO4397@paulmck-ThinkPad-P17-Gen-1 - -Signed-off-by: Valentin Schneider <valentin.schneider@arm.com> -Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> -Link: https://lore.kernel.org/r/20210811201354.1976839-4-valentin.schneider@arm.com ---- - kernel/rcu/tree.c | 4 ++++ - kernel/rcu/tree.h | 4 ++++ - kernel/rcu/tree_nocb.h | 39 +++++++++++++++++++++++++++++++++++++++ - kernel/rcu/tree_plugin.h | 38 ++++++++++++++++++++++++++++++-------- - 4 files changed, 77 insertions(+), 8 deletions(-) - ---- a/kernel/rcu/tree.c -+++ b/kernel/rcu/tree.c -@@ -80,6 +80,7 @@ static DEFINE_PER_CPU_SHARED_ALIGNED(str - .dynticks = ATOMIC_INIT(1), - #ifdef CONFIG_RCU_NOCB_CPU - .cblist.flags = SEGCBLIST_SOFTIRQ_ONLY, -+ .nocb_local_lock = INIT_LOCAL_LOCK(nocb_local_lock), - #endif - }; - static struct rcu_state rcu_state = { -@@ -2811,10 +2812,12 @@ static void rcu_cpu_kthread(unsigned int - { - unsigned int *statusp = this_cpu_ptr(&rcu_data.rcu_cpu_kthread_status); - char work, *workp = this_cpu_ptr(&rcu_data.rcu_cpu_has_work); -+ struct rcu_data *rdp = this_cpu_ptr(&rcu_data); - int spincnt; - - trace_rcu_utilization(TPS("Start CPU kthread@rcu_run")); - for (spincnt = 0; spincnt < 10; spincnt++) { -+ rcu_nocb_local_lock(rdp); - local_bh_disable(); - *statusp = RCU_KTHREAD_RUNNING; - local_irq_disable(); -@@ -2824,6 +2827,7 @@ static void rcu_cpu_kthread(unsigned int - if (work) - rcu_core(); - local_bh_enable(); -+ rcu_nocb_local_unlock(rdp); - if (*workp == 0) { - trace_rcu_utilization(TPS("End CPU kthread@rcu_wait")); - *statusp = RCU_KTHREAD_WAITING; ---- a/kernel/rcu/tree.h -+++ b/kernel/rcu/tree.h -@@ -210,6 +210,8 @@ struct rcu_data { - struct timer_list nocb_timer; /* Enforce finite deferral. */ - unsigned long nocb_gp_adv_time; /* Last call_rcu() CB adv (jiffies). 
*/ - -+ local_lock_t nocb_local_lock; -+ - /* The following fields are used by call_rcu, hence own cacheline. */ - raw_spinlock_t nocb_bypass_lock ____cacheline_internodealigned_in_smp; - struct rcu_cblist nocb_bypass; /* Lock-contention-bypass CB list. */ -@@ -445,6 +447,8 @@ static void rcu_nocb_unlock(struct rcu_d - static void rcu_nocb_unlock_irqrestore(struct rcu_data *rdp, - unsigned long flags); - static void rcu_lockdep_assert_cblist_protected(struct rcu_data *rdp); -+static void rcu_nocb_local_lock(struct rcu_data *rdp); -+static void rcu_nocb_local_unlock(struct rcu_data *rdp); - #ifdef CONFIG_RCU_NOCB_CPU - static void __init rcu_organize_nocb_kthreads(void); - #define rcu_nocb_lock_irqsave(rdp, flags) \ ---- a/kernel/rcu/tree_nocb.h -+++ b/kernel/rcu/tree_nocb.h -@@ -21,6 +21,11 @@ static inline int rcu_lockdep_is_held_no - return lockdep_is_held(&rdp->nocb_lock); - } - -+static inline int rcu_lockdep_is_held_nocb_local(struct rcu_data *rdp) -+{ -+ return lockdep_is_held(&rdp->nocb_local_lock); -+} -+ - static inline bool rcu_current_is_nocb_kthread(struct rcu_data *rdp) - { - /* Race on early boot between thread creation and assignment */ -@@ -181,6 +186,22 @@ static void rcu_nocb_unlock_irqrestore(s - } - } - -+/* -+ * The invocation of rcu_core() within the RCU core kthreads remains preemptible -+ * under PREEMPT_RT, thus the offload state of a CPU could change while -+ * said kthreads are preempted. Prevent this from happening by protecting the -+ * offload state with a local_lock(). -+ */ -+static void rcu_nocb_local_lock(struct rcu_data *rdp) -+{ -+ local_lock(&rcu_data.nocb_local_lock); -+} -+ -+static void rcu_nocb_local_unlock(struct rcu_data *rdp) -+{ -+ local_unlock(&rcu_data.nocb_local_lock); -+} -+ - /* Lockdep check that ->cblist may be safely accessed. 
*/ - static void rcu_lockdep_assert_cblist_protected(struct rcu_data *rdp) - { -@@ -948,6 +969,7 @@ static int rdp_offload_toggle(struct rcu - if (rdp->nocb_cb_sleep) - rdp->nocb_cb_sleep = false; - rcu_nocb_unlock_irqrestore(rdp, flags); -+ rcu_nocb_local_unlock(rdp); - - /* - * Ignore former value of nocb_cb_sleep and force wake up as it could -@@ -979,6 +1001,7 @@ static long rcu_nocb_rdp_deoffload(void - - pr_info("De-offloading %d\n", rdp->cpu); - -+ rcu_nocb_local_lock(rdp); - rcu_nocb_lock_irqsave(rdp, flags); - /* - * Flush once and for all now. This suffices because we are -@@ -1061,6 +1084,7 @@ static long rcu_nocb_rdp_offload(void *a - * Can't use rcu_nocb_lock_irqsave() while we are in - * SEGCBLIST_SOFTIRQ_ONLY mode. - */ -+ rcu_nocb_local_lock(rdp); - raw_spin_lock_irqsave(&rdp->nocb_lock, flags); - - /* -@@ -1408,6 +1432,11 @@ static inline int rcu_lockdep_is_held_no - return 0; - } - -+static inline int rcu_lockdep_is_held_nocb_local(struct rcu_data *rdp) -+{ -+ return 0; -+} -+ - static inline bool rcu_current_is_nocb_kthread(struct rcu_data *rdp) - { - return false; -@@ -1430,6 +1459,16 @@ static void rcu_nocb_unlock_irqrestore(s - local_irq_restore(flags); - } - -+/* No ->nocb_local_lock to acquire. */ -+static void rcu_nocb_local_lock(struct rcu_data *rdp) -+{ -+} -+ -+/* No ->nocb_local_lock to release. */ -+static void rcu_nocb_local_unlock(struct rcu_data *rdp) -+{ -+} -+ - /* Lockdep check that ->cblist may be safely accessed. */ - static void rcu_lockdep_assert_cblist_protected(struct rcu_data *rdp) - { ---- a/kernel/rcu/tree_plugin.h -+++ b/kernel/rcu/tree_plugin.h -@@ -13,23 +13,45 @@ - - #include "../locking/rtmutex_common.h" - -+/* -+ * Is a local read of the rdp's offloaded state safe and stable? -+ * See rcu_nocb_local_lock() & family. 
-+ */ -+static inline bool rcu_local_offload_access_safe(struct rcu_data *rdp) -+{ -+ if (!preemptible()) -+ return true; -+ -+ if (!is_migratable()) { -+ if (!IS_ENABLED(CONFIG_RCU_NOCB)) -+ return true; -+ -+ return rcu_lockdep_is_held_nocb_local(rdp); -+ } -+ -+ return false; -+} -+ - static bool rcu_rdp_is_offloaded(struct rcu_data *rdp) - { - /* -- * In order to read the offloaded state of an rdp is a safe -- * and stable way and prevent from its value to be changed -- * under us, we must either hold the barrier mutex, the cpu -- * hotplug lock (read or write) or the nocb lock. Local -- * non-preemptible reads are also safe. NOCB kthreads and -- * timers have their own means of synchronization against the -- * offloaded state updaters. -+ * In order to read the offloaded state of an rdp is a safe and stable -+ * way and prevent from its value to be changed under us, we must -+ * either... - */ - RCU_LOCKDEP_WARN( -+ // ...hold the barrier mutex... - !(lockdep_is_held(&rcu_state.barrier_mutex) || -+ // ... the cpu hotplug lock (read or write)... - (IS_ENABLED(CONFIG_HOTPLUG_CPU) && lockdep_is_cpus_held()) || -+ // ... or the NOCB lock. - rcu_lockdep_is_held_nocb(rdp) || -+ // Local reads still require the local state to remain stable -+ // (preemption disabled / local lock held) - (rdp == this_cpu_ptr(&rcu_data) && -- !(IS_ENABLED(CONFIG_PREEMPT_COUNT) && preemptible())) || -+ rcu_local_offload_access_safe(rdp)) || -+ // NOCB kthreads and timers have their own means of -+ // synchronization against the offloaded state updaters. 
- rcu_current_is_nocb_kthread(rdp)), - "Unsafe read of RCU_NOCB offloaded state" - ); diff --git a/patches/sched--Make-cond_resched_lock---RT-aware.patch b/patches/sched--Make-cond_resched_lock---RT-aware.patch new file mode 100644 index 000000000000..5731df7bd02b --- /dev/null +++ b/patches/sched--Make-cond_resched_lock---RT-aware.patch @@ -0,0 +1,78 @@ +Subject: sched: Make cond_resched_lock() RT aware +From: Thomas Gleixner <tglx@linutronix.de> +Date: Wed, 22 Sep 2021 12:08:32 +0200 + +The might_sleep() checks in the cond_resched_lock() variants use +PREEMPT_LOCK_OFFSET for preempt count offset checking. + +On PREEMPT_RT enabled kernels spin/rw_lock held sections stay preemptible +which means PREEMPT_LOCK_OFFSET is 0, but that still triggers the +might_sleep() check because that takes RCU read side nesting into account. + +On RT enabled kernels spin/read/write_lock() issue rcu_read_lock() to +resemble the !RT semantics, which means in cond_resched_lock() the might +sleep check will see preempt_count() == 0 and rcu_preempt_depth() == 1. + +Introduce PREEMPT_LOCK_SCHED_OFFSET for those might sleep checks and map +them to PREEMPT_LOCK_OFFSET on !RT and to 1 (accounting for +rcu_preempt_depth()) on RT enabled kernels. + +Signed-off-by: Thomas Gleixner <tglx@linutronix.de> +--- + include/linux/preempt.h | 12 ++++++++++-- + include/linux/sched.h | 18 +++++++++--------- + 2 files changed, 19 insertions(+), 11 deletions(-) + +--- a/include/linux/preempt.h ++++ b/include/linux/preempt.h +@@ -122,9 +122,17 @@ + * The preempt_count offset after spin_lock() + */ + #if !defined(CONFIG_PREEMPT_RT) +-#define PREEMPT_LOCK_OFFSET PREEMPT_DISABLE_OFFSET ++#define PREEMPT_LOCK_OFFSET PREEMPT_DISABLE_OFFSET ++#define PREEMPT_LOCK_RESCHED_OFFSET PREEMPT_LOCK_OFFSET + #else +-#define PREEMPT_LOCK_OFFSET 0 ++/* Locks on RT do not disable preemption */ ++#define PREEMPT_LOCK_OFFSET 0 ++/* ++ * spin/rw_lock() on RT implies rcu_read_lock(). 
The might_sleep() check in ++ * cond_resched*lock() has to take that into account because it checks for ++ * preempt_count() + rcu_preempt_depth(). ++ */ ++#define PREEMPT_LOCK_RESCHED_OFFSET 1 + #endif + + /* +--- a/include/linux/sched.h ++++ b/include/linux/sched.h +@@ -2057,19 +2057,19 @@ extern int __cond_resched_lock(spinlock_ + extern int __cond_resched_rwlock_read(rwlock_t *lock); + extern int __cond_resched_rwlock_write(rwlock_t *lock); + +-#define cond_resched_lock(lock) ({ \ +- ___might_sleep(__FILE__, __LINE__, PREEMPT_LOCK_OFFSET);\ +- __cond_resched_lock(lock); \ ++#define cond_resched_lock(lock) ({ \ ++ __might_sleep(__FILE__, __LINE__, PREEMPT_LOCK_RESCHED_OFFSET); \ ++ __cond_resched_lock(lock); \ + }) + +-#define cond_resched_rwlock_read(lock) ({ \ +- __might_sleep(__FILE__, __LINE__, PREEMPT_LOCK_OFFSET); \ +- __cond_resched_rwlock_read(lock); \ ++#define cond_resched_rwlock_read(lock) ({ \ ++ __might_sleep(__FILE__, __LINE__, PREEMPT_LOCK_RESCHED_OFFSET); \ ++ __cond_resched_rwlock_read(lock); \ + }) + +-#define cond_resched_rwlock_write(lock) ({ \ +- __might_sleep(__FILE__, __LINE__, PREEMPT_LOCK_OFFSET); \ +- __cond_resched_rwlock_write(lock); \ ++#define cond_resched_rwlock_write(lock) ( { \ ++ __might_sleep(__FILE__, __LINE__, PREEMPT_LOCK_RESCHED_OFFSET); \ ++ __cond_resched_rwlock_write(lock); \ + }) + + static inline void cond_resched_rcu(void) diff --git a/patches/sched-Add-preempt_disable_rt.patch b/patches/sched-Add-preempt_disable_rt.patch new file mode 100644 index 000000000000..6ae705b0b218 --- /dev/null +++ b/patches/sched-Add-preempt_disable_rt.patch @@ -0,0 +1,30 @@ +From: Thomas Gleixner <tglx@linutronix.de> +Date: Fri, 17 Sep 2021 12:56:34 +0200 +Subject: [PATCH] sched: Add preempt_disable_rt() + +RT needs a few preempt_disable/enable points which are not necessary +otherwise. Implement variants to avoid #ifdeffery. 
+ +Signed-off-by: Thomas Gleixner <tglx@linutronix.de> +Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> +--- + include/linux/preempt.h | 8 ++++++++ + 1 file changed, 8 insertions(+) + +--- a/include/linux/preempt.h ++++ b/include/linux/preempt.h +@@ -282,6 +282,14 @@ do { \ + set_preempt_need_resched(); \ + } while (0) + ++#ifdef CONFIG_PREEMPT_RT ++# define preempt_disable_rt() preempt_disable() ++# define preempt_enable_rt() preempt_enable() ++#else ++# define preempt_disable_rt() barrier() ++# define preempt_enable_rt() barrier() ++#endif ++ + #ifdef CONFIG_PREEMPT_NOTIFIERS + + struct preempt_notifier; diff --git a/patches/sched-Make-preempt_enable_no_resched-behave-like-pre.patch b/patches/sched-Make-preempt_enable_no_resched-behave-like-pre.patch new file mode 100644 index 000000000000..c82ab1b2000c --- /dev/null +++ b/patches/sched-Make-preempt_enable_no_resched-behave-like-pre.patch @@ -0,0 +1,26 @@ +From: Thomas Gleixner <tglx@linutronix.de> +Date: Fri, 17 Sep 2021 12:56:01 +0200 +Subject: [PATCH] sched: Make preempt_enable_no_resched() behave like + preempt_enable() on PREEMPT_RT + +Signed-off-by: Thomas Gleixner <tglx@linutronix.de> +Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> +--- + include/linux/preempt.h | 6 +++++- + 1 file changed, 5 insertions(+), 1 deletion(-) + +--- a/include/linux/preempt.h ++++ b/include/linux/preempt.h +@@ -188,7 +188,11 @@ do { \ + preempt_count_dec(); \ + } while (0) + +-#define preempt_enable_no_resched() sched_preempt_enable_no_resched() ++#ifndef CONFIG_PREEMPT_RT ++# define preempt_enable_no_resched() sched_preempt_enable_no_resched() ++#else ++# define preempt_enable_no_resched() preempt_enable() ++#endif + + #define preemptible() (preempt_count() == 0 && !irqs_disabled()) + diff --git a/patches/sched__Add_support_for_lazy_preemption.patch b/patches/sched__Add_support_for_lazy_preemption.patch index 69fdbdbc4e5b..fc5d088ff23c 100644 --- 
a/patches/sched__Add_support_for_lazy_preemption.patch +++ b/patches/sched__Add_support_for_lazy_preemption.patch @@ -72,7 +72,7 @@ Signed-off-by: Thomas Gleixner <tglx@linutronix.de> --- --- a/include/linux/preempt.h +++ b/include/linux/preempt.h -@@ -174,6 +174,20 @@ extern void preempt_count_sub(int val); +@@ -182,6 +182,20 @@ extern void preempt_count_sub(int val); #define preempt_count_inc() preempt_count_add(1) #define preempt_count_dec() preempt_count_sub(1) @@ -93,7 +93,7 @@ Signed-off-by: Thomas Gleixner <tglx@linutronix.de> #ifdef CONFIG_PREEMPT_COUNT #define preempt_disable() \ -@@ -182,6 +196,12 @@ do { \ +@@ -190,6 +204,12 @@ do { \ barrier(); \ } while (0) @@ -106,7 +106,7 @@ Signed-off-by: Thomas Gleixner <tglx@linutronix.de> #define sched_preempt_enable_no_resched() \ do { \ barrier(); \ -@@ -219,6 +239,18 @@ do { \ +@@ -227,6 +247,18 @@ do { \ __preempt_schedule(); \ } while (0) @@ -125,7 +125,7 @@ Signed-off-by: Thomas Gleixner <tglx@linutronix.de> #else /* !CONFIG_PREEMPTION */ #define preempt_enable() \ do { \ -@@ -226,6 +258,12 @@ do { \ +@@ -234,6 +266,12 @@ do { \ preempt_count_dec(); \ } while (0) @@ -138,7 +138,7 @@ Signed-off-by: Thomas Gleixner <tglx@linutronix.de> #define preempt_enable_notrace() \ do { \ barrier(); \ -@@ -267,6 +305,9 @@ do { \ +@@ -275,6 +313,9 @@ do { \ #define preempt_check_resched_rt() barrier() #define preemptible() 0 @@ -148,7 +148,7 @@ Signed-off-by: Thomas Gleixner <tglx@linutronix.de> #endif /* CONFIG_PREEMPT_COUNT */ #ifdef MODULE -@@ -285,7 +326,7 @@ do { \ +@@ -293,7 +334,7 @@ do { \ } while (0) #define preempt_fold_need_resched() \ do { \ @@ -157,7 +157,7 @@ Signed-off-by: Thomas Gleixner <tglx@linutronix.de> set_preempt_need_resched(); \ } while (0) -@@ -413,8 +454,15 @@ extern void migrate_enable(void); +@@ -417,8 +458,15 @@ extern void migrate_enable(void); #else @@ -177,7 +177,7 @@ Signed-off-by: Thomas Gleixner <tglx@linutronix.de> --- a/include/linux/sched.h +++ b/include/linux/sched.h -@@ -2014,6 
+2014,43 @@ static inline int test_tsk_need_resched( +@@ -2015,6 +2015,43 @@ static inline int test_tsk_need_resched( return unlikely(test_tsk_thread_flag(tsk,TIF_NEED_RESCHED)); } diff --git a/patches/sched__Disable_CONFIG_RT_GROUP_SCHED_on_RT.patch b/patches/sched__Disable_CONFIG_RT_GROUP_SCHED_on_RT.patch deleted file mode 100644 index a3d5bc25c9b6..000000000000 --- a/patches/sched__Disable_CONFIG_RT_GROUP_SCHED_on_RT.patch +++ /dev/null @@ -1,32 +0,0 @@ -Subject: sched: Disable CONFIG_RT_GROUP_SCHED on RT -From: Thomas Gleixner <tglx@linutronix.de> -Date: Mon Jul 18 17:03:52 2011 +0200 - -From: Thomas Gleixner <tglx@linutronix.de> - -Carsten reported problems when running: - - taskset 01 chrt -f 1 sleep 1 - -from within rc.local on a F15 machine. The task stays running and -never gets on the run queue because some of the run queues have -rt_throttled=1 which does not go away. Works nice from a ssh login -shell. Disabling CONFIG_RT_GROUP_SCHED solves that as well. - -Signed-off-by: Thomas Gleixner <tglx@linutronix.de> - - ---- - init/Kconfig | 1 + - 1 file changed, 1 insertion(+) ---- ---- a/init/Kconfig -+++ b/init/Kconfig -@@ -1008,6 +1008,7 @@ config CFS_BANDWIDTH - config RT_GROUP_SCHED - bool "Group scheduling for SCHED_RR/FIFO" - depends on CGROUP_SCHED -+ depends on !PREEMPT_RT - default n - help - This feature lets you explicitly allocate real CPU bandwidth diff --git a/patches/sched__Do_not_account_rcu_preempt_depth_on_RT_in_might_sleep.patch b/patches/sched__Do_not_account_rcu_preempt_depth_on_RT_in_might_sleep.patch deleted file mode 100644 index 6d56d70c4e5d..000000000000 --- a/patches/sched__Do_not_account_rcu_preempt_depth_on_RT_in_might_sleep.patch +++ /dev/null @@ -1,51 +0,0 @@ -Subject: sched: Do not account rcu_preempt_depth on RT in might_sleep() -From: Thomas Gleixner <tglx@linutronix.de> -Date: Tue Jun 7 09:19:06 2011 +0200 - -From: Thomas Gleixner <tglx@linutronix.de> - -RT changes the rcu_preempt_depth semantics, so we cannot check for it 
-in might_sleep(). - -Signed-off-by: Thomas Gleixner <tglx@linutronix.de> - - ---- - include/linux/rcupdate.h | 7 +++++++ - kernel/sched/core.c | 2 +- - 2 files changed, 8 insertions(+), 1 deletion(-) ---- ---- a/include/linux/rcupdate.h -+++ b/include/linux/rcupdate.h -@@ -54,6 +54,11 @@ void __rcu_read_unlock(void); - * types of kernel builds, the rcu_read_lock() nesting depth is unknowable. - */ - #define rcu_preempt_depth() READ_ONCE(current->rcu_read_lock_nesting) -+#ifndef CONFIG_PREEMPT_RT -+#define sched_rcu_preempt_depth() rcu_preempt_depth() -+#else -+static inline int sched_rcu_preempt_depth(void) { return 0; } -+#endif - - #else /* #ifdef CONFIG_PREEMPT_RCU */ - -@@ -79,6 +84,8 @@ static inline int rcu_preempt_depth(void - return 0; - } - -+#define sched_rcu_preempt_depth() rcu_preempt_depth() -+ - #endif /* #else #ifdef CONFIG_PREEMPT_RCU */ - - /* Internal to kernel */ ---- a/kernel/sched/core.c -+++ b/kernel/sched/core.c -@@ -9471,7 +9471,7 @@ void __init sched_init(void) - #ifdef CONFIG_DEBUG_ATOMIC_SLEEP - static inline int preempt_count_equals(int preempt_offset) - { -- int nested = preempt_count() + rcu_preempt_depth(); -+ int nested = preempt_count() + sched_rcu_preempt_depth(); - - return (nested == preempt_offset); - } diff --git a/patches/sched_introduce_migratable.patch b/patches/sched_introduce_migratable.patch index fc44f3015d8c..c31f03372f7a 100644 --- a/patches/sched_introduce_migratable.patch +++ b/patches/sched_introduce_migratable.patch @@ -26,7 +26,7 @@ Link: https://lore.kernel.org/r/20210811201354.1976839-3-valentin.schneider@arm. 
 --- a/include/linux/sched.h
 +++ b/include/linux/sched.h
-@@ -1729,6 +1729,16 @@ static inline bool is_percpu_thread(void
+@@ -1730,6 +1730,16 @@ static inline bool is_percpu_thread(void
  #endif
  }
diff --git a/patches/series b/patches/series
index 497e738d515a..9b1086587b39 100644
--- a/patches/series
+++ b/patches/series
@@ -3,8 +3,9 @@
 ###########################################################################
 # Valentin's PCP fixes
 ###########################################################################
+# Temp RCU patch, Frederick is working on something, too.
+rcu-tree-Protect-rcu_rdp_is_offloaded-invocations-on.patch
 sched_introduce_migratable.patch
-rcu_nocb_protect_nocb_state_via_local_lock_under_preempt_rt.patch
 arm64_mm_make_arch_faults_on_old_pte_check_for_migratability.patch
 
 ###########################################################################
@@ -25,14 +26,18 @@ printk__add_pr_flush.patch
 printk__Enhance_the_condition_check_of_msleep_in_pr_flush.patch
 
 ###########################################################################
-# Posted
+# Posted and applied
 ###########################################################################
 sched-Switch-wait_task_inactive-to-HRTIMER_MODE_REL_.patch
+rcutorture-Avoid-problematic-critical-section-nestin.patch
+kthread-Move-prio-affinite-change-into-the-newly-cre.patch
+genirq-Move-prio-assignment-into-the-newly-created-t.patch
+genirq-Disable-irqfixup-poll-on-PREEMPT_RT.patch
 lockdep-Let-lock_is_held_type-detect-recursive-read-.patch
-ASoC-mediatek-mt8195-Remove-unsued-irqs_lock.patch
-smack-Guard-smack_ipv6_lock-definition-within-a-SMAC.patch
-virt-acrn-Remove-unsued-acrn_irqfds_mutex.patch
 
+###########################################################################
+# Posted
+###########################################################################
 #KCOV
 0001_documentation_kcov_include_types_h_in_the_example.patch
 0002_documentation_kcov_define_ip_in_the_example.patch
@@ -43,26 +48,21 @@ virt-acrn-Remove-unsued-acrn_irqfds_mutex.patch
 ###########################################################################
 # Post
 ###########################################################################
-kthread__Move_prio_affinite_change_into_the_newly_created_thread.patch
-genirq__Move_prio_assignment_into_the_newly_created_thread.patch
 cgroup__use_irqsave_in_cgroup_rstat_flush_locked.patch
 mm__workingset__replace_IRQ-off_check_with_a_lockdep_assert..patch
 net__Move_lockdep_where_it_belongs.patch
 tcp__Remove_superfluous_BH-disable_around_listening_hash.patch
 samples_kfifo__Rename_read_lock_write_lock.patch
 smp__Wake_ksoftirqd_on_PREEMPT_RT_instead_do_softirq..patch
-genirq__update_irq_set_irqchip_state_documentation.patch
-mm-Fully-initialize-invalidate_lock-amend-lock-class.patch
 
 ###########################################################################
 # Kconfig bits:
 ###########################################################################
-genirq__Disable_irqpoll_on_-rt.patch
 jump-label__disable_if_stop_machine_is_used.patch
 leds__trigger__disable_CPU_trigger_on_-RT.patch
 kconfig__Disable_config_options_which_are_not_RT_compatible.patch
 mm__Allow_only_SLUB_on_RT.patch
-sched__Disable_CONFIG_RT_GROUP_SCHED_on_RT.patch
+
 net_core__disable_NET_RX_BUSY_POLL_on_RT.patch
 efi__Disable_runtime_services_on_RT.patch
 efi__Allow_efiruntime.patch
@@ -86,25 +86,33 @@ lockdep-selftests-Avoid-using-local_lock_-acquire-re.patch
 0007-lockdep-selftests-Unbalanced-migrate_disable-rcu_rea.patch
 0008-lockdep-selftests-Skip-the-softirq-related-tests-on-.patch
 0010-lockdep-selftests-Adapt-ww-tests-for-PREEMPT_RT.patch
-
-# Unbreaks powerpc
 locking-Allow-to-include-asm-spinlock_types.h-from-l.patch
 
 ###########################################################################
 # preempt: Conditional variants
 ###########################################################################
-preempt__Provide_preempt__nort_variants.patch
+sched-Add-preempt_disable_rt.patch
+sched-Make-preempt_enable_no_resched-behave-like-pre.patch
 
 ###########################################################################
 # sched:
 ###########################################################################
+# cpu-light
 kernel_sched__add_putget_cpu_light.patch
+block_mq__do_not_invoke_preempt_disable.patch
+md__raid5__Make_raid5_percpu_handling_RT_aware.patch
+scsi_fcoe__Make_RT_aware..patch
+mm_vmalloc__Another_preempt_disable_region_which_sucks.patch
+net__Remove_preemption_disabling_in_netif_rx.patch
+sunrpc__Make_svc_xprt_do_enqueue_use_get_cpu_light.patch
+crypto__cryptd_-_add_a_lock_instead_preempt_disable_local_bh_disable.patch
+#
 sched__Limit_the_number_of_task_migrations_per_batch.patch
 sched__Move_mmdrop_to_RCU_on_RT.patch
 kernel_sched__move_stack__kprobe_clean_up_to___put_task_struct.patch
-sched__Do_not_account_rcu_preempt_depth_on_RT_in_might_sleep.patch
+sched--Make-cond_resched_lock---RT-aware.patch
+locking-rt--Take-RCU-nesting-into-account-for-might_sleep--.patch
 sched__Disable_TTWU_QUEUE_on_RT.patch
-cpuset__Convert_callback_lock_to_raw_spinlock_t.patch
 
 ###########################################################################
 # softirq:
@@ -126,7 +134,7 @@ mm__page_alloc__Use_migrate_disable_in_drain_local_pages_wq.patch
 u64_stats__Disable_preemption_on_32bit-UP_SMP_with_RT_during_updates.patch
 mm_zsmalloc__copy_with_get_cpu_var_and_locking.patch
-mm_vmalloc__Another_preempt_disable_region_which_sucks.patch
+drivers_block_zram__Replace_bit_spinlocks_with_rtmutex_for_-rt.patch
 mm_scatterlist__Do_not_disable_irqs_on_RT.patch
 
 ###########################################################################
@@ -157,7 +165,6 @@ fs__namespace__Use_cpu_chill_in_trylock_loops.patch
 # RCU
 ###########################################################################
 rcu__Delay_RCU-selftests.patch
-rcutorture-Avoid-problematic-critical-section-nestin.patch
 
 ###########################################################################
 # net:
@@ -165,25 +172,13 @@ rcutorture-Avoid-problematic-critical-section-nestin.patch
 net_Qdisc__use_a_seqlock_instead_seqcount.patch
 net__Properly_annotate_the_try-lock_for_the_seqlock.patch
 net_core__use_local_bh_disable_in_netif_rx_ni.patch
-sunrpc__Make_svc_xprt_do_enqueue_use_get_cpu_light.patch
 net__Use_skbufhead_with_raw_lock.patch
 net__Dequeue_in_dev_cpu_dead_without_the_lock.patch
 net__dev__always_take_qdiscs_busylock_in___dev_xmit_skb.patch
-net__Remove_preemption_disabling_in_netif_rx.patch
-
-###########################################################################
-# block & friends:
-###########################################################################
-block_mq__do_not_invoke_preempt_disable.patch
-drivers_block_zram__Replace_bit_spinlocks_with_rtmutex_for_-rt.patch
-md__raid5__Make_raid5_percpu_handling_RT_aware.patch
-scsi_fcoe__Make_RT_aware..patch
 
 ###########################################################################
 # crypto:
 ###########################################################################
-crypto__limit_more_FPU-enabled_sections.patch
-crypto__cryptd_-_add_a_lock_instead_preempt_disable_local_bh_disable.patch
 crypto-testmgr-Only-disable-migration-in-crypto_disa.patch
 
 ###########################################################################
@@ -204,21 +199,6 @@ drm_i915_gt__Only_disable_interrupts_for_the_timeline_lock_on_force-threaded.pat
 drm-i915-gt-Queue-and-wait-for-the-irq_work-item.patch
 drm-i915-gt-Use-spin_lock_irq-instead-of-local_irq_d.patch
-
-###########################################################################
-# tty/serial: ARM drivers
-###########################################################################
-tty_serial_omap__Make_the_locking_RT_aware.patch
-tty_serial_pl011__Make_the_locking_work_on_RT.patch
-
-###########################################################################
-# TPM:
-###########################################################################
-tpm_tis__fix_stall_after_iowrites.patch
-
-###########################################################################
-# sysfs
-###########################################################################
-sysfs__Add__sys_kernel_realtime_entry.patch
 
 ###########################################################################
 # X86:
@@ -229,6 +209,16 @@ x86__Allow_to_enable_RT.patch
 x86__Enable_RT_also_on_32bit.patch
 
 ###########################################################################
+# For later, not essencial
+###########################################################################
+genirq__update_irq_set_irqchip_state_documentation.patch
+mm-Fully-initialize-invalidate_lock-amend-lock-class.patch
+ASoC-mediatek-mt8195-Remove-unsued-irqs_lock.patch
+smack-Guard-smack_ipv6_lock-definition-within-a-SMAC.patch
+virt-acrn-Remove-unsued-acrn_irqfds_mutex.patch
+tpm_tis__fix_stall_after_iowrites.patch
+
+###########################################################################
 # Lazy preemption
 ###########################################################################
 sched__Add_support_for_lazy_preemption.patch
@@ -246,6 +236,8 @@ ARM__enable_irq_in_translation_section_permission_fault_handlers.patch
 KVM__arm_arm64__downgrade_preempt_disabled_region_to_migrate_disable.patch
 arm64-sve-Delay-freeing-memory-in-fpsimd_flush_threa.patch
 arm64-sve-Make-kernel-FPU-protection-RT-friendly.patch
+tty_serial_omap__Make_the_locking_RT_aware.patch
+tty_serial_pl011__Make_the_locking_work_on_RT.patch
 ARM__Allow_to_enable_RT.patch
 ARM64__Allow_to_enable_RT.patch
 
@@ -258,6 +250,9 @@ powerpc_kvm__Disable_in-kernel_MPIC_emulation_for_PREEMPT_RT.patch
 powerpc_stackprotector__work_around_stack-guard_init_from_atomic.patch
 POWERPC__Allow_to_enable_RT.patch
 
+# Sysfs file vs uname() -v
+sysfs__Add__sys_kernel_realtime_entry.patch
+
 ###########################################################################
 # RT release version
 ###########################################################################
diff --git a/patches/softirq__Check_preemption_after_reenabling_interrupts.patch b/patches/softirq__Check_preemption_after_reenabling_interrupts.patch
index 9b68858513cc..d23abc5ed580 100644
--- a/patches/softirq__Check_preemption_after_reenabling_interrupts.patch
+++ b/patches/softirq__Check_preemption_after_reenabling_interrupts.patch
@@ -25,18 +25,18 @@ Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
 ---
 --- a/include/linux/preempt.h
 +++ b/include/linux/preempt.h
-@@ -190,8 +190,10 @@ do { \
+@@ -198,8 +198,10 @@ do { \
 
- #ifdef CONFIG_PREEMPT_RT
+ #ifndef CONFIG_PREEMPT_RT
  # define preempt_enable_no_resched() sched_preempt_enable_no_resched()
-+# define preempt_check_resched_rt() preempt_check_resched()
++# define preempt_check_resched_rt() barrier();
  #else
  # define preempt_enable_no_resched() preempt_enable()
-+# define preempt_check_resched_rt() barrier();
++# define preempt_check_resched_rt() preempt_check_resched()
  #endif
 
  #define preemptible() (preempt_count() == 0 && !irqs_disabled())
-@@ -262,6 +264,7 @@ do { \
+@@ -270,6 +272,7 @@ do { \
  #define preempt_disable_notrace() barrier()
  #define preempt_enable_no_resched_notrace() barrier()
  #define preempt_enable_notrace() barrier()