summaryrefslogtreecommitdiff
path: root/arch/powerpc
Commit message (Collapse)AuthorAgeFilesLines
* powerpc/mm/hash: Don't memset pgd table if not neededAneesh Kumar K.V2018-03-311-1/+11
| | | | | | | | | | | | | We need to zero-out pgd table only if we share the slab cache with pud/pmd level caches. With the support of 4PB, we don't share the slab cache anymore. Instead of removing the code completely hide it within an #ifdef. We don't need to do this with any other page table level, because they all allocate table of double the size and we take of initializing the first half corrrectly during page table zap. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> [mpe: Consolidate multiple #if / #ifdef into one] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* powerpc/mm/hash64: Increase the VA rangeAneesh Kumar K.V2018-03-315-13/+13
| | | | | | | | | This patch increases the max virtual (effective) address value to 4PB. With 4K page size config we continue to limit ourself to 64TB. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> [mpe: Keep the H_PGTABLE_RANGE test, update it to work] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* powerpc/mm: Add support for handling > 512TB address in SLB missAneesh Kumar K.V2018-03-3115-27/+245
| | | | | | | | | | | | | | For addresses above 512TB we allocate additional mmu contexts. To make it all easy, addresses above 512TB are handled with IR/DR=1 and with stack frame setup. The mmu_context_t is also updated to track the new extended_ids. To support upto 4PB we need a total 8 contexts. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> [mpe: Minor formatting tweaks and comment wording, switch BUG to WARN in get_ea_context().] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* powerpc/mm/slice: Consolidate return path in slice_get_unmapped_area()Aneesh Kumar K.V2018-03-311-16/+20
| | | | | | | | | | | In a following patch, on finding a free area we will need to do allocatinon of extra contexts as needed. Consolidating the return path for slice_get_unmapped_area() will make that easier. Split into a separate patch to make review easy. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* powerpc/mm/keys: Move pte bits to correct headersAneesh Kumar K.V2018-03-313-19/+15
| | | | | | | | | Memory keys are supported only with hash translation mode. Instead of using #ifdef in generic code move the key related pte bits to respective headers Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* powerpc/xive: Fix wrong xmon output caused by typoFrederic Barrat2018-03-311-1/+1
| | | | | Signed-off-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* powerpc/64: Fix smp_wmb barrier definition use use lwsync consistentlyNicholas Piggin2018-03-312-5/+2
| | | | | | | | | | | | | | | asm/barrier.h is not always included after asm/synch.h, which meant it was missing __SUBARCH_HAS_LWSYNC, so in some files smp_wmb() would be eieio when it should be lwsync. kernel/time/hrtimer.c is one case. __SUBARCH_HAS_LWSYNC is only used in one place, so just fold it in to where it's used. Previously with my small simulator config, 377 instances of eieio in the tree. After this patch there are 55. Fixes: 46d075be585e ("powerpc: Optimise smp_wmb") Cc: stable@vger.kernel.org # v2.6.29+ Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* powerpc/4xx: Fix error return code in ppc4xx_msi_probe()Wei Yongjun2018-03-311-1/+2
| | | | | | | | | Fix to return a negative error code from the error handling case instead of 0, as done elsewhere in this function. Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com> [mpe: Add missing ';' to make it compile] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* powerpc/mm: Fix thread_pkey_regs_init()Ram Pai2018-03-311-3/+3
| | | | | | | | | | | | | | thread_pkey_regs_init() initializes the pkey related registers instead of initializing the fields in the task structures. Fortunately those key related registers are re-set to zero when the task gets scheduled on the cpu. However its good to fix this glaringly visible error. Fixes: 06bb53b33804 ("powerpc: store and restore the pkey state across context switches") Signed-off-by: Ram Pai <linuxram@us.ibm.com> Signed-off-by: Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com> Acked-by: Balbir Singh <bsingharora@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* powerpc/kprobes: Fix call trace due to incorrect preempt countNaveen N. Rao2018-03-311-13/+17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Michael Ellerman reported the following call trace when running ftracetest: BUG: using __this_cpu_write() in preemptible [00000000] code: ftracetest/6178 caller is opt_pre_handler+0xc4/0x110 CPU: 1 PID: 6178 Comm: ftracetest Not tainted 4.15.0-rc7-gcc6x-gb2cd1df #1 Call Trace: [c0000000f9ec39c0] [c000000000ac4304] dump_stack+0xb4/0x100 (unreliable) [c0000000f9ec3a00] [c00000000061159c] check_preemption_disabled+0x15c/0x170 [c0000000f9ec3a90] [c000000000217e84] opt_pre_handler+0xc4/0x110 [c0000000f9ec3af0] [c00000000004cf68] optimized_callback+0x148/0x170 [c0000000f9ec3b40] [c00000000004d954] optinsn_slot+0xec/0x10000 [c0000000f9ec3e30] [c00000000004bae0] kretprobe_trampoline+0x0/0x10 This is showing up since OPTPROBES is now enabled with CONFIG_PREEMPT. trampoline_probe_handler() considers itself to be a special kprobe handler for kretprobes. In doing so, it expects to be called from kprobe_handler() on a trap, and re-enables preemption before returning a non-zero return value so as to suppress any subsequent processing of the trap by the kprobe_handler(). However, with optprobes, we don't deal with special handlers (we ignore the return code) and just try to re-enable preemption causing the above trace. To address this, modify trampoline_probe_handler() to not be special. The only additional processing done in kprobe_handler() is to emulate the instruction (in this case, a 'nop'). We adjust the value of regs->nip for the purpose and delegate the job of re-enabling preemption and resetting current kprobe to the probe handlers (kprobe_handler() or optimized_callback()). Fixes: 8a2d71a3f273 ("powerpc/kprobes: Disable preemption before invoking probe handler for optprobes") Cc: stable@vger.kernel.org # v4.15+ Reported-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Acked-by: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* powerpc/powernv: Handle unknown OPAL errors in opal_nvram_write()Nicholas Piggin2018-03-311-0/+4
| | | | | | | | | | | | | opal_nvram_write currently just assumes success if it encounters an error other than OPAL_BUSY or OPAL_BUSY_EVENT. Have it return -EIO on other errors instead. Fixes: 628daa8d5abf ("powerpc/powernv: Add RTC and NVRAM support plus RTAS fallbacks") Cc: stable@vger.kernel.org # v3.2+ Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Reviewed-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com> Acked-by: Stewart Smith <stewart@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* powerpc/pseries: Fix clearing of security feature flagsMauricio Faria de Oliveira2018-03-311-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The H_CPU_BEHAV_* flags should be checked for in the 'behaviour' field of 'struct h_cpu_char_result' -- 'character' is for H_CPU_CHAR_* flags. Found by playing around with QEMU's implementation of the hypercall: H_CPU_CHAR=0xf000000000000000 H_CPU_BEHAV=0x0000000000000000 This clears H_CPU_BEHAV_FAVOUR_SECURITY and H_CPU_BEHAV_L1D_FLUSH_PR so pseries_setup_rfi_flush() disables 'rfi_flush'; and it also clears H_CPU_CHAR_L1D_THREAD_PRIV flag. So there is no RFI flush mitigation at all for cpu_show_meltdown() to report; but currently it does: Original kernel: # cat /sys/devices/system/cpu/vulnerabilities/meltdown Mitigation: RFI Flush Patched kernel: # cat /sys/devices/system/cpu/vulnerabilities/meltdown Not affected H_CPU_CHAR=0x0000000000000000 H_CPU_BEHAV=0xf000000000000000 This sets H_CPU_BEHAV_BNDS_CHK_SPEC_BAR so cpu_show_spectre_v1() should report vulnerable; but currently it doesn't: Original kernel: # cat /sys/devices/system/cpu/vulnerabilities/spectre_v1 Not affected Patched kernel: # cat /sys/devices/system/cpu/vulnerabilities/spectre_v1 Vulnerable Brown-paper-bag-by: Michael Ellerman <mpe@ellerman.id.au> Fixes: f636c14790ea ("powerpc/pseries: Set or clear security feature flags") Signed-off-by: Mauricio Faria de Oliveira <mauricfo@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* Merge branch 'fixes' into nextMichael Ellerman2018-03-2815-93/+157
|\ | | | | | | | | | | | | | | | | Merge our fixes branch from the 4.16 cycle. There were a number of important fixes merged, in particular some Power9 workarounds that we want in next for testing purposes. There's also been some conflicting changes in the CPU features code which are best merged and tested before going upstream.
| * powerpc/64s: Fix i-side SLB miss bad address handler saving nonvolatile GPRsNicholas Piggin2018-03-261-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The SLB bad address handler's trap number fixup does not preserve the low bit that indicates nonvolatile GPRs have not been saved. This leads save_nvgprs to skip saving them, and subsequent functions and return from interrupt will think they are saved. This causes kernel branch-to-garbage debugging to not have correct registers, can also cause userspace to have its registers clobbered after a segfault. Fixes: f0f558b131db ("powerpc/mm: Preserve CFAR value on SLB miss caused by access to bogus address") Cc: stable@vger.kernel.org # v4.9+ Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
| * powerpc/mm: Fixup tlbie vs store ordering issue on POWER9Aneesh Kumar K.V2018-03-237-2/+50
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | On POWER9, under some circumstances, a broadcast TLB invalidation might complete before all previous stores have drained, potentially allowing stale stores from becoming visible after the invalidation. This works around it by doubling up those TLB invalidations which was verified by HW to be sufficient to close the risk window. This will be documented in a yet-to-be-published errata. Fixes: 1a472c9dba6b ("powerpc/mm/radix: Add tlbflush routines") Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> [mpe: Enable the feature in the DT CPU features code for all Power9, rename the feature to CPU_FTR_P9_TLBIE_BUG per benh.] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
| * powerpc/mm/radix: Move the functions that does the actual tlbie closerAneesh Kumar K.V2018-03-231-32/+32
| | | | | | | | | | | | | | No functionality change. Just code movement to ease code changes later Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
| * powerpc/mm/radix: Remove unused codeAneesh Kumar K.V2018-03-232-43/+0
| | | | | | | | | | | | | | These function are not used in the code. Remove them. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
| * powerpc/mm: Workaround Nest MMU bug with TLB invalidationsBenjamin Herrenschmidt2018-03-231-7/+43
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | On POWER9 the Nest MMU may fail to invalidate some translations when doing a tlbie "by PID" or "by LPID" that is targeted at the TLB only and not the page walk cache. This works around it by forcing such invalidations to escalate to RIC=2 (full invalidation of TLB *and* PWC) when a coprocessor is in use for the context. Fixes: 03b8abedf4f4 ("cxl: Enable global TLBIs for cxl contexts") Cc: stable@vger.kernel.org # v4.15+ Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Balbir Singh <bsingharora@gmail.com> [balbirs: fixed spelling and coding style to quiesce checkpatch.pl] Tested-by: Balbir Singh <bsingharora@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
| * powerpc/mm: Add tracking of the number of coprocessors using a contextBenjamin Herrenschmidt2018-03-233-5/+17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently, when using coprocessors (which use the Nest MMU), we simply increment the active_cpu count to force all TLB invalidations to be come broadcast. Unfortunately, due to an errata in POWER9, we will need to know more specifically that coprocessors are in use. This maintains a separate copros counter in the MMU context for that purpose. NB. The commit mentioned in the fixes tag below is not at fault for the bug we're fixing in this commit and the next, but this fix applies on top the infrastructure it introduced. Fixes: 03b8abedf4f4 ("cxl: Enable global TLBIs for cxl contexts") Cc: stable@vger.kernel.org # v4.15+ Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Tested-by: Balbir Singh <bsingharora@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
| * powerpc/64s: Fix lost pending interrupt due to race causing lost update to ↵Nicholas Piggin2018-03-231-0/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | irq_happened force_external_irq_replay() can be called in the do_IRQ path with interrupts hard enabled and soft disabled if may_hard_irq_enable() set MSR[EE]=1. It updates local_paca->irq_happened with a load, modify, store sequence. If a maskable interrupt hits during this sequence, it will go to the masked handler to be marked pending in irq_happened. This update will be lost when the interrupt returns and the store instruction executes. This can result in unpredictable latencies, timeouts, lockups, etc. Fix this by ensuring hard interrupts are disabled before modifying irq_happened. This could cause any maskable asynchronous interrupt to get lost, but it was noticed on P9 SMP system doing RDMA NVMe target over 100GbE, so very high external interrupt rate and high IPI rate. The hang was bisected down to enabling doorbell interrupts for IPIs. These provided an interrupt type that could run at high rates in the do_IRQ path, stressing the race. Fixes: 1d607bb3bd60 ("powerpc/irq: Add mechanism to force a replay of interrupts") Cc: stable@vger.kernel.org # v4.8+ Reported-by: Carol L. Soto <clsoto@us.ibm.com> Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
| * powerpc/64s: Fix NULL AT_BASE_PLATFORM when using DT CPU featuresMichael Ellerman2018-03-141-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When running virtualised the powerpc kernel is able to run the system in "compat mode" - which means the kernel and hardware are pretending to userspace that the CPU is an older version than it actually is. AT_BASE_PLATFORM is an AUXV entry that we export to userspace for use when we're running in that mode, which tells userspace the "platform" string for the real CPU version, as opposed to the faked version. Although we don't support compat mode when using DT CPU features, and arguably don't need to set AT_BASE_PLATFORM, the existing cputable based code always sets it even when we're running bare metal. That means the lack of AT_BASE_PLATFORM is a user-visible artifact of the fact that the kernel is using DT CPU features, which we don't want. So set it in the DT CPU features code also. This results in eg: $ LD_SHOW_AUXV=1 /bin/true | grep "AT_.*PLATFORM" AT_PLATFORM: power9 AT_BASE_PLATFORM:power9 Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
| * powerpc/pseries: Fix vector5 in ibm architecture vector tableBharata B Rao2018-03-061-1/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | With ibm,dynamic-memory-v2 and ibm,drc-info coming around the same time, byte22 in vector5 of ibm architecture vector table got set twice separately. The end result is that guest kernel isn't advertising support for ibm,dynamic-memory-v2. Fix this by removing the duplicate assignment of byte22. Fixes: 02ef6dd8109b ("powerpc: Enable support for ibm,drc-info devtree property") Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
| * powerpc/boot: Fix random libfdt related build errorsGuenter Roeck2018-02-281-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Once in a while I see build errors similar to the following when building images from a clean tree. Building powerpc:virtex-ml507:44x/virtex5_defconfig ... failed ------------ Error log: arch/powerpc/boot/treeboot-akebono.c:37:20: fatal error: libfdt.h: No such file or directory Building powerpc:bamboo:smpdev:44x/bamboo_defconfig ... failed ------------ Error log: arch/powerpc/boot/treeboot-akebono.c:37:20: fatal error: libfdt.h: No such file or directory arch/powerpc/boot/treeboot-currituck.c:35:20: fatal error: libfdt.h: No such file or directory Rebuilds will succeed. Turns out that several source files in arch/powerpc/boot/ include libfdt.h, but Makefile dependencies are incomplete. Let's fix that. Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* | Merge branch 'topic/ppc-kvm' into nextMichael Ellerman2018-03-2710-6/+59
|\ \ | | | | | | | | | | | | Merge the DAWR series, which touches arch code and KVM code and may need to be merged into the kvm-ppc tree.
| * | powerpc: Disable DAWR in the base POWER9 CPU featuresMichael Neuling2018-03-271-3/+2
| | | | | | | | | | | | | | | | | | | | | | | | Using the DAWR on POWER9 can cause xstops, hence we need to disable it. Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
| * | powerpc: Disable DAWR on POWER9 via CPU feature quirkMichael Neuling2018-03-271-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This disables the DAWR on all POWER9 CPUs via cpu feature quirk. Using the DAWR on POWER9 can cause xstops, hence we need to disable it. Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
| * | KVM: PPC: Book3S HV: Handle migration with POWER9 disabled DAWRMichael Neuling2018-03-271-0/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | POWER9 with the DAWR disabled causes problems for partition migration. Either we have to fail the migration (since we lose the DAWR) or we silently drop the DAWR and allow the migration to pass. This patch does the latter and allows the migration to pass (at the cost of silently losing the DAWR). This is not ideal but hopefully the best overall solution. This approach has been acked by Paulus. With this patch kvmppc_set_one_reg() will store the DAWR in the vcpu but won't actually set it on POWER9 hardware. Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
| * | KVM: PPC: Book3S HV: Return error from h_set_dabr() on POWER9Michael Neuling2018-03-272-1/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | POWER7 compat mode guests can use h_set_dabr on POWER9. POWER9 should use the DAWR but since it's disabled there we can't. This returns H_UNSUPPORTED on a h_set_dabr() on POWER9 where the DAWR is disabled. Current Linux guests ignore this error, so they will silently not get the DAWR (sigh). The same error code is being used by POWERVM in this case. Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
| * | KVM: PPC: Book3S HV: Return error from h_set_mode(SET_DAWR) on POWER9Michael Neuling2018-03-271-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Return H_P2 on a h_set_mode(SET_DAWR) on POWER9 where the DAWR is disabled. Current Linux guests ignore this error, so they will silently not get the DAWR (sigh). The same error code is being used by POWERVM in this case. Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
| * | powerpc: Update xmon to use ppc_breakpoint_available()Michael Neuling2018-03-271-0/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | The 'bd' command will now print an error and not set the breakpoint on P9. Signed-off-by: Michael Neuling <mikey@neuling.org> [mpe: Unsplit quoted string] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
| * | powerpc: Update ptrace to use ppc_breakpoint_available()Michael Neuling2018-03-272-2/+17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This updates the ptrace code to use ppc_breakpoint_available(). We now advertise via PPC_PTRACE_GETHWDBGINFO zero breakpoints when the DAWR is missing (ie. POWER9). This results in GDB falling back to software emulation of the breakpoint (which is slow). For the features advertised by PPC_PTRACE_GETHWDBGINFO, we keep advertising DAWR as if we don't GDB assumes 1 breakpoint irrespective of the number of breakpoints advertised. GDB then fails later when trying to set this one breakpoint. Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
| * | powerpc: Add ppc_breakpoint_available()Michael Neuling2018-03-272-0/+13
| | | | | | | | | | | | | | | | | | | | | | | | Add ppc_breakpoint_available() to determine if a breakpoint is available currently via the DAWR or DABR. Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* | | powerpc/eeh: Add eeh_state_active() helperSam Bobroff2018-03-273-20/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Checking for a "fully active" device state requires testing two flag bits, which is open coded in several places, so add a function to do it. Signed-off-by: Sam Bobroff <sam.bobroff@au1.ibm.com> Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* | | powerpc/eeh: Factor out common code eeh_reset_device()Sam Bobroff2018-03-271-22/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The caller will always pass NULL for 'rmv_data' when 'eeh_aware_driver' is true, so the first two calls to eeh_pe_dev_traverse() can be combined without changing behaviour as can the two arms of the final 'if' block. This should not change behaviour. Signed-off-by: Sam Bobroff <sam.bobroff@au1.ibm.com> Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* | | powerpc/eeh: Remove always-true tests in eeh_reset_device()Sam Bobroff2018-03-271-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | eeh_reset_device() tests the value of 'bus' more than once but the only caller, eeh_handle_normal_device() does this test itself and will never pass NULL. So, remove the dead tests. This should not change behaviour. Signed-off-by: Sam Bobroff <sam.bobroff@au1.ibm.com> Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* | | powerpc/eeh: Clarify arguments to eeh_reset_device()Sam Bobroff2018-03-271-9/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | It is currently difficult to understand the behaviour of eeh_reset_device() due to the way it's parameters are used. In particular, when 'bus' is NULL, it's value is still necessary so the same value is looked up again locally under a different name ('frozen_bus') but behaviour is changed. To clarify this, add a new parameter 'driver_eeh_aware', and have the caller set it when it would have passed NULL for 'bus' and always pass a value for 'bus'. Then change any test that was on 'bus' to one on '!driver_eeh_aware' and replace uses of 'frozen_bus' with 'bus'. Also update the function's comment. This should not change behaviour. Signed-off-by: Sam Bobroff <sam.bobroff@au1.ibm.com> Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* | | powerpc/eeh: Rename frozen_bus to bus in eeh_handle_normal_event()Sam Bobroff2018-03-271-5/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | The name "frozen_bus" is misleading: it's not necessarily frozen, it's just the PE's PCI bus. Signed-off-by: Sam Bobroff <sam.bobroff@au1.ibm.com> Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* | | powerpc/eeh: Remove misleading test in eeh_handle_normal_event()Sam Bobroff2018-03-271-13/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Remove a test that checks if "frozen_bus" is NULL, because it cannot have changed since it was tested at the start of the function and so must be true here. Signed-off-by: Sam Bobroff <sam.bobroff@au1.ibm.com> Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* | | powerpc/eeh: Fix misleading comment in __eeh_addr_cache_get_device()Sam Bobroff2018-03-271-2/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit "0ba178888b05 powerpc/eeh: Remove reference to PCI device" removed a call to pci_dev_get() from __eeh_addr_cache_get_device() but did not update the comment to match. Signed-off-by: Sam Bobroff <sam.bobroff@au1.ibm.com> Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* | | powerpc/eeh: Manage EEH_PE_RECOVERING inside eeh_handle_normal_event()Sam Bobroff2018-03-273-21/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently the EEH_PE_RECOVERING flag for a PE is managed by both the caller and callee of eeh_handle_normal_event() (among other places not considered here). This is complicated by the fact that the PE may or may not have been invalidated by the call. So move the callee's handling into eeh_handle_normal_event(), which clarifies it and allows the return type to be changed to void (because it no longer needs to indicate at the PE has been invalidated). This should not change behaviour except in eeh_event_handler() where it was previously possible to cause eeh_pe_state_clear() to be called on an invalid PE, which is now avoided. Signed-off-by: Sam Bobroff <sam.bobroff@au1.ibm.com> Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* | | powerpc/eeh: Remove eeh_handle_event()Sam Bobroff2018-03-273-30/+19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The function eeh_handle_event(pe) does nothing other than switching between calling eeh_handle_normal_event(pe) and eeh_handle_special_event(). However it is only called in two places, one where pe can't be NULL and the other where it must be NULL (see eeh_event_handler()) so it does nothing but obscure the flow of control. So, remove it. Signed-off-by: Sam Bobroff <sam.bobroff@au1.ibm.com> Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* | | powerpc/powernv/npu: Do not try invalidating 32bit table when 64bit table is ↵Alexey Kardashevskiy2018-03-271-3/+24
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | enabled GPUs and the corresponding NVLink bridges get different PEs as they have separate translation validation entries (TVEs). We put these PEs to the same IOMMU group so they cannot be passed through separately. So the iommu_table_group_ops::set_window/unset_window for GPUs do set tables to the NPU PEs as well which means that iommu_table's list of attached PEs (iommu_table_group_link) has both GPU and NPU PEs linked. This list is used for TCE cache invalidation. The problem is that NPU PE has just a single TVE and can be programmed to point to 32bit or 64bit windows while GPU PE has two (as any other PCI device). So we end up having an 32bit iommu_table struct linked to both PEs even though only the 64bit TCE table cache can be invalidated on NPU. And a relatively recent skiboot detects this and prints errors. This changes GPU's iommu_table_group_ops::set_window/unset_window to make sure that NPU PE is only linked to the table actually used by the hardware. If there are two tables used by an IOMMU group, the NPU PE will use the last programmed one which with the current use scenarios is expected to be a 64bit one. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* | | powerpc/mm: Fix typo in commentsAlexey Kardashevskiy2018-03-271-7/+7
| | | | | | | | | | | | | | | | | | Fixes: 912cc87a6 "powerpc/mm/radix: Add LPID based tlb flush helpers" Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* | | powerpc/lpar/debug: Initialize flags before printing debug messageAlexey Kardashevskiy2018-03-271-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | With enabled DEBUG, there is a compile error: "error: ‘flags’ is used uninitialized in this function". This moves pr_devel() little further where @flags are initialized. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* | | powerpc/init: Do not advertise radix during client-architecture-supportAlexey Kardashevskiy2018-03-271-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently the pseries kernel advertises radix MMU support even if the actual support is disabled via the CONFIG_PPC_RADIX_MMU option. This adds a check for CONFIG_PPC_RADIX_MMU to avoid advertising radix to the hypervisor. Suggested-by: Paul Mackerras <paulus@ozlabs.org> Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* | | powerpc/mm: Fix section mismatch warning in stop_machine_change_mapping()Mauricio Faria de Oliveira2018-03-273-10/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fix the warning messages for stop_machine_change_mapping(), and a number of other affected functions in its call chain. All modified functions are under CONFIG_MEMORY_HOTPLUG, so __meminit is okay (keeps them / does not discard them). Boot-tested on powernv/power9/radix-mmu and pseries/power8/hash-mmu. $ make -j$(nproc) CONFIG_DEBUG_SECTION_MISMATCH=y vmlinux ... MODPOST vmlinux.o WARNING: vmlinux.o(.text+0x6b130): Section mismatch in reference from the function stop_machine_change_mapping() to the function .meminit.text:create_physical_mapping() The function stop_machine_change_mapping() references the function __meminit create_physical_mapping(). This is often because stop_machine_change_mapping lacks a __meminit annotation or the annotation of create_physical_mapping is wrong. WARNING: vmlinux.o(.text+0x6b13c): Section mismatch in reference from the function stop_machine_change_mapping() to the function .meminit.text:create_physical_mapping() The function stop_machine_change_mapping() references the function __meminit create_physical_mapping(). This is often because stop_machine_change_mapping lacks a __meminit annotation or the annotation of create_physical_mapping is wrong. ... Signed-off-by: Mauricio Faria de Oliveira <mauricfo@linux.vnet.ibm.com> Acked-by: Balbir Singh <bsingharora@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* | | powerpc/64s: Wire up cpu_show_spectre_v2()Michael Ellerman2018-03-271-0/+33
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add a definition for cpu_show_spectre_v2() to override the generic version. This has several permuations, though in practice some may not occur we cater for any combination. The most verbose is: Mitigation: Indirect branch serialisation (kernel only), Indirect branch cache disabled, ori31 speculation barrier enabled We don't treat the ori31 speculation barrier as a mitigation on its own, because it has to be *used* by code in order to be a mitigation and we don't know if userspace is doing that. So if that's all we see we say: Vulnerable, ori31 speculation barrier enabled Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* | | powerpc/64s: Wire up cpu_show_spectre_v1()Michael Ellerman2018-03-271-0/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add a definition for cpu_show_spectre_v1() to override the generic version. Currently this just prints "Not affected" or "Vulnerable" based on the firmware flag. Although the kernel does have array_index_nospec() in a few places, we haven't yet audited all the powerpc code to see where it's necessary, so for now we don't list that as a mitigation. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* | | powerpc/pseries: Use the security flags in pseries_setup_rfi_flush()Michael Ellerman2018-03-271-15/+12
| | | | | | | | | | | | | | | | | | | | | | | | Now that we have the security flags we can simplify the code in pseries_setup_rfi_flush() because the security flags have pessimistic defaults. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* | | powerpc/powernv: Use the security flags in pnv_setup_rfi_flush()Michael Ellerman2018-03-271-31/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | Now that we have the security flags we can significantly simplify the code in pnv_setup_rfi_flush(), because we can use the flags instead of checking device tree properties and because the security flags have pessimistic defaults. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>