summaryrefslogtreecommitdiff
path: root/arch/powerpc
Commit message (Collapse)AuthorAgeFilesLines
* Merge tag 'pci-v3.16-changes' of ↵Linus Torvalds2014-06-024-29/+2
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci into next Pull PCI changes from Bjorn Helgaas: "Enumeration - Notify driver before and after device reset (Keith Busch) - Use reset notification in NVMe (Keith Busch) NUMA - Warn if we have to guess host bridge node information (Myron Stowe) - Work around AMD Fam15h BIOSes that fail to provide _PXM (Suravee Suthikulpanit) - Clean up and mark early_root_info_init() as deprecated (Suravee Suthikulpanit) Driver binding - Add "driver_override" for force specific binding (Alex Williamson) - Fail "new_id" addition for devices we already know about (Bandan Das) Resource management - Support BAR sizes up to 8GB (Nikhil Rao, Alan Cox) - Don't move IORESOURCE_PCI_FIXED resources (Bjorn Helgaas) - Mark SBx00 HPET BAR as IORESOURCE_PCI_FIXED (Bjorn Helgaas) - Fail safely if we can't handle BARs larger than 4GB (Bjorn Helgaas) - Reject BAR above 4GB if dma_addr_t is too small (Bjorn Helgaas) - Don't convert BAR address to resource if dma_addr_t is too small (Bjorn Helgaas) - Don't set BAR to zero if dma_addr_t is too small (Bjorn Helgaas) - Don't print anything while decoding is disabled (Bjorn Helgaas) - Don't add disabled subtractive decode bus resources (Bjorn Helgaas) - Add resource allocation comments (Bjorn Helgaas) - Restrict 64-bit prefetchable bridge windows to 64-bit resources (Yinghai Lu) - Assign i82875p_edac PCI resources before adding device (Yinghai Lu) PCI device hotplug - Remove unnecessary "dev->bus" test (Bjorn Helgaas) - Use PCI_EXP_SLTCAP_PSN define (Bjorn Helgaas) - Fix rphahp endianess issues (Laurent Dufour) - Acknowledge spurious "cmd completed" event (Rajat Jain) - Allow hotplug service drivers to operate in polling mode (Rajat Jain) - Fix cpqphp possible NULL dereference (Rickard Strandqvist) MSI - Replace pci_enable_msi_block() by pci_enable_msi_exact() (Alexander Gordeev) - Replace pci_enable_msix() by pci_enable_msix_exact() (Alexander Gordeev) - Simplify populate_msi_sysfs() (Jan Beulich) Virtualization - Add Intel Patsburg (X79) root port ACS quirk (Alex Williamson) - Mark RTL8110SC INTx masking as broken (Alex Williamson) Generic host bridge driver - Add generic PCI host controller driver (Will Deacon) Freescale i.MX6 - Use new clock names (Lucas Stach) - Drop old IRQ mapping (Lucas Stach) - Remove optional (and unused) IRQs (Lucas Stach) - Add support for MSI (Lucas Stach) - Fix imx6_add_pcie_port() section mismatch warning (Sachin Kamat) Renesas R-Car - Add gen2 device tree support (Ben Dooks) - Use new OF interrupt mapping when possible (Lucas Stach) - Add PCIe driver (Phil Edworthy) - Add PCIe MSI support (Phil Edworthy) - Add PCIe device tree bindings (Phil Edworthy) Samsung Exynos - Remove unnecessary OOM messages (Jingoo Han) - Fix add_pcie_port() section mismatch warning (Sachin Kamat) Synopsys DesignWare - Make MSI ISR shared IRQ aware (Lucas Stach) Miscellaneous - Check for broken config space aliasing (Alex Williamson) - Update email address (Ben Hutchings) - Fix Broadcom CNB20LE unintended sign extension (Bjorn Helgaas) - Fix incorrect vgaarb conditional in WARN_ON() (Bjorn Helgaas) - Remove unnecessary __ref annotations (Bjorn Helgaas) - Add arch/x86/kernel/quirks.c to MAINTAINERS PCI file patterns (Bjorn Helgaas) - Fix use of uninitialized MPS value (Bjorn Helgaas) - Tidy x86/gart messages (Bjorn Helgaas) - Fix return value from pci_user_{read,write}_config_*() (Gavin Shan) - Turn pcibios_penalize_isa_irq() into a weak function (Hanjun Guo) - Remove unused serial device IDs (Jean Delvare) - Use designated initialization in PCI_VDEVICE (Mark Rustad) - Fix powerpc NULL dereference in pci_root_buses traversal (Mike Qiu) - Configure MPS on ARM (Murali Karicheri) - Remove unnecessary includes of <linux/init.h> (Paul Gortmaker) - Move Open Firmware devspec attribute to PCI common code (Sebastian Ott) - Use pdev->dev.groups for attribute creation on s390 (Sebastian Ott) - Remove pcibios_add_platform_entries() (Sebastian Ott) - Add new ID for Intel GPU "spurious interrupt" quirk (Thomas Jarosch) - Rename pci_is_bridge() to pci_has_subordinate() (Yijing Wang) - Add and use new pci_is_bridge() interface (Yijing Wang) - Make pci_bus_add_device() void (Yijing Wang) DMA API - Clarify physical/bus address distinction in docs (Bjorn Helgaas) - Fix typos in docs (Emilio López) - Update dma_pool_create ()and dma_pool_alloc() descriptions (Gioh Kim) - Change dma_declare_coherent_memory() CPU address to phys_addr_t (Bjorn Helgaas) - Pass GAPSPCI_DMA_BASE CPU & bus address to dma_declare_coherent_memory() (Bjorn Helgaas)" * tag 'pci-v3.16-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: (92 commits) MAINTAINERS: Add generic PCI host controller driver PCI: generic: Add generic PCI host controller driver PCI: imx6: Add support for MSI PCI: designware: Make MSI ISR shared IRQ aware PCI: imx6: Remove optional (and unused) IRQs PCI: imx6: Drop old IRQ mapping PCI: imx6: Use new clock names i82875p_edac: Assign PCI resources before adding device ARM/PCI: Call pcie_bus_configure_settings() to set MPS PCI: imx6: Fix imx6_add_pcie_port() section mismatch warning PCI: Make pci_bus_add_device() void PCI: exynos: Fix add_pcie_port() section mismatch warning PCI: Introduce new device binding path using pci_dev.driver_override PCI: rcar: Add gen2 device tree support PCI: cpqphp: Fix possible null pointer dereference PCI: rcar: Add R-Car PCIe device tree bindings PCI: rcar: Add MSI support for PCIe PCI: rcar: Add Renesas R-Car PCIe driver PCI: Fix return value from pci_user_{read,write}_config_*() PCI: exynos: Remove unnecessary OOM messages ...
| * Merge branch 'pci/misc' into nextBjorn Helgaas2014-05-281-5/+0
| |\ | | | | | | | | | | | | | | | | | | * pci/misc: PCI: Fix return value from pci_user_{read,write}_config_*() PCI: Turn pcibios_penalize_isa_irq() into a weak function PCI: Test for std config alias when testing extended config space
| | * PCI: Turn pcibios_penalize_isa_irq() into a weak functionHanjun Guo2014-05-271-5/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | pcibios_penalize_isa_irq() is only implemented by x86 now, and legacy ISA is not used by some architectures. Make pcibios_penalize_isa_irq() a __weak function to simplify the code. This removes the need for new platforms to add stub implementations of pcibios_penalize_isa_irq(). [bhelgaas: changelog, comments] Signed-off-by: Hanjun Guo <hanjun.guo@linaro.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Acked-by: Arnd Bergmann <arnd@arndb.de>
| | |
| | \
| *-. \ Merge branches 'pci/hotplug', 'pci/pci_is_bridge' and 'pci/virtualization' ↵Bjorn Helgaas2014-05-282-4/+2
| |\ \ \ | | |_|/ | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | into next * pci/hotplug: PCI: cpqphp: Fix possible null pointer dereference NVMe: Implement PCIe reset notification callback PCI: Notify driver before and after device reset * pci/pci_is_bridge: pcmcia: Use pci_is_bridge() to simplify code PCI: pciehp: Use pci_is_bridge() to simplify code PCI: acpiphp: Use pci_is_bridge() to simplify code PCI: cpcihp: Use pci_is_bridge() to simplify code PCI: shpchp: Use pci_is_bridge() to simplify code PCI: rpaphp: Use pci_is_bridge() to simplify code sparc/PCI: Use pci_is_bridge() to simplify code powerpc/PCI: Use pci_is_bridge() to simplify code ia64/PCI: Use pci_is_bridge() to simplify code x86/PCI: Use pci_is_bridge() to simplify code PCI: Use pci_is_bridge() to simplify code PCI: Add new pci_is_bridge() interface PCI: Rename pci_is_bridge() to pci_has_subordinate() * pci/virtualization: PCI: Introduce new device binding path using pci_dev.driver_override Conflicts: drivers/pci/pci-sysfs.c
| | | * powerpc/PCI: Use pci_is_bridge() to simplify codeYijing Wang2014-05-272-4/+2
| | |/ | | | | | | | | | | | | | | | | | | | | | Use pci_is_bridge() to simplify code. No functional change. Requires: 326c1cdae741 PCI: Rename pci_is_bridge() to pci_has_subordinate() Requires: 1c86438c9423 PCI: Add new pci_is_bridge() interface Signed-off-by: Yijing Wang <wangyijing@huawei.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
| * | PCI: Move Open Firmware devspec attribute to PCI common codeSebastian Ott2014-04-301-20/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Move the devspec OF attribute to PCI common code's set of device attributes since it's not architecture dependent. As a side effect microblaze and powerpc no longer need to use pcibios_add_platform_entries(). [bhelgaas: fold in #include for compile error] Link: https://lkml.kernel.org/r/alpine.LFD.2.11.1404141101500.1529@denkbrett Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
| * | powerpc/PCI: Fix NULL dereference in sys_pciconfig_iobase() list traversalMike Qiu2014-04-141-4/+6
| |/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | 3bc955987fb3 ("powerpc/PCI: Use list_for_each_entry() for bus traversal") caused a NULL pointer dereference because the loop body set the iterator to NULL: Unable to handle kernel paging request for data at address 0x00000000 Faulting instruction address: 0xc000000000041d78 Oops: Kernel access of bad area, sig: 11 [#1] ... NIP [c000000000041d78] .sys_pciconfig_iobase+0x68/0x1f0 LR [c000000000041e0c] .sys_pciconfig_iobase+0xfc/0x1f0 Call Trace: [c0000003b4787db0] [c000000000041e0c] .sys_pciconfig_iobase+0xfc/0x1f0 (unreliable) [c0000003b4787e30] [c000000000009ed8] syscall_exit+0x0/0x98 Fix it by using a temporary variable for the iterator. [bhelgaas: changelog, drop tmp_bus initialization] Fixes: 3bc955987fb3 powerpc/PCI: Use list_for_each_entry() for bus traversal Signed-off-by: Mike Qiu <qiudayu@linux.vnet.ibm.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
* | Merge branch 'merge' of ↵Linus Torvalds2014-06-013-1/+3
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc Pull powerpc fix from Ben Herrenschmidt: "Here's just one trivial patch to wire up sys_renameat2 which I seem to have completely missed so far. (My test build scripts fwd me warnings but miss the ones generated for missing syscalls)" * 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc: powerpc: Wire renameat2() syscall
| * | powerpc: Wire renameat2() syscallBenjamin Herrenschmidt2014-06-023-1/+3
| | | | | | | | | | | | Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
* | | Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds2014-05-287-8/+127
|\ \ \ | |/ / |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull kvm fixes from Paolo Bonzini: "Small fixes for x86, slightly larger fixes for PPC, and a forgotten s390 patch. The PPC fixes are important because they fix breakage that is new in 3.15" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: KVM: s390: announce irqfd capability KVM: x86: disable master clock if TSC is reset during suspend KVM: vmx: disable APIC virtualization in nested guests KVM guest: Make pv trampoline code executable KVM: PPC: Book3S: ifdef on CONFIG_KVM_BOOK3S_32_HANDLER for 32bit KVM: PPC: Book3S HV: Add missing code for transaction reclaim on guest exit KVM: PPC: Book3S: HV: make _PAGE_NUMA take effect
| * | Merge tag 'signed-for-3.15' of git://github.com/agraf/linux-2.6 into kvm-masterPaolo Bonzini2014-05-137-8/+127
| |\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Patch queue for 3.15 - 2014-05-12 This request includes a few bug fixes that really shouldn't wait for the next release. It fixes KVM on 32bit PowerPC when built as module. It also fixes the PV KVM acceleration when NX gets honored by the host. Furthermore we fix transactional memory support and numa support on HV KVM.
| | * | KVM guest: Make pv trampoline code executableAlexander Graf2014-04-293-1/+16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Our PV guest patching code assembles chunks of instructions on the fly when it encounters more complicated instructions to hijack. These instructions need to live in a section that we don't mark as non-executable, as otherwise we fault when jumping there. Right now we put it into the .bss section where it automatically gets marked as non-executable. Add a check to the NX setting function to ensure that we leave these particular pages executable. Signed-off-by: Alexander Graf <agraf@suse.de>
| | * | KVM: PPC: Book3S: ifdef on CONFIG_KVM_BOOK3S_32_HANDLER for 32bitAlexander Graf2014-04-282-6/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The book3s_32 target can get built as module which means we don't see the config define for it in code. Instead, check on the bool define CONFIG_KVM_BOOK3S_32_HANDLER whenever we want to know whether we're building for a book3s_32 host. This fixes running book3s_32 kvm as a module for me. Signed-off-by: Alexander Graf <agraf@suse.de> Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
| | * | KVM: PPC: Book3S HV: Add missing code for transaction reclaim on guest exitPaul Mackerras2014-04-281-0/+104
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Testing by Michael Neuling revealed that commit e4e38121507a ("KVM: PPC: Book3S HV: Add transactional memory support") is missing the code that saves away the checkpointed state of the guest when switching to the host. This adds that code, which was in earlier versions of the patch but went missing somehow. Reported-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Alexander Graf <agraf@suse.de>
| | * | KVM: PPC: Book3S: HV: make _PAGE_NUMA take effectpingfank@linux.vnet.ibm.com2014-04-281-1/+1
| | |/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Numa fault is a method which help to achieve auto numa balancing. When such a page fault takes place, the page fault handler will check whether the page is placed correctly. If not, migration should be involved to cut down the distance between the cpu and pages. A pte with _PAGE_NUMA help to implement numa fault. It means not to allow the MMU to access the page directly. So a page fault is triggered and numa fault handler gets the opportunity to run checker. As for the access of MMU, we need special handling for the powernv's guest. When we mark a pte with _PAGE_NUMA, we already call mmu_notifier to invalidate it in guest's htab, but when we tried to re-insert them, we firstly try to map it in real-mode. Only after this fails, we fallback to virt mode, and most of important, we run numa fault handler in virt mode. This patch guards the way of real-mode to ensure that if a pte is marked with _PAGE_NUMA, it will NOT be mapped in real mode, instead, it will be mapped in virt mode and have the opportunity to be checked with placement. Signed-off-by: Liu Ping Fan <pingfank@linux.vnet.ibm.com> Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Alexander Graf <agraf@suse.de>
* | | powerpc, kexec: Fix "Processor X is stuck" issue during kexec from ST modeSrivatsa S. Bhat2014-05-281-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If we try to perform a kexec when the machine is in ST (Single-Threaded) mode (ppc64_cpu --smt=off), the kexec operation doesn't succeed properly, and we get the following messages during boot: [ 0.089866] POWER8 performance monitor hardware support registered [ 0.089985] power8-pmu: PMAO restore workaround active. [ 5.095419] Processor 1 is stuck. [ 10.097933] Processor 2 is stuck. [ 15.100480] Processor 3 is stuck. [ 20.102982] Processor 4 is stuck. [ 25.105489] Processor 5 is stuck. [ 30.108005] Processor 6 is stuck. [ 35.110518] Processor 7 is stuck. [ 40.113369] Processor 9 is stuck. [ 45.115879] Processor 10 is stuck. [ 50.118389] Processor 11 is stuck. [ 55.120904] Processor 12 is stuck. [ 60.123425] Processor 13 is stuck. [ 65.125970] Processor 14 is stuck. [ 70.128495] Processor 15 is stuck. [ 75.131316] Processor 17 is stuck. Note that only the sibling threads are stuck, while the primary threads (0, 8, 16 etc) boot just fine. Looking closer at the previous step of kexec, we observe that kexec tries to wakeup (bring online) the sibling threads of all the cores, before performing kexec: [ 9464.131231] Starting new kernel [ 9464.148507] kexec: Waking offline cpu 1. [ 9464.148552] kexec: Waking offline cpu 2. [ 9464.148600] kexec: Waking offline cpu 3. [ 9464.148636] kexec: Waking offline cpu 4. [ 9464.148671] kexec: Waking offline cpu 5. [ 9464.148708] kexec: Waking offline cpu 6. [ 9464.148743] kexec: Waking offline cpu 7. [ 9464.148779] kexec: Waking offline cpu 9. [ 9464.148815] kexec: Waking offline cpu 10. [ 9464.148851] kexec: Waking offline cpu 11. [ 9464.148887] kexec: Waking offline cpu 12. [ 9464.148922] kexec: Waking offline cpu 13. [ 9464.148958] kexec: Waking offline cpu 14. [ 9464.148994] kexec: Waking offline cpu 15. [ 9464.149030] kexec: Waking offline cpu 17. Instrumenting this piece of code revealed that the cpu_up() operation actually fails with -EBUSY. Thus, only the primary threads of all the cores are online during kexec, and hence this is a sure-shot receipe for disaster, as explained in commit e8e5c2155b (powerpc/kexec: Fix orphaned offline CPUs across kexec), as well as in the comment above wake_offline_cpus(). It turns out that cpu_up() was returning -EBUSY because the variable 'cpu_hotplug_disabled' was set to 1; and this disabling of CPU hotplug was done by migrate_to_reboot_cpu() inside kernel_kexec(). Now, migrate_to_reboot_cpu() was originally written with the assumption that any further code will not need to perform CPU hotplug, since we are anyway in the reboot path. However, kexec is clearly not such a case, since we depend on onlining CPUs, atleast on powerpc. So re-enable cpu-hotplug after returning from migrate_to_reboot_cpu() in the kexec path, to fix this regression in kexec on powerpc. Also, wrap the cpu_up() in powerpc kexec code within a WARN_ON(), so that we can catch such issues more easily in the future. Fixes: c97102ba963 (kexec: migrate to reboot cpu) Cc: stable@vger.kernel.org Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
* | | powerpc: Fix 64 bit builds with binutils 2.24Guenter Roeck2014-05-282-2/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | With binutils 2.24, various 64 bit builds fail with relocation errors such as arch/powerpc/kernel/built-in.o: In function `exc_debug_crit_book3e': (.text+0x165ee): relocation truncated to fit: R_PPC64_ADDR16_HI against symbol `interrupt_base_book3e' defined in .text section in arch/powerpc/kernel/built-in.o arch/powerpc/kernel/built-in.o: In function `exc_debug_crit_book3e': (.text+0x16602): relocation truncated to fit: R_PPC64_ADDR16_HI against symbol `interrupt_end_book3e' defined in .text section in arch/powerpc/kernel/built-in.o The assembler maintainer says: I changed the ABI, something that had to be done but unfortunately happens to break the booke kernel code. When building up a 64-bit value with lis, ori, shl, oris, ori or similar sequences, you now should use @high and @higha in place of @h and @ha. @h and @ha (and their associated relocs R_PPC64_ADDR16_HI and R_PPC64_ADDR16_HA) now report overflow if the value is out of 32-bit signed range. ie. @h and @ha assume you're building a 32-bit value. This is needed to report out-of-range -mcmodel=medium toc pointer offsets in @toc@h and @toc@ha expressions, and for consistency I did the same for all other @h and @ha relocs. Replacing @h with @high in one strategic location fixes the relocation errors. This has to be done conditionally since the assembler either supports @h or @high but not both. Cc: <stable@vger.kernel.org> Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
* | | powerpc: irq work racing with timer interrupt can result in timer interrupt hangAnton Blanchard2014-05-121-3/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | I am seeing an issue where a CPU running perf eventually hangs. Traces show timer interrupts happening every 4 seconds even when a userspace task is running on the CPU. /proc/timer_list also shows pending hrtimers have not run in over an hour, including the scheduler. Looking closer, decrementers_next_tb is getting set to 0xffffffffffffffff, and at that point we will never take a timer interrupt again. In __timer_interrupt() we set decrementers_next_tb to 0xffffffffffffffff and rely on ->event_handler to update it: *next_tb = ~(u64)0; if (evt->event_handler) evt->event_handler(evt); In this case ->event_handler is hrtimer_interrupt. This will eventually call back through the clockevents code with the next event to be programmed: static int decrementer_set_next_event(unsigned long evt, struct clock_event_device *dev) { /* Don't adjust the decrementer if some irq work is pending */ if (test_irq_work_pending()) return 0; __get_cpu_var(decrementers_next_tb) = get_tb_or_rtc() + evt; If irq work came in between these two points, we will return before updating decrementers_next_tb and we never process a timer interrupt again. This looks to have been introduced by 0215f7d8c53f (powerpc: Fix races with irq_work). Fix it by removing the early exit and relying on code later on in the function to force an early decrementer: /* We may have raced with new irq work */ if (test_irq_work_pending()) set_dec(1); Signed-off-by: Anton Blanchard <anton@samba.org> Cc: stable@vger.kernel.org # 3.14+ Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
* | | powerpc/powernv: Reset root port in firmwareGavin Shan2014-05-121-1/+2
|/ / | | | | | | | | | | | | | | | | | | | | Resetting root port has more stuff to do than that for PCIe switch ports and we should have resetting root port done in firmware instead of the kernel itself. The problem was introduced by commit 5b2e198e ("powerpc/powernv: Rework EEH reset"). Cc: linux-stable <stable@vger.kernel.org> Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
* | powerpc/4xx: Fix section mismatch in ppc4xx_pci.cAlistair Popple2014-04-281-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch fixes this section mismatch: WARNING: vmlinux.o(.text+0x1efc4): Section mismatch in reference from the function apm821xx_pciex_init_port_hw() to the function .init.text:ppc4xx_pciex_wait_on_sdr.isra.9() The function apm821xx_pciex_init_port_hw() references the function __init ppc4xx_pciex_wait_on_sdr.isra.9(). This is often because apm821xx_pciex_init_port_hw lacks a __init annotation or the annotation of ppc4xx_pciex_wait_on_sdr.isra.9 is wrong. apm821xx_pciex_init_port_hw is only referenced by a struct in __initdata, so it should be safe to add __init to apm821xx_pciex_init_port_hw. Signed-off-by: Alistair Popple <alistair@popple.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
* | ppc/kvm: Clear the runlatch bit of a vcpu before nappingPreeti U Murthy2014-04-281-1/+11
| | | | | | | | | | | | | | | | | | | | | | When the guest cedes the vcpu or the vcpu has no guest to run it naps. Clear the runlatch bit of the vcpu before napping to indicate an idle cpu. Signed-off-by: Preeti U Murthy <preeti@linux.vnet.ibm.com> Acked-by: Paul Mackerras <paulus@samba.org> Reviewed-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
* | ppc/kvm: Set the runlatch bit of a CPU just before starting guestPreeti U Murthy2014-04-281-0/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The secondary threads in the core are kept offline before launching guests in kvm on powerpc: "371fefd6f2dc4666:KVM: PPC: Allow book3s_hv guests to use SMT processor modes." Hence their runlatch bits are cleared. When the secondary threads are called in to start a guest, their runlatch bits need to be set to indicate that they are busy. The primary thread has its runlatch bit set though, but there is no harm in setting this bit once again. Hence set the runlatch bit for all threads before they start guest. Signed-off-by: Preeti U Murthy <preeti@linux.vnet.ibm.com> Acked-by: Paul Mackerras <paulus@samba.org> Reviewed-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
* | ppc/powernv: Set the runlatch bits correctly for offline cpusPreeti U Murthy2014-04-281-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Up until now we have been setting the runlatch bits for a busy CPU and clearing it when a CPU enters idle state. The runlatch bit has thus been consistent with the utilization of a CPU as long as the CPU is online. However when a CPU is hotplugged out the runlatch bit is not cleared. It needs to be cleared to indicate an unused CPU. Hence this patch has the runlatch bit cleared for an offline CPU just before entering an idle state and sets it immediately after it exits the idle state. Signed-off-by: Preeti U Murthy <preeti@linux.vnet.ibm.com> Acked-by: Paul Mackerras <paulus@samba.org> Reviewed-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
* | powerpc/pseries: Protect remove_memory() with device hotplug lockLi Zhong2014-04-281-4/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | While testing memory hot-remove, I found following dead lock: Process #1141 is drmgr, trying to remove some memory, i.e. memory499. It holds the memory_hotplug_mutex, and blocks when trying to remove file "online" under dir memory499, in kernfs_drain(), at wait_event(root->deactivate_waitq, atomic_read(&kn->active) == KN_DEACTIVATED_BIAS); Process #1120 is trying to online memory499 by echo 1 > memory499/online In .kernfs_fop_write, it uses kernfs_get_active() to increase &kn->active, thus blocking process #1141. While itself is blocked later when trying to acquire memory_hotplug_mutex, which is held by process The backtrace of both processes are shown below: [<c000000001b18600>] 0xc000000001b18600 [<c000000000015044>] .__switch_to+0x144/0x200 [<c000000000263ca4>] .online_pages+0x74/0x7b0 [<c00000000055b40c>] .memory_subsys_online+0x9c/0x150 [<c00000000053cbe8>] .device_online+0xb8/0x120 [<c00000000053cd04>] .online_store+0xb4/0xc0 [<c000000000538ce4>] .dev_attr_store+0x64/0xa0 [<c00000000030f4ec>] .sysfs_kf_write+0x7c/0xb0 [<c00000000030e574>] .kernfs_fop_write+0x154/0x1e0 [<c000000000268450>] .vfs_write+0xe0/0x260 [<c000000000269144>] .SyS_write+0x64/0x110 [<c000000000009ffc>] syscall_exit+0x0/0x7c [<c000000001b18600>] 0xc000000001b18600 [<c000000000015044>] .__switch_to+0x144/0x200 [<c00000000030be14>] .__kernfs_remove+0x204/0x300 [<c00000000030d428>] .kernfs_remove_by_name_ns+0x68/0xf0 [<c00000000030fb38>] .sysfs_remove_file_ns+0x38/0x60 [<c000000000539354>] .device_remove_attrs+0x54/0xc0 [<c000000000539fd8>] .device_del+0x158/0x250 [<c00000000053a104>] .device_unregister+0x34/0xa0 [<c00000000055bc14>] .unregister_memory_section+0x164/0x170 [<c00000000024ee18>] .__remove_pages+0x108/0x4c0 [<c00000000004b590>] .arch_remove_memory+0x60/0xc0 [<c00000000026446c>] .remove_memory+0x8c/0xe0 [<c00000000007f9f4>] .pseries_remove_memblock+0xd4/0x160 [<c00000000007fcfc>] .pseries_memory_notifier+0x27c/0x290 [<c0000000008ae6cc>] .notifier_call_chain+0x8c/0x100 [<c0000000000d858c>] .__blocking_notifier_call_chain+0x6c/0xe0 [<c00000000071ddec>] .of_property_notify+0x7c/0xc0 [<c00000000071ed3c>] .of_update_property+0x3c/0x1b0 [<c0000000000756cc>] .ofdt_write+0x3dc/0x740 [<c0000000002f60fc>] .proc_reg_write+0xac/0x110 [<c000000000268450>] .vfs_write+0xe0/0x260 [<c000000000269144>] .SyS_write+0x64/0x110 [<c000000000009ffc>] syscall_exit+0x0/0x7c This patch uses lock_device_hotplug() to protect remove_memory() called in pseries_remove_memblock(), which is also stated before function remove_memory(): * NOTE: The caller must call lock_device_hotplug() to serialize hotplug * and online/offline operations before this call, as required by * try_offline_node(). */ void __ref remove_memory(int nid, u64 start, u64 size) With this lock held, the other process(#1120 above) trying to online the memory block will retry the system call when calling lock_device_hotplug_sysfs(), and finally find No such device error. Signed-off-by: Li Zhong <zhong@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
* | powerpc: Fix error return in rtas_flash module initAnton Blanchard2014-04-281-1/+1
| | | | | | | | | | | | | | module_init should return 0 or a negative errno. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
* | powerpc: Bump BOOT_COMMAND_LINE_SIZE to 2048Anton Blanchard2014-04-281-1/+1
| | | | | | | | | | | | | | | | Bump the boot wrapper BOOT_COMMAND_LINE_SIZE to match the kernel. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
* | powerpc: Bump COMMAND_LINE_SIZE to 2048Anton Blanchard2014-04-281-1/+6
| | | | | | | | | | | | | | | | I've had a report that the current limit is too small for an automated network based installer. Bump it. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
* | powerpc: Rename duplicate COMMAND_LINE_SIZE defineAnton Blanchard2014-04-283-7/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We have two definitions of COMMAND_LINE_SIZE, one for the kernel and one for the boot wrapper. I assume this is so the boot wrapper can be self sufficient and not rely on kernel headers. Having two defines with the same name is confusing, I just updated the wrong one when trying to bump it. Make the boot wrapper define unique by calling it BOOT_COMMAND_LINE_SIZE. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
* | powerpc/perf/hv-24x7: Catalog version number is be64, not be32Cody P Schafer2014-04-281-6/+7
| | | | | | | | | | | | | | | | The catalog version number was changed from a be32 (with proceeding 32bits of padding) to a be64, update the code to treat it as a be64 Signed-off-by: Cody P Schafer <cody@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
* | powerpc/perf/hv-24x7: Remove [static 4096], sparse chokes on itCody P Schafer2014-04-281-1/+1
| | | | | | | | | | Signed-off-by: Cody P Schafer <cody@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
* | powerpc/perf/hv-24x7: Use (unsigned long) not (u32) values when calling ↵Cody P Schafer2014-04-281-4/+16
| | | | | | | | | | | | | | plpar_hcall_norets() Signed-off-by: Cody P Schafer <cody@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
* | powerpc/perf/hv-gpci: Make device attr staticCody P Schafer2014-04-281-1/+1
| | | | | | | | | | Signed-off-by: Cody P Schafer <cody@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
* | powerpc/perf/hv_gpci: Probe failures use pr_debug(), and padding reducedCody P Schafer2014-04-281-2/+2
| | | | | | | | | | | | | | | | | | | | | | fixup for "powerpc/perf: Add support for the hv gpci (get performance counter info) interface". Makes the "not enabled" message less awful (and hidden unless debugging). Signed-off-by: Cody P Schafer <cody@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
* | powerpc/perf/hv_24x7: Probe errors changed to pr_debug(), padding fixedCody P Schafer2014-04-281-2/+2
| | | | | | | | | | | | | | | | | | fixup for "powerpc/perf: Add support for the hv 24x7 interface" Makes the "not enabled" message less awful (and hides it in most cases). Signed-off-by: Cody P Schafer <cody@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
* | powerpc/mm: Fix tlbie to add AVAL fields for 64K pagesAneesh Kumar K.V2014-04-281-22/+16
| | | | | | | | | | | | | | The if condition check was based on a draft ISA doc. Remove the same. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
* | powerpc/powernv: Fix little endian issues in OPAL dump codeAnton Blanchard2014-04-282-6/+11
| | | | | | | | | | Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
* | powerpc/powernv: Create OPAL sglist helper functions and fix endian issuesAnton Blanchard2014-04-284-188/+76
| | | | | | | | | | | | | | | | | | | | | | | | | | We have two copies of code that creates an OPAL sg list. Consolidate these into a common set of helpers and fix the endian issues. The flash interface embedded a version number in the num_entries field, whereas the dump interface did did not. Since versioning wasn't added to the flash interface and it is impossible to add this in a backwards compatible way, just remove it. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
* | powerpc/powernv: Fix little endian issues in OPAL error log codeAnton Blanchard2014-04-282-2/+9
| | | | | | | | | | | | | | | | Fix little endian issues with the OPAL error log code. Signed-off-by: Anton Blanchard <anton@samba.org> Reviewed-by: Stewart Smith <stewart@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
* | powerpc/powernv: Fix little endian issues with opal_do_notifier callsAnton Blanchard2014-04-281-3/+3
| | | | | | | | | | | | | | | | The bitmap in opal_poll_events and opal_handle_interrupt is big endian, so we need to byteswap it on little endian builds. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
* | powerpc/powernv: Remove some OPAL function declaration duplicationAnton Blanchard2014-04-281-10/+2
| | | | | | | | | | | | | | We had some duplication of the internal OPAL functions. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
* | powerpc/powernv: Use uint64_t instead of size_t in OPAL APIsAnton Blanchard2014-04-282-7/+7
| | | | | | | | | | | | | | | | | | Using size_t in our APIs is asking for trouble, especially when some OPAL calls use size_t pointers. Signed-off-by: Anton Blanchard <anton@samba.org> Reviewed-by: Stewart Smith <stewart@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
* | powerpc/powernv: Release the refcount for pci_devWei Yang2014-04-281-1/+0
| | | | | | | | | | | | | | | | | | | | On PowerNV platform, we are holding an unnecessary refcount on a pci_dev, which leads to the pci_dev is not destroyed when hotplugging a pci device. This patch release the unnecessary refcount. Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
* | powerpc/powernv: Reduce multi-hit of iommu_add_device()Wei Yang2014-04-281-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | During the EEH hotplug event, iommu_add_device() will be invoked three times and two of them will trigger warning or error. The three times to invoke the iommu_add_device() are: pci_device_add ... set_iommu_table_base_and_group <- 1st time, fail device_add ... tce_iommu_bus_notifier <- 2nd time, succees pcibios_add_pci_devices ... pcibios_setup_bus_devices <- 3rd time, re-attach The first time fails, since the dev->kobj->sd is not initialized. The dev->kobj->sd is initialized in device_add(). The third time's warning is triggered by the re-attach of the iommu_group. After applying this patch, the error iommu_tce: 0003:05:00.0 has not been added, ret=-14 and the warning [ 204.123609] ------------[ cut here ]------------ [ 204.123645] WARNING: at arch/powerpc/kernel/iommu.c:1125 [ 204.123680] Modules linked in: xt_CHECKSUM nf_conntrack_netbios_ns nf_conntrack_broadcast ipt_MASQUERADE ip6t_REJECT bnep bluetooth 6lowpan_iphc rfkill xt_conntrack ebtable_nat ebtable_broute bridge stp llc mlx4_ib ib_sa ib_mad ib_core ib_addr ebtable_filter ebtables ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_security ip6table_raw ip6table_filter ip6_tables iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_mangle iptable_security iptable_raw bnx2x tg3 mlx4_core nfsd ptp mdio ses libcrc32c nfs_acl enclosure be2net pps_core shpchp lockd kvm uinput sunrpc binfmt_misc lpfc scsi_transport_fc ipr scsi_tgt [ 204.124356] CPU: 18 PID: 650 Comm: eehd Not tainted 3.14.0-rc5yw+ #102 [ 204.124400] task: c0000027ed485670 ti: c0000027ed50c000 task.ti: c0000027ed50c000 [ 204.124453] NIP: c00000000003cf80 LR: c00000000006c648 CTR: c00000000006c5c0 [ 204.124506] REGS: c0000027ed50f440 TRAP: 0700 Not tainted (3.14.0-rc5yw+) [ 204.124558] MSR: 9000000000029032 <SF,HV,EE,ME,IR,DR,RI> CR: 88008084 XER: 20000000 [ 204.124682] CFAR: c00000000006c644 SOFTE: 1 GPR00: c00000000006c648 c0000027ed50f6c0 c000000001398380 c0000027ec260300 GPR04: c0000027ea92c000 c00000000006ad00 c0000000016e41b0 0000000000000110 GPR08: c0000000012cd4c0 0000000000000001 c0000027ec2602ff 0000000000000062 GPR12: 0000000028008084 c00000000fdca200 c0000000000d1d90 c0000027ec281a80 GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 GPR20: 0000000000000000 0000000000000000 0000000000000000 0000000000000001 GPR24: 000000005342697b 0000000000002906 c000001fe6ac9800 c000001fe6ac9800 GPR28: 0000000000000000 c0000000016e3a80 c0000027ea92c090 c0000027ea92c000 [ 204.125353] NIP [c00000000003cf80] .iommu_add_device+0x30/0x1f0 [ 204.125399] LR [c00000000006c648] .pnv_pci_ioda_dma_dev_setup+0x88/0xb0 [ 204.125443] Call Trace: [ 204.125464] [c0000027ed50f6c0] [c0000027ed50f750] 0xc0000027ed50f750 (unreliable) [ 204.125526] [c0000027ed50f750] [c00000000006c648] .pnv_pci_ioda_dma_dev_setup+0x88/0xb0 [ 204.125588] [c0000027ed50f7d0] [c000000000069cc8] .pnv_pci_dma_dev_setup+0x78/0x340 [ 204.125650] [c0000027ed50f870] [c000000000044408] .pcibios_setup_device+0x88/0x2f0 [ 204.125712] [c0000027ed50f940] [c000000000046040] .pcibios_setup_bus_devices+0x60/0xd0 [ 204.125774] [c0000027ed50f9c0] [c000000000043acc] .pcibios_add_pci_devices+0xdc/0x1c0 [ 204.125837] [c0000027ed50fa50] [c00000000086f970] .eeh_reset_device+0x36c/0x4f0 [ 204.125939] [c0000027ed50fb20] [c00000000003a2d8] .eeh_handle_normal_event+0x448/0x480 [ 204.126068] [c0000027ed50fbc0] [c00000000003a35c] .eeh_handle_event+0x4c/0x340 [ 204.126192] [c0000027ed50fc80] [c00000000003a74c] .eeh_event_handler+0xfc/0x1b0 [ 204.126319] [c0000027ed50fd30] [c0000000000d1ea0] .kthread+0x110/0x130 [ 204.126430] [c0000027ed50fe30] [c00000000000a460] .ret_from_kernel_thread+0x5c/0x7c [ 204.126556] Instruction dump: [ 204.126610] 7c0802a6 fba1ffe8 fbc1fff0 fbe1fff8 f8010010 f821ff71 7c7e1b78 60000000 [ 204.126787] 60000000 e87e0298 3143ffff 7d2a1910 <0b090000> 2fa90000 40de00c8 ebfe0218 [ 204.126966] ---[ end trace 6e7aefd80add2973 ]--- are cleared. This patch removes iommu_add_device() in pnv_pci_ioda_dma_dev_setup(), which revert part of the change in commit d905c5df(PPC: POWERNV: move iommu_add_device earlier). Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
* | powerpc/powernv: Fix little endian issues in OPAL flash codeAnton Blanchard2014-04-281-4/+8
| | | | | | | | | | | | | | With this patch I was able to update firmware on an LE kernel. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
* | powerpc/powernv: Fix kexec races going back to OPALBenjamin Herrenschmidt2014-04-281-2/+46
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We have a subtle race when sending CPUs back to OPAL on kexec. We mark them as "in real mode" right before we send them down. Once we've booted the new kernel, it might try to call opal_reinit_cpus() to change endianness, and that requires all CPUs to be spinning inside OPAL. However there is no synchronization here and we've observed cases where the returning CPUs hadn't established their new state inside OPAL before opal_reinit_cpus() is called, causing it to fail. The proper fix is to actually wait for them to go down all the way from the kexec'ing kernel. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
* | powerpc/powernv: Check sysparam size before creationJoel Stanley2014-04-281-0/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | The size of the sysparam sysfs files is determined from the device tree at boot. However the buffer is hard coded to 64 bytes. If we encounter a parameter that is larger than 64, or miss-parse the device tree, the buffer will overflow when reading or writing to the parameter. Check it at discovery time, and if the parameter is too large, do not create a sysfs entry for it. Signed-off-by: Joel Stanley <joel@jms.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
* | powerpc/powernv: Fix typos in sysparam codeJoel Stanley2014-04-281-2/+2
| | | | | | | | Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
* | powerpc/powernv: Check sysfs size before copyingJoel Stanley2014-04-281-0/+4
| | | | | | | | | | | | | | | | | | The sysparam code currently uses the userspace supplied number of bytes when memcpy()ing in to a local 64-byte buffer. Limit the maximum number of bytes by the size of the buffer. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
* | powerpc/powernv: Use ssize_t for sysparam return valuesJoel Stanley2014-04-281-5/+6
| | | | | | | | | | | | | | | | | | The OPAL calls are returning int64_t values, which the sysparam code stores in an int, and the sysfs callback returns ssize_t. Make code a easier to read by consistently using ssize_t. Signed-off-by: Joel Stanley <joel@jms.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
* | powerpc/powernv: Fix sysparam sysfs error handlingJoel Stanley2014-04-281-2/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When a sysparam query in OPAL returned a negative value (error code), sysfs would spew out a decent chunk of memory; almost 64K more than expected. This was traced to a sign/unsigned mix up in the OPAL sysparam sysfs code at sys_param_show. The return value of sys_param_show is a ssize_t, calculated using return ret ? ret : attr->param_size; Alan Modra explains: "attr->param_size" is an unsigned int, "ret" an int, so the overall expression has type unsigned int. Result is that ret is cast to unsigned int before being cast to ssize_t. Instead of using the ternary operator, set ret to the param_size if an error is not detected. The same bug exists in the sysfs write callback; this patch fixes it in the same way. A note on debugging this next time: on my system gcc will warn about this if compiled with -Wsign-compare, which is not enabled by -Wall, only -Wextra. Signed-off-by: Joel Stanley <joel@jms.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>