path: root/arch/powerpc/include/asm
Age | Commit message | Author
2014-10-11 | Merge branch 'for-linus' of ↵ | Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/mpe/linux Pull powerpc updates from Michael Ellerman: "Here's a first pull request for powerpc updates for 3.18. The bulk of the additions are for the "cxl" driver, for IBM's Coherent Accelerator Processor Interface (CAPI). Most of it's in drivers/misc, which Greg & Arnd maintain, Greg said he was happy for us to take it through our tree. There's the usual minor cleanups and fixes, including a bit of noise in drivers from some of those. A bunch of updates to our EEH code, which has been getting more testing. Several nice speedups from Anton, including 20% in clear_page(). And a bunch of updates for freescale from Scott" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mpe/linux: (130 commits) cxl: Fix afu_read() not doing finish_wait() on signal or non-blocking cxl: Add documentation for userspace APIs cxl: Add driver to Kbuild and Makefiles cxl: Add userspace header file cxl: Driver code for powernv PCIe based cards for userspace access cxl: Add base builtin support powerpc/mm: Add hooks for cxl powerpc/opal: Add PHB to cxl mode call powerpc/mm: Add new hash_page_mm() powerpc/powerpc: Add new PCIe functions for allocating cxl interrupts cxl: Add new header for call backs and structs powerpc/powernv: Split out set MSI IRQ chip code powerpc/mm: Export mmu_kernel_ssize and mmu_linear_psize powerpc/msi: Improve IRQ bitmap allocator powerpc/cell: Make spu_flush_all_slbs() generic powerpc/cell: Move data segment faulting code out of cell platform powerpc/cell: Move spu_handle_mm_fault() out of cell platform powerpc/pseries: Use new defines when calling H_SET_MODE powerpc: Update contact info in Documentation files powerpc/perf/hv-24x7: Simplify catalog_read() ...
2014-10-09 | Merge branch 'akpm' (fixes from Andrew Morton) | Linus Torvalds
Merge patch-bomb from Andrew Morton: - part of OCFS2 (review is laggy again) - procfs - slab - all of MM - zram, zbud - various other random things: arch, filesystems. * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (164 commits) nosave: consolidate __nosave_{begin,end} in <asm/sections.h> include/linux/screen_info.h: remove unused ORIG_* macros kernel/sys.c: compat sysinfo syscall: fix undefined behavior kernel/sys.c: whitespace fixes acct: eliminate compile warning kernel/async.c: switch to pr_foo() include/linux/blkdev.h: use NULL instead of zero include/linux/kernel.h: deduplicate code implementing clamp* macros include/linux/kernel.h: rewrite min3, max3 and clamp using min and max alpha: use Kbuild logic to include <asm-generic/sections.h> frv: remove deprecated IRQF_DISABLED frv: remove unused cpuinfo_frv and friends to fix future build error zbud: avoid accessing last unused freelist zsmalloc: simplify init_zspage free obj linking mm/zsmalloc.c: correct comment for fullness group computation zram: use notify_free to account all free notifications zram: report maximum used memory zram: zram memory size limitation zsmalloc: change return value unit of zs_get_total_size_bytes zsmalloc: move pages_allocated to zs_pool ...
2014-10-09 | mm: remove misleading ARCH_USES_NUMA_PROT_NONE | Mel Gorman
ARCH_USES_NUMA_PROT_NONE was defined for architectures that implemented _PAGE_NUMA using _PROT_NONE. This saved using an additional PTE bit and relied on the fact that PROT_NONE vmas were skipped by the NUMA hinting fault scanner. This was found to be conceptually confusing with a lot of implicit assumptions and it was asked that an alternative be found. Commit c46a7c81 "x86: define _PAGE_NUMA by reusing software bits on the PMD and PTE levels" redefined _PAGE_NUMA on x86 to be one of the swap PTE bits and shrunk the maximum possible swap size but it did not go far enough. There are no architectures that reuse _PROT_NONE as _PROT_NUMA but the relics still exist. This patch removes ARCH_USES_NUMA_PROT_NONE and removes some unnecessary duplication in powerpc vs the generic implementation by defining the types the core NUMA helpers expected to exist from x86 with their ppc64 equivalent. This necessitated that a PTE bit mask be created that identified the bits that distinguish present from NUMA pte entries but it is expected this will only differ between arches based on _PAGE_PROTNONE. The naming for the generic helpers was taken from x86 originally but ppc64 has types that are equivalent for the purposes of the helper so they are mapped instead of duplicating code. Signed-off-by: Mel Gorman <mgorman@suse.de> Cc: Hugh Dickins <hughd@google.com> Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> Cc: Rik van Riel <riel@redhat.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Cyrill Gorcunov <gorcunov@gmail.com> Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
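The PTE bit mask described above can be illustrated with a minimal sketch. The bit positions and all `SKETCH_`/`sketch_` names are hypothetical and chosen only for illustration; real architectures define their own values, and an architecture without a separate PROT_NONE bit would simply leave it out of the mask.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit positions, for illustration only. */
#define SKETCH_PAGE_PRESENT   (UINT64_C(1) << 0)
#define SKETCH_PAGE_PROTNONE  (UINT64_C(1) << 1)
#define SKETCH_PAGE_NUMA      (UINT64_C(1) << 2)

/* The bits that together distinguish a present PTE from a NUMA-hinting PTE. */
#define SKETCH_PAGE_NUMA_MASK \
    (SKETCH_PAGE_NUMA | SKETCH_PAGE_PROTNONE | SKETCH_PAGE_PRESENT)

/* A PTE is a NUMA-hinting entry only if, within the mask, NUMA alone is set. */
static bool sketch_pte_numa(uint64_t pte_flags)
{
    return (pte_flags & SKETCH_PAGE_NUMA_MASK) == SKETCH_PAGE_NUMA;
}

/* Usage example: bits outside the mask do not affect the answer. */
static void sketch_pte_numa_check(void)
{
    assert(sketch_pte_numa(SKETCH_PAGE_NUMA));
    assert(!sketch_pte_numa(SKETCH_PAGE_NUMA | SKETCH_PAGE_PRESENT));
    assert(!sketch_pte_numa(SKETCH_PAGE_PRESENT));
}
```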
2014-10-09 | Merge tag 'pci-v3.18-changes' of ↵ | Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci Pull PCI updates from Bjorn Helgaas: "The interesting things here are: - Turn on Config Request Retry Status Software Visibility. This caused hangs last time, but we included a fix this time. - Rework PCI device configuration to use _HPP/_HPX more aggressively - Allow PCI devices to be put into D3cold during system suspend - Add arm64 PCI support - Add APM X-Gene host bridge driver - Add TI Keystone host bridge driver - Add Xilinx AXI host bridge driver More detailed summary: Enumeration - Check Vendor ID only for Config Request Retry Status (Rajat Jain) - Enable Config Request Retry Status when supported (Rajat Jain) - Add generic domain handling (Catalin Marinas) - Generate uppercase hex for modalias interface class (Ricardo Ribalda Delgado) Resource management - Add missing MEM_64 mask in pci_assign_unassigned_bridge_resources() (Yinghai Lu) - Increase IBM ipr SAS Crocodile BARs to at least system page size (Douglas Lehr) PCI device hotplug - Prevent NULL dereference during pciehp probe (Andreas Noever) - Move _HPP & _HPX handling into core (Bjorn Helgaas) - Apply _HPP to PCIe devices as well as PCI (Bjorn Helgaas) - Apply _HPP/_HPX to display devices (Bjorn Helgaas) - Preserve SERR & PARITY settings when applying _HPP/_HPX (Bjorn Helgaas) - Preserve MPS and MRRS settings when applying _HPP/_HPX (Bjorn Helgaas) - Apply _HPP/_HPX to all devices, not just hot-added ones (Bjorn Helgaas) - Fix wait time in pciehp timeout message (Yinghai Lu) - Add more pciehp Slot Control debug output (Yinghai Lu) - Stop disabling pciehp notifications during init (Yinghai Lu) MSI - Remove arch_msi_check_device() (Alexander Gordeev) - Rename pci_msi_check_device() to pci_msi_supported() (Alexander Gordeev) - Move D0 check into pci_msi_check_device() (Alexander Gordeev) - Remove unused kobject from struct msi_desc (Yijing Wang) - Remove "pos" from the struct msi_desc msi_attrib (Yijing Wang) - Add "msi_bus" sysfs MSI/MSI-X 
control for endpoints (Yijing Wang) - Use __get_cached_msi_msg() instead of get_cached_msi_msg() (Yijing Wang) - Use __read_msi_msg() instead of read_msi_msg() (Yijing Wang) - Use __write_msi_msg() instead of write_msi_msg() (Yijing Wang) Power management - Drop unused runtime PM support code for PCIe ports (Rafael J. Wysocki) - Allow PCI devices to be put into D3cold during system suspend (Rafael J. Wysocki) AER - Add additional AER error strings (Gong Chen) - Make <linux/aer.h> standalone includable (Thierry Reding) Virtualization - Add ACS quirk for Solarflare SFC9120 & SFC9140 (Alex Williamson) - Add ACS quirk for Intel 10G NICs (Alex Williamson) - Add ACS quirk for AMD A88X southbridge (Marti Raudsepp) - Remove unused pci_find_upstream_pcie_bridge(), pci_get_dma_source() (Alex Williamson) - Add device flag helpers (Ethan Zhao) - Assume all Mellanox devices have broken INTx masking (Gavin Shan) Generic host bridge driver - Fix ioport_map() for !CONFIG_GENERIC_IOMAP (Liviu Dudau) - Add pci_register_io_range() and pci_pio_to_address() (Liviu Dudau) - Define PCI_IOBASE as the base of virtual PCI IO space (Liviu Dudau) - Fix the conversion of IO ranges into IO resources (Liviu Dudau) - Add pci_get_new_domain_nr() and of_get_pci_domain_nr() (Liviu Dudau) - Add support for parsing PCI host bridge resources from DT (Liviu Dudau) - Add pci_remap_iospace() to map bus I/O resources (Liviu Dudau) - Add arm64 architectural support for PCI (Liviu Dudau) APM X-Gene - Add APM X-Gene PCIe driver (Tanmay Inamdar) - Add arm64 DT APM X-Gene PCIe device tree nodes (Tanmay Inamdar) Freescale i.MX6 - Probe in module_init(), not fs_initcall() (Lucas Stach) - Delay enabling reference clock for SS until it stabilizes (Tim Harvey) Marvell MVEBU - Fix uninitialized variable in mvebu_get_tgt_attr() (Thomas Petazzoni) NVIDIA Tegra - Make sure the PCIe PLL is really reset (Eric Yuen) - Add error path tegra_msi_teardown_irq() cleanup (Jisheng Zhang) - Fix extended configuration space mapping 
(Peter Daifuku) - Implement resource hierarchy (Thierry Reding) - Clear CLKREQ# enable on port disable (Thierry Reding) - Add Tegra124 support (Thierry Reding) ST Microelectronics SPEAr13xx - Pass config resource through reg property (Pratyush Anand) Synopsys DesignWare - Use NULL instead of false (Fabio Estevam) - Parse bus-range property from devicetree (Lucas Stach) - Use pci_create_root_bus() instead of pci_scan_root_bus() (Lucas Stach) - Remove pci_assign_unassigned_resources() (Lucas Stach) - Check private_data validity in single place (Lucas Stach) - Setup and clear exactly one MSI at a time (Lucas Stach) - Remove open-coded bitmap operations (Lucas Stach) - Fix configuration base address when using 'reg' (Minghuan Lian) - Fix IO resource end address calculation (Minghuan Lian) - Rename get_msi_data() to get_msi_addr() (Minghuan Lian) - Add get_msi_data() to pcie_host_ops (Minghuan Lian) - Add support for v3.65 hardware (Murali Karicheri) - Fold struct pcie_port_info into struct pcie_port (Pratyush Anand) TI Keystone - Add TI Keystone PCIe driver (Murali Karicheri) - Limit MRSS for all downstream devices (Murali Karicheri) - Assume controller is already in RC mode (Murali Karicheri) - Set device ID based on SoC to support multiple ports (Murali Karicheri) Xilinx AXI - Add Xilinx AXI PCIe driver (Srikanth Thokala) - Fix xilinx_pcie_assign_msi() return value test (Dan Carpenter) Miscellaneous - Clean up whitespace (Quentin Lambert) - Remove assignments from "if" conditions (Quentin Lambert) - Move PCI_VENDOR_ID_VMWARE to pci_ids.h (Francesco Ruggeri) - x86: Mark DMI tables as initialization data (Mathias Krause) - x86: Move __init annotation to the correct place (Mathias Krause) - x86: Mark constants of pci_mmcfg_nvidia_mcp55() as __initconst (Mathias Krause) - x86: Constify pci_mmcfg_probes[] array (Mathias Krause) - x86: Mark PCI BIOS initialization code as such (Mathias Krause) - Parenthesize PCI_DEVID and PCI_VPD_LRDT_ID parameters (Megan Kamiya) - Remove 
unnecessary variable in pci_add_dynid() (Tobias Klauser)" * tag 'pci-v3.18-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: (109 commits) arm64: dts: Add APM X-Gene PCIe device tree nodes PCI: Add ACS quirk for AMD A88X southbridge devices PCI: xgene: Add APM X-Gene PCIe driver PCI: designware: Remove open-coded bitmap operations PCI/MSI: Remove unnecessary temporary variable PCI/MSI: Use __write_msi_msg() instead of write_msi_msg() MSI/powerpc: Use __read_msi_msg() instead of read_msi_msg() PCI/MSI: Use __get_cached_msi_msg() instead of get_cached_msi_msg() PCI/MSI: Add "msi_bus" sysfs MSI/MSI-X control for endpoints PCI/MSI: Remove "pos" from the struct msi_desc msi_attrib PCI/MSI: Remove unused kobject from struct msi_desc PCI/MSI: Rename pci_msi_check_device() to pci_msi_supported() PCI/MSI: Move D0 check into pci_msi_check_device() PCI/MSI: Remove arch_msi_check_device() irqchip: armada-370-xp: Remove arch_msi_check_device() PCI/MSI/PPC: Remove arch_msi_check_device() arm64: Add architectural support for PCI PCI: Add pci_remap_iospace() to map bus I/O resources of/pci: Add support for parsing PCI host bridge resources from DT of/pci: Add pci_get_new_domain_nr() and of_get_pci_domain_nr() ... Conflicts: arch/arm64/boot/dts/apm-storm.dtsi
2014-10-09 | Merge branch 'timers-nohz-for-linus' of ↵ | Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull timer fixes from Ingo Molnar: "Main changes: - Fix the deadlock reported by Dave Jones et al - Clean up and fix nohz_full interaction with arch abilities - nohz init code consolidation/cleanup" * 'timers-nohz-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: nohz: nohz full depends on irq work self IPI support nohz: Consolidate nohz full init code arm64: Tell irq work about self IPI support arm: Tell irq work about self IPI support x86: Tell irq work about self IPI support irq_work: Force raised irq work to run on irq work interrupt irq_work: Introduce arch_irq_work_has_interrupt() nohz: Move nohz full init call to tick init
2014-10-08 | Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm | Linus Torvalds
Pull KVM updates from Paolo Bonzini: "Fixes and features for 3.18. Apart from the usual cleanups, here is the summary of new features: - s390 moves closer towards host large page support - PowerPC has improved support for debugging (both inside the guest and via gdbstub) and support for e6500 processors - ARM/ARM64 support read-only memory (which is necessary to put firmware in emulated NOR flash) - x86 has the usual emulator fixes and nested virtualization improvements (including improved Windows support on Intel and Jailhouse hypervisor support on AMD), adaptive PLE which helps overcommitting of huge guests. Also included are some patches that make KVM more friendly to memory hot-unplug, and fixes for rare caching bugs. Two patches have trivial mm/ parts that were acked by Rik and Andrew. Note: I will soon switch to a subkey for signing purposes" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (157 commits) kvm: do not handle APIC access page if in-kernel irqchip is not in use KVM: s390: count vcpu wakeups in stat.halt_wakeup KVM: s390/facilities: allow TOD-CLOCK steering facility bit KVM: PPC: BOOK3S: HV: CMA: Reserve cma region only in hypervisor mode arm/arm64: KVM: Report correct FSC for unsupported fault types arm/arm64: KVM: Fix VTTBR_BADDR_MASK and pgd alloc kvm: Fix kvm_get_page_retry_io __gup retval check arm/arm64: KVM: Fix set_clear_sgi_pend_reg offset kvm: x86: Unpin and remove kvm_arch->apic_access_page kvm: vmx: Implement set_apic_access_page_addr kvm: x86: Add request bit to reload APIC access page address kvm: Add arch specific mmu notifier for page invalidation kvm: Rename make_all_cpus_request() to kvm_make_all_cpus_request() and make it non-static kvm: Fix page ageing bugs kvm/x86/mmu: Pass gfn and level to rmapp callback. x86: kvm: use alternatives for VMCALL vs. 
VMMCALL if kernel text is read-only kvm: x86: use macros to compute bank MSRs KVM: x86: Remove debug assertion of non-PAE reserved bits kvm: don't take vcpu mutex for obviously invalid vcpu ioctls kvm: Faults which trigger IO release the mmap_sem ...
2014-10-08 | powerpc/opal: Add PHB to cxl mode call | Ian Munsie
This adds the OPAL call to change a PHB into cxl mode. Signed-off-by: Ian Munsie <imunsie@au1.ibm.com> Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-10-08 | powerpc/mm: Add new hash_page_mm() | Ian Munsie
This adds a new function hash_page_mm() based on the existing hash_page(). This version allows any struct mm to be passed in, rather than assuming current. This is useful for servicing co-processor faults which are not in the context of the current running process. We need to be careful here as the current hash_page() assumes current in a few places. Signed-off-by: Ian Munsie <imunsie@au1.ibm.com> Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-10-08 | powerpc/powerpc: Add new PCIe functions for allocating cxl interrupts | Ian Munsie
This adds a number of functions for allocating IRQs under powernv PCIe for cxl. Signed-off-by: Ian Munsie <imunsie@au1.ibm.com> Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-10-08 | powerpc/cell: Make spu_flush_all_slbs() generic | Ian Munsie
This moves spu_flush_all_slbs() into a generic call copro_flush_all_slbs(). This will be useful when we add cxl which also needs a similar SLB flush call. Signed-off-by: Ian Munsie <imunsie@au1.ibm.com> Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-10-08 | powerpc/cell: Move data segment faulting code out of cell platform | Ian Munsie
__spu_trap_data_seg() currently contains code to determine the VSID and ESID required for a particular EA and mm struct. This code is generically useful for other co-processors. This moves the code out of the cell platform so it can be used by other powerpc code. It also adds 1TB segment handling which Cell didn't support. The new function is called copro_calculate_slb(). This also moves the internal struct spu_slb to a generic struct copro_slb which is now used in the Cell and copro code. We use this new struct instead of passing around esid and vsid parameters. Signed-off-by: Ian Munsie <imunsie@au1.ibm.com> Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-10-08 | powerpc/cell: Move spu_handle_mm_fault() out of cell platform | Ian Munsie
Currently spu_handle_mm_fault() is in the cell platform. This code is generically useful for other non-cell co-processors on powerpc. This patch moves this function out of the cell platform into arch/powerpc/mm so that others may use it. Signed-off-by: Ian Munsie <imunsie@au1.ibm.com> Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-10-07 | powerpc/pseries: Use new defines when calling H_SET_MODE | Michael Neuling
Now that we define these in the KVM code, use these defines when we call H_SET_MODE. No functional change. Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-10-04 | Merge branch 'next' of ↵ | Michael Ellerman
git://git.kernel.org/pub/scm/linux/kernel/git/scottwood/linux.git Freescale updates from Scott (27 commits): "Highlights include DMA32 zone support (SATA, USB, etc now works on 64-bit FSL kernels), MSI changes, 8xx optimizations and cleanup, t104x board support, and PrPMC PCI enumeration."
2014-10-02 | powerpc: Add printk levels to powerpc code | Anton Blanchard
Add printk levels to some places in the powerpc port. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-10-02 | powerpc: Remove powerpc specific cmd_line | Anton Blanchard
There is no need for yet another copy of the command line, just use boot_command_line like everyone else. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-10-02 | powerpc: Speed up clear_page by unrolling it | Anton Blanchard
Unroll clear_page 8 times. A simple microbenchmark which allocates and frees a zeroed page: for (i = 0; i < iterations; i++) { unsigned long p = __get_free_page(GFP_KERNEL | __GFP_ZERO); free_page(p); } improves 20% on POWER8. This assumes cacheline sizes won't grow beyond 512 bytes or page sizes won't drop below 1kB, which is unlikely, but we could add a runtime check during early init if it makes people nervous. Michael found that some versions of gcc produce quite bad code (all multiplies), so we give gcc a hand by using shifts and adds. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
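The unrolling idea can be modelled in portable userspace C. This is only a sketch under stated assumptions: the page and cache-line sizes are hypothetical constants, `memset()` stands in for the cache-line zeroing instruction, and every `SKETCH_`/`sketch_` name is invented for illustration. Offsets are formed with shifts rather than multiplies, and the assertion captures the requirement that the number of cache lines per page is a multiple of 8.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical sizes; a real kernel takes these from the platform. */
#define SKETCH_PAGE_SHIFT 12 /* 4 kB page */
#define SKETCH_LINE_SHIFT 7  /* 128-byte cache line */
#define SKETCH_PAGE_SIZE  ((size_t)1 << SKETCH_PAGE_SHIFT)
#define SKETCH_LINE_SIZE  ((size_t)1 << SKETCH_LINE_SHIFT)

/* Stand-in for a "zero one cache line" instruction. */
static void sketch_zero_line(unsigned char *p)
{
    memset(p, 0, SKETCH_LINE_SIZE);
}

/* Clear one page, zeroing 8 cache lines per loop iteration. */
static void sketch_clear_page(void *page)
{
    unsigned char *p = page;
    size_t lines = SKETCH_PAGE_SIZE >> SKETCH_LINE_SHIFT;
    size_t iterations = lines >> 3;

    /* The unrolled loop covers the page only if lines is a multiple of 8. */
    assert((lines & 7) == 0);

    for (size_t i = 0; i < iterations; i++) {
        sketch_zero_line(p + ((size_t)0 << SKETCH_LINE_SHIFT));
        sketch_zero_line(p + ((size_t)1 << SKETCH_LINE_SHIFT));
        sketch_zero_line(p + ((size_t)2 << SKETCH_LINE_SHIFT));
        sketch_zero_line(p + ((size_t)3 << SKETCH_LINE_SHIFT));
        sketch_zero_line(p + ((size_t)4 << SKETCH_LINE_SHIFT));
        sketch_zero_line(p + ((size_t)5 << SKETCH_LINE_SHIFT));
        sketch_zero_line(p + ((size_t)6 << SKETCH_LINE_SHIFT));
        sketch_zero_line(p + ((size_t)7 << SKETCH_LINE_SHIFT));
        p += (size_t)8 << SKETCH_LINE_SHIFT;
    }
}
```

With these sizes a page holds 32 lines, so the loop body runs 4 times. The stated limits follow from the same requirement: 8 lines of 512 bytes fill a 4 kB page exactly, and 8 lines of 128 bytes need at least a 1 kB page.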
2014-10-01 | PCI/MSI/PPC: Remove arch_msi_check_device() | Alexander Gordeev
Move MSI checks from arch_msi_check_device() to arch_setup_msi_irqs(). This makes the code more compact and allows removing arch_msi_check_device() from generic MSI code. Signed-off-by: Alexander Gordeev <agordeev@redhat.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Acked-by: Michael Ellerman <mpe@ellerman.id.au>
2014-09-30 | powerpc/powernv: Override dma_get_required_mask() | Gavin Shan
The dma_get_required_mask() function is used by some drivers to query the platform about what DMA mask is needed to cover all of memory. This is a bit of a strange semantic when we have to choose between IOMMU translation or bypass, but essentially what it means is "what DMA mask will give best performance". Currently, our IOMMU backend always returns a 32-bit mask here; we don't do anything special to it when we have bypass available. This causes some drivers to choose a 32-bit mask, thus losing the ability to use the bypass window, thinking this is more efficient. The problem was reported by the driver of the following device: 0004:03:00.0 0107: 1000:0087 (rev 05) 0004:03:00.0 Serial Attached SCSI controller: LSI Logic / Symbios \ Logic SAS2308 PCI-Express Fusion-MPT SAS-2 (rev 05) This patch adds an override of that function in order to, instead, return a 64-bit mask whenever a bypass window is available in order for drivers to prefer this configuration. Reported-by: Murali N. Iyer <mniyer@us.ibm.com> Suggested-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
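The decision described above reduces to a small piece of logic. The following is a minimal sketch, not the kernel's implementation: the `sketch_dev` structure, its `bypass_available` field, and both function names are hypothetical stand-ins for whatever per-device state the platform code actually consults.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Mask with the low n bits set, for 0 <= n <= 64. */
static uint64_t sketch_dma_bit_mask(unsigned int n)
{
    return n >= 64 ? UINT64_MAX : (UINT64_C(1) << n) - 1;
}

/* Hypothetical per-device state. */
struct sketch_dev {
    bool bypass_available; /* an IOMMU bypass window exists for this device */
};

/*
 * Report the mask a driver should prefer: 64-bit when the bypass
 * window can be used, otherwise the 32-bit IOMMU-translated default.
 */
static uint64_t sketch_dma_get_required_mask(const struct sketch_dev *dev)
{
    assert(dev != NULL);

    if (dev->bypass_available)
        return sketch_dma_bit_mask(64);
    return sketch_dma_bit_mask(32);
}
```

A driver that asks for the required mask before choosing its own would then pick the 64-bit configuration whenever the bypass window exists.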
2014-09-30 | powerpc/eeh: Emulate EEH recovery for VFIO devices | Gavin Shan
When enabling EEH functionality on passed-through devices (PE) with VFIO, the devices in the PE would be removed permanently from the guest side. In that case, the PE remains in frozen state. When returning the PE to the host, or restarting the guest, we had a mechanism for unfreezing the PE by clearing the PESTA/B frozen bits. However, that's not enough for some adapters, such as those in the "lspci" output below. Those adapters require a hot reset on the parent bus to bring their firmware back to a workable state. Otherwise, those adapters won't be operative and the host (for the returning case) or the guest will fail to load the drivers for those adapters without exception. 0000:01:00.0 Ethernet controller: Emulex Corporation OneConnect \ 10Gb NIC (be3) (rev 02) 0000:01:00.0 0200: 19a2:0710 (rev 02) 0001:03:00.0 Ethernet controller: Emulex Corporation OneConnect \ NIC (Lancer) (rev 10) 0001:03:00.0 0200: 10df:e220 (rev 10) The patch adds a mechanism to emulate EEH recovery (a hot reset on the parent PCI bus) at 3 gates to fix the issue: open/release one adapter of the PE, enable EEH functionality on one adapter of the PE. Reported-by: Murilo Fossa Vicentini <muvic@br.ibm.com> Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-09-30 | powerpc/powernv: Sync OpalPciResetScope with firmware | Gavin Shan
The names of PCI reset scopes aren't synchronized with firmware. The patch fixes it. Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-09-30 | powerpc/eeh: Unfreeze PE on enabling EEH functionality | Gavin Shan
When passing through a PE to a guest, the PE is possibly in frozen state. The driver for the pass-through devices on the guest side then can't be loaded successfully, as reported. We already had one gate in eeh_dev_open() to clear PE frozen state accordingly, but that's not enough because the function is only called once, at QEMU startup. The patch adds another gate in eeh_pe_set_option() so that the PE frozen state can be cleared at QEMU restart time. Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-09-30 | powerpc/eeh: Introduce eeh_ops::err_inject | Gavin Shan
The patch introduces eeh_ops::err_inject(), which allows injecting specified errors into the indicated PE for testing purposes. The functionality isn't supported on the pSeries platform. On PowerNV, the functionality relies on the OPAL API opal_pci_err_inject(). Signed-off-by: Mike Qiu <qiudayu@linux.vnet.ibm.com> Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-09-30 | powerpc/powernv: Sync header with firmware | Gavin Shan
The patch synchronizes firmware header file (opal.h) for PCI error injection. Signed-off-by: Mike Qiu <qiudayu@linux.vnet.ibm.com> Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-09-30 | powerpc/eeh: Freeze PE before PE reset | Gavin Shan
The patch adds one more option (EEH_OPT_FREEZE_PE) to the set_option() method to proactively freeze the PE. The option will be issued before resetting a passed-through PE, to drop MMIO access during the reset, because such access always contributes to recursive EEH errors. Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-09-30 | powerpc/eeh: Drop unused argument in eeh_check_failure() | Gavin Shan
eeh_check_failure() is used to check the frozen state of the PE which owns the indicated I/O address. The argument "val" of the function isn't used. The patch drops it and returns the frozen state of the PE as expected. Cc: Vishal Mansur <vmansur@linux.vnet.ibm.com> Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-09-30 | powerpc: ppc64le optimised word at a time | Anton Blanchard
Use cmpb which compares each byte in two 64 bit values and for each matching byte places 0xff in the target and 0x00 otherwise. A simple hash_name microbenchmark: http://ozlabs.org/~anton/junkcode/hash_name_bench.c shows this version to be 10-20% faster than running the x86 version on POWER8, depending on the length. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
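The byte-compare semantics described above can be modelled in portable C. This is an emulation for illustration only, not the instruction itself and not the kernel's code; the `sketch_` names are invented here.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Software model of a byte-wise compare: for each of the 8 byte
 * positions, the result holds 0xff where a and b match and 0x00
 * where they differ.
 */
static uint64_t sketch_cmpb(uint64_t a, uint64_t b)
{
    uint64_t result = 0;

    for (unsigned int shift = 0; shift < 64; shift += 8) {
        uint64_t byte_a = (a >> shift) & 0xff;
        uint64_t byte_b = (b >> shift) & 0xff;

        if (byte_a == byte_b)
            result |= UINT64_C(0xff) << shift;
    }
    return result;
}

/* Usage example: comparing against zero marks every zero byte in a word. */
static void sketch_cmpb_check(void)
{
    /* Only the fourth byte from the top is zero. */
    assert(sketch_cmpb(UINT64_C(0x6162630064656667), 0) ==
           UINT64_C(0x000000ff00000000));
}
```

Comparing a word against zero is what makes this useful for word-at-a-time string handling: a non-zero result means the word contains a terminating NUL, and the position of the 0xff byte says where.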
2014-09-30 | selftests/powerpc: Add test of load_unaligned_zero_pad() | Michael Ellerman
It is a rarely exercised case, so we want to have a test to ensure it works as required. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-09-30 | powerpc: Implement load_unaligned_zeropad | Anton Blanchard
Implement a bi-arch and bi-endian version of load_unaligned_zeropad. Since the fallback case is so rare, a userspace test harness was used to test this on ppc64le, ppc64 and ppc32: http://ozlabs.org/~anton/junkcode/test_load_unaligned_zeropad.c It uses mprotect to force a SEGV across a page boundary, and a SEGV handler to look up the exception tables and run the fixup routine. It also compares the result against a normal load. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
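The result the fallback path has to produce can be stated as a small reference model. This sketch does not use faults, exception tables, or `mprotect`; it only expresses the expected value, assuming the caller already knows how many bytes are readable before the boundary. The function name and the `readable` parameter are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Reference model: return the word whose memory image is the first
 * `readable` bytes at addr (at most 8), followed by zero bytes.
 * Working on the memory image keeps the model endian-neutral.
 */
static uint64_t sketch_load_zeropad(const unsigned char *addr, size_t readable)
{
    unsigned char bytes[sizeof(uint64_t)] = {0};
    size_t n = readable < sizeof(bytes) ? readable : sizeof(bytes);
    uint64_t word;

    assert(addr != NULL);

    memcpy(bytes, addr, n);             /* bytes before the boundary */
    memcpy(&word, bytes, sizeof(word)); /* remaining bytes stay zero */
    return word;
}
```

When all 8 bytes are readable the model matches a normal load, which is the same comparison the test harness above performs.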
2014-09-25 | powerpc/eeh: Fix kernel crash when passing through VF | Wei Yang
When doing vfio passthrough a VF, the kernel will crash with following message: [ 442.656459] Unable to handle kernel paging request for data at address 0x00000060 [ 442.656593] Faulting instruction address: 0xc000000000038b88 [ 442.656706] Oops: Kernel access of bad area, sig: 11 [#1] [ 442.656798] SMP NR_CPUS=1024 NUMA PowerNV [ 442.656890] Modules linked in: vfio_pci mlx4_core nf_conntrack_netbios_ns nf_conntrack_broadcast ipt_MASQUERADE ip6t_REJECT xt_conntrack bnep bluetooth rfkill ebtable_nat ebtable_broute bridge stp llc ebtable_filter ebtables ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_security ip6table_raw ip6table_filter ip6_tables iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_mangle iptable_security iptable_raw tg3 nfsd be2net nfs_acl ses lockd ptp enclosure pps_core kvm_hv kvm_pr shpchp binfmt_misc kvm sunrpc uinput lpfc scsi_transport_fc ipr scsi_tgt [last unloaded: mlx4_core] [ 442.658152] CPU: 40 PID: 14948 Comm: qemu-system-ppc Not tainted 3.10.42yw-pkvm+ #37 [ 442.658219] task: c000000f7e2a9a00 ti: c000000f6dc3c000 task.ti: c000000f6dc3c000 [ 442.658287] NIP: c000000000038b88 LR: c0000000004435a8 CTR: c000000000455bc0 [ 442.658352] REGS: c000000f6dc3f580 TRAP: 0300 Not tainted (3.10.42yw-pkvm+) [ 442.658419] MSR: 9000000000009032 <SF,HV,EE,ME,IR,DR,RI> CR: 28004882 XER: 20000000 [ 442.658577] CFAR: c00000000000908c DAR: 0000000000000060 DSISR: 40000000 SOFTE: 1 GPR00: c0000000004435a8 c000000f6dc3f800 c0000000012b1c10 c00000000da24000 GPR04: 0000000000000003 0000000000001004 00000000000015b3 000000000000ffff GPR08: c00000000127f5d8 0000000000000000 000000000000ffff 0000000000000000 GPR12: c000000000068078 c00000000fdd6800 000001003c320c80 000001003c3607f0 GPR16: 0000000000000001 00000000105480c8 000000001055aaa8 000001003c31ab18 GPR20: 000001003c10fb40 000001003c360ae8 000000001063bcf0 000000001063bdb0 GPR24: 000001003c15ed70 0000000010548f40 c000001fe5514c88 
c000001fe5514cb0 GPR28: c00000000da24000 0000000000000000 c00000000da24000 0000000000000003 [ 442.659471] NIP [c000000000038b88] .pcibios_set_pcie_reset_state+0x28/0x130 [ 442.659530] LR [c0000000004435a8] .pci_set_pcie_reset_state+0x28/0x40 [ 442.659585] Call Trace: [ 442.659610] [c000000f6dc3f800] [00000000000719e0] 0x719e0 (unreliable) [ 442.659677] [c000000f6dc3f880] [c0000000004435a8] .pci_set_pcie_reset_state+0x28/0x40 [ 442.659757] [c000000f6dc3f900] [c000000000455bf8] .reset_fundamental+0x38/0x80 [ 442.659835] [c000000f6dc3f980] [c0000000004562a8] .pci_dev_specific_reset+0xa8/0xf0 [ 442.659913] [c000000f6dc3fa00] [c0000000004448c4] .__pci_dev_reset+0x44/0x430 [ 442.659980] [c000000f6dc3fab0] [c000000000444d5c] .pci_reset_function+0x7c/0xc0 [ 442.660059] [c000000f6dc3fb30] [d00000001c141ab8] .vfio_pci_open+0xe8/0x2b0 [vfio_pci] [ 442.660139] [c000000f6dc3fbd0] [c000000000586c30] .vfio_group_fops_unl_ioctl+0x3a0/0x630 [ 442.660219] [c000000f6dc3fc90] [c000000000255fbc] .do_vfs_ioctl+0x4ec/0x7c0 [ 442.660286] [c000000f6dc3fd80] [c000000000256364] .SyS_ioctl+0xd4/0xf0 [ 442.660354] [c000000f6dc3fe30] [c000000000009e54] syscall_exit+0x0/0x98 [ 442.660420] Instruction dump: [ 442.660454] 4bfffce9 4bfffee4 7c0802a6 fbc1fff0 fbe1fff8 f8010010 f821ff81 7c7e1b78 [ 442.660566] 7c9f2378 60000000 60000000 e93e02c8 <e8690060> 2fa30000 41de00c4 2b9f0002 [ 442.660679] ---[ end trace a64ac9546bcf0328 ]--- [ 442.660724] The reason is current VF is not EEH enabled. This patch introduces a macro to convert eeh_dev to eeh_pe. By doing so, it will prevent converting with NULL pointer. Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com> Acked-by: Gavin Shan <gwshan@linux.vnet.ibm.com> CC: Michael Ellerman <mpe@ellerman.id.au> V3 -> V4: 1. move the macro definition from include/linux/pci.h to arch/powerpc/include/asm/eeh.h V2 -> V3: 1. rebased on 3.17-rc4 2. introduce a macro 3. use this macro in several other places V1 -> V2: 1. 
code style and patch subject adjustment Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
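The NULL-safe conversion described in this commit can be sketched as follows. The structures are heavily reduced, hypothetical stand-ins, the `sketch_` names are invented, and the error value returned by the caller is an arbitrary choice for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, reduced stand-ins for the kernel structures. */
struct sketch_eeh_pe {
    int addr;
};

struct sketch_eeh_dev {
    struct sketch_eeh_pe *pe;
};

/* NULL-safe conversion: a device without EEH state yields no PE. */
#define sketch_eeh_dev_to_pe(edev) ((edev) ? (edev)->pe : NULL)

/* A caller that bails out instead of dereferencing a NULL device. */
static int sketch_set_reset_state(struct sketch_eeh_dev *edev)
{
    struct sketch_eeh_pe *pe = sketch_eeh_dev_to_pe(edev);

    if (!pe)
        return -1; /* not EEH enabled, nothing to reset */
    return 0;
}

/* Usage example: a device that is not EEH enabled is handled gracefully. */
static void sketch_eeh_check(void)
{
    assert(sketch_set_reset_state(NULL) == -1);
}
```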
2014-09-25 | powerpc: Emulate icbi, mcrf and conditional-trap instructions | Paul Mackerras
This extends the instruction emulation done by analyse_instr() and emulate_step() to handle a few more instructions that are found in the kernel. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-09-25 | powerpc: Split out instruction analysis part of emulate_step() | Paul Mackerras
This splits out the instruction analysis part of emulate_step() into a separate analyse_instr() function, which decodes the instruction, but doesn't execute any load or store instructions. It does execute integer instructions and branches which can be executed purely by updating register values in the pt_regs struct. For other instructions, it returns the instruction type and other details in a new instruction_op struct. emulate_step() then uses that information to execute loads, stores, cache operations, mfmsr, mtmsr[d], and (on 64-bit) sc instructions. The reason for doing this is so that the KVM code can use it instead of having its own separate instruction emulation code. Possibly the alignment interrupt handler could also use this. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
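The decode/execute split can be illustrated with a toy model. Everything here is hypothetical: the three-instruction encoding, the structures, and the return convention (1 for done, 0 for "described but not executed", -1 for unknown) are invented for this sketch and are not the kernel's definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy encoding: opcode in bits 8-15, register in bits 4-5, operand in bits 0-3. */
enum { TOY_ADDI = 1, TOY_LOAD = 2, TOY_STORE = 3 };

enum sketch_op_type { SKETCH_LOAD, SKETCH_STORE };

struct sketch_regs {
    uint64_t gpr[4];
    uint64_t nip; /* next instruction pointer */
};

/* Describes work the analysis step decoded but did not perform. */
struct sketch_instruction_op {
    enum sketch_op_type type;
    unsigned int reg;
    uint64_t ea; /* effective address, here an index into a small array */
};

/* Decode; execute only what needs nothing beyond the register state. */
static int sketch_analyse(struct sketch_instruction_op *op,
                          struct sketch_regs *regs, uint32_t instr)
{
    unsigned int opcode = (instr >> 8) & 0xff;
    unsigned int reg = (instr >> 4) & 0x3;
    unsigned int operand = instr & 0xf;

    assert(op != NULL && regs != NULL);

    switch (opcode) {
    case TOY_ADDI: /* pure register update: finish it here */
        regs->gpr[reg] += operand;
        regs->nip += 4;
        return 1;
    case TOY_LOAD:
    case TOY_STORE: /* memory access: describe it, do not touch memory */
        op->type = (opcode == TOY_LOAD) ? SKETCH_LOAD : SKETCH_STORE;
        op->reg = reg;
        op->ea = operand;
        return 0;
    default:
        return -1;
    }
}

/* Full emulation: analyse first, then carry out any described access. */
static int sketch_emulate_step(struct sketch_regs *regs, uint64_t mem[16],
                               uint32_t instr)
{
    struct sketch_instruction_op op;
    int r = sketch_analyse(&op, regs, instr);

    if (r != 0)
        return r;
    if (op.type == SKETCH_LOAD)
        regs->gpr[op.reg] = mem[op.ea];
    else
        mem[op.ea] = regs->gpr[op.reg];
    regs->nip += 4;
    return 1;
}
```

The point of the split is that a second consumer can call the analysis step alone and perform the described memory access in its own way, which is the reuse the commit message has in mind for KVM.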
2014-09-25powerpc/powernv: Don't call generic code on offline cpusPaul Mackerras
On PowerNV platforms, when a CPU is offline, we put it into nap mode. It's possible that the CPU wakes up from nap mode while it is still offline due to a stray IPI. A misdirected device interrupt could also potentially cause it to wake up. In that circumstance, we need to clear the interrupt so that the CPU can go back to nap mode. In the past the clearing of the interrupt was accomplished by briefly enabling interrupts and allowing the normal interrupt handling code (do_IRQ() etc.) to handle the interrupt. This has the problem that this code calls irq_enter() and irq_exit(), which call functions such as account_system_vtime() which use RCU internally. Use of RCU is not permitted on offline CPUs and will trigger errors if RCU checking is enabled. To avoid calling into any generic code which might use RCU, we adopt a different method of clearing interrupts on offline CPUs. Since we are on the PowerNV platform, we know that the system interrupt controller is a XICS being driven directly (i.e. not via hcalls) by the kernel. Hence this adds a new icp_native_flush_interrupt() function to the native-mode XICS driver and arranges to call that when an offline CPU is woken from nap. This new function reads the interrupt from the XICS. If it is an IPI, it clears the IPI; if it is a device interrupt, it prints a warning and disables the source. Then it does the end-of-interrupt processing for the interrupt. The other thing that briefly enabling interrupts did was to check and clear the irq_happened flag in this CPU's PACA. Therefore, after flushing the interrupt from the XICS, we also clear all bits except the PACA_IRQ_HARD_DIS (interrupts are hard disabled) bit from the irq_happened flag. The PACA_IRQ_HARD_DIS flag is set by power7_nap() and is left set to indicate that interrupts are hard disabled. This means we then have to ignore that flag in power7_nap(), which is reasonable since it doesn't indicate that any interrupt event needs servicing. 
Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
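The irq_happened handling described above amounts to a mask operation. The sketch below assumes illustrative bit values and a stand-in `paca_sketch` struct; the real flag definitions and PACA layout live in the kernel headers, and both function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative bit values (assumed, not taken from the patch). */
#define SKETCH_IRQ_HARD_DIS 0x01u
#define SKETCH_IRQ_DBELL    0x02u
#define SKETCH_IRQ_EE       0x04u
#define SKETCH_IRQ_DEC      0x08u

struct paca_sketch { uint8_t irq_happened; };

/* After flushing the interrupt from the controller, forget every pending
 * event but keep the "hard disabled" marker set. */
static void clear_pending_sketch(struct paca_sketch *paca)
{
    /* The nap path is expected to have left this bit set. */
    assert(paca->irq_happened & SKETCH_IRQ_HARD_DIS);
    paca->irq_happened &= SKETCH_IRQ_HARD_DIS;
}

/* The nap entry check ignores HARD_DIS, since that bit does not
 * indicate an interrupt event that needs servicing. */
static int needs_service_sketch(const struct paca_sketch *paca)
{
    return (paca->irq_happened & ~SKETCH_IRQ_HARD_DIS) != 0;
}
```

With these two helpers, an offline CPU that was woken by a stray interrupt ends up with only the hard-disabled bit set and can return to nap.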
2014-09-25powerpc: Move htab_remove_mapping function prototype into header fileAnton Blanchard
A recent patch added a function prototype for htab_remove_mapping in C code. Move it into a header file. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-09-25powerpc: Remove stale function prototypesAnton Blanchard
There were a number of prototypes for functions that no longer exist. Remove them. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-09-25powerpc/powernv: Add OPAL check token callMichael Neuling
Currently there is no way to generically check if an OPAL call exists or not from the host kernel. This adds an OPAL call opal_check_token() which tells you if the given token is present in OPAL or not. Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-09-24Merge tag 'signed-kvm-ppc-next' of git://github.com/agraf/linux-2.6 into ↵Paolo Bonzini
kvm-next Patch queue for ppc - 2014-09-24 New awesome things in this release: - E500: e6500 core support - E500: guest and remote debug support - Book3S: remote sw breakpoint support - Book3S: HV: Minor bugfixes Alexander Graf (1): KVM: PPC: Pass enum to kvmppc_get_last_inst Bharat Bhushan (8): KVM: PPC: BOOKE: allow debug interrupt at "debug level" KVM: PPC: BOOKE : Emulate rfdi instruction KVM: PPC: BOOKE: Allow guest to change MSR_DE KVM: PPC: BOOKE: Clear guest dbsr in userspace exit KVM_EXIT_DEBUG KVM: PPC: BOOKE: Guest and hardware visible debug registers are same KVM: PPC: BOOKE: Add one reg interface for DBSR KVM: PPC: BOOKE: Add one_reg documentation of SPRG9 and DBSR KVM: PPC: BOOKE: Emulate debug registers and exception Madhavan Srinivasan (2): powerpc/kvm: support to handle sw breakpoint powerpc/kvm: common sw breakpoint instr across ppc Michael Neuling (1): KVM: PPC: Book3S HV: Add register name when loading toc Mihai Caraman (10): powerpc/booke: Restrict SPE exception handlers to e200/e500 cores powerpc/booke: Revert SPE/AltiVec common defines for interrupt numbers KVM: PPC: Book3E: Increase FPU laziness KVM: PPC: Book3e: Add AltiVec support KVM: PPC: Make ONE_REG powerpc generic KVM: PPC: Move ONE_REG AltiVec support to powerpc KVM: PPC: Remove the tasklet used by the hrtimer KVM: PPC: Remove shared defines for SPE and AltiVec interrupts KVM: PPC: e500mc: Add support for single threaded vcpus on e6500 core KVM: PPC: Book3E: Enable e6500 core Paul Mackerras (2): KVM: PPC: Book3S HV: Increase timeout for grabbing secondary threads KVM: PPC: Book3S HV: Only accept host PVR value for guest PVR
2014-09-24kvm: Add arch specific mmu notifier for page invalidationTang Chen
This will be used to let the guest run while the APIC access page is not pinned. Because subsequent patches will fill in the function for x86, place the (still empty) x86 implementation in the x86.c file instead of adding an inline function in kvm_host.h. Signed-off-by: Tang Chen <tangchen@cn.fujitsu.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-09-24kvm: Fix page ageing bugsAndres Lagar-Cavilla
1. We were calling clear_flush_young_notify in unmap_one, but we are within an mmu notifier invalidate range scope. The spte exists no more (due to range_start) and the accessed bit info has already been propagated (due to kvm_pfn_set_accessed). Simply call clear_flush_young. 2. We clear_flush_young on a primary MMU PMD, but this may be mapped as a collection of PTEs by the secondary MMU (e.g. during log-dirty). This required expanding the interface of the clear_flush_young mmu notifier, so a lot of code has been trivially touched. 3. In the absence of shadow_accessed_mask (e.g. EPT A bit), we emulate the access bit by blowing the spte. This requires proper synchronizing with MMU notifier consumers, like every other removal of spte's does. Signed-off-by: Andres Lagar-Cavilla <andreslc@google.com> Acked-by: Rik van Riel <riel@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-09-22powerpc/kvm: common sw breakpoint instr across ppcMadhavan Srinivasan
This patch extends the use of an illegal instruction as the software breakpoint instruction across the ppc platform. It also extends the booke program interrupt code to support software breakpoints. Signed-off-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com> [agraf: Fix bookehv] Signed-off-by: Alexander Graf <agraf@suse.de>
2014-09-22powerpc/kvm: support to handle sw breakpointMadhavan Srinivasan
This patch adds kernel-side support for software breakpoints. The design is that, by using an illegal instruction, we trap to the hypervisor via the Emulation Assistance interrupt, where we check for the illegal instruction and return to the host or guest accordingly. The patch also adds support for software breakpoints in PR KVM. Signed-off-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com> Signed-off-by: Alexander Graf <agraf@suse.de>
2014-09-22KVM: PPC: e500mc: Add support for single threaded vcpus on e6500 coreMihai Caraman
ePAPR represents hardware threads as cpu node properties in the device tree. So with existing QEMU, hardware threads are simply exposed as vcpus with one hardware thread. The e6500 core shares TLBs between hardware threads. Without a TLB write conditional instruction, the Linux kernel uses per-core mechanisms to protect against duplicate TLB entries. The guest is unable to detect real sibling threads, so it can't use the TLB protection mechanism. An alternative solution is to have the hypervisor allocate different lpids to a guest's vcpus that run simultaneously on real sibling threads. On systems with two threads per core, this patch halves the size of the lpid pool that the allocator sees and uses two lpids per VM. Use even numbers to speed up vcpu lpid computation with consecutive lpids per VM: vm1 will use lpids 2 and 3, vm2 lpids 4 and 5, and so on. Signed-off-by: Mihai Caraman <mihai.caraman@freescale.com> [agraf: fix spelling] Signed-off-by: Alexander Graf <agraf@suse.de>
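The even-base lpid scheme above can be worked through in a few lines. This is a sketch under assumptions: the function names and the 1-based VM index are hypothetical, and only the two-threads-per-core case from the commit message is modeled.

```c
#include <assert.h>

#define THREADS_PER_CORE_SKETCH 2

/* vm_index counts from 1; each VM gets an even base lpid, so
 * vm1 -> 2, vm2 -> 4, and so on. */
static int vm_base_lpid_sketch(int vm_index)
{
    assert(vm_index >= 1);
    return vm_index * THREADS_PER_CORE_SKETCH;
}

/* lpid for a vcpu running on hardware thread 0 or 1 of a core.
 * Because the base is even, OR-ing in the thread index selects
 * base or base + 1 without an addition or a table lookup. */
static int vcpu_lpid_sketch(int base_lpid, int thread)
{
    assert((base_lpid & 1) == 0);
    assert(thread == 0 || thread == 1);
    return base_lpid | thread;
}
```

Two vcpus of the same VM running on sibling threads therefore never share an lpid, which keeps their entries distinct in the shared TLB.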
2014-09-22KVM: PPC: Remove shared defines for SPE and AltiVec interruptsMihai Caraman
We currently decide at compile-time which of the SPE or AltiVec units to support exclusively. Guard kernel defines with CONFIG_SPE_POSSIBLE and CONFIG_PPC_E500MC and remove shared defines. Signed-off-by: Mihai Caraman <mihai.caraman@freescale.com> Signed-off-by: Alexander Graf <agraf@suse.de>
2014-09-22KVM: PPC: Remove the tasklet used by the hrtimerMihai Caraman
The powerpc timer implementation is a copy of the s390 one. Now that s390 has removed the tasklet in commit ea74c0ea1b24a6978a6ebc80ba4dbc7b7848b32d, follow this optimization. Signed-off-by: Mihai Caraman <mihai.caraman@freescale.com> Signed-off-by: Bogdan Purcareata <bogdan.purcareata@freescale.com> Signed-off-by: Alexander Graf <agraf@suse.de>
2014-09-22KVM: PPC: BOOKE: Emulate debug registers and exceptionBharat Bhushan
This patch emulates debug registers and the debug exception to support a guest using debug resources. This enables running gdb/kgdb etc. in the guest. On the BOOKE architecture we cannot share debug resources between QEMU and the guest because: When QEMU is using debug resources, the debug exception must always be enabled. To achieve this we set MSR_DE and also set MSRP_DEP so the guest cannot change MSR_DE. When emulating debug resources for the guest, we want the guest to control MSR_DE (enabling/disabling the debug interrupt as needed). These two configurations cannot be supported at the same time, so we cannot share debug resources between QEMU and the guest on the BOOKE architecture. In the current design QEMU gets priority over the guest: if QEMU is using debug resources then the guest cannot use them, and if the guest is using debug resources then QEMU can overwrite them. Signed-off-by: Bharat Bhushan <Bharat.Bhushan@freescale.com> Signed-off-by: Alexander Graf <agraf@suse.de>
2014-09-22KVM: PPC: BOOKE: Guest and hardware visible debug registers are sameBharat Bhushan
The guest-visible and hardware-visible debug registers are the same, so there is no need to have arch->shadow_dbg_reg; use arch->dbg_reg instead. Signed-off-by: Bharat Bhushan <Bharat.Bhushan@freescale.com> Signed-off-by: Alexander Graf <agraf@suse.de>
2014-09-22KVM: PPC: BOOKE : Emulate rfdi instructionBharat Bhushan
This patch adds "rfdi" instruction emulation, which is required for the guest debug handler on BOOKE-HV. Signed-off-by: Bharat Bhushan <Bharat.Bhushan@freescale.com> Signed-off-by: Alexander Graf <agraf@suse.de>
2014-09-19Revert "powerpc/fsl_msi: spread msi ints across different MSIRs"Scott Wood
This reverts commit c822e73731fce3b49a4887140878d084d8a44c08. This commit conflicted with a bitmap allocator change that partially accomplishes the same thing, but which does so more correctly. Revert this one until it can be respun on top of the correct change. Signed-off-by: Scott Wood <scottwood@freescale.com>
2014-09-13irq_work: Introduce arch_irq_work_has_interrupt()Peter Zijlstra
The nohz full code needs irq work to trigger its own interrupt so that the subsystem can work even when the tick is stopped. Let's introduce arch_irq_work_has_interrupt(), which archs can override to report whether they support this ability. Signed-off-by: Peter Zijlstra <peterz@infradead.org> Cc: Ingo Molnar <mingo@kernel.org> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
2014-09-09powerpc: Wire up sys_seccomp(), sys_getrandom() and sys_memfd_create()Pranith Kumar
This patch wires up three new syscalls for powerpc. The three new syscalls are seccomp, getrandom and memfd_create. Signed-off-by: Pranith Kumar <bobby.prani@gmail.com> Reviewed-by: David Herrmann <dh.herrmann@gmail.com>