summaryrefslogtreecommitdiff
path: root/drivers/s390/block/dcssblk.c
AgeCommit message (Collapse)Author
2024-07-18Merge tag 's390-6.11-1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux Pull s390 updates from Vasily Gorbik: - Remove restrictions on PAI NNPA and crypto counters, enabling concurrent per-task and system-wide sampling and counting events - Switch to GENERIC_CPU_DEVICES by setting up the CPU present mask in the architecture code and letting the generic code handle CPU bring-up - Add support for the diag204 busy indication facility to prevent undesirable blocking during hypervisor logical CPU utilization queries. Implement results caching - Improve the handling of Store Data SCLP events by suppressing unnecessary warning, preventing buffer release in I/O during failures, and adding timeout handling for Store Data requests to address potential firmware issues - Provide optimized __arch_hweight*() implementations - Remove the unnecessary CPU KOBJ_CHANGE uevents generated during topology updates, as they are unused and also not present on other architectures - Cleanup atomic_ops, optimize __atomic_set() for small values and __atomic_cmpxchg_bool() for compilers supporting flag output constraint - Couple of cleanups for KVM: - Move and improve KVM struct definitions for DAT tables from gaccess.c to a new header - Pass the asce as parameter to sie64a() - Make the crdte() and cspg() page table handling wrappers return a boolean to indicate success, like the other existing "compare and swap" wrappers - Add documentation for HWCAP flags - Switch to obtaining total RAM pages from memblock instead of totalram_pages() during mm init, to ensure correct calculation of zero page size, when defer_init is enabled - Refactor lowcore access and switch to using the get_lowcore() function instead of the S390_lowcore macro - Cleanups for PG_arch_1 and folio handling in UV and hugetlb code - Add missing MODULE_DESCRIPTION() macros - Fix VM_FAULT_HWPOISON handling in do_exception() * tag 's390-6.11-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (54 commits) s390/mm: Fix VM_FAULT_HWPOISON handling in do_exception() s390/kvm: Move bitfields for dat tables s390/entry: Pass the asce as parameter to sie64a() s390/sthyi: Use cached data when diag is busy s390/sthyi: Move diag operations s390/hypfs_diag: Diag204 busy loop s390/diag: Add busy-indication-facility requirements s390/diag: Diag204 add busy return errno s390/diag: Return errno's from diag204 s390/sclp: Diag204 busy indication facility detection s390/atomic_ops: Make use of flag output constraint s390/atomic_ops: Improve __atomic_set() for small values s390/atomic_ops: Use symbolic names s390/smp: Switch to GENERIC_CPU_DEVICES s390/hwcaps: Add documentation for HWCAP flags s390/pgtable: Make crdte() and cspg() return a value s390/topology: Remove CPU KOBJ_CHANGE uevents s390/sclp: Add timeout to Store Data requests s390/sclp: Prevent release of buffer in I/O s390/sclp: Suppress unnecessary Store Data warning ...
2024-06-28s390/dcssblk: Add missing MODULE_DESCRIPTION() macroJeff Johnson
With ARCH=s390, make allmodconfig && make W=1 C=1 reports: WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/s390/block/dcssblk.o Add the missing invocation of the MODULE_DESCRIPTION() macro. Acked-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Jeff Johnson <quic_jjohnson@quicinc.com> Link: https://lore.kernel.org/r/20240615-md-s390-drivers-s390-block-dcssblk-v1-1-d9d19703abcb@quicinc.com Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2024-06-19block: move the dax flag to queue_limitsChristoph Hellwig
Move the dax flag into the queue_limits feature field so that it can be set atomically with the queue frozen. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Damien Le Moal <dlemoal@kernel.org> Reviewed-by: Hannes Reinecke <hare@suse.de> Link: https://lore.kernel.org/r/20240617060532.127975-21-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2024-03-19Merge tag 's390-6.9-2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux Pull more s390 updates from Heiko Carstens: - Various virtual vs physical address usage fixes - Add new bitwise types and helper functions and use them in s390 specific drivers and code to make it easier to find virtual vs physical address usage bugs. Right now virtual and physical addresses are identical for s390, except for module, vmalloc, and similar areas. This will be changed, hopefully with the next merge window, so that e.g. the kernel image and modules will be located close to each other, allowing for direct branches and also for some other simplifications. As a prerequisite this requires to fix all misuses of virtual and physical addresses. As it turned out people are so used to the concept that virtual and physical addresses are the same, that new bugs got added to code which was already fixed. In order to avoid that even more code gets merged which adds such bugs add and use new bitwise types, so that sparse can be used to find such usage bugs. Most likely the new types can go away again after some time - Provide a simple ARCH_HAS_DEBUG_VIRTUAL implementation - Fix kprobe branch handling: if an out-of-line single stepped relative branch instruction has a target address within a certain address area in the entry code, the program check handler may incorrectly execute cleanup code as if KVM code was executed, leading to crashes - Fix reference counting of zcrypt card objects - Various other small fixes and cleanups * tag 's390-6.9-2' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (41 commits) s390/entry: compare gmap asce to determine guest/host fault s390/entry: remove OUTSIDE macro s390/entry: add CIF_SIE flag and remove sie64a() address check s390/cio: use while (i--) pattern to clean up s390/raw3270: make class3270 constant s390/raw3270: improve raw3270_init() readability s390/tape: make tape_class constant s390/vmlogrdr: make vmlogrdr_class constant s390/vmur: make vmur_class constant s390/zcrypt: make zcrypt_class constant s390/mm: provide simple ARCH_HAS_DEBUG_VIRTUAL support s390/vfio_ccw_cp: use new address translation helpers s390/iucv: use new address translation helpers s390/ctcm: use new address translation helpers s390/lcs: use new address translation helpers s390/qeth: use new address translation helpers s390/zfcp: use new address translation helpers s390/tape: fix virtual vs physical address confusion s390/3270: use new address translation helpers s390/3215: use new address translation helpers ...
2024-03-14Merge tag 'mm-stable-2024-03-13-20-04' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull MM updates from Andrew Morton: - Sumanth Korikkar has taught s390 to allocate hotplug-time page frames from hotplugged memory rather than only from main memory. Series "implement "memmap on memory" feature on s390". - More folio conversions from Matthew Wilcox in the series "Convert memcontrol charge moving to use folios" "mm: convert mm counter to take a folio" - Chengming Zhou has optimized zswap's rbtree locking, providing significant reductions in system time and modest but measurable reductions in overall runtimes. The series is "mm/zswap: optimize the scalability of zswap rb-tree". - Chengming Zhou has also provided the series "mm/zswap: optimize zswap lru list" which provides measurable runtime benefits in some swap-intensive situations. - And Chengming Zhou further optimizes zswap in the series "mm/zswap: optimize for dynamic zswap_pools". Measured improvements are modest. - zswap cleanups and simplifications from Yosry Ahmed in the series "mm: zswap: simplify zswap_swapoff()". - In the series "Add DAX ABI for memmap_on_memory", Vishal Verma has contributed several DAX cleanups as well as adding a sysfs tunable to control the memmap_on_memory setting when the dax device is hotplugged as system memory. - Johannes Weiner has added the large series "mm: zswap: cleanups", which does that. - More DAMON work from SeongJae Park in the series "mm/damon: make DAMON debugfs interface deprecation unignorable" "selftests/damon: add more tests for core functionalities and corner cases" "Docs/mm/damon: misc readability improvements" "mm/damon: let DAMOS feeds and tame/auto-tune itself" - In the series "mm/mempolicy: weighted interleave mempolicy and sysfs extension" Rakie Kim has developed a new mempolicy interleaving policy wherein we allocate memory across nodes in a weighted fashion rather than uniformly. This is beneficial in heterogeneous memory environments appearing with CXL. - Christophe Leroy has contributed some cleanup and consolidation work against the ARM pagetable dumping code in the series "mm: ptdump: Refactor CONFIG_DEBUG_WX and check_wx_pages debugfs attribute". - Luis Chamberlain has added some additional xarray selftesting in the series "test_xarray: advanced API multi-index tests". - Muhammad Usama Anjum has reworked the selftest code to make its human-readable output conform to the TAP ("Test Anything Protocol") format. Amongst other things, this opens up the use of third-party tools to parse and process out selftesting results. - Ryan Roberts has added fork()-time PTE batching of THP ptes in the series "mm/memory: optimize fork() with PTE-mapped THP". Mainly targeted at arm64, this significantly speeds up fork() when the process has a large number of pte-mapped folios. - David Hildenbrand also gets in on the THP pte batching game in his series "mm/memory: optimize unmap/zap with PTE-mapped THP". It implements batching during munmap() and other pte teardown situations. The microbenchmark improvements are nice. - And in the series "Transparent Contiguous PTEs for User Mappings" Ryan Roberts further utilizes arm's pte's contiguous bit ("contpte mappings"). Kernel build times on arm64 improved nicely. Ryan's series "Address some contpte nits" provides some followup work. - In the series "mm/hugetlb: Restore the reservation" Breno Leitao has fixed an obscure hugetlb race which was causing unnecessary page faults. He has also added a reproducer under the selftest code. - In the series "selftests/mm: Output cleanups for the compaction test", Mark Brown did what the title claims. - Kinsey Ho has added the series "mm/mglru: code cleanup and refactoring". - Even more zswap material from Nhat Pham. The series "fix and extend zswap kselftests" does as claimed. - In the series "Introduce cpu_dcache_is_aliasing() to fix DAX regression" Mathieu Desnoyers has cleaned up and fixed rather a mess in our handling of DAX on archiecctures which have virtually aliasing data caches. The arm architecture is the main beneficiary. - Lokesh Gidra's series "per-vma locks in userfaultfd" provides dramatic improvements in worst-case mmap_lock hold times during certain userfaultfd operations. - Some page_owner enhancements and maintenance work from Oscar Salvador in his series "page_owner: print stacks and their outstanding allocations" "page_owner: Fixup and cleanup" - Uladzislau Rezki has contributed some vmalloc scalability improvements in his series "Mitigate a vmap lock contention". It realizes a 12x improvement for a certain microbenchmark. - Some kexec/crash cleanup work from Baoquan He in the series "Split crash out from kexec and clean up related config items". - Some zsmalloc maintenance work from Chengming Zhou in the series "mm/zsmalloc: fix and optimize objects/page migration" "mm/zsmalloc: some cleanup for get/set_zspage_mapping()" - Zi Yan has taught the MM to perform compaction on folios larger than order=0. This a step along the path to implementaton of the merging of large anonymous folios. The series is named "Enable >0 order folio memory compaction". - Christoph Hellwig has done quite a lot of cleanup work in the pagecache writeback code in his series "convert write_cache_pages() to an iterator". - Some modest hugetlb cleanups and speedups in Vishal Moola's series "Handle hugetlb faults under the VMA lock". - Zi Yan has changed the page splitting code so we can split huge pages into sizes other than order-0 to better utilize large folios. The series is named "Split a folio to any lower order folios". - David Hildenbrand has contributed the series "mm: remove total_mapcount()", a cleanup. - Matthew Wilcox has sought to improve the performance of bulk memory freeing in his series "Rearrange batched folio freeing". - Gang Li's series "hugetlb: parallelize hugetlb page init on boot" provides large improvements in bootup times on large machines which are configured to use large numbers of hugetlb pages. - Matthew Wilcox's series "PageFlags cleanups" does that. - Qi Zheng's series "minor fixes and supplement for ptdesc" does that also. S390 is affected. - Cleanups to our pagemap utility functions from Peter Xu in his series "mm/treewide: Replace pXd_large() with pXd_leaf()". - Nico Pache has fixed a few things with our hugepage selftests in his series "selftests/mm: Improve Hugepage Test Handling in MM Selftests". - Also, of course, many singleton patches to many things. Please see the individual changelogs for details. * tag 'mm-stable-2024-03-13-20-04' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (435 commits) mm/zswap: remove the memcpy if acomp is not sleepable crypto: introduce: acomp_is_async to expose if comp drivers might sleep memtest: use {READ,WRITE}_ONCE in memory scanning mm: prohibit the last subpage from reusing the entire large folio mm: recover pud_leaf() definitions in nopmd case selftests/mm: skip the hugetlb-madvise tests on unmet hugepage requirements selftests/mm: skip uffd hugetlb tests with insufficient hugepages selftests/mm: dont fail testsuite due to a lack of hugepages mm/huge_memory: skip invalid debugfs new_order input for folio split mm/huge_memory: check new folio order when split a folio mm, vmscan: retry kswapd's priority loop with cache_trim_mode off on failure mm: add an explicit smp_wmb() to UFFDIO_CONTINUE mm: fix list corruption in put_pages_list mm: remove folio from deferred split list before uncharging it filemap: avoid unnecessary major faults in filemap_fault() mm,page_owner: drop unnecessary check mm,page_owner: check for null stack_record before bumping its refcount mm: swap: fix race between free_swap_and_cache() and swapoff() mm/treewide: align up pXd_leaf() retval across archs mm/treewide: drop pXd_large() ...
2024-03-13s390/dcssblk: fix virtual vs physical address confusionGerald Schaefer
Fix virtual vs physical address confusion. This does not fix a bug since virtual and physical address spaces are currently the same. dax_direct_access() should receive a virtual kernel address in kaddr. Reviewed-by: Heiko Carstens <hca@linux.ibm.com> Acked-by: Alexander Gordeev <agordeev@linux.ibm.com> Signed-off-by: Gerald Schaefer <gerald.schaefer@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2024-02-22dcssblk: handle alloc_dax() -EOPNOTSUPP failureMathieu Desnoyers
In preparation for checking whether the architecture has data cache aliasing within alloc_dax(), modify the error handling of dcssblk dcssblk_add_store() to handle alloc_dax() -EOPNOTSUPP failures. Considering that s390 is not a data cache aliasing architecture, and considering that DCSSBLK selects DAX, a return value of -EOPNOTSUPP from alloc_dax() should make dcssblk_add_store() fail. Link: https://lkml.kernel.org/r/20240215144633.96437-6-mathieu.desnoyers@efficios.com Fixes: d92576f1167c ("dax: does not work correctly with virtual aliasing caches") Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Acked-by: Heiko Carstens <hca@linux.ibm.com> Cc: Alasdair Kergon <agk@redhat.com> Cc: Mike Snitzer <snitzer@kernel.org> Cc: Mikulas Patocka <mpatocka@redhat.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Vishal Verma <vishal.l.verma@intel.com> Cc: Dave Jiang <dave.jiang@intel.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Russell King <linux@armlinux.org.uk> Cc: Christoph Hellwig <hch@lst.de> Cc: Dave Chinner <david@fromorbit.com> Cc: kernel test robot <lkp@intel.com> Cc: Michael Sclafani <dm-devel@lists.linux.dev> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-02-19dcssblk: pass queue_limits to blk_mq_alloc_diskChristoph Hellwig
Pass the queue limits directly to blk_alloc_disk instead of setting them one at a time. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Link: https://lore.kernel.org/r/20240215071055.2201424-10-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2024-02-19block: pass a queue_limits argument to blk_alloc_diskChristoph Hellwig
Pass a queue_limits to blk_alloc_disk and apply it if non-NULL. This will allow allocating queues with valid queue limits instead of setting the values one at a time later. Also change blk_alloc_disk to return an ERR_PTR instead of just NULL which can't distinguish errors. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Link: https://lore.kernel.org/r/20240215071055.2201424-2-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-08-30s390/dcssblk: fix lockdep warningGerald Schaefer
dcssblk_remove_store() holds the dcssblk_devices_sem semaphore while calling del_gendisk(dev_info->gd), which in turn tries to acquire disk->open_mutex. Then there is dcssblk_release(), which is called with disk->open_mutex held, and tries to acquire dcssblk_devices_sem. Lockdep reports this as possible circular locking dependency (CPU0 = dcssblk_remove_store, CPU1 = dcssblk_release): [ 44.948865] Possible unsafe locking scenario: [ 44.948866] CPU0 CPU1 [ 44.948867] ---- ---- [ 44.948868] lock(&dcssblk_devices_sem); [ 44.948870] lock(&disk->open_mutex); [ 44.948872] lock(&dcssblk_devices_sem); [ 44.948874] lock(&disk->open_mutex); [ 44.948876] *** DEADLOCK *** In practice, this deadlock should not happen, since dcssblk_remove_store() checks for dev_info->use_count != 0 after acquiring dcssblk_devices_sem, and breaks out before calling del_gendisk(). dev_info->use_count will be decremented in dcssblk_release(), protected by dcssblk_devices_sem. Still there is no need for dcssblk_remove_store() to hold the dcssblk_devices_sem until after calling del_gendisk(), as this only protects dcssblk internal data. So fix the lockdep warning by releasing dcssblk_devices_sem earlier. Also move the segment_unload() loop up, similar to dcssblk_shared_store() error path, no need to do that after calling del_gendisk(). Also change dcssblk_shared_store() error path, where dcssblk_devices_sem was also released only after calling del_gendisk(), and a similar lockdep warning could be triggered (but also deadlock prevented by check for dev_info->use_count). Acked-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Gerald Schaefer <gerald.schaefer@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2023-08-16s390/dcssblk: fix kernel crash with list_add corruptionGerald Schaefer
Commit fb08a1908cb1 ("dax: simplify the dax_device <-> gendisk association") introduced new logic for gendisk association, requiring drivers to explicitly call dax_add_host() and dax_remove_host(). For dcssblk driver, some dax_remove_host() calls were missing, e.g. in device remove path. The commit also broke error handling for out_dax case in device add path, resulting in an extra put_device() w/o the previous get_device() in that case. This lead to stale xarray entries after device add / remove cycles. In the case when a previously used struct gendisk pointer (xarray index) would be used again, because blk_alloc_disk() happened to return such a pointer, the xa_insert() in dax_add_host() would fail and go to out_dax, doing the extra put_device() in the error path. In combination with an already flawed error handling in dcssblk (device_register() cleanup), which needs to be addressed in a separate patch, this resulted in a missing device_del() / klist_del(), and eventually in the kernel crash with list_add corruption on a subsequent device_add() / klist_add(). Fix this by adding the missing dax_remove_host() calls, and also move the put_device() in the error path to restore the previous logic. Fixes: fb08a1908cb1 ("dax: simplify the dax_device <-> gendisk association") Cc: <stable@vger.kernel.org> # 5.17+ Acked-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Gerald Schaefer <gerald.schaefer@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2023-07-24s390/dcssblk: fix virtual vs physical address confusionAlexander Gordeev
Fix virtual vs physical address confusion (which currently are the same). Reviewed-by: Gerald Schaefer <gerald.schaefer@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2023-07-24s390/dcssblk: use IS_ALIGNED() for alignment checksAlexander Gordeev
Use IS_ALIGNED() instead of cumbersome bit manipulations. Reviewed-by: Gerald Schaefer <gerald.schaefer@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2023-07-06Merge tag 's390-6.5-2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux Pull more s390 updates from Alexander Gordeev: - Fix virtual vs physical address confusion in vmem_add_range() and vmem_remove_range() functions - Include <linux/io.h> instead of <asm/io.h> and <asm-generic/io.h> throughout s390 code - Make all PSW related defines also available for assembler files. Remove PSW_DEFAULT_KEY define from uapi for that - When adding an undefined symbol the build still succeeds, but userspace crashes trying to execute VDSO, because the symbol is not resolved. Add undefined symbols check to prevent that - Use kvmalloc_array() instead of kzalloc() for allocaton of 256k memory when executing s390 crypto adapter IOCTL - Add -fPIE flag to prevent decompressor misaligned symbol build error with clang - Use .balign instead of .align everywhere. This is a no-op for s390, but with this there no mix in using .align and .balign anymore - Filter out -mno-pic-data-is-text-relative flag when compiling kernel to prevent VDSO build error - Rework entering of DAT-on mode on CPU restart to use PSW_KERNEL_BITS mask directly - Do not retry administrative requests to some s390 crypto cards, since the firmware assumes replay attacks - Remove most of the debug code, which is build in when kernel config option CONFIG_ZCRYPT_DEBUG is enabled - Remove CONFIG_ZCRYPT_MULTIDEVNODES kernel config option and switch off the multiple devices support for the s390 zcrypt device driver - With the conversion to generic entry machine checks are accounted to the current context instead of irq time. As result, the STCKF instruction at the beginning of the machine check handler and the lowcore member are no longer required, therefore remove it - Fix various typos found with codespell - Minor cleanups to CPU-measurement Counter and Sampling Facilities code - Revert patch that removes VMEM_MAX_PHYS macro, since it causes a regression * tag 's390-6.5-2' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (25 commits) Revert "s390/mm: get rid of VMEM_MAX_PHYS macro" s390/cpum_sf: remove check on CPU being online s390/cpum_sf: handle casts consistently s390/cpum_sf: remove unnecessary debug statement s390/cpum_sf: remove parameter in call to pr_err s390/cpum_sf: simplify function setup_pmu_cpu s390/cpum_cf: remove unneeded debug statements s390/entry: remove mcck clock s390: fix various typos s390/zcrypt: remove ZCRYPT_MULTIDEVNODES kernel config option s390/zcrypt: do not retry administrative requests s390/zcrypt: cleanup some debug code s390/entry: rework entering DAT-on mode on CPU restart s390/mm: fence off VM macros from asm and linker s390: include linux/io.h instead of asm/io.h s390/ptrace: make all psw related defines also available for asm s390/ptrace: remove PSW_DEFAULT_KEY from uapi s390/vdso: filter out mno-pic-data-is-text-relative cflag s390: consistently use .balign instead of .align s390/decompressor: fix misaligned symbol build error ...
2023-07-03s390: include linux/io.h instead of asm/io.hHeiko Carstens
Include linux/io.h instead of asm/io.h everywhere. linux/io.h includes asm/io.h, so this shouldn't cause any problems. Instead this might help for some randconfig build errors which were reported due to some undefined io related functions. Also move the changed include so it stays grouped together with other includes from the same directory. For ctcm_mpc.c also remove not needed comments (actually questions). Acked-by: Christian Borntraeger <borntraeger@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2023-07-01Merge tag 'libnvdimm-for-6.5' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm Pull nvdimm and DAX updates from Vishal Verma: "This is mostly small cleanups and fixes, with the biggest change being the change to the DAX fault handler allowing it to return VM_FAULT_HWPOISON. Summary: - DAX fixes and cleanups including a use after free, extra references, and device unregistration, and a redundant variable. - Allow the DAX fault handler to return VM_FAULT_HWPOISON - A few libnvdimm cleanups such as making some functions and variables static where sufficient. - Add a few missing prototypes for wrapped functions in tools/testing/nvdimm" * tag 'libnvdimm-for-6.5' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm: dax: enable dax fault handler to report VM_FAULT_HWPOISON nvdimm: make security_show static nvdimm: make nd_class variable static dax/kmem: Pass valid argument to memory_group_register_static fsdax: remove redundant variable 'error' dax: Cleanup extra dax_region references dax: Introduce alloc_dev_dax_id() dax: Use device_unregister() in unregister_dax_mapping() dax: Fix dax_mapping_release() use after free tools/testing/nvdimm: Drop empty platform remove function libnvdimm: mark 'security_show' static again testing: nvdimm: add missing prototypes for wrapped functions dax: fix missing-prototype warnings
2023-06-26dax: enable dax fault handler to report VM_FAULT_HWPOISONJane Chu
When multiple processes mmap() a dax file, then at some point, a process issues a 'load' and consumes a hwpoison, the process receives a SIGBUS with si_code = BUS_MCEERR_AR and with si_lsb set for the poison scope. Soon after, any other process issues a 'load' to the poisoned page (that is unmapped from the kernel side by memory_failure), it receives a SIGBUS with si_code = BUS_ADRERR and without valid si_lsb. This is confusing to user, and is different from page fault due to poison in RAM memory, also some helpful information is lost. Channel dax backend driver's poison detection to the filesystem such that instead of reporting VM_FAULT_SIGBUS, it could report VM_FAULT_HWPOISON. If user level block IO syscalls fail due to poison, the errno will be converted to EIO to maintain block API consistency. Signed-off-by: Jane Chu <jane.chu@oracle.com> Link: https://lore.kernel.org/r/20230615181325.1327259-2-jane.chu@oracle.com Reviewed-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Vishal Verma <vishal.l.verma@intel.com>
2023-06-12block: replace fmode_t with a block-specific type for block open flagsChristoph Hellwig
The only overlap between the block open flags mapped into the fmode_t and other uses of fmode_t are FMODE_READ and FMODE_WRITE. Define a new blk_mode_t instead for use in blkdev_get_by_{dev,path}, ->open and ->ioctl and stop abusing fmode_t. Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Jack Wang <jinpu.wang@ionos.com> [rnbd] Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Christian Brauner <brauner@kernel.org> Link: https://lore.kernel.org/r/20230608110258.189493-28-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-06-12block: remove the unused mode argument to ->releaseChristoph Hellwig
The mode argument to the ->release block_device_operation is never used, so remove it. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hannes Reinecke <hare@suse.de> Acked-by: Christian Brauner <brauner@kernel.org> Acked-by: Jack Wang <jinpu.wang@ionos.com> [rnbd] Link: https://lore.kernel.org/r/20230608110258.189493-10-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-06-12block: pass a gendisk to ->openChristoph Hellwig
->open is only called on the whole device. Make that explicit by passing a gendisk instead of the block_device. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hannes Reinecke <hare@suse.de> Acked-by: Christian Brauner <brauner@kernel.org> Acked-by: Jack Wang <jinpu.wang@ionos.com> [rnbd] Link: https://lore.kernel.org/r/20230608110258.189493-9-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-01-29s390/dcssblk:: don't call bio_split_to_limitsChristoph Hellwig
s390 iterates over the bio using bio_for_each_segment and doesn't need any bio splitting. Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Alexander Gordeev <agordeev@linux.ibm.com> Link: https://lore.kernel.org/r/20230123075356.60847-1-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-01-04block: handle bio_split_to_limits() NULL returnJens Axboe
This can't happen right now, but in preparation for allowing bio_split_to_limits() returning NULL if it ended the bio, check for it in all the callers. Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-11-16s390/dcssblk: fix deadlock when adding a DCSSGerald Schaefer
After the rework from commit 1ebe2e5f9d68 ("block: remove GENHD_FL_EXT_DEVT"), when calling device_add_disk(), dcssblk will end up in disk_scan_partitions(), and not break out early w/o GENHD_FL_NO_PART. This will trigger implicit open/release via blkdev_get/put_whole() later. dcssblk_release() will then deadlock on dcssblk_devices_sem semaphore, which is already held from dcssblk_add_store() when calling device_add_disk(). dcssblk does not support partitions (DCSSBLK_MINORS_PER_DISK == 1), and never scanned partitions before. Therefore restore the previous behavior, and explicitly disallow partition scanning by setting the GENHD_FL_NO_PART flag. This will also prevent this deadlock scenario. Fixes: 1ebe2e5f9d68 ("block: remove GENHD_FL_EXT_DEVT") Cc: <stable@vger.kernel.org> # 5.17+ Signed-off-by: Gerald Schaefer <gerald.schaefer@linux.ibm.com> Acked-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2022-08-30s390: move from strlcpy with unused retval to strscpyWolfram Sang
Follow the advice of the below link and prefer 'strscpy' in this subsystem. Conversion is 1:1 because the return value is not used. Generated by a coccinelle script. Link: https://lore.kernel.org/r/CAHk-=wgfRnXz0W3D37d01q3JFkr_i_uTL=V6A6G1oUZcprmknw@mail.gmail.com/ Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com> Acked-by: Jakub Kicinski <kuba@kernel.org> Acked-by: Benjamin Block <bblock@linux.ibm.com> Acked-by: Alexandra Winter <wintera@linux.ibm.com> Link: https://lore.kernel.org/r/20220818205948.6360-1-wsa+renesas@sang-engineering.com Link: https://lore.kernel.org/r/20220818210102.7301-1-wsa+renesas@sang-engineering.com [gor@linux.ibm.com: squashed two changes linked above together] Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2022-08-02block: change the blk_queue_split calling conventionChristoph Hellwig
The double indirect bio leads to somewhat suboptimal code generation. Instead return the (original or split) bio, and make sure the request_queue arguments to the lower level helpers is passed after the bio to avoid constant reshuffling of the argument passing registers. Also give it and the helpers used to implement it more descriptive names. Signed-off-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20220727162300.3089193-2-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-06-28block: remove blk_cleanup_diskChristoph Hellwig
blk_cleanup_disk is nothing but a trivial wrapper for put_disk now, so remove it. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hannes Reinecke <hare@suse.de> Link: https://lore.kernel.org/r/20220619060552.1850436-7-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-05-16dax: introduce DAX_RECOVERY_WRITE dax access modeJane Chu
Up till now, dax_direct_access() is used implicitly for normal access, but for the purpose of recovery write, dax range with poison is requested. To make the interface clear, introduce enum dax_access_mode { DAX_ACCESS, DAX_RECOVERY_WRITE, } where DAX_ACCESS is used for normal dax access, and DAX_RECOVERY_WRITE is used for dax recovery write. Suggested-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Jane Chu <jane.chu@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Cc: Mike Snitzer <snitzer@redhat.com> Reviewed-by: Vivek Goyal <vgoyal@redhat.com> Link: https://lore.kernel.org/r/165247982851.52965.11024212198889762949.stgit@dwillia2-desk3.amr.corp.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2021-12-18dax: remove the copy_from_iter and copy_to_iter methodsChristoph Hellwig
These methods indirect the actual DAX read/write path. In the end pmem uses magic flush and mc safe variants and fuse and dcssblk use plain ones while device mapper picks redirects to the underlying device. Add set_dax_nocache() and set_dax_nomc() APIs to control which copy routines are used to remove indirect call from the read/write fast path as well as a lot of boilerplate code. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Vivek Goyal <vgoyal@redhat.com> [virtiofs] Link: https://lore.kernel.org/r/20211215084508.435401-5-hch@lst.de Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2021-12-18dax: remove the DAXDEV_F_SYNC flagChristoph Hellwig
Remove the DAXDEV_F_SYNC flag and thus the flags argument to alloc_dax and just let the drivers call set_dax_synchronous directly. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Pankaj Gupta <pankaj.gupta@ionos.com> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Link: https://lore.kernel.org/r/20211215084508.435401-4-hch@lst.de Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2021-12-04dax: remove dax_capableChristoph Hellwig
Just open code the block size and dax_dev == NULL checks in the callers. Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Mike Snitzer <snitzer@redhat.com> Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com> [erofs] Reviewed-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Link: https://lore.kernel.org/r/20211129102203.2243509-9-hch@lst.de Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2021-12-04dax: simplify the dax_device <-> gendisk associationChristoph Hellwig
Replace the dax_host_hash with an xarray indexed by the pointer value of the gendisk, and require explicitly calls from the block drivers that want to associate their gendisk with a dax_device. Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Mike Snitzer <snitzer@redhat.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Link: https://lore.kernel.org/r/20211129102203.2243509-5-hch@lst.de Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2021-11-06Merge tag 's390-5.16-1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux Pull s390 updates from Vasily Gorbik: - Add support for ftrace with direct call and ftrace direct call samples. - Add support for kernel command lines longer than current 896 bytes and make its length configurable. - Add support for BEAR enhancement facility to improve last breaking event instruction tracking. - Add kprobes sanity checks and testcases to prevent kprobe in the mid of an instruction. - Allow concurrent access to /dev/hwc for the CPUMF users. - Various ftrace / jump label improvements. - Convert unwinder tests to KUnit. - Add s390_iommu_aperture kernel parameter to tweak the limits on concurrently usable DMA mappings. - Add ap.useirq AP module option which can be used to disable interrupt use. - Add add_disk() error handling support to block device drivers. - Drop arch specific and use generic implementation of strlcpy and strrchr. - Several __pa/__va usages fixes. - Various cio, crypto, pci, kernel doc and other small fixes and improvements all over the code. [ Merge fixup as per https://lore.kernel.org/all/YXAqZ%2FEszRisunQw@osiris/ ] * tag 's390-5.16-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (63 commits) s390: make command line configurable s390: support command lines longer than 896 bytes s390/kexec_file: move kernel image size check s390/pci: add s390_iommu_aperture kernel parameter s390/spinlock: remove incorrect kernel doc indicator s390/string: use generic strlcpy s390/string: use generic strrchr s390/ap: function rework based on compiler warning s390/cio: make ccw_device_dma_* more robust s390/vfio-ap: s390/crypto: fix all kernel-doc warnings s390/hmcdrv: fix kernel doc comments s390/ap: new module option ap.useirq s390/cpumf: Allow multiple processes to access /dev/hwc s390/bitops: return true/false (not 1/0) from bool functions s390: add support for BEAR enhancement facility s390: introduce nospec_uses_trampoline() s390: rename last_break to pgm_last_break s390/ptrace: add last_break member to pt_regs s390/sclp: sort out physical vs virtual pointers usage s390/setup: convert start and end initrd pointers to virtual ...
2021-10-18block: switch polling to be bio basedChristoph Hellwig
Replace the blk_poll interface that requires the caller to keep a queue and cookie from the submissions with polling based on the bio. Polling for the bio itself leads to a few advantages: - the cookie construction can made entirely private in blk-mq.c - the caller does not need to remember the request_queue and cookie separately and thus sidesteps their lifetime issues - keeping the device and the cookie inside the bio allows to trivially support polling BIOs remapping by stacking drivers - a lot of code to propagate the cookie back up the submission path can be removed entirely. Signed-off-by: Christoph Hellwig <hch@lst.de> Tested-by: Mark Wunderlich <mark.wunderlich@intel.com> Link: https://lore.kernel.org/r/20211012111226.760968-15-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-10-04s390/block/dcssblk: add error handling support for add_disk()Gerald Schaefer
We never checked for errors on add_disk() as this function returned void. Now that this is fixed, use the shiny new error handling. Signed-off-by: Luis Chamberlain <mcgrof@kernel.org> Signed-off-by: Gerald Schaefer <gerald.schaefer@linux.ibm.com> Link: https://lore.kernel.org/r/20210927220232.1071926-6-mcgrof@kernel.org Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2021-08-16dcssblk: use bvec_virtChristoph Hellwig
Use bvec_virt instead of open coding it. Signed-off-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20210804095634.460779-15-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-07-04Merge tag 's390-5.14-1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux Pull s390 updates from Vasily Gorbik: - Rework inline asm to get rid of error prone "register asm" constructs, which are problematic especially when code instrumentation is enabled. In particular introduce and use register pair union to allocate even/odd register pairs. Unfortunately this breaks compatibility with older clang compilers and minimum clang version for s390 has been raised to 13. https://lore.kernel.org/linux-next/CAK7LNARuSmPCEy-ak0erPrPTgZdGVypBROFhtw+=3spoGoYsyw@mail.gmail.com/ - Fix gcc 11 warnings, which triggered various minor reworks all over the code. - Add zstd kernel image compression support. - Rework boot CPU lowcore handling. - De-duplicate and move kernel memory layout setup logic earlier. - Few fixes in preparation for FORTIFY_SOURCE performing compile-time and run-time field bounds checking for mem functions. - Remove broken and unused power management support leftovers in s390 drivers. - Disable stack-protector for decompressor and purgatory to fix buildroot build. - Fix vt220 sclp console name to match the char device name. - Enable HAVE_IOREMAP_PROT and add zpci_set_irq()/zpci_clear_irq() in zPCI code. - Remove some implausible WARN_ON_ONCEs and remove arch specific counter transaction call backs in favour of default transaction handling in perf code. - Extend/add new uevents for online/config/mode state changes of AP card / queue device in zcrypt. - Minor entry and ccwgroup code improvements. - Other small various fixes and improvements all over the code. * tag 's390-5.14-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (91 commits) s390/dasd: use register pair instead of register asm s390/qdio: get rid of register asm s390/ioasm: use symbolic names for asm operands s390/ioasm: get rid of register asm s390/cmf: get rid of register asm s390/lib,string: get rid of register asm s390/lib,uaccess: get rid of register asm s390/string: get rid of register asm s390/cmpxchg: use register pair instead of register asm s390/mm,pages-states: get rid of register asm s390/lib,xor: get rid of register asm s390/timex: get rid of register asm s390/hypfs: use register pair instead of register asm s390/zcrypt: Switch to flexible array member s390/speculation: Use statically initialized const for instructions virtio/s390: get rid of open-coded kvm hypercall s390/pci: add zpci_set_irq()/zpci_clear_irq() scripts/min-tool-version.sh: Raise minimum clang version to 13.0.0 for s390 s390/ipl: use register pair instead of register asm s390/mem_detect: fix tprot() program check new psw handling ...
2021-06-18s390/dcssblk: Remove power management supportPeter Oberparleiter
Power management support was removed for s390 with commit 394216275c7d ("s390: remove broken hibernate / power management support"). Remove leftover dcssblk-related power management code. Acked-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Peter Oberparleiter <oberpar@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2021-06-01dcssblk: convert to blk_alloc_disk/blk_cleanup_diskChristoph Hellwig
Convert the dcssblk driver to use the blk_alloc_disk and blk_cleanup_disk helpers to simplify gendisk and request_queue allocation. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org> Link: https://lore.kernel.org/r/20210521055116.1053587-24-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-01-24block: store a block_device pointer in struct bioChristoph Hellwig
Replace the gendisk pointer in struct bio with a pointer to the newly improved struct block device. From that the gendisk can be trivially accessed with an extra indirection, but it also allows to directly look up all information related to partition remapping. Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Tejun Heo <tj@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-01-24dcssblk: remove the end of device check in dcssblk_submit_bioChristoph Hellwig
The block layer already checks for this conditions in bio_check_eod before calling the driver. Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Tejun Heo <tj@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-07-01dcssblk: don't set bd_block_size in ->openChristoph Hellwig
bd_block_size contains a value that matches the logic block size when opening, so the statement is redundant. Even if it wasn't the dumb assignment would cause a a mismatch with bd_inode->i_blkbits. Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-07-01block: move ->make_request_fn to struct block_device_operationsChristoph Hellwig
The make_request_fn is a little weird in that it sits directly in struct request_queue instead of an operation vector. Replace it with a block_device_operations method called submit_bio (which describes much better what it does). Also remove the request_queue argument to it, as the queue can be derived pretty trivially from the bio. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-07-01block: remove the request_queue argument from blk_queue_splitChristoph Hellwig
The queue can be trivially derived from the bio, so pass one less argument. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-04-08Merge tag 'libnvdimm-for-5.7' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm Pull libnvdimm and dax updates from Dan Williams: "There were multiple touches outside of drivers/nvdimm/ this round to add cross arch compatibility to the devm_memremap_pages() interface, enhance numa information for persistent memory ranges, and add a zero_page_range() dax operation. This cycle I switched from the patchwork api to Konstantin's b4 script for collecting tags (from x86, PowerPC, filesystem, and device-mapper folks), and everything looks to have gone ok there. This has all appeared in -next with no reported issues. Summary: - Add support for region alignment configuration and enforcement to fix compatibility across architectures and PowerPC page size configurations. - Introduce 'zero_page_range' as a dax operation. This facilitates filesystem-dax operation without a block-device. - Introduce phys_to_target_node() to facilitate drivers that want to know resulting numa node if a given reserved address range was onlined. - Advertise a persistence-domain for of_pmem and papr_scm. The persistence domain indicates where cpu-store cycles need to reach in the platform-memory subsystem before the platform will consider them power-fail protected. - Promote numa_map_to_online_node() to a cross-kernel generic facility. - Save x86 numa information to allow for node-id lookups for reserved memory ranges, deploy that capability for the e820-pmem driver. - Pick up some miscellaneous minor fixes, that missed v5.6-final, including a some smatch reports in the ioctl path and some unit test compilation fixups. - Fixup some flexible-array declarations" * tag 'libnvdimm-for-5.7' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm: (29 commits) dax: Move mandatory ->zero_page_range() check in alloc_dax() dax,iomap: Add helper dax_iomap_zero() to zero a range dax: Use new dax zero page method for zeroing a page dm,dax: Add dax zero_page_range operation s390,dcssblk,dax: Add dax zero_page_range operation to dcssblk driver dax, pmem: Add a dax operation zero_page_range pmem: Add functions for reading/writing page to/from pmem libnvdimm: Update persistence domain value for of_pmem and papr_scm device tools/test/nvdimm: Fix out of tree build libnvdimm/region: Fix build error libnvdimm/region: Replace zero-length array with flexible-array member libnvdimm/label: Replace zero-length array with flexible-array member ACPI: NFIT: Replace zero-length array with flexible-array member libnvdimm/region: Introduce an 'align' attribute libnvdimm/region: Introduce NDD_LABELING libnvdimm/namespace: Enforce memremap_compat_align() libnvdimm/pfn: Prevent raw mode fallback if pfn-infoblock valid libnvdimm: Out of bounds read in __nd_ioctl() acpi/nfit: improve bounds checking for 'func' mm/memremap_pages: Introduce memremap_compat_align() ...
2020-04-02dax: Move mandatory ->zero_page_range() check in alloc_dax()Vivek Goyal
zero_page_range() dax operation is mandatory for dax devices. Right now that check happens in dax_zero_page_range() function. Dan thinks that's too late and its better to do the check earlier in alloc_dax(). I also modified alloc_dax() to return pointer with error code in it in case of failure. Right now it returns NULL and caller assumes failure happened due to -ENOMEM. But with this ->zero_page_range() check, I need to return -EINVAL instead. Signed-off-by: Vivek Goyal <vgoyal@redhat.com> Link: https://lore.kernel.org/r/20200401161125.GB9398@redhat.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2020-04-02s390,dcssblk,dax: Add dax zero_page_range operation to dcssblk driverVivek Goyal
Add dax operation zero_page_range for dcssblk driver. Suggested-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Vivek Goyal <vgoyal@redhat.com> Reviewed-by: Gerald Schaefer <gerald.schaefer@de.ibm.com> CC: linux-s390@vger.kernel.org Link: https://lore.kernel.org/r/20200228163456.1587-4-vgoyal@redhat.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2020-03-27block: simplify queue allocationChristoph Hellwig
Current make_request based drivers use either blk_alloc_queue_node or blk_alloc_queue to allocate a queue, and then set up the make_request_fn function pointer and a few parameters using the blk_queue_make_request helper. Simplify this by passing the make_request pointer to blk_alloc_queue, and while at it merge the _node variant into the main helper by always passing a node_id, and remove the superfluous gfp_mask parameter. A lower-level __blk_alloc_queue is kept for the blk-mq case. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-07-05libnvdimm: add dax_dev sync flagPankaj Gupta
This patch adds 'DAXDEV_SYNC' flag which is set for nd_region doing synchronous flush. This later is used to disable MAP_SYNC functionality for ext4 & xfs filesystem for devices don't support synchronous flush. Signed-off-by: Pankaj Gupta <pagupta@redhat.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2019-05-20dax: Arrange for dax_supported check to span multiple devicesDan Williams
Pankaj reports that starting with commit ad428cdb525a "dax: Check the end of the block-device capacity with dax_direct_access()" device-mapper no longer allows dax operation. This results from the stricter checks in __bdev_dax_supported() that validate that the start and end of a block-device map to the same 'pagemap' instance. Teach the dax-core and device-mapper to validate the 'pagemap' on a per-target basis. This is accomplished by refactoring the bdev_dax_supported() internals into generic_fsdax_supported() which takes a sector range to validate. Consequently generic_fsdax_supported() is suitable to be used in a device-mapper ->iterate_devices() callback. A new ->dax_supported() operation is added to allow composite devices to split and route upper-level bdev_dax_supported() requests. Fixes: ad428cdb525a ("dax: Check the end of the block-device...") Cc: <stable@vger.kernel.org> Cc: Ira Weiny <ira.weiny@intel.com> Cc: Dave Jiang <dave.jiang@intel.com> Cc: Keith Busch <keith.busch@intel.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Vishal Verma <vishal.l.verma@intel.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Reviewed-by: Jan Kara <jack@suse.cz> Reported-by: Pankaj Gupta <pagupta@redhat.com> Reviewed-by: Pankaj Gupta <pagupta@redhat.com> Tested-by: Pankaj Gupta <pagupta@redhat.com> Tested-by: Vaibhav Jain <vaibhav@linux.ibm.com> Reviewed-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2018-09-28block: genhd: add 'groups' argument to device_add_diskHannes Reinecke
Update device_add_disk() to take an 'groups' argument so that individual drivers can register a device with additional sysfs attributes. This avoids race condition the driver would otherwise have if these groups were to be created with sysfs_add_groups(). Signed-off-by: Martin Wilck <martin.wilck@suse.com> Signed-off-by: Hannes Reinecke <hare@suse.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>