summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2024-07-14bcachefs: dev_usage updated by new accountingKent Overstreet
Reading disk accounting now requires an eytzinger lookup (see: bch2_accounting_mem_read()), but the per-device counters are used frequently enough that we'd like to still be able to read them with just a percpu sum, as in the old code. This patch special cases the device counters; when we update in-memory accounting we also update the old style percpu counters if it's a deice counter update. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-07-14bcachefs: Coalesce accounting keys before journal replayKent Overstreet
This fixes a performance regression in journal replay; without colaescing accounting keys we have multiple keys at the same position, which means journal_keys_peek_upto() has to skip past many overwritten keys - turning journal replay into an O(n^2) algorithm. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-07-14bcachefs: Disk space accounting rewriteKent Overstreet
Main part of the disk accounting rewrite. This is a wholesale rewrite of the existing disk space accounting, which relies on percepu counters that are sharded by journal buffer, and rolled up and added to each journal write. With the new scheme, every set of counters is a distinct key in the accounting btree; this fixes scaling limitations of the old scheme, where counters took up space in each journal entry and required multiple percpu counters. Now, in memory accounting requires a single set of percpu counters - not multiple for each in flight journal buffer - and in the future we'll probably also have counters that don't use in memory percpu counters, they're not strictly required. An accounting update is now a normal btree update, using the btree write buffer path. At transaction commit time, we apply accounting updates to the in memory counters, which are percpu counters indexed in an eytzinger tree by the accounting key. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-07-14bcachefs: btree write buffer knows how to accumulate bch_accounting keysKent Overstreet
Teach the btree write buffer how to accumulate accounting keys - instead of having the newer key overwrite the older key as we do with other updates, we need to add them together. Also, add a flag so that write buffer flush knows when journal replay is finished flushing accounting, and teach it to hold accounting keys until that flag is set. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-07-14bcachefs: Accumulate accounting keys in journal replayKent Overstreet
Until accounting keys hit the btree, they are deltas, not new versions of the existing key; this means we have to teach journal replay to accumulate them. Additionally, the journal doesn't track precisely which entries have been flushed to the btree; it only tracks a range of entries that may possibly still need to be flushed. That means we need to compare accounting keys against the version in the btree and only flush updates that are newer. There's another wrinkle with the write buffer: if the write buffer starts flushing accounting keys before journal replay has finished flushing accounting keys, journal replay will see the version number from the new updates and updates from the journal will be lost. To avoid this, journal replay has to flush accounting keys first, and we'll be adding a flag so that write buffer flush knows to hold accounting keys until then. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-07-14bcachefs: KEY_TYPE_accountingKent Overstreet
New key type for the disk space accounting rewrite. - Holds a variable sized array of u64s (may be more than one for accounting e.g. compressed and uncompressed size, or buckets and sectors for a given data type) - Updates are deltas, not new versions of the key: this means updates to accounting can happen via the btree write buffer, which we'll be teaching to accumulate deltas. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-07-14bcachefs: use new mount APIThomas Bertschinger
This updates bcachefs to use the new mount API: - Update the file_system_type to use the new init_fs_context() function. - Define the new fs_context_operations functions. - No longer register bch2_mount() and bch2_remount(); these are now called via the new fs_context functions. - Define a new helper type, bch2_opts_parse that includes a struct bch_opts and additionally a printbuf used to save options that can't be parsed until after the FS is opened. This enables us to parse as many options as possible prior to opening the filesystem while saving those options that need the open FS for later parsing. Signed-off-by: Thomas Bertschinger <tahbertschinger@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-07-14bcachefs: Add error code to defer option parsingThomas Bertschinger
This introduces a new error code, option_needs_open_fs, which is used to indicate that an attempt was made to parse a mount option prior to opening a filesystem, when that mount option requires an open filesystem in order to be validated. Returning this error results in bch2_parse_one_mount_opt() saving that option for later parsing, after the filesystem is opened. Signed-off-by: Thomas Bertschinger <tahbertschinger@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-07-14bcachefs: add printbuf arg to bch2_parse_mount_opts()Thomas Bertschinger
Mount options that take the name of a device that may be part of a filesystem, for example "metadata_target", cannot be validated until after the filesystem has been opened. However, an attempt to parse those options may be made prior to the filesystem being opened. This change adds a printbuf parameter to bch2_parse_mount_opts() which will be used to save those mount options, when they are supplied prior to the FS being opened, so that they can be parsed later. This functionality is not currently needed, but will be used after bcachefs starts using the new mount API to parse mount options. This is because using the new mount API, we will process mount options prior to opening the FS, but the new API doesn't provide a convenient way to "replay" mount option parsing. So we save these options ourselves to accomplish this. This change also splits out the code to parse a single option into bch2_parse_one_mount_opt(), which will be useful when using the new mount API which deals with a single mount option at a time. Signed-off-by: Thomas Bertschinger <tahbertschinger@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-07-14bcachefs: metadata version bucket_stripe_sectorsKent Overstreet
New on disk format version for bch_alloc->stripe_sectors and BCH_DATA_unstriped - accounting for unstriped data in stripe buckets. Upgrade/downgrade requires regenerating alloc info - but only if erasure coding is in use. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-07-14bcachefs: BCH_DATA_unstripedKent Overstreet
Add a new pseudo data type, to track buckets that are members of a stripe, but have unstriped data in them. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-07-14bcachefs: bch_alloc->stripe_sectorsKent Overstreet
Add a separate counter to bch_alloc_v4 for amount of striped data; this lets us separately track striped and unstriped data in a bucket, which lets us see when erasure coding has failed to update extents with stripe pointers, and also find buckets to continue updating if we crash mid way through creating a new stripe. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-07-14bcachefs: check_key_has_inode()Kent Overstreet
Consolidate duplicated checks for extents/dirents/xattrs - these keys should all have a corresponding inode of the correct type. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-07-14bcachefs: allow passing full device path for target optionsThomas Bertschinger
The output of mount options such as "metadata_target" in `/proc/mounts` uses the full path to the device. mount(8) from util-linux uses the output from `/proc/mounts` to pass existing mount options when performing a remount, so bcachefs should accept as input the same form that it prints as output. Without this change: $ mount -t bcachefs -o metadata_target=vdb /dev/vdb /mnt $ strace mount -o remount /mnt ... fsconfig(4, FSCONFIG_SET_STRING, "metadata_target", "/dev/vdb", 0) = -1 EINVAL (Invalid argument) ... Signed-off-by: Thomas Bertschinger <tahbertschinger@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-07-14bcachefs: bch2_printbuf_strip_trailing_newline()Kent Overstreet
Add a new helper to fix inode_to_text() Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-07-14bcachefs: don't expose "read_only" as a mount optionThomas Bertschinger
When "read_only" is exposed as a mount option, it is redundant with the standard option "ro" and gives users multiple ways to specify that a bcachefs filesystem should be mounted read-only. This presents the risk of having inconsistent options specified. This can be seen when remounting a read-only filesystem in read-write mode, using mount(8) from util-linux. Because mount(8) parses the existing mount options from `/proc/mounts` and applies them when remounting, it can end up applying both "read_only" and "rw": $ mount img -o ro /mnt $ strace mount -o remount,rw /mnt ... fsconfig(4, FSCONFIG_SET_FLAG, "read_only", NULL, 0) = 0 fsconfig(4, FSCONFIG_SET_FLAG, "rw", NULL, 0) = 0 ... Making "read_only" no longer a mount option means this edge case cannot occur. Fixes: 62719cf33c3a ("bcachefs: Fix nochanges/read_only interaction") Signed-off-by: Thomas Bertschinger <tahbertschinger@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-07-14bcachefs: make offline fsck set read_only fs flagThomas Bertschinger
A subsequent change will remove "read_only" as a mount option in favor of the standard option "ro", meaning the userspace fsck command cannot pass it to the fsck ioctl. Instead, in offline fsck, set "read_only" kernel-side without trying to parse it as a mount option. For compatibility with versions of the "bcachefs fsck" command that try to pass the "read_only" mount opt, remove it from the mount options string prior to parsing when it is present. Signed-off-by: Thomas Bertschinger <tahbertschinger@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-07-14bcachefs: btree_ptr_sectors_written() now takes bkey_s_cKent Overstreet
this is for the userspace metadata dump tool Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-07-14bcachefs: Check for bsets past bch_btree_ptr_v2.sectors_writtenKent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-07-14bcachefs: Use try_cmpxchg() family of functions instead of cmpxchg()Uros Bizjak
Use try_cmpxchg() family of functions instead of cmpxchg (*ptr, old, new) == old. x86 CMPXCHG instruction returns success in ZF flag, so this change saves a compare after cmpxchg (and related move instruction in front of cmpxchg). Also, try_cmpxchg() implicitly assigns old *ptr value to "old" when cmpxchg fails. There is no need to re-read the value in the loop. No functional change intended. Signed-off-by: Uros Bizjak <ubizjak@gmail.com> Cc: Kent Overstreet <kent.overstreet@linux.dev> Cc: Brian Foster <bfoster@redhat.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-07-14bcachefs: add might_sleep() annotations for fsck_err()Kent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-07-14bcachefs: fix missing includeKent Overstreet
fs-common.h needs dirent.h for enum bch_rename_mode Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-07-14bcachefs: Use filemap_read() to simplify the execution flowYouling Tang
Using filemap_read() can reduce unnecessary code execution for non IOCB_DIRECT paths. Signed-off-by: Youling Tang <tangyouling@kylinos.cn> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-07-14bcachefs: Align the display format of `btrees/inodes/keys`Youling Tang
Before patch: ``` #cat btrees/inodes/keys u64s 17 type inode_v3 0:4096:U32_MAX len 0 ver 0: mode=40755 flags= (16300000) bi_size=0 ``` After patch: ``` #cat btrees/inodes/keys u64s 17 type inode_v3 0:4096:U32_MAX len 0 ver 0: mode=40755 flags=(16300000) bi_size=0 ``` Signed-off-by: Youling Tang <tangyouling@kylinos.cn> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-07-14bcachefs: Fix missing spaces in journal_entry_dev_usage_to_textYouling Tang
Fixed missing spaces displayed in journal_entry_dev_usage_to_text while adjusting the display format to improve readability. before: ``` # bcachefs list_journal -a -t alloc:1:0 /dev/sdb ... dev_usage: dev=0free: buckets=233180 sectors=0 fragmented=0sb: buckets=13 sectors=6152 fragmented=504journal: buckets=1847 sectors=945664 fragmented=0btree: buckets=20 sectors=10240 fragmented=0user: buckets=1419 sectors=726513 fragmented=15cached: buckets=0 sectors=0 fragmented=0parity: buckets=0 sectors=0 fragmented=0stripe: buckets=0 sectors=0 fragmented=0need_gc_gens: buckets=0 sectors=0 fragmented=0need_discard: buckets=1 sectors=0 fragmented=0 ``` after: ``` # bcachefs list_journal -a -t alloc:1:0 /dev/sdb ... dev_usage: dev=0 free: buckets=233180 sectors=0 fragmented=0 sb: buckets=13 sectors=6152 fragmented=504 journal: buckets=1847 sectors=945664 fragmented=0 btree: buckets=20 sectors=10240 fragmented=0 user: buckets=1419 sectors=726513 fragmented=15 cached: buckets=0 sectors=0 fragmented=0 parity: buckets=0 sectors=0 fragmented=0 stripe: buckets=0 sectors=0 fragmented=0 need_gc_gens: buckets=0 sectors=0 fragmented=0 need_discard: buckets=1 sectors=0 fragmented=0 ``` Signed-off-by: Youling Tang <tangyouling@kylinos.cn> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-07-14bcachefs: fix ei_update_lock lock orderingKent Overstreet
ei_update_lock is largely vestigal and will probably be removed, but we're not ready for that just yet. this fixes some lockdep splats with the new lockdep support for btree node locks; they're harmless, since we were taking ei_update_lock before actually locking any btree nodes, but "any btree nodes locked" are now tracked at the btree_trans level. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-07-14bcachefs: bch2_btree_reserve_cache_to_text()Kent Overstreet
Add a pretty printer so the btree reserve cache can be seen in sysfs; as it pins open_buckets we need it for tracking down open_buckets issues. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-07-14bcachefs: sysfs trigger_freelist_wakeupKent Overstreet
another debugging knob Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-07-14bcachefs: sysfs internal/trigger_journal_writesKent Overstreet
another debugging knob - trigger the journal to do ready journal writes Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-07-14bcachefs: add capacity, reserved to fs_alloc_debug_to_text()Kent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-07-14bcachefs: uninline fallocate functionsKent Overstreet
better stack traces Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-07-14bcachefs: btree ids are 64 bit bitmasksKent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-07-14bcachefs: Print allocator stuck on timeout in fallocate pathKent Overstreet
same as in io_write.c, if we're waiting on the allocator for an excessive amount of time, print what's going on Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-07-14Linux 6.10v6.10Linus Torvalds
2024-07-14Merge tag 'kbuild-fixes-v6.10-4' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild Pull Kbuild fixes from Masahiro Yamada: - Make scripts/ld-version.sh robust against the latest LLD - Fix warnings in rpm-pkg with device tree support - Fix warnings in fortify tests with KASAN * tag 'kbuild-fixes-v6.10-4' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: fortify: fix warnings in fortify tests with KASAN kbuild: rpm-pkg: avoid the warnings with dtb's listed twice kbuild: Make ld-version.sh more robust against version string changes
2024-07-15fortify: fix warnings in fortify tests with KASANMasahiro Yamada
When a software KASAN mode is enabled, the fortify tests emit warnings on some architectures. For example, for ARCH=arm, the combination of CONFIG_FORTIFY_SOURCE=y and CONFIG_KASAN=y produces the following warnings: TEST lib/test_fortify/read_overflow-memchr.log warning: unsafe memchr() usage lacked '__read_overflow' warning in lib/test_fortify/read_overflow-memchr.c TEST lib/test_fortify/read_overflow-memchr_inv.log warning: unsafe memchr_inv() usage lacked '__read_overflow' symbol in lib/test_fortify/read_overflow-memchr_inv.c TEST lib/test_fortify/read_overflow-memcmp.log warning: unsafe memcmp() usage lacked '__read_overflow' warning in lib/test_fortify/read_overflow-memcmp.c TEST lib/test_fortify/read_overflow-memscan.log warning: unsafe memscan() usage lacked '__read_overflow' symbol in lib/test_fortify/read_overflow-memscan.c TEST lib/test_fortify/read_overflow2-memcmp.log warning: unsafe memcmp() usage lacked '__read_overflow2' warning in lib/test_fortify/read_overflow2-memcmp.c [ more and more similar warnings... ] Commit 9c2d1328f88a ("kbuild: provide reasonable defaults for tool coverage") removed KASAN flags from non-kernel objects by default. It was an intended behavior because lib/test_fortify/*.c are unit tests that are not linked to the kernel. As it turns out, some architectures require -fsanitize=kernel-(hw)address to define __SANITIZE_ADDRESS__ for the fortify tests. Without __SANITIZE_ADDRESS__ defined, arch/arm/include/asm/string.h defines __NO_FORTIFY, thus excluding <linux/fortify-string.h>. This issue does not occur on x86 thanks to commit 4ec4190be4cf ("kasan, x86: don't rename memintrinsics in uninstrumented files"), but there are still some architectures that define __NO_FORTIFY in such a situation. Set KASAN_SANITIZE=y explicitly to the fortify tests. Fixes: 9c2d1328f88a ("kbuild: provide reasonable defaults for tool coverage") Reported-by: Arnd Bergmann <arnd@arndb.de> Closes: https://lore.kernel.org/all/0e8dee26-41cc-41ae-9493-10cd1a8e3268@app.fastmail.com/ Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2024-07-15kbuild: rpm-pkg: avoid the warnings with dtb's listed twiceJose Ignacio Tornos Martinez
After 8d1001f7bdd0 (kbuild: rpm-pkg: fix build error with CONFIG_MODULES=n), the following warning "warning: File listed twice: *.dtb" is appearing for every dtb file that is included. The reason is that the commented commit already adds the folder /lib/modules/%{KERNELRELEASE} in kernel.list file so the folder /lib/modules/%{KERNELRELEASE}/dtb is no longer necessary, just remove it. Fixes: 8d1001f7bdd0 ("kbuild: rpm-pkg: fix build error with CONFIG_MODULES=n") Signed-off-by: Jose Ignacio Tornos Martinez <jtornosm@redhat.com> Reviewed-by: Nathan Chancellor <nathan@kernel.org> Tested-by: Nathan Chancellor <nathan@kernel.org> Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2024-07-15kbuild: Make ld-version.sh more robust against version string changesNathan Chancellor
After [1] in upstream LLVM, ld.lld's version output became slightly different when the cmake configuration option LLVM_APPEND_VC_REV is disabled. Before: Debian LLD 19.0.0 (compatible with GNU linkers) After: Debian LLD 19.0.0, compatible with GNU linkers This results in ld-version.sh failing with scripts/ld-version.sh: 18: arithmetic expression: expecting EOF: "10000 * 19 + 100 * 0 + 0," because the trailing comma is included in the patch level part of the expression. While [1] has been partially reverted in [2] to avoid this breakage (as it impacts the configuration stage and it is present in all LTS branches), it would be good to make ld-version.sh more robust against such miniscule changes like this one. Use POSIX shell parameter expansion [3] to remove the largest suffix after just numbers and periods, replacing of the current removal of everything after a hyphen. ld-version.sh continues to work for a number of distributions (Arch Linux, Debian, and Fedora) and the kernel.org toolchains and no longer errors on a version of ld.lld with [1]. Fixes: 02aff8592204 ("kbuild: check the minimum linker version in Kconfig") Link: https://github.com/llvm/llvm-project/commit/0f9fbbb63cfcd2069441aa2ebef622c9716f8dbb [1] Link: https://github.com/llvm/llvm-project/commit/649cdfc4b6781a350dfc87d9b2a4b5a4c3395909 [2] Link: https://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap02.html [3] Suggested-by: Fangrui Song <maskray@google.com> Reviewed-by: Fangrui Song <maskray@google.com> Signed-off-by: Nathan Chancellor <nathan@kernel.org> Reviewed-by: Nicolas Schier <nicolas@fjasle.eu> Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2024-07-14Merge tag 'sched_urgent_for_v6.10' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler fixes from Borislav Petkov: - Fix a performance regression when measuring the CPU time of a thread (clock_gettime(CLOCK_THREAD_CPUTIME_ID,...)) due to the addition of PSI IRQ time accounting in the hotpath - Fix a task_struct leak due to missing to decrement the refcount when the task is enqueued before the timer which is supposed to do that, expires - Revert an attempt to expedite detaching of movable tasks, as finding those could become very costly. Turns out the original issue wasn't even hit by anyone * tag 'sched_urgent_for_v6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: sched: Move psi_account_irqtime() out of update_rq_clock_task() hotpath sched/deadline: Fix task_struct reference leak Revert "sched/fair: Make sure to try to detach at least one movable task"
2024-07-14Merge tag 'x86_urgent_for_v6.10' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fix from Borislav Petkov: - Make sure TF is cleared before calling other functions (BHI mitigation in this case) in the SYSENTER compat handler, as otherwise it will warn about being in single-step mode * tag 'x86_urgent_for_v6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/bhi: Avoid warning in #DB handler due to BHI mitigation
2024-07-13Merge tag 'i2c-for-6.10-rc8' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux Pull i2c fixes from Wolfram Sang: "Fixes for the I2C testunit, the Renesas R-Car driver and some MAINTAINERS corrections" * tag 'i2c-for-6.10-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux: i2c: testunit: avoid re-issued work after read message i2c: rcar: ensure Gen3+ reset does not disturb local targets i2c: mark HostNotify target address as used i2c: testunit: correct Kconfig description MAINTAINERS: VIRTIO I2C loses a maintainer, gains a reviewer MAINTAINERS: delete entries for Thor Thayer i2c: rcar: clear NO_RXDMA flag after resetting i2c: rcar: bring hardware to known state when probing
2024-07-13Merge tag '6.10-rc7-smb3-client-fix' of git://git.samba.org/sfrench/cifs-2.6Linus Torvalds
Pull smb client fix from Steve French: "Small fix, also for stable" * tag '6.10-rc7-smb3-client-fix' of git://git.samba.org/sfrench/cifs-2.6: cifs: fix setting SecurityFlags to true
2024-07-13cifs: fix setting SecurityFlags to trueSteve French
If you try to set /proc/fs/cifs/SecurityFlags to 1 it will set them to CIFSSEC_MUST_NTLMV2 which no longer is relevant (the less secure ones like lanman have been removed from cifs.ko) and is also missing some flags (like for signing and encryption) and can even cause mount to fail, so change this to set it to Kerberos in this case. Also change the description of the SecurityFlags to remove mention of flags which are no longer supported. Cc: stable@vger.kernel.org Reviewed-by: Shyam Prasad N <sprasad@microsoft.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2024-07-13Merge tag 'i2c-host-fixes-6.10-rc8' of ↵Wolfram Sang
git://git.kernel.org/pub/scm/linux/kernel/git/andi.shyti/linux into i2c/for-current This tag includes three fixes for the Renesas R-Car driver: 1. Ensures the device is in a known state after probing. 2. Allows clearing the NO_RXDMA flag after a reset. 3. Forces a reset before any transfer on Gen3+ platforms to prevent disruption of the configuration during parallel transfers.
2024-07-12Merge tag 'net-6.10-rc8-2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull more networking fixes from Jakub Kicinski: "A quick follow up to yesterday's pull. We got a regressions report for the bnxt patch as soon as it got to your tree. The ethtool fix is also good to have, although it's an older regression. Current release - regressions: - eth: bnxt_en: fix crash in bnxt_get_max_rss_ctx_ring() on older HW when user tries to decrease the ring count Previous releases - regressions: - ethtool: fix RSS setting, accept "no change" setting if the driver doesn't support the new features - eth: i40e: remove needless retries of NVM update, don't wait 20min when we know the firmware update won't succeed" * tag 'net-6.10-rc8-2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: bnxt_en: Fix crash in bnxt_get_max_rss_ctx_ring() octeontx2-af: fix issue with IPv4 match for RSS octeontx2-af: fix issue with IPv6 ext match for RSS octeontx2-af: fix detection of IP layer octeontx2-af: fix a issue with cpt_lf_alloc mailbox octeontx2-af: replace cpt slot with lf id on reg write i40e: fix: remove needless retries of NVM update net: ethtool: Fix RSS setting
2024-07-12bnxt_en: Fix crash in bnxt_get_max_rss_ctx_ring()Michael Chan
On older chips not supporting multiple RSS contexts, reducing ethtool channels will crash: BUG: kernel NULL pointer dereference, address: 00000000000000b8 PGD 0 P4D 0 Oops: Oops: 0000 [#1] PREEMPT SMP PTI CPU: 1 PID: 7032 Comm: ethtool Tainted: G S 6.10.0-rc4 #1 Hardware name: Dell Inc. PowerEdge R730/072T6D, BIOS 2.4.3 01/17/2017 RIP: 0010:bnxt_get_max_rss_ctx_ring+0x4c/0x90 [bnxt_en] Code: c3 d3 eb 4c 8b 83 38 01 00 00 48 8d bb 38 01 00 00 4c 39 c7 74 42 41 8d 54 24 ff 31 c0 0f b7 d2 4c 8d 4c 12 02 66 85 ed 74 1d <49> 8b 90 b8 00 00 00 49 8d 34 11 0f b7 0a 66 39 c8 0f 42 c1 48 83 RSP: 0018:ffffaaa501d23ba8 EFLAGS: 00010202 RAX: 0000000000000000 RBX: ffff8efdf600c940 RCX: 0000000000000000 RDX: 000000000000007f RSI: ffffffffacf429c4 RDI: ffff8efdf600ca78 RBP: 0000000000000080 R08: 0000000000000000 R09: 0000000000000100 R10: 0000000000000001 R11: ffffaaa501d238c0 R12: 0000000000000080 R13: 0000000000000000 R14: ffff8efdf600c000 R15: 0000000000000006 FS: 00007f977a7d2740(0000) GS:ffff8f041f840000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00000000000000b8 CR3: 00000002320aa004 CR4: 00000000003706f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: <TASK> ? __die_body+0x15/0x60 ? page_fault_oops+0x157/0x440 ? do_user_addr_fault+0x60/0x770 ? _raw_spin_lock_irqsave+0x12/0x40 ? exc_page_fault+0x61/0x120 ? asm_exc_page_fault+0x22/0x30 ? bnxt_get_max_rss_ctx_ring+0x4c/0x90 [bnxt_en] ? bnxt_get_max_rss_ctx_ring+0x25/0x90 [bnxt_en] bnxt_set_channels+0x9d/0x340 [bnxt_en] ethtool_set_channels+0x14b/0x210 __dev_ethtool+0xdf8/0x2890 ? preempt_count_add+0x6a/0xa0 ? percpu_counter_add_batch+0x23/0x90 ? filemap_map_pages+0x417/0x4a0 ? avc_has_extended_perms+0x185/0x420 ? __pfx_udp_ioctl+0x10/0x10 ? sk_ioctl+0x55/0xf0 ? kmalloc_trace_noprof+0xe0/0x210 ? dev_ethtool+0x54/0x170 dev_ethtool+0xa2/0x170 dev_ioctl+0xbe/0x530 sock_do_ioctl+0xa3/0xf0 sock_ioctl+0x20d/0x2e0 bp->rss_ctx_list is not initialized if the chip or firmware does not support multiple RSS contexts. Fix it by adding a check in bnxt_get_max_rss_ctx_ring() before proceeding to reference bp->rss_ctx_list. Fixes: 0d1b7d6c9274 ("bnxt: fix crashes when reducing ring count with active RSS contexts") Reported-by: Breno Leitao <leitao@debian.org> Link: https://lore.kernel.org/netdev/ZpFEJeNpwxW1aW9k@gmail.com/ Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Link: https://patch.msgid.link/20240712175318.166811-1-michael.chan@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-07-12Merge tag 'for-6.10-rc7-tag' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux Pull btrfs fixes from David Sterba: "Fix a regression in extent map shrinker behaviour. In the past weeks we got reports from users that there are huge latency spikes or freezes. This was bisected to newly added shrinker of extent maps (it was added to fix a build up of the structures in memory). I'm assuming that the freezes would happen to many users after release so I'd like to get it merged now so it's in 6.10. Although the diff size is not small the changes are relatively straightforward, the reporters verified the fixes and we did testing on our side. The fixes: - adjust behaviour under memory pressure and check lock or scheduling conditions, bail out if needed - synchronize tracking of the scanning progress so inode ranges are not skipped or work duplicated - do a delayed iput when scanning a root so evicting an inode does not slow things down in case of lots of dirty data, also fix lockdep warning, a deadlock could happen when writing the dirty data would need to start a transaction" * tag 'for-6.10-rc7-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: btrfs: avoid races when tracking progress for extent map shrinking btrfs: stop extent map shrinker if reschedule is needed btrfs: use delayed iput during extent map shrinking
2024-07-12Merge tag 'ceph-for-6.10-rc8' of https://github.com/ceph/ceph-clientLinus Torvalds
Pull ceph fixes from Ilya Dryomov: "A fix for a possible use-after-free following "rbd unmap" or "umount" marked for stable and two kernel-doc fixups" * tag 'ceph-for-6.10-rc8' of https://github.com/ceph/ceph-client: libceph: fix crush_choose_firstn() kernel-doc warnings libceph: suppress crush_choose_indep() kernel-doc warnings libceph: fix race between delayed_work() and ceph_monc_stop()
2024-07-12Merge tag 'pmdomain-v6.10-rc2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/linux-pm Pull pmdomain fix from Ulf Hansson: - qcom: Skip retention level for rpmhpd's * tag 'pmdomain-v6.10-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/linux-pm: pmdomain: qcom: rpmhpd: Skip retention level for Power Domains
2024-07-12Merge tag 'mmc-v6.10-rc4-2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc Pull MMC host fixes from Ulf Hansson: - davinci_mmc: Prevent transmitted data size from exceeding sgm's length - sdhci: Fix max_seg_size for 64KiB PAGE_SIZE * tag 'mmc-v6.10-rc4-2' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc: mmc: davinci_mmc: Prevent transmitted data size from exceeding sgm's length mmc: sdhci: Fix max_seg_size for 64KiB PAGE_SIZE