linux

mirror of https://github.com/torvalds/linux.git synced 2026-06-08 06:25:52 +02:00

Author	SHA1	Message	Date
Anand Moon	918273be08	arm64: dts: amlogic: meson-axg: Add missing cache information to cpu0 Add missing L1 data and instruction cache parameters to the CPU node 0 for the Cortex-A53 caches on the Meson AXG SoC. Fixes: `3b6ad2a433` ("arm64: dts: amlogic: Add cache information to the Amlogic AXG SoCS") Signed-off-by: Anand Moon <linux.amoon@gmail.com> Link: https://patch.msgid.link/20260219103548.18392-1-linux.amoon@gmail.com Signed-off-by: Neil Armstrong <neil.armstrong@linaro.org>	2026-04-21 15:46:22 +02:00
Nick Xie	28e4a49a28	arm64: dts: amlogic: t7: khadas-vim4: fix board model name Update the model property to "Khadas VIM4" to match the official product branding and maintain consistency with other Khadas boards (e.g., VIM1, VIM2, VIM3) in the kernel tree. Signed-off-by: Nick Xie <nick@khadas.com> Reviewed-by: Neil Armstrong <neil.armstrong@linaro.org> Link: https://patch.msgid.link/20260306030756.2421841-1-nick@khadas.com Signed-off-by: Neil Armstrong <neil.armstrong@linaro.org>	2026-04-21 15:46:22 +02:00
Ronald Claveau	232eb5dc61	arm64: dts: amlogic: Fix GIC register ranges for Amlogic T7 This patch aims to fix the GIC register ranges for Amlogic T7 SoC family. - Context Kernel log shows a warning about GIC [ 0.000000] GIC: GICv2 detected, but range too small and irqchip.gicv2_force_probe not set Using cat /proc/interrupts command shows GIC as GIC-0 Adding some peripherals sometimes causes hangs on interrupts. - According to the GIC-400 ARM doc, the memory map is like: 0x1000-0x1FFF Distributor 0x2000-0x3FFF CPU interfaces 0x4000-0x5FFF Virtual interface control block 0x6000-0x7FFF Virtual CPU interfaces - Identify GIC model from distributor register Offset | Name | Type | Reset 0x008 | GICD_IIDR | RO | 0x0200143B kvim4# md.l 0xFFF01008 1 fff01008: 0200143b - Identify CPU interface from CPU interface register Offset | Name | Type | Reset 0x00FC | GICC_IIDR | RO | 0x0202143B kvim4# md.l 0xFFF020FC 1 fff020fc: 0202143b - Virtual interface control register check Offset | Name | Type | Reset 0x004 | GICH_VTR | RO | 0x90000003 kvim4# md.l 0xFFF04004 1 fff04004: 90000003 - Virtual CPU interfaces check Offset | Name | Type | Reset 0x00FC | GICV_IIDR | RO | 0x0202143B kvim4# md.l 0xFFF060FC 1 fff060fc: 0202143b - After this patch there is no warning anymore. GICv2 is correctly identified. [ 0.000000] GIC: Using split EOI/Deactivate mode Using cat /proc/interrupts command shows GIC as GICv2 Signed-off-by: Ronald Claveau <linux-kernel-dev@aliel.fr> Reviewed-by: Neil Armstrong <neil.armstrong@linaro.org> Link: https://patch.msgid.link/20260305-fix-amlt7-gic-dts-v1-1-5944415c74bf@aliel.fr Signed-off-by: Neil Armstrong <neil.armstrong@linaro.org>	2026-04-21 15:46:22 +02:00
Nick Xie	124d5e138a	arm64: dts: amlogic: t7: khadas-vim4: fix memory layout for 8GB RAM The Khadas VIM4 features 8GB of LPDDR4X RAM. The previous memory node mapped a single incorrect region. This caused the kernel to map MMIO and secure firmware (ATF/TrustZone) memory holes as standard RAM, leading to an Asynchronous SError Interrupt during early boot (paging_init) when the kernel attempted to clear those pages. Fix this by splitting the 8GB memory layout into three separate regions to properly avoid the memory holes (e.g., 0xe0000000 - 0xffffffff): - 3.5GB @ 0x000000000 - 3.5GB @ 0x100000000 - 1.0GB @ 0x200000000 Signed-off-by: Nick Xie <nick@khadas.com> Suggested-by: Ronald Claveau <linux-kernel-dev@aliel.fr> Link: https://patch.msgid.link/20260319023446.3422695-1-nick@khadas.com Signed-off-by: Neil Armstrong <neil.armstrong@linaro.org>	2026-04-21 15:46:22 +02:00
Geert Uytterhoeven	5ecee47dc9	arm64: dts: amlogic: s6: Drop CPU masks from GICv3 PPI interrupts Unlike older GIC variants, the GICv3 DT bindings do not support specifying a CPU mask in PPI interrupt specifiers. Drop the masks. While at it, replace the magic number for IRQ_TYPE_LEVEL_HIGH by its symbolic definition. Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Reviewed-by: Neil Armstrong <neil.armstrong@linaro.org> Link: https://patch.msgid.link/f9c6eddebebcd2e128edd2dbc51706e23589f9e8.1772643434.git.geert+renesas@glider.be Signed-off-by: Neil Armstrong <neil.armstrong@linaro.org>	2026-04-21 15:46:22 +02:00
Chia-Yu Chang	478ed6b7d2	net/sched: sch_dualpi2: drain both C-queue and L-queue in dualpi2_change() Fix dualpi2_change() to correctly enforce updated limit and memlimit values after a configuration change of the dualpi2 qdisc. Before this patch, dualpi2_change() always attempted to dequeue packets via the root qdisc (C-queue) when reducing backlog or memory usage, and unconditionally assumed that a valid skb will be returned. When traffic classification results in packets being queued in the L-queue while the C-queue is empty, this leads to a NULL skb dereference during limit or memlimit enforcement. This is fixed by first dequeuing from the C-queue path if it is non-empty. Once the C-queue is empty, packets are dequeued directly from the L-queue. Return values from qdisc_dequeue_internal() are checked for both queues. When dequeuing from the L-queue, the parent qdisc qlen and backlog counters are updated explicitly to keep overall qdisc statistics consistent. Fixes: `320d031ad6` ("sched: Struct definition and parsing of dualpi2 qdisc") Reported-by: "Kito Xu (veritas501)" <hxzene@gmail.com> Closes: https://lore.kernel.org/netdev/20260413075740.2234828-1-hxzene@gmail.com/ Signed-off-by: Chia-Yu Chang <chia-yu.chang@nokia-bell-labs.com> Link: https://patch.msgid.link/20260417152551.71648-1-chia-yu.chang@nokia-bell-labs.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2026-04-21 15:00:39 +02:00
Lorenzo Bianconi	d647f25452	net: airoha: Fix PPE cpu port configuration for GDM2 loopback path When QoS loopback is enabled for GDM3 or GDM4, incoming packets are forwarded to GDM2. However, the PPE cpu port for GDM2 is not configured in this path, causing traffic originating from GDM3/GDM4, which may be set up as WAN ports backed by QDMA1, to be incorrectly directed to QDMA0 instead. Configure the PPE cpu port for GDM2 when QoS loopback is active on GDM3 or GDM4 to ensure traffic is routed to the correct QDMA instance. Fixes: `9cd451d414` ("net: airoha: Add loopback support for GDM2") Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Link: https://patch.msgid.link/20260417-airoha-ppe-cpu-port-for-gdm2-loopback-v1-1-c7a9de0f6f57@kernel.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2026-04-21 14:46:22 +02:00
Corey Minyard	36920f30e7	ipmi: Check event message buffer response for bad data The event message buffer response data size got checked later when processing, but check it right after the response comes back. It appears some BMCs may return an empty message instead of an error when fetching events. There are apparently some new BMCs that make this error, so we need to compensate. Reported-by: Matt Fleming <mfleming@cloudflare.com> Closes: https://lore.kernel.org/lkml/20260415115930.3428942-1-matt@readmodwrite.com/ Fixes: `1da177e4c3` ("Linux-2.6.12-rc2") Cc: <stable@vger.kernel.org> Signed-off-by: Corey Minyard <corey@minyard.net>	2026-04-21 07:29:04 -05:00
Paolo Abeni	edaa48dc2c	Merge branch 'net-sleepable-ndo_set_rx_mode' Stanislav Fomichev says: ==================== net: sleepable ndo_set_rx_mode This series adds a new ndo_set_rx_mode_async callback that enables drivers to handle address list updates in a sleepable context. The current ndo_set_rx_mode is called under the netif_addr_lock spinlock with BHs disabled, which prevents drivers from sleeping. This is problematic for ops-locked drivers that need to sleep. The approach: 1. Add snapshot/reconcile infrastructure for address lists 2. Introduce dev_rx_mode_work that takes snapshots under the lock, drops the lock, calls the driver, then reconciles changes back 3. Move promiscuity handling into the scheduled work as well 4. Convert existing ops-locked drivers to ndo_set_rx_mode_async 5. Add a warning for ops-locked drivers still using ndo_set_rx_mode 6. Add a selftest exercising the team+bridge+macvlan topology that triggers the addr_lock -> ops_lock ordering issue ==================== Link: https://patch.msgid.link/20260416185712.2155425-1-sdf@fomichev.me Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2026-04-21 12:50:26 +02:00
Stanislav Fomichev	c4dde411bc	selftests: net: use ip commands instead of teamd in team rx_mode test Replace teamd daemon usage with ip link commands for team device setup. teamd -d daemonizes and returns to the shell before port addition completes, creating a race: the test may create the macvlan (and check for its address on a slave) before teamd has finished adding ports. This makes the test inherently dependent on scheduling timing. Using ip commands makes port addition synchronous, removing the race and making the test deterministic. Cc: Jiri Pirko <jiri@resnulli.us> Cc: Jay Vosburgh <jv@jvosburgh.net> Signed-off-by: Stanislav Fomichev <sdf@fomichev.me> Link: https://patch.msgid.link/20260416185712.2155425-16-sdf@fomichev.me Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2026-04-21 12:50:25 +02:00
Stanislav Fomichev	ee514cdb07	selftests: net: add team_bridge_macvlan rx_mode test Add a test that exercises the ndo_change_rx_flags path through a macvlan -> bridge -> team -> dummy stack. This triggers dev_uc_add under addr_list_lock which flips promiscuity on the lower device. With the new work queue approach, this must not deadlock. Link: https://lore.kernel.org/netdev/20260214033859.43857-1-jiayuan.chen@linux.dev/ Reviewed-by: Breno Leitao <leitao@debian.org> Signed-off-by: Stanislav Fomichev <sdf@fomichev.me> Link: https://patch.msgid.link/20260416185712.2155425-15-sdf@fomichev.me Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2026-04-21 12:50:25 +02:00
Stanislav Fomichev	3cbd229388	net: warn ops-locked drivers still using ndo_set_rx_mode Now that all in-tree ops-locked drivers have been converted to ndo_set_rx_mode_async, add a warning in register_netdevice to catch any remaining or newly added drivers that use ndo_set_rx_mode with ops locking. This ensures future driver authors are guided toward the async path. Also route ops-locked devices through netdev_rx_mode_work even if they lack rx_mode NDOs, to ensure netdev_ops_assert_locked() does not fire on the legacy path where only RTNL is held. Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Signed-off-by: Stanislav Fomichev <sdf@fomichev.me> Link: https://patch.msgid.link/20260416185712.2155425-14-sdf@fomichev.me Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2026-04-21 12:50:25 +02:00
Stanislav Fomichev	754b7e1169	netkit: convert to ndo_set_rx_mode_async Convert netkit driver from ndo_set_rx_mode to ndo_set_rx_mode_async. The netkit driver's set_multicast_list is a no-op, presumably for the same reason as the one in dummy? (fake multicast ability) Signed-off-by: Stanislav Fomichev <sdf@fomichev.me> Link: https://patch.msgid.link/20260416185712.2155425-13-sdf@fomichev.me Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2026-04-21 12:50:25 +02:00
Stanislav Fomichev	4d157e89bd	dummy: convert to ndo_set_rx_mode_async Convert dummy driver from ndo_set_rx_mode to ndo_set_rx_mode_async. The dummy driver's set_multicast_list is a no-op, so the conversion is straightforward: update the signature and the ops assignment. Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Signed-off-by: Stanislav Fomichev <sdf@fomichev.me> Link: https://patch.msgid.link/20260416185712.2155425-12-sdf@fomichev.me Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2026-04-21 12:50:25 +02:00
Stanislav Fomichev	8a5df09e70	netdevsim: convert to ndo_set_rx_mode_async Convert netdevsim from ndo_set_rx_mode to ndo_set_rx_mode_async. The callback is a no-op stub so just update the signature and ops struct wiring. Reviewed-by: Breno Leitao <leitao@debian.org> Signed-off-by: Stanislav Fomichev <sdf@fomichev.me> Link: https://patch.msgid.link/20260416185712.2155425-11-sdf@fomichev.me Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2026-04-21 12:50:25 +02:00
Stanislav Fomichev	d071c15b43	iavf: convert to ndo_set_rx_mode_async Convert iavf from ndo_set_rx_mode to ndo_set_rx_mode_async. iavf_set_rx_mode now takes explicit uc/mc list parameters and uses __hw_addr_sync_dev on the snapshots instead of __dev_uc_sync and __dev_mc_sync. The iavf_configure internal caller passes the real lists directly. Cc: Tony Nguyen <anthony.l.nguyen@intel.com> Cc: Przemek Kitszel <przemyslaw.kitszel@intel.com> Signed-off-by: Stanislav Fomichev <sdf@fomichev.me> Link: https://patch.msgid.link/20260416185712.2155425-10-sdf@fomichev.me Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2026-04-21 12:50:25 +02:00
Stanislav Fomichev	a453b5d9b3	bnxt: use snapshot in bnxt_cfg_rx_mode With the introduction of ndo_set_rx_mode_async (as discussed in [1]) we can call bnxt_cfg_rx_mode directly. Convert bnxt_cfg_rx_mode to use uc/mc snapshots and move its call in bnxt_sp_task to the section that resets BNXT_STATE_IN_SP_TASK. Switch to direct call in bnxt_set_rx_mode. Link: https://lore.kernel.org/netdev/CACKFLi=5vj8hPqEUKDd8RTw3au5G+zRgQEqjF+6NZnyoNm90KA@mail.gmail.com/ [1] Cc: Michael Chan <michael.chan@broadcom.com> Cc: Pavan Chebbi <pavan.chebbi@broadcom.com> Reviewed-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: Stanislav Fomichev <sdf@fomichev.me> Link: https://patch.msgid.link/20260416185712.2155425-9-sdf@fomichev.me Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2026-04-21 12:50:25 +02:00
Stanislav Fomichev	f6c53cfa12	bnxt: convert to ndo_set_rx_mode_async Convert bnxt from ndo_set_rx_mode to ndo_set_rx_mode_async. bnxt_set_rx_mode, bnxt_mc_list_updated and bnxt_uc_list_updated now take explicit uc/mc list parameters and iterate with netdev_hw_addr_list_for_each instead of netdev_for_each_{uc,mc}_addr. The bnxt_cfg_rx_mode internal caller passes the real lists under netif_addr_lock_bh. BNXT_RX_MASK_SP_EVENT is still used here, next patch converts to the direct call. Cc: Michael Chan <michael.chan@broadcom.com> Cc: Pavan Chebbi <pavan.chebbi@broadcom.com> Reviewed-by: Michael Chan <michael.chan@broadcom.com> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Signed-off-by: Stanislav Fomichev <sdf@fomichev.me> Link: https://patch.msgid.link/20260416185712.2155425-8-sdf@fomichev.me Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2026-04-21 12:50:25 +02:00
Stanislav Fomichev	5cf06fbdaf	mlx5: convert to ndo_set_rx_mode_async Convert mlx5 from ndo_set_rx_mode to ndo_set_rx_mode_async. The driver's mlx5e_set_rx_mode now receives uc/mc snapshots and calls mlx5e_fs_set_rx_mode_work directly instead of queueing work. mlx5e_sync_netdev_addr and mlx5e_handle_netdev_addr now take explicit uc/mc list parameters and iterate with netdev_hw_addr_list_for_each instead of netdev_for_each_{uc,mc}_addr. Fallback to netdev's uc/mc in a few places and grab addr lock. Cc: Saeed Mahameed <saeedm@nvidia.com> Cc: Tariq Toukan <tariqt@nvidia.com> Cc: Cosmin Ratiu <cratiu@nvidia.com> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Signed-off-by: Stanislav Fomichev <sdf@fomichev.me> Link: https://patch.msgid.link/20260416185712.2155425-7-sdf@fomichev.me Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2026-04-21 12:50:25 +02:00
Stanislav Fomichev	60dd9781e9	fbnic: convert to ndo_set_rx_mode_async Convert fbnic from ndo_set_rx_mode to ndo_set_rx_mode_async. The driver's __fbnic_set_rx_mode() now takes explicit uc/mc list parameters and uses __hw_addr_sync_dev() on the snapshots instead of __dev_uc_sync/__dev_mc_sync on the netdev directly. Update callers in fbnic_up, fbnic_fw_config_after_crash, fbnic_bmc_rpc_check and fbnic_set_mac to pass the real address lists calling __fbnic_set_rx_mode outside the async work path. Cc: Alexander Duyck <alexanderduyck@fb.com> Cc: kernel-team@meta.com Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Signed-off-by: Stanislav Fomichev <sdf@fomichev.me> Link: https://patch.msgid.link/20260416185712.2155425-6-sdf@fomichev.me Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2026-04-21 12:50:24 +02:00
Stanislav Fomichev	7ef83bf171	net: move promiscuity handling into netdev_rx_mode_work Move unicast promiscuity tracking into netdev_rx_mode_work so it runs under netdev_ops_lock instead of under the addr_lock spinlock. This is required because __dev_set_promiscuity calls dev_change_rx_flags and __dev_notify_flags, both of which may need to sleep. Change ASSERT_RTNL() to netdev_ops_assert_locked() in __dev_set_promiscuity, netif_set_allmulti and __dev_change_flags since these are now called from the work queue under the ops lock. Link: https://lore.kernel.org/netdev/20260214033859.43857-1-jiayuan.chen@linux.dev/ Fixes: `78cd408356` ("net: add missing instance lock to dev_set_promiscuity") Reported-by: syzbot+2b3391f44313b3983e91@syzkaller.appspotmail.com Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Signed-off-by: Stanislav Fomichev <sdf@fomichev.me> Link: https://patch.msgid.link/20260416185712.2155425-5-sdf@fomichev.me Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2026-04-21 12:50:24 +02:00
Stanislav Fomichev	a4c8332781	net: cache snapshot entries for ndo_set_rx_mode_async Add a per-device netdev_hw_addr_list cache (rx_mode_addr_cache) that allows __hw_addr_list_snapshot() and __hw_addr_list_reconcile() to reuse previously allocated entries instead of hitting GFP_ATOMIC on every snapshot cycle. snapshot pops entries from the cache when available, falling back to __hw_addr_create(). reconcile splices both snapshot lists back into the cache via __hw_addr_splice(). The cache is flushed in free_netdev(). Signed-off-by: Stanislav Fomichev <sdf@fomichev.me> Link: https://patch.msgid.link/20260416185712.2155425-4-sdf@fomichev.me Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2026-04-21 12:50:13 +02:00
Stanislav Fomichev	3554b4345d	net: introduce ndo_set_rx_mode_async and netdev_rx_mode_work Add ndo_set_rx_mode_async callback that drivers can implement instead of the legacy ndo_set_rx_mode. The legacy callback runs under the netif_addr_lock spinlock with BHs disabled, preventing drivers from sleeping. The async variant runs from a work queue with rtnl_lock and netdev_lock_ops held, in fully sleepable context. When __dev_set_rx_mode() sees ndo_set_rx_mode_async, it schedules netdev_rx_mode_work instead of calling the driver inline. The work function takes two snapshots of each address list (uc/mc) under the addr_lock, then drops the lock and calls the driver with the work copies. After the driver returns, it reconciles the snapshots back to the real lists under the lock. Add netif_rx_mode_sync() to opportunistically execute the pending workqueue update inline, so that rx mode changes are committed before returning to userspace: - dev_change_flags (SIOCSIFFLAGS / RTM_NEWLINK) - dev_set_promiscuity - dev_set_allmulti - dev_ifsioc SIOCADDMULTI / SIOCDELMULTI - do_setlink (RTM_SETLINK) Note that some deep hierarchies still skip the lower updates via: - dev_uc_sync - dev_mc_sync If we do end up hitting user-visible issues, we can add more calls to netif_rx_mode_sync in specific places. But hopefully we should not: the actual user-visible lists are still synced, it's just the HW state that might be lagging. Signed-off-by: Stanislav Fomichev <sdf@fomichev.me> Link: https://patch.msgid.link/20260416185712.2155425-3-sdf@fomichev.me Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2026-04-21 12:50:03 +02:00
Stanislav Fomichev	db9e726525	net: add address list snapshot and reconciliation infrastructure Introduce __hw_addr_list_snapshot() and __hw_addr_list_reconcile() for use by the upcoming ndo_set_rx_mode_async callback. The async rx_mode path needs to snapshot the device's unicast and multicast address lists under the addr_lock, hand those snapshots to the driver (which may sleep), and then propagate any sync_cnt changes back to the real lists. Two identical snapshots are taken: a work copy for the driver to pass to __hw_addr_sync_dev() and a reference copy to compute deltas against. __hw_addr_list_reconcile() walks the reference snapshot comparing each entry against the work snapshot to determine what the driver synced or unsynced. It then applies those deltas to the real list, handling concurrent modifications: - If the real entry was concurrently removed but the driver synced it to hardware (delta > 0), re-insert a stale entry so the next work run properly unsyncs it from hardware. - If the entry still exists, apply the delta normally. An entry whose refcount drops to zero is removed. # dev_addr_test_snapshot_benchmark: 1024 addrs x 1000 snapshots: 89872802 ns total, 89872 ns/iter # dev_addr_test_snapshot_benchmark.speed: slow Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Signed-off-by: Stanislav Fomichev <sdf@fomichev.me> Link: https://patch.msgid.link/20260416185712.2155425-2-sdf@fomichev.me Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2026-04-21 12:50:03 +02:00
Pablo Neira Ayuso	10f79dbd77	netfilter: nf_tables: add hook transactions for device deletions Restore the flag that indicates that the hook is going away, i.e. NFT_HOOK_REMOVE, but add a new transaction object to track deletion of hooks without altering the basechain/flowtable hook_list during the preparation phase. The existing approach that moves the hook from the basechain/flowtable hook_list to transaction hook_list breaks netlink dump path readers of this RCU-protected list. It should be possible to use an array for nft_trans_hook to store the deleted hooks to compact the representation, but I am not expecting many hook objects, especially now that wildcard support for devices is in place. Note that the nft_trans_chain_hooks() list contains a list of struct nft_trans_hook objects for DELCHAIN and DELFLOWTABLE commands, while this list stores struct nft_hook objects for NEWCHAIN and NEWFLOWTABLE. Note that new commands can be updated to use nft_trans_hook for consistency. This patch also adapts the event notification path to deal with the list of hook transactions. Fixes: `7d937b1071` ("netfilter: nf_tables: support for deleting devices in an existing netdev chain") Fixes: `b6d9014a33` ("netfilter: nf_tables: delete flowtable hooks via transaction list") Reported-by: Xiang Mei <xmei5@asu.edu> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2026-04-21 12:48:44 +02:00
Pablo Neira Ayuso	a6134e62db	netfilter: nf_tables: join hook list via splice_list_rcu() in commit phase Publish new hooks in the list into the basechain/flowtable using splice_list_rcu() to ensure netlink dump list traversal via rcu is safe while concurrent ruleset update is going on. Fixes: `78d9f48f7f` ("netfilter: nf_tables: add devices to existing flowtable") Fixes: `b9703ed44f` ("netfilter: nf_tables: support for adding new devices to an existing netdev chain") Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2026-04-21 12:48:44 +02:00
Pablo Neira Ayuso	f902877b63	rculist: add list_splice_rcu() for private lists This patch adds a helper function, list_splice_rcu(), to safely splice a private (non-RCU-protected) list into an RCU-protected list. The function ensures that only the pointer visible to RCU readers (prev->next) is updated using rcu_assign_pointer(), while the rest of the list manipulations are performed with regular assignments, as the source list is private and not visible to concurrent RCU readers. This is useful for moving elements from a private list into a global RCU-protected list, ensuring safe publication for RCU readers. Subsystems with some sort of batching mechanism from userspace can benefit from this new function. The function __list_splice_rcu() has been added for clarity and to follow the same pattern as in the existing list_splice*() interfaces, where there is a check to ensure that the list to splice is not empty. Note that __list_splice_rcu() has no documentation for this reason. Reviewed-by: Paul E. McKenney <paulmck@kernel.org> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2026-04-21 12:48:44 +02:00
Florian Westphal	f3224ee463	netfilter: nf_tables: use list_del_rcu for netlink hooks nft_netdev_unregister_hooks and __nft_unregister_flowtable_net_hooks need to use list_del_rcu(), this list can be walked by concurrent dumpers. Add a new helper and use it consistently. Fixes: `f9a43007d3` ("netfilter: nf_tables: double hook unregistration in netns path") Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2026-04-21 12:47:47 +02:00
Pablo Neira Ayuso	1e8e3f449b	netfilter: arp_tables: fix IEEE1394 ARP payload parsing Weiming Shi says: "arp_packet_match() unconditionally parses the ARP payload assuming two hardware addresses are present (source and target). However, IPv4-over-IEEE1394 ARP (RFC 2734) omits the target hardware address field, and arp_hdr_len() already accounts for this by returning a shorter length for ARPHRD_IEEE1394 devices. As a result, on IEEE1394 interfaces arp_packet_match() advances past a nonexistent target hardware address and reads the wrong bytes for both the target device address comparison and the target IP address. This causes arptables rules to match against garbage data, leading to incorrect filtering decisions: packets that should be accepted may be dropped and vice versa. The ARP stack in net/ipv4/arp.c (arp_create and arp_process) already handles this correctly by skipping the target hardware address for ARPHRD_IEEE1394. Apply the same pattern to arp_packet_match()." Mangle the original patch to always return 0 (no match) in case user matches on the target hardware address which is never present in IEEE1394. Note that this returns 0 (no match) for both normal and inverse matches because matching on the target hardware address in ARPHRD_IEEE1394 has never been supported by arptables. This is intentional, matching on the target hardware address should never evaluate true for ARPHRD_IEEE1394. Moreover, adjust arpt_mangle to drop the packet too as AI suggests: In arpt_mangle, the logic assumes a standard ARP layout. Because IEEE1394 (FireWire) omits the target hardware address, the linear pointer arithmetic miscalculates the offset for the target IP address. This causes mangling operations to write to the wrong location, leading to packet corruption. To ensure safety, this patch drops packets (NF_DROP) when mangling is requested for these fields on IEEE1394 devices, as the current implementation cannot correctly map the FireWire ARP payload. 
This omits both mangling target hardware and IP address. Even if IP address mangling should be possible in IEEE1394, this would require to adjust arpt_mangle offset calculation, which has never been supported. Based on patch from Weiming Shi <bestswngs@gmail.com>. Fixes: `6752c8db8e` ("firewire net, ipv4 arp: Extend hardware address and remove driver-level packet inspection.") Reported-by: Xiang Mei <xmei5@asu.edu> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2026-04-21 12:44:39 +02:00
Gao Xiang	2d8c7edcb6	erofs: unify lcn as u64 for 32-bit platforms As sashiko reported [1], `lcn` was typed as `unsigned long` (or `unsigned int` sometimes), which is only 32 bits wide on 32-bit platforms, which causes `(lcn << lclusterbits)` to be truncated at 4 GiB. In order to consolidate the logic, just use `u64` consistently around the codebase. [1] https://sashiko.dev/r/20260420034612.1899973-1-hsiangkao%40linux.alibaba.com Fixes: `152a333a58` ("staging: erofs: add compacted compression indexes support") Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>	2026-04-21 16:56:08 +08:00
Gao Xiang	c99493ce40	erofs: fix offset truncation when shifting pgoff on 32-bit platforms On 32-bit platforms, pgoff_t is 32 bits wide, so left-shifting large arbitrary pgoff_t values by PAGE_SHIFT performs 32-bit arithmetic and silently truncates the result for pages beyond the 4 GiB boundary. Cast the page index to loff_t before shifting to produce a correct 64-bit byte offset. Fixes: `386292919c` ("erofs: introduce readmore decompression strategy") Fixes: `307210c262` ("erofs: verify metadata accesses for file-backed mounts") Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>	2026-04-21 16:56:08 +08:00
Gao Xiang	d18a3b5d33	erofs: fix the out-of-bounds nameoff handling for trailing dirents Currently we already have boundary-checks for nameoffs, but the trailing dirents are special since the namelens are calculated with strnlen() with unchecked nameoffs. If a crafted EROFS has a trailing dirent with nameoff >= maxsize, maxsize - nameoff can underflow, causing strnlen() to read past the directory block. nameoff0 should also be verified to be a multiple of `sizeof(struct erofs_dirent)` [1]. [1] https://sashiko.dev/#/patchset/20260416063511.3173774-1-hsiangkao%40linux.alibaba.com Fixes: `3aa8ec716e` ("staging: erofs: add directory operations") Fixes: `33bac91284` ("staging: erofs: keep corrupted fs from crashing kernel in erofs_readdir()") Reported-by: Yuhao Jiang <danisjiang@gmail.com> Reported-by: Junrui Luo <moonafterrain@outlook.com> Closes: https://lore.kernel.org/r/A0FD7E0F-7558-49B0-8BC8-EB1ECDB2479A@outlook.com Cc: stable@vger.kernel.org Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com> Reviewed-by: Chao Yu <chao@kernel.org>	2026-04-21 16:56:04 +08:00
Weiming Shi	4c1367a2d7	slip: bound decode() reads against the compressed packet length slhc_uncompress() parses a VJ-compressed TCP header by advancing a pointer through the packet via decode() and pull16(). Neither helper bounds-checks against isize, and decode() masks its return with & 0xffff so it can never return the -1 that callers test for -- those error paths are dead code. A short compressed frame whose change byte requests optional fields lets decode() read past the end of the packet. The over-read bytes are folded into the cached cstate and reflected into subsequent reconstructed packets. Make decode() and pull16() take the packet end pointer and return -1 when exhausted. Add a bounds check before the TCP-checksum read. The existing == -1 tests now do what they were always meant to. Fixes: `1da177e4c3` ("Linux-2.6.12-rc2") Reported-by: Simon Horman <horms@kernel.org> Closes: https://lore.kernel.org/netdev/20260414134126.758795-2-horms@kernel.org/ Signed-off-by: Weiming Shi <bestswngs@gmail.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20260416100147.531855-5-bestswngs@gmail.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2026-04-21 10:18:18 +02:00
Phil Willoughby	12c1c672d4	ALSA: usb-audio/line6: Add support for POD HD PRO The POD HD PRO is the rackmount version of the POD 500, with most of the same behaviors. As with some of the other rackmount POD devices it will not send captured audio to the host unless the host is sending playback audio, so it has LINE6_CAP_IN_NEEDS_OUT in addition to the POD 500 flags. Tested-By: Phil Willoughby <willerz@gmail.com> Signed-off-by: Phil Willoughby <willerz@gmail.com> Link: https://patch.msgid.link/20260420152405.7230-1-willerz@gmail.com Signed-off-by: Takashi Iwai <tiwai@suse.de>	2026-04-21 10:07:45 +02:00
Chris Chiu	cb78517e60	ALSA: hda/realtek: Add LED fixup for HP EliteBook 6 G2a Laptops The HP EliteBook 6 G2a laptops require the specific LED control method ALC236_FIXUP_HP_MUTE_LED_MICMUTE_VREF to work. Signed-off-by: Chris Chiu <chris.chiu@canonical.com> Link: https://patch.msgid.link/20260421023429.3723154-1-chris.chiu@canonical.com Signed-off-by: Takashi Iwai <tiwai@suse.de>	2026-04-21 10:06:48 +02:00
Weiming Shi	e76607442d	slip: reject VJ receive packets on instances with no rstate array slhc_init() accepts rslots == 0 as a valid configuration, with the documented meaning of 'no receive compression'. In that case the allocation loop in slhc_init() is skipped, so comp->rstate stays NULL and comp->rslot_limit stays 0 (from the kzalloc of struct slcompress). The receive helpers do not defend against that configuration. slhc_uncompress() dereferences comp->rstate[x] when the VJ header carries an explicit connection ID, and slhc_remember() later assigns cs = &comp->rstate[...] after only comparing the packet's slot number to comp->rslot_limit. Because rslot_limit is 0, slot 0 passes the range check, and the code dereferences a NULL rstate. The configuration is reachable in-tree through PPP. PPPIOCSMAXCID stores its argument in a signed int, and (val >> 16) uses arithmetic shift. Passing 0xffff0000 therefore sign-extends to -1, so val2 + 1 is 0 and ppp_generic.c ends up calling slhc_init(0, 1). Because /dev/ppp open is gated by ns_capable(CAP_NET_ADMIN), the whole path is reachable from an unprivileged user namespace. 
Once the malformed VJ state is installed, any inbound VJ-compressed or VJ-uncompressed frame that selects slot 0 crashes the kernel in softirq context: Oops: general protection fault, probably for non-canonical address 0xdffffc0000000000: 0000 [#1] SMP KASAN NOPTI KASAN: null-ptr-deref in range [0x0000000000000000-0x0000000000000007] RIP: 0010:slhc_uncompress (drivers/net/slip/slhc.c:519) Call Trace: <TASK> ppp_receive_nonmp_frame (drivers/net/ppp/ppp_generic.c:2466) ppp_input (drivers/net/ppp/ppp_generic.c:2359) ppp_async_process (drivers/net/ppp/ppp_async.c:492) tasklet_action_common (kernel/softirq.c:926) handle_softirqs (kernel/softirq.c:623) run_ksoftirqd (kernel/softirq.c:1055) smpboot_thread_fn (kernel/smpboot.c:160) kthread (kernel/kthread.c:436) ret_from_fork (arch/x86/kernel/process.c:164) </TASK> Reject the receive side on such instances instead of touching rstate. slhc_uncompress() falls through to its existing 'bad' label, which bumps sls_i_error and enters the toss state. slhc_remember() mirrors that with an explicit sls_i_error increment followed by slhc_toss(); the sls_i_runt counter is not used here because a missing rstate is an internal configuration state, not a runt packet. The transmit path is unaffected: the only in-tree caller that picks rslots from userspace (ppp_generic.c) still supplies tslots >= 1, and slip.c always calls slhc_init(16, 16), so comp->tstate remains valid and slhc_compress() continues to work. Fixes: `4ab42d78e3` ("ppp, slip: Validate VJ compression slot parameters completely") Reported-by: Xiang Mei <xmei5@asu.edu> Signed-off-by: Weiming Shi <bestswngs@gmail.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20260415204130.258866-2-bestswngs@gmail.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2026-04-21 09:51:40 +02:00
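The sign-extension step in the slip commit above can be reproduced in plain userspace C. This is only a sketch, not the kernel code: `rslots_from_arg()` and `rx_slot_usable()` are hypothetical helpers, and the conversion to `int` and the right shift of a negative value are implementation-defined in C (gcc and clang wrap and use an arithmetic shift, which is what the commit message relies on).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not kernel code): derive the receive slot count
 * from a PPPIOCSMAXCID-style argument as the commit message describes.
 * The argument lives in a signed int, so the shift sign-extends.
 */
static int rslots_from_arg(uint32_t user_arg)
{
	int val = (int)user_arg;  /* implementation-defined; wraps on gcc/clang */
	int val2 = val >> 16;     /* arithmetic shift: 0xffff0000 becomes -1 */

	return val2 + 1;          /* the value that ends up as rslots */
}

/*
 * Sketch of the kind of guard the fix describes: refuse to touch the
 * receive state when none was allocated, instead of trusting the
 * slot-range comparison alone.
 */
static int rx_slot_usable(const void *rstate, int slot, int rslot_limit)
{
	assert(rslot_limit >= 0);
	if (rstate == NULL)       /* rslots == 0: no receive compression */
		return 0;
	return slot >= 0 && slot <= rslot_limit;
}
```

With an argument of 0xffff0000 the helper yields 0 receive slots, which matches the `slhc_init(0, 1)` call named in the message. The range check alone would accept slot 0 against a limit of 0, so the NULL test is what makes the difference.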
Tejun Heo	4fe9852927	rhashtable: Bounce deferred worker kick through irq_work Inserts past 75% load call schedule_work(&ht->run_work) to kick an async resize. If a caller holds a raw spinlock (e.g. an insecure_elasticity user), schedule_work() under that lock records caller_lock -> pool->lock -> pi_lock -> rq->__lock A cycle forms if any of these locks is acquired in the reverse direction elsewhere. sched_ext, the only current insecure_elasticity user, hits this: it holds scx_sched_lock across rhashtable inserts of sub-schedulers, while scx_bypass() takes rq->__lock -> scx_sched_lock. Exercising the resize path produces: Chain exists of: &pool->lock --> &rq->__lock --> scx_sched_lock Bounce the kick from the insert paths through irq_work so schedule_work() runs from hard IRQ context with the caller's lock no longer held. rht_deferred_worker()'s self-rearm on error stays on schedule_work(&ht->run_work) - the worker runs in process context with no caller lock held, and keeping the self-requeue on @run_work lets cancel_work_sync() in rhashtable_free_and_destroy() drain it. v3: Keep rht_deferred_worker()'s self-rearm on schedule_work(&run_work). Routing it through irq_work in v2 broke cancel_work_sync()'s self-requeue handling - an irq_work queued after irq_work_sync() returned but while cancel_work_sync() was still waiting could fire post-teardown. v2: Bounce unconditionally instead of gating on insecure_elasticity, as suggested by Herbert. Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: Herbert Xu <herbert@gondor.apana.org.au>	2026-04-20 20:10:50 -10:00
Yihang Li	47e66bec3e	scsi: hisi_sas: Fix sparse warnings in prep_ata_v3_hw() In prep_ata_v3_hw(), add cpu_to_le32() to fix warning: drivers/scsi/hisi_sas/hisi_sas_v3_hw.c:1448:26: sparse: sparse: invalid assignment: \|= drivers/scsi/hisi_sas/hisi_sas_v3_hw.c:1448:26: sparse: left side has type restricted __le32 drivers/scsi/hisi_sas/hisi_sas_v3_hw.c:1448:26: sparse: right side has type unsigned int Fixes: `8aa580cd92` ("scsi: hisi_sas: Enable force phy when SATA disk directly connected") Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202604191850.IVYPTaML-lkp@intel.com/ Signed-off-by: Yihang Li <liyihang9@huawei.com> Link: https://patch.msgid.link/20260420021044.3339459-1-liyihang9@huawei.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2026-04-20 22:37:15 -04:00
Hugo Villeneuve	1dc39ed655	scsi: pmcraid: Fix typo in comments Fix typo in structure comment. Signed-off-by: Hugo Villeneuve <hvilleneuve@dimonoff.com> Link: https://patch.msgid.link/20260417200738.3920001-1-hugo@hugovil.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2026-04-20 22:33:19 -04:00
Brian Bunker	68c3a65a5a	scsi: scsi_dh_alua: Increase default ALUA timeout to maximum spec value The ALUA handler maps a 0 value (no implicit transition timeout provided by the target) to the ALUA_FAILOVER_TIMEOUT constant, currently 60 seconds. This means the kernel already does not accept an infinite transition time. However, 60 seconds is insufficient for some arrays that may take longer to complete ALUA transitions. Since the highest value allowed by the SCSI specification for the implicit transition timeout is a single byte (255 seconds), change the default to 255. This way, when a target does not provide an explicit transition timeout, we default to the maximum value the spec allows rather than an arbitrary 60 second limit. Co-developed-by: Krishna Kant <krishna.kant@purestorage.com> Signed-off-by: Krishna Kant <krishna.kant@purestorage.com> Co-developed-by: Riya Savla <rsavla@purestorage.com> Signed-off-by: Riya Savla <rsavla@purestorage.com> Signed-off-by: Brian Bunker <brian@purestorage.com> Reviewed-by: Hannes Reinecke <hare@suse.de> Link: https://patch.msgid.link/20260416165512.26497-2-brian@purestorage.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2026-04-20 22:29:55 -04:00
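The timeout mapping described in the scsi_dh_alua commit above amounts to a one-line fallback. A minimal sketch, assuming the reported value arrives as a single byte; `ALUA_DEFAULT_TMO_SECS` and `transition_tmo_secs()` are hypothetical names, not the driver's identifiers.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical name for the new default: the largest value one byte can hold. */
#define ALUA_DEFAULT_TMO_SECS 255u

/*
 * Map the target's implicit transition timeout to the value actually used.
 * A reported 0 means the target gave no timeout, so the default applies.
 */
static unsigned int transition_tmo_secs(uint8_t reported)
{
	unsigned int tmo = reported ? reported : ALUA_DEFAULT_TMO_SECS;

	assert(tmo >= 1 && tmo <= 255);
	return tmo;
}
```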
Tomas Henzl	d65efdf467	scsi: smartpqi: Silence a recursive lock warning On systems with multiple controllers, a debug kernel shows "WARNING: possible recursive locking detected" during shutdown. Each controller has its own ctrl_info (and mutex), which the debug kernel does not recognize correctly. Suppress the warning by releasing the mutex at the end of pqi_shutdown(). Signed-off-by: Tomas Henzl <thenzl@redhat.com> Acked-by: Don Brace <don.brace@microchip.com> Link: https://patch.msgid.link/20260414124118.23661-1-thenzl@redhat.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2026-04-20 22:26:12 -04:00
Ranjan Kumar	04631f55af	scsi: mpt3sas: Limit NVMe request size to 2 MiB The HBA firmware reports NVMe MDTS values based on the underlying drive capability. However, because the driver allocates a fixed 4K buffer for the PRP list, accommodating at most 512 entries, the driver supports a maximum I/O transfer size of 2 MiB. Limit max_hw_sectors to the smaller of the reported MDTS and the 2 MiB driver limit to prevent issuing oversized I/O that may lead to a kernel oops. Cc: stable@vger.kernel.org Fixes: `9b8b84879d` ("block: Increase BLK_DEF_MAX_SECTORS_CAP") Reported-by: Mira Limbeck <m.limbeck@proxmox.com> Closes: https://lore.kernel.org/r/291f78bf-4b4a-40dd-867d-053b36c564b3@proxmox.com Link: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=9b8b84879d4a Suggested-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Ranjan Kumar <ranjan.kumar@broadcom.com> Tested-by: Mira Limbeck <m.limbeck@proxmox.com> Link: https://patch.msgid.link/20260414110811.85156-1-ranjan.kumar@broadcom.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2026-04-20 22:22:33 -04:00
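The 2 MiB figure in the mpt3sas commit above follows from simple arithmetic on the PRP list size. The sketch below works it through under two assumptions that the message does not spell out: each PRP entry is an 8-byte address, and each entry maps one 4 KiB page. The macro and function names are hypothetical, not the driver's.

```c
#include <assert.h>
#include <stdint.h>

#define PRP_LIST_BYTES  4096u /* fixed PRP list buffer, per the commit message */
#define PRP_ENTRY_BYTES 8u    /* assumption: one 64-bit address per entry */
#define PRP_PAGE_BYTES  4096u /* assumption: each entry maps one 4 KiB page */
#define SECTOR_BYTES    512u

/* Largest transfer a single PRP list can describe, in bytes. */
static uint32_t prp_max_bytes(void)
{
	uint32_t entries = PRP_LIST_BYTES / PRP_ENTRY_BYTES; /* 512 entries */

	return entries * PRP_PAGE_BYTES;                     /* 2 MiB */
}

/* Clamp a device-reported limit (in sectors) to what the driver can map. */
static uint32_t clamp_max_hw_sectors(uint32_t mdts_sectors)
{
	uint32_t driver_sectors = prp_max_bytes() / SECTOR_BYTES;

	assert(driver_sectors > 0);
	return mdts_sectors < driver_sectors ? mdts_sectors : driver_sectors;
}
```

Taking the smaller of the two limits means a drive that reports a larger MDTS is capped at the driver limit, while a drive that reports less keeps its own value.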
Christoph Hellwig	7b03c93d2b	scsi: sg: Don't use GFP_ATOMIC in sg_start_req() sg_start_req() is called from normal user context and can sleep when waiting for memory. Switch it to use GFP_KERNEL, which fixes allocation failures seen with the bio_alloc rework. Fixes: `b520c4eef8` ("block: split bio_alloc_bioset more clearly into a fast and slowpath") Reported-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Tested-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com> Reviewed-by: John Garry <john.g.garry@oracle.com> Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Link: https://patch.msgid.link/20260415060813.807659-2-hch@lst.de Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2026-04-20 22:13:41 -04:00
Mark Harmstone	82323b1a70	btrfs: fix double-decrement of bytes_may_use in submit_one_async_extent() submit_one_async_extent() calls btrfs_reserve_extent(), which decrements bytes_may_use. If the call btrfs_create_io_em() fails, we jump to out_free_reserve, which calls extent_clear_unlock_delalloc(). Because we're specifying EXTENT_DO_ACCOUNTING, i.e. EXTENT_CLEAR_META_RESV \| EXTENT_CLEAR_DATA_RESV, this decreases bytes_may_use again. This can lead to problems later on, as an initial write can fail only for the writeback to silently ENOSPC. Fix this by replacing EXTENT_DO_ACCOUNTING with EXTENT_CLEAR_META_RESV. This parallels `a4fe134fc1` ("btrfs: fix a double release on reserved extents in cow_one_range()"), which is the same fix in cow_one_range(). Fixes: `151a41bc46` ("Btrfs: fix what bits we clear when erroring out from delalloc") Reviewed-by: Qu Wenruo <wqu@suse.com> Signed-off-by: Mark Harmstone <mark@harmstone.com> Signed-off-by: David Sterba <dsterba@suse.com>	2026-04-21 04:03:08 +02:00
robbieko	a8d58a7c02	btrfs: check return value of btrfs_partially_delete_raid_extent() btrfs_partially_delete_raid_extent() returns an error code (e.g. -ENOMEM from kzalloc(), or errors from btrfs_del_item/btrfs_insert_item()), but all three call sites in btrfs_delete_raid_extent() discard the return value, silently losing errors and potentially leaving the stripe tree in an inconsistent state. Fix by capturing the return value into ret at all three call sites and breaking out of the loop on error where appropriate. Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: robbieko <robbieko@synology.com> Signed-off-by: David Sterba <dsterba@suse.com>	2026-04-21 04:02:39 +02:00
robbieko	fe0cdfd711	btrfs: handle -EAGAIN from btrfs_duplicate_item and refresh stale leaf pointer In the 'punch a hole' case of btrfs_delete_raid_extent(), btrfs_duplicate_item() can return -EAGAIN when the leaf needs to be split and the path becomes invalid. The old code treats any error as fatal and breaks out of the loop. Additionally, btrfs_duplicate_item() may trigger setup_leaf_for_split() which can reallocate the leaf node. The code continues using the old leaf pointer, leading to use-after-free or stale data access. Fix both issues by: - Handling -EAGAIN specifically: release the path and retry the loop. - Refreshing leaf = path->nodes[0] after successful duplication. Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: robbieko <robbieko@synology.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2026-04-21 04:02:34 +02:00
robbieko	653361585d	btrfs: replace ASSERT with proper error handling in stripe lookup fallback After falling back to the previous item in btrfs_delete_raid_extent(), the code uses ASSERT(found_start <= start) to verify the found extent actually precedes our target range. If the B-tree state is unexpected (e.g. no overlapping extent exists), this triggers a kernel BUG/panic in debug builds, or silently continues with wrong data otherwise. Replace the ASSERT with a proper bounds check that returns -ENOENT if the found extent does not actually overlap with the start position. Signed-off-by: robbieko <robbieko@synology.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2026-04-21 04:02:30 +02:00
robbieko	1871ae78ff	btrfs: fix wrong min_objectid in btrfs_previous_item() call When found_start > start and slot == 0, btrfs_previous_item() is called with min_objectid=start to find the previous stripe extent. However, the previous stripe extent we are looking for has objectid < start (it starts before our deletion range), so passing start as min_objectid prevents finding it. Fix by passing 0 as min_objectid to allow finding any preceding stripe extent regardless of its objectid. Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: robbieko <robbieko@synology.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2026-04-21 04:02:26 +02:00
robbieko	2aef5cb1dc	btrfs: fix raid stripe search missing entries at leaf boundaries In btrfs_delete_raid_extent(), the search key uses offset=0. When the target stripe entry is the first item on a leaf, btrfs_search_slot() may land on the previous leaf and decrementing the slot from nritems still points to the wrong entry, causing the stripe extent to be silently missed. Fix this by searching with offset=(u64)-1 instead. Since no real stripe entry has this offset, btrfs_search_slot() always returns 1 with the slot pointing past the last matching objectid entry. Then unconditionally decrement the slot with a proper slots[0]==0 early-exit check to handle the case where no matching entry exists. Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: robbieko <robbieko@synology.com> Signed-off-by: David Sterba <dsterba@suse.com>	2026-04-21 04:02:21 +02:00
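The search trick in the commit above (search with the largest possible offset, then step back one slot) can be illustrated on a flat sorted array. This is a simplified model, not btrfs code: it ignores leaves and the key type field, and `struct key`, `lower_bound()` and `last_entry_at_or_before()` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified two-part key; real btrfs keys also carry a type field. */
struct key {
	uint64_t objectid;
	uint64_t offset;
};

static int key_cmp(const struct key *a, const struct key *b)
{
	if (a->objectid != b->objectid)
		return a->objectid < b->objectid ? -1 : 1;
	if (a->offset != b->offset)
		return a->offset < b->offset ? -1 : 1;
	return 0;
}

/* First index whose key is >= target, or n if there is none. */
static size_t lower_bound(const struct key *keys, size_t n,
			  const struct key *target)
{
	size_t lo = 0, hi = n;

	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (key_cmp(&keys[mid], target) < 0)
			lo = mid + 1;
		else
			hi = mid;
	}
	return lo;
}

/*
 * Search with offset = UINT64_MAX. Assuming no stored entry uses that
 * offset, the search never hits exactly and the slot lands just past the
 * last entry with this objectid (or a smaller one). Step back one slot,
 * with an early exit when there is nothing before it.
 */
static long last_entry_at_or_before(const struct key *keys, size_t n,
				    uint64_t objectid)
{
	struct key target = { objectid, UINT64_MAX };
	size_t slot = lower_bound(keys, n, &target);

	assert(slot <= n);
	if (slot == 0)
		return -1; /* no candidate entry exists */
	return (long)(slot - 1);
}
```

A caller still has to check that the returned entry actually covers the range it wants, as the later commits in this series do for the stripe tree.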
robbieko	513f8a52ee	btrfs: copy devid in btrfs_partially_delete_raid_extent() When btrfs_partially_delete_raid_extent() rebuilds a truncated/shifted stripe extent into newitem, the loop copies the physical address for each stride but forgets to copy the devid. The resulting item written back to the stripe tree has zeroed-out devids, corrupting the stripe mapping. Fix this by reading the devid with btrfs_raid_stride_devid() and writing it into the new item with btrfs_set_stack_raid_stride_devid() before copying the physical address. Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: robbieko <robbieko@synology.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2026-04-21 04:02:18 +02:00

1447055 Commits