From cfbe371194d1f342bdd88f87a9b36407d1ec0f52 Mon Sep 17 00:00:00 2001 From: Sean Christopherson Date: Mon, 12 Jan 2026 15:28:05 -0800 Subject: [PATCH 001/167] KVM: SVM: Check vCPU ID against max x2AVIC ID if and only if x2AVIC is enabled When allocating the AVIC backing page, only check one of the max AVIC vs. x2AVIC ID based on whether or not x2AVIC is enabled. Doing so fixes a bug where KVM incorrectly inhibits AVIC if x2AVIC is _disabled_ and any vCPU with a non-zero APIC ID is created, as x2avic_max_physical_id is left '0' when x2AVIC is disabled. Fixes: 940fc47cfb0d ("KVM: SVM: Add AVIC support for 4k vCPUs in x2AVIC mode") Cc: stable@vger.kernel.org Cc: Naveen N Rao (AMD) Cc: Suravee Suthikulpanit Reviewed-by: Naveen N Rao (AMD) Link: https://patch.msgid.link/20260112232805.1512361-1-seanjc@google.com Signed-off-by: Sean Christopherson --- arch/x86/kvm/svm/avic.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/arch/x86/kvm/svm/avic.c b/arch/x86/kvm/svm/avic.c index 6b77b2033208..0f6c8596719b 100644 --- a/arch/x86/kvm/svm/avic.c +++ b/arch/x86/kvm/svm/avic.c @@ -376,6 +376,7 @@ void avic_init_vmcb(struct vcpu_svm *svm, struct vmcb *vmcb) static int avic_init_backing_page(struct kvm_vcpu *vcpu) { + u32 max_id = x2avic_enabled ? x2avic_max_physical_id : AVIC_MAX_PHYSICAL_ID; struct kvm_svm *kvm_svm = to_kvm_svm(vcpu->kvm); struct vcpu_svm *svm = to_svm(vcpu); u32 id = vcpu->vcpu_id; @@ -388,8 +389,7 @@ static int avic_init_backing_page(struct kvm_vcpu *vcpu) * avic_vcpu_load() expects to be called if and only if the vCPU has * fully initialized AVIC. */ - if ((!x2avic_enabled && id > AVIC_MAX_PHYSICAL_ID) || - (id > x2avic_max_physical_id)) { + if (id > max_id) { kvm_set_apicv_inhibit(vcpu->kvm, APICV_INHIBIT_REASON_PHYSICAL_ID_TOO_BIG); vcpu->arch.apic->apicv_active = false; return 0; From b4d37cdb77a0015f51fee083598fa227cc07aaf1 Mon Sep 17 00:00:00 2001 From: Sean Christopherson Date: Tue, 13 Jan 2026 09:46:05 -0800 Subject: [PATCH 002/167] KVM: Don't clobber irqfd routing type when deassigning irqfd When deassigning a KVM_IRQFD, don't clobber the irqfd's copy of the IRQ's routing entry as doing so breaks kvm_arch_irq_bypass_del_producer() on x86 and arm64, which explicitly look for KVM_IRQ_ROUTING_MSI. Instead, to handle a concurrent routing update, verify that the irqfd is still active before consuming the routing information. As evidenced by the x86 and arm64 bugs, and another bug in kvm_arch_update_irqfd_routing() (see below), clobbering the entry type without notifying arch code is surprising and error prone. As a bonus, checking that the irqfd is active provides a convenient location for documenting _why_ KVM must not consume the routing entry for an irqfd that is in the process of being deassigned: once the irqfd is deleted from the list (which happens *before* the eventfd is detached), it will no longer receive updates via kvm_irq_routing_update(), and so KVM could deliver an event using stale routing information (relative to KVM_SET_GSI_ROUTING returning to userspace). As an even better bonus, explicitly checking for the irqfd being active fixes a similar bug to the one the clobbering is trying to prevent: if an irqfd is deactivated, and then its routing is changed, kvm_irq_routing_update() won't invoke kvm_arch_update_irqfd_routing() (because the irqfd isn't in the list). And so if the irqfd is in bypass mode, IRQs will continue to be posted using the old routing information. As for kvm_arch_irq_bypass_del_producer(), clobbering the routing type results in KVM incorrectly keeping the IRQ in bypass mode, which is especially problematic on AMD as KVM tracks IRQs that are being posted to a vCPU in a list whose lifetime is tied to the irqfd. Without the help of KASAN to detect use-after-free, the most common sympton on AMD is a NULL pointer deref in amd_iommu_update_ga() due to the memory for irqfd structure being re-allocated and zeroed, resulting in irqfd->irq_bypass_data being NULL when read by avic_update_iommu_vcpu_affinity(): BUG: kernel NULL pointer dereference, address: 0000000000000018 #PF: supervisor read access in kernel mode #PF: error_code(0x0000) - not-present page PGD 40cf2b9067 P4D 40cf2b9067 PUD 408362a067 PMD 0 Oops: Oops: 0000 [#1] SMP CPU: 6 UID: 0 PID: 40383 Comm: vfio_irq_test Tainted: G U W O 6.19.0-smp--5dddc257e6b2-irqfd #31 NONE Tainted: [U]=USER, [W]=WARN, [O]=OOT_MODULE Hardware name: Google, Inc. Arcadia_IT_80/Arcadia_IT_80, BIOS 34.78.2-0 09/05/2025 RIP: 0010:amd_iommu_update_ga+0x19/0xe0 Call Trace: avic_update_iommu_vcpu_affinity+0x3d/0x90 [kvm_amd] __avic_vcpu_load+0xf4/0x130 [kvm_amd] kvm_arch_vcpu_load+0x89/0x210 [kvm] vcpu_load+0x30/0x40 [kvm] kvm_arch_vcpu_ioctl_run+0x45/0x620 [kvm] kvm_vcpu_ioctl+0x571/0x6a0 [kvm] __se_sys_ioctl+0x6d/0xb0 do_syscall_64+0x6f/0x9d0 entry_SYSCALL_64_after_hwframe+0x4b/0x53 RIP: 0033:0x46893b ---[ end trace 0000000000000000 ]--- If AVIC is inhibited when the irfd is deassigned, the bug will manifest as list corruption, e.g. on the next irqfd assignment. list_add corruption. next->prev should be prev (ffff8d474d5cd588), but was 0000000000000000. (next=ffff8d8658f86530). ------------[ cut here ]------------ kernel BUG at lib/list_debug.c:31! Oops: invalid opcode: 0000 [#1] SMP CPU: 128 UID: 0 PID: 80818 Comm: vfio_irq_test Tainted: G U W O 6.19.0-smp--f19dc4d680ba-irqfd #28 NONE Tainted: [U]=USER, [W]=WARN, [O]=OOT_MODULE Hardware name: Google, Inc. Arcadia_IT_80/Arcadia_IT_80, BIOS 34.78.2-0 09/05/2025 RIP: 0010:__list_add_valid_or_report+0x97/0xc0 Call Trace: avic_pi_update_irte+0x28e/0x2b0 [kvm_amd] kvm_pi_update_irte+0xbf/0x190 [kvm] kvm_arch_irq_bypass_add_producer+0x72/0x90 [kvm] irq_bypass_register_consumer+0xcd/0x170 [irqbypass] kvm_irqfd+0x4c6/0x540 [kvm] kvm_vm_ioctl+0x118/0x5d0 [kvm] __se_sys_ioctl+0x6d/0xb0 do_syscall_64+0x6f/0x9d0 entry_SYSCALL_64_after_hwframe+0x4b/0x53 ---[ end trace 0000000000000000 ]--- On Intel and arm64, the bug is less noisy, as the end result is that the device keeps posting IRQs to the vCPU even after it's been deassigned. Note, the worst of the breakage can be traced back to commit cb210737675e ("KVM: Pass new routing entries and irqfd when updating IRTEs"), as before that commit KVM would pull the routing information from the per-VM routing table. But as above, similar bugs have existed since support for IRQ bypass was added. E.g. if a routing change finished before irq_shutdown() invoked kvm_arch_irq_bypass_del_producer(), VMX and SVM would see stale routing information and potentially leave the irqfd in bypass mode. Alternatively, x86 could be fixed by explicitly checking irq_bypass_vcpu instead of irq_entry.type in kvm_arch_irq_bypass_del_producer(), and arm64 could be modified to utilize irq_bypass_vcpu in a similar manner. But (a) that wouldn't fix the routing updates bug, and (b) fixing core code doesn't preclude x86 (or arm64) from adding such code as a sanity check (spoiler alert). Fixes: f70c20aaf141 ("KVM: Add an arch specific hooks in 'struct kvm_kernel_irqfd'") Fixes: cb210737675e ("KVM: Pass new routing entries and irqfd when updating IRTEs") Fixes: a0d7e2fc61ab ("KVM: arm64: vgic-v4: Only attempt vLPI mapping for actual MSIs") Cc: stable@vger.kernel.org Cc: Marc Zyngier Cc: Oliver Upton Link: https://patch.msgid.link/20260113174606.104978-2-seanjc@google.com Signed-off-by: Sean Christopherson --- virt/kvm/eventfd.c | 44 ++++++++++++++++++++++++-------------------- 1 file changed, 24 insertions(+), 20 deletions(-) diff --git a/virt/kvm/eventfd.c b/virt/kvm/eventfd.c index 0e8b5277be3b..a369b20d47f0 100644 --- a/virt/kvm/eventfd.c +++ b/virt/kvm/eventfd.c @@ -157,21 +157,28 @@ irqfd_shutdown(struct work_struct *work) } -/* assumes kvm->irqfds.lock is held */ -static bool -irqfd_is_active(struct kvm_kernel_irqfd *irqfd) +static bool irqfd_is_active(struct kvm_kernel_irqfd *irqfd) { + /* + * Assert that either irqfds.lock or SRCU is held, as irqfds.lock must + * be held to prevent false positives (on the irqfd being active), and + * while false negatives are impossible as irqfds are never added back + * to the list once they're deactivated, the caller must at least hold + * SRCU to guard against routing changes if the irqfd is deactivated. + */ + lockdep_assert_once(lockdep_is_held(&irqfd->kvm->irqfds.lock) || + srcu_read_lock_held(&irqfd->kvm->irq_srcu)); + return list_empty(&irqfd->list) ? false : true; } /* * Mark the irqfd as inactive and schedule it for removal - * - * assumes kvm->irqfds.lock is held */ -static void -irqfd_deactivate(struct kvm_kernel_irqfd *irqfd) +static void irqfd_deactivate(struct kvm_kernel_irqfd *irqfd) { + lockdep_assert_held(&irqfd->kvm->irqfds.lock); + BUG_ON(!irqfd_is_active(irqfd)); list_del_init(&irqfd->list); @@ -217,8 +224,15 @@ irqfd_wakeup(wait_queue_entry_t *wait, unsigned mode, int sync, void *key) seq = read_seqcount_begin(&irqfd->irq_entry_sc); irq = irqfd->irq_entry; } while (read_seqcount_retry(&irqfd->irq_entry_sc, seq)); - /* An event has been signaled, inject an interrupt */ - if (kvm_arch_set_irq_inatomic(&irq, kvm, + + /* + * An event has been signaled, inject an interrupt unless the + * irqfd is being deassigned (isn't active), in which case the + * routing information may be stale (once the irqfd is removed + * from the list, it will stop receiving routing updates). + */ + if (unlikely(!irqfd_is_active(irqfd)) || + kvm_arch_set_irq_inatomic(&irq, kvm, KVM_USERSPACE_IRQ_SOURCE_ID, 1, false) == -EWOULDBLOCK) schedule_work(&irqfd->inject); @@ -585,18 +599,8 @@ kvm_irqfd_deassign(struct kvm *kvm, struct kvm_irqfd *args) spin_lock_irq(&kvm->irqfds.lock); list_for_each_entry_safe(irqfd, tmp, &kvm->irqfds.items, list) { - if (irqfd->eventfd == eventfd && irqfd->gsi == args->gsi) { - /* - * This clearing of irq_entry.type is needed for when - * another thread calls kvm_irq_routing_update before - * we flush workqueue below (we synchronize with - * kvm_irq_routing_update using irqfds.lock). - */ - write_seqcount_begin(&irqfd->irq_entry_sc); - irqfd->irq_entry.type = 0; - write_seqcount_end(&irqfd->irq_entry_sc); + if (irqfd->eventfd == eventfd && irqfd->gsi == args->gsi) irqfd_deactivate(irqfd); - } } spin_unlock_irq(&kvm->irqfds.lock); From ef3719e33e6649164382c629d58704b828f56079 Mon Sep 17 00:00:00 2001 From: Sean Christopherson Date: Tue, 13 Jan 2026 09:46:06 -0800 Subject: [PATCH 003/167] KVM: x86: Assert that non-MSI doesn't have bypass vCPU when deleting producer When disconnecting a non-MSI irqfd from an IRQ bypass producer, WARN if the irqfd is configured for IRQ bypass and set its IRTE back to remapped mode to harden against kernel/KVM bugs (keeping the irqfd in bypass mode is often fatal to the host). Deactivating an irqfd (removing it from the list of irqfds), updating irqfd routes, and the code in question are all mutually exclusive (all run under irqfds.lock). If an irqfd is configured for bypass, and the irqfd is deassigned at the same time IRQ routing is updated (to change the routing to non-MSI), then either kvm_arch_update_irqfd_routing() should process the irqfd routing change and put the IRTE into remapped mode (routing update "wins"), or kvm_arch_irq_bypass_del_producer() should see the MSI routing info (deactivation "wins"). Link: https://patch.msgid.link/20260113174606.104978-3-seanjc@google.com Signed-off-by: Sean Christopherson --- arch/x86/kvm/irq.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/arch/x86/kvm/irq.c b/arch/x86/kvm/irq.c index 7cc8950005b6..4c7688670c2d 100644 --- a/arch/x86/kvm/irq.c +++ b/arch/x86/kvm/irq.c @@ -514,7 +514,8 @@ void kvm_arch_irq_bypass_del_producer(struct irq_bypass_consumer *cons, */ spin_lock_irq(&kvm->irqfds.lock); - if (irqfd->irq_entry.type == KVM_IRQ_ROUTING_MSI) { + if (irqfd->irq_entry.type == KVM_IRQ_ROUTING_MSI || + WARN_ON_ONCE(irqfd->irq_bypass_vcpu)) { ret = kvm_pi_update_irte(irqfd, NULL); if (ret) pr_info("irq bypass consumer (eventfd %p) unregistration fails: %d\n", From 830e0bef79aaaea8b1ef426b8032e70c63a58653 Mon Sep 17 00:00:00 2001 From: "leobannocloutier@gmail.com" Date: Fri, 16 Jan 2026 20:53:15 -0500 Subject: [PATCH 004/167] hwmon: (dell-smm) Add Dell G15 5510 to fan control whitelist On the Dell G15 5510, fans spin at maximum speed when AC power is connected. This behavior has been observed as a regression in recent kernels (v6.18+). Add the Dell G15 5510 to the fan control whitelist to enable manual fan control and resolve the issue. This model requires the same fan control configuration as the Dell G15 5511. Fixes: 1c1658058c99 ("hwmon: (dell-smm) Add support for automatic fan mode") Signed-off-by: Leo Banno-Cloutier Link: https://lore.kernel.org/r/20260117015315.214569-2-leobannocloutier@gmail.com [groeck: Updated patch description to follow guidance] Signed-off-by: Guenter Roeck --- drivers/hwmon/dell-smm-hwmon.c | 8 ++++++++ 1 file changed, 8 insertions(+) diff --git a/drivers/hwmon/dell-smm-hwmon.c b/drivers/hwmon/dell-smm-hwmon.c index 6040a8940674..93143cfc157c 100644 --- a/drivers/hwmon/dell-smm-hwmon.c +++ b/drivers/hwmon/dell-smm-hwmon.c @@ -1639,6 +1639,14 @@ static const struct dmi_system_id i8k_whitelist_fan_control[] __initconst = { }, .driver_data = (void *)&i8k_fan_control_data[I8K_FAN_30A3_31A3], }, + { + .ident = "Dell G15 5510", + .matches = { + DMI_MATCH(DMI_SYS_VENDOR, "Dell Inc."), + DMI_EXACT_MATCH(DMI_PRODUCT_NAME, "Dell G15 5510"), + }, + .driver_data = (void *)&i8k_fan_control_data[I8K_FAN_30A3_31A3], + }, { .ident = "Dell G15 5511", .matches = { From 731bb3118f859d2a68444a9ae580681522d32bc0 Mon Sep 17 00:00:00 2001 From: Dan Williams Date: Thu, 22 Jan 2026 16:35:56 -0800 Subject: [PATCH 005/167] Revert "PCI/TSM: Report active IDE streams" The proposed ABI failed to account for multiple host bridges with the same stream name. The fix needs to namespace streams or otherwise link back to the host bridge, but a change like that is too big for a fix. Given this ABI never saw a released kernel, delete it for now and bring it back later with this issue addressed. Reported-by: Xu Yilun Reported-by: Yi Lai Closes: http://lore.kernel.org/20251223085601.2607455-1-yilun.xu@linux.intel.com Link: http://patch.msgid.link/6972c872acbb9_1d3310035@dwillia2-mobl4.notmuch Signed-off-by: Dan Williams --- Documentation/ABI/testing/sysfs-class-tsm | 10 -------- drivers/pci/ide.c | 4 ---- drivers/virt/coco/tsm-core.c | 28 ----------------------- include/linux/pci-ide.h | 2 -- include/linux/tsm.h | 3 --- 5 files changed, 47 deletions(-) diff --git a/Documentation/ABI/testing/sysfs-class-tsm b/Documentation/ABI/testing/sysfs-class-tsm index 6fc1a5ac6da1..2949468deaf7 100644 --- a/Documentation/ABI/testing/sysfs-class-tsm +++ b/Documentation/ABI/testing/sysfs-class-tsm @@ -7,13 +7,3 @@ Description: signals when the PCI layer is able to support establishment of link encryption and other device-security features coordinated through a platform tsm. - -What: /sys/class/tsm/tsmN/streamH.R.E -Contact: linux-pci@vger.kernel.org -Description: - (RO) When a host bridge has established a secure connection via - the platform TSM, symlink appears. The primary function of this - is have a system global review of TSM resource consumption - across host bridges. The link points to the endpoint PCI device - and matches the same link published by the host bridge. See - Documentation/ABI/testing/sysfs-devices-pci-host-bridge. diff --git a/drivers/pci/ide.c b/drivers/pci/ide.c index f0ef474e1a0d..280941b05969 100644 --- a/drivers/pci/ide.c +++ b/drivers/pci/ide.c @@ -11,7 +11,6 @@ #include #include #include -#include #include "pci.h" @@ -373,9 +372,6 @@ void pci_ide_stream_release(struct pci_ide *ide) if (ide->partner[PCI_IDE_EP].enable) pci_ide_stream_disable(pdev, ide); - if (ide->tsm_dev) - tsm_ide_stream_unregister(ide); - if (ide->partner[PCI_IDE_RP].setup) pci_ide_stream_teardown(rp, ide); diff --git a/drivers/virt/coco/tsm-core.c b/drivers/virt/coco/tsm-core.c index f027876a2f19..0e705f3067a1 100644 --- a/drivers/virt/coco/tsm-core.c +++ b/drivers/virt/coco/tsm-core.c @@ -4,13 +4,11 @@ #define pr_fmt(fmt) KBUILD_MODNAME ": " fmt #include -#include #include #include #include #include #include -#include static struct class *tsm_class; static DECLARE_RWSEM(tsm_rwsem); @@ -108,32 +106,6 @@ void tsm_unregister(struct tsm_dev *tsm_dev) } EXPORT_SYMBOL_GPL(tsm_unregister); -/* must be invoked between tsm_register / tsm_unregister */ -int tsm_ide_stream_register(struct pci_ide *ide) -{ - struct pci_dev *pdev = ide->pdev; - struct pci_tsm *tsm = pdev->tsm; - struct tsm_dev *tsm_dev = tsm->tsm_dev; - int rc; - - rc = sysfs_create_link(&tsm_dev->dev.kobj, &pdev->dev.kobj, ide->name); - if (rc) - return rc; - - ide->tsm_dev = tsm_dev; - return 0; -} -EXPORT_SYMBOL_GPL(tsm_ide_stream_register); - -void tsm_ide_stream_unregister(struct pci_ide *ide) -{ - struct tsm_dev *tsm_dev = ide->tsm_dev; - - ide->tsm_dev = NULL; - sysfs_remove_link(&tsm_dev->dev.kobj, ide->name); -} -EXPORT_SYMBOL_GPL(tsm_ide_stream_unregister); - static void tsm_release(struct device *dev) { struct tsm_dev *tsm_dev = container_of(dev, typeof(*tsm_dev), dev); diff --git a/include/linux/pci-ide.h b/include/linux/pci-ide.h index 37a1ad9501b0..5d4d56ed088d 100644 --- a/include/linux/pci-ide.h +++ b/include/linux/pci-ide.h @@ -82,7 +82,6 @@ struct pci_ide_regs { * @host_bridge_stream: allocated from host bridge @ide_stream_ida pool * @stream_id: unique Stream ID (within Partner Port pairing) * @name: name of the established Selective IDE Stream in sysfs - * @tsm_dev: For TSM established IDE, the TSM device context * * Negative @stream_id values indicate "uninitialized" on the * expectation that with TSM established IDE the TSM owns the stream_id @@ -94,7 +93,6 @@ struct pci_ide { u8 host_bridge_stream; int stream_id; const char *name; - struct tsm_dev *tsm_dev; }; /* diff --git a/include/linux/tsm.h b/include/linux/tsm.h index a3b7ab668eff..22e05b2aac69 100644 --- a/include/linux/tsm.h +++ b/include/linux/tsm.h @@ -123,7 +123,4 @@ int tsm_report_unregister(const struct tsm_report_ops *ops); struct tsm_dev *tsm_register(struct device *parent, struct pci_tsm_ops *ops); void tsm_unregister(struct tsm_dev *tsm_dev); struct tsm_dev *find_tsm_dev(int id); -struct pci_ide; -int tsm_ide_stream_register(struct pci_ide *ide); -void tsm_ide_stream_unregister(struct pci_ide *ide); #endif /* __TSM_H */ From 8370af2019dee9ca004ca7c5e36b1f629ecb1e39 Mon Sep 17 00:00:00 2001 From: Li Ming Date: Wed, 14 Jan 2026 19:14:55 +0800 Subject: [PATCH 006/167] PCI/IDE: Fix off by one error calculating VF RID range The VF ID range of an SR-IOV device is [0, num_VFs - 1]. pci_ide_stream_alloc() mistakenly uses num_VFs to represent the last ID. Fix that off by one error to stay in bounds of the range. Fixes: 1e4d2ff3ae45 ("PCI/IDE: Add IDE establishment helpers") Signed-off-by: Li Ming Reviewed-by: Xu Yilun Link: https://patch.msgid.link/20260114111455.550984-1-ming.li@zohomail.com Signed-off-by: Dan Williams --- drivers/pci/ide.c | 4 ++-- include/linux/pci-ide.h | 2 +- 2 files changed, 3 insertions(+), 3 deletions(-) diff --git a/drivers/pci/ide.c b/drivers/pci/ide.c index 280941b05969..fcceb518c64e 100644 --- a/drivers/pci/ide.c +++ b/drivers/pci/ide.c @@ -282,8 +282,8 @@ struct pci_ide *pci_ide_stream_alloc(struct pci_dev *pdev) /* for SR-IOV case, cover all VFs */ num_vf = pci_num_vf(pdev); if (num_vf) - rid_end = PCI_DEVID(pci_iov_virtfn_bus(pdev, num_vf), - pci_iov_virtfn_devfn(pdev, num_vf)); + rid_end = PCI_DEVID(pci_iov_virtfn_bus(pdev, num_vf - 1), + pci_iov_virtfn_devfn(pdev, num_vf - 1)); else rid_end = pci_dev_id(pdev); diff --git a/include/linux/pci-ide.h b/include/linux/pci-ide.h index 5d4d56ed088d..ae07d9f699c0 100644 --- a/include/linux/pci-ide.h +++ b/include/linux/pci-ide.h @@ -26,7 +26,7 @@ enum pci_ide_partner_select { /** * struct pci_ide_partner - Per port pair Selective IDE Stream settings * @rid_start: Partner Port Requester ID range start - * @rid_end: Partner Port Requester ID range end + * @rid_end: Partner Port Requester ID range end (inclusive) * @stream_index: Selective IDE Stream Register Block selection * @mem_assoc: PCI bus memory address association for targeting peer partner * @pref_assoc: PCI bus prefetchable memory address association for From 0b50f116af5e29724b756a5ee6ae268deaae2d29 Mon Sep 17 00:00:00 2001 From: Li Ming Date: Sun, 11 Jan 2026 15:38:23 +0800 Subject: [PATCH 007/167] PCI/IDE: Fix reading a wrong reg for unused sel stream initialization During pci_ide_init(), it will write PCI_ID_RESERVED_STREAM_ID into all unused selective IDE stream blocks. In a selective IDE stream block, IDE stream ID field is in selective IDE stream control register instead of selective IDE stream capability register. Fixes: 079115370d00 ("PCI/IDE: Initialize an ID for all IDE streams") Signed-off-by: Li Ming Acked-by: Bjorn Helgaas Reviewed-by: Xu Yilun Link: https://patch.msgid.link/20260111073823.486665-1-ming.li@zohomail.com Signed-off-by: Dan Williams --- drivers/pci/ide.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/pci/ide.c b/drivers/pci/ide.c index fcceb518c64e..23f554490539 100644 --- a/drivers/pci/ide.c +++ b/drivers/pci/ide.c @@ -167,7 +167,7 @@ void pci_ide_init(struct pci_dev *pdev) for (u16 i = 0; i < nr_streams; i++) { int pos = __sel_ide_offset(ide_cap, nr_link_ide, i, nr_ide_mem); - pci_read_config_dword(pdev, pos + PCI_IDE_SEL_CAP, &val); + pci_read_config_dword(pdev, pos + PCI_IDE_SEL_CTL, &val); if (val & PCI_IDE_SEL_CTL_EN) continue; val &= ~PCI_IDE_SEL_CTL_ID; From e396a74222654486d6ab45dca5d0c54c408b8b91 Mon Sep 17 00:00:00 2001 From: Zhiquan Li Date: Thu, 22 Jan 2026 13:35:50 +0800 Subject: [PATCH 008/167] KVM: selftests: Add -U_FORTIFY_SOURCE to avoid some unpredictable test failures Some distributions (such as Ubuntu) configure GCC so that _FORTIFY_SOURCE is automatically enabled at -O1 or above. This results in some fortified version of definitions of standard library functions are included. While linker resolves the symbols, the fortified versions might override the definitions in lib/string_override.c and reference to those PLT entries in GLIBC. This is not a problem for the code in host, but it is a disaster for the guest code. E.g., if build and run x86/nested_emulation_test on Ubuntu 24.04 will encounter a L1 #PF due to memset() reference to __memset_chk@plt. The option -fno-builtin-memset is not helpful here, because those fortified versions are not built-in but some definitions which are included by header, they are for different intentions. In order to eliminate the unpredictable behaviors may vary depending on the linker and platform, add the "-U_FORTIFY_SOURCE" into CFLAGS to prevent from introducing the fortified definitions. Signed-off-by: Zhiquan Li Link: https://patch.msgid.link/20260122053551.548229-1-zhiquan_li@163.com Fixes: 6b6f71484bf4 ("KVM: selftests: Implement memcmp(), memcpy(), and memset() for guest use") Cc: stable@vger.kernel.org [sean: tag for stable] Signed-off-by: Sean Christopherson --- tools/testing/selftests/kvm/Makefile.kvm | 1 + 1 file changed, 1 insertion(+) diff --git a/tools/testing/selftests/kvm/Makefile.kvm b/tools/testing/selftests/kvm/Makefile.kvm index ba5c2b643efa..d45bf4ccb3bf 100644 --- a/tools/testing/selftests/kvm/Makefile.kvm +++ b/tools/testing/selftests/kvm/Makefile.kvm @@ -251,6 +251,7 @@ LINUX_TOOL_INCLUDE = $(top_srcdir)/tools/include LINUX_TOOL_ARCH_INCLUDE = $(top_srcdir)/tools/arch/$(ARCH)/include CFLAGS += -Wall -Wstrict-prototypes -Wuninitialized -O2 -g -std=gnu99 \ -Wno-gnu-variable-sized-type-not-at-end -MD -MP -DCONFIG_64BIT \ + -U_FORTIFY_SOURCE \ -fno-builtin-memcmp -fno-builtin-memcpy \ -fno-builtin-memset -fno-builtin-strnlen \ -fno-stack-protector -fno-PIE -fno-strict-aliasing \ From 894148a25aebeaeb7ec397fdf1a1f1958d55496d Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Thomas=20Wei=C3=9Fschuh?= Date: Fri, 23 Jan 2026 13:09:00 -0800 Subject: [PATCH 009/167] coco/tsm: Remove unused variable tsm_rwsem MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit This variable is and was never used, remove it. Fixes: 603c646f0010 ("coco/tsm: Introduce a core device for TEE Security Managers") Signed-off-by: Thomas Weißschuh Link: https://patch.msgid.link/20260120-coco-tsm_rwsem-v1-1-125059fe2f69@linutronix.de Signed-off-by: Dan Williams --- drivers/virt/coco/tsm-core.c | 2 -- 1 file changed, 2 deletions(-) diff --git a/drivers/virt/coco/tsm-core.c b/drivers/virt/coco/tsm-core.c index 0e705f3067a1..8712df8596a1 100644 --- a/drivers/virt/coco/tsm-core.c +++ b/drivers/virt/coco/tsm-core.c @@ -4,14 +4,12 @@ #define pr_fmt(fmt) KBUILD_MODNAME ": " fmt #include -#include #include #include #include #include static struct class *tsm_class; -static DECLARE_RWSEM(tsm_rwsem); static DEFINE_IDA(tsm_ida); static int match_id(struct device *dev, const void *data) From 473dd9ecb3fd139ffebd6f7a9f73fb73d85fe2aa Mon Sep 17 00:00:00 2001 From: Shawn Guo Date: Wed, 21 Jan 2026 22:20:47 +0800 Subject: [PATCH 010/167] MAINTAINERS: Replace Shawn with Frank as i.MX platform maintainer Shawn is no longer interested in maintaining i.MX platform, and would like to step down. On the other hand, Frank seems to be the best successor for this role. - He has been one of the most outstanding contributors to i.MX platform in the recent years. - He has been actively working as a co-maintainer reviewing i.MX patches and keep the platform support in good shape. - He works for NXP and could be the bridge and coordinator between NXP internal developers and community contributors. Acked-by: Peng Fan Acked-by: Daniel Baluta Reviewed-by: Fabio Estevam Acked-by: Dong Aisheng Reviewed-by: Frank Li Signed-off-by: Shawn Guo --- MAINTAINERS | 14 +++++++------- 1 file changed, 7 insertions(+), 7 deletions(-) diff --git a/MAINTAINERS b/MAINTAINERS index 5b11839cba9d..a294b4e952a9 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -2745,14 +2745,14 @@ F: arch/arm/include/asm/hardware/dec21285.h F: arch/arm/mach-footbridge/ ARM/FREESCALE IMX / MXC ARM ARCHITECTURE -M: Shawn Guo +M: Frank Li M: Sascha Hauer R: Pengutronix Kernel Team R: Fabio Estevam L: imx@lists.linux.dev L: linux-arm-kernel@lists.infradead.org (moderated for non-subscribers) S: Maintained -T: git git://git.kernel.org/pub/scm/linux/kernel/git/shawnguo/linux.git +T: git git://git.kernel.org/pub/scm/linux/kernel/git/frank.li/linux.git F: Documentation/devicetree/bindings/firmware/fsl* F: Documentation/devicetree/bindings/firmware/nxp* F: arch/arm/boot/dts/nxp/imx/ @@ -2767,22 +2767,22 @@ N: mxs N: \bmxc[^\d] ARM/FREESCALE LAYERSCAPE ARM ARCHITECTURE -M: Shawn Guo +M: Frank Li L: linux-arm-kernel@lists.infradead.org (moderated for non-subscribers) S: Maintained -T: git git://git.kernel.org/pub/scm/linux/kernel/git/shawnguo/linux.git +T: git git://git.kernel.org/pub/scm/linux/kernel/git/frank.li/linux.git F: arch/arm/boot/dts/nxp/ls/ F: arch/arm64/boot/dts/freescale/fsl-* F: arch/arm64/boot/dts/freescale/qoriq-* ARM/FREESCALE VYBRID ARM ARCHITECTURE -M: Shawn Guo +M: Frank Li M: Sascha Hauer R: Pengutronix Kernel Team R: Stefan Agner L: linux-arm-kernel@lists.infradead.org (moderated for non-subscribers) S: Maintained -T: git git://git.kernel.org/pub/scm/linux/kernel/git/shawnguo/linux.git +T: git git://git.kernel.org/pub/scm/linux/kernel/git/frank.li/linux.git F: arch/arm/boot/dts/nxp/vf/ F: arch/arm/mach-imx/*vf610* @@ -20562,7 +20562,7 @@ F: drivers/pinctrl/pinctrl-amd.c PIN CONTROLLER - FREESCALE M: Dong Aisheng M: Fabio Estevam -M: Shawn Guo +M: Frank Li M: Jacky Bai R: Pengutronix Kernel Team R: NXP S32 Linux Team From 41399c5d476156635c9a58de870d39318e22fa09 Mon Sep 17 00:00:00 2001 From: Guodong Xu Date: Thu, 22 Jan 2026 17:43:42 +0800 Subject: [PATCH 011/167] regulator: spacemit-p1: Fix n_voltages for BUCK and LDO regulators Higher voltage settings were unusable due to incorrect n_voltages values causing registration failures. For example, setting aldo4 to 3.3V failed with -EINVAL because the required selector (123) exceeded the allowed range (n_voltages=117). Fix by aligning n_voltages with the hardware register widths per the P1 datasheet [1]: - BUCK: 255 (was 254), allows selectors 0-254, selector 255 is reserved - LDO: 128 (was 117), allows selectors 0-127, selectors 0-10 are for suspend mode, valid operational range is 11-127 This enables the full voltage range supported by the hardware. Fixes: 8b84d712ad84 ("regulator: spacemit: support SpacemiT P1 regulators") Link: https://developer.spacemit.com/documentation [1] Signed-off-by: Guodong Xu Link: https://patch.msgid.link/20260122-spacemit-p1-v1-1-309be27fbff9@riscstar.com Signed-off-by: Mark Brown --- drivers/regulator/spacemit-p1.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/drivers/regulator/spacemit-p1.c b/drivers/regulator/spacemit-p1.c index 2bf9137e12b1..2b585ba01a93 100644 --- a/drivers/regulator/spacemit-p1.c +++ b/drivers/regulator/spacemit-p1.c @@ -87,13 +87,13 @@ static const struct linear_range p1_ldo_ranges[] = { } #define P1_BUCK_DESC(_n) \ - P1_REG_DESC(BUCK, buck, _n, "vin", 0x47, BUCK_MASK, 254, p1_buck_ranges) + P1_REG_DESC(BUCK, buck, _n, "vin", 0x47, BUCK_MASK, 255, p1_buck_ranges) #define P1_ALDO_DESC(_n) \ - P1_REG_DESC(ALDO, aldo, _n, "vin", 0x5b, LDO_MASK, 117, p1_ldo_ranges) + P1_REG_DESC(ALDO, aldo, _n, "vin", 0x5b, LDO_MASK, 128, p1_ldo_ranges) #define P1_DLDO_DESC(_n) \ - P1_REG_DESC(DLDO, dldo, _n, "buck5", 0x67, LDO_MASK, 117, p1_ldo_ranges) + P1_REG_DESC(DLDO, dldo, _n, "buck5", 0x67, LDO_MASK, 128, p1_ldo_ranges) static const struct regulator_desc p1_regulator_desc[] = { P1_BUCK_DESC(1), From 43b0b7eff4b3fb684f257d5a24376782e9663465 Mon Sep 17 00:00:00 2001 From: "Rafael J. Wysocki" Date: Tue, 20 Jan 2026 16:43:44 +0100 Subject: [PATCH 012/167] platform/x86: panasonic-laptop: Fix sysfs group leak in error path MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit The acpi_pcc_hotkey_add() error path leaks sysfs group pcc_attr_group if platform_device_register_simple() fails for the "panasonic" platform device. Address this by making it call sysfs_remove_group() in that case for the group in question. Signed-off-by: Rafael J. Wysocki Link: https://patch.msgid.link/3398370.44csPzL39Z@rafael.j.wysocki Reviewed-by: Ilpo Järvinen Signed-off-by: Ilpo Järvinen --- drivers/platform/x86/panasonic-laptop.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/drivers/platform/x86/panasonic-laptop.c b/drivers/platform/x86/panasonic-laptop.c index 255317e6fec8..937f1a5b78ed 100644 --- a/drivers/platform/x86/panasonic-laptop.c +++ b/drivers/platform/x86/panasonic-laptop.c @@ -1089,7 +1089,7 @@ static int acpi_pcc_hotkey_add(struct acpi_device *device) PLATFORM_DEVID_NONE, NULL, 0); if (IS_ERR(pcc->platform)) { result = PTR_ERR(pcc->platform); - goto out_backlight; + goto out_sysfs; } result = device_create_file(&pcc->platform->dev, &dev_attr_cdpower); @@ -1105,6 +1105,8 @@ static int acpi_pcc_hotkey_add(struct acpi_device *device) out_platform: platform_device_unregister(pcc->platform); +out_sysfs: + sysfs_remove_group(&device->dev.kobj, &pcc_attr_group); out_backlight: backlight_device_unregister(pcc->backlight); out_input: From 128497456756e1b952bd5a912cd073836465109d Mon Sep 17 00:00:00 2001 From: "Rafael J. Wysocki" Date: Mon, 26 Jan 2026 16:38:45 +0200 Subject: [PATCH 013/167] platform/x86: toshiba_haps: Fix memory leaks in add/remove routines toshiba_haps_add() leaks the haps object allocated by it if it returns an error after allocating that object successfully. toshiba_haps_remove() does not free the object pointed to by toshiba_haps before clearing that pointer, so it becomes unreachable allocated memory. Address these memory leaks by using devm_kzalloc() for allocating the memory in question. Fixes: 23d0ba0c908a ("platform/x86: Toshiba HDD Active Protection Sensor") Signed-off-by: Rafael J. Wysocki --- drivers/platform/x86/toshiba_haps.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/platform/x86/toshiba_haps.c b/drivers/platform/x86/toshiba_haps.c index 03dfddeee0c0..e9324bf16aea 100644 --- a/drivers/platform/x86/toshiba_haps.c +++ b/drivers/platform/x86/toshiba_haps.c @@ -183,7 +183,7 @@ static int toshiba_haps_add(struct acpi_device *acpi_dev) pr_info("Toshiba HDD Active Protection Sensor device\n"); - haps = kzalloc(sizeof(struct toshiba_haps_dev), GFP_KERNEL); + haps = devm_kzalloc(&acpi_dev->dev, sizeof(*haps), GFP_KERNEL); if (!haps) return -ENOMEM; From 98bdc485b281da7e20e67b13a8cc834956d68e9c Mon Sep 17 00:00:00 2001 From: "David E. Box" Date: Wed, 21 Jan 2026 18:21:08 -0800 Subject: [PATCH 014/167] platform/x86/intel/vsec: Add Nova Lake PUNIT support MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Add PCI ID for Nova Lake, supporting PUNIT telemetry. Signed-off-by: David E. Box Link: https://patch.msgid.link/20260122022110.3231344-1-david.e.box@linux.intel.com Signed-off-by: Ilpo Järvinen --- drivers/platform/x86/intel/vsec.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/drivers/platform/x86/intel/vsec.c b/drivers/platform/x86/intel/vsec.c index ecfc7703f201..012d87878afd 100644 --- a/drivers/platform/x86/intel/vsec.c +++ b/drivers/platform/x86/intel/vsec.c @@ -766,6 +766,7 @@ static const struct intel_vsec_platform_info lnl_info = { #define PCI_DEVICE_ID_INTEL_VSEC_LNL_M 0x647d #define PCI_DEVICE_ID_INTEL_VSEC_PTL 0xb07d #define PCI_DEVICE_ID_INTEL_VSEC_WCL 0xfd7d +#define PCI_DEVICE_ID_INTEL_VSEC_NVL 0xd70d static const struct pci_device_id intel_vsec_pci_ids[] = { { PCI_DEVICE_DATA(INTEL, VSEC_ADL, &tgl_info) }, { PCI_DEVICE_DATA(INTEL, VSEC_DG1, &dg1_info) }, @@ -778,6 +779,7 @@ static const struct pci_device_id intel_vsec_pci_ids[] = { { PCI_DEVICE_DATA(INTEL, VSEC_LNL_M, &lnl_info) }, { PCI_DEVICE_DATA(INTEL, VSEC_PTL, &mtl_info) }, { PCI_DEVICE_DATA(INTEL, VSEC_WCL, &mtl_info) }, + { PCI_DEVICE_DATA(INTEL, VSEC_NVL, &mtl_info) }, { } }; MODULE_DEVICE_TABLE(pci, intel_vsec_pci_ids); From 25e9e322d2ab5c03602eff4fbf4f7c40019d8de2 Mon Sep 17 00:00:00 2001 From: Kaushlendra Kumar Date: Wed, 24 Dec 2025 08:50:53 +0530 Subject: [PATCH 015/167] platform/x86: intel_telemetry: Fix swapped arrays in PSS output MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit The LTR blocking statistics and wakeup event counters are incorrectly cross-referenced during debugfs output rendering. The code populates pss_ltr_blkd[] with LTR blocking data and pss_s0ix_wakeup[] with wakeup data, but the display loops reference the wrong arrays. This causes the "LTR Blocking Status" section to print wakeup events and the "Wakes Status" section to print LTR blockers, misleading power management analysis and S0ix residency debugging. Fix by aligning array usage with the intended output section labels. Fixes: 87bee290998d ("platform:x86: Add Intel Telemetry Debugfs interfaces") Cc: stable@vger.kernel.org Signed-off-by: Kaushlendra Kumar Link: https://patch.msgid.link/20251224032053.3915900-1-kaushlendra.kumar@intel.com Reviewed-by: Ilpo Järvinen Signed-off-by: Ilpo Järvinen --- drivers/platform/x86/intel/telemetry/debugfs.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/platform/x86/intel/telemetry/debugfs.c b/drivers/platform/x86/intel/telemetry/debugfs.c index 70e5736c44c7..189c61ff7ff0 100644 --- a/drivers/platform/x86/intel/telemetry/debugfs.c +++ b/drivers/platform/x86/intel/telemetry/debugfs.c @@ -449,7 +449,7 @@ static int telem_pss_states_show(struct seq_file *s, void *unused) for (index = 0; index < debugfs_conf->pss_ltr_evts; index++) { seq_printf(s, "%-32s\t%u\n", debugfs_conf->pss_ltr_data[index].name, - pss_s0ix_wakeup[index]); + pss_ltr_blkd[index]); } seq_puts(s, "\n--------------------------------------\n"); @@ -459,7 +459,7 @@ static int telem_pss_states_show(struct seq_file *s, void *unused) for (index = 0; index < debugfs_conf->pss_wakeup_evts; index++) { seq_printf(s, "%-32s\t%u\n", debugfs_conf->pss_wakeup[index].name, - pss_ltr_blkd[index]); + pss_s0ix_wakeup[index]); } return 0; From 39e9c376ac42705af4ed4ae39eec028e8bced9b4 Mon Sep 17 00:00:00 2001 From: Kaushlendra Kumar Date: Wed, 24 Dec 2025 11:41:44 +0530 Subject: [PATCH 016/167] platform/x86: intel_telemetry: Fix PSS event register mask MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit The PSS telemetry info parsing incorrectly applies TELEM_INFO_SRAMEVTS_MASK when extracting event register count from firmware response. This reads bits 15-8 instead of the correct bits 7-0, causing misdetection of hardware capabilities. The IOSS path correctly uses TELEM_INFO_NENABLES_MASK for register count. Apply the same mask to PSS parsing for consistency. Fixes: 9d16b482b059 ("platform:x86: Add Intel telemetry platform driver") Signed-off-by: Kaushlendra Kumar Link: https://patch.msgid.link/20251224061144.3925519-1-kaushlendra.kumar@intel.com Reviewed-by: Ilpo Järvinen Signed-off-by: Ilpo Järvinen --- drivers/platform/x86/intel/telemetry/pltdrv.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/platform/x86/intel/telemetry/pltdrv.c b/drivers/platform/x86/intel/telemetry/pltdrv.c index f23c170a55dc..d9aa349f81e4 100644 --- a/drivers/platform/x86/intel/telemetry/pltdrv.c +++ b/drivers/platform/x86/intel/telemetry/pltdrv.c @@ -610,7 +610,7 @@ static int telemetry_setup(struct platform_device *pdev) /* Get telemetry Info */ events = (read_buf & TELEM_INFO_SRAMEVTS_MASK) >> TELEM_INFO_SRAMEVTS_SHIFT; - event_regs = read_buf & TELEM_INFO_SRAMEVTS_MASK; + event_regs = read_buf & TELEM_INFO_NENABLES_MASK; if ((events < TELEM_MAX_EVENTS_SRAM) || (event_regs < TELEM_MAX_EVENTS_SRAM)) { dev_err(&pdev->dev, "PSS:Insufficient Space for SRAM Trace\n"); From 2b4e00d8e70ca8736fda82447be6a4e323c6d1f5 Mon Sep 17 00:00:00 2001 From: gongqi <550230171hxy@gmail.com> Date: Thu, 22 Jan 2026 23:55:00 +0800 Subject: [PATCH 017/167] platform/x86/amd/pmc: Add quirk for MECHREVO Wujie 15X Pro MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit The MECHREVO Wujie 15X Pro suffers from spurious IRQ issues related to the AMD PMC. Add it to the quirk list to use the spurious_8042 fix. Signed-off-by: gongqi <550230171hxy@gmail.com> Link: https://patch.msgid.link/20260122155501.376199-4-550230171hxy@gmail.com Signed-off-by: Ilpo Järvinen --- drivers/platform/x86/amd/pmc/pmc-quirks.c | 7 +++++++ 1 file changed, 7 insertions(+) diff --git a/drivers/platform/x86/amd/pmc/pmc-quirks.c b/drivers/platform/x86/amd/pmc/pmc-quirks.c index 404e62ad293a..ed285afaf9b0 100644 --- a/drivers/platform/x86/amd/pmc/pmc-quirks.c +++ b/drivers/platform/x86/amd/pmc/pmc-quirks.c @@ -302,6 +302,13 @@ static const struct dmi_system_id fwbug_list[] = { DMI_MATCH(DMI_BOARD_NAME, "XxKK4NAx_XxSP4NAx"), } }, + { + .ident = "MECHREVO Wujie 15X Pro", + .driver_data = &quirk_spurious_8042, + .matches = { + DMI_MATCH(DMI_BOARD_NAME, "WUJIE Series-X5SP4NAG"), + } + }, {} }; From 662c9cb86fc322038647d8808e751f5c6c0cc13f Mon Sep 17 00:00:00 2001 From: Jonas Ringeis Date: Fri, 23 Jan 2026 23:54:23 +0100 Subject: [PATCH 018/167] platform/x86: lg-laptop: Recognize 2022-2025 models MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit The lg-laptop driver uses the DMI to identify the product year. Currently, the driver recognizes all models released after 2022 incorrectly as 2022. Update logic to handle model identifiers for years 2022-2025. Link: https://en.wikipedia.org/w/index.php?title=LG_Gram&oldid=1327931565#Comparison_of_Gram_models Signed-off-by: Jonas Ringeis Link: https://patch.msgid.link/20260123225503.493467-1-private@glitchdev.me Reviewed-by: Ilpo Järvinen Signed-off-by: Ilpo Järvinen --- drivers/platform/x86/lg-laptop.c | 11 ++++++++++- 1 file changed, 10 insertions(+), 1 deletion(-) diff --git a/drivers/platform/x86/lg-laptop.c b/drivers/platform/x86/lg-laptop.c index f92e89c75db9..61ef7a218a80 100644 --- a/drivers/platform/x86/lg-laptop.c +++ b/drivers/platform/x86/lg-laptop.c @@ -838,8 +838,17 @@ static int acpi_add(struct acpi_device *device) case 'P': year = 2021; break; - default: + case 'Q': year = 2022; + break; + case 'R': + year = 2023; + break; + case 'S': + year = 2024; + break; + default: + year = 2025; } break; default: From 8f589c9c3be539d6c2b393c82940c3783831082f Mon Sep 17 00:00:00 2001 From: Alice Ryhl Date: Mon, 29 Dec 2025 15:38:14 +0000 Subject: [PATCH 019/167] rust_binder: correctly handle FDA objects of length zero Fix a bug where an empty FDA (fd array) object with 0 fds would cause an out-of-bounds error. The previous implementation used `skip == 0` to mean "this is a pointer fixup", but 0 is also the correct skip length for an empty FDA. If the FDA is at the end of the buffer, then this results in an attempt to write 8-bytes out of bounds. This is caught and results in an EINVAL error being returned to userspace. The pattern of using `skip == 0` as a special value originates from the C-implementation of Binder. As part of fixing this bug, this pattern is replaced with a Rust enum. I considered the alternate option of not pushing a fixup when the length is zero, but I think it's cleaner to just get rid of the zero-is-special stuff. The root cause of this bug was diagnosed by Gemini CLI on first try. I used the following prompt: > There appears to be a bug in @drivers/android/binder/thread.rs where > the Fixups oob bug is triggered with 316 304 316 324. This implies > that we somehow ended up with a fixup where buffer A has a pointer to > buffer B, but the pointer is located at an index in buffer A that is > out of bounds. Please investigate the code to find the bug. You may > compare with @drivers/android/binder.c that implements this correctly. Cc: stable@vger.kernel.org Reported-by: DeepChirp Closes: https://github.com/waydroid/waydroid/issues/2157 Fixes: eafedbc7c050 ("rust_binder: add Rust Binder driver") Tested-by: DeepChirp Signed-off-by: Alice Ryhl Acked-by: Carlos Llamas Link: https://patch.msgid.link/20251229-fda-zero-v1-1-58a41cb0e7ec@google.com Signed-off-by: Greg Kroah-Hartman --- drivers/android/binder/thread.rs | 59 ++++++++++++++++++-------------- 1 file changed, 34 insertions(+), 25 deletions(-) diff --git a/drivers/android/binder/thread.rs b/drivers/android/binder/thread.rs index 1a8e6fdc0dc4..dcd47e10aeb8 100644 --- a/drivers/android/binder/thread.rs +++ b/drivers/android/binder/thread.rs @@ -69,17 +69,24 @@ struct ScatterGatherEntry { } /// This entry specifies that a fixup should happen at `target_offset` of the -/// buffer. If `skip` is nonzero, then the fixup is a `binder_fd_array_object` -/// and is applied later. Otherwise if `skip` is zero, then the size of the -/// fixup is `sizeof::()` and `pointer_value` is written to the buffer. -struct PointerFixupEntry { - /// The number of bytes to skip, or zero for a `binder_buffer_object` fixup. - skip: usize, - /// The translated pointer to write when `skip` is zero. - pointer_value: u64, - /// The offset at which the value should be written. The offset is relative - /// to the original buffer. - target_offset: usize, +/// buffer. +enum PointerFixupEntry { + /// A fixup for a `binder_buffer_object`. + Fixup { + /// The translated pointer to write. + pointer_value: u64, + /// The offset at which the value should be written. The offset is relative + /// to the original buffer. + target_offset: usize, + }, + /// A skip for a `binder_fd_array_object`. + Skip { + /// The number of bytes to skip. + skip: usize, + /// The offset at which the skip should happen. The offset is relative + /// to the original buffer. + target_offset: usize, + }, } /// Return type of `apply_and_validate_fixup_in_parent`. @@ -762,8 +769,7 @@ fn translate_object( parent_entry.fixup_min_offset = info.new_min_offset; parent_entry.pointer_fixups.push( - PointerFixupEntry { - skip: 0, + PointerFixupEntry::Fixup { pointer_value: buffer_ptr_in_user_space, target_offset: info.target_offset, }, @@ -807,9 +813,8 @@ fn translate_object( parent_entry .pointer_fixups .push( - PointerFixupEntry { + PointerFixupEntry::Skip { skip: fds_len, - pointer_value: 0, target_offset: info.target_offset, }, GFP_KERNEL, @@ -871,17 +876,21 @@ fn apply_sg(&self, alloc: &mut Allocation, sg_state: &mut ScatterGatherState) -> let mut reader = UserSlice::new(UserPtr::from_addr(sg_entry.sender_uaddr), sg_entry.length).reader(); for fixup in &mut sg_entry.pointer_fixups { - let fixup_len = if fixup.skip == 0 { - size_of::() - } else { - fixup.skip + let (fixup_len, fixup_offset) = match fixup { + PointerFixupEntry::Fixup { target_offset, .. } => { + (size_of::(), *target_offset) + } + PointerFixupEntry::Skip { + skip, + target_offset, + } => (*skip, *target_offset), }; - let target_offset_end = fixup.target_offset.checked_add(fixup_len).ok_or(EINVAL)?; - if fixup.target_offset < end_of_previous_fixup || offset_end < target_offset_end { + let target_offset_end = fixup_offset.checked_add(fixup_len).ok_or(EINVAL)?; + if fixup_offset < end_of_previous_fixup || offset_end < target_offset_end { pr_warn!( "Fixups oob {} {} {} {}", - fixup.target_offset, + fixup_offset, end_of_previous_fixup, offset_end, target_offset_end @@ -890,13 +899,13 @@ fn apply_sg(&self, alloc: &mut Allocation, sg_state: &mut ScatterGatherState) -> } let copy_off = end_of_previous_fixup; - let copy_len = fixup.target_offset - end_of_previous_fixup; + let copy_len = fixup_offset - end_of_previous_fixup; if let Err(err) = alloc.copy_into(&mut reader, copy_off, copy_len) { pr_warn!("Failed copying into alloc: {:?}", err); return Err(err.into()); } - if fixup.skip == 0 { - let res = alloc.write::(fixup.target_offset, &fixup.pointer_value); + if let PointerFixupEntry::Fixup { pointer_value, .. } = fixup { + let res = alloc.write::(fixup_offset, pointer_value); if let Err(err) = res { pr_warn!("Failed copying ptr into alloc: {:?}", err); return Err(err.into()); From 5e8a3d01544282e50d887d76f30d1496a0a53562 Mon Sep 17 00:00:00 2001 From: Carlos Llamas Date: Thu, 22 Jan 2026 18:02:02 +0000 Subject: [PATCH 020/167] binder: fix UAF in binder_netlink_report() Oneway transactions sent to frozen targets via binder_proc_transaction() return a BR_TRANSACTION_PENDING_FROZEN error but they are still treated as successful since the target is expected to thaw at some point. It is then not safe to access 't' after BR_TRANSACTION_PENDING_FROZEN errors as the transaction could have been consumed by the now thawed target. This is the case for binder_netlink_report() which derreferences 't' after a pending frozen error, as pointed out by the following KASAN report: ================================================================== BUG: KASAN: slab-use-after-free in binder_netlink_report.isra.0+0x694/0x6c8 Read of size 8 at addr ffff00000f98ba38 by task binder-util/522 CPU: 4 UID: 0 PID: 522 Comm: binder-util Not tainted 6.19.0-rc6-00015-gc03e9c42ae8f #1 PREEMPT Hardware name: linux,dummy-virt (DT) Call trace: binder_netlink_report.isra.0+0x694/0x6c8 binder_transaction+0x66e4/0x79b8 binder_thread_write+0xab4/0x4440 binder_ioctl+0x1fd4/0x2940 [...] Allocated by task 522: __kmalloc_cache_noprof+0x17c/0x50c binder_transaction+0x584/0x79b8 binder_thread_write+0xab4/0x4440 binder_ioctl+0x1fd4/0x2940 [...] Freed by task 488: kfree+0x1d0/0x420 binder_free_transaction+0x150/0x234 binder_thread_read+0x2d08/0x3ce4 binder_ioctl+0x488/0x2940 [...] ================================================================== Instead, make a transaction copy so the data can be safely accessed by binder_netlink_report() after a pending frozen error. While here, add a comment about not using t->buffer in binder_netlink_report(). Cc: stable@vger.kernel.org Fixes: 63740349eba7 ("binder: introduce transaction reports via netlink") Signed-off-by: Carlos Llamas Reviewed-by: Alice Ryhl Link: https://patch.msgid.link/20260122180203.1502637-1-cmllamas@google.com Signed-off-by: Greg Kroah-Hartman --- drivers/android/binder.c | 14 +++++++++++++- 1 file changed, 13 insertions(+), 1 deletion(-) diff --git a/drivers/android/binder.c b/drivers/android/binder.c index 535fc881c8da..4e16438eceb7 100644 --- a/drivers/android/binder.c +++ b/drivers/android/binder.c @@ -2991,6 +2991,10 @@ static void binder_set_txn_from_error(struct binder_transaction *t, int id, * @t: the binder transaction that failed * @data_size: the user provided data size for the transaction * @error: enum binder_driver_return_protocol returned to sender + * + * Note that t->buffer is not safe to access here, as it may have been + * released (or not yet allocated). Callers should guarantee all the + * transaction items used here are safe to access. */ static void binder_netlink_report(struct binder_proc *proc, struct binder_transaction *t, @@ -3780,6 +3784,14 @@ static void binder_transaction(struct binder_proc *proc, goto err_dead_proc_or_thread; } } else { + /* + * Make a transaction copy. It is not safe to access 't' after + * binder_proc_transaction() reported a pending frozen. The + * target could thaw and consume the transaction at any point. + * Instead, use a safe 't_copy' for binder_netlink_report(). + */ + struct binder_transaction t_copy = *t; + BUG_ON(target_node == NULL); BUG_ON(t->buffer->async_transaction != 1); return_error = binder_proc_transaction(t, target_proc, NULL); @@ -3790,7 +3802,7 @@ static void binder_transaction(struct binder_proc *proc, */ if (return_error == BR_TRANSACTION_PENDING_FROZEN) { tcomplete->type = BINDER_WORK_TRANSACTION_PENDING; - binder_netlink_report(proc, t, tr->data_size, + binder_netlink_report(proc, &t_copy, tr->data_size, return_error); } binder_enqueue_thread_work(thread, tcomplete); From d047248190d86a52164656d47bec9bfba61dc71e Mon Sep 17 00:00:00 2001 From: Alice Ryhl Date: Fri, 23 Jan 2026 16:23:56 +0000 Subject: [PATCH 021/167] rust_binder: add additional alignment checks This adds some alignment checks to match C Binder more closely. This causes the driver to reject more transactions. I don't think any of the transactions in question are harmful, but it's still a bug because it's the wrong uapi to accept them. The cases where usize is changed for u64, it will affect only 32-bit kernels. Cc: stable@vger.kernel.org Fixes: eafedbc7c050 ("rust_binder: add Rust Binder driver") Signed-off-by: Alice Ryhl Acked-by: Carlos Llamas Link: https://patch.msgid.link/20260123-binder-alignment-more-checks-v1-1-7e1cea77411d@google.com Signed-off-by: Greg Kroah-Hartman --- drivers/android/binder/thread.rs | 50 +++++++++++++++++++++++--------- 1 file changed, 36 insertions(+), 14 deletions(-) diff --git a/drivers/android/binder/thread.rs b/drivers/android/binder/thread.rs index dcd47e10aeb8..e0ea33ccfe58 100644 --- a/drivers/android/binder/thread.rs +++ b/drivers/android/binder/thread.rs @@ -39,6 +39,10 @@ sync::atomic::{AtomicU32, Ordering}, }; +fn is_aligned(value: usize, to: usize) -> bool { + value % to == 0 +} + /// Stores the layout of the scatter-gather entries. This is used during the `translate_objects` /// call and is discarded when it returns. struct ScatterGatherState { @@ -795,6 +799,10 @@ fn translate_object( let num_fds = usize::try_from(obj.num_fds).map_err(|_| EINVAL)?; let fds_len = num_fds.checked_mul(size_of::()).ok_or(EINVAL)?; + if !is_aligned(parent_offset, size_of::()) { + return Err(EINVAL.into()); + } + let info = sg_state.validate_parent_fixup(parent_index, parent_offset, fds_len)?; view.alloc.info_add_fd_reserve(num_fds)?; @@ -809,6 +817,10 @@ fn translate_object( } }; + if !is_aligned(parent_entry.sender_uaddr, size_of::()) { + return Err(EINVAL.into()); + } + parent_entry.fixup_min_offset = info.new_min_offset; parent_entry .pointer_fixups @@ -825,6 +837,7 @@ fn translate_object( .sender_uaddr .checked_add(parent_offset) .ok_or(EINVAL)?; + let mut fda_bytes = KVec::new(); UserSlice::new(UserPtr::from_addr(fda_uaddr as _), fds_len) .read_all(&mut fda_bytes, GFP_KERNEL)?; @@ -958,25 +971,30 @@ pub(crate) fn copy_transaction_data( let data_size = trd.data_size.try_into().map_err(|_| EINVAL)?; let aligned_data_size = ptr_align(data_size).ok_or(EINVAL)?; - let offsets_size = trd.offsets_size.try_into().map_err(|_| EINVAL)?; - let aligned_offsets_size = ptr_align(offsets_size).ok_or(EINVAL)?; - let buffers_size = tr.buffers_size.try_into().map_err(|_| EINVAL)?; - let aligned_buffers_size = ptr_align(buffers_size).ok_or(EINVAL)?; + let offsets_size: usize = trd.offsets_size.try_into().map_err(|_| EINVAL)?; + let buffers_size: usize = tr.buffers_size.try_into().map_err(|_| EINVAL)?; let aligned_secctx_size = match secctx.as_ref() { Some((_offset, ctx)) => ptr_align(ctx.len()).ok_or(EINVAL)?, None => 0, }; + if !is_aligned(offsets_size, size_of::()) { + return Err(EINVAL.into()); + } + if !is_aligned(buffers_size, size_of::()) { + return Err(EINVAL.into()); + } + // This guarantees that at least `sizeof(usize)` bytes will be allocated. let len = usize::max( aligned_data_size - .checked_add(aligned_offsets_size) - .and_then(|sum| sum.checked_add(aligned_buffers_size)) + .checked_add(offsets_size) + .and_then(|sum| sum.checked_add(buffers_size)) .and_then(|sum| sum.checked_add(aligned_secctx_size)) .ok_or(ENOMEM)?, - size_of::(), + size_of::(), ); - let secctx_off = aligned_data_size + aligned_offsets_size + aligned_buffers_size; + let secctx_off = aligned_data_size + offsets_size + buffers_size; let mut alloc = match to_process.buffer_alloc(debug_id, len, is_oneway, self.process.task.pid()) { Ok(alloc) => alloc, @@ -1008,13 +1026,13 @@ pub(crate) fn copy_transaction_data( } let offsets_start = aligned_data_size; - let offsets_end = aligned_data_size + aligned_offsets_size; + let offsets_end = aligned_data_size + offsets_size; // This state is used for BINDER_TYPE_PTR objects. let sg_state = sg_state.insert(ScatterGatherState { unused_buffer_space: UnusedBufferSpace { offset: offsets_end, - limit: len, + limit: offsets_end + buffers_size, }, sg_entries: KVec::new(), ancestors: KVec::new(), @@ -1023,12 +1041,16 @@ pub(crate) fn copy_transaction_data( // Traverse the objects specified. let mut view = AllocationView::new(&mut alloc, data_size); for (index, index_offset) in (offsets_start..offsets_end) - .step_by(size_of::()) + .step_by(size_of::()) .enumerate() { - let offset = view.alloc.read(index_offset)?; + let offset: usize = view + .alloc + .read::(index_offset)? + .try_into() + .map_err(|_| EINVAL)?; - if offset < end_of_previous_object { + if offset < end_of_previous_object || !is_aligned(offset, size_of::()) { pr_warn!("Got transaction with invalid offset."); return Err(EINVAL.into()); } @@ -1060,7 +1082,7 @@ pub(crate) fn copy_transaction_data( } // Update the indexes containing objects to clean up. - let offset_after_object = index_offset + size_of::(); + let offset_after_object = index_offset + size_of::(); view.alloc .set_info_offsets(offsets_start..offset_after_object); } From 1769f90e5ba2a6d24bb46b85da33fe861c68f005 Mon Sep 17 00:00:00 2001 From: Carlos Llamas Date: Fri, 23 Jan 2026 17:57:02 +0000 Subject: [PATCH 022/167] binder: fix BR_FROZEN_REPLY error log The error logging for failed transactions is misleading as it always reports "dead process or thread" even when the target is actually frozen. Additionally, the pid and tid are reversed which can further confuse debugging efforts. Fix both issues. Cc: stable@kernel.org Cc: Steven Moreland Fixes: a15dac8b2286 ("binder: additional transaction error logs") Signed-off-by: Carlos Llamas Reviewed-by: Alice Ryhl Link: https://patch.msgid.link/20260123175702.2154348-1-cmllamas@google.com Signed-off-by: Greg Kroah-Hartman --- drivers/android/binder.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/drivers/android/binder.c b/drivers/android/binder.c index 4e16438eceb7..b356c9b88254 100644 --- a/drivers/android/binder.c +++ b/drivers/android/binder.c @@ -3824,8 +3824,9 @@ static void binder_transaction(struct binder_proc *proc, return; err_dead_proc_or_thread: - binder_txn_error("%d:%d dead process or thread\n", - thread->pid, proc->pid); + binder_txn_error("%d:%d %s process or thread\n", + proc->pid, thread->pid, + return_error == BR_FROZEN_REPLY ? "frozen" : "dead"); return_error_line = __LINE__; binder_dequeue_work(proc, tcomplete); err_translate_failed: From 8aa6f7697f5981d336cac7af6ddd182a03c6da01 Mon Sep 17 00:00:00 2001 From: Gabor Juhos Date: Thu, 22 Jan 2026 18:20:12 +0100 Subject: [PATCH 023/167] pmdomain: qcom: rpmpd: fix off-by-one error in clamping to the highest state As it is indicated by the comment, the rpmpd_aggregate_corner() function tries to clamp the state to the highest corner/level supported by the given power domain, however the calculation of the highest state contains an off-by-one error. The 'max_state' member of the 'rpmpd' structure indicates the highest corner/level, and as such it does not needs to be decremented. Change the code to use the 'max_state' value directly to avoid the error. Fixes: 98c8b3efacae ("soc: qcom: rpmpd: Add sync_state") Signed-off-by: Gabor Juhos Reviewed-by: Konrad Dybcio Cc: stable@vger.kernel.org Signed-off-by: Ulf Hansson --- drivers/pmdomain/qcom/rpmpd.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/pmdomain/qcom/rpmpd.c b/drivers/pmdomain/qcom/rpmpd.c index f8580ec0f737..98ab4f9ea9bf 100644 --- a/drivers/pmdomain/qcom/rpmpd.c +++ b/drivers/pmdomain/qcom/rpmpd.c @@ -1001,7 +1001,7 @@ static int rpmpd_aggregate_corner(struct rpmpd *pd) /* Clamp to the highest corner/level if sync_state isn't done yet */ if (!pd->state_synced) - this_active_corner = this_sleep_corner = pd->max_state - 1; + this_active_corner = this_sleep_corner = pd->max_state; else to_active_sleep(pd, pd->corner, &this_active_corner, &this_sleep_corner); From ae0a24c5a8dcea20bf8e344eadf6593e6d1959c3 Mon Sep 17 00:00:00 2001 From: Jacky Bai Date: Fri, 23 Jan 2026 10:51:26 +0800 Subject: [PATCH 024/167] pmdomain: imx: gpcv2: Fix the imx8mm gpu hang due to wrong adb400 reset MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit On i.MX8MM, the GPUMIX, GPU2D, and GPU3D blocks share a common reset domain. Due to this hardware limitation, powering off/on GPU2D or GPU3D also triggers a reset of the GPUMIX domain, including its ADB400 port. However, the ADB400 interface must always be placed into power‑down mode before being reset. Currently the GPUMIX and GPU2D/3D power domains rely on runtime PM to handle dependency ordering. In some corner cases, the GPUMIX power off sequence is skipped, leaving the ADB400 port active when GPU2D/3D reset. This causes the GPUMIX ADB400 port to be reset while still active, leading to unpredictable bus behavior and GPU hangs. To avoid this, refine the power‑domain control logic so that the GPUMIX ADB400 port is explicitly powered down and powered up as part of the GPU power domain on/off sequence. This ensures proper ordering and prevents incorrect ADB400 reset. Suggested-by: Lucas Stach Signed-off-by: Jacky Bai Reviewed-by: Lucas Stach Tested-by: Philipp Zabel Cc: stable@vger.kernel.org Signed-off-by: Ulf Hansson --- drivers/pmdomain/imx/gpcv2.c | 8 ++------ 1 file changed, 2 insertions(+), 6 deletions(-) diff --git a/drivers/pmdomain/imx/gpcv2.c b/drivers/pmdomain/imx/gpcv2.c index 105fcaf13a34..cff738e4d546 100644 --- a/drivers/pmdomain/imx/gpcv2.c +++ b/drivers/pmdomain/imx/gpcv2.c @@ -165,13 +165,11 @@ #define IMX8M_VPU_HSK_PWRDNREQN BIT(5) #define IMX8M_DISP_HSK_PWRDNREQN BIT(4) -#define IMX8MM_GPUMIX_HSK_PWRDNACKN BIT(29) -#define IMX8MM_GPU_HSK_PWRDNACKN (BIT(27) | BIT(28)) +#define IMX8MM_GPU_HSK_PWRDNACKN GENMASK(29, 27) #define IMX8MM_VPUMIX_HSK_PWRDNACKN BIT(26) #define IMX8MM_DISPMIX_HSK_PWRDNACKN BIT(25) #define IMX8MM_HSIO_HSK_PWRDNACKN (BIT(23) | BIT(24)) -#define IMX8MM_GPUMIX_HSK_PWRDNREQN BIT(11) -#define IMX8MM_GPU_HSK_PWRDNREQN (BIT(9) | BIT(10)) +#define IMX8MM_GPU_HSK_PWRDNREQN GENMASK(11, 9) #define IMX8MM_VPUMIX_HSK_PWRDNREQN BIT(8) #define IMX8MM_DISPMIX_HSK_PWRDNREQN BIT(7) #define IMX8MM_HSIO_HSK_PWRDNREQN (BIT(5) | BIT(6)) @@ -794,8 +792,6 @@ static const struct imx_pgc_domain imx8mm_pgc_domains[] = { .bits = { .pxx = IMX8MM_GPUMIX_SW_Pxx_REQ, .map = IMX8MM_GPUMIX_A53_DOMAIN, - .hskreq = IMX8MM_GPUMIX_HSK_PWRDNREQN, - .hskack = IMX8MM_GPUMIX_HSK_PWRDNACKN, }, .pgc = BIT(IMX8MM_PGC_GPUMIX), .keep_clocks = true, From fe747d7112283f47169e9c16e751179a9b38611e Mon Sep 17 00:00:00 2001 From: "Rafael J. Wysocki" Date: Mon, 26 Jan 2026 21:02:40 +0100 Subject: [PATCH 025/167] platform/x86: classmate-laptop: Add missing NULL pointer checks MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit In a few places in the Classmate laptop driver, code using the accel object may run before that object's address is stored in the driver data of the input device using it. For example, cmpc_accel_sensitivity_store_v4() is the "show" method of cmpc_accel_sensitivity_attr_v4 which is added in cmpc_accel_add_v4(), before calling dev_set_drvdata() for inputdev->dev. If the sysfs attribute is accessed prematurely, the dev_get_drvdata(&inputdev->dev) call in in cmpc_accel_sensitivity_store_v4() returns NULL which leads to a NULL pointer dereference going forward. Moreover, sysfs attributes using the input device are added before initializing that device by cmpc_add_acpi_notify_device() and if one of them is accessed before running that function, a NULL pointer dereference will occur. For example, cmpc_accel_sensitivity_attr_v4 is added before calling cmpc_add_acpi_notify_device() and if it is read prematurely, the dev_get_drvdata(&acpi->dev) call in cmpc_accel_sensitivity_show_v4() returns NULL which leads to a NULL pointer dereference going forward. Fix this by adding NULL pointer checks in all of the relevant places. Signed-off-by: Rafael J. Wysocki Link: https://patch.msgid.link/12825381.O9o76ZdvQC@rafael.j.wysocki Reviewed-by: Ilpo Järvinen Signed-off-by: Ilpo Järvinen --- drivers/platform/x86/classmate-laptop.c | 32 +++++++++++++++++++++++++ 1 file changed, 32 insertions(+) diff --git a/drivers/platform/x86/classmate-laptop.c b/drivers/platform/x86/classmate-laptop.c index 6b1b8e444e24..74d3eb83f56a 100644 --- a/drivers/platform/x86/classmate-laptop.c +++ b/drivers/platform/x86/classmate-laptop.c @@ -207,7 +207,12 @@ static ssize_t cmpc_accel_sensitivity_show_v4(struct device *dev, acpi = to_acpi_device(dev); inputdev = dev_get_drvdata(&acpi->dev); + if (!inputdev) + return -ENXIO; + accel = dev_get_drvdata(&inputdev->dev); + if (!accel) + return -ENXIO; return sysfs_emit(buf, "%d\n", accel->sensitivity); } @@ -224,7 +229,12 @@ static ssize_t cmpc_accel_sensitivity_store_v4(struct device *dev, acpi = to_acpi_device(dev); inputdev = dev_get_drvdata(&acpi->dev); + if (!inputdev) + return -ENXIO; + accel = dev_get_drvdata(&inputdev->dev); + if (!accel) + return -ENXIO; r = kstrtoul(buf, 0, &sensitivity); if (r) @@ -256,7 +266,12 @@ static ssize_t cmpc_accel_g_select_show_v4(struct device *dev, acpi = to_acpi_device(dev); inputdev = dev_get_drvdata(&acpi->dev); + if (!inputdev) + return -ENXIO; + accel = dev_get_drvdata(&inputdev->dev); + if (!accel) + return -ENXIO; return sysfs_emit(buf, "%d\n", accel->g_select); } @@ -273,7 +288,12 @@ static ssize_t cmpc_accel_g_select_store_v4(struct device *dev, acpi = to_acpi_device(dev); inputdev = dev_get_drvdata(&acpi->dev); + if (!inputdev) + return -ENXIO; + accel = dev_get_drvdata(&inputdev->dev); + if (!accel) + return -ENXIO; r = kstrtoul(buf, 0, &g_select); if (r) @@ -302,6 +322,8 @@ static int cmpc_accel_open_v4(struct input_dev *input) acpi = to_acpi_device(input->dev.parent); accel = dev_get_drvdata(&input->dev); + if (!accel) + return -ENXIO; cmpc_accel_set_sensitivity_v4(acpi->handle, accel->sensitivity); cmpc_accel_set_g_select_v4(acpi->handle, accel->g_select); @@ -549,7 +571,12 @@ static ssize_t cmpc_accel_sensitivity_show(struct device *dev, acpi = to_acpi_device(dev); inputdev = dev_get_drvdata(&acpi->dev); + if (!inputdev) + return -ENXIO; + accel = dev_get_drvdata(&inputdev->dev); + if (!accel) + return -ENXIO; return sysfs_emit(buf, "%d\n", accel->sensitivity); } @@ -566,7 +593,12 @@ static ssize_t cmpc_accel_sensitivity_store(struct device *dev, acpi = to_acpi_device(dev); inputdev = dev_get_drvdata(&acpi->dev); + if (!inputdev) + return -ENXIO; + accel = dev_get_drvdata(&inputdev->dev); + if (!accel) + return -ENXIO; r = kstrtoul(buf, 0, &sensitivity); if (r) From f2090ebdb59d0546cbd7b55d9dd63a77133efc03 Mon Sep 17 00:00:00 2001 From: Christian Marangi Date: Sat, 22 Nov 2025 19:49:56 +0100 Subject: [PATCH 026/167] soc: qcom: smem: fix qcom_smem_is_available and check if __smem is valid Commit 7a94d5f31b54 ("soc: qcom: smem: better track SMEM uninitialized state") changed the usage of __smem and init now as an error pointer instead of NULL. qcom_smem_is_available() wasn't updated to reflect this change and also .qcom_smem_remove doesn't reset it on module exit. Update both entry to reflect new handling of __smem. Fixes: 7a94d5f31b54 ("soc: qcom: smem: better track SMEM uninitialized state") Reported-by: Dan Carpenter Closes: https://lore.kernel.org/all/aSAnR3ECa04CoPqp@stanley.mountain/ Signed-off-by: Christian Marangi Reviewed-by: Konrad Dybcio Link: https://lore.kernel.org/r/20251122185002.26524-1-ansuelsmth@gmail.com Signed-off-by: Bjorn Andersson --- drivers/soc/qcom/smem.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/drivers/soc/qcom/smem.c b/drivers/soc/qcom/smem.c index fef840b54574..c18a0c946f76 100644 --- a/drivers/soc/qcom/smem.c +++ b/drivers/soc/qcom/smem.c @@ -396,7 +396,7 @@ EXPORT_SYMBOL_GPL(qcom_smem_bust_hwspin_lock_by_host); */ bool qcom_smem_is_available(void) { - return !!__smem; + return !IS_ERR(__smem); } EXPORT_SYMBOL_GPL(qcom_smem_is_available); @@ -1247,7 +1247,8 @@ static void qcom_smem_remove(struct platform_device *pdev) { platform_device_unregister(__smem->socinfo); - __smem = NULL; + /* Set to -EPROBE_DEFER to signal unprobed state */ + __smem = ERR_PTR(-EPROBE_DEFER); } static const struct of_device_id qcom_smem_of_match[] = { From c7e45aa0a91f27a63fc9adc44385cfbcdb48eec1 Mon Sep 17 00:00:00 2001 From: Yushan Wang Date: Wed, 28 Jan 2026 19:56:31 +0000 Subject: [PATCH 027/167] MAINTAINERS: Add myself as maintainer of hisi_soc_hha Add myself as maintainer of HiSilicon SoC HHA driver as I will be responsible for this driver. Signed-off-by: Yushan Wang Acked-by: Jonathan Cameron Signed-off-by: Conor Dooley Link: https://lore.kernel.org/r/20260128-audition-professed-31d1b66d5b4e@spud Signed-off-by: Arnd Bergmann --- MAINTAINERS | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/MAINTAINERS b/MAINTAINERS index 151a0baca2a9..62fe38a771a5 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -11376,6 +11376,11 @@ F: Documentation/ABI/testing/sysfs-devices-platform-kunpeng_hccs F: drivers/soc/hisilicon/kunpeng_hccs.c F: drivers/soc/hisilicon/kunpeng_hccs.h +HISILICON SOC HHA DRIVER +M: Yushan Wang +S: Maintained +F: drivers/cache/hisi_soc_hha.c + HISILICON LPC BUS DRIVER M: Jay Fang S: Maintained From 59e82a4237cf39be697f25f2da26f4bee3007826 Mon Sep 17 00:00:00 2001 From: Sudeep Holla Date: Sun, 25 Jan 2026 20:15:45 +0000 Subject: [PATCH 028/167] MAINTAINERS: Change Sudeep Holla's email address Switch to my kernel.org email address going forward. Signed-off-by: Sudeep Holla Link: https://lore.kernel.org/r/20260125201545.3461463-1-sudeep.holla@kernel.org Signed-off-by: Arnd Bergmann --- .mailmap | 3 ++- MAINTAINERS | 22 +++++++++++----------- 2 files changed, 13 insertions(+), 12 deletions(-) diff --git a/.mailmap b/.mailmap index 428d721ffbb1..99c7ac8454b0 100644 --- a/.mailmap +++ b/.mailmap @@ -786,7 +786,8 @@ Subash Abhinov Kasiviswanathan Subhash Jadavani Sudarshan Rajagopalan -Sudeep Holla Sudeep KarkadaNagesha +Sudeep Holla Sudeep KarkadaNagesha +Sudeep Holla Sumit Garg Sumit Semwal Surabhi Vishnoi diff --git a/MAINTAINERS b/MAINTAINERS index 62fe38a771a5..3344b5c14016 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -335,7 +335,7 @@ F: tools/power/acpi/ ACPI FOR ARM64 (ACPI/arm64) M: Lorenzo Pieralisi M: Hanjun Guo -M: Sudeep Holla +M: Sudeep Holla L: linux-acpi@vger.kernel.org L: linux-arm-kernel@lists.infradead.org (moderated for non-subscribers) S: Maintained @@ -351,7 +351,7 @@ F: drivers/acpi/riscv/ F: include/linux/acpi_rimt.h ACPI PCC(Platform Communication Channel) MAILBOX DRIVER -M: Sudeep Holla +M: Sudeep Holla L: linux-acpi@vger.kernel.org S: Supported F: drivers/mailbox/pcc.c @@ -3681,7 +3681,7 @@ N: uniphier ARM/VERSATILE EXPRESS PLATFORM M: Liviu Dudau -M: Sudeep Holla +M: Sudeep Holla M: Lorenzo Pieralisi L: linux-arm-kernel@lists.infradead.org (moderated for non-subscribers) S: Maintained @@ -6513,7 +6513,7 @@ F: drivers/i2c/busses/i2c-cp2615.c CPU FREQUENCY DRIVERS - VEXPRESS SPC ARM BIG LITTLE M: Viresh Kumar -M: Sudeep Holla +M: Sudeep Holla L: linux-pm@vger.kernel.org S: Maintained W: http://www.arm.com/products/processors/technologies/biglittleprocessing.php @@ -6609,7 +6609,7 @@ F: include/linux/platform_data/cpuidle-exynos.h CPUIDLE DRIVER - ARM PSCI M: Lorenzo Pieralisi -M: Sudeep Holla +M: Sudeep Holla M: Ulf Hansson L: linux-pm@vger.kernel.org L: linux-arm-kernel@lists.infradead.org (moderated for non-subscribers) @@ -9816,7 +9816,7 @@ F: include/uapi/linux/firewire*.h F: tools/firewire/ FIRMWARE FRAMEWORK FOR ARMV8-A -M: Sudeep Holla +M: Sudeep Holla L: linux-arm-kernel@lists.infradead.org (moderated for non-subscribers) S: Maintained F: drivers/firmware/arm_ffa/ @@ -10514,7 +10514,7 @@ S: Maintained F: scripts/gendwarfksyms/ GENERIC ARCHITECTURE TOPOLOGY -M: Sudeep Holla +M: Sudeep Holla L: linux-kernel@vger.kernel.org S: Maintained F: drivers/base/arch_topology.c @@ -15106,7 +15106,7 @@ F: drivers/mailbox/arm_mhuv2.c F: include/linux/mailbox/arm_mhuv2_message.h MAILBOX ARM MHUv3 -M: Sudeep Holla +M: Sudeep Holla M: Cristian Marussi L: linux-kernel@vger.kernel.org L: linux-arm-kernel@lists.infradead.org (moderated for non-subscribers) @@ -23636,7 +23636,7 @@ F: include/uapi/linux/sed* SECURE MONITOR CALL(SMC) CALLING CONVENTION (SMCCC) M: Mark Rutland M: Lorenzo Pieralisi -M: Sudeep Holla +M: Sudeep Holla L: linux-arm-kernel@lists.infradead.org (moderated for non-subscribers) S: Maintained F: drivers/firmware/smccc/ @@ -25400,7 +25400,7 @@ T: git git://git.kernel.org/pub/scm/linux/kernel/git/lee/mfd.git F: drivers/mfd/syscon.c SYSTEM CONTROL & POWER/MANAGEMENT INTERFACE (SCPI/SCMI) Message Protocol drivers -M: Sudeep Holla +M: Sudeep Holla R: Cristian Marussi L: arm-scmi@vger.kernel.org L: linux-arm-kernel@lists.infradead.org (moderated for non-subscribers) @@ -26552,7 +26552,7 @@ F: samples/tsm-mr/ TRUSTED SERVICES TEE DRIVER M: Balint Dobszay -M: Sudeep Holla +M: Sudeep Holla L: linux-arm-kernel@lists.infradead.org (moderated for non-subscribers) L: trusted-services@lists.trustedfirmware.org S: Maintained From 6222883af286e2feb3c9ff2bf9fd8fdf4220c55a Mon Sep 17 00:00:00 2001 From: Mario Limonciello Date: Wed, 28 Jan 2026 13:04:45 -0600 Subject: [PATCH 029/167] platform/x86: hp-bioscfg: Skip empty attribute names MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Avoid registering kobjects with empty names when a BIOS attribute name decodes to an empty string. Fixes: a34fc329b1895 ("platform/x86: hp-bioscfg: bioscfg") Reported-by: Alain Cousinie Closes: https://lore.kernel.org/platform-driver-x86/22ed5f78-c8bf-4ab4-8c38-420cc0201e7e@laposte.net/ Signed-off-by: Mario Limonciello Link: https://patch.msgid.link/20260128190501.2170068-1-mario.limonciello@amd.com Reviewed-by: Ilpo Järvinen Signed-off-by: Ilpo Järvinen --- drivers/platform/x86/hp/hp-bioscfg/bioscfg.c | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/drivers/platform/x86/hp/hp-bioscfg/bioscfg.c b/drivers/platform/x86/hp/hp-bioscfg/bioscfg.c index dbe096eefa75..51e8977d3eb4 100644 --- a/drivers/platform/x86/hp/hp-bioscfg/bioscfg.c +++ b/drivers/platform/x86/hp/hp-bioscfg/bioscfg.c @@ -696,6 +696,11 @@ static int hp_init_bios_package_attribute(enum hp_wmi_data_type attr_type, return ret; } + if (!str_value || !str_value[0]) { + pr_debug("Ignoring attribute with empty name\n"); + goto pack_attr_exit; + } + /* All duplicate attributes found are ignored */ duplicate = kset_find_obj(temp_kset, str_value); if (duplicate) { From 008bec8ffe6e7746588d1e12c5b3865fa478fc91 Mon Sep 17 00:00:00 2001 From: Ricardo Neri Date: Tue, 27 Jan 2026 15:45:40 -0800 Subject: [PATCH 030/167] platform/x86/intel/tpmi/plr: Make the file domain/status writeable MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit The file sys/kernel/debug/tpmi-/plr/domain/status has store and show callbacks. Make it writeable. Fixes: 811f67c51636d ("platform/x86/intel/tpmi: Add new auxiliary driver for performance limits") Signed-off-by: Ricardo Neri Link: https://patch.msgid.link/20260127-plr-debugfs-write-v1-1-1fffbc370b1e@linux.intel.com Reviewed-by: Ilpo Järvinen Signed-off-by: Ilpo Järvinen --- drivers/platform/x86/intel/plr_tpmi.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/platform/x86/intel/plr_tpmi.c b/drivers/platform/x86/intel/plr_tpmi.c index 58132da47745..05727169f49c 100644 --- a/drivers/platform/x86/intel/plr_tpmi.c +++ b/drivers/platform/x86/intel/plr_tpmi.c @@ -316,7 +316,7 @@ static int intel_plr_probe(struct auxiliary_device *auxdev, const struct auxilia snprintf(name, sizeof(name), "domain%d", i); dentry = debugfs_create_dir(name, plr->dbgfs_dir); - debugfs_create_file("status", 0444, dentry, &plr->die_info[i], + debugfs_create_file("status", 0644, dentry, &plr->die_info[i], &plr_status_fops); } From a8ff29f0ca1d63a215ef445102662850a912d127 Mon Sep 17 00:00:00 2001 From: Josh Poimboeuf Date: Tue, 27 Jan 2026 17:12:05 -0800 Subject: [PATCH 031/167] livepatch/klp-build: Require Clang assembler >= 20 Some special sections specify their ELF section entsize, for example: .pushsection section, "M", @progbits, 8 The entsize (8 in this example) is needed by objtool klp-diff for extracting individual entries. Clang assembler versions older than 20 silently ignore the above construct and set entsize to 0, resulting in the following error: .discard.annotate_data: missing special section entsize or annotations Add a klp-build check to prevent the use of Clang assembler versions prior to 20. Fixes: 24ebfcd65a87 ("livepatch/klp-build: Introduce klp-build script for generating livepatch modules") Reported-by: Song Liu Acked-by: Song Liu Link: https://patch.msgid.link/957fd52e375d0e2cfa3ac729160da995084a7f5e.1769562556.git.jpoimboe@kernel.org Signed-off-by: Josh Poimboeuf --- scripts/livepatch/klp-build | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/scripts/livepatch/klp-build b/scripts/livepatch/klp-build index a73515a82272..809e198a561d 100755 --- a/scripts/livepatch/klp-build +++ b/scripts/livepatch/klp-build @@ -249,6 +249,10 @@ validate_config() { [[ -v CONFIG_GCC_PLUGIN_RANDSTRUCT ]] && \ die "kernel option 'CONFIG_GCC_PLUGIN_RANDSTRUCT' not supported" + [[ -v CONFIG_AS_IS_LLVM ]] && \ + [[ "$CONFIG_AS_VERSION" -lt 200000 ]] && \ + die "Clang assembler version < 20 not supported" + return 0 } From bdde21d3e77da55121885fd2ef42bc6a15ac2f0c Mon Sep 17 00:00:00 2001 From: Paul Moore Date: Thu, 29 Jan 2026 13:31:56 -0500 Subject: [PATCH 032/167] lsm: preserve /proc/sys/vm/mmap_min_addr when !CONFIG_SECURITY While reworking the LSM initialization code the /proc/sys/vm/mmap_min_addr handler was inadvertently caught up in the change and the procfs entry wasn't setup when CONFIG_SECURITY was not selected at kernel build time. This patch restores the previous behavior and ensures that the procfs entry is setup regardless of the CONFIG_SECURITY state. Future work will improve upon this, likely by moving the procfs handler into the mm subsystem, but this patch should resolve the immediate regression. Fixes: 4ab5efcc2829 ("lsm: consolidate all of the LSM framework initcalls") Reported-by: Lorenzo Stoakes Reviewed-by: Lorenzo Stoakes Tested-by: Lorenzo Stoakes Reviewed-by: Kees Cook Signed-off-by: Paul Moore --- security/lsm.h | 9 --------- security/lsm_init.c | 7 +------ security/min_addr.c | 5 ++--- 3 files changed, 3 insertions(+), 18 deletions(-) diff --git a/security/lsm.h b/security/lsm.h index 81aadbc61685..db77cc83e158 100644 --- a/security/lsm.h +++ b/security/lsm.h @@ -37,15 +37,6 @@ int lsm_task_alloc(struct task_struct *task); /* LSM framework initializers */ -#ifdef CONFIG_MMU -int min_addr_init(void); -#else -static inline int min_addr_init(void) -{ - return 0; -} -#endif /* CONFIG_MMU */ - #ifdef CONFIG_SECURITYFS int securityfs_init(void); #else diff --git a/security/lsm_init.c b/security/lsm_init.c index 05bd52e6b1f2..573e2a7250c4 100644 --- a/security/lsm_init.c +++ b/security/lsm_init.c @@ -489,12 +489,7 @@ int __init security_init(void) */ static int __init security_initcall_pure(void) { - int rc_adr, rc_lsm; - - rc_adr = min_addr_init(); - rc_lsm = lsm_initcall(pure); - - return (rc_adr ? rc_adr : rc_lsm); + return lsm_initcall(pure); } pure_initcall(security_initcall_pure); diff --git a/security/min_addr.c b/security/min_addr.c index 0fde5ec9abc8..56e4f9d25929 100644 --- a/security/min_addr.c +++ b/security/min_addr.c @@ -5,8 +5,6 @@ #include #include -#include "lsm.h" - /* amount of vm to protect from userspace access by both DAC and the LSM*/ unsigned long mmap_min_addr; /* amount of vm to protect from userspace using CAP_SYS_RAWIO (DAC) */ @@ -54,10 +52,11 @@ static const struct ctl_table min_addr_sysctl_table[] = { }, }; -int __init min_addr_init(void) +static int __init mmap_min_addr_init(void) { register_sysctl_init("vm", min_addr_sysctl_table); update_mmap_min_addr(); return 0; } +pure_initcall(mmap_min_addr_init); From 37d312bf957b95346fae2b3f82ce043474ea66c9 Mon Sep 17 00:00:00 2001 From: Jakub Kicinski Date: Wed, 28 Jan 2026 13:04:35 -0800 Subject: [PATCH 033/167] MAINTAINERS: add an entry for PSP We are missing a MAINTAINERS entry for PSP, create one. Acked-by: Willem de Bruijn Link: https://patch.msgid.link/20260128210435.161061-1-kuba@kernel.org Signed-off-by: Jakub Kicinski --- MAINTAINERS | 12 ++++++++++++ 1 file changed, 12 insertions(+) diff --git a/MAINTAINERS b/MAINTAINERS index 0efa8cc6775b..12d0858e02c5 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -20973,6 +20973,18 @@ F: Documentation/devicetree/bindings/net/pse-pd/ F: drivers/net/pse-pd/ F: net/ethtool/pse-pd.c +PSP SECURITY PROTOCOL +M: Daniel Zahka +M: Jakub Kicinski +M: Willem de Bruijn +F: Documentation/netlink/specs/psp.yaml +F: Documentation/networking/psp.rst +F: include/net/psp/ +F: include/net/psp.h +F: include/uapi/linux/psp.h +F: net/psp/ +K: struct\ psp(_assoc|_dev|hdr)\b + PSTORE FILESYSTEM M: Kees Cook R: Tony Luck From 13e00fdc9236bd4d0bff4109d2983171fbcb74c4 Mon Sep 17 00:00:00 2001 From: Eric Dumazet Date: Wed, 28 Jan 2026 14:15:38 +0000 Subject: [PATCH 034/167] net: add skb_header_pointer_careful() helper This variant of skb_header_pointer() should be used in contexts where @offset argument is user-controlled and could be negative. Negative offsets are supported, as long as the zone starts between skb->head and skb->data. Signed-off-by: Eric Dumazet Link: https://patch.msgid.link/20260128141539.3404400-2-edumazet@google.com Signed-off-by: Jakub Kicinski --- include/linux/skbuff.h | 12 ++++++++++++ 1 file changed, 12 insertions(+) diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h index 86737076101d..112e48970338 100644 --- a/include/linux/skbuff.h +++ b/include/linux/skbuff.h @@ -4301,6 +4301,18 @@ skb_header_pointer(const struct sk_buff *skb, int offset, int len, void *buffer) skb_headlen(skb), buffer); } +/* Variant of skb_header_pointer() where @offset is user-controlled + * and potentially negative. + */ +static inline void * __must_check +skb_header_pointer_careful(const struct sk_buff *skb, int offset, + int len, void *buffer) +{ + if (unlikely(offset < 0 && -offset > skb_headroom(skb))) + return NULL; + return skb_header_pointer(skb, offset, len, buffer); +} + static inline void * __must_check skb_pointer_if_linear(const struct sk_buff *skb, int offset, int len) { From cabd1a976375780dabab888784e356f574bbaed8 Mon Sep 17 00:00:00 2001 From: Eric Dumazet Date: Wed, 28 Jan 2026 14:15:39 +0000 Subject: [PATCH 035/167] net/sched: cls_u32: use skb_header_pointer_careful() skb_header_pointer() does not fully validate negative @offset values. Use skb_header_pointer_careful() instead. GangMin Kim provided a report and a repro fooling u32_classify(): BUG: KASAN: slab-out-of-bounds in u32_classify+0x1180/0x11b0 net/sched/cls_u32.c:221 Fixes: fbc2e7d9cf49 ("cls_u32: use skb_header_pointer() to dereference data safely") Reported-by: GangMin Kim Closes: https://lore.kernel.org/netdev/CANn89iJkyUZ=mAzLzC4GdcAgLuPnUoivdLaOs6B9rq5_erj76w@mail.gmail.com/T/ Signed-off-by: Eric Dumazet Link: https://patch.msgid.link/20260128141539.3404400-3-edumazet@google.com Signed-off-by: Jakub Kicinski --- net/sched/cls_u32.c | 13 ++++++------- 1 file changed, 6 insertions(+), 7 deletions(-) diff --git a/net/sched/cls_u32.c b/net/sched/cls_u32.c index 2a1c00048fd6..58e849c0acf4 100644 --- a/net/sched/cls_u32.c +++ b/net/sched/cls_u32.c @@ -161,10 +161,8 @@ TC_INDIRECT_SCOPE int u32_classify(struct sk_buff *skb, int toff = off + key->off + (off2 & key->offmask); __be32 *data, hdata; - if (skb_headroom(skb) + toff > INT_MAX) - goto out; - - data = skb_header_pointer(skb, toff, 4, &hdata); + data = skb_header_pointer_careful(skb, toff, 4, + &hdata); if (!data) goto out; if ((*data ^ key->val) & key->mask) { @@ -214,8 +212,9 @@ TC_INDIRECT_SCOPE int u32_classify(struct sk_buff *skb, if (ht->divisor) { __be32 *data, hdata; - data = skb_header_pointer(skb, off + n->sel.hoff, 4, - &hdata); + data = skb_header_pointer_careful(skb, + off + n->sel.hoff, + 4, &hdata); if (!data) goto out; sel = ht->divisor & u32_hash_fold(*data, &n->sel, @@ -229,7 +228,7 @@ TC_INDIRECT_SCOPE int u32_classify(struct sk_buff *skb, if (n->sel.flags & TC_U32_VAROFFSET) { __be16 *data, hdata; - data = skb_header_pointer(skb, + data = skb_header_pointer_careful(skb, off + n->sel.offoff, 2, &hdata); if (!data) From ed48a84a72fefb20a82dd90a7caa7807e90c6f66 Mon Sep 17 00:00:00 2001 From: Junrui Luo Date: Wed, 28 Jan 2026 16:07:34 +0800 Subject: [PATCH 036/167] dpaa2-switch: prevent ZERO_SIZE_PTR dereference when num_ifs is zero The driver allocates arrays for ports, FDBs, and filter blocks using kcalloc() with ethsw->sw_attr.num_ifs as the element count. When the device reports zero interfaces (either due to hardware configuration or firmware issues), kcalloc(0, ...) returns ZERO_SIZE_PTR (0x10) instead of NULL. Later in dpaa2_switch_probe(), the NAPI initialization unconditionally accesses ethsw->ports[0]->netdev, which attempts to dereference ZERO_SIZE_PTR (address 0x10), resulting in a kernel panic. Add a check to ensure num_ifs is greater than zero after retrieving device attributes. This prevents the zero-sized allocations and subsequent invalid pointer dereference. Reported-by: Yuhao Jiang Reported-by: Junrui Luo Fixes: 0b1b71370458 ("staging: dpaa2-switch: handle Rx path on control interface") Signed-off-by: Junrui Luo Reviewed-by: Andrew Lunn Link: https://patch.msgid.link/SYBPR01MB7881BEABA8DA896947962470AF91A@SYBPR01MB7881.ausprd01.prod.outlook.com Signed-off-by: Jakub Kicinski --- drivers/net/ethernet/freescale/dpaa2/dpaa2-switch.c | 6 ++++++ 1 file changed, 6 insertions(+) diff --git a/drivers/net/ethernet/freescale/dpaa2/dpaa2-switch.c b/drivers/net/ethernet/freescale/dpaa2/dpaa2-switch.c index b1e1ad9e4b48..0ff234f6a3ed 100644 --- a/drivers/net/ethernet/freescale/dpaa2/dpaa2-switch.c +++ b/drivers/net/ethernet/freescale/dpaa2/dpaa2-switch.c @@ -3024,6 +3024,12 @@ static int dpaa2_switch_init(struct fsl_mc_device *sw_dev) goto err_close; } + if (!ethsw->sw_attr.num_ifs) { + dev_err(dev, "DPSW device has no interfaces\n"); + err = -ENODEV; + goto err_close; + } + err = dpsw_get_api_version(ethsw->mc_io, 0, ðsw->major, ðsw->minor); From 926ede0c85e1e57c97d64d9612455267d597bb2c Mon Sep 17 00:00:00 2001 From: Zilin Guan Date: Wed, 28 Jan 2026 15:44:38 +0000 Subject: [PATCH 037/167] net: liquidio: Initialize netdev pointer before queue setup In setup_nic_devices(), the netdev is allocated using alloc_etherdev_mq(). However, the pointer to this structure is stored in oct->props[i].netdev only after the calls to netif_set_real_num_rx_queues() and netif_set_real_num_tx_queues(). If either of these functions fails, setup_nic_devices() returns an error without freeing the allocated netdev. Since oct->props[i].netdev is still NULL at this point, the cleanup function liquidio_destroy_nic_device() will fail to find and free the netdev, resulting in a memory leak. Fix this by initializing oct->props[i].netdev before calling the queue setup functions. This ensures that the netdev is properly accessible for cleanup in case of errors. Compile tested only. Issue found using a prototype static analysis tool and code review. Fixes: c33c997346c3 ("liquidio: enhanced ethtool --set-channels feature") Signed-off-by: Zilin Guan Reviewed-by: Kory Maincent Link: https://patch.msgid.link/20260128154440.278369-2-zilin@seu.edu.cn Signed-off-by: Jakub Kicinski --- .../net/ethernet/cavium/liquidio/lio_main.c | 34 +++++++++---------- 1 file changed, 17 insertions(+), 17 deletions(-) diff --git a/drivers/net/ethernet/cavium/liquidio/lio_main.c b/drivers/net/ethernet/cavium/liquidio/lio_main.c index 0732440eeacd..1f10b1b22a1e 100644 --- a/drivers/net/ethernet/cavium/liquidio/lio_main.c +++ b/drivers/net/ethernet/cavium/liquidio/lio_main.c @@ -3505,6 +3505,23 @@ static int setup_nic_devices(struct octeon_device *octeon_dev) */ netdev->netdev_ops = &lionetdevops; + lio = GET_LIO(netdev); + + memset(lio, 0, sizeof(struct lio)); + + lio->ifidx = ifidx_or_pfnum; + + props = &octeon_dev->props[i]; + props->gmxport = resp->cfg_info.linfo.gmxport; + props->netdev = netdev; + + /* Point to the properties for octeon device to which this + * interface belongs. + */ + lio->oct_dev = octeon_dev; + lio->octprops = props; + lio->netdev = netdev; + retval = netif_set_real_num_rx_queues(netdev, num_oqueues); if (retval) { dev_err(&octeon_dev->pci_dev->dev, @@ -3521,16 +3538,6 @@ static int setup_nic_devices(struct octeon_device *octeon_dev) goto setup_nic_dev_free; } - lio = GET_LIO(netdev); - - memset(lio, 0, sizeof(struct lio)); - - lio->ifidx = ifidx_or_pfnum; - - props = &octeon_dev->props[i]; - props->gmxport = resp->cfg_info.linfo.gmxport; - props->netdev = netdev; - lio->linfo.num_rxpciq = num_oqueues; lio->linfo.num_txpciq = num_iqueues; for (j = 0; j < num_oqueues; j++) { @@ -3596,13 +3603,6 @@ static int setup_nic_devices(struct octeon_device *octeon_dev) netdev->min_mtu = LIO_MIN_MTU_SIZE; netdev->max_mtu = LIO_MAX_MTU_SIZE; - /* Point to the properties for octeon device to which this - * interface belongs. - */ - lio->oct_dev = octeon_dev; - lio->octprops = props; - lio->netdev = netdev; - dev_dbg(&octeon_dev->pci_dev->dev, "if%d gmx: %d hw_addr: 0x%llx\n", i, lio->linfo.gmxport, CVM_CAST64(lio->linfo.hw_addr)); From 8558aef4e8a1a83049ab906d21d391093cfa7e7f Mon Sep 17 00:00:00 2001 From: Zilin Guan Date: Wed, 28 Jan 2026 15:44:39 +0000 Subject: [PATCH 038/167] net: liquidio: Fix off-by-one error in PF setup_nic_devices() cleanup In setup_nic_devices(), the initialization loop jumps to the label setup_nic_dev_free on failure. The current cleanup loop while(i--) skip the failing index i, causing a memory leak. Fix this by changing the loop to iterate from the current index i down to 0. Also, decrement i in the devlink_alloc failure path to point to the last successfully allocated index. Compile tested only. Issue found using code review. Fixes: f21fb3ed364b ("Add support of Cavium Liquidio ethernet adapters") Suggested-by: Simon Horman Signed-off-by: Zilin Guan Reviewed-by: Kory Maincent Link: https://patch.msgid.link/20260128154440.278369-3-zilin@seu.edu.cn Signed-off-by: Jakub Kicinski --- drivers/net/ethernet/cavium/liquidio/lio_main.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/drivers/net/ethernet/cavium/liquidio/lio_main.c b/drivers/net/ethernet/cavium/liquidio/lio_main.c index 1f10b1b22a1e..c1a3df225254 100644 --- a/drivers/net/ethernet/cavium/liquidio/lio_main.c +++ b/drivers/net/ethernet/cavium/liquidio/lio_main.c @@ -3750,6 +3750,7 @@ static int setup_nic_devices(struct octeon_device *octeon_dev) if (!devlink) { device_unlock(&octeon_dev->pci_dev->dev); dev_err(&octeon_dev->pci_dev->dev, "devlink alloc failed\n"); + i--; goto setup_nic_dev_free; } @@ -3765,11 +3766,11 @@ static int setup_nic_devices(struct octeon_device *octeon_dev) setup_nic_dev_free: - while (i--) { + do { dev_err(&octeon_dev->pci_dev->dev, "NIC ifidx:%d Setup failed\n", i); liquidio_destroy_nic_device(octeon_dev, i); - } + } while (i--); setup_nic_dev_done: From 6cbba46934aefdfb5d171e0a95aec06c24f7ca30 Mon Sep 17 00:00:00 2001 From: Zilin Guan Date: Wed, 28 Jan 2026 15:44:40 +0000 Subject: [PATCH 039/167] net: liquidio: Fix off-by-one error in VF setup_nic_devices() cleanup In setup_nic_devices(), the initialization loop jumps to the label setup_nic_dev_free on failure. The current cleanup loop while(i--) skip the failing index i, causing a memory leak. Fix this by changing the loop to iterate from the current index i down to 0. Compile tested only. Issue found using code review. Fixes: 846b46873eeb ("liquidio CN23XX: VF offload features") Suggested-by: Simon Horman Signed-off-by: Zilin Guan Reviewed-by: Kory Maincent Link: https://patch.msgid.link/20260128154440.278369-4-zilin@seu.edu.cn Signed-off-by: Jakub Kicinski --- drivers/net/ethernet/cavium/liquidio/lio_vf_main.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/net/ethernet/cavium/liquidio/lio_vf_main.c b/drivers/net/ethernet/cavium/liquidio/lio_vf_main.c index e02942dbbcce..43c595f3b84e 100644 --- a/drivers/net/ethernet/cavium/liquidio/lio_vf_main.c +++ b/drivers/net/ethernet/cavium/liquidio/lio_vf_main.c @@ -2212,11 +2212,11 @@ static int setup_nic_devices(struct octeon_device *octeon_dev) setup_nic_dev_free: - while (i--) { + do { dev_err(&octeon_dev->pci_dev->dev, "NIC ifidx:%d Setup failed\n", i); liquidio_destroy_nic_device(octeon_dev, i); - } + } while (i--); setup_nic_dev_done: From 31a7a0bbeb006bac2d9c81a2874825025214b6d8 Mon Sep 17 00:00:00 2001 From: Junrui Luo Date: Thu, 29 Jan 2026 00:55:13 +0800 Subject: [PATCH 040/167] dpaa2-switch: add bounds check for if_id in IRQ handler The IRQ handler extracts if_id from the upper 16 bits of the hardware status register and uses it to index into ethsw->ports[] without validation. Since if_id can be any 16-bit value (0-65535) but the ports array is only allocated with sw_attr.num_ifs elements, this can lead to an out-of-bounds read potentially. Add a bounds check before accessing the array, consistent with the existing validation in dpaa2_switch_rx(). Reported-by: Yuhao Jiang Reported-by: Junrui Luo Fixes: 24ab724f8a46 ("dpaa2-switch: use the port index in the IRQ handler") Signed-off-by: Junrui Luo Link: https://patch.msgid.link/SYBPR01MB7881D420AB43FF1A227B84AFAF91A@SYBPR01MB7881.ausprd01.prod.outlook.com Signed-off-by: Jakub Kicinski --- drivers/net/ethernet/freescale/dpaa2/dpaa2-switch.c | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/drivers/net/ethernet/freescale/dpaa2/dpaa2-switch.c b/drivers/net/ethernet/freescale/dpaa2/dpaa2-switch.c index 0ff234f6a3ed..66240c340492 100644 --- a/drivers/net/ethernet/freescale/dpaa2/dpaa2-switch.c +++ b/drivers/net/ethernet/freescale/dpaa2/dpaa2-switch.c @@ -1531,6 +1531,10 @@ static irqreturn_t dpaa2_switch_irq0_handler_thread(int irq_num, void *arg) } if_id = (status & 0xFFFF0000) >> 16; + if (if_id >= ethsw->sw_attr.num_ifs) { + dev_err(dev, "Invalid if_id %d in IRQ status\n", if_id); + goto out; + } port_priv = ethsw->ports[if_id]; if (status & DPSW_IRQ_EVENT_LINK_CHANGED) From 6bd8b4a92a901fae1a422e6f914801063c345e8d Mon Sep 17 00:00:00 2001 From: Xu Yang Date: Fri, 30 Jan 2026 13:11:07 +0800 Subject: [PATCH 041/167] pmdomain: imx8m-blk-ctrl: fix out-of-range access of bc->domains Fix out-of-range access of bc->domains in imx8m_blk_ctrl_remove(). Fixes: 2684ac05a8c4 ("soc: imx: add i.MX8M blk-ctrl driver") Cc: stable@kernel.org Signed-off-by: Xu Yang Reviewed-by: Daniel Baluta Signed-off-by: Ulf Hansson --- drivers/pmdomain/imx/imx8m-blk-ctrl.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/pmdomain/imx/imx8m-blk-ctrl.c b/drivers/pmdomain/imx/imx8m-blk-ctrl.c index 74bf4936991d..19e992d2ee3b 100644 --- a/drivers/pmdomain/imx/imx8m-blk-ctrl.c +++ b/drivers/pmdomain/imx/imx8m-blk-ctrl.c @@ -340,7 +340,7 @@ static void imx8m_blk_ctrl_remove(struct platform_device *pdev) of_genpd_del_provider(pdev->dev.of_node); - for (i = 0; bc->onecell_data.num_domains; i++) { + for (i = 0; i < bc->onecell_data.num_domains; i++) { struct imx8m_blk_ctrl_domain *domain = &bc->domains[i]; pm_genpd_remove(&domain->genpd); From aabd8ea0aa253d40cf5f20a609fc3d6f61e38299 Mon Sep 17 00:00:00 2001 From: Breno Leitao Date: Mon, 26 Jan 2026 09:50:26 -0800 Subject: [PATCH 042/167] spi: tegra210-quad: Return IRQ_HANDLED when timeout already processed transfer When the ISR thread wakes up late and finds that the timeout handler has already processed the transfer (curr_xfer is NULL), return IRQ_HANDLED instead of IRQ_NONE. Use a similar approach to tegra_qspi_handle_timeout() by reading QSPI_TRANS_STATUS and checking the QSPI_RDY bit to determine if the hardware actually completed the transfer. If QSPI_RDY is set, the interrupt was legitimate and triggered by real hardware activity. The fact that the timeout path handled it first doesn't make it spurious. Returning IRQ_NONE incorrectly suggests the interrupt wasn't for this device, which can cause issues with shared interrupt lines and interrupt accounting. Fixes: b4e002d8a7ce ("spi: tegra210-quad: Fix timeout handling") Signed-off-by: Breno Leitao Signed-off-by: Usama Arif Tested-by: Jon Hunter Acked-by: Jon Hunter Acked-by: Thierry Reding Link: https://patch.msgid.link/20260126-tegra_xfer-v2-1-6d2115e4f387@debian.org Signed-off-by: Mark Brown --- drivers/spi/spi-tegra210-quad.c | 19 +++++++++++++++++-- 1 file changed, 17 insertions(+), 2 deletions(-) diff --git a/drivers/spi/spi-tegra210-quad.c b/drivers/spi/spi-tegra210-quad.c index cdc3cb7c01f9..f0408c0b4b98 100644 --- a/drivers/spi/spi-tegra210-quad.c +++ b/drivers/spi/spi-tegra210-quad.c @@ -1552,15 +1552,30 @@ static irqreturn_t handle_dma_based_xfer(struct tegra_qspi *tqspi) static irqreturn_t tegra_qspi_isr_thread(int irq, void *context_data) { struct tegra_qspi *tqspi = context_data; + u32 status; + + /* + * Read transfer status to check if interrupt was triggered by transfer + * completion + */ + status = tegra_qspi_readl(tqspi, QSPI_TRANS_STATUS); /* * Occasionally the IRQ thread takes a long time to wake up (usually * when the CPU that it's running on is excessively busy) and we have * already reached the timeout before and cleaned up the timed out * transfer. Avoid any processing in that case and bail out early. + * + * If no transfer is in progress, check if this was a real interrupt + * that the timeout handler already processed, or a spurious one. */ - if (!tqspi->curr_xfer) - return IRQ_NONE; + if (!tqspi->curr_xfer) { + /* Spurious interrupt - transfer not ready */ + if (!(status & QSPI_RDY)) + return IRQ_NONE; + /* Real interrupt, already handled by timeout path */ + return IRQ_HANDLED; + } tqspi->status_reg = tegra_qspi_readl(tqspi, QSPI_FIFO_STATUS); From ef13ba357656451d6371940d8414e3e271df97e3 Mon Sep 17 00:00:00 2001 From: Breno Leitao Date: Mon, 26 Jan 2026 09:50:27 -0800 Subject: [PATCH 043/167] spi: tegra210-quad: Move curr_xfer read inside spinlock Move the assignment of the transfer pointer from curr_xfer inside the spinlock critical section in both handle_cpu_based_xfer() and handle_dma_based_xfer(). Previously, curr_xfer was read before acquiring the lock, creating a window where the timeout path could clear curr_xfer between reading it and using it. By moving the read inside the lock, the handlers are guaranteed to see a consistent value that cannot be modified by the timeout path. Fixes: 921fc1838fb0 ("spi: tegra210-quad: Add support for Tegra210 QSPI controller") Signed-off-by: Breno Leitao Acked-by: Thierry Reding Tested-by: Jon Hunter Acked-by: Jon Hunter Link: https://patch.msgid.link/20260126-tegra_xfer-v2-2-6d2115e4f387@debian.org Signed-off-by: Mark Brown --- drivers/spi/spi-tegra210-quad.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/drivers/spi/spi-tegra210-quad.c b/drivers/spi/spi-tegra210-quad.c index f0408c0b4b98..ee291b9e9e9c 100644 --- a/drivers/spi/spi-tegra210-quad.c +++ b/drivers/spi/spi-tegra210-quad.c @@ -1440,10 +1440,11 @@ static int tegra_qspi_transfer_one_message(struct spi_controller *host, static irqreturn_t handle_cpu_based_xfer(struct tegra_qspi *tqspi) { - struct spi_transfer *t = tqspi->curr_xfer; + struct spi_transfer *t; unsigned long flags; spin_lock_irqsave(&tqspi->lock, flags); + t = tqspi->curr_xfer; if (tqspi->tx_status || tqspi->rx_status) { tegra_qspi_handle_error(tqspi); @@ -1474,7 +1475,7 @@ static irqreturn_t handle_cpu_based_xfer(struct tegra_qspi *tqspi) static irqreturn_t handle_dma_based_xfer(struct tegra_qspi *tqspi) { - struct spi_transfer *t = tqspi->curr_xfer; + struct spi_transfer *t; unsigned int total_fifo_words; unsigned long flags; long wait_status; @@ -1513,6 +1514,7 @@ static irqreturn_t handle_dma_based_xfer(struct tegra_qspi *tqspi) } spin_lock_irqsave(&tqspi->lock, flags); + t = tqspi->curr_xfer; if (num_errors) { tegra_qspi_dma_unmap_xfer(tqspi, t); From f5a4d7f5e32ba163cff893493ec1cbb0fd2fb0d5 Mon Sep 17 00:00:00 2001 From: Breno Leitao Date: Mon, 26 Jan 2026 09:50:28 -0800 Subject: [PATCH 044/167] spi: tegra210-quad: Protect curr_xfer assignment in tegra_qspi_setup_transfer_one When the timeout handler processes a completed transfer and signals completion, the transfer thread can immediately set up the next transfer and assign curr_xfer to point to it. If a delayed ISR from the previous transfer then runs, it checks if (!tqspi->curr_xfer) (currently without the lock also -- to be fixed soon) to detect stale interrupts, but this check passes because curr_xfer now points to the new transfer. The ISR then incorrectly processes the new transfer's context. Protect the curr_xfer assignment with the spinlock to ensure the ISR either sees NULL (and bails out) or sees the new value only after the assignment is complete. Fixes: 921fc1838fb0 ("spi: tegra210-quad: Add support for Tegra210 QSPI controller") Signed-off-by: Breno Leitao Tested-by: Jon Hunter Acked-by: Jon Hunter Acked-by: Thierry Reding Link: https://patch.msgid.link/20260126-tegra_xfer-v2-3-6d2115e4f387@debian.org Signed-off-by: Mark Brown --- drivers/spi/spi-tegra210-quad.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/drivers/spi/spi-tegra210-quad.c b/drivers/spi/spi-tegra210-quad.c index ee291b9e9e9c..15c110c00aca 100644 --- a/drivers/spi/spi-tegra210-quad.c +++ b/drivers/spi/spi-tegra210-quad.c @@ -839,6 +839,7 @@ static u32 tegra_qspi_setup_transfer_one(struct spi_device *spi, struct spi_tran u32 command1, command2, speed = t->speed_hz; u8 bits_per_word = t->bits_per_word; u32 tx_tap = 0, rx_tap = 0; + unsigned long flags; int req_mode; if (!has_acpi_companion(tqspi->dev) && speed != tqspi->cur_speed) { @@ -846,10 +847,12 @@ static u32 tegra_qspi_setup_transfer_one(struct spi_device *spi, struct spi_tran tqspi->cur_speed = speed; } + spin_lock_irqsave(&tqspi->lock, flags); tqspi->cur_pos = 0; tqspi->cur_rx_pos = 0; tqspi->cur_tx_pos = 0; tqspi->curr_xfer = t; + spin_unlock_irqrestore(&tqspi->lock, flags); if (is_first_of_msg) { tegra_qspi_mask_clear_irq(tqspi); From bf4528ab28e2bf112c3a2cdef44fd13f007781cd Mon Sep 17 00:00:00 2001 From: Breno Leitao Date: Mon, 26 Jan 2026 09:50:29 -0800 Subject: [PATCH 045/167] spi: tegra210-quad: Protect curr_xfer in tegra_qspi_combined_seq_xfer The curr_xfer field is read by the IRQ handler without holding the lock to check if a transfer is in progress. When clearing curr_xfer in the combined sequence transfer loop, protect it with the spinlock to prevent a race with the interrupt handler. Protect the curr_xfer clearing at the exit path of tegra_qspi_combined_seq_xfer() with the spinlock to prevent a race with the interrupt handler that reads this field. Without this protection, the IRQ handler could read a partially updated curr_xfer value, leading to NULL pointer dereference or use-after-free. Fixes: b4e002d8a7ce ("spi: tegra210-quad: Fix timeout handling") Signed-off-by: Breno Leitao Tested-by: Jon Hunter Acked-by: Jon Hunter Acked-by: Thierry Reding Link: https://patch.msgid.link/20260126-tegra_xfer-v2-4-6d2115e4f387@debian.org Signed-off-by: Mark Brown --- drivers/spi/spi-tegra210-quad.c | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/drivers/spi/spi-tegra210-quad.c b/drivers/spi/spi-tegra210-quad.c index 15c110c00aca..669e01d3f56a 100644 --- a/drivers/spi/spi-tegra210-quad.c +++ b/drivers/spi/spi-tegra210-quad.c @@ -1161,6 +1161,7 @@ static int tegra_qspi_combined_seq_xfer(struct tegra_qspi *tqspi, u32 address_value = 0; u32 cmd_config = 0, addr_config = 0; u8 cmd_value = 0, val = 0; + unsigned long flags; /* Enable Combined sequence mode */ val = tegra_qspi_readl(tqspi, QSPI_GLOBAL_CONFIG); @@ -1264,13 +1265,17 @@ static int tegra_qspi_combined_seq_xfer(struct tegra_qspi *tqspi, tegra_qspi_transfer_end(spi); spi_transfer_delay_exec(xfer); } + spin_lock_irqsave(&tqspi->lock, flags); tqspi->curr_xfer = NULL; + spin_unlock_irqrestore(&tqspi->lock, flags); transfer_phase++; } ret = 0; exit: + spin_lock_irqsave(&tqspi->lock, flags); tqspi->curr_xfer = NULL; + spin_unlock_irqrestore(&tqspi->lock, flags); msg->status = ret; return ret; From 6d7723e8161f3c3f14125557e19dd080e9d882be Mon Sep 17 00:00:00 2001 From: Breno Leitao Date: Mon, 26 Jan 2026 09:50:30 -0800 Subject: [PATCH 046/167] spi: tegra210-quad: Protect curr_xfer clearing in tegra_qspi_non_combined_seq_xfer Protect the curr_xfer clearing in tegra_qspi_non_combined_seq_xfer() with the spinlock to prevent a race with the interrupt handler that reads this field to check if a transfer is in progress. Fixes: b4e002d8a7ce ("spi: tegra210-quad: Fix timeout handling") Signed-off-by: Breno Leitao Tested-by: Jon Hunter Acked-by: Jon Hunter Acked-by: Thierry Reding Link: https://patch.msgid.link/20260126-tegra_xfer-v2-5-6d2115e4f387@debian.org Signed-off-by: Mark Brown --- drivers/spi/spi-tegra210-quad.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/drivers/spi/spi-tegra210-quad.c b/drivers/spi/spi-tegra210-quad.c index 669e01d3f56a..79aeb80aa4a7 100644 --- a/drivers/spi/spi-tegra210-quad.c +++ b/drivers/spi/spi-tegra210-quad.c @@ -1288,6 +1288,7 @@ static int tegra_qspi_non_combined_seq_xfer(struct tegra_qspi *tqspi, struct spi_transfer *transfer; bool is_first_msg = true; int ret = 0, val = 0; + unsigned long flags; msg->status = 0; msg->actual_length = 0; @@ -1368,7 +1369,9 @@ static int tegra_qspi_non_combined_seq_xfer(struct tegra_qspi *tqspi, msg->actual_length += xfer->len + dummy_bytes; complete_xfer: + spin_lock_irqsave(&tqspi->lock, flags); tqspi->curr_xfer = NULL; + spin_unlock_irqrestore(&tqspi->lock, flags); if (ret < 0) { tegra_qspi_transfer_end(spi); From edf9088b6e1d6d88982db7eb5e736a0e4fbcc09e Mon Sep 17 00:00:00 2001 From: Breno Leitao Date: Mon, 26 Jan 2026 09:50:31 -0800 Subject: [PATCH 047/167] spi: tegra210-quad: Protect curr_xfer check in IRQ handler Now that all other accesses to curr_xfer are done under the lock, protect the curr_xfer NULL check in tegra_qspi_isr_thread() with the spinlock. Without this protection, the following race can occur: CPU0 (ISR thread) CPU1 (timeout path) ---------------- ------------------- if (!tqspi->curr_xfer) // sees non-NULL spin_lock() tqspi->curr_xfer = NULL spin_unlock() handle_*_xfer() spin_lock() t = tqspi->curr_xfer // NULL! ... t->len ... // NULL dereference! With this patch, all curr_xfer accesses are now properly synchronized. Although all accesses to curr_xfer are done under the lock, in tegra_qspi_isr_thread() it checks for NULL, releases the lock and reacquires it later in handle_cpu_based_xfer()/handle_dma_based_xfer(). There is a potential for an update in between, which could cause a NULL pointer dereference. To handle this, add a NULL check inside the handlers after acquiring the lock. This ensures that if the timeout path has already cleared curr_xfer, the handler will safely return without dereferencing the NULL pointer. Fixes: b4e002d8a7ce ("spi: tegra210-quad: Fix timeout handling") Signed-off-by: Breno Leitao Tested-by: Jon Hunter Acked-by: Jon Hunter Acked-by: Thierry Reding Link: https://patch.msgid.link/20260126-tegra_xfer-v2-6-6d2115e4f387@debian.org Signed-off-by: Mark Brown --- drivers/spi/spi-tegra210-quad.c | 20 ++++++++++++++++++++ 1 file changed, 20 insertions(+) diff --git a/drivers/spi/spi-tegra210-quad.c b/drivers/spi/spi-tegra210-quad.c index 79aeb80aa4a7..f425d62e0c27 100644 --- a/drivers/spi/spi-tegra210-quad.c +++ b/drivers/spi/spi-tegra210-quad.c @@ -1457,6 +1457,11 @@ static irqreturn_t handle_cpu_based_xfer(struct tegra_qspi *tqspi) spin_lock_irqsave(&tqspi->lock, flags); t = tqspi->curr_xfer; + if (!t) { + spin_unlock_irqrestore(&tqspi->lock, flags); + return IRQ_HANDLED; + } + if (tqspi->tx_status || tqspi->rx_status) { tegra_qspi_handle_error(tqspi); complete(&tqspi->xfer_completion); @@ -1527,6 +1532,11 @@ static irqreturn_t handle_dma_based_xfer(struct tegra_qspi *tqspi) spin_lock_irqsave(&tqspi->lock, flags); t = tqspi->curr_xfer; + if (!t) { + spin_unlock_irqrestore(&tqspi->lock, flags); + return IRQ_HANDLED; + } + if (num_errors) { tegra_qspi_dma_unmap_xfer(tqspi, t); tegra_qspi_handle_error(tqspi); @@ -1565,6 +1575,7 @@ static irqreturn_t handle_dma_based_xfer(struct tegra_qspi *tqspi) static irqreturn_t tegra_qspi_isr_thread(int irq, void *context_data) { struct tegra_qspi *tqspi = context_data; + unsigned long flags; u32 status; /* @@ -1582,7 +1593,9 @@ static irqreturn_t tegra_qspi_isr_thread(int irq, void *context_data) * If no transfer is in progress, check if this was a real interrupt * that the timeout handler already processed, or a spurious one. */ + spin_lock_irqsave(&tqspi->lock, flags); if (!tqspi->curr_xfer) { + spin_unlock_irqrestore(&tqspi->lock, flags); /* Spurious interrupt - transfer not ready */ if (!(status & QSPI_RDY)) return IRQ_NONE; @@ -1599,7 +1612,14 @@ static irqreturn_t tegra_qspi_isr_thread(int irq, void *context_data) tqspi->rx_status = tqspi->status_reg & (QSPI_RX_FIFO_OVF | QSPI_RX_FIFO_UNF); tegra_qspi_mask_clear_irq(tqspi); + spin_unlock_irqrestore(&tqspi->lock, flags); + /* + * Lock is released here but handlers safely re-check curr_xfer under + * lock before dereferencing. + * DMA handler also needs to sleep in wait_for_completion_*(), which + * cannot be done while holding spinlock. + */ if (!tqspi->is_curr_dma_xfer) return handle_cpu_based_xfer(tqspi); From 99854c167cfc113ad863832b1601c4ca1a639cfe Mon Sep 17 00:00:00 2001 From: Grzegorz Nitka Date: Thu, 27 Nov 2025 10:25:58 +0100 Subject: [PATCH 048/167] ice: fix missing TX timestamps interrupts on E825 devices Modify PTP (Precision Time Protocol) configuration on link down flow. Previously, PHY_REG_TX_OFFSET_READY register was cleared in such case. This register is used to determine if the timestamp is valid or not on the hardware side. However, there is a possibility that there is still the packet in the HW queue which originally was supposed to be timestamped but the link is already down and given register is cleared. This potentially might lead to the situation in which that 'delayed' packet's timestamp is treated as invalid one when the link is up again. This in turn leads to the situation in which the driver is not able to effectively clean timestamp memory and interrupt configuration. From the hardware perspective, that 'old' interrupt was not handled properly and even if new timestamp packets are processed, no new interrupts is generated. As a result, providing timestamps to the user applications (like ptp4l) is not possible. The solution for this problem is implemented at the driver level rather than the firmware, and maintains the tx_ready bit high, even during link down events. This avoids entering a potential inconsistent state between the driver and the timestamp hardware. Testing hints: - run PTP traffic at higher rate (like 16 PTP messages per second) - observe ptp4l behaviour at the client side in the following conditions: a) trigger link toggle events. It needs to be physiscal link down/up events b) link speed change In all above cases, PTP processing at ptp4l application should resume always. In failure case, the following permanent error message in ptp4l log was observed: controller-0 ptp4l: err [6175.116] ptp4l-legacy timed out while polling for tx timestamp Fixes: 7cab44f1c35f ("ice: Introduce ETH56G PHY model for E825C products") Reviewed-by: Aleksandr Loktionov Signed-off-by: Grzegorz Nitka Tested-by: Sunitha Mekala (A Contingent worker at Intel) Signed-off-by: Tony Nguyen --- drivers/net/ethernet/intel/ice/ice_ptp.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/drivers/net/ethernet/intel/ice/ice_ptp.c b/drivers/net/ethernet/intel/ice/ice_ptp.c index 4c8d20f2d2c0..0b7c2a13ab04 100644 --- a/drivers/net/ethernet/intel/ice/ice_ptp.c +++ b/drivers/net/ethernet/intel/ice/ice_ptp.c @@ -1347,9 +1347,12 @@ void ice_ptp_link_change(struct ice_pf *pf, bool linkup) /* Do not reconfigure E810 or E830 PHY */ return; case ICE_MAC_GENERIC: - case ICE_MAC_GENERIC_3K_E825: ice_ptp_port_phy_restart(ptp_port); return; + case ICE_MAC_GENERIC_3K_E825: + if (linkup) + ice_ptp_port_phy_restart(ptp_port); + return; default: dev_warn(ice_pf_to_dev(pf), "%s: Unknown PHY type\n", __func__); } From 88b68f35eb43ad5ac77ac1107059040b04e6f477 Mon Sep 17 00:00:00 2001 From: Jacob Keller Date: Wed, 21 Jan 2026 10:44:19 -0800 Subject: [PATCH 049/167] ice: PTP: fix missing timestamps on E825 hardware MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit The E825 hardware currently has each PF handle the PFINT_TSYN_TX cause of the miscellaneous OICR interrupt vector. The actual interrupt cause underlying this is shared by all ports on the same quad: ┌─────────────────────────────────┐ │ │ │ ┌────┐ ┌────┐ ┌────┐ ┌────┐ │ │ │PF 0│ │PF 1│ │PF 2│ │PF 3│ │ │ └────┘ └────┘ └────┘ └────┘ │ │ │ └────────────────▲────────────────┘ │ │ ┌────────────────┼────────────────┐ │ PHY QUAD │ └───▲────────▲────────▲────────▲──┘ │ │ │ │ ┌───┼──┐ ┌───┴──┐ ┌───┼──┐ ┌───┼──┐ │Port 0│ │Port 1│ │Port 2│ │Port 3│ └──────┘ └──────┘ └──────┘ └──────┘ If multiple PFs issue Tx timestamp requests near simultaneously, it is possible that the correct PF will not be interrupted and will miss its timestamp. Understanding why is somewhat complex. Consider the following sequence of events: CPU 0: Send Tx packet on PF 0 ... PF 0 enqueues packet with Tx request CPU 1, PF1: ... Send Tx packet on PF1 ... PF 1 enqueues packet with Tx request HW: PHY Port 0 sends packet PHY raises Tx timestamp event interrupt MAC raises each PF interrupt CPU 0, PF0: CPU 1, PF1: ice_misc_intr() checks for Tx timestamps ice_misc_intr() checks for Tx timestamp Sees packet ready bit set Sees nothing available ... Exits ... ... HW: PHY port 1 sends packet PHY interrupt ignored because not all packet timestamps read yet. ... Read timestamp, report to stack Because the interrupt event is shared for all ports on the same quad, the PHY will not raise a new interrupt for any PF until all timestamps are read. In the example above, the second timestamp comes in for port 1 before the timestamp from port 0 is read. At this point, there is no longer an interrupt thread running that will read the timestamps, because each PF has checked and found that there was no work to do. Applications such as ptp4l will timeout after waiting a few milliseconds. Eventually, the watchdog service task will re-check for all quads and notice that there are outstanding timestamps, and issue a software interrupt to recover. However, by this point it is far too late, and applications have already failed. All of this occurs because of the underlying hardware behavior. The PHY cannot raise a new interrupt signal until all outstanding timestamps have been read. As a first step to fix this, switch the E825C hardware to the ICE_PTP_TX_INTERRUPT_ALL mode. In this mode, only the clock owner PF will respond to the PFINT_TSYN_TX cause. Other PFs disable this cause and will not wake. In this mode, the clock owner will iterate over all ports and handle timestamps for each connected port. This matches the E822 behavior, and is a necessary but insufficient step to resolve the missing timestamps. Even with use of the ICE_PTP_TX_INTERRUPT_ALL mode, we still sometimes miss a timestamp event. The ice_ptp_tx_tstamp_owner() does re-check the ready bitmap, but does so before re-enabling the OICR interrupt vector. It also only checks the ready bitmap, but not the software Tx timestamp tracker. To avoid risk of losing a timestamp, refactor the logic to check both the software Tx timestamp tracker bitmap *and* the hardware ready bitmap. Additionally, do this outside of ice_ptp_process_ts() after we have already re-enabled the OICR interrupt. Remove the checks from the ice_ptp_tx_tstamp(), ice_ptp_tx_tstamp_owner(), and the ice_ptp_process_ts() functions. This results in ice_ptp_tx_tstamp() being nothing more than a wrapper around ice_ptp_process_tx_tstamp() so we can remove it. Add the ice_ptp_tx_tstamps_pending() function which returns a boolean indicating if there are any pending Tx timestamps. First, check the software timestamp tracker bitmap. In ICE_PTP_TX_INTERRUPT_ALL mode, check *all* ports software trackers. If a tracker has outstanding timestamp requests, return true. Additionally, check the PHY ready bitmap to confirm if the PHY indicates any outstanding timestamps. In the ice_misc_thread_fn(), call ice_ptp_tx_tstamps_pending() just before returning from the IRQ thread handler. If it returns true, write to PFINT_OICR to trigger a PFINT_OICR_TSYN_TX_M software interrupt. This will force the handler to interrupt again and complete the work even if the PHY hardware did not interrupt for any reason. This results in the following new flow for handling Tx timestamps: 1) send Tx packet 2) PHY captures timestamp 3) PHY triggers MAC interrupt 4) clock owner executes ice_misc_intr() with PFINT_OICR_TSYN_TX flag set 5) ice_ptp_ts_irq() returns IRQ_WAKE_THREAD 7) The interrupt thread wakes up and kernel calls ice_misc_intr_thread_fn() 8) ice_ptp_process_ts() is called to handle any outstanding timestamps 9) ice_irq_dynamic_ena() is called to re-enable the OICR hardware interrupt cause 10) ice_ptp_tx_tstamps_pending() is called to check if we missed any more outstanding timestamps, checking both software and hardware indicators. With this change, it should no longer be possible for new timestamps to come in such a way that we lose an interrupt. If a timestamp comes in before the ice_ptp_tx_tstamps_pending() call, it will be noticed by at least one of the software bitmap check or the hardware bitmap check. If the timestamp comes in *after* this check, it should cause a timestamp interrupt as we have already read all timestamps from the PHY and the OICR vector has been re-enabled. Fixes: 7cab44f1c35f ("ice: Introduce ETH56G PHY model for E825C products") Signed-off-by: Jacob Keller Reviewed-by: Aleksandr Loktionov Reviewed-by: Przemyslaw Korba Tested-by: Vitaly Grinberg Tested-by: Sunitha Mekala (A Contingent worker at Intel) Signed-off-by: Tony Nguyen --- drivers/net/ethernet/intel/ice/ice_main.c | 20 +-- drivers/net/ethernet/intel/ice/ice_ptp.c | 148 ++++++++++++---------- drivers/net/ethernet/intel/ice/ice_ptp.h | 13 +- 3 files changed, 103 insertions(+), 78 deletions(-) diff --git a/drivers/net/ethernet/intel/ice/ice_main.c b/drivers/net/ethernet/intel/ice/ice_main.c index 71c6d53b461e..25cbbe67d992 100644 --- a/drivers/net/ethernet/intel/ice/ice_main.c +++ b/drivers/net/ethernet/intel/ice/ice_main.c @@ -3314,18 +3314,20 @@ static irqreturn_t ice_misc_intr_thread_fn(int __always_unused irq, void *data) if (ice_is_reset_in_progress(pf->state)) goto skip_irq; - if (test_and_clear_bit(ICE_MISC_THREAD_TX_TSTAMP, pf->misc_thread)) { - /* Process outstanding Tx timestamps. If there is more work, - * re-arm the interrupt to trigger again. - */ - if (ice_ptp_process_ts(pf) == ICE_TX_TSTAMP_WORK_PENDING) { - wr32(hw, PFINT_OICR, PFINT_OICR_TSYN_TX_M); - ice_flush(hw); - } - } + if (test_and_clear_bit(ICE_MISC_THREAD_TX_TSTAMP, pf->misc_thread)) + ice_ptp_process_ts(pf); skip_irq: ice_irq_dynamic_ena(hw, NULL, NULL); + ice_flush(hw); + + if (ice_ptp_tx_tstamps_pending(pf)) { + /* If any new Tx timestamps happened while in interrupt, + * re-arm the interrupt to trigger it again. + */ + wr32(hw, PFINT_OICR, PFINT_OICR_TSYN_TX_M); + ice_flush(hw); + } return IRQ_HANDLED; } diff --git a/drivers/net/ethernet/intel/ice/ice_ptp.c b/drivers/net/ethernet/intel/ice/ice_ptp.c index 0b7c2a13ab04..b5cef6396319 100644 --- a/drivers/net/ethernet/intel/ice/ice_ptp.c +++ b/drivers/net/ethernet/intel/ice/ice_ptp.c @@ -573,6 +573,9 @@ static void ice_ptp_process_tx_tstamp(struct ice_ptp_tx *tx) pf = ptp_port_to_pf(ptp_port); hw = &pf->hw; + if (!tx->init) + return; + /* Read the Tx ready status first */ if (tx->has_ready_bitmap) { err = ice_get_phy_tx_tstamp_ready(hw, tx->block, &tstamp_ready); @@ -674,14 +677,9 @@ static void ice_ptp_process_tx_tstamp(struct ice_ptp_tx *tx) pf->ptp.tx_hwtstamp_good += tstamp_good; } -/** - * ice_ptp_tx_tstamp_owner - Process Tx timestamps for all ports on the device - * @pf: Board private structure - */ -static enum ice_tx_tstamp_work ice_ptp_tx_tstamp_owner(struct ice_pf *pf) +static void ice_ptp_tx_tstamp_owner(struct ice_pf *pf) { struct ice_ptp_port *port; - unsigned int i; mutex_lock(&pf->adapter->ports.lock); list_for_each_entry(port, &pf->adapter->ports.ports, list_node) { @@ -693,49 +691,6 @@ static enum ice_tx_tstamp_work ice_ptp_tx_tstamp_owner(struct ice_pf *pf) ice_ptp_process_tx_tstamp(tx); } mutex_unlock(&pf->adapter->ports.lock); - - for (i = 0; i < ICE_GET_QUAD_NUM(pf->hw.ptp.num_lports); i++) { - u64 tstamp_ready; - int err; - - /* Read the Tx ready status first */ - err = ice_get_phy_tx_tstamp_ready(&pf->hw, i, &tstamp_ready); - if (err) - break; - else if (tstamp_ready) - return ICE_TX_TSTAMP_WORK_PENDING; - } - - return ICE_TX_TSTAMP_WORK_DONE; -} - -/** - * ice_ptp_tx_tstamp - Process Tx timestamps for this function. - * @tx: Tx tracking structure to initialize - * - * Returns: ICE_TX_TSTAMP_WORK_PENDING if there are any outstanding incomplete - * Tx timestamps, or ICE_TX_TSTAMP_WORK_DONE otherwise. - */ -static enum ice_tx_tstamp_work ice_ptp_tx_tstamp(struct ice_ptp_tx *tx) -{ - bool more_timestamps; - unsigned long flags; - - if (!tx->init) - return ICE_TX_TSTAMP_WORK_DONE; - - /* Process the Tx timestamp tracker */ - ice_ptp_process_tx_tstamp(tx); - - /* Check if there are outstanding Tx timestamps */ - spin_lock_irqsave(&tx->lock, flags); - more_timestamps = tx->init && !bitmap_empty(tx->in_use, tx->len); - spin_unlock_irqrestore(&tx->lock, flags); - - if (more_timestamps) - return ICE_TX_TSTAMP_WORK_PENDING; - - return ICE_TX_TSTAMP_WORK_DONE; } /** @@ -2666,32 +2621,94 @@ s8 ice_ptp_request_ts(struct ice_ptp_tx *tx, struct sk_buff *skb) return idx + tx->offset; } -/** - * ice_ptp_process_ts - Process the PTP Tx timestamps - * @pf: Board private structure - * - * Returns: ICE_TX_TSTAMP_WORK_PENDING if there are any outstanding Tx - * timestamps that need processing, and ICE_TX_TSTAMP_WORK_DONE otherwise. - */ -enum ice_tx_tstamp_work ice_ptp_process_ts(struct ice_pf *pf) +void ice_ptp_process_ts(struct ice_pf *pf) { switch (pf->ptp.tx_interrupt_mode) { case ICE_PTP_TX_INTERRUPT_NONE: /* This device has the clock owner handle timestamps for it */ - return ICE_TX_TSTAMP_WORK_DONE; + return; case ICE_PTP_TX_INTERRUPT_SELF: /* This device handles its own timestamps */ - return ice_ptp_tx_tstamp(&pf->ptp.port.tx); + ice_ptp_process_tx_tstamp(&pf->ptp.port.tx); + return; case ICE_PTP_TX_INTERRUPT_ALL: /* This device handles timestamps for all ports */ - return ice_ptp_tx_tstamp_owner(pf); + ice_ptp_tx_tstamp_owner(pf); + return; default: WARN_ONCE(1, "Unexpected Tx timestamp interrupt mode %u\n", pf->ptp.tx_interrupt_mode); - return ICE_TX_TSTAMP_WORK_DONE; + return; } } +static bool ice_port_has_timestamps(struct ice_ptp_tx *tx) +{ + bool more_timestamps; + + scoped_guard(spinlock_irqsave, &tx->lock) { + if (!tx->init) + return false; + + more_timestamps = !bitmap_empty(tx->in_use, tx->len); + } + + return more_timestamps; +} + +static bool ice_any_port_has_timestamps(struct ice_pf *pf) +{ + struct ice_ptp_port *port; + + scoped_guard(mutex, &pf->adapter->ports.lock) { + list_for_each_entry(port, &pf->adapter->ports.ports, + list_node) { + struct ice_ptp_tx *tx = &port->tx; + + if (ice_port_has_timestamps(tx)) + return true; + } + } + + return false; +} + +bool ice_ptp_tx_tstamps_pending(struct ice_pf *pf) +{ + struct ice_hw *hw = &pf->hw; + unsigned int i; + + /* Check software indicator */ + switch (pf->ptp.tx_interrupt_mode) { + case ICE_PTP_TX_INTERRUPT_NONE: + return false; + case ICE_PTP_TX_INTERRUPT_SELF: + if (ice_port_has_timestamps(&pf->ptp.port.tx)) + return true; + break; + case ICE_PTP_TX_INTERRUPT_ALL: + if (ice_any_port_has_timestamps(pf)) + return true; + break; + default: + WARN_ONCE(1, "Unexpected Tx timestamp interrupt mode %u\n", + pf->ptp.tx_interrupt_mode); + break; + } + + /* Check hardware indicator */ + for (i = 0; i < ICE_GET_QUAD_NUM(hw->ptp.num_lports); i++) { + u64 tstamp_ready = 0; + int err; + + err = ice_get_phy_tx_tstamp_ready(&pf->hw, i, &tstamp_ready); + if (err || tstamp_ready) + return true; + } + + return false; +} + /** * ice_ptp_ts_irq - Process the PTP Tx timestamps in IRQ context * @pf: Board private structure @@ -2741,7 +2758,9 @@ irqreturn_t ice_ptp_ts_irq(struct ice_pf *pf) return IRQ_WAKE_THREAD; case ICE_MAC_E830: /* E830 can read timestamps in the top half using rd32() */ - if (ice_ptp_process_ts(pf) == ICE_TX_TSTAMP_WORK_PENDING) { + ice_ptp_process_ts(pf); + + if (ice_ptp_tx_tstamps_pending(pf)) { /* Process outstanding Tx timestamps. If there * is more work, re-arm the interrupt to trigger again. */ @@ -3194,8 +3213,9 @@ static void ice_ptp_init_tx_interrupt_mode(struct ice_pf *pf) { switch (pf->hw.mac_type) { case ICE_MAC_GENERIC: - /* E822 based PHY has the clock owner process the interrupt - * for all ports. + case ICE_MAC_GENERIC_3K_E825: + /* E82x hardware has the clock owner process timestamps for + * all ports. */ if (ice_pf_src_tmr_owned(pf)) pf->ptp.tx_interrupt_mode = ICE_PTP_TX_INTERRUPT_ALL; diff --git a/drivers/net/ethernet/intel/ice/ice_ptp.h b/drivers/net/ethernet/intel/ice/ice_ptp.h index 27016aac4f1e..8489bd842710 100644 --- a/drivers/net/ethernet/intel/ice/ice_ptp.h +++ b/drivers/net/ethernet/intel/ice/ice_ptp.h @@ -304,8 +304,9 @@ void ice_ptp_extts_event(struct ice_pf *pf); s8 ice_ptp_request_ts(struct ice_ptp_tx *tx, struct sk_buff *skb); void ice_ptp_req_tx_single_tstamp(struct ice_ptp_tx *tx, u8 idx); void ice_ptp_complete_tx_single_tstamp(struct ice_ptp_tx *tx); -enum ice_tx_tstamp_work ice_ptp_process_ts(struct ice_pf *pf); +void ice_ptp_process_ts(struct ice_pf *pf); irqreturn_t ice_ptp_ts_irq(struct ice_pf *pf); +bool ice_ptp_tx_tstamps_pending(struct ice_pf *pf); u64 ice_ptp_read_src_clk_reg(struct ice_pf *pf, struct ptp_system_timestamp *sts); @@ -345,16 +346,18 @@ static inline void ice_ptp_req_tx_single_tstamp(struct ice_ptp_tx *tx, u8 idx) static inline void ice_ptp_complete_tx_single_tstamp(struct ice_ptp_tx *tx) { } -static inline bool ice_ptp_process_ts(struct ice_pf *pf) -{ - return true; -} +static inline void ice_ptp_process_ts(struct ice_pf *pf) { } static inline irqreturn_t ice_ptp_ts_irq(struct ice_pf *pf) { return IRQ_HANDLED; } +static inline bool ice_ptp_tx_tstamps_pending(struct ice_pf *pf) +{ + return false; +} + static inline u64 ice_ptp_read_src_clk_reg(struct ice_pf *pf, struct ptp_system_timestamp *sts) { From fc6f36eaaedcf4b81af6fe1a568f018ffd530660 Mon Sep 17 00:00:00 2001 From: Aaron Ma Date: Wed, 21 Jan 2026 15:51:06 +0800 Subject: [PATCH 050/167] ice: Fix PTP NULL pointer dereference during VSI rebuild Fix race condition where PTP periodic work runs while VSI is being rebuilt, accessing NULL vsi->rx_rings. The sequence was: 1. ice_ptp_prepare_for_reset() cancels PTP work 2. ice_ptp_rebuild() immediately queues PTP work 3. VSI rebuild happens AFTER ice_ptp_rebuild() 4. PTP work runs and accesses NULL vsi->rx_rings Fix: Keep PTP work cancelled during rebuild, only queue it after VSI rebuild completes in ice_rebuild(). Added ice_ptp_queue_work() helper function to encapsulate the logic for queuing PTP work, ensuring it's only queued when PTP is supported and the state is ICE_PTP_READY. Error log: [ 121.392544] ice 0000:60:00.1: PTP reset successful [ 121.392692] BUG: kernel NULL pointer dereference, address: 0000000000000000 [ 121.392712] #PF: supervisor read access in kernel mode [ 121.392720] #PF: error_code(0x0000) - not-present page [ 121.392727] PGD 0 [ 121.392734] Oops: Oops: 0000 [#1] SMP NOPTI [ 121.392746] CPU: 8 UID: 0 PID: 1005 Comm: ice-ptp-0000:60 Tainted: G S 6.19.0-rc6+ #4 PREEMPT(voluntary) [ 121.392761] Tainted: [S]=CPU_OUT_OF_SPEC [ 121.392773] RIP: 0010:ice_ptp_update_cached_phctime+0xbf/0x150 [ice] [ 121.393042] Call Trace: [ 121.393047] [ 121.393055] ice_ptp_periodic_work+0x69/0x180 [ice] [ 121.393202] kthread_worker_fn+0xa2/0x260 [ 121.393216] ? __pfx_ice_ptp_periodic_work+0x10/0x10 [ice] [ 121.393359] ? __pfx_kthread_worker_fn+0x10/0x10 [ 121.393371] kthread+0x10d/0x230 [ 121.393382] ? __pfx_kthread+0x10/0x10 [ 121.393393] ret_from_fork+0x273/0x2b0 [ 121.393407] ? __pfx_kthread+0x10/0x10 [ 121.393417] ret_from_fork_asm+0x1a/0x30 [ 121.393432] Fixes: 803bef817807d ("ice: factor out ice_ptp_rebuild_owner()") Signed-off-by: Aaron Ma Tested-by: Sunitha Mekala (A Contingent worker at Intel) Signed-off-by: Tony Nguyen --- drivers/net/ethernet/intel/ice/ice_main.c | 3 +++ drivers/net/ethernet/intel/ice/ice_ptp.c | 26 ++++++++++++++++++----- drivers/net/ethernet/intel/ice/ice_ptp.h | 5 +++++ 3 files changed, 29 insertions(+), 5 deletions(-) diff --git a/drivers/net/ethernet/intel/ice/ice_main.c b/drivers/net/ethernet/intel/ice/ice_main.c index 25cbbe67d992..c1033070d0c7 100644 --- a/drivers/net/ethernet/intel/ice/ice_main.c +++ b/drivers/net/ethernet/intel/ice/ice_main.c @@ -7809,6 +7809,9 @@ static void ice_rebuild(struct ice_pf *pf, enum ice_reset_req reset_type) /* Restore timestamp mode settings after VSI rebuild */ ice_ptp_restore_timestamp_mode(pf); + + /* Start PTP periodic work after VSI is fully rebuilt */ + ice_ptp_queue_work(pf); return; err_vsi_rebuild: diff --git a/drivers/net/ethernet/intel/ice/ice_ptp.c b/drivers/net/ethernet/intel/ice/ice_ptp.c index b5cef6396319..272683001476 100644 --- a/drivers/net/ethernet/intel/ice/ice_ptp.c +++ b/drivers/net/ethernet/intel/ice/ice_ptp.c @@ -2839,6 +2839,20 @@ static void ice_ptp_periodic_work(struct kthread_work *work) msecs_to_jiffies(err ? 10 : 500)); } +/** + * ice_ptp_queue_work - Queue PTP periodic work for a PF + * @pf: Board private structure + * + * Helper function to queue PTP periodic work after VSI rebuild completes. + * This ensures that PTP work only runs when VSI structures are ready. + */ +void ice_ptp_queue_work(struct ice_pf *pf) +{ + if (test_bit(ICE_FLAG_PTP_SUPPORTED, pf->flags) && + pf->ptp.state == ICE_PTP_READY) + kthread_queue_delayed_work(pf->ptp.kworker, &pf->ptp.work, 0); +} + /** * ice_ptp_prepare_rebuild_sec - Prepare second NAC for PTP reset or rebuild * @pf: Board private structure @@ -2857,10 +2871,15 @@ static void ice_ptp_prepare_rebuild_sec(struct ice_pf *pf, bool rebuild, struct ice_pf *peer_pf = ptp_port_to_pf(port); if (!ice_is_primary(&peer_pf->hw)) { - if (rebuild) + if (rebuild) { + /* TODO: When implementing rebuild=true: + * 1. Ensure secondary PFs' VSIs are rebuilt + * 2. Call ice_ptp_queue_work(peer_pf) after VSI rebuild + */ ice_ptp_rebuild(peer_pf, reset_type); - else + } else { ice_ptp_prepare_for_reset(peer_pf, reset_type); + } } } } @@ -3006,9 +3025,6 @@ void ice_ptp_rebuild(struct ice_pf *pf, enum ice_reset_req reset_type) ptp->state = ICE_PTP_READY; - /* Start periodic work going */ - kthread_queue_delayed_work(ptp->kworker, &ptp->work, 0); - dev_info(ice_pf_to_dev(pf), "PTP reset successful\n"); return; diff --git a/drivers/net/ethernet/intel/ice/ice_ptp.h b/drivers/net/ethernet/intel/ice/ice_ptp.h index 8489bd842710..8c44bd758a4f 100644 --- a/drivers/net/ethernet/intel/ice/ice_ptp.h +++ b/drivers/net/ethernet/intel/ice/ice_ptp.h @@ -318,6 +318,7 @@ void ice_ptp_prepare_for_reset(struct ice_pf *pf, void ice_ptp_init(struct ice_pf *pf); void ice_ptp_release(struct ice_pf *pf); void ice_ptp_link_change(struct ice_pf *pf, bool linkup); +void ice_ptp_queue_work(struct ice_pf *pf); #else /* IS_ENABLED(CONFIG_PTP_1588_CLOCK) */ static inline int ice_ptp_hwtstamp_get(struct net_device *netdev, @@ -386,6 +387,10 @@ static inline void ice_ptp_link_change(struct ice_pf *pf, bool linkup) { } +static inline void ice_ptp_queue_work(struct ice_pf *pf) +{ +} + static inline int ice_ptp_clock_index(struct ice_pf *pf) { return -1; From 234e615bfece9e3e91c50fe49ab9e68ee37c791a Mon Sep 17 00:00:00 2001 From: Mohammad Heib Date: Sun, 28 Dec 2025 21:40:21 +0200 Subject: [PATCH 051/167] ice: drop udp_tunnel_get_rx_info() call from ndo_open() The ice driver calls udp_tunnel_get_rx_info() during ice_open_internal(). This is redundant because UDP tunnel RX offload state is preserved across device down/up cycles. The udp_tunnel core handles synchronization automatically when required. Furthermore, recent changes in the udp_tunnel infrastructure require querying RX info while holding the udp_tunnel lock. Calling it directly from the ndo_open path violates this requirement, triggering the following lockdep warning: Call Trace: ice_open_internal+0x253/0x350 [ice] __udp_tunnel_nic_assert_locked+0x86/0xb0 [udp_tunnel] __dev_open+0x2f5/0x880 __dev_change_flags+0x44c/0x660 netif_change_flags+0x80/0x160 devinet_ioctl+0xd21/0x15f0 inet_ioctl+0x311/0x350 sock_ioctl+0x114/0x220 __x64_sys_ioctl+0x131/0x1a0 ... Remove the redundant and unsafe call to udp_tunnel_get_rx_info() from ice_open_internal() to resolve the locking violation Fixes: 1ead7501094c ("udp_tunnel: remove rtnl_lock dependency") Signed-off-by: Mohammad Heib Reviewed-by: Aleksandr Loktionov Tested-by: Rinitha S (A Contingent worker at Intel) Signed-off-by: Tony Nguyen --- drivers/net/ethernet/intel/ice/ice_main.c | 3 --- 1 file changed, 3 deletions(-) diff --git a/drivers/net/ethernet/intel/ice/ice_main.c b/drivers/net/ethernet/intel/ice/ice_main.c index c1033070d0c7..d04605d3e61a 100644 --- a/drivers/net/ethernet/intel/ice/ice_main.c +++ b/drivers/net/ethernet/intel/ice/ice_main.c @@ -9662,9 +9662,6 @@ int ice_open_internal(struct net_device *netdev) netdev_err(netdev, "Failed to open VSI 0x%04X on switch 0x%04X\n", vsi->vsi_num, vsi->vsw->sw_id); - /* Update existing tunnels information */ - udp_tunnel_get_rx_info(netdev); - return err; } From 40857194956dcaf3d2b66d6bd113d844c93bef54 Mon Sep 17 00:00:00 2001 From: Mohammad Heib Date: Sun, 28 Dec 2025 21:40:20 +0200 Subject: [PATCH 052/167] i40e: drop udp_tunnel_get_rx_info() call from i40e_open() The i40e driver calls udp_tunnel_get_rx_info() during i40e_open(). This is redundant because UDP tunnel RX offload state is preserved across device down/up cycles. The udp_tunnel core handles synchronization automatically when required. Furthermore, recent changes in the udp_tunnel infrastructure require querying RX info while holding the udp_tunnel lock. Calling it directly from the ndo_open path violates this requirement, triggering the following lockdep warning: Call Trace: ? __udp_tunnel_nic_assert_locked+0x39/0x40 [udp_tunnel] i40e_open+0x135/0x14f [i40e] __dev_open+0x121/0x2e0 __dev_change_flags+0x227/0x270 dev_change_flags+0x3d/0xb0 devinet_ioctl+0x56f/0x860 sock_do_ioctl+0x7b/0x130 __x64_sys_ioctl+0x91/0xd0 do_syscall_64+0x90/0x170 ... Remove the redundant and unsafe call to udp_tunnel_get_rx_info() from i40e_open() resolve the locking violation. Fixes: 1ead7501094c ("udp_tunnel: remove rtnl_lock dependency") Signed-off-by: Mohammad Heib Reviewed-by: Aleksandr Loktionov Reviewed-by: Paul Menzel Tested-by: Rinitha S (A Contingent worker at Intel) Signed-off-by: Tony Nguyen --- drivers/net/ethernet/intel/i40e/i40e_main.c | 1 - 1 file changed, 1 deletion(-) diff --git a/drivers/net/ethernet/intel/i40e/i40e_main.c b/drivers/net/ethernet/intel/i40e/i40e_main.c index 0b1cc0481027..d3bc3207054f 100644 --- a/drivers/net/ethernet/intel/i40e/i40e_main.c +++ b/drivers/net/ethernet/intel/i40e/i40e_main.c @@ -9030,7 +9030,6 @@ int i40e_open(struct net_device *netdev) TCP_FLAG_FIN | TCP_FLAG_CWR) >> 16); wr32(&pf->hw, I40E_GLLAN_TSOMSK_L, be32_to_cpu(TCP_FLAG_CWR) >> 16); - udp_tunnel_get_rx_info(netdev); return 0; } From f8ade833b733ae0b72e87ac6d2202a1afbe3eb4a Mon Sep 17 00:00:00 2001 From: Sean Christopherson Date: Tue, 27 Jan 2026 17:43:08 -0800 Subject: [PATCH 053/167] KVM: x86: Explicitly configure supported XSS from {svm,vmx}_set_cpu_caps() Explicitly configure KVM's supported XSS as part of each vendor's setup flow to fix a bug where clearing SHSTK and IBT in kvm_cpu_caps, e.g. due to lack of CET XFEATURE support, makes kvm-intel.ko unloadable when nested VMX is enabled, i.e. when nested=1. The late clearing results in nested_vmx_setup_{entry,exit}_ctls() clearing VM_{ENTRY,EXIT}_LOAD_CET_STATE when nested_vmx_setup_ctls_msrs() runs during the CPU compatibility checks, ultimately leading to a mismatched VMCS config due to the reference config having the CET bits set, but every CPU's "local" config having the bits cleared. Note, kvm_caps.supported_{xcr0,xss} are unconditionally initialized by kvm_x86_vendor_init(), before calling into vendor code, and not referenced between ops->hardware_setup() and their current/old location. Fixes: 69cc3e886582 ("KVM: x86: Add XSS support for CET_KERNEL and CET_USER") Cc: stable@vger.kernel.org Cc: Mathias Krause Cc: John Allen Cc: Rick Edgecombe Cc: Chao Gao Cc: Binbin Wu Cc: Xiaoyao Li Reviewed-by: Xiaoyao Li Reviewed-by: Binbin Wu Link: https://patch.msgid.link/20260128014310.3255561-2-seanjc@google.com Signed-off-by: Sean Christopherson --- arch/x86/kvm/svm/svm.c | 2 ++ arch/x86/kvm/vmx/vmx.c | 2 ++ arch/x86/kvm/x86.c | 30 +++++++++++++++++------------- arch/x86/kvm/x86.h | 2 ++ 4 files changed, 23 insertions(+), 13 deletions(-) diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c index 24d59ccfa40d..4394be40fe78 100644 --- a/arch/x86/kvm/svm/svm.c +++ b/arch/x86/kvm/svm/svm.c @@ -5284,6 +5284,8 @@ static __init void svm_set_cpu_caps(void) */ kvm_cpu_cap_clear(X86_FEATURE_BUS_LOCK_DETECT); kvm_cpu_cap_clear(X86_FEATURE_MSR_IMM); + + kvm_setup_xss_caps(); } static __init int svm_hardware_setup(void) diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c index 6b96f7aea20b..8c94241fbcca 100644 --- a/arch/x86/kvm/vmx/vmx.c +++ b/arch/x86/kvm/vmx/vmx.c @@ -8051,6 +8051,8 @@ static __init void vmx_set_cpu_caps(void) kvm_cpu_cap_clear(X86_FEATURE_SHSTK); kvm_cpu_cap_clear(X86_FEATURE_IBT); } + + kvm_setup_xss_caps(); } static bool vmx_is_io_intercepted(struct kvm_vcpu *vcpu, diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index 63afdb6bb078..72d37c8930ad 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -9953,6 +9953,23 @@ static struct notifier_block pvclock_gtod_notifier = { }; #endif +void kvm_setup_xss_caps(void) +{ + if (!kvm_cpu_cap_has(X86_FEATURE_XSAVES)) + kvm_caps.supported_xss = 0; + + if (!kvm_cpu_cap_has(X86_FEATURE_SHSTK) && + !kvm_cpu_cap_has(X86_FEATURE_IBT)) + kvm_caps.supported_xss &= ~XFEATURE_MASK_CET_ALL; + + if ((kvm_caps.supported_xss & XFEATURE_MASK_CET_ALL) != XFEATURE_MASK_CET_ALL) { + kvm_cpu_cap_clear(X86_FEATURE_SHSTK); + kvm_cpu_cap_clear(X86_FEATURE_IBT); + kvm_caps.supported_xss &= ~XFEATURE_MASK_CET_ALL; + } +} +EXPORT_SYMBOL_FOR_KVM_INTERNAL(kvm_setup_xss_caps); + static inline void kvm_ops_update(struct kvm_x86_init_ops *ops) { memcpy(&kvm_x86_ops, ops->runtime_ops, sizeof(kvm_x86_ops)); @@ -10125,19 +10142,6 @@ int kvm_x86_vendor_init(struct kvm_x86_init_ops *ops) if (!tdp_enabled) kvm_caps.supported_quirks &= ~KVM_X86_QUIRK_IGNORE_GUEST_PAT; - if (!kvm_cpu_cap_has(X86_FEATURE_XSAVES)) - kvm_caps.supported_xss = 0; - - if (!kvm_cpu_cap_has(X86_FEATURE_SHSTK) && - !kvm_cpu_cap_has(X86_FEATURE_IBT)) - kvm_caps.supported_xss &= ~XFEATURE_MASK_CET_ALL; - - if ((kvm_caps.supported_xss & XFEATURE_MASK_CET_ALL) != XFEATURE_MASK_CET_ALL) { - kvm_cpu_cap_clear(X86_FEATURE_SHSTK); - kvm_cpu_cap_clear(X86_FEATURE_IBT); - kvm_caps.supported_xss &= ~XFEATURE_MASK_CET_ALL; - } - if (kvm_caps.has_tsc_control) { /* * Make sure the user can only configure tsc_khz values that diff --git a/arch/x86/kvm/x86.h b/arch/x86/kvm/x86.h index fdab0ad49098..00de24f55b1f 100644 --- a/arch/x86/kvm/x86.h +++ b/arch/x86/kvm/x86.h @@ -471,6 +471,8 @@ extern struct kvm_host_values kvm_host; extern bool enable_pmu; +void kvm_setup_xss_caps(void); + /* * Get a filtered version of KVM's supported XCR0 that strips out dynamic * features for which the current process doesn't (yet) have permission to use. From 403dd7da22461b0c7fd18d5cd4373b9a1ec8b5f7 Mon Sep 17 00:00:00 2001 From: Alexey Kardashevskiy Date: Fri, 23 Jan 2026 16:30:56 +1100 Subject: [PATCH 054/167] crypto/ccp: Use PCI bridge defaults for IDE The current number of streams in AMD TSM is 1 which is too little, the core uses 255. Also, even if the module parameter is increased, calling pci_ide_set_nr_streams() second time triggers WARN_ON. Simplify the code by sticking to the PCI core defaults. Fixes: 4be423572da1 ("crypto/ccp: Implement SEV-TIO PCIe IDE (phase1)") Signed-off-by: Alexey Kardashevskiy Acked-by: Tom Lendacky Link: https://patch.msgid.link/20260123053057.1350569-2-aik@amd.com Signed-off-by: Dan Williams --- drivers/crypto/ccp/sev-dev-tsm.c | 12 ------------ 1 file changed, 12 deletions(-) diff --git a/drivers/crypto/ccp/sev-dev-tsm.c b/drivers/crypto/ccp/sev-dev-tsm.c index ea29cd5d0ff9..7407b77c2ef2 100644 --- a/drivers/crypto/ccp/sev-dev-tsm.c +++ b/drivers/crypto/ccp/sev-dev-tsm.c @@ -19,12 +19,6 @@ MODULE_IMPORT_NS("PCI_IDE"); -#define TIO_DEFAULT_NR_IDE_STREAMS 1 - -static uint nr_ide_streams = TIO_DEFAULT_NR_IDE_STREAMS; -module_param_named(ide_nr, nr_ide_streams, uint, 0644); -MODULE_PARM_DESC(ide_nr, "Set the maximum number of IDE streams per PHB"); - #define dev_to_sp(dev) ((struct sp_device *)dev_get_drvdata(dev)) #define dev_to_psp(dev) ((struct psp_device *)(dev_to_sp(dev)->psp_data)) #define dev_to_sev(dev) ((struct sev_device *)(dev_to_psp(dev)->sev_data)) @@ -193,7 +187,6 @@ static void streams_teardown(struct pci_ide **ide) static int stream_alloc(struct pci_dev *pdev, struct pci_ide **ide, unsigned int tc) { - struct pci_dev *rp = pcie_find_root_port(pdev); struct pci_ide *ide1; if (ide[tc]) { @@ -201,11 +194,6 @@ static int stream_alloc(struct pci_dev *pdev, struct pci_ide **ide, return -EBUSY; } - /* FIXME: find a better way */ - if (nr_ide_streams != TIO_DEFAULT_NR_IDE_STREAMS) - pci_notice(pdev, "Enable non-default %d streams", nr_ide_streams); - pci_ide_set_nr_streams(to_pci_host_bridge(rp->bus->bridge), nr_ide_streams); - ide1 = pci_ide_stream_alloc(pdev); if (!ide1) return -EFAULT; From c2012263047689e495e81c96d7d5b0586299578d Mon Sep 17 00:00:00 2001 From: Alexey Kardashevskiy Date: Fri, 23 Jan 2026 16:30:57 +1100 Subject: [PATCH 055/167] crypto/ccp: Allow multiple streams on the same root bridge With SEV-TIO the low-level TSM driver is responsible for allocating a Stream ID. The Stream ID needs to be unique within each IDE partner port. Fix the Stream ID selection to reuse the host bridge stream resource id which is a pool of 256 ids per host bridge on AMD platforms. Otherwise, only one device per-host bridge can establish Selective Stream IDE. Fixes: 4be423572da1 ("crypto/ccp: Implement SEV-TIO PCIe IDE (phase1)") Signed-off-by: Alexey Kardashevskiy Acked-by: Tom Lendacky Link: https://patch.msgid.link/20260123053057.1350569-3-aik@amd.com [djbw: clarify end user impact in changelog] Signed-off-by: Dan Williams --- drivers/crypto/ccp/sev-dev-tsm.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/drivers/crypto/ccp/sev-dev-tsm.c b/drivers/crypto/ccp/sev-dev-tsm.c index 7407b77c2ef2..40d02adaf3f6 100644 --- a/drivers/crypto/ccp/sev-dev-tsm.c +++ b/drivers/crypto/ccp/sev-dev-tsm.c @@ -198,8 +198,7 @@ static int stream_alloc(struct pci_dev *pdev, struct pci_ide **ide, if (!ide1) return -EFAULT; - /* Blindly assign streamid=0 to TC=0, and so on */ - ide1->stream_id = tc; + ide1->stream_id = ide1->host_bridge_stream; ide[tc] = ide1; From adcbadfd8e05d3558c9cfaa783f17c645181165f Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Marek=20Beh=C3=BAn?= Date: Thu, 29 Jan 2026 09:22:27 +0100 Subject: [PATCH 056/167] net: sfp: Fix quirk for Ubiquiti U-Fiber Instant SFP module MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Commit fd580c9830316eda ("net: sfp: augment SFP parsing with phy_interface_t bitmap") did not add augumentation for the interface bitmap in the quirk for Ubiquiti U-Fiber Instant. The subsequent commit f81fa96d8a6c7a77 ("net: phylink: use phy_interface_t bitmaps for optical modules") then changed phylink code for selection of SFP interface: instead of using link mode bitmap, the interface bitmap is used, and the fastest interface mode supported by both SFP module and MAC is chosen. Since the interface bitmap contains also modes faster than 1000base-x, this caused a regression wherein this module stopped working out-of-the-box. Fix this. Fixes: fd580c9830316eda ("net: sfp: augment SFP parsing with phy_interface_t bitmap") Signed-off-by: Marek Behún Reviewed-by: Maxime Chevallier Reviewed-by: Russell King (Oracle) Link: https://patch.msgid.link/20260129082227.17443-1-kabel@kernel.org Signed-off-by: Jakub Kicinski --- drivers/net/phy/sfp.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/drivers/net/phy/sfp.c b/drivers/net/phy/sfp.c index 47f095bd91ce..3e023723887c 100644 --- a/drivers/net/phy/sfp.c +++ b/drivers/net/phy/sfp.c @@ -479,6 +479,8 @@ static void sfp_quirk_ubnt_uf_instant(const struct sfp_eeprom_id *id, linkmode_zero(caps->link_modes); linkmode_set_bit(ETHTOOL_LINK_MODE_1000baseX_Full_BIT, caps->link_modes); + phy_interface_zero(caps->interfaces); + __set_bit(PHY_INTERFACE_MODE_1000BASEX, caps->interfaces); } #define SFP_QUIRK(_v, _p, _s, _f) \ From f8db6475a83649689c087a8f52486fcc53e627e9 Mon Sep 17 00:00:00 2001 From: Eric Dumazet Date: Thu, 29 Jan 2026 20:43:59 +0000 Subject: [PATCH 057/167] macvlan: fix error recovery in macvlan_common_newlink() valis provided a nice repro to crash the kernel: ip link add p1 type veth peer p2 ip link set address 00:00:00:00:00:20 dev p1 ip link set up dev p1 ip link set up dev p2 ip link add mv0 link p2 type macvlan mode source ip link add invalid% link p2 type macvlan mode source macaddr add 00:00:00:00:00:20 ping -c1 -I p1 1.2.3.4 He also gave a very detailed analysis: The issue is triggered when a new macvlan link is created with MACVLAN_MODE_SOURCE mode and MACVLAN_MACADDR_ADD (or MACVLAN_MACADDR_SET) parameter, lower device already has a macvlan port and register_netdevice() called from macvlan_common_newlink() fails (e.g. because of the invalid link name). In this case macvlan_hash_add_source is called from macvlan_change_sources() / macvlan_common_newlink(): This adds a reference to vlan to the port's vlan_source_hash using macvlan_source_entry. vlan is a pointer to the priv data of the link that is being created. When register_netdevice() fails, the error is returned from macvlan_newlink() to rtnl_newlink_create(): if (ops->newlink) err = ops->newlink(dev, ¶ms, extack); else err = register_netdevice(dev); if (err < 0) { free_netdev(dev); goto out; } and free_netdev() is called, causing a kvfree() on the struct net_device that is still referenced in the source entry attached to the lower device's macvlan port. Now all packets sent on the macvlan port with a matching source mac address will trigger a use-after-free in macvlan_forward_source(). With all that, my fix is to make sure we call macvlan_flush_sources() regardless of @create value whenever "goto destroy_macvlan_port;" path is taken. Many thanks to valis for following up on this issue. Fixes: aa5fd0fb7748 ("driver: macvlan: Destroy new macvlan port if macvlan_common_newlink failed.") Signed-off-by: Eric Dumazet Reported-by: valis Reported-by: syzbot+7182fbe91e58602ec1fe@syzkaller.appspotmail.com Closes: https: //lore.kernel.org/netdev/695fb1e8.050a0220.1c677c.039f.GAE@google.com/T/#u Cc: Boudewijn van der Heide Link: https://patch.msgid.link/20260129204359.632556-1-edumazet@google.com Signed-off-by: Jakub Kicinski --- drivers/net/macvlan.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/drivers/net/macvlan.c b/drivers/net/macvlan.c index b4df7e184791..c509228be84d 100644 --- a/drivers/net/macvlan.c +++ b/drivers/net/macvlan.c @@ -1567,9 +1567,10 @@ int macvlan_common_newlink(struct net_device *dev, /* the macvlan port may be freed by macvlan_uninit when fail to register. * so we destroy the macvlan port only when it's valid. */ - if (create && macvlan_port_get_rtnl(lowerdev)) { + if (macvlan_port_get_rtnl(lowerdev)) { macvlan_flush_sources(port, vlan); - macvlan_port_destroy(port->dev); + if (create) + macvlan_port_destroy(port->dev); } return err; } From 6d06bc83a5ae8777a5f7a81c32dd75b8d9b2fe04 Mon Sep 17 00:00:00 2001 From: Sergey Senozhatsky Date: Thu, 29 Jan 2026 12:10:30 +0900 Subject: [PATCH 058/167] net: usb: r8152: fix resume reset deadlock rtl8152 can trigger device reset during reset which potentially can result in a deadlock: **** DPM device timeout after 10 seconds; 15 seconds until panic **** Call Trace: schedule+0x483/0x1370 schedule_preempt_disabled+0x15/0x30 __mutex_lock_common+0x1fd/0x470 __rtl8152_set_mac_address+0x80/0x1f0 dev_set_mac_address+0x7f/0x150 rtl8152_post_reset+0x72/0x150 usb_reset_device+0x1d0/0x220 rtl8152_resume+0x99/0xc0 usb_resume_interface+0x3e/0xc0 usb_resume_both+0x104/0x150 usb_resume+0x22/0x110 The problem is that rtl8152 resume calls reset under tp->control mutex while reset basically re-enters rtl8152 and attempts to acquire the same tp->control lock once again. Reset INACCESSIBLE device outside of tp->control mutex scope to avoid recursive mutex_lock() deadlock. Fixes: 4933b066fefb ("r8152: If inaccessible at resume time, issue a reset") Reviewed-by: Douglas Anderson Signed-off-by: Sergey Senozhatsky Link: https://patch.msgid.link/20260129031106.3805887-1-senozhatsky@chromium.org Signed-off-by: Jakub Kicinski --- drivers/net/usb/r8152.c | 29 +++++++++++++++-------------- 1 file changed, 15 insertions(+), 14 deletions(-) diff --git a/drivers/net/usb/r8152.c b/drivers/net/usb/r8152.c index fa5192583860..2f3baa5f6e9c 100644 --- a/drivers/net/usb/r8152.c +++ b/drivers/net/usb/r8152.c @@ -8535,19 +8535,6 @@ static int rtl8152_system_resume(struct r8152 *tp) usb_submit_urb(tp->intr_urb, GFP_NOIO); } - /* If the device is RTL8152_INACCESSIBLE here then we should do a - * reset. This is important because the usb_lock_device_for_reset() - * that happens as a result of usb_queue_reset_device() will silently - * fail if the device was suspended or if too much time passed. - * - * NOTE: The device is locked here so we can directly do the reset. - * We don't need usb_lock_device_for_reset() because that's just a - * wrapper over device_lock() and device_resume() (which calls us) - * does that for us. - */ - if (test_bit(RTL8152_INACCESSIBLE, &tp->flags)) - usb_reset_device(tp->udev); - return 0; } @@ -8658,19 +8645,33 @@ static int rtl8152_suspend(struct usb_interface *intf, pm_message_t message) static int rtl8152_resume(struct usb_interface *intf) { struct r8152 *tp = usb_get_intfdata(intf); + bool runtime_resume = test_bit(SELECTIVE_SUSPEND, &tp->flags); int ret; mutex_lock(&tp->control); rtl_reset_ocp_base(tp); - if (test_bit(SELECTIVE_SUSPEND, &tp->flags)) + if (runtime_resume) ret = rtl8152_runtime_resume(tp); else ret = rtl8152_system_resume(tp); mutex_unlock(&tp->control); + /* If the device is RTL8152_INACCESSIBLE here then we should do a + * reset. This is important because the usb_lock_device_for_reset() + * that happens as a result of usb_queue_reset_device() will silently + * fail if the device was suspended or if too much time passed. + * + * NOTE: The device is locked here so we can directly do the reset. + * We don't need usb_lock_device_for_reset() because that's just a + * wrapper over device_lock() and device_resume() (which calls us) + * does that for us. + */ + if (!runtime_resume && test_bit(RTL8152_INACCESSIBLE, &tp->flags)) + usb_reset_device(tp->udev); + return ret; } From 23ea2a4c72323feb6e3e025e8a6f18336513d5ad Mon Sep 17 00:00:00 2001 From: Thomas Weissschuh Date: Wed, 7 Jan 2026 11:01:49 +0100 Subject: [PATCH 059/167] ARM: 9468/1: fix memset64() on big-endian MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit On big-endian systems the 32-bit low and high halves need to be swapped for the underlying assembly implementation to work correctly. Fixes: fd1d362600e2 ("ARM: implement memset32 & memset64") Cc: stable@vger.kernel.org Signed-off-by: Thomas Weißschuh Reviewed-by: Matthew Wilcox (Oracle) Reviewed-by: Arnd Bergmann Signed-off-by: Russell King (Oracle) --- arch/arm/include/asm/string.h | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/arch/arm/include/asm/string.h b/arch/arm/include/asm/string.h index 6c607c68f3ad..c35250c4991b 100644 --- a/arch/arm/include/asm/string.h +++ b/arch/arm/include/asm/string.h @@ -42,7 +42,10 @@ static inline void *memset32(uint32_t *p, uint32_t v, __kernel_size_t n) extern void *__memset64(uint64_t *, uint32_t low, __kernel_size_t, uint32_t hi); static inline void *memset64(uint64_t *p, uint64_t v, __kernel_size_t n) { - return __memset64(p, v, n * 8, v >> 32); + if (IS_ENABLED(CONFIG_CPU_LITTLE_ENDIAN)) + return __memset64(p, v, n * 8, v >> 32); + else + return __memset64(p, v >> 32, n * 8, v); } /* From 615901b57b7ef8eb655f71358f7e956e42bcd16b Mon Sep 17 00:00:00 2001 From: "Rafael J. Wysocki" Date: Sat, 31 Jan 2026 07:23:28 -0800 Subject: [PATCH 060/167] hwmon: (acpi_power_meter) Fix deadlocks related to acpi_power_meter_notify() The acpi_power_meter driver's .notify() callback function, acpi_power_meter_notify(), calls hwmon_device_unregister() under a lock that is also acquired by callbacks in sysfs attributes of the device being unregistered which is prone to deadlocks between sysfs access and device removal. Address this by moving the hwmon device removal in acpi_power_meter_notify() outside the lock in question, but notice that doing it alone is not sufficient because two concurrent METER_NOTIFY_CONFIG notifications may be attempting to remove the same device at the same time. To prevent that from happening, add a new lock serializing the execution of the switch () statement in acpi_power_meter_notify(). For simplicity, it is a static mutex which should not be a problem from the performance perspective. The new lock also allows the hwmon_device_register_with_info() in acpi_power_meter_notify() to be called outside the inner lock because it prevents the other notifications handled by that function from manipulating the "resource" object while the hwmon device based on it is being registered. The sending of ACPI netlink messages from acpi_power_meter_notify() is serialized by the new lock too which generally helps to ensure that the order of handling firmware notifications is the same as the order of sending netlink messages related to them. In addition, notice that hwmon_device_register_with_info() may fail in which case resource->hwmon_dev will become an error pointer, so add checks to avoid attempting to unregister the hwmon device pointer to by it in that case to acpi_power_meter_notify() and acpi_power_meter_remove(). Fixes: 16746ce8adfe ("hwmon: (acpi_power_meter) Replace the deprecated hwmon_device_register") Closes: https://lore.kernel.org/linux-hwmon/CAK8fFZ58fidGUCHi5WFX0uoTPzveUUDzT=k=AAm4yWo3bAuCFg@mail.gmail.com/ Reported-by: Jaroslav Pulchart Signed-off-by: Rafael J. Wysocki Signed-off-by: Guenter Roeck --- drivers/hwmon/acpi_power_meter.c | 17 ++++++++++++++--- 1 file changed, 14 insertions(+), 3 deletions(-) diff --git a/drivers/hwmon/acpi_power_meter.c b/drivers/hwmon/acpi_power_meter.c index 29ccdc2fb7ff..de408df0c4d7 100644 --- a/drivers/hwmon/acpi_power_meter.c +++ b/drivers/hwmon/acpi_power_meter.c @@ -47,6 +47,8 @@ static int cap_in_hardware; static bool force_cap_on; +static DEFINE_MUTEX(acpi_notify_lock); + static int can_cap_in_hardware(void) { return force_cap_on || cap_in_hardware; @@ -823,18 +825,26 @@ static void acpi_power_meter_notify(struct acpi_device *device, u32 event) resource = acpi_driver_data(device); + guard(mutex)(&acpi_notify_lock); + switch (event) { case METER_NOTIFY_CONFIG: + if (!IS_ERR(resource->hwmon_dev)) + hwmon_device_unregister(resource->hwmon_dev); + mutex_lock(&resource->lock); + free_capabilities(resource); remove_domain_devices(resource); - hwmon_device_unregister(resource->hwmon_dev); res = read_capabilities(resource); if (res) dev_err_once(&device->dev, "read capabilities failed.\n"); res = read_domain_devices(resource); if (res && res != -ENODEV) dev_err_once(&device->dev, "read domain devices failed.\n"); + + mutex_unlock(&resource->lock); + resource->hwmon_dev = hwmon_device_register_with_info(&device->dev, ACPI_POWER_METER_NAME, @@ -843,7 +853,7 @@ static void acpi_power_meter_notify(struct acpi_device *device, u32 event) power_extra_groups); if (IS_ERR(resource->hwmon_dev)) dev_err_once(&device->dev, "register hwmon device failed.\n"); - mutex_unlock(&resource->lock); + break; case METER_NOTIFY_TRIP: sysfs_notify(&device->dev.kobj, NULL, POWER_AVERAGE_NAME); @@ -953,7 +963,8 @@ static void acpi_power_meter_remove(struct acpi_device *device) return; resource = acpi_driver_data(device); - hwmon_device_unregister(resource->hwmon_dev); + if (!IS_ERR(resource->hwmon_dev)) + hwmon_device_unregister(resource->hwmon_dev); remove_domain_devices(resource); free_capabilities(resource); From fdf3f6800be36377e045e2448087f12132b88d2f Mon Sep 17 00:00:00 2001 From: Jakub Kicinski Date: Thu, 29 Jan 2026 19:38:27 -0800 Subject: [PATCH 061/167] net: don't touch dev->stats in BPF redirect paths Gal reports that BPF redirect increments dev->stats.tx_errors on failure. This is not correct, most modern drivers completely ignore dev->stats so these drops will be invisible to the user. Core code should use the dedicated core stats which are folded into device stats in dev_get_stats(). Note that we're switching from tx_errors to tx_dropped. Core only has tx_dropped, hence presumably users already expect that counter to increment for "stack" Tx issues. Reported-by: Gal Pressman Link: https://lore.kernel.org/c5df3b60-246a-4030-9c9a-0a35cd1ca924@nvidia.com Fixes: b4ab31414970 ("bpf: Add redirect_neigh helper as redirect drop-in") Acked-by: Martin KaFai Lau Acked-by: Daniel Borkmann Reviewed-by: Eric Dumazet Link: https://patch.msgid.link/20260130033827.698841-1-kuba@kernel.org Signed-off-by: Jakub Kicinski --- net/core/filter.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/net/core/filter.c b/net/core/filter.c index bcd73d9bd764..029e560e32ce 100644 --- a/net/core/filter.c +++ b/net/core/filter.c @@ -2289,12 +2289,12 @@ static int __bpf_redirect_neigh_v6(struct sk_buff *skb, struct net_device *dev, err = bpf_out_neigh_v6(net, skb, dev, nh); if (unlikely(net_xmit_eval(err))) - DEV_STATS_INC(dev, tx_errors); + dev_core_stats_tx_dropped_inc(dev); else ret = NET_XMIT_SUCCESS; goto out_xmit; out_drop: - DEV_STATS_INC(dev, tx_errors); + dev_core_stats_tx_dropped_inc(dev); kfree_skb(skb); out_xmit: return ret; @@ -2396,12 +2396,12 @@ static int __bpf_redirect_neigh_v4(struct sk_buff *skb, struct net_device *dev, err = bpf_out_neigh_v4(net, skb, dev, nh); if (unlikely(net_xmit_eval(err))) - DEV_STATS_INC(dev, tx_errors); + dev_core_stats_tx_dropped_inc(dev); else ret = NET_XMIT_SUCCESS; goto out_xmit; out_drop: - DEV_STATS_INC(dev, tx_errors); + dev_core_stats_tx_dropped_inc(dev); kfree_skb(skb); out_xmit: return ret; From 6c65db809796717f0a96cf22f80405dbc1a31a4b Mon Sep 17 00:00:00 2001 From: John Ogness Date: Fri, 30 Jan 2026 12:38:08 +0106 Subject: [PATCH 062/167] Revert "drm/nouveau/disp: Set drm_mode_config_funcs.atomic_(check|commit)" This reverts commit 604826acb3f53c6648a7ee99a3914ead680ab7fb. Apparently there is more to supporting atomic modesetting than providing atomic_(check|commit) callbacks. Before this revert: WARNING: [] drivers/gpu/drm/drm_plane.c:389 at .__drm_universal_plane_init+0x13c/0x794 [drm], CPU#1: modprobe/1790 BUG: Kernel NULL pointer dereference on read at 0x00000000 .drm_atomic_get_plane_state+0xd4/0x210 [drm] (unreliable) .drm_client_modeset_commit_atomic+0xf8/0x338 [drm] .drm_client_modeset_commit_locked+0x80/0x260 [drm] .drm_client_modeset_commit+0x40/0x7c [drm] .__drm_fb_helper_restore_fbdev_mode_unlocked.part.0+0xfc/0x108 [drm_kms_helper] .drm_fb_helper_set_par+0x8c/0xb8 [drm_kms_helper] .fbcon_init+0x31c/0x618 [...] .__drm_fb_helper_initial_config_and_unlock+0x474/0x7f4 [drm_kms_helper] .drm_fbdev_client_hotplug+0xb0/0x120 [drm_client_lib] .drm_client_register+0x88/0xe4 [drm] .drm_fbdev_client_setup+0x12c/0x19b4 [drm_client_lib] .drm_client_setup+0x15c/0x18c [drm_client_lib] .nouveau_drm_probe+0x19c/0x268 [nouveau] Fixes: 604826acb3f5 ("drm/nouveau/disp: Set drm_mode_config_funcs.atomic_(check|commit)") Reported-by: John Ogness Closes: https://lore.kernel.org/lkml/87ldhf1prw.fsf@jogness.linutronix.de Signed-off-by: John Ogness Tested-by: Daniel Palmer Link: https://patch.msgid.link/20260130113230.2311221-1-john.ogness@linutronix.de Signed-off-by: Danilo Krummrich --- drivers/gpu/drm/nouveau/nouveau_display.c | 2 -- 1 file changed, 2 deletions(-) diff --git a/drivers/gpu/drm/nouveau/nouveau_display.c b/drivers/gpu/drm/nouveau/nouveau_display.c index 829c2b573971..00515623a2cc 100644 --- a/drivers/gpu/drm/nouveau/nouveau_display.c +++ b/drivers/gpu/drm/nouveau/nouveau_display.c @@ -352,8 +352,6 @@ nouveau_user_framebuffer_create(struct drm_device *dev, static const struct drm_mode_config_funcs nouveau_mode_config_funcs = { .fb_create = nouveau_user_framebuffer_create, - .atomic_commit = drm_atomic_helper_commit, - .atomic_check = drm_atomic_helper_check, }; From daafcc0ef0b358d9d622b6e3b7c43767aa3814ee Mon Sep 17 00:00:00 2001 From: Deepanshu Kartikey Date: Fri, 30 Jan 2026 21:22:15 +0530 Subject: [PATCH 063/167] tracing/dma: Cap dma_map_sg tracepoint arrays to prevent buffer overflow The dma_map_sg tracepoint can trigger a perf buffer overflow when tracing large scatter-gather lists. With devices like virtio-gpu creating large DRM buffers, nents can exceed 1000 entries, resulting in: phys_addrs: 1000 * 8 bytes = 8,000 bytes dma_addrs: 1000 * 8 bytes = 8,000 bytes lengths: 1000 * 4 bytes = 4,000 bytes Total: ~20,000 bytes This exceeds PERF_MAX_TRACE_SIZE (8192 bytes), causing: WARNING: CPU: 0 PID: 5497 at kernel/trace/trace_event_perf.c:405 perf buffer not large enough, wanted 24620, have 8192 Cap all three dynamic arrays at 128 entries using min() in the array size calculation. This ensures arrays are only as large as needed (up to the cap), avoiding unnecessary memory allocation for small operations while preventing overflow for large ones. The tracepoint now records the full nents/ents counts and a truncated flag so users can see when data has been capped. Changes in v2: - Use min(nents, DMA_TRACE_MAX_ENTRIES) for dynamic array sizing instead of fixed DMA_TRACE_MAX_ENTRIES allocation (feedback from Steven Rostedt) - This allocates only what's needed up to the cap, avoiding waste for small operations Reported-by: syzbot+28cea38c382fd15e751a@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=28cea38c382fd15e751a Tested-by: syzbot+28cea38c382fd15e751a@syzkaller.appspotmail.com Signed-off-by: Deepanshu Kartikey Reviwed-by: Sean Anderson Signed-off-by: Marek Szyprowski Link: https://lore.kernel.org/r/20260130155215.69737-1-kartikey406@gmail.com --- include/trace/events/dma.h | 25 +++++++++++++++++++------ 1 file changed, 19 insertions(+), 6 deletions(-) diff --git a/include/trace/events/dma.h b/include/trace/events/dma.h index b3fef140ae15..33e99e792f1a 100644 --- a/include/trace/events/dma.h +++ b/include/trace/events/dma.h @@ -275,6 +275,8 @@ TRACE_EVENT(dma_free_sgt, sizeof(u64), sizeof(u64))) ); +#define DMA_TRACE_MAX_ENTRIES 128 + TRACE_EVENT(dma_map_sg, TP_PROTO(struct device *dev, struct scatterlist *sgl, int nents, int ents, enum dma_data_direction dir, unsigned long attrs), @@ -282,9 +284,12 @@ TRACE_EVENT(dma_map_sg, TP_STRUCT__entry( __string(device, dev_name(dev)) - __dynamic_array(u64, phys_addrs, nents) - __dynamic_array(u64, dma_addrs, ents) - __dynamic_array(unsigned int, lengths, ents) + __field(int, full_nents) + __field(int, full_ents) + __field(bool, truncated) + __dynamic_array(u64, phys_addrs, min(nents, DMA_TRACE_MAX_ENTRIES)) + __dynamic_array(u64, dma_addrs, min(ents, DMA_TRACE_MAX_ENTRIES)) + __dynamic_array(unsigned int, lengths, min(ents, DMA_TRACE_MAX_ENTRIES)) __field(enum dma_data_direction, dir) __field(unsigned long, attrs) ), @@ -292,11 +297,16 @@ TRACE_EVENT(dma_map_sg, TP_fast_assign( struct scatterlist *sg; int i; + int traced_nents = min_t(int, nents, DMA_TRACE_MAX_ENTRIES); + int traced_ents = min_t(int, ents, DMA_TRACE_MAX_ENTRIES); __assign_str(device); - for_each_sg(sgl, sg, nents, i) + __entry->full_nents = nents; + __entry->full_ents = ents; + __entry->truncated = (nents > DMA_TRACE_MAX_ENTRIES) || (ents > DMA_TRACE_MAX_ENTRIES); + for_each_sg(sgl, sg, traced_nents, i) ((u64 *)__get_dynamic_array(phys_addrs))[i] = sg_phys(sg); - for_each_sg(sgl, sg, ents, i) { + for_each_sg(sgl, sg, traced_ents, i) { ((u64 *)__get_dynamic_array(dma_addrs))[i] = sg_dma_address(sg); ((unsigned int *)__get_dynamic_array(lengths))[i] = @@ -306,9 +316,12 @@ TRACE_EVENT(dma_map_sg, __entry->attrs = attrs; ), - TP_printk("%s dir=%s dma_addrs=%s sizes=%s phys_addrs=%s attrs=%s", + TP_printk("%s dir=%s nents=%d/%d ents=%d/%d%s dma_addrs=%s sizes=%s phys_addrs=%s attrs=%s", __get_str(device), decode_dma_data_direction(__entry->dir), + min_t(int, __entry->full_nents, DMA_TRACE_MAX_ENTRIES), __entry->full_nents, + min_t(int, __entry->full_ents, DMA_TRACE_MAX_ENTRIES), __entry->full_ents, + __entry->truncated ? " [TRUNCATED]" : "", __print_array(__get_dynamic_array(dma_addrs), __get_dynamic_array_len(dma_addrs) / sizeof(u64), sizeof(u64)), From c33efdfcfa6f80e05ce1ee33694c1bad4994cd78 Mon Sep 17 00:00:00 2001 From: Shanker Donthineni Date: Thu, 29 Jan 2026 12:13:17 -0600 Subject: [PATCH 064/167] dma: contiguous: Check return value of dma_contiguous_reserve_area() Commit 8f1fc1bf1a3d ("dma: contiguous: Reserve default CMA heap") introduced a bug where dma_heap_cma_register_heap() is called with a NULL pointer when dma_contiguous_reserve_area() fails to reserve the CMA area. When dma_contiguous_reserve_area() fails, dma_contiguous_default_area remains NULL (initialized as a global variable), but the code doesn't check the return value and proceeds to call dma_heap_cma_register_heap() with this NULL pointer. Later during boot, add_cma_heaps() iterates through the dma_areas[] array and attempts to register heaps. When it encounters the NULL pointer stored by the earlier call, it crashes in __add_cma_heap() -> dma_heap_add() when trying to dereference the NULL CMA pointer. The crash manifests as: Unable to handle kernel NULL pointer dereference at virtual address 0000000000000038 ... Call trace: dma_heap_add+0x40/0x2b0 __add_cma_heap+0x80/0xe0 add_cma_heaps+0x64/0xb0 do_one_initcall+0x60/0x318 kernel_init_freeable+0x260/0x2f0 kernel_init+0x2c/0x168 ret_from_fork+0x10/0x20 Fix this by checking the return value of dma_contiguous_reserve_area() and only calling dma_heap_cma_register_heap() when the reservation succeeds. Fixes: 8f1fc1bf1a3d ("dma: contiguous: Reserve default CMA heap") Signed-off-by: Shanker Donthineni Reviewed-by: T.J. Mercier Signed-off-by: Marek Szyprowski Link: https://lore.kernel.org/r/20260129181317.2429196-1-sdonthineni@nvidia.com --- kernel/dma/contiguous.c | 10 ++++++---- 1 file changed, 6 insertions(+), 4 deletions(-) diff --git a/kernel/dma/contiguous.c b/kernel/dma/contiguous.c index 0e266979728b..c56004d314dc 100644 --- a/kernel/dma/contiguous.c +++ b/kernel/dma/contiguous.c @@ -257,10 +257,12 @@ void __init dma_contiguous_reserve(phys_addr_t limit) pr_debug("%s: reserving %ld MiB for global area\n", __func__, (unsigned long)selected_size / SZ_1M); - dma_contiguous_reserve_area(selected_size, selected_base, - selected_limit, - &dma_contiguous_default_area, - fixed); + ret = dma_contiguous_reserve_area(selected_size, selected_base, + selected_limit, + &dma_contiguous_default_area, + fixed); + if (ret) + return; ret = dma_heap_cma_register_heap(dma_contiguous_default_area); if (ret) From 78f1421dc3ac2588fb59b55e2b4c2c55a442913e Mon Sep 17 00:00:00 2001 From: Jani Nikula Date: Fri, 30 Jan 2026 17:13:19 +0200 Subject: [PATCH 065/167] Revert "drm/gma500: use drm_crtc_vblank_crtc()" This reverts commit d930ffa5d6e8867a290db9c6aad1c62731aeb2c3. According to Thomas, commit d930ffa5d6e8 ("drm/gma500: use drm_crtc_vblank_crtc()") breaks the driver with a NULL-ptr oops on startup. This is because the IRQ initialization in gma_irq_install() now uses CRTCs that are only allocated later in psb_modeset_init(). Stack trace is below. Revert. Go back to the drawing board. [ 65.831766] Oops: general protection fault, probably for non-canonical address 0xdffffc0000000021: 0000 [#1] SMP KASAN NOPTI [ 65.832114] KASAN: null-ptr-deref in range [0x0000000000000108-0x000000000000010f] [ 65.832232] CPU: 1 UID: 0 PID: 296 Comm: (udev-worker) Tainted: G E 6.19.0-rc6-1-default+ #4622 PREEMPT(voluntary) [ 65.832376] Tainted: [E]=UNSIGNED_MODULE [ 65.832448] Hardware name: /DN2800MT, BIOS MTCDT10N.86A.0164.2012.1213.1024 12/13/2012 [ 65.832543] RIP: 0010:drm_crtc_vblank_crtc+0x24/0xd0 [ 65.832652] Code: 90 90 90 90 90 90 0f 1f 44 00 00 48 89 f8 48 81 c7 18 01 00 00 48 83 ec 10 48 ba 00 00 00 00 00 fc ff df 48 89 f9 48 c1 e9 03 <0f> b6 14 11 84 d2 74 05 80 fa 03 7e 58 48 89 c6 8b 90 18 01 00 00 [ 65.832820] RSP: 0018:ffff88800c8f7688 EFLAGS: 00010006 [ 65.832919] RAX: fffffffffffffff0 RBX: ffff88800fff4928 RCX: 0000000000000021 [ 65.833011] RDX: dffffc0000000000 RSI: ffffc90000978130 RDI: 0000000000000108 [ 65.833107] RBP: ffffed1001ffea03 R08: 0000000000000000 R09: ffffed100191eec7 [ 65.833199] R10: 0000000000000001 R11: 0000000000000001 R12: ffff8880014480c8 [ 65.833289] R13: dffffc0000000000 R14: fffffffffffffff0 R15: ffff88800fff4000 [ 65.833380] FS: 00007fe53d4d5d80(0000) GS:ffff888148dd8000(0000) knlGS:0000000000000000 [ 65.833488] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 65.833575] CR2: 00007fac707420b8 CR3: 000000000ebd1000 CR4: 00000000000006f0 [ 65.833668] Call Trace: [ 65.833735] [ 65.833808] gma_irq_preinstall+0x190/0x3e0 [gma500_gfx] [ 65.834054] gma_irq_install+0xb2/0x240 [gma500_gfx] [ 65.834282] psb_driver_load+0x7b2/0x1090 [gma500_gfx] [ 65.834516] ? __pfx_psb_driver_load+0x10/0x10 [gma500_gfx] [ 65.834726] ? ksize+0x1d/0x40 [ 65.834817] ? drmm_add_final_kfree+0x3b/0xb0 [ 65.834935] ? __pfx_psb_pci_probe+0x10/0x10 [gma500_gfx] [ 65.835164] psb_pci_probe+0xc8/0x150 [gma500_gfx] [ 65.835384] local_pci_probe+0xd5/0x190 [ 65.835492] pci_call_probe+0x167/0x4b0 [ 65.835594] ? __pfx_pci_call_probe+0x10/0x10 [ 65.835693] ? local_clock+0x11/0x30 [ 65.835808] ? __pfx___driver_attach+0x10/0x10 [ 65.835915] ? do_raw_spin_unlock+0x55/0x230 [ 65.836014] ? pci_match_device+0x303/0x790 [ 65.836124] ? pci_match_device+0x386/0x790 [ 65.836226] ? __pfx_pci_assign_irq+0x10/0x10 [ 65.836320] ? kernfs_create_link+0x16a/0x230 [ 65.836418] ? do_raw_spin_unlock+0x55/0x230 [ 65.836526] ? __pfx___driver_attach+0x10/0x10 [ 65.836626] pci_device_probe+0x175/0x2c0 [ 65.836735] call_driver_probe+0x64/0x1e0 [ 65.836842] really_probe+0x194/0x740 [ 65.836951] ? __pfx___driver_attach+0x10/0x10 [ 65.837053] __driver_probe_device+0x18c/0x3a0 [ 65.837163] ? __pfx___driver_attach+0x10/0x10 [ 65.837262] driver_probe_device+0x4a/0x120 [ 65.837369] __driver_attach+0x19c/0x550 [ 65.837474] ? __pfx___driver_attach+0x10/0x10 [ 65.837575] bus_for_each_dev+0xe6/0x150 [ 65.837669] ? local_clock+0x11/0x30 [ 65.837770] ? __pfx_bus_for_each_dev+0x10/0x10 [ 65.837891] bus_add_driver+0x2af/0x4f0 [ 65.838000] ? __pfx_psb_init+0x10/0x10 [gma500_gfx] [ 65.838236] driver_register+0x19f/0x3a0 [ 65.838342] ? rcu_is_watching+0x11/0xb0 [ 65.838446] do_one_initcall+0xb5/0x3a0 [ 65.838546] ? __pfx_do_one_initcall+0x10/0x10 [ 65.838644] ? __kasan_slab_alloc+0x2c/0x70 [ 65.838741] ? rcu_is_watching+0x11/0xb0 [ 65.838837] ? __kmalloc_cache_noprof+0x3e8/0x6e0 [ 65.838937] ? klp_module_coming+0x1a0/0x2e0 [ 65.839033] ? do_init_module+0x85/0x7f0 [ 65.839126] ? kasan_unpoison+0x40/0x70 [ 65.839230] do_init_module+0x26e/0x7f0 [ 65.839341] ? __pfx_do_init_module+0x10/0x10 [ 65.839450] init_module_from_file+0x13f/0x160 [ 65.839549] ? __pfx_init_module_from_file+0x10/0x10 [ 65.839651] ? __lock_acquire+0x578/0xae0 [ 65.839791] ? do_raw_spin_unlock+0x55/0x230 [ 65.839886] ? idempotent_init_module+0x585/0x720 [ 65.839993] idempotent_init_module+0x1ff/0x720 [ 65.840097] ? __pfx_cred_has_capability.isra.0+0x10/0x10 [ 65.840211] ? __pfx_idempotent_init_module+0x10/0x10 Reported-by: Thomas Zimmermann Closes: https://lore.kernel.org/r/5aec1964-072c-4335-8f37-35e6efb4910e@suse.de Fixes: d930ffa5d6e8 ("drm/gma500: use drm_crtc_vblank_crtc()") Cc: Patrik Jakobsson Signed-off-by: Jani Nikula Reviewed-by: Thomas Zimmermann Tested-by: Thomas Zimmermann Signed-off-by: Patrik Jakobsson Link: https://patch.msgid.link/20260130151319.31264-1-jani.nikula@intel.com --- drivers/gpu/drm/gma500/psb_irq.c | 36 ++++++++++++-------------------- 1 file changed, 13 insertions(+), 23 deletions(-) diff --git a/drivers/gpu/drm/gma500/psb_irq.c b/drivers/gpu/drm/gma500/psb_irq.c index 3a946b472064..c224c7ff353c 100644 --- a/drivers/gpu/drm/gma500/psb_irq.c +++ b/drivers/gpu/drm/gma500/psb_irq.c @@ -250,7 +250,6 @@ static irqreturn_t gma_irq_handler(int irq, void *arg) void gma_irq_preinstall(struct drm_device *dev) { struct drm_psb_private *dev_priv = to_drm_psb_private(dev); - struct drm_crtc *crtc; unsigned long irqflags; spin_lock_irqsave(&dev_priv->irqmask_lock, irqflags); @@ -261,15 +260,10 @@ void gma_irq_preinstall(struct drm_device *dev) PSB_WSGX32(0x00000000, PSB_CR_EVENT_HOST_ENABLE); PSB_RSGX32(PSB_CR_EVENT_HOST_ENABLE); - drm_for_each_crtc(crtc, dev) { - struct drm_vblank_crtc *vblank = drm_crtc_vblank_crtc(crtc); - - if (vblank->enabled) { - u32 mask = drm_crtc_index(crtc) ? _PSB_VSYNC_PIPEB_FLAG : - _PSB_VSYNC_PIPEA_FLAG; - dev_priv->vdc_irq_mask |= mask; - } - } + if (dev->vblank[0].enabled) + dev_priv->vdc_irq_mask |= _PSB_VSYNC_PIPEA_FLAG; + if (dev->vblank[1].enabled) + dev_priv->vdc_irq_mask |= _PSB_VSYNC_PIPEB_FLAG; /* Revisit this area - want per device masks ? */ if (dev_priv->ops->hotplug) @@ -284,8 +278,8 @@ void gma_irq_preinstall(struct drm_device *dev) void gma_irq_postinstall(struct drm_device *dev) { struct drm_psb_private *dev_priv = to_drm_psb_private(dev); - struct drm_crtc *crtc; unsigned long irqflags; + unsigned int i; spin_lock_irqsave(&dev_priv->irqmask_lock, irqflags); @@ -298,13 +292,11 @@ void gma_irq_postinstall(struct drm_device *dev) PSB_WVDC32(dev_priv->vdc_irq_mask, PSB_INT_ENABLE_R); PSB_WVDC32(0xFFFFFFFF, PSB_HWSTAM); - drm_for_each_crtc(crtc, dev) { - struct drm_vblank_crtc *vblank = drm_crtc_vblank_crtc(crtc); - - if (vblank->enabled) - gma_enable_pipestat(dev_priv, drm_crtc_index(crtc), PIPE_VBLANK_INTERRUPT_ENABLE); + for (i = 0; i < dev->num_crtcs; ++i) { + if (dev->vblank[i].enabled) + gma_enable_pipestat(dev_priv, i, PIPE_VBLANK_INTERRUPT_ENABLE); else - gma_disable_pipestat(dev_priv, drm_crtc_index(crtc), PIPE_VBLANK_INTERRUPT_ENABLE); + gma_disable_pipestat(dev_priv, i, PIPE_VBLANK_INTERRUPT_ENABLE); } if (dev_priv->ops->hotplug_enable) @@ -345,8 +337,8 @@ void gma_irq_uninstall(struct drm_device *dev) { struct drm_psb_private *dev_priv = to_drm_psb_private(dev); struct pci_dev *pdev = to_pci_dev(dev->dev); - struct drm_crtc *crtc; unsigned long irqflags; + unsigned int i; if (!dev_priv->irq_enabled) return; @@ -358,11 +350,9 @@ void gma_irq_uninstall(struct drm_device *dev) PSB_WVDC32(0xFFFFFFFF, PSB_HWSTAM); - drm_for_each_crtc(crtc, dev) { - struct drm_vblank_crtc *vblank = drm_crtc_vblank_crtc(crtc); - - if (vblank->enabled) - gma_disable_pipestat(dev_priv, drm_crtc_index(crtc), PIPE_VBLANK_INTERRUPT_ENABLE); + for (i = 0; i < dev->num_crtcs; ++i) { + if (dev->vblank[i].enabled) + gma_disable_pipestat(dev_priv, i, PIPE_VBLANK_INTERRUPT_ENABLE); } dev_priv->vdc_irq_mask &= _PSB_IRQ_SGX_FLAG | From 1425900231372acf870dd89e8d3bb4935f7f0c81 Mon Sep 17 00:00:00 2001 From: Maciej Strozek Date: Wed, 28 Jan 2026 09:24:05 +0000 Subject: [PATCH 066/167] ASoC: sof_sdw: Add a quirk for Lenovo laptop using sidecar amps with cs42l43 Add a quirk for a Lenovo laptop (SSID: 0x17aa3821) to allow using sidecar CS35L57 amps with CS42L43 codec. Signed-off-by: Maciej Strozek Reviewed-by: Cezary Rojewski Link: https://patch.msgid.link/20260128092410.1540583-1-mstrozek@opensource.cirrus.com Signed-off-by: Mark Brown --- sound/soc/intel/boards/sof_sdw.c | 1 + 1 file changed, 1 insertion(+) diff --git a/sound/soc/intel/boards/sof_sdw.c b/sound/soc/intel/boards/sof_sdw.c index 8721a098d53f..50b838be24e9 100644 --- a/sound/soc/intel/boards/sof_sdw.c +++ b/sound/soc/intel/boards/sof_sdw.c @@ -838,6 +838,7 @@ static const struct snd_pci_quirk sof_sdw_ssid_quirk_table[] = { SND_PCI_QUIRK(0x17aa, 0x2347, "Lenovo P16", SOC_SDW_CODEC_MIC), SND_PCI_QUIRK(0x17aa, 0x2348, "Lenovo P16", SOC_SDW_CODEC_MIC), SND_PCI_QUIRK(0x17aa, 0x2349, "Lenovo P1", SOC_SDW_CODEC_MIC), + SND_PCI_QUIRK(0x17aa, 0x3821, "Lenovo 0x3821", SOC_SDW_SIDECAR_AMPS), {} }; From 6b641122d31f9d33e7d60047ee0586d1659f3f54 Mon Sep 17 00:00:00 2001 From: Tagir Garaev Date: Sun, 1 Feb 2026 15:17:28 +0300 Subject: [PATCH 067/167] ASoC: Intel: sof_es8336: Add DMI quirk for Huawei BOD-WXX9 Add DMI entry for Huawei Matebook D (BOD-WXX9) with HEADPHONE_GPIO and DMIC quirks. This device has ES8336 codec with: - GPIO 16 (headphone-enable) for headphone amplifier control - GPIO 17 (speakers-enable) for speaker amplifier control - GPIO 269 for jack detection IRQ - 2-channel DMIC Hardware investigation shows that both GPIO 16 and 17 are required for proper audio routing, as headphones and speakers share the same physical output (HPOL/HPOR) and are separated only via amplifier enable signals. RFC: Seeking advice on GPIO control issue: GPIO values change in driver (gpiod_get_value() shows logical value changes) but not physically (debugfs gpio shows no change). The same gpiod_set_value_cansleep() calls work correctly in probe context with msleep(), but fail when called from DAPM event callbacks. Context information from diagnostics: - in_atomic=0, in_interrupt=0, irqs_disabled=0 - Process context: pipewire - GPIO 17 (speakers): changes in driver, no physical change - GPIO 16 (headphone): changes in driver, no physical change In Windows, audio switching works without visible GPIO changes, suggesting possible ACPI/firmware involvement. Any suggestions on how to properly control these GPIOs from DAPM events would be appreciated. Signed-off-by: Tagir Garaev Link: https://patch.msgid.link/20260201121728.16597-1-tgaraev653@gmail.com Signed-off-by: Mark Brown --- sound/soc/intel/boards/sof_es8336.c | 9 +++++++++ 1 file changed, 9 insertions(+) diff --git a/sound/soc/intel/boards/sof_es8336.c b/sound/soc/intel/boards/sof_es8336.c index fce50fd9f093..e9ee752d5ec3 100644 --- a/sound/soc/intel/boards/sof_es8336.c +++ b/sound/soc/intel/boards/sof_es8336.c @@ -334,6 +334,15 @@ static int sof_es8336_quirk_cb(const struct dmi_system_id *id) * if the topology file is modified as well. */ static const struct dmi_system_id sof_es8336_quirk_table[] = { + { + .callback = sof_es8336_quirk_cb, + .matches = { + DMI_MATCH(DMI_SYS_VENDOR, "HUAWEI"), + DMI_MATCH(DMI_PRODUCT_NAME, "BOD-WXX9"), + }, + .driver_data = (void *)(SOF_ES8336_HEADPHONE_GPIO | + SOF_ES8336_ENABLE_DMIC) + }, { .callback = sof_es8336_quirk_cb, .matches = { From e77a4081d7e324dfa876a9560b2a78969446ba82 Mon Sep 17 00:00:00 2001 From: Charles Keepax Date: Fri, 30 Jan 2026 15:09:27 +0000 Subject: [PATCH 068/167] ASoC: cs42l43: Correct handling of 3-pole jack load detection The load detection process for 3-pole jacks requires slightly updated reference values to ensure an accurate result. Update the code to apply different tunings for the 3-pole and 4-pole cases. This also updates the thresholds overall so update the relevant comments to match. Signed-off-by: Charles Keepax Link: https://patch.msgid.link/20260130150927.2964664-1-ckeepax@opensource.cirrus.com Signed-off-by: Mark Brown --- sound/soc/codecs/cs42l43-jack.c | 37 +++++++++++++++++++++++++++------ 1 file changed, 31 insertions(+), 6 deletions(-) diff --git a/sound/soc/codecs/cs42l43-jack.c b/sound/soc/codecs/cs42l43-jack.c index b719d62635a0..b83bc4de1301 100644 --- a/sound/soc/codecs/cs42l43-jack.c +++ b/sound/soc/codecs/cs42l43-jack.c @@ -496,7 +496,23 @@ void cs42l43_bias_sense_timeout(struct work_struct *work) pm_runtime_put_autosuspend(priv->dev); } -static void cs42l43_start_load_detect(struct cs42l43_codec *priv) +static const struct reg_sequence cs42l43_3pole_patch[] = { + { 0x4000, 0x00000055 }, + { 0x4000, 0x000000AA }, + { 0x17420, 0x8500F300 }, + { 0x17424, 0x36003E00 }, + { 0x4000, 0x00000000 }, +}; + +static const struct reg_sequence cs42l43_4pole_patch[] = { + { 0x4000, 0x00000055 }, + { 0x4000, 0x000000AA }, + { 0x17420, 0x7800E600 }, + { 0x17424, 0x36003800 }, + { 0x4000, 0x00000000 }, +}; + +static void cs42l43_start_load_detect(struct cs42l43_codec *priv, bool mic) { struct cs42l43 *cs42l43 = priv->core; @@ -520,6 +536,15 @@ static void cs42l43_start_load_detect(struct cs42l43_codec *priv) dev_err(priv->dev, "Load detect HP power down timed out\n"); } + if (mic) + regmap_multi_reg_write_bypassed(cs42l43->regmap, + cs42l43_4pole_patch, + ARRAY_SIZE(cs42l43_4pole_patch)); + else + regmap_multi_reg_write_bypassed(cs42l43->regmap, + cs42l43_3pole_patch, + ARRAY_SIZE(cs42l43_3pole_patch)); + regmap_update_bits(cs42l43->regmap, CS42L43_BLOCK_EN3, CS42L43_ADC1_EN_MASK | CS42L43_ADC2_EN_MASK, 0); regmap_update_bits(cs42l43->regmap, CS42L43_DACCNFG2, CS42L43_HP_HPF_EN_MASK, 0); @@ -598,7 +623,7 @@ static int cs42l43_run_load_detect(struct cs42l43_codec *priv, bool mic) reinit_completion(&priv->load_detect); - cs42l43_start_load_detect(priv); + cs42l43_start_load_detect(priv, mic); time_left = wait_for_completion_timeout(&priv->load_detect, msecs_to_jiffies(CS42L43_LOAD_TIMEOUT_MS)); cs42l43_stop_load_detect(priv); @@ -622,11 +647,11 @@ static int cs42l43_run_load_detect(struct cs42l43_codec *priv, bool mic) } switch (val & CS42L43_AMP3_RES_DET_MASK) { - case 0x0: // low impedance - case 0x1: // high impedance + case 0x0: // < 22 Ohm impedance + case 0x1: // < 150 Ohm impedance + case 0x2: // < 1000 Ohm impedance return CS42L43_JACK_HEADPHONE; - case 0x2: // lineout - case 0x3: // Open circuit + case 0x3: // > 1000 Ohm impedance return CS42L43_JACK_LINEOUT; default: return -EINVAL; From 611c7d2262d5645118e0b3a9a88475d35a8366f2 Mon Sep 17 00:00:00 2001 From: Dirk Su Date: Thu, 29 Jan 2026 14:50:19 +0800 Subject: [PATCH 069/167] ASoC: amd: yc: Add quirk for HP 200 G2a 16 Fix the missing mic on HP 200 G2a 16 by adding quirk with the board ID 8EE4 Signed-off-by: Dirk Su Link: https://patch.msgid.link/20260129065038.39349-1-dirk.su@canonical.com Signed-off-by: Mark Brown --- sound/soc/amd/yc/acp6x-mach.c | 7 +++++++ 1 file changed, 7 insertions(+) diff --git a/sound/soc/amd/yc/acp6x-mach.c b/sound/soc/amd/yc/acp6x-mach.c index c18da0915baa..67f2fee19398 100644 --- a/sound/soc/amd/yc/acp6x-mach.c +++ b/sound/soc/amd/yc/acp6x-mach.c @@ -640,6 +640,13 @@ static const struct dmi_system_id yc_acp_quirk_table[] = { DMI_MATCH(DMI_BOARD_NAME, "8BD6"), } }, + { + .driver_data = &acp6x_card, + .matches = { + DMI_MATCH(DMI_BOARD_VENDOR, "HP"), + DMI_MATCH(DMI_BOARD_NAME, "8EE4"), + } + }, { .driver_data = &acp6x_card, .matches = { From 10db9f6899dd3a2dfd26efd40afd308891dc44a8 Mon Sep 17 00:00:00 2001 From: Richard Fitzgerald Date: Fri, 30 Jan 2026 17:12:56 +0000 Subject: [PATCH 070/167] firmware: cs_dsp: rate-limit log messages in KUnit builds Use the dev_*_ratelimit() macros if the cs_dsp KUnit tests are enabled in the build, and allow the KUnit tests to disable message output. Some of the KUnit tests cause a very large number of log messages from cs_dsp, because the tests perform many different test cases. This could cause some lines to be dropped from the kernel log. Dropped lines can prevent the KUnit wrappers from parsing the ktap output in the dmesg log. The KUnit builds of cs_dsp export three bools that the KUnit tests can use to entirely disable log output of err, warn and info messages. Some tests have been updated to use this, replacing the previous fudge of a usleep() in the exit handler of each test. We don't necessarily want to disable all log messages if they aren't expected to be excessive, so the rate-limiting allows leaving some logging enabled. The rate-limited macros are not used in normal builds because it is not appropriate to rate-limit every message. That could cause important messages to be dropped, and there wouldn't be such a high rate of messages in normal operation. Signed-off-by: Richard Fitzgerald Reported-by: Mark Brown Closes: https://lore.kernel.org/linux-sound/af393f08-facb-4c44-a054-1f61254803ec@opensource.cirrus.com/T/#t Fixes: cd8c058499b6 ("firmware: cs_dsp: Add KUnit testing of bin error cases") Link: https://patch.msgid.link/20260130171256.863152-1-rf@opensource.cirrus.com Signed-off-by: Mark Brown --- drivers/firmware/cirrus/cs_dsp.c | 37 +++++++++++++++++++ drivers/firmware/cirrus/cs_dsp.h | 18 +++++++++ .../firmware/cirrus/test/cs_dsp_test_bin.c | 22 ++++++++++- .../cirrus/test/cs_dsp_test_bin_error.c | 24 +++++++++--- .../firmware/cirrus/test/cs_dsp_test_wmfw.c | 26 ++++++++++++- .../cirrus/test/cs_dsp_test_wmfw_error.c | 24 +++++++++--- drivers/firmware/cirrus/test/cs_dsp_tests.c | 1 + 7 files changed, 138 insertions(+), 14 deletions(-) create mode 100644 drivers/firmware/cirrus/cs_dsp.h diff --git a/drivers/firmware/cirrus/cs_dsp.c b/drivers/firmware/cirrus/cs_dsp.c index 525ac0f0a75d..73d201e7d992 100644 --- a/drivers/firmware/cirrus/cs_dsp.c +++ b/drivers/firmware/cirrus/cs_dsp.c @@ -9,6 +9,7 @@ * Cirrus Logic International Semiconductor Ltd. */ +#include #include #include #include @@ -24,6 +25,41 @@ #include #include +#include "cs_dsp.h" + +/* + * When the KUnit test is running the error-case tests will cause a lot + * of messages. Rate-limit to prevent overflowing the kernel log buffer + * during KUnit test runs. + */ +#if IS_ENABLED(CONFIG_FW_CS_DSP_KUNIT_TEST) +bool cs_dsp_suppress_err_messages; +EXPORT_SYMBOL_IF_KUNIT(cs_dsp_suppress_err_messages); + +bool cs_dsp_suppress_warn_messages; +EXPORT_SYMBOL_IF_KUNIT(cs_dsp_suppress_warn_messages); + +bool cs_dsp_suppress_info_messages; +EXPORT_SYMBOL_IF_KUNIT(cs_dsp_suppress_info_messages); + +#define cs_dsp_err(_dsp, fmt, ...) \ + do { \ + if (!cs_dsp_suppress_err_messages) \ + dev_err_ratelimited(_dsp->dev, "%s: " fmt, _dsp->name, ##__VA_ARGS__); \ + } while (false) +#define cs_dsp_warn(_dsp, fmt, ...) \ + do { \ + if (!cs_dsp_suppress_warn_messages) \ + dev_warn_ratelimited(_dsp->dev, "%s: " fmt, _dsp->name, ##__VA_ARGS__); \ + } while (false) +#define cs_dsp_info(_dsp, fmt, ...) \ + do { \ + if (!cs_dsp_suppress_info_messages) \ + dev_info_ratelimited(_dsp->dev, "%s: " fmt, _dsp->name, ##__VA_ARGS__); \ + } while (false) +#define cs_dsp_dbg(_dsp, fmt, ...) \ + dev_dbg_ratelimited(_dsp->dev, "%s: " fmt, _dsp->name, ##__VA_ARGS__) +#else #define cs_dsp_err(_dsp, fmt, ...) \ dev_err(_dsp->dev, "%s: " fmt, _dsp->name, ##__VA_ARGS__) #define cs_dsp_warn(_dsp, fmt, ...) \ @@ -32,6 +68,7 @@ dev_info(_dsp->dev, "%s: " fmt, _dsp->name, ##__VA_ARGS__) #define cs_dsp_dbg(_dsp, fmt, ...) \ dev_dbg(_dsp->dev, "%s: " fmt, _dsp->name, ##__VA_ARGS__) +#endif #define ADSP1_CONTROL_1 0x00 #define ADSP1_CONTROL_2 0x02 diff --git a/drivers/firmware/cirrus/cs_dsp.h b/drivers/firmware/cirrus/cs_dsp.h new file mode 100644 index 000000000000..adf543004aea --- /dev/null +++ b/drivers/firmware/cirrus/cs_dsp.h @@ -0,0 +1,18 @@ +/* SPDX-License-Identifier: GPL-2.0-only */ +/* + * cs_dsp.h -- Private header for cs_dsp driver. + * + * Copyright (C) 2026 Cirrus Logic, Inc. and + * Cirrus Logic International Semiconductor Ltd. + */ + +#ifndef FW_CS_DSP_H +#define FW_CS_DSP_H + +#if IS_ENABLED(CONFIG_KUNIT) +extern bool cs_dsp_suppress_err_messages; +extern bool cs_dsp_suppress_warn_messages; +extern bool cs_dsp_suppress_info_messages; +#endif + +#endif /* ifndef FW_CS_DSP_H */ diff --git a/drivers/firmware/cirrus/test/cs_dsp_test_bin.c b/drivers/firmware/cirrus/test/cs_dsp_test_bin.c index 163b7faecff4..2c6486fa9575 100644 --- a/drivers/firmware/cirrus/test/cs_dsp_test_bin.c +++ b/drivers/firmware/cirrus/test/cs_dsp_test_bin.c @@ -17,6 +17,8 @@ #include #include +#include "../cs_dsp.h" + /* * Test method is: * @@ -2224,7 +2226,22 @@ static int cs_dsp_bin_test_common_init(struct kunit *test, struct cs_dsp *dsp) return ret; /* Automatically call cs_dsp_remove() when test case ends */ - return kunit_add_action_or_reset(priv->test, _cs_dsp_remove_wrapper, dsp); + ret = kunit_add_action_or_reset(priv->test, _cs_dsp_remove_wrapper, dsp); + if (ret) + return ret; + + /* + * The large number of test cases will cause an unusually large amount + * of dev_info() messages from cs_dsp, so suppress these. + */ + cs_dsp_suppress_info_messages = true; + + return 0; +} + +static void cs_dsp_bin_test_exit(struct kunit *test) +{ + cs_dsp_suppress_info_messages = false; } static int cs_dsp_bin_test_halo_init(struct kunit *test) @@ -2536,18 +2553,21 @@ static struct kunit_case cs_dsp_bin_test_cases_adsp2[] = { static struct kunit_suite cs_dsp_bin_test_halo = { .name = "cs_dsp_bin_halo", .init = cs_dsp_bin_test_halo_init, + .exit = cs_dsp_bin_test_exit, .test_cases = cs_dsp_bin_test_cases_halo, }; static struct kunit_suite cs_dsp_bin_test_adsp2_32bit = { .name = "cs_dsp_bin_adsp2_32bit", .init = cs_dsp_bin_test_adsp2_32bit_init, + .exit = cs_dsp_bin_test_exit, .test_cases = cs_dsp_bin_test_cases_adsp2, }; static struct kunit_suite cs_dsp_bin_test_adsp2_16bit = { .name = "cs_dsp_bin_adsp2_16bit", .init = cs_dsp_bin_test_adsp2_16bit_init, + .exit = cs_dsp_bin_test_exit, .test_cases = cs_dsp_bin_test_cases_adsp2, }; diff --git a/drivers/firmware/cirrus/test/cs_dsp_test_bin_error.c b/drivers/firmware/cirrus/test/cs_dsp_test_bin_error.c index a7ec956d2724..631b9cb9eb25 100644 --- a/drivers/firmware/cirrus/test/cs_dsp_test_bin_error.c +++ b/drivers/firmware/cirrus/test/cs_dsp_test_bin_error.c @@ -18,6 +18,8 @@ #include #include +#include "../cs_dsp.h" + KUNIT_DEFINE_ACTION_WRAPPER(_put_device_wrapper, put_device, struct device *); KUNIT_DEFINE_ACTION_WRAPPER(_cs_dsp_remove_wrapper, cs_dsp_remove, struct cs_dsp *); @@ -380,11 +382,9 @@ static void bin_block_payload_len_garbage(struct kunit *test) static void cs_dsp_bin_err_test_exit(struct kunit *test) { - /* - * Testing error conditions can produce a lot of log output - * from cs_dsp error messages, so rate limit the test cases. - */ - usleep_range(200, 500); + cs_dsp_suppress_err_messages = false; + cs_dsp_suppress_warn_messages = false; + cs_dsp_suppress_info_messages = false; } static int cs_dsp_bin_err_test_common_init(struct kunit *test, struct cs_dsp *dsp, @@ -474,7 +474,19 @@ static int cs_dsp_bin_err_test_common_init(struct kunit *test, struct cs_dsp *ds return ret; /* Automatically call cs_dsp_remove() when test case ends */ - return kunit_add_action_or_reset(priv->test, _cs_dsp_remove_wrapper, dsp); + ret = kunit_add_action_or_reset(priv->test, _cs_dsp_remove_wrapper, dsp); + if (ret) + return ret; + + /* + * Testing error conditions can produce a lot of log output + * from cs_dsp error messages, so suppress messages. + */ + cs_dsp_suppress_err_messages = true; + cs_dsp_suppress_warn_messages = true; + cs_dsp_suppress_info_messages = true; + + return 0; } static int cs_dsp_bin_err_test_halo_init(struct kunit *test) diff --git a/drivers/firmware/cirrus/test/cs_dsp_test_wmfw.c b/drivers/firmware/cirrus/test/cs_dsp_test_wmfw.c index 9e997c4ee2d6..f02cb6cf7638 100644 --- a/drivers/firmware/cirrus/test/cs_dsp_test_wmfw.c +++ b/drivers/firmware/cirrus/test/cs_dsp_test_wmfw.c @@ -18,6 +18,8 @@ #include #include +#include "../cs_dsp.h" + /* * Test method is: * @@ -1853,7 +1855,22 @@ static int cs_dsp_wmfw_test_common_init(struct kunit *test, struct cs_dsp *dsp, return ret; /* Automatically call cs_dsp_remove() when test case ends */ - return kunit_add_action_or_reset(priv->test, _cs_dsp_remove_wrapper, dsp); + ret = kunit_add_action_or_reset(priv->test, _cs_dsp_remove_wrapper, dsp); + if (ret) + return ret; + + /* + * The large number of test cases will cause an unusually large amount + * of dev_info() messages from cs_dsp, so suppress these. + */ + cs_dsp_suppress_info_messages = true; + + return 0; +} + +static void cs_dsp_wmfw_test_exit(struct kunit *test) +{ + cs_dsp_suppress_info_messages = false; } static int cs_dsp_wmfw_test_halo_init(struct kunit *test) @@ -2163,42 +2180,49 @@ static struct kunit_case cs_dsp_wmfw_test_cases_adsp2[] = { static struct kunit_suite cs_dsp_wmfw_test_halo = { .name = "cs_dsp_wmfwV3_halo", .init = cs_dsp_wmfw_test_halo_init, + .exit = cs_dsp_wmfw_test_exit, .test_cases = cs_dsp_wmfw_test_cases_halo, }; static struct kunit_suite cs_dsp_wmfw_test_adsp2_32bit_wmfw0 = { .name = "cs_dsp_wmfwV0_adsp2_32bit", .init = cs_dsp_wmfw_test_adsp2_32bit_wmfw0_init, + .exit = cs_dsp_wmfw_test_exit, .test_cases = cs_dsp_wmfw_test_cases_adsp2, }; static struct kunit_suite cs_dsp_wmfw_test_adsp2_32bit_wmfw1 = { .name = "cs_dsp_wmfwV1_adsp2_32bit", .init = cs_dsp_wmfw_test_adsp2_32bit_wmfw1_init, + .exit = cs_dsp_wmfw_test_exit, .test_cases = cs_dsp_wmfw_test_cases_adsp2, }; static struct kunit_suite cs_dsp_wmfw_test_adsp2_32bit_wmfw2 = { .name = "cs_dsp_wmfwV2_adsp2_32bit", .init = cs_dsp_wmfw_test_adsp2_32bit_wmfw2_init, + .exit = cs_dsp_wmfw_test_exit, .test_cases = cs_dsp_wmfw_test_cases_adsp2, }; static struct kunit_suite cs_dsp_wmfw_test_adsp2_16bit_wmfw0 = { .name = "cs_dsp_wmfwV0_adsp2_16bit", .init = cs_dsp_wmfw_test_adsp2_16bit_wmfw0_init, + .exit = cs_dsp_wmfw_test_exit, .test_cases = cs_dsp_wmfw_test_cases_adsp2, }; static struct kunit_suite cs_dsp_wmfw_test_adsp2_16bit_wmfw1 = { .name = "cs_dsp_wmfwV1_adsp2_16bit", .init = cs_dsp_wmfw_test_adsp2_16bit_wmfw1_init, + .exit = cs_dsp_wmfw_test_exit, .test_cases = cs_dsp_wmfw_test_cases_adsp2, }; static struct kunit_suite cs_dsp_wmfw_test_adsp2_16bit_wmfw2 = { .name = "cs_dsp_wmfwV2_adsp2_16bit", .init = cs_dsp_wmfw_test_adsp2_16bit_wmfw2_init, + .exit = cs_dsp_wmfw_test_exit, .test_cases = cs_dsp_wmfw_test_cases_adsp2, }; diff --git a/drivers/firmware/cirrus/test/cs_dsp_test_wmfw_error.c b/drivers/firmware/cirrus/test/cs_dsp_test_wmfw_error.c index c309843261d7..37162d12e2fa 100644 --- a/drivers/firmware/cirrus/test/cs_dsp_test_wmfw_error.c +++ b/drivers/firmware/cirrus/test/cs_dsp_test_wmfw_error.c @@ -18,6 +18,8 @@ #include #include +#include "../cs_dsp.h" + KUNIT_DEFINE_ACTION_WRAPPER(_put_device_wrapper, put_device, struct device *); KUNIT_DEFINE_ACTION_WRAPPER(_cs_dsp_remove_wrapper, cs_dsp_remove, struct cs_dsp *); @@ -989,11 +991,9 @@ static void wmfw_v2_coeff_description_exceeds_block(struct kunit *test) static void cs_dsp_wmfw_err_test_exit(struct kunit *test) { - /* - * Testing error conditions can produce a lot of log output - * from cs_dsp error messages, so rate limit the test cases. - */ - usleep_range(200, 500); + cs_dsp_suppress_err_messages = false; + cs_dsp_suppress_warn_messages = false; + cs_dsp_suppress_info_messages = false; } static int cs_dsp_wmfw_err_test_common_init(struct kunit *test, struct cs_dsp *dsp, @@ -1072,7 +1072,19 @@ static int cs_dsp_wmfw_err_test_common_init(struct kunit *test, struct cs_dsp *d return ret; /* Automatically call cs_dsp_remove() when test case ends */ - return kunit_add_action_or_reset(priv->test, _cs_dsp_remove_wrapper, dsp); + ret = kunit_add_action_or_reset(priv->test, _cs_dsp_remove_wrapper, dsp); + if (ret) + return ret; + + /* + * Testing error conditions can produce a lot of log output + * from cs_dsp error messages, so suppress messages. + */ + cs_dsp_suppress_err_messages = true; + cs_dsp_suppress_warn_messages = true; + cs_dsp_suppress_info_messages = true; + + return 0; } static int cs_dsp_wmfw_err_test_halo_init(struct kunit *test) diff --git a/drivers/firmware/cirrus/test/cs_dsp_tests.c b/drivers/firmware/cirrus/test/cs_dsp_tests.c index 7b829a03ca52..288675fdbdc5 100644 --- a/drivers/firmware/cirrus/test/cs_dsp_tests.c +++ b/drivers/firmware/cirrus/test/cs_dsp_tests.c @@ -12,3 +12,4 @@ MODULE_AUTHOR("Richard Fitzgerald "); MODULE_LICENSE("GPL"); MODULE_IMPORT_NS("FW_CS_DSP"); MODULE_IMPORT_NS("FW_CS_DSP_KUNIT_TEST_UTILS"); +MODULE_IMPORT_NS("EXPORTED_FOR_KUNIT_TESTING"); From 0ae91d8ab70922fb74c22c20bedcb69459579b1c Mon Sep 17 00:00:00 2001 From: Pavel Begunkov Date: Sun, 1 Feb 2026 21:18:53 +0000 Subject: [PATCH 071/167] io_uring/zcrx: fix page array leak d9f595b9a65e ("io_uring/zcrx: fix leaking pages on sg init fail") fixed a page leakage but didn't free the page array, release it as well. Fixes: b84621d96ee02 ("io_uring/zcrx: allocate sgtable for umem areas") Signed-off-by: Pavel Begunkov Signed-off-by: Jens Axboe --- io_uring/zcrx.c | 1 + 1 file changed, 1 insertion(+) diff --git a/io_uring/zcrx.c b/io_uring/zcrx.c index b99cf2c6670a..f18c173a7bcb 100644 --- a/io_uring/zcrx.c +++ b/io_uring/zcrx.c @@ -197,6 +197,7 @@ static int io_import_umem(struct io_zcrx_ifq *ifq, GFP_KERNEL_ACCOUNT); if (ret) { unpin_user_pages(pages, nr_pages); + kvfree(pages); return ret; } From af07330e28ad65352126270b0b3af226df46e307 Mon Sep 17 00:00:00 2001 From: Pavel Begunkov Date: Sun, 1 Feb 2026 21:19:56 +0000 Subject: [PATCH 072/167] io_uring/zcrx: fix rq flush locking zcrx needs to keep the rq lock for uref manipulations, for now move all zcrx_return_buffers() under the lock. Fixes: 475eb39b00478 ("io_uring/zcrx: add sync refill queue flushing") Signed-off-by: Pavel Begunkov Signed-off-by: Jens Axboe --- io_uring/zcrx.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/io_uring/zcrx.c b/io_uring/zcrx.c index f18c173a7bcb..3d398283cf34 100644 --- a/io_uring/zcrx.c +++ b/io_uring/zcrx.c @@ -1069,8 +1069,6 @@ static unsigned zcrx_parse_rq(netmem_ref *netmem_array, unsigned nr, unsigned int mask = zcrx->rq_entries - 1; unsigned int i; - guard(spinlock_bh)(&zcrx->rq_lock); - nr = min(nr, io_zcrx_rqring_entries(zcrx)); for (i = 0; i < nr; i++) { struct io_uring_zcrx_rqe *rqe = io_zcrx_get_rqe(zcrx, mask); @@ -1115,9 +1113,11 @@ static int zcrx_flush_rq(struct io_ring_ctx *ctx, struct io_zcrx_ifq *zcrx, return -EINVAL; do { - nr = zcrx_parse_rq(netmems, ZCRX_FLUSH_BATCH, zcrx); + scoped_guard(spinlock_bh, &zcrx->rq_lock) { + nr = zcrx_parse_rq(netmems, ZCRX_FLUSH_BATCH, zcrx); + zcrx_return_buffers(netmems, nr); + } - zcrx_return_buffers(netmems, nr); total += nr; if (fatal_signal_pending(current)) From 41d9a6795b95d6ea28439ac1e9ce8c95bbca20fc Mon Sep 17 00:00:00 2001 From: Felix Gu Date: Mon, 2 Feb 2026 23:15:09 +0800 Subject: [PATCH 073/167] spi: tegra: Fix a memory leak in tegra_slink_probe() In tegra_slink_probe(), when platform_get_irq() fails, it directly returns from the function with an error code, which causes a memory leak. Replace it with a goto label to ensure proper cleanup. Fixes: eb9913b511f1 ("spi: tegra: Fix missing IRQ check in tegra_slink_probe()") Signed-off-by: Felix Gu Reviewed-by: Jon Hunter Link: https://patch.msgid.link/20260202-slink-v1-1-eac50433a6f9@gmail.com Signed-off-by: Mark Brown --- drivers/spi/spi-tegra20-slink.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/drivers/spi/spi-tegra20-slink.c b/drivers/spi/spi-tegra20-slink.c index fe452d03c1ee..709669610840 100644 --- a/drivers/spi/spi-tegra20-slink.c +++ b/drivers/spi/spi-tegra20-slink.c @@ -1086,8 +1086,10 @@ static int tegra_slink_probe(struct platform_device *pdev) reset_control_deassert(tspi->rst); spi_irq = platform_get_irq(pdev, 0); - if (spi_irq < 0) - return spi_irq; + if (spi_irq < 0) { + ret = spi_irq; + goto exit_pm_put; + } tspi->irq = spi_irq; ret = request_threaded_irq(tspi->irq, tegra_slink_isr, tegra_slink_isr_thread, IRQF_ONESHOT, From 43151f812886be1855d2cba059f9c93e4729460b Mon Sep 17 00:00:00 2001 From: Chen Ridong Date: Mon, 2 Feb 2026 12:27:16 +0000 Subject: [PATCH 074/167] cgroup/dmem: fix NULL pointer dereference when setting max An issue was triggered: BUG: kernel NULL pointer dereference, address: 0000000000000000 #PF: supervisor read access in kernel mode #PF: error_code(0x0000) - not-present page PGD 0 P4D 0 Oops: Oops: 0000 [#1] SMP NOPTI CPU: 15 UID: 0 PID: 658 Comm: bash Tainted: 6.19.0-rc6-next-2026012 Tainted: [O]=OOT_MODULE Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), RIP: 0010:strcmp+0x10/0x30 RSP: 0018:ffffc900017f7dc0 EFLAGS: 00000246 RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffff888107cd4358 RDX: 0000000019f73907 RSI: ffffffff82cc381a RDI: 0000000000000000 RBP: ffff8881016bef0d R08: 000000006c0e7145 R09: 0000000056c0e714 R10: 0000000000000001 R11: ffff888107cd4358 R12: 0007ffffffffffff R13: ffff888101399200 R14: ffff888100fcb360 R15: 0007ffffffffffff CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000000 CR3: 0000000105c79000 CR4: 00000000000006f0 Call Trace: dmemcg_limit_write.constprop.0+0x16d/0x390 ? __pfx_set_resource_max+0x10/0x10 kernfs_fop_write_iter+0x14e/0x200 vfs_write+0x367/0x510 ksys_write+0x66/0xe0 do_syscall_64+0x6b/0x390 entry_SYSCALL_64_after_hwframe+0x76/0x7e RIP: 0033:0x7f42697e1887 It was trriggered setting max without limitation, the command is like: "echo test/region0 > dmem.max". To fix this issue, add check whether options is valid after parsing the region_name. Fixes: b168ed458dde ("kernel/cgroup: Add "dmem" memory accounting cgroup") Cc: stable@vger.kernel.org # v6.14+ Signed-off-by: Chen Ridong Signed-off-by: Tejun Heo --- kernel/cgroup/dmem.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/kernel/cgroup/dmem.c b/kernel/cgroup/dmem.c index e12b946278b6..1f0d6caaf2fb 100644 --- a/kernel/cgroup/dmem.c +++ b/kernel/cgroup/dmem.c @@ -700,6 +700,9 @@ static ssize_t dmemcg_limit_write(struct kernfs_open_file *of, if (!region_name[0]) continue; + if (!options || !*options) + return -EINVAL; + rcu_read_lock(); region = dmemcg_get_region_by_name(region_name); rcu_read_unlock(); From 592a68212c5664bcaa88f24ed80bf791282790fe Mon Sep 17 00:00:00 2001 From: Chen Ridong Date: Mon, 2 Feb 2026 12:27:17 +0000 Subject: [PATCH 075/167] cgroup/dmem: avoid rcu warning when unregister region A warnning was detected: WARNING: suspicious RCU usage 6.19.0-rc7-next-20260129+ #1101 Tainted: G O kernel/cgroup/dmem.c:456 suspicious rcu_dereference_check() usage! other info that might help us debug this: rcu_scheduler_active = 2, debug_locks = 1 1 lock held by insmod/532: #0: ffffffff85e78b38 (dmemcg_lock){+.+.}-dmem_cgroup_unregister_region+ stack backtrace: CPU: 2 UID: 0 PID: 532 Comm: insmod Tainted: 6.19.0-rc7-next- Tainted: [O]=OOT_MODULE Call Trace: dump_stack_lvl+0xb0/0xd0 lockdep_rcu_suspicious+0x151/0x1c0 dmem_cgroup_unregister_region+0x1e2/0x380 ? __pfx_dmem_test_init+0x10/0x10 [dmem_uaf] dmem_test_init+0x65/0xff0 [dmem_uaf] do_one_initcall+0xbb/0x3a0 The macro list_for_each_rcu() must be used within an RCU read-side critical section (between rcu_read_lock() and rcu_read_unlock()). Using it outside that context, as seen in dmem_cgroup_unregister_region(), triggers the lockdep warning because the RCU protection is not guaranteed. Replace list_for_each_rcu() with list_for_each_entry_safe(), which is appropriate for traversal under spinlock protection where nodes may be deleted. Fixes: b168ed458dde ("kernel/cgroup: Add "dmem" memory accounting cgroup") Cc: stable@vger.kernel.org # v6.14+ Signed-off-by: Chen Ridong Signed-off-by: Tejun Heo --- kernel/cgroup/dmem.c | 7 ++----- 1 file changed, 2 insertions(+), 5 deletions(-) diff --git a/kernel/cgroup/dmem.c b/kernel/cgroup/dmem.c index 1f0d6caaf2fb..787b334e0f5d 100644 --- a/kernel/cgroup/dmem.c +++ b/kernel/cgroup/dmem.c @@ -423,7 +423,7 @@ static void dmemcg_free_region(struct kref *ref) */ void dmem_cgroup_unregister_region(struct dmem_cgroup_region *region) { - struct list_head *entry; + struct dmem_cgroup_pool_state *pool, *next; if (!region) return; @@ -433,10 +433,7 @@ void dmem_cgroup_unregister_region(struct dmem_cgroup_region *region) /* Remove from global region list */ list_del_rcu(®ion->region_node); - list_for_each_rcu(entry, ®ion->pools) { - struct dmem_cgroup_pool_state *pool = - container_of(entry, typeof(*pool), region_node); - + list_for_each_entry_safe(pool, next, ®ion->pools, region_node) { list_del_rcu(&pool->css_node); } From 99a2ef500906138ba58093b9893972a5c303c734 Mon Sep 17 00:00:00 2001 From: Chen Ridong Date: Mon, 2 Feb 2026 12:27:18 +0000 Subject: [PATCH 076/167] cgroup/dmem: avoid pool UAF An UAF issue was observed: BUG: KASAN: slab-use-after-free in page_counter_uncharge+0x65/0x150 Write of size 8 at addr ffff888106715440 by task insmod/527 CPU: 4 UID: 0 PID: 527 Comm: insmod 6.19.0-rc7-next-20260129+ #11 Tainted: [O]=OOT_MODULE Call Trace: dump_stack_lvl+0x82/0xd0 kasan_report+0xca/0x100 kasan_check_range+0x39/0x1c0 page_counter_uncharge+0x65/0x150 dmem_cgroup_uncharge+0x1f/0x260 Allocated by task 527: Freed by task 0: The buggy address belongs to the object at ffff888106715400 which belongs to the cache kmalloc-512 of size 512 The buggy address is located 64 bytes inside of freed 512-byte region [ffff888106715400, ffff888106715600) The buggy address belongs to the physical page: Memory state around the buggy address: ffff888106715300: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc ffff888106715380: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc >ffff888106715400: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ^ ffff888106715480: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ffff888106715500: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb The issue occurs because a pool can still be held by a caller after its associated memory region is unregistered. The current implementation frees the pool even if users still hold references to it (e.g., before uncharge operations complete). This patch adds a reference counter to each pool, ensuring that a pool is only freed when its reference count drops to zero. Fixes: b168ed458dde ("kernel/cgroup: Add "dmem" memory accounting cgroup") Cc: stable@vger.kernel.org # v6.14+ Signed-off-by: Chen Ridong Signed-off-by: Tejun Heo --- kernel/cgroup/dmem.c | 60 ++++++++++++++++++++++++++++++++++++++++++-- 1 file changed, 58 insertions(+), 2 deletions(-) diff --git a/kernel/cgroup/dmem.c b/kernel/cgroup/dmem.c index 787b334e0f5d..1ea6afffa985 100644 --- a/kernel/cgroup/dmem.c +++ b/kernel/cgroup/dmem.c @@ -14,6 +14,7 @@ #include #include #include +#include #include #include @@ -71,7 +72,9 @@ struct dmem_cgroup_pool_state { struct rcu_head rcu; struct page_counter cnt; + struct dmem_cgroup_pool_state *parent; + refcount_t ref; bool inited; }; @@ -88,6 +91,9 @@ struct dmem_cgroup_pool_state { static DEFINE_SPINLOCK(dmemcg_lock); static LIST_HEAD(dmem_cgroup_regions); +static void dmemcg_free_region(struct kref *ref); +static void dmemcg_pool_free_rcu(struct rcu_head *rcu); + static inline struct dmemcg_state * css_to_dmemcs(struct cgroup_subsys_state *css) { @@ -104,10 +110,38 @@ static struct dmemcg_state *parent_dmemcs(struct dmemcg_state *cg) return cg->css.parent ? css_to_dmemcs(cg->css.parent) : NULL; } +static void dmemcg_pool_get(struct dmem_cgroup_pool_state *pool) +{ + refcount_inc(&pool->ref); +} + +static bool dmemcg_pool_tryget(struct dmem_cgroup_pool_state *pool) +{ + return refcount_inc_not_zero(&pool->ref); +} + +static void dmemcg_pool_put(struct dmem_cgroup_pool_state *pool) +{ + if (!refcount_dec_and_test(&pool->ref)) + return; + + call_rcu(&pool->rcu, dmemcg_pool_free_rcu); +} + +static void dmemcg_pool_free_rcu(struct rcu_head *rcu) +{ + struct dmem_cgroup_pool_state *pool = container_of(rcu, typeof(*pool), rcu); + + if (pool->parent) + dmemcg_pool_put(pool->parent); + kref_put(&pool->region->ref, dmemcg_free_region); + kfree(pool); +} + static void free_cg_pool(struct dmem_cgroup_pool_state *pool) { list_del(&pool->region_node); - kfree(pool); + dmemcg_pool_put(pool); } static void @@ -342,6 +376,12 @@ alloc_pool_single(struct dmemcg_state *dmemcs, struct dmem_cgroup_region *region page_counter_init(&pool->cnt, ppool ? &ppool->cnt : NULL, true); reset_all_resource_limits(pool); + refcount_set(&pool->ref, 1); + kref_get(®ion->ref); + if (ppool && !pool->parent) { + pool->parent = ppool; + dmemcg_pool_get(ppool); + } list_add_tail_rcu(&pool->css_node, &dmemcs->pools); list_add_tail(&pool->region_node, ®ion->pools); @@ -389,6 +429,10 @@ get_cg_pool_locked(struct dmemcg_state *dmemcs, struct dmem_cgroup_region *regio /* Fix up parent links, mark as inited. */ pool->cnt.parent = &ppool->cnt; + if (ppool && !pool->parent) { + pool->parent = ppool; + dmemcg_pool_get(ppool); + } pool->inited = true; pool = ppool; @@ -435,6 +479,8 @@ void dmem_cgroup_unregister_region(struct dmem_cgroup_region *region) list_for_each_entry_safe(pool, next, ®ion->pools, region_node) { list_del_rcu(&pool->css_node); + list_del(&pool->region_node); + dmemcg_pool_put(pool); } /* @@ -515,8 +561,10 @@ static struct dmem_cgroup_region *dmemcg_get_region_by_name(const char *name) */ void dmem_cgroup_pool_state_put(struct dmem_cgroup_pool_state *pool) { - if (pool) + if (pool) { css_put(&pool->cs->css); + dmemcg_pool_put(pool); + } } EXPORT_SYMBOL_GPL(dmem_cgroup_pool_state_put); @@ -530,6 +578,8 @@ get_cg_pool_unlocked(struct dmemcg_state *cg, struct dmem_cgroup_region *region) pool = find_cg_pool_locked(cg, region); if (pool && !READ_ONCE(pool->inited)) pool = NULL; + if (pool && !dmemcg_pool_tryget(pool)) + pool = NULL; rcu_read_unlock(); while (!pool) { @@ -538,6 +588,8 @@ get_cg_pool_unlocked(struct dmemcg_state *cg, struct dmem_cgroup_region *region) pool = get_cg_pool_locked(cg, region, &allocpool); else pool = ERR_PTR(-ENODEV); + if (!IS_ERR(pool)) + dmemcg_pool_get(pool); spin_unlock(&dmemcg_lock); if (pool == ERR_PTR(-ENOMEM)) { @@ -573,6 +625,7 @@ void dmem_cgroup_uncharge(struct dmem_cgroup_pool_state *pool, u64 size) page_counter_uncharge(&pool->cnt, size); css_put(&pool->cs->css); + dmemcg_pool_put(pool); } EXPORT_SYMBOL_GPL(dmem_cgroup_uncharge); @@ -624,7 +677,9 @@ int dmem_cgroup_try_charge(struct dmem_cgroup_region *region, u64 size, if (ret_limit_pool) { *ret_limit_pool = container_of(fail, struct dmem_cgroup_pool_state, cnt); css_get(&(*ret_limit_pool)->cs->css); + dmemcg_pool_get(*ret_limit_pool); } + dmemcg_pool_put(pool); ret = -EAGAIN; goto err; } @@ -719,6 +774,7 @@ static ssize_t dmemcg_limit_write(struct kernfs_open_file *of, /* And commit */ apply(pool, new_limit); + dmemcg_pool_put(pool); out_put: kref_put(®ion->ref, dmemcg_free_region); From e3a43633023e3cacaca60d4b8972d084a2b06236 Mon Sep 17 00:00:00 2001 From: ChenXiaoSong Date: Mon, 2 Feb 2026 08:24:07 +0000 Subject: [PATCH 077/167] smb/client: fix memory leak in smb2_open_file() MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Reproducer: 1. server: directories are exported read-only 2. client: mount -t cifs //${server_ip}/export /mnt 3. client: dd if=/dev/zero of=/mnt/file bs=512 count=1000 oflag=direct 4. client: umount /mnt 5. client: sleep 1 6. client: modprobe -r cifs The error message is as follows: ============================================================================= BUG cifs_small_rq (Not tainted): Objects remaining on __kmem_cache_shutdown() ----------------------------------------------------------------------------- Object 0x00000000d47521be @offset=14336 ... WARNING: mm/slub.c:1251 at __kmem_cache_shutdown+0x34e/0x440, CPU#0: modprobe/1577 ... Call Trace: kmem_cache_destroy+0x94/0x190 cifs_destroy_request_bufs+0x3e/0x50 [cifs] cleanup_module+0x4e/0x540 [cifs] __se_sys_delete_module+0x278/0x400 __x64_sys_delete_module+0x5f/0x70 x64_sys_call+0x2299/0x2ff0 do_syscall_64+0x89/0x350 entry_SYSCALL_64_after_hwframe+0x76/0x7e ... kmem_cache_destroy cifs_small_rq: Slab cache still has objects when called from cifs_destroy_request_bufs+0x3e/0x50 [cifs] WARNING: mm/slab_common.c:532 at kmem_cache_destroy+0x16b/0x190, CPU#0: modprobe/1577 Link: https://lore.kernel.org/linux-cifs/9751f02d-d1df-4265-a7d6-b19761b21834@linux.dev/T/#mf14808c144448b715f711ce5f0477a071f08eaf6 Fixes: e255612b5ed9 ("cifs: Add fallback for SMB2 CREATE without FILE_READ_ATTRIBUTES") Reported-by: Paulo Alcantara Reviewed-by: Paulo Alcantara (Red Hat) Signed-off-by: ChenXiaoSong Reviewed-by: Pali Rohár Signed-off-by: Steve French --- fs/smb/client/smb2file.c | 1 + 1 file changed, 1 insertion(+) diff --git a/fs/smb/client/smb2file.c b/fs/smb/client/smb2file.c index 7f11ae6bb785..2dd08388ea87 100644 --- a/fs/smb/client/smb2file.c +++ b/fs/smb/client/smb2file.c @@ -178,6 +178,7 @@ int smb2_open_file(const unsigned int xid, struct cifs_open_parms *oparms, rc = SMB2_open(xid, oparms, smb2_path, &smb2_oplock, smb2_data, NULL, &err_iov, &err_buftype); if (rc == -EACCES && retry_without_read_attributes) { + free_rsp_buf(err_buftype, err_iov.iov_base); oparms->desired_access &= ~FILE_READ_ATTRIBUTES; rc = SMB2_open(xid, oparms, smb2_path, &smb2_oplock, smb2_data, NULL, &err_iov, &err_buftype); From 67b3da8d3051fba7e1523b3afce4f71c658f15f8 Mon Sep 17 00:00:00 2001 From: ChenXiaoSong Date: Mon, 2 Feb 2026 09:49:06 +0000 Subject: [PATCH 078/167] smb/client: fix memory leak in SendReceive() Reproducer: 1. server: supports SMB1, directories are exported read-only 2. client: mount -t cifs -o vers=1.0 //${server_ip}/export /mnt 3. client: dd if=/dev/zero of=/mnt/file bs=512 count=1000 oflag=direct 4. client: umount /mnt 5. client: sleep 1 6. client: modprobe -r cifs The error message is as follows: ============================================================================= BUG cifs_small_rq (Not tainted): Objects remaining on __kmem_cache_shutdown() ----------------------------------------------------------------------------- Object 0x00000000d34491e6 @offset=896 Object 0x00000000bde9fab3 @offset=4480 Object 0x00000000104a1f70 @offset=6272 Object 0x0000000092a51bb5 @offset=7616 Object 0x000000006714a7db @offset=13440 ... WARNING: mm/slub.c:1251 at __kmem_cache_shutdown+0x379/0x3f0, CPU#7: modprobe/712 ... Call Trace: kmem_cache_destroy+0x69/0x160 cifs_destroy_request_bufs+0x39/0x40 [cifs] cleanup_module+0x43/0xfc0 [cifs] __se_sys_delete_module+0x1d5/0x300 __x64_sys_delete_module+0x1a/0x30 x64_sys_call+0x2299/0x2ff0 do_syscall_64+0x6e/0x270 entry_SYSCALL_64_after_hwframe+0x76/0x7e ... kmem_cache_destroy cifs_small_rq: Slab cache still has objects when called from cifs_destroy_request_bufs+0x39/0x40 [cifs] WARNING: mm/slab_common.c:532 at kmem_cache_destroy+0x142/0x160, CPU#7: modprobe/712 Link: https://lore.kernel.org/linux-cifs/9751f02d-d1df-4265-a7d6-b19761b21834@linux.dev/T/#mf14808c144448b715f711ce5f0477a071f08eaf6 Fixes: 6be09580df5c ("cifs: Make smb1's SendReceive() wrap cifs_send_recv()") Reported-by: Paulo Alcantara Reviewed-by: Paulo Alcantara (Red Hat) Reviewed-by: David Howells Signed-off-by: ChenXiaoSong Signed-off-by: Steve French --- fs/smb/client/cifstransport.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/fs/smb/client/cifstransport.c b/fs/smb/client/cifstransport.c index 28d1cee90625..98287132626e 100644 --- a/fs/smb/client/cifstransport.c +++ b/fs/smb/client/cifstransport.c @@ -251,13 +251,15 @@ SendReceive(const unsigned int xid, struct cifs_ses *ses, rc = cifs_send_recv(xid, ses, ses->server, &rqst, &resp_buf_type, flags, &resp_iov); if (rc < 0) - return rc; + goto out; if (out_buf) { *pbytes_returned = resp_iov.iov_len; if (resp_iov.iov_len) memcpy(out_buf, resp_iov.iov_base, resp_iov.iov_len); } + +out: free_rsp_buf(resp_buf_type, resp_iov.iov_base); return rc; } From f5c092787c48296633c2dd7240752f88fa9710fc Mon Sep 17 00:00:00 2001 From: Gabor Juhos Date: Sun, 1 Feb 2026 21:35:06 +0100 Subject: [PATCH 079/167] hwmon: (gpio-fan) Fix set_rpm() return value The set_rpm function is used as a 'store' callback of a device attribute, and as such it should return with the number of bytes consumed. However since commit 0d01110e6356 ("hwmon: (gpio-fan) Add regulator support"), the function returns with zero on success. Due to this, the function gets called again and again whenever the user tries to change the FAN speed by writing the desired RPM value into the 'fan1_target' sysfs attribute. The broken behaviour can be reproduced easily. For example, the following command never returns unless it gets terminated: $ echo 500 > /sys/class/hwmon/hwmon1/fan1_target ^C $ Change the code to return with the same value as the 'count' parameter on success to indicate that all bytes from the input buffer are consumed. The function behaved the same way prior to the offending change. Cc: stable@vger.kernel.org Fixes: 0d01110e6356 ("hwmon: (gpio-fan) Add regulator support") Signed-off-by: Gabor Juhos Link: https://lore.kernel.org/r/20260201-gpio-fan-set_rpm-retval-fix-v1-1-dc39bc7693ca@gmail.com Signed-off-by: Guenter Roeck --- drivers/hwmon/gpio-fan.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/hwmon/gpio-fan.c b/drivers/hwmon/gpio-fan.c index 516c34bb61c9..d7fa021f376e 100644 --- a/drivers/hwmon/gpio-fan.c +++ b/drivers/hwmon/gpio-fan.c @@ -291,7 +291,7 @@ static ssize_t set_rpm(struct device *dev, struct device_attribute *attr, { struct gpio_fan_data *fan_data = dev_get_drvdata(dev); unsigned long rpm; - int ret = count; + int ret; if (kstrtoul(buf, 10, &rpm)) return -EINVAL; @@ -308,7 +308,7 @@ static ssize_t set_rpm(struct device *dev, struct device_attribute *attr, exit_unlock: mutex_unlock(&fan_data->lock); - return ret; + return ret ? ret : count; } static DEVICE_ATTR_RW(pwm1); From 52fb36a5f9c15285b7d67c0ff87dc17b3206b5df Mon Sep 17 00:00:00 2001 From: Gabor Juhos Date: Mon, 2 Feb 2026 16:58:57 +0100 Subject: [PATCH 080/167] hwmon: (gpio-fan) Allow to stop FANs when CONFIG_PM is disabled When CONFIG_PM is disabled, the GPIO controlled FANs can't be stopped by using the sysfs attributes since commit 0d01110e6356 ("hwmon: (gpio-fan) Add regulator support"). Using either the 'pwm1' or the 'fan1_target' attribute fails the same way: $ echo 0 > /sys/class/hwmon/hwmon1/pwm1 ash: write error: Function not implemented $ echo 0 > /sys/class/hwmon/hwmon1/fan1_target ash: write error: Function not implemented Both commands were working flawlessly before the mentioned commit. The issue happens because pm_runtime_put_sync() returns with -ENOSYS when CONFIG_PM is disabled, and the set_fan_speed() function handles this as an error. In order to restore the previous behaviour, change the error check in the set_fan_speed() function to ignore the -ENOSYS error code. Cc: stable@vger.kernel.org Fixes: 0d01110e6356 ("hwmon: (gpio-fan) Add regulator support") Signed-off-by: Gabor Juhos Link: https://lore.kernel.org/r/20260202-gpio-fan-stop-fix-v1-1-c7853183d93d@gmail.com Signed-off-by: Guenter Roeck --- drivers/hwmon/gpio-fan.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/hwmon/gpio-fan.c b/drivers/hwmon/gpio-fan.c index d7fa021f376e..a8892ced1e54 100644 --- a/drivers/hwmon/gpio-fan.c +++ b/drivers/hwmon/gpio-fan.c @@ -148,7 +148,7 @@ static int set_fan_speed(struct gpio_fan_data *fan_data, int speed_index) int ret; ret = pm_runtime_put_sync(fan_data->dev); - if (ret < 0) + if (ret < 0 && ret != -ENOSYS) return ret; } From 83b67cc9be9223183caf91826d9c194d7fb128fa Mon Sep 17 00:00:00 2001 From: Jiayuan Chen Date: Sun, 1 Feb 2026 21:59:10 +0800 Subject: [PATCH 081/167] linkwatch: use __dev_put() in callers to prevent UAF After linkwatch_do_dev() calls __dev_put() to release the linkwatch reference, the device refcount may drop to 1. At this point, netdev_run_todo() can proceed (since linkwatch_sync_dev() sees an empty list and returns without blocking), wait for the refcount to become 1 via netdev_wait_allrefs_any(), and then free the device via kobject_put(). This creates a use-after-free when __linkwatch_run_queue() tries to call netdev_unlock_ops() on the already-freed device. Note that adding netdev_lock_ops()/netdev_unlock_ops() pair in netdev_run_todo() before kobject_put() would not work, because netdev_lock_ops() is conditional - it only locks when netdev_need_ops_lock() returns true. If the device doesn't require ops_lock, linkwatch won't hold any lock, and netdev_run_todo() acquiring the lock won't provide synchronization. Fix this by moving __dev_put() from linkwatch_do_dev() to its callers. The device reference logically pairs with de-listing the device, so it's reasonable for the caller that did the de-listing to release it. This allows placing __dev_put() after all device accesses are complete, preventing UAF. The bug can be reproduced by adding mdelay(2000) after linkwatch_do_dev() in __linkwatch_run_queue(), then running: ip tuntap add mode tun name tun_test ip link set tun_test up ip link set tun_test carrier off ip link set tun_test carrier on sleep 0.5 ip tuntap del mode tun name tun_test KASAN report: ================================================================== BUG: KASAN: use-after-free in netdev_need_ops_lock include/net/netdev_lock.h:33 [inline] BUG: KASAN: use-after-free in netdev_unlock_ops include/net/netdev_lock.h:47 [inline] BUG: KASAN: use-after-free in __linkwatch_run_queue+0x865/0x8a0 net/core/link_watch.c:245 Read of size 8 at addr ffff88804de5c008 by task kworker/u32:10/8123 CPU: 0 UID: 0 PID: 8123 Comm: kworker/u32:10 Not tainted syzkaller #0 PREEMPT(full) Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2 04/01/2014 Workqueue: events_unbound linkwatch_event Call Trace: __dump_stack lib/dump_stack.c:94 [inline] dump_stack_lvl+0x100/0x190 lib/dump_stack.c:120 print_address_description mm/kasan/report.c:378 [inline] print_report+0x156/0x4c9 mm/kasan/report.c:482 kasan_report+0xdf/0x1a0 mm/kasan/report.c:595 netdev_need_ops_lock include/net/netdev_lock.h:33 [inline] netdev_unlock_ops include/net/netdev_lock.h:47 [inline] __linkwatch_run_queue+0x865/0x8a0 net/core/link_watch.c:245 linkwatch_event+0x8f/0xc0 net/core/link_watch.c:304 process_one_work+0x9c2/0x1840 kernel/workqueue.c:3257 process_scheduled_works kernel/workqueue.c:3340 [inline] worker_thread+0x5da/0xe40 kernel/workqueue.c:3421 kthread+0x3b3/0x730 kernel/kthread.c:463 ret_from_fork+0x754/0xaf0 arch/x86/kernel/process.c:158 ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:246 ================================================================== Fixes: 04efcee6ef8d ("net: hold instance lock during NETDEV_CHANGE") Reported-by: syzbot+1ec2f6a450f0b54af8c8@syzkaller.appspotmail.com Closes: https://lore.kernel.org/all/6824d064.a70a0220.3e9d8.001a.GAE@google.com/T/ Signed-off-by: Jiayuan Chen Signed-off-by: Jiayuan Chen Reviewed-by: Eric Dumazet Link: https://patch.msgid.link/20260201135915.393451-1-jiayuan.chen@linux.dev Signed-off-by: Jakub Kicinski --- net/core/link_watch.c | 20 +++++++++++++++----- 1 file changed, 15 insertions(+), 5 deletions(-) diff --git a/net/core/link_watch.c b/net/core/link_watch.c index 212cde35affa..25c455c10a01 100644 --- a/net/core/link_watch.c +++ b/net/core/link_watch.c @@ -185,10 +185,6 @@ static void linkwatch_do_dev(struct net_device *dev) netif_state_change(dev); } - /* Note: our callers are responsible for calling netdev_tracker_free(). - * This is the reason we use __dev_put() instead of dev_put(). - */ - __dev_put(dev); } static void __linkwatch_run_queue(int urgent_only) @@ -243,6 +239,11 @@ static void __linkwatch_run_queue(int urgent_only) netdev_lock_ops(dev); linkwatch_do_dev(dev); netdev_unlock_ops(dev); + /* Use __dev_put() because netdev_tracker_free() was already + * called above. Must be after netdev_unlock_ops() to prevent + * netdev_run_todo() from freeing the device while still in use. + */ + __dev_put(dev); do_dev--; spin_lock_irq(&lweventlist_lock); } @@ -278,8 +279,13 @@ void __linkwatch_sync_dev(struct net_device *dev) { netdev_ops_assert_locked(dev); - if (linkwatch_clean_dev(dev)) + if (linkwatch_clean_dev(dev)) { linkwatch_do_dev(dev); + /* Use __dev_put() because netdev_tracker_free() was already + * called inside linkwatch_clean_dev(). + */ + __dev_put(dev); + } } void linkwatch_sync_dev(struct net_device *dev) @@ -288,6 +294,10 @@ void linkwatch_sync_dev(struct net_device *dev) netdev_lock_ops(dev); linkwatch_do_dev(dev); netdev_unlock_ops(dev); + /* Use __dev_put() because netdev_tracker_free() was already + * called inside linkwatch_clean_dev(). + */ + __dev_put(dev); } } From 1c172febdf065375359b2b95156e476bfee30b60 Mon Sep 17 00:00:00 2001 From: Jakub Kicinski Date: Fri, 30 Jan 2026 11:03:11 -0800 Subject: [PATCH 082/167] net: rss: fix reporting RXH_XFRM_NO_CHANGE as input_xfrm for contexts Initializing input_xfrm to RXH_XFRM_NO_CHANGE in RSS contexts is problematic. I think I did this to make it clear that the context does not have its own settings applied. But unlike ETH_RSS_HASH_NO_CHANGE which is zero, RXH_XFRM_NO_CHANGE is 0xff. We need to be careful when reading the value back, and remember to treat 0xff as 0. Remove the initialization and switch to storing 0. This lets us also remove the workaround in ethnl_rss_set(). Get side does not need any adjustments and context get no longer reports: RSS input transformation: symmetric-xor: on symmetric-or-xor: on Unknown bits in RSS input transformation: 0xfc for NICs which don't support input_xfrm. Remove the init of hfunc to ETH_RSS_HASH_NO_CHANGE while at it. As already mentioned this is a noop since ETH_RSS_HASH_NO_CHANGE is 0 and struct is zalloc'd. But as this fix exemplifies storing NO_CHANGE as state is fragile. This issue is implicitly caught by running our selftests because YNL in selftests errors out on unknown bits. Fixes: d3e2c7bab124 ("ethtool: rss: support setting input-xfrm via Netlink") Link: https://patch.msgid.link/20260130190311.811129-1-kuba@kernel.org Signed-off-by: Jakub Kicinski --- net/ethtool/common.c | 3 --- net/ethtool/rss.c | 9 ++------- 2 files changed, 2 insertions(+), 10 deletions(-) diff --git a/net/ethtool/common.c b/net/ethtool/common.c index 369c05cf8163..d47a279eb8b9 100644 --- a/net/ethtool/common.c +++ b/net/ethtool/common.c @@ -862,9 +862,6 @@ ethtool_rxfh_ctx_alloc(const struct ethtool_ops *ops, ctx->key_off = key_off; ctx->priv_size = ops->rxfh_priv_size; - ctx->hfunc = ETH_RSS_HASH_NO_CHANGE; - ctx->input_xfrm = RXH_XFRM_NO_CHANGE; - return ctx; } diff --git a/net/ethtool/rss.c b/net/ethtool/rss.c index 4dced53be4b3..da5934cceb07 100644 --- a/net/ethtool/rss.c +++ b/net/ethtool/rss.c @@ -824,8 +824,8 @@ rss_set_ctx_update(struct ethtool_rxfh_context *ctx, struct nlattr **tb, static int ethnl_rss_set(struct ethnl_req_info *req_info, struct genl_info *info) { - bool indir_reset = false, indir_mod, xfrm_sym = false; struct rss_req_info *request = RSS_REQINFO(req_info); + bool indir_reset = false, indir_mod, xfrm_sym; struct ethtool_rxfh_context *ctx = NULL; struct net_device *dev = req_info->dev; bool mod = false, fields_mod = false; @@ -860,12 +860,7 @@ ethnl_rss_set(struct ethnl_req_info *req_info, struct genl_info *info) rxfh.input_xfrm = data.input_xfrm; ethnl_update_u8(&rxfh.input_xfrm, tb[ETHTOOL_A_RSS_INPUT_XFRM], &mod); - /* For drivers which don't support input_xfrm it will be set to 0xff - * in the RSS context info. In all other case input_xfrm != 0 means - * symmetric hashing is requested. - */ - if (!request->rss_context || ops->rxfh_per_ctx_key) - xfrm_sym = rxfh.input_xfrm || data.input_xfrm; + xfrm_sym = rxfh.input_xfrm || data.input_xfrm; if (rxfh.input_xfrm == data.input_xfrm) rxfh.input_xfrm = RXH_XFRM_NO_CHANGE; From dbbec8c5a79f4c7aa8d07da8c0b5a34d76c50699 Mon Sep 17 00:00:00 2001 From: "Russell King (Oracle)" Date: Fri, 30 Jan 2026 20:04:57 +0000 Subject: [PATCH 083/167] net: stmmac: fix stm32 (and potentially others) resume regression Marek reported that suspending stm32 causes the following errors when the interface is administratively down: $ echo devices > /sys/power/pm_test $ echo mem > /sys/power/state ... ck_ker_eth2stp already disabled ... ck_ker_eth2stp already unprepared ... On suspend, stm32 starts the eth2stp clock in its suspend method, and stops it in the resume method. This is because the blamed commit omits the call to the platform glue ->suspend() method, but does make the call to the platform glue ->resume() method. This problem affects all other converted drivers as well - e.g. looking at the PCIe drivers, pci_save_state() will not be called, but pci_restore_state() will be. Similar issues affect all other drivers. Fix this by always calling the ->suspend() method, even when the network interface is down. This fixes all the conversions to the platform glue ->suspend() and ->resume() methods. Link: https://lore.kernel.org/r/20260114081809.12758-1-marex@nabladev.com Fixes: 07bbbfe7addf ("net: stmmac: add suspend()/resume() platform ops") Reported-by: Marek Vasut Tested-by: Marek Vasut Signed-off-by: Russell King (Oracle) Link: https://patch.msgid.link/E1vlujh-00000007Hkw-2p6r@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski --- drivers/net/ethernet/stmicro/stmmac/stmmac_main.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c index 3f42843cd9ed..a379221b96a3 100644 --- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c +++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c @@ -8042,7 +8042,7 @@ int stmmac_suspend(struct device *dev) u32 chan; if (!ndev || !netif_running(ndev)) - return 0; + goto suspend_bsp; mutex_lock(&priv->lock); @@ -8082,6 +8082,7 @@ int stmmac_suspend(struct device *dev) if (stmmac_fpe_supported(priv)) ethtool_mmsv_stop(&priv->fpe_cfg.mmsv); +suspend_bsp: if (priv->plat->suspend) return priv->plat->suspend(dev, priv->plat->bsp_priv); From 74d9391e8849e70ded5309222d09b0ed0edbd039 Mon Sep 17 00:00:00 2001 From: Daniel Hodges Date: Sat, 31 Jan 2026 10:01:14 -0800 Subject: [PATCH 084/167] tipc: use kfree_sensitive() for session key material The rx->skey field contains a struct tipc_aead_key with GCM-AES encryption keys used for TIPC cluster communication. Using plain kfree() leaves this sensitive key material in freed memory pages where it could potentially be recovered. Switch to kfree_sensitive() to ensure the key material is zeroed before the memory is freed. Fixes: 1ef6f7c9390f ("tipc: add automatic session key exchange") Signed-off-by: Daniel Hodges Link: https://patch.msgid.link/20260131180114.2121438-1-hodgesd@meta.com Signed-off-by: Jakub Kicinski --- net/tipc/crypto.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/net/tipc/crypto.c b/net/tipc/crypto.c index 751904f10aab..970db62bd029 100644 --- a/net/tipc/crypto.c +++ b/net/tipc/crypto.c @@ -1219,7 +1219,7 @@ void tipc_crypto_key_flush(struct tipc_crypto *c) rx = c; tx = tipc_net(rx->net)->crypto_tx; if (cancel_delayed_work(&rx->work)) { - kfree(rx->skey); + kfree_sensitive(rx->skey); rx->skey = NULL; atomic_xchg(&rx->key_distr, 0); tipc_node_put(rx->node); @@ -2394,7 +2394,7 @@ static void tipc_crypto_work_rx(struct work_struct *work) break; default: synchronize_rcu(); - kfree(rx->skey); + kfree_sensitive(rx->skey); rx->skey = NULL; break; } From a69c17230cab07bd156f894fdc82bd78b43ea72f Mon Sep 17 00:00:00 2001 From: Claudiu Manoil Date: Fri, 30 Jan 2026 16:10:32 +0200 Subject: [PATCH 085/167] net: enetc: Remove SI/BDR cacheability AXI settings for ENETC v4 For ENETC v4 these settings are controlled by the global ENETC message and buffer cache attribute registers (EnBCAR and EnMCAR), from the IERB register block. The hardcoded cacheability settings were inherited from LS1028A, and should be removed from the ENETC v4 driver as they conflict with the global IERB settings. Fixes: 99100d0d9922 ("net: enetc: add preliminary support for i.MX95 ENETC PF") Signed-off-by: Claudiu Manoil Reviewed-by: Wei Fang Link: https://patch.msgid.link/20260130141035.272471-2-claudiu.manoil@nxp.com Signed-off-by: Jakub Kicinski --- drivers/net/ethernet/freescale/enetc/enetc.c | 11 +++++++---- 1 file changed, 7 insertions(+), 4 deletions(-) diff --git a/drivers/net/ethernet/freescale/enetc/enetc.c b/drivers/net/ethernet/freescale/enetc/enetc.c index 53b26cece16a..e380a4f39855 100644 --- a/drivers/net/ethernet/freescale/enetc/enetc.c +++ b/drivers/net/ethernet/freescale/enetc/enetc.c @@ -2512,10 +2512,13 @@ int enetc_configure_si(struct enetc_ndev_priv *priv) struct enetc_hw *hw = &si->hw; int err; - /* set SI cache attributes */ - enetc_wr(hw, ENETC_SICAR0, - ENETC_SICAR_RD_COHERENT | ENETC_SICAR_WR_COHERENT); - enetc_wr(hw, ENETC_SICAR1, ENETC_SICAR_MSI); + if (is_enetc_rev1(si)) { + /* set SI cache attributes */ + enetc_wr(hw, ENETC_SICAR0, + ENETC_SICAR_RD_COHERENT | ENETC_SICAR_WR_COHERENT); + enetc_wr(hw, ENETC_SICAR1, ENETC_SICAR_MSI); + } + /* enable SI */ enetc_wr(hw, ENETC_SIMR, ENETC_SIMR_EN); From 9ae13b2e64fcd2ca00a76b7d60fc4641a6b9209d Mon Sep 17 00:00:00 2001 From: Claudiu Manoil Date: Fri, 30 Jan 2026 16:10:33 +0200 Subject: [PATCH 086/167] net: enetc: Remove CBDR cacheability AXI settings for ENETC v4 For ENETC v4 these settings are controlled by the global ENETC command cache attribute registers (EnCAR), from the IERB register block. The hardcoded CDBR cacheability settings were inherited from LS1028A, and should be removed from the ENETC v4 driver as they conflict with the global IERB settings. Fixes: e3f4a0a8ddb4 ("net: enetc: add command BD ring support for i.MX95 ENETC") Signed-off-by: Claudiu Manoil Reviewed-by: Wei Fang Link: https://patch.msgid.link/20260130141035.272471-3-claudiu.manoil@nxp.com Signed-off-by: Jakub Kicinski --- drivers/net/ethernet/freescale/enetc/enetc_cbdr.c | 4 ---- 1 file changed, 4 deletions(-) diff --git a/drivers/net/ethernet/freescale/enetc/enetc_cbdr.c b/drivers/net/ethernet/freescale/enetc/enetc_cbdr.c index 3d5f31879d5c..a635bfdc30af 100644 --- a/drivers/net/ethernet/freescale/enetc/enetc_cbdr.c +++ b/drivers/net/ethernet/freescale/enetc/enetc_cbdr.c @@ -74,10 +74,6 @@ int enetc4_setup_cbdr(struct enetc_si *si) if (!user->ring) return -ENOMEM; - /* set CBDR cache attributes */ - enetc_wr(hw, ENETC_SICAR2, - ENETC_SICAR_RD_COHERENT | ENETC_SICAR_WR_COHERENT); - regs.pir = hw->reg + ENETC_SICBDRPIR; regs.cir = hw->reg + ENETC_SICBDRCIR; regs.mr = hw->reg + ENETC_SICBDRMR; From 21d0fc95b5920ae8e69a2c0394bef82b8392bcc9 Mon Sep 17 00:00:00 2001 From: Claudiu Manoil Date: Fri, 30 Jan 2026 16:10:34 +0200 Subject: [PATCH 087/167] net: enetc: Convert 16-bit register writes to 32-bit for ENETC v4 MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit For ENETC v4, which is integrated into more complex SoCs (compared to v1), 16‑bit register writes are blocked in the SoC interconnect on some chips. To be fair, it is not recommended to access 32‑bit registers of this IP using lower‑width accessors (i.e. 16‑bit), and the only exception to this rule was introduced by me in the initial ENETC v1 driver for the PMAR1 register, which holds the lower 16 bits of the primary MAC address of an SI. Meanwhile, this exception has been replicated for v4 as well. Since LS1028 (the only SoC with ENETC v1) is not affected by this issue, the current patch fixes the 16‑bit writes to PMAR1 starting with ENETC v4. Fixes: 99100d0d9922 ("net: enetc: add preliminary support for i.MX95 ENETC PF") Signed-off-by: Claudiu Manoil Reviewed-by: Wei Fang Link: https://patch.msgid.link/20260130141035.272471-4-claudiu.manoil@nxp.com Signed-off-by: Jakub Kicinski --- drivers/net/ethernet/freescale/enetc/enetc4_pf.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/net/ethernet/freescale/enetc/enetc4_pf.c b/drivers/net/ethernet/freescale/enetc/enetc4_pf.c index 498346dd996a..c0859d200a2c 100644 --- a/drivers/net/ethernet/freescale/enetc/enetc4_pf.c +++ b/drivers/net/ethernet/freescale/enetc/enetc4_pf.c @@ -59,10 +59,10 @@ static void enetc4_pf_set_si_primary_mac(struct enetc_hw *hw, int si, if (si != 0) { __raw_writel(upper, hw->port + ENETC4_PSIPMAR0(si)); - __raw_writew(lower, hw->port + ENETC4_PSIPMAR1(si)); + __raw_writel(lower, hw->port + ENETC4_PSIPMAR1(si)); } else { __raw_writel(upper, hw->port + ENETC4_PMAR0); - __raw_writew(lower, hw->port + ENETC4_PMAR1); + __raw_writel(lower, hw->port + ENETC4_PMAR1); } } From c28d765ec5da160d3a48d0928528084cef97bf19 Mon Sep 17 00:00:00 2001 From: Claudiu Manoil Date: Fri, 30 Jan 2026 16:10:35 +0200 Subject: [PATCH 088/167] net: enetc: Convert 16-bit register reads to 32-bit for ENETC v4 MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit It is not recommended to access the 32‑bit registers of this hardware IP using lower‑width accessors (i.e. 16‑bit), and the only exception to this rule was introduced in the initial ENETC v1 driver for the PMAR1 register, which holds the lower 16 bits of the primary MAC address of an SI. Meanwhile, this exception has been replicated in the v4 driver code as well. Since LS1028 (the only SoC with ENETC v1) is not affected by this issue, the current patch converts the 16‑bit reads from PMAR1 starting with ENETC v4. Fixes: 99100d0d9922 ("net: enetc: add preliminary support for i.MX95 ENETC PF") Signed-off-by: Claudiu Manoil Reviewed-by: Wei Fang Link: https://patch.msgid.link/20260130141035.272471-5-claudiu.manoil@nxp.com Signed-off-by: Jakub Kicinski --- .../net/ethernet/freescale/enetc/enetc4_pf.c | 2 +- drivers/net/ethernet/freescale/enetc/enetc_hw.h | 17 ++++++++++++++--- 2 files changed, 15 insertions(+), 4 deletions(-) diff --git a/drivers/net/ethernet/freescale/enetc/enetc4_pf.c b/drivers/net/ethernet/freescale/enetc/enetc4_pf.c index c0859d200a2c..5850540634b0 100644 --- a/drivers/net/ethernet/freescale/enetc/enetc4_pf.c +++ b/drivers/net/ethernet/freescale/enetc/enetc4_pf.c @@ -73,7 +73,7 @@ static void enetc4_pf_get_si_primary_mac(struct enetc_hw *hw, int si, u16 lower; upper = __raw_readl(hw->port + ENETC4_PSIPMAR0(si)); - lower = __raw_readw(hw->port + ENETC4_PSIPMAR1(si)); + lower = __raw_readl(hw->port + ENETC4_PSIPMAR1(si)); put_unaligned_le32(upper, addr); put_unaligned_le16(lower, addr + 4); diff --git a/drivers/net/ethernet/freescale/enetc/enetc_hw.h b/drivers/net/ethernet/freescale/enetc/enetc_hw.h index 7b882b8921fe..662e4fbafb74 100644 --- a/drivers/net/ethernet/freescale/enetc/enetc_hw.h +++ b/drivers/net/ethernet/freescale/enetc/enetc_hw.h @@ -708,13 +708,24 @@ struct enetc_cmd_rfse { #define ENETC_RFSE_EN BIT(15) #define ENETC_RFSE_MODE_BD 2 +static inline void enetc_get_primary_mac_addr(struct enetc_hw *hw, u8 *addr) +{ + u32 upper; + u16 lower; + + upper = __raw_readl(hw->reg + ENETC_SIPMAR0); + lower = __raw_readl(hw->reg + ENETC_SIPMAR1); + + put_unaligned_le32(upper, addr); + put_unaligned_le16(lower, addr + 4); +} + static inline void enetc_load_primary_mac_addr(struct enetc_hw *hw, struct net_device *ndev) { - u8 addr[ETH_ALEN] __aligned(4); + u8 addr[ETH_ALEN]; - *(u32 *)addr = __raw_readl(hw->reg + ENETC_SIPMAR0); - *(u16 *)(addr + 4) = __raw_readw(hw->reg + ENETC_SIPMAR1); + enetc_get_primary_mac_addr(hw, addr); eth_hw_addr_set(ndev, addr); } From 16459fe7e0ca6520a6e8f603de4ccd52b90fd765 Mon Sep 17 00:00:00 2001 From: Andrew Cooper Date: Mon, 26 Jan 2026 21:10:46 +0000 Subject: [PATCH 089/167] x86/kfence: fix booting on 32bit non-PAE systems The original patch inverted the PTE unconditionally to avoid L1TF-vulnerable PTEs, but Linux doesn't make this adjustment in 2-level paging. Adjust the logic to use the flip_protnone_guard() helper, which is a nop on 2-level paging but inverts the address bits in all other paging modes. This doesn't matter for the Xen aspect of the original change. Linux no longer supports running 32bit PV under Xen, and Xen doesn't support running any 32bit PV guests without using PAE paging. Link: https://lkml.kernel.org/r/20260126211046.2096622-1-andrew.cooper3@citrix.com Fixes: b505f1944535 ("x86/kfence: avoid writing L1TF-vulnerable PTEs") Reported-by: Ryusuke Konishi Closes: https://lore.kernel.org/lkml/CAKFNMokwjw68ubYQM9WkzOuH51wLznHpEOMSqtMoV1Rn9JV_gw@mail.gmail.com/ Signed-off-by: Andrew Cooper Tested-by: Ryusuke Konishi Tested-by: Borislav Petkov (AMD) Cc: Alexander Potapenko Cc: Marco Elver Cc: Dmitry Vyukov Cc: Thomas Gleixner Cc: Ingo Molnar Cc: Dave Hansen Cc: "H. Peter Anvin" Cc: Jann Horn Cc: Signed-off-by: Andrew Morton --- arch/x86/include/asm/kfence.h | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-) diff --git a/arch/x86/include/asm/kfence.h b/arch/x86/include/asm/kfence.h index acf9ffa1a171..dfd5c74ba41a 100644 --- a/arch/x86/include/asm/kfence.h +++ b/arch/x86/include/asm/kfence.h @@ -42,7 +42,7 @@ static inline bool kfence_protect_page(unsigned long addr, bool protect) { unsigned int level; pte_t *pte = lookup_address(addr, &level); - pteval_t val; + pteval_t val, new; if (WARN_ON(!pte || level != PG_LEVEL_4K)) return false; @@ -57,11 +57,12 @@ static inline bool kfence_protect_page(unsigned long addr, bool protect) return true; /* - * Otherwise, invert the entire PTE. This avoids writing out an + * Otherwise, flip the Present bit, taking care to avoid writing an * L1TF-vulnerable PTE (not present, without the high address bits * set). */ - set_pte(pte, __pte(~val)); + new = val ^ _PAGE_PRESENT; + set_pte(pte, __pte(flip_protnone_guard(val, new, PTE_PFN_MASK))); /* * If the page was protected (non-present) and we're making it From 011d4e52a76cf4131ae612869b9874c09eef7657 Mon Sep 17 00:00:00 2001 From: "Pratyush Yadav (Google)" Date: Tue, 27 Jan 2026 00:02:52 +0100 Subject: [PATCH 090/167] liveupdate: luo_file: do not clear serialized_data on unfreeze Patch series "liveupdate: fixes in error handling". This series contains some fixes in LUO's error handling paths. The first patch deals with failed freeze() attempts. The cleanup path calls unfreeze, and that clears some data needed by later unpreserve calls. The second patch is a bit more involved. It deals with failed retrieve() attempts. To do so properly, it reworks some of the error handling logic in luo_file core. Both these fixes are "theoretical" -- in the sense that I have not been able to reproduce either of them in normal operation. The only supported file type right now is memfd, and there is nothing userspace can do right now to make it fail its retrieve or freeze. I need to make the retrieve or freeze fail by artificially injecting errors. The injected errors trigger a use-after-free and a double-free. That said, once more complex file handlers are added or memfd preservation is used in ways not currently expected or covered by the tests, we will be able to see them on real systems. This patch (of 2): The unfreeze operation is supposed to undo the effects of the freeze operation. serialized_data is not set by freeze, but by preserve. Consequently, the unpreserve operation needs to access serialized_data to undo the effects of the preserve operation. This includes freeing the serialized data structures for example. If a freeze callback fails, unfreeze is called for all frozen files. This would clear serialized_data for them. Since live update has failed, it can be expected that userspace aborts, releasing all sessions. When the sessions are released, unpreserve will be called for all files. The unfrozen files will see 0 in their serialized_data. This is not expected by file handlers, and they might either fail, leaking data and state, or might even crash or cause invalid memory access. Do not clear serialized_data on unfreeze so it gets passed on to unpreserve. There is no need to clear it on unpreserve since luo_file will be freed immediately after. Link: https://lkml.kernel.org/r/20260126230302.2936817-1-pratyush@kernel.org Link: https://lkml.kernel.org/r/20260126230302.2936817-2-pratyush@kernel.org Fixes: 7c722a7f44e0 ("liveupdate: luo_file: implement file systems callbacks") Signed-off-by: Pratyush Yadav (Google) Reviewed-by: Mike Rapoport (Microsoft) Cc: Pasha Tatashin Signed-off-by: Andrew Morton --- kernel/liveupdate/luo_file.c | 2 -- 1 file changed, 2 deletions(-) diff --git a/kernel/liveupdate/luo_file.c b/kernel/liveupdate/luo_file.c index a32a777f6df8..9f7283379ebc 100644 --- a/kernel/liveupdate/luo_file.c +++ b/kernel/liveupdate/luo_file.c @@ -402,8 +402,6 @@ static void luo_file_unfreeze_one(struct luo_file_set *file_set, luo_file->fh->ops->unfreeze(&args); } - - luo_file->serialized_data = 0; } static void __luo_file_unfreeze(struct luo_file_set *file_set, From f1675db3c7f1838dbbec4f1768d680e0acbc6721 Mon Sep 17 00:00:00 2001 From: Alexander Mikhalitsyn Date: Wed, 28 Jan 2026 18:39:15 +0100 Subject: [PATCH 091/167] mailmap: update Alexander Mikhalitsyn's emails Link: https://lkml.kernel.org/r/20260128173915.162309-1-alexander@mihalicyn.com Signed-off-by: Alexander Mikhalitsyn Signed-off-by: Andrew Morton --- .mailmap | 1 + 1 file changed, 1 insertion(+) diff --git a/.mailmap b/.mailmap index a64f269dd7cb..f0aa92beab95 100644 --- a/.mailmap +++ b/.mailmap @@ -33,6 +33,7 @@ Alexander Lobakin Alexander Lobakin Alexander Mikhalitsyn Alexander Mikhalitsyn +Alexander Mikhalitsyn Alexander Sverdlin Alexander Sverdlin Alexander Sverdlin From 2030dddf95451b4e7a389f052091e7c4b7b274c6 Mon Sep 17 00:00:00 2001 From: Kairui Song Date: Thu, 29 Jan 2026 00:19:23 +0800 Subject: [PATCH 092/167] mm, shmem: prevent infinite loop on truncate race When truncating a large swap entry, shmem_free_swap() returns 0 when the entry's index doesn't match the given index due to lookup alignment. The failure fallback path checks if the entry crosses the end border and aborts when it happens, so truncate won't erase an unexpected entry or range. But one scenario was ignored. When `index` points to the middle of a large swap entry, and the large swap entry doesn't go across the end border, find_get_entries() will return that large swap entry as the first item in the batch with `indices[0]` equal to `index`. The entry's base index will be smaller than `indices[0]`, so shmem_free_swap() will fail and return 0 due to the "base < index" check. The code will then call shmem_confirm_swap(), get the order, check if it crosses the END boundary (which it doesn't), and retry with the same index. The next iteration will find the same entry again at the same index with same indices, leading to an infinite loop. Fix this by retrying with a round-down index, and abort if the index is smaller than the truncate range. Link: https://lkml.kernel.org/r/aXo6ltB5iqAKJzY8@KASONG-MC4 Fixes: 809bc86517cc ("mm: shmem: support large folio swap out") Fixes: 8a1968bd997f ("mm/shmem, swap: fix race of truncate and swap entry split") Signed-off-by: Kairui Song Reported-by: Chris Mason Closes: https://lore.kernel.org/linux-mm/20260128130336.727049-1-clm@meta.com/ Reviewed-by: Baolin Wang Cc: Baoquan He Cc: Barry Song Cc: Chris Li Cc: Hugh Dickins Cc: Kemeng Shi Cc: Nhat Pham Cc: Signed-off-by: Andrew Morton --- mm/shmem.c | 23 ++++++++++++++--------- 1 file changed, 14 insertions(+), 9 deletions(-) diff --git a/mm/shmem.c b/mm/shmem.c index 6c3485d24d66..79af5f9f8b90 100644 --- a/mm/shmem.c +++ b/mm/shmem.c @@ -1211,17 +1211,22 @@ static void shmem_undo_range(struct inode *inode, loff_t lstart, uoff_t lend, swaps_freed = shmem_free_swap(mapping, indices[i], end - 1, folio); if (!swaps_freed) { - /* - * If found a large swap entry cross the end border, - * skip it as the truncate_inode_partial_folio above - * should have at least zerod its content once. - */ + pgoff_t base = indices[i]; + order = shmem_confirm_swap(mapping, indices[i], radix_to_swp_entry(folio)); - if (order > 0 && indices[i] + (1 << order) > end) - continue; - /* Swap was replaced by page: retry */ - index = indices[i]; + /* + * If found a large swap entry cross the end or start + * border, skip it as the truncate_inode_partial_folio + * above should have at least zerod its content once. + */ + if (order > 0) { + base = round_down(base, 1 << order); + if (base < start || base + (1 << order) > end) + continue; + } + /* Swap was replaced by page or extended, retry */ + index = base; break; } nr_swaps_freed += swaps_freed; From 1a47837bfafed7e9ef93f5dfdea6d70869b0c3ab Mon Sep 17 00:00:00 2001 From: Li Chen Date: Fri, 30 Jan 2026 19:20:33 +0800 Subject: [PATCH 093/167] Documentation: document liveupdate cmdline parameter liveupdate is used to enable Live Update Orchestrator (LUO) early during boot. Add it to kernel-parameters.txt so users can discover and use it. Link: https://lkml.kernel.org/r/20260130112036.359806-1-me@linux.beauty Signed-off-by: Li Chen Acked-by: Mike Rapoport (Microsoft) Cc: Arnd Bergmann Cc: "Borislav Petkov (AMD)" Cc: Frank van der Linden Cc: Ingo Molnar Cc: Jonathan Corbet Cc: Kees Cook Cc: Li RongQing Cc: Pawan Gupta Cc: Randy Dunlap Cc: Pratyush Yadav Cc: Pasha Tatashin Signed-off-by: Andrew Morton --- Documentation/admin-guide/kernel-parameters.txt | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt index 1058f2a6d6a8..aa0031108bc1 100644 --- a/Documentation/admin-guide/kernel-parameters.txt +++ b/Documentation/admin-guide/kernel-parameters.txt @@ -3472,6 +3472,11 @@ Kernel parameters If there are multiple matching configurations changing the same attribute, the last one is used. + liveupdate= [KNL,EARLY] + Format: + Enable Live Update Orchestrator (LUO). + Default: off. + load_ramdisk= [RAM] [Deprecated] lockd.nlm_grace_period=P [NFS] Assign grace period. From 3125fc17016945b11e9725c6aff30ff3326fd58f Mon Sep 17 00:00:00 2001 From: Tomas Hlavacek Date: Fri, 30 Jan 2026 11:23:01 +0100 Subject: [PATCH 094/167] net: spacemit: k1-emac: fix jumbo frame support The driver never programs the MAC frame size and jabber registers, causing the hardware to reject frames larger than the default 1518 bytes even when larger DMA buffers are allocated. Program MAC_MAXIMUM_FRAME_SIZE, MAC_TRANSMIT_JABBER_SIZE, and MAC_RECEIVE_JABBER_SIZE based on the configured MTU. Also fix the maximum buffer size from 4096 to 4095, since the descriptor buffer size field is only 12 bits. Account for double VLAN tags in frame size calculations. Fixes: bfec6d7f2001 ("net: spacemit: Add K1 Ethernet MAC") Cc: stable@vger.kernel.org Signed-off-by: Tomas Hlavacek Link: https://patch.msgid.link/20260130102301.477514-1-tmshlvck@gmail.com Signed-off-by: Jakub Kicinski --- drivers/net/ethernet/spacemit/k1_emac.c | 21 +++++++++++++++------ 1 file changed, 15 insertions(+), 6 deletions(-) diff --git a/drivers/net/ethernet/spacemit/k1_emac.c b/drivers/net/ethernet/spacemit/k1_emac.c index 88e9424d2d51..b49c4708bf9e 100644 --- a/drivers/net/ethernet/spacemit/k1_emac.c +++ b/drivers/net/ethernet/spacemit/k1_emac.c @@ -12,6 +12,7 @@ #include #include #include +#include #include #include #include @@ -38,7 +39,7 @@ #define EMAC_DEFAULT_BUFSIZE 1536 #define EMAC_RX_BUF_2K 2048 -#define EMAC_RX_BUF_4K 4096 +#define EMAC_RX_BUF_MAX FIELD_MAX(RX_DESC_1_BUFFER_SIZE_1_MASK) /* Tuning parameters from SpacemiT */ #define EMAC_TX_FRAMES 64 @@ -202,8 +203,7 @@ static void emac_init_hw(struct emac_priv *priv) { /* Destination address for 802.3x Ethernet flow control */ u8 fc_dest_addr[ETH_ALEN] = { 0x01, 0x80, 0xc2, 0x00, 0x00, 0x01 }; - - u32 rxirq = 0, dma = 0; + u32 rxirq = 0, dma = 0, frame_sz; regmap_set_bits(priv->regmap_apmu, priv->regmap_apmu_offset + APMU_EMAC_CTRL_REG, @@ -228,6 +228,15 @@ static void emac_init_hw(struct emac_priv *priv) DEFAULT_TX_THRESHOLD); emac_wr(priv, MAC_RECEIVE_PACKET_START_THRESHOLD, DEFAULT_RX_THRESHOLD); + /* Set maximum frame size and jabber size based on configured MTU, + * accounting for Ethernet header, double VLAN tags, and FCS. + */ + frame_sz = priv->ndev->mtu + ETH_HLEN + 2 * VLAN_HLEN + ETH_FCS_LEN; + + emac_wr(priv, MAC_MAXIMUM_FRAME_SIZE, frame_sz); + emac_wr(priv, MAC_TRANSMIT_JABBER_SIZE, frame_sz); + emac_wr(priv, MAC_RECEIVE_JABBER_SIZE, frame_sz); + /* Configure flow control (enabled in emac_adjust_link() later) */ emac_set_mac_addr_reg(priv, fc_dest_addr, MAC_FC_SOURCE_ADDRESS_HIGH); emac_wr(priv, MAC_FC_PAUSE_HIGH_THRESHOLD, DEFAULT_FC_FIFO_HIGH); @@ -924,14 +933,14 @@ static int emac_change_mtu(struct net_device *ndev, int mtu) return -EBUSY; } - frame_len = mtu + ETH_HLEN + ETH_FCS_LEN; + frame_len = mtu + ETH_HLEN + 2 * VLAN_HLEN + ETH_FCS_LEN; if (frame_len <= EMAC_DEFAULT_BUFSIZE) priv->dma_buf_sz = EMAC_DEFAULT_BUFSIZE; else if (frame_len <= EMAC_RX_BUF_2K) priv->dma_buf_sz = EMAC_RX_BUF_2K; else - priv->dma_buf_sz = EMAC_RX_BUF_4K; + priv->dma_buf_sz = EMAC_RX_BUF_MAX; ndev->mtu = mtu; @@ -2025,7 +2034,7 @@ static int emac_probe(struct platform_device *pdev) ndev->hw_features = NETIF_F_SG; ndev->features |= ndev->hw_features; - ndev->max_mtu = EMAC_RX_BUF_4K - (ETH_HLEN + ETH_FCS_LEN); + ndev->max_mtu = EMAC_RX_BUF_MAX - (ETH_HLEN + 2 * VLAN_HLEN + ETH_FCS_LEN); ndev->pcpu_stat_type = NETDEV_PCPU_STAT_DSTATS; priv = netdev_priv(ndev); From 29fb415a6a72c9207d118dd0a7a37184a14a3680 Mon Sep 17 00:00:00 2001 From: Filipe Manana Date: Fri, 30 Jan 2026 17:06:45 +0000 Subject: [PATCH 095/167] btrfs: raid56: fix memory leak of btrfs_raid_bio::stripe_uptodate_bitmap We allocate the bitmap but we never free it in free_raid_bio_pointers(). Fix this by adding a bitmap_free() call against the stripe_uptodate_bitmap of a raid bio. Fixes: 1810350b04ef ("btrfs: raid56: move sector_ptr::uptodate into a dedicated bitmap") Reported-by: Christoph Hellwig Link: https://lore.kernel.org/linux-btrfs/20260126045315.GA31641@lst.de/ Reviewed-by: Qu Wenruo Tested-by: Christoph Hellwig Signed-off-by: Filipe Manana Reviewed-by: David Sterba Signed-off-by: David Sterba --- fs/btrfs/raid56.c | 1 + 1 file changed, 1 insertion(+) diff --git a/fs/btrfs/raid56.c b/fs/btrfs/raid56.c index f38d8305e46d..baadaaa189c0 100644 --- a/fs/btrfs/raid56.c +++ b/fs/btrfs/raid56.c @@ -150,6 +150,7 @@ static void scrub_rbio_work_locked(struct work_struct *work); static void free_raid_bio_pointers(struct btrfs_raid_bio *rbio) { bitmap_free(rbio->error_bitmap); + bitmap_free(rbio->stripe_uptodate_bitmap); kfree(rbio->stripe_pages); kfree(rbio->bio_paddrs); kfree(rbio->stripe_paddrs); From d6ba734814266bbf7ee01f9030436597116805f3 Mon Sep 17 00:00:00 2001 From: Carlos Llamas Date: Tue, 27 Jan 2026 23:55:10 +0000 Subject: [PATCH 096/167] rust_binderfs: fix ida_alloc_max() upper bound The 'max' argument of ida_alloc_max() takes the maximum valid ID and not the "count". Using an ID of BINDERFS_MAX_MINOR (1 << 20) for dev->minor would exceed the limits of minor numbers (20-bits). Fix this off-by-one error by subtracting 1 from the 'max'. Cc: stable@vger.kernel.org Fixes: eafedbc7c050 ("rust_binder: add Rust Binder driver") Reported-by: kernel test robot Closes: https://lore.kernel.org/r/202512181203.IOv6IChH-lkp@intel.com/ Signed-off-by: Carlos Llamas Reviewed-by: Alice Ryhl Link: https://patch.msgid.link/20260127235545.2307876-1-cmllamas@google.com Signed-off-by: Greg Kroah-Hartman --- drivers/android/binder/rust_binderfs.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/drivers/android/binder/rust_binderfs.c b/drivers/android/binder/rust_binderfs.c index c69026df775c..88374b31ab7c 100644 --- a/drivers/android/binder/rust_binderfs.c +++ b/drivers/android/binder/rust_binderfs.c @@ -132,8 +132,8 @@ static int binderfs_binder_device_create(struct inode *ref_inode, mutex_lock(&binderfs_minors_mutex); if (++info->device_count <= info->mount_opts.max) minor = ida_alloc_max(&binderfs_minors, - use_reserve ? BINDERFS_MAX_MINOR : - BINDERFS_MAX_MINOR_CAPPED, + use_reserve ? BINDERFS_MAX_MINOR - 1 : + BINDERFS_MAX_MINOR_CAPPED - 1, GFP_KERNEL); else minor = -ENOSPC; @@ -405,8 +405,8 @@ static int binderfs_binder_ctl_create(struct super_block *sb) /* Reserve a new minor number for the new device. */ mutex_lock(&binderfs_minors_mutex); minor = ida_alloc_max(&binderfs_minors, - use_reserve ? BINDERFS_MAX_MINOR : - BINDERFS_MAX_MINOR_CAPPED, + use_reserve ? BINDERFS_MAX_MINOR - 1 : + BINDERFS_MAX_MINOR_CAPPED - 1, GFP_KERNEL); mutex_unlock(&binderfs_minors_mutex); if (minor < 0) { From ec4ddc90d201d09ef4e4bef8a2c6d9624525ad68 Mon Sep 17 00:00:00 2001 From: Carlos Llamas Date: Tue, 27 Jan 2026 23:55:11 +0000 Subject: [PATCH 097/167] binderfs: fix ida_alloc_max() upper bound The 'max' argument of ida_alloc_max() takes the maximum valid ID and not the "count". Using an ID of BINDERFS_MAX_MINOR (1 << 20) for dev->minor would exceed the limits of minor numbers (20-bits). Fix this off-by-one error by subtracting 1 from the 'max'. Cc: stable@vger.kernel.org Fixes: 3ad20fe393b3 ("binder: implement binderfs") Signed-off-by: Carlos Llamas Link: https://patch.msgid.link/20260127235545.2307876-2-cmllamas@google.com Signed-off-by: Greg Kroah-Hartman --- drivers/android/binderfs.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/drivers/android/binderfs.c b/drivers/android/binderfs.c index b46bcb91072d..9f8a18c88d66 100644 --- a/drivers/android/binderfs.c +++ b/drivers/android/binderfs.c @@ -132,8 +132,8 @@ static int binderfs_binder_device_create(struct inode *ref_inode, mutex_lock(&binderfs_minors_mutex); if (++info->device_count <= info->mount_opts.max) minor = ida_alloc_max(&binderfs_minors, - use_reserve ? BINDERFS_MAX_MINOR : - BINDERFS_MAX_MINOR_CAPPED, + use_reserve ? BINDERFS_MAX_MINOR - 1 : + BINDERFS_MAX_MINOR_CAPPED - 1, GFP_KERNEL); else minor = -ENOSPC; @@ -408,8 +408,8 @@ static int binderfs_binder_ctl_create(struct super_block *sb) /* Reserve a new minor number for the new device. */ mutex_lock(&binderfs_minors_mutex); minor = ida_alloc_max(&binderfs_minors, - use_reserve ? BINDERFS_MAX_MINOR : - BINDERFS_MAX_MINOR_CAPPED, + use_reserve ? BINDERFS_MAX_MINOR - 1 : + BINDERFS_MAX_MINOR_CAPPED - 1, GFP_KERNEL); mutex_unlock(&binderfs_minors_mutex); if (minor < 0) { From 5ff641011ab7fb63ea101251087745d9826e8ef5 Mon Sep 17 00:00:00 2001 From: Miri Korenblit Date: Thu, 29 Jan 2026 21:27:09 +0200 Subject: [PATCH 098/167] wifi: iwlwifi: mld: cancel mlo_scan_start_wk mlo_scan_start_wk is not canceled on disconnection. In fact, it is not canceled anywhere except in the restart cleanup, where we don't really have to. This can cause an init-after-queue issue: if, for example, the work was queued and then drv_change_interface got executed. This can also cause use-after-free: if the work is executed after the vif is freed. Fixes: 9748ad82a9d9 ("wifi: iwlwifi: defer MLO scan after link activation") Reviewed-by: Johannes Berg Signed-off-by: Miri Korenblit Link: https://patch.msgid.link/20260129212650.a36482a60719.I5bf64a108ca39dacb5ca0dcd8b7258a3ce8db74c@changeid --- drivers/net/wireless/intel/iwlwifi/mld/iface.c | 2 -- drivers/net/wireless/intel/iwlwifi/mld/mac80211.c | 2 ++ 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/net/wireless/intel/iwlwifi/mld/iface.c b/drivers/net/wireless/intel/iwlwifi/mld/iface.c index a5ececfc13e4..f15d1f5d1bf5 100644 --- a/drivers/net/wireless/intel/iwlwifi/mld/iface.c +++ b/drivers/net/wireless/intel/iwlwifi/mld/iface.c @@ -55,8 +55,6 @@ void iwl_mld_cleanup_vif(void *data, u8 *mac, struct ieee80211_vif *vif) ieee80211_iter_keys(mld->hw, vif, iwl_mld_cleanup_keys_iter, NULL); - wiphy_delayed_work_cancel(mld->wiphy, &mld_vif->mlo_scan_start_wk); - CLEANUP_STRUCT(mld_vif); } diff --git a/drivers/net/wireless/intel/iwlwifi/mld/mac80211.c b/drivers/net/wireless/intel/iwlwifi/mld/mac80211.c index 55b484c16280..cd0dce8de856 100644 --- a/drivers/net/wireless/intel/iwlwifi/mld/mac80211.c +++ b/drivers/net/wireless/intel/iwlwifi/mld/mac80211.c @@ -1759,6 +1759,8 @@ static int iwl_mld_move_sta_state_down(struct iwl_mld *mld, wiphy_work_cancel(mld->wiphy, &mld_vif->emlsr.unblock_tpt_wk); wiphy_delayed_work_cancel(mld->wiphy, &mld_vif->emlsr.check_tpt_wk); + wiphy_delayed_work_cancel(mld->wiphy, + &mld_vif->mlo_scan_start_wk); iwl_mld_reset_cca_40mhz_workaround(mld, vif); iwl_mld_smps_workaround(mld, vif, true); From fb7f54aa2a99b07945911152c5d3d4a6eb39f797 Mon Sep 17 00:00:00 2001 From: Miri Korenblit Date: Thu, 29 Jan 2026 21:27:10 +0200 Subject: [PATCH 099/167] wifi: iwlwifi: mvm: pause TCM on fast resume Not pausing it means that we can have the TCM work queued into a non-freezable workqueue, which, in resume, is re-activated before the driver's resume is called. The TCM work might send commands to the FW before we resumed the device, leading to an assert. Closes: https://lore.kernel.org/linux-wireless/aTDoDiD55qlUZ0pn@debian.local/ Tested-by: Chris Bainbridge Fixes: e8bb19c1d590 ("wifi: iwlwifi: support fast resume") Reviewed-by: Johannes Berg Signed-off-by: Miri Korenblit Link: https://patch.msgid.link/20260129212650.05621f3faedb.I44df9cf9183b5143df8078131e0d87c0fd7e1763@changeid --- drivers/net/wireless/intel/iwlwifi/mvm/d3.c | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/drivers/net/wireless/intel/iwlwifi/mvm/d3.c b/drivers/net/wireless/intel/iwlwifi/mvm/d3.c index 07f1a84c274e..af1a45845999 100644 --- a/drivers/net/wireless/intel/iwlwifi/mvm/d3.c +++ b/drivers/net/wireless/intel/iwlwifi/mvm/d3.c @@ -1,6 +1,6 @@ // SPDX-License-Identifier: GPL-2.0 OR BSD-3-Clause /* - * Copyright (C) 2012-2014, 2018-2025 Intel Corporation + * Copyright (C) 2012-2014, 2018-2026 Intel Corporation * Copyright (C) 2013-2015 Intel Mobile Communications GmbH * Copyright (C) 2016-2017 Intel Deutschland GmbH */ @@ -3239,6 +3239,8 @@ void iwl_mvm_fast_suspend(struct iwl_mvm *mvm) IWL_DEBUG_WOWLAN(mvm, "Starting fast suspend flow\n"); + iwl_mvm_pause_tcm(mvm, true); + mvm->fast_resume = true; set_bit(IWL_MVM_STATUS_IN_D3, &mvm->status); @@ -3295,6 +3297,8 @@ int iwl_mvm_fast_resume(struct iwl_mvm *mvm) mvm->trans->state = IWL_TRANS_NO_FW; } + iwl_mvm_resume_tcm(mvm); + out: clear_bit(IWL_MVM_STATUS_IN_D3, &mvm->status); mvm->fast_resume = false; From 284e70ace9ecdeb8644fbe65c5da12c90b377545 Mon Sep 17 00:00:00 2001 From: Bard Liao Date: Tue, 3 Feb 2026 15:24:05 +0800 Subject: [PATCH 100/167] ASoC: SOF: Intel: use hdev->info.link_mask directly MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit The link_mask variable is not changed after setting to hdev->info.link_mask until it is used for another purpose to get the used SoundWire links and set to mach->mach_params.links. Besides, the link_mask variable should be reset before any link id is added to the link_mask. To fix the issue above and avoid confusing, use the hdev->info.link_mask variable directly to check if the SoundWire link is enabled. Fixes: 5226d19d4cae ("ASoC: SOF: Intel: use sof_sdw as default SDW machine driver") Signed-off-by: Bard Liao Reviewed-by: Ranjani Sridharan Reviewed-by: Péter Ujfalusi Link: https://patch.msgid.link/20260203072405.3716307-1-yung-chuan.liao@linux.intel.com Signed-off-by: Mark Brown --- sound/soc/sof/intel/hda.c | 5 ++--- 1 file changed, 2 insertions(+), 3 deletions(-) diff --git a/sound/soc/sof/intel/hda.c b/sound/soc/sof/intel/hda.c index 0bb85f92e106..686ecc040867 100644 --- a/sound/soc/sof/intel/hda.c +++ b/sound/soc/sof/intel/hda.c @@ -1304,9 +1304,8 @@ static struct snd_soc_acpi_mach *hda_sdw_machine_select(struct snd_sof_dev *sdev int i; hdev = pdata->hw_pdata; - link_mask = hdev->info.link_mask; - if (!link_mask) { + if (!hdev->info.link_mask) { dev_info(sdev->dev, "SoundWire links not enabled\n"); return NULL; } @@ -1337,7 +1336,7 @@ static struct snd_soc_acpi_mach *hda_sdw_machine_select(struct snd_sof_dev *sdev * link_mask supported by hw and then go on searching * link_adr */ - if (~link_mask & mach->link_mask) + if (~hdev->info.link_mask & mach->link_mask) continue; /* No need to match adr if there is no links defined */ From 6e1e735181e0c18e1f4ecb0118be4b1e2ee439d1 Mon Sep 17 00:00:00 2001 From: Shuming Fan Date: Tue, 3 Feb 2026 16:48:26 +0800 Subject: [PATCH 101/167] ASoC: rt1320: fix intermittent no-sound issue This patch adds a setting to resolve the intermittent no-sound issue. Signed-off-by: Shuming Fan Link: https://patch.msgid.link/20260203084827.768238-1-shumingf@realtek.com Signed-off-by: Mark Brown --- sound/soc/codecs/rt1320-sdw.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/sound/soc/codecs/rt1320-sdw.c b/sound/soc/codecs/rt1320-sdw.c index feecef258b65..e6142645b903 100644 --- a/sound/soc/codecs/rt1320-sdw.c +++ b/sound/soc/codecs/rt1320-sdw.c @@ -203,6 +203,7 @@ static const struct reg_sequence rt1320_vc_blind_write[] = { { 0x3fc2bfc2, 0x00 }, { 0x3fc2bfc1, 0x00 }, { 0x3fc2bfc0, 0x07 }, + { 0x1000cc46, 0x00 }, { 0x0000d486, 0x43 }, { SDW_SDCA_CTL(FUNC_NUM_AMP, RT1320_SDCA_ENT_PDE23, RT1320_SDCA_CTL_REQ_POWER_STATE, 0), 0x00 }, { 0x1000db00, 0x07 }, @@ -354,6 +355,7 @@ static const struct reg_sequence rt1321_blind_write[] = { { 0x0000d73d, 0xd7 }, { 0x0000d73e, 0x00 }, { 0x0000d73f, 0x10 }, + { 0x1000cd56, 0x00 }, { 0x3fc2dfc3, 0x00 }, { 0x3fc2dfc2, 0x00 }, { 0x3fc2dfc1, 0x00 }, From 826af7fa62e347464b1b4e0ba2fe19a92438084f Mon Sep 17 00:00:00 2001 From: Takashi Iwai Date: Tue, 3 Feb 2026 15:09:59 +0100 Subject: [PATCH 102/167] ALSA: aloop: Fix racy access at PCM trigger The PCM trigger callback of aloop driver tries to check the PCM state and stop the stream of the tied substream in the corresponding cable. Since both check and stop operations are performed outside the cable lock, this may result in UAF when a program attempts to trigger frequently while opening/closing the tied stream, as spotted by fuzzers. For addressing the UAF, this patch changes two things: - It covers the most of code in loopback_check_format() with cable->lock spinlock, and add the proper NULL checks. This avoids already some racy accesses. - In addition, now we try to check the state of the capture PCM stream that may be stopped in this function, which was the major pain point leading to UAF. Reported-by: syzbot+5f8f3acdee1ec7a7ef7b@syzkaller.appspotmail.com Closes: https://lore.kernel.org/69783ba1.050a0220.c9109.0011.GAE@google.com Cc: Link: https://patch.msgid.link/20260203141003.116584-1-tiwai@suse.de Signed-off-by: Takashi Iwai --- sound/drivers/aloop.c | 62 +++++++++++++++++++++++++------------------ 1 file changed, 36 insertions(+), 26 deletions(-) diff --git a/sound/drivers/aloop.c b/sound/drivers/aloop.c index 64ef03b2d579..aa0d2fcb1a18 100644 --- a/sound/drivers/aloop.c +++ b/sound/drivers/aloop.c @@ -336,37 +336,43 @@ static bool is_access_interleaved(snd_pcm_access_t access) static int loopback_check_format(struct loopback_cable *cable, int stream) { + struct loopback_pcm *dpcm_play, *dpcm_capt; struct snd_pcm_runtime *runtime, *cruntime; struct loopback_setup *setup; struct snd_card *card; + bool stop_capture = false; int check; - if (cable->valid != CABLE_VALID_BOTH) { - if (stream == SNDRV_PCM_STREAM_PLAYBACK) - goto __notify; - return 0; - } - runtime = cable->streams[SNDRV_PCM_STREAM_PLAYBACK]-> - substream->runtime; - cruntime = cable->streams[SNDRV_PCM_STREAM_CAPTURE]-> - substream->runtime; - check = runtime->format != cruntime->format || - runtime->rate != cruntime->rate || - runtime->channels != cruntime->channels || - is_access_interleaved(runtime->access) != - is_access_interleaved(cruntime->access); - if (!check) - return 0; - if (stream == SNDRV_PCM_STREAM_CAPTURE) { - return -EIO; - } else { - snd_pcm_stop(cable->streams[SNDRV_PCM_STREAM_CAPTURE]-> - substream, SNDRV_PCM_STATE_DRAINING); - __notify: - runtime = cable->streams[SNDRV_PCM_STREAM_PLAYBACK]-> - substream->runtime; - setup = get_setup(cable->streams[SNDRV_PCM_STREAM_PLAYBACK]); - card = cable->streams[SNDRV_PCM_STREAM_PLAYBACK]->loopback->card; + scoped_guard(spinlock_irqsave, &cable->lock) { + dpcm_play = cable->streams[SNDRV_PCM_STREAM_PLAYBACK]; + dpcm_capt = cable->streams[SNDRV_PCM_STREAM_CAPTURE]; + + if (cable->valid != CABLE_VALID_BOTH) { + if (stream == SNDRV_PCM_STREAM_CAPTURE || !dpcm_play) + return 0; + } else { + if (!dpcm_play || !dpcm_capt) + return -EIO; + runtime = dpcm_play->substream->runtime; + cruntime = dpcm_capt->substream->runtime; + if (!runtime || !cruntime) + return -EIO; + check = runtime->format != cruntime->format || + runtime->rate != cruntime->rate || + runtime->channels != cruntime->channels || + is_access_interleaved(runtime->access) != + is_access_interleaved(cruntime->access); + if (!check) + return 0; + if (stream == SNDRV_PCM_STREAM_CAPTURE) + return -EIO; + else if (cruntime->state == SNDRV_PCM_STATE_RUNNING) + stop_capture = true; + } + + setup = get_setup(dpcm_play); + card = dpcm_play->loopback->card; + runtime = dpcm_play->substream->runtime; if (setup->format != runtime->format) { snd_ctl_notify(card, SNDRV_CTL_EVENT_MASK_VALUE, &setup->format_id); @@ -389,6 +395,10 @@ static int loopback_check_format(struct loopback_cable *cable, int stream) setup->access = runtime->access; } } + + if (stop_capture) + snd_pcm_stop(dpcm_capt->substream, SNDRV_PCM_STATE_DRAINING); + return 0; } From b1dfe4e0fcef0cc01233a70ec8fd95b900024a5a Mon Sep 17 00:00:00 2001 From: Jens Axboe Date: Tue, 3 Feb 2026 09:55:47 -0700 Subject: [PATCH 103/167] io_uring/fdinfo: kill unnecessary newline feed in CQE32 printing There's an unconditional newline feed anyway after dumping both normal and big CQE contents, remove the \n from the CQE32 extra1/extra2 printing. Signed-off-by: Jens Axboe --- io_uring/fdinfo.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/io_uring/fdinfo.c b/io_uring/fdinfo.c index a87d4e26eee8..4f12e98b22c3 100644 --- a/io_uring/fdinfo.c +++ b/io_uring/fdinfo.c @@ -159,7 +159,7 @@ static void __io_uring_show_fdinfo(struct io_ring_ctx *ctx, struct seq_file *m) cq_head & cq_mask, cqe->user_data, cqe->res, cqe->flags); if (cqe32) - seq_printf(m, ", extra1:%llu, extra2:%llu\n", + seq_printf(m, ", extra1:%llu, extra2:%llu", cqe->big_cqe[0], cqe->big_cqe[1]); seq_printf(m, "\n"); cq_head++; From 124bdc6eccc8c5cba68fee00e01c084c116c4360 Mon Sep 17 00:00:00 2001 From: Sergey Shtylyov Date: Tue, 3 Feb 2026 19:15:57 +0300 Subject: [PATCH 104/167] ALSA: usb-audio: fix broken logic in snd_audigy2nx_led_update() When the support for the Sound Blaster X-Fi Surround 5.1 Pro was added, the existing logic for the X-Fi Surround 5.1 in snd_audigy2nx_led_put() was broken due to missing *else* before the added *if*: snd_usb_ctl_msg() became incorrectly called twice and an error from first snd_usb_ctl_msg() call ignored. As the added snd_usb_ctl_msg() call was totally identical to the existing one for the "plain" X-Fi Surround 5.1, just merge those two *if* statements while fixing the broken logic... Found by Linux Verification Center (linuxtesting.org) with the Svace static analysis tool. Fixes: 7cdd8d73139e ("ALSA: usb-audio - Add support for USB X-Fi S51 Pro") Signed-off-by: Sergey Shtylyov Link: https://patch.msgid.link/20260203161558.18680-1-s.shtylyov@auroraos.dev Signed-off-by: Takashi Iwai --- sound/usb/mixer_quirks.c | 9 ++------- 1 file changed, 2 insertions(+), 7 deletions(-) diff --git a/sound/usb/mixer_quirks.c b/sound/usb/mixer_quirks.c index f8eccb1c58da..cf4ccf467cb6 100644 --- a/sound/usb/mixer_quirks.c +++ b/sound/usb/mixer_quirks.c @@ -311,13 +311,8 @@ static int snd_audigy2nx_led_update(struct usb_mixer_interface *mixer, if (pm.err < 0) return pm.err; - if (chip->usb_id == USB_ID(0x041e, 0x3042)) - err = snd_usb_ctl_msg(chip->dev, - usb_sndctrlpipe(chip->dev, 0), 0x24, - USB_DIR_OUT | USB_TYPE_VENDOR | USB_RECIP_OTHER, - !value, 0, NULL, 0); - /* USB X-Fi S51 Pro */ - if (chip->usb_id == USB_ID(0x041e, 0x30df)) + if (chip->usb_id == USB_ID(0x041e, 0x3042) || /* USB X-Fi S51 */ + chip->usb_id == USB_ID(0x041e, 0x30df)) /* USB X-Fi S51 Pro */ err = snd_usb_ctl_msg(chip->dev, usb_sndctrlpipe(chip->dev, 0), 0x24, USB_DIR_OUT | USB_TYPE_VENDOR | USB_RECIP_OTHER, From 38cfdd9dd279473a73814df9fd7e6e716951d361 Mon Sep 17 00:00:00 2001 From: Jens Axboe Date: Tue, 3 Feb 2026 09:56:55 -0700 Subject: [PATCH 105/167] io_uring/fdinfo: be a bit nicer when looping a lot of SQEs/CQEs MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Add cond_resched() in those dump loops, just in case a lot of entries are being dumped. And detect invalid CQ ring head/tail entries, to avoid iterating more than what is necessary. Generally not an issue, but can be if things like KASAN or other debugging metrics are enabled. Reported-by: 是参差 Link: https://lore.kernel.org/all/PS1PPF7E1D7501FE5631002D242DD89403FAB9BA@PS1PPF7E1D7501F.apcprd02.prod.outlook.com/ Reviewed-by: Keith Busch Signed-off-by: Jens Axboe --- io_uring/fdinfo.c | 11 ++++++++--- 1 file changed, 8 insertions(+), 3 deletions(-) diff --git a/io_uring/fdinfo.c b/io_uring/fdinfo.c index 4f12e98b22c3..80178b69e05a 100644 --- a/io_uring/fdinfo.c +++ b/io_uring/fdinfo.c @@ -67,7 +67,7 @@ static void __io_uring_show_fdinfo(struct io_ring_ctx *ctx, struct seq_file *m) unsigned int cq_head = READ_ONCE(r->cq.head); unsigned int cq_tail = READ_ONCE(r->cq.tail); unsigned int sq_shift = 0; - unsigned int sq_entries; + unsigned int cq_entries, sq_entries; int sq_pid = -1, sq_cpu = -1; u64 sq_total_time = 0, sq_work_time = 0; unsigned int i; @@ -146,9 +146,11 @@ static void __io_uring_show_fdinfo(struct io_ring_ctx *ctx, struct seq_file *m) } } seq_printf(m, "\n"); + cond_resched(); } seq_printf(m, "CQEs:\t%u\n", cq_tail - cq_head); - while (cq_head < cq_tail) { + cq_entries = min(cq_tail - cq_head, ctx->cq_entries); + for (i = 0; i < cq_entries; i++) { struct io_uring_cqe *cqe; bool cqe32 = false; @@ -163,8 +165,11 @@ static void __io_uring_show_fdinfo(struct io_ring_ctx *ctx, struct seq_file *m) cqe->big_cqe[0], cqe->big_cqe[1]); seq_printf(m, "\n"); cq_head++; - if (cqe32) + if (cqe32) { cq_head++; + i++; + } + cond_resched(); } if (ctx->flags & IORING_SETUP_SQPOLL) { From bd3884a204c3b507e6baa9a4091aa927f9af5404 Mon Sep 17 00:00:00 2001 From: Ilya Dryomov Date: Wed, 7 Jan 2026 22:37:55 +0100 Subject: [PATCH 106/167] rbd: check for EOD after exclusive lock is ensured to be held Similar to commit 870611e4877e ("rbd: get snapshot context after exclusive lock is ensured to be held"), move the "beyond EOD" check into the image request state machine so that it's performed after exclusive lock is ensured to be held. This avoids various race conditions which can arise when the image is shrunk under I/O (in practice, mostly readahead). In one such scenario rbd_assert(objno < rbd_dev->object_map_size); can be triggered if a close-to-EOD read gets queued right before the shrink is initiated and the EOD check is performed against an outdated mapping_size. After the resize is done on the server side and exclusive lock is (re)acquired bringing along the new (now shrunk) object map, the read starts going through the state machine and rbd_obj_may_exist() gets invoked on an object that is out of bounds of rbd_dev->object_map array. Cc: stable@vger.kernel.org Signed-off-by: Ilya Dryomov Reviewed-by: Dongsheng Yang --- drivers/block/rbd.c | 33 +++++++++++++++++++++------------ 1 file changed, 21 insertions(+), 12 deletions(-) diff --git a/drivers/block/rbd.c b/drivers/block/rbd.c index af0e21149dbc..8f441eb8b192 100644 --- a/drivers/block/rbd.c +++ b/drivers/block/rbd.c @@ -3495,11 +3495,29 @@ static void rbd_img_object_requests(struct rbd_img_request *img_req) rbd_assert(!need_exclusive_lock(img_req) || __rbd_is_lock_owner(rbd_dev)); - if (rbd_img_is_write(img_req)) { - rbd_assert(!img_req->snapc); + if (test_bit(IMG_REQ_CHILD, &img_req->flags)) { + rbd_assert(!rbd_img_is_write(img_req)); + } else { + struct request *rq = blk_mq_rq_from_pdu(img_req); + u64 off = (u64)blk_rq_pos(rq) << SECTOR_SHIFT; + u64 len = blk_rq_bytes(rq); + u64 mapping_size; + down_read(&rbd_dev->header_rwsem); - img_req->snapc = ceph_get_snap_context(rbd_dev->header.snapc); + mapping_size = rbd_dev->mapping.size; + if (rbd_img_is_write(img_req)) { + rbd_assert(!img_req->snapc); + img_req->snapc = + ceph_get_snap_context(rbd_dev->header.snapc); + } up_read(&rbd_dev->header_rwsem); + + if (unlikely(off + len > mapping_size)) { + rbd_warn(rbd_dev, "beyond EOD (%llu~%llu > %llu)", + off, len, mapping_size); + img_req->pending.result = -EIO; + return; + } } for_each_obj_request(img_req, obj_req) { @@ -4725,7 +4743,6 @@ static void rbd_queue_workfn(struct work_struct *work) struct request *rq = blk_mq_rq_from_pdu(img_request); u64 offset = (u64)blk_rq_pos(rq) << SECTOR_SHIFT; u64 length = blk_rq_bytes(rq); - u64 mapping_size; int result; /* Ignore/skip any zero-length requests */ @@ -4738,17 +4755,9 @@ static void rbd_queue_workfn(struct work_struct *work) blk_mq_start_request(rq); down_read(&rbd_dev->header_rwsem); - mapping_size = rbd_dev->mapping.size; rbd_img_capture_header(img_request); up_read(&rbd_dev->header_rwsem); - if (offset + length > mapping_size) { - rbd_warn(rbd_dev, "beyond EOD (%llu~%llu > %llu)", offset, - length, mapping_size); - result = -EIO; - goto err_img_request; - } - dout("%s rbd_dev %p img_req %p %s %llu~%llu\n", __func__, rbd_dev, img_request, obj_op_name(op_type), offset, length); From bc8dedae022ce3058659c3addef3ec4b41d15e00 Mon Sep 17 00:00:00 2001 From: Daniel Vogelbacher Date: Sun, 1 Feb 2026 09:34:01 +0100 Subject: [PATCH 107/167] ceph: fix oops due to invalid pointer for kfree() in parse_longname() This fixes a kernel oops when reading ceph snapshot directories (.snap), for example by simply running `ls /mnt/my_ceph/.snap`. The variable str is guarded by __free(kfree), but advanced by one for skipping the initial '_' in snapshot names. Thus, kfree() is called with an invalid pointer. This patch removes the need for advancing the pointer so kfree() is called with correct memory pointer. Steps to reproduce: 1. Create snapshots on a cephfs volume (I've 63 snaps in my testcase) 2. Add cephfs mount to fstab $ echo "samba-fileserver@.files=/volumes/datapool/stuff/3461082b-ecc9-4e82-8549-3fd2590d3fb6 /mnt/test/stuff ceph acl,noatime,_netdev 0 0" >> /etc/fstab 3. Reboot the system $ systemctl reboot 4. Check if it's really mounted $ mount | grep stuff 5. List snapshots (expected 63 snapshots on my system) $ ls /mnt/test/stuff/.snap Now ls hangs forever and the kernel log shows the oops. Cc: stable@vger.kernel.org Fixes: 101841c38346 ("[ceph] parse_longname(): strrchr() expects NUL-terminated string") Closes: https://bugzilla.kernel.org/show_bug.cgi?id=220807 Suggested-by: Helge Deller Signed-off-by: Daniel Vogelbacher Reviewed-by: Viacheslav Dubeyko Signed-off-by: Ilya Dryomov --- fs/ceph/crypto.c | 9 +++++---- 1 file changed, 5 insertions(+), 4 deletions(-) diff --git a/fs/ceph/crypto.c b/fs/ceph/crypto.c index 0ea4db650f85..9a115282f67d 100644 --- a/fs/ceph/crypto.c +++ b/fs/ceph/crypto.c @@ -166,12 +166,13 @@ static struct inode *parse_longname(const struct inode *parent, struct ceph_vino vino = { .snap = CEPH_NOSNAP }; char *name_end, *inode_number; int ret = -EIO; - /* NUL-terminate */ - char *str __free(kfree) = kmemdup_nul(name, *name_len, GFP_KERNEL); + /* Snapshot name must start with an underscore */ + if (*name_len <= 0 || name[0] != '_') + return ERR_PTR(-EIO); + /* Skip initial '_' and NUL-terminate */ + char *str __free(kfree) = kmemdup_nul(name + 1, *name_len - 1, GFP_KERNEL); if (!str) return ERR_PTR(-ENOMEM); - /* Skip initial '_' */ - str++; name_end = strrchr(str, '_'); if (!name_end) { doutc(cl, "failed to parse long snapshot name: %s\n", str); From b126097b0327437048bd045a0e4d273dea2910dd Mon Sep 17 00:00:00 2001 From: LI Qingwu Date: Fri, 16 Jan 2026 11:19:05 +0000 Subject: [PATCH 108/167] i2c: imx: preserve error state in block data length handler When a block read returns an invalid length, zero or >I2C_SMBUS_BLOCK_MAX, the length handler sets the state to IMX_I2C_STATE_FAILED. However, i2c_imx_master_isr() unconditionally overwrites this with IMX_I2C_STATE_READ_CONTINUE, causing an endless read loop that overruns buffers and crashes the system. Guard the state transition to preserve error states set by the length handler. Fixes: 5f5c2d4579ca ("i2c: imx: prevent rescheduling in non dma mode") Signed-off-by: LI Qingwu Cc: # v6.13+ Reviewed-by: Stefan Eichenberger Signed-off-by: Andi Shyti Link: https://lore.kernel.org/r/20260116111906.3413346-2-Qing-wu.Li@leica-geosystems.com.cn Signed-off-by: Wolfram Sang --- drivers/i2c/busses/i2c-imx.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/drivers/i2c/busses/i2c-imx.c b/drivers/i2c/busses/i2c-imx.c index dcce882f3eba..85f554044cf1 100644 --- a/drivers/i2c/busses/i2c-imx.c +++ b/drivers/i2c/busses/i2c-imx.c @@ -1103,7 +1103,8 @@ static irqreturn_t i2c_imx_master_isr(struct imx_i2c_struct *i2c_imx, unsigned i case IMX_I2C_STATE_READ_BLOCK_DATA_LEN: i2c_imx_isr_read_block_data_len(i2c_imx); - i2c_imx->state = IMX_I2C_STATE_READ_CONTINUE; + if (i2c_imx->state == IMX_I2C_STATE_READ_BLOCK_DATA_LEN) + i2c_imx->state = IMX_I2C_STATE_READ_CONTINUE; break; case IMX_I2C_STATE_WRITE: From 1478a34470bf4755465d29b348b24a610bccc180 Mon Sep 17 00:00:00 2001 From: Mario Limonciello Date: Thu, 29 Jan 2026 13:47:22 -0600 Subject: [PATCH 109/167] drm/amd: Set minimum version for set_hw_resource_1 on gfx11 to 0x52 commit f81cd793119e ("drm/amd/amdgpu: Fix MES init sequence") caused a dependency on new enough MES firmware to use amdgpu. This was fixed on most gfx11 and gfx12 hardware with commit 0180e0a5dd5c ("drm/amdgpu/mes: add compatibility checks for set_hw_resource_1"), but this left out that GC 11.0.4 had breakage at MES 0x51. Bump the requirement to 0x52 instead. Reported-by: danijel@nausys.com Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4576 Fixes: f81cd793119e ("drm/amd/amdgpu: Fix MES init sequence") Reviewed-by: Alex Deucher Signed-off-by: Mario Limonciello Signed-off-by: Alex Deucher (cherry picked from commit c2d2ccc85faf8cc6934d50c18e43097eb453ade2) Cc: stable@vger.kernel.org --- drivers/gpu/drm/amd/amdgpu/mes_v11_0.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/amdgpu/mes_v11_0.c b/drivers/gpu/drm/amd/amdgpu/mes_v11_0.c index 3a52754b5cad..ceddc953785f 100644 --- a/drivers/gpu/drm/amd/amdgpu/mes_v11_0.c +++ b/drivers/gpu/drm/amd/amdgpu/mes_v11_0.c @@ -1671,7 +1671,7 @@ static int mes_v11_0_hw_init(struct amdgpu_ip_block *ip_block) if (r) goto failure; - if ((adev->mes.sched_version & AMDGPU_MES_VERSION_MASK) >= 0x50) { + if ((adev->mes.sched_version & AMDGPU_MES_VERSION_MASK) >= 0x52) { r = mes_v11_0_set_hw_resources_1(&adev->mes); if (r) { DRM_ERROR("failed mes_v11_0_set_hw_resources_1, r=%d\n", r); From 243b467dea1735fed904c2e54d248a46fa417a2d Mon Sep 17 00:00:00 2001 From: Bert Karwatzki Date: Sun, 1 Feb 2026 01:24:45 +0100 Subject: [PATCH 110/167] Revert "drm/amd: Check if ASPM is enabled from PCIe subsystem" MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit This reverts commit 7294863a6f01248d72b61d38478978d638641bee. This commit was erroneously applied again after commit 0ab5d711ec74 ("drm/amd: Refactor `amdgpu_aspm` to be evaluated per device") removed it, leading to very hard to debug crashes, when used with a system with two AMD GPUs of which only one supports ASPM. Link: https://lore.kernel.org/linux-acpi/20251006120944.7880-1-spasswolf@web.de/ Link: https://github.com/acpica/acpica/issues/1060 Fixes: 0ab5d711ec74 ("drm/amd: Refactor `amdgpu_aspm` to be evaluated per device") Signed-off-by: Bert Karwatzki Reviewed-by: Christian König Reviewed-by: Mario Limonciello (AMD) Signed-off-by: Mario Limonciello Signed-off-by: Alex Deucher (cherry picked from commit 97a9689300eb2b393ba5efc17c8e5db835917080) Cc: stable@vger.kernel.org --- drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 3 --- 1 file changed, 3 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c index 6ccb80e2d7c9..39387da8586b 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c @@ -2405,9 +2405,6 @@ static int amdgpu_pci_probe(struct pci_dev *pdev, return -ENODEV; } - if (amdgpu_aspm == -1 && !pcie_aspm_enabled(pdev)) - amdgpu_aspm = 0; - if (amdgpu_virtual_display || amdgpu_device_asic_has_dc_support(pdev, flags & AMD_ASIC_MASK)) supports_atomic = true; From 8f959d37c1f2efec6dac55915ee82302e98101fb Mon Sep 17 00:00:00 2001 From: Melissa Wen Date: Thu, 22 Jan 2026 12:20:29 -0300 Subject: [PATCH 111/167] drm/amd/display: fix wrong color value mapping on MCM shaper LUT Some shimmer/colorful points appears when using the steamOS color pipeline for HDR on gaming with DCN32. These points look like black values being wrongly mapped to red/blue/green values. It was caused because the number of hw points in regular LUTs and in a shaper LUT was treated as the same. DCN3+ regular LUTs have 257 bases and implicit deltas (i.e. HW calculates them), but shaper LUT is a special case: it has 256 bases and 256 deltas, as in DCN1-2 regular LUTs, and outputs 14-bit values. Fix that by setting by decreasing in 1 the number of HW points computed in the LUT segmentation so that shaper LUT (i.e. fixpoint == true) keeps the same DCN10 CM logic and regular LUTs go with `hw_points + 1`. CC: Krunoslav Kovac Fixes: 4d5fd3d08ea9 ("drm/amd/display: PQ tail accuracy") Signed-off-by: Melissa Wen Reviewed-by: Alex Hung Signed-off-by: Alex Deucher (cherry picked from commit 5006505b19a2119e71c008044d59f6d753c858b9) --- drivers/gpu/drm/amd/display/dc/dcn30/dcn30_cm_common.c | 7 +++++-- 1 file changed, 5 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/amd/display/dc/dcn30/dcn30_cm_common.c b/drivers/gpu/drm/amd/display/dc/dcn30/dcn30_cm_common.c index 0690c346f2c5..a4f14b16564c 100644 --- a/drivers/gpu/drm/amd/display/dc/dcn30/dcn30_cm_common.c +++ b/drivers/gpu/drm/amd/display/dc/dcn30/dcn30_cm_common.c @@ -163,6 +163,11 @@ bool cm3_helper_translate_curve_to_hw_format( hw_points += (1 << seg_distr[k]); } + // DCN3+ have 257 pts in lieu of no separate slope registers + // Prior HW had 256 base+slope pairs + // Shaper LUT (i.e. fixpoint == true) is still 256 bases and 256 deltas + hw_points = fixpoint ? (hw_points - 1) : hw_points; + j = 0; for (k = 0; k < (region_end - region_start); k++) { increment = NUMBER_SW_SEGMENTS / (1 << seg_distr[k]); @@ -223,8 +228,6 @@ bool cm3_helper_translate_curve_to_hw_format( corner_points[1].green.slope = dc_fixpt_zero; corner_points[1].blue.slope = dc_fixpt_zero; - // DCN3+ have 257 pts in lieu of no separate slope registers - // Prior HW had 256 base+slope pairs lut_params->hw_points_num = hw_points + 1; k = 0; From d25b32aa829a3ed5570138e541a71fb7805faec3 Mon Sep 17 00:00:00 2001 From: Melissa Wen Date: Mon, 8 Dec 2025 22:44:15 -0100 Subject: [PATCH 112/167] drm/amd/display: extend delta clamping logic to CM3 LUT helper Commit 27fc10d1095f ("drm/amd/display: Fix the delta clamping for shaper LUT") fixed banding when using plane shaper LUT in DCN10 CM helper. The problem is also present in DCN30 CM helper, fix banding by extending the same bug delta clamping fix to CM3. Signed-off-by: Melissa Wen Reviewed-by: Harry Wentland Signed-off-by: Alex Deucher (cherry picked from commit 0274a54897f356f9c78767c4a2a5863f7dde90c6) --- .../amd/display/dc/dcn30/dcn30_cm_common.c | 30 +++++++++++++++---- .../display/dc/dwb/dcn30/dcn30_cm_common.h | 2 +- .../amd/display/dc/hwss/dcn30/dcn30_hwseq.c | 9 +++--- .../amd/display/dc/hwss/dcn32/dcn32_hwseq.c | 17 ++++++----- .../amd/display/dc/hwss/dcn401/dcn401_hwseq.c | 16 +++++----- 5 files changed, 49 insertions(+), 25 deletions(-) diff --git a/drivers/gpu/drm/amd/display/dc/dcn30/dcn30_cm_common.c b/drivers/gpu/drm/amd/display/dc/dcn30/dcn30_cm_common.c index a4f14b16564c..227aa8672d17 100644 --- a/drivers/gpu/drm/amd/display/dc/dcn30/dcn30_cm_common.c +++ b/drivers/gpu/drm/amd/display/dc/dcn30/dcn30_cm_common.c @@ -105,9 +105,12 @@ void cm_helper_program_gamcor_xfer_func( #define NUMBER_REGIONS 32 #define NUMBER_SW_SEGMENTS 16 -bool cm3_helper_translate_curve_to_hw_format( - const struct dc_transfer_func *output_tf, - struct pwl_params *lut_params, bool fixpoint) +#define DC_LOGGER \ + ctx->logger + +bool cm3_helper_translate_curve_to_hw_format(struct dc_context *ctx, + const struct dc_transfer_func *output_tf, + struct pwl_params *lut_params, bool fixpoint) { struct curve_points3 *corner_points; struct pwl_result_data *rgb_resulted; @@ -251,6 +254,10 @@ bool cm3_helper_translate_curve_to_hw_format( if (fixpoint == true) { i = 1; while (i != hw_points + 2) { + uint32_t red_clamp; + uint32_t green_clamp; + uint32_t blue_clamp; + if (i >= hw_points) { if (dc_fixpt_lt(rgb_plus_1->red, rgb->red)) rgb_plus_1->red = dc_fixpt_add(rgb->red, @@ -263,9 +270,20 @@ bool cm3_helper_translate_curve_to_hw_format( rgb_minus_1->delta_blue); } - rgb->delta_red_reg = dc_fixpt_clamp_u0d10(rgb->delta_red); - rgb->delta_green_reg = dc_fixpt_clamp_u0d10(rgb->delta_green); - rgb->delta_blue_reg = dc_fixpt_clamp_u0d10(rgb->delta_blue); + rgb->delta_red = dc_fixpt_sub(rgb_plus_1->red, rgb->red); + rgb->delta_green = dc_fixpt_sub(rgb_plus_1->green, rgb->green); + rgb->delta_blue = dc_fixpt_sub(rgb_plus_1->blue, rgb->blue); + + red_clamp = dc_fixpt_clamp_u0d14(rgb->delta_red); + green_clamp = dc_fixpt_clamp_u0d14(rgb->delta_green); + blue_clamp = dc_fixpt_clamp_u0d14(rgb->delta_blue); + + if (red_clamp >> 10 || green_clamp >> 10 || blue_clamp >> 10) + DC_LOG_ERROR("Losing delta precision while programming shaper LUT."); + + rgb->delta_red_reg = red_clamp & 0x3ff; + rgb->delta_green_reg = green_clamp & 0x3ff; + rgb->delta_blue_reg = blue_clamp & 0x3ff; rgb->red_reg = dc_fixpt_clamp_u0d14(rgb->red); rgb->green_reg = dc_fixpt_clamp_u0d14(rgb->green); rgb->blue_reg = dc_fixpt_clamp_u0d14(rgb->blue); diff --git a/drivers/gpu/drm/amd/display/dc/dwb/dcn30/dcn30_cm_common.h b/drivers/gpu/drm/amd/display/dc/dwb/dcn30/dcn30_cm_common.h index b86347c9b038..95f9318a54ef 100644 --- a/drivers/gpu/drm/amd/display/dc/dwb/dcn30/dcn30_cm_common.h +++ b/drivers/gpu/drm/amd/display/dc/dwb/dcn30/dcn30_cm_common.h @@ -59,7 +59,7 @@ void cm_helper_program_gamcor_xfer_func( const struct pwl_params *params, const struct dcn3_xfer_func_reg *reg); -bool cm3_helper_translate_curve_to_hw_format( +bool cm3_helper_translate_curve_to_hw_format(struct dc_context *ctx, const struct dc_transfer_func *output_tf, struct pwl_params *lut_params, bool fixpoint); diff --git a/drivers/gpu/drm/amd/display/dc/hwss/dcn30/dcn30_hwseq.c b/drivers/gpu/drm/amd/display/dc/hwss/dcn30/dcn30_hwseq.c index 81bcadf5e57e..f2d4cd527874 100644 --- a/drivers/gpu/drm/amd/display/dc/hwss/dcn30/dcn30_hwseq.c +++ b/drivers/gpu/drm/amd/display/dc/hwss/dcn30/dcn30_hwseq.c @@ -239,7 +239,7 @@ bool dcn30_set_blend_lut( if (plane_state->blend_tf.type == TF_TYPE_HWPWL) blend_lut = &plane_state->blend_tf.pwl; else if (plane_state->blend_tf.type == TF_TYPE_DISTRIBUTED_POINTS) { - result = cm3_helper_translate_curve_to_hw_format( + result = cm3_helper_translate_curve_to_hw_format(plane_state->ctx, &plane_state->blend_tf, &dpp_base->regamma_params, false); if (!result) return result; @@ -334,8 +334,9 @@ bool dcn30_set_input_transfer_func(struct dc *dc, if (plane_state->in_transfer_func.type == TF_TYPE_HWPWL) params = &plane_state->in_transfer_func.pwl; else if (plane_state->in_transfer_func.type == TF_TYPE_DISTRIBUTED_POINTS && - cm3_helper_translate_curve_to_hw_format(&plane_state->in_transfer_func, - &dpp_base->degamma_params, false)) + cm3_helper_translate_curve_to_hw_format(plane_state->ctx, + &plane_state->in_transfer_func, + &dpp_base->degamma_params, false)) params = &dpp_base->degamma_params; result = dpp_base->funcs->dpp_program_gamcor_lut(dpp_base, params); @@ -406,7 +407,7 @@ bool dcn30_set_output_transfer_func(struct dc *dc, params = &stream->out_transfer_func.pwl; else if (pipe_ctx->stream->out_transfer_func.type == TF_TYPE_DISTRIBUTED_POINTS && - cm3_helper_translate_curve_to_hw_format( + cm3_helper_translate_curve_to_hw_format(stream->ctx, &stream->out_transfer_func, &mpc->blender_params, false)) params = &mpc->blender_params; diff --git a/drivers/gpu/drm/amd/display/dc/hwss/dcn32/dcn32_hwseq.c b/drivers/gpu/drm/amd/display/dc/hwss/dcn32/dcn32_hwseq.c index bf19ba65d09a..49cc0f8d57b5 100644 --- a/drivers/gpu/drm/amd/display/dc/hwss/dcn32/dcn32_hwseq.c +++ b/drivers/gpu/drm/amd/display/dc/hwss/dcn32/dcn32_hwseq.c @@ -486,8 +486,9 @@ bool dcn32_set_mcm_luts( if (plane_state->blend_tf.type == TF_TYPE_HWPWL) lut_params = &plane_state->blend_tf.pwl; else if (plane_state->blend_tf.type == TF_TYPE_DISTRIBUTED_POINTS) { - result = cm3_helper_translate_curve_to_hw_format(&plane_state->blend_tf, - &dpp_base->regamma_params, false); + result = cm3_helper_translate_curve_to_hw_format(plane_state->ctx, + &plane_state->blend_tf, + &dpp_base->regamma_params, false); if (!result) return result; @@ -502,8 +503,9 @@ bool dcn32_set_mcm_luts( else if (plane_state->in_shaper_func.type == TF_TYPE_DISTRIBUTED_POINTS) { // TODO: dpp_base replace ASSERT(false); - cm3_helper_translate_curve_to_hw_format(&plane_state->in_shaper_func, - &dpp_base->shaper_params, true); + cm3_helper_translate_curve_to_hw_format(plane_state->ctx, + &plane_state->in_shaper_func, + &dpp_base->shaper_params, true); lut_params = &dpp_base->shaper_params; } @@ -543,8 +545,9 @@ bool dcn32_set_input_transfer_func(struct dc *dc, if (plane_state->in_transfer_func.type == TF_TYPE_HWPWL) params = &plane_state->in_transfer_func.pwl; else if (plane_state->in_transfer_func.type == TF_TYPE_DISTRIBUTED_POINTS && - cm3_helper_translate_curve_to_hw_format(&plane_state->in_transfer_func, - &dpp_base->degamma_params, false)) + cm3_helper_translate_curve_to_hw_format(plane_state->ctx, + &plane_state->in_transfer_func, + &dpp_base->degamma_params, false)) params = &dpp_base->degamma_params; dpp_base->funcs->dpp_program_gamcor_lut(dpp_base, params); @@ -575,7 +578,7 @@ bool dcn32_set_output_transfer_func(struct dc *dc, params = &stream->out_transfer_func.pwl; else if (pipe_ctx->stream->out_transfer_func.type == TF_TYPE_DISTRIBUTED_POINTS && - cm3_helper_translate_curve_to_hw_format( + cm3_helper_translate_curve_to_hw_format(stream->ctx, &stream->out_transfer_func, &mpc->blender_params, false)) params = &mpc->blender_params; diff --git a/drivers/gpu/drm/amd/display/dc/hwss/dcn401/dcn401_hwseq.c b/drivers/gpu/drm/amd/display/dc/hwss/dcn401/dcn401_hwseq.c index 2fbc22afb89c..5eda7648d0d2 100644 --- a/drivers/gpu/drm/amd/display/dc/hwss/dcn401/dcn401_hwseq.c +++ b/drivers/gpu/drm/amd/display/dc/hwss/dcn401/dcn401_hwseq.c @@ -430,7 +430,7 @@ void dcn401_populate_mcm_luts(struct dc *dc, if (mcm_luts.lut1d_func->type == TF_TYPE_HWPWL) m_lut_params.pwl = &mcm_luts.lut1d_func->pwl; else if (mcm_luts.lut1d_func->type == TF_TYPE_DISTRIBUTED_POINTS) { - rval = cm3_helper_translate_curve_to_hw_format( + rval = cm3_helper_translate_curve_to_hw_format(mpc->ctx, mcm_luts.lut1d_func, &dpp_base->regamma_params, false); m_lut_params.pwl = rval ? &dpp_base->regamma_params : NULL; @@ -450,7 +450,7 @@ void dcn401_populate_mcm_luts(struct dc *dc, m_lut_params.pwl = &mcm_luts.shaper->pwl; else if (mcm_luts.shaper->type == TF_TYPE_DISTRIBUTED_POINTS) { ASSERT(false); - rval = cm3_helper_translate_curve_to_hw_format( + rval = cm3_helper_translate_curve_to_hw_format(mpc->ctx, mcm_luts.shaper, &dpp_base->regamma_params, true); m_lut_params.pwl = rval ? &dpp_base->regamma_params : NULL; @@ -627,8 +627,9 @@ bool dcn401_set_mcm_luts(struct pipe_ctx *pipe_ctx, if (plane_state->blend_tf.type == TF_TYPE_HWPWL) lut_params = &plane_state->blend_tf.pwl; else if (plane_state->blend_tf.type == TF_TYPE_DISTRIBUTED_POINTS) { - rval = cm3_helper_translate_curve_to_hw_format(&plane_state->blend_tf, - &dpp_base->regamma_params, false); + rval = cm3_helper_translate_curve_to_hw_format(plane_state->ctx, + &plane_state->blend_tf, + &dpp_base->regamma_params, false); lut_params = rval ? &dpp_base->regamma_params : NULL; } result = mpc->funcs->program_1dlut(mpc, lut_params, mpcc_id); @@ -639,8 +640,9 @@ bool dcn401_set_mcm_luts(struct pipe_ctx *pipe_ctx, lut_params = &plane_state->in_shaper_func.pwl; else if (plane_state->in_shaper_func.type == TF_TYPE_DISTRIBUTED_POINTS) { // TODO: dpp_base replace - rval = cm3_helper_translate_curve_to_hw_format(&plane_state->in_shaper_func, - &dpp_base->shaper_params, true); + rval = cm3_helper_translate_curve_to_hw_format(plane_state->ctx, + &plane_state->in_shaper_func, + &dpp_base->shaper_params, true); lut_params = rval ? &dpp_base->shaper_params : NULL; } result &= mpc->funcs->program_shaper(mpc, lut_params, mpcc_id); @@ -674,7 +676,7 @@ bool dcn401_set_output_transfer_func(struct dc *dc, params = &stream->out_transfer_func.pwl; else if (pipe_ctx->stream->out_transfer_func.type == TF_TYPE_DISTRIBUTED_POINTS && - cm3_helper_translate_curve_to_hw_format( + cm3_helper_translate_curve_to_hw_format(stream->ctx, &stream->out_transfer_func, &mpc->blender_params, false)) params = &mpc->blender_params; From 84962445cd8a83dc5bed4c8ad5bbb2c1cdb249a0 Mon Sep 17 00:00:00 2001 From: Melissa Wen Date: Fri, 16 Jan 2026 12:50:49 -0300 Subject: [PATCH 113/167] drm/amd/display: remove assert around dpp_base replacement There is nothing wrong if in_shaper_func type is DISTRIBUTED POINTS. Remove the assert placed for a TODO to avoid misinterpretations. Signed-off-by: Melissa Wen Reviewed-by: Alex Hung Signed-off-by: Alex Deucher (cherry picked from commit 1714dcc4c2c53e41190896eba263ed6328bcf415) --- drivers/gpu/drm/amd/display/dc/hwss/dcn32/dcn32_hwseq.c | 1 - 1 file changed, 1 deletion(-) diff --git a/drivers/gpu/drm/amd/display/dc/hwss/dcn32/dcn32_hwseq.c b/drivers/gpu/drm/amd/display/dc/hwss/dcn32/dcn32_hwseq.c index 49cc0f8d57b5..be1f3caf4096 100644 --- a/drivers/gpu/drm/amd/display/dc/hwss/dcn32/dcn32_hwseq.c +++ b/drivers/gpu/drm/amd/display/dc/hwss/dcn32/dcn32_hwseq.c @@ -502,7 +502,6 @@ bool dcn32_set_mcm_luts( lut_params = &plane_state->in_shaper_func.pwl; else if (plane_state->in_shaper_func.type == TF_TYPE_DISTRIBUTED_POINTS) { // TODO: dpp_base replace - ASSERT(false); cm3_helper_translate_curve_to_hw_format(plane_state->ctx, &plane_state->in_shaper_func, &dpp_base->shaper_params, true); From 6b61a54e684006ca0d92d684a1d3c3a00f077d8f Mon Sep 17 00:00:00 2001 From: Harish Kasiviswanathan Date: Fri, 9 Jan 2026 15:26:36 -0500 Subject: [PATCH 114/167] drm/amdgpu: Fix double deletion of validate_list If amdgpu_amdkfd_gpuvm_free_memory_of_gpu() fails after kgd_mem is removed from validate_list, the mem handle still lingers in the KFD idr. This means when process is terminated, kfd_process_free_outstanding_kfd_bos() will call amdgpu_amdkfd_gpuvm_free_memory_of_gpu() again resulting in double deletion. To avoid this - (a) Check if list is empty before deleting it (b) Rearragne amdgpu_amdkfd_gpuvm_free_memory_of_gpu() such that it can be safely called again if it returns failure the first time. Signed-off-by: Harish Kasiviswanathan Reviewed-by: Philip Yang Signed-off-by: Alex Deucher (cherry picked from commit 6ba60345f45eaf7cb4f89105d26083a4b9fd1cba) --- .../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c | 18 +++++++++--------- 1 file changed, 9 insertions(+), 9 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c index b1c24c8fa686..a51e76623bad 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c @@ -1920,21 +1920,21 @@ int amdgpu_amdkfd_gpuvm_free_memory_of_gpu( /* Make sure restore workers don't access the BO any more */ mutex_lock(&process_info->lock); - list_del(&mem->validate_list); + if (!list_empty(&mem->validate_list)) + list_del_init(&mem->validate_list); mutex_unlock(&process_info->lock); - /* Cleanup user pages and MMU notifiers */ - if (amdgpu_ttm_tt_get_usermm(mem->bo->tbo.ttm)) { - amdgpu_hmm_unregister(mem->bo); - mutex_lock(&process_info->notifier_lock); - amdgpu_hmm_range_free(mem->range); - mutex_unlock(&process_info->notifier_lock); - } - ret = reserve_bo_and_cond_vms(mem, NULL, BO_VM_ALL, &ctx); if (unlikely(ret)) return ret; + /* Cleanup user pages and MMU notifiers */ + if (amdgpu_ttm_tt_get_usermm(mem->bo->tbo.ttm)) { + amdgpu_hmm_unregister(mem->bo); + amdgpu_hmm_range_free(mem->range); + mem->range = NULL; + } + amdgpu_amdkfd_remove_eviction_fence(mem->bo, process_info->eviction_fence); pr_debug("Release VA 0x%llx - 0x%llx\n", mem->va, From 90caca3b7264cc3e92e347b2004fff4e386fc26e Mon Sep 17 00:00:00 2001 From: Dave Airlie Date: Tue, 3 Feb 2026 15:21:11 +1000 Subject: [PATCH 115/167] nouveau/gsp: use rpc sequence numbers properly. There are two layers of sequence numbers, one at the msg level and one at the rpc level. 570 firmware started asserting on the sequence numbers being in the right order, and we would see nocat records with asserts in them. Add the rpc level sequence number support. Fixes: 53dac0623853 ("drm/nouveau/gsp: add support for 570.144") Cc: Signed-off-by: Dave Airlie Reviewed-by: Lyude Paul Tested-by: Lyude Paul Link: https://patch.msgid.link/20260203052431.2219998-2-airlied@gmail.com --- drivers/gpu/drm/nouveau/include/nvkm/subdev/gsp.h | 6 ++++++ drivers/gpu/drm/nouveau/nvkm/subdev/gsp/rm/r535/gsp.c | 4 ++-- drivers/gpu/drm/nouveau/nvkm/subdev/gsp/rm/r535/rpc.c | 6 ++++++ drivers/gpu/drm/nouveau/nvkm/subdev/gsp/rm/r570/gsp.c | 2 +- 4 files changed, 15 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/nouveau/include/nvkm/subdev/gsp.h b/drivers/gpu/drm/nouveau/include/nvkm/subdev/gsp.h index b8b97e10ae83..64fed208e4cf 100644 --- a/drivers/gpu/drm/nouveau/include/nvkm/subdev/gsp.h +++ b/drivers/gpu/drm/nouveau/include/nvkm/subdev/gsp.h @@ -44,6 +44,9 @@ typedef void (*nvkm_gsp_event_func)(struct nvkm_gsp_event *, void *repv, u32 rep * NVKM_GSP_RPC_REPLY_NOWAIT - If specified, immediately return to the * caller after the GSP RPC command is issued. * + * NVKM_GSP_RPC_REPLY_NOSEQ - If specified, exactly like NOWAIT + * but don't emit RPC sequence number. + * * NVKM_GSP_RPC_REPLY_RECV - If specified, wait and receive the entire GSP * RPC message after the GSP RPC command is issued. * @@ -53,6 +56,7 @@ typedef void (*nvkm_gsp_event_func)(struct nvkm_gsp_event *, void *repv, u32 rep */ enum nvkm_gsp_rpc_reply_policy { NVKM_GSP_RPC_REPLY_NOWAIT = 0, + NVKM_GSP_RPC_REPLY_NOSEQ, NVKM_GSP_RPC_REPLY_RECV, NVKM_GSP_RPC_REPLY_POLL, }; @@ -242,6 +246,8 @@ struct nvkm_gsp { /* The size of the registry RPC */ size_t registry_rpc_size; + u32 rpc_seq; + #ifdef CONFIG_DEBUG_FS /* * Logging buffers in debugfs. The wrapper objects need to remain diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/rm/r535/gsp.c b/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/rm/r535/gsp.c index 2a7e80c6d70f..6e7af2f737b7 100644 --- a/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/rm/r535/gsp.c +++ b/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/rm/r535/gsp.c @@ -704,7 +704,7 @@ r535_gsp_rpc_set_registry(struct nvkm_gsp *gsp) build_registry(gsp, rpc); - return nvkm_gsp_rpc_wr(gsp, rpc, NVKM_GSP_RPC_REPLY_NOWAIT); + return nvkm_gsp_rpc_wr(gsp, rpc, NVKM_GSP_RPC_REPLY_NOSEQ); fail: clean_registry(gsp); @@ -921,7 +921,7 @@ r535_gsp_set_system_info(struct nvkm_gsp *gsp) info->pciConfigMirrorSize = device->pci->func->cfg.size; r535_gsp_acpi_info(gsp, &info->acpiMethodData); - return nvkm_gsp_rpc_wr(gsp, info, NVKM_GSP_RPC_REPLY_NOWAIT); + return nvkm_gsp_rpc_wr(gsp, info, NVKM_GSP_RPC_REPLY_NOSEQ); } static int diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/rm/r535/rpc.c b/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/rm/r535/rpc.c index 0dc4782df8c0..3ca3de8f4340 100644 --- a/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/rm/r535/rpc.c +++ b/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/rm/r535/rpc.c @@ -557,6 +557,7 @@ r535_gsp_rpc_handle_reply(struct nvkm_gsp *gsp, u32 fn, switch (policy) { case NVKM_GSP_RPC_REPLY_NOWAIT: + case NVKM_GSP_RPC_REPLY_NOSEQ: break; case NVKM_GSP_RPC_REPLY_RECV: reply = r535_gsp_msg_recv(gsp, fn, gsp_rpc_len); @@ -588,6 +589,11 @@ r535_gsp_rpc_send(struct nvkm_gsp *gsp, void *payload, rpc->data, rpc->length - sizeof(*rpc), true); } + if (policy == NVKM_GSP_RPC_REPLY_NOSEQ) + rpc->sequence = 0; + else + rpc->sequence = gsp->rpc_seq++; + ret = r535_gsp_cmdq_push(gsp, rpc); if (ret) return ERR_PTR(ret); diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/rm/r570/gsp.c b/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/rm/r570/gsp.c index 9d2fa4e66d59..996941c668ba 100644 --- a/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/rm/r570/gsp.c +++ b/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/rm/r570/gsp.c @@ -176,7 +176,7 @@ r570_gsp_set_system_info(struct nvkm_gsp *gsp) info->bIsPrimary = video_is_primary_device(device->dev); info->bPreserveVideoMemoryAllocations = false; - return nvkm_gsp_rpc_wr(gsp, info, NVKM_GSP_RPC_REPLY_NOWAIT); + return nvkm_gsp_rpc_wr(gsp, info, NVKM_GSP_RPC_REPLY_NOSEQ); } static void From 8f8a4dce64013737701d13565cf6107f42b725ea Mon Sep 17 00:00:00 2001 From: Dave Airlie Date: Tue, 3 Feb 2026 15:21:12 +1000 Subject: [PATCH 116/167] nouveau: add a third state to the fini handler. This is just refactoring to allow the lower layers to distinguish between suspend and runtime suspend. GSP 570 needs to set a flag with the GPU is going into GCOFF, this flag taken from the opengpu driver is set whenever runtime suspend is enterning GCOFF but not for normal suspend paths. This just refactors the code, a subsequent patch use the information. Fixes: 53dac0623853 ("drm/nouveau/gsp: add support for 570.144") Cc: Reviewed-by: Lyude Paul Tested-by: Lyude Paul Signed-off-by: Dave Airlie Link: https://patch.msgid.link/20260203052431.2219998-3-airlied@gmail.com --- drivers/gpu/drm/nouveau/include/nvif/client.h | 2 +- drivers/gpu/drm/nouveau/include/nvif/driver.h | 2 +- .../drm/nouveau/include/nvkm/core/device.h | 3 ++- .../drm/nouveau/include/nvkm/core/engine.h | 2 +- .../drm/nouveau/include/nvkm/core/object.h | 5 +++-- .../drm/nouveau/include/nvkm/core/oproxy.h | 2 +- .../drm/nouveau/include/nvkm/core/subdev.h | 4 ++-- .../nouveau/include/nvkm/core/suspend_state.h | 11 ++++++++++ drivers/gpu/drm/nouveau/nouveau_drm.c | 2 +- drivers/gpu/drm/nouveau/nouveau_nvif.c | 10 +++++++-- drivers/gpu/drm/nouveau/nvif/client.c | 4 ++-- drivers/gpu/drm/nouveau/nvkm/core/engine.c | 4 ++-- drivers/gpu/drm/nouveau/nvkm/core/ioctl.c | 4 ++-- drivers/gpu/drm/nouveau/nvkm/core/object.c | 20 +++++++++++++---- drivers/gpu/drm/nouveau/nvkm/core/oproxy.c | 2 +- drivers/gpu/drm/nouveau/nvkm/core/subdev.c | 18 ++++++++++++--- drivers/gpu/drm/nouveau/nvkm/core/uevent.c | 2 +- .../gpu/drm/nouveau/nvkm/engine/ce/ga100.c | 2 +- drivers/gpu/drm/nouveau/nvkm/engine/ce/priv.h | 2 +- .../gpu/drm/nouveau/nvkm/engine/device/base.c | 22 ++++++++++++++----- .../gpu/drm/nouveau/nvkm/engine/device/pci.c | 4 ++-- .../gpu/drm/nouveau/nvkm/engine/device/priv.h | 2 +- .../gpu/drm/nouveau/nvkm/engine/device/user.c | 2 +- .../gpu/drm/nouveau/nvkm/engine/disp/base.c | 4 ++-- .../gpu/drm/nouveau/nvkm/engine/disp/chan.c | 2 +- drivers/gpu/drm/nouveau/nvkm/engine/falcon.c | 4 ++-- .../gpu/drm/nouveau/nvkm/engine/fifo/base.c | 2 +- .../gpu/drm/nouveau/nvkm/engine/fifo/uchan.c | 6 ++--- drivers/gpu/drm/nouveau/nvkm/engine/gr/base.c | 4 ++-- .../gpu/drm/nouveau/nvkm/engine/gr/gf100.c | 2 +- drivers/gpu/drm/nouveau/nvkm/engine/gr/nv04.c | 2 +- drivers/gpu/drm/nouveau/nvkm/engine/gr/nv10.c | 2 +- drivers/gpu/drm/nouveau/nvkm/engine/gr/nv20.c | 2 +- drivers/gpu/drm/nouveau/nvkm/engine/gr/nv20.h | 2 +- drivers/gpu/drm/nouveau/nvkm/engine/gr/nv40.c | 4 ++-- .../gpu/drm/nouveau/nvkm/engine/mpeg/nv44.c | 2 +- .../gpu/drm/nouveau/nvkm/engine/sec2/base.c | 2 +- drivers/gpu/drm/nouveau/nvkm/engine/xtensa.c | 4 ++-- .../gpu/drm/nouveau/nvkm/subdev/acr/base.c | 2 +- .../gpu/drm/nouveau/nvkm/subdev/bar/base.c | 2 +- .../gpu/drm/nouveau/nvkm/subdev/clk/base.c | 2 +- .../drm/nouveau/nvkm/subdev/devinit/base.c | 4 ++-- .../gpu/drm/nouveau/nvkm/subdev/fault/base.c | 2 +- .../gpu/drm/nouveau/nvkm/subdev/fault/user.c | 2 +- .../gpu/drm/nouveau/nvkm/subdev/gpio/base.c | 2 +- .../gpu/drm/nouveau/nvkm/subdev/gsp/base.c | 2 +- .../gpu/drm/nouveau/nvkm/subdev/gsp/gh100.c | 2 +- .../gpu/drm/nouveau/nvkm/subdev/gsp/priv.h | 8 +++---- .../drm/nouveau/nvkm/subdev/gsp/rm/r535/gsp.c | 2 +- .../gpu/drm/nouveau/nvkm/subdev/gsp/tu102.c | 2 +- .../gpu/drm/nouveau/nvkm/subdev/i2c/base.c | 2 +- .../drm/nouveau/nvkm/subdev/instmem/base.c | 2 +- .../gpu/drm/nouveau/nvkm/subdev/pci/base.c | 2 +- .../gpu/drm/nouveau/nvkm/subdev/pmu/base.c | 2 +- .../gpu/drm/nouveau/nvkm/subdev/therm/base.c | 6 ++--- .../gpu/drm/nouveau/nvkm/subdev/timer/base.c | 2 +- 56 files changed, 139 insertions(+), 84 deletions(-) create mode 100644 drivers/gpu/drm/nouveau/include/nvkm/core/suspend_state.h diff --git a/drivers/gpu/drm/nouveau/include/nvif/client.h b/drivers/gpu/drm/nouveau/include/nvif/client.h index 03f1d564eb12..b698c74306f8 100644 --- a/drivers/gpu/drm/nouveau/include/nvif/client.h +++ b/drivers/gpu/drm/nouveau/include/nvif/client.h @@ -11,7 +11,7 @@ struct nvif_client { int nvif_client_ctor(struct nvif_client *parent, const char *name, struct nvif_client *); void nvif_client_dtor(struct nvif_client *); -int nvif_client_suspend(struct nvif_client *); +int nvif_client_suspend(struct nvif_client *, bool); int nvif_client_resume(struct nvif_client *); /*XXX*/ diff --git a/drivers/gpu/drm/nouveau/include/nvif/driver.h b/drivers/gpu/drm/nouveau/include/nvif/driver.h index 7b08ff769039..61c8a177b28f 100644 --- a/drivers/gpu/drm/nouveau/include/nvif/driver.h +++ b/drivers/gpu/drm/nouveau/include/nvif/driver.h @@ -8,7 +8,7 @@ struct nvif_driver { const char *name; int (*init)(const char *name, u64 device, const char *cfg, const char *dbg, void **priv); - int (*suspend)(void *priv); + int (*suspend)(void *priv, bool runtime); int (*resume)(void *priv); int (*ioctl)(void *priv, void *data, u32 size, void **hack); void __iomem *(*map)(void *priv, u64 handle, u32 size); diff --git a/drivers/gpu/drm/nouveau/include/nvkm/core/device.h b/drivers/gpu/drm/nouveau/include/nvkm/core/device.h index 99579e7b9376..954a89d43bad 100644 --- a/drivers/gpu/drm/nouveau/include/nvkm/core/device.h +++ b/drivers/gpu/drm/nouveau/include/nvkm/core/device.h @@ -2,6 +2,7 @@ #ifndef __NVKM_DEVICE_H__ #define __NVKM_DEVICE_H__ #include +#include #include enum nvkm_subdev_type; @@ -93,7 +94,7 @@ struct nvkm_device_func { void *(*dtor)(struct nvkm_device *); int (*preinit)(struct nvkm_device *); int (*init)(struct nvkm_device *); - void (*fini)(struct nvkm_device *, bool suspend); + void (*fini)(struct nvkm_device *, enum nvkm_suspend_state suspend); int (*irq)(struct nvkm_device *); resource_size_t (*resource_addr)(struct nvkm_device *, enum nvkm_bar_id); resource_size_t (*resource_size)(struct nvkm_device *, enum nvkm_bar_id); diff --git a/drivers/gpu/drm/nouveau/include/nvkm/core/engine.h b/drivers/gpu/drm/nouveau/include/nvkm/core/engine.h index 738899fcf30b..1e97be6c6564 100644 --- a/drivers/gpu/drm/nouveau/include/nvkm/core/engine.h +++ b/drivers/gpu/drm/nouveau/include/nvkm/core/engine.h @@ -20,7 +20,7 @@ struct nvkm_engine_func { int (*oneinit)(struct nvkm_engine *); int (*info)(struct nvkm_engine *, u64 mthd, u64 *data); int (*init)(struct nvkm_engine *); - int (*fini)(struct nvkm_engine *, bool suspend); + int (*fini)(struct nvkm_engine *, enum nvkm_suspend_state suspend); int (*reset)(struct nvkm_engine *); int (*nonstall)(struct nvkm_engine *); void (*intr)(struct nvkm_engine *); diff --git a/drivers/gpu/drm/nouveau/include/nvkm/core/object.h b/drivers/gpu/drm/nouveau/include/nvkm/core/object.h index 10107ef3ca49..54d356154274 100644 --- a/drivers/gpu/drm/nouveau/include/nvkm/core/object.h +++ b/drivers/gpu/drm/nouveau/include/nvkm/core/object.h @@ -2,6 +2,7 @@ #ifndef __NVKM_OBJECT_H__ #define __NVKM_OBJECT_H__ #include +#include struct nvkm_event; struct nvkm_gpuobj; struct nvkm_uevent; @@ -27,7 +28,7 @@ enum nvkm_object_map { struct nvkm_object_func { void *(*dtor)(struct nvkm_object *); int (*init)(struct nvkm_object *); - int (*fini)(struct nvkm_object *, bool suspend); + int (*fini)(struct nvkm_object *, enum nvkm_suspend_state suspend); int (*mthd)(struct nvkm_object *, u32 mthd, void *data, u32 size); int (*ntfy)(struct nvkm_object *, u32 mthd, struct nvkm_event **); int (*map)(struct nvkm_object *, void *argv, u32 argc, @@ -49,7 +50,7 @@ int nvkm_object_new(const struct nvkm_oclass *, void *data, u32 size, void nvkm_object_del(struct nvkm_object **); void *nvkm_object_dtor(struct nvkm_object *); int nvkm_object_init(struct nvkm_object *); -int nvkm_object_fini(struct nvkm_object *, bool suspend); +int nvkm_object_fini(struct nvkm_object *, enum nvkm_suspend_state); int nvkm_object_mthd(struct nvkm_object *, u32 mthd, void *data, u32 size); int nvkm_object_ntfy(struct nvkm_object *, u32 mthd, struct nvkm_event **); int nvkm_object_map(struct nvkm_object *, void *argv, u32 argc, diff --git a/drivers/gpu/drm/nouveau/include/nvkm/core/oproxy.h b/drivers/gpu/drm/nouveau/include/nvkm/core/oproxy.h index 0e70a9afba33..cf66aee4d111 100644 --- a/drivers/gpu/drm/nouveau/include/nvkm/core/oproxy.h +++ b/drivers/gpu/drm/nouveau/include/nvkm/core/oproxy.h @@ -13,7 +13,7 @@ struct nvkm_oproxy { struct nvkm_oproxy_func { void (*dtor[2])(struct nvkm_oproxy *); int (*init[2])(struct nvkm_oproxy *); - int (*fini[2])(struct nvkm_oproxy *, bool suspend); + int (*fini[2])(struct nvkm_oproxy *, enum nvkm_suspend_state suspend); }; void nvkm_oproxy_ctor(const struct nvkm_oproxy_func *, diff --git a/drivers/gpu/drm/nouveau/include/nvkm/core/subdev.h b/drivers/gpu/drm/nouveau/include/nvkm/core/subdev.h index bce6e1ba09ea..bd6b1b658e40 100644 --- a/drivers/gpu/drm/nouveau/include/nvkm/core/subdev.h +++ b/drivers/gpu/drm/nouveau/include/nvkm/core/subdev.h @@ -40,7 +40,7 @@ struct nvkm_subdev_func { int (*oneinit)(struct nvkm_subdev *); int (*info)(struct nvkm_subdev *, u64 mthd, u64 *data); int (*init)(struct nvkm_subdev *); - int (*fini)(struct nvkm_subdev *, bool suspend); + int (*fini)(struct nvkm_subdev *, enum nvkm_suspend_state suspend); void (*intr)(struct nvkm_subdev *); }; @@ -65,7 +65,7 @@ void nvkm_subdev_unref(struct nvkm_subdev *); int nvkm_subdev_preinit(struct nvkm_subdev *); int nvkm_subdev_oneinit(struct nvkm_subdev *); int nvkm_subdev_init(struct nvkm_subdev *); -int nvkm_subdev_fini(struct nvkm_subdev *, bool suspend); +int nvkm_subdev_fini(struct nvkm_subdev *, enum nvkm_suspend_state suspend); int nvkm_subdev_info(struct nvkm_subdev *, u64, u64 *); void nvkm_subdev_intr(struct nvkm_subdev *); diff --git a/drivers/gpu/drm/nouveau/include/nvkm/core/suspend_state.h b/drivers/gpu/drm/nouveau/include/nvkm/core/suspend_state.h new file mode 100644 index 000000000000..134120fb71f4 --- /dev/null +++ b/drivers/gpu/drm/nouveau/include/nvkm/core/suspend_state.h @@ -0,0 +1,11 @@ +/* SPDX-License-Identifier: MIT */ +#ifndef __NVKM_SUSPEND_STATE_H__ +#define __NVKM_SUSPEND_STATE_H__ + +enum nvkm_suspend_state { + NVKM_POWEROFF, + NVKM_SUSPEND, + NVKM_RUNTIME_SUSPEND, +}; + +#endif diff --git a/drivers/gpu/drm/nouveau/nouveau_drm.c b/drivers/gpu/drm/nouveau/nouveau_drm.c index 1527b801f013..dc469e571c0a 100644 --- a/drivers/gpu/drm/nouveau/nouveau_drm.c +++ b/drivers/gpu/drm/nouveau/nouveau_drm.c @@ -983,7 +983,7 @@ nouveau_do_suspend(struct nouveau_drm *drm, bool runtime) } NV_DEBUG(drm, "suspending object tree...\n"); - ret = nvif_client_suspend(&drm->_client); + ret = nvif_client_suspend(&drm->_client, runtime); if (ret) goto fail_client; diff --git a/drivers/gpu/drm/nouveau/nouveau_nvif.c b/drivers/gpu/drm/nouveau/nouveau_nvif.c index adb802421fda..eeb4ebbc16bf 100644 --- a/drivers/gpu/drm/nouveau/nouveau_nvif.c +++ b/drivers/gpu/drm/nouveau/nouveau_nvif.c @@ -62,10 +62,16 @@ nvkm_client_resume(void *priv) } static int -nvkm_client_suspend(void *priv) +nvkm_client_suspend(void *priv, bool runtime) { struct nvkm_client *client = priv; - return nvkm_object_fini(&client->object, true); + enum nvkm_suspend_state state; + + if (runtime) + state = NVKM_RUNTIME_SUSPEND; + else + state = NVKM_SUSPEND; + return nvkm_object_fini(&client->object, state); } static int diff --git a/drivers/gpu/drm/nouveau/nvif/client.c b/drivers/gpu/drm/nouveau/nvif/client.c index fdf5054ed7d8..36d3c99786bd 100644 --- a/drivers/gpu/drm/nouveau/nvif/client.c +++ b/drivers/gpu/drm/nouveau/nvif/client.c @@ -30,9 +30,9 @@ #include int -nvif_client_suspend(struct nvif_client *client) +nvif_client_suspend(struct nvif_client *client, bool runtime) { - return client->driver->suspend(client->object.priv); + return client->driver->suspend(client->object.priv, runtime); } int diff --git a/drivers/gpu/drm/nouveau/nvkm/core/engine.c b/drivers/gpu/drm/nouveau/nvkm/core/engine.c index 36a31e9eea22..5bf62940d7be 100644 --- a/drivers/gpu/drm/nouveau/nvkm/core/engine.c +++ b/drivers/gpu/drm/nouveau/nvkm/core/engine.c @@ -41,7 +41,7 @@ nvkm_engine_reset(struct nvkm_engine *engine) if (engine->func->reset) return engine->func->reset(engine); - nvkm_subdev_fini(&engine->subdev, false); + nvkm_subdev_fini(&engine->subdev, NVKM_POWEROFF); return nvkm_subdev_init(&engine->subdev); } @@ -98,7 +98,7 @@ nvkm_engine_info(struct nvkm_subdev *subdev, u64 mthd, u64 *data) } static int -nvkm_engine_fini(struct nvkm_subdev *subdev, bool suspend) +nvkm_engine_fini(struct nvkm_subdev *subdev, enum nvkm_suspend_state suspend) { struct nvkm_engine *engine = nvkm_engine(subdev); if (engine->func->fini) diff --git a/drivers/gpu/drm/nouveau/nvkm/core/ioctl.c b/drivers/gpu/drm/nouveau/nvkm/core/ioctl.c index 45051a1249da..b8fc9be67851 100644 --- a/drivers/gpu/drm/nouveau/nvkm/core/ioctl.c +++ b/drivers/gpu/drm/nouveau/nvkm/core/ioctl.c @@ -141,7 +141,7 @@ nvkm_ioctl_new(struct nvkm_client *client, } ret = -EEXIST; } - nvkm_object_fini(object, false); + nvkm_object_fini(object, NVKM_POWEROFF); } nvkm_object_del(&object); @@ -160,7 +160,7 @@ nvkm_ioctl_del(struct nvkm_client *client, nvif_ioctl(object, "delete size %d\n", size); if (!(ret = nvif_unvers(ret, &data, &size, args->none))) { nvif_ioctl(object, "delete\n"); - nvkm_object_fini(object, false); + nvkm_object_fini(object, NVKM_POWEROFF); nvkm_object_del(&object); } diff --git a/drivers/gpu/drm/nouveau/nvkm/core/object.c b/drivers/gpu/drm/nouveau/nvkm/core/object.c index 390c265cf8af..af9f00f74c28 100644 --- a/drivers/gpu/drm/nouveau/nvkm/core/object.c +++ b/drivers/gpu/drm/nouveau/nvkm/core/object.c @@ -142,13 +142,25 @@ nvkm_object_bind(struct nvkm_object *object, struct nvkm_gpuobj *gpuobj, } int -nvkm_object_fini(struct nvkm_object *object, bool suspend) +nvkm_object_fini(struct nvkm_object *object, enum nvkm_suspend_state suspend) { - const char *action = suspend ? "suspend" : "fini"; + const char *action; struct nvkm_object *child; s64 time; int ret; + switch (suspend) { + case NVKM_POWEROFF: + default: + action = "fini"; + break; + case NVKM_SUSPEND: + action = "suspend"; + break; + case NVKM_RUNTIME_SUSPEND: + action = "runtime"; + break; + } nvif_debug(object, "%s children...\n", action); time = ktime_to_us(ktime_get()); list_for_each_entry_reverse(child, &object->tree, head) { @@ -212,11 +224,11 @@ nvkm_object_init(struct nvkm_object *object) fail_child: list_for_each_entry_continue_reverse(child, &object->tree, head) - nvkm_object_fini(child, false); + nvkm_object_fini(child, NVKM_POWEROFF); fail: nvif_error(object, "init failed with %d\n", ret); if (object->func->fini) - object->func->fini(object, false); + object->func->fini(object, NVKM_POWEROFF); return ret; } diff --git a/drivers/gpu/drm/nouveau/nvkm/core/oproxy.c b/drivers/gpu/drm/nouveau/nvkm/core/oproxy.c index 5db80d1780f0..7c9edf752768 100644 --- a/drivers/gpu/drm/nouveau/nvkm/core/oproxy.c +++ b/drivers/gpu/drm/nouveau/nvkm/core/oproxy.c @@ -87,7 +87,7 @@ nvkm_oproxy_uevent(struct nvkm_object *object, void *argv, u32 argc, } static int -nvkm_oproxy_fini(struct nvkm_object *object, bool suspend) +nvkm_oproxy_fini(struct nvkm_object *object, enum nvkm_suspend_state suspend) { struct nvkm_oproxy *oproxy = nvkm_oproxy(object); int ret; diff --git a/drivers/gpu/drm/nouveau/nvkm/core/subdev.c b/drivers/gpu/drm/nouveau/nvkm/core/subdev.c index 6c20e827a069..b7045d1c8415 100644 --- a/drivers/gpu/drm/nouveau/nvkm/core/subdev.c +++ b/drivers/gpu/drm/nouveau/nvkm/core/subdev.c @@ -51,12 +51,24 @@ nvkm_subdev_info(struct nvkm_subdev *subdev, u64 mthd, u64 *data) } int -nvkm_subdev_fini(struct nvkm_subdev *subdev, bool suspend) +nvkm_subdev_fini(struct nvkm_subdev *subdev, enum nvkm_suspend_state suspend) { struct nvkm_device *device = subdev->device; - const char *action = suspend ? "suspend" : subdev->use.enabled ? "fini" : "reset"; + const char *action; s64 time; + switch (suspend) { + case NVKM_POWEROFF: + default: + action = subdev->use.enabled ? "fini" : "reset"; + break; + case NVKM_SUSPEND: + action = "suspend"; + break; + case NVKM_RUNTIME_SUSPEND: + action = "runtime"; + break; + } nvkm_trace(subdev, "%s running...\n", action); time = ktime_to_us(ktime_get()); @@ -186,7 +198,7 @@ void nvkm_subdev_unref(struct nvkm_subdev *subdev) { if (refcount_dec_and_mutex_lock(&subdev->use.refcount, &subdev->use.mutex)) { - nvkm_subdev_fini(subdev, false); + nvkm_subdev_fini(subdev, NVKM_POWEROFF); mutex_unlock(&subdev->use.mutex); } } diff --git a/drivers/gpu/drm/nouveau/nvkm/core/uevent.c b/drivers/gpu/drm/nouveau/nvkm/core/uevent.c index cc254c390a57..46beb6e470ee 100644 --- a/drivers/gpu/drm/nouveau/nvkm/core/uevent.c +++ b/drivers/gpu/drm/nouveau/nvkm/core/uevent.c @@ -73,7 +73,7 @@ nvkm_uevent_mthd(struct nvkm_object *object, u32 mthd, void *argv, u32 argc) } static int -nvkm_uevent_fini(struct nvkm_object *object, bool suspend) +nvkm_uevent_fini(struct nvkm_object *object, enum nvkm_suspend_state suspend) { struct nvkm_uevent *uevent = nvkm_uevent(object); diff --git a/drivers/gpu/drm/nouveau/nvkm/engine/ce/ga100.c b/drivers/gpu/drm/nouveau/nvkm/engine/ce/ga100.c index 1c0c60138706..1a3caf697608 100644 --- a/drivers/gpu/drm/nouveau/nvkm/engine/ce/ga100.c +++ b/drivers/gpu/drm/nouveau/nvkm/engine/ce/ga100.c @@ -46,7 +46,7 @@ ga100_ce_nonstall(struct nvkm_engine *engine) } int -ga100_ce_fini(struct nvkm_engine *engine, bool suspend) +ga100_ce_fini(struct nvkm_engine *engine, enum nvkm_suspend_state suspend) { nvkm_inth_block(&engine->subdev.inth); return 0; diff --git a/drivers/gpu/drm/nouveau/nvkm/engine/ce/priv.h b/drivers/gpu/drm/nouveau/nvkm/engine/ce/priv.h index 34fd2657134b..f07b45853310 100644 --- a/drivers/gpu/drm/nouveau/nvkm/engine/ce/priv.h +++ b/drivers/gpu/drm/nouveau/nvkm/engine/ce/priv.h @@ -14,7 +14,7 @@ extern const struct nvkm_object_func gv100_ce_cclass; int ga100_ce_oneinit(struct nvkm_engine *); int ga100_ce_init(struct nvkm_engine *); -int ga100_ce_fini(struct nvkm_engine *, bool); +int ga100_ce_fini(struct nvkm_engine *, enum nvkm_suspend_state); int ga100_ce_nonstall(struct nvkm_engine *); u32 gb202_ce_grce_mask(struct nvkm_device *); diff --git a/drivers/gpu/drm/nouveau/nvkm/engine/device/base.c b/drivers/gpu/drm/nouveau/nvkm/engine/device/base.c index 2517b65d8faa..b101e14f841e 100644 --- a/drivers/gpu/drm/nouveau/nvkm/engine/device/base.c +++ b/drivers/gpu/drm/nouveau/nvkm/engine/device/base.c @@ -2936,13 +2936,25 @@ nvkm_device_engine(struct nvkm_device *device, int type, int inst) } int -nvkm_device_fini(struct nvkm_device *device, bool suspend) +nvkm_device_fini(struct nvkm_device *device, enum nvkm_suspend_state suspend) { - const char *action = suspend ? "suspend" : "fini"; + const char *action; struct nvkm_subdev *subdev; int ret; s64 time; + switch (suspend) { + case NVKM_POWEROFF: + default: + action = "fini"; + break; + case NVKM_SUSPEND: + action = "suspend"; + break; + case NVKM_RUNTIME_SUSPEND: + action = "runtime"; + break; + } nvdev_trace(device, "%s running...\n", action); time = ktime_to_us(ktime_get()); @@ -3032,7 +3044,7 @@ nvkm_device_init(struct nvkm_device *device) if (ret) return ret; - nvkm_device_fini(device, false); + nvkm_device_fini(device, NVKM_POWEROFF); nvdev_trace(device, "init running...\n"); time = ktime_to_us(ktime_get()); @@ -3060,9 +3072,9 @@ nvkm_device_init(struct nvkm_device *device) fail_subdev: list_for_each_entry_from(subdev, &device->subdev, head) - nvkm_subdev_fini(subdev, false); + nvkm_subdev_fini(subdev, NVKM_POWEROFF); fail: - nvkm_device_fini(device, false); + nvkm_device_fini(device, NVKM_POWEROFF); nvdev_error(device, "init failed with %d\n", ret); return ret; diff --git a/drivers/gpu/drm/nouveau/nvkm/engine/device/pci.c b/drivers/gpu/drm/nouveau/nvkm/engine/device/pci.c index 8f0261a0d618..4c29b60460d4 100644 --- a/drivers/gpu/drm/nouveau/nvkm/engine/device/pci.c +++ b/drivers/gpu/drm/nouveau/nvkm/engine/device/pci.c @@ -1605,10 +1605,10 @@ nvkm_device_pci_irq(struct nvkm_device *device) } static void -nvkm_device_pci_fini(struct nvkm_device *device, bool suspend) +nvkm_device_pci_fini(struct nvkm_device *device, enum nvkm_suspend_state suspend) { struct nvkm_device_pci *pdev = nvkm_device_pci(device); - if (suspend) { + if (suspend != NVKM_POWEROFF) { pci_disable_device(pdev->pdev); pdev->suspend = true; } diff --git a/drivers/gpu/drm/nouveau/nvkm/engine/device/priv.h b/drivers/gpu/drm/nouveau/nvkm/engine/device/priv.h index 75ee7506d443..d0c40f034244 100644 --- a/drivers/gpu/drm/nouveau/nvkm/engine/device/priv.h +++ b/drivers/gpu/drm/nouveau/nvkm/engine/device/priv.h @@ -56,5 +56,5 @@ int nvkm_device_ctor(const struct nvkm_device_func *, const char *name, const char *cfg, const char *dbg, struct nvkm_device *); int nvkm_device_init(struct nvkm_device *); -int nvkm_device_fini(struct nvkm_device *, bool suspend); +int nvkm_device_fini(struct nvkm_device *, enum nvkm_suspend_state suspend); #endif diff --git a/drivers/gpu/drm/nouveau/nvkm/engine/device/user.c b/drivers/gpu/drm/nouveau/nvkm/engine/device/user.c index 58191b7a0494..32ff3181f47b 100644 --- a/drivers/gpu/drm/nouveau/nvkm/engine/device/user.c +++ b/drivers/gpu/drm/nouveau/nvkm/engine/device/user.c @@ -218,7 +218,7 @@ nvkm_udevice_map(struct nvkm_object *object, void *argv, u32 argc, } static int -nvkm_udevice_fini(struct nvkm_object *object, bool suspend) +nvkm_udevice_fini(struct nvkm_object *object, enum nvkm_suspend_state suspend) { struct nvkm_udevice *udev = nvkm_udevice(object); struct nvkm_device *device = udev->device; diff --git a/drivers/gpu/drm/nouveau/nvkm/engine/disp/base.c b/drivers/gpu/drm/nouveau/nvkm/engine/disp/base.c index b24eb1e560bc..84745f60912e 100644 --- a/drivers/gpu/drm/nouveau/nvkm/engine/disp/base.c +++ b/drivers/gpu/drm/nouveau/nvkm/engine/disp/base.c @@ -99,13 +99,13 @@ nvkm_disp_intr(struct nvkm_engine *engine) } static int -nvkm_disp_fini(struct nvkm_engine *engine, bool suspend) +nvkm_disp_fini(struct nvkm_engine *engine, enum nvkm_suspend_state suspend) { struct nvkm_disp *disp = nvkm_disp(engine); struct nvkm_outp *outp; if (disp->func->fini) - disp->func->fini(disp, suspend); + disp->func->fini(disp, suspend != NVKM_POWEROFF); list_for_each_entry(outp, &disp->outps, head) { if (outp->func->fini) diff --git a/drivers/gpu/drm/nouveau/nvkm/engine/disp/chan.c b/drivers/gpu/drm/nouveau/nvkm/engine/disp/chan.c index 9b84e357d354..57a62a2de7c7 100644 --- a/drivers/gpu/drm/nouveau/nvkm/engine/disp/chan.c +++ b/drivers/gpu/drm/nouveau/nvkm/engine/disp/chan.c @@ -128,7 +128,7 @@ nvkm_disp_chan_child_get(struct nvkm_object *object, int index, struct nvkm_ocla } static int -nvkm_disp_chan_fini(struct nvkm_object *object, bool suspend) +nvkm_disp_chan_fini(struct nvkm_object *object, enum nvkm_suspend_state suspend) { struct nvkm_disp_chan *chan = nvkm_disp_chan(object); diff --git a/drivers/gpu/drm/nouveau/nvkm/engine/falcon.c b/drivers/gpu/drm/nouveau/nvkm/engine/falcon.c index fd5ee9f0af36..cf8e356867b4 100644 --- a/drivers/gpu/drm/nouveau/nvkm/engine/falcon.c +++ b/drivers/gpu/drm/nouveau/nvkm/engine/falcon.c @@ -93,13 +93,13 @@ nvkm_falcon_intr(struct nvkm_engine *engine) } static int -nvkm_falcon_fini(struct nvkm_engine *engine, bool suspend) +nvkm_falcon_fini(struct nvkm_engine *engine, enum nvkm_suspend_state suspend) { struct nvkm_falcon *falcon = nvkm_falcon(engine); struct nvkm_device *device = falcon->engine.subdev.device; const u32 base = falcon->addr; - if (!suspend) { + if (suspend == NVKM_POWEROFF) { nvkm_memory_unref(&falcon->core); if (falcon->external) { vfree(falcon->data.data); diff --git a/drivers/gpu/drm/nouveau/nvkm/engine/fifo/base.c b/drivers/gpu/drm/nouveau/nvkm/engine/fifo/base.c index 6fd4e60634fb..1561287a32f2 100644 --- a/drivers/gpu/drm/nouveau/nvkm/engine/fifo/base.c +++ b/drivers/gpu/drm/nouveau/nvkm/engine/fifo/base.c @@ -122,7 +122,7 @@ nvkm_fifo_class_get(struct nvkm_oclass *oclass, int index, const struct nvkm_dev } static int -nvkm_fifo_fini(struct nvkm_engine *engine, bool suspend) +nvkm_fifo_fini(struct nvkm_engine *engine, enum nvkm_suspend_state suspend) { struct nvkm_fifo *fifo = nvkm_fifo(engine); struct nvkm_runl *runl; diff --git a/drivers/gpu/drm/nouveau/nvkm/engine/fifo/uchan.c b/drivers/gpu/drm/nouveau/nvkm/engine/fifo/uchan.c index 52420a1edca5..c978b97e10c6 100644 --- a/drivers/gpu/drm/nouveau/nvkm/engine/fifo/uchan.c +++ b/drivers/gpu/drm/nouveau/nvkm/engine/fifo/uchan.c @@ -72,7 +72,7 @@ struct nvkm_uobj { }; static int -nvkm_uchan_object_fini_1(struct nvkm_oproxy *oproxy, bool suspend) +nvkm_uchan_object_fini_1(struct nvkm_oproxy *oproxy, enum nvkm_suspend_state suspend) { struct nvkm_uobj *uobj = container_of(oproxy, typeof(*uobj), oproxy); struct nvkm_chan *chan = uobj->chan; @@ -87,7 +87,7 @@ nvkm_uchan_object_fini_1(struct nvkm_oproxy *oproxy, bool suspend) nvkm_chan_cctx_bind(chan, ectx->engn, NULL); if (refcount_dec_and_test(&ectx->uses)) - nvkm_object_fini(ectx->object, false); + nvkm_object_fini(ectx->object, NVKM_POWEROFF); mutex_unlock(&chan->cgrp->mutex); } @@ -269,7 +269,7 @@ nvkm_uchan_map(struct nvkm_object *object, void *argv, u32 argc, } static int -nvkm_uchan_fini(struct nvkm_object *object, bool suspend) +nvkm_uchan_fini(struct nvkm_object *object, enum nvkm_suspend_state suspend) { struct nvkm_chan *chan = nvkm_uchan(object)->chan; diff --git a/drivers/gpu/drm/nouveau/nvkm/engine/gr/base.c b/drivers/gpu/drm/nouveau/nvkm/engine/gr/base.c index f5e68f09df76..cd4908b1b4df 100644 --- a/drivers/gpu/drm/nouveau/nvkm/engine/gr/base.c +++ b/drivers/gpu/drm/nouveau/nvkm/engine/gr/base.c @@ -168,11 +168,11 @@ nvkm_gr_init(struct nvkm_engine *engine) } static int -nvkm_gr_fini(struct nvkm_engine *engine, bool suspend) +nvkm_gr_fini(struct nvkm_engine *engine, enum nvkm_suspend_state suspend) { struct nvkm_gr *gr = nvkm_gr(engine); if (gr->func->fini) - return gr->func->fini(gr, suspend); + return gr->func->fini(gr, suspend != NVKM_POWEROFF); return 0; } diff --git a/drivers/gpu/drm/nouveau/nvkm/engine/gr/gf100.c b/drivers/gpu/drm/nouveau/nvkm/engine/gr/gf100.c index 3ea447f6a45b..3608215f0f11 100644 --- a/drivers/gpu/drm/nouveau/nvkm/engine/gr/gf100.c +++ b/drivers/gpu/drm/nouveau/nvkm/engine/gr/gf100.c @@ -2330,7 +2330,7 @@ gf100_gr_reset(struct nvkm_gr *base) WARN_ON(gf100_gr_fecs_halt_pipeline(gr)); - subdev->func->fini(subdev, false); + subdev->func->fini(subdev, NVKM_POWEROFF); nvkm_mc_disable(device, subdev->type, subdev->inst); if (gr->func->gpccs.reset) gr->func->gpccs.reset(gr); diff --git a/drivers/gpu/drm/nouveau/nvkm/engine/gr/nv04.c b/drivers/gpu/drm/nouveau/nvkm/engine/gr/nv04.c index ca822f07b63e..82937df8b8c0 100644 --- a/drivers/gpu/drm/nouveau/nvkm/engine/gr/nv04.c +++ b/drivers/gpu/drm/nouveau/nvkm/engine/gr/nv04.c @@ -1158,7 +1158,7 @@ nv04_gr_chan_dtor(struct nvkm_object *object) } static int -nv04_gr_chan_fini(struct nvkm_object *object, bool suspend) +nv04_gr_chan_fini(struct nvkm_object *object, enum nvkm_suspend_state suspend) { struct nv04_gr_chan *chan = nv04_gr_chan(object); struct nv04_gr *gr = chan->gr; diff --git a/drivers/gpu/drm/nouveau/nvkm/engine/gr/nv10.c b/drivers/gpu/drm/nouveau/nvkm/engine/gr/nv10.c index 92ef7c9b2910..fcb4e4fce83f 100644 --- a/drivers/gpu/drm/nouveau/nvkm/engine/gr/nv10.c +++ b/drivers/gpu/drm/nouveau/nvkm/engine/gr/nv10.c @@ -951,7 +951,7 @@ nv10_gr_context_switch(struct nv10_gr *gr) } static int -nv10_gr_chan_fini(struct nvkm_object *object, bool suspend) +nv10_gr_chan_fini(struct nvkm_object *object, enum nvkm_suspend_state suspend) { struct nv10_gr_chan *chan = nv10_gr_chan(object); struct nv10_gr *gr = chan->gr; diff --git a/drivers/gpu/drm/nouveau/nvkm/engine/gr/nv20.c b/drivers/gpu/drm/nouveau/nvkm/engine/gr/nv20.c index 13407fafe947..ab57b3b40228 100644 --- a/drivers/gpu/drm/nouveau/nvkm/engine/gr/nv20.c +++ b/drivers/gpu/drm/nouveau/nvkm/engine/gr/nv20.c @@ -27,7 +27,7 @@ nv20_gr_chan_init(struct nvkm_object *object) } int -nv20_gr_chan_fini(struct nvkm_object *object, bool suspend) +nv20_gr_chan_fini(struct nvkm_object *object, enum nvkm_suspend_state suspend) { struct nv20_gr_chan *chan = nv20_gr_chan(object); struct nv20_gr *gr = chan->gr; diff --git a/drivers/gpu/drm/nouveau/nvkm/engine/gr/nv20.h b/drivers/gpu/drm/nouveau/nvkm/engine/gr/nv20.h index c0d2be53413e..786c7832f7ac 100644 --- a/drivers/gpu/drm/nouveau/nvkm/engine/gr/nv20.h +++ b/drivers/gpu/drm/nouveau/nvkm/engine/gr/nv20.h @@ -31,5 +31,5 @@ struct nv20_gr_chan { void *nv20_gr_chan_dtor(struct nvkm_object *); int nv20_gr_chan_init(struct nvkm_object *); -int nv20_gr_chan_fini(struct nvkm_object *, bool); +int nv20_gr_chan_fini(struct nvkm_object *, enum nvkm_suspend_state); #endif diff --git a/drivers/gpu/drm/nouveau/nvkm/engine/gr/nv40.c b/drivers/gpu/drm/nouveau/nvkm/engine/gr/nv40.c index b609b0150ba1..e3e797cf3034 100644 --- a/drivers/gpu/drm/nouveau/nvkm/engine/gr/nv40.c +++ b/drivers/gpu/drm/nouveau/nvkm/engine/gr/nv40.c @@ -89,7 +89,7 @@ nv40_gr_chan_bind(struct nvkm_object *object, struct nvkm_gpuobj *parent, } static int -nv40_gr_chan_fini(struct nvkm_object *object, bool suspend) +nv40_gr_chan_fini(struct nvkm_object *object, enum nvkm_suspend_state suspend) { struct nv40_gr_chan *chan = nv40_gr_chan(object); struct nv40_gr *gr = chan->gr; @@ -101,7 +101,7 @@ nv40_gr_chan_fini(struct nvkm_object *object, bool suspend) nvkm_mask(device, 0x400720, 0x00000001, 0x00000000); if (nvkm_rd32(device, 0x40032c) == inst) { - if (suspend) { + if (suspend != NVKM_POWEROFF) { nvkm_wr32(device, 0x400720, 0x00000000); nvkm_wr32(device, 0x400784, inst); nvkm_mask(device, 0x400310, 0x00000020, 0x00000020); diff --git a/drivers/gpu/drm/nouveau/nvkm/engine/mpeg/nv44.c b/drivers/gpu/drm/nouveau/nvkm/engine/mpeg/nv44.c index 4b1374adbda3..38146f9cc81c 100644 --- a/drivers/gpu/drm/nouveau/nvkm/engine/mpeg/nv44.c +++ b/drivers/gpu/drm/nouveau/nvkm/engine/mpeg/nv44.c @@ -65,7 +65,7 @@ nv44_mpeg_chan_bind(struct nvkm_object *object, struct nvkm_gpuobj *parent, } static int -nv44_mpeg_chan_fini(struct nvkm_object *object, bool suspend) +nv44_mpeg_chan_fini(struct nvkm_object *object, enum nvkm_suspend_state suspend) { struct nv44_mpeg_chan *chan = nv44_mpeg_chan(object); diff --git a/drivers/gpu/drm/nouveau/nvkm/engine/sec2/base.c b/drivers/gpu/drm/nouveau/nvkm/engine/sec2/base.c index f2c60da5d1e8..3e4d6a680ee9 100644 --- a/drivers/gpu/drm/nouveau/nvkm/engine/sec2/base.c +++ b/drivers/gpu/drm/nouveau/nvkm/engine/sec2/base.c @@ -37,7 +37,7 @@ nvkm_sec2_finimsg(void *priv, struct nvfw_falcon_msg *hdr) } static int -nvkm_sec2_fini(struct nvkm_engine *engine, bool suspend) +nvkm_sec2_fini(struct nvkm_engine *engine, enum nvkm_suspend_state suspend) { struct nvkm_sec2 *sec2 = nvkm_sec2(engine); struct nvkm_subdev *subdev = &sec2->engine.subdev; diff --git a/drivers/gpu/drm/nouveau/nvkm/engine/xtensa.c b/drivers/gpu/drm/nouveau/nvkm/engine/xtensa.c index f7d3ba0afb55..910a5bb2d191 100644 --- a/drivers/gpu/drm/nouveau/nvkm/engine/xtensa.c +++ b/drivers/gpu/drm/nouveau/nvkm/engine/xtensa.c @@ -76,7 +76,7 @@ nvkm_xtensa_intr(struct nvkm_engine *engine) } static int -nvkm_xtensa_fini(struct nvkm_engine *engine, bool suspend) +nvkm_xtensa_fini(struct nvkm_engine *engine, enum nvkm_suspend_state suspend) { struct nvkm_xtensa *xtensa = nvkm_xtensa(engine); struct nvkm_device *device = xtensa->engine.subdev.device; @@ -85,7 +85,7 @@ nvkm_xtensa_fini(struct nvkm_engine *engine, bool suspend) nvkm_wr32(device, base + 0xd84, 0); /* INTR_EN */ nvkm_wr32(device, base + 0xd94, 0); /* FIFO_CTRL */ - if (!suspend) + if (suspend == NVKM_POWEROFF) nvkm_memory_unref(&xtensa->gpu_fw); return 0; } diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/acr/base.c b/drivers/gpu/drm/nouveau/nvkm/subdev/acr/base.c index 9b8ca4e898f9..13d829593180 100644 --- a/drivers/gpu/drm/nouveau/nvkm/subdev/acr/base.c +++ b/drivers/gpu/drm/nouveau/nvkm/subdev/acr/base.c @@ -182,7 +182,7 @@ nvkm_acr_managed_falcon(struct nvkm_device *device, enum nvkm_acr_lsf_id id) } static int -nvkm_acr_fini(struct nvkm_subdev *subdev, bool suspend) +nvkm_acr_fini(struct nvkm_subdev *subdev, enum nvkm_suspend_state suspend) { if (!subdev->use.enabled) return 0; diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/bar/base.c b/drivers/gpu/drm/nouveau/nvkm/subdev/bar/base.c index 91bc53be97ff..7dee55bf9ada 100644 --- a/drivers/gpu/drm/nouveau/nvkm/subdev/bar/base.c +++ b/drivers/gpu/drm/nouveau/nvkm/subdev/bar/base.c @@ -90,7 +90,7 @@ nvkm_bar_bar2_init(struct nvkm_device *device) } static int -nvkm_bar_fini(struct nvkm_subdev *subdev, bool suspend) +nvkm_bar_fini(struct nvkm_subdev *subdev, enum nvkm_suspend_state suspend) { struct nvkm_bar *bar = nvkm_bar(subdev); diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/clk/base.c b/drivers/gpu/drm/nouveau/nvkm/subdev/clk/base.c index 178dc56909c2..71420f81714b 100644 --- a/drivers/gpu/drm/nouveau/nvkm/subdev/clk/base.c +++ b/drivers/gpu/drm/nouveau/nvkm/subdev/clk/base.c @@ -577,7 +577,7 @@ nvkm_clk_read(struct nvkm_clk *clk, enum nv_clk_src src) } static int -nvkm_clk_fini(struct nvkm_subdev *subdev, bool suspend) +nvkm_clk_fini(struct nvkm_subdev *subdev, enum nvkm_suspend_state suspend) { struct nvkm_clk *clk = nvkm_clk(subdev); flush_work(&clk->work); diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/devinit/base.c b/drivers/gpu/drm/nouveau/nvkm/subdev/devinit/base.c index 3d9319c319c6..ad5ec9ee1294 100644 --- a/drivers/gpu/drm/nouveau/nvkm/subdev/devinit/base.c +++ b/drivers/gpu/drm/nouveau/nvkm/subdev/devinit/base.c @@ -67,11 +67,11 @@ nvkm_devinit_post(struct nvkm_devinit *init) } static int -nvkm_devinit_fini(struct nvkm_subdev *subdev, bool suspend) +nvkm_devinit_fini(struct nvkm_subdev *subdev, enum nvkm_suspend_state suspend) { struct nvkm_devinit *init = nvkm_devinit(subdev); /* force full reinit on resume */ - if (suspend) + if (suspend != NVKM_POWEROFF) init->post = true; return 0; } diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/fault/base.c b/drivers/gpu/drm/nouveau/nvkm/subdev/fault/base.c index b53ac9a2552f..d8d32bb5bcd9 100644 --- a/drivers/gpu/drm/nouveau/nvkm/subdev/fault/base.c +++ b/drivers/gpu/drm/nouveau/nvkm/subdev/fault/base.c @@ -51,7 +51,7 @@ nvkm_fault_intr(struct nvkm_subdev *subdev) } static int -nvkm_fault_fini(struct nvkm_subdev *subdev, bool suspend) +nvkm_fault_fini(struct nvkm_subdev *subdev, enum nvkm_suspend_state suspend) { struct nvkm_fault *fault = nvkm_fault(subdev); if (fault->func->fini) diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/fault/user.c b/drivers/gpu/drm/nouveau/nvkm/subdev/fault/user.c index cd2fbc0472d8..8ab052d18e5d 100644 --- a/drivers/gpu/drm/nouveau/nvkm/subdev/fault/user.c +++ b/drivers/gpu/drm/nouveau/nvkm/subdev/fault/user.c @@ -56,7 +56,7 @@ nvkm_ufault_map(struct nvkm_object *object, void *argv, u32 argc, } static int -nvkm_ufault_fini(struct nvkm_object *object, bool suspend) +nvkm_ufault_fini(struct nvkm_object *object, enum nvkm_suspend_state suspend) { struct nvkm_fault_buffer *buffer = nvkm_fault_buffer(object); buffer->fault->func->buffer.fini(buffer); diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/gpio/base.c b/drivers/gpu/drm/nouveau/nvkm/subdev/gpio/base.c index b196baa376dc..b2c34878a68f 100644 --- a/drivers/gpu/drm/nouveau/nvkm/subdev/gpio/base.c +++ b/drivers/gpu/drm/nouveau/nvkm/subdev/gpio/base.c @@ -144,7 +144,7 @@ nvkm_gpio_intr(struct nvkm_subdev *subdev) } static int -nvkm_gpio_fini(struct nvkm_subdev *subdev, bool suspend) +nvkm_gpio_fini(struct nvkm_subdev *subdev, enum nvkm_suspend_state suspend) { struct nvkm_gpio *gpio = nvkm_gpio(subdev); u32 mask = (1ULL << gpio->func->lines) - 1; diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/base.c b/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/base.c index 7ccb41761066..30cb843ba35c 100644 --- a/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/base.c +++ b/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/base.c @@ -48,7 +48,7 @@ nvkm_gsp_intr_stall(struct nvkm_gsp *gsp, enum nvkm_subdev_type type, int inst) } static int -nvkm_gsp_fini(struct nvkm_subdev *subdev, bool suspend) +nvkm_gsp_fini(struct nvkm_subdev *subdev, enum nvkm_suspend_state suspend) { struct nvkm_gsp *gsp = nvkm_gsp(subdev); diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/gh100.c b/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/gh100.c index b0dd5fce7bad..88436a264177 100644 --- a/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/gh100.c +++ b/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/gh100.c @@ -17,7 +17,7 @@ #include int -gh100_gsp_fini(struct nvkm_gsp *gsp, bool suspend) +gh100_gsp_fini(struct nvkm_gsp *gsp, enum nvkm_suspend_state suspend) { struct nvkm_falcon *falcon = &gsp->falcon; int ret, time = 4000; diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/priv.h b/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/priv.h index 9dd66a2e3801..71b7203bef50 100644 --- a/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/priv.h +++ b/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/priv.h @@ -59,7 +59,7 @@ struct nvkm_gsp_func { void (*dtor)(struct nvkm_gsp *); int (*oneinit)(struct nvkm_gsp *); int (*init)(struct nvkm_gsp *); - int (*fini)(struct nvkm_gsp *, bool suspend); + int (*fini)(struct nvkm_gsp *, enum nvkm_suspend_state suspend); int (*reset)(struct nvkm_gsp *); struct { @@ -75,7 +75,7 @@ int tu102_gsp_fwsec_sb_ctor(struct nvkm_gsp *); void tu102_gsp_fwsec_sb_dtor(struct nvkm_gsp *); int tu102_gsp_oneinit(struct nvkm_gsp *); int tu102_gsp_init(struct nvkm_gsp *); -int tu102_gsp_fini(struct nvkm_gsp *, bool suspend); +int tu102_gsp_fini(struct nvkm_gsp *, enum nvkm_suspend_state suspend); int tu102_gsp_reset(struct nvkm_gsp *); u64 tu102_gsp_wpr_heap_size(struct nvkm_gsp *); @@ -87,12 +87,12 @@ int ga102_gsp_reset(struct nvkm_gsp *); int gh100_gsp_oneinit(struct nvkm_gsp *); int gh100_gsp_init(struct nvkm_gsp *); -int gh100_gsp_fini(struct nvkm_gsp *, bool suspend); +int gh100_gsp_fini(struct nvkm_gsp *, enum nvkm_suspend_state suspend); void r535_gsp_dtor(struct nvkm_gsp *); int r535_gsp_oneinit(struct nvkm_gsp *); int r535_gsp_init(struct nvkm_gsp *); -int r535_gsp_fini(struct nvkm_gsp *, bool suspend); +int r535_gsp_fini(struct nvkm_gsp *, enum nvkm_suspend_state suspend); int nvkm_gsp_new_(const struct nvkm_gsp_fwif *, struct nvkm_device *, enum nvkm_subdev_type, int, struct nvkm_gsp **); diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/rm/r535/gsp.c b/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/rm/r535/gsp.c index 6e7af2f737b7..2f028a30e07d 100644 --- a/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/rm/r535/gsp.c +++ b/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/rm/r535/gsp.c @@ -1721,7 +1721,7 @@ r535_gsp_sr_data_size(struct nvkm_gsp *gsp) } int -r535_gsp_fini(struct nvkm_gsp *gsp, bool suspend) +r535_gsp_fini(struct nvkm_gsp *gsp, enum nvkm_suspend_state suspend) { struct nvkm_rm *rm = gsp->rm; int ret; diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/tu102.c b/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/tu102.c index 04b642a1f730..19cb269e7a26 100644 --- a/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/tu102.c +++ b/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/tu102.c @@ -161,7 +161,7 @@ tu102_gsp_reset(struct nvkm_gsp *gsp) } int -tu102_gsp_fini(struct nvkm_gsp *gsp, bool suspend) +tu102_gsp_fini(struct nvkm_gsp *gsp, enum nvkm_suspend_state suspend) { u32 mbox0 = 0xff, mbox1 = 0xff; int ret; diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/i2c/base.c b/drivers/gpu/drm/nouveau/nvkm/subdev/i2c/base.c index 7ec17e8435a1..454bb21815a2 100644 --- a/drivers/gpu/drm/nouveau/nvkm/subdev/i2c/base.c +++ b/drivers/gpu/drm/nouveau/nvkm/subdev/i2c/base.c @@ -135,7 +135,7 @@ nvkm_i2c_intr(struct nvkm_subdev *subdev) } static int -nvkm_i2c_fini(struct nvkm_subdev *subdev, bool suspend) +nvkm_i2c_fini(struct nvkm_subdev *subdev, enum nvkm_suspend_state suspend) { struct nvkm_i2c *i2c = nvkm_i2c(subdev); struct nvkm_i2c_pad *pad; diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/instmem/base.c b/drivers/gpu/drm/nouveau/nvkm/subdev/instmem/base.c index 2f55bab8e132..6b9ed61684a0 100644 --- a/drivers/gpu/drm/nouveau/nvkm/subdev/instmem/base.c +++ b/drivers/gpu/drm/nouveau/nvkm/subdev/instmem/base.c @@ -176,7 +176,7 @@ nvkm_instmem_boot(struct nvkm_instmem *imem) } static int -nvkm_instmem_fini(struct nvkm_subdev *subdev, bool suspend) +nvkm_instmem_fini(struct nvkm_subdev *subdev, enum nvkm_suspend_state suspend) { struct nvkm_instmem *imem = nvkm_instmem(subdev); int ret; diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/pci/base.c b/drivers/gpu/drm/nouveau/nvkm/subdev/pci/base.c index 6867934256a7..0f3e0d324a52 100644 --- a/drivers/gpu/drm/nouveau/nvkm/subdev/pci/base.c +++ b/drivers/gpu/drm/nouveau/nvkm/subdev/pci/base.c @@ -74,7 +74,7 @@ nvkm_pci_rom_shadow(struct nvkm_pci *pci, bool shadow) } static int -nvkm_pci_fini(struct nvkm_subdev *subdev, bool suspend) +nvkm_pci_fini(struct nvkm_subdev *subdev, enum nvkm_suspend_state suspend) { struct nvkm_pci *pci = nvkm_pci(subdev); diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/pmu/base.c b/drivers/gpu/drm/nouveau/nvkm/subdev/pmu/base.c index 8f2f50ad4ded..9e9004ec4588 100644 --- a/drivers/gpu/drm/nouveau/nvkm/subdev/pmu/base.c +++ b/drivers/gpu/drm/nouveau/nvkm/subdev/pmu/base.c @@ -77,7 +77,7 @@ nvkm_pmu_intr(struct nvkm_subdev *subdev) } static int -nvkm_pmu_fini(struct nvkm_subdev *subdev, bool suspend) +nvkm_pmu_fini(struct nvkm_subdev *subdev, enum nvkm_suspend_state suspend) { struct nvkm_pmu *pmu = nvkm_pmu(subdev); diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/therm/base.c b/drivers/gpu/drm/nouveau/nvkm/subdev/therm/base.c index fc5ee118e910..1510aba33956 100644 --- a/drivers/gpu/drm/nouveau/nvkm/subdev/therm/base.c +++ b/drivers/gpu/drm/nouveau/nvkm/subdev/therm/base.c @@ -341,15 +341,15 @@ nvkm_therm_intr(struct nvkm_subdev *subdev) } static int -nvkm_therm_fini(struct nvkm_subdev *subdev, bool suspend) +nvkm_therm_fini(struct nvkm_subdev *subdev, enum nvkm_suspend_state suspend) { struct nvkm_therm *therm = nvkm_therm(subdev); if (therm->func->fini) therm->func->fini(therm); - nvkm_therm_fan_fini(therm, suspend); - nvkm_therm_sensor_fini(therm, suspend); + nvkm_therm_fan_fini(therm, suspend != NVKM_POWEROFF); + nvkm_therm_sensor_fini(therm, suspend != NVKM_POWEROFF); if (suspend) { therm->suspend = therm->mode; diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/timer/base.c b/drivers/gpu/drm/nouveau/nvkm/subdev/timer/base.c index 8b0da0c06268..a5c3c282b5d0 100644 --- a/drivers/gpu/drm/nouveau/nvkm/subdev/timer/base.c +++ b/drivers/gpu/drm/nouveau/nvkm/subdev/timer/base.c @@ -149,7 +149,7 @@ nvkm_timer_intr(struct nvkm_subdev *subdev) } static int -nvkm_timer_fini(struct nvkm_subdev *subdev, bool suspend) +nvkm_timer_fini(struct nvkm_subdev *subdev, enum nvkm_suspend_state suspend) { struct nvkm_timer *tmr = nvkm_timer(subdev); tmr->func->alarm_fini(tmr); From 8302d0afeaec0bc57d951dd085e0cffe997d4d18 Mon Sep 17 00:00:00 2001 From: Dave Airlie Date: Tue, 3 Feb 2026 15:21:13 +1000 Subject: [PATCH 117/167] nouveau/gsp: fix suspend/resume regression on r570 firmware The r570 firmware with certain GPUs (at least RTX6000) needs this flag to reflect the suspend vs runtime PM state of the driver. This uses that info to set the correct flags to the firmware. This fixes a regression on RTX6000 and other GPUs since r570 firmware was enabled. Fixes: 53dac0623853 ("drm/nouveau/gsp: add support for 570.144") Cc: Reviewed-by: Lyude Paul Tested-by: Lyude Paul Signed-off-by: Dave Airlie Link: https://patch.msgid.link/20260203052431.2219998-4-airlied@gmail.com --- drivers/gpu/drm/nouveau/nvkm/subdev/gsp/rm/r535/fbsr.c | 2 +- drivers/gpu/drm/nouveau/nvkm/subdev/gsp/rm/r535/gsp.c | 2 +- drivers/gpu/drm/nouveau/nvkm/subdev/gsp/rm/r570/fbsr.c | 8 ++++---- drivers/gpu/drm/nouveau/nvkm/subdev/gsp/rm/rm.h | 2 +- 4 files changed, 7 insertions(+), 7 deletions(-) diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/rm/r535/fbsr.c b/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/rm/r535/fbsr.c index 150e22fde2ac..e962d0e8f837 100644 --- a/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/rm/r535/fbsr.c +++ b/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/rm/r535/fbsr.c @@ -208,7 +208,7 @@ r535_fbsr_resume(struct nvkm_gsp *gsp) } static int -r535_fbsr_suspend(struct nvkm_gsp *gsp) +r535_fbsr_suspend(struct nvkm_gsp *gsp, bool runtime) { struct nvkm_subdev *subdev = &gsp->subdev; struct nvkm_device *device = subdev->device; diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/rm/r535/gsp.c b/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/rm/r535/gsp.c index 2f028a30e07d..7fb13434c051 100644 --- a/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/rm/r535/gsp.c +++ b/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/rm/r535/gsp.c @@ -1748,7 +1748,7 @@ r535_gsp_fini(struct nvkm_gsp *gsp, enum nvkm_suspend_state suspend) sr->sysmemAddrOfSuspendResumeData = gsp->sr.radix3.lvl0.addr; sr->sizeOfSuspendResumeData = len; - ret = rm->api->fbsr->suspend(gsp); + ret = rm->api->fbsr->suspend(gsp, suspend == NVKM_RUNTIME_SUSPEND); if (ret) { nvkm_gsp_mem_dtor(&gsp->sr.meta); nvkm_gsp_radix3_dtor(gsp, &gsp->sr.radix3); diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/rm/r570/fbsr.c b/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/rm/r570/fbsr.c index 2945d5b4e570..8ef8b4f65588 100644 --- a/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/rm/r570/fbsr.c +++ b/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/rm/r570/fbsr.c @@ -62,7 +62,7 @@ r570_fbsr_resume(struct nvkm_gsp *gsp) } static int -r570_fbsr_init(struct nvkm_gsp *gsp, struct sg_table *sgt, u64 size) +r570_fbsr_init(struct nvkm_gsp *gsp, struct sg_table *sgt, u64 size, bool runtime) { NV2080_CTRL_INTERNAL_FBSR_INIT_PARAMS *ctrl; struct nvkm_gsp_object memlist; @@ -81,7 +81,7 @@ r570_fbsr_init(struct nvkm_gsp *gsp, struct sg_table *sgt, u64 size) ctrl->hClient = gsp->internal.client.object.handle; ctrl->hSysMem = memlist.handle; ctrl->sysmemAddrOfSuspendResumeData = gsp->sr.meta.addr; - ctrl->bEnteringGcoffState = 1; + ctrl->bEnteringGcoffState = runtime ? 1 : 0; ret = nvkm_gsp_rm_ctrl_wr(&gsp->internal.device.subdevice, ctrl); if (ret) @@ -92,7 +92,7 @@ r570_fbsr_init(struct nvkm_gsp *gsp, struct sg_table *sgt, u64 size) } static int -r570_fbsr_suspend(struct nvkm_gsp *gsp) +r570_fbsr_suspend(struct nvkm_gsp *gsp, bool runtime) { struct nvkm_subdev *subdev = &gsp->subdev; struct nvkm_device *device = subdev->device; @@ -133,7 +133,7 @@ r570_fbsr_suspend(struct nvkm_gsp *gsp) return ret; /* Initialise FBSR on RM. */ - ret = r570_fbsr_init(gsp, &gsp->sr.fbsr, size); + ret = r570_fbsr_init(gsp, &gsp->sr.fbsr, size, runtime); if (ret) { nvkm_gsp_sg_free(device, &gsp->sr.fbsr); return ret; diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/rm/rm.h b/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/rm/rm.h index 393ea775941f..4f0ae6cc085c 100644 --- a/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/rm/rm.h +++ b/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/rm/rm.h @@ -78,7 +78,7 @@ struct nvkm_rm_api { } *device; const struct nvkm_rm_api_fbsr { - int (*suspend)(struct nvkm_gsp *); + int (*suspend)(struct nvkm_gsp *, bool runtime); void (*resume)(struct nvkm_gsp *); } *fbsr; From 78211543d2e44f84093049b4ef5f5bfa535f4645 Mon Sep 17 00:00:00 2001 From: Chen Ni Date: Mon, 2 Feb 2026 12:02:28 +0800 Subject: [PATCH 118/167] net: ethernet: adi: adin1110: Check return value of devm_gpiod_get_optional() in adin1110_check_spi() MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit The devm_gpiod_get_optional() function may return an ERR_PTR in case of genuine GPIO acquisition errors, not just NULL which indicates the legitimate absence of an optional GPIO. Add an IS_ERR() check after the call in adin1110_check_spi(). On error, return the error code to ensure proper failure handling rather than proceeding with invalid pointers. Fixes: 36934cac7aaf ("net: ethernet: adi: adin1110: add reset GPIO") Signed-off-by: Chen Ni Reviewed-by: Nuno Sá Link: https://patch.msgid.link/20260202040228.4129097-1-nichen@iscas.ac.cn Signed-off-by: Jakub Kicinski --- drivers/net/ethernet/adi/adin1110.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/drivers/net/ethernet/adi/adin1110.c b/drivers/net/ethernet/adi/adin1110.c index 30f9d271e595..71a2397edf2b 100644 --- a/drivers/net/ethernet/adi/adin1110.c +++ b/drivers/net/ethernet/adi/adin1110.c @@ -1089,6 +1089,9 @@ static int adin1110_check_spi(struct adin1110_priv *priv) reset_gpio = devm_gpiod_get_optional(&priv->spidev->dev, "reset", GPIOD_OUT_LOW); + if (IS_ERR(reset_gpio)) + return dev_err_probe(&priv->spidev->dev, PTR_ERR(reset_gpio), + "failed to get reset gpio\n"); if (reset_gpio) { /* MISO pin is used for internal configuration, can't have * anyone else disturbing the SDO line. From f613e8b4afea0cd17c7168e8b00e25bc8d33175d Mon Sep 17 00:00:00 2001 From: Eric Dumazet Date: Mon, 2 Feb 2026 20:52:17 +0000 Subject: [PATCH 119/167] net: add proper RCU protection to /proc/net/ptype Yin Fengwei reported an RCU stall in ptype_seq_show() and provided a patch. Real issue is that ptype_seq_next() and ptype_seq_show() violate RCU rules. ptype_seq_show() runs under rcu_read_lock(), and reads pt->dev to get device name without any barrier. At the same time, concurrent writers can remove a packet_type structure (which is correctly freed after an RCU grace period) and clear pt->dev without an RCU grace period. Define ptype_iter_state to carry a dev pointer along seq_net_private: struct ptype_iter_state { struct seq_net_private p; struct net_device *dev; // added in this patch }; We need to record the device pointer in ptype_get_idx() and ptype_seq_next() so that ptype_seq_show() is safe against concurrent pt->dev changes. We also need to add full RCU protection in ptype_seq_next(). (Missing READ_ONCE() when reading list.next values) Many thanks to Dong Chenchen for providing a repro. Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Fixes: 1d10f8a1f40b ("net-procfs: show net devices bound packet types") Fixes: c353e8983e0d ("net: introduce per netns packet chains") Reported-by: Yin Fengwei Reported-by: Dong Chenchen Closes: https://lore.kernel.org/netdev/CANn89iKRRKPnWjJmb-_3a=sq+9h6DvTQM4DBZHT5ZRGPMzQaiA@mail.gmail.com/T/#m7b80b9fc9b9267f90e0b7aad557595f686f9c50d Signed-off-by: Eric Dumazet Reviewed-by: Willem de Bruijn Tested-by: Yin Fengwei Link: https://patch.msgid.link/20260202205217.2881198-1-edumazet@google.com Signed-off-by: Jakub Kicinski --- net/core/net-procfs.c | 50 +++++++++++++++++++++++++++++-------------- 1 file changed, 34 insertions(+), 16 deletions(-) diff --git a/net/core/net-procfs.c b/net/core/net-procfs.c index 70e0e9a3b650..7dbfa6109f0b 100644 --- a/net/core/net-procfs.c +++ b/net/core/net-procfs.c @@ -170,8 +170,14 @@ static const struct seq_operations softnet_seq_ops = { .show = softnet_seq_show, }; +struct ptype_iter_state { + struct seq_net_private p; + struct net_device *dev; +}; + static void *ptype_get_idx(struct seq_file *seq, loff_t pos) { + struct ptype_iter_state *iter = seq->private; struct list_head *ptype_list = NULL; struct packet_type *pt = NULL; struct net_device *dev; @@ -181,12 +187,16 @@ static void *ptype_get_idx(struct seq_file *seq, loff_t pos) for_each_netdev_rcu(seq_file_net(seq), dev) { ptype_list = &dev->ptype_all; list_for_each_entry_rcu(pt, ptype_list, list) { - if (i == pos) + if (i == pos) { + iter->dev = dev; return pt; + } ++i; } } + iter->dev = NULL; + list_for_each_entry_rcu(pt, &seq_file_net(seq)->ptype_all, list) { if (i == pos) return pt; @@ -218,6 +228,7 @@ static void *ptype_seq_start(struct seq_file *seq, loff_t *pos) static void *ptype_seq_next(struct seq_file *seq, void *v, loff_t *pos) { + struct ptype_iter_state *iter = seq->private; struct net *net = seq_file_net(seq); struct net_device *dev; struct packet_type *pt; @@ -229,19 +240,21 @@ static void *ptype_seq_next(struct seq_file *seq, void *v, loff_t *pos) return ptype_get_idx(seq, 0); pt = v; - nxt = pt->list.next; - if (pt->dev) { - if (nxt != &pt->dev->ptype_all) + nxt = READ_ONCE(pt->list.next); + dev = iter->dev; + if (dev) { + if (nxt != &dev->ptype_all) goto found; - dev = pt->dev; for_each_netdev_continue_rcu(seq_file_net(seq), dev) { - if (!list_empty(&dev->ptype_all)) { - nxt = dev->ptype_all.next; + nxt = READ_ONCE(dev->ptype_all.next); + if (nxt != &dev->ptype_all) { + iter->dev = dev; goto found; } } - nxt = net->ptype_all.next; + iter->dev = NULL; + nxt = READ_ONCE(net->ptype_all.next); goto net_ptype_all; } @@ -252,20 +265,20 @@ static void *ptype_seq_next(struct seq_file *seq, void *v, loff_t *pos) if (nxt == &net->ptype_all) { /* continue with ->ptype_specific if it's not empty */ - nxt = net->ptype_specific.next; + nxt = READ_ONCE(net->ptype_specific.next); if (nxt != &net->ptype_specific) goto found; } hash = 0; - nxt = ptype_base[0].next; + nxt = READ_ONCE(ptype_base[0].next); } else hash = ntohs(pt->type) & PTYPE_HASH_MASK; while (nxt == &ptype_base[hash]) { if (++hash >= PTYPE_HASH_SIZE) return NULL; - nxt = ptype_base[hash].next; + nxt = READ_ONCE(ptype_base[hash].next); } found: return list_entry(nxt, struct packet_type, list); @@ -279,19 +292,24 @@ static void ptype_seq_stop(struct seq_file *seq, void *v) static int ptype_seq_show(struct seq_file *seq, void *v) { + struct ptype_iter_state *iter = seq->private; struct packet_type *pt = v; + struct net_device *dev; - if (v == SEQ_START_TOKEN) + if (v == SEQ_START_TOKEN) { seq_puts(seq, "Type Device Function\n"); - else if ((!pt->af_packet_net || net_eq(pt->af_packet_net, seq_file_net(seq))) && - (!pt->dev || net_eq(dev_net(pt->dev), seq_file_net(seq)))) { + return 0; + } + dev = iter->dev; + if ((!pt->af_packet_net || net_eq(pt->af_packet_net, seq_file_net(seq))) && + (!dev || net_eq(dev_net(dev), seq_file_net(seq)))) { if (pt->type == htons(ETH_P_ALL)) seq_puts(seq, "ALL "); else seq_printf(seq, "%04x", ntohs(pt->type)); seq_printf(seq, " %-8s %ps\n", - pt->dev ? pt->dev->name : "", pt->func); + dev ? dev->name : "", pt->func); } return 0; @@ -315,7 +333,7 @@ static int __net_init dev_proc_net_init(struct net *net) &softnet_seq_ops)) goto out_dev; if (!proc_create_net("ptype", 0444, net->proc_net, &ptype_seq_ops, - sizeof(struct seq_net_private))) + sizeof(struct ptype_iter_state))) goto out_softnet; if (wext_proc_init(net)) From 5c2c3c38be396257a6a2e55bd601a12bb9781507 Mon Sep 17 00:00:00 2001 From: Paolo Abeni Date: Mon, 2 Feb 2026 12:43:14 +0100 Subject: [PATCH 120/167] net: gro: fix outer network offset The udp GRO complete stage assumes that all the packets inserted the RX have the `encapsulation` flag zeroed. Such assumption is not true, as a few H/W NICs can set such flag when H/W offloading the checksum for an UDP encapsulated traffic, the tun driver can inject GSO packets with UDP encapsulation and the problematic layout can also be created via a veth based setup. Due to the above, in the problematic scenarios, udp4_gro_complete() uses the wrong network offset (inner instead of outer) to compute the outer UDP header pseudo checksum, leading to csum validation errors later on in packet processing. Address the issue always clearing the encapsulation flag at GRO completion time. Such flag will be set again as needed for encapsulated packets by udp_gro_complete(). Fixes: 5ef31ea5d053 ("net: gro: fix udp bad offset in socket lookup by adding {inner_}network_offset to napi_gro_cb") Reviewed-by: Willem de Bruijn Signed-off-by: Paolo Abeni Reviewed-by: Eric Dumazet Link: https://patch.msgid.link/562638dbebb3b15424220e26a180274b387e2a88.1770032084.git.pabeni@redhat.com Signed-off-by: Jakub Kicinski --- net/core/gro.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/net/core/gro.c b/net/core/gro.c index 76f9c3712422..482fa7d7f598 100644 --- a/net/core/gro.c +++ b/net/core/gro.c @@ -265,6 +265,8 @@ static void gro_complete(struct gro_node *gro, struct sk_buff *skb) goto out; } + /* NICs can feed encapsulated packets into GRO */ + skb->encapsulation = 0; rcu_read_lock(); list_for_each_entry_rcu(ptype, head, list) { if (ptype->type != type || !ptype->callbacks.gro_complete) From bee60ce21b751275b3a7766f614373ef02dde512 Mon Sep 17 00:00:00 2001 From: Paolo Abeni Date: Mon, 2 Feb 2026 12:43:15 +0100 Subject: [PATCH 121/167] selftest: net: add a test-case for encap segmentation after GRO We had a few patches in this area and no explicit coverage so far. The test case covers the scenario addressed by the previous fix; reusing the existing udpgro_fwd.sh script to leverage part of the of the virtual network setup, even if such script is possibly not a perfect fit. Note that the mentioned script already contains several shellcheck violation; this patch does not fix the existing code, just avoids adding more issues in the new one. Reviewed-by: Willem de Bruijn Signed-off-by: Paolo Abeni Link: https://patch.msgid.link/768ca132af81e83856e34d3105b86c37e566a7ad.1770032084.git.pabeni@redhat.com Signed-off-by: Jakub Kicinski --- tools/testing/selftests/net/udpgro_fwd.sh | 64 +++++++++++++++++++++++ 1 file changed, 64 insertions(+) diff --git a/tools/testing/selftests/net/udpgro_fwd.sh b/tools/testing/selftests/net/udpgro_fwd.sh index a39fdc4aa2ff..9b722c1e4b0f 100755 --- a/tools/testing/selftests/net/udpgro_fwd.sh +++ b/tools/testing/selftests/net/udpgro_fwd.sh @@ -162,6 +162,39 @@ run_test() { echo " ok" } +run_test_csum() { + local -r msg="$1" + local -r dst="$2" + local csum_error_filter=UdpInCsumErrors + local csum_errors + + printf "%-40s" "$msg" + + is_ipv6 "$dst" && csum_error_filter=Udp6InCsumErrors + + ip netns exec "$NS_DST" iperf3 -s -1 >/dev/null & + wait_local_port_listen "$NS_DST" 5201 tcp + local spid="$!" + ip netns exec "$NS_SRC" iperf3 -c "$dst" -t 2 >/dev/null + local retc="$?" + wait "$spid" + local rets="$?" + if [ "$rets" -ne 0 ] || [ "$retc" -ne 0 ]; then + echo " fail client exit code $retc, server $rets" + ret=1 + return + fi + + csum_errors=$(ip netns exec "$NS_DST" nstat -as "$csum_error_filter" | + grep "$csum_error_filter" | awk '{print $2}') + if [ -n "$csum_errors" ] && [ "$csum_errors" -gt 0 ]; then + echo " fail - csum error on receive $csum_errors, expected 0" + ret=1 + return + fi + echo " ok" +} + run_bench() { local -r msg=$1 local -r dst=$2 @@ -260,6 +293,37 @@ for family in 4 6; do ip netns exec $NS_SRC $PING -q -c 1 $OL_NET$DST_NAT >/dev/null run_test "GRO fwd over UDP tunnel" $OL_NET$DST_NAT 10 10 $OL_NET$DST cleanup + + # force segmentation and re-aggregation + create_vxlan_pair + ip netns exec "$NS_DST" ethtool -K veth"$DST" generic-receive-offload on + ip netns exec "$NS_SRC" ethtool -K veth"$SRC" tso off + ip -n "$NS_SRC" link set dev veth"$SRC" mtu 1430 + + # forward to a 2nd veth pair + ip -n "$NS_DST" link add br0 type bridge + ip -n "$NS_DST" link set dev veth"$DST" master br0 + + # segment the aggregated TSO packet, without csum offload + ip -n "$NS_DST" link add veth_segment type veth peer veth_rx + for FEATURE in tso tx-udp-segmentation tx-checksumming; do + ip netns exec "$NS_DST" ethtool -K veth_segment "$FEATURE" off + done + ip -n "$NS_DST" link set dev veth_segment master br0 up + ip -n "$NS_DST" link set dev br0 up + ip -n "$NS_DST" link set dev veth_rx up + + # move the lower layer IP in the last added veth + for ADDR in "$BM_NET_V4$DST/24" "$BM_NET_V6$DST/64"; do + # the dad argument will let iproute emit a unharmful warning + # with ipv4 addresses + ip -n "$NS_DST" addr del dev veth"$DST" "$ADDR" + ip -n "$NS_DST" addr add dev veth_rx "$ADDR" \ + nodad 2>/dev/null + done + + run_test_csum "GSO after GRO" "$OL_NET$DST" + cleanup done exit $ret From 7b9ebcce0296e104a0d82a6b09d68564806158ff Mon Sep 17 00:00:00 2001 From: Debarghya Kundu Date: Mon, 2 Feb 2026 19:39:24 +0000 Subject: [PATCH 122/167] gve: Fix stats report corruption on queue count change The driver and the NIC share a region in memory for stats reporting. The NIC calculates its offset into this region based on the total size of the stats region and the size of the NIC's stats. When the number of queues is changed, the driver's stats region is resized. If the queue count is increased, the NIC can write past the end of the allocated stats region, causing memory corruption. If the queue count is decreased, there is a gap between the driver and NIC stats, leading to incorrect stats reporting. This change fixes the issue by allocating stats region with maximum size, and the offset calculation for NIC stats is changed to match with the calculation of the NIC. Cc: stable@vger.kernel.org Fixes: 24aeb56f2d38 ("gve: Add Gvnic stats AQ command and ethtool show/set-priv-flags.") Signed-off-by: Debarghya Kundu Reviewed-by: Joshua Washington Signed-off-by: Harshitha Ramamurthy Reviewed-by: Jacob Keller Link: https://patch.msgid.link/20260202193925.3106272-2-hramamurthy@google.com Signed-off-by: Jakub Kicinski --- drivers/net/ethernet/google/gve/gve_ethtool.c | 54 ++++++++++++------- drivers/net/ethernet/google/gve/gve_main.c | 4 +- 2 files changed, 36 insertions(+), 22 deletions(-) diff --git a/drivers/net/ethernet/google/gve/gve_ethtool.c b/drivers/net/ethernet/google/gve/gve_ethtool.c index 311b106160b2..137dd7285bda 100644 --- a/drivers/net/ethernet/google/gve/gve_ethtool.c +++ b/drivers/net/ethernet/google/gve/gve_ethtool.c @@ -156,7 +156,8 @@ gve_get_ethtool_stats(struct net_device *netdev, u64 rx_buf_alloc_fail, rx_desc_err_dropped_pkt, rx_hsplit_unsplit_pkt, rx_pkts, rx_hsplit_pkt, rx_skb_alloc_fail, rx_bytes, tx_pkts, tx_bytes, tx_dropped; - int stats_idx, base_stats_idx, max_stats_idx; + int rx_base_stats_idx, max_rx_stats_idx, max_tx_stats_idx; + int stats_idx, stats_region_len, nic_stats_len; struct stats *report_stats; int *rx_qid_to_stats_idx; int *tx_qid_to_stats_idx; @@ -265,20 +266,38 @@ gve_get_ethtool_stats(struct net_device *netdev, data[i++] = priv->stats_report_trigger_cnt; i = GVE_MAIN_STATS_LEN; - /* For rx cross-reporting stats, start from nic rx stats in report */ - base_stats_idx = GVE_TX_STATS_REPORT_NUM * num_tx_queues + - GVE_RX_STATS_REPORT_NUM * priv->rx_cfg.num_queues; - /* The boundary between driver stats and NIC stats shifts if there are - * stopped queues. - */ - base_stats_idx += NIC_RX_STATS_REPORT_NUM * num_stopped_rxqs + - NIC_TX_STATS_REPORT_NUM * num_stopped_txqs; - max_stats_idx = NIC_RX_STATS_REPORT_NUM * - (priv->rx_cfg.num_queues - num_stopped_rxqs) + - base_stats_idx; + rx_base_stats_idx = 0; + max_rx_stats_idx = 0; + max_tx_stats_idx = 0; + stats_region_len = priv->stats_report_len - + sizeof(struct gve_stats_report); + nic_stats_len = (NIC_RX_STATS_REPORT_NUM * priv->rx_cfg.num_queues + + NIC_TX_STATS_REPORT_NUM * num_tx_queues) * sizeof(struct stats); + if (unlikely((stats_region_len - + nic_stats_len) % sizeof(struct stats))) { + net_err_ratelimited("Starting index of NIC stats should be multiple of stats size"); + } else { + /* For rx cross-reporting stats, + * start from nic rx stats in report + */ + rx_base_stats_idx = (stats_region_len - nic_stats_len) / + sizeof(struct stats); + /* The boundary between driver stats and NIC stats + * shifts if there are stopped queues + */ + rx_base_stats_idx += NIC_RX_STATS_REPORT_NUM * + num_stopped_rxqs + NIC_TX_STATS_REPORT_NUM * + num_stopped_txqs; + max_rx_stats_idx = NIC_RX_STATS_REPORT_NUM * + (priv->rx_cfg.num_queues - num_stopped_rxqs) + + rx_base_stats_idx; + max_tx_stats_idx = NIC_TX_STATS_REPORT_NUM * + (num_tx_queues - num_stopped_txqs) + + max_rx_stats_idx; + } /* Preprocess the stats report for rx, map queue id to start index */ skip_nic_stats = false; - for (stats_idx = base_stats_idx; stats_idx < max_stats_idx; + for (stats_idx = rx_base_stats_idx; stats_idx < max_rx_stats_idx; stats_idx += NIC_RX_STATS_REPORT_NUM) { u32 stat_name = be32_to_cpu(report_stats[stats_idx].stat_name); u32 queue_id = be32_to_cpu(report_stats[stats_idx].queue_id); @@ -354,14 +373,9 @@ gve_get_ethtool_stats(struct net_device *netdev, i += priv->rx_cfg.num_queues * NUM_GVE_RX_CNTS; } - /* For tx cross-reporting stats, start from nic tx stats in report */ - base_stats_idx = max_stats_idx; - max_stats_idx = NIC_TX_STATS_REPORT_NUM * - (num_tx_queues - num_stopped_txqs) + - max_stats_idx; - /* Preprocess the stats report for tx, map queue id to start index */ skip_nic_stats = false; - for (stats_idx = base_stats_idx; stats_idx < max_stats_idx; + /* NIC TX stats start right after NIC RX stats */ + for (stats_idx = max_rx_stats_idx; stats_idx < max_tx_stats_idx; stats_idx += NIC_TX_STATS_REPORT_NUM) { u32 stat_name = be32_to_cpu(report_stats[stats_idx].stat_name); u32 queue_id = be32_to_cpu(report_stats[stats_idx].queue_id); diff --git a/drivers/net/ethernet/google/gve/gve_main.c b/drivers/net/ethernet/google/gve/gve_main.c index 52c5e4942cd4..dbc84de39b70 100644 --- a/drivers/net/ethernet/google/gve/gve_main.c +++ b/drivers/net/ethernet/google/gve/gve_main.c @@ -283,9 +283,9 @@ static int gve_alloc_stats_report(struct gve_priv *priv) int tx_stats_num, rx_stats_num; tx_stats_num = (GVE_TX_STATS_REPORT_NUM + NIC_TX_STATS_REPORT_NUM) * - gve_num_tx_queues(priv); + priv->tx_cfg.max_queues; rx_stats_num = (GVE_RX_STATS_REPORT_NUM + NIC_RX_STATS_REPORT_NUM) * - priv->rx_cfg.num_queues; + priv->rx_cfg.max_queues; priv->stats_report_len = struct_size(priv->stats_report, stats, size_add(tx_stats_num, rx_stats_num)); priv->stats_report = From c7db85d579a1dccb624235534508c75fbf2dfe46 Mon Sep 17 00:00:00 2001 From: Max Yuan Date: Mon, 2 Feb 2026 19:39:25 +0000 Subject: [PATCH 123/167] gve: Correct ethtool rx_dropped calculation The gve driver's "rx_dropped" statistic, exposed via `ethtool -S`, incorrectly includes `rx_buf_alloc_fail` counts. These failures represent an inability to allocate receive buffers, not true packet drops where a received packet is discarded. This misrepresentation can lead to inaccurate diagnostics. This patch rectifies the ethtool "rx_dropped" calculation. It removes `rx_buf_alloc_fail` from the total and adds `xdp_tx_errors` and `xdp_redirect_errors`, which represent legitimate packet drops within the XDP path. Cc: stable@vger.kernel.org Fixes: 433e274b8f7b ("gve: Add stats for gve.") Signed-off-by: Max Yuan Reviewed-by: Jordan Rhee Reviewed-by: Joshua Washington Reviewed-by: Matt Olson Signed-off-by: Harshitha Ramamurthy Reviewed-by: Jacob Keller Link: https://patch.msgid.link/20260202193925.3106272-3-hramamurthy@google.com Signed-off-by: Jakub Kicinski --- drivers/net/ethernet/google/gve/gve_ethtool.c | 23 ++++++++++++++----- 1 file changed, 17 insertions(+), 6 deletions(-) diff --git a/drivers/net/ethernet/google/gve/gve_ethtool.c b/drivers/net/ethernet/google/gve/gve_ethtool.c index 137dd7285bda..66ddc4413f8d 100644 --- a/drivers/net/ethernet/google/gve/gve_ethtool.c +++ b/drivers/net/ethernet/google/gve/gve_ethtool.c @@ -152,10 +152,11 @@ gve_get_ethtool_stats(struct net_device *netdev, u64 tmp_rx_pkts, tmp_rx_hsplit_pkt, tmp_rx_bytes, tmp_rx_hsplit_bytes, tmp_rx_skb_alloc_fail, tmp_rx_buf_alloc_fail, tmp_rx_desc_err_dropped_pkt, tmp_rx_hsplit_unsplit_pkt, - tmp_tx_pkts, tmp_tx_bytes; + tmp_tx_pkts, tmp_tx_bytes, + tmp_xdp_tx_errors, tmp_xdp_redirect_errors; u64 rx_buf_alloc_fail, rx_desc_err_dropped_pkt, rx_hsplit_unsplit_pkt, rx_pkts, rx_hsplit_pkt, rx_skb_alloc_fail, rx_bytes, tx_pkts, tx_bytes, - tx_dropped; + tx_dropped, xdp_tx_errors, xdp_redirect_errors; int rx_base_stats_idx, max_rx_stats_idx, max_tx_stats_idx; int stats_idx, stats_region_len, nic_stats_len; struct stats *report_stats; @@ -199,6 +200,7 @@ gve_get_ethtool_stats(struct net_device *netdev, for (rx_pkts = 0, rx_bytes = 0, rx_hsplit_pkt = 0, rx_skb_alloc_fail = 0, rx_buf_alloc_fail = 0, rx_desc_err_dropped_pkt = 0, rx_hsplit_unsplit_pkt = 0, + xdp_tx_errors = 0, xdp_redirect_errors = 0, ring = 0; ring < priv->rx_cfg.num_queues; ring++) { if (priv->rx) { @@ -216,6 +218,9 @@ gve_get_ethtool_stats(struct net_device *netdev, rx->rx_desc_err_dropped_pkt; tmp_rx_hsplit_unsplit_pkt = rx->rx_hsplit_unsplit_pkt; + tmp_xdp_tx_errors = rx->xdp_tx_errors; + tmp_xdp_redirect_errors = + rx->xdp_redirect_errors; } while (u64_stats_fetch_retry(&priv->rx[ring].statss, start)); rx_pkts += tmp_rx_pkts; @@ -225,6 +230,8 @@ gve_get_ethtool_stats(struct net_device *netdev, rx_buf_alloc_fail += tmp_rx_buf_alloc_fail; rx_desc_err_dropped_pkt += tmp_rx_desc_err_dropped_pkt; rx_hsplit_unsplit_pkt += tmp_rx_hsplit_unsplit_pkt; + xdp_tx_errors += tmp_xdp_tx_errors; + xdp_redirect_errors += tmp_xdp_redirect_errors; } } for (tx_pkts = 0, tx_bytes = 0, tx_dropped = 0, ring = 0; @@ -250,8 +257,8 @@ gve_get_ethtool_stats(struct net_device *netdev, data[i++] = rx_bytes; data[i++] = tx_bytes; /* total rx dropped packets */ - data[i++] = rx_skb_alloc_fail + rx_buf_alloc_fail + - rx_desc_err_dropped_pkt; + data[i++] = rx_skb_alloc_fail + rx_desc_err_dropped_pkt + + xdp_tx_errors + xdp_redirect_errors; data[i++] = tx_dropped; data[i++] = priv->tx_timeo_cnt; data[i++] = rx_skb_alloc_fail; @@ -330,6 +337,9 @@ gve_get_ethtool_stats(struct net_device *netdev, tmp_rx_buf_alloc_fail = rx->rx_buf_alloc_fail; tmp_rx_desc_err_dropped_pkt = rx->rx_desc_err_dropped_pkt; + tmp_xdp_tx_errors = rx->xdp_tx_errors; + tmp_xdp_redirect_errors = + rx->xdp_redirect_errors; } while (u64_stats_fetch_retry(&priv->rx[ring].statss, start)); data[i++] = tmp_rx_bytes; @@ -340,8 +350,9 @@ gve_get_ethtool_stats(struct net_device *netdev, data[i++] = rx->rx_frag_alloc_cnt; /* rx dropped packets */ data[i++] = tmp_rx_skb_alloc_fail + - tmp_rx_buf_alloc_fail + - tmp_rx_desc_err_dropped_pkt; + tmp_rx_desc_err_dropped_pkt + + tmp_xdp_tx_errors + + tmp_xdp_redirect_errors; data[i++] = rx->rx_copybreak_pkt; data[i++] = rx->rx_copied_pkt; /* stats from NIC */ From c0b5dc73a38f954e780f93a549b8fe225235c07a Mon Sep 17 00:00:00 2001 From: Kevin Hao Date: Tue, 3 Feb 2026 10:18:30 +0800 Subject: [PATCH 124/167] net: cpsw_new: Execute ndo_set_rx_mode callback in a work queue Commit 1767bb2d47b7 ("ipv6: mcast: Don't hold RTNL for IPV6_ADD_MEMBERSHIP and MCAST_JOIN_GROUP.") removed the RTNL lock for IPV6_ADD_MEMBERSHIP and MCAST_JOIN_GROUP operations. However, this change triggered the following call trace on my BeagleBone Black board: WARNING: net/8021q/vlan_core.c:236 at vlan_for_each+0x120/0x124, CPU#0: rpcbind/496 RTNL: assertion failed at net/8021q/vlan_core.c (236) Modules linked in: CPU: 0 UID: 997 PID: 496 Comm: rpcbind Not tainted 6.19.0-rc6-next-20260122-yocto-standard+ #8 PREEMPT Hardware name: Generic AM33XX (Flattened Device Tree) Call trace: unwind_backtrace from show_stack+0x28/0x2c show_stack from dump_stack_lvl+0x30/0x38 dump_stack_lvl from __warn+0xb8/0x11c __warn from warn_slowpath_fmt+0x130/0x194 warn_slowpath_fmt from vlan_for_each+0x120/0x124 vlan_for_each from cpsw_add_mc_addr+0x54/0xd8 cpsw_add_mc_addr from __hw_addr_ref_sync_dev+0xc4/0xec __hw_addr_ref_sync_dev from __dev_mc_add+0x78/0x88 __dev_mc_add from igmp6_group_added+0x84/0xec igmp6_group_added from __ipv6_dev_mc_inc+0x1fc/0x2f0 __ipv6_dev_mc_inc from __ipv6_sock_mc_join+0x124/0x1b4 __ipv6_sock_mc_join from do_ipv6_setsockopt+0x84c/0x1168 do_ipv6_setsockopt from ipv6_setsockopt+0x88/0xc8 ipv6_setsockopt from do_sock_setsockopt+0xe8/0x19c do_sock_setsockopt from __sys_setsockopt+0x84/0xac __sys_setsockopt from ret_fast_syscall+0x0/0x5 This trace occurs because vlan_for_each() is called within cpsw_ndo_set_rx_mode(), which expects the RTNL lock to be held. Since modifying vlan_for_each() to operate without the RTNL lock is not straightforward, and because ndo_set_rx_mode() is invoked both with and without the RTNL lock across different code paths, simply adding rtnl_lock() in cpsw_ndo_set_rx_mode() is not a viable solution. To resolve this issue, we opt to execute the actual processing within a work queue, following the approach used by the icssg-prueth driver. Fixes: 1767bb2d47b7 ("ipv6: mcast: Don't hold RTNL for IPV6_ADD_MEMBERSHIP and MCAST_JOIN_GROUP.") Signed-off-by: Kevin Hao Cc: stable@vger.kernel.org Link: https://patch.msgid.link/20260203-bbb-v5-1-ea0ea217a85c@gmail.com Signed-off-by: Jakub Kicinski --- drivers/net/ethernet/ti/cpsw_new.c | 34 ++++++++++++++++++++++++----- drivers/net/ethernet/ti/cpsw_priv.h | 1 + 2 files changed, 30 insertions(+), 5 deletions(-) diff --git a/drivers/net/ethernet/ti/cpsw_new.c b/drivers/net/ethernet/ti/cpsw_new.c index ab88d4c02cbd..21af0a10626a 100644 --- a/drivers/net/ethernet/ti/cpsw_new.c +++ b/drivers/net/ethernet/ti/cpsw_new.c @@ -248,16 +248,22 @@ static int cpsw_purge_all_mc(struct net_device *ndev, const u8 *addr, int num) return 0; } -static void cpsw_ndo_set_rx_mode(struct net_device *ndev) +static void cpsw_ndo_set_rx_mode_work(struct work_struct *work) { - struct cpsw_priv *priv = netdev_priv(ndev); + struct cpsw_priv *priv = container_of(work, struct cpsw_priv, rx_mode_work); struct cpsw_common *cpsw = priv->cpsw; + struct net_device *ndev = priv->ndev; + rtnl_lock(); + if (!netif_running(ndev)) + goto unlock_rtnl; + + netif_addr_lock_bh(ndev); if (ndev->flags & IFF_PROMISC) { /* Enable promiscuous mode */ cpsw_set_promiscious(ndev, true); cpsw_ale_set_allmulti(cpsw->ale, IFF_ALLMULTI, priv->emac_port); - return; + goto unlock_addr; } /* Disable promiscuous mode */ @@ -270,6 +276,18 @@ static void cpsw_ndo_set_rx_mode(struct net_device *ndev) /* add/remove mcast address either for real netdev or for vlan */ __hw_addr_ref_sync_dev(&ndev->mc, ndev, cpsw_add_mc_addr, cpsw_del_mc_addr); + +unlock_addr: + netif_addr_unlock_bh(ndev); +unlock_rtnl: + rtnl_unlock(); +} + +static void cpsw_ndo_set_rx_mode(struct net_device *ndev) +{ + struct cpsw_priv *priv = netdev_priv(ndev); + + schedule_work(&priv->rx_mode_work); } static unsigned int cpsw_rxbuf_total_len(unsigned int len) @@ -1398,6 +1416,7 @@ static int cpsw_create_ports(struct cpsw_common *cpsw) priv->msg_enable = netif_msg_init(debug_level, CPSW_DEBUG); priv->emac_port = i + 1; priv->tx_packet_min = CPSW_MIN_PACKET_SIZE; + INIT_WORK(&priv->rx_mode_work, cpsw_ndo_set_rx_mode_work); if (is_valid_ether_addr(slave_data->mac_addr)) { ether_addr_copy(priv->mac_addr, slave_data->mac_addr); @@ -1447,13 +1466,18 @@ static int cpsw_create_ports(struct cpsw_common *cpsw) static void cpsw_unregister_ports(struct cpsw_common *cpsw) { + struct net_device *ndev; + struct cpsw_priv *priv; int i = 0; for (i = 0; i < cpsw->data.slaves; i++) { - if (!cpsw->slaves[i].ndev) + ndev = cpsw->slaves[i].ndev; + if (!ndev) continue; - unregister_netdev(cpsw->slaves[i].ndev); + priv = netdev_priv(ndev); + unregister_netdev(ndev); + disable_work_sync(&priv->rx_mode_work); } } diff --git a/drivers/net/ethernet/ti/cpsw_priv.h b/drivers/net/ethernet/ti/cpsw_priv.h index 91add8925e23..acb6181c5c9e 100644 --- a/drivers/net/ethernet/ti/cpsw_priv.h +++ b/drivers/net/ethernet/ti/cpsw_priv.h @@ -391,6 +391,7 @@ struct cpsw_priv { u32 tx_packet_min; struct cpsw_ale_ratelimit ale_bc_ratelimit; struct cpsw_ale_ratelimit ale_mc_ratelimit; + struct work_struct rx_mode_work; }; #define ndev_to_cpsw(ndev) (((struct cpsw_priv *)netdev_priv(ndev))->cpsw) From 0b8c878d117319f2be34c8391a77e0f4d5c94d79 Mon Sep 17 00:00:00 2001 From: Kevin Hao Date: Tue, 3 Feb 2026 10:18:31 +0800 Subject: [PATCH 125/167] net: cpsw: Execute ndo_set_rx_mode callback in a work queue Commit 1767bb2d47b7 ("ipv6: mcast: Don't hold RTNL for IPV6_ADD_MEMBERSHIP and MCAST_JOIN_GROUP.") removed the RTNL lock for IPV6_ADD_MEMBERSHIP and MCAST_JOIN_GROUP operations. However, this change triggered the following call trace on my BeagleBone Black board: WARNING: net/8021q/vlan_core.c:236 at vlan_for_each+0x120/0x124, CPU#0: rpcbind/481 RTNL: assertion failed at net/8021q/vlan_core.c (236) Modules linked in: CPU: 0 UID: 997 PID: 481 Comm: rpcbind Not tainted 6.19.0-rc7-next-20260130-yocto-standard+ #35 PREEMPT Hardware name: Generic AM33XX (Flattened Device Tree) Call trace: unwind_backtrace from show_stack+0x28/0x2c show_stack from dump_stack_lvl+0x30/0x38 dump_stack_lvl from __warn+0xb8/0x11c __warn from warn_slowpath_fmt+0x130/0x194 warn_slowpath_fmt from vlan_for_each+0x120/0x124 vlan_for_each from cpsw_add_mc_addr+0x54/0x98 cpsw_add_mc_addr from __hw_addr_ref_sync_dev+0xc4/0xec __hw_addr_ref_sync_dev from __dev_mc_add+0x78/0x88 __dev_mc_add from igmp6_group_added+0x84/0xec igmp6_group_added from __ipv6_dev_mc_inc+0x1fc/0x2f0 __ipv6_dev_mc_inc from __ipv6_sock_mc_join+0x124/0x1b4 __ipv6_sock_mc_join from do_ipv6_setsockopt+0x84c/0x1168 do_ipv6_setsockopt from ipv6_setsockopt+0x88/0xc8 ipv6_setsockopt from do_sock_setsockopt+0xe8/0x19c do_sock_setsockopt from __sys_setsockopt+0x84/0xac __sys_setsockopt from ret_fast_syscall+0x0/0x54 This trace occurs because vlan_for_each() is called within cpsw_ndo_set_rx_mode(), which expects the RTNL lock to be held. Since modifying vlan_for_each() to operate without the RTNL lock is not straightforward, and because ndo_set_rx_mode() is invoked both with and without the RTNL lock across different code paths, simply adding rtnl_lock() in cpsw_ndo_set_rx_mode() is not a viable solution. To resolve this issue, we opt to execute the actual processing within a work queue, following the approach used by the icssg-prueth driver. Please note: To reproduce this issue, I manually reverted the changes to am335x-bone-common.dtsi from commit c477358e66a3 ("ARM: dts: am335x-bone: switch to new cpsw switch drv") in order to revert to the legacy cpsw driver. Fixes: 1767bb2d47b7 ("ipv6: mcast: Don't hold RTNL for IPV6_ADD_MEMBERSHIP and MCAST_JOIN_GROUP.") Signed-off-by: Kevin Hao Cc: stable@vger.kernel.org Link: https://patch.msgid.link/20260203-bbb-v5-2-ea0ea217a85c@gmail.com Signed-off-by: Jakub Kicinski --- drivers/net/ethernet/ti/cpsw.c | 41 +++++++++++++++++++++++++++++----- 1 file changed, 35 insertions(+), 6 deletions(-) diff --git a/drivers/net/ethernet/ti/cpsw.c b/drivers/net/ethernet/ti/cpsw.c index 54c24cd3d3be..b0e18bdc2c85 100644 --- a/drivers/net/ethernet/ti/cpsw.c +++ b/drivers/net/ethernet/ti/cpsw.c @@ -305,12 +305,19 @@ static int cpsw_purge_all_mc(struct net_device *ndev, const u8 *addr, int num) return 0; } -static void cpsw_ndo_set_rx_mode(struct net_device *ndev) +static void cpsw_ndo_set_rx_mode_work(struct work_struct *work) { - struct cpsw_priv *priv = netdev_priv(ndev); + struct cpsw_priv *priv = container_of(work, struct cpsw_priv, rx_mode_work); struct cpsw_common *cpsw = priv->cpsw; + struct net_device *ndev = priv->ndev; int slave_port = -1; + rtnl_lock(); + if (!netif_running(ndev)) + goto unlock_rtnl; + + netif_addr_lock_bh(ndev); + if (cpsw->data.dual_emac) slave_port = priv->emac_port + 1; @@ -318,7 +325,7 @@ static void cpsw_ndo_set_rx_mode(struct net_device *ndev) /* Enable promiscuous mode */ cpsw_set_promiscious(ndev, true); cpsw_ale_set_allmulti(cpsw->ale, IFF_ALLMULTI, slave_port); - return; + goto unlock_addr; } else { /* Disable promiscuous mode */ cpsw_set_promiscious(ndev, false); @@ -331,6 +338,18 @@ static void cpsw_ndo_set_rx_mode(struct net_device *ndev) /* add/remove mcast address either for real netdev or for vlan */ __hw_addr_ref_sync_dev(&ndev->mc, ndev, cpsw_add_mc_addr, cpsw_del_mc_addr); + +unlock_addr: + netif_addr_unlock_bh(ndev); +unlock_rtnl: + rtnl_unlock(); +} + +static void cpsw_ndo_set_rx_mode(struct net_device *ndev) +{ + struct cpsw_priv *priv = netdev_priv(ndev); + + schedule_work(&priv->rx_mode_work); } static unsigned int cpsw_rxbuf_total_len(unsigned int len) @@ -1472,6 +1491,7 @@ static int cpsw_probe_dual_emac(struct cpsw_priv *priv) priv_sl2->ndev = ndev; priv_sl2->dev = &ndev->dev; priv_sl2->msg_enable = netif_msg_init(debug_level, CPSW_DEBUG); + INIT_WORK(&priv_sl2->rx_mode_work, cpsw_ndo_set_rx_mode_work); if (is_valid_ether_addr(data->slave_data[1].mac_addr)) { memcpy(priv_sl2->mac_addr, data->slave_data[1].mac_addr, @@ -1653,6 +1673,7 @@ static int cpsw_probe(struct platform_device *pdev) priv->dev = dev; priv->msg_enable = netif_msg_init(debug_level, CPSW_DEBUG); priv->emac_port = 0; + INIT_WORK(&priv->rx_mode_work, cpsw_ndo_set_rx_mode_work); if (is_valid_ether_addr(data->slave_data[0].mac_addr)) { memcpy(priv->mac_addr, data->slave_data[0].mac_addr, ETH_ALEN); @@ -1758,6 +1779,8 @@ static int cpsw_probe(struct platform_device *pdev) static void cpsw_remove(struct platform_device *pdev) { struct cpsw_common *cpsw = platform_get_drvdata(pdev); + struct net_device *ndev; + struct cpsw_priv *priv; int i, ret; ret = pm_runtime_resume_and_get(&pdev->dev); @@ -1770,9 +1793,15 @@ static void cpsw_remove(struct platform_device *pdev) return; } - for (i = 0; i < cpsw->data.slaves; i++) - if (cpsw->slaves[i].ndev) - unregister_netdev(cpsw->slaves[i].ndev); + for (i = 0; i < cpsw->data.slaves; i++) { + ndev = cpsw->slaves[i].ndev; + if (!ndev) + continue; + + priv = netdev_priv(ndev); + unregister_netdev(ndev); + disable_work_sync(&priv->rx_mode_work); + } cpts_release(cpsw->cpts); cpdma_ctlr_destroy(cpsw->dma); From 0e0c8f4d16de92520623aa1ea485cadbf64e6929 Mon Sep 17 00:00:00 2001 From: Jacob Keller Date: Mon, 2 Feb 2026 16:16:39 -0800 Subject: [PATCH 126/167] drm/mgag200: fix mgag200_bmc_stop_scanout() The mgag200_bmc_stop_scanout() function is called by the .atomic_disable() handler for the MGA G200 VGA BMC encoder. This function performs a few register writes to inform the BMC of an upcoming mode change, and then polls to wait until the BMC actually stops. The polling is implemented using a busy loop with udelay() and an iteration timeout of 300, resulting in the function blocking for 300 milliseconds. The function gets called ultimately by the output_poll_execute work thread for the DRM output change polling thread of the mgag200 driver: kworker/0:0-mm_ 3528 [000] 4555.315364: ffffffffaa0e25b3 delay_halt.part.0+0x33 ffffffffc03f6188 mgag200_bmc_stop_scanout+0x178 ffffffffc087ae7a disable_outputs+0x12a ffffffffc087c12a drm_atomic_helper_commit_tail+0x1a ffffffffc03fa7b6 mgag200_mode_config_helper_atomic_commit_tail+0x26 ffffffffc087c9c1 commit_tail+0x91 ffffffffc087d51b drm_atomic_helper_commit+0x11b ffffffffc0509694 drm_atomic_commit+0xa4 ffffffffc05105e8 drm_client_modeset_commit_atomic+0x1e8 ffffffffc0510ce6 drm_client_modeset_commit_locked+0x56 ffffffffc0510e24 drm_client_modeset_commit+0x24 ffffffffc088a743 __drm_fb_helper_restore_fbdev_mode_unlocked+0x93 ffffffffc088a683 drm_fb_helper_hotplug_event+0xe3 ffffffffc050f8aa drm_client_dev_hotplug+0x9a ffffffffc088555a output_poll_execute+0x29a ffffffffa9b35924 process_one_work+0x194 ffffffffa9b364ee worker_thread+0x2fe ffffffffa9b3ecad kthread+0xdd ffffffffa9a08549 ret_from_fork+0x29 On a server running ptp4l with the mgag200 driver loaded, we found that ptp4l would sometimes get blocked from execution because of this busy waiting loop. Every so often, approximately once every 20 minutes -- though with large variance -- the output_poll_execute() thread would detect some sort of change that required performing a hotplug event which results in attempting to stop the BMC scanout, resulting in a 300msec delay on one CPU. On this system, ptp4l was pinned to a single CPU. When the output_poll_execute() thread ran on that CPU, it blocked ptp4l from executing for its 300 millisecond duration. This resulted in PTP service disruptions such as failure to send a SYNC message on time, failure to handle ANNOUNCE messages on time, and clock check warnings from the application. All of this despite the application being configured with FIFO_RT and a higher priority than the background workqueue tasks. (However, note that the kernel did not use CONFIG_PREEMPT...) It is unclear if the event is due to a faulty VGA connection, another bug, or actual events causing a change in the connection. At least on the system under test it is not a one-time event and consistently causes disruption to the time sensitive applications. The function has some helpful comments explaining what steps it is attempting to take. In particular, step 3a and 3b are explained as such: 3a - The third step is to verify if there is an active scan. We are waiting on a 0 on remhsyncsts (. 3b - This step occurs only if the remove is actually scanning. We are waiting for the end of the frame which is a 1 on remvsyncsts (). The actual steps 3a and 3b are implemented as while loops with a non-sleeping udelay(). The first step iterates while the tmp value at position 0 is *not* set. That is, it keeps iterating as long as the bit is zero. If the bit is already 0 (because there is no active scan), it will iterate the entire 300 attempts which wastes 300 milliseconds in total. This is opposite of what the description claims. The step 3b logic only executes if we do not iterate over the entire 300 attempts in the first loop. If it does trigger, it is trying to check and wait for a 1 on the remvsyncsts. However, again the condition is actually inverted and it will loop as long as the bit is 1, stopping once it hits zero (rather than the explained attempt to wait until we see a 1). Worse, both loops are implemented using non-sleeping waits which spin instead of allowing the scheduler to run other processes. If the kernel is not configured to allow arbitrary preemption, it will waste valuable CPU time doing nothing. There does not appear to be any documentation for the BMC register interface, beyond what is in the comments here. It seems more probable that the comment here is correct and the implementation accidentally got inverted from the intended logic. Reading through other DRM driver implementations, it does not appear that the .atomic_enable or .atomic_disable handlers need to delay instead of sleep. For example, the ast_astdp_encoder_helper_atomic_disable() function calls ast_dp_set_phy_sleep() which uses msleep(). The "atomic" in the name is referring to the atomic modesetting support, which is the support to enable atomic configuration from userspace, and not to the "atomic context" of the kernel. There is no reason to use udelay() here if a sleep would be sufficient. Replace the while loops with a read_poll_timeout() based implementation that will sleep between iterations, and which stops polling once the condition is met (instead of looping as long as the condition is met). This aligns with the commented behavior and avoids blocking on the CPU while doing nothing. Note the RREG_DAC is implemented using a statement expression to allow working properly with the read_poll_timeout family of functions. The other RREG_ macros ought to be cleaned up to have better semantics, and several places in the mgag200 driver could make use of RREG_DAC or similar RREG_* macros should likely be cleaned up for better semantics as well, but that task has been left as a future cleanup for a non-bugfix. Fixes: 414c45310625 ("mgag200: initial g200se driver (v2)") Suggested-by: Thomas Zimmermann Signed-off-by: Jacob Keller Reviewed-by: Thomas Zimmermann Reviewed-by: Jocelyn Falempe Signed-off-by: Thomas Zimmermann Link: https://patch.msgid.link/20260202-jk-mgag200-fix-bad-udelay-v2-1-ce1e9665987d@intel.com --- drivers/gpu/drm/mgag200/mgag200_bmc.c | 31 +++++++++++---------------- drivers/gpu/drm/mgag200/mgag200_drv.h | 6 ++++++ 2 files changed, 18 insertions(+), 19 deletions(-) diff --git a/drivers/gpu/drm/mgag200/mgag200_bmc.c b/drivers/gpu/drm/mgag200/mgag200_bmc.c index a689c71ff165..bbdeb791c5b3 100644 --- a/drivers/gpu/drm/mgag200/mgag200_bmc.c +++ b/drivers/gpu/drm/mgag200/mgag200_bmc.c @@ -1,6 +1,7 @@ // SPDX-License-Identifier: GPL-2.0-only #include +#include #include #include @@ -12,7 +13,7 @@ void mgag200_bmc_stop_scanout(struct mga_device *mdev) { u8 tmp; - int iter_max; + int ret; /* * 1 - The first step is to inform the BMC of an upcoming mode @@ -42,30 +43,22 @@ void mgag200_bmc_stop_scanout(struct mga_device *mdev) /* * 3a- The third step is to verify if there is an active scan. - * We are waiting for a 0 on remhsyncsts ). + * We are waiting for a 0 on remhsyncsts (). */ - iter_max = 300; - while (!(tmp & 0x1) && iter_max) { - WREG8(DAC_INDEX, MGA1064_SPAREREG); - tmp = RREG8(DAC_DATA); - udelay(1000); - iter_max--; - } + ret = read_poll_timeout(RREG_DAC, tmp, !(tmp & 0x1), + 1000, 300000, false, + MGA1064_SPAREREG); + if (ret == -ETIMEDOUT) + return; /* - * 3b- This step occurs only if the remove is actually + * 3b- This step occurs only if the remote BMC is actually * scanning. We are waiting for the end of the frame which is * a 1 on remvsyncsts (XSPAREREG<1>) */ - if (iter_max) { - iter_max = 300; - while ((tmp & 0x2) && iter_max) { - WREG8(DAC_INDEX, MGA1064_SPAREREG); - tmp = RREG8(DAC_DATA); - udelay(1000); - iter_max--; - } - } + (void)read_poll_timeout(RREG_DAC, tmp, (tmp & 0x2), + 1000, 300000, false, + MGA1064_SPAREREG); } void mgag200_bmc_start_scanout(struct mga_device *mdev) diff --git a/drivers/gpu/drm/mgag200/mgag200_drv.h b/drivers/gpu/drm/mgag200/mgag200_drv.h index f4bf40cd7c88..a875c4bf8cbe 100644 --- a/drivers/gpu/drm/mgag200/mgag200_drv.h +++ b/drivers/gpu/drm/mgag200/mgag200_drv.h @@ -111,6 +111,12 @@ #define DAC_INDEX 0x3c00 #define DAC_DATA 0x3c0a +#define RREG_DAC(reg) \ + ({ \ + WREG8(DAC_INDEX, reg); \ + RREG8(DAC_DATA); \ + }) \ + #define WREG_DAC(reg, v) \ do { \ WREG8(DAC_INDEX, reg); \ From c62e0658d458d8f100445445c3ddb106f3824a45 Mon Sep 17 00:00:00 2001 From: Alban Bedel Date: Thu, 29 Jan 2026 15:59:44 +0100 Subject: [PATCH 127/167] gpiolib: acpi: Fix gpio count with string references Since commit 9880702d123f2 ("ACPI: property: Support using strings in reference properties") it is possible to use strings instead of local references. This work fine with single GPIO but not with arrays as acpi_gpio_package_count() didn't handle this case. Update it to handle strings like local references to cover this case as well. Signed-off-by: Alban Bedel Reviewed-by: Mika Westerberg Link: https://patch.msgid.link/20260129145944.3372777-1-alban.bedel@lht.dlh.de Signed-off-by: Bartosz Golaszewski --- drivers/gpio/gpiolib-acpi-core.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/gpio/gpiolib-acpi-core.c b/drivers/gpio/gpiolib-acpi-core.c index 9627b3a9c7f3..5e709ba35ec2 100644 --- a/drivers/gpio/gpiolib-acpi-core.c +++ b/drivers/gpio/gpiolib-acpi-core.c @@ -1359,6 +1359,7 @@ static int acpi_gpio_package_count(const union acpi_object *obj) while (element < end) { switch (element->type) { case ACPI_TYPE_LOCAL_REFERENCE: + case ACPI_TYPE_STRING: element += 3; fallthrough; case ACPI_TYPE_INTEGER: From 4327fb13fa47770183c4850c35382c30ba5f939d Mon Sep 17 00:00:00 2001 From: Thomas Gleixner Date: Mon, 2 Feb 2026 10:39:40 +0100 Subject: [PATCH 128/167] sched/mmcid: Prevent live lock on task to CPU mode transition Ihor reported a BPF CI failure which turned out to be a live lock in the MM_CID management. The scenario is: A test program creates the 5th thread, which means the MM_CID users become more than the number of CPUs (four in this example), so it switches to per CPU ownership mode. At this point each live task of the program has a CID associated. Assume thread creation order assignment for simplicity. T0 CID0 runs fork() and creates T4 T1 CID1 T2 CID2 T3 CID3 T4 --- not visible yet T0 sets mm_cid::percpu = true and transfers its own CID to CPU0 where it runs on and then starts the fixup which walks through the threads to transfer the per task CIDs either to the CPU the task is running on or drop it back into the pool if the task is not on a CPU. During that T1 - T3 are free to schedule in and out before the fixup caught up with them. Going through all possible permutations with a python script revealed a few problematic cases. The most trivial one is: T1 schedules in on CPU1 and observes percpu == true, so it transfers its CID to CPU1 T1 is migrated to CPU2 and schedule in observes percpu == true, but CPU2 does not have a CID associated and T1 transferred its own to CPU1 So it has to allocate one with CPU2 runqueue lock held, but the pool is empty, so it keeps looping in mm_get_cid(). Now T0 reaches T1 in the thread walk and tries to lock the corresponding runqueue lock, which is held causing a full live lock. There is a similar scenario in the reverse direction of switching from per CPU to task mode which is way more obvious and got therefore addressed by an intermediate mode. In this mode the CIDs are marked with MM_CID_TRANSIT, which means that they are neither owned by the CPU nor by the task. When a task schedules out with a transit CID it drops the CID back into the pool making it available for others to use temporarily. Once the task which initiated the mode switch finished the fixup it clears the transit mode and the process goes back into per task ownership mode. Unfortunately this insight was not mapped back to the task to CPU mode switch as the above described scenario was not considered in the analysis. Apply the same transit mechanism to the task to CPU mode switch to handle these problematic cases correctly. As with the CPU to task transition this results in a potential temporary contention on the CID bitmap, but that's only for the time it takes to complete the transition. After that it stays in steady mode which does not touch the bitmap at all. Fixes: fbd0e71dc370 ("sched/mmcid: Provide CID ownership mode fixup functions") Closes: https://lore.kernel.org/2b7463d7-0f58-4e34-9775-6e2115cfb971@linux.dev Reported-by: Ihor Solodrai Signed-off-by: Thomas Gleixner Signed-off-by: Peter Zijlstra (Intel) Reviewed-by: Mathieu Desnoyers Link: https://patch.msgid.link/20260201192834.897115238@kernel.org --- kernel/sched/core.c | 128 ++++++++++++++++++++++++++++--------------- kernel/sched/sched.h | 4 ++ 2 files changed, 88 insertions(+), 44 deletions(-) diff --git a/kernel/sched/core.c b/kernel/sched/core.c index 045f83ad261e..1e790f25f709 100644 --- a/kernel/sched/core.c +++ b/kernel/sched/core.c @@ -10269,7 +10269,8 @@ void call_trace_sched_update_nr_running(struct rq *rq, int count) * Serialization rules: * * mm::mm_cid::mutex: Serializes fork() and exit() and therefore - * protects mm::mm_cid::users. + * protects mm::mm_cid::users and mode switch + * transitions * * mm::mm_cid::lock: Serializes mm_update_max_cids() and * mm_update_cpus_allowed(). Nests in mm_cid::mutex @@ -10285,14 +10286,61 @@ void call_trace_sched_update_nr_running(struct rq *rq, int count) * * A CID is either owned by a task (stored in task_struct::mm_cid.cid) or * by a CPU (stored in mm::mm_cid.pcpu::cid). CIDs owned by CPUs have the - * MM_CID_ONCPU bit set. During transition from CPU to task ownership mode, - * MM_CID_TRANSIT is set on the per task CIDs. When this bit is set the - * task needs to drop the CID into the pool when scheduling out. Both bits - * (ONCPU and TRANSIT) are filtered out by task_cid() when the CID is - * actually handed over to user space in the RSEQ memory. + * MM_CID_ONCPU bit set. + * + * During the transition of ownership mode, the MM_CID_TRANSIT bit is set + * on the CIDs. When this bit is set the tasks drop the CID back into the + * pool when scheduling out. + * + * Both bits (ONCPU and TRANSIT) are filtered out by task_cid() when the + * CID is actually handed over to user space in the RSEQ memory. * * Mode switching: * + * All transitions of ownership mode happen in two phases: + * + * 1) mm:mm_cid.transit contains MM_CID_TRANSIT. This is OR'ed on the CIDs + * and denotes that the CID is only temporarily owned by a task. When + * the task schedules out it drops the CID back into the pool if this + * bit is set. + * + * 2) The initiating context walks the per CPU space or the tasks to fixup + * or drop the CIDs and after completion it clears mm:mm_cid.transit. + * After that point the CIDs are strictly task or CPU owned again. + * + * This two phase transition is required to prevent CID space exhaustion + * during the transition as a direct transfer of ownership would fail: + * + * - On task to CPU mode switch if a task is scheduled in on one CPU and + * then migrated to another CPU before the fixup freed enough per task + * CIDs. + * + * - On CPU to task mode switch if two tasks are scheduled in on the same + * CPU before the fixup freed per CPU CIDs. + * + * Both scenarios can result in a live lock because sched_in() is invoked + * with runqueue lock held and loops in search of a CID and the fixup + * thread can't make progress freeing them up because it is stuck on the + * same runqueue lock. + * + * While MM_CID_TRANSIT is active during the transition phase the MM_CID + * bitmap can be contended, but that's a temporary contention bound to the + * transition period. After that everything goes back into steady state and + * nothing except fork() and exit() will touch the bitmap. This is an + * acceptable tradeoff as it completely avoids complex serialization, + * memory barriers and atomic operations for the common case. + * + * Aside of that this mechanism also ensures RT compability: + * + * - The task which runs the fixup is fully preemptible except for the + * short runqueue lock held sections. + * + * - The transient impact of the bitmap contention is only problematic + * when there is a thundering herd scenario of tasks scheduling in and + * out concurrently. There is not much which can be done about that + * except for avoiding mode switching by a proper overall system + * configuration. + * * Switching to per CPU mode happens when the user count becomes greater * than the maximum number of CIDs, which is calculated by: * @@ -10306,12 +10354,13 @@ void call_trace_sched_update_nr_running(struct rq *rq, int count) * * At the point of switching to per CPU mode the new user is not yet * visible in the system, so the task which initiated the fork() runs the - * fixup function: mm_cid_fixup_tasks_to_cpu() walks the thread list and - * either transfers each tasks owned CID to the CPU the task runs on or - * drops it into the CID pool if a task is not on a CPU at that point in - * time. Tasks which schedule in before the task walk reaches them do the - * handover in mm_cid_schedin(). When mm_cid_fixup_tasks_to_cpus() completes - * it's guaranteed that no task related to that MM owns a CID anymore. + * fixup function. mm_cid_fixup_tasks_to_cpu() walks the thread list and + * either marks each task owned CID with MM_CID_TRANSIT if the task is + * running on a CPU or drops it into the CID pool if a task is not on a + * CPU. Tasks which schedule in before the task walk reaches them do the + * handover in mm_cid_schedin(). When mm_cid_fixup_tasks_to_cpus() + * completes it is guaranteed that no task related to that MM owns a CID + * anymore. * * Switching back to task mode happens when the user count goes below the * threshold which was recorded on the per CPU mode switch: @@ -10327,28 +10376,11 @@ void call_trace_sched_update_nr_running(struct rq *rq, int count) * run either in the deferred update function in context of a workqueue or * by a task which forks a new one or by a task which exits. Whatever * happens first. mm_cid_fixup_cpus_to_task() walks through the possible - * CPUs and either transfers the CPU owned CIDs to a related task which - * runs on the CPU or drops it into the pool. Tasks which schedule in on a - * CPU which the walk did not cover yet do the handover themself. - * - * This transition from CPU to per task ownership happens in two phases: - * - * 1) mm:mm_cid.transit contains MM_CID_TRANSIT This is OR'ed on the task - * CID and denotes that the CID is only temporarily owned by the - * task. When it schedules out the task drops the CID back into the - * pool if this bit is set. - * - * 2) The initiating context walks the per CPU space and after completion - * clears mm:mm_cid.transit. So after that point the CIDs are strictly - * task owned again. - * - * This two phase transition is required to prevent CID space exhaustion - * during the transition as a direct transfer of ownership would fail if - * two tasks are scheduled in on the same CPU before the fixup freed per - * CPU CIDs. - * - * When mm_cid_fixup_cpus_to_tasks() completes it's guaranteed that no CID - * related to that MM is owned by a CPU anymore. + * CPUs and either marks the CPU owned CIDs with MM_CID_TRANSIT if a + * related task is running on the CPU or drops it into the pool. Tasks + * which are scheduled in before the fixup covered them do the handover + * themself. When mm_cid_fixup_cpus_to_tasks() completes it is guaranteed + * that no CID related to that MM is owned by a CPU anymore. */ /* @@ -10400,9 +10432,9 @@ static bool mm_update_max_cids(struct mm_struct *mm) /* Mode change required? */ if (!!mc->percpu == !!mc->pcpu_thrs) return false; - /* When switching back to per TASK mode, set the transition flag */ - if (!mc->pcpu_thrs) - WRITE_ONCE(mc->transit, MM_CID_TRANSIT); + + /* Set the transition flag to bridge the transfer */ + WRITE_ONCE(mc->transit, MM_CID_TRANSIT); WRITE_ONCE(mc->percpu, !!mc->pcpu_thrs); return true; } @@ -10493,10 +10525,10 @@ static void mm_cid_fixup_cpus_to_tasks(struct mm_struct *mm) WRITE_ONCE(mm->mm_cid.transit, 0); } -static inline void mm_cid_transfer_to_cpu(struct task_struct *t, struct mm_cid_pcpu *pcp) +static inline void mm_cid_transit_to_cpu(struct task_struct *t, struct mm_cid_pcpu *pcp) { if (cid_on_task(t->mm_cid.cid)) { - t->mm_cid.cid = cid_to_cpu_cid(t->mm_cid.cid); + t->mm_cid.cid = cid_to_transit_cid(t->mm_cid.cid); pcp->cid = t->mm_cid.cid; } } @@ -10509,18 +10541,17 @@ static bool mm_cid_fixup_task_to_cpu(struct task_struct *t, struct mm_struct *mm if (!t->mm_cid.active) return false; if (cid_on_task(t->mm_cid.cid)) { - /* If running on the CPU, transfer the CID, otherwise drop it */ + /* If running on the CPU, put the CID in transit mode, otherwise drop it */ if (task_rq(t)->curr == t) - mm_cid_transfer_to_cpu(t, per_cpu_ptr(mm->mm_cid.pcpu, task_cpu(t))); + mm_cid_transit_to_cpu(t, per_cpu_ptr(mm->mm_cid.pcpu, task_cpu(t))); else mm_unset_cid_on_task(t); } return true; } -static void mm_cid_fixup_tasks_to_cpus(void) +static void mm_cid_do_fixup_tasks_to_cpus(struct mm_struct *mm) { - struct mm_struct *mm = current->mm; struct task_struct *p, *t; unsigned int users; @@ -10558,6 +10589,15 @@ static void mm_cid_fixup_tasks_to_cpus(void) } } +static void mm_cid_fixup_tasks_to_cpus(void) +{ + struct mm_struct *mm = current->mm; + + mm_cid_do_fixup_tasks_to_cpus(mm); + /* Clear the transition bit */ + WRITE_ONCE(mm->mm_cid.transit, 0); +} + static bool sched_mm_cid_add_user(struct task_struct *t, struct mm_struct *mm) { t->mm_cid.active = 1; @@ -10596,7 +10636,7 @@ void sched_mm_cid_fork(struct task_struct *t) if (!percpu) mm_cid_transit_to_task(current, pcp); else - mm_cid_transfer_to_cpu(current, pcp); + mm_cid_transit_to_cpu(current, pcp); } if (percpu) { diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h index 93fce4bbff5e..eff207346e8e 100644 --- a/kernel/sched/sched.h +++ b/kernel/sched/sched.h @@ -3841,6 +3841,10 @@ static __always_inline void mm_cid_from_cpu(struct task_struct *t, unsigned int /* Still nothing, allocate a new one */ if (!cid_on_cpu(cpu_cid)) cpu_cid = cid_to_cpu_cid(mm_get_cid(mm)); + + /* Set the transition mode flag if required */ + if (READ_ONCE(mm->mm_cid.transit)) + cpu_cid = cpu_cid_to_cid(cpu_cid) | MM_CID_TRANSIT; } mm_cid_update_pcpu_cid(mm, cpu_cid); mm_cid_update_task_cid(t, cpu_cid); From 47ee94efccf6732e4ef1a815c451aacaf1464757 Mon Sep 17 00:00:00 2001 From: Thomas Gleixner Date: Mon, 2 Feb 2026 10:39:45 +0100 Subject: [PATCH 129/167] sched/mmcid: Protect transition on weakly ordered systems Shrikanth reported a hard lockup which he observed once. The stack trace shows the following CID related participants: watchdog: CPU 23 self-detected hard LOCKUP @ mm_get_cid+0xe8/0x188 NIP: mm_get_cid+0xe8/0x188 LR: mm_get_cid+0x108/0x188 mm_cid_switch_to+0x3c4/0x52c __schedule+0x47c/0x700 schedule_idle+0x3c/0x64 do_idle+0x160/0x1b0 cpu_startup_entry+0x48/0x50 start_secondary+0x284/0x288 start_secondary_prolog+0x10/0x14 watchdog: CPU 11 self-detected hard LOCKUP @ plpar_hcall_norets_notrace+0x18/0x2c NIP: plpar_hcall_norets_notrace+0x18/0x2c LR: queued_spin_lock_slowpath+0xd88/0x15d0 _raw_spin_lock+0x80/0xa0 raw_spin_rq_lock_nested+0x3c/0xf8 mm_cid_fixup_cpus_to_tasks+0xc8/0x28c sched_mm_cid_exit+0x108/0x22c do_exit+0xf4/0x5d0 make_task_dead+0x0/0x178 system_call_exception+0x128/0x390 system_call_vectored_common+0x15c/0x2ec The task on CPU11 is running the CID ownership mode change fixup function and is stuck on a runqueue lock. The task on CPU23 is trying to get a CID from the pool with the same runqueue lock held, but the pool is empty. After decoding a similar issue in the opposite direction switching from per task to per CPU mode the tool which models the possible scenarios failed to come up with a similar loop hole. This showed up only once, was not reproducible and according to tooling not related to a overlooked scheduling scenario permutation. But the fact that it was observed on a PowerPC system gave the right hint: PowerPC is a weakly ordered architecture. The transition mechanism does: WRITE_ONCE(mm->mm_cid.transit, MM_CID_TRANSIT); WRITE_ONCE(mm->mm_cid.percpu, new_mode); fixup() WRITE_ONCE(mm->mm_cid.transit, 0); mm_cid_schedin() does: if (!READ_ONCE(mm->mm_cid.percpu)) ... cid |= READ_ONCE(mm->mm_cid.transit); so weakly ordered systems can observe percpu == false and transit == 0 even if the fixup function has not yet completed. As a consequence the task will not drop the CID when scheduling out before the fixup is completed, which means the CID space can be exhausted and the next task scheduling in will loop in mm_get_cid() and the fixup thread can livelock on the held runqueue lock as above. This could obviously be solved by using: smp_store_release(&mm->mm_cid.percpu, true); and smp_load_acquire(&mm->mm_cid.percpu); but that brings a memory barrier back into the scheduler hotpath, which was just designed out by the CID rewrite. That can be completely avoided by combining the per CPU mode and the transit storage into a single mm_cid::mode member and ordering the stores against the fixup functions to prevent the CPU from reordering them. That makes the update of both states atomic and a concurrent read observes always consistent state. The price is an additional AND operation in mm_cid_schedin() to evaluate the per CPU or the per task path, but that's in the noise even on strongly ordered architectures as the actual load can be significantly more expensive and the conditional branch evaluation is there anyway. Fixes: fbd0e71dc370 ("sched/mmcid: Provide CID ownership mode fixup functions") Closes: https://lore.kernel.org/bdfea828-4585-40e8-8835-247c6a8a76b0@linux.ibm.com Reported-by: Shrikanth Hegde Signed-off-by: Thomas Gleixner Signed-off-by: Peter Zijlstra (Intel) Reviewed-by: Mathieu Desnoyers Link: https://patch.msgid.link/20260201192834.965217106@kernel.org --- include/linux/rseq_types.h | 6 ++-- kernel/sched/core.c | 66 +++++++++++++++++++++++++------------- kernel/sched/sched.h | 21 ++++++------ 3 files changed, 58 insertions(+), 35 deletions(-) diff --git a/include/linux/rseq_types.h b/include/linux/rseq_types.h index 332dc14b81c9..ef0811379c54 100644 --- a/include/linux/rseq_types.h +++ b/include/linux/rseq_types.h @@ -121,8 +121,7 @@ struct mm_cid_pcpu { /** * struct mm_mm_cid - Storage for per MM CID data * @pcpu: Per CPU storage for CIDs associated to a CPU - * @percpu: Set, when CIDs are in per CPU mode - * @transit: Set to MM_CID_TRANSIT during a mode change transition phase + * @mode: Indicates per CPU and transition mode * @max_cids: The exclusive maximum CID value for allocation and convergence * @irq_work: irq_work to handle the affinity mode change case * @work: Regular work to handle the affinity mode change case @@ -139,8 +138,7 @@ struct mm_cid_pcpu { struct mm_mm_cid { /* Hotpath read mostly members */ struct mm_cid_pcpu __percpu *pcpu; - unsigned int percpu; - unsigned int transit; + unsigned int mode; unsigned int max_cids; /* Rarely used. Moves @lock and @mutex into the second cacheline */ diff --git a/kernel/sched/core.c b/kernel/sched/core.c index 1e790f25f709..858028300c5f 100644 --- a/kernel/sched/core.c +++ b/kernel/sched/core.c @@ -10297,16 +10297,25 @@ void call_trace_sched_update_nr_running(struct rq *rq, int count) * * Mode switching: * + * The ownership mode is per process and stored in mm:mm_cid::mode with the + * following possible states: + * + * 0: Per task ownership + * 0 | MM_CID_TRANSIT: Transition from per CPU to per task + * MM_CID_ONCPU: Per CPU ownership + * MM_CID_ONCPU | MM_CID_TRANSIT: Transition from per task to per CPU + * * All transitions of ownership mode happen in two phases: * - * 1) mm:mm_cid.transit contains MM_CID_TRANSIT. This is OR'ed on the CIDs - * and denotes that the CID is only temporarily owned by a task. When - * the task schedules out it drops the CID back into the pool if this - * bit is set. + * 1) mm:mm_cid::mode has the MM_CID_TRANSIT bit set. This is OR'ed on the + * CIDs and denotes that the CID is only temporarily owned by a + * task. When the task schedules out it drops the CID back into the + * pool if this bit is set. * * 2) The initiating context walks the per CPU space or the tasks to fixup - * or drop the CIDs and after completion it clears mm:mm_cid.transit. - * After that point the CIDs are strictly task or CPU owned again. + * or drop the CIDs and after completion it clears MM_CID_TRANSIT in + * mm:mm_cid::mode. After that point the CIDs are strictly task or CPU + * owned again. * * This two phase transition is required to prevent CID space exhaustion * during the transition as a direct transfer of ownership would fail: @@ -10411,6 +10420,7 @@ static inline unsigned int mm_cid_calc_pcpu_thrs(struct mm_mm_cid *mc) static bool mm_update_max_cids(struct mm_struct *mm) { struct mm_mm_cid *mc = &mm->mm_cid; + bool percpu = cid_on_cpu(mc->mode); lockdep_assert_held(&mm->mm_cid.lock); @@ -10419,7 +10429,7 @@ static bool mm_update_max_cids(struct mm_struct *mm) __mm_update_max_cids(mc); /* Check whether owner mode must be changed */ - if (!mc->percpu) { + if (!percpu) { /* Enable per CPU mode when the number of users is above max_cids */ if (mc->users > mc->max_cids) mc->pcpu_thrs = mm_cid_calc_pcpu_thrs(mc); @@ -10430,12 +10440,17 @@ static bool mm_update_max_cids(struct mm_struct *mm) } /* Mode change required? */ - if (!!mc->percpu == !!mc->pcpu_thrs) + if (percpu == !!mc->pcpu_thrs) return false; - /* Set the transition flag to bridge the transfer */ - WRITE_ONCE(mc->transit, MM_CID_TRANSIT); - WRITE_ONCE(mc->percpu, !!mc->pcpu_thrs); + /* Flip the mode and set the transition flag to bridge the transfer */ + WRITE_ONCE(mc->mode, mc->mode ^ (MM_CID_TRANSIT | MM_CID_ONCPU)); + /* + * Order the store against the subsequent fixups so that + * acquire(rq::lock) cannot be reordered by the CPU before the + * store. + */ + smp_mb(); return true; } @@ -10460,7 +10475,7 @@ static inline void mm_update_cpus_allowed(struct mm_struct *mm, const struct cpu WRITE_ONCE(mc->nr_cpus_allowed, weight); __mm_update_max_cids(mc); - if (!mc->percpu) + if (!cid_on_cpu(mc->mode)) return; /* Adjust the threshold to the wider set */ @@ -10478,6 +10493,16 @@ static inline void mm_update_cpus_allowed(struct mm_struct *mm, const struct cpu irq_work_queue(&mc->irq_work); } +static inline void mm_cid_complete_transit(struct mm_struct *mm, unsigned int mode) +{ + /* + * Ensure that the store removing the TRANSIT bit cannot be + * reordered by the CPU before the fixups have been completed. + */ + smp_mb(); + WRITE_ONCE(mm->mm_cid.mode, mode); +} + static inline void mm_cid_transit_to_task(struct task_struct *t, struct mm_cid_pcpu *pcp) { if (cid_on_cpu(t->mm_cid.cid)) { @@ -10521,8 +10546,7 @@ static void mm_cid_fixup_cpus_to_tasks(struct mm_struct *mm) } } } - /* Clear the transition bit */ - WRITE_ONCE(mm->mm_cid.transit, 0); + mm_cid_complete_transit(mm, 0); } static inline void mm_cid_transit_to_cpu(struct task_struct *t, struct mm_cid_pcpu *pcp) @@ -10594,8 +10618,7 @@ static void mm_cid_fixup_tasks_to_cpus(void) struct mm_struct *mm = current->mm; mm_cid_do_fixup_tasks_to_cpus(mm); - /* Clear the transition bit */ - WRITE_ONCE(mm->mm_cid.transit, 0); + mm_cid_complete_transit(mm, MM_CID_ONCPU); } static bool sched_mm_cid_add_user(struct task_struct *t, struct mm_struct *mm) @@ -10626,13 +10649,13 @@ void sched_mm_cid_fork(struct task_struct *t) } if (!sched_mm_cid_add_user(t, mm)) { - if (!mm->mm_cid.percpu) + if (!cid_on_cpu(mm->mm_cid.mode)) t->mm_cid.cid = mm_get_cid(mm); return; } /* Handle the mode change and transfer current's CID */ - percpu = !!mm->mm_cid.percpu; + percpu = cid_on_cpu(mm->mm_cid.mode); if (!percpu) mm_cid_transit_to_task(current, pcp); else @@ -10671,7 +10694,7 @@ static bool __sched_mm_cid_exit(struct task_struct *t) * affinity change increased the number of allowed CPUs and the * deferred fixup did not run yet. */ - if (WARN_ON_ONCE(mm->mm_cid.percpu)) + if (WARN_ON_ONCE(cid_on_cpu(mm->mm_cid.mode))) return false; /* * A failed fork(2) cleanup never gets here, so @current must have @@ -10762,7 +10785,7 @@ static void mm_cid_work_fn(struct work_struct *work) if (!mm_update_max_cids(mm)) return; /* Affinity changes can only switch back to task mode */ - if (WARN_ON_ONCE(mm->mm_cid.percpu)) + if (WARN_ON_ONCE(cid_on_cpu(mm->mm_cid.mode))) return; } mm_cid_fixup_cpus_to_tasks(mm); @@ -10783,8 +10806,7 @@ static void mm_cid_irq_work(struct irq_work *work) void mm_init_cid(struct mm_struct *mm, struct task_struct *p) { mm->mm_cid.max_cids = 0; - mm->mm_cid.percpu = 0; - mm->mm_cid.transit = 0; + mm->mm_cid.mode = 0; mm->mm_cid.nr_cpus_allowed = p->nr_cpus_allowed; mm->mm_cid.users = 0; mm->mm_cid.pcpu_thrs = 0; diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h index eff207346e8e..f85fd6b81f5e 100644 --- a/kernel/sched/sched.h +++ b/kernel/sched/sched.h @@ -3816,7 +3816,8 @@ static __always_inline void mm_cid_update_pcpu_cid(struct mm_struct *mm, unsigne __this_cpu_write(mm->mm_cid.pcpu->cid, cid); } -static __always_inline void mm_cid_from_cpu(struct task_struct *t, unsigned int cpu_cid) +static __always_inline void mm_cid_from_cpu(struct task_struct *t, unsigned int cpu_cid, + unsigned int mode) { unsigned int max_cids, tcid = t->mm_cid.cid; struct mm_struct *mm = t->mm; @@ -3842,15 +3843,16 @@ static __always_inline void mm_cid_from_cpu(struct task_struct *t, unsigned int if (!cid_on_cpu(cpu_cid)) cpu_cid = cid_to_cpu_cid(mm_get_cid(mm)); - /* Set the transition mode flag if required */ - if (READ_ONCE(mm->mm_cid.transit)) + /* Handle the transition mode flag if required */ + if (mode & MM_CID_TRANSIT) cpu_cid = cpu_cid_to_cid(cpu_cid) | MM_CID_TRANSIT; } mm_cid_update_pcpu_cid(mm, cpu_cid); mm_cid_update_task_cid(t, cpu_cid); } -static __always_inline void mm_cid_from_task(struct task_struct *t, unsigned int cpu_cid) +static __always_inline void mm_cid_from_task(struct task_struct *t, unsigned int cpu_cid, + unsigned int mode) { unsigned int max_cids, tcid = t->mm_cid.cid; struct mm_struct *mm = t->mm; @@ -3876,7 +3878,7 @@ static __always_inline void mm_cid_from_task(struct task_struct *t, unsigned int if (!cid_on_task(tcid)) tcid = mm_get_cid(mm); /* Set the transition mode flag if required */ - tcid |= READ_ONCE(mm->mm_cid.transit); + tcid |= mode & MM_CID_TRANSIT; } mm_cid_update_pcpu_cid(mm, tcid); mm_cid_update_task_cid(t, tcid); @@ -3885,16 +3887,17 @@ static __always_inline void mm_cid_from_task(struct task_struct *t, unsigned int static __always_inline void mm_cid_schedin(struct task_struct *next) { struct mm_struct *mm = next->mm; - unsigned int cpu_cid; + unsigned int cpu_cid, mode; if (!next->mm_cid.active) return; cpu_cid = __this_cpu_read(mm->mm_cid.pcpu->cid); - if (likely(!READ_ONCE(mm->mm_cid.percpu))) - mm_cid_from_task(next, cpu_cid); + mode = READ_ONCE(mm->mm_cid.mode); + if (likely(!cid_on_cpu(mode))) + mm_cid_from_task(next, cpu_cid, mode); else - mm_cid_from_cpu(next, cpu_cid); + mm_cid_from_cpu(next, cpu_cid, mode); } static __always_inline void mm_cid_schedout(struct task_struct *prev) From 007d84287c7466ca68a5809b616338214dc5b77b Mon Sep 17 00:00:00 2001 From: Thomas Gleixner Date: Mon, 2 Feb 2026 10:39:50 +0100 Subject: [PATCH 130/167] sched/mmcid: Drop per CPU CID immediately when switching to per task mode When a exiting task initiates the switch from per CPU back to per task mode, it has already dropped its CID and marked itself inactive. But a leftover from an earlier iteration of the rework then reassigns the per CPU CID to the exiting task with the transition bit set. That's wrong as the task is already marked CID inactive, which means it is inconsistent state. It's harmless because the CID is marked in transit and therefore dropped back into the pool when the exiting task schedules out either through preemption or the final schedule(). Simply drop the per CPU CID when the exiting task triggered the transition. Fixes: fbd0e71dc370 ("sched/mmcid: Provide CID ownership mode fixup functions") Signed-off-by: Thomas Gleixner Signed-off-by: Peter Zijlstra (Intel) Reviewed-by: Mathieu Desnoyers Link: https://patch.msgid.link/20260201192835.032221009@kernel.org --- kernel/sched/core.c | 10 ++++++++-- 1 file changed, 8 insertions(+), 2 deletions(-) diff --git a/kernel/sched/core.c b/kernel/sched/core.c index 858028300c5f..854984967fe2 100644 --- a/kernel/sched/core.c +++ b/kernel/sched/core.c @@ -10727,8 +10727,14 @@ void sched_mm_cid_exit(struct task_struct *t) scoped_guard(raw_spinlock_irq, &mm->mm_cid.lock) { if (!__sched_mm_cid_exit(t)) return; - /* Mode change required. Transfer currents CID */ - mm_cid_transit_to_task(current, this_cpu_ptr(mm->mm_cid.pcpu)); + /* + * Mode change. The task has the CID unset + * already. The CPU CID is still valid and + * does not have MM_CID_TRANSIT set as the + * mode change has just taken effect under + * mm::mm_cid::lock. Drop it. + */ + mm_drop_cid_on_cpu(mm, this_cpu_ptr(mm->mm_cid.pcpu)); } mm_cid_fixup_cpus_to_tasks(mm); return; From 4463c7aa11a6e67169ae48c6804968960c4bffea Mon Sep 17 00:00:00 2001 From: Thomas Gleixner Date: Mon, 2 Feb 2026 10:39:55 +0100 Subject: [PATCH 131/167] sched/mmcid: Optimize transitional CIDs when scheduling out During the investigation of the various transition mode issues instrumentation revealed that the amount of bitmap operations can be significantly reduced when a task with a transitional CID schedules out after the fixup function completed and disabled the transition mode. At that point the mode is stable and therefore it is not required to drop the transitional CID back into the pool. As the fixup is complete the potential exhaustion of the CID pool is not longer possible, so the CID can be transferred to the scheduling out task or to the CPU depending on the current ownership mode. The racy snapshot of mm_cid::mode which contains both the ownership state and the transition bit is valid because runqueue lock is held and the fixup function of a concurrent mode switch is serialized. Assigning the ownership right there not only spares the bitmap access for dropping the CID it also avoids it when the task is scheduled back in as it directly hits the fast path in both modes when the CID is within the optimal range. If it's outside the range the next schedule in will need to converge so dropping it right away is sensible. In the good case this also allows to go into the fast path on the next schedule in operation. With a thread pool benchmark which is configured to cross the mode switch boundaries frequently this reduces the number of bitmap operations by about 30% and increases the fastpath utilization in the low single digit percentage range. Signed-off-by: Thomas Gleixner Signed-off-by: Peter Zijlstra (Intel) Reviewed-by: Mathieu Desnoyers Link: https://patch.msgid.link/20260201192835.100194627@kernel.org --- kernel/sched/sched.h | 23 +++++++++++++++++++++-- 1 file changed, 21 insertions(+), 2 deletions(-) diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h index f85fd6b81f5e..bd350e40859d 100644 --- a/kernel/sched/sched.h +++ b/kernel/sched/sched.h @@ -3902,12 +3902,31 @@ static __always_inline void mm_cid_schedin(struct task_struct *next) static __always_inline void mm_cid_schedout(struct task_struct *prev) { + struct mm_struct *mm = prev->mm; + unsigned int mode, cid; + /* During mode transitions CIDs are temporary and need to be dropped */ if (likely(!cid_in_transit(prev->mm_cid.cid))) return; - mm_drop_cid(prev->mm, cid_from_transit_cid(prev->mm_cid.cid)); - prev->mm_cid.cid = MM_CID_UNSET; + mode = READ_ONCE(mm->mm_cid.mode); + cid = cid_from_transit_cid(prev->mm_cid.cid); + + /* + * If transition mode is done, transfer ownership when the CID is + * within the convergence range to optimize the next schedule in. + */ + if (!cid_in_transit(mode) && cid < READ_ONCE(mm->mm_cid.max_cids)) { + if (cid_on_cpu(mode)) + cid = cid_to_cpu_cid(cid); + + /* Update both so that the next schedule in goes into the fast path */ + mm_cid_update_pcpu_cid(mm, cid); + prev->mm_cid.cid = cid; + } else { + mm_drop_cid(mm, cid); + prev->mm_cid.cid = MM_CID_UNSET; + } } static inline void mm_cid_switch_to(struct task_struct *prev, struct task_struct *next) From 7f67ba5413f98d93116a756e7f17cd2c1d6c2bd6 Mon Sep 17 00:00:00 2001 From: Chris Bainbridge Date: Mon, 2 Feb 2026 20:50:33 +0000 Subject: [PATCH 132/167] ASoC: amd: fix memory leak in acp3x pdm dma ops Fixes: 4a767b1d039a8 ("ASoC: amd: add acp3x pdm driver dma ops") Signed-off-by: Chris Bainbridge Link: https://patch.msgid.link/20260202205034.7697-1-chris.bainbridge@gmail.com Signed-off-by: Mark Brown --- sound/soc/amd/renoir/acp3x-pdm-dma.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/sound/soc/amd/renoir/acp3x-pdm-dma.c b/sound/soc/amd/renoir/acp3x-pdm-dma.c index 95ac8c680037..a560d06097d5 100644 --- a/sound/soc/amd/renoir/acp3x-pdm-dma.c +++ b/sound/soc/amd/renoir/acp3x-pdm-dma.c @@ -301,9 +301,11 @@ static int acp_pdm_dma_close(struct snd_soc_component *component, struct snd_pcm_substream *substream) { struct pdm_dev_data *adata = dev_get_drvdata(component->dev); + struct pdm_stream_instance *rtd = substream->runtime->private_data; disable_pdm_interrupts(adata->acp_base); adata->capture_stream = NULL; + kfree(rtd); return 0; } From 85352e59de4ce09de8322b2591a26f515fbde9c0 Mon Sep 17 00:00:00 2001 From: Frank Li Date: Mon, 2 Feb 2026 15:57:57 -0500 Subject: [PATCH 133/167] ASoC: dt-bindings: ti,tlv320aic3x: Add compatible string ti,tlv320aic23 Add compatible string ti,tlv320aic23 to fix below CHECK_DTB warning: arch/arm/boot/dts/nxp/imx/imx35-eukrea-mbimxsd35-baseboard.dtb: /soc/bus@43f00000/i2c@43f80000/codec@1a: failed to match any schema with compatible: ['ti,tlv320aic23'] Signed-off-by: Frank Li Reviewed-by: Daniel Baluta Link: https://patch.msgid.link/20260202205758.3044617-1-Frank.Li@nxp.com Signed-off-by: Mark Brown --- Documentation/devicetree/bindings/sound/ti,tlv320aic3x.yaml | 1 + 1 file changed, 1 insertion(+) diff --git a/Documentation/devicetree/bindings/sound/ti,tlv320aic3x.yaml b/Documentation/devicetree/bindings/sound/ti,tlv320aic3x.yaml index 206f6d61e362..50088698adac 100644 --- a/Documentation/devicetree/bindings/sound/ti,tlv320aic3x.yaml +++ b/Documentation/devicetree/bindings/sound/ti,tlv320aic3x.yaml @@ -46,6 +46,7 @@ maintainers: properties: compatible: enum: + - ti,tlv320aic23 - ti,tlv320aic3x - ti,tlv320aic33 - ti,tlv320aic3007 From f514248727606b9087bc38a284ff686e0093abf1 Mon Sep 17 00:00:00 2001 From: Ziyi Guo Date: Mon, 2 Feb 2026 17:41:12 +0000 Subject: [PATCH 134/167] ASoC: fsl_xcvr: fix missing lock in fsl_xcvr_mode_put() fsl_xcvr_activate_ctl() has lockdep_assert_held(&card->snd_card->controls_rwsem), but fsl_xcvr_mode_put() calls it without acquiring this lock. Other callers of fsl_xcvr_activate_ctl() in fsl_xcvr_startup() and fsl_xcvr_shutdown() properly acquire the lock with down_read()/up_read(). Add the missing down_read()/up_read() calls around fsl_xcvr_activate_ctl() in fsl_xcvr_mode_put() to fix the lockdep assertion and prevent potential race conditions when multiple userspace threads access the control. Signed-off-by: Ziyi Guo Link: https://patch.msgid.link/20260202174112.2018402-1-n7l8m4@u.northwestern.edu Signed-off-by: Mark Brown --- sound/soc/fsl/fsl_xcvr.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/sound/soc/fsl/fsl_xcvr.c b/sound/soc/fsl/fsl_xcvr.c index a268fb81a2f8..5de93f458b56 100644 --- a/sound/soc/fsl/fsl_xcvr.c +++ b/sound/soc/fsl/fsl_xcvr.c @@ -223,10 +223,13 @@ static int fsl_xcvr_mode_put(struct snd_kcontrol *kcontrol, xcvr->mode = snd_soc_enum_item_to_val(e, item[0]); + down_read(&card->snd_card->controls_rwsem); fsl_xcvr_activate_ctl(dai, fsl_xcvr_arc_mode_kctl.name, (xcvr->mode == FSL_XCVR_MODE_ARC)); fsl_xcvr_activate_ctl(dai, fsl_xcvr_earc_capds_kctl.name, (xcvr->mode == FSL_XCVR_MODE_EARC)); + up_read(&card->snd_card->controls_rwsem); + /* Allow playback for SPDIF only */ rtd = snd_soc_get_pcm_runtime(card, card->dai_link); rtd->pcm->streams[SNDRV_PCM_STREAM_PLAYBACK].substream_count = From 7ee9b3e091c63da71e15c72003f1f07e467f5158 Mon Sep 17 00:00:00 2001 From: Shuicheng Lin Date: Fri, 30 Jan 2026 04:39:08 +0000 Subject: [PATCH 135/167] drm/xe/query: Fix topology query pointer advance MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit The topology query helper advanced the user pointer by the size of the pointer, not the size of the structure. This can misalign the output blob and corrupt the following mask. Fix the increment to use sizeof(*topo). There is no issue currently, as sizeof(*topo) happens to be equal to sizeof(topo) on 64-bit systems (both evaluate to 8 bytes). Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs") Signed-off-by: Shuicheng Lin Reviewed-by: Matt Roper Link: https://patch.msgid.link/20260130043907.465128-2-shuicheng.lin@intel.com Signed-off-by: Matt Roper (cherry picked from commit c2a6859138e7f73ad904be17dd7d1da6cc7f06b3) Signed-off-by: Thomas Hellström --- drivers/gpu/drm/xe/xe_query.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/xe/xe_query.c b/drivers/gpu/drm/xe/xe_query.c index 1c0915e2cc16..fde36280e14d 100644 --- a/drivers/gpu/drm/xe/xe_query.c +++ b/drivers/gpu/drm/xe/xe_query.c @@ -491,7 +491,7 @@ static int copy_mask(void __user **ptr, if (copy_to_user(*ptr, topo, sizeof(*topo))) return -EFAULT; - *ptr += sizeof(topo); + *ptr += sizeof(*topo); if (copy_to_user(*ptr, mask, mask_size)) return -EFAULT; From e022c16965b8345af3c384c9136c7ca541a25f72 Mon Sep 17 00:00:00 2001 From: Shuicheng Lin Date: Thu, 29 Jan 2026 23:38:36 +0000 Subject: [PATCH 136/167] drm/xe: Fix kerneldoc for xe_migrate_exec_queue MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Correct the function name in the kerneldoc. It is for below warning: "Warning: drivers/gpu/drm/xe/xe_migrate.c:1262 expecting prototype for xe_get_migrate_exec_queue(). Prototype was for xe_migrate_exec_queue() instead" Fixes: 916ee4704a865 ("drm/xe/vf: Register CCS read/write contexts with Guc") Reviewed-by: Michal Wajdeczko Signed-off-by: Shuicheng Lin Signed-off-by: Michal Wajdeczko Link: https://patch.msgid.link/20260129233834.419977-6-shuicheng.lin@intel.com (cherry picked from commit 9fd8da717934f05125b9ba6782622c459a368dc0) Signed-off-by: Thomas Hellström --- drivers/gpu/drm/xe/xe_migrate.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/xe/xe_migrate.c b/drivers/gpu/drm/xe/xe_migrate.c index d8ee76aab4e4..9d7329cef910 100644 --- a/drivers/gpu/drm/xe/xe_migrate.c +++ b/drivers/gpu/drm/xe/xe_migrate.c @@ -1201,7 +1201,7 @@ int xe_migrate_ccs_rw_copy(struct xe_tile *tile, struct xe_exec_queue *q, } /** - * xe_get_migrate_exec_queue() - Get the execution queue from migrate context. + * xe_migrate_exec_queue() - Get the execution queue from migrate context. * @migrate: Migrate context. * * Return: Pointer to execution queue on success, error on failure From 51db5eef2ce39311cee2919623cda2ac0fc13c02 Mon Sep 17 00:00:00 2001 From: Shuicheng Lin Date: Thu, 29 Jan 2026 23:38:37 +0000 Subject: [PATCH 137/167] drm/xe: Fix kerneldoc for xe_gt_tlb_inval_init_early MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Correct the function name in the kerneldoc. It is for below warning: "Warning: drivers/gpu/drm/xe/xe_tlb_inval.c:136 expecting prototype for xe_gt_tlb_inval_init(). Prototype was for xe_gt_tlb_inval_init_early() instead" v2: add () for the function. (Michal) Fixes: db16f9d90c1d9 ("drm/xe: Split TLB invalidation code in frontend and backend") Reviewed-by: Michal Wajdeczko Signed-off-by: Shuicheng Lin Signed-off-by: Michal Wajdeczko Link: https://patch.msgid.link/20260129233834.419977-7-shuicheng.lin@intel.com (cherry picked from commit 0651dbb9d6a72e99569576fbec4681fd8160d161) Signed-off-by: Thomas Hellström --- drivers/gpu/drm/xe/xe_tlb_inval.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/xe/xe_tlb_inval.c b/drivers/gpu/drm/xe/xe_tlb_inval.c index 918a59e686ea..1b7ac31a37ca 100644 --- a/drivers/gpu/drm/xe/xe_tlb_inval.c +++ b/drivers/gpu/drm/xe/xe_tlb_inval.c @@ -115,7 +115,7 @@ static void tlb_inval_fini(struct drm_device *drm, void *arg) } /** - * xe_gt_tlb_inval_init - Initialize TLB invalidation state + * xe_gt_tlb_inval_init_early() - Initialize TLB invalidation state * @gt: GT structure * * Initialize TLB invalidation state, purely software initialization, should From 16264a3b594282a7f25028745158bc59a9cf7f96 Mon Sep 17 00:00:00 2001 From: Shuicheng Lin Date: Thu, 29 Jan 2026 23:38:38 +0000 Subject: [PATCH 138/167] drm/xe: Fix kerneldoc for xe_tlb_inval_job_alloc_dep MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Correct the function name in the kerneldoc. It is for below warning: "Warning: drivers/gpu/drm/xe/xe_tlb_inval_job.c:210 expecting prototype for xe_tlb_inval_alloc_dep(). Prototype was for xe_tlb_inval_job_alloc_dep() instead" Fixes: 15366239e2130 ("drm/xe: Decouple TLB invalidations from GT") Reviewed-by: Michal Wajdeczko Signed-off-by: Shuicheng Lin Signed-off-by: Michal Wajdeczko Link: https://patch.msgid.link/20260129233834.419977-8-shuicheng.lin@intel.com (cherry picked from commit 9f9c117ac566cb567dd56cc5b7564c45653f7a2a) Signed-off-by: Thomas Hellström --- drivers/gpu/drm/xe/xe_tlb_inval_job.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/xe/xe_tlb_inval_job.c b/drivers/gpu/drm/xe/xe_tlb_inval_job.c index 1ae0dec2cf31..1ec9c0f5f0b6 100644 --- a/drivers/gpu/drm/xe/xe_tlb_inval_job.c +++ b/drivers/gpu/drm/xe/xe_tlb_inval_job.c @@ -164,7 +164,7 @@ static void xe_tlb_inval_job_destroy(struct kref *ref) } /** - * xe_tlb_inval_alloc_dep() - TLB invalidation job alloc dependency + * xe_tlb_inval_job_alloc_dep() - TLB invalidation job alloc dependency * @job: TLB invalidation job to alloc dependency for * * Allocate storage for a dependency in the TLB invalidation fence. This From bb36170d959fad7f663f91eb9c32a84dd86bef2b Mon Sep 17 00:00:00 2001 From: Karthik Poosa Date: Fri, 23 Jan 2026 23:02:38 +0530 Subject: [PATCH 139/167] drm/xe/pm: Disable D3Cold for BMG only on specific platforms MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Restrict D3Cold disablement for BMG to unsupported NUC platforms, instead of disabling it on all platforms. Signed-off-by: Karthik Poosa Fixes: 3e331a6715ee ("drm/xe/pm: Temporarily disable D3Cold on BMG") Link: https://patch.msgid.link/20260123173238.1642383-1-karthik.poosa@intel.com Reviewed-by: Rodrigo Vivi Signed-off-by: Rodrigo Vivi (cherry picked from commit 39125eaf8863ab09d70c4b493f58639b08d5a897) Signed-off-by: Thomas Hellström --- drivers/gpu/drm/xe/xe_pm.c | 13 ++++++++++--- 1 file changed, 10 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/xe/xe_pm.c b/drivers/gpu/drm/xe/xe_pm.c index 766922530265..51eb6d005331 100644 --- a/drivers/gpu/drm/xe/xe_pm.c +++ b/drivers/gpu/drm/xe/xe_pm.c @@ -8,6 +8,7 @@ #include #include #include +#include #include #include @@ -357,9 +358,15 @@ ALLOW_ERROR_INJECTION(xe_pm_init_early, ERRNO); /* See xe_pci_probe() */ static u32 vram_threshold_value(struct xe_device *xe) { - /* FIXME: D3Cold temporarily disabled by default on BMG */ - if (xe->info.platform == XE_BATTLEMAGE) - return 0; + if (xe->info.platform == XE_BATTLEMAGE) { + const char *product_name; + + product_name = dmi_get_system_info(DMI_PRODUCT_NAME); + if (product_name && strstr(product_name, "NUC13RNG")) { + drm_warn(&xe->drm, "BMG + D3Cold not supported on this platform\n"); + return 0; + } + } return DEFAULT_VRAM_THRESHOLD; } From 7987cce375ac8ce98e170a77aa2399f2cf6eb99f Mon Sep 17 00:00:00 2001 From: Viacheslav Dubeyko Date: Tue, 3 Feb 2026 14:54:46 -0800 Subject: [PATCH 140/167] ceph: fix NULL pointer dereference in ceph_mds_auth_match() The CephFS kernel client has regression starting from 6.18-rc1. We have issue in ceph_mds_auth_match() if fs_name == NULL: const char fs_name = mdsc->fsc->mount_options->mds_namespace; ... if (auth->match.fs_name && strcmp(auth->match.fs_name, fs_name)) { / fsname mismatch, try next one */ return 0; } Patrick Donnelly suggested that: In summary, we should definitely start decoding `fs_name` from the MDSMap and do strict authorizations checks against it. Note that the `-o mds_namespace=foo` should only be used for selecting the file system to mount and nothing else. It's possible no mds_namespace is specified but the kernel will mount the only file system that exists which may have name "foo". This patch reworks ceph_mdsmap_decode() and namespace_equals() with the goal of supporting the suggested concept. Now struct ceph_mdsmap contains m_fs_name field that receives copy of extracted FS name by ceph_extract_encoded_string(). For the case of "old" CephFS file systems, it is used "cephfs" name. [ idryomov: replace redundant %*pE with %s in ceph_mdsmap_decode(), get rid of a series of strlen() calls in ceph_namespace_match(), drop changes to namespace_equals() body to avoid treating empty mds_namespace as equal, drop changes to ceph_mdsc_handle_fsmap() as namespace_equals() isn't an equivalent substitution there ] Cc: stable@vger.kernel.org Fixes: 22c73d52a6d0 ("ceph: fix multifs mds auth caps issue") Link: https://tracker.ceph.com/issues/73886 Signed-off-by: Viacheslav Dubeyko Reviewed-by: Patrick Donnelly Tested-by: Patrick Donnelly Signed-off-by: Ilya Dryomov --- fs/ceph/mds_client.c | 5 +++-- fs/ceph/mdsmap.c | 26 +++++++++++++++++++------- fs/ceph/mdsmap.h | 1 + fs/ceph/super.h | 16 ++++++++++++++-- include/linux/ceph/ceph_fs.h | 6 ++++++ 5 files changed, 43 insertions(+), 11 deletions(-) diff --git a/fs/ceph/mds_client.c b/fs/ceph/mds_client.c index 7e4eab824dae..c45bd19d4b1c 100644 --- a/fs/ceph/mds_client.c +++ b/fs/ceph/mds_client.c @@ -5671,7 +5671,7 @@ static int ceph_mds_auth_match(struct ceph_mds_client *mdsc, u32 caller_uid = from_kuid(&init_user_ns, cred->fsuid); u32 caller_gid = from_kgid(&init_user_ns, cred->fsgid); struct ceph_client *cl = mdsc->fsc->client; - const char *fs_name = mdsc->fsc->mount_options->mds_namespace; + const char *fs_name = mdsc->mdsmap->m_fs_name; const char *spath = mdsc->fsc->mount_options->server_path; bool gid_matched = false; u32 gid, tlen, len; @@ -5679,7 +5679,8 @@ static int ceph_mds_auth_match(struct ceph_mds_client *mdsc, doutc(cl, "fsname check fs_name=%s match.fs_name=%s\n", fs_name, auth->match.fs_name ? auth->match.fs_name : ""); - if (auth->match.fs_name && strcmp(auth->match.fs_name, fs_name)) { + + if (!ceph_namespace_match(auth->match.fs_name, fs_name)) { /* fsname mismatch, try next one */ return 0; } diff --git a/fs/ceph/mdsmap.c b/fs/ceph/mdsmap.c index 2c7b151a7c95..b228e5ecfb92 100644 --- a/fs/ceph/mdsmap.c +++ b/fs/ceph/mdsmap.c @@ -353,22 +353,33 @@ struct ceph_mdsmap *ceph_mdsmap_decode(struct ceph_mds_client *mdsc, void **p, __decode_and_drop_type(p, end, u8, bad_ext); } if (mdsmap_ev >= 8) { - u32 fsname_len; + size_t fsname_len; + /* enabled */ ceph_decode_8_safe(p, end, m->m_enabled, bad_ext); + /* fs_name */ - ceph_decode_32_safe(p, end, fsname_len, bad_ext); + m->m_fs_name = ceph_extract_encoded_string(p, end, + &fsname_len, + GFP_NOFS); + if (IS_ERR(m->m_fs_name)) { + m->m_fs_name = NULL; + goto nomem; + } /* validate fsname against mds_namespace */ - if (!namespace_equals(mdsc->fsc->mount_options, *p, + if (!namespace_equals(mdsc->fsc->mount_options, m->m_fs_name, fsname_len)) { - pr_warn_client(cl, "fsname %*pE doesn't match mds_namespace %s\n", - (int)fsname_len, (char *)*p, + pr_warn_client(cl, "fsname %s doesn't match mds_namespace %s\n", + m->m_fs_name, mdsc->fsc->mount_options->mds_namespace); goto bad; } - /* skip fsname after validation */ - ceph_decode_skip_n(p, end, fsname_len, bad); + } else { + m->m_enabled = false; + m->m_fs_name = kstrdup(CEPH_OLD_FS_NAME, GFP_NOFS); + if (!m->m_fs_name) + goto nomem; } /* damaged */ if (mdsmap_ev >= 9) { @@ -430,6 +441,7 @@ void ceph_mdsmap_destroy(struct ceph_mdsmap *m) kfree(m->m_info); } kfree(m->m_data_pg_pools); + kfree(m->m_fs_name); kfree(m); } diff --git a/fs/ceph/mdsmap.h b/fs/ceph/mdsmap.h index 1f2171dd01bf..d48d07c3516d 100644 --- a/fs/ceph/mdsmap.h +++ b/fs/ceph/mdsmap.h @@ -45,6 +45,7 @@ struct ceph_mdsmap { bool m_enabled; bool m_damaged; int m_num_laggy; + char *m_fs_name; }; static inline struct ceph_entity_addr * diff --git a/fs/ceph/super.h b/fs/ceph/super.h index a1f781c46b41..29a980e22dc2 100644 --- a/fs/ceph/super.h +++ b/fs/ceph/super.h @@ -104,14 +104,26 @@ struct ceph_mount_options { struct fscrypt_dummy_policy dummy_enc_policy; }; +#define CEPH_NAMESPACE_WILDCARD "*" + +static inline bool ceph_namespace_match(const char *pattern, + const char *target) +{ + if (!pattern || !pattern[0] || + !strcmp(pattern, CEPH_NAMESPACE_WILDCARD)) + return true; + + return !strcmp(pattern, target); +} + /* * Check if the mds namespace in ceph_mount_options matches * the passed in namespace string. First time match (when * ->mds_namespace is NULL) is treated specially, since * ->mds_namespace needs to be initialized by the caller. */ -static inline int namespace_equals(struct ceph_mount_options *fsopt, - const char *namespace, size_t len) +static inline bool namespace_equals(struct ceph_mount_options *fsopt, + const char *namespace, size_t len) { return !(fsopt->mds_namespace && (strlen(fsopt->mds_namespace) != len || diff --git a/include/linux/ceph/ceph_fs.h b/include/linux/ceph/ceph_fs.h index c7f2c63b3bc3..08e5dbe15ca4 100644 --- a/include/linux/ceph/ceph_fs.h +++ b/include/linux/ceph/ceph_fs.h @@ -31,6 +31,12 @@ #define CEPH_INO_CEPH 2 /* hidden .ceph dir */ #define CEPH_INO_GLOBAL_SNAPREALM 3 /* global dummy snaprealm */ +/* + * name for "old" CephFS file systems, + * see ceph.git e2b151d009640114b2565c901d6f41f6cd5ec652 + */ +#define CEPH_OLD_FS_NAME "cephfs" + /* arbitrary limit on max # of monitors (cluster of 3 is typical) */ #define CEPH_MAX_MON 31 From 0eca95cba2b7bf7b7b4f2fa90734a85fcaa72782 Mon Sep 17 00:00:00 2001 From: Tejun Heo Date: Wed, 4 Feb 2026 10:07:55 -1000 Subject: [PATCH 141/167] sched_ext: Short-circuit sched_class operations on dead tasks 7900aa699c34 ("sched_ext: Fix cgroup exit ordering by moving sched_ext_free() to finish_task_switch()") moved sched_ext_free() to finish_task_switch() and renamed it to sched_ext_dead() to fix cgroup exit ordering issues. However, this created a race window where certain sched_class ops may be invoked on dead tasks leading to failures - e.g. sched_setscheduler() may try to switch a task which finished sched_ext_dead() back into SCX triggering invalid SCX task state transitions. Add task_dead_and_done() which tests whether a task is TASK_DEAD and has completed its final context switch, and use it to short-circuit sched_class operations which may be called on dead tasks. Fixes: 7900aa699c34 ("sched_ext: Fix cgroup exit ordering by moving sched_ext_free() to finish_task_switch()") Reported-by: Andrea Righi Link: http://lkml.kernel.org/r/20260202151341.796959-1-arighi@nvidia.com Reviewed-by: Andrea Righi Signed-off-by: Tejun Heo --- kernel/sched/ext.c | 48 ++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 48 insertions(+) diff --git a/kernel/sched/ext.c b/kernel/sched/ext.c index 8f6d8d7f895c..1a5ead4a476e 100644 --- a/kernel/sched/ext.c +++ b/kernel/sched/ext.c @@ -194,6 +194,7 @@ MODULE_PARM_DESC(bypass_lb_intv_us, "bypass load balance interval in microsecond #include static void process_ddsp_deferred_locals(struct rq *rq); +static bool task_dead_and_done(struct task_struct *p); static u32 reenq_local(struct rq *rq); static void scx_kick_cpu(struct scx_sched *sch, s32 cpu, u64 flags); static bool scx_vexit(struct scx_sched *sch, enum scx_exit_kind kind, @@ -2618,6 +2619,9 @@ static void set_cpus_allowed_scx(struct task_struct *p, set_cpus_allowed_common(p, ac); + if (task_dead_and_done(p)) + return; + /* * The effective cpumask is stored in @p->cpus_ptr which may temporarily * differ from the configured one in @p->cpus_mask. Always tell the bpf @@ -3033,10 +3037,45 @@ void scx_cancel_fork(struct task_struct *p) percpu_up_read(&scx_fork_rwsem); } +/** + * task_dead_and_done - Is a task dead and done running? + * @p: target task + * + * Once sched_ext_dead() removes the dead task from scx_tasks and exits it, the + * task no longer exists from SCX's POV. However, certain sched_class ops may be + * invoked on these dead tasks leading to failures - e.g. sched_setscheduler() + * may try to switch a task which finished sched_ext_dead() back into SCX + * triggering invalid SCX task state transitions and worse. + * + * Once a task has finished the final switch, sched_ext_dead() is the only thing + * that needs to happen on the task. Use this test to short-circuit sched_class + * operations which may be called on dead tasks. + */ +static bool task_dead_and_done(struct task_struct *p) +{ + struct rq *rq = task_rq(p); + + lockdep_assert_rq_held(rq); + + /* + * In do_task_dead(), a dying task sets %TASK_DEAD with preemption + * disabled and __schedule(). If @p has %TASK_DEAD set and off CPU, @p + * won't ever run again. + */ + return unlikely(READ_ONCE(p->__state) == TASK_DEAD) && + !task_on_cpu(rq, p); +} + void sched_ext_dead(struct task_struct *p) { unsigned long flags; + /* + * By the time control reaches here, @p has %TASK_DEAD set, switched out + * for the last time and then dropped the rq lock - task_dead_and_done() + * should be returning %true nullifying the straggling sched_class ops. + * Remove from scx_tasks and exit @p. + */ raw_spin_lock_irqsave(&scx_tasks_lock, flags); list_del_init(&p->scx.tasks_node); raw_spin_unlock_irqrestore(&scx_tasks_lock, flags); @@ -3062,6 +3101,9 @@ static void reweight_task_scx(struct rq *rq, struct task_struct *p, lockdep_assert_rq_held(task_rq(p)); + if (task_dead_and_done(p)) + return; + p->scx.weight = sched_weight_to_cgroup(scale_load_down(lw->weight)); if (SCX_HAS_OP(sch, set_weight)) SCX_CALL_OP_TASK(sch, SCX_KF_REST, set_weight, rq, @@ -3076,6 +3118,9 @@ static void switching_to_scx(struct rq *rq, struct task_struct *p) { struct scx_sched *sch = scx_root; + if (task_dead_and_done(p)) + return; + scx_enable_task(p); /* @@ -3089,6 +3134,9 @@ static void switching_to_scx(struct rq *rq, struct task_struct *p) static void switched_from_scx(struct rq *rq, struct task_struct *p) { + if (task_dead_and_done(p)) + return; + scx_disable_task(p); } From 831a2b27914cc880130ffe8fb8d1e65a5324d07f Mon Sep 17 00:00:00 2001 From: Arnd Bergmann Date: Tue, 3 Feb 2026 17:34:36 +0100 Subject: [PATCH 142/167] hwmon: (occ) Mark occ_init_attribute() as __printf This is a printf-style function, which gcc -Werror=suggest-attribute=format correctly points out: drivers/hwmon/occ/common.c: In function 'occ_init_attribute': drivers/hwmon/occ/common.c:761:9: error: function 'occ_init_attribute' might be a candidate for 'gnu_printf' format attribute [-Werror=suggest-attribute=format] Add the attribute to avoid this warning and ensure any incorrect format strings are detected here. Fixes: 744c2fe950e9 ("hwmon: (occ) Rework attribute registration for stack usage") Signed-off-by: Arnd Bergmann Link: https://lore.kernel.org/r/20260203163440.2674340-1-arnd@kernel.org Signed-off-by: Guenter Roeck --- drivers/hwmon/occ/common.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/hwmon/occ/common.c b/drivers/hwmon/occ/common.c index b3694a4209b9..89928d38831b 100644 --- a/drivers/hwmon/occ/common.c +++ b/drivers/hwmon/occ/common.c @@ -749,6 +749,7 @@ static ssize_t occ_show_extended(struct device *dev, * are dynamically allocated, we cannot use the existing kernel macros which * stringify the name argument. */ +__printf(7, 8) static void occ_init_attribute(struct occ_attribute *attr, int mode, ssize_t (*show)(struct device *dev, struct device_attribute *attr, char *buf), ssize_t (*store)(struct device *dev, struct device_attribute *attr, From f41c5d151078c5348271ffaf8e7410d96f2d82f8 Mon Sep 17 00:00:00 2001 From: Andrew Fasano Date: Wed, 4 Feb 2026 17:46:58 +0100 Subject: [PATCH 143/167] netfilter: nf_tables: fix inverted genmask check in nft_map_catchall_activate() nft_map_catchall_activate() has an inverted element activity check compared to its non-catchall counterpart nft_mapelem_activate() and compared to what is logically required. nft_map_catchall_activate() is called from the abort path to re-activate catchall map elements that were deactivated during a failed transaction. It should skip elements that are already active (they don't need re-activation) and process elements that are inactive (they need to be restored). Instead, the current code does the opposite: it skips inactive elements and processes active ones. Compare the non-catchall activate callback, which is correct: nft_mapelem_activate(): if (nft_set_elem_active(ext, iter->genmask)) return 0; /* skip active, process inactive */ With the buggy catchall version: nft_map_catchall_activate(): if (!nft_set_elem_active(ext, genmask)) continue; /* skip inactive, process active */ The consequence is that when a DELSET operation is aborted, nft_setelem_data_activate() is never called for the catchall element. For NFT_GOTO verdict elements, this means nft_data_hold() is never called to restore the chain->use reference count. Each abort cycle permanently decrements chain->use. Once chain->use reaches zero, DELCHAIN succeeds and frees the chain while catchall verdict elements still reference it, resulting in a use-after-free. This is exploitable for local privilege escalation from an unprivileged user via user namespaces + nftables on distributions that enable CONFIG_USER_NS and CONFIG_NF_TABLES. Fix by removing the negation so the check matches nft_mapelem_activate(): skip active elements, process inactive ones. Fixes: 628bd3e49cba ("netfilter: nf_tables: drop map element references from preparation phase") Signed-off-by: Andrew Fasano Signed-off-by: Florian Westphal --- net/netfilter/nf_tables_api.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/net/netfilter/nf_tables_api.c b/net/netfilter/nf_tables_api.c index 729a92781a1a..be92750e2af3 100644 --- a/net/netfilter/nf_tables_api.c +++ b/net/netfilter/nf_tables_api.c @@ -5914,7 +5914,7 @@ static void nft_map_catchall_activate(const struct nft_ctx *ctx, list_for_each_entry(catchall, &set->catchall_list, list) { ext = nft_set_elem_ext(set, catchall->elem); - if (!nft_set_elem_active(ext, genmask)) + if (nft_set_elem_active(ext, genmask)) continue; nft_clear(ctx->net, ext); From 51db05283f7c9c95a3e6853a3044cd04226551bf Mon Sep 17 00:00:00 2001 From: Breno Baptista Date: Wed, 4 Feb 2026 23:43:41 -0300 Subject: [PATCH 144/167] ALSA: hda/realtek: Enable headset mic for Acer Nitro 5 Add quirk to support microphone input through headphone jack on Acer Nitro 5 AN515-57 (ALC295). Signed-off-by: Breno Baptista Link: https://patch.msgid.link/20260205024341.26694-1-brenomb07@gmail.com Signed-off-by: Takashi Iwai --- sound/hda/codecs/realtek/alc269.c | 1 + 1 file changed, 1 insertion(+) diff --git a/sound/hda/codecs/realtek/alc269.c b/sound/hda/codecs/realtek/alc269.c index 0a0496deb9c8..b66965a52107 100644 --- a/sound/hda/codecs/realtek/alc269.c +++ b/sound/hda/codecs/realtek/alc269.c @@ -6353,6 +6353,7 @@ static const struct hda_quirk alc269_fixup_tbl[] = { SND_PCI_QUIRK(0x1025, 0x1430, "Acer TravelMate B311R-31", ALC256_FIXUP_ACER_MIC_NO_PRESENCE), SND_PCI_QUIRK(0x1025, 0x1466, "Acer Aspire A515-56", ALC255_FIXUP_ACER_HEADPHONE_AND_MIC), SND_PCI_QUIRK(0x1025, 0x1534, "Acer Predator PH315-54", ALC255_FIXUP_ACER_MIC_NO_PRESENCE), + SND_PCI_QUIRK(0x1025, 0x1539, "Acer Nitro 5 AN515-57", ALC2XX_FIXUP_HEADSET_MIC), SND_PCI_QUIRK(0x1025, 0x159c, "Acer Nitro 5 AN515-58", ALC2XX_FIXUP_HEADSET_MIC), SND_PCI_QUIRK(0x1025, 0x1597, "Acer Nitro 5 AN517-55", ALC2XX_FIXUP_HEADSET_MIC), SND_PCI_QUIRK(0x1025, 0x169a, "Acer Swift SFG16", ALC256_FIXUP_ACER_SFG16_MICMUTE_LED), From 40b24d9cdd4141ef43eeaa7e57c3efc07a567473 Mon Sep 17 00:00:00 2001 From: Shengjiu Wang Date: Fri, 30 Jan 2026 16:09:10 +0800 Subject: [PATCH 145/167] drm/bridge: imx8mp-hdmi-pai: enable PM runtime There is an audio channel shift issue with multi channel case - the channel order is correct for the first run, but the channel order is shifted for the second run. The fix method is to reset the PAI interface at the end of playback. The reset can be handled by PM runtime, so enable PM runtime. Fixes: 0205fae6327a ("drm/bridge: imx: add driver for HDMI TX Parallel Audio Interface") Signed-off-by: Shengjiu Wang Reviewed-by: Liu Ying Signed-off-by: Liu Ying Link: https://lore.kernel.org/r/20260130080910.3532724-1-shengjiu.wang@nxp.com --- drivers/gpu/drm/bridge/imx/imx8mp-hdmi-pai.c | 15 +++++++++++++++ 1 file changed, 15 insertions(+) diff --git a/drivers/gpu/drm/bridge/imx/imx8mp-hdmi-pai.c b/drivers/gpu/drm/bridge/imx/imx8mp-hdmi-pai.c index 8d13a35b206a..cc221483ef0d 100644 --- a/drivers/gpu/drm/bridge/imx/imx8mp-hdmi-pai.c +++ b/drivers/gpu/drm/bridge/imx/imx8mp-hdmi-pai.c @@ -8,6 +8,7 @@ #include #include #include +#include #include #include #include @@ -33,6 +34,7 @@ struct imx8mp_hdmi_pai { struct regmap *regmap; + struct device *dev; }; static void imx8mp_hdmi_pai_enable(struct dw_hdmi *dw_hdmi, int channel, @@ -43,6 +45,9 @@ static void imx8mp_hdmi_pai_enable(struct dw_hdmi *dw_hdmi, int channel, struct imx8mp_hdmi_pai *hdmi_pai = pdata->priv_audio; int val; + if (pm_runtime_resume_and_get(hdmi_pai->dev) < 0) + return; + /* PAI set control extended */ val = WTMK_HIGH(3) | WTMK_LOW(3); val |= NUM_CH(channel); @@ -85,6 +90,8 @@ static void imx8mp_hdmi_pai_disable(struct dw_hdmi *dw_hdmi) /* Stop PAI */ regmap_write(hdmi_pai->regmap, HTX_PAI_CTRL, 0); + + pm_runtime_put_sync(hdmi_pai->dev); } static const struct regmap_config imx8mp_hdmi_pai_regmap_config = { @@ -101,6 +108,7 @@ static int imx8mp_hdmi_pai_bind(struct device *dev, struct device *master, void struct imx8mp_hdmi_pai *hdmi_pai; struct resource *res; void __iomem *base; + int ret; hdmi_pai = devm_kzalloc(dev, sizeof(*hdmi_pai), GFP_KERNEL); if (!hdmi_pai) @@ -121,6 +129,13 @@ static int imx8mp_hdmi_pai_bind(struct device *dev, struct device *master, void plat_data->disable_audio = imx8mp_hdmi_pai_disable; plat_data->priv_audio = hdmi_pai; + hdmi_pai->dev = dev; + ret = devm_pm_runtime_enable(dev); + if (ret < 0) { + dev_err(dev, "failed to enable PM runtime: %d\n", ret); + return ret; + } + return 0; } From 4cb1b327135dddf3d0ec2544ea36ed05ba2252bc Mon Sep 17 00:00:00 2001 From: Daniele Ceraolo Spurio Date: Thu, 29 Jan 2026 10:25:48 -0800 Subject: [PATCH 146/167] drm/xe/guc: Fix CFI violation in debugfs access. MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit xe_guc_print_info is void-returning, but the function pointer it is assigned to expects an int-returning function, leading to the following CFI error: [ 206.873690] CFI failure at guc_debugfs_show+0xa1/0xf0 [xe] (target: xe_guc_print_info+0x0/0x370 [xe]; expected type: 0xbe3bc66a) Fix this by updating xe_guc_print_info to return an integer. Fixes: e15826bb3c2c ("drm/xe/guc: Refactor GuC debugfs initialization") Signed-off-by: Daniele Ceraolo Spurio Cc: Michal Wajdeczko Cc: George D Sworo Reviewed-by: Michal Wajdeczko Link: https://patch.msgid.link/20260129182547.32899-2-daniele.ceraolospurio@intel.com (cherry picked from commit dd8ea2f2ab71b98887fdc426b0651dbb1d1ea760) Signed-off-by: Thomas Hellström --- drivers/gpu/drm/xe/xe_guc.c | 6 ++++-- drivers/gpu/drm/xe/xe_guc.h | 2 +- 2 files changed, 5 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/xe/xe_guc.c b/drivers/gpu/drm/xe/xe_guc.c index a686b04879d6..edb939f26268 100644 --- a/drivers/gpu/drm/xe/xe_guc.c +++ b/drivers/gpu/drm/xe/xe_guc.c @@ -1618,7 +1618,7 @@ int xe_guc_start(struct xe_guc *guc) return xe_guc_submit_start(guc); } -void xe_guc_print_info(struct xe_guc *guc, struct drm_printer *p) +int xe_guc_print_info(struct xe_guc *guc, struct drm_printer *p) { struct xe_gt *gt = guc_to_gt(guc); unsigned int fw_ref; @@ -1630,7 +1630,7 @@ void xe_guc_print_info(struct xe_guc *guc, struct drm_printer *p) if (!IS_SRIOV_VF(gt_to_xe(gt))) { fw_ref = xe_force_wake_get(gt_to_fw(gt), XE_FW_GT); if (!fw_ref) - return; + return -EIO; status = xe_mmio_read32(>->mmio, GUC_STATUS); @@ -1658,6 +1658,8 @@ void xe_guc_print_info(struct xe_guc *guc, struct drm_printer *p) drm_puts(p, "\n"); xe_guc_submit_print(guc, p); + + return 0; } /** diff --git a/drivers/gpu/drm/xe/xe_guc.h b/drivers/gpu/drm/xe/xe_guc.h index e2d4c5f44ae3..61226ab81669 100644 --- a/drivers/gpu/drm/xe/xe_guc.h +++ b/drivers/gpu/drm/xe/xe_guc.h @@ -45,7 +45,7 @@ int xe_guc_self_cfg32(struct xe_guc *guc, u16 key, u32 val); int xe_guc_self_cfg64(struct xe_guc *guc, u16 key, u64 val); void xe_guc_irq_handler(struct xe_guc *guc, const u16 iir); void xe_guc_sanitize(struct xe_guc *guc); -void xe_guc_print_info(struct xe_guc *guc, struct drm_printer *p); +int xe_guc_print_info(struct xe_guc *guc, struct drm_printer *p); int xe_guc_reset_prepare(struct xe_guc *guc); void xe_guc_reset_wait(struct xe_guc *guc); void xe_guc_stop_prepare(struct xe_guc *guc); From e9ab2b83893dd03cf04d98faded81190e635233f Mon Sep 17 00:00:00 2001 From: Xu Yang Date: Wed, 4 Feb 2026 19:11:41 +0800 Subject: [PATCH 147/167] pmdomain: imx8mp-blk-ctrl: Keep gpc power domain on for system wakeup Current design will power off all dependent GPC power domains in imx8mp_blk_ctrl_suspend(), even though the user device has enabled wakeup capability. The result is that wakeup function never works for such device. An example will be USB wakeup on i.MX8MP. PHY device '382f0040.usb-phy' is attached to power domain 'hsioblk-usb-phy2' which is spawned by hsio block control. A virtual power domain device 'genpd:3:32f10000.blk-ctrl' is created to build connection with 'hsioblk-usb-phy2' and it depends on GPC power domain 'usb-otg2'. If device '382f0040.usb-phy' enable wakeup, only power domain 'hsioblk-usb-phy2' keeps on during system suspend, power domain 'usb-otg2' is off all the time. So the wakeup event can't happen. In order to further establish a connection between the power domains related to GPC and block control during system suspend, register a genpd power on/off notifier for the power_dev. This allows us to prevent the GPC power domain from being powered off, in case the block control power domain is kept on to serve system wakeup. Suggested-by: Ulf Hansson Fixes: 556f5cf9568a ("soc: imx: add i.MX8MP HSIO blk-ctrl") Cc: stable@vger.kernel.org Signed-off-by: Xu Yang Signed-off-by: Ulf Hansson --- drivers/pmdomain/imx/imx8mp-blk-ctrl.c | 26 ++++++++++++++++++++++++++ 1 file changed, 26 insertions(+) diff --git a/drivers/pmdomain/imx/imx8mp-blk-ctrl.c b/drivers/pmdomain/imx/imx8mp-blk-ctrl.c index 34576be606e3..56bbfee8668d 100644 --- a/drivers/pmdomain/imx/imx8mp-blk-ctrl.c +++ b/drivers/pmdomain/imx/imx8mp-blk-ctrl.c @@ -65,6 +65,7 @@ struct imx8mp_blk_ctrl_domain { struct icc_bulk_data paths[DOMAIN_MAX_PATHS]; struct device *power_dev; struct imx8mp_blk_ctrl *bc; + struct notifier_block power_nb; int num_paths; int id; }; @@ -594,6 +595,20 @@ static int imx8mp_blk_ctrl_power_off(struct generic_pm_domain *genpd) return 0; } +static int imx8mp_blk_ctrl_gpc_notifier(struct notifier_block *nb, + unsigned long action, void *data) +{ + struct imx8mp_blk_ctrl_domain *domain = + container_of(nb, struct imx8mp_blk_ctrl_domain, power_nb); + + if (action == GENPD_NOTIFY_PRE_OFF) { + if (domain->genpd.status == GENPD_STATE_ON) + return NOTIFY_BAD; + } + + return NOTIFY_OK; +} + static struct lock_class_key blk_ctrl_genpd_lock_class; static int imx8mp_blk_ctrl_probe(struct platform_device *pdev) @@ -698,6 +713,14 @@ static int imx8mp_blk_ctrl_probe(struct platform_device *pdev) goto cleanup_pds; } + domain->power_nb.notifier_call = imx8mp_blk_ctrl_gpc_notifier; + ret = dev_pm_genpd_add_notifier(domain->power_dev, &domain->power_nb); + if (ret) { + dev_err_probe(dev, ret, "failed to add power notifier\n"); + dev_pm_domain_detach(domain->power_dev, true); + goto cleanup_pds; + } + domain->genpd.name = data->name; domain->genpd.power_on = imx8mp_blk_ctrl_power_on; domain->genpd.power_off = imx8mp_blk_ctrl_power_off; @@ -707,6 +730,7 @@ static int imx8mp_blk_ctrl_probe(struct platform_device *pdev) ret = pm_genpd_init(&domain->genpd, NULL, true); if (ret) { dev_err_probe(dev, ret, "failed to init power domain\n"); + dev_pm_genpd_remove_notifier(domain->power_dev); dev_pm_domain_detach(domain->power_dev, true); goto cleanup_pds; } @@ -755,6 +779,7 @@ static int imx8mp_blk_ctrl_probe(struct platform_device *pdev) cleanup_pds: for (i--; i >= 0; i--) { pm_genpd_remove(&bc->domains[i].genpd); + dev_pm_genpd_remove_notifier(bc->domains[i].power_dev); dev_pm_domain_detach(bc->domains[i].power_dev, true); } @@ -774,6 +799,7 @@ static void imx8mp_blk_ctrl_remove(struct platform_device *pdev) struct imx8mp_blk_ctrl_domain *domain = &bc->domains[i]; pm_genpd_remove(&domain->genpd); + dev_pm_genpd_remove_notifier(domain->power_dev); dev_pm_domain_detach(domain->power_dev, true); } From e2c4c5b2bbd4f688a0f9f6da26cdf6d723c53478 Mon Sep 17 00:00:00 2001 From: Xu Yang Date: Wed, 4 Feb 2026 19:11:42 +0800 Subject: [PATCH 148/167] pmdomain: imx8mp-blk-ctrl: Keep usb phy power domain on for system wakeup USB system wakeup need its PHY on, so add the GENPD_FLAG_ACTIVE_WAKEUP flags to USB PHY genpd configuration. Signed-off-by: Xu Yang Fixes: 556f5cf9568a ("soc: imx: add i.MX8MP HSIO blk-ctrl") Cc: stable@vger.kernel.org Signed-off-by: Ulf Hansson --- drivers/pmdomain/imx/imx8mp-blk-ctrl.c | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/drivers/pmdomain/imx/imx8mp-blk-ctrl.c b/drivers/pmdomain/imx/imx8mp-blk-ctrl.c index 56bbfee8668d..8fc79f9723f0 100644 --- a/drivers/pmdomain/imx/imx8mp-blk-ctrl.c +++ b/drivers/pmdomain/imx/imx8mp-blk-ctrl.c @@ -53,6 +53,7 @@ struct imx8mp_blk_ctrl_domain_data { const char * const *path_names; int num_paths; const char *gpc_name; + const unsigned int flags; }; #define DOMAIN_MAX_CLKS 3 @@ -265,10 +266,12 @@ static const struct imx8mp_blk_ctrl_domain_data imx8mp_hsio_domain_data[] = { [IMX8MP_HSIOBLK_PD_USB_PHY1] = { .name = "hsioblk-usb-phy1", .gpc_name = "usb-phy1", + .flags = GENPD_FLAG_ACTIVE_WAKEUP, }, [IMX8MP_HSIOBLK_PD_USB_PHY2] = { .name = "hsioblk-usb-phy2", .gpc_name = "usb-phy2", + .flags = GENPD_FLAG_ACTIVE_WAKEUP, }, [IMX8MP_HSIOBLK_PD_PCIE] = { .name = "hsioblk-pcie", @@ -724,6 +727,7 @@ static int imx8mp_blk_ctrl_probe(struct platform_device *pdev) domain->genpd.name = data->name; domain->genpd.power_on = imx8mp_blk_ctrl_power_on; domain->genpd.power_off = imx8mp_blk_ctrl_power_off; + domain->genpd.flags = data->flags; domain->bc = bc; domain->id = i; From 033c55fe2e326bea022c3cc5178ecf3e0e459b82 Mon Sep 17 00:00:00 2001 From: Steven Rostedt Date: Wed, 4 Feb 2026 11:36:28 -0500 Subject: [PATCH 149/167] tracing: Fix ftrace event field alignments The fields of ftrace specific events (events used to save ftrace internal events like function traces and trace_printk) are generated similarly to how normal trace event fields are generated. That is, the fields are added to a trace_events_fields array that saves the name, offset, size, alignment and signness of the field. It is used to produce the output in the format file in tracefs so that tooling knows how to parse the binary data of the trace events. The issue is that some of the ftrace event structures are packed. The function graph exit event structures are one of them. The 64 bit calltime and rettime fields end up 4 byte aligned, but the algorithm to show to userspace shows them as 8 byte aligned. The macros that create the ftrace events has one for embedded structure fields. There's two macros for theses fields: __field_desc() and __field_packed() The difference of the latter macro is that it treats the field as packed. Rename that field to __field_desc_packed() and create replace the __field_packed() to be a normal field that is packed and have the calltime and rettime use those. This showed up on 32bit architectures for function graph time fields. It had: ~# cat /sys/kernel/tracing/events/ftrace/funcgraph_exit/format [..] field:unsigned long func; offset:8; size:4; signed:0; field:unsigned int depth; offset:12; size:4; signed:0; field:unsigned int overrun; offset:16; size:4; signed:0; field:unsigned long long calltime; offset:24; size:8; signed:0; field:unsigned long long rettime; offset:32; size:8; signed:0; Notice that overrun is at offset 16 with size 4, where in the structure calltime is at offset 20 (16 + 4), but it shows the offset at 24. That's because it used the alignment of unsigned long long when used as a declaration and not as a member of a structure where it would be aligned by word size (in this case 4). By using the proper structure alignment, the format has it at the correct offset: ~# cat /sys/kernel/tracing/events/ftrace/funcgraph_exit/format [..] field:unsigned long func; offset:8; size:4; signed:0; field:unsigned int depth; offset:12; size:4; signed:0; field:unsigned int overrun; offset:16; size:4; signed:0; field:unsigned long long calltime; offset:20; size:8; signed:0; field:unsigned long long rettime; offset:28; size:8; signed:0; Cc: stable@vger.kernel.org Cc: Mathieu Desnoyers Cc: Mark Rutland Acked-by: Masami Hiramatsu (Google) Reported-by: "jempty.liang" Link: https://patch.msgid.link/20260204113628.53faec78@gandalf.local.home Fixes: 04ae87a52074e ("ftrace: Rework event_create_dir()") Closes: https://lore.kernel.org/all/20260130015740.212343-1-imntjempty@163.com/ Closes: https://lore.kernel.org/all/20260202123342.2544795-1-imntjempty@163.com/ Signed-off-by: Steven Rostedt (Google) --- kernel/trace/trace.h | 7 +++++-- kernel/trace/trace_entries.h | 32 ++++++++++++++++---------------- kernel/trace/trace_export.c | 21 +++++++++++++++------ 3 files changed, 36 insertions(+), 24 deletions(-) diff --git a/kernel/trace/trace.h b/kernel/trace/trace.h index b6d42fe06115..c11edec5d8f5 100644 --- a/kernel/trace/trace.h +++ b/kernel/trace/trace.h @@ -68,14 +68,17 @@ enum trace_type { #undef __field_fn #define __field_fn(type, item) type item; +#undef __field_packed +#define __field_packed(type, item) type item; + #undef __field_struct #define __field_struct(type, item) __field(type, item) #undef __field_desc #define __field_desc(type, container, item) -#undef __field_packed -#define __field_packed(type, container, item) +#undef __field_desc_packed +#define __field_desc_packed(type, container, item) #undef __array #define __array(type, item, size) type item[size]; diff --git a/kernel/trace/trace_entries.h b/kernel/trace/trace_entries.h index f6a8d29c0d76..54417468fdeb 100644 --- a/kernel/trace/trace_entries.h +++ b/kernel/trace/trace_entries.h @@ -79,8 +79,8 @@ FTRACE_ENTRY(funcgraph_entry, ftrace_graph_ent_entry, F_STRUCT( __field_struct( struct ftrace_graph_ent, graph_ent ) - __field_packed( unsigned long, graph_ent, func ) - __field_packed( unsigned long, graph_ent, depth ) + __field_desc_packed(unsigned long, graph_ent, func ) + __field_desc_packed(unsigned long, graph_ent, depth ) __dynamic_array(unsigned long, args ) ), @@ -96,9 +96,9 @@ FTRACE_ENTRY_PACKED(fgraph_retaddr_entry, fgraph_retaddr_ent_entry, F_STRUCT( __field_struct( struct fgraph_retaddr_ent, graph_rent ) - __field_packed( unsigned long, graph_rent.ent, func ) - __field_packed( unsigned long, graph_rent.ent, depth ) - __field_packed( unsigned long, graph_rent, retaddr ) + __field_desc_packed( unsigned long, graph_rent.ent, func ) + __field_desc_packed( unsigned long, graph_rent.ent, depth ) + __field_desc_packed( unsigned long, graph_rent, retaddr ) __dynamic_array(unsigned long, args ) ), @@ -123,12 +123,12 @@ FTRACE_ENTRY_PACKED(funcgraph_exit, ftrace_graph_ret_entry, F_STRUCT( __field_struct( struct ftrace_graph_ret, ret ) - __field_packed( unsigned long, ret, func ) - __field_packed( unsigned long, ret, retval ) - __field_packed( unsigned int, ret, depth ) - __field_packed( unsigned int, ret, overrun ) - __field(unsigned long long, calltime ) - __field(unsigned long long, rettime ) + __field_desc_packed( unsigned long, ret, func ) + __field_desc_packed( unsigned long, ret, retval ) + __field_desc_packed( unsigned int, ret, depth ) + __field_desc_packed( unsigned int, ret, overrun ) + __field_packed(unsigned long long, calltime) + __field_packed(unsigned long long, rettime ) ), F_printk("<-- %ps (%u) (start: %llx end: %llx) over: %u retval: %lx", @@ -146,11 +146,11 @@ FTRACE_ENTRY_PACKED(funcgraph_exit, ftrace_graph_ret_entry, F_STRUCT( __field_struct( struct ftrace_graph_ret, ret ) - __field_packed( unsigned long, ret, func ) - __field_packed( unsigned int, ret, depth ) - __field_packed( unsigned int, ret, overrun ) - __field(unsigned long long, calltime ) - __field(unsigned long long, rettime ) + __field_desc_packed( unsigned long, ret, func ) + __field_desc_packed( unsigned int, ret, depth ) + __field_desc_packed( unsigned int, ret, overrun ) + __field_packed(unsigned long long, calltime ) + __field_packed(unsigned long long, rettime ) ), F_printk("<-- %ps (%u) (start: %llx end: %llx) over: %u", diff --git a/kernel/trace/trace_export.c b/kernel/trace/trace_export.c index 1698fc22afa0..32a42ef31855 100644 --- a/kernel/trace/trace_export.c +++ b/kernel/trace/trace_export.c @@ -42,11 +42,14 @@ static int ftrace_event_register(struct trace_event_call *call, #undef __field_fn #define __field_fn(type, item) type item; +#undef __field_packed +#define __field_packed(type, item) type item; + #undef __field_desc #define __field_desc(type, container, item) type item; -#undef __field_packed -#define __field_packed(type, container, item) type item; +#undef __field_desc_packed +#define __field_desc_packed(type, container, item) type item; #undef __array #define __array(type, item, size) type item[size]; @@ -104,11 +107,14 @@ static void __always_unused ____ftrace_check_##name(void) \ #undef __field_fn #define __field_fn(_type, _item) __field_ext(_type, _item, FILTER_TRACE_FN) +#undef __field_packed +#define __field_packed(_type, _item) __field_ext_packed(_type, _item, FILTER_OTHER) + #undef __field_desc #define __field_desc(_type, _container, _item) __field_ext(_type, _item, FILTER_OTHER) -#undef __field_packed -#define __field_packed(_type, _container, _item) __field_ext_packed(_type, _item, FILTER_OTHER) +#undef __field_desc_packed +#define __field_desc_packed(_type, _container, _item) __field_ext_packed(_type, _item, FILTER_OTHER) #undef __array #define __array(_type, _item, _len) { \ @@ -146,11 +152,14 @@ static struct trace_event_fields ftrace_event_fields_##name[] = { \ #undef __field_fn #define __field_fn(type, item) +#undef __field_packed +#define __field_packed(type, item) + #undef __field_desc #define __field_desc(type, container, item) -#undef __field_packed -#define __field_packed(type, container, item) +#undef __field_desc_packed +#define __field_desc_packed(type, container, item) #undef __array #define __array(type, item, len) From 071be3b0b6575d45be9df9c5b612f5882bfc5e88 Mon Sep 17 00:00:00 2001 From: Keith Busch Date: Wed, 4 Feb 2026 06:29:11 -0800 Subject: [PATCH 150/167] nvme-pci: handle changing device dma map requirements The initial state of dma_needs_unmap may be false, but change to true while mapping the data iterator. Enabling swiotlb is one such case that can change the result. The nvme driver needs to save the mapped dma vectors to be unmapped later, so allocate as needed during iteration rather than assume it was always allocated at the beginning. This fixes a NULL dereference from accessing an uninitialized dma_vecs when the device dma unmapping requirements change mid-iteration. Fixes: b8b7570a7ec8 ("nvme-pci: fix dma unmapping when using PRPs and not using the IOVA mapping") Link: https://lore.kernel.org/linux-nvme/20260202125738.1194899-1-pradeep.pragallapati@oss.qualcomm.com/ Reported-by: Pradeep P V K Reviewed-by: Christoph Hellwig Signed-off-by: Keith Busch --- drivers/nvme/host/pci.c | 45 +++++++++++++++++++++++++++-------------- 1 file changed, 30 insertions(+), 15 deletions(-) diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c index c2bee32332fe..d86f2565a92c 100644 --- a/drivers/nvme/host/pci.c +++ b/drivers/nvme/host/pci.c @@ -816,6 +816,32 @@ static void nvme_unmap_data(struct request *req) nvme_free_descriptors(req); } +static bool nvme_pci_prp_save_mapping(struct request *req, + struct device *dma_dev, + struct blk_dma_iter *iter) +{ + struct nvme_iod *iod = blk_mq_rq_to_pdu(req); + + if (dma_use_iova(&iod->dma_state) || !dma_need_unmap(dma_dev)) + return true; + + if (!iod->nr_dma_vecs) { + struct nvme_queue *nvmeq = req->mq_hctx->driver_data; + + iod->dma_vecs = mempool_alloc(nvmeq->dev->dmavec_mempool, + GFP_ATOMIC); + if (!iod->dma_vecs) { + iter->status = BLK_STS_RESOURCE; + return false; + } + } + + iod->dma_vecs[iod->nr_dma_vecs].addr = iter->addr; + iod->dma_vecs[iod->nr_dma_vecs].len = iter->len; + iod->nr_dma_vecs++; + return true; +} + static bool nvme_pci_prp_iter_next(struct request *req, struct device *dma_dev, struct blk_dma_iter *iter) { @@ -825,12 +851,7 @@ static bool nvme_pci_prp_iter_next(struct request *req, struct device *dma_dev, return true; if (!blk_rq_dma_map_iter_next(req, dma_dev, &iod->dma_state, iter)) return false; - if (!dma_use_iova(&iod->dma_state) && dma_need_unmap(dma_dev)) { - iod->dma_vecs[iod->nr_dma_vecs].addr = iter->addr; - iod->dma_vecs[iod->nr_dma_vecs].len = iter->len; - iod->nr_dma_vecs++; - } - return true; + return nvme_pci_prp_save_mapping(req, dma_dev, iter); } static blk_status_t nvme_pci_setup_data_prp(struct request *req, @@ -843,15 +864,8 @@ static blk_status_t nvme_pci_setup_data_prp(struct request *req, unsigned int prp_len, i; __le64 *prp_list; - if (!dma_use_iova(&iod->dma_state) && dma_need_unmap(nvmeq->dev->dev)) { - iod->dma_vecs = mempool_alloc(nvmeq->dev->dmavec_mempool, - GFP_ATOMIC); - if (!iod->dma_vecs) - return BLK_STS_RESOURCE; - iod->dma_vecs[0].addr = iter->addr; - iod->dma_vecs[0].len = iter->len; - iod->nr_dma_vecs = 1; - } + if (!nvme_pci_prp_save_mapping(req, nvmeq->dev->dev, iter)) + return iter->status; /* * PRP1 always points to the start of the DMA transfers. @@ -1219,6 +1233,7 @@ static blk_status_t nvme_prep_rq(struct request *req) iod->nr_descriptors = 0; iod->total_len = 0; iod->meta_total_len = 0; + iod->nr_dma_vecs = 0; ret = nvme_setup_cmd(req->q->queuedata, req); if (ret) From 52a0a98549344ca20ad81a4176d68d28e3c05a5c Mon Sep 17 00:00:00 2001 From: YunJe Shin Date: Wed, 28 Jan 2026 09:41:07 +0900 Subject: [PATCH 151/167] nvmet-tcp: add bounds checks in nvmet_tcp_build_pdu_iovec nvmet_tcp_build_pdu_iovec() could walk past cmd->req.sg when a PDU length or offset exceeds sg_cnt and then use bogus sg->length/offset values, leading to _copy_to_iter() GPF/KASAN. Guard sg_idx, remaining entries, and sg->length/offset before building the bvec. Fixes: 872d26a391da ("nvmet-tcp: add NVMe over TCP target driver") Signed-off-by: YunJe Shin Reviewed-by: Sagi Grimberg Reviewed-by: Joonkyo Jung Signed-off-by: Keith Busch --- drivers/nvme/target/tcp.c | 17 +++++++++++++++++ 1 file changed, 17 insertions(+) diff --git a/drivers/nvme/target/tcp.c b/drivers/nvme/target/tcp.c index 549a4786d1c3..bda816d66846 100644 --- a/drivers/nvme/target/tcp.c +++ b/drivers/nvme/target/tcp.c @@ -349,11 +349,14 @@ static void nvmet_tcp_free_cmd_buffers(struct nvmet_tcp_cmd *cmd) cmd->req.sg = NULL; } +static void nvmet_tcp_fatal_error(struct nvmet_tcp_queue *queue); + static void nvmet_tcp_build_pdu_iovec(struct nvmet_tcp_cmd *cmd) { struct bio_vec *iov = cmd->iov; struct scatterlist *sg; u32 length, offset, sg_offset; + unsigned int sg_remaining; int nr_pages; length = cmd->pdu_len; @@ -361,9 +364,22 @@ static void nvmet_tcp_build_pdu_iovec(struct nvmet_tcp_cmd *cmd) offset = cmd->rbytes_done; cmd->sg_idx = offset / PAGE_SIZE; sg_offset = offset % PAGE_SIZE; + if (!cmd->req.sg_cnt || cmd->sg_idx >= cmd->req.sg_cnt) { + nvmet_tcp_fatal_error(cmd->queue); + return; + } sg = &cmd->req.sg[cmd->sg_idx]; + sg_remaining = cmd->req.sg_cnt - cmd->sg_idx; while (length) { + if (!sg_remaining) { + nvmet_tcp_fatal_error(cmd->queue); + return; + } + if (!sg->length || sg->length <= sg_offset) { + nvmet_tcp_fatal_error(cmd->queue); + return; + } u32 iov_len = min_t(u32, length, sg->length - sg_offset); bvec_set_page(iov, sg_page(sg), iov_len, @@ -371,6 +387,7 @@ static void nvmet_tcp_build_pdu_iovec(struct nvmet_tcp_cmd *cmd) length -= iov_len; sg = sg_next(sg); + sg_remaining--; iov++; sg_offset = 0; } From ab10815472fcbc2c772dc21a979460b7f74f0145 Mon Sep 17 00:00:00 2001 From: Petr Pavlu Date: Fri, 23 Jan 2026 11:26:56 +0100 Subject: [PATCH 152/167] livepatch: Fix having __klp_objects relics in non-livepatch modules The linker script scripts/module.lds.S specifies that all input __klp_objects sections should be consolidated into an output section of the same name, and start/stop symbols should be created to enable scripts/livepatch/init.c to locate this data. This start/stop pattern is not ideal for modules because the symbols are created even if no __klp_objects input sections are present. Consequently, a dummy __klp_objects section also appears in the resulting module. This unnecessarily pollutes non-livepatch modules. Instead, since modules are relocatable files, the usual method for locating consolidated data in a module is to read its section table. This approach avoids the aforementioned problem. The klp_modinfo already stores a copy of the entire section table with the final addresses. Introduce a helper function that scripts/livepatch/init.c can call to obtain the location of the __klp_objects section from this data. Fixes: dd590d4d57eb ("objtool/klp: Introduce klp diff subcommand for diffing object files") Signed-off-by: Petr Pavlu Acked-by: Joe Lawrence Acked-by: Miroslav Benes Reviewed-by: Aaron Tomlin Link: https://patch.msgid.link/20260123102825.3521961-2-petr.pavlu@suse.com Signed-off-by: Josh Poimboeuf --- include/linux/livepatch.h | 3 +++ kernel/livepatch/core.c | 19 +++++++++++++++++++ scripts/livepatch/init.c | 20 +++++++++----------- scripts/module.lds.S | 7 +------ 4 files changed, 32 insertions(+), 17 deletions(-) diff --git a/include/linux/livepatch.h b/include/linux/livepatch.h index 772919e8096a..ba9e3988c07c 100644 --- a/include/linux/livepatch.h +++ b/include/linux/livepatch.h @@ -175,6 +175,9 @@ int klp_enable_patch(struct klp_patch *); int klp_module_coming(struct module *mod); void klp_module_going(struct module *mod); +void *klp_find_section_by_name(const struct module *mod, const char *name, + size_t *sec_size); + void klp_copy_process(struct task_struct *child); void klp_update_patch_state(struct task_struct *task); diff --git a/kernel/livepatch/core.c b/kernel/livepatch/core.c index 9917756dae46..1acbad2dbfdf 100644 --- a/kernel/livepatch/core.c +++ b/kernel/livepatch/core.c @@ -1356,6 +1356,25 @@ void klp_module_going(struct module *mod) mutex_unlock(&klp_mutex); } +void *klp_find_section_by_name(const struct module *mod, const char *name, + size_t *sec_size) +{ + struct klp_modinfo *info = mod->klp_info; + + for (int i = 1; i < info->hdr.e_shnum; i++) { + Elf_Shdr *shdr = &info->sechdrs[i]; + + if (!strcmp(info->secstrings + shdr->sh_name, name)) { + *sec_size = shdr->sh_size; + return (void *)shdr->sh_addr; + } + } + + *sec_size = 0; + return NULL; +} +EXPORT_SYMBOL_GPL(klp_find_section_by_name); + static int __init klp_init(void) { klp_root_kobj = kobject_create_and_add("livepatch", kernel_kobj); diff --git a/scripts/livepatch/init.c b/scripts/livepatch/init.c index 2274d8f5a482..9e315fc857bd 100644 --- a/scripts/livepatch/init.c +++ b/scripts/livepatch/init.c @@ -9,19 +9,19 @@ #include #include -extern struct klp_object_ext __start_klp_objects[]; -extern struct klp_object_ext __stop_klp_objects[]; - static struct klp_patch *patch; static int __init livepatch_mod_init(void) { + struct klp_object_ext *obj_exts; + size_t obj_exts_sec_size; struct klp_object *objs; unsigned int nr_objs; int ret; - nr_objs = __stop_klp_objects - __start_klp_objects; - + obj_exts = klp_find_section_by_name(THIS_MODULE, "__klp_objects", + &obj_exts_sec_size); + nr_objs = obj_exts_sec_size / sizeof(*obj_exts); if (!nr_objs) { pr_err("nothing to patch!\n"); ret = -EINVAL; @@ -41,7 +41,7 @@ static int __init livepatch_mod_init(void) } for (int i = 0; i < nr_objs; i++) { - struct klp_object_ext *obj_ext = __start_klp_objects + i; + struct klp_object_ext *obj_ext = obj_exts + i; struct klp_func_ext *funcs_ext = obj_ext->funcs; unsigned int nr_funcs = obj_ext->nr_funcs; struct klp_func *funcs = objs[i].funcs; @@ -90,12 +90,10 @@ static int __init livepatch_mod_init(void) static void __exit livepatch_mod_exit(void) { - unsigned int nr_objs; + struct klp_object *obj; - nr_objs = __stop_klp_objects - __start_klp_objects; - - for (int i = 0; i < nr_objs; i++) - kfree(patch->objs[i].funcs); + klp_for_each_object_static(patch, obj) + kfree(obj->funcs); kfree(patch->objs); kfree(patch); diff --git a/scripts/module.lds.S b/scripts/module.lds.S index 3037d5e5527c..383d19beffb4 100644 --- a/scripts/module.lds.S +++ b/scripts/module.lds.S @@ -35,12 +35,7 @@ SECTIONS { __patchable_function_entries : { *(__patchable_function_entries) } __klp_funcs 0: ALIGN(8) { KEEP(*(__klp_funcs)) } - - __klp_objects 0: ALIGN(8) { - __start_klp_objects = .; - KEEP(*(__klp_objects)) - __stop_klp_objects = .; - } + __klp_objects 0: ALIGN(8) { KEEP(*(__klp_objects)) } #ifdef CONFIG_ARCH_USES_CFI_TRAPS __kcfi_traps : { KEEP(*(.kcfi_traps)) } From b525fcaf0a76507f152d58c6f9e5ef67b3ff552c Mon Sep 17 00:00:00 2001 From: Petr Pavlu Date: Fri, 23 Jan 2026 11:26:57 +0100 Subject: [PATCH 153/167] livepatch: Free klp_{object,func}_ext data after initialization The klp_object_ext and klp_func_ext data, which are stored in the __klp_objects and __klp_funcs sections, respectively, are not needed after they are used to create the actual klp_object and klp_func instances. This operation is implemented by the init function in scripts/livepatch/init.c. Prefix the two sections with ".init" so they are freed after the module is initializated. Signed-off-by: Petr Pavlu Acked-by: Joe Lawrence Acked-by: Miroslav Benes Reviewed-by: Aaron Tomlin Link: https://patch.msgid.link/20260123102825.3521961-3-petr.pavlu@suse.com Signed-off-by: Josh Poimboeuf --- scripts/livepatch/init.c | 2 +- scripts/module.lds.S | 4 ++-- tools/objtool/check.c | 2 +- tools/objtool/include/objtool/klp.h | 10 +++++----- tools/objtool/klp-diff.c | 2 +- 5 files changed, 10 insertions(+), 10 deletions(-) diff --git a/scripts/livepatch/init.c b/scripts/livepatch/init.c index 9e315fc857bd..638c95cffe76 100644 --- a/scripts/livepatch/init.c +++ b/scripts/livepatch/init.c @@ -19,7 +19,7 @@ static int __init livepatch_mod_init(void) unsigned int nr_objs; int ret; - obj_exts = klp_find_section_by_name(THIS_MODULE, "__klp_objects", + obj_exts = klp_find_section_by_name(THIS_MODULE, ".init.klp_objects", &obj_exts_sec_size); nr_objs = obj_exts_sec_size / sizeof(*obj_exts); if (!nr_objs) { diff --git a/scripts/module.lds.S b/scripts/module.lds.S index 383d19beffb4..054ef99e8288 100644 --- a/scripts/module.lds.S +++ b/scripts/module.lds.S @@ -34,8 +34,8 @@ SECTIONS { __patchable_function_entries : { *(__patchable_function_entries) } - __klp_funcs 0: ALIGN(8) { KEEP(*(__klp_funcs)) } - __klp_objects 0: ALIGN(8) { KEEP(*(__klp_objects)) } + .init.klp_funcs 0 : ALIGN(8) { KEEP(*(.init.klp_funcs)) } + .init.klp_objects 0 : ALIGN(8) { KEEP(*(.init.klp_objects)) } #ifdef CONFIG_ARCH_USES_CFI_TRAPS __kcfi_traps : { KEEP(*(.kcfi_traps)) } diff --git a/tools/objtool/check.c b/tools/objtool/check.c index 3f7999317f4d..933868ee3beb 100644 --- a/tools/objtool/check.c +++ b/tools/objtool/check.c @@ -4761,7 +4761,7 @@ static int validate_ibt(struct objtool_file *file) !strcmp(sec->name, "__bug_table") || !strcmp(sec->name, "__ex_table") || !strcmp(sec->name, "__jump_table") || - !strcmp(sec->name, "__klp_funcs") || + !strcmp(sec->name, ".init.klp_funcs") || !strcmp(sec->name, "__mcount_loc") || !strcmp(sec->name, ".llvm.call-graph-profile") || !strcmp(sec->name, ".llvm_bb_addr_map") || diff --git a/tools/objtool/include/objtool/klp.h b/tools/objtool/include/objtool/klp.h index ad830a7ce55b..e32e5e8bc631 100644 --- a/tools/objtool/include/objtool/klp.h +++ b/tools/objtool/include/objtool/klp.h @@ -6,12 +6,12 @@ #define SHN_LIVEPATCH 0xff20 /* - * __klp_objects and __klp_funcs are created by klp diff and used by the patch - * module init code to build the klp_patch, klp_object and klp_func structs - * needed by the livepatch API. + * .init.klp_objects and .init.klp_funcs are created by klp diff and used by the + * patch module init code to build the klp_patch, klp_object and klp_func + * structs needed by the livepatch API. */ -#define KLP_OBJECTS_SEC "__klp_objects" -#define KLP_FUNCS_SEC "__klp_funcs" +#define KLP_OBJECTS_SEC ".init.klp_objects" +#define KLP_FUNCS_SEC ".init.klp_funcs" /* * __klp_relocs is an intermediate section which are created by klp diff and diff --git a/tools/objtool/klp-diff.c b/tools/objtool/klp-diff.c index d94531e3f64e..1e649a3eb4cd 100644 --- a/tools/objtool/klp-diff.c +++ b/tools/objtool/klp-diff.c @@ -1436,7 +1436,7 @@ static int clone_special_sections(struct elfs *e) } /* - * Create __klp_objects and __klp_funcs sections which are intermediate + * Create .init.klp_objects and .init.klp_funcs sections which are intermediate * sections provided as input to the patch module's init code for building the * klp_patch, klp_object and klp_func structs for the livepatch API. */ From 18328546dd59b6adc111cf84a0ee4cdd3a867611 Mon Sep 17 00:00:00 2001 From: Josh Poimboeuf Date: Mon, 2 Feb 2026 10:01:08 -0800 Subject: [PATCH 154/167] objtool/klp: Fix symbol correlation for orphaned local symbols When compiling with CONFIG_LTO_CLANG_THIN, vmlinux.o has __irf_[start|end] before the first FILE entry: $ readelf -sW vmlinux.o Symbol table '.symtab' contains 597706 entries: Num: Value Size Type Bind Vis Ndx Name 0: 0000000000000000 0 NOTYPE LOCAL DEFAULT UND 1: 0000000000000000 0 NOTYPE LOCAL DEFAULT 18 __irf_start 2: 0000000000000200 0 NOTYPE LOCAL DEFAULT 18 __irf_end 3: 0000000000000000 0 SECTION LOCAL DEFAULT 17 .text 4: 0000000000000000 0 SECTION LOCAL DEFAULT 18 .init.ramfs This causes klp-build warnings like: vmlinux.o: warning: objtool: no correlation: __irf_start vmlinux.o: warning: objtool: no correlation: __irf_end The problem is that Clang LTO is stripping the initramfs_data.o FILE symbol, causing those two symbols to be orphaned and not noticed by klp-diff's correlation logic. Add a loop to correlate any symbols found before the first FILE symbol. Fixes: dd590d4d57eb ("objtool/klp: Introduce klp diff subcommand for diffing object files") Reported-by: Song Liu Acked-by: Song Liu Link: https://patch.msgid.link/e21ec1141fc749b5f538d7329b531c1ab63a6d1a.1770055235.git.jpoimboe@kernel.org Signed-off-by: Josh Poimboeuf --- tools/objtool/klp-diff.c | 39 ++++++++++++++++++++++++++++++++++----- 1 file changed, 34 insertions(+), 5 deletions(-) diff --git a/tools/objtool/klp-diff.c b/tools/objtool/klp-diff.c index 1e649a3eb4cd..9f1f4011eb9c 100644 --- a/tools/objtool/klp-diff.c +++ b/tools/objtool/klp-diff.c @@ -364,11 +364,40 @@ static int correlate_symbols(struct elfs *e) struct symbol *file1_sym, *file2_sym; struct symbol *sym1, *sym2; - /* Correlate locals */ - for (file1_sym = first_file_symbol(e->orig), - file2_sym = first_file_symbol(e->patched); ; - file1_sym = next_file_symbol(e->orig, file1_sym), - file2_sym = next_file_symbol(e->patched, file2_sym)) { + file1_sym = first_file_symbol(e->orig); + file2_sym = first_file_symbol(e->patched); + + /* + * Correlate any locals before the first FILE symbol. This has been + * seen when LTO inexplicably strips the initramfs_data.o FILE symbol + * due to the file only containing data and no code. + */ + for_each_sym(e->orig, sym1) { + if (sym1 == file1_sym || !is_local_sym(sym1)) + break; + + if (dont_correlate(sym1)) + continue; + + for_each_sym(e->patched, sym2) { + if (sym2 == file2_sym || !is_local_sym(sym2)) + break; + + if (sym2->twin || dont_correlate(sym2)) + continue; + + if (strcmp(sym1->demangled_name, sym2->demangled_name)) + continue; + + sym1->twin = sym2; + sym2->twin = sym1; + break; + } + } + + /* Correlate locals after the first FILE symbol */ + for (; ; file1_sym = next_file_symbol(e->orig, file1_sym), + file2_sym = next_file_symbol(e->patched, file2_sym)) { if (!file1_sym && file2_sym) { ERROR("FILE symbol mismatch: NULL != %s", file2_sym->name); From f495054bd12e2abe5068e243bdf344b704c303c6 Mon Sep 17 00:00:00 2001 From: Josh Poimboeuf Date: Mon, 2 Feb 2026 11:00:17 -0800 Subject: [PATCH 155/167] objtool/klp: Fix unexported static call key access for manually built livepatch modules Enabling CONFIG_MEM_ALLOC_PROFILING_DEBUG with CONFIG_SAMPLE_LIVEPATCH results in the following error: samples/livepatch/livepatch-shadow-fix1.o: error: objtool: static_call: can't find static_call_key symbol: __SCK__WARN_trap This is caused an extra file->klp sanity check which was added by commit 164c9201e1da ("objtool: Add base objtool support for livepatch modules"). That check was intended to ensure that livepatch modules built with klp-build always have full access to their static call keys. However, it failed to account for the fact that manually built livepatch modules (i.e., not built with klp-build) might need access to unexported static call keys, for which read-only access is typically allowed for modules. While the livepatch-shadow-fix1 module doesn't explicitly use any static calls, it does have a memory allocation, which can cause CONFIG_MEM_ALLOC_PROFILING_DEBUG to insert a WARN() call. And WARN() is now an unexported static call as of commit 860238af7a33 ("x86_64/bug: Inline the UD1"). Fix it by removing the overzealous file->klp check, restoring the original behavior for manually built livepatch modules. Fixes: 164c9201e1da ("objtool: Add base objtool support for livepatch modules") Reported-by: Arnd Bergmann Acked-by: Song Liu Tested-by: Arnd Bergmann Link: https://patch.msgid.link/0bd3ae9a53c3d743417fe842b740a7720e2bcd1c.1770058775.git.jpoimboe@kernel.org Signed-off-by: Josh Poimboeuf --- tools/objtool/check.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/tools/objtool/check.c b/tools/objtool/check.c index 933868ee3beb..ef451cd6277c 100644 --- a/tools/objtool/check.c +++ b/tools/objtool/check.c @@ -682,7 +682,7 @@ static int create_static_call_sections(struct objtool_file *file) key_sym = find_symbol_by_name(file->elf, tmp); if (!key_sym) { - if (!opts.module || file->klp) { + if (!opts.module) { ERROR("static_call: can't find static_call_key symbol: %s", tmp); return -1; } From a6abd64e145ca3084b6a567e06517a7c2dbfd38d Mon Sep 17 00:00:00 2001 From: Jens Axboe Date: Thu, 5 Feb 2026 09:26:53 -0700 Subject: [PATCH 156/167] loop: revert exclusive opener loop status change This commit effectively reverts the following two commits: 2704024d83fa ("loop: add missing bd_abort_claiming in loop_set_status") 08e136ebd193 ("loop: don't change loop device under exclusive opener in loop_set_status") as there are reports of them causing issues with unmounting. As we're close to the 6.19 kernel release and the original author hasn't taken a closer look at this yet, revert them for release. Reported-by: nokangaroo Link: https://lore.kernel.org/all/62de4453-17e8-47f6-a10b-39bf5a49fdee@leemhuis.info/ Signed-off-by: Jens Axboe --- drivers/block/loop.c | 45 ++++++++++++-------------------------------- 1 file changed, 12 insertions(+), 33 deletions(-) diff --git a/drivers/block/loop.c b/drivers/block/loop.c index bd59c0e9508b..32a3a5b13802 100644 --- a/drivers/block/loop.c +++ b/drivers/block/loop.c @@ -1225,28 +1225,16 @@ static int loop_clr_fd(struct loop_device *lo) } static int -loop_set_status(struct loop_device *lo, blk_mode_t mode, - struct block_device *bdev, const struct loop_info64 *info) +loop_set_status(struct loop_device *lo, const struct loop_info64 *info) { int err; bool partscan = false; bool size_changed = false; unsigned int memflags; - /* - * If we don't hold exclusive handle for the device, upgrade to it - * here to avoid changing device under exclusive owner. - */ - if (!(mode & BLK_OPEN_EXCL)) { - err = bd_prepare_to_claim(bdev, loop_set_status, NULL); - if (err) - goto out_reread_partitions; - } - err = mutex_lock_killable(&lo->lo_mutex); if (err) - goto out_abort_claiming; - + return err; if (lo->lo_state != Lo_bound) { err = -ENXIO; goto out_unlock; @@ -1285,10 +1273,6 @@ loop_set_status(struct loop_device *lo, blk_mode_t mode, } out_unlock: mutex_unlock(&lo->lo_mutex); -out_abort_claiming: - if (!(mode & BLK_OPEN_EXCL)) - bd_abort_claiming(bdev, loop_set_status); -out_reread_partitions: if (partscan) loop_reread_partitions(lo); @@ -1368,9 +1352,7 @@ loop_info64_to_old(const struct loop_info64 *info64, struct loop_info *info) } static int -loop_set_status_old(struct loop_device *lo, blk_mode_t mode, - struct block_device *bdev, - const struct loop_info __user *arg) +loop_set_status_old(struct loop_device *lo, const struct loop_info __user *arg) { struct loop_info info; struct loop_info64 info64; @@ -1378,19 +1360,17 @@ loop_set_status_old(struct loop_device *lo, blk_mode_t mode, if (copy_from_user(&info, arg, sizeof (struct loop_info))) return -EFAULT; loop_info64_from_old(&info, &info64); - return loop_set_status(lo, mode, bdev, &info64); + return loop_set_status(lo, &info64); } static int -loop_set_status64(struct loop_device *lo, blk_mode_t mode, - struct block_device *bdev, - const struct loop_info64 __user *arg) +loop_set_status64(struct loop_device *lo, const struct loop_info64 __user *arg) { struct loop_info64 info64; if (copy_from_user(&info64, arg, sizeof (struct loop_info64))) return -EFAULT; - return loop_set_status(lo, mode, bdev, &info64); + return loop_set_status(lo, &info64); } static int @@ -1569,14 +1549,14 @@ static int lo_ioctl(struct block_device *bdev, blk_mode_t mode, case LOOP_SET_STATUS: err = -EPERM; if ((mode & BLK_OPEN_WRITE) || capable(CAP_SYS_ADMIN)) - err = loop_set_status_old(lo, mode, bdev, argp); + err = loop_set_status_old(lo, argp); break; case LOOP_GET_STATUS: return loop_get_status_old(lo, argp); case LOOP_SET_STATUS64: err = -EPERM; if ((mode & BLK_OPEN_WRITE) || capable(CAP_SYS_ADMIN)) - err = loop_set_status64(lo, mode, bdev, argp); + err = loop_set_status64(lo, argp); break; case LOOP_GET_STATUS64: return loop_get_status64(lo, argp); @@ -1670,9 +1650,8 @@ loop_info64_to_compat(const struct loop_info64 *info64, } static int -loop_set_status_compat(struct loop_device *lo, blk_mode_t mode, - struct block_device *bdev, - const struct compat_loop_info __user *arg) +loop_set_status_compat(struct loop_device *lo, + const struct compat_loop_info __user *arg) { struct loop_info64 info64; int ret; @@ -1680,7 +1659,7 @@ loop_set_status_compat(struct loop_device *lo, blk_mode_t mode, ret = loop_info64_from_compat(arg, &info64); if (ret < 0) return ret; - return loop_set_status(lo, mode, bdev, &info64); + return loop_set_status(lo, &info64); } static int @@ -1706,7 +1685,7 @@ static int lo_compat_ioctl(struct block_device *bdev, blk_mode_t mode, switch(cmd) { case LOOP_SET_STATUS: - err = loop_set_status_compat(lo, mode, bdev, + err = loop_set_status_compat(lo, (const struct compat_loop_info __user *)arg); break; case LOOP_GET_STATUS: From bbf4a17ad9ffc4e3d7ec13d73ecd59dea149ed25 Mon Sep 17 00:00:00 2001 From: Shigeru Yoshida Date: Wed, 4 Feb 2026 18:58:37 +0900 Subject: [PATCH 157/167] ipv6: Fix ECMP sibling count mismatch when clearing RTF_ADDRCONF syzbot reported a kernel BUG in fib6_add_rt2node() when adding an IPv6 route. [0] Commit f72514b3c569 ("ipv6: clear RA flags when adding a static route") introduced logic to clear RTF_ADDRCONF from existing routes when a static route with the same nexthop is added. However, this causes a problem when the existing route has a gateway. When RTF_ADDRCONF is cleared from a route that has a gateway, that route becomes eligible for ECMP, i.e. rt6_qualify_for_ecmp() returns true. The issue is that this route was never added to the fib6_siblings list. This leads to a mismatch between the following counts: - The sibling count computed by iterating fib6_next chain, which includes the newly ECMP-eligible route - The actual siblings in fib6_siblings list, which does not include that route When a subsequent ECMP route is added, fib6_add_rt2node() hits BUG_ON(sibling->fib6_nsiblings != rt->fib6_nsiblings) because the counts don't match. Fix this by only clearing RTF_ADDRCONF when the existing route does not have a gateway. Routes without a gateway cannot qualify for ECMP anyway (rt6_qualify_for_ecmp() requires fib_nh_gw_family), so clearing RTF_ADDRCONF on them is safe and matches the original intent of the commit. [0]: kernel BUG at net/ipv6/ip6_fib.c:1217! Oops: invalid opcode: 0000 [#1] SMP KASAN PTI CPU: 0 UID: 0 PID: 6010 Comm: syz.0.17 Not tainted syzkaller #0 PREEMPT(full) Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 10/25/2025 RIP: 0010:fib6_add_rt2node+0x3433/0x3470 net/ipv6/ip6_fib.c:1217 [...] Call Trace: fib6_add+0x8da/0x18a0 net/ipv6/ip6_fib.c:1532 __ip6_ins_rt net/ipv6/route.c:1351 [inline] ip6_route_add+0xde/0x1b0 net/ipv6/route.c:3946 ipv6_route_ioctl+0x35c/0x480 net/ipv6/route.c:4571 inet6_ioctl+0x219/0x280 net/ipv6/af_inet6.c:577 sock_do_ioctl+0xdc/0x300 net/socket.c:1245 sock_ioctl+0x576/0x790 net/socket.c:1366 vfs_ioctl fs/ioctl.c:51 [inline] __do_sys_ioctl fs/ioctl.c:597 [inline] __se_sys_ioctl+0xfc/0x170 fs/ioctl.c:583 do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline] do_syscall_64+0xfa/0xf80 arch/x86/entry/syscall_64.c:94 entry_SYSCALL_64_after_hwframe+0x77/0x7f Fixes: f72514b3c569 ("ipv6: clear RA flags when adding a static route") Reported-by: syzbot+cb809def1baaac68ab92@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=cb809def1baaac68ab92 Tested-by: syzbot+cb809def1baaac68ab92@syzkaller.appspotmail.com Signed-off-by: Shigeru Yoshida Reviewed-by: Fernando Fernandez Mancera Link: https://patch.msgid.link/20260204095837.1285552-1-syoshida@redhat.com Signed-off-by: Jakub Kicinski --- net/ipv6/ip6_fib.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/net/ipv6/ip6_fib.c b/net/ipv6/ip6_fib.c index 2111af022d94..c6439e30e892 100644 --- a/net/ipv6/ip6_fib.c +++ b/net/ipv6/ip6_fib.c @@ -1138,7 +1138,8 @@ static int fib6_add_rt2node(struct fib6_node *fn, struct fib6_info *rt, fib6_set_expires(iter, rt->expires); fib6_add_gc_list(iter); } - if (!(rt->fib6_flags & (RTF_ADDRCONF | RTF_PREFIX_RT))) { + if (!(rt->fib6_flags & (RTF_ADDRCONF | RTF_PREFIX_RT)) && + !iter->fib6_nh->fib_nh_gw_family) { iter->fib6_flags &= ~RTF_ADDRCONF; iter->fib6_flags &= ~RTF_PREFIX_RT; } From e34f77b09080c86c929153e2a72da26b4f8947ff Mon Sep 17 00:00:00 2001 From: Chen Ni Date: Thu, 5 Feb 2026 15:26:49 +0800 Subject: [PATCH 158/167] gpio: loongson-64bit: Fix incorrect NULL check after devm_kcalloc() Fix incorrect NULL check in loongson_gpio_init_irqchip(). The function checks chip->parent instead of chip->irq.parents. Fixes: 03c146cb6cd1 ("gpio: loongson-64bit: Add support for Loongson-2K0300 SoC") Signed-off-by: Chen Ni Link: https://patch.msgid.link/20260205072649.3271158-1-nichen@iscas.ac.cn Signed-off-by: Bartosz Golaszewski --- drivers/gpio/gpio-loongson-64bit.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpio/gpio-loongson-64bit.c b/drivers/gpio/gpio-loongson-64bit.c index 77d07e31366f..0fdf15faa344 100644 --- a/drivers/gpio/gpio-loongson-64bit.c +++ b/drivers/gpio/gpio-loongson-64bit.c @@ -263,7 +263,7 @@ static int loongson_gpio_init_irqchip(struct platform_device *pdev, chip->irq.num_parents = data->intr_num; chip->irq.parents = devm_kcalloc(&pdev->dev, data->intr_num, sizeof(*chip->irq.parents), GFP_KERNEL); - if (!chip->parent) + if (!chip->irq.parents) return -ENOMEM; for (i = 0; i < data->intr_num; i++) { From 351ea48ae880b1673abcf232947c577183fdf712 Mon Sep 17 00:00:00 2001 From: Al Viro Date: Mon, 26 Jan 2026 01:05:57 -0500 Subject: [PATCH 159/167] rust_binderfs: fix a dentry leak Parallel to binderfs patches - 02da8d2c0965 "binderfs_binder_ctl_create(): kill a bogus check" and the bit of b89aa544821d "convert binderfs" that got lost when making 4433d8e25d73 "convert rust_binderfs"; the former is a cleanup, the latter is about marking /binder-control persistent, so that it would be taken out on umount. Fixes: 4433d8e25d73 ("convert rust_binderfs") Acked-by: Alice Ryhl Acked-by: Christian Brauner Signed-off-by: Al Viro --- drivers/android/binder/rust_binderfs.c | 9 ++------- 1 file changed, 2 insertions(+), 7 deletions(-) diff --git a/drivers/android/binder/rust_binderfs.c b/drivers/android/binder/rust_binderfs.c index c69026df775c..e36011e89116 100644 --- a/drivers/android/binder/rust_binderfs.c +++ b/drivers/android/binder/rust_binderfs.c @@ -391,12 +391,6 @@ static int binderfs_binder_ctl_create(struct super_block *sb) if (!device) return -ENOMEM; - /* If we have already created a binder-control node, return. */ - if (info->control_dentry) { - ret = 0; - goto out; - } - ret = -ENOMEM; inode = new_inode(sb); if (!inode) @@ -431,7 +425,8 @@ static int binderfs_binder_ctl_create(struct super_block *sb) inode->i_private = device; info->control_dentry = dentry; - d_add(dentry, inode); + d_make_persistent(dentry, inode); + dput(dentry); return 0; From 2005aabe94eaab8608879d98afb901bc99bc3a31 Mon Sep 17 00:00:00 2001 From: Al Viro Date: Sat, 31 Jan 2026 18:24:41 -0500 Subject: [PATCH 160/167] functionfs: use spinlock for FFS_DEACTIVATED/FFS_CLOSING transitions When all files are closed, functionfs needs ffs_data_reset() to be done before any further opens are allowed. During that time we have ffs->state set to FFS_CLOSING; that makes ->open() fail with -EBUSY. Once ffs_data_reset() is done, it switches state (to FFS_READ_DESCRIPTORS) indicating that opening that thing is allowed again. There's a couple of additional twists: * mounting with -o no_disconnect delays ffs_data_reset() from doing that at the final ->release() to the first subsequent open(). That's indicated by ffs->state set to FFS_DEACTIVATED; if open() sees that, it immediately switches to FFS_CLOSING and proceeds with doing ffs_data_reset() before returning to userland. * a couple of usb callbacks need to force the delayed transition; unfortunately, they are done in locking environment that does not allow blocking and ffs_data_reset() can block. As the result, if these callbacks see FFS_DEACTIVATED, they change state to FFS_CLOSING and use schedule_work() to get ffs_data_reset() executed asynchronously. Unfortunately, the locking is rather insufficient. A fix attempted in e5bf5ee26663 ("functionfs: fix the open/removal races") had closed a bunch of UAF, but it didn't do anything to the callbacks, lacked barriers in transition from FFS_CLOSING to FFS_READ_DESCRIPTORS _and_ it had been too heavy-handed in open()/open() serialization - I've used ffs->mutex for that, and it's being held over actual IO on ep0, complete with copy_from_user(), etc. Even more unfortunately, the userland side is apparently racy enough to have the resulting timing changes (no failures, just a delayed return of open(2)) disrupt the things quite badly. Userland bugs or not, it's a clear regression that needs to be dealt with. Solution is to use a spinlock for serializing these state checks and transitions - unlike ffs->mutex it can be taken in these callbacks and it doesn't disrupt the timings in open(). We could introduce a new spinlock, but it's easier to use the one that is already there (ffs->eps_lock) instead - the locking environment is safe for it in all affected places. Since now it is held over all places that alter or check the open count (ffs->opened), there's no need to keep that atomic_t - int would serve just fine and it's simpler that way. Fixes: e5bf5ee26663 ("functionfs: fix the open/removal races") Fixes: 18d6b32fca38 ("usb: gadget: f_fs: add "no_disconnect" mode") # v4.0 Tested-by: Samuel Wu Signed-off-by: Al Viro --- drivers/usb/gadget/function/f_fs.c | 102 ++++++++++++++--------------- drivers/usb/gadget/function/u_fs.h | 2 +- 2 files changed, 50 insertions(+), 54 deletions(-) diff --git a/drivers/usb/gadget/function/f_fs.c b/drivers/usb/gadget/function/f_fs.c index 05c6750702b6..fa467a40949d 100644 --- a/drivers/usb/gadget/function/f_fs.c +++ b/drivers/usb/gadget/function/f_fs.c @@ -59,7 +59,6 @@ static struct ffs_data *__must_check ffs_data_new(const char *dev_name) __attribute__((malloc)); /* Opened counter handling. */ -static void ffs_data_opened(struct ffs_data *ffs); static void ffs_data_closed(struct ffs_data *ffs); /* Called with ffs->mutex held; take over ownership of data. */ @@ -636,23 +635,25 @@ static ssize_t ffs_ep0_read(struct file *file, char __user *buf, return ret; } + +static void ffs_data_reset(struct ffs_data *ffs); + static int ffs_ep0_open(struct inode *inode, struct file *file) { struct ffs_data *ffs = inode->i_sb->s_fs_info; - int ret; - /* Acquire mutex */ - ret = ffs_mutex_lock(&ffs->mutex, file->f_flags & O_NONBLOCK); - if (ret < 0) - return ret; - - ffs_data_opened(ffs); + spin_lock_irq(&ffs->eps_lock); if (ffs->state == FFS_CLOSING) { - ffs_data_closed(ffs); - mutex_unlock(&ffs->mutex); + spin_unlock_irq(&ffs->eps_lock); return -EBUSY; } - mutex_unlock(&ffs->mutex); + if (!ffs->opened++ && ffs->state == FFS_DEACTIVATED) { + ffs->state = FFS_CLOSING; + spin_unlock_irq(&ffs->eps_lock); + ffs_data_reset(ffs); + } else { + spin_unlock_irq(&ffs->eps_lock); + } file->private_data = ffs; return stream_open(inode, file); @@ -1202,15 +1203,10 @@ ffs_epfile_open(struct inode *inode, struct file *file) { struct ffs_data *ffs = inode->i_sb->s_fs_info; struct ffs_epfile *epfile; - int ret; - /* Acquire mutex */ - ret = ffs_mutex_lock(&ffs->mutex, file->f_flags & O_NONBLOCK); - if (ret < 0) - return ret; - - if (!atomic_inc_not_zero(&ffs->opened)) { - mutex_unlock(&ffs->mutex); + spin_lock_irq(&ffs->eps_lock); + if (!ffs->opened) { + spin_unlock_irq(&ffs->eps_lock); return -ENODEV; } /* @@ -1220,11 +1216,11 @@ ffs_epfile_open(struct inode *inode, struct file *file) */ epfile = smp_load_acquire(&inode->i_private); if (unlikely(ffs->state != FFS_ACTIVE || !epfile)) { - mutex_unlock(&ffs->mutex); - ffs_data_closed(ffs); + spin_unlock_irq(&ffs->eps_lock); return -ENODEV; } - mutex_unlock(&ffs->mutex); + ffs->opened++; + spin_unlock_irq(&ffs->eps_lock); file->private_data = epfile; return stream_open(inode, file); @@ -2092,8 +2088,6 @@ static int ffs_fs_init_fs_context(struct fs_context *fc) return 0; } -static void ffs_data_reset(struct ffs_data *ffs); - static void ffs_fs_kill_sb(struct super_block *sb) { @@ -2150,15 +2144,6 @@ static void ffs_data_get(struct ffs_data *ffs) refcount_inc(&ffs->ref); } -static void ffs_data_opened(struct ffs_data *ffs) -{ - if (atomic_add_return(1, &ffs->opened) == 1 && - ffs->state == FFS_DEACTIVATED) { - ffs->state = FFS_CLOSING; - ffs_data_reset(ffs); - } -} - static void ffs_data_put(struct ffs_data *ffs) { if (refcount_dec_and_test(&ffs->ref)) { @@ -2176,28 +2161,29 @@ static void ffs_data_put(struct ffs_data *ffs) static void ffs_data_closed(struct ffs_data *ffs) { - if (atomic_dec_and_test(&ffs->opened)) { - if (ffs->no_disconnect) { - struct ffs_epfile *epfiles; - unsigned long flags; + spin_lock_irq(&ffs->eps_lock); + if (--ffs->opened) { // not the last opener? + spin_unlock_irq(&ffs->eps_lock); + return; + } + if (ffs->no_disconnect) { + struct ffs_epfile *epfiles; - ffs->state = FFS_DEACTIVATED; - spin_lock_irqsave(&ffs->eps_lock, flags); - epfiles = ffs->epfiles; - ffs->epfiles = NULL; - spin_unlock_irqrestore(&ffs->eps_lock, - flags); + ffs->state = FFS_DEACTIVATED; + epfiles = ffs->epfiles; + ffs->epfiles = NULL; + spin_unlock_irq(&ffs->eps_lock); - if (epfiles) - ffs_epfiles_destroy(ffs->sb, epfiles, - ffs->eps_count); + if (epfiles) + ffs_epfiles_destroy(ffs->sb, epfiles, + ffs->eps_count); - if (ffs->setup_state == FFS_SETUP_PENDING) - __ffs_ep0_stall(ffs); - } else { - ffs->state = FFS_CLOSING; - ffs_data_reset(ffs); - } + if (ffs->setup_state == FFS_SETUP_PENDING) + __ffs_ep0_stall(ffs); + } else { + ffs->state = FFS_CLOSING; + spin_unlock_irq(&ffs->eps_lock); + ffs_data_reset(ffs); } } @@ -2214,7 +2200,7 @@ static struct ffs_data *ffs_data_new(const char *dev_name) } refcount_set(&ffs->ref, 1); - atomic_set(&ffs->opened, 0); + ffs->opened = 0; ffs->state = FFS_READ_DESCRIPTORS; mutex_init(&ffs->mutex); spin_lock_init(&ffs->eps_lock); @@ -2266,6 +2252,7 @@ static void ffs_data_reset(struct ffs_data *ffs) { ffs_data_clear(ffs); + spin_lock_irq(&ffs->eps_lock); ffs->raw_descs_data = NULL; ffs->raw_descs = NULL; ffs->raw_strings = NULL; @@ -2289,6 +2276,7 @@ static void ffs_data_reset(struct ffs_data *ffs) ffs->ms_os_descs_ext_prop_count = 0; ffs->ms_os_descs_ext_prop_name_len = 0; ffs->ms_os_descs_ext_prop_data_len = 0; + spin_unlock_irq(&ffs->eps_lock); } @@ -3756,6 +3744,7 @@ static int ffs_func_set_alt(struct usb_function *f, { struct ffs_function *func = ffs_func_from_usb(f); struct ffs_data *ffs = func->ffs; + unsigned long flags; int ret = 0, intf; if (alt > MAX_ALT_SETTINGS) @@ -3768,12 +3757,15 @@ static int ffs_func_set_alt(struct usb_function *f, if (ffs->func) ffs_func_eps_disable(ffs->func); + spin_lock_irqsave(&ffs->eps_lock, flags); if (ffs->state == FFS_DEACTIVATED) { ffs->state = FFS_CLOSING; + spin_unlock_irqrestore(&ffs->eps_lock, flags); INIT_WORK(&ffs->reset_work, ffs_reset_work); schedule_work(&ffs->reset_work); return -ENODEV; } + spin_unlock_irqrestore(&ffs->eps_lock, flags); if (ffs->state != FFS_ACTIVE) return -ENODEV; @@ -3791,16 +3783,20 @@ static void ffs_func_disable(struct usb_function *f) { struct ffs_function *func = ffs_func_from_usb(f); struct ffs_data *ffs = func->ffs; + unsigned long flags; if (ffs->func) ffs_func_eps_disable(ffs->func); + spin_lock_irqsave(&ffs->eps_lock, flags); if (ffs->state == FFS_DEACTIVATED) { ffs->state = FFS_CLOSING; + spin_unlock_irqrestore(&ffs->eps_lock, flags); INIT_WORK(&ffs->reset_work, ffs_reset_work); schedule_work(&ffs->reset_work); return; } + spin_unlock_irqrestore(&ffs->eps_lock, flags); if (ffs->state == FFS_ACTIVE) { ffs->func = NULL; diff --git a/drivers/usb/gadget/function/u_fs.h b/drivers/usb/gadget/function/u_fs.h index 4b3365f23fd7..6a80182aadd7 100644 --- a/drivers/usb/gadget/function/u_fs.h +++ b/drivers/usb/gadget/function/u_fs.h @@ -176,7 +176,7 @@ struct ffs_data { /* reference counter */ refcount_t ref; /* how many files are opened (EP0 and others) */ - atomic_t opened; + int opened; /* EP0 state */ enum ffs_state state; From a0a75b40c919b9f6d3a0b6c978e6ccf344c1be5a Mon Sep 17 00:00:00 2001 From: Vishwaroop A Date: Wed, 4 Feb 2026 14:12:12 +0000 Subject: [PATCH 161/167] spi: tegra114: Preserve SPI mode bits in def_command1_reg The COMMAND1 register bits [29:28] set the SPI mode, which controls the clock idle level. When a transfer ends, tegra_spi_transfer_end() writes def_command1_reg back to restore the default state, but this register value currently lacks the mode bits. This results in the clock always being configured as idle low, breaking devices that need it high. Fix this by storing the mode bits in def_command1_reg during setup, to prevent this field from always being cleared. Fixes: f333a331adfa ("spi/tegra114: add spi driver") Signed-off-by: Vishwaroop A Link: https://patch.msgid.link/20260204141212.1540382-1-va@nvidia.com Signed-off-by: Mark Brown --- drivers/spi/spi-tegra114.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/drivers/spi/spi-tegra114.c b/drivers/spi/spi-tegra114.c index 795a8482c2c7..48fb11fea55f 100644 --- a/drivers/spi/spi-tegra114.c +++ b/drivers/spi/spi-tegra114.c @@ -978,11 +978,14 @@ static int tegra_spi_setup(struct spi_device *spi) if (spi_get_csgpiod(spi, 0)) gpiod_set_value(spi_get_csgpiod(spi, 0), 0); + /* Update default register to include CS polarity and SPI mode */ val = tspi->def_command1_reg; if (spi->mode & SPI_CS_HIGH) val &= ~SPI_CS_POL_INACTIVE(spi_get_chipselect(spi, 0)); else val |= SPI_CS_POL_INACTIVE(spi_get_chipselect(spi, 0)); + val &= ~SPI_CONTROL_MODE_MASK; + val |= SPI_MODE_SEL(spi->mode & 0x3); tspi->def_command1_reg = val; tegra_spi_writel(tspi, tspi->def_command1_reg, SPI_COMMAND1); spin_unlock_irqrestore(&tspi->lock, flags); From b5cbacd7f86f4f62b8813688c8e73be94e8e1951 Mon Sep 17 00:00:00 2001 From: Andrii Nakryiko Date: Thu, 29 Jan 2026 13:53:40 -0800 Subject: [PATCH 162/167] procfs: avoid fetching build ID while holding VMA lock Fix PROCMAP_QUERY to fetch optional build ID only after dropping mmap_lock or per-VMA lock, whichever was used to lock VMA under question, to avoid deadlock reported by syzbot: -> #1 (&mm->mmap_lock){++++}-{4:4}: __might_fault+0xed/0x170 _copy_to_iter+0x118/0x1720 copy_page_to_iter+0x12d/0x1e0 filemap_read+0x720/0x10a0 blkdev_read_iter+0x2b5/0x4e0 vfs_read+0x7f4/0xae0 ksys_read+0x12a/0x250 do_syscall_64+0xcb/0xf80 entry_SYSCALL_64_after_hwframe+0x77/0x7f -> #0 (&sb->s_type->i_mutex_key#8){++++}-{4:4}: __lock_acquire+0x1509/0x26d0 lock_acquire+0x185/0x340 down_read+0x98/0x490 blkdev_read_iter+0x2a7/0x4e0 __kernel_read+0x39a/0xa90 freader_fetch+0x1d5/0xa80 __build_id_parse.isra.0+0xea/0x6a0 do_procmap_query+0xd75/0x1050 procfs_procmap_ioctl+0x7a/0xb0 __x64_sys_ioctl+0x18e/0x210 do_syscall_64+0xcb/0xf80 entry_SYSCALL_64_after_hwframe+0x77/0x7f other info that might help us debug this: Possible unsafe locking scenario: CPU0 CPU1 ---- ---- rlock(&mm->mmap_lock); lock(&sb->s_type->i_mutex_key#8); lock(&mm->mmap_lock); rlock(&sb->s_type->i_mutex_key#8); *** DEADLOCK *** This seems to be exacerbated (as we haven't seen these syzbot reports before that) by the recent: 777a8560fd29 ("lib/buildid: use __kernel_read() for sleepable context") To make this safe, we need to grab file refcount while VMA is still locked, but other than that everything is pretty straightforward. Internal build_id_parse() API assumes VMA is passed, but it only needs the underlying file reference, so just add another variant build_id_parse_file() that expects file passed directly. [akpm@linux-foundation.org: fix up kerneldoc] Link: https://lkml.kernel.org/r/20260129215340.3742283-1-andrii@kernel.org Fixes: ed5d583a88a9 ("fs/procfs: implement efficient VMA querying API for /proc//maps") Signed-off-by: Andrii Nakryiko Reported-by: Reviewed-by: Suren Baghdasaryan Tested-by: Suren Baghdasaryan Reviewed-by: Shakeel Butt Cc: Alexei Starovoitov Cc: Daniel Borkmann Cc: Eduard Zingerman Cc: Hao Luo Cc: Jiri Olsa Cc: John Fastabend Cc: KP Singh Cc: Martin KaFai Lau Cc: Song Liu Cc: Stanislav Fomichev Cc: Yonghong Song Cc: Signed-off-by: Andrew Morton --- fs/proc/task_mmu.c | 42 ++++++++++++++++++++++++++--------------- include/linux/buildid.h | 3 +++ lib/buildid.c | 42 +++++++++++++++++++++++++++++------------ 3 files changed, 60 insertions(+), 27 deletions(-) diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c index 81dfc26bfae8..26188a4ad1ab 100644 --- a/fs/proc/task_mmu.c +++ b/fs/proc/task_mmu.c @@ -656,6 +656,7 @@ static int do_procmap_query(struct mm_struct *mm, void __user *uarg) struct proc_maps_locking_ctx lock_ctx = { .mm = mm }; struct procmap_query karg; struct vm_area_struct *vma; + struct file *vm_file = NULL; const char *name = NULL; char build_id_buf[BUILD_ID_SIZE_MAX], *name_buf = NULL; __u64 usize; @@ -727,21 +728,6 @@ static int do_procmap_query(struct mm_struct *mm, void __user *uarg) karg.inode = 0; } - if (karg.build_id_size) { - __u32 build_id_sz; - - err = build_id_parse(vma, build_id_buf, &build_id_sz); - if (err) { - karg.build_id_size = 0; - } else { - if (karg.build_id_size < build_id_sz) { - err = -ENAMETOOLONG; - goto out; - } - karg.build_id_size = build_id_sz; - } - } - if (karg.vma_name_size) { size_t name_buf_sz = min_t(size_t, PATH_MAX, karg.vma_name_size); const struct path *path; @@ -775,10 +761,34 @@ static int do_procmap_query(struct mm_struct *mm, void __user *uarg) karg.vma_name_size = name_sz; } + if (karg.build_id_size && vma->vm_file) + vm_file = get_file(vma->vm_file); + /* unlock vma or mmap_lock, and put mm_struct before copying data to user */ query_vma_teardown(&lock_ctx); mmput(mm); + if (karg.build_id_size) { + __u32 build_id_sz; + + if (vm_file) + err = build_id_parse_file(vm_file, build_id_buf, &build_id_sz); + else + err = -ENOENT; + if (err) { + karg.build_id_size = 0; + } else { + if (karg.build_id_size < build_id_sz) { + err = -ENAMETOOLONG; + goto out; + } + karg.build_id_size = build_id_sz; + } + } + + if (vm_file) + fput(vm_file); + if (karg.vma_name_size && copy_to_user(u64_to_user_ptr(karg.vma_name_addr), name, karg.vma_name_size)) { kfree(name_buf); @@ -798,6 +808,8 @@ static int do_procmap_query(struct mm_struct *mm, void __user *uarg) out: query_vma_teardown(&lock_ctx); mmput(mm); + if (vm_file) + fput(vm_file); kfree(name_buf); return err; } diff --git a/include/linux/buildid.h b/include/linux/buildid.h index 831c1b4b626c..7acc06b22fb7 100644 --- a/include/linux/buildid.h +++ b/include/linux/buildid.h @@ -7,7 +7,10 @@ #define BUILD_ID_SIZE_MAX 20 struct vm_area_struct; +struct file; + int build_id_parse(struct vm_area_struct *vma, unsigned char *build_id, __u32 *size); +int build_id_parse_file(struct file *file, unsigned char *build_id, __u32 *size); int build_id_parse_nofault(struct vm_area_struct *vma, unsigned char *build_id, __u32 *size); int build_id_parse_buf(const void *buf, unsigned char *build_id, u32 buf_size); diff --git a/lib/buildid.c b/lib/buildid.c index 818331051afe..c4b737640621 100644 --- a/lib/buildid.c +++ b/lib/buildid.c @@ -279,7 +279,7 @@ static int get_build_id_64(struct freader *r, unsigned char *build_id, __u32 *si /* enough for Elf64_Ehdr, Elf64_Phdr, and all the smaller requests */ #define MAX_FREADER_BUF_SZ 64 -static int __build_id_parse(struct vm_area_struct *vma, unsigned char *build_id, +static int __build_id_parse(struct file *file, unsigned char *build_id, __u32 *size, bool may_fault) { const Elf32_Ehdr *ehdr; @@ -287,11 +287,7 @@ static int __build_id_parse(struct vm_area_struct *vma, unsigned char *build_id, char buf[MAX_FREADER_BUF_SZ]; int ret; - /* only works for page backed storage */ - if (!vma->vm_file) - return -EINVAL; - - freader_init_from_file(&r, buf, sizeof(buf), vma->vm_file, may_fault); + freader_init_from_file(&r, buf, sizeof(buf), file, may_fault); /* fetch first 18 bytes of ELF header for checks */ ehdr = freader_fetch(&r, 0, offsetofend(Elf32_Ehdr, e_type)); @@ -319,8 +315,8 @@ static int __build_id_parse(struct vm_area_struct *vma, unsigned char *build_id, return ret; } -/* - * Parse build ID of ELF file mapped to vma +/** + * build_id_parse_nofault() - Parse build ID of ELF file mapped to vma * @vma: vma object * @build_id: buffer to store build id, at least BUILD_ID_SIZE long * @size: returns actual build id size in case of success @@ -332,11 +328,14 @@ static int __build_id_parse(struct vm_area_struct *vma, unsigned char *build_id, */ int build_id_parse_nofault(struct vm_area_struct *vma, unsigned char *build_id, __u32 *size) { - return __build_id_parse(vma, build_id, size, false /* !may_fault */); + if (!vma->vm_file) + return -EINVAL; + + return __build_id_parse(vma->vm_file, build_id, size, false /* !may_fault */); } -/* - * Parse build ID of ELF file mapped to VMA +/** + * build_id_parse() - Parse build ID of ELF file mapped to VMA * @vma: vma object * @build_id: buffer to store build id, at least BUILD_ID_SIZE long * @size: returns actual build id size in case of success @@ -348,7 +347,26 @@ int build_id_parse_nofault(struct vm_area_struct *vma, unsigned char *build_id, */ int build_id_parse(struct vm_area_struct *vma, unsigned char *build_id, __u32 *size) { - return __build_id_parse(vma, build_id, size, true /* may_fault */); + if (!vma->vm_file) + return -EINVAL; + + return __build_id_parse(vma->vm_file, build_id, size, true /* may_fault */); +} + +/** + * build_id_parse_file() - Parse build ID of ELF file + * @file: file object + * @build_id: buffer to store build id, at least BUILD_ID_SIZE long + * @size: returns actual build id size in case of success + * + * Assumes faultable context and can cause page faults to bring in file data + * into page cache. + * + * Return: 0 on success; negative error, otherwise + */ +int build_id_parse_file(struct file *file, unsigned char *build_id, __u32 *size) +{ + return __build_id_parse(file, build_id, size, true /* may_fault */); } /** From ae9fd76c111bb4a5293c2831a1ad373605251dd6 Mon Sep 17 00:00:00 2001 From: Miaohe Lin Date: Thu, 5 Feb 2026 15:53:28 +0800 Subject: [PATCH 163/167] mm/memory-failure: reject unsupported non-folio compound page MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit When !CONFIG_TRANSPARENT_HUGEPAGE, a non-folio compound page can appear in a userspace mapping via either vm_insert_*() functions or vm_operatios_struct->fault(). They are not folios, thus should not be considered for folio operations like split. To reject these pages, make sure get_hwpoison_page() is always called as HWPoisonHandlable() will do the right work. [Some commit log borrowed from Zi Yan. Thanks.] Link: https://lkml.kernel.org/r/20260205075328.523211-1-linmiaohe@huawei.com Fixes: 689b8986776c ("mm/memory-failure: improve large block size folio handling") Signed-off-by: Miaohe Lin Reported-by: 是参差 Closes: https://lore.kernel.org/all/PS1PPF7E1D7501F1E4F4441E7ECD056DEADAB98A@PS1PPF7E1D7501F.apcprd02.prod.outlook.com/ Reviewed-by: Zi Yan Tested-by: Zi Yan Cc: David Hildenbrand Cc: Jane Chu Cc: Miaohe Lin Cc: Naoya Horiguchi Signed-off-by: Andrew Morton --- mm/memory-failure.c | 42 ++++++++++++++++++++---------------------- 1 file changed, 20 insertions(+), 22 deletions(-) diff --git a/mm/memory-failure.c b/mm/memory-failure.c index cf0d526e6d41..9fd8355176eb 100644 --- a/mm/memory-failure.c +++ b/mm/memory-failure.c @@ -2411,31 +2411,29 @@ int memory_failure(unsigned long pfn, int flags) * In fact it's dangerous to directly bump up page count from 0, * that may make page_ref_freeze()/page_ref_unfreeze() mismatch. */ - if (!(flags & MF_COUNT_INCREASED)) { - res = get_hwpoison_page(p, flags); - if (!res) { - if (is_free_buddy_page(p)) { - if (take_page_off_buddy(p)) { - page_ref_inc(p); - res = MF_RECOVERED; - } else { - /* We lost the race, try again */ - if (retry) { - ClearPageHWPoison(p); - retry = false; - goto try_again; - } - res = MF_FAILED; - } - res = action_result(pfn, MF_MSG_BUDDY, res); + res = get_hwpoison_page(p, flags); + if (!res) { + if (is_free_buddy_page(p)) { + if (take_page_off_buddy(p)) { + page_ref_inc(p); + res = MF_RECOVERED; } else { - res = action_result(pfn, MF_MSG_KERNEL_HIGH_ORDER, MF_IGNORED); + /* We lost the race, try again */ + if (retry) { + ClearPageHWPoison(p); + retry = false; + goto try_again; + } + res = MF_FAILED; } - goto unlock_mutex; - } else if (res < 0) { - res = action_result(pfn, MF_MSG_GET_HWPOISON, MF_IGNORED); - goto unlock_mutex; + res = action_result(pfn, MF_MSG_BUDDY, res); + } else { + res = action_result(pfn, MF_MSG_KERNEL_HIGH_ORDER, MF_IGNORED); } + goto unlock_mutex; + } else if (res < 0) { + res = action_result(pfn, MF_MSG_GET_HWPOISON, MF_IGNORED); + goto unlock_mutex; } folio = page_folio(p); From e6c53ead2d8fa73206e0a63e9cd9aea6bc929837 Mon Sep 17 00:00:00 2001 From: Hao Ge Date: Wed, 4 Feb 2026 18:14:01 +0800 Subject: [PATCH 164/167] mm/slab: Add alloc_tagging_slab_free_hook for memcg_alloc_abort_single When CONFIG_MEM_ALLOC_PROFILING_DEBUG is enabled, the following warning may be noticed: [ 3959.023862] ------------[ cut here ]------------ [ 3959.023891] alloc_tag was not cleared (got tag for lib/xarray.c:378) [ 3959.023947] WARNING: ./include/linux/alloc_tag.h:155 at alloc_tag_add+0x128/0x178, CPU#6: mkfs.ntfs/113998 [ 3959.023978] Modules linked in: dns_resolver tun brd overlay exfat btrfs blake2b libblake2b xor xor_neon raid6_pq loop sctp ip6_udp_tunnel udp_tunnel ext4 crc16 mbcache jbd2 rfkill sunrpc vfat fat sg fuse nfnetlink sr_mod virtio_gpu cdrom drm_client_lib virtio_dma_buf drm_shmem_helper drm_kms_helper ghash_ce drm sm4 backlight virtio_net net_failover virtio_scsi failover virtio_console virtio_blk virtio_mmio dm_mirror dm_region_hash dm_log dm_multipath dm_mod i2c_dev aes_neon_bs aes_ce_blk [last unloaded: hwpoison_inject] [ 3959.024170] CPU: 6 UID: 0 PID: 113998 Comm: mkfs.ntfs Kdump: loaded Tainted: G W 6.19.0-rc7+ #7 PREEMPT(voluntary) [ 3959.024182] Tainted: [W]=WARN [ 3959.024186] Hardware name: QEMU KVM Virtual Machine, BIOS unknown 2/2/2022 [ 3959.024192] pstate: 604000c5 (nZCv daIF +PAN -UAO -TCO -DIT -SSBS BTYPE=--) [ 3959.024199] pc : alloc_tag_add+0x128/0x178 [ 3959.024207] lr : alloc_tag_add+0x128/0x178 [ 3959.024214] sp : ffff80008b696d60 [ 3959.024219] x29: ffff80008b696d60 x28: 0000000000000000 x27: 0000000000000240 [ 3959.024232] x26: 0000000000000000 x25: 0000000000000240 x24: ffff800085d17860 [ 3959.024245] x23: 0000000000402800 x22: ffff0000c0012dc0 x21: 00000000000002d0 [ 3959.024257] x20: ffff0000e6ef3318 x19: ffff800085ae0410 x18: 0000000000000000 [ 3959.024269] x17: 0000000000000000 x16: 0000000000000000 x15: 0000000000000000 [ 3959.024281] x14: 0000000000000000 x13: 0000000000000001 x12: ffff600064101293 [ 3959.024292] x11: 1fffe00064101292 x10: ffff600064101292 x9 : dfff800000000000 [ 3959.024305] x8 : 00009fff9befed6e x7 : ffff000320809493 x6 : 0000000000000001 [ 3959.024316] x5 : ffff000320809490 x4 : ffff600064101293 x3 : ffff800080691838 [ 3959.024328] x2 : 0000000000000000 x1 : 0000000000000000 x0 : ffff0000d5bcd640 [ 3959.024340] Call trace: [ 3959.024346] alloc_tag_add+0x128/0x178 (P) [ 3959.024355] __alloc_tagging_slab_alloc_hook+0x11c/0x1a8 [ 3959.024362] kmem_cache_alloc_lru_noprof+0x1b8/0x5e8 [ 3959.024369] xas_alloc+0x304/0x4f0 [ 3959.024381] xas_create+0x1e0/0x4a0 [ 3959.024388] xas_store+0x68/0xda8 [ 3959.024395] __filemap_add_folio+0x5b0/0xbd8 [ 3959.024409] filemap_add_folio+0x16c/0x7e0 [ 3959.024416] __filemap_get_folio_mpol+0x2dc/0x9e8 [ 3959.024424] iomap_get_folio+0xfc/0x180 [ 3959.024435] __iomap_get_folio+0x2f8/0x4b8 [ 3959.024441] iomap_write_begin+0x198/0xc18 [ 3959.024448] iomap_write_iter+0x2ec/0x8f8 [ 3959.024454] iomap_file_buffered_write+0x19c/0x290 [ 3959.024461] blkdev_write_iter+0x38c/0x978 [ 3959.024470] vfs_write+0x4d4/0x928 [ 3959.024482] ksys_write+0xfc/0x1f8 [ 3959.024489] __arm64_sys_write+0x74/0xb0 [ 3959.024496] invoke_syscall+0xd4/0x258 [ 3959.024507] el0_svc_common.constprop.0+0xb4/0x240 [ 3959.024514] do_el0_svc+0x48/0x68 [ 3959.024520] el0_svc+0x40/0xf8 [ 3959.024526] el0t_64_sync_handler+0xa0/0xe8 [ 3959.024533] el0t_64_sync+0x1ac/0x1b0 [ 3959.024540] ---[ end trace 0000000000000000 ]--- When __memcg_slab_post_alloc_hook() fails, there are two different free paths depending on whether size == 1 or size != 1. In the kmem_cache_free_bulk() path, we do call alloc_tagging_slab_free_hook(). However, in memcg_alloc_abort_single() we don't, the above warning will be triggered on the next allocation. Therefore, add alloc_tagging_slab_free_hook() to the memcg_alloc_abort_single() path. Fixes: 9f9796b413d3 ("mm, slab: move memcg charging to post-alloc hook") Cc: stable@vger.kernel.org Suggested-by: Hao Li Signed-off-by: Hao Ge Reviewed-by: Hao Li Reviewed-by: Suren Baghdasaryan Reviewed-by: Harry Yoo Link: https://patch.msgid.link/20260204101401.202762-1-hao.ge@linux.dev Signed-off-by: Vlastimil Babka --- mm/slub.c | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/mm/slub.c b/mm/slub.c index f77b7407c51b..cdc1e652ec52 100644 --- a/mm/slub.c +++ b/mm/slub.c @@ -6689,8 +6689,12 @@ void slab_free(struct kmem_cache *s, struct slab *slab, void *object, static noinline void memcg_alloc_abort_single(struct kmem_cache *s, void *object) { + struct slab *slab = virt_to_slab(object); + + alloc_tagging_slab_free_hook(s, slab, &object, 1); + if (likely(slab_free_hook(s, object, slab_want_init_on_free(s), false))) - do_slab_free(s, virt_to_slab(object), object, object, 1, _RET_IP_); + do_slab_free(s, slab, object, object, 1, _RET_IP_); } #endif From 02f9d76a76adb5ea16b4e3b403496c42033f8fd1 Mon Sep 17 00:00:00 2001 From: Viktor Kleen Date: Thu, 5 Feb 2026 16:49:41 +0800 Subject: [PATCH 165/167] iommu/vt-d: Treat PAGE_SNOOP and PWSNP separately The PASID_FLAG_PAGE_SNOOP and PASID_FLAG_PWSNP constants are identical. This will cause the pasid code to always set both or neither of the PGSNP and PWSNP bits in PASID table entries. However, PWSNP is a reserved bit if SMPWC is not set in the IOMMU's extended capability register, even if SC is supported. This has resulted in DMAR errors when testing the iommufd code on an Arrow Lake platform. With this patch, those errors disappear and the PASID table entries look correct. Fixes: 101a2854110fa ("iommu/vt-d: Follow PT_FEAT_DMA_INCOHERENT into the PASID entry") Cc: stable@vger.kernel.org Signed-off-by: Viktor Kleen Reviewed-by: Jason Gunthorpe Link: https://lore.kernel.org/r/20260202192109.1665799-1-viktor@kleen.org Signed-off-by: Lu Baolu Signed-off-by: Joerg Roedel --- drivers/iommu/intel/pasid.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/iommu/intel/pasid.h b/drivers/iommu/intel/pasid.h index b4c85242dc79..3809793e0259 100644 --- a/drivers/iommu/intel/pasid.h +++ b/drivers/iommu/intel/pasid.h @@ -24,7 +24,7 @@ #define PASID_FLAG_NESTED BIT(1) #define PASID_FLAG_PAGE_SNOOP BIT(2) -#define PASID_FLAG_PWSNP BIT(2) +#define PASID_FLAG_PWSNP BIT(3) /* * The PASID_FLAG_FL5LP flag Indicates using 5-level paging for first- From 2687c848e57820651b9f69d30c4710f4219f7dbf Mon Sep 17 00:00:00 2001 From: Josh Poimboeuf Date: Fri, 6 Feb 2026 14:24:55 -0800 Subject: [PATCH 166/167] x86/vmware: Fix hypercall clobbers Fedora QA reported the following panic: BUG: unable to handle page fault for address: 0000000040003e54 #PF: supervisor write access in kernel mode #PF: error_code(0x0002) - not-present page Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS edk2-20251119-3.fc43 11/19/2025 RIP: 0010:vmware_hypercall4.constprop.0+0x52/0x90 .. Call Trace: vmmouse_report_events+0x13e/0x1b0 psmouse_handle_byte+0x15/0x60 ps2_interrupt+0x8a/0xd0 ... because the QEMU VMware mouse emulation is buggy, and clears the top 32 bits of %rdi that the kernel kept a pointer in. The QEMU vmmouse driver saves and restores the register state in a "uint32_t data[6];" and as a result restores the state with the high bits all cleared. RDI originally contained the value of a valid kernel stack address (0xff5eeb3240003e54). After the vmware hypercall it now contains 0x40003e54, and we get a page fault as a result when it is dereferenced. The proper fix would be in QEMU, but this works around the issue in the kernel to keep old setups working, when old kernels had not happened to keep any state in %rdi over the hypercall. In theory this same issue exists for all the hypercalls in the vmmouse driver; in practice it has only been seen with vmware_hypercall3() and vmware_hypercall4(). For now, just mark RDI/RSI as clobbered for those two calls. This should have a minimal effect on code generation overall as it should be rare for the compiler to want to make RDI/RSI live across hypercalls. Reported-by: Justin Forbes Link: https://lore.kernel.org/all/99a9c69a-fc1a-43b7-8d1e-c42d6493b41f@broadcom.com/ Signed-off-by: Josh Poimboeuf Cc: stable@kernel.org Signed-off-by: Linus Torvalds --- arch/x86/include/asm/vmware.h | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/arch/x86/include/asm/vmware.h b/arch/x86/include/asm/vmware.h index c9cf43d5ef23..4220dae14a2d 100644 --- a/arch/x86/include/asm/vmware.h +++ b/arch/x86/include/asm/vmware.h @@ -140,7 +140,7 @@ unsigned long vmware_hypercall3(unsigned long cmd, unsigned long in1, "b" (in1), "c" (cmd), "d" (0) - : "cc", "memory"); + : "di", "si", "cc", "memory"); return out0; } @@ -165,7 +165,7 @@ unsigned long vmware_hypercall4(unsigned long cmd, unsigned long in1, "b" (in1), "c" (cmd), "d" (0) - : "cc", "memory"); + : "di", "si", "cc", "memory"); return out0; } From 05f7e89ab9731565d8a62e3b5d1ec206485eeb0b Mon Sep 17 00:00:00 2001 From: Linus Torvalds Date: Sun, 8 Feb 2026 13:03:27 -0800 Subject: [PATCH 167/167] Linux 6.19 --- Makefile | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/Makefile b/Makefile index bde507d5c03d..d3a8482bdbd0 100644 --- a/Makefile +++ b/Makefile @@ -2,7 +2,7 @@ VERSION = 6 PATCHLEVEL = 19 SUBLEVEL = 0 -EXTRAVERSION = -rc8 +EXTRAVERSION = NAME = Baby Opossum Posse # *DOCUMENTATION*