linux

mirror of https://github.com/torvalds/linux.git synced 2026-06-07 14:04:54 +02:00

Author	SHA1	Message	Date
Breno Leitao	3bc179bc71	netpoll: fix IPv6 local-address corruption netpoll_setup() decides whether to auto-populate the local source address by testing np->local_ip.ip, which only inspects the first 4 bytes of the union inet_addr storage. For an IPv6 netpoll whose caller-supplied local address has a zero high-32 bits (::1, ::<suffix>, IPv4-mapped ::ffff:a.b.c.d, etc.), this misdetects the address as unset (which they are not, but the first 4 bytes are empty), calls netpoll_take_ipv6() and overwrites it with whatever matching link-local/global address the device happens to expose first. Introduce a helper netpoll_local_ip_unset() that picks the correct family-aware test (ipv6_addr_any() for IPv6, !.ip for IPv4) and use it from netpoll_setup(). Reproducer is something like: echo "::2" > local_ip echo 1 > enabled cat local_ip # before this fix: 2001:db8::1 (caller-supplied ::2 was clobbered) # after this fix: ::2 Fixes: `b7394d2429` ("netpoll: prepare for ipv6") Signed-off-by: Breno Leitao <leitao@debian.org> Link: https://patch.msgid.link/20260424-netpoll_fix-v1-1-3a55348c625f@debian.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-04-27 19:16:18 -07:00
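The misdetection described in this commit can be reproduced in userspace. What follows is a minimal sketch, not the kernel code: `union demo_inet_addr` and every `demo_*` function are hypothetical stand-ins, and `memcmp()` against zero plays the role of `ipv6_addr_any()`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the kernel's union inet_addr: 16 bytes of storage. */
union demo_inet_addr {
	uint32_t ip;      /* IPv4 view: only the first 4 bytes */
	uint8_t in6[16];  /* IPv6 view: all 16 bytes */
};

/* Old-style test: inspects the first 4 bytes regardless of family. */
static bool demo_unset_v4_only(const union demo_inet_addr *a)
{
	return a->ip == 0;
}

/* Family-aware test: all 16 bytes for IPv6, the 32-bit word for IPv4. */
static bool demo_unset_family_aware(const union demo_inet_addr *a, bool ipv6)
{
	static const uint8_t zero[16];

	if (ipv6)
		return memcmp(a->in6, zero, sizeof(zero)) == 0;
	return a->ip == 0;
}

/* Build the IPv6 address ::<last>, e.g. last = 2 gives ::2. */
static union demo_inet_addr demo_v6_suffix(uint8_t last)
{
	union demo_inet_addr a;

	memset(&a, 0, sizeof(a));
	a.in6[15] = last;
	return a;
}

/* Usage: ::2 looks unset to the 4-byte test but set to the family-aware one. */
static void demo_netpoll_selfcheck(void)
{
	union demo_inet_addr a = demo_v6_suffix(2);

	assert(demo_unset_v4_only(&a));
	assert(!demo_unset_family_aware(&a, true));
}
```

Any IPv6 address whose first 32 bits are zero behaves like `::2` here, which is why the 4-byte test let the caller-supplied address be overwritten.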
Altan Hacigumus	2b9f6f7065	tcp: make probe0 timer handle expired user timeout tcp_clamp_probe0_to_user_timeout() computes remaining time in jiffies using subtraction with an unsigned lvalue. If elapsed probing time exceeds the configured TCP_USER_TIMEOUT, the underflow yields a large value. This ends up re-arming the probe timer for a full backoff interval instead of expiring immediately, delaying connection teardown beyond the configured timeout. Fix this by preventing underflow so user-set timeout expiration is handled correctly without extending the probe timer. Fixes: `344db93ae3` ("tcp: make TCP_USER_TIMEOUT accurate for zero window probes") Link: https://lore.kernel.org/r/20260414013634.43997-1-ahacigu.linux@gmail.com Signed-off-by: Altan Hacigumus <ahacigu.linux@gmail.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20260424014639.54110-1-ahacigu.linux@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-04-27 19:16:07 -07:00
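The underflow is ordinary unsigned arithmetic and can be shown without any TCP state. The sketch below is illustrative only: the `demo_*` helpers are hypothetical, and the clamp to zero is one plausible way to prevent the wrap, not necessarily the form the actual patch uses.

```c
#include <assert.h>
#include <stdint.h>

/* Remaining time computed with unsigned subtraction: wraps if elapsed > timeout. */
static uint32_t demo_remaining_wrapping(uint32_t timeout, uint32_t elapsed)
{
	return timeout - elapsed;
}

/* Same computation, but an expired timeout yields 0 instead of a huge value. */
static uint32_t demo_remaining_clamped(uint32_t timeout, uint32_t elapsed)
{
	return elapsed >= timeout ? 0 : timeout - elapsed;
}

/* Usage: one tick past the timeout wraps to the maximum value unless clamped. */
static void demo_probe0_selfcheck(void)
{
	assert(demo_remaining_wrapping(1000, 1001) == UINT32_MAX);
	assert(demo_remaining_clamped(1000, 1001) == 0);
	assert(demo_remaining_clamped(1000, 400) == 600);
}
```

With the wrapped value, a caller taking the minimum of the backoff interval and the remaining time would always pick the backoff interval, matching the delayed teardown described in the commit.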
Mingming Cao	cc427d24ac	ibmveth: Disable GSO for packets with small MSS Some physical adapters on Power systems do not support segmentation offload when the MSS is less than 224 bytes. Attempting to send such packets causes the adapter to freeze, stopping all traffic until manually reset. Implement ndo_features_check to disable GSO for packets with small MSS values. The network stack will perform software segmentation instead. The 224-byte minimum matches ibmvnic commit <f10b09ef687f> ("ibmvnic: Enforce stronger sanity checks on GSO packets") which uses the same physical adapters in SEA configurations. The issue occurs specifically when the hardware attempts to perform segmentation (gso_segs > 1) with a small MSS. Single-segment GSO packets (gso_segs == 1) do not trigger the problematic LSO code path and are transmitted normally without segmentation. Add an ndo_features_check callback to disable GSO when MSS < 224 bytes. Also call vlan_features_check() to ensure proper handling of VLAN packets, particularly QinQ (802.1ad) configurations where the hardware parser may not support certain offload features. Validated using iptables to force small MSS values. Without the fix, the adapter freezes. With the fix, packets are segmented in software and transmission succeeds. Comprehensive regression testing completed (MSS tests, performance, stability). Fixes: `8641dd8579` ("ibmveth: Add support for TSO") Cc: stable@vger.kernel.org Reviewed-by: Brian King <bjking1@linux.ibm.com> Tested-by: Shaik Abdulla <shaik.abdulla1@ibm.com> Tested-by: Naveed Ahmed <naveedaus@in.ibm.com> Signed-off-by: Mingming Cao <mmc@linux.ibm.com> Link: https://patch.msgid.link/20260424162917.65725-1-mmc@linux.ibm.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-04-27 19:07:57 -07:00
Florian Westphal	4438113be6	neigh: let neigh_xmit take skb ownership neigh_xmit always releases the skb, except when no neighbour table is found. But even the first added user of neigh_xmit (mpls) relied on neigh_xmit to release the skb (or queue it for tx). sashiko reported: If neigh_xmit() is called with an uninitialized neighbor table (for example, NEIGH_ND_TABLE when IPv6 is disabled), it returns -EAFNOSUPPORT and bypasses its internal out_kfree_skb error path. Because the return value of neigh_xmit() is ignored here, does this leak the SKB? Assume full ownership and remove the last code path that doesn't xmit or free skb. Fixes: `4fd3d7d9e8` ("neigh: Add helper function neigh_xmit") Signed-off-by: Florian Westphal <fw@strlen.de> Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Link: https://patch.msgid.link/20260424145843.74055-1-fw@strlen.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-04-27 19:02:11 -07:00
Kuniyuki Iwashima	b3b6babf47	ipmr: Free mr_table after RCU grace period. With CONFIG_IP_MROUTE_MULTIPLE_TABLES=n, ipmr_fib_lookup() does not check if net->ipv4.mrt is NULL. Since default_device_exit_batch() is called after ->exit_rtnl(), a device could receive IGMP packets and access net->ipv4.mrt during/after ipmr_rules_exit_rtnl(). If ipmr_rules_exit_rtnl() had already cleared it and freed the memory, the access would trigger null-ptr-deref or use-after-free. Let's fix it by using RCU helper and free mrt after RCU grace period. In addition, check_net(net) is added to mroute_clean_tables() and ipmr_cache_unresolved() to synchronise via mfc_unres_lock. This prevents ipmr_cache_unresolved() from putting skb into c->_c.mfc_un.unres.unresolved after mroute_clean_tables() purges it. For the same reason, timer_shutdown_sync() is moved after mroute_clean_tables(). Since rhltable_destroy() holds mutex internally, rcu_work is used, and it is placed as the first member because rcu_head must be placed within <4K offset. mr_table is already 3864 bytes without rcu_work. Note that IP6MR is not yet converted to ->exit_rtnl(), so this change is not needed for now but will be. Fixes: `b22b018674` ("ipmr: Convert ipmr_net_exit_batch() to ->exit_rtnl().") Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com> Link: https://patch.msgid.link/20260423053456.4097409-1-kuniyu@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-04-27 18:46:17 -07:00
Morduan Zang	5b0c911bcd	net: phonet: do not BUG_ON() in pn_socket_autobind() on failed bind syzbot reported a kernel BUG triggered from pn_socket_sendmsg() via pn_socket_autobind(): kernel BUG at net/phonet/socket.c:213! RIP: 0010:pn_socket_autobind net/phonet/socket.c:213 [inline] RIP: 0010:pn_socket_sendmsg+0x240/0x250 net/phonet/socket.c:421 Call Trace: sock_sendmsg_nosec+0x112/0x150 net/socket.c:797 __sock_sendmsg net/socket.c:812 [inline] __sys_sendto+0x402/0x590 net/socket.c:2280 ... pn_socket_autobind() calls pn_socket_bind() with port 0 and, on -EINVAL, assumes the socket was already bound and asserts that the port is non-zero: err = pn_socket_bind(sock, ..., sizeof(struct sockaddr_pn)); if (err != -EINVAL) return err; BUG_ON(!pn_port(pn_sk(sock->sk)->sobject)); return 0; /* socket was already bound */ However pn_socket_bind() also returns -EINVAL when sk->sk_state is not TCP_CLOSE, even when the socket has never been bound and pn_port() is still 0. In that case the BUG_ON() fires and panics the kernel from a user-triggerable path. Treat the "bind returned -EINVAL but pn_port() is still 0" case as a regular error and propagate -EINVAL to the caller instead of crashing. Existing callers already translate a non-zero return from pn_socket_autobind() into -ENOBUFS/-EAGAIN, so returning -EINVAL here only changes behaviour from panic to a normal errno. Fixes: `ba113a94b7` ("Phonet: common socket glue") Reported-by: syzbot+706f5eb79044e686c794@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=706f5eb79044e686c794 Suggested-by: Remi Denis-Courmont <courmisch@gmail.com> Signed-off-by: Morduan Zang <zhangdandan@uniontech.com> Signed-off-by: zhanjun <zhanjun@uniontech.com> Acked-by: Rémi Denis-Courmont <remi@remlab.net> Link: https://patch.msgid.link/87A8960A2045AF3C+20260423010557.138124-1-zhangdandan@uniontech.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-04-27 18:45:17 -07:00
Jakub Kicinski	c4047e7075	Merge branch 'net-sched-taprio-fix-null-pointer-dereference-in-class-dump' Weiming Shi says: ==================== net/sched: taprio: fix NULL pointer dereference in class dump Patch 1/2 is the fix: replace NULL entries in q->qdiscs[] with the global &noop_qdisc singleton so that control-plane dump paths, as well as the existing NULL guards in the data-plane enqueue/dequeue paths, cannot deref a NULL child qdisc. Patch 2/2 is a tdc regression test that drives the graft + delete + class-dump sequence on a multi-queue netdevsim device. It panics the vulnerable kernel and passes on the fixed one. ==================== Link: https://patch.msgid.link/20260422161958.2517539-2-bestswngs@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-04-27 18:41:38 -07:00
Weiming Shi	a469feed39	selftests/tc-testing: add taprio test for class dump after child delete Add a regression test for the NULL pointer dereference fixed in the previous commit. Before the fix, taprio_graft() stored NULL into q->qdiscs[cl - 1] when an explicitly grafted child qdisc was deleted via RTM_DELQDISC; the next RTM_GETTCLASS dump then crashed the kernel in taprio_dump_class() while reading child->handle. The test installs a taprio root qdisc on a multi-queue netdevsim device, grafts a pfifo child onto class 8001:1, deletes that child, and then performs a class dump. On a fixed kernel the dump succeeds and all eight taprio classes are listed; on an unpatched kernel the class dump crashes, which surfaces as a test failure. Signed-off-by: Weiming Shi <bestswngs@gmail.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Link: https://patch.msgid.link/20260422161958.2517539-4-bestswngs@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-04-27 18:41:36 -07:00
Weiming Shi	3d07ca5c0f	net/sched: taprio: fix NULL pointer dereference in class dump When a TAPRIO child qdisc is deleted via RTM_DELQDISC, taprio_graft() is called with new == NULL and stores NULL into q->qdiscs[cl - 1]. Subsequent RTM_GETTCLASS dump operations walk all classes via taprio_walk() and call taprio_dump_class(), which calls taprio_leaf() returning the NULL pointer, then dereferences it to read child->handle, causing a kernel NULL pointer dereference. The bug is reachable with namespace-scoped CAP_NET_ADMIN on any kernel with CONFIG_NET_SCH_TAPRIO enabled. On systems with unprivileged user namespaces enabled, an unprivileged local user can trigger a kernel panic by creating a taprio qdisc inside a new network namespace, grafting an explicit child qdisc, deleting it, and requesting a class dump. The RTM_GETTCLASS dump itself requires no capability. Oops: general protection fault, probably for non-canonical address 0xdffffc0000000007: 0000 [#1] SMP KASAN NOPTI KASAN: null-ptr-deref in range [0x0000000000000038-0x000000000000003f] RIP: 0010:taprio_dump_class (net/sched/sch_taprio.c:2478) Call Trace: <TASK> tc_fill_tclass (net/sched/sch_api.c:1966) qdisc_class_dump (net/sched/sch_api.c:2326) taprio_walk (net/sched/sch_taprio.c:2514) tc_dump_tclass_qdisc (net/sched/sch_api.c:2352) tc_dump_tclass_root (net/sched/sch_api.c:2370) tc_dump_tclass (net/sched/sch_api.c:2431) rtnl_dumpit (net/core/rtnetlink.c:6864) netlink_dump (net/netlink/af_netlink.c:2325) rtnetlink_rcv_msg (net/core/rtnetlink.c:6959) netlink_rcv_skb (net/netlink/af_netlink.c:2550) </TASK> Fix this by substituting &noop_qdisc when new is NULL in taprio_graft(), a common pattern used by other qdiscs (e.g., multiq_graft()) to ensure the q->qdiscs[] slots are never NULL. This makes control-plane dump paths safe without requiring individual NULL checks. 
Since the data-plane paths (taprio_enqueue and taprio_dequeue_from_txq) previously had explicit NULL guards that would drop/skip the packet cleanly, update those checks to test for &noop_qdisc instead. Without this, packets would reach taprio_enqueue_one() which increments the root qdisc's qlen and backlog before calling the child's enqueue; noop_qdisc drops the packet but those counters are never rolled back, permanently inflating the root qdisc's statistics. After this change old can be a valid qdisc, NULL, or &noop_qdisc. Only call qdisc_put(old) in the first case to avoid decreasing noop_qdisc's refcount, which was never increased. Fixes: `665338b2a7` ("net/sched: taprio: dump class stats for the actual q->qdiscs[]") Reported-by: Xiang Mei <xmei5@asu.edu> Signed-off-by: Weiming Shi <bestswngs@gmail.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Tested-by: Weiming Shi <bestswngs@gmail.com> Link: https://patch.msgid.link/20260422161958.2517539-3-bestswngs@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-04-27 18:41:36 -07:00
Linus Torvalds	dca922e019	xen: XSA-485 and XSA-487 security patches for v7.1 -----BEGIN PGP SIGNATURE----- iJEEABYKADkWIQRTLbB6QfY48x44uB6AXGG7T9hjvgUCaeoflBsUgAAAAAAEAA5t YW51MiwyLjUrMS4xMiwyLDIACgkQgFxhu0/YY75GSAD/RZ0vMd5FHkPkcx5C4Q3c VK12E6+fQT5CEp7E9Sg2mBEBAOhzi8WMYR5b3nlEQWKRraFg651+do9Tt1QspKdW /IEG =LCjg -----END PGP SIGNATURE----- Merge tag 'xsa48x-7.1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip Pull xen fixes from Juergen Gross: "XSA-485 and XSA-487 security patches" * tag 'xsa48x-7.1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: xen/privcmd: fix double free via VMA splitting Buffer overflow in drivers/xen/sys-hypervisor.c	2026-04-27 18:36:47 -07:00
Paul Geurts	a9bc28aa4e	NFC: trf7970a: Ignore antenna noise when checking for RF field The main channel Received Signal Strength Indicator (RSSI) measurement is used to determine whether an RF field is present or not. RSSI != 0 is interpreted as an RF Field is present. This does not take RF noise and measurement inaccuracy into account, and results in false positives in the field. Define a noise level and make sure the RF field is only interpreted as present when the RSSI is above the noise level. Fixes: `851ee3cbf8` ("NFC: trf7970a: Don't turn on RF if there is already an RF field") Signed-off-by: Paul Geurts <paul.geurts@prodrive-technologies.com> Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@oss.qualcomm.com> Reviewed-by: Mark Greer <mgreer@animalcreek.com> Link: https://patch.msgid.link/20260422100930.581237-1-paul.geurts@prodrive-technologies.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-04-27 18:00:43 -07:00
Felix Gu	8d0189c1ea	spi: amlogic-spisg: initialize completion before requesting IRQ Move init_completion(&spisg->completion) to before devm_request_irq() to avoid a potential race condition where an interrupt could fire before the completion structure is initialized. Fixes: `cef9991e04` ("spi: Add Amlogic SPISG driver") Signed-off-by: Felix Gu <ustc.gu@gmail.com> Link: https://patch.msgid.link/20260428-amlogic-spisg-v1-1-8eecc3b446d6@gmail.com Signed-off-by: Mark Brown <broonie@kernel.org>	2026-04-28 09:57:15 +09:00
Morduan Zang	adbe2cdf75	net: usb: rtl8150: free skb on usb_submit_urb() failure in xmit When rtl8150_start_xmit() fails to submit the tx URB, the URB is never handed to the USB core and write_bulk_callback() will not run. The driver returns NETDEV_TX_OK, which tells the networking stack that the skb has been consumed, but nothing actually frees the skb on this error path: dev->tx_skb = skb; ... if ((res = usb_submit_urb(dev->tx_urb, GFP_ATOMIC))) { ... /* no kfree_skb here */ } return NETDEV_TX_OK; This leaks the skb on every submit failure and also leaves dev->tx_skb pointing at memory that the driver itself may later free, which is fragile. Free the skb with dev_kfree_skb_any() in the error path and clear dev->tx_skb so no stale pointer is left behind. Fixes: `1da177e4c3` ("Linux-2.6.12-rc2") Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Morduan Zang <zhangdandan@uniontech.com> Link: https://patch.msgid.link/E7D3E1C013C5A859+20260424015517.9574-1-zhangdandan@uniontech.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-04-27 17:51:14 -07:00
Zhan Jun	23f0e34c64	net: usb: rtl8150: fix use-after-free in rtl8150_start_xmit() syzbot reported a KASAN slab-use-after-free read in rtl8150_start_xmit() when accessing skb->len for tx statistics after usb_submit_urb() has been called: BUG: KASAN: slab-use-after-free in rtl8150_start_xmit+0x71f/0x760 drivers/net/usb/rtl8150.c:712 Read of size 4 at addr ffff88810eb7a930 by task kworker/0:4/5226 The URB completion handler write_bulk_callback() frees the skb via dev_kfree_skb_irq(dev->tx_skb). The URB may complete on another CPU in softirq context before usb_submit_urb() returns in the submitter, so by the time the submitter reads skb->len the skb has already been queued to the per-CPU completion_queue and freed by net_tx_action(): CPU A (xmit) CPU B (USB completion softirq) ------------ ------------------------------ dev->tx_skb = skb; usb_submit_urb() --+ \|-------> write_bulk_callback() \| dev_kfree_skb_irq(dev->tx_skb) \| net_tx_action() \| napi_skb_cache_put() <-- free netdev->stats.tx_bytes \| += skb->len; <-- UAF read Fix it by caching skb->len before submitting the URB and using the cached value when updating the tx_bytes counter. The pre-existing tx_bytes semantics are preserved: the counter tracks the original frame length (skb->len), not the ETH_ZLEN/USB-alignment padded "count" value that is handed to the device. Changing that would be a user-visible accounting change and is out of scope for this UAF fix. Fixes: `1da177e4c3` ("Linux-2.6.12-rc2") Reported-by: syzbot+3f46c095ac0ca048cb71@syzkaller.appspotmail.com Closes: https://lore.kernel.org/all/69e69ee7.050a0220.24bfd3.002b.GAE@google.com/ Closes: https://syzkaller.appspot.com/bug?extid=3f46c095ac0ca048cb71 Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Zhan Jun <zhanjun@uniontech.com> Link: https://patch.msgid.link/809895186B866C10+20260423004913.136655-1-zhangdandan@uniontech.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-04-27 17:51:04 -07:00
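The fix pattern (read what you need from the buffer before handing ownership away) can be sketched in plain C. Everything here is hypothetical: `struct demo_skb`, `demo_submit()` and `demo_xmit()` only imitate the shape of the driver path, and `demo_submit()` frees the buffer immediately to model a completion handler that runs before the submit call returns.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical, heavily reduced stand-in for struct sk_buff. */
struct demo_skb {
	unsigned int len;
};

/*
 * Stand-in for URB submission. The buffer is freed here to model the
 * completion callback running (on another CPU) before submission returns.
 */
static int demo_submit(struct demo_skb *skb)
{
	free(skb);
	return 0;
}

/* Cache the length first; skb must not be touched after demo_submit(). */
static unsigned long demo_xmit(struct demo_skb *skb, unsigned long tx_bytes)
{
	unsigned int len = skb->len;

	if (demo_submit(skb) == 0)
		tx_bytes += len;
	return tx_bytes;
}

/* Usage: a 60-byte frame adds 60 to the counter without reading freed memory. */
static void demo_xmit_selfcheck(void)
{
	struct demo_skb *skb = malloc(sizeof(*skb));

	assert(skb != NULL);
	skb->len = 60;
	assert(demo_xmit(skb, 100) == 160);
}
```

Reading `skb->len` after `demo_submit()` in this sketch would be the same class of use-after-free that KASAN flagged in the driver.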
Greg Kroah-Hartman	9e6bf146b5	ipv6: rpl: reserve mac_len headroom when recompressed SRH grows ipv6_rpl_srh_rcv() decompresses an RFC 6554 Source Routing Header, swaps the next segment into ipv6_hdr->daddr, recompresses, then pulls the old header and pushes the new one plus the IPv6 header back. The recompressed header can be larger than the received one when the swap reduces the common-prefix length the segments share with daddr (CmprI=0, CmprE>0, seg[0][0] != daddr[0] gives the maximum +8 bytes). pskb_expand_head() was gated on segments_left == 0, so on earlier segments the push consumed unchecked headroom. Once skb_push() leaves fewer than skb->mac_len bytes in front of data, skb_mac_header_rebuild()'s call to: skb_set_mac_header(skb, -skb->mac_len); will store (data - head) - mac_len into the u16 mac_header field, which wraps to ~65530, and the following memmove() writes mac_len bytes ~64KiB past skb->head. A single AF_INET6/SOCK_RAW/IPV6_HDRINCL packet over lo with a two segment type-3 SRH (CmprI=0, CmprE=15) reaches headroom 8 after one pass; KASAN reports a 14-byte OOB write in ipv6_rthdr_rcv. Fix this by expanding the head whenever the remaining room is less than the push size plus mac_len, and request that much extra so the rebuilt MAC header fits afterwards. Fixes: `8610c7c6e3` ("net: ipv6: add support for rpl sr exthdr") Cc: stable <stable@kernel.org> Reported-by: Anthropic Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Link: https://patch.msgid.link/2026042133-gout-unvented-1bd9@gregkh Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-04-27 17:47:26 -07:00
Ido Schimmel	2674d603a9	vrf: Fix a potential NPD when removing a port from a VRF RCU readers that identified a net device as a VRF port using netif_is_l3_slave() assume that a subsequent call to netdev_master_upper_dev_get_rcu() will return a VRF device. They then continue to dereference its l3mdev operations. This assumption is not always correct and can result in a NPD [1]. There is no RCU synchronization when removing a port from a VRF, so it is possible for an RCU reader to see a new master device (e.g., a bridge) that does not have l3mdev operations. Fix by adding RCU synchronization after clearing the IFF_L3MDEV_SLAVE flag. Skip this synchronization when a net device is removed from a VRF as part of its deletion and when the VRF device itself is deleted. In the latter case an RCU grace period will pass by the time RTNL is released. [1] BUG: kernel NULL pointer dereference, address: 0000000000000000 [...] RIP: 0010:l3mdev_fib_table_rcu (net/l3mdev/l3mdev.c:181) [...] Call Trace: <TASK> l3mdev_fib_table_by_index (net/l3mdev/l3mdev.c:201 net/l3mdev/l3mdev.c:189) __inet_bind (net/ipv4/af_inet.c:499 (discriminator 3)) inet_bind_sk (net/ipv4/af_inet.c:469) __sys_bind (./include/linux/file.h:62 (discriminator 1) ./include/linux/file.h:83 (discriminator 1) net/socket.c:1951 (discriminator 1)) __x64_sys_bind (net/socket.c:1969 (discriminator 1) net/socket.c:1967 (discriminator 1) net/socket.c:1967 (discriminator 1)) do_syscall_64 (arch/x86/entry/syscall_64.c:63 (discriminator 1) arch/x86/entry/syscall_64.c:94 (discriminator 1)) entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:130) Fixes: `fdeea7be88` ("net: vrf: Set slave's private flag before linking") Reported-by: Haoze Xie <royenheart@gmail.com> Reported-by: Yifan Wu <yifanwucs@gmail.com> Reported-by: Juefei Pu <tomapufckgml@gmail.com> Reported-by: Yuan Tan <yuantan098@gmail.com> Closes: https://lore.kernel.org/netdev/20260419145332.3988923-1-n05ec@lzu.edu.cn/ Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: David Ahern <dsahern@kernel.org> Link: https://patch.msgid.link/20260423063607.1208202-1-idosch@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-04-27 17:43:22 -07:00
Eric Dumazet	59b145771c	net/sched: sch_fq_pie: annotate data-races in fq_pie_dump_stats() fq_pie_dump_stats() acquires the qdisc spinlock a bit too late. Move this acquisition before we fill tc_fq_pie_xstats with live data. Alternative would be to add READ_ONCE() and WRITE_ONCE() annotations, but the spinlock is needed anyway to scan q->new_flows and q->old_flows. Fixes: `ec97ecf1eb` ("net: sched: add Flow Queue PIE packet scheduler") Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Jamal Hadi Salim <jhs@mojatatu.com> Link: https://patch.msgid.link/20260423063527.2568262-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-04-27 17:41:52 -07:00
Eric Dumazet	d3aeb889dc	net/sched: sch_choke: annotate data-races in choke_dump_stats() choke_dump_stats() only runs with RTNL held. It reads fields that can be changed in qdisc fast path. Add READ_ONCE()/WRITE_ONCE() annotations. Fixes: `edb09eb17e` ("net: sched: do not acquire qdisc spinlock in qdisc/class stats dump") Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Jamal Hadi Salim <jhs@mojatatu.com> Link: https://patch.msgid.link/20260423062839.2524324-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-04-27 17:41:08 -07:00
Lorenzo Bianconi	bde34e84ed	net: airoha: Do not read uninitialized fragment address in airoha_dev_xmit() The transmit loop in airoha_dev_xmit() reads fragment address and length during its final iteration, when the loop index equals skb_shinfo(skb)->nr_frags, at which point the fragment data is uninitialized. While these values are never consumed, the read itself is unsafe and may trigger a page fault. Fix this by avoiding the fragment read on the last iteration. Additionally, move the skb pointer from the first to the last used packet descriptor, so that airoha_qdma_tx_napi_poll() defers freeing the skb until the final descriptor is processed. Fixes: `23020f0493` ("net: airoha: Introduce ethernet support for EN7581 SoC") Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Link: https://patch.msgid.link/20260424-airoha-xmit-fix-read-frag-v1-1-fdc0a83c79e8@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-04-27 17:40:11 -07:00
Lorenzo Bianconi	e070aac63b	net: airoha: Do not wake all netdev TX queues in airoha_qdma_wake_netdev_txqs() Do not wake every netdev TX queue across all ports sharing the QDMA by running netif_tx_wake_all_queues() in airoha_qdma_wake_netdev_txqs(); wake only the ones that are mapped to the specific stopped QDMA hw TX queue. This avoids waking stopped netdev TX queues that are mapped to a different QDMA hw TX queue. Introduce the airoha_qdma_get_txq() utility routine. Fixes: `b94769eb2f` ("net: airoha: Fix possible TX queue stall in airoha_qdma_tx_napi_poll()") Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Link: https://patch.msgid.link/20260421-airoha-wake_netdev_txqs-optmization-v1-1-e0be95115d53@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-04-27 17:37:57 -07:00
Lorenzo Bianconi	3854de7b38	net: airoha: stop net_device TX queue before updating CPU index Currently, the airoha_eth driver updates the CPU index register before verifying whether the number of free descriptors has fallen below the threshold. Move the net_device TX queue length check before the TX CPU index update, so that the TX CPU index is updated even if there are more packets to be transmitted but the net_device TX queue is going to be stopped, accounting for the inflight packets. Fixes: `1d30417410` ("net: airoha: Implement BQL support") Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Link: https://patch.msgid.link/20260421-airoha-xmit-stop-condition-v1-1-e670d6a48467@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-04-27 17:34:03 -07:00
Lorenzo Bianconi	2d9f5a1182	net: airoha: fix BQL imbalance in TX path Fix a possible BQL imbalance in airoha_dev_xmit(), where inflight packets are accounted only for the AIROHA_NUM_TX_RING netdev TX queues. The queue index is computed as: qid = skb_get_queue_mapping(skb) % ARRAY_SIZE(qdma->q_tx) txq = netdev_get_tx_queue(dev, qid); However, airoha_qdma_tx_napi_poll() accounts completions across all netdev TX queues (num_tx_queues), leading to inconsistent BQL accounting. Also reset all netdev TX queues in the ndo_stop callback. Fixes: `1d30417410` ("net: airoha: Implement BQL support") Fixes: `c9f947769b` ("net: airoha: Reset BQL stopping the netdevice") Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Link: https://patch.msgid.link/20260421-airoha-fix-bql-v1-1-f135afe4275b@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-04-27 17:32:59 -07:00
Jakub Kicinski	5bd6252b9d	Merge branch 'netem-bug-fixes' Stephen Hemminger says: ==================== netem: bug fixes These bugs were found when doing AI-assisted review of sch_netem.c during investigation of the packet duplication recursion problem addressed in Jamal's series. The fixes cover: - probability gaps in the 4-state Markov loss model - queue limit not accounting for reordered packets - PRNG reseeded on every tc change, breaking reproducibility - slot configuration not validated (inverted ranges, negative delays, negative limits) - slot delay arithmetic overflow for ranges above ~2.1 seconds - negative latency and jitter wrapping to huge time_to_send values via u64 arithmetic ==================== Link: https://patch.msgid.link/20260418032027.900913-1-stephen@networkplumber.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-04-27 17:30:48 -07:00
Stephen Hemminger	90be9fedb2	net/sched: netem: check for negative latency and jitter Reject requests with negative latency or jitter. A negative value added to current timestamp (u64) wraps to an enormous time_to_send, disabling dequeue. The original UAPI used u32 for these values; the conversion to 64-bit time values via TCA_NETEM_LATENCY64 and TCA_NETEM_JITTER64 allowed signed values to reach the kernel without validation. Jitter is already silently clamped by an abs() in netem_change(); that abs() can be removed in a follow-up once this rejection is in place. Fixes: `99803171ef` ("netem: add uapi to express delay and jitter in nanoseconds") Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20260418032027.900913-7-stephen@networkplumber.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-04-27 17:30:28 -07:00
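The wrap described in this commit follows directly from adding a negative signed value to a u64 timestamp. A small sketch under stated assumptions: `demo_time_to_send()` and `demo_check_latency()` are hypothetical helpers, and returning `-EINVAL` mirrors the rejection the commit describes rather than quoting the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Unvalidated path: a negative latency wraps the u64 sum to a huge value. */
static uint64_t demo_time_to_send(uint64_t now, int64_t latency)
{
	return now + (uint64_t)latency;
}

/* Validation step: reject negative values before they reach the arithmetic. */
static int demo_check_latency(int64_t latency)
{
	return latency < 0 ? -EINVAL : 0;
}

/* Usage: now = 1000 ns and latency = -2000 ns wraps to 2^64 - 1000. */
static void demo_latency_selfcheck(void)
{
	assert(demo_time_to_send(1000, -2000) == UINT64_MAX - 999);
	assert(demo_check_latency(-2000) == -EINVAL);
	assert(demo_check_latency(2000) == 0);
}
```

A packet stamped with such a value would not become eligible for dequeue for centuries, which is the "disabling dequeue" effect the message refers to.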
Stephen Hemminger	51e94e1e2f	net/sched: netem: fix slot delay calculation overflow get_slot_next() computes a random delay between min_delay and max_delay using: get_random_u32() * (max_delay - min_delay) >> 32 This overflows signed 64-bit arithmetic when the delay range exceeds approximately 2.1 seconds (2^31 nanoseconds), producing a negative result that effectively disables slot-based pacing. This is a realistic configuration for WAN emulation (e.g., slot 1s 5s). Use mul_u64_u32_shr() which handles the widening multiply without overflow. Fixes: `0a9fe5c375` ("netem: slotting with non-uniform distribution") Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20260418032027.900913-6-stephen@networkplumber.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-04-27 17:30:28 -07:00
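To make the overflow threshold and the widening multiply concrete, here is a userspace sketch. The `demo_*` names are hypothetical; `demo_scale_shr32()` only imitates what `mul_u64_u32_shr()` computes for a shift of 32 by splitting the 64-bit operand into two 32-bit halves, and is not the kernel helper itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True if range * rnd would not fit in a signed 64-bit product. */
static bool demo_would_overflow_s64(uint64_t range, uint32_t rnd)
{
	return rnd != 0 && range > (uint64_t)INT64_MAX / rnd;
}

/*
 * Compute (range * rnd) >> 32 without overflowing 64 bits:
 * (hi * 2^32 + lo) * rnd / 2^32 == hi * rnd + (lo * rnd) / 2^32.
 * Assumes range < 2^63, as for a validated nanosecond delay range.
 */
static uint64_t demo_scale_shr32(uint64_t range, uint32_t rnd)
{
	uint64_t hi = range >> 32;
	uint64_t lo = range & 0xffffffffu;

	assert(range <= (uint64_t)INT64_MAX);
	return hi * rnd + ((lo * rnd) >> 32);
}

/* Usage: a 4 s range overflows the naive product, a 1 s range does not. */
static void demo_slot_selfcheck(void)
{
	assert(demo_would_overflow_s64(4000000000ULL, 0xffffffffu));
	assert(!demo_would_overflow_s64(1000000000ULL, 0xffffffffu));
	/* Half of a 10 s range, computed without overflow. */
	assert(demo_scale_shr32(10000000000ULL, 0x80000000u) == 5000000000ULL);
}
```

The overflow check confirms the threshold quoted in the commit: with a random factor close to 2^32, any range above roughly 2^31 ns pushes the product past 2^63.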
Stephen Hemminger	01801c359a	net/sched: netem: validate slot configuration Reject slot configurations that have no defensible meaning: - negative min_delay or max_delay - min_delay greater than max_delay - negative dist_delay or dist_jitter - negative max_packets or max_bytes Negative or out-of-order delays underflow in get_slot_next(), producing garbage intervals. Negative limits trip the per-slot accounting (packets_left/bytes_left <= 0) on the first packet of every slot, defeating the rate-limiting half of the slot feature. Note that dist_jitter has been silently coerced to its absolute value by get_slot() since the feature was introduced; rejecting negatives here converts that silent coercion into -EINVAL. The abs() can be removed in a follow-up. Fixes: `836af83b54` ("netem: support delivering packets in delayed time slots") Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20260418032027.900913-5-stephen@networkplumber.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-04-27 17:30:28 -07:00
Stephen Hemminger	986afaf809	net/sched: netem: only reseed PRNG when seed is explicitly provided netem_change() unconditionally reseeds the PRNG on every tc change command. If TCA_NETEM_PRNG_SEED is not specified, a new random seed is generated, destroying reproducibility for users who set a deterministic seed on a previous change. Move the initial random seed generation to netem_init() and only reseed in netem_change() when TCA_NETEM_PRNG_SEED is explicitly provided by the user. Fixes: `4072d97ddc` ("netem: add prng attribute to netem_sched_data") Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20260418032027.900913-4-stephen@networkplumber.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-04-27 17:30:28 -07:00
Stephen Hemminger	4185701fcc	net/sched: netem: fix queue limit check to include reordered packets The queue limit check in netem_enqueue() uses q->t_len which only counts packets in the internal tfifo. Packets placed in sch->q by the reorder path (__qdisc_enqueue_head) are not counted, allowing the total queue occupancy to exceed sch->limit under reordering. Include sch->q.qlen in the limit check. Fixes: `f8d4bc4550` ("net/sched: netem: account for backlog updates from child qdisc") Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20260418032027.900913-3-stephen@networkplumber.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-04-27 17:30:27 -07:00
Stephen Hemminger	732b463449	net/sched: netem: fix probability gaps in 4-state loss model The 4-state Markov chain in loss_4state() has gaps at the boundaries between transition probability ranges. The comparisons use: if (rnd < a4) else if (a4 < rnd && rnd < a1 + a4) When rnd equals a boundary value exactly, neither branch matches and no state transition occurs. The redundant lower-bound check (a4 < rnd) is already implied by being in the else branch. Remove the unnecessary lower-bound comparisons so the ranges are contiguous and every random value produces a transition, matching the GI (General and Intuitive) loss model specification. This bug goes back to original implementation of this model. Fixes: `661b79725f` ("netem: revised correlated loss generator") Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20260418032027.900913-2-stephen@networkplumber.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-04-27 17:30:27 -07:00
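The boundary gap described in the commit above is easy to reproduce outside the kernel. The sketch below is a simplified two-range chooser rather than the real loss_4state(); the function names and the return convention (-1 for "no transition") are assumptions made for illustration.

```c
#include <stdint.h>

/*
 * Buggy shape: the second range excludes its own lower bound,
 * so rnd == a4 matches neither branch.
 */
int demo_pick_with_gap(uint32_t rnd, uint32_t a4, uint32_t a1)
{
	uint64_t upper = (uint64_t)a1 + a4;	/* widen to avoid wraparound */

	if (rnd < a4)
		return 0;
	else if (a4 < rnd && rnd < upper)
		return 1;
	return -1;	/* no transition taken */
}

/*
 * Fixed shape: reaching the else branch already implies rnd >= a4,
 * so dropping the lower-bound test makes the ranges contiguous.
 */
int demo_pick_contiguous(uint32_t rnd, uint32_t a4, uint32_t a1)
{
	uint64_t upper = (uint64_t)a1 + a4;

	if (rnd < a4)
		return 0;
	else if (rnd < upper)
		return 1;
	return -1;	/* rnd lies beyond both ranges */
}
```

With a4 = 10 and a1 = 5, the value rnd = 10 falls through the buggy version but selects the second range in the fixed one.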
Nikola Z. Ivanov	35eaa6d8d6	netdevsim: zero initialize struct iphdr in dummy sk_buff Syzbot reports a KMSAN uninit-value originating from nsim_dev_trap_skb_build, with the allocation also being performed in the same function. Fix this by calling skb_put_zero instead of skb_put to guarantee zero initialization of the whole IP header. Closes: https://syzkaller.appspot.com/bug?extid=23d7fcd204e3837866ff Fixes: `da58f90f11` ("netdevsim: Add devlink-trap support") Signed-off-by: Nikola Z. Ivanov <zlatistiv@gmail.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20260426201434.742030-1-zlatistiv@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-04-27 17:03:13 -07:00
Linus Torvalds	3b3bea6d4b	Merge tag 'cgroup-for-7.1-rc1-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup Pull cgroup fixes from Tejun Heo: - Fix UAF race in psi pressure_write() against cgroup file release by extending cgroup_mutex coverage and ordering of->priv access after cgroup_kn_lock_live() - Fix integer overflow in rdmacg_try_charge() when usage equals INT_MAX by performing the increment in s64 - Fix asymmetric DL bandwidth accounting on cpuset attach rollback by recording the CPU used by dl_bw_alloc() so cancel_attach() returns the reservation to the same root domain - Fix nr_dying_subsys_* race that briefly showed 0 in cgroup.stat after rmdir by incrementing from kill_css() instead of offline_css() - Typo fix in cgroup-v2 documentation * tag 'cgroup-for-7.1-rc1-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup: docs: cgroup: fix typo 'protetion' -> 'protection'; cgroup: Increment nr_dying_subsys_* from rmdir context; cgroup/cpuset: record DL BW alloc CPU for attach rollback; cgroup/rdma: fix integer overflow in rdmacg_try_charge(); sched/psi: fix race between file release and pressure write	2026-04-27 16:51:27 -07:00
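The rdmacg_try_charge() item in the pull above describes a classic signed-overflow fix: do the increment in a wider type before comparing against the limit. The following userspace sketch shows the pattern only; `demo_try_charge()` and its error code are hypothetical and do not mirror the kernel function's signature.

```c
#include <errno.h>
#include <limits.h>
#include <stdint.h>

/*
 * Charge one unit against *usage, refusing if the result would exceed max.
 * The addition is done in 64 bits, so usage == INT_MAX cannot wrap.
 */
int demo_try_charge(int *usage, int max)
{
	int64_t new_usage = (int64_t)*usage + 1;

	if (new_usage > max)
		return -EAGAIN;	/* limit reached; usage left unchanged */

	*usage = (int)new_usage;	/* safe: new_usage <= max <= INT_MAX */
	return 0;
}
```

Incrementing an `int` that already holds INT_MAX is undefined behaviour in C, which is why the comparison has to happen on the widened value rather than after the fact.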
Linus Torvalds	a1a671092d	Merge tag 'fs_for_v7.1-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs Pull isofs and udf fixes from Jan Kara: "Several isofs and udf fixes" * tag 'fs_for_v7.1-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs: docs: isofs: replace dead ECMA-119 FTP link; udf: reject descriptors with oversized CRC length; isofs: use QSTR_LEN() in isofs_cmp; isofs: validate block number from NFS file handle in isofs_export_iget; isofs: validate Rock Ridge CE continuation extent against volume size	2026-04-27 16:45:39 -07:00
Linus Torvalds	53b6156308	Merge tag 'fsnotify_for_v7.1-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs Pull fsnotify fixes from Jan Kara: "Three fixes for fsnotify / fanotify" * tag 'fsnotify_for_v7.1-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs: fsnotify: fix inode reference leak in fsnotify_recalc_mask(); fanotify: Fix spelling mistake "enforecement" -> "enforcement"; fanotify: fix false positive on permission events	2026-04-27 16:40:24 -07:00
Felix Gu	f5c6a272b6	spi: axiado: replace usleep_range() with udelay() in IRQ path ax_spi_fill_tx_fifo() can be called from ax_spi_irq() which is a hard irq handler. Replace usleep_range(10, 10) with udelay(10) in atomic context. Fixes: `e75a6b00ad` ("spi: axiado: Add driver for Axiado SPI DB controller") Signed-off-by: Felix Gu <ustc.gu@gmail.com> Link: https://patch.msgid.link/20260428-axiado-v1-1-cd767500af72@gmail.com Signed-off-by: Mark Brown <broonie@kernel.org>	2026-04-28 08:35:59 +09:00
Linus Torvalds	73082fbdb1	Merge tag 'for-7.1-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux Pull btrfs fixes from David Sterba: - space reservation fixes: - correctly undo 'may_use' accounting for remap tree - avoid double decrement of 'may_use' when submitting async io - actually enable the shutdown ioctl callback (not just the superblock ops) - raid stripe tree fixes when deleting extents - add missing error handling - fix various incorrect values set - fix transaction state when removing a directory, possibly leading to EIO during log replay - additional b-tree node key checks during metadata readahead - error handling and transaction abort updates * tag 'for-7.1-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: btrfs: fix double-decrement of bytes_may_use in submit_one_async_extent(); btrfs: check return value of btrfs_partially_delete_raid_extent(); btrfs: handle -EAGAIN from btrfs_duplicate_item and refresh stale leaf pointer; btrfs: replace ASSERT with proper error handling in stripe lookup fallback; btrfs: fix wrong min_objectid in btrfs_previous_item() call; btrfs: fix raid stripe search missing entries at leaf boundaries; btrfs: copy devid in btrfs_partially_delete_raid_extent(); btrfs: handle unexpected free-space-tree key types; btrfs: fix missing last_unlink_trans update when removing a directory; btrfs: don't clobber errors in add_remap_tree_entries(); btrfs: enable shutdown ioctl for non-experimental builds; btrfs: apply first key check for readahead when possible; btrfs: abort transaction in do_remap_reloc_trans() on failure; btrfs: fix bytes_may_use leak in do_remap_reloc_trans(); btrfs: fix bytes_may_use leak in move_existing_remap()	2026-04-27 16:35:44 -07:00
Linus Torvalds	d762a96e3d	Merge tag 'for-7.1/dm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm Pull device mapper fix from Mikulas Patocka: - fix metadata corruption in dm-thin * tag 'for-7.1/dm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm: dm-thin: fix metadata refcount underflow	2026-04-27 16:33:23 -07:00
David Windsor	1e5a8eed78	selinux: don't reserve xattr slot when we won't fill it Move lsm_get_xattr_slot() below the SBLABEL_MNT check so we don't leave a NULL-named slot in the array when returning -EOPNOTSUPP; filesystem initxattrs() callbacks stop iterating at the first NULL ->name, silently dropping xattrs installed by later LSMs. Cc: stable@vger.kernel.org Signed-off-by: David Windsor <dwindsor@gmail.com> Signed-off-by: Paul Moore <paul@paul-moore.com>	2026-04-27 19:32:56 -04:00
Jakub Kicinski	3618442d54	MAINTAINERS: add pcnet_cs to PCMCIA Per discussion under the Link make sure Dominik can help with the patches to drivers/net/ethernet/8390/pcnet_cs.c cc: linux@dominikbrodowski.net Link: https://lore.kernel.org/aeomUh5JqFvkLTH7@scops.dominikbrodowski.net Acked-by: Dominik Brodowski <linux@dominikbrodowski.net> Link: https://patch.msgid.link/20260423220857.3490118-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-04-27 16:30:25 -07:00
Zongyao Chen	032e70aff0	selinux: use sk blob accessor in socket permission helpers SELinux socket state lives in the composite LSM socket blob. sock_has_perm() and nlmsg_sock_has_extended_perms() currently dereference sk->sk_security directly, which assumes the SELinux socket blob is at offset zero. In stacked configurations that assumption does not hold. If another LSM allocates socket blob storage before SELinux, these helpers may read the wrong blob and feed invalid SID and class values into AVC checks. Use selinux_sock() instead of accessing sk->sk_security directly. Fixes: `d1d991efaf` ("selinux: Add netlink xperm support") Cc: stable@vger.kernel.org # v6.13+ Signed-off-by: Zongyao Chen <ZongYao.Chen@linux.alibaba.com> Signed-off-by: Paul Moore <paul@paul-moore.com>	2026-04-27 19:26:57 -04:00
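The offset assumption described in the commit above can be pictured with a small userspace model. Everything here is hypothetical: `demo_sock`, `demo_sk_sec`, `demo_blob_offset` and `demo_selinux_sock()` are illustrative names, and the real selinux_sock() accessor may differ in detail. The sketch only shows why a direct dereference of the shared blob pointer is correct solely when the module's offset happens to be zero.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-socket security state owned by one module. */
struct demo_sk_sec {
	uint32_t sid;
	uint16_t sclass;
};

/* Hypothetical socket carrying one composite blob shared by all modules. */
struct demo_sock {
	void *sk_security;
};

/* Offset of this module's slice; nonzero when another module comes first. */
size_t demo_blob_offset;

/* Accessor: always apply the offset instead of assuming it is zero. */
struct demo_sk_sec *demo_selinux_sock(const struct demo_sock *sk)
{
	return (struct demo_sk_sec *)((char *)sk->sk_security + demo_blob_offset);
}
```

Code that casts `sk->sk_security` directly reads the first module's slice, which is the failure mode the commit describes for stacked configurations.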
Keith Busch	cfcbfe5cb1	PCI: Don't fallback to bus reset after failed slot reset If a bus has hotplug slots that implement the slot's reset_slot callback, it is not safe to do the non-slot specific bus reset, so don't fallback to it. If a slot reset does fail, the subsequent bus reset will attempt a 2nd link reset on top of previous and fail to handle the hotplug events. Fixes: `8238cb69c0` ("PCI: Make reset_subordinate hotplug safe") Signed-off-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Link: https://patch.msgid.link/20260421150644.3543733-1-kbusch@meta.com	2026-04-27 17:26:59 -05:00
Linus Torvalds	a7cc308da5	Merge tag 'mailbox-v7.1' of git://git.kernel.org/pub/scm/linux/kernel/git/jassibrar/mailbox Pull mailbox updates from Jassi Brar: - core: fix NULL message handling and add API to query TX queue slots - test: resolve concurrency bugs, dangling IRQs, and memory leaks - dt-bindings: qcom: add Eliza IPCC - mtk: fix address calculation and pointer handling bugs - cix: resolve SCMI suspend timeouts - misc memory allocation optimizations and cleanups * tag 'mailbox-v7.1' of git://git.kernel.org/pub/scm/linux/kernel/git/jassibrar/mailbox: mailbox: mailbox-test: make data_ready a per-instance variable; mailbox: mailbox-test: initialize struct earlier; mailbox: mailbox-test: don't free the reused channel; mailbox: mailbox-test: handle channel errors consistently; mailbox: update kdoc for struct mbox_controller; mailbox: add sanity check for channel array; mailbox: mailbox-test: free channels on probe error; mailbox: prefix new constants with MBOX_; dt-bindings: mailbox: qcom-ipcc: Document the Eliza Inter-Processor Communication Controller; mailbox: cix: Add IRQF_NO_SUSPEND to mailbox interrupt; mailbox: Fix NULL message support in mbox_send_message(); mailbox: remove superfluous internal header; mailbox: correct kdoc title for mbox_bind_client; mailbox: test: really ignore optional memory resources; mailbox: exynos: drop superfluous mbox setting per channel; mailbox: mtk-cmdq: Fix CURR and END addr for task insert case; mailbox: mtk-vcp-mailbox: Fix the return value in mtk_vcp_mbox_xlate(); mailbox: hi6220: kzalloc + kcalloc to kzalloc; mailbox: rockchip: kzalloc + kcalloc to kzalloc; mailbox: add API to query available TX queue slots	2026-04-27 15:21:18 -07:00
Daan De Meyer	0898a81762	cdrom, scsi: sr: propagate read-only status to block layer via set_disk_ro() The cdrom core never calls set_disk_ro() for a registered device, so BLKROGET on a CD-ROM device always returns 0 (writable), even when the drive has no write capabilities and writes will inevitably fail. This causes problems for userspace that relies on BLKROGET to determine whether a block device is read-only. For example, systemd's loop device setup uses BLKROGET to decide whether to create a loop device with LO_FLAGS_READ_ONLY. Without the read-only flag, writes pass through the loop device to the CD-ROM and fail with I/O errors. systemd-fsck similarly checks BLKROGET to decide whether to run fsck in no-repair mode (-n). The write-capability bits in cdi->mask come from two different sources: CDC_DVD_RAM and CDC_CD_RW are populated by the driver from the MODE SENSE capabilities page (page 0x2A) before register_cdrom() is called, while CDC_MRW_W and CDC_RAM require the MMC GET CONFIGURATION command and were only probed by cdrom_open_write() at device open time. This meant that any attempt to compute the writable state from the full mask at probe time was incorrect, because the GET CONFIGURATION bits were still unset (and cdi->mask is initialized such that capabilities are assumed present). Fix this by factoring the GET CONFIGURATION probing out of cdrom_open_write() into a new exported helper, cdrom_probe_write_features(), and having sr call it from sr_probe() right after get_capabilities() has populated the MODE SENSE bits. register_cdrom() then calls set_disk_ro() based on the full write-capability mask (CDC_DVD_RAM \| CDC_MRW_W \| CDC_RAM \| CDC_CD_RW) so the block layer reflects the drive's actual write support. The feature queries used (CDF_MRW and CDF_RWRT via GET CONFIGURATION with RT=00) report drive-level capabilities that are persistent across media, so a single probe before register_cdrom() is sufficient and the redundant probe at open time is dropped. 
With set_disk_ro() now accurate, the long-vestigial cd->writeable flag in sr can go: get_capabilities() used to set cd->writeable based on the same four mask bits, but because CDC_MRW_W and CDC_RAM default to "capability present" in cdi->mask and aren't touched by MODE SENSE, the condition that gated cd->writeable was always true, making it unconditionally 1. Replace the corresponding gate in sr_init_command() with get_disk_ro(cd->disk), which turns a previously no-op check into a real one and also catches kernel-internal bio writers that bypass blkdev_write_iter()'s bdev_read_only() check. The sd driver (SCSI disks) does not have this problem because it checks the MODE SENSE Write Protect bit and calls set_disk_ro() accordingly. The sr driver cannot use the same approach because the MMC specification does not define the WP bit in the MODE SENSE device-specific parameter byte for CD-ROM devices. Fixes: `1da177e4c3` ("Linux-2.6.12-rc2") Signed-off-by: Daan De Meyer <daan@amutable.com> Reviewed-by: Phillip Potter <phil@philpotter.co.uk> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Phillip Potter <phil@philpotter.co.uk> Link: https://patch.msgid.link/20260427210139.1400-2-phil@philpotter.co.uk Signed-off-by: Jens Axboe <axboe@kernel.dk>	2026-04-27 15:52:51 -06:00
Jens Axboe	aa03cfe9db	Merge tag 'nvme-7.1-2026-04-24' of git://git.infradead.org/nvme into block-7.1 Pull NVMe fixes from Keith: "- Target data transfer size configuration (Aurelien) - Enable P2P for RDMA (Shivaji Kant) - TCP target updates (Maurizio, Alistair, Chaitanya, Shivam Kumar) - TCP host updates (Alistair, Chaitanya) - Authentication updates (Alistair, Daniel, Chris Leech) - Multipath fixes (John Garry) - New quirks (Alan Cui, Tao Jiang) - Apple driver fix (Fedor Pchelkin) - PCI admin doorbell update fix (Keith)" * tag 'nvme-7.1-2026-04-24' of git://git.infradead.org/nvme: (22 commits) nvme-auth: Hash DH shared secret to create session key; nvme-pci: fix missed admin queue sq doorbell write; nvme-auth: Include SC_C in RVAL controller hash; nvme-tcp: teardown circular locking fixes; nvmet-tcp: Don't clear tls_key when freeing sq; Revert "nvmet-tcp: Don't free SQ on authentication success"; nvme: skip trace completion for host path errors; nvme-pci: add quirk for Memblaze Pblaze5 (0x1c5f:0x0555); nvme-multipath: put module reference when delayed removal work is canceled; nvme: expose TLS mode; nvme-apple: drop invalid put of admin queue reference count; nvme-core: fix parameter name in comment; nvmet: avoid recursive nvmet-wq flush in nvmet_ctrl_free; nvme-multipath: drop head pointer check in nvme_mpath_clear_current_path(); nvme: add quirk NVME_QUIRK_IGNORE_DEV_SUBNQN for 144d:a808 (Samsung PM981/983/970 EVO Plus); nvmet-tcp: fix race between ICReq handling and queue teardown; nvmet-tcp: remove redundant calls to nvmet_tcp_fatal_error(); nvmet-tcp: propagate nvmet_tcp_build_pdu_iovec() errors to its callers; nvme: enable PCI P2PDMA support for RDMA transport; nvmet: introduce new mdts configuration entry; ...	2026-04-27 15:47:21 -06:00
Bartosz Golaszewski	ea216d3ae7	ACPI: bus: add missing forward declaration to acpi_bus.h The header references struct notifier_block but neither includes linux/notifier.h nor contains the relevant forward declaration. Add the latter for correctness. Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com> [ rjw: Subject tweak ] Link: https://patch.msgid.link/20260427112238.132419-1-bartosz.golaszewski@oss.qualcomm.com Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2026-04-27 22:02:24 +02:00
Shivam Kalra	4b506ea535	ACPI: video: force native backlight on HP OMEN 16 (8A44) The HP OMEN 16 Gaming Laptop (board name 8A44) has a mux-less hybrid GPU configuration with AMD Rembrandt (Radeon 680M) and NVIDIA GA104 (RTX 3070 Ti). The internal eDP panel is wired to the AMD iGPU. When Nouveau loads without GSP firmware, the ACPI video backlight device (acpi_video0) gets registered alongside the native AMD backlight (amdgpu_bl2). In this state, writes to amdgpu_bl2 update the software brightness value but fail to change the physical panel brightness. Force native backlight to prevent acpi_video0 from registering. Confirmed that booting with acpi_backlight=native resolves the issue. Cc: All applicable <stable@vger.kernel.org> Signed-off-by: Shivam Kalra <shivamkalra98@zohomail.in> Link: https://patch.msgid.link/20260426-omen-16-backlight-fix-v1-1-62364f268ea6@zohomail.in Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2026-04-27 22:00:00 +02:00
Rafael J. Wysocki	ad034486ad	ACPI: TAD: Fix up a comment in acpi_tad_probe() Fix grammar in the comment preceding the pm_runtime_set_active() call in acpi_tad_probe(). Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Link: https://patch.msgid.link/8678306.T7Z3S40VBb@rafael.j.wysocki	2026-04-27 21:57:09 +02:00
Rafael J. Wysocki	e0d2190104	ACPI: TAD: RTC: Refine timer value computations and checks Since rtc_tm_to_ktime() may overflow for large RTC time values and full second granularity is sufficient in timer value computations in acpi_tad_rtc_set_alarm() and acpi_tad_rtc_read_alarm(), use rtc_tm_to_time64() instead of that function, which also allows the computations to be simplified. Moreover, U32_MAX is a special "timer disabled" value, so make acpi_tad_rtc_set_alarm() reject it when attempting to program the alarm timers. Fixes: `7572dcabe3` ("ACPI: TAD: Add alarm support to the RTC class device interface") Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Alexandre Belloni <alexandre.belloni@bootlin.com> Link: https://patch.msgid.link/3414608.aeNJFYEL58@rafael.j.wysocki	2026-04-27 21:57:09 +02:00
Rafael J. Wysocki	77dd14ab1b	ACPI: TAD: Use devres for all driver cleanup The code in acpi_tad_remove() needs to run after the unregistration of the devres-managed RTC class device so that it doesn't race with the class callbacks of the latter. To make that happen, pass it to devm_add_action_or_reset() before registering the RTC class device. Fixes: `7572dcabe3` ("ACPI: TAD: Add alarm support to the RTC class device interface") Fixes: `8a1e7f4b17` ("ACPI: TAD: Add RTC class device interface") Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Link: https://patch.msgid.link/14001754.uLZWGnKmhe@rafael.j.wysocki	2026-04-27 21:57:09 +02:00
Rafael J. Wysocki	88b2670ea6	ACPI: TAD: Use __ATTRIBUTE_GROUPS() macro Recent commit `93afe8ba9b` ("ACPI: TAD: Use dev_groups in struct device_driver") switched over the ACPI TAD driver to using device attribute groups instead of creating and removing the device sysfs attributes directly, but it might go one step farther and use the __ATTRIBUTE_GROUPS() macro which would reduce the code size slightly. Do it now. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> [ rjw: Fixed typo in the changelog ] Link: https://patch.msgid.link/1961102.tdWV9SEqCh@rafael.j.wysocki Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2026-04-27 21:56:44 +02:00
Jinjie Ruan	75141a770f	ACPI: CPPC: Fix related_cpus inconsistency during CPU hotplug When concurrently bringing up and down two SMT threads of a physical core, many warning call traces occur as below: The issue timeline is as follows: 1. When the system starts, cpufreq: CPU: 220, policy->related_cpus: 220-221, policy->cpus: 220-221 2. Offline CPU 220 and CPU 221. 3. Online CPU 220 - CPU 221 is now offline, as acpi_get_psd_map() use for_each_online_cpu(), so the cpu_data->shared_cpu_map, policy->cpus, and related_cpus has only CPU 220. cpufreq: CPU: 220, policy->related_cpus: 220, policy->cpus: 220 4. Offline CPU 220 5. Online CPU 221, the below call trace occurs: - Since CPU 220 and CPU 221 share one policy, and policy->related_cpus = 220 after step 3, so CPU 221 is not in policy->related_cpus but per_cpu(cpufreq_cpu_data, cpu221) is not NULL. After reverting commit `56eb0c0ed3` ("ACPI: CPPC: Fix remaining for_each_possible_cpu() to use online CPUs"), the issue disappeared. The _PSD (P-State Dependency) defines the hardware-level dependency of frequency control across CPU cores. Since this relationship is a physical attribute of the hardware topology, it remains constant regardless of the online or offline status of the CPUs. Using for_each_online_cpu() in acpi_get_psd_map() is problematic. If a CPU is offline, it will be excluded from the shared_cpu_map. Consequently, if that CPU is brought online later, the kernel will fail to recognize it as part of any shared frequency domain. Switch back to for_each_possible_cpu() to ensure that all cores defined in the ACPI tables are correctly mapped into their respective performance domains from the start. This aligns with the logic of policy->related_cpus, which must encompass all potentially available cores in the domain to prevent logic gaps during CPU hotplug operations. 
To resolve the original issue regarding the "nosmt" or "nosmt=force" boot parameter, as send_pcc_cmd() function already does if (!desc) continue, so reverting that loop back to for_each_possible_cpu() is ok, only need to change the match_cpc_ptr NULL case in acpi_get_psd_map() to continue as Sean suggested. How to reproduce, on arm64 machine with SMT support which use acpi cppc cpufreq driver: bash test.sh 220 & bash test.sh 221 & The test.sh is as below: while true do echo 0 > /sys/devices/system/cpu/cpu${1}/online sleep 0.5 cat /sys/devices/system/cpu/cpu${1}/cpufreq/related_cpus echo 1 > /sys/devices/system/cpu/cpu${1}/online cat /sys/devices/system/cpu/cpu${1}/cpufreq/related_cpus done CPU: 221 PID: 1119 Comm: cpuhp/221 Kdump: loaded Not tainted 6.6.0debug+ #5 Hardware name: To be filled by O.E.M. S920X20/BC83AMDA01-7270Z, BIOS 20.39 09/04/2024 pstate: a1400009 (NzCv daif +PAN -UAO -TCO +DIT -SSBS BTYPE=--) pc : cpufreq_online+0x8ac/0xa90 lr : cpuhp_cpufreq_online+0x18/0x30 sp : ffff80008739bce0 x29: ffff80008739bce0 x28: 0000000000000000 x27: ffff28400ca32200 x26: 0000000000000000 x25: 0000000000000003 x24: ffffd483503ff000 x23: ffffd483504051a0 x22: ffffd48350024a00 x21: 00000000000000dd x20: 000000000000001d x19: ffff28400ca32000 x18: 0000000000000000 x17: 0000000000000020 x16: ffffd4834e6a3fc8 x15: 0000000000000020 x14: 0000000000000008 x13: 0000000000000001 x12: 00000000ffffffff x11: 0000000000000040 x10: ffffd48350430728 x9 : ffffd4834f087c78 x8 : 0000000000000001 x7 : ffff2840092bdf00 x6 : ffffd483504264f0 x5 : ffffd48350405000 x4 : ffff283f7f95cc60 x3 : 0000000000000000 x2 : ffff53bc2f94b000 x1 : 00000000000000dd x0 : 0000000000000000 Call trace: cpufreq_online+0x8ac/0xa90 cpuhp_cpufreq_online+0x18/0x30 cpuhp_invoke_callback+0x128/0x580 cpuhp_thread_fun+0x110/0x1b0 smpboot_thread_fn+0x140/0x190 kthread+0xec/0x100 ret_from_fork+0x10/0x20 ---[ end trace 0000000000000000 ]--- Cc: All applicable <stable@vger.kernel.org> Fixes: `56eb0c0ed3` ("ACPI: CPPC: Fix 
remaining for_each_possible_cpu() to use online CPUs") Co-developed-by: Sean Kelley <skelley@nvidia.com> Signed-off-by: Sean Kelley <skelley@nvidia.com> Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com> [ rjw: Changelog edits ] Link: https://patch.msgid.link/20260417040112.3727756-1-ruanjinjie@huawei.com Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2026-04-27 21:50:26 +02:00


1447036 Commits