linux

mirror of https://github.com/torvalds/linux.git synced 2026-06-05 04:56:13 +02:00

Author	SHA1	Message	Date
Matthieu Baerts (NGI0)	3cf1249289	mptcp: pm: ADD_ADDR rtx: resched blocked ADD_ADDR quicker When an ADD_ADDR needs to be retransmitted and another one has already been prepared -- e.g. multiple ADD_ADDRs have been sent in a row and need to be retransmitted later -- this additional retransmission will need to wait. In this case, the timer was reset to TCP_RTO_MAX / 8, which is ~15 seconds. This delay is unnecessarily long: it should just be rescheduled at the next opportunity, e.g. after the retransmission timeout. Without this modification, some issues can be seen from time to time in the selftests when multiple ADD_ADDRs are sent, and the host takes time to process them, e.g. the "signal addresses, ADD_ADDR timeout" MPTCP Join selftest, especially with a debug kernel config. Note that on older kernels, 'timeout' is not available. It should be enough to replace it by one second (HZ). Fixes: `00cfd77b90` ("mptcp: retransmit ADD_ADDR when timeout") Cc: stable@vger.kernel.org Reviewed-by: Mat Martineau <martineau@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://patch.msgid.link/20260505-net-mptcp-pm-fixes-7-1-rc3-v1-6-fca8091060a4@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-05-06 18:16:45 -07:00
Matthieu Baerts (NGI0)	b7b9a46156	mptcp: pm: ADD_ADDR rtx: free sk if last When an ADD_ADDR is retransmitted, the sk is held in sk_reset_timer(), and released at the end. If at that moment, it was the last reference being held, the sk would not be freed. sock_put() should then be called instead of __sock_put(). But that's not enough: if it is the last reference, sock_put() will call sk_free(), which will end up calling sk_stop_timer_sync() on the same timer, and waiting indefinitely to finish. So it is needed to mark that the timer is done at the end of the timer handler when it has not been rescheduled, not to call sk_stop_timer_sync() on "itself". Fixes: `00cfd77b90` ("mptcp: retransmit ADD_ADDR when timeout") Cc: stable@vger.kernel.org Reviewed-by: Mat Martineau <martineau@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://patch.msgid.link/20260505-net-mptcp-pm-fixes-7-1-rc3-v1-5-fca8091060a4@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-05-06 18:16:44 -07:00
Matthieu Baerts (NGI0)	9634cb35af	mptcp: pm: ADD_ADDR rtx: always decrease sk refcount When an ADD_ADDR is retransmitted, the sk is held in sk_reset_timer(). It should then be released in all cases at the end. Some (unlikely) checks were returning directly instead of calling sock_put() to decrease the refcount. Jump to a new 'exit' label to call __sock_put() (which will become sock_put() in the next commit) to fix this potential leak. While at it, drop the '!msk' check which cannot happen because it is never reset, and explicitly mark the remaining one as "unlikely". Fixes: `00cfd77b90` ("mptcp: retransmit ADD_ADDR when timeout") Cc: stable@vger.kernel.org Reviewed-by: Mat Martineau <martineau@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://patch.msgid.link/20260505-net-mptcp-pm-fixes-7-1-rc3-v1-4-fca8091060a4@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-05-06 18:16:44 -07:00
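The single-exit pattern described in the commit above can be sketched in user space. This is an illustrative sketch only, not the MPTCP code: `struct sock_sketch`, `sk_hold()`, `sk_put()` and `timer_handler()` are hypothetical names, and the plain integer counter stands in for the real socket refcount.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a socket with a reference count. */
struct sock_sketch {
	int refcnt;
	int closed;
};

/* Taken when the timer is armed (the role sk_reset_timer() plays in the commit). */
static void sk_hold(struct sock_sketch *sk)
{
	sk->refcnt++;
}

static void sk_put(struct sock_sketch *sk)
{
	sk->refcnt--;
}

/*
 * Timer handler shape: every path, including the unlikely early
 * checks, leaves through 'exit' so the reference is always dropped.
 * Returns 1 if the retransmission step ran, 0 otherwise.
 */
static int timer_handler(struct sock_sketch *sk)
{
	int sent = 0;

	if (sk->closed)
		goto exit;	/* a bare 'return' here would leak the reference */

	sent = 1;		/* placeholder for the retransmission work */
exit:
	sk_put(sk);
	return sent;
}
```

With this shape, arming the timer and running the handler leaves the count unchanged whether or not the early check fires.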
Matthieu Baerts (NGI0)	5cd6e0ad79	mptcp: pm: ADD_ADDR rtx: fix potential data-race This mptcp_pm_add_timer() helper is executed as a timer callback in softirq context. To avoid any data races, the socket lock needs to be held with bh_lock_sock(). If the socket is in use, retry again soon after, similar to what is done with the keepalive timer. Fixes: `00cfd77b90` ("mptcp: retransmit ADD_ADDR when timeout") Cc: stable@vger.kernel.org Reviewed-by: Mat Martineau <martineau@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://patch.msgid.link/20260505-net-mptcp-pm-fixes-7-1-rc3-v1-3-fca8091060a4@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-05-06 18:16:44 -07:00
Matthieu Baerts (NGI0)	03f324f3f1	mptcp: pm: ADD_ADDR rtx: allow ID 0 ADD_ADDR can be sent for the ID 0, which corresponds to the local address and port linked to the initial subflow. Indeed, this address could be removed, and re-added later on, e.g. what is done in the "delete re-add signal" MPTCP Join selftests. So no reason to ignore it. Fixes: `00cfd77b90` ("mptcp: retransmit ADD_ADDR when timeout") Cc: stable@vger.kernel.org Reviewed-by: Mat Martineau <martineau@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://patch.msgid.link/20260505-net-mptcp-pm-fixes-7-1-rc3-v1-2-fca8091060a4@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-05-06 18:16:44 -07:00
Matthieu Baerts (NGI0)	b12014d2d3	mptcp: pm: kernel: correctly retransmit ADD_ADDR ID 0 When adding the ADD_ADDR to the list, the address including the IP, port and ID are copied. On the other hand, when the endpoint corresponds to the one from the initial subflow, the ID is set to 0, as specified by the MPTCP protocol. The issue is that the ID was reset after having copied the ID in the ADD_ADDR entry. So the retransmission was done, but using a different ID than the initial one. Fixes: `8b8ed1b429` ("mptcp: pm: reuse ID 0 after delete and re-add") Cc: stable@vger.kernel.org Reviewed-by: Mat Martineau <martineau@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://patch.msgid.link/20260505-net-mptcp-pm-fixes-7-1-rc3-v1-1-fca8091060a4@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-05-06 18:16:44 -07:00
Eric Dumazet	c8f7244c8c	tcp: tcp_child_process() related UAF tcp_child_process( .. child ...) currently calls sock_put(child). Unfortunately @child (named @nsk in callers) can be used after this point to send a RST packet. To fix this UAF, I remove the sock_put() from tcp_child_process() and let the callers handle this after it is safe. Remove @rsk variable in tcp_v4_do_rcv() and change tcp_v6_do_rcv() so that both functions look the same. Fixes: `cfb6eeb4c8` ("[TCP]: MD5 Signature Option (RFC2385) support.") Reported-by: Damiano Melotti <melotti@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com> Link: https://patch.msgid.link/20260505153927.3435532-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-05-06 18:11:33 -07:00
Eric Dumazet	770b136ff9	net/sched: sch_sfq: annotate data-races from sfq_dump_class_stats() sfq_dump_class_stats() runs locklessly, add needed READ_ONCE() and WRITE_ONCE() annotations. Fixes: `edb09eb17e` ("net: sched: do not acquire qdisc spinlock in qdisc/class stats dump") Signed-off-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20260505091133.2452510-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-05-06 17:46:05 -07:00
Eric Dumazet	67ef49047d	inetpeer: add a missing read_seqretry() in inet_getpeer() When performing a lockless lookup over the inet_peer rbtree, if a matching node is found, inet_getpeer() returns it immediately without validating the seqlock sequence. This missing check introduces a race condition: Trigger Path: When a host receives an incoming fragmented IPv4 packet, ip4_frag_init() (in net/ipv4/ip_fragment.c) calls inet_getpeer_v4() to track the peer. The Race: If the packet is from a new source IP, CPU A acquires the write_seqlock, allocates a new inet_peer node (p), sets its IP address (daddr), and links it to the rbtree (rb_link_node). Uninitialized Access: Due to the lack of memory barriers between rb_link_node and the initialization of the rest of the struct (like refcount_set(&p->refcnt, 1)), CPU A can make the node visible to readers before its refcnt is initialized. This is especially true on weakly-ordered architectures like ARM64 where the CPU can reorder the memory stores. Lockless Reader: Concurrently, CPU B processes a second fragmented packet from the same source IP. CPU B does a lockless lookup, finds the newly inserted node, and returns it immediately. Use-After-Free (UAF): CPU B reads p->refcnt as uninitialized garbage (left over from previous kmalloc-128/192 allocations). If the garbage is > 0, refcount_inc_not_zero(&p->refcnt) succeeds. CPU A then executes refcount_set(&p->refcnt, 1), overwriting CPU B's increment. When CPU B finishes with the fragment queue, it calls inet_putpeer(), which drops the refcount to 0 and frees the node via RCU. The node is now freed but remains linked in the rbtree, resulting in a Use-After-Free in the rbtree. 
Fixes: `b145425f26` ("inetpeer: remove AVL implementation in favor of RB tree") Reported-by: Damiano Melotti <melotti@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20260505133233.3039575-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-05-06 17:44:13 -07:00
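The validation step that the inetpeer commit adds can be illustrated with a user-space sequence counter. This is a minimal sketch under stated assumptions, not the kernel implementation: `seq_read_begin()`, `seq_read_retry()` and `lookup_validated()` are hypothetical helpers built on C11 atomics that only mimic the shape of a seqlock read section, and the one-entry `slot` stands in for the rbtree. The sketch is exercised single-threaded.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a peer node. */
struct peer {
	unsigned int daddr;
	unsigned int refcnt;
};

struct peer_base {
	atomic_uint seq;		/* odd while a writer is mid-update */
	_Atomic(struct peer *) slot;	/* one-entry "tree" for illustration */
};

static unsigned int seq_read_begin(struct peer_base *b)
{
	return atomic_load_explicit(&b->seq, memory_order_acquire);
}

/* True when the read section overlapped a write and must not be trusted. */
static bool seq_read_retry(struct peer_base *b, unsigned int start)
{
	atomic_thread_fence(memory_order_acquire);
	return (start & 1) ||
	       atomic_load_explicit(&b->seq, memory_order_relaxed) != start;
}

/*
 * Lockless lookup that validates the sequence even on a hit: a node
 * found while the writer is still initializing it is rejected, and
 * the caller is expected to retry or fall back to a locked lookup.
 */
static struct peer *lookup_validated(struct peer_base *b, unsigned int daddr)
{
	unsigned int start = seq_read_begin(b);
	struct peer *p = atomic_load_explicit(&b->slot, memory_order_acquire);

	if (p && p->daddr != daddr)
		p = NULL;
	if (seq_read_retry(b, start))
		return NULL;
	return p;
}
```

The point of the sketch is the ordering: the match is only returned after the sequence check confirms that no write overlapped the lookup.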
Shitalkumar Gandhi	701ea57fea	net: rtsn: fix mdio_node leak in rtsn_mdio_alloc() of_get_child_by_name() takes a reference. The rtsn_reset() and rtsn_change_mode() failure paths jump to out_free_bus and leak mdio_node. Add out_put_node to drop it before falling through. Fixes: `b0d3969d2b` ("net: ethernet: rtsn: Add support for Renesas Ethernet-TSN") Signed-off-by: Shitalkumar Gandhi <shitalkumar.gandhi@cambiumnetworks.com> Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se> Link: https://patch.msgid.link/20260505123236.406000-1-shitalkumar.gandhi@cambiumnetworks.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-05-06 17:42:50 -07:00
Jakub Kicinski	e418273936	Merge branch 'netdevsim-psp-fix-init-and-uninit-bugs' Daniel Zahka says: ==================== netdevsim: psp: fix init and uninit bugs This series has three fixes. The first is a straightforward NULL pointer dereference that is reachable by creating and destroying some vfs on a kernel with INET_PSP enabled. The last two patches deal with nsim_psp_rereg_write(), which is a debugfs handler that reregisters netdevsim's psp_dev without quiescing and disabling tx/rx processing. This was added to enable some tests in psp.py where a psp device is unregistered while it is still referenced by tcp socket state. There are two issues with this code: 1. Calls to nsim_psp_uninit() are not properly serialized 2. netdevsim's psp_dev refcount can be released while nsim_do_psp() is reading from it. ==================== Link: https://patch.msgid.link/20260505-psd-rcu-v1-0-a8f69ec1ab96@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-05-06 17:39:22 -07:00
Daniel Zahka	07bdec3fc7	netdevsim: psp: rcu protect psp_dev reference There are two issues with the way psp_dev is used in nsim_do_psp(): 1. There is no check for IS_ERR() on the peer's psp_dev, before dereferencing. 2. The refcount on this psp_dev can be dropped by nsim_psp_rereg_write() To fix this, we can make netdevsim's reference to its psp_dev an rcu reference, and then nsim_do_psp() can read the fields it needs from an rcu critical section. Fixes: `f857478d62` ("netdevsim: a basic test PSP implementation") Signed-off-by: Daniel Zahka <daniel.zahka@gmail.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Link: https://patch.msgid.link/20260505-psd-rcu-v1-3-a8f69ec1ab96@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-05-06 17:39:20 -07:00
Daniel Zahka	24c96a4200	netdevsim: psp: serialize calls to nsim_psp_uninit() The debugfs write handler, nsim_psp_rereg_write(), can race against nsim_destroy() and against itself, causing nsim_psp_uninit() to run more than once concurrently. Two complementary changes serialize all callers: 1. Delete the psp_rereg debugfs file from nsim_psp_uninit() before doing the actual teardown. debugfs_remove() drains any in-flight writers and prevents new ones from starting. 2. Add a mutex around the body of nsim_psp_rereg_write() so that two concurrent userspace writers cannot both enter the teardown path at once. The teardown work itself is moved into a new __nsim_psp_uninit() that the rereg handler calls under the mutex, while the public nsim_psp_uninit() wraps it with the debugfs_remove()/mutex_destroy() pair so nsim_destroy() doesn't have to know about the psp internals. Fixes: `f857478d62` ("netdevsim: a basic test PSP implementation") Signed-off-by: Daniel Zahka <daniel.zahka@gmail.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Link: https://patch.msgid.link/20260505-psd-rcu-v1-2-a8f69ec1ab96@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-05-06 17:39:20 -07:00
Daniel Zahka	7ce3f1beda	netdevsim: psp: only call nsim_psp_uninit() on PFs VFs go through nsim_init_netdevsim_vf() which never calls nsim_psp_init(), so ns->psp.dev stays NULL. nsim_psp_uninit() guards with !IS_ERR(ns->psp.dev), so destroying a VF reaches psp_dev_unregister(NULL) and dereferences NULL on the first mutex_lock(&psd->lock): BUG: kernel NULL pointer dereference, address: 0000000000000020 RIP: 0010:mutex_lock+0x1c/0x30 Call Trace: psp_dev_unregister+0x2a/0x1a0 nsim_psp_uninit+0x1f/0x40 [netdevsim] nsim_destroy+0x61/0x1e0 [netdevsim] __nsim_dev_port_del+0x47/0x90 [netdevsim] nsim_drv_configure_vfs+0xc9/0x130 [netdevsim] nsim_bus_dev_numvfs_store+0x79/0xb0 [netdevsim] Gate nsim_psp_uninit() on nsim_dev_port_is_pf(), matching the pattern already used for nsim_exit_netdevsim() and the bpf/ipsec/macsec/queue teardowns. Reproducer: modprobe netdevsim echo "10 1" > /sys/bus/netdevsim/new_device echo 1 > /sys/bus/netdevsim/devices/netdevsim10/sriov_numvfs devlink dev eswitch set netdevsim/netdevsim10 mode switchdev echo 0 > /sys/bus/netdevsim/devices/netdevsim10/sriov_numvfs Fixes: `f857478d62` ("netdevsim: a basic test PSP implementation") Signed-off-by: Daniel Zahka <daniel.zahka@gmail.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Link: https://patch.msgid.link/20260505-psd-rcu-v1-1-a8f69ec1ab96@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-05-06 17:39:20 -07:00
Eric Dumazet	7aaa8f5e45	ipv6: fix potential UAF caused by ip6_forward_proxy_check() ip6_forward_proxy_check() calls pskb_may_pull() which might re-allocate skb->head. Reload ipv6_hdr() after the pskb_may_pull() call to avoid using the freed memory. Fixes: `e21e0b5f19` ("[IPV6] NDISC: Handle NDP messages to proxied addresses.") Reported-by: Damiano Melotti <melotti@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: David Ahern <dsahern@kernel.org> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Link: https://patch.msgid.link/20260505130056.2927197-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-05-06 17:29:23 -07:00
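The "reload the header pointer after a call that may reallocate" rule from the commit above can be shown with a user-space analogy. This is a sketch only: `struct pkt`, `pkt_hdr()`, `pkt_may_pull()` and `pkt_check()` are hypothetical, and `realloc()` stands in for the way pskb_may_pull() might re-allocate skb->head.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical packet buffer; 'head' may move whenever it grows. */
struct pkt {
	unsigned char *head;
	size_t cap;
};

/* Header accessor: always derived from the current head. */
static unsigned char *pkt_hdr(struct pkt *p)
{
	return p->head;
}

/*
 * Make at least 'need' bytes addressable. May reallocate p->head,
 * which invalidates any pointer previously obtained from pkt_hdr().
 * Returns 1 on success, 0 on allocation failure.
 */
static int pkt_may_pull(struct pkt *p, size_t need)
{
	unsigned char *n;

	if (need <= p->cap)
		return 1;
	n = realloc(p->head, need);
	if (!n)
		return 0;
	memset(n + p->cap, 0, need - p->cap);
	p->head = n;
	p->cap = need;
	return 1;
}

/* Returns 1 if the first header byte survived the pull, -1 on failure. */
static int pkt_check(struct pkt *p, size_t need)
{
	unsigned char *hdr = pkt_hdr(p);
	unsigned char before = hdr[0];

	if (!pkt_may_pull(p, need))
		return -1;

	hdr = pkt_hdr(p);	/* reload: the old pointer may reference freed memory */
	return hdr[0] == before;
}
```

Dropping the reload line would leave `hdr` pointing at the old allocation whenever `realloc()` moved the buffer.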
Jakub Kicinski	0e1368a28d	selftests: drv-net: fix sort order of makefile and config Recent changes added configs and tests in the wrong spot. Link: https://lore.kernel.org/20260506170435.34984dfc@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-05-06 17:22:24 -07:00
Jakub Kicinski	dc61989e37	ipsec-2026-05-05 Merge tag 'ipsec-2026-05-05' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec Steffen Klassert says: ==================== pull request (net): ipsec 2026-05-05 1. Fix an IPv6 encapsulation error path that leaked route references when UDPv6 ESP decapsulation resolved to an error route. From Yilin Zhu. 2. Fix AH with ESN on async crypto paths by accounting for the extra high-order sequence number when reconstructing the temporary authentication layout in the completion callbacks. From Michael Bommarito. 3. Fix XFRM output so it does not overwrite already-correct inner header pointers when a tunnel layer such as VXLAN has already saved them. The fix comes with new selftests. From Cosmin Ratiu. 4. Add the missing native payload size entry for XFRM_MSG_MAPPING in the compat translation path. From Ruijie Li. 5. Harden __xfrm_state_delete() against repeated or inconsistent unhashing of state list nodes by keying the removal on actual list membership and using delete-and-init helpers. From Michal Kosiorek. 6. Prevent ESP from decrypting shared splice-backed skb fragments in place by marking UDP splice frags as shared and forcing copy-on-write in ESP input when needed. From Kuan-Ting Chen. * tag 'ipsec-2026-05-05' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec: xfrm: esp: avoid in-place decrypt on shared skb frags xfrm: defensively unhash xfrm_state lists in __xfrm_state_delete xfrm: provide message size for XFRM_MSG_MAPPING xfrm: Don't clobber inner headers when already set tools/selftests: Add a VXLAN+IPsec traffic test tools/selftests: Use a sensible timeout value for iperf3 client xfrm: ah: account for ESN high bits in async callbacks ipv6: xfrm6: release dst on error in xfrm6_rcv_encap() ==================== Link: https://patch.msgid.link/20260505132326.1362733-1-steffen.klassert@secunet.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-05-06 16:49:42 -07:00
Jakub Kicinski	f4eac70d1e	Includes changes: * ensure MAC header offset is reset before delivering packet * ensure gro_cells_receive() and dstats_dev_add() are called with BH disabled * reduce ping count in selftest to ensure it completes within timeout Merge tag 'ovpn-net-20260504' of https://github.com/OpenVPN/ovpn-net-next Antonio Quartulli says: ==================== Includes changes: * ensure MAC header offset is reset before delivering packet * ensure gro_cells_receive() and dstats_dev_add() are called with BH disabled * reduce ping count in selftest to ensure it completes within timeout * tag 'ovpn-net-20260504' of https://github.com/OpenVPN/ovpn-net-next: selftests: ovpn: reduce ping count in test.sh ovpn: ensure packet delivery happens with BH disabled ovpn: reset MAC header before passing skb up ==================== Link: https://patch.msgid.link/20260504230305.2681646-1-antonio@openvpn.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-05-06 16:10:03 -07:00
Jakub Kicinski	bd75e1003d	bluetooth pull request for net: - hci_conn: fix potential UAF in create_big_sync - hci_event: fix memset typo - hci_event: Fix OOB read and infinite loop in hci_le_create_big_complete_evt - L2CAP: fix MPS check in l2cap_ecred_reconf_req - L2CAP: defer conn param update to avoid conn->lock/hdev->lock inversion - L2CAP: Fix null-ptr-deref in l2cap_sock_state_change_cb() - L2CAP: Fix null-ptr-deref in l2cap_sock_get_sndtimeo_cb() - L2CAP: Fix null-ptr-deref in l2cap_sock_new_connection_cb() - RFCOMM: pull credit byte with skb_pull_data() - SCO: fix sleeping under spinlock in sco_conn_ready - SCO: hold sk properly in sco_conn_ready - ISO: Fix data-race on dst in iso_sock_connect() - ISO: Fix data-race on iso_pi(sk) in socket and HCI event paths - bnep: fix incorrect length parsing in bnep_rx_frame() extension handling - hci_uart: Fix NULL deref in recv callbacks when priv is uninitialized - virtio_bt: clamp rx length before skb_put - virtio_bt: validate rx pkt_type header length - HIDP: serialise l2cap_unregister_user via hidp_session_sem - btintel_pcie: treat boot stage bit 12 as warning - btmtk: validate WMT event SKB length before struct access Merge tag 'for-net-2026-05-06' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth Luiz Augusto von Dentz says: ==================== bluetooth pull request for net: - hci_conn: fix potential UAF in create_big_sync - hci_event: fix memset typo - hci_event: Fix OOB read and infinite loop in hci_le_create_big_complete_evt - L2CAP: fix MPS check in l2cap_ecred_reconf_req - L2CAP: defer conn param update to avoid conn->lock/hdev->lock inversion - L2CAP: Fix null-ptr-deref in l2cap_sock_state_change_cb() - L2CAP: Fix null-ptr-deref in l2cap_sock_get_sndtimeo_cb() - L2CAP: Fix null-ptr-deref in l2cap_sock_new_connection_cb() - RFCOMM: pull credit byte with skb_pull_data() - SCO: fix sleeping under spinlock in sco_conn_ready - SCO: hold sk properly in sco_conn_ready - ISO: Fix data-race on dst in iso_sock_connect() - ISO: Fix data-race on iso_pi(sk) in socket and HCI event paths - bnep: fix incorrect length parsing in bnep_rx_frame() extension handling - hci_uart: Fix NULL deref in recv callbacks when priv is uninitialized - virtio_bt: clamp rx length before skb_put - virtio_bt: validate rx pkt_type header length - HIDP: serialise l2cap_unregister_user via hidp_session_sem - btintel_pcie: treat boot stage bit 12 as warning - btmtk: validate WMT event SKB length before struct access * tag 'for-net-2026-05-06' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth: Bluetooth: HIDP: serialise l2cap_unregister_user via hidp_session_sem Bluetooth: hci_event: fix memset typo Bluetooth: RFCOMM: pull credit byte with skb_pull_data() Bluetooth: virtio_bt: validate rx pkt_type header length Bluetooth: virtio_bt: clamp rx length before skb_put Bluetooth: btmtk: validate WMT event SKB length before struct access Bluetooth: ISO: Fix data-race on iso_pi(sk) in socket and HCI event paths Bluetooth: ISO: Fix data-race on dst in iso_sock_connect() Bluetooth: hci_uart: Fix NULL deref in recv callbacks when priv is uninitialized Bluetooth: btintel_pcie: treat boot stage bit 12 as warning Bluetooth: SCO: hold sk properly in sco_conn_ready Bluetooth: L2CAP: Fix null-ptr-deref in l2cap_sock_new_connection_cb() Bluetooth: L2CAP: Fix null-ptr-deref in l2cap_sock_get_sndtimeo_cb() Bluetooth: L2CAP: Fix null-ptr-deref in l2cap_sock_state_change_cb() Bluetooth: l2cap: defer conn param update to avoid conn->lock/hdev->lock inversion Bluetooth: l2cap: fix MPS check in l2cap_ecred_reconf_req Bluetooth: bnep: fix incorrect length parsing in bnep_rx_frame() extension handling Bluetooth: hci_event: Fix OOB read and infinite loop in hci_le_create_big_complete_evt Bluetooth: hci_conn: fix potential UAF in create_big_sync Bluetooth: SCO: fix sleeping under spinlock in sco_conn_ready ==================== Link: https://patch.msgid.link/20260506204553.58686-1-luiz.dentz@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-05-06 15:43:34 -07:00
Michael Bommarito	c5d415596c	Bluetooth: HIDP: serialise l2cap_unregister_user via hidp_session_sem Commit `dbf666e4fc` ("Bluetooth: HIDP: Fix possible UAF") made hidp_session_remove() drop the L2CAP reference and set session->conn = NULL once the session is considered removed, and added a bare if (session->conn) guard around the kthread-exit l2cap_unregister_user() call in hidp_session_thread(). The sibling ioctl site in hidp_connection_del() still reads session->conn unlocked and unguarded, and the kthread-exit guard itself is a lockless double-read. hidp_session_find() drops hidp_session_sem before returning, so hidp_session_remove() can null session->conn between the lookup and the call in hidp_connection_del(). Worse, since commit `752a6c9596` ("Bluetooth: L2CAP: Fix use-after-free in l2cap_unregister_user") takes mutex_lock(&conn->lock) inside l2cap_unregister_user(), a stale non-NULL snapshot also UAFs on conn->lock. v1 only added an if (session->conn) guard at the ioctl site, which doesn't address either race; Luiz suggested snapshotting session->conn under the sem and clearing it before the call. Taking hidp_session_sem across l2cap_unregister_user() would be wrong: l2cap_conn_del() already establishes the lock order conn->lock -> hidp_session_sem via l2cap_unregister_all_users() -> user->remove == hidp_session_remove(), so taking hidp_session_sem before conn->lock would AB/BA deadlock. Factor a helper hidp_session_unregister_conn() that under down_write(&hidp_session_sem) snapshots session->conn and clears the member, then outside the sem calls l2cap_unregister_user() and l2cap_conn_put() on the snapshot. Call it from both hidp_connection_del() and hidp_session_thread()'s exit path. At most one consumer wins the write-sem; later callers observe session->conn == NULL and skip the unregister and put, so the reference hidp_session_new() took via l2cap_conn_get() is consumed exactly once. session_free() already tolerates a NULL session->conn. 
Fixes: `dbf666e4fc` ("Bluetooth: HIDP: Fix possible UAF") Suggested-by: Luiz Augusto von Dentz <luiz.dentz@gmail.com> Link: https://lore.kernel.org/all/20260422011437.176643-1-michael.bommarito@gmail.com/ Signed-off-by: Michael Bommarito <michael.bommarito@gmail.com> Assisted-by: Claude:claude-opus-4-7 Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>	2026-05-06 16:27:53 -04:00
Jann Horn	72d97cae2a	Bluetooth: hci_event: fix memset typo hci_le_big_sync_established_evt() currently does: conn->num_bis = 0; memset(conn->bis, 0, sizeof(conn->num_bis)); sizeof(conn->num_bis) is wrong - it would make sense to either use conn->num_bis (before setting that to 0) or sizeof(conn->bis). Fix it by using sizeof(conn->bis), the least intrusive change. Luckily, nothing actually depends on this memset() working properly: Nothing seems to ever read from conn->bis beyond conn->num_bis, and when conn->num_bis is increased, the corresponding elements of conn->bis are initialized. So I think this line could also just be removed. This is a purely theoretical fix and should have no impact on actual behavior. Fixes: `42ecf19471` ("Bluetooth: ISO: Do not emit LE BIG Create Sync if previous is pending") Signed-off-by: Jann Horn <jannh@google.com> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>	2026-05-06 16:27:29 -04:00
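The difference between the two sizeof expressions in the memset commit above can be reproduced in a few lines. This is a sketch with an assumed layout: `struct conn_sketch` and its 31-entry array are hypothetical and only mirror the two field names quoted in the commit message.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical layout; the real connection struct may differ. */
struct conn_sketch {
	unsigned char num_bis;
	unsigned char bis[31];
};

/* Original form: the size argument refers to the counter, not the array. */
static void clear_bis_buggy(struct conn_sketch *c)
{
	c->num_bis = 0;
	memset(c->bis, 0, sizeof(c->num_bis));	/* clears sizeof(num_bis) bytes only */
}

/* Fixed form: the size argument refers to the array being cleared. */
static void clear_bis_fixed(struct conn_sketch *c)
{
	c->num_bis = 0;
	memset(c->bis, 0, sizeof(c->bis));	/* clears the whole array */
}
```

In this layout the buggy variant zeroes a single byte of `bis`, while the fixed variant zeroes all of it.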
Pengpeng Hou	8f59d17b18	Bluetooth: RFCOMM: pull credit byte with skb_pull_data() rfcomm_recv_data() treats the first payload byte as a credit field when the UIH frame carries PF and credit-based flow control is enabled. After the header has been stripped, the PF/CFC path consumes that byte with a direct skb->data dereference followed by skb_pull(). A malformed short frame can reach this path without a byte available. Use skb_pull_data() so the length check and pull happen together before the returned credit byte is consumed. Fixes: `1da177e4c3` ("Linux-2.6.12-rc2") Signed-off-by: Pengpeng Hou <pengpeng@iscas.ac.cn> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>	2026-05-06 16:23:20 -04:00
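The "check and pull together" idea behind skb_pull_data() can be sketched on a plain byte buffer. This is not the kernel helper: `struct buf`, `buf_pull_data()` and `read_credits()` are hypothetical user-space names that only imitate its contract of returning NULL when too few bytes remain.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical cursor over received bytes. */
struct buf {
	const unsigned char *data;
	size_t len;
};

/*
 * Return a pointer to the next n bytes and advance past them, or
 * return NULL and leave the cursor untouched if fewer than n remain.
 */
static const void *buf_pull_data(struct buf *b, size_t n)
{
	const unsigned char *p = b->data;

	if (b->len < n)
		return NULL;
	b->data += n;
	b->len -= n;
	return p;
}

/* Returns the credit byte (0..255), or -1 if the frame is too short. */
static int read_credits(struct buf *b)
{
	const unsigned char *credits = buf_pull_data(b, 1);

	return credits ? *credits : -1;
}
```

Because the length check and the advance happen in one call, a short frame can never reach the dereference.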
Michael Bommarito	daf23014e5	Bluetooth: virtio_bt: validate rx pkt_type header length virtbt_rx_handle() reads the leading pkt_type byte from the RX skb and forwards the remainder to hci_recv_frame() for every event/ACL/SCO/ISO type, without checking that the remaining payload is at least the fixed HCI header for that type. After the preceding patch bounds the backend-supplied used.len to [1, VIRTBT_RX_BUF_SIZE], a one-byte completion still reaches hci_recv_frame() with skb->len already pulled to 0. If the byte happened to be HCI_ACLDATA_PKT, the ACL-vs-ISO classification fast-path in hci_dev_classify_pkt_type() dereferences hci_acl_hdr(skb)->handle whenever the HCI device has an active CIS_LINK, BIS_LINK, or PA_LINK connection, reading two bytes of uninitialized RX-buffer data. The same hazard exists for every packet type the driver accepts because none of the switch cases in virtbt_rx_handle() check skb->len against the per-type minimum HCI header size before handing the frame to the core. After stripping pkt_type, require skb->len to cover the fixed header size for the selected type (event 2, ACL 4, SCO 3, ISO 4) before calling hci_recv_frame(); drop ratelimited otherwise. Unknown pkt_type values still take the original kfree_skb() default path. Use bt_dev_err_ratelimited() because both the length and pkt_type values come from an untrusted backend that can otherwise flood the kernel log. Fixes: `160fbcf3bf` ("Bluetooth: virtio_bt: Use skb_put to set length") Cc: stable@vger.kernel.org Cc: Soenke Huster <soenke.huster@eknoes.de> Signed-off-by: Michael Bommarito <michael.bommarito@gmail.com> Assisted-by: Claude:claude-opus-4-7 Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>	2026-05-06 16:22:33 -04:00
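The per-type minimum lengths listed in the commit above (event 2, ACL 4, SCO 3, ISO 4) can be expressed as a small lookup. This is a sketch, not the driver code: `min_hdr_len()` and `rx_payload_ok()` are hypothetical helpers, and the enum uses the standard HCI packet indicator values under local names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* HCI packet indicators, named locally for this sketch. */
enum {
	PKT_ACL   = 0x02,
	PKT_SCO   = 0x03,
	PKT_EVENT = 0x04,
	PKT_ISO   = 0x05,
};

/* Fixed header size per type as given in the commit message; 0 if unknown. */
static size_t min_hdr_len(unsigned char pkt_type)
{
	switch (pkt_type) {
	case PKT_EVENT:
		return 2;
	case PKT_ACL:
		return 4;
	case PKT_SCO:
		return 3;
	case PKT_ISO:
		return 4;
	default:
		return 0;
	}
}

/*
 * 'len_after_type' is the payload length once the leading pkt_type
 * byte has been stripped. Unknown types are rejected here; in the
 * commit they take a separate default path that frees the skb.
 */
static bool rx_payload_ok(unsigned char pkt_type, size_t len_after_type)
{
	size_t need = min_hdr_len(pkt_type);

	return need != 0 && len_after_type >= need;
}
```

A one-byte completion carrying only an ACL indicator therefore fails the check, since zero payload bytes remain against a four-byte header.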
Michael Bommarito	21bd244b6d	Bluetooth: virtio_bt: clamp rx length before skb_put virtbt_rx_work() calls skb_put(skb, len) where len comes directly from virtqueue_get_buf() with no validation against the buffer we posted to the device. The RX skb is allocated in virtbt_add_inbuf() and exposed to virtio as exactly 1000 bytes via sg_init_one(). Checking len against skb_tailroom(skb) is not sufficient because alloc_skb() can leave more tailroom than the 1000 bytes actually handed to the device. A malicious or buggy backend can therefore report used.len between 1001 and skb_tailroom(skb), causing skb_put() to include uninitialized kernel heap bytes that were never written by the device. The same path also accepts len == 0, in which case skb_put(skb, 0) leaves the skb empty but virtbt_rx_handle() still reads the pkt_type byte from skb->data, consuming uninitialized memory. Define VIRTBT_RX_BUF_SIZE once and reuse it in alloc_skb() and sg_init_one(), and gate virtbt_rx_work() on that same constant so the bound checked matches the buffer actually exposed to the device. Reject used.len == 0 in the same gate so an empty completion can no longer reach virtbt_rx_handle(). Use bt_dev_err_ratelimited() because the length value comes from an untrusted backend that can otherwise flood the kernel log. Same class of bug as commit `c04db81cd0` ("net/9p: Fix buffer overflow in USB transport layer"), which hardened the USB 9p transport against unchecked device-reported length. Fixes: `160fbcf3bf` ("Bluetooth: virtio_bt: Use skb_put to set length") Cc: stable@vger.kernel.org Cc: Soenke Huster <soenke.huster@eknoes.de> Signed-off-by: Michael Bommarito <michael.bommarito@gmail.com> Assisted-by: Claude:claude-opus-4-7 Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>	2026-05-06 16:22:25 -04:00
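The bound described in the commit above amounts to a single range check against the size of the buffer that was posted to the device. A minimal sketch, assuming the 1000-byte value and the VIRTBT_RX_BUF_SIZE name from the commit message; `rx_used_len_ok()` is a hypothetical helper, not a function from the driver.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Name and value as stated in the commit message. */
#define VIRTBT_RX_BUF_SIZE 1000

/*
 * Accept a device-reported length only if it is non-empty and fits
 * the buffer that was actually exposed to the device. Tailroom of
 * the allocation is deliberately not consulted, since it can exceed
 * the posted size.
 */
static bool rx_used_len_ok(size_t used_len)
{
	return used_len >= 1 && used_len <= VIRTBT_RX_BUF_SIZE;
}
```

Lengths of 0 and 1001 are both rejected, which covers the empty-completion case and the over-report case named in the commit.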
Tristan Madani	634a4408c0	Bluetooth: btmtk: validate WMT event SKB length before struct access btmtk_usb_hci_wmt_sync() casts the WMT event response SKB data to struct btmtk_hci_wmt_evt (7 bytes) and struct btmtk_hci_wmt_evt_funcc (9 bytes) without first checking that the SKB contains enough data. A short firmware response causes out-of-bounds reads from SKB tailroom. Use skb_pull_data() to validate and advance past the base WMT event header. For the FUNC_CTRL case, pull the additional status field bytes before accessing them. Fixes: `d019930b00` ("Bluetooth: btmtk: move btusb_mtk_hci_wmt_sync to btmtk.c") Cc: stable@vger.kernel.org Signed-off-by: Tristan Madani <tristan@talencesecurity.com> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>	2026-05-06 16:22:19 -04:00
SeungJu Cheon	f958c7805b	Bluetooth: ISO: Fix data-race on iso_pi(sk) in socket and HCI event paths Several iso_pi(sk) fields (qos, qos_user_set, bc_sid, base, base_len, sync_handle, bc_num_bis) are written under lock_sock in iso_sock_setsockopt() and iso_sock_bind(), but read and written under hci_dev_lock only in two other paths: - iso_connect_bis() / iso_connect_cis(), invoked from connect(2), read qos/base/bc_sid and reset qos to default_qos on the qos_user_set validation failure -- all without lock_sock. - iso_connect_ind(), invoked from hci_rx_work, writes sync_handle, bc_sid, qos.bcast.encryption, bc_num_bis, base and base_len on PA_SYNC_ESTABLISHED / PAST_RECEIVED / BIG_INFO_ADV_REPORT / PER_ADV_REPORT events. The BIG_INFO handler additionally passes &iso_pi(sk)->qos together with sync_handle / bc_num_bis / bc_bis to hci_conn_big_create_sync() while setsockopt may be mutating them. Acquire lock_sock around the affected accesses in both paths. The locking order hci_dev_lock -> lock_sock matches the existing iso_conn_big_sync() precedent, whose comment documents the same requirement for hci_conn_big_create_sync(). The HCI connect/bind helpers do not wait for command completion -- they enqueue work via hci_cmd_sync_queue{,_once}() / hci_le_create_cis_pending() and return -- so the added hold time is comparable to iso_conn_big_sync(). 
KCSAN report: BUG: KCSAN: data-race in iso_connect_cis / iso_sock_setsockopt read to 0xffffa3ae8ce3cdc8 of 1 bytes by task 335 on cpu 0: iso_connect_cis+0x49f/0xa20 iso_sock_connect+0x60e/0xb40 __sys_connect_file+0xbd/0xe0 __sys_connect+0xe0/0x110 __x64_sys_connect+0x40/0x50 x64_sys_call+0xcad/0x1c60 do_syscall_64+0x133/0x590 entry_SYSCALL_64_after_hwframe+0x77/0x7f write to 0xffffa3ae8ce3cdc8 of 60 bytes by task 334 on cpu 1: iso_sock_setsockopt+0x69a/0x930 do_sock_setsockopt+0xc3/0x170 __sys_setsockopt+0xd1/0x130 __x64_sys_setsockopt+0x64/0x80 x64_sys_call+0x1547/0x1c60 do_syscall_64+0x133/0x590 entry_SYSCALL_64_after_hwframe+0x77/0x7f Reported by Kernel Concurrency Sanitizer on: CPU: 1 UID: 0 PID: 334 Comm: iso_setup_race Not tainted 7.0.0-10949-g8541d8f725c6 #44 PREEMPT(lazy) The iso_connect_ind() races were found by inspection. Fixes: `ccf74f2390` ("Bluetooth: Add BTPROTO_ISO socket type") Signed-off-by: SeungJu Cheon <suunj1331@gmail.com> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>	2026-05-06 16:22:05 -04:00
SeungJu Cheon	ca40d48107	Bluetooth: ISO: Fix data-race on dst in iso_sock_connect() iso_sock_connect() copies the destination address into iso_pi(sk)->dst under lock_sock, then releases the lock and reads it back with bacmp() to decide between the CIS and BIS connect paths: lock_sock(sk); bacpy(&iso_pi(sk)->dst, &sa->iso_bdaddr); iso_pi(sk)->dst_type = sa->iso_bdaddr_type; release_sock(sk); if (bacmp(&iso_pi(sk)->dst, BDADDR_ANY)) // <- no lock held This read after release_sock() races with any concurrent write to iso_pi(sk)->dst on the same socket. Fix by reading the destination address directly from the local sockaddr argument (sa->iso_bdaddr) instead of iso_pi(sk)->dst. Since sa is a function-local argument, reading it requires no locking and avoids the race. This patch addresses only the bacmp() race in iso_sock_connect(); other unprotected iso_pi(sk) accesses are fixed separately in the next patch. KCSAN report: BUG: KCSAN: data-race in memcmp+0x39/0xb0 race at unknown origin, with read to 0xffff8f96ea66dde3 of 1 bytes by task 549 on cpu 1: memcmp+0x39/0xb0 iso_sock_connect+0x275/0xb40 __sys_connect_file+0xbd/0xe0 __sys_connect+0xe0/0x110 __x64_sys_connect+0x40/0x50 x64_sys_call+0xcad/0x1c60 do_syscall_64+0x133/0x590 entry_SYSCALL_64_after_hwframe+0x77/0x7f value changed: 0x00 -> 0xee Reported by Kernel Concurrency Sanitizer on: CPU: 1 UID: 0 PID: 549 Comm: iso_race_combin Not tainted 7.0.0-08391-g1d51b370a0f8 #40 PREEMPT(lazy) Fixes: `ccf74f2390` ("Bluetooth: Add BTPROTO_ISO socket type") Signed-off-by: SeungJu Cheon <suunj1331@gmail.com> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>	2026-05-06 16:21:58 -04:00
Aurelien DESBRIERES	902fe40bce	Bluetooth: hci_uart: Fix NULL deref in recv callbacks when priv is uninitialized When a fault is injected during hci_uart line discipline setup, the proto open() callback may fail leaving hu->priv as NULL. A subsequent TIOCSTI ioctl can trigger the recv() callback before priv is initialized, causing a NULL pointer dereference. Fix all four affected HCI UART protocol drivers by adding a NULL check on hu->priv at the start of their recv() callbacks: h4, h5, ath and bcsp. Reported-by: syzbot+ff30eeab8e07b37d524e@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=ff30eeab8e07b37d524e Signed-off-by: Aurelien DESBRIERES <aurelien@hackers.camp> Assisted-by: Claude:claude-sonnet-4-6 Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>	2026-05-06 16:21:43 -04:00
Sai Teja Aluvala	5917dd39db	Bluetooth: btintel_pcie: treat boot stage bit 12 as warning CSR boot stage register bit 12 is documented as a device warning, not a fatal error. Rename the bit definition accordingly and stop including it in btintel_pcie_in_error(). This keeps warning-only boot stage values from being classified as errors while preserving abort-handler state as the actual error condition. Fixes: `190377500f` ("Bluetooth: btintel_pcie: Dump debug registers on error") Signed-off-by: Kiran K <kiran.k@intel.com> Signed-off-by: Sai Teja Aluvala <aluvala.sai.teja@intel.com> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>	2026-05-06 16:21:34 -04:00
Pauli Virtanen	4e37f6452d	Bluetooth: SCO: hold sk properly in sco_conn_ready sk deref in sco_conn_ready must be done either under conn->lock, or holding a refcount, to avoid concurrent close. conn->sk and parent sk is currently accessed without either, and without checking parent->sk_state: [Task 1] [Task 2] sco_sock_release sco_conn_ready sk = conn->sk lock_sock(sk) conn->sk = NULL lock_sock(sk) release_sock(sk) sco_sock_kill(sk) UAF on sk deref and similarly for access to sco_get_sock_listen() return value. Fix possible UAF by holding sk refcount in sco_conn_ready() and making sco_get_sock_listen() increase refcount. Also recheck after lock_sock that the socket is still valid. Adjust conn->sk locking so it's protected also by lock_sock() of the associated socket if any. Fixes: `27c24fda62` ("Bluetooth: switch to lock_sock in SCO") Signed-off-by: Pauli Virtanen <pav@iki.fi> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>	2026-05-06 16:21:25 -04:00
Siwei Zhang	0a120d9616	Bluetooth: L2CAP: Fix null-ptr-deref in l2cap_sock_new_connection_cb() Add the same NULL guard already present in l2cap_sock_resume_cb() and l2cap_sock_ready_cb(). Fixes: `80808e431e` ("Bluetooth: Add l2cap_chan_ops abstraction") Cc: stable@kernel.org Signed-off-by: Siwei Zhang <oss@fourdim.xyz> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>	2026-05-06 16:21:09 -04:00
Siwei Zhang	78a88d43da	Bluetooth: L2CAP: Fix null-ptr-deref in l2cap_sock_get_sndtimeo_cb() Add the same NULL guard already present in l2cap_sock_resume_cb() and l2cap_sock_ready_cb(). Fixes: `8d836d71e2` ("Bluetooth: Access sk_sndtimeo indirectly in l2cap_core.c") Cc: stable@kernel.org Signed-off-by: Siwei Zhang <oss@fourdim.xyz> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>	2026-05-06 16:21:07 -04:00
Siwei Zhang	2ff1a41a91	Bluetooth: L2CAP: Fix null-ptr-deref in l2cap_sock_state_change_cb() Add the same NULL guard already present in l2cap_sock_resume_cb() and l2cap_sock_ready_cb(). Fixes: `89bc500e41` ("Bluetooth: Add state tracking to struct l2cap_chan") Cc: stable@kernel.org Signed-off-by: Siwei Zhang <oss@fourdim.xyz> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>	2026-05-06 16:21:04 -04:00
Mikhail Gavrilov	91b5a598b5	Bluetooth: l2cap: defer conn param update to avoid conn->lock/hdev->lock inversion When a BLE peripheral sends an L2CAP Connection Parameter Update Request the processing path is: process_pending_rx() [takes conn->lock] l2cap_le_sig_channel() l2cap_conn_param_update_req() hci_le_conn_update() [takes hdev->lock] Meanwhile other code paths take the locks in the opposite order: l2cap_chan_connect() [takes hdev->lock] ... mutex_lock(&conn->lock) l2cap_conn_ready() [hdev->lock via hci_cb_list_lock] ... mutex_lock(&conn->lock) This is a classic AB/BA deadlock which lockdep reports as a circular locking dependency when connecting a BLE MIDI keyboard (Carry-On FC-49). Fix this by making hci_le_conn_update() defer the HCI command through hci_cmd_sync_queue() so it no longer needs to take hdev->lock in the caller context. The sync callback uses __hci_cmd_sync_status_sk() to wait for the HCI_EV_LE_CONN_UPDATE_COMPLETE event, then updates the stored connection parameters (hci_conn_params) and notifies userspace (mgmt_new_conn_param) only after the controller has confirmed the update. A reference on hci_conn is held via hci_conn_get()/hci_conn_put() for the lifetime of the queued work to prevent use-after-free, and hci_conn_valid() is checked before proceeding in case the connection was removed while the work was pending. The hci_dev_lock is held across hci_conn_valid() and all conn field accesses to prevent a concurrent disconnect from invalidating the connection mid-use. Fixes: `f044eb0524` ("Bluetooth: Store latency and supervision timeout in connection params") Signed-off-by: Mikhail Gavrilov <mikhail.v.gavrilov@gmail.com> Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>	2026-05-06 16:20:51 -04:00
Dudu Lu	4f42363c81	Bluetooth: l2cap: fix MPS check in l2cap_ecred_reconf_req The L2CAP specification states that if more than one channel is being reconfigured, the MPS shall not be decreased. The current check has two issues: 1) The comparison uses >= (greater-than-or-equal), which incorrectly rejects reconfiguration requests where the MPS stays the same. Since the spec says MPS "shall be greater than or equal to the current MPS", only a strict decrease (remote_mps > mps) should be rejected. Keeping the same MPS is valid. 2) The multi-channel guard uses `&& i` (loop index) to approximate "more than one channel", but this incorrectly allows MPS decrease for the first channel (i==0) even when multiple channels are being reconfigured. Replace with `&& num_scid > 1` which correctly checks whether the request covers more than one channel. Fixes: `7accb1c432` ("Bluetooth: L2CAP: Fix invalid response to L2CAP_ECRED_RECONF_REQ") Signed-off-by: Dudu Lu <phx0fer@gmail.com> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>	2026-05-06 16:20:38 -04:00
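The corrected condition from the l2cap_ecred_reconf_req commit above can be written as a small predicate. This is a sketch only: `ecred_reconf_mps_rejected()` and its parameter names are hypothetical, and it models just the two points the commit message makes (strict decrease, and `num_scid > 1` instead of the loop index).

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical predicate: a reconfiguration request is rejected on MPS
 * grounds only when it covers more than one channel AND the requested MPS
 * is strictly smaller than the channel's current MPS. An unchanged MPS is
 * valid, and a single-channel request may decrease it.
 */
static bool ecred_reconf_mps_rejected(uint16_t cur_mps, uint16_t new_mps,
				      unsigned int num_scid)
{
	return num_scid > 1 && new_mps < cur_mps;
}
```

Under the old form of the check, an unchanged MPS on any channel after the first was rejected, and a decrease on the first channel of a multi-channel request slipped through; the predicate above has neither behaviour.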
Dudu Lu	72b8deccff	Bluetooth: bnep: fix incorrect length parsing in bnep_rx_frame() extension handling In bnep_rx_frame(), the BNEP_FILTER_NET_TYPE_SET and BNEP_FILTER_MULTI_ADDR_SET extension header parsing has two bugs: 1) The 2-byte length field is read with *(u16 *)(skb->data + 1), which performs a native-endian read. The BNEP protocol specifies this field in big-endian (network byte order), and the same file correctly uses get_unaligned_be16() for the identical fields in bnep_ctrl_set_netfilter() and bnep_ctrl_set_mcfilter(). 2) The length is multiplied by 2, but unlike BNEP_SETUP_CONN_REQ where the length byte counts UUID pairs (requiring * 2 for two UUIDs per entry), the filter extension length field already represents the total data size in bytes. This is confirmed by bnep_ctrl_set_netfilter() which reads the same field as a byte count and divides by 4 to get the number of filter entries. The bogus * 2 means skb_pull advances twice as far as it should, either dropping valid data from the next header or causing the pull to fail entirely when the doubled length exceeds the remaining skb. Fix by splitting the pull into two steps: first use skb_pull_data() to safely pull and validate the 3-byte fixed header (ctrl type + length), then pull the variable-length data using the properly decoded length. Fixes: `bf8b9a9cb7` ("Bluetooth: bnep: Add support to extended headers of control frames") Signed-off-by: Dudu Lu <phx0fer@gmail.com> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>	2026-05-06 16:19:09 -04:00
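The two bnep parsing points above (big-endian length, counted in bytes rather than pairs) can be illustrated with a short user-space sketch. The names `read_be16()` and `bnep_filter_ext_size()` are hypothetical; in the kernel the big-endian read is done by get_unaligned_be16(), as the commit message notes.

```c
#include <stddef.h>
#include <stdint.h>

/* Portable big-endian 16-bit read, independent of host byte order. */
static uint16_t read_be16(const uint8_t *p)
{
	return (uint16_t)((p[0] << 8) | p[1]);
}

/*
 * Hypothetical helper: given a buffer that starts at a filter extension
 * (1 byte ctrl type, 2 bytes big-endian length, then payload), return the
 * total number of bytes the extension occupies, or 0 if the buffer is too
 * short. The length field is already a byte count, so it is not doubled.
 */
static size_t bnep_filter_ext_size(const uint8_t *buf, size_t avail)
{
	size_t len;

	if (avail < 3)
		return 0;
	len = read_be16(buf + 1);
	if (len > avail - 3)
		return 0;
	return 3 + len;
}
```

For a header whose length bytes are `00 04`, the extension occupies 7 bytes; a parser that read the same bytes in little-endian order would see 0x0400 and run past the buffer.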
Luiz Augusto von Dentz	5ddb801426	Bluetooth: hci_event: Fix OOB read and infinite loop in hci_le_create_big_complete_evt hci_le_create_big_complete_evt() iterates over BT_BOUND connections for a BIG handle using a while loop, accessing ev->bis_handle[i++] on each iteration. However, there is no check that i stays within ev->num_bis before the array access. When a controller sends a LE_Create_BIG_Complete event with fewer bis_handle entries than there are BT_BOUND connections for that BIG, or with num_bis=0, the loop reads beyond the valid bis_handle[] flex array into adjacent heap memory. Since the out-of-bounds values typically exceed HCI_CONN_HANDLE_MAX (0x0EFF), hci_conn_set_handle() rejects them and the connection remains in BT_BOUND state. The same connection is then found again by hci_conn_hash_lookup_big_state(), creating an infinite loop with hci_dev_lock held. Fix this by terminating the BIG if not all BISes could be set up properly. Fixes: `a0bfde167b` ("Bluetooth: ISO: Add support for connecting multiple BISes") Cc: stable@vger.kernel.org Signed-off-by: ZhiTao Ou <hkbinbinbin@gmail.com> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>	2026-05-06 16:18:22 -04:00
David Carlier	0beddb0c38	Bluetooth: hci_conn: fix potential UAF in create_big_sync Add hci_conn_valid() check in create_big_sync() to detect stale connections before proceeding with BIG creation. Handle the resulting -ECANCELED in create_big_complete() and re-validate the connection under hci_dev_lock() before dereferencing, matching the pattern used by create_le_conn_complete() and create_pa_complete(). Keep the hci_conn object alive across the async boundary by taking a reference via hci_conn_get() when queueing create_big_sync(), and dropping it in the completion callback. The refcount and the lock are complementary: the refcount keeps the object allocated, while hci_dev_lock() serializes hci_conn_hash_del()'s list_del_rcu() on hdev->conn_hash, as required by hci_conn_del(). hci_conn_put() is called outside hci_dev_unlock() so the final put (which resolves to kfree() via bt_link_release) does not run under hdev->lock, though the release path would be safe either way. Without this, create_big_complete() would unconditionally dereference the conn pointer on error, causing a use-after-free via hci_connect_cfm() and hci_conn_del(). Fixes: `eca0ae4aea` ("Bluetooth: Add initial implementation of BIS connections") Cc: stable@vger.kernel.org Co-developed-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com> Signed-off-by: David Carlier <devnexen@gmail.com> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>	2026-05-06 15:58:40 -04:00
Pauli Virtanen	b819db93d7	Bluetooth: SCO: fix sleeping under spinlock in sco_conn_ready sco_conn_ready calls sleeping functions under conn->lock spinlock. The critical section can be reduced: conn->hcon is modified only with hdev->lock held. It is guaranteed to be held in sco_conn_ready, so conn->lock is not needed to guard it. Move taking conn->lock after lock_sock(parent). This also follows the lock ordering lock_sock() > conn->lock elsewhere in the file. Fixes: `27c24fda62` ("Bluetooth: switch to lock_sock in SCO") Signed-off-by: Pauli Virtanen <pav@iki.fi> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>	2026-05-06 15:58:29 -04:00
Linus Torvalds	5862221fdd	parisc architecture fixes for kernel v7.1-rc3: - Revert "parisc: led: fix reference leak on failed device registration" - Fix build failures introduced when allowing to build 32-/64-bit only VDSO - Switch to dynamic parisc root device to avoid upcoming warnings - Fix IRQ leak in LASI driver -----BEGIN PGP SIGNATURE----- iHUEABYKAB0WIQS86RI+GtKfB8BJu973ErUQojoPXwUCafuSEAAKCRD3ErUQojoP X8T2AQCJq3aNsTJPG2pLeHFrzDwl0Xfik1W5SHqn9acHPay7mwD9H/uMCDtukZ2D WQsbyqxWEvuNGyWq3ww8WTlFMUhrXA8= =t5SV -----END PGP SIGNATURE----- Merge tag 'parisc-for-7.1-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux Pull parisc fixes from Helge Deller: - Revert "parisc: led: fix reference leak on failed device registration" - Fix build failures introduced when allowing to build 32-/64-bit only VDSO - Switch to dynamic parisc root device to avoid upcoming warnings - Fix IRQ leak in LASI driver * tag 'parisc-for-7.1-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux: parisc: Fix IRQ leak in LASI driver parisc: Fix 64-bit kernel build when CONFIG_COMPAT=n parisc: Fix build failure for 32-bit kernel with PA2.0 instruction set parisc: drivers: switch to dynamic root device Revert "parisc: led: fix reference leak on failed device registration"	2026-05-06 12:51:07 -07:00
Steffen Eiden	54ea22273e	MAINTAINERS: Add Steffen as reviewer for KVM/arm64 KVM/arm64 and KVM/s390 will eventually share some code. Add me as a cross-reviewer from the s390 team to arm64 to help to keep both architectures in sync. Signed-off-by: Steffen Eiden <seiden@linux.ibm.com> Link: https://patch.msgid.link/20260428160527.1378085-16-seiden@linux.ibm.com [maz: rephrase commit message to use future tense, since this is merged ahead of the code] Signed-off-by: Marc Zyngier <maz@kernel.org>	2026-05-06 17:33:56 +01:00
Mostafa Saleh	9a624ea3f2	KVM: arm64: Remove potential UB on nvhe tracing clock update Sashiko (run locally) reports the possibility of a division by zero and an out-of-bounds bitwise shift in trace_clock_update(). Although the clock update is untrusted, we should at least have some basic checks to avoid undefined behaviours. Reviewed-by: Vincent Donnefort <vdonnefort@google.com> Signed-off-by: Mostafa Saleh <smostafa@google.com> Link: https://patch.msgid.link/20260430103724.2151625-1-smostafa@google.com Signed-off-by: Marc Zyngier <maz@kernel.org>	2026-05-06 17:09:48 +01:00
Sebastian Ott	fc240715fc	KVM: selftests: arm64: Fix steal_time test after UAPI refactoring Fix the following failure to the steal_time test on arm64 by making the timer address known to the guest. ==== Test Assertion Failure ==== steal_time.c:229: !ret pid=18514 tid=18514 errno=22 - Invalid argument 1 0x000000000040252f: check_steal_time_uapi at steal_time.c:229 (discriminator 20) 2 (inlined by) main at steal_time.c:537 (discriminator 20) 3 0x0000ffffa23d621b: ?? ??:0 4 0x0000ffffa23d62fb: ?? ??:0 5 0x0000000000402b6f: _start at ??:? KVM_SET_DEVICE_ATTR failed, rc: -1 errno: 22 (Invalid argument) Fixes: `40351ed924` ("KVM: selftests: Refactor UAPI tests into dedicated function") Signed-off-by: Sebastian Ott <sebott@redhat.com> Link: https://patch.msgid.link/20260504112808.21276-1-sebott@redhat.com Signed-off-by: Marc Zyngier <maz@kernel.org>	2026-05-06 17:09:03 +01:00
Alexandru Elisei	9be19df816	KVM: arm64: Handle permission faults with guest_memfd gmem_abort() calls kvm_pgtable_stage2_map() to make changes to stage 2. It does this for both relaxing permissions on an existing mapping and to install a missing mapping. kvm_pgtable_stage2_map() doesn't make changes to stage 2 if there is an existing, valid entry and the new entry modifies only the permissions. This is checked in: kvm_pgtable_stage2_map() stage2_map_walk_leaf() stage2_map_walker_try_leaf() stage2_pte_needs_update() and if only the permissions differ, kvm_pgtable_stage2_map() returns -EAGAIN and KVM returns to the guest to replay the instruction. The assumption is that a concurrent fault on a different VCPU already mapped the faulting IPA, and replaying the instruction will either succeed, or cause a permission fault, which should be handled with kvm_pgtable_stage2_relax_perms(). gmem_abort(), on a read or write fault on a system without DIC (instruction cache invalidation required for data to instruction coherence), installs a valid entry with read and write permissions, but without executable permissions. On an execution fault on the same page, gmem_abort() attempts to relax the permissions to allow execution, but calls kvm_pgtable_stage2_map() to change the existing, valid, entry. kvm_pgtable_stage2_map() returns -EAGAIN and KVM resumes execution from the faulting instruction, which leads to an infinite loop of permission faults on the same instruction. Allow the guest to make progress by using kvm_pgtable_stage2_relax_perms() to relax permissions. Fixes: `a7b57e0995` ("KVM: arm64: Handle guest_memfd-backed guest page faults") Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com> Reviewed-by: Fuad Tabba <tabba@google.com> Link: https://patch.msgid.link/20260505094913.75317-1-alexandru.elisei@arm.com Signed-off-by: Marc Zyngier <maz@kernel.org>	2026-05-06 17:08:39 +01:00
Wei-Lin Chang	8d9b9d985a	KVM: arm64: nv: Consider the DS bit when translating TCR_EL2 When running an nVHE L1, TCR_EL2 is mapped to TCR_EL1. Writes to the register are trapped and written to TCR_EL1 after a translation. Booting an nVHE L1 with 52-bit VA isn't working because the translation was ignoring the DS bit set by the guest, hence causing repeating level 0 faults. Add it in the translation function. Signed-off-by: Wei-Lin Chang <weilin.chang@arm.com> Link: https://patch.msgid.link/20260505144735.1496530-1-weilin.chang@arm.com Signed-off-by: Marc Zyngier <maz@kernel.org>	2026-05-06 17:08:39 +01:00
James Morse	1f7305d87a	KVM: arm64: Work around C1-Pro erratum 4193714 for protected guests C1-Pro cores with SME have an erratum where TLBI+DSB does not complete all outstanding SME accesses. Instead a DSB needs to be executed on the affected CPUs. The implication is that pages cannot be unmapped from the host Stage 2 and then provided to a protected guest or to the hypervisor. Host SME accesses may still complete after this point. This erratum breaks pKVM's guarantees, and the workaround is hard to implement as EL2 and EL1 share a security state meaning EL1 can mask IPIs sent by EL2, leading to interrupt blackouts. Instead, do this in EL3. This has the advantage of a separate security state, meaning lower EL cannot mask the IPI. It is also simpler for EL3 to know about CPUs that are off or in PSCI's CPU_SUSPEND. Add the needed hook to host_stage2_set_owner_metadata_locked(). This covers the cases where the host loses access to a page: __pkvm_host_donate_guest() __pkvm_guest_unshare_host() host_stage2_set_owner_locked() when owner_id == PKVM_ID_HYP Since pKVM relies on the firmware call for correctness, check for the firmware counterpart during protected KVM initialisation and fail the pKVM initialisation if it is missing. Signed-off-by: James Morse <james.morse@arm.com> Co-developed-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Marc Zyngier <maz@kernel.org> Cc: Oliver Upton <oupton@kernel.org> Cc: Will Deacon <will@kernel.org> Cc: Vincent Donnefort <vdonnefort@google.com> Cc: Lorenzo Pieralisi <lpieralisi@kernel.org> Cc: Sudeep Holla <sudeep.holla@kernel.org> Link: https://patch.msgid.link/20260505165205.2690919-1-catalin.marinas@arm.com Signed-off-by: Marc Zyngier <maz@kernel.org>	2026-05-06 17:08:39 +01:00
Vincent Guittot	9f6d929ee2	sched/fair: Fix wakeup_preempt_fair() for not waking up task Make sure to only call pick_next_entity() on a non-empty cfs_rq. The assumption that p is always enqueued and not delayed is only true for wakeup. If p was moved while delayed, pick_next_entity() will dequeue it and the cfs might become empty. Test if there are still queued tasks before trying again to determine if p could be the next one to be picked. There are at least 2 cases: When cfs becomes idle, it tries to pull tasks but if those pulled tasks are delayed, they will be dequeued when attached to cfs. attach_tasks() -> attach_task() -> wakeup_preempt(rq, p, 0); A misfit task running on cfs A triggers a load balance to be pulled on a better cpu, the load balance on cfs B starts an active load balance to pull the running misfit task. If there is a delayed dequeue task on cfs A, it can be pulled instead of the previously running misfit task. attach_one_task() -> attach_task() -> wakeup_preempt(rq, p, 0); Fixes: `ac8e69e693` ("sched/fair: Fix wakeup_preempt_fair() vs delayed dequeue") Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://patch.msgid.link/20260503104503.1732682-1-vincent.guittot@linaro.org	2026-05-06 17:41:18 +02:00
Zhan Xusheng	b6eee96843	sched/fair: Fix overflow in vruntime_eligible() Zhan Xusheng reported running into a sporadic s64 mult overflow in vruntime_eligible(). When constructing a worst case scenario: If you have cgroups, then you can have an entity of weight 2 (per calc_group_shares()), and its vlag should then be bounded by: (slice+TICK_NSEC) * NICE_0_LOAD, which is around 44 bits as per the comment on entity_key(). The other extreme is 100*NICE_0_LOAD, thus you get: {key, weight}[] := { puny: { (slice + TICK_NSEC) * NICE_0_LOAD, 2 }, max: { 0, 100*NICE_0_LOAD }, } The avg_vruntime() would end up being very close to 0 (which is zero_vruntime), so no real help making that more accurate. vruntime_eligible(puny) ends up with: avg = 2 * puny.key (+ 0) load = 2 + 100 * NICE_0_LOAD avg >= puny.key * load And that is: (slice + TICK_NSEC) * NICE_0_LOAD * NICE_0_LOAD * 100, which will overflow s64. Zhan suggested using __builtin_mul_overflow(), however after staring at compiler output for various architectures using godbolt, it seems that using an __int128 multiplication often results in better code. Specifically, a number of architectures already compute the __int128 product to determine the overflow. E.g. arm64 already has the 'smulh' instruction used. By explicitly doing an __int128 multiply, it will emit the 'mul; smulh' pattern, which modern cores can fuse (armv8-a clang-22.1.0). x86_64 has fewer branches (no OF handling). Since Linux has ARCH_SUPPORTS_INT128 to gate __int128 usage, also provide the __builtin_mul_overflow() variant as a fallback. 
[peterz: Changelog and __int128 bits] Fixes: `556146ce5e` ("sched/fair: Avoid overflow in enqueue_entity()") Reported-by: Zhan Xusheng <zhanxusheng1024@gmail.com> Closes: https://patch.msgid.link/20260415145742.10359-1-zhanxusheng%40xiaomi.com Signed-off-by: Zhan Xusheng <zhanxusheng@xiaomi.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://patch.msgid.link/20260505103155.GN3102924%40noisy.programming.kicks-ass.net	2026-05-06 17:41:17 +02:00
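The overflow-safe comparison discussed in the vruntime_eligible() commit above can be sketched in user-space C. This is an illustration under stated assumptions, not the scheduler's code: `avg_ge_key_times_load()` is a hypothetical name, and the compiler macro `__SIZEOF_INT128__` stands in here for the kernel's ARCH_SUPPORTS_INT128 gate. Both `__int128` and `__builtin_mul_overflow()` are GCC/Clang extensions.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: evaluate "avg >= key * load" without letting the
 * product wrap around in 64 bits.
 */
static bool avg_ge_key_times_load(int64_t avg, int64_t key, int64_t load)
{
#ifdef __SIZEOF_INT128__
	/* Widen first, so the product is computed exactly in 128 bits. */
	return (__int128)avg >= (__int128)key * load;
#else
	int64_t rhs;

	/*
	 * Fallback: if the product does not fit in 64 bits, its magnitude
	 * exceeds anything avg can hold, so only its sign matters. A
	 * negative product is below every avg; a positive one is above.
	 */
	if (__builtin_mul_overflow(key, load, &rhs))
		return (key < 0) != (load < 0);
	return avg >= rhs;
#endif
}
```

Plugging in magnitudes like those in the commit message (a key of about 2^44 and a load of about 100 * 2^20) gives a product near 100 * 2^64, which does not fit in s64; the helper still reports that `avg = 2 * key` is smaller than the product.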
Thomas Gleixner	e744060076	selftests/rseq: Expand for optimized RSEQ ABI v2 Update the selftests so they are executed for legacy (32 bytes RSEQ region) and optimized RSEQ ABI v2 mode. Fixes: `d6200245c7` ("rseq: Allow registering RSEQ with slice extension") Signed-off-by: Thomas Gleixner <tglx@kernel.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Dmitry Vyukov <dvyukov@google.com> Tested-by: Dmitry Vyukov <dvyukov@google.com> Link: https://patch.msgid.link/20260428224428.009121296%40kernel.org Cc: stable@vger.kernel.org	2026-05-06 17:41:08 +02:00
Thomas Gleixner	99428157dc	rseq: Reenable performance optimizations conditionally Due to the incompatibility with TCMalloc the RSEQ optimizations and extended features (time slice extensions) have been disabled and made run-time conditional. The original RSEQ implementation, which TCMalloc depends on, registers a 32 byte region (ORIG_RSEQ_SIZE). This region has a 32 byte alignment requirement. The extension safe newer variant exposes the kernel RSEQ feature size via getauxval(AT_RSEQ_FEATURE_SIZE) and the alignment requirement via getauxval(AT_RSEQ_ALIGN). The alignment requirement is that the registered RSEQ region is aligned to the next power of two of the feature size. The kernel currently has a feature size of 33 bytes, which means the alignment requirement is 64 bytes. The TCMalloc RSEQ region is embedded into a cache line aligned data structure starting at offset 32 bytes so that bytes 28-31 and the cpu_id_start field at bytes 32-35 form a 64-bit little endian pointer with the top-most bit (63) set to check whether the kernel has overwritten cpu_id_start with an actual CPU id value, which is guaranteed to not have the top most bit set. As this is part of their performance tuned magic, it's a pretty safe assumption that TCMalloc won't use a larger RSEQ size. This allows the kernel to declare registrations with a size greater than the original size of 32 bytes, which is the case since time slice extensions got introduced, as RSEQ ABI v2 with the following differences to the original behaviour: 1) Unconditional updates of the user read only fields (CPU, node, MMCID) are removed. Those fields are only updated on registration, task migration and MMCID changes. 2) Unconditional evaluation of the critical section pointer is removed. It's only evaluated when user space was interrupted and was scheduled out or before delivering a signal in the interrupted context. 3) The read/only requirement of the ID fields is enforced. 
When the kernel detects that userspace manipulated the fields, the process is terminated. This ensures that multiple entities (libraries) can utilize RSEQ without interfering. 4) Today's extended RSEQ feature (time slice extensions) and future extensions are only enabled in the v2 enabled mode. Registrations with the original size of 32 bytes operate in backwards compatible legacy mode without performance improvements and extended features. Unfortunately that also affects users of older GLIBC versions which register the original size of 32 bytes and do not evaluate the kernel required size in the auxiliary vector AT_RSEQ_FEATURE_SIZE. That's the result of the lack of enforcement in the original implementation and the unwillingness of a single entity to cooperate with the larger ecosystem for many years. Implement the required registration changes by restructuring the spaghetti code and adding the size/version check. Also add documentation about the differences of legacy and optimized RSEQ V2 mode. Thanks to Mathieu for pointing out the ORIG_RSEQ_SIZE constraints! Fixes: `d6200245c7` ("rseq: Allow registering RSEQ with slice extension") Signed-off-by: Thomas Gleixner <tglx@kernel.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Dmitry Vyukov <dvyukov@google.com> Tested-by: Dmitry Vyukov <dvyukov@google.com> Link: https://patch.msgid.link/20260428224427.927160119%40kernel.org Cc: stable@vger.kernel.org	2026-05-06 17:40:27 +02:00
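The alignment rule stated in the rseq commit above (the region is aligned to the next power of two of the feature size, so 33 bytes requires 64) is simple arithmetic and can be checked with a short sketch. `rseq_required_align()` is a hypothetical helper written for illustration; it is not a kernel or glibc function, and user space would normally read the actual value via getauxval(AT_RSEQ_ALIGN) as the commit message describes.

```c
#include <assert.h>

/*
 * Hypothetical helper: smallest power of two that is greater than or
 * equal to feature_size. The upper bound keeps the shift from wrapping.
 */
static unsigned int rseq_required_align(unsigned int feature_size)
{
	unsigned int align = 1;

	assert(feature_size != 0 && feature_size <= (1u << 31));
	while (align < feature_size)
		align <<= 1;
	return align;
}
```

For the original 32 byte region this yields 32, matching the legacy alignment requirement, and for the current 33 byte feature size it yields 64.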


1446901 Commits