linux

mirror of https://github.com/torvalds/linux.git synced 2026-06-05 04:56:13 +02:00

Author	SHA1	Message	Date
David Howells	0422e7a488	rxrpc: Fix re-decryption of RESPONSE packets If a RESPONSE packet gets a temporary failure during processing, it may end up in a partially decrypted state - and then get requeued for a retry. Fix this by just discarding the packet; we will send another CHALLENGE packet and thereby elicit a further response. Similarly, discard an incoming CHALLENGE packet if we get an error whilst generating a RESPONSE; the server will send another CHALLENGE. Fixes: `17926a7932` ("[AF_RXRPC]: Provide secure RxRPC sockets for use by userspace and kernel both") Closes: https://sashiko.dev/#/patchset/20260422161438.2593376-4-dhowells@redhat.com Signed-off-by: David Howells <dhowells@redhat.com> cc: Marc Dionne <marc.dionne@auristor.com> cc: Jeffrey Altman <jaltman@auristor.com> cc: Simon Horman <horms@kernel.org> cc: linux-afs@lists.infradead.org cc: stable@kernel.org Link: https://patch.msgid.link/20260423200909.3049438-3-dhowells@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-04-23 14:29:15 -07:00
David Howells	55b2984c96	rxrpc: Fix rxrpc_input_call_event() to only unshare DATA packets Fix rxrpc_input_call_event() to only unshare DATA packets and not ACK, ABORT, etc.. And with that, rxrpc_input_packet() doesn't need to take a pointer to the pointer to the packet, so change that to just a pointer. Fixes: `1f2740150f` ("rxrpc: Fix potential UAF after skb_unshare() failure") Closes: https://sashiko.dev/#/patchset/20260422161438.2593376-4-dhowells@redhat.com Signed-off-by: David Howells <dhowells@redhat.com> cc: Marc Dionne <marc.dionne@auristor.com> cc: Jeffrey Altman <jaltman@auristor.com> cc: Simon Horman <horms@kernel.org> cc: linux-afs@lists.infradead.org cc: stable@kernel.org Link: https://patch.msgid.link/20260423200909.3049438-2-dhowells@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-04-23 14:29:15 -07:00
Jakub Kicinski	27ae4bcf4d	Merge branch 'rxrpc-miscellaneous-fixes' David Howells says: ==================== rxrpc: Miscellaneous fixes Here are some fixes for rxrpc, as found by Sashiko[1]: (1) Fix leaks in rxkad_verify_response(). (2) Fix handling of rxkad-encrypted packets with crypto-misaligned lengths. (3) Fix problem with unsharing DATA packets potentially causing a crash in the caller. (4) Fix lack of unsharing of RESPONSE packets. (5) Fix integer overflow in RxGK ticket length check. (6) Fix missing length check in RxKAD tickets. ==================== Link: https://patch.msgid.link/20260422161438.2593376-1-dhowells@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-04-23 12:41:52 -07:00
Anderson Nascimento	ac33733b10	rxrpc: Fix missing validation of ticket length in non-XDR key preparsing In rxrpc_preparse(), there are two paths for parsing key payloads: the XDR path (for large payloads) and the non-XDR path (for payloads <= 28 bytes). While the XDR path (rxrpc_preparse_xdr_rxkad()) correctly validates the ticket length against AFSTOKEN_RK_TIX_MAX, the non-XDR path fails to do so. This allows an unprivileged user to provide a very large ticket length. When this key is later read via rxrpc_read(), the total token size (toksize) calculation results in a value that exceeds AFSTOKEN_LENGTH_MAX, triggering a WARN_ON(). [ 2001.302904] WARNING: CPU: 2 PID: 2108 at net/rxrpc/key.c:778 rxrpc_read+0x109/0x5c0 [rxrpc] Fix this by adding a check in the non-XDR parsing path of rxrpc_preparse() to ensure the ticket length does not exceed AFSTOKEN_RK_TIX_MAX, bringing it into parity with the XDR parsing logic. Fixes: `8a7a3eb4dd` ("KEYS: RxRPC: Use key preparsing") Fixes: `84924aac08` ("rxrpc: Fix checker warning") Reported-by: Anderson Nascimento <anderson@allelesecurity.com> Signed-off-by: Anderson Nascimento <anderson@allelesecurity.com> Signed-off-by: David Howells <dhowells@redhat.com> cc: Marc Dionne <marc.dionne@auristor.com> cc: Jeffrey Altman <jaltman@auristor.com> cc: Simon Horman <horms@kernel.org> cc: linux-afs@lists.infradead.org cc: stable@kernel.org Link: https://patch.msgid.link/20260422161438.2593376-7-dhowells@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-04-23 12:41:49 -07:00
David Howells	6929350080	rxgk: Fix potential integer overflow in length check Fix potential integer overflow in rxgk_extract_token() when checking the length of the ticket. Rather than rounding up the value to be tested (which might overflow), round down the size of the available data. Fixes: `2429a19764` ("rxrpc: Fix untrusted unsigned subtract") Closes: https://sashiko.dev/#/patchset/20260408121252.2249051-1-dhowells%40redhat.com Signed-off-by: David Howells <dhowells@redhat.com> cc: Marc Dionne <marc.dionne@auristor.com> cc: Jeffrey Altman <jaltman@auristor.com> cc: Simon Horman <horms@kernel.org> cc: linux-afs@lists.infradead.org cc: stable@kernel.org Link: https://patch.msgid.link/20260422161438.2593376-6-dhowells@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-04-23 12:40:52 -07:00
David Howells	24481a7f57	rxrpc: Fix conn-level packet handling to unshare RESPONSE packets The security operations that verify the RESPONSE packets decrypt bits of it in place - however, the sk_buff may be shared with a packet sniffer, which would lead to the sniffer seeing an apparently corrupt packet (actually decrypted). Fix this by handing a copy of the packet off to the specific security handler if the packet was cloned. Fixes: `17926a7932` ("[AF_RXRPC]: Provide secure RxRPC sockets for use by userspace and kernel both") Closes: https://sashiko.dev/#/patchset/20260408121252.2249051-1-dhowells%40redhat.com Signed-off-by: David Howells <dhowells@redhat.com> cc: Marc Dionne <marc.dionne@auristor.com> cc: Jeffrey Altman <jaltman@auristor.com> cc: Simon Horman <horms@kernel.org> cc: linux-afs@lists.infradead.org cc: stable@kernel.org Link: https://patch.msgid.link/20260422161438.2593376-5-dhowells@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-04-23 12:40:52 -07:00
David Howells	1f2740150f	rxrpc: Fix potential UAF after skb_unshare() failure If skb_unshare() fails to unshare a packet due to allocation failure in rxrpc_input_packet(), the skb pointer in the parent (rxrpc_io_thread()) will be NULL'd out. This will likely cause the call to trace_rxrpc_rx_done() to oops. Fix this by moving the unsharing down to where rxrpc_input_call_event() calls rxrpc_input_call_packet(). There are a number of places prior to that where we ignore DATA packets for a variety of reasons (such as the call already being complete) for which an unshare is then avoided. And with that, rxrpc_input_packet() doesn't need to take a pointer to the pointer to the packet, so change that to just a pointer. Fixes: `2d1faf7a0c` ("rxrpc: Simplify skbuff accounting in receive path") Closes: https://sashiko.dev/#/patchset/20260408121252.2249051-1-dhowells%40redhat.com Signed-off-by: David Howells <dhowells@redhat.com> cc: Marc Dionne <marc.dionne@auristor.com> cc: Jeffrey Altman <jaltman@auristor.com> cc: Simon Horman <horms@kernel.org> cc: linux-afs@lists.infradead.org cc: stable@kernel.org Link: https://patch.msgid.link/20260422161438.2593376-4-dhowells@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-04-23 12:40:52 -07:00
David Howells	def304aae2	rxrpc: Fix rxkad crypto unalignment handling Fix handling of a packet with a misaligned crypto length. Also handle non-ENOMEM errors from decryption by aborting. Further, remove the WARN_ON_ONCE() so that it can't be remotely triggered (a trace line can still be emitted). Fixes: `f93af41b9f` ("rxrpc: Fix missing error checks for rxkad encryption/decryption failure") Closes: https://sashiko.dev/#/patchset/20260408121252.2249051-1-dhowells%40redhat.com Signed-off-by: David Howells <dhowells@redhat.com> cc: Marc Dionne <marc.dionne@auristor.com> cc: Jeffrey Altman <jaltman@auristor.com> cc: Simon Horman <horms@kernel.org> cc: linux-afs@lists.infradead.org cc: stable@kernel.org Link: https://patch.msgid.link/20260422161438.2593376-3-dhowells@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-04-23 12:40:52 -07:00
David Howells	34f61a07e0	rxrpc: Fix memory leaks in rxkad_verify_response() Fix rxkad_verify_response() to free the ticket and the server key under all circumstances by initialising the ticket pointer to NULL and then making all paths through the function after the first allocation has been done go through a single common epilogue that just releases everything - where all the releases skip on a NULL pointer. Fixes: `57af281e53` ("rxrpc: Tidy up abort generation infrastructure") Fixes: `ec832bd06d` ("rxrpc: Don't retain the server key in the connection") Closes: https://sashiko.dev/#/patchset/20260408121252.2249051-1-dhowells%40redhat.com Signed-off-by: David Howells <dhowells@redhat.com> cc: Marc Dionne <marc.dionne@auristor.com> cc: Jeffrey Altman <jaltman@auristor.com> cc: Simon Horman <horms@kernel.org> cc: linux-afs@lists.infradead.org cc: stable@kernel.org Link: https://patch.msgid.link/20260422161438.2593376-2-dhowells@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-04-23 12:40:52 -07:00
Ao Zhou	8141a2dc70	net: rds: fix MR cleanup on copy error __rds_rdma_map() hands sg/pages ownership to the transport after get_mr() succeeds. If copying the generated cookie back to user space fails after that point, the error path must not free those resources again before dropping the MR reference. Remove the duplicate unpin/free from the put_user() failure branch so that MR teardown is handled only through the existing final cleanup path. Fixes: `0d4597c8c5` ("net/rds: Track user mapped pages through special API") Cc: stable@kernel.org Reported-by: Yuan Tan <yuantan098@gmail.com> Reported-by: Yifan Wu <yifanwucs@gmail.com> Reported-by: Juefei Pu <tomapufckgml@gmail.com> Reported-by: Xin Liu <bird@lzu.edu.cn> Signed-off-by: Ao Zhou <draw51280@163.com> Signed-off-by: Ren Wei <n05ec@lzu.edu.cn> Reviewed-by: Allison Henderson <achender@kernel.org> Link: https://patch.msgid.link/79c8ef73ec8e5844d71038983940cc2943099baf.1776764247.git.draw51280@163.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-04-23 12:18:08 -07:00
Daniel Palmer	7256eb3e09	m68k: mvme147: Make me the maintainer I'm actively using mainline + patches on this board as a bootloader for another VME board and as a terminal server using a multiport serial board in the same VME backplane. I even have mainline u-boot on real EPROMs. Make me the maintainer of its ethernet, scsi and arch code so I get an email before one or more of them get deleted. Signed-off-by: Daniel Palmer <daniel@thingy.jp> Link: https://patch.msgid.link/20260422132710.2855826-1-daniel@thingy.jp Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-04-23 12:03:25 -07:00
Jiawen Wu	c263f644ad	net: txgbe: fix firmware version check For the device SP, the firmware version is a 32-bit value where the lower 20 bits represent the base version number. And the customized firmware version populates the upper 12 bits with a specific identification number. For other devices AML 25G and 40G, the upper 12 bits of the firmware version is always non-zero, and they have other naming conventions. Only SP devices need to check this to tell if XPCS will work properly. So the judgement of MAC type is added here. And the original logic compared the entire 32-bit value against 0x20010, which caused the outdated base firmwares bypass the version check without a warning. Apply a mask 0xfffff to isolate the lower 20 bits for an accurate base version comparison. Fixes: `ab928c24e6` ("net: txgbe: add FW version warning") Cc: stable@vger.kernel.org Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://patch.msgid.link/C787AA5C07598B13+20260422071837.372731-1-jiawenwu@trustnetic.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-04-23 12:02:59 -07:00
Jakub Kicinski	07811361a3	Merge branch 'tcp-fix-listener-wakeup-after-reuseport-migration' Zhenzhong Wu says: ==================== tcp: fix listener wakeup after reuseport migration This series fixes a missing wakeup when inet_csk_listen_stop() migrates an established child socket from a closing listener to another socket in the same SO_REUSEPORT group after the child has already been queued for accept. The target listener receives the migrated accept-queue entry via inet_csk_reqsk_queue_add(), but its waiters are not notified. Nonblocking accept() still succeeds because it checks the accept queue directly, but readiness-based waiters can remain asleep until another connection generates a wakeup. Patch 1 notifies the target listener after a successful migration in inet_csk_listen_stop() and protects the post-queue_add() nsk accesses with rcu_read_lock()/rcu_read_unlock(). Patch 2 extends the existing migrate_reuseport BPF selftest with epoll readiness checks inside migrate_dance(), around shutdown() where the migration happens. The test now verifies that the target listener is not ready before migration and becomes ready immediately after it, for both TCP_ESTABLISHED and TCP_SYN_RECV. TCP_NEW_SYN_RECV remains excluded because it still depends on later handshake completion. Testing: - On a local unpatched kernel, the focused migrate_reuseport test fails for the listener-migration cases and passes for the TCP_NEW_SYN_RECV cases: not ok 1 IPv4 TCP_ESTABLISHED inet_csk_listen_stop not ok 2 IPv4 TCP_SYN_RECV inet_csk_listen_stop ok 3 IPv4 TCP_NEW_SYN_RECV reqsk_timer_handler ok 4 IPv4 TCP_NEW_SYN_RECV inet_csk_complete_hashdance not ok 5 IPv6 TCP_ESTABLISHED inet_csk_listen_stop not ok 6 IPv6 TCP_SYN_RECV inet_csk_listen_stop ok 7 IPv6 TCP_NEW_SYN_RECV reqsk_timer_handler ok 8 IPv6 TCP_NEW_SYN_RECV inet_csk_complete_hashdance - On a patched kernel booted under QEMU, the full migrate_reuseport selftest passes: ok 1 IPv4 TCP_ESTABLISHED inet_csk_listen_stop ok 2 IPv4 TCP_SYN_RECV inet_csk_listen_stop ok 3 IPv4 TCP_NEW_SYN_RECV reqsk_timer_handler ok 4 IPv4 TCP_NEW_SYN_RECV inet_csk_complete_hashdance ok 5 IPv6 TCP_ESTABLISHED inet_csk_listen_stop ok 6 IPv6 TCP_SYN_RECV inet_csk_listen_stop ok 7 IPv6 TCP_NEW_SYN_RECV reqsk_timer_handler ok 8 IPv6 TCP_NEW_SYN_RECV inet_csk_complete_hashdance SELFTEST_RC=0 ==================== Link: https://patch.msgid.link/20260422024554.130346-1-jt26wzz@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-04-23 11:54:45 -07:00
Zhenzhong Wu	c01cfc4886	selftests/bpf: check epoll readiness during reuseport migration Inside migrate_dance(), add epoll checks around shutdown() to verify that the target listener is not ready before shutdown() and becomes ready immediately after shutdown() triggers migration. Cover TCP_ESTABLISHED and TCP_SYN_RECV. Exclude TCP_NEW_SYN_RECV as it depends on later handshake completion. Suggested-by: Kuniyuki Iwashima <kuniyu@google.com> Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com> Signed-off-by: Zhenzhong Wu <jt26wzz@gmail.com> Link: https://patch.msgid.link/20260422024554.130346-3-jt26wzz@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-04-23 11:54:44 -07:00
Zhenzhong Wu	3864c6ba1e	tcp: call sk_data_ready() after listener migration When inet_csk_listen_stop() migrates an established child socket from a closing listener to another socket in the same SO_REUSEPORT group, the target listener gets a new accept-queue entry via inet_csk_reqsk_queue_add(), but that path never notifies the target listener's waiters. A nonblocking accept() still works because it checks the queue directly, but poll()/epoll_wait() waiters and blocking accept() callers can also remain asleep indefinitely. Call READ_ONCE(nsk->sk_data_ready)(nsk) after a successful migration in inet_csk_listen_stop(). However, after inet_csk_reqsk_queue_add() succeeds, the ref acquired in reuseport_migrate_sock() is effectively transferred to nreq->rsk_listener. Another CPU can then dequeue nreq via accept() or listener shutdown, hit reqsk_put(), and drop that listener ref. Since listeners are SOCK_RCU_FREE, wrap the post-queue_add() dereferences of nsk in rcu_read_lock()/rcu_read_unlock(), which also covers the existing sock_net(nsk) access in that path. The reqsk_timer_handler() path does not need the same changes for two reasons: half-open requests become readable only after the final ACK, where tcp_child_process() already wakes the listener; and once nreq is visible via inet_ehash_insert(), the success path no longer touches nsk directly. Fixes: `54b92e8419` ("tcp: Migrate TCP_ESTABLISHED/TCP_SYN_RECV sockets in accept queues.") Cc: stable@vger.kernel.org Suggested-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com> Signed-off-by: Zhenzhong Wu <jt26wzz@gmail.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20260422024554.130346-2-jt26wzz@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-04-23 11:54:43 -07:00
Kohei Enju	e08a9fac5c	vhost_net: fix sleeping with preempt-disabled in vhost_net_busy_poll() syzbot reported "sleeping function called from invalid context" in vhost_net_busy_poll(). Commit `0308813724` ("vhost_net: basic polling support") introduced a busy-poll loop and preempt_{disable,enable}() around it, where each iteration calls a sleepable function inside the loop. The purpose of disabling preemption was to keep local_clock()-based timeout accounting on a single CPU, rather than as a requirement of busy-poll itself: https://lore.kernel.org/1448435489-5949-4-git-send-email-jasowang@redhat.com From this perspective, migrate_disable() is sufficient here, so replace preempt_disable() with migrate_disable(), avoiding sleepable accesses from a preempt-disabled context. Fixes: `0308813724` ("vhost_net: basic polling support") Tested-by: syzbot+6985cb8e543ea90ba8ee@syzkaller.appspotmail.com Reported-by: syzbot+6985cb8e543ea90ba8ee@syzkaller.appspotmail.com Closes: https://lore.kernel.org/all/69e6a414.050a0220.24bfd3.002d.GAE@google.com/T/ Signed-off-by: Kohei Enju <kohei@enjuk.jp> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-04-23 11:53:31 -07:00
Daniel Borkmann	076b8cad77	ipv6: Cap TLV scan in ip6_tnl_parse_tlv_enc_lim Commit `47d3d7ac65` ("ipv6: Implement limits on Hop-by-Hop and Destination options") added net.ipv6.max_{hbh,dst}_opts_{cnt,len} and applied them in ip6_parse_tlv(), the generic TLV walker invoked from ipv6_destopt_rcv() and ipv6_parse_hopopts(). ip6_tnl_parse_tlv_enc_lim() does not go through ip6_parse_tlv(); it has its own hand-rolled TLV scanner inside its NEXTHDR_DEST branch which looks for IPV6_TLV_TNL_ENCAP_LIMIT. That inner loop is bounded only by optlen, which can be up to 2048 bytes. Stuffing the Destination Options header with 2046 Pad1 (type=0) entries advances the scanner a single byte at a time, yielding ~2000 TLV iterations per extension header. Reusing max_dst_opts_cnt to bound the TLV iterations, matching the semantics from `47d3d7ac65`, would require duplicating ip6_parse_tlv() to also validate Pad1/PadN payload. It would also mandate enforcing max_dst_opts_len, since otherwise an attacker shifts the axis to few options with a giant PadN and recovers the original DoS. Allowing up to 8 options before the tunnel encapsulation limit TLV is liberal enough; in practice encap limit is the first TLV. Thus, go with a hard-coded limit IP6_TUNNEL_MAX_DEST_TLVS (8). Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Justin Iurman <justin.iurman@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-04-23 11:52:07 -07:00
Lee Jones	d293ca716e	tipc: fix double-free in tipc_buf_append() tipc_msg_validate() can potentially reallocate the skb it is validating, freeing the old one. In tipc_buf_append(), it was being called with a pointer to a local variable which was a copy of the caller's skb pointer. If the skb was reallocated and validation subsequently failed, the error handling path would free the original skb pointer, which had already been freed, leading to double-free. Fix this by checking if head now points to a newly allocated reassembled skb. If it does, reassign *headbuf for later freeing operations. Fixes: `d618d09a68` ("tipc: enforce valid ratio between skb truesize and contents") Suggested-by: Tung Nguyen <tung.quang.nguyen@est.tech> Signed-off-by: Lee Jones <lee@kernel.org> Reviewed-by: Tung Nguyen <tung.quang.nguyen@est.tech> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-04-23 11:45:01 -07:00
Ernestas Kulik	864ba40c80	llc: Return -EINPROGRESS from llc_ui_connect() Given a zero sk_sndtimeo, llc_ui_connect() skips waiting for state change and returns 0, confusing userspace applications that will assume the socket is connected, making e.g. getpeername() calls error out. More specifically, the issue was discovered in libcoap, where newly-added AF_LLC socket support was behaving differently from AF_INET connections due to EINPROGRESS handling being skipped. Set rc to -EINPROGRESS if connect() would not block, akin to AF_INET sockets. Signed-off-by: Ernestas Kulik <ernestas.k@iconn-networks.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20260421060304.285419-1-ernestas.k@iconn-networks.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-04-23 11:40:39 -07:00
Ruide Cao	67bf002a2d	ipv4: icmp: validate reply type before using icmp_pointers Extended echo replies use ICMP_EXT_ECHOREPLY as the outbound reply type. That value is outside the range covered by icmp_pointers[], which only describes the traditional ICMP types up to NR_ICMP_TYPES. Avoid consulting icmp_pointers[] for reply types outside that range, and use array_index_nospec() for the remaining in-range lookup. Normal ICMP replies keep their existing behavior unchanged. Fixes: `d329ea5bd8` ("icmp: add response to RFC 8335 PROBE messages") Cc: stable@kernel.org Reported-by: Yuan Tan <yuantan098@gmail.com> Reported-by: Yifan Wu <yifanwucs@gmail.com> Reported-by: Juefei Pu <tomapufckgml@gmail.com> Reported-by: Xin Liu <bird@lzu.edu.cn> Signed-off-by: Ruide Cao <caoruide123@gmail.com> Signed-off-by: Ren Wei <n05ec@lzu.edu.cn> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/0dace90c01a5978e829ca741ef684dbd7304ce62.1776628519.git.caoruide123@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-04-23 11:40:08 -07:00
Jakub Kicinski	7ebc650474	Merge branch 'tcp-symmetric-challenge-ack-for-seg-ack-snd-nxt' Jiayuan Chen says: ==================== tcp: symmetric challenge ACK for SEG.ACK > SND.NXT Commit `354e4aa391` ("tcp: RFC 5961 5.2 Blind Data Injection Attack Mitigation") quotes RFC 5961 Section 5.2 in full, which requires that any incoming segment whose ACK value falls outside [SND.UNA - MAX.SND.WND, SND.NXT] MUST be discarded and an ACK sent back. Linux currently sends that challenge ACK only on the lower edge (SEG.ACK < SND.UNA - MAX.SND.WND); on the symmetric upper edge (SEG.ACK > SND.NXT) the segment is silently dropped with SKB_DROP_REASON_TCP_ACK_UNSENT_DATA. Patch 1 completes the mitigation by emitting a rate-limited challenge ACK on that branch, reusing tcp_send_challenge_ack() and honouring FLAG_NO_CHALLENGE_ACK for consistency with the lower-edge case. It also updates the existing tcp_ts_recent_invalid_ack.pkt selftest, which drives this exact path, to consume the new challenge ACK so bisect stays clean. Patch 2 adds a new packetdrill selftest that exercises RFC 5961 Section 5.2 on both edges of the acceptable window, filling a gap in the selftests tree (neither edge had dedicated coverage before). ==================== Link: https://patch.msgid.link/20260422123605.320000-1-jiayuan.chen@linux.dev Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-04-23 11:04:05 -07:00
Jiayuan Chen	cf94b3c0f0	selftests/net: packetdrill: cover RFC 5961 5.2 challenge ACK on both edges RFC 5961 Section 5.2 / RFC 793 Section 3.9 require a challenge ACK whenever an incoming SEG.ACK falls outside [SND.UNA - MAX.SND.WND, SND.NXT]. There is currently no packetdrill coverage for either edge. Add tcp_rfc5961_ack-out-of-window.pkt, which in a single passive-open connection exercises: - Upper edge (SEG.ACK > SND.NXT): peer ACKs data that was never sent before the server has transmitted anything. - Lower edge (SEG.ACK < SND.UNA - MAX.SND.WND): after the server has sent 2000 bytes (the peer-advertised rwnd forces two 1000-byte segments, both acknowledged), peer sends an ACK that is older than the acceptable window. Both cases must elicit a challenge ACK <SEQ = SND.NXT, ACK = RCV.NXT, CTL = ACK>. The per-socket RFC 5961 Section 7 rate limit is disabled for the duration of the test so that both challenge ACKs can fire back-to-back. Signed-off-by: Jiayuan Chen <jiayuan.chen@linux.dev> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20260422123605.320000-3-jiayuan.chen@linux.dev Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-04-23 11:04:01 -07:00
Jiayuan Chen	42726ec644	tcp: send a challenge ACK on SEG.ACK > SND.NXT RFC 5961 Section 5.2 validates an incoming segment's ACK value against the range [SND.UNA - MAX.SND.WND, SND.NXT] and states: "All incoming segments whose ACK value doesn't satisfy the above condition MUST be discarded and an ACK sent back." Commit `354e4aa391` ("tcp: RFC 5961 5.2 Blind Data Injection Attack Mitigation") opted Linux into this mitigation and implements the challenge ACK on the lower side (SEG.ACK < SND.UNA - MAX.SND.WND), but the symmetric upper side (SEG.ACK > SND.NXT) still takes the pre-RFC-5961 path and silently returns SKB_DROP_REASON_TCP_ACK_UNSENT_DATA, even though RFC 793 Section 3.9 (now RFC 9293 Section 3.10.7.4) has always required: "If the ACK acknowledges something not yet sent (SEG.ACK > SND.NXT) then send an ACK, drop the segment, and return." Complete the mitigation by sending a challenge ACK on that branch, reusing the existing tcp_send_challenge_ack() path which already enforces the per-socket RFC 5961 Section 7 rate limit via __tcp_oow_rate_limited(). FLAG_NO_CHALLENGE_ACK is honoured for symmetry with the lower-edge case. Update the existing tcp_ts_recent_invalid_ack.pkt selftest, which drives this exact path, to consume the new challenge ACK. Fixes: `354e4aa391` ("tcp: RFC 5961 5.2 Blind Data Injection Attack Mitigation") Signed-off-by: Jiayuan Chen <jiayuan.chen@linux.dev> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20260422123605.320000-2-jiayuan.chen@linux.dev Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-04-23 11:04:00 -07:00
Alexey Kodanev	4078c5611d	nfp: fix swapped arguments in nfp_encode_basic_qdr() calls There is a mismatch between the passed arguments and the actual nfp_encode_basic_qdr() function parameter names: static int nfp_encode_basic_qdr(u64 addr, int dest_island, int cpp_tgt, int mode, bool addr40, int isld1, int isld0) { ... But "dest_island" and "cpp_tgt" are swapped at every call-site. For example: return nfp_encode_basic_qdr(addr, cpp_tgt, dest_island, mode, addr40, isld1, isld0); As a result, nfp_encode_basic_qdr() receives "dest_island" as CPP target type, which is always NFP_CPP_TARGET_QDR(2) for these calls, and "cpp_tgt" as the destination island ID, which can accidentally match or be outside the valid NFP_CPP_TARGET_ types (e.g. '-1' for any destination). Since code already worked for years, also add extra pr_warn() to error paths in nfp_encode_basic_qdr() to help identify any potential address verification failures. Detected using the static analysis tool - Svace. Fixes: `4cb584e0ee` ("nfp: add CPP access core") Signed-off-by: Alexey Kodanev <aleksei.kodanev@bell-sw.com> Link: https://patch.msgid.link/20260422160536.61855-1-aleksei.kodanev@bell-sw.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-04-23 11:01:20 -07:00
Ruijie Li	5a8db80f72	net/smc: avoid early lgr access in smc_clc_wait_msg A CLC decline can be received while the handshake is still in an early stage, before the connection has been associated with a link group. The decline handling in smc_clc_wait_msg() updates link-group level sync state for first-contact declines, but that state only exists after link group setup has completed. Guard the link-group update accordingly and keep the per-socket peer diagnosis handling unchanged. This preserves the existing sync_err handling for established link-group contexts and avoids touching link-group state before it is available. Fixes: `0cfdd8f92c` ("smc: connection and link group creation") Cc: stable@kernel.org Reported-by: Yuan Tan <yuantan098@gmail.com> Reported-by: Yifan Wu <yifanwucs@gmail.com> Reported-by: Juefei Pu <tomapufckgml@gmail.com> Reported-by: Xin Liu <bird@lzu.edu.cn> Signed-off-by: Ruijie Li <ruijieli51@gmail.com> Signed-off-by: Ren Wei <n05ec@lzu.edu.cn> Reviewed-by: Dust Li <dust.li@linux.alibaba.com> Link: https://patch.msgid.link/08c68a5c817acf198cce63d22517e232e8d60718.1776850759.git.ruijieli51@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-04-23 11:00:57 -07:00
Dexuan Cui	3d1f20727a	hv_sock: Return -EIO for malformed/short packets Commit `f631529589` fixes a regression, however it fails to report an error for malformed/short packets -- normally we should never see such packets, but let's report an error for them just in case. Fixes: `f631529589` ("hv_sock: Report EOF instead of -EIO for FIN") Cc: stable@vger.kernel.org Signed-off-by: Dexuan Cui <decui@microsoft.com> Acked-by: Stefano Garzarella <sgarzare@redhat.com> Link: https://patch.msgid.link/20260423064811.1371749-1-decui@microsoft.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-04-23 10:53:16 -07:00
Brett Creeley	3bc06da858	virtio_net: sync rss_trailer.max_tx_vq on queue_pairs change via VQ_PAIRS_SET When netif_is_rxfh_configured() is true (i.e., the user has explicitly configured the RSS indirection table), virtnet_set_queues() skips the RSS update path and falls through to the VIRTIO_NET_CTRL_MQ_VQ_PAIRS_SET command to change the number of queue pairs. However, it does not update vi->rss_trailer.max_tx_vq to reflect the new queue_pairs value. This causes a mismatch between vi->curr_queue_pairs and vi->rss_trailer.max_tx_vq. Any subsequent RSS reconfiguration (e.g., via ethtool -X) calls virtnet_commit_rss_command(), which sends the stale max_tx_vq to the device, silently reverting the queue count. Reproduction: 1. User configured RSS ethtool -X eth0 equal 8 2. VQ_PAIRS_SET path; max_tx_vq stays 16 ethtool -L eth0 combined 12 3. RSS commit uses max_tx_vq=16 instead of 12 ethtool -X eth0 equal 4 Fix this by updating vi->rss_trailer.max_tx_vq after a successful VQ_PAIRS_SET command when RSS is enabled, keeping it in sync with curr_queue_pairs. Fixes: `50bfcaedd7` ("virtio_net: Update rss when set queue") Signed-off-by: Brett Creeley <brett.creeley@amd.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Link: https://patch.msgid.link/20260416212121.29073-1-brett.creeley@amd.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-04-23 09:35:53 -07:00
Paolo Abeni	d40831b016	Merge branch 'mptcp-sync-the-msk-sndbuf-at-accept-time' Matthieu Baerts says: ==================== mptcp: sync the msk->sndbuf at accept() time On passive MPTCP connections, the MPTCP socket send buffer doesn't have the expected size at accept() time. Patch 1 fixes the regression introduced in v6.7, while the following one validates the fix in the selftests. Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> ==================== Link: https://patch.msgid.link/20260420-net-mptcp-sync-sndbuf-accept-v1-0-e3523e3aeb44@kernel.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2026-04-23 13:20:25 +02:00
Gang Yan	d0576eb850	selftests: mptcp: add a check for sndbuf of S/C Add a new chk_sndbuf() helper to diag.sh that extracts the sndbuf (the 'tb' field from 'ss -m' skmem output) for both server and client MPTCP sockets, and verifies they are equal. Without the previous patch, it will fail: ''' 07 ....chk sndbuf server/client [FAIL] sndbuf S=20480 != C=2630656 ''' Signed-off-by: Gang Yan <yangang@kylinos.cn> Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://patch.msgid.link/20260420-net-mptcp-sync-sndbuf-accept-v1-2-e3523e3aeb44@kernel.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2026-04-23 13:20:17 +02:00
Gang Yan	fcf04b1433	mptcp: sync the msk->sndbuf at accept() time On passive MPTCP connections, the msk sndbuf is not updated correctly. The root cause is an order issue in the accept path: - tcp_check_req() -> subflow_syn_recv_sock() -> mptcp_sk_clone_init() calls __mptcp_propagate_sndbuf() to copy the ssk sndbuf into msk - Later, tcp_child_process() -> tcp_init_transfer() -> tcp_sndbuf_expand() grows the ssk sndbuf. So __mptcp_propagate_sndbuf() runs before the ssk sndbuf has been expanded and the msk ends up with a much smaller sndbuf than the subflow: MPTCP: msk->sndbuf:20480, msk->first->sndbuf:2626560 Fix this by moving the __mptcp_propagate_sndbuf() call from mptcp_sk_clone_init() -- the ssk sndbuf is not yet finalized there -- to __mptcp_propagate_sndbuf() at accept() time, when the ssk sndbuf has been fully expanded by tcp_sndbuf_expand(). Fixes: `8005184fd1` ("mptcp: refactor sndbuf auto-tuning") Cc: stable@vger.kernel.org Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/602 Signed-off-by: Gang Yan <yangang@kylinos.cn> Acked-by: Paolo Abeni <pabeni@redhat.com> Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://patch.msgid.link/20260420-net-mptcp-sync-sndbuf-accept-v1-1-e3523e3aeb44@kernel.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2026-04-23 13:20:17 +02:00
Stefano Garzarella	1cb36e2522	vsock/virtio: fix MSG_ZEROCOPY pinned-pages accounting virtio_transport_init_zcopy_skb() uses iter->count as the size argument for msg_zerocopy_realloc(), which in turn passes it to mm_account_pinned_pages() for RLIMIT_MEMLOCK accounting. However, this function is called after virtio_transport_fill_skb() has already consumed the iterator via __zerocopy_sg_from_iter(), so on the last skb, iter->count will be 0, skipping the RLIMIT_MEMLOCK enforcement. Pass pkt_len (the total bytes being sent) as an explicit parameter to virtio_transport_init_zcopy_skb() instead of reading the already-consumed iter->count. This matches TCP and UDP, which both call msg_zerocopy_realloc() with the original message size. Fixes: `581512a6dc` ("vsock/virtio: MSG_ZEROCOPY flag support") Reported-by: Yiming Qian <yimingqian591@gmail.com> Signed-off-by: Stefano Garzarella <sgarzare@redhat.com> Reviewed-by: Bobby Eshleman <bobbyeshleman@meta.com> Link: https://patch.msgid.link/20260420132051.217589-1-sgarzare@redhat.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2026-04-23 13:03:21 +02:00
Paolo Abeni	42ea37b077	Merge branch 'net-mana-fix-probe-remove-error-path-bugs' Erni Sri Satya Vennela says: ==================== net: mana: Fix probe/remove error path bugs Fix five bugs in mana_probe()/mana_remove() error handling that can cause warnings on uninitialized work structs, NULL pointer dereferences, masked errors, and resource leaks when early probe steps fail. Patches 1-2 move work struct initialization (link_change_work and gf_stats_work) to before any error path that could trigger mana_remove(), preventing WARN_ON in __flush_work() or debug object warnings when sync cancellation runs on uninitialized work structs. Patch 3 guards mana_remove() against double invocation. If PM resume fails, mana_probe() calls mana_remove() which sets gdma_context and driver_data to NULL. A failed resume does not unbind the driver, so when the device is eventually unbound, mana_remove() is called again and dereferences NULL, causing a kernel panic. An early return on NULL gdma_context or driver_data makes the second call harmless. Patch 4 prevents add_adev() from overwriting a port probe error, which could leave the driver in a broken state with NULL ports while reporting success. Patch 5 changes 'goto out' to 'break' in mana_remove()'s port loop so that mana_destroy_eq() is always reached, preventing EQ leaks when a NULL port is encountered. ==================== Link: https://patch.msgid.link/20260420124741.1056179-1-ernis@linux.microsoft.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2026-04-23 12:49:16 +02:00
Erni Sri Satya Vennela	65267c9c4f	net: mana: Fix EQ leak in mana_remove on NULL port In mana_remove(), when a NULL port is encountered in the port iteration loop, 'goto out' skips the mana_destroy_eq(ac) call, leaking the event queues allocated earlier by mana_create_eq(). This can happen when mana_probe_port() fails for port 0, leaving ac->ports[0] as NULL. On driver unload or error cleanup, mana_remove() hits the NULL entry and jumps past mana_destroy_eq(). Change 'goto out' to 'break' so the for-loop exits normally and mana_destroy_eq() is always reached. Remove the now-unreferenced out: label. Fixes: `1e2d0824a9` ("net: mana: Add support for EQ sharing") Signed-off-by: Erni Sri Satya Vennela <ernis@linux.microsoft.com> Link: https://patch.msgid.link/20260420124741.1056179-6-ernis@linux.microsoft.com Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2026-04-23 12:49:13 +02:00
Erni Sri Satya Vennela	a7fdaf069b	net: mana: Don't overwrite port probe error with add_adev result In mana_probe(), if mana_probe_port() fails for any port, the error is stored in 'err' and the loop breaks. However, the subsequent unconditional 'err = add_adev(gd, "eth")' overwrites this error. If add_adev() succeeds, mana_probe() returns success despite ports being left in a partially initialized state (ac->ports[i] == NULL). Only call add_adev() when there is no prior error, so the probe correctly fails and triggers mana_remove() cleanup. Fixes: `a69839d432` ("net: mana: Add support for auxiliary device") Signed-off-by: Erni Sri Satya Vennela <ernis@linux.microsoft.com> Link: https://patch.msgid.link/20260420124741.1056179-5-ernis@linux.microsoft.com Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2026-04-23 12:49:13 +02:00
Erni Sri Satya Vennela	50271d7ec9	net: mana: Guard mana_remove against double invocation If PM resume fails (e.g., mana_attach() returns an error), mana_probe() calls mana_remove(), which tears down the device and sets gd->gdma_context = NULL and gd->driver_data = NULL. However, a failed resume callback does not automatically unbind the driver. When the device is eventually unbound, mana_remove() is invoked a second time. Without a NULL check, it dereferences gc->dev with gc == NULL, causing a kernel panic. Add an early return if gdma_context or driver_data is NULL so the second invocation is harmless. Move the dev = gc->dev assignment after the guard so it cannot dereference NULL. Fixes: `635096a86e` ("net: mana: Support hibernation and kexec") Signed-off-by: Erni Sri Satya Vennela <ernis@linux.microsoft.com> Link: https://patch.msgid.link/20260420124741.1056179-4-ernis@linux.microsoft.com Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2026-04-23 12:49:13 +02:00
Erni Sri Satya Vennela	6e8bc03349	net: mana: Init gf_stats_work before potential error paths in probe Move INIT_DELAYED_WORK(gf_stats_work) to before mana_create_eq(), while keeping schedule_delayed_work() at its original location. Previously, if any function between mana_create_eq() and the INIT_DELAYED_WORK call failed, mana_probe() would call mana_remove() which unconditionally calls cancel_delayed_work_sync(gf_stats_work) in __flush_work() or debug object warnings with CONFIG_DEBUG_OBJECTS_WORK enabled. Fixes: `be4f1d67ec` ("net: mana: Add standard counter rx_missed_errors") Signed-off-by: Erni Sri Satya Vennela <ernis@linux.microsoft.com> Link: https://patch.msgid.link/20260420124741.1056179-3-ernis@linux.microsoft.com Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2026-04-23 12:49:13 +02:00
Erni Sri Satya Vennela	cb4a90744b	net: mana: Init link_change_work before potential error paths in probe Move INIT_WORK(link_change_work) to right after the mana_context allocation, before any error path that could reach mana_remove(). Previously, if mana_create_eq() or mana_query_device_cfg() failed, mana_probe() would jump to the error path which calls mana_remove(). mana_remove() unconditionally calls disable_work_sync(link_change_work), but the work struct had not been initialized yet. This can trigger CONFIG_DEBUG_OBJECTS_WORK enabled. Fixes: `54133f9b4b` ("net: mana: Support HW link state events") Signed-off-by: Erni Sri Satya Vennela <ernis@linux.microsoft.com> Link: https://patch.msgid.link/20260420124741.1056179-2-ernis@linux.microsoft.com Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2026-04-23 12:49:13 +02:00
Breno Leitao	7079c8c13f	netconsole: avoid out-of-bounds access on empty string in trim_newline() trim_newline() unconditionally dereferences s[len - 1] after computing len = strnlen(s, maxlen). When the string is empty, len is 0 and the expression underflows to s[(size_t)-1], reading (and potentially writing) one byte before the buffer. The two callers feed trim_newline() with the result of strscpy() from configfs store callbacks (dev_name_store, userdatum_value_store). configfs guarantees count >= 1 reaches the callback, but the byte itself can be NUL: a userspace write(fd, "\0", 1) leaves the destination empty after strscpy() and triggers the underflow. The OOB write only fires if the adjacent byte happens to be '\n', so this is not a security issue, but the access is undefined behaviour either way. This pattern is commonly flagged by LLM-based code reviewers. While it is not a security fix, the underlying access is undefined behaviour and the change is small and self-contained, so it is a reasonable candidate for the stable trees. Guard the dereference on a non-zero length. Fixes: `ae001dc679` ("net: netconsole: move newline trimming to function") Cc: stable@vger.kernel.org Signed-off-by: Breno Leitao <leitao@debian.org> Reviewed-by: Gustavo Luiz Duarte <gustavold@gmail.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20260420-netcons_trim_newline-v1-1-dc35889aeedf@debian.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2026-04-23 12:45:02 +02:00
Paolo Abeni	063571ab9f	Merge branch 'net-airoha-fix-null-pointer-derefrences-in-airoha_qdma_cleanup' Lorenzo Bianconi says: ==================== net: airoha: Fix NULL pointer derefrences in airoha_qdma_cleanup() Fix two possible NULL pointer derefrences in airoha_qdma_cleanup routine if airoha_qdma_init() fails. v1: https://lore.kernel.org/r/20260417-airoha_qdma_init_rx_queue-fix-v1-0-db9fa5e468e5@kernel.org ==================== Link: https://patch.msgid.link/20260420-airoha_qdma_init_rx_queue-fix-v2-0-d99347e5c18d@kernel.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2026-04-23 12:21:12 +02:00
Lorenzo Bianconi	4b91cb6578	net: airoha: Add size check for TX NAPIs in airoha_qdma_cleanup() If airoha_qdma_init routine fails before airoha_qdma_tx_irq_init() runs successfully for all TX NAPIs, airoha_qdma_cleanup() will unconditionally runs netif_napi_del() on TX NAPIs, triggering a NULL pointer dereference. Fix the issue relying on q_tx_irq size value to check if the TX NAPIs is properly initialized in airoha_qdma_cleanup(). Moreover, run netif_napi_add_tx() just if irq_q queue is properly allocated. Fixes: `23020f0493` ("net: airoha: Introduce ethernet support for EN7581 SoC") Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Link: https://patch.msgid.link/20260420-airoha_qdma_init_rx_queue-fix-v2-2-d99347e5c18d@kernel.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2026-04-23 12:17:35 +02:00
Lorenzo Bianconi	379050947a	net: airoha: Move ndesc initialization at end of airoha_qdma_init_rx_queue() If queue entry or DMA descriptor list allocation fails in airoha_qdma_init_rx_queue routine, airoha_qdma_cleanup() will trigger a NULL pointer dereference running netif_napi_del() for RX queue NAPIs since netif_napi_add() has never been executed to this particular RX NAPI. The issue is due to the early ndesc initialization in airoha_qdma_init_rx_queue() since airoha_qdma_cleanup() relies on ndesc value to check if the queue is properly initialized. Fix the issue moving ndesc initialization at end of airoha_qdma_init_tx routine. Move page_pool allocation after descriptor list allocation in order to avoid memory leaks if desc allocation fails. Fixes: `23020f0493` ("net: airoha: Introduce ethernet support for EN7581 SoC") Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Link: https://patch.msgid.link/20260420-airoha_qdma_init_rx_queue-fix-v2-1-d99347e5c18d@kernel.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2026-04-23 12:17:35 +02:00
Longxuan Yu	7dddc74af3	8021q: delete cleared egress QoS mappings vlan_dev_set_egress_priority() currently keeps cleared egress priority mappings in the hash as tombstones. Repeated set/clear cycles with distinct skb priorities therefore accumulate mapping nodes until device teardown and leak memory. Delete mappings when vlan_prio is cleared instead of keeping tombstones. Now that the egress mapping lists are RCU protected, the node can be unlinked safely and freed after a grace period. Fixes: `1da177e4c3` ("Linux-2.6.12-rc2") Cc: stable@kernel.org Reported-by: Yifan Wu <yifanwucs@gmail.com> Reported-by: Juefei Pu <tomapufckgml@gmail.com> Reported-by: Xin Liu <bird@lzu.edu.cn> Co-developed-by: Yuan Tan <yuantan098@gmail.com> Signed-off-by: Yuan Tan <yuantan098@gmail.com> Signed-off-by: Longxuan Yu <ylong030@ucr.edu> Signed-off-by: Ren Wei <n05ec@lzu.edu.cn> Link: https://patch.msgid.link/ecfa6f6ce2467a42647ff4c5221238ae85b79a59.1776647968.git.yuantan098@gmail.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2026-04-23 12:13:57 +02:00
Longxuan Yu	fc69decc81	8021q: use RCU for egress QoS mappings The TX fast path and reporting paths walk egress QoS mappings without RTNL. Convert the mapping lists to RCU-protected pointers, use RCU reader annotations in readers, and defer freeing mapping nodes with an embedded rcu_head. This prepares the egress QoS mapping code for safe removal of mapping nodes in a follow-up change while preserving the current behavior. Co-developed-by: Yuan Tan <yuantan098@gmail.com> Signed-off-by: Yuan Tan <yuantan098@gmail.com> Signed-off-by: Longxuan Yu <ylong030@ucr.edu> Signed-off-by: Ren Wei <n05ec@lzu.edu.cn> Link: https://patch.msgid.link/9136768189f8c6d3f824f476c62d2fa1111688e8.1776647968.git.yuantan098@gmail.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2026-04-23 12:13:57 +02:00
Paolo Abeni	5a5db99c34	netfilter pull request 26-04-20 -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEEjF9xRqF1emXiQiqU1w0aZmrPKyEFAmnmnwYACgkQ1w0aZmrP KyE1lg//VKRxQCN9R0XQPrqS/Dvz5GuNcHYtGkq1DZQIqGmaLLarZMmTN7b+iZNk +JHdzzd2B88IuYcorxoxu9JTUC+BdQnw+PP8WWUFrW6vaU5sMDvYC0vOp9/gybl2 D7xIH+HCeepGJz4SvdNowxXXSTnyvjl4h85G4kJLKScAe3KB1/t/TcKl3xJcJ8eb 8eTmJSt15F7QAom+vMGdRe8NlQrm9FVphW3CntBN4Hzc7+GwuIbk+KoXivcbgu+f hHGm/TpclSmOpnIkjLvyI6OBty9ubD1wtJcoqF6toDYUytdvi7pxQ103YQdIENSR snuQcXXXtkqaIkXGU3nXBVdfhIFzSVn8Y8imUhtLHcUfJlZSg1rrZu+YoseAJ9MR CnWDk0cTI5nHLpqNUJ4tFnUURfJYFev1ebeeoZpTM7ScK/5Vy0OUtjswdCntn7j2 mdb6ZlB6RTjl7blelk/A4WSImSplhSCy6vvlxa1ysApP+eq6zr2+Sh+nuUVIa8F8 8uplN5keUrozZ+hGolfS5Qrd9BtjBlINOx0T272aYHoiDDUXeXPaA0c63M85B1I7 VxUxUYyxBHCiYoMHzvUeat6KAMzLGA9jNCVgIDlBEaRtrI0SH99hUob8GuPAfySM 3aruUoNdzAspRigBlEKk4HrxdO5QLwVNYjQncTF+iYGEKI3E1vg= =6RJG -----END PGP SIGNATURE----- Merge tag 'nf-26-04-20' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf Pablo Neira Ayuso says: ==================== Netfilter/IPVS fixes for net The following batch contains Netfilter/IPVS fixes for net: 1) nft_osf actually only supports IPv4, restrict it. 2) Address possible division by zero in nfnetlink_osf, from Xiang Mei. 3) Remove unsafe use of sprintf to fix possible buffer overflow in the SIP NAT helper, from Florian Westphal. 4) Restrict xt_mac, xt_owner and xt_physdev to inet families only; xt_realm is only for ipv4, otherwise null-pointer-deref is possible. 5) Use kfree_rcu() in nat core to release hooks, this can be an issue once nfnetlink_hook gets support to dump NAT hook information, not currently a real issue but better fix it now. From Florian Westphal. 6) Fix MTU checks in IPVS, from Yingnan Zhang. 7) Fix possible out-of-bounds when matching TCP options in nfnetlink_osf, from Fernando Fernandez Mancera. 8) Fix potential nul-ptr-deref in ttl check in nfnetlink_osf, remove useless loop to fix this, also from Fernando. This is a smaller batch, there are more patches pending in the queue to arm another pull request as soon as this is considered good enough. AI might complain again about one more issue regarding osf and big-endian arches in osf but this batch is targetting crash fixes for osf at this stage. netfilter pull request 26-04-20 * tag 'nf-26-04-20' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf: netfilter: nfnetlink_osf: fix potential NULL dereference in ttl check netfilter: nfnetlink_osf: fix out-of-bounds read on option matching ipvs: fix MTU check for GSO packets in tunnel mode netfilter: nat: use kfree_rcu to release ops netfilter: xtables: restrict several matches to inet family netfilter: conntrack: remove sprintf usage netfilter: nfnetlink_osf: fix divide-by-zero in OSF_WSS_MODULO netfilter: nft_osf: restrict it to ipv4 ==================== Link: https://patch.msgid.link/20260420220215.111510-1-pablo@netfilter.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2026-04-23 11:20:38 +02:00
Mieczyslaw Nalewaj	0c078021d3	net: dsa: realtek: rtl8365mb: fix mode mask calculation The RTL8365MB_DIGITAL_INTERFACE_SELECT_MODE_MASK macro was shifting the 4-bit mask (0xF) by only (_extint % 2) bits instead of (_extint % 2) * 4. This caused the mask to overlap with the adjacent nibble when configuring odd-numbered external interfaces, selecting the wrong bits entirely. Align the shift calculation with the existing ...MODE_OFFSET macro. Fixes: `4af2950c50` ("net: dsa: realtek-smi: add rtl8365mb subdriver for RTL8365MB-VC") Signed-off-by: Abdulkader Alrezej <alrazj.abdulkader@gmail.com> Signed-off-by: Mieczyslaw Nalewaj <namiltd@yahoo.com> Reviewed-by: Luiz Angelo Daros de Luca <luizluca@gmail.com> Link: https://patch.msgid.link/400a6387-a444-4576-af6d-26be5410bce3@yahoo.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2026-04-23 10:50:33 +02:00
Paolo Abeni	084a39af97	Merge branch 'net-airoha-fix-airoha_qdma_cleanup_tx_queue-processing' Lorenzo Bianconi says: ==================== net: airoha: Fix airoha_qdma_cleanup_tx_queue() processing Add missing bits in airoha_qdma_cleanup_tx_queue routine. Fix airoha_qdma_cleanup_tx_queue processing errors intorduced in commit '3f47e67dff1f7 ("net: airoha: Add the capability to consume out-of-order DMA tx descriptors")'. v3: https://lore.kernel.org/r/20260416-airoha_qdma_cleanup_tx_queue-fix-net-v3-0-2b69f5788580@kernel.org v2: https://lore.kernel.org/r/20260414-airoha_qdma_cleanup_tx_queue-fix-net-v2-1-875de57cc022@kernel.org v1: https://lore.kernel.org/r/20260410-airoha_qdma_cleanup_tx_queue-fix-net-v1-1-b7171c8f1e78@kernel.org ==================== Link: https://patch.msgid.link/20260417-airoha_qdma_cleanup_tx_queue-fix-net-v4-0-e04bcc2c9642@kernel.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2026-04-23 09:08:00 +02:00
Lorenzo Bianconi	3309965fe4	net: airoha: Add missing bits in airoha_qdma_cleanup_tx_queue() Similar to airoha_qdma_cleanup_rx_queue(), reset DMA TX descriptors in airoha_qdma_cleanup_tx_queue routine. Moreover, reset TX_DMA_IDX to TX_CPU_IDX to notify the NIC the QDMA TX ring is empty. Fixes: `23020f0493` ("net: airoha: Introduce ethernet support for EN7581 SoC") Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Link: https://patch.msgid.link/20260417-airoha_qdma_cleanup_tx_queue-fix-net-v4-2-e04bcc2c9642@kernel.org Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2026-04-23 09:07:57 +02:00
Lorenzo Bianconi	f329924bb4	net: airoha: Move ndesc initialization at end of airoha_qdma_init_tx() If queue entry list allocation fails in airoha_qdma_init_tx_queue routine, airoha_qdma_cleanup_tx_queue() will trigger a NULL pointer dereference accessing the queue entry array. The issue is due to the early ndesc initialization in airoha_qdma_init_tx_queue(). Fix the issue moving ndesc initialization at end of airoha_qdma_init_tx routine. Fixes: `3f47e67dff` ("net: airoha: Add the capability to consume out-of-order DMA tx descriptors") Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Link: https://patch.msgid.link/20260417-airoha_qdma_cleanup_tx_queue-fix-net-v4-1-e04bcc2c9642@kernel.org Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2026-04-23 09:07:57 +02:00
Eric Dumazet	1ada03fdef	net/sched: sch_sfb: annotate data-races in sfb_dump_stats() sfb_dump_stats() only runs with RTNL held, reading fields that can be changed in qdisc fast path. Add READ_ONCE()/WRITE_ONCE() annotations. Alternative would be to acquire the qdisc spinlock, but our long-term goal is to make qdisc dump operations lockless as much as we can. tc_sfb_xstats fields don't need to be latched atomically, otherwise this bug would have been caught earlier. Fixes: `edb09eb17e` ("net: sched: do not acquire qdisc spinlock in qdisc/class stats dump") Signed-off-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20260421141655.3953721-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-04-22 21:12:59 -07:00
Eric Dumazet	a8f5192809	net/sched: sch_red: annotate data-races in red_dump_stats() red_dump_stats() only runs with RTNL held, reading fields that can be changed in qdisc fast path. Add READ_ONCE()/WRITE_ONCE() annotations. Alternative would be to acquire the qdisc spinlock, but our long-term goal is to make qdisc dump operations lockless as much as we can. tc_red_xstats fields don't need to be latched atomically, otherwise this bug would have been caught earlier. Fixes: `edb09eb17e` ("net: sched: do not acquire qdisc spinlock in qdisc/class stats dump") Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Jamal Hadi Salim <jhs@mojatatu.com> Link: https://patch.msgid.link/20260421142309.3964322-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-04-22 21:12:54 -07:00

1 2 3 4 5 ...

1434459 Commits