Commit Graph

1446786 Commits

Author SHA1 Message Date
Linus Torvalds
70390501d1 Miscellaneous x86 fixes:
- Fix memory map enumeration bug in the Xen e820 parsing code
    (Juergen Gross)
 
  - Re-enable e820 BIOS fallback if e820 table is empty
    (David Gow)
 
 Signed-off-by: Ingo Molnar <mingo@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAmn+lgARHG1pbmdvQGtl
 cm5lbC5vcmcACgkQEnMQ0APhK1iSlQ//dDeRK+FeY9zb2ehi07FwpAsbOyDRCOwh
 SC99VYU1+Gl9YtmQj3rG2KSiL2gkf3llZ1Rai6EPJDHsBEmOa4bDgwPnW4+/rAoR
 +putuelDn+qEUkaSB8Rnx4eewQ5vHzKnxdUt7K2aeuntCOBxdVSo7uPJML7v3r6p
 eVpsqHmcFfsJkr3p38zuv9CzGjIxRprnNhT+pbSZ+GHjk2ugJIr6dz6QFL9eNt4n
 PEvYLttPtJepFHq18+1LS+E8OkcRvf1Ub0X3YtHfBJ/yANC4/BlQumGCLCd4UN0w
 uRABVW/JifxVbza6/icJouemS7nXZMTLe5wvh8HXHWOKbPjz9pTNq49MjOepkNfx
 0ysZNiBGqOfSbjFav2T5HBpM2TMvxmPWf/pyP8b2RqRFWLxvCpnI0c1FDg8gCIRs
 px9pAr75Zw7BrZkWjw1v1xkKMwlKK9KX2s30d3McaLUvcqebjaC7EJu9wXHuMwe6
 V6uE3hEDn7NUtewShUgFmsA4mxFrP7vM81lXjIwOpgvFPdBS4a+0idVJJDwiciQ9
 Lq1D2D9QxfieZ2WaRJBfWo69V981veaOerbbImdMbkowYeasV5ZDOG5qDk2JVeFh
 V5CBt67Hs0EtK/tiQLitXLKIdTq1I6PzZXybRoW0Viks/Ul6xz1BYWEHEzM1S/Co
 FYnleI7mxH4=
 =+FLg
 -----END PGP SIGNATURE-----

Merge tag 'x86-urgent-2026-05-09' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 fixes from Ingo Molnar:

 - Fix memory map enumeration bug in the Xen e820 parsing code (Juergen
   Gross)

 - Re-enable e820 BIOS fallback if e820 table is empty (David Gow)

* tag 'x86-urgent-2026-05-09' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/boot/e820: Re-enable BIOS fallback if e820 table is empty
  x86/xen: Fix a potential problem in xen_e820_resolve_conflicts()
2026-05-08 20:28:45 -07:00
Zhan Xusheng
d1aabc2132 ntfs: fix missing kstrdup() error check in ntfs_write_volume_label()
ntfs_write_volume_label() does not check the return value of
kstrdup().  If the allocation fails, vol->volume_label is set to
NULL while the function returns success.  A subsequent
FS_IOC_GETFSLABEL then returns an empty string even though the
on-disk label was updated correctly.

Fix by allocating the new label before taking vol_ni->mrec_lock and
updating any on-disk metadata, so an -ENOMEM from kstrdup() leaves
both the in-memory and on-disk labels untouched and consistent.  On
success the preallocated copy replaces the old vol->volume_label.
Also move mark_inode_dirty_sync() into the success path so that it
is not called when no metadata was actually modified.

Fixes: 6251f0b0de ("ntfs: update super block operations")
Suggested-by: Hyunchul Lee <hyc.lee@gmail.com>
Signed-off-by: Zhan Xusheng <zhanxusheng@xiaomi.com>
Reviewed-by: Hyunchul Lee <hyc.lee@gmail.com>
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
2026-05-09 12:12:59 +09:00
Linus Torvalds
6e1e5a33e8 Fix CPU hotplug activation race in the timer migration code,
by Frederic Weisbecker.
 
 Signed-off-by: Ingo Molnar <mingo@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAmn+lQYRHG1pbmdvQGtl
 cm5lbC5vcmcACgkQEnMQ0APhK1iNhhAAmzI4kh9kulol1M73u+LiQTdx0zMcDcSM
 NiHeZ+QbNNMqkfnj6xxOVGViPLTVSXpa+dU4hdeHM3n7vYnlGDiHBm4EM95z7S28
 Jk39z6Wq/0vmR938rhVVNgMbyx3N+rHV15ReyY88vq6OcI4SdJ9sin72ZW8piuBi
 EeHZLOyPsz+qf4tG6UFHP3uzM7fQZ9DrVxzcDZE4+xbVpk1kXLuAGo89wXF0GbUX
 NhUYtHakvARehLg3N+0qo/DUGRXYp1GF+qfIXH7V6g91eMhPureUrhzsKVmbu5BL
 OXMMskWG20LhySrLX+fKDk8DH3n3G+dTFCgM5kc6JaR4vLT0prEJHeVsIX+ydOLF
 sZqqkaVhKvFNlxjrqbjRNBV09/Cd9XN1wYUqnBeh55qzB+yoOnfS9+z7dsguf38K
 YKiCmlt2T68NL5/ZUvXF3ghm5ic7dTyZFUyM/abo7rSTeGByh5vruD19r2zp0C1/
 hgSPmAm8fuJAzQCRUzXxDsY5hLhKtv+1h9+Y9humakzfxUdneydsHigz5YjZ6tzI
 E8LyLsBZnSB6IKZfVetwpg0xTxB75GtSwa+p0XUg2+q2ss94mYmWMFM+ZKWxaTYq
 yI5paLXZT1DO9Fx/4gsCDlUrR1y096GL6O/1t3TOUsoJ/9kEFvOL/ZF4sjEnWrch
 CYI+DPoUOUA=
 =UHOI
 -----END PGP SIGNATURE-----

Merge tag 'timers-urgent-2026-05-09' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull timer fix from Ingo Molnar:
 "Fix CPU hotplug activation race in the timer migration code, by
  Frederic Weisbecker"

* tag 'timers-urgent-2026-05-09' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  timers/migration: Fix another hotplug activation race
2026-05-08 20:03:39 -07:00
Linus Torvalds
7f00232152 Miscellaneous scheduler fixes:
- Fix spurious failures in rseq self-tests (Mark Brown)
 
  - Fix rseq rseq::cpu_id_start ABI regression due to
    TCMalloc's creative use of the supposedly read-only field.
    The fix is to introduce a new ABI variant based on
    a new (larger) rseq area registration size, to keep
    the TCMalloc use of rseq backwards compatible on new kernels.
    (Thomas Gleixner)
 
  - Fix wakeup_preempt_fair() for not waking up task (Vincent Guittot)
 
  - Fix s64 mult overflow in vruntime_eligible() (Zhan Xusheng)
 
 Signed-off-by: Ingo Molnar <mingo@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAmn+lBARHG1pbmdvQGtl
 cm5lbC5vcmcACgkQEnMQ0APhK1iy1hAAunlBoDq8/MXSt4JeMRX/3p+CKihExTnO
 LO535Rv8DfcepBgZIysTJKMn9bM/l+7OXGdQ+YDjS70GsLM2aOzDYBKCwOHHm0pZ
 OJ6Y+UUFAacnQS4EuQLqyNBW0Ice4AIYWu0pLLADs2KUgX1DmSo9bhgZbHcbsMnA
 IjoaFNhebeA1bHSDD11UIHTza23mqEinxM0yOK8pT+M6fOMXWOo/kLLLYjG/yAIB
 qBFGpwJkdKjBcpCmAYU9jpw26p/17YMzkgmAaUXOKRLZi+h5zQMNVjR+OIjK4qxt
 z5Tj+h7t3IcFV2d1zUThPpxxHLn3ro30R5mW0OrsPPFI8AkSRC6GsIX/Ft9uFbQQ
 1SGknyx5qLrSldmT8KKXPlM/vriyh3iL6/QMgXtTb8FfegRCbjXsZy39s3wOexCD
 oBnJt0rX3NviPyb/Up9cfdx++kPfJM074NdVHRBW4ucoOpwHosNcuBP0YQsctFw0
 QO7lkfjTo19eB9ftSyfadwF9e+2jYF7YoLmTOKqGLbZEQeIT5tJHDgHklLWrqHAh
 HDmDCHyqtiXBDCeapqnomrhqczKlSUK1Qqk/Hh+Hwiwj0N4vLhHXmRIpTb/R43/Y
 i6cdwV8Xtl+RltmYaINpMxdnFl/iz/kpapYkqy/ykLNfBj01AWbdaIVPRk47ndU1
 E29BSjMZhPU=
 =2QJp
 -----END PGP SIGNATURE-----

Merge tag 'sched-urgent-2026-05-09' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull scheduler fixes from Ingo Molnar:

 - Fix spurious failures in rseq self-tests (Mark Brown)

 - Fix rseq rseq::cpu_id_start ABI regression due to TCMalloc's creative
   use of the supposedly read-only field

   The fix is to introduce a new ABI variant based on a new (larger)
   rseq area registration size, to keep the TCMalloc use of rseq
   backwards compatible on new kernels (Thomas Gleixner)

 - Fix wakeup_preempt_fair() for not waking up task (Vincent Guittot)

 - Fix s64 mult overflow in vruntime_eligible() (Zhan Xusheng)

* tag 'sched-urgent-2026-05-09' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  sched/fair: Fix wakeup_preempt_fair() for not waking up task
  sched/fair: Fix overflow in vruntime_eligible()
  selftests/rseq: Expand for optimized RSEQ ABI v2
  rseq: Reenable performance optimizations conditionally
  rseq: Implement read only ABI enforcement for optimized RSEQ V2 mode
  selftests/rseq: Validate legacy behavior
  selftests/rseq: Make registration flexible for legacy and optimized mode
  selftests/rseq: Skip tests if time slice extensions are not available
  rseq: Revert to historical performance killing behaviour
  rseq: Don't advertise time slice extensions if disabled
  rseq: Protect rseq_reset() against interrupts
  rseq: Set rseq::cpu_id_start to 0 on unregistration
  selftests/rseq: Don't run tests with runner scripts outside of the scripts
2026-05-08 19:42:10 -07:00
Linus Torvalds
e5cf0260a7 Miscellaneous perf events fixes:
- Fix deadlock in the perf_mmap() failure path (Peter Zijlstra)
 
  - Intel ACR (Auto Counter Reload) fixes (Dapeng Mi):
 
    - Fix validation and configuration of ACR masks
    - Fix ACR rescheduling bug causing stale masks
    - Disable the PMI on ACR-enabled hardware
    - Enable ACR on Panther Cover uarch too
 
 Signed-off-by: Ingo Molnar <mingo@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAmn+kB0RHG1pbmdvQGtl
 cm5lbC5vcmcACgkQEnMQ0APhK1ib2RAAkh19iPRq7tHCXvDYJEHztpAuioyaKznw
 57pNlzG+/K/gg/yZ7uoCtXvogEHgDtart9ZtH7CVtZQgky6YfdiBq0g0pNSdOoY5
 O4IB0ZIXIu/FOh1+Q1k3Md5MeuxUz1Jp21Wq+JNK6VkWwfq+oCZ3XJK06+2C45wI
 uUePaEMFZn7VX9WOToZJQZME/+5yQvrgOq+D6gBs+y3UJO5u6kpdoley6fPXRtYV
 hyfBYiutJlcV1dJC9g7Dc6CHrBkaolFTKsRi2RjD658fHmUUMCubsn6lSG9UJqiZ
 CWtNMHJ/k1WBLuPLUaZBa3W0s+mUZ+0E6W3nLRHC2ORRQhAnQKeoDb2IWlVhYTdB
 NmyABPqjwvGfgidMh39aMt8GS4lFZBXGozVNWTZprN56U/jYH/Ol4cJNLJ9Ez5yk
 fzIkljCc5L/ZmxmqjJNBvmJtTpAt/FhN0qKT/k9jksISFE24bzZ0oRg3t051OyXs
 Mndldyl/2EFHA2PBIN2phISTVWh5lewYNBaK0SBbx77DX6NzMdevhdGAvw2cRVT/
 BJvqj+OeBfiaGBNb/lAIsoZCnuMClQi2t4jlKGkmN3n9hbgPyPAsz/WJRDLr9GZ+
 cqQgh7fL80HoqZTfV7tWxKTkDK3AciXXZE+8ntBpGC6CgMmgsJqLmxc60Jzfh2OO
 qGXcodOISag=
 =WNs1
 -----END PGP SIGNATURE-----

Merge tag 'perf-urgent-2026-05-09' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull perf events fixes from Ingo Molnar:

 - Fix deadlock in the perf_mmap() failure path (Peter Zijlstra)

 - Intel ACR (Auto Counter Reload) fixes (Dapeng Mi):
     - Fix validation and configuration of ACR masks
     - Fix ACR rescheduling bug causing stale masks
     - Disable the PMI on ACR-enabled hardware
     - Enable ACR on Panther Cover uarch too

* tag 'perf-urgent-2026-05-09' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf/x86/intel: Enable auto counter reload for DMR
  perf/x86/intel: Disable PMI for self-reloaded ACR events
  perf/x86/intel: Always reprogram ACR events to prevent stale masks
  perf/x86/intel: Improve validation and configuration of ACR masks
  perf/core: Fix deadlock in perf_mmap() failure path
2026-05-08 19:39:18 -07:00
Holger Brunck
496c0c4c53 net: wan: fsl_ucc_hdlc: free tx_skbuff in uhdlc_memclean
When the device is removed all allocated resources should be freed.
In uhdlc_memclean the netdev transmit queue was already stopped. But at
this point we may have pending skb in the transmit queue which must be
freed. Therefore iterate over the tx_skbuff pointers and free all
pending skb. The issue was discovered by sashiko.
Tested on a ls1043a board running HDLC in bus mode on kernel 6.12.

https: //sashiko.dev/#/patchset/20260429114208.941011-1-holger.brunck%40hitachienergy.com
Fixes: c19b6d246a ("drivers/net: support hdlc function for QE-UCC")
Signed-off-by: Holger Brunck <holger.brunck@hitachienergy.com>
Link: https://patch.msgid.link/20260507155332.3452319-1-holger.brunck@hitachienergy.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-05-08 18:48:30 -07:00
Jakub Kicinski
28d0060632 netfilter pull request 26-05-08
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEjF9xRqF1emXiQiqU1w0aZmrPKyEFAmn9IKAACgkQ1w0aZmrP
 KyF4kw/+NUCZ1C4dp1QlkhV4fQ6yrfkxmV34QTi1zy4lUPqDC/20T7DHc4klYNhP
 28mI6jnfLUq94Qyp+jVLGMT+W/2O34sk8mdSnBsn6Cj4HxscY0cSyhXAKarr/Fb2
 aYmWP5+rxYp0ZyIyU6ayK7pBQUeVqYtDvZ6WyUE49GU4bTjORyBKN4Xhett+wtPA
 2THWq2KsUmOVJhtl6pyGBAgveZhDlCj9XH4C9pRNQbdcRoCUpMgFfhJEWrRPb8So
 1yZf8b+3RaBwK6WGoiLGv5u8RfGkKCJ6u3PkKRYdrk3m1K1d9kVUGyNWPbbaG5zg
 kwIOhI1xM740cpUo3pC0t/hDdWKCgSykS83zgMYtuesOUSh5330qMWpW+sHR12ya
 9AH/4XzCrVlNdIsU5ffK0nXhTi/tu19ldW/L/yWDRycxDdprAQujdSUinH/bG7JR
 tmQyVtX/kf6mUEzrZ7fqY44nJiNkkBZoLo4XVpCxTCExglPvs2ZTF/Npze7z7dkW
 qXyXA77W0djmycIVjGbNxvKeXX8foZZfLI//CvF+OGHGgZ3/j/Vg2N59+wiO8N0s
 Mxu2ME1o1q7DWlOLYbgvP2YPs5Zlcv2P3gpNh5kukntheky/Q5CtGQDX3LAUUJYv
 j9nPz8OfYvW/4rzZgSiCYET2UJAkJe12DCOPiRhZcCt6s4LLCrE=
 =i40t
 -----END PGP SIGNATURE-----

Merge tag 'nf-26-05-08' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf

Pablo Neira Ayuso says:

====================
Netfilter fixes for net

The following batch contains Netfilter fixes for net:

1) Allow initial x_tables table replacement without emitting an audit
   log message. Delay the register message until after hooks are wired up
   to avoid unnecessary unregister logs during error unwinding.

2) Fix a NULL dereference by allocating hook ops before adding the
   table to the per-netns list. Use `synchronize_rcu()` during error
   unwinding to ensure the table stops processing packets before
   teardown. Defer audit log register message until all operations
   succeed.

3) Refactor xtables to use a single `xt_unregister_table_pre_exit`
   function. Eliminate code duplication by centralizing table
   unregistration logic within the xtables core. ebtables cannot be
   changed due to incompatibility.

4) Unregister xtables templates before module removal. This prevents
   a race condition where userspace instantiates a new table after the
   pernet unreg removed the current table.

5) Add `xtables_unregister_table_exit` to fully unregister netfilter
   tables during module removal. Unlink the table from dying lists,
   then free hook operations.

6) Implement a two-stage removal scheme for ebtables following the
   x_tables pattern. Assign table->ops while holding the ebt mutex to
   prevent exposing partially-filled structures.

7) Fix ebtables module initialization race. Register the template last
   in table initialization functions. Prevent table instantiation before
   pernet operations are available.

8) Fix a race condition in x_tables module initialization. Ensure
   pernet ops are fully set up before exposing the table to userspace.

9) Fix a race condition in ebtables module initialization, similar to
   previous patch.

10) Restore propagation of helper to expected connection, this is a
    fix-for-recent-fix.

11) Validate that the expectation tuple and mask netlink attributes are
    present when adding expectation via nfqueue, this fixes a possible
    null-ptr-deref.

12) Fix possible rare memleak in the SIP helper in case helper has been
    detached from conntrack entry, from Li Xiasong.

13) Fix refcount leak in nft_ct when creating custom expectation, also
    from Li Xiason.

Patches 1-9 from Florian Westphal.

10) Restore propagation of helper to expected connection, this is a
    fix-for-recent-fix.

11) Check that tuple and mask netlink attributes are set when creating an
    expectation via nfqueue.

* tag 'nf-26-05-08' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf:
  netfilter: nft_ct: fix missing expect put in obj eval
  netfilter: nf_conntrack_sip: get helper before allocating expectation
  netfilter: ctnetlink: check tuple and mask in expectations created via nfqueue
  netfilter: nf_conntrack_expect: restore helper propagation via expectation
  netfilter: bridge: eb_tables: close module init race
  netfilter: x_tables: close dangling table module init race
  netfilter: ebtables: close dangling table module init race
  netfilter: ebtables: move to two-stage removal scheme
  netfilter: x_tables: add and use xtables_unregister_table_exit
  netfilter: x_tables: unregister the templates first
  netfilter: x_tables: add and use xt_unregister_table_pre_exit
  netfilter: x_tables: allocate hook ops while under mutex
  netfilter: x_tables: allow initial table replace without emitting audit log message
====================

Link: https://patch.msgid.link/20260507234509.603182-1-pablo@netfilter.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-05-08 18:28:27 -07:00
Ben Morris
abb5f36771 sctp: revalidate list cursor after sctp_sendmsg_to_asoc() in SCTP_SENDALL
The SCTP_SENDALL path in sctp_sendmsg() iterates ep->asocs with
list_for_each_entry_safe(), which caches the next entry in @tmp before
the loop body runs.  The body calls sctp_sendmsg_to_asoc(), which may
drop the socket lock inside sctp_wait_for_sndbuf().

While the lock is dropped, another thread can SCTP_SOCKOPT_PEELOFF the
association cached in @tmp, migrating it to a new endpoint via
sctp_sock_migrate() (list_del_init() + list_add_tail() to
newep->asocs), and optionally close the new socket which frees the
association via kfree_rcu().  The cached @tmp can also be freed by a
network ABORT for that association, processed in softirq while the
lock is dropped.

sctp_wait_for_sndbuf() revalidates @asoc (the current entry) on re-lock
via the "sk != asoc->base.sk" and "asoc->base.dead" checks, but nothing
revalidates @tmp.  After a successful return, the iterator advances to
the stale @tmp, yielding either a use-after-free (if the peeled socket
was closed) or a list-walk onto the new endpoint's list head (type
confusion of &newep->asocs as a struct sctp_association *).

Both are reachable from CapEff=0; the type-confusion path gives
controlled indirect call via the outqueue.sched->init_sid pointer.

Fix by re-deriving @tmp from @asoc after sctp_sendmsg_to_asoc()
returns.  @asoc is known to still be on ep->asocs at that point: the
only callers that list_del an association from ep->asocs are
sctp_association_free() (which sets asoc->base.dead) and
sctp_assoc_migrate() (which changes asoc->base.sk), and
sctp_wait_for_sndbuf() checks both under the lock before any
successful return; a tripped check propagates as err < 0 and the loop
bails before the re-derive.

The SCTP_ABORT path in sctp_sendmsg_check_sflags() returns 0 and the
loop hits 'continue' before sctp_sendmsg_to_asoc() is ever called, so
the @tmp cached by list_for_each_entry_safe() still covers the
lock-held free that ba59fb0273 ("sctp: walk the list of asoc
safely") was added for.

Fixes: 4910280503 ("sctp: add support for snd flag SCTP_SENDALL process in sendmsg")
Cc: stable@vger.kernel.org
Signed-off-by: Ben Morris <bmorris@anthropic.com>
Acked-by: Xin Long <lucien.xin@gmail.com>
Link: https://patch.msgid.link/20260508001455.3137-1-joycathacker@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-05-08 18:21:09 -07:00
Shitalkumar Gandhi
6635fa8440 net: ti: icssm-prueth: fix eth_ports_node leak in probe
The error path on of_property_read_u32() failure inside
icssm_prueth_probe() returns without putting eth_ports_node,
which was acquired before the for_each_child_of_node() loop.

Drop it before returning.

Fixes: 511f6c1ae0 ("net: ti: icssm-prueth: Adds ICSSM Ethernet driver")
Signed-off-by: Shitalkumar Gandhi <shitalkumar.gandhi@cambiumnetworks.com>
Link: https://patch.msgid.link/20260506195813.641610-1-shitalkumar.gandhi@cambiumnetworks.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-05-08 18:10:30 -07:00
Myeonghun Pak
c4f3d6eb1f net: lan966x: avoid unregistering netdev on register failure
lan966x_probe_port() stores the newly allocated net_device in the
port before calling register_netdev(). If register_netdev() fails,
the probe error path calls lan966x_cleanup_ports(), which sees
port->dev and calls unregister_netdev() for a device that was never
registered.

Destroy the phylink instance created for this port and clear port->dev
before returning the registration error. The common cleanup path now skips
ports without port->dev before reaching the registered netdev cleanup, so
it only handles ports that reached the registered-netdev lifetime.

This also avoids treating an uninitialized FDMA netdev and the failed port
as a NULL == NULL match in the common cleanup path.

Fixes: d28d6d2e37 ("net: lan966x: add port module support")
Co-developed-by: Ijae Kim <ae878000@gmail.com>
Signed-off-by: Ijae Kim <ae878000@gmail.com>
Signed-off-by: Myeonghun Pak <mhun512@gmail.com>
Link: https://patch.msgid.link/20260506124331.31945-1-mhun512@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-05-08 16:30:45 -07:00
Linus Torvalds
27a26ccfd5 arm64 ptrace(PTRACE_SETREGSET) fix to zero the target's fpsimd_state
rather than the tracer's.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEE5RElWfyWxS+3PLO2a9axLQDIXvEFAmn+axkACgkQa9axLQDI
 XvGuUBAAsRvdVRfO4Td1UKAMJj3jnqXZpD1Ws4LXAHsqZP2J5Zpbu5uQso0BD8wa
 mVlAv62r/JQQXImtxEEcJ006xl4zzQK7NP85KLPeeXzFMtcOm74rNZ5mQR++Pv/1
 t6Xi2DGxapqKbeVEnud+5L7eWK0kxrd0cuL338aMTZ8+aQIImIKtp2KkdowDtLqJ
 hzTeL89W00T/V9+o7/aCCLCzY3X9ti5N2F5YnoeK7TZi4rWrjPdCSbpAEfToafHq
 x+5qMvdfxzZjHfGwhD3R+BZmlR9L7XPabQvWXM2SMGxw79/qMurnKeczYlFADgsa
 jM9c8iSdkSavcTDlm1J9qoLEh6rEEQqAYf+h9aaGnmJDqVzTvdzeuIQlAX9mKS6a
 0oQJ4WUTw8lgdgTrEbiE38DkaTQMqvB47tw3BxNHDIsD1ApWf1hZWoNEQysJPHsS
 z27MYUUUIpKmq3iDcR38IHgI5uYLG5I50A2bIvlyo6Kr66LrIDyxVHCrnBL+ZiQh
 9gTswe9X5vJVDlVhkQ1GHKdUADb+WVu4KU4ZTfcTjuk4T6e/KBB/e4LsjvCqgz01
 xTulCs5lBjczoGmzT/SlK8ecAQOX3nHFhp9EMDXv/7DWw7waHohF+TAdTHlVFSOh
 NNduLU4VuGpUaynPLeaM7C6ZvXNdIHnj3ZmQAZnc5GATXKSm5I4=
 =0Pwc
 -----END PGP SIGNATURE-----

Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull arm64 fix from Catalin Marinas:

 - ptrace(PTRACE_SETREGSET) fix to zero the target's fpsimd_state rather
   than the tracer's

* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
  arm64/fpsimd: ptrace: zero target's fpsimd_state, not the tracer's
2026-05-08 16:18:35 -07:00
Linus Torvalds
678ede852f pci-v7.1-fixes-1
-----BEGIN PGP SIGNATURE-----
 
 iQJIBAABCgAyFiEEgMe7l+5h9hnxdsnuWYigwDrT+vwFAmn+TPoUHGJoZWxnYWFz
 QGdvb2dsZS5jb20ACgkQWYigwDrT+vwRXw/8DpJFIe5jOYZIBGngR/tP2cmzeF5g
 SOQzMN24FxbMaN6pCvXj9oaxtbmU3wfTGd/ZbIRaIk0NDf1t5Gc6YTjPD7joJekJ
 m4GRHkms9IyOFrSLPpHHFY9VOVO5FXXgjwehRkdfDssfYwZfNQVg6Vls3z2StsoG
 09D4F6eqOY1u+QuFyh3L9G6aYTtniXJS+jc3VZeKM4Ce4dX6G2gKjr2Cz6D+s6rU
 9RJBKKmG15QpHFkui5lD//ZMtD0WuJUa2BsVIkl52lRpRgKgFM14GTn2p5OVVRNK
 /QpTg4r/SdhsWsINh/aWuto5Q/Gky7dDqojAAPTbM331yMrA91Uv9on2gAIePa6e
 Jc2We6fekSwtRaOWkiNei/xzW/mMWmj15k4pePKCqHl+LVrxZM5jA/MtVOAzpfyU
 7mDNIIGxVQd3YjboZGAvuZ1XuOhzVsNQBnLC57Zx3CCvX9N9dCwS6bZIwwCBVzPv
 GG7WIDqx58LKH6N/X971ZzMcPha2syTZZA5RR59aY6tLe5ufLHHuZ0GLqx/txb7U
 OZxaIzOKF0LNO5DXuqFHL9aQmqLFn0tsP83/x60AEosrx8URQ8L3aQw4PQkZhNIl
 qpRcmkb/jxNMLcB7EXK/tEoCdg0C6obxnf0fcgOTtlQYD7LUGNlMGxCDW5lhdWzN
 uo1/LD6f71MqplQ=
 =g+Cm
 -----END PGP SIGNATURE-----

Merge tag 'pci-v7.1-fixes-1' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci

Pull PCI fixes from Bjorn Helgaas:

 - Don't fallback to bus reset after failed slot reset; a bus reset
   isn't safe if the .reset_slot() callback is implemented (Keith Busch)

 - Update saved_config_space upon resource assignment to fix passthrough
   regressions when x86 pcibios_assign_resources() updates BARs (Lukas
   Wunner)

 - Initialize a temporary pci_dev->dev in sysfs 'new_id' attribute to
   fix a lockdep regression after driver_override was moved from PCI to
   device core (Samiullah Khawaja)

 - Update MAINTAINERS email addresses (Marek Vasut, Hans Zhang)

 - Add MAINTAINERS reviewer for PCIe Cadence IP (Aksh Garg)

* tag 'pci-v7.1-fixes-1' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci:
  MAINTAINERS: Add Aksh Garg as PCIe CADENCE reviewer
  MAINTAINERS: Update Hans Zhang email for PCIe CIX Sky1
  MAINTAINERS: Update Marek Vasut email for PCIe R-Car
  PCI: Initialize temporary device in new_id_store()
  PCI: Update saved_config_space upon resource assignment
  PCI: Don't fallback to bus reset after failed slot reset
2026-05-08 16:08:58 -07:00
Jakub Kicinski
f149471f4e Merge branch 'intel-wired-lan-driver-updates-2026-05-04-i40e-ice-idpf'
Jacob Keller says:

====================
Intel Wired LAN Driver Updates 2026-05-04 (i40e, ice, idpf)

Matt Volrath fixes two issues with the i40e driver probe routine, ensuring
that PTP is properly cleaned up if the probe fails.

Emil corrects the initialization of the read_dev_clk_lock spinlock in
idpf_ptp_init, ensuring it is initialized prior to when the
ptp_schedule_worker() is called.

Greg KH fixes a double free and use-after free in the idpf auxiliary device
error paths.

Marcin fixes ice_set_rss_hfunc() to use the correct q_opt_flags field,
correcting the assignment and preventing submission of invalid data to the
firmware.

Bart corrects the locking in ice_dcb_rebuild(), ensuring that the tc_mutex
is held over the entire operation.

Ivan fixes the rclk pin state get for E810 devices, ensuring the index is
properly offset by the base_rclk_idx value. This ensures that the correct
pin index is used to look up recovered clock state. He additionally adds
bounds checking to prevent attempting to access pins outside of the pin
state array.

Ivan also moves the CGU register macros to the top of ice_dpll.h, inside
the header guard to avoid duplicate macro definitions should the ice_dpll.h
header is included multiple times.
====================

Link: https://patch.msgid.link/20260506-jk-iwl-net-2026-05-04-v2-0-a5ea4dc837a9@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-05-08 16:01:12 -07:00
Ivan Vecera
30f1658fc5 ice: dpll: fix misplaced header macros
The CGU register definitions (ICE_CGU_R10, ICE_CGU_R11 and related field
masks) were placed after the #endif of the _ICE_DPLL_H_ include guard,
leaving them unprotected. Move them inside the guard.

Fixes: ad1df4f2d5 ("ice: dpll: Support E825-C SyncE and dynamic pin discovery")
Signed-off-by: Ivan Vecera <ivecera@redhat.com>
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://patch.msgid.link/20260506-jk-iwl-net-2026-05-04-v2-8-a5ea4dc837a9@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-05-08 16:01:09 -07:00
Ivan Vecera
cce709d8df ice: dpll: fix rclk pin state get for E810
The refactoring of ice_dpll_rclk_state_on_pin_get() to use
ice_dpll_pin_get_parent_idx() omitted the base_rclk_idx adjustment that was
correctly added in the ice_dpll_rclk_state_on_pin_set() path. This breaks
E810 devices where base_rclk_idx is non-zero, causing the wrong hardware
index to be used for pin state lookup and incorrect recovered clock state
to be reported via the DPLL subsystem. E825C is unaffected as its
base_rclk_idx is 0.

While at it, add bounds check against ICE_DPLL_RCLK_NUM_MAX on hw_idx after
the base_rclk_idx subtraction in both ice_dpll_rclk_state_on_pin_{get,set}()
to prevent out-of-bounds access on the pin state array.

Fixes: ad1df4f2d5 ("ice: dpll: Support E825-C SyncE and dynamic pin discovery")
Signed-off-by: Ivan Vecera <ivecera@redhat.com>
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://patch.msgid.link/20260506-jk-iwl-net-2026-05-04-v2-7-a5ea4dc837a9@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-05-08 16:01:09 -07:00
Bart Van Assche
0ded1f36ba ice: fix locking in ice_dcb_rebuild()
Move the mutex_lock() call up to prevent that DCB settings change after
the first ice_query_port_ets() call. The second ice_query_port_ets()
call in ice_dcb_rebuild() is already protected by pf->tc_mutex.

This also fixes a bug in an error path, as before taking the first
"goto dcb_error" in the function jumped over mutex_lock() to
mutex_unlock().

This bug has been detected by the clang thread-safety analyzer.

Cc: intel-wired-lan@lists.osuosl.org
Fixes: 242b5e068b ("ice: Fix DCB rebuild after reset")
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Tested-by: Arpana Arland <arpanax.arland@intel.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://patch.msgid.link/20260506-jk-iwl-net-2026-05-04-v2-6-a5ea4dc837a9@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-05-08 16:01:09 -07:00
Marcin Szycik
b3cda96feb ice: fix setting RSS VSI hash for E830
ice_set_rss_hfunc() performs a VSI update, in which it sets hashing
function, leaving other VSI options unchanged. However, ::q_opt_flags is
mistakenly set to the value of another field, instead of its original
value, probably due to a typo. What happens next is hardware-dependent:

On E810, only the first bit is meaningful (see
ICE_AQ_VSI_Q_OPT_PE_FLTR_EN) and can potentially end up in a different
state than before VSI update.

On E830, some of the remaining bits are not reserved. Setting them
to some unrelated values can cause the firmware to reject the update
because of invalid settings, or worse - succeed.

Reproducer:
  sudo ethtool -X $PF1 equal 8

Output in dmesg:
  Failed to configure RSS hash for VSI 6, error -5

Fixes: 352e9bf238 ("ice: enable symmetric-xor RSS for Toeplitz hash function")
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Signed-off-by: Marcin Szycik <marcin.szycik@linux.intel.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://patch.msgid.link/20260506-jk-iwl-net-2026-05-04-v2-5-a5ea4dc837a9@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-05-08 16:01:09 -07:00
Greg Kroah-Hartman
6c77b95108 idpf: fix double free and use-after-free in aux device error paths
When auxiliary_device_add() fails in idpf_plug_vport_aux_dev() or
idpf_plug_core_aux_dev(), the err_aux_dev_add label calls
auxiliary_device_uninit() and falls through to err_aux_dev_init.  The
uninit call will trigger put_device(), which invokes the release
callback (idpf_vport_adev_release / idpf_core_adev_release) that frees
iadev.  The fall-through then reads adev->id from the freed iadev for
ida_free() and double-frees iadev with kfree().

Free the IDA slot and clear the back-pointer before uninit, while adev
is still valid, then return immediately.

Commit 65637c3a18 ("idpf: fix UAF in RDMA core aux dev deinitialization")
fixed the same use-after-free in the matching unplug path in this file but
missed both probe error paths.

Cc: Tony Nguyen <anthony.l.nguyen@intel.com>
Cc: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Cc: Andrew Lunn <andrew+netdev@lunn.ch>
Cc: stable@kernel.org
Fixes: be91128c57 ("idpf: implement RDMA vport auxiliary dev create, init, and destroy")
Fixes: f4312e6bfa ("idpf: implement core RDMA auxiliary dev create, init, and destroy")
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://patch.msgid.link/20260506-jk-iwl-net-2026-05-04-v2-4-a5ea4dc837a9@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-05-08 16:01:09 -07:00
Emil Tantilov
da4f76b6a8 idpf: fix read_dev_clk_lock spinlock init in idpf_ptp_init()
In idpf_ptp_init(), read_dev_clk_lock is initialized after
ptp_schedule_worker() had already been called (and after
idpf_ptp_settime64() could reach the lock). The PTP aux worker
fires immediately upon scheduling and can call into
idpf_ptp_read_src_clk_reg_direct(), which takes
spin_lock(&ptp->read_dev_clk_lock) on an uninitialized lock, triggering
the lockdep "non-static key" warning:

[12973.796587] idpf 0000:83:00.0: Device HW Reset initiated
[12974.094507] INFO: trying to register non-static key.
...
[12974.097208] Call Trace:
[12974.097213]  <TASK>
[12974.097218]  dump_stack_lvl+0x93/0xe0
[12974.097234]  register_lock_class+0x4c4/0x4e0
[12974.097249]  ? __lock_acquire+0x427/0x2290
[12974.097259]  __lock_acquire+0x98/0x2290
[12974.097272]  lock_acquire+0xc6/0x310
[12974.097281]  ? idpf_ptp_read_src_clk_reg+0xb7/0x150 [idpf]
[12974.097311]  ? lockdep_hardirqs_on_prepare+0xde/0x190
[12974.097318]  ? finish_task_switch.isra.0+0xd2/0x350
[12974.097330]  ? __pfx_ptp_aux_kworker+0x10/0x10 [ptp]
[12974.097343]  _raw_spin_lock+0x30/0x40
[12974.097353]  ? idpf_ptp_read_src_clk_reg+0xb7/0x150 [idpf]
[12974.097373]  idpf_ptp_read_src_clk_reg+0xb7/0x150 [idpf]
[12974.097391]  ? kthread_worker_fn+0x88/0x3d0
[12974.097404]  ? kthread_worker_fn+0x4e/0x3d0
[12974.097411]  idpf_ptp_update_cached_phctime+0x26/0x120 [idpf]
[12974.097428]  ? _raw_spin_unlock_irq+0x28/0x50
[12974.097436]  idpf_ptp_do_aux_work+0x15/0x20 [idpf]
[12974.097454]  ptp_aux_kworker+0x20/0x40 [ptp]
[12974.097464]  kthread_worker_fn+0xd5/0x3d0
[12974.097474]  ? __pfx_kthread_worker_fn+0x10/0x10
[12974.097482]  kthread+0xf4/0x130
[12974.097489]  ? __pfx_kthread+0x10/0x10
[12974.097498]  ret_from_fork+0x32c/0x410
[12974.097512]  ? __pfx_kthread+0x10/0x10
[12974.097519]  ret_from_fork_asm+0x1a/0x30
[12974.097540]  </TASK>

Move the call to spin_lock_init() up a bit to make sure read_dev_clk_lock
is not touched before it's been initialized.

Fixes: 5cb8805d23 ("idpf: negotiate PTP capabilities and get PTP clock")
Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com>
Reviewed-by: Madhu Chittim <madhu.chittim@intel.com>
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Tested-by: Samuel Salin <Samuel.salin@intel.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://patch.msgid.link/20260506-jk-iwl-net-2026-05-04-v2-3-a5ea4dc837a9@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-05-08 16:01:09 -07:00
Matt Vollrath
678b713ece i40e: Cleanup PTP pins on probe failure
PTP pin structs are allocated early in probe, but never cleaned up.

Fix this by calling i40e_ptp_free_pins in the error path.

To support this, i40e_ptp_free_pins is added to the header and
pin_config is correctly nullified after being freed.

This has been an issue since i40e_ptp_alloc_pins was introduced.

Fixes: 1050713026 ("i40e: add support for PTP external synchronization clock")
Reported-by: Kohei Enju <kohei@enjuk.jp>
Cc: stable@vger.kernel.org
Signed-off-by: Matt Vollrath <tactii@gmail.com>
Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de>
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Reviewed-by: Kohei Enju <kohei@enjuk.jp>
Tested-by: Sunitha Mekala <sunithax.d.mekala@intel.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://patch.msgid.link/20260506-jk-iwl-net-2026-05-04-v2-2-a5ea4dc837a9@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-05-08 16:01:08 -07:00
Matt Vollrath
1619553b0a i40e: Cleanup PTP registration on probe failure
Fix two conditions which would leak PTP registration on probe failure:

1. i40e_setup_pf_switch can encounter an error in
   i40e_setup_pf_filter_control, call i40e_ptp_init, then return
   non-zero, sending i40e_probe to err_vsis.

2. i40e_setup_misc_vector can return non-zero, sending i40e_probe to
   err_vsis.

Both of these conditions have been present since PTP was introduced in
this driver.

Found with coccinelle.

Fixes: beb0dff125 ("i40e: enable PTP")
Signed-off-by: Matt Vollrath <tactii@gmail.com>
Tested-by: Sunitha Mekala <sunithax.d.mekala@intel.com>
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://patch.msgid.link/20260506-jk-iwl-net-2026-05-04-v2-1-a5ea4dc837a9@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-05-08 16:01:08 -07:00
Mohsin Bashir
a77d5a069d net: shaper: Reject reparenting of existing nodes
When an existing node-scope shaper is moved to a different parent
via the group operation, the framework fails to update the leaves
count on both the old and new parent shapers. Only newly created
nodes (handle.id == NET_SHAPER_ID_UNSPEC) trigger the parent
leaves increment at line 1039.

This causes the parent's leaves counter to diverge from the
actual number of children in the xarray. When the node is later
deleted, pre_del_node() allocates an array sized by the stale
leaves count, but the xarray iteration finds more children than
expected, hitting the WARN_ON_ONCE guard and returning -EINVAL.

Rather than adding reparenting support with complex leaves count
bookkeeping, reject group calls that attempt to change an existing
node's parent. Updates to an existing node's rate or leaves under
the same parent remain permitted. We expect that for any modification
of the topology user should always create new groups and let the
kernel garbage collect the leaf-less nodes.

Fixes: 5d5d4700e7 ("net-shapers: implement NL group operation")
Signed-off-by: Mohsin Bashir <hmohsin@meta.com>
Link: https://patch.msgid.link/20260506233745.111895-1-mohsin.bashr@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-05-08 15:45:47 -07:00
Alice Ryhl
efda25ee84 genetlink: free the skb on 'group >= family->n_mcgrps'
These methods generally consume ownership of the provided skb, so even
if an error path is encountered, the skb is freed. This is because the
very first thing they do after some initial setup is to unconditionally
consume the skb via consume_skb(skb). Any subsequent errors lead to the
core netlink layer freeing the skb.

However, there is one check that occurs before ownership is passed,
which is the check for the group index. So if this error condition is
encountered, then the skb is leaked. This error condition is generally
considered a violation of the netlink API, so it's not expected to occur
under normal circumstances. For the same reason, no callers check for
this error condition, and no callers need to be adjusted. However, we
should still follow the same ownership semantics of the rest of the
function. Thus, free the skb in this codepath.

Suggested-by: Andrew Lunn <andrew@lunn.ch>
Suggested-by: Matthew Maurer <mmaurer@google.com>
Fixes: 2a94fe48f3 ("genetlink: make multicast groups const, prevent abuse")
Link: https://lore.kernel.org/r/845b36ba-7b3a-41f2-acb2-b284f253e2ca@lunn.ch
Signed-off-by: Alice Ryhl <aliceryhl@google.com>
Link: https://patch.msgid.link/20260506-genlmsg-return-v2-1-a63ee2a055d6@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-05-08 15:43:29 -07:00
Ilya Maximets
f2ab4fd027 net: nsh: fix incorrect header length macros
NSH header length is a 6-bit field that encodes the total length of
the header in 4-byte words.  So the maximum length is 0b111111 * 4,
which is 252 and not 256.  The maximum context length is the same
number minus the length of the base header (8), so 244.

These macros are used to validate push_nsh() action in openvswitch.
Miscalculation here doesn't cause any real issues.  In the worst case
the oversized context is truncated while building the header, so we'll
construct and send a broken packet, which is not a big problem, as any
receiver should validate the fields.  No invalid memory accesses will
happen during the header push.  But we should fix the macros to reject
the incorrect actions in the first place.

Using previously defined values and calculating the length instead
of defining numbers directly, so it's easier to understand where they
come from and harder to make a mistake.

Fixes: 1f0b7744c5 ("net: add NSH header structures and helpers")
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Reviewed-by: Aaron Conole <aconole@redhat.com>
Link: https://patch.msgid.link/20260507120434.2962505-1-i.maximets@ovn.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-05-08 15:32:59 -07:00
Quan Sun
4908f1395f net: ethtool: fix NULL pointer dereference in phy_reply_size
In phy_prepare_data(), several strings such as 'name', 'drvname',
'upstream_sfp_name', and 'downstream_sfp_name' are allocated using
kstrdup(). However, these allocations were not checked  for failure.

If kstrdup() fails for 'name', it returns NULL while the function
continues. This leads to a kernel NULL pointer dereference and panic
later in phy_reply_size() when it unconditionally calls strlen() on
the NULL pointer.

While other strings like 'upstream_sfp_name' might be checked before
access in certain code paths, failing to handle these allocations
consistently can lead to incomplete data reporting or hidden bugs.

Fix this by adding proper NULL checks for all kstrdup() calls in
phy_prepare_data() and implement a centralized error handling path
using goto labels to ensure all previously allocated resources are
freed on failure.

Fixes: 9dd2ad5e92 ("net: ethtool: phy: Convert the PHY_GET command to generic phy dump")
Signed-off-by: Quan Sun <2022090917019@std.uestc.edu.cn>
Reviewed-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Link: https://patch.msgid.link/20260507131738.1173835-1-2022090917019@std.uestc.edu.cn
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-05-08 15:30:10 -07:00
Nicolas Ferre
0a549298f4 MAINTAINERS: change maintainers for macb Ethernet driver
I would like to hand over the macb maintenance to Théo, as I'm unable to
keep up with the recent flow of patches for this driver. After speaking
with Claudiu, he indicated that he is in the same position as me.
To help with this work, Conor has agreed to act as a reviewer.

I was given responsibility for this driver years ago, and I'm glad to
see it continue with talented developers.

Signed-off-by: Nicolas Ferre <nicolas.ferre@microchip.com>
Acked-by: Claudiu Beznea <claudiu.beznea@tuxon.dev>
Acked-by: Conor Dooley <conor.dooley@microchip.com>
Link: https://patch.msgid.link/20260507120444.9733-1-nicolas.ferre@microchip.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-05-08 15:27:31 -07:00
Dragos Tatulea
58e2330bd4 net: napi: Avoid gro timer misfiring at end of busypoll
When in irq deferral mode (defer-hard-irqs > 0), a short enough
gro-flush timeout can trigger before NAPI_STATE_SCHED is cleared if the
last poll in busy_poll_stop() takes too long. This can have the effect
of leaving the queue stuck with interrupts disabled and no timer armed
which results in a tx timeout if there is no subsequent busypoll cycle.

To prevent this, defer the gro-flush timer arm after the last poll.

Fixes: 7fd3253a7d ("net: Introduce preferred busy-polling")
Co-developed-by: Martin Karsten <mkarsten@uwaterloo.ca>
Signed-off-by: Martin Karsten <mkarsten@uwaterloo.ca>
Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Cosmin Ratiu <cratiu@nvidia.com>
Reviewed-by: Joe Damato <joe@dama.to>
Link: https://patch.msgid.link/20260506090808.820559-2-dtatulea@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-05-08 15:02:32 -07:00
Jakub Kicinski
dffddaa0ce Merge branch 'ipv6-flowlabel-per-netns-budget-for-unprivileged-callers'
Maoyi Xie says:

====================
ipv6: flowlabel: per-netns budget for unprivileged callers

From: Maoyi Xie <maoyi.xie@ntu.edu.sg>

This series fixes the cross-tenant DoS in net/ipv6/ip6_flowlabel.c.
v1 through v6 were single-patch postings, each in its own thread.
v6 review pointed out that the existing fl_size read in
mem_check() and the corresponding write in fl_intern() are not in
the same critical section. v7 split the work into 2 patches.

Patch 1/2 is a prerequisite. It moves spin_lock_bh(&ip6_fl_lock)
and the matching unlock from fl_intern() into its only caller
ipv6_flowlabel_get(), so the mem_check() call runs under the same
critical section as the fl_intern() insert. With all writers and
the read of fl_size under the lock, fl_size is converted from
atomic_t to plain int. This is independent of the per-netns
budget. It also makes 2/2 backportable without conflicts.

Patch 2/2 is the v6 patch, rebased on 1/2.

  - flowlabel_count is plain int rather than atomic_t, since the
    previous patch put all writers and readers under ip6_fl_lock.
  - In ip6_fl_gc(), fl_free() is now placed below the fl_size
    and flowlabel_count decrements, removing the v6 cache of
    fl->fl_net.
  - In ip6_fl_purge(), fl_free() stays in its original position.
    The function argument net is used for flowlabel_count.
  - mem_check() uses spaces around the / operator on all four
    expressions, addressing the checkpatch note in v6 review.

Numeric budget (preserved from v6):

  pre-patch:
    global non-CAP_NET_ADMIN budget = FL_MAX_SIZE - FL_MAX_SIZE/4
                                    = 4096 - 1024 = 3072
    per-actor reach                 = 3072

  post-patch:
    FL_MAX_SIZE doubled to 8192
    global non-CAP_NET_ADMIN budget = 8192 - 2048 = 6144
    per-netns ceiling               = 6144 / 2 = 3072
    per-actor reach                 = 3072 (preserved)

CAP_NET_ADMIN against init_user_ns still bypasses both caps.

Reproducer (KASAN VM, 4 cores, qemu): unprivileged netns A holds
3072 flowlabels via 100 procs. Fresh unprivileged netns B then
allocates 32 flowlabels (the FL_MAX_PER_SOCK ceiling for one
socket), the same as a clean baseline. Without the per-netns
ceiling, netns A could push fl_size past FL_MAX_SIZE - FL_MAX_SIZE
/ 4 and netns B would see allocations denied.
====================

Link: https://patch.msgid.link/20260506082416.2259567-1-maoyixie.tju@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-05-08 14:59:17 -07:00
Maoyi Xie
e68eadffb7 ipv6: flowlabel: enforce per-netns limit for unprivileged callers
fl_size, fl_ht and ip6_fl_lock in net/ipv6/ip6_flowlabel.c are
file scope and shared across netns. mem_check() reads fl_size to
decide whether to deny non-CAP_NET_ADMIN callers. capable() runs
against init_user_ns, so an unprivileged user in any non-init
userns can push fl_size past FL_MAX_SIZE - FL_MAX_SIZE / 4 and
starve every other unprivileged userns on the host.

Add struct netns_ipv6::flowlabel_count, bumped and decremented
next to fl_size in fl_intern, ip6_fl_gc and ip6_fl_purge. The new
field fills the existing 4-byte hole after ipmr_seq, so struct
netns_ipv6 stays the same size on 64-bit builds.

Bump FL_MAX_SIZE from 4096 to 8192. It has been 4096 since the
file was added. Machines and connection counts have grown.

mem_check() folds an extra per-netns ceiling into the existing
non-CAP_NET_ADMIN conditional. The ceiling is half of the total
budget that unprivileged callers have ever been able to use, i.e.
(FL_MAX_SIZE - FL_MAX_SIZE / 4) / 2 = 3072 entries. With
FL_MAX_SIZE doubled, this preserves the original per-user reach
of 3K (what an unprivileged caller could already obtain before
this change), while forcing an attacker to spread allocations
across at least two netns to exhaust the global non-CAP_NET_ADMIN
budget.

CAP_NET_ADMIN against init_user_ns still bypasses both caps.

The previous patch took ip6_fl_lock across mem_check and
fl_intern, so the new flowlabel_count read in mem_check and the
new flowlabel_count++ in fl_intern run under the same critical
section. flowlabel_count is therefore plain int, like fl_size.

Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Suggested-by: Willem de Bruijn <willemb@google.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Cc: stable@vger.kernel.org # v5.15+
Signed-off-by: Maoyi Xie <maoyi.xie@ntu.edu.sg>
Link: https://patch.msgid.link/20260506082416.2259567-3-maoyixie.tju@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-05-08 14:59:14 -07:00
Maoyi Xie
7ce5556f25 ipv6: flowlabel: take ip6_fl_lock across mem_check and fl_intern
mem_check() in net/ipv6/ip6_flowlabel.c reads fl_size without
holding ip6_fl_lock. fl_intern() takes the lock immediately
afterwards. The two checks therefore race against concurrent
fl_intern, ip6_fl_gc and ip6_fl_purge writers, which makes the
mem_check budget check approximate.

Move spin_lock_bh(&ip6_fl_lock) and the matching unlock from
fl_intern() into its only caller ipv6_flowlabel_get(). The
mem_check() call now runs under the same critical section as the
fl_intern() insert, so the budget check is exact.

With all writers and the read of fl_size under ip6_fl_lock,
convert fl_size from atomic_t to plain int. The four sites that
update or read fl_size are fl_intern (insert path), ip6_fl_gc
(garbage collector, the !sched check and the per-entry decrement),
ip6_fl_purge (per-netns purge), and mem_check (budget check), and
all four now run under ip6_fl_lock.

This is a prerequisite for adding a per-netns budget alongside
fl_size. The follow-up patch adds netns_ipv6::flowlabel_count and
folds it into mem_check().

Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Suggested-by: Willem de Bruijn <willemb@google.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Maoyi Xie <maoyi.xie@ntu.edu.sg>
Link: https://patch.msgid.link/20260506082416.2259567-2-maoyixie.tju@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-05-08 14:59:13 -07:00
Maciej W. Rozycki
e539acf9f9 MAINTAINERS: Add self for the 3c509 network driver
It appears there's a need for a maintainer for the 3Com EtherLink III
family of Ethernet network adapters.  There is documentation available
and the driver is very mature so the task ought to be of little hassle,
so I think I should be able to squeeze in any issues to be addressed.

Signed-off-by: Maciej W. Rozycki <macro@orcam.me.uk>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://patch.msgid.link/alpine.DEB.2.21.2604271056460.28583@angie.orcam.me.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-05-08 14:56:27 -07:00
Jakub Kicinski
4378bf6124 Merge branch 'tcp-two-fixes-for-socket-migration-in-reqsk_timer_handler'
Kuniyuki Iwashima says:

====================
tcp: Two fixes for socket migration in reqsk_timer_handler().

The series fixes two bugs in the error path of socket migration
in reqsk_timer_handler().

Patch 1 fixes a potential UAF in reqsk_timer_handler().

Patch 2 fixes imbalanced icsk_accept_queue count.
====================

Link: https://patch.msgid.link/20260506035954.1563147-1-kuniyu@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-05-08 14:54:55 -07:00
Kuniyuki Iwashima
7eca3292ca tcp: Fix imbalanced icsk_accept_queue count.
When TCP socket migration happens in reqsk_timer_handler(),
@sk_listener will be updated with the new listener.

When we call __inet_csk_reqsk_queue_drop(), the listener must
be the one stored in req->rsk_listener.

The cited commit accidentally replaced oreq->rsk_listener with
sk_listener, leading to imbalanced icsk_accept_queue count.

Let's pass the correct listener to __inet_csk_reqsk_queue_drop().

Fixes: e8c526f2bd ("tcp/dccp: Don't use timer_pending() in reqsk_queue_unlink().")
Reported-by: Damiano Melotti <melotti@google.com>
Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>
Link: https://patch.msgid.link/20260506035954.1563147-3-kuniyu@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-05-08 14:54:51 -07:00
Kuniyuki Iwashima
97c8a3c1f7 tcp: Fix potential UAF in reqsk_timer_handler().
When TCP socket migration fails at inet_ehash_insert() in
reqsk_timer_handler(), we jump to the no_ownership: label
and free the new reqsk immediately with __reqsk_free().

Thus, we must stop the new reqsk's timer before jumping to the
label, but the timer might be missed since the cited commit,
resulting in UAF.

As we are in the original reqsk's timer context, we can safely
call timer_delete_sync() for the new reqsk.

Let's pass false to __inet_csk_reqsk_queue_drop() to stop
the new reqsk's timer.

Fixes: 83fccfc394 ("inet: fix potential deadlock in reqsk_queue_unlink()")
Reported-by: Damiano Melotti <melotti@google.com>
Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>
Link: https://patch.msgid.link/20260506035954.1563147-2-kuniyu@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-05-08 14:54:50 -07:00
Aksh Garg
9ef40a09c5 MAINTAINERS: Add Aksh Garg as PCIe CADENCE reviewer
I wish to contribute to the review process for Cadence PCIe IP drivers,
hence add myself as a reviewer.

Signed-off-by: Aksh Garg <a-garg7@ti.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Link: https://patch.msgid.link/20260508060951.840233-1-a-garg7@ti.com
2026-05-08 15:50:07 -05:00
Hans Zhang
78e115d806 MAINTAINERS: Update Hans Zhang email for PCIe CIX Sky1
Update my email address as my work email account is no longer in use.

Signed-off-by: Hans Zhang <18255117159@163.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Link: https://patch.msgid.link/20260508023006.1787674-1-18255117159@163.com
2026-05-08 15:50:06 -05:00
Marek Vasut
bf5421b3d8 MAINTAINERS: Update Marek Vasut email for PCIe R-Car
Use up to date address. No functional change.

Signed-off-by: Marek Vasut <marek.vasut+renesas@mailbox.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Link: https://patch.msgid.link/20260428052030.51101-1-marek.vasut+renesas@mailbox.org
2026-05-08 15:50:06 -05:00
Samiullah Khawaja
f45a49a238 PCI: Initialize temporary device in new_id_store()
When setting new_id of a PCI device driver using sysfs a lockdep splat
occurs. This is because new_id_store() builds a temporary pci_dev for
pci_match_device(), which calls device_match_driver_override().  That
depends on the driver_override.lock added by cb3d1049f4 ("driver core:
generalize driver_override in struct device").

The new driver_override.lock was not initialized in the temporary pci_dev,
resulting in this lockdep splat.

Initialize the temporary pci_dev to fix this.

Repro:

  Build with CONFIG_LOCKDEP=y, boot with QEMU, and add a new ID:

  # echo "8086 10f5" > /sys/bus/pci/drivers/e1000e/new_id

  INFO: trying to register non-static key.
  The code is fine but needs lockdep annotation, or maybe
  you didn't initialize this object before use?
  turning off the locking correctness validator.
  CPU: 2 UID: 0 PID: 177 Comm: liveupdate-iomm Not tainted 7.0.0+ #9 PREEMPT(full)
  Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.16.3-0-ga6ed6b701f0a-prebuilt.qemu.org 04/01/2014
  Call Trace:
   <TASK>
   dump_stack_lvl+0x5d/0x80
   register_lock_class+0x77e/0x790
   lock_acquire+0xbf/0x2e0
   pci_match_device+0x24/0x180
   new_id_store+0x189/0x1d0
   kernfs_fop_write_iter+0x14f/0x210
   vfs_write+0x263/0x5e0
   ksys_write+0x79/0xf0
   do_syscall_64+0x117/0xf80

Fixes: 10a4206a24 ("PCI: use generic driver_override infrastructure")
Fixes: 8895d3bcb8 ("PCI: Fail new_id for vendor/device values already built into driver")
Signed-off-by: Samiullah Khawaja <skhawaja@google.com>
[bhelgaas: add commit log details and repro, trim backtrace]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Danilo Krummrich <dakr@kernel.org>
Link: https://patch.msgid.link/20260505234327.716630-1-skhawaja@google.com
2026-05-08 15:50:06 -05:00
Lukas Wunner
909f7bf9b0 PCI: Update saved_config_space upon resource assignment
Bernd reports passthrough failure of a Digital Devices Cine S2 V6 DVB
adapter plugged into an ASRock X570S PG Riptide board with BIOS version
P5.41 (09/07/2023):

  ddbridge 0000:05:00.0: detected Digital Devices Cine S2 V6 DVB adapter
  ddbridge 0000:05:00.0: cannot read registers
  ddbridge 0000:05:00.0: fail

BIOS assigns an incorrect BAR to the DVB adapter which doesn't fit into the
upstream bridge window.  The kernel corrects the BAR assignment:

  pci 0000:07:00.0: BAR 0 [mem 0xfffffffffc500000-0xfffffffffc50ffff 64bit]: can't claim; no compatible bridge window
  pci 0000:07:00.0: BAR 0 [mem 0xfc500000-0xfc50ffff 64bit]: assigned

Correction of the BAR assignment happens in an x86-specific fs_initcall,
pcibios_assign_resources(), after device enumeration in a subsys_initcall.
This order was introduced at the behest of Linus in 2004:

  https://git.kernel.org/tglx/history/c/a06a30144bbc

No other architecture performs such a late BAR correction.

Bernd bisected the issue to commit a2f1e22390 ("PCI/ERR: Ensure error
recoverability at all times"), but it only occurs in the absence of commit
4d4c10f763 ("PCI: Explicitly put devices into D0 when initializing").
This combination exists in stable kernel v6.12.70, but not in mainline,
hence Bernd cannot reproduce the issue with mainline.

Since a2f1e22390, config space is saved on enumeration, prior to BAR
correction.  Upon passthrough, the corrected BAR is overwritten with the
incorrect saved value by:

  vfio_pci_core_register_device()
    vfio_pci_set_power_state()
      pci_restore_state()

But only if the device's current_state is PCI_UNKNOWN, as it was prior to
commit 4d4c10f763.  Since the commit, it is PCI_D0, which changes the
behavior of vfio_pci_set_power_state() to no longer restore the state
without saving it first.

Alexandre is reporting the same issue as Bernd, but in his case, mainline
is affected as well.  The difference is that on Alexandre's system, the
host kernel binds a driver to the device which is unbound prior to
passthrough, whereas on Bernd's system no driver gets bound by the host
kernel.

Unbinding sets current_state to PCI_UNKNOWN in pci_device_remove(), so when
vfio-pci is subsequently bound to the device, pci_restore_state() is once
again called without invoking pci_save_state() first.

To robustly fix the issue, always update saved_config_space upon resource
assignment.

Reported-by: Bernd Schumacher <bernd@bschu.de>
Closes: https://lore.kernel.org/r/acfZrlP0Ua_5D3U4@eldamar.lan/
Reported-by: Alexandre N. <an.tech@mailo.com>
Closes: https://lore.kernel.org/r/dd3c3358-de0f-4a56-9c81-04aceaab4058@mailo.com/
Fixes: a2f1e22390 ("PCI/ERR: Ensure error recoverability at all times")
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Bernd Schumacher <bernd@bschu.de>
Tested-by: Alexandre N. <an.tech@mailo.com>
Cc: stable@vger.kernel.org # v6.12+
Link: https://patch.msgid.link/febc3f354e0c1f5a9f5b3ee9ffddaa44caccf651.1776268054.git.lukas@wunner.de
2026-05-08 15:50:06 -05:00
Kuniyuki Iwashima
18fc650ccd bpf: Free reuseport cBPF prog after RCU grace period.
Eulgyu Kim reported the splat below with a repro. [0]

The repro sets up a UDP reuseport group with a cBPF prog and
replaces it with a new one while another thread is sending
a UDP packet to the group.

The reuseport prog is freed by sk_reuseport_prog_free().
bpf_prog_put() is called for "e"BPF prog to destruct through
multiple stages while cBPF prog is freed immediately by
bpf_release_orig_filter() and bpf_prog_free().

If a reuseport prog is detached from the setsockopt() path
(reuseport_attach_prog() or reuseport_detach_prog()),
sk_reuseport_prog_free() is called without waiting for RCU
readers to complete, resulting in various bugs.

Let's defer freeing the reuseport cBPF prog after one RCU
grace period.

Note "e"BPF prog is safe as is unless the fast path starts
to touch fields destroyed in bpf_prog_put_deferred() and
__bpf_prog_put_noref().

[0]:
BUG: KASAN: vmalloc-out-of-bounds in reuseport_select_sock+0xedc/0x1220 net/core/sock_reuseport.c:596
Read of size 4 at addr ffffc9000051e004 by task slowme/10208
CPU: 6 UID: 1000 PID: 10208 Comm: slowme Not tainted 7.0.0-geb7ac95ff75e #32 PREEMPT(full)
Hardware name: QEMU Ubuntu 24.04 PC v2 (i440FX + PIIX, arch_caps fix, 1996), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
Call Trace:
 <IRQ>
 dump_stack_lvl+0xe8/0x150 lib/dump_stack.c:120
 print_address_description mm/kasan/report.c:378 [inline]
 print_report+0xca/0x240 mm/kasan/report.c:482
 kasan_report+0x118/0x150 mm/kasan/report.c:595
 reuseport_select_sock+0xedc/0x1220 net/core/sock_reuseport.c:596
 udp4_lib_lookup2+0x3bc/0x950 net/ipv4/udp.c:495
 __udp4_lib_lookup+0x768/0xe20 net/ipv4/udp.c:723
 __udp4_lib_lookup_skb+0x297/0x390 net/ipv4/udp.c:752
 __udp4_lib_rcv+0x1312/0x2620 net/ipv4/udp.c:2752
 ip_protocol_deliver_rcu+0x282/0x440 net/ipv4/ip_input.c:207
 ip_local_deliver_finish+0x3bb/0x6f0 net/ipv4/ip_input.c:241
 NF_HOOK+0x30c/0x3a0 include/linux/netfilter.h:318
 NF_HOOK+0x30c/0x3a0 include/linux/netfilter.h:318
 __netif_receive_skb_one_core net/core/dev.c:6181 [inline]
 __netif_receive_skb net/core/dev.c:6294 [inline]
 process_backlog+0xaa4/0x1960 net/core/dev.c:6645
 __napi_poll+0xae/0x340 net/core/dev.c:7709
 napi_poll net/core/dev.c:7772 [inline]
 net_rx_action+0x5d7/0xf50 net/core/dev.c:7929
 handle_softirqs+0x22b/0x870 kernel/softirq.c:622
 do_softirq+0x76/0xd0 kernel/softirq.c:523
 </IRQ>
 <TASK>
 __local_bh_enable_ip+0xf8/0x130 kernel/softirq.c:450
 local_bh_enable include/linux/bottom_half.h:33 [inline]
 rcu_read_unlock_bh include/linux/rcupdate.h:924 [inline]
 __dev_queue_xmit+0x1dd7/0x3710 net/core/dev.c:4890
 neigh_output include/net/neighbour.h:556 [inline]
 ip_finish_output2+0xca9/0x1070 net/ipv4/ip_output.c:237
 NF_HOOK_COND include/linux/netfilter.h:307 [inline]
 ip_output+0x29f/0x450 net/ipv4/ip_output.c:438
 ip_send_skb+0x45/0xc0 net/ipv4/ip_output.c:1508
 udp_send_skb+0xb04/0x1510 net/ipv4/udp.c:1195
 udp_sendmsg+0x1a71/0x2350 net/ipv4/udp.c:1485
 sock_sendmsg_nosec net/socket.c:727 [inline]
 __sock_sendmsg net/socket.c:742 [inline]
 __sys_sendto+0x554/0x680 net/socket.c:2206
 __do_sys_sendto net/socket.c:2213 [inline]
 __se_sys_sendto net/socket.c:2209 [inline]
 __x64_sys_sendto+0xde/0x100 net/socket.c:2209
 do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
 do_syscall_64+0x160/0xf80 arch/x86/entry/syscall_64.c:94
 entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x415a2d
Code: b3 66 2e 0f 1f 84 00 00 00 00 00 66 90 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b8 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007f6bc31e41e8 EFLAGS: 00000212 ORIG_RAX: 000000000000002c
RAX: ffffffffffffffda RBX: 00007f6bc31e4cdc RCX: 0000000000415a2d
RDX: 0000000000000001 RSI: 00007f6bc31e421f RDI: 0000000000000003
RBP: 00007f6bc31e4240 R08: 00007f6bc31e4220 R09: 0000000000000010
R10: 0000000000000000 R11: 0000000000000212 R12: 00007f6bc31e46c0
R13: ffffffffffffffb8 R14: 0000000000000000 R15: 00007ffc9b0d70b0
 </TASK>

Fixes: 538950a1b7 ("soreuseport: setsockopt SO_ATTACH_REUSEPORT_[CE]BPF")
Reported-by: Eulgyu Kim <eulgyukim@snu.ac.kr>
Reported-by: Taeyang Lee <0wn@theori.io>
Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20260426012647.3233119-1-kuniyu@google.com
2026-05-08 22:40:05 +02:00
Linus Torvalds
cbf457c584 block-7.1-20260508
-----BEGIN PGP SIGNATURE-----
 
 iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAmn+FSMQHGF4Ym9lQGtl
 cm5lbC5kawAKCRD301j7KXHgphZLEACU3HeIpCrWGqLShqiop0x9FSzREhdAOgHC
 SKJVPyJeGGvS46dCBM3Dkhfwa8VyJvLkpfAYcg9FXQxDyvMB+40jZVxUT+tshHPm
 UtjIGoUJ8DMtLzXzvg73nEcnIZWOmTlSQFwTCfOo+h79zHzsJu+YTUoNQ4vgGO6l
 Hn+lNybPHIdPizH4hMq913OcIApQyTuA0+JsuUm5YVb0kUiJoDIvfwuOH6jIQ5aX
 Mv7MTRRvXRQ/XRgY+Er46YYSL+GA+IEUgVXLawSNiaiOj17m0G1jzNIG7ERZJVkE
 vWD+Yq9U0ECEWT9V1c0VaDnOAK8Bfd/h3ZGnYAWXtfj3aToRyIUOfw5rO0UdoT47
 /fQWrz8qS3X71FlQHmNVgB3hZWSQ+o+3TrWqhSm2oytCZ3mpzz9Rytt4qltfX6tz
 2PncyyD124m6KAaouCQpitZlTLoXQWCdYmjCbfOwFYPkBjROXYtMu0x/eoCyjusg
 e42qt8zDc1ZnpgMejto4g1IBmrQLpQK9ui1cJzwbcuFomegIzfsivtYEav4v0JYN
 Y4SmuvvwOiErJVrQzhBXR/iDUSQ/d4AEow1GGTKRrt+naip4PWx9/HPCsJQ9N+xv
 kvcpZhTZQ8G063wCNW1Nt3c5kzDQ6xjeO9FwVYrRxLGYHKjhmWRzRjaKzODEG9g4
 NY3LC7T/JA==
 =k+a/
 -----END PGP SIGNATURE-----

Merge tag 'block-7.1-20260508' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux

Pull block fixes from Jens Axboe:

 - Fix for ublk not doing an actual issue from the task_work fallback
   path. Any request hitting that should be canceled automatically

 - Fix for uring_cmd prep side handling, for the block side uring_cmd
   discard handling

 - Fix for missing validation of the io and physical block size shifts

 - Fix for a use-after-free in ublk's cancel command handling

* tag 'block-7.1-20260508' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux:
  ublk: fix use-after-free in ublk_cancel_cmd()
  ublk: validate physical_bs_shift, io_min_shift and io_opt_shift
  block: only read from sqe on initial invocation of blkdev_uring_cmd()
  ublk: don't issue uring_cmd from fallback task work
2026-05-08 13:18:13 -07:00
Linus Torvalds
8be01e1280 io_uring-7.1-20260508
-----BEGIN PGP SIGNATURE-----
 
 iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAmn+FU4QHGF4Ym9lQGtl
 cm5lbC5kawAKCRD301j7KXHgpnIsEADVG1zqBrfj6JE0pscyrFbFHwAjoihs7jW2
 dHVRzrWKlCG0iflVAxIcoA6WQLAc1W5cmWi9GHuOHtPvWY1rdiZCvp8GMGuq40ye
 3OQrMAcDdpowAUBfO9tiZ9L9Bn96HFZCa92V0PEp/fSPxuRv1HGE/yTpWsardbxn
 eUGBoOMAclqawMU5thfOXFMT+DetwrY//nd799iEElzyNfk92mDCZZ5n3WPyl1J4
 hn/iUu04YVozto9P17SJfEOg1c4kz84wL6ATR+2IuxrWm8/LxXspmbIovJYaCLRr
 EkdevTrxABBTJ77dllnnaFg233F75ZdYr0z0xgHoOFT2totSz2lZFxqz8R0/b/NE
 mHdshkn4LTU4yDHuILDt0LxImi62i1Bmn7QQIbICcMAEhTGh2hebaM/lmmHp7IlN
 R98q4ALm+dTu5vp+MDlve53P4UITxsgclAICqxyY26FrNneHd/TodeDYPGYLwj4F
 2EPZyzHe3WpTXMmF6pxLlEr2r8DRqZhBqj+mohN/pZK+ecs6GmCLh1F1zcuaLQyg
 VOPf3FY48f6EWku0gCUpOev2iFTQHICf3RC39uR5pVVdC3+Yzc2+yqa9xg+I5ckt
 gNTbmD1vkssaCmxIXR0Cj/pHskZxJqtdmDJmcBQfbLm9ytFuCGFK0IqpvpyRya/H
 jGjRGJxJ5w==
 =y3+7
 -----END PGP SIGNATURE-----

Merge tag 'io_uring-7.1-20260508' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux

Pull io_uring fixes from Jens Axboe:

 - Ensure that the absolute timeouts for both the command side and the
   waiting side honor the callers time namespace

 - Ensure tracked NAPI entries are cleared at unregistration time, as
   the NAPI polling loop checks the list state rather than the general
   NAPI state. This can lead to NAPI polling even after unregistration
   has been done. If unregistered, all NAPI polling should be disabled

 - Fix for eventfd recursive invocation handling

* tag 'io_uring-7.1-20260508' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux:
  io_uring/wait: honour caller's time namespace for IORING_ENTER_ABS_TIMER
  io_uring/timeout: honour caller's time namespace for IORING_TIMEOUT_ABS
  io_uring/eventfd: reset deferred signal state
  io_uring/napi: clear tracked NAPI entries on unregister
2026-05-08 13:12:48 -07:00
Mario Limonciello
db5dadb562 Revert "ACPI: CPPC: Adjust debug messages in amd_set_max_freq_ratio() to warn"
Some older systems don't support CPPC in the firmware and this just makes
noise for them when booting.  Drop back to debug.

This reverts commit 21fb59ab4b.

Fixes: 21fb59ab4b ("ACPI: CPPC: Adjust debug messages in amd_set_max_freq_ratio() to warn")
Suggested-by: Kim Phillips <kim.phillips@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Tested-by: Kim Phillips <kim.phillips@amd.com>
Cc: All applicable <stable@vger.kernel.org>
Link: https://patch.msgid.link/20260504230141.484743-2-mario.limonciello@amd.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2026-05-08 21:14:19 +02:00
Martin KaFai Lau
f3b8c28135 Merge branch 'bpf-tcp-fix-type-confusion-in-bpf-helper-functions'
Kuniyuki Iwashima says:

====================
bpf: tcp: Fix type confusion in bpf helper functions.

bpf_tcp_sock() only check if sk->sk_protocol is IPPROTO_TCP,
but RAW socket can bypass it:

  socket(AF_INET, SOCK_RAW, IPPROTO_TCP)

The same issues exist in other bpf functions:

  * bpf_mptcp_sock_from_subflow()
  * bpf_skc_to_tcp_sock()
  * bpf_skc_to_tcp6_sock()
  * sol_tcp_sockopt()

Patch 1 fixes bpf_tcp_sock() and Patch 2 adds a test for it.
Patch 3 ~ 6 fix the rest of the functions above.

Changes:
  v2:
    * Inverse if (err) to if (!err) in the selftest
    * Add patch 3 ~ 6

  v1: https://lore.kernel.org/bpf/20260430184405.1227386-1-kuniyu@google.com/
      https://lore.kernel.org/mptcp/20260430-mptcp-bpf-mptcp-sock-type-v1-1-d2ed5cda7da9@kernel.org/
====================

Link: https://patch.msgid.link/20260504210610.180150-1-kuniyu@google.com
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2026-05-08 11:38:11 -07:00
Kuniyuki Iwashima
1c2958e4ab bpf: tcp: Fix type confusion in sol_tcp_sockopt().
sol_tcp_sockopt() only checks if sk->sk_protocol is IPPROTO_TCP,
but RAW socket can bypass it:

  socket(AF_INET, SOCK_RAW, IPPROTO_TCP)

Let's use sk_is_tcp().

Note that initially sol_tcp_sockopt() checked sk->sk_prot->setsockopt.

Fixes: 2ab42c7b87 ("bpf: Check the protocol of a sock to agree the calls to bpf_setsockopt().")
Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Link: https://patch.msgid.link/20260504210610.180150-7-kuniyu@google.com
2026-05-08 11:38:10 -07:00
Kuniyuki Iwashima
843064b0a7 bpf: tcp: Fix type confusion in bpf_skc_to_tcp6_sock().
bpf_skc_to_tcp6_sock() only checks if sk->sk_protocol is IPPROTO_TCP
and sk->sk_family is AF_INET6, but RAW socket can bypass it:

  socket(AF_INET6, SOCK_RAW, IPPROTO_TCP)

Let's check sk->sk_type too.

Fixes: af7ec13833 ("bpf: Add bpf_skc_to_tcp6_sock() helper")
Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Link: https://patch.msgid.link/20260504210610.180150-6-kuniyu@google.com
2026-05-08 11:38:10 -07:00
Kuniyuki Iwashima
decb84b838 bpf: tcp: Fix type confusion in bpf_skc_to_tcp_sock().
bpf_skc_to_tcp_sock() only checks if sk->sk_protocol is
IPPROTO_TCP, but RAW socket can bypass it:

  socket(AF_INET, SOCK_RAW, IPPROTO_TCP)

Let's use sk_is_tcp().

Fixes: 478cfbdf5f ("bpf: Add bpf_skc_to_{tcp, tcp_timewait, tcp_request}_sock() helpers")
Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Link: https://patch.msgid.link/20260504210610.180150-5-kuniyu@google.com
2026-05-08 11:38:10 -07:00
Matthieu Baerts (NGI0)
7995b216a7 mptcp: bpf: Fix type confusion in bpf_mptcp_sock_from_subflow()
bpf_mptcp_sock_from_subflow() only checks if sk->sk_protocol is
IPPROTO_TCP, but RAW socket can bypass it:

  socket(AF_INET, SOCK_RAW, IPPROTO_TCP)

In this case, it would NOT be valid to call sk_is_mptcp() which will
assume sk is a pointer to a struct tcp_sock, and wrongly checks for:
tcp_sk(sk)->is_mptcp.

Fixes: 3bc253c2e6 ("bpf: Add bpf_skc_to_mptcp_sock_proto")
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com>
Link: https://patch.msgid.link/20260504210610.180150-4-kuniyu@google.com
2026-05-08 11:38:10 -07:00
Kuniyuki Iwashima
d73549b8bb selftest: bpf: Add test for bpf_tcp_sock() and RAW socket.
Let's extend sockopt_sk.c to cover bpf_tcp_sock() for the
wrong socket type.

Before:
  # ./test_progs -t sockopt_sk
  [  151.948613] ==================================================================
  [  151.951376] BUG: KASAN: slab-out-of-bounds in sol_tcp_sockopt+0xc7/0x8e0
  [  151.954159] Read of size 8 at addr ffff88801083d760 by task test_progs/1259
  ...
  run_test:FAIL:getsetsockopt unexpected error: -1 (errno 0)
  #427     sockopt_sk:FAIL

After:
  #427     sockopt_sk:OK

While at it, missing free() is fixed up.

Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Link: https://patch.msgid.link/20260504210610.180150-3-kuniyu@google.com
2026-05-08 11:38:08 -07:00
Breno Leitao
0143033dc2 workqueue: Fix wq->cpu_pwq leak in alloc_and_link_pwqs() WQ_UNBOUND path
For WQ_UNBOUND workqueues, alloc_and_link_pwqs() allocates wq->cpu_pwq
via alloc_percpu() and then calls apply_workqueue_attrs_locked(). On
failure it returns the error directly, bypassing the enomem: label
which holds the only free_percpu(wq->cpu_pwq) in this function.

The caller's error path kfree()s wq without touching wq->cpu_pwq,
leaking one percpu pointer table (nr_cpu_ids * sizeof(void *) bytes) per
failed call.

If kmemleak is enabled, we can see:

  unreferenced object (percpu) 0xc0fffa5b121048 (size 8):
    comm "insmod", pid 776, jiffies 4294682844
    backtrace (crc 0):
      pcpu_alloc_noprof+0x665/0xac0
      __alloc_workqueue+0x33f/0xa20
      alloc_workqueue_noprof+0x60/0x100

Route the error through the existing enomem: cleanup and any error
before this one.

Cc: stable@kernel.org
Fixes: 636b927eba ("workqueue: Make unbound workqueues to use per-cpu pool_workqueues")
Signed-off-by: Breno Leitao <leitao@debian.org>
Signed-off-by: Tejun Heo <tj@kernel.org>
2026-05-08 07:59:44 -10:00