Commit Graph

1445078 Commits

Author SHA1 Message Date
Jason Gunthorpe
159f2efabc RDMA/mana: Remove user triggerable WARN_ON() in mana_ib_create_qp_rss()
Sashiko points out that the user can specify WQs sharing the same CQ as a
part of the uAPI and this will trigger the WARN_ON() then go on to corrupt
the kernel.

Just reject it outright and fail the QP creation.

Cc: stable@vger.kernel.org
Fixes: c15d7802a4 ("RDMA/mana_ib: Add CQ interrupt support for RAW QP")
Link: https://sashiko.dev/#/patchset/0-v2-1c49eeb88c48%2B91-rdma_udata_rep_jgg%40nvidia.com?part=1
Link: https://patch.msgid.link/r/5-v1-41f3135e5565+9d2-rdma_ai_fixes1_jgg@nvidia.com
Reviewed-by: Long Li <longli@microsoft.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2026-05-02 15:30:48 -03:00
Jason Gunthorpe
6dd2d4ad9c RDMA/mana: Validate rx_hash_key_len
Sashiko points out that rx_hash_key_len comes from a uAPI structure and is
blindly passed to memcpy, allowing the userspace to trash kernel
memory. Bounds check it so the memcpy cannot overflow.

Cc: stable@vger.kernel.org
Fixes: 0266a17763 ("RDMA/mana_ib: Add a driver for Microsoft Azure Network Adapter")
Link: https://sashiko.dev/#/patchset/0-v2-1c49eeb88c48%2B91-rdma_udata_rep_jgg%40nvidia.com?part=1
Link: https://patch.msgid.link/r/4-v1-41f3135e5565+9d2-rdma_ai_fixes1_jgg@nvidia.com
Reviewed-by: Long Li <longli@microsoft.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2026-05-02 15:30:48 -03:00
Jason Gunthorpe
45e8ebc9ed RDMA/mlx5: Add missing store/release for lock elision pattern
mlx5 has a common pattern implementing a device-global singleton resource
where it checks the resource pointer for !NULL and then skips obtaining
the lock.

This is not ordered properly as observing !NULL doesn't mean that all the
data under that pointer is also visible on this CPU when the lock is not
taken.

Use a release/acquire pairing to explicitly manage this.

Pointed out by sashiko, Codex found more cases.

Fixes: 5895e70f2e ("IB/mlx5: Allocate resources just before first QP/SRQ is created")
Fixes: 638420115c ("IB/mlx5: Create UMR QP just before first reg_mr occurs")
Link: https://sashiko.dev/#/patchset/SYBPR01MB7881E1E0970268BD69C0BA75AF2B2%40SYBPR01MB7881.ausprd01.prod.outlook.com
Link: https://patch.msgid.link/r/3-v1-41f3135e5565+9d2-rdma_ai_fixes1_jgg@nvidia.com
Assisted-by: Codex:GPT-5.5
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2026-05-02 15:30:48 -03:00
Jason Gunthorpe
641858d52f RDMA/mlx5: Restore zero-init to mlx5_ib_modify_qp() ucmd
Sashiko points out the check for inlen==0 got missed, the ={} was not
redundant, put it back.

Fixes: a9cd442a5347 ("RDMA: Remove redundant = {} for udata req structs")
Link: https://patch.msgid.link/r/2-v1-41f3135e5565+9d2-rdma_ai_fixes1_jgg@nvidia.com
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2026-05-02 15:30:48 -03:00
Jason Gunthorpe
70f780edcd RDMA/ionic: Fix typo in format string
Applying the corrupted patch by hand mangled the format string, put the s
in the right place.

Cc: stable@vger.kernel.org
Fixes: 654a27f255 ("RDMA/ionic: bound node_desc sysfs read with %.64s")
Link: https://patch.msgid.link/r/1-v1-41f3135e5565+9d2-rdma_ai_fixes1_jgg@nvidia.com
Reported-by: Brad Spengler <brad.spengler@opensrcsec.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2026-05-02 15:30:48 -03:00
Maoyi Xie
1d324c2f43 ip6_gre: Use cached t->net in ip6erspan_changelink().
After commit 5e72ce3e39 ("net: ipv6: Use link netns in newlink() of
rtnl_link_ops"), ip6erspan_newlink() correctly resolves the per-netns
ip6gre hash via link_net. ip6erspan_changelink() was not converted in
that series and still uses dev_net(dev), which diverges from the
device's creation netns after IFLA_NET_NS_FD migration.

This re-inserts the tunnel into the wrong per-netns hash. The
original netns keeps a stale entry. When that netns is later
destroyed, ip6gre_exit_rtnl_net() walks the stale entry, producing a
slab-use-after-free reported by KASAN, followed by a kernel BUG at
net/core/dev.c (LIST_POISON1) in unregister_netdevice_many_notify().

Reachable from an unprivileged user namespace (unshare --user
--map-root-user --net).

ip6gre_changelink() earlier in the same file already uses the cached
t->net; only ip6erspan_changelink() has the wrong shape.

Fixes: 2d665034f2 ("net: ip6_gre: Fix ip6erspan hlen calculation")
Cc: stable@vger.kernel.org # v5.15+
Signed-off-by: Maoyi Xie <maoyi.xie@ntu.edu.sg>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com>
Link: https://patch.msgid.link/20260430103318.3206018-1-maoyi.xie@ntu.edu.sg
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-05-02 10:36:21 -07:00
Jakub Kicinski
386829cd89 Merge branch 'replace-direct-dequeue-call-with-qdisc_dequeue_peeked'
Jamal Hadi Salim says:

====================
Replace direct dequeue call with qdisc_dequeue_peeked

When sfb and red qdiscs have children (eg qfq qdisc) whose peek() callback is
qdisc_peek_dequeued(), we could get a kernel panic. When the parent of such
qdiscs (eg illustrated in patch #3 as tbf) wants to retrieve an skb from
its child (red/sfb in this case), it will do the following:
 1a. do a peek() - and when sensing there's an skb the child can offer, then
     - the child in this case(red/sfb) calls its child's (qfq) peek.
        qfq does the right thing and will return the gso_skb queue packet.
        Note: if there wasnt a gso_skb entry then qfq will store it there.
 1b. invoke a dequeue() on the child (red/sfb). And herein lies the problem.
     - red/sfb will call the child's dequeue() which will essentially just
       try to grab something of qfq's queue.

The right thing to do in #1b is to grab the skb off gso_skb queue.
This patchset fixes that issue by changing #1b to use qdisc_dequeue_peeked()
method instead.

Patch 1 fixes the issue for red qdisc. Patch 2 fixes it for sfb.
Patch 3 adds testcases for the two setups.
====================

Link: https://patch.msgid.link/20260430152957.194015-1-jhs@mojatatu.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-05-02 10:21:00 -07:00
Victor Nogueira
3a3a30c14d selftests/tc-testing: Add tests that force red and sfb to dequeue from child's gso_skb
Create 4 test cases:
- Force red to dequeue from its child's gso_skb with qfq leaf
- Force sfb to dequeue from its child's gso_skb with qfq leaf
- Force red to dequeue from its child's gso_skb with dualpi2 leaf
- Force sfb to dequeue from its child's gso_skb with dualpi2 leaf

All of them have tbf followed by red (or sfb) followed by qfq (or
dualpi2). Since tbf calls its child's peek followed by
qdisc_dequeue_peeked, it will force red/sfb to call their child's peek.
In this case, since the child (qfq/dualpi2) has qdisc_peek_dequeued as
its peek callback, the packet will be stored in its gso_skb queue. During
the subsequent call to qdisc_dequeue_peeked, red/sfb will have to dequeue
from the child's gso_skb to retrieve the packet.
Not doing so will cause a NULL ptr deref which was happening before a
recent fix.

Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Victor Nogueira <victor@mojatatu.com>
Link: https://patch.msgid.link/20260430152957.194015-4-jhs@mojatatu.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-05-02 10:20:56 -07:00
Victor Nogueria
1b9bc71153 net/sched: sch_sfb: Replace direct dequeue call with peek and qdisc_dequeue_peeked
When sfb has children (eg qfq qdisc) whose peek() callback is
qdisc_peek_dequeued(), we could get a kernel panic. When the parent of such
qdiscs (eg illustrated in patch #3 as tbf) wants to retrieve an skb from
its child (sfb in this case), it will do the following:
 1a. do a peek() - and when sensing there's an skb the child can offer, then
     - the child in this case(sfb) calls its child's (qfq) peek.
        qfq does the right thing and will return the gso_skb queue packet.
        Note: if there wasnt a gso_skb entry then qfq will store it there.
 1b. invoke a dequeue() on the child (sfb). And herein lies the problem.
     - sfb will call the child's dequeue() which will essentially just
       try to grab something of qfq's queue.

[  127.594489][  T453] KASAN: null-ptr-deref in range [0x0000000000000048-0x000000000000004f]
[  127.594741][  T453] CPU: 2 UID: 0 PID: 453 Comm: ping Not tainted 7.1.0-rc1-00035-gac961974495b-dirty #793 PREEMPT(full)
[  127.595059][  T453] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
[  127.595254][  T453] RIP: 0010:qfq_dequeue+0x35c/0x1650 [sch_qfq]
[  127.595461][  T453] Code: 00 fc ff df 80 3c 02 00 0f 85 17 0e 00 00 4c 8d 73 48 48 89 9d b8 02 00 00 48 b8 00 00 00 00 00 fc ff df 4c 89 f2 48 c1 ea 03 <80> 3c 02 00 0f 85 76 0c 00 00 48 b8 00 00 00 00 00 fc ff df 4c 8b
[  127.596081][  T453] RSP: 0018:ffff88810e5af440 EFLAGS: 00010216
[  127.596337][  T453] RAX: dffffc0000000000 RBX: 0000000000000000 RCX: dffffc0000000000
[  127.596623][  T453] RDX: 0000000000000009 RSI: 0000001880000000 RDI: ffff888104fd82b0
[  127.596917][  T453] RBP: ffff888104fd8000 R08: ffff888104fd8280 R09: 1ffff110211893a3
[  127.597165][  T453] R10: 1ffff110211893a6 R11: 1ffff110211893a7 R12: 0000001880000000
[  127.597404][  T453] R13: ffff888104fd82b8 R14: 0000000000000048 R15: 0000000040000000
[  127.597644][  T453] FS:  00007fc380cbfc40(0000) GS:ffff88816f2a8000(0000) knlGS:0000000000000000
[  127.597956][  T453] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  127.598160][  T453] CR2: 00005610aa9890a8 CR3: 000000010369e000 CR4: 0000000000750ef0
[  127.598390][  T453] PKRU: 55555554
[  127.598509][  T453] Call Trace:
[  127.598629][  T453]  <TASK>
[  127.598718][  T453]  ? mark_held_locks+0x40/0x70
[  127.598890][  T453]  ? srso_alias_return_thunk+0x5/0xfbef5
[  127.599053][  T453]  sfb_dequeue+0x88/0x4d0
[  127.599174][  T453]  ? ktime_get+0x137/0x230
[  127.599328][  T453]  ? srso_alias_return_thunk+0x5/0xfbef5
[  127.599480][  T453]  ? qdisc_peek_dequeued+0x7b/0x350 [sch_qfq]
[  127.599670][  T453]  ? srso_alias_return_thunk+0x5/0xfbef5
[  127.599831][  T453]  tbf_dequeue+0x6b1/0x1098 [sch_tbf]
[  127.599988][  T453]  __qdisc_run+0x169/0x1900

The right thing to do in #1b is to grab the skb off gso_skb queue.
This patchset fixes that issue by changing #1b to use qdisc_dequeue_peeked()
method instead.

Fixes: e13e02a3c6 ("net_sched: SFB flow scheduler")
Signed-off-by: Victor Nogueria <victor@mojatatu.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20260430152957.194015-3-jhs@mojatatu.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-05-02 10:20:56 -07:00
Jamal Hadi Salim
458d561527 net/sched: sch_red: Replace direct dequeue call with peek and qdisc_dequeue_peeked
When red qdisc has children (eg qfq qdisc) whose peek() callback is
qdisc_peek_dequeued(), we could get a kernel panic. When the parent of such
qdiscs (eg illustrated in patch #3 as tbf) wants to retrieve an skb from
its child (red in this case), it will do the following:
 1a. do a peek() - and when sensing there's an skb the child can offer, then
     - the child in this case(red) calls its child's (qfq) peek.
        qfq does the right thing and will return the gso_skb queue packet.
        Note: if there wasnt a gso_skb entry then qfq will store it there.
 1b. invoke a dequeue() on the child (red). And herein lies the problem.
     - red will call the child's dequeue() which will essentially just
       try to grab something of qfq's queue.

[   78.667668][  T363] KASAN: null-ptr-deref in range [0x0000000000000048-0x000000000000004f]
[   78.667927][  T363] CPU: 1 UID: 0 PID: 363 Comm: ping Not tainted 7.1.0-rc1-00033-g46f74a3f7d57-dirty #790 PREEMPT(full)
[   78.668263][  T363] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
[   78.668486][  T363] RIP: 0010:qfq_dequeue+0x446/0xc90 [sch_qfq]
[   78.668718][  T363] Code: 54 c0 e8 dd 90 00 f1 48 c7 c7 e0 03 54 c0 48 89 de e8 ce 90 00 f1 48 8d 7b 48 b8 ff ff 37 00 48 89 fa 48 c1 e0 2a 48 c1 ea 03 <80> 3c 02 00 74 05 e8 ef a1 e1 f1 48 8b 7b 48 48 8d 54 24 58 48 8d
[   78.669312][  T363] RSP: 0018:ffff88810de573e0 EFLAGS: 00010216
[   78.669533][  T363] RAX: dffffc0000000000 RBX: 0000000000000000 RCX: 0000000000000000
[   78.669790][  T363] RDX: 0000000000000009 RSI: 0000000000000004 RDI: 0000000000000048
[   78.670044][  T363] RBP: ffff888110dc4000 R08: ffffffffb1b0885a R09: fffffbfff6ba9078
[   78.670297][  T363] R10: 0000000000000003 R11: ffff888110e31c80 R12: 0000001880000000
[   78.670560][  T363] R13: ffff888110dc4150 R14: ffff888110dc42b8 R15: 0000000000000200
[   78.670814][  T363] FS:  00007f66a8f09c40(0000) GS:ffff888163428000(0000) knlGS:0000000000000000
[   78.671110][  T363] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   78.671324][  T363] CR2: 000055db4c6a30a8 CR3: 000000010da67000 CR4: 0000000000750ef0
[   78.671585][  T363] PKRU: 55555554
[   78.671713][  T363] Call Trace:
[   78.671843][  T363]  <TASK>
[   78.671936][  T363]  ? __pfx_qfq_dequeue+0x10/0x10 [sch_qfq]
[   78.672148][  T363]  ? __pfx__printk+0x10/0x10
[   78.672322][  T363]  ? srso_alias_return_thunk+0x5/0xfbef5
[   78.672496][  T363]  ? lockdep_hardirqs_on_prepare+0xa8/0x1a0
[   78.672706][  T363]  ? srso_alias_return_thunk+0x5/0xfbef5
[   78.672875][  T363]  ? trace_hardirqs_on+0x19/0x1a0
[   78.673047][  T363]  red_dequeue+0x65/0x270 [sch_red]
[   78.673217][  T363]  ? srso_alias_return_thunk+0x5/0xfbef5
[   78.673385][  T363]  tbf_dequeue.cold+0xb0/0x70c [sch_tbf]
[   78.673566][  T363]  __qdisc_run+0x169/0x1900

The right thing to do in #1b is to grab the skb off gso_skb queue.
This patchset fixes that issue by changing #1b to use qdisc_dequeue_peeked()
method instead.

Fixes: 77be155cba ("pkt_sched: Add peek emulation for non-work-conserving qdiscs.")
Reported-by: Manas <ghandatmanas@gmail.com>
Reported-by: Rakshit Awasthi <rakshitawasthi17@gmail.com>
Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20260430152957.194015-2-jhs@mojatatu.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-05-02 10:20:55 -07:00
Gregory Fuchedgi
383d0fb894 amd-xgbe: fix PTP addend overflow causing frozen clock
XGBE_PTP_ACT_CLK_FREQ and XGBE_V2_PTP_ACT_CLK_FREQ were 10x too
large (500MHz/1GHz instead of 50MHz/100MHz), causing the computed
addend to overflow the 32-bit tstamp_addend. In the general case
this would result in the clock advancing at the wrong rate. For v2
(PCI), ptpclk_rate is hardcoded to 125MHz, so the addend formula
(ACT_CLK_FREQ << 32) / ptpclk_rate yields exactly 8 * 2^32, and
when stored to the 32-bit tstamp_addend the value is zero. With
addend = 0 the hardware accumulator never overflows and the PTP
clock is fully stopped. For v1 (platform), ptpclk_rate is read from
ACPI/DT so the exact overflow behavior depends on the
firmware-reported frequency.

Define the constants as NSEC_PER_SEC / SSINC so the relationship is
explicit and cannot drift out of sync.

Fixes: fbd47be098 ("amd-xgbe: add hardware PTP timestamping support")
Tested-by: Gregory Fuchedgi <gfuchedgi@gmail.com>
Signed-off-by: Gregory Fuchedgi <gfuchedgi@gmail.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20260429-fix-xgbe-ptp-addend-v1-1-fca5b0ca5e62@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-05-02 10:16:27 -07:00
Holger Brunck
851bba8068 net: wan: fsl_ucc_hdlc: fix ucc_hdlc_remove
If the driver is used in a non tdm mode priv->utdm is a NULL pointer.
Therefore we need to check this pointer first before checking si_regs.

Fixes: c19b6d246a ("drivers/net: support hdlc function for QE-UCC")
Signed-off-by: Holger Brunck <holger.brunck@hitachienergy.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-05-02 10:14:06 -07:00
Holger Brunck
1a57efe250 net: wan: fsl_ucc_hdlc: fix uhdlc_memclean
Unmapping of uf_regs is done from ucc_fast_free and doesn't need to be
done explicitly. If already unmapped ucc_fast_free will crash.

Fixes: c19b6d246a ("drivers/net: support hdlc function for QE-UCC")
Signed-off-by: Holger Brunck <holger.brunck@hitachienergy.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-05-02 10:14:00 -07:00
Maciej W. Rozycki
ddca6da148 MAINTAINERS: Add self for the DEC LANCE network driver
Like with the rest of DECstation and TURBOchannel hardware I have been
handling the DEC LANCE network driver for some 25 years now anyway.

Signed-off-by: Maciej W. Rozycki <macro@orcam.me.uk>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://patch.msgid.link/alpine.DEB.2.21.2604271113520.28583@angie.orcam.me.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-05-02 10:06:32 -07:00
Paul Chaignon
0c7ae13069 tools/headers: Regenerate stddef.h to fix BPF selftests
With commit dacbfc1678 ("crypto: af_alg - Annotate struct af_alg_iv
with __counted_by"), two selftests, test_tag and crypto_sanity, now
indirectly rely on the __counted_by macro. On systems with commit
dacbfc1678 in the installed UAPI headers, the selftests build fails
with:

  In file included from tools/testing/selftests/bpf/prog_tests/crypto_sanity.c:7:
  /usr/include/linux/if_alg.h:45:22: error: expected ‘:’, ‘,’, ‘;’, ‘}’ or ‘__attribute__’ before ‘__counted_by’
     45 |         __u8    iv[] __counted_by(ivlen);
        |                      ^~~~~~~~~~~~

This patch fixes it by regenerating stddef.h in tools/include using the
instructions from commit a778f5d46b ("tools/headers: Pull in stddef.h
to uapi to fix BPF selftests build in CI").

Fixes: dacbfc1678 ("crypto: af_alg - Annotate struct af_alg_iv with __counted_by")
Signed-off-by: Paul Chaignon <paul.chaignon@gmail.com>
Reviewed-by: Alan Maguire <alan.maguire@oracle.com>
Tested-by: Ihor Solodrai <ihor.solodrai@linux.dev>
Link: https://lore.kernel.org/r/8da8ef16055aa452d940668ed5359ce54adc6b0b.1777715500.git.paul.chaignon@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2026-05-02 09:37:44 -07:00
Shota Zaizen
996454bc0d ksmbd: validate inherited ACE SID length
smb_inherit_dacl() walks the parent directory DACL loaded from the
security descriptor xattr. It verifies that each ACE contains the fixed
SID header before using it, but does not verify that the variable-length
SID described by sid.num_subauth is fully contained in the ACE.

A malformed inheritable ACE can advertise more subauthorities than are
present in the ACE. compare_sids() may then read past the ACE.
smb_set_ace() also clamps the copied destination SID, but used the
unchecked source SID count to compute the inherited ACE size. That could
advance the temporary inherited ACE buffer pointer and nt_size accounting
past the allocated buffer.

Fix this by validating the parent ACE SID count and SID length before
using the SID during inheritance. Compute the inherited ACE size from the
copied SID so the size matches the bounded destination SID. Reject the
inherited DACL if size accumulation would overflow smb_acl.size or the
security descriptor allocation size.

Fixes: e2f34481b2 ("cifsd: add server-side procedures for SMB3")
Signed-off-by: Shota Zaizen <s@zaizen.me>
Acked-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
2026-05-01 21:49:35 -05:00
Namjae Jeon
6fd7dd4e44 ksmbd: fix kernel-doc warnings from ksmbd_conn_get/put()
The kernel test robot reported W=1 build warnings for ksmbd_conn_get()
and ksmbd_conn_put() due to missing parameter descriptions.
Add the @conn description to fix these warnings.

Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
2026-05-01 21:49:35 -05:00
Shuhao Fu
a74668eb2c ksmbd: fail share config requests when path allocation fails
Non-pipe shares must have a duplicated backing path before they can be
published. share_config_request() currently calls kstrndup() for that
path, but if the allocation fails it leaves ret unchanged. If veto list
parsing succeeds and share->name exists, the partially built share is
still inserted into the global share table with share->path left NULL.

A later share-root SMB2 create uses tree_conn->share_conf->path as the
lookup root. If the share was published with path == NULL, that request
passes a NULL pathname into do_getname_kernel()/strlen() and can crash
the ksmbd worker.

Set ret = -ENOMEM when path duplication fails so the incomplete share is
destroyed before publication.

Fixes: e2f34481b2 ("cifsd: add server-side procedures for SMB3")
Signed-off-by: Shuhao Fu <sfual@cse.ust.hk>
Acked-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
2026-05-01 21:49:35 -05:00
DaeMyung Kang
bf736184d0 ksmbd: close durable scavenger races against m_fp_list lookups
ksmbd_durable_scavenger() has two related races against any walker
that iterates f_ci->m_fp_list, including ksmbd_lookup_fd_inode()
(used by ksmbd_vfs_rename) and the share-mode checks in
fs/smb/server/smb_common.c.

(1) fp->node list-head reuse.  Durable-preserved handles can remain
linked on f_ci->m_fp_list after session teardown so share-mode checks
still see them while the handle is reconnectable.  The scavenger
collected expired handles by adding fp->node to a local
scavenger_list after removing them from the global durable idr.
Because fp->node is the same list_head used by m_fp_list,
list_add(&fp->node, &scavenger_list) overwrites the m_fp_list links
and corrupts both lists.  CONFIG_DEBUG_LIST can report this on the
share-mode walk path.

(2) Refcount race against m_fp_list walkers.  The scavenger qualifies
an expired durable handle with atomic_read(&fp->refcount) > 1 and
fp->conn under global_ft.lock, removes fp from global_ft, then drops
global_ft.lock before unlinking fp from m_fp_list and freeing it.
During that gap fp is still linked on m_fp_list with f_state ==
FP_INITED.  ksmbd_lookup_fd_inode() under m_lock read calls
ksmbd_fp_get() (atomic_inc_not_zero on refcount that is still 1) and
takes a live reference; the scavenger then unlinks and frees fp
while the holder owns a reference, leading to UAF on the holder's
subsequent ksmbd_fd_put() and on any field reads performed by a
concurrent share-mode walker that iterates m_fp_list without taking
ksmbd_fp_get() (smb_check_perm_dleases-like paths).

Fix both:

  * Stop reusing fp->node as a scavenger-private list node.  Remove
    one expired handle from global_ft under global_ft.lock, take an
    explicit transient reference, drop the lock, unlink fp->node
    from m_fp_list under f_ci->m_lock, then drop both the durable
    lifetime and transient references with atomic_sub_and_test(2,
    &fp->refcount).  If the scavenger is the last putter the close
    runs there; otherwise an in-flight holder that already raced
    through the m_fp_list lookup owns the final close via its
    ksmbd_fd_put() path.  The one-at-a-time disposal can rescan the
    durable idr when multiple handles expire in the same pass, but
    durable scavenging is a background expiration path and the final
    full scan recomputes min_timeout before the next wait.

  * Clear fp->persistent_id inside __ksmbd_remove_durable_fd() right
    after idr_remove(), so a delayed final close from a holder that
    snatched fp does not re-issue idr_remove() on a persistent id
    that idr_alloc_cyclic() in ksmbd_open_durable_fd() may have
    already handed out to a brand-new durable handle.

  * Bypass the per-conn open_files_count decrement in
    __put_fd_final() when fp is detached from any session table
    (fp->conn cleared by session_fd_check() at durable preserve --
    paired with the volatile_id clear at unpublish, so checking
    fp->conn alone is sufficient).  The walker that owns the final
    close runs from an unrelated work->conn whose
    stats.open_files_count never tracked this durable fp; without
    this guard the holder would underflow that unrelated counter.

The two races are folded into one patch because patch (1) alone
cleans up the corrupted list but leaves a deterministic UAF window
for m_fp_list walkers that the transient-reference and
persistent_id discipline in (2) close; bisecting onto an
intermediate state would land on a UAF that pre-patch chaos merely
made less reproducible.

Validation:
  * CONFIG_DEBUG_LIST coverage for the list_head reuse path.
  * KASAN-enabled direct SMB2 durable-handle coverage that exercised
    ksmbd_durable_scavenger() and non-NULL ksmbd_lookup_fd_inode()
    returns while durable handles expired under concurrent rename
    lookups, with no KASAN, UAF, list-corruption, ODEBUG, or WARNING
    reports.
  * checkpatch --strict
  * make -j$(nproc) M=fs/smb/server

Fixes: d484d621d4 ("ksmbd: add durable scavenger timer")
Signed-off-by: DaeMyung Kang <charsyam@gmail.com>
Acked-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
2026-05-01 21:49:35 -05:00
DaeMyung Kang
a42896bebf ksmbd: harden file lifetime during session teardown
__close_file_table_ids() is the per-session teardown that closes every
fp belonging to a session (or to one tree connect on that session) by
walking the session's volatile-id idr.  The current loop has three
related problems on busy or racing workloads:

  * Sleeping under ft->lock.  The session-teardown skip callback,
    session_fd_check(), already sleeps in ksmbd_vfs_copy_durable_owner()
    -> kstrdup(GFP_KERNEL) and down_write(&fp->f_ci->m_lock) (a
    rw_semaphore).  Running the callback inside write_lock(&ft->lock)
    trips CONFIG_DEBUG_ATOMIC_SLEEP / CONFIG_PROVE_LOCKING on a
    durable-fd workload.

  * Refcount accounting blind to f_state.  The unconditional
    atomic_dec_and_test(&fp->refcount) does not distinguish
    FP_INITED (idr-owned reference still intact) from FP_CLOSED (an
    earlier ksmbd_close_fd() already consumed the idr-owned reference
    while leaving fp in the idr because a holder kept refcount
    non-zero).  When the latter races with teardown the same path
    over-decrements into a holder reference and ksmbd_fd_put() later
    UAFs that holder.

  * FP_NEW window.  Between __open_id() publishing fp into the
    session idr and ksmbd_update_fstate(..., FP_INITED) committing the
    transition at the end of smb2_open(), an fp is in FP_NEW and an
    intervening teardown that takes a transient reference and
    unpublishes the volatile id leaves the original idr-owned
    reference orphaned -- the opener is unaware that fp has been
    unpublished, returns success to the client, and the fp leaks at
    refcount = 1.

Refactor __close_file_table_ids() to take a transient reference on fp
and unpublish fp from the session idr *under ft->lock* before calling
skip() outside the lock.  A transient ref protects lifetime but not
concurrent field mutation, so the idr_remove() is what keeps
__ksmbd_lookup_fd() through this session's idr from granting a new
ksmbd_fp_get() reference to an fp whose fp->conn / fp->tcon /
fp->volatile_id / op->conn / lock_list links are about to be rewritten
by session_fd_check().  Durable reconnect is unaffected because it
reaches fp through the global durable table (ksmbd_lookup_durable_fd
-> global_ft).

Decide n_to_drop together with any FP_INITED -> FP_CLOSED transition
under ft->lock so teardown and ksmbd_close_fd() never both consume the
idr-owned reference.  See ksmbd_mark_fp_closed() for the per-state
accounting.  For the FP_NEW path to be safe, the opener has to learn
that fp was unpublished: ksmbd_update_fstate() now returns -ENOENT
when an FP_NEW -> FP_INITED transition finds f_state already advanced
or the volatile id cleared (both committed by teardown under
ft->lock); smb2_open() propagates that as STATUS_OBJECT_NAME_INVALID
and drops the original reference via ksmbd_fd_put().

The list removal cannot be left for a deferred final putter because
fp->volatile_id has already been cleared and __ksmbd_remove_fd() will
intentionally skip both idr_remove() and list_del_init().  Move the
m_fp_list unlink in __ksmbd_remove_fd() above the volatile-id check so
that an FP_NEW fp that happened to be added to m_fp_list (smb2_open()
adds fp->node before ksmbd_update_fstate() runs) is still cleaned up
on the deferred putter path; list_del_init() on an empty node is a
no-op and remains safe for fps that were never added.

Add a defensive guard in session_fd_check() that refuses non-FP_INITED
fps so that even if a teardown reaches an FP_NEW fp it falls into the
close branch (where the n_to_drop = 1 accounting keeps the opener's
reference alive) instead of the durable-preserve branch (which mutates
fp->conn / fp->tcon).

Validation on a debug kernel additionally built with CONFIG_DEBUG_LIST
and CONFIG_DEBUG_OBJECTS_WORK used a same-session two-tcon workload
(open/write storm on one tcon, 50 tree disconnects on the other) and
reported no list-corruption, work_struct ODEBUG, sleep-in-atomic,
lockdep or kmemleak reports.  Reverting only the
__close_file_table_ids() hunk while keeping a forced-is_reconnectable()
harness produced the expected sleep-in-atomic at vfs_cache.c:1095,
confirming the ft->lock-out-of-sleepable-skip discipline.

KASAN-enabled direct SMB2 coverage with durable handles enabled
exercised ksmbd_close_tree_conn_fds(), ksmbd_close_session_fds(),
the FP_NEW failure path, tree_conn_fd_check(), and a non-zero
session_fd_check() durable-preserve return.  This produced no KASAN,
DEBUG_LIST, ODEBUG, or WARNING reports.

Fixes: f441584858 ("cifsd: add file operations")
Signed-off-by: DaeMyung Kang <charsyam@gmail.com>
Acked-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
2026-05-01 21:49:35 -05:00
DaeMyung Kang
b1f1e80620 ksmbd: centralize ksmbd_conn final release to plug transport leak
ksmbd_conn_free() is one of four sites that can observe the last
refcount drop of a struct ksmbd_conn.  The other three

    fs/smb/server/connection.c    ksmbd_conn_r_count_dec()
    fs/smb/server/oplock.c        __free_opinfo()
    fs/smb/server/vfs_cache.c     session_fd_check()

end the conn with a bare kfree(), skipping
ida_destroy(&conn->async_ida) and
conn->transport->ops->free_transport(conn->transport).  Whenever one
of them is the last putter, the embedded async_ida and the entire
transport struct leak -- for TCP, that is also the struct socket and
the kvec iov.

__free_opinfo() being a final putter is not theoretical.  opinfo_put()
queues the callback via call_rcu(&opinfo->rcu, free_opinfo_rcu), so
ksmbd_server_terminate_conn() can deposit N opinfo releases in RCU and
have ksmbd_conn_free() run in the handler thread before any of them
fire.  ksmbd_conn_free() then observes refcnt > 0 and short-circuits;
the last RCU-delivered __free_opinfo() falls onto its bare kfree(conn)
branch and the transport is lost.

A/B validation in a QEMU/virtme guest, mounting //127.0.0.1/testshare:
each iteration holds 8 files open via sleep processes, force-closes
TCP with "ss -K sport = :445", kills the holders, lazy-umounts;
repeated 10 times, then ksmbd shutdown and kmemleak scan.

    state         conn_alloc  conn_free  tcp_free  opi_rcu  kmemleak
    ----------    ----------  ---------  --------  -------  --------
    pre-patch         20          20        10       160        7
    with patch        20          20        20       160        0

Pre-patch conn_free=20 with tcp_free=10 directly demonstrates the
bare-kfree paths skipping transport cleanup; kmemleak backtraces point
into struct tcp_transport / iov.  With this patch tcp_free matches
conn_free at 20/20 and kmemleak is clean.

Move the per-struct final release into __ksmbd_conn_release_work() and
route the three bare-kfree final-put sites through a new
ksmbd_conn_put().  Those sites now pair ida_destroy() and
free_transport() with kfree(conn) regardless of which holder happens
to release the last reference.  stop_sessions() only triggers the
transport shutdown and does not itself drop the last conn reference,
so it is unaffected.

The centralized release reaches sock_release() -> tcp_close() ->
lock_sock_nested() (might_sleep) from every final putter, including
__free_opinfo() invoked from an RCU softirq callback, which trips
CONFIG_DEBUG_ATOMIC_SLEEP.  Defer the release to a dedicated
ksmbd_conn_wq workqueue so ksmbd_conn_put() is safe from any
non-sleeping context.

Make ksmbd_file own a strong connection reference while fp->conn is
non-NULL so durable-preserve and final-close paths cannot dereference
a stale connection.  ksmbd_open_fd() and ksmbd_reopen_durable_fd()
take the reference via ksmbd_conn_get() (the latter also reorders the
fp->conn / fp->tcon assignments before __open_id() so the published fp
is never observed with fp->conn == NULL); session_fd_check() and
__ksmbd_close_fd() drop it via ksmbd_conn_put().  With that invariant,
session_fd_check() can take a local conn pointer once and use it
across the m_op_list and lock_list iterations even though op->conn
puts may otherwise drop the last reference.

At module exit the workqueue is flushed and destroyed after
rcu_barrier(), so any release queued by a trailing RCU callback is
drained before the inode hash and module text go away.

Fixes: ee426bfb9d ("ksmbd: add refcnt to ksmbd_conn struct")
Signed-off-by: DaeMyung Kang <charsyam@gmail.com>
Acked-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
2026-05-01 21:49:35 -05:00
Jakub Kicinski
ebb639024e Merge branch 'ipv6-fix-ecmp-route-failover-on-carrier-loss'
Sagarika Sharma says:

====================
ipv6: fix ECMP route failover on carrier loss

This patchset resolves an issue where established IPv6 connections are
unable to transition to alternative ECMP nexthops upon carrier loss.

Unlike IPv4, the IPv6 routing subsystem does not actively invalidate
cached destinations during a NETDEV_CHANGE event. Sockets persist
with dead routes, leading to stalled traffic or connection drops.

This series introduces a fix to trigger route invalidation by
updating the route serial number on link carrier loss and provides
a corresponding selftest to validate the failover behavior for IPv4
and IPv6.
====================

Link: https://patch.msgid.link/20260430200909.527827-1-sharmasagarika@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-05-01 17:58:46 -07:00
Kuniyuki Iwashima
d1ae37dc68 selftest: net: Add test for TCP flow failover with ECMP routes.
Without the previous commit, TCP failed to switch to alternative
IPv6 routes immediately upon carrier loss.

It would persist with the dead route until reaching the threshold
net.ipv4.tcp_retries1, leading to unnecessary delays in failover.

Let's add a selftest for this scenario to ensure TCP fails over
immediately upon a carrier loss event.

Before:
  TEST: TCP IPv4 failover                                             [ OK ]
  TEST: TCP IPv6 failover                                             [FAIL]

After:
  TEST: TCP IPv4 failover                                             [ OK ]
  TEST: TCP IPv6 failover                                             [ OK ]

Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>
Signed-off-by: Sagarika Sharma <sharmasagarika@google.com>
Link: https://patch.msgid.link/20260430200909.527827-3-sharmasagarika@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-05-01 17:58:44 -07:00
Sagarika Sharma
4bc852006b ipv6: update route serial number on NETDEV_CHANGE
When using IPv6 ECMP routes, if a netdev listed as a nexthop experiences
a carrier change event (e.g., a bond device generating a NETDEV_CHANGE
event after its slaves go linkdown), established connections utilizing
that nexthop fail to fail over to other available nexthops. Instead,
these connections stall or drop.

This happens because the IPv6 FIB code does not invalidate the socket's
cached destination when a NETDEV_CHANGE event occurs. While
fib6_ifdown() correctly marks the nexthop with RTNH_F_LINKDOWN, it
leaves the route's serial number unchanged. As a result, sockets with a
previously cached dst do not realize the route is no longer viable and
continue to try using the non-functional nexthop.

This behavior contrasts with IPv4, which actively flushes cached
destinations on a NETDEV_CHANGE event (see fib_netdev_event() in
net/ipv4/fib_frontend.c).

Fix this by updating the route serial number in fib6_ifdown() when
setting RTNH_F_LINKDOWN. This invalidates stale cached destinations,
forcing sockets to perform a new route lookup and fail over to a
functioning nexthop.

Fixes: 51ebd31815 ("ipv6: add support of equal cost multipath (ECMP)")
Signed-off-by: Sagarika Sharma <sharmasagarika@google.com>
Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20260430200909.527827-2-sharmasagarika@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-05-01 17:58:44 -07:00
Eric Dumazet
6d4106e8df net/sched: sch_pie: annotate more data-races in pie_dump_stats()
My prior patch missed few READ_ONCE()/WRITE_ONCE() annotations.

Fixes: 5154561d9b ("net/sched: sch_pie: annotate data-races in pie_dump_stats()")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20260430080056.35104-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-05-01 17:54:57 -07:00
Alex Cheema
a5148bc2fa net: usb: cdc_ncm: add Apple Mac USB-C direct networking quirk
Apple Silicon Macs expose two CDC NCM "private" data interfaces over
USB-C with VID:PID 0x05ac:0x1905 and product string "Mac". This is the
same protocol Apple already ships on iPhone (0x05ac:0x12a8) and iPad
(0x05ac:0x12ab) for RemoteXPC since iOS 17 -- both data interfaces lack
an interrupt status endpoint, so they rely on the FLAG_LINK_INTR-
conditional bind path introduced in commit 3ec8d7572a ("CDC-NCM: add
support for Apple's private interface").

The id_table currently has entries for iPhone and iPad but not for the
Mac. Without a match, cdc_ncm falls through to the generic CDC NCM
class-match entry, which uses the FLAG_LINK_INTR-having cdc_ncm_info
struct, so bind_common() fails on the missing status endpoint and no
netdev appears.

Add id_table entries for both interface numbers (0 and 2) of the Mac,
bound to the existing apple_private_interface_info driver_info.

Verified empirically on a Mac Studio M3 Ultra running macOS 26.5: when
a Mac is connected via USB-C, ioreg shows VID 0x05ac, PID 0x1905,
product string "Mac", with two NCM data interfaces at numbers 0 and 2.
The same PID is presented by all current Apple Silicon Mac models
(MacBook Pro/Air, Mac mini, Mac Studio across the M-series), mirroring
Apple's single-PID-per-family pattern from iPhone/iPad.

After this patch, plugging a Mac into a Linux host running the patched
kernel produces two enx... interfaces (one per data interface),
"ip -br link" lists them as UP, and standard userspace networking
(DHCP, NetworkManager shared mode, etc.) works without any modprobe
overrides or out-of-tree modules.

Signed-off-by: Alex Cheema <alex@exolabs.net>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20260429175739.34426-1-alex@exolabs.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-05-01 17:17:27 -07:00
Eric Dumazet
c6bebaa744 ipv4: igmp: annotate data-races in igmp_heard_query()
Multiple cpus can run igmp_heard_query() concurrently.

Add missing READ_ONCE()/WRITE_ONCE() over following in_dev fields.

- mr_qrv
- mr_qi
- mr_qri
- mr_v1_seen
- mr_v2_seen

Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Reported-by: syzbot+ae9a171f239b14485310@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/netdev/69f38675.050a0220.3cbe47.0002.GAE@google.com
Signed-off-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20260430164836.872079-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-05-01 17:11:42 -07:00
Kai Zen
4b9e327991 net: rtnetlink: zero ifla_vf_broadcast to avoid stack infoleak in rtnl_fill_vfinfo
rtnl_fill_vfinfo() declares struct ifla_vf_broadcast on the stack
without initialisation:

	struct ifla_vf_broadcast vf_broadcast;

The struct contains a single fixed 32-byte field:

	/* include/uapi/linux/if_link.h */
	struct ifla_vf_broadcast {
		__u8 broadcast[32];
	};

The function then copies dev->broadcast into it using dev->addr_len
as the length:

	memcpy(vf_broadcast.broadcast, dev->broadcast, dev->addr_len);

On Ethernet devices (the overwhelming majority of SR-IOV NICs)
dev->addr_len is 6, so only the first 6 bytes of broadcast[] are
written. The remaining 26 bytes retain whatever was previously on
the kernel stack. The full struct is then handed to userspace via:

	nla_put(skb, IFLA_VF_BROADCAST,
		sizeof(vf_broadcast), &vf_broadcast)

leaking up to 26 bytes of uninitialised kernel stack per VF per
RTM_GETLINK request, repeatable.

The other vf_* structs in the same function are explicitly zeroed
for exactly this reason - see the memset() calls for ivi,
vf_vlan_info, node_guid and port_guid a few lines above.
vf_broadcast was simply missed when it was added.

Reachability: any unprivileged local process can open AF_NETLINK /
NETLINK_ROUTE without capabilities and send RTM_GETLINK with an
IFLA_EXT_MASK attribute carrying RTEXT_FILTER_VF. The kernel walks
each VF and emits IFLA_VF_BROADCAST, leaking 26 bytes of stack per
VF per request. Stack residue at this call site can include return
addresses and transient sensitive data; KASAN with stack
instrumentation, or KMSAN, will flag the nla_put() when reproduced.

Zero the on-stack struct before the partial memcpy, matching the
existing pattern used for the other vf_* structs in the same
function.

Fixes: 75345f888f ("ipoib: show VF broadcast address")
Cc: stable@vger.kernel.org
Signed-off-by: Kai Zen <kai.aizen.dev@gmail.com>
Link: https://patch.msgid.link/3c506e8f936e52b57620269b55c348af05d413a2.1777557228.git.kai.aizen.dev@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-05-01 17:04:28 -07:00
Eric Dumazet
4f34002e2e ipmr: prevent info-leak in pmr_cache_report()
Yiming Qian reported:

<quote>
 ipmr_cache_report()` allocates a report skb with `alloc_skb(128,
 GFP_ATOMIC)` and appends a `struct igmphdr` using `skb_put()`. In the
 non-`IGMPMSG_WHOLEPKT` path it initializes only:

 - `igmp->type`
 - `igmp->code`

 but does not initialize:

 - `igmp->csum`
 - `igmp->group`

 Later, `igmpmsg_netlink_event()` copies the bytes after `sizeof(struct
 igmpmsg)` into the `IPMRA_CREPORT_PKT` netlink attribute and emits
 `RTM_NEWCACHEREPORT` on `RTNLGRP_IPV4_MROUTE_R`.

 As a result, 6 bytes of stale heap data from the skb head are
 disclosed to userspace.
</quote>

Let's use skb_put_zero() instead of skb_put() to fix this bug.

Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Reported-by: Yiming Qian <yimingqian591@gmail.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Link: https://patch.msgid.link/20260430070611.4004529-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-05-01 17:01:40 -07:00
Linus Torvalds
f1a5e78a55 drm fixes for 7.1-rc2
core and helpers:
 - calculate framebuffer geometry with format helpers
 - fix docs
 
 amdgpu:
 - GFX12 fix for CONFIG_DRM_DEBUG_MM configs
 - Fix DC analog support
 - Userq fixes
 - GART placement fix
 - Aldebaran SMU fixes
 - AMDGPU_INFO_READ_MMR_REG fix
 - UVD 3.1 fix
 - GC 6 TCC fix
 - Fix root reservation in amdgpu_vm_handle_fault()
 - RAS fix
 - Module reload fix for APUs
 - Fix build for CONFIG_DRM_FBDEV_EMULATION=n
 - IGT DWB regression fix
 - GC 11.5.4 fix
 - VCN user fence fixes
 - JPEG user fence fixes
 - SMU 13.0.6 fix
 - VCN 3/4 IB parser fixes
 - NV3x+ dGPU vblank fix
 - DCE6/8 fixes for LVDS/eDP panels without an EDID
 
 amdkfd:
 - Fix for when CONFIG_HSA_AMD is not set
 - SVM fixes
 
 xe:
 - uapi: Add missing pad and extensions check
 - uapi: Reject unsafe PAT indices for CPU cached memory
 - Drop registration of guc_submit_wedged_fini from xe_guc_submit_wedge
 - Xe3p tuning and workaround fixes
 - USE drm mm instead of drm SA for CCS read/write
 - Fix leaks and null derefs
 - Fix Wa_18022495364
 
 appletbdrm:
 - allocate protocol buffers with kvzalloc()
 
 dma-buf:
 - fix docs
 
 imagination:
 - avoid segfault in debugfs
 
 ofdrm:
 - put PCI device reference on errors
 
 udl:
 - increase USB timeout
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEEKbZHaGwW9KfbeusDHTzWXnEhr4FAmn1JYQACgkQDHTzWXnE
 hr5rkBAAnAEiMatySCl54Zwt9RlC1S8PDJ+cKW0GbGE6ID3UYMcIgBgXjBfRWPGI
 smhCUq1a/tNjIFCO+JCNe3WqX/vhghtJKfh2FJVWy0tu18S/PvrxB5m3Iasm7JfP
 NxKGyoVCXknDQW4dMATWrDm5JoqAsh5b59Jf8WCcBrMQXeqVSZgHxXjVkwj8e092
 i/FIoS/sV83Lf4xJcm9l25+0OcLhkoLdXz6+r7pwFwsafP07mWbXYXa53efWqy8v
 848AH25FaB+cK16QcrluhIvdVFl3iLbX2b7WpJF3TAbDe3Emr4uggBqiqwcI4p5/
 rQGfVZkng1FBLOcHBZ7p0Wsa+F35C+6H14R8fueMiOmsgX6pXZLnJJ0KpQvSEc+d
 acia8SYp1SGTaxBrdvrhRY6BKtcq/ClOPvbIvV0CPuxFtVNWU940FE+b3V51EpbG
 TGhks4Nuh1C7ihm82Kep34pZjx7ZRnQWPAz7Cm9L9ZfX2DOOi9Uu16u71IwgumfL
 yp/7Jt06Hx/TS0qWV1dnH3ZtluQgBA/EUARmv1MNyIEvSOjpvKiVqlNJmlPKi0+9
 piXl0QUrOQz+Wj9glzcM3ENKh7ZxDFJxcIMkHx7q/wEwSIppnhOPuQAwNXWO4Y4Q
 p4X99W+gHfKwVG8BrY5tbW7lbkt8/4MWSR/9R2Vj8prIJCeeKFI=
 =wagp
 -----END PGP SIGNATURE-----

Merge tag 'drm-fixes-2026-05-02' of https://gitlab.freedesktop.org/drm/kernel

Pull drm fixes from Dave Airlie:
 "Fixes for rc2, the usual amdgpu/xe double header, I think xe had a
  couple of weeks combined due to some maintainer access issues,
  otherwise there's just a few misc fixes and documentation fixups.

  core and helpers:
   - calculate framebuffer geometry with format helpers
   - fix docs

  amdgpu:
   - GFX12 fix for CONFIG_DRM_DEBUG_MM configs
   - Fix DC analog support
   - Userq fixes
   - GART placement fix
   - Aldebaran SMU fixes
   - AMDGPU_INFO_READ_MMR_REG fix
   - UVD 3.1 fix
   - GC 6 TCC fix
   - Fix root reservation in amdgpu_vm_handle_fault()
   - RAS fix
   - Module reload fix for APUs
   - Fix build for CONFIG_DRM_FBDEV_EMULATION=n
   - IGT DWB regression fix
   - GC 11.5.4 fix
   - VCN user fence fixes
   - JPEG user fence fixes
   - SMU 13.0.6 fix
   - VCN 3/4 IB parser fixes
   - NV3x+ dGPU vblank fix
   - DCE6/8 fixes for LVDS/eDP panels without an EDID

  amdkfd:
   - Fix for when CONFIG_HSA_AMD is not set
   - SVM fixes

  xe:
   - uapi: Add missing pad and extensions check
   - uapi: Reject unsafe PAT indices for CPU cached memory
   - Drop registration of guc_submit_wedged_fini from xe_guc_submit_wedge
   - Xe3p tuning and workaround fixes
   - USE drm mm instead of drm SA for CCS read/write
   - Fix leaks and null derefs
   - Fix Wa_18022495364

  appletbdrm:
   - allocate protocol buffers with kvzalloc()

  dma-buf:
   - fix docs

  imagination:
   - avoid segfault in debugfs

  ofdrm:
   - put PCI device reference on errors

  udl:
   - increase USB timeout"

* tag 'drm-fixes-2026-05-02' of https://gitlab.freedesktop.org/drm/kernel: (77 commits)
  drm/xe/uapi: Reject coh_none PAT index for CPU_ADDR_MIRROR
  drm/xe/uapi: Reject coh_none PAT index for CPU cached memory in madvise
  drm/xe/xelp: Fix Wa_18022495364
  drm/xe/gsc: Fix BO leak on error in query_compatibility_version()
  drm/xe/eustall: Fix drm_dev_put called before stream disable in close
  drm/xe: Fix error cleanup in xe_exec_queue_create_ioctl()
  drm/xe: Fix dma-buf attachment leak in xe_gem_prime_import()
  drm/xe: Fix bo leak in xe_dma_buf_init_obj() on allocation failure
  drm/xe/bo: Fix bo leak on GGTT flag validation in xe_bo_init_locked()
  drm/xe/bo: Fix bo leak on unaligned size validation in xe_bo_init_locked()
  drm/xe: Fix potential NULL deref in xe_exec_queue_tlb_inval_last_fence_put_unlocked
  drm/xe/vf: Use drm mm instead of drm sa for CCS read/write
  drm/xe: Add memory pool with shadow support
  drm/xe/debugfs: Correct printing of register whitelist ranges
  drm/xe: Mark ROW_CHICKEN5 as a masked register
  drm/xe/tuning: Use proper register offset for GAMSTLB_CTRL
  drm/xe/xe3p_lpg: Add missing indirect ring state feature flag
  drm/xe: Drop redundant rtp entries for Wa_14019988906 & Wa_14019877138
  drm/xe/vm: Add missing pad and extensions check
  drm/xe: Drop registration of guc_submit_wedged_fini from xe_guc_submit_wedge()
  ...
2026-05-01 16:56:08 -07:00
Aleksander Jan Bajkowski
f93836b236 net: usb: r8152: add TRENDnet TUC-ET2G v2.0
The TRENDnet TUC-ET2G V2.0 is an RTL8156B based 2.5G Ethernet controller.

Add the vendor and product ID values to the driver. This makes Ethernet
work with the adapter.

Signed-off-by: Aleksander Jan Bajkowski <olek2@wp.pl>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Birger Koblitz <mail@birger-koblitz.de>
Link: https://patch.msgid.link/20260430213435.21821-1-olek2@wp.pl
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-05-01 16:50:49 -07:00
Jakub Kicinski
2ab02ac411 netfilter pull request 26-05-01
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEjF9xRqF1emXiQiqU1w0aZmrPKyEFAmn0hZ8ACgkQ1w0aZmrP
 KyFzNg//ZVbSZyMag+CJoIJv3sMFDJ7uLSEko9mR0nNvo6hPZDWAysCNychhPCDl
 w9yiar5wM9W1zcSWvtlBFozZUcS55mQbcqCHNEyJdSjQ1zTr7C9Dl9zDU3jDJEoK
 aplUk5VvFYFqEp4Bqy7EA1VGY5uc2WzmbsCAf9Z2pjprTQKD/E5tzyx0RFEPksKU
 0pSvsC8VfOES6mJs3KIng6TfvnaC/TWilOtjXC/1y1jl+WftXgwb0gwIVnWKjZnc
 yEJ6h4VOiW2NjwcW+gcaaqvt0c1T4EO/bDvuVnCJzwxDZKI2W9KOs8yQytO2hNTo
 jrAyjTB0F3yDxcnDP1AO8ipkJzu42wOfZblrZKvSmC4Kwwqq8QlsXqD1HMh3oMqv
 JGNJSB8rNbIqt9RTMB+A5wiAZvZbSGZc3qH+y7Z5z/2Zl7u0+Zwl20YZ1r7RqM9Z
 Ay/+QzZIyRAyKmQDr8nSoqmBy2i0wfw79NovvhgPDl9qak8Cfc8Df8wkd59t3z33
 0VzPO9kieTWW6aqW19l88C7dtspsd93IsMZz3He3Lvy5e4dpPG+2OdLKpPkTYHBg
 17KY4Qs7gYM0m5baHlcmana4bZHWcBz146dmIMUuhoj3gPyjgV+s/Hum3YxD/P43
 PNA6X8pI38R8O97VkPXYg1aoQIRLt9YsGwVTYxPXv2gZgLD0Acw=
 =ASC0
 -----END PGP SIGNATURE-----

Merge tag 'nf-26-05-01' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf

Pablo Neira Ayuso says:

====================
Netfilter fixes for net

The following batch contains Netfilter fixes for net:

1) Replace skb_try_make_writable() by skb_ensure_writable() in
   nft_fwd_netdev and the flowtable to deal with uncloned packets
   having their network header in paged fragments.

2) Drop packet if output device does not exist and ensure sufficient
   headroom in nft_fwd_netdev before transmitting the skb.

3) Use the existing dup recursion counter in nft_fwd_netdev for the
   neigh_xmit variant, from Weiming Shi.

4) Add .check_hooks interface to x_tables to detach the control plane
   hook check based on the match/target configuration. Then, update
   nft_compat to use .check_hooks from .validate path, this fixes a
   lack of hook validation for several match/targets.

5) Fix incorrect .usersize in xt_CT, from Florian Westphal.

6) Fix a memleak with netdev tables in dormant state,
   from Florian Westphal.

7) Several patches to check if the packet is a fragment, then skip
   layer 4 inspection, for x_tables and nf_tables; as well as common
   nf_socket infrastructure. The xt_hashlimit match drops fragments
   to stay consistent with the existing approach when failing to parse
   the layer 4 protocol header.

8) Ensure sufficient headroom in the flowtable before transmitting
   the skb.

9) Fix the flowtable inline vlan approach for double-tagged vlan:
   Reverse the iteration over .encap[] since it represents the
   encapsulation as seen from the ingress path. Postpone pushing
   layer 2 header so output device is available to calculate needed
   headroom. Finally, add and use nf_flow_vlan_push() to fix it.

10) Fix flowtable inline pppoe with GSO packets. Moreover, use
    FLOW_OFFLOAD_XMIT_DIRECT to fill up destination hardware
    address since neighbour cache does not exist in pppoe.

11) Use skb_pull_rcsum() to decapsulate vlan and pppoe headers, for
    double-tagged vlan in particular this should provide some benefits
    in certain scenarios.

More notes regarding 9-11):

- sashiko is also signalling to use it for IPIP headers, but that needs
  more adjustments such setting skb->protocol after removing the IPIP
  header, will follow up in a separated patch.
- I plan to submit selftests to cover double-tagged-vlan. As for pppoe,
  it should be possible but that would mandate a few userspace dependencies.
  This has been semi-automatically  tested by me and reporters describing
  broken double-vlan-tagged and pppoe currently in the flowtable.

* tag 'nf-26-05-01' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf:
  netfilter: flowtable: use skb_pull_rcsum() to pop vlan/pppoe header
  netfilter: flowtable: fix inline pppoe encapsulation in xmit path
  netfilter: flowtable: fix inline vlan encapsulation in xmit path
  netfilter: flowtable: ensure sufficient headroom in xmit path
  netfilter: xtables: fix L4 header parsing for non-first fragments
  netfilter: nf_tables: skip L4 header parsing for non-first fragments
  netfilter: nf_socket: skip socket lookup for non-first fragments
  netfilter: nf_tables: fix netdev hook allocation memleak with dormant tables
  netfilter: xt_CT: fix usersize for v1 and v2 revision
  netfilter: nft_compat: run xt_check_hooks_{match,target}() from .validate
  netfilter: x_tables: add .check_hooks to matches and targets
  netfilter: nft_fwd_netdev: use recursion counter in neigh egress path
  netfilter: nft_fwd_netdev: add device and headroom validate with neigh forwarding
  netfilter: replace skb_try_make_writable() by skb_ensure_writable()
====================

Link: https://patch.msgid.link/20260501122237.296262-1-pablo@netfilter.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-05-01 16:45:42 -07:00
Linus Torvalds
cd546f7ae2 Assorted arm64, ACPI and kselftest fixes for 7.1-rc2:
- Avoid writing an uninitialised stack variable to POR_EL0 on
    sigreturn if the poe_context record is absent
 
  - Reserve one more page for the early 4K-page kernel mapping to cover
    the extra [_text, _stext) split introduced by the non-executable
    read-only mapping
 
  - Force the arch_local_irq_*() wrappers to be __always_inline so that
    noinstr entry and idle paths cannot call out-of-line, instrumentable
    copies
 
  - Fix potential sign extension in the arm64 SCS unwinder's DWARF
    advance_loc4 decoding
 
  - Tolerate arm64 ACPI platforms with only WFI and no deeper PSCI idle
    states, restoring cpuidle registration on such systems
 
  - Include the UAPI <asm/ptrace.h> header in the arm64 GCS libc test
    rather than carrying a duplicate struct user_gcs definition (the
    original #ifdef NT_ARM_GCS was wrong to cover the structure
    definition as it would be masked out if the toolchain defined it)
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEE5RElWfyWxS+3PLO2a9axLQDIXvEFAmn03FoACgkQa9axLQDI
 XvELcxAAmhoarEo1Te6wWyybco9LqvfZPzirij+YYLw0GWuqnN99N+f79FZirTbz
 ug9AZiG1PPQY0hCurNWwEjQfWJ6dJYo/4mIT9R1rbeU2MwcxHawIePrM0T8PMBF8
 nHMZaEy/EZ8hX3pam98d78F38yFUvxaikghhxQvHLFlQA4nU19IElQCyMogofe05
 RTE71nDdMZAnfoOS6cVk7wnH99VLfbqiyl97zUOjnyFNdye99UDovayXPUdUkgbN
 clF2qxWInS8TPuoKQPz5hzYkbuR0doFwIasLjSMnOQx+FMZdMmPXEZbwqI/hYl7l
 xc5bjKtJH/AQqdoEkZW9MUJ1GhzMttTpoYW9//wgRpJtBDNxisdOE9LpcsCMMNIM
 wKLrLVLTXsv5jyPeEFMRtUjd0tJ7bV0f3cO/sv5EVBd238CGT76zwCgjpMtZQqbj
 KWsTJpM5oYAsKkBHAYE6XCa5h7kre0/249zH/CYhI/mXJkaHJRM8Ub2CnqBgqeTG
 KobtDIUJt+TPAhThj/2OQ/HxP6SLzgBgsgVmVqE1nhkOPlcfg3YYBsgpgN+bzMfG
 Z7h14yyCAhunoGRVBMtyUgksAvflR+PIS06soRjLZ5cXcOp/3h+sXs6/XVXHtOr/
 UCeO5mfaNUNAr3xJ9oYhuAAT74b7zKXY3YM4NRVASfq6rQS0nWg=
 =a9er
 -----END PGP SIGNATURE-----

Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull arm64 fixes from Catalin Marinas:

 - Avoid writing an uninitialised stack variable to POR_EL0 on sigreturn
   if the poe_context record is absent

 - Reserve one more page for the early 4K-page kernel mapping to cover
   the extra [_text, _stext) split introduced by the non-executable
   read-only mapping

 - Force the arch_local_irq_*() wrappers to be __always_inline so that
   noinstr entry and idle paths cannot call out-of-line, instrumentable
   copies

 - Fix potential sign extension in the arm64 SCS unwinder's DWARF
   advance_loc4 decoding

 - Tolerate arm64 ACPI platforms with only WFI and no deeper PSCI idle
   states, restoring cpuidle registration on such systems

 - Include the UAPI <asm/ptrace.h> header in the arm64 GCS libc test
   rather than carrying a duplicate struct user_gcs definition (the
   original #ifdef NT_ARM_GCS was wrong to cover the structure
   definition as it would be masked out if the toolchain defined it)

* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
  arm64: signal: Preserve POR_EL0 if poe_context is missing
  arm64: Reserve an extra page for early kernel mapping
  kselftest/arm64: Include <asm/ptrace.h> for user_gcs definition
  ACPI: arm64: cpuidle: Tolerate platforms with no deep PSCI idle states
  arm64/irqflags: __always_inline the arch_local_irq_*() helpers
  arm64/scs: Fix potential sign extension issue of advance_loc4
2026-05-01 16:32:42 -07:00
Yi Kuo
9900b9fee5 smb: smbdirect: fix MR registration for coalesced SG lists
ib_dma_map_sg() modifies the provided scatterlist and returns the
number of mapped entries, which can be fewer than the requested
mr->sgt.nents if the DMA controller coalesces contiguous memory
segments. Passing the original, uncoalesced count to ib_map_mr_sg()
causes memory registration failures if coalescing actually occurs.

Capture the actual mapped count returned by ib_dma_map_sg() and pass it
to ib_map_mr_sg() to ensure correct MR registration.

Also update the ib_dma_map_sg() error logging to drop the error
pointer formatting, since the return value is an integer count
rather than an error code.

Ensure a proper error code (-EIO) is assigned when DMA mapping or
MR registration fails.

Fixes: de5ef8ec3c ("smb: smbdirect: introduce smbdirect_mr.c with client mr code")
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=221408
Reviewed-by: Stefan Metzmacher <metze@samba.org>
Acked-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Yi Kuo <yi@yikuo.dev>
Signed-off-by: Steve French <stfrench@microsoft.com>
2026-05-01 16:24:29 -05:00
Stefan Metzmacher
e768103cfb smb: smbdirect: introduce and use include/linux/smbdirect.h
This makes it easier to rebuild cifs.ko and ksmbd.ko against
a running kernel.

Suggested-by: Christoph Hellwig <hch@infradead.org>
Link: https://lore.kernel.org/linux-cifs/aehrPuY60VMcYGU8@infradead.org/
Cc: Steve French <smfrench@gmail.com>
Cc: Tom Talpey <tom@talpey.com>
Cc: Long Li <longli@microsoft.com>
Cc: Namjae Jeon <linkinjeon@kernel.org>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: linux-cifs@vger.kernel.org
Cc: samba-technical@lists.samba.org
Signed-off-by: Stefan Metzmacher <metze@samba.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
2026-05-01 16:24:25 -05:00
Stefan Metzmacher
5234094c01 smb: smbdirect: make use of DEFAULT_SYMBOL_NAMESPACE and EXPORT_SYMBOL_GPL
This is a better solution than
EXPORT_SYMBOL_FOR_MODULES(__sym, "cifs,ksmbd") as it makes
it possible to rebuild smbdirect.ko against a
running kernel and then load the existing cifs.ko and ksmbd.ko
from the running kernel.

Suggested-by: Christoph Hellwig <hch@infradead.org>
Link: https://lore.kernel.org/linux-cifs/aehrPuY60VMcYGU8@infradead.org/
Cc: Steve French <smfrench@gmail.com>
Cc: Tom Talpey <tom@talpey.com>
Cc: Long Li <longli@microsoft.com>
Cc: Namjae Jeon <linkinjeon@kernel.org>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: linux-cifs@vger.kernel.org
Cc: samba-technical@lists.samba.org
Signed-off-by: Stefan Metzmacher <metze@samba.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
2026-05-01 16:24:22 -05:00
Alok Tiwari
01eb80b767 accel/qaic: fix incorrect counter check in RAS message decode
The UE and UE_NF cases check ce_count against UINT_MAX before incrementing
their respective counters. This is logically incorrect and prevents
ue_count and ue_nf_count from incrementing when ce_count reaches UINT_MAX.

Fixes: c11a50b170 ("accel/qaic: Add Reliability, Accessibility, Serviceability (RAS)")
Signed-off-by: Alok Tiwari <alok.a.tiwari@oracle.com>
Reviewed-by: Jeff Hugo <jeff.hugo@oss.qualcomm.com>
Signed-off-by: Jeff Hugo <jeff.hugo@oss.qualcomm.com>
Link: https://patch.msgid.link/20260410112015.592546-1-alok.a.tiwari@oracle.com
2026-05-01 15:14:25 -06:00
Linus Torvalds
ef5f46b630 selinux/stable-7.1 PR 20260501
-----BEGIN PGP SIGNATURE-----
 
 iQJIBAABCgAyFiEES0KozwfymdVUl37v6iDy2pc3iXMFAmn0uzwUHHBhdWxAcGF1
 bC1tb29yZS5jb20ACgkQ6iDy2pc3iXNwng//WgLizoKU5myWxBG0HBfLX2uaQOOf
 VcVhEB9jCuo8DMYWuLX11bCX9uu8T9RISbVRgmikZItWUGLHWRpELQv6K2hm3T7D
 Yy+AjOHVksG4S3nJubhtqYkA7uozCtohGIQW+yxRzCsVEb+3NJEAFqowYFeznU6l
 /rHY/t7eL7Q1ORt+WRsdvA+tM67iaDzndNsR02qZoMnDHlUU4GdPOvD04nPRDLMS
 tK2UE0sNCEG2pTZBihsrrJW4IDTtjSfF0MQVdxYIO1+3oWmYtJW9pVKbPwDAKK17
 kLgkwdV3w1EAvguRCjd4X9kEZ6MpzTnkkJUuVyTUDxtI5npcpUWA25hqPX2JGjll
 6+S0YJKQ3KPXQTvOLQctojx7tMRhojiV6uRuo0bg5iTQAUANoR/uqeU42+lgY+ww
 4uP1oRwf0aNPh9LXQjb08Z2HhyMfa/1ROkVUFOvRDrSGDIva5P2cMaZ3Uug7ZORn
 V8LUq90gQDJTc4YMTLCcHOWzG++h1gAeDu30AfBTVhMJffqpmap04YASlg600itD
 zg4nJg/5r3Px9j7IJBwIqBkvL5FAhT7UAAh17LN5vo9yB/AKCWnnakTZKr5v5kLc
 nDLlV6O/asI2gcL3KfG8tzr3k+I3lbiJ4139MRPNbmh+pF52qIAKycToLGnrqYnc
 FwyyirkJEP2TI3c=
 =/qwt
 -----END PGP SIGNATURE-----

Merge tag 'selinux-pr-20260501' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux

Pull selinux fixes from Paul Moore:

 - Ensure SELinux is always properly accessing its own sock LSM state

 - Only reserve an xattr slot for SELinux if it will be used

 - Fix a SELinux auditing regression in the directory avdcache

* tag 'selinux-pr-20260501' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux:
  selinux: fix avdcache auditing
  selinux: don't reserve xattr slot when we won't fill it
  selinux: use sk blob accessor in socket permission helpers
2026-05-01 13:19:14 -07:00
Davidlohr Bueso
ee9dce4436 futex: Drop CLONE_THREAD requirement for private default hash alloc
Currently need_futex_hash_allocate_default() depends on strict pthread
semantics, abusing CLONE_THREAD.  This breaks the non-concurrency
assumptions when doing the mm->futex_ref pcpu allocations, leading to
bugs[0] when sharing the mm in other ways; ie:

    BUG: KASAN: slab-use-after-free in futex_hash_put

... where the +1 bias can end up on a percpu counter that mm->futex_ref
no longer points at.

Loosen the check to cover any CLONE_VM clone, except vfork().  Excluding
vfork keeps the existing paths untouched (no overhead), and we can't
race in the first place: either the parent is suspended and the child
runs alone, or mm->futex_ref is already allocated from an earlier
CLONE_VM.

Link: https://lore.kernel.org/all/CAL_bE8LsmCQ-FAtYDuwbJhOkt9p2wwYQwAbMh=PifC=VsiBM6A@mail.gmail.com/ [0]
Fixes: d9b05321e2 ("futex: Move futex_hash_free() back to __mmput()")
Reported-by: Yiming Qian <yimingqian591@gmail.com>
Signed-off-by: Davidlohr Bueso <dave@stgolabs.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2026-05-01 13:12:34 -07:00
Linus Torvalds
bb1d73f2cd s390 updates for 7.1-rc2
- Reject zero-length writes from userspace that corrupt
   Debug Facility buffers
 
 - Replace one s390 PCI maintainer
 
 - Remove SCLP_OFB Kconfig option and enable the guarded code
   unconditionally
 
 - Replace incorrect use of phys_to_folio() to virt_to_folio()
   in do_secure_storage_access()
 -----BEGIN PGP SIGNATURE-----
 
 iI0EABYKADUWIQQrtrZiYVkVzKQcYivNdxKlNrRb8AUCafTJpBccYWdvcmRlZXZA
 bGludXguaWJtLmNvbQAKCRDNdxKlNrRb8OEgAQDsVJ448e9YY4j5e+zk5Q3m1Eag
 q5SIntMSb7r7df0AiAEAuiiNQWQMeDjPYpuOOS2SVp0qj2bf3y6RlHgD0sb1OAs=
 =UkrI
 -----END PGP SIGNATURE-----

Merge tag 's390-7.1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux

Pull s390 fixes from Alexander Gordeev:

 - Reject zero-length writes from userspace that corrupt Debug Facility
   buffers

 - Replace one s390 PCI maintainer

 - Remove SCLP_OFB Kconfig option and enable the guarded code
   unconditionally

 - Replace incorrect use of phys_to_folio() to virt_to_folio() in
   do_secure_storage_access()

* tag 's390-7.1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
  s390/mm: Fix phys_to_folio() usage in do_secure_storage_access()
  s390/sclp: Remove SCLP_OFB Kconfig option
  MAINTAINERS: Replace one of the maintainers for s390/pci
  s390/debug: Reject zero-length input in debug_input_flush_fn()
  s390/debug: Reject zero-length input before trimming a newline
2026-05-01 12:58:02 -07:00
Thomas Gleixner
010b7723c0 rseq: Don't advertise time slice extensions if disabled
If time slice extensions have been disabled on the kernel command line,
then advertising them in RSEQ flags is wrong.

Adjust the conditionals to reflect reality, fixup the misleading comments
about the gap of these flags and the rseq::flags field.

Fixes: d6200245c7 ("rseq: Allow registering RSEQ with slice extension")
Signed-off-by: Thomas Gleixner <tglx@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
Tested-by: Dmitry Vyukov <dvyukov@google.com>
Link: https://patch.msgid.link/20260428224427.437059375%40kernel.org
Cc: stable@vger.kernel.org
2026-05-01 21:32:20 +02:00
Thomas Gleixner
e9766e6f7d rseq: Protect rseq_reset() against interrupts
rseq_reset() uses memset() to clear the tasks rseq data. That's racy
against membarrier() and preemption.

Guard it with irqsave to cure this.

Fixes: faba9d250e ("rseq: Introduce struct rseq_data")
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Thomas Gleixner <tglx@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
Tested-by: Dmitry Vyukov <dvyukov@google.com>
Link: https://patch.msgid.link/20260428224427.353887714%40kernel.org
Cc: stable@vger.kernel.org
2026-05-01 21:32:20 +02:00
Thomas Gleixner
2cb68e4512 rseq: Set rseq::cpu_id_start to 0 on unregistration
The RSEQ rework changed that to RSEQ_CPU_UNINITILIZED, which is obviously
incompatible. Revert back to the original behavior.

Fixes: 0f085b4188 ("rseq: Provide and use rseq_set_ids()")
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Thomas Gleixner <tglx@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
Tested-by: Dmitry Vyukov <dvyukov@google.com>
Link: https://patch.msgid.link/20260428224427.271566313%40kernel.org
Cc: stable@vger.kernel.org
2026-05-01 21:32:20 +02:00
Mark Brown
cb48828f06 selftests/rseq: Don't run tests with runner scripts outside of the scripts
The rseq selftests include two runner scripts run_param_test.sh and
run_syscall_errors_test.sh which set up the environment for test binaries
and run them with various parameters. Currently we list these test binaries
in TEST_GEN_PROGS but this results in the kselftest framework running them
directly as well as via the runners, resulting in duplication and spurious
failures when the environment is not correctly set up (eg, if glibc tries
to use rseq).

Move the binaries the runners invoke to TEST_GEN_PROGS_EXTENDED, binaries
listed there are built but not run by the framework.  The param_test
benchmarks are not moved since they are not run by run_param_test.sh.

Fixes: 830969e782 ("selftests/rseq: Implement time slice extension test")

Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://patch.msgid.link/20260423-selftests-rseq-use-runner-v1-1-e13a133754c1@kernel.org
Cc: stable@vger.kernel.org
2026-05-01 21:32:20 +02:00
Linus Torvalds
227c3d546e two ksmbd server fixes
-----BEGIN PGP SIGNATURE-----
 
 iQGzBAABCgAdFiEE6fsu8pdIjtWE/DpLiiy9cAdyT1EFAmnz64kACgkQiiy9cAdy
 T1GLPgv/cuJlvhCW4NknYvOplaHZrOYFIeO3DFWc5GvAFO/9nK+6R2s7OoL2CNV+
 QR5CTsWZgYq0vm2Vj2XeuyrnsmCvLkCTY/nmOVmHGxPfyKbjuIvKS5m2+mHiON9p
 aqNqAui03n8OGBACFi7LeaY3hH/8g2MlxbT3uwcbWbaUkZ6UiF1TaNw/hkFkIsnJ
 CarnOd0K08chXMwSIFttFeUYeZg0tVOUG80Zw5YJwnjxn8MY2VI6rf9fu4GVwbZY
 +ycqI49BjaG/CAVMcrPOJnceDkuO1jsfv39HHjXSEwTpE3GtgsS+RFMl2CTOsb/H
 VVdHBsq5pJ/E4zqbhwB+/oju75Ke8/xhNjsXliyqqkZW4vRnQUBKZSh1jarXoFV9
 GW4Eg+cx5nduDI8qVB8IxoEvrwhF1dvbTkEGKN5r7Zy2SlyqvhXiDl0voRGm2am4
 gD9SsKRkdm/wWUoFT2VVeFu4I7rj4ne42LNbhtmmzvkIWLJuvAXmynk2GGMgGrjk
 /1TlyI0t
 =Rq2W
 -----END PGP SIGNATURE-----

Merge tag 'v7.1-rc2-ksmbd-server-fixes' of git://git.samba.org/ksmbd

Pull smb server fixes from Steve French:

 - Fix shutdown (stop sessions)

 - Fix readdir unsupported info level

* tag 'v7.1-rc2-ksmbd-server-fixes' of git://git.samba.org/ksmbd:
  ksmbd: rewrite stop_sessions() with restartable iteration
  smb: server: handle readdir_info_level_struct_sz() error
2026-05-01 12:16:42 -07:00
Linus Torvalds
6fe0be6dc7 block-7.1-20260430
-----BEGIN PGP SIGNATURE-----
 
 iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAmnz4j0QHGF4Ym9lQGtl
 cm5lbC5kawAKCRD301j7KXHgprB/EACLZ9Z3TpHOGe2evynV+uluq+M3GFUxUizc
 f6TQxvHCyXTIDR/r+5J5gXkOEu3Jfepc/WpvXSwcMEneyYdZBUfQ7/ct/6/IpozR
 RYiGb96H6qil0iV+cxvpGHJht894mZKaqYO5gNn8Q3mw/4SVYGmkhJ974j4+4uEn
 JdcVTXFgGfw1u1UL5A0XTwQ3mk0GJmpRVtFgbMyvysdi8TngyB3M7F1Yko+sePLP
 gSaP7UKGeRxAqvvEChVkMyc+m+oQhBl3+7IGwER7IupxMx2b5Ht2XMJ09ERW4fBZ
 rLWZXWpN8iEkwSO6WTELBqWPGAe0ddFBA0b4qrOBdC0gvsJp7XtAlx9cDqTvD6zx
 E/d8aDf4Elq8wzf6dPfnu18Ld8fgG0BO/7Pl1P7KPiwzGhj2TAwXhwkkOg+yUjZ6
 4ej/MzeWocVmkwFv8fJ6D77O+0ziz67wgpzIYAz5dpoDcW7no8S1rHS5RLtZdfm0
 JhsX6Epwlak7BR+4OTkcRx4dEjCPiW6W0henuuMicUFrEXs+eZoOqA7yS70vXd/f
 9PRLgZk0r5Dpe/8aqGbuzkMZx5zzODDDcLNzMmAJbEbjn/8lnbMnBNroyIZS5gve
 b7MuMC7RocmhUGs/8o1gpvTpa5SFp33kHjSb33s6WRiTM8qH+UbSq6ZWuchr77Iv
 n3ZP40b5kw==
 =Z0UT
 -----END PGP SIGNATURE-----

Merge tag 'block-7.1-20260430' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux

Pull block fixes from Jens Axboe:

 - MD pull request via Yu:
      - Fix a raid5 UAF on IO across the reshape position
      - Avoid failing RAID1/RAID10 devices for invalid IO errors
      - Fix RAID10 divide-by-zero when far_copies is zero
      - Restore bitmap grow through sysfs
      - Use mddev_is_dm() instead of open-coding gendisk checks
      - Use ATTRIBUTE_GROUPS() for md default sysfs attributes
      - Replace open-coded wait loops with wait_event helpers

 - NVMe pull request via Keith:
      - Target data transfer size configuation (Aurelien)
      - Enable P2P for RDMA (Shivaji Kant)
      - TCP target updates (Maurizio, Alistair, Chaitanya, Shivam Kumar)
      - TCP host updates (Alistair, Chaitanya)
      - Authentication updates (Alistair, Daniel, Chris Leech)
      - Multipath fixes (John Garry)
      - New quirks (Alan Cui, Tao Jiang)
      - Apple driver fix (Fedor Pchelkin)
      - PCI admin doorbell update fix (Keith)

 - Properly propagate CDROM read-only state to the block layer

* tag 'block-7.1-20260430' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux: (35 commits)
  md: use ATTRIBUTE_GROUPS() for md default sysfs attributes
  md: use mddev_is_dm() instead of open-coding gendisk checks
  md/raid1: replace wait loop with wait_event_idle() in raid1_write_request()
  md/md-bitmap: add a none backend for bitmap grow
  md/md-bitmap: split bitmap sysfs groups
  md: factor bitmap creation away from sysfs handling
  md: use mddev_lock_nointr() in mddev_suspend_and_lock_nointr()
  md: replace wait loop with wait_event() in md_handle_request()
  md/raid10: fix divide-by-zero in setup_geo() with zero far_copies
  md/raid1,raid10: don't fail devices for invalid IO errors
  MAINTAINERS: Add Xiao Ni as md/raid reviewer
  md/raid5: Fix UAF on IO across the reshape position
  cdrom, scsi: sr: propagate read-only status to block layer via set_disk_ro()
  nvme-auth: Hash DH shared secret to create session key
  nvme-pci: fix missed admin queue sq doorbell write
  nvme-auth: Include SC_C in RVAL controller hash
  nvme-tcp: teardown circular locking fixes
  nvmet-tcp: Don't clear tls_key when freeing sq
  Revert "nvmet-tcp: Don't free SQ on authentication success"
  nvme: skip trace completion for host path errors
  ...
2026-05-01 11:26:15 -07:00
Linus Torvalds
9d88bb929a io_uring-7.1-20260430
-----BEGIN PGP SIGNATURE-----
 
 iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAmnz4ikQHGF4Ym9lQGtl
 cm5lbC5kawAKCRD301j7KXHgpoecEACan9sGbcqrTAqBaJnNqMfjo5OoX0X2/3LP
 JoFH+GbLzJ7Ojj1+AWSHNsjjwhSZ71HmwRk78uWCPcl3oiKQLGyXTnbho0qKrwhk
 4cnoFfhdcBEDGuh8GW/PnBset8ukq9occ11TbojC681tmaTma1WpXFk1vRabcwvw
 T8/Jr18kttHi8aj+MPowkTcqXV7iOjzX9RD/vS97jCWBxUbAmYjRGfm3nbDbDydI
 oMEstxqp+8jiFF1SHBdq3aGreoZDIegh1nXsjobAmoEMvAJQQ3K7zRsiqFEnoXFU
 CDVoS6LhlSBmG2jT657azYzhF3o7HwSiYk2B15YiYHO+EqIxMhIQYRlP5s/3UJD8
 KLJPSYqivQ14m9yff5zjn//mad3QBxvOhrVrxHj/diIKclZDLs9VDPZjjB6A6DUO
 X01uJy7zuzp57GFh0FwyFGU3yBUl7WJGscLarIMHnOdmEWOIU3WRLWGYZRgZRUny
 1yHXxGEucR7LMiYPzh7PnGnaAzDxtJJUzIXbIF+l+A5A0f1Ayb8cfFy6QoGc7v8j
 t+vG2gbRtwPR6DxFRNhGDMeZtstEoKj0IX4zw7ZQF7MFgpvdfjlkDGCveJ0jQO6x
 pw8UJW1KOQpT9MiheOAvop5hhvGlqSYWXByluW05y7O+CDQcHiVeXcjQsG7zREsO
 +zGvO5WHEg==
 =5d8D
 -----END PGP SIGNATURE-----

Merge tag 'io_uring-7.1-20260430' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux

Pull io_uring fixes from Jens Axboe:

 - Remove dead struct io_buffer_list member

 - Fix for incrementally consumed buffers with recvmsg multishot, which
   requires a minimum value left in a buffer for any receive for the
   headers. If there's still a bit of buffer left but it's smaller than
   that value, then userspace will see a spurious -EFAULT returned in
   the CQE

 - Locking fix for the DEFER_TASKRUN retry list, which otherwise could
   race with fallback cancelations. If the task is exiting with
   task_work left in both the normal and retry list AND the exit cleanup
   races with the task running task work, then entries could either be
   doubly completed or lost

 - Cap NAPI busy poll timeout to something sane, to avoid syzbot running
   into excessive polling and triggering warnings around that

* tag 'io_uring-7.1-20260430' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux:
  io_uring/tw: serialize ctx->retry_llist with ->uring_lock
  io_uring/napi: cap busy_poll_to 10 msec
  io_uring/kbuf: support min length left for incremental buffers
  io_uring/kbuf: kill dead struct io_buffer_list 'nr_entries' member
2026-05-01 11:01:31 -07:00
Helge Deller
41ca998fbe parisc: Fix 64-bit kernel build when CONFIG_COMPAT=n
VDSO32_SYMBOL() is used in signal.c, defining the value to zero avoids
liker issues when CONFIG_COMPAT=n.

Signed-off-by: Helge Deller <deller@gmx.de>
2026-05-01 19:09:30 +02:00
Linus Torvalds
33d0c9c5f0 spi: Fixes for v7.1
There are a couple of nasty issues fixed here in the axiado and rockchip
 drivers.  We've also got more of the fixes from Johan here, this time
 for the two Cadence drivers, plus a couple of other similar fixes from
 John and Felix.
 -----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCgAdFiEEreZoqmdXGLWf4p/qJNaLcl1Uh9AFAmn0AeEACgkQJNaLcl1U
 h9Co7Qf+I+1MpKYz07zWMhyh1SMZfMIAiCEQ4PipBO5ekc/I2ns7jLSNK0onquO9
 tRDdKqvCQUwNUn+XnLrLBikZqemzpcCBYN91Fzxqa7j2oofr1jOafaBxk8HjPVco
 J3RaLkk3o0+mMixaQdCIFnlBzPOqt6OlORcUAbBKjY7ZI0+Z/ODDkRXSU/cuM2eK
 yfQLpLZ25VBhS1QPXg6CgZKdx85g76x5dfXpwpsaBoBY6e+VHP62Y7kwnPj6agV0
 i4WGvDN5uGNAVCcu08Tf1J091TYtmEsuaS7cVTHYzqACbvJ6oU0k34ibjd7GtION
 BabmrWpxqHKN7ve/K3WSzewLF96HhQ==
 =Zom6
 -----END PGP SIGNATURE-----

Merge tag 'spi-fix-v7.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi

Pull spi fixes from Mark Brown:
 "There are a couple of nasty issues fixed here in the axiado and
  rockchip drivers. We've also got more of the fixes from Johan here,
  this time for the two Cadence drivers, plus a couple of other similar
  fixes from John and Felix"

* tag 'spi-fix-v7.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi:
  spi: amlogic-spisg: initialize completion before requesting IRQ
  spi: axiado: replace usleep_range() with udelay() in IRQ path
  spi: cadence-quadspi: fix runtime pm and clock imbalance on unbind
  spi: cadence-quadspi: fix unclocked access on unbind
  spi: cadence-quadspi: fix clock imbalance on probe failure
  spi: cadence-quadspi: fix runtime pm disable imbalance on probe failure
  spi: cadence: fix clock imbalance on probe failure
  spi: cadence: fix unclocked access on unbind
  spi: rockchip: Drop unused and broken CR0 macros
  spi: rockchip: Read ISR, not IMR, to detect cs-inactive IRQ
  spi: rzv2h-rspi: Fix silent failure in clock setup error path
2026-05-01 09:51:38 -07:00
Kevin Brodsky
030e8a40ff arm64: signal: Preserve POR_EL0 if poe_context is missing
Commit 2e8a1acea8 ("arm64: signal: Improve POR_EL0 handling to
avoid uaccess failures") delayed the write to POR_EL0 in
rt_sigreturn to avoid spurious uaccess failures. This change however
relies on the poe_context frame record being present: on a system
supporting POE, calling sigreturn without a poe_context record now
results in writing arbitrary data from the kernel stack into POR_EL0.

Fix this by adding a __valid_fields member to struct
user_access_state, and zeroing the struct on allocation.
restore_poe_context() then indicates that the por_el0 field is valid
by setting the corresponding bit in __valid_fields, and
restore_user_access_state() only touches POR_EL0 if there is a valid
value to set it to. This is in line with how POR_EL0 was originally
handled; all frame records are currently optional, except
fpsimd_context.

To ensure that __valid_fields is kept in sync, fields (currently
just por_el0) are now accessed via accessors and prefixed with __ to
discourage direct access.

Fixes: 2e8a1acea8 ("arm64: signal: Improve POR_EL0 handling to avoid uaccess failures")
Cc: <stable@vger.kernel.org>
Reported-by: Will Deacon <will@kernel.org>
Signed-off-by: Kevin Brodsky <kevin.brodsky@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2026-05-01 17:44:25 +01:00