Using definitions in kernel policies is awkward right now.
On one hand we want defines for max values and such.
On the other we don't have a way of adding kernel-only defines.
Adding unnecessary defines to uAPI is a bad idea, we won't
be able to delete them. And when it comes to policy user
space should just query it via the policy dump, not use
hard coded defines.
Add a "scope" property to definitions, which will let us tell
the codegen that a definition is for kernel use only. Support
following values:
- uapi: render into the uAPI header (default, today's behavior)
- kernel: render to kernel header only
- user: same as kernel but for the user-side generated header
Definitions may have a header property (definition is "external",
provided by existing header). Extend the scope to headers, too.
If definition has both scope and header properties we will only
generate the includes in the right scope.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Link: https://patch.msgid.link/20260510192904.3987113-8-kuba@kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Make sure resources are not improperly shared in the op cache and
cause instruction corruption this way.
Signed-off-by: Prathyushi Nangia <prathyushi.nangia@amd.com>
Co-developed-by: Borislav Petkov (AMD) <bp@alien8.de>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Cc: stable@vger.kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Add a selftest to verify the verifier and JIT behavior when handling
bpf-to-bpf calls with relative jump offsets exceeding the s16 boundary.
The test utilizes an inline assembly block with ".rept 32765" to generate
a massive dummy subprogram. By placing this padding between the main
program and the target subprogram, it forces the verifier to process a
bpf-to-bpf call where the imm field exceeds the s16 range.
- When JIT is enabled, it asserts that the program is successfully loaded
and executes correctly to return the expected value. Since the fix
does not change the JIT behavior, the test passes whether the fix is
applied or not.
- When JIT is disabled, it also asserts that the program is successfully
loaded and executes correctly to return the expected value 3.
- Before the fix, the verifier rewrites the call instruction with a
truncated offset (here 32768 -> -32768) and lets it pass. When the
program is executed, the call instruction will go to a wrong target
(the landing pad) instead of the intended subprogram, then return -1
and fail.
- After the fix, the verifier correctly handles the large offset and
allows it to pass. The program then executes correctly to return the
expected value 3.
Co-developed-by: Tianci Cao <ziye@zju.edu.cn>
Signed-off-by: Tianci Cao <ziye@zju.edu.cn>
Co-developed-by: Shenghao Yuan <shenghaoyuan0928@163.com>
Signed-off-by: Shenghao Yuan <shenghaoyuan0928@163.com>
Signed-off-by: Yazhou Tang <tangyazhou518@outlook.com>
Acked-by: Xu Kuohai <xukuohai@huawei.com>
Link: https://lore.kernel.org/r/20260506094714.419842-4-tangyazhou@zju.edu.cn
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Building the dequeue selftest with newer compilers (e.g., gcc 16)
triggers the following error:
dequeue.c:28:22: error: variable 'sum' set but not used
The 'volatile' qualifier prevents the writes from being optimized away,
but does not silence the unused variable 'sum' is indeed only written
and never read.
Consume 'sum' via an empty asm() with a register input constraint. This
forces the compiler to keep the accumulated value (preserving the CPU
stress loop) and avoiding the build error.
Fixes: 658ad2259b ("selftests/sched_ext: Add test to validate ops.dequeue() semantics")
Signed-off-by: Andrea Righi <arighi@nvidia.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
cg_read_strcmp() allocated a buffer sized to strlen(expected) + 1,
then passed it to read_text() which calls read(fd, buf, size-1).
When comparing against an empty string (""), strlen("") = 0 gives a
1-byte buffer, and read() is asked to read 0 bytes. The file content
is never actually read, so strcmp("", buf) always returns 0 regardless
of the real content. This caused cg_test_proc_killed() to always
report the cgroup as empty immediately, making OOM tests pass without
verifying that processes were killed.
Signed-off-by: Hongfu Li <lihongfu@kylinos.cn>
Signed-off-by: Tejun Heo <tj@kernel.org>
ethtool.h includes linux/typelimits.h which is a relatively new header
not yet shipped in most distro kernel-header packages. Without the
explicit entry, the build silently falls through to -idirafter.
dev_energymodel.h is a new YNL family whose uapi header is not in
system paths at all and was missing a CFLAGS entry entirely.
Signed-off-by: Stanislav Fomichev <sdf@fomichev.me>
Link: https://patch.msgid.link/20260508204114.205896-2-sdf@fomichev.me
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
- Fix spurious failures in rseq self-tests (Mark Brown)
- Fix rseq rseq::cpu_id_start ABI regression due to
TCMalloc's creative use of the supposedly read-only field.
The fix is to introduce a new ABI variant based on
a new (larger) rseq area registration size, to keep
the TCMalloc use of rseq backwards compatible on new kernels.
(Thomas Gleixner)
- Fix wakeup_preempt_fair() for not waking up task (Vincent Guittot)
- Fix s64 mult overflow in vruntime_eligible() (Zhan Xusheng)
Signed-off-by: Ingo Molnar <mingo@kernel.org>
-----BEGIN PGP SIGNATURE-----
iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAmn+lBARHG1pbmdvQGtl
cm5lbC5vcmcACgkQEnMQ0APhK1iy1hAAunlBoDq8/MXSt4JeMRX/3p+CKihExTnO
LO535Rv8DfcepBgZIysTJKMn9bM/l+7OXGdQ+YDjS70GsLM2aOzDYBKCwOHHm0pZ
OJ6Y+UUFAacnQS4EuQLqyNBW0Ice4AIYWu0pLLADs2KUgX1DmSo9bhgZbHcbsMnA
IjoaFNhebeA1bHSDD11UIHTza23mqEinxM0yOK8pT+M6fOMXWOo/kLLLYjG/yAIB
qBFGpwJkdKjBcpCmAYU9jpw26p/17YMzkgmAaUXOKRLZi+h5zQMNVjR+OIjK4qxt
z5Tj+h7t3IcFV2d1zUThPpxxHLn3ro30R5mW0OrsPPFI8AkSRC6GsIX/Ft9uFbQQ
1SGknyx5qLrSldmT8KKXPlM/vriyh3iL6/QMgXtTb8FfegRCbjXsZy39s3wOexCD
oBnJt0rX3NviPyb/Up9cfdx++kPfJM074NdVHRBW4ucoOpwHosNcuBP0YQsctFw0
QO7lkfjTo19eB9ftSyfadwF9e+2jYF7YoLmTOKqGLbZEQeIT5tJHDgHklLWrqHAh
HDmDCHyqtiXBDCeapqnomrhqczKlSUK1Qqk/Hh+Hwiwj0N4vLhHXmRIpTb/R43/Y
i6cdwV8Xtl+RltmYaINpMxdnFl/iz/kpapYkqy/ykLNfBj01AWbdaIVPRk47ndU1
E29BSjMZhPU=
=2QJp
-----END PGP SIGNATURE-----
Merge tag 'sched-urgent-2026-05-09' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull scheduler fixes from Ingo Molnar:
- Fix spurious failures in rseq self-tests (Mark Brown)
- Fix rseq rseq::cpu_id_start ABI regression due to TCMalloc's creative
use of the supposedly read-only field
The fix is to introduce a new ABI variant based on a new (larger)
rseq area registration size, to keep the TCMalloc use of rseq
backwards compatible on new kernels (Thomas Gleixner)
- Fix wakeup_preempt_fair() for not waking up task (Vincent Guittot)
- Fix s64 mult overflow in vruntime_eligible() (Zhan Xusheng)
* tag 'sched-urgent-2026-05-09' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
sched/fair: Fix wakeup_preempt_fair() for not waking up task
sched/fair: Fix overflow in vruntime_eligible()
selftests/rseq: Expand for optimized RSEQ ABI v2
rseq: Reenable performance optimizations conditionally
rseq: Implement read only ABI enforcement for optimized RSEQ V2 mode
selftests/rseq: Validate legacy behavior
selftests/rseq: Make registration flexible for legacy and optimized mode
selftests/rseq: Skip tests if time slice extensions are not available
rseq: Revert to historical performance killing behaviour
rseq: Don't advertise time slice extensions if disabled
rseq: Protect rseq_reset() against interrupts
rseq: Set rseq::cpu_id_start to 0 on unregistration
selftests/rseq: Don't run tests with runner scripts outside of the scripts
Current release - fix to a fix:
- ipmr: add __rcu to netns_ipv4.mrt, make sure we hold the RCU lock
in all relevant places
Current release - new code bugs:
- fixes for the recently added resizable hash tables
- ipv6: make sure we default IPv6 tunnel drivers to =m now that
IPv6 itself is built in
- drv: octeontx2-af: fixes for parser/CAM fixes
Previous releases - regressions:
- phy: micrel: fix LAN8814 QSGMII soft reset
- wifi: cw1200: revert "Fix locking in error paths"
- wifi: ath12k: fix crash on WCN7850, due to adding the same queue
buffer to a list multiple times
Previous releases - always broken:
- number of info leak fixes
- ipv6: implement limits on extension header parsing
- wifi: number of fixes for missing bound checks in the drivers
- Bluetooth: fixes for races and locking issues
- af_unix: fix an issue between garbage collection and PEEK
- af_unix: fix yet another issue with OOB data
- xfrm: esp: avoid in-place decrypt on shared skb frags
- netfilter: replace skb_try_make_writable() by skb_ensure_writable()
- openvswitch: vport: fix race between tunnel creation and linking
leading to invalid memory accesses (type confusion)
- drv: amd-xgbe: fix PTP addend overflow causing frozen clock
Misc:
- sched/isolation: make HK_TYPE_KTHREAD an alias of HK_TYPE_DOMAIN
(for relevant IPVS change)
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEE6jPA+I1ugmIBA4hXMUZtbf5SIrsFAmn8whsACgkQMUZtbf5S
IrtH+w//SEA3x/LuhHYW7dAr6j+yF5YXnGQ5+DlvUI2N5MzzLRKZqPQB9XSNzQEv
Ptq8XAoNY+ItZft8GjTGOQddMDPYi+sS43Lt5t2W1Fq74tUkI+9YEPin0WH9WapQ
uy7e9/15/gJW+3jQeCg4KYMwf65THCUhoSNZO5ORIeVuT/m0OdsoVLNwuM6zjzNZ
hwOGWZDFn06M0tm8oL2LnE9D/iXqztgnHhcSVMe5X3uCgK7EHMySjmj4JsjdEV6R
9U83zG+5/VKjVc7fGiozCiYk5ctiGYFMhvLUDZtdxNpf95rVRpCWu2TNx4gsCh9E
QaOarvqHXTrrG4bz57vN7Bcegcjk+5GvREIPNznEWfMN+k3Tft9x+wEHP5MGBbn3
K+2JGoNF7aTQGlBuH/szPdHh3m0V3oTRu1azisY0NiAGUKHageQAch6uHTm/7ixj
iNmOQ/aY10ITlhUqmTN2jHdl+8bgtMOmaV9zyn5588UWakVu+5zUGTtxjefrxMiQ
bYTs9Ya54BM5dKVT1tbgKs6ZY+shb8Erncd8vr0ltUmp8UW4Ovz6FxQ9tRP7tAaV
vRAQYWE50751YUjqwJJShWpKAlWJlJ0jELEsYYXNeeBbmpMYLm9vgGf7+2TNXToE
0dth3eJTGhzYYDqqgC9Q2ZlkbhdCMWotde9vMUGdhiErgorKsqs=
=ogzY
-----END PGP SIGNATURE-----
Merge tag 'net-7.1-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Jakub Kicinski:
"Including fixes from Netfilter, IPsec, Bluetooth and WiFi.
Current release - fix to a fix:
- ipmr: add __rcu to netns_ipv4.mrt, make sure we hold the RCU lock
in all relevant places
Current release - new code bugs:
- fixes for the recently added resizable hash tables
- ipv6: make sure we default IPv6 tunnel drivers to =m now that IPv6
itself is built in
- drv: octeontx2-af: fixes for parser/CAM fixes
Previous releases - regressions:
- phy: micrel: fix LAN8814 QSGMII soft reset
- wifi:
- cw1200: revert "Fix locking in error paths"
- ath12k: fix crash on WCN7850, due to adding the same queue
buffer to a list multiple times
Previous releases - always broken:
- number of info leak fixes
- ipv6: implement limits on extension header parsing
- wifi: number of fixes for missing bound checks in the drivers
- Bluetooth: fixes for races and locking issues
- af_unix:
- fix an issue between garbage collection and PEEK
- fix yet another issue with OOB data
- xfrm: esp: avoid in-place decrypt on shared skb frags
- netfilter: replace skb_try_make_writable() by skb_ensure_writable()
- openvswitch: vport: fix race between tunnel creation and linking
leading to invalid memory accesses (type confusion)
- drv: amd-xgbe: fix PTP addend overflow causing frozen clock
Misc:
- sched/isolation: make HK_TYPE_KTHREAD an alias of HK_TYPE_DOMAIN
(for relevant IPVS change)"
* tag 'net-7.1-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (190 commits)
net: sparx5: configure serdes for 1000BASE-X in sparx5_port_init()
net: sparx5: fix wrong chip ids for TSN SKUs
net: stmmac: dwmac-nuvoton: fix NULL pointer dereference in nvt_set_phy_intf_sel()
tcp: Fix dst leak in tcp_v6_connect().
ipmr: Call ipmr_fib_lookup() under RCU.
net: phy: broadcom: Save PHY counters during suspend
net/smc: fix missing sk_err when TCP handshake fails
af_unix: Reject SIOCATMARK on non-stream sockets
veth: fix OOB txq access in veth_poll() with asymmetric queue counts
eth: fbnic: fix double-free of PCS on phylink creation failure
net: ethernet: cortina: Drop half-assembled SKB
selftests: mptcp: pm: restrict 'unknown' check to pm_nl_ctl
selftests: mptcp: check output: catch cmd errors
mptcp: pm: prio: skip closed subflows
mptcp: pm: ADD_ADDR rtx: return early if no retrans
mptcp: pm: ADD_ADDR rtx: skip inactive subflows
mptcp: pm: ADD_ADDR rtx: resched blocked ADD_ADDR quicker
mptcp: pm: ADD_ADDR rtx: free sk if last
mptcp: pm: ADD_ADDR rtx: always decrease sk refcount
mptcp: pm: ADD_ADDR rtx: fix potential data-race
...
When pm_netlink.sh is executed with '-i', 'ip mptcp' is used instead of
'pm_nl_ctl'. IPRoute2 doesn't support the 'unknown' flag, which has only
been added to 'pm_nl_ctl' for this specific check: to ensure that the
kernel ignores such unsupported flag.
No reason to add this flag to 'ip mptcp'. Then, this check should be
skipped when 'ip mptcp' is used.
Fixes: 0cef6fcac2 ("selftests: mptcp: ip_mptcp option for more scripts")
Cc: stable@vger.kernel.org
Reviewed-by: Mat Martineau <martineau@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20260505-net-mptcp-pm-fixes-7-1-rc3-v1-11-fca8091060a4@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Using '${?}' inside the if-statement to check the returned value from
the command that was evaluated as part of the if-statement is not
correct: here, '${?}' will be linked to the previous instruction, not
the one that is expected here (${cmd}).
Instead, simply mark the error, except if an error is expected. If
that's the case, 1 can be passed as the 4th argument of this helper.
Three checks from pm_netlink.sh expect an error.
While at it, improve the error message when the command unexpectedly
fails or succeeds.
Note that we could expect a specific returned value, but the checks
currently expecting an error can be used with 'ip mptcp' or 'pm_nl_ctl',
and these two tools don't return the same error code.
Fixes: 2d0c1d27ea ("selftests: mptcp: add mptcp_lib_check_output helper")
Cc: stable@vger.kernel.org
Reviewed-by: Mat Martineau <martineau@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20260505-net-mptcp-pm-fixes-7-1-rc3-v1-10-fca8091060a4@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEH7ZpcWbFyOOp6OJbrB3Eaf9PW7cFAmn57igACgkQrB3Eaf9P
W7cDig//aXeIEN6VUYPU6lTDYXNCWz2A7sM636rXMMizF1nVjkRtrZlzQFwE9pIm
LOla+Mu1VLGVsuxaoYfW2NagKt6bUg3xEDrlOt+lL/Bn6hengdjVF9PibvP4XCjt
5bwtg0xN0AysoktYS2v+2b+fSh5CSnQkcEcn9F2d+3zXmFlLpxuyPJqhHn54nHmI
JPACVyk9bZdKutdfr86uThgWnTDInPvJ2vMRpRlwpGWx5f2JspJv1g4zzWzc38Ad
yTcRZQXhZ7zfOaYFGjqMD0eHtFDPC+HqMTi0Ak9ngCBAFpZS8/iBJ3/TlukJjNcy
q805gPyRqnpiVgm6NH55C8HUguzpD7m8tcjBbVADvIrMA0OzMw3mBxwFsbG2aaCs
cPXxvtT7crDbKPtxvY5RhVJIvCe4BCMP/uqlmo7wuwPE01arVau5i4miZKGPTzXB
LRNchWJMDIrwE/+MnAbJBXT5RfiN5RPvPdV5OdTlrofkwDzBjpTev5FeQq7QktSx
ctPy7I28IRw+eCKlu2FNrUJ4x8C/7Fv1ZPADOSvd3D5PdaOAArUb3RhTGwC9giuo
qKKv8Q30x5xyOv90MB3M8vQwM7mGUloIfZPN6AhRoaDGikdMyy6gZ8Y5M3noGUUJ
D4z+kZgHy1ZrdYDM58CdfE1Kz/s96rA5aIHUVZQYonaz35YGRts=
=WKO1
-----END PGP SIGNATURE-----
Merge tag 'ipsec-2026-05-05' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec
Steffen Klassert says:
====================
pull request (net): ipsec 2026-05-05
1. Fix an IPv6 encapsulation error path that leaked route references
when UDPv6 ESP decapsulation resolved to an error route.
From Yilin Zhu.
2. Fix AH with ESN on async crypto paths by accounting for the extra
high-order sequence number when reconstructing the temporary
authentication layout in the completion callbacks.
From Michael Bomarito.
3. Fix XFRM output so it does not overwrite already-correct inner header
pointers when a tunnel layer such as VXLAN has already saved them.
The fix comes with new selftests. From Cosmin Ratiu.
4. Add the missing native payload size entry for XFRM_MSG_MAPPING in the
compat translation path. From Ruijie Li.
5. Harden __xfrm_state_delete() against repeated or inconsistent unhashing
of state list nodes by keying the removal on actual list membership and
using delete-and-init helpers. From Michal Kosiorek.
6. Prevent ESP from decrypting shared splice-backed skb fragments in place
by marking UDP splice frags as shared and forcing copy-on-write in ESP
input when needed. From Kuan-Ting Chen.
* tag 'ipsec-2026-05-05' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec:
xfrm: esp: avoid in-place decrypt on shared skb frags
xfrm: defensively unhash xfrm_state lists in __xfrm_state_delete
xfrm: provide message size for XFRM_MSG_MAPPING
xfrm: Don't clobber inner headers when already set
tools/selftests: Add a VXLAN+IPsec traffic test
tools/selftests: Use a sensible timeout value for iperf3 client
xfrm: ah: account for ESN high bits in async callbacks
ipv6: xfrm6: release dst on error in xfrm6_rcv_encap()
====================
Link: https://patch.msgid.link/20260505132326.1362733-1-steffen.klassert@secunet.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
* ensure MAC header offset is reset before delivering packet
* ensure gro_cells_receive() and dstats_dev_add() are called with BH
disabled
* reduce ping count in selftest to ensure it completes within
timeout
-----BEGIN PGP SIGNATURE-----
iJEEABYIADkWIQQKU153ubb5unbkl6Gx/ZpNW1HNdwUCafkekRsUgAAAAAAEAA5t
YW51MiwyLjUrMS4xMiwyLDIACgkQsf2aTVtRzXdZAAEA2yDBkZdIiALO7V5ul5Ao
Y2/jFxysR5fyCMiOdxTOruwBAJaX9KExSE2QucHKOBFLmrjIIqwI5Br6whljdZKt
n1YE
=UhhZ
-----END PGP SIGNATURE-----
Merge tag 'ovpn-net-20260504' of https://github.com/OpenVPN/ovpn-net-next
Antonio Quartulli says:
====================
Includes changes:
* ensure MAC header offset is reset before delivering packet
* ensure gro_cells_receive() and dstats_dev_add() are called
with BH disabled
* reduce ping count in selftest to ensure it completes within
timeout
* tag 'ovpn-net-20260504' of https://github.com/OpenVPN/ovpn-net-next:
selftests: ovpn: reduce ping count in test.sh
ovpn: ensure packet delivery happens with BH disabled
ovpn: reset MAC header before passing skb up
====================
Link: https://patch.msgid.link/20260504230305.2681646-1-antonio@openvpn.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Fix the following failure to the steal_time test on arm64 by making
the timer address known to the guest.
==== Test Assertion Failure ====
steal_time.c:229: !ret
pid=18514 tid=18514 errno=22 - Invalid argument
1 0x000000000040252f: check_steal_time_uapi at steal_time.c:229 (discriminator 20)
2 (inlined by) main at steal_time.c:537 (discriminator 20)
3 0x0000ffffa23d621b: ?? ??:0
4 0x0000ffffa23d62fb: ?? ??:0
5 0x0000000000402b6f: _start at ??:?
KVM_SET_DEVICE_ATTR failed, rc: -1 errno: 22 (Invalid argument)
Fixes: 40351ed924 ("KVM: selftests: Refactor UAPI tests into dedicated function")
Signed-off-by: Sebastian Ott <sebott@redhat.com>
Link: https://patch.msgid.link/20260504112808.21276-1-sebott@redhat.com
Signed-off-by: Marc Zyngier <maz@kernel.org>
Update the selftests so they are executed for legacy (32 bytes RSEQ region)
and optimized RSEQ ABI v2 mode.
Fixes: d6200245c7 ("rseq: Allow registering RSEQ with slice extension")
Signed-off-by: Thomas Gleixner <tglx@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
Tested-by: Dmitry Vyukov <dvyukov@google.com>
Link: https://patch.msgid.link/20260428224428.009121296%40kernel.org
Cc: stable@vger.kernel.org
The RSEQ legacy mode behavior requires that the ID fields in the rseq
region are unconditionally updated on every context switch and before
signal delivery even if not required by the ABI specification.
To ensure that this behavior is preserved for legacy users in the future,
add a test which validates that with a sleep() and a signal sent to self.
Provide a run script which prevents GLIBC from registering a RSEQ region,
so that the test can register it's own legacy sized region.
Fixes: 566d8015f7 ("rseq: Avoid CPU/MM CID updates when no event pending")
Signed-off-by: Thomas Gleixner <tglx@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
Tested-by: Dmitry Vyukov <dvyukov@google.com>
Link: https://patch.msgid.link/20260428224427.764705536%40kernel.org
Cc: stable@vger.kernel.org
rseq_register_current_thread() either uses the glibc registered RSEQ region
or registers it's own region with the legacy size of 32 bytes.
That worked so far, but becomes a problem when the kernel implements a
distinction between legacy and performance optimized behavior based on the
registration size as that does not allow to test both modes with the self
test suite.
Add two arguments to the function. One to enforce that the registration is
not using libc provided mode and one to tell the registration to use the
legacy size and not the kernel advertised size.
Rename it and make the original one a inline wrapper which preserves the
existing behavior.
Fixes: 566d8015f7 ("rseq: Avoid CPU/MM CID updates when no event pending")
Signed-off-by: Thomas Gleixner <tglx@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
Tested-by: Dmitry Vyukov <dvyukov@google.com>
Link: https://patch.msgid.link/20260428224427.677889423%40kernel.org
Cc: stable@vger.kernel.org
Don't fail, skip the test if the extensions are not enabled at compile or
runtime.
Fixes: 830969e782 ("selftests/rseq: Implement time slice extension test")
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
Tested-by: Dmitry Vyukov <dvyukov@google.com>
Link: https://patch.msgid.link/20260428224427.597838491%40kernel.org
Cc: stable@vger.kernel.org
There were a few issues found with the tunnel vport types around the
vport destruction code. Add some basic tests, so at least we know that
they can be properly added and removed without obvious issues.
The test creates OVS datapath, adds a non-LWT tunnel port, makes sure
they are created, and then removes the datapath and waits for all the
ports to be gone.
The dpctl script had a few bugs in the none-lwt tunnel creation code,
so fixing them as well to make the testing possible:
- The type of the --lwt option changed in order to properly disable it.
- Removed byte order conversion for the port numbers, as the value
supposed to be in the host order.
- Added missing 'gre' choice for the tunnel type.
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Acked-by: Eelco Chaudron <echaudro@redhat.com>
Acked-by: Aaron Conole <aconole@redhat.com>
Link: https://patch.msgid.link/20260430233848.440994-3-i.maximets@ovn.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Fixes extra test number increment in ksft_exit_skip() that results in
incorrect KTAP result.
Fixes regression introduced by addition of explicit constructor orders
for fixture tests. This addition broke the ordering of those relative to
non-fixture tests and the reverse-constructor-order detection.
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEPZKym/RZuOCGeA/kCwJExA0NQxwFAmn4/0MACgkQCwJExA0N
QxwcXxAAjWAn/T16yWs5AV1txbTwEBv3siH6+3nMnZvgM8/VJEJRs0n6zQSjW1LQ
MfR6LdZPOkRpKyspz8RxSZnHT3qeKCiOnUaJbmwA4+BXv325N4T0m18Vy+izN1w7
pda3FEoe+vlCmclVv5iD3IsaeYMvMf0/PNy58+hHO1yrzOQ0GdgQCM+xDSwYZYBg
GZAYNlGkMql/XToBkBTZCTEC32MorldjKn0bs25MLDtWW+J9X+frgGmYbj2WN+9k
w2EjPGpakllDOtgfmj53Sa5BW9B8U25KXzpZryovwi5NHPhxz+VsdNet8YNkd0yy
VMqIYCPdvpC7DPMKdEboWlXMm3j7oiOeADXZGA8vEAvcXn0JmTbBy97RIsn+5wME
jqAtM4oj+n01QryxDF4pllJU5X5nvam+aO8Nax5jo1u8djQ1eFZToy8QviDT7oSE
dNJzWfzadOkWnHhBbEqjI7RxWbOwxU0yKHjun69ucmROP9zcK02Tw0PEb5rGkcu5
1fb/isNtszBe/LNvrERjPTty52g7K80kw05byzYCwjR1IUFLBX+Xjrs9mLuthIbD
7Mk4HlErKLK5YPLc2gwlI4KQNqoSZqirWQFqy0y9pFHL0Ip0ot8Ne190imP6i18A
lEzr0J22EuCmAlWs+Wn6RpCq3J3845ybq0lH2Hl+HfJphyPd/Fk=
=u8Qv
-----END PGP SIGNATURE-----
Merge tag 'linux_kselftest-fixes-7.1-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest
Pull kselftest fixes from Shuah Khan:
- Fix extra test number increment in ksft_exit_skip() that results in
incorrect KTAP result
- Fix regression introduced by addition of explicit constructor orders
for fixture tests. This addition broke the ordering of those relative
to non-fixture tests and the reverse-constructor-order detection
* tag 'linux_kselftest-fixes-7.1-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
selftests: harness: Restore order of test functions
selftests: kselftest: fix wrong test number in ksft_exit_skip
The second stage of test.sh ("run baseline data traffic") performs a
basic connectivity check with ping -qfc 500 -w 3. On slower CI
instances this is too strict for TCP: the RTT is high enough that 500
echo requests do not reliably complete within 3 seconds, so the stage
flakes and the test fails even though the ovpn setup is healthy.
Reduce the packet count to 100 for both the plain and 3000-byte pings in
that stage. This still verifies peer setup, key exchange, routing, and
data-path traffic, without making the basic connectivity check depend on
timing out under load.
Fixes: 959bc330a4 ("testing/selftests: add test tool and scripts for ovpn module")
Signed-off-by: Ralf Lici <ralf@mandelbit.com>
Signed-off-by: Antonio Quartulli <antonio@openvpn.net>
Create 4 test cases:
- Force red to dequeue from its child's gso_skb with qfq leaf
- Force sfb to dequeue from its child's gso_skb with qfq leaf
- Force red to dequeue from its child's gso_skb with dualpi2 leaf
- Force sfb to dequeue from its child's gso_skb with dualpi2 leaf
All of them have tbf followed by red (or sfb) followed by qfq (or
dualpi2). Since tbf calls its child's peek followed by
qdisc_dequeue_peeked, it will force red/sfb to call their child's peek.
In this case, since the child (qfq/dualpi2) has qdisc_peek_dequeued as
its peek callback, the packet will be stored in its gso_skb queue. During
the subsequent call to qdisc_dequeue_peeked, red/sfb will have to dequeue
from the child's gso_skb to retrieve the packet.
Not doing so will cause a NULL ptr deref which was happening before a
recent fix.
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Victor Nogueira <victor@mojatatu.com>
Link: https://patch.msgid.link/20260430152957.194015-4-jhs@mojatatu.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
With commit dacbfc1678 ("crypto: af_alg - Annotate struct af_alg_iv
with __counted_by"), two selftests, test_tag and crypto_sanity, now
indirectly rely on the __counted_by macro. On systems with commit
dacbfc1678 in the installed UAPI headers, the selftests build fails
with:
In file included from tools/testing/selftests/bpf/prog_tests/crypto_sanity.c:7:
/usr/include/linux/if_alg.h:45:22: error: expected ‘:’, ‘,’, ‘;’, ‘}’ or ‘__attribute__’ before ‘__counted_by’
45 | __u8 iv[] __counted_by(ivlen);
| ^~~~~~~~~~~~
This patch fixes it by regenerating stddef.h in tools/include using the
instructions from commit a778f5d46b ("tools/headers: Pull in stddef.h
to uapi to fix BPF selftests build in CI").
Fixes: dacbfc1678 ("crypto: af_alg - Annotate struct af_alg_iv with __counted_by")
Signed-off-by: Paul Chaignon <paul.chaignon@gmail.com>
Reviewed-by: Alan Maguire <alan.maguire@oracle.com>
Tested-by: Ihor Solodrai <ihor.solodrai@linux.dev>
Link: https://lore.kernel.org/r/8da8ef16055aa452d940668ed5359ce54adc6b0b.1777715500.git.paul.chaignon@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Without the previous commit, TCP failed to switch to alternative
IPv6 routes immediately upon carrier loss.
It would persist with the dead route until reaching the threshold
net.ipv4.tcp_retries1, leading to unnecessary delays in failover.
Let's add a selftest for this scenario to ensure TCP fails over
immediately upon a carrier loss event.
Before:
TEST: TCP IPv4 failover [ OK ]
TEST: TCP IPv6 failover [FAIL]
After:
TEST: TCP IPv4 failover [ OK ]
TEST: TCP IPv6 failover [ OK ]
Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>
Signed-off-by: Sagarika Sharma <sharmasagarika@google.com>
Link: https://patch.msgid.link/20260430200909.527827-3-sharmasagarika@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
- Avoid writing an uninitialised stack variable to POR_EL0 on
sigreturn if the poe_context record is absent
- Reserve one more page for the early 4K-page kernel mapping to cover
the extra [_text, _stext) split introduced by the non-executable
read-only mapping
- Force the arch_local_irq_*() wrappers to be __always_inline so that
noinstr entry and idle paths cannot call out-of-line, instrumentable
copies
- Fix potential sign extension in the arm64 SCS unwinder's DWARF
advance_loc4 decoding
- Tolerate arm64 ACPI platforms with only WFI and no deeper PSCI idle
states, restoring cpuidle registration on such systems
- Include the UAPI <asm/ptrace.h> header in the arm64 GCS libc test
rather than carrying a duplicate struct user_gcs definition (the
original #ifdef NT_ARM_GCS was wrong to cover the structure
definition as it would be masked out if the toolchain defined it)
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEE5RElWfyWxS+3PLO2a9axLQDIXvEFAmn03FoACgkQa9axLQDI
XvELcxAAmhoarEo1Te6wWyybco9LqvfZPzirij+YYLw0GWuqnN99N+f79FZirTbz
ug9AZiG1PPQY0hCurNWwEjQfWJ6dJYo/4mIT9R1rbeU2MwcxHawIePrM0T8PMBF8
nHMZaEy/EZ8hX3pam98d78F38yFUvxaikghhxQvHLFlQA4nU19IElQCyMogofe05
RTE71nDdMZAnfoOS6cVk7wnH99VLfbqiyl97zUOjnyFNdye99UDovayXPUdUkgbN
clF2qxWInS8TPuoKQPz5hzYkbuR0doFwIasLjSMnOQx+FMZdMmPXEZbwqI/hYl7l
xc5bjKtJH/AQqdoEkZW9MUJ1GhzMttTpoYW9//wgRpJtBDNxisdOE9LpcsCMMNIM
wKLrLVLTXsv5jyPeEFMRtUjd0tJ7bV0f3cO/sv5EVBd238CGT76zwCgjpMtZQqbj
KWsTJpM5oYAsKkBHAYE6XCa5h7kre0/249zH/CYhI/mXJkaHJRM8Ub2CnqBgqeTG
KobtDIUJt+TPAhThj/2OQ/HxP6SLzgBgsgVmVqE1nhkOPlcfg3YYBsgpgN+bzMfG
Z7h14yyCAhunoGRVBMtyUgksAvflR+PIS06soRjLZ5cXcOp/3h+sXs6/XVXHtOr/
UCeO5mfaNUNAr3xJ9oYhuAAT74b7zKXY3YM4NRVASfq6rQS0nWg=
=a9er
-----END PGP SIGNATURE-----
Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 fixes from Catalin Marinas:
- Avoid writing an uninitialised stack variable to POR_EL0 on sigreturn
if the poe_context record is absent
- Reserve one more page for the early 4K-page kernel mapping to cover
the extra [_text, _stext) split introduced by the non-executable
read-only mapping
- Force the arch_local_irq_*() wrappers to be __always_inline so that
noinstr entry and idle paths cannot call out-of-line, instrumentable
copies
- Fix potential sign extension in the arm64 SCS unwinder's DWARF
advance_loc4 decoding
- Tolerate arm64 ACPI platforms with only WFI and no deeper PSCI idle
states, restoring cpuidle registration on such systems
- Include the UAPI <asm/ptrace.h> header in the arm64 GCS libc test
rather than carrying a duplicate struct user_gcs definition (the
original #ifdef NT_ARM_GCS was wrong to cover the structure
definition as it would be masked out if the toolchain defined it)
* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
arm64: signal: Preserve POR_EL0 if poe_context is missing
arm64: Reserve an extra page for early kernel mapping
kselftest/arm64: Include <asm/ptrace.h> for user_gcs definition
ACPI: arm64: cpuidle: Tolerate platforms with no deep PSCI idle states
arm64/irqflags: __always_inline the arch_local_irq_*() helpers
arm64/scs: Fix potential sign extension issue of advance_loc4
The rseq selftests include two runner scripts run_param_test.sh and
run_syscall_errors_test.sh which set up the environment for test binaries
and run them with various parameters. Currently we list these test binaries
in TEST_GEN_PROGS but this results in the kselftest framework running them
directly as well as via the runners, resulting in duplication and spurious
failures when the environment is not correctly set up (eg, if glibc tries
to use rseq).
Move the binaries the runners invoke to TEST_GEN_PROGS_EXTENDED, binaries
listed there are built but not run by the framework. The param_test
benchmarks are not moved since they are not run by run_param_test.sh.
Fixes: 830969e782 ("selftests/rseq: Implement time slice extension test")
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://patch.msgid.link/20260423-selftests-rseq-use-runner-v1-1-e13a133754c1@kernel.org
Cc: stable@vger.kernel.org
and the remainder are for post-7.0 issues or aren't deemed suitable for
backporting.
There's a 2 patch DAMON series from SeongJae Park which address races
which could lead to use-after-free errors. And a 3 patch DAMON series
which avoids the possibility of presenting stale parameter values to
users.
-----BEGIN PGP SIGNATURE-----
iHUEABYKAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCafPaVQAKCRDdBJ7gKXxA
jiUxAQCceUQi6IqADUMhYAsbGcs1LoeMWfiMfbCz2NCoiTXAEwD/S2uqSELRPQQc
7iW6D7U6dTa3d2kkbnxC02ocekaxiQ4=
=M/FI
-----END PGP SIGNATURE-----
Merge tag 'mm-hotfixes-stable-2026-04-30-15-39' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull MM fixes from Andrew Morton:
"20 hotfixes. All are for MM (and for MMish maintainers). 9 are
cc:stable and the remainder are for post-7.0 issues or aren't deemed
suitable for backporting.
There are two DAMON series from SeongJae Park which address races
which could lead to use-after-free errors, and avoid the possibility
of presenting stale parameter values to users"
* tag 'mm-hotfixes-stable-2026-04-30-15-39' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
mm: memcontrol: fix rcu unbalance in get_non_dying_memcg_end()
mm/userfaultfd: detect VMA type change after copy retry in mfill_copy_folio_retry()
MAINTAINERS: remove stale kdump project URL
mm/damon/stat: detect and use fresh enabled value
mm/damon/lru_sort: detect and use fresh enabled and kdamond_pid values
mm/damon/reclaim: detect and use fresh enabled and kdamond_pid values
selftests/mm: specify requirement for PROC_MEM_ALWAYS_FORCE=y
mm/damon/sysfs-schemes: protect path kfree() with damon_sysfs_lock
mm/damon/sysfs-schemes: protect memcg_path kfree() with damon_sysfs_lock
MAINTAINERS: update Li Wang's email address
MAINTAINERS, mailmap: update email address for Qi Zheng
MAINTAINERS: update Liam's email address
mm/hugetlb_cma: round up per_node before logging it
MAINTAINERS: fix regex pattern in CORE MM category
mm/vma: do not try to unmap a VMA if mmap_prepare() invoked from mmap()
mm: start background writeback based on per-wb threshold for strictlimit BDIs
kho: fix error handling in kho_add_subtree()
liveupdate: fix return value on session allocation failure
mailmap: update entry for Dan Carpenter
vmalloc: fix buffer overflow in vrealloc_node_align()
kselftest includes kernel uAPI headers with option:
-isystem $(top_srcdir)/usr/include
Include <asm/ptrace.h> in libc-gcs.c for the definition of struct
user_gcs from the uAPI headers, and remove the redundant definition in
gcs-util.h. This fixes a compilation error on systems where the
toolchain defines NT_ARM_GCS.
Fixes: a505a52b4e ("kselftest/arm64: Add a GCS test program built with the system libc")
Signed-off-by: Leo Yan <leo.yan@arm.com>
Reviewed-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Minor clarifications in the README:
- call out what linters we expect to be clean
- make it clear that by "frameworks" we mean code under lib/
not just factoring code out in the same file
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
* Allow tracing for non-pKVM, which was accidentally disabled when
the series was merged
* Rationalise the way the pKVM hypercall ranges are defined by using
the same mechanism as already used for the vcpu_sysreg enum
* Enforce that SMCCC function numbers relayed by the pKVM proxy are
actually compliant with the specification
* Fix a couple of feature to idreg mappings which resulted in the
wrong sanitisation being applied
* Fix the GICD_IIDR revision number field that could never been
written correctly by userspace
* Make kvm_vcpu_initialized() correctly use its parameter instead
of relying on the surrounding context
* Enforce correct ordering in __pkvm_init_vcpu(), plugging a
potential pin leak at the same time
* Move __pkvm_init_finalise() to a less dangerous spot, avoiding
future problems
* Restore functional userspace irqchip support after a four year
breakage (last functional kernel was 5.18...).
* Spelling fixes
Selftests:
* Rename types across all KVM selftests to more closely align with types used
in the kernel:
vm_vaddr_t -> gva_t
vm_paddr_t -> gpa_t
uint64_t -> u64
uint32_t -> u32
uint16_t -> u16
uint8_t -> u8
int64_t -> s64
int32_t -> s32
int16_t -> s16
int8_t -> s8
* Fix Loongarch compilation.
-----BEGIN PGP SIGNATURE-----
iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmnvHfwUHHBib256aW5p
QHJlZGhhdC5jb20ACgkQv/vSX3jHroO+DQf/eEi0jLfQc7EBn6JQNU9AhegPDu0s
o76HQwI1/NuKodV8txpNKISvkwvc6CVzatAJLJ781VYu+a/lI9EXOsZI8dnlbYy4
M6j0Lo58rneEkb5rplzo8+ds300OZeXQBOlC7loCmLakhtEOrdrmfTVmi+U3BqdP
Q15kCEymuLLqxQggRivd2flyKHT7vY2HeuYJa+ocnqV8qNNUKDmyIwPoTbZ44Plo
c+EU7rOSwvFlYFDP19oViIwoW89/vUkF762U0EkLcsWwB4k5L8qoybog9cYAJ9u4
AG7wmwczkOlJKFwrIDD2Fx8JARqSIZhitwzT+zA/aUOtOl8fRwRKfHpsWw==
=PoE5
-----END PGP SIGNATURE-----
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull kvm updates from Paolo Bonzini:
"On top of a lot of Arm fixes, this includes a massive rename of types
and variables in tools/testing/selftests/kvm - these were
unnecessarily different from what the kernel uses, so they're being
made consistent.
arm64:
- Allow tracing for non-pKVM, which was accidentally disabled when
the series was merged
- Rationalise the way the pKVM hypercall ranges are defined by using
the same mechanism as already used for the vcpu_sysreg enum
- Enforce that SMCCC function numbers relayed by the pKVM proxy are
actually compliant with the specification
- Fix a couple of feature to idreg mappings which resulted in the
wrong sanitisation being applied
- Fix the GICD_IIDR revision number field that could never been
written correctly by userspace
- Make kvm_vcpu_initialized() correctly use its parameter instead of
relying on the surrounding context
- Enforce correct ordering in __pkvm_init_vcpu(), plugging a
potential pin leak at the same time
- Move __pkvm_init_finalise() to a less dangerous spot, avoiding
future problems
- Restore functional userspace irqchip support after a four year
breakage (last functional kernel was 5.18...)
- Spelling fixes
Selftests:
- Rename types across all KVM selftests to more closely align with
types used in the kernel:
vm_vaddr_t -> gva_t
vm_paddr_t -> gpa_t
uint64_t -> u64
uint32_t -> u32
uint16_t -> u16
uint8_t -> u8
int64_t -> s64
int32_t -> s32
int16_t -> s16
int8_t -> s8
- Fix Loongarch compilation"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (31 commits)
KVM: selftests: Add check_steal_time_uapi() implementation for LoongArch
KVM: arm64: Wake-up from WFI when iqrchip is in userspace
KVM: arm64: Fix initialisation order in __pkvm_init_finalise()
KVM: arm64: Fix pin leak and publication ordering in __pkvm_init_vcpu()
KVM: arm64: Fix kvm_vcpu_initialized() macro parameter
KVM: arm64: Fix FEAT_SPE_FnE to use PMSIDR_EL1.FnE, not PMSVer
KVM: arm64: Fix typo in feature check comments
KVM: arm64: Fix FEAT_Debugv8p9 to check DebugVer, not PMUVer
KVM: arm64: Reject non compliant SMCCC function calls in pKVM
KVM: arm64: vgic: Fix IIDR revision field extracted from wrong value
KVM: selftests: Replace "paddr" with "gpa" throughout
KVM: selftests: Replace "u64 nested_paddr" with "gpa_t l2_gpa"
KVM: selftests: Replace "u64 gpa" with "gpa_t" throughout
KVM: selftests: Replace "vaddr" with "gva" throughout
KVM: selftests: Clarify that arm64's inject_uer() takes a host PA, not a guest PA
KVM: selftests: Rename translate_to_host_paddr() => translate_hva_to_hpa()
KVM: selftests: Rename vm_vaddr_populate_bitmap() => vm_populate_gva_bitmap()
KVM: selftests: Rename vm_vaddr_unused_gap() => vm_unused_gva_gap()
KVM: selftests: Drop "vaddr_" from APIs that allocate memory for a given VM
KVM: selftests: Use u8 instead of uint8_t
...
The merge window pulled in the cgroup sub-scheduler infrastructure, and
new AI reviews are accelerating bug reporting and fixing - hence the
larger than usual fixes batch.
- Use-after-frees during scheduler load/unload. The disable path
could free the BPF scheduler while deferred irq_work / kthread work
was still in flight; cgroup setter callbacks read the active
scheduler outside the rwsem that synchronizes against teardown.
Fixed both, and reused the disable drain in the enable error paths
so the BPF JIT page can't be freed under live callbacks.
- Several BPF op invocations didn't tell the framework which runqueue
was already locked, so helper kfuncs that re-acquire the runqueue
by CPU could deadlock on the held lock. Fixed at the affected
callsites, including recursive parent-into-child dispatch.
- The hardlockup notifier ran from NMI but eventually took a
non-NMI-safe lock. Bounced through irq_work.
- A handful of bugs in the new sub-scheduler hierarchy: helper
kfuncs hard-coded the root instead of resolving the caller's
scheduler; the enable error path tried to disable per-task state
that had never been initialized, and leaked cpus_read_lock on the
way out; a sysfs object was leaked on every load/unload; the
dispatch fast-path used the root scheduler instead of the task's;
and a couple of CONFIG #ifdef guards were misclassified.
- Verifier-time hardening: BPF programs of unrelated struct_ops
types (e.g. tcp_congestion_ops) could call sched_ext kfuncs - a
semantic bug and, once sub-sched was enabled, a KASAN
out-of-bounds read. Now rejected at load. Plus a few NULL and
cross-task argument checks on sched_ext kfuncs, and a selftest
covering the new deny.
- rhashtable (Herbert): restored the insecure_elasticity toggle and
bounced the deferred-resize kick through irq_work to break a
lock-order cycle observable from raw-spinlock callers. sched_ext's
scheduler-instance hash is the first user of both.
- The bypass-mode load balancer used file-scope cpumasks; with
multiple scheduler instances now possible, those raced. Moved
per-instance, plus a follow-up to skip tasks whose recorded CPU is
stale relative to the new owning runqueue.
- Smaller fixes: a dispatch queue's first-task tracking misbehaved
when a parked iterator cursor sat in the list; the runqueue's
next-class wasn't promoted on local-queue enqueue, leaving an SCX
task behind RT in edge cases; the reference qmap scheduler stopped
erroring on legitimate cross-scheduler task-storage misses.
-----BEGIN PGP SIGNATURE-----
iIQEABYKACwWIQTfIjM1kS57o3GsC/uxYfJx3gVYGQUCafEN/A4cdGpAa2VybmVs
Lm9yZwAKCRCxYfJx3gVYGaydAQCxWrUqnZXhxF4LnztjxTF2tgv7p8P7TbpS6aU6
etqRpAEA9RFmIXs7XrhwCm0n2BwSgjvrNxnWfPhWvuH0uN0GTAA=
=wLna
-----END PGP SIGNATURE-----
Merge tag 'sched_ext-for-7.1-rc1-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/sched_ext
Pull sched_ext fixes from Tejun Heo:
"The merge window pulled in the cgroup sub-scheduler infrastructure,
and new AI reviews are accelerating bug reporting and fixing - hence
the larger than usual fixes batch:
- Use-after-frees during scheduler load/unload:
- The disable path could free the BPF scheduler while deferred
irq_work / kthread work was still in flight
- cgroup setter callbacks read the active scheduler outside the
rwsem that synchronizes against teardown
Fix both, and reuse the disable drain in the enable error paths so
the BPF JIT page can't be freed under live callbacks.
- Several BPF op invocations didn't tell the framework which runqueue
was already locked, so helper kfuncs that re-acquire the runqueue
by CPU could deadlock on the held lock
Fix the affected callsites, including recursive parent-into-child
dispatch.
- The hardlockup notifier ran from NMI but eventually took a
non-NMI-safe lock. Bounce it through irq_work.
- A handful of bugs in the new sub-scheduler hierarchy:
- helper kfuncs hard-coded the root instead of resolving the
caller's scheduler
- the enable error path tried to disable per-task state that had
never been initialized, and leaked cpus_read_lock on the way
out
- a sysfs object was leaked on every load/unload
- the dispatch fast-path used the root scheduler instead of the
task's
- a couple of CONFIG #ifdef guards were misclassified
- Verifier-time hardening: BPF programs of unrelated struct_ops types
(e.g. tcp_congestion_ops) could call sched_ext kfuncs - a semantic
bug and, once sub-sched was enabled, a KASAN out-of-bounds read.
Now rejected at load. Plus a few NULL and cross-task argument
checks on sched_ext kfuncs, and a selftest covering the new deny.
- rhashtable (Herbert): restore the insecure_elasticity toggle and
bounce the deferred-resize kick through irq_work to break a
lock-order cycle observable from raw-spinlock callers. sched_ext's
scheduler-instance hash is the first user of both.
- The bypass-mode load balancer used file-scope cpumasks; with
multiple scheduler instances now possible, those raced. Move to
per-instance cpumasks, plus a follow-up to skip tasks whose
recorded CPU is stale relative to the new owning runqueue.
- Smaller fixes:
- a dispatch queue's first-task tracking misbehaved when a parked
iterator cursor sat in the list
- the runqueue's next-class wasn't promoted on local-queue
enqueue, leaving an SCX task behind RT in edge cases
- the reference qmap scheduler stopped erroring on legitimate
cross-scheduler task-storage misses"
* tag 'sched_ext-for-7.1-rc1-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/sched_ext: (26 commits)
sched_ext: Fix scx_flush_disable_work() UAF race
sched_ext: Call wakeup_preempt() in local_dsq_post_enq()
sched_ext: Release cpus_read_lock on scx_link_sched() failure in root enable
sched_ext: Reject NULL-sch callers in scx_bpf_task_set_slice/dsq_vtime
sched_ext: Refuse cross-task select_cpu_from_kfunc calls
sched_ext: Align cgroup #ifdef guards with SUB_SCHED vs GROUP_SCHED
sched_ext: Make bypass LB cpumasks per-scheduler
sched_ext: Pass held rq to SCX_CALL_OP() for core_sched_before
sched_ext: Pass held rq to SCX_CALL_OP() for dump_cpu/dump_task
sched_ext: Save and restore scx_locked_rq across SCX_CALL_OP
sched_ext: Use dsq->first_task instead of list_empty() in dispatch_enqueue() FIFO-tail
sched_ext: Resolve caller's scheduler in scx_bpf_destroy_dsq() / scx_bpf_dsq_nr_queued()
sched_ext: Read scx_root under scx_cgroup_ops_rwsem in cgroup setters
sched_ext: Don't disable tasks in scx_sub_enable_workfn() abort path
sched_ext: Skip tasks with stale task_rq in bypass_lb_cpu()
sched_ext: Guard scx_dsq_move() against NULL kit->dsq after failed iter_new
sched_ext: Unregister sub_kset on scheduler disable
sched_ext: Defer scx_hardlockup() out of NMI
sched_ext: sync disable_irq_work in bpf_scx_unreg()
sched_ext: Fix local_dsq_post_enq() to use task's scheduler in sub-sched
...
There are VXLAN tests and IPsec tests, but there is no test that
combines the two protocols and exercises the tunnel-over-ipsec code
paths. Fix that by adding a traffic test with VXLAN and IPsec using
crypto offload. This is runnable on HW which supports ESP offload (so no
nsim unfortunately).
Traffic is done with iperf3 and the test validates that there are no
packet drops and iperf3 can get to at least 100 Mbps (a very
conservative value on today's crypto offload HW, as it can typically
reach multi-Gbps rates).
Ran right now, the test fails due to a recently exposed bug in xfrm,
which will be fixed in the next patch:
# ./tools/testing/selftests/drivers/net/hw/ipsec_vxlan.py
TAP version 13
1..4
# Check| At ./tools/testing/selftests/drivers/net/hw/ipsec_vxlan.py,
# line 161, in test_vxlan_ipsec_crypto_offload:
# Check| ksft_eq(drops_after - drops_before, 0,
# Check failed 189 != 0 TX drops during VXLAN+IPsec
# Check| At ./tools/testing/selftests/drivers/net/hw/ipsec_vxlan.py,
# line 163, in test_vxlan_ipsec_crypto_offload:
# Check| ksft_ge(bw_gbps, 0.1,
# Check failed 0.0015058278404812596 < 0.1 Minimum 100Mbps over
# VXLAN+IPsec
not ok 1 ipsec_vxlan.test_vxlan_ipsec_crypto_offload.outer_v4_inner_v4
...
Signed-off-by: Cosmin Ratiu <cratiu@nvidia.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
The default timeout of cmd() is 5 seconds and Iperf3Runner requests the
iperf3 client to run for 10 seconds, which clearly doesn't work since
commit [1] enforced the timeout parameter.
Use a value derived from duration as timeout (+5 seconds for
startup/teardown/various other overhead).
[1] commit f0bd193166 ("selftests: net: fix timeout passed as positional argument to communicate()")
Signed-off-by: Cosmin Ratiu <cratiu@nvidia.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Add a regression test for the NULL pointer dereference fixed in the
previous commit. Before the fix, taprio_graft() stored NULL into
q->qdiscs[cl - 1] when an explicitly grafted child qdisc was deleted
via RTM_DELQDISC; the next RTM_GETTCLASS dump then crashed the kernel
in taprio_dump_class() while reading child->handle.
The test installs a taprio root qdisc on a multi-queue netdevsim
device, grafts a pfifo child onto class 8001:1, deletes that child,
and then performs a class dump. On a fixed kernel the dump succeeds
and all eight taprio classes are listed; on an unpatched kernel the
class dump crashes, which surfaces as a test failure.
Signed-off-by: Weiming Shi <bestswngs@gmail.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Link: https://patch.msgid.link/20260422161958.2517539-4-bestswngs@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The recent addition of explicit constructor orders for fixture tests
broke the ordering of those relative to non-fixture tests and the
reverse-constructor-order detection.
Restore the ordering of the test functions relative to each other by
using the same explicit test order for all test registrations and
__constructor_order_first().
Rename the constant, as it is not specific to TEST_F() anymore.
Link: https://lore.kernel.org/r/20260422-kselftests-harness-order-v2-1-93ea980ea3ac@linutronix.de
Fixes: 6be2681514 ("selftests/harness: order TEST_F and XFAIL_ADD constructors")
Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Reviewed-by: Kees Cook <kees@kernel.org>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
ksft_exit_skip() increments ksft_xskip before printing the KTAP
result. As a result, ksft_test_num() already includes the skipped
test.
Adding 1 to ksft_test_num() increments the printed test number
again, producing an incorrect test number and wrong KTAP output.
Drop the extra increment and print ksft_test_num() directly.
Link: https://lore.kernel.org/r/20260427112447.147985-1-sarthak.sharma@arm.com
Fixes: b85d387c9b ("kselftest: fix TAP output for skipped tests")
Signed-off-by: Sarthak Sharma <sarthak.sharma@arm.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Several of the mm selftests made use of /proc/pid/mem as part of their
operation but we do not specify this in the config fragment for them, at
least mkdirty and ksm_functional_tests have this requirement.
This has been working fine in practice since PROC_MEM_ALWAYS_FORCE was the
default setting but commit 599bbba5a3 ("proc: make PROC_MEM_FORCE_PTRACE
the Kconfig default") that is no longer the case, meaning that tests run
on kernels built based on defconfigs have started having the new more
restrictive default and failing. Add PROC_MEM_ALWAYS_FORCE to the config
fragment for the mm selftests.
Thanks to Aishwarya TCV for spotting the issue and identifying the commit
that introduced it.
Link: https://lore.kernel.org/20260416-selftests-mm-proc-mem-always-force-v1-1-3f5865153c67@kernel.org
Fixes: 599bbba5a3 ("proc: make PROC_MEM_FORCE_PTRACE the Kconfig default")
Signed-off-by: Mark Brown <broonie@kernel.org>
Reported-by: Aishwarya TCV <aishwarya.tcv@arm.com>
Reviewed-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Acked-by: David Hildenbrand (Arm) <david@kernel.org>
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
Reviewed-by: Dev Jain <dev.jain@arm.com>
Cc: Lorenzo Stoakes <ljs@kernel.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Switching to private email address. Update all contact information
Add an entry to mailmap at the same time.
Link: https://lore.kernel.org/20260422184310.2682901-1-liam@infradead.org
Signed-off-by: Liam R. Howlett <liam@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
The mmap_prepare hook functionality includes the ability to invoke
mmap_prepare() from the mmap() hook of existing 'stacked' drivers, that is
ones which are capable of calling the mmap hooks of other drivers/file
systems (e.g. overlayfs, shm).
As part of the mmap_prepare action functionality, we deal with errors by
unmapping the VMA should one arise. This works in the usual mmap_prepare
case, as we invoke this action at the last moment, when the VMA is
established in the maple tree.
However, the mmap() hook passes a not-fully-established VMA pointer to the
caller (which is the motivation behind the mmap_prepare() work), which is
detached.
So attempting to unmap a VMA in this state will be problematic, with the
most obvious symptom being a warning in vma_mark_detached(), because the
VMA is already detached.
It's also unncessary - the mmap() handler will clean up the VMA on error.
So to fix this issue, this patch propagates whether or not an mmap action
is being completed via the compatibility layer or directly.
If the former, then we do not attempt VMA cleanup, if the latter, then we
do.
This patch also updates the userland VMA tests to reflect the change.
Link: https://lore.kernel.org/20260421102150.189982-1-ljs@kernel.org
Fixes: ac0a3fc9c0 ("mm: add ability to take further action in vm_area_desc")
Signed-off-by: Lorenzo Stoakes <ljs@kernel.org>
Reported-by: syzbot+db390288d141a1dccf96@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/all/69e69734.050a0220.24bfd3.0027.GAE@google.com/
Cc: David Hildenbrand <david@kernel.org>
Cc: Jann Horn <jannh@google.com>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Pedro Falcato <pfalcato@suse.de>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Renames types across all KVM selftests to more closely align with types used
in the kernel:
vm_vaddr_t -> gva_t
vm_paddr_t -> gpa_t
uint64_t -> u64
uint32_t -> u32
uint16_t -> u16
uint8_t -> u8
int64_t -> s64
int32_t -> s32
int16_t -> s16
int8_t -> s8
Using the kernel's preferred types eliminates a source of friction for many
contributors, as the majority of KVM selftests contributions come from kernel
developers. The kernel names are also shorter, which allows for more concise
code, and in any many cases eliminates newlines thanks to shorter types and
parameter names.
Rename variables and parameters as well as types, e.g. gpa instead of paddr,
to again align with the kernel, and in a few cases to remove ambiguity, e.g.
where paddr is used to refer to a _host_ physical address.
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEKTobbabEP7vbhhN9OlYIJqCjN/0FAmnqaAUACgkQOlYIJqCj
N/2k0Q//eRyhxrf7ZX7F3LNRLeHvvETipwXlkaiIT47HBnCf/hECCDiU4nmi5w3s
QA9Jmt4DcbvUd0OxHwxTGoTA53/Y5g4abJUU7W3pviBxNo4aMnyw+92L1I4oQVEN
gLCK7fQ+jl9e/utKCIhSiQPYfHgPGvSaWbP0X1WFNVfpiOryZt2z2iWo1c0kp5jv
mdAmWPwNX9ygWB0xLfc4MCo2S9Cgi3CvmIIVxHKnQt0V6yv+Lyzv725dy2FVbw/r
3CGTUM/Nr4mvbjBZsutaZLUY2+i/0g8VAp00m+SRhvREdAKUZla0eCbNJKr5qEji
jfGvZssQtv/1NOP5X79b0ewRNJ17ZhQ5hkUczh08H9ekGDrRyByUpbWTrC98ePYb
GmUEHZRfulxNEPa7lAqFfOCBZ1C2uHXD/+slh3ZQp+xtlwgj6iFQGk9zgN6Cw89p
RCla+R8LkEWQZjhJ4ZJzgLJggLD6/3UbgpV6Ic/KKxKGgIKMxeYmYDC1NiYmGN4T
5i4p3tMy6cHIwDEdNhix5/7VWZ5VGGRx0g+aQvjTvtmt3zoae+CwHX3kSowfLcPu
9jN87NPFl09IC6mB90Bebufzx0nUzCudyto7jqQaV1dVdRhkXe7YOdZZb5QtDVNu
3pPfH4+Zyx62emOOZO/pMTZgkLXye5ak/02TxrzheWCYUFt8Uvg=
=nYWK
-----END PGP SIGNATURE-----
Merge tag 'kvm-x86-selftests_kernel_types-7.1' of https://github.com/kvm-x86/linux into HEAD
KVM selftests type renames for 7.1
Renames types across all KVM selftests to more closely align with types used
in the kernel:
vm_vaddr_t -> gva_t
vm_paddr_t -> gpa_t
uint64_t -> u64
uint32_t -> u32
uint16_t -> u16
uint8_t -> u8
int64_t -> s64
int32_t -> s32
int16_t -> s16
int8_t -> s8
Using the kernel's preferred types eliminates a source of friction for many
contributors, as the majority of KVM selftests contributions come from kernel
developers. The kernel names are also shorter, which allows for more concise
code, and in any many cases eliminates newlines thanks to shorter types and
parameter names.
Rename variables and parameters as well as types, e.g. gpa instead of paddr,
to again align with the kernel, and in a few cases to remove ambiguity, e.g.
where paddr is used to refer to a _host_ physical address.
Define check_steal_time_uapi() for LoongArch so that the steal_time test
builds. Note, while LoongArch's steal_time_init() has some funky asserts,
none of the code is uniquely verifying KVM's uAPI.
Cc: Jiakai Xu <xujiakai2025@iscas.ac.cn>
Cc: Jiakai Xu <jiakaiPeanut@gmail.com>
Cc: Andrew Jones <andrew.jones@oss.qualcomm.com>
Cc: Anup Patel <anup@brainfault.org>
Cc: Tianrui Zhao <zhaotianrui@loongson.cn>
Cc: Bibo Mao <maobibo@loongson.cn>
Cc: Huacai Chen <chenhuacai@kernel.org>
Fixes: 40351ed924 ("KVM: selftests: Refactor UAPI tests into dedicated function")
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-ID: <20260420192644.3892050-1-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Since v2025.11.22:
Initial SoC Slider support
SoC Slider is an SoC-wide power/performance policy setting.
On SoC Slider systems, EPP plays a diminished role.
Whitespace cleanup via: indent -npro -kr -i8 -ts8 -sob -l160 -ss -ncs -cp1
No functional changes
Signed-off-by: Len Brown <len.brown@intel.com>
x86_energy_perf_policy accesses the SoC Slider via standard
user/kernel APIs to the processor_thermal_soc_slider driver.
Machines that support SoC Slider largely use it instead of EPP,
which may continue to exist in a diminished role, or vanish entirely.
Signed-off-by: Len Brown <len.brown@intel.com>
When processor_thermal_soc_slider is loaded, its slider
and offset modparams are visible. Check that the driver
actually registered the profile named "SoC Slider" before
reading or writing these modparams.
n.b. This utility allows writing the Slider and Offset modparams
even if the driver policy is not "balanced". Currently the
processor_thermal_soc_slider consults those modparams
only in "balanced" mode.
Signed-off-by: Len Brown <len.brown@intel.com>
-----BEGIN PGP SIGNATURE-----
iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAmnrh6gQHGF4Ym9lQGtl
cm5lbC5kawAKCRD301j7KXHgpkE+D/4h1caXVwyXjeBQ+UQ0hXqRkPHgCym+NxAx
W/ezIs2/Yldbe31x5YZqtYXfV+0SyZhd6ugDbRpOahol8+7wpa4E4wZNYfoxKKqF
+q9SysEDRCAOOGkCtNvxpALIzmmmXEswrqwJOmge2J+vdNsZwFq8APMiGS0kEo3x
nJrzOdxnpMHVY68yCB+LtRjggyptoJNMT14lIeFg1uXLAUK7CAg8Ax3qSzuAcW1m
zWTkGUudrVtVWNEjKdR4kvPWtrkbNb8PY6ybKxJo+4z1XMrOkVlbIAEX6o6dOIqy
hk5WKWNQIU82oxUi+FondGgq2yRHwuRY4fiDAz+eQ+SO6EcaUxQWqJgAD3gcNyRl
C2ywNRFT3EZhYHSr8ec+kuNX3Bot5YvRP2BdFpQpmKLhGGs1JwV9pjsYiLr1bXOy
fI6Wo1CsEnHM7I4rBFaji7W9ZhOXo/6IlehfDO2vf6iAAaWo1aDnJ4gOxNMcIYx3
hCGWWiug1RadsPXKe9wIjGjR9ahUl48Ucs06fgN2kMjtPb09QsB66yUF7jJS9mvk
Mdn9EltVqLIoyTt5vS/hUk/hO7VPzYaD/pL0yun4I9h7vi9hEyd/rXm3BIvuLQgJ
7u7c00ozgZyNVbubUa3lOxJLW7L/ukxHsCOJQW+FW/uJUvqxC3g5LGYaxe9r6uE0
KB1uNV7sKg==
=0AP+
-----END PGP SIGNATURE-----
Merge tag 'block-7.1-20260424' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux
Pull block fixes from Jens Axboe:
- Series for zloop, fixing a variety of issues
- t10-pi code cleanup
- Fix for a merge window regression with the bio memory allocation mask
- Fix for a merge window regression in ublk, caused by an issue with
the maple tree iteration code at teardown
- ublk self tests additions
- Zoned device pgmap fixes
- Various little cleanups and fixes
* tag 'block-7.1-20260424' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux: (21 commits)
Revert "floppy: fix reference leak on platform_device_register() failure"
ublk: avoid unpinning pages under maple tree spinlock
ublk: refactor common helper ublk_shmem_remove_ranges()
ublk: fix maple tree lockdep warning in ublk_buf_cleanup
selftests: ublk: add ublk auto integrity test
selftests: ublk: enable test_integrity_02.sh on fio 3.42
selftests: ublk: remove unused argument to _cleanup
block: only restrict bio allocation gfp mask asked to block
block/blk-throttle: Add WQ_PERCPU to alloc_workqueue users
block: Add WQ_PERCPU to alloc_workqueue users
block: relax pgmap check in bio_add_page for compatible zone device pages
block: add pgmap check to biovec_phys_mergeable
floppy: fix reference leak on platform_device_register() failure
ublk: use unchecked copy helpers for bio page data
t10-pi: reduce ref tag code duplication
zloop: remove irq-safe locking
zloop: factor out zloop_mark_{full,empty} helpers
zloop: set RQF_QUIET when completing requests on deleted devices
zloop: improve the unaligned write pointer warning
zloop: use vfs_truncate
...
- Add Kunit correctness testing and microbenchmarks for strlen(),
strnlen(), and strrchr()
- Add RISC-V-specific strnlen(), strchr(), strrchr() implementations
- Add hardware error exception handling
- Clean up and optimize our unaligned access probe code
- Enable HAVE_IOREMAP_PROT to be able to use generic_access_phys()
- Remove XIP kernel support
- Warn when addresses outside the vmemmap range are passed to
vmemmap_populate()
- Update the ACPI FADT revision check to warn if it's not at least
ACPI v6.6, which is when key RISC-V-specific tables were added to the
specification
- Increase COMMAND_LINE_SIZE to 2048 to match ARM64, x86, PowerPC, etc.
- Make kaslr_offset() a static inline function, since there's no need
for it to show up in the symbol table
- Add KASLR offset and SATP to the VMCOREINFO ELF notes to improve
kdump support
- Add Makefile cleanup rule for vdso_cfi copied source files, and add
a .gitignore for the build artifacts in that directory
- Remove some redundant ifdefs that check Kconfig macros
- Add missing SPDX license tag to the CFI selftest
- Simplify UTS_MACHINE assignment in the RISC-V Makefile
- Clarify some unclear comments and remove some superfluous comments
- Fix various English typos across the RISC-V codebase
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEElRDoIDdEz9/svf2Kx4+xDQu9KksFAmnqoPQACgkQx4+xDQu9
KksBGw/+K4cZ5+m2hnS8RVZmreDHPkpLuRmIxqPe1JG/cS0KHwWBX+IX9uYdrmqP
Ex+hZyt+pqFAdbEcV0t4445RR8Lz7D4SxzFFk6q36OuWkrFahOnQQm0prdO+CSok
Ch4AqbH0WNbgoU5xGpCbfsBeNeDOJWc+sNKmoMGF1mlZyy7s7m5jwu2vxdpuc7Ut
pkzqA87JR2Pn2C0EitlJv2mYiKLrnl+ma+yRLjLC3mtubs1HjIUoPTtS4iEuZt41
SabT0SWKPhKXvjxnVxqxKGizH77eciIz+fjecFGB2lO07Lc3z2asT8sJ1bnCspMI
e0Thbohs5Z2q2vGg49UqfDCm47BUWkSjhtgOi1E/JcWPahgCGGP4mYLD6AVZ9biK
gQofXZq5XGxLWjKOoNqh5nPIYIWDtgQgQkXkLiCNYcp1CZ0RaCkkER64UKeRuhoS
tSZuLIbjNzqQMhD9tKWnPueQS3tz3CdNvSMWiDgy+2HoKYIxcaDJ5zPPCMVTWEHn
ohoTLG63oRglV2x5ol27FQKip4SUpxXaDtnuPBytsgys88m0TIOkXvWpzU5si5jQ
O3n43ZiHsnA7jRl4MVlFKDwzHFnm8eOMxpThU34oHJku8AyYQS9zTc05KfbjJEsp
p7YDuh8bH7FHyxLQXHFNor4dCDRY7xU67urz3wjaGRopKA4UE4g=
=hG4G
-----END PGP SIGNATURE-----
Merge tag 'riscv-for-linus-7.1-mw1' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux
Pull RISC-V updates from Paul Walmsley:
"There is one significant change outside arch/riscv in this pull
request: the addition of a set of KUnit tests for strlen(), strnlen(),
and strrchr().
Otherwise, the most notable changes are to add some RISC-V-specific
string function implementations, to remove XIP kernel support, to add
hardware error exception handling, and to optimize our runtime
unaligned access speed testing.
A few comments on the motivation for removing XIP support. It's been
broken in the RISC-V kernel for months. The code is not easy to
maintain. Furthermore, for XIP support to truly be useful for RISC-V,
we think that compile-time feature switches would need to be added for
many of the RISC-V ISA features and microarchitectural properties that
are currently implemented with runtime patching. No one has stepped
forward to take responsibility for that work, so many of us think it's
best to remove it until clear use cases and champions emerge.
Summary:
- Add Kunit correctness testing and microbenchmarks for strlen(),
strnlen(), and strrchr()
- Add RISC-V-specific strnlen(), strchr(), strrchr() implementations
- Add hardware error exception handling
- Clean up and optimize our unaligned access probe code
- Enable HAVE_IOREMAP_PROT to be able to use generic_access_phys()
- Remove XIP kernel support
- Warn when addresses outside the vmemmap range are passed to
vmemmap_populate()
- Update the ACPI FADT revision check to warn if it's not at least
ACPI v6.6, which is when key RISC-V-specific tables were added to
the specification
- Increase COMMAND_LINE_SIZE to 2048 to match ARM64, x86, PowerPC,
etc.
- Make kaslr_offset() a static inline function, since there's no need
for it to show up in the symbol table
- Add KASLR offset and SATP to the VMCOREINFO ELF notes to improve
kdump support
- Add Makefile cleanup rule for vdso_cfi copied source files, and add
a .gitignore for the build artifacts in that directory
- Remove some redundant ifdefs that check Kconfig macros
- Add missing SPDX license tag to the CFI selftest
- Simplify UTS_MACHINE assignment in the RISC-V Makefile
- Clarify some unclear comments and remove some superfluous comments
- Fix various English typos across the RISC-V codebase"
* tag 'riscv-for-linus-7.1-mw1' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: (31 commits)
riscv: Remove support for XIP kernel
riscv: Reuse compare_unaligned_access() in check_vector_unaligned_access()
riscv: Split out compare_unaligned_access()
riscv: Reuse measure_cycles() in check_vector_unaligned_access()
riscv: Split out measure_cycles() for reuse
riscv: Clean up & optimize unaligned scalar access probe
riscv: lib: add strrchr() implementation
riscv: lib: add strchr() implementation
riscv: lib: add strnlen() implementation
lib/string_kunit: extend benchmarks to strnlen() and chr searches
lib/string_kunit: add performance benchmark for strlen()
lib/string_kunit: add correctness test for strrchr()
lib/string_kunit: add correctness test for strnlen()
lib/string_kunit: add correctness test for strlen()
riscv: vdso_cfi: Add .gitignore for build artifacts
riscv: vdso_cfi: Add clean rule for copied sources
riscv: enable HAVE_IOREMAP_PROT
riscv: mm: WARN_ON() for bad addresses in vmemmap_populate()
riscv: acpi: update FADT revision check to 6.6
riscv: add hardware error trap handler support
...
1, Adjust build infrastructure for 32BIT/64BIT;
2, Add HIGHMEM (PKMAP and FIX_KMAP) support;
3, Show and handle CPU vulnerabilites correctly;
4, Batch the icache maintenance for jump_label;
5, Add more atomic instructions support for BPF JIT;
6, Add more features (e.g. fsession) support for BPF trampoline;
7, Some bug fixes and other small changes.
-----BEGIN PGP SIGNATURE-----
iQJKBAABCAA0FiEEzOlt8mkP+tbeiYy5AoYrw/LiJnoFAmnpwWgWHGNoZW5odWFj
YWlAa2VybmVsLm9yZwAKCRAChivD8uImeiAXD/0RSRhj2y8LYGhVSPStMgN4uwMl
1ylbkRg0biTvV0g8sD1R3MQ58/tKBZY5wTeLjwT50rl+JgOqVdrN6OMAxjwOKzJ6
7C0rgpxBG5/YHI93paFVIYszsiWhRQaB5qfZCUOr230ZDJzvnfF1aH6JLybeHoMp
HvERNURQsRbZo9yc69YxhrmHETEbum37u9hsrY5mJSEs5Fh+QxvTSYjE36z3Dtal
YFqopTCaBgAhVw6BldVAcyvGvVK+d6iQEA035jObNLKKReNkwsQixxgnJhDSkbbG
Z3md+hWp+YQQElGIP5q6+rj1rJZGrs/XL3HAnTQfXN+8bXIUO9AOf2/l5f9fZx7o
2Vtt8n2/vVdzsVnKiHXGtsZ5uXrw4/kLiMZSCrUMZCtEOxJV9mmrVskPeie0iq0/
nDG9uCgRldL8Xpg7d5NM9coECui3J+ztNkv06tL/JLm02bJPuqNwt3FeA1T/aH1c
l2Hpw3Xuzl7lYuAYoa5CMm4X6yD/RA6w44pW1NKnb6j6llIOk6V6NwcwggWUnqht
oB5VIqPKMOYjZ+fLurI2o9VWqWokJxDdzyrHhXyaG0JRK9Vak06C8UI5BQuosu88
9WBoxK77PyNa60m56C32OZ5tu4UoPT8PgZWXDQDwn82SWzuYKWRruS2ng5A/JF7r
H2Ez4iBjs2/P7vTQHA==
=FiFl
-----END PGP SIGNATURE-----
Merge tag 'loongarch-7.1' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson
Pull LoongArch updates from Huacai Chen:
- Adjust build infrastructure for 32BIT/64BIT
- Add HIGHMEM (PKMAP and FIX_KMAP) support
- Show and handle CPU vulnerabilites correctly
- Batch the icache maintenance for jump_label
- Add more atomic instructions support for BPF JIT
- Add more features (e.g. fsession) support for BPF trampoline
- Some bug fixes and other small changes
* tag 'loongarch-7.1' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson: (21 commits)
selftests/bpf: Enable CAN_USE_LOAD_ACQ_STORE_REL for LoongArch
LoongArch: BPF: Add fsession support for trampolines
LoongArch: BPF: Introduce emit_store_stack_imm64() helper
LoongArch: BPF: Support up to 12 function arguments for trampoline
LoongArch: BPF: Support small struct arguments for trampoline
LoongArch: BPF: Open code and remove invoke_bpf_mod_ret()
LoongArch: BPF: Support load-acquire and store-release instructions
LoongArch: BPF: Support 8 and 16 bit read-modify-write instructions
LoongArch: BPF: Add the default case in emit_atomic() and rename it
LoongArch: Define instruction formats for AM{SWAP/ADD}.{B/H} and DBAR
LoongArch: Batch the icache maintenance for jump_label
LoongArch: Add flush_icache_all()/local_flush_icache_all()
LoongArch: Add spectre boundry for syscall dispatch table
LoongArch: Show CPU vulnerabilites correctly
LoongArch: Make arch_irq_work_has_interrupt() true only if IPI HW exist
LoongArch: Use get_random_canary() for stack canary init
LoongArch: Improve the logging of disabling KASLR
LoongArch: Align FPU register state to 32 bytes
LoongArch: Handle CONFIG_32BIT in syscall_get_arch()
LoongArch: Add HIGHMEM (PKMAP and FIX_KMAP) support
...
Steady stream of fixes. Last two weeks feel comparable to the two
weeks before the merge window. Lots of AI-aided bug discovery.
A newer big source is Sashiko/Gemini (Roman Gushchin's system),
which points out issues in existing code during patch review
(maybe 25% of fixes here likely originating from Sashiko).
Nice thing is these are often fixed by the respective maintainers,
not drive-bys.
Current release - new code bugs:
- kconfig: MDIO_PIC64HPSC should depend on ARCH_MICROCHIP
Previous releases - regressions:
- add async ndo_set_rx_mode and switch drivers which we promised
to be called under the per-netdev mutex to it
- dsa: remove duplicate netdev_lock_ops() for conduit ethtool ops
- hv_sock: report EOF instead of -EIO for FIN
- vsock/virtio: fix MSG_PEEK calculation on bytes to copy
Previous releases - always broken:
- ipv6: fix possible UAF in icmpv6_rcv()
- icmp: validate reply type before using icmp_pointers
- af_unix: drop all SCM attributes for SOCKMAP
- netfilter: fix a number of bugs in the osf (OS fingerprinting)
- eth: intel: fix timestamp interrupt configuration for E825C
Misc:
- bunch of data-race annotations
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEE6jPA+I1ugmIBA4hXMUZtbf5SIrsFAmnqkmMACgkQMUZtbf5S
Iruqig/+NSg/YwEkZLbSaW+0LqNMIdOVZPdves97YAvNRdcKvgAPB5I13/G+koCz
bRpmtdDLYTkfMFLaM582DO6XeO3Hsz/BrRRuRbyEz7lTi7PtxTEs1J+6W6NxGOQ2
30f3J7OGudGlinsFV9VkJe81rvFbKZFZ9fGPmOcVzzzfLvT3rrt20iVvMOyM+PpD
H0ixFW+myescEx6AQoGcVs/sDveJ4bpLpNG3p4gADh3Laj9HKSl00kudCIOQ1Kdy
SEHsSZs3A87ueOnGwIBl/x24zVWGTGHyKcmc5ENPUSIaNGOWzmBxvfhb5dZ989RQ
HQix+FMue21k4JypYwrdhU3MAnMDPLk+FDp4XJuwJ5I/caNLZXS2geIlnXOI5IFJ
ojuq4pF5njoWtvkWGvxxRM+shIMiDUYUK+k9xTMqmge88O9ahGIAYb2qyKL+P6Sl
mMuSRcArk6pw3lPbUA4u1wEaU52IdxRJDPQA/Ai3O5UVTfemJO/VqawQfuBE274g
KZXG4x0lwE+LSyoguTnSqhMCJk1ZXAeHjtpz1Yo3CEHOwCH9MxEEL/dldAXWZiWN
K0nLcUQ8fg3GnmOEzYw1gzDVJrgkR1eIrh6OCpw+UGCg0Af0HE6C6QBL9q59YhQw
DjLJAUNM8puBNIh9paCsHf1aIcFpPXBcR5dKoufCQx41x1OOqew=
=knNy
-----END PGP SIGNATURE-----
Merge tag 'net-7.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Jakub Kicinski:
"Including fixes from Netfilter.
Steady stream of fixes. Last two weeks feel comparable to the two
weeks before the merge window. Lots of AI-aided bug discovery. A newer
big source is Sashiko/Gemini (Roman Gushchin's system), which points
out issues in existing code during patch review (maybe 25% of fixes
here likely originating from Sashiko). Nice thing is these are often
fixed by the respective maintainers, not drive-bys.
Current release - new code bugs:
- kconfig: MDIO_PIC64HPSC should depend on ARCH_MICROCHIP
Previous releases - regressions:
- add async ndo_set_rx_mode and switch drivers which we promised to
be called under the per-netdev mutex to it
- dsa: remove duplicate netdev_lock_ops() for conduit ethtool ops
- hv_sock: report EOF instead of -EIO for FIN
- vsock/virtio: fix MSG_PEEK calculation on bytes to copy
Previous releases - always broken:
- ipv6: fix possible UAF in icmpv6_rcv()
- icmp: validate reply type before using icmp_pointers
- af_unix: drop all SCM attributes for SOCKMAP
- netfilter: fix a number of bugs in the osf (OS fingerprinting)
- eth: intel: fix timestamp interrupt configuration for E825C
Misc:
- bunch of data-race annotations"
* tag 'net-7.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (148 commits)
rxrpc: Fix error handling in rxgk_extract_token()
rxrpc: Fix re-decryption of RESPONSE packets
rxrpc: Fix rxrpc_input_call_event() to only unshare DATA packets
rxrpc: Fix missing validation of ticket length in non-XDR key preparsing
rxgk: Fix potential integer overflow in length check
rxrpc: Fix conn-level packet handling to unshare RESPONSE packets
rxrpc: Fix potential UAF after skb_unshare() failure
rxrpc: Fix rxkad crypto unalignment handling
rxrpc: Fix memory leaks in rxkad_verify_response()
net: rds: fix MR cleanup on copy error
m68k: mvme147: Make me the maintainer
net: txgbe: fix firmware version check
selftests/bpf: check epoll readiness during reuseport migration
tcp: call sk_data_ready() after listener migration
vhost_net: fix sleeping with preempt-disabled in vhost_net_busy_poll()
ipv6: Cap TLV scan in ip6_tnl_parse_tlv_enc_lim
tipc: fix double-free in tipc_buf_append()
llc: Return -EINPROGRESS from llc_ui_connect()
ipv4: icmp: validate reply type before using icmp_pointers
selftests/net: packetdrill: cover RFC 5961 5.2 challenge ACK on both edges
...
Inside migrate_dance(), add epoll checks around shutdown() to
verify that the target listener is not ready before shutdown()
and becomes ready immediately after shutdown() triggers migration.
Cover TCP_ESTABLISHED and TCP_SYN_RECV. Exclude TCP_NEW_SYN_RECV
as it depends on later handshake completion.
Suggested-by: Kuniyuki Iwashima <kuniyu@google.com>
Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com>
Signed-off-by: Zhenzhong Wu <jt26wzz@gmail.com>
Link: https://patch.msgid.link/20260422024554.130346-3-jt26wzz@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
RFC 5961 Section 5.2 / RFC 793 Section 3.9 require a challenge ACK
whenever an incoming SEG.ACK falls outside
[SND.UNA - MAX.SND.WND, SND.NXT]. There is currently no packetdrill
coverage for either edge.
Add tcp_rfc5961_ack-out-of-window.pkt, which in a single passive-open
connection exercises:
- Upper edge (SEG.ACK > SND.NXT): peer ACKs data that was never
sent before the server has transmitted anything.
- Lower edge (SEG.ACK < SND.UNA - MAX.SND.WND): after the server
has sent 2000 bytes (the peer-advertised rwnd forces two 1000-byte
segments, both acknowledged), peer sends an ACK that is older
than the acceptable window.
Both cases must elicit a challenge ACK
<SEQ = SND.NXT, ACK = RCV.NXT, CTL = ACK>. The per-socket RFC 5961
Section 7 rate limit is disabled for the duration of the test so that
both challenge ACKs can fire back-to-back.
Signed-off-by: Jiayuan Chen <jiayuan.chen@linux.dev>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20260422123605.320000-3-jiayuan.chen@linux.dev
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
RFC 5961 Section 5.2 validates an incoming segment's ACK value
against the range [SND.UNA - MAX.SND.WND, SND.NXT] and states:
"All incoming segments whose ACK value doesn't satisfy the above
condition MUST be discarded and an ACK sent back."
Commit 354e4aa391 ("tcp: RFC 5961 5.2 Blind Data Injection Attack
Mitigation") opted Linux into this mitigation and implements the
challenge ACK on the lower side (SEG.ACK < SND.UNA - MAX.SND.WND),
but the symmetric upper side (SEG.ACK > SND.NXT) still takes the
pre-RFC-5961 path and silently returns
SKB_DROP_REASON_TCP_ACK_UNSENT_DATA, even though RFC 793 Section 3.9
(now RFC 9293 Section 3.10.7.4) has always required:
"If the ACK acknowledges something not yet sent (SEG.ACK > SND.NXT)
then send an ACK, drop the segment, and return."
Complete the mitigation by sending a challenge ACK on that branch,
reusing the existing tcp_send_challenge_ack() path which already
enforces the per-socket RFC 5961 Section 7 rate limit via
__tcp_oow_rate_limited(). FLAG_NO_CHALLENGE_ACK is honoured for
symmetry with the lower-edge case.
Update the existing tcp_ts_recent_invalid_ack.pkt selftest, which
drives this exact path, to consume the new challenge ACK.
Fixes: 354e4aa391 ("tcp: RFC 5961 5.2 Blind Data Injection Attack Mitigation")
Signed-off-by: Jiayuan Chen <jiayuan.chen@linux.dev>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20260422123605.320000-2-jiayuan.chen@linux.dev
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Add a new chk_sndbuf() helper to diag.sh that extracts the sndbuf
(the 'tb' field from 'ss -m' skmem output) for both server and
client MPTCP sockets, and verifies they are equal.
Without the previous patch, it will fail:
'''
07 ....chk sndbuf server/client [FAIL] sndbuf S=20480 != C=2630656
'''
Signed-off-by: Gang Yan <yangang@kylinos.cn>
Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20260420-net-mptcp-sync-sndbuf-accept-v1-2-e3523e3aeb44@kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
The end-to-end integrity ublk selftest test_integrity_02 requires a
relatively recent fio version to support I/O with integrity buffers. Add
a version test_integrity_03 that uses the block layer's auto integrity
path instead. The auto integrity code doesn't check the application tag,
and doesn't indicate the bad guard/ref tag (just returns EILSEQ). But
it's a good smoke-test of the ublk integrity code and provides coverage
of the auto integrity path as well.
Signed-off-by: Caleb Sander Mateos <csander@purestorage.com>
Reviewed-by: Ming Lei <tom.leiming@gmail.com>
Link: https://patch.msgid.link/20260421200901.1528842-4-csander@purestorage.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
fio 3.42 was released with the needed fix for test_integrity_02.sh.
Allow 3.42 and newer in the fio version check.
Signed-off-by: Caleb Sander Mateos <csander@purestorage.com>
Reviewed-by: Ming Lei <tom.leiming@gmail.com>
Link: https://patch.msgid.link/20260421200901.1528842-3-csander@purestorage.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
The _cleanup helper function doesn't take any arguments, so drop them
from its callers.
Signed-off-by: Caleb Sander Mateos <csander@purestorage.com>
Reviewed-by: Ming Lei <tom.leiming@gmail.com>
Link: https://patch.msgid.link/20260421200901.1528842-2-csander@purestorage.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
In order to do the following load-acquire and store-release tests on
LoongArch:
sudo ./test_progs -t verifier_load_acquire
sudo ./test_progs -t verifier_store_release
sudo ./test_progs -t verifier_precision/bpf_load_acquire
sudo ./test_progs -t verifier_precision/bpf_store_release
sudo ./test_progs -t compute_live_registers/atomic_load_acq_store_rel
It needs to enable CAN_USE_LOAD_ACQ_STORE_REL for LoongArch.
Acked-by: Hengqi Chen <hengqi.chen@gmail.com>
Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
- Fix month in date timestamp used to create failure directories
On failure, a directory is created to store the logs and config file to
analyze the failure. The Perl function localtime is used to create the
data timestamp of the directory. The month passed back from that
function starts at 0 and not 1, but the timestamp used does not account
for that. Thus for April 20, 2026, the timestamp of 20260320 is used,
instead of 20260420.
- Save the logfile to the failure directory
Just the test log is saved to the directory on failure, but there's
useful information in the full logfile that can be helpful to analyzing
the failure. Save the logfile as well.
-----BEGIN PGP SIGNATURE-----
iIoEABYKADIWIQRRSw7ePDh/lE+zeZMp5XQQmuv6qgUCaeknTxQccm9zdGVkdEBn
b29kbWlzLm9yZwAKCRAp5XQQmuv6qm9hAP9jPNvXaCqvcGfz9QN2fEOs7sWWVOAQ
gryZ8Nyqn4CmswD/aQkYBVRSeWsvUrbw7WYrFvEOH0hUfKD6CaFLaA6EigE=
=ONM2
-----END PGP SIGNATURE-----
Merge tag 'ktest-v7.1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-ktest
Pull ktest updates from Steven Rostedt:
- Fix month in date timestamp used to create failure directories
On failure, a directory is created to store the logs and config file
to analyze the failure. The Perl function localtime is used to create
the data timestamp of the directory. The month passed back from that
function starts at 0 and not 1, but the timestamp used does not
account for that. Thus for April 20, 2026, the timestamp of 20260320
is used, instead of 20260420.
- Save the logfile to the failure directory
Just the test log was saved to the directory on failure, but there's
useful information in the full logfile that can be helpful to
analyzing the failure. Save the logfile as well.
* tag 'ktest-v7.1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-ktest:
ktest: Add logfile to failure directory
ktest: Fix the month in the name of the failure directory
Add a bpf_tcp_ca selftest for the TCP_NODELAY restriction in
bpf-tcp-cc.
Update bpf_cubic to exercise init() and cwnd_event_tx_start(),
and check that both callbacks reject bpf_setsockopt(TCP_NODELAY)
with -EOPNOTSUPP.
Signed-off-by: KaFai Wan <kafai.wan@linux.dev>
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Link: https://patch.msgid.link/20260421155804.135786-5-kafai.wan@linux.dev
Add a sockops selftest for the TCP_NODELAY restriction in
BPF_SOCK_OPS_HDR_OPT_LEN_CB and BPF_SOCK_OPS_WRITE_HDR_OPT_CB.
With BPF_SOCK_OPS_WRITE_HDR_OPT_CB_FLAG enabled,
bpf_setsockopt(TCP_NODELAY) returns -EOPNOTSUPP from
BPF_SOCK_OPS_HDR_OPT_LEN_CB and BPF_SOCK_OPS_WRITE_HDR_OPT_CB, avoiding
unbounded recursion and kernel stack overflow.
Other cases continue to work as before, including
BPF_SOCK_OPS_PASSIVE_ESTABLISHED_CB.
Signed-off-by: KaFai Wan <kafai.wan@linux.dev>
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Reviewed-by: Jiayuan Chen <jiayuan.chen@linux.dev>
Link: https://patch.msgid.link/20260421155804.135786-4-kafai.wan@linux.dev
-----BEGIN PGP SIGNATURE-----
iQFHBAABCgAxFiEEIbPD0id6easf0xsudhRwX5BBoF4FAmnobFATHHdlaS5saXVA
a2VybmVsLm9yZwAKCRB2FHBfkEGgXhafB/wMvf8yu4FgapbSRIlPboeW/ONDyuHd
k0y1a0IFMQLspWfCxK8+snEcHT3g9xzG3ksqcnac0SPgzqxQ4dlK/c7Xr+5EBXuf
TEt/bHrt6KT5BUpb1k/XxcdoObCsvZd3vfqR020OsHijw1Ni9PrdqeZxk56/vFJs
nvgyKvsrAEyfALOH1Vgwg0gNWpqJBj1KcT3Kl1o4p5lwQzVpUREZii6RyvgXT/pu
mckN63FrEPOpDaJllHmCPcfFSzqNi+wFQcFxm35w9rDQsVdnMsRWoJbcXNYW5QGM
+KOZ1/tzw4Z3queW78hcxsH6sXiElLDsJgtDohBhxbVvUXy+xHrJCOm5
=4Ke3
-----END PGP SIGNATURE-----
Merge tag 'hyperv-next-signed-20260421' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux
Pull Hyper-V updates from Wei Liu:
- Fix cross-compilation for hv tools (Aditya Garg)
- Fix vmemmap_shift exceeding MAX_FOLIO_ORDER in mshv_vtl (Naman Jain)
- Limit channel interrupt scan to relid high water mark (Michael
Kelley)
- Export hv_vmbus_exists() and use it in pci-hyperv (Dexuan Cui)
- Fix cleanup and shutdown issues for MSHV (Jork Loeser)
- Introduce more tracing support for MSHV (Stanislav Kinsburskii)
* tag 'hyperv-next-signed-20260421' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux:
x86/hyperv: Skip LP/VP creation on kexec
x86/hyperv: move stimer cleanup to hv_machine_shutdown()
Drivers: hv: vmbus: fix hyperv_cpuhp_online variable shadowing
mshv: Add tracepoint for GPA intercept handling
mshv_vtl: Fix vmemmap_shift exceeding MAX_FOLIO_ORDER
tools: hv: Fix cross-compilation
Drivers: hv: vmbus: Export hv_vmbus_exists() and use it in pci-hyperv
mshv: Introduce tracing support
Drivers: hv: vmbus: Limit channel interrupt scan to relid high water mark
Since v2026.02.14
Display HT siblings in cpu# order.
Add Module-ID column.
Print Core-ID and APIC-ID in hex.
Fix misc bugs.
Signed-off-by: Len Brown <len.brown@intel.com>
On large systems with HT sibling cpu#'s more than 32 apart,
HT siblings were processed and displayed in reverse order.
This was due to how set_thread_siblings() parsed the
sibling-bit-mask.
Update set_thread_siblings to instead parse the sibling-list,
like other cpu lists, and to thus order HT siblings
by ascending CPU number, no matter the size of the system.
Signed-off-by: Len Brown <len.brown@intel.com>
Get the "module_id" from the Linux topology "cluster_id".
If the there is more than one id, show it by default.
Module joins Die etc. in the "topology" group.
Display in hex, as it is usually based mask of the APIC-id
Signed-off-by: Len Brown <len.brown@intel.com>
The core_id is based on a mask of the apic_id.
Print them both in hex, rather than decimal,
to make this relationship visibly clear.
Signed-off-by: Len Brown <len.brown@intel.com>
Make printer helper functions more readable by factoring
out a local 'sep' variable.
Remove the redundant parentheses around sprintf() calls.
Remove an unnecessary cast to "unsigned int" by using the '%08llx' instead
of '%08x'.
No functional changes.
[lenb: fix typos, simplify]
Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
When the "--cpu-set" option limits turbostat to run on
a higher numbered HT sibling, it exits upon dividing by zero.
This is because the HT support handles higher numbered siblings
at the same time as lower numbered siblings. But when that lower
number sibling is dis-allowed, the higher numbered sibling is
never processed. The result is a time delta of 0, which results
in a divide by 0 for any of the "per-second" metrics.
Enhance the HT enumeration code to record all siblings (up to SMT4).
Consult this complete HT sibling list to determine when
to process an HT sibling, and when to skip it.
Fixes: a2b4d0f8bf ("tools/power turbostat: Favor cpu# over core#")
Signed-off-by: Len Brown <len.brown@intel.com>
"turbostat --cpu-set 0" appears to hang if cpu0 has an HT sibling.
This is because the initialization code recognizes that it does not
have to open perf files for the HT sibling, but the HT support
in the collection code sees the HT sibling and tries to read
from an uninitialized file descriptor, 0 (standard input).
Access HT siblings only when they are in the allowed set.
Fixes: a2b4d0f8bf ("tools/power turbostat: Favor cpu# over core#")
Signed-off-by: Len Brown <len.brown@intel.com>
Reported-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
The '-P' short option (shorthand for --no-perf) is not present in the
optstring of the second call to getopt_long_only(). This results in
the "unrecognized option" error when the tool reaches the main parsing
loop.
Add 'P' to the second getopt_long_only() call to ensure it is
consistently recognized.
Fixes: a0e86c90b8 ("tools/power turbostat: Add --no-perf option")
Signed-off-by: David Arcari <darcari@redhat.com>
Signed-off-by: Len Brown <len.brown@intel.com>
- fprobe: Fixed several bugs, which includes followings:
. Prevention of re-registration: Added an earlier check to reject
re-registering an already active fprobe before its state is modified
during the initialization phase.
. Robustness in failure paths:
- Ensured fprobes are correctly removed from all internal tables and
properly RCU-freed during registration failure.
- Modified unregister_fprobe() to proceed with unregistration even if
temporary memory allocation fails.
. RCU safety in module unloading: Avoided a potential "sleep in RCU"
warning by removing a kcalloc() call in the module notifier path. This
also tries to remove fprobe_hash_node even if memory allocation fails.
. Type-aware unregistration: Fixed a bug where unregistering an fprobe
did not account for different types (entry-only vs. entry-exit) at the
same address, which previously left "junk" entries in the underlying
ftrace/fgraph ops.
. Unregistration of empty ftrace_ops: Avoided unneeded performance
overhead because of making registered ftrace_ops empty (means trace
all functions.) This counts remaining entries and unregister ftrace_ops
when it becomes empty.
- ftracetest: In addition to the fix, two new selftests to check above
fixes, are included:
. Module Unloading Test: Specifically verifies that fprobe events on
a module are correctly cleaned up and do not trigger "trace-all"
behavior when the module is removed.
. Multiple Fprobe Events Test: Ensures that having multiple fprobes on
the same function correctly manages the ftrace hash map during
removal.
-----BEGIN PGP SIGNATURE-----
iQFPBAABCgA5FiEEh7BulGwFlgAOi5DV2/sHvwUrPxsFAmnoGFUbHG1hc2FtaS5o
aXJhbWF0c3VAZ21haWwuY29tAAoJENv7B78FKz8bvR8H/3f+lFlflidmqgjW+dy5
cB73jVVznarkvVr2AdR1KpNv3V9qjLCWAiUM6sDI79mbMNbOKh44EkEX9P5actNq
3e4NofYr8TyfnJfD6tnrevjhZspCbvRWKVc2+gY9EIi9XUxSRjGuv/5VkIa5HMAN
3Zg++U0mYBTEAn59nGR++GakMuF/UJChOC5eVKpVYjtRLQ+TzWe6nwBhXj6PrhVT
OjZN4LHLH/Eze+Suk2duBDt0RQhfPqiz8yW7eP6TzxkS/DN+hmgoQ/zqFjndPsmc
1TnJ1iJLJ5vo0SqtuCKDv9OTc1oclMCYYdzCxEOnrfFXX4BXxGjzcEd7sjINHMKd
kBY=
=9oav
-----END PGP SIGNATURE-----
Merge tag 'probes-v7.1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace
Pull probes fixes from Masami Hiramatsu:
"fprobe bug fixes:
- Prevent re-registration
Add an earlier check to reject re-registering an already active
fprobe before its state is modified during the initialization phase
- Robustness in failure paths:
- Ensure fprobes are correctly removed from all internal tables
and properly RCU-freed during registration failure
- Make unregister_fprobe() proceed with unregistration even if
temporary memory allocation fails
- RCU safety in module unloading
Avoid a potential "sleep in RCU" warning by removing a kcalloc()
call in the module notifier path. This also tries to remove
fprobe_hash_node even if memory allocation fails.
- Type-aware unregistration
Fix a bug where unregistering an fprobe did not account for
different types (entry-only vs entry-exit) at the same address,
which previously left "junk" entries in the underlying
ftrace/fgraph ops
- Unregistration of empty ftrace_ops
Avoid unneeded performance overhead due to making registered
ftrace_ops empty - which means 'trace all functions'. This counts
remaining entries and unregister ftrace_ops when it becomes empty.
Two new selftests to check above fixes:
- Module Unloading Test:
Specifically verifies that fprobe events on a module are correctly
cleaned up and do not trigger 'trace-all' behavior when the module
is removed.
- Multiple Fprobe Events Test:
Ensure that having multiple fprobes on the same function correctly
manages the ftrace hash map during removal"
* tag 'probes-v7.1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
selftests/ftrace: Add a testcase for multiple fprobe events
selftests/ftrace: Add a testcase for fprobe events on module
tracing/fprobe: Fix to unregister ftrace_ops if it is empty on module unloading
tracing/fprobe: Check the same type fprobe on table as the unregistered one
tracing/fprobe: Avoid kcalloc() in rcu_read_lock section
tracing/fprobe: Remove fprobe from hash in failure path
tracing/fprobe: Unregister fprobe even if memory allocation fails
tracing/fprobe: Reject registration of a registered fprobe before init
turbostat.c:8688: rapl_perf_init: Assertion `next_domain < num_domains' failed.
The initial fix for this regression was incomplete, as it did not
handle multi-package systems with sparse core ids.
Fixes: ef0e60083f ("tools/power turbostat: Fix AMD RAPL regression")
Signed-off-by: Len Brown <len.brown@intel.com>
- Fix an integer underflow in the mpi library
- Improve the crypto library documentation
-----BEGIN PGP SIGNATURE-----
iIoEABYIADIWIQSacvsUNc7UX4ntmEPzXCl4vpKOKwUCaefDABQcZWJpZ2dlcnNA
a2VybmVsLm9yZwAKCRDzXCl4vpKOKy6mAQDjR6/V4IukSFyMaWYJBGGOZNRUwLda
DwHkFLpfAKJP4AD6AovNzxLksG2oms6pkcvPiDVtT2maRr+MiVlc0CpKJQg=
=DkWb
-----END PGP SIGNATURE-----
Merge tag 'libcrypto-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux
Pull more crypto library updates from Eric Biggers:
"Crypto library fix and documentation update:
- Fix an integer underflow in the mpi library
- Improve the crypto library documentation"
* tag 'libcrypto-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux:
lib/crypto: docs: Add rst documentation to Documentation/crypto/
docs: kdoc: Expand 'at_least' when creating parameter list
lib/crypto: mpi: Fix integer underflow in mpi_read_raw_from_sgl()
scx_fork() dispatches ops.init_task to exactly one scheduler - the one
owning the forking task's cgroup. A task forked inside a sub-scheduler's
cgroup is init'd into the sub only; the root scheduler has no task_ctx
entry for it. When that task later appears as @prev in the root's
qmap_dispatch() (or flows through core-sched comparison via task_qdist),
the bpf_task_storage_get() legitimately misses.
qmap treated those misses as fatal via scx_bpf_error("task_ctx lookup
failed") and aborted the scheduler as soon as the first cross-sched
task hit the root. Drop the error in the sites where the miss is
legitimate: lookup_task_ctx() (helper; callers already check for NULL),
qmap_dispatch()'s @prev branch (bookkeeping-only), task_qdist()
(returns 0 which makes the comparison a no-op), and qmap_select_cpu()
(returns prev_cpu as a no-op fallback instead of -ESRCH). The existing
scx_error was a paranoid guard from the pre-sub-sched world where every
task was owned by the one and only scheduler.
v2: qmap_select_cpu() returns prev_cpu on NULL instead of -ESRCH, so
the root scheduler doesn't error on cross-sched tasks that pass
through it (Andrea Righi).
Fixes: 4f8b122848 ("sched_ext: Add basic building blocks for nested sub-scheduler dispatching")
Signed-off-by: Tejun Heo <tj@kernel.org>
Reviewed-by: Andrea Righi <arighi@nvidia.com>
Reviewed-by: Zhao Mengmeng <zhaomengmeng@kylinos.cn>
Replace teamd daemon usage with ip link commands for team device
setup. teamd -d daemonizes and returns to the shell before port
addition completes, creating a race: the test may create the macvlan
(and check for its address on a slave) before teamd has finished
adding ports. This makes the test inherently dependent on scheduling
timing.
Using ip commands makes port addition synchronous, removing the race
and making the test deterministic.
Cc: Jiri Pirko <jiri@resnulli.us>
Cc: Jay Vosburgh <jv@jvosburgh.net>
Signed-off-by: Stanislav Fomichev <sdf@fomichev.me>
Link: https://patch.msgid.link/20260416185712.2155425-16-sdf@fomichev.me
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Fixes regressions in non-bash shells and busybox support, and reverts
a commit that regression in build and installation when one or more
tests fail to build. Fixes duplicated test number reporting introduced
in ktap support patch.
- selftests: Fix duplicated test number reporting
- selftests: Fix runner.sh for non-bash shells
- selftests: Fix runner.sh busybox support
- selftests: Deescalate error reporting
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEPZKym/RZuOCGeA/kCwJExA0NQxwFAmnmPx8ACgkQCwJExA0N
QxwZ7g/+L4ZZDp3RyuauvCV4zhG5GQFudtvAkcLOSwBVnRbiLdPmdOXXz7IkW7DN
U/WQx3pDOYmvtr2QdNvXch3HOdk1vUfNViU5yPNqC4jVZEMON2N7oGr2Eq+WVhi+
gl63pRYk9ISh+5vOlzQY9UX1sLOxlME1foMJdHQEZHhgbNxlc7s/NfpqAnRC7a4l
SFuzL/PJl9kYiMUFeYLB9kwrelvoLrzItMVz7/m56dgNVuEmbNDESBXGwJQneH6l
SWOXPC96gu2cajluNfyhOqarkuGVD8x6J+2vWBwrDnSiyMLyealAOHnK5JGR17hW
NErJDpqpdlIue5/h/XFnZ+4o43J8uEiUxmP7UiPAmreBllajeNz4xZPuz+i2vLH2
O9dzzj/SV9War5txaFdqHXpbZE2zYOfhA07Xg6VdjcB0LTWaSOPqiIPr9UTwvT6o
T7vYkvE+w4rjXwTFEscHkZ5jXrvAiWMrgiK4BuzXWy03/BvOF6LiMf0NCELvKvZG
ZubLCJ1N/2EXgt+MX9dRmxq+7ZXCGu53TU5GeX1u/vT5lqsaPwoXT4RylZek5hwx
DfKjEOU22TOQXAV01z1sPJvNwPZa84Hejzf6c6v2xobY/vaf4XxgyXAIAJJGOfpj
25nEdecvjdX62kFfY/QrX7akpr7IreonpssmQVuGt3jilpq4uLg=
=1k+f
-----END PGP SIGNATURE-----
Merge tag 'linux_kselftest-next-7.1-next-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest
Pull kselftest fixes from Shuah Khan:
"Fix regressions in non-bash shells and busybox support, and revert a
commit that regressed in build and installation when one or more tests
fail to build.
Fix duplicated test number reporting introduced in ktap support patch"
* tag 'linux_kselftest-next-7.1-next-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
selftests: Fix duplicated test number reporting
selftests: Fix runner.sh for non-bash shells
selftests: Fix runner.sh busybox support
selftests: Deescalate error reporting
Replace all variations of "paddr" variables in KVM selftests with "gpa",
with the exception of the ELF structures, as those fields are not specific
to guest virtual addresses, to complete the conversion from vm_paddr_t to
gpa_t.
No functional change intended.
Link: https://patch.msgid.link/20260420212004.3938325-20-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
In x86's nested TDP APIs, use the appropriate gpa_t typedef and rename
variables from nested_paddr to l2_gpa to match KVM x86's nomenclature.
No functional change intended.
Link: https://patch.msgid.link/20260420212004.3938325-19-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
Replace all variations of "vaddr" variables in KVM selftests with "gva",
with the exception of the ELF structures, as those fields are not specific
to guest virtual addresses, to complete the conversion from vm_vaddr_t to
gva_t.
Opportunistically use gva_t instead of u64 for relevant variables, and
fixup indentation as appropriate.
No functional change intended.
Link: https://patch.msgid.link/20260420212004.3938325-17-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
Rename inject_uer()'s @paddr to @hpa to make it more obvious that it
injects an error using a host PA, not a guest PA.
No functional change intended.
Link: https://patch.msgid.link/20260420212004.3938325-16-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
Rename arm64's translate_to_host_paddr() to translate_hva_to_hpa() and
update variable names to match, as using "vaddr" and "paddr" terminology
is super confusing due to selftests using those exact names for *guest*
addresses.
Opportunisitically drop superfluous local page_addr and paddr variables.
No functional change intended.
Link: https://patch.msgid.link/20260420212004.3938325-15-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
Now that KVM selftests use gva_t instead of vm_vaddr_t, rename the helper
for populating the initial GVA bitmap to drop the defunct terminology and
use "vm" for the scope.
Opportunistically fixup the declaration of the API, which has been broken
since day 1. The flaw went unnoticed because the sole caller is defined
after the weak version, i.e. can see the prototype without a previous
declaration.
No functional change intended.
Fixes: e8b9a055fa ("KVM: arm64: selftests: Align VA space allocator with TTBR0")
Link: https://patch.msgid.link/20260420212004.3938325-14-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
Now that KVM selftests use gva_t instead of vm_vaddr_t, rename the API
for finding an unused range of virtual memory to drop the defunct
terminology and use "vm" for the scope.
Opportunistically clean up the function comment to drop superfluous
and redundant information.
No functional change intended.
Link: https://patch.msgid.link/20260420212004.3938325-13-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
Now that KVM selftests use gva_t instead of vm_vaddr_t, drop "vaddr_" from
the core memory allocation APIs as the information is extraneous and does
more harm than good. E.g. the APIs don't _just_ allocate virtual memory,
they allocate backing physical memory and install mappings in the guest
page tables. And as proven by kmalloc() and malloc(), developers generally
expect that allocations come with a working virtual address.
Opportunistically clean up the function comment for vm_alloc(), and drop
the misleading and superfluous comments for its wrappers.
No functional change intended.
Link: https://patch.msgid.link/20260420212004.3938325-12-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
Use u8 instead of uint8_t to make the KVM selftests code more concise
and more similar to the kernel (since selftests are primarily developed
by kernel developers).
This commit was generated with the following command:
git ls-files tools/testing/selftests/kvm | xargs sed -i 's/uint8_t/u8/g'
Then by manually adjusting whitespace to make checkpatch.pl happy.
No functional change intended.
Signed-off-by: David Matlack <dmatlack@google.com>
Link: https://patch.msgid.link/20260420212004.3938325-11-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
Use s16 instead of int16_t to make the KVM selftests code more concise
and more similar to the kernel (since selftests are primarily developed
by kernel developers).
This commit was generated with the following command:
git ls-files tools/testing/selftests/kvm | xargs sed -i 's/int16_t/s16/g'
Then by manually adjusting whitespace to make checkpatch.pl happy.
No functional change intended.
Signed-off-by: David Matlack <dmatlack@google.com>
Link: https://patch.msgid.link/20260420212004.3938325-10-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
Use u16 instead of uint16_t to make the KVM selftests code more concise
and more similar to the kernel (since selftests are primarily developed
by kernel developers).
This commit was generated with the following command:
git ls-files tools/testing/selftests/kvm | xargs sed -i 's/uint16_t/u16/g'
Then by manually adjusting whitespace to make checkpatch.pl happy.
No functional change intended.
Signed-off-by: David Matlack <dmatlack@google.com>
Link: https://patch.msgid.link/20260420212004.3938325-9-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
Use s32 instead of int32_t to make the KVM selftests code more concise
and more similar to the kernel (since selftests are primarily developed
by kernel developers).
This commit was generated with the following command:
git ls-files tools/testing/selftests/kvm | xargs sed -i 's/int32_t/s32/g'
Then by manually adjusting whitespace to make checkpatch.pl happy.
No functional change intended.
Signed-off-by: David Matlack <dmatlack@google.com>
Link: https://patch.msgid.link/20260420212004.3938325-8-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
Use u32 instead of uint32_t to make the KVM selftests code more concise
and more similar to the kernel (since selftests are primarily developed
by kernel developers).
This commit was generated with the following command:
git ls-files tools/testing/selftests/kvm | xargs sed -i 's/uint32_t/u32/g'
Then by manually adjusting whitespace to make checkpatch.pl happy.
No functional change intended.
Signed-off-by: David Matlack <dmatlack@google.com>
Link: https://patch.msgid.link/20260420212004.3938325-7-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
Use s64 instead of int64_t to make the KVM selftests code more concise
and more similar to the kernel (since selftests are primarily developed
by kernel developers).
This commit was generated with the following command:
git ls-files tools/testing/selftests/kvm | xargs sed -i 's/int64_t/s64/g'
Then by manually adjusting whitespace to make checkpatch.pl happy.
No functional change intended.
Signed-off-by: David Matlack <dmatlack@google.com>
Link: https://patch.msgid.link/20260420212004.3938325-6-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>