Commit Graph

49430 Commits

Author SHA1 Message Date
Jakub Sitnicki
29960e635b selftests/bpf: Cover skb metadata access after bpf_skb_adjust_room
Add a test to verify that skb metadata remains accessible after calling
bpf_skb_adjust_room(), which modifies the packet headroom and can trigger
head reallocation.

The helper expects an Ethernet frame carrying an IP packet so switch test
packet identification by source MAC address since we can no longer rely on
Ethernet proto being set to zero.

Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Link: https://patch.msgid.link/20251105-skb-meta-rx-path-v4-14-5ceb08a9b37b@cloudflare.com
2025-11-10 10:52:33 -08:00
Jakub Sitnicki
354d020c29 selftests/bpf: Cover skb metadata access after vlan push/pop helper
Add a test to verify that skb metadata remains accessible after calling
bpf_skb_vlan_push() and bpf_skb_vlan_pop(), which modify the packet
headroom.

Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Link: https://patch.msgid.link/20251105-skb-meta-rx-path-v4-13-5ceb08a9b37b@cloudflare.com
2025-11-10 10:52:32 -08:00
Jakub Sitnicki
1e1357fde8 selftests/bpf: Expect unclone to preserve skb metadata
Since pskb_expand_head() no longer clears metadata on unclone, update tests
for cloned packets to expect metadata to remain intact.

Also simplify the clone_dynptr_kept_on_{data,meta}_slice_write tests.
Creating an r/w dynptr slice is sufficient to trigger an unclone in the
prologue, so remove the extraneous writes to the data/meta slice.

Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Link: https://patch.msgid.link/20251105-skb-meta-rx-path-v4-12-5ceb08a9b37b@cloudflare.com
2025-11-10 10:52:32 -08:00
Jakub Sitnicki
9ef9ac15a5 selftests/bpf: Dump skb metadata on verification failure
Add diagnostic output when metadata verification fails to help with
troubleshooting test failures. Introduce a check_metadata() helper that
prints both expected and received metadata to the BPF program's stderr
stream on mismatch. The userspace test reads and dumps this stream on
failure.

Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Link: https://patch.msgid.link/20251105-skb-meta-rx-path-v4-11-5ceb08a9b37b@cloudflare.com
2025-11-10 10:52:32 -08:00
Jakub Sitnicki
967534e57c selftests/bpf: Verify skb metadata in BPF instead of userspace
Move metadata verification into the BPF TC programs. Previously,
userspace read metadata from a map and verified it once at test end.

Now TC programs compare metadata directly using __builtin_memcmp() and
set a test_pass flag. This enables verification at multiple points during
test execution rather than a single final check.

Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Link: https://patch.msgid.link/20251105-skb-meta-rx-path-v4-10-5ceb08a9b37b@cloudflare.com
2025-11-10 10:52:32 -08:00
Linus Torvalds
4ea7c1717f Arm:
- Fix trapping regression when no in-kernel irqchip is present
 
 - Check host-provided, untrusted ranges and offsets in pKVM
 
 - Fix regression restoring the ID_PFR1_EL1 register
 
 - Fix vgic ITS locking issues when LPIs are not directly injected
 
 Arm selftests:
 
 - Correct target CPU programming in vgic_lpi_stress selftest
 
 - Fix exposure of SCTLR2_EL2 and ZCR_EL2 in get-reg-list selftest
 
 RISC-V:
 
 - Fix check for local interrupts on riscv32
 
 - Read HGEIP CSR on the correct cpu when checking for IMSIC interrupts
 
 - Remove automatic I/O mapping from kvm_arch_prepare_memory_region()
 
 x86:
 
 - Inject #UD if the guest attempts to execute SEAMCALL or TDCALL as KVM
   doesn't support virtualization the instructions, but the instructions
   are gated only by VMXON.  That is, they will VM-Exit instead of taking
   a #UD and until now this resulted in KVM exiting to userspace with an
   emulation error.
 
 - Unload the "FPU" when emulating INIT of XSTATE features if and only if
   the FPU is actually loaded, instead of trying to predict when KVM will
   emulate an INIT (CET support missed the MP_STATE path).  Add sanity
   checks to detect and harden against similar bugs in the future.
 
 - Unregister KVM's GALog notifier (for AVIC) when kvm-amd.ko is unloaded.
 
 - Use a raw spinlock for svm->ir_list_lock as the lock is taken during
   schedule(), and "normal" spinlocks are sleepable locks when PREEMPT_RT=y.
 
 - Remove guest_memfd bindings on memslot deletion when a gmem file is dying
   to fix a use-after-free race found by syzkaller.
 
 - Fix a goof in the EPT Violation handler where KVM checks the wrong
   variable when determining if the reported GVA is valid.
 
 - Fix and simplify the handling of LBR virtualization on AMD, which was made
   buggy and unnecessarily complicated by nested VM support
 
 Misc:
 
 - Update Oliver's email address
 -----BEGIN PGP SIGNATURE-----
 
 iQFIBAABCgAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmkQSAAUHHBib256aW5p
 QHJlZGhhdC5jb20ACgkQv/vSX3jHroMDzQf9GW6F8yAo1QTnLNfSIvHMQcJAKwr3
 pd+bOLtMN/OzdQO/dvGVdkYuYYg/PQv/+zw6dVIeiLjtlrsTg40rsqVZEAXYCsbB
 q7TFM634thgON6R6fD6eLa+72UP0wMai7xqEfEyXVW3enAEEe+lrPWC9BiwJRqKZ
 pY1MpVIdYa0XNfUBeiOhO5AH+y9OmUDq5AHptrYn9X5xdsU2OWqQCyHW/1RLPWvA
 9bkyuz1A8+EQ20ngHUd0hrQx4UeJ7jvPblbryUxXaMwqahPC9sA2iDI12gAu8a84
 skvWbIPHSMgj5qDO/CkAsHb47GiqudU4LH7LniDZNsq21iTekiURSbdklw==
 =SM6O
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull kvm fixes from Paolo Bonzini:
 "Arm:

   - Fix trapping regression when no in-kernel irqchip is present

   - Check host-provided, untrusted ranges and offsets in pKVM

   - Fix regression restoring the ID_PFR1_EL1 register

   - Fix vgic ITS locking issues when LPIs are not directly injected

  Arm selftests:

   - Correct target CPU programming in vgic_lpi_stress selftest

   - Fix exposure of SCTLR2_EL2 and ZCR_EL2 in get-reg-list selftest

  RISC-V:

   - Fix check for local interrupts on riscv32

   - Read HGEIP CSR on the correct cpu when checking for IMSIC
     interrupts

   - Remove automatic I/O mapping from kvm_arch_prepare_memory_region()

  x86:

   - Inject #UD if the guest attempts to execute SEAMCALL or TDCALL as
     KVM doesn't support virtualization the instructions, but the
     instructions are gated only by VMXON. That is, they will VM-Exit
     instead of taking a #UD and until now this resulted in KVM exiting
     to userspace with an emulation error.

   - Unload the "FPU" when emulating INIT of XSTATE features if and only
     if the FPU is actually loaded, instead of trying to predict when
     KVM will emulate an INIT (CET support missed the MP_STATE path).
     Add sanity checks to detect and harden against similar bugs in the
     future.

   - Unregister KVM's GALog notifier (for AVIC) when kvm-amd.ko is
     unloaded.

   - Use a raw spinlock for svm->ir_list_lock as the lock is taken
     during schedule(), and "normal" spinlocks are sleepable locks when
     PREEMPT_RT=y.

   - Remove guest_memfd bindings on memslot deletion when a gmem file is
     dying to fix a use-after-free race found by syzkaller.

   - Fix a goof in the EPT Violation handler where KVM checks the wrong
     variable when determining if the reported GVA is valid.

   - Fix and simplify the handling of LBR virtualization on AMD, which
     was made buggy and unnecessarily complicated by nested VM support

  Misc:

   - Update Oliver's email address"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (28 commits)
  KVM: nSVM: Fix and simplify LBR virtualization handling with nested
  KVM: nSVM: Always recalculate LBR MSR intercepts in svm_update_lbrv()
  KVM: SVM: Mark VMCB_LBR dirty when MSR_IA32_DEBUGCTLMSR is updated
  MAINTAINERS: Switch myself to using kernel.org address
  KVM: arm64: vgic-v3: Release reserved slot outside of lpi_xa's lock
  KVM: arm64: vgic-v3: Reinstate IRQ lock ordering for LPI xarray
  KVM: arm64: Limit clearing of ID_{AA64PFR0,PFR1}_EL1.GIC to userspace irqchip
  KVM: arm64: Set ID_{AA64PFR0,PFR1}_EL1.GIC when GICv3 is configured
  KVM: arm64: Make all 32bit ID registers fully writable
  KVM: VMX: Fix check for valid GVA on an EPT violation
  KVM: guest_memfd: Remove bindings on memslot deletion when gmem is dying
  KVM: SVM: switch to raw spinlock for svm->ir_list_lock
  KVM: SVM: Make avic_ga_log_notifier() local to avic.c
  KVM: SVM: Unregister KVM's GALog notifier on kvm-amd.ko exit
  KVM: SVM: Initialize per-CPU svm_data at the end of hardware setup
  KVM: x86: Call out MSR_IA32_S_CET is not handled by XSAVES
  KVM: x86: Harden KVM against imbalanced load/put of guest FPU state
  KVM: x86: Unload "FPU" state on INIT if and only if its currently in-use
  KVM: arm64: Check the untrusted offset in FF-A memory share
  KVM: arm64: Check range args for pKVM mem transitions
  ...
2025-11-10 08:54:36 -08:00
Paolo Bonzini
ca00c3af8e KVM/arm654 fixes for 6.18, take #2
* Core fixes
 
   - Fix trapping regression when no in-kernel irqchip is present
     (20251021094358.1963807-1-sascha.bischoff@arm.com)
 
   - Check host-provided, untrusted ranges and offsets in pKVM
     (20251016164541.3771235-1-vdonnefort@google.com)
     (20251017075710.2605118-1-sebastianene@google.com)
 
   - Fix regression restoring the ID_PFR1_EL1 register
     (20251030122707.2033690-1-maz@kernel.org
 
   - Fix vgic ITS locking issues when LPIs are not directly injected
     (20251107184847.1784820-1-oupton@kernel.org)
 
 * Test fixes
 
   - Correct target CPU programming in vgic_lpi_stress selftest
     (20251020145946.48288-1-mdittgen@amazon.de)
 
   - Fix exposure of SCTLR2_EL2 and ZCR_EL2 in get-reg-list selftest
     (20251023-b4-kvm-arm64-get-reg-list-sctlr-el2-v1-1-088f88ff992a@kernel.org)
     (20251024-kvm-arm64-get-reg-list-zcr-el2-v1-1-0cd0ff75e22f@kernel.org)
 
 * Misc
 
   - Update Oliver's email address
     (20251107012830.1708225-1-oupton@kernel.org)
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEn9UcU+C1Yxj9lZw9I9DQutE9ekMFAmkPMCoACgkQI9DQutE9
 ekPbaBAAuunvToIxTQoa1hf8+9qlCl/uQNO0R1Q/72N2sLv02kyma9xNA6UGuGFX
 a5AURJItw3RzEDnsJ3Grvu/2IgRgHUrTOqpwPSdGzN/t5wq046Rjp7tWpyQVlwDn
 hax1y7xWpHgC2pDpVran+I7fkC9SZlw5FXGTyEyyGhhYsw41aT6DwYJNyv4TKlxQ
 btXkFN0dhJGw0GjqFvUsZoWBvdWYfhZpMwt24eJ0H8uZXgVImG2TeIcRdwF7A4zr
 0Z3eE17uIochflpWARydX2bAejWWkBGXwFMLtE8SpbFz5W6SpA6e0kkaUsfwIM3c
 d29azeTSspneqjl4CqT7r28ea3F4pOUzCSSsVMWZ9CplRef1/lqIQAQ+sjVWebE/
 gonzdGiRkhylb3lKMPuaip8tnoaLx+KLYGNkY2uBWUtC1C182d+Jy8W49POZHfSn
 01k0VoiwwUiyCA152n2VGThzwmNyfGImNKLnPBlUZpsFGHbdMdKgHr6IBIGkCK8C
 K9ylq++xoQ5KFpDqkE2/mcv75FAJyFJDb7TavtHl6b6H/LjglSYhhperkXbWhx3K
 +GYfsG/m7eEaro41vfLdqZEJt0mgLJaX+asruA+lFHqWTfnY8QFyczFJs8doox+o
 EqIf6bmlDmSE8Waz0YMiPBFl3fgvj3yvMuceDmV3fL8gmQpPa+0=
 =ptQ6
 -----END PGP SIGNATURE-----

Merge tag 'kvmarm-fixes-6.18-2' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD

KVM/arm654 fixes for 6.18, take #2

* Core fixes

  - Fix trapping regression when no in-kernel irqchip is present
    (20251021094358.1963807-1-sascha.bischoff@arm.com)

  - Check host-provided, untrusted ranges and offsets in pKVM
    (20251016164541.3771235-1-vdonnefort@google.com)
    (20251017075710.2605118-1-sebastianene@google.com)

  - Fix regression restoring the ID_PFR1_EL1 register
    (20251030122707.2033690-1-maz@kernel.org

  - Fix vgic ITS locking issues when LPIs are not directly injected
    (20251107184847.1784820-1-oupton@kernel.org)

* Test fixes

  - Correct target CPU programming in vgic_lpi_stress selftest
    (20251020145946.48288-1-mdittgen@amazon.de)

  - Fix exposure of SCTLR2_EL2 and ZCR_EL2 in get-reg-list selftest
    (20251023-b4-kvm-arm64-get-reg-list-sctlr-el2-v1-1-088f88ff992a@kernel.org)
    (20251024-kvm-arm64-get-reg-list-zcr-el2-v1-1-0cd0ff75e22f@kernel.org)

* Misc

  - Update Oliver's email address
    (20251107012830.1708225-1-oupton@kernel.org)
2025-11-09 08:07:55 +01:00
Daniel Zahka
2098cec328 selftests: drv-net: psp: add assertions on core-tracked psp dev stats
Add assertions to existing test cases to cover key rotations and
'stale-events'.

Signed-off-by: Daniel Zahka <daniel.zahka@gmail.com>
Link: https://patch.msgid.link/20251106002608.1578518-3-daniel.zahka@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-07 18:53:56 -08:00
Alexander Sverdlin
57531b3416 selftests: net: local_termination: Wait for interfaces to come up
It seems that most of the tests prepare the interfaces once before the test
run (setup_prepare()), rely on setup_wait() to wait for link and only then
run the test(s).

local_termination brings the physical interfaces down and up during test
run but never wait for them to come up. If the auto-negotiation takes
some seconds, first test packets are being lost, which leads to
false-negative test results.

Use setup_wait() in run_test() to make sure auto-negotiation has been
completed after all simple_if_init() calls on physical interfaces and test
packets will not be lost because of the race against link establishment.

Fixes: 90b9566aa5 ("selftests: forwarding: add a test for local_termination.sh")
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Alexander Sverdlin <alexander.sverdlin@siemens.com>
Link: https://patch.msgid.link/20251106161213.459501-1-alexander.sverdlin@siemens.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-07 18:46:36 -08:00
Kuniyuki Iwashima
ffc56c9081 selftest: packetdrill: Add max RTO test for SYN+ACK.
This script sets net.ipv4.tcp_rto_max_ms to 1000 and checks
if SYN+ACK RTO is capped at 1s for TFO and non-TFO.

Without the previous patch, the max RTO is applied to TFO
SYN+ACK only, and non-TFO SYN+ACK RTO increases exponentially.

  # selftests: net/packetdrill: tcp_rto_synack_rto_max.pkt
  # TAP version 13
  # 1..2
  # tcp_rto_synack_rto_max.pkt:46: error handling packet: timing error:
     expected outbound packet at 5.091936 sec but happened at 6.107826 sec; tolerance 0.127974 sec
  # script packet:  5.091936 S. 0:0(0) ack 1 <mss 1460,nop,nop,sackOK>
  # actual packet:  6.107826 S. 0:0(0) ack 1 win 65535 <mss 1460,nop,nop,sackOK>
  # not ok 1 ipv4
  # tcp_rto_synack_rto_max.pkt:46: error handling packet: timing error:
     expected outbound packet at 5.075901 sec but happened at 6.091841 sec; tolerance 0.127976 sec
  # script packet:  5.075901 S. 0:0(0) ack 1 <mss 1460,nop,nop,sackOK>
  # actual packet:  6.091841 S. 0:0(0) ack 1 win 65535 <mss 1460,nop,nop,sackOK>
  # not ok 2 ipv6
  # # Totals: pass:0 fail:2 xfail:0 xpass:0 skip:0 error:0
  not ok 49 selftests: net/packetdrill: tcp_rto_synack_rto_max.pkt # exit=1

With the previous patch, all SYN+ACKs are retransmitted
after 1s.

  # selftests: net/packetdrill: tcp_rto_synack_rto_max.pkt
  # TAP version 13
  # 1..2
  # ok 1 ipv4
  # ok 2 ipv6
  # # Totals: pass:2 fail:0 xfail:0 xpass:0 skip:0 error:0
  ok 49 selftests: net/packetdrill: tcp_rto_synack_rto_max.pkt

Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20251106003357.273403-7-kuniyu@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-07 18:05:26 -08:00
Linus Torvalds
a2e33fb926 iommufd 6.18 first rc pull
Three fixes:
 
 - Syzkaller found a case where maths overflows can cause divide by 0
 
 - Typo in a compiler bug warning fix in the selftests broke the selftests
 
 - type1 compatability had a mismatch when unmapping an already unmaped
   range, it should succeed
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQRRRCHOFoQz/8F5bUaFwuHvBreFYQUCaQ473AAKCRCFwuHvBreF
 Ya8gAP9Qifu4AYobrhgD9rKKXLniGRejXUHBQ0G8y8y9sBHJdgEApwxzCUcI1jLZ
 GTaPEABblBBWxUmN+Psya2Hm3Oa53AA=
 =TJHp
 -----END PGP SIGNATURE-----

Merge tag 'for-linus-iommufd' of git://git.kernel.org/pub/scm/linux/kernel/git/jgg/iommufd

Pull iommufd fixes from Jason Gunthorpe:

 - Syzkaller found a case where maths overflows can cause divide by 0

 - Typo in a compiler bug warning fix in the selftests broke the
   selftests

 - type1 compatability had a mismatch when unmapping an already unmapped
   range, it should succeed

* tag 'for-linus-iommufd' of git://git.kernel.org/pub/scm/linux/kernel/git/jgg/iommufd:
  iommufd: Make vfio_compat's unmap succeed if the range is already empty
  iommufd/selftest: Fix ioctl return value in _test_cmd_trigger_vevents()
  iommufd: Don't overflow during division for dirty tracking
2025-11-07 13:13:09 -08:00
Linus Torvalds
5b95a50001 Fixes for tracing:
- Check for reader catching up in ring_buffer_map_get_reader()
 
   If the reader catches up to the writer in the memory mapped ring buffer
   then calling rb_get_reader_page() will return NULL as there's no
   pages left. But this isn't checked for before calling rb_get_reader_page()
   and the return of NULL causes a warning.
 
   If it is detected that the reader caught up to the writer, then simply
   exit the routine.
 
 - Fix memory leak in histogram create_field_var()
 
   The couple of the error paths in create_field_var() did not properly clean
   up what was allocated. Make sure everything is freed properly on error.
 
 - Fix help message of tools latency_collector
 
   The help message incorrectly stated that "-t" was the same as "--threads"
   whereas "--threads" is actually represented by "-e".
 -----BEGIN PGP SIGNATURE-----
 
 iIoEABYKADIWIQRRSw7ePDh/lE+zeZMp5XQQmuv6qgUCaQ3wOxQccm9zdGVkdEBn
 b29kbWlzLm9yZwAKCRAp5XQQmuv6qrYvAP9zLYz/pCTTCY64/Yx2gMimFt7g9XhO
 b5xL+mZWoiYJigD+Ma7IpRC1QVyAk5YgxkWJqpEyHrxE84fBIBevoTRBTQE=
 =+x8m
 -----END PGP SIGNATURE-----

Merge tag 'trace-v6.18-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace

Pull tracing fixes from Steven Rostedt:

 - Check for reader catching up in ring_buffer_map_get_reader()

   If the reader catches up to the writer in the memory mapped ring
   buffer then calling rb_get_reader_page() will return NULL as there's
   no pages left. But this isn't checked for before calling
   rb_get_reader_page() and the return of NULL causes a warning.

   If it is detected that the reader caught up to the writer, then
   simply exit the routine

 - Fix memory leak in histogram create_field_var()

   The couple of the error paths in create_field_var() did not properly
   clean up what was allocated. Make sure everything is freed properly
   on error

 - Fix help message of tools latency_collector

   The help message incorrectly stated that "-t" was the same as
   "--threads" whereas "--threads" is actually represented by "-e"

* tag 'trace-v6.18-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
  tracing/tools: Fix incorrcet short option in usage text for --threads
  tracing: Fix memory leaks in create_field_var()
  ring-buffer: Do not warn in ring_buffer_map_get_reader() when reader catches up
2025-11-07 08:07:11 -08:00
Zhang Chujun
53afec2c8f tracing/tools: Fix incorrcet short option in usage text for --threads
The help message incorrectly listed '-t' as the short option for
--threads, but the actual getopt_long configuration uses '-e'.
This mismatch can confuse users and lead to incorrect command-line
usage. This patch updates the usage string to correctly show:
	"-e, --threads NRTHR"
to match the implementation.

Note: checkpatch.pl reports a false-positive spelling warning on
'Run', which is intentional.

Link: https://patch.msgid.link/20251106031040.1869-1-zhangchujun@cmss.chinamobile.com
Signed-off-by: Zhang Chujun <zhangchujun@cmss.chinamobile.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2025-11-07 07:59:37 -05:00
Linus Torvalds
f5f2e20b1c perf tools fixes for v6.18:
- Add James Clark as a perf tools reviewer.
 
 - Handle '1' type symbols in /proc/kallsyms, related to anonymous Rust closures
   in the DRM panic QR encoder, caught by 'perf test'.
 
 - Sync kernel header copies: MSRs, uprobe syscall, DRM_IOCTL_GEM_CHANGE_HANDLE,
   KVM exit reasons, etc.
 
 Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQR2GiIUctdOfX2qHhGyPKLppCJ+JwUCaQ0FawAKCRCyPKLppCJ+
 J+QLAP9NMeWfs1/7Wyt5P028Ul5pB55hLc3MUv33ul94CV9sPAD+K57sXYgesAz9
 serupEojk9NKbJdt6Bx0WYBGh/XivgI=
 =J28f
 -----END PGP SIGNATURE-----

Merge tag 'perf-tools-fixes-for-v6.18-1-2025-11-06' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools

Pull perf tools fixes from Arnaldo Carvalho de Melo:

 - Add James Clark as a perf tools reviewer

 - Handle '1' type symbols in /proc/kallsyms, related to anonymous
   Rust closures in the DRM panic QR encoder, caught by 'perf test'

 - Sync kernel header copies: MSRs, uprobe syscall,
   DRM_IOCTL_GEM_CHANGE_HANDLE, KVM exit reasons, etc

* tag 'perf-tools-fixes-for-v6.18-1-2025-11-06' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools:
  perf symbols: Handle '1' symbols in /proc/kallsyms
  tools headers asm: Sync fls headers header with the kernel sources
  tools headers UAPI: Sync KVM's vmx.h header with the kernel sources to handle new exit reasons
  tools headers svm: Sync svm headers with the kernel sources
  tools headers UAPI: Sync x86's asm/kvm.h with the kernel sources
  MAINTAINERS: Add James Clark as a perf tools reviewer
  tools headers UAPI: Sync linux/kvm.h with the kernel sources
  tools headers UAPI: Update tools's copy of drm.h to pick DRM_IOCTL_GEM_CHANGE_HANDLE
  tools headers x86 cpufeatures: Sync with the kernel sources
  tools headers x86: Sync table due to introducion of uprobe syscall
  tools headers: Sync uapi/linux/fcntl.h with the kernel sources
  tools headers: Sync uapi/linux/prctl.h with the kernel source
  tools headers uapi: Update fs.h with the kernel sources
  tools arch x86: Sync msr-index.h to pick AMD64_{PERF_CNTR_GLOBAL_STATUS_SET,SAVIC_CONTROL}, IA32_L3_QOS_{ABMC,EXT}_CFG
2025-11-06 16:05:33 -08:00
Jakub Kicinski
1ec9871fbb Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Cross-merge networking fixes after downstream PR (net-6.18-rc5).

Conflicts:

drivers/net/wireless/ath/ath12k/mac.c
  9222582ec5 ("Revert "wifi: ath12k: Fix missing station power save configuration"")
  6917e268c4 ("wifi: ath12k: Defer vdev bring-up until CSA finalize to avoid stale beacon")
https://lore.kernel.org/11cece9f7e36c12efd732baa5718239b1bf8c950.camel@sipsolutions.net

Adjacent changes:

drivers/net/ethernet/intel/Kconfig
  b1d16f7c00 ("libie: depend on DEBUG_FS when building LIBIE_FWLOG")
  93f53db9f9 ("ice: switch to Page Pool")

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-06 09:27:40 -08:00
Linus Torvalds
c2c2ccfd4b Including fixes from bluetooth and wireless.
Current release - new code bugs:
 
  - ptp: expose raw cycles only for clocks with free-running counter
 
  - bonding: fix null-deref in actor_port_prio setting
 
  - mdio: ERR_PTR-check regmap pointer returned by device_node_to_regmap()
 
  - eth: libie: depend on DEBUG_FS when building LIBIE_FWLOG
 
 Previous releases - regressions:
 
  - virtio_net: fix perf regression due to bad alignment of
    virtio_net_hdr_v1_hash
 
  - Revert "wifi: ath10k: avoid unnecessary wait for service ready message"
    caused regressions for QCA988x and QCA9984
 
  - Revert "wifi: ath12k: Fix missing station power save configuration"
    caused regressions for WCN7850
 
  - eth: bnxt_en: shutdown FW DMA in bnxt_shutdown(), fix memory
    corruptions after kexec
 
 Previous releases - always broken:
 
  - virtio-net: fix received packet length check for big packets
 
  - sctp: fix races in socket diag handling
 
  - wifi: add an hrtimer-based delayed work item to avoid low granularity
    of timers set relatively far in the future, and use it where it matters
    (e.g. when performing AP-scheduled channel switch)
 
  - eth: mlx5e:
    - correctly propagate error in case of module EEPROM read failure
    - fix HW-GRO on systems with PAGE_SIZE == 64kB
 
  - dsa: b53: fixes for tagging, link configuration / RMII, FDB, multicast
 
  - phy: lan8842: implement latest errata
 
 Signed-off-by: Jakub Kicinski <kuba@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEE6jPA+I1ugmIBA4hXMUZtbf5SIrsFAmkMyx0ACgkQMUZtbf5S
 Irs4gA//fRGPHfEksbVO0lAR9QVUKq/Wvokt6DnLXQsi7js7kun7ymVFyU8tVacP
 sGoYbTTXO34XpijKVMTFe5WnnnSF3cri1/e+PYNXndKGHOheLvz0aq+CniiLexaT
 ag/opHX9+Io6x8DyOVkkRJYByvEcXgy1vFjhk5O04wn7tGPBNzvaKQJzjfVIiiWP
 thma/e77jVUqsNptS/VzJNCDwPB3Qi0fqylRovu7A59Kmr7GLZxaqZQpvM5/XjU3
 s3NhNjNDawmiwxN2AIztXo+vMReqFaiCr+z36OcruE04CFslkQGmE/5g3FdDGg+Q
 RE7Wfgk2UJ+Q5Y1jnQWNYOkGaMd6bMgPQD/8zZvwEf163Gh1YBh0rtDY07DWV5FR
 wp5cmeNDo2Mf+nnCM2UaoUQS2AAmkub2aAIjnlvrefeJMKciyMqtQe+V02aIxuvK
 BPmeSCN1IOyAIDCRE3Fd6JFyk+4Z4YC6Hdm/WOG517eGrDh3yV9JE/+C3jdh3JwT
 lvaScKhCdyq/9mFqEcU/yR5Kc4Od6Rw0K0s5DM4LAbKSsxI6hp6SkuoFHUsqFsRR
 HL3ucy4c4hmLxz6AA5/+UVIPzFsY5cwOnhs3AU2UVuA95fH0QWqX2u2ll0rC9mAW
 xDXa/ai0/KAGvVYqjC+MDyF8zrtzSG7C40ZsxhdcU84s94ZNVFI=
 =a/Ml
 -----END PGP SIGNATURE-----

Merge tag 'net-6.18-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net

Pull networking fixes from Jakub Kicinski:
  Including fixes from bluetooth and wireless.

  Current release - new code bugs:

   - ptp: expose raw cycles only for clocks with free-running counter

   - bonding: fix null-deref in actor_port_prio setting

   - mdio: ERR_PTR-check regmap pointer returned by
     device_node_to_regmap()

   - eth: libie: depend on DEBUG_FS when building LIBIE_FWLOG

  Previous releases - regressions:

   - virtio_net: fix perf regression due to bad alignment of
     virtio_net_hdr_v1_hash

   - Revert "wifi: ath10k: avoid unnecessary wait for service ready
     message" caused regressions for QCA988x and QCA9984

   - Revert "wifi: ath12k: Fix missing station power save configuration"
     caused regressions for WCN7850

   - eth: bnxt_en: shutdown FW DMA in bnxt_shutdown(), fix memory
     corruptions after kexec

  Previous releases - always broken:

   - virtio-net: fix received packet length check for big packets

   - sctp: fix races in socket diag handling

   - wifi: add an hrtimer-based delayed work item to avoid low
     granularity of timers set relatively far in the future, and use it
     where it matters (e.g. when performing AP-scheduled channel switch)

   - eth: mlx5e:
       - correctly propagate error in case of module EEPROM read failure
       - fix HW-GRO on systems with PAGE_SIZE == 64kB

   - dsa: b53: fixes for tagging, link configuration / RMII, FDB,
     multicast

   - phy: lan8842: implement latest errata"

* tag 'net-6.18-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (63 commits)
  selftests/vsock: avoid false-positives when checking dmesg
  net: bridge: fix MST static key usage
  net: bridge: fix use-after-free due to MST port state bypass
  lan966x: Fix sleeping in atomic context
  bonding: fix NULL pointer dereference in actor_port_prio setting
  net: dsa: microchip: Fix reserved multicast address table programming
  net: wan: framer: pef2256: Switch to devm_mfd_add_devices()
  net: libwx: fix device bus LAN ID
  net/mlx5e: SHAMPO, Fix header formulas for higher MTUs and 64K pages
  net/mlx5e: SHAMPO, Fix skb size check for 64K pages
  net/mlx5e: SHAMPO, Fix header mapping for 64K pages
  net: ti: icssg-prueth: Fix fdb hash size configuration
  net/mlx5e: Fix return value in case of module EEPROM read error
  net: gro_cells: Reduce lock scope in gro_cell_poll
  libie: depend on DEBUG_FS when building LIBIE_FWLOG
  wifi: mac80211_hwsim: Limit destroy_on_close radio removal to netgroup
  netpoll: Fix deadlock in memory allocation under spinlock
  net: ethernet: ti: netcp: Standardize knav_dma_open_channel to return NULL on error
  virtio-net: fix received length check in big packets
  bnxt_en: Fix warning in bnxt_dl_reload_down()
  ...
2025-11-06 08:52:30 -08:00
Bobby Eshleman
3534e03e0e selftests/vsock: avoid false-positives when checking dmesg
Sometimes VMs will have some intermittent dmesg warnings that are
unrelated to vsock. Change the dmesg parsing to filter on strings
containing 'vsock' to avoid false positive failures that are unrelated
to vsock. The downside is that it is possible for some vsock related
warnings to not contain the substring 'vsock', so those will be missed.

Fixes: a4a65c6fe0 ("selftests/vsock: add initial vmtest.sh for vsock")
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Bobby Eshleman <bobbyeshleman@meta.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Link: https://patch.msgid.link/20251105-vsock-vmtest-dmesg-fix-v2-1-1a042a14892c@meta.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-06 07:34:50 -08:00
Jason Gunthorpe
afb47765f9 iommufd: Make vfio_compat's unmap succeed if the range is already empty
iommufd returns ENOENT when attempting to unmap a range that is already
empty, while vfio type1 returns success. Fix vfio_compat to match.

Fixes: d624d6652a ("iommufd: vfio container FD ioctl compatibility")
Link: https://patch.msgid.link/r/0-v1-76be45eff0be+5d-iommufd_unmap_compat_jgg@nvidia.com
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
Reviewed-by: Alex Mastro <amastro@fb.com>
Reported-by: Alex Mastro <amastro@fb.com>
Closes: https://lore.kernel.org/r/aP0S5ZF9l3sWkJ1G@devgpu012.nha5.facebook.com
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2025-11-05 15:11:26 -04:00
Kees Cook
2b5e9f9b7e net: Convert struct sockaddr to fixed-size "sa_data[14]"
Revert struct sockaddr from flexible array to fixed 14-byte "sa_data",
to solve over 36,000 -Wflex-array-member-not-at-end warnings, since
struct sockaddr is embedded within many network structs.

With socket/proto sockaddr-based internal APIs switched to use struct
sockaddr_unsized, there should be no more uses of struct sockaddr that
depend on reading beyond the end of struct sockaddr::sa_data that might
trigger bounds checking.

Comparing an x86_64 "allyesconfig" vmlinux build before and after this
patch showed no new "ud1" instructions from CONFIG_UBSAN_BOUNDS nor any
new "field-spanning" memcpy CONFIG_FORTIFY_SOURCE instrumentations.

Cc: Gustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: Kees Cook <kees@kernel.org>
Link: https://patch.msgid.link/20251104002617.2752303-8-kees@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-04 19:10:33 -08:00
Kees Cook
85cb0757d7 net: Convert proto_ops connect() callbacks to use sockaddr_unsized
Update all struct proto_ops connect() callback function prototypes from
"struct sockaddr *" to "struct sockaddr_unsized *" to avoid lying to the
compiler about object sizes. Calls into struct proto handlers gain casts
that will be removed in the struct proto conversion patch.

No binary changes expected.

Signed-off-by: Kees Cook <kees@kernel.org>
Link: https://patch.msgid.link/20251104002617.2752303-3-kees@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-04 19:10:32 -08:00
Kees Cook
0e50474fa5 net: Convert proto_ops bind() callbacks to use sockaddr_unsized
Update all struct proto_ops bind() callback function prototypes from
"struct sockaddr *" to "struct sockaddr_unsized *" to avoid lying to the
compiler about object sizes. Calls into struct proto handlers gain casts
that will be removed in the struct proto conversion patch.

No binary changes expected.

Signed-off-by: Kees Cook <kees@kernel.org>
Link: https://patch.msgid.link/20251104002617.2752303-2-kees@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-04 19:10:32 -08:00
Matthieu Baerts (NGI0)
5c59df126b selftests: mptcp: join: validate extra bind cases
By design, an MPTCP connection will not accept extra subflows where no
MPTCP listening sockets can accept such requests.

In other words, it means that if the 'server' listens on a specific
address / device, it cannot accept MP_JOIN sent to a different address /
device. Except if there is another MPTCP listening socket accepting
them.

This is what the new tests are validating:

 - Forcing a bind on the main v4/v6 address, and checking that MP_JOIN
   to announced addresses are not accepted.

 - Also forcing a bind on the main v4/v6 address, but before, another
   listening socket is created to accept additional subflows. Note that
   'mptcpize run nc -l' -- or something else only doing: socket(MPTCP),
   bind(<IP>), listen(0) -- would be enough, but here mptcp_connect is
   reused not to depend on another tool just for that.

 - Same as the previous one, but using v6 link-local addresses: this is
   a bit particular because it is required to specify the outgoing
   network interface when connecting to a link-local address announced
   by the other peer. When using the routing rules, this doesn't work
   (the outgoing interface is not known) ; but it does work with a
   'laminar' endpoint having a specified interface.

Note that extra small modifications are needed for these tests to work:

 - mptcp_connect's check_getpeername_connect() check should strip the
   specified interface when comparing addresses.

 - With IPv6 link-local addresses, it is required to wait for them to
   be ready (no longer in 'tentative' mode) before using them, otherwise
   the bind() will not be allowed.

Link: https://github.com/multipath-tcp/mptcp_net-next/issues/591
Reviewed-by: Geliang Tang <geliang@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20251101-net-next-mptcp-fm-endp-nb-bind-v1-4-b4166772d6bb@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-04 17:15:07 -08:00
Matthieu Baerts (NGI0)
4a6220a453 selftests: mptcp: join: do_transfer: reduce code dup
The same extra long commands are present twice, with small differences:
the variable for the stdin file is different.

Use new dedicated variables in one command to avoid this code
duplication.

Reviewed-by: Geliang Tang <geliang@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20251101-net-next-mptcp-fm-endp-nb-bind-v1-3-b4166772d6bb@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-04 17:15:06 -08:00
Matthieu Baerts (NGI0)
e461e8a799 mptcp: pm: in kernel: only use fullmesh endp if any
Our documentation is saying that the in-kernel PM is only using fullmesh
endpoints to establish subflows to announced addresses when at least one
endpoint has a fullmesh flag. But this was not totally correct: only
fullmesh endpoints were used if at least one endpoint *from the same
address family as the received ADD_ADDR* has the fullmesh flag.

This is confusing, and it seems clearer not to have differences
depending on the address family.

So, now, when at least one MPTCP endpoint has a fullmesh flag, the local
addresses are picked from all fullmesh endpoints, which might be 0 if
there are no endpoints for the correct address family.

One selftest needs to be adapted for this behaviour change.

Reviewed-by: Geliang Tang <geliang@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20251101-net-next-mptcp-fm-endp-nb-bind-v1-2-b4166772d6bb@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-04 17:15:06 -08:00
Samiullah Khawaja
add3c1324a selftests: Add napi threaded busy poll test in busy_poller
Add testcase to run busy poll test with threaded napi busy poll enabled.

Signed-off-by: Samiullah Khawaja <skhawaja@google.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Acked-by: Martin Karsten <mkarsten@uwaterloo.ca>
Tested-by: Martin Karsten <mkarsten@uwaterloo.ca>
Link: https://patch.msgid.link/20251028203007.575686-3-skhawaja@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-03 18:11:40 -08:00
Samiullah Khawaja
c18d4b190a net: Extend NAPI threaded polling to allow kthread based busy polling
Add a new state NAPI_STATE_THREADED_BUSY_POLL to the NAPI state enum to
enable and disable threaded busy polling.

When threaded busy polling is enabled for a NAPI, enable
NAPI_STATE_THREADED also.

When the threaded NAPI is scheduled, set NAPI_STATE_IN_BUSY_POLL to
signal napi_complete_done not to rearm interrupts.

Whenever NAPI_STATE_THREADED_BUSY_POLL is unset, the
NAPI_STATE_IN_BUSY_POLL will be unset, napi_complete_done unsets the
NAPI_STATE_SCHED_THREADED bit also, which in turn will make the kthread
go to sleep.

Signed-off-by: Samiullah Khawaja <skhawaja@google.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Acked-by: Martin Karsten <mkarsten@uwaterloo.ca>
Tested-by: Martin Karsten <mkarsten@uwaterloo.ca>
Link: https://patch.msgid.link/20251028203007.575686-2-skhawaja@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-03 18:11:40 -08:00
Arnaldo Carvalho de Melo
7f17ef0d47 perf symbols: Handle '1' symbols in /proc/kallsyms
I started seeing this in recent Fedora 42 kernels:

  root@x1:~# uname -a
  Linux x1 6.17.4-200.fc42.x86_64 #1 SMP PREEMPT_DYNAMIC Sun Oct 19 18:47:49 UTC 2025 x86_64 GNU/Linux
  root@x1:~#

  root@x1:~# perf test 1
    1: vmlinux symtab matches kallsyms     : FAILED!
  root@x1:~#

Related to:

  root@x1:~# grep ' 1 ' /proc/kallsyms
  ffffffffb098bc00 1 __pfx__RNCINvNtNtNtCsfwaGRd4cjqE_4core4iter8adapters3map12map_try_foldjNtCskFudTml27HW_12drm_panic_qr7VersionuINtNtNtBa_3ops12control_flow11ControlFlowB10_ENcB10_0NCINvNvNtNtNtB8_6traits8iterator8Iterator4find5checkB10_NCNvMB12_B10_13from_segments0E0E0B12_
  ffffffffb098bc10 1 _RNCINvNtNtNtCsfwaGRd4cjqE_4core4iter8adapters3map12map_try_foldjNtCskFudTml27HW_12drm_panic_qr7VersionuINtNtNtBa_3ops12control_flow11ControlFlowB10_ENcB10_0NCINvNvNtNtNtB8_6traits8iterator8Iterator4find5checkB10_NCNvMB12_B10_13from_segments0E0E0B12_
  root@x1:~#

That is found in:

  root@x1:~# pahole --running_kernel_vmlinux
  /usr/lib/debug/lib/modules/6.17.4-200.fc42.x86_64/vmlinux
  root@x1:~#

  root@x1:~# readelf -sW /usr/lib/debug/lib/modules/6.17.4-200.fc42.x86_64/vmlinux | grep __pfx__RNCINvNtNtNtCsfwaGRd4cjqE_4core4iter8adapters3map12map_try_foldjNtCskFudTml27HW_12drm_panic_qr7VersionuINtNtNtBa_3ops12control_flow11ControlFlowB10_ENcB10_0NCINvNvNtNtNtB8_6traits8iterator8Iterator4find5checkB10_NCNvMB12_B10_13from_segments0E0E0B12_
  150649: ffffffff81f8bc00    16 FUNC    LOCAL  DEFAULT    1 __pfx__RNCINvNtNtNtCsfwaGRd4cjqE_4core4iter8adapters3map12map_try_foldjNtCskFudTml27HW_12drm_panic_qr7VersionuINtNtNtBa_3ops12control_flow11ControlFlowB10_ENcB10_0NCINvNvNtNtNtB8_6traits8iterator8Iterator4find5checkB10_NCNvMB12_B10_13from_segments0E0E0B12_
  root@x1:~#

But was being filtered out when reading /proc/kallsyms, as the '1'
symbol type was not being handled, do it, there are just two of them at
this point.

Cc: Alex Gaynor <alex.gaynor@gmail.com>
Cc: Alice Ryhl <aliceryhl@google.com>
Cc: Andreas Hindborg <a.hindborg@kernel.org>
Cc: Benno Lossin <lossin@kernel.org>
Cc: Björn Roy Baron <bjorn3_gh@protonmail.com>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Danilo Krummrich <dakr@kernel.org>
Cc: Gary Guo <gary@garyguo.net>
Cc: Miguel Ojeda <ojeda@kernel.org>
Cc: Trevor Gross <tmgross@umich.edu>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-11-03 14:54:31 -03:00
Arnaldo Carvalho de Melo
549042f167 tools headers asm: Sync fls headers header with the kernel sources
To pick the changes in:

  6606c8c7e8 ("bitops: Add __attribute_const__ to generic ffs()-family implementations")

This addresses these tools build warnings:

  Warning: Kernel ABI header differences:
    diff -u tools/include/asm-generic/bitops/__fls.h include/asm-generic/bitops/__fls.h
    diff -u tools/include/asm-generic/bitops/fls.h include/asm-generic/bitops/fls.h
    diff -u tools/include/asm-generic/bitops/fls64.h include/asm-generic/bitops/fls64.h

Please see tools/include/uapi/README for further details.

Cc: Kees Cook <kees@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-11-03 13:35:06 -03:00
Arnaldo Carvalho de Melo
fc9ef9118d tools headers UAPI: Sync KVM's vmx.h header with the kernel sources to handle new exit reasons
To pick the changes in:

  885df2d210 ("KVM: x86: Add support for RDMSR/WRMSRNS w/ immediate on Intel")
  c42856af8f ("KVM: TDX: Add a place holder for handler of TDX hypercalls (TDG.VP.VMCALL)")

That makes 'perf kvm-stat' aware of these new TDCALL and
MSR_{READ,WRITE}_IMM exit reasons, thus addressing the following perf
build warning:

  Warning: Kernel ABI header differences:
    diff -u tools/arch/x86/include/uapi/asm/vmx.h arch/x86/include/uapi/asm/vmx.h

Please see tools/include/uapi/README for further details.

Cc: Sean Christopherson <seanjc@google.com>
Cc: Xin Li <xin@zytor.com>
Cc: Isaku Yamahata <isaku.yamahata@intel.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-11-03 13:29:53 -03:00
Arnaldo Carvalho de Melo
649a0cc96e tools headers svm: Sync svm headers with the kernel sources
To pick the changes in:

  b8c3c9f5d0 ("x86/apic: Initialize Secure AVIC APIC backing page")

That triggers:

  CC      /tmp/build/perf-tools/arch/x86/util/kvm-stat.o
  LD      /tmp/build/perf-tools/arch/x86/util/perf-util-in.o
  LD      /tmp/build/perf-tools/arch/x86/perf-util-in.o
  LD      /tmp/build/perf-tools/arch/perf-util-in.o
  LD      /tmp/build/perf-tools/perf-util-in.o
  AR      /tmp/build/perf-tools/libperf-util.a
  LINK    /tmp/build/perf-tools/perf

But this time causes no changes in tooling results, as the introduced
SVM_VMGEXIT_SAVIC exit reason wasn't added to SVM_EXIT_REASONS, that is
used in kvm-stat.c.

And addresses this perf build warning:

  Warning: Kernel ABI header differences:
    diff -u tools/arch/x86/include/uapi/asm/svm.h arch/x86/include/uapi/asm/svm.h

Please see tools/include/uapi/README for further details.

Cc: Borislav Petkov (AMD) <bp@alien8.de>
Cc: Neeraj Upadhyay <Neeraj.Upadhyay@amd.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-11-03 13:29:53 -03:00
Arnaldo Carvalho de Melo
b1d46bc10f tools headers UAPI: Sync x86's asm/kvm.h with the kernel sources
To pick the changes in:

  fddd07626b ("KVM: x86: Define AMD's #HV, #VC, and #SX exception vectors")
  f2f5519aa4 ("KVM: x86: Define Control Protection Exception (#CP) vector")
  9d6812d415 ("KVM: x86: Enable guest SSP read/write interface with new uAPIs")
  06f2969c6a ("KVM: x86: Introduce KVM_{G,S}ET_ONE_REG uAPIs support")

That just rebuilds kvm-stat.c on x86, no change in functionality.

This silences these perf build warning:

  Warning: Kernel ABI header differences:
    diff -u tools/arch/x86/include/uapi/asm/kvm.h arch/x86/include/uapi/asm/kvm.h

Please see tools/include/uapi/README for further details.

Cc: Sean Christopherson <seanjc@google.com>
Cc: Yang Weijiang <weijiang.yang@intel.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-11-03 13:29:53 -03:00
Arnaldo Carvalho de Melo
29e4d12a29 tools headers UAPI: Sync linux/kvm.h with the kernel sources
To pick the changes in:

  fe2bf6234e ("KVM: guest_memfd: Add INIT_SHARED flag, reject user page faults if not set")
  d2042d8f96 ("KVM: Rework KVM_CAP_GUEST_MEMFD_MMAP into KVM_CAP_GUEST_MEMFD_FLAGS")
  3d3a04fad2 ("KVM: Allow and advertise support for host mmap() on guest_memfd files")

That just rebuilds perf, as these patches don't add any new KVM ioctl to
be harvested for the 'perf trace' ioctl syscall argument beautifiers.

This addresses this perf build warning:

  Warning: Kernel ABI header differences:
    diff -u tools/include/uapi/linux/kvm.h include/uapi/linux/kvm.h

Please see tools/include/uapi/README for further details.

Cc: Sean Christopherson <seanjc@google.com>
Cc: Fuad Tabba <tabba@google.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-11-03 13:10:19 -03:00
Arnaldo Carvalho de Melo
f8950b47db tools headers UAPI: Update tools's copy of drm.h to pick DRM_IOCTL_GEM_CHANGE_HANDLE
Picking the changes from:

  0864197382 ("drm: Move drm_gem ioctl kerneldoc to uapi file")
  53096728b8 ("drm: Add DRM prime interface to reassign GEM handle")

Addressing these perf build warnings:

  Warning: Kernel ABI header differences:

Now 'perf trace' and other code that might use the tools/perf/trace/beauty
autogenerated tables will be able to translate this new ioctl command into
a string:

  $ tools/perf/trace/beauty/drm_ioctl.sh > before
  $ cp include/uapi/drm/drm.h tools/include/uapi/drm/drm.h
  $ tools/perf/trace/beauty/drm_ioctl.sh > after
  $ diff -u before after
  --- before	2025-11-03 09:57:34.832553174 -0300
  +++ after	2025-11-03 09:57:47.969409428 -0300
  @@ -111,6 +111,7 @@
   	[0xCF] = "SYNCOBJ_EVENTFD",
   	[0xD0] = "MODE_CLOSEFB",
   	[0xD1] = "SET_CLIENT_NAME",
  +	[0xD2] = "GEM_CHANGE_HANDLE",
   	[DRM_COMMAND_BASE + 0x00] = "I915_INIT",
   	[DRM_COMMAND_BASE + 0x01] = "I915_FLUSH",
   	[DRM_COMMAND_BASE + 0x02] = "I915_FLIP",
  $

Please see tools/include/uapi/README for further details.

Cc: Christian König <christian.koenig@amd.com>
Cc: David Francis <David.Francis@amd.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-11-03 12:14:18 -03:00
Arnaldo Carvalho de Melo
ccaba800e7 tools headers x86 cpufeatures: Sync with the kernel sources
To pick the changes from:

  e19c062199 ("x86/cpufeatures: Add support for Assignable Bandwidth Monitoring Counters (ABMC)")
  7b59c73fd6 ("x86/cpufeatures: Add SNP Secure TSC")
  3c7cb84145 ("x86/cpufeatures: Add a CPU feature bit for MSR immediate form instructions")
  2f8f173413 ("x86/vmscape: Add conditional IBPB mitigation")
  a508cec6e5 ("x86/vmscape: Enumerate VMSCAPE bug")

This causes these perf files to be rebuilt and brings some X86_FEATURE
that may be used by:

      CC       /tmp/build/perf/bench/mem-memcpy-x86-64-asm.o
      CC       /tmp/build/perf/bench/mem-memset-x86-64-asm.o

And addresses this perf build warning:

  Warning: Kernel ABI header differences:
    diff -u tools/arch/x86/include/asm/cpufeatures.h arch/x86/include/asm/cpufeatures.h

Please see tools/include/uapi/README for further details.

Cc: Babu Moger <babu.moger@amd.com>
Cc: Borislav Petkov (AMD) <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Nikunj A Dadhania <nikunj@amd.com>
Cc: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Cc: Sean Christopherson <seanjc@google.com>
Cc: Xin Li <xin@zytor.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-11-01 13:16:47 -03:00
Arnaldo Carvalho de Melo
e0acec3369 tools headers x86: Sync table due to introducion of uprobe syscall
To pick the changes in this cset:

  56101b69c9 ("uprobes/x86: Add uprobe syscall to speed up uprobe")

That add support for this new 'uprobe' syscall in tools such as 'perf trace'.

Now it is possible to do a system wide 'perf trace' to look if this new
syscall is being used:

  root@number:~# perf trace -v -e uprobe
  <SNIP>
  event qualifier tracepoint filter: (common_pid != 33989) && (id == 336)
  ^C
  root@number#

  $ grep -w uprobe tools/perf/arch/x86/entry/syscalls/syscall_64.tbl
  336	common	uprobe			sys_uprobe
  $

This addresses these perf build warnings:

  Warning: Kernel ABI header differences:
    diff -u tools/perf/arch/x86/entry/syscalls/syscall_64.tbl arch/x86/entry/syscalls/syscall_64.tbl

Please see tools/include/uapi/README for further details.

Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-11-01 13:04:20 -03:00
Arnaldo Carvalho de Melo
5466858a70 tools headers: Sync uapi/linux/fcntl.h with the kernel sources
To pick up the changes in this cset:

  e83f0b5d10 ("nsfs: support exhaustive file handles")

That doesn't introduce anything of interest for tools/, just addresses
these perf build warnings:

Warning: Kernel ABI header differences:
  diff -u tools/perf/trace/beauty/include/uapi/linux/fcntl.h include/uapi/linux/fcntl.h

Please see tools/include/uapi/README for further details.

Cc: Christian Brauner <brauner@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-11-01 13:01:13 -03:00
Arnaldo Carvalho de Melo
5be93389f3 tools headers: Sync uapi/linux/prctl.h with the kernel source
To pick up the changes in these csets:

  8cdc4d2701 ("mm/huge_memory: respect MADV_COLLAPSE with PR_THP_DISABLE_EXCEPT_ADVISED")
  9dc21bbd62 ("prctl: extend PR_SET_THP_DISABLE to optionally exclude VM_HUGEPAGE")

That don't introduce anything of interest for the tools/, just
addressing these perf build warnings:

  Warning: Kernel ABI header differences:
    diff -u tools/perf/trace/beauty/include/uapi/linux/prctl.h include/uapi/linux/prctl.h

Please see tools/include/uapi/README for further details.

Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: David Hildenbrand <david@redhat.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-11-01 12:53:56 -03:00
Arnaldo Carvalho de Melo
76977baa83 tools headers uapi: Update fs.h with the kernel sources
To pick up changes from:

  db2ab24a34 ("Add RWF_NOSIGNAL flag for pwritev2")

These are used to beautify fs syscall arguments, albeit the changes in
this update are not affecting those beautifiers.

This addresses these tools/ build warnings:

  Warning: Kernel ABI header differences:
  diff -u tools/perf/trace/beauty/include/uapi/linux/fs.h include/uapi/linux/fs.h

Please see tools/include/uapi/README for details (it's in the first patch
of this series).

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Ian Rogers <irogers@google.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Lauri Vasama <git@vasama.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-11-01 12:24:27 -03:00
Arnaldo Carvalho de Melo
3f67355979 tools arch x86: Sync msr-index.h to pick AMD64_{PERF_CNTR_GLOBAL_STATUS_SET,SAVIC_CONTROL}, IA32_L3_QOS_{ABMC,EXT}_CFG
To pick up the changes in:

  cdfed9370b ("KVM: x86/pmu: Move PMU_CAP_{FW_WRITES,LBR_FMT} into msr-index.h header")
  bc6397cf0b ("x86/cpu/topology: Define AMD64_CPUID_EXT_FEAT MSR")
  84ecefb766 ("x86/resctrl: Add data structures and definitions for ABMC assignment")
  faebbc58cd ("x86/resctrl: Add support to enable/disable AMD ABMC feature")
  c4074ab87f ("x86/apic: Enable Secure AVIC in the control MSR")
  869e36b966 ("x86/apic: Allow NMI to be injected from hypervisor for Secure AVIC")
  30c2b98aa8 ("x86/apic: Add new driver for Secure AVIC")
  0c5caea762 ("perf/x86: Add PERF_CAP_PEBS_TIMING_INFO flag")
  68e61f6fd6 ("KVM: SVM: Emulate PERF_CNTR_GLOBAL_STATUS_SET for PerfMonV2")
  a3c4f3396b ("x86/msr-index: Add AMD workload classification MSRs")
  65f55a3017 ("x86/CPU/AMD: Add CPUID faulting support")
  17ec2f9653 ("KVM: VMX: Allow guest to set DEBUGCTL.RTM_DEBUG if RTM is supported")

Addressing this tools/perf build warning:

  Warning: Kernel ABI header differences:
    diff -u tools/arch/x86/include/asm/msr-index.h arch/x86/include/asm/msr-index.h

That makes the beautification scripts to pick some new entries:

  $ tools/perf/trace/beauty/tracepoints/x86_msr.sh > before
  $ cp arch/x86/include/asm/msr-index.h tools/arch/x86/include/asm/msr-index.h
  $ tools/perf/trace/beauty/tracepoints/x86_msr.sh > after
  $ diff -u before after
  --- before	2025-10-30 09:34:49.283533597 -0300
  +++ after	2025-10-30 09:35:00.971426811 -0300
  @@ -272,6 +272,9 @@
   	[0xc0000300 - x86_64_specific_MSRs_offset] = "AMD64_PERF_CNTR_GLOBAL_STATUS",
   	[0xc0000301 - x86_64_specific_MSRs_offset] = "AMD64_PERF_CNTR_GLOBAL_CTL",
   	[0xc0000302 - x86_64_specific_MSRs_offset] = "AMD64_PERF_CNTR_GLOBAL_STATUS_CLR",
  +	[0xc0000303 - x86_64_specific_MSRs_offset] = "AMD64_PERF_CNTR_GLOBAL_STATUS_SET",
  +	[0xc00003fd - x86_64_specific_MSRs_offset] = "IA32_L3_QOS_ABMC_CFG",
  +	[0xc00003ff - x86_64_specific_MSRs_offset] = "IA32_L3_QOS_EXT_CFG",
   	[0xc0000400 - x86_64_specific_MSRs_offset] = "IA32_EVT_CFG_BASE",
   	[0xc0000500 - x86_64_specific_MSRs_offset] = "AMD_WORKLOAD_CLASS_CONFIG",
   	[0xc0000501 - x86_64_specific_MSRs_offset] = "AMD_WORKLOAD_CLASS_ID",
  @@ -319,6 +322,7 @@
   	[0xc0010133 - x86_AMD_V_KVM_MSRs_offset] = "AMD64_RMP_END",
   	[0xc0010134 - x86_AMD_V_KVM_MSRs_offset] = "AMD64_GUEST_TSC_FREQ",
   	[0xc0010136 - x86_AMD_V_KVM_MSRs_offset] = "AMD64_RMP_CFG",
  +	[0xc0010138 - x86_AMD_V_KVM_MSRs_offset] = "AMD64_SAVIC_CONTROL",
   	[0xc0010140 - x86_AMD_V_KVM_MSRs_offset] = "AMD64_OSVW_ID_LENGTH",
   	[0xc0010141 - x86_AMD_V_KVM_MSRs_offset] = "AMD64_OSVW_STATUS",
   	[0xc0010200 - x86_AMD_V_KVM_MSRs_offset] = "F15H_PERF_CTL",
  $

Now one can trace systemwide asking to see backtraces to where that MSR
is being read/written:

  root@x1:~# perf trace -e msr:*_msr/max-stack=32/ --filter="msr==IA32_L3_QOS_ABMC_CFG"
  ^Croot@x1:~#

If we use -v (verbose mode) we can see what it does behind the scenes:

  root@x1:~# perf trace -v -e msr:*_msr/max-stack=32/ --filter="msr==IA32_L3_QOS_ABMC_CFG"
  0xc00003fd
  New filter for msr:write_msr: (msr==0xc00003fd) && (common_pid != 449842 && common_pid != 433756)
  0xc00003fd
  New filter for msr:read_msr: (msr==0xc00003fd) && (common_pid != 449842 && common_pid != 433756)
  mmap size 528384B
  ^Croot@x1:~#

Example with a frequent msr:

  # perf trace -v -e msr:*_msr/max-stack=32/ --filter="msr==IA32_SPEC_CTRL" --max-events 2
  Using CPUID AuthenticAMD-25-21-0
  0x48
  New filter for msr:read_msr: (msr==0x48) && (common_pid != 2612129 && common_pid != 3841)
  0x48
  New filter for msr:write_msr: (msr==0x48) && (common_pid != 2612129 && common_pid != 3841)
  mmap size 528384B
  Looking at the vmlinux_path (8 entries long)
  symsrc__init: build id mismatch for vmlinux.
  Using /proc/kcore for kernel data
  Using /proc/kallsyms for symbols
   0.000 Timer/2525383 msr:write_msr(msr: IA32_SPEC_CTRL, val: 6)
                                   do_trace_write_msr ([kernel.kallsyms])
                                   do_trace_write_msr ([kernel.kallsyms])
                                   __switch_to_xtra ([kernel.kallsyms])
                                   __switch_to ([kernel.kallsyms])
                                   __schedule ([kernel.kallsyms])
                                   schedule ([kernel.kallsyms])
                                   futex_wait_queue_me ([kernel.kallsyms])
                                   futex_wait ([kernel.kallsyms])
                                   do_futex ([kernel.kallsyms])
                                   __x64_sys_futex ([kernel.kallsyms])
                                   do_syscall_64 ([kernel.kallsyms])
                                   entry_SYSCALL_64_after_hwframe ([kernel.kallsyms])
                                   __futex_abstimed_wait_common64 (/usr/lib64/libpthread-2.33.so)
   0.030 :0/0 msr:write_msr(msr: IA32_SPEC_CTRL, val: 2)
                                   do_trace_write_msr ([kernel.kallsyms])
                                   do_trace_write_msr ([kernel.kallsyms])
                                   __switch_to_xtra ([kernel.kallsyms])
                                   __switch_to ([kernel.kallsyms])
                                   __schedule ([kernel.kallsyms])
                                   schedule_idle ([kernel.kallsyms])
                                   do_idle ([kernel.kallsyms])
                                   cpu_startup_entry ([kernel.kallsyms])
                                   secondary_startup_64_no_verify ([kernel.kallsyms])
  #

Please see tools/include/uapi/README for further details.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Babu Moger <babu.moger@amd.com>
Cc: Dapeng Mi <dapeng1.mi@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: K Prateek Nayak <kprateek.nayak@amd.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Neeraj Upadhyay <Neeraj.Upadhyay@amd.com>
Cc: Perry Yuan <perry.yuan@amd.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sean Christopherson <seanjc@google.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-11-01 12:24:27 -03:00
Josh Poimboeuf
c44b4b9eeb objtool: Fix skip_alt_group() for non-alternative STAC/CLAC
If an insn->alt points to a STAC/CLAC instruction, skip_alt_group()
assumes it's part of an alternative ("alt group") as opposed to some
other kind of "alt" such as an exception fixup.

While that assumption may hold true in the current code base, Linus has
an out-of-tree patch which breaks that assumption by replacing the
STAC/CLAC alternatives with raw STAC/CLAC instructions.

Make skip_alt_group() more robust by making sure it's actually an alt
group before continuing.

Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Fixes: 2d12c6fb78 ("objtool: Remove ANNOTATE_IGNORE_ALTERNATIVE from CLAC/STAC")
Closes: https://lore.kernel.org/CAHk-=wi6goUT36sR8GE47_P-aVrd5g38=VTRHpktWARbyE-0ow@mail.gmail.com
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Tested-by: Linus Torvalds <torvalds@linux-foundation.org>
Link: https://patch.msgid.link/3d22415f7b8e06a64e0873b21f48389290eeaa49.1761767616.git.jpoimboe@kernel.org
2025-11-01 07:43:20 +01:00
Linus Torvalds
ba36dd5ee6 bpf-fixes
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEE+soXsSLHKoYyzcli6rmadz2vbToFAmkFVBIACgkQ6rmadz2v
 bTrWVg//chctHGZZcP2ZbLxDDwLOwfjjsUY2COaD9P3ZN8/vWX6GEbvElLulkLgD
 Hwv3pe2C6NzHN9QH37M+WtJmLE1vI5aRuMXzpBKhOtOFAE5BfHzXeON0M4pswZd1
 jh7f4w7mBdW3MMoR2Dg/l+lbGxDKFfb9jfD1blm+uOuBodHdbIpa66Mscakannrx
 tNWoauPDcu7fu7b+KCItnICC+VewaoDmhr20Q8X/kwvqbNPZ98D/tzUw7YlngO1d
 p+K/oKVAfXbWbW79agNoqD+zVDKAos7dQgqCDY/cuZhJNzt4xBZfTkM62SXdHU7g
 aCXHg+qxoWMrYTWGGueAhwf4gB3YIe0atKxP9w5gbjtxbWa5Y6oTyIpgGKvO5SMj
 7qsmg/m338kS4aKQVjr9D042W+qqxRjrn2eF/x4Sth1GXMJd1ny14NpoNGEk/xsU
 TZfBdFgNOYUa1jeK3N3oEDdlxx8ITA9gsNPzSy9O8Ke6WRHp5u9Ob/7UIJsiVYWw
 6SPdIhagv719m93GvAC4Xe3BrRi1dmf5UX39oOqpnGKkg4lT/xNu4aYP89UbFVGW
 XgTPX+Cm7kRKb32Fv9GiLC0sTQEWVAiB0jVTGB9E8v15P7ybJ/9IrcRNcwcrKGNS
 ny+cn1SR+CmX6c8TdliSzLdtgGuPk3QrXkwWs4442IphtbPnhE4=
 =t7MS
 -----END PGP SIGNATURE-----

Merge tag 'bpf-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf

Pull bpf fixes from Alexei Starovoitov:

 - Mark migrate_disable/enable() as always_inline to avoid issues with
   partial inlining (Yonghong Song)

 - Fix powerpc stack register definition in libbpf bpf_tracing.h (Andrii
   Nakryiko)

 - Reject negative head_room in __bpf_skb_change_head (Daniel Borkmann)

 - Conditionally include dynptr copy kfuncs (Malin Jonsson)

 - Sync pending IRQ work before freeing BPF ring buffer (Noorain Eqbal)

 - Do not audit capability check in x86 do_jit() (Ondrej Mosnacek)

 - Fix arm64 JIT of BPF_ST insn when it writes into arena memory
   (Puranjay Mohan)

* tag 'bpf-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf:
  bpf/arm64: Fix BPF_ST into arena memory
  bpf: Make migrate_disable always inline to avoid partial inlining
  bpf: Reject negative head_room in __bpf_skb_change_head
  bpf: Conditionally include dynptr copy kfuncs
  libbpf: Fix powerpc's stack register definition in bpf_tracing.h
  bpf: Do not audit capability check in do_jit()
  bpf: Sync pending IRQ work before freeing ring buffer
2025-10-31 18:22:26 -07:00
Wang Liang
d01f8136d4 selftests: netdevsim: Fix ethtool-coalesce.sh fail by installing ethtool-common.sh
The script "ethtool-common.sh" is not installed in INSTALL_PATH, and
triggers some errors when I try to run the test
'drivers/net/netdevsim/ethtool-coalesce.sh':

  TAP version 13
  1..1
  # timeout set to 600
  # selftests: drivers/net/netdevsim: ethtool-coalesce.sh
  # ./ethtool-coalesce.sh: line 4: ethtool-common.sh: No such file or directory
  # ./ethtool-coalesce.sh: line 25: make_netdev: command not found
  # ethtool: bad command line argument(s)
  # ./ethtool-coalesce.sh: line 124: check: command not found
  # ./ethtool-coalesce.sh: line 126: [: -eq: unary operator expected
  # FAILED /0 checks
  not ok 1 selftests: drivers/net/netdevsim: ethtool-coalesce.sh # exit=1

Install this file to avoid this error. After this patch:

  TAP version 13
  1..1
  # timeout set to 600
  # selftests: drivers/net/netdevsim: ethtool-coalesce.sh
  # PASSED all 22 checks
  ok 1 selftests: drivers/net/netdevsim: ethtool-coalesce.sh

Fixes: fbb8531e58 ("selftests: extract common functions in ethtool-common.sh")
Signed-off-by: Wang Liang <wangliang74@huawei.com>
Link: https://patch.msgid.link/20251030040340.3258110-1-wangliang74@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-10-31 17:41:54 -07:00
Anubhav Singh
f8e8486702 selftests/net: use destination options instead of hop-by-hop
The GRO self-test, gro.c, currently constructs IPv6 packets containing a
Hop-by-Hop Options header (IPPROTO_HOPOPTS) to ensure the GRO path
correctly handles IPv6 extension headers.

However, network elements may be configured to drop packets with the
Hop-by-Hop Options header (HBH). This causes the self-test to fail
in environments where such network elements are present.

To improve the robustness and reliability of this test in diverse
network environments, switch from using IPPROTO_HOPOPTS to
IPPROTO_DSTOPTS (Destination Options).

The Destination Options header is less likely to be dropped by
intermediate routers and still serves the core purpose of the test:
validating GRO's handling of an IPv6 extension header. This change
ensures the test can execute successfully without being incorrectly
failed by network policies outside the kernel's control.

Fixes: 7d1575014a ("selftests/net: GRO coalesce test")
Reviewed-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Anubhav Singh <anubhavsinggh@google.com>
Link: https://patch.msgid.link/20251030060436.1556664-1-anubhavsinggh@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-10-31 17:33:17 -07:00
Anubhav Singh
02d064de05 selftests/net: fix out-of-order delivery of FIN in gro:tcp test
Due to the gro_sender sending data packets and FIN packets
in very quick succession, these are received almost simultaneously
by the gro_receiver. FIN packets are sometimes processed before the
data packets leading to intermittent (~1/100) test failures.

This change adds a delay of 100ms before sending FIN packets
in gro:tcp test to avoid the out-of-order delivery. The same
mitigation already exists for the gro:ip test.

Fixes: 7d1575014a ("selftests/net: GRO coalesce test")
Reviewed-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Anubhav Singh <anubhavsinggh@google.com>
Link: https://patch.msgid.link/20251030062818.1562228-1-anubhavsinggh@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-10-31 17:32:24 -07:00
Linus Torvalds
39bcf0f7d4 VFIO update for v6.18-rc4
- Fix overflows in vfio type1 backend for mappings at the end of the
    64-bit address space, resulting in leaked pinned memory.  New
    selftest support included to avoid such issues in the future.
    (Alex Mastro)
 -----BEGIN PGP SIGNATURE-----
 
 iQJFBAABCAAvFiEEQvbATlQL0amee4qQI5ubbjuwiyIFAmkFF0YRHGFsZXhAc2hh
 emJvdC5vcmcACgkQI5ubbjuwiyLZsQ/+OzFOEOOw5UIpVrMsnp7DmLOSYgiorlpS
 q9jtsP99dh4U1JqJaONAGb6uu1SL6yX6VL3QZNEppPY83WXVSxX44lo3qpTZ90iq
 xpeUTNHv/YOhAYJ+eXbGfjoJsxQqRLECkWyJwQAQl227yRsqWzAR5twRhBrLi9Gf
 hd6d+ZQuTbEQo3oa7bNA2Q+EUevn7gEiNEBAz6L+zz0DWiVrd+XEk4N+37MR7UKr
 6og/p3lI6AyjFG3dqUGBB2uUVfCGzdCXi6F01reZm3q7aPUdO+PuP7MnYjpJ5j4n
 Ph3qN5iZ+bF24wXtZEGs+NDvaNXsfk5RoPxjc62DeYRG7WfH83KZCyrQju6vuCsd
 CFirCf/97MK/LdDaj46nOi37vNQGap4EibmJTI9hMRh2x2RRlxTVdOv+j/lkQAEr
 GgNwPMZO1yp2tJKQ5DJEgaXgQyOfEpWAtG2qbVCXHUc6lzsN2HQTpIDSbXsZdpfK
 IuCLtmaR20tfUoRQJYCryb2hww2+6Z5j6W+CDdZpCfkqmQnsC0duoebru1g32vc0
 20PrVE/xMMtw+I1CBTDLWFDLs/M8sg0AbBicoHFXh1Ea8n9EC3b/aczPk0ff0b9g
 x2BcDBaoCyslE7pSpeCwemCrACtreXu7Ttzb5CgAIQAuBcZCvKAFGC80KGOt8Tbh
 M0bQYDkeouM=
 =sfMw
 -----END PGP SIGNATURE-----

Merge tag 'vfio-v6.18-rc4' of https://github.com/awilliam/linux-vfio

Pull VFIO fixes from Alex Williamson:

 - Fix overflows in vfio type1 backend for mappings at the end of the
   64-bit address space, resulting in leaked pinned memory.

   New selftest support included to avoid such issues in the future
   (Alex Mastro)

* tag 'vfio-v6.18-rc4' of https://github.com/awilliam/linux-vfio:
  vfio: selftests: add end of address space DMA map/unmap tests
  vfio: selftests: update DMA map/unmap helpers to support more test kinds
  vfio/type1: handle DMA map/unmap up to the addressable limit
  vfio/type1: move iova increment to unmap_unpin_*() caller
  vfio/type1: sanitize for overflow using check_*_overflow()
2025-10-31 14:20:09 -07:00
Jakub Kicinski
1a2352ad82 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Cross-merge networking fixes after downstream PR (net-6.18-rc4).

No conflicts, adjacent changes:

drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
  ded9813d17 ("net: stmmac: Consider Tx VLAN offload tag length for maxSDU")
  26ab9830be ("net: stmmac: replace has_xxxx with core_type")

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-10-31 06:46:03 -07:00
Linus Torvalds
d127176862 linux_kselftest-fixes-6.18-rc4
Fixes build warning in cachestat found during clang build and adds
 tmpshmcstat to .gitignore.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEPZKym/RZuOCGeA/kCwJExA0NQxwFAmkDy4wACgkQCwJExA0N
 Qxz/zw//XkD/6mI0KHL88ko6i9GGcbN7QI0RAZpJFHmKTCRkZ59eXrA1h0eeZGwB
 KOt9JzhGkeYHeFYVXV6ghzlV8zXig5romxyBZ7f3WTtPDIFobNJQqnV8ZcOMe95A
 PtdxNtR1tXK7VYP6MlUETCcKEjO1USrbaHMBHSPpDhoO+uysiW2XptqVjVMWbaCZ
 9yydd4bCsIKSWHc4O2AaOOdA45A1ta+DZFGwvvq4Y0RrRUrLYsvgN/EcC5KGd9yO
 2wqbQ9pmNdqZgrVWXYGNjGoNAlDsyc0vBSDBEYzvEUvcASQw9MG62xXHF6C23TMM
 szNg4QR0xwWXB3EHceHJYUqmiPLWsfjk1az2mkkMKE0f+84JWdKWGCTb2hyU+vjl
 dzUGDRLKuRcrfmAcgGxa1ARuJFElo4H4iAf76/dXVCS7rOciPx2Xc2J2yP2y7wsb
 rh17+k5fmpeOIrAHju4tiSOejDG9bxr27wj/VsL3qUxOcxh20tAk/0D6ojtuyzWF
 Zg247AQT+X/7C4lKwwKMZKdOTqZN5iIEOb4T1vm3dC8rY6KI2hzwyt3TlDyN7rsS
 q84/zP7h3z95LhgQ5WCb3Z7x3vo0gkIirpooD81XVN9X6nQJg73TpcNkzrGcnY9K
 rKALuPtPeaX9b6Bc2tuns10HWMtBCQLI+8sSayMSoz/IFqN1HAU=
 =XQbQ
 -----END PGP SIGNATURE-----

Merge tag 'linux_kselftest-fixes-6.18-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest

Pull kselftest fixes from Shuah Khan:
 "Fix build warning in cachestat found during clang build and add
  tmpshmcstat to .gitignore"

* tag 'linux_kselftest-fixes-6.18-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
  selftests: cachestat: Fix warning on declaration under label
  selftests/cachestat: add tmpshmcstat file to .gitignore
2025-10-30 19:48:13 -07:00
Linus Torvalds
e576349123 Including fixes from wireless, Bluetooth and netfilter.
Current release - regressions:
 
   - tcp: fix too slow tcp_rcvbuf_grow() action
 
   - bluetooth: fix corruption in h4_recv_buf() after cleanup
 
 Previous releases - regressions:
 
   - mptcp: restore window probe
 
   - bluetooth:
     - fix connection cleanup with BIG with 2 or more BIS
     - fix crash in set_mesh_sync and set_mesh_complete
 
   - batman-adv: release references to inactive interfaces
 
   - nic: ice: fix usage of logical PF id
 
   - nic: sfc: fix potential memory leak in efx_mae_process_mport()
 
 Previous releases - always broken:
 
   - devmem: refresh devmem TX dst in case of route invalidation
 
   - netfilter: add seqadj extension for natted connections
 
   - wifi:
     - iwlwifi: fix potential use after free in iwl_mld_remove_link()
     - brcmfmac: fix crash while sending action frames in standalone AP Mode
 
   - eth: mlx5e: cancel tls RX async resync request in error flows
 
   - eth: ixgbe: fix memory leak and use-after-free in ixgbe_recovery_probe()
 
   - eth: hibmcge: fix rx buf avl irq is not re-enabled in irq_handle issue
 
   - eth: cxgb4: fix potential use-after-free in ipsec callback
 
   - eth: nfp: fix memory leak in nfp_net_alloc()
 
 Signed-off-by: Paolo Abeni <pabeni@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 
 iQJGBAABCAAwFiEEg1AjqC77wbdLX2LbKSR5jcyPE6QFAmkDctUSHHBhYmVuaUBy
 ZWRoYXQuY29tAAoJECkkeY3MjxOkLSQP/1jXD3wLztnTT32+AItyg+R+Sf8EJN78
 hJHI1Kp9zJIdgdwnokMMTohfuSjtrXJ3HSk6RyfZnd3lD8Qo1krp4Dy9CQR0lvzb
 STNvjmA4Lbs/pS2EjSeMxAkrY0JbWyQ2X0NPrp8ZilqLYyGe8/wqq7eKR9OVZjwR
 XMgFlpPncPhMAfTaN6TRHy+YoxGp+ItiGWdX1n0tW2cBpOMNH5lKHz3Qp19SvumU
 G9ilscybTQP7EJQKvGsa76Lwl1djFFuK4BXTn3TTIieds943/l6lG5PvSOKVqhTl
 Nth6+EsbkQiC/R6rLFdzzM5D8L2ZSD2on6AX7owDF5DN02PiwamwXc+1W8KWBI73
 ShiWzYJ78m4nqesWMqCgCTcjv16npW9UuemeWavumAruRpEuy6Ffa3BAKTqDiwVX
 tHFms8q8NCgEQ7r3IPdbNZwx3Z/Sig/wvAzrC23jrth1/wDeaJ2nqBmQ+S4dt9HC
 wGTZyKfU0oGfvKydZQJjGZM+OOB3J1EeOad/+7fxikFFQ/KDBIUzeJZdzJa0Ba64
 rkKR6McggdreYFfS3jVM8iwD+bLk77JBJGd3SQhFZlkk55RYCeIRHEEhVdX1tjon
 bgr1Fh0hqyVQkDmJCUpj91y2ulUmI9CJRWybtoH2w8QR90pMSgC/egAtigo1595c
 LY8LDoe4NxKD
 =6//1
 -----END PGP SIGNATURE-----

Merge tag 'net-6.18-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net

Pull networking fixes from Paolo Abeni:
 "Including fixes from wireless, Bluetooth and netfilter.

  Current release - regressions:

    - tcp: fix too slow tcp_rcvbuf_grow() action

    - bluetooth: fix corruption in h4_recv_buf() after cleanup

  Previous releases - regressions:

    - mptcp: restore window probe

    - bluetooth:
       - fix connection cleanup with BIG with 2 or more BIS
       - fix crash in set_mesh_sync and set_mesh_complete

    - batman-adv: release references to inactive interfaces

    - nic:
       - ice: fix usage of logical PF id
       - sfc: fix potential memory leak in efx_mae_process_mport()

  Previous releases - always broken:

    - devmem: refresh devmem TX dst in case of route invalidation

    - netfilter: add seqadj extension for natted connections

    - wifi:
       - iwlwifi: fix potential use after free in iwl_mld_remove_link()
       - brcmfmac: fix crash while sending action frames in standalone AP Mode

    - eth:
       - mlx5e: cancel tls RX async resync request in error flows
       - ixgbe: fix memory leak and use-after-free in ixgbe_recovery_probe()
       - hibmcge: fix rx buf avl irq is not re-enabled in irq_handle issue
       - cxgb4: fix potential use-after-free in ipsec callback
       - nfp: fix memory leak in nfp_net_alloc()"

* tag 'net-6.18-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (75 commits)
  net: sctp: fix KMSAN uninit-value in sctp_inq_pop
  net: devmem: refresh devmem TX dst in case of route invalidation
  net: stmmac: est: Fix GCL bounds checks
  net: stmmac: Consider Tx VLAN offload tag length for maxSDU
  net: stmmac: vlan: Disable 802.1AD tag insertion offload
  net/mlx5e: kTLS, Cancel RX async resync request in error flows
  net: tls: Cancel RX async resync request on rcd_delta overflow
  net: tls: Change async resync helpers argument
  net: phy: dp83869: fix STRAP_OPMODE bitmask
  selftests: net: use BASH for bareudp testing
  net: mctp: Fix tx queue stall
  net/mlx5: Don't zero user_count when destroying FDB tables
  net: usb: asix_devices: Check return value of usbnet_get_endpoints
  mptcp: zero window probe mib
  mptcp: restore window probe
  mptcp: fix MSG_PEEK stream corruption
  mptcp: drop bogus optimization in __mptcp_check_push()
  netconsole: Fix race condition in between reader and writer of userdata
  Documentation: netconsole: Remove obsolete contact people
  nfp: xsk: fix memory leak in nfp_net_alloc()
  ...
2025-10-30 18:35:35 -07:00
Jakub Kicinski
ecca75ae5a selftests: drv-net: replace the nsim ring test with a drv-net one
We are trying to move away from netdevsim-only tests and towards
tests which can be run both against netdevsim and real drivers.

Replace the simple bash script we have for checking ethtool -g/-G
on netdevsim with a Python test tweaking those params as well
as channel count.

The new test is not exactly equivalent to the netdevsim one,
but real drivers don't often support random ring sizes,
let alone modifying max values via debugfs.

Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://patch.msgid.link/20251029164930.2923448-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-10-30 17:32:18 -07:00
Mark Brown
a186fbcfd8 KVM: arm64: selftests: Filter ZCR_EL2 in get-reg-list
get-reg-list includes ZCR_EL2 in the list of EL2 registers that it looks
for when NV is enabled but does not have any feature gate for this register,
meaning that testing any combination of features that includes EL2 but does
not include SVE will result in a test failure due to a missing register
being reported:

| The following lines are missing registers:
|
|	ARM64_SYS_REG(3, 4, 1, 2, 0),

Add ZCR_EL2 to feat_id_regs so that the test knows not to expect to see it
without SVE being enabled.

Fixes: 3a90b6f279 ("KVM: arm64: selftests: get-reg-list: Add base EL2 registers")
Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://patch.msgid.link/20251024-kvm-arm64-get-reg-list-zcr-el2-v1-1-0cd0ff75e22f@kernel.org
Signed-off-by: Marc Zyngier <maz@kernel.org>
2025-10-30 16:13:27 +00:00
Mark Brown
92e781c93e KVM: arm64: selftests: Add SCTLR2_EL2 to get-reg-list
We recently added support for SCTLR2_EL2 to the kernel but did not add it
to get-reg-list, resulting in it reporting the missing register when it
is available. Add it.

Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://patch.msgid.link/20251023-b4-kvm-arm64-get-reg-list-sctlr-el2-v1-1-088f88ff992a@kernel.org
Signed-off-by: Marc Zyngier <maz@kernel.org>
2025-10-30 16:13:04 +00:00
Maximilian Dittgen
a24f7afce0 KVM: selftests: fix MAPC RDbase target formatting in vgic_lpi_stress
Since GITS_TYPER.PTA == 0, the ITS MAPC command demands a CPU ID,
rather than a physical redistributor address, for its RDbase
command argument.

As such, when MAPC-ing guest ITS collections, vgic_lpi_stress iterates
over CPU IDs in the range [0, nr_cpus), passing them as the RDbase
vcpu_id argument to its_send_mapc_cmd().

However, its_encode_target() in the its_send_mapc_cmd() selftest
handler expects RDbase arguments to be formatted with a 16 bit
offset, as shown by the 16-bit target_addr right shift its implementation:

        its_mask_encode(&cmd->raw_cmd[2], target_addr >> 16, 51, 16)

At the moment, all CPU IDs passed into its_send_mapc_cmd() have no
offset, therefore becoming 0x0 after the bit shift. Thus, when
vgic_its_cmd_handle_mapc() receives the ITS command in vgic-its.c,
it always interprets the RDbase target CPU as CPU 0. All interrupts
sent to collections will be processed by vCPU 0, which defeats the
purpose of this multi-vCPU test.

Fix by creating procnum_to_rdbase() helper function, which left-shifts
the vCPU parameter received by its_send_mapc_cmd 16 bits before passing
it to its_encode_target for encoding.

Signed-off-by: Maximilian Dittgen <mdittgen@amazon.de>
Link: https://patch.msgid.link/20251020145946.48288-1-mdittgen@amazon.de
Signed-off-by: Marc Zyngier <maz@kernel.org>
2025-10-30 16:12:30 +00:00
Ido Schimmel
02da595751 selftests: traceroute: Add ICMP extensions tests
Test that ICMP extensions are reported correctly when enabled and not
reported when disabled. Test both IPv4 and IPv6 and using different
packet sizes, to make sure trimming / padding works correctly.

Disable ICMP rate limiting (defaults to 1 per-second per-target) so that
the kernel will always generate ICMP errors when needed.

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Link: https://patch.msgid.link/20251027082232.232571-4-idosch@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-10-29 18:28:30 -07:00
Po-Hsu Lin
9311e9540a selftests: net: use BASH for bareudp testing
In bareudp.sh, this script uses /bin/sh and it will load another lib.sh
BASH script at the very beginning.

But on some operating systems like Ubuntu, /bin/sh is actually pointed to
DASH, thus it will try to run BASH commands with DASH and consequently
leads to syntax issues:
  # ./bareudp.sh: 4: ./lib.sh: Bad substitution
  # ./bareudp.sh: 5: ./lib.sh: source: not found
  # ./bareudp.sh: 24: ./lib.sh: Syntax error: "(" unexpected

Fix this by explicitly using BASH for bareudp.sh. This fixes test
execution failures on systems where /bin/sh is not BASH.

Reported-by: Edoardo Canepa <edoardo.canepa@canonical.com>
Link: https://bugs.launchpad.net/bugs/2129812
Signed-off-by: Po-Hsu Lin <po-hsu.lin@canonical.com>
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Link: https://patch.msgid.link/20251027095710.2036108-2-po-hsu.lin@canonical.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-10-29 17:56:03 -07:00
Ankit Khushwaha
afb8f6567a selftest: net: fix socklen_t type mismatch in sctp_collision test
Socket APIs like recvfrom(), accept(), and getsockname() expect socklen_t*
arg, but tests were using int variables. This causes -Wpointer-sign
warnings on platforms where socklen_t is unsigned.

Change the variable type from int to socklen_t to resolve the warning and
ensure type safety across platforms.

warning fixed:

sctp_collision.c:62:70: warning: passing 'int *' to parameter of
type 'socklen_t *' (aka 'unsigned int *') converts between pointers to
integer types with different sign [-Wpointer-sign]
   62 |                 ret = recvfrom(sd, buf, sizeof(buf),
									0, (struct sockaddr *)&daddr, &len);
      |                                                           ^~~~
/usr/include/sys/socket.h:165:27: note: passing argument to
parameter '__addr_len' here
  165 |                          socklen_t *__restrict __addr_len);
      |                                                ^

Reviewed-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
Signed-off-by: Ankit Khushwaha <ankitkhushwaha.linux@gmail.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20251028172947.53153-1-ankitkhushwaha.linux@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-10-29 17:39:26 -07:00
Jakub Kicinski
34164142b5 tools: ynl: rework the string representation of NlError
In early days of YNL development dumping the NlMsg on errors
was quite useful, as the library itself could have been buggy.
These days increasingly the NlMsg is just taking up screen space
and means nothing to a typical user. Try to format the errors
more in line with how YNL C formats its errors strings.

Before:
  $ ynl --family ethtool  --do channels-set  --json '{}'
  Netlink error: Invalid argument
  nl_len = 44 (28) nl_flags = 0x300 nl_type = 2
	error: -22
	extack: {'miss-type': 'header'}

  $ ynl --family ethtool  --do channels-set  --json '{..., "tx-count": 999}'
  Netlink error: Invalid argument
  nl_len = 88 (72) nl_flags = 0x300 nl_type = 2
	error: -22
	extack: {'msg': 'requested channel count exceeds maximum', 'bad-attr': '.tx-count'}

After:
  $ ynl --family ethtool  --do channels-set  --json '{}'
  Netlink error: Invalid argument {'miss-type': 'header'}

  $ ynl --family ethtool  --do channels-set  --json '{..., "tx-count": 999}'
  Netlink error: requested channel count exceeds maximum: Invalid argument {'bad-attr': '.tx-count'}

Reviewed-by: Donald Hunter <donald.hunter@gmail.com>
Link: https://patch.msgid.link/20251027192958.2058340-2-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-10-28 16:35:06 -07:00
Jakub Kicinski
09e2603513 tools: ynl: fix indent issues in the main Python lib
Class NlError() and operation_do_attributes() are indented by 2 spaces
rather than 4 spaces used by the rest of the file.

Reviewed-by: Donald Hunter <donald.hunter@gmail.com>
Link: https://patch.msgid.link/20251027192958.2058340-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-10-28 16:35:06 -07:00
Alex Mastro
de8d1f2fd5 vfio: selftests: add end of address space DMA map/unmap tests
Add tests which validate dma map/unmap at the end of address space. Add
negative test cases for checking that overflowing ioctl args fail with
the expected errno.

Reviewed-by: David Matlack <dmatlack@google.com>
Signed-off-by: Alex Mastro <amastro@fb.com>
Link: https://lore.kernel.org/r/20251028-fix-unmap-v6-5-2542b96bcc8e@fb.com
Signed-off-by: Alex Williamson <alex@shazbot.org>
2025-10-28 15:54:41 -06:00
Alex Mastro
16950b60c1 vfio: selftests: update DMA map/unmap helpers to support more test kinds
Add __vfio_pci_dma_*() helpers which return -errno from the underlying
ioctls.

Add __vfio_pci_dma_unmap_all() to test more unmapping code paths. Add an
out unmapped arg to report the unmapped byte size.

The existing vfio_pci_dma_*() functions, which are intended for
happy-path usage (assert on failure) are now thin wrappers on top of the
double-underscore helpers.

Reviewed-by: David Matlack <dmatlack@google.com>
Signed-off-by: Alex Mastro <amastro@fb.com>
Link: https://lore.kernel.org/r/20251028-fix-unmap-v6-4-2542b96bcc8e@fb.com
Signed-off-by: Alex Williamson <alex@shazbot.org>
2025-10-28 15:54:41 -06:00
Hangbin Liu
e396694055 tools: ynl: avoid print_field when there is no reply
When request a none support device operation, there will be no reply.
In this case, the len(desc) check will always be true, causing print_field
to enter an infinite loop and crash the program. Example reproducer:

  # ethtool.py -c veth0

To fix this, return immediately if there is no reply.

Fixes: f3d07b02b2 ("tools: ynl: ethtool testing tool")
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Link: https://patch.msgid.link/20251024125853.102916-1-liuhangbin@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-10-27 18:16:49 -07:00
Petr Machata
d10920607f selftests: bridge_mdb: Add a test for MDB flush on snooping disable
Check that non-permanent MDB entries are removed as IGMP / MLD snooping is
disabled.

Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
Link: https://patch.msgid.link/9420dfbcf26c8e1134d31244e9e7d6a49d677a69.1761228273.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-10-27 17:57:21 -07:00
Petr Oros
65f9c4c588 tools: ynl: fix string attribute length to include null terminator
The ynl_attr_put_str() function was not including the null terminator
in the attribute length calculation. This caused kernel to reject
CTRL_CMD_GETFAMILY requests with EINVAL:
"Attribute failed policy validation".

For a 4-character family name like "dpll":
- Sent: nla_len=8 (4 byte header + 4 byte string without null)
- Expected: nla_len=9 (4 byte header + 5 byte string with null)

The bug was introduced in commit 15d2540e0d ("tools: ynl: check for
overflow of constructed messages") when refactoring from stpcpy() to
strlen(). The original code correctly included the null terminator:

  end = stpcpy(ynl_attr_data(attr), str);
  attr->nla_len = NLA_HDRLEN + NLA_ALIGN(end -
                                (char *)ynl_attr_data(attr));

Since stpcpy() returns a pointer past the null terminator, the length
included it. The refactored version using strlen() omitted the +1.

The fix also removes NLA_ALIGN() from nla_len calculation, since
nla_len should contain actual attribute length, not aligned length.
Alignment is only for calculating next attribute position. This makes
the code consistent with ynl_attr_put().

CTRL_ATTR_FAMILY_NAME uses NLA_NUL_STRING policy which requires
null terminator. Kernel validates with memchr() and rejects if not
found.

Fixes: 15d2540e0d ("tools: ynl: check for overflow of constructed messages")
Signed-off-by: Petr Oros <poros@redhat.com>
Tested-by: Ivan Vecera <ivecera@redhat.com>
Reviewed-by: Ivan Vecera <ivecera@redhat.com>
Link: https://lore.kernel.org/20251018151737.365485-3-zahari.doychev@linux.com
Link: https://patch.msgid.link/20251024132438.351290-1-poros@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-10-27 16:47:29 -07:00
Wilfred Mallawa
5f30bc4706 selftests: tls: add tls record_size_limit test
Test that outgoing plaintext records respect the tls TLS_TX_MAX_PAYLOAD_LEN
set using setsockopt(). The limit is set to be 128, thus, in all received
records, the plaintext must not exceed this amount.

Also test that setting a new record size limit whilst a pending open
record exists is handled correctly by discarding the request.

Suggested-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: Wilfred Mallawa <wilfred.mallawa@wdc.com>
Reviewed-by: Sabrina Dubroca <sd@queasysnail.net>
Link: https://patch.msgid.link/20251022001937.20155-2-wilfred.opensource@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-10-27 16:13:43 -07:00
Linus Torvalds
af8159515f - Fix x32 build due to wrong format specifier on that sub-arch
- Add one more Rust noreturn function to objtool's list
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmj+C/cACgkQEsHwGGHe
 VUojtA//V5LCV5+OGYhqrjc8Bkim42COCQ1cDU/fcw59iP9TNGI17E+kpCGPNztq
 9ZfGFtFlQqm4er/RFgIUoeCYzt3xpmOa1QgogHtDDdPSvTUULXPHAMXvybpvJww4
 uxfOJMWUOKoPF2qkwVfzZNLYeJdi3fMAWJB7q0ZnLFm2OYHSfLDO8yyK9ADIgrNw
 vbEFoFtgkxWfR6/f82IZ2EdF1Z6/3hCDmlXAP3Ev+3afww+nQMHfUyg/RjpP6aMI
 lYqFSCxhUgl6fjYDi8Hx8DLl/yDxJY3A1FYe+xy1Hx0IO6YZ1V1MaQnSeSi3tU39
 Yvghq0bGZqv+wBXs9tG7Kxdb35vzDevRQctJtLv+ClgaGZEiIkedN1+Km4z0vNIj
 ughNo/LXP3ax0H8KGsh2jJPjykQpLbjnraAYoE/M40v8ZHEEjJlVEnDXhQ4ySUHr
 /sRq7nQ7nHL4c1XAP3TVM2ETaoTFEzWq2wxIYJ4Nm1cvouljQfsllg7JjmV3e323
 vW4ZtOQdAj4EikenaQTP94OZ/V3Z/LM4CV9XHiI5O22ewoya6TEGJaimF5MrbTHy
 xmZ3ANkLJCCnI0vdE9cjn1MatRU9VN0Q2XHyhTN8XU0c2iA3pYHRICg5TTJw3XYu
 f3zvcXgziDedBJc8baCYhnZJXNy2oM3lV8/WYa/y1tlS9TBdLNo=
 =d4ht
 -----END PGP SIGNATURE-----

Merge tag 'objtool_urgent_for_v6.18_rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull objtool fixes from Borislav Petkov:

 - Fix x32 build due to wrong format specifier on that sub-arch

 - Add one more Rust noreturn function to objtool's list

* tag 'objtool_urgent_for_v6.18_rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  objtool: Fix failure when being compiled on x32 system
  objtool/rust: add one more `noreturn` Rust function
2025-10-26 09:44:36 -07:00
Alessandro Zanni
13cb6ac5b5 selftest: net: prevent use of uninitialized variable
Fix to avoid the usage of the `ret` variable uninitialized in the
following macro expansions.

It solves the following warning:

In file included from netlink-dumps.c:21:
netlink-dumps.c: In function ‘dump_extack’:
../kselftest_harness.h:788:35: warning: ‘ret’ may be used uninitialized [-Wmaybe-uninitialized]
  788 |                         intmax_t  __exp_print = (intmax_t)__exp; \
      |                                   ^~~~~~~~~~~
../kselftest_harness.h:631:9: note: in expansion of macro ‘__EXPECT’
  631 |         __EXPECT(expected, #expected, seen, #seen, ==, 0)
      |         ^~~~~~~~
netlink-dumps.c:169:9: note: in expansion of macro ‘EXPECT_EQ’
  169 |         EXPECT_EQ(ret, FOUND_EXTACK);
      |         ^~~~~~~~~

The issue can be reproduced, building the tests, with the command:
make -C tools/testing/selftests TARGETS=net

Signed-off-by: Alessandro Zanni <alessandro.zanni87@gmail.com>
Link: https://patch.msgid.link/20251023205354.28249-1-alessandro.zanni87@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-10-24 18:43:37 -07:00
Andrii Nakryiko
7221b9caf8 libbpf: Fix powerpc's stack register definition in bpf_tracing.h
retsnoop's build on powerpc (ppc64le) architecture ([0]) failed due to
wrong definition of PT_REGS_SP() macro. Looking at powerpc's
implementation of stack unwinding in perf_callchain_user_64() clearly
shows that stack pointer register is gpr[1].

Fix libbpf's definition of __PT_SP_REG for powerpc to fix all this.

  [0] https://kojipkgs.fedoraproject.org/work/tasks/1544/137921544/build.log

Fixes: 138d6153a1 ("samples/bpf: Enable powerpc support")
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Reviewed-by: Naveen N Rao (AMD) <naveen@kernel.org>
Link: https://lore.kernel.org/r/20251020203643.989467-1-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-10-23 11:25:16 -07:00
Jakub Kicinski
2b7553db91 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Cross-merge networking fixes after downstream PR (net-6.18-rc3).

No conflicts or adjacent changes.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-10-23 10:53:08 -07:00
Linus Torvalds
ab431bc397 Including fixes from can. Slim pickings, I'm guessing people haven't
really started testing.
 
 Current release - new code bugs:
 
  - eth: mlx5e:
    - psp: avoid 'accel' NULL pointer dereference
    - skip PPHCR register query for FEC histogram if not supported
 
 Previous releases - regressions:
 
  - bonding: update the slave array for broadcast mode
 
  - rtnetlink: re-allow deleting FDB entries in user namespace
 
  - eth: dpaa2: fix the pointer passed to PTR_ALIGN on Tx path
 
 Previous releases - always broken:
 
  - can: drop skb on xmit if device is in listen-only mode
 
  - gro: clear skb_shinfo(skb)->hwtstamps in napi_reuse_skb()
 
  - eth: mlx5e
    - RX, fix generating skb from non-linear xdp_buff if program
      trims frags
    - make devcom init failures non-fatal, fix races with IPSec
 
 Misc:
 
  - some documentation formatting "fixes"
 
 Signed-off-by: Jakub Kicinski <kuba@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEE6jPA+I1ugmIBA4hXMUZtbf5SIrsFAmj6QcgACgkQMUZtbf5S
 IrsRvw/+N4rc/M7X6kYXftxW+DchJ9daXVFFXDR6/L05LyOhNL6wDi0TNT0MM2f0
 RLQQ6ReOyCGhDGDzUJN6FNdn6B75HSzoeZKDoCVBoCpHg8C55X2YXMpTZILlKxH+
 A7yI8H6Zb5ljiXlOwhqSrIzth6orNxjqUBAOIiu0JUhvfvuYnRh1RZPYX+3babjS
 2DcEGK36GLe671PR6S33tu2q/xcmPonbBtQjplEtDKyGrV6+/BtvHSJ/bnKjtnar
 5DR1yJE8PdtujdXyxzaS7x2xfVnbnX/huNepqY3K0d5ZXJphQGxCEfY5DTWAsooF
 zX7Q/lRe56bmNzjg+LRhYzIeKaSFlXw23FQe2b+Dpz/qPMKkm4F2LdIq1C9GOXom
 StX0zz7v6C332N/fABAzCUbklaclZA+HyVmjEJYIdK037xaIpYdh9AfkX0Byd8xP
 JOGFeVtqlW/Vg7yqmiJ8N66yaMKOtmzfA3yyQid0F0CVwy/h3z/fDp78rfTjgORp
 tXX2DO9jF5aUZXE5p6VAWQGHOylpVP83GHKzTZtLxDgzK0d4VBXUGslwifn1rhTz
 yzr//qwt18Rcbv7ezmpsLT+Q2S7pTXtEblWZgf/IM5HE0AGdJIItbhYm3sjUYMiM
 UGZgBr0AwWywXcUmtiD/IdA85UJLUGjDKx88Du356VGPBizIG3A=
 =rD+V
 -----END PGP SIGNATURE-----

Merge tag 'net-6.18-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net

Pull networking fixes from Jakub Kicinski:
 "Including fixes from can. Slim pickings, I'm guessing people haven't
  really started testing.

  Current release - new code bugs:

   - eth: mlx5e:
       - psp: avoid 'accel' NULL pointer dereference
       - skip PPHCR register query for FEC histogram if not supported

  Previous releases - regressions:

   - bonding: update the slave array for broadcast mode

   - rtnetlink: re-allow deleting FDB entries in user namespace

   - eth: dpaa2: fix the pointer passed to PTR_ALIGN on Tx path

  Previous releases - always broken:

   - can: drop skb on xmit if device is in listen-only mode

   - gro: clear skb_shinfo(skb)->hwtstamps in napi_reuse_skb()

   - eth: mlx5e
       - RX, fix generating skb from non-linear xdp_buff if program
         trims frags
       - make devcom init failures non-fatal, fix races with IPSec

  Misc:

   - some documentation formatting 'fixes'"

* tag 'net-6.18-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (47 commits)
  net/mlx5: Fix IPsec cleanup over MPV device
  net/mlx5: Refactor devcom to return NULL on failure
  net/mlx5e: Skip PPHCR register query if not supported by the device
  net/mlx5: Add PPHCR to PCAM supported registers mask
  virtio-net: zero unused hash fields
  net: phy: micrel: always set shared->phydev for LAN8814
  vsock: fix lock inversion in vsock_assign_transport()
  ovpn: use datagram_poll_queue for socket readiness in TCP
  espintcp: use datagram_poll_queue for socket readiness
  net: datagram: introduce datagram_poll_queue for custom receive queues
  net: bonding: fix possible peer notify event loss or dup issue
  net: hsr: prevent creation of HSR device with slaves from another netns
  sctp: avoid NULL dereference when chunk data buffer is missing
  ptp: ocp: Fix typo using index 1 instead of i in SMA initialization loop
  net: ravb: Ensure memory write completes before ringing TX doorbell
  net: ravb: Enforce descriptor type ordering
  net: hibmcge: select FIXED_PHY
  net: dlink: use dev_kfree_skb_any instead of dev_kfree_skb
  Documentation: networking: ax25: update the mailing list info.
  net: gro_cells: fix lock imbalance in gro_cells_receive()
  ...
2025-10-23 07:03:18 -10:00
Sidharth Seela
920aa3a770 selftests: cachestat: Fix warning on declaration under label
Fix warning caused from declaration under a case label. The proper way
is to declare variable at the beginning of the function. The warning
came from running clang using LLVM=1; and is as follows:

-test_cachestat.c:260:3: warning: label followed by a declaration is a C23 extension [-Wc23-extensions]
  260 |                 char *map = mmap(NULL, filesize, PROT_READ | PROT_WRITE,
      |

Link: https://lore.kernel.org/r/20250929115405.25695-2-sidharthseela@gmail.com
Signed-off-by: Sidharth Seela <sidharthseela@gmail.com>
Reviewed-by: SeongJae Park <sj@kernel.org>
Reviewed-by: wang lian <lianux.mm@gmail.com>
Reviewed-by: Dev Jain <dev.jain@arm.com>
Acked-by: Shuah Khan <skhan@linuxfoundation.org>
Acked-by: Nhat Pham <nphamcs@gmail.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2025-10-22 09:23:18 -06:00
Madhur Kumar
b90cafb438 selftests/cachestat: add tmpshmcstat file to .gitignore
Add the tmpshmcstat file to .gitignore to avoid
accidentally staging the build artifact

Link: https://lore.kernel.org/r/20251013095149.1386628-1-madhurkumar004@gmail.com
Signed-off-by: Madhur Kumar <madhurkumar004@gmail.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2025-10-22 09:23:05 -06:00
Mikulas Patocka
49c98f30f4 objtool: Fix failure when being compiled on x32 system
Fix compilation failure when compiling the kernel with the x32 toolchain.

In file included from check.c:16:
check.c: In function ¡check_abs_references¢:
/usr/src/git/linux-2.6/tools/objtool/include/objtool/warn.h:47:17: error: format ¡%lx¢ expects argument of type ¡long unsigned int¢, but argument 7 has type ¡u64¢ {aka ¡long
long unsigned int¢} [-Werror=format=]
   47 |                 "%s%s%s: objtool" extra ": " format "\n",               \
      |                 ^~~~~~~~~~~~~~~~~
/usr/src/git/linux-2.6/tools/objtool/include/objtool/warn.h:54:9: note: in expansion of macro ¡___WARN¢
   54 |         ___WARN(severity, "", format, ##__VA_ARGS__)
      |         ^~~~~~~
/usr/src/git/linux-2.6/tools/objtool/include/objtool/warn.h:74:27: note: in expansion of macro ¡__WARN¢
   74 | #define WARN(format, ...) __WARN(WARN_STR, format, ##__VA_ARGS__)
      |                           ^~~~~~
check.c:4713:33: note: in expansion of macro ¡WARN¢
 4713 |                                 WARN("section %s has absolute relocation at offset 0x%lx",
      |                                 ^~~~

Fixes: 0d6e4563fc ("objtool: Add action to check for absence of absolute relocations")
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://patch.msgid.link/1ac32fff-2e67-5155-f570-69aad5bf5412@redhat.com
2025-10-22 15:21:55 +02:00
Miguel Ojeda
dbdf2a7feb objtool/rust: add one more noreturn Rust function
Between Rust 1.79 and 1.86, under `CONFIG_RUST_KERNEL_DOCTESTS=y`,
`objtool` may report:

    rust/doctests_kernel_generated.o: warning: objtool:
    rust_doctest_kernel_alloc_kbox_rs_13() falls through to next
    function rust_doctest_kernel_alloc_kvec_rs_0()

(as well as in rust_doctest_kernel_alloc_kvec_rs_0) due to calls to the
`noreturn` symbol:

    core::option::expect_failed

from code added in commits 779db37373 ("rust: alloc: kvec: implement
AsPageIter for VVec") and 671618432f ("rust: alloc: kbox: implement
AsPageIter for VBox").

Thus add the mangled one to the list so that `objtool` knows it is
actually `noreturn`.

This can be reproduced as well in other versions by tweaking the code,
such as the latest stable Rust (1.90.0).

Stable does not have code that triggers this, but it could have it in
the future. Downstream forks could too. Thus tag it for backport.

See commit 56d680dd23 ("objtool/rust: list `noreturn` Rust functions")
for more details.

Signed-off-by: Miguel Ojeda <ojeda@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Alice Ryhl <aliceryhl@google.com>
Cc: stable@vger.kernel.org # Needed in 6.12.y and later.
Link: https://patch.msgid.link/20251020020714.2511718-1-ojeda@kernel.org
2025-10-22 15:21:54 +02:00
Matthieu Baerts (NGI0)
a9649dfbe5 selftests: mptcp: join: mark laminar tests as skipped if not supported
The call to 'continue_if' was missing: it properly marks a subtest as
'skipped' if the attached condition is not valid.

Without that, the test is wrongly marked as passed on older kernels.

Fixes: c912f935a5 ("selftests: mptcp: join: validate new laminar endp")
Reviewed-by: Geliang Tang <geliang@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20251020-net-mptcp-c-flag-late-add-addr-v1-5-8207030cb0e8@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-10-21 17:36:47 -07:00
Matthieu Baerts (NGI0)
c3496c052a selftests: mptcp: join: mark 'delete re-add signal' as skipped if not supported
The call to 'continue_if' was missing: it properly marks a subtest as
'skipped' if the attached condition is not valid.

Without that, the test is wrongly marked as passed on older kernels.

Fixes: b5e2fb832f ("selftests: mptcp: add explicit test case for remove/readd")
Cc: stable@vger.kernel.org
Reviewed-by: Geliang Tang <geliang@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20251020-net-mptcp-c-flag-late-add-addr-v1-4-8207030cb0e8@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-10-21 17:36:46 -07:00
Matthieu Baerts (NGI0)
973f80d715 selftests: mptcp: join: mark implicit tests as skipped if not supported
The call to 'continue_if' was missing: it properly marks a subtest as
'skipped' if the attached condition is not valid.

Without that, the test is wrongly marked as passed on older kernels.

Fixes: 36c4127ae8 ("selftests: mptcp: join: skip implicit tests if not supported")
Cc: stable@vger.kernel.org
Reviewed-by: Geliang Tang <geliang@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20251020-net-mptcp-c-flag-late-add-addr-v1-3-8207030cb0e8@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-10-21 17:36:46 -07:00
Matthieu Baerts (NGI0)
d68460bc31 selftests: mptcp: join: mark 'flush re-add' as skipped if not supported
The call to 'continue_if' was missing: it properly marks a subtest as
'skipped' if the attached condition is not valid.

Without that, the test is wrongly marked as passed on older kernels.

Fixes: e06959e9ee ("selftests: mptcp: join: test for flush/re-add endpoints")
Cc: stable@vger.kernel.org
Reviewed-by: Geliang Tang <geliang@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20251020-net-mptcp-c-flag-late-add-addr-v1-2-8207030cb0e8@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-10-21 17:36:46 -07:00
Xin Long
a73ca0449b selftests: net: fix server bind failure in sctp_vrf.sh
sctp_vrf.sh could fail:

  TEST 12: bind vrf-2 & 1 in server, connect from client 1 & 2, N [FAIL]
  not ok 1 selftests: net: sctp_vrf.sh # exit=3

The failure happens when the server bind in a new run conflicts with an
existing association from the previous run:

[1] ip netns exec $SERVER_NS ./sctp_hello server ...
[2] ip netns exec $CLIENT_NS ./sctp_hello client ...
[3] ip netns exec $SERVER_NS pkill sctp_hello ...
[4] ip netns exec $SERVER_NS ./sctp_hello server ...

It occurs if the client in [2] sends a message and closes immediately.
With the message unacked, no SHUTDOWN is sent. Killing the server in [3]
triggers a SHUTDOWN the client also ignores due to the unacked message,
leaving the old association alive. This causes the bind at [4] to fail
until the message is acked and the client responds to a second SHUTDOWN
after the server’s T2 timer expires (3s).

This patch fixes the issue by preventing the client from sending data.
Instead, the client blocks on recv() and waits for the server to close.
It also waits until both the server and the client sockets are fully
released in stop_server and wait_client before restarting.

Additionally, replace 2>&1 >/dev/null with -q in sysctl and grep, and
drop other redundant 2>&1 >/dev/null redirections, and fix a typo from
N to Y (connect successfully) in the description of the last test.

Fixes: a61bd7b9fe ("selftests: add a selftest for sctp vrf")
Reported-by: Hangbin Liu <liuhangbin@gmail.com>
Tested-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Link: https://patch.msgid.link/be2dacf52d0917c4ba5e2e8c5a9cb640740ad2b6.1760731574.git.lucien.xin@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-10-20 16:41:33 -07:00
Nicolin Chen
b09ed52db1 iommufd/selftest: Fix ioctl return value in _test_cmd_trigger_vevents()
The ioctl returns 0 upon success, so !0 returning -1 breaks the selftest.

Drop the '!' to fix it.

Fixes: 1d235d8494 ("iommu/selftest: prevent use of uninitialized variable")
Link: https://patch.msgid.link/r/20251014214847.1113759-1-nicolinc@nvidia.com
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2025-10-20 20:01:23 -03:00
Linus Torvalds
6548d364a3 cgroup: Fixes for v6.18-rc2
- Fix seqcount lockdep assertion failure in cgroup freezer on PREEMPT_RT.
   Plain seqcount_t expects preemption disabled, but PREEMPT_RT spinlocks
   don't disable preemption. Switch to seqcount_spinlock_t to properly
   associate css_set_lock with the freeze timing seqcount.
 
 - Misc changes including kernel-doc warning fix for misc_res_type enum and
   improved selftest diagnostics.
 -----BEGIN PGP SIGNATURE-----
 
 iIQEABYKACwWIQTfIjM1kS57o3GsC/uxYfJx3gVYGQUCaPZ7wg4cdGpAa2VybmVs
 Lm9yZwAKCRCxYfJx3gVYGTiuAP4swnb+BmcqOu37rxIuKXS+6FwRb0Om2wBs4m5W
 bgASMAEAva+DWpWx0l8GFV4kf14wRV6rYlkMRcP3KSyqXNLmLg4=
 =8A4u
 -----END PGP SIGNATURE-----

Merge tag 'cgroup-for-6.18-rc2-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup

Pull cgroup fixes from Tejun Heo:

 - Fix seqcount lockdep assertion failure in cgroup freezer on
   PREEMPT_RT.

   Plain seqcount_t expects preemption disabled, but PREEMPT_RT
   spinlocks don't disable preemption. Switch to seqcount_spinlock_t to
   properly associate css_set_lock with the freeze timing seqcount.

 - Misc changes including kernel-doc warning fix for misc_res_type enum
   and improved selftest diagnostics.

* tag 'cgroup-for-6.18-rc2-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
  cgroup/misc: fix misc_res_type kernel-doc warning
  selftests: cgroup: Use values_close_report in test_cpu
  selftests: cgroup: add values_close_report helper
  cgroup: Fix seqcount lockdep assertion in cgroup freezer
2025-10-20 09:41:27 -10:00
Linus Torvalds
2953fb6548 hid-for-linus-2025101701
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEL65usyKPHcrRDEicpmLzj2vtYEkFAmjyYSUACgkQpmLzj2vt
 YEnGlxAAukp9PohVJYj5XXfIZXd6nl9wQyTLfVxCaw03R4IBSwq6/SpbSFP3Khr0
 nR6wss6GdpiHGc8WA7Yo3EIZGsZtSolaOrg2pozKop24FD42S5bgq/qWeUNERhGF
 LU6+zpZbYhZ5Joo2Q57vRZ4D9UDYg7CoU1jVAkscEuvQhishLFKpG88VE9Jy/2Eu
 SGQ9Aqq/0NAC/msUHtVWn90EiLIQBS4izWO02Js6JObf8mdisH1MQscOeJ2rfJhV
 7WcdMX/LkqL7DQ9dFFo3j5NqE3SF1aJFjhMY+xivQaYeuwPSffCuWrbVq6TM74hP
 uRfIII1YUYAEyh+69KgQbQLGl0OqZbDhtEZ5p2SCn3DaMwXu+sraxfWRQ8nkG55w
 Gw74ibc3F9pUwwls9i7FUJE3CzoNBcN/FdxvWST5izni4rC2XAC7pzeSmyO3jiQO
 Zvpwa0VkaOeZ1OmVqo4YQBZoNlgBIaBPxqtoHfHxwvEvqUmivSzU+//XFGbvXgK1
 XL8hgYd7P1P6g9UVSvsgiVA6FIdR4EtpbOV3YxQpMVv8uAOCN5omeOAd9jRImN2a
 BxpKr5bcB4JhZcji/E4T4IPegA+8LUniZ6AW5KIUCCz34uDyeDSHsWB+cZiy31zc
 gGa0NqGRfhKOzqj9ExnnRDoZWyhnoHZ3yxUb41TPQfS2p5uZepM=
 =UpoT
 -----END PGP SIGNATURE-----

Merge tag 'hid-for-linus-2025101701' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid

Pull HID fixes from Jiri Kosina:

 - fix for sticky fingers handling in hid-multitouch (Benjamin
   Tissoires)

 - fix for reporting of 0 battery levels (Dmitry Torokhov)

 - build fix for hid-haptic in certain configurations (Jonathan Denose)

 - improved probe and avoiding spamming kernel log by hid-nintendo
   (Vicki Pfau)

 - fix for OOB in hid-cp2112 (Deepak Sharma)

 - interrupt handling fix for intel-thc-hid (Even Xu)

 - a couple of new device IDs and device-specific quirks

* tag 'hid-for-linus-2025101701' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid:
  HID: logitech-hidpp: Add HIDPP_QUIRK_RESET_HI_RES_SCROLL
  selftests/hid: add tests for missing release on the Dell Synaptics
  HID: multitouch: fix sticky fingers
  HID: multitouch: fix name of Stylus input devices
  HID: hid-input: only ignore 0 battery events for digitizers
  HID: hid-debug: Fix spelling mistake "Rechargable" -> "Rechargeable"
  HID: Kconfig: Fix build error from CONFIG_HID_HAPTIC
  HID: nintendo: Rate limit IMU compensation message
  HID: nintendo: Wait longer for initial probe
  HID: core: Add printk_ratelimited variants to hid_warn() etc
  HID: quirks: Add ALWAYS_POLL quirk for VRS R295 steering wheel
  HID: quirks: avoid Cooler Master MM712 dongle wakeup bug
  HID: cp2112: Add parameter validation to data length
  HID: intel-thc-hid: intel-quickspi: Add ARL PCI Device Id's
  HID: intel-thc-hid: Intel-quickspi: switch first interrupt from level to edge detection
  HID: intel-thc-hid: intel-quicki2c: Fix wrong type casting
2025-10-18 08:18:18 -10:00
Linus Torvalds
d303caf5ca bpf-fixes
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEE+soXsSLHKoYyzcli6rmadz2vbToFAmjykZUACgkQ6rmadz2v
 bTppcA/+LKOzgJVJd9q3tbPeXb+xJOqiCGPDuv3tfmglHxD+rR5n2lsoTkryL4PZ
 k5yJhA95Z5RVeV3cevl8n4HlTJipm795k+fKz1YRbe9gB49w2SqqDf5s7LYuABOm
 YqdUSCWaLyoBNd+qi9SuyOVXSg0mSbJk0bEsXsgTp/5rUt9v6cz+36BE1Pkcp3Aa
 d6y2I2MBRGCADtEmsXh7+DI0aUvgMi2zlX8veQdfAJZjsQFwbr1ZxSxHYqtS6gux
 wueEiihzipakhONACJskQbRv80NwT5/VmrAI/ZRVzIhsywQhDGdXtQVNs6700/oq
 QmIZtgAL17Y0SAyhzQsQuGhJGKdWKhf3hzDKEDPslmyJ6OCpMAs/ttPHTJ6s/MmG
 6arSwZD/GAIoWhvWYP/zxTdmjqKX13uradvaNTv55hOhCLTTnZQKRSLk1NabHl7e
 V0f7NlZaVPnLerW/90+pn5pZFSrhk0Nno9to+yXaQ9TYJlK4e6dcB9y+rButQNrr
 7bTRyQ55fQlyS+NnaD85wL41IIX4WmJ3ATdrKZIMvGMJaZxjzXRvW4AjGHJ6hbbt
 GATdtISkYqZ4AdlSf2Vj9BysZkf7tS83SyRlM6WDm3IaAS3v5E/e1Ky2Kn6gWu70
 MNcSW/O0DSqXRkcDkY/tSOlYJBZJYo6ZuWhNdQAERA3OxldBSFM=
 =p8bI
 -----END PGP SIGNATURE-----

Merge tag 'bpf-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf

Pull bpf fixes from Alexei Starovoitov:

 - Replace bpf_map_kmalloc_node() with kmalloc_nolock() to fix kmemleak
   imbalance in tracking of bpf_async_cb structures (Alexei Starovoitov)

 - Make selftests/bpf arg_parsing.c more robust to errors (Andrii
   Nakryiko)

 - Fix redefinition of 'off' as different kind of symbol when I40E
   driver is builtin (Brahmajit Das)

 - Do not disable preemption in bpf_test_run (Sahil Chandna)

 - Fix memory leak in __lookup_instance error path (Shardul Bankar)

 - Ensure test data is flushed to disk before reading it (Xing Guo)

* tag 'bpf-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf:
  selftests/bpf: Fix redefinition of 'off' as different kind of symbol
  bpf: Do not disable preemption in bpf_test_run().
  bpf: Fix memory leak in __lookup_instance error path
  selftests: arg_parsing: Ensure data is flushed to disk before reading.
  bpf: Replace bpf_map_kmalloc_node() with kmalloc_nolock() to allocate bpf_async_cb structures.
  selftests/bpf: make arg_parsing.c more robust to crashes
  bpf: test_run: Fix ctx leak in bpf_prog_test_run_xdp error path
2025-10-18 08:00:43 -10:00
Linus Torvalds
02e5f74ef0 ARM:
- Fix the handling of ZCR_EL2 in NV VMs
 
 - Pick the correct translation regime when doing a PTW on
   the back of a SEA
 
 - Prevent userspace from injecting an event into a vcpu that isn't
   initialised yet
 
 - Move timer save/restore to the sysreg handling code, fixing EL2 timer
   access in the process
 
 - Add FGT-based trapping of MDSCR_EL1 to reduce the overhead of debug
 
 - Fix trapping configuration when the host isn't GICv3
 
 - Improve the detection of HCR_EL2.E2H being RES1
 
 - Drop a spurious 'break' statement in the S1 PTW
 
 - Don't try to access SPE when owned by EL3
 
 Documentation updates:
 
 - Document the failure modes of event injection
 
 - Document that a GICv3 guest can be created on a GICv5 host
   with FEAT_GCIE_LEGACY
 
 Selftest improvements:
 
 - Add a selftest for the effective value of HCR_EL2.AMO
 
 - Address build warning in the timer selftest when building with clang
 
 - Teach irqfd selftests about non-x86 architectures
 
 - Add missing sysregs to the set_id_regs selftest
 
 - Fix vcpu allocation in the vgic_lpi_stress selftest
 
 - Correctly enable interrupts in the vgic_lpi_stress selftest
 
 x86:
 
 - Expand the KVM_PRE_FAULT_MEMORY selftest to add a regression test for the
   bug fixed by commit 3ccbf6f470 ("KVM: x86/mmu: Return -EAGAIN if userspace
   deletes/moves memslot during prefault")
 
 - Don't try to get PMU capabilities from perf when running a CPU with hybrid
   CPUs/PMUs, as perf will rightly WARN.
 
 guest_memfd:
 
 - Rework KVM_CAP_GUEST_MEMFD_MMAP (newly introduced in 6.18) into a more
   generic KVM_CAP_GUEST_MEMFD_FLAGS
 
 - Add a guest_memfd INIT_SHARED flag and require userspace to explicitly set
   said flag to initialize memory as SHARED, irrespective of MMAP.  The
   behavior merged in 6.18 is that enabling mmap() implicitly initializes
   memory as SHARED, which would result in an ABI collision for x86 CoCo VMs
   as their memory is currently always initialized PRIVATE.
 
 - Allow mmap() on guest_memfd for x86 CoCo VMs, i.e. on VMs with private
   memory, to enable testing such setups, i.e. to hopefully flush out any
   other lurking ABI issues before 6.18 is officially released.
 
 - Add testcases to the guest_memfd selftest to cover guest_memfd without MMAP,
   and host userspace accesses to mmap()'d private memory.
 -----BEGIN PGP SIGNATURE-----
 
 iQFIBAABCgAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmjzqVIUHHBib256aW5p
 QHJlZGhhdC5jb20ACgkQv/vSX3jHroO+qQgArc7XXmoiHQfTmdqbFL+1ipzfqd/c
 SHJghONWVNKaSm0EsH72iEokmUyI8HssllaBuaGEAT/1F6YmRFwSSFgUG+N02rah
 pL5ShCG2fPVxHal9ZJ04M4DYWPPClmcE2myfQ6k9kwcMgCRK2BdSRRnKH3XfOKrY
 jAFNZVBCeODcnSvjOyxK2QFEt7J97H1AoAxOORvdqFmRqVIEQNJA/3Hx51wPfkwD
 UnCQiNaPinDMxuuwvcmlYsIrQhGaqO4de1Kx0A4ZkSQqFUcyhvB6Qa+DoApz/IBw
 qsFLqoR/1XXJ90wxutSTFzfjHM/SU6fhj57Cl9dAHI3pgnssC1iUvEt9Iw==
 =dvAj
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull kvm fixes from Paolo Bonzini:
 "ARM:

   - Fix the handling of ZCR_EL2 in NV VMs

   - Pick the correct translation regime when doing a PTW on the back of
     a SEA

   - Prevent userspace from injecting an event into a vcpu that isn't
     initialised yet

   - Move timer save/restore to the sysreg handling code, fixing EL2
     timer access in the process

   - Add FGT-based trapping of MDSCR_EL1 to reduce the overhead of debug

   - Fix trapping configuration when the host isn't GICv3

   - Improve the detection of HCR_EL2.E2H being RES1

   - Drop a spurious 'break' statement in the S1 PTW

   - Don't try to access SPE when owned by EL3

  Documentation updates:

   - Document the failure modes of event injection

   - Document that a GICv3 guest can be created on a GICv5 host with
     FEAT_GCIE_LEGACY

  Selftest improvements:

   - Add a selftest for the effective value of HCR_EL2.AMO

   - Address build warning in the timer selftest when building with
     clang

   - Teach irqfd selftests about non-x86 architectures

   - Add missing sysregs to the set_id_regs selftest

   - Fix vcpu allocation in the vgic_lpi_stress selftest

   - Correctly enable interrupts in the vgic_lpi_stress selftest

  x86:

   - Expand the KVM_PRE_FAULT_MEMORY selftest to add a regression test
     for the bug fixed by commit 3ccbf6f470 ("KVM: x86/mmu: Return
     -EAGAIN if userspace deletes/moves memslot during prefault")

   - Don't try to get PMU capabilities from perf when running a CPU with
     hybrid CPUs/PMUs, as perf will rightly WARN.

  guest_memfd:

   - Rework KVM_CAP_GUEST_MEMFD_MMAP (newly introduced in 6.18) into a
     more generic KVM_CAP_GUEST_MEMFD_FLAGS

   - Add a guest_memfd INIT_SHARED flag and require userspace to
     explicitly set said flag to initialize memory as SHARED,
     irrespective of MMAP.

     The behavior merged in 6.18 is that enabling mmap() implicitly
     initializes memory as SHARED, which would result in an ABI
     collision for x86 CoCo VMs as their memory is currently always
     initialized PRIVATE.

   - Allow mmap() on guest_memfd for x86 CoCo VMs, i.e. on VMs with
     private memory, to enable testing such setups, i.e. to hopefully
     flush out any other lurking ABI issues before 6.18 is officially
     released.

   - Add testcases to the guest_memfd selftest to cover guest_memfd
     without MMAP, and host userspace accesses to mmap()'d private
     memory"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (46 commits)
  arm64: Revamp HCR_EL2.E2H RES1 detection
  KVM: arm64: nv: Use FGT write trap of MDSCR_EL1 when available
  KVM: arm64: Compute per-vCPU FGTs at vcpu_load()
  KVM: arm64: selftests: Fix misleading comment about virtual timer encoding
  KVM: arm64: selftests: Add an E2H=0-specific configuration to get_reg_list
  KVM: arm64: selftests: Make dependencies on VHE-specific registers explicit
  KVM: arm64: Kill leftovers of ad-hoc timer userspace access
  KVM: arm64: Fix WFxT handling of nested virt
  KVM: arm64: Move CNT*CT_EL0 userspace accessors to generic infrastructure
  KVM: arm64: Move CNT*_CVAL_EL0 userspace accessors to generic infrastructure
  KVM: arm64: Move CNT*_CTL_EL0 userspace accessors to generic infrastructure
  KVM: arm64: Add timer UAPI workaround to sysreg infrastructure
  KVM: arm64: Make timer_set_offset() generally accessible
  KVM: arm64: Replace timer context vcpu pointer with timer_id
  KVM: arm64: Introduce timer_context_to_vcpu() helper
  KVM: arm64: Hide CNTHV_*_EL2 from userspace for nVHE guests
  Documentation: KVM: Update GICv3 docs for GICv5 hosts
  KVM: arm64: gic-v3: Only set ICH_HCR traps for v2-on-v3 or v3 guests
  KVM: arm64: selftests: Actually enable IRQs in vgic_lpi_stress
  KVM: arm64: selftests: Allocate vcpus with correct size
  ...
2025-10-18 07:07:14 -10:00
Paolo Bonzini
4361f5aa8b KVM x86 fixes for 6.18:
- Expand the KVM_PRE_FAULT_MEMORY selftest to add a regression test for the
    bug fixed by commit 3ccbf6f470 ("KVM: x86/mmu: Return -EAGAIN if userspace
    deletes/moves memslot during prefault")
 
  - Don't try to get PMU capabbilities from perf when running a CPU with hybrid
    CPUs/PMUs, as perf will rightly WARN.
 
  - Rework KVM_CAP_GUEST_MEMFD_MMAP (newly introduced in 6.18) into a more
    generic KVM_CAP_GUEST_MEMFD_FLAGS
 
  - Add a guest_memfd INIT_SHARED flag and require userspace to explicitly set
    said flag to initialize memory as SHARED, irrespective of MMAP.  The
    behavior merged in 6.18 is that enabling mmap() implicitly initializes
    memory as SHARED, which would result in an ABI collision for x86 CoCo VMs
    as their memory is currently always initialized PRIVATE.
 
  - Allow mmap() on guest_memfd for x86 CoCo VMs, i.e. on VMs with private
    memory, to enable testing such setups, i.e. to hopefully flush out any
    other lurking ABI issues before 6.18 is officially released.
 
  - Add testcases to the guest_memfd selftest to cover guest_memfd without MMAP,
    and host userspace accesses to mmap()'d private memory.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEKTobbabEP7vbhhN9OlYIJqCjN/0FAmjyszoACgkQOlYIJqCj
 N/03+g/9FPoIPxGL9+tJyGahdH2Mygiip8Q3tUTlGVkskp+dplf+6T51ogdqBOkS
 tvlGjccxAOVW73ijn8ox7UtGSY9B1IJ+rj/uhsBOMFQptyUJEv4iEujFKB/t5RIF
 gOTVVR6Z/mcrYJY7F21qBPCHPbz++rEXrfgyAsosz6tpS/nL6vrNdp1LZlcsLM/k
 5DuhkQHLEwJoUXO5VUsBWRa3gRdOuZ+SJGd4C1mfDwXagxe8uFYRO/vraOV9dKJx
 l9RxKPIhf42cfu9tIeJYDJyXIDgzXUEND4smVw/ito1SeakBlC9rQ15ya8ETLAEX
 tHzmj38RPOfjsycGKIRhlLzQx77b+t7FhNdse19QC3p3u9Jn8FuMyFcGZuiaP5kK
 e9Xrp/zcaCNfHB6gGKEud7h7fpV9yB8SNlCsP81if73buq3qHx0+jeVg3jCS4mkb
 zGY7CG+oKJOdTSN8JAh8nJMM4bUv5m8myr6yYGU+SsGBQsyPyqQdRWuKdQyOQIVC
 RZSWXSjKfmT5FwSs0KRI0yUjMeUYbgwrsysFuS3qX62mZpr0vLaPoiYFvuffe9gB
 W3Tt98QNPtBmJdINNncgKvbn3sp/CzUHirygaI0APZwh6QQAkL1Dp2beA1i95uVI
 8uin4zUjRbRjUpMUaEtAQIaVMTVQqgfiPAvtcDCRBBeOGaNr81M=
 =lBtj
 -----END PGP SIGNATURE-----

Merge tag 'kvm-x86-fixes-6.18-rc2' of https://github.com/kvm-x86/linux into HEAD

KVM x86 fixes for 6.18:

 - Expand the KVM_PRE_FAULT_MEMORY selftest to add a regression test for the
   bug fixed by commit 3ccbf6f470 ("KVM: x86/mmu: Return -EAGAIN if userspace
   deletes/moves memslot during prefault")

 - Don't try to get PMU capabbilities from perf when running a CPU with hybrid
   CPUs/PMUs, as perf will rightly WARN.

 - Rework KVM_CAP_GUEST_MEMFD_MMAP (newly introduced in 6.18) into a more
   generic KVM_CAP_GUEST_MEMFD_FLAGS

 - Add a guest_memfd INIT_SHARED flag and require userspace to explicitly set
   said flag to initialize memory as SHARED, irrespective of MMAP.  The
   behavior merged in 6.18 is that enabling mmap() implicitly initializes
   memory as SHARED, which would result in an ABI collision for x86 CoCo VMs
   as their memory is currently always initialized PRIVATE.

 - Allow mmap() on guest_memfd for x86 CoCo VMs, i.e. on VMs with private
   memory, to enable testing such setups, i.e. to hopefully flush out any
   other lurking ABI issues before 6.18 is officially released.

 - Add testcases to the guest_memfd selftest to cover guest_memfd without MMAP,
   and host userspace accesses to mmap()'d private memory.
2025-10-18 10:25:43 +02:00
Jakub Kicinski
e90576829c bpf-next-for-netdev
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQQ6NaUOruQGUkvPdG4raS+Z+3y5EwUCaPFVxwAKCRAraS+Z+3y5
 E1yDAP98nxKLBFb9auLj8AV1zUNl4lKEpOIhH7ceXaDT7ucsagEAwvSbGAN5sun/
 M0+sOpKEZwF8JBoRU8L61uYPYSu7XAk=
 =2K9O
 -----END PGP SIGNATURE-----

Merge tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next

Martin KaFai Lau says:

====================
pull-request: bpf-next 2025-10-16

We've added 6 non-merge commits during the last 1 day(s) which contain
a total of 18 files changed, 577 insertions(+), 38 deletions(-).

The main changes are:

1) Bypass the global per-protocol memory accounting either by setting
   a netns sysctl or using bpf_setsockopt in a bpf program,
   from Kuniyuki Iwashima.

* tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next:
  selftests/bpf: Add test for sk->sk_bypass_prot_mem.
  bpf: Introduce SK_BPF_BYPASS_PROT_MEM.
  bpf: Support bpf_setsockopt() for BPF_CGROUP_INET_SOCK_CREATE.
  net: Introduce net.core.bypass_prot_mem sysctl.
  net: Allow opt-out from global protocol memory accounting.
  tcp: Save lock_sock() for memcg in inet_csk_accept().
====================

Link: https://patch.msgid.link/20251016204539.773707-1-martin.lau@linux.dev
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-10-17 17:20:42 -07:00
Carlos Llamas
f578ff4c53 selftests/net: io_uring: fix unknown errnum values
The io_uring functions return negative error values, but error() expects
these to be positive to properly match them to an errno string. Fix this
to make sure the correct error descriptions are displayed upon failure.

Signed-off-by: Carlos Llamas <cmllamas@google.com>
Link: https://patch.msgid.link/20251016182538.3790567-1-cmllamas@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-10-17 16:57:53 -07:00
Brahmajit Das
a1e83d4c03 selftests/bpf: Fix redefinition of 'off' as different kind of symbol
This fixes the following build error

   CLNG-BPF [test_progs] verifier_global_ptr_args.bpf.o
progs/verifier_global_ptr_args.c:228:5: error: redefinition of 'off' as
different kind of symbol
   228 | u32 off;
       |     ^

The symbol 'off' was previously defined in
tools/testing/selftests/bpf/tools/include/vmlinux.h, which includes an
enum i40e_ptp_gpio_pin_state from
drivers/net/ethernet/intel/i40e/i40e_ptp.c:

	enum i40e_ptp_gpio_pin_state {
		end = -2,
		invalid = -1,
		off = 0,
		in_A = 1,
		in_B = 2,
		out_A = 3,
		out_B = 4,
	};

This enum is included when CONFIG_I40E is enabled. As of commit
032676ff82 ("LoongArch: Update Loongson-3 default config file"),
CONFIG_I40E is set in the defconfig, which leads to the conflict.

Renaming the local variable avoids the redefinition and allows the
build to succeed.

Suggested-by: Yonghong Song <yonghong.song@linux.dev>
Signed-off-by: Brahmajit Das <listout@listout.xyz>
Acked-by: Yonghong Song <yonghong.song@linux.dev>
Link: https://lore.kernel.org/r/20251017171551.53142-1-listout@listout.xyz
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-10-17 11:33:23 -07:00
Eric Dumazet
56cef47c28 selftests/net: packetdrill: unflake tcp_user_timeout_user-timeout-probe.pkt
This test fails the first time I am running it after a fresh virtme-ng boot.

tcp_user_timeout_user-timeout-probe.pkt:33: runtime error in write call: Expected result -1 but got 24 with errno 2 (No such file or directory)

Tweaks the timings a bit, to reduce flakiness.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Soham Chakradeo <sohamch@google.com>
Cc: Willem de Bruijn <willemb@google.com>
Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com>
Tested-by: Jamal Hadi Salim <jhs@mojatatu.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Link: https://patch.msgid.link/20251014171907.3554413-2-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-10-16 16:25:09 -07:00
Kuniyuki Iwashima
5f941dd87b selftests/bpf: Add test for sk->sk_bypass_prot_mem.
The test does the following for IPv4/IPv6 x TCP/UDP sockets
with/without sk->sk_bypass_prot_mem, which can be turned on by
net.core.bypass_prot_mem or bpf_setsockopt(SK_BPF_BYPASS_PROT_MEM).

  1. Create socket pairs
  2. Send NR_PAGES (32) of data (TCP consumes around 35 pages,
     and UDP consuems 66 pages due to skb overhead)
  3. Read memory_allocated from sk->sk_prot->memory_allocated and
     sk->sk_prot->memory_per_cpu_fw_alloc
  4. Check if unread data is charged to memory_allocated

If sk->sk_bypass_prot_mem is set, memory_allocated should not be
changed, but we allow a small error (up to 10 pages) in case
other processes on the host use some amounts of TCP/UDP memory.

The amount of allocated pages are buffered to per-cpu variable
{tcp,udp}_memory_per_cpu_fw_alloc up to +/- net.core.mem_pcpu_rsv
before reported to {tcp,udp}_memory_allocated.

At 3., memory_allocated is calculated from the 2 variables at
fentry of socket create function.

We drain the receive queue only for UDP before close() because UDP
recv queue is destroyed after RCU grace period.  When I printed
memory_allocated, UDP bypass cases sometimes saw the no-bypass
case's leftover, but it's still in the small error range (<10 pages).

  bpf_trace_printk: memory_allocated: 0   <-- TCP no-bypass
  bpf_trace_printk: memory_allocated: 35
  bpf_trace_printk: memory_allocated: 0   <-- TCP w/ sysctl
  bpf_trace_printk: memory_allocated: 0
  bpf_trace_printk: memory_allocated: 0   <-- TCP w/ bpf
  bpf_trace_printk: memory_allocated: 0
  bpf_trace_printk: memory_allocated: 0   <-- UDP no-bypass
  bpf_trace_printk: memory_allocated: 66
  bpf_trace_printk: memory_allocated: 2   <-- UDP w/ sysctl (2 pages leftover)
  bpf_trace_printk: memory_allocated: 2
  bpf_trace_printk: memory_allocated: 2   <-- UDP w/ bpf (2 pages leftover)
  bpf_trace_printk: memory_allocated: 2

We prefer finishing tests faster than oversleeping for call_rcu()
 + sk_destruct().

The test completes within 2s on QEMU (64 CPUs) w/ KVM.

  # time ./test_progs -t sk_bypass
  #371/1   sk_bypass_prot_mem/TCP  :OK
  #371/2   sk_bypass_prot_mem/UDP  :OK
  #371/3   sk_bypass_prot_mem/TCPv6:OK
  #371/4   sk_bypass_prot_mem/UDPv6:OK
  #371     sk_bypass_prot_mem:OK
  Summary: 1/4 PASSED, 0 SKIPPED, 0 FAILED

  real	0m1.481s
  user	0m0.181s
  sys	0m0.441s

Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Acked-by: Roman Gushchin <roman.gushchin@linux.dev>
Link: https://patch.msgid.link/20251014235604.3057003-7-kuniyu@google.com
2025-10-16 12:04:47 -07:00
Kuniyuki Iwashima
38163af068 bpf: Introduce SK_BPF_BYPASS_PROT_MEM.
If a socket has sk->sk_bypass_prot_mem flagged, the socket opts out
of the global protocol memory accounting.

This is easily controlled by net.core.bypass_prot_mem sysctl, but it
lacks flexibility.

Let's support flagging (and clearing) sk->sk_bypass_prot_mem via
bpf_setsockopt() at the BPF_CGROUP_INET_SOCK_CREATE hook.

  int val = 1;

  bpf_setsockopt(ctx, SOL_SOCKET, SK_BPF_BYPASS_PROT_MEM,
                 &val, sizeof(val));

As with net.core.bypass_prot_mem, this is inherited to child sockets,
and BPF always takes precedence over sysctl at socket(2) and accept(2).

SK_BPF_BYPASS_PROT_MEM is only supported at BPF_CGROUP_INET_SOCK_CREATE
and not supported on other hooks for some reasons:

  1. UDP charges memory under sk->sk_receive_queue.lock instead
     of lock_sock()

  2. Modifying the flag after skb is charged to sk requires such
     adjustment during bpf_setsockopt() and complicates the logic
     unnecessarily

We can support other hooks later if a real use case justifies that.

Most changes are inline and hard to trace, but a microbenchmark on
__sk_mem_raise_allocated() during neper/tcp_stream showed that more
samples completed faster with sk->sk_bypass_prot_mem == 1.  This will
be more visible under tcp_mem pressure (but it's not a fair comparison).

  # bpftrace -e 'kprobe:__sk_mem_raise_allocated { @start[tid] = nsecs; }
    kretprobe:__sk_mem_raise_allocated /@start[tid]/
    { @end[tid] = nsecs - @start[tid]; @times = hist(@end[tid]); delete(@start[tid]); }'
  # tcp_stream -6 -F 1000 -N -T 256

Without bpf prog:

  [128, 256)          3846 |                                                    |
  [256, 512)       1505326 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@|
  [512, 1K)        1371006 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@     |
  [1K, 2K)          198207 |@@@@@@                                              |
  [2K, 4K)           31199 |@                                                   |

With bpf prog in the next patch:
  (must be attached before tcp_stream)
  # bpftool prog load sk_bypass_prot_mem.bpf.o /sys/fs/bpf/test type cgroup/sock_create
  # bpftool cgroup attach /sys/fs/cgroup/test cgroup_inet_sock_create pinned /sys/fs/bpf/test

  [128, 256)          6413 |                                                    |
  [256, 512)       1868425 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@|
  [512, 1K)        1101697 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@                      |
  [1K, 2K)          117031 |@@@@                                                |
  [2K, 4K)           11773 |                                                    |

Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Acked-by: Roman Gushchin <roman.gushchin@linux.dev>
Link: https://patch.msgid.link/20251014235604.3057003-6-kuniyu@google.com
2025-10-16 12:04:47 -07:00
Linus Torvalds
634ec1fc79 Including fixes from CAN
Current release - regressions:
 
   - udp: do not use skb_release_head_state() before skb_attempt_defer_free()
 
   - gro_cells: use nested-BH locking for gro_cell
 
   - dpll: zl3073x: increase maximum size of flash utility
 
 Previous releases - regressions:
 
   - core: fix lockdep splat on device unregister
 
   - tcp: fix tcp_tso_should_defer() vs large RTT
 
   - tls:
     - don't rely on tx_work during send()
     - wait for pending async decryptions if tls_strp_msg_hold fails
 
   - can: j1939: add missing calls in NETDEV_UNREGISTER notification handler
 
   - eth: lan78xx: fix lost EEPROM write timeout in lan78xx_write_raw_eeprom
 
 Previous releases - always broken:
 
   - ip6_tunnel: prevent perpetual tunnel growth
 
   - dpll: zl3073x: handle missing or corrupted flash configuration
 
   - can: m_can: fix pm_runtime and CAN state handling
 
   - eth:ixgbe: fix too early devlink_free() in ixgbe_remove()
 
   - eth: ixgbevf: fix mailbox API compatibility
 
   - eth: gve: Check valid ts bit on RX descriptor before hw timestamping
 
   - eth: idpf: cleanup remaining SKBs in PTP flows
 
   - eth: r8169: fix packet truncation after S4 resume on RTL8168H/RTL8111H
 
 Signed-off-by: Paolo Abeni <pabeni@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 
 iQJGBAABCAAwFiEEg1AjqC77wbdLX2LbKSR5jcyPE6QFAmjxBUASHHBhYmVuaUBy
 ZWRoYXQuY29tAAoJECkkeY3MjxOkTfwP/0I5aKZHbZ9I3gHL6Dw9TtcYPsP8cZDK
 Iu7nOtgyD8AKSnX+3ynABg4K3WbE6HFi1BQzv63uO9G2UYLZSMdU16uytUKbstR6
 HWw+r2N0NMb+qacb++nbMugmYczSWUYKRyrpWHrNtqbKi3aQRJt5lQ81AGwgMITX
 mx8VILjMTXIt/K055LVnF9z2c4ltFmLTGiODKqHjQQirNbBiRJFNYn4qtw3BkbdS
 OwxNtTqOozQppRf+1l4MRHHXNbwGe7W9hOnALbiHpwtXxvG/+NVRu8IkNcD8vMD9
 I7aThpkOt7B9xcWqLb2pAAk976MuKQr6BUGKQrP26vZb9W0NIJbew1Y8d74Zld5d
 Z9bD5Uqr85Ge4Xp3/mWnaQZWk2OTsVdOOxkUoPWilFByULGfJORMvu6+y6/RWpZt
 dezE7WL9o+K1AJLBWnZ3RTWI8vmSEDup4WzPSP1v7paAFcvb9Ub4pBghX9+RGCfE
 ZmCecQPZfVzTe08PvAysc9VR7o3jGx5FpCLhinLxaHaSrV4YvALyV8iYagrmtQWY
 iWVkMYEIxh5B6KLmIVVu5cQJJjuRLm6k+nCTemBJaAItu2YLdy4ZeLeFWUhVgmxM
 SIsueBmTGorRqCbSed7L17yMApsMaBzE775+qOFoK7tLoKVTm22AMA3DzQtKGeIv
 OFlNreoOWzMX
 =jVlx
 -----END PGP SIGNATURE-----

Merge tag 'net-6.18-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net

Pull networking fixes from Paolo Abeni:
 "Including fixes from CAN

  Current release - regressions:

    - udp: do not use skb_release_head_state() before
      skb_attempt_defer_free()

    - gro_cells: use nested-BH locking for gro_cell

    - dpll: zl3073x: increase maximum size of flash utility

  Previous releases - regressions:

    - core: fix lockdep splat on device unregister

    - tcp: fix tcp_tso_should_defer() vs large RTT

    - tls:
        - don't rely on tx_work during send()
        - wait for pending async decryptions if tls_strp_msg_hold fails

    - can: j1939: add missing calls in NETDEV_UNREGISTER notification
      handler

    - eth: lan78xx: fix lost EEPROM write timeout in
      lan78xx_write_raw_eeprom

  Previous releases - always broken:

    - ip6_tunnel: prevent perpetual tunnel growth

    - dpll: zl3073x: handle missing or corrupted flash configuration

    - can: m_can: fix pm_runtime and CAN state handling

    - eth:
        - ixgbe: fix too early devlink_free() in ixgbe_remove()
        - ixgbevf: fix mailbox API compatibility
        - gve: Check valid ts bit on RX descriptor before hw timestamping
        - idpf: cleanup remaining SKBs in PTP flows
        - r8169: fix packet truncation after S4 resume on RTL8168H/RTL8111H"

* tag 'net-6.18-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (50 commits)
  udp: do not use skb_release_head_state() before skb_attempt_defer_free()
  net: usb: lan78xx: fix use of improperly initialized dev->chipid in lan78xx_reset
  netdevsim: set the carrier when the device goes up
  selftests: tls: add test for short splice due to full skmsg
  selftests: net: tls: add tests for cmsg vs MSG_MORE
  tls: don't rely on tx_work during send()
  tls: wait for pending async decryptions if tls_strp_msg_hold fails
  tls: always set record_type in tls_process_cmsg
  tls: wait for async encrypt in case of error during latter iterations of sendmsg
  tls: trim encrypted message to match the plaintext on short splice
  tg3: prevent use of uninitialized remote_adv and local_adv variables
  MAINTAINERS: new entry for IPv6 IOAM
  gve: Check valid ts bit on RX descriptor before hw timestamping
  net: core: fix lockdep splat on device unregister
  MAINTAINERS: add myself as maintainer for b53
  selftests: net: check jq command is supported
  net: airoha: Take into account out-of-order tx completions in airoha_dev_xmit()
  tcp: fix tcp_tso_should_defer() vs large RTT
  r8152: add error handling in rtl8152_driver_init
  usbnet: Fix using smp_processor_id() in preemptible code warnings
  ...
2025-10-16 09:41:21 -07:00
Xing Guo
0c1999ed33 selftests: arg_parsing: Ensure data is flushed to disk before reading.
test_parse_test_list_file writes some data to
/tmp/bpf_arg_parsing_test.XXXXXX and parse_test_list_file() will read
the data back.  However, after writing data to that file, we forget to
call fsync() and it's causing testing failure in my laptop.  This patch
helps fix it by adding the missing fsync() call.

Fixes: 64276f01dc ("selftests/bpf: Test_progs can read test lists from file")
Signed-off-by: Xing Guo <higuoxing@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20251016035330.3217145-1-higuoxing@gmail.com
2025-10-16 09:34:39 -07:00
Sabrina Dubroca
3667e9b442 selftests: tls: add test for short splice due to full skmsg
We don't have a test triggering a partial splice caused by a full
skmsg. Add one, based on a program by Jann Horn.

Use MAX_FRAGS=48 to make sure the skmsg will be full for any allowed
value of CONFIG_MAX_SKB_FRAGS (17..45).

Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Link: https://patch.msgid.link/1d129a15f526ea3602f3a2b368aa0b6f7e0d35d5.1760432043.git.sd@queasysnail.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-10-15 17:41:46 -07:00
Sabrina Dubroca
f95fce1e95 selftests: net: tls: add tests for cmsg vs MSG_MORE
We don't have a test to check that MSG_MORE won't let us merge records
of different types across sendmsg calls.

Add new tests that check:
 - MSG_MORE is only allowed for DATA records
 - a pending DATA record gets closed and pushed before a non-DATA
   record is processed

Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Link: https://patch.msgid.link/b34feeadefe8a997f068d5ed5617afd0072df3c0.1760432043.git.sd@queasysnail.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-10-15 17:41:45 -07:00
Benjamin Tissoires
d9b3014a7f selftests/hid: add tests for missing release on the Dell Synaptics
Add a simple test for the corner case not currently covered by the
sticky fingers quirk. Because it's a corner case test, we only test this
on a couple of devices, not on all of them because the value of adding
the same test over and over is rather moot.

Signed-off-by: Benjamin Tissoires <bentiss@kernel.org>
Signed-off-by: Jiri Kosina <jkosina@suse.com>
2025-10-15 17:27:06 +02:00
Sebastian Chlad
4cdde87d72 selftests: cgroup: Use values_close_report in test_cpu
Convert test_cpu to use the newly added values_close_report() helper
to print detailed diagnostics when a tolerance check fails. This
provides clearer insight into deviations while run in the CI.

Signed-off-by: Sebastian Chlad <sebastian.chlad@suse.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2025-10-15 05:00:59 -10:00
Sebastian Chlad
3f9c60f4d3 selftests: cgroup: add values_close_report helper
Some cgroup selftests, such as test_cpu, occasionally fail by a very
small margin and if run in the CI context, it is useful to have detailed
diagnostic output to understand the deviation.

Introduce a values_close_report() helper which performs the same
comparison as values_close(), but prints detailed information when the
values differ beyond the allowed tolerance.

Signed-off-by: Sebastian Chlad <sebastian.chlad@suse.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2025-10-15 05:00:49 -10:00
Andrii Nakryiko
e603a342cf selftests/bpf: make arg_parsing.c more robust to crashes
We started getting a crash in BPF CI, which seems to originate from
test_parse_test_list_file() test and is happening at this line:

  ASSERT_OK(strcmp("test_with_spaces", set.tests[0].name), "test 0 name");

One way we can crash there is if set.cnt zero, which is checked for with
ASSERT_EQ() above, but we proceed after this regardless of the outcome.
Instead of crashing, we should bail out with test failure early.

Similarly, if parse_test_list_file() fails, we shouldn't be even looking
at set, so bail even earlier if ASSERT_OK() fails.

Fixes: 64276f01dc ("selftests/bpf: Test_progs can read test lists from file")
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Tested-by: Ihor Solodrai <ihor.solodrai@linux.dev>
Link: https://lore.kernel.org/r/20251014202037.72922-1-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-10-14 16:39:33 -07:00
Wang Liang
4f86eb0a38 selftests: net: check jq command is supported
The jq command is used in vlan_bridge_binding.sh, if it is not supported,
the test will spam the following log.

  # ./vlan_bridge_binding.sh: line 51: jq: command not found
  # ./vlan_bridge_binding.sh: line 51: jq: command not found
  # ./vlan_bridge_binding.sh: line 51: jq: command not found
  # ./vlan_bridge_binding.sh: line 51: jq: command not found
  # ./vlan_bridge_binding.sh: line 51: jq: command not found
  # TEST: Test bridge_binding on->off when lower down                   [FAIL]
  #       Got operstate of , expected 0

The rtnetlink.sh has the same problem. It makes sense to check if jq is
installed before running these tests. After this patch, the
vlan_bridge_binding.sh skipped if jq is not supported:

  # timeout set to 3600
  # selftests: net: vlan_bridge_binding.sh
  # TEST: jq not installed                                              [SKIP]

Fixes: dca12e9ab7 ("selftests: net: Add a VLAN bridge binding selftest")
Fixes: 6a414fd77f ("selftests: rtnetlink: Add an address proto test")
Signed-off-by: Wang Liang <wangliang74@huawei.com>
Reviewed-by: Hangbin Liu <liuhangbin@gmail.com>
Link: https://patch.msgid.link/20251013080039.3035898-1-wangliang74@huawei.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-10-14 15:12:18 +02:00
Marc Zyngier
5c7cf1e44e KVM: arm64: selftests: Fix misleading comment about virtual timer encoding
The userspace-visible encoding for CNTV_CVAL_EL0 and CNTVCNT_EL0
have been swapped for as long as usersapce has had access to the
registers. This is documented in arch/arm64/include/uapi/asm/kvm.h.

Despite that, the get_reg_list test has unhelpful comments indicating
the wrong register for the encoding.

Replace this with definitions exposed in the include file, and
a comment explaining again the brokenness.

Signed-off-by: Marc Zyngier <maz@kernel.org>
2025-10-13 14:43:12 +01:00
Marc Zyngier
4da5a9af78 KVM: arm64: selftests: Add an E2H=0-specific configuration to get_reg_list
Add yet another configuration, this time dealing E2H=0.

Signed-off-by: Marc Zyngier <maz@kernel.org>
2025-10-13 14:42:41 +01:00