Commit Graph

22457 Commits

Author SHA1 Message Date
Mykyta Yatsenko
f64eb44ce9 selftests/bpf: Add torn write detection test for htab BPF_F_LOCK
Add a consistency subtest to htab_reuse that detects torn writes
caused by the BPF_F_LOCK lockless update racing with element
reallocation in alloc_htab_elem().

The test uses three thread roles started simultaneously via a pipe:
 - locked updaters: BPF_F_LOCK|BPF_EXIST in-place updates
 - delete+update workers: delete then BPF_ANY|BPF_F_LOCK insert
 - locked readers: BPF_F_LOCK lookup checking value consistency

Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com>
Link: https://lore.kernel.org/r/20260401-bpf_map_torn_writes-v1-2-782d071c55e7@meta.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2026-04-05 18:37:32 -07:00
Linus Torvalds
85fb6da43a RISC-V updates for v7.0-rc7
- Fix a CONFIG_SPARSEMEM crash on RV32 by avoiding early phys_to_page()
 
 - Prevent runtime const infrastructure from being used by modules, similar
   to what was done for x86
 
 - Avoid problems when shutting down ACPI systems with IOMMUs by adding
   a device dependency between IOMMU and devices that use it
 
 - Fix a bug where the CPU pointer masking state isn't properly reset
   when tagged addresses aren't enabled for a task
 
 - Fix some incorrect register assignments, and add some missing ones,
   in kgdb support code
 
 - Fix compilation of non-kernel code that uses the ptrace uapi header
   by replacing BIT() with _BITUL()
 
 - Fix compilation of the validate_v_ptrace kselftest by working around
   kselftest macro expansion issues
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEElRDoIDdEz9/svf2Kx4+xDQu9KksFAmnSgysACgkQx4+xDQu9
 KksznQ//UKuNcpTgGoTOSAi9m5XrLNG7B0Z2Es5n3IuuFLeX4uFwD8pJjUouAqja
 Y89HKHcbuawAZLxoEj5QImbFxyM6zgdA24R2kM76+Ds5nMM4hetL1hR1Gphs1ghs
 Vg/klLkSQ/QkV8xTZlWe9A3s96PeiYKgwQUYdENjL/OXWjTbi4Ho/EQYjsXWGyuc
 sGkWVbGeqPhNlv8bMcA11kM8rCsvyhFnAC5yIbmybmup6ObzS1tEnOXodp1jVDlZ
 TPzi7SyjSLiTbsaJGZ1O5oFXSrr8zBLFt2RinR7rUt/8Aq8c5xSSvK9n808jytNP
 ubIgqWjW3wGjzbZfQw4WhOIihtAsp2VssWZlt1p0Q7EGOx0g+/zMA6Uq1VVIuEML
 +Xm6BwxLFm43NDSa7HPtytCoN/qqIQmiRkiLAG7WHL3mSkYDXYjTXZxTmp0awJ8R
 WTlZsQFQlnNd8VydP++cwqi/lCPPqWqZbc8ys0lLt57+oe6eE91W3a4jXnIn/5YR
 dtHLdmHF6xG3pVdilEfFgH7CkA1DMlFox5qQRFx4lLWBY7tTEY1S2o1tmIG1zqKd
 QTcaO1VbuobTLAy06kD8XNUNh8jzW0zedk37BcxA+J+1B59c0N9J7rW8rkRYu4Le
 eeIy9p8kPWUB/JfcMY+6jKUjZgQL9un8M4PpVZ/uWJDxQVDJcRs=
 =d0PH
 -----END PGP SIGNATURE-----

Merge tag 'riscv-for-linus-7.0-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux

Pull RISC-V fixes from Paul Walmsley:

 - Fix a CONFIG_SPARSEMEM crash on RV32 by avoiding early phys_to_page()

 - Prevent runtime const infrastructure from being used by modules,
   similar to what was done for x86

 - Avoid problems when shutting down ACPI systems with IOMMUs by adding
   a device dependency between IOMMU and devices that use it

 - Fix a bug where the CPU pointer masking state isn't properly reset
   when tagged addresses aren't enabled for a task

 - Fix some incorrect register assignments, and add some missing ones,
   in kgdb support code

 - Fix compilation of non-kernel code that uses the ptrace uapi header
   by replacing BIT() with _BITUL()

 - Fix compilation of the validate_v_ptrace kselftest by working around
   kselftest macro expansion issues

* tag 'riscv-for-linus-7.0-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
  ACPI: RIMT: Add dependency between iommu and devices
  selftests: riscv: Add braces around EXPECT_EQ()
  riscv: use _BITUL macro rather than BIT() in ptrace uapi and kselftests
  riscv: Reset pmm when PR_TAGGED_ADDR_ENABLE is not set
  riscv: make runtime const not usable by modules
  riscv: patch: Avoid early phys_to_page()
  riscv: kgdb: fix several debug register assignment bugs
2026-04-05 14:43:47 -07:00
Lorenzo Stoakes (Oracle)
62c65fd740 mm: add mmap_action_map_kernel_pages[_full]()
A user can invoke mmap_action_map_kernel_pages() to specify that the
mapping should map kernel pages starting from desc->start of a specified
number of pages specified in an array.

In order to implement this, adjust mmap_action_prepare() to be able to
return an error code, as it makes sense to assert that the specified
parameters are valid as quickly as possible as well as updating the VMA
flags to include VMA_MIXEDMAP_BIT as necessary.

This provides an mmap_prepare equivalent of vm_insert_pages().  We
additionally update the existing vm_insert_pages() code to use
range_in_vma() and add a new range_in_vma_desc() helper function for the
mmap_prepare case, sharing the code between the two in range_is_subset().

We add both mmap_action_map_kernel_pages() and
mmap_action_map_kernel_pages_full() to allow for both partial and full VMA
mappings.

We update the documentation to reflect the new features.

Finally, we update the VMA tests accordingly to reflect the changes.

Link: https://lkml.kernel.org/r/926ac961690d856e67ec847bee2370ab3c6b9046.1774045440.git.ljs@kernel.org
Signed-off-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org>
Reviewed-by: Suren Baghdasaryan <surenb@google.com>
Acked-by: Vlastimil Babka (SUSE) <vbabka@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Bodo Stroesser <bostroesser@gmail.com>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Clemens Ladisch <clemens@ladisch.de>
Cc: David Hildenbrand <david@kernel.org>
Cc: David Howells <dhowells@redhat.com>
Cc: Dexuan Cui <decui@microsoft.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Haiyang Zhang <haiyangz@microsoft.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Jann Horn <jannh@google.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: K. Y. Srinivasan <kys@microsoft.com>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Long Li <longli@microsoft.com>
Cc: Marc Dionne <marc.dionne@auristor.com>
Cc: "Martin K. Petersen" <martin.petersen@oracle.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Miquel Raynal <miquel.raynal@bootlin.com>
Cc: Pedro Falcato <pfalcato@suse.de>
Cc: Richard Weinberger <richard@nod.at>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: Vignesh Raghavendra <vigneshr@ti.com>
Cc: Wei Liu <wei.liu@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-04-05 13:53:45 -07:00
Lorenzo Stoakes (Oracle)
668937b7b2 mm: allow handling of stacked mmap_prepare hooks in more drivers
While the conversion of mmap hooks to mmap_prepare is underway, we will
encounter situations where mmap hooks need to invoke nested mmap_prepare
hooks.

The nesting of mmap hooks is termed 'stacking'.  In order to flexibly
facilitate the conversion of custom mmap hooks in drivers which stack, we
must split up the existing __compat_vma_mmap() function into two separate
functions:

* compat_set_desc_from_vma() - This allows the setting of a vm_area_desc
  object's fields to the relevant fields of a VMA.

* __compat_vma_mmap() - Once an mmap_prepare hook has been executed upon a
  vm_area_desc object, this function performs any mmap actions specified by
  the mmap_prepare hook and then invokes its vm_ops->mapped() hook if any
  were specified.

In ordinary cases, where a file's f_op->mmap_prepare() hook simply needs
to be invoked in a stacked mmap() hook, compat_vma_mmap() can be used.

However some drivers define their own nested hooks, which are invoked in
turn by another hook.

A concrete example is vmbus_channel->mmap_ring_buffer(), which is invoked
in turn by bin_attribute->mmap():

vmbus_channel->mmap_ring_buffer() has a signature of:

int (*mmap_ring_buffer)(struct vmbus_channel *channel,
			struct vm_area_struct *vma);

And bin_attribute->mmap() has a signature of:

	int (*mmap)(struct file *, struct kobject *,
		    const struct bin_attribute *attr,
		    struct vm_area_struct *vma);

And so compat_vma_mmap() cannot be used here for incremental conversion of
hooks from mmap() to mmap_prepare().

There are many such instances like this, where conversion to mmap_prepare
would otherwise cascade to a huge change set due to nesting of this kind.

The changes in this patch mean we could now instead convert
vmbus_channel->mmap_ring_buffer() to
vmbus_channel->mmap_prepare_ring_buffer(), and implement something like:

	struct vm_area_desc desc;
	int err;

	compat_set_desc_from_vma(&desc, file, vma);
	err = channel->mmap_prepare_ring_buffer(channel, &desc);
	if (err)
		return err;

	return __compat_vma_mmap(&desc, vma);

Allowing us to incrementally update this logic, and other logic like it.

Unfortunately, as part of this change, we need to be able to flexibly
assign to the VMA descriptor, so have to remove some of the const
declarations within the structure.

Also update the VMA tests to reflect the changes.

Link: https://lkml.kernel.org/r/24aac3019dd34740e788d169fccbe3c62781e648.1774045440.git.ljs@kernel.org
Signed-off-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org>
Acked-by: Vlastimil Babka (SUSE) <vbabka@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Bodo Stroesser <bostroesser@gmail.com>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Clemens Ladisch <clemens@ladisch.de>
Cc: David Hildenbrand <david@kernel.org>
Cc: David Howells <dhowells@redhat.com>
Cc: Dexuan Cui <decui@microsoft.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Haiyang Zhang <haiyangz@microsoft.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Jann Horn <jannh@google.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: K. Y. Srinivasan <kys@microsoft.com>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Long Li <longli@microsoft.com>
Cc: Marc Dionne <marc.dionne@auristor.com>
Cc: "Martin K. Petersen" <martin.petersen@oracle.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Miquel Raynal <miquel.raynal@bootlin.com>
Cc: Pedro Falcato <pfalcato@suse.de>
Cc: Richard Weinberger <richard@nod.at>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vignesh Raghavendra <vigneshr@ti.com>
Cc: Wei Liu <wei.liu@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-04-05 13:53:44 -07:00
Lorenzo Stoakes (Oracle)
a1b7fb40cb mm: add mmap_action_simple_ioremap()
Currently drivers use vm_iomap_memory() as a simple helper function for
I/O remapping memory over a range starting at a specified physical address
over a specified length.

In order to utilise this from mmap_prepare, separate out the core logic
into __simple_ioremap_prep(), update vm_iomap_memory() to use it, and add
simple_ioremap_prepare() to do the same with a VMA descriptor object.

We also add MMAP_SIMPLE_IO_REMAP and relevant fields to the struct
mmap_action type to permit this operation also.

We use mmap_action_ioremap() to set up the actual I/O remap operation once
we have checked and figured out the parameters, which makes
simple_ioremap_prepare() easy to implement.

We then add mmap_action_simple_ioremap() to allow drivers to make use of
this mode.

We update the mmap_prepare documentation to describe this mode.  Finally,
we update the VMA tests to reflect this change.

Link: https://lkml.kernel.org/r/a08ef1c4542202684da63bb37f459d5dbbeddd91.1774045440.git.ljs@kernel.org
Signed-off-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org>
Reviewed-by: Suren Baghdasaryan <surenb@google.com>
Acked-by: Vlastimil Babka (SUSE) <vbabka@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Bodo Stroesser <bostroesser@gmail.com>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Clemens Ladisch <clemens@ladisch.de>
Cc: David Hildenbrand <david@kernel.org>
Cc: David Howells <dhowells@redhat.com>
Cc: Dexuan Cui <decui@microsoft.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Haiyang Zhang <haiyangz@microsoft.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Jann Horn <jannh@google.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: K. Y. Srinivasan <kys@microsoft.com>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Long Li <longli@microsoft.com>
Cc: Marc Dionne <marc.dionne@auristor.com>
Cc: "Martin K. Petersen" <martin.petersen@oracle.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Miquel Raynal <miquel.raynal@bootlin.com>
Cc: Pedro Falcato <pfalcato@suse.de>
Cc: Richard Weinberger <richard@nod.at>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: Vignesh Raghavendra <vigneshr@ti.com>
Cc: Wei Liu <wei.liu@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-04-05 13:53:43 -07:00
Lorenzo Stoakes (Oracle)
c50ca15dd4 mm: add vm_ops->mapped hook
Previously, when a driver needed to do something like establish a
reference count, it could do so in the mmap hook in the knowledge that the
mapping would succeed.

With the introduction of f_op->mmap_prepare this is no longer the case, as
it is invoked prior to actually establishing the mapping.

mmap_prepare is not appropriate for this kind of thing as it is called
before any merge might take place, and after which an error might occur
meaning resources could be leaked.

To take this into account, introduce a new vm_ops->mapped callback which
is invoked when the VMA is first mapped (though notably - not when it is
merged - which is correct and mirrors existing mmap/open/close behaviour).

We do better that vm_ops->open() here, as this callback can return an
error, at which point the VMA will be unmapped.

Note that vm_ops->mapped() is invoked after any mmap action is complete
(such as I/O remapping).

We intentionally do not expose the VMA at this point, exposing only the
fields that could be used, and an output parameter in case the operation
needs to update the vma->vm_private_data field.

In order to deal with stacked filesystems which invoke inner filesystem's
mmap() invocations, add __compat_vma_mapped() and invoke it on vfs_mmap()
(via compat_vma_mmap()) to ensure that the mapped callback is handled when
an mmap() caller invokes a nested filesystem's mmap_prepare() callback.

Update the mmap_prepare documentation to describe the mapped hook and make
it clear what its intended use is.

The vm_ops->mapped() call is handled by the mmap complete logic to ensure
the same code paths are handled by both the compatibility and VMA layers.

Additionally, update VMA userland test headers to reflect the change.

Link: https://lkml.kernel.org/r/4c5e98297eb0aae9565c564e1c296a112702f144.1774045440.git.ljs@kernel.org
Signed-off-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org>
Acked-by: Vlastimil Babka (SUSE) <vbabka@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Bodo Stroesser <bostroesser@gmail.com>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Clemens Ladisch <clemens@ladisch.de>
Cc: David Hildenbrand <david@kernel.org>
Cc: David Howells <dhowells@redhat.com>
Cc: Dexuan Cui <decui@microsoft.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Haiyang Zhang <haiyangz@microsoft.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Jann Horn <jannh@google.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: K. Y. Srinivasan <kys@microsoft.com>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Long Li <longli@microsoft.com>
Cc: Marc Dionne <marc.dionne@auristor.com>
Cc: "Martin K. Petersen" <martin.petersen@oracle.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Miquel Raynal <miquel.raynal@bootlin.com>
Cc: Pedro Falcato <pfalcato@suse.de>
Cc: Richard Weinberger <richard@nod.at>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vignesh Raghavendra <vigneshr@ti.com>
Cc: Wei Liu <wei.liu@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-04-05 13:53:43 -07:00
Lorenzo Stoakes (Oracle)
382c0f2895 mm: have mmap_action_complete() handle the rmap lock and unmap
Rather than have the callers handle this both the rmap lock release and
unmapping the VMA on error, handle it within the mmap_action_complete()
logic where it makes sense to, being careful not to unlock twice.

This simplifies the logic and makes it harder to make mistake with this,
while retaining correct behaviour with regard to avoiding deadlocks.

Also replace the call_action_complete() function with a direct invocation
of mmap_action_complete() as the abstraction is no longer required.

Also update the VMA tests to reflect this change.

Link: https://lkml.kernel.org/r/8d1ee8ebd3542d006a47e8382fb80cf5b57ecf10.1774045440.git.ljs@kernel.org
Signed-off-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org>
Acked-by: Vlastimil Babka (SUSE) <vbabka@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Bodo Stroesser <bostroesser@gmail.com>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Clemens Ladisch <clemens@ladisch.de>
Cc: David Hildenbrand <david@kernel.org>
Cc: David Howells <dhowells@redhat.com>
Cc: Dexuan Cui <decui@microsoft.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Haiyang Zhang <haiyangz@microsoft.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Jann Horn <jannh@google.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: K. Y. Srinivasan <kys@microsoft.com>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Long Li <longli@microsoft.com>
Cc: Marc Dionne <marc.dionne@auristor.com>
Cc: "Martin K. Petersen" <martin.petersen@oracle.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Miquel Raynal <miquel.raynal@bootlin.com>
Cc: Pedro Falcato <pfalcato@suse.de>
Cc: Richard Weinberger <richard@nod.at>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vignesh Raghavendra <vigneshr@ti.com>
Cc: Wei Liu <wei.liu@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-04-05 13:53:42 -07:00
Lorenzo Stoakes (Oracle)
33506d4bae mm: switch the rmap lock held option off in compat layer
In the mmap_prepare compatibility layer, we don't need to hold the rmap
lock, as we are being called from an .mmap handler.

The .mmap_prepare hook, when invoked in the VMA logic, is called prior to
the VMA being instantiated, but the completion hook is called after the VMA
is linked into the maple tree, meaning rmap walkers can reach it.

The mmap hook does not link the VMA into the tree, so this cannot happen.

Therefore it's safe to simply disable this in the mmap_prepare
compatibility layer.

Also update VMA tests code to reflect current compatibility layer state.

[akpm@linux-foundation.org: fix comment typo, per Vlastimil]
Link: https://lkml.kernel.org/r/dda74230d26a1fcd79a3efab61fa4101dd1cac64.1774045440.git.ljs@kernel.org
Signed-off-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org>
Acked-by: Vlastimil Babka (SUSE) <vbabka@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Bodo Stroesser <bostroesser@gmail.com>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Clemens Ladisch <clemens@ladisch.de>
Cc: David Hildenbrand <david@kernel.org>
Cc: David Howells <dhowells@redhat.com>
Cc: Dexuan Cui <decui@microsoft.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Haiyang Zhang <haiyangz@microsoft.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Jann Horn <jannh@google.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: K. Y. Srinivasan <kys@microsoft.com>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Long Li <longli@microsoft.com>
Cc: Marc Dionne <marc.dionne@auristor.com>
Cc: "Martin K. Petersen" <martin.petersen@oracle.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Miquel Raynal <miquel.raynal@bootlin.com>
Cc: Pedro Falcato <pfalcato@suse.de>
Cc: Richard Weinberger <richard@nod.at>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vignesh Raghavendra <vigneshr@ti.com>
Cc: Wei Liu <wei.liu@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-04-05 13:53:42 -07:00
Lorenzo Stoakes (Oracle)
827e97cf4b mm: document vm_operations_struct->open the same as close()
Describe when the operation is invoked and the context in which it is
invoked, matching the description already added for vm_op->close().

While we're here, update all outdated references to an 'area' field for
VMAs to the more consistent 'vma'.

Link: https://lkml.kernel.org/r/7d0ca833c12014320f0fa00f816f95e6e10076f2.1774045440.git.ljs@kernel.org
Signed-off-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org>
Acked-by: Vlastimil Babka (SUSE) <vbabka@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Bodo Stroesser <bostroesser@gmail.com>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Clemens Ladisch <clemens@ladisch.de>
Cc: David Hildenbrand <david@kernel.org>
Cc: David Howells <dhowells@redhat.com>
Cc: Dexuan Cui <decui@microsoft.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Haiyang Zhang <haiyangz@microsoft.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Jann Horn <jannh@google.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: K. Y. Srinivasan <kys@microsoft.com>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Long Li <longli@microsoft.com>
Cc: Marc Dionne <marc.dionne@auristor.com>
Cc: "Martin K. Petersen" <martin.petersen@oracle.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Miquel Raynal <miquel.raynal@bootlin.com>
Cc: Pedro Falcato <pfalcato@suse.de>
Cc: Richard Weinberger <richard@nod.at>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vignesh Raghavendra <vigneshr@ti.com>
Cc: Wei Liu <wei.liu@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-04-05 13:53:42 -07:00
Lorenzo Stoakes (Oracle)
3e4bb27068 mm: various small mmap_prepare cleanups
Patch series "mm: expand mmap_prepare functionality and usage", v4.

This series expands the mmap_prepare functionality, which is intended to
replace the deprecated f_op->mmap hook which has been the source of bugs
and security issues for some time.

This series starts with some cleanup of existing mmap_prepare logic, then
adds documentation for the mmap_prepare call to make it easier for
filesystem and driver writers to understand how it works.

It then importantly adds a vm_ops->mapped hook, a key feature that was
missing from mmap_prepare previously - this is invoked when a driver which
specifies mmap_prepare has successfully been mapped but not merged with
another VMA.

mmap_prepare is invoked prior to a merge being attempted, so you cannot
manipulate state such as reference counts as if it were a new mapping.

The vm_ops->mapped hook allows a driver to perform tasks required at this
stage, and provides symmetry against subsequent vm_ops->open,close calls.

The series uses this to correct the afs implementation which wrongly
manipulated reference count at mmap_prepare time.

It then adds an mmap_prepare equivalent of vm_iomap_memory() -
mmap_action_simple_ioremap(), then uses this to update a number of drivers.

It then splits out the mmap_prepare compatibility layer (which allows for
invocation of mmap_prepare hooks in an mmap() hook) in such a way as to
allow for more incremental implementation of mmap_prepare hooks.

It then uses this to extend mmap_prepare usage in drivers.

Finally it adds an mmap_prepare equivalent of vm_map_pages(), which lays
the foundation for future work which will extend mmap_prepare to DMA
coherent mappings.


This patch (of 21):

Rather than passing arbitrary fields, pass a vm_area_desc pointer to mmap
prepare functions to mmap prepare, and an action and vma pointer to mmap
complete in order to put all the action-specific logic in the function
actually doing the work.

Additionally, allow mmap prepare functions to return an error so we can
error out as soon as possible if there is something logically incorrect in
the input.

Update remap_pfn_range_prepare() to properly check the input range for the
CoW case.

Also remove io_remap_pfn_range_complete(), as we can simply set up the
fields correctly in io_remap_pfn_range_prepare() and use
remap_pfn_range_complete() for this.

While we're here, make remap_pfn_range_prepare_vma() a little neater, and
pass mmap_action directly to call_action_complete().

Then, update compat_vma_mmap() to perform its logic directly, as
__compat_vma_map() is not used by anything so we don't need to export it.

Also update compat_vma_mmap() to use vfs_mmap_prepare() rather than
calling the mmap_prepare op directly.

Finally, update the VMA userland tests to reflect the changes.

Link: https://lkml.kernel.org/r/cover.1774045440.git.ljs@kernel.org
Link: https://lkml.kernel.org/r/99f408e4694f44ab12bdc55fe0bd9685d3bd1117.1774045440.git.ljs@kernel.org
Signed-off-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org>
Acked-by: Vlastimil Babka (SUSE) <vbabka@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Bodo Stroesser <bostroesser@gmail.com>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Clemens Ladisch <clemens@ladisch.de>
Cc: David Hildenbrand <david@kernel.org>
Cc: David Howells <dhowells@redhat.com>
Cc: Dexuan Cui <decui@microsoft.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Haiyang Zhang <haiyangz@microsoft.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Jann Horn <jannh@google.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: K. Y. Srinivasan <kys@microsoft.com>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Long Li <longli@microsoft.com>
Cc: Marc Dionne <marc.dionne@auristor.com>
Cc: "Martin K. Petersen" <martin.petersen@oracle.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Miquel Raynal <miquel.raynal@bootlin.com>
Cc: Pedro Falcato <pfalcato@suse.de>
Cc: Richard Weinberger <richard@nod.at>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vignesh Raghavendra <vigneshr@ti.com>
Cc: Wei Liu <wei.liu@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-04-05 13:53:41 -07:00
Lorenzo Stoakes (Oracle)
90cb921c4d mm/vma: convert __mmap_region() to use vma_flags_t
Update the mmap() implementation logic implemented in __mmap_region() and
functions invoked by it.  The mmap_region() function converts its input
vm_flags_t parameter to a vma_flags_t value which it then passes to
__mmap_region() which uses the vma_flags_t value consistently from then
on.

As part of the change, we convert map_deny_write_exec() to using
vma_flags_t (it was incorrectly using unsigned long before), and place it
in vma.h, as it is only used internal to mm.

With this change, we eliminate the legacy is_shared_maywrite_vm_flags()
helper function which is now no longer required.

We are also able to update the MMAP_STATE() and VMG_MMAP_STATE() macros to
use the vma_flags_t value.

Finally, we update the VMA tests to reflect the change.

Link: https://lkml.kernel.org/r/1fc33a404c962f02da778da100387cc19bd62153.1774034900.git.ljs@kernel.org
Signed-off-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org>
Acked-by: Vlastimil Babka (SUSE) <vbabka@kernel.org>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Alexander Gordeev <agordeev@linux.ibm.com>
Cc: Alexandre Ghiti <alex@ghiti.fr>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Anton Ivanov <anton.ivanov@cambridgegreys.com>
Cc: "Borislav Petkov (AMD)" <bp@alien8.de>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Chengming Zhou <chengming.zhou@linux.dev>
Cc: Christian Borntraeger <borntraeger@linux.ibm.com>
Cc: Christian Brauner <brauner@kernel.org>
Cc: David Hildenbrand <david@kernel.org>
Cc: Dinh Nguyen <dinguyen@kernel.org>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Jann Horn <jannh@google.com>
Cc: Johannes Berg <johannes@sipsolutions.net>
Cc: Kees Cook <kees@kernel.org>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Ondrej Mosnacek <omosnace@redhat.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Moore <paul@paul-moore.com>
Cc: Pedro Falcato <pfalcato@suse.de>
Cc: Richard Weinberger <richard@nod.at>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Stephen Smalley <stephen.smalley.work@gmail.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Sven Schnelle <svens@linux.ibm.com>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Vineet Gupta <vgupta@kernel.org>
Cc: WANG Xuerui <kernel@xen0n.name>
Cc: Will Deacon <will@kernel.org>
Cc: xu xin <xu.xin16@zte.com.cn>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-04-05 13:53:41 -07:00
Lorenzo Stoakes (Oracle)
a06eb2f827 mm/vma: convert vma_modify_flags[_uffd]() to use vma_flags_t
Update the vma_modify_flags() and vma_modify_flags_uffd() functions to
accept a vma_flags_t parameter rather than a vm_flags_t one, and propagate
the changes as needed to implement this change.

Also add vma_flags_reset_once() in replacement of vm_flags_reset_once(). We
still need to be careful here because we need to avoid tearing, so maintain
the assumption that the first system word set of flags are the only ones
that require protection from tearing, and retain this functionality.

We can copy the remainder of VMA flags above 64 bits normally. But
hopefully by the time that happens, we will have replaced the logic that
requires these WRITE_ONCE()'s with something else.

We also replace instances of vm_flags_reset() with a simple write of VMA
flags. We are no longer perform a number of checks, most notable of all the
VMA flags asserts becase:

1. We might be operating on a VMA that is not yet added to the tree.

2. We might be operating on a VMA that is now detached.

3. Really in all but core code, you should be using vma_desc_xxx().

4. Other VMA fields are manipulated with no such checks.

5. It'd be egregious to have to add variants of flag functions just to
   account for cases such as the above, especially when we don't do so for
   other VMA fields. Drivers are the problematic cases and why it was
   especially important (and also for debug as VMA locks were introduced),
   the mmap_prepare work is solving this generally.

Additionally, we can fairly safely assume by this point the soft dirty
flags are being set correctly, so it's reasonable to drop this also.

Finally, update the VMA tests to reflect this.

Link: https://lkml.kernel.org/r/51afbb2b8c3681003cc7926647e37335d793836e.1774034900.git.ljs@kernel.org
Signed-off-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org>
Acked-by: Vlastimil Babka (SUSE) <vbabka@kernel.org>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Alexander Gordeev <agordeev@linux.ibm.com>
Cc: Alexandre Ghiti <alex@ghiti.fr>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Anton Ivanov <anton.ivanov@cambridgegreys.com>
Cc: "Borislav Petkov (AMD)" <bp@alien8.de>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Chengming Zhou <chengming.zhou@linux.dev>
Cc: Christian Borntraeger <borntraeger@linux.ibm.com>
Cc: Christian Brauner <brauner@kernel.org>
Cc: David Hildenbrand <david@kernel.org>
Cc: Dinh Nguyen <dinguyen@kernel.org>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Jann Horn <jannh@google.com>
Cc: Johannes Berg <johannes@sipsolutions.net>
Cc: Kees Cook <kees@kernel.org>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Ondrej Mosnacek <omosnace@redhat.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Moore <paul@paul-moore.com>
Cc: Pedro Falcato <pfalcato@suse.de>
Cc: Richard Weinberger <richard@nod.at>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Stephen Smalley <stephen.smalley.work@gmail.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Sven Schnelle <svens@linux.ibm.com>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Vineet Gupta <vgupta@kernel.org>
Cc: WANG Xuerui <kernel@xen0n.name>
Cc: Will Deacon <will@kernel.org>
Cc: xu xin <xu.xin16@zte.com.cn>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-04-05 13:53:41 -07:00
Lorenzo Stoakes (Oracle)
769669bd9c mm/vma: convert as much as we can in mm/vma.c to vma_flags_t
Now we have established a good foundation for vm_flags_t to vma_flags_t
changes, update mm/vma.c to utilise vma_flags_t wherever possible.

We are able to convert VM_STARTGAP_FLAGS entirely as this is only used in
mm/vma.c, and to account for the fact we can't use VM_NONE to make life
easier, place the definition of this within existing #ifdef's to be
cleaner.

Generally the remaining changes are mechanical.

Also update the VMA tests to reflect the changes.

Link: https://lkml.kernel.org/r/5fdeaf8af9a12c2a5d68497495f52fa627d05a5b.1774034900.git.ljs@kernel.org
Signed-off-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org>
Acked-by: Vlastimil Babka (SUSE) <vbabka@kernel.org>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Alexander Gordeev <agordeev@linux.ibm.com>
Cc: Alexandre Ghiti <alex@ghiti.fr>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Anton Ivanov <anton.ivanov@cambridgegreys.com>
Cc: "Borislav Petkov (AMD)" <bp@alien8.de>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Chengming Zhou <chengming.zhou@linux.dev>
Cc: Christian Borntraeger <borntraeger@linux.ibm.com>
Cc: Christian Brauner <brauner@kernel.org>
Cc: David Hildenbrand <david@kernel.org>
Cc: Dinh Nguyen <dinguyen@kernel.org>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Jann Horn <jannh@google.com>
Cc: Johannes Berg <johannes@sipsolutions.net>
Cc: Kees Cook <kees@kernel.org>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Ondrej Mosnacek <omosnace@redhat.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Moore <paul@paul-moore.com>
Cc: Pedro Falcato <pfalcato@suse.de>
Cc: Richard Weinberger <richard@nod.at>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Stephen Smalley <stephen.smalley.work@gmail.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Sven Schnelle <svens@linux.ibm.com>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Vineet Gupta <vgupta@kernel.org>
Cc: WANG Xuerui <kernel@xen0n.name>
Cc: Will Deacon <will@kernel.org>
Cc: xu xin <xu.xin16@zte.com.cn>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-04-05 13:53:41 -07:00
Lorenzo Stoakes (Oracle)
a6f14fb593 tools/testing/vma: update VMA tests to test vma_clear_flags[_mask]()
The tests have existing flag clearing logic, so simply expand this to use
the new VMA-specific flag clearing helpers.

Also correct some trivial formatting issue in a macro define.

Link: https://lkml.kernel.org/r/f5da681d3c33039dd4a838188385796eb8d58373.1774034900.git.ljs@kernel.org
Signed-off-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Alexander Gordeev <agordeev@linux.ibm.com>
Cc: Alexandre Ghiti <alex@ghiti.fr>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Anton Ivanov <anton.ivanov@cambridgegreys.com>
Cc: "Borislav Petkov (AMD)" <bp@alien8.de>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Chengming Zhou <chengming.zhou@linux.dev>
Cc: Christian Borntraeger <borntraeger@linux.ibm.com>
Cc: Christian Brauner <brauner@kernel.org>
Cc: David Hildenbrand <david@kernel.org>
Cc: Dinh Nguyen <dinguyen@kernel.org>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Jann Horn <jannh@google.com>
Cc: Johannes Berg <johannes@sipsolutions.net>
Cc: Kees Cook <kees@kernel.org>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Ondrej Mosnacek <omosnace@redhat.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Moore <paul@paul-moore.com>
Cc: Pedro Falcato <pfalcato@suse.de>
Cc: Richard Weinberger <richard@nod.at>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Stephen Smalley <stephen.smalley.work@gmail.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Sven Schnelle <svens@linux.ibm.com>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Vineet Gupta <vgupta@kernel.org>
Cc: Vlastimil Babka (SUSE) <vbabka@kernel.org>
Cc: WANG Xuerui <kernel@xen0n.name>
Cc: Will Deacon <will@kernel.org>
Cc: xu xin <xu.xin16@zte.com.cn>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-04-05 13:53:41 -07:00
Lorenzo Stoakes (Oracle)
d720b81d01 mm/vma: introduce vma_clear_flags[_mask]()
Introduce a helper function and helper macro to easily clear a VMA's flags
using the new vma_flags_t vma->flags field:

* vma_clear_flags_mask() - Clears all of the flags in a specified mask in
			   the VMA's flags field.
* vma_clear_flags() - Clears all of the specified individual VMA flag bits
		      in a VMA's flags field.

Also update the VMA tests to reflect the change.

Link: https://lkml.kernel.org/r/9bd15da35c2c90e7441265adf01b5c2d3b5c6d41.1774034900.git.ljs@kernel.org
Signed-off-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org>
Acked-by: Vlastimil Babka (SUSE) <vbabka@kernel.org>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Alexander Gordeev <agordeev@linux.ibm.com>
Cc: Alexandre Ghiti <alex@ghiti.fr>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Anton Ivanov <anton.ivanov@cambridgegreys.com>
Cc: "Borislav Petkov (AMD)" <bp@alien8.de>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Chengming Zhou <chengming.zhou@linux.dev>
Cc: Christian Borntraeger <borntraeger@linux.ibm.com>
Cc: Christian Brauner <brauner@kernel.org>
Cc: David Hildenbrand <david@kernel.org>
Cc: Dinh Nguyen <dinguyen@kernel.org>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Jann Horn <jannh@google.com>
Cc: Johannes Berg <johannes@sipsolutions.net>
Cc: Kees Cook <kees@kernel.org>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Ondrej Mosnacek <omosnace@redhat.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Moore <paul@paul-moore.com>
Cc: Pedro Falcato <pfalcato@suse.de>
Cc: Richard Weinberger <richard@nod.at>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Stephen Smalley <stephen.smalley.work@gmail.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Sven Schnelle <svens@linux.ibm.com>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Vineet Gupta <vgupta@kernel.org>
Cc: WANG Xuerui <kernel@xen0n.name>
Cc: Will Deacon <will@kernel.org>
Cc: xu xin <xu.xin16@zte.com.cn>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-04-05 13:53:40 -07:00
Lorenzo Stoakes (Oracle)
3a6455d56b mm: convert do_brk_flags() to use vma_flags_t
In order to be able to do this, we need to change VM_DATA_DEFAULT_FLAGS
and friends and update the architecture-specific definitions also.

We then have to update some KSM logic to handle VMA flags, and introduce
VMA_STACK_FLAGS to define the vma_flags_t equivalent of VM_STACK_FLAGS.

We also introduce two helper functions for use during the time we are
converting legacy flags to vma_flags_t values - vma_flags_to_legacy() and
legacy_to_vma_flags().

This enables us to iteratively make changes to break these changes up into
separate parts.

We use these explicitly here to keep VM_STACK_FLAGS around for certain
users which need to maintain the legacy vm_flags_t values for the time
being.

We are no longer able to rely on the simple VM_xxx being set to zero if
the feature is not enabled, so in the case of VM_DROPPABLE we introduce
VMA_DROPPABLE as the vma_flags_t equivalent, which is set to
EMPTY_VMA_FLAGS if the droppable flag is not available.

While we're here, we make the description of do_brk_flags() into a kdoc
comment, as it almost was already.

We use vma_flags_to_legacy() to not need to update the vm_get_page_prot()
logic as this time.

Note that in create_init_stack_vma() we have to replace the BUILD_BUG_ON()
with a VM_WARN_ON_ONCE() as the tested values are no longer build time
available.

We also update mprotect_fixup() to use VMA flags where possible, though we
have to live with a little duplication between vm_flags_t and vma_flags_t
values for the time being until further conversions are made.

While we're here, update VM_SPECIAL to be defined in terms of
VMA_SPECIAL_FLAGS now we have vma_flags_to_legacy().

Finally, we update the VMA tests to reflect these changes.

Link: https://lkml.kernel.org/r/d02e3e45d9a33d7904b149f5604904089fd640ae.1774034900.git.ljs@kernel.org
Signed-off-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org>
Acked-by: Paul Moore <paul@paul-moore.com>	[SELinux]
Acked-by: Vlastimil Babka (SUSE) <vbabka@kernel.org>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Alexander Gordeev <agordeev@linux.ibm.com>
Cc: Alexandre Ghiti <alex@ghiti.fr>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Anton Ivanov <anton.ivanov@cambridgegreys.com>
Cc: "Borislav Petkov (AMD)" <bp@alien8.de>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Chengming Zhou <chengming.zhou@linux.dev>
Cc: Christian Borntraeger <borntraeger@linux.ibm.com>
Cc: Christian Brauner <brauner@kernel.org>
Cc: David Hildenbrand <david@kernel.org>
Cc: Dinh Nguyen <dinguyen@kernel.org>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Jann Horn <jannh@google.com>
Cc: Johannes Berg <johannes@sipsolutions.net>
Cc: Kees Cook <kees@kernel.org>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Ondrej Mosnacek <omosnace@redhat.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Pedro Falcato <pfalcato@suse.de>
Cc: Richard Weinberger <richard@nod.at>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Stephen Smalley <stephen.smalley.work@gmail.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Sven Schnelle <svens@linux.ibm.com>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Vineet Gupta <vgupta@kernel.org>
Cc: WANG Xuerui <kernel@xen0n.name>
Cc: Will Deacon <will@kernel.org>
Cc: xu xin <xu.xin16@zte.com.cn>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-04-05 13:53:40 -07:00
Lorenzo Stoakes (Oracle)
bbbc17cb02 tools/testing/vma: test vma_flags_count,vma[_flags]_test_single_mask
Update the VMA tests to assert that vma_flags_count() behaves as expected,
as well as vma_flags_test_single_mask() and vma_test_single_mask().

For the test functions we can simply update the existing vma_test(), et
al.  test to also test the single_mask variants.

We also add some explicit testing of an empty VMA flag to this test to
ensure this is handled properly.

In order to test vma_flags_count() we simply take an existing set of flags
and gradually remove flags ensuring the count remains as expected
throughout.

We also update the vma[_flags]_test_all() tests to make clear the
semantics that we expect vma[_flags]_test_all(..., EMPTY_VMA_FLAGS) to
return true, as trivially, all flags of none are always set in VMA flags.

Link: https://lkml.kernel.org/r/4af95d559cd2af0ba3388de1e1386b9f94c0e009.1774034900.git.ljs@kernel.org
Signed-off-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Alexander Gordeev <agordeev@linux.ibm.com>
Cc: Alexandre Ghiti <alex@ghiti.fr>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Anton Ivanov <anton.ivanov@cambridgegreys.com>
Cc: "Borislav Petkov (AMD)" <bp@alien8.de>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Chengming Zhou <chengming.zhou@linux.dev>
Cc: Christian Borntraeger <borntraeger@linux.ibm.com>
Cc: Christian Brauner <brauner@kernel.org>
Cc: David Hildenbrand <david@kernel.org>
Cc: Dinh Nguyen <dinguyen@kernel.org>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Jann Horn <jannh@google.com>
Cc: Johannes Berg <johannes@sipsolutions.net>
Cc: Kees Cook <kees@kernel.org>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Ondrej Mosnacek <omosnace@redhat.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Moore <paul@paul-moore.com>
Cc: Pedro Falcato <pfalcato@suse.de>
Cc: Richard Weinberger <richard@nod.at>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Stephen Smalley <stephen.smalley.work@gmail.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Sven Schnelle <svens@linux.ibm.com>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Vineet Gupta <vgupta@kernel.org>
Cc: Vlastimil Babka (SUSE) <vbabka@kernel.org>
Cc: WANG Xuerui <kernel@xen0n.name>
Cc: Will Deacon <will@kernel.org>
Cc: xu xin <xu.xin16@zte.com.cn>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-04-05 13:53:40 -07:00
Lorenzo Stoakes (Oracle)
e79d1c500f mm: introduce vma_flags_count() and vma[_flags]_test_single_mask()
vma_flags_count() determines how many bits are set in VMA flags, using
bitmap_weight().

vma_flags_test_single_mask() determines if a vma_flags_t set of flags
contains a single flag specified as another vma_flags_t value, or if the
sought flag mask is empty, it is defined to return false.

This is useful when we want to declare a VMA flag as optionally a single
flag in a mask or empty depending on kernel configuration.

This allows us to have VM_NONE-like semantics when checking whether the
flag is set.

In a subsequent patch, we introduce the use of VMA_DROPPABLE of type
vma_flags_t using precisely these semantics.

It would be actively confusing to use vma_flags_test_any_single_mask() for
this (and vma_flags_test_all_mask() is not correct to use here, as it
trivially returns true when tested against an empty vma flags mask).

We introduce vma_flags_count() to be able to assert that the compared flag
mask is singular or empty, checked when CONFIG_DEBUG_VM is enabled.

Also update the VMA tests as part of this change.

Link: https://lkml.kernel.org/r/cd778dd02b9f2a01eb54d25a49dea8ec2ddf7753.1774034900.git.ljs@kernel.org
Signed-off-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org>
Acked-by: Vlastimil Babka (SUSE) <vbabka@kernel.org>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Alexander Gordeev <agordeev@linux.ibm.com>
Cc: Alexandre Ghiti <alex@ghiti.fr>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Anton Ivanov <anton.ivanov@cambridgegreys.com>
Cc: "Borislav Petkov (AMD)" <bp@alien8.de>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Chengming Zhou <chengming.zhou@linux.dev>
Cc: Christian Borntraeger <borntraeger@linux.ibm.com>
Cc: Christian Brauner <brauner@kernel.org>
Cc: David Hildenbrand <david@kernel.org>
Cc: Dinh Nguyen <dinguyen@kernel.org>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Jann Horn <jannh@google.com>
Cc: Johannes Berg <johannes@sipsolutions.net>
Cc: Kees Cook <kees@kernel.org>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Ondrej Mosnacek <omosnace@redhat.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Moore <paul@paul-moore.com>
Cc: Pedro Falcato <pfalcato@suse.de>
Cc: Richard Weinberger <richard@nod.at>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Stephen Smalley <stephen.smalley.work@gmail.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Sven Schnelle <svens@linux.ibm.com>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Vineet Gupta <vgupta@kernel.org>
Cc: WANG Xuerui <kernel@xen0n.name>
Cc: Will Deacon <will@kernel.org>
Cc: xu xin <xu.xin16@zte.com.cn>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-04-05 13:53:40 -07:00
Lorenzo Stoakes (Oracle)
63cdb667d1 tools/testing/vma: update VMA flag tests to test vma_test[_any_mask]()
Update the existing test logic to assert that vma_test(), vma_test_any()
and vma_test_any_mask() (implicitly tested via vma_test_any()) are
functioning correctly.

We already have tests for other variants like this, so it's simply a
matter of expanding those tests to also include tests for the VMA-specific
helpers.

Link: https://lkml.kernel.org/r/dea3e97c6c3dd86f1a3f1a0703241b03f6e3a33f.1774034900.git.ljs@kernel.org
Signed-off-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Alexander Gordeev <agordeev@linux.ibm.com>
Cc: Alexandre Ghiti <alex@ghiti.fr>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Anton Ivanov <anton.ivanov@cambridgegreys.com>
Cc: "Borislav Petkov (AMD)" <bp@alien8.de>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Chengming Zhou <chengming.zhou@linux.dev>
Cc: Christian Borntraeger <borntraeger@linux.ibm.com>
Cc: Christian Brauner <brauner@kernel.org>
Cc: David Hildenbrand <david@kernel.org>
Cc: Dinh Nguyen <dinguyen@kernel.org>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Jann Horn <jannh@google.com>
Cc: Johannes Berg <johannes@sipsolutions.net>
Cc: Kees Cook <kees@kernel.org>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Ondrej Mosnacek <omosnace@redhat.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Moore <paul@paul-moore.com>
Cc: Pedro Falcato <pfalcato@suse.de>
Cc: Richard Weinberger <richard@nod.at>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Stephen Smalley <stephen.smalley.work@gmail.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Sven Schnelle <svens@linux.ibm.com>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Vineet Gupta <vgupta@kernel.org>
Cc: Vlastimil Babka (SUSE) <vbabka@kernel.org>
Cc: WANG Xuerui <kernel@xen0n.name>
Cc: Will Deacon <will@kernel.org>
Cc: xu xin <xu.xin16@zte.com.cn>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-04-05 13:53:40 -07:00
Lorenzo Stoakes (Oracle)
fb67bba5d9 mm/vma: introduce vma_test[_any[_mask]](), and make inlining consistent
Introduce helper functions and macros to make it convenient to test flags
and flag masks for VMAs, specifically:

* vma_test() - determine if a single VMA flag is set in a VMA.
* vma_test_any_mask() - determine if any flags in a vma_flags_t value are
			set in a VMA.
* vma_test_any() - Helper macro to test if any of specific flags are set.

Also, there are a mix of 'inline's and '__always_inline's in VMA helper
function declarations, update to consistently use __always_inline.

Finally, update the VMA tests to reflect the changes.

Link: https://lkml.kernel.org/r/be1d71f08307d747a82232cbd8664a88c0f41419.1774034900.git.ljs@kernel.org
Signed-off-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org>
Acked-by: Vlastimil Babka (SUSE) <vbabka@kernel.org>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Alexander Gordeev <agordeev@linux.ibm.com>
Cc: Alexandre Ghiti <alex@ghiti.fr>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Anton Ivanov <anton.ivanov@cambridgegreys.com>
Cc: "Borislav Petkov (AMD)" <bp@alien8.de>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Chengming Zhou <chengming.zhou@linux.dev>
Cc: Christian Borntraeger <borntraeger@linux.ibm.com>
Cc: Christian Brauner <brauner@kernel.org>
Cc: David Hildenbrand <david@kernel.org>
Cc: Dinh Nguyen <dinguyen@kernel.org>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Jann Horn <jannh@google.com>
Cc: Johannes Berg <johannes@sipsolutions.net>
Cc: Kees Cook <kees@kernel.org>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Ondrej Mosnacek <omosnace@redhat.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Moore <paul@paul-moore.com>
Cc: Pedro Falcato <pfalcato@suse.de>
Cc: Richard Weinberger <richard@nod.at>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Stephen Smalley <stephen.smalley.work@gmail.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Sven Schnelle <svens@linux.ibm.com>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Vineet Gupta <vgupta@kernel.org>
Cc: WANG Xuerui <kernel@xen0n.name>
Cc: Will Deacon <will@kernel.org>
Cc: xu xin <xu.xin16@zte.com.cn>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-04-05 13:53:40 -07:00
Lorenzo Stoakes (Oracle)
a8add93f80 tools/testing/vma: test that legacy flag helpers work correctly
Update the existing compare_legacy_flags() predicate function to assert
that legacy_to_vma_flags() and vma_flags_to_legacy() behave as expected.

Link: https://lkml.kernel.org/r/3374e50053adb65818fde948ae3488e1e29ae8b1.1774034900.git.ljs@kernel.org
Signed-off-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Alexander Gordeev <agordeev@linux.ibm.com>
Cc: Alexandre Ghiti <alex@ghiti.fr>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Anton Ivanov <anton.ivanov@cambridgegreys.com>
Cc: "Borislav Petkov (AMD)" <bp@alien8.de>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Chengming Zhou <chengming.zhou@linux.dev>
Cc: Christian Borntraeger <borntraeger@linux.ibm.com>
Cc: Christian Brauner <brauner@kernel.org>
Cc: David Hildenbrand <david@kernel.org>
Cc: Dinh Nguyen <dinguyen@kernel.org>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Jann Horn <jannh@google.com>
Cc: Johannes Berg <johannes@sipsolutions.net>
Cc: Kees Cook <kees@kernel.org>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Ondrej Mosnacek <omosnace@redhat.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Moore <paul@paul-moore.com>
Cc: Pedro Falcato <pfalcato@suse.de>
Cc: Richard Weinberger <richard@nod.at>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Stephen Smalley <stephen.smalley.work@gmail.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Sven Schnelle <svens@linux.ibm.com>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Vineet Gupta <vgupta@kernel.org>
Cc: Vlastimil Babka (SUSE) <vbabka@kernel.org>
Cc: WANG Xuerui <kernel@xen0n.name>
Cc: Will Deacon <will@kernel.org>
Cc: xu xin <xu.xin16@zte.com.cn>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-04-05 13:53:39 -07:00
Lorenzo Stoakes (Oracle)
c8555bc95d mm/vma: introduce [vma_flags,legacy]_to_[legacy,vma_flags]() helpers
While we are still converting VMA flags from vma_flags_t to vm_flags_t,
introduce helpers to convert between the two to allow for iterative
development without having to 'change the world' in a single commit'.

Also update VMA flags tests to reflect the change.

Finally, refresh vma_flags_overwrite_word(),
vma_flag_overwrite_word_once(), vma_flags_set_word() and
vma_flags_clear_word() in the VMA tests to reflect current kernel
implementations - this should make no functional difference, but keeps the
logic consistent between the two.

Link: https://lkml.kernel.org/r/d3569470dbb3ae79134ca7c3eb3fc4df7086e874.1774034900.git.ljs@kernel.org
Signed-off-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org>
Acked-by: Vlastimil Babka (SUSE) <vbabka@kernel.org>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Alexander Gordeev <agordeev@linux.ibm.com>
Cc: Alexandre Ghiti <alex@ghiti.fr>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Anton Ivanov <anton.ivanov@cambridgegreys.com>
Cc: "Borislav Petkov (AMD)" <bp@alien8.de>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Chengming Zhou <chengming.zhou@linux.dev>
Cc: Christian Borntraeger <borntraeger@linux.ibm.com>
Cc: Christian Brauner <brauner@kernel.org>
Cc: David Hildenbrand <david@kernel.org>
Cc: Dinh Nguyen <dinguyen@kernel.org>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Jann Horn <jannh@google.com>
Cc: Johannes Berg <johannes@sipsolutions.net>
Cc: Kees Cook <kees@kernel.org>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Ondrej Mosnacek <omosnace@redhat.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Moore <paul@paul-moore.com>
Cc: Pedro Falcato <pfalcato@suse.de>
Cc: Richard Weinberger <richard@nod.at>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Stephen Smalley <stephen.smalley.work@gmail.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Sven Schnelle <svens@linux.ibm.com>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Vineet Gupta <vgupta@kernel.org>
Cc: WANG Xuerui <kernel@xen0n.name>
Cc: Will Deacon <will@kernel.org>
Cc: xu xin <xu.xin16@zte.com.cn>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-04-05 13:53:39 -07:00
Lorenzo Stoakes (Oracle)
3ee5845382 mm/vma: introduce vma_flags_same[_mask/_pair]()
Add helpers to determine if two sets of VMA flags are precisely the same,
that is - that every flag set one is set in another, and neither contain
any flags not set in the other.

We also introduce vma_flags_same_pair() for cases where we want to compare
two sets of VMA flags which are both non-const values.

Also update the VMA tests to reflect the change, we already implicitly
test that this functions correctly having used it for testing purposes
previously.

Link: https://lkml.kernel.org/r/4f764bf619e77205837c7c819b62139ef6337ca3.1774034900.git.ljs@kernel.org
Signed-off-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org>
Acked-by: Vlastimil Babka (SUSE) <vbabka@kernel.org>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Alexander Gordeev <agordeev@linux.ibm.com>
Cc: Alexandre Ghiti <alex@ghiti.fr>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Anton Ivanov <anton.ivanov@cambridgegreys.com>
Cc: "Borislav Petkov (AMD)" <bp@alien8.de>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Chengming Zhou <chengming.zhou@linux.dev>
Cc: Christian Borntraeger <borntraeger@linux.ibm.com>
Cc: Christian Brauner <brauner@kernel.org>
Cc: David Hildenbrand <david@kernel.org>
Cc: Dinh Nguyen <dinguyen@kernel.org>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Jann Horn <jannh@google.com>
Cc: Johannes Berg <johannes@sipsolutions.net>
Cc: Kees Cook <kees@kernel.org>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Ondrej Mosnacek <omosnace@redhat.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Moore <paul@paul-moore.com>
Cc: Pedro Falcato <pfalcato@suse.de>
Cc: Richard Weinberger <richard@nod.at>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Stephen Smalley <stephen.smalley.work@gmail.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Sven Schnelle <svens@linux.ibm.com>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Vineet Gupta <vgupta@kernel.org>
Cc: WANG Xuerui <kernel@xen0n.name>
Cc: Will Deacon <will@kernel.org>
Cc: xu xin <xu.xin16@zte.com.cn>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-04-05 13:53:39 -07:00
Lorenzo Stoakes (Oracle)
b22a48ec09 tools/testing/vma: add simple test for append_vma_flags()
Add a simple test for append_vma_flags() to assert that it behaves as
expected.

Additionally, include the VMA_REMAP_FLAGS definition in the VMA tests to
allow us to use this value in the testing.

Link: https://lkml.kernel.org/r/eebd946c5325ad7fae93027245a562eb1aeb68a2.1774034900.git.ljs@kernel.org
Signed-off-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Alexander Gordeev <agordeev@linux.ibm.com>
Cc: Alexandre Ghiti <alex@ghiti.fr>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Anton Ivanov <anton.ivanov@cambridgegreys.com>
Cc: "Borislav Petkov (AMD)" <bp@alien8.de>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Chengming Zhou <chengming.zhou@linux.dev>
Cc: Christian Borntraeger <borntraeger@linux.ibm.com>
Cc: Christian Brauner <brauner@kernel.org>
Cc: David Hildenbrand <david@kernel.org>
Cc: Dinh Nguyen <dinguyen@kernel.org>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Jann Horn <jannh@google.com>
Cc: Johannes Berg <johannes@sipsolutions.net>
Cc: Kees Cook <kees@kernel.org>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Ondrej Mosnacek <omosnace@redhat.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Moore <paul@paul-moore.com>
Cc: Pedro Falcato <pfalcato@suse.de>
Cc: Richard Weinberger <richard@nod.at>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Stephen Smalley <stephen.smalley.work@gmail.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Sven Schnelle <svens@linux.ibm.com>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Vineet Gupta <vgupta@kernel.org>
Cc: Vlastimil Babka (SUSE) <vbabka@kernel.org>
Cc: WANG Xuerui <kernel@xen0n.name>
Cc: Will Deacon <will@kernel.org>
Cc: xu xin <xu.xin16@zte.com.cn>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-04-05 13:53:39 -07:00
Lorenzo Stoakes (Oracle)
e8d464f4a9 mm/vma: add append_vma_flags() helper
In order to be able to efficiently combine VMA flag masks with additional
VMA flag bits we need to extend the concept introduced in mk_vma_flags()
and __mk_vma_flags() by allowing the specification of a VMA flag mask to
append VMA flag bits to.

Update __mk_vma_flags() to allow for this and update mk_vma_flags()
accordingly, and also provide append_vma_flags() to allow for the caller
to specify which VMA flags mask to append to.

Finally, update the VMA flags tests to reflect the change.

Link: https://lkml.kernel.org/r/9f928cd4688270002f2c0c3777fcc9b49cc7a8ea.1774034900.git.ljs@kernel.org
Signed-off-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org>
Acked-by: Vlastimil Babka (SUSE) <vbabka@kernel.org>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Alexander Gordeev <agordeev@linux.ibm.com>
Cc: Alexandre Ghiti <alex@ghiti.fr>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Anton Ivanov <anton.ivanov@cambridgegreys.com>
Cc: "Borislav Petkov (AMD)" <bp@alien8.de>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Chengming Zhou <chengming.zhou@linux.dev>
Cc: Christian Borntraeger <borntraeger@linux.ibm.com>
Cc: Christian Brauner <brauner@kernel.org>
Cc: David Hildenbrand <david@kernel.org>
Cc: Dinh Nguyen <dinguyen@kernel.org>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Jann Horn <jannh@google.com>
Cc: Johannes Berg <johannes@sipsolutions.net>
Cc: Kees Cook <kees@kernel.org>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Ondrej Mosnacek <omosnace@redhat.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Moore <paul@paul-moore.com>
Cc: Pedro Falcato <pfalcato@suse.de>
Cc: Richard Weinberger <richard@nod.at>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Stephen Smalley <stephen.smalley.work@gmail.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Sven Schnelle <svens@linux.ibm.com>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Vineet Gupta <vgupta@kernel.org>
Cc: WANG Xuerui <kernel@xen0n.name>
Cc: Will Deacon <will@kernel.org>
Cc: xu xin <xu.xin16@zte.com.cn>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-04-05 13:53:39 -07:00
Lorenzo Stoakes (Oracle)
06531d2bf3 tools/testing/vma: fix VMA flag tests
The VMA tests are incorrectly referencing NUM_VMA_FLAGS, which doesn't
exist, rather they should reference NUM_VMA_FLAG_BITS.

Additionally, remove the custom-written implementation of __mk_vma_flags()
as this means we are not testing the code as present in the kernel, rather
add the actual __mk_vma_flags() to dup.h and add #ifdef's to handle
declarations differently depending on NUM_VMA_FLAG_BITS.

Link: https://lkml.kernel.org/r/b19c63af3d5efdfe712bf5d5f89368a5360a60f7.1774034900.git.ljs@kernel.org
Signed-off-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Alexander Gordeev <agordeev@linux.ibm.com>
Cc: Alexandre Ghiti <alex@ghiti.fr>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Anton Ivanov <anton.ivanov@cambridgegreys.com>
Cc: "Borislav Petkov (AMD)" <bp@alien8.de>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Chengming Zhou <chengming.zhou@linux.dev>
Cc: Christian Borntraeger <borntraeger@linux.ibm.com>
Cc: Christian Brauner <brauner@kernel.org>
Cc: David Hildenbrand <david@kernel.org>
Cc: Dinh Nguyen <dinguyen@kernel.org>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Jann Horn <jannh@google.com>
Cc: Johannes Berg <johannes@sipsolutions.net>
Cc: Kees Cook <kees@kernel.org>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Ondrej Mosnacek <omosnace@redhat.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Moore <paul@paul-moore.com>
Cc: Pedro Falcato <pfalcato@suse.de>
Cc: Richard Weinberger <richard@nod.at>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Stephen Smalley <stephen.smalley.work@gmail.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Sven Schnelle <svens@linux.ibm.com>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Vineet Gupta <vgupta@kernel.org>
Cc: Vlastimil Babka (SUSE) <vbabka@kernel.org>
Cc: WANG Xuerui <kernel@xen0n.name>
Cc: Will Deacon <will@kernel.org>
Cc: xu xin <xu.xin16@zte.com.cn>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-04-05 13:53:39 -07:00
Lorenzo Stoakes (Oracle)
7ec1885a7e mm/vma: use new VMA flags for sticky flags logic
Use the new vma_flags_t flags implementation to perform the logic around
sticky flags and what flags are ignored on VMA merge.

We make use of the new vma_flags_empty(), vma_flags_diff_pair(), and
vma_flags_and_mask() functionality.

Also update the VMA tests accordingly.

Link: https://lkml.kernel.org/r/369574f06360ffa44707047e3b58eb4897345fba.1774034900.git.ljs@kernel.org
Signed-off-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org>
Acked-by: Vlastimil Babka (SUSE) <vbabka@kernel.org>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Alexander Gordeev <agordeev@linux.ibm.com>
Cc: Alexandre Ghiti <alex@ghiti.fr>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Anton Ivanov <anton.ivanov@cambridgegreys.com>
Cc: "Borislav Petkov (AMD)" <bp@alien8.de>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Chengming Zhou <chengming.zhou@linux.dev>
Cc: Christian Borntraeger <borntraeger@linux.ibm.com>
Cc: Christian Brauner <brauner@kernel.org>
Cc: David Hildenbrand <david@kernel.org>
Cc: Dinh Nguyen <dinguyen@kernel.org>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Jann Horn <jannh@google.com>
Cc: Johannes Berg <johannes@sipsolutions.net>
Cc: Kees Cook <kees@kernel.org>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Ondrej Mosnacek <omosnace@redhat.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Moore <paul@paul-moore.com>
Cc: Pedro Falcato <pfalcato@suse.de>
Cc: Richard Weinberger <richard@nod.at>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Stephen Smalley <stephen.smalley.work@gmail.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Sven Schnelle <svens@linux.ibm.com>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Vineet Gupta <vgupta@kernel.org>
Cc: WANG Xuerui <kernel@xen0n.name>
Cc: Will Deacon <will@kernel.org>
Cc: xu xin <xu.xin16@zte.com.cn>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-04-05 13:53:38 -07:00
Lorenzo Stoakes (Oracle)
bd44d91d0c tools/testing/vma: convert bulk of test code to vma_flags_t
Convert the test code to utilise vma_flags_t as opposed to the deprecate
vm_flags_t as much as possible.

As part of this change, add VMA_STICKY_FLAGS and VMA_SPECIAL_FLAGS as
early versions of what these defines will look like in the kernel logic
once this logic is implemented.

Link: https://lkml.kernel.org/r/df90efe29300bd899989f695be4ae3adc901a828.1774034900.git.ljs@kernel.org
Signed-off-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Alexander Gordeev <agordeev@linux.ibm.com>
Cc: Alexandre Ghiti <alex@ghiti.fr>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Anton Ivanov <anton.ivanov@cambridgegreys.com>
Cc: "Borislav Petkov (AMD)" <bp@alien8.de>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Chengming Zhou <chengming.zhou@linux.dev>
Cc: Christian Borntraeger <borntraeger@linux.ibm.com>
Cc: Christian Brauner <brauner@kernel.org>
Cc: David Hildenbrand <david@kernel.org>
Cc: Dinh Nguyen <dinguyen@kernel.org>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Jann Horn <jannh@google.com>
Cc: Johannes Berg <johannes@sipsolutions.net>
Cc: Kees Cook <kees@kernel.org>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Ondrej Mosnacek <omosnace@redhat.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Moore <paul@paul-moore.com>
Cc: Pedro Falcato <pfalcato@suse.de>
Cc: Richard Weinberger <richard@nod.at>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Stephen Smalley <stephen.smalley.work@gmail.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Sven Schnelle <svens@linux.ibm.com>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Vineet Gupta <vgupta@kernel.org>
Cc: Vlastimil Babka (SUSE) <vbabka@kernel.org>
Cc: WANG Xuerui <kernel@xen0n.name>
Cc: Will Deacon <will@kernel.org>
Cc: xu xin <xu.xin16@zte.com.cn>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-04-05 13:53:38 -07:00
Lorenzo Stoakes (Oracle)
8228e42b5f mm/vma: add further vma_flags_t unions
In order to utilise the new vma_flags_t type, we currently place it in
union with legacy vm_flags fields of type vm_flags_t to make the
transition smoother.

Add vma_flags_t union entries for mm->def_flags and vmg->vm_flags -
mm->def_vma_flags and vmg->vma_flags respectively.

Once the conversion is complete, these will be replaced with vma_flags_t
entries alone.

Also update the VMA tests to reflect the change.

Link: https://lkml.kernel.org/r/d507d542c089ba132e9da53f2ff7f80ca117c3b4.1774034900.git.ljs@kernel.org
Signed-off-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org>
Acked-by: Vlastimil Babka (SUSE) <vbabka@kernel.org>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Alexander Gordeev <agordeev@linux.ibm.com>
Cc: Alexandre Ghiti <alex@ghiti.fr>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Anton Ivanov <anton.ivanov@cambridgegreys.com>
Cc: "Borislav Petkov (AMD)" <bp@alien8.de>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Chengming Zhou <chengming.zhou@linux.dev>
Cc: Christian Borntraeger <borntraeger@linux.ibm.com>
Cc: Christian Brauner <brauner@kernel.org>
Cc: David Hildenbrand <david@kernel.org>
Cc: Dinh Nguyen <dinguyen@kernel.org>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Jann Horn <jannh@google.com>
Cc: Johannes Berg <johannes@sipsolutions.net>
Cc: Kees Cook <kees@kernel.org>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Ondrej Mosnacek <omosnace@redhat.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Moore <paul@paul-moore.com>
Cc: Pedro Falcato <pfalcato@suse.de>
Cc: Richard Weinberger <richard@nod.at>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Stephen Smalley <stephen.smalley.work@gmail.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Sven Schnelle <svens@linux.ibm.com>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Vineet Gupta <vgupta@kernel.org>
Cc: WANG Xuerui <kernel@xen0n.name>
Cc: Will Deacon <will@kernel.org>
Cc: xu xin <xu.xin16@zte.com.cn>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-04-05 13:53:38 -07:00
Lorenzo Stoakes (Oracle)
e4fd34b84b tools/testing/vma: add unit tests flag empty, diff_pair, and[_mask]
Add VMA unit tests to assert that:

* vma_flags_empty()
* vma_flags_diff_pair()
* vma_flags_and_mask()
* vma_flags_and()

All function as expected.

In additional to the added tests, in order to make testing easier, add
vma_flags_same_mask() and vma_flags_same() for testing only.  If/when
these are required in kernel code, they can be moved over.

Also add ASSERT_FLAGS_[NOT_]SAME[_MASK](), ASSERT_FLAGS_[NON]EMPTY() test
helpers to make asserting flag state easier and more convenient.

Link: https://lkml.kernel.org/r/471ce7ceb1d32e5fc9c0660966b9eacdf899b4d1.1774034900.git.ljs@kernel.org
Signed-off-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Alexander Gordeev <agordeev@linux.ibm.com>
Cc: Alexandre Ghiti <alex@ghiti.fr>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Anton Ivanov <anton.ivanov@cambridgegreys.com>
Cc: "Borislav Petkov (AMD)" <bp@alien8.de>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Chengming Zhou <chengming.zhou@linux.dev>
Cc: Christian Borntraeger <borntraeger@linux.ibm.com>
Cc: Christian Brauner <brauner@kernel.org>
Cc: David Hildenbrand <david@kernel.org>
Cc: Dinh Nguyen <dinguyen@kernel.org>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Jann Horn <jannh@google.com>
Cc: Johannes Berg <johannes@sipsolutions.net>
Cc: Kees Cook <kees@kernel.org>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Ondrej Mosnacek <omosnace@redhat.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Moore <paul@paul-moore.com>
Cc: Pedro Falcato <pfalcato@suse.de>
Cc: Richard Weinberger <richard@nod.at>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Stephen Smalley <stephen.smalley.work@gmail.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Sven Schnelle <svens@linux.ibm.com>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Vineet Gupta <vgupta@kernel.org>
Cc: Vlastimil Babka (SUSE) <vbabka@kernel.org>
Cc: WANG Xuerui <kernel@xen0n.name>
Cc: Will Deacon <will@kernel.org>
Cc: xu xin <xu.xin16@zte.com.cn>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-04-05 13:53:38 -07:00
Lorenzo Stoakes (Oracle)
6bc0987d0b mm/vma: add vma_flags_empty(), vma_flags_and(), vma_flags_diff_pair()
Patch series "mm/vma: convert vm_flags_t to vma_flags_t in vma code", v4.

This series converts a lot of the existing use of the legacy vm_flags_t
data type to the new vma_flags_t type which replaces it.

In order to do so it adds a number of additional helpers:

* vma_flags_empty() - Determines whether a vma_flags_t value has no bits
  set.

* vma_flags_and() - Performs a bitwise AND between two vma_flags_t values.

* vma_flags_diff_pair() - Determines which flags are not shared between a
  pair of VMA flags (typically non-constant values)

* append_vma_flags() - Similar to mk_vma_flags(), but allows a vma_flags_t
  value to be specified (typically a constant value) which will be copied
  and appended to to create a new vma_flags_t value, with additional flags
  specified to append to it.

* vma_flags_same() - Determines if a vma_flags_t value is exactly equal to
  a set of VMA flags.

* vma_flags_same_mask() - Determines if a vma_flags_t value is eactly equal
  to another vma_flags_t value (typically constant).

* vma_flags_same_pair() - Determines if a pair of vma_flags_t values are
  exactly equal to one another (typically both non-constant).

* vma_flags_to_legacy() - Converts a vma_flags_t value to a vm_flags_t
  value, used to enable more iterative introduction of the use of
  vma_flags_t.

* legacy_to_vma_flags() - Converts a vm_flags_t value to a vma_flags-t
  value, for the same purpose.

* vma_flags_test_single_mask() - Tests whether a vma_flags_t value contain
  the single flag specified in an input vma_flags_t flag mask, or if that
  flag mask is empty, is defined to return false. Useful for
  config-predicated VMA flag mask defines.

* vma_test() - Tests whether a VMA's flags contain a specific singular VMA
  flag.

* vma_test_any() - Tests whether a VMA's flags contain any of a set of VMA
  flags.

* vma_test_any_mask() - Tests whether a VMA's flags contain any of the
  flags specified in another, typically constant, vma_flags_t value.

* vma_test_single_mask() - Tests whether a VMA's flags contain the single
  flag specified in an input vma_flags_t flag mask, or if that flag mask is
  empty, is defined to return false. Useful for config-predicated VMA flag
  mask defines.

* vma_clear_flags() - Clears a specific set of VMA flags from a vma_flags_t
  value.

* vma_clear_flags_mask() - Clears those flag set in a vma_flags_t value
  (typically constant) from a (typically not constant) vma_flags_t value.

The series mostly focuses on the the VMA specific code, especially that
contained in mm/vma.c and mm/vma.h.

It updates both brk() and mmap() logic to utils vma_flags_t values as much
as is practiaclly possible at this point, changing surrounding logic to be
able to do so.

It also updates the vma_modify_xxx() functions where they interact with VMA
flags directly to use vm_flags_t values where possible.

There is extensive testing added in the VMA userland tests to assert that
all of these new VMA flag functions work correctly.


This patch (of 25):

Firstly, add the ability to determine if VMA flags are empty, that is no
flags are set in a vma_flags_t value.

Next, add the ability to obtain the equivalent of the bitwise and of two
vma_flags_t values, via vma_flags_and_mask().

Next, add the ability to obtain the difference between two sets of VMA
flags, that is the equivalent to the exclusive bitwise OR of the two sets
of flags, via vma_flags_diff_pair().

vma_flags_xxx_mask() typically operates on a pointer to a vma_flags_t
value, which is assumed to be an lvalue of some kind (such as a field in a
struct or a stack variable) and an rvalue of some kind (typically a
constant set of VMA flags obtained e.g.  via mk_vma_flags() or
equivalent).

However vma_flags_diff_pair() is intended to operate on two lvalues, so
use the _pair() suffix to make this clear.

Finally, update VMA userland tests to add these helpers.

We also port bitmap_xor() and __bitmap_xor() to the tools/ headers and
source to allow the tests to work with vma_flags_diff_pair().

Link: https://lkml.kernel.org/r/cover.1774034900.git.ljs@kernel.org
Link: https://lkml.kernel.org/r/53ab55b7da91425775e42c03177498ad6de88ef4.1774034900.git.ljs@kernel.org
Signed-off-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org>
Acked-by: Vlastimil Babka (SUSE) <vbabka@kernel.org>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Alexander Gordeev <agordeev@linux.ibm.com>
Cc: Alexandre Ghiti <alex@ghiti.fr>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Anton Ivanov <anton.ivanov@cambridgegreys.com>
Cc: "Borislav Petkov (AMD)" <bp@alien8.de>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Chengming Zhou <chengming.zhou@linux.dev>
Cc: Christian Borntraeger <borntraeger@linux.ibm.com>
Cc: Christian Brauner <brauner@kernel.org>
Cc: David Hildenbrand <david@kernel.org>
Cc: Dinh Nguyen <dinguyen@kernel.org>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Jann Horn <jannh@google.com>
Cc: Johannes Berg <johannes@sipsolutions.net>
Cc: Kees Cook <kees@kernel.org>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Ondrej Mosnacek <omosnace@redhat.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Moore <paul@paul-moore.com>
Cc: Pedro Falcato <pfalcato@suse.de>
Cc: Richard Weinberger <richard@nod.at>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Stephen Smalley <stephen.smalley.work@gmail.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Sven Schnelle <svens@linux.ibm.com>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Vineet Gupta <vgupta@kernel.org>
Cc: WANG Xuerui <kernel@xen0n.name>
Cc: Will Deacon <will@kernel.org>
Cc: xu xin <xu.xin16@zte.com.cn>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-04-05 13:53:38 -07:00
Zi Yan
224f129261 selftests/mm: add folio_split() and filemap_get_entry() race test
The added folio_split_race_test is a modified C port of the race condition
test from [1].  The test creates shmem huge pages, where the main thread
punches holes in the shmem to cause folio_split() in the kernel and a set
of 16 threads reads the shmem to cause filemap_get_entry() in the kernel. 
filemap_get_entry() reads the folio and xarray split by folio_split()
locklessly.  The original test[2] is written in rust and uses memfd (shmem
backed).  This C port uses shmem directly and use a single process.

Note: the initial rust to C conversion is done by Cursor.

Link: https://lore.kernel.org/all/CAKNNEtw5_kZomhkugedKMPOG-sxs5Q5OLumWJdiWXv+C9Yct0w@mail.gmail.com/ [1]
Link: https://github.com/dfinity/thp-madv-remove-test [2]
Link: https://lkml.kernel.org/r/20260323163717.184107-1-ziy@nvidia.com
Co-developed-by: Bas van Dijk <bas@dfinity.org>
Signed-off-by: Bas van Dijk <bas@dfinity.org>
Co-developed-by: Adam Bratschi-Kaye <adam.bratschikaye@dfinity.org>
Signed-off-by: Adam Bratschi-Kaye <adam.bratschikaye@dfinity.org>
Signed-off-by: Zi Yan <ziy@nvidia.com>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: Barry Song <baohua@kernel.org>
Cc: David Hildenbrand <david@kernel.org>
Cc: Dev Jain <dev.jain@arm.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Lance Yang <lance.yang@linux.dev>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Nico Pache <npache@redhat.com>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-04-05 13:53:36 -07:00
Lorenzo Stoakes (Oracle)
2d1e54aab6 mm: abstract reading sysctl_max_map_count, and READ_ONCE()
Concurrent reads and writes of sysctl_max_map_count are possible, so we
should READ_ONCE() and WRITE_ONCE().

The sysctl procfs logic already enforces WRITE_ONCE(), so abstract the
read side with get_sysctl_max_map_count().

While we're here, also move the field to mm/internal.h and add the getter
there since only mm interacts with it, there's no need for anybody else to
have access.

Finally, update the VMA userland tests to reflect the change.

Link: https://lkml.kernel.org/r/0715259eb37cbdfde4f9e5db92a20ec7110a1ce5.1773249037.git.ljs@kernel.org
Signed-off-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org>
Reviewed-by: Pedro Falcato <pfalcato@suse.de>
Cc: Jann Horn <jannh@google.com>
Cc: Jianzhou Zhao <luckd0g@163.com>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-04-05 13:53:28 -07:00
Waiman Long
2d028f3e4b selftest: memcg: skip memcg_sock test if address family not supported
The test_memcg_sock test in memcontrol.c sets up an IPv6 socket and send
data over it to consume memory and verify that memory.stat.sock and
memory.current values are close.

On systems where IPv6 isn't enabled or not configured to support
SOCK_STREAM, the test_memcg_sock test always fails.  When the socket()
call fails, there is no way we can test the memory consumption and verify
the above claim.  I believe it is better to just skip the test in this
case instead of reporting a test failure hinting that there may be
something wrong with the memcg code.

Link: https://lkml.kernel.org/r/20260311200526.885899-1-longman@redhat.com
Fixes: 5f8f019380 ("selftests: cgroup/memcontrol: add basic test for socket accounting")
Signed-off-by: Waiman Long <longman@redhat.com>
Acked-by: Michal Koutný <mkoutny@suse.com>
Acked-by: Shakeel Butt <shakeel.butt@linux.dev>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Michal Koutný <mkoutny@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Muchun Song <muchun.song@linux.dev>
Cc: Roman Gushchin <roman.gushchin@linux.dev>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Tejun Heo <tj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-04-05 13:53:27 -07:00
Mike Rapoport (Microsoft)
f08f610ea0 selftests/mm: pagemap_ioctl: remove hungarian notation
Replace lpBaseAddress with addr and dwRegionSize with size.

Link: https://lkml.kernel.org/r/20260311180737.3767545-1-rppt@kernel.org
Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Acked-by: David Hildenbrand (Arm) <david@kernel.org>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Lorenzo Stoakes <ljs@kernel.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-04-05 13:53:27 -07:00
SeongJae Park
ddac713da3 selftests/damon/sysfs.py: test goal_tuner commit
Extend the near-full DAMON parameters commit selftest to commit goal_tuner
and confirm the internal status is updated as expected.

Link: https://lkml.kernel.org/r/20260310010529.91162-12-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-04-05 13:53:26 -07:00
SeongJae Park
c2b0cb96e7 selftests/damon/drgn_dump_damon_status: support quota goal_tuner dumping
Update drgn_dump_damon_status.py, which is being used to dump the
in-kernel DAMON status for tests, to dump goal_tuner setup status.

Link: https://lkml.kernel.org/r/20260310010529.91162-11-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-04-05 13:53:26 -07:00
SeongJae Park
c00863bc7c selftests/damon/_damon_sysfs: support goal_tuner setup
Add support of goal_tuner setup to the test-purpose DAMON sysfs interface
control helper, _damon_sysfs.py.

Link: https://lkml.kernel.org/r/20260310010529.91162-10-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-04-05 13:53:26 -07:00
Anthony Yznaga
d239462787 mm: prevent droppable mappings from being locked
Droppable mappings must not be lockable.  There is a check for VMAs with
VM_DROPPABLE set in mlock_fixup() along with checks for other types of
unlockable VMAs which ensures this when calling mlock()/mlock2().

For mlockall(MCL_FUTURE), the check for unlockable VMAs is different.  In
apply_mlockall_flags(), if the flags parameter has MCL_FUTURE set, the
current task's mm's default VMA flag field mm->def_flags has VM_LOCKED
applied to it.  VM_LOCKONFAULT is also applied if MCL_ONFAULT is also set.
When these flags are set as default in this manner they are cleared in
__mmap_complete() for new mappings that do not support mlock.  A check for
VM_DROPPABLE in __mmap_complete() is missing resulting in droppable
mappings created with VM_LOCKED set.  To fix this and reduce that chance
of similar bugs in the future, introduce and use vma_supports_mlock().

Link: https://lkml.kernel.org/r/20260310155821.17869-1-anthony.yznaga@oracle.com
Fixes: 9651fcedf7 ("mm: add MAP_DROPPABLE for designating always lazily freeable mappings")
Signed-off-by: Anthony Yznaga <anthony.yznaga@oracle.com>
Suggested-by: David Hildenbrand <david@kernel.org>
Acked-by: David Hildenbrand (Arm) <david@kernel.org>
Reviewed-by: Pedro Falcato <pfalcato@suse.de>
Reviewed-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org>
Tested-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org>
Cc: Jann Horn <jannh@google.com>
Cc: Jason A. Donenfeld <jason@zx2c4.com>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@kernel.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-04-05 13:53:25 -07:00
David Hildenbrand (Arm)
a9496e9e4b mm: move vma_mmu_pagesize() from hugetlb to vma.c
vma_mmu_pagesize() is also queried on non-hugetlb VMAs and does not really
belong into hugetlb.c.

PPC64 provides a custom overwrite with CONFIG_HUGETLB_PAGE, see
arch/powerpc/mm/book3s64/slice.c, so we cannot easily make this a static
inline function.

So let's move it to vma.c and add some proper kerneldoc.

To make vma tests happy, add a simple vma_kernel_pagesize() stub in
tools/testing/vma/include/custom.h.

Link: https://lkml.kernel.org/r/20260309151901.123947-3-david@kernel.org
Signed-off-by: David Hildenbrand (Arm) <david@kernel.org>
Reviewed-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org>
Acked-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Cc: "Christophe Leroy (CS GROUP)" <chleroy@kernel.org>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Jann Horn <jannh@google.com>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Muchun Song <muchun.song@linux.dev>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Pedro Falcato <pfalcato@suse.de>
Cc: Suren Baghdasaryan <surenb@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-04-05 13:53:23 -07:00
SeongJae Park
300252ebb1 selftests/damon/config: enable DAMON_DEBUG_SANITY
CONFIG_DAMON_DEBUG_SANITY is recommended for DAMON development and test
setups.  Enable it on the build config for DAMON selftests.

Link: https://lkml.kernel.org/r/20260306152914.86303-11-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: Brendan Higgins <brendan.higgins@linux.dev>
Cc: David Gow <davidgow@google.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-04-05 13:53:21 -07:00
Lorenzo Stoakes (Oracle)
5cfb95f38a tools/testing/vma: add test for vma_flags_test(), vma_desc_test()
Now we have helpers which test singular VMA flags - vma_flags_test() and
vma_desc_test() - add a test to explicitly assert that these behave as
expected.

[ljs@kernel.org: test_vma_flags_test(): use struct initializer, per David]
  Link: https://lkml.kernel.org/r/f6f396d2-1ba2-426f-b756-d8cc5985cc7c@lucifer.local
Link: https://lkml.kernel.org/r/376a39eb9e134d2c8ab10e32720dd292970b080a.1772704455.git.ljs@kernel.org
Signed-off-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org>
Acked-by: David Hildenbrand (Arm) <david@kernel.org>
Reviewed-by: Pedro Falcato <pfalcato@suse.de>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Babu Moger <babu.moger@amd.com>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: Chao Yu <chao@kernel.org>
Cc: Chatre, Reinette <reinette.chatre@intel.com>
Cc: Chunhai Guo <guochunhai@vivo.com>
Cc: Damien Le Maol <dlemoal@kernel.org>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Dave Jiang <dave.jiang@intel.com>
Cc: Dave Martin <dave.martin@arm.com>
Cc: Gao Xiang <xiang@kernel.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Hongbo Li <lihongbo22@huawei.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: James Morse <james.morse@arm.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Jann Horn <jannh@google.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: Jeffle Xu <jefflexu@linux.alibaba.com>
Cc: Johannes Thumshirn <jth@kernel.org>
Cc: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Muchun Song <muchun.song@linux.dev>
Cc: Naohiro Aota <naohiro.aota@wdc.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Sandeep Dhavale <dhavale@google.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vishal Verma <vishal.l.verma@intel.com>
Cc: Vlastimil Babka <vbabka@kernel.org>
Cc: Yue Hu <zbestahu@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-04-05 13:53:19 -07:00
Lorenzo Stoakes (Oracle)
0c2aa66357 mm: reintroduce vma_desc_test() as a singular flag test
Similar to vma_flags_test(), we have previously renamed vma_desc_test() to
vma_desc_test_any().  Now that is in place, we can reintroduce
vma_desc_test() to explicitly check for a single VMA flag.

As with vma_flags_test(), this is useful as often flag tests are against a
single flag, and vma_desc_test_any(flags, VMA_READ_BIT) reads oddly and
potentially causes confusion.

As with vma_flags_test() a combination of sparse and vma_flags_t being a
struct means that users cannot misuse this function without it getting
flagged.

Also update the VMA tests to reflect this change.

Link: https://lkml.kernel.org/r/3a65ca23defb05060333f0586428fe279a484564.1772704455.git.ljs@kernel.org
Signed-off-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org>
Acked-by: David Hildenbrand (Arm) <david@kernel.org>
Reviewed-by: Pedro Falcato <pfalcato@suse.de>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Babu Moger <babu.moger@amd.com>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: Chao Yu <chao@kernel.org>
Cc: Chatre, Reinette <reinette.chatre@intel.com>
Cc: Chunhai Guo <guochunhai@vivo.com>
Cc: Damien Le Maol <dlemoal@kernel.org>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Dave Jiang <dave.jiang@intel.com>
Cc: Dave Martin <dave.martin@arm.com>
Cc: Gao Xiang <xiang@kernel.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Hongbo Li <lihongbo22@huawei.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: James Morse <james.morse@arm.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Jann Horn <jannh@google.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: Jeffle Xu <jefflexu@linux.alibaba.com>
Cc: Johannes Thumshirn <jth@kernel.org>
Cc: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Muchun Song <muchun.song@linux.dev>
Cc: Naohiro Aota <naohiro.aota@wdc.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Sandeep Dhavale <dhavale@google.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vishal Verma <vishal.l.verma@intel.com>
Cc: Vlastimil Babka <vbabka@kernel.org>
Cc: Yue Hu <zbestahu@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-04-05 13:53:19 -07:00
Lorenzo Stoakes (Oracle)
5e6d45d720 mm: reintroduce vma_flags_test() as a singular flag test
Since we've now renamed vma_flags_test() to vma_flags_test_any() to be
very clear as to what we are in fact testing, we now have the opportunity
to bring vma_flags_test() back, but for explicitly testing a single VMA
flag.

This is useful, as often flag tests are against a single flag, and
vma_flags_test_any(flags, VMA_READ_BIT) reads oddly and potentially causes
confusion.

We use sparse to enforce that users won't accidentally pass vm_flags_t to
this function without it being flagged so this should make it harder to
get this wrong.

Of course, passing vma_flags_t to the function is impossible, as it is a
struct.

Also update the VMA tests to reflect this change.

Link: https://lkml.kernel.org/r/f33f8d7f16c3f3d286a1dc2cba12c23683073134.1772704455.git.ljs@kernel.org
Signed-off-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org>
Acked-by: David Hildenbrand (Arm) <david@kernel.org>
Reviewed-by: Pedro Falcato <pfalcato@suse.de>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Babu Moger <babu.moger@amd.com>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: Chao Yu <chao@kernel.org>
Cc: Chatre, Reinette <reinette.chatre@intel.com>
Cc: Chunhai Guo <guochunhai@vivo.com>
Cc: Damien Le Maol <dlemoal@kernel.org>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Dave Jiang <dave.jiang@intel.com>
Cc: Dave Martin <dave.martin@arm.com>
Cc: Gao Xiang <xiang@kernel.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Hongbo Li <lihongbo22@huawei.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: James Morse <james.morse@arm.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Jann Horn <jannh@google.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: Jeffle Xu <jefflexu@linux.alibaba.com>
Cc: Johannes Thumshirn <jth@kernel.org>
Cc: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Muchun Song <muchun.song@linux.dev>
Cc: Naohiro Aota <naohiro.aota@wdc.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Sandeep Dhavale <dhavale@google.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vishal Verma <vishal.l.verma@intel.com>
Cc: Vlastimil Babka <vbabka@kernel.org>
Cc: Yue Hu <zbestahu@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-04-05 13:53:18 -07:00
Lorenzo Stoakes (Oracle)
a5eee1128d mm: always inline __mk_vma_flags() and invoked functions
Be explicit about __mk_vma_flags() (which is used by the mk_vma_flags()
macro) always being inline, as we rely on the compiler to evaluate the
loop in this function and determine that it can replace the code with the
an equivalent constant value, e.g. that:

__mk_vma_flags(2, (const vma_flag_t []){ VMA_WRITE_BIT, VMA_EXEC_BIT });

Can be replaced with:

(1UL << VMA_WRITE_BIT) | (1UL << VMA_EXEC_BIT)

= (1UL << 1) | (1UL << 2) = 6

Most likely an 'inline' will suffice for this, but be explicit as we can
be.

Also update all of the functions __mk_vma_flags() ultimately invokes to be
always inline too.

Note that test_bitmap_const_eval() asserts that the relevant bitmap
functions result in build time constant values.

Additionally, vma_flag_set() operates on a vma_flags_t type, so it is
inconsistently named versus other VMA flags functions.

We only use vma_flag_set() in __mk_vma_flags() so we don't need to worry
about its new name being rather cumbersome, so rename it to
vma_flags_set_flag() to disambiguate it from vma_flags_set().

Also update the VMA test headers to reflect the changes.

Link: https://lkml.kernel.org/r/241f49c52074d436edbb9c6a6662a8dc142a8f43.1772704455.git.ljs@kernel.org
Signed-off-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org>
Acked-by: David Hildenbrand (Arm) <david@kernel.org>
Reviewed-by: Pedro Falcato <pfalcato@suse.de>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Babu Moger <babu.moger@amd.com>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: Chao Yu <chao@kernel.org>
Cc: Chatre, Reinette <reinette.chatre@intel.com>
Cc: Chunhai Guo <guochunhai@vivo.com>
Cc: Damien Le Maol <dlemoal@kernel.org>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Dave Jiang <dave.jiang@intel.com>
Cc: Dave Martin <dave.martin@arm.com>
Cc: Gao Xiang <xiang@kernel.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Hongbo Li <lihongbo22@huawei.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: James Morse <james.morse@arm.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Jann Horn <jannh@google.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: Jeffle Xu <jefflexu@linux.alibaba.com>
Cc: Johannes Thumshirn <jth@kernel.org>
Cc: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Muchun Song <muchun.song@linux.dev>
Cc: Naohiro Aota <naohiro.aota@wdc.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Sandeep Dhavale <dhavale@google.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vishal Verma <vishal.l.verma@intel.com>
Cc: Vlastimil Babka <vbabka@kernel.org>
Cc: Yue Hu <zbestahu@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-04-05 13:53:18 -07:00
Lorenzo Stoakes (Oracle)
0b3ed2a495 mm: add vma_desc_test_all() and use it
erofs and zonefs are using vma_desc_test_any() twice to check whether all
of VMA_SHARED_BIT and VMA_MAYWRITE_BIT are set, this is silly, so add
vma_desc_test_all() to test all flags and update erofs and zonefs to use
it.

While we're here, update the helper function comments to be more
consistent.

Also add the same to the VMA test headers.

Link: https://lkml.kernel.org/r/568c8f8d6a84ff64014f997517cba7a629f7eed6.1772704455.git.ljs@kernel.org
Signed-off-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org>
Reviewed-by: Vlastimil Babka (SUSE) <vbabka@kernel.org>
Acked-by: David Hildenbrand (Arm) <david@kernel.org>
Reviewed-by: Pedro Falcato <pfalcato@suse.de>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Babu Moger <babu.moger@amd.com>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: Chao Yu <chao@kernel.org>
Cc: Chatre, Reinette <reinette.chatre@intel.com>
Cc: Chunhai Guo <guochunhai@vivo.com>
Cc: Damien Le Maol <dlemoal@kernel.org>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Dave Jiang <dave.jiang@intel.com>
Cc: Dave Martin <dave.martin@arm.com>
Cc: Gao Xiang <xiang@kernel.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Hongbo Li <lihongbo22@huawei.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: James Morse <james.morse@arm.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Jann Horn <jannh@google.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: Jeffle Xu <jefflexu@linux.alibaba.com>
Cc: Johannes Thumshirn <jth@kernel.org>
Cc: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Muchun Song <muchun.song@linux.dev>
Cc: Naohiro Aota <naohiro.aota@wdc.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Sandeep Dhavale <dhavale@google.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vishal Verma <vishal.l.verma@intel.com>
Cc: Yue Hu <zbestahu@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-04-05 13:53:18 -07:00
Lorenzo Stoakes (Oracle)
e650bb30ca mm: rename VMA flag helpers to be more readable
Patch series "mm: vma flag tweaks".

The ongoing work around introducing non-system word VMA flags has
introduced a number of helper functions and macros to make life easier
when working with these flags and to make conversions from the legacy use
of VM_xxx flags more straightforward.

This series improves these to reduce confusion as to what they do and to
improve consistency and readability.

Firstly the series renames vma_flags_test() to vma_flags_test_any() to
make it abundantly clear that this function tests whether any of the flags
are set (as opposed to vma_flags_test_all()).

It then renames vma_desc_test_flags() to vma_desc_test_any() for the same
reason.  Note that we drop the 'flags' suffix here, as
vma_desc_test_any_flags() would be cumbersome and 'test' implies a flag
test.

Similarly, we rename vma_test_all_flags() to vma_test_all() for
consistency.

Next, we have a couple of instances (erofs, zonefs) where we are now
testing for vma_desc_test_any(desc, VMA_SHARED_BIT) &&
vma_desc_test_any(desc, VMA_MAYWRITE_BIT).

This is silly, so this series introduces vma_desc_test_all() so these
callers can instead invoke vma_desc_test_all(desc, VMA_SHARED_BIT,
VMA_MAYWRITE_BIT).

We then observe that quite a few instances of vma_flags_test_any() and
vma_desc_test_any() are in fact only testing against a single flag.

Using the _any() variant here is just confusing - 'any' of single item
reads strangely and is liable to cause confusion.

So in these instances the series reintroduces vma_flags_test() and
vma_desc_test() as helpers which test against a single flag.

The fact that vma_flags_t is a struct and that vma_flag_t utilises sparse
to avoid confusion with vm_flags_t makes it impossible for a user to
misuse these helpers without it getting flagged somewhere.

The series also updates __mk_vma_flags() and functions invoked by it to
explicitly mark them always inline to match expectation and to be
consistent with other VMA flag helpers.

It also renames vma_flag_set() to vma_flags_set_flag() (a function only
used by __mk_vma_flags()) to be consistent with other VMA flag helpers.

Finally it updates the VMA tests for each of these changes, and introduces
explicit tests for vma_flags_test() and vma_desc_test() to assert that
they behave as expected.


This patch (of 6):

On reflection, it's confusing to have vma_flags_test() and
vma_desc_test_flags() test whether any comma-separated VMA flag bit is
set, while also having vma_flags_test_all() and vma_test_all_flags()
separately test whether all flags are set.

Firstly, rename vma_flags_test() to vma_flags_test_any() to eliminate this
confusion.

Secondly, since the VMA descriptor flag functions are becoming rather
cumbersome, prefer vma_desc_test*() to vma_desc_test_flags*(), and also
rename vma_desc_test_flags() to vma_desc_test_any().

Finally, rename vma_test_all_flags() to vma_test_all() to keep the
VMA-specific helper consistent with the VMA descriptor naming convention
and to help avoid confusion vs.  vma_flags_test_all().

While we're here, also update whitespace to be consistent in helper
functions.

Link: https://lkml.kernel.org/r/cover.1772704455.git.ljs@kernel.org
Link: https://lkml.kernel.org/r/0f9cb3c511c478344fac0b3b3b0300bb95be95e9.1772704455.git.ljs@kernel.org
Signed-off-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org>
Suggested-by: Pedro Falcato <pfalcato@suse.de>
Acked-by: David Hildenbrand (Arm) <david@kernel.org>
Reviewed-by: Pedro Falcato <pfalcato@suse.de>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Babu Moger <babu.moger@amd.com>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: Chao Yu <chao@kernel.org>
Cc: Chatre, Reinette <reinette.chatre@intel.com>
Cc: Chunhai Guo <guochunhai@vivo.com>
Cc: Damien Le Maol <dlemoal@kernel.org>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Dave Jiang <dave.jiang@intel.com>
Cc: Dave Martin <dave.martin@arm.com>
Cc: Gao Xiang <xiang@kernel.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Hongbo Li <lihongbo22@huawei.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: James Morse <james.morse@arm.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Jann Horn <jannh@google.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: Jeffle Xu <jefflexu@linux.alibaba.com>
Cc: Johannes Thumshirn <jth@kernel.org>
Cc: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Muchun Song <muchun.song@linux.dev>
Cc: Naohiro Aota <naohiro.aota@wdc.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Sandeep Dhavale <dhavale@google.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vishal Verma <vishal.l.verma@intel.com>
Cc: Vlastimil Babka <vbabka@kernel.org>
Cc: Yue Hu <zbestahu@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-04-05 13:53:18 -07:00
Jason Miu
6b0dd42d76 kho: remove finalize state and clients
Eliminate the `kho_finalize()` function and its associated state from the
KHO subsystem.  The transition to a radix tree for memory tracking makes
the explicit "finalize" state and its serialization step obsolete.

Remove the `kho_finalize()` and `kho_finalized()` APIs and their stub
implementations.  Update KHO client code and the debugfs interface to no
longer call or depend on the `kho_finalize()` mechanism.

Complete the move towards a stateless KHO, simplifying the overall design
by removing unnecessary state management.

Link: https://lkml.kernel.org/r/20260206021428.3386442-3-jasonmiu@google.com
Signed-off-by: Jason Miu <jasonmiu@google.com>
Reviewed-by: Pasha Tatashin <pasha.tatashin@soleen.com>
Reviewed-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Cc: Alexander Graf <graf@amazon.com>
Cc: Baoquan He <bhe@redhat.com>
Cc: Changyuan Lyu <changyuanl@google.com>
Cc: David Matlack <dmatlack@google.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Jason Gunthorpe <jgg@nvidia.com>
Cc: Pratyush Yadav <pratyush@kernel.org>
Cc: Ran Xiaokai <ran.xiaokai@zte.com.cn>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-04-05 13:53:04 -07:00
Chen Ni
15c578d0dc selftests/mm: remove duplicate include of unistd.h
Remove duplicate inclusion of unistd.h in memory-failure.c to clean up
redundant code.

Link: https://lkml.kernel.org/r/20260211064311.2981726-1-nichen@iscas.ac.cn
Signed-off-by: Chen Ni <nichen@iscas.ac.cn>
Acked-by: Miaohe Lin <linmiaohe@huawei.com>
Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Reviewed-by: SeongJae Park <sj@kernel.org>
Reviewed-by: Dev Jain <dev.jain@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-04-05 13:53:01 -07:00
Jiayuan Chen
4e89004eeb selftests/cgroup: add test for zswap incompressible pages
Add test_zswap_incompressible() to verify that the zswap_incomp memcg stat
correctly tracks incompressible pages.

The test allocates memory filled with random data from /dev/urandom, which
cannot be effectively compressed by zswap.  When this data is swapped out
to zswap, it should be stored as-is and tracked by the zswap_incomp
counter.

The test verifies that:
1. Pages are swapped out to zswap (zswpout increases)
2. Incompressible pages are tracked (zswap_incomp increases)

test:
dd if=/dev/zero of=/swapfile bs=1M count=2048
chmod 600 /swapfile
mkswap /swapfile
swapon /swapfile
echo Y > /sys/module/zswap/parameters/enabled

./test_zswap
 TAP version 13
 1..8
 ok 1 test_zswap_usage
 ok 2 test_swapin_nozswap
 ok 3 test_zswapin
 ok 4 test_zswap_writeback_enabled
 ok 5 test_zswap_writeback_disabled
 ok 6 test_no_kmem_bypass
 ok 7 test_no_invasive_cgroup_shrink
 ok 8 test_zswap_incompressible
 Totals: pass:8 fail:0 xfail:0 xpass:0 skip:0 error:0

Link: https://lkml.kernel.org/r/20260213071827.5688-3-jiayuan.chen@linux.dev
Signed-off-by: Jiayuan Chen <jiayuan.chen@shopee.com>
Acked-by: Shakeel Butt <shakeel.butt@linux.dev>
Acked-by: Nhat Pham <nphamcs@gmail.com>
Reviewed-by: SeongJae Park <sj@kernel.org>
Cc: Chengming Zhou <chengming.zhou@linux.dev>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Michal Koutný <mkoutny@suse.com>
Cc: Muchun Song <muchun.song@linux.dev>
Cc: Roman Gushchin <roman.gushchin@linux.dev>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Tejun Heo <tj@kernel.org>
Cc: Yosry Ahmed <yosry.ahmed@linux.dev>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-04-05 13:53:00 -07:00
AnishMulay
54218f10df selftests/mm: skip migration tests if NUMA is unavailable
Currently, the migration test asserts that numa_available() returns 0.  On
systems where NUMA is not available (returning -1), such as certain ARM64
configurations or single-node systems, this assertion fails and crashes
the test.

Update the test to check the return value of numa_available().  If it is
less than 0, skip the test gracefully instead of failing.

This aligns the behavior with other MM selftests (like rmap) that skip
when NUMA support is missing.

Link: https://lkml.kernel.org/r/20260218163941.13499-1-anishm7030@gmail.com
Fixes: 0c2d087284 ("mm: add selftests for migration entries")
Signed-off-by: AnishMulay <anishm7030@gmail.com>
Reviewed-by: SeongJae Park <sj@kernel.org>
Reviewed-by: Dev Jain <dev.jain@arm.com>
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
Tested-by: Sayali Patil <sayalip@linux.ibm.com>
Acked-by: David Hildenbrand (Arm) <david@kernel.org>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-04-05 13:52:57 -07:00
Liam R. Howlett
280b792cac maple_tree: use maple copy node for mas_wr_split()
Instead of using the maple big node, use the maple copy node for reduced
stack usage and aligning with mas_wr_rebalance() and
mas_wr_spanning_store().

Splitting a node is similar to rebalancing, but a new evaluation of when
to ascend is needed.  The only other difference is that the data is pushed
and never rebalanced at each level.

The testing must also align with the changes to this commit to ensure the
test suite continues to pass.

Link: https://lkml.kernel.org/r/20260130205935.2559335-27-Liam.Howlett@oracle.com
Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Alice Ryhl <aliceryhl@google.com>
Cc: Andrew Ballance <andrewjballance@gmail.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Christian Kujau <lists@nerdbynature.de>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: SeongJae Park <sj@kernel.org>
Cc: Sidhartha Kumar <sidhartha.kumar@oracle.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-04-05 13:52:56 -07:00
Liam R. Howlett
ebfee00c0b maple_tree: add test for rebalance calculation off-by-one
During the big node removal, an incorrect rebalance step went too far up
the tree causing insufficient nodes.  Test the faulty condition by
recreating the scenario in the userspace testing.

Link: https://lkml.kernel.org/r/20260130205935.2559335-24-Liam.Howlett@oracle.com
Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Alice Ryhl <aliceryhl@google.com>
Cc: Andrew Ballance <andrewjballance@gmail.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Christian Kujau <lists@nerdbynature.de>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: SeongJae Park <sj@kernel.org>
Cc: Sidhartha Kumar <sidhartha.kumar@oracle.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-04-05 13:52:56 -07:00
Liam R. Howlett
a9c6716e08 maple_tree: start using maple copy node for destination
Stop using the maple subtree state and big node in favour of using three
destinations in the maple copy node.  That is, expand the way leaves were
handled to all levels of the tree and use the maple copy node to track the
new nodes.

Extract out the sibling init into the data calculation since this is where
the insufficient data can be detected.  The remainder of the sibling code
to shift the next iteration is moved to the spanning_ascend() function,
since it is not always needed.

Next introduce the dst_setup() function which will decide how many nodes
are needed to contain the data at this level.  Using the destination
count, populate the copy node's dst array with the new nodes and set
d_count to the correct value.  Note that this can be tricky in the case of
a leaf node with exactly enough room because of the rule against NULLs at
the end of leaves.

Once the destinations are ready, copy the data by altering the
cp_data_write() function to copy from the sources to the destinations
directly.  This eliminates the use of the big node in this code path.  On
node completion, node_finalise() will zero out the remaining area and set
the metadata, if necessary.

spanning_ascend() is used to decide if the operation is complete.  It may
create a new root, converge into one destination, or continue upwards by
ascending the left and right write maple states.

One test case setup needed to be tweaked so that the targeted node was
surrounded by full nodes.

[akpm@linux-foundation.org: coding-style cleanups]
Link: https://lkml.kernel.org/r/20260130205935.2559335-18-Liam.Howlett@oracle.com
Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Alice Ryhl <aliceryhl@google.com>
Cc: Andrew Ballance <andrewjballance@gmail.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Christian Kujau <lists@nerdbynature.de>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: SeongJae Park <sj@kernel.org>
Cc: Sidhartha Kumar <sidhartha.kumar@oracle.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-04-05 13:52:55 -07:00
Liam R. Howlett
b14ffd2c6c maple_tree: testing update for spanning store
Spanning store had some corner cases which showed up during rcu stress
testing.  Add explicit tests for those cases.

At the same time add some locking for easier visibility of the rcu stress
testing.  Only a single dump of the tree will happen on the first detected
issue instead of flooding the console with output.

Link: https://lkml.kernel.org/r/20260130205935.2559335-13-Liam.Howlett@oracle.com
Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Alice Ryhl <aliceryhl@google.com>
Cc: Andrew Ballance <andrewjballance@gmail.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Christian Kujau <lists@nerdbynature.de>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: SeongJae Park <sj@kernel.org>
Cc: Sidhartha Kumar <sidhartha.kumar@oracle.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-04-05 13:52:54 -07:00
Charlie Jenkins
7eb2e29546 selftests: riscv: Add license to cfi selftest
The cfi selftest was missing a license so add it.

Signed-off-by: Charlie Jenkins <thecharlesjenkins@gmail.com>
Reviewed-by: Deepak Gupta <debug@rivosinc.com>
Link: https://patch.msgid.link/20260309-fix_selftests-v2-4-9d5a553a531e@gmail.com
Signed-off-by: Paul Walmsley <pjw@kernel.org>
2026-04-04 18:42:41 -06:00
Paul Walmsley
08ee155905 prctl: cfi: change the branch landing pad prctl()s to be more descriptive
Per Linus' comments requesting the replacement of "INDIR_BR_LP" in the
indirect branch tracking prctl()s with something more readable, and
suggesting the use of the speculation control prctl()s as an exemplar,
reimplement the prctl()s and related constants that control per-task
forward-edge control flow integrity.

This primarily involves two changes.  First, the prctls are
restructured to resemble the style of the speculative execution
workaround control prctls PR_{GET,SET}_SPECULATION_CTRL, to make them
easier to extend in the future.  Second, the "indir_br_lp" abbrevation
is expanded to "branch_landing_pads" to be less telegraphic.  The
kselftest and documentation is adjusted accordingly.

Link: https://lore.kernel.org/linux-riscv/CAHk-=whhSLGZAx3N5jJpb4GLFDqH_QvS07D+6BnkPWmCEzTAgw@mail.gmail.com/
Cc: Deepak Gupta <debug@rivosinc.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mark Brown <broonie@kernel.org>
Signed-off-by: Paul Walmsley <pjw@kernel.org>
2026-04-04 18:40:58 -06:00
Paul Walmsley
e5342fe2c1 riscv: ptrace: cfi: expand "SS" references to "shadow stack" in uapi headers
Similar to the recent change to expand "LP" to "branch landing pad",
let's expand "SS" in the ptrace uapi macros to "shadow stack" as well.
This aligns with the existing prctl() arguments, which use the
expanded "shadow stack" names, rather than just the abbreviation.

Link: https://lore.kernel.org/linux-riscv/CAHk-=whhSLGZAx3N5jJpb4GLFDqH_QvS07D+6BnkPWmCEzTAgw@mail.gmail.com/
Cc: Deepak Gupta <debug@rivosinc.com>
Signed-off-by: Paul Walmsley <pjw@kernel.org>
2026-04-04 18:40:58 -06:00
Paul Walmsley
ac4e61c730 riscv: ptrace: expand "LP" references to "branch landing pads" in uapi headers
Per Linus' comments about the unreadability of abbreviations such as
"LP", rename the RISC-V ptrace landing pad CFI macro names to be more
explicit.  This primarily involves expanding "LP" in the names to some
variant of "branch landing pad."

Link: https://lore.kernel.org/linux-riscv/CAHk-=whhSLGZAx3N5jJpb4GLFDqH_QvS07D+6BnkPWmCEzTAgw@mail.gmail.com/
Cc: Deepak Gupta <debug@rivosinc.com>
Signed-off-by: Paul Walmsley <pjw@kernel.org>
2026-04-04 18:40:58 -06:00
Charlie Jenkins
511361fe7a selftests: riscv: Add braces around EXPECT_EQ()
EXPECT_EQ() expands to multiple lines, breaking up one-line if
statements. This issue was not present in the patch on the mailing list
but was instead introduced by the maintainer when attempting to fix up
checkpatch warnings. Add braces around EXPECT_EQ() to avoid the error
even though checkpatch suggests them to be removed:

validate_v_ptrace.c:626:17: error: ‘else’ without a previous ‘if’

Fixes: 3789d5eecd ("selftests: riscv: verify syscalls discard vector context")
Fixes: 30eb191c89 ("selftests: riscv: verify ptrace rejects invalid vector csr inputs")
Fixes: 849f05ae1e ("selftests: riscv: verify ptrace accepts valid vector csr values")
Signed-off-by: Charlie Jenkins <thecharlesjenkins@gmail.com>
Reviewed-and-tested-by: Sergey Matyukevich <geomatsi@gmail.com>
Link: https://patch.msgid.link/20260309-fix_selftests-v2-2-9d5a553a531e@gmail.com
Signed-off-by: Paul Walmsley <pjw@kernel.org>
2026-04-04 18:37:57 -06:00
Paul Walmsley
87ad7cc9aa riscv: use _BITUL macro rather than BIT() in ptrace uapi and kselftests
Fix the build of non-kernel code that includes the RISC-V ptrace uapi
header, and the RISC-V validate_v_ptrace.c kselftest, by using the
_BITUL() macro rather than BIT().  BIT() is not available outside
the kernel.

Based on patches and comments from Charlie Jenkins, Michael Neuling,
and Andreas Schwab.

Fixes: 30eb191c89 ("selftests: riscv: verify ptrace rejects invalid vector csr inputs")
Fixes: 2af7c9cf02 ("riscv/ptrace: expose riscv CFI status and state via ptrace and in core files")
Cc: Andreas Schwab <schwab@suse.de>
Cc: Michael Neuling <mikey@neuling.org>
Cc: Charlie Jenkins <thecharlesjenkins@gmail.com>
Link: https://patch.msgid.link/20260330024248.449292-1-mikey@neuling.org
Link: https://lore.kernel.org/linux-riscv/20260309-fix_selftests-v2-1-9d5a553a531e@gmail.com/
Link: https://lore.kernel.org/linux-riscv/20260309-fix_selftests-v2-3-9d5a553a531e@gmail.com/
Signed-off-by: Paul Walmsley <pjw@kernel.org>
2026-04-04 18:37:54 -06:00
Thomas Weißschuh
572246dcdd tools/nolibc: handle all major and minor numbers in makedev() and friends
Remove the limitation of only handling small major and minor numbers.

Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
Acked-by: Willy Tarreau <w@1wt.eu>
Link: https://patch.msgid.link/20260404-nolibc-makedev-v2-5-456a429bf60c@weissschuh.net
2026-04-04 10:29:02 +02:00
Thomas Weißschuh
5afc7e9b90 selftests/nolibc: add a test for stat().st_rdev
The handling of 'dev_t' values is about to be changed.

Add a test to make sure they are returned correctly from stat().

Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
Acked-by: Willy Tarreau <w@1wt.eu>
Link: https://patch.msgid.link/20260404-nolibc-makedev-v2-2-456a429bf60c@weissschuh.net
2026-04-04 10:28:44 +02:00
Thomas Weißschuh
9c0ff257fb selftests/nolibc: add some tests for makedev() and friends
These functions/macros are about to be changed.

Add some tests to make sure they continue working.
As they only handle small dev_t values, only test those for now.

Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
Acked-by: Willy Tarreau <w@1wt.eu>
Link: https://patch.msgid.link/20260404-nolibc-makedev-v2-1-456a429bf60c@weissschuh.net
2026-04-04 10:28:35 +02:00
Yosry Ahmed
052ca584bd KVM: selftests: Drop 'invalid' from svm_nested_invalid_vmcb12_gpa's name
The test checks both invalid GPAs as well as unmappable GPAs, so drop
'invalid' from its name.

Signed-off-by: Yosry Ahmed <yosry@kernel.org>
Link: https://patch.msgid.link/20260316202732.3164936-10-yosry@kernel.org
Signed-off-by: Sean Christopherson <seanjc@google.com>
2026-04-03 16:08:05 -07:00
Yosry Ahmed
428543fbf0 KVM: selftests: Rework svm_nested_invalid_vmcb12_gpa
The test currently allegedly makes sure that VMRUN causes a #GP in
vmcb12 GPA is valid but unmappable. However, it calls run_guest() with
an the test vmcb12 GPA, and the #GP is produced from VMLOAD, not VMRUN.

Additionally, the underlying logic just changed to match architectural
behavior, and all of VMRUN/VMLOAD/VMSAVE fail emulation if vmcb12 cannot
be mapped. The CPU still injects a #GP if the vmcb12 GPA exceeds
maxphyaddr.

Rework the test such to use the KVM_ONE_VCPU_TEST[_SUITE] harness, and
test all of VMRUN/VMLOAD/VMSAVE with both an invalid GPA (-1ULL) causing
a #GP, and a valid but unmappable GPA causing emulation failure. Execute
the instructions directly from L1 instead of run_guest() to make sure
the #GP or emulation failure is produced by the right instruction.

Leave the #VMEXIT with unmappable GPA test case as-is, but wrap it with
a test harness as well.

Opportunisitically drop gp_triggered, as the test already checks that
a #GP was injected through a SYNC. Also, use the first unmapped GPA
instead of the maximum legal GPA, as some CPUs inject a #GP for the
maximum legal GPA (likely in a reserved area).

Signed-off-by: Yosry Ahmed <yosry@kernel.org>
Link: https://patch.msgid.link/20260316202732.3164936-9-yosry@kernel.org
Signed-off-by: Sean Christopherson <seanjc@google.com>
2026-04-03 16:08:04 -07:00
Jakub Kicinski
764d0833e7 selftests: drv-net: gro: add a test for bad IPv4 csum
We have a test for coalescing with bad TCP checksum, let's also
test bad IPv4 header checksum.

Reviewed-by: Willem de Bruijn <willemb@google.com>
Link: https://patch.msgid.link/20260402210000.1512696-9-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-03 15:05:45 -07:00
Jakub Kicinski
9a84a4047d selftests: drv-net: gro: test ip6ip6
We explicitly test ipip encap. Let's add ip6ip6, too. Having
just ipip seems like favoring IPv4 which we should not do :)
Testing all combinations is left for future work, not sure
it's actually worth it.

Reviewed-by: Willem de Bruijn <willemb@google.com>
Link: https://patch.msgid.link/20260402210000.1512696-8-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-03 15:05:44 -07:00
Jakub Kicinski
024597cc20 selftests: drv-net: gro: make large packet math more precise
When constructing the packets for large_* test cases we use
a static value for packet count and MSS. It works okay for
ipv4 vs ipv6 but the gap between ipv4 and ip6ip6 is going to
be quite significant.

Make the defines calculate the worst case values, those
are only used for sizing stack arrays. Create helpers for
calculating precise values based on the exact test case.

Reviewed-by: Willem de Bruijn <willemb@google.com>
Link: https://patch.msgid.link/20260402210000.1512696-7-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-03 15:05:44 -07:00
Jakub Kicinski
166b0cc6df selftests: drv-net: gro: remove TOTAL_HDR_LEN
Willem points out TOTAL_HDR_LEN is identical to MAX_HDR_LEN.
This seems to have been the case ever since the test was added.
Replace the uses of TOTAL_HDR_LEN with MAX_HDR_LEN, MAX seems
more common for what this value is.

Reviewed-by: Willem de Bruijn <willemb@google.com>
Link: https://patch.msgid.link/20260402210000.1512696-6-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-03 15:05:44 -07:00
Jakub Kicinski
5469b695f2 selftests: drv-net: gro: prepare for ip6ip6 support
Try to use already calculated offsets and not depend on the ipip
flag as much. This patch should not change any functionality,
it's just a cleanup to make ip6ip6 support easier.

Reviewed-by: Willem de Bruijn <willemb@google.com>
Link: https://patch.msgid.link/20260402210000.1512696-5-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-03 15:05:43 -07:00
Jakub Kicinski
d973484747 selftests: drv-net: gro: always wait for FIN in the capacity test
The new capacity/order test exits as soon as it sees the expected
packet sequence. This may allow the "flushing" FIN packet to spill
over to the next test. Let's always wait for the FIN before exiting.

Reviewed-by: Willem de Bruijn <willemb@google.com>
Link: https://patch.msgid.link/20260402210000.1512696-4-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-03 15:05:43 -07:00
Jakub Kicinski
436ea8a1b7 selftests: drv-net: gro: add 1 byte payload test
Small IPv4 packets get padded to 60B, this may break / confuse
some buggy implementations. Add a test to coalesce a 1B payload.
Keep this separate from the lrg_sml test because I suspect some
implementations may not handle this case (treat padded frames
as ineligible for coalescing).

Reviewed-by: Willem de Bruijn <willemb@google.com>
Link: https://patch.msgid.link/20260402210000.1512696-3-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-03 15:05:43 -07:00
Jakub Kicinski
30f831b44a selftests: drv-net: gro: add data burst test case
Add a test trying to induce a GRO context timeout followed
by another sequence of packets for the same flow. The second
burst arrives 100ms after the first one so any implementation
(SW or HW) must time out waiting at that point. We expect both
bursts to be aggregated successfully but separately.

Reviewed-by: Willem de Bruijn <willemb@google.com>
Link: https://patch.msgid.link/20260402210000.1512696-2-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-03 15:05:42 -07:00
Dave Jiang
202432ae8a Merge branch 'for-7.1/cxl-region-refactor' into cxl-for-next
Refactor CXL core/region code to make region code more manageable by
splitting out DAX and PMEM code from RAM handling code.

cxl/core: use cleanup.h for devm_cxl_add_dax_region
cxl/core/region: move dax region device logic into region_dax.c
cxl/core/region: move pmem region driver logic into region_pmem.c
2026-04-03 12:30:57 -07:00
Dave Jiang
303d32843b Merge branch 'for-7.1/dax-hmem' into cxl-for-next
The series addresses conflicts between HMEM and CXL when handling Soft
Reserved memory ranges. CXL will try best effort in claiming the Soft
Reserved memory region that are CXL regions. If fails, it will punt
back to HMEM.

tools/testing/cxl: Test dax_hmem takeover of CXL regions
tools/testing/cxl: Simulate auto-assembly failure
dax/hmem: Parent dax_hmem devices
dax/hmem: Fix singleton confusion between dax_hmem_work and hmem devices
dax/hmem: Reduce visibility of dax_cxl coordination symbols
cxl/region: Constify cxl_region_resource_contains()
cxl/region: Limit visibility of cxl_region_contains_resource()
dax/cxl: Fix HMEM dependencies
cxl/region: Fix use-after-free from auto assembly failure
dax/hmem, cxl: Defer and resolve Soft Reserved ownership
cxl/region: Add helper to check Soft Reserved containment by CXL regions
dax: Track all dax_region allocations under a global resource tree
dax/cxl, hmem: Initialize hmem early and defer dax_cxl binding
dax/hmem: Gate Soft Reserved deferral on DEV_DAX_CXL
dax/hmem: Request cxl_acpi and cxl_pci before walking Soft Reserved ranges
dax/hmem: Factor HMEM registration into __hmem_register_device()
dax/bus: Use dax_region_put() in alloc_dax_region() error path
2026-04-03 12:24:18 -07:00
Dave Jiang
7aacc62557 Merge branch 'for-7.1/cxl-type2-support' into cxl-for-next
Prep patches for CXL type2 accelerator basic support

cxl/region: Factor out interleave granularity setup
cxl/region: Factor out interleave ways setup
cxl: Make region type based on endpoint type
cxl/pci: Remove redundant cxl_pci_find_port() call
cxl: Move pci generic code from cxl_pci to core/cxl_pci
cxl: export internal structs for external Type2 drivers
cxl: support Type2 when initializing cxl_dev_state
2026-04-03 12:18:23 -07:00
Alison Schofield
f3b1d22607 tools/testing/cxl: Enable replay of user regions as auto regions
The cxl_test module currently hard-codes auto regions in the mock
topology, limiting coverage of the driver's region auto-assembly
logic.

Teach cxl_test to replay previously committed decoder programming
across a cxl_acpi unbind/bind cycle. Decoder programming is recorded
in a registry keyed by a stable port identity and decoder id. The
registry is updated on decoder commit and reset events and consulted
during enumeration to restore previously enabled decoders.

This allows regions created through the user interface to be replayed
during enumeration and treated as auto-discovered regions, enabling
testing of region auto-assembly using configurations created in the
cxl_test topology.

Example workflow:
  # cxl create-region ...
  # echo 1 > /sys/bus/platform/devices/cxl_acpi.0/decoder_reset_preserve_registry
  # echo cxl_acpi.0 > /sys/bus/platform/drivers/cxl_acpi/unbind
  # echo cxl_acpi.0 > /sys/bus/platform/drivers/cxl_acpi/bind
  # echo 0 > /sys/bus/platform/devices/cxl_acpi.0/decoder_reset_preserve_registry

The NDCTL CXL unit test, cxl-region-replay.sh, demonstrates the usage.

Co-developed-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Co-developed-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Alison Schofield <alison.schofield@intel.com>
Link: https://patch.msgid.link/20260314061952.2221030-1-alison.schofield@intel.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2026-04-03 11:45:48 -07:00
Sean Christopherson
25a642b6ab KVM: selftests: Remove duplicate LAUNCH_UPDATE_VMSA call in SEV-ES migrate test
Drop the explicit KVM_SEV_LAUNCH_UPDATE_VMSA call when creating an SEV-ES
VM in the SEV migration test, as sev_vm_create() automatically updates the
VMSA pages for SEV-ES guests.  The only reason the duplicate call doesn't
cause visible problems is because the test doesn't actually try to run the
vCPUs.  That will change when KVM adds a check to prevent userspace from
re-launching a VMSA (which corrupts the VMSA page due to KVM writing
encrypted private memory).

Fixes: 69f8e15ab6 ("KVM: selftests: Use the SEV library APIs in the intra-host migration test")
Cc: stable@vger.kernel.org
Link: https://patch.msgid.link/20260310234829.2608037-2-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
2026-04-03 09:37:35 -07:00
Oleg Nesterov
41fa043273 selftests/seccomp: Add hard-coded __NR_uprobe for x86_64
This complements the commit 18f7686a1c ("selftests/seccomp:
Add hard-coded __NR_uretprobe for x86_64").

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Link: https://patch.msgid.link/ac_BAMSggw-_ABPE@redhat.com
Signed-off-by: Kees Cook <kees@kernel.org>
2026-04-03 08:41:38 -07:00
Alexei Starovoitov
f1606dd0ac bpf: Add bpf_compute_const_regs() and bpf_prune_dead_branches() passes
Add two passes before the main verifier pass:

bpf_compute_const_regs() is a forward dataflow analysis that tracks
register values in R0-R9 across the program using fixed-point
iteration in reverse postorder. Each register is tracked with
a six-state lattice:

  UNVISITED -> CONST(val) / MAP_PTR(map_index) /
               MAP_VALUE(map_index, offset) / SUBPROG(num) -> UNKNOWN

At merge points, if two paths produce the same state and value for
a register, it stays; otherwise it becomes UNKNOWN.

The analysis handles:
 - MOV, ADD, SUB, AND with immediate or register operands
 - LD_IMM64 for plain constants, map FDs, map values, and subprogs
 - LDX from read-only maps: constant-folds the load by reading the
   map value directly via bpf_map_direct_read()

Results that fit in 32 bits are stored per-instruction in
insn_aux_data and bitmasks.

bpf_prune_dead_branches() uses the computed constants to evaluate
conditional branches. When both operands of a conditional jump are
known constants, the branch outcome is determined statically and the
instruction is rewritten to an unconditional jump.
The CFG postorder is then recomputed to reflect new control flow.
This eliminates dead edges so that subsequent liveness analysis
doesn't propagate through dead code.

Also add runtime sanity check to validate that precomputed
constants match the verifier's tracked state.

Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/r/20260403024422.87231-5-alexei.starovoitov@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2026-04-03 08:34:36 -07:00
Alexei Starovoitov
427c07ddb9 selftests/bpf: Add tests for subprog topological ordering
Add few tests for topo sort:
- linear chain: main -> A -> B
- diamond: main -> A, main -> B, A -> C, B -> C
- mixed global/static: main -> global -> static leaf
- shared callee: main -> leaf, main -> global -> leaf
- duplicate calls: main calls same subprog twice
- no calls: single subprog

Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/r/20260403024422.87231-4-alexei.starovoitov@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2026-04-03 08:34:33 -07:00
Alexei Starovoitov
e6898ec751 bpf: Sort subprogs in topological order after check_cfg()
Add a pass that sorts subprogs in topological order so that iterating
subprog_topo_order[] walks leaf subprogs first, then their callers.
This is computed as a DFS post-order traversal of the CFG.

The pass runs after check_cfg() to ensure the CFG has been validated
before traversing and after postorder has been computed to avoid
walking dead code.

Reviewed-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/r/20260403024422.87231-3-alexei.starovoitov@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2026-04-03 08:34:30 -07:00
Alexei Starovoitov
503d21ef8e bpf: Do register range validation early
Instead of checking src/dst range multiple times during
the main verifier pass do them once.

Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/r/20260403024422.87231-2-alexei.starovoitov@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2026-04-03 08:34:26 -07:00
Alexei Starovoitov
891a05ccba Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf 7.0-rc6+
Cross-merge BPF and other fixes after downstream PR.

Minor conflict in kernel/bpf/verifier.c

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2026-04-03 08:14:13 -07:00
Abhishek Dubey
e1f7a0e196 selftest/bpf: Enable gotox tests for powerpc64
With gotox instruction and jumptable now supported,
enable corresponding bpf selftest on powerpc.

Signed-off-by: Abhishek Dubey <adubey@linux.ibm.com>
Tested-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com>
Acked-by: Hari Bathini <hbathini@linux.ibm.com>
Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Link: https://patch.msgid.link/20260401152133.42544-5-adubey@linux.ibm.com
2026-04-03 14:14:25 +05:30
Abhishek Dubey
66cad93ad3 selftest/bpf: Enable instruction array test for powerpc
With instruction array now supported, enable corresponding bpf
selftest for powerpc.

Signed-off-by: Abhishek Dubey <adubey@linux.ibm.com>
Tested-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com>
Acked-by: Hari Bathini <hbathini@linux.ibm.com>
Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Link: https://patch.msgid.link/20260401152133.42544-3-adubey@linux.ibm.com
2026-04-03 14:14:25 +05:30
Abhishek Dubey
e640bcd1bf selftests/bpf: Enable private stack tests for powerpc64
With support of private stack, relevant tests must pass
on powerpc64.

#./test_progs -t struct_ops_private_stack
#434/1   struct_ops_private_stack/private_stack:OK
#434/2   struct_ops_private_stack/private_stack_fail:OK
#434/3   struct_ops_private_stack/private_stack_recur:OK
#434     struct_ops_private_stack:OK
Summary: 1/3 PASSED, 0 SKIPPED, 0 FAILED

Signed-off-by: Abhishek Dubey <adubey@linux.ibm.com>
Tested-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com>
Reviewed-by: Hari Bathini <hbathini@linux.ibm.com>
Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Link: https://patch.msgid.link/20260401103215.104438-2-adubey@linux.ibm.com
2026-04-03 14:09:43 +05:30
Linus Torvalds
7b9e74c5a4 bpf-fixes
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEE+soXsSLHKoYyzcli6rmadz2vbToFAmnPGdMACgkQ6rmadz2v
 bTrNxw/9Hcn2V/Jqp/cEagmKIKqSAUFgEE+AwRbQU5YL2Yem/6Q15rnOk8pOSDT5
 jqk7VbuchVmWa+a9DVy7d3XVWohk332QbvQRHfqV8P0ZpnfJa0YqdZlKg2/4/8P/
 yVhLzVrGIGcvvz9CfhIynRhq/fvr7iYbSSv9JT3nig4qCYpUf7kPbXSLtxyElNWN
 xX36KfTxQO4xI2+iezsNwklXF25Tv59V1fNuKF2lshxS+DwaroAzAJLd3MGvTHRj
 8y5kU1UDb+HeJh9DpEFjppQp4qUQjIKAiNVvXGUOe7TI/i9VTIiMfesniWKNwzYv
 Alo2G8fLb4nJhzNL2ol4R0I5BCYmMT55tBFvSNJQ+9Esy6azkbExmKuE1hXsUXo1
 jY0TbNt58zSZEmyz9SYoFKlg4lOW4ZIMl0RtnSBRoDwtK3ThGV7QFlnKq3uPZ6ce
 RcpMk7cOnERLzwPnpSiACrQmzhMk+j5HG1u+Eb3rXKxYCQO6bAhpQyPDKsiXNgkL
 uezq2zqAnNho0/CInHGlRj7E1JnvRoHCcLBT4zzyIY/jruI8fzK0aMqGMvk/qOby
 BWDnJ9GG3VmGSUc/FOp3IchKCnxXhkYqsjBCP03cbIZgr1MuixZeom81OsPNmSX8
 Ke+FeGNsU5zOUJ1iG2BZjdya/DAgP8hd85WVtaXyX60KKhuu45c=
 =w0RY
 -----END PGP SIGNATURE-----

Merge tag 'bpf-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf

Pull bpf fixes from Alexei Starovoitov:

 - Fix register equivalence for pointers to packet (Alexei Starovoitov)

 - Fix incorrect pruning due to atomic fetch precision tracking (Daniel
   Borkmann)

 - Fix grace period wait for bpf_link-ed tracepoints (Kumar Kartikeya
   Dwivedi)

 - Fix use-after-free of sockmap's sk->sk_socket (Kuniyuki Iwashima)

 - Reject direct access to nullable PTR_TO_BUF pointers (Qi Tang)

 - Reject sleepable kprobe_multi programs at attach time (Varun R
   Mallya)

* tag 'bpf-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf:
  selftests/bpf: Add more precision tracking tests for atomics
  bpf: Fix incorrect pruning due to atomic fetch precision tracking
  bpf: Reject sleepable kprobe_multi programs at attach time
  bpf: reject direct access to nullable PTR_TO_BUF pointers
  bpf: sockmap: Fix use-after-free of sk->sk_socket in sk_psock_verdict_data_ready().
  bpf: Fix grace period wait for tracepoint bpf_link
  bpf: Fix regsafe() for pointers to packet
2026-04-02 18:59:56 -07:00
Paul Chaignon
7cbded6ed9 selftests/bpf: Remove invariant violation flags
With the changes to the verifier in previous commits, we're not
expecting any invariant violations anymore. We should therefore always
enable BPF_F_TEST_REG_INVARIANTS to fail on invariant violations. Turns
out that's already the case and we've been explicitly setting this flag
in selftests when it wasn't necessary. This commit removes those flags
from selftests, which should hopefully make clearer that it's always
enabled.

Signed-off-by: Paul Chaignon <paul.chaignon@gmail.com>
Acked-by: Mykyta Yatsenko <yatsenko@meta.com>
Link: https://lore.kernel.org/r/9afce92510a7d44569dc3af63c9b8c608e69298a.1775142354.git.paul.chaignon@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2026-04-02 18:23:25 -07:00
Paul Chaignon
2ba199067b selftests/bpf: Cover invariant violation case from syzbot
This patch adds a selftest for the change in the previous patch. The
selftest is derived from a syzbot reproducer from [1] (among the 22
reproducers on that page, only 4 still reproduced on latest bpf tree,
all being small variants of the same invariant violation).

The test case failure without the previous patch is shown below.

  0: R1=ctx() R10=fp0
  0: (85) call bpf_get_prandom_u32#7    ; R0=scalar()
  1: (bf) r5 = r0                       ; R0=scalar(id=1) R5=scalar(id=1)
  2: (57) r5 &= -4                      ; R5=scalar(smax=0x7ffffffffffffffc,umax=0xfffffffffffffffc,smax32=0x7ffffffc,umax32=0xfffffffc,var_off=(0x0; 0xfffffffffffffffc))
  3: (bf) r7 = r0                       ; R0=scalar(id=1) R7=scalar(id=1)
  4: (57) r7 &= 1                       ; R7=scalar(smin=smin32=0,smax=umax=smax32=umax32=1,var_off=(0x0; 0x1))
  5: (07) r7 += -43                     ; R7=scalar(smin=smin32=-43,smax=smax32=-42,umin=0xffffffffffffffd5,umax=0xffffffffffffffd6,umin32=0xffffffd5,umax32=0xffffffd6,var_off=(0xffffffffffffffd4; 0x3))
  6: (5e) if w5 != w7 goto pc+1
  verifier bug: REG INVARIANTS VIOLATION (false_reg1): range bounds violation u64=[0xffffffd5, 0xffffffffffffffd4] s64=[0x80000000ffffffd5, 0x7fffffffffffffd4] u32=[0xffffffd5, 0xffffffd4] s32=[0xffffffd5, 0xffffffd4] var_off=(0xffffffd4, 0xffffffff00000000)

R5 and R7 are prepared such that their tnums intersection results in a
known constant but that constant isn't within R7's u32 bounds.
is_branch_taken isn't able to detect this case today, so the verifier
walks the impossible fallthrough branch. After regs_refine_cond_op and
reg_bounds_sync refine R5 on the assumption that the branch is taken,
the impossibility becomes apparent and results in an invariant violation
for R5: umin32 is greater than umax32.

The previous patch fixes this by using regs_refine_cond_op and
reg_bounds_sync in is_branch_taken to detect the impossible branch. The
fallthrough branch is therefore correctly detected as dead code.

Link: https://syzkaller.appspot.com/bug?extid=c950cc277150935cc0b5 [1]
Signed-off-by: Paul Chaignon <paul.chaignon@gmail.com>
Acked-by: Mykyta Yatsenko <yatsenko@meta.com>
Link: https://lore.kernel.org/r/b1e22233a3206ead522f02eda27b9c5c991a0de9.1775142354.git.paul.chaignon@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2026-04-02 18:23:25 -07:00
Amery Hung
63f5156a9c selftests/bpf: Improve task local data documentation and fix potential memory leak
If TLD_FREE_DATA_ON_THREAD_EXIT is not enabled in a translation unit
that calls __tld_create_key() first, another translation unit that
enables it will not get the auto cleanup feature as pthread key is only
created once when allocation metadata. Fix it by always try to create
the pthread key when __tld_create_key() is called.

Also improve the documentation:
- Discourage user from using different options in different translation
  units
- Specify calling tld_free() before thread exit as undefined behavior

Signed-off-by: Amery Hung <ameryhung@gmail.com>
Link: https://lore.kernel.org/r/20260331213555.1993883-6-ameryhung@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2026-04-02 15:11:08 -07:00
Amery Hung
0b481a6915 selftests/bpf: Remove TLD_READ_ONCE() in the user space header
TLD_READ_ONCE() is redundant as the only reference passed to it is
defined as _Atomic. The load is guaranteed to be atomic in C11 standard
(6.2.6.1). Drop the macro.

Signed-off-by: Amery Hung <ameryhung@gmail.com>
Acked-by: Sun Jian <sun.jian.kdev@gmail.com>
Link: https://lore.kernel.org/r/20260331213555.1993883-5-ameryhung@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2026-04-02 15:11:08 -07:00
Amery Hung
80aa8e9c64 selftests/bpf: Make sure TLD_DEFINE_KEY runs first
Without specifying constructor priority of the hidden constructor
function defined by TLD_DEFINE_KEY, __tld_create_key(..., dyn_data =
false) may run after tld_get_data() called from other constructors.
Threads calling tld_get_data() before __tld_create_key(..., dyn_data
= false) will not allocate enough memory for all TLDs and later result
in OOB access. Therefore, set it to the lowest value available to
users. Note that lower means higher priority and 0-100 is reserved to
the compiler.

Acked-by: Mykyta Yatsenko <yatsenko@meta.com>
Signed-off-by: Amery Hung <ameryhung@gmail.com>
Acked-by: Sun Jian <sun.jian.kdev@gmail.com>
Link: https://lore.kernel.org/r/20260331213555.1993883-4-ameryhung@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2026-04-02 15:11:08 -07:00
Amery Hung
bb6d9f5cf1 selftests/bpf: Simplify task_local_data memory allocation
Simplify data allocation by always using aligned_alloc() and passing
size_pot, size rounded up to the closest power of two to alignment.

Currently, aligned_alloc(page_size, size) is only intended to be used
with memory allocators that can fulfill the request without rounding
size up to page_size to conserve memory. This is enabled by defining
TLD_DATA_USE_ALIGNED_ALLOC. The reason to align to page_size is due to
the limitation of UPTR where only a page can be pinned to the kernel.
Otherwise, malloc(size * 2) is used to allocate memory for data.

However, we don't need to call aligned_alloc(page_size, size) to get
a contiguous memory of size bytes within a page. aligned_alloc(size_pot,
...) will also do the trick. Therefore, just use aligned_alloc(size_pot,
...) universally.

As for the size argument, create a new option,
TLD_DONT_ROUND_UP_DATA_SIZE, to specify not rounding up the size.
This preserves the current TLD_DATA_USE_ALIGNED_ALLOC behavior, allowing
memory allocators with low overhead aligned_alloc() to not waste memory.
To enable this, users need to make sure it is not an undefined behavior
for the memory allocator to have size not being an integral multiple of
alignment.

Compared to the current implementation, !TLD_DATA_USE_ALIGNED_ALLOC
used to always waste size-byte of memory due to malloc(size * 2).
Now the worst case becomes size - 1 and the best case is 0 when the size
is already a power of two.

Signed-off-by: Amery Hung <ameryhung@gmail.com>
Link: https://lore.kernel.org/r/20260331213555.1993883-3-ameryhung@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2026-04-02 15:11:08 -07:00
Amery Hung
7c8ca532a7 selftests/bpf: Fix task_local_data data allocation size
Currently, when allocating memory for data, size of tld_data_u->start
is not taken into account. This may cause OOB access. Fixed it by adding
the non-flexible array part of tld_data_u.

Besides, explicitly align tld_data_u->data to 8 bytes in case some
fields are added before data in the future. It could break the
assumption that every data field is 8 byte aligned and
sizeof(tld_data_u) will no longer be equal to
offsetof(struct tld_data_u, data), which we use interchangeably.

Signed-off-by: Amery Hung <ameryhung@gmail.com>
Acked-by: Sun Jian <sun.jian.kdev@gmail.com>
Link: https://lore.kernel.org/r/20260331213555.1993883-2-ameryhung@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2026-04-02 15:11:08 -07:00
Hoyeon Lee
9d77cefe8f selftests/bpf: Add test for raw-address single kprobe attach
Currently, attach_probe covers manual single-kprobe attaches by
func_name, but not the raw-address form that the PMU-based
single-kprobe path can accept.

This commit adds PERF and LINK raw-address coverage. It resolves
SYS_NANOSLEEP_KPROBE_NAME through kallsyms, passes the absolute address
in bpf_kprobe_opts.offset with func_name = NULL, and verifies that
kprobe and kretprobe are still triggered. It also verifies that LEGACY
rejects the same form.

Signed-off-by: Hoyeon Lee <hoyeon.lee@suse.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/bpf/20260401143116.185049-4-hoyeon.lee@suse.com
2026-04-02 13:23:19 -07:00
Jakub Kicinski
8ffb33d770 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Cross-merge networking fixes after downstream PR (net-7.0-rc7).

Conflicts:

net/vmw_vsock/af_vsock.c
  b18c833888 ("vsock: initialize child_ns_mode_locked in vsock_net_init()")
  0de607dc4f ("vsock: add G2H fallback for CIDs not owned by H2G transport")

Adjacent changes:

drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c
  ceee35e567 ("bnxt_en: Refactor some basic ring setup and adjustment logic")
  57cdfe0dc7 ("bnxt_en: Resize RSS contexts on channel count change")

drivers/net/wireless/intel/iwlwifi/mld/mac80211.c
  4d56037a02 ("wifi: iwlwifi: mld: block EMLSR during TDLS connections")
  687a95d204 ("wifi: iwlwifi: mld: correctly set wifi generation data")

drivers/net/wireless/intel/iwlwifi/mld/scan.h
  b6045c899e ("wifi: iwlwifi: mld: Refactor scan command handling")
  ec66ec6a5a ("wifi: iwlwifi: mld: Fix MLO scan timing")

drivers/net/wireless/intel/iwlwifi/mvm/fw.c
  078df640ef ("wifi: iwlwifi: mld: add support for iwl_mcc_allowed_ap_type_cmd v
2")
  323156c354 ("wifi: iwlwifi: mvm: don't send a 6E related command when not supported")

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-02 11:03:13 -07:00
Daniel Borkmann
e1b5687a86 selftests/bpf: Add more precision tracking tests for atomics
Add verifier precision tracking tests for BPF atomic fetch operations.
Validate that backtrack_insn correctly propagates precision from the
fetch dst_reg to the stack slot for {fetch_add,xchg,cmpxchg} atomics.
For the first two src_reg gets the old memory value, and for the last
one r0. The fetched register is used for pointer arithmetic to trigger
backtracking. Also add coverage for fetch_{or,and,xor} flavors which
exercises the bitwise atomic fetch variants going through the same
insn->imm & BPF_FETCH check but with different imm values.

Add dual-precision regression tests for fetch_add and cmpxchg where
both the fetched value and a reread of the same stack slot are tracked
for precision. After the atomic operation, the stack slot is STACK_MISC,
so the ldx does not set INSN_F_STACK_ACCESS. These tests verify that
stack precision propagates solely through the atomic fetch's load side.

Add map-based tests for fetch_add and cmpxchg which validate that non-
stack atomic fetch completes precision tracking without falling back
to mark_all_scalars_precise. Lastly, add 32-bit variants for {fetch_add,
cmpxchg} on map values to cover the second valid atomic operand size.

  # LDLIBS=-static PKG_CONFIG='pkg-config --static' ./vmtest.sh -- ./test_progs -t verifier_precision
  [...]
  + /etc/rcS.d/S50-startup
  ./test_progs -t verifier_precision
  [    1.697105] bpf_testmod: loading out-of-tree module taints kernel.
  [    1.700220] bpf_testmod: module verification failed: signature and/or required key missing - tainting kernel
  [    1.777043] tsc: Refined TSC clocksource calibration: 3407.986 MHz
  [    1.777619] clocksource: tsc: mask: 0xffffffffffffffff max_cycles: 0x311fc6d7268, max_idle_ns: 440795260133 ns
  [    1.778658] clocksource: Switched to clocksource tsc
  #633/1   verifier_precision/bpf_neg:OK
  #633/2   verifier_precision/bpf_end_to_le:OK
  #633/3   verifier_precision/bpf_end_to_be:OK
  #633/4   verifier_precision/bpf_end_bswap:OK
  #633/5   verifier_precision/bpf_load_acquire:OK
  #633/6   verifier_precision/bpf_store_release:OK
  #633/7   verifier_precision/state_loop_first_last_equal:OK
  #633/8   verifier_precision/bpf_cond_op_r10:OK
  #633/9   verifier_precision/bpf_cond_op_not_r10:OK
  #633/10  verifier_precision/bpf_atomic_fetch_add_precision:OK
  #633/11  verifier_precision/bpf_atomic_xchg_precision:OK
  #633/12  verifier_precision/bpf_atomic_fetch_or_precision:OK
  #633/13  verifier_precision/bpf_atomic_fetch_and_precision:OK
  #633/14  verifier_precision/bpf_atomic_fetch_xor_precision:OK
  #633/15  verifier_precision/bpf_atomic_cmpxchg_precision:OK
  #633/16  verifier_precision/bpf_atomic_fetch_add_dual_precision:OK
  #633/17  verifier_precision/bpf_atomic_cmpxchg_dual_precision:OK
  #633/18  verifier_precision/bpf_atomic_fetch_add_map_precision:OK
  #633/19  verifier_precision/bpf_atomic_cmpxchg_map_precision:OK
  #633/20  verifier_precision/bpf_atomic_fetch_add_32bit_precision:OK
  #633/21  verifier_precision/bpf_atomic_cmpxchg_32bit_precision:OK
  #633/22  verifier_precision/bpf_neg_2:OK
  #633/23  verifier_precision/bpf_neg_3:OK
  #633/24  verifier_precision/bpf_neg_4:OK
  #633/25  verifier_precision/bpf_neg_5:OK
  #633     verifier_precision:OK
  Summary: 1/25 PASSED, 0 SKIPPED, 0 FAILED

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/r/20260331222020.401848-2-daniel@iogearbox.net
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2026-04-02 09:57:59 -07:00
Linus Torvalds
f8f5627a8a With fixes from wireless, bluetooth and Netfilter included we're back
to each PR carrying 30%+ more fixes than in previous era. The good
 news is that so far none of the "extra" fixes are themselves
 causing real regressions. Not sure how much comfort that is.
 
 Current release - fix to a fix:
 
  - netdevsim: fix build if SKB_EXTENSIONS=n
 
  - eth: stmmac: skip VLAN restore when VLAN hash ops are missing
 
 Previous releases - regressions:
 
  - wifi: iwlwifi: mvm: don't send a 6E related command when
    not supported
 
 Previous releases - always broken:
 
  - some info leak fixes
 
  - add missing clearing of skb->cb[] on ICMP paths from tunnels
 
  - ipv6: flowlabel: defer exclusive option free until RCU teardown
 
  - ipv6: avoid overflows in ip6_datagram_send_ctl()
 
  - mpls: add seqcount to protect platform_labels from OOB access
 
  - bridge: improve safety of parsing ND options
 
  - Bluetooth: fix leaks, overflows and races in hci_sync
 
  - netfilter: add more input validation, some to address bugs directly
    some to prevent exploits from cooking up broken configurations
 
  - wifi: ath: avoid poor performance due to stopping the wrong
    aggregation session
 
  - wifi: virt_wifi: remove SET_NETDEV_DEV to avoid use-after-free
 
  - eth: fec: fix the PTP periodic output sysfs interface
 
  - eth: enetc: safely reinitialize TX BD ring when it has unsent frames
 
 Signed-off-by: Jakub Kicinski <kuba@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEE6jPA+I1ugmIBA4hXMUZtbf5SIrsFAmnOldMACgkQMUZtbf5S
 IrvaPQ/9EdZIY8AnvdgZmzVrMkTbbshpOy/lLxkpFE4yX1Hgw9BLSZqoC3rq2b41
 78Q6Zk7tbOHQb8rBLawi3+YuY+Eq5R4ajt4MNWWd1sYaaHnOXwp91jO4rvocSCjz
 8o8/Z3VU4znG+cK85mcuYqNZcar/0dI8m01136Dtoi0dtZ4KKdUBBDT/Zq7Ov3gJ
 pKrSMZBFT5UwnhlLi+xZ65KjdUMlbTujlQf0vH815p+iM+5E8fJNK5h+a6ZefXB4
 Un+jXxhD/Vj5TBwq8ZouDSAWVCAG26Yy9RGcn5O7w0mlzv48mWB1bIoXFEyc2F8s
 EbsiEqCNygHLoVTsBU1+0psYqey7aZDfceokzYMONHpJgpWbFmmHjfcFxfgeq9Of
 iI3DU7IQMBKdN7uC4dCKc94Ty9Jye+DvCnkeMUEwxV4Dkhnr+2wP0pGqo6r2K0sT
 9mFBh8YP2KyRd5+Ei8D4zmQrGpqpsXwSIwrhnGHEkWGjMAW+TltyOPzPzUgvMBHX
 XllZIAFpTFaZiR9ZZU8PRyUNRfh93AmV0tY4xYCqVArf85A/LjqmJCw6K6Pthcmw
 RzezpyQUCJ044EyDfDhjVgK/YEEkdT+wUcKKLw31pdOvQVAPJ4pI95pWbeVz4kLk
 30DE7PR+2hExm44GHUfG/v8MJTE2OkSRu26Ci4dQsm3sT2zvv2g=
 =3Pjk
 -----END PGP SIGNATURE-----

Merge tag 'net-7.0-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net

Pull networking fixes from Jakub Kicinski:
 "With fixes from wireless, bluetooth and netfilter included we're back
  to each PR carrying 30%+ more fixes than in previous era.

  The good news is that so far none of the "extra" fixes are themselves
  causing real regressions. Not sure how much comfort that is.

  Current release - fix to a fix:

   - netdevsim: fix build if SKB_EXTENSIONS=n

   - eth: stmmac: skip VLAN restore when VLAN hash ops are missing

  Previous releases - regressions:

   - wifi: iwlwifi: mvm: don't send a 6E related command when
     not supported

  Previous releases - always broken:

   - some info leak fixes

   - add missing clearing of skb->cb[] on ICMP paths from tunnels

   - ipv6:
      - flowlabel: defer exclusive option free until RCU teardown
      - avoid overflows in ip6_datagram_send_ctl()

   - mpls: add seqcount to protect platform_labels from OOB access

   - bridge: improve safety of parsing ND options

   - bluetooth: fix leaks, overflows and races in hci_sync

   - netfilter: add more input validation, some to address bugs directly
     some to prevent exploits from cooking up broken configurations

   - wifi:
      - ath: avoid poor performance due to stopping the wrong
        aggregation session
      - virt_wifi: remove SET_NETDEV_DEV to avoid use-after-free

   - eth:
      - fec: fix the PTP periodic output sysfs interface
      - enetc: safely reinitialize TX BD ring when it has unsent frames"

* tag 'net-7.0-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (95 commits)
  eth: fbnic: Increase FBNIC_QUEUE_SIZE_MIN to 64
  ipv6: avoid overflows in ip6_datagram_send_ctl()
  net: hsr: fix VLAN add unwind on slave errors
  net: hsr: serialize seq_blocks merge across nodes
  vsock: initialize child_ns_mode_locked in vsock_net_init()
  selftests/tc-testing: add tests for cls_fw and cls_flow on shared blocks
  net/sched: cls_flow: fix NULL pointer dereference on shared blocks
  net/sched: cls_fw: fix NULL pointer dereference on shared blocks
  net/x25: Fix overflow when accumulating packets
  net/x25: Fix potential double free of skb
  bnxt_en: Restore default stat ctxs for ULP when resource is available
  bnxt_en: Don't assume XDP is never enabled in bnxt_init_dflt_ring_mode()
  bnxt_en: Refactor some basic ring setup and adjustment logic
  net/mlx5: Fix switchdev mode rollback in case of failure
  net/mlx5: Avoid "No data available" when FW version queries fail
  net/mlx5: lag: Check for LAG device before creating debugfs
  net: macb: properly unregister fixed rate clocks
  net: macb: fix clk handling on PCI glue driver removal
  virtio_net: clamp rss_max_key_size to NETDEV_RSS_KEY_LEN
  net/sched: sch_netem: fix out-of-bounds access in packet corruption
  ...
2026-04-02 09:57:06 -07:00
Leon Hwang
da77f3a9aa selftests/bpf: Add test to verify the fix of kprobe_write_ctx abuse
Add a test to verify the issue: kprobe_write_ctx can be abused to modify
struct pt_regs of kernel functions via kprobe_write_ctx=true freplace
progs.

Without the fix, the issue is verified:

kprobe_write_ctx=true freplace prog is allowed to attach to
kprobe_write_ctx=false kprobe prog. Then, the first arg of
bpf_fentry_test1 will be set as 0, and bpf_prog_test_run_opts() gets
-EFAULT instead of 0.

With the fix, the issue is rejected at attach time.

Acked-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Leon Hwang <leon.hwang@linux.dev>
Link: https://lore.kernel.org/r/20260331145353.87606-3-leon.hwang@linux.dev
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2026-04-02 09:29:49 -07:00
Mayuresh Chitale
1ec8bea903 KVM: riscv: selftests: Implement kvm_arch_has_default_irqchip
kvm_arch_has_default_irqchip is required for irqfd_test and returns
true if an in-kernel interrupt controller is supported.

Fixes: a133052666 ("KVM: selftests: Fix irqfd_test for non-x86 architectures")
Signed-off-by: Mayuresh Chitale <mayuresh.chitale@oss.qualcomm.com>
Link: https://lore.kernel.org/r/20260402101818.2982071-1-mayuresh.chitale@oss.qualcomm.com
Signed-off-by: Anup Patel <anup@brainfault.org>
2026-04-02 21:31:15 +05:30
Vincent Donnefort
ec07906bdc tracing: selftests: Extend hotplug testing for trace remotes
The hotplug testing only tries reading a trace remote buffer, loaded
before a CPU is offline. Extend this testing to cover:

  * A trace remote buffer loaded after a CPU is offline.
  * A trace remote buffer loaded before a CPU is online.

Because of these added test cases, move the hotplug testing into a
separate hotplug.tc file.

Acked-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Signed-off-by: Vincent Donnefort <vdonnefort@google.com>
Link: https://patch.msgid.link/20260401045100.3394299-3-vdonnefort@google.com
Signed-off-by: Marc Zyngier <maz@kernel.org>
2026-04-02 14:16:09 +01:00
Xiang Mei
70f73562d2 selftests/tc-testing: add tests for cls_fw and cls_flow on shared blocks
Regression tests for the shared-block NULL derefs fixed in the previous
two patches:

  - fw: attempt to attach an empty fw filter to a shared block and
    verify the configuration is rejected with EINVAL.
  - flow: create a flow filter on a shared block without a baseclass
    and verify the configuration is rejected with EINVAL.

Signed-off-by: Xiang Mei <xmei5@asu.edu>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Reviewed-by: Victor Nogueira <victor@mojatatu.com>
Link: https://patch.msgid.link/20260331050217.504278-3-xmei5@asu.edu
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2026-04-02 15:08:42 +02:00
Ioana Ciornei
3016574ea2 selftests: drivers: hw: add test for the ethtool standard counters
Add a new selftest - ethtool_std_stats.sh - which validates the
eth-ctrl, eth-mac and pause standard statistics exported by an
interface. Collision related eth-mac counters as well as the error ones
will be checked against zero since that is the most likely correct
scenario.

The central part of this patch is the traffic_test() function which
gathers the 'before' counter values, sends a batch of traffic and then
interrogates again the same counters in order to determine if the delta
is on target. The function receives an array through which the caller
can request what counters to be interrogated and, for each of them, what
is their target delta value.

The output from this selftest looks as follows on a LX2160ARDB board:

 $ ./run_kselftest.sh -t drivers/net/hw:ethtool_std_stats.sh
 TAP version 13
 1..1
 # timeout set to 0
 # selftests: drivers/net/hw: ethtool_std_stats.sh
 # TAP version 13
 # 1..26
 # ok 1 ethtool_std_stats.eth-ctrl-MACControlFramesTransmitted
 # ok 2 ethtool_std_stats.eth-ctrl-MACControlFramesReceived
 # ok 3 ethtool_std_stats.eth-mac-FrameCheckSequenceErrors
 # ok 4 ethtool_std_stats.eth-mac-AlignmentErrors
 # ok 5 ethtool_std_stats.eth-mac-FramesLostDueToIntMACXmitError
 # ok 6 ethtool_std_stats.eth-mac-CarrierSenseErrors # SKIP
 # ok 7 ethtool_std_stats.eth-mac-FramesLostDueToIntMACRcvError
 # ok 8 ethtool_std_stats.eth-mac-InRangeLengthErrors # SKIP
 # ok 9 ethtool_std_stats.eth-mac-OutOfRangeLengthField # SKIP
 # ok 10 ethtool_std_stats.eth-mac-FrameTooLongErrors # SKIP
 # ok 11 ethtool_std_stats.eth-mac-FramesAbortedDueToXSColls # SKIP
 # ok 12 ethtool_std_stats.eth-mac-SingleCollisionFrames # SKIP
 # ok 13 ethtool_std_stats.eth-mac-MultipleCollisionFrames # SKIP
 # ok 14 ethtool_std_stats.eth-mac-FramesWithDeferredXmissions # SKIP
 # ok 15 ethtool_std_stats.eth-mac-LateCollisions # SKIP
 # ok 16 ethtool_std_stats.eth-mac-FramesWithExcessiveDeferral # SKIP
 # ok 17 ethtool_std_stats.eth-mac-BroadcastFramesXmittedOK
 # ok 18 ethtool_std_stats.eth-mac-OctetsTransmittedOK
 # ok 19 ethtool_std_stats.eth-mac-BroadcastFramesReceivedOK
 # ok 20 ethtool_std_stats.eth-mac-OctetsReceivedOK
 # ok 21 ethtool_std_stats.eth-mac-FramesTransmittedOK
 # ok 22 ethtool_std_stats.eth-mac-MulticastFramesXmittedOK
 # ok 23 ethtool_std_stats.eth-mac-FramesReceivedOK
 # ok 24 ethtool_std_stats.eth-mac-MulticastFramesReceivedOK
 # ok 25 ethtool_std_stats.pause-tx_pause_frames
 # ok 26 ethtool_std_stats.pause-rx_pause_frames
 # # 10 skipped test(s) detected.  Consider enabling relevant config options to improve coverage.
 # # Totals: pass:16 fail:0 xfail:0 xpass:0 skip:10 error:0
 ok 1 selftests: drivers/net/hw: ethtool_std_stats.sh

Please note that not all MACs are counting the software injected pause
frames as real Tx pause. For example, on a LS1028ARDB the selftest
output will reflect the fact that neither the ENETC MAC, nor the Felix
switch MAC are able to detect Tx pause frames injected by software.

 $ ./run_kselftest.sh -t drivers/net/hw:ethtool_std_stats.sh
 (...)
 # # software sent pause frames not detected
 # ok 25 ethtool_std_stats.pause-tx_pause_frames # XFAIL
 # ok 26 ethtool_std_stats.pause-rx_pause_frames

Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Acked-by: Petr Machata <petrm@nvidia.com>
Link: https://patch.msgid.link/20260330152933.2195885-10-ioana.ciornei@nxp.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2026-04-02 12:11:04 +02:00
Ioana Ciornei
abe4929bc7 selftests: drivers: hw: update ethtool_rmon to work with a single local interface
This patch finalizes the transition to work with a single local
interface for the ethtool_rmon.sh test. Each 'ip link' and 'ethtool'
command used by the test is annotated with the necessary run_on in
order to be executed on the necessary target system, be it local, in
another network namespace or through ssh.

Since we need NETIF up and running also for control traffic, we now
expect that the interfaces are up and running and do not touch bring
them up or down at the end of the test. This is also documented in the
drivers/net/README.rst.

The ethtool_rmon.sh script can still be used in the older fashion by
passing two interfaces as command line arguments, the only restriction
is that those interfaces need to be already up.

 $ DRIVER_TEST_CONFORMANT=no ./ethtool_rmon.sh eth0 eth1

As part of the kselftest infrastructure, this test can be run in the
following manner:

 $ make -C tools/testing/selftests/ TARGETS="drivers/net drivers/net/hw" \
 install INSTALL_PATH=/tmp/ksft-net-drv
 $ cd /tmp/ksft-net-drv/
 $ cat > ./drivers/net/net.config <<EOF
 NETIF=endpmac17
 LOCAL_V4=17.0.0.1
 REMOTE_V4=17.0.0.2
 REMOTE_TYPE=ssh
 REMOTE_ARGS=root@192.168.5.200
 EOF

 $ ./run_kselftest.sh -t drivers/net/hw:ethtool_rmon.sh
 TAP version 13
 1..1
 # timeout set to 0
 # selftests: drivers/net/hw: ethtool_rmon.sh
 # TAP version 13
 # 1..14
 # ok 1 ethtool_rmon.rx-pkts64to64
 # ok 2 ethtool_rmon.rx-pkts65to127
 # ok 3 ethtool_rmon.rx-pkts128to255
 # ok 4 ethtool_rmon.rx-pkts256to511
 # ok 5 ethtool_rmon.rx-pkts512to1023
 # ok 6 ethtool_rmon.rx-pkts1024to1518
 # ok 7 ethtool_rmon.rx-pkts1519to10240
 # ok 8 ethtool_rmon.tx-pkts64to64
 # ok 9 ethtool_rmon.tx-pkts65to127
 # ok 10 ethtool_rmon.tx-pkts128to255
 # ok 11 ethtool_rmon.tx-pkts256to511
 # ok 12 ethtool_rmon.tx-pkts512to1023
 # ok 13 ethtool_rmon.tx-pkts1024to1518
 # ok 14 ethtool_rmon.tx-pkts1519to10240
 # # Totals: pass:14 fail:0 xfail:0 xpass:0 skip:0 error:0
 ok 1 selftests: drivers/net/hw: ethtool_rmon.sh

Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Link: https://patch.msgid.link/20260330152933.2195885-9-ioana.ciornei@nxp.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2026-04-02 12:11:04 +02:00
Ioana Ciornei
eec1b9057c selftests: drivers: hw: move to KTAP output
Update the ethtool_rmon.sh test so that it uses the KTAP format for its
output. This is achieved by using the helpers found in ktap_helpers.sh.
An example output can be found below.

 $ ./ethtool_rmon.sh endpmac3 endpmac4
 TAP version 13
 1..14
 ok 1 ethtool_rmon.rx-pkts64to64
 ok 2 ethtool_rmon.rx-pkts65to127
 ok 3 ethtool_rmon.rx-pkts128to255
 ok 4 ethtool_rmon.rx-pkts256to511
 ok 5 ethtool_rmon.rx-pkts512to1023
 ok 6 ethtool_rmon.rx-pkts1024to1518
 ok 7 ethtool_rmon.rx-pkts1519to10240
 ok 8 ethtool_rmon.tx-pkts64to64
 ok 9 ethtool_rmon.tx-pkts65to127
 ok 10 ethtool_rmon.tx-pkts128to255
 ok 11 ethtool_rmon.tx-pkts256to511
 ok 12 ethtool_rmon.tx-pkts512to1023
 ok 13 ethtool_rmon.tx-pkts1024to1518
 ok 14 ethtool_rmon.tx-pkts1519to10240
 # Totals: pass:14 fail:0 xfail:0 xpass:0 skip:0 error:0

Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Link: https://patch.msgid.link/20260330152933.2195885-8-ioana.ciornei@nxp.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2026-04-02 12:11:04 +02:00
Ioana Ciornei
557c067c00 selftests: drivers: hw: replace counter upper limit with UINT32_MAX in rmon test
The ethtool_rmon.sh script checks that the number of packets sent /
received during a test matches the expected value with a 1% tolerance.

Since in the next patches this test will gain the capability to also be
run on systems with a single interface where the traffic generator is
accesible through ssh, use the UINT32_MAX as the upper limit. This is
necessary since the same interface will be used also for control traffic
(the ssh commands) as well as the mausezahn generated one.

Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Link: https://patch.msgid.link/20260330152933.2195885-7-ioana.ciornei@nxp.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2026-04-02 12:11:04 +02:00
Ioana Ciornei
e76c71483a selftests: drivers: hw: test rmon counters only on first interface
The selftests in drivers/net are slowly transitioning to being able to
be used on systems with a single network interface. The first step for the
ethtool_rmon.sh test is to only validate that the rmon counters are
properly exported on the first interface supplied as an argument.

Remove the rmon_histogram calls which intend to test also the rmon
counters on the 2nd interface. This also removes the need for the remote
system, which should be used only to inject traffic, to also support
rmon counters.

Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Link: https://patch.msgid.link/20260330152933.2195885-6-ioana.ciornei@nxp.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2026-04-02 12:11:03 +02:00
Ioana Ciornei
afd1b9bf75 selftests: drivers: hw: cleanup shellcheck warnings in the rmon test
If run on the ethtool_rmon.sh script, shellcheck generates a bunch of
false positive errors. Suppress those checks that generate them.

Also cleanup the remaining warnings by using double quoting around the
used variables.

Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Link: https://patch.msgid.link/20260330152933.2195885-5-ioana.ciornei@nxp.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2026-04-02 12:11:03 +02:00
Ioana Ciornei
7db9da4c67 selftests: net: update some helpers to use run_on
Update some helpers so that they are capable to run commands on
different targets than the local one. This patch makes the necesasy
modification for those helpers / sections of code which are needed for
the ethtool_rmon.sh test that will be converted in the next patches.

For example, mac_addr_prepare() and mac_addr_restore() used when
STABLE_MAC_ADDRS=yes need to ensure stable MAC addresses on interfaces
located even in other namespaces. In order to do that, append the 'ip
link' commands with a 'run_on $dev' tag.

The same run_on is necessary also when verifying if all the interfaces
listed in NETIFS are indeed available.

Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Link: https://patch.msgid.link/20260330152933.2195885-4-ioana.ciornei@nxp.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2026-04-02 12:11:03 +02:00
Ioana Ciornei
2de16ebe78 selftests: net: extend lib.sh to parse drivers/net/net.config
Extend lib.sh so that it's able to parse driver/net/net.config and
environment variables such as NETIF, REMOTE_TYPE, LOCAL_V4 etc described
in drivers/net/README.rst.

In order to make the transition towards running with a single local
interface smoother for the bash networking driver tests, beside sourcing
the net.config file also translate the new env variables into the old
style based on the NETIFS array. Since the NETIFS array only holds the
network interface names, also add a new array - TARGETS - which keeps
track of the target on which a specific interfaces resides - local,
netns or accesible through an ssh command.

For example, a net.config which looks like below:

	NETIF=eth0
	LOCAL_V4=192.168.1.1
	REMOTE_V4=192.168.1.2
	REMOTE_TYPE=ssh
	REMOTE_ARGS=root@192.168.1.2

will generate the NETIFS and TARGETS arrays with the following data.

	NETIFS[p1]="eth0"
	NETIFS[p2]="eth2"

	TARGETS[eth0]="local:"
	TARGETS[eth2]="ssh:root@192.168.1.2"

The above will be true if on the remote target, the interface which has
the 192.168.1.2 address is named eth2.

Since the TARGETS array is indexed by the network interface name,
document a new restriction README.rst which states that the remote
interface cannot have the same name as the local one. Keep the old way
of populating the NETIFS variable based on the command line arguments.
This will be invoked in case DRIVER_TEST_CONFORMANT = "no".

Also add a couple of helpers which can be used by tests which need to
run a specific bash command on a different target than the local system,
be it either another netns or a remote system accessible through ssh.
The __run_on() function is passed through $1 the target on which the
command should be executed while run_on() is passed the name of the
interface that is then used to retrieve the target from the TARGETS
array.

Also add a stub run_on() function in net/lib.sh so that users of the
net/lib.sh are going through the stub only since neither NETIFS nor
TARGETS are valid in that circumstance.

Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Link: https://patch.msgid.link/20260330152933.2195885-3-ioana.ciornei@nxp.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2026-04-02 12:11:03 +02:00
Ioana Ciornei
4799ff80c6 selftests: forwarding: extend ethtool_std_stats_get with pause statistics
Even though pause frame statistics are not exported through the same
ethtool command, there is no point in adding another helper just for
them. Extent the ethtool_std_stats_get() function so that we are able to
interrogate using the same helper all the standard statistics.

And since we are touching the function, convert the initial ethtool call
as well to the jq --arg form in order to be easier to read.

Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Link: https://patch.msgid.link/20260330152933.2195885-2-ioana.ciornei@nxp.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2026-04-02 12:11:03 +02:00
Zhenzhong Duan
c9587216d9 iommufd/selftest: Test dirty tracking on PASID
Add test case for dirty tracking on a domain attached to PASID, also
confirm attachment to PASID fail if device doesn't support dirty tracking.

Suggested-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Yi Liu <yi.l.liu@intel.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Link: https://lore.kernel.org/r/20260330101108.12594-5-zhenzhong.duan@intel.com
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2026-04-02 09:26:05 +02:00
Jakub Kicinski
6c3dec3e3d bpf-next-for-netdev
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQQ6NaUOruQGUkvPdG4raS+Z+3y5EwUCac2n1wAKCRAraS+Z+3y5
 E7INAPwOyqMJws2kswrIPZ8jqfaBIcNVe9MM9a9Ldp8qZmWUHAD/ayqW4hHP6eMA
 WBNcVCDGStYeI4lyINS5AqPN8mMhOAI=
 =iquL
 -----END PGP SIGNATURE-----

Merge tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next

Martin KaFai Lau says:

====================
pull-request: bpf-next 2026-04-01

We've added 2 non-merge commits during the last 2 day(s) which contain
a total of 3 files changed, 139 insertions(+), 23 deletions(-).

The main changes are:

1) skb_dst_drop(skb) when bpf prog does a encap or decap,
   from Jakub Kicinski

* tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next:
  selftests/bpf: Test that dst is cleared on same-protocol encap
  net: Clear the dst when performing encap / decap
====================

Link: https://patch.msgid.link/20260401233956.4133413-1-martin.lau@linux.dev
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-01 19:02:44 -07:00
Zenghui Yu (Huawei)
03db5f05d4 KVM: arm64: selftests: Avoid testing the IMPDEF behavior
It turned out that we can't really force KVM to use the "slow" path when
emulating AT instructions [1]. We should therefore avoid testing the IMPDEF
behavior (i.e., TEST_ACCESS_FLAG - address translation instructions are
permitted to update AF but not required).

Remove it and improve the comment a bit.

[1] https://lore.kernel.org/r/b951dcfb-0ad1-4d7b-b6ce-d54b272dd9be@linux.dev

Signed-off-by: Zenghui Yu (Huawei) <zenghui.yu@linux.dev>
Link: https://patch.msgid.link/20260317131558.52751-1-zenghui.yu@linux.dev
Signed-off-by: Marc Zyngier <maz@kernel.org>
2026-04-01 17:29:21 +01:00
David Laight
1dff9ac2c8 tools/nolibc/printf: Support negative variable width and precision
For (eg) "%*.*s" treat a negative field width as a request to left align
the output (the same as the '-' flag), and a negative precision to
request the default precision.

Set the default precision to -1 (not INT_MAX) and add explicit checks
to the string handling for negative values (makes the tet unsigned).

For numeric output check for 'precision >= 0' instead of testing
_NOLIBC_PF_FLAGS_CONTAIN(flags, '.').
This needs an inverted test, some extra goto and removes an indentation.
The changed conditionals fix printf("%0-#o", 0) - but '0' and '-' shouldn't
both be specified.

Signed-off-by: David Laight <david.laight.linux@gmail.com>
Link: https://patch.msgid.link/20260323112247.3196-1-david.laight.linux@gmail.com
Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
2026-04-01 18:02:37 +02:00
Dan Williams
549b5c12ef tools/testing/cxl: Test dax_hmem takeover of CXL regions
When platform firmware is committed to publishing EFI_CONVENTIONAL_MEMORY
in the memory map, but CXL fails to assemble the region, dax_hmem can
attempt to attach a dax device to the memory range.

Take advantage of the new ability to support multiple "hmem_platform"
devices, and to enable regression testing of several scenarios:

* CXL correctly assembles a region, check dax_hmem fails to attach dax
* CXL fails to assemble a region, check dax_hmem successfully attaches dax
* Check that loading the dax_cxl driver loads the dax_hmem driver
* Attempt to race cxl_mock_mem async probe vs dax_hmem probe flushing.
  Check that both positive and negative cases.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Alison Schofield <alison.schofield@intel.com>
Tested-by: Alison Schofield <alison.schofield@intel.com>
Link: https://patch.msgid.link/20260327052821.440749-10-dan.j.williams@intel.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2026-04-01 08:12:18 -07:00
Dan Williams
78b8f1a7a4 tools/testing/cxl: Simulate auto-assembly failure
Add a cxl_test module option to skip setting up one of the members of the
default auto-assembled region.

This simulates a device failing between firmware setup and OS boot, or
region configuration interrupted by an event like kexec.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Alison Schofield <alison.schofield@intel.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Link: https://patch.msgid.link/20260327052821.440749-9-dan.j.williams@intel.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2026-04-01 08:12:18 -07:00
Marc Zyngier
b3265a1b2b KVM: arm64: set_id_regs: Allow GICv3 support to be set at runtime
set_id_regs creates a GIC3 guest when possible, and then proceeds
to write the ID registers as if they were not affected by the presence
of a GIC. As it turns out, ID_AA64PFR1_EL1 is the proof of the
contrary.

KVM now makes a point in exposing the GIC support to the guest,
no matter what userspace says (userspace such as QEMU is known to
write silly things at times).

Accommodate for this level of nonsense by teaching set_id_regs about
fields that are mutable, and only compare registers that have been
re-sanitised first.

Reported-by: Mark Brown <broonie@kernel.org>
Link: https://patch.msgid.link/20260401103611.357092-17-maz@kernel.org
Signed-off-by: Marc Zyngier <maz@kernel.org>
2026-04-01 15:42:26 +01:00
Takashi Iwai
0542972950 Merge branch 'for-linus' into for-next
Pull 7.0 devel branch for further cleanups of ctxfi driver & co.

Signed-off-by: Takashi Iwai <tiwai@suse.de>
2026-04-01 14:43:00 +02:00
Mike Rapoport (Microsoft)
0510bdab53 mm: move free_reserved_area() to mm/memblock.c
free_reserved_area() is related to memblock as it frees reserved memory
back to the buddy allocator, similar to what memblock_free_late() does.

Move free_reserved_area() to mm/memblock.c to prepare for further
consolidation of the functions that free reserved memory.

No functional changes.

Link: https://patch.msgid.link/20260323074836.3653702-5-rppt@kernel.org
Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Acked-by: Vlastimil Babka (SUSE) <vbabka@kernel.org>
2026-04-01 11:19:46 +03:00
Mike Rapoport (Microsoft)
8b7b85384f memblock: move reserve_bootmem_range() to memblock.c and make it static
reserve_bootmem_region() is only called from
memmap_init_reserved_pages() and it was in mm/mm_init.c because of its
dependecies on static init_deferred_page().

Since init_deferred_page() is not static anymore, move
reserve_bootmem_region(), rename it to memmap_init_reserved_range() and
make it static.

Update the comment describing it to better reflect what the function
does and drop bogus comment about reserved pages in free_bootmem_page().

Update memblock test stubs to reflect the core changes.

Reviewed-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org>
Reviewed-by: David Hildenbrand (Arm) <david@kernel.org>
Link: https://patch.msgid.link/20260323072042.3651061-1-rppt@kernel.org
Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
2026-04-01 11:19:45 +03:00
Guilherme G. Piccoli
0709682cdb memblock: Add reserve_mem debugfs info
When using the "reserve_mem" parameter, users aim at having an
area that (hopefully) persists across boots, so pstore infrastructure
(like ramoops module) can make use of that to save oops/ftrace logs,
for example.

There is no easy way to determine if this kernel parameter is properly
set though; the kernel doesn't show information about this memory in
memblock debugfs, neither in /proc/iomem nor dmesg. This is a relevant
information for tools like kdumpst[0], to determine if it's reliable
to use the reserved area as ramoops persistent storage; checking only
/proc/cmdline is not sufficient as it doesn't tell if the reservation
effectively succeeded or not.

Add here a new file under memblock debugfs showing properly set memory
reservations, with name and size as passed to "reserve_mem". Notice that
if no "reserve_mem=" is passed on command-line or if the reservation
attempts fail, the file is not created.

[0] https://aur.archlinux.org/packages/kdumpst

Reviewed-by: SeongJae Park <sj@kernel.org>
Signed-off-by: Guilherme G. Piccoli <gpiccoli@igalia.com>
Link: https://patch.msgid.link/20260324012839.1991765-2-gpiccoli@igalia.com
Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
2026-04-01 11:19:43 +03:00
Amit Machhiwal
6e65886fce selftests/powerpc: Suppress -Wmaybe-uninitialized with GCC 15
GCC 15 reports the below false positive '-Wmaybe-uninitialized' warning
in vphn_unpack_associativity() when building the powerpc selftests.

  # make -C tools/testing/selftests TARGETS="powerpc"
  [...]
    CC       test-vphn
  In file included from test-vphn.c:3:
  In function ‘vphn_unpack_associativity’,
      inlined from ‘test_one’ at test-vphn.c:371:2,
      inlined from ‘test_vphn’ at test-vphn.c:399:9:
  test-vphn.c:10:33: error: ‘be_packed’ may be used uninitialized [-Werror=maybe-uninitialized]
     10 | #define be16_to_cpup(x)         bswap_16(*x)
        |                                 ^~~~~~~~
  vphn.c:42:27: note: in expansion of macro ‘be16_to_cpup’
     42 |                 u16 new = be16_to_cpup(field++);
        |                           ^~~~~~~~~~~~
  In file included from test-vphn.c:19:
  vphn.c: In function ‘test_vphn’:
  vphn.c:27:16: note: ‘be_packed’ declared here
     27 |         __be64 be_packed[VPHN_REGISTER_COUNT];
        |                ^~~~~~~~~
  cc1: all warnings being treated as errors

When vphn_unpack_associativity() is called from hcall_vphn() in kernel
the error is not seen while building vphn.c during kernel compilation.
This is because the top level Makefile includes '-fno-strict-aliasing'
flag always.

The issue here is that GCC 15 emits '-Wmaybe-uninitialized' due to type
punning between __be64[] and __b16* when accessing the buffer via
be16_to_cpup(). The underlying object is fully initialized but GCC 15
fails to track the aliasing due to the strict aliasing violation here.
Please refer [1] and [2]. This results in a false positive warning which
is promoted to an error under '-Werror'. This problem is not seen when
the compilation is performed with GCC 13 and 14. An issue [1] has also
been created on GCC bugzilla.

The selftest compiles fine with '-fno-strict-aliasing'. Since this GCC
flag is used to compile vphn.c in kernel too, the same flag should be
used to build vphn tests when compiling vphn.c in the selftest as well.

Fix this by including '-fno-strict-aliasing' during vphn.c compilation
in the selftest. This keeps the build working while limiting the scope
of the suppression to building vphn tests.

[1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=124427
[2] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99768

Fixes: 58dae82843 ("selftests/powerpc: Add test for VPHN")
Reviewed-by: Vaibhav Jain <vaibhav@linux.ibm.com>
Signed-off-by: Amit Machhiwal <amachhiw@linux.ibm.com>
Tested-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com>
Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Link: https://patch.msgid.link/20260313165426.43259-1-amachhiw@linux.ibm.com
2026-04-01 09:21:04 +05:30
Jakub Kicinski
f843687c30 selftests: drv-net: update the README with variants
Test authors need to know about variants, existing tests don't use
them because variants are relatively recent.

Reviewed-by: Joe Damato <joe@dama.to>
Link: https://patch.msgid.link/20260331001930.3411279-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-03-31 19:18:20 -07:00
Kees Cook
cf2f06f715 lkdtm/fortify: Drop unneeded FORTIFY_STR_OBJECT test
The str* family of fortified functions all use member-sized limits
for a while now, so the FORTIFY_STR_OBJECT test is redundant to
FORTIFY_STR_MEMBER. While here, replace the strncpy() use with strscpy(),
as strncpy() is being removed.

Link: https://patch.msgid.link/20260324020726.work.624-kees@kernel.org
Signed-off-by: Kees Cook <kees@kernel.org>
2026-03-31 16:53:47 -07:00
Mykyta Yatsenko
0eeb0094ba selftests/bpf: Suppress veristat error messages in non-verbose mode
When running veristat across many BPF objects, expected load failures
produce noisy stderr output that obscures actual issues. Gate these
diagnostic messages behind --verbose.

Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20260331172634.57402-2-mykyta.yatsenko5@gmail.com
2026-03-31 15:55:47 -07:00
Menglong Dong
3e6475dc60 selftests/bpf: Test access to ringbuf position with map pointer
Add the testing to access the bpf_ringbuf with the map pointer.
"consumer_pos" and "producer_pos" is accessed in this testing. We reserve
128 bytes in the ringbuf to test the producer_pos, which should be
"128 + BPF_RINGBUF_HDR_SZ".

It will be helpful if we want to evaluate the usage of the ringbuf in bpf
prog with the consumer and producer position.

Signed-off-by: Menglong Dong <dongml2@chinatelecom.cn>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Reviewed-by: Emil Tsalapatis <emil@etsalapatis.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: Leon Hwang <leon.hwang@linux.dev>
Link: https://lore.kernel.org/bpf/20260331070434.10037-1-dongml2@chinatelecom.cn
2026-03-31 15:47:14 -07:00
Ricardo B. Marlière
dbb6153c13 selftests/run_kselftest.sh: Remove unused $ROOT
Fix the following shellcheck warning:

ROOT appears unused. Verify use (or export if used externally). [SC2034]

Signed-off-by: Ricardo B. Marlière <rbm@suse.com>
Link: https://lore.kernel.org/r/20260320-selftests-fixes-v1-1-79144f76be01@suse.com
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2026-03-31 16:31:07 -06:00
Dmytro Maluka
9d2dbd3d59 selftests/cpu-hotplug: Fix check for cpu hotplug not supported
If CONFIG_HOTPLUG_CPU is disabled, /sys/devices/system/cpu/cpu*
directories are still populated, so the test fails to correctly detect
that CPU hotplug is not supported.

Fix this by checking for the presence of 'online' files in those
directories instead. The 'online' node is created for the given CPU if
and only if this CPU supports hotplug. So if none of the CPUs have
'online' nodes, it means CPU hotplug is not supported.

Signed-off-by: Dmytro Maluka <dmaluka@chromium.org>
Link: https://lore.kernel.org/r/20260319153825.2813576-1-dmaluka@chromium.org
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2026-03-31 16:07:19 -06:00
Linus Torvalds
9147566d80 sched_ext: Fixes for v7.0-rc6
- Fix SCX_KICK_WAIT deadlock where multiple CPUs waiting for each other in
   hardirq context form a cycle. Move the wait to a balance callback which
   can drop the rq lock and process IPIs.
 
 - Fix inconsistent NUMA node lookup in scx_select_cpu_dfl() where the
   waker_node used cpu_to_node() while prev_cpu used
   scx_cpu_node_if_enabled(), leading to undefined behavior when per-node
   idle tracking is disabled.
 -----BEGIN PGP SIGNATURE-----
 
 iIQEABYKACwWIQTfIjM1kS57o3GsC/uxYfJx3gVYGQUCacwiiQ4cdGpAa2VybmVs
 Lm9yZwAKCRCxYfJx3gVYGVILAP44s30JBpNyJ9JhAiCoTYzxzOXqqGbotnpQckMF
 +7WoJAD/Z9dJO/Sw/AH0fX6WVJDmO0QsQvFXLXJBxWy7A5XVAA0=
 =2DW5
 -----END PGP SIGNATURE-----

Merge tag 'sched_ext-for-7.0-rc6-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/sched_ext

Pull sched_ext fixes from Tejun Heo:

 - Fix SCX_KICK_WAIT deadlock where multiple CPUs waiting for each other
   in hardirq context form a cycle. Move the wait to a balance callback
   which can drop the rq lock and process IPIs.

 - Fix inconsistent NUMA node lookup in scx_select_cpu_dfl() where
   the waker_node used cpu_to_node() while prev_cpu used
   scx_cpu_node_if_enabled(), leading to undefined behavior when
   per-node idle tracking is disabled.

* tag 'sched_ext-for-7.0-rc6-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/sched_ext:
  selftests/sched_ext: Add cyclic SCX_KICK_WAIT stress test
  sched_ext: Fix SCX_KICK_WAIT deadlock by deferring wait to balance callback
  sched_ext: Fix inconsistent NUMA node lookup in scx_select_cpu_dfl()
2026-03-31 14:23:12 -07:00
Simon Liebold
64fac99037 selftests/mqueue: Fix incorrectly named file
Commit 85506aca2e ("selftests/mqueue: Set timeout to 180 seconds")
intended to increase the timeout for mq_perf_tests from the default
kselftest limit of 45 seconds to 180 seconds.

Unfortunately, the file storing this information was incorrectly named
`setting` instead of `settings`, causing the kselftest runner not to
pick up the limit and keep using the default 45 seconds limit.

Fix this by renaming it to `settings` to ensure that the kselftest
runner uses the increased timeout of 180 seconds for this test.

Fixes: 85506aca2e ("selftests/mqueue: Set timeout to 180 seconds")
Cc: <stable@vger.kernel.org> # 5.10.y
Signed-off-by: Simon Liebold <simonlie@amazon.de>
Link: https://lore.kernel.org/r/20260312140200.2224850-1-simonlie@amazon.de
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2026-03-31 15:01:51 -06:00
Linus Torvalds
53d85a2056 cgroup: Fixes for v7.0-rc6
- Fix cgroup rmdir racing with dying tasks. Deferred task cgroup unlink
   introduced a window where cgroup.procs is empty but the cgroup is still
   populated, causing rmdir to fail with -EBUSY and selftest failures. Make
   rmdir wait for dying tasks to fully leave and fix selftests to not depend
   on synchronous populated updates.
 
 - Fix cpuset v1 task migration failure from empty cpusets under strict
   security policies. When CPU hotplug removes the last CPU from a v1
   cpuset, tasks must be migrated to an ancestor without a
   security_task_setscheduler() check that would block the migration.
 -----BEGIN PGP SIGNATURE-----
 
 iIQEABYKACwWIQTfIjM1kS57o3GsC/uxYfJx3gVYGQUCacwibg4cdGpAa2VybmVs
 Lm9yZwAKCRCxYfJx3gVYGXHEAP98nVEKyl7c7+sXYtwOPn8KEhdHkdpHyPZwhpS2
 1wLhaQEAm8yO49s7IgvGPWSz0s/gQdmF5/x8RAee0sJsZALvGQg=
 =bUUt
 -----END PGP SIGNATURE-----

Merge tag 'cgroup-for-7.0-rc6-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup

Pull cgroup fixes from Tejun Heo:

 - Fix cgroup rmdir racing with dying tasks.

   Deferred task cgroup unlink introduced a window where cgroup.procs
   is empty but the cgroup is still populated, causing rmdir to fail
   with -EBUSY and selftest failures.

   Make rmdir wait for dying tasks to fully leave and fix selftests to
   not depend on synchronous populated updates.

 - Fix cpuset v1 task migration failure from empty cpusets under strict
   security policies.

   When CPU hotplug removes the last CPU from a v1 cpuset, tasks must be
   migrated to an ancestor without a security_task_setscheduler() check
   that would block the migration.

* tag 'cgroup-for-7.0-rc6-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
  cgroup/cpuset: Skip security check for hotplug induced v1 task migration
  cgroup/cpuset: Simplify setsched decision check in task iteration loop of cpuset_can_attach()
  cgroup: Fix cgroup_drain_dying() testing the wrong condition
  selftests/cgroup: Don't require synchronous populated update on task exit
  cgroup: Wait for dying tasks to leave on rmdir
2026-03-31 13:59:51 -07:00
Hangbin Liu
2964f6b816 selftests: Use ktap helpers for runner.sh
Instead of manually writing ktap messages, we should use the formal
ktap helpers in runner.sh. Brendan did some work in commit d9e6269e33
("selftests/run_kselftest.sh: exit with error if tests fail") to make
run_kselftest.sh exit with the correct return value. However, the output
does not include the total results, such as how many tests passed or failed.

Let’s convert all manually printed messages in runner.sh to use the
formal ktap helpers. Here are what I changed:

  1. Move TAP header from runner.sh to run_kselftest.sh, since
     run_kselftest.sh is the only caller of run_many().
  2. In run_kselftest.sh, call run_many() in main process to count the
     pass/fail numbers.
  3. In run_kselftest.sh, do not generate kselftest_failures_file. Just
     use ktap_print_totals to report the result.
  4. In runner.sh run_one(), get the return value and use ktap helpers for
     all pass/fail reporting. This allows counting pass/fail numbers in the
     main process.
  5. In runner.sh run_in_netns(), also return the correct rc, so we can
     count results during wait.

After the change, the printed result looks like:

  not ok 4 4 selftests: clone3: clone3_cap_checkpoint_restore # exit=1
  # Totals: pass:3 fail:1 xfail:0 xpass:0 skip:0 error:0

  ]# echo $?
  1

Fixed change log commit description errors and long lines:
Shuah Khan <skhan@linuxfoundation.org>

Tested-by: Brendan Jackman <jackmanb@google.com>
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Reviewed-by: Brendan Jackman <jackmanb@google.com>
Link: https://lore.kernel.org/r/20260225010833.11301-1-liuhangbin@gmail.com
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2026-03-31 14:48:48 -06:00
Thomas Weißschuh
132618c5b6 selftests: harness: Validate intermixing of kselftest and harness functionality
Make sure that calling ksft_test_result_*() functions from harness
tests work as expected.

Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Link: https://lore.kernel.org/r/20260302-kselftest-harness-v2-5-3143aa41d989@linutronix.de
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2026-03-31 13:48:34 -06:00
Thomas Weißschuh
a0877a3590 selftests: harness: Detect illegal mixing of kselftest and harness functionality
Users may accidentally use the kselftest_test_result_*() functions in
their harness tests. If ksft_finished() is not used, the results
reported in this way are silently ignored.

Detect such false-positive cases and fail the test.

A more correct test would be to reject *any* usage of the ksft APIs but
that would force code churn on users.

Correct usages, which do use ksft_finished() will not trigger this
validation as the test will exit before it.

Reported-by: Yuwen Chen <ywen.chen@foxmail.com>
Link: https://lore.kernel.org/lkml/tencent_56D79AF3D23CEFAF882E83A2196EC1F12107@qq.com/
Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Link: https://lore.kernel.org/r/20260302-kselftest-harness-v2-4-3143aa41d989@linutronix.de
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2026-03-31 13:48:28 -06:00
Thomas Weißschuh
d431ea8695 selftests: kselftest: Add ksft_reset_state()
Add a helper to reset the internal state of the kselftest framework.
It will be used by the selftest harness.

Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Link: https://lore.kernel.org/r/20260302-kselftest-harness-v2-2-3143aa41d989@linutronix.de
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2026-03-31 13:48:23 -06:00
Thomas Weißschuh
758b8905ec selftests: harness: Validate that explicit kselftest exitcodes are handled
The test programs can directly call exit with one of the KSFT_* constants.

Add tests for this functionality.

Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Link: https://lore.kernel.org/r/20260302-kselftest-harness-v2-2-3143aa41d989@linutronix.de
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2026-03-31 13:48:16 -06:00
Thomas Weißschuh
fd1b4eb837 selftests: kselftest: Treat xpass as successful result
The harness treats these tests as successful, as does pytest.

Align kselftest.h to the rest of the ecosystem.

None of the Linux selftests seem to actually use this anyways.

Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Link: https://lore.kernel.org/r/20260302-kselftest-harness-v2-1-3143aa41d989@linutronix.de
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2026-03-31 13:48:10 -06:00
Masami Hiramatsu (Google)
3d0b8e4507 selftests/tracing: Fix to check awk supports non POSIX strtonum()
Check the awk command supports non POSIX strtonum() function in
the trace_marker_raw test case.

Fixes: 37f4660138 ("selftests/tracing: Add basic test for trace_marker_raw file")
Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Link: https://lore.kernel.org/r/177071726229.2369897.11506524546451139051.stgit@mhiramat.tok.corp.google.com
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2026-03-31 13:06:24 -06:00
Masami Hiramatsu (Google)
e011853dd7 selftests/tracing: Fix to make --logdir option work again
Since commit a0aa283c53 ("selftest/ftrace: Generalise ftracetest to
use with RV") moved the default LOG_DIR setting after --logdir option
parser, it overwrites the user given LOG_DIR.
This fixes it to check the --logdir option parameter when setting new
default LOG_DIR with a new TOP_DIR.

Fixes: a0aa283c53 ("selftest/ftrace: Generalise ftracetest to use with RV")
Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Tested-by: Gabriele Monaco <gmonaco@redhat.com>
Link: https://lore.kernel.org/r/177071725191.2369897.14781037901532893911.stgit@mhiramat.tok.corp.google.com
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2026-03-31 13:06:18 -06:00
Jakub Kicinski
64c535db76 selftests/bpf: Test that dst is cleared on same-protocol encap
Verify that bpf_skb_adjust_room() clears the routing dst even when
the encap L3 protocol matches the original packet (e.g. IPIP).
The dst selected for the inner packet is not valid for the
encapsulated result; a stale dst could lead to misrouting.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://patch.msgid.link/20260329180428.2657785-2-kuba@kernel.org
2026-03-30 15:52:25 -07:00
Joel Fernandes
6c3d9ad795 rcutorture: Add NOCB02 config for nocb poll mode testing
Add new rcutorture config NOCB02 that enables rcu_nocb_poll boot
parameter combined with CONFIG_RCU_NOCB_CPU to exercise the polling
mode code paths in the NOCB implementation.

This config exercises poll-mode paths not covered by other configs,
where callback invocation uses active polling instead of kthread
wakeups.

This config is not added to CFLIST to avoid increasing the default
test duration; it can be run explicitly when poll-mode testing
is needed.

Acked-by: Frederic Weisbecker <frederic@kernel.org>
Reviewed-by: Paul E. McKenney <paulmck@kernel.org>
Tested-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Joel Fernandes <joelagnelf@nvidia.com>
2026-03-30 15:48:14 -04:00
Joel Fernandes
18a6770f1f rcutorture: Add NOCB01 config for RCU_LAZY torture testing
Add new rcutorture config NOCB01 that enables CONFIG_RCU_LAZY combined
with CONFIG_RCU_NOCB_CPU to exercise the lazy callback code paths in
the NOCB implementation.

This config exercises lazy callback paths not covered by other configs,
including lazy-only wake and lazy defer logic.

This config is not added to CFLIST to avoid increasing the default
test duration; it can be run explicitly when lazy callback testing
is needed.

Acked-by: Frederic Weisbecker <frederic@kernel.org>
Reviewed-by: Paul E. McKenney <paulmck@kernel.org>
Tested-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Joel Fernandes <joelagnelf@nvidia.com>
2026-03-30 15:48:14 -04:00
Paul E. McKenney
359cf5c942 rcuscale: Ditch rcu_scale_shutdown in favor of torture_shutdown_init()
The torture_shutdown_init() function spawns a shutdown kthread in
a manner very similar to that implemented by rcu_scale_shutdown().
This commit therefore re-implements rcu_scale_shutdown() in terms of
torture_shutdown_init().

This patch was generated by Claude given as input the patch making the
same transformation of ref_scale_shutdown().

Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Joel Fernandes <joelagnelf@nvidia.com>
2026-03-30 15:48:13 -04:00
Paul E. McKenney
b0c8dd5097 refscale: Ditch ref_scale_shutdown in favor of torture_shutdown_init()
The torture_shutdown_init() function spawns a shutdown kthread in
a manner very similar to that implemented by ref_scale_shutdown().
This commit therefore re-implements ref_scale_shutdown in terms of
torture_shutdown_init().

The initial draft of this patch was generated by version 2.1.16 of the
Claude AI/LLM, but trained and configured for use by my employer, and
prompted to refer to Linux-kernel source code.  This initial draft failed
to provide a forward reference to ref_scale_cleanup(), passed zero to
torture_shutdown_init() for an unwelcome insta-shutdown, and failed to
pass the kvm.sh --duration argument in as a refscale module parameter.
On the other hand, it did catch the need to NULL main_task on the
post-test self-shutdown code path, which I might well have forgotten
to do.

This version of the patch fixes those problems, and in fact very little
of the initial draft remains.

Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Joel Fernandes <joelagnelf@nvidia.com>
2026-03-30 15:48:13 -04:00
Paul E. McKenney
df6e6ae18f rcutorture: Fix numeric "test" comparison in srcu_lockdep.sh
This commit switches from "-eq" to "=" to handle the non-numeric
comparisons in srcu_lockdep.sh.  While in the area, adjust SRCU flavor
to improve coverage.

Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Joel Fernandes <joelagnelf@nvidia.com>
2026-03-30 15:48:13 -04:00
Paul E. McKenney
6778178c3b torture: Print informative message for test without recheck file
If a type of torture test lacks a recheck file, a bash diagnostic is
printed, which looks like a torture-test bug.  This commit gets rid of
this false positive by explicitly checking for the file, invoking it if
it exists, otherwise printing an informative non-diagnostic message.

Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Joel Fernandes <joelagnelf@nvidia.com>
2026-03-30 15:48:13 -04:00
Paul E. McKenney
08d5cade66 torture: Make hangs more visible in torture.sh output
This commit labels "QEMU killed" lines so that they will be picked up
by torture.sh processing.

Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Joel Fernandes <joelagnelf@nvidia.com>
2026-03-30 15:48:13 -04:00
Paul E. McKenney
69642000bb kvm-check-branches.sh: Remove in favor of kvm-series.sh
The kvm-series.sh script is an order-of-magnitude optimization of
kvm-check-branches.sh, so remove the old script.

Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Joel Fernandes <joelagnelf@nvidia.com>
2026-03-30 15:48:13 -04:00
Paul E. McKenney
c6f4e552e1 rcutorture: Add a textbook-style trivial preemptible RCU
This commit adds a trivial textbook implementation of preemptible RCU
to rcutorture ("torture_type=trivial-preempt"), similar in spirit to the
existing "torture_type=trivial" textbook implementation of non-preemptible
RCU.  Neither trivial RCU implementation has any value for production use,
and are intended only to keep Paul honest in his introductory writings
and presentations.

[ paulmck: Apply kernel test robot feedback. ]

Signed-off-by: Joel Fernandes <joelagnelf@nvidia.com>
2026-03-30 15:48:13 -04:00
Tejun Heo
94555ca6d0 Merge branch 'for-7.0-fixes' into for-7.1
Conflict in kernel/sched/ext.c init_sched_ext_class() between:

  415cb193bb ("sched_ext: Fix SCX_KICK_WAIT deadlock by deferring wait
  to balance callback")

which adds cpus_to_sync cpumask allocation, and:

  84b1a0ea0b ("sched_ext: Implement scx_bpf_dsq_reenq() for user DSQs")
  8c1b9453fd ("sched_ext: Convert deferred_reenq_locals from llist to
  regular list")

which add deferred_reenq init code at the same location. Both are
independent additions. Include both.

Signed-off-by: Tejun Heo <tj@kernel.org>
2026-03-30 09:02:05 -10:00
Tejun Heo
090d34f0f0 selftests/sched_ext: Add cyclic SCX_KICK_WAIT stress test
Add a test that creates a 3-CPU kick_wait cycle (A->B->C->A). A BPF
scheduler kicks the next CPU in the ring with SCX_KICK_WAIT on every
enqueue while userspace workers generate continuous scheduling churn via
sched_yield(). Without the preceding fix, this hangs the machine within seconds.

Signed-off-by: Tejun Heo <tj@kernel.org>
Reviewed-by: Christian Loehle <christian.loehle@arm.com>
Tested-by: Christian Loehle <christian.loehle@arm.com>
2026-03-30 08:37:55 -10:00
Leon Romanovsky
e6fd249178 Merge branch 'master' into rdma-next
Let's bring v7.0-rc6 to the -next branch, so we can merge the DMA
attributes fix [1] without merge conflicts.

[1] https://lore.kernel.org/all/20260323-umem-dma-attrs-v1-1-d6890f2e6a1e@nvidia.com

Signed-off-by: Leon Romanovsky <leon@kernel.org>

* master: (1688 commits)
  Linux 7.0-rc6
  ...
2026-03-30 13:47:44 -04:00
Zhu Yanjun
e01027cab3 RDMA/rxe: Add testcase for net namespace rxe
Add 4 testcases for rxe with net namespace.

Reviewed-by: David Ahern <dsahern@kernel.org>
Signed-off-by: Zhu Yanjun <yanjun.zhu@linux.dev>
Link: https://patch.msgid.link/20260313023058.13020-5-yanjun.zhu@linux.dev
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2026-03-30 13:47:43 -04:00
Pablo Alessandro Santos Hugen
57000fe6a6 selftests/livepatch: add test for module function patching
Add a target module and livepatch pair that verify module function
patching via a proc entry. Two test cases cover both the
klp_enable_patch path (target loaded before livepatch) and the
klp_module_coming path (livepatch loaded before target).

Signed-off-by: Pablo Alessandro Santos Hugen <phugen@redhat.com>
Acked-by: Miroslav Benes <mbenes@suse.cz>
Acked-by: Joe Lawrence <joe.lawrence@redhat.com>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Tested-by: Petr Mladek <pmladek@suse.com>
Link: https://patch.msgid.link/20260320201135.1203992-1-phugen@redhat.com
Signed-off-by: Petr Mladek <pmladek@suse.com>
2026-03-30 15:37:31 +02:00
Linus Torvalds
d1384f70b2 vfs-7.0-rc6.fixes
Please consider pulling these changes from the signed vfs-7.0-rc6.fixes tag.
 
 Thanks!
 Christian
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCacmRjQAKCRCRxhvAZXjc
 olJnAQD2iiLqih8Y8nX3ESMkkIQWUoSikrfSVw/GqmuKTmlrDgEA/z+LRgDGnI/+
 6xzkEw4UNmJ9JoJsiPSlHq18yyga/ww=
 =DxTb
 -----END PGP SIGNATURE-----

Merge tag 'vfs-7.0-rc6.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs

Pull vfs fixes from Christian Brauner:

 - Fix netfs_limit_iter() hitting BUG() when an ITER_KVEC iterator
   reaches it via core dump writes to 9P filesystems. Add ITER_KVEC
   handling following the same pattern as the existing ITER_BVEC code.

 - Fix a NULL pointer dereference in the netfs unbuffered write retry
   path when the filesystem (e.g., 9P) doesn't set the prepare_write
   operation.

 - Clear I_DIRTY_TIME in sync_lazytime for filesystems implementing
  ->sync_lazytime. Without this the flag stays set and may cause
   additional unnecessary calls during inode deactivation.

 - Increase tmpfs size in mount_setattr selftests. A recent commit
   bumped the ext4 image size to 2 GB but didn't adjust the tmpfs
   backing store, so mkfs.ext4 fails with ENOSPC writing metadata.

 - Fix an invalid folio access in iomap when i_blkbits matches the folio
   size but differs from the I/O granularity. The cur_folio pointer
   would not get invalidated and iomap_read_end() would still be called
   on it despite the IO helper owning it.

 - Fix hash_name() docstring.

 - Fix read abandonment during netfs retry where the subreq variable
   used for abandonment could be uninitialized on the first pass or
   point to a deleted subrequest on later passes.

 - Don't block sync for filesystems with no data integrity guarantees.
   Add a SB_I_NO_DATA_INTEGRITY superblock flag replacing the per-inode
   AS_NO_DATA_INTEGRITY mapping flag so sync kicks off writeback but
   doesn't wait for flusher threads. This fixes a suspend-to-RAM hang on
   fuse-overlayfs where the flusher thread blocks when the fuse daemon
   is frozen.

 - Fix a lockdep splat in iomap when reads fail. iomap_read_end_io()
   invokes fserror_report() which calls igrab() taking i_lock in hardirq
   context while i_lock is normally held with interrupts enabled. Kick
   failed read handling to a workqueue.

 - Remove the redundant netfs_io_stream::front member and use
   stream->subrequests.next instead, fixing a potential issue in the
   direct write code path.

* tag 'vfs-7.0-rc6.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
  netfs: Fix the handling of stream->front by removing it
  iomap: fix lockdep complaint when reads fail
  writeback: don't block sync for filesystems with no data integrity guarantees
  netfs: Fix read abandonment during retry
  vfs: fix docstring of hash_name()
  iomap: fix invalid folio access when i_blkbits differs from I/O granularity
  selftests/mount_setattr: increase tmpfs size for idmapped mount tests
  fs: clear I_DIRTY_TIME in sync_lazytime
  netfs: Fix NULL pointer dereference in netfs_unbuffered_write() on retry
  netfs: Fix kernel BUG in netfs_limit_iter() for ITER_KVEC iterators
2026-03-29 15:24:28 -07:00
Chris J Arges
e3cdf6cf5f selftests: drv-net: xdp: Add rss_hash metadata tests
This test loads xdp_metadata.bpf which calls bpf_xdp_metadata_rx_hash() on
incoming packets. The metadata from that packet is then sent to a BPF
map for validation. It borrows structure from xdp.py, reusing common
functions.

The test checks the device's xdp-rx-metadata-features via netlink
before running and skips on devices that do not advertise hash support.
This can be run on veth devices as well as real hardware.

The test is fairly simple and just verifies that a TCP or UDP packet can be
identified as an L4 flow. This minimal test also passes if run on a veth
device.

Signed-off-by: Chris J Arges <carges@cloudflare.com>
Link: https://patch.msgid.link/20260325201139.2501937-7-carges@cloudflare.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-03-29 14:09:08 -07:00
Chris J Arges
4ce0640695 selftests: net: move common xdp.py functions into lib
This moves a few functions which can be useful to other python programs
that manipulate XDP programs. This also refactors xdp.py to use the
refactored functions.

Signed-off-by: Chris J Arges <carges@cloudflare.com>
Link: https://patch.msgid.link/20260325201139.2501937-6-carges@cloudflare.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-03-29 14:09:08 -07:00
Daniel Borkmann
398ad123e8 selftests/bpf: Add few tests for alu32 shift value tracking and zext
Add few more alu32 shift tests using div-by-zero on provably dead paths
to check both verifier and JIT xlation resp. runtime correctness.

If the verifier mistracks the result, it rejects due to the div by 0;
if the JIT computes a wrong value, then runtime hits the dead path and
retval changes.

  # LDLIBS=-static PKG_CONFIG='pkg-config --static' ./vmtest.sh -- ./test_progs -t verifier_subreg
  [...]
  #644/76  verifier_subreg/arsh32_imm1_value:OK
  #644/77  verifier_subreg/lsh32_reg0_zero_extend_check:OK
  #644/78  verifier_subreg/rsh32_reg0_zero_extend_check:OK
  #644/79  verifier_subreg/arsh32_reg0_zero_extend_check:OK
  #644/80  verifier_subreg/lsh32_imm31_value:OK
  #644/81  verifier_subreg/rsh32_imm31_value:OK
  #644/82  verifier_subreg/arsh32_imm31_value:OK
  #644/83  verifier_subreg/lsh32_unknown_precise_bounds:OK
  #644/84  verifier_subreg/rsh32_unknown_bounds:OK
  #644     verifier_subreg:OK
  Summary: 1/84 PASSED, 0 SKIPPED, 0 FAILED

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/r/20260327220629.343327-1-daniel@iogearbox.net
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2026-03-29 09:57:39 -07:00
Ihor Solodrai
101a9d9df8 selftests/bpf: Update kfuncs using btf_struct_meta to new variants
Update selftests to use the new non-_impl kfuncs marked with
KF_IMPLICIT_ARGS by removing redundant declarations and macros from
bpf_experimental.h (the new kfuncs are present in the vmlinux.h) and
updating relevant callsites.

Fix spin_lock verifier-log matching for lock_id_kptr_preserve by
accepting variable instruction numbers. The calls to kfuncs with
implicit arguments do not have register moves (e.g. r5 = 0)
corresponding to dummy arguments anymore, so the order of instructions
has shifted.

Acked-by: Mykyta Yatsenko <yatsenko@meta.com>
Signed-off-by: Ihor Solodrai <ihor.solodrai@linux.dev>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/r/20260327203241.3365046-2-ihor.solodrai@linux.dev
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2026-03-29 09:56:06 -07:00
Ihor Solodrai
d457072576 bpf: Support struct btf_struct_meta via KF_IMPLICIT_ARGS
The following kfuncs currently accept void *meta__ign argument:
  * bpf_obj_new_impl
  * bpf_obj_drop_impl
  * bpf_percpu_obj_new_impl
  * bpf_percpu_obj_drop_impl
  * bpf_refcount_acquire_impl
  * bpf_list_push_back_impl
  * bpf_list_push_front_impl
  * bpf_rbtree_add_impl

The __ign suffix is an indicator for the verifier to skip the argument
in check_kfunc_args(). Then, in fixup_kfunc_call() the verifier may
set the value of this argument to struct btf_struct_meta *
kptr_struct_meta from insn_aux_data.

BPF programs must pass a dummy NULL value when calling these kfuncs.

Additionally, the list and rbtree _impl kfuncs also accept an implicit
u64 argument, which doesn't require __ign suffix because it's a
scalar, and BPF programs explicitly pass 0.

Add new kfuncs with KF_IMPLICIT_ARGS [1], that correspond to each
_impl kfunc accepting meta__ign. The existing _impl kfuncs remain
unchanged for backwards compatibility.

To support this, add "btf_struct_meta" to the list of recognized
implicit argument types in resolve_btfids.

Implement is_kfunc_arg_implicit() in the verifier, that determines
implicit args by inspecting both a non-_impl BTF prototype of the
kfunc.

Update the special_kfunc_list in the verifier and relevant checks to
support both the old _impl and the new KF_IMPLICIT_ARGS variants of
btf_struct_meta users.

[1] https://lore.kernel.org/bpf/20260120222638.3976562-1-ihor.solodrai@linux.dev/

Signed-off-by: Ihor Solodrai <ihor.solodrai@linux.dev>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/r/20260327203241.3365046-1-ihor.solodrai@linux.dev
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2026-03-29 09:56:06 -07:00
Aleksei Oladko
5a1292137e selftests: fix ARCH normalization to handle command-line argument
Several selftests Makefiles (e.g.  prctl, breakpoints, etc) attempt to
normalize the ARCH variable by converting x86_64 and i.86 to x86. 
However, it uses the conditional assignment operator '?='.

When ARCH is passed as a command-line argument (e.g., during an rpmbuild
process), the '?=' operator ignores the shell command and the sed
transformation.  This leads to an incorrect ARCH value being used, which
causes build failures

  # make -C tools/testing/selftests TARGETS=prctl ARCH=x86_64
  make: Entering directory '/build/tools/testing/selftests'
  make[1]: Entering directory '/build/tools/testing/selftests/prctl'
  make[1]: *** No targets.  Stop.
  make[1]: Leaving directory '/build/tools/testing/selftests/prctl'
  make: *** [Makefile:197: all] Error 2

Change the assignment to use 'override' and ':=' to ensure the
normalization logic is applied regardless of how the ARCH variable was
initially defined.

Link: https://lkml.kernel.org/r/20260309205145.572778-1-aleksey.oladko@virtuozzo.com
Signed-off-by: Aleksei Oladko <aleksey.oladko@virtuozzo.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Wei Yang <richard.weiyang@gmail.com>
Cc: Bala-Vignesh-Reddy <reddybalavignesh9979@gmail.com>
Cc: Chelsy Ratnawat <chelsyratnawat2001@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-03-27 21:19:44 -07:00
Mark Brown
80266c154f selftests/fchmodat2: use ksft_finished()
The fchmodat2 test program open codes a version of ksft_finished(), use
the standard version.

Link: https://lkml.kernel.org/r/20260226-selftests-fchmodat2-v4-2-a6419435f2e8@kernel.org
Signed-off-by: Mark Brown <broonie@kernel.org>
Acked-by: Alexey Gladkov <legion@kernel.org>
Cc: Christian Brauner <brauner@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-03-27 21:19:37 -07:00
Mark Brown
04e23830b8 selftests/fchmodat2: clean up temporary files and directories
Patch series "selftests/fchmodat2: Error handling and general", v4.

I looked at the fchmodat2() tests since I've been experiencing some random
intermittent segfaults with them in my test systems, while doing so I
noticed these two issues.  Unfortunately I didn't figure out the original
yet, unless I managed to fix it unwittingly.


This patch (of 2):

The fchmodat2() test program creates a temporary directory with a file and
a symlink for every test it runs but never cleans these up, resulting in
${TMPDIR} getting left with stale files after every run.  Restructure the
program a bit to ensure that we clean these up, this is more invasive than
it might otherwise be due to the extensive use of ksft_exit_fail_msg() in
the program.

As a side effect this also ensures that we report a consistent test name
for the tests and always try both tests even if they are skipped.

Link: https://lkml.kernel.org/r/20260226-selftests-fchmodat2-v4-0-a6419435f2e8@kernel.org
Link: https://lkml.kernel.org/r/20260226-selftests-fchmodat2-v4-1-a6419435f2e8@kernel.org
Signed-off-by: Mark Brown <broonie@kernel.org>
Acked-by: Alexey Gladkov <legion@kernel.org>
Cc: Christian Brauner <brauner@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-03-27 21:19:36 -07:00
UYeol Jo
225ba47fb9 selftests/ipc: skip msgque test when MSG_COPY is unsupported
msgque kselftest uses msgrcv(..., MSG_COPY) to copy messages.  When the
kernel is built without CONFIG_CHECKPOINT_RESTORE, prepare_copy() is
stubbed out and msgrcv() returns -ENOSYS.  The test currently reports this
as a failure even though it is simply a missing feature/configuration.

Skip the test when msgrcv() fails with ENOSYS.

Link: https://lkml.kernel.org/r/20260210135359.178636-1-jouyeol8739@gmail.com
Signed-off-by: UYeol Jo <jouyeol8739@gmail.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Wei Yang <richard.weiyang@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-03-27 21:19:34 -07:00
Xiang Mei
5d17af9eb2 selftests/tc-testing: add test for HFSC divide-by-zero in rtsc_min()
Add a regression test for the divide-by-zero in rtsc_min() triggered
when m2sm() converts a large m1 value (e.g. 32gbit) to a u64 scaled
slope reaching 2^32. rtsc_min() stores the difference of two such u64
values (sm1 - sm2) in a u32 variable `dsm`, truncating 2^32 to zero
and causing a divide-by-zero oops in the concave-curve intersection
path. The test configures an HFSC class with m1=32gbit d=1ms m2=0bit,
sends a packet to activate the class, waits for it to drain and go
idle, then sends another packet to trigger reactivation through
rtsc_min().

Signed-off-by: Xiang Mei <xmei5@asu.edu>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Reviewed-by: Victor Nogueira <victor@mojatatu.com>
Link: https://patch.msgid.link/20260326204310.1549327-2-xmei5@asu.edu
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-03-27 20:41:11 -07:00
Gregory Price
d747cf98f0 cxl/core/region: move dax region device logic into region_dax.c
core/region.c is overloaded with per-region control logic (pmem, dax,
sysram, etc). Move the CXL DAX region device infrastructure from
region.c into a new region_dax.c file.

This will also allow us to add additional dax-driver integration paths
that don't further dirty the core region.c logic.

No functional changes.

Signed-off-by: Gregory Price <gourry@gourry.net>
Co-developed-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Alison Schofield <alison.schofield@intel.com>
Link: https://patch.msgid.link/20260327020203.876122-3-gourry@gourry.net
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2026-03-27 11:45:31 -07:00
Gregory Price
8a1ec5fb23 cxl/core/region: move pmem region driver logic into region_pmem.c
core/region.c is overloaded with per-region control logic (pmem, dax,
sysram, etc). Move the pmem region driver logic from region.c into
region_pmem.c make it clear that this code only applies to pmem regions.

No functional changes.

[ dj: Fixed up some tabbing issues, may be from original code. ]

Signed-off-by: Gregory Price <gourry@gourry.net>
Co-developed-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Alison Schofield <alison.schofield@intel.com>
Link: https://patch.msgid.link/20260327020203.876122-2-gourry@gourry.net
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2026-03-27 11:45:21 -07:00
Christian Brauner
96f4c251a0 selftests/bpf: add block device management selftests
Add selftests to test block device tracking for bpf lsm programs.

Signed-off-by: Christian Brauner <brauner@kernel.org>
Link: https://lore.kernel.org/r/20260326-work-bpf-bdev-v2-2-5e3c58963987@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2026-03-27 09:05:13 -07:00
Jiakai Xu
198c7ce980 RISC-V: KVM: selftests: Fix firmware counter read in sbi_pmu_test
The current sbi_pmu_test attempts to read firmware counters without
configuring them first with SBI_EXT_PMU_COUNTER_CFG_MATCH.

Previously this did not fail because KVM incorrectly allowed the read
and accessed fw_event[] with an out-of-bounds index when the counter
was unconfigured. After fixing that bug, the read now correctly returns
SBI_ERR_INVALID_PARAM, causing the selftest to fail.

Update the test to configure a firmware event before reading the
counter. Also add a negative test to ensure that attempting to read an
unconfigured firmware counter fails gracefully.

Signed-off-by: Jiakai Xu <xujiakai2025@iscas.ac.cn>
Signed-off-by: Jiakai Xu <jiakaiPeanut@gmail.com>
Reviewed-by: Andrew Jones <andrew.jones@oss.qualcomm.com>
Reviewed-by: Nutty Liu <nutty.liu@hotmail.com>
Link: https://lore.kernel.org/r/20260316014533.2312254-3-xujiakai2025@iscas.ac.cn
Signed-off-by: Anup Patel <anup@brainfault.org>
2026-03-27 20:02:11 +05:30
Ben Copeland
4b097a7b25 selftests: ALSA: Skip utimer test when CONFIG_SND_UTIMER is not enabled
The timer_f.utimer test hard-fails with ASSERT_EQ when
SNDRV_TIMER_IOCTL_CREATE returns -1 on kernels without
CONFIG_SND_UTIMER. This causes the entire alsa kselftest suite to
report a failure rather than skipping the unsupported test.

When CONFIG_SND_UTIMER is not enabled, the ioctl is not recognised and
the kernel returns -ENOTTY. If the timer device or subdevice does not
exist, -ENXIO is returned. Skip the test in both cases, but still fail
on any other unexpected error.

Suggested-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/linux-kselftest/0e9c25d3-efbd-433b-9fb1-0923010101b9@stanley.mountain/
Signed-off-by: Ben Copeland <ben.copeland@linaro.org>
Reviewed-by: Mark Brown <broonie@kernel.org>
Link: https://patch.msgid.link/20260319124521.191491-1-ben.copeland@linaro.org
Signed-off-by: Takashi Iwai <tiwai@suse.de>
2026-03-27 14:40:18 +01:00
Yohei Kojima
5f70b0aa08 selftests: net: Remove unnecessary backslashes in fq_band_pktlimit.sh
Address "grep: warning: stray \ before white space" warning from GNU
grep 3.12. This warns the misplaced backslashes before whitespaces
(e.g. \\' ' or '\ ') which leads to unspecified behavior [1].

We can just remove the backslashes before whitespaces as POSIX says:

  Enclosing characters in single-quotes ('') shall preserve the literal
  value of each character within the single-quotes.

and bourne-compatible shells behave so.

[1]: https://lists.gnu.org/r/bug-gnulib/2022-05/msg00057.html

Signed-off-by: Yohei Kojima <yk@y-koj.net>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/dd0bbd48cdf468da56ec34fd61cecd4d2111d7ba.1774372510.git.yk@y-koj.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-03-26 18:48:44 -07:00
Justin Iurman
2d047673d3 selftests: add check for seg6 tunsrc
Extend srv6_hencap_red_l3vpn_test.sh to include checks for the new
"tunsrc" feature. If there is no support for tunsrc, it silently
falls back to the encap config without tunsrc.

Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Justin Iurman <justin.iurman@6wind.com>
Reviewed-by: Andrea Mayer <andrea.mayer@uniroma2.it>
Link: https://patch.msgid.link/20260324091434.359341-3-justin.iurman@6wind.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-03-26 18:45:29 -07:00
Tejun Heo
c6f99d0ecc Revert "selftests/sched_ext: Add tests for SCX_ENQ_IMMED and scx_bpf_dsq_reenq()"
This reverts commit c50dcf5331.

The tests are superficial, likely AI-generated slop, and flaky. They
don't add actual value and just churn the selftests.

Signed-off-by: Tejun Heo <tj@kernel.org>
2026-03-26 14:45:37 -10:00
Alan Maguire
0467491617 selftests/bpf: Test kind encoding/decoding
verify btf__new_empty_opts() adds layouts for all kinds supported,
and after adding kind-related types for an unknown kind, ensure that
parsing uses this info when that kind is encountered rather than
giving up.

Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20260326145444.2076244-9-alan.maguire@oracle.com
2026-03-26 13:53:57 -07:00
Jakub Kicinski
9ebcf66cd6 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Cross-merge networking fixes after downstream PR (net-7.0-rc6).

No conflicts, or adjacent changes.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-03-26 12:09:57 -07:00
Linus Torvalds
25b69ebe28 Landlock fix for v7.0-rc6
-----BEGIN PGP SIGNATURE-----
 
 iIYEABYKAC4WIQSVyBthFV4iTW/VU1/l49DojIL20gUCacVk0xAcbWljQGRpZ2lr
 b2QubmV0AAoJEOXj0OiMgvbS0v4A/joA39PP40bpHZorGYVgHyEZZgCgGicffmYd
 TnvlvawOAPoDc6h1HwkcOonhYgvEe29JPIBrEFOCNBZsGTntvN29Ag==
 =T4m+
 -----END PGP SIGNATURE-----

Merge tag 'landlock-7.0-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/mic/linux

Pull Landlock fixes from Mickaël Salaün:
 "This mainly fixes Landlock TSYNC issues related to interrupts and
  unexpected task exit.

  Other fixes touch documentation and sample, and a new test extends
  coverage"

* tag 'landlock-7.0-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/mic/linux:
  landlock: Expand restrict flags example for ABI version 8
  selftests/landlock: Test tsync interruption and cancellation paths
  landlock: Clean up interrupted thread logic in TSYNC
  landlock: Serialize TSYNC thread restriction
  samples/landlock: Bump ABI version to 8
  landlock: Improve TSYNC types
  landlock: Fully release unused TSYNC work entries
  landlock: Fix formatting
2026-03-26 12:03:37 -07:00
Yeoreum Yun
42550d7d8a KVM: arm64: kselftest: set_id_regs: Add test for FEAT_LSUI
Add test coverage for FEAT_LSUI.

Signed-off-by: Yeoreum Yun <yeoreum.yun@arm.com>
Reviewed-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2026-03-26 18:19:41 +00:00
Linus Torvalds
453a4a5f97 Including fixes from Bluetooth, CAN, IPsec and Netfilter.
Notably, this includes the fix for the Bluetooth regression that you
 were notified about. I'm not aware of any other pending regressions.
 
 Current release - regressions:
 
   - bluetooth:
     - fix stack-out-of-bounds read in l2cap_ecred_conn_req
     - fix regressions caused by reusing ident
 
   - netfilter: revisit array resize logic
 
   - eth: ice: set max queues in alloc_etherdev_mqs()
 
 Previous releases - regressions:
 
   - core: correctly handle tunneled traffic on IPV6_CSUM GSO fallback
 
   - bluetooth:
     - fix dangling pointer on mgmt_add_adv_patterns_monitor_complete
     - fix deadlock in l2cap_conn_del()
 
   - sched: codel: fix stale state for empty flows in fq_codel
 
   - ipv6: remove permanent routes from tb6_gc_hlist when all exceptions expire.
 
   - xfrm: fix skb_put() panic on non-linear skb during reassembly
 
   - openvswitch:
     - avoid releasing netdev before teardown completes
     - validate MPLS set/set_masked payload length
 
   - eth: iavf: fix out-of-bounds writes in iavf_get_ethtool_stats()
 
 Previous releases - always broken:
 
   - bluetooth: fix null-ptr-deref on l2cap_sock_ready_cb
 
   - udp: fix wildcard bind conflict check when using hash2
 
   - netfilter: fix use of uninitialized rtp_addr in process_sdp
 
   - tls: Purge async_hold in tls_decrypt_async_wait()
 
   - xfrm:
     - prevent policy_hthresh.work from racing with netns teardown
     - fix skb leak with espintcp and async crypto
 
   - smc: fix double-free of smc_spd_priv when tee() duplicates splice pipe buffer
 
   - can:
     - add missing error handling to call can_ctrlmode_changelink()
     - fix OOB heap access in cgw_csum_crc8_rel()
 
   - eth: mana: fix use-after-free in add_adev() error path
 
   - eth: virtio-net: fix for VIRTIO_NET_F_GUEST_HDRLEN
 
   - eth: bcmasp: fix double free of WoL irq
 
 Signed-off-by: Paolo Abeni <pabeni@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 
 iQJGBAABCgAwFiEEg1AjqC77wbdLX2LbKSR5jcyPE6QFAmnFXoESHHBhYmVuaUBy
 ZWRoYXQuY29tAAoJECkkeY3MjxOkHIsP+QGWo2TdNS6FENMC4TL0HYR/pTf6Kwil
 rxxvt3CSpqfEkdwtRlgXqJycaLydl6YiNlIgnd1QtNYUTa6SHziuhUevSmK3nJTa
 QkuqXnIbWFDD6nWID/P0CrBTQsZB+Vv4OhVFTbMkkvjX4X73PIbtvoNcWaZd+6rz
 1BANq4dS5AjBieY824ovGlgqCxRyTr1g71RjSQy89RVgxabCwrKokLFFeqo8VObm
 gemyra4wmAt43mxcTWbq/ZgTPfKpHoUSFBF8/c93zad7UKNTP9UqzydrRDSvoqZL
 ZLIYxZ2jGwXCKQjs2pOEmBk2KWfrfnVxFPXizjqpPEagUQEV0NK2hSQswj1sOQnB
 Ee4eL1MgXD6oUQI3qXK6XNAC4rOyDCYjVK3AeRs4hgwEkJCQp4iR+OcGO9lhU+/3
 kNAUG3x0ySW0l2UNmpqnfAAu0+LR2FIdvfpoa9lTKjm6A+kKPMIrlQM4Wq1vwWoN
 f6NV0Br0fQ4FclDh+KyvpZFHrEWCWlfB8sIV5JhQNHnVviP2Rarbw9VBiWvjDxky
 Xn9Mac6mCswHx9GxrC3nPfNBtoNTQ9JvY0Cms/FtrERHaEvIhYxsUpXJs0fxK5mg
 LACTkRzfQkXun9i1B+DYM26M6Yq4lm5TN9a1Mj2G3Df2EECriIgwdl6RI+J2bPJU
 D8RbdAX5qiWu
 =kySL
 -----END PGP SIGNATURE-----

Merge tag 'net-7.0-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net

Pull networking fixes from Paolo Abeni:
 "Including fixes from Bluetooth, CAN, IPsec and Netfilter.

  Notably, this includes the fix for the Bluetooth regression that you
  were notified about. I'm not aware of any other pending regressions.

  Current release - regressions:

    - bluetooth:
       - fix stack-out-of-bounds read in l2cap_ecred_conn_req
       - fix regressions caused by reusing ident

    - netfilter: revisit array resize logic

    - eth: ice: set max queues in alloc_etherdev_mqs()

  Previous releases - regressions:

    - core: correctly handle tunneled traffic on IPV6_CSUM GSO fallback

    - bluetooth:
       - fix dangling pointer on mgmt_add_adv_patterns_monitor_complete
       - fix deadlock in l2cap_conn_del()

    - sched: codel: fix stale state for empty flows in fq_codel

    - ipv6: remove permanent routes from tb6_gc_hlist when all exceptions expire.

    - xfrm: fix skb_put() panic on non-linear skb during reassembly

    - openvswitch:
       - avoid releasing netdev before teardown completes
       - validate MPLS set/set_masked payload length

    - eth: iavf: fix out-of-bounds writes in iavf_get_ethtool_stats()

  Previous releases - always broken:

    - bluetooth: fix null-ptr-deref on l2cap_sock_ready_cb

    - udp: fix wildcard bind conflict check when using hash2

    - netfilter: fix use of uninitialized rtp_addr in process_sdp

    - tls: Purge async_hold in tls_decrypt_async_wait()

    - xfrm:
       - prevent policy_hthresh.work from racing with netns teardown
       - fix skb leak with espintcp and async crypto

    - smc: fix double-free of smc_spd_priv when tee() duplicates splice pipe buffer

    - can:
       - add missing error handling to call can_ctrlmode_changelink()
       - fix OOB heap access in cgw_csum_crc8_rel()

    - eth:
       - mana: fix use-after-free in add_adev() error path
       - virtio-net: fix for VIRTIO_NET_F_GUEST_HDRLEN
       - bcmasp: fix double free of WoL irq"

* tag 'net-7.0-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (90 commits)
  net: macb: use the current queue number for stats
  netfilter: ctnetlink: use netlink policy range checks
  netfilter: nf_conntrack_sip: fix use of uninitialized rtp_addr in process_sdp
  netfilter: nf_conntrack_expect: skip expectations in other netns via proc
  netfilter: nf_conntrack_expect: store netns and zone in expectation
  netfilter: ctnetlink: ensure safe access to master conntrack
  netfilter: nf_conntrack_expect: use expect->helper
  netfilter: nf_conntrack_expect: honor expectation helper field
  netfilter: nft_set_rbtree: revisit array resize logic
  netfilter: ip6t_rt: reject oversized addrnr in rt_mt6_check()
  netfilter: nfnetlink_log: fix uninitialized padding leak in NFULA_PAYLOAD
  tls: Purge async_hold in tls_decrypt_async_wait()
  selftests: netfilter: nft_concat_range.sh: add check for flush+reload bug
  netfilter: nft_set_pipapo_avx2: don't return non-matching entry on expiry
  Bluetooth: btusb: clamp SCO altsetting table indices
  Bluetooth: L2CAP: Fix ERTM re-init and zero pdu_len infinite loop
  Bluetooth: L2CAP: Fix deadlock in l2cap_conn_del()
  Bluetooth: btintel: serialize btintel_hw_error() with hci_req_sync_lock
  Bluetooth: L2CAP: Fix send LE flow credits in ACL link
  net: mana: fix use-after-free in add_adev() error path
  ...
2026-03-26 09:53:08 -07:00
Jiakai Xu
7c61e7433b RISC-V: KVM: selftests: Add RISC-V SBI STA shmem alignment tests
Add RISC-V KVM selftests to verify the SBI Steal-Time Accounting (STA)
shared memory alignment requirements.

The SBI specification requires the STA shared memory GPA to be 64-byte
aligned, or set to all-ones to explicitly disable steal-time accounting.
This test verifies that KVM enforces the expected behavior when
configuring the SBI STA shared memory via KVM_SET_ONE_REG.

Specifically, the test checks that:
- misaligned GPAs are rejected with -EINVAL
- 64-byte aligned GPAs are accepted
- all-ones GPA is accepted

Signed-off-by: Jiakai Xu <xujiakai2025@iscas.ac.cn>
Signed-off-by: Jiakai Xu <jiakaiPeanut@gmail.com>
Reviewed-by: Andrew Jones <andrew.jones@oss.qualcomm.com>
Link: https://lore.kernel.org/r/20260303010859.1763177-4-xujiakai2025@iscas.ac.cn
Signed-off-by: Anup Patel <anup@brainfault.org>
2026-03-26 21:21:03 +05:30
Jiakai Xu
40351ed924 KVM: selftests: Refactor UAPI tests into dedicated function
Move steal time UAPI tests from steal_time_init() into a separate
check_steal_time_uapi() function for better code organization and
maintainability.

Previously, x86 and ARM64 architectures performed UAPI validation
tests within steal_time_init(), mixing initialization logic with
uapi tests.

Changes by architecture:
x86_64:
  - Extract MSR reserved bits test from steal_time_init()
  - Move to check_steal_time_uapi() which tests that setting
    MSR_KVM_STEAL_TIME with KVM_STEAL_RESERVED_MASK fails
ARM64:
  - Extract three UAPI tests from steal_time_init():
     Device attribute support check
     Misaligned IPA rejection (EINVAL)
     Duplicate IPA setting rejection (EEXIST)
  - Move all tests to check_steal_time_uapi()
RISC-V:
  - Add empty check_steal_time_uapi() stub for future use
  - No changes to steal_time_init() (had no tests to extract)

The new check_steal_time_uapi() function:
  - Is called once before the per-VCPU test loop

No functional change intended.

Suggested-by: Andrew Jones <andrew.jones@oss.qualcomm.com>
Signed-off-by: Jiakai Xu <xujiakai2025@iscas.ac.cn>
Signed-off-by: Jiakai Xu <jiakaiPeanut@gmail.com>
Reviewed-by: Andrew Jones <andrew.jones@oss.qualcomm.com>
Link: https://lore.kernel.org/r/20260303010859.1763177-3-xujiakai2025@iscas.ac.cn
Signed-off-by: Anup Patel <anup@brainfault.org>
2026-03-26 21:21:03 +05:30
Florian Westphal
6caefcd949 selftests: netfilter: nft_concat_range.sh: add check for flush+reload bug
This test will fail without
the preceding commit ("netfilter: nft_set_pipapo_avx2: fix match retart if found element is expired"):

  reject overlapping range on add       0s                              [ OK ]
  reload with flush                 /dev/stdin:59:32-52: Error: Could not process rule: File exists
add element inet filter test { 10.0.0.29 . 10.0.2.29 }

Reviewed-by: Stefano Brivio <sbrivio@redhat.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2026-03-25 21:40:47 +01:00
Alexei Starovoitov
400ff899c3 selftests/bpf: Make reg_bounds test more robust
The verifier log output may contain multiple lines that start with
18: (bf) r0 = r6
teach reg_bounds to look for lines that have ';' in them,
since reg_bounds test is looking for:
18: (bf) r0 = r6       ; R0=... R6=...

Reviewed-by: Emil Tsalapatis <emil@etsalapatis.com>
Link: https://lore.kernel.org/r/20260325012242.45606-1-alexei.starovoitov@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2026-03-25 08:50:33 -07:00
Eric Dumazet
d1e59a4697 tcp: add cwnd_event_tx_start to tcp_congestion_ops
(tcp_congestion_ops)->cwnd_event() is called very often, with
@event oscillating between CA_EVENT_TX_START and other values.

This is not branch prediction friendly.

Provide a new cwnd_event_tx_start pointer dedicated for CA_EVENT_TX_START.

Both BBR and CUBIC benefit from this change, since they only care
about CA_EVENT_TX_START.

No change in kernel size:

$ scripts/bloat-o-meter -t vmlinux.0 vmlinux
add/remove: 4/4 grow/shrink: 3/1 up/down: 564/-568 (-4)
Function                                     old     new   delta
bbr_cwnd_event_tx_start                        -     450    +450
cubictcp_cwnd_event_tx_start                   -      70     +70
__pfx_cubictcp_cwnd_event_tx_start             -      16     +16
__pfx_bbr_cwnd_event_tx_start                  -      16     +16
tcp_unregister_congestion_control             93      99      +6
tcp_update_congestion_control                518     521      +3
tcp_register_congestion_control              422     425      +3
__tcp_transmit_skb                          3308    3306      -2
__pfx_cubictcp_cwnd_event                     16       -     -16
__pfx_bbr_cwnd_event                          16       -     -16
cubictcp_cwnd_event                           80       -     -80
bbr_cwnd_event                               454       -    -454
Total: Before=25240512, After=25240508, chg -0.00%

Signed-off-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20260323234920.1097858-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-03-24 21:00:38 -07:00
Bobby Eshleman
112f4c6320 selftests: drv-net: add missing tc config options for netkit tests
The NetDrvContEnv env context uses tc clsact qdiscs and BPF tc filters
for traffic redirection, but the kernel config options are missing from
the selftests config.

Without them, the tc qdisc installation trips on:

  CMD: tc qdisc add dev enp1s0 clsact
    EXIT: 2
    STDERR: Error: Specified qdisc kind is unknown.

  net.lib.py.utils.CmdExitFailure: Command failed

Add CONFIG_NET_CLS_ACT and CONFIG_NET_SCH_INGRESS to enable these tc
options.

Fixes: 3f74d5bb80 ("selftests/net: Add env for container based tests")
Signed-off-by: Bobby Eshleman <bobbyeshleman@meta.com>
Link: https://patch.msgid.link/20260323-config-fixes-for-nk-tests-v2-1-6c505d83e52d@meta.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-03-24 20:59:40 -07:00
Steven Rostedt
dc1d9408c9 Merge commit 'f35dbac6942171dc4ce9398d1d216a59224590a9' into trace/ring-buffer/core
The commit f35dbac694 ("ring-buffer: Fix to update per-subbuf entries of
persistent ring buffer") was a fix and merged upstream. It is needed for
some other work in the ring buffer. The current branch has the remote
buffer code that is shared with the Arm64 subsystem and can't be rebased.

Merge in the upstream commit to allow continuing of the ring buffer work.

Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2026-03-24 22:19:21 -04:00
Alexei Starovoitov
9f7d8fa681 selftests/bpf: Test variable length stack write
Add a test to make sure that variable length stack writes
scrubs STACK_SPILL into STACK_MISC.

Tested-by: Eduard Zingerman <eddyz87@gmail.com>
Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Link: https://lore.kernel.org/r/20260324215938.81733-2-alexei.starovoitov@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2026-03-24 17:00:16 -07:00
Davidlohr Bueso
b374977413 selftests/futex: Bump up libnuma version check
numa_set_mempolicy_home_node() was introduced in libnuma 2.0.18, not
2.0.16, via:
	https://github.com/numactl/numactl/commit/8f2ffc89654c

Signed-off-by: Davidlohr Bueso <dave@stgolabs.net>
Signed-off-by: Thomas Gleixner <tglx@kernel.org>
Reviewed-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Link: https://patch.msgid.link/20260306182215.2088991-1-dave@stgolabs.net
2026-03-24 22:59:59 +01:00
Nylon Chen
3814c4e145 selftests/futex: Conditionally include libnuma support
Use LIBNUMA_TEST to conditionally add -lnuma to LDLIBS.
Guard numa header includes with #ifdef LIBNUMA_VER_SUFFICIENT
to allow compilation without libnuma installed.

Co-developed-by: Zong Li <zong.li@sifive.com>
Signed-off-by: Zong Li <zong.li@sifive.com>
Signed-off-by: Nylon Chen <nylon.chen@sifive.com>
Signed-off-by: Thomas Gleixner <tglx@kernel.org>
Link: https://patch.msgid.link/20260301-20260128_nylon_chen_sifive_com-v3-1-995ab4cc71aa@sifive.com
2026-03-24 22:59:59 +01:00
Thomas Weißschuh
55722b3f80 selftests/bpf: verify_pkcs7_sig: Use 'struct module_signature' from the UAPI headers
Now that the UAPI headers provide the required definitions, use those.
Some symbols have been renamed, adapt to those.

Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Reviewed-by: Petr Pavlu <petr.pavlu@suse.com>
Reviewed-by: Nicolas Schier <nsc@kernel.org>
Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
2026-03-24 21:42:37 +00:00
Thomas Weißschuh
e340db306c sign-file: use 'struct module_signature' from the UAPI headers
Now that the UAPI headers provide the required definitions, use those.
Some symbols have been renamed, adapt to those.

Also adapt the include path for the custom sign-file rule in the
bpf selftests.

Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Reviewed-by: Petr Pavlu <petr.pavlu@suse.com>
Reviewed-by: Nicolas Schier <nsc@kernel.org>
Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
2026-03-24 21:42:37 +00:00
Sun Jian
7f5b0a60a8 selftests/bpf: move trampoline_count to dedicated bpf_testmod target
trampoline_count fills all trampoline attachment slots for a single
target function and verifies that one extra attach fails with -E2BIG.

It currently targets bpf_modify_return_test, which is also used by
other selftests such as modify_return, get_func_ip_test, and
get_func_args_test. When such tests run in parallel, they can contend
for the same per-function trampoline quota and cause unexpected attach
failures. This issue is currently masked by harness serialization.

Move trampoline_count to a dedicated bpf_testmod target and register it
for fmod_ret attachment. Also route the final trigger through
trigger_module_test_read(), so the execution path exercises the same
dedicated target.

This keeps the test semantics unchanged while isolating it from other
selftests, so it no longer needs to run in serial mode. Remove the
TODO comment as well.

Tested:
  ./test_progs -t trampoline_count -vv
  ./test_progs -j$(nproc) -t trampoline_count -vv
  ./test_progs -j$(nproc) -t \
    trampoline_count,modify_return,get_func_ip_test,get_func_args_test -vv
  20 runs of:
    ./test_progs -j$(nproc) -t \
      trampoline_count,modify_return,get_func_ip_test,get_func_args_test

Signed-off-by: Sun Jian <sun.jian.kdev@gmail.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/r/20260324044949.869801-1-sun.jian.kdev@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2026-03-24 13:39:32 -07:00
Jiayuan Chen
d9d7125e44 selftests/bpf: Fix sockmap_multi_channels reliability
Previously I added a FIONREAD test for sockmap, but it can occasionally
fail in CI [1].

The test sends 10 bytes in two segments (2 + 8). For UDP, FIONREAD only
reports the length of the first datagram, not the total queued data.
The original code used recv_timeout() expecting all 10 bytes, but under
high system load, the second datagram may not yet be processed by the
protocol stack, so recv would only return the first 2-byte datagram,
causing a size mismatch failure.

Fix this by receiving exactly the expected bytes (matching FIONREAD) in
the first recv. The remaining datagram is then consumed in a second recv
block, which is only reachable for UDP since TCP's expected already
equals sizeof(buf).

Test:
./test_progs -a sockmap_basic
410/1   sockmap_basic/sockmap create_update_free:OK
...
Summary: 1/35 PASSED, 0 SKIPPED, 0 FAILED

[1] https://github.com/kernel-patches/bpf/actions/runs/22919385910/job/66515395423

Cc: Jiayuan Chen <jiayuan.chen@linux.dev>
Fixes: 17e2ce02bf ("selftests/bpf: Add tests for FIONREAD and copied_seq")
Signed-off-by: Jiayuan Chen <jiayuan.chen@shopee.com>
Link: https://lore.kernel.org/r/20260312072549.6766-1-jiayuan.chen@linux.dev
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2026-03-24 13:38:43 -07:00
Jiayuan Chen
2790db208b selftests/bpf: Improve tc_tunnel test reliability
A test failure was discovered in BPF CI [1] caused by connection timeout.
The current test timeout of 500ms is insufficient for CI environments,
particularly under high load.

While the optimal timeout is unclear, this test was converted from the
original test_tc_tunnel.sh script. The original script used nc with "-w 1"
to specify a 1-second timeout [2]. Therefore, this test restores the
timeout to 1s.

Test:
./test_progs -a tc_tunnel
 #478/1   tc_tunnel/ipip_none:OK
 #478/2   tc_tunnel/ipip6_none:OK
 #478/3   tc_tunnel/ip6tnl_none:OK
 #478/4   tc_tunnel/sit_none:OK
 #478/5   tc_tunnel/vxlan_eth:OK
 #478/6   tc_tunnel/ip6vxlan_eth:OK
 #478/7   tc_tunnel/gre_none:OK
 #478/8   tc_tunnel/gre_eth:OK
 #478/9   tc_tunnel/gre_mpls:OK
 #478/10  tc_tunnel/ip6gre_none:OK
 #478/11  tc_tunnel/ip6gre_eth:OK
 #478/12  tc_tunnel/ip6gre_mpls:OK
 #478/13  tc_tunnel/udp_none:OK
 #478/14  tc_tunnel/udp_eth:OK
 #478/15  tc_tunnel/udp_mpls:OK
 #478/16  tc_tunnel/ip6udp_none:OK
 #478/17  tc_tunnel/ip6udp_eth:OK
 #478/18  tc_tunnel/ip6udp_mpls:OK
 #478     tc_tunnel:OK
 Summary: 1/18 PASSED, 0 SKIPPED, 0 FAILED

[1] https://github.com/kernel-patches/bpf/actions/runs/22674350732/job/65728072723
[2] https://lore.kernel.org/all/20251027-tc_tunnel-v3-4-505c12019f9d@bootlin.com/

Cc: Jiayuan Chen <jiayuan.chen@linux.dev>
Signed-off-by: Jiayuan Chen <jiayuan.chen@shopee.com>
Link: https://lore.kernel.org/r/20260312083615.31835-1-jiayuan.chen@linux.dev
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2026-03-24 13:38:04 -07:00
Slava Imameev
e8571de534 selftests/bpf: Add trampolines single and multi-level pointer params test coverage
Add single and multi-level pointer parameters and return value test
coverage for BPF trampolines. Includes verifier tests for single and
multi-level pointers. The tests check verifier logs for pointers
inferred as scalar() type.

Signed-off-by: Slava Imameev <slava.imameev@crowdstrike.com>
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/r/20260314082127.7939-3-slava.imameev@crowdstrike.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2026-03-24 13:36:32 -07:00
Cheng-Yang Chou
7ef26d62f3 selftests/sched_ext: Skip rt_stall on older kernels and list skipped tests
rt_stall tests the ext DL server which was introduced in commit
cd959a3562 ("sched_ext: Add a DL server for sched_ext tasks").
On older kernels that lack this feature, the test calls ksft_exit_fail()
internally which terminates the entire runner process, preventing
subsequent tests from running.

Add a guard in setup() that checks for the ext_server field in struct rq
via __COMPAT_struct_has_field() and returns SCX_TEST_SKIP if not present.

Also print the names of skipped tests in the results summary, mirroring
the existing behavior for failed tests.

Signed-off-by: Cheng-Yang Chou <yphbchou0911@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2026-03-24 10:33:25 -10:00
Tejun Heo
6680c162b4 selftests/cgroup: Don't require synchronous populated update on task exit
test_cgcore_populated (test_core) and test_cgkill_{simple,tree,forkbomb}
(test_kill) check cgroup.events "populated 0" immediately after reaping
child tasks with waitpid(). This used to work because cgroup_task_exit() in
do_exit() unlinked tasks from css_sets before exit_notify() woke up
waitpid().

d245698d72 ("cgroup: Defer task cgroup unlink until after the task is done
switching out") moved the unlink to cgroup_task_dead() in
finish_task_switch(), which runs after exit_notify(). The populated counter
is now decremented after the parent's waitpid() can return, so there is no
longer a synchronous ordering guarantee. On PREEMPT_RT, where
cgroup_task_dead() is further deferred through lazy irq_work, the race
window is even larger.

The synchronous populated transition was never part of the cgroup interface
contract - it was an implementation artifact. Use cg_read_strcmp_wait() which
retries for up to 1 second, matching what these tests actually need to
verify: that the cgroup eventually becomes unpopulated after all tasks exit.

Fixes: d245698d72 ("cgroup: Defer task cgroup unlink until after the task is done switching out")
Reported-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Tejun Heo <tj@kernel.org>
Tested-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Christian Brauner <brauner@kernel.org>
Cc: cgroups@vger.kernel.org
2026-03-24 10:21:57 -10:00
Alexei Starovoitov
7b4f1a29c7 selftests/bpf: Test 32-bit scalar spill pruning in stacksafe()
Add a test verifying that stacksafe() correctly handles 32-bit scalar
spills when comparing stack states for equivalence during state pruning.

A 32-bit scalar spill creates slot[0-3] = STACK_INVALID and
slot[4-7] = STACK_SPILL. Without the im=4 check in stacksafe(), the
STACK_SPILL vs STACK_MISC mismatch at byte 4 causes pruning to fail,
forcing the verifier to re-explore a path that is provably safe.

Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/r/20260323022410.75444-2-alexei.starovoitov@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2026-03-24 12:10:38 -07:00
Paolo Bonzini
12fd965871 KVM: s390: Fixes for 7.0
- fix deadlock in new memory management
 - handle kernel faults on donated memory properly
 - fix bounds checking for irq routing + selftest
 - fix invalid machine checks + logging
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEE+SKTgaM0CPnbq/vKEXu8gLWmHHwFAmm5TzoACgkQEXu8gLWm
 HHyrjQ/+KlX/odZnN6KE/WGxB0pf06aXfQTBhM8vmfrig/vimIZrm2xszO6TIdZQ
 rYcUik1mMv1VTCYi4RWnKPklj70NgXRRKwfUNrHzql4VFiTlCPmALHw7LDUDrJEf
 OriU4wL+T9G/638logfZJBmfhunHR6HqHP+LJLm6eIIQKIYmEjPoGpSB1HBP+9YN
 viz2dvKXO8NR41rx14NkqMeyR6zQl+I+1CQCuJmSqxtnAyRFPCTrWLElPFO+J+ha
 02jurSiQk89nLlgEqlzthnbv9NopyaLErSXXx9FzESjHli6hhP8rPtxDL2oJB1VF
 YHDW5ln1w1H22i1VXuyU5jg4D3OOUz7e//CaP5wZBHFUIJxpYzeK7faDLYJHphk4
 JNg4uI+mhQ/6E2Dlos8efefP/gqdVAfqOHr7l+4nCYtfh3aQhezbQAB24W6wQL9/
 gs/TnTRt8Rs2UGXLAY0t3+Y7ATrRynDD5DzmQodc19l26076QodvI1xCeptX5Kth
 N855SIIcCcEbYSK1fSquIeCoJ9aAAyQbLDefNLHtWzgzX+Lz77lnmu90tpVnq4qk
 sjIsFq6qw8xso3bDKviiFOLdJz/zTW33YCHKPAl43iFgc6yC8pTT4hp6J5kcGHmD
 bwRSnUz9mmgmyCzU/DetXo3P+n5mqXG2c+iMMQ8vkig+NVduQ7w=
 =uUMD
 -----END PGP SIGNATURE-----

Merge tag 'kvm-s390-master-7.0-1' of git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into HEAD

KVM: s390: Fixes for 7.0

- fix deadlock in new memory management
- handle kernel faults on donated memory properly
- fix bounds checking for irq routing + selftest
- fix invalid machine checks + logging
2026-03-24 17:32:13 +01:00
Varun R Mallya
b43d574c00 selftests/bpf: Add test for struct_ops __ref argument in any position
Add a selftest to verify that the verifier correctly identifies refcounted
arguments in struct_ops programs, even when they are not the first
argument. This ensures that the restriction on tail calls for programs
with __ref arguments is properly enforced regardless of which argument
they appear in.

This test verifies the fix for check_struct_ops_btf_id() proposed by
Keisuke Nishimura [0], which corrected a bug where only the first
argument was checked for the refcounted flag.
The test includes:
- An update to bpf_testmod to add 'test_refcounted_multi', an operator with
  three arguments where the third is tagged with "__ref".
- A BPF program 'test_refcounted_multi' that attempts a tail call.
- A test runner that asserts the verifier rejects the program with
  "program with __ref argument cannot tail call".

[0]: https://lore.kernel.org/bpf/20260320130219.63711-1-keisuke.nishimura@inria.fr/

Signed-off-by: Varun R Mallya <varunrmallya@gmail.com>
Link: https://lore.kernel.org/r/20260321214038.80479-1-varunrmallya@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2026-03-24 08:51:23 -07:00
Amery Hung
03b7b389fe selftests/bpf: Fix compiler warnings in task_local_data.h
Fix compiler warnings about unused parameter, narrowing non-constant
into a smaller type and comparison between integers of different size.

Signed-off-by: Amery Hung <ameryhung@gmail.com>
Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Link: https://lore.kernel.org/r/20260323231133.859941-1-ameryhung@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2026-03-24 08:46:54 -07:00
Jiayuan Chen
56063823b9 selftests: team: add non-Ethernet header_ops reproducer
Add a team selftest that sets up:
  g0 (gre) -> b0 (bond) -> t0 (team)

and triggers IPv6 traffic on t0. This reproduces the non-Ethernet
header_ops confusion scenario and protects against regressions in stacked
team/bond/gre configurations.

Using this script, the panic reported by syzkaller can be reproduced [1].

After the fix:

  # ./non_ether_header_ops.sh
  PASS: non-Ethernet header_ops stacking did not crash

[1] https://syzkaller.appspot.com/bug?extid=3d8bc31c45e11450f24c

Cc: Jiayuan Chen <jiayuan.chen@linux.dev>
Signed-off-by: Jiayuan Chen <jiayuan.chen@shopee.com>
Link: https://patch.msgid.link/20260320072139.134249-3-jiayuan.chen@linux.dev
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2026-03-24 11:26:32 +01:00
Aleksei Oladko
9d463f7863 selftests: net: io_uring_zerocopy: enable io_uring for the test
The io_uring_zerocopy.sh kselftest assumes that io_uring support is
enabled on the host system. When io_uring is disabled via the
kernel.io_uring_disabled sysctl, the test fails.

Explicitly enable io_uring for the test by setting
kernel.io_uring_disabled=0.

Save the original value of kernel.io_uring_disabled before changing
it and restore it in cleanup handler to ensure the system state is
restored regardless of test outcome.

Signed-off-by: Aleksei Oladko <aleksey.oladko@virtuozzo.com>
Signed-off-by: Konstantin Khorenko <khorenko@virtuozzo.com>
Link: https://patch.msgid.link/20260321215908.175465-5-aleksey.oladko@virtuozzo.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-03-23 19:58:32 -07:00
Aleksei Oladko
a897e194c4 selftests: net: run reuseport in an isolated netns
The reuseport_* tests (bpf, bpf_cpu, bpf_numa, dualstack) currently use
a fixed port range. This can cause intermittent test failures when the
ports are already in use by other services:

  failed to bind receive socket: Address already in use

To avoid conflicts, run these tests in separate network namespaces using
unshare. Each test now has its own isolated network stack, preventing
port collisions with the host services.

Signed-off-by: Aleksei Oladko <aleksey.oladko@virtuozzo.com>
Link: https://patch.msgid.link/20260321215908.175465-2-aleksey.oladko@virtuozzo.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-03-23 19:58:29 -07:00
Allison Henderson
381a503f0e selftests: rds: Add -c config option to rds/config.sh
This patch adds a new -c flag to config.sh that enables callers
to specify the file path of the config they would like to update.
If no config is specified, the default will be the .config of the
current directory.

Signed-off-by: Allison Henderson <achender@kernel.org>
Link: https://patch.msgid.link/20260320041834.2761069-3-achender@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-03-23 19:39:07 -07:00
Allison Henderson
0f5d680047 selftests: rds: add tools/testing/selftests/net/rds/config
The ksft CI runtime needs an rds specific config file to build a
minimal kernel with the right options enabled.  This patch adds
an rds selftest config containing the required CONFIG_RDS* and
CONFIG_NET_* options.

Signed-off-by: Allison Henderson <achender@kernel.org>
Link: https://patch.msgid.link/20260320041834.2761069-2-achender@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-03-23 19:38:53 -07:00
Jiayuan Chen
e81cf512c1 selftests: bonding: add test for stacked bond header_parse recursion
Add a selftest to reproduce the infinite recursion in bond_header_parse()
when bonds are stacked (bond1 -> bond0 -> gre). When a packet is received
via AF_PACKET SOCK_DGRAM on the topmost bond, dev_parse_header() calls
bond_header_parse() which used skb->dev (always the topmost bond) to get
the bonding struct. This caused it to recurse back into itself
indefinitely, leading to stack overflow.

Before commit b7405dcf73 ("bonding: prevent potential infinite loop
in bond_header_parse()"), the test triggers:

  ./bond_stacked_header_parse.sh

  [  71.999481] BUG: MAX_LOCK_DEPTH too low!
  [  72.000170] turning off the locking correctness validator.
  [  72.001029] Please attach the output of /proc/lock_stat to the bug report
  [  72.002079] depth: 48  max: 48!
  ...

After the fix, everything works fine:

  ./bond_stacked_header_parse.sh
  TEST: Stacked bond header_parse does not recurse                  [ OK ]

Signed-off-by: Jiayuan Chen <jiayuan.chen@shopee.com>
Link: https://patch.msgid.link/20260320022245.392384-1-jiayuan.chen@linux.dev
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-03-23 19:37:04 -07:00
Daniel Golle
db3bd9e55c selftests: forwarding: local_termination: fix PTP UDP cksums
All six PTP-over-IP test frames (3x IPv4 + 3x IPv6) contain incorrect
UDP checksums. The stored values are the ones-complement sums of just
the pseudo-headers, not the complete UDP checksums over pseudo-header +
UDP header + payload. This is characteristic of frames captured on the
sender before TX checksum offload completion.

For example, the IPv4 Sync and Follow-Up frames both store checksum
0xa3c8 despite having different UDP payloads and port numbers - 0xa3c8
is their shared pseudo-header sum (same src/dst IP, same protocol and
UDP length).

While most L2 switches forward frames without verifying transport
checksums, hardware that performs deep packet inspection or has PTP
awareness may validate UDP checksums and drop frames that fail
verification. This causes the 1588v2 over IPv4/IPv6 tests to fail on
such hardware even though L2 PTP (which has no UDP checksum) passes
fine.

Replace all six pseudo-header partial sums with the correctly computed
full UDP checksums:

  IPv4 Sync:           0xa3c8 -> 0x9f41
  IPv4 Follow-Up:      0xa3c8 -> 0xeb8a
  IPv4 Peer Delay Req: 0xa2bc -> 0x9ab9
  IPv6 Sync:           0x2e92 -> 0x1476
  IPv6 Follow-Up:      0x2e92 -> 0xf047
  IPv6 Peer Delay Req: 0xb454 -> 0x891f

Signed-off-by: Daniel Golle <daniel@makrotopia.org>
Reviewed-by: Alexander Sverdlin <alexander.sverdlin@gmail.com>
Link: https://patch.msgid.link/651c3decb80023e4395ec149fd81110afa3869a1.1774067006.git.daniel@makrotopia.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-03-23 18:57:31 -07:00
Björn Töpel
10329ce492 selftests: rss_drv: Add RSS indirection table resize tests
Add resize tests to rss_drv.py. Devices without dynamic table sizing
are skipped via _require_dynamic_indir_size().

resize_periodic: set a periodic 4-entry table, shrink channels to
fold, grow back to unfold. Check the exact pattern is preserved. Has
main and non-default context variants.

resize_below_user_size_reject: send a periodic table with user_size
between the big and small device table sizes. Verify that shrinking
below user_size is rejected even though the table is periodic. Has
main and non-default context variants.

resize_nonperiodic_reject: set a non-periodic table (equal N), verify
that channel reduction is rejected.

resize_nonperiodic_no_corruption: verify a failed resize leaves both
the indirection table contents and the channel count unchanged.

Signed-off-by: Björn Töpel <bjorn@kernel.org>
Link: https://patch.msgid.link/20260320085826.1957255-5-bjorn@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-03-23 17:59:54 -07:00
Kuniyuki Iwashima
3e9e84e92c selftest: net: Add GC test for temporary routes with exceptions.
Without the prior commit, IPv6 GC cannot track exceptions tied
to permanent routes if they were originally added as temporary
routes.

Let's add a test case for the issue.

  1. Add temporary routes
  2. Create exceptions for the temporary routes
  3. Promote the routes to permanent routes
  4. Check if GC can find and purge the exceptions

A few notes:

  + At step 4, unlike other test cases, we cannot wait for
    $GC_WAIT_TIME.  While the exceptions are always iterable via
    netlink (since it traverses the entire fib tree instead of
    tb6_gc_hlist), rt6_nh_dump_exceptions() skips expired entries.

    If we waited for the expiration time, we would be unable to
    distinguish whether the exceptions were truly purged by GC or
    just hidden due to being expired.

  + For the same reason, at step 2, we use ICMPv6 redirect message
    instead of Packet Too Big message.  This is because MTU exceptions
    always have RTF_EXPIRES, and rt6_age_examine_exception() does not
    respect the period specified by net.ipv6.route.flush=1.

  + We add a neighbour entry for the redirect target with NTF_ROUTER.
    Without this, the exceptions would be removed at step 3 when the
    fib6_may_remove_gc_list() is called.

Without the fix, the exceptions remain even after GC is triggered
by sysctl -wq net.ipv6.route.flush=1.

  FAIL: Expected 0 routes, got 5
      TEST: ipv6 route garbage collection (promote to permanent routes)   [FAIL]

With the fix, GC purges the exceptions properly.

      TEST: ipv6 route garbage collection (promote to permanent routes)   [ OK ]

Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://patch.msgid.link/20260320072317.2561779-4-kuniyu@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-03-23 16:59:31 -07:00
Varun R Mallya
bb6da652c5 selftests/bpf: Improve connect_force_port test reliability
The connect_force_port test fails intermittently in CI because the
hardcoded server ports (60123/60124) may already be in use by other
tests or processes [1].

Fix this by passing port 0 to start_server(), letting the kernel assign
a free port dynamically. The actual assigned port is then propagated to
the BPF programs by writing it into the .bss map's initial value (via
bpf_map__initial_value()) before loading, so the BPF programs use the
correct backend port at runtime.

[1] https://github.com/kernel-patches/bpf/actions/runs/22697676317/job/65808536038

Suggested-by: Jiayuan Chen <jiayuan.chen@linux.dev>
Signed-off-by: Varun R Mallya <varunrmallya@gmail.com>
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Reviewed-by: Sun Jian <sun.jian.kdev@gmail.com>
Reviewed-by: Jiayuan Chen <jiayuan.chen@linux.dev>
Link: https://patch.msgid.link/20260323081131.65604-1-varunrmallya@gmail.com
2026-03-23 11:37:34 -07:00
Emanuele Rocca
7aaa4915cb
selftests: check pidfd_info->coredump_code correctness
Extend the coredump_socket and coredump_socket_protocol selftests to verify
that the field coredump_code is set as expected in struct pidfd_info.

Signed-off-by: Emanuele Rocca <emanuele.rocca@arm.com>
Link: https://patch.msgid.link/acE6Eyuv2MM75pmk@NH27D9T0LF
Signed-off-by: Christian Brauner <brauner@kernel.org>
2026-03-23 16:29:15 +01:00
Emanuele Rocca
3fc66a1033
kselftest/coredump: reintroduce null pointer dereference
Commit 673a55cc49 replaced the null pointer dereference used in
crashing_child() with __builtin_trap to address the following LLVM warnings:

 coredump_test_helpers.c:59:6: warning: indirection of non-volatile null pointer will be deleted, not trap [-Wnull-dereference]
 coredump_test_helpers.c:59:6: note: consider using __builtin_trap() or qualifying pointer with 'volatile'

All coredump tests expect crashing_child() to result in a SIGSEGV. However, the
behavior of __builtin_trap is architecture-dependent. On x86 it yields SIGILL,
on aarch64 SIGTRAP. Given that neither of those signals are SIGSEGV, both
coredump_socket_test and coredump_socket_protocol_test are currently failing:

 get_pidfd_info: mask=0xd7, coredump_mask=0x5, coredump_signal=5
 socket_coredump_signal_sigsegv: coredump_signal=5, expected SIGSEGV=11

Qualify the pointer with volatile instead of calling __builtin_trap to fix the
tests.

Signed-off-by: Emanuele Rocca <emanuele.rocca@arm.com>
Link: https://patch.msgid.link/ab2kI0PI_Vk6bU88@NH27D9T0LF
Reviewed-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Christian Brauner <brauner@kernel.org>
2026-03-23 16:27:10 +01:00
Alexei Starovoitov
bfec8e88ff Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf 7.0-rc5
Cross-merge BPF and other fixes after downstream PR.

Minor conflicts in:
  tools/testing/selftests/bpf/progs/exceptions_fail.c
  tools/testing/selftests/bpf/progs/verifier_bounds.c

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2026-03-22 19:33:29 -07:00
zhidao su
c50dcf5331 selftests/sched_ext: Add tests for SCX_ENQ_IMMED and scx_bpf_dsq_reenq()
Add three selftests covering features introduced in v7.1:

- dsq_reenq: Verify scx_bpf_dsq_reenq() on user DSQs triggers
  ops.enqueue() with SCX_ENQ_REENQ and SCX_TASK_REENQ_KFUNC in
  p->scx.flags.

- enq_immed: Verify SCX_OPS_ALWAYS_ENQ_IMMED slow path where tasks
  dispatched to a busy CPU's local DSQ are re-enqueued through
  ops.enqueue() with SCX_TASK_REENQ_IMMED.

- consume_immed: Verify SCX_ENQ_IMMED via the consume path using
  scx_bpf_dsq_move_to_local___v2() with explicit SCX_ENQ_IMMED.

All three tests skip gracefully on kernels that predate the required
features by checking availability via __COMPAT_has_ksym() /
__COMPAT_read_enum() before loading.

Signed-off-by: zhidao su <suzhidao@xiaomi.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2026-03-22 14:29:42 -10:00
Danilo Krummrich
14cf406e08 Linux 7.0-rc5
-----BEGIN PGP SIGNATURE-----
 
 iQFSBAABCgA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAmnAYjkeHHRvcnZhbGRz
 QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGUawH/RJlc6oT7epAoDmS
 STyZW2CF1U4r8YRFmB2LdmAeydMBbxiQeISN6ap3SkNGURGCZ9WcxjOE4uFhL2WW
 aQMTxFaSwj9YxB6x7P9JUn/evJWFh8Fhh48kh2LfXPEpxYQBaMhnGzDSZJ3rtRxi
 NMJ/OXqg8JbZS44S2OOC9HxgoYBJ1pNXvi/Q6hWFEJphr3qXaxSvl4go0zMNFEbq
 39CK5vbJC8yQo4R/MAjuGvFs+Ivpjypp56iSIpK45z53Dfv255iSzIJhDxM3YGvl
 p3nVKpKAPuQMXpOUMOEl6z6muRL50es/Kgi2xh2JK6FWOOOgfunGNWDG/g5mgxjI
 QDUAm6U=
 =2ECC
 -----END PGP SIGNATURE-----

Merge tag 'v7.0-rc5' into driver-core-next

We need the driver-core fixes in here as well to build on top of.

Signed-off-by: Danilo Krummrich <dakr@kernel.org>
2026-03-22 23:13:33 +01:00
Linus Torvalds
d5273fd3ca bpf-fixes
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEE+soXsSLHKoYyzcli6rmadz2vbToFAmnAGisACgkQ6rmadz2v
 bTqjsw/9GfHT/fdnjfA/q27TQH28ZdrZfq90BpI3m5BfTO8/l+Kt+g1HDGpku+C/
 iWh66rg9t/P9nMvtdzvPsdT833UbwbY6fPEK3r7ANgf7SBb1DNvaGHBM6XNefvZV
 j+VcykKUaEo8U1GeG+gI4TyAALSqvvMeBPYpAPZDUYguYLyE+YIl2Pl6tWt+A7yf
 9V3JjCSz63t75qqnhY2SIBZv2pqWiMaCI8uPgaF7drhQM5Xc0l/R75CMPGeF9BrT
 GRtTVJhY+6UyI2Q0ZRSRSVHZ1j2kYHI/eK3Kamxwal5hNh37BYHm3pT5TSHbZTe1
 xO7c1AB0vds8kznRkclQfsMdjVwuBQj03ukLVNqnnaaE4Ir7JlXlXYgeG0KJbbfW
 kQG8UyDD7tMWZkvaA0Z51FC88WJNLJoNAku519alcMtgAf1CrxzG9aUAYEWE4erh
 E/FKKvFqQ6T0mOFSXlk1NFeMjNXcg5Tu2KKKKOjAWT6goUc4hw80IWydTyxMy32m
 8/eLmdTZpAQovc2rS+5LSTigQ3DT082J950sxdQ3yRaLTWBGNC06gkA/WcRq2ZI+
 hBdW6GI1XFwkXGw5+F9fN9Bt5FmE42v44i+RrlNZV1R5bVr0Za/ofkWP3dm1/SOg
 QRSJk30hx9JveR9gD/xWawycYFuwmha/BL0tur2T32M67MneJpo=
 =Ye1S
 -----END PGP SIGNATURE-----

Merge tag 'bpf-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf

Pull bpf fixes from Alexei Starovoitov:

 - Fix how linked registers track zero extension of subregisters (Daniel
   Borkmann)

 - Fix unsound scalar fork for OR instructions (Daniel Wade)

 - Fix exception exit lock check for subprogs (Ihor Solodrai)

 - Fix undefined behavior in interpreter for SDIV/SMOD instructions
   (Jenny Guanni Qu)

 - Release module's BTF when module is unloaded (Kumar Kartikeya
   Dwivedi)

 - Fix constant blinding for PROBE_MEM32 instructions (Sachin Kumar)

 - Reset register ID for END instructions to prevent incorrect value
   tracking (Yazhou Tang)

* tag 'bpf-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf:
  selftests/bpf: Add a test cases for sync_linked_regs regarding zext propagation
  bpf: Fix sync_linked_regs regarding BPF_ADD_CONST32 zext propagation
  selftests/bpf: Add tests for maybe_fork_scalars() OR vs AND handling
  bpf: Fix unsound scalar forking in maybe_fork_scalars() for BPF_OR
  selftests/bpf: Add tests for sdiv32/smod32 with INT_MIN dividend
  bpf: Fix undefined behavior in interpreter sdiv/smod for INT_MIN
  selftests/bpf: Add tests for bpf_throw lock leak from subprogs
  bpf: Fix exception exit lock checking for subprogs
  bpf: Release module BTF IDR before module unload
  selftests/bpf: Fix pkg-config call on static builds
  bpf: Fix constant blinding for PROBE_MEM32 stores
  selftests/bpf: Add test for BPF_END register ID reset
  bpf: Reset register ID for BPF_END value tracking
2026-03-22 11:16:06 -07:00
Thomas Weißschuh
c8f6a4bbad selftests/nolibc: enable -Wundef
Users might use -Wundef, so also use it in the nolibc testsuite to
ensure no warnings are triggered.

The existing line with warning options is getting too long,
so move all warnings to a dedicated line.

Acked-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
Link: https://patch.msgid.link/20260318-nolibc-wundef-v1-2-fcb7f9ac7298@weissschuh.net
2026-03-22 11:03:59 +01:00
Thomas Weißschuh
b74be92274 tools/nolibc: add support for program_invocation_{,short_}name
Add support for the GNU extensions 'program_invocation_name' and
'program_invocation_short_name'. These are useful to print error
messages, which by convention include the program name.

As these are global variables which take up memory even if not used,
similar to 'errno', gate them behind NOLIBC_IGNORE_ERRNO.

Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
Acked-by: Willy Tarreau <w@1wt.eu>
Link: https://patch.msgid.link/20260318-nolibc-err-h-v4-1-08247a694bd9@weissschuh.net
2026-03-22 10:40:30 +01:00
Daniel Borkmann
4a04d13576 selftests/bpf: Add a test cases for sync_linked_regs regarding zext propagation
Add multiple test cases for linked register tracking with alu32 ops:

  - Add a test that checks sync_linked_regs() regarding reg->id (the linked
    target register) for BPF_ADD_CONST32 rather than known_reg->id (the
    branch register).

  - Add a test case for linked register tracking that exposes the cross-type
    sync_linked_regs() bug. One register uses alu32 (w7 += 1, BPF_ADD_CONST32)
    and another uses alu64 (r8 += 2, BPF_ADD_CONST64), both linked to the
    same base register.

  - Add a test case that exercises regsafe() path pruning when two execution
    paths reach the same program point with linked registers carrying
    different ADD_CONST flags (BPF_ADD_CONST32 from alu32 vs BPF_ADD_CONST64
    from alu64). This particular test passes with and without the fix since
    the pruning will fail due to different ranges, but it would still be
    useful to carry this one as a regression test for the unreachable div
    by zero.

With the fix applied all the tests pass:

  # LDLIBS=-static PKG_CONFIG='pkg-config --static' ./vmtest.sh -- ./test_progs -t verifier_linked_scalars
  [...]
  ./test_progs -t verifier_linked_scalars
  #602/1   verifier_linked_scalars/scalars: find linked scalars:OK
  #602/2   verifier_linked_scalars/sync_linked_regs_preserves_id:OK
  #602/3   verifier_linked_scalars/scalars_neg:OK
  #602/4   verifier_linked_scalars/scalars_neg_sub:OK
  #602/5   verifier_linked_scalars/scalars_neg_alu32_add:OK
  #602/6   verifier_linked_scalars/scalars_neg_alu32_sub:OK
  #602/7   verifier_linked_scalars/scalars_pos:OK
  #602/8   verifier_linked_scalars/scalars_sub_neg_imm:OK
  #602/9   verifier_linked_scalars/scalars_double_add:OK
  #602/10  verifier_linked_scalars/scalars_sync_delta_overflow:OK
  #602/11  verifier_linked_scalars/scalars_sync_delta_overflow_large_range:OK
  #602/12  verifier_linked_scalars/scalars_alu32_big_offset:OK
  #602/13  verifier_linked_scalars/scalars_alu32_basic:OK
  #602/14  verifier_linked_scalars/scalars_alu32_wrap:OK
  #602/15  verifier_linked_scalars/scalars_alu32_zext_linked_reg:OK
  #602/16  verifier_linked_scalars/scalars_alu32_alu64_cross_type:OK
  #602/17  verifier_linked_scalars/scalars_alu32_alu64_regsafe_pruning:OK
  #602/18  verifier_linked_scalars/alu32_negative_offset:OK
  #602/19  verifier_linked_scalars/spurious_precision_marks:OK
  #602     verifier_linked_scalars:OK
  Summary: 1/19 PASSED, 0 SKIPPED, 0 FAILED

Co-developed-by: Puranjay Mohan <puranjay@kernel.org>
Signed-off-by: Puranjay Mohan <puranjay@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/r/20260319211507.213816-2-daniel@iogearbox.net
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2026-03-21 13:19:40 -07:00
Mykyta Yatsenko
ceebdeec6e selftests/bpf: Test bpf_program__clone() attach_btf_id override
Add a test that verifies bpf_program__clone() respects caller-provided
attach_btf_id in bpf_prog_load_opts.

The BPF program has SEC("fentry/bpf_fentry_test1"). It is cloned twice
from the same prepared object: first with no opts, verifying the
callback resolves attach_btf_id from sec_name to bpf_fentry_test1;
then with attach_btf_id overridden to bpf_fentry_test2, verifying the
loaded program is actually attached to bpf_fentry_test2. Both results
are checked via bpf_prog_get_info_by_fd().

Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com>
Reviewed-by: Emil Tsalapatis <emil@etsalapatis.com>
Link: https://lore.kernel.org/r/20260317-veristat_prepare-v4-3-74193d4cc9d9@meta.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2026-03-21 13:17:14 -07:00
Mykyta Yatsenko
3be706b937 selftests/bpf: Use bpf_program__clone() in veristat
Replace veristat's per-program object re-opening with
bpf_program__clone().

Previously, veristat opened a separate bpf_object for every program in a
multi-program object file, iterated all programs to enable only the
target one, and then loaded the entire object.

Use bpf_object__prepare() once, then call bpf_program__clone() for each
program individually. This lets veristat load programs one at a time
from a single prepared object.

The caller now owns the returned fd and closes it after collecting stats.
Remove the special single-program fast path and the per-file early exit
in handle_verif_mode() so all files are always processed.

Split fixup_obj() into fixup_obj_maps() for object-wide map fixups that
must run before bpf_object__prepare(), and fixup_obj() for per-program
fixups (struct_ops masking, freplace type guessing) that run before each
bpf_program__clone() call.

Reviewed-by: Emil Tsalapatis <emil@etsalapatis.com>
Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com>
Link: https://lore.kernel.org/r/20260317-veristat_prepare-v4-2-74193d4cc9d9@meta.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2026-03-21 13:17:14 -07:00
Daniel Wade
0ad1734cc5 selftests/bpf: Add tests for maybe_fork_scalars() OR vs AND handling
Add three test cases to verifier_bounds.c to verify that
maybe_fork_scalars() correctly tracks register values for BPF_OR
operations with constant source operands:

1. or_scalar_fork_rejects_oob: After ARSH 63 + OR 8, the pushed
   path should have dst = 8. With value_size = 8, accessing
   map_value + 8 is out of bounds and must be rejected.

2. and_scalar_fork_still_works: Regression test ensuring AND
   forking continues to work. ARSH 63 + AND 4 produces pushed
   dst = 0 and current dst = 4, both within value_size = 8.

3. or_scalar_fork_allows_inbounds: After ARSH 63 + OR 4, the
   pushed path has dst = 4, which is within value_size = 8
   and should be accepted.

These tests exercise the fix in the previous patch, which makes the
pushed path re-execute the ALU instruction so it computes the correct
result for BPF_OR.

Signed-off-by: Daniel Wade <danjwade95@gmail.com>
Reviewed-by: Amery Hung <ameryhung@gmail.com>
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/r/20260314021521.128361-3-danjwade95@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2026-03-21 13:14:28 -07:00
Jenny Guanni Qu
4ac95c65ef selftests/bpf: Add tests for sdiv32/smod32 with INT_MIN dividend
Add tests to verify that signed 32-bit division and modulo operations
produce correct results when the dividend is INT_MIN (0x80000000).

The bug fixed in the previous commit only affects the BPF interpreter
path. When JIT is enabled (the default on most architectures), the
native CPU division instruction produces the correct result and these
tests pass regardless. With bpf_jit_enable=0, the interpreter is used
and without the previous fix, INT_MIN / 2 incorrectly returns
0x40000000 instead of 0xC0000000 due to abs(S32_MIN) undefined
behavior, causing these tests to fail.

Test cases:
  - SDIV32 INT_MIN / 2 = -1073741824 (imm and reg divisor)
  - SMOD32 INT_MIN % 2 = 0 (positive and negative divisor)

Reviewed-by: Jiayuan Chen <jiayuan.chen@linux.dev>
Acked-by: Yonghong Song <yonghong.song@linux.dev>
Signed-off-by: Jenny Guanni Qu <qguanni@gmail.com>
Link: https://lore.kernel.org/r/20260311011116.2108005-3-qguanni@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2026-03-21 13:12:17 -07:00
Tiezhu Yang
f4706504e2 selftests/bpf: Add alignment flag for test_verifier 190 testcase
There exists failure when executing the testcase "./test_verifier 190" if
CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS is not set on LoongArch.

  #190/p calls: two calls that return map_value with incorrect bool check FAIL
  ...
  misaligned access off (0x0; 0xffffffffffffffff)+0 size 8
  ...
  Summary: 0 PASSED, 0 SKIPPED, 1 FAILED

It means that the program has unaligned accesses, but the kernel sets
CONFIG_ARCH_STRICT_ALIGN by default to enable -mstrict-align to prevent
unaligned accesses, so add a flag F_NEEDS_EFFICIENT_UNALIGNED_ACCESS
into the testcase to avoid the failure.

This is somehow similar with the commit ce1f289f54 ("selftests/bpf:
Add F_NEEDS_EFFICIENT_UNALIGNED_ACCESS to some tests").

Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Reviewed-by: Emil Tsalapatis <emil@etsalapatis.com>
Acked-by: Paul Chaignon <paul.chaignon@gmail.com>
Link: https://lore.kernel.org/r/20260310064507.4228-3-yangtiezhu@loongson.cn
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2026-03-21 13:11:18 -07:00
Puranjay Mohan
a2542a91aa bpf: Consolidate sleepable checks in check_func_call()
The sleepable context check for global function calls in
check_func_call() open-codes the same checks that in_sleepable_context()
already performs. Replace the open-coded check with a call to
in_sleepable_context() and use non_sleepable_context_description() for
the error message, consistent with check_helper_call() and
check_kfunc_call().

Note that in_sleepable_context() also checks active_locks, which
overlaps with the existing active_locks check above it. However, the two
checks serve different purposes: the active_locks check rejects all
global function calls while holding a lock (not just sleepable ones), so
it must remain as a separate guard.

Update the expected error messages in the irq and preempt_lock selftests
to match.

Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Signed-off-by: Puranjay Mohan <puranjay@kernel.org>
Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Link: https://lore.kernel.org/r/20260318174327.3151925-4-puranjay@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2026-03-21 13:09:35 -07:00
Puranjay Mohan
a0d06cf102 bpf: Consolidate sleepable checks in check_helper_call()
check_helper_call() prints the error message for every
env->cur_state->active* element when calling a sleepable helper.
Consolidate all of them into a single print statement.

The check for env->cur_state->active_locks was not part of the removed
print statements and will not be triggered with the consolidated print
as well because it is checked in do_check() before check_helper_call()
is even reached.

Acked-by: Mykyta Yatsenko <yatsenko@meta.com>
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Signed-off-by: Puranjay Mohan <puranjay@kernel.org>
Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Link: https://lore.kernel.org/r/20260318174327.3151925-2-puranjay@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2026-03-21 13:09:34 -07:00
Ihor Solodrai
a1e5c46eae selftests/bpf: Add tests for bpf_throw lock leak from subprogs
Add test cases to ensure the verifier correctly rejects bpf_throw from
subprogs when RCU, preempt, or IRQ locks are held:

  * reject_subprog_rcu_lock_throw: subprog acquires bpf_rcu_read_lock and
    then calls bpf_throw
  * reject_subprog_throw_preempt_lock: always-throwing subprog called while
    caller holds bpf_preempt_disable
  * reject_subprog_throw_irq_lock: always-throwing subprog called while
    caller holds bpf_local_irq_save

Assisted-by: Claude:claude-opus-4-6
Signed-off-by: Ihor Solodrai <ihor.solodrai@linux.dev>
Acked-by: Yonghong Song <yonghong.song@linux.dev>
Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Link: https://lore.kernel.org/r/20260320000809.643798-2-ihor.solodrai@linux.dev
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2026-03-21 12:51:44 -07:00
Ihor Solodrai
6c2128505f bpf: Fix exception exit lock checking for subprogs
process_bpf_exit_full() passes check_lock = !curframe to
check_resource_leak(), which is false in cases when bpf_throw() is
called from a static subprog. This makes check_resource_leak() to skip
validation of active_rcu_locks, active_preempt_locks, and
active_irq_id on exception exits from subprogs.

At runtime bpf_throw() unwinds the stack via ORC without releasing any
user-acquired locks, which may cause various issues as the result.

Fix by setting check_lock = true for exception exits regardless of
curframe, since exceptions bypass all intermediate frame
cleanup. Update the error message prefix to "bpf_throw" for exception
exits to distinguish them from normal BPF_EXIT.

Fix reject_subprog_with_rcu_read_lock test which was previously
passing for the wrong reason. Test program returned directly from the
subprog call without closing the RCU section, so the error was
triggered by the unclosed RCU lock on normal exit, not by
bpf_throw. Update __msg annotations for affected tests to match the
new "bpf_throw" error prefix.

The spin_lock case is not affected because they are already checked [1]
at the call site in do_check_insn() before bpf_throw can run.

[1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/kernel/bpf/verifier.c?h=v7.0-rc4#n21098

Assisted-by: Claude:claude-opus-4-6
Fixes: f18b03faba ("bpf: Implement BPF exceptions")
Signed-off-by: Ihor Solodrai <ihor.solodrai@linux.dev>
Acked-by: Yonghong Song <yonghong.song@linux.dev>
Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Link: https://lore.kernel.org/r/20260320000809.643798-1-ihor.solodrai@linux.dev
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2026-03-21 12:51:44 -07:00
zhidao su
818dbedd04 selftests/sched_ext: Return non-zero exit code on test failure
runner.c always returned 0 regardless of test results.  The kselftest
framework (tools/testing/selftests/kselftest/runner.sh) invokes the runner
binary and treats a non-zero exit code as a test failure; with the old
code, failed sched_ext tests were silently hidden from the parent harness
even though individual "not ok" TAP lines were emitted.

Return 1 when at least one test failed, 0 when all tests passed or were
skipped.

Signed-off-by: zhidao su <suzhidao@xiaomi.com>
Acked-by: Andrea Righi <arighi@nvidia.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2026-03-21 08:35:51 -10:00
Davide Caratti
5754a1c9f9 tc-testing: add a test case for ETS offload
While reviewing the fix for unintentional u32 overflows in ets offload
code, Jamal said:

 [...]

 > otherwise a tdc test should cover it fine (when you get to the
 > netdevsim change perhaps)

Extend tdc to allow setting hw-tc-offload via ethtool, and
add a test case to reproduce the division by zero fixed in [1].

[1] https://lore.kernel.org/all/CAM0EoMm17wsYZmdFLshH3_-GrZtzd=i0xnoO2yiVB=-N4761mw@mail.gmail.com/

Suggested-by: Jamal Hadi Salim <jhs@mojatatu.com>
Reviewed-by: Jamal Hadi Salim <jhs@mojatatu.com>
Co-developed-by: Victor Nogueira <victor@mojatatu.com>
Signed-off-by: Victor Nogueira <victor@mojatatu.com>
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Link: https://patch.msgid.link/39129c374cbd00147b8c5afc04db59db62b50acc.1773945414.git.dcaratti@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-03-20 20:13:14 -07:00
Bobby Eshleman
0c3893d378 selftests/vsock: fix vsock_test path shadowing in nested VMs
The /root mount introduced for nested VM support shadows any host paths
under /root. This breaks systems where the outer VM runs as root and the
vsock_test binary path is something like:

/root/linux/tools/testing/selftests/vsock/vsock_test

Fix this by copying vsock_test into the temporary home directory that
gets mounted as /root in the guest, and using a relative path to invoke
it.

Signed-off-by: Bobby Eshleman <bobbyeshleman@meta.com>
Link: https://patch.msgid.link/20260317-vsock-vmtest-nested-fixes-v2-2-0b3f53b80a0f@meta.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-03-20 18:34:48 -07:00
Bobby Eshleman
865926e26e selftests/vsock: fix vmtest.sh for read-only nested VM runners
When running vmtest.sh inside a nested VM, there occurs a problem
with stacking two sets of virtiofs/overlay layers (the first set from
the outer VM and the second set from the inner VM). The virtme init
scripts (sshd, udhcpd, etc...) fail to execute basic programs (e.g.,
/bin/cat) and load library dependencies (e.g., libpam) due to ESTALE.
This only occurs when both layers (outer and inner) use virtiofs. Work
around this by using 9p in the inner VM via --force-9p.

Additionally, when the outer VM is read-only, the inner VM's attempt at
populating SSH keys to the root filesystem fails:

virtme-ng-init: mkdir: cannot create directory '/root/.cache': Read-only file system

Work around this by creating a temporary home directory with generated
SSH keys and passing it through to the guest as /root via --rwdir.
Disable strict host key checking in vm_ssh() since the VM will be seen
as a new host each run.

The --rw arg had to be removed to prevent a vng complaint about overlay
(in combination with the other parameters). The guest doesn't really
need write access anyway, so this was probably overly permissive to
begin with.

Signed-off-by: Bobby Eshleman <bobbyeshleman@meta.com>
Link: https://patch.msgid.link/20260317-vsock-vmtest-nested-fixes-v2-1-0b3f53b80a0f@meta.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-03-20 18:34:47 -07:00
Yi Lai
c82cfe1591 vfio: selftests: Support DMR and GNR-D DSA devices
Currently, the VFIO DSA driver test only supports the SPR DSA device ID.
Attempting to run the test on newer platforms like DMR or GNR-D results
in a "No driver found" error, causing the test to be skipped.

Refactor dsa_probe() to use a switch statement for checking device IDs.
This improves maintainability and makes it easier to add new device IDs
in the future.

Add the following DSA device IDs to the supported list:
PCI_DEVICE_ID_INTEL_DSA_DMR  (0x1212)
PCI_DEVICE_ID_INTEL_DSA_GNRD (0x11fb)

Signed-off-by: Yi Lai <yi1.lai@intel.com>
Reviewed-by: David Matlack <dmatlack@google.com>
Link: https://lore.kernel.org/r/20260320010930.481380-1-yi1.lai@intel.com
Signed-off-by: Alex Williamson <alex@shazbot.org>
2026-03-20 13:54:51 -06:00
Ted Logan
1347a742a1 vfio: selftests: Build tests on aarch64
Fix vfio selftests on aarch64, allowing native builds on aarch64 hosts.

Reported-by: Matt Evans <mattev@meta.com>
Closes: https://lore.kernel.org/all/e51b4ff2-13c4-47d4-b781-3dcbd740d274@meta.com/
Fixes: a55d4bbbe6 ("vfio: selftests: only build tests on arm64 and x86_64")
Signed-off-by: Ted Logan <tedlogan@fb.com>
Reviewed-by: David Matlack <dmatlack@google.com>
Link: https://lore.kernel.org/r/20260319-vfio-selftests-aarch64-v2-1-bb2621c24dc4@fb.com
Signed-off-by: Alex Williamson <alex@shazbot.org>
2026-03-20 13:52:16 -06:00
Thomas Weißschuh
5662ec000d selftests/nolibc: validate NOLIBC_IGNORE_ERRNO compilation
When NOLIBC_IGNORE_ERRNO is set, various bits of nolibc are disabled.

Make sure that all the ifdeffery does not result in any compilation
errors by compiling a dummy source file.

Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
Acked-by: Willy Tarreau <w@1wt.eu>
Link: https://patch.msgid.link/20260311-nolibc-err-h-v1-2-735a9de7f15d@weissschuh.net
2026-03-20 17:58:28 +01:00
Thomas Weißschuh
e290017632 selftests/nolibc: add a variable for nolibc-test source files
The list of the nolibc-test source files is repeated many times.
Another source file is about to be added, adding to the mess.

Introduce a common variable instead.

Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
Acked-by: Willy Tarreau <w@1wt.eu>
Link: https://patch.msgid.link/20260311-nolibc-err-h-v1-1-735a9de7f15d@weissschuh.net
2026-03-20 17:58:27 +01:00
David Laight
9bc019e7ba selftests/nolibc: Use printf variable field widths and precisions
Now that printf supports '*' for field widths and precisions
then can be used to simplify the test output.
 - aligning the "[OK]" strings.
 - reporting the expected sprintf() output when there is a mismatch.

Signed-off-by: David Laight <david.laight.linux@gmail.com>
Acked-by: Willy Tarreau <w@1wt.eu>
Link: https://patch.msgid.link/20260308113742.12649-18-david.laight.linux@gmail.com
Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
2026-03-20 17:58:26 +01:00
David Laight
248c7cf60c tools/nolibc/printf: Add support for octal output
Octal output isn't often used, but adding it costs very little.

Supporting "%#o" is mildly annoying, it has to add a leading '0' if
there isn't one present. In simple cases this is the same as adding a sign
of '0' - but that adds an extra '0' in a few places.
So you need 3 tests, %o, # and no leading '0' (which can only be checked
after the zero pad for precision).
If all the test are deferred until after zero padding then too many values
are 'live' across the call to _nolibc_u64toa_base() and get spilled to stack.
Hence the check that ignores the 'sign' if it is the same as the first
character of the output string.

Add tests for octal output.

Signed-off-by: David Laight <david.laight.linux@gmail.com>
Acked-by: Willy Tarreau <w@1wt.eu>
Link: https://patch.msgid.link/20260308113742.12649-17-david.laight.linux@gmail.com
[Thomas: avoid a -Wsign-compare]
Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
2026-03-20 17:58:24 +01:00
David Laight
d3d3f64f8e tools/nolibc/printf: Add support for zero padding and field precision
Includes support for variable field widths (eg "%*.*d").

Zero padding is limited to 31 zero characters.
This is wider than the largest numeric field so shouldn't be a problem.

All the standard printf formats are now supported except octal
and floating point.

Add tests for new features

Signed-off-by: David Laight <david.laight.linux@gmail.com>
Acked-by: Willy Tarreau <w@1wt.eu>
Link: https://patch.msgid.link/20260308113742.12649-16-david.laight.linux@gmail.com
[Thomas: fixup testcases for musl libc]
Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
2026-03-20 17:58:10 +01:00
David Laight
b5f3f59cf4 tools/nolibc/printf: Add support for left aligning fields
Output the characters before or after the pad - writing the pad takes more code.

Include additional/changed tests

Signed-off-by: David Laight <david.laight.linux@gmail.com>
Acked-by: Willy Tarreau <w@1wt.eu>
Link: https://patch.msgid.link/20260308113742.12649-15-david.laight.linux@gmail.com
Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
2026-03-20 17:57:21 +01:00
David Laight
5eae5f1a01 tools/nolibc/printf: Special case 0 and add support for %#x
The output for %#x is almost the same as that for %p, both output in
hexadecimal with a leading "0x".
However for zero %#x should just output "0" (the same as decimal and ocal).
For %p match glibc and output "(nil)" rather than "0x0" or "0".

Add tests for "%#x", "% d", "%+d" and passing NULL to "%p".

Signed-off-by: David Laight <david.laight.linux@gmail.com>
Acked-by: Willy Tarreau <w@1wt.eu>
Link: https://patch.msgid.link/20260308113742.12649-14-david.laight.linux@gmail.com
[Thomas: fix up testcases for musl libc]
Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
2026-03-20 17:57:08 +01:00
David Laight
1256328719 tools/nolibc/printf: Add support for length modifiers tzqL and formats iX
Length modifiers t (ptrdiff_t) and z (size_t) are aliases for l (long),
q and L are 64bit the same as j (intmax).
Format i is an alias for d and X similar to x but upper case.
Supporting them is mostly just adding the relevant bit to the bit
pattern used for matching characters.
Although %X is detected the output will be lower case.

Change/add tests to use conversions i and X, and length modifiers L and ll.
Use the correct minimum value for "%Li".

Signed-off-by: David Laight <david.laight.linux@gmail.com>
Acked-by: Willy Tarreau <w@1wt.eu>
Link: https://patch.msgid.link/20260308113742.12649-10-david.laight.linux@gmail.com
[Thomas: Fix up testcases for musl libc]
Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
2026-03-20 17:55:28 +01:00
David Laight
2177dd375d tools/nolibc: Implement strerror() in terms of strerror_r()
strerror() can be the only part of a program that has a .data section.
This requires 4k in the program file.

Add a simple implementation of strerror_r() and use that in strerror()
so that the "errno=" string is copied at run-time.
Use __builtin_memcpy() because that optimises away the input string
and just writes the required constants to the target buffer.

Code size change largely depends on whether the inlining decision for
strerror() changes.

Change the tests to use the normal EXPECT_VFPRINTF() when testing %m.
Skip the tests when !is_nolibc.

Signed-off-by: David Laight <david.laight.linux@gmail.com>
Acked-by: Willy Tarreau <w@1wt.eu>
Link: https://patch.msgid.link/20260308113742.12649-4-david.laight.linux@gmail.com
Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
2026-03-20 17:46:07 +01:00
David Laight
ab7cd329a8 selftests/nolibc: Rename w to written in expect_vfprintf()
Single character variable names don't make code easy to read.
Rename 'w' (used for the return value from snprintf()) 'written'.

Signed-off-by: David Laight <david.laight.linux@gmail.com>
Acked-by: Willy Tarreau <w@1wt.eu>
Link: https://patch.msgid.link/20260308113742.12649-3-david.laight.linux@gmail.com
Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
2026-03-20 17:46:07 +01:00
David Laight
cf34708824 selftests/nolibc: Let EXPECT_VFPRINTF() tests be skipped
Tests that check explicit nolibc behavior (eg "%m") or test places
where the nolibc behaviour deviates from the libc need skipping
when compiled to use the host libc.

Signed-off-by: David Laight <david.laight.linux@gmail.com>
Acked-by: Willy Tarreau <w@1wt.eu>
Link: https://patch.msgid.link/20260302101815.3043-8-david.laight.linux@gmail.com
Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
2026-03-20 17:46:05 +01:00
David Laight
4ea2dedd50 selftests/nolibc: Check that snprintf() doesn't write beyond the buffer end
Fill buf[] with known data and check the vsnprintf() doesn't write
beyond the specified buffer length.

Would have picked up the bug in field padding.

Signed-off-by: David Laight <david.laight.linux@gmail.com>
Acked-by: Willy Tarreau <w@1wt.eu>
Link: https://patch.msgid.link/20260302101815.3043-7-david.laight.linux@gmail.com
Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
2026-03-20 17:46:04 +01:00
David Laight
f36e1ec61a selftests/nolibc: Use length of 'expected' string to check snprintf() output
Instead of requiring the test cases specifying both the length and
expected output, take the length from the expected output.
Tests that expect the output be truncated are changed to specify
the un-truncated output.

Change the strncmp() to a memcmp() with an extra check that the
output is actually terminated.

Append a '+' to the printed output (after the final ") when the output
is truncated.

Signed-off-by: David Laight <david.laight.linux@gmail.com>
Acked-by: Willy Tarreau <w@1wt.eu>
Link: https://patch.msgid.link/20260302101815.3043-6-david.laight.linux@gmail.com
Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
2026-03-20 17:46:03 +01:00
David Laight
9aa8a4afd4 selftests/nolibc: check vsnprintf() output buffer before the length
Check the string matches before checking the returned length.
Only print the string once when it matches.

Makes it a lot easier to diagnose any incorrect output.

Signed-off-by: David Laight <david.laight.linux@gmail.com>
Acked-by: Willy Tarreau <w@1wt.eu>
Link: https://patch.msgid.link/20260302101815.3043-5-david.laight.linux@gmail.com
Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
2026-03-20 17:46:02 +01:00
David Laight
b42f02da2b selftests/nolibc: Return correct value when printf test fails
Correctly return 1 (the number of errors) when strcmp()
fails rather than the return value from strncmp() which is the
signed difference between the mismatching characters.

Signed-off-by: David Laight <david.laight.linux@gmail.com>
Acked-by: Willy Tarreau <w@1wt.eu>
Link: https://patch.msgid.link/20260302101815.3043-4-david.laight.linux@gmail.com
Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
2026-03-20 17:46:01 +01:00
David Laight
27532c645e selftests/nolibc: Fix build with host headers and libc
Many systems don't have strlcpy() or strlcat() and readdir_r() is
deprecated. This makes the tests fail to build with the host headers.
Disable the 'directories' test and define strlcpy(), strlcat() and
readdir_r() using #defines so that the code compiles.

Fixes: 6fe8360b16 ("selftests/nolibc: also test libc-test through regular selftest framework")
Signed-off-by: David Laight <david.laight.linux@gmail.com>
Link: https://patch.msgid.link/20260223101735.2922-4-david.laight.linux@gmail.com
Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
2026-03-20 17:45:45 +01:00
Thomas Weißschuh
8ba600aa57 selftests/nolibc: fix test_file_stream() on musl libc
fwrite() modifying errno is non-standard.

Only validate this behavior on those libc implementations which
implement it.

Fixes: a5f00be9b3 ("tools/nolibc: Add a simple test for writing to a FILE and reading it back")
Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
2026-03-20 17:44:06 +01:00
Pavel Tikhomirov
7c5219e1a6
selftests: Add tests for creating pidns init via setns
First testcase "pidns_init_via_setns" checks that a process can become
Pid 1 (init) in a new Pid namespace created via unshare() and joined via
setns().

Second testcase "pidns_init_via_setns_set_tid" checks that during this
process we can use clone3() + set_tid and set the pid in both the new
and old pid namespaces (owned by different user namespaces). This test
requires root to run to avoid complex setup for wrapper userns.

Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>

--
pidns_init_via_setns. Make pidns_init_via_setns_set_tid require root.

Link: https://patch.msgid.link/20260318122157.280595-5-ptikhomirov@virtuozzo.com
v6: Move wrapper userns creation for unprivileged case to the top of
Signed-off-by: Christian Brauner <brauner@kernel.org>
2026-03-20 14:44:26 +01:00
Jakub Kicinski
7dae8ffb09 selftests: drv-net: gro: add a test for GRO depth
Reuse the long sequence test to max out the GRO contexts.
Repeat for a single queue, 8 queues, and default number
of queues but flow steering to just one.

The SW GRO's capacity should be around 64 per queue
(8 buckets, up to 8 skbs in a chain).

Link: https://patch.msgid.link/20260318033819.1469350-7-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-03-19 16:57:29 -07:00
Jakub Kicinski
ff1cb3ad2a selftests: drv-net: gro: add test for packet ordering
Add a test to check if the NIC reorders packets if the hit GRO.

Link: https://patch.msgid.link/20260318033819.1469350-6-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-03-19 16:57:28 -07:00
Jakub Kicinski
ba5d4128fc selftests: drv-net: gro: test GRO stats
Test accuracy of GRO stats. We want to cover two potentially tricky
cases:
 - single segment GRO
 - packets which were eligible but didn't get GRO'd

The first case is trivial, teach gro.c to send one packet, and check
GRO stats didn't move.

Second case requires gro.c to send a lot of flows expecting the NIC
to run out of GRO flow capacity.

To avoid system traffic noise we steer the packets to a dedicated
queue and operate on qstat.

Link: https://patch.msgid.link/20260318033819.1469350-5-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-03-19 16:57:28 -07:00
Jakub Kicinski
f26d43acf1 selftests: drv-net: gro: use SO_TXTIME to schedule packets together
Longer packet sequence tests are quite flaky when the test is run
over a real network. Try to avoid at least the jitter on the sender
side by scheduling all the packets to be sent at once using SO_TXTIME.
Use hardcoded tx time of 5msec in the future. In my test increasing
this time past 2msec makes no difference so 5msec is plenty of margin.
Since we now expect more output buffering make sure to raise SNDBUF.

Note that this is an opportunistic reliability improvement which
will only work if the qdisc can schedule Tx time for us (fq).
Fiddling with qdisc config was deemed too complex, so it's not
part of the patch.

Reviewed-by: Willem de Bruijn <willemb@google.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Link: https://patch.msgid.link/20260318033819.1469350-4-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-03-19 16:57:28 -07:00
Jakub Kicinski
9b29afa116 selftests: drv-net: give HW stats sync time extra 25% of margin
There are transient failures for devices which update stats
periodically, especially if it's the FW DMA'ing the stats
rather than host periodic work querying the FW. Wait 25%
longer than strictly necessary.

For devices which don't report stats-block-usecs we retain
25 msec as the default wait time (0.025sec == 20,000usec * 1.25).

Reviewed-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Link: https://patch.msgid.link/20260318033819.1469350-3-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-03-19 16:57:28 -07:00
Jakub Kicinski
8888bf4fb9 selftests: net: move gro to lib for HW vs SW reuse
The gro.c packet sender is used for SW testing but bulk of incoming
new tests will be HW-specific. So it's better to put them under
drivers/net/hw/, to avoid tip-toeing around netdevsim. Move gro.c
to lib so we can reuse it.

Reviewed-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Link: https://patch.msgid.link/20260318033819.1469350-2-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-03-19 16:57:28 -07:00
Jakub Kicinski
edab1ca5ec Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Cross-merge networking fixes after downstream PR (net-7.0-rc5).

net/netfilter/nft_set_rbtree.c
  598adea720 ("netfilter: revert nft_set_rbtree: validate open interval overlap")
  3aea466a43 ("netfilter: nft_set_rbtree: don't disable bh when acquiring tree lock")
https://lore.kernel.org/abgaQBpeGstdN4oq@sirena.org.uk

No adjacent changes.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-03-19 14:16:00 -07:00
Mickaël Salaün
a54142d9ff
selftests/landlock: Test tsync interruption and cancellation paths
Add tsync_interrupt test to exercise the signal interruption path in
landlock_restrict_sibling_threads().  When a signal interrupts
wait_for_completion_interruptible() while the calling thread waits for
sibling threads to finish credential preparation, the kernel:

1. Sets ERESTARTNOINTR to request a transparent syscall restart.
2. Calls cancel_tsync_works() to opportunistically dequeue task works
   that have not started running yet.
3. Breaks out of the preparation loop, then unblocks remaining
   task works via complete_all() and waits for them to finish.
4. Returns the error, causing abort_creds() in the syscall handler.

Specifically, cancel_tsync_works() in its entirety, the ERESTARTNOINTR
error branch in landlock_restrict_sibling_threads(), and the
abort_creds() error branch in the landlock_restrict_self() syscall
handler are timing-dependent and not exercised by the existing tsync
tests, making code coverage measurements non-deterministic.

The test spawns a signaler thread that rapidly sends SIGUSR1 to the
calling thread while it performs landlock_restrict_self() with
LANDLOCK_RESTRICT_SELF_TSYNC.  Since ERESTARTNOINTR causes a
transparent restart, userspace always sees the syscall succeed.

This is a best-effort coverage test: the interruption path is exercised
when the signal lands during the preparation wait, which depends on
thread scheduling.  The test creates enough idle sibling threads (200)
to ensure multiple serialized waves of credential preparation even on
machines with many cores (e.g., 64), widening the window for the
signaler.  Deterministic coverage would require wrapping the wait call
with ALLOW_ERROR_INJECTION() and using CONFIG_FAIL_FUNCTION.

Test coverage for security/landlock was 90.2% of 2105 lines according to
LLVM 21, and it is now 91.1% of 2105 lines with this new test.

Cc: Günther Noack <gnoack@google.com>
Cc: Justin Suess <utilityemal77@gmail.com>
Cc: Tingmao Wang <m@maowtm.org>
Cc: Yihan Ding <dingyihan@uniontech.com>
Link: https://lore.kernel.org/r/20260310190416.1913908-1-mic@digikod.net
Signed-off-by: Mickaël Salaün <mic@digikod.net>
2026-03-19 20:57:39 +01:00
Manish Honap
4f42d71670 vfio: selftests: Fix VLA initialisation in vfio_pci_irq_set()
C does not permit an initialiser expression on a variable-length array
(C99 Section 6.7.9 constraint: "The type of the entity to be initialized
shall not be a variable length array type").

vfio_pci_irq_set() declared:

      u8 buf[sizeof(struct vfio_irq_set) + sizeof(int) * count] = {};

where `count` is a runtime function parameter, making `buf` a VLA.

GCC rejects this with (tried with GCC-9.4.0):

      error: variable-sized object may not be initialized

Fix by removing the `= {}` initialiser and inserting an explicit
memset() immediately after the declaration.  memset() on a VLA is
perfectly legal and achieves the same zero-initialisation on all
conforming C implementations.

Fixes: 19faf6fd96 ("vfio: selftests: Add a helper library for VFIO selftests")
Cc: stable@vger.kernel.org
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: David Matlack <dmatlack@google.com>
Signed-off-by: Manish Honap <mhonap@nvidia.com>
Link: https://lore.kernel.org/r/20260317051402.3725670-1-mhonap@nvidia.com
Signed-off-by: Alex Williamson <alex@shazbot.org>
2026-03-19 12:32:08 -06:00
Sascha Bischoff
ce29261ec6 KVM: arm64: selftests: Add no-vgic-v5 selftest
Now that GICv5 is supported, it is important to check that all of the
GICv5 register state is hidden from a guest that doesn't create a
vGICv5.

Rename the no-vgic-v3 selftest to no-vgic, and extend it to check
GICv5 system registers too.

Signed-off-by: Sascha Bischoff <sascha.bischoff@arm.com>
Link: https://patch.msgid.link/20260319154937.3619520-42-sascha.bischoff@arm.com
Signed-off-by: Marc Zyngier <maz@kernel.org>
2026-03-19 18:21:29 +00:00
Sascha Bischoff
0a9f38bf61 KVM: arm64: selftests: Introduce a minimal GICv5 PPI selftest
This basic selftest creates a vgic_v5 device (if supported), and tests
that one of the PPI interrupts works as expected with a basic
single-vCPU guest.

Upon starting, the guest enables interrupts. That means that it is
initialising all PPIs to have reasonable priorities, but marking them
as disabled. Then the priority mask in the ICC_PCR_EL1 is set, and
interrupts are enable in ICC_CR0_EL1. At this stage the guest is able
to receive interrupts. The architected SW_PPI (64) is enabled and
KVM_IRQ_LINE ioctl is used to inject the state into the guest.

The guest's interrupt handler has an explicit WFI in order to ensure
that the guest skips WFI when there are pending and enabled PPI
interrupts.

Signed-off-by: Sascha Bischoff <sascha.bischoff@arm.com>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Link: https://patch.msgid.link/20260319154937.3619520-41-sascha.bischoff@arm.com
Signed-off-by: Marc Zyngier <maz@kernel.org>
2026-03-19 18:21:29 +00:00
Eric Biggers
8d54748223 kunit: configs: Enable all crypto library tests in all_tests.config
The new option CONFIG_CRYPTO_LIB_ENABLE_ALL_FOR_KUNIT enables all the
crypto library code that has KUnit tests, causing CONFIG_KUNIT_ALL_TESTS
to enable all these tests.  Add this option to all_tests.config so that
kunit.py will run them when passed the --alltests option.

Link: https://lore.kernel.org/r/20260314035927.51351-3-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2026-03-19 10:10:30 -07:00
Paolo Abeni
0c45064487 Included features:
* use bitops.h API when possible
 * send netlink notification in case of client float event
 * implement support for asymmetric peer IDs
 * consolidate memory allocations during crypto operations
 * add netlink notification check in selftests
 * add FW mark check in selftest
 -----BEGIN PGP SIGNATURE-----
 
 iJEEABYIADkWIQQKU153ubb5unbkl6Gx/ZpNW1HNdwUCabkqjRsUgAAAAAAEAA5t
 YW51MiwyLjUrMS4xMSwyLDIACgkQsf2aTVtRzXdlaAEA4fQA41/tbgsciMSf7aqT
 lEAbZF/6DnsFZiTmuUfqPvQA/3+R0uiJlUTB3NGhhXXXikP4Yj61lWMDjw//lvYJ
 74IG
 =elM6
 -----END PGP SIGNATURE-----

Merge tag 'ovpn-net-next-20260317' of https://github.com/OpenVPN/ovpn-net-next

Antonio Quartulli says:

====================
Included features:
* use bitops.h API when possible
* send netlink notification in case of client float event
* implement support for asymmetric peer IDs
* consolidate memory allocations during crypto operations
* add netlink notification check in selftests
* add FW mark check in selftest

* tag 'ovpn-net-next-20260317' of https://github.com/OpenVPN/ovpn-net-next:
  ovpn: consolidate crypto allocations in one chunk
  selftests: ovpn: add test for the FW mark feature
  selftests: ovpn: check asymmetric peer-id
  ovpn: add support for asymmetric peer IDs
  selftests: ovpn: add notification parsing and matching
  ovpn: notify userspace on client float event
  ovpn: pktid: use bitops.h API
  ovpn: use correct array size to parse nested attributes in ovpn_nl_key_swap_doit
  selftests: ovpn: allow compiling ovpn-cli.c with mbedtls3
====================

Link: https://patch.msgid.link/20260317104023.192548-1-antonio@openvpn.net
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2026-03-19 12:50:42 +01:00
Simon Baatz
96a584db75 selftests/net: packetdrill: improve tcp_rcv_neg_window.pkt
The test depends on accepting a packet that is larger than the
advertised window and that does not trigger an immediate ACK.

Previously, the test might still pass even if kernel behavior changed
unexpectedly. Add assertions verifying that the large packet was
accepted and no ACK was sent.

Suggested-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Simon Baatz <gmbnomis@gmail.com>
Link: https://patch.msgid.link/20260316-improve_tcp_neg_usable_wnd_test-v1-1-f16d5e365107@gmail.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2026-03-19 10:26:25 +01:00
Bobby Eshleman
3883c2b509 selftests/vsock: auto-detect kernel for guest VMs
When running vmtest.sh inside a nested VM the running kernel may not be
installed on the filesystem at the standard /boot/ or /usr/lib/modules/
paths.

Previously, this would cause vng to fail with "does not exist" since it
could not find the kernel image. Instead, this patch uses --dry-run to
detect if the kernel is available. If not, then we fall back to the
kernel in the kernel source tree. If that fails, then we die.

This way runners, like NIPA, can use vng --run arch/x86/boot/bzImage to
setup an outer VM, and vmtest.sh will still do the right thing setting
up the inner VM.

Due to job control issues in vng, a workaround is used to prevent 'make
kselftest TARGETS=vsock' from hanging until test timeout. A PR has been
placed upstream to solve the issue in vng:

https://github.com/arighi/virtme-ng/pull/453

Signed-off-by: Bobby Eshleman <bobbyeshleman@meta.com>
Link: https://patch.msgid.link/20260316-vsock-vmtest-autodetect-kernel-v2-1-5eec7b4831f8@meta.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-03-18 19:28:23 -07:00
Sun Jian
4a4fedb8a5 selftests/bpf: Isolate fmod_ret hooks by pid
modify_return's fmod_ret programs can override bpf_modify_return_test()'s
return value, which conflicts with get_func_ip_test when selftests run in
parallel.

Store current tgid in BSS and make modify_return hooks act only for that
tgid. For other tasks, fentry/fexit become no-ops and fmod_ret returns the
original ret.

Drop the serial-only restriction and remove the TODO comment.

Tested:
  sudo ./test_progs -t modify_return
  sudo ./test_progs -t get_func_ip_test
  sudo ./test_progs -j$(nproc) -t modify_return
  sudo ./test_progs -j$(nproc) -t get_func_ip_test

Signed-off-by: Sun Jian <sun.jian.kdev@gmail.com>
Link: https://lore.kernel.org/r/20260313034540.255805-1-sun.jian.kdev@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2026-03-18 17:16:06 -07:00
Sun Jian
888329ba6c selftests/bpf: Avoid spurious failures perf_link
perf_link creates a system-wide perf event pinned to CPU 0 (pid=-1, cpu=0)
and also pins the test thread to CPU 0. Under concurrent selftests this
can lead to cross-test interference and CPU 0 contention, making the test
flaky.

Create a per-task perf event instead (pid=0, cpu=-1) and drop CPU pinning
from burn_cpu(). Use barrier() to prevent the burn loop from being
optimized away. Drop the serial_ prefix so the test can run in parallel.
Also remove the stale TODO comment.

Tested:
  ./test_progs -t perf_link -vv
  ./test_progs -j$(nproc) -t perf_link -vv
  for i in $(seq 1 50); do ./test_progs -j$(nproc) -t perf_link; done

Signed-off-by: Sun Jian <sun.jian.kdev@gmail.com>
Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Link: https://lore.kernel.org/r/20260305084306.283983-1-sun.jian.kdev@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2026-03-18 17:08:54 -07:00
Jakub Kicinski
17a55ddb19 tools: ynl: rework policy access to support recursion
Donald points out that the current naive implementation using dicts
breaks if policy is recursive (child nest uses policy idx already
used by its parent).

Lean more into the NlPolicy class. This lets us "render" the policy
on demand, when user accesses it. If someone wants to do an infinite
walk that's on them :) Show policy info as attributes of the class
and use dict format to descend into sub-policies for extra neatness.

Reviewed-by: Donald Hunter <donald.hunter@gmail.com>
Link: https://patch.msgid.link/20260313232047.2068518-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-03-18 16:41:42 -07:00
Ricardo B. Marlière
81fca70874 ktest: Store failure logs also in fatal paths
STORE_FAILURES was only saved from fail(), so paths that reached dodie()
could exit without preserving failure logs.

That includes fatal hook paths such as:

  POST_BUILD_DIE = 1

and ordinary failures when:

  DIE_ON_FAILURE = 1

Call save_logs("fail", ...) from dodie() too so fatal failures keep the
same STORE_FAILURES artifacts as non-fatal fail() paths.

Cc: John Hawley <warthog9@eaglescrag.net>
Link: https://patch.msgid.link/20260318-ktest-fixes-v1-1-9dd94d46d84c@suse.com
Signed-off-by: Ricardo B. Marlière <rbm@suse.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2026-03-18 16:00:54 -04:00
Linus Torvalds
f0caa1d49c hid-for-linus-2026031701
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEL65usyKPHcrRDEicpmLzj2vtYEkFAmm5t3QACgkQpmLzj2vt
 YEk90hAArhy85Gy5uTQwdj5TVnGIwVH5+twiOz4S0If396duQGfK3MY+c+jzW2kO
 7UI2uWV4tgXJGSoK1kTuQ3IXGgUDGofbaQ15nltpibQ4c+3whcDmMRICGIA/OqCk
 FmHHAV264Y5y6bapG7VvT8PLv0N+TtGV4G4LhQv90eObnRXXnc0m+H0s2IObr2py
 aBJgmgFQQ3DMWddX5DMatMd6M6e+2kJHY3X/41youvbvVJoWtqrJQEKIB19HDR8o
 9GkEjn4GLMPZ6hPlTJCnkRn7zfRWQ3MvUMft1kCdVtpOlqvftHoBrXvP6X1YDtPR
 Hy07HkH1Jpq0zI6AYyKj7f36oasnnNCm4ZFijn2RBx2chmKEUrpz2fSJ4aS5YU81
 QqhOjVR+euYL7kQ1UtoFGNwOhBHKWcJr5AezxxUNwn4SJ1bl8TGB63OEt2/1GI8/
 L1PMMAgHnxUAlJui38PfeiXboUeS9bfPiJd20FnGzCghsdvk6a+W9oWz+2yhs+Fy
 csm1MvcxhZZ7ugXPmscE/U6iLueaqlj42dQ+wkm6sh8aYKS+9eIlIgNpu4Q9Z//e
 LZGPOjx+jDWiqqTBmXke7hGMXHNXHRbWDLWlE+Du4XS5sfKwTDRKXs+g3MWI/nVf
 gHvuHuKQvrQHiAcnWH0fNJbsHaGExpWWAbb+yoWCJHJAU3vj7ek=
 =1U5m
 -----END PGP SIGNATURE-----

Merge tag 'hid-for-linus-2026031701' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid

Pull HID fixes from Jiri Kosina:

 - various fixes dealing with (intentionally) broken devices in HID
   core, logitech-hidpp and multitouch drivers (Lee Jones)

 - fix for OOB in wacom driver (Benoît Sevens)

 - fix for potentialy HID-bpf-induced buffer overflow in () (Benjamin
   Tissoires)

 - various other small fixes and device ID / quirk additions

* tag 'hid-for-linus-2026031701' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid:
  HID: multitouch: Check to ensure report responses match the request
  HID: logitech-hidpp: Prevent use-after-free on force feedback initialisation failure
  HID: bpf: prevent buffer overflow in hid_hw_request
  selftests/hid: fix compilation when bpf_wq and hid_device are not exported
  HID: core: Mitigate potential OOB by removing bogus memset()
  HID: intel-thc-hid: Set HID_PHYS with PCI BDF
  HID: appletb-kbd: add .resume method in PM
  HID: logitech-hidpp: Enable MX Master 4 over bluetooth
  HID: input: Add HID_BATTERY_QUIRK_DYNAMIC for Elan touchscreens
  HID: input: Drop Asus UX550* touchscreen ignore battery quirks
  HID: asus: add xg mobile 2022 external hardware support
  HID: wacom: fix out-of-bounds read in wacom_intuos_bt_irq
2026-03-17 13:55:51 -07:00
Danilo Krummrich
de25dc008e Register abstraction and I/O infrastructure improvements
Introduce the register!() macro to define type-safe I/O register
 accesses. Refactor the IoCapable trait into a functional trait, which
 simplifies I/O backends and removes the need for overloaded Io methods.
 
 This is a stable tag for other trees to merge.
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQS2q/xV6QjXAdC7k+1FlHeO1qrKLgUCabmm4QAKCRBFlHeO1qrK
 LgW0AP91SGioag/Bcrq6htEiIDb3qECrLnJEoU4V011VnHPWGAEA2xmSZ5w1MREp
 plpKOF2KeNDZ+o9zjXG4okzqzaBf2wg=
 =Jrfw
 -----END PGP SIGNATURE-----

Merge tag 'rust_io-7.1-rc1' into driver-core-next

Register abstraction and I/O infrastructure improvements

Introduce the register!() macro to define type-safe I/O register
accesses. Refactor the IoCapable trait into a functional trait, which
simplifies I/O backends and removes the need for overloaded Io methods.

This is a stable tag for other trees to merge.

Signed-off-by: Danilo Krummrich <dakr@kernel.org>
2026-03-17 20:12:36 +01:00
Cheng-Yang Chou
f6689792ff selftests/sched_ext: Show failed test names in summary
When tests fail, the runner only printed the failure count, making
it hard to tell which tests failed without scrolling through output.

Track failed test names in an array and print them after the summary
so failures are immediately visible at the end of the run.

Signed-off-by: Cheng-Yang Chou <yphbchou0911@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2026-03-17 07:46:57 -10:00
Eric Biggers
44ff3791d6 kunit: configs: Enable all CRC tests in all_tests.config
The new option CONFIG_CRC_ENABLE_ALL_FOR_KUNIT enables all the CRC code
that has KUnit tests, causing CONFIG_KUNIT_ALL_TESTS to enable all these
tests.  Add this option to all_tests.config so that kunit.py will run
them when passed the --alltests option.

Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20260314172224.15152-1-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2026-03-17 09:29:10 -07:00
Christian Brauner
c465f5591a selftests/mount_setattr: increase tmpfs size for idmapped mount tests
The mount_setattr_idmapped fixture mounts a 2 MB tmpfs at /mnt and then
creates a 2 GB sparse ext4 image at /mnt/C/ext4.img. While ftruncate()
succeeds (sparse file), mkfs.ext4 needs to write actual metadata blocks
(inode tables, journal, bitmaps) which easily exceeds the 2 MB tmpfs
limit, causing ENOSPC and failing the fixture setup for all
mount_setattr_idmapped tests.

This was introduced by commit d37d4720c3 ("selftests/mount_settattr:
ensure that ext4 filesystem can be created") which increased the image
size from 2 MB to 2 GB but didn't adjust the tmpfs size.

Bump the tmpfs size to 256 MB which is sufficient for the ext4 metadata.

Fixes: d37d4720c3 ("selftests/mount_settattr: ensure that ext4 filesystem can be created")
Signed-off-by: Christian Brauner <brauner@kernel.org>
2026-03-17 16:59:45 +01:00
Fernando Fernandez Mancera
fdd973148a selftests: net: add ipv6 RA route to ECMP merge test
As commit bbf4a17ad9 ("ipv6: Fix ECMP sibling count mismatch when
clearing RTF_ADDRCONF") pointed out, RA routes are not elegible for ECMP
merging.

Add a test scenario mixing RA and static routes with gateway to check
that they are not getting merged.

Signed-off-by: Fernando Fernandez Mancera <fmancera@suse.de>
Link: https://patch.msgid.link/20260313124827.3945-1-fmancera@suse.de
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2026-03-17 12:53:30 +01:00
Ralf Lici
7b80d8a335 selftests: ovpn: add test for the FW mark feature
Add a selftest to verify that the FW mark socket option is correctly
supported and its value propagated by ovpn.

The test adds and removes nftables DROP rules based on the mark value,
and checks that the rule counter aligns with the number of lost ping
packets.

Cc: Shuah Khan <shuah@kernel.org>
Cc: linux-kselftest@vger.kernel.org
Cc: horms@kernel.org
Signed-off-by: Ralf Lici <ralf@mandelbit.com>
Signed-off-by: Antonio Quartulli <antonio@openvpn.net>
2026-03-17 11:09:20 +01:00
Ralf Lici
367f4b163a selftests: ovpn: check asymmetric peer-id
Extend the base test to verify that the correct peer-id is set in data
packet headers. This is done by capturing ping packets with tcpdump during
the initial exchange and matching the first portion of the header
against the expected sequence for every connection.

Cc: Shuah Khan <shuah@kernel.org>
Cc: linux-kselftest@vger.kernel.org
Cc: horms@kernel.org
Signed-off-by: Ralf Lici <ralf@mandelbit.com>
Signed-off-by: Antonio Quartulli <antonio@openvpn.net>
2026-03-17 11:09:05 +01:00
Ralf Lici
77de28cd7c selftests: ovpn: add notification parsing and matching
To verify that netlink notifications are correctly emitted and contain
the expected fields, this commit uses the tools/net/ynl/pyynl/cli.py
script to create multicast listeners. These listeners record the
captured notifications to a JSON file, which is later compared to the
expected output.

Cc: linux-kselftest@vger.kernel.org
Cc: shuah@kernel.org
Cc: horms@kernel.org
Signed-off-by: Ralf Lici <ralf@mandelbit.com>
Signed-off-by: Antonio Quartulli <antonio@openvpn.net>
2026-03-17 11:08:55 +01:00
Ralf Lici
c841b676da ovpn: notify userspace on client float event
Send a netlink notification when a client updates its remote UDP
endpoint. The notification includes the new IP address, port, and scope
ID (for IPv6).

Cc: linux-kselftest@vger.kernel.org
Cc: horms@kernel.org
Cc: shuah@kernel.org
Cc: donald.hunter@gmail.com
Signed-off-by: Ralf Lici <ralf@mandelbit.com>
Signed-off-by: Antonio Quartulli <antonio@openvpn.net>
Reviewed-by: Sabrina Dubroca <sd@queasysnail.net>
2026-03-17 11:08:55 +01:00
Antonio Quartulli
a8e136b496 selftests: ovpn: allow compiling ovpn-cli.c with mbedtls3
mbedtls 3 installs headers and calls the shared object
differently than version 2, therefore we must now rely
on pkgconfig to fill the right C/LDFLAGS.

Moreover the mbedtls3 library expects any base64 file to
have their content on one line.
Since this change does no break older versions,
let's change the sample key file format and make mbedtls3
happy.

Cc: Shuah Khan <shuah@kernel.org>
Cc: linux-kselftest@vger.kernel.org
Cc: horms@kernel.org
Signed-off-by: Antonio Quartulli <antonio@openvpn.net>
2026-03-17 11:08:54 +01:00
Jakub Kicinski
6a33a70626 selftests: net: py: give bpftrace more time to start
After commit under Fixes debug runners in the CI hit the following:

  # subprocess.TimeoutExpired: Command '['bpftrace', '-f', 'json', '-q', '-e', 'kprobe:netpoll_poll_dev { @hits = count(); } interval:s:10 { exit(); }']' timed out after 15 seconds
  # # Exception| net.lib.py.ksft.KsftFailEx: bpftrace failed to run!?: {}

in netpoll_basic.py >10% of the time. Let's give bpftool more time
to start, it can take a while on a debug kernel.

Fixes: 82562972b8 ("selftests: net: pass bpftrace timeout to cmd()")
Reviewed-by: Breno Leitao <leitao@debian.org>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Nimrod Oren <noren@nvidia.com>
Link: https://patch.msgid.link/20260315160038.3187730-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-03-16 19:26:37 -07:00
Alejandro Lucero
9a775c07bb cxl: support Type2 when initializing cxl_dev_state
In preparation for type2 drivers add function and macro for
differentiating CXL memory expanders (type 3) from CXL device
accelerators (type 2) helping drivers built from public headers
to embed struct cxl_dev_state inside a private struct.

Update type3 driver for using this same initialization.

Signed-off-by: Alejandro Lucero <alucerop@amd.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Alison Schofield <alison.schofield@intel.com>
Reviewed-by: Gregory Price <gourry@gourry.net>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Link: https://patch.msgid.link/20260306164741.3796372-2-alejandro.lucero-palau@amd.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2026-03-16 16:24:29 -07:00
Ihor Solodrai
7e2f40ef0a selftests/bpf: Bump path and command buffer sizes in bpftool_helpers.c
The path length of 64 is way too low in some envirnoments, which leads
to subtle failures due to truncation [1].

Replace BPFTOOL_PATH_MAX_LEN with PATH_MAX, and set
BPFTOOL_FULL_CMD_MAX_LEN to double of PATH_MAX.

[1] https://github.com/libbpf/libbpf/actions/runs/22980753016/job/66719800527

Signed-off-by: Ihor Solodrai <ihor.solodrai@linux.dev>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20260312234820.439720-1-ihor.solodrai@linux.dev
2026-03-16 14:14:57 -07:00
Mykyta Yatsenko
c73a244366 bpftool: Allow explicitly skip llvm, libbfd and libcrypto dependencies
Introduce SKIP_LLVM, SKIP_LIBBFD, and SKIP_CRYPTO build flags that let
users build bpftool without these optional dependencies.

SKIP_LLVM=1 skips LLVM even when detected. SKIP_LIBBFD=1 prevents the
libbfd JIT disassembly fallback when LLVM is absent. Together, they
produce a bpftool with no disassembly support.

SKIP_CRYPTO=1 excludes sign.c and removes the -lcrypto link dependency.
Inline stubs in main.h return errors with a clear message if signing
functions are called at runtime.

Use BPFTOOL_WITHOUT_CRYPTO (not HAVE_LIBCRYPTO_SUPPORT) as the C
define, following the BPFTOOL_WITHOUT_SKELETONS naming convention for
bpftool-internal build config, leaving HAVE_LIBCRYPTO_SUPPORT free for
proper feature detection in the future.

All three flags are propagated through the selftests Makefile to bpftool
sub-builds.

Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20260312-b4-bpftool_build-v2-1-4c9d57133644@meta.com
2026-03-16 14:14:14 -07:00
Emil Tsalapatis
01d5d2f7d9 selftests/bpf: Add deep call stack selftests
Add tests that demonstrate the verifier support for deep call stacks
while still enforcing maximum stack size limits.

Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Signed-off-by: Emil Tsalapatis <emil@etsalapatis.com>
Link: https://lore.kernel.org/r/20260316161225.128011-3-emil@etsalapatis.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2026-03-16 11:26:41 -07:00
Emil Tsalapatis
ad95d3c758 bpf: Only enforce 8 frame call stack limit for all-static stacks
The BPF verifier currently enforces a call stack depth of 8 frames,
regardless of the actual stack space consumption of those frames. The
limit is necessary for static call stacks, because the bookkeeping data
structures used by the verifier when stepping into static functions
during verification only support 8 stack frames. However, this
limitation only matters for static stack frames: Global subprogs are
verified by themselves and do not require limiting the call depth.

Relax this limitation to only apply to static stack frames. Verification
now only fails when there is a sequence of 8 calls to non-global
subprogs. Calling into a global subprog resets the counter. This allows
deeper call stacks, provided all frames still fit in the stack.

The change does not increase the maximum size of the call stack, only
the maximum number of frames we can place in it.

Also change the progs/test_global_func3.c selftest to use static
functions, since with the new patch it would otherwise unexpectedly
pass verification.

Acked-by: Mykyta Yatsenko <yatsenko@meta.com>
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Signed-off-by: Emil Tsalapatis <emil@etsalapatis.com>
Link: https://lore.kernel.org/r/20260316161225.128011-2-emil@etsalapatis.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2026-03-16 11:26:41 -07:00
Janosch Frank
0c6294d98a KVM: s390: selftests: Add IRQ routing address offset tests
This test tries to setup routes which have address + offset
combinations which cross a page.

Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com>
Tested-by: Matthew Rosato <mjrosato@linux.ibm.com>
Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@linux.ibm.com>
2026-03-16 16:56:39 +01:00
Benjamin Tissoires
5d4c6c132e selftests/hid: fix compilation when bpf_wq and hid_device are not exported
This can happen in situations when CONFIG_HID_SUPPORT is set to no, or
some complex situations where struct bpf_wq is not exported.

So do the usual dance of hiding them before including vmlinux.h, and
then redefining them and make use of CO-RE to have the correct offsets.

Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202603111558.KLCIxsZB-lkp@intel.com/
Fixes: fe8d561db3 ("selftests/hid: add wq test for hid_bpf_input_report()")
Cc: stable@vger.kernel.org
Acked-by: Jiri Kosina <jkosina@suse.com>
Reviewed-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Signed-off-by: Benjamin Tissoires <bentiss@kernel.org>
2026-03-16 16:21:06 +01:00
Linus Torvalds
11e8c7e947 ARM:
- Correctly handle deeactivation of interrupts that were activated from
   LRs.  Since EOIcount only denotes deactivation of interrupts that
   are not present in an LR, start EOIcount deactivation walk *after*
   the last irq that made it into an LR.
 
 - Avoid calling into the stubs to probe for ICH_VTR_EL2.TDS when
   pKVM is already enabled -- not only thhis isn't possible (pKVM
   will reject the call), but it is also useless: this can only
   happen for a CPU that has already booted once, and the capability
   will not change.
 
 - Fix a couple of low-severity bugs in our S2 fault handling path,
   affecting the recently introduced LS64 handling and the even more
   esoteric handling of hwpoison in a nested context
 
 - Address yet another syzkaller finding in the vgic initialisation,
   where we would end-up destroying an uninitialised vgic with nasty
   consequences
 
 - Address an annoying case of pKVM failing to boot when some of the
   memblock regions that the host is faulting in are not page-aligned
 
 - Inject some sanity in the NV stage-2 walker by checking the limits
   against the advertised PA size, and correctly report the resulting
   faults
 
 PPC:
 
 - Fix a PPC e500 build error due to a long-standing wart that was exposed by
   the recent conversion to kmalloc_obj(); rip out all the ugliness that
   led to the wart.
 
 RISC-V:
 
 - Prevent speculative out-of-bounds access using array_index_nospec()
   in APLIC interrupt handling, ONE_REG regiser access, AIA CSR access,
   float register access, and PMU counter access
 
 - Fix potential use-after-free issues in kvm_riscv_gstage_get_leaf(),
   kvm_riscv_aia_aplic_has_attr(), and kvm_riscv_aia_imsic_has_attr()
 
 - Fix potential null pointer dereference in kvm_riscv_vcpu_aia_rmw_topei()
 
 - Fix off-by-one array access in SBI PMU
 
 - Skip THP support check during dirty logging
 
 - Fix error code returned for Smstateen and Ssaia ONE_REG interface
 
 - Check host Ssaia extension when creating AIA irqchip
 
 x86:
 
 - Fix cases where CPUID mitigation features were incorrectly marked as
   available whenever the kernel used scattered feature words for them.
 
 - Validate _all_ GVAs, rather than just the first GVA, when processing
   a range of GVAs for Hyper-V's TLB flush hypercalls.
 
 - Fix a brown paper bug in add_atomic_switch_msr().
 
 - Use hlist_for_each_entry_srcu() when traversing mask_notifier_list,
   to fix a lockdep warning; KVM doesn't hold RCU, just irq_srcu.
 
 - Ensure AVIC VMCB fields are initialized if the VM has an in-kernel local
   APIC (and AVIC is enabled at the module level).
 
 - Update CR8 write interception when AVIC is (de)activated, to fix a bug
   where the guest can run in perpetuity with the CR8 intercept enabled.
 
 - Add a quirk to skip the consistency check on FREEZE_IN_SMM, i.e. to allow
   L1 hypervisors to set FREEZE_IN_SMM.  This reverts (by default) an
   unintentional tightening of userspace ABI in 6.17, and provides some
   amount of backwards compatibility with hypervisors who want to freeze
   PMCs on VM-Entry.
 
 - Validate the VMCS/VMCB on return to a nested guest from SMM, because
   either userspace or the guest could stash invalid values in memory
   and trigger the processor's consistency checks.
 
 Generic:
 
 - Remove a subtle pseudo-overlay of kvm_stats_desc, which, aside from being
   unnecessary and confusing, triggered compiler warnings due to
   -Wflex-array-member-not-at-end.
 
 - Document that vcpu->mutex is take outside of kvm->slots_lock and
   kvm->slots_arch_lock, which is intentional and desirable despite being
   rather unintuitive.
 
 Selftests:
 
 - Increase the maximum number of NUMA nodes in the guest_memfd selftest to
   64 (from 8).
 -----BEGIN PGP SIGNATURE-----
 
 iQFIBAABCgAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmmy6n8UHHBib256aW5p
 QHJlZGhhdC5jb20ACgkQv/vSX3jHroNX7ggAhWoCG+AE6P3yrp6Mi+nRYpeRGC3q
 q2IiZCn0UoCg6q3c2kgn7b/N2zLJs0Q8FZRCEp2Je+2uvptpmdp/BMEfiIU3n2/a
 61z+Dydbpyc+kUmhJzUJ+aotq5FnMNmAAmqSKoc19GhAx2OQhQmBP/JOZ0P/eqLE
 Is0qNBgr/Zms2ib3GFf/JT+urysL2mX47qe92HTzq1T9EEG0KleID0Jz8vYQI8Fr
 I5N9+lTxagQDi8ytwOM85Cn8K7wh+CQIgzmciHcVErpAvAWkrEjrPlQltpEz2C5B
 aWEcRgw46utEaAiwPQGJRW6TeoKUG0pUR3v6T90nBkjjJ1npm6gPVE6TBA==
 =7nQ9
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull kvm fixes from Paolo Bonzini:
 "Quite a large pull request, partly due to skipping last week and
  therefore having material from ~all submaintainers in this one. About
  a fourth of it is a new selftest, and a couple more changes are large
  in number of files touched (fixing a -Wflex-array-member-not-at-end
  compiler warning) or lines changed (reformatting of a table in the API
  documentation, thanks rST).

  But who am I kidding---it's a lot of commits and there are a lot of
  bugs being fixed here, some of them on the nastier side like the
  RISC-V ones.

  ARM:

   - Correctly handle deactivation of interrupts that were activated
     from LRs. Since EOIcount only denotes deactivation of interrupts
     that are not present in an LR, start EOIcount deactivation walk
     *after* the last irq that made it into an LR

   - Avoid calling into the stubs to probe for ICH_VTR_EL2.TDS when pKVM
     is already enabled -- not only thhis isn't possible (pKVM will
     reject the call), but it is also useless: this can only happen for
     a CPU that has already booted once, and the capability will not
     change

   - Fix a couple of low-severity bugs in our S2 fault handling path,
     affecting the recently introduced LS64 handling and the even more
     esoteric handling of hwpoison in a nested context

   - Address yet another syzkaller finding in the vgic initialisation,
     where we would end-up destroying an uninitialised vgic with nasty
     consequences

   - Address an annoying case of pKVM failing to boot when some of the
     memblock regions that the host is faulting in are not page-aligned

   - Inject some sanity in the NV stage-2 walker by checking the limits
     against the advertised PA size, and correctly report the resulting
     faults

  PPC:

   - Fix a PPC e500 build error due to a long-standing wart that was
     exposed by the recent conversion to kmalloc_obj(); rip out all the
     ugliness that led to the wart

  RISC-V:

   - Prevent speculative out-of-bounds access using array_index_nospec()
     in APLIC interrupt handling, ONE_REG regiser access, AIA CSR
     access, float register access, and PMU counter access

   - Fix potential use-after-free issues in kvm_riscv_gstage_get_leaf(),
     kvm_riscv_aia_aplic_has_attr(), and kvm_riscv_aia_imsic_has_attr()

   - Fix potential null pointer dereference in
     kvm_riscv_vcpu_aia_rmw_topei()

   - Fix off-by-one array access in SBI PMU

   - Skip THP support check during dirty logging

   - Fix error code returned for Smstateen and Ssaia ONE_REG interface

   - Check host Ssaia extension when creating AIA irqchip

  x86:

   - Fix cases where CPUID mitigation features were incorrectly marked
     as available whenever the kernel used scattered feature words for
     them

   - Validate _all_ GVAs, rather than just the first GVA, when
     processing a range of GVAs for Hyper-V's TLB flush hypercalls

   - Fix a brown paper bug in add_atomic_switch_msr()

   - Use hlist_for_each_entry_srcu() when traversing mask_notifier_list,
     to fix a lockdep warning; KVM doesn't hold RCU, just irq_srcu

   - Ensure AVIC VMCB fields are initialized if the VM has an in-kernel
     local APIC (and AVIC is enabled at the module level)

   - Update CR8 write interception when AVIC is (de)activated, to fix a
     bug where the guest can run in perpetuity with the CR8 intercept
     enabled

   - Add a quirk to skip the consistency check on FREEZE_IN_SMM, i.e. to
     allow L1 hypervisors to set FREEZE_IN_SMM. This reverts (by
     default) an unintentional tightening of userspace ABI in 6.17, and
     provides some amount of backwards compatibility with hypervisors
     who want to freeze PMCs on VM-Entry

   - Validate the VMCS/VMCB on return to a nested guest from SMM,
     because either userspace or the guest could stash invalid values in
     memory and trigger the processor's consistency checks

  Generic:

   - Remove a subtle pseudo-overlay of kvm_stats_desc, which, aside from
     being unnecessary and confusing, triggered compiler warnings due to
     -Wflex-array-member-not-at-end

   - Document that vcpu->mutex is take outside of kvm->slots_lock and
     kvm->slots_arch_lock, which is intentional and desirable despite
     being rather unintuitive

  Selftests:

   - Increase the maximum number of NUMA nodes in the guest_memfd
     selftest to 64 (from 8)"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (43 commits)
  KVM: selftests: Verify SEV+ guests can read and write EFER, CR0, CR4, and CR8
  Documentation: kvm: fix formatting of the quirks table
  KVM: x86: clarify leave_smm() return value
  selftests: kvm: add a test that VMX validates controls on RSM
  selftests: kvm: extract common functionality out of smm_test.c
  KVM: SVM: check validity of VMCB controls when returning from SMM
  KVM: VMX: check validity of VMCS controls when returning from SMM
  KVM: SVM: Set/clear CR8 write interception when AVIC is (de)activated
  KVM: SVM: Initialize AVIC VMCB fields if AVIC is enabled with in-kernel APIC
  KVM: x86: Introduce KVM_X86_QUIRK_VMCS12_ALLOW_FREEZE_IN_SMM
  KVM: x86: Fix SRCU list traversal in kvm_fire_mask_notifiers()
  KVM: VMX: Fix a wrong MSR update in add_atomic_switch_msr()
  KVM: x86: hyper-v: Validate all GVAs during PV TLB flush
  KVM: x86: synthesize CPUID bits only if CPU capability is set
  KVM: PPC: e500: Rip out "struct tlbe_ref"
  KVM: PPC: e500: Fix build error due to using kmalloc_obj() with wrong type
  KVM: selftests: Increase 'maxnode' for guest_memfd tests
  KVM: arm64: pkvm: Don't reprobe for ICH_VTR_EL2.TDS on CPU hotplug
  KVM: arm64: vgic: Pick EOIcount deactivations from AP-list tail
  KVM: arm64: Remove the redundant ISB in __kvm_at_s1e2()
  ...
2026-03-15 12:22:10 -07:00
Linus Torvalds
4f3df2e5ea powerpc fixes for 7.0 #3
- Fix KUAP warning in VMX usercopy path
  - Fix lockdep warning during PCI enumeration
  - Fix to move CMA reservations to arch_mm_preinit
  - Fix to check current->mm is alive before getting user callchain
 
 Thanks to: Aboorva Devarajan, Christophe Leroy (CS GROUP), Dan Horák, Nicolin
 Chen, Nilay Shroff, Qiao Zhao, Ritesh Harjani (IBM), Saket Kumar Bhaskar,
 Sayali Patil, Shrikanth Hegde, Venkat Rao Bagalkote, Viktor Malik,
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEqX2DNAOgU8sBX3pRpnEsdPSHZJQFAmm2KVsACgkQpnEsdPSH
 ZJT6VBAAlmediTysFMpS6qnxrhJ/ZbERskIvfGkcW4i8lPr5yluPjQhj75Q9RYIy
 eRFS5eYssQVXbhS1/YWfsQKcG2tH7ucl0ocYfl8xvGGCgpSEu+wYTwECj2OVSF7T
 BiQ6VsHcOLJJ1SxCoS17n+sl8WGuIGikWKYM2ECeNx7iysrFczcj4RQ9Z4aYWT91
 xmgDyQwrNmxSy85OXq5ITLcY5IcVLtwnpjyTp4z94fP2Ho/R/muL9i3Sven7Iiqm
 a5I5XDozMFxtFtOxYlh7y8cKisDEYqinqoA/9P59kEtZ5XML8yp/s7rJ7Gjl/AmF
 O3fEAbtevTz2XvpVpx6XiRAXDtdRyR+YFUZMTABawDFlHZffD7m4eg/9A4JvDJ/8
 LxklCGLECZes+dEULGG/kXoOD7e2jJKDBsGYjgGWXU5+ZI8qjhfSWdiXAcl1DEHd
 gYZ2N6eYNWP/m2wqs5FUiabdB0yPdcpI7ukxmECpQDdS4TCA4sU3DI0FRyGktABV
 nNaYBZezZhlCWzNo/NBxFAvj6OHmo8WYHX1G6piE6nJKYyPlbjLyV5/tvkW9oxlM
 HlejFBKF4Us9ZotNgWxQdJzZCJ3qWmuxDgukzShX4mDbGdK8+4Vv9Qjk1SwsCypS
 HQ/ff0SNcHVdDJkw41jOJxoTv/2+vEB+1FmytmZ7s/fxUs/qW04=
 =OrN9
 -----END PGP SIGNATURE-----

Merge tag 'powerpc-7.0-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux

Pull powerpc fixes from Madhavan Srinivasan:

 - Fix KUAP warning in VMX usercopy path

 - Fix lockdep warning during PCI enumeration

 - Fix to move CMA reservations to arch_mm_preinit

 - Fix to check current->mm is alive before getting user callchain

Thanks to Aboorva Devarajan, Christophe Leroy (CS GROUP), Dan Horák,
Nicolin Chen, Nilay Shroff, Qiao Zhao, Ritesh Harjani (IBM), Saket Kumar
Bhaskar, Sayali Patil, Shrikanth Hegde, Venkat Rao Bagalkote, and Viktor
Malik.

* tag 'powerpc-7.0-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
  powerpc/iommu: fix lockdep warning during PCI enumeration
  powerpc/selftests/copyloops: extend selftest to exercise __copy_tofrom_user_power7_vmx
  powerpc: fix KUAP warning in VMX usercopy path
  powerpc, perf: Check that current->mm is alive before getting user callchain
  powerpc/mem: Move CMA reservations to arch_mm_preinit
2026-03-15 11:36:11 -07:00
Niklas Cassel
e022f0c72c selftests: pci_endpoint: Skip reserved BARs
Running a test against a reserved BAR will result in the pci-epf-test
driver returning -ENOBUFS.

Make sure that the pci_endpoint_test selftest will return skip instead of
failure or success for reserved BARs.

Signed-off-by: Niklas Cassel <cassel@kernel.org>
Signed-off-by: Manivannan Sadhasivam <mani@kernel.org>
Tested-by: Manikanta Maddireddy <mmaddireddy@nvidia.com>
Tested-by: Koichiro Den <den@valinux.co.jp>
Reviewed-by: Manikanta Maddireddy <mmaddireddy@nvidia.com>
Reviewed-by: Frank Li <Frank.Li@nxp.com>
Link: https://patch.msgid.link/20260312130229.2282001-22-cassel@kernel.org
2026-03-15 22:04:28 +05:30
Cheng-Yang Chou
f96bc0fa92 sched_ext: Update selftests to drop ops.cpu_acquire/release()
ops.cpu_acquire/release() are deprecated by commit a3f5d48222
("sched_ext: Allow scx_bpf_reenqueue_local() to be called from
anywhere") in favor of handling CPU preemption via the sched_switch
tracepoint.

In the maximal selftest, replace the cpu_acquire/release stubs with a
minimal sched_switch TP program. Attach all non-struct_ops programs
(including the new TP) via maximal__attach() after disabling auto-attach
for the maximal_ops struct_ops map, which is managed manually in run().

Apply the same fix to reload_loop, which also uses the maximal skeleton.

Signed-off-by: Cheng-Yang Chou <yphbchou0911@gmail.com>
Reviewed-by: Andrea Righi <arighi@nvidia.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2026-03-14 22:54:05 -10:00
Cheng-Yang Chou
6712c4fefc sched_ext: Update demo schedulers and selftests to use scx_bpf_task_set_dsq_vtime()
Direct writes to p->scx.dsq_vtime are deprecated in favor of
scx_bpf_task_set_dsq_vtime(). Update scx_simple, scx_flatcg, and
select_cpu_vtime selftest to use the new kfunc with
scale_by_task_weight_inverse().

Signed-off-by: Cheng-Yang Chou <yphbchou0911@gmail.com>
Reviewed-by: Andrea Righi <arighi@nvidia.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2026-03-14 22:53:59 -10:00
Eric Dumazet
4686679a14 selftests/net: packetdrill: add tcp_disorder_fin_in_FIN_WAIT.pkt
Commit 795a7dfbc3 ("net: tcp: accept old ack during closing")
was fixing an old bug, add a test to make sure we won't break
this case in future kernels.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Menglong Dong <menglong8.dong@gmail.com>
Link: https://patch.msgid.link/20260313115429.3365751-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-03-14 13:15:46 -07:00
Yifan Wu
74cd4e0e53 selftests/arm64: Implement cmpbr_sigill() to hwcap test
The function executes a CBEQ instruction which is valid if the CPU
supports the CMPBR extension. The CBEQ branches to skip the following
UDF instruction, and no SIGILL is generated. Otherwise, it will
generate a SIGILL.

Signed-off-by: Yifan Wu <wuyifan50@huawei.com>
Reviewed-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2026-03-14 16:58:16 +00:00
Simon Baatz
3eb371edda selftests/net: packetdrill: add tcp_rcv_neg_window.pkt
The test ensures we correctly apply the maximum advertised window limit
when rcv_nxt advances past rcv_mwnd_seq, so that the "usable window"
is properly clamped to zero rather than becoming negative.

Signed-off-by: Simon Baatz <gmbnomis@gmail.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20260309-tcp_rfc7323_retract_wnd_rfc-v3-6-4c7f96b1ec69@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-03-14 08:02:51 -07:00
Simon Baatz
ba58b3e70b selftests/net: packetdrill: add tcp_rcv_wnd_shrink_allowed.pkt
This test verifies the sequence number checks using the maximum
advertised window sequence number when net.ipv4.tcp_shrink_window
is enabled.

Signed-off-by: Simon Baatz <gmbnomis@gmail.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20260309-tcp_rfc7323_retract_wnd_rfc-v3-5-4c7f96b1ec69@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-03-14 08:02:51 -07:00
Simon Baatz
ec1adf8ecf selftests/net: packetdrill: add tcp_rcv_wnd_shrink_nomem.pkt
This test verifies
- the sequence number checks using the maximum advertised window
  sequence number and
- the logic for handling received data in tcp_data_queue()

for the cases:

1. The window is reduced to zero because of memory

2. The window grows again but still does not reach the originally
   advertised window

Signed-off-by: Simon Baatz <gmbnomis@gmail.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20260309-tcp_rfc7323_retract_wnd_rfc-v3-4-4c7f96b1ec69@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-03-14 08:02:16 -07:00
Simon Baatz
0e24d17bd9 tcp: implement RFC 7323 window retraction receiver requirements
By default, the Linux TCP implementation does not shrink the
advertised window (RFC 7323 calls this "window retraction") with the
following exceptions:

- When an incoming segment cannot be added due to the receive buffer
  running out of memory. Since commit 8c670bdfa5 ("tcp: correct
  handling of extreme memory squeeze") a zero window will be
  advertised in this case. It turns out that reaching the required
  memory pressure is easy when window scaling is in use. In the
  simplest case, sending a sufficient number of segments smaller than
  the scale factor to a receiver that does not read data is enough.

- Commit b650d953cd ("tcp: enforce receive buffer memory limits by
  allowing the tcp window to shrink") addressed the "eating memory"
  problem by introducing a sysctl knob that allows shrinking the
  window before running out of memory.

However, RFC 7323 does not only state that shrinking the window is
necessary in some cases, it also formulates requirements for TCP
implementations when doing so (Section 2.4).

This commit addresses the receiver-side requirements: After retracting
the window, the peer may have a snd_nxt that lies within a previously
advertised window but is now beyond the retracted window. This means
that all incoming segments (including pure ACKs) will be rejected
until the application happens to read enough data to let the peer's
snd_nxt be in window again (which may be never).

To comply with RFC 7323, the receiver MUST honor any segment that
would have been in window for any ACK sent by the receiver and, when
window scaling is in effect, SHOULD track the maximum window sequence
number it has advertised. This patch tracks that maximum window
sequence number rcv_mwnd_seq throughout the connection and uses it in
tcp_sequence() when deciding whether a segment is acceptable.

rcv_mwnd_seq is updated together with rcv_wup and rcv_wnd in
tcp_select_window(). If we count tcp_sequence() as fast path, it is
read in the fast path. Therefore, rcv_mwnd_seq is put into rcv_wnd's
cacheline group.

The logic for handling received data in tcp_data_queue() is already
sufficient and does not need to be updated.

Signed-off-by: Simon Baatz <gmbnomis@gmail.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20260309-tcp_rfc7323_retract_wnd_rfc-v3-1-4c7f96b1ec69@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-03-14 08:01:49 -07:00
Cheng-Yang Chou
c959218c65 sched_ext/selftests: Fix incorrect include guard comments
Fix two mismatched closing comments in header include guards:

- util.h: closing comment says __SCX_TEST_H__ but the guard is
  __SCX_TEST_UTIL_H__
- exit_test.h: closing comment has a spurious '#' character before
  the guard name

Signed-off-by: Cheng-Yang Chou <yphbchou0911@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2026-03-13 23:01:06 -10:00
Andrea Righi
12b49dd15e selftests/sched_ext: Update scx_bpf_dsq_move_to_local() in kselftests
After commit 860683763e ("sched_ext: Add enq_flags to
scx_bpf_dsq_move_to_local()") some of the kselftests are failing to
build:

 exit.bpf.c:44:34: error: too few arguments provided to function-like macro invocation
    44 |         scx_bpf_dsq_move_to_local(DSQ_ID);

Update the kselftests adding the new argument to
scx_bpf_dsq_move_to_local().

Fixes: 860683763e ("sched_ext: Add enq_flags to scx_bpf_dsq_move_to_local()")
Signed-off-by: Andrea Righi <arighi@nvidia.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2026-03-13 22:43:52 -10:00
Paul Chaignon
0a753d8cd6 selftests/bpf: Test case for refinement improvement using 64b bounds
This new selftest demonstrates the improvement of bounds refinement from
the previous patch. It is inspired from a set of reg_bounds_sync inputs
generated using CBMC [1] by Shung-Hsi:

  reg.smin_value=0x8000000000000002
  reg.smax_value=2
  reg.umin_value=2
  reg.umax_value=19
  reg.s32_min_value=2
  reg.s32_max_value=3
  reg.u32_min_value=2
  reg.u32_max_value=3

reg_bounds_sync returns R=[2; 3] without the previous patch, and R=2
with it. __reg64_deduce_bounds is able to derive that u64=2, but before
the previous patch, those bounds are overwritten in
__reg_deduce_mixed_bounds using the 32bits bounds.

To arrive to these reg_bounds_sync inputs, we bound the 32bits value
first to [2; 3]. We can then upper-bound s64 without impacting u64. At
that point, the refinement to u64=2 doesn't happen because the ranges
still overlap in two points:

  0  umin=2                             umax=0xff..ff00..03   U64_MAX
  |  [xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx]      |
  |----------------------------|------------------------------|
  |xx]                           [xxxxxxxxxxxxxxxxxxxxxxxxxxxx|
  0  smax=2                      smin=0x800..02               -1

With an upper-bound check at value 19, we can reach the above inputs for
reg_bounds_sync. At that point, the refinement to u64=2 happens and
because it isn't overwritten by __reg_deduce_mixed_bounds anymore,
reg_bounds_sync returns with reg=2.

The test validates this result by including an illegal instruction in
the (dead) branch reg != 2.

Link: https://github.com/shunghsiyu/reg_bounds_sync-review/ [1]
Co-developed-by: Shung-Hsi Yu <shung-hsi.yu@suse.com>
Signed-off-by: Shung-Hsi Yu <shung-hsi.yu@suse.com>
Signed-off-by: Paul Chaignon <paul.chaignon@gmail.com>
Tested-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/r/622dc51c581cd4d652fff362188b2a5f73c1fe99.1773401138.git.paul.chaignon@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2026-03-13 19:09:35 -07:00
Linus Torvalds
8369b2e97d sched_ext: Fixes for v7.0-rc3
- Fix data races flagged by KCSAN: add missing READ_ONCE()/WRITE_ONCE()
   annotations for lock-free accesses to module parameters and dsq->seq.
 
 - Fix silent truncation of upper 32 enqueue flags (SCX_ENQ_PREEMPT and
   above) when passed through the int sched_class interface.
 
 - Documentation updates: scheduling class precedence, task ownership
   state machine, example scheduler descriptions, config list cleanup.
 
 - Selftest fix for format specifier and buffer length in
   file_write_long().
 -----BEGIN PGP SIGNATURE-----
 
 iIQEABYKACwWIQTfIjM1kS57o3GsC/uxYfJx3gVYGQUCabRyHg4cdGpAa2VybmVs
 Lm9yZwAKCRCxYfJx3gVYGZiWAQCmUOHiGAk73p9DDn6Zyrm+o/iQm/iOinchBeUs
 ZiG0bgEAn15giAnLCA5Zs6cG7PemxBH1v7ctyzTjh1VsBds0rwo=
 =zXix
 -----END PGP SIGNATURE-----

Merge tag 'sched_ext-for-7.0-rc3-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/sched_ext

Pull sched_ext fixes from Tejun Heo:

 - Fix data races flagged by KCSAN: add missing READ_ONCE()/WRITE_ONCE()
   annotations for lock-free accesses to module parameters and dsq->seq

 - Fix silent truncation of upper 32 enqueue flags (SCX_ENQ_PREEMPT and
   above) when passed through the int sched_class interface

 - Documentation updates: scheduling class precedence, task ownership
   state machine, example scheduler descriptions, config list cleanup

 - Selftest fix for format specifier and buffer length in
   file_write_long()

* tag 'sched_ext-for-7.0-rc3-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/sched_ext:
  sched_ext: Use WRITE_ONCE() for the write side of scx_enable helper pointer
  sched_ext: Fix enqueue_task_scx() truncation of upper enqueue flags
  sched_ext: Documentation: Update sched-ext.rst
  sched_ext: Use READ_ONCE() for scx_slice_bypass_us in scx_bypass()
  sched_ext: Documentation: Mention scheduling class precedence
  sched_ext: Document task ownership state machine
  sched_ext: Use READ_ONCE() for lock-free reads of module param variables
  sched_ext/selftests: Fix format specifier and buffer length in file_write_long()
  sched_ext: Use WRITE_ONCE() for the write side of dsq->seq update
2026-03-13 14:54:56 -07:00
David Carlier
1d02346fec selftests/sched_ext: Add missing error check for exit__load()
exit__load(skel) was called without checking its return value.
Every other test in the suite wraps the load call with
SCX_FAIL_IF(). Add the missing check to be consistent with the
rest of the test suite.

Fixes: a5db7817af ("sched_ext: Add selftests")
Signed-off-by: David Carlier <devnexen@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2026-03-13 07:00:45 -10:00
Hari Bathini
2af3aa702c selftests/bpf: Improve test coverage for kfunc call
On powerpc, immediate load instructions are sign extended. In case
of unsigned types, arguments should be explicitly zero-extended by
the caller. For kfunc call, this needs to be handled in the JIT code.
In bpf_kfunc_call_test4(), that tests for sign-extension of signed
argument types in kfunc calls, add some additional failure checks.
And add bpf_kfunc_call_test5() to test zero-extension of unsigned
argument types in kfunc calls.

Signed-off-by: Hari Bathini <hbathini@linux.ibm.com>
Acked-by: Yonghong Song <yonghong.song@linux.dev>
Link: https://lore.kernel.org/r/20260312080113.843408-1-hbathini@linux.ibm.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2026-03-13 07:13:35 -07:00
Jakub Kicinski
c1f9a89b0c selftests: net: add test for Netlink policy dumps
Add validation for the nlctrl family, accessing family info and
dumping policies.

  TAP version 13
  1..4
  ok 1 nl_nlctrl.getfamily_do
  ok 2 nl_nlctrl.getfamily_dump
  ok 3 nl_nlctrl.getpolicy_dump
  ok 4 nl_nlctrl.getpolicy_by_op
  # Totals: pass:4 fail:0 xfail:0 xpass:0 skip:0 error:0

Link: https://patch.msgid.link/20260311032839.417748-5-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-03-12 18:02:13 -07:00
Jakub Kicinski
e911be8354 selftests: net: make sure that Netlink rejects unknown attrs in dump
Add a test case for rejecting attrs if policy is not set.
dev_get dump has no input policy (accepts no attrs).

Link: https://patch.msgid.link/20260311032839.417748-4-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-03-12 18:02:13 -07:00
Jakub Kicinski
72374257ed Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Cross-merge networking fixes after downstream PR (net-7.0-rc4).

drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
  db25c42c2e ("net/mlx5e: RX, Fix XDP multi-buf frag counting for striding RQ")
  dff1c3164a ("net/mlx5e: SHAMPO, Always calculate page size")
https://lore.kernel.org/aa7ORohmf67EKihj@sirena.org.uk

drivers/net/ethernet/ti/am65-cpsw-nuss.c
  840c9d13cb ("net: ethernet: ti: am65-cpsw-nuss: Fix rx_filter value for PTP support")
  a23c657e33 ("net: ethernet: ti: am65-cpsw: Use also port number to identify timestamps")
https://lore.kernel.org/abK3EkIXuVgMyGI7@sirena.org.uk

No adjacent changes.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-03-12 12:53:34 -07:00
Linus Torvalds
2c7e63d702 Including fixes from CAN and netfilter.
Current release - regressions:
 
   - eth: mana: Null service_wq on setup error to prevent double destroy
 
 Previous releases - regressions:
 
   - nexthop: fix percpu use-after-free in remove_nh_grp_entry
 
   - sched: teql: fix NULL pointer dereference in iptunnel_xmit on TEQL slave xmit
 
   - bpf: fix nd_tbl NULL dereference when IPv6 is disabled
 
   - neighbour: restore protocol != 0 check in pneigh update
 
   - tipc: fix divide-by-zero in tipc_sk_filter_connect()
 
   - eth: mlx5:
     - fix crash when moving to switchdev mode
     - fix DMA FIFO desync on error CQE SQ recovery
 
   - eth: iavf: fix PTP use-after-free during reset
 
   - eth: bonding: fix type confusion in bond_setup_by_slave()
 
   - eth: lan78xx: fix WARN in __netif_napi_del_locked on disconnect
 
 Previous releases - always broken:
 
   - core: add xmit recursion limit to tunnel xmit functions
 
   - net-shapers: don't free reply skb after genlmsg_reply()
 
   - netfilter:
     - fix stack out-of-bounds read in pipapo_drop()
     - fix OOB read in nfnl_cthelper_dump_table()
 
   - mctp:
     - fix device leak on probe failure
     - i2c: fix skb memory leak in receive path
 
   - can: keep the max bitrate error at 5%
 
   - eth: bonding: fix nd_tbl NULL dereference when IPv6 is disabled
 
   - eth: bnxt_en: fix RSS table size check when changing ethtool channels
 
   - eth: amd-xgbe: prevent CRC errors during RX adaptation with AN disabled
 
   - eth: octeontx2-af: devlink: fix NIX RAS reporter recovery condition
 
 Signed-off-by: Paolo Abeni <pabeni@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 
 iQJGBAABCgAwFiEEg1AjqC77wbdLX2LbKSR5jcyPE6QFAmmy2TQSHHBhYmVuaUBy
 ZWRoYXQuY29tAAoJECkkeY3MjxOk8gUQAJEUISnfUwJKv9t+Ktb7lDXV+5giTaLt
 ckYoTM7I+8N1vpETXvsDpICZXKSdYsXCWQZx4fYt02AKWom234OGDxCwN46antct
 AUmIdhNR2pSCKSqrPdfsLgyR9YuuQNiDb5KLFbyNTyIOGbhpRVAGor1AFOFHzV8O
 WMC6BTIQWxqx1SXCtfSO5dTXcuTSG+TEC37RWYBA9tui+pUTDgx+ZnmBVZhknrbl
 vD3b9APHRQ8KnfWhdpL8RVy8p/OfOgYeFSeQI/rPcSIM3pTBdJwl2cbsu/91k0xk
 svXpQMivFMrDWWcnCv7zv6zE+EiHBZnA/Yiqu4h4x2Y+UlkTmOgDVYfh7vUlbmhu
 k7xAfD/m92lQTznZaebgijbMO/Ec/qMzvsMNg3po0dVjf3Xdg2k0OKTqfnAfsxhs
 +knVjVf+FFh7AQQ0uU0HvBiy3BvfEaiArLAITjH10je4D/SObwqym1sjaX/wTeYc
 NXdN7Ww8CCIddnhfXJpX3fM9lUlM38Qa4Vrxti3xX6JGHrOL+Of1wfeKvDPQw8ET
 5MRCYPAJpv62tmeYV4954nCFx0jeKeN41LcTgs5I+0A1jL85AFHZ7CFjNnVc5kZS
 8mEDxDXUwcPERVyttM9ATl95ALs1++DG6Ri0772EP6JUq6oOVyJc33m6IZdsMM29
 6ZUWxAokttk4
 =56Mx
 -----END PGP SIGNATURE-----

Merge tag 'net-7.0-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net

Pull networking fixes from Paolo Abeni:
 "Including fixes from CAN and netfilter.

  Current release - regressions:

   - eth: mana: Null service_wq on setup error to prevent double destroy

  Previous releases - regressions:

   - nexthop: fix percpu use-after-free in remove_nh_grp_entry

   - sched: teql: fix NULL pointer dereference in iptunnel_xmit on TEQL slave xmit

   - bpf: fix nd_tbl NULL dereference when IPv6 is disabled

   - neighbour: restore protocol != 0 check in pneigh update

   - tipc: fix divide-by-zero in tipc_sk_filter_connect()

   - eth:
      - mlx5:
         - fix crash when moving to switchdev mode
         - fix DMA FIFO desync on error CQE SQ recovery
      - iavf: fix PTP use-after-free during reset
      - bonding: fix type confusion in bond_setup_by_slave()
      - lan78xx: fix WARN in __netif_napi_del_locked on disconnect

  Previous releases - always broken:

   - core: add xmit recursion limit to tunnel xmit functions

   - net-shapers: don't free reply skb after genlmsg_reply()

   - netfilter:
      - fix stack out-of-bounds read in pipapo_drop()
      - fix OOB read in nfnl_cthelper_dump_table()

   - mctp:
      - fix device leak on probe failure
      - i2c: fix skb memory leak in receive path

   - can: keep the max bitrate error at 5%

   - eth:
      - bonding: fix nd_tbl NULL dereference when IPv6 is disabled
      - bnxt_en: fix RSS table size check when changing ethtool channels
      - amd-xgbe: prevent CRC errors during RX adaptation with AN disabled
      - octeontx2-af: devlink: fix NIX RAS reporter recovery condition"

* tag 'net-7.0-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (71 commits)
  net: prevent NULL deref in ip[6]tunnel_xmit()
  octeontx2-af: devlink: fix NIX RAS reporter to use RAS interrupt status
  octeontx2-af: devlink: fix NIX RAS reporter recovery condition
  net: ethernet: ti: am65-cpsw-nuss: Fix rx_filter value for PTP support
  net/mana: Null service_wq on setup error to prevent double destroy
  selftests: rtnetlink: add neighbour update test
  neighbour: restore protocol != 0 check in pneigh update
  net: dsa: realtek: Fix LED group port bit for non-zero LED group
  tipc: fix divide-by-zero in tipc_sk_filter_connect()
  net: dsa: microchip: Fix error path in PTP IRQ setup
  bpf: bpf_out_neigh_v6: Fix nd_tbl NULL dereference when IPv6 is disabled
  bpf: bpf_out_neigh_v4: Fix nd_tbl NULL dereference when IPv6 is disabled
  net: bonding: Fix nd_tbl NULL dereference when IPv6 is disabled
  ipv6: move the disable_ipv6_mod knob to core code
  net: bcmgenet: fix broken EEE by converting to phylib-managed state
  net-shapers: don't free reply skb after genlmsg_reply()
  net: dsa: mxl862xx: don't set user_mii_bus
  net: ethernet: arc: emac: quiesce interrupts before requesting IRQ
  page_pool: store detach_time as ktime_t to avoid false-negatives
  net: macb: Shuffle the tx ring before enabling tx
  ...
2026-03-12 11:33:35 -07:00
Sean Christopherson
d2ea4ff1ce KVM: selftests: Verify SEV+ guests can read and write EFER, CR0, CR4, and CR8
Add "do no harm" testing of EFER, CR0, CR4, and CR8 for SEV+ guests to
verify that the guest can read and write the registers, without hitting
e.g. a #VC on SEV-ES guests due to KVM incorrectly trying to intercept a
register.

Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-ID: <20260310211841.2552361-3-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2026-03-12 17:31:53 +01:00
T.J. Mercier
2de27980e1 selftests: memcg: Add tests for IN_DELETE_SELF and IN_IGNORED
Add two new tests that verify inotify events are sent when memcg files
or directories are removed with rmdir.

Signed-off-by: T.J. Mercier <tjmercier@google.com>
Acked-by: Tejun Heo <tj@kernel.org>
Acked-by: Amir Goldstein <amir73il@gmail.com>
Tested-by: syzbot@syzkaller.appspotmail.com
Link: https://patch.msgid.link/20260225223404.783173-4-tjmercier@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-03-12 15:51:03 +01:00
Christian Brauner
bb5c17bc86
selftests/filesystems: add MOVE_MOUNT_BENEATH rootfs tests
Add tests for mounting beneath the rootfs using MOVE_MOUNT_BENEATH:

- beneath_rootfs_success: mount beneath /, fchdir, chroot, umount2
  MNT_DETACH -- verify root changed
- beneath_rootfs_old_root_stacked: after mount-beneath, verify old root
  parent is clone via statmount
- beneath_rootfs_in_chroot_fail: chroot into subdir of same mount,
  mount-beneath fails (dentry != mnt_root)
- beneath_rootfs_in_chroot_success: chroot into separate tmpfs mount,
  mount-beneath succeeds
- beneath_rootfs_locked_transfer: in user+mount ns: mount-beneath
  rootfs succeeds, MNT_LOCKED transfers, old root unmountable
- beneath_rootfs_locked_containment: in user+mount ns: after full
  root-switch workflow, new root is MNT_LOCKED (containment preserved)
- beneath_non_rootfs_locked_transfer: mounts created before
  unshare(CLONE_NEWUSER | CLONE_NEWNS) become locked; mount-beneath
  transfers MNT_LOCKED, displaced mount can be unmounted
- beneath_non_rootfs_locked_containment: same setup, verify new mount
  is MNT_LOCKED (containment preserved)

Link: https://patch.msgid.link/20260224-work-mount-beneath-rootfs-v1-3-8c58bf08488f@kernel.org
Signed-off-by: Christian Brauner <brauner@kernel.org>
2026-03-12 13:34:59 +01:00
Christian Brauner
5b8ffd63fb
selftests/filesystems: add clone3 tests for empty mount namespaces
Add a test suite for the CLONE_EMPTY_MNTNS flag exercising the empty
mount namespace functionality through the clone3() syscall.

The clone3() code path is distinct from the unshare() path already
tested in empty_mntns_test.c.  With clone3(), CLONE_EMPTY_MNTNS
(0x400000000ULL) is a 64-bit flag that implies CLONE_NEWNS.  The
implication happens in kernel_clone() before copy_process(), unlike
unshare() where it goes through UNSHARE_EMPTY_MNTNS to
CLONE_EMPTY_MNTNS conversion in unshare_nsproxy_namespaces().

The tests cover:

- basic functionality: clone3 child gets empty mount namespace with
  exactly one mount, root and cwd point to the same mount
- CLONE_NEWNS implication: CLONE_EMPTY_MNTNS works without explicit
  CLONE_NEWNS, also works with redundant CLONE_NEWNS
- flag interactions: combines correctly with CLONE_NEWUSER,
  CLONE_NEWPID, CLONE_NEWUTS, CLONE_NEWIPC, CLONE_PIDFD
- mutual exclusion: CLONE_EMPTY_MNTNS | CLONE_FS returns EINVAL
  because the implied CLONE_NEWNS conflicts with CLONE_FS
- error paths: EPERM without capabilities, unknown 64-bit flags
  rejected
- parent isolation: parent mount namespace is unchanged after clone
- many parent mounts: child still gets exactly one mount
- mount properties: root mount is nullfs, is its own parent, is the
  only listmount entry
- overmount workflow: child can mount tmpfs over nullfs root to build
  a writable filesystem from scratch
- repeated clone3: each child gets a distinct mount namespace
- setns: parent can join child's empty mount namespace via setns()
- regression: plain CLONE_NEWNS via clone3 still copies the full
  mount tree

Link: https://patch.msgid.link/20260306-work-empty-mntns-consolidated-v1-3-6eb30529bbb0@kernel.org
Signed-off-by: Christian Brauner <brauner@kernel.org>
2026-03-12 13:33:55 +01:00
Christian Brauner
32f54f2bbc
selftests/filesystems: add tests for empty mount namespaces
Add a test suite for the UNSHARE_EMPTY_MNTNS and CLONE_EMPTY_MNTNS
flags exercising the empty mount namespace functionality through the
kselftest harness.

The tests cover:

- basic functionality: unshare succeeds, exactly one mount exists in
  the new namespace, root and cwd point to the same mount
- flag interactions: UNSHARE_EMPTY_MNTNS works standalone without
  explicit CLONE_NEWNS, combines correctly with CLONE_NEWUSER and
  other namespace flags (CLONE_NEWUTS, CLONE_NEWIPC)
- edge cases: EPERM without capabilities, works from a user namespace,
  many source mounts still result in one mount, cwd on a different
  mount gets reset to root
- error paths: invalid flags return EINVAL
- regression: plain CLONE_NEWNS still copies the full mount tree,
  other namespace unshares are unaffected
- mount properties: the root mount has the expected statmount
  properties, is its own parent, and is the only entry returned by
  listmount
- repeated unshare: consecutive UNSHARE_EMPTY_MNTNS calls each
  produce a new namespace with a distinct mount ID
- overmount workflow: verifies the intended usage pattern of creating
  an empty mount namespace with a nullfs root and then mounting tmpfs
  over it to build a writable filesystem from scratch

Link: https://patch.msgid.link/20260306-work-empty-mntns-consolidated-v1-2-6eb30529bbb0@kernel.org
Signed-off-by: Christian Brauner <brauner@kernel.org>
2026-03-12 13:33:55 +01:00
Christian Brauner
3ac7ea91f3
selftests: add FSMOUNT_NAMESPACE tests
Add selftests for FSMOUNT_NAMESPACE which creates a new mount namespace
with the newly created filesystem mounted onto a copy of the real
rootfs.

Link: https://patch.msgid.link/20260122-work-fsmount-namespace-v1-6-5ef0a886e646@kernel.org
Signed-off-by: Christian Brauner <brauner@kernel.org>
2026-03-12 13:33:54 +01:00
Christian Brauner
be1ca3ee8f
selftests/statmount: add statmount_alloc() helper
Add a helper to allocate a statmount buffer and call statmount(). This
helper will be shared by multiple test suites that need to query mount
information via statmount().

Link: https://patch.msgid.link/20260122-work-fsmount-namespace-v1-5-5ef0a886e646@kernel.org
Signed-off-by: Christian Brauner <brauner@kernel.org>
2026-03-12 13:33:54 +01:00
Sayali Patil
146c9ab38b powerpc/selftests/copyloops: extend selftest to exercise __copy_tofrom_user_power7_vmx
The new PowerPC VMX fast path (__copy_tofrom_user_power7_vmx) is not
exercised by existing copyloops selftests. This patch updates
the selftest to exercise the VMX variant, ensuring the VMX copy path
is validated.

Changes include:
  - COPY_LOOP=test___copy_tofrom_user_power7_vmx with -D VMX_TEST is used
    in existing selftest build targets.
  - Inclusion of ../utils.c to provide get_auxv_entry() for hardware
    feature detection.
  - At runtime, the test skips execution if Altivec is not available.
  - Copy sizes above VMX_COPY_THRESHOLD are used to ensure the VMX
    path is taken.

This enables validation of the VMX fast path without affecting systems
that do not support Altivec.

Signed-off-by: Sayali Patil <sayalip@linux.ibm.com>
Tested-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com>
Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Link: https://patch.msgid.link/20260304122201.153049-2-sayalip@linux.ibm.com
2026-03-12 11:03:48 +05:30
Gal Pressman
f0bd193166 selftests: net: fix timeout passed as positional argument to communicate()
The cited commit refactored the hardcoded timeout=5 into a parameter,
but dropped the keyword from the communicate() call.
Since Popen.communicate()'s first positional argument is 'input' (not
'timeout'), the timeout value is silently treated as stdin input and the
call never enforces a timeout.

Pass timeout as a keyword argument to restore the intended behavior.

Reviewed-by: Nimrod Oren <noren@nvidia.com>
Signed-off-by: Gal Pressman <gal@nvidia.com>
Link: https://patch.msgid.link/20260310115803.2521050-3-gal@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-03-11 19:11:40 -07:00
Gal Pressman
82562972b8 selftests: net: pass bpftrace timeout to cmd()
The bpftrace() helper configures an interval based exit timer but does
not propagate the timeout to the cmd object, which defaults to 5
seconds. Since the default BPFTRACE_TIMEOUT is 10 seconds, cmd.process()
always raises a TimeoutExpired exception before bpftrace has a chance to
exit gracefully.

Pass timeout+5 to cmd() to allow bpftrace to complete gracefully.

Note: this issue is masked by a bug in the way cmd() passes timeout,
this is fixed in the next commit.

Reviewed-by: Nimrod Oren <noren@nvidia.com>
Signed-off-by: Gal Pressman <gal@nvidia.com>
Link: https://patch.msgid.link/20260310115803.2521050-2-gal@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-03-11 19:11:36 -07:00
Sabrina Dubroca
68e76fc12d selftests: rtnetlink: add neighbour update test
Check that protocol and flags are updated correctly for
neighbour and pneigh entries.

Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com>
Link: https://patch.msgid.link/d28f72b5b4ff4c9ecbbbde06146a938dcc4c264a.1772894876.git.sd@queasysnail.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-03-11 19:04:55 -07:00
Daniel Golle
7e27d6202e selftests: net: local_termination: test link-local protocols
Add tests to local_termination.sh to verify that link-local frames
arrive. On some switches the DSA driver uses bridges to connect the
user ports to their CPU ports. More "intelligent" switches typically
don't forward link-local frames, but may trap them to an internal
microcontroller. The driver may have to change trapping rules, so
link-local frames end up on the DSA CPU ports instead of being
silently dropped or trapped to the internal microcontroller of the
switch.

Add two tests which help to validate this has been done correctly:
 - Link-local STP BPDU should arrive at the Linux netdev when the
   bridge has STP disabled (BR_NO_STP), in which case the bridge
   forwards them rather than consuming them in the control plane
 - Link-local LLDP should arrive at standalone ports (and the test
   should be skipped on bridged ports similar to how it is done
   for the IEEE1588v2/PTP tests)

Signed-off-by: Daniel Golle <daniel@makrotopia.org>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Tested-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Link: https://patch.msgid.link/1a67081b2ede1e6d2d32f7dd54ae9688f3566152.1773166131.git.daniel@makrotopia.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-03-11 18:58:05 -07:00
Soichiro Ueda
34c0378b15 selftests: af_unix: validate SO_PEEK_OFF advancement and reset
Extend the so_peek_off selftest to ensure the socket peek offset is handled
correctly after both MSG_PEEK and actual data consumption.

Verify that the peek offset advances by the same amount as the number of
bytes read when performing a read with MSG_PEEK.

After exercising SO_PEEK_OFF via MSG_PEEK, drain the receive queue with a
non-peek recv() and verify that it can receive all the content in the
buffer and SO_PEEK_OFF returns back to 0.

The verification after actual data consumption was suggested by Miao Wang
when the original so_peek_off selftest was introduced.

Link: https://lore.kernel.org/all/7B657CC7-B5CA-46D2-8A4B-8AB5FB83C6DA@gmail.com/
Suggested-by: Miao Wang <shankerwangmiao@gmail.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com>
Signed-off-by: Soichiro Ueda <the.latticeheart@gmail.com>
Link: https://patch.msgid.link/20260310072832.127848-1-the.latticeheart@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-03-11 18:20:17 -07:00
Christian Brauner
ec26879e6d selftests/pidfd: add CLONE_PIDFD_AUTOKILL tests
Add tests for CLONE_PIDFD_AUTOKILL:

- autokill_basic: Verify closing the clone3 pidfd kills the child.
- autokill_requires_pidfd: Verify AUTOKILL without CLONE_PIDFD fails.
- autokill_requires_autoreap: Verify AUTOKILL without CLONE_AUTOREAP
  fails.
- autokill_rejects_thread: Verify AUTOKILL with CLONE_THREAD fails.
- autokill_pidfd_open_no_effect: Verify only the clone3 pidfd triggers
  autokill, not pidfd_open().
- autokill_requires_cap_sys_admin: Verify AUTOKILL without CLONE_NNP
  fails with -EPERM for an unprivileged caller.
- autokill_without_nnp_with_cap: Verify AUTOKILL without CLONE_NNP
  succeeds with CAP_SYS_ADMIN.

Link: https://patch.msgid.link/20260226-work-pidfs-autoreap-v5-6-d148b984a989@kernel.org
Signed-off-by: Christian Brauner <brauner@kernel.org>
2026-03-11 23:23:46 +01:00
Christian Brauner
2a4d85aa1c selftests/pidfd: add CLONE_NNP tests
Add tests for the new CLONE_NNP flag:

- nnp_sets_no_new_privs: Verify a child created with CLONE_NNP has
  no_new_privs set while the parent does not.

- nnp_rejects_thread: Verify CLONE_NNP | CLONE_THREAD is rejected
  with -EINVAL since threads share credentials.

- autoreap_no_new_privs_unset: Verify a plain CLONE_AUTOREAP child
  does not get no_new_privs.

Link: https://patch.msgid.link/20260226-work-pidfs-autoreap-v5-5-d148b984a989@kernel.org
Signed-off-by: Christian Brauner <brauner@kernel.org>
2026-03-11 23:23:31 +01:00
Christian Brauner
76d46ad2c5 selftests/pidfd: add CLONE_AUTOREAP tests
Add tests for the new CLONE_AUTOREAP clone3() flag:

- autoreap_without_pidfd: CLONE_AUTOREAP without CLONE_PIDFD works
  (fire-and-forget)
- autoreap_rejects_exit_signal: CLONE_AUTOREAP with non-zero
  exit_signal fails
- autoreap_rejects_parent: CLONE_AUTOREAP with CLONE_PARENT fails
- autoreap_rejects_thread: CLONE_AUTOREAP with CLONE_THREAD fails
- autoreap_basic: child exits, pidfd poll works, PIDFD_GET_INFO returns
  correct exit code, waitpid() returns -ECHILD
- autoreap_signaled: child killed by signal, exit info correct via pidfd
- autoreap_reparent: autoreap grandchild reparented to subreaper still
  auto-reaps
- autoreap_multithreaded: autoreap process with sub-threads auto-reaps
  after last thread exits
- autoreap_no_inherit: grandchild forked without CLONE_AUTOREAP becomes
  a regular zombie

Link: https://patch.msgid.link/20260226-work-pidfs-autoreap-v5-4-d148b984a989@kernel.org
Signed-off-by: Christian Brauner <brauner@kernel.org>
2026-03-11 23:23:16 +01:00
Cheng-Yang Chou
bd377af097 sched_ext: Fix incomplete help text usage strings
Several demo schedulers and the selftest runner had usage strings
that omitted options which are actually supported:

- scx_central: add missing [-v]
- scx_pair: add missing [-v]
- scx_qmap: add missing [-S] and [-H]
- scx_userland: add missing [-v]
- scx_sdt: remove [-f] which no longer exists
- runner.c: add missing [-s], [-l], [-q]; drop [-h] which none of the
  other sched_ext tools list in their usage lines

Suggested-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Cheng-Yang Chou <yphbchou0911@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2026-03-11 11:02:57 -10:00
Varun R Mallya
ca0f39a369 selftests/bpf: Fix const qualifier warning in fexit_bpf2bpf.c
Building selftests with
clang 23.0.0 (6fae863eba8a72cdd82f37e7111a46a70be525e0) triggers
the following error:

  tools/testing/selftests/bpf/prog_tests/fexit_bpf2bpf.c:117:12:
  error: assigning to 'char *' from 'const char *' discards qualifiers
  [-Werror,-Wincompatible-pointer-types-discards-qualifiers]

The variable `tgt_name` is declared as `char *`, but it stores the
result of strstr(prog_name[i], "/"). Since `prog_name[i]` is a
`const char *`, the returned pointer should also be treated as
const-qualified.

Update `tgt_name` to `const char *` to match the type of the underlying
string and silence the compiler warning.

Signed-off-by: Varun R Mallya <varunrmallya@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Menglong Dong <menglong.dong@linux.dev>
Link: https://lore.kernel.org/bpf/20260305222132.470700-1-varunrmallya@gmail.com
2026-03-11 10:54:40 -07:00
Paolo Bonzini
3e745694b0 selftests: kvm: add a test that VMX validates controls on RSM
Add a test checking that invalid eVMCS contents are validated after an
RSM instruction is emulated.

The failure mode is simply that the RSM succeeds, because KVM virtualizes
NMIs anyway while running L2; the two pin-based execution controls used
by the test are entirely handled by KVM and not by the processor.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2026-03-11 18:41:12 +01:00
Paolo Bonzini
c52b534f26 selftests: kvm: extract common functionality out of smm_test.c
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2026-03-11 18:41:12 +01:00
Kai Huang
cf534a09fb KVM: selftests: Increase 'maxnode' for guest_memfd tests
Increase 'maxnode' when using 'get_mempolicy' syscall in guest_memfd
mmap and NUMA policy tests to fix a failure on one Intel GNR platform.

On a CXL-capable platform, the memory affinity of CXL memory regions may
not be covered by the SRAT.  Since each CXL memory region is enumerated
via a CFMWS table, at early boot the kernel parses all CFMWS tables to
detect all CXL memory regions and assigns a 'faked' NUMA node for each
of them, starting from the highest NUMA node ID enumerated via the SRAT.

This increases the 'nr_node_ids'.  E.g., on the aforementioned Intel GNR
platform which has 4 NUMA nodes and 18 CFMWS tables, it increases to 22.

This results in the 'get_mempolicy' syscall failure on that platform,
because currently 'maxnode' is hard-coded to 8 but the 'get_mempolicy'
syscall requires the 'maxnode' to be not smaller than the 'nr_node_ids'.

Increase the 'maxnode' to the number of bits of 'nodemask', which is
'unsigned long', to fix this.

This may not cover all systems.  Perhaps a better way is to always set
the 'nodemask' and 'maxnode' based on the actual maximum NUMA node ID on
the system, but for now just do the simple way.

Reported-by: Yi Lai <yi1.lai@intel.com>
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=221014
Closes: https://lore.kernel.org/all/bug-221014-28872@https.bugzilla.kernel.org%2F
Signed-off-by: Kai Huang <kai.huang@intel.com>
Reviewed-by: Yuan Yao <yaoyuan@linux.alibaba.com>
Link: https://patch.msgid.link/20260302205158.178058-1-kai.huang@intel.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2026-03-11 18:41:10 +01:00
Sun Jian
c02e0ab8ae selftests/bpf: Skip livepatch test when prerequisites are missing
livepatch_trampoline relies on livepatch sysfs and livepatch-sample.ko.
When CONFIG_LIVEPATCH is disabled or the samples module isn't built, the
test fails with ENOENT and causes false failures in minimal CI configs.

Skip the test when livepatch sysfs or the sample module is unavailable.
Also avoid writing to livepatch sysfs when it's not present.

Signed-off-by: Sun Jian <sun.jian.kdev@gmail.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/r/20260309104448.817401-1-sun.jian.kdev@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2026-03-11 09:36:37 -07:00
Sun Jian
aa181c7d64 selftests/bpf: drop serial restriction
Patch 1/2 added PID filtering to the probe_user BPF program to avoid
cross-test interference from the global connect() hooks.

With the interference removed, drop the serial_ prefix and remove the
stale TODO comment so the test can run in parallel.

Tested:
  ./test_progs -t probe_user -v
  ./test_progs -j$(nproc) -t probe_user

Signed-off-by: Sun Jian <sun.jian.kdev@gmail.com>
Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Link: https://lore.kernel.org/r/20260306083330.518627-2-sun.jian.kdev@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2026-03-11 09:34:22 -07:00
Sun Jian
70ce840d5f selftests/bpf: filter by pid to avoid cross-test interference
The test installs a kprobe on __sys_connect and checks that
bpf_probe_write_user() can modify the syscall argument. However, any
concurrent thread in any other test that calls connect() will also
trigger the kprobe and have its sockaddr silently overwritten, causing
flaky failures in unrelated tests.

Constrain the hook to the current test process by filtering on a PID
stored as a global variable in .bss. Initialize the .bss value from
user space before bpf_object__load() using bpf_map__set_initial_value(),
and validate the bss map value size to catch layout mismatches.

No new map is introduced and the test keeps the existing non-skeleton
flow.

Signed-off-by: Sun Jian <sun.jian.kdev@gmail.com>
Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Link: https://lore.kernel.org/r/20260306083330.518627-1-sun.jian.kdev@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2026-03-11 09:34:22 -07:00
Viktor Malik
900b7cc73c selftests/bpf: Speed up module_attach test
The module_attach test contains subtests which check that unloading a
module while there are BPF programs attached to its functions is not
possible because the module is still referenced.

The problem is that the test calls the generic unload_module() helper
function which is used for module cleanup after test_progs terminate and
tries to wait until all module references are released. This
unnecessarily slows down the module_attach subtests since each
unsuccessful call to unload_module() takes about 1 second.

Introduce try_unload_module() which takes the number of retries as a
parameter. Make unload_module() call it with the currently used amount
of 10000 retries but call it with just 1 retry from module_attach tests
as it is always expected to fail. This speeds up the module_attach()
test significantly.

Before:

    # time ./test_progs -t module_attach
    [...]
    Summary: 1/14 PASSED, 0 SKIPPED, 0 FAILED

    real        0m5.011s
    user        0m0.293s
    sys         0m0.108s

After:

    # time ./test_progs -t module_attach
    [...]
    Summary: 1/14 PASSED, 0 SKIPPED, 0 FAILED

    real        0m0.350s
    user        0m0.197s
    sys         0m0.063s

Signed-off-by: Viktor Malik <vmalik@redhat.com>
Reviewed-by: Alan Maguire <alan.maguire@oracle.com>
Tested-by: Alan Maguire <alan.maguire@oracle.com>
Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Link: https://lore.kernel.org/r/20260306101628.3822284-1-vmalik@redhat.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2026-03-11 09:29:09 -07:00
Thomas Weißschuh
bed0053a63 selftests: vDSO: vdso_test_correctness: Add a test for time()
Extend the test to also cover the time() function.

Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@kernel.org>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://patch.msgid.link/20260227-vdso-selftest-cleanups-v2-6-d84830fa8beb@linutronix.de
2026-03-11 15:23:24 +01:00
Thomas Weißschuh
a8b22a158a selftests: vDSO: vdso_test_correctness: Use facilities from parse_vdso.c
The soname from the vDSO is not a public API. Furthermore it requires
libc to implement dlsym() and friends.

Use the facilities from parse_vdso.c instead which uses the official
vDSO ABI to find it, aligned with the other vDSO selftests.

Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@kernel.org>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://patch.msgid.link/20260227-vdso-selftest-cleanups-v2-5-d84830fa8beb@linutronix.de
2026-03-11 15:23:18 +01:00
Thomas Weißschuh
38bc16aa47 selftests: vDSO: vdso_test_correctness: Handle different tv_usec types
On SPARC the field tv_usec of 'struct timespec' is not a 'long int', but
only a regular int. In this case the format string is incorrect and will
trigger compiler warnings.

Avoid the warnings by casting to 'long long', similar to how it is done for
the tv_sec and what the other similar selftests are doing.

Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@kernel.org>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Tested-by: Andreas Larsson <andreas@gaisler.com>
Reviewed-by: Andreas Larsson <andreas@gaisler.com>
Link: https://patch.msgid.link/20260227-vdso-selftest-cleanups-v2-4-d84830fa8beb@linutronix.de
2026-03-11 15:23:11 +01:00
Thomas Weißschuh
ad2af7768f selftests: vDSO: vdso_test_correctness: Drop SYS_getcpu fallbacks
These fallbacks are only valid on x86 and unused in the first place.

Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@kernel.org>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://patch.msgid.link/20260227-vdso-selftest-cleanups-v2-3-d84830fa8beb@linutronix.de
2026-03-11 15:23:06 +01:00
Thomas Weißschuh
50692c25ee selftests: vDSO: vdso_test_gettimeofday: Remove nolibc checks
nolibc now provides these headers, making the check unnecessary.

Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@kernel.org>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://patch.msgid.link/20260227-vdso-selftest-cleanups-v2-2-d84830fa8beb@linutronix.de
2026-03-11 15:23:01 +01:00
Thomas Weißschuh
5abfa0c4da Revert "selftests: vDSO: parse_vdso: Use UAPI headers instead of libc headers"
This reverts commit c9fbaa8795 ("selftests: vDSO: parse_vdso: Use UAPI
headers instead of libc headers")

The kernel headers were used to make parse_vdso.c compatible with
nolibc.  Unfortunately linux/elf.h is incompatible with glibc's
sys/auxv.h. When using glibc it is therefore not possible build
parse_vdso.c as part of the same compilation unit as its caller
as sys/auxv.h is needed for getauxval().

In the meantime nolibc gained its own elf.h, providing compatibility
with the documented libc interfaces.

Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@kernel.org>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://patch.msgid.link/20260227-vdso-selftest-cleanups-v2-1-d84830fa8beb@linutronix.de
2026-03-11 15:22:55 +01:00
Vincent Donnefort
39d5ca62a3 tracing: selftests: Add hypervisor trace remote tests
Run the trace remote selftests with the trace remote 'hypervisor', This
trace remote is most likely created when the arm64 KVM nVHE/pKVM
hypervisor is in use.

Cc: Shuah Khan <skhan@linuxfoundation.org>
Cc: linux-kselftest@vger.kernel.org
Signed-off-by: Vincent Donnefort <vdonnefort@google.com>
Link: https://patch.msgid.link/20260309162516.2623589-31-vdonnefort@google.com
Signed-off-by: Marc Zyngier <maz@kernel.org>
2026-03-11 08:51:17 +00:00
Jakub Kicinski
77a6401a87 tools: ynl: add Python API for easier access to policies
The format of Netlink policy dump is a bit curious with messages
in the same dump carrying both attrs and mapping info. Plus each
message carries a single piece of the puzzle the caller must then
reassemble.

I need to do this reassembly for a test, but I think it's generally
useful. So let's add proper support to YnlFamily to return more
user-friendly representation. See the various docs in the patch
for more details.

Link: https://patch.msgid.link/20260310005337.3594225-5-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-03-10 19:32:46 -07:00
Dimitri Daskalakis
690043b95c selftests: drv-net: rss: Add retries to test_rss_key_indir to reduce flakes
The test generates 16 flows, and verifies that traffic is distributed
across two queues via the NICs RSS indirection table. The likelihood of the
flows skewing to a single queue is high, so we retry sending traffic up to
3 times.

Alternatively, we could increase the number of generated flows. But
debug kernels may struggle to ramp this many flows.

During manual testing, the test passed for 10,000 consecutive runs.

Signed-off-by: Dimitri Daskalakis <dimitri.daskalakis1@gmail.com>
Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
Link: https://patch.msgid.link/20260309204215.2110486-1-dimitri.daskalakis1@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-03-10 19:01:51 -07:00
Allison Henderson
87fdf57ded selftests: rds: Fix tcpdump segfault in rds selftests
net/rds/test.py sees a segfault in tcpdump when executed through the
ksft runner.

[   21.903713] tcpdump[1469]: segfault at 0 ip 000072100e99126d
sp 00007ffccf740fd0 error 4
[   21.903721]  in libc.so.6[16a26d,7798b149a000+188000]
[   21.905074]  in libc.so.6[16a26d,72100e84f000+188000] likely on
CPU 5 (core 5, socket 0)
[   21.905084] Code: 00 0f 85 a0 00 00 00 48 83 c4 38 89 d8 5b 41 5c
41 5d 41 5e 41 5f 5d c3 0f 1f 44 00 00 48 8b 05 91 8b 09 00 8b 4d ac
64 89 08 <41> 0f b6 07 83 e8 2b a8 fd 0f 84 54 ff ff ff 49 8b 36 4c 89
ff e8
[   21.906760]  likely on CPU 9 (core 9, socket 0)
[   21.913469] Code: 00 0f 85 a0 00 00 00 48 83 c4 38 89 d8 5b 41 5c 41
5d 41 5e 41 5f 5d c3 0f 1f 44 00 00 48 8b 05 91 8b 09 00 8b 4d ac 64 89
08 <41> 0f b6 07 83 e8 2b a8 fd 0f 84 54 ff ff ff 49 8b 36 4c 89 ff e8

The os.fork() call creates extra complexity because it forks the entire
process including the python interpreter.  ip() then calls cmd() which
creates a subprocess.Popen.  We can avoid the extra layering by simply
calling subprocess.Popen directly. Track the process handles directly
and terminate them at cleanup rather than relying on killall. Further
tcpdump's -Z flag attempts to change savefile ownership, which is not
supported by the 9p protocol.  Fix this by writing pcap captures to
"/tmp" during the test and move them to the log directory after tcpdump
exits.

Signed-off-by: Allison Henderson <achender@kernel.org>
Link: https://patch.msgid.link/20260308055835.1338257-4-achender@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-03-10 18:54:24 -07:00
Allison Henderson
b873b4e160 selftests: rds: Add ksft timeout
rds/run.sh sets a timer of 400s when calling test.py.  However when
tests are run through ksft, a default 45s timer is applied.  Fix this
by adding a ksft timeout in tools/testing/selftests/net/rds/settings

Signed-off-by: Allison Henderson <achender@kernel.org>
Link: https://patch.msgid.link/20260308055835.1338257-3-achender@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-03-10 18:54:15 -07:00
Allison Henderson
5a0c5702bd selftests: rds: Fix pylint warnings
Tidy up all exiting pylint errors in test.py.  No functional
changes are introduced in this patch

Signed-off-by: Allison Henderson <achender@kernel.org>
Link: https://patch.msgid.link/20260308055835.1338257-2-achender@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-03-10 18:53:42 -07:00
Alan Maguire
e95e85b891 selftests/bpf: Handle !CONFIG_SMC in bpf_smc.c
Currently BPF selftests will fail to compile if CONFIG_SMC
is not set.

Use BPF CO-RE to work around the case where CONFIG_SMC is
not set; use ___local variants of relevant structures and
utilize bpf_core_field_exists() for net->smc.

The test continues to pass where

  CONFIG_SMC=y
  CONFIG_SMC_HS_CTRL_BPF=y

but these changes allow the selftests to build in the absence
of CONFIG_SMC=y.

Also ensure that we get a pure skip rather than a skip+fail
by removing the SMC is unsupported part from the ASSERT_FALSE()
in get_smc_nl_family(); doing this means we get a skip without
a fail when CONFIG_SMC is not set:

$ sudo ./test_progs -t bpf_smc
Summary: 1/0 PASSED, 1 SKIPPED, 0 FAILED

Fixes: beb3c67297 ("bpf/selftests: Add selftest for bpf_smc_hs_ctrl")
Reported-by: Colm Harrington <colm.harrington@oracle.com>
Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Tested-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Link: https://patch.msgid.link/20260310111330.601765-1-alan.maguire@oracle.com
2026-03-10 17:40:15 -07:00
Paul Chaignon
e06e6b8001 selftests/bpf: Fix pkg-config call on static builds
For commit b0dcdcb9ae ("resolve_btfids: Fix linker flags detection"),
I suggested setting HOSTPKG_CONFIG to $PKG_CONFIG when compiling
resolve_btfids, but I forgot the quotes around that variable.

As a result, when running vmtest.sh with static linking, it fails as
follows:

    $ LDLIBS=-static PKG_CONFIG='pkg-config --static' ./vmtest.sh
    [...]
    make: unrecognized option '--static'
    Usage: make [options] [target] ...
    [...]

This worked when I tested it because HOSTPKG_CONFIG didn't have a
default value in the resolve_btfids Makefile, but once it does, the
quotes aren't preserved and it fails on the next make call.

Fixes: b0dcdcb9ae ("resolve_btfids: Fix linker flags detection")
Signed-off-by: Paul Chaignon <paul.chaignon@gmail.com>
Acked-by: Mykyta Yatsenko <yatsenko@meta.com>
Acked-by: Ihor Solodrai <ihor.solodrai@linux.dev>
Link: https://lore.kernel.org/r/abADBwn_ykblpABE@mail.gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2026-03-10 12:04:00 -07:00
Hui Zhu
da99028c21 selftests/bpf: Use bpf_core_enum_value for stats in cgroup_iter_memcg
Replace hardcoded enum values with bpf_core_enum_value() calls in
cgroup_iter_memcg test to improve portability across different
kernel versions.

The change adds runtime enum value resolution for:
- node_stat_item: NR_ANON_MAPPED, NR_SHMEM, NR_FILE_PAGES,
  NR_FILE_MAPPED
- vm_event_item: PGFAULT

This ensures the BPF program can adapt to enum value changes
between kernel versions.

Reviewed-by: Emil Tsalapatis <emil@etsalapatis.com>
Reviewed-by: JP Kobryn <jp.kobryn@linux.dev>
Signed-off-by: Hui Zhu <zhuhui@kylinos.cn>
Link: https://lore.kernel.org/r/ca6eb1a1a4fd7a17ffe995acf52c9a4ceb7bac13.1772505399.git.zhuhui@kylinos.cn
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2026-03-10 11:53:48 -07:00
Hui Zhu
a8fce027e1 selftests/bpf: Remove kmem subtest from cgroup_iter_memcg
When cgroup.memory=nokmem is set in the kernel command line, kmem
accounting is disabled. This causes the test_kmem subtest in
cgroup_iter_memcg to fail because it expects non-zero kmem values.

Remove the kmem subtest altogether since the remaining subtests
(shmem, file, pgfault) already provide sufficient coverage for
the cgroup iter memcg functionality.

Reviewed-by: JP Kobryn <jp.kobryn@linux.dev>
Signed-off-by: Hui Zhu <zhuhui@kylinos.cn>
Link: https://lore.kernel.org/r/35fa32a019361ec26265c8a789ee31e448d4dbda.1772505399.git.zhuhui@kylinos.cn
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2026-03-10 11:53:23 -07:00
Cupertino Miranda
6a1c9a442f selftests/bpf: tests to non_null ptr detection using register operand in JEQ/JNE
This patch adds two tests to check non_null ptr detection when using JEQ and JNE
have a register in second operand, and its value is known to be 0.

Signed-off-by: Cupertino Miranda <cupertino.miranda@oracle.com>
Cc: David Faust  <david.faust@oracle.com>
Cc: Jose Marchesi  <jose.marchesi@oracle.com>
Cc: Elena Zannoni  <elena.zannoni@oracle.com>
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/r/20260304195018.181396-4-cupertino.miranda@oracle.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2026-03-10 11:51:18 -07:00
Yazhou Tang
ea1989746b selftests/bpf: Add test for BPF_END register ID reset
Add a test case to ensure that BPF_END operations correctly break
register's scalar ID ties.

The test creates a scenario where r1 is a copy of r0, r0 undergoes a
byte swap, and then r0 is checked against a constant.

- Without the fix in the verifier, the bounds learned from r0 are
  incorrectly propagated to r1, making the verifier believe r1 is
  bounded and wrongly allowing subsequent pointer arithmetic.

- With the fix, r1 remains an unbounded scalar, and the verifier
  correctly rejects the arithmetic operation between the frame pointer
  and the unbounded register.

Co-developed-by: Tianci Cao <ziye@zju.edu.cn>
Signed-off-by: Tianci Cao <ziye@zju.edu.cn>
Co-developed-by: Shenghao Yuan <shenghaoyuan0928@163.com>
Signed-off-by: Shenghao Yuan <shenghaoyuan0928@163.com>
Signed-off-by: Yazhou Tang <tangyazhou518@outlook.com>
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/r/20260304083228.142016-3-tangyazhou@zju.edu.cn
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2026-03-10 11:46:31 -07:00
Alex Mastro
f183963891 vfio: selftests: fix crash in vfio_dma_mapping_mmio_test
Remove the __iommu_unmap() call on a region that was never mapped.
When __iommu_map() fails (expected for MMIO vaddrs in non-VFIO
modes), the region is not added to the dma_regions list, leaving its
list_head zero-initialized. If the unmap ioctl returns success,
__iommu_unmap() calls list_del_init() on this zeroed node and crashes.

This fixes the iommufd_compat_type1 and iommufd_compat_type1v2
test variants.

Fixes: 080723f4d4 ("vfio: selftests: Add vfio_dma_mapping_mmio_test")
Signed-off-by: Alex Mastro <amastro@fb.com>
Reviewed-by: David Matlack <dmatlack@google.com>
Reviewed-by: Yuan Yao <yaoyuan@linux.alibaba.com>
Link: https://lore.kernel.org/r/20260303-fix-mmio-test-v1-1-78b4a9e46a4e@fb.com
Signed-off-by: Alex Williamson <alex@shazbot.org>
2026-03-10 11:56:01 -06:00
Victor Nogueira
56acc7f519 selftests/tc-testing: Adapt test's output to HFSC's iproute2 printing changes
To make the printing of HFSC's defcls consistent with HTB's,
iproute2 is now printing defcls prepended with "0x".

This commit adapts test a4c3 to this change.

Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Victor Nogueira <victor@mojatatu.com>
Link: https://patch.msgid.link/20260307220724.2501212-1-victor@mojatatu.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-03-09 19:42:19 -07:00
Alok Tiwari
dc9c9193c7 selftests: fib_tests: fix link-local retrieval in fib6_nexthop()
fib6_nexthop() retrieves the link-local address for two interfaces used
in the test. However, both lldummy and llv1 are obtained from dummy0.

llv1 is expected to be retrieved from veth1, which is the interface used
later in the test. The subsequent check and error message also expect
the address to be retrieved from veth1.

Fix this by retrieving llv1 from veth1.

Signed-off-by: Alok Tiwari <alok.a.tiwari@oracle.com>
Link: https://patch.msgid.link/20260306180830.2329477-1-alok.a.tiwari@oracle.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-03-09 19:17:48 -07:00
Vincent Donnefort
0a1b03251d tracing: selftests: Add trace remote tests
Exercise the tracefs interface for trace remote with a set of tests to
check:

  * loading/unloading (unloading.tc)
  * reset (reset.tc)
  * size changes (buffer_size.tc)
  * consuming read (trace_pipe.tc)
  * non-consuming read (trace.tc)

Cc: Shuah Khan <skhan@linuxfoundation.org>
Cc: linux-kselftest@vger.kernel.org
Link: https://patch.msgid.link/20260309162516.2623589-16-vdonnefort@google.com
Reviewed-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Signed-off-by: Vincent Donnefort <vdonnefort@google.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2026-03-09 12:33:55 -04:00
Viktor Malik
fcec7c66d6 selftests/bpf: Move sleepable refcounted_kptr tests to syscalls
Now that sleepable programs are always enabled on syscalls, let
refcounted_kptr tests use syscalls rather than bpf_testmod_test_read,
which is not sleepable with error injection disabled.

The tests just check that the verifier can handle usage of RCU locks in
sleepable programs and never actually attach. So, the attachment target
doesn't matter (as long as it is sleepable) and with syscalls, the tests
pass on kernels with disabled error injection.

Signed-off-by: Viktor Malik <vmalik@redhat.com>
Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Link: https://lore.kernel.org/r/8b6626eae384559855f7a0e846a16e83f25f06f6.1773055375.git.vmalik@redhat.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2026-03-09 09:28:42 -07:00
Ricardo B. Marlière
2295174498 ktest: Add a --dry-run mode
When working on a ktest configuration, it is often useful to inspect the
final option values after includes, defaults, per-test overrides, and
variable expansion have been applied, without actually starting a test run.

Add a --dry-run option that reads the configuration, prints the test
preamble using resolved option values, and exits before opening LOG_FILE or
executing any test logic.

This is useful for debugging ktest configurations and for scripts that need
to validate the final resolved settings without triggering side effects.

Cc: John Hawley <warthog9@eaglescrag.net>
Cc: Andrea Righi <arighi@nvidia.com>
Cc: Marcos Paulo de Souza <mpdesouza@suse.com>
Cc: Matthieu Baerts <matttbe@kernel.org>
Cc: Fernando Fernandez Mancera <fmancera@suse.de>
Cc: Pedro Falcato <pfalcato@suse.de>
Link: https://patch.msgid.link/20260307-ktest-fixes-v1-9-565d412f4925@suse.com
Signed-off-by: Ricardo B. Marlière <rbm@suse.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2026-03-09 11:03:00 -04:00
Ricardo B. Marlière
bc6e165a45 ktest: Run POST_KTEST hooks on failure and cancellation
PRE_KTEST can be useful for setting up the environment and POST_KTEST to
tear it down, however POST_KTEST only runs on the normal end-of-run path.
It is skipped when ktest exits through dodie() or cancel_test(). Final
cleanup hooks are skipped.

Factor the final hook execution into run_post_ktest(), call it from the
normal exit path and from the early exit paths, and guard it so the hook
runs at most once.

Cc: John Hawley <warthog9@eaglescrag.net>
Cc: Andrea Righi <arighi@nvidia.com>
Cc: Marcos Paulo de Souza <mpdesouza@suse.com>
Cc: Matthieu Baerts <matttbe@kernel.org>
Cc: Fernando Fernandez Mancera <fmancera@suse.de>
Cc: Pedro Falcato <pfalcato@suse.de>
Link: https://patch.msgid.link/20260307-ktest-fixes-v1-8-565d412f4925@suse.com
Fixes: 921ed4c720 ("ktest: Add PRE/POST_KTEST and TEST options")
Signed-off-by: Ricardo B. Marlière <rbm@suse.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2026-03-09 11:02:18 -04:00
Ricardo B. Marlière
972816d21b ktest: Add PRE_KTEST_DIE for PRE_KTEST failures
PRE_KTEST runs before the first test, but its return status is currently
ignored. A failing setup hook can leave the rest of the run executing in a
partially initialized environment.

Add PRE_KTEST_DIE so PRE_KTEST can fail the run in the same way
PRE_BUILD_DIE and PRE_TEST_DIE already can. Keep the default behavior
unchanged when the new option is not set.

Cc: John Hawley <warthog9@eaglescrag.net>
Cc: Andrea Righi <arighi@nvidia.com>
Cc: Marcos Paulo de Souza <mpdesouza@suse.com>
Cc: Matthieu Baerts <matttbe@kernel.org>
Cc: Fernando Fernandez Mancera <fmancera@suse.de>
Cc: Pedro Falcato <pfalcato@suse.de>
Link: https://patch.msgid.link/20260307-ktest-fixes-v1-7-565d412f4925@suse.com
Signed-off-by: Ricardo B. Marlière <rbm@suse.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2026-03-09 10:32:08 -04:00
Ricardo B. Marlière
eae247f65d ktest: Stop dropping console output during power-cycle reboot
The POWER_CYCLE fallback added to reboot() flushes monitor output at the
wrong time. In the untimed reboot path, flushing immediately after
start_monitor() can consume the first output from the new boot before
monitor() begins reading it. In the timed path, flushing after POWER_CYCLE
can eat the "Linux version" banner or REBOOT_SUCCESS_LINE from the new
kernel.

That makes ktest miss the boot it is waiting for and can trigger an
unnecessary second power cycle.

Start the monitor before POWER_CYCLE so the reference counting stays
balanced, but only flush when reboot() was asked to wait for a timed
reboot. Perform that flush before issuing POWER_CYCLE so it drains stale
output from the old kernel instead of consuming the next boot.

Cc: John Hawley <warthog9@eaglescrag.net>
Cc: Andrea Righi <arighi@nvidia.com>
Cc: Marcos Paulo de Souza <mpdesouza@suse.com>
Cc: Matthieu Baerts <matttbe@kernel.org>
Cc: Fernando Fernandez Mancera <fmancera@suse.de>
Cc: Pedro Falcato <pfalcato@suse.de>
Link: https://patch.msgid.link/20260307-ktest-fixes-v1-6-565d412f4925@suse.com
Signed-off-by: Ricardo B. Marlière <rbm@suse.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2026-03-09 10:32:07 -04:00
Ricardo B. Marlière
fcfc25725a ktest: Run commands through list-form shell open
run_command() currently uses string-form open():

  open(CMD, "$command 2>&1 |")

That delegates parsing to the shell but also mixes the stderr redirection
into the command string. Switch to list-form open() with an explicit sh -c
wrapper so shell syntax errors are captured in the same output stream as
command output. Otherwise, important errors can not be retrieved from the
ktest LOG_FILE.

Cc: John Hawley <warthog9@eaglescrag.net>
Cc: Andrea Righi <arighi@nvidia.com>
Cc: Marcos Paulo de Souza <mpdesouza@suse.com>
Cc: Matthieu Baerts <matttbe@kernel.org>
Cc: Fernando Fernandez Mancera <fmancera@suse.de>
Cc: Pedro Falcato <pfalcato@suse.de>
Link: https://patch.msgid.link/20260307-ktest-fixes-v1-5-565d412f4925@suse.com
Signed-off-by: Ricardo B. Marlière <rbm@suse.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2026-03-09 10:32:07 -04:00
Ricardo B. Marlière
a2de57a3c8 ktest: Honor empty per-test option overrides
A per-test override can clear an inherited default option by assigning an
empty value, but __set_test_option() still used option_defined() to decide
whether a per-test key existed. That turned an empty per-test assignment
back into "fall back to the default", so tests still could not clear
inherited settings.

For example:

  DEFAULTS
  (...)
  LOG_FILE = /tmp/ktest-empty-override.log
  CLEAR_LOG = 1
  ADD_CONFIG = /tmp/.config

  TEST_START
  TEST_TYPE = build
  BUILD_TYPE = nobuild
  ADD_CONFIG =

This would run the test with ADD_CONFIG[1] = /tmp/.config

Fix by checking whether the per-test key exists before falling back. If it
does exist but is empty, treat it as unset for that test and stop the
fallback chain there.

Cc: John Hawley <warthog9@eaglescrag.net>
Cc: Andrea Righi <arighi@nvidia.com>
Cc: Marcos Paulo de Souza <mpdesouza@suse.com>
Cc: Matthieu Baerts <matttbe@kernel.org>
Cc: Fernando Fernandez Mancera <fmancera@suse.de>
Cc: Pedro Falcato <pfalcato@suse.de>
Link: https://patch.msgid.link/20260307-ktest-fixes-v1-4-565d412f4925@suse.com
Fixes: 22c37a9ac4 ("ktest: Allow tests to undefine default options")
Signed-off-by: Ricardo B. Marlière <rbm@suse.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2026-03-09 10:32:07 -04:00
Ricardo B. Marlière
a2706c6c4a ktest: Treat undefined self-reference as empty
Config variables are expanded when they are assigned. A first-time append
such as:

  VAR := ${VAR} foo

leaves the literal ${VAR} in the stored value because VAR has not been
defined yet. Later expansions then carry the self-reference forward instead
of behaving like an empty prefix.

Drop an unescaped self-reference when the variable has no current value,
and trim the outer whitespace left behind. Keep escaped \${VAR} references
unchanged so literal text still works.

Cc: John Hawley <warthog9@eaglescrag.net>
Cc: Andrea Righi <arighi@nvidia.com>
Cc: Marcos Paulo de Souza <mpdesouza@suse.com>
Cc: Matthieu Baerts <matttbe@kernel.org>
Cc: Fernando Fernandez Mancera <fmancera@suse.de>
Cc: Pedro Falcato <pfalcato@suse.de>
Link: https://patch.msgid.link/20260307-ktest-fixes-v1-3-565d412f4925@suse.com
Signed-off-by: Ricardo B. Marlière <rbm@suse.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2026-03-09 10:32:07 -04:00
Ricardo B. Marlière
eb30942e4e ktest: Resolve LOG_FILE in test option context
LOG_FILE is expanded immediately after the config file is parsed with
eval_option(..., -1). That uses the default context, not the same option
resolution path used for tests. If LOG_FILE depends on options that are
finalized per test, it can be resolved from stale values before the first
test starts.

Resolve LOG_FILE through set_test_option("LOG_FILE", 1) instead so it uses
the same expansion rules as the rest of the test options.

Cc: John Hawley <warthog9@eaglescrag.net>
Cc: Andrea Righi <arighi@nvidia.com>
Cc: Marcos Paulo de Souza <mpdesouza@suse.com>
Cc: Matthieu Baerts <matttbe@kernel.org>
Cc: Fernando Fernandez Mancera <fmancera@suse.de>
Cc: Pedro Falcato <pfalcato@suse.de>
Link: https://patch.msgid.link/20260307-ktest-fixes-v1-2-565d412f4925@suse.com
Signed-off-by: Ricardo B. Marlière <rbm@suse.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2026-03-09 10:32:07 -04:00
Ricardo B. Marlière
057854f8a5 ktest: Avoid undef warning when WARNINGS_FILE is unset
check_buildlog() probes $warnings_file with -f even when WARNINGS_FILE is
not configured. Perl warns about the uninitialized value and adds noise to
the test log, which can hide the output we actually care about.

Check that WARNINGS_FILE is defined before testing whether the file exists.

Cc: John Hawley <warthog9@eaglescrag.net>
Cc: Andrea Righi <arighi@nvidia.com>
Cc: Marcos Paulo de Souza <mpdesouza@suse.com>
Cc: Matthieu Baerts <matttbe@kernel.org>
Cc: Fernando Fernandez Mancera <fmancera@suse.de>
Cc: Pedro Falcato <pfalcato@suse.de>
Link: https://patch.msgid.link/20260307-ktest-fixes-v1-1-565d412f4925@suse.com
Fixes: 4283b169ab ("ktest: Add make_warnings_file and process full warnings")
Signed-off-by: Ricardo B. Marlière <rbm@suse.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2026-03-09 10:32:06 -04:00
Alexei Starovoitov
099bded752 Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf 7.0-rc3
Cross-merge BPF and other fixes after downstream PR.

No conflicts.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2026-03-08 17:46:38 -07:00
Linus Torvalds
8b7f4cd3ac bpf-fixes
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEE+soXsSLHKoYyzcli6rmadz2vbToFAmmsahoACgkQ6rmadz2v
 bToD4Q/9Hr2lAELsTRZIENBTYYTfk48Rzrr+rJbZWLfX4nOu9RwhVboTw0gvJr8r
 QEteYO8+gP5hrIuifcC+vdzW12CpgziBK7/Qj2NB7MGSX0ZCLK0ShQK6TAbNE8uZ
 xA4qMcJp0BWJpgfU51z4Ur5nMzG0Th2XlY+SGwHYZ/53qRMmP7Tr8R9wPltKuEll
 J+tS6BE6bFKMQe1CuIYlYfYj9kO3j4yx9S48j1e8x9LXc2/2arfsOIZQ6L11rnxZ
 YTa9jbcRL0YdLy7d4hMLyPQquF9hiHDrJLJih+YNmbyOCPGD3HP0ib2c2g0ip+qk
 WlNFLc2K9w0eS02uuAytaxwArdOpDUHG+skWe+dGt95Bk3Xld0hW/JSI5hrohM1E
 0YKjuZeGGvTkI5FJ5kk+BWoApy+FveiN1w7VsHZgr1tix5NRZRgJP/5swM+0eSp/
 neTzp08fC10gi+GOnUpdF/lI4wEayoGMpo68BTGiM/hP6Qh268wNWgTHaau3Sk/z
 wfJb90VTcDHk+N8mnbi4970RZ6OdqX2n7aBy8D7hBW31kHhP27zofzYX51CFD8Ln
 5NQDU3wX/hUw9tNwoghCNe12//gfDBi6rO/GoEztNFfVVef4QuzsfwKKdxzUMc9a
 jZvyDyGh87rGLauDWylq4s23vgxgIG/p114uv+eFp3AMkOfT/Dk=
 =knWl
 -----END PGP SIGNATURE-----

Merge tag 'bpf-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf

Pull bpf fixes from Alexei Starovoitov:

 - Fix u32/s32 bounds when ranges cross min/max boundary (Eduard
   Zingerman)

 - Fix precision backtracking with linked registers (Eduard Zingerman)

 - Fix linker flags detection for resolve_btfids (Ihor Solodrai)

 - Fix race in update_ftrace_direct_add/del (Jiri Olsa)

 - Fix UAF in bpf_trampoline_link_cgroup_shim (Lang Xu)

* tag 'bpf-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf:
  resolve_btfids: Fix linker flags detection
  selftests/bpf: add reproducer for spurious precision propagation through calls
  bpf: collect only live registers in linked regs
  Revert "selftests/bpf: Update reg_bound range refinement logic"
  selftests/bpf: test refining u32/s32 bounds when ranges cross min/max boundary
  bpf: Fix u32/s32 bounds when ranges cross min/max boundary
  bpf: Fix a UAF issue in bpf_trampoline_link_cgroup_shim
  ftrace: Add missing ftrace_lock to update_ftrace_direct_add/del
2026-03-07 12:20:37 -08:00
Linus Torvalds
03dcad79ee RCU fixes for v7.0
Fix a regression in RCU torture test pre-defined scenarios caused by
 commit 7dadeaa6e8 ("sched: Further restrict the preemption modes")
 which limits PREEMPT_NONE to architectures that do not support
 preemption at all and PREEMPT_VOLUNTARY to those architectures that do
 not yet have PREEMPT_LAZY support. Since major architectures (e.g. x86
 and arm64) no longer support CONFIG_PREEMPT_NONE and
 CONFIG_PREEMPT_VOLUNTARY, using them in rcutorture, rcuscale, refscale,
 and scftorture pre-defined scenarios causes config checking errors.
 Hence switch these kconfigs to PREEMPT_LAZY.
 -----BEGIN PGP SIGNATURE-----
 
 iQFFBAABCAAvFiEEj5IosQTPz8XU1wRHSXnow7UH+rgFAmmsZu0RHGJvcXVuQGtl
 cm5lbC5vcmcACgkQSXnow7UH+rgNWwgAn1bIWDIWQR9CkRQ0grhdO+pPfLusbVg/
 Y7H3SsdEX03meSunA/IVGejP6Qbanuab9nyHdv3WxxowpCGYBaLFPvklSfcBWYeZ
 4lxi6Fj+2rrzvOnQ54Pk5i4v5VayxEo12XBIgx6HDV7+5LgWk0gU0LpWPUhEWYYR
 z/zJ/XbcLr8e9tZVwAWZj/ShLUH301razC2SaR/OlS93zqRG9Sd251Knjqq/lEwI
 T6RZfhT2Wz3bgqU3QcjxDWw5dhB0/Y9wKoJjx4bB9m8lnSt+o96gH40TO+lnllnS
 T4OHjqK1J1fUXNLzQufyfKKAiwi/LBBc9H4pe4tiLFxg0fg5s21flQ==
 =NBys
 -----END PGP SIGNATURE-----

Merge tag 'rcu-fixes.v7.0-20260307a' of git://git.kernel.org/pub/scm/linux/kernel/git/rcu/linux

Pull RCU selftest fixes from Boqun Feng:
 "Fix a regression in RCU torture test pre-defined scenarios caused by
  commit 7dadeaa6e8 ("sched: Further restrict the preemption modes")
  which limits PREEMPT_NONE to architectures that do not support
  preemption at all and PREEMPT_VOLUNTARY to those architectures that do
  not yet have PREEMPT_LAZY support.

  Since major architectures (e.g. x86 and arm64) no longer support
  CONFIG_PREEMPT_NONE and CONFIG_PREEMPT_VOLUNTARY, using them in
  rcutorture, rcuscale, refscale, and scftorture pre-defined scenarios
  causes config checking errors.

  Switch these kconfigs to PREEMPT_LAZY"

* tag 'rcu-fixes.v7.0-20260307a' of git://git.kernel.org/pub/scm/linux/kernel/git/rcu/linux:
  scftorture: Update due to x86 not supporting none/voluntary preemption
  refscale: Update due to x86 not supporting none/voluntary preemption
  rcuscale: Update due to x86 not supporting none/voluntary preemption
  rcutorture: Update due to x86 not supporting none/voluntary preemption
2026-03-07 11:56:55 -08:00
Ihor Solodrai
b0dcdcb9ae resolve_btfids: Fix linker flags detection
The "|| echo -lzstd" default makes zstd an unconditional link
dependency of resolve_btfids. On systems where libzstd-dev is not
installed and pkg-config fails, the linker fails:

  ld: cannot find -lzstd: No such file or directory

libzstd is a transitive dependency of libelf, so the -lzstd flag is
strictly necessary only for static builds [1].

Remove ZSTD_LIBS variable, and instead set LIBELF_LIBS depending on
whether the build is static or not. Use $(HOSTPKG_CONFIG) as primary
source of the flags list.

Also add a default value for HOSTPKG_CONFIG in case it's not built via
the toplevel Makefile. Pass it from selftests/bpf too.

[1] https://lore.kernel.org/bpf/4ff82800-2daa-4b9f-95a9-6f512859ee70@linux.dev/

Reported-by: BPF CI Bot (Claude Opus 4.6) <bot+bpf-ci@kernel.org>
Reported-by: Vitaly Chikunov <vt@altlinux.org>
Closes: https://lore.kernel.org/bpf/aaWqMcK-2AQw5dx8@altlinux.org/
Fixes: 4021848a90 ("selftests/bpf: Pass through build flags to bpftool and resolve_btfids")
Signed-off-by: Ihor Solodrai <ihor.solodrai@linux.dev>
Reviewed-by: Paul Chaignon <paul.chaignon@gmail.com>
Link: https://lore.kernel.org/r/20260305014730.3123382-1-ihor.solodrai@linux.dev
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2026-03-07 08:51:51 -08:00
Eduard Zingerman
223ffb6a3d selftests/bpf: add reproducer for spurious precision propagation through calls
Add a test for the scenario described in the previous commit:
an iterator loop with two paths where one ties r2/r7 via
shared scalar id and skips a call, while the other goes
through the call. Precision marks from the linked registers
get spuriously propagated to the call path via
propagate_precision(), hitting "backtracking call unexpected
regs" in backtrack_insn().

Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/r/20260306-linked-regs-and-propagate-precision-v1-2-18e859be570d@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2026-03-06 21:50:05 -08:00
Eduard Zingerman
2658a1720a bpf: collect only live registers in linked regs
Fix an inconsistency between func_states_equal() and
collect_linked_regs():
- regsafe() uses check_ids() to verify that cached and current states
  have identical register id mapping.
- func_states_equal() calls regsafe() only for registers computed as
  live by compute_live_registers().
- clean_live_states() is supposed to remove dead registers from cached
  states, but it can skip states belonging to an iterator-based loop.
- collect_linked_regs() collects all registers sharing the same id,
  ignoring the marks computed by compute_live_registers().
  Linked registers are stored in the state's jump history.
- backtrack_insn() marks all linked registers for an instruction
  as precise whenever one of the linked registers is precise.

The above might lead to a scenario:
- There is an instruction I with register rY known to be dead at I.
- Instruction I is reached via two paths: first A, then B.
- On path A:
  - There is an id link between registers rX and rY.
  - Checkpoint C is created at I.
  - Linked register set {rX, rY} is saved to the jump history.
  - rX is marked as precise at I, causing both rX and rY
    to be marked precise at C.
- On path B:
  - There is no id link between registers rX and rY,
    otherwise register states are sub-states of those in C.
  - Because rY is dead at I, check_ids() returns true.
  - Current state is considered equal to checkpoint C,
    propagate_precision() propagates spurious precision
    mark for register rY along the path B.
  - Depending on a program, this might hit verifier_bug()
    in the backtrack_insn(), e.g. if rY ∈  [r1..r5]
    and backtrack_insn() spots a function call.

The reproducer program is in the next patch.
This was hit by sched_ext scx_lavd scheduler code.

Changes in tests:
- verifier_scalar_ids.c selftests need modification to preserve
  some registers as live for __msg() checks.
- exceptions_assert.c adjusted to match changes in the verifier log,
  R0 is dead after conditional instruction and thus does not get
  range.
- precise.c adjusted to match changes in the verifier log, register r9
  is dead after comparison and it's range is not important for test.

Reported-by: Emil Tsalapatis <emil@etsalapatis.com>
Fixes: 0fb3cf6110 ("bpf: use register liveness information for func_states_equal")
Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/r/20260306-linked-regs-and-propagate-precision-v1-1-18e859be570d@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2026-03-06 21:49:40 -08:00
Linus Torvalds
4660e168c6 arm64 fixes for -rc3
- Fix kexec/hibernation hang due to bogus read-only mappings.
 
 - Fix sparse warnings in our cmpxchg() implementation.
 
 - Prevent runtime-const being used in modules, just like x86.
 
 - Fix broken elision of access flag modifications for contiguous entries
   on systems without support for hardware updates.
 
 - Fix a broken SVE selftest that was testing the wrong instruction.
 -----BEGIN PGP SIGNATURE-----
 
 iQFEBAABCgAuFiEEPxTL6PPUbjXGY88ct6xw3ITBYzQFAmmrH5wQHHdpbGxAa2Vy
 bmVsLm9yZwAKCRC3rHDchMFjNLiWB/40+A3Q3gz9VB3obupFeC/s688TjGMwLbIO
 m03Qu/ulGwBZaPRPZxsxnr8pFZKjSple5NJiHv5kQ/wR4Cfc4zwF2zOSdRvAI/c3
 qPT2YL0CcVt0OgbWd2VCjiThTuFREewdCqRWbmkDaPYd27k0KWY14gHHpriRw7XM
 QY0OOz8wrWi3lg2Wyvub9wWLkyjKtFlrkwZaACD5D90k/CwKVgncC1z4vh41hQxk
 qjxdygNJt2sV+31+F7QMoY/rbyVnUkdSvWSwe9z2Bs9mwebaoGgx4c1l47Wq+oQD
 NiVgHOZnPQkDgd2MWkUiCwzAr6C3B0aF2BCu+NTgILkbX7PyG792
 =knFu
 -----END PGP SIGNATURE-----

Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull arm64 fixes from Will Deacon:
 "The main changes are a fix to the way in which we manage the access
  flag setting for mappings using the contiguous bit and a fix for a
  hang on the kexec/hibernation path.

  Summary:

   - Fix kexec/hibernation hang due to bogus read-only mappings

   - Fix sparse warnings in our cmpxchg() implementation

   - Prevent runtime-const being used in modules, just like x86

   - Fix broken elision of access flag modifications for contiguous
     entries on systems without support for hardware updates

   - Fix a broken SVE selftest that was testing the wrong instruction"

* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
  selftest/arm64: Fix sve2p1_sigill() to hwcap test
  arm64: contpte: fix set_access_flags() no-op check for SMMU/ATS faults
  arm64: make runtime const not usable by modules
  arm64: mm: Add PTE_DIRTY back to PAGE_KERNEL* to fix kexec/hibernation
  arm64: Silence sparse warnings caused by the type casting in (cmp)xchg
2026-03-06 19:57:03 -08:00
Aleksei Oladko
0bcac7b112 selftests: net: make ovs-dpctl.py fail when pyroute2 is unsupported
The pmtu.sh kselftest configures OVS using ovs-dpctl.py and falls back
to ovs-vsctl only when ovs-dpctl.py fails. However, ovs-dpctl.py exits
with a success status when the installed pyroute2 package version is
lower than 0.6, even though the OVS datapath is not configured.

As a result, pmtu.sh assumes that the setup was successful and
continues running the test, which later fails due to the missing
OVS configuration.

Fix the exit code handling in ovs-dpctl.py so that pmtu.sh can detect
that the setup did not complete successfully and fall back to
ovs-vsctl.

Signed-off-by: Aleksei Oladko <aleksey.oladko@virtuozzo.com>
Reviewed-by: Aaron Conole <aconole@redhat.com>
Link: https://patch.msgid.link/20260306000127.519064-3-aleksey.oladko@virtuozzo.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-03-06 19:26:23 -08:00
Eduard Zingerman
d87c9305a8 Revert "selftests/bpf: Update reg_bound range refinement logic"
This reverts commit da653de268.
Removed logic is now covered by range_refine_in_halves()
which handles both 32-bit and 64-bit refinements.

Suggested-by: Paul Chaignon <paul.chaignon@gmail.com>
Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/r/20260306-bpf-32-bit-range-overflow-v3-3-f7f67e060a6b@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2026-03-06 18:16:17 -08:00
Eduard Zingerman
f81fdfd167 selftests/bpf: test refining u32/s32 bounds when ranges cross min/max boundary
Two test cases for signed/unsigned 32-bit bounds refinement
when s32 range crosses the sign boundary:
- s32 range [S32_MIN..1] overlapping with u32 range [3..U32_MAX],
  s32 range tail before sign boundary overlaps with u32 range.
- s32 range [-3..5] overlapping with u32 range [0..S32_MIN+3],
  s32 range head after the sign boundary overlaps with u32 range.

This covers both branches added in the __reg32_deduce_bounds().

Also, crossing_32_bit_signed_boundary_2() no longer triggers invariant
violations.

Reviewed-by: Emil Tsalapatis <emil@etsalapatis.com>
Reviewed-by: Paul Chaignon <paul.chaignon@gmail.com>
Acked-by: Shung-Hsi Yu <shung-hsi.yu@suse.com>
Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/r/20260306-bpf-32-bit-range-overflow-v3-2-f7f67e060a6b@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2026-03-06 18:16:17 -08:00
Eduard Zingerman
fbc7aef517 bpf: Fix u32/s32 bounds when ranges cross min/max boundary
Same as in __reg64_deduce_bounds(), refine s32/u32 ranges
in __reg32_deduce_bounds() in the following situations:

- s32 range crosses U32_MAX/0 boundary, positive part of the s32 range
  overlaps with u32 range:

  0                                                   U32_MAX
  |  [xxxxxxxxxxxxxx u32 range xxxxxxxxxxxxxx]              |
  |----------------------------|----------------------------|
  |xxxxx s32 range xxxxxxxxx]                       [xxxxxxx|
  0                     S32_MAX S32_MIN                    -1

- s32 range crosses U32_MAX/0 boundary, negative part of the s32 range
  overlaps with u32 range:

  0                                                   U32_MAX
  |              [xxxxxxxxxxxxxx u32 range xxxxxxxxxxxxxx]  |
  |----------------------------|----------------------------|
  |xxxxxxxxx]                       [xxxxxxxxxxxx s32 range |
  0                     S32_MAX S32_MIN                    -1

- No refinement if ranges overlap in two intervals.

This helps for e.g. consider the following program:

   call %[bpf_get_prandom_u32];
   w0 &= 0xffffffff;
   if w0 < 0x3 goto 1f;    // on fall-through u32 range [3..U32_MAX]
   if w0 s> 0x1 goto 1f;   // on fall-through s32 range [S32_MIN..1]
   if w0 s< 0x0 goto 1f;   // range can be narrowed to  [S32_MIN..-1]
   r10 = 0;
1: ...;

The reg_bounds.c selftest is updated to incorporate identical logic,
refinement based on non-overflowing range halves:

  ((x ∩ [0, smax]) ∩ (y ∩ [0, smax])) ∪
  ((x ∩ [smin,-1]) ∩ (y ∩ [smin,-1]))

Reported-by: Andrea Righi <arighi@nvidia.com>
Reported-by: Emil Tsalapatis <emil@etsalapatis.com>
Closes: https://lore.kernel.org/bpf/aakqucg4vcujVwif@gpd4/T/
Reviewed-by: Emil Tsalapatis <emil@etsalapatis.com>
Acked-by: Shung-Hsi Yu <shung-hsi.yu@suse.com>
Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/r/20260306-bpf-32-bit-range-overflow-v3-1-f7f67e060a6b@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2026-03-06 18:16:06 -08:00
Aleksei Oladko
21c0dc7cdd selftests: net: forwarding: fix IPv6 address leak in cleanup
Several forwarding tests (e.g., gre_multipath.sh) initialize both IPv4
and IPv6 addresses using simple_if_init, but only clean up IPv4
in simple_if_fini. This leaves stale IPv6 addresses on the interfaces,
which causes subsequent tests to fail when they encounter unexpected
address configuration.

The issue can be reproduced by running tests in sequence:
  # run_kselftest.sh -t net/forwarding:ipip_hier_gre.sh
  # run_kselftest.sh -t net/forwarding:min_max_mtu.sh
  TAP version 13
  1..1
  # timeout set to 0
  # selftests: net/forwarding: min_max_mtu.sh
  # TEST: ping                                                          [ OK ]
  # TEST: ping6                                                         [ OK ]
  # TEST: Test maximum MTU configuration                                [ OK ]
  # TEST: Test traffic, packet size is maximum MTU                      [FAIL]
  #       Ping6, packet size: 65487 succeeded, but should have failed
  # TEST: Test minimum MTU configuration                                [ OK ]
  # TEST: Test traffic, packet size is minimum MTU                      [ OK ]
  not ok 1 selftests: net/forwarding: min_max_mtu.sh # exit=1

Fix this by removing the unused IPv6 argument from simple_if_init in
tests that don't use IPv6 (gre_multipath.sh, ipip_lib.sh), and by
adding the missing IPv6 argument to simple_if_fini in tests that
use IPv6 (gre_multipath_nh.sh, gre_multipath_nh_res.sh).

Signed-off-by: Aleksei Oladko <aleksey.oladko@virtuozzo.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Link: https://patch.msgid.link/20260305211000.515301-1-aleksey.oladko@virtuozzo.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-03-06 17:15:33 -08:00
Dragos Tatulea
507ccb668f selftests: drv-net: iou-zcrx: wait for memory cleanup of probe run
The large chunks test does a probe run of iou-zcrx before it runs the
actual test. After the probe run finishes, the context will still exist
until the deferred io_uring teardown. When running iou-zcrx the second
time, io_uring_register_ifq() can return -EEXIST due to the existence of
the old context.

The fix is simple: wait for the context teardown using the new
mp_clear_wait() utility before running the second instance of iou-zcrx.

Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com>
Link: https://patch.msgid.link/20260305080446.897628-2-dtatulea@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-03-06 15:36:37 -08:00
David Wei
37d24994a7 selftests/net: Add netkit container ping test
Add a basic ping test using NetDrvContEnv that sets up a netkit pair,
with one end in a netns. Use LOCAL_PREFIX_V6 and nk_forward BPF program
to ping from a remote host to the netkit in netns.

Signed-off-by: David Wei <dw@davidwei.uk>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://patch.msgid.link/20260305181803.2912736-5-dw@davidwei.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-03-06 13:16:40 -08:00
David Wei
3f74d5bb80 selftests/net: Add env for container based tests
Add an env NetDrvContEnv for container based selftests. This automates
the setup of a netns, netkit pair with one inside the netns, and a BPF
program that forwards skbs from the NETIF host inside the container.

Currently only netkit is used, but other virtual netdevs e.g. veth can
be used too.

Expect netkit container datapath selftests to have a publicly routable
IP prefix to assign to netkit in a container, such that packets will
land on eth0. The BPF skb forward program will then forward such packets
from the host netns to the container netns.

Signed-off-by: David Wei <dw@davidwei.uk>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://patch.msgid.link/20260305181803.2912736-4-dw@davidwei.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-03-06 13:11:17 -08:00
David Wei
68ab8ba927 selftests/net: Export Netlink class via lib.py
Making rtnl newlink calls requires constants defined in Netlink class in
pyynl. Export it.

Signed-off-by: David Wei <dw@davidwei.uk>
Link: https://patch.msgid.link/20260305181803.2912736-3-dw@davidwei.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-03-06 13:11:17 -08:00
David Wei
01748996f6 selftests/net: Add bpf skb forwarding program
Add nk_forward.bpf.c, a BPF program that forwards skbs matching some IPv6
prefix received on eth0 ifindex to a specified netkit ifindex. This will
be needed by netkit container tests.

Signed-off-by: David Wei <dw@davidwei.uk>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Stanislav Fomichev <sdf@fomichev.me>
Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org>
Link: https://patch.msgid.link/20260305181803.2912736-2-dw@davidwei.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-03-06 13:11:17 -08:00
Linus Torvalds
3593e678f5 linux_kselftest-kunit-fixes-7.0-rc3
kunit:
 - Fixes rust warnings when CONFIG_PRINTK is disabled.
 - Reduces stack usage in kunit_run_tests() to fix warnings when
   CONFIG_FRAME_WARN is set to a relatively low value.
 - Updates email address for David Gow.
 
 kunit tool:
 - Copies caller args in run_kernel to prevent mutation.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEPZKym/RZuOCGeA/kCwJExA0NQxwFAmmrNf4ACgkQCwJExA0N
 QxyVNxAAtemv6AH1pP/cR23teXd5uuBThTM8EASEAC9/VJmCuPjg9mb6df816EXx
 2ZrTKGR5k/5basYDhNKIAJTGGjT3pcau7OKmD7Piy3bbhJEq5kb31rVbbmCV1STl
 vzJcBbSNP5Z+otz/flmewoeiL+xYarQvCrfFll5Y+gMer3HFevG0y0rCpEe0+7tH
 avgq01m7rHFg0DLYxJhjE3NFgFlgYaKdHyQPrGYOtZUXJDSNUO1cULJ8ZrMG3skc
 JWngB/5eS6hP/3QSaBm9coHKbrAYatpPWLfg9wtgfuj3be2oEAnWeynb/vWrdEV3
 uid96CEs1UPG8Z/jTtRK86q1md+5I2QX6s7B74EXnzCo0bg150h1qtT2J/za1Kxu
 7CP6Rq59VHCL6SMpQ4cks00zpWffCCtAphto2yk9ZphbisyVUGN8zEiCdd0FrLob
 putdMLeJVLXbnWFCnLA/8gFfx6Inp+PXNoVr+6LNZo/qB4W/EJOLDUvjaVJ7abpY
 yUxvJVj0+dwOlYUYnpldGCz9pfywgIyLI7ERtu4KIwJ5EMsVo8yJgykEd2MkQbZ+
 2Db3omyalnAn/vXP5p48vawI9oj6scMNxukBRTdqZsQGRma/jj6FnZCi77llQY+g
 V6pTQPjJRhRGLIwow+dCkJ4CucafNacp1RVuf61MM/I03XOQEwU=
 =j8w0
 -----END PGP SIGNATURE-----

Merge tag 'linux_kselftest-kunit-fixes-7.0-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest

Pull kunit fixes from Shuah Khan:

 - Fix rust warnings when CONFIG_PRINTK is disabled

 - Reduce stack usage in kunit_run_tests() to fix warnings when
   CONFIG_FRAME_WARN is set to a relatively low value

 - Update email address for David Gow

 - Copy caller args in kunit tool in run_kernel to prevent mutation

* tag 'linux_kselftest-kunit-fixes-7.0-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
  kunit: reduce stack usage in kunit_run_tests()
  kunit: tool: copy caller args in run_kernel to prevent mutation
  rust: kunit: fix warning when !CONFIG_PRINTK
  MAINTAINERS: Update email address for David Gow
2026-03-06 12:34:49 -08:00
Linus Torvalds
48976c0eba hid-for-linus-2026030601
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEoEVH9lhNrxiMPSyI7MXwXhnZSjYFAmmqkoIACgkQ7MXwXhnZ
 SjZg1xAAo8zWO1w8U99bYckAqgjWNEUbNiBo3370HbaNiOknW4B2Hzib4LIex4NE
 I/b1W1Wjq9ifgLLdHKP18eWVk4JgKupruTyoomfBrkHZSMbh8Y17GRdzpEj+4UEj
 HjmWTtNGKxJxpgVlSxPQuh1jNmkJm7FIYOMoSEM0CQJWHuUYERDrKsvJoHdZrX+p
 /C1Mn4+4o32CN2nUR7eRsvSisXjy/nTbNllLJNYZCCNBcvh3FaJNpEH5F+XMwg3n
 QKa5Ex0ZWMur1dN3y7xmJNdZszJoFFwb4UL1eWG3N0rIoHZpzbwdHDu2IXqJc0Jq
 HmdWqIceiScuhyjHNZ/DEes+GLMyIUUApO9Bs3Zmnde3yPtkbXc5TOvFeSL0ih+5
 WZjcW0b3rDhwibilgZA9ZuZfD44ajZsAOeZWJrc/7mNYoploJCc6ZOo3Ze7MJqjA
 tNnx33eX4+FpZndXVBAdkmM8iNTVNIaJ2SGPwauAfOUADaskdg83NFVa/q3IMAo3
 44Sw8Wk5E59J5vxGlceqvzySDgO5uB8v4J3kHnDKxOPtNivvV+38cnLAwnaI+/+h
 adGcCC9B67ekw02gAOOu+ExiijCCYLfKxIUckCKvGBiIXJELjzx98+H863/m+QwE
 gyvKdLM0EFu6qnAycma5IXtsM0aPlNz/WHOw/SQgGeMz2NePtLw=
 =fPAZ
 -----END PGP SIGNATURE-----

Merge tag 'hid-for-linus-2026030601' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid

Pull HID fixes from Benjamin Tissoires:

 - fix a few memory leaks (Günther Noack)

 - fix potential kernel crashes in cmedia, creative-sb0540 and zydacron
   (Greg Kroah-Hartman)

 - fix NULL pointer dereference in pidff (Tomasz Pakuła)

 - fix battery reporting for Apple Magic Trackpad 2 (Julius Lehmann)

 - mcp2221 proper handling of failed read operation (Romain Sioen)

 - various device quirks / device ID additions

* tag 'hid-for-linus-2026030601' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid:
  HID: mcp2221: cancel last I2C command on read error
  HID: asus: add xg mobile 2023 external hardware support
  HID: multitouch: Keep latency normal on deactivate for reactivation gesture
  HID: apple: Add EPOMAKER TH87 to the non-apple keyboards list
  HID: intel-ish-hid: ipc: Add Nova Lake-H/S PCI device IDs
  selftests: hid: tests: test_wacom_generic: add tests for display devices and opaque devices
  HID: multitouch: new class MT_CLS_EGALAX_P80H84
  HID: magicmouse: fix battery reporting for Apple Magic Trackpad 2
  HID: pidff: Fix condition effect bit clearing
  HID: Add HID_CLAIMED_INPUT guards in raw_event callbacks missing them
  HID: asus: avoid memory leak in asus_report_fixup()
  HID: magicmouse: avoid memory leak in magicmouse_report_fixup()
  HID: apple: avoid memory leak in apple_report_fixup()
  HID: Document memory allocation properties of report_fixup()
2026-03-06 10:00:58 -08:00
Tejun Heo
a0b0f6c7d7 Merge branch 'for-7.1' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup into for-7.1
To receive 5b30afc20b ("cgroup: Expose some cgroup helpers") which will be
used by sub-sched support.

Signed-off-by: Tejun Heo <tj@kernel.org>
2026-03-06 07:48:21 -10:00
Tejun Heo
32e940f2bd Merge branch 'for-7.0-fixes' into for-7.1
To prepare for hierarchical scheduling patchset which will cause multiple
conflicts otherwise.

Signed-off-by: Tejun Heo <tj@kernel.org>
2026-03-06 07:46:32 -10:00
Marcos Paulo de Souza
920e5001f4 selftests: livepatch: test-ftrace: livepatch a traced function
This is basically the inverse case of commit 474eecc882
("selftests: livepatch: test if ftrace can trace a livepatched function")
but ensuring that livepatch would work on a traced function.

Signed-off-by: Marcos Paulo de Souza <mpdesouza@suse.com>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Tested-by: Petr Mladek <pmladek@suse.com>
Acked-by: Miroslav Benes <mbenes@suse.cz>
Acked-by: Joe Lawrence <joe.lawrence@redhat.com>
Link: https://patch.msgid.link/20260220-lp-test-trace-v1-1-4b6703cd01a6@suse.com
[pmladek: Fixed a typo found by Joe.]
Signed-off-by: Petr Mladek <pmladek@suse.com>
2026-03-06 14:54:28 +01:00
Yifan Wu
d87c828daa selftest/arm64: Fix sve2p1_sigill() to hwcap test
The FEAT_SVE2p1 is indicated by ID_AA64ZFR0_EL1.SVEver. However,
the BFADD requires the FEAT_SVE_B16B16, which is indicated by
ID_AA64ZFR0_EL1.B16B16. This could cause the test to incorrectly
fail on a CPU that supports FEAT_SVE2.1 but not FEAT_SVE_B16B16.

LD1Q Gather load quadwords which is decoded from SVE encodings and
implied by FEAT_SVE2p1.

Fixes: c5195b027d ("kselftest/arm64: Add SVE 2.1 to hwcap test")
Signed-off-by: Yifan Wu <wuyifan50@huawei.com>
Reviewed-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Will Deacon <will@kernel.org>
2026-03-06 11:54:26 +00:00
Andrey Grodzovsky
a28441dd29 selftests/bpf: Add tests for kprobe.session optimization
Extend existing kprobe_multi_test subtests to validate the
kprobe.session exact function name optimization:

In kprobe_multi_session.c, add test_kprobe_syms which attaches a
kprobe.session program to an exact function name (bpf_fentry_test1)
exercising the fast syms[] path that bypasses kallsyms parsing.  It
calls session_check() so bpf_fentry_test1 is hit by both the wildcard
and exact probes, and test_session_skel_api validates
kprobe_session_result[0] == 4 (entry + return from each probe).

In test_attach_api_fails, add fail_7 and fail_8 verifying error code
consistency between the wildcard pattern path (slow, parses kallsyms)
and the exact function name path (fast, uses syms[] array).  Both
paths must return -ENOENT for non-existent functions.

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@crowdstrike.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20260302200837.317907-4-andrey.grodzovsky@crowdstrike.com
2026-03-05 15:14:53 -08:00
Sun Jian
7f20d371fd selftests/bpf: bpf_cookie: Make perf_event subtest trigger reliably
The perf_event subtest relies on SW_CPU_CLOCK sampling to trigger the BPF
program, but the current CPU burn loop can be too short on slower systems
and may fail to generate any overflow sample. This leaves pe_res unchanged
and makes the test flaky.

Make burn_cpu() take a loop count and use a longer burn only for the
perf_event subtest. Also scope perf_event_open() to the current task to
avoid wasting samples on unrelated activity.

Signed-off-by: Sun Jian <sun.jian.kdev@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/bpf/20260228074555.122950-3-sun.jian.kdev@gmail.com
2026-03-05 15:07:40 -08:00
Sun Jian
74d3305e62 selftests/bpf: bpf_cookie: Skip kprobe_multi tests without bpf_testmod
The kprobe_multi subtests rely on bpf_testmod fentry ksyms.

When bpf_testmod isn't available, libbpf fails to resolve
bpf_testmod_fentry_test* and skeleton load fails with -ESRCH, causing
false failures.

Skip these subtests when env.has_testmod is false.

Signed-off-by: Sun Jian <sun.jian.kdev@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/bpf/20260228074555.122950-2-sun.jian.kdev@gmail.com
2026-03-05 15:07:40 -08:00
Josef Bacik
fefeeec612 selftests/bpf: Add test for btf__add_btf() with split BTF sources
Add a test that verifies btf__add_btf() correctly handles merging
multiple split BTF objects that share the same base BTF. The test
creates two sibling split BTFs on a common base, merges them into
a combined split BTF, and validates that base type references are
preserved while split type references are properly remapped.

Assisted-by: Claude:claude-opus-4-6

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Reviewed-by: Alan Maguire <alan.maguire@oracle.com>
Link: https://lore.kernel.org/bpf/64a8c947bff1ae89efa9ba8c099466477762490f.1772657690.git.josef@toxicpanda.com
2026-03-05 15:03:04 -08:00
Paul E. McKenney
78c2ce0fd6 scftorture: Update due to x86 not supporting none/voluntary preemption
As of v7.0-rc1, architectures that support preemption, including x86 and
arm64, no longer support CONFIG_PREEMPT_NONE or CONFIG_PREEMPT_VOLUNTARY.
Attempting to build kernels with these two Kconfig options results in
.config errors.  This commit therefore switches such scftorture scenarios
to CONFIG_PREEMPT_LAZY.

Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Boqun Feng <boqun@kernel.org>
Link: https://patch.msgid.link/20260303235903.1967409-4-paulmck@kernel.org
2026-03-05 13:11:10 -08:00
Paul E. McKenney
3c6ddb58f6 refscale: Update due to x86 not supporting none/voluntary preemption
As of v7.0-rc1, architectures that support preemption, including x86 and
arm64, no longer support CONFIG_PREEMPT_NONE or CONFIG_PREEMPT_VOLUNTARY.
Attempting to build kernels with these two Kconfig options results in
.config errors.  This commit therefore switches such refscale scenarios
to CONFIG_PREEMPT_LAZY.

Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Boqun Feng <boqun@kernel.org>
Link: https://patch.msgid.link/20260303235903.1967409-3-paulmck@kernel.org
2026-03-05 13:11:07 -08:00