In chacha_zvkb, avoid using the s0 register, which is the frame pointer,
by reallocating KEY0 to t5. This makes stack traces available if e.g. a
crash happens in chacha_zvkb.
No frame pointer maintenance is otherwise required since this is a leaf
function.
Signed-off-by: Vivian Wang <wangruikang@iscas.ac.cn>
Fixes: bb54668837 ("crypto: riscv - add vector crypto accelerated ChaCha20")
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20251202-riscv-chacha_zvkb-fp-v2-1-7bd00098c9dc@iscas.ac.cn
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Add a cond_lock annotation for lockref_put_or_lock to make sparse
happy with using it. Note that for this the return value has to be
double-inverted as the return value convention of lockref_put_or_lock
is inverted compared to _trylock conventions expected by __cond_lock,
as lockref_put_or_lock returns true when it did not need to take the
lock.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Framework updates:
- fonts: Add Terminus 10x18 console font [Neilay Kharwadkar]
Driver fixes:
- gbefb: fix to use physical address instead of dma address [René Rebe]
- tcx.c fix mem_map to correct smem_start offset [René Rebe]
- pxafb: Fix multiple clamped values in pxafb_adjust_timing [Thorsten Blum]
- ssd1307fb: fix potential page leak in ssd1307fb_probe() [Abdun Nihaal]
Cleanups:
- vga16fb: Request memory region [Javier Garcia]
- vga16fb: replace printk() with dev_*() in probe [Vivek BalachandharTN]
- vesafb, gxt4500fb, tridentfb: Use dev_dbg() instead of printk() [Javier Garcia]
- i810: use dev_info() [Shi Hao]
-----BEGIN PGP SIGNATURE-----
iHUEABYKAB0WIQS86RI+GtKfB8BJu973ErUQojoPXwUCaTSM3gAKCRD3ErUQojoP
X+J6AP4+xxGAluBRwJwUXX+imIKq9DxkufXL4UzECl1WGeNVMAEAxAEcV+5RtvtQ
lh7yk5tUr5yjntP1HRaVT93aGhpv6QQ=
=0Oyt
-----END PGP SIGNATURE-----
Merge tag 'fbdev-for-6.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/linux-fbdev
Pull fbdev updates from Helge Deller:
"The Termius 10x18 console bitmap font has been added. It is good
match for modern 13-16 inch laptop displays with resolutions like
1280x800 and 1440x900 pixels.
The gbefb and tcx.c drivers got some fixes to restore X11 support,
pxafb was not actually clamping input values and the ssd1307fb driver
leaked memory in the failure path.
The other patches convert some common drivers to use dev_info() and
dev_dbg() instead of printk(). Summary:
Framework updates:
- fonts: Add Terminus 10x18 console font [Neilay Kharwadkar]
Driver fixes:
- gbefb: fix to use physical address instead of dma address [René Rebe]
- tcx.c fix mem_map to correct smem_start offset [René Rebe]
- pxafb: Fix multiple clamped values in pxafb_adjust_timing [Thorsten Blum]
- ssd1307fb: fix potential page leak in ssd1307fb_probe() [Abdun Nihaal]
Cleanups:
- vga16fb: Request memory region [Javier Garcia]
- vga16fb: replace printk() with dev_*() in probe [Vivek BalachandharTN]
- vesafb, gxt4500fb, tridentfb: Use dev_dbg() instead of printk() [Javier Garcia]
- i810: use dev_info() [Shi Hao]"
* tag 'fbdev-for-6.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/linux-fbdev:
fbdev: ssd1307fb: fix potential page leak in ssd1307fb_probe()
fbdev: i810: use appopriate log interface dev_info
fbdev: tridentfb: replace printk() with dev_*() in probe
lib/fonts: Add Terminus 10x18 console font
fbdev: pxafb: Fix multiple clamped values in pxafb_adjust_timing
fbdev: tcx.c fix mem_map to correct smem_start offset
fbdev: gxt4500fb: Use dev_err instead of printk
fbdev: gbefb: fix to use physical address instead of dma address
fbdev: vesafb: Use dev_* fn's instead printk
fbdev: vga16fb: Request memory region
fbdev: vga16fb: replace printk() with dev_*() in probe
- The 6 patch series "panic: sys_info: Refactor and fix a potential
issue" from Andy Shevchenko fixes a build issue and does some cleanup in
ib/sys_info.c.
- The 9 patch series "Implement mul_u64_u64_div_u64_roundup()" from
David Laight enhances the 64-bit math code on behalf of a PWM driver and
beefs up the test module for these library functions.
- The 2 patch series "scripts/gdb/symbols: make BPF debug info available
to GDB" from Ilya Leoshkevich makes BPF symbol names, sizes, and line
numbers available to the GDB debugger.
- The 4 patch series "Enable hung_task and lockup cases to dump system
info on demand" from Feng Tang adds a sysctl which can be used to cause
additional info dumping when the hung-task and lockup detectors fire.
- The 6 patch series "lib/base64: add generic encoder/decoder, migrate
users" from Kuan-Wei Chiu adds a general base64 encoder/decoder to lib/
and migrates several users away from their private implementations.
- The 2 patch series "rbree: inline rb_first() and rb_last()" from Eric
Dumazet makes TCP a little faster.
- The 9 patch series "liveupdate: Rework KHO for in-kernel users" from
Pasha Tatashin reworks the KEXEC Handover interfaces in preparation for
Live Update Orchestrator (LUO), and possibly for other future clients.
- The 13 patch series "kho: simplify state machine and enable dynamic
updates" from Pasha Tatashin increases the flexibility of KEXEC
Handover. Also preparation for LUO.
- The 18 patch series "Live Update Orchestrator" from Pasha Tatashin is
a major new feature targeted at cloud environments. Quoting the [0/N]:
This series introduces the Live Update Orchestrator, a kernel subsystem
designed to facilitate live kernel updates using a kexec-based reboot.
This capability is critical for cloud environments, allowing hypervisors
to be updated with minimal downtime for running virtual machines. LUO
achieves this by preserving the state of selected resources, such as
memory, devices and their dependencies, across the kernel transition.
As a key feature, this series includes support for preserving memfd file
descriptors, which allows critical in-memory data, such as guest RAM or
any other large memory region, to be maintained in RAM across the kexec
reboot.
Mike Rappaport merits a mention here, for his extensive review and
testing work.
- The 3 patch series "kexec: reorganize kexec and kdump sysfs" from
Sourabh Jain moves the kexec and kdump sysfs entries from /sys/kernel/
to /sys/kernel/kexec/ and adds back-compatibility symlinks which can
hopefully be removed one day.
- The 2 patch series "kho: fixes for vmalloc restoration" from Mike
Rapoport fixes a BUG which was being hit during KHO restoration of
vmalloc() regions.
-----BEGIN PGP SIGNATURE-----
iHUEABYKAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCaTSAkQAKCRDdBJ7gKXxA
jrkiAP9QKfsRv46XZaM5raScjY1ayjP+gqb2rgt6BQ/gZvb2+wD/cPAYOR6BiX52
n0pVpQmG5P/KyOmpLztn96ejL4heKwQ=
=JY96
-----END PGP SIGNATURE-----
Merge tag 'mm-nonmm-stable-2025-12-06-11-14' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull non-MM updates from Andrew Morton:
- "panic: sys_info: Refactor and fix a potential issue" (Andy Shevchenko)
fixes a build issue and does some cleanup in ib/sys_info.c
- "Implement mul_u64_u64_div_u64_roundup()" (David Laight)
enhances the 64-bit math code on behalf of a PWM driver and beefs up
the test module for these library functions
- "scripts/gdb/symbols: make BPF debug info available to GDB" (Ilya Leoshkevich)
makes BPF symbol names, sizes, and line numbers available to the GDB
debugger
- "Enable hung_task and lockup cases to dump system info on demand" (Feng Tang)
adds a sysctl which can be used to cause additional info dumping when
the hung-task and lockup detectors fire
- "lib/base64: add generic encoder/decoder, migrate users" (Kuan-Wei Chiu)
adds a general base64 encoder/decoder to lib/ and migrates several
users away from their private implementations
- "rbree: inline rb_first() and rb_last()" (Eric Dumazet)
makes TCP a little faster
- "liveupdate: Rework KHO for in-kernel users" (Pasha Tatashin)
reworks the KEXEC Handover interfaces in preparation for Live Update
Orchestrator (LUO), and possibly for other future clients
- "kho: simplify state machine and enable dynamic updates" (Pasha Tatashin)
increases the flexibility of KEXEC Handover. Also preparation for LUO
- "Live Update Orchestrator" (Pasha Tatashin)
is a major new feature targeted at cloud environments. Quoting the
cover letter:
This series introduces the Live Update Orchestrator, a kernel
subsystem designed to facilitate live kernel updates using a
kexec-based reboot. This capability is critical for cloud
environments, allowing hypervisors to be updated with minimal
downtime for running virtual machines. LUO achieves this by
preserving the state of selected resources, such as memory,
devices and their dependencies, across the kernel transition.
As a key feature, this series includes support for preserving
memfd file descriptors, which allows critical in-memory data, such
as guest RAM or any other large memory region, to be maintained in
RAM across the kexec reboot.
Mike Rappaport merits a mention here, for his extensive review and
testing work.
- "kexec: reorganize kexec and kdump sysfs" (Sourabh Jain)
moves the kexec and kdump sysfs entries from /sys/kernel/ to
/sys/kernel/kexec/ and adds back-compatibility symlinks which can
hopefully be removed one day
- "kho: fixes for vmalloc restoration" (Mike Rapoport)
fixes a BUG which was being hit during KHO restoration of vmalloc()
regions
* tag 'mm-nonmm-stable-2025-12-06-11-14' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (139 commits)
calibrate: update header inclusion
Reinstate "resource: avoid unnecessary lookups in find_next_iomem_res()"
vmcoreinfo: track and log recoverable hardware errors
kho: fix restoring of contiguous ranges of order-0 pages
kho: kho_restore_vmalloc: fix initialization of pages array
MAINTAINERS: TPM DEVICE DRIVER: update the W-tag
init: replace simple_strtoul with kstrtoul to improve lpj_setup
KHO: fix boot failure due to kmemleak access to non-PRESENT pages
Documentation/ABI: new kexec and kdump sysfs interface
Documentation/ABI: mark old kexec sysfs deprecated
kexec: move sysfs entries to /sys/kernel/kexec
test_kho: always print restore status
kho: free chunks using free_page() instead of kfree()
selftests/liveupdate: add kexec test for multiple and empty sessions
selftests/liveupdate: add simple kexec-based selftest for LUO
selftests/liveupdate: add userspace API selftests
docs: add documentation for memfd preservation via LUO
mm: memfd_luo: allow preserving memfd
liveupdate: luo_file: add private argument to store runtime state
mm: shmem: export some functions to internal.h
...
- Arch Topology:
- Move parse_acpi_topology() from arm64 to common code for reuse in RISC-V
- CPU:
- Expose housekeeping CPUs through /sys/devices/system/cpu/housekeeping
- Print a newline (or 0x0A) instead of '(null)' reading
/sys/devices/system/cpu/nohz_full when nohz_full= is not set
- debugfs
- Remove (broken) 'no-mount' mode
- Remove redundant access mode checks in debugfs_get_tree() and
debugfs_create_*() functions
- Devres:
- Remove unused devm_free_percpu() helper
- Move devm_alloc_percpu() from device.h to devres.h
- Firmware Loader:
- Replace simple_strtol() with kstrtoint()
- Do not call cancel_store() when no upload is in progress
- kernfs:
- Increase struct super_block::maxbytes to MAX_LFS_FILESIZE
- Fix a missing unwind path in __kernfs_new_node()
- Misc:
- Increase the name size in struct auxiliary_device_id to 40 characters
- Replace system_unbound_wq with system_dfl_wq and add WQ_PERCPU to
alloc_workqueue()
- Platform:
- Replace ERR_PTR() with IOMEM_ERR_PTR() in platform ioremap functions
- Rust:
- Auxiliary:
- Unregister auxiliary device on parent device unbind
- Move parent() to impl Device; implement device context aware parent() for
Device<Bound>
- Illustrate how to safely obtain a driver's device private data when
calling from an auxiliary driver into the parant device driver
- DebugFs:
- Implement support for binary large objects
- Device:
- Let probe() return the driver's device private data as pinned initializer,
i.e. impl PinInit<Self, Error>
- Implement safe accessor for a driver's device private data for
Device<Bound> (returned reference can't out-live driver binding and
guarantees the correct private data type)
- Implement AsBusDevice trait, to be used by class device abstractions to
derive the bus device type of the parent device
- DMA:
- Store raw pointer of allocation as NonNull
- Use start_ptr() and start_ptr_mut() to inherit correct mutability of self
- FS:
- Add file::Offset type alias
- I2C:
- Add abstractions for I2C device / driver infrastructure
- Implement abstractions for manual I2C device registrations
- I/O:
- Use "kernel vertical" style for imports
- Define ResourceSize as resource_size_t
- Move ResourceSize to top-level I/O module
- Add type alias for phys_addr_t
- Implement Rust version of read_poll_timeout_atomic()
- PCI:
- Use "kernel vertical" style for imports
- Move I/O and IRQ infrastructure to separate files
- Add support for PCI interrupt vectors
- Implement TryInto<IrqRequest<'a>> for IrqVector<'a> to convert an
IrqVector bound to specific pci::Device into an IrqRequest bound to the
same pci::Device's parent Device
- Leverage pin_init_scope() to get rid of redundant Result in IRQ methods
- PinInit:
- Add {pin_}init_scope() to execute code before creating an initializer
- Platform:
- Leverage pin_init_scope() to get rid of redundant Result in IRQ methods
- Timekeeping:
- Implement abstraction of udelay()
- Uaccess:
- Implement read_slice_partial() and read_slice_file() for UserSliceReader
- Implement write_slice_partial() and write_slice_file() for UserSliceWriter
- sysfs
- Prepare the constification of struct attribute
-----BEGIN PGP SIGNATURE-----
iHUEABYKAB0WIQS2q/xV6QjXAdC7k+1FlHeO1qrKLgUCaTAehwAKCRBFlHeO1qrK
LmzvAP0TWwKUGIduccknIa1AFvBM92lWVZptSysotv3SLFZq3wD9GBLIENt1DkEk
s+GBqbobPgoyaodaysqLQ/SNqF9TcAM=
=Wutw
-----END PGP SIGNATURE-----
Merge tag 'driver-core-6.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/driver-core/driver-core
Pull driver core updates from Danilo Krummrich:
"Arch Topology:
- Move parse_acpi_topology() from arm64 to common code for reuse in
RISC-V
CPU:
- Expose housekeeping CPUs through /sys/devices/system/cpu/housekeeping
- Print a newline (or 0x0A) instead of '(null)' reading
/sys/devices/system/cpu/nohz_full when nohz_full= is not set
debugfs
- Remove (broken) 'no-mount' mode
- Remove redundant access mode checks in debugfs_get_tree() and
debugfs_create_*() functions
Devres:
- Remove unused devm_free_percpu() helper
- Move devm_alloc_percpu() from device.h to devres.h
Firmware Loader:
- Replace simple_strtol() with kstrtoint()
- Do not call cancel_store() when no upload is in progress
kernfs:
- Increase struct super_block::maxbytes to MAX_LFS_FILESIZE
- Fix a missing unwind path in __kernfs_new_node()
Misc:
- Increase the name size in struct auxiliary_device_id to 40
characters
- Replace system_unbound_wq with system_dfl_wq and add WQ_PERCPU to
alloc_workqueue()
Platform:
- Replace ERR_PTR() with IOMEM_ERR_PTR() in platform ioremap
functions
Rust:
- Auxiliary:
- Unregister auxiliary device on parent device unbind
- Move parent() to impl Device; implement device context aware
parent() for Device<Bound>
- Illustrate how to safely obtain a driver's device private data
when calling from an auxiliary driver into the parant device
driver
- DebugFs:
- Implement support for binary large objects
- Device:
- Let probe() return the driver's device private data as pinned
initializer, i.e. impl PinInit<Self, Error>
- Implement safe accessor for a driver's device private data for
Device<Bound> (returned reference can't out-live driver binding
and guarantees the correct private data type)
- Implement AsBusDevice trait, to be used by class device
abstractions to derive the bus device type of the parent device
- DMA:
- Store raw pointer of allocation as NonNull
- Use start_ptr() and start_ptr_mut() to inherit correct
mutability of self
- FS:
- Add file::Offset type alias
- I2C:
- Add abstractions for I2C device / driver infrastructure
- Implement abstractions for manual I2C device registrations
- I/O:
- Use "kernel vertical" style for imports
- Define ResourceSize as resource_size_t
- Move ResourceSize to top-level I/O module
- Add type alias for phys_addr_t
- Implement Rust version of read_poll_timeout_atomic()
- PCI:
- Use "kernel vertical" style for imports
- Move I/O and IRQ infrastructure to separate files
- Add support for PCI interrupt vectors
- Implement TryInto<IrqRequest<'a>> for IrqVector<'a> to convert
an IrqVector bound to specific pci::Device into an IrqRequest
bound to the same pci::Device's parent Device
- Leverage pin_init_scope() to get rid of redundant Result in IRQ
methods
- PinInit:
- Add {pin_}init_scope() to execute code before creating an
initializer
- Platform:
- Leverage pin_init_scope() to get rid of redundant Result in IRQ
methods
- Timekeeping:
- Implement abstraction of udelay()
- Uaccess:
- Implement read_slice_partial() and read_slice_file() for
UserSliceReader
- Implement write_slice_partial() and write_slice_file() for
UserSliceWriter
sysfs:
- Prepare the constification of struct attribute"
* tag 'driver-core-6.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/driver-core/driver-core: (75 commits)
rust: pci: fix build failure when CONFIG_PCI_MSI is disabled
debugfs: Fix default access mode config check
debugfs: Remove broken no-mount mode
debugfs: Remove redundant access mode checks
driver core: Check drivers_autoprobe for all added devices
driver core: WQ_PERCPU added to alloc_workqueue users
driver core: replace use of system_unbound_wq with system_dfl_wq
tick/nohz: Expose housekeeping CPUs in sysfs
tick/nohz: avoid showing '(null)' if nohz_full= not set
sysfs/cpu: Use DEVICE_ATTR_RO for nohz_full attribute
kernfs: fix memory leak of kernfs_iattrs in __kernfs_new_node
fs/kernfs: raise sb->maxbytes to MAX_LFS_FILESIZE
mod_devicetable: Bump auxiliary_device_id name size
sysfs: simplify attribute definition macros
samples/kobject: constify 'struct foo_attribute'
samples/kobject: add is_visible() callback to attribute group
sysfs: attribute_group: enable const variants of is_visible()
sysfs: introduce __SYSFS_FUNCTION_ALTERNATIVE()
sysfs: transparently handle const pointers in ATTRIBUTE_GROUPS()
sysfs: attribute_group: allow registration of const attribute
...
These updates came a little late, or were based on a later 6.18-rc
tag than the others:
- A new driver for cache management on cxl devices with memory shared
in a coherent cluster. This is part of the drivers/cache/ tree, but
unlike the other drivers that back the dma-mapping interfaces, this
one is needed only during CPU hotplug.
- A shared branch for reset controllers using swnode infrastructure
- Added support for new SoC variants in the Amlogic soc_device
identification
- Minor updates in Freescale, Microchip, Samsung, and Apple SoC drivers
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEo6/YBQwIrVS28WGKmmx57+YAGNkFAmkzAWoACgkQmmx57+YA
GNmyOw//ZR4ie0Mcr2NrQFx0eozSKQ2NYpb+bCqy96livGiEjPMrQXqiEv9XMNmw
V9c3749cXUUMvbd6sNxco75k/8jTMko/K3ZC5DwMlAytgp49b/LK9I3Lj3hUX3+q
5X4B1TrgH5J42pJ60kDstjiY17anNxP99ht7A1gd2ijp1GIfQxYvUrhGMsELqTxm
aAznnsYhwI3O2pmJZj2F2Kj4jen6tfTlQh37oIDoLdweXxI9VGjSRY38/TcsgE3E
o2EwTuyhimmk2iVN5GmSksGQNj1neeJe3QEjMImcn3asR2WtLQOQGOcIa7OSvF7d
LIE3uQTxtz2W/2CLmM6RHeUwOwBOz9AD0dZ+JGaZ63ePdypU0w8xyrKhMgw9Pq5F
Mtt4ml3w2zAfyV4VqmkiYdCAML2kkzPfZRYxhASlYcZ/iAylhCqHJXWJ5SSp8BTc
QY+aZS9RFAylNvx5qVyOtPeDEqKl0UAnYJHF6JGNQR/6vEKvMwxVJ0EEaAo1luXg
z7RQCC2MRXX9QPq6YQcJXc4u3jDMTNkbElQy2CAXBUdbRVgFJPjTdtEqg860Cml6
HHXCazeVAPwA88NR4zHBl6QKxfqp8Iezf9WHTz1WS5xq/71kfdzyG9/zKK8jpoLh
5MXuyGpWhngep+DmOmP0i43HBzP7vD7g7086x+4jozTwg/TjV/s=
=YvxQ
-----END PGP SIGNATURE-----
Merge tag 'soc-drivers-6.19-2' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc
Pull more SoC driver updates from Arnd Bergmann:
"These updates came a little late, or were based on a later 6.18-rc tag
than the others:
- A new driver for cache management on cxl devices with memory shared
in a coherent cluster. This is part of the drivers/cache/ tree, but
unlike the other drivers that back the dma-mapping interfaces, this
one is needed only during CPU hotplug.
- A shared branch for reset controllers using swnode infrastructure
- Added support for new SoC variants in the Amlogic soc_device
identification
- Minor updates in Freescale, Microchip, Samsung, and Apple SoC
drivers"
* tag 'soc-drivers-6.19-2' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (24 commits)
soc: samsung: exynos-pmu: fix device leak on regmap lookup
soc: samsung: exynos-pmu: Fix structure initialization
soc: fsl: qbman: use kmalloc_array() instead of kmalloc()
soc: fsl: qbman: add WQ_PERCPU to alloc_workqueue users
MAINTAINERS: Update email address for Christophe Leroy
MAINTAINERS: refer to intended file in STANDALONE CACHE CONTROLLER DRIVERS
cache: Support cache maintenance for HiSilicon SoC Hydra Home Agent
cache: Make top level Kconfig menu a boolean dependent on RISCV
MAINTAINERS: Add Jonathan Cameron to drivers/cache and add lib/cache_maint.c + header
arm64: Select GENERIC_CPU_CACHE_MAINTENANCE
lib: Support ARCH_HAS_CPU_CACHE_INVALIDATE_MEMREGION
soc: amlogic: meson-gx-socinfo: add new SoCs id
dt-bindings: arm: amlogic: meson-gx-ao-secure: support more SoCs
memregion: Support fine grained invalidate by cpu_cache_invalidate_memregion()
memregion: Drop unused IORES_DESC_* parameter from cpu_cache_invalidate_memregion()
dt-bindings: cache: sifive,ccache0: add a pic64gx compatible
MAINTAINERS: rename Microchip RISC-V entry
MAINTAINERS: add new soc drivers to Microchip RISC-V entry
soc: microchip: add mfd drivers for two syscon regions on PolarFire SoC
dt-bindings: soc: microchip: document the simple-mfd syscon on PolarFire SoC
...
- Enable parallel hotplug for RISC-V
- Optimize vector regset allocation for ptrace()
- Add a kernel selftest for the vector ptrace interface
- Enable the userspace RAID6 test to build and run using RISC-V
vectors
- Add initial support for the Zalasr RISC-V ratified ISA extension
- For the Zicbop RISC-V ratified ISA extension to userspace, expose
hardware and kernel support to userspace and add a kselftest for
Zicbop
- Convert open-coded instances of 'asm goto's that are controlled by
runtime ALTERNATIVEs to use riscv_has_extension_{un,}likely(),
following arm64's alternative_has_cap_{un,}likely()
- Remove an unnecessary mask in the GFP flags used in some calls to
pagetable_alloc()
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEElRDoIDdEz9/svf2Kx4+xDQu9KksFAmkx2zkACgkQx4+xDQu9
KkvagQ/+Iv94J0S8VWS4vvRMtvPMkEHvPADCmEOrMrVsQ+hZdasKsSFBKY4DHZt6
baGKEQoyJ0NDrIt51uNWzmR503/PGPwVYSDfrgrpS87IkcO9OWe/HiMSEAu0aCyp
dhKGYnjWUtawWzFQg1GQ2n1vLOm5cQ2u1vTptISqU1yg9XlXUN0LNzIqNB8GpQv3
Hm7qqNFyxZOyAC9/ZFGbX2/0KpKQh1xkXSEONxzRGADJ/bqHKKz3hdaOj3aVcoMM
zObvHBm+I0U6AnGSgiZm71dvO0vlijg1RsuD/wd1DlcGO9QAuaPHX+RKqmJUJFe8
d0JjOon+d6n/pBKxoPnZyfB1IHxFNb3kX3LjmKdP6NYeOmdHZ7LI3dzB+LFyPXZe
mkC948+9GExEqmHQx5ZqyCWDwKtknIQPA45ZjBi8e5YOU8nJQiapShYWQu/6ybKV
GFHRqjDIGfhpjZGVdxnUX2Iok1XcwaGKrI6/6P6WlQ/zHKGurtEEtGiI7XHDYS8m
FSGxYay4GIWUsNbNSBWcZp5QaGH0jW7qiZle3DR+8gDLe1DJzbIo6pWMiArm5v7x
fUmroR4FcF9x2X7A01IB2tEUGf/0nfuHgVfMNPNoHBGsiRs9mXzLMngJjbFXOGyF
EP61/W+K+eaJ0jjkfmnscBQEy8URLSNoACnwQLT9SQcAIC4xWTs=
=J1nv
-----END PGP SIGNATURE-----
Merge tag 'riscv-for-linus-6.19-mw1' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux
Pull RISC-V updates from Paul Walmsley:
- Enable parallel hotplug for RISC-V
- Optimize vector regset allocation for ptrace()
- Add a kernel selftest for the vector ptrace interface
- Enable the userspace RAID6 test to build and run using RISC-V vectors
- Add initial support for the Zalasr RISC-V ratified ISA extension
- For the Zicbop RISC-V ratified ISA extension to userspace, expose
hardware and kernel support to userspace and add a kselftest for
Zicbop
- Convert open-coded instances of 'asm goto's that are controlled by
runtime ALTERNATIVEs to use riscv_has_extension_{un,}likely(),
following arm64's alternative_has_cap_{un,}likely()
- Remove an unnecessary mask in the GFP flags used in some calls to
pagetable_alloc()
* tag 'riscv-for-linus-6.19-mw1' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
selftests/riscv: Add Zicbop prefetch test
riscv: hwprobe: Expose Zicbop extension and its block size
riscv: Introduce Zalasr instructions
riscv: hwprobe: Export Zalasr extension
dt-bindings: riscv: Add Zalasr ISA extension description
riscv: Add ISA extension parsing for Zalasr
selftests: riscv: Add test for the Vector ptrace interface
riscv: ptrace: Optimize the allocation of vector regset
raid6: test: Add support for RISC-V
raid6: riscv: Allow code to be compiled in userspace
raid6: riscv: Prevent compiler from breaking inline vector assembly code
riscv: cmpxchg: Use riscv_has_extension_likely
riscv: bitops: Use riscv_has_extension_likely
riscv: hweight: Use riscv_has_extension_likely
riscv: checksum: Use riscv_has_extension_likely
riscv: pgtable: Use riscv_has_extension_unlikely
riscv: Remove __GFP_HIGHMEM masking
RISC-V: Enable HOTPLUG_PARALLEL for secondary CPUs
- The 10 patch series "__vmalloc()/kvmalloc() and no-block support" from
Uladzislau Rezki reworks the vmalloc() code to support non-blocking
allocations (GFP_ATOIC, GFP_NOWAIT).
- The 2 patch series "ksm: fix exec/fork inheritance" from xu xin fixes
a rare case where the KSM MMF_VM_MERGE_ANY prctl state is not inherited
across fork/exec.
- The 4 patch series "mm/zswap: misc cleanup of code and documentations"
from SeongJae Park does some light maintenance work on the zswap code.
- The 5 patch series "mm/page_owner: add debugfs files 'show_handles'
and 'show_stacks_handles'" from Mauricio Faria de Oliveira enhances the
/sys/kernel/debug/page_owner debug feature. It adds unique identifiers
to differentiate the various stack traces so that userspace monitoring
tools can better match stack traces over time.
- The 2 patch series "mm/page_alloc: pcp->batch cleanups" from Joshua
Hahn makes some minor alterations to the page allocator's per-cpu-pages
feature.
- The 2 patch series "Improve UFFDIO_MOVE scalability by removing
anon_vma lock" from Lokesh Gidra addresses a scalability issue in
userfaultfd's UFFDIO_MOVE operation.
- The 2 patch series "kasan: cleanups for kasan_enabled() checks" from
Sabyrzhan Tasbolatov performs some cleanup in the KASAN code.
- The 2 patch series "drivers/base/node: fold node register and
unregister functions" from Donet Tom cleans up the NUMA node handling
code a little.
- The 4 patch series "mm: some optimizations for prot numa" from Kefeng
Wang provides some cleanups and small optimizations to the NUMA
allocation hinting code.
- The 5 patch series "mm/page_alloc: Batch callers of
free_pcppages_bulk" from Joshua Hahn addresses long lock hold times at
boot on large machines. These were causing (harmless) softlockup
warnings.
- The 2 patch series "optimize the logic for handling dirty file folios
during reclaim" from Baolin Wang removes some now-unnecessary work from
page reclaim.
- The 10 patch series "mm/damon: allow DAMOS auto-tuned for per-memcg
per-node memory usage" from SeongJae Park enhances the DAMOS auto-tuning
feature.
- The 2 patch series "mm/damon: fixes for address alignment issues in
DAMON_LRU_SORT and DAMON_RECLAIM" from Quanmin Yan fixes DAMON_LRU_SORT
and DAMON_RECLAIM with certain userspace configuration.
- The 15 patch series "expand mmap_prepare functionality, port more
users" from Lorenzo Stoakes enhances the new(ish)
file_operations.mmap_prepare() method and ports additional callsites
from the old ->mmap() over to ->mmap_prepare().
- The 8 patch series "Fix stale IOTLB entries for kernel address space"
from Lu Baolu fixes a bug (and possible security issue on non-x86) in
the IOMMU code. In some situations the IOMMU could be left hanging onto
a stale kernel pagetable entry.
- The 4 patch series "mm/huge_memory: cleanup __split_unmapped_folio()"
from Wei Yang cleans up and optimizes the folio splitting code.
- The 5 patch series "mm, swap: misc cleanup and bugfix" from Kairui
Song implements some cleanups and a minor fix in the swap discard code.
- The 8 patch series "mm/damon: misc documentation fixups" from SeongJae
Park does as advertised.
- The 9 patch series "mm/damon: support pin-point targets removal" from
SeongJae Park permits userspace to remove a specific monitoring target
in the middle of the current targets list.
- The 2 patch series "mm: MISC follow-up patches for linux/pgalloc.h"
from Harry Yoo implements a couple of cleanups related to mm header file
inclusion.
- The 2 patch series "mm/swapfile.c: select swap devices of default
priority round robin" from Baoquan He improves the selection of swap
devices for NUMA machines.
- The 3 patch series "mm: Convert memory block states (MEM_*) macros to
enums" from Israel Batista changes the memory block labels from macros
to enums so they will appear in kernel debug info.
- The 3 patch series "ksm: perform a range-walk to jump over holes in
break_ksm" from Pedro Demarchi Gomes addresses an inefficiency when KSM
unmerges an address range.
- The 22 patch series "mm/damon/tests: fix memory bugs in kunit tests"
from SeongJae Park fixes leaks and unhandled malloc() failures in DAMON
userspace unit tests.
- The 2 patch series "some cleanups for pageout()" from Baolin Wang
cleans up a couple of minor things in the page scanner's
writeback-for-eviction code.
- The 2 patch series "mm/hugetlb: refactor sysfs/sysctl interfaces" from
Hui Zhu moves hugetlb's sysfs/sysctl handling code into a new file.
- The 9 patch series "introduce VM_MAYBE_GUARD and make it sticky" from
Lorenzo Stoakes makes the VMA guard regions available in /proc/pid/smaps
and improves the mergeability of guarded VMAs.
- The 2 patch series "mm: perform guard region install/remove under VMA
lock" from Lorenzo Stoakes reduces mmap lock contention for callers
performing VMA guard region operations.
- The 2 patch series "vma_start_write_killable" from Matthew Wilcox
starts work in permitting applications to be killed when they are
waiting on a read_lock on the VMA lock.
- The 11 patch series "mm/damon/tests: add more tests for online
parameters commit" from SeongJae Park adds additional userspace testing
of DAMON's "commit" feature.
- The 9 patch series "mm/damon: misc cleanups" from SeongJae Park does
that.
- The 2 patch series "make VM_SOFTDIRTY a sticky VMA flag" from Lorenzo
Stoakes addresses the possible loss of a VMA's VM_SOFTDIRTY flag when
that VMA is merged with another.
- The 16 patch series "mm: support device-private THP" from Balbir Singh
introduces support for Transparent Huge Page (THP) migration in zone
device-private memory.
- The 3 patch series "Optimize folio split in memory failure" from Zi
Yan optimizes folio split operations in the memory failure code.
- The 2 patch series "mm/huge_memory: Define split_type and consolidate
split support checks" from Wei Yang provides some more cleanups in the
folio splitting code.
- The 16 patch series "mm: remove is_swap_[pte, pmd]() + non-swap
entries, introduce leaf entries" from Lorenzo Stoakes cleans up our
handling of pagetable leaf entries by introducing the concept of
'software leaf entries', of type softleaf_t.
- The 4 patch series "reparent the THP split queue" from Muchun Song
reparents the THP split queue to its parent memcg. This is in
preparation for addressing the long-standing "dying memcg" problem,
wherein dead memcg's linger for too long, consuming memory resources.
- The 3 patch series "unify PMD scan results and remove redundant
cleanup" from Wei Yang does a little cleanup in the hugepage collapse
code.
- The 6 patch series "zram: introduce writeback bio batching" from
Sergey Senozhatsky improves zram writeback efficiency by introducing
batched bio writeback support.
- The 4 patch series "memcg: cleanup the memcg stats interfaces" from
Shakeel Butt cleans up our handling of the interrupt safety of some
memcg stats.
- The 4 patch series "make vmalloc gfp flags usage more apparent" from
Vishal Moola cleans up vmalloc's handling of incoming GFP flags.
- The 6 patch series "mm: Add soft-dirty and uffd-wp support for RISC-V"
from Chunyan Zhang teches soft dirty and userfaultfd write protect
tracking to use RISC-V's Svrsw60t59b extension.
- The 5 patch series "mm: swap: small fixes and comment cleanups" from
Youngjun Park fixes a small bug and cleans up some of the swap code.
- The 4 patch series "initial work on making VMA flags a bitmap" from
Lorenzo Stoakes starts work on converting the vma struct's flags to a
bitmap, so we stop running out of them, especially on 32-bit.
- The 2 patch series "mm/swapfile: fix and cleanup swap list iterations"
from Youngjun Park addresses a possible bug in the swap discard code and
cleans things up a little.
-----BEGIN PGP SIGNATURE-----
iHUEABYKAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCaTEb0wAKCRDdBJ7gKXxA
jjfIAP94W4EkCCwNOupnChoG+YWw/JW21anXt5NN+i5svn1yugEAwzvv6A+cAFng
o+ug/fyrfPZG7PLp2R8WFyGIP0YoBA4=
=IUzS
-----END PGP SIGNATURE-----
Merge tag 'mm-stable-2025-12-03-21-26' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull MM updates from Andrew Morton:
"__vmalloc()/kvmalloc() and no-block support" (Uladzislau Rezki)
Rework the vmalloc() code to support non-blocking allocations
(GFP_ATOIC, GFP_NOWAIT)
"ksm: fix exec/fork inheritance" (xu xin)
Fix a rare case where the KSM MMF_VM_MERGE_ANY prctl state is not
inherited across fork/exec
"mm/zswap: misc cleanup of code and documentations" (SeongJae Park)
Some light maintenance work on the zswap code
"mm/page_owner: add debugfs files 'show_handles' and 'show_stacks_handles'" (Mauricio Faria de Oliveira)
Enhance the /sys/kernel/debug/page_owner debug feature by adding
unique identifiers to differentiate the various stack traces so
that userspace monitoring tools can better match stack traces over
time
"mm/page_alloc: pcp->batch cleanups" (Joshua Hahn)
Minor alterations to the page allocator's per-cpu-pages feature
"Improve UFFDIO_MOVE scalability by removing anon_vma lock" (Lokesh Gidra)
Address a scalability issue in userfaultfd's UFFDIO_MOVE operation
"kasan: cleanups for kasan_enabled() checks" (Sabyrzhan Tasbolatov)
"drivers/base/node: fold node register and unregister functions" (Donet Tom)
Clean up the NUMA node handling code a little
"mm: some optimizations for prot numa" (Kefeng Wang)
Cleanups and small optimizations to the NUMA allocation hinting
code
"mm/page_alloc: Batch callers of free_pcppages_bulk" (Joshua Hahn)
Address long lock hold times at boot on large machines. These were
causing (harmless) softlockup warnings
"optimize the logic for handling dirty file folios during reclaim" (Baolin Wang)
Remove some now-unnecessary work from page reclaim
"mm/damon: allow DAMOS auto-tuned for per-memcg per-node memory usage" (SeongJae Park)
Enhance the DAMOS auto-tuning feature
"mm/damon: fixes for address alignment issues in DAMON_LRU_SORT and DAMON_RECLAIM" (Quanmin Yan)
Fix DAMON_LRU_SORT and DAMON_RECLAIM with certain userspace
configuration
"expand mmap_prepare functionality, port more users" (Lorenzo Stoakes)
Enhance the new(ish) file_operations.mmap_prepare() method and port
additional callsites from the old ->mmap() over to ->mmap_prepare()
"Fix stale IOTLB entries for kernel address space" (Lu Baolu)
Fix a bug (and possible security issue on non-x86) in the IOMMU
code. In some situations the IOMMU could be left hanging onto a
stale kernel pagetable entry
"mm/huge_memory: cleanup __split_unmapped_folio()" (Wei Yang)
Clean up and optimize the folio splitting code
"mm, swap: misc cleanup and bugfix" (Kairui Song)
Some cleanups and a minor fix in the swap discard code
"mm/damon: misc documentation fixups" (SeongJae Park)
"mm/damon: support pin-point targets removal" (SeongJae Park)
Permit userspace to remove a specific monitoring target in the
middle of the current targets list
"mm: MISC follow-up patches for linux/pgalloc.h" (Harry Yoo)
A couple of cleanups related to mm header file inclusion
"mm/swapfile.c: select swap devices of default priority round robin" (Baoquan He)
improve the selection of swap devices for NUMA machines
"mm: Convert memory block states (MEM_*) macros to enums" (Israel Batista)
Change the memory block labels from macros to enums so they will
appear in kernel debug info
"ksm: perform a range-walk to jump over holes in break_ksm" (Pedro Demarchi Gomes)
Address an inefficiency when KSM unmerges an address range
"mm/damon/tests: fix memory bugs in kunit tests" (SeongJae Park)
Fix leaks and unhandled malloc() failures in DAMON userspace unit
tests
"some cleanups for pageout()" (Baolin Wang)
Clean up a couple of minor things in the page scanner's
writeback-for-eviction code
"mm/hugetlb: refactor sysfs/sysctl interfaces" (Hui Zhu)
Move hugetlb's sysfs/sysctl handling code into a new file
"introduce VM_MAYBE_GUARD and make it sticky" (Lorenzo Stoakes)
Make the VMA guard regions available in /proc/pid/smaps and
improves the mergeability of guarded VMAs
"mm: perform guard region install/remove under VMA lock" (Lorenzo Stoakes)
Reduce mmap lock contention for callers performing VMA guard region
operations
"vma_start_write_killable" (Matthew Wilcox)
Start work on permitting applications to be killed when they are
waiting on a read_lock on the VMA lock
"mm/damon/tests: add more tests for online parameters commit" (SeongJae Park)
Add additional userspace testing of DAMON's "commit" feature
"mm/damon: misc cleanups" (SeongJae Park)
"make VM_SOFTDIRTY a sticky VMA flag" (Lorenzo Stoakes)
Address the possible loss of a VMA's VM_SOFTDIRTY flag when that
VMA is merged with another
"mm: support device-private THP" (Balbir Singh)
Introduce support for Transparent Huge Page (THP) migration in zone
device-private memory
"Optimize folio split in memory failure" (Zi Yan)
"mm/huge_memory: Define split_type and consolidate split support checks" (Wei Yang)
Some more cleanups in the folio splitting code
"mm: remove is_swap_[pte, pmd]() + non-swap entries, introduce leaf entries" (Lorenzo Stoakes)
Clean up our handling of pagetable leaf entries by introducing the
concept of 'software leaf entries', of type softleaf_t
"reparent the THP split queue" (Muchun Song)
Reparent the THP split queue to its parent memcg. This is in
preparation for addressing the long-standing "dying memcg" problem,
wherein dead memcg's linger for too long, consuming memory
resources
"unify PMD scan results and remove redundant cleanup" (Wei Yang)
A little cleanup in the hugepage collapse code
"zram: introduce writeback bio batching" (Sergey Senozhatsky)
Improve zram writeback efficiency by introducing batched bio
writeback support
"memcg: cleanup the memcg stats interfaces" (Shakeel Butt)
Clean up our handling of the interrupt safety of some memcg stats
"make vmalloc gfp flags usage more apparent" (Vishal Moola)
Clean up vmalloc's handling of incoming GFP flags
"mm: Add soft-dirty and uffd-wp support for RISC-V" (Chunyan Zhang)
Teach soft dirty and userfaultfd write protect tracking to use
RISC-V's Svrsw60t59b extension
"mm: swap: small fixes and comment cleanups" (Youngjun Park)
Fix a small bug and clean up some of the swap code
"initial work on making VMA flags a bitmap" (Lorenzo Stoakes)
Start work on converting the vma struct's flags to a bitmap, so we
stop running out of them, especially on 32-bit
"mm/swapfile: fix and cleanup swap list iterations" (Youngjun Park)
Address a possible bug in the swap discard code and clean things
up a little
[ This merge also reverts commit ebb9aeb980 ("vfio/nvgrace-gpu:
register device memory for poison handling") because it looks
broken to me, I've asked for clarification - Linus ]
* tag 'mm-stable-2025-12-03-21-26' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (321 commits)
mm: fix vma_start_write_killable() signal handling
mm/swapfile: use plist_for_each_entry in __folio_throttle_swaprate
mm/swapfile: fix list iteration when next node is removed during discard
fs/proc/task_mmu.c: fix make_uffd_wp_huge_pte() huge pte handling
mm/kfence: add reboot notifier to disable KFENCE on shutdown
memcg: remove inc/dec_lruvec_kmem_state helpers
selftests/mm/uffd: initialize char variable to Null
mm: fix DEBUG_RODATA_TEST indentation in Kconfig
mm: introduce VMA flags bitmap type
tools/testing/vma: eliminate dependency on vma->__vm_flags
mm: simplify and rename mm flags function for clarity
mm: declare VMA flags by bit
zram: fix a spelling mistake
mm/page_alloc: optimize lowmem_reserve max lookup using its semantic monotonicity
mm/vmscan: skip increasing kswapd_failures when reclaim was boosted
pagemap: update BUDDY flag documentation
mm: swap: remove scan_swap_map_slots() references from comments
mm: swap: change swap_alloc_slow() to void
mm, swap: remove redundant comment for read_swap_cache_async
mm, swap: use SWP_SOLIDSTATE to determine if swap is rotational
...
- fprobe: Performance enhancement of the fprobe using rhltable
. fprobe: use rhltable for fprobe_ip_table. The fprobe IP table has
been converted to use an rhltable for improved performance when
dealing with a large number of probed functions.
. Fix a suspicious RCU usage warning of the above change in the
fprobe entry handler.
. Remove an unused local variable of the above change.
. Fix to initialize fprobe_ip_table in core_initcall().
- fprobe: Performance optimization of fprobe by ftrace
. fprobe: Use ftrace instead of fgraph for entry only probes. This
avoids the unneeded overhead of fgraph stack setup.
. Also update fprobe selftest for entry-only probe.
. fprobe: Use ftrace only if CONFIG_DYNAMIC_FTRACE_WITH_ARGS or
WITH_REGS is defined.
- probes: Cleanup probe event subsystems.
. uprobe/eprobe: Allocate traceprobe_parse_context per probe instead
of each probe argument parsing. This reduce memory allocation/free
of temporary working memory.
. uprobes: Cleanup code using __free().
. eprobes: Cleanup code using __free().
. probes: Cleanup code using __free(trace_probe_log_clear) to clear
error log automatically.
. probes: Replace strcpy() with memcpy() in __trace_probe_log_err().
-----BEGIN PGP SIGNATURE-----
iQFPBAABCgA5FiEEh7BulGwFlgAOi5DV2/sHvwUrPxsFAmkvhSsbHG1hc2FtaS5o
aXJhbWF0c3VAZ21haWwuY29tAAoJENv7B78FKz8bWhEH/23XM5Msjy5vopB+ECZb
iCj8SkWrQzfiCBILUqxCkZdfJHFomGPHewxvxIOWdb7evtHuy0Ypne/Uw/TMAtAh
xvDQmu03IV2jO7h7GExsnEh0nX0upYg4IVmN0sCSSWSfgLLTWO9ICClavV9adcva
ZR+5TdZbK+W59n+ejxA9OMDt1G+nz1Ls9Qhx9ktf7odkJzBkQGPq/heZuPbF3+6k
Vj2IHTuqWobDDt+ekKOBRWNh9cS61ybxvsr/vmkT6s904ortP6mZa3zEYPRVOUNG
WJ/KGJwvExTcaG/Dy2g6q8tam1Bidx9/S6klyOGXQXxvaIT1VtBc66HzAUfso6jg
yIc=
=w6Kq
-----END PGP SIGNATURE-----
Merge tag 'probes-v6.19' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace
Pull probes updates from Masami Hiramatsu:
"fprobe performance enhancement using rhltable:
- use rhltable for fprobe_ip_table. The fprobe IP table has been
converted to use an rhltable for improved performance when dealing
with a large number of probed functions
- Fix a suspicious RCU usage warning of the above change in the
fprobe entry handler
- Remove an unused local variable of the above change
- Fix to initialize fprobe_ip_table in core_initcall()
Performance optimization of fprobe by ftrace:
- Use ftrace instead of fgraph for entry only probes. This avoids the
unneeded overhead of fgraph stack setup
- Also update fprobe selftest for entry-only probe
- fprobe: Use ftrace only if CONFIG_DYNAMIC_FTRACE_WITH_ARGS or
WITH_REGS is defined
Cleanup probe event subsystems:
- Allocate traceprobe_parse_context per probe instead of each probe
argument parsing. This reduce memory allocation/free of temporary
working memory
- Cleanup code using __free()
- Replace strcpy() with memcpy() in __trace_probe_log_err()"
* tag 'probes-v6.19' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
tracing: fprobe: use ftrace if CONFIG_DYNAMIC_FTRACE_WITH_ARGS
lib/test_fprobe: add testcase for mixed fprobe
tracing: fprobe: optimization for entry only case
tracing: fprobe: Fix to init fprobe_ip_table earlier
tracing: fprobe: Remove unused local variable
tracing: probes: Replace strcpy() with memcpy() in __trace_probe_log_err()
tracing: fprobe: fix suspicious rcu usage in fprobe_entry
tracing: uprobe: eprobes: Allocate traceprobe_parse_context per probe
tracing: uprobes: Cleanup __trace_uprobe_create() with __free()
tracing: eprobe: Cleanup eprobe event using __free()
tracing: probes: Use __free() for trace_probe_log
tracing: fprobe: use rhltable for fprobe_ip_table
The majority of changes at this time were about ASoC with a lot of
code refactoring works. From the functionality POV, there aren't much
to see, but we have a wide range of device-specific fixes and updates.
Here are some highlights:
- Continued ASoC API clean works, spanned over many files
- Added a SoundWire SCDA generic class driver with regmap support
- Enhancements and fixes for Cirrus, Intel, Maxim and Qualcomm.
- Support for ASoC Allwinner A523, Mediatek MT8189, Qualcomm QCM2290,
QRB2210 and SM6115, SpacemiT K1, and TI TAS2568, TAS5802, TAS5806,
TAS5815, TAS5828 and TAS5830
- Usual HD-audio and USB-audio quirks and fixups
- Support for Onkyo SE-300PCIE, TASCAM IF-FW/DM MkII
Some gpiolib changes for shared GPIOs are included along with this PR
for covering ASoC drivers changes.
-----BEGIN PGP SIGNATURE-----
iQJCBAABCAAsFiEEIXTw5fNLNI7mMiVaLtJE4w1nLE8FAmkwQ2UOHHRpd2FpQHN1
c2UuZGUACgkQLtJE4w1nLE8tIRAAjCHdIlMejNTCzGRlhsRSQVD6bo1wASXcjfJ6
COH84akbnA0oT5z7H7JnzTOmfjzxLJpwC8j6IpZ/9CQazanT5IIVE41FZquXZ1JB
RhQVzuGw9Pl4MaYVdFuRqIXjiP+msY1jpbo9/QXQo8D/B41wpmVTgzkFVW2rxPMy
0aBOu4Wpu+11aBpNBy6dXDiKQ5kDqn7zOLoFGgcf5wlFIvOGZJ0Wg/i0kvCjl+ia
xYiP+/F6xKOyTY1c98iqExbKzSSy4ddGFUwrkevm6bWpu8hkXiL1O0zMWOe769x6
0wy0b5zvsbtOQOxbtK5+8gdjJw7ycgDa441hDtsaXBBROYZEV3D6+XZJCfq8Tz8F
+vLH5lfZeLg+59eqt3GOMGlwBfuhH91qzukIYG3q9EQGOkNkZ19ySJnFMLom68Ei
TCfNzh/ggSGXA9qAmfBcPoizgC/j9o+v4kbLRQteuRRWxES1FxqeN9Ba3d5JcHT3
BQpz1bhUli73477D6voPcwXLiQlM+Alv4QUKTFr2nUnWUQKwMvkZFwiv2jTqVdDf
f71Usv7xdyM7XijgmXuLg+3n0UvCwUPBB+bv3a1Bu7G4iTB1deNKU8t9k+sBJpcX
aRs5ych3MiU/zG+KRMB5FEx31KzpKu+Kk9NQ207/1HLaNhTgD3cg2wS3T3qdRUPv
Yf6wFHs=
=1JUI
-----END PGP SIGNATURE-----
Merge tag 'sound-6.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
Pull sound updates from Takashi Iwai:
"The majority of changes at this time were about ASoC with a lot of
code refactoring works. From the functionality POV, there isn't much
to see, but we have a wide range of device-specific fixes and updates.
Here are some highlights:
- Continued ASoC API cleanup work, spanned over many files
- Added a SoundWire SCDA generic class driver with regmap support
- Enhancements and fixes for Cirrus, Intel, Maxim and Qualcomm.
- Support for ASoC Allwinner A523, Mediatek MT8189, Qualcomm QCM2290,
QRB2210 and SM6115, SpacemiT K1, and TI TAS2568, TAS5802, TAS5806,
TAS5815, TAS5828 and TAS5830
- Usual HD-audio and USB-audio quirks and fixups
- Support for Onkyo SE-300PCIE, TASCAM IF-FW/DM MkII
Some gpiolib changes for shared GPIOs are included along with this PR
for covering ASoC drivers changes"
* tag 'sound-6.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: (739 commits)
ALSA: hda/realtek: Add PCI SSIDs to HP ProBook quirks
ALSA: usb-audio: Simplify with usb_endpoint_max_periodic_payload()
ALSA: hda/realtek: fix mute/micmute LEDs don't work for more HP laptops
ALSA: rawmidi: Fix inconsistent indenting warning reported by smatch
ALSA: dice: fix buffer overflow in detect_stream_formats()
ASoC: codecs: Modify awinic amplifier dsp read and write functions
ASoC: SDCA: Fixup some more Kconfig issues
ASoC: cs35l56: Log a message if firmware is missing
ASoC: nau8325: Delete a stray tab
firmware: cs_dsp: Add test cases for client_ops == NULL
firmware: cs_dsp: Don't require client to provide a struct cs_dsp_client_ops
ASoC: fsl_micfil: Set channel range control
ASoC: fsl_micfil: Add default quality for different platforms
ASoC: intel: sof_sdw: Add codec_info for cs42l45
ASoC: sdw_utils: Add cs42l45 support functions
ASoC: intel: sof_sdw: Add ability to have auxiliary devices
ASoC: sdw_utils: Move codec_name to dai info
ASoC: sdw_utils: Add codec_conf for every DAI
ASoC: SDCA: Add terminal type into input/output widget name
ASoC: SDCA: Align mute controls to ALSA expectations
...
-----BEGIN PGP SIGNATURE-----
iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAmktsoMQHGF4Ym9lQGtl
cm5lbC5kawAKCRD301j7KXHgpuiUD/92eivL+HmOh10o8trvxajB0yuyqfSjHHrL
g+xUbF4s9bgAg/v+Upx7sTY8jdrTcMjKov+G9T6uPvBMqVmeVdZckA1PSAKQaIX1
Zb7nS2LnO7F6JKbwpwVrrIaqVbcz8MfGIIMbN4yNNEOMCwdIVMp4fo7trPBknJNx
WddNSGUFlIF3NqSI8AflSS/pYnGm+McfBHXBpJAKipI3iquKKubHv+FX9kLp7Tn4
x27ZoCWOHglIBTJXU0mmXCVsLF8b5BA8DQcGtT62azb8+l0cRTkaHY0DFAv5BvhG
TqcjrKdmR0cGSNt+nEmFrujE3atBRl0G0kiHA80YgA1MTtYzdPaUVOUtM9k/rEem
gpiGMDpBypdxyJAyijPSaVJdfcg0psOlYbhIR4N2wbj/dq8268h+cWzXlF1spgVt
/7ygoaCmfMNbTy9rKThTjH+es787AVXUAXXaPHhIFsnCKUj8xQl4pT7XltmgYeWx
1/XD1NEJeLHHog5upAVlGX3H5tbvP1nIICxbZa9mDOJX1rwxxI7/s/RucPjbNXuY
AiaKPTfxtB9+Ihd2HrJ/76RVMkckcOBc4GIKoFfwuKDbcdLXQ5FcZCmVRoI1V9SV
KsH7JBgihLwR9XWKE1vp9+CBNe1Qlu3K4IjG/E7CNLeuDntIBu73ihqGP/DqV6Bq
RX1Dc0OyAQ==
=m22w
-----END PGP SIGNATURE-----
Merge tag 'for-6.19/block-20251201' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux
Pull block updates from Jens Axboe:
- Fix head insertion for mq-deadline, a regression from when priority
support was added
- Series simplifying and improving the ublk user copy code
- Various ublk related cleanups
- Fixup REQ_NOWAIT handling in loop/zloop, clearing NOWAIT when the
request is punted to a thread for handling
- Merge and then later revert loop dio nowait support, as it ended up
causing excessive stack usage for when the inline issue code needs to
dip back into the full file system code
- Improve auto integrity code, making it less deadlock prone
- Speedup polled IO handling, but manually managing the hctx lookups
- Fixes for blk-throttle for SSD devices
- Small series with fixes for the S390 dasd driver
- Add support for caching zones, avoiding unnecessary report zone
queries
- MD pull requests via Yu:
- fix null-ptr-dereference regression for dm-raid0
- fix IO hang for raid5 when array is broken with IO inflight
- remove legacy 1s delay to speed up system shutdown
- change maintainer's email address
- data can be lost if array is created with different lbs devices,
fix this problem and record lbs of the array in metadata
- fix rcu protection for md_thread
- fix mddev kobject lifetime regression
- enable atomic writes for md-linear
- some cleanups
- bcache updates via Coly
- remove useless discard and cache device code
- improve usage of per-cpu workqueues
- Reorganize the IO scheduler switching code, fixing some lockdep
reports as well
- Improve the block layer P2P DMA support
- Add support to the block tracing code for zoned devices
- Segment calculation improves, and memory alignment flexibility
improvements
- Set of prep and cleanups patches for ublk batching support. The
actual batching hasn't been added yet, but helps shrink down the
workload of getting that patchset ready for 6.20
- Fix for how the ps3 block driver handles segments offsets
- Improve how block plugging handles batch tag allocations
- nbd fixes for use-after-free of the configuration on device clear/put
- Set of improvements and fixes for zloop
- Add Damien as maintainer of the block zoned device code handling
- Various other fixes and cleanups
* tag 'for-6.19/block-20251201' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux: (162 commits)
block/rnbd: correct all kernel-doc complaints
blk-mq: use queue_hctx in blk_mq_map_queue_type
md: remove legacy 1s delay in md_notify_reboot
md/raid5: fix IO hang when array is broken with IO inflight
md: warn about updating super block failure
md/raid0: fix NULL pointer dereference in create_strip_zones() for dm-raid
sbitmap: fix all kernel-doc warnings
ublk: add helper of __ublk_fetch()
ublk: pass const pointer to ublk_queue_is_zoned()
ublk: refactor auto buffer register in ublk_dispatch_req()
ublk: add `union ublk_io_buf` with improved naming
ublk: add parameter `struct io_uring_cmd *` to ublk_prep_auto_buf_reg()
kfifo: add kfifo_alloc_node() helper for NUMA awareness
blk-mq: fix potential uaf for 'queue_hw_ctx'
blk-mq: use array manage hctx map instead of xarray
ublk: prevent invalid access with DEBUG
s390/dasd: Use scnprintf() instead of sprintf()
s390/dasd: Move device name formatting into separate function
s390/dasd: Remove unnecessary debugfs_create() return checks
s390/dasd: Fix gendisk parent after copy pair swap
...
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEE+soXsSLHKoYyzcli6rmadz2vbToFAmktzC4ACgkQ6rmadz2v
bTpA1w/+PZ45N3y6O+NQVIpBlpnHG7DEMK7Lw19On0xVLwH+XPHz6J5PEfzjyJR1
SCbsV30qkJ1YCtgRHHf+ZCuWPWm58hY8dXYwSDyjNavdQyVGOdf17aBu9pvH45NW
K20OhwQHpCHWIfDlijjPkDdiHnYf5S7Xy6ctt/3ztF0pMDHIaghGxJymG4wULcDT
iLKnT37kwO8b2ihmw/HbcZPQYMWfHRye7X009K+wCv0dnhJ6q/Ny1m+Pg4kF92e6
ON/RY26ep2dq7LpaNWa1rI1yOgFlI7uUlVojqrAuAb+xrg+64wUDBxeijvE37EN1
s/+PuEKAR6xwz1dbY2cWAI0D633saz24UdV6kCBW9HrjHKVRQ7ZSsBF9ENkS4DTK
nowx4wOe1ZHc/6YgTktZp9LEn/0YrmQtFxjqEAJiYUgD18FrBrSjmhHpBiL+HghP
sTqy41qDQGoKtg3bRu42Co9wmNeeLsnxT8NQExCmTYQ4ufpdA/VMQux9cBVX3GBq
EchJb465+AcvvCJUiKbnHLxDsHCQz1YYytz3RqyFLgGDFZnHOE0FjwPJmM8I5kkK
gvDB3ZYdO3Halm8BZfZZBnKv5uK7myuAWwqRLgMRanuZcRgmIV1oUP5EP88HdH75
fB20vZSVcfzB17SLyhiM20ivEWodJa9VCLEw9WDOmDoml+33Pks=
=kaCJ
-----END PGP SIGNATURE-----
Merge tag 'bpf-next-6.19' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Pull bpf updates from Alexei Starovoitov:
- Convert selftests/bpf/test_tc_edt and test_tc_tunnel from .sh to
test_progs runner (Alexis Lothoré)
- Convert selftests/bpf/test_xsk to test_progs runner (Bastien
Curutchet)
- Replace bpf memory allocator with kmalloc_nolock() in
bpf_local_storage (Amery Hung), and in bpf streams and range tree
(Puranjay Mohan)
- Introduce support for indirect jumps in BPF verifier and x86 JIT
(Anton Protopopov) and arm64 JIT (Puranjay Mohan)
- Remove runqslower bpf tool (Hoyeon Lee)
- Fix corner cases in the verifier to close several syzbot reports
(Eduard Zingerman, KaFai Wan)
- Several improvements in deadlock detection in rqspinlock (Kumar
Kartikeya Dwivedi)
- Implement "jmp" mode for BPF trampoline and corresponding
DYNAMIC_FTRACE_WITH_JMP. It improves "fexit" program type performance
from 80 M/s to 136 M/s. With Steven's Ack. (Menglong Dong)
- Add ability to test non-linear skbs in BPF_PROG_TEST_RUN (Paul
Chaignon)
- Do not let BPF_PROG_TEST_RUN emit invalid GSO types to stack (Daniel
Borkmann)
- Generalize buildid reader into bpf_dynptr (Mykyta Yatsenko)
- Optimize bpf_map_update_elem() for map-in-map types (Ritesh
Oedayrajsingh Varma)
- Introduce overwrite mode for BPF ring buffer (Xu Kuohai)
* tag 'bpf-next-6.19' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (169 commits)
bpf: optimize bpf_map_update_elem() for map-in-map types
bpf: make kprobe_multi_link_prog_run always_inline
selftests/bpf: do not hardcode target rate in test_tc_edt BPF program
selftests/bpf: remove test_tc_edt.sh
selftests/bpf: integrate test_tc_edt into test_progs
selftests/bpf: rename test_tc_edt.bpf.c section to expose program type
selftests/bpf: Add success stats to rqspinlock stress test
rqspinlock: Precede non-head waiter queueing with AA check
rqspinlock: Disable spinning for trylock fallback
rqspinlock: Use trylock fallback when per-CPU rqnode is busy
rqspinlock: Perform AA checks immediately
rqspinlock: Enclose lock/unlock within lock entry acquisitions
bpf: Remove runqslower tool
selftests/bpf: Remove usage of lsm/file_alloc_security in selftest
bpf: Disable file_alloc_security hook
bpf: check for insn arrays in check_ptr_alignment
bpf: force BPF_F_RDONLY_PROG on insn array creation
bpf: Fix exclusive map memory leak
selftests/bpf: Make CS length configurable for rqspinlock stress test
selftests/bpf: Add lock wait time stats to rqspinlock stress test
...
-----BEGIN PGP SIGNATURE-----
iQJPBAABCAA5FiEESH4wyp42V4tXvYsjUqAMR0iAlPIFAmktlbUbFIAAAAAABAAO
bWFudTIsMi41KzEuMTEsMiwyAAoJEFKgDEdIgJTyevsP/1z98/wfCaSCquIq4H8S
OTqFGybGgYQt1NmMj2cGPpbAE3LJNYORT0A4tcoqOTy1Z5xbQz63rO3clSI/e7Mf
n4ZZ7NvkE40i8et1BjqtZa9dSkAv4QLYH73KrtNeuTr5tqvHo1x8FakUH6gQnb1k
QOOebvbVXnOb+rh89j1GZShrLFcCil0psjp165WHAYE/3PyFBgYGLMCgwLqS+W3H
re5Q4sl/ySXpMFF/XN1Kww48FWxy/h+YQFCxZwuWlUcXtVjqZ+BN+keb7AqaFQ7R
dC2exV2W0RBoupEJR/FWHoXrm/bDDLhzqRaMvoggLJrMJ9L6V0WdIhaFA4qzoG63
paJGFjUfmDX3dpPsAddq7kKeevCz4a2/HwFKhiBqqq4tdHuely7wZgnoFO7ovgmu
DYDCXHtpJuWZR3WJ5I/V/sJ9i9KFXhhyWcKVf13QTAFiCaA09aeSAcUWNYNaaxbn
nu6IkUxdIVnWIEBgcYH6jz1DrPGreYLYuD4bVb2gdZoP0r3tnMpG6xfSNIUueSGd
VFAKW9PJYaj7Id+jgACH6V+gQ22L600xJDdL1bPjRbGE0LD7vlz2F1MZTq3BFJFn
hUxJeOZplHX+TPophdvH4MO9VLmydWLUyJiDBP1yA8M9XZms/5s7IJJ1RYXqUCcf
qEB4L7W1+Qy1R/lzf2PU9X4R
=FnfO
-----END PGP SIGNATURE-----
Merge tag 'printk-for-6.19' of git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux
Pull printk updates from Petr Mladek:
- Allow creaing nbcon console drivers with an unsafe write_atomic()
callback that can only be called by the final nbcon_atomic_flush_unsafe().
Otherwise, the driver would rely on the kthread.
It is going to be used as the-best-effort approach for an
experimental nbcon netconsole driver, see
https://lore.kernel.org/r/20251121-nbcon-v1-2-503d17b2b4af@debian.org
Note that a safe .write_atomic() callback is supposed to work in NMI
context. But some networking drivers are not safe even in IRQ
context:
https://lore.kernel.org/r/oc46gdpmmlly5o44obvmoatfqo5bhpgv7pabpvb6sjuqioymcg@gjsma3ghoz35
In an ideal world, all networking drivers would be fixed first and
the atomic flush would be blocked only in NMI context. But it brings
the question how reliable networking drivers are when the system is
in a bad state. They might block flushing more reliable serial
consoles which are more suitable for serious debugging anyway.
- Allow to use the last 4 bytes of the printk ring buffer.
- Prevent queuing IRQ work and block printk kthreads when consoles are
suspended. Otherwise, they create non-necessary churn or even block
the suspend.
- Release console_lock() between each record in the kthread used for
legacy consoles on RT. It might significantly speed up the boot.
- Release nbcon context between each record in the atomic flush. It
prevents stalls of the related printk kthread after it has lost the
ownership in the middle of a record
- Add support for NBCON consoles into KDB
- Add %ptsP modifier for printing struct timespec64 and use it where
possible
- Misc code clean up
* tag 'printk-for-6.19' of git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux: (48 commits)
printk: Use console_is_usable on console_unblank
arch: um: kmsg_dump: Use console_is_usable
drivers: serial: kgdboc: Drop checks for CON_ENABLED and CON_BOOT
lib/vsprintf: Unify FORMAT_STATE_NUM handlers
printk: Avoid irq_work for printk_deferred() on suspend
printk: Avoid scheduling irq_work on suspend
printk: Allow printk_trigger_flush() to flush all types
tracing: Switch to use %ptSp
scsi: snic: Switch to use %ptSp
scsi: fnic: Switch to use %ptSp
s390/dasd: Switch to use %ptSp
ptp: ocp: Switch to use %ptSp
pps: Switch to use %ptSp
PCI: epf-test: Switch to use %ptSp
net: dsa: sja1105: Switch to use %ptSp
mmc: mmc_test: Switch to use %ptSp
media: av7110: Switch to use %ptSp
ipmi: Switch to use %ptSp
igb: Switch to use %ptSp
e1000e: Switch to use %ptSp
...
API:
- Rewrite memcpy_sglist from scratch.
- Add on-stack AEAD request allocation.
- Fix partial block processing in ahash.
Algorithms:
- Remove ansi_cprng.
- Remove tcrypt tests for poly1305.
- Fix EINPROGRESS processing in authenc.
- Fix double-free in zstd.
Drivers:
- Use drbg ctr helper when reseeding xilinx-trng.
- Add support for PCI device 0x115A to ccp.
- Add support of paes in caam.
- Add support for aes-xts in dthev2.
Others:
- Use likely in rhashtable lookup.
- Fix lockdep false-positive in padata by removing a helper.
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEn51F/lCuNhUwmDeSxycdCkmxi6cFAmktaHwACgkQxycdCkmx
i6duthAAl4ZjsuSgt0P9ZPJXWgSH+QbNT/6fL1QzLEuzLVGn8Mt99LTQpaYU8HRh
fced8+R7UpqA/FgZTYbRKopZJVJJqhmTf2zqjbe47CroRm2Wf5UO+6ZXBsiqbMwa
6fNLilhcrq5G3DrIHepCpIQ7NM2+ucTMnPRIWP3cvzLwX0JzPtYIpYUSiVPAtkjh
9g24oPz6LR/xZfyk+wPbHOSYeqz4sSXnGJkL+Vn33AtU5KJZLum9zMP4Lleim7HP
XaNnUL/S/PYCspycrvfrnq6+YMLPw2USguttuZe0Dg0qhq/jPMyzdEkTAjcTD5LG
NZavVUbQsf6BW+YjXgaE/ybcSs6WR3ySs8aza1Ev8QqsmpbJj9xdpF9fn4RsffGR
mbhc5plJCKWzfiaparea8yY9n5vHwbOK4zoyF9P6kI5ykkoA+GmwRwTW73M9KCfa
i1R6g97O+t4Yaq9JI9GG7dkm9bxJpY+XaKouW7rqv/MX0iND1ExDYaqdcA+Xa61c
TNfdlVcGyX7Dolm2xnpvRv8EqF9NzeK4Vw1QslrdCijXfe7eJymabNKhLBlV4li0
tVfmh4vyQFgruyiR7r7AkXIKzsLZbji030UoOsQqiMW7ualBUQ0dCDbBa8J6kUcX
/vjbSmxV3LKgVgYvUBRRGIi9CJbKfs29RkS6RFtdqcq/YT4KsJU=
=DHes
-----END PGP SIGNATURE-----
Merge tag 'v6.19-p1' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
Pull crypto updates from Herbert Xu:
"API:
- Rewrite memcpy_sglist from scratch
- Add on-stack AEAD request allocation
- Fix partial block processing in ahash
Algorithms:
- Remove ansi_cprng
- Remove tcrypt tests for poly1305
- Fix EINPROGRESS processing in authenc
- Fix double-free in zstd
Drivers:
- Use drbg ctr helper when reseeding xilinx-trng
- Add support for PCI device 0x115A to ccp
- Add support of paes in caam
- Add support for aes-xts in dthev2
Others:
- Use likely in rhashtable lookup
- Fix lockdep false-positive in padata by removing a helper"
* tag 'v6.19-p1' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (71 commits)
crypto: zstd - fix double-free in per-CPU stream cleanup
crypto: ahash - Zero positive err value in ahash_update_finish
crypto: ahash - Fix crypto_ahash_import with partial block data
crypto: lib/mpi - use min() instead of min_t()
crypto: ccp - use min() instead of min_t()
hwrng: core - use min3() instead of nested min_t()
crypto: aesni - ctr_crypt() use min() instead of min_t()
crypto: drbg - Delete unused ctx from struct sdesc
crypto: testmgr - Add missing DES weak and semi-weak key tests
Revert "crypto: scatterwalk - Move skcipher walk and use it for memcpy_sglist"
crypto: scatterwalk - Fix memcpy_sglist() to always succeed
crypto: iaa - Request to add Kanchana P Sridhar to Maintainers.
crypto: tcrypt - Remove unused poly1305 support
crypto: ansi_cprng - Remove unused ansi_cprng algorithm
crypto: asymmetric_keys - fix uninitialized pointers with free attribute
KEYS: Avoid -Wflex-array-member-not-at-end warning
crypto: ccree - Correctly handle return of sg_nents_for_len
crypto: starfive - Correctly handle return of sg_nents_for_len
crypto: iaa - Fix incorrect return value in save_iaa_wq()
crypto: zstd - Remove unnecessary size_t cast
...
In v6.8, the size of task_struct on arm64 increased by 528 bytes due
to the new 'kernel_fpsimd_state' field. This field was added to allow
kernel-mode FPSIMD code to be preempted.
Unfortunately, 528 bytes is kind of a lot for task_struct. This
regression in the task_struct size was noticed and reported.
Recover that space by making this state be allocated on the stack at
the beginning of each kernel-mode FPSIMD section.
To make it easier for all the users of kernel-mode FPSIMD to do that
correctly, introduce and use a 'scoped_ksimd' abstraction.
-----BEGIN PGP SIGNATURE-----
iIoEABYIADIWIQSacvsUNc7UX4ntmEPzXCl4vpKOKwUCaSuxbxQcZWJpZ2dlcnNA
a2VybmVsLm9yZwAKCRDzXCl4vpKOKysnAQCbN4Jed8IqwGUEkkjZrnMeN0pEO4RI
lAhb2Obj3n/grQEAiPBmqWVjXaIPO4lSgLQxY6XoVLr+utMod4TMTYHfnAY=
=0zQQ
-----END PGP SIGNATURE-----
Merge tag 'fpsimd-on-stack-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux
Pull arm64 FPSIMD on-stack buffer updates from Eric Biggers:
"This is a core arm64 change. However, I was asked to take this because
most uses of kernel-mode FPSIMD are in crypto or CRC code.
In v6.8, the size of task_struct on arm64 increased by 528 bytes due
to the new 'kernel_fpsimd_state' field. This field was added to allow
kernel-mode FPSIMD code to be preempted.
Unfortunately, 528 bytes is kind of a lot for task_struct. This
regression in the task_struct size was noticed and reported.
Recover that space by making this state be allocated on the stack at
the beginning of each kernel-mode FPSIMD section.
To make it easier for all the users of kernel-mode FPSIMD to do that
correctly, introduce and use a 'scoped_ksimd' abstraction"
* tag 'fpsimd-on-stack-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux: (23 commits)
lib/crypto: arm64: Move remaining algorithms to scoped ksimd API
lib/crypto: arm/blake2b: Move to scoped ksimd API
arm64/fpsimd: Allocate kernel mode FP/SIMD buffers on the stack
arm64/fpu: Enforce task-context only for generic kernel mode FPU
net/mlx5: Switch to more abstract scoped ksimd guard API on arm64
arm64/xorblocks: Switch to 'ksimd' scoped guard API
crypto/arm64: sm4 - Switch to 'ksimd' scoped guard API
crypto/arm64: sm3 - Switch to 'ksimd' scoped guard API
crypto/arm64: sha3 - Switch to 'ksimd' scoped guard API
crypto/arm64: polyval - Switch to 'ksimd' scoped guard API
crypto/arm64: nhpoly1305 - Switch to 'ksimd' scoped guard API
crypto/arm64: aes-gcm - Switch to 'ksimd' scoped guard API
crypto/arm64: aes-blk - Switch to 'ksimd' scoped guard API
crypto/arm64: aes-ccm - Switch to 'ksimd' scoped guard API
raid6: Move to more abstract 'ksimd' guard API
crypto: aegis128-neon - Move to more abstract 'ksimd' guard API
crypto/arm64: sm4-ce-gcm - Avoid pointless yield of the NEON unit
crypto/arm64: sm4-ce-ccm - Avoid pointless yield of the NEON unit
crypto/arm64: aes-ce-ccm - Avoid pointless yield of the NEON unit
lib/crc: Switch ARM and arm64 to 'ksimd' scoped guard API
...
C supports lower bounds on the sizes of array parameters, using the
static keyword as follows: 'void f(int a[static 32]);'. This allows
the compiler to warn about a too-small array being passed.
As discussed, this reuse of the 'static' keyword, while standard, is a
bit obscure. Therefore, add an alias 'at_least' to compiler_types.h.
Then, add this 'at_least' annotation to the array parameters of
various crypto library functions.
-----BEGIN PGP SIGNATURE-----
iIoEABYIADIWIQSacvsUNc7UX4ntmEPzXCl4vpKOKwUCaSuslxQcZWJpZ2dlcnNA
a2VybmVsLm9yZwAKCRDzXCl4vpKOK/dXAP9WyuTsEYOOSwSaDI5sKzdNXT3GZNeO
jGhx9qPN3KIq8QD/RElN9oF7iU9wsKvU6kKpnqGcajGTzkW/GOAA20BFcAM=
=qFEB
-----END PGP SIGNATURE-----
Merge tag 'libcrypto-at-least-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux
Pull 'at_least' array size update from Eric Biggers:
"C supports lower bounds on the sizes of array parameters, using the
static keyword as follows: 'void f(int a[static 32]);'. This allows
the compiler to warn about a too-small array being passed.
As discussed, this reuse of the 'static' keyword, while standard, is a
bit obscure. Therefore, add an alias 'at_least' to compiler_types.h.
Then, add this 'at_least' annotation to the array parameters of
various crypto library functions"
* tag 'libcrypto-at-least-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux:
lib/crypto: sha2: Add at_least decoration to fixed-size array params
lib/crypto: sha1: Add at_least decoration to fixed-size array params
lib/crypto: poly1305: Add at_least decoration to fixed-size array params
lib/crypto: md5: Add at_least decoration to fixed-size array params
lib/crypto: curve25519: Add at_least decoration to fixed-size array params
lib/crypto: chacha: Add at_least decoration to fixed-size array params
lib/crypto: chacha20poly1305: Statically check fixed array lengths
compiler_types: introduce at_least parameter decoration pseudo keyword
wifi: iwlwifi: trans: rename at_least variable to min_mode
- Add KUnit test suites for SHA-3, BLAKE2b, and POLYVAL. These are the
algorithms that have new crypto library interfaces this cycle.
- Remove the crypto_shash POLYVAL tests. They're no longer needed
because POLYVAL support was removed from crypto_shash. Better
POLYVAL test coverage is now provided via the KUnit test suite.
-----BEGIN PGP SIGNATURE-----
iIoEABYIADIWIQSacvsUNc7UX4ntmEPzXCl4vpKOKwUCaSur3BQcZWJpZ2dlcnNA
a2VybmVsLm9yZwAKCRDzXCl4vpKOKxlzAP0cf+vR8aA+nS1LaAC8WTt6skTtwbVF
J/LPvOzHCLlj2AEA15CIQvBYkAAVCF2JPZiPb43lMMso7PkrjqIszQSunAc=
=B84P
-----END PGP SIGNATURE-----
Merge tag 'libcrypto-tests-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux
Pull crypto library test updates from Eric Biggers:
- Add KUnit test suites for SHA-3, BLAKE2b, and POLYVAL. These are the
algorithms that have new crypto library interfaces this cycle.
- Remove the crypto_shash POLYVAL tests. They're no longer needed
because POLYVAL support was removed from crypto_shash. Better POLYVAL
test coverage is now provided via the KUnit test suite.
* tag 'libcrypto-tests-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux:
crypto: testmgr - Remove polyval tests
lib/crypto: tests: Add KUnit tests for POLYVAL
lib/crypto: tests: Add additional SHAKE tests
lib/crypto: tests: Add SHA3 kunit tests
lib/crypto: tests: Add KUnit tests for BLAKE2b
This is the main crypto library pull request for 6.19. It includes:
- Add SHA-3 support to lib/crypto/, including support for both the
hash functions and the extendable-output functions. Reimplement the
existing SHA-3 crypto_shash support on top of the library.
This is motivated mainly by the upcoming support for the ML-DSA
signature algorithm, which needs the SHAKE128 and SHAKE256
functions. But even on its own it's a useful cleanup.
This also fixes the longstanding issue where the
architecture-optimized SHA-3 code was disabled by default.
- Add BLAKE2b support to lib/crypto/, and reimplement the existing
BLAKE2b crypto_shash support on top of the library.
This is motivated mainly by btrfs, which supports BLAKE2b checksums.
With this change, all btrfs checksum algorithms now have library
APIs. btrfs is planned to start just using the library directly.
This refactor also improves consistency between the BLAKE2b code and
BLAKE2s code. And as usual, it also fixes the issue where the
architecture-optimized BLAKE2b code was disabled by default.
- Add POLYVAL support to lib/crypto/, replacing the existing POLYVAL
support in crypto_shash. Reimplement HCTR2 on top of the library.
This simplifies the code and improves HCTR2 performance. As usual,
it also makes the architecture-optimized code be enabled by default.
The generic implementation of POLYVAL is greatly improved as well.
- Clean up the BLAKE2s code.
- Add FIPS self-tests for SHA-1, SHA-2, and SHA-3.
-----BEGIN PGP SIGNATURE-----
iIoEABYIADIWIQSacvsUNc7UX4ntmEPzXCl4vpKOKwUCaSurXBQcZWJpZ2dlcnNA
a2VybmVsLm9yZwAKCRDzXCl4vpKOK4RhAQC/ZWiFD8MAnyBSVEeo5G5wr835bGl0
FbXLdIT8orj/mAD+JgO9LaE15THL+Cy7K9VO4XN6HjbKd/7FdshYlrqrag8=
=qitH
-----END PGP SIGNATURE-----
Merge tag 'libcrypto-updates-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux
Pull crypto library updates from Eric Biggers:
"This is the main crypto library pull request for 6.19. It includes:
- Add SHA-3 support to lib/crypto/, including support for both the
hash functions and the extendable-output functions. Reimplement the
existing SHA-3 crypto_shash support on top of the library.
This is motivated mainly by the upcoming support for the ML-DSA
signature algorithm, which needs the SHAKE128 and SHAKE256
functions. But even on its own it's a useful cleanup.
This also fixes the longstanding issue where the
architecture-optimized SHA-3 code was disabled by default.
- Add BLAKE2b support to lib/crypto/, and reimplement the existing
BLAKE2b crypto_shash support on top of the library.
This is motivated mainly by btrfs, which supports BLAKE2b
checksums. With this change, all btrfs checksum algorithms now have
library APIs. btrfs is planned to start just using the library
directly.
This refactor also improves consistency between the BLAKE2b code
and BLAKE2s code. And as usual, it also fixes the issue where the
architecture-optimized BLAKE2b code was disabled by default.
- Add POLYVAL support to lib/crypto/, replacing the existing POLYVAL
support in crypto_shash. Reimplement HCTR2 on top of the library.
This simplifies the code and improves HCTR2 performance. As usual,
it also makes the architecture-optimized code be enabled by
default. The generic implementation of POLYVAL is greatly improved
as well.
- Clean up the BLAKE2s code
- Add FIPS self-tests for SHA-1, SHA-2, and SHA-3"
* tag 'libcrypto-updates-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux: (37 commits)
fscrypt: Drop obsolete recommendation to enable optimized POLYVAL
crypto: polyval - Remove the polyval crypto_shash
crypto: hctr2 - Convert to use POLYVAL library
lib/crypto: x86/polyval: Migrate optimized code into library
lib/crypto: arm64/polyval: Migrate optimized code into library
lib/crypto: polyval: Add POLYVAL library
crypto: polyval - Rename conflicting functions
lib/crypto: x86/blake2s: Use vpternlogd for 3-input XORs
lib/crypto: x86/blake2s: Avoid writing back unchanged 'f' value
lib/crypto: x86/blake2s: Improve readability
lib/crypto: x86/blake2s: Use local labels for data
lib/crypto: x86/blake2s: Drop check for nblocks == 0
lib/crypto: x86/blake2s: Fix 32-bit arg treated as 64-bit
lib/crypto: arm, arm64: Drop filenames from file comments
lib/crypto: arm/blake2s: Fix some comments
crypto: s390/sha3 - Remove superseded SHA-3 code
crypto: sha3 - Reimplement using library API
crypto: jitterentropy - Use default sha3 implementation
lib/crypto: s390/sha3: Add optimized one-shot SHA-3 digest functions
lib/crypto: sha3: Support arch overrides of one-shot digest functions
...
Add a compile-in option for Terminus 10x18 bitmap console font
to improve readability on modern laptop displays.
On modern 13-16 inch laptop displays with high pixel density,
common scaled resolutions like 1280x800 and 1440x900 are widely
used.
At these resolutions, VGA 8x16 is too small and difficult to
read for extended periods, while Terminus 16x32 is too large,
providing only 25-28 rows. The existing 10x18 font has poor
readability.
Terminus 10x18 provides improved readability with its clean,
fixed-width design while maintaining practical row counts
(44-50 rows).
A comfortable and readable built-in font for early boot messages,
kernel panics or whenever userspace is unavailable.
The font was converted from standard Terminus ter-i18b.psf using
psftools and formatted to match kernel font conventions.
This patch is non-intrusive, no options are enabled by default
so most users won't notice a thing.
Signed-off-by: Neilay Kharwadkar <neilaykharwadkar@gmail.com>
Signed-off-by: Helge Deller <deller@gmx.de>
by in_hardirq() and marked deprecated in 2020.
-----BEGIN PGP SIGNATURE-----
iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAmkvDhUTHHRnbHhAbGlu
dXRyb25peC5kZQAKCRCmGPVMDXSYoW+XD/959KAIm2JpcEYUWuNBmlhEyuYWvPLw
ZyOiraLYBNyWmfCO/Yz4Ff8VZSR9gdWQoNfvBb8uxkbSXa0UOEUhCbzWsuoTnqR5
ObTIHCJ9QmPlRiFDvs4Sf5TGmy/4nXh6/PoH3JykNdlD3rZMTxiAz/k6QuO/S2iu
ykA+DNtNL7jDkQHzrWa3rf597BkBN1Z+hUD8zHRt8LYKRfmLYWjCMggjPLMnuqcn
240fnV/FubCLd9f5ZgNxHQMQCQH2qB7GYMk08YwXwCZQqIIXWqbNnhedkkNO3kWq
Sws4TEO6yg9pgTFqkuiDU5QgYEboRY4pDT45KSkdTHHGZl2OAAl3eVIGCto72UEI
Eyzn4k900hZ1iI/Rad5mx3D4XJZEXFgEbXhjph0odn6jVvmSj+Fmg3J67u1niO2a
obzB+xeaIkbGNQIgJFy8+A9SSnZckvuPlXdZdUxS2S95zH7f9+vBY8HWJMuyursa
3AJAKa82mN1i3A9FdSuMTdttQWkDmrwPKVzxvixs1mBu7kB70XaRIKsPjZj7LH6X
CiqP9Kt5FO0hVA7K+nKTeUA5DdjB4HzYzOgMqzFUhExY3hksVsj8rQEO6B0bCp9t
CfITA3BvU7GXxhXZHOq3dABQ21J/ZHgeuK3QdQSnOxSQOv2ElYIdKvYirJy2QdS1
tSM3O3GXb4zWDg==
=6LKf
-----END PGP SIGNATURE-----
Merge tag 'core-core-2025-12-03' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull core irq cleanup from Thomas Gleixner:
"Tree wide cleanup of the remaining users of in_irq() which got
replaced by in_hardirq() and marked deprecated in 2020"
* tag 'core-core-2025-12-03' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
treewide: Remove in_irq()
- Allow pool refill on RT enabled kernels before the scheduler is up
and running to prevent pool exhaustion
- Correct the lockdep override to prevent false positives.
-----BEGIN PGP SIGNATURE-----
iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAmksu/UTHHRnbHhAbGlu
dXRyb25peC5kZQAKCRCmGPVMDXSYoScuD/9g3OtG29VZ2uNkOhgvGKuAThxZ+Y4d
8YPNT/X9SuSevuwBCc9zXgpc5S4Af40ndRbcsiZ38t/xrInE6+J6qPJ7BbKCTiac
cYvNz4ibMx1qz35BPtym4RnJyZA2EHX2hVIFGCfdh4MkILI7r3OPjemX542epZAW
8MdKu5WZJNa8KYIvUE1UZdDtH1imxU7jdYBkr1ockN66+HMjRKHxcPwrhTCFJeCT
N4DHOQ+hf9NzipHpRppDmqwkzQCOyKrOojXht00rG92QXIzmZRepH93cCFi/nW1d
8aUjHU6myNQa65VkFDM2I2bpzCzlK7HpBU3iNXEkXPLZ8bVrYMP9koK+SXIa+Gj0
icXdJwBe9uOKQOaG6MRSO2hn3fHO0m+PjZGtQFg7EqFCaY0J+8tv9k3WttDDpfMg
hjXjyJ0U9T+/YUuSDBLdPczIJZr8eGh960SF0OTshHGGVOCGJt4dlvoC0NtUxN8x
WQ/he9K/Cyz7U6yr1aNO6hAfqX/+6c0ZhD3OONuC9xgxHUkjPdlEe1ntLbdfn92z
VygbJaguvRdzkAeaAlXNNU5WTNvm3ZeLPqDnnUHUlDW1f7hF0KwCrfZUW0PqdB76
+94ptMeIlCv53zIEKamHuALGp7WtGddGGzaZLH8rUnPxfiff+JiMhXtV0ioMuUNG
jpdlyBMXK+s0PA==
=dUo2
-----END PGP SIGNATURE-----
Merge tag 'core-debugobjects-2025-11-30' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull debugobjects update from Thomas Gleixner:
"Two small updates for debugobjects:
- Allow pool refill on RT enabled kernels before the scheduler is up
and running to prevent pool exhaustion
- Correct the lockdep override to prevent false positives"
* tag 'core-debugobjects-2025-11-30' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
debugobjects: Use LD_WAIT_CONFIG instead of LD_WAIT_SLEEP
debugobjects: Allow to refill the pool before SYSTEM_SCHEDULING
The recent enablement of RSEQ in glibc resulted in regressions which are
caused by the related overhead. It turned out that the decision to invoke
the exit to user work was not really a decision. More or less each
context switch caused that. There is a long list of small issues which
sums up nicely and results in a 3-4% regression in I/O benchmarks.
The other detail which caused issues due to extra work in context switch
and task migration is the CID (memory context ID) management. It also
requires to use a task work to consolidate the CID space, which is
executed in the context of an arbitrary task and results in sporadic
uncontrolled exit latencies.
The rewrite addresses this by:
- Removing deprecated and long unsupported functionality
- Moving the related data into dedicated data structures which are
optimized for fast path processing.
- Caching values so actual decisions can be made
- Replacing the current implementation with a optimized inlined variant.
- Separating fast and slow path for architectures which use the generic
entry code, so that only fault and error handling goes into the
TIF_NOTIFY_RESUME handler.
- Rewriting the CID management so that it becomes mostly invisible in the
context switch path. That moves the work of switching modes into the
fork/exit path, which is a reasonable tradeoff. That work is only
required when a process creates more threads than the cpuset it is
allowed to run on or when enough threads exit after that. An artificial
thread pool benchmarks which triggers this did not degrade, it actually
improved significantly.
The main effect in migration heavy scenarios is that runqueue lock held
time and therefore contention goes down significantly.
-----BEGIN PGP SIGNATURE-----
iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAmksaRYTHHRnbHhAbGlu
dXRyb25peC5kZQAKCRCmGPVMDXSYoencEADA5he8PAFPmSRRPo6+2G5mHzWe8kIU
5ZViQStWFNAA0qqy8VXryWiJ6qqrO6la9o7K4YOXASUtlkVjquRp1DF7PabqGwuy
zshbRCXNlT51J8uqanN8VrGVjlf+bMdHDbGoI1SLkUTxG8b+kDD5PXUQE1ARelPP
Slbg9u+EMrxj6D5MDTPbuW6TqryJEkPtiNScyOz43emp9ww9+WVxenOcRqU4D+Th
mjWmrGIzkroSf4XReMoD/wg9TPTpUjXnNCwl2viY9JvBpkMfYtU4tJAGK3aNFOWy
zsAN0O9CaFGrUEFne7qUmtwhNLdtnjx5HN5pe7yZd1EhdTuQKq4jPiiQnwwm8w72
c0o6m45FNPmPoSyfaZWCkLjbTEUXonT9JF61iN35JVxim8gBDDJjHFKnLxDmLrH3
X0eESE48ReY2EneDV6Y8RJRo6oG14Fccvc39aTf/2Rw3trpmtt2agvConQzupQIg
DzANw4jhUUzFRrHrMHACNsqKFXh9ratue/S9DM3xxTpGO/bKdeK7jGIgzNf8O34M
J0O6Hvk5jMdcWlIJTx21GoGzoSkkXnR49g/71aCcp+MwdY4x9zFz5SWi8LWQRmkx
xRo6tY27Bma8/SEwMJjPpAUXDTpq6v+j3cPisybL1yGsyt9lh+p8LX7VUtwcoEqe
6ZelC5Kgw/+/kg==
=n5KT
-----END PGP SIGNATURE-----
Merge tag 'core-rseq-2025-11-30' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull rseq updates from Thomas Gleixner:
"A large overhaul of the restartable sequences and CID management:
The recent enablement of RSEQ in glibc resulted in regressions which
are caused by the related overhead. It turned out that the decision to
invoke the exit to user work was not really a decision. More or less
each context switch caused that. There is a long list of small issues
which sums up nicely and results in a 3-4% regression in I/O
benchmarks.
The other detail which caused issues due to extra work in context
switch and task migration is the CID (memory context ID) management.
It also requires to use a task work to consolidate the CID space,
which is executed in the context of an arbitrary task and results in
sporadic uncontrolled exit latencies.
The rewrite addresses this by:
- Removing deprecated and long unsupported functionality
- Moving the related data into dedicated data structures which are
optimized for fast path processing.
- Caching values so actual decisions can be made
- Replacing the current implementation with a optimized inlined
variant.
- Separating fast and slow path for architectures which use the
generic entry code, so that only fault and error handling goes into
the TIF_NOTIFY_RESUME handler.
- Rewriting the CID management so that it becomes mostly invisible in
the context switch path. That moves the work of switching modes
into the fork/exit path, which is a reasonable tradeoff. That work
is only required when a process creates more threads than the
cpuset it is allowed to run on or when enough threads exit after
that. An artificial thread pool benchmarks which triggers this did
not degrade, it actually improved significantly.
The main effect in migration heavy scenarios is that runqueue lock
held time and therefore contention goes down significantly"
* tag 'core-rseq-2025-11-30' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (54 commits)
sched/mmcid: Switch over to the new mechanism
sched/mmcid: Implement deferred mode change
irqwork: Move data struct to a types header
sched/mmcid: Provide CID ownership mode fixup functions
sched/mmcid: Provide new scheduler CID mechanism
sched/mmcid: Introduce per task/CPU ownership infrastructure
sched/mmcid: Serialize sched_mm_cid_fork()/exit() with a mutex
sched/mmcid: Provide precomputed maximal value
sched/mmcid: Move initialization out of line
signal: Move MMCID exit out of sighand lock
sched/mmcid: Convert mm CID mask to a bitmap
cpumask: Cache num_possible_cpus()
sched/mmcid: Use cpumask_weighted_or()
cpumask: Introduce cpumask_weighted_or()
sched/mmcid: Prevent pointless work in mm_update_cpus_allowed()
sched/mmcid: Move scheduler code out of global header
sched: Fixup whitespace damage
sched/mmcid: Cacheline align MM CID storage
sched/mmcid: Use proper data structures
sched/mmcid: Revert the complex CID management
...
- Implement the missing u64 user access function on ARM when
CONFIG_CPU_SPECTRE=n. This makes it possible to access a 64bit value in
generic code with [unsafe_]get_user(). All other architectures and ARM
variants provide the relevant accessors already.
- Ensure that ASM GOTO jump label usage in the user mode access helpers
always goes through a local C scope label indirection inside the
helpers. This is required because compilers are not supporting that a
ASM GOTO target leaves a auto cleanup scope. GCC silently fails to emit
the cleanup invocation and CLANG fails the build.
This provides generic wrapper macros and the conversion of affected
architecture code to use them.
- Scoped user mode access with auto cleanup
Access to user mode memory can be required in hot code paths, but if it
has to be done with user controlled pointers, the access is shielded
with a speculation barrier, so that the CPU cannot speculate around the
address range check. Those speculation barriers impact performance quite
significantly. This can be avoided by "masking" the provided pointer so
it is guaranteed to be in the valid user memory access range and
otherwise to point to a guaranteed unpopulated address space. This has
to be done without branches so it creates an address dependency for the
access, which the CPU cannot speculate ahead.
This results in repeating and error prone programming patterns:
if (can_do_masked_user_access())
from = masked_user_read_access_begin((from));
else if (!user_read_access_begin(from, sizeof(*from)))
return -EFAULT;
unsafe_get_user(val, from, Efault);
user_read_access_end();
return 0;
Efault:
user_read_access_end();
return -EFAULT;
which can be replaced with scopes and automatic cleanup:
scoped_user_read_access(from, Efault)
unsafe_get_user(val, from, Efault);
return 0;
Efault:
return -EFAULT;
- Convert code which implements the above pattern over to
scope_user.*.access(). This also corrects a couple of imbalanced
masked_*_begin() instances which are harmless on most architectures, but
prevent PowerPC from implementing the masking optimization.
- Add a missing speculation barrier in copy_from_user_iter()
-----BEGIN PGP SIGNATURE-----
iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAmksRfITHHRnbHhAbGlu
dXRyb25peC5kZQAKCRCmGPVMDXSYoVhBEACEySjWcyCrD1e0ZFMFAOJZFI2BShav
reotzCzmHYQdpVukDRxc64BgM2vN4yB04xnyMhi2o4hSTiIJhz1NzbKggsQJhVoA
psYz+xEI161HuLZnUBUBuF9RRko/HVsbGqO2JFCuOKor4GCycvjVgupR3EIN9h5T
HZEWGIgaTmN7MBj0QRrJgJkaaSTnPKOwWaNMV/F9pfk27zuB7vuV8WM9P3FaJYG+
JGa9td7VGaBpWavxgMJqfdvXWBCVDDfZ1dunWx8tPTnLxKZZZD6HlfQXhZTr2n1e
rtJpGgfVBx5Uqxn4RrhS0I7QeK1b9rrt3IU7EkFoaa3Z8LU5B7cHlm7KyicyoHhy
SzFFUszssznT/0OhA5fmgPRlqI295HynW2p1L4Xy9hC0EZ2vXJPG5rO6X3x6QwSR
asjRB7x/6JzWQUzE7/nhXd9KcB66wvQxhnjp7GqulF74aPBCtIdXXDD68YEDYkbi
dPC3NRBr0ePbsGVGWbYvYIPWcvo1u814C2io1zKwmVbiN6lCYURgQK861vfAZUP8
oP5D2a6ENgezDKoJo6eJ82inuDu64qZy7OOkU/aO3cbOuWGVyY9CjYD11x85Nr0k
UNabSOfvcmhmobtYUiAgLLrjX1grQUG3F74ZQTw513mwgMObuDAAoS11GPjY6HL6
b99WUJRv8jP66A==
=6no0
-----END PGP SIGNATURE-----
Merge tag 'core-uaccess-2025-11-30' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull scoped user access updates from Thomas Gleixner:
"Scoped user mode access and related changes:
- Implement the missing u64 user access function on ARM when
CONFIG_CPU_SPECTRE=n.
This makes it possible to access a 64bit value in generic code with
[unsafe_]get_user(). All other architectures and ARM variants
provide the relevant accessors already.
- Ensure that ASM GOTO jump label usage in the user mode access
helpers always goes through a local C scope label indirection
inside the helpers.
This is required because compilers are not supporting that a ASM
GOTO target leaves a auto cleanup scope. GCC silently fails to emit
the cleanup invocation and CLANG fails the build.
[ Editor's note: gcc-16 will have fixed the code generation issue
in commit f68fe3ddda4 ("eh: Invoke cleanups/destructors in asm
goto jumps [PR122835]"). But we obviously have to deal with clang
and older versions of gcc, so.. - Linus ]
This provides generic wrapper macros and the conversion of affected
architecture code to use them.
- Scoped user mode access with auto cleanup
Access to user mode memory can be required in hot code paths, but
if it has to be done with user controlled pointers, the access is
shielded with a speculation barrier, so that the CPU cannot
speculate around the address range check. Those speculation
barriers impact performance quite significantly.
This cost can be avoided by "masking" the provided pointer so it is
guaranteed to be in the valid user memory access range and
otherwise to point to a guaranteed unpopulated address space. This
has to be done without branches so it creates an address dependency
for the access, which the CPU cannot speculate ahead.
This results in repeating and error prone programming patterns:
if (can_do_masked_user_access())
from = masked_user_read_access_begin((from));
else if (!user_read_access_begin(from, sizeof(*from)))
return -EFAULT;
unsafe_get_user(val, from, Efault);
user_read_access_end();
return 0;
Efault:
user_read_access_end();
return -EFAULT;
which can be replaced with scopes and automatic cleanup:
scoped_user_read_access(from, Efault)
unsafe_get_user(val, from, Efault);
return 0;
Efault:
return -EFAULT;
- Convert code which implements the above pattern over to
scope_user.*.access(). This also corrects a couple of imbalanced
masked_*_begin() instances which are harmless on most
architectures, but prevent PowerPC from implementing the masking
optimization.
- Add a missing speculation barrier in copy_from_user_iter()"
* tag 'core-uaccess-2025-11-30' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
lib/strn*,uaccess: Use masked_user_{read/write}_access_begin when required
scm: Convert put_cmsg() to scoped user access
iov_iter: Add missing speculation barrier to copy_from_user_iter()
iov_iter: Convert copy_from_user_iter() to masked user access
select: Convert to scoped user access
x86/futex: Convert to scoped user access
futex: Convert to get/put_user_inline()
uaccess: Provide put/get_user_inline()
uaccess: Provide scoped user access regions
arm64: uaccess: Use unsafe wrappers for ASM GOTO
s390/uaccess: Use unsafe wrappers for ASM GOTO
riscv/uaccess: Use unsafe wrappers for ASM GOTO
powerpc/uaccess: Use unsafe wrappers for ASM GOTO
x86/uaccess: Use unsafe wrappers for ASM GOTO
uaccess: Provide ASM GOTO safe wrappers for unsafe_*_user()
ARM: uaccess: Implement missing __get_user_asm_dword()
- Improve WARN(), which has vararg printf like arguments,
to work with the x86 #UD based WARN-optimizing infrastructure
by hiding the format in the bug_table and replacing this
first argument with the address of the bug-table entry,
while making the actual function that's called a UD1 instruction.
(Peter Zijlstra)
- Introduce the CONFIG_DEBUG_BUGVERBOSE_DETAILED Kconfig switch
(Ingo Molnar, s390 support by Heiko Carstens)
Fixes and cleanups:
- bugs/s390: Remove private WARN_ON() implementation (Heiko Carstens)
- <asm/bugs.h>: Make i386 use GENERIC_BUG_RELATIVE_POINTERS
(Peter Zijlstra)
Signed-off-by: Ingo Molnar <mingo@kernel.org>
-----BEGIN PGP SIGNATURE-----
iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAmktffcRHG1pbmdvQGtl
cm5lbC5vcmcACgkQEnMQ0APhK1gLCQ//ZMHTpv6UF+xY4Kq3rdVFJWRR7Prs2hDn
4V/cFHu7Xb2lRwpJDBNWnbiEkLbGR5qlWD+0CpAMK4mGuL9VjnoeflEzXxOPP9W1
7SWw1HDc6NkjHM3BNnvkPQbzWF4RL4ZGs4Lbb1nv7pjoTSdBMXNrD5RL+iNmfHHd
QfOG9ZiRyD5A/b07bfyjIXNaeC/Hot+FeVXTMfD7/vCfc2ywhL2Sm5G/igY/shTY
Una7q8sbFDD/bFFtWSR2eFQeHQQfT6c/Pu39ZcAIdTlLk1FtZ+A2wcRtwYv/FdPV
6KDOAxZK7fLMHoCdNTswsg+LbazhABOb/V1J7TaHq2tF/PeyN+B+ucxVY2KFcxQJ
V5eG5crMrUIL22QO/UaT3dPWxGbJYkNlAWl416tAKdgXA52W4OsPd+k4DGjeP569
KogAy3SY0D/80v1QN2HcFEJMvr7W2SukxtErqdtA5OKt/ZevB49lGqZoBecqASDO
QjI1K0yLKnb+erbMIpCpNj+o/Fr1JQgWbYVpwipL2GON4an6vrowimGTsUG7qqxN
Hwb7IaTNnWQ/4iyCkVV44q6Ln1gyP2hz5Lnzo7QJINUdzp98UjJLAEK4NRG44T4L
p0t5NMxnvREpAv35rwy03xhmITYeQMOqzN9JEBBvegQyld5ghbThxI7g9BDzcXyL
Bd0mF7WOV9M=
=bMUL
-----END PGP SIGNATURE-----
Merge tag 'core-bugs-2025-12-01' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull bug handling infrastructure updates from Ingo Molnar:
"Core updates:
- Improve WARN(), which has vararg printf like arguments, to work
with the x86 #UD based WARN-optimizing infrastructure by hiding the
format in the bug_table and replacing this first argument with the
address of the bug-table entry, while making the actual function
that's called a UD1 instruction (Peter Zijlstra)
- Introduce the CONFIG_DEBUG_BUGVERBOSE_DETAILED Kconfig switch (Ingo
Molnar, s390 support by Heiko Carstens)
Fixes and cleanups:
- bugs/s390: Remove private WARN_ON() implementation (Heiko Carstens)
- <asm/bugs.h>: Make i386 use GENERIC_BUG_RELATIVE_POINTERS (Peter
Zijlstra)"
* tag 'core-bugs-2025-12-01' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (31 commits)
x86/bugs: Make i386 use GENERIC_BUG_RELATIVE_POINTERS
x86/bug: Fix BUG_FORMAT vs KASLR
x86_64/bug: Inline the UD1
x86/bug: Implement WARN_ONCE()
x86_64/bug: Implement __WARN_printf()
x86/bug: Use BUG_FORMAT for DEBUG_BUGVERBOSE_DETAILED
x86/bug: Add BUG_FORMAT basics
bug: Allow architectures to provide __WARN_printf()
bug: Implement WARN_ON() using __WARN_FLAGS()
bug: Add report_bug_entry()
bug: Add BUG_FORMAT_ARGS infrastructure
bug: Clean up CONFIG_GENERIC_BUG_RELATIVE_POINTERS
bug: Add BUG_FORMAT infrastructure
x86: Rework __bug_table helpers
bugs/s390: Remove private WARN_ON() implementation
bugs/core: Reorganize fields in the first line of WARNING output, add ->comm[] output
bugs/sh: Concatenate 'cond_str' with '__FILE__' in __WARN_FLAGS(), to extend WARN_ON/BUG_ON output
bugs/parisc: Concatenate 'cond_str' with '__FILE__' in __WARN_FLAGS(), to extend WARN_ON/BUG_ON output
bugs/riscv: Concatenate 'cond_str' with '__FILE__' in __BUG_FLAGS(), to extend WARN_ON/BUG_ON output
bugs/riscv: Pass in 'cond_str' to __BUG_FLAGS()
...
- klp-build livepatch module generation (Josh Poimboeuf)
Introduce new objtool features and a klp-build
script to generate livepatch modules using a
source .patch as input.
This builds on concepts from the longstanding out-of-tree
kpatch project which began in 2012 and has been used for
many years to generate livepatch modules for production kernels.
However, this is a complete rewrite which incorporates
hard-earned lessons from 12+ years of maintaining kpatch.
Key improvements compared to kpatch-build:
- Integrated with objtool: Leverages objtool's existing control-flow
graph analysis to help detect changed functions.
- Works on vmlinux.o: Supports late-linked objects, making it
compatible with LTO, IBT, and similar.
- Simplified code base: ~3k fewer lines of code.
- Upstream: No more out-of-tree #ifdef hacks, far less cruft.
- Cleaner internals: Vastly simplified logic for symbol/section/reloc
inclusion and special section extraction.
- Robust __LINE__ macro handling: Avoids false positive binary diffs
caused by the __LINE__ macro by introducing a fix-patch-lines script
which injects #line directives into the source .patch to preserve
the original line numbers at compile time.
- Disassemble code with libopcodes instead of running objdump
(Alexandre Chartre)
- Disassemble support (-d option to objtool) by Alexandre Chartre,
which supports the decoding of various Linux kernel code generation
specials such as alternatives:
17ef: sched_balance_find_dst_group+0x62f mov 0x34(%r9),%edx
17f3: sched_balance_find_dst_group+0x633 | <alternative.17f3> | X86_FEATURE_POPCNT
17f3: sched_balance_find_dst_group+0x633 | call 0x17f8 <__sw_hweight64> | popcnt %rdi,%rax
17f8: sched_balance_find_dst_group+0x638 cmp %eax,%edx
... jump table alternatives:
1895: sched_use_asym_prio+0x5 test $0x8,%ch
1898: sched_use_asym_prio+0x8 je 0x18a9 <sched_use_asym_prio+0x19>
189a: sched_use_asym_prio+0xa | <jump_table.189a> | JUMP
189a: sched_use_asym_prio+0xa | jmp 0x18ae <sched_use_asym_prio+0x1e> | nop2
189c: sched_use_asym_prio+0xc mov $0x1,%eax
18a1: sched_use_asym_prio+0x11 and $0x80,%ecx
... exception table alternatives:
native_read_msr:
5b80: native_read_msr+0x0 mov %edi,%ecx
5b82: native_read_msr+0x2 | <ex_table.5b82> | EXCEPTION
5b82: native_read_msr+0x2 | rdmsr | resume at 0x5b84 <native_read_msr+0x4>
5b84: native_read_msr+0x4 shl $0x20,%rdx
.... x86 feature flag decoding (also see the X86_FEATURE_POPCNT
example in sched_balance_find_dst_group() above):
2faaf: start_thread_common.constprop.0+0x1f jne 0x2fba4 <start_thread_common.constprop.0+0x114>
2fab5: start_thread_common.constprop.0+0x25 | <alternative.2fab5> | X86_FEATURE_ALWAYS | X86_BUG_NULL_SEG
2fab5: start_thread_common.constprop.0+0x25 | jmp 0x2faba <.altinstr_aux+0x2f4> | jmp 0x4b0 <start_thread_common.constprop.0+0x3f> | nop5
2faba: start_thread_common.constprop.0+0x2a mov $0x2b,%eax
... NOP sequence shortening:
1048e2: snapshot_write_finalize+0xc2 je 0x104917 <snapshot_write_finalize+0xf7>
1048e4: snapshot_write_finalize+0xc4 nop6
1048ea: snapshot_write_finalize+0xca nop11
1048f5: snapshot_write_finalize+0xd5 nop11
104900: snapshot_write_finalize+0xe0 mov %rax,%rcx
104903: snapshot_write_finalize+0xe3 mov 0x10(%rdx),%rax
... and much more.
- Function validation tracing support (Alexandre Chartre)
- Various -ffunction-sections fixes (Josh Poimboeuf)
- Clang AutoFDO (Automated Feedback-Directed Optimizations) support (Josh Poimboeuf)
- Misc fixes and cleanups (Borislav Petkov, Chen Ni,
Dylan Hatch, Ingo Molnar, John Wang, Josh Poimboeuf,
Pankaj Raghav, Peter Zijlstra, Thorsten Blum)
Signed-off-by: Ingo Molnar <mingo@kernel.org>
-----BEGIN PGP SIGNATURE-----
iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAmktavcRHG1pbmdvQGtl
cm5lbC5vcmcACgkQEnMQ0APhK1j3IhAAvc9tRV8SJcohim6DrkPGxCN/S80uzt5S
q8v1x5tBzMYmUxftfpoLsPCri6Ww0jprNuhnbRCvWAzXFuW79HWBNdVkEO7V/cym
OsCKQv3r0mWv5UXP3o8VM5K3tnU61wOAIx3yZCz5XKWeOg6NPXBJCSGWYpLuA7z0
1wUWAXuHgmj4RHMlHu5x0FZnSqGU3/TkUDGAqdxrY+myhdwm0Ul+dSwWGQdQjCgA
59Y/gDsWWEe5BVL56suwKZ1e+8UFnpbncbWkjELD6euJpYpDSNMOW/S6PYqOOz5M
rjMv06XIX5ma7QQbF5fMG/sXW64tZtc090UocDnx7hpDq9mLEyNNkXsqRQlmd8Wt
wG19IaeWo8aG9DTQkiv8OhtmssPKZHJsVjRUvXGnjktvxnsYSomgOT1lNme38dJD
X9jHgZCFMdPsQmG0dp00Y0oejfTChqIDef7qSpYwT96R7l9VQQF7K7AxfJwSeLGO
3hClZ0Gz/u9NiJTUUWTxUmR+YEy+1xIeaQSDq6t4JRtNJaMZlcevfVW+F2Lm04XH
9eSeF7bJS2XKrlLHVdPgWCGZOmee+ghdQ7svsyEGpzdzaAZ7UveTucHJ9CvW3Fft
Dcrl8rxX2NiD2PLz03HCHR/JVUDc3W3Exrer1TD8PD4LcZhFoBEGQbZ/gFlkBTxb
TOcemtJT03U=
=yPrS
-----END PGP SIGNATURE-----
Merge tag 'objtool-core-2025-12-01' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull objtool updates from Ingo Molnar:
- klp-build livepatch module generation (Josh Poimboeuf)
Introduce new objtool features and a klp-build script to generate
livepatch modules using a source .patch as input.
This builds on concepts from the longstanding out-of-tree kpatch
project which began in 2012 and has been used for many years to
generate livepatch modules for production kernels. However, this is a
complete rewrite which incorporates hard-earned lessons from 12+
years of maintaining kpatch.
Key improvements compared to kpatch-build:
- Integrated with objtool: Leverages objtool's existing control-flow
graph analysis to help detect changed functions.
- Works on vmlinux.o: Supports late-linked objects, making it
compatible with LTO, IBT, and similar.
- Simplified code base: ~3k fewer lines of code.
- Upstream: No more out-of-tree #ifdef hacks, far less cruft.
- Cleaner internals: Vastly simplified logic for
symbol/section/reloc inclusion and special section extraction.
- Robust __LINE__ macro handling: Avoids false positive binary diffs
caused by the __LINE__ macro by introducing a fix-patch-lines
script which injects #line directives into the source .patch to
preserve the original line numbers at compile time.
- Disassemble code with libopcodes instead of running objdump
(Alexandre Chartre)
- Disassemble support (-d option to objtool) by Alexandre Chartre,
which supports the decoding of various Linux kernel code generation
specials such as alternatives:
17ef: sched_balance_find_dst_group+0x62f mov 0x34(%r9),%edx
17f3: sched_balance_find_dst_group+0x633 | <alternative.17f3> | X86_FEATURE_POPCNT
17f3: sched_balance_find_dst_group+0x633 | call 0x17f8 <__sw_hweight64> | popcnt %rdi,%rax
17f8: sched_balance_find_dst_group+0x638 cmp %eax,%edx
... jump table alternatives:
1895: sched_use_asym_prio+0x5 test $0x8,%ch
1898: sched_use_asym_prio+0x8 je 0x18a9 <sched_use_asym_prio+0x19>
189a: sched_use_asym_prio+0xa | <jump_table.189a> | JUMP
189a: sched_use_asym_prio+0xa | jmp 0x18ae <sched_use_asym_prio+0x1e> | nop2
189c: sched_use_asym_prio+0xc mov $0x1,%eax
18a1: sched_use_asym_prio+0x11 and $0x80,%ecx
... exception table alternatives:
native_read_msr:
5b80: native_read_msr+0x0 mov %edi,%ecx
5b82: native_read_msr+0x2 | <ex_table.5b82> | EXCEPTION
5b82: native_read_msr+0x2 | rdmsr | resume at 0x5b84 <native_read_msr+0x4>
5b84: native_read_msr+0x4 shl $0x20,%rdx
.... x86 feature flag decoding (also see the X86_FEATURE_POPCNT
example in sched_balance_find_dst_group() above):
2faaf: start_thread_common.constprop.0+0x1f jne 0x2fba4 <start_thread_common.constprop.0+0x114>
2fab5: start_thread_common.constprop.0+0x25 | <alternative.2fab5> | X86_FEATURE_ALWAYS | X86_BUG_NULL_SEG
2fab5: start_thread_common.constprop.0+0x25 | jmp 0x2faba <.altinstr_aux+0x2f4> | jmp 0x4b0 <start_thread_common.constprop.0+0x3f> | nop5
2faba: start_thread_common.constprop.0+0x2a mov $0x2b,%eax
... NOP sequence shortening:
1048e2: snapshot_write_finalize+0xc2 je 0x104917 <snapshot_write_finalize+0xf7>
1048e4: snapshot_write_finalize+0xc4 nop6
1048ea: snapshot_write_finalize+0xca nop11
1048f5: snapshot_write_finalize+0xd5 nop11
104900: snapshot_write_finalize+0xe0 mov %rax,%rcx
104903: snapshot_write_finalize+0xe3 mov 0x10(%rdx),%rax
... and much more.
- Function validation tracing support (Alexandre Chartre)
- Various -ffunction-sections fixes (Josh Poimboeuf)
- Clang AutoFDO (Automated Feedback-Directed Optimizations) support
(Josh Poimboeuf)
- Misc fixes and cleanups (Borislav Petkov, Chen Ni, Dylan Hatch, Ingo
Molnar, John Wang, Josh Poimboeuf, Pankaj Raghav, Peter Zijlstra,
Thorsten Blum)
* tag 'objtool-core-2025-12-01' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (129 commits)
objtool: Fix segfault on unknown alternatives
objtool: Build with disassembly can fail when including bdf.h
objtool: Trim trailing NOPs in alternative
objtool: Add wide output for disassembly
objtool: Compact output for alternatives with one instruction
objtool: Improve naming of group alternatives
objtool: Add Function to get the name of a CPU feature
objtool: Provide access to feature and flags of group alternatives
objtool: Fix address references in alternatives
objtool: Disassemble jump table alternatives
objtool: Disassemble exception table alternatives
objtool: Print addresses with alternative instructions
objtool: Disassemble group alternatives
objtool: Print headers for alternatives
objtool: Preserve alternatives order
objtool: Add the --disas=<function-pattern> action
objtool: Do not validate IBT for .return_sites and .call_sites
objtool: Improve tracing of alternative instructions
objtool: Add functions to better name alternatives
objtool: Identify the different types of alternatives
...
Add __kfifo_alloc_node() by refactoring and reusing __kfifo_alloc(),
and define kfifo_alloc_node() macro to support NUMA-aware memory
allocation.
The new __kfifo_alloc_node() function accepts a NUMA node parameter
and uses kmalloc_array_node() instead of kmalloc_array() for
node-specific allocation. The existing __kfifo_alloc() now calls
__kfifo_alloc_node() with NUMA_NO_NODE to maintain backward
compatibility.
This enables users to allocate kfifo buffers on specific NUMA nodes,
which is important for performance in NUMA systems where the kfifo
will be primarily accessed by threads running on specific nodes.
Cc: Stefani Seibold <stefani@seibold.net>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Currently the KHO test only prints a message on success, and remains
silent on failure. This makes it difficult to notice a failing test. A
failing test is usually more interesting than a successful one.
Always print the test status after attempting restore.
Link: https://lkml.kernel.org/r/20251118181046.23321-1-pratyush@kernel.org
Signed-off-by: Pratyush Yadav <pratyush@kernel.org>
Reviewed-by: Pasha Tatashin <pasha.tatashin@soleen.com>
Reviewed-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Acked-by: SeongJae Park <sj@kerneel.org>
Cc: Alexander Graf <graf@amazon.com>
Cc: Pratyush Yadav <pratyush@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
If there is an error half way through KHO memory preservation, we should
rollback and unpreserve everything that is partially preserved.
Link: https://lkml.kernel.org/r/20251101142325.1326536-6-pasha.tatashin@soleen.com
Co-developed-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Signed-off-by: Pasha Tatashin <pasha.tatashin@soleen.com>
Cc: Alexander Graf <graf@amazon.com>
Cc: Changyuan Lyu <changyuanl@google.com>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Jason Gunthorpe <jgg@nvidia.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Masahiro Yamada <masahiroy@kernel.org>
Cc: Miguel Ojeda <ojeda@kernel.org>
Cc: Pratyush Yadav <pratyush@kernel.org>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Simon Horman <horms@kernel.org>
Cc: Tejun Heo <tj@kernel.org>
Cc: Zhu Yanjun <yanjun.zhu@linux.dev>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
The KHO framework uses a notifier chain as the mechanism for clients to
participate in the finalization process. While this works for a single,
central state machine, it is too restrictive for kernel-internal
components like pstore/reserve_mem or IMA. These components need a
simpler, direct way to register their state for preservation (e.g., during
their initcall) without being part of a complex, shutdown-time notifier
sequence. The notifier model forces all participants into a single
finalization flow and makes direct preservation from an arbitrary context
difficult. This patch refactors the client participation model by
removing the notifier chain and introducing a direct API for managing FDT
subtrees.
The core kho_finalize() and kho_abort() state machine remains, but clients
now register their data with KHO beforehand.
Link: https://lkml.kernel.org/r/20251101142325.1326536-3-pasha.tatashin@soleen.com
Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Co-developed-by: Pasha Tatashin <pasha.tatashin@soleen.com>
Signed-off-by: Pasha Tatashin <pasha.tatashin@soleen.com>
Cc: Alexander Graf <graf@amazon.com>
Cc: Changyuan Lyu <changyuanl@google.com>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Jason Gunthorpe <jgg@nvidia.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Masahiro Yamada <masahiroy@kernel.org>
Cc: Miguel Ojeda <ojeda@kernel.org>
Cc: Pratyush Yadav <pratyush@kernel.org>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Simon Horman <horms@kernel.org>
Cc: Tejun Heo <tj@kernel.org>
Cc: Zhu Yanjun <yanjun.zhu@linux.dev>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
This is a very small function, inlining it saves cpu cycles in TCP by
reducing register pressure and removing call/ret overhead.
It also reduces vmlinux text size by 122 bytes on a typical x86_64 build.
Before:
size vmlinux
text data bss dec hex filename
34811781 22177365 5685248 62674394 3bc55da vmlinux
After:
size vmlinux
text data bss dec hex filename
34811659 22177365 5685248 62674272 3bc5560 vmlinux
[ojeda@kernel.org: fix rust build]
Link: https://lkml.kernel.org/r/20251120085518.1463498-1-ojeda@kernel.org
Link: https://lkml.kernel.org/r/20251114140646.3817319-3-edumazet@google.com
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Miguel Ojeda <ojeda@kernel.org>
Reviewed-by: Kuan-Wei Chiu <visitorckw@gmail.com>
Cc: Jakub Kacinski <kuba@kernel.org>
Cc: Neal Cardwell <ncardwell@google.com>
Cc: Paolo Abeni <pabeni@redhat.com>
Cc: Alice Ryhl <aliceryhl@google.com>
Cc: Stehen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Patch series "rbree: inline rb_first() and rb_last()".
Inline these two small helpers, heavily used in TCP and FQ packet scheduler,
and in many other places.
This reduces kernel text size, and brings an 1.5 % improvement on network
TCP stress test.
This patch (of 2):
This is a very small function, inlining it saves cpu cycles by reducing
register pressure and removing call/ret overhead.
It also reduces vmlinux text size by 744 bytes on a typical x86_64 build.
Before:
size vmlinux
text data bss dec hex filename
34812525 22177365 5685248 62675138 3bc58c2 vmlinux
After:
size vmlinux
text data bss dec hex filename
34811781 22177365 5685248 62674394 3bc55da vmlinux
[ojeda@kernel.org: fix rust build]
Link: https://lkml.kernel.org/r/20251120085518.1463498-1-ojeda@kernel.org
Link: https://lkml.kernel.org/r/20251114140646.3817319-1-edumazet@google.com
Link: https://lkml.kernel.org/r/20251114140646.3817319-2-edumazet@google.com
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Miguel Ojeda <ojeda@kernel.org>
Reviewed-by: Kuan-Wei Chiu <visitorckw@gmail.com>
Cc: Jakub Kacinski <kuba@kernel.org>
Cc: Neal Cardwell <ncardwell@google.com>
Cc: Paolo Abeni <pabeni@redhat.com>
Cc: Alice Ryhl <aliceryhl@google.com>
Cc: Stehen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
ccache:
Add a compatible for the pic64gx SoC. No driver change needed, as it
falls back to the PolarFire SoC.
hisi hha/generic cpu cache maintenance:
Add support for a non-architectural mechanism for invalidating memory
regions, needed for some cxl implementations on arm64 (and probably
elsewhere in the future). The HiSilicon Hydra Home Agent is the first
driver to provide this support.
Signed-off-by: Conor Dooley <conor.dooley@microchip.com>
-----BEGIN PGP SIGNATURE-----
iHUEABYKAB0WIQRh246EGq/8RLhDjO14tDGHoIJi0gUCaSiFRQAKCRB4tDGHoIJi
0glXAQDJ0NsReTniO9TgJkzw05oWwCKoOL4MadxBM/4MRLJXyQD/YWW09btaYxTZ
fDVHpb/P2BDD5qNwaXkONMIeoU/iHAo=
=zk03
-----END PGP SIGNATURE-----
gpgsig -----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEo6/YBQwIrVS28WGKmmx57+YAGNkFAmkoyhIACgkQmmx57+YA
GNla4xAAw7gN+w/AIBFzh50ZT1DFxIaey67k3OGNEnictgWA8IOYu28brmqv04dT
vqppDeviw/fT6MC9A1qykRS8KVPbD1UAfD9w3akQsN6Kzb1h+cjhhv2qw4ZsWbjH
XcRj0ZeraL/3Pc7ecTiRvJ3ZebvzPeQNOVmkgaEWZusAsTPPzj+riNg0+LwazHrw
BDTLXgDZel/lyt8AIkLWWmzUUnr57VO22ZY1/seT4i161+2XjCaECbsCQzV6rTmo
EsYsQWf7G8jX5c2DxjVEkmPGVLJMBclvSgtfLABF/Co2rvVFZbdMjkCkHcySmWq5
KcvXsF+hRcHLC2kZ5jmxpyc0rlChYNGNgJUqJGSKM/sQRQmM0zsAGRZMiEb5ZEF5
xXNYAKVkFVAiIo+zw2ph334GIgv2tU5KB2WrNJu6kxqT7qQGZXO44Bg+1jmc6Q9n
efZOTAqOfyeT0pwVpdp8ZJfp/mwLceehFhb8u3OUrkM/xiJlvMFSY3cV568bR2z4
XYvEw5UEJybWPgsm5qXUMPgjPqd8nK/UFAwGSJ7zK6p/EM1B3rrNGo2Kdz2AJwxZ
/c4lSpr6Nz0MMQJpCfET9eet8fsAJi7UYopDDdNNTSYgiZ+1ZkUAffQX5wBQpGMI
czzuDEHwsLhKFPyQ2bqTYs77YS2qiVaHuKtBABkKYvoNodl8Tik=
=wip1
-----END PGP SIGNATURE-----
Merge tag 'cache-for-v6.19' of https://git.kernel.org/pub/scm/linux/kernel/git/conor/linux into soc/drivers-late
standalone cache drivers for v6.19
ccache:
Add a compatible for the pic64gx SoC. No driver change needed, as it
falls back to the PolarFire SoC.
hisi hha/generic cpu cache maintenance:
Add support for a non-architectural mechanism for invalidating memory
regions, needed for some cxl implementations on arm64 (and probably
elsewhere in the future). The HiSilicon Hydra Home Agent is the first
driver to provide this support.
Signed-off-by: Conor Dooley <conor.dooley@microchip.com>
* tag 'cache-for-v6.19' of https://git.kernel.org/pub/scm/linux/kernel/git/conor/linux:
MAINTAINERS: refer to intended file in STANDALONE CACHE CONTROLLER DRIVERS
cache: Support cache maintenance for HiSilicon SoC Hydra Home Agent
cache: Make top level Kconfig menu a boolean dependent on RISCV
MAINTAINERS: Add Jonathan Cameron to drivers/cache and add lib/cache_maint.c + header
arm64: Select GENERIC_CPU_CACHE_MAINTENANCE
lib: Support ARCH_HAS_CPU_CACHE_INVALIDATE_MEMREGION
memregion: Support fine grained invalidate by cpu_cache_invalidate_memregion()
memregion: Drop unused IORES_DESC_* parameter from cpu_cache_invalidate_memregion()
dt-bindings: cache: sifive,ccache0: add a pic64gx compatible
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
fill_pool_map is used to suppress nesting violations caused by acquiring
a spinlock_t (from within the memory allocator) while holding a
raw_spinlock_t. The used annotation is wrong.
LD_WAIT_SLEEP is for always sleeping lock types such as mutex_t.
LD_WAIT_CONFIG is for lock type which are sleeping while spinning on
PREEMPT_RT such as spinlock_t.
Use LD_WAIT_CONFIG as override.
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://patch.msgid.link/20251127153652.291697-3-bigeasy@linutronix.de
The pool of free objects is refilled on several occasions such as object
initialisation. On PREEMPT_RT refilling is limited to preemptible
sections due to sleeping locks used by the memory allocator. The system
boots with disabled interrupts so the pool can not be refilled.
If too many objects are initialized and the pool gets empty then
debugobjects disables itself.
Refiling can also happen early in the boot with disabled interrupts as
long as the scheduler is not operational. If the scheduler can not
preempt a task then a sleeping lock can not be contended.
Allow to additionally refill the pool if the scheduler is not
operational.
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://patch.msgid.link/20251127153652.291697-2-bigeasy@linutronix.de
debugfs access modes were added in Linux 5.10 (Dec 2020) [1], but the
no-mount mode has behaved effectively the same as the off mode since
Linux 5.12 (Apr 2021) [2]. The only difference is the specific error
code returned by the debugfs_create_* functions, which is -ENOENT in
no-mount mode and -EPERM in off mode.
Given that no-mount hasn't worked for several years with no complaints,
just remove it.
[1] a24c6f7bc9 ("debugfs: Add access restriction option")
[2] bc6de804d3 ("debugfs: be more robust at handling improper input in debugfs_lookup()")
56348560d4 ("debugfs: do not attempt to create a new file before the filesystem is initalized")
Signed-off-by: Aaron Thompson <dev@aaront.org>
Link: https://patch.msgid.link/20251120102222.18371-3-dev@null.aaront.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Encoding a relative NULL pointer doesn't work for KASLR, when the
whole kernel image gets shifted, the __bug_table and the target string
get shifted by the same amount and the relative offset is preserved.
However when the target is an absolute 0 value and the __bug_table
gets moved about, the end result in a pointer equivalent to
kaslr_offset(), not NULL.
Notably, this will generate SHN_UNDEF relocations, and Ard would
really like to not have those at all.
Use the empty string to denote no-string.
Suggested-by: Linus Torvalds <torvalds@linuxfoundation.org>
Cc: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Merge series from Johan Hovold <johan@kernel.org>:
This series fixes device and OF node reference leaks during probe and
a clock prepare imbalance on probe failures.
Included is a related cleanup of an error path.
That was already the limit with KASAN enabled, and the 32-bit x86 build
ends up having a couple of drm cases that have stack frames _just_ over
1kB on my allmodconfig test. So the minimal fix for this build issue
for now is to just bump the limit and make it independent of KASAN.
[ Side note: XTENSA already used 1.5k and PARISC uses 2k, so 1280 is
still relatively conservative ]
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
dmirror_device_init() calls device_initialize() which sets the device
reference count to 1, but fails to call put_device() when error occurs
after dev_set_name() or cdev_device_add() failures. This results in
memory leaks of struct device objects. Additionally,
dmirror_device_remove() lacks the final put_device() call to properly
release the device reference.
Found by code review.
Link: https://lkml.kernel.org/r/20251108115346.6368-1-make24@iscas.ac.cn
Fixes: 6a760f58c7 ("mm/hmm/test: use char dev with struct device to get device node")
Signed-off-by: Ma Ke <make24@iscas.ac.cn>
Cc: Haoxiang Li <make24@iscas.ac.cn>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Leon Romanovsky <leon@kernel.org>
Cc: Mika Penttilä <mpenttil@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Add HMM_DMIRROR_FLAG_FAIL_ALLOC flag to simulate large page allocation
failures, enabling testing of split migration code paths.
This test flag allows validation of the fallback behavior when destination
device cannot allocate compound pages. This is useful for testing the
split migration functionality.
Link: https://lkml.kernel.org/r/20251001065707.920170-13-balbirs@nvidia.com
Signed-off-by: Balbir Singh <balbirs@nvidia.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Zi Yan <ziy@nvidia.com>
Cc: Joshua Hahn <joshua.hahnjy@gmail.com>
Cc: Rakie Kim <rakie.kim@sk.com>
Cc: Byungchul Park <byungchul@sk.com>
Cc: Gregory Price <gourry@gourry.net>
Cc: Ying Huang <ying.huang@linux.alibaba.com>
Cc: Alistair Popple <apopple@nvidia.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: "Liam R. Howlett" <Liam.Howlett@oracle.com>
Cc: Nico Pache <npache@redhat.com>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: Dev Jain <dev.jain@arm.com>
Cc: Barry Song <baohua@kernel.org>
Cc: Lyude Paul <lyude@redhat.com>
Cc: Danilo Krummrich <dakr@kernel.org>
Cc: David Airlie <airlied@gmail.com>
Cc: Simona Vetter <simona@ffwll.ch>
Cc: Ralph Campbell <rcampbell@nvidia.com>
Cc: Mika Penttilä <mpenttil@redhat.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Francois Dugast <francois.dugast@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Implement migrate_vma_split_pages() to handle THP splitting during the
migration process when destination cannot allocate compound pages.
This addresses the common scenario where migrate_vma_setup() succeeds with
MIGRATE_PFN_COMPOUND pages, but the destination device cannot allocate
large pages during the migration phase.
Key changes:
- migrate_vma_split_pages(): Split already-isolated pages during migration
- Enhanced folio_split() and __split_unmapped_folio() with isolated
parameter to avoid redundant unmap/remap operations
This provides a fallback mechansim to ensure migration succeeds even when
large page allocation fails at the destination.
[matthew.brost@intel.com: add THP splitting during migration]
Link: https://lkml.kernel.org/r/20251120230825.181072-2-matthew.brost@intel.com
Link: https://lkml.kernel.org/r/20251001065707.920170-12-balbirs@nvidia.com
Signed-off-by: Balbir Singh <balbirs@nvidia.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Zi Yan <ziy@nvidia.com>
Cc: Joshua Hahn <joshua.hahnjy@gmail.com>
Cc: Rakie Kim <rakie.kim@sk.com>
Cc: Byungchul Park <byungchul@sk.com>
Cc: Gregory Price <gourry@gourry.net>
Cc: Ying Huang <ying.huang@linux.alibaba.com>
Cc: Alistair Popple <apopple@nvidia.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: "Liam R. Howlett" <Liam.Howlett@oracle.com>
Cc: Nico Pache <npache@redhat.com>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: Dev Jain <dev.jain@arm.com>
Cc: Barry Song <baohua@kernel.org>
Cc: Lyude Paul <lyude@redhat.com>
Cc: Danilo Krummrich <dakr@kernel.org>
Cc: David Airlie <airlied@gmail.com>
Cc: Simona Vetter <simona@ffwll.ch>
Cc: Ralph Campbell <rcampbell@nvidia.com>
Cc: Mika Penttilä <mpenttil@redhat.com>
Cc: Francois Dugast <francois.dugast@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
When a zone device page is split (via huge pmd folio split). The driver
callback for folio_split is invoked to let the device driver know that the
folio size has been split into a smaller order.
Provide a default implementation for drivers that do not provide this
callback that copies the pgmap and mapping fields for the split folios.
Update the HMM test driver to handle the split.
Link: https://lkml.kernel.org/r/20251001065707.920170-11-balbirs@nvidia.com
Signed-off-by: Balbir Singh <balbirs@nvidia.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Zi Yan <ziy@nvidia.com>
Cc: Joshua Hahn <joshua.hahnjy@gmail.com>
Cc: Rakie Kim <rakie.kim@sk.com>
Cc: Byungchul Park <byungchul@sk.com>
Cc: Gregory Price <gourry@gourry.net>
Cc: Ying Huang <ying.huang@linux.alibaba.com>
Cc: Alistair Popple <apopple@nvidia.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: "Liam R. Howlett" <Liam.Howlett@oracle.com>
Cc: Nico Pache <npache@redhat.com>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: Dev Jain <dev.jain@arm.com>
Cc: Barry Song <baohua@kernel.org>
Cc: Lyude Paul <lyude@redhat.com>
Cc: Danilo Krummrich <dakr@kernel.org>
Cc: David Airlie <airlied@gmail.com>
Cc: Simona Vetter <simona@ffwll.ch>
Cc: Ralph Campbell <rcampbell@nvidia.com>
Cc: Mika Penttilä <mpenttil@redhat.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Francois Dugast <francois.dugast@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Enhance the hmm test driver (lib/test_hmm) with support for THP pages.
A new pool of free_folios() has now been added to the dmirror device,
which can be allocated when a request for a THP zone device private page
is made.
Add compound page awareness to the allocation function during normal
migration and fault based migration. These routines also copy
folio_nr_pages() when moving data between system memory and device memory.
args.src and args.dst used to hold migration entries are now dynamically
allocated (as they need to hold HPAGE_PMD_NR entries or more).
Split and migrate support will be added in future patches in this series.
Link: https://lkml.kernel.org/r/20251001065707.920170-10-balbirs@nvidia.com
Signed-off-by: Balbir Singh <balbirs@nvidia.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Zi Yan <ziy@nvidia.com>
Cc: Joshua Hahn <joshua.hahnjy@gmail.com>
Cc: Rakie Kim <rakie.kim@sk.com>
Cc: Byungchul Park <byungchul@sk.com>
Cc: Gregory Price <gourry@gourry.net>
Cc: Ying Huang <ying.huang@linux.alibaba.com>
Cc: Alistair Popple <apopple@nvidia.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: "Liam R. Howlett" <Liam.Howlett@oracle.com>
Cc: Nico Pache <npache@redhat.com>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: Dev Jain <dev.jain@arm.com>
Cc: Barry Song <baohua@kernel.org>
Cc: Lyude Paul <lyude@redhat.com>
Cc: Danilo Krummrich <dakr@kernel.org>
Cc: David Airlie <airlied@gmail.com>
Cc: Simona Vetter <simona@ffwll.ch>
Cc: Ralph Campbell <rcampbell@nvidia.com>
Cc: Mika Penttilä <mpenttil@redhat.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Francois Dugast <francois.dugast@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Change page_free to folio_free to make the folio support for
zone device-private more consistent. The PCI P2PDMA callback
has also been updated and changed to folio_free() as a result.
For drivers that do not support folios (yet), the folio is
converted back into page via &folio->page and the page is used
as is, in the current callback implementation.
Link: https://lkml.kernel.org/r/20251001065707.920170-3-balbirs@nvidia.com
Signed-off-by: Balbir Singh <balbirs@nvidia.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Zi Yan <ziy@nvidia.com>
Cc: Joshua Hahn <joshua.hahnjy@gmail.com>
Cc: Rakie Kim <rakie.kim@sk.com>
Cc: Byungchul Park <byungchul@sk.com>
Cc: Gregory Price <gourry@gourry.net>
Cc: Ying Huang <ying.huang@linux.alibaba.com>
Cc: Alistair Popple <apopple@nvidia.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: "Liam R. Howlett" <Liam.Howlett@oracle.com>
Cc: Nico Pache <npache@redhat.com>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: Dev Jain <dev.jain@arm.com>
Cc: Barry Song <baohua@kernel.org>
Cc: Lyude Paul <lyude@redhat.com>
Cc: Danilo Krummrich <dakr@kernel.org>
Cc: David Airlie <airlied@gmail.com>
Cc: Simona Vetter <simona@ffwll.ch>
Cc: Ralph Campbell <rcampbell@nvidia.com>
Cc: Mika Penttilä <mpenttil@redhat.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Francois Dugast <francois.dugast@intel.com>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: Felix Kuehling <Felix.Kuehling@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: "Christian König" <christian.koenig@amd.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Patch series "mm: support device-private THP", v7.
This patch series introduces support for Transparent Huge Page (THP)
migration in zone device-private memory. The implementation enables
efficient migration of large folios between system memory and
device-private memory
Background
Current zone device-private memory implementation only supports PAGE_SIZE
granularity, leading to:
- Increased TLB pressure
- Inefficient migration between CPU and device memory
This series extends the existing zone device-private infrastructure to
support THP, leading to:
- Reduced page table overhead
- Improved memory bandwidth utilization
- Seamless fallback to base pages when needed
In my local testing (using lib/test_hmm) and a throughput test, the series
shows a 350% improvement in data transfer throughput and a 80% improvement
in latency
These patches build on the earlier posts by Ralph Campbell [1]
Two new flags are added in vma_migration to select and mark compound
pages. migrate_vma_setup(), migrate_vma_pages() and
migrate_vma_finalize() support migration of these pages when
MIGRATE_VMA_SELECT_COMPOUND is passed in as arguments.
The series also adds zone device awareness to (m)THP pages along with
fault handling of large zone device private pages. page vma walk and the
rmap code is also zone device aware. Support has also been added for
folios that might need to be split in the middle of migration (when the
src and dst do not agree on MIGRATE_PFN_COMPOUND), that occurs when src
side of the migration can migrate large pages, but the destination has not
been able to allocate large pages. The code supported and used
folio_split() when migrating THP pages, this is used when
MIGRATE_VMA_SELECT_COMPOUND is not passed as an argument to
migrate_vma_setup().
The test infrastructure lib/test_hmm.c has been enhanced to support THP
migration. A new ioctl to emulate failure of large page allocations has
been added to test the folio split code path. hmm-tests.c has new test
cases for huge page migration and to test the folio split path. A new
throughput test has been added as well.
The nouveau dmem code has been enhanced to use the new THP migration
capability.
mTHP support:
The patches hard code, HPAGE_PMD_NR in a few places, but the code has been
kept generic to support various order sizes. With additional refactoring
of the code support of different order sizes should be possible.
The future plan is to post enhancements to support mTHP with a rough
design as follows:
1. Add the notion of allowable thp orders to the HMM based test driver
2. For non PMD based THP paths in migrate_device.c, check to see if
a suitable order is found and supported by the driver
3. Iterate across orders to check the highest supported order for migration
4. Migrate and finalize
The mTHP patches can be built on top of this series, the key design
elements that need to be worked out are infrastructure and driver support
for multiple ordered pages and their migration.
HMM support for large folios was added in 10b9feee2d ("mm/hmm:
populate PFNs from PMD swap entry").
This patch (of 16)
Add routines to support allocation of large order zone device folios and
helper functions for zone device folios, to check if a folio is device
private and helpers for setting zone device data.
When large folios are used, the existing page_free() callback in pgmap is
called when the folio is freed, this is true for both PAGE_SIZE and higher
order pages.
Zone device private large folios do not support deferred split and scan
like normal THP folios.
Link: https://lkml.kernel.org/r/20251001065707.920170-1-balbirs@nvidia.com
Link: https://lkml.kernel.org/r/20251001065707.920170-2-balbirs@nvidia.com
Link: https://lore.kernel.org/linux-mm/20201106005147.20113-1-rcampbell@nvidia.com/ [1]
Signed-off-by: Balbir Singh <balbirs@nvidia.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Zi Yan <ziy@nvidia.com>
Cc: Joshua Hahn <joshua.hahnjy@gmail.com>
Cc: Rakie Kim <rakie.kim@sk.com>
Cc: Byungchul Park <byungchul@sk.com>
Cc: Gregory Price <gourry@gourry.net>
Cc: Ying Huang <ying.huang@linux.alibaba.com>
Cc: Alistair Popple <apopple@nvidia.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: "Liam R. Howlett" <Liam.Howlett@oracle.com>
Cc: Nico Pache <npache@redhat.com>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: Dev Jain <dev.jain@arm.com>
Cc: Barry Song <baohua@kernel.org>
Cc: Lyude Paul <lyude@redhat.com>
Cc: Danilo Krummrich <dakr@kernel.org>
Cc: David Airlie <airlied@gmail.com>
Cc: Simona Vetter <simona@ffwll.ch>
Cc: Ralph Campbell <rcampbell@nvidia.com>
Cc: Mika Penttilä <mpenttil@redhat.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Francois Dugast <francois.dugast@intel.com>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: Felix Kuehling <Felix.Kuehling@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: "Christian König" <christian.koenig@amd.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
We have two almost identical pieces that handle FORMAT_STATE_NUM case.
The differences are:
- redundant {} for one-line if-else conditional
- missing blank line after variable definitions
- inverted conditional
Unify the style of two.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Link: https://patch.msgid.link/20251120083140.3478507-1-andriy.shevchenko@linux.intel.com
Signed-off-by: Petr Mladek <pmladek@suse.com>
min_t(unsigned int, a, b) casts an 'unsigned long' to 'unsigned int'.
Use min(a, b) instead as it promotes any 'unsigned int' to 'unsigned long'
and so cannot discard significant bits.
In this case the 'unsigned long' value is small enough that the result
is ok.
Detected by an extra check added to min_t().
Signed-off-by: David Laight <david.laight.linux@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Several parameters of the chacha20poly1305 functions require arrays of
an exact length. Use the new at_least keyword to instruct gcc and
clang to statically check that the caller is passing an object of at
least that length.
Here it is in action, with this faulty patch to wireguard's cookie.h:
struct cookie_checker {
u8 secret[NOISE_HASH_LEN];
- u8 cookie_encryption_key[NOISE_SYMMETRIC_KEY_LEN];
+ u8 cookie_encryption_key[NOISE_SYMMETRIC_KEY_LEN - 1];
u8 message_mac1_key[NOISE_SYMMETRIC_KEY_LEN];
If I try compiling this code, I get this helpful warning:
CC drivers/net/wireguard/cookie.o
drivers/net/wireguard/cookie.c: In function ‘wg_cookie_message_create’:
drivers/net/wireguard/cookie.c:193:9: warning: ‘xchacha20poly1305_encrypt’ reading 32 bytes from a region of size 31 [-Wstringop-overread]
193 | xchacha20poly1305_encrypt(dst->encrypted_cookie, cookie, COOKIE_LEN,
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
194 | macs->mac1, COOKIE_LEN, dst->nonce,
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
195 | checker->cookie_encryption_key);
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
drivers/net/wireguard/cookie.c:193:9: note: referencing argument 7 of type ‘const u8 *’ {aka ‘const unsigned char *’}
In file included from drivers/net/wireguard/messages.h:10,
from drivers/net/wireguard/cookie.h:9,
from drivers/net/wireguard/cookie.c:6:
include/crypto/chacha20poly1305.h:28:6: note: in a call to function ‘xchacha20poly1305_encrypt’
28 | void xchacha20poly1305_encrypt(u8 *dst, const u8 *src, const size_t src_len,
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: "Jason A. Donenfeld" <Jason@zx2c4.com>
Link: https://lore.kernel.org/r/20251123054819.2371989-4-Jason@zx2c4.com
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Fix another KMSAN warning that made it in while KMSAN wasn't working
reliably.
-----BEGIN PGP SIGNATURE-----
iIoEABYIADIWIQSacvsUNc7UX4ntmEPzXCl4vpKOKwUCaSITwxQcZWJpZ2dlcnNA
a2VybmVsLm9yZwAKCRDzXCl4vpKOK898AQCZFgPxQxVrkSaZuDXJ1V5ZpH/PcQVe
RWsmR6TtVE84QQD/a/mteTNxeB0xDFCrhgYG4TmF1g/RhPTTdBTmhubSdwo=
=VNQT
-----END PGP SIGNATURE-----
Merge tag 'libcrypto-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux
Pull crypto library fix from Eric Biggers:
"Fix another KMSAN warning that made it in while KMSAN wasn't working
reliably"
* tag 'libcrypto-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux:
lib/crypto: tests: Fix KMSAN warning in test_sha256_finup_2x()
ARCH_HAS_CPU_CACHE_INVALIDATE_MEMREGION provides the mechanism for
invalidating certain memory regions in a cache-incoherent manner. Currently
this is used by NVDIMM and CXL memory drivers in cases where it is
necessary to flush all data from caches by physical address range.
The operations in question are effectively memory hotplug, where stale
data might otherwise remain in the caches.
This is separate from the invalidates done to enable use of non-coherent
DMA masters, primarily in terms of when it is needed (not related to DMA
mappings) and how deep the flush must push data. The flushes done for
non-coherent DMA only need to reach the Point of Coherence of a single host
(which is often nearer CPUs and DMA masters than the physical storage).
This operation must push the data out of non architectural caches
(memory-side caches, write buffers etc) and typically all the way to the
memory device.
In some architectures these operations are supported by system components
that may become available only later in boot as they are either present
on a discoverable bus, or via a firmware description of an MMIO interface
(e.g. ACPI DSDT). Provide a framework to handle this case.
Architectures can opt in for this support via
CONFIG_GENERIC_CPU_CACHE_MAINTENANCE
Add a registration framework. Each driver provides an ops structure and
the first op is Write Back and Invalidate by PA Range. The driver may
over invalidate.
For systems that can perform this operation asynchronously an optional
completion check operation is also provided. If present that must be called
to ensure that the action has finished. This provides a considerable
performance advantage if multiple agents are involved in the maintenance
operation.
When multiple agents are present in the system each should register with
this framework and the core code will issue the invalidate to all of them
before checking for completion on each. This is done to avoid need for
filtering in the core code which can become complex when interleave,
potentially across different cache coherency hardware is going on, so it
is easier to tell everyone and let those who don't care do nothing.
Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
Co-developed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Acked-by: Conor Dooley <conor.dooley@microchip.com>
Signed-off-by: Conor Dooley <conor.dooley@microchip.com>
Fully initialize *ctx, including the buf field which sha256_init()
doesn't initialize, to avoid a KMSAN warning when comparing *ctx to
orig_ctx. This KMSAN warning slipped in while KMSAN was not working
reliably due to a stackdepot bug, which has now been fixed.
Fixes: 6733968be7 ("lib/crypto: tests: Add tests and benchmark for sha256_finup_2x()")
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20251121033431.34406-1-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Add a report_bug() variant where the bug_entry is already known. This
is useful when the exception instruction is not instantiated per-site.
But instead has a single instance. In such a case the bug_entry
address might be passed along in a known register or something.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://patch.msgid.link/20251110115757.575795595@infradead.org
Add BUG_FORMAT_ARGS; when an architecture is able to provide a va_list
given pt_regs, use this to print format arguments.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://patch.msgid.link/20251110115757.457339417@infradead.org
Add BUG_FORMAT; an architecture opt-in feature that allows adding the
WARN_printf() format string to the bug_entry table.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://patch.msgid.link/20251110115757.223371452@infradead.org
Bring in the UDB and objtool data annotations to avoid conflicts while further extending the bug exceptions.
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Corrects a spelling mistake in a comment in ratelimit.c where 'seperately'
was used instead of 'separately'.
Link: https://lkml.kernel.org/r/20251119101144.3175-1-a0979625527@icloud.com
Signed-off-by: Chia-Liang Wang <a0979652527@icloud.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Add a KUnit test suite to validate the base64 helpers. The tests cover
both encoding and decoding, including padded and unpadded forms as defined
by RFC 4648 (standard base64), and add negative cases for malformed inputs
and padding errors.
The test suite also validates other variants (URLSAFE, IMAP) to ensure
their correctness.
In addition to functional checks, the suite includes simple
microbenchmarks which report average encode/decode latency for small (64B)
and larger (1KB) inputs. These numbers are informational only and do not
gate the tests.
Kconfig (BASE64_KUNIT) and lib/tests/Makefile are updated accordingly.
Sample KUnit output:
KTAP version 1
# Subtest: base64
# module: base64_kunit
1..4
# base64_performance_tests: [64B] encode run : 32ns
# base64_performance_tests: [64B] decode run : 35ns
# base64_performance_tests: [1KB] encode run : 510ns
# base64_performance_tests: [1KB] decode run : 530ns
ok 1 base64_performance_tests
ok 2 base64_std_encode_tests
ok 3 base64_std_decode_tests
ok 4 base64_variant_tests
# base64: pass:4 fail:0 skip:0 total:4
# Totals: pass:4 fail:0 skip:0 total:4
Link: https://lkml.kernel.org/r/20251114060157.89507-1-409411716@gms.tku.edu.tw
Signed-off-by: Guan-Chun Wu <409411716@gms.tku.edu.tw>
Reviewed-by: Kuan-Wei Chiu <visitorckw@gmail.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: David Laight <david.laight.linux@gmail.com>
Cc: Eric Biggers <ebiggers@kernel.org>
Cc: Ilya Dryomov <idryomov@gmail.com>
Cc: Jaegeuk Kim <jaegeuk@kernel.org>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Keith Busch <kbusch@kernel.org>
Cc: Sagi Grimberg <sagi@grimberg.me>
Cc: "Theodore Y. Ts'o" <tytso@mit.edu>
Cc: Viacheslav Dubeyko <Slava.Dubeyko@ibm.com>
Cc: Xiubo Li <xiubli@redhat.com>
Cc: Yu-Sheng Huang <home7438072@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
The old base64 implementation relied on a bit-accumulator loop, which was
slow for larger inputs and too permissive in validation. It would accept
extra '=', missing '=', or even '=' appearing in the middle of the input,
allowing malformed strings to pass. This patch reworks the internals to
improve performance and enforce stricter validation.
Changes:
- Encoder:
* Process input in 3-byte blocks, mapping 24 bits into four 6-bit
symbols, avoiding bit-by-bit shifting and reducing loop iterations.
* Handle the final 1-2 leftover bytes explicitly and emit '=' only when
requested.
- Decoder:
* Based on the reverse lookup tables from the previous patch, decode
input in 4-character groups.
* Each group is looked up directly, converted into numeric values, and
combined into 3 output bytes.
* Explicitly handle padded and unpadded forms:
- With padding: input length must be a multiple of 4, and '=' is
allowed only in the last two positions. Reject stray or early '='.
- Without padding: validate tail lengths (2 or 3 chars) and require
unused low bits to be zero.
* Removed the bit-accumulator style loop to reduce loop iterations.
Performance (x86_64, Intel Core i7-10700 @ 2.90GHz, avg over 1000 runs,
KUnit):
Encode:
64B ~90ns -> ~32ns (~2.8x)
1KB ~1332ns -> ~510ns (~2.6x)
Decode:
64B ~1530ns -> ~35ns (~43.7x)
1KB ~27726ns -> ~530ns (~52.3x)
[akpm@linux-foundation.org: remove u32 casts, per David and Guan-Chun]
Link: https://lkml.kernel.org/r/20251114060132.89279-1-409411716@gms.tku.edu.tw
Co-developed-by: Kuan-Wei Chiu <visitorckw@gmail.com>
Signed-off-by: Kuan-Wei Chiu <visitorckw@gmail.com>
Co-developed-by: Yu-Sheng Huang <home7438072@gmail.com>
Signed-off-by: Yu-Sheng Huang <home7438072@gmail.com>
Signed-off-by: Guan-Chun Wu <409411716@gms.tku.edu.tw>
Reviewed-by: David Laight <david.laight.linux@gmail.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Eric Biggers <ebiggers@kernel.org>
Cc: Ilya Dryomov <idryomov@gmail.com>
Cc: Jaegeuk Kim <jaegeuk@kernel.org>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Keith Busch <kbusch@kernel.org>
Cc: Sagi Grimberg <sagi@grimberg.me>
Cc: "Theodore Y. Ts'o" <tytso@mit.edu>
Cc: Viacheslav Dubeyko <Slava.Dubeyko@ibm.com>
Cc: Xiubo Li <xiubli@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Patch series " lib/base64: add generic encoder/decoder, migrate users", v5.
This series introduces a generic Base64 encoder/decoder to the kernel
library, eliminating duplicated implementations and delivering significant
performance improvements.
The Base64 API has been extended to support multiple variants (Standard,
URL-safe, and IMAP) as defined in RFC 4648 and RFC 3501. The API now
takes a variant parameter and an option to control padding. As part of
this series, users are migrated to the new interface while preserving
their specific formats: fscrypt now uses BASE64_URLSAFE, Ceph uses
BASE64_IMAP, and NVMe is updated to BASE64_STD.
On the encoder side, the implementation processes input in 3-byte blocks,
mapping 24 bits directly to 4 output symbols. This avoids bit-by-bit
streaming and reduces loop overhead, achieving about a 2.7x speedup
compared to previous implementations.
On the decoder side, replace strchr() lookups with per-variant reverse
tables and process input in 4-character groups. Each group is mapped to
numeric values and combined into 3 bytes. Padded and unpadded forms are
validated explicitly, rejecting invalid '=' usage and enforcing tail
rules. This improves throughput by ~43-52x.
This patch (of 6):
Extend the base64 API to support multiple variants (standard, URL-safe,
and IMAP) as defined in RFC 4648 and RFC 3501. The API now takes a
variant parameter and an option to control padding. Update NVMe auth code
to use the new interface with BASE64_STD.
Link: https://lkml.kernel.org/r/20251114055829.87814-1-409411716@gms.tku.edu.tw
Link: https://lkml.kernel.org/r/20251114060045.88792-1-409411716@gms.tku.edu.tw
Signed-off-by: Kuan-Wei Chiu <visitorckw@gmail.com>
Co-developed-by: Guan-Chun Wu <409411716@gms.tku.edu.tw>
Signed-off-by: Guan-Chun Wu <409411716@gms.tku.edu.tw>
Reviewed-by: David Laight <david.laight.linux@gmail.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Eric Biggers <ebiggers@kernel.org>
Cc: Ilya Dryomov <idryomov@gmail.com>
Cc: Jaegeuk Kim <jaegeuk@kernel.org>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Keith Busch <kbusch@kernel.org>
Cc: Sagi Grimberg <sagi@grimberg.me>
Cc: "Theodore Y. Ts'o" <tytso@mit.edu>
Cc: Viacheslav Dubeyko <Slava.Dubeyko@ibm.com>
Cc: Xiubo Li <xiubli@redhat.com>
Cc: Yu-Sheng Huang <home7438072@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Which serves as a global default sys_info mask. When users want the same
system information for many error cases (panic, hung, lockup ...), they
can chose to set this global knob only once, while not setting up each
individual sys_info knobs.
This just adds a 'lazy' option, and doesn't change existing kernel
behavior as the mask is 0 by default.
Link: https://lkml.kernel.org/r/20251113111039.22701-5-feng.tang@linux.alibaba.com
Suggested-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Feng Tang <feng.tang@linux.alibaba.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Lance Yang <ioworker0@gmail.com>
Cc: "Paul E . McKenney" <paulmck@kernel.org>
Cc: Petr Mladek <pmladek@suse.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
This reverts commit 7abcb84f95.
The introduction of WRITE_ONCE() calls for the 'prev' and 'next' variables
inside plist_check_list() was a misapplication. WRITE_ONCE() is
fundamentally a compiler barrier designed to prevent compiler
optimizations (like caching or reordering) on shared memory locations.
However, the variables 'prev' and 'next' are local, stack-allocated
pointers accessed only by the current thread's invocation of the function.
Since these pointers are thread-local and are never accessed concurrently,
applying WRITE_ONCE() to them is semantically incorrect and unnecessary.
Furthermore, the use of WRITE_ONCE() on local variables prevents the
compiler from performing standard optimizations, such as keeping these
variables cached solely in CPU registers throughout the loop, potentially
introducing performance overhead. Restore the conventional C assignment
for local loop variables, allowing the compiler to generate optimal code.
Link: https://lkml.kernel.org/r/20251113193413.499309-1-visitorckw@gmail.com
Signed-off-by: Kuan-Wei Chiu <visitorckw@gmail.com>
Cc: I Hsin Cheng <richard120310@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
There are slight differences in the mul_u64_add_u64_div_u64() code between
32bit and 64bit systems.
Compile and test the 32bit version on 64bit hosts for better test
coverage.
Link: https://lkml.kernel.org/r/20251105201035.64043-10-david.laight.linux@gmail.com
Signed-off-by: David Laight <david.laight.linux@gmail.com>
Reviewed-by: Nicolas Pitre <npitre@baylibre.com>
Cc: Biju Das <biju.das.jz@bp.renesas.com>
Cc: Borislav Betkov <bp@alien8.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Li RongQing <lirongqing@baidu.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleinxer <tglx@linutronix.de>
Cc: Uwe Kleine-König <u.kleine-koenig@baylibre.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Replace the bit by bit algorithm with one that generates 16 bits per
iteration on 32bit architectures and 32 bits on 64bit ones.
On my zen 5 this reduces the time for the tests (using the generic code)
from ~3350ns to ~1000ns.
Running the 32bit algorithm on 64bit x86 takes ~1500ns. It'll be slightly
slower on a real 32bit system, mostly due to register pressure.
The savings for 32bit x86 are much higher (tested in userspace). The
worst case (lots of bits in the quotient) drops from ~900 clocks to ~130
(pretty much independant of the arguments). Other 32bit architectures may
see better savings.
It is possibly to optimise for divisors that span less than
__LONG_WIDTH__/2 bits. However I suspect they don't happen that often and
it doesn't remove any slow cpu divide instructions which dominate the
result.
Typical improvements for 64bit random divides:
old new
sandy bridge: 470 150
haswell: 400 144
piledriver: 960 467 I think rdpmc is very slow.
zen5: 244 80
(Timing is 'rdpmc; mul_div(); rdpmc' with the multiply depending on the
first rdpmc and the second rdpmc depending on the quotient.)
Object code (64bit x86 test program): old 0x173 new 0x141.
Link: https://lkml.kernel.org/r/20251105201035.64043-9-david.laight.linux@gmail.com
Signed-off-by: David Laight <david.laight.linux@gmail.com>
Reviewed-by: Nicolas Pitre <npitre@baylibre.com>
Cc: Biju Das <biju.das.jz@bp.renesas.com>
Cc: Borislav Betkov <bp@alien8.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Li RongQing <lirongqing@baidu.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleinxer <tglx@linutronix.de>
Cc: Uwe Kleine-König <u.kleine-koenig@baylibre.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
gcc generates horrid code for both ((u64)u32_a * u32_b) and (u64_a +
u32_b). As well as the extra instructions it can generate a lot of spills
to stack (including spills of constant zeros and even multiplies by
constant zero).
mul_u32_u32() already exists to optimise the multiply. Add a similar
add_u64_32() for the addition. Disable both for clang - it generates
better code without them.
Move the 64x64 => 128 multiply into a static inline helper function for
code clarity. No need for the a/b_hi/lo variables, the implicit casts on
the function calls do the work for us. Should have minimal effect on the
generated code.
Use mul_u32_u32() and add_u64_u32() in the 64x64 => 128 multiply in
mul_u64_add_u64_div_u64().
Link: https://lkml.kernel.org/r/20251105201035.64043-8-david.laight.linux@gmail.com
Signed-off-by: David Laight <david.laight.linux@gmail.com>
Reviewed-by: Nicolas Pitre <npitre@baylibre.com>
Cc: Biju Das <biju.das.jz@bp.renesas.com>
Cc: Borislav Betkov <bp@alien8.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Li RongQing <lirongqing@baidu.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleinxer <tglx@linutronix.de>
Cc: Uwe Kleine-König <u.kleine-koenig@baylibre.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Change the #if in div64.c so that test_mul_u64_u64_div_u64.c can compile
and test the generic version (including the 'long multiply') on
architectures (eg amd64) that define their own copy.
Test the kernel version and the locally compiled version on all arch.
Output the time taken (in ns) on the 'test completed' trace.
For reference, on my zen 5, the optimised version takes ~220ns and the
generic version ~3350ns. Using the native multiply saves ~200ns and
adding back the ilog2() 'optimisation' test adds ~50ms.
Link: https://lkml.kernel.org/r/20251105201035.64043-7-david.laight.linux@gmail.com
Signed-off-by: David Laight <david.laight.linux@gmail.com>
Reviewed-by: Nicolas Pitre <npitre@baylibre.com>
Cc: Biju Das <biju.das.jz@bp.renesas.com>
Cc: Borislav Betkov <bp@alien8.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Li RongQing <lirongqing@baidu.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleinxer <tglx@linutronix.de>
Cc: Uwe Kleine-König <u.kleine-koenig@baylibre.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Replicate the existing mul_u64_u64_div_u64() test cases with round up.
Update the shell script that verifies the table, remove the comment
markers so that it can be directly pasted into a shell.
Rename the divisor from 'c' to 'd' to match mul_u64_add_u64_div_u64().
It any tests fail then fail the module load with -EINVAL.
Link: https://lkml.kernel.org/r/20251105201035.64043-6-david.laight.linux@gmail.com
Signed-off-by: David Laight <david.laight.linux@gmail.com>
Reviewed-by: Nicolas Pitre <npitre@baylibre.com>
Cc: Biju Das <biju.das.jz@bp.renesas.com>
Cc: Borislav Betkov <bp@alien8.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Li RongQing <lirongqing@baidu.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleinxer <tglx@linutronix.de>
Cc: Uwe Kleine-König <u.kleine-koenig@baylibre.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
The existing mul_u64_u64_div_u64() rounds down, a 'rounding up' variant
needs 'divisor - 1' adding in between the multiply and divide so cannot
easily be done by a caller.
Add mul_u64_add_u64_div_u64(a, b, c, d) that calculates (a * b + c)/d and
implement the 'round down' and 'round up' using it.
Update the x86-64 asm to optimise for 'c' being a constant zero.
Add kerndoc definitions for all three functions.
Link: https://lkml.kernel.org/r/20251105201035.64043-5-david.laight.linux@gmail.com
Signed-off-by: David Laight <david.laight.linux@gmail.com>
Reviewed-by: Nicolas Pitre <npitre@baylibre.com>
Cc: Biju Das <biju.das.jz@bp.renesas.com>
Cc: Borislav Betkov <bp@alien8.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Li RongQing <lirongqing@baidu.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleinxer <tglx@linutronix.de>
Cc: Uwe Kleine-König <u.kleine-koenig@baylibre.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
If the product is only 64bits div64_u64() can be used for the divide.
Replace the pre-multiply check (ilog2(a) + ilog2(b) <= 62) with a simple
post-multiply check that the high 64bits are zero.
This has the advantage of being simpler, more accurate and less code. It
will always be faster when the product is larger than 64bits.
Most 64bit cpu have a native 64x64=128 bit multiply, this is needed (for
the low 64bits) even when div64_u64() is called - so the early check gains
nothing and is just extra code.
32bit cpu will need a compare (etc) to generate the 64bit ilog2() from two
32bit bit scans - so that is non-trivial. (Never mind the mess of x86's
'bsr' and any oddball cpu without fast bit-scan instructions.) Whereas the
additional instructions for the 128bit multiply result are pretty much one
multiply and two adds (typically the 'adc $0,%reg' can be run in parallel
with the instruction that follows).
The only outliers are 64bit systems without 128bit mutiply and simple in
order 32bit ones with fast bit scan but needing extra instructions to get
the high bits of the multiply result. I doubt it makes much difference to
either, the latter is definitely not mainstream.
If anyone is worried about the analysis they can look at the generated
code for x86 (especially when cmov isn't used).
Link: https://lkml.kernel.org/r/20251105201035.64043-4-david.laight.linux@gmail.com
Signed-off-by: David Laight <david.laight.linux@gmail.com>
Reviewed-by: Nicolas Pitre <npitre@baylibre.com>
Cc: Biju Das <biju.das.jz@bp.renesas.com>
Cc: Borislav Betkov <bp@alien8.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Li RongQing <lirongqing@baidu.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleinxer <tglx@linutronix.de>
Cc: Uwe Kleine-König <u.kleine-koenig@baylibre.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Since the overflow check always triggers when the divisor is zero
move the check for divide by zero inside the overflow check.
This means there is only one test in the normal path.
Link: https://lkml.kernel.org/r/20251105201035.64043-3-david.laight.linux@gmail.com
Signed-off-by: David Laight <david.laight.linux@gmail.com>
Reviewed-by: Nicolas Pitre <npitre@baylibre.com>
Cc: Biju Das <biju.das.jz@bp.renesas.com>
Cc: Borislav Betkov <bp@alien8.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Li RongQing <lirongqing@baidu.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleinxer <tglx@linutronix.de>
Cc: Uwe Kleine-König <u.kleine-koenig@baylibre.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Patch series "Implement mul_u64_u64_div_u64_roundup()", v5.
The pwm-stm32.c code wants a 'rounding up' version of
mul_u64_u64_div_u64(). This can be done simply by adding 'divisor - 1' to
the 128bit product. Implement mul_u64_add_u64_div_u64(a, b, c, d) = (a *
b + c)/d based on the existing code. Define mul_u64_u64_div_u64(a, b, d)
as mul_u64_add_u64_div_u64(a, b, 0, d) and mul_u64_u64_div_u64_roundup(a,
b, d) as mul_u64_add_u64_div_u64(a, b, d-1, d).
Only x86-64 has an optimsed (asm) version of the function. That is
optimised to avoid the 'add c' when c is known to be zero. In all other
cases the extra code will be noise compared to the software divide code.
The test module has been updated to test mul_u64_u64_div_u64_roundup() and
also enhanced it to verify the C division code on x86-64 and the 32bit
division code on 64bit.
This patch (of 9):
Change to prototype from mul_u64_u64_div_u64(u64 a, u64 b, u64 c) to
mul_u64_u64_div_u64(u64 a, u64 b, u64 d). Using 'd' for 'divisor' makes
more sense.
An upcoming change adds a 'c' parameter to calculate (a * b + c)/d.
Link: https://lkml.kernel.org/r/20251105201035.64043-1-david.laight.linux@gmail.com
Link: https://lkml.kernel.org/r/20251105201035.64043-2-david.laight.linux@gmail.com
Signed-off-by: David Laight <david.laight.linux@gmail.com>
Reviewed-by: Nicolas Pitre <npitre@baylibre.com>
Cc: Biju Das <biju.das.jz@bp.renesas.com>
Cc: Borislav Betkov <bp@alien8.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Li RongQing <lirongqing@baidu.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleinxer <tglx@linutronix.de>
Cc: Uwe Kleine-König <u.kleine-koenig@baylibre.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
For the sake of the code readability and easier maintenance factor out
read and write sys_info handlers.
[akpm@linux-foundation.org: coding-style cleanups]
Link: https://lkml.kernel.org/r/20251030132007.3742368-7-andriy.shevchenko@linux.intel.com
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Cc: Feng Tang <feng.tang@linux.alibaba.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
The both handlers use the local 'table' variable and assign the same data
to it, deduplicate that.
Link: https://lkml.kernel.org/r/20251030132007.3742368-6-andriy.shevchenko@linux.intel.com
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-by: Feng Tang <feng.tang@linux.alibaba.com>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Compiler was not happy about dead variable in use:
lib/sys_info.c:52:19: error: variable 'sys_info_avail' is not needed and will not be emitted [-Werror,-Wunneeded-internal-declaration]
52 | static const char sys_info_avail[] = "tasks,mem,timers,locks,ftrace,all_bt,blocked_tasks";
| ^~~~~~~~~~~~~~
This was fixed by adding __maybe_unused attribute that just hides the
issue and didn't actually fix the root cause. Rewrite the fix by moving
the local variable from stack to a heap.
As a side effect this drops unneeded "synchronisation" of duplicative info
and also makes code ready for the further refactoring.
Link: https://lkml.kernel.org/r/20251030132007.3742368-5-andriy.shevchenko@linux.intel.com
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Cc: Feng Tang <feng.tang@linux.alibaba.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
There is no need to keep a custom structure just for the need of a plain
array of strings. Replace struct sys_info_name with plain array of
strings.
With that done, simplify the code, in particular, naturally use
for_each_set_bit() when iterating over si_bits_global bitmap.
Link: https://lkml.kernel.org/r/20251030132007.3742368-4-andriy.shevchenko@linux.intel.com
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Cc: Feng Tang <feng.tang@linux.alibaba.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Align constant definition names with parameters to make it easier to map.
It's also better to maintain and extend the names while keeping their
uniqueness.
Link: https://lkml.kernel.org/r/20251030132007.3742368-3-andriy.shevchenko@linux.intel.com
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-by: Feng Tang <feng.tang@linux.alibaba.com>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Patch series "panic: sys_info: Refactor and fix a potential issue", v3.
While targeting the compilation issue due to dangling variable, I have
noticed more opportunities for refactoring that helps to avoid above
mentioned compilation issue in a cleaner way and also fixes a potential
problem with global variable access.
This patch (of 6):
The for-loop might re-read the content of the memory the si_bits_global
points to on each iteration. Instead, just capture it for the sake of
consistency and use that instead.
Link: https://lkml.kernel.org/r/20251030132007.3742368-1-andriy.shevchenko@linux.intel.com
Link: https://lkml.kernel.org/r/20251030132007.3742368-2-andriy.shevchenko@linux.intel.com
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-by: Feng Tang <feng.tang@linux.alibaba.com>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
The %pe format specifier is designed to print error pointers. It prints a
symbolic error name (eg. -EINVAL) and it makes the code simpler by
omitting PTR_ERR();
This patch fixes this cocci report:
lib/alloc_tag.c:776:63-70: WARNING: Consider using %pe to print PTR_ERR()
Link: https://lkml.kernel.org/r/20251105023925.1447482-1-zeng_chi911@163.com
Signed-off-by: Zeng Chi <zengchi@kylinos.cn>
Acked-by: SeongJae Park <sj@kernel.org>
Acked-by: Suren Baghdasaryan <surenb@google.com>
Cc: Kent Overstreet <kent.overstreet@linux.dev>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Merge series from Bartosz Golaszewski <brgl@bgdev.pl>:
Problem statement: GPIOs are implemented as a strictly exclusive
resource in the kernel but there are lots of platforms on which single
pin is shared by multiple devices which don't communicate so need some
way of properly sharing access to a GPIO. What we have now is the
GPIOD_FLAGS_BIT_NONEXCLUSIVE flag which was introduced as a hack and
doesn't do any locking or arbitration of access - it literally just hand
the same GPIO descriptor to all interested users.
The proposed solution is composed of three major parts: the high-level,
shared GPIO proxy driver that arbitrates access to the shared pin and
exposes a regular GPIO chip interface to consumers, a low-level shared
GPIOLIB module that scans firmware nodes and creates auxiliary devices
that attach to the proxy driver and finally a set of core GPIOLIB
changes that plug the former into the GPIO lookup path.
The changes are implemented in a way that allows to seamlessly compile
out any code related to sharing GPIOs for systems that don't need it.
The practical use-case for this are the powerdown GPIOs shared by
speakers on Qualcomm db845c platform, however I have also extensively
tested it using gpio-virtuser on arm64 qemu with various DT
configurations.
CID management OR's two cpumasks and then calculates the weight on the
result. That's inefficient as that has to walk the same stuff twice. As
this is done with runqueue lock held, there is a real benefit of speeding
this up. Depending on the system this results in 10-20% less cycles spent
with runqueue lock held for a 4K cpumask.
Provide cpumask_weighted_or() and the corresponding bitmap functions which
return the weight of the OR result right away.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Yury Norov (NVIDIA) <yury.norov@gmail.com>
Reviewed-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Link: https://patch.msgid.link/20251119172549.448263340@linutronix.de
Add RISC-V code to be compiled to allow the userspace raid6test program
to be built and run on RISC-V.
Signed-off-by: Chunyan Zhang <zhang.lyra@gmail.com>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Tested-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://patch.msgid.link/20250718072711.3865118-6-zhangchunyan@iscas.ac.cn
Signed-off-by: Paul Walmsley <pjw@kernel.org>
To support userspace raid6test, this patch adds __KERNEL__ ifdef for kernel
header inclusions also userspace wrapper definitions to allow code to be
compiled in userspace.
This patch also drops the NSIZE macro, instead of using the vector length,
which can work for both kernel and user space.
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Signed-off-by: Chunyan Zhang <zhangchunyan@iscas.ac.cn>
Link: https://patch.msgid.link/20250718072711.3865118-5-zhangchunyan@iscas.ac.cn
Signed-off-by: Paul Walmsley <pjw@kernel.org>
To prevent the compiler from breaking the inline vector assembly code,
this code must be built without compiler support for vector.
Signed-off-by: Chunyan Zhang <zhangchunyan@iscas.ac.cn>
Link: https://patch.msgid.link/20250718072711.3865118-4-zhangchunyan@iscas.ac.cn
[pjw@kernel.org: cleaned up commit message]
Signed-off-by: Paul Walmsley <pjw@kernel.org>
A handful drivers want to print a content of the struct timespec64
in a format of %lld:%09ld. In order to make their lives easier, add
the respecting specifier directly to the printf() implementation.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Tested-by: Petr Mladek <pmladek@suse.com>
Link: https://patch.msgid.link/20251113150217.3030010-2-andriy.shevchenko@linux.intel.com
Signed-off-by: Petr Mladek <pmladek@suse.com>
Two functions use the same specifier data for the special hex number.
Almost the same as the field width is calculated on the size of the
given type. Due to that, make a compound literal macro in order to
deduplicate the rest.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Tested-by: Petr Mladek <pmladek@suse.com>
Link: https://patch.msgid.link/20251113150313.3030700-1-andriy.shevchenko@linux.intel.com
Signed-off-by: Petr Mladek <pmladek@suse.com>
Properly use masked_user_read_access_begin() and
masked_user_write_access_begin() instead of masked_user_access_begin() in
order to match user_read_access_end() and user_write_access_end(). This is
important for architectures like PowerPC that enable separately user reads
and user writes.
That means masked_user_read_access_begin() is used when user memory is
exclusively read during the window and masked_user_write_access_begin()
is used when user memory is exclusively writen during the window.
masked_user_access_begin() remains and is used when both reads and
writes are performed during the open window. Each of them is expected
to be terminated by the matching user_read_access_end(),
user_write_access_end() and user_access_end().
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://patch.msgid.link/cb5e4b0fa49ea9c740570949d5e3544423389757.1763396724.git.christophe.leroy@csgroup.eu
The results of "access_ok()" can be mis-speculated. The result is that
the CPU can end speculatively:
if (access_ok(from, size))
// Right here
For the same reason as done in copy_from_user() in commit 74e19ef0ff
("uaccess: Add speculation barrier to copy_from_user()"), add a speculation
barrier to copy_from_user_iter().
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://patch.msgid.link/6b73e69cc7168c89df4eab0a216e3ed4cca36b0a.1763396724.git.christophe.leroy@csgroup.eu
copy_from_user_iter() lacks a speculation barrier, which will degrade
performance on some architecture like x86, which would be unfortunate as
copy_from_user_iter() is a critical hotpath function.
Convert copy_from_user_iter() to using masked user access on architecture
that support it. This allows to add the speculation barrier without
impacting performance.
This is similar to what was done for copy_from_user() in commit
0fc810ae3a ("x86/uaccess: Avoid barrier_nospec() in 64-bit
copy_from_user()")
[ tglx: Massage change log ]
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://patch.msgid.link/58e4b07d469ca68a2b9477fe2c1ccc8a44cef131.1763396724.git.christophe.leroy@csgroup.eu
Implement a function for checking if a string ends with a different
string and add its kunit test cases.
Acked-by: Linus Walleij <linus.walleij@linaro.org>
Link: https://lore.kernel.org/r/20251112-gpio-shared-v4-1-b51f97b1abd8@linaro.org
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
A test marked with "xfail = true" is expected to fail but that does not
mean it is predetermined to fail. Remove "xfail" condition check for
tests which pass successfully.
Link: https://lkml.kernel.org/r/20251007122035.56347-3-urezki@gmail.com
Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Baoquan He <bhe@redhat.com>
Cc: Marco Elver <elver@google.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Michal Hocko <mhocko@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Patch series "__vmalloc()/kvmalloc() and no-block support", v4.
This patch (of 10):
Introduce a new test case "no_block_alloc_test" that verifies non-blocking
allocations using __vmalloc() with GFP_ATOMIC and GFP_NOWAIT flags.
It is recommended to build kernel with CONFIG_DEBUG_ATOMIC_SLEEP enabled
to help catch "sleeping while atomic" issues. This test ensures that
memory allocation logic under atomic constraints does not inadvertently
sleep.
Link: https://lkml.kernel.org/r/20251007122035.56347-2-urezki@gmail.com
Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
Cc: Baoquan He <bhe@redhat.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Alexander Potapenko <glider@google.com>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Marco Elver <elver@google.com>
Cc: Michal Hocko <mhocko@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
We must check whether KHO is enabled prior to issuing KHO commands,
otherwise KHO internal data structures are not initialized.
Link: https://lkml.kernel.org/r/20251106220635.2608494-1-pasha.tatashin@soleen.com
Fixes: b753522bed ("kho: add test for kexec handover")
Signed-off-by: Pasha Tatashin <pasha.tatashin@soleen.com>
Reported-by: kernel test robot <oliver.sang@intel.com>
Closes: https://lore.kernel.org/oe-lkp/202511061629.e242724-lkp@intel.com
Reviewed-by: Pratyush Yadav <pratyush@kernel.org>
Reviewed-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Cc: Alexander Graf <graf@amazon.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Enable the preset of filter parameters from kconfig options, similar to
how other KUnit configuration parameters are handled already.
This is useful to run a subset of tests even if the cmdline is not
readily modifyable.
Link: https://lore.kernel.org/r/20251106-kunit-filter-kconfig-v1-1-d723fb7ac221@linutronix.de
Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Reviewed-by: David Gow <davidgow@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Move the arm64 implementations of SHA-3 and POLYVAL to the newly
introduced scoped ksimd API, which replaces kernel_neon_begin() and
kernel_neon_end(). On arm64, this is needed because the latter API
will change in an incompatible manner.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
xxh32_reset() and xxh32_copy_state() are unused, and with those gone, the
xxh32_state struct is also unused.
xxh64_copy_state() is also unused.
Remove them all.
(Also fixes a comment above the xxh64_state that referred to it as
xxh32_state).
Link: https://lkml.kernel.org/r/20251024205120.454508-1-linux@treblig.org
Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org>
Suggested-by: Christoph Hellwig <hch@infradead.org>
Reviewed-by: Kuan-Wei Chiu <visitorckw@gmail.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
In practical problem diagnosis, especially during the boot phase, it is
often desirable to know the call sequence. However, currently, apart from
adding print statements and recompiling the kernel, there seems to be no
good alternative. If dynamic_debug supported printing the call stack, it
would be very helpful for diagnosing issues. This patch add support '+d'
for dump stack.
Link: https://lkml.kernel.org/r/20251025080003.312536-1-yebin@huaweicloud.com
Signed-off-by: Ye Bin <yebin10@huawei.com>
Cc: Jason Baron <jbaron@akamai.com>
Cc: Jim Cromie <jim.cromie@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Commit 1f9a8286bc ("uaccess: always export _copy_[from|to]_user with
CONFIG_RUST") exports _copy_{from,to}_user() unconditionally, if RUST is
enabled. This pollutes exported symbols namespace, and spreads RUST
ifdefery in core files.
It's better to declare a corresponding helper under the rust/helpers,
similarly to how non-underscored copy_{from,to}_user() is handled.
[yury.norov@gmail.com: drop rust part of comment for _copy_from_user(), per Alice]
Link: https://lkml.kernel.org/r/20251024154754.99768-1-yury.norov@gmail.com
Link: https://lkml.kernel.org/r/20251023171607.1171534-1-yury.norov@gmail.com
Signed-off-by: Yury Norov (NVIDIA) <yury.norov@gmail.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Miguel Ojeda <ojeda@kernel.org>
Reviewed-by: Alice Ryhl <aliceryhl@google.com>
Tested-by: Alice Ryhl <aliceryhl@google.com>
Cc: Alex Gaynor <alex.gaynor@gmail.com>
Cc: Andreas Hindborg <a.hindborg@kernel.org>
Cc: Björn Roy Baron <bjorn3_gh@protonmail.com>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Danilo Krummrich <dakr@kernel.org>
Cc: Gary Guo <gary@garyguo.net>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Trevor Gross <tmgross@umich.edu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Support for the IA-64 (Itanium) architecture was removed in commit
cf8e865810 ("arch: Remove Itanium (IA-64) architecture").
This patch drops the IA-64 specific decompression code from lib/xz, which
was conditionally compiled with the now-obsolete CONFIG_XZ_DEC_IA64
option.
Link: https://lkml.kernel.org/r/20251014052738.31185-1-spyjetfayed@gmail.com
Signed-off-by: Ankan Biswas <spyjetfayed@gmail.com>
Reviewed-by: Kuan-Wei Chiu <visitorckw@gmail.com>
Reviewed-by: Khalid Aziz <khalid@kernel.org>
Acked-by: Lasse Collin <lasse.collin@tukaani.org>
Cc: David Hunter <david.hunter.linux@gmail.com>
Cc: Shuah Khan <skhan@linuxfoundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
The hung_task_panic sysctl is currently a blunt instrument: it's all or
nothing.
Panicking on a single hung task can be an overreaction to a transient
glitch. A more reliable indicator of a systemic problem is when
multiple tasks hang simultaneously.
Extend hung_task_panic to accept an integer threshold, allowing the
kernel to panic only when N hung tasks are detected in a single scan.
This provides finer control to distinguish between isolated incidents
and system-wide failures.
The accepted values are:
- 0: Don't panic (unchanged)
- 1: Panic on the first hung task (unchanged)
- N > 1: Panic after N hung tasks are detected in a single scan
The original behavior is preserved for values 0 and 1, maintaining full
backward compatibility.
[lance.yang@linux.dev: new changelog]
Link: https://lkml.kernel.org/r/20251015063615.2632-1-lirongqing@baidu.com
Signed-off-by: Li RongQing <lirongqing@baidu.com>
Reviewed-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Reviewed-by: Lance Yang <lance.yang@linux.dev>
Tested-by: Lance Yang <lance.yang@linux.dev>
Acked-by: Andrew Jeffery <andrew@codeconstruct.com.au> [aspeed_g5_defconfig]
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: David Hildenbrand <david@redhat.com>
Cc: Florian Wesphal <fw@strlen.de>
Cc: Jakub Kacinski <kuba@kernel.org>
Cc: Jason A. Donenfeld <jason@zx2c4.com>
Cc: Joel Granados <joel.granados@kernel.org>
Cc: Joel Stanley <joel@jms.id.au>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Kees Cook <kees@kernel.org>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: "Paul E . McKenney" <paulmck@kernel.org>
Cc: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Cc: Petr Mladek <pmladek@suse.com>
Cc: Phil Auld <pauld@redhat.com>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Simon Horman <horms@kernel.org>
Cc: Stanislav Fomichev <sdf@fomichev.me>
Cc: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
As of writing, Documentation/Changes states the minimal versions of GNU C
being 8.1, Clang being 15.0.0 and binutils being 2.30. A few Kconfig help
texts are pointing out that specific GCC and Clang versions are needed,
but by now, those pointers to versions, such later than 4.0, later than
4.4, or clang later than 5.0, are obsolete and unlikely to be found by
users configuring their kernel builds anyway.
Drop these outdated remarks in Kconfig help texts referring to older
compiler and binutils versions. No functional change.
Link: https://lkml.kernel.org/r/20251010082138.185752-1-lukas.bulwahn@redhat.com
Signed-off-by: Lukas Bulwahn <lukas.bulwahn@redhat.com>
Cc: Bill Wendling <morbo@google.com>
Cc: Justin Stitt <justinstitt@google.com>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Russel King <linux@armlinux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Even though ARM's versions of kernel_neon_begin()/_end() are not being
changed, update the newly migrated ARM blake2b to the scoped ksimd API
so that all ARM and arm64 in lib/crypto remains consistent in this
manner.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Introduce a more strict replacement API for
kernel_neon_begin()/kernel_neon_end() on both ARM and arm64, and replace
occurrences of the latter pair appearing in lib/crypto
-----BEGIN PGP SIGNATURE-----
iHUEABYKAB0WIQQQm/3uucuRGn1Dmh0wbglWLn0tXAUCaRRKpgAKCRAwbglWLn0t
XAKXAQD/L/XErOIGgSnvJnxG9sF+V2S+id1u9aoEJApbqMvW/gD9Fnvjqa7mRM7f
jSZeDCMB++24SS2zL0/BFiRMmEl5/gc=
=0IKE
-----END PGP SIGNATURE-----
Merge tag 'scoped-ksimd-for-arm-arm64' into libcrypto-fpsimd-on-stack
Pull scoped ksimd API for ARM and arm64 from Ard Biesheuvel:
"Introduce a more strict replacement API for
kernel_neon_begin()/kernel_neon_end() on both ARM and arm64, and
replace occurrences of the latter pair appearing in lib/crypto"
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
The pointer may be invalid when gets to the printf(). In particular
the time_and_date() dereferencing it in some cases without checking.
Move the check from rtc_str() to time_and_date() to cover all cases.
Fixes: 7daac5b2fd ("lib/vsprintf: Print time64_t in human readable format")
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Link: https://patch.msgid.link/20251110132118.4113976-1-andriy.shevchenko@linux.intel.com
Signed-off-by: Petr Mladek <pmladek@suse.com>
Move away from calling kernel_neon_begin() and kernel_neon_end()
directly, and instead, use the newly introduced scoped_ksimd() API. This
permits arm64 to modify the kernel mode NEON API without affecting code
that is shared between ARM and arm64.
Reviewed-by: Eric Biggers <ebiggers@kernel.org>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Before modifying the prototypes of kernel_neon_begin() and
kernel_neon_end() to accommodate kernel mode FP/SIMD state buffers
allocated on the stack, move arm64 to the new 'ksimd' scoped guard API,
which encapsulates the calls to those functions.
For symmetry, do the same for 32-bit ARM too.
Reviewed-by: Eric Biggers <ebiggers@kernel.org>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Before modifying the prototypes of kernel_neon_begin() and
kernel_neon_end() to accommodate kernel mode FP/SIMD state buffers
allocated on the stack, move arm64 to the new 'ksimd' scoped guard API,
which encapsulates the calls to those functions.
For symmetry, do the same for 32-bit ARM too.
Reviewed-by: Eric Biggers <ebiggers@kernel.org>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Add a test suite for the POLYVAL library, including:
- All the standard tests and the benchmark from hash-test-template.h
- Comparison with a test vector from the RFC
- Test with key and message containing all one bits
- Additional tests related to the key struct
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20251109234726.638437-4-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Add the following test cases to cover gaps in the SHAKE testing:
- test_shake_all_lens_up_to_4096()
- test_shake_multiple_squeezes()
- test_shake_with_guarded_bufs()
Remove test_shake256_tiling() and test_shake256_tiling2() since they are
superseded by test_shake_multiple_squeezes(). It provides better test
coverage by using randomized testing. E.g., it's able to generate a
zero-length squeeze followed by a nonzero-length squeeze, which the
first 7 versions of the SHA-3 patchset handled incorrectly.
Tested-by: Harald Freudenberger <freude@linux.ibm.com>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20251026055032.1413733-7-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Add a SHA3 kunit test suite, providing the following:
(*) A simple test of each of SHA3-224, SHA3-256, SHA3-384, SHA3-512,
SHAKE128 and SHAKE256.
(*) NIST 0- and 1600-bit test vectors for SHAKE128 and SHAKE256.
(*) Output tiling (multiple squeezing) tests for SHAKE256.
(*) Standard hash template test for SHA3-256. To make this possible,
gen-hash-testvecs.py is modified to support sha3-256.
(*) Standard benchmark test for SHA3-256.
[EB: dropped some unnecessary changes to gen-hash-testvecs.py, moved
addition of Testing section in doc file into this commit, and
other small cleanups]
Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Tested-by: Harald Freudenberger <freude@linux.ibm.com>
Link: https://lore.kernel.org/r/20251026055032.1413733-6-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Add a KUnit test suite for the BLAKE2b library API, mirroring the
BLAKE2s test suite very closely.
As with the BLAKE2s test suite, a benchmark is included.
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20251018043106.375964-9-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Migrate the x86_64 implementation of POLYVAL into lib/crypto/, wiring it
up to the POLYVAL library interface. This makes the POLYVAL library be
properly optimized on x86_64.
This drops the x86_64 optimizations of polyval in the crypto_shash API.
That's fine, since polyval will be removed from crypto_shash entirely
since it is unneeded there. But even if it comes back, the crypto_shash
API could just be implemented on top of the library API, as usual.
Adjust the names and prototypes of the assembly functions to align more
closely with the rest of the library code.
Also replace a movaps instruction with movups to remove the assumption
that the key struct is 16-byte aligned. Users can still align the key
if they want (and at least in this case, movups is just as fast as
movaps), but it's inconvenient to require it.
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20251109234726.638437-6-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Migrate the arm64 implementation of POLYVAL into lib/crypto/, wiring it
up to the POLYVAL library interface. This makes the POLYVAL library be
properly optimized on arm64.
This drops the arm64 optimizations of polyval in the crypto_shash API.
That's fine, since polyval will be removed from crypto_shash entirely
since it is unneeded there. But even if it comes back, the crypto_shash
API could just be implemented on top of the library API, as usual.
Adjust the names and prototypes of the assembly functions to align more
closely with the rest of the library code.
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20251109234726.638437-5-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Add support for POLYVAL to lib/crypto/.
This will replace the polyval crypto_shash algorithm and its use in the
hctr2 template, simplifying the code and reducing overhead.
Specifically, this commit introduces the POLYVAL library API and a
generic implementation of it. Later commits will migrate the existing
architecture-optimized implementations of POLYVAL into lib/crypto/ and
add a KUnit test suite.
I've also rewritten the generic implementation completely, using a more
modern approach instead of the traditional table-based approach. It's
now constant-time, requires no precomputation or dynamic memory
allocations, decreases the per-key memory usage from 4096 bytes to 16
bytes, and is faster than the old polyval-generic even on bulk data
reusing the same key (at least on x86_64, where I measured 15% faster).
We should do this for GHASH too, but for now just do it for POLYVAL.
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Tested-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20251109234726.638437-3-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Add the testcase for the fprobe, which will hook the same target with two
fprobe: entry, entry+exit. And the two fprobes will be registered with
different order.
fgraph and ftrace are both used for the fprobe, and this testcase is for
the mixed situation.
Link: https://lore.kernel.org/all/20251015083238.2374294-3-dongml2@chinatelecom.cn/
Signed-off-by: Menglong Dong <dongml2@chinatelecom.cn>
Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
maple_tree tracepoints contain pointers to function names. Such a pointer
is saved when a tracepoint logs an event. There's no guarantee that it's
still valid when the event is parsed later and the pointer is dereferenced.
The kernel warns about these unsafe pointers.
event 'ma_read' has unsafe pointer field 'fn'
WARNING: kernel/trace/trace.c:3779 at ignore_event+0x1da/0x1e4
Mark the function names as tracepoint_string() to fix the events.
One case that doesn't work without my patch would be trace-cmd record
to save the binary ringbuffer and trace-cmd report to parse it in
userspace. The address of __func__ can't be dereferenced from
userspace but tracepoint_string will add an entry to
/sys/kernel/tracing/printk_formats
Link: https://lkml.kernel.org/r/20251030155537.87972-1-martin@kaiser.cx
Fixes: 54a611b605 ("Maple Tree: add new data structure")
Signed-off-by: Martin Kaiser <martin@kaiser.cx>
Acked-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Two Curve25519 related fixes:
- Re-enable KASAN support on curve25519-hacl64.c with gcc.
- Disable the arm optimized Curve25519 code on CONFIG_CPU_BIG_ENDIAN=y
kernels. It has always been broken in that configuration.
-----BEGIN PGP SIGNATURE-----
iIoEABYIADIWIQSacvsUNc7UX4ntmEPzXCl4vpKOKwUCaQ0FSBQcZWJpZ2dlcnNA
a2VybmVsLm9yZwAKCRDzXCl4vpKOKx9QAP0RbzQkYTIRcbsKkZ0SKNsVO5mpcRsJ
TybkX62xhS70FgEAiWeV9KD1XoAwwSA4eo4Y9b2joafeb6lcNBPClS5TAgg=
=YEZJ
-----END PGP SIGNATURE-----
Merge tag 'libcrypto-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux
Pull crypto library fixes from Eric Biggers:
"Two Curve25519 related fixes:
- Re-enable KASAN support on curve25519-hacl64.c with gcc.
- Disable the arm optimized Curve25519 code on CPU_BIG_ENDIAN
kernels. It has always been broken in that configuration"
* tag 'libcrypto-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux:
lib/crypto: arm/curve25519: Disable on CPU_BIG_ENDIAN
lib/crypto: curve25519-hacl64: Fix older clang KASAN workaround for GCC
The kernel-doc in lib/hweight.c is global to the file and
currently has issues:
Warning: lib/hweight.c:13 expecting prototype for hweightN(). Prototype was for __sw_hweight32() instead
Warning: lib/hweight.c:13 function parameter 'w' not described in '__sw_hweight32'
Update it accordingly.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Yury Norov (NVIDIA) <yury.norov@gmail.com>
AVX-512 supports 3-input XORs via the vpternlogd (or vpternlogq)
instruction with immediate 0x96. This approach, vs. the alternative of
two vpxor instructions, is already used in the CRC, AES-GCM, and AES-XTS
code, since it reduces the instruction count and is faster on some CPUs.
Make blake2s_compress_avx512() take advantage of it too.
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20251102234209.62133-7-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Just before returning, blake2s_compress_ssse3() and
blake2s_compress_avx512() store updated values to the 'h', 't', and 'f'
fields of struct blake2s_ctx. But 'f' is always unchanged (which is
correct; only the C code changes it). So, there's no need to write to
'f'. Use 64-bit stores (movq and vmovq) instead of 128-bit stores
(movdqu and vmovdqu) so that only 't' is written.
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20251102234209.62133-6-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Various cleanups for readability. No change to the generated code:
- Add some comments
- Add #defines for arguments
- Rename some labels
- Use decimal constants instead of hex where it makes sense.
(The pshufd immediates intentionally remain as hex.)
- Add blank lines when there's a logical break
The round loop still could use some work, but this is at least a start.
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20251102234209.62133-5-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Following the usual practice, prefix the names of the data labels with
".L" so that the assembler treats them as truly local. This more
clearly expresses the intent and is less error-prone.
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20251102234209.62133-4-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Since blake2s_compress() is always passed nblocks != 0, remove the
unnecessary check for nblocks == 0 from blake2s_compress_ssse3().
Note that this makes it consistent with blake2s_compress_avx512() in the
same file as well as the arm32 blake2s_compress().
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20251102234209.62133-3-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
In the C code, the 'inc' argument to the assembly functions
blake2s_compress_ssse3() and blake2s_compress_avx512() is declared with
type u32, matching blake2s_compress(). The assembly code then reads it
from the 64-bit %rcx. However, the ABI doesn't guarantee zero-extension
to 64 bits, nor do gcc or clang guarantee it. Therefore, fix these
functions to read this argument from the 32-bit %ecx.
In theory, this bug could have caused the wrong 'inc' value to be used,
causing incorrect BLAKE2s hashes. In practice, probably not: I've fixed
essentially this same bug in many other assembly files too, but there's
never been a real report of it having caused a problem. In x86_64, all
writes to 32-bit registers are zero-extended to 64 bits. That results
in zero-extension in nearly all situations. I've only been able to
demonstrate a lack of zero-extension with a somewhat contrived example
involving truncation, e.g. when the C code has a u64 variable holding
0x1234567800000040 and passes it as a u32 expecting it to be truncated
to 0x40 (64). But that's not what the real code does, of course.
Fixes: ed0356eda1 ("crypto: blake2s - x86_64 SIMD implementation")
Cc: stable@vger.kernel.org
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20251102234209.62133-2-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Remove self-references to filenames from assembly files in
lib/crypto/arm/ and lib/crypto/arm64/. This follows the recommended
practice and eliminates an outdated reference to sha2-ce-core.S.
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20251102014809.170713-1-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Some z/Architecture processors can compute a SHA-3 digest in a single
instruction. arch/s390/crypto/ already uses this capability to optimize
the SHA-3 crypto_shash algorithms.
Use this capability to implement the sha3_224(), sha3_256(), sha3_384(),
and sha3_512() library functions too.
SHA3-256 benchmark results provided by Harald Freudenberger
(https://lore.kernel.org/r/4188d18bfcc8a64941c5ebd8de10ede2@linux.ibm.com/)
on a z/Architecture machine with "facility 86" (MSA level 12):
Length (bytes) Before (MB/s) After (MB/s)
============== ============= ============
16 212 225
64 820 915
256 1850 3350
1024 5400 8300
4096 11200 11300
Note: the original data from Harald was given in the form of a graph for
each length, showing the distribution of throughputs from 500 runs. I
guesstimated the peak of each one.
Harald also reported that the generic SHA-3 code was at most 259 MB/s
(https://lore.kernel.org/r/c39f6b6c110def0095e5da5becc12085@linux.ibm.com/).
So as expected, the earlier commit that optimized sha3_absorb_blocks()
and sha3_keccakf() is the more important one; it optimized the Keccak
permutation which is the most performance-critical part of SHA-3.
Still, this additional commit does notably improve performance further
on some lengths.
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Tested-by: Harald Freudenberger <freude@linux.ibm.com>
Link: https://lore.kernel.org/r/20251026055032.1413733-13-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Add support for architecture-specific overrides of sha3_224(),
sha3_256(), sha3_384(), and sha3_512(). This will be used to implement
these functions more efficiently on s390 than is possible via the usual
init + update + final flow.
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Tested-by: Harald Freudenberger <freude@linux.ibm.com>
Link: https://lore.kernel.org/r/20251026055032.1413733-12-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Implement sha3_absorb_blocks() and sha3_keccakf() using the hardware-
accelerated SHA-3 support in Message-Security-Assist Extension 6.
This accelerates the SHA3-224, SHA3-256, SHA3-384, SHA3-512, and
SHAKE256 library functions.
Note that arch/s390/crypto/ already has SHA-3 code that uses this
extension, but it is exposed only via crypto_shash. This commit brings
the same acceleration to the SHA-3 library. The arch/s390/crypto/
version will become redundant and be removed in later changes.
Tested-by: Harald Freudenberger <freude@linux.ibm.com>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20251026055032.1413733-11-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Instead of exposing the arm64-optimized SHA-3 code via arm64-specific
crypto_shash algorithms, instead just implement the sha3_absorb_blocks()
and sha3_keccakf() library functions. This is much simpler, it makes
the SHA-3 library functions be arm64-optimized, and it fixes the
longstanding issue where the arm64-optimized SHA-3 code was disabled by
default. SHA-3 still remains available through crypto_shash, but
individual architectures no longer need to handle it.
Note: to see the diff from arch/arm64/crypto/sha3-ce-glue.c to
lib/crypto/arm64/sha3.h, view this commit with 'git show -M10'.
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20251026055032.1413733-10-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Since the SHA-3 algorithms are FIPS-approved, add the boot-time
self-test which is apparently required. This closely follows the
corresponding SHA-1, SHA-256, and SHA-512 tests.
Tested-by: Harald Freudenberger <freude@linux.ibm.com>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20251026055032.1413733-8-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
In crypto/sha3_generic.c, the keccakf() function calls keccakf_round()
to do four of Keccak-f's five step mappings. However, it does not do
the Iota step mapping - presumably because that is dependent on round
number, whereas Theta, Rho, Pi and Chi are not.
Note that the keccakf_round() function needs to be explicitly
non-inlined on certain architectures as gcc's produced output will (or
used to) use over 1KiB of stack space if inlined.
Now, this code was copied more or less verbatim into lib/crypto/sha3.c,
so that has the same aesthetic issue. Fix this there by passing the
round number into sha3_keccakf_one_round_generic() and doing the Iota
step mapping there.
crypto/sha3_generic.c is left untouched as that will be converted to use
lib/crypto/sha3.c at some point.
Suggested-by: Eric Biggers <ebiggers@kernel.org>
Signed-off-by: David Howells <dhowells@redhat.com>
Tested-by: Harald Freudenberger <freude@linux.ibm.com>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20251026055032.1413733-5-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Add SHA-3 support to lib/crypto/. All six algorithms in the SHA-3
family are supported: four digests (SHA3-224, SHA3-256, SHA3-384, and
SHA3-512) and two extendable-output functions (SHAKE128 and SHAKE256).
The SHAKE algorithms will be required for ML-DSA.
[EB: simplified the API to use fewer types and functions, fixed bug that
sometimes caused incorrect SHAKE output, cleaned up the
documentation, dropped an ad-hoc test that was inconsistent with
the rest of lib/crypto/, and many other cleanups]
Signed-off-by: David Howells <dhowells@redhat.com>
Co-developed-by: Eric Biggers <ebiggers@kernel.org>
Tested-by: Harald Freudenberger <freude@linux.ibm.com>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20251026055032.1413733-4-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
On big endian arm kernels, the arm optimized Curve25519 code produces
incorrect outputs and fails the Curve25519 test. This has been true
ever since this code was added.
It seems that hardly anyone (or even no one?) actually uses big endian
arm kernels. But as long as they're ostensibly supported, we should
disable this code on them so that it's not accidentally used.
Note: for future-proofing, use !CPU_BIG_ENDIAN instead of
CPU_LITTLE_ENDIAN. Both of these are arch-specific options that could
get removed in the future if big endian support gets dropped.
Fixes: d8f1308a02 ("crypto: arm/curve25519 - wire up NEON implementation")
Cc: stable@vger.kernel.org
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20251104054906.716914-1-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Commit 2f13daee2a ("lib/crypto/curve25519-hacl64: Disable KASAN with
clang-17 and older") inadvertently disabled KASAN in curve25519-hacl64.o
for GCC unconditionally because clang-min-version will always evaluate
to nothing for GCC. Add a check for CONFIG_CC_IS_CLANG to avoid applying
the workaround for GCC, which is only needed for clang-17 and older.
Cc: stable@vger.kernel.org
Fixes: 2f13daee2a ("lib/crypto/curve25519-hacl64: Disable KASAN with clang-17 and older")
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20251103-curve25519-hacl64-fix-kasan-workaround-v2-1-ab581cbd8035@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Clarify that the return values of vsprintf() and sprintf() exclude the
trailing NUL character.
Signed-off-by: Thorsten Blum <thorsten.blum@linux.dev>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Link: https://patch.msgid.link/20251103090913.2066-2-thorsten.blum@linux.dev
Signed-off-by: Petr Mladek <pmladek@suse.com>
- Formally adopt Kconfig in MAINTAINERS
- Fix install-extmod-build for more O= paths
- Align end of .modinfo to fix Authenticode calculation in EDK2
- Restore dynamic check for '-fsanitize=kernel-memory' in
CONFIG_HAVE_KMSAN_COMPILER to ensure backend target has support for
it
- Initialize locale in menuconfig and nconfig to fix UTF-8 terminals
that may not support VT100 ACS by default like PuTTY
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
-----BEGIN PGP SIGNATURE-----
iHUEABYKAB0WIQR74yXHMTGczQHYypIdayaRccAalgUCaQWN/gAKCRAdayaRccAa
ljUzAQCGh5zhfwsCfMc5M7/9v2FiEBKNzTc/xTK04vWdZ3A39QEA6XuyVx+QPBkS
ts7gPrR6NC7AokdqTHXKhBW/ebaNOw0=
=GmCb
-----END PGP SIGNATURE-----
Merge tag 'kbuild-fixes-6.18-2' of git://git.kernel.org/pub/scm/linux/kernel/git/kbuild/linux
Pull Kbuild fixes from Nathan Chancellor:
- Formally adopt Kconfig in MAINTAINERS
- Fix install-extmod-build for more O= paths
- Align end of .modinfo to fix Authenticode calculation in EDK2
- Restore dynamic check for '-fsanitize=kernel-memory' in
CONFIG_HAVE_KMSAN_COMPILER to ensure backend target has support
for it
- Initialize locale in menuconfig and nconfig to fix UTF-8 terminals
that may not support VT100 ACS by default like PuTTY
* tag 'kbuild-fixes-6.18-2' of git://git.kernel.org/pub/scm/linux/kernel/git/kbuild/linux:
kconfig/nconf: Initialize the default locale at startup
kconfig/mconf: Initialize the default locale at startup
KMSAN: Restore dynamic check for '-fsanitize=kernel-memory'
kbuild: align modinfo section for Secureboot Authenticode EDK2 compat
kbuild: install-extmod-build: Fix when given dir outside the build dir
MAINTAINERS: Update Kconfig section
Fixes log overwrite in param_tests and fixes incorrect cast of priv
pointer in test_dev_action(). Updates email address for Rae Moar in
MAINTAINERS KUnit entry.
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEPZKym/RZuOCGeA/kCwJExA0NQxwFAmkDxT0ACgkQCwJExA0N
QxyDaxAAxyE7WU1vUg7AakYo80p5wku/ehahQCicePW0cvKiSwI+VJ3o4A9e7qxI
l0xnvLaisqPzwrSZ4GxCi/0L50OpoFTCAyEeey88tNeqEcjR/b4CDYfJoftCau+b
/4+jluhuCbSC3HQbq77aww1Vgoo184hisAdCvRj0qiPKLk7+D4x1xKMKGCBZYyNP
Jq2sb4H7TJ9rKpRoN9xi+9YWmJGI8EPQHJ1A4FBoS2vk6JyOGlbpJcID1XROtNwz
19tQ9eNarcHSq7yf/Z20BHv0v53ZEvn/c9GYEd1Z5/NtAvPIbyY1U8EhDH9smoAs
0+bDUddZOj1JoQ5Q5TXTx7onKLFpecAIhSGPjBn7mETmIdv0lbtfcJ6aBe0FQTz3
jFfSOBPwd1ng5RVM6FXsJf+JW11Cyba/reGvgtcZDrg8hMluIWUK4N3761KYQnt/
+IoJeJ3A791HNzLDoXKeA6uR0w55LtkJpEnDWi1UqAZE+OOUSx+p1LcGgUAIHxyU
MRaIS4+ZoeVmgopQCSDk2KbiCp8d+yek6Upbm5ivNrwJ00MZqgH5oZA/O8W/shMw
0vjc7mVTpefCSq3EudkQKCT+cB7nh3WAH6devsUNuZlQ7kL1o5Css5bmVF0Iswa7
j+KFuZEiOlOmZgQ+zpzrQWgI+Pu+ZqQg4W8KA3NXwg86blARRfg=
=stUG
-----END PGP SIGNATURE-----
Merge tag 'linux_kselftest-kunit-fixes-6.18-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest
Pull kunit fixes from Shuah Khan:
"Fix log overwrite in param_tests and fixes incorrect cast of priv
pointer in test_dev_action().
Update email address for Rae Moar in MAINTAINERS KUnit entry"
* tag 'linux_kselftest-kunit-fixes-6.18-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
MAINTAINERS: Update KUnit email address for Rae Moar
kunit: prevent log overwrite in param_tests
kunit: test_dev_action: Correctly cast 'priv' pointer to long*
Migrate the arm-optimized BLAKE2b code from arch/arm/crypto/ to
lib/crypto/arm/. This makes the BLAKE2b library able to use it, and it
also simplifies the code because it's easier to integrate with the
library than crypto_shash.
This temporarily makes the arm-optimized BLAKE2b code unavailable via
crypto_shash. A later commit reimplements the blake2b-* crypto_shash
algorithms on top of the BLAKE2b library API, making it available again.
Note that as per the lib/crypto/ convention, the optimized code is now
enabled by default. So, this also fixes the longstanding issue where
the optimized BLAKE2b code was not enabled by default.
To see the diff from arch/arm/crypto/blake2b-neon-glue.c to
lib/crypto/arm/blake2b.h, view this commit with 'git show -M10'.
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20251018043106.375964-8-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Add a library API for BLAKE2b, closely modeled after the BLAKE2s API.
This will allow in-kernel users such as btrfs to use BLAKE2b without
going through the generic crypto layer. In addition, as usual the
BLAKE2b crypto_shash algorithms will be reimplemented on top of this.
Note: to create lib/crypto/blake2b.c I made a copy of
lib/crypto/blake2s.c and made the updates from BLAKE2s => BLAKE2b. This
way, the BLAKE2s and BLAKE2b code is kept consistent. Therefore, it
borrows the SPDX-License-Identifier and Copyright from
lib/crypto/blake2s.c rather than crypto/blake2b_generic.c.
The library API uses 'struct blake2b_ctx', consistent with other
lib/crypto/ APIs. The existing 'struct blake2b_state' will be removed
once the blake2b crypto_shash algorithms are updated to stop using it.
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20251018043106.375964-7-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
A couple more small cleanups to the BLAKE2s code before these things get
propagated into the BLAKE2b code:
- Drop 'const' from some non-pointer function parameters. It was a bit
excessive and not conventional.
- Rename 'block' argument of blake2s_compress*() to 'data'. This is for
consistency with the SHA-* code, and also to avoid the implication
that it points to a singular "block".
No functional changes.
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20251018043106.375964-4-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
For consistency with the SHA-1, SHA-2, SHA-3 (in development), and MD5
library APIs, rename blake2s_state to blake2s_ctx.
As a refresher, the ctx name:
- Is a bit shorter.
- Avoids confusion with the compression function state, which is also
often called the state (but is just part of the full context).
- Is consistent with OpenSSL.
Not a big deal, of course. But consistency is nice. With a BLAKE2b
library API about to be added, this is a convenient time to update this.
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20251018043106.375964-3-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Reorder the parameters of blake2s() from (out, in, key, outlen, inlen,
keylen) to (key, keylen, in, inlen, out, outlen).
This aligns BLAKE2s with the common conventions of pairing buffers and
their lengths, and having outputs follow inputs. This is widely used
elsewhere in lib/crypto/ and crypto/, and even elsewhere in the BLAKE2s
code itself such as blake2s_init_key() and blake2s_final(). So
blake2s() was a bit of an exception.
Notably, this results in the same order as hmac_*_usingrawkey().
Note that since the type signature changed, it's not possible for a
blake2s() call site to be silently missed.
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20251018043106.375964-2-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Add FIPS cryptographic algorithm self-tests for all SHA-1 and SHA-2
algorithms. Following the "Implementation Guidance for FIPS 140-3"
document, to achieve this it's sufficient to just test a single test
vector for each of HMAC-SHA1, HMAC-SHA256, and HMAC-SHA512.
Just run these tests in the initcalls, following the example of e.g.
crypto/kdf_sp800108.c. Note that this should meet the FIPS self-test
requirement even in the built-in case, given that the initcalls run
before userspace, storage, network, etc. are accessible.
This does not fix a regression, seeing as lib/ has had SHA-1 support
since 2005 and SHA-256 support since 2018. Neither ever had FIPS
self-tests. Moreover, fips=1 support has always been an unfinished
feature upstream. However, with lib/ now being used more widely, it's
now seeing more scrutiny and people seem to want these now [1][2].
[1] https://lore.kernel.org/r/3226361.1758126043@warthog.procyon.org.uk/
[2] https://lore.kernel.org/r/f31dbb22-0add-481c-aee0-e337a7731f8e@oracle.com/
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20251011001047.51886-1-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Commit 5ff8c11775 ("KMSAN: Remove tautological checks") changed
CONFIG_HAVE_KMSAN_COMPILER from a dynamic check for
'-fsanitize=kernel-memory' to just being true for CONFIG_CC_IS_CLANG.
This missed the fact that not all architectures supported
'-fsanitize=kernel-memory' at the same time. For example, SystemZ / s390
gained support for KMSAN in clang-18 [1], so builds with clang-15
through clang-17 can select KMSAN but they error with:
clang-16: error: unsupported option '-fsanitize=kernel-memory' for target 's390x-unknown-linux-gnu'
Restore the cc-option check for '-fsanitize=kernel-memory' to make sure
the compiler target properly supports '-fsanitize=kernel-memory'. The
check for '-msan-disable-checks=1' does not need to be restored because
all supported clang versions for building the kernel support it.
Fixes: 5ff8c11775 ("KMSAN: Remove tautological checks")
Link: a3e56a8792 [1]
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/r/202510220236.AVuXXCYy-lkp@intel.com/
Acked-by: Nicolas Schier <nsc@kernel.org>
Link: https://patch.msgid.link/20251023-fix-kmsan-check-s390-clang-v1-1-4e6df477a4cc@kernel.org
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
When running parameterized tests, each test case is initialized with
kunit_init_test(). This function takes the test_case->log as a parameter
but it clears it via string_stream_clear() on each iteration.
This results in only the log from the last parameter being preserved in
the test_case->log and the results from the previous parameters are lost
from the debugfs entry.
Fix this by manually setting the param_test.log to the test_case->log
after it has been initialized. This prevents kunit_init_test() from
clearing the log on each iteration.
Link: https://lore.kernel.org/r/20251024190101.2091549-1-cmllamas@google.com
Fixes: 4b59300ba4 ("kunit: Add parent kunit for parameterized test context")
Signed-off-by: Carlos Llamas <cmllamas@google.com>
Reviewed-by: David Gow <davidgow@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
freader_fetch currently reads from at most two folios. When a read spans
into a third folio, the overflow bytes are copied adjacent to the second
folio’s data instead of being handled as a separate folio.
This patch modifies fetch algorithm to support reading from many folios.
Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com>
Reviewed-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20251026203853.135105-5-mykyta.yatsenko5@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Move struct freader and prototypes of the functions operating on it into
the buildid.h.
This allows reusing freader outside buildid, e.g. for file dynptr
support added later.
Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com>
Link: https://lore.kernel.org/r/20251026203853.135105-4-mykyta.yatsenko5@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
This old alias for in_hardirq() has been marked as deprecated since
2020; remove the stragglers.
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://patch.msgid.link/20251024180654.1691095-1-willy@infradead.org
The previous implementation incorrectly assumed the original type of
'priv' was void**, leading to an unnecessary and misleading
cast. Correct the cast of the 'priv' pointer in test_dev_action() to
its actual type, long*, removing an unnecessary cast.
As an additional benefit, this fixes an out-of-bounds CHERI fault on
hardware with architectural capabilities. The original implementation
tried to store a capability-sized pointer using the priv
pointer. However, the priv pointer's capability only granted access to
the memory region of its original long type, leading to a bounds
violation since the size of a long is smaller than the size of a
capability. This change ensures that the pointer usage respects the
capabilities' bounds.
Link: https://lore.kernel.org/r/20251017092814.80022-1-florian.schmaus@codasip.com
Fixes: d03c720e03 ("kunit: Add APIs for managing devices")
Reviewed-by: David Gow <davidgow@google.com>
Signed-off-by: Florian Schmaus <florian.schmaus@codasip.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
For consistency with the other function templates, change
_subtree_search_*() to use the user-supplied ITSTATIC rather than the
hard-coded 'static'.
Acked-by: Petr Mladek <pmladek@suse.com>
Tested-by: Joe Lawrence <joe.lawrence@redhat.com>
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
KHO test stores physical addresses of the preserved folios directly in
fdt. Use kho_preserve_vmalloc() instead of it and kho_restore_vmalloc()
to retrieve the addresses after kexec.
This makes the test more scalable from one side and adds tests coverage
for kho_preserve_vmalloc() from the other.
Link: https://lkml.kernel.org/r/20250921054458.4043761-5-rppt@kernel.org
Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Reviewed-by: Pratyush Yadav <pratyush@kernel.org>
Cc: Alexander Graf <graf@amazon.com>
Cc: Baoquan He <bhe@redhat.com>
Cc: Changyuan Lyu <changyuanl@google.com>
Cc: Chris Li <chrisl@kernel.org>
Cc: Jason Gunthorpe <jgg@nvidia.com>
Cc: Pasha Tatashin <pasha.tatashin@soleen.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-----BEGIN PGP SIGNATURE-----
iIoEABYKADIWIQQdXVVFGN5XqKr1Hj7LwZzRsCrn5QUCaOAhtRQcem9oYXJAbGlu
dXguaWJtLmNvbQAKCRDLwZzRsCrn5bGkAP91Vptp6rJj3R0N0M338tzkVaeybA+V
fjOHnuB8AvF+FwD/XGLF+lcIFiB8GizXGf/OoyXJwYwciNU7hEdONh5dzQ0=
=euve
-----END PGP SIGNATURE-----
Merge tag 'integrity-v6.18' of git://git.kernel.org/pub/scm/linux/kernel/git/zohar/linux-integrity
Pull integrity updates from Mimi Zohar:
"Just a couple of changes: crypto code cleanup and a IMA xattr bug fix"
* tag 'integrity-v6.18' of git://git.kernel.org/pub/scm/linux/kernel/git/zohar/linux-integrity:
ima: don't clear IMA_DIGSIG flag when setting or removing non-IMA xattr
lib/digsig: Use SHA-1 library instead of crypto_shash
integrity: Select CRYPTO from INTEGRITY_ASYMMETRIC_KEYS
Drivers:
- Add ciphertext hiding support to ccp.
- Add hashjoin, gather and UDMA data move features to hisilicon.
- Add lz4 and lz77_only to hisilicon.
- Add xilinx hwrng driver.
- Add ti driver with ecb/cbc aes support.
- Add ring buffer idle and command queue telemetry for GEN6 in qat.
Others:
- Use rcu_dereference_all to stop false alarms in rhashtable.
- Fix CPU number wraparound in padata.
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEn51F/lCuNhUwmDeSxycdCkmxi6cFAmjbpJsACgkQxycdCkmx
i6fuTRAAzv5o0MIw4Kc7EEU3zMgFSX0FdcTUPY+eiFrWZrSrvUVW+jYcH9ppO8J7
offAYSZYatcyyU9+u8X22CQNKLdXnKQQ0YymWO35TOpvVxveUM1bqEEV1ZK0xaXD
hlJTLoFIsPaVVhi8CW+ZNhDJBwJHNCv7Yi9TUB6sC7rilWWbJ5LzbEVw3Rtg81Lx
0hcuGX2LrpsHOVVWYxGdJ534Kt2lrkt+8/gWOFg3ap3RVQ39tohEjS2Adm2p8eiX
zIdru/aYd89EcYoxuFyylX2d/OLmMAQpFsADy/Fys26eeOWtqggH62V1LAiSyEqw
vLRBCVKpLhlbNNfnUs0f5nqjjYEUrNk9SA4rgoxITwKoucbWBQMS4zWJTEDKz29n
iBBqHsukGpwVOE6RY8BzR/QNJKhZCSsJpGkagS1v6VPa5P1QomuKftGXKB7JKXKz
xoyk+DhJyA8rkb/E5J9Ni7+Tb08Y4zvJ1dpCQHZMlln3DKkK+kk3gkpoxXMZwBV2
LbEMGTI+sfnAfqkGCJYAZR9gDJ5LQDR9jy/Ds5jvPuVvvjyY5LY/bjETqGPF2QVs
Rz2Sg0RHl7PVZOP6QgbQzkV7SkJrZfyu5iYd0ZfUqZr7BaHLOHJG/E/HlUW3/mXu
OjD+Q5gPhiOdc/qn+32+QERTDCFQdbByv0h7khGQA5vHE3XCu8E=
=knnk
-----END PGP SIGNATURE-----
Merge tag 'v6.18-p1' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
Pull crypto updates from Herbert Xu:
"Drivers:
- Add ciphertext hiding support to ccp
- Add hashjoin, gather and UDMA data move features to hisilicon
- Add lz4 and lz77_only to hisilicon
- Add xilinx hwrng driver
- Add ti driver with ecb/cbc aes support
- Add ring buffer idle and command queue telemetry for GEN6 in qat
Others:
- Use rcu_dereference_all to stop false alarms in rhashtable
- Fix CPU number wraparound in padata"
* tag 'v6.18-p1' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (78 commits)
dt-bindings: rng: hisi-rng: convert to DT schema
crypto: doc - Add explicit title heading to API docs
hwrng: ks-sa - fix division by zero in ks_sa_rng_init
KEYS: X.509: Fix Basic Constraints CA flag parsing
crypto: anubis - simplify return statement in anubis_mod_init
crypto: hisilicon/qm - set NULL to qm->debug.qm_diff_regs
crypto: hisilicon/qm - clear all VF configurations in the hardware
crypto: hisilicon - enable error reporting again
crypto: hisilicon/qm - mask axi error before memory init
crypto: hisilicon/qm - invalidate queues in use
crypto: qat - Return pointer directly in adf_ctl_alloc_resources
crypto: aspeed - Fix dma_unmap_sg() direction
rhashtable: Use rcu_dereference_all and rcu_dereference_all_check
crypto: comp - Use same definition of context alloc and free ops
crypto: omap - convert from tasklet to BH workqueue
crypto: qat - Replace kzalloc() + copy_from_user() with memdup_user()
crypto: caam - double the entropy delay interval for retry
padata: WQ_PERCPU added to alloc_workqueue users
padata: replace use of system_unbound_wq with system_dfl_wq
crypto: cryptd - WQ_PERCPU added to alloc_workqueue users
...
Now that a SHA-1 library API is available, use it instead of
crypto_shash. This is simpler and faster.
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de>
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
- The 3 patch series "ida: Remove the ida_simple_xxx() API" from
Christophe Jaillet completes the removal of this legacy IDR API.
- The 9 patch series "panic: introduce panic status function family"
from Jinchao Wang provides a number of cleanups to the panic code and
its various helpers, which were rather ad-hoc and scattered all over the
place.
- The 5 patch series "tools/delaytop: implement real-time keyboard
interaction support" from Fan Yu adds a few nice user-facing usability
changes to the delaytop monitoring tool.
- The 3 patch series "efi: Fix EFI boot with kexec handover (KHO)" from
Evangelos Petrongonas fixes a panic which was happening with the
combination of EFI and KHO.
- The 2 patch series "Squashfs: performance improvement and a sanity
check" from Phillip Lougher teaches squashfs's lseek() about
SEEK_DATA/SEEK_HOLE. A mere 150x speedup was measured for a well-chosen
microbenchmark.
- Plus another 50-odd singleton patches all over the place.
-----BEGIN PGP SIGNATURE-----
iHUEABYIAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCaN78zwAKCRDdBJ7gKXxA
jhLeAQCddTv0XtSUTrvBvmrJVUBrQQeJc+LtNopMIjfAF/WAWAEAogSVKxg+HHEB
GaVixx4zDriNzEqrqiCx9rm4l+YooQA=
=XRe0
-----END PGP SIGNATURE-----
Merge tag 'mm-nonmm-stable-2025-10-02-15-29' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull non-MM updates from Andrew Morton:
- "ida: Remove the ida_simple_xxx() API" from Christophe Jaillet
completes the removal of this legacy IDR API
- "panic: introduce panic status function family" from Jinchao Wang
provides a number of cleanups to the panic code and its various
helpers, which were rather ad-hoc and scattered all over the place
- "tools/delaytop: implement real-time keyboard interaction support"
from Fan Yu adds a few nice user-facing usability changes to the
delaytop monitoring tool
- "efi: Fix EFI boot with kexec handover (KHO)" from Evangelos
Petrongonas fixes a panic which was happening with the combination of
EFI and KHO
- "Squashfs: performance improvement and a sanity check" from Phillip
Lougher teaches squashfs's lseek() about SEEK_DATA/SEEK_HOLE. A mere
150x speedup was measured for a well-chosen microbenchmark
- plus another 50-odd singleton patches all over the place
* tag 'mm-nonmm-stable-2025-10-02-15-29' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (75 commits)
Squashfs: reject negative file sizes in squashfs_read_inode()
kallsyms: use kmalloc_array() instead of kmalloc()
MAINTAINERS: update Sibi Sankar's email address
Squashfs: add SEEK_DATA/SEEK_HOLE support
Squashfs: add additional inode sanity checking
lib/genalloc: fix device leak in of_gen_pool_get()
panic: remove CONFIG_PANIC_ON_OOPS_VALUE
ocfs2: fix double free in user_cluster_connect()
checkpatch: suppress strscpy warnings for userspace tools
cramfs: fix incorrect physical page address calculation
kernel: prevent prctl(PR_SET_PDEATHSIG) from racing with parent process exit
Squashfs: fix uninit-value in squashfs_get_parent
kho: only fill kimage if KHO is finalized
ocfs2: avoid extra calls to strlen() after ocfs2_sprintf_system_inode_name()
kernel/sys.c: fix the racy usage of task_lock(tsk->group_leader) in sys_prlimit64() paths
sched/task.h: fix the wrong comment on task_lock() nesting with tasklist_lock
coccinelle: platform_no_drv_owner: handle also built-in drivers
coccinelle: of_table: handle SPI device ID tables
lib/decompress: use designated initializers for struct compress_format
efi: support booting with kexec handover (KHO)
...
- The 3 patch series "mm, swap: improve cluster scan strategy" from
Kairui Song improves performance and reduces the failure rate of swap
cluster allocation.
- The 4 patch series "support large align and nid in Rust allocators"
from Vitaly Wool permits Rust allocators to set NUMA node and large
alignment when perforning slub and vmalloc reallocs.
- The 2 patch series "mm/damon/vaddr: support stat-purpose DAMOS" from
Yueyang Pan extend DAMOS_STAT's handling of the DAMON operations sets
for virtual address spaces for ops-level DAMOS filters.
- The 3 patch series "execute PROCMAP_QUERY ioctl under per-vma lock"
from Suren Baghdasaryan reduces mmap_lock contention during reads of
/proc/pid/maps.
- The 2 patch series "mm/mincore: minor clean up for swap cache
checking" from Kairui Song performs some cleanup in the swap code.
- The 11 patch series "mm: vm_normal_page*() improvements" from David
Hildenbrand provides code cleanup in the pagemap code.
- The 5 patch series "add persistent huge zero folio support" from
Pankaj Raghav provides a block layer speedup by optionalls making the
huge_zero_pagepersistent, instead of releasing it when its refcount
falls to zero.
- The 3 patch series "kho: fixes and cleanups" from Mike Rapoport adds a
few touchups to the recently added Kexec Handover feature.
- The 10 patch series "mm: make mm->flags a bitmap and 64-bit on all
arches" from Lorenzo Stoakes turns mm_struct.flags into a bitmap. To
end the constant struggle with space shortage on 32-bit conflicting with
64-bit's needs.
- The 2 patch series "mm/swapfile.c and swap.h cleanup" from Chris Li
cleans up some swap code.
- The 7 patch series "selftests/mm: Fix false positives and skip
unsupported tests" from Donet Tom fixes a few things in our selftests
code.
- The 7 patch series "prctl: extend PR_SET_THP_DISABLE to only provide
THPs when advised" from David Hildenbrand "allows individual processes
to opt-out of THP=always into THP=madvise, without affecting other
workloads on the system".
It's a long story - the [1/N] changelog spells out the considerations.
- The 11 patch series "Add and use memdesc_flags_t" from Matthew Wilcox
gets us started on the memdesc project. Please see
https://kernelnewbies.org/MatthewWilcox/Memdescs and
https://blogs.oracle.com/linux/post/introducing-memdesc.
- The 3 patch series "Tiny optimization for large read operations" from
Chi Zhiling improves the efficiency of the pagecache read path.
- The 5 patch series "Better split_huge_page_test result check" from Zi
Yan improves our folio splitting selftest code.
- The 2 patch series "test that rmap behaves as expected" from Wei Yang
adds some rmap selftests.
- The 3 patch series "remove write_cache_pages()" from Christoph Hellwig
removes that function and converts its two remaining callers.
- The 2 patch series "selftests/mm: uffd-stress fixes" from Dev Jain
fixes some UFFD selftests issues.
- The 3 patch series "introduce kernel file mapped folios" from Boris
Burkov introduces the concept of "kernel file pages". Using these
permits btrfs to account its metadata pages to the root cgroup, rather
than to the cgroups of random inappropriate tasks.
- The 2 patch series "mm/pageblock: improve readability of some
pageblock handling" from Wei Yang provides some readability improvements
to the page allocator code.
- The 11 patch series "mm/damon: support ARM32 with LPAE" from SeongJae
Park teaches DAMON to understand arm32 highmem.
- The 4 patch series "tools: testing: Use existing atomic.h for
vma/maple tests" from Brendan Jackman performs some code cleanups and
deduplication under tools/testing/.
- The 2 patch series "maple_tree: Fix testing for 32bit compiles" from
Liam Howlett fixes a couple of 32-bit issues in
tools/testing/radix-tree.c.
- The 2 patch series "kasan: unify kasan_enabled() and remove
arch-specific implementations" from Sabyrzhan Tasbolatov moves KASAN
arch-specific initialization code into a common arch-neutral
implementation.
- The 3 patch series "mm: remove zpool" from Johannes Weiner removes
zspool - an indirection layer which now only redirects to a single thing
(zsmalloc).
- The 2 patch series "mm: task_stack: Stack handling cleanups" from
Pasha Tatashin makes a couple of cleanups in the fork code.
- The 37 patch series "mm: remove nth_page()" from David Hildenbrand
makes rather a lot of adjustments at various nth_page() callsites,
eventually permitting the removal of that undesirable helper function.
- The 2 patch series "introduce kasan.write_only option in hw-tags" from
Yeoreum Yun creates a KASAN read-only mode for ARM, using that
architecture's memory tagging feature. It is felt that a read-only mode
KASAN is suitable for use in production systems rather than debug-only.
- The 3 patch series "mm: hugetlb: cleanup hugetlb folio allocation"
from Kefeng Wang does some tidying in the hugetlb folio allocation code.
- The 12 patch series "mm: establish const-correctness for pointer
parameters" from Max Kellermann makes quite a number of the MM API
functions more accurate about the constness of their arguments. This
was getting in the way of subsystems (in this case CEPH) when they
attempt to improving their own const/non-const accuracy.
- The 7 patch series "Cleanup free_pages() misuse" from Vishal Moola
fixes a number of code sites which were confused over when to use
free_pages() vs __free_pages().
- The 3 patch series "Add Rust abstraction for Maple Trees" from Alice
Ryhl makes the mapletree code accessible to Rust. Required by nouveau
and by its forthcoming successor: the new Rust Nova driver.
- The 2 patch series "selftests/mm: split_huge_page_test:
split_pte_mapped_thp improvements" from David Hildenbrand adds a fix and
some cleanups to the thp selftesting code.
- The 14 patch series "mm, swap: introduce swap table as swap cache
(phase I)" from Chris Li and Kairui Song is the first step along the
path to implementing "swap tables" - a new approach to swap allocation
and state tracking which is expected to yield speed and space
improvements. This patchset itself yields a 5-20% performance benefit
in some situations.
- The 3 patch series "Some ptdesc cleanups" from Matthew Wilcox utilizes
the new memdesc layer to clean up the ptdesc code a little.
- The 3 patch series "Fix va_high_addr_switch.sh test failure" from
Chunyu Hu fixes some issues in our 5-level pagetable selftesting code.
- The 2 patch series "Minor fixes for memory allocation profiling" from
Suren Baghdasaryan addresses a couple of minor issues in relatively new
memory allocation profiling feature.
- The 3 patch series "Small cleanups" from Matthew Wilcox has a few
cleanups in preparation for more memdesc work.
- The 2 patch series "mm/damon: add addr_unit for DAMON_LRU_SORT and
DAMON_RECLAIM" from Quanmin Yan makes some changes to DAMON in
furtherance of supporting arm highmem.
- The 2 patch series "selftests/mm: Add -Wunreachable-code and fix
warnings" from Muhammad Anjum adds that compiler check to selftests code
and fixes the fallout, by removing dead code.
- The 10 patch series "Improvements to Victim Process Thawing and OOM
Reaper Traversal Order" from zhongjinji makes a number of improvements
in the OOM killer: mainly thawing a more appropriate group of victim
threads so they can release resources.
- The 5 patch series "mm/damon: misc fixups and improvements for 6.18"
from SeongJae Park is a bunch of small and unrelated fixups for DAMON.
- The 7 patch series "mm/damon: define and use DAMON initialization
check function" from SeongJae Park implement reliability and
maintainability improvements to a recently-added bug fix.
- The 2 patch series "mm/damon/stat: expose auto-tuned intervals and
non-idle ages" from SeongJae Park provides additional transparency to
userspace clients of the DAMON_STAT information.
- The 2 patch series "Expand scope of khugepaged anonymous collapse"
from Dev Jain removes some constraints on khubepaged's collapsing of
anon VMAs. It also increases the success rate of MADV_COLLAPSE against
an anon vma.
- The 2 patch series "mm: do not assume file == vma->vm_file in
compat_vma_mmap_prepare()" from Lorenzo Stoakes moves us further towards
removal of file_operations.mmap(). This patchset concentrates upon
clearing up the treatment of stacked filesystems.
- The 6 patch series "mm: Improve mlock tracking for large folios" from
Kiryl Shutsemau provides some fixes and improvements to mlock's tracking
of large folios. /proc/meminfo's "Mlocked" field became more accurate.
- The 2 patch series "mm/ksm: Fix incorrect accounting of KSM counters
during fork" from Donet Tom fixes several user-visible KSM stats
inaccuracies across forks and adds selftest code to verify these
counters.
- The 2 patch series "mm_slot: fix the usage of mm_slot_entry" from Wei
Yang addresses some potential but presently benign issues in KSM's
mm_slot handling.
-----BEGIN PGP SIGNATURE-----
iHUEABYKAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCaN3cywAKCRDdBJ7gKXxA
jtaPAQDmIuIu7+XnVUK5V11hsQ/5QtsUeLHV3OsAn4yW5/3dEQD/UddRU08ePN+1
2VRB0EwkLAdfMWW7TfiNZ+yhuoiL/AA=
=4mhY
-----END PGP SIGNATURE-----
Merge tag 'mm-stable-2025-10-01-19-00' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull MM updates from Andrew Morton:
- "mm, swap: improve cluster scan strategy" from Kairui Song improves
performance and reduces the failure rate of swap cluster allocation
- "support large align and nid in Rust allocators" from Vitaly Wool
permits Rust allocators to set NUMA node and large alignment when
perforning slub and vmalloc reallocs
- "mm/damon/vaddr: support stat-purpose DAMOS" from Yueyang Pan extend
DAMOS_STAT's handling of the DAMON operations sets for virtual
address spaces for ops-level DAMOS filters
- "execute PROCMAP_QUERY ioctl under per-vma lock" from Suren
Baghdasaryan reduces mmap_lock contention during reads of
/proc/pid/maps
- "mm/mincore: minor clean up for swap cache checking" from Kairui Song
performs some cleanup in the swap code
- "mm: vm_normal_page*() improvements" from David Hildenbrand provides
code cleanup in the pagemap code
- "add persistent huge zero folio support" from Pankaj Raghav provides
a block layer speedup by optionalls making the
huge_zero_pagepersistent, instead of releasing it when its refcount
falls to zero
- "kho: fixes and cleanups" from Mike Rapoport adds a few touchups to
the recently added Kexec Handover feature
- "mm: make mm->flags a bitmap and 64-bit on all arches" from Lorenzo
Stoakes turns mm_struct.flags into a bitmap. To end the constant
struggle with space shortage on 32-bit conflicting with 64-bit's
needs
- "mm/swapfile.c and swap.h cleanup" from Chris Li cleans up some swap
code
- "selftests/mm: Fix false positives and skip unsupported tests" from
Donet Tom fixes a few things in our selftests code
- "prctl: extend PR_SET_THP_DISABLE to only provide THPs when advised"
from David Hildenbrand "allows individual processes to opt-out of
THP=always into THP=madvise, without affecting other workloads on the
system".
It's a long story - the [1/N] changelog spells out the considerations
- "Add and use memdesc_flags_t" from Matthew Wilcox gets us started on
the memdesc project. Please see
https://kernelnewbies.org/MatthewWilcox/Memdescs and
https://blogs.oracle.com/linux/post/introducing-memdesc
- "Tiny optimization for large read operations" from Chi Zhiling
improves the efficiency of the pagecache read path
- "Better split_huge_page_test result check" from Zi Yan improves our
folio splitting selftest code
- "test that rmap behaves as expected" from Wei Yang adds some rmap
selftests
- "remove write_cache_pages()" from Christoph Hellwig removes that
function and converts its two remaining callers
- "selftests/mm: uffd-stress fixes" from Dev Jain fixes some UFFD
selftests issues
- "introduce kernel file mapped folios" from Boris Burkov introduces
the concept of "kernel file pages". Using these permits btrfs to
account its metadata pages to the root cgroup, rather than to the
cgroups of random inappropriate tasks
- "mm/pageblock: improve readability of some pageblock handling" from
Wei Yang provides some readability improvements to the page allocator
code
- "mm/damon: support ARM32 with LPAE" from SeongJae Park teaches DAMON
to understand arm32 highmem
- "tools: testing: Use existing atomic.h for vma/maple tests" from
Brendan Jackman performs some code cleanups and deduplication under
tools/testing/
- "maple_tree: Fix testing for 32bit compiles" from Liam Howlett fixes
a couple of 32-bit issues in tools/testing/radix-tree.c
- "kasan: unify kasan_enabled() and remove arch-specific
implementations" from Sabyrzhan Tasbolatov moves KASAN arch-specific
initialization code into a common arch-neutral implementation
- "mm: remove zpool" from Johannes Weiner removes zspool - an
indirection layer which now only redirects to a single thing
(zsmalloc)
- "mm: task_stack: Stack handling cleanups" from Pasha Tatashin makes a
couple of cleanups in the fork code
- "mm: remove nth_page()" from David Hildenbrand makes rather a lot of
adjustments at various nth_page() callsites, eventually permitting
the removal of that undesirable helper function
- "introduce kasan.write_only option in hw-tags" from Yeoreum Yun
creates a KASAN read-only mode for ARM, using that architecture's
memory tagging feature. It is felt that a read-only mode KASAN is
suitable for use in production systems rather than debug-only
- "mm: hugetlb: cleanup hugetlb folio allocation" from Kefeng Wang does
some tidying in the hugetlb folio allocation code
- "mm: establish const-correctness for pointer parameters" from Max
Kellermann makes quite a number of the MM API functions more accurate
about the constness of their arguments. This was getting in the way
of subsystems (in this case CEPH) when they attempt to improving
their own const/non-const accuracy
- "Cleanup free_pages() misuse" from Vishal Moola fixes a number of
code sites which were confused over when to use free_pages() vs
__free_pages()
- "Add Rust abstraction for Maple Trees" from Alice Ryhl makes the
mapletree code accessible to Rust. Required by nouveau and by its
forthcoming successor: the new Rust Nova driver
- "selftests/mm: split_huge_page_test: split_pte_mapped_thp
improvements" from David Hildenbrand adds a fix and some cleanups to
the thp selftesting code
- "mm, swap: introduce swap table as swap cache (phase I)" from Chris
Li and Kairui Song is the first step along the path to implementing
"swap tables" - a new approach to swap allocation and state tracking
which is expected to yield speed and space improvements. This
patchset itself yields a 5-20% performance benefit in some situations
- "Some ptdesc cleanups" from Matthew Wilcox utilizes the new memdesc
layer to clean up the ptdesc code a little
- "Fix va_high_addr_switch.sh test failure" from Chunyu Hu fixes some
issues in our 5-level pagetable selftesting code
- "Minor fixes for memory allocation profiling" from Suren Baghdasaryan
addresses a couple of minor issues in relatively new memory
allocation profiling feature
- "Small cleanups" from Matthew Wilcox has a few cleanups in
preparation for more memdesc work
- "mm/damon: add addr_unit for DAMON_LRU_SORT and DAMON_RECLAIM" from
Quanmin Yan makes some changes to DAMON in furtherance of supporting
arm highmem
- "selftests/mm: Add -Wunreachable-code and fix warnings" from Muhammad
Anjum adds that compiler check to selftests code and fixes the
fallout, by removing dead code
- "Improvements to Victim Process Thawing and OOM Reaper Traversal
Order" from zhongjinji makes a number of improvements in the OOM
killer: mainly thawing a more appropriate group of victim threads so
they can release resources
- "mm/damon: misc fixups and improvements for 6.18" from SeongJae Park
is a bunch of small and unrelated fixups for DAMON
- "mm/damon: define and use DAMON initialization check function" from
SeongJae Park implement reliability and maintainability improvements
to a recently-added bug fix
- "mm/damon/stat: expose auto-tuned intervals and non-idle ages" from
SeongJae Park provides additional transparency to userspace clients
of the DAMON_STAT information
- "Expand scope of khugepaged anonymous collapse" from Dev Jain removes
some constraints on khubepaged's collapsing of anon VMAs. It also
increases the success rate of MADV_COLLAPSE against an anon vma
- "mm: do not assume file == vma->vm_file in compat_vma_mmap_prepare()"
from Lorenzo Stoakes moves us further towards removal of
file_operations.mmap(). This patchset concentrates upon clearing up
the treatment of stacked filesystems
- "mm: Improve mlock tracking for large folios" from Kiryl Shutsemau
provides some fixes and improvements to mlock's tracking of large
folios. /proc/meminfo's "Mlocked" field became more accurate
- "mm/ksm: Fix incorrect accounting of KSM counters during fork" from
Donet Tom fixes several user-visible KSM stats inaccuracies across
forks and adds selftest code to verify these counters
- "mm_slot: fix the usage of mm_slot_entry" from Wei Yang addresses
some potential but presently benign issues in KSM's mm_slot handling
* tag 'mm-stable-2025-10-01-19-00' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (372 commits)
mm: swap: check for stable address space before operating on the VMA
mm: convert folio_page() back to a macro
mm/khugepaged: use start_addr/addr for improved readability
hugetlbfs: skip VMAs without shareable locks in hugetlb_vmdelete_list
alloc_tag: fix boot failure due to NULL pointer dereference
mm: silence data-race in update_hiwater_rss
mm/memory-failure: don't select MEMORY_ISOLATION
mm/khugepaged: remove definition of struct khugepaged_mm_slot
mm/ksm: get mm_slot by mm_slot_entry() when slot is !NULL
hugetlb: increase number of reserving hugepages via cmdline
selftests/mm: add fork inheritance test for ksm_merging_pages counter
mm/ksm: fix incorrect KSM counter handling in mm_struct during fork
drivers/base/node: fix double free in register_one_node()
mm: remove PMD alignment constraint in execmem_vmalloc()
mm/memory_hotplug: fix typo 'esecially' -> 'especially'
mm/rmap: improve mlock tracking for large folios
mm/filemap: map entire large folio faultaround
mm/fault: try to map the entire file folio in finish_fault()
mm/rmap: mlock large folios in try_to_unmap_one()
mm/rmap: fix a mlock race condition in folio_referenced_one()
...
-----BEGIN PGP SIGNATURE-----
iQFPBAABCAA5FiEEe7vIQRWZI0iWSE3xu+CwddJFiJoFAmja74IbFIAAAAAABAAO
bWFudTIsMi41KzEuMTEsMiwyAAoJELvgsHXSRYiacR4H/04aBsr7LZnTJVeZLQwK
HKoOwXBqiQyqPdjKXGKnp7Mh9gRp2W3V11VsYTuDJNUS+Vz5YXW0z8cRnUfZ3SYs
l+GZC3vZeAy2EVJE1U6Mb673hU8vziI80IO2q/tGzaj9a+wC3L0lemc+YFQTwG+u
pMtt8zU2vHRjgkx8TNNqJBBOLLDV+RzIl8pqXVnh4eju6x6ZdreGnjXaePYMdjG0
fXLf9XwIeWREqbfeOCEOB50Ts71kkdiOeskwnJyfCTDT8WTu3zC/dICqfh66e3Gg
8hQKvMsuKpm/FwbtgdB0WvaDjENH6PmY+ubLYVxwvNpcsTSqfe0IYGm+HpUP+TPf
m+Y=
=w+JL
-----END PGP SIGNATURE-----
Merge tag 'slab-for-6.18' of git://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab
Pull slab updates from Vlastimil Babka:
- A new layer for caching objects for allocation and free via percpu
arrays called sheaves.
The aim is to combine the good parts of SLAB (lower-overhead and
simpler percpu caching, compared to SLUB) without the past issues
with arrays for freeing remote NUMA node objects and their flushing.
It also allows more efficient kfree_rcu(), and cheaper object
preallocations for cases where the exact number of objects is
unknown, but an upper bound is.
Currently VMAs and maple nodes are using this new caching, with a
plan to enable it for all caches and remove the complex SLUB fastpath
based on cpu (partial) slabs and this_cpu_cmpxchg_double().
(Vlastimil Babka, with Liam Howlett and Pedro Falcato for the maple
tree changes)
- Re-entrant kmalloc_nolock(), which allows opportunistic allocations
from NMI and tracing/kprobe contexts.
Building on prior page allocator and memcg changes, it will result in
removing BPF-specific caches on top of slab (Alexei Starovoitov)
- Various fixes and cleanups. (Kuan-Wei Chiu, Matthew Wilcox, Suren
Baghdasaryan, Ye Liu)
* tag 'slab-for-6.18' of git://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab: (40 commits)
slab: Introduce kmalloc_nolock() and kfree_nolock().
slab: Reuse first bit for OBJEXTS_ALLOC_FAIL
slab: Make slub local_(try)lock more precise for LOCKDEP
mm: Introduce alloc_frozen_pages_nolock()
mm: Allow GFP_ACCOUNT to be used in alloc_pages_nolock().
locking/local_lock: Introduce local_lock_is_locked().
maple_tree: Convert forking to use the sheaf interface
maple_tree: Add single node allocation support to maple state
maple_tree: Prefilled sheaf conversion and testing
tools/testing: Add support for prefilled slab sheafs
maple_tree: Replace mt_free_one() with kfree()
maple_tree: Use kfree_rcu in ma_free_rcu
testing/radix-tree/maple: Hack around kfree_rcu not existing
tools/testing: include maple-shim.c in maple.c
maple_tree: use percpu sheaves for maple_node_cache
mm, vma: use percpu sheaves for vm_area_struct cache
tools/testing: Add support for changes to slab for sheaves
slab: allow NUMA restricted allocations to use percpu sheaves
tools/testing/vma: Implement vm_refcnt reset
slab: skip percpu sheaves for remote object freeing
...
-----BEGIN PGP SIGNATURE-----
iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAmjbLCgQHGF4Ym9lQGtl
cm5lbC5kawAKCRD301j7KXHgpoY0D/9J+11BC88pBxCrLKv/V2TwCNokRMi0dU3L
r3EUdA46k0oXmvb6ueZqIcfY2e+IX7rdQkaRbh1zRdsNejqHo4548C3ePWGdBAcM
OdNEGfpehO0aD0td1+mK/NxoJMLhbs5QraPanz+SOkGZOKeF+vGCga5PUDivsr5J
16T9yb7i+isENLdAc2RJbZVyAphqHQlo5GHi5ZIKOVi5cNt8GU/q2sQl7NYmGvHd
aq37svvZHFOhLRajP959Fw9WOxEYITewzQ4UYf1FZjUodJUxO+vCnP0ooBQRlyu8
1B4PYWwSE+Vn3GkQE0Om+mzo9AVPOiLmoAWGxdgJBMyEkZndocr46XEslXOufQ1Z
T3Gu19G6jCxcyByNVhjVnaajYKmvSQAy1w75m4XlfqTRm4f9Om+LAJavUk3RuaOL
7lXKQ7Ql1/Tby9Jmf8afjYYXXotNDNku6rz2P3qtOwAA26mNJfgVt0rO+8XGRDe9
ioLbCkTjslYMc/Oh4jSsbrspsVALbaQMq/Dmah8k0EWb4QAHVgCJyGBoff3hOboI
jD6B1enaKOQVgcjWcjm/FjOk3jv2h3v4X26YWQZTvEc/1PnSnST78Zi/ePhzDdmt
sBALUAS37TfTgNMzrhbHl5Zs13k0C0XyANuayuKuo5hlNnC1wbdap+5FZJOmpuOB
YT+VkYnaOA==
=kOmc
-----END PGP SIGNATURE-----
Merge tag 'for-6.18/block-20250929' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux
Pull block updates from Jens Axboe:
- NVMe pull request via Keith:
- FC target fixes (Daniel)
- Authentication fixes and updates (Martin, Chris)
- Admin controller handling (Kamaljit)
- Target lockdep assertions (Max)
- Keep-alive updates for discovery (Alastair)
- Suspend quirk (Georg)
- MD pull request via Yu:
- Add support for a lockless bitmap.
A key feature for the new bitmap are that the IO fastpath is
lockless. If a user issues lots of write IO to the same bitmap
bit in a short time, only the first write has additional overhead
to update bitmap bit, no additional overhead for the following
writes.
By supporting only resync or recover written data, means in the
case creating new array or replacing with a new disk, there is no
need to do a full disk resync/recovery.
- Switch ->getgeo() and ->bios_param() to using struct gendisk rather
than struct block_device.
- Rust block changes via Andreas. This series adds configuration via
configfs and remote completion to the rnull driver. The series also
includes a set of changes to the rust block device driver API: a few
cleanup patches, and a few features supporting the rnull changes.
The series removes the raw buffer formatting logic from
`kernel::block` and improves the logic available in `kernel::string`
to support the same use as the removed logic.
- floppy arch cleanups
- Reduce the number of dereferencing needed for ublk commands
- Restrict supported sockets for nbd. Mostly done to eliminate a class
of issues perpetually reported by syzbot, by using nonsensical socket
setups.
- A few s390 dasd block fixes
- Fix a few issues around atomic writes
- Improve DMA interation for integrity requests
- Improve how iovecs are treated with regards to O_DIRECT aligment
constraints.
We used to require each segment to adhere to the constraints, now
only the request as a whole needs to.
- Clean up and improve p2p support, enabling use of p2p for metadata
payloads
- Improve locking of request lookup, using SRCU where appropriate
- Use page references properly for brd, avoiding very long RCU sections
- Fix ordering of recursively submitted IOs
- Clean up and improve updating nr_requests for a live device
- Various fixes and cleanups
* tag 'for-6.18/block-20250929' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux: (164 commits)
s390/dasd: enforce dma_alignment to ensure proper buffer validation
s390/dasd: Return BLK_STS_INVAL for EINVAL from do_dasd_request
ublk: remove redundant zone op check in ublk_setup_iod()
nvme: Use non zero KATO for persistent discovery connections
nvmet: add safety check for subsys lock
nvme-core: use nvme_is_io_ctrl() for I/O controller check
nvme-core: do ioccsz/iorcsz validation only for I/O controllers
nvme-core: add method to check for an I/O controller
blk-cgroup: fix possible deadlock while configuring policy
blk-mq: fix null-ptr-deref in blk_mq_free_tags() from error path
blk-mq: Fix more tag iteration function documentation
selftests: ublk: fix behavior when fio is not installed
ublk: don't access ublk_queue in ublk_unmap_io()
ublk: pass ublk_io to __ublk_complete_rq()
ublk: don't access ublk_queue in ublk_need_complete_req()
ublk: don't access ublk_queue in ublk_check_commit_and_fetch()
ublk: don't pass ublk_queue to ublk_fetch()
ublk: don't access ublk_queue in ublk_config_io_buf()
ublk: don't access ublk_queue in ublk_check_fetch_buf()
ublk: pass q_id and tag to __ublk_check_and_get_req()
...
- Extend modules.builtin.modinfo to include module aliases from
MODULE_DEVICE_TABLE for builtin modules so that userspace tools (such
as kmod) can verify that a particular module alias will be handled by
a builtin module.
- Bump the minimum version of LLVM for building the kernel to 15.0.0.
- Upgrade several userspace API checks in headers_check.pl to errors.
- Unify and consolidate CONFIG_WERROR / W=e handling.
- Turn assembler and linker warnings into errors with CONFIG_WERROR /
W=e.
- Respect CONFIG_WERROR / W=e when building userspace programs
(userprogs).
- Enable -Werror unconditionally when building host programs
(hostprogs).
- Support copy_file_range() and data segment alignment in gen_init_cpio
to improve performance on filesystems that support reflinks such as
btrfs and XFS.
- Miscellaneous small changes to scripts and configuration files.
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
-----BEGIN PGP SIGNATURE-----
iHUEABYKAB0WIQR74yXHMTGczQHYypIdayaRccAalgUCaNrp6QAKCRAdayaRccAa
ljxRAP4hYocKXeWsiJzkTB199P4QUGWf220a9elBmtdJEed07gD/VBnCbSOxG3RO
vS8qbJHwxUFL7a+mDV8RIVXSt99NpAg=
=psG/
-----END PGP SIGNATURE-----
Merge tag 'kbuild-6.18-1' of git://git.kernel.org/pub/scm/linux/kernel/git/kbuild/linux
Pull Kbuild updates from Nathan Chancellor:
- Extend modules.builtin.modinfo to include module aliases from
MODULE_DEVICE_TABLE for builtin modules so that userspace tools (such
as kmod) can verify that a particular module alias will be handled by
a builtin module
- Bump the minimum version of LLVM for building the kernel to 15.0.0
- Upgrade several userspace API checks in headers_check.pl to errors
- Unify and consolidate CONFIG_WERROR / W=e handling
- Turn assembler and linker warnings into errors with CONFIG_WERROR /
W=e
- Respect CONFIG_WERROR / W=e when building userspace programs
(userprogs)
- Enable -Werror unconditionally when building host programs
(hostprogs)
- Support copy_file_range() and data segment alignment in gen_init_cpio
to improve performance on filesystems that support reflinks such as
btrfs and XFS
- Miscellaneous small changes to scripts and configuration files
* tag 'kbuild-6.18-1' of git://git.kernel.org/pub/scm/linux/kernel/git/kbuild/linux: (47 commits)
modpost: Initialize builtin_modname to stop SIGSEGVs
Documentation: kbuild: note CONFIG_DEBUG_EFI in reproducible builds
kbuild: vmlinux.unstripped should always depend on .vmlinux.export.o
modpost: Create modalias for builtin modules
modpost: Add modname to mod_device_table alias
scsi: Always define blogic_pci_tbl structure
kbuild: extract modules.builtin.modinfo from vmlinux.unstripped
kbuild: keep .modinfo section in vmlinux.unstripped
kbuild: always create intermediate vmlinux.unstripped
s390: vmlinux.lds.S: Reorder sections
KMSAN: Remove tautological checks
objtool: Drop noinstr hack for KCSAN_WEAK_MEMORY
lib/Kconfig.debug: Drop CLANG_VERSION check from DEBUG_INFO_DWARF_TOOLCHAIN_DEFAULT
riscv: Remove ld.lld version checks from many TOOLCHAIN_HAS configs
riscv: Unconditionally use linker relaxation
riscv: Remove version check for LTO_CLANG selects
powerpc: Drop unnecessary initializations in __copy_inst_from_kernel_nofault()
mips: Unconditionally select ARCH_HAS_CURRENT_STACK_POINTER
arm64: Remove tautological LLVM Kconfig conditions
ARM: Clean up definition of ARM_HAS_GROUP_RELOCS
...
- A seven patch series adds a new parameterized test features
KUnit parameterized tests currently support two primary methods for
getting parameters:
1. Defining custom logic within a generate_params() function.
2. Using the KUNIT_ARRAY_PARAM() and KUNIT_ARRAY_PARAM_DESC()
macros with a pre-defined static array and passing
the created *_gen_params() to KUNIT_CASE_PARAM().
These methods present limitations when dealing with dynamically
generated parameter arrays, or in scenarios where populating parameters
sequentially via generate_params() is inefficient or overly complex.
These limitations are fixed with a parameterized test method.
- Fixes issues in kunit build artifacts cleanup,
- Fixes parsing skipped test problem in kselftest framework,
- Enables PCI on UML without triggering WARN()
- a few other fixes and adds support for new configs such as MIPS
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEPZKym/RZuOCGeA/kCwJExA0NQxwFAmjbB0QACgkQCwJExA0N
QxxpUQ/8CslEjTv2+/LA122eGDtg8np61W1MTNEQslYyiobhZ9CXeFw4yJlTTb0w
hShFK1pGAWgkVCVKIJOaeItY0hF3BIxyVtlQxDUKvukpMnLvZRI0KhG7p8aYnCq+
jfQRy4gqwjaHyLQekQ1v6vRdHdTfh5mB6OUOCYa7QPQuKhOkBOaq2ZJ+9eFnkPXl
KUGTcC3pL0jQQmaSgo7zFUbdGSq0JZkNbpMj0fAYB0zs+MSpfkAnQYEw3qaxU1/Z
lu+CQdom+77Kt0r7UyEb6xUBeRveqYAWv6oFKq3CBzhlEGCkqVv6zgNg48qMC0QO
cVW0E61u8bVAalhb7hOT3QsEp1k01Lr2N9/VBLP4LRb2HMlD2qlXDo6ftBjNgx/y
m5Kukh1wQoOTaTJekdDMlaRXwApdn3MegheXE83n4dr44C5oyhlR8fFk/0Y6gSEU
RKUg8ZwUNUhCdw4VgBn8zoSkK18D74feRoustmuMS00I/8to3iGvQ7kn3nz2fPMb
AaZ6a2K95gk9TrD4UZdHF1aPHDudn3e8g6LdSOV/NoRNI9H+fXfhvxcLwW8WCqOw
u5qfTxNUk4mPQrCs9hcDnWm9Ppuu5YeY989rKkA7j4z71dhyMOPynpOM+vXRijB8
9zRcvyls87tgDAquIZvTp6Q/oEmi60OF40DRC9MjgOw0Iw7feIc=
=VfHu
-----END PGP SIGNATURE-----
Merge tag 'linux_kselftest-kunit-6.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest
Pull kunit updates from Shuah Khan:
- New parameterized test features
KUnit parameterized tests supported two primary methods for getting
parameters:
- Defining custom logic within a generate_params() function.
- Using the KUNIT_ARRAY_PARAM() and KUNIT_ARRAY_PARAM_DESC() macros
with a pre-defined static array and passing the created
*_gen_params() to KUNIT_CASE_PARAM().
These methods present limitations when dealing with dynamically
generated parameter arrays, or in scenarios where populating
parameters sequentially via generate_params() is inefficient or
overly complex.
These limitations are fixed with a parameterized test method
- Fix issues in kunit build artifacts cleanup
- Fix parsing skipped test problem in kselftest framework
- Enable PCI on UML without triggering WARN()
- a few other fixes and adds support for new configs such as MIPS
* tag 'linux_kselftest-kunit-6.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
kunit: Extend kconfig help text for KUNIT_UML_PCI
rust: kunit: allow `cfg` on `test`s
kunit: qemu_configs: Add MIPS configurations
kunit: Enable PCI on UML without triggering WARN()
Documentation: kunit: Document new parameterized test features
kunit: Add example parameterized test with direct dynamic parameter array setup
kunit: Add example parameterized test with shared resource management using the Resource API
kunit: Enable direct registration of parameter arrays to a KUnit test
kunit: Pass parameterized test context to generate_params()
kunit: Introduce param_init/exit for parameterized test context management
kunit: Add parent kunit for parameterized test context
kunit: tool: Accept --raw_output=full as an alias of 'all'
kunit: tool: Parse skipped tests from kselftest.h
kunit: Always descend into kunit directory during build
- Further consolidation of the VDSO infrastructure and the common data
store.
- Simplification of the related Kconfig logic
- Improve the VDSO selftest suite
-----BEGIN PGP SIGNATURE-----
iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAmjaUJgTHHRnbHhAbGlu
dXRyb25peC5kZQAKCRCmGPVMDXSYoT5VD/9l+TP42vJzwygPArefVM9QijAc4k+g
iptaz5M+Lhk0BlwMUu502Tp1zcwGq5529LlE2P72DuYoZRtZKSFCGES2mye0JiAs
A5Vk8d1zicg3IIGjYgw39shA9ghzuzV3gZdgHVqQ6024uSeKxUY3SqF156GGfPFG
azXWDV7i4RGQvKLm2ye8m7JNYCE3P9f8Wh+GvPaq7qBlW1MsMCWC4FP5qu4iStf3
mol9uA/EM2zSDwkpIpq6P5L2TpIH1dGDNGxVf/HG3J16UhNXzzXv/rxuKHKPNGWm
MQZ8WdPjF+uwdUsNXaImC6aCE0hoQ1Ga7GE2YA3pMQYfySeiJ/fA5I1KfhlUCiHD
oUgnb6PQfZjaBDrIODKcrFHAYc5pTOyJXnJ6q1Lc1yrOfoVcIpMaRXHO/phCeuwI
61gfb9ggbFEH/7cL26Z9XGSqh4BDCKVP5Z+y98OnNEzV74ScpX13/szB3VgdN6LW
tmTOeMPYCo4FQoPw8OV1xkeYOKD2JyqH+garnxDZNFjaVaeUkzlAMKz+1JUjxRZj
UpUhZ8y5KFVjgKkqZHxhucr9cnwV/xd4n+IzZeG4l6XZp4VyirG+5QpAikGYe+Jq
WFU4Ao1xvevyizrJEhoGJKznwtrNrsy6AQ+wLQxYiRR4aTnng8n9T/ry2DVvhJf8
diYWxuRTR7KMxg==
=i3UB
-----END PGP SIGNATURE-----
Merge tag 'timers-vdso-2025-09-29' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull VDSO updates from Thomas Gleixner:
- Further consolidation of the VDSO infrastructure and the common data
store
- Simplification of the related Kconfig logic
- Improve the VDSO selftest suite
* tag 'timers-vdso-2025-09-29' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
selftests: vDSO: Drop vdso_test_clock_getres
selftests: vDSO: vdso_test_abi: Add tests for clock_gettime64()
selftests: vDSO: vdso_test_abi: Test CPUTIME clocks
selftests: vDSO: vdso_test_abi: Use explicit indices for name array
selftests: vDSO: vdso_test_abi: Drop clock availability tests
selftests: vDSO: vdso_test_abi: Use ksft_finished()
selftests: vDSO: vdso_test_abi: Correctly skip whole test with missing vDSO
selftests: vDSO: Fix -Wunitialized in powerpc VDSO_CALL() wrapper
vdso: Add struct __kernel_old_timeval forward declaration to gettime.h
vdso: Gate VDSO_GETRANDOM behind HAVE_GENERIC_VDSO
vdso: Drop Kconfig GENERIC_VDSO_TIME_NS
vdso: Drop Kconfig GENERIC_VDSO_DATA_STORE
vdso: Drop kconfig GENERIC_COMPAT_VDSO
vdso: Drop kconfig GENERIC_VDSO_32
riscv: vdso: Untangle Kconfig logic
time: Build generic update_vsyscall() only with generic time vDSO
vdso/gettimeofday: Remove !CONFIG_TIME_NS stubs
vdso: Move ENABLE_COMPAT_VDSO from core to arm64
ARM: VDSO: Remove cntvct_ok global variable
vdso/datastore: Gate time data behind CONFIG_GENERIC_GETTIMEOFDAY
- Address the inconsistent shutdown sequence of per CPU clockevents on
CPU hotplug, which onoly removed it from the core but failed to invoke
the actual device driver shutdown callback. This keeps the timer
active, which prevents power savings and causes pointless noise in
virtualization.
- Encapsulate the open coded access to the hrtimer clock base, which is a
private implementation detail, so that the implementation can be
changed without breaking a lot of usage sites.
- Enhance the debug output of the clocksource watchdog to provide better
information for analysis.
- The usual set of cleanups and enhancements all over the place
-----BEGIN PGP SIGNATURE-----
iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAmjaT+ETHHRnbHhAbGlu
dXRyb25peC5kZQAKCRCmGPVMDXSYoT8HD/47VRqHbFKazGcwZ4+OLKfFTq0PKzIU
kIhimNMSKVqiPlmWhud/4idGX5JFqMP3dS9ZaDlp7xCtu1ScCqzH72kbCrab0l4g
wjRgneGWsHhv06PY50Ty9FusFSetkjA8XSYavFV9HZNgCRIvho958akdckpazMcF
i4V5WTIVkwhGynhcGwqBXcmHASUa7x6gEjZYBbX6wspPl2Wk5z+vn/zAVIHAozwS
9GPw8HlUeKqM72U12mWGkt4KLjy2gzoupvx9vD4giXRvqkt09rfHtRqvEdkew+/G
ZANhmTJqfA1AiihYP30fHgY5lQbNQG+9O10UKhlUyrlBKPKHKu6dIuJQRSBy0j59
Bqef4UubPBlMP4xRgTbt11SFrjuqDo68bTIDGmQNQvb1BXSXhGA08PDPbIKkFdJh
8cXwUHzjLkDk0tL6kVXRWlsZcCmjz87TVS7PufGE3EIZkh0J2Ob+g7nFS3PsORKW
65ROSRoYmh4CRh/l9CTNbPWFvz3jahIEnxlrSc70o0A4DUeeWTwT/M+MoNOtSJuV
D4qSD35JsFaEbKFfSJwF/0CybG3Nx3UPI8nvEEYQtS4D+BCJHy7INg1P/NtWw+vb
W7puAFd+WbjSN14uQfYkYWC53vQ22isW0nU8W6G5NbMRD5K4WIld4UhDksJIGCLe
2LZ0VHxv4gaXkg==
=A9Tu
-----END PGP SIGNATURE-----
Merge tag 'timers-core-2025-09-29' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull timer core updates from Thomas Gleixner:
- Address the inconsistent shutdown sequence of per CPU clockevents on
CPU hotplug, which only removed it from the core but failed to invoke
the actual device driver shutdown callback. This kept the timer
active, which prevented power savings and caused pointless noise in
virtualization.
- Encapsulate the open coded access to the hrtimer clock base, which is
a private implementation detail, so that the implementation can be
changed without breaking a lot of usage sites.
- Enhance the debug output of the clocksource watchdog to provide
better information for analysis.
- The usual set of cleanups and enhancements all over the place
* tag 'timers-core-2025-09-29' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
time: Fix spelling mistakes in comments
clocksource: Print durations for sync check unconditionally
LoongArch: Remove clockevents shutdown call on offlining
tick: Do not set device to detached state in tick_shutdown()
hrtimer: Reorder branches in hrtimer_clockid_to_base()
hrtimer: Remove hrtimer_clock_base:: Get_time
hrtimer: Use hrtimer_cb_get_time() helper
media: pwm-ir-tx: Avoid direct access to hrtimer clockbase
ALSA: hrtimer: Avoid direct access to hrtimer clockbase
lib: test_objpool: Avoid direct access to hrtimer clockbase
sched/core: Avoid direct access to hrtimer clockbase
timers/itimer: Avoid direct access to hrtimer clockbase
posix-timers: Avoid direct access to hrtimer clockbase
jiffies: Remove obsolete SHIFTED_HZ comment
First set of RISC-V updates for the v6.18 merge window, including:
- Replacement of __ASSEMBLY__ with __ASSEMBLER__ in header files (other
architectures have already merged this type of cleanup)
- The introduction of ioremap_wc() for RISC-V
- Cleanup of the RISC-V kprobes code to use mostly-extant macros rather than
open code
- A RISC-V kprobes unit test
- An architecture-specific endianness swap macro set implementation,
leveraging some dedicated RISC-V instructions for this purpose if they
are available
- The ability to identity and communicate to userspace the presence of a
MIPS P8700-specific ISA extension, and to leverage its MIPS-specific PAUSE
implementation in cpu_relax()
- Several other miscellaneous cleanups
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEElRDoIDdEz9/svf2Kx4+xDQu9KksFAmjaMVIACgkQx4+xDQu9
Kkva4g/9GE4gzwM+KHlc4e1sfsaTo4oXe9eV+Hj3gUJdM+g8dCtNFchPRCKFjHYb
X9lm2YVL9Q3cZ7yWWy/DJtZ66Yz9foQ4laX7uYbjsdbdjbsAXLXhjNp6uk4nBrqb
uW7Uq+Qel8qq66J/B4Z/U0UzF3e5MaptQ6sLuNRONY9OJUxG76zTSJVKwiVNsGaX
8W59b7ALFlCIwCVyXGm/KO1EELjl8FKDENWXFE1v6T8XvfXfYuhcvUMp84ebbzHV
D4kQrO3nxKQVgKEdCtW8xxt0/aWkYQ8zbQg8bD0gDDzMYb5uiJDcMFaa8ZuEYiLg
EcAJX8LmE5GGlTcJf8/jxvaA87hisNsGFvNPXX1OuI26w2TUNC2u80wE6Q6a3aNu
74oUtZEaPhdFuB31A0rALC8Zb2zalnwwjAL7xRlyAozUuye8Ej7mE7w97WTjIfWz
7ZL19/+C1uawljzYLn+FJBIcl1wTyvOgx6T4TJlmOsa4OFnCreFx+a0Az6+Rx1GC
XGsRyDkGPSabIbNVGxUCyTr4w+VpW1WlDjrjwyLC0DQTFW8wyP44W9/K2scp+CqJ
bSCcAz8QtGAeZ5UlSmXYTOV69xXqZPaom7fVk5RFHoy24en5DSo7kj1NJ3ChupRD
8ACpALcIgw/VEo0Tyqqy3dyVhPDuYaZMY3WgGGj9Cz18U37e/Ho=
=nK3D
-----END PGP SIGNATURE-----
Merge tag 'riscv-for-linus-6.18-mw1' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux
Pull RISC-V updates from Paul Walmsley
- Replacement of __ASSEMBLY__ with __ASSEMBLER__ in header files (other
architectures have already merged this type of cleanup)
- The introduction of ioremap_wc() for RISC-V
- Cleanup of the RISC-V kprobes code to use mostly-extant macros rather
than open code
- A RISC-V kprobes unit test
- An architecture-specific endianness swap macro set implementation,
leveraging some dedicated RISC-V instructions for this purpose if
they are available
- The ability to identity and communicate to userspace the presence
of a MIPS P8700-specific ISA extension, and to leverage its
MIPS-specific PAUSE implementation in cpu_relax()
- Several other miscellaneous cleanups
* tag 'riscv-for-linus-6.18-mw1' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: (39 commits)
riscv: errata: Fix the PAUSE Opcode for MIPS P8700
riscv: hwprobe: Document MIPS xmipsexectl vendor extension
riscv: hwprobe: Add MIPS vendor extension probing
riscv: Add xmipsexectl instructions
riscv: Add xmipsexectl as a vendor extension
dt-bindings: riscv: Add xmipsexectl ISA extension description
riscv: cpufeature: add validation for zfa, zfh and zfhmin
perf: riscv: skip empty batches in counter start
selftests: riscv: Add README for RISC-V KSelfTest
riscv: sbi: Switch to new sys-off handler API
riscv: Move vendor errata definitions to new header
RISC-V: ACPI: enable parsing the BGRT table
riscv: Enable ARCH_HAVE_NMI_SAFE_CMPXCHG
riscv: pi: use 'targets' instead of extra-y in Makefile
riscv: introduce asm/swab.h
riscv: mmap(): use unsigned offset type in riscv_sys_mmap
drivers/perf: riscv: Remove redundant ternary operators
riscv: mm: Use mmu-type from FDT to limit SATP mode
riscv: mm: Return intended SATP mode for noXlvl options
riscv: kprobes: Remove duplication of RV_EXTRACT_ITYPE_IMM
...
- Clean up usage of TRAILING_OVERLAP() (Gustavo A. R. Silva)
- lkdtm: fortify: Fix potential NULL dereference on kmalloc failure
(Junjie Cao)
- Add str_assert_deassert() helper (Lad Prabhakar)
- gcc-plugins: Remove TODO_verify_il for GCC >= 16
- kconfig: Fix BrokenPipeError warnings in selftests
- kconfig: Add transitional symbol attribute for migration support
- kcfi: Rename CONFIG_CFI_CLANG to CONFIG_CFI
-----BEGIN PGP SIGNATURE-----
iHUEABYKAB0WIQRSPkdeREjth1dHnSE2KwveOeQkuwUCaNraNQAKCRA2KwveOeQk
u/DkAPwKPP5BSmVR2wkdpQaXIr3PGA+cbBYp34DMJNujZ9piIwD/WZ+HfGTLoERy
+2Q6HLj9hUdd+Rx3IZ8/w1QmnhUIUAU=
=AwV9
-----END PGP SIGNATURE-----
Merge tag 'hardening-v6.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux
Pull hardening updates from Kees Cook:
"One notable addition is the creation of the 'transitional' keyword for
kconfig so CONFIG renaming can go more smoothly.
This has been a long-standing deficiency, and with the renaming of
CONFIG_CFI_CLANG to CONFIG_CFI (since GCC will soon have KCFI
support), this came up again.
The breadth of the diffstat is mainly this renaming.
- Clean up usage of TRAILING_OVERLAP() (Gustavo A. R. Silva)
- lkdtm: fortify: Fix potential NULL dereference on kmalloc failure
(Junjie Cao)
- Add str_assert_deassert() helper (Lad Prabhakar)
- gcc-plugins: Remove TODO_verify_il for GCC >= 16
- kconfig: Fix BrokenPipeError warnings in selftests
- kconfig: Add transitional symbol attribute for migration support
- kcfi: Rename CONFIG_CFI_CLANG to CONFIG_CFI"
* tag 'hardening-v6.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
lib/string_choices: Add str_assert_deassert() helper
kcfi: Rename CONFIG_CFI_CLANG to CONFIG_CFI
kconfig: Add transitional symbol attribute for migration support
kconfig: Fix BrokenPipeError warnings in selftests
gcc-plugins: Remove TODO_verify_il for GCC >= 16
stddef: Introduce __TRAILING_OVERLAP()
stddef: Remove token-pasting in TRAILING_OVERLAP()
lkdtm: fortify: Fix potential NULL dereference on kmalloc failure
- PCI: Fix theoretical underflow in use of ffs().
- Universally apply __attribute_const__ to all architecture's ffs()-family
of functions.
- Add KUnit tests for ffs() behavior and const-ness.
-----BEGIN PGP SIGNATURE-----
iHUEABYKAB0WIQRSPkdeREjth1dHnSE2KwveOeQkuwUCaNrWngAKCRA2KwveOeQk
u3ZGAPwJTscARU4MspnqpbuAV601dG1TNoJG+8JYH84r+R2jjQEAlmBZB0jaHbC2
qFWjHivD/0ofvihKfAPFgxlakyV1XAg=
=diXF
-----END PGP SIGNATURE-----
Merge tag 'ffs-const-v6.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux
Pull ffs const-attribute cleanups from Kees Cook:
"While working on various hardening refactoring a while back we
encountered inconsistencies in the application of __attribute_const__
on the ffs() family of functions.
This series fixes this across all archs and adds KUnit tests.
Notably, this found a theoretical underflow in PCI (also fixed here)
and uncovered an inefficiency in ARC (fixed in the ARC arch PR). I
kept the series separate from the general hardening PR since it is a
stand-alone "topic".
- PCI: Fix theoretical underflow in use of ffs().
- Universally apply __attribute_const__ to all architecture's
ffs()-family of functions.
- Add KUnit tests for ffs() behavior and const-ness"
* tag 'ffs-const-v6.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
KUnit: ffs: Validate all the __attribute_const__ annotations
sparc: Add __attribute_const__ to ffs()-family implementations
xtensa: Add __attribute_const__ to ffs()-family implementations
s390: Add __attribute_const__ to ffs()-family implementations
parisc: Add __attribute_const__ to ffs()-family implementations
mips: Add __attribute_const__ to ffs()-family implementations
m68k: Add __attribute_const__ to ffs()-family implementations
openrisc: Add __attribute_const__ to ffs()-family implementations
riscv: Add __attribute_const__ to ffs()-family implementations
hexagon: Add __attribute_const__ to ffs()-family implementations
alpha: Add __attribute_const__ to ffs()-family implementations
sh: Add __attribute_const__ to ffs()-family implementations
powerpc: Add __attribute_const__ to ffs()-family implementations
x86: Add __attribute_const__ to ffs()-family implementations
csky: Add __attribute_const__ to ffs()-family implementations
bitops: Add __attribute_const__ to generic ffs()-family implementations
KUnit: Introduce ffs()-family tests
PCI: Test for bit underflow in pcie_set_readrq()
Add support for 2-way interleaved SHA-256 hashing to lib/crypto/, and
make fsverity use it for faster file data verification. This improves
fsverity performance on many x86_64 and arm64 processors.
Later, I plan to make dm-verity use this too.
-----BEGIN PGP SIGNATURE-----
iIoEABYIADIWIQSacvsUNc7UX4ntmEPzXCl4vpKOKwUCaNg4/RQcZWJpZ2dlcnNA
a2VybmVsLm9yZwAKCRDzXCl4vpKOK4fMAP9Xz00JNDfJ+mOVHIYOhAlWFGnug0X1
cvoRf4QXchNlbwD9HTJQQDQXnbsPy3QPrUVfl2FqCW7c6vRlBJijhD6j4wE=
=6dCR
-----END PGP SIGNATURE-----
Merge tag 'fsverity-for-linus' of git://git.kernel.org/pub/scm/fs/fsverity/linux
Pull interleaved SHA-256 hashing support from Eric Biggers:
"Optimize fsverity with 2-way interleaved hashing
Add support for 2-way interleaved SHA-256 hashing to lib/crypto/, and
make fsverity use it for faster file data verification. This improves
fsverity performance on many x86_64 and arm64 processors.
Later, I plan to make dm-verity use this too"
* tag 'fsverity-for-linus' of git://git.kernel.org/pub/scm/fs/fsverity/linux:
fsverity: Use 2-way interleaved SHA-256 hashing when supported
fsverity: Remove inode parameter from fsverity_hash_block()
lib/crypto: tests: Add tests and benchmark for sha256_finup_2x()
lib/crypto: x86/sha256: Add support for 2-way interleaved hashing
lib/crypto: arm64/sha256: Add support for 2-way interleaved hashing
lib/crypto: sha256: Add support for 2-way interleaved hashing
- Add a RISC-V optimized implementation of Poly1305. This code was
written by Andy Polyakov and contributed by Zhihang Shao.
- Migrate the MD5 code into lib/crypto/, and add KUnit tests for MD5.
Yes, it's still the 90s, and several kernel subsystems are still using
MD5 for legacy use cases. As long as that remains the case, it's
helpful to clean it up in the same way as I've been doing for other
algorithms. Later, I plan to convert most of these users of MD5 to use
the new MD5 library API instead of the generic crypto API.
- Simplify the organization of the ChaCha, Poly1305, BLAKE2s, and
Curve25519 code. Consolidate these into one module per algorithm,
and centralize the configuration and build process. This is the same
reorganization that has already been successful for SHA-1 and SHA-2.
- Remove the unused crypto_kpp API for Curve25519.
- Migrate the BLAKE2s and Curve25519 self-tests to KUnit.
- Always enable the architecture-optimized BLAKE2s code.
Due to interdependencies between test and non-test code, both are
included in this pull request. The broken-down diffstat is as follows:
Tests: 735 insertions(+), 1917 deletions(-)
RISC-V Poly1305: 877 insertions(+), 1 deletion(-)
Other: 1777 insertions(+), 3117 deletions(-)
Besides the new RISC-V code which is an addition, there are quite a
few simplifications due to the improved code organization for multiple
algorithms, the removal of the unused crypto_kpp API for Curve25519
and redundant tests, and the redesign of the BLAKE2s test.
-----BEGIN PGP SIGNATURE-----
iIoEABYIADIWIQSacvsUNc7UX4ntmEPzXCl4vpKOKwUCaNgwUhQcZWJpZ2dlcnNA
a2VybmVsLm9yZwAKCRDzXCl4vpKOK3EnAP96hB1wD12DvIovGCmWnnlbzOt+CoK2
B5CW74eYEZiSbwD7BiKPDqvSmLzEBtbKmOSwRvxKuQ2uGGef3USFKYVCiw0=
=DY5R
-----END PGP SIGNATURE-----
Merge tag 'libcrypto-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux
Pull crypto library updates from Eric Biggers:
- Add a RISC-V optimized implementation of Poly1305. This code was
written by Andy Polyakov and contributed by Zhihang Shao.
- Migrate the MD5 code into lib/crypto/, and add KUnit tests for MD5.
Yes, it's still the 90s, and several kernel subsystems are still
using MD5 for legacy use cases. As long as that remains the case,
it's helpful to clean it up in the same way as I've been doing for
other algorithms.
Later, I plan to convert most of these users of MD5 to use the new
MD5 library API instead of the generic crypto API.
- Simplify the organization of the ChaCha, Poly1305, BLAKE2s, and
Curve25519 code.
Consolidate these into one module per algorithm, and centralize the
configuration and build process. This is the same reorganization that
has already been successful for SHA-1 and SHA-2.
- Remove the unused crypto_kpp API for Curve25519.
- Migrate the BLAKE2s and Curve25519 self-tests to KUnit.
- Always enable the architecture-optimized BLAKE2s code.
* tag 'libcrypto-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux: (38 commits)
crypto: md5 - Implement export_core() and import_core()
wireguard: kconfig: simplify crypto kconfig selections
lib/crypto: tests: Enable Curve25519 test when CRYPTO_SELFTESTS
lib/crypto: curve25519: Consolidate into single module
lib/crypto: curve25519: Move a couple functions out-of-line
lib/crypto: tests: Add Curve25519 benchmark
lib/crypto: tests: Migrate Curve25519 self-test to KUnit
crypto: curve25519 - Remove unused kpp support
crypto: testmgr - Remove curve25519 kpp tests
crypto: x86/curve25519 - Remove unused kpp support
crypto: powerpc/curve25519 - Remove unused kpp support
crypto: arm/curve25519 - Remove unused kpp support
crypto: hisilicon/hpre - Remove unused curve25519 kpp support
lib/crypto: tests: Add KUnit tests for BLAKE2s
lib/crypto: blake2s: Consolidate into single C translation unit
lib/crypto: blake2s: Move generic code into blake2s.c
lib/crypto: blake2s: Always enable arch-optimized BLAKE2s code
lib/crypto: blake2s: Remove obsolete self-test
lib/crypto: x86/blake2s: Reduce size of BLAKE2S_SIGMA2
lib/crypto: chacha: Consolidate into single module
...
Update crc_kunit to test the CRC functions in softirq and hardirq
contexts, similar to what the lib/crypto/ KUnit tests do. Move the
helper function needed to do this into a common header.
This is useful mainly to test fallback code paths for when
FPU/SIMD/vector registers are unusable.
-----BEGIN PGP SIGNATURE-----
iIoEABYIADIWIQSacvsUNc7UX4ntmEPzXCl4vpKOKwUCaNgevxQcZWJpZ2dlcnNA
a2VybmVsLm9yZwAKCRDzXCl4vpKOK42cAP4qrRd2p4lLm4lntiL9igO0AVUl7Lsj
2hhYcVWNj24wwgD/bBb5aCCKfreOQbjMldYjL+UlRiYqQAsstlfcwY7EegM=
=aEsr
-----END PGP SIGNATURE-----
Merge tag 'crc-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux
Pull CRC updates from Eric Biggers:
"Update crc_kunit to test the CRC functions in softirq and hardirq
contexts, similar to what the lib/crypto/ KUnit tests do. Move the
helper function needed to do this into a common header.
This is useful mainly to test fallback code paths for when
FPU/SIMD/vector registers are unusable"
* tag 'crc-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux:
Documentation/staging: Fix typo and incorrect citation in crc32.rst
lib/crc: Drop inline from all *_mod_init_arch() functions
lib/crc: Use underlying functions instead of crypto_simd_usable()
lib/crc: crc_kunit: Test CRC computation in interrupt contexts
kunit, lib/crypto: Move run_irq_test() to common header
Use the generic interface which should result in less bulk allocations
during a forking.
A part of this is to abstract the freeing of the sheaf or maple state
allocations into its own function so mas_destroy() and the tree
duplication code can use the same functionality to return any unused
resources.
[andriy.shevchenko@linux.intel.com: remove unused mt_alloc_bulk()]
Signed-off-by: Liam R. Howlett <Liam.Howlett@Oracle.com>
Reviewed-by: Suren Baghdasaryan <surenb@google.com>
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
The fast path through a write will require replacing a single node in
the tree. Using a sheaf (32 nodes) is too heavy for the fast path, so
special case the node store operation by just allocating one node in the
maple state.
Signed-off-by: Liam R. Howlett <Liam.Howlett@Oracle.com>
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Use prefilled sheaves instead of bulk allocations. This should speed up
the allocations and the return path of unused allocations.
Remove the push and pop of nodes from the maple state as this is now
handled by the slab layer with sheaves.
Testing has been removed as necessary since the features of the tree
have been reduced.
Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Reviewed-by: Suren Baghdasaryan <surenb@google.com>
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
kfree() is a little shorter and works with kmem_cache_alloc'd pointers
too. Also lets us remove one more helper.
Signed-off-by: Pedro Falcato <pfalcato@suse.de>
Reviewed-by: Suren Baghdasaryan <surenb@google.com>
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
kfree_rcu is an optimized version of call_rcu + kfree. It used to not be
possible to call it on non-kmalloc objects, but this restriction was
lifted ever since SLOB was dropped from the kernel, and since commit
6c6c47b063 ("mm, slab: call kvfree_rcu_barrier() from kmem_cache_destroy()").
Thus, replace call_rcu + mt_free_rcu with kfree_rcu.
Signed-off-by: Pedro Falcato <pfalcato@suse.de>
Reviewed-by: Harry Yoo <harry.yoo@oracle.com>
Reviewed-by: Suren Baghdasaryan <surenb@google.com>
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Setup the maple_node_cache with percpu sheaves of size 32 to hopefully
improve its performance. Note this will not immediately take advantage
of sheaf batching of kfree_rcu() operations due to the maple tree using
call_rcu with custom callbacks. The followup changes to maple tree will
change that and also make use of the prefilled sheaves functionality.
Reviewed-by: Sidhartha Kumar <sidhartha.kumar@oracle.com>
Reviewed-by: Suren Baghdasaryan <surenb@google.com>
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Bulk insert mode was added to facilitate forking faster, but forking now
uses __mt_dup() to duplicate the tree.
The addition of sheaves has made the bulk allocations difficult to
maintain - since the expected entries would preallocate into the maple
state. A big part of the maple state node allocation was the ability to
push nodes back onto the state for later use, which was essential to the
bulk insert algorithm.
Remove mas_expected_entries() and mas_destroy_rebalance() functions as
well as the MA_STATE_BULK and MA_STATE_REBALANCE maple state flags since
there are no users anymore. Drop the associated testing as well.
Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Reviewed-by: Suren Baghdasaryan <surenb@google.com>
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Commit 16f5dfbc85 ("gfp: include __GFP_NOWARN in GFP_NOWAIT") made
GFP_NOWAIT implicitly include __GFP_NOWARN.
Therefore, explicit __GFP_NOWARN combined with GFP_NOWAIT (e.g.,
`GFP_NOWAIT | __GFP_NOWARN`) is now redundant. Let's clean up these
redundant flags across subsystems.
No functional changes.
Link: https://lkml.kernel.org/r/20250804125657.482109-1-rongqianfeng@vivo.com
Signed-off-by: Qianfeng Rong <rongqianfeng@vivo.com>
Reviewed-by: Wei Yang <richard.weiyang@gmail.com>
Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Make sure to drop the reference taken when looking up the genpool platform
device in of_gen_pool_get() before returning the pool.
Note that holding a reference to a device does typically not prevent its
devres managed resources from being released so there is no point in
keeping the reference.
Link: https://lkml.kernel.org/r/20250924080207.18006-1-johan@kernel.org
Fixes: 9375db07ad ("genalloc: add devres support, allow to find a managed pool by device")
Signed-off-by: Johan Hovold <johan@kernel.org>
Cc: Philipp Zabel <p.zabel@pengutronix.de>
Cc: Vladimir Zapolskiy <vz@mleia.com>
Cc: <stable@vger.kernel.org> [3.10+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
There's really no need for this since it's 0 or 1 when
CONFIG_PANIC_ON_OOPS is disabled/enabled, so just use IS_ENABLED()
instead. The extra symbol goes back to the original code adding it in
commit 2a01bb3885 ("panic: Make panic_on_oops configurable").
Link: https://lkml.kernel.org/r/20250924094303.18521-2-johannes@sipsolutions.net
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
The kernel's CFI implementation uses the KCFI ABI specifically, and is
not strictly tied to a particular compiler. In preparation for GCC
supporting KCFI, rename CONFIG_CFI_CLANG to CONFIG_CFI (along with
associated options).
Use new "transitional" Kconfig option for old CONFIG_CFI_CLANG that will
enable CONFIG_CFI during olddefconfig.
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Reviewed-by: Nathan Chancellor <nathan@kernel.org>
Link: https://lore.kernel.org/r/20250923213422.1105654-3-kees@kernel.org
Signed-off-by: Kees Cook <kees@kernel.org>
Switch 'compressed_formats[]' to the more modern and flexible designated
initializers. This improves readability and allows struct fields to be
reordered. Also use a more concise sentinel marker.
Remove the curly braces around the for loop while we're at it.
No functional changes intended.
Link: https://lkml.kernel.org/r/20250910232350.1308206-2-thorsten.blum@linux.dev
Signed-off-by: Thorsten Blum <thorsten.blum@linux.dev>
Reviewed-by: Kuan-Wei Chiu <visitorckw@gmail.com>
Cc: David Sterba <dsterba@suse.com>
Cc: Nick Terrell <terrelln@fb.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
While rare, memory allocation profiling can contain inaccurate counters if
slab object extension vector allocation fails. That allocation might
succeed later but prior to that, slab allocations that would have used
that object extension vector will not be accounted for. To indicate
incorrect counters, "accurate:no" marker is appended to the call site line
in the /proc/allocinfo output. Bump up /proc/allocinfo version to reflect
the change in the file format and update documentation.
Example output with invalid counters:
allocinfo - version: 2.0
0 0 arch/x86/kernel/kdebugfs.c:105 func:create_setup_data_nodes
0 0 arch/x86/kernel/alternative.c:2090 func:alternatives_smp_module_add
0 0 arch/x86/kernel/alternative.c:127 func:__its_alloc accurate:no
0 0 arch/x86/kernel/fpu/regset.c:160 func:xstateregs_set
0 0 arch/x86/kernel/fpu/xstate.c:1590 func:fpstate_realloc
0 0 arch/x86/kernel/cpu/aperfmperf.c:379 func:arch_enable_hybrid_capacity_scale
0 0 arch/x86/kernel/cpu/amd_cache_disable.c:258 func:init_amd_l3_attrs
49152 48 arch/x86/kernel/cpu/mce/core.c:2709 func:mce_device_create accurate:no
32768 1 arch/x86/kernel/cpu/mce/genpool.c:132 func:mce_gen_pool_create
0 0 arch/x86/kernel/cpu/mce/amd.c:1341 func:mce_threshold_create_device
[surenb@google.com: document new "accurate:no" marker]
Fixes: 39d117e04d15 ("alloc_tag: mark inaccurate allocation counters in /proc/allocinfo output")
[akpm@linux-foundation.org: simplification per Usama, reflow text]
[akpm@linux-foundation.org: add newline to prevent docs warning, per Randy]
Link: https://lkml.kernel.org/r/20250915230224.4115531-1-surenb@google.com
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Suggested-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Shakeel Butt <shakeel.butt@linux.dev>
Acked-by: Usama Arif <usamaarif642@gmail.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: David Rientjes <rientjes@google.com>
Cc: David Wang <00107082@163.com>
Cc: Kent Overstreet <kent.overstreet@linux.dev>
Cc: Pasha Tatashin <pasha.tatashin@soleen.com>
Cc: Roman Gushchin <roman.gushchin@linux.dev>
Cc: Sourav Panda <souravpanda@google.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Memory profiling can be shut down due to reasons like a failure during
initialization. When this happens, the user should not be able to
re-enable it. Current sysctrl interface does not handle this properly and
will allow re-enabling memory profiling. Fix this by checking for this
condition during sysctrl write operation.
Link: https://lkml.kernel.org/r/20250915212756.3998938-3-surenb@google.com
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Acked-by: Shakeel Butt <shakeel.butt@linux.dev>
Acked-by: Usama Arif <usamaarif642@gmail.com>
Cc: David Wang <00107082@163.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Kent Overstreet <kent.overstreet@linux.dev>
Cc: Pasha Tatashin <pasha.tatashin@soleen.com>
Cc: Sourav Panda <souravpanda@google.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Patch series "Minor fixes for memory allocation profiling", v2.
Over the last couple months I gathered a few reports of minor issues in
memory allocation profiling which are addressed in this patchset.
This patch (of 2):
When bulk-freeing an array of pages use release_pages() instead of freeing
them page-by-page.
Link: https://lkml.kernel.org/r/20250915212756.3998938-1-surenb@google.com
Link: https://lkml.kernel.org/r/20250915212756.3998938-2-surenb@google.com
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Suggested-by: Andrew Morton <akpm@linux-foundation.org>
Suggested-by: Usama Arif <usamaarif642@gmail.com>
Acked-by: Shakeel Butt <shakeel.butt@linux.dev>
Acked-by: Usama Arif <usamaarif642@gmail.com>
Cc: David Wang <00107082@163.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Kent Overstreet <kent.overstreet@linux.dev>
Cc: Pasha Tatashin <pasha.tatashin@soleen.com>
Cc: Sourav Panda <souravpanda@google.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Patch series "kasan: unify kasan_enabled() and remove arch-specific
implementations", v6.
This patch series addresses the fragmentation in KASAN initialization
across architectures by introducing a unified approach that eliminates
duplicate static keys and arch-specific kasan_arch_is_ready()
implementations.
The core issue is that different architectures have inconsistent approaches
to KASAN readiness tracking:
- PowerPC, LoongArch, and UML arch, each implement own kasan_arch_is_ready()
- Only HW_TAGS mode had a unified static key (kasan_flag_enabled)
- Generic and SW_TAGS modes relied on arch-specific solutions
or always-on behavior
This patch (of 2):
Introduce CONFIG_ARCH_DEFER_KASAN to identify architectures [1] that need
to defer KASAN initialization until shadow memory is properly set up, and
unify the static key infrastructure across all KASAN modes.
[1] PowerPC, UML, LoongArch selects ARCH_DEFER_KASAN.
The core issue is that different architectures haveinconsistent approaches
to KASAN readiness tracking:
- PowerPC, LoongArch, and UML arch, each implement own
kasan_arch_is_ready()
- Only HW_TAGS mode had a unified static key (kasan_flag_enabled)
- Generic and SW_TAGS modes relied on arch-specific solutions or always-on
behavior
This patch addresses the fragmentation in KASAN initialization across
architectures by introducing a unified approach that eliminates duplicate
static keys and arch-specific kasan_arch_is_ready() implementations.
Let's replace kasan_arch_is_ready() with existing kasan_enabled() check,
which examines the static key being enabled if arch selects
ARCH_DEFER_KASAN or has HW_TAGS mode support. For other arch,
kasan_enabled() checks the enablement during compile time.
Now KASAN users can use a single kasan_enabled() check everywhere.
Link: https://lkml.kernel.org/r/20250810125746.1105476-1-snovitoll@gmail.com
Link: https://lkml.kernel.org/r/20250810125746.1105476-2-snovitoll@gmail.com
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=217049
Signed-off-by: Sabyrzhan Tasbolatov <snovitoll@gmail.com>
Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Reviewed-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com> #powerpc
Cc: Alexander Gordeev <agordeev@linux.ibm.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Alexandre Ghiti <alex@ghiti.fr>
Cc: Alexandre Ghiti <alexghiti@rivosinc.com>
Cc: Andrey Konovalov <andreyknvl@gmail.com>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Baoquan He <bhe@redhat.com>
Cc: David Gow <davidgow@google.com>
Cc: Dmitriy Vyukov <dvyukov@google.com>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Huacai Chen <chenhuacai@loongson.cn>
Cc: Marco Elver <elver@google.com>
Cc: Qing Zhang <zhangqing@loongson.cn>
Cc: Sabyrzhan Tasbolatov <snovitoll@gmail.com>
Cc: Vincenzo Frascino <vincenzo.frascino@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-----BEGIN PGP SIGNATURE-----
iQFSBAABCgA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAmjHMcoeHHRvcnZhbGRz
QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiG5bwH/23w8iGB4hf7L/7Z
e7blX42Pe9EXA1uK62iWmwEjDvBuJ7TmVfXH09qYJ56fj6/rJEdpQwtBMd4ypL81
QA/7lq5UEl0apPzMN86J8EHCzmjNzv7o+UtEd4C/hPFEZHZJa5Hqj9CBglSwSCEn
fTkLk7Gl6s8SfzBQ/rXX6/ZChAB/RleVWabDlIQMDz++/+9DZ0aqphj+5bYSqysL
ROQOaj4LOICuLfrup9J61hKNBoF7Dv3sO20vc+Iic0XHRPZ6/lKCnHgCUsqVIOOQ
L4kDT7XKQg+n3ttjrMe84/8iHZdWtf8VMWrtniPT8e1YGYuMpavVplgIcFoFCoNm
Qa7NPDs=
=rZeT
-----END PGP SIGNATURE-----
gpgsig -----BEGIN PGP SIGNATURE-----
iHUEABYKAB0WIQR74yXHMTGczQHYypIdayaRccAalgUCaM3AYQAKCRAdayaRccAa
lkrsAQCfR0LymE8Hq+Vfk65DK4qZxigaXGTfg5n3xlPhTAh/iQEA02N0/ReHOOdH
nQde8709saIFE5axIMFvdWzbFPDtWwE=
=eIkf
-----END PGP SIGNATURE-----
Merge 6.17-rc6 into kbuild-next
Commit bd7c231212 ("pinctrl: meson: Fix typo in device table macro")
is needed in kbuild-next to avoid a build error with a future change.
While at it, address the conflict between commit 41f9049cff ("riscv:
Only allow LTO with CMODEL_MEDANY") and commit 6578a1ff6a ("riscv:
Remove version check for LTO_CLANG selects"), as reported by Stephen
Rothwell [1].
Link: https://lore.kernel.org/20250908134913.68778b7b@canb.auug.org.au/ [1]
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Add an implementation of sha256_finup_2x_arch() for x86_64. It
interleaves the computation of two SHA-256 hashes using the x86 SHA-NI
instructions. dm-verity and fs-verity will take advantage of this for
greatly improved performance on capable CPUs.
This increases the throughput of SHA-256 hashing 4096-byte messages by
the following amounts on the following CPUs:
Intel Ice Lake (server): 4%
Intel Sapphire Rapids: 38%
Intel Emerald Rapids: 38%
AMD Zen 1 (Threadripper 1950X): 84%
AMD Zen 4 (EPYC 9B14): 98%
AMD Zen 5 (Ryzen 9 9950X): 64%
For now, this seems to benefit AMD more than Intel. This seems to be
because current AMD CPUs support concurrent execution of the SHA-NI
instructions, but unfortunately current Intel CPUs don't, except for the
sha256msg2 instruction. Hopefully future Intel CPUs will support SHA-NI
on more execution ports. Zen 1 supports 2 concurrent sha256rnds2, and
Zen 4 supports 4 concurrent sha256rnds2, which suggests that even better
performance may be achievable on Zen 4 by interleaving more than two
hashes. However, doing so poses a number of trade-offs, and furthermore
Zen 5 goes back to supporting "only" 2 concurrent sha256rnds2.
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250915160819.140019-4-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Add an implementation of sha256_finup_2x_arch() for arm64. It
interleaves the computation of two SHA-256 hashes using the ARMv8
SHA-256 instructions. dm-verity and fs-verity will take advantage of
this for greatly improved performance on capable CPUs.
This increases the throughput of SHA-256 hashing 4096-byte messages by
the following amounts on the following CPUs:
ARM Cortex-X1: 70%
ARM Cortex-X3: 68%
ARM Cortex-A76: 65%
ARM Cortex-A715: 43%
ARM Cortex-A510: 25%
ARM Cortex-A55: 8%
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250915160819.140019-3-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Many arm64 and x86_64 CPUs can compute two SHA-256 hashes in nearly the
same speed as one, if the instructions are interleaved. This is because
SHA-256 is serialized block-by-block, and two interleaved hashes take
much better advantage of the CPU's instruction-level parallelism.
Meanwhile, a very common use case for SHA-256 hashing in the Linux
kernel is dm-verity and fs-verity. Both use a Merkle tree that has a
fixed block size, usually 4096 bytes with an empty or 32-byte salt
prepended. Usually, many blocks need to be hashed at a time. This is
an ideal scenario for 2-way interleaved hashing.
To enable this optimization, add a new function sha256_finup_2x() to the
SHA-256 library API. It computes the hash of two equal-length messages,
starting from a common initial context.
For now it always falls back to sequential processing. Later patches
will wire up arm64 and x86_64 optimized implementations.
Note that the interleaving factor could in principle be higher than 2x.
However, that runs into many practical difficulties and CPU throughput
limitations. Thus, both the implementations I'm adding are 2x. In the
interest of using the simplest solution, the API matches that.
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250915160819.140019-2-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Since wp$$==wq$$, it doesn't need to load the same data twice, use move
instruction to replace one of the loads to let the program run faster.
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Signed-off-by: Chunyan Zhang <zhangchunyan@iscas.ac.cn>
Link: https://lore.kernel.org/r/20250718072711.3865118-3-zhangchunyan@iscas.ac.cn
Signed-off-by: Paul Walmsley <pjw@kernel.org>
These two C files don't reference things defined in simd.h or types.h
so remove these redundant #inclusions.
Fixes: 6093faaf95 ("raid6: Add RISC-V SIMD syndrome and recovery calculations")
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Signed-off-by: Chunyan Zhang <zhangchunyan@iscas.ac.cn>
Reviewed-by: Nutty Liu <liujingqi@lanxincomputing.com>
Link: https://lore.kernel.org/r/20250718072711.3865118-2-zhangchunyan@iscas.ac.cn
Signed-off-by: Paul Walmsley <pjw@kernel.org>
Various KUnit tests require PCI infrastructure to work. All normal
platforms enable PCI by default, but UML does not. Enabling PCI from
.kunitconfig files is problematic as it would not be portable. So in
commit 6fc3a8636a ("kunit: tool: Enable virtio/PCI by default on UML")
PCI was enabled by way of CONFIG_UML_PCI_OVER_VIRTIO=y. However
CONFIG_UML_PCI_OVER_VIRTIO requires additional configuration of
CONFIG_UML_PCI_OVER_VIRTIO_DEVICE_ID or will otherwise trigger a WARN() in
virtio_pcidev_init(). However there is no one correct value for
UML_PCI_OVER_VIRTIO_DEVICE_ID which could be used by default.
This warning is confusing when debugging test failures.
On the other hand, the functionality of CONFIG_UML_PCI_OVER_VIRTIO is not
used at all, given that it is completely non-functional as indicated by
the WARN() in question. Instead it is only used as a way to enable
CONFIG_UML_PCI which itself is not directly configurable.
Instead of going through CONFIG_UML_PCI_OVER_VIRTIO, introduce a custom
configuration option which enables CONFIG_UML_PCI without triggering
warnings or building dead code.
Link: https://lore.kernel.org/r/20250908-kunit-uml-pci-v2-1-d8eba5f73c9d@linutronix.de
Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Reviewed-by: Johannes Berg <johannes@sipsolutions.net>
Reviewed-by: David Gow <davidgow@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Previously btree_merge() called btree_last() only to test existence, then
performed an extra btree_lookup() to fetch the value. This patch changes
it to directly use the value returned by btree_last(), avoiding redundant
lookups and simplifying the merge loop.
Link: https://lkml.kernel.org/r/20250826161741.686704-1-409411716@gms.tku.edu.tw
Signed-off-by: Guan-Chun Wu <409411716@gms.tku.edu.tw>
Cc: Kuan-Wei Chiu <visitorckw@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
The helper this_cpu_in_panic() duplicated logic already provided by
panic_on_this_cpu().
Remove this_cpu_in_panic() and switch all users to panic_on_this_cpu().
This simplifies the code and avoids having two helpers for the same check.
Link: https://lkml.kernel.org/r/20250825022947.1596226-8-wangjinchao600@gmail.com
Signed-off-by: Jinchao Wang <wangjinchao600@gmail.com>
Cc: Anna Schumaker <anna.schumaker@oracle.com>
Cc: Baoquan He <bhe@redhat.com>
Cc: "Darrick J. Wong" <djwong@kernel.org>
Cc: Dave Young <dyoung@redhat.com>
Cc: Doug Anderson <dianders@chromium.org>
Cc: "Guilherme G. Piccoli" <gpiccoli@igalia.com>
Cc: Helge Deller <deller@gmx.de>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: Joanthan Cameron <Jonathan.Cameron@huawei.com>
Cc: Joel Granados <joel.granados@kernel.org>
Cc: John Ogness <john.ogness@linutronix.de>
Cc: Kees Cook <kees@kernel.org>
Cc: Li Huafei <lihuafei1@huawei.com>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: Luo Gengkun <luogengkun@huaweicloud.com>
Cc: Max Kellermann <max.kellermann@ionos.com>
Cc: Nam Cao <namcao@linutronix.de>
Cc: oushixiong <oushixiong@kylinos.cn>
Cc: Petr Mladek <pmladek@suse.com>
Cc: Qianqiang Liu <qianqiang.liu@163.com>
Cc: Sergey Senozhatsky <senozhatsky@chromium.org>
Cc: Sohil Mehta <sohil.mehta@intel.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Tejun Heo <tj@kernel.org>
Cc: Thomas Gleinxer <tglx@linutronix.de>
Cc: Thomas Zimemrmann <tzimmermann@suse.de>
Cc: Thorsten Blum <thorsten.blum@linux.dev>
Cc: Ville Syrjala <ville.syrjala@linux.intel.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: Yicong Yang <yangyicong@hisilicon.com>
Cc: Yunhui Cui <cuiyunhui@bytedance.com>
Cc: Yury Norov (NVIDIA) <yury.norov@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Generalization of panic_print's dump function [1] has been merged, and
this patchset is to address some remaining issues, like adding note of the
obsoletion of 'panic_print' cmdline parameter, refining the kernel
document for panic_print, and hardening some string management.
This patch (of 4):
It is a normal case that bitmask parameter is 0, so pre-initialize the
names[] to null string to cover this case.
Also remove the superfluous "+1" in names[sizeof(sys_info_avail) + 1],
which is needed for 'strlen()', but not for 'sizeof()'.
Link: https://lkml.kernel.org/r/20250825025701.81921-1-feng.tang@linux.alibaba.com
Link: https://lkml.kernel.org/r/20250825025701.81921-2-feng.tang@linux.alibaba.com
Link: Link: https://lkml.kernel.org/r/20250703021004.42328-1-feng.tang@linux.alibaba.com [1]
Signed-off-by: Feng Tang <feng.tang@linux.alibaba.com>
Suggested-by: Petr Mladek <pmladek@suse.com>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Cc: Askar Safin <safinaskar@zohomail.com>
Cc: John Ogness <john.ogness@linutronix.de>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Lance Yang <lance.yang@linux.dev>
Cc: "Paul E . McKenney" <paulmck@kernel.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Replace the ternary (enable ? "on" : "off") with the str_on_off() helper
from string_choices.h. This improves readability by replacing the
three-operand ternary with a single function call, ensures consistent
string output, and allows potential string deduplication by the linker,
resulting in a slightly smaller binary.
Link: https://lkml.kernel.org/r/20250814093827.237980-1-visitorckw@gmail.com
Signed-off-by: Kuan-Wei Chiu <visitorckw@gmail.com>
Cc: Ching-Chun (Jim) Huang <jserv@ccns.ncku.edu.tw>
Cc: Kent Overstreet <kent.overstreet@linux.dev>
Cc: Suren Baghdasaryan <surenb@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Replace ternary (condition ? "true" : "false") expressions with the
str_true_false() helper from string_choices.h. This improves readability
by replacing the three-operand ternary with a single function call,
ensures consistent string output, and allows potential string
deduplication by the linker, resulting in a slightly smaller binary.
Link: https://lkml.kernel.org/r/20250814095033.244034-1-visitorckw@gmail.com
Signed-off-by: Kuan-Wei Chiu <visitorckw@gmail.com>
Cc: Kuan-Wei Chiu <visitorckw@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
kzalloc() has already been initialized to full 0 space, there is no need
to use memset() to initialize again.
Link: https://lkml.kernel.org/r/20250811082739.378284-1-liaoyuanhong@vivo.com
Signed-off-by: Liao Yuanhong <liaoyuanhong@vivo.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Commit 16f5dfbc85 ("gfp: include __GFP_NOWARN in GFP_NOWAIT") made
GFP_NOWAIT implicitly include __GFP_NOWARN.
Therefore, explicit __GFP_NOWARN combined with GFP_NOWAIT (e.g.,
`GFP_NOWAIT | __GFP_NOWARN`) is now redundant. Let's clean up these
redundant flags across subsystems.
No functional changes.
Link: https://lkml.kernel.org/r/20250805023031.331718-1-rongqianfeng@vivo.com
Signed-off-by: Qianfeng Rong <rongqianfeng@vivo.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
MAPLE_PARENT_RANGE32 should be 0x02 as a 32 bit node is indicated by the
bit pattern 0b010 which is the hex value 0x02. There are no users
currently, so there is no associated bug with this wrong value.
Fix typo Note -> Node and replace x with b to indicate binary values.
Link: https://lkml.kernel.org/r/20250826151344.403286-1-sidhartha.kumar@oracle.com
Fixes: 54a611b605 ("Maple Tree: add new data structure")
Signed-off-by: Sidhartha Kumar <sidhartha.kumar@oracle.com>
Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
The result of integer comparison already evaluates to bool. No need for
explicit conversion.
No functional impact.
Link: https://lkml.kernel.org/r/20250819070457.486348-1-zhao.xichao@vivo.com
Signed-off-by: Xichao Zhao <zhao.xichao@vivo.com>
Reviewed-by: Alistair Popple <apopple@nvidia.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
* Update kho_test_save() so that folios array won't be freed when
returning from the function and the fdt will be freed on error
* Reset state->nr_folios to 0 in kho_test_generate_data() on error
* Simplify allocation of folios info in fdt.
Link: https://lkml.kernel.org/r/20250811082510.4154080-3-rppt@kernel.org
Fixes: b753522bed ("kho: add test for kexec handover")
Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Reported-by: Pratyush Yadav <pratyush@kernel.org>
Closes: https://lore.kernel.org/all/mafs0zfcjcepf.fsf@kernel.org
Reviewed-by: Pratyush Yadav <pratyush@kernel.org>
Cc: Alexander Graf <graf@amazon.com>
Cc: Baoquan He <bhe@redhat.com>
Cc: Changyuan Lyu <changyuanl@google.com>
Cc: Pasha Tatashin <pasha.tatashin@soleen.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Thomas Weißschuh <linux@weissschuh.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Commit 16f5dfbc85 ("gfp: include __GFP_NOWARN in GFP_NOWAIT") made
GFP_NOWAIT implicitly include __GFP_NOWARN.
Therefore, explicit __GFP_NOWARN combined with GFP_NOWAIT (e.g.,
`GFP_NOWAIT | __GFP_NOWARN`) is now redundant. Let's clean up these
redundant flags across subsystems.
No functional changes.
Link: https://lkml.kernel.org/r/20250804130018.484321-1-rongqianfeng@vivo.com
Signed-off-by: Qianfeng Rong <rongqianfeng@vivo.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reimplement k[v]realloc_node() to be able to set node and alignment should
a user need to do so. In order to do that while retaining the maximal
backward compatibility, add k[v]realloc_node_align() functions and
redefine the rest of API using these new ones.
While doing that, we also keep the number of _noprof variants to a
minimum, which implies some changes to the existing users of older _noprof
functions, that basically being bcachefs.
With that change we also provide the ability for the Rust part of the
kernel to set node and alignment in its K[v]xxx [re]allocations.
Link: https://lkml.kernel.org/r/20250806124147.1724658-1-vitaly.wool@konsulko.se
Signed-off-by: Vitaly Wool <vitaly.wool@konsulko.se>
Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Alice Ryhl <aliceryhl@google.com>
Cc: Danilo Krummrich <dakr@kernel.org>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Jann Horn <jannh@google.com>
Cc: Kent Overstreet <kent.overstreet@linux.dev>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Uladzislau Rezki (Sony) <urezki@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
No more callers.
Signed-off-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Mike Snitzer <snitzer@kernel.org>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
The field timer->base->get_time is a private implementation detail and
should not be accessed outside of the hrtimer core.
Switch to the equivalent helper.
Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/all/20250821-hrtimer-cleanup-get_time-v2-4-3ae822e5bfbd@linutronix.de
While tracking down a problem where constant expressions used by
BUILD_BUG_ON() suddenly stopped working[1], we found that an added static
initializer was convincing the compiler that it couldn't track the state
of the prior statically initialized value. Tracing this down found that
ffs() was used in the initializer macro, but since it wasn't marked with
__attribute_const__, the compiler had to assume the function might
change variable states as a side-effect (which is not true for ffs(),
which provides deterministic math results).
Validate all the __attibute_const__ annotations were found for all
architectures by reproducing the specific problem encountered in the
original bug report.
Build and run tested with everything I could reach with KUnit:
$ ./tools/testing/kunit/kunit.py run --arch=x86_64 ffs
$ ./tools/testing/kunit/kunit.py run --arch=i386 ffs
$ ./tools/testing/kunit/kunit.py run --arch=arm64 --make_options "CROSS_COMPILE=aarch64-linux-gnu-" ffs
$ ./tools/testing/kunit/kunit.py run --arch=arm --make_options "CROSS_COMPILE=arm-linux-gnueabi-" ffs
$ ./tools/testing/kunit/kunit.py run --arch=powerpc ffs
$ ./tools/testing/kunit/kunit.py run --arch=powerpc32 ffs
$ ./tools/testing/kunit/kunit.py run --arch=powerpcle ffs
$ ./tools/testing/kunit/kunit.py run --arch=m68k ffs
$ ./tools/testing/kunit/kunit.py run --arch=loongarch ffs
$ ./tools/testing/kunit/kunit.py run --arch=s390 --make_options "CROSS_COMPILE=s390x-linux-gnu-" ffs
$ ./tools/testing/kunit/kunit.py run --arch=riscv --make_options "CROSS_COMPILE=riscv64-linux-gnu-" ffs
$ ./tools/testing/kunit/kunit.py run --arch=riscv32 --make_options "CROSS_COMPILE=riscv64-linux-gnu-" ffs
$ ./tools/testing/kunit/kunit.py run --arch=sparc --make_options "CROSS_COMPILE=sparc64-linux-gnu-" ffs
$ ./tools/testing/kunit/kunit.py run --arch=sparc64 --make_options "CROSS_COMPILE=sparc64-linux-gnu-" ffs
$ ./tools/testing/kunit/kunit.py run --arch=alpha --make_options "CROSS_COMPILE=alpha-linux-gnu-" ffs
$ ./tools/testing/kunit/kunit.py run --arch=sh --make_options "CROSS_COMPILE=sh4-linux-gnu-" ffs
Closes: https://github.com/KSPP/linux/issues/364
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/20250804164417.1612371-17-kees@kernel.org
Signed-off-by: Kees Cook <kees@kernel.org>
While tracking down a problem where constant expressions used by
BUILD_BUG_ON() suddenly stopped working[1], we found that an added static
initializer was convincing the compiler that it couldn't track the state
of the prior statically initialized value. Tracing this down found that
ffs() was used in the initializer macro, but since it wasn't marked with
__attribute__const__, the compiler had to assume the function might
change variable states as a side-effect (which is not true for ffs(),
which provides deterministic math results).
Add missing __attribute_const__ annotations to generic implementations of
ffs(), __ffs(), fls(), and __fls() functions. These are pure mathematical
functions that always return the same result for the same input with no
side effects, making them eligible for compiler optimization.
Build tested with x86_64 defconfig using GCC 14.2.0, which should validate
the implementations when used by ARM, ARM64, LoongArch, Microblaze,
NIOS2, and SPARC32 architectures.
Link: https://github.com/KSPP/linux/issues/364 [1]
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/20250804164417.1612371-2-kees@kernel.org
Signed-off-by: Kees Cook <kees@kernel.org>
Add KUnit tests for ffs()-family bit scanning functions: ffs(), __ffs(),
fls(), __fls(), fls64(), __ffs64(), and ffz(). The tests validate
mathematical relationships (e.g. ffs(x) == __ffs(x) + 1), and test zero
values, power-of-2 patterns, maximum values, and sparse bit patterns.
Build and run tested with everything I could reach with KUnit:
$ ./tools/testing/kunit/kunit.py run --arch=x86_64 ffs
$ ./tools/testing/kunit/kunit.py run --arch=i386 ffs
$ ./tools/testing/kunit/kunit.py run --arch=arm64 --make_options "CROSS_COMPILE=aarch64-linux-gnu-" ffs
$ ./tools/testing/kunit/kunit.py run --arch=arm --make_options "CROSS_COMPILE=arm-linux-gnueabihf-" ffs
$ ./tools/testing/kunit/kunit.py run --arch=powerpc ffs
$ ./tools/testing/kunit/kunit.py run --arch=powerpc32 ffs
$ ./tools/testing/kunit/kunit.py run --arch=powerpcle ffs
$ ./tools/testing/kunit/kunit.py run --arch=riscv --make_options "CROSS_COMPILE=riscv64-linux-gnu-" ffs
$ ./tools/testing/kunit/kunit.py run --arch=riscv32 --make_options "CROSS_COMPILE=riscv64-linux-gnu-" ffs
$ ./tools/testing/kunit/kunit.py run --arch=s390 --make_options "CROSS_COMPILE=s390x-linux-gnu-" ffs
$ ./tools/testing/kunit/kunit.py run --arch=m68k ffs
$ ./tools/testing/kunit/kunit.py run --arch=loongarch ffs
$ ./tools/testing/kunit/kunit.py run --arch=mips --make_options "CROSS_COMPILE=mipsel-linux-gnu-" ffs
$ ./tools/testing/kunit/kunit.py run --arch=sparc --make_options "CROSS_COMPILE=sparc64-linux-gnu-" ffs
$ ./tools/testing/kunit/kunit.py run --arch=sparc64 --make_options "CROSS_COMPILE=sparc64-linux-gnu-" ffs
$ ./tools/testing/kunit/kunit.py run --arch=alpha --make_options "CROSS_COMPILE=alpha-linux-gnu-" ffs
$ ./tools/testing/kunit/kunit.py run --arch=sh --make_options "CROSS_COMPILE=sh4-linux-gnu-" ffs
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/20250804164417.1612371-1-kees@kernel.org
Signed-off-by: Kees Cook <kees@kernel.org>
Now that the Curve25519 library has been disentangled from CRYPTO,
adding CRYPTO_SELFTESTS as a default value of
CRYPTO_LIB_CURVE25519_KUNIT_TEST no longer causes a recursive kconfig
dependency. Do this, which makes this option consistent with the other
crypto KUnit test options in the same file.
Link: https://lore.kernel.org/r/20250906213523.84915-12-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Reorganize the Curve25519 library code:
- Build a single libcurve25519 module, instead of up to three modules:
libcurve25519, libcurve25519-generic, and an arch-specific module.
- Move the arch-specific Curve25519 code from arch/$(SRCARCH)/crypto/ to
lib/crypto/$(SRCARCH)/. Centralize the build rules into
lib/crypto/Makefile and lib/crypto/Kconfig.
- Include the arch-specific code directly in lib/crypto/curve25519.c via
a header, rather than using a separate .c file.
- Eliminate the entanglement with CRYPTO. CRYPTO_LIB_CURVE25519 no
longer selects CRYPTO, and the arch-specific Curve25519 code no longer
depends on CRYPTO.
This brings Curve25519 in line with the latest conventions for
lib/crypto/, used by other algorithms. The exception is that I kept the
generic code in separate translation units for now. (Some of the
function names collide between the x86 and generic Curve25519 code. And
the Curve25519 functions are very long anyway, so inlining doesn't
matter as much for Curve25519 as it does for some other algorithms.)
Link: https://lore.kernel.org/r/20250906213523.84915-11-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Move the Curve25519 test from an ad-hoc self-test to a KUnit test.
Generally keep the same test logic for now, just translated to KUnit.
There's one exception, which is that I dropped the incomplete test of
curve25519_generic(). The approach I'm taking to cover the different
implementations with the KUnit tests is to just rely on booting kernels
in QEMU with different '-cpu' options, rather than try to make the tests
(incompletely) test multiple implementations on one CPU. This way, both
the test and the library API are simpler.
This commit makes the file lib/crypto/curve25519.c no longer needed, as
its only purpose was to call the self-test. However, keep it for now,
since a later commit will add code to it again.
Temporarily omit the default value of CRYPTO_SELFTESTS that the other
lib/crypto/ KUnit tests have. It would cause a recursive kconfig
dependency, since the Curve25519 code is still entangled with CRYPTO. A
later commit will fix that.
Link: https://lore.kernel.org/r/20250906213523.84915-8-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
All architectures implementing time-related functionality in the vDSO are
using the generic vDSO library which handles time namespaces properly.
Remove the now unnecessary Kconfig symbol.
Enables the use of time namespaces on architectures, which use the
generic vDSO but did not enable GENERIC_VDSO_TIME_NS, namely MIPS and arm.
Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Link: https://lore.kernel.org/all/20250826-vdso-cleanups-v1-10-d9b65750e49f@linutronix.de
All users of the generic vDSO library also use the generic vDSO datastore.
Remove the now unnecessary Kconfig symbol.
Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Link: https://lore.kernel.org/all/20250826-vdso-cleanups-v1-9-d9b65750e49f@linutronix.de
All calls of these functions are already gated behind CONFIG_TIME_NS. The
compiler will already optimize them away if time namespaces are disabled.
Drop the unnecessary stubs.
Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/all/20250826-vdso-cleanups-v1-4-d9b65750e49f@linutronix.de
When the generic vDSO does not provide time functions, as for example on
riscv32, then the time data store is not necessary.
Avoid allocating these time data pages when not used.
Fixes: df7fcbefa7 ("vdso: Add generic time data storage")
Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/all/20250826-vdso-cleanups-v1-1-d9b65750e49f@linutronix.de
Add a KUnit test suite for BLAKE2s. Most of the core test logic is in
the previously-added hash-test-template.h. This commit just adds the
actual KUnit suite, commits the generated test vectors to the tree so
that gen-hash-testvecs.py won't have to be run at build time, and adds a
few BLAKE2s-specific test cases.
This is the replacement for blake2s-selftest, which an earlier commit
removed. Improvements over blake2s-selftest include integration with
KUnit, more comprehensive test cases, and support for benchmarking.
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250827151131.27733-13-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
As was done with the other algorithms, reorganize the BLAKE2s code so
that the generic implementation and the arch-specific "glue" code is
consolidated into a single translation unit, so that the compiler will
inline the functions and automatically decide whether to include the
generic code in the resulting binary or not.
Similarly, also consolidate the build rules into
lib/crypto/{Makefile,Kconfig}. This removes the last uses of
lib/crypto/{arm,x86}/{Makefile,Kconfig}, so remove those too.
Don't keep the !KMSAN dependency. It was needed only for other
algorithms such as ChaCha that initialize memory from assembly code.
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250827151131.27733-12-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Move blake2s_compress_generic() from blake2s-generic.c to blake2s.c.
For now it's still guarded by CONFIG_CRYPTO_LIB_BLAKE2S_GENERIC, but
this prepares for changing it to a 'static __maybe_unused' function and
just using the compiler to automatically decide its inclusion.
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250827151131.27733-11-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
When support for a crypto algorithm is enabled, the arch-optimized
implementation of that algorithm should be enabled too. We've learned
this the hard way many times over the years: people regularly forget to
enable the arch-optimized implementations of the crypto algorithms,
resulting in significant performance being left on the table.
Currently, BLAKE2s support is always enabled ('obj-y'), since random.c
uses it. Therefore, the arch-optimized BLAKE2s code, which exists for
ARM and x86_64, should be always enabled too. Let's do that.
Note that the effect on kernel image size is very small and should not
be a concern. On ARM, enabling CRYPTO_BLAKE2S_ARM actually *shrinks*
the kernel size by about 1200 bytes, since the ARM-optimized
blake2s_compress() completely replaces the generic blake2s_compress().
On x86_64, enabling CRYPTO_BLAKE2S_X86 increases the kernel size by
about 1400 bytes, as the generic blake2s_compress() is still included as
a fallback; however, for context, that is only about a quarter the size
of the generic blake2s_compress(). The x86_64 optimized BLAKE2s code
uses much less icache at runtime than the generic code.
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250827151131.27733-10-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Save 480 bytes of .rodata by replacing the .long constants with .bytes,
and using the vpmovzxbd instruction to expand them.
Also update the code to do the loads before incrementing %rax rather
than after. This avoids the need for the first load to use an offset.
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250827151131.27733-8-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Consolidate the ChaCha code into a single module (excluding
chacha-block-generic.c which remains always built-in for random.c),
similar to various other algorithms:
- Each arch now provides a header file lib/crypto/$(SRCARCH)/chacha.h,
replacing lib/crypto/$(SRCARCH)/chacha*.c. The header defines
chacha_crypt_arch() and hchacha_block_arch(). It is included by
lib/crypto/chacha.c, and thus the code gets built into the single
libchacha module, with improved inlining in some cases.
- Whether arch-optimized ChaCha is buildable is now controlled centrally
by lib/crypto/Kconfig instead of by lib/crypto/$(SRCARCH)/Kconfig.
The conditions for enabling it remain the same as before, and it
remains enabled by default.
- Any additional arch-specific translation units for the optimized
ChaCha code, such as assembly files, are now compiled by
lib/crypto/Makefile instead of lib/crypto/$(SRCARCH)/Makefile.
This removes the last use for the Makefile and Kconfig files in the
arm64, mips, powerpc, riscv, and s390 subdirectories of lib/crypto/. So
also remove those files and the references to them.
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250827151131.27733-7-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Rename libchacha.c to chacha.c to make the naming consistent with other
algorithms and allow additional source files to be added to the
libchacha module. This file currently contains chacha_crypt_generic(),
but it will soon be updated to contain chacha_crypt().
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250827151131.27733-6-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Rename chacha.c to chacha-block-generic.c to free up the name chacha.c
for the high-level API entry points (chacha_crypt() and
hchacha_block()), similar to the other algorithms.
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250827151131.27733-5-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
This is a straight import of the OpenSSL/CRYPTOGAMS Poly1305
implementation for riscv authored by Andy Polyakov. The file
'poly1305-riscv.pl' is taken straight from
https://github.com/dot-asm/cryptogams commit
5e3fba73576244708a752fa61a8e93e587f271bb. This patch was tested on
SpacemiT X60, with 2~2.5x improvement over generic implementation.
Signed-off-by: Chunyan Zhang <zhangchunyan@iscas.ac.cn>
Signed-off-by: Zhihang Shao <zhihang.shao.iscas@gmail.com>
[EB: ported to lib/crypto/riscv/]
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250829152513.92459-4-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Consolidate the Poly1305 code into a single module, similar to various
other algorithms (SHA-1, SHA-256, SHA-512, etc.):
- Each arch now provides a header file lib/crypto/$(SRCARCH)/poly1305.h,
replacing lib/crypto/$(SRCARCH)/poly1305*.c. The header defines
poly1305_block_init(), poly1305_blocks(), poly1305_emit(), and
optionally poly1305_mod_init_arch(). It is included by
lib/crypto/poly1305.c, and thus the code gets built into the single
libpoly1305 module, with improved inlining in some cases.
- Whether arch-optimized Poly1305 is buildable is now controlled
centrally by lib/crypto/Kconfig instead of by
lib/crypto/$(SRCARCH)/Kconfig. The conditions for enabling it remain
the same as before, and it remains enabled by default. (The PPC64 one
remains unconditionally disabled due to 'depends on BROKEN'.)
- Any additional arch-specific translation units for the optimized
Poly1305 code, such as assembly files, are now compiled by
lib/crypto/Makefile instead of lib/crypto/$(SRCARCH)/Makefile.
A special consideration is needed because the Adiantum code uses the
poly1305_core_*() functions directly. For now, just carry forward that
approach. This means retaining the CRYPTO_LIB_POLY1305_GENERIC kconfig
symbol, and keeping the poly1305_core_*() functions in separate
translation units. So it's not quite as streamlined I've done with the
other hash functions, but we still get a single libpoly1305 module.
Note: to see the diff from the arm, arm64, and x86 .c files to the new
.h files, view this commit with 'git show -M10'.
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250829152513.92459-3-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Now that the minimum supported version of LLVM for building the kernel
has been bumped to 15.0.0, two KMSAN checks can be cleaned up.
CONFIG_HAVE_KMSAN_COMPILER will always be true when using clang so
remove the cc-option test and use a simple check for CONFIG_CC_IS_CLANG.
CONFIG_HAVE_KMSAN_PARAM_RETVAL will always be true so it can be removed
outright.
Acked-by: Marco Elver <elver@google.com>
Reviewed-by: Kees Cook <kees@kernel.org>
Link: https://lore.kernel.org/r/20250821-bump-min-llvm-ver-15-v2-12-635f3294e5f0@kernel.org
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Now that the minimum supported version of LLVM for building the kernel
has been bumped to 15.0.0, __no_kcsan will always ensure that the thread
sanitizer functions are not generated, so remove the check for tsan
functions in is_profiling_func() and the always true depends and
unnecessary select lines in KCSAN_WEAK_MEMORY.
Acked-by: Marco Elver <elver@google.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infraded.org>
Reviewed-by: Kees Cook <kees@kernel.org>
Link: https://lore.kernel.org/r/20250821-bump-min-llvm-ver-15-v2-11-635f3294e5f0@kernel.org
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Now that the minimum supported version of LLVM for building the kernel
has been bumped to 15.0.0, the CLANG_VERSION check for older than 14.0.0
is always false, so remove it.
Reviewed-by: Kees Cook <kees@kernel.org>
Link: https://lore.kernel.org/r/20250821-bump-min-llvm-ver-15-v2-10-635f3294e5f0@kernel.org
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Drop 'inline' from all the *_mod_init_arch() functions so that the
compiler will warn about any bugs where they are unused due to not being
wired up properly. (There are no such bugs currently, so this just
establishes a more robust convention for the future. Of course, these
functions also tend to get inlined anyway, regardless of the keyword.)
Link: https://lore.kernel.org/r/20250816020457.432040-1-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Add a KUnit test suite for the MD5 library functions, including the
corresponding HMAC support. The core test logic is in the
previously-added hash-test-template.h. This commit just adds the actual
KUnit suite, and it adds the generated test vectors to the tree so that
gen-hash-testvecs.py won't have to be run at build time.
Link: https://lore.kernel.org/r/20250805222855.10362-8-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Introduce example_params_test_with_init_dynamic_arr(). This new
KUnit test demonstrates directly assigning a dynamic parameter
array, using the kunit_register_params_array() macro, to a
parameterized test context.
It highlights the use of param_init() and param_exit() for
initialization and exit of a parameterized test, and their
registration to the test case with KUNIT_CASE_PARAM_WITH_INIT().
Link: https://lore.kernel.org/r/20250826091341.1427123-7-davidgow@google.com
Reviewed-by: Rae Moar <rmoar@google.com>
Reviewed-by: David Gow <davidgow@google.com>
Signed-off-by: Marie Zhussupova <marievic@google.com>
Signed-off-by: David Gow <davidgow@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Add example_params_test_with_init() to illustrate how to manage
shared resources across a parameterized KUnit test. This example
showcases the use of the new param_init() function and its registration
to a test using the KUNIT_CASE_PARAM_WITH_INIT() macro.
Additionally, the test demonstrates how to directly pass a parameter array
to the parameterized test context via kunit_register_params_array()
and leveraging the Resource API for shared resource management.
Link: https://lore.kernel.org/r/20250826091341.1427123-6-davidgow@google.com
Reviewed-by: Rae Moar <rmoar@google.com>
Reviewed-by: David Gow <davidgow@google.com>
Signed-off-by: Marie Zhussupova <marievic@google.com>
Signed-off-by: David Gow <davidgow@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
KUnit parameterized tests currently support two primary methods f
or getting parameters:
1. Defining custom logic within a generate_params() function.
2. Using the KUNIT_ARRAY_PARAM() and KUNIT_ARRAY_PARAM_DESC()
macros with a pre-defined static array and passing
the created *_gen_params() to KUNIT_CASE_PARAM().
These methods present limitations when dealing with dynamically
generated parameter arrays, or in scenarios where populating parameters
sequentially via generate_params() is inefficient or overly complex.
This patch addresses these limitations by adding a new `params_array`
field to `struct kunit`, of the type `kunit_params`. The
`struct kunit_params` is designed to store the parameter array itself,
along with essential metadata including the parameter count, parameter
size, and a get_description() function for providing custom descriptions
for individual parameters.
The `params_array` field can be populated by calling the new
kunit_register_params_array() macro from within a param_init() function.
This will register the array as part of the parameterized test context.
The user will then need to pass kunit_array_gen_params() to the
KUNIT_CASE_PARAM_WITH_INIT() macro as the generator function, if not
providing their own. kunit_array_gen_params() is a KUnit helper that will
use the registered array to generate parameters.
The arrays passed to KUNIT_ARRAY_PARAM(,DESC) will also be registered to
the parameterized test context for consistency as well as for higher
availability of the parameter count that will be used for outputting a KTAP
test plan for a parameterized test.
This modification provides greater flexibility to the KUnit framework,
allowing testers to easily register and utilize both dynamic and static
parameter arrays.
Link: https://lore.kernel.org/r/20250826091341.1427123-5-davidgow@google.com
Reviewed-by: David Gow <davidgow@google.com>
Reviewed-by: Rae Moar <rmoar@google.com>
Signed-off-by: Marie Zhussupova <marievic@google.com>
[Only output the test plan if using kunit_array_gen_params --David]
Signed-off-by: David Gow <davidgow@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
To enable more complex parameterized testing scenarios, the
generate_params() function needs additional context beyond just
the previously generated parameter. This patch modifies the
generate_params() function signature to include an extra
`struct kunit *test` argument, giving test users access to the
parameterized test context when generating parameters.
The `struct kunit *test` argument was added as the first parameter
to the function signature as it aligns with the convention of other
KUnit functions that accept `struct kunit *test` first. This also
mirrors the "this" or "self" reference found in object-oriented
programming languages.
This patch also modifies xe_pci_live_device_gen_param() in xe_pci.c
and nthreads_gen_params() in kcsan_test.c to reflect this signature
change.
Link: https://lore.kernel.org/r/20250826091341.1427123-4-davidgow@google.com
Reviewed-by: David Gow <davidgow@google.com>
Reviewed-by: Rae Moar <rmoar@google.com>
Acked-by: Marco Elver <elver@google.com>
Acked-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Marie Zhussupova <marievic@google.com>
[Catch some additional gen_params signatures in drm/xe/tests --David]
Signed-off-by: David Gow <davidgow@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Add (*param_init) and (*param_exit) function pointers to
`struct kunit_case`. Users will be able to set them via the new
KUNIT_CASE_PARAM_WITH_INIT() macro.
param_init/exit will be invoked by kunit_run_tests() once before and once
after the parameterized test, respectively. They will receive the
`struct kunit` that holds the parameterized test context; facilitating
init and exit for shared state.
This patch also sets param_init/exit to None in rust/kernel/kunit.rs.
Link: https://lore.kernel.org/r/20250826091341.1427123-3-davidgow@google.com
Reviewed-by: Rae Moar <rmoar@google.com>
Reviewed-by: David Gow <davidgow@google.com>
Signed-off-by: Marie Zhussupova <marievic@google.com>
Signed-off-by: David Gow <davidgow@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Currently, KUnit parameterized tests lack a mechanism to share
resources across parameter runs because the same `struct kunit`
instance is cleaned up and reused for each run.
This patch introduces parameterized test context, enabling test
users to share resources between parameter runs. It also allows
setting up resources that need to be available for all parameter
runs only once, which is helpful in cases where setup is expensive.
To establish a parameterized test context, this patch adds a
parent pointer field to `struct kunit`. This allows resources added
to the parent `struct kunit` to be shared and accessible across all
parameter runs.
In kunit_run_tests(), the default `struct kunit` created is now
designated to act as the parameterized test context whenever a test
is parameterized.
Subsequently, a new `struct kunit` is made for each parameter run, and
its parent pointer is set to the `struct kunit` that holds the
parameterized test context.
Link: https://lore.kernel.org/r/20250826091341.1427123-2-davidgow@google.com
Reviewed-by: David Gow <davidgow@google.com>
Reviewed-by: Rae Moar <rmoar@google.com>
Signed-off-by: Marie Zhussupova <marievic@google.com>
Signed-off-by: David Gow <davidgow@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Instead of exposing the sparc-optimized MD5 code via sparc-specific
crypto_shash algorithms, instead just implement the md5_blocks() library
function. This is much simpler, it makes the MD5 library functions be
sparc-optimized, and it fixes the longstanding issue where the
sparc-optimized MD5 code was disabled by default. MD5 still remains
available through crypto_shash, but individual architectures no longer
need to handle it.
Note: to see the diff from arch/sparc/crypto/md5_glue.c to
lib/crypto/sparc/md5.h, view this commit with 'git show -M10'.
Link: https://lore.kernel.org/r/20250805222855.10362-6-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Instead of exposing the powerpc-optimized MD5 code via powerpc-specific
crypto_shash algorithms, instead just implement the md5_blocks() library
function. This is much simpler, it makes the MD5 library functions be
powerpc-optimized, and it fixes the longstanding issue where the
powerpc-optimized MD5 code was disabled by default. MD5 still remains
available through crypto_shash, but individual architectures no longer
need to handle it.
Link: https://lore.kernel.org/r/20250805222855.10362-5-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Instead of exposing the mips-optimized MD5 code via mips-specific
crypto_shash algorithms, instead just implement the md5_blocks() library
function. This is much simpler, it makes the MD5 library functions be
mips-optimized, and it fixes the longstanding issue where the
mips-optimized MD5 code was disabled by default. MD5 still remains
available through crypto_shash, but individual architectures no longer
need to handle it.
Note: to see the diff from arch/mips/cavium-octeon/crypto/octeon-md5.c
to lib/crypto/mips/md5.h, view this commit with 'git show -M10'.
Link: https://lore.kernel.org/r/20250805222855.10362-3-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Add library functions for MD5, including HMAC support. The MD5
implementation is derived from crypto/md5.c. This closely mirrors the
corresponding SHA-1 and SHA-2 changes.
Like SHA-1 and SHA-2, support for architecture-optimized MD5
implementations is included. I originally proposed dropping those, but
unfortunately there is an AF_ALG user of the PowerPC MD5 code
(https://lore.kernel.org/r/c4191597-341d-4fd7-bc3d-13daf7666c41@csgroup.eu/),
and dropping that code would be viewed as a performance regression. We
don't add new software algorithm implementations purely for AF_ALG, as
escalating to kernel mode merely to do calculations that could be done
in userspace is inefficient and is completely the wrong design. But
since this one already existed, it gets grandfathered in for now. An
objection was also raised to dropping the SPARC64 MD5 code because it
utilizes the CPU's direct support for MD5, although it remains unclear
that anyone is using that. Regardless, we'll keep these around for now.
Note that while MD5 is a legacy algorithm that is vulnerable to
practical collision attacks, it still has various in-kernel users that
implement legacy protocols. Switching to a simple library API, which is
the way the code should have been organized originally, will greatly
simplify their code. For example:
MD5:
drivers/md/dm-crypt.c (for lmk IV generation)
fs/nfsd/nfs4recover.c
fs/ecryptfs/
fs/smb/client/
net/{ipv4,ipv6}/ (for TCP-MD5 signatures)
HMAC-MD5:
fs/smb/client/
fs/smb/server/
(Also net/sctp/ if it continues using HMAC-MD5 for cookie generation.
However, that use case has the flexibility to upgrade to a more modern
algorithm, which I'll be proposing instead.)
As usual, the "md5" and "hmac(md5)" crypto_shash algorithms will also be
reimplemented on top of these library functions. For "hmac(md5)" this
will provide a faster, more streamlined implementation.
Link: https://lore.kernel.org/r/20250805222855.10362-2-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Since sha512_kunit tests the fallback code paths without using
crypto_simd_disabled_for_test, make the SHA-512 code just use the
underlying may_use_simd() and irq_fpu_usable() functions directly
instead of crypto_simd_usable(). This eliminates an unnecessary layer.
Link: https://lore.kernel.org/r/20250731223651.136939-1-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Since sha256_kunit tests the fallback code paths without using
crypto_simd_disabled_for_test, make the SHA-256 code just use the
underlying may_use_simd() and irq_fpu_usable() functions directly
instead of crypto_simd_usable(). This eliminates an unnecessary layer.
While doing this, also add likely() annotations, and fix a minor
inconsistency where the static keys in the sha256.h files were in a
different place than in the corresponding sha1.h and sha512.h files.
Link: https://lore.kernel.org/r/20250731223510.136650-1-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
__ubsan_handle_divrem_overflow() incorrectly uses the RHS to report.
It always reports the same log: division of -1 by -1. But it should
report division of LHS by -1.
Signed-off-by: Junhui Pei <paradoxskin233@gmail.com>
Fixes: c6d308534a ("UBSAN: run-time undefined behavior sanity checker")
Link: https://lore.kernel.org/r/20250602153841.62935-1-paradoxskin233@gmail.com
Signed-off-by: Kees Cook <kees@kernel.org>
Fix a regression where 'make clean' stopped removing some of the
generated assembly files on arm and arm64.
-----BEGIN PGP SIGNATURE-----
iIoEABYIADIWIQSacvsUNc7UX4ntmEPzXCl4vpKOKwUCaKaR/xQcZWJpZ2dlcnNA
a2VybmVsLm9yZwAKCRDzXCl4vpKOK486AP4z3+DbFNj3WHLA/O3uFxgR4eiUW9tl
gVtzslK5j2rxLwEAuOrY0qAhzX2dJ0/KN6y7T1qtOEZoGIZ2j85HDvMh3wU=
=d+ja
-----END PGP SIGNATURE-----
Merge tag 'libcrypto-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux
Pull crypto library fixes from Eric Biggers:
"Fix a regression where 'make clean' stopped removing some of the
generated assembly files on arm and arm64"
* tag 'libcrypto-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux:
lib/crypto: ensure generated *.S files are removed on make clean
lib/crypto: sha: Update Kconfig help for SHA1 and SHA256
The NEED_* macros do an implicit goto in case the safety bounds checks
fail. Add the unlikely hints as this is the error case and not a hot
path. The final assembly is slightly shorter and some jumps got
reordered according to the hints.
text data bss dec hex filename
2294 16 0 2310 906 pre/lzo1x_decompress_safe.o
2277 16 0 2293 8f5 post/lzo1x_decompress_safe.o
text data bss dec hex filename
3444 48 0 3492 da4 pre/lzo1x_compress_safe.o
3372 48 0 3420 d5c post/lzo1x_compress_safe.o
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Drop 'inline' from all the *_mod_init_arch() functions so that the
compiler will warn about any bugs where they are unused due to not being
wired up properly. (There are no such bugs currently, so this just
establishes a more robust convention for the future. Of course, these
functions also tend to get inlined anyway, regardless of the keyword.)
Link: https://lore.kernel.org/r/20250816020240.431545-1-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
For kbuild to properly clean up these build artifacts in the subdirectory,
even after CONFIG_KUNIT changed do disabled, the directory needs to be
processed always.
Pushing the special logic for hook.o into the kunit Makefile also makes the
logic easier to understand.
Link: https://lore.kernel.org/r/20250813-kunit-always-descend-v1-1-7bbd387ff13b@linutronix.de
Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Reviewed-by: David Gow <davidgow@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
make clean does not check the kernel config when removing files. As
such, additions to clean-files under CONFIG_ARM or CONFIG_ARM64 are not
evaluated. For example, when building on arm64, this means that
lib/crypto/arm64/sha{256,512}-core.S are left over after make clean.
Set clean-files unconditionally to ensure that make clean removes these
files.
Fixes: e96cb9507f ("lib/crypto: sha256: Consolidate into single module")
Fixes: 24c91b62ac ("lib/crypto: arm/sha512: Migrate optimized SHA-512 code to library")
Fixes: 60e3f1e9b7 ("lib/crypto: arm64/sha512: Migrate optimized SHA-512 code to library")
Signed-off-by: Tal Zussman <tz2294@columbia.edu>
Link: https://lore.kernel.org/r/20250814-crypto_clean-v2-1-659a2dc86302@columbia.edu
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Current release - regressions:
- netfilter: nft_set_pipapo:
- don't return bogus extension pointer
- fix null deref for empty set
Current release - new code bugs:
- core: prevent deadlocks when enabling NAPIs with mixed kthread config
- eth: netdevsim: Fix wild pointer access in nsim_queue_free().
Previous releases - regressions:
- page_pool: allow enabling recycling late, fix false positive warning
- sched: ets: use old 'nbands' while purging unused classes
- xfrm:
- restore GSO for SW crypto
- bring back device check in validate_xmit_xfrm
- tls: handle data disappearing from under the TLS ULP
- ptp: prevent possible ABBA deadlock in ptp_clock_freerun()
- eth: bnxt: fill data page pool with frags if PAGE_SIZE > BNXT_RX_PAGE_SIZE
- eth: hv_netvsc: fix panic during namespace deletion with VF
Previous releases - always broken:
- netfilter: fix refcount leak on table dump
- vsock: do not allow binding to VMADDR_PORT_ANY
- sctp: linearize cloned gso packets in sctp_rcv
- eth: hibmcge: fix the division by zero issue
- eth: microchip: fix KSZ8863 reset problem
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
-----BEGIN PGP SIGNATURE-----
iQJGBAABCAAwFiEEg1AjqC77wbdLX2LbKSR5jcyPE6QFAmidxhgSHHBhYmVuaUBy
ZWRoYXQuY29tAAoJECkkeY3MjxOkPckP/AhJ0kARPgo72OhElbW6KvkHhXVNUTJb
6m15j9Z8ybQPBupSorxUEBx7pxczv/GBloN1xoJqUY/p7B6kLO2HDOpaReNFjZvF
J9hPl6ZF6CWzwgfcUfwI3UB1zhALKyHDClclfd8FBoKjAYXCrZXuPv/AV4oqYsA1
y7g24zpI76Cu+M+Nf5YhrlIVSmQ1/DXX8gifdcHFYnAKmCn7KxNv2lwvm2/yE2lL
9/Xl/D1cG/BiAaCUUXR4BP8RU5gdW72+lM3qmC/QFO/7jcBaoE89Y9Anona8p0PQ
dqDerHd0GDUH9QA6bht3asCS+IW+Zo2gf25o53OzlYIMAxDmEZLUBCwetJhvNJBq
DUQ6agzfNRxsCnlOc4JhMOqNr7rdU7d+9c9KuZWA/m8KRWdlvTYGJd/qzSlTWOhq
s9228dl+4oTb9Mnq8Bqafi42+TImeOyFRW9ZgF8ptjlF0l/lyv6moIrRCmVXppRZ
awABNDdG+i004XmAOAeOhjbUT7clLkLr+KEnsfH16qCa2o3dM6rlhvWYp2sucVJf
SyRvMdz5VqMLgruefpQS/DuK52UklpRawgvgngzU6UDYQUaxQKToeusMjRU7xUnW
hVI1y7/oNH6+r7Zr/iLTLKRR007B+RVC7VSbeMpxmAW+n6puMb+z7RnrJlnFapGM
qqqtk2/jItuK
=Ydk1
-----END PGP SIGNATURE-----
Merge tag 'net-6.17-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Paolo Abeni:
"Including fixes from Netfilter and IPsec.
Current release - regressions:
- netfilter: nft_set_pipapo:
- don't return bogus extension pointer
- fix null deref for empty set
Current release - new code bugs:
- core: prevent deadlocks when enabling NAPIs with mixed kthread
config
- eth: netdevsim: Fix wild pointer access in nsim_queue_free().
Previous releases - regressions:
- page_pool: allow enabling recycling late, fix false positive
warning
- sched: ets: use old 'nbands' while purging unused classes
- xfrm:
- restore GSO for SW crypto
- bring back device check in validate_xmit_xfrm
- tls: handle data disappearing from under the TLS ULP
- ptp: prevent possible ABBA deadlock in ptp_clock_freerun()
- eth:
- bnxt: fill data page pool with frags if PAGE_SIZE > BNXT_RX_PAGE_SIZE
- hv_netvsc: fix panic during namespace deletion with VF
Previous releases - always broken:
- netfilter: fix refcount leak on table dump
- vsock: do not allow binding to VMADDR_PORT_ANY
- sctp: linearize cloned gso packets in sctp_rcv
- eth:
- hibmcge: fix the division by zero issue
- microchip: fix KSZ8863 reset problem"
* tag 'net-6.17-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (54 commits)
net: usb: asix_devices: add phy_mask for ax88772 mdio bus
net: kcm: Fix race condition in kcm_unattach()
selftests: net/forwarding: test purge of active DWRR classes
net/sched: ets: use old 'nbands' while purging unused classes
bnxt: fill data page pool with frags if PAGE_SIZE > BNXT_RX_PAGE_SIZE
netdevsim: Fix wild pointer access in nsim_queue_free().
net: mctp: Fix bad kfree_skb in bind lookup test
netfilter: nf_tables: reject duplicate device on updates
ipvs: Fix estimator kthreads preferred affinity
netfilter: nft_set_pipapo: fix null deref for empty set
selftests: tls: test TCP stealing data from under the TLS socket
tls: handle data disappearing from under the TLS ULP
ptp: prevent possible ABBA deadlock in ptp_clock_freerun()
ixgbe: prevent from unwanted interface name changes
devlink: let driver opt out of automatic phys_port_name generation
net: prevent deadlocks when enabling NAPIs with mixed kthread config
net: update NAPI threaded config even for disabled NAPIs
selftests: drv-net: don't assume device has only 2 queues
docs: Fix name for net.ipv4.udp_child_hash_entries
riscv: dts: thead: Add APB clocks for TH1520 GMACs
...
As Kees points out, this is a kernel address leak, and debugging is
not a sufficiently good reason to expose the real kernel address.
Fixes: 65b584f536 ("ref_tracker: automatically register a file in debugfs for a ref_tracker_dir")
Reported-by: Kees Cook <kees@kernel.org>
Closes: https://lore.kernel.org/netdev/202507301603.62E553F93@keescook/
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Since crc_kunit now tests the fallback code paths without using
crypto_simd_disabled_for_test, make the CRC code just use the underlying
may_use_simd() and irq_fpu_usable() functions directly instead of
crypto_simd_usable(). This eliminates an unnecessary layer.
Take the opportunity to add likely() and unlikely() annotations as well.
Link: https://lore.kernel.org/r/20250811182631.376302-4-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Test that if CRCs are computed in task, softirq, and hardirq context
concurrently, then all results are as expected. Implement this using
kunit_run_irq_test() which is also used by the lib/crypto/ tests.
As with the corresponding lib/crypto/ tests, the purpose of this is to
test fallback code paths and to exercise edge cases in the
architecture's support for in-kernel FPU/SIMD/vector.
Remove the code from crc_test() that sometimes disabled interrupts, as
that was just an incomplete attempt to achieve similar test coverage.
Link: https://lore.kernel.org/r/20250811182631.376302-3-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
-----BEGIN PGP SIGNATURE-----
iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAmiWLjoQHGF4Ym9lQGtl
cm5lbC5kawAKCRD301j7KXHgpvveD/9vbvp3XaF0LagRJLH0fcdhcxL7Z+IHD+7U
v5vICMeoeBhhhOtPJ0y+h/9LMLQWFYDFl6drkY0atSSxp/CK6CB25qFhIDsoA6Qk
RBM/qZ64z4Uxvlc+VQmCqI2EMc/ZrYtrcr7jsornwORoTSEKXVHdyO5k7Q9002Sw
XNWc0bZKIibFlgOk12Wnd8ZS5RWHw1uViUcreojcGVZAVR+BuHNGGoa3xq0bLiHU
ERbQXfjaN28R+eo4E1euCtdf++7tW2kFjClrDmLcszdb27E2+MWMA6AKMiSTBE2k
2e2TvJUcGZs1s8atqSIIjBtmwQW3rKws33zODLMONzOP8CIErcaniHxyDSaxJIJr
kjsdKnwlziL3xVnwQcpgnVOPvvDSKZ4OKEqx8rAuYTqiknpz3uhbt/7EqumuPLHr
e7Rz0MnFolrVN7KZOHQ5CPJIezkEAOAEpItLdfc5cfLS06pbeTN3j+dJZp+tUohi
WP/K3l2N3C5pkXA0ilAzshRF20Rwv/09M85BoqWocTLBJY7WqyIKXywCNdX81wkv
tpbQvp2MpPkJXUIbAh5484BOfCfx9vkYVm2cam2UxXJhR6VfrQCjYfXIjfpqF4jp
q7xxNesUezrOqB2Q/cKxw8dKOaRtO1XzVnmwutBrcKgqqLezMwUTDDjQYe8l6p1Z
40E74tsJwQ==
=EQ7g
-----END PGP SIGNATURE-----
Merge tag 'block-6.17-20250808' of git://git.kernel.dk/linux
Pull more block updates from Jens Axboe:
- MD pull request via Yu:
- mddev null-ptr-dereference fix, by Erkun
- md-cluster fail to remove the faulty disk regression fix, by
Heming
- minor cleanup, by Li Nan and Jinchao
- mdadm lifetime regression fix reported by syzkaller, by Yu Kuai
- MD pull request via Christoph
- add support for getting the FDP featuee in fabrics passthru path
(Nitesh Shetty)
- add capability to connect to an administrative controller
(Kamaljit Singh)
- fix a leak on sgl setup error (Keith Busch)
- initialize discovery subsys after debugfs is initialized
(Mohamed Khalfella)
- fix various comment typos (Bjorn Helgaas)
- remove unneeded semicolons (Jiapeng Chong)
- nvmet debugfs ordering issue fix
- Fix UAF in the tag_set in zloop
- Ensure sbitmap shallow depth covers entire set
- Reduce lock roundtrips in io context lookup
- Move scheduler tags alloc/free out of elevator and freeze lock, to
fix some lockdep found issues
- Improve robustness of queue limits checking
- Fix a regression with IO priorities, if no io context exists
* tag 'block-6.17-20250808' of git://git.kernel.dk/linux: (26 commits)
lib/sbitmap: make sbitmap_get_shallow() internal
lib/sbitmap: convert shallow_depth from one word to the whole sbitmap
nvmet: exit debugfs after discovery subsystem exits
block, bfq: Reorder struct bfq_iocq_bfqq_data
md: make rdev_addable usable for rcu mode
md/raid1: remove struct pool_info and related code
md/raid1: change r1conf->r1bio_pool to a pointer type
block: ensure discard_granularity is zero when discard is not supported
zloop: fix KASAN use-after-free of tag set
block: Fix default IO priority if there is no IO context
nvme: fix various comment typos
nvme-auth: remove unneeded semicolon
nvme-pci: fix leak on sgl setup error
nvmet: initialize discovery subsys after debugfs is initialized
nvme: add capability to connect to an administrative controller
nvmet: add support for FDP in fabrics passthru path
md: rename recovery_cp to resync_offset
md/md-cluster: handle REMOVE message earlier
md: fix create on open mddev lifetime regression
block: fix potential deadlock while running nr_hw_queue update
...
Because it's only used in sbitmap.c
Signed-off-by: Yu Kuai <yukuai3@huawei.com>
Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Link: https://lore.kernel.org/r/20250807032413.1469456-3-yukuai1@huaweicloud.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Currently elevators will record internal 'async_depth' to throttle
asynchronous requests, and they both calculate shallow_dpeth based on
sb->shift, with the respect that sb->shift is the available tags in one
word.
However, sb->shift is not the availbale tags in the last word, see
__map_depth:
if (index == sb->map_nr - 1)
return sb->depth - (index << sb->shift);
For consequence, if the last word is used, more tags can be get than
expected, for example, assume nr_requests=256 and there are four words,
in the worst case if user set nr_requests=32, then the first word is
the last word, and still use bits per word, which is 64, to calculate
async_depth is wrong.
One the ohter hand, due to cgroup qos, bfq can allow only one request
to be allocated, and set shallow_dpeth=1 will still allow the number
of words request to be allocated.
Fix this problems by using shallow_depth to the whole sbitmap instead
of per word, also change kyber, mq-deadline and bfq to follow this,
a new helper __map_depth_with_shallow() is introduced to calculate
available bits in each word.
Signed-off-by: Yu Kuai <yukuai3@huawei.com>
Link: https://lore.kernel.org/r/20250807032413.1469456-2-yukuai1@huaweicloud.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Resolve conflicts with this commit that was developed in parallel
during the merge window:
8c8efa93db ("x86/bug: Add ARCH_WARN_ASM macro for BUG/WARN asm code sharing with Rust")
Conflicts:
arch/riscv/include/asm/bug.h
arch/x86/include/asm/bug.h
Signed-off-by: Ingo Molnar <mingo@kernel.org>
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEESH4wyp42V4tXvYsjUqAMR0iAlPIFAmiQpykACgkQUqAMR0iA
lPJcrg/9Hez6+zO7LECCn5VkuK5oJWR5CyCfwx14ki8UF38djQGU2frckI5837rE
MnVoEBexZunK5SXy4MAy7bTCitzR+lMqNtP5uq9J2ovlSPtNlfuJRDr7uGQLDtSS
M5KZ1qsZnhgwLYeNhfVVToHgp+OwIQb2GcgYmYc8k03fUI1NQpdxIM46DzoTj+06
x6qgrNsmmJbm8E73VWBByJAEFoq9ugjny8Rt+tYMi/CmhgZpp0ZyF1r5dYfYX/KS
VS8UQY//aZOFhNsQUAXwP7Ym00CYRgTg7Na+MHivYLXmYGH2gF6tWQhX/eEgHKcJ
RTmUbLFx70fdBbjJMxv2k8vyMk2sy6sTfJHPqM/NS/Fb0tSPBXQJG/EexzfoqiBc
wcjgOPkeALIosVdFdTqXxjoIGOP8rqsU4t6Y6WFjJlWK04SBVjxBUofytRdQSxkG
5Sb0rFVGKrKIkXaVkt4byPa1/BDpfNhfKMYPtQ56pv2VNUgzfye4prUAZHE5pLnK
8nixeeMtKDFFCBpn6rG5wZW7k2mK5FrWGZUfdfxdK1gWQ1y0kqGy5wa3lNZLcxlH
l3AtOYoDeWM2DjDVO6WCj8ambEWkbjbGg7tC9TI3F0NvRJSYytTb6npMqb3Gwhcb
U4NgT+Ho0GJ/5BLUye8HMfhvrGoCfRCeptHtEFXAK7pzKyjc0+c=
=Mocd
-----END PGP SIGNATURE-----
Merge tag 'printk-for-6.17' of git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux
Pull printk updates from Petr Mladek:
- Add new "hash_pointers=[auto|always|never]" boot parameter to force
the hashing even with "slab_debug" enabled
- Allow to stop CPU, after losing nbcon console ownership during
panic(), even without proper NMI
- Allow to use the printk kthread immediately even for the 1st
registered nbcon
- Compiler warning removal
* tag 'printk-for-6.17' of git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux:
printk: nbcon: Allow reacquire during panic
printk: Allow to use the printk kthread immediately even for 1st nbcon
slab: Decouple slab_debug and no_hash_pointers
vsprintf: Use __diag macros to disable '-Wsuggest-attribute=format'
compiler-gcc.h: Introduce __diag_GCC_all
- The 2 patch series "squashfs: Remove page->mapping references" from
Matthew Wilcox gets us closer to being able to remove page->mapping.
- The 5 patch series "relayfs: misc changes" from Jason Xing does some
maintenance and minor feature addition work in relayfs.
- The 5 patch series "kdump: crashkernel reservation from CMA" from Jiri
Bohac switches us from static preallocation of the kdump crashkernel's
working memory over to dynamic allocation. So the difficulty of
a-priori estimation of the second kernel's needs is removed and the
first kernel obtains extra memory.
- The 5 patch series "generalize panic_print's dump function to be used
by other kernel parts" from Feng Tang implements some consolidation and
rationalizatio of the various ways in which a faiing kernel splats
information at the operator.
-----BEGIN PGP SIGNATURE-----
iHUEABYKAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCaI+82gAKCRDdBJ7gKXxA
jj4JAP9xb+w9DrBY6sa+7KTPIb+aTqQ7Zw3o9O2m+riKQJv6jAEA6aEwRnDA0451
fDT5IqVlCWGvnVikdZHSnvhdD7TGsQ0=
=rT71
-----END PGP SIGNATURE-----
Merge tag 'mm-nonmm-stable-2025-08-03-12-47' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull non-MM updates from Andrew Morton:
"Significant patch series in this pull request:
- "squashfs: Remove page->mapping references" (Matthew Wilcox) gets
us closer to being able to remove page->mapping
- "relayfs: misc changes" (Jason Xing) does some maintenance and
minor feature addition work in relayfs
- "kdump: crashkernel reservation from CMA" (Jiri Bohac) switches
us from static preallocation of the kdump crashkernel's working
memory over to dynamic allocation. So the difficulty of a-priori
estimation of the second kernel's needs is removed and the first
kernel obtains extra memory
- "generalize panic_print's dump function to be used by other
kernel parts" (Feng Tang) implements some consolidation and
rationalization of the various ways in which a failing kernel
splats information at the operator
* tag 'mm-nonmm-stable-2025-08-03-12-47' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (80 commits)
tools/getdelays: add backward compatibility for taskstats version
kho: add test for kexec handover
delaytop: enhance error logging and add PSI feature description
samples: Kconfig: fix spelling mistake "instancess" -> "instances"
fat: fix too many log in fat_chain_add()
scripts/spelling.txt: add notifer||notifier to spelling.txt
xen/xenbus: fix typo "notifer"
net: mvneta: fix typo "notifer"
drm/xe: fix typo "notifer"
cxl: mce: fix typo "notifer"
KVM: x86: fix typo "notifer"
MAINTAINERS: add maintainers for delaytop
ucount: use atomic_long_try_cmpxchg() in atomic_long_inc_below()
ucount: fix atomic_long_inc_below() argument type
kexec: enable CMA based contiguous allocation
stackdepot: make max number of pools boot-time configurable
lib/xxhash: remove unused functions
init/Kconfig: restore CONFIG_BROKEN help text
lib/raid6: update recov_rvv.c zero page usage
docs: update docs after introducing delaytop
...
to use the module data structures in combination with the already no-op stub
module functions, even when support for modules is disabled in the kernel
configuration. This change follows the kernel's coding style for conditional
compilation and allows kunit code to drop all CONFIG_MODULES ifdefs, which is
also part of the changes. This should allow others part of the kernel to do the
same cleanup.
Note that this had a conflict with sysctl changes [1] but should be fixed now as I
rebased on top.
The remaining changes include a fix for module name length handling which could
potentially lead to the removal of an incorrect module, and various cleanups.
The module name fix and related cleanup has been in linux-next since Thursday
(July 31) while the rest of the changes for a bit more than 3 weeks.
Note that this currently has conflicts in next with kbuild's tree [2].
Link: https://lore.kernel.org/all/20250714175916.774e6d79@canb.auug.org.au/ [1]
Link: https://lore.kernel.org/all/20250801132941.6815d93d@canb.auug.org.au/ [2]
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEE73Ua4R8Pc+G5xjxTQJ6jxB8ZUfsFAmiPQgkACgkQQJ6jxB8Z
UfuPTA//XrRguJFBhQh6cUWqVleTNQJuhjiPsOSO5S52aVET4wsrnRNeM2eM5oqw
0+6ELvhIJINQ1LjpOP8D67d8P5Ds1/qM1pbQIkQsoKiEj6E7Q4dXH5N0uyf/BzO3
HaosLG9cpqcomlSorYEiYoPjqy9EChQzsi+YAYWAB+fW6bvU/AdUHTRH88m3ppBJ
Y22BTTPOKKyj5/QgfY+kwH8TTnrzCzY8aoOqW7uimLI5h4c9dFQ2PigRJnoMfDG1
11w5VshOTzZJvNFrUk5GVSirwlxdJDbW6dKfG0DD5+eNWK5dfIEc+/EcuhaGoPvO
Euwv8VQubdxHTAG6kzHI0MtxAVQUM1gyz8zHiu18eW++GTtnTFs6m8E6H9AC176G
nDkUh3qSxJN2HHgxtS9VUExEEZpYqtWeB9Zts8K3oSWvTaQenHWpVHPADkxzS4JU
Jvkjq8SiKo+RqHxaOKfyf1RfOtYe5tjMCLrP7zX39d1+cwGxuc6mip/omY9HFDgn
op132fYdt24JSHoioJDzRz9mTfvj3nICEmgX4D4WDQx5lP27CUcLugPnBNHPp0fu
5hL+ajy8M8nq4zm/42Y+F7VS74TIA6mSnJKs9dMCknUWueD6HrDEU9xHi1YMpUMZ
cBUSpU+P94dCIScwEzkp926vDnHyxCHLbpF1Jsq5qNNdj7AelHk=
=4bGB
-----END PGP SIGNATURE-----
Merge tag 'modules-6.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/modules/linux
Pull module updates from Daniel Gomez:
"This is a small set of changes for modules, primarily to extend module
users to use the module data structures in combination with the
already no-op stub module functions, even when support for modules is
disabled in the kernel configuration. This change follows the kernel's
coding style for conditional compilation and allows kunit code to drop
all CONFIG_MODULES ifdefs, which is also part of the changes. This
should allow others part of the kernel to do the same cleanup.
The remaining changes include a fix for module name length handling
which could potentially lead to the removal of an incorrect module,
and various cleanups"
* tag 'modules-6.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/modules/linux:
module: Rename MAX_PARAM_PREFIX_LEN to __MODULE_NAME_LEN
tracing: Replace MAX_PARAM_PREFIX_LEN with MODULE_NAME_LEN
module: Restore the moduleparam prefix length check
module: Remove unnecessary +1 from last_unloaded_module::name size
module: Prevent silent truncation of module name in delete_module(2)
kunit: test: Drop CONFIG_MODULE ifdeffery
module: make structure definitions always visible
module: move 'struct module_use' to internal.h
Testing kexec handover requires a kernel driver that will generate some
data and preserve it with KHO on the first boot and then restore that data
and verify it was preserved properly after kexec.
To facilitate such test, along with the kernel driver responsible for data
generation, preservation and restoration add a script that runs a kernel
in a VM with a minimal /init. The /init enables KHO, loads a kernel image
for kexec and runs kexec reboot. After the boot of the kexeced kernel,
the driver verifies that the data was properly preserved.
[rppt@kernel.org: fix section mismatch]
Link: https://lkml.kernel.org/r/aIiRC8fXiOXKbPM_@kernel.org
Link: https://lkml.kernel.org/r/20250727083733.2590139-1-rppt@kernel.org
Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Cc: Alexander Graf <graf@amazon.com>
Cc: Changyuan Lyu <changyuanl@google.com>
Cc: Pasha Tatashin <pasha.tatashin@soleen.com>
Cc: Pratyush Yadav <pratyush@kernel.org>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
We're hitting the WARN in depot_init_pool() about reaching the stack depot
limit because we have long stacks that don't dedup very well.
Introduce a new start-up parameter to allow users to set the number of
maximum stack depot pools.
Link: https://lkml.kernel.org/r/20250718153928.94229-1-matt@readmodwrite.com
Signed-off-by: Matt Fleming <mfleming@cloudflare.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Marco Elver <elver@google.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Andrey Konovalov <andreyknvl@gmail.com>
Cc: Dmitriy Vyukov <dvyukov@google.com>
Cc: Oscar Salvador <osalvador@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
xxh32_digest() and xxh32_update() were added in 2017 in the original
xxhash commit, but have remained unused.
Remove them.
Link: https://lkml.kernel.org/r/20250716133245.243363-1-linux@treblig.org
Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Cc: Dave Gilbert <linux@treblig.org>
Cc: Nick Terrell <terrelln@fb.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
- The 4 patch series "mm: ksm: prevent KSM from breaking merging of new
VMAs" from Lorenzo Stoakes addresses an issue with KSM's
PR_SET_MEMORY_MERGE mode: newly mapped VMAs were not eligible for
merging with existing adjacent VMAs.
- The 4 patch series "mm/damon: introduce DAMON_STAT for simple and
practical access monitoring" from SeongJae Park adds a new kernel module
which simplifies the setup and usage of DAMON in production
environments.
- The 6 patch series "stop passing a writeback_control to swap/shmem
writeout" from Christoph Hellwig is a cleanup to the writeback code
which removes a couple of pointers from struct writeback_control.
- The 7 patch series "drivers/base/node.c: optimization and cleanups"
from Donet Tom contains largely uncorrelated cleanups to the NUMA node
setup and management code.
- The 4 patch series "mm: userfaultfd: assorted fixes and cleanups" from
Tal Zussman does some maintenance work on the userfaultfd code.
- The 5 patch series "Readahead tweaks for larger folios" from Ryan
Roberts implements some tuneups for pagecache readahead when it is
reading into order>0 folios.
- The 4 patch series "selftests/mm: Tweaks to the cow test" from Mark
Brown provides some cleanups and consistency improvements to the
selftests code.
- The 4 patch series "Optimize mremap() for large folios" from Dev Jain
does that. A 37% reduction in execution time was measured in a
memset+mremap+munmap microbenchmark.
- The 5 patch series "Remove zero_user()" from Matthew Wilcox expunges
zero_user() in favor of the more modern memzero_page().
- The 3 patch series "mm/huge_memory: vmf_insert_folio_*() and
vmf_insert_pfn_pud() fixes" from David Hildenbrand addresses some warts
which David noticed in the huge page code. These were not known to be
causing any issues at this time.
- The 3 patch series "mm/damon: use alloc_migrate_target() for
DAMOS_MIGRATE_{HOT,COLD" from SeongJae Park provides some cleanup and
consolidation work in DAMON.
- The 3 patch series "use vm_flags_t consistently" from Lorenzo Stoakes
uses vm_flags_t in places where we were inappropriately using other
types.
- The 3 patch series "mm/memfd: Reserve hugetlb folios before
allocation" from Vivek Kasireddy increases the reliability of large page
allocation in the memfd code.
- The 14 patch series "mm: Remove pXX_devmap page table bit and pfn_t
type" from Alistair Popple removes several now-unneeded PFN_* flags.
- The 5 patch series "mm/damon: decouple sysfs from core" from SeongJae
Park implememnts some cleanup and maintainability work in the DAMON
sysfs layer.
- The 5 patch series "madvise cleanup" from Lorenzo Stoakes does quite a
lot of cleanup/maintenance work in the madvise() code.
- The 4 patch series "madvise anon_name cleanups" from Vlastimil Babka
provides additional cleanups on top or Lorenzo's effort.
- The 11 patch series "Implement numa node notifier" from Oscar Salvador
creates a standalone notifier for NUMA node memory state changes.
Previously these were lumped under the more general memory on/offline
notifier.
- The 6 patch series "Make MIGRATE_ISOLATE a standalone bit" from Zi Yan
cleans up the pageblock isolation code and fixes a potential issue which
doesn't seem to cause any problems in practice.
- The 5 patch series "selftests/damon: add python and drgn based DAMON
sysfs functionality tests" from SeongJae Park adds additional drgn- and
python-based DAMON selftests which are more comprehensive than the
existing selftest suite.
- The 5 patch series "Misc rework on hugetlb faulting path" from Oscar
Salvador fixes a rather obscure deadlock in the hugetlb fault code and
follows that fix with a series of cleanups.
- The 3 patch series "cma: factor out allocation logic from
__cma_declare_contiguous_nid" from Mike Rapoport rationalizes and cleans
up the highmem-specific code in the CMA allocator.
- The 28 patch series "mm/migration: rework movable_ops page migration
(part 1)" from David Hildenbrand provides cleanups and
future-preparedness to the migration code.
- The 2 patch series "mm/damon: add trace events for auto-tuned
monitoring intervals and DAMOS quota" from SeongJae Park adds some
tracepoints to some DAMON auto-tuning code.
- The 6 patch series "mm/damon: fix misc bugs in DAMON modules" from
SeongJae Park does that.
- The 6 patch series "mm/damon: misc cleanups" from SeongJae Park also
does what it claims.
- The 4 patch series "mm: folio_pte_batch() improvements" from David
Hildenbrand cleans up the large folio PTE batching code.
- The 13 patch series "mm/damon/vaddr: Allow interleaving in
migrate_{hot,cold} actions" from SeongJae Park facilitates dynamic
alteration of DAMON's inter-node allocation policy.
- The 3 patch series "Remove unmap_and_put_page()" from Vishal Moola
provides a couple of page->folio conversions.
- The 4 patch series "mm: per-node proactive reclaim" from Davidlohr
Bueso implements a per-node control of proactive reclaim - beyond the
current memcg-based implementation.
- The 14 patch series "mm/damon: remove damon_callback" from SeongJae
Park replaces the damon_callback interface with a more general and
powerful damon_call()+damos_walk() interface.
- The 10 patch series "mm/mremap: permit mremap() move of multiple VMAs"
from Lorenzo Stoakes implements a number of mremap cleanups (of course)
in preparation for adding new mremap() functionality: newly permit the
remapping of multiple VMAs when the user is specifying MREMAP_FIXED. It
still excludes some specialized situations where this cannot be
performed reliably.
- The 3 patch series "drop hugetlb_free_pgd_range()" from Anthony Yznaga
switches some sparc hugetlb code over to the generic version and removes
the thus-unneeded hugetlb_free_pgd_range().
- The 4 patch series "mm/damon/sysfs: support periodic and automated
stats update" from SeongJae Park augments the present
userspace-requested update of DAMON sysfs monitoring files. Automatic
update is now provided, along with a tunable to control the update
interval.
- The 4 patch series "Some randome fixes and cleanups to swapfile" from
Kemeng Shi does what is claims.
- The 4 patch series "mm: introduce snapshot_page" from Luiz Capitulino
and David Hildenbrand provides (and uses) a means by which debug-style
functions can grab a copy of a pageframe and inspect it locklessly
without tripping over the races inherent in operating on the live
pageframe directly.
- The 6 patch series "use per-vma locks for /proc/pid/maps reads" from
Suren Baghdasaryan addresses the large contention issues which can be
triggered by reads from that procfs file. Latencies are reduced by more
than half in some situations. The series also introduces several new
selftests for the /proc/pid/maps interface.
- The 6 patch series "__folio_split() clean up" from Zi Yan cleans up
__folio_split()!
- The 7 patch series "Optimize mprotect() for large folios" from Dev
Jain provides some quite large (>3x) speedups to mprotect() when dealing
with large folios.
- The 2 patch series "selftests/mm: reuse FORCE_READ to replace "asm
volatile("" : "+r" (XXX));" and some cleanup" from wang lian does some
cleanup work in the selftests code.
- The 3 patch series "tools/testing: expand mremap testing" from Lorenzo
Stoakes extends the mremap() selftest in several ways, including adding
more checking of Lorenzo's recently added "permit mremap() move of
multiple VMAs" feature.
- The 22 patch series "selftests/damon/sysfs.py: test all parameters"
from SeongJae Park extends the DAMON sysfs interface selftest so that it
tests all possible user-requested parameters. Rather than the present
minimal subset.
-----BEGIN PGP SIGNATURE-----
iHUEABYIAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCaIqcCgAKCRDdBJ7gKXxA
jkVBAQCCn9DR1QP0CRk961ot0cKzOgioSc0aA03DPb2KXRt2kQEAzDAz0ARurFhL
8BzbvI0c+4tntHLXvIlrC33n9KWAOQM=
=XsFy
-----END PGP SIGNATURE-----
Merge tag 'mm-stable-2025-07-30-15-25' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull MM updates from Andrew Morton:
"As usual, many cleanups. The below blurbiage describes 42 patchsets.
21 of those are partially or fully cleanup work. "cleans up",
"cleanup", "maintainability", "rationalizes", etc.
I never knew the MM code was so dirty.
"mm: ksm: prevent KSM from breaking merging of new VMAs" (Lorenzo Stoakes)
addresses an issue with KSM's PR_SET_MEMORY_MERGE mode: newly
mapped VMAs were not eligible for merging with existing adjacent
VMAs.
"mm/damon: introduce DAMON_STAT for simple and practical access monitoring" (SeongJae Park)
adds a new kernel module which simplifies the setup and usage of
DAMON in production environments.
"stop passing a writeback_control to swap/shmem writeout" (Christoph Hellwig)
is a cleanup to the writeback code which removes a couple of
pointers from struct writeback_control.
"drivers/base/node.c: optimization and cleanups" (Donet Tom)
contains largely uncorrelated cleanups to the NUMA node setup and
management code.
"mm: userfaultfd: assorted fixes and cleanups" (Tal Zussman)
does some maintenance work on the userfaultfd code.
"Readahead tweaks for larger folios" (Ryan Roberts)
implements some tuneups for pagecache readahead when it is reading
into order>0 folios.
"selftests/mm: Tweaks to the cow test" (Mark Brown)
provides some cleanups and consistency improvements to the
selftests code.
"Optimize mremap() for large folios" (Dev Jain)
does that. A 37% reduction in execution time was measured in a
memset+mremap+munmap microbenchmark.
"Remove zero_user()" (Matthew Wilcox)
expunges zero_user() in favor of the more modern memzero_page().
"mm/huge_memory: vmf_insert_folio_*() and vmf_insert_pfn_pud() fixes" (David Hildenbrand)
addresses some warts which David noticed in the huge page code.
These were not known to be causing any issues at this time.
"mm/damon: use alloc_migrate_target() for DAMOS_MIGRATE_{HOT,COLD" (SeongJae Park)
provides some cleanup and consolidation work in DAMON.
"use vm_flags_t consistently" (Lorenzo Stoakes)
uses vm_flags_t in places where we were inappropriately using other
types.
"mm/memfd: Reserve hugetlb folios before allocation" (Vivek Kasireddy)
increases the reliability of large page allocation in the memfd
code.
"mm: Remove pXX_devmap page table bit and pfn_t type" (Alistair Popple)
removes several now-unneeded PFN_* flags.
"mm/damon: decouple sysfs from core" (SeongJae Park)
implememnts some cleanup and maintainability work in the DAMON
sysfs layer.
"madvise cleanup" (Lorenzo Stoakes)
does quite a lot of cleanup/maintenance work in the madvise() code.
"madvise anon_name cleanups" (Vlastimil Babka)
provides additional cleanups on top or Lorenzo's effort.
"Implement numa node notifier" (Oscar Salvador)
creates a standalone notifier for NUMA node memory state changes.
Previously these were lumped under the more general memory
on/offline notifier.
"Make MIGRATE_ISOLATE a standalone bit" (Zi Yan)
cleans up the pageblock isolation code and fixes a potential issue
which doesn't seem to cause any problems in practice.
"selftests/damon: add python and drgn based DAMON sysfs functionality tests" (SeongJae Park)
adds additional drgn- and python-based DAMON selftests which are
more comprehensive than the existing selftest suite.
"Misc rework on hugetlb faulting path" (Oscar Salvador)
fixes a rather obscure deadlock in the hugetlb fault code and
follows that fix with a series of cleanups.
"cma: factor out allocation logic from __cma_declare_contiguous_nid" (Mike Rapoport)
rationalizes and cleans up the highmem-specific code in the CMA
allocator.
"mm/migration: rework movable_ops page migration (part 1)" (David Hildenbrand)
provides cleanups and future-preparedness to the migration code.
"mm/damon: add trace events for auto-tuned monitoring intervals and DAMOS quota" (SeongJae Park)
adds some tracepoints to some DAMON auto-tuning code.
"mm/damon: fix misc bugs in DAMON modules" (SeongJae Park)
does that.
"mm/damon: misc cleanups" (SeongJae Park)
also does what it claims.
"mm: folio_pte_batch() improvements" (David Hildenbrand)
cleans up the large folio PTE batching code.
"mm/damon/vaddr: Allow interleaving in migrate_{hot,cold} actions" (SeongJae Park)
facilitates dynamic alteration of DAMON's inter-node allocation
policy.
"Remove unmap_and_put_page()" (Vishal Moola)
provides a couple of page->folio conversions.
"mm: per-node proactive reclaim" (Davidlohr Bueso)
implements a per-node control of proactive reclaim - beyond the
current memcg-based implementation.
"mm/damon: remove damon_callback" (SeongJae Park)
replaces the damon_callback interface with a more general and
powerful damon_call()+damos_walk() interface.
"mm/mremap: permit mremap() move of multiple VMAs" (Lorenzo Stoakes)
implements a number of mremap cleanups (of course) in preparation
for adding new mremap() functionality: newly permit the remapping
of multiple VMAs when the user is specifying MREMAP_FIXED. It still
excludes some specialized situations where this cannot be performed
reliably.
"drop hugetlb_free_pgd_range()" (Anthony Yznaga)
switches some sparc hugetlb code over to the generic version and
removes the thus-unneeded hugetlb_free_pgd_range().
"mm/damon/sysfs: support periodic and automated stats update" (SeongJae Park)
augments the present userspace-requested update of DAMON sysfs
monitoring files. Automatic update is now provided, along with a
tunable to control the update interval.
"Some randome fixes and cleanups to swapfile" (Kemeng Shi)
does what is claims.
"mm: introduce snapshot_page" (Luiz Capitulino and David Hildenbrand)
provides (and uses) a means by which debug-style functions can grab
a copy of a pageframe and inspect it locklessly without tripping
over the races inherent in operating on the live pageframe
directly.
"use per-vma locks for /proc/pid/maps reads" (Suren Baghdasaryan)
addresses the large contention issues which can be triggered by
reads from that procfs file. Latencies are reduced by more than
half in some situations. The series also introduces several new
selftests for the /proc/pid/maps interface.
"__folio_split() clean up" (Zi Yan)
cleans up __folio_split()!
"Optimize mprotect() for large folios" (Dev Jain)
provides some quite large (>3x) speedups to mprotect() when dealing
with large folios.
"selftests/mm: reuse FORCE_READ to replace "asm volatile("" : "+r" (XXX));" and some cleanup" (wang lian)
does some cleanup work in the selftests code.
"tools/testing: expand mremap testing" (Lorenzo Stoakes)
extends the mremap() selftest in several ways, including adding
more checking of Lorenzo's recently added "permit mremap() move of
multiple VMAs" feature.
"selftests/damon/sysfs.py: test all parameters" (SeongJae Park)
extends the DAMON sysfs interface selftest so that it tests all
possible user-requested parameters. Rather than the present minimal
subset"
* tag 'mm-stable-2025-07-30-15-25' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (370 commits)
MAINTAINERS: add missing headers to mempory policy & migration section
MAINTAINERS: add missing file to cgroup section
MAINTAINERS: add MM MISC section, add missing files to MISC and CORE
MAINTAINERS: add missing zsmalloc file
MAINTAINERS: add missing files to page alloc section
MAINTAINERS: add missing shrinker files
MAINTAINERS: move memremap.[ch] to hotplug section
MAINTAINERS: add missing mm_slot.h file THP section
MAINTAINERS: add missing interval_tree.c to memory mapping section
MAINTAINERS: add missing percpu-internal.h file to per-cpu section
mm/page_alloc: remove trace_mm_alloc_contig_migrate_range_info()
selftests/damon: introduce _common.sh to host shared function
selftests/damon/sysfs.py: test runtime reduction of DAMON parameters
selftests/damon/sysfs.py: test non-default parameters runtime commit
selftests/damon/sysfs.py: generalize DAMON context commit assertion
selftests/damon/sysfs.py: generalize monitoring attributes commit assertion
selftests/damon/sysfs.py: generalize DAMOS schemes commit assertion
selftests/damon/sysfs.py: test DAMOS filters commitment
selftests/damon/sysfs.py: generalize DAMOS scheme commit assertion
selftests/damon/sysfs.py: test DAMOS destinations commitment
...
The definitions of GENMASK() and GENMASK_ULL() do not depend any more
on __GENMASK() and __GENMASK_ULL(). Duplicate the existing unit tests
so that __GENMASK{,ULL}() are still covered.
Because __GENMASK() and __GENMASK_ULL() do use GENMASK_INPUT_CHECK(),
drop the TEST_GENMASK_FAILURES negative tests.
It would be good to have a small assembly test case for GENMASK*() in
case somebody decides to unify both in the future. However, I lack
expertise in assembly to do so. Instead add a FIXME message to
highlight the absence of the asm unit test.
Signed-off-by: Vincent Mailhol <mailhol.vincent@wanadoo.fr>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Yury Norov (NVIDIA) <yury.norov@gmail.com>
The function stubs exposed by module.h allow the code to compile properly
without the ifdeffery. The generated object code stays the same, as the
compiler can optimize away all the dead code.
As the code is still typechecked developer errors can be detected faster.
Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Acked-by: David Gow <davidgow@google.com>
Reviewed-by: Daniel Gomez <da.gomez@samsung.com>
Link: https://lore.kernel.org/r/20250711-kunit-ifdef-modules-v2-3-39443decb1f8@linutronix.de
Signed-off-by: Daniel Gomez <da.gomez@samsung.com>
Core & protocols
----------------
- Wrap datapath globals into net_aligned_data, to avoid false sharing.
- Preserve MSG_ZEROCOPY in forwarding (e.g. out of a container).
- Add SO_INQ and SCM_INQ support to AF_UNIX.
- Add SIOCINQ support to AF_VSOCK.
- Add TCP_MAXSEG sockopt to MPTCP.
- Add IPv6 force_forwarding sysctl to enable forwarding per interface.
- Make TCP validation of whether packet fully fits in the receive
window and the rcv_buf more strict. With increased use of HW
aggregation a single "packet" can be multiple 100s of kB.
- Add MSG_MORE flag to optimize large TCP transmissions via sockmap,
improves latency up to 33% for sockmap users.
- Convert TCP send queue handling from tasklet to BH workque.
- Improve BPF iteration over TCP sockets to see each socket exactly once.
- Remove obsolete and unused TCP RFC3517/RFC6675 loss recovery code.
- Support enabling kernel threads for NAPI processing on per-NAPI
instance basis rather than a whole device. Fully stop the kernel NAPI
thread when threaded NAPI gets disabled. Previously thread would stick
around until ifdown due to tricky synchronization.
- Allow multicast routing to take effect on locally-generated packets.
- Add output interface argument for End.X in segment routing.
- MCTP: add support for gateway routing, improve bind() handling.
- Don't require rtnl_lock when fetching an IPv6 neighbor over Netlink.
- Add a new neighbor flag ("extern_valid"), which cedes refresh
responsibilities to userspace. This is needed for EVPN multi-homing
where a neighbor entry for a multi-homed host needs to be synced
across all the VTEPs among which the host is multi-homed.
- Support NUD_PERMANENT for proxy neighbor entries.
- Add a new queuing discipline for IETF RFC9332 DualQ Coupled AQM.
- Add sequence numbers to netconsole messages. Unregister netconsole's
console when all net targets are removed. Code refactoring.
Add a number of selftests.
- Align IPSec inbound SA lookup to RFC 4301. Only SPI and protocol
should be used for an inbound SA lookup.
- Support inspecting ref_tracker state via DebugFS.
- Don't force bonding advertisement frames tx to ~333 ms boundaries.
Add broadcast_neighbor option to send ARP/ND on all bonded links.
- Allow providing upcall pid for the 'execute' command in openvswitch.
- Remove DCCP support from Netfilter's conntrack.
- Disallow multiple packet duplications in the queuing layer.
- Prevent use of deprecated iptables code on PREEMPT_RT.
Driver API
----------
- Support RSS and hashing configuration over ethtool Netlink.
- Add dedicated ethtool callbacks for getting and setting hashing fields.
- Add support for power budget evaluation strategy in PSE /
Power-over-Ethernet. Generate Netlink events for overcurrent etc.
- Support DPLL phase offset monitoring across all device inputs.
Support providing clock reference and SYNC over separate DPLL
inputs.
- Support traffic classes in devlink rate API for bandwidth management.
- Remove rtnl_lock dependency from UDP tunnel port configuration.
Device drivers
--------------
- Add a new Broadcom driver for 800G Ethernet (bnge).
- Add a standalone driver for Microchip ZL3073x DPLL.
- Remove IBM's NETIUCV device driver.
- Ethernet high-speed NICs:
- Broadcom (bnxt):
- support zero-copy Tx of DMABUF memory
- take page size into account for page pool recycling rings
- Intel (100G, ice, idpf):
- idpf: XDP and AF_XDP support preparations
- idpf: add flow steering
- add link_down_events statistic
- clean up the TSPLL code
- preparations for live VM migration
- nVidia/Mellanox:
- support zero-copy Rx/Tx interfaces (DMABUF and io_uring)
- optimize context memory usage for matchers
- expose serial numbers in devlink info
- support PCIe congestion metrics
- Meta (fbnic):
- add 25G, 50G, and 100G link modes to phylink
- support dumping FW logs
- Marvell/Cavium:
- support for CN20K generation of the Octeon chips
- Amazon:
- add HW clock (without timestamping, just hypervisor time access)
- Ethernet virtual:
- VirtIO net:
- support segmentation of UDP-tunnel-encapsulated packets
- Google (gve):
- support packet timestamping and clock synchronization
- Microsoft vNIC:
- add handler for device-originated servicing events
- allow dynamic MSI-X vector allocation
- support Tx bandwidth clamping
- Ethernet NICs consumer, and embedded:
- AMD:
- amd-xgbe: hardware timestamping and PTP clock support
- Broadcom integrated MACs (bcmgenet, bcmasp):
- use napi_complete_done() return value to support NAPI polling
- add support for re-starting auto-negotiation
- Broadcom switches (b53):
- support BCM5325 switches
- add bcm63xx EPHY power control
- Synopsys (stmmac):
- lots of code refactoring and cleanups
- TI:
- icssg-prueth: read firmware-names from device tree
- icssg: PRP offload support
- Microchip:
- lan78xx: convert to PHYLINK for improved PHY and MAC management
- ksz: add KSZ8463 switch support
- Intel:
- support similar queue priority scheme in multi-queue and
time-sensitive networking (taprio)
- support packet pre-emption in both
- RealTek (r8169):
- enable EEE at 5Gbps on RTL8126
- Airoha:
- add PPPoE offload support
- MDIO bus controller for Airoha AN7583
- Ethernet PHYs:
- support for the IPQ5018 internal GE PHY
- micrel KSZ9477 switch-integrated PHYs:
- add MDI/MDI-X control support
- add RX error counters
- add cable test support
- add Signal Quality Indicator (SQI) reporting
- dp83tg720: improve reset handling and reduce link recovery time
- support bcm54811 (and its MII-Lite interface type)
- air_en8811h: support resume/suspend
- support PHY counters for QCA807x and QCA808x
- support WoL for QCA807x
- CAN drivers:
- rcar_canfd: support for Transceiver Delay Compensation
- kvaser: report FW versions via devlink dev info
- WiFi:
- extended regulatory info support (6 GHz)
- add statistics and beacon monitor for Multi-Link Operation (MLO)
- support S1G aggregation, improve S1G support
- add Radio Measurement action fields
- support per-radio RTS threshold
- some work around how FIPS affects wifi, which was wrong (RC4 is used
by TKIP, not only WEP)
- improvements for unsolicited probe response handling
- WiFi drivers:
- RealTek (rtw88):
- IBSS mode for SDIO devices
- RealTek (rtw89):
- BT coexistence for MLO/WiFi7
- concurrent station + P2P support
- support for USB devices RTL8851BU/RTL8852BU
- Intel (iwlwifi):
- use embedded PNVM in (to be released) FW images to fix
compatibility issues
- many cleanups (unused FW APIs, PCIe code, WoWLAN)
- some FIPS interoperability
- MediaTek (mt76):
- firmware recovery improvements
- more MLO work
- Qualcomm/Atheros (ath12k):
- fix scan on multi-radio devices
- more EHT/Wi-Fi 7 features
- encapsulation/decapsulation offload
- Broadcom (brcm80211):
- support SDIO 43751 device
- Bluetooth:
- hci_event: add support for handling LE BIG Sync Lost event
- ISO: add socket option to report packet seqnum via CMSG
- ISO: support SCM_TIMESTAMPING for ISO TS
- Bluetooth drivers:
- intel_pcie: support Function Level Reset
- nxpuart: add support for 4M baudrate
- nxpuart: implement powerup sequence, reset, FW dump, and FW loading
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEE6jPA+I1ugmIBA4hXMUZtbf5SIrsFAmiFgLgACgkQMUZtbf5S
IrvafxAAnQRwYBoIG+piCILx6z5pRvBGHkmEQ4AQgSCFuq2eO3ubwMFIqEybfma1
5+QFjUZAV3OgGgKRBS2KGWxtSzdiF+/JGV1VOIN67sX3Mm0a2QgjA4n5CgKL0FPr
o6BEzjX5XwG1zvGcBNQ5BZ19xUUKjoZQgTtnea8sZ57Fsp5RtRgmYRqoewNvNk/n
uImh0NFsDVb0UeOpSzC34VD9l1dJvLGdui4zJAjno/vpvmT1DkXjoK419J/r52SS
X+5WgsfJ6DkjHqVN1tIhhK34yWqBOcwGFZJgEnWHMkFIl2FqRfFKMHyqtfLlVnLA
mnIpSyz8Sq2AHtx0TlgZ3At/Ri8p5+yYJgHOXcDKyABa8y8Zf4wrycmr6cV9JLuL
z54nLEVnJuvfDVDVJjsLYdJXyhMpZFq6+uAItdxKaw8Ugp/QqG4QtoRj+XIHz4ZW
z6OohkCiCzTwEISFK+pSTxPS30eOxq43kCspcvuLiwCCStJBRkRb5GdZA4dm7LA+
1Od4ADAkHjyrFtBqTyyC2scX8UJ33DlAIpAYyIeS6w9Cj9EXxtp1z33IAAAZ03MW
jJwIaJuc8bK2fWKMmiG7ucIXjPo4t//KiWlpkwwqLhPbjZgfDAcxq1AC2TLoqHBL
y4EOgKpHDCMAghSyiFIAn2JprGcEt8dp+11B0JRXIn4Pm/eYDH8=
=lqbe
-----END PGP SIGNATURE-----
Merge tag 'net-next-6.17' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next
Pull networking updates from Jakub Kicinski:
"Core & protocols:
- Wrap datapath globals into net_aligned_data, to avoid false sharing
- Preserve MSG_ZEROCOPY in forwarding (e.g. out of a container)
- Add SO_INQ and SCM_INQ support to AF_UNIX
- Add SIOCINQ support to AF_VSOCK
- Add TCP_MAXSEG sockopt to MPTCP
- Add IPv6 force_forwarding sysctl to enable forwarding per interface
- Make TCP validation of whether packet fully fits in the receive
window and the rcv_buf more strict. With increased use of HW
aggregation a single "packet" can be multiple 100s of kB
- Add MSG_MORE flag to optimize large TCP transmissions via sockmap,
improves latency up to 33% for sockmap users
- Convert TCP send queue handling from tasklet to BH workque
- Improve BPF iteration over TCP sockets to see each socket exactly
once
- Remove obsolete and unused TCP RFC3517/RFC6675 loss recovery code
- Support enabling kernel threads for NAPI processing on per-NAPI
instance basis rather than a whole device. Fully stop the kernel
NAPI thread when threaded NAPI gets disabled. Previously thread
would stick around until ifdown due to tricky synchronization
- Allow multicast routing to take effect on locally-generated packets
- Add output interface argument for End.X in segment routing
- MCTP: add support for gateway routing, improve bind() handling
- Don't require rtnl_lock when fetching an IPv6 neighbor over Netlink
- Add a new neighbor flag ("extern_valid"), which cedes refresh
responsibilities to userspace. This is needed for EVPN multi-homing
where a neighbor entry for a multi-homed host needs to be synced
across all the VTEPs among which the host is multi-homed
- Support NUD_PERMANENT for proxy neighbor entries
- Add a new queuing discipline for IETF RFC9332 DualQ Coupled AQM
- Add sequence numbers to netconsole messages. Unregister
netconsole's console when all net targets are removed. Code
refactoring. Add a number of selftests
- Align IPSec inbound SA lookup to RFC 4301. Only SPI and protocol
should be used for an inbound SA lookup
- Support inspecting ref_tracker state via DebugFS
- Don't force bonding advertisement frames tx to ~333 ms boundaries.
Add broadcast_neighbor option to send ARP/ND on all bonded links
- Allow providing upcall pid for the 'execute' command in openvswitch
- Remove DCCP support from Netfilter's conntrack
- Disallow multiple packet duplications in the queuing layer
- Prevent use of deprecated iptables code on PREEMPT_RT
Driver API:
- Support RSS and hashing configuration over ethtool Netlink
- Add dedicated ethtool callbacks for getting and setting hashing
fields
- Add support for power budget evaluation strategy in PSE /
Power-over-Ethernet. Generate Netlink events for overcurrent etc
- Support DPLL phase offset monitoring across all device inputs.
Support providing clock reference and SYNC over separate DPLL
inputs
- Support traffic classes in devlink rate API for bandwidth
management
- Remove rtnl_lock dependency from UDP tunnel port configuration
Device drivers:
- Add a new Broadcom driver for 800G Ethernet (bnge)
- Add a standalone driver for Microchip ZL3073x DPLL
- Remove IBM's NETIUCV device driver
- Ethernet high-speed NICs:
- Broadcom (bnxt):
- support zero-copy Tx of DMABUF memory
- take page size into account for page pool recycling rings
- Intel (100G, ice, idpf):
- idpf: XDP and AF_XDP support preparations
- idpf: add flow steering
- add link_down_events statistic
- clean up the TSPLL code
- preparations for live VM migration
- nVidia/Mellanox:
- support zero-copy Rx/Tx interfaces (DMABUF and io_uring)
- optimize context memory usage for matchers
- expose serial numbers in devlink info
- support PCIe congestion metrics
- Meta (fbnic):
- add 25G, 50G, and 100G link modes to phylink
- support dumping FW logs
- Marvell/Cavium:
- support for CN20K generation of the Octeon chips
- Amazon:
- add HW clock (without timestamping, just hypervisor time access)
- Ethernet virtual:
- VirtIO net:
- support segmentation of UDP-tunnel-encapsulated packets
- Google (gve):
- support packet timestamping and clock synchronization
- Microsoft vNIC:
- add handler for device-originated servicing events
- allow dynamic MSI-X vector allocation
- support Tx bandwidth clamping
- Ethernet NICs consumer, and embedded:
- AMD:
- amd-xgbe: hardware timestamping and PTP clock support
- Broadcom integrated MACs (bcmgenet, bcmasp):
- use napi_complete_done() return value to support NAPI polling
- add support for re-starting auto-negotiation
- Broadcom switches (b53):
- support BCM5325 switches
- add bcm63xx EPHY power control
- Synopsys (stmmac):
- lots of code refactoring and cleanups
- TI:
- icssg-prueth: read firmware-names from device tree
- icssg: PRP offload support
- Microchip:
- lan78xx: convert to PHYLINK for improved PHY and MAC management
- ksz: add KSZ8463 switch support
- Intel:
- support similar queue priority scheme in multi-queue and
time-sensitive networking (taprio)
- support packet pre-emption in both
- RealTek (r8169):
- enable EEE at 5Gbps on RTL8126
- Airoha:
- add PPPoE offload support
- MDIO bus controller for Airoha AN7583
- Ethernet PHYs:
- support for the IPQ5018 internal GE PHY
- micrel KSZ9477 switch-integrated PHYs:
- add MDI/MDI-X control support
- add RX error counters
- add cable test support
- add Signal Quality Indicator (SQI) reporting
- dp83tg720: improve reset handling and reduce link recovery time
- support bcm54811 (and its MII-Lite interface type)
- air_en8811h: support resume/suspend
- support PHY counters for QCA807x and QCA808x
- support WoL for QCA807x
- CAN drivers:
- rcar_canfd: support for Transceiver Delay Compensation
- kvaser: report FW versions via devlink dev info
- WiFi:
- extended regulatory info support (6 GHz)
- add statistics and beacon monitor for Multi-Link Operation (MLO)
- support S1G aggregation, improve S1G support
- add Radio Measurement action fields
- support per-radio RTS threshold
- some work around how FIPS affects wifi, which was wrong (RC4 is
used by TKIP, not only WEP)
- improvements for unsolicited probe response handling
- WiFi drivers:
- RealTek (rtw88):
- IBSS mode for SDIO devices
- RealTek (rtw89):
- BT coexistence for MLO/WiFi7
- concurrent station + P2P support
- support for USB devices RTL8851BU/RTL8852BU
- Intel (iwlwifi):
- use embedded PNVM in (to be released) FW images to fix
compatibility issues
- many cleanups (unused FW APIs, PCIe code, WoWLAN)
- some FIPS interoperability
- MediaTek (mt76):
- firmware recovery improvements
- more MLO work
- Qualcomm/Atheros (ath12k):
- fix scan on multi-radio devices
- more EHT/Wi-Fi 7 features
- encapsulation/decapsulation offload
- Broadcom (brcm80211):
- support SDIO 43751 device
- Bluetooth:
- hci_event: add support for handling LE BIG Sync Lost event
- ISO: add socket option to report packet seqnum via CMSG
- ISO: support SCM_TIMESTAMPING for ISO TS
- Bluetooth drivers:
- intel_pcie: support Function Level Reset
- nxpuart: add support for 4M baudrate
- nxpuart: implement powerup sequence, reset, FW dump, and FW loading"
* tag 'net-next-6.17' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1742 commits)
dpll: zl3073x: Fix build failure
selftests: bpf: fix legacy netfilter options
ipv6: annotate data-races around rt->fib6_nsiblings
ipv6: fix possible infinite loop in fib6_info_uses_dev()
ipv6: prevent infinite loop in rt6_nlmsg_size()
ipv6: add a retry logic in net6_rt_notify()
vrf: Drop existing dst reference in vrf_ip6_input_dst
net/sched: taprio: align entry index attr validation with mqprio
net: fsl_pq_mdio: use dev_err_probe
selftests: rtnetlink.sh: remove esp4_offload after test
vsock: remove unnecessary null check in vsock_getname()
igb: xsk: solve negative overflow of nb_pkts in zerocopy mode
stmmac: xsk: fix negative overflow of budget in zerocopy mode
dt-bindings: ieee802154: Convert at86rf230.txt yaml format
net: dsa: microchip: Disable PTP function of KSZ8463
net: dsa: microchip: Setup fiber ports for KSZ8463
net: dsa: microchip: Write switch MAC address differently for KSZ8463
net: dsa: microchip: Use different registers for KSZ8463
net: dsa: microchip: Add KSZ8463 switch support to KSZ DSA driver
dt-bindings: net: dsa: microchip: Add KSZ8463 switch support
...
* Move sysctls out of the kern_table array
This is the final move of ctl_tables into their respective subsystems. Only 5
(out of the original 50) will remain in kernel/sysctl.c file; these handle
either sysctl or common arch variables.
By decentralizing sysctl registrations, subsystem maintainers regain control
over their sysctl interfaces, improving maintainability and reducing the
likelihood of merge conflicts.
* docs: Remove false positives from check-sysctl-docs
Stopped falsely identifying sysctls as undocumented or unimplemented in the
check-sysctl-docs script. This script can now be used to automatically
identify if documentation is missing.
* Testing
All these have been in linux-next since rc3, giving them a solid 3 to 4 weeks
worth of testing. Additionally, sysctl selftests and kunit were also run
locally on my x86_64
-----BEGIN PGP SIGNATURE-----
iQGzBAABCgAdFiEErkcJVyXmMSXOyyeQupfNUreWQU8FAmiAvd8ACgkQupfNUreW
QU+9nAv/dtxaKoL4BXJSzsA2+49bbo9QfiK5Vjz1wSRYRQTb+jhGr9QdS5hG+NeX
uN2ilvcNQqW7ENdiblU10lvcbPjIn2hw4lbMcpv/+QXnrudtGYlBFXlkWqW5nv7X
AVvHU8y3uzfs6JbRIpROUA7Cn2cDOlfP2mMtwxCXR3iP+orS1ziuVEi1JRoirIyG
iq5I/1rJMJBU3FjqqDTq6yljspLx8AlXO1yc5xUxAM67IcY4ew3ZTxqiZr6M9AhV
DUbR2lu/88wcFNERt8DJmuQ50dSGGqOEpK3FURTmkwtMFxzNLmenFDQeBKKahz3Q
2ntXSDfp2y+ppZNmcOP8tZZkra03Xpy1DQyoOgQ2r9uGekPxyr+wmKXwYPOeJIPO
YWTNBm8omX9qr49zVzaZ1f2foRGfgStHL6aa6xLIf34zzScSDEPtO3og2+5Hw/30
gnp+7v9E19uKpoE6oiGE0PtiFzAi/I6nFxzG2RRqrlMLFXyKVccTKygzY6tCnI3P
6144s/Bt
=R369
-----END PGP SIGNATURE-----
Merge tag 'sysctl-6.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/sysctl/sysctl
Pull sysctl updates from Joel Granados:
- Move sysctls out of the kern_table array
This is the final move of ctl_tables into their respective
subsystems. Only 5 (out of the original 50) will remain in
kernel/sysctl.c file; these handle either sysctl or common arch
variables.
By decentralizing sysctl registrations, subsystem maintainers regain
control over their sysctl interfaces, improving maintainability and
reducing the likelihood of merge conflicts.
- docs: Remove false positives from check-sysctl-docs
Stopped falsely identifying sysctls as undocumented or unimplemented
in the check-sysctl-docs script. This script can now be used to
automatically identify if documentation is missing.
* tag 'sysctl-6.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/sysctl/sysctl: (23 commits)
docs: Downgrade arm64 & riscv from titles to comment
docs: Replace spaces with tabs in check-sysctl-docs
docs: Remove colon from ctltable title in vm.rst
docs: Add awk section for ucount sysctl entries
docs: Use skiplist when checking sysctl admin-guide
docs: nixify check-sysctl-docs
sysctl: rename kern_table -> sysctl_subsys_table
kernel/sys.c: Move overflow{uid,gid} sysctl into kernel/sys.c
uevent: mv uevent_helper into kobject_uevent.c
sysctl: Removed unused variable
sysctl: Nixify sysctl.sh
sysctl: Remove superfluous includes from kernel/sysctl.c
sysctl: Remove (very) old file changelog
sysctl: Move sysctl_panic_on_stackoverflow to kernel/panic.c
sysctl: move cad_pid into kernel/pid.c
sysctl: Move tainted ctl_table into kernel/panic.c
Input: sysrq: mv sysrq into drivers/tty/sysrq.c
fork: mv threads-max into kernel/fork.c
parisc/power: Move soft-power into power.c
mm: move randomize_va_space into memory.c
...
- Standardize on the __ASSEMBLER__ macro that is provided by GCC
and Clang compilers and replace __ASSEMBLY__ with __ASSEMBLER__
in both uapi and non-uapi headers
- Explicitly include <linux/export.h> in architecture and driver
files which contain an EXPORT_SYMBOL() and remove the include
from the files which do not contain the EXPORT_SYMBOL()
- Use the full title of "z/Architecture Principles of Operation"
manual and the name of a section where facility bits are listed
- Use -D__DISABLE_EXPORTS for files in arch/s390/boot to avoid
unnecessary slowing down of the build and confusing external
kABI tools that process symtypes data
- Print additional unrecoverable machine check information to make
the root cause analysis easier
- Move cmpxchg_user_key() handling to uaccess library code, since
the generated code is large anyway and there is no benefit if it
is inlined
- Fix a problem when cmpxchg_user_key() is executing a code with a
non-default key: if a system is IPL-ed with "LOAD NORMAL", and
the previous system used storage keys where the fetch-protection
bit was set for some pages, and the cmpxchg_user_key() is located
within such page, a protection exception happens
- Either the external call or emergency signal order is used to send
an IPI to a remote CPU. Use the external order only, since it is at
least as good and sometimes even better, than the emergency signal
- In case of an early crash the early program check handler prints
more or less random value of the last breaking event address, since
it is not initialized properly. Copy the last breaking event address
from the lowcore to pt_regs to address this
- During STP synchronization check udelay() can not be used, since the
first CPU modifies tod_clock_base and get_tod_clock_monotonic() might
return a non-monotonic time. Instead, busy-loop on other CPUs, while
the the first CPU actually handles the synchronization operation
- When debugging the early kernel boot using QEMU with the -S flag and
GDB attached, skip the decompressor and start directly in kernel
- Rename PAI Crypto event 4210 according to z16 and z17 "z/Architecture
Principles of Operation" manual
- Remove the in-kernel time steering support in favour of the new s390
PTP driver, which allows the kernel clock steered more precisely
- Remove a possible false-positive warning in pte_free_defer(), which
could be triggered in a valid case KVM guest process is initializing
-----BEGIN PGP SIGNATURE-----
iI0EABYKADUWIQQrtrZiYVkVzKQcYivNdxKlNrRb8AUCaIJQThccYWdvcmRlZXZA
bGludXguaWJtLmNvbQAKCRDNdxKlNrRb8FI2APwPnlrj6ZVXzNA6dw0fSUt697rS
NlaHEORXL8KcfoQh8QD/WwHUe1VNtDG1R5bBn0guR+UytVgR9Tt7LxyKfIgT3ws=
=tdMb
-----END PGP SIGNATURE-----
Merge tag 's390-6.17-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux
Pull s390 updates from Alexander Gordeev:
- Standardize on the __ASSEMBLER__ macro that is provided by GCC and
Clang compilers and replace __ASSEMBLY__ with __ASSEMBLER__ in both
uapi and non-uapi headers
- Explicitly include <linux/export.h> in architecture and driver files
which contain an EXPORT_SYMBOL() and remove the include from the
files which do not contain the EXPORT_SYMBOL()
- Use the full title of "z/Architecture Principles of Operation" manual
and the name of a section where facility bits are listed
- Use -D__DISABLE_EXPORTS for files in arch/s390/boot to avoid
unnecessary slowing down of the build and confusing external kABI
tools that process symtypes data
- Print additional unrecoverable machine check information to make the
root cause analysis easier
- Move cmpxchg_user_key() handling to uaccess library code, since the
generated code is large anyway and there is no benefit if it is
inlined
- Fix a problem when cmpxchg_user_key() is executing a code with a
non-default key: if a system is IPL-ed with "LOAD NORMAL", and the
previous system used storage keys where the fetch-protection bit was
set for some pages, and the cmpxchg_user_key() is located within such
page, a protection exception happens
- Either the external call or emergency signal order is used to send an
IPI to a remote CPU. Use the external order only, since it is at
least as good and sometimes even better, than the emergency signal
- In case of an early crash the early program check handler prints more
or less random value of the last breaking event address, since it is
not initialized properly. Copy the last breaking event address from
the lowcore to pt_regs to address this
- During STP synchronization check udelay() can not be used, since the
first CPU modifies tod_clock_base and get_tod_clock_monotonic() might
return a non-monotonic time. Instead, busy-loop on other CPUs, while
the the first CPU actually handles the synchronization operation
- When debugging the early kernel boot using QEMU with the -S flag and
GDB attached, skip the decompressor and start directly in kernel
- Rename PAI Crypto event 4210 according to z16 and z17 "z/Architecture
Principles of Operation" manual
- Remove the in-kernel time steering support in favour of the new s390
PTP driver, which allows the kernel clock steered more precisely
- Remove a possible false-positive warning in pte_free_defer(), which
could be triggered in a valid case KVM guest process is initializing
* tag 's390-6.17-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (29 commits)
s390/mm: Remove possible false-positive warning in pte_free_defer()
s390/stp: Default to enabled
s390/stp: Remove leap second support
s390/time: Remove in-kernel time steering
s390/sclp: Use monotonic clock in sclp_sync_wait()
s390/smp: Use monotonic clock in smp_emergency_stop()
s390/time: Use monotonic clock in get_cycles()
s390/pai_crypto: Rename PAI Crypto event 4210
scripts/gdb/symbols: make lx-symbols skip the s390 decompressor
s390/boot: Introduce jump_to_kernel() function
s390/stp: Remove udelay from stp_sync_clock()
s390/early: Copy last breaking event address to pt_regs
s390/smp: Remove conditional emergency signal order code usage
s390/uaccess: Merge cmpxchg_user_key() inline assemblies
s390/uaccess: Prevent kprobes on cmpxchg_user_key() functions
s390/uaccess: Initialize code pages executed with non-default access key
s390/skey: Provide infrastructure for executing with non-default access key
s390/uaccess: Make cmpxchg_user_key() library code
s390/page: Add memory clobber to page_set_storage_key()
s390/page: Cleanup page_set_storage_key() inline assemblies
...
Core scheduler changes:
- Better tracking of maximum lag of tasks in presence of different
slices duration, for better handling of lag in the fair
scheduler. (Vincent Guittot)
- Clean up and standardize #if/#else/#endif markers throughout
the entire scheduler code base (Ingo Molnar)
- Make SMP unconditional: build the SMP scheduler's
data structures and logic on UP kernel too, even though
they are not used, to simplify the scheduler and remove
around 200 #ifdef/[#else]/#endif blocks from the
scheduler. (Ingo Molnar)
- Reorganize cgroup bandwidth control interface handling
for better interfacing with sched_ext (Tejun Heo)
Balancing:
- Bump sd->max_newidle_lb_cost when newidle balance fails (Chris Mason)
- Remove sched_domain_topology_level::flags to simplify the code (Prateek Nayak)
- Simplify and clean up build_sched_topology() (Li Chen)
- Optimize build_sched_topology() on large machines (Li Chen)
Real-time scheduling:
- Add initial version of proxy execution: a mechanism for mutex-owning
tasks to inherit the scheduling context of higher priority waiters.
Currently limited to a single runqueue and conditional on CONFIG_EXPERT,
and other limitations. (John Stultz, Peter Zijlstra, Valentin Schneider)
- Deadline scheduler (Juri Lelli):
- Fix dl_servers initialization order (Juri Lelli)
- Fix DL scheduler's root domain reinitialization logic (Juri Lelli)
- Fix accounting bugs after global limits change (Juri Lelli)
- Fix scalability regression by implementing less agressive dl_server handling
(Peter Zijlstra)
PSI:
- Improve scalability by optimizing psi_group_change() cpu_clock() usage
(Peter Zijlstra)
Rust changes:
- Make Task, CondVar and PollCondVar methods inline to avoid unnecessary
function calls (Kunwu Chan, Panagiotis Foliadis)
- Add might_sleep() support for Rust code: Rust's "#[track_caller]"
mechanism is used so that Rust's might_sleep() doesn't need to be
defined as a macro (Fujita Tomonori)
- Introduce file_from_location() (Boqun Feng)
Debugging & instrumentation:
- Make clangd usable with scheduler source code files again (Peter Zijlstra)
- tools: Add root_domains_dump.py which dumps root domains info (Juri Lelli)
- tools: Add dl_bw_dump.py for printing bandwidth accounting info (Juri Lelli)
Misc cleanups & fixes:
- Remove play_idle() (Feng Lee)
- Fix check_preemption_disabled() (Sebastian Andrzej Siewior)
- Do not call __put_task_struct() on RT if pi_blocked_on is set
(Luis Claudio R. Goncalves)
- Correct the comment in place_entity() (wang wei)
Signed-off-by: Ingo Molnar <mingo@kernel.org>
-----BEGIN PGP SIGNATURE-----
iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAmiHHNIRHG1pbmdvQGtl
cm5lbC5vcmcACgkQEnMQ0APhK1g7DhAAg9aMW33PuC24A4hCS1XQay6j3rgmR5qC
AOqDofj/CY4Q374HQtOl4m5CYZB/G5csRv6TZliWQKhAy9vr6VWddoyOMJYOAlAx
XRurl1Z3MriOMD6DPgNvtHd5PrR5Un8ygALgT+32d0PRz27KNXORW5TyvEf2Bv4r
BX4/GazlOlK0PdGUdZl0q/3dtkU4Wr5IifQzT8KbarOSBbNwZwVcg+83hLW5gJMx
LgMGLaAATmiN7VuvJWNDATDfEOmOvQOu8veoS8TuP1AOVeJPfPT2JVh9Jen5V1/5
3w1RUOkUI2mQX+cujWDW3koniSxjsA1OegXfHnFkF5BXp4q5e54k6D5sSh1xPFDX
iDhkU5jsbKkkJS2ulD6Vi4bIAct3apMl4IrbJn/OYOLcUVI8WuunHs4UPPEuESAS
TuQExKSdj4Ntrzo3pWEy8kX3/Z9VGa+WDzwsPUuBSvllB5Ir/jjKgvkxPA6zGsiY
rbkmZT8qyI01IZ/GXqfI2AQYCGvgp+SOvFPi755ZlELTQS6sUkGZH2/2M5XnKA9t
Z1wB2iwttoS1VQInx0HgiiAGrXrFkr7IzSIN2T+CfWIqilnL7+nTxzwlJtC206P4
DB97bF6azDtJ6yh1LetRZ1ZMX/Gr56Cy0Z6USNoOu+a12PLqlPk9+fPBBpkuGcdy
BRk8KgysEuk=
=8T0v
-----END PGP SIGNATURE-----
Merge tag 'sched-core-2025-07-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull scheduler updates from Ingo Molnar:
"Core scheduler changes:
- Better tracking of maximum lag of tasks in presence of different
slices duration, for better handling of lag in the fair scheduler
(Vincent Guittot)
- Clean up and standardize #if/#else/#endif markers throughout the
entire scheduler code base (Ingo Molnar)
- Make SMP unconditional: build the SMP scheduler's data structures
and logic on UP kernel too, even though they are not used, to
simplify the scheduler and remove around 200 #ifdef/[#else]/#endif
blocks from the scheduler (Ingo Molnar)
- Reorganize cgroup bandwidth control interface handling for better
interfacing with sched_ext (Tejun Heo)
Balancing:
- Bump sd->max_newidle_lb_cost when newidle balance fails (Chris
Mason)
- Remove sched_domain_topology_level::flags to simplify the code
(Prateek Nayak)
- Simplify and clean up build_sched_topology() (Li Chen)
- Optimize build_sched_topology() on large machines (Li Chen)
Real-time scheduling:
- Add initial version of proxy execution: a mechanism for
mutex-owning tasks to inherit the scheduling context of higher
priority waiters.
Currently limited to a single runqueue and conditional on
CONFIG_EXPERT, and other limitations (John Stultz, Peter Zijlstra,
Valentin Schneider)
- Deadline scheduler (Juri Lelli):
- Fix dl_servers initialization order (Juri Lelli)
- Fix DL scheduler's root domain reinitialization logic (Juri
Lelli)
- Fix accounting bugs after global limits change (Juri Lelli)
- Fix scalability regression by implementing less agressive
dl_server handling (Peter Zijlstra)
PSI:
- Improve scalability by optimizing psi_group_change() cpu_clock()
usage (Peter Zijlstra)
Rust changes:
- Make Task, CondVar and PollCondVar methods inline to avoid
unnecessary function calls (Kunwu Chan, Panagiotis Foliadis)
- Add might_sleep() support for Rust code: Rust's "#[track_caller]"
mechanism is used so that Rust's might_sleep() doesn't need to be
defined as a macro (Fujita Tomonori)
- Introduce file_from_location() (Boqun Feng)
Debugging & instrumentation:
- Make clangd usable with scheduler source code files again (Peter
Zijlstra)
- tools: Add root_domains_dump.py which dumps root domains info (Juri
Lelli)
- tools: Add dl_bw_dump.py for printing bandwidth accounting info
(Juri Lelli)
Misc cleanups & fixes:
- Remove play_idle() (Feng Lee)
- Fix check_preemption_disabled() (Sebastian Andrzej Siewior)
- Do not call __put_task_struct() on RT if pi_blocked_on is set (Luis
Claudio R. Goncalves)
- Correct the comment in place_entity() (wang wei)"
* tag 'sched-core-2025-07-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (84 commits)
sched/idle: Remove play_idle()
sched: Do not call __put_task_struct() on rt if pi_blocked_on is set
sched: Start blocked_on chain processing in find_proxy_task()
sched: Fix proxy/current (push,pull)ability
sched: Add an initial sketch of the find_proxy_task() function
sched: Fix runtime accounting w/ split exec & sched contexts
sched: Move update_curr_task logic into update_curr_se
locking/mutex: Add p->blocked_on wrappers for correctness checks
locking/mutex: Rework task_struct::blocked_on
sched: Add CONFIG_SCHED_PROXY_EXEC & boot argument to enable/disable
sched/topology: Remove sched_domain_topology_level::flags
x86/smpboot: avoid SMT domain attach/destroy if SMT is not enabled
x86/smpboot: moves x86_topology to static initialize and truncate
x86/smpboot: remove redundant CONFIG_SCHED_SMT
smpboot: introduce SDTL_INIT() helper to tidy sched topology setup
tools/sched: Add dl_bw_dump.py for printing bandwidth accounting info
tools/sched: Add root_domains_dump.py which dumps root domains info
sched/deadline: Fix accounting after global limits change
sched/deadline: Reset extra_bw to max_bw when clearing root domains
sched/deadline: Initialize dl_servers after SMP
...
Changes
-------
* Add trivial kunit test for ratelimit
* Make the ratelimit test more reliable (Petr Mladek)
* Add stress test for ratelimit
-----BEGIN PGP SIGNATURE-----
iQJHBAABCgAxFiEEbK7UrM+RBIrCoViJnr8S83LZ+4wFAmiBbmkTHHBhdWxtY2tA
a2VybmVsLm9yZwAKCRCevxLzctn7jGDHD/wLrUvsUk494xhGudBd+uSy4JPtC3TZ
s53I/IwUwmWUzPFW3kCwwDmq6LDVRg2I7kPh/6uAniMsOXF2watNrGloDu3ZJfqP
Z3H5pXQDHi8Y/vzU/S/jUIjGqw9Zaop/3zVcNBdYH593Vnie8QTn6+SFtgBW93+J
c0ScaazRrBBri8n/S5dC0wcjebQQjOeThGhOOeKIQHWauqNhLwqoRGKyNGNHVfux
2jQIRThv1vcVpjMQo5D3g6prZpI6XBQ3D3EkHOHjGChW6G89EEvtdUghlB56SG2d
mK7CPUd7zI/QI/33p4wsbqffZUO5j0Tveoir8/Xw69YYpP1HVAU8g6B5Y6eqs7Gv
4s8r9XgCmwFDrl33Gkgm7JBPOqixTu3zk+MBAAeo8MKrox1hamms6pBI0KWUf9QW
UOGEW4GfPc4VS7WRF9/JCTMpBZ9tBB7eTBYMy4hfrHBhsMKXfy7xHwK139PUu/JS
W6O8BQnBfJ+ZOSq5oJpeV/hiyVrk/0nQfqjzMfA8zyF5IlGC8x5MiYXsFn61VJw2
k8WmfF//pU6O3NDwfJLmv2Oi5NY/0QuI597TxGm/Cw0Ns/qtTtQ/BEZZsQ0V2fxg
bJoFRrxOBI3fw9UPzW5LGfNUcMG5VsxuQTgRglrul9+WQIsDkZqyRQD+FP2T7hfr
IcXU+V5M4nryYw==
=DDJ8
-----END PGP SIGNATURE-----
Merge tag 'ratelimit.2025.07.23a' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu
Pull ratelimit test updates from Paul McKenney:
"Add functional and stress tests:
- Add trivial kunit test for ratelimit
- Make the ratelimit test more reliable (Petr Mladek)
- Add stress test for ratelimit"
* tag 'ratelimit.2025.07.23a' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu:
lib: Add stress test for ratelimit
lib: Make the ratelimit test more reliable
lib: Add trivial kunit test for ratelimit
- Introduce support for auxiliary timekeepers
PTP clocks can be disconnected from the universal CLOCK_TAI reality
for various reasons including regularatory requirements for
functional safety redundancy.
The kernel so far only supports a single notion of time, which means
that all clocks are correlated in frequency and only differ by
offset to each other.
Access to non-correlated PTP clocks has been available so far only
through the file descriptor based "POSIX clock IDs", which are
subject to locking and have to go all the way out to the hardware.
The access is not only horribly slow, as it has to go all the way out
to the NIC/PTP hardware, but that also prevents the kernel to read
the time of such clocks e.g. from the network stack, where it is
required for TSN networking both on the transmit and receive side
unless the hardware provides offloading.
The auxiliary clocks provide a mechanism to support arbitrary clocks
which are not correlated to the system clock. This is not restricted
to the PTP use case on purpose as there is no kernel side association
of these clocks to a particular PTP device because that's a pure user
space configuration decision. Having them independent allows to
utilize them for other purposes and also enables them to be tested
without hardware dependencies.
To avoid pointless overhead these clocks have to be enabled
individualy via a new sysfs interface to reduce the overhead to a
single compare in the hotpath if they are enabled at the Kconfig
level at all.
These clocks utilize the existing timekeeping/NTP infrastructures,
which has been made possible over the recent releases by incrementaly
converting these infrastructures over from a single static instance
to a multi-instance pointer based implementation without any
performance regression reported.
The auxiliary clocks provide the same "emulation" of a "correct"
clock as the existing CLOCK_* variants do with an independent
instance of data and provide the same steering mechanism through the
existing sys_clock_adjtime() interface, which has been confirmed to
work by the chronyd(8) maintainer.
That allows to provide lockless kernel internal and VDSO support so
that applications and kernel internal functionalities can access
these clocks without restrictions and at the same performance as the
existing system clocks.
- Avoid double notifications in the adjtimex() syscall. Not a big issue,
but a trivial to avoid latency source.
-----BEGIN PGP SIGNATURE-----
iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAmiGo/MTHHRnbHhAbGlu
dXRyb25peC5kZQAKCRCmGPVMDXSYoWTIEACy/HC2OD7IbAzECgwQUvo59xvmw6ak
p1wRNYpTjOLUgWZjcl/jQfh7jYe60/0PrIzYgAeltXSGwOVtqQDrUIWrTKrAHOUa
wqKUCEfCucTUJRKLQ1Ktnjy/2Pp0Ojpf32Av0v/wgLUMxQk9Av39UdQwMOGyoHOa
07//lrVzfYfqe5Ne7cmuZSbVcHlKyWpXtSvPiVhyk+tHZea4646Pz17sBeVsefps
41mxZBRk7VNiE8yWtRWYKcaXxE/0nYkptjhXOqgmNRTGB/WfyKavDYVLWe31XPrI
G3/QcAAJHBEYZgoGMHRn76L+NWNqnuxeFPtSOaGceBks7HwCKdfvrAwaJzJ1Mr22
IxeoPm0Fzdrtjy2L2PM1txGdEI2iErprCNtQolM4BCQzUslsukP/Fts+SOujgpnC
SJ9rbIIeKJzt2dW6kQai+xGw6WIpQyus7Lbt0sEcyBdWi5Bqvh9g1ZXUn8SHY2xx
/aaEBe2J1RGGhNhHD6bSLYMAXoKDPcpZIrwO+2N96Z18uYee0FU5g3JVKGfuuTdo
wYfpK79xsFmhaBDj8pYAJoU3y/v+WycE2pP2oFgBhN49Xcxo1yYIm5ECb9dYesfD
4nW/Av9bI/NR7J4MilXvw2hSmfapgNgcUGb6sgEuSD7M8/UXkfFbcbqp+RcUBiQj
0XGiNzqLnwgzBQ==
=TTjt
-----END PGP SIGNATURE-----
Merge tag 'timers-ptp-2025-07-27' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull timekeeping and VDSO updates from Thomas Gleixner:
- Introduce support for auxiliary timekeepers
PTP clocks can be disconnected from the universal CLOCK_TAI reality
for various reasons including regularatory requirements for
functional safety redundancy.
The kernel so far only supports a single notion of time, which means
that all clocks are correlated in frequency and only differ by offset
to each other.
Access to non-correlated PTP clocks has been available so far only
through the file descriptor based "POSIX clock IDs", which are
subject to locking and have to go all the way out to the hardware.
The access is not only horribly slow, as it has to go all the way out
to the NIC/PTP hardware, but that also prevents the kernel to read
the time of such clocks e.g. from the network stack, where it is
required for TSN networking both on the transmit and receive side
unless the hardware provides offloading.
The auxiliary clocks provide a mechanism to support arbitrary clocks
which are not correlated to the system clock. This is not restricted
to the PTP use case on purpose as there is no kernel side association
of these clocks to a particular PTP device because that's a pure user
space configuration decision. Having them independent allows to
utilize them for other purposes and also enables them to be tested
without hardware dependencies.
To avoid pointless overhead these clocks have to be enabled
individualy via a new sysfs interface to reduce the overhead to a
single compare in the hotpath if they are enabled at the Kconfig
level at all.
These clocks utilize the existing timekeeping/NTP infrastructures,
which has been made possible over the recent releases by incrementaly
converting these infrastructures over from a single static instance
to a multi-instance pointer based implementation without any
performance regression reported.
The auxiliary clocks provide the same "emulation" of a "correct"
clock as the existing CLOCK_* variants do with an independent
instance of data and provide the same steering mechanism through the
existing sys_clock_adjtime() interface, which has been confirmed to
work by the chronyd(8) maintainer.
That allows to provide lockless kernel internal and VDSO support so
that applications and kernel internal functionalities can access
these clocks without restrictions and at the same performance as the
existing system clocks.
- Avoid double notifications in the adjtimex() syscall. Not a big
issue, but a trivial to avoid latency source.
* tag 'timers-ptp-2025-07-27' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (39 commits)
vdso/gettimeofday: Add support for auxiliary clocks
vdso/vsyscall: Update auxiliary clock data in the datapage
vdso: Introduce aux_clock_resolution_ns()
vdso/gettimeofday: Introduce vdso_get_timestamp()
vdso/gettimeofday: Introduce vdso_set_timespec()
vdso/gettimeofday: Introduce vdso_clockid_valid()
vdso/gettimeofday: Return bool from clock_gettime() helpers
vdso/gettimeofday: Return bool from clock_getres() helpers
vdso/helpers: Add helpers for seqlocks of single vdso_clock
vdso/vsyscall: Split up __arch_update_vsyscall() into __arch_update_vdso_clock()
vdso/vsyscall: Introduce a helper to fill clock configurations
timekeeping: Remove the temporary CLOCK_AUX workaround
timekeeping: Provide ktime_get_clock_ts64()
timekeeping: Provide interface to control auxiliary clocks
timekeeping: Provide update for auxiliary timekeepers
timekeeping: Provide adjtimex() for auxiliary clocks
timekeeping: Prepare do_adtimex() for auxiliary clocks
timekeeping: Make do_adjtimex() reusable
timekeeping: Add auxiliary clock support to __timekeeping_inject_offset()
timekeeping: Make timekeeping_inject_offset() reusable
...
Corrects MODULE_IMPORT_NS() syntax documentation, makes kunit_test timeout
configurable via a module parameter and a Kconfig option, fixes longest
symbol length test, adds a test for static stub, and adjusts kunit_test
timeout based on test_{suite,case} speed.
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEPZKym/RZuOCGeA/kCwJExA0NQxwFAmiH3FMACgkQCwJExA0N
QxzTJQ/5AZiQXo6dbUTkmHuG14YZtWVnZ9AJMqP7tfr5s3kXRL8v96RqbLAV+2GY
jZfZkcQInzQRw464NmJcad9V0fPEUQo8Hn5lx2iPQpAUy8c0EFim4vXs9zrO246R
VFx9USrGBlhm6NAtPbbvTgcUtigaciur9/uEb4r4lTWB12dgYqfjGsII0ZYoNEUg
AXlDq950AnZktnXoN/4/fybwtB9NnYriobUn7jrDDcVoXtpSCMgWyBxsWhpEnl6P
TRMcYjs6mC1Txbfrhj9GiS/qcAVg4g9Ft3JtFc3MosyWF8tA/v9RTMZcPTqSpoDu
di94Mb4FXm9NpEk6nFCLLR+zYXbTsLLkRUmE1Xx6/uCmmocF2vHH302rH6exZ5iT
N9jXPPGyEk9FimZSKqf7hoZpPK5lDSqQhTttBJaNEw3j+Gx4cil0Sz02mHZc/5nE
ng1DjMsf1a5XRWoyxWjXuWcD1sjvLl3yvLHxD50QTSNsTVr6DJTLUipeBqC7xxGc
1uHZDjnMWAgHr8XHF4hyIy/eNPjns/htCAoFCspB8iLhMEkwQmjExaT/Z5lOwvMs
QamUz7SCcGPqUe+qG9zS3l4PqybZQvDgLygBNwDZokfLolHCLKizzISfDz0ktaK8
JyBCX3pxZfuAnBs8q5rt7NV4xOgMpp/xOIxqejWZ+CN5ku3vTb0=
=WIJH
-----END PGP SIGNATURE-----
Merge tag 'linux_kselftest-kunit-6.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest
Pull kunit updates from Shuah Khan:
"Correct MODULE_IMPORT_NS() syntax documentation, make kunit_test
timeout configurable via a module parameter and a Kconfig option, fix
longest symbol length test, add a test for static stub, and adjust
kunit_test timeout based on test_{suite,case} speed"
* tag 'linux_kselftest-kunit-6.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
kunit: fix longest symbol length test
kunit: Make default kunit_test timeout configurable via both a module parameter and a Kconfig option
kunit: Adjust kunit_test timeout based on test_{suite,case} speed
kunit: Add test for static stub
Documentation: kunit: Correct MODULE_IMPORT_NS() syntax
Here is the big set of char/misc/iio and other smaller driver subsystems
for 6.17-rc1. It's a big set this time around, with the huge majority
being in the iio subsystem with new drivers and dts files being added
there.
Highlights include:
- IIO driver updates, additions, and changes making more code const
and cleaning up some init logic
- bus_type constant conversion changes
- misc device test functions added
- rust miscdevice minor fixup
- unused function removals for some drivers
- mei driver updates
- mhi driver updates
- interconnect driver updates
- Android binder updates and test infrastructure added
- small cdx driver updates
- small comedi fixes
- small nvmem driver updates
- small pps driver updates
- some acrn virt driver fixes for printk messages
- other small driver updates
All of these have been in linux-next with no reported issues.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCaIeYDg8cZ3JlZ0Brcm9h
aC5jb20ACgkQMUfUDdst+yk3dwCdF6xoEVZaDkLM5IF3XKWWgdYxjNsAoKUy2kUx
YtmVh4S0tMmugfeRGac7
=3Wxi
-----END PGP SIGNATURE-----
Merge tag 'char-misc-6.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc
Pull char / misc / IIO / other driver updates from Greg KH:
"Here is the big set of char/misc/iio and other smaller driver
subsystems for 6.17-rc1. It's a big set this time around, with the
huge majority being in the iio subsystem with new drivers and dts
files being added there.
Highlights include:
- IIO driver updates, additions, and changes making more code const
and cleaning up some init logic
- bus_type constant conversion changes
- misc device test functions added
- rust miscdevice minor fixup
- unused function removals for some drivers
- mei driver updates
- mhi driver updates
- interconnect driver updates
- Android binder updates and test infrastructure added
- small cdx driver updates
- small comedi fixes
- small nvmem driver updates
- small pps driver updates
- some acrn virt driver fixes for printk messages
- other small driver updates
All of these have been in linux-next with no reported issues"
* tag 'char-misc-6.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (292 commits)
binder: Use seq_buf in binder_alloc kunit tests
binder: Add copyright notice to new kunit files
misc: ti_fpc202: Switch to of_fwnode_handle()
bus: moxtet: Use dev_fwnode()
pc104: move PC104 option to drivers/Kconfig
drivers: virt: acrn: Don't use %pK through printk
comedi: fix race between polling and detaching
interconnect: qcom: Add Milos interconnect provider driver
dt-bindings: interconnect: document the RPMh Network-On-Chip Interconnect in Qualcomm Milos SoC
mei: more prints with client prefix
mei: bus: use cldev in prints
bus: mhi: host: pci_generic: Add Telit FN990B40 modem support
bus: mhi: host: Detect events pointing to unexpected TREs
bus: mhi: host: pci_generic: Add Foxconn T99W696 modem
bus: mhi: host: Use str_true_false() helper
bus: mhi: host: pci_generic: Add support for EM929x and set MRU to 32768 for better performance.
bus: mhi: host: Fix endianness of BHI vector table
bus: mhi: host: pci_generic: Disable runtime PM for QDU100
bus: mhi: host: pci_generic: Fix the modem name of Foxconn T99W640
dt-bindings: interconnect: qcom,msm8998-bwmon: Allow 'nonposted-mmio'
...
Add KUnit test suites for the Poly1305, SHA-1, SHA-224, SHA-256,
SHA-384, and SHA-512 library functions.
These are the first KUnit tests for lib/crypto/. So in addition to
being useful tests for these specific algorithms, they also establish
some conventions for lib/crypto/ testing going forwards.
The new tests are fairly comprehensive: more comprehensive than the
generic crypto infrastructure's tests. They use a variety of
techniques to check for the types of implementation bugs that tend to
occur in the real world, rather than just naively checking some test
vectors. (Interestingly, poly1305_kunit found a bug in QEMU.)
The core test logic is shared by all six algorithms, rather than being
duplicated for each algorithm.
Each algorithm's test suite also optionally includes a benchmark.
-----BEGIN PGP SIGNATURE-----
iIoEABYIADIWIQSacvsUNc7UX4ntmEPzXCl4vpKOKwUCaIZ+WhQcZWJpZ2dlcnNA
a2VybmVsLm9yZwAKCRDzXCl4vpKOK+C4AQCOi7iOlFLouUfu9klrovp3i/iSMhyQ
gUEHPGSelBy4wQD+NnrLGIdpCcaDAzyWpT4TxG6esN2/97ewh4VUa2MDuQQ=
=rS0F
-----END PGP SIGNATURE-----
Merge tag 'libcrypto-tests-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux
Pull crypto library test updates from Eric Biggers:
"Add KUnit test suites for the Poly1305, SHA-1, SHA-224, SHA-256,
SHA-384, and SHA-512 library functions.
These are the first KUnit tests for lib/crypto/. So in addition to
being useful tests for these specific algorithms, they also establish
some conventions for lib/crypto/ testing going forwards.
The new tests are fairly comprehensive: more comprehensive than the
generic crypto infrastructure's tests. They use a variety of
techniques to check for the types of implementation bugs that tend to
occur in the real world, rather than just naively checking some test
vectors. (Interestingly, poly1305_kunit found a bug in QEMU)
The core test logic is shared by all six algorithms, rather than being
duplicated for each algorithm.
Each algorithm's test suite also optionally includes a benchmark"
* tag 'libcrypto-tests-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux:
lib/crypto: tests: Annotate worker to be on stack
lib/crypto: tests: Add KUnit tests for SHA-1 and HMAC-SHA1
lib/crypto: tests: Add KUnit tests for Poly1305
lib/crypto: tests: Add KUnit tests for SHA-384 and SHA-512
lib/crypto: tests: Add KUnit tests for SHA-224 and SHA-256
lib/crypto: tests: Add hash-test-template.h and gen-hash-testvecs.py
This is the main crypto library pull request for 6.17. The main focus
this cycle is on reorganizing the SHA-1 and SHA-2 code, providing
high-quality library APIs for SHA-1 and SHA-2 including HMAC support,
and establishing conventions for lib/crypto/ going forward:
- Migrate the SHA-1 and SHA-512 code (and also SHA-384 which shares
most of the SHA-512 code) into lib/crypto/. This includes both the
generic and architecture-optimized code. Greatly simplify how the
architecture-optimized code is integrated. Add an easy-to-use
library API for each SHA variant, including HMAC support. Finally,
reimplement the crypto_shash support on top of the library API.
- Apply the same reorganization to the SHA-256 code (and also SHA-224
which shares most of the SHA-256 code). This is a somewhat smaller
change, due to my earlier work on SHA-256. But this brings in all
the same additional improvements that I made for SHA-1 and SHA-512.
There are also some smaller changes:
- Move the architecture-optimized ChaCha, Poly1305, and BLAKE2s code
from arch/$(SRCARCH)/lib/crypto/ to lib/crypto/$(SRCARCH)/. For
these algorithms it's just a move, not a full reorganization yet.
- Fix the MIPS chacha-core.S to build with the clang assembler.
- Fix the Poly1305 functions to work in all contexts.
- Fix a performance regression in the x86_64 Poly1305 code.
- Clean up the x86_64 SHA-NI optimized SHA-1 assembly code.
Note that since the new organization of the SHA code is much simpler,
the diffstat of this pull request is negative, despite the addition of
new fully-documented library APIs for multiple SHA and HMAC-SHA
variants. These APIs will allow further simplifications across the
kernel as users start using them instead of the old-school crypto API.
(I've already written a lot of such conversion patches, removing over
1000 more lines of code. But most of those will target 6.18 or later.)
-----BEGIN PGP SIGNATURE-----
iIoEABYIADIWIQSacvsUNc7UX4ntmEPzXCl4vpKOKwUCaIZ93BQcZWJpZ2dlcnNA
a2VybmVsLm9yZwAKCRDzXCl4vpKOK8HCAQD3O9P0qd6wscne5XuRwaybzKHQ2AqU
OlhlDZWQQEvYAgD/aa6KP/DS+8RKGj0TBn6bACAJyXyDygFXq5a5s9pGzAs=
=UmMM
-----END PGP SIGNATURE-----
Merge tag 'libcrypto-updates-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux
Pull crypto library updates from Eric Biggers:
"This is the main crypto library pull request for 6.17. The main focus
this cycle is on reorganizing the SHA-1 and SHA-2 code, providing
high-quality library APIs for SHA-1 and SHA-2 including HMAC support,
and establishing conventions for lib/crypto/ going forward:
- Migrate the SHA-1 and SHA-512 code (and also SHA-384 which shares
most of the SHA-512 code) into lib/crypto/. This includes both the
generic and architecture-optimized code. Greatly simplify how the
architecture-optimized code is integrated. Add an easy-to-use
library API for each SHA variant, including HMAC support. Finally,
reimplement the crypto_shash support on top of the library API.
- Apply the same reorganization to the SHA-256 code (and also SHA-224
which shares most of the SHA-256 code). This is a somewhat smaller
change, due to my earlier work on SHA-256. But this brings in all
the same additional improvements that I made for SHA-1 and SHA-512.
There are also some smaller changes:
- Move the architecture-optimized ChaCha, Poly1305, and BLAKE2s code
from arch/$(SRCARCH)/lib/crypto/ to lib/crypto/$(SRCARCH)/. For
these algorithms it's just a move, not a full reorganization yet.
- Fix the MIPS chacha-core.S to build with the clang assembler.
- Fix the Poly1305 functions to work in all contexts.
- Fix a performance regression in the x86_64 Poly1305 code.
- Clean up the x86_64 SHA-NI optimized SHA-1 assembly code.
Note that since the new organization of the SHA code is much simpler,
the diffstat of this pull request is negative, despite the addition of
new fully-documented library APIs for multiple SHA and HMAC-SHA
variants.
These APIs will allow further simplifications across the kernel as
users start using them instead of the old-school crypto API. (I've
already written a lot of such conversion patches, removing over 1000
more lines of code. But most of those will target 6.18 or later)"
* tag 'libcrypto-updates-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux: (67 commits)
lib/crypto: arm64/sha512-ce: Drop compatibility macros for older binutils
lib/crypto: x86/sha1-ni: Convert to use rounds macros
lib/crypto: x86/sha1-ni: Minor optimizations and cleanup
crypto: sha1 - Remove sha1_base.h
lib/crypto: x86/sha1: Migrate optimized code into library
lib/crypto: sparc/sha1: Migrate optimized code into library
lib/crypto: s390/sha1: Migrate optimized code into library
lib/crypto: powerpc/sha1: Migrate optimized code into library
lib/crypto: mips/sha1: Migrate optimized code into library
lib/crypto: arm64/sha1: Migrate optimized code into library
lib/crypto: arm/sha1: Migrate optimized code into library
crypto: sha1 - Use same state format as legacy drivers
crypto: sha1 - Wrap library and add HMAC support
lib/crypto: sha1: Add HMAC support
lib/crypto: sha1: Add SHA-1 library functions
lib/crypto: sha1: Rename sha1_init() to sha1_init_raw()
crypto: x86/sha1 - Rename conflicting symbol
lib/crypto: sha2: Add hmac_sha*_init_usingrawkey()
lib/crypto: arm/poly1305: Remove unneeded empty weak function
lib/crypto: x86/poly1305: Fix performance regression on short messages
...
Updates for the kernel's CRC (cyclic redundancy check) code:
- Reorganize the architecture-optimized CRC code. It now lives in
lib/crc/$(SRCARCH)/ rather than arch/$(SRCARCH)/lib/, and it is no
longer artificially split into separate generic and arch modules.
This allows better inlining and dead code elimination. The generic
CRC code is also no longer exported, simplifying the API. (This
mirrors the similar changes to SHA-1 and SHA-2 in lib/crypto/,
which can be found in the "Crypto library updates" pull request.)
- Improve crc32c() performance on newer x86_64 CPUs on long messages
by enabling the VPCLMULQDQ optimized code.
- Simplify the crypto_shash wrappers for crc32_le() and crc32c().
Register just one shash algorithm for each that uses the (fully
optimized) library functions, instead of unnecessarily providing
direct access to the generic CRC code.
- Remove unused and obsolete drivers for hardware CRC engines.
- Remove CRC-32 combination functions that are no longer used.
- Add kerneldoc for crc32_le(), crc32_be(), and crc32c().
- Convert the crc32() macro to an inline function.
-----BEGIN PGP SIGNATURE-----
iIoEABYIADIWIQSacvsUNc7UX4ntmEPzXCl4vpKOKwUCaIZ8rRQcZWJpZ2dlcnNA
a2VybmVsLm9yZwAKCRDzXCl4vpKOK3yOAP9OuoCirD42ZHNSgQeGTzhhZ2jCHiPN
BPvHChwtE2MSRwEA0ddNX36aOiEKmpjog3TMllOIBz7wBrwZV7KgoX75+AU=
=uAY8
-----END PGP SIGNATURE-----
Merge tag 'crc-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux
Pull CRC updates from Eric Biggers:
- Reorganize the architecture-optimized CRC code
It now lives in lib/crc/$(SRCARCH)/ rather than arch/$(SRCARCH)/lib/,
and it is no longer artificially split into separate generic and arch
modules. This allows better inlining and dead code elimination
The generic CRC code is also no longer exported, simplifying the API.
(This mirrors the similar changes to SHA-1 and SHA-2 in lib/crypto/,
which can be found in the "Crypto library updates" pull request)
- Improve crc32c() performance on newer x86_64 CPUs on long messages by
enabling the VPCLMULQDQ optimized code
- Simplify the crypto_shash wrappers for crc32_le() and crc32c()
Register just one shash algorithm for each that uses the (fully
optimized) library functions, instead of unnecessarily providing
direct access to the generic CRC code
- Remove unused and obsolete drivers for hardware CRC engines
- Remove CRC-32 combination functions that are no longer used
- Add kerneldoc for crc32_le(), crc32_be(), and crc32c()
- Convert the crc32() macro to an inline function
* tag 'crc-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux: (26 commits)
lib/crc: x86/crc32c: Enable VPCLMULQDQ optimization where beneficial
lib/crc: x86: Reorganize crc-pclmul static_call initialization
lib/crc: crc64: Add include/linux/crc64.h to kernel-api.rst
lib/crc: crc32: Change crc32() from macro to inline function and remove cast
nvmem: layouts: Switch from crc32() to crc32_le()
lib/crc: crc32: Document crc32_le(), crc32_be(), and crc32c()
lib/crc: Explicitly include <linux/export.h>
lib/crc: Remove ARCH_HAS_* kconfig symbols
lib/crc: x86: Migrate optimized CRC code into lib/crc/
lib/crc: sparc: Migrate optimized CRC code into lib/crc/
lib/crc: s390: Migrate optimized CRC code into lib/crc/
lib/crc: riscv: Migrate optimized CRC code into lib/crc/
lib/crc: powerpc: Migrate optimized CRC code into lib/crc/
lib/crc: mips: Migrate optimized CRC code into lib/crc/
lib/crc: loongarch: Migrate optimized CRC code into lib/crc/
lib/crc: arm64: Migrate optimized CRC code into lib/crc/
lib/crc: arm: Migrate optimized CRC code into lib/crc/
lib/crc: Prepare for arch-optimized code in subdirs of lib/crc/
lib/crc: Move files into lib/crc/
lib/crc32: Remove unused combination support
...
- Introduce and start using TRAILING_OVERLAP() helper for fixing
embedded flex array instances (Gustavo A. R. Silva)
- mux: Convert mux_control_ops to a flex array member in mux_chip
(Thorsten Blum)
- string: Group str_has_prefix() and strstarts() (Andy Shevchenko)
- Remove KCOV instrumentation from __init and __head (Ritesh Harjani,
Kees Cook)
- Refactor and rename stackleak feature to support Clang
- Add KUnit test for seq_buf API
- Fix KUnit fortify test under LTO
-----BEGIN PGP SIGNATURE-----
iHUEABYKAB0WIQRSPkdeREjth1dHnSE2KwveOeQkuwUCaIfUkgAKCRA2KwveOeQk
uypLAP92r6f47sWcOw/5B9aVffX6Bypsb7dqBJQpCNxI5U1xcAEAiCrZ98UJyOeQ
JQgnXd4N67K4EsS2JDc+FutRn3Yi+A8=
=+5Bq
-----END PGP SIGNATURE-----
Merge tag 'hardening-v6.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux
Pull hardening updates from Kees Cook:
- Introduce and start using TRAILING_OVERLAP() helper for fixing
embedded flex array instances (Gustavo A. R. Silva)
- mux: Convert mux_control_ops to a flex array member in mux_chip
(Thorsten Blum)
- string: Group str_has_prefix() and strstarts() (Andy Shevchenko)
- Remove KCOV instrumentation from __init and __head (Ritesh Harjani,
Kees Cook)
- Refactor and rename stackleak feature to support Clang
- Add KUnit test for seq_buf API
- Fix KUnit fortify test under LTO
* tag 'hardening-v6.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: (22 commits)
sched/task_stack: Add missing const qualifier to end_of_stack()
kstack_erase: Support Clang stack depth tracking
kstack_erase: Add -mgeneral-regs-only to silence Clang warnings
init.h: Disable sanitizer coverage for __init and __head
kstack_erase: Disable kstack_erase for all of arm compressed boot code
x86: Handle KCOV __init vs inline mismatches
arm64: Handle KCOV __init vs inline mismatches
s390: Handle KCOV __init vs inline mismatches
arm: Handle KCOV __init vs inline mismatches
mips: Handle KCOV __init vs inline mismatch
powerpc/mm/book3s64: Move kfence and debug_pagealloc related calls to __init section
configs/hardening: Enable CONFIG_INIT_ON_FREE_DEFAULT_ON
configs/hardening: Enable CONFIG_KSTACK_ERASE
stackleak: Split KSTACK_ERASE_CFLAGS from GCC_PLUGINS_CFLAGS
stackleak: Rename stackleak_track_stack to __sanitizer_cov_stack_depth
stackleak: Rename STACKLEAK to KSTACK_ERASE
seq_buf: Introduce KUnit tests
string: Group str_has_prefix() and strstarts()
kunit/fortify: Add back "volatile" for sizeof() constants
acpi: nfit: intel: avoid multiple -Wflex-array-member-not-at-end warnings
...
-----BEGIN PGP SIGNATURE-----
iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAmiHdZ8QHGF4Ym9lQGtl
cm5lbC5kawAKCRD301j7KXHgptRED/9o3dQ1QHL5yNM/AyCCGox0V4zra8qGS/Vc
cBWpAVrmPGRw0IYlLZENtN9PdwKcbMzJq3l6cxeC7dBnAZP0AxTzP4YYJYUNVsqo
WtJ3d/k5+cVp0OyOp4uabaqNeMeLoPk9/JXe1Ml2KxtDmHtj5yee0JRh7zlPZmZj
tsrpIUTeHgAPn6yR1EI+0ybx/mjCb05Mv2Y8gF5hkUPA2PuON+MTFixJmqoy2ySh
n+22mz/prqlyOSYh/VVv1+9jcQ94wMjcW0JIpg9lM3Kg8BCPU4IetvO1UiX6X33v
154zEh2aJJDBx+yORS4BM4JMXjRZI7lYea2dkHM8Cajctu1Wpja9bNwnK9ibXvEc
WtyBwztleLbAZef25fA/W87JE23fGa/r3nwIb2cF4QqkAFslCvhjA93WkOzNJCgQ
qsWOrlCh3IK2NUu4b1Ncs3ZHOPvc51+zzjMzC6SUr54xhrxDK+gngDPhRy7XDqWJ
DTMpIlr366o8GdJqnib0/e/CPBrThS6Vl6u0tgLnNbwdpK1svgo/uHW5ksKvDqHX
kGEIhyRRJJC+4wyl4dsYKXa2twcyFrlWdAE+pZguEC2nZRYqYl9uXftOtvfp1x0y
/skDX0FIDjvyjRqCLcqF03FSGqwCGS8WuWXZjPhVhcfz47NvbHeFDh1G/jMzsbpj
S9zrPve/DQ==
=e86T
-----END PGP SIGNATURE-----
Merge tag 'for-6.17/block-20250728' of git://git.kernel.dk/linux
Pull block updates from Jens Axboe:
- MD pull request via Yu:
- call del_gendisk synchronously (Xiao)
- cleanup unused variable (John)
- cleanup workqueue flags (Ryo)
- fix faulty rdev can't be removed during resync (Qixing)
- NVMe pull request via Christoph:
- try PCIe function level reset on init failure (Keith Busch)
- log TLS handshake failures at error level (Maurizio Lombardi)
- pci-epf: do not complete commands twice if nvmet_req_init()
fails (Rick Wertenbroek)
- misc cleanups (Alok Tiwari)
- Removal of the pktcdvd driver
This has been more than a decade coming at this point, and some
recently revealed breakages that had it causing issues even for cases
where it isn't required made me re-pull the trigger on this one. It's
known broken and nobody has stepped up to maintain the code
- Series for ublk supporting batch commands, enabling the use of
multishot where appropriate
- Speed up ublk exit handling
- Fix for the two-stage elevator fixing which could leak data
- Convert NVMe to use the new IOVA based API
- Increase default max transfer size to something more reasonable
- Series fixing write operations on zoned DM devices
- Add tracepoints for zoned block device operations
- Prep series working towards improving blk-mq queue management in the
presence of isolated CPUs
- Don't allow updating of the block size of a loop device that is
currently under exclusively ownership/open
- Set chunk sectors from stacked device stripe size and use it for the
atomic write size limit
- Switch to folios in bcache read_super()
- Fix for CD-ROM MRW exit flush handling
- Various tweaks, fixes, and cleanups
* tag 'for-6.17/block-20250728' of git://git.kernel.dk/linux: (94 commits)
block: restore two stage elevator switch while running nr_hw_queue update
cdrom: Call cdrom_mrw_exit from cdrom_release function
sunvdc: Balance device refcount in vdc_port_mpgroup_check
nvme-pci: try function level reset on init failure
dm: split write BIOs on zone boundaries when zone append is not emulated
block: use chunk_sectors when evaluating stacked atomic write limits
dm-stripe: limit chunk_sectors to the stripe size
md/raid10: set chunk_sectors limit
md/raid0: set chunk_sectors limit
block: sanitize chunk_sectors for atomic write limits
ilog2: add max_pow_of_two_factor()
nvmet: pci-epf: Do not complete commands twice if nvmet_req_init() fails
nvme-tcp: log TLS handshake failures at error level
docs: nvme: fix grammar in nvme-pci-endpoint-target.rst
nvme: fix typo in status code constant for self-test in progress
nvmet: remove redundant assignment of error code in nvmet_ns_enable()
nvme: fix incorrect variable in io cqes error message
nvme: fix multiple spelling and grammar issues in host drivers
block: fix blk_zone_append_update_request_bio() kernel-doc
md/raid10: fix set but not used variable in sync_request_write()
...
Move both uevent_helper table into lib/kobject_uevent.c. Place the
registration early in the initcall order with postcore_initcall.
This is part of a greater effort to move ctl tables into their
respective subsystems which will reduce the merge conflicts in
kernel/sysctl.c.
Signed-off-by: Joel Granados <joel.granados@kernel.org>
In preparation for adding Clang sanitizer coverage stack depth tracking
that can support stack depth callbacks:
- Add the new top-level CONFIG_KSTACK_ERASE option which will be
implemented either with the stackleak GCC plugin, or with the Clang
stack depth callback support.
- Rename CONFIG_GCC_PLUGIN_STACKLEAK as needed to CONFIG_KSTACK_ERASE,
but keep it for anything specific to the GCC plugin itself.
- Rename all exposed "STACKLEAK" names and files to "KSTACK_ERASE" (named
for what it does rather than what it protects against), but leave as
many of the internals alone as possible to avoid even more churn.
While here, also split "prev_lowest_stack" into CONFIG_KSTACK_ERASE_METRICS,
since that's the only place it is referenced from.
Suggested-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20250717232519.2984886-1-kees@kernel.org
Signed-off-by: Kees Cook <kees@kernel.org>
The following warning traceback is seen if object debugging is enabled
with the new crypto test code.
ODEBUG: object 9000000106237c50 is on stack 9000000106234000, but NOT annotated.
------------[ cut here ]------------
WARNING: lib/debugobjects.c:655 at lookup_object_or_alloc.part.0+0x19c/0x1f4, CPU#0: kunit_try_catch/468
...
This also results in a boot stall when running the code in qemu:loongarch.
Initializing the worker with INIT_WORK_ONSTACK() fixes the problem.
Fixes: 950a81224e ("lib/crypto: tests: Add hash-test-template.h and gen-hash-testvecs.py")
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250721231917.3182029-1-linux@roeck-us.net
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Now that the oldest supported binutils version is 2.30, the macros that
emit the SHA-512 instructions as '.inst' words are no longer needed. So
drop them. No change in the generated machine code.
Changed from the original patch by Ard Biesheuvel:
(https://lore.kernel.org/r/20250515142702.2592942-2-ardb+git@google.com):
- Reduced scope to just SHA-512
- Added comment that explains why "sha3" is used instead of "sha2"
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250718220706.475240-1-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
The assembly code that does all 80 rounds of SHA-1 is highly repetitive.
Replace it with 20 expansions of a macro that does 4 rounds, using the
macro arguments and .if directives to handle the slight variations
between rounds. This reduces the length of sha1-ni-asm.S by 129 lines
while still producing the exact same object file. This mirrors
sha256-ni-asm.S which uses this same strategy.
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250718191900.42877-3-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
- Store the previous state in %xmm8-%xmm9 instead of spilling it to the
stack. There are plenty of unused XMM registers here, so there is no
reason to spill to the stack. (While 32-bit code is limited to
%xmm0-%xmm7, this is 64-bit code, so it's free to use %xmm8-%xmm15.)
- Remove the unnecessary check for nblocks == 0. sha1_ni_transform() is
always passed a positive nblocks.
- To get an XMM register with 'e' in the high dword and the rest zeroes,
just zeroize the register using pxor, then load 'e'. Previously the
code loaded 'e', then zeroized the lower dwords by AND-ing with a
constant, which was slightly less efficient.
- Instead of computing &DATA_PTR[NBLOCKS << 6] and stopping when
DATA_PTR reaches that value, instead just decrement NBLOCKS on each
iteration and stop when it reaches 0. This is fewer instructions.
- Rename DIGEST_PTR to STATE_PTR. It points to the SHA-1 internal
state, not a SHA-1 digest value.
This commit shrinks the code size of sha1_ni_transform() from 624 bytes
to 589 bytes and also shrinks rodata by 16 bytes.
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250718191900.42877-2-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Improve crc32c() performance on lengths >= 512 bytes by using
crc32_lsb_vpclmul_avx512() instead of crc32c_x86_3way(), when the CPU
supports VPCLMULQDQ and has a "good" implementation of AVX-512. For now
that means AMD Zen 4 and later, and Intel Sapphire Rapids and later.
Pass crc32_lsb_vpclmul_avx512() the table of constants needed to make it
use the CRC-32C polynomial.
Rationale: VPCLMULQDQ performance has improved on newer CPUs, making
crc32_lsb_vpclmul_avx512() faster than crc32c_x86_3way(), even though
crc32_lsb_vpclmul_avx512() is designed for generic 32-bit CRCs and does
not utilize x86_64's dedicated CRC-32C instructions.
Performance results for len=4096 using crc_kunit:
CPU Before (MB/s) After (MB/s)
====================== ============= ============
AMD Zen 4 (Genoa) 19868 28618
AMD Zen 5 (Ryzen AI 9 365) 24080 46940
AMD Zen 5 (Turin) 29566 58468
Intel Sapphire Rapids 22340 73794
Intel Emerald Rapids 24696 78666
Performance results for len=512 using crc_kunit:
CPU Before (MB/s) After (MB/s)
====================== ============= ============
AMD Zen 4 (Genoa) 7251 7758
AMD Zen 5 (Ryzen AI 9 365) 17481 19135
AMD Zen 5 (Turin) 21332 25424
Intel Sapphire Rapids 18886 29312
Intel Emerald Rapids 19675 29045
That being said, in the above benchmarks the ZMM registers are "warm",
so they don't quite tell the whole story. While significantly improved
from older Intel CPUs, Intel still has ~2000 ns of ZMM warm-up time
where 512-bit instructions execute 4 times more slowly than they
normally do. In contrast, AMD does better and has virtually zero ZMM
warm-up time (at most ~60 ns). Thus, while this change is always
beneficial on AMD, strictly speaking there are cases in which it is not
beneficial on Intel, e.g. a small number of 512-byte messages with
"cold" ZMM registers. But typically, it is beneficial even on Intel.
Note that on AMD Zen 3--5, crc32c() performance could be further
improved with implementations that interleave crc32q and VPCLMULQDQ
instructions. Unfortunately, it appears that a different such
implementation would be optimal on *each* of these microarchitectures.
Such improvements are left for future work. This commit just improves
the way that we choose the implementations we already have.
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250719224938.126512-3-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Reorganize the crc-pclmul static_call initialization to place more of
the logic in the *_mod_init_arch() functions instead of in the
INIT_CRC_PCLMUL macro. This provides the flexibility to do more than a
single static_call update for each CPU feature check. Right away,
optimize crc64_mod_init_arch() to check the CPU features just once
instead of twice, doing both the crc64_msb and crc64_lsb static_call
updates together. A later commit will also use this to initialize an
additional static_key when crc32_lsb_vpclmul_avx512() is enabled.
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250719224938.126512-2-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Update lib/raid6/recov_rvv.c, for 1857fcc847 ("lib/raid6: replace custom
zero page with ZERO_PAGE"), per Klara.
Link: https://lkml.kernel.org/r/aFkUnXWtxcgOTVkw@gondor.apana.org.au
Fixes: 1857fcc847 ("lib/raid6: replace custom zero page with ZERO_PAGE")
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Song Liu <song@kernel.org>
Cc: Yu Kuai <yukuai3@huawei.com>
Cc: Klara Modin <klarasmodin@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Patch series "Optimize GCD performance on RISC-V by selecting
implementation at runtime", v3.
The current implementation of gcd() selects between the binary GCD and the
odd-even GCD algorithm at compile time, depending on whether
CONFIG_CPU_NO_EFFICIENT_FFS is set. On platforms like RISC-V, however,
this compile-time decision can be misleading: even when the compiler emits
ctz instructions based on the assumption that they are efficient (as is
the case when CONFIG_RISCV_ISA_ZBB is enabled), the actual hardware may
lack support for the Zbb extension. In such cases, ffs() falls back to a
software implementation at runtime, making the binary GCD algorithm
significantly slower than the odd-even variant.
To address this, we introduce a static key to allow runtime selection
between the binary and odd-even GCD implementations. On RISC-V, the
kernel now checks for Zbb support during boot. If Zbb is unavailable, the
static key is disabled so that gcd() consistently uses the more efficient
odd-even algorithm in that scenario. Additionally, to further reduce code
size, we select CONFIG_CPU_NO_EFFICIENT_FFS automatically when
CONFIG_RISCV_ISA_ZBB is not enabled, avoiding compilation of the unused
binary GCD implementation entirely on systems where it would never be
executed.
This series ensures that the most efficient GCD algorithm is used in
practice and avoids compiling unnecessary code based on hardware
capabilities and kernel configuration.
This patch (of 3):
On platforms like RISC-V, the compiler may generate hardware FFS
instructions even if the underlying CPU does not actually support them.
Currently, the GCD implementation is chosen at compile time based on
CONFIG_CPU_NO_EFFICIENT_FFS, which can result in suboptimal behavior on
such systems.
Introduce a static key, efficient_ffs_key, to enable runtime selection
between the binary GCD (using ffs) and the odd-even GCD implementation.
This allows the kernel to default to the faster binary GCD when FFS is
efficient, while retaining the ability to fall back when needed.
Link: https://lkml.kernel.org/r/20250606134758.1308400-1-visitorckw@gmail.com
Link: https://lkml.kernel.org/r/20250606134758.1308400-2-visitorckw@gmail.com
Co-developed-by: Yu-Chun Lin <eleanor15x@gmail.com>
Signed-off-by: Yu-Chun Lin <eleanor15x@gmail.com>
Signed-off-by: Kuan-Wei Chiu <visitorckw@gmail.com>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Ching-Chun (Jim) Huang <jserv@ccns.ncku.edu.tw>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Alexandre Ghiti <alexghiti@rivosinc.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Bitmap definition for 'panic_print' is hard to remember and decode. Add
'panic_sys_info='sysctl to take human readable string like
"tasks,mem,timers,locks,ftrace,..." and translate it into bitmap.
The detailed mapping is:
SYS_INFO_TASKS "tasks"
SYS_INFO_MEM "mem"
SYS_INFO_TIMERS "timers"
SYS_INFO_LOCKS "locks"
SYS_INFO_FTRACE "ftrace"
SYS_INFO_ALL_CPU_BT "all_bt"
SYS_INFO_BLOCKED_TASKS "blocked_tasks"
[nathan@kernel.org: add __maybe_unused to sys_info_avail]
Link: https://lkml.kernel.org/r/20250708-fix-clang-sys_info_avail-warning-v1-1-60d239eacd64@kernel.org
Link: https://lkml.kernel.org/r/20250703021004.42328-4-feng.tang@linux.alibaba.com
Signed-off-by: Feng Tang <feng.tang@linux.alibaba.com>
Suggested-by: Petr Mladek <pmladek@suse.com>
Cc: John Ogness <john.ogness@linutronix.de>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Lance Yang <lance.yang@linux.dev>
Cc: "Paul E . McKenney" <paulmck@kernel.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
'panic_print' was introduced to help debugging kernel panic by dumping
different kinds of system information like tasks' call stack, memory,
ftrace buffer, etc. Actually this function could also be used to help
debugging other cases like task-hung, soft/hard lockup, etc. where user
may need the snapshot of system info at that time.
Extract system info dump function related code from panic.c to separate
file sys_info.[ch], for wider usage by other kernel parts for debugging.
Also modify the macro names about singulars/plurals.
Link: https://lkml.kernel.org/r/20250703021004.42328-3-feng.tang@linux.alibaba.com
Signed-off-by: Feng Tang <feng.tang@linux.alibaba.com>
Suggested-by: Petr Mladek <pmladek@suse.com>
Cc: John Ogness <john.ogness@linutronix.de>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Lance Yang <lance.yang@linux.dev>
Cc: "Paul E . McKenney" <paulmck@kernel.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Expose the auxiliary clocks through the vDSO.
Architectures not using the generic vDSO time framework,
namely SPARC64, are not supported.
Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/all/20250701-vdso-auxclock-v1-12-df7d9f87b9b8@linutronix.de
The internal helpers are effectively using boolean results,
while pretending to use error numbers.
Switch the return type to bool for more clarity.
Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/all/20250701-vdso-auxclock-v1-6-df7d9f87b9b8@linutronix.de
Tests can allocate from virtual memory using kunit_vm_mmap(), which
transparently creates and attaches an mm_struct to the test runner if
one is not already attached. This is suitable for most cases, except for
when the code under test must access a task's mm before performing an
mmap. Expose kunit_attach_mm() as part of the interface for those
cases. This does not change the existing behavior.
Cc: David Gow <davidgow@google.com>
Signed-off-by: Tiffany Yang <ynaffit@google.com>
Reviewed-by: Carlos Llamas <cmllamas@google.com>
Link: https://lore.kernel.org/r/20250714185321.2417234-4-ynaffit@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
It seems the Clang can see through OPTIMIZER_HIDE_VAR when the constant
is coming from sizeof. Adding "volatile" back to these variables solves
this false positive without reintroducing the issues that originally led
to switching to OPTIMIZER_HIDE_VAR in the first place[1].
Reported-by: Nathan Chancellor <nathan@kernel.org>
Closes: https://github.com/ClangBuiltLinux/linux/issues/2075 [1]
Cc: Jannik Glückert <jannik.glueckert@gmail.com>
Suggested-by: Nathan Chancellor <nathan@kernel.org>
Fixes: 6ee149f61b ("kunit/fortify: Replace "volatile" with OPTIMIZER_HIDE_VAR()")
Reviewed-by: Nathan Chancellor <nathan@kernel.org>
Link: https://lore.kernel.org/r/20250628234034.work.800-kees@kernel.org
Signed-off-by: Kees Cook <kees@kernel.org>
Add a KUnit test suite for the SHA-1 library functions, including the
corresponding HMAC support. The core test logic is in the
previously-added hash-test-template.h. This commit just adds the actual
KUnit suite, and it adds the generated test vectors to the tree so that
gen-hash-testvecs.py won't have to be run at build time.
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250712232329.818226-16-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Add a KUnit test suite for the Poly1305 functions. Most of its test
cases are instantiated from hash-test-template.h, which is also used by
the SHA-2 tests. A couple additional test cases are also included to
test edge cases specific to Poly1305.
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250709200112.258500-5-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Add KUnit test suites for the SHA-384 and SHA-512 library functions,
including the corresponding HMAC support. The core test logic is in the
previously-added hash-test-template.h. This commit just adds the actual
KUnit suites, and it adds the generated test vectors to the tree so that
gen-hash-testvecs.py won't have to be run at build time.
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250709200112.258500-4-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Add KUnit test suites for the SHA-224 and SHA-256 library functions,
including the corresponding HMAC support. The core test logic is in the
previously-added hash-test-template.h. This commit just adds the actual
KUnit suites, and it adds the generated test vectors to the tree so that
gen-hash-testvecs.py won't have to be run at build time.
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250709200112.258500-3-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Add hash-test-template.h which generates the following KUnit test cases
for hash functions:
test_hash_test_vectors
test_hash_all_lens_up_to_4096
test_hash_incremental_updates
test_hash_buffer_overruns
test_hash_overlaps
test_hash_alignment_consistency
test_hash_ctx_zeroization
test_hash_interrupt_context_1
test_hash_interrupt_context_2
test_hmac (when HMAC is supported)
benchmark_hash (when CONFIG_CRYPTO_LIB_BENCHMARK=y)
The initial use cases for this will be sha224_kunit, sha256_kunit,
sha384_kunit, sha512_kunit, and poly1305_kunit.
Add a Python script gen-hash-testvecs.py which generates the test
vectors required by test_hash_test_vectors,
test_hash_all_lens_up_to_4096, and test_hmac.
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250709200112.258500-2-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Instead of exposing the x86-optimized SHA-1 code via x86-specific
crypto_shash algorithms, instead just implement the sha1_blocks()
library function. This is much simpler, it makes the SHA-1 library
functions be x86-optimized, and it fixes the longstanding issue where
the x86-optimized SHA-1 code was disabled by default. SHA-1 still
remains available through crypto_shash, but individual architectures no
longer need to handle it.
To match sha1_blocks(), change the type of the nblocks parameter of the
assembly functions from int to size_t. The assembly functions actually
already treated it as size_t.
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250712232329.818226-14-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Instead of exposing the sparc-optimized SHA-1 code via sparc-specific
crypto_shash algorithms, instead just implement the sha1_blocks()
library function. This is much simpler, it makes the SHA-1 library
functions be sparc-optimized, and it fixes the longstanding issue where
the sparc-optimized SHA-1 code was disabled by default. SHA-1 still
remains available through crypto_shash, but individual architectures no
longer need to handle it.
Note: to see the diff from arch/sparc/crypto/sha1_glue.c to
lib/crypto/sparc/sha1.h, view this commit with 'git show -M10'.
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250712232329.818226-13-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Instead of exposing the s390-optimized SHA-1 code via s390-specific
crypto_shash algorithms, instead just implement the sha1_blocks()
library function. This is much simpler, it makes the SHA-1 library
functions be s390-optimized, and it fixes the longstanding issue where
the s390-optimized SHA-1 code was disabled by default. SHA-1 still
remains available through crypto_shash, but individual architectures no
longer need to handle it.
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250712232329.818226-12-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Instead of exposing the powerpc-optimized SHA-1 code via
powerpc-specific crypto_shash algorithms, instead just implement the
sha1_blocks() library function. This is much simpler, it makes the
SHA-1 library functions be powerpc-optimized, and it fixes the
longstanding issue where the powerpc-optimized SHA-1 code was disabled
by default. SHA-1 still remains available through crypto_shash, but
individual architectures no longer need to handle it.
Note: to see the diff from arch/powerpc/crypto/sha1-spe-glue.c to
lib/crypto/powerpc/sha1.h, view this commit with 'git show -M10'.
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250712232329.818226-11-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Instead of exposing the mips-optimized SHA-1 code via mips-specific
crypto_shash algorithms, instead just implement the sha1_blocks()
library function. This is much simpler, it makes the SHA-1 library
functions be mips-optimized, and it fixes the longstanding issue where
the mips-optimized SHA-1 code was disabled by default. SHA-1 still
remains available through crypto_shash, but individual architectures no
longer need to handle it.
Note: to see the diff from arch/mips/cavium-octeon/crypto/octeon-sha1.c
to lib/crypto/mips/sha1.h, view this commit with 'git show -M10'.
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250712232329.818226-10-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Instead of exposing the arm64-optimized SHA-1 code via arm64-specific
crypto_shash algorithms, instead just implement the sha1_blocks()
library function. This is much simpler, it makes the SHA-1 library
functions be arm64-optimized, and it fixes the longstanding issue where
the arm64-optimized SHA-1 code was disabled by default. SHA-1 still
remains available through crypto_shash, but individual architectures no
longer need to handle it.
Remove support for SHA-1 finalization from assembly code, since the
library does not yet support architecture-specific overrides of the
finalization. (Support for that has been omitted for now, for
simplicity and because usually it isn't performance-critical.)
To match sha1_blocks(), change the type of the nblocks parameter and the
return value of __sha1_ce_transform() from int to size_t. Update the
assembly code accordingly.
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250712232329.818226-9-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Instead of exposing the arm-optimized SHA-1 code via arm-specific
crypto_shash algorithms, instead just implement the sha1_blocks()
library function. This is much simpler, it makes the SHA-1 library
functions be arm-optimized, and it fixes the longstanding issue where
the arm-optimized SHA-1 code was disabled by default. SHA-1 still
remains available through crypto_shash, but individual architectures no
longer need to handle it.
To match sha1_blocks(), change the type of the nblocks parameter of the
assembly functions from int to size_t. The assembly functions actually
already treated it as size_t.
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250712232329.818226-8-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Add HMAC support to the SHA-1 library, again following what was done for
SHA-2. Besides providing the basis for a more streamlined "hmac(sha1)"
shash, this will also be useful for multiple in-kernel users such as
net/sctp/auth.c, net/ipv6/seg6_hmac.c, and
security/keys/trusted-keys/trusted_tpm1.c. Those are currently using
crypto_shash, but using the library functions would be much simpler.
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250712232329.818226-5-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Add a library interface for SHA-1, following the SHA-2 one. As was the
case with SHA-2, this will be useful for various in-kernel users. The
crypto_shash interface will be reimplemented on top of it as well.
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250712232329.818226-4-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Rename the existing sha1_init() to sha1_init_raw(), since it conflicts
with the upcoming library function. This will later be removed, but
this keeps the kernel building for the introduction of the library.
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250712232329.818226-3-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
While the HMAC library functions support both incremental and one-shot
computation and both prepared and raw keys, the combination of raw key
+ incremental was missing. It turns out that several potential users of
the HMAC library functions (tpm2-sessions.c, smb2transport.c,
trusted_tpm1.c) want exactly that.
Therefore, add the missing functions hmac_sha*_init_usingrawkey().
Implement them in an optimized way that directly initializes the HMAC
context without a separate key preparation step.
Reimplement the one-shot raw key functions hmac_sha*_usingrawkey() on
top of the new functions, which makes them a bit more efficient.
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250711215844.41715-1-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Fix poly1305-armv4.pl to not do '.globl poly1305_blocks_neon' when
poly1305_blocks_neon() is not defined. Then, remove the empty __weak
definition of poly1305_blocks_neon(), which was still needed only
because of that unnecessary globl statement. (It also used to be needed
because the compiler could generate calls to it when
CONFIG_KERNEL_MODE_NEON=n, but that has been fixed.)
Thanks to Arnd Bergmann for reporting that the globl statement in the
asm file was still depending on the weak symbol.
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250711212822.6372-1-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
The test align_shift_alloc_test is expected to fail. Reporting the test
as fail confuses to be a genuine failure. Introduce widely used xfail
sematics to address the issue.
Note: a warn_alloc dump similar to below is still expected:
Call Trace:
<TASK>
dump_stack_lvl+0x64/0x80
warn_alloc+0x137/0x1b0
? __get_vm_area_node+0x134/0x140
Snippet of dmesg after change:
Summary: random_size_align_alloc_test passed: 1 failed: 0 xfailed: 0 ..
Summary: align_shift_alloc_test passed: 0 failed: 0 xfailed: 1 ..
Summary: pcpu_alloc_test passed: 1 failed: 0 xfailed: 0 ..
Link: https://lkml.kernel.org/r/20250702064319.885-1-raghavendra.kt@amd.com
Signed-off-by: Raghavendra K T <raghavendra.kt@amd.com>
Reviewed-by: "Uladzislau Rezki (Sony)" <urezki@gmail.com>
Cc: Dev Jain <dev.jain@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Add comments explaining the fields for maple_metadata, since "end" is
ambiguous and "gap" can be confused as the largest gap, whereas it is
actually the offset of the largest gap.
Add comment for mas_ascend() to explain, whose min and max we are trying
to find. Explain that, for example, if we are already on offset zero,
then the parent min is mas->min, otherwise we need to walk up to find the
implied pivot min.
Link: https://lkml.kernel.org/r/20250703063338.51509-1-dev.jain@arm.com
Signed-off-by: Dev Jain <dev.jain@arm.com>
Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Wei Yang <richard.weiyang@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Restore the len >= 288 condition on using the AVX implementation, which
was incidentally removed by commit 318c53ae02 ("crypto: x86/poly1305 -
Add block-only interface"). This check took into account the overhead
in key power computation, kernel-mode "FPU", and tail handling
associated with the AVX code. Indeed, restoring this check slightly
improves performance for len < 256 as measured using poly1305_kunit on
an "AMD Ryzen AI 9 365" (Zen 5) CPU:
Length Before After
====== ========== ==========
1 30 MB/s 36 MB/s
16 516 MB/s 598 MB/s
64 1700 MB/s 1882 MB/s
127 2265 MB/s 2651 MB/s
128 2457 MB/s 2827 MB/s
200 2702 MB/s 3238 MB/s
256 3841 MB/s 3768 MB/s
511 4580 MB/s 4585 MB/s
512 5430 MB/s 5398 MB/s
1024 7268 MB/s 7305 MB/s
3173 8999 MB/s 8948 MB/s
4096 9942 MB/s 9921 MB/s
16384 10557 MB/s 10545 MB/s
While the optimal threshold for this CPU might be slightly lower than
288 (see the len == 256 case), other CPUs would need to be tested too,
and these sorts of benchmarks can underestimate the true cost of
kernel-mode "FPU". Therefore, for now just restore the 288 threshold.
Fixes: 318c53ae02 ("crypto: x86/poly1305 - Add block-only interface")
Cc: stable@vger.kernel.org
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250706231100.176113-6-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Restore the SIMD usability check and base conversion that were removed
by commit 318c53ae02 ("crypto: x86/poly1305 - Add block-only
interface").
This safety check is cheap and is well worth eliminating a footgun.
While the Poly1305 functions should not be called when SIMD registers
are unusable, if they are anyway, they should just do the right thing
instead of corrupting random tasks' registers and/or computing incorrect
MACs. Fixing this is also needed for poly1305_kunit to pass.
Just use irq_fpu_usable() instead of the original crypto_simd_usable(),
since poly1305_kunit won't rely on crypto_simd_disabled_for_test.
Fixes: 318c53ae02 ("crypto: x86/poly1305 - Add block-only interface")
Cc: stable@vger.kernel.org
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250706231100.176113-5-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Restore the SIMD usability check that was removed by commit a59e5468a9
("crypto: arm64/poly1305 - Add block-only interface").
This safety check is cheap and is well worth eliminating a footgun.
While the Poly1305 functions should not be called when SIMD registers
are unusable, if they are anyway, they should just do the right thing
instead of corrupting random tasks' registers and/or computing incorrect
MACs. Fixing this is also needed for poly1305_kunit to pass.
Just use may_use_simd() instead of the original crypto_simd_usable(),
since poly1305_kunit won't rely on crypto_simd_disabled_for_test.
Fixes: a59e5468a9 ("crypto: arm64/poly1305 - Add block-only interface")
Cc: stable@vger.kernel.org
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250706231100.176113-4-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Restore the SIMD usability check that was removed by commit 773426f477
("crypto: arm/poly1305 - Add block-only interface").
This safety check is cheap and is well worth eliminating a footgun.
While the Poly1305 functions should not be called when SIMD registers
are unusable, if they are anyway, they should just do the right thing
instead of corrupting random tasks' registers and/or computing incorrect
MACs. Fixing this is also needed for poly1305_kunit to pass.
Just use may_use_simd() instead of the original crypto_simd_usable(),
since poly1305_kunit won't rely on crypto_simd_disabled_for_test.
Fixes: 773426f477 ("crypto: arm/poly1305 - Add block-only interface")
Cc: stable@vger.kernel.org
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250706231100.176113-3-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Commit cac5cefbad ("sched/smp: Make SMP unconditional")
migrate_disable() even on UP builds.
Commit 06ddd17521 ("sched/smp: Always define is_percpu_thread() and
scheduler_ipi()") made is_percpu_thread() check the affinity mask
instead replying always true for UP mask.
As a consequence smp_processor_id() now complains if invoked within a
migrate_disable() section because is_percpu_thread() checks its mask and
the migration check is left out.
Make migration check unconditional of SMP.
Fixes: cac5cefbad ("sched/smp: Make SMP unconditional")
Closes: https://lore.kernel.org/oe-lkp/202507100448.6b88d6f1-lkp@intel.com
Reported-by: kernel test robot <oliver.sang@intel.com>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Chen Yu <yu.c.chen@intel.com>
Link: https://lore.kernel.org/r/20250710082748.-DPO1rjO@linutronix.de
With sanitizers enabled, this function uses a lot of stack, causing a
harmless warning:
lib/test_objagg.c: In function 'test_hints_case.constprop':
lib/test_objagg.c:994:1: error: the frame size of 1440 bytes is larger than 1408 bytes [-Werror=frame-larger-than=]
Most of this is from the two 'struct world' structures. Since most of the
work in this function is duplicated for the two, split it up into separate
functions that each use one of them.
The combined stack usage is still the same here, but there is no warning
any more, and the code is still safe because of the known call chain.
Link: https://lkml.kernel.org/r/20250620111907.3395296-1-arnd@kernel.org
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Cc: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
The current implementation forces a compile-time 1/0 division, which
generates an undefined instruction (ud2 on x86) rather than a proper
runtime division-by-zero exception.
Change to trigger an actual div-by-0 exception at runtime, consistent with
other division operations. Use a non-1 dividend to prevent the compiler
from optimizing the division into a comparison.
Link: https://lkml.kernel.org/r/q246p466-1453-qon9-29so-37105116009q@onlyvoer.pbz
Signed-off-by: Nicolas Pitre <npitre@baylibre.com>
Cc: Biju Das <biju.das.jz@bp.renesas.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Weißschuh <linux@weissschuh.net>
Cc: Uwe Kleine-König <u.kleine-koenig@baylibre.com>
Cc: David Laight <david.laight.linux@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Restoring maple status to ma_active on overflow/underflow when mas->node
was NULL could have happened in the past, but was masked by a bug in
mas_walk(). Add test cases that triggered the bug when the node was
mas->node prior to fixing the maple state setup.
Add a few extra tests around restoring the active maple status.
Link: https://lore.kernel.org/all/202506191556.6bfc7b93-lkp@intel.com/
Link: https://lkml.kernel.org/r/20250624154823.52221-2-Liam.Howlett@oracle.com
Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Reviewed-by: Wei Yang <richard.weiyang@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
During the initial call with a maple state, an error status may be set
before a valid node is populated into the maple state node. Subsequent
calls with the maple state may restore the state into an active state with
no node set. This was masked by the mas_walk() always resetting the
status to ma_start and result in an extra walk in this rare scenario.
Don't restore the state to active unless there is a value in the structs
node. This also allows mas_walk() to be fixed to use the active state
without exposing an issue.
User visible results are marginal performance improvements when an active
state can be restored and used instead of rewalking the tree.
Stable is not Cc'ed because the existing code is stable and the
performance gains are not worth the risk.
Link: https://lore.kernel.org/all/20250611011253.19515-1-richard.weiyang@gmail.com/
Link: https://lore.kernel.org/all/20250407231354.11771-1-richard.weiyang@gmail.com/
Link: https://lore.kernel.org/all/202506191556.6bfc7b93-lkp@intel.com/
Link: https://lkml.kernel.org/r/20250624154823.52221-1-Liam.Howlett@oracle.com
Fixes: a8091f039c ("maple_tree: add MAS_UNDERFLOW and MAS_OVERFLOW states")
Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Reported-by: Wei Yang <richard.weiyang@gmail.com>
Reported-by: kernel test robot <oliver.sang@intel.com>
Closes: https://lore.kernel.org/oe-lkp/202506191556.6bfc7b93-lkp@intel.com
Reviewed-by: Wei Yang <richard.weiyang@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
When the vmalloc test is built into the kernel, it runs automatically
during the boot. The current-default "run_test_mask" includes all test
cases, including those which are designed to fail and which trigger kernel
warnings.
These kernel splats can be misinterpreted as actual kernel bugs, leading
to false alarms and unnecessary reports.
To address this, limit the default test mask to only the first few tests
which are expected to pass cleanly. These tests are safe and should not
generate any warnings unless there is a real bug.
Users who wish to explicitly run specific test cases have to pass the
run_test_mask as a boot parameter or at module load time.
Link: https://lkml.kernel.org/r/20250623184035.581229-2-urezki@gmail.com
Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
Reviewed-by: Baoquan He <bhe@redhat.com>
Cc: Harry Yoo <harry.yoo@oracle.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: David Wang <00107082@163.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
When the vmalloc test code is compiled as a built-in, use late_initcall()
instead of module_init() to defer a vmalloc test execution until most
subsystems are up and running.
It avoids interfering with components that may not yet be initialized at
module_init() time. For example, there was a recent report of memory
profiling infrastructure not being ready early enough leading to kernel
crash.
By using late_initcall() in the built-in case, we ensure the tests are run
at a safer point during a boot sequence.
Link: https://lkml.kernel.org/r/20250623184035.581229-1-urezki@gmail.com
Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
Reviewed-by: Baoquan He <bhe@redhat.com>
Cc: Harry Yoo <harry.yoo@oracle.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: David Wang <00107082@163.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Use the underflow goto label to set the status to ma_underflow and return
NULL, as is being done elsewhere.
[akpm@linux-foundation.org: add newline, per Liam (and remove one, per akpm)]
Link: https://lkml.kernel.org/r/20250624080748.4855-1-dev.jain@arm.com
Signed-off-by: Dev Jain <dev.jain@arm.com>
Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Reviewed-by: Wei Yang <richard.weiyang@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Recently discovered this entry while checking kallsyms on ARM64:
ffff800083e509c0 D _shared_alloc_tag
If ARCH_NEEDS_WEAK_PER_CPU is not defined(it is only defined for s390 and
alpha architectures), there's no need to statically define the percpu
variable _shared_alloc_tag.
Therefore, we need to implement isolation for this purpose.
When building the core kernel code for s390 or alpha architectures,
ARCH_NEEDS_WEAK_PER_CPU remains undefined (as it is gated by #if
defined(MODULE)). However, when building modules for these architectures,
the macro is explicitly defined.
Therefore, we remove all instances of ARCH_NEEDS_WEAK_PER_CPU from the
code and introduced CONFIG_ARCH_MODULE_NEEDS_WEAK_PER_CPU to replace the
relevant logic. We can now conditionally define the perpcu variable
_shared_alloc_tag based on CONFIG_ARCH_MODULE_NEEDS_WEAK_PER_CPU. This
allows architectures (such as s390/alpha) that require weak definitions
for percpu variables in modules to include the definition, while others
can omit it via compile-time exclusion.
Link: https://lkml.kernel.org/r/20250618015809.1235761-1-hao.ge@linux.dev
Signed-off-by: Hao Ge <gehao@kylinos.cn>
Suggested-by: Suren Baghdasaryan <surenb@google.com>
Acked-by: Alexander Gordeev <agordeev@linux.ibm.com> [s390]
Acked-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Cc: Chistoph Lameter <cl@linux.com>
Cc: Christian Borntraeger <borntraeger@linux.ibm.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Dennis Zhou <dennis@kernel.org>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Kent Overstreet <kent.overstreet@linux.dev>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Richard Henderson <richard.henderson@linaro.org>
Cc: Sven Schnelle <svens@linux.ibm.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
When reading /proc/allocinfo, for each read syscall, seq_file would invoke
start/stop callbacks. In start callback, a memory is alloced to store
iterator and the iterator would start from beginning to walk linearly to
current read position.
seq_file read() takes at most 4096 bytes, even if read with a larger user
space buffer, meaning read out all of /proc/allocinfo, tens of read
syscalls are needed. For example, a 306036 bytes allocinfo files need 76
reads:
$ sudo cat /proc/allocinfo | wc
3964 16678 306036
$ sudo strace -T -e read cat /proc/allocinfo
...
read(3, " 4096 1 arch/x86/k"..., 131072) = 4063 <0.000062>
...
read(3, " 0 0 sound/core"..., 131072) = 4021 <0.000150>
...
For those n=3964 lines, each read takes about m=3964/76=52 lines,
since iterator restart from beginning for each read(),
it would move forward
m steps on 1st read
2*m steps on 2nd read
3*m steps on 3rd read
...
n steps on last read
As read() along, those linear seek steps make read() calls slower and
slower. Adding those up, codetag iterator moves about O(n*n/m) steps,
making data structure traversal take significant part of the whole
reading. Profiling when stress reading /proc/allocinfo confirms it:
vfs_read(99.959% 1677299/1677995)
proc_reg_read_iter(99.856% 1674881/1677299)
seq_read_iter(99.959% 1674191/1674881)
allocinfo_start(75.664% 1266755/1674191)
codetag_next_ct(79.217% 1003487/1266755) <---
srso_return_thunk(1.264% 16011/1266755)
__kmalloc_cache_noprof(0.102% 1296/1266755)
...
allocinfo_show(21.287% 356378/1674191)
allocinfo_next(1.530% 25621/1674191)
codetag_next_ct() takes major part.
A private data alloced at open() time can be used to carry iterator alive
across read() calls, and avoid the memory allocation and iterator reset
for each read(). This way, only O(1) memory allocation and O(n) steps
iterating, and `time` shows performance improvement from ~7ms to ~4ms.
Profiling with the change:
vfs_read(99.865% 1581073/1583214)
proc_reg_read_iter(99.485% 1572934/1581073)
seq_read_iter(99.846% 1570519/1572934)
allocinfo_show(87.428% 1373074/1570519)
seq_buf_printf(83.695% 1149196/1373074)
seq_buf_putc(1.917% 26321/1373074)
_find_next_bit(1.531% 21023/1373074)
...
codetag_to_text(0.490% 6727/1373074)
...
allocinfo_next(6.275% 98543/1570519)
...
allocinfo_start(0.369% 5790/1570519)
...
Now seq_buf_printf() takes major part.
Link: https://lkml.kernel.org/r/20250609064408.112783-1-00107082@163.com
Signed-off-by: David Wang <00107082@163.com>
Acked-by: Suren Baghdasaryan <surenb@google.com>
Cc: Kent Overstreet <kent.overstreet@linux.dev>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Codetag iterator use <id,address> pair to guarantee the validness. But
both id and address can be reused, there is theoretical possibility when
module inserted right after another module removed, kmalloc returns an
address same as the address kfree by previous module and IDR key reuses
the key recently removed.
Add a sequence number to codetag_module and code_iterator, the sequence
number is strickly incremented whenever a module is loaded. An iterator
is valid if and only if its sequence number match codetag_module's.
Link: https://lkml.kernel.org/r/20250609064200.112639-1-00107082@163.com
Signed-off-by: David Wang <00107082@163.com>
Acked-by: Suren Baghdasaryan <surenb@google.com>
Cc: Kent Overstreet <kent.overstreet@linux.dev>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
The various test ioctl handlers use arrays of 64 integers that add up to
1KiB of stack data, which in turn leads to exceeding the warning limit in
some configurations:
lib/test_hmm.c:935:12: error: stack frame size (1408) exceeds limit (1280)
in 'dmirror_migrate_to_device' [-Werror,-Wframe-larger-than]
Use half the size for these arrays, in order to stay under the warning
limits. The code can already deal with arbitrary lengths, but this may be
a little less efficient.
Link: https://lkml.kernel.org/r/20250610092159.2639515-1-arnd@kernel.org
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Cc: Alistair Popple <apopple@nvidia.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: Jeff Johnson <jeff.johnson@oss.qualcomm.com>
Cc: Jerome Glisse <jglisse@redhat.com>
Cc: Thorsten Blum <thorsten.blum@linux.dev>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Suppose xas is pointing somewhere near the end of the multi-entry batch.
Then it may happen that the computed slot already falls beyond the batch,
thus breaking the loop due to !xa_is_sibling(), and computing the wrong
order.
For example, suppose we have a shift-6 node having an order-9 entry => 8 -
1 = 7 siblings, so assume the slots are at offset 0 till 7 in this node.
If xas->xa_offset is 6, then the code will compute order as 1 +
xas->xa_node->shift = 7. Therefore, the order computation must start from
the beginning of the multi-slot entries, that is, the non-sibling entry.
Thus ensure that the caller is aware of this by triggering a BUG when the
entry is a sibling entry. Note that this BUG_ON() is only active while
running selftests, so there is no overhead in a running kernel.
Link: https://lkml.kernel.org/r/20250604041533.91198-1-dev.jain@arm.com
Signed-off-by: Dev Jain <dev.jain@arm.com>
Acked-by: Zi Yan <ziy@nvidia.com>
Cc: "Aneesh Kumar K.V" <aneesh.kumar@kernel.org>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
On destroy, we should set each node dead. But current code miss this when
the maple tree has only the root node.
The reason is mt_destroy_walk() leverage mte_destroy_descend() to set node
dead, but this is skipped since the only root node is a leaf.
Fixes this by setting the node dead if it is a leaf.
Link: https://lore.kernel.org/all/20250407231354.11771-1-richard.weiyang@gmail.com/
Link: https://lkml.kernel.org/r/20250624191841.64682-1-Liam.Howlett@oracle.com
Fixes: 54a611b605 ("Maple Tree: add new data structure")
Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Reviewed-by: Dev Jain <dev.jain@arm.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
alloc_tag_top_users() attempts to lock alloc_tag_cttype->mod_lock even
when the alloc_tag_cttype is not allocated because:
1) alloc tagging is disabled because mem profiling is disabled
(!alloc_tag_cttype)
2) alloc tagging is enabled, but not yet initialized (!alloc_tag_cttype)
3) alloc tagging is enabled, but failed initialization
(!alloc_tag_cttype or IS_ERR(alloc_tag_cttype))
In all cases, alloc_tag_cttype is not allocated, and therefore
alloc_tag_top_users() should not attempt to acquire the semaphore.
This leads to a crash on memory allocation failure by attempting to
acquire a non-existent semaphore:
Oops: general protection fault, probably for non-canonical address 0xdffffc000000001b: 0000 [#3] SMP KASAN NOPTI
KASAN: null-ptr-deref in range [0x00000000000000d8-0x00000000000000df]
CPU: 2 UID: 0 PID: 1 Comm: systemd Tainted: G D 6.16.0-rc2 #1 VOLUNTARY
Tainted: [D]=DIE
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-debian-1.16.2-1 04/01/2014
RIP: 0010:down_read_trylock+0xaa/0x3b0
Code: d0 7c 08 84 d2 0f 85 a0 02 00 00 8b 0d df 31 dd 04 85 c9 75 29 48 b8 00 00 00 00 00 fc ff df 48 8d 6b 68 48 89 ea 48 c1 ea 03 <80> 3c 02 00 0f 85 88 02 00 00 48 3b 5b 68 0f 85 53 01 00 00 65 ff
RSP: 0000:ffff8881002ce9b8 EFLAGS: 00010016
RAX: dffffc0000000000 RBX: 0000000000000070 RCX: 0000000000000000
RDX: 000000000000001b RSI: 000000000000000a RDI: 0000000000000070
RBP: 00000000000000d8 R08: 0000000000000001 R09: ffffed107dde49d1
R10: ffff8883eef24e8b R11: ffff8881002cec20 R12: 1ffff11020059d37
R13: 00000000003fff7b R14: ffff8881002cec20 R15: dffffc0000000000
FS: 00007f963f21d940(0000) GS:ffff888458ca6000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f963f5edf71 CR3: 000000010672c000 CR4: 0000000000350ef0
Call Trace:
<TASK>
codetag_trylock_module_list+0xd/0x20
alloc_tag_top_users+0x369/0x4b0
__show_mem+0x1cd/0x6e0
warn_alloc+0x2b1/0x390
__alloc_frozen_pages_noprof+0x12b9/0x21a0
alloc_pages_mpol+0x135/0x3e0
alloc_slab_page+0x82/0xe0
new_slab+0x212/0x240
___slab_alloc+0x82a/0xe00
</TASK>
As David Wang points out, this issue became easier to trigger after commit
780138b123 ("alloc_tag: check mem_profiling_support in alloc_tag_init").
Before the commit, the issue occurred only when it failed to allocate and
initialize alloc_tag_cttype or if a memory allocation fails before
alloc_tag_init() is called. After the commit, it can be easily triggered
when memory profiling is compiled but disabled at boot.
To properly determine whether alloc_tag_init() has been called and its
data structures initialized, verify that alloc_tag_cttype is a valid
pointer before acquiring the semaphore. If the variable is NULL or an
error value, it has not been properly initialized. In such a case, just
skip and do not attempt to acquire the semaphore.
[harry.yoo@oracle.com: v3]
Link: https://lkml.kernel.org/r/20250624072513.84219-1-harry.yoo@oracle.com
Link: https://lkml.kernel.org/r/20250620195305.1115151-1-harry.yoo@oracle.com
Fixes: 780138b123 ("alloc_tag: check mem_profiling_support in alloc_tag_init")
Fixes: 1438d349d1 ("lib: add memory allocations report in show_mem()")
Signed-off-by: Harry Yoo <harry.yoo@oracle.com>
Reported-by: kernel test robot <oliver.sang@intel.com>
Closes: https://lore.kernel.org/oe-lkp/202506181351.bba867dd-lkp@intel.com
Acked-by: Suren Baghdasaryan <surenb@google.com>
Tested-by: Raghavendra K T <raghavendra.kt@amd.com>
Cc: Casey Chen <cachen@purestorage.com>
Cc: David Wang <00107082@163.com>
Cc: Kent Overstreet <kent.overstreet@linux.dev>
Cc: Yuanyuan Zhong <yzhong@purestorage.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
The internal helpers are effectively using boolean results,
while pretending to use error numbers.
Switch the return type to bool for more clarity.
Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/all/20250701-vdso-auxclock-v1-5-df7d9f87b9b8@linutronix.de
Generalize node_random() and make it available to general bitmaps and
cpumasks users.
Notice, find_first_bit() is generally faster than find_nth_bit(), and we
employ it when there's a single set bit in the bitmap.
See commit 3e061d924f ("lib/nodemask: optimize node_random for
nodemask with single NUMA node").
CC: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: "Yury Norov [NVIDIA]" <yury.norov@gmail.com>
crypto/hash_info.c just contains a couple of arrays that map HASH_ALGO_*
algorithm IDs to properties of those algorithms. It is compiled only
when CRYPTO_HASH_INFO=y, but currently CRYPTO_HASH_INFO depends on
CRYPTO. Since this can be useful without the old-school crypto API,
move it into lib/crypto/ so that it no longer depends on CRYPTO.
This eliminates the need for FS_VERITY to select CRYPTO after it's been
converted to use lib/crypto/.
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250630172224.46909-2-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Since sha256_blocks() is called only with nblocks >= 1, remove
unnecessary checks for nblocks == 0 from the x86 SHA-256 assembly code.
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250704023958.73274-3-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
As I did for sha512_blocks(), reorganize x86's sha256_blocks() to be
just a static_call. To achieve that, for each assembly function add a C
function that handles the kernel-mode FPU section and fallback. While
this increases total code size slightly, the amount of code actually
executed on a given system does not increase, and it is slightly more
efficient since it eliminates the extra static_key. It also makes the
assembly functions be called with standard direct calls instead of
static calls, eliminating the need for ANNOTATE_NOENDBR.
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250704023958.73274-2-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
The BLOCK_HASH_UPDATE_BLOCKS macro is difficult to read. For now, let's
just write the update explicitly in the straightforward way, mirroring
sha512_update(). It's possible that we'll bring back a macro for this
later, but it needs to be properly justified and hopefully a bit more
readable.
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250630160645.3198-14-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Consolidate the CPU-based SHA-256 code into a single module, following
what I did with SHA-512:
- Each arch now provides a header file lib/crypto/$(SRCARCH)/sha256.h,
replacing lib/crypto/$(SRCARCH)/sha256.c. The header defines
sha256_blocks() and optionally sha256_mod_init_arch(). It is included
by lib/crypto/sha256.c, and thus the code gets built into the single
libsha256 module, with proper inlining and dead code elimination.
- sha256_blocks_generic() is moved from lib/crypto/sha256-generic.c into
lib/crypto/sha256.c. It's now a static function marked with
__maybe_unused, so the compiler automatically eliminates it in any
cases where it's not used.
- Whether arch-optimized SHA-256 is buildable is now controlled
centrally by lib/crypto/Kconfig instead of by
lib/crypto/$(SRCARCH)/Kconfig. The conditions for enabling it remain
the same as before, and it remains enabled by default.
- Any additional arch-specific translation units for the optimized
SHA-256 code (such as assembly files) are now compiled by
lib/crypto/Makefile instead of lib/crypto/$(SRCARCH)/Makefile.
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250630160645.3198-13-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Since HMAC support is commonly needed and is fairly simple, include it
as a first-class citizen of the SHA-256 library.
The API supports both incremental and one-shot computation, and either
preparing the key ahead of time or just using a raw key. The
implementation is much more streamlined than crypto/hmac.c.
I've kept it consistent with the HMAC-SHA384 and HMAC-SHA512 code as
much as possible.
Testing of these functions will be via sha224_kunit and sha256_kunit,
added by a later commit.
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250630160645.3198-9-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
The previous commit made the SHA-256 compression function state be
strongly typed, but it wasn't propagated all the way down to the
implementations of it. Do that now.
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250630160645.3198-8-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Currently the SHA-224 and SHA-256 library functions can be mixed
arbitrarily, even in ways that are incorrect, for example using
sha224_init() and sha256_final(). This is because they operate on the
same structure, sha256_state.
Introduce stronger typing, as I did for SHA-384 and SHA-512.
Also as I did for SHA-384 and SHA-512, use the names *_ctx instead of
*_state. The *_ctx names have the following small benefits:
- They're shorter.
- They avoid an ambiguity with the compression function state.
- They're consistent with the well-known OpenSSL API.
- Users usually name the variable 'sctx' anyway, which suggests that
*_ctx would be the more natural name for the actual struct.
Therefore: update the SHA-224 and SHA-256 APIs, implementation, and
calling code accordingly.
In the new structs, also strongly-type the compression function state.
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250630160645.3198-7-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Add a one-shot SHA-224 computation function sha224(), for consistency
with sha256(), sha384(), and sha512() which all already exist.
Similarly, add sha224_update(). While for now it's identical to
sha256_update(), omitting it makes the API harder to use since users
have to "know" which functions are the same between SHA-224 and SHA-256.
Also, this is a prerequisite for using different context types for each.
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250630160645.3198-6-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Instead of having both sha256_blocks_arch() and sha256_blocks_simd(),
instead have just sha256_blocks_arch() which uses the most efficient
implementation that is available in the calling context.
This is simpler, as it reduces the API surface. It's also safer, since
sha256_blocks_arch() just works in all contexts, including contexts
where the FPU/SIMD/vector registers cannot be used. This doesn't mean
that SHA-256 computations *should* be done in such contexts, but rather
we should just do the right thing instead of corrupting a random task's
registers. Eliminating this footgun and simplifying the code is well
worth the very small performance cost of doing the check.
Note: in the case of arm and arm64, what used to be sha256_blocks_arch()
is renamed back to its original name of sha256_block_data_order().
sha256_blocks_arch() is now used for the higher-level dispatch function.
This renaming also required an update to lib/crypto/arm64/sha512.h,
since sha2-armv8.pl is shared by both SHA-256 and SHA-512.
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250630160645.3198-5-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
First, move the declarations of sha224_init/update/final to be just
above the corresponding SHA-256 code, matching the order that I used for
SHA-384 and SHA-512. In sha2.h, the end result is that SHA-224,
SHA-256, SHA-384, and SHA-512 are all in the logical order.
Second, move sha224_block_init() and sha256_block_init() to be just
below crypto_sha256_state. In later changes, these functions as well as
struct crypto_sha256_state will no longer be used by the library
functions. They'll remain just for some legacy offload drivers. This
gets them into a logical place in the file for that.
No code changes other than reordering.
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250630160645.3198-4-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Current release - new code bugs:
- eth: txgbe: fix the issue of TX failure
- eth: ngbe: specify IRQ vector when the number of VFs is 7
Previous releases - regressions:
- sched: always pass notifications when child class becomes empty
- ipv4: fix stat increase when udp early demux drops the packet
- bluetooth: prevent unintended pause by checking if advertising is active
- virtio: fix error reporting in virtqueue_resize
- eth: virtio-net:
- ensure the received length does not exceed allocated size
- fix the xsk frame's length check
- eth: lan78xx: fix WARN in __netif_napi_del_locked on disconnect
Previous releases - always broken:
- bluetooth: mesh: check instances prior disabling advertising
- eth: idpf: convert control queue mutex to a spinlock
- eth: dpaa2: fix xdp_rxq_info leak
- eth: amd-xgbe: align CL37 AN sequence as per databook
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
-----BEGIN PGP SIGNATURE-----
iQJGBAABCAAwFiEEg1AjqC77wbdLX2LbKSR5jcyPE6QFAmhmfzQSHHBhYmVuaUBy
ZWRoYXQuY29tAAoJECkkeY3MjxOk/GMP/ixlapKjTP/ggGIFO0nEDTm1tAFnhQl3
bBuwBDoGPjalb46WBO24SFSFYqvZwV6ZIYxCxCeBfmkPyEun0FBX6xjqUIZqohTZ
u5ZSmKFkODMoxQWAG0hXBGvfeKg/GBMWJT761o5IB2XvknRlqHq6uufUBcalvlJK
t58ykSYp2wjfowXSRQ4jEZnr4HZzVuvarhbCB9hJWv206fdk4LiC07teHB1VhW4w
LYmBQChp8SXDFCCYZajum0cNCzx78q90lGzz+MEErVXdXXnRVeqRAUY+k4Vd/Fz+
0OY1vZJ7xgFpy2ns3Z6TH8D41P9whBI8jUYXZ5nA45J8N5wdEQo8oVHlRe9a6Y/E
0oC+DPahhSQAq8BKGFtYSyyURGJvd4+TpQP/LV4e83myReW8i0ZKtyXVgH0Cibwb
529l6wIXBAcLK03tyYwmoCI2VjJbRoMV3nMCeiACCtDExK1YCa3dhjQ82fa8voLc
MIn7zXAGf12IKca39ZapRrdaooaqvSG4htxTn94vEqScNu0wi1cymvG47h9bDrES
cPyS4/MIUH0sduSDVL5PpFYfIDhqS3mpc0e8Nc3pOy7VLQ9kvtBX37OaO/tX5aeh
SWU+8q8y1Cnq0+mcUUHpENFMOgZEC5UO6rdeaJB3Nu0vlHlDEZoEkUXSkHEfsf2F
aodwE/oPyQCg
=O7OS
-----END PGP SIGNATURE-----
Merge tag 'net-6.16-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Paolo Abeni:
"Including fixes from Bluetooth.
Current release - new code bugs:
- eth:
- txgbe: fix the issue of TX failure
- ngbe: specify IRQ vector when the number of VFs is 7
Previous releases - regressions:
- sched: always pass notifications when child class becomes empty
- ipv4: fix stat increase when udp early demux drops the packet
- bluetooth: prevent unintended pause by checking if advertising is active
- virtio: fix error reporting in virtqueue_resize
- eth:
- virtio-net:
- ensure the received length does not exceed allocated size
- fix the xsk frame's length check
- lan78xx: fix WARN in __netif_napi_del_locked on disconnect
Previous releases - always broken:
- bluetooth: mesh: check instances prior disabling advertising
- eth:
- idpf: convert control queue mutex to a spinlock
- dpaa2: fix xdp_rxq_info leak
- amd-xgbe: align CL37 AN sequence as per databook"
* tag 'net-6.16-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (38 commits)
vsock/vmci: Clear the vmci transport packet properly when initializing it
dt-bindings: net: sophgo,sg2044-dwmac: Drop status from the example
net: ngbe: specify IRQ vector when the number of VFs is 7
net: wangxun: revert the adjustment of the IRQ vector sequence
net: txgbe: request MISC IRQ in ndo_open
virtio_net: Enforce minimum TX ring size for reliability
virtio_net: Cleanup '2+MAX_SKB_FRAGS'
virtio_ring: Fix error reporting in virtqueue_resize
virtio-net: xsk: rx: fix the frame's length check
virtio-net: use the check_mergeable_len helper
virtio-net: remove redundant truesize check with PAGE_SIZE
virtio-net: ensure the received length does not exceed allocated size
net: ipv4: fix stat increase when udp early demux drops the packet
net: libwx: fix the incorrect display of the queue number
amd-xgbe: do not double read link status
net/sched: Always pass notifications when child class becomes empty
nui: Fix dma_mapping_error() check
rose: fix dangling neighbour pointers in rose_rt_device_down()
enic: fix incorrect MTU comparison in enic_change_mtu()
amd-xgbe: align CL37 AN sequence as per databook
...
Smatch complains that the error message isn't set in the caller:
lib/test_objagg.c:923 test_hints_case2()
error: uninitialized symbol 'errmsg'.
This static checker warning only showed up after a recent refactoring
but the bug dates back to when the code was originally added. This
likely doesn't affect anything in real life.
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/r/202506281403.DsuyHFTZ-lkp@intel.com/
Fixes: 0a020d416d ("lib: introduce initial implementation of object aggregation manager")
Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/8548f423-2e3b-4bb7-b816-5041de2762aa@sabinyo.mountain
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
group_cpu_evenly() might have allocated less groups then requested:
group_cpu_evenly()
__group_cpus_evenly()
alloc_nodes_groups()
# allocated total groups may be less than numgrps when
# active total CPU number is less then numgrps
In this case, the caller will do an out of bound access because the
caller assumes the masks returned has numgrps.
Return the number of groups created so the caller can limit the access
range accordingly.
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Daniel Wagner <wagi@kernel.org>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20250617-isolcpus-queue-counters-v1-1-13923686b54b@kernel.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Fix build warnings with W=1 that started appearing after
commit a934a57a42 ("scripts/misc-check: check missing #include
<linux/export.h> when W=1"). While at it, sort the include lists
alphabetically.
Link: https://lore.kernel.org/r/20250612183852.114878-1-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
These symbols are no longer used, so remove them.
Reviewed-by: "Martin K. Petersen" <martin.petersen@oracle.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Acked-by: "Jason A. Donenfeld" <Jason@zx2c4.com>
Link: https://lore.kernel.org/r/20250607200454.73587-13-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Move the x86-optimized CRC code from arch/x86/lib/crc* into its new
location in lib/crc/x86/, and wire it up in the new way. This new way
of organizing the CRC code eliminates the need to artificially split the
code for each CRC variant into separate arch and generic modules,
enabling better inlining and dead code elimination. For more details,
see "lib/crc: Prepare for arch-optimized code in subdirs of lib/crc/".
Reviewed-by: "Martin K. Petersen" <martin.petersen@oracle.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Acked-by: "Jason A. Donenfeld" <Jason@zx2c4.com>
Link: https://lore.kernel.org/r/20250607200454.73587-12-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Move the sparc-optimized CRC code from arch/sparc/lib/crc* into its new
location in lib/crc/sparc/, and wire it up in the new way. This new way
of organizing the CRC code eliminates the need to artificially split the
code for each CRC variant into separate arch and generic modules,
enabling better inlining and dead code elimination. For more details,
see "lib/crc: Prepare for arch-optimized code in subdirs of lib/crc/".
Reviewed-by: "Martin K. Petersen" <martin.petersen@oracle.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Acked-by: "Jason A. Donenfeld" <Jason@zx2c4.com>
Link: https://lore.kernel.org/r/20250607200454.73587-11-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Move the s390-optimized CRC code from arch/s390/lib/crc* into its new
location in lib/crc/s390/, and wire it up in the new way. This new way
of organizing the CRC code eliminates the need to artificially split the
code for each CRC variant into separate arch and generic modules,
enabling better inlining and dead code elimination. For more details,
see "lib/crc: Prepare for arch-optimized code in subdirs of lib/crc/".
Reviewed-by: "Martin K. Petersen" <martin.petersen@oracle.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Acked-by: "Jason A. Donenfeld" <Jason@zx2c4.com>
Link: https://lore.kernel.org/r/20250607200454.73587-10-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Move the riscv-optimized CRC code from arch/riscv/lib/crc* into its new
location in lib/crc/riscv/, and wire it up in the new way. This new way
of organizing the CRC code eliminates the need to artificially split the
code for each CRC variant into separate arch and generic modules,
enabling better inlining and dead code elimination. For more details,
see "lib/crc: Prepare for arch-optimized code in subdirs of lib/crc/".
Reviewed-by: "Martin K. Petersen" <martin.petersen@oracle.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Acked-by: "Jason A. Donenfeld" <Jason@zx2c4.com>
Link: https://lore.kernel.org/r/20250607200454.73587-9-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Move the powerpc-optimized CRC code from arch/powerpc/lib/crc* into its
new location in lib/crc/powerpc/, and wire it up in the new way. This
new way of organizing the CRC code eliminates the need to artificially
split the code for each CRC variant into separate arch and generic
modules, enabling better inlining and dead code elimination. For more
details, see "lib/crc: Prepare for arch-optimized code in subdirs of
lib/crc/".
Reviewed-by: "Martin K. Petersen" <martin.petersen@oracle.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Acked-by: "Jason A. Donenfeld" <Jason@zx2c4.com>
Link: https://lore.kernel.org/r/20250607200454.73587-8-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Move the mips-optimized CRC code from arch/mips/lib/crc* into its new
location in lib/crc/mips/, and wire it up in the new way. This new way
of organizing the CRC code eliminates the need to artificially split the
code for each CRC variant into separate arch and generic modules,
enabling better inlining and dead code elimination. For more details,
see "lib/crc: Prepare for arch-optimized code in subdirs of lib/crc/".
Reviewed-by: "Martin K. Petersen" <martin.petersen@oracle.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Acked-by: "Jason A. Donenfeld" <Jason@zx2c4.com>
Link: https://lore.kernel.org/r/20250607200454.73587-7-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Move the loongarch-optimized CRC code from arch/loongarch/lib/crc* into
its new location in lib/crc/loongarch/, and wire it up in the new way.
This new way of organizing the CRC code eliminates the need to
artificially split the code for each CRC variant into separate arch and
generic modules, enabling better inlining and dead code elimination.
For more details, see "lib/crc: Prepare for arch-optimized code in
subdirs of lib/crc/".
Reviewed-by: "Martin K. Petersen" <martin.petersen@oracle.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Acked-by: "Jason A. Donenfeld" <Jason@zx2c4.com>
Link: https://lore.kernel.org/r/20250607200454.73587-6-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Move the arm64-optimized CRC code from arch/arm64/lib/crc* into its new
location in lib/crc/arm64/, and wire it up in the new way. This new way
of organizing the CRC code eliminates the need to artificially split the
code for each CRC variant into separate arch and generic modules,
enabling better inlining and dead code elimination. For more details,
see "lib/crc: Prepare for arch-optimized code in subdirs of lib/crc/".
Reviewed-by: "Martin K. Petersen" <martin.petersen@oracle.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Acked-by: "Jason A. Donenfeld" <Jason@zx2c4.com>
Link: https://lore.kernel.org/r/20250607200454.73587-5-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Move the arm-optimized CRC code from arch/arm/lib/crc* into its new
location in lib/crc/arm/, and wire it up in the new way. This new way
of organizing the CRC code eliminates the need to artificially split the
code for each CRC variant into separate arch and generic modules,
enabling better inlining and dead code elimination. For more details,
see "lib/crc: Prepare for arch-optimized code in subdirs of lib/crc/".
Reviewed-by: "Martin K. Petersen" <martin.petersen@oracle.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Acked-by: "Jason A. Donenfeld" <Jason@zx2c4.com>
Link: https://lore.kernel.org/r/20250607200454.73587-4-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Rework how lib/crc/ supports arch-optimized code. First, instead of the
arch-optimized CRC code being in arch/$(SRCARCH)/lib/, it will now be in
lib/crc/$(SRCARCH)/. Second, the API functions (e.g. crc32c()),
arch-optimized functions (e.g. crc32c_arch()), and generic functions
(e.g. crc32c_base()) will now be part of a single module for each CRC
type, allowing better inlining and dead code elimination. The second
change is made possible by the first.
As an example, consider CONFIG_CRC32=m on x86. We'll now have just
crc32.ko instead of both crc32-x86.ko and crc32.ko. The two modules
were already coupled together and always both got loaded together via
direct symbol dependency, so the separation provided no benefit.
Note: later I'd like to apply the same design to lib/crypto/ too, where
often the API functions are out-of-line so this will work even better.
In those cases, for each algorithm we currently have 3 modules all
coupled together, e.g. libsha256.ko, libsha256-generic.ko, and
sha256-x86.ko. We should have just one, inline things properly, and
rely on the compiler's dead code elimination to decide the inclusion of
the generic code instead of manually setting it via kconfig.
Having arch-specific code outside arch/ was somewhat controversial when
Zinc proposed it back in 2018. But I don't think the concerns are
warranted. It's better from a technical perspective, as it enables the
improvements mentioned above. This model is already successfully used
in other places in the kernel such as lib/raid6/. The community of each
architecture still remains free to work on the code, even if it's not in
arch/. At the time there was also a desire to put the library code in
the same files as the old-school crypto API, but that was a mistake; now
that the library is separate, that's no longer a constraint either.
Reviewed-by: "Martin K. Petersen" <martin.petersen@oracle.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Acked-by: "Jason A. Donenfeld" <Jason@zx2c4.com>
Link: https://lore.kernel.org/r/20250607200454.73587-3-ebiggers@kernel.org
Link: https://lore.kernel.org/r/20250612054514.142728-1-ebiggers@kernel.org
Link: https://lore.kernel.org/r/20250621012221.4351-1-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Move all CRC files in lib/ into a subdirectory lib/crc/ to keep them
from cluttering up the main lib/ directory.
Reviewed-by: "Martin K. Petersen" <martin.petersen@oracle.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Acked-by: "Jason A. Donenfeld" <Jason@zx2c4.com>
Link: https://lore.kernel.org/r/20250607200454.73587-2-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Remove crc32_le_combine() and crc32_le_shift(), since they are no longer
used.
Although combination is an interesting thing that can be done with CRCs,
it turned out that none of the users of it in the kernel were even close
to being worthwhile. All were much better off simply chaining the CRCs
or processing zeroes.
Let's remove the CRC32 combination code for now. It can come back
(potentially optimized with carryless multiplication instructions) if
there is ever a case where it would actually be worthwhile.
Link: https://lore.kernel.org/r/20250607032228.27868-1-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
The MIPS32r2 ChaCha code has never been buildable with the clang
assembler. First, clang doesn't support the 'rotl' pseudo-instruction:
error: unknown instruction, did you mean: rol, rotr?
Second, clang requires that both operands of the 'wsbh' instruction be
explicitly given:
error: too few operands for instruction
To fix this, align the code with the real instruction set by (1) using
the real instruction 'rotr' instead of the nonstandard pseudo-
instruction 'rotl', and (2) explicitly giving both operands to 'wsbh'.
To make removing the use of 'rotl' a bit easier, also remove the
unnecessary special-casing for big endian CPUs at
.Lchacha_mips_xor_bytes. The tail handling is actually
endian-independent since it processes one byte at a time. On big endian
CPUs the old code byte-swapped SAVED_X, then iterated through it in
reverse order. But the byteswap and reverse iteration canceled out.
Tested with chacha20poly1305-selftest in QEMU using "-M malta" with both
little endian and big endian mips32r2 kernels.
Fixes: 49aa7c00ed ("crypto: mips/chacha - import 32r2 ChaCha code from Zinc")
Cc: stable@vger.kernel.org
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202505080409.EujEBwA0-lkp@intel.com/
Link: https://lore.kernel.org/r/20250619225535.679301-1-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Move the contents of arch/x86/lib/crypto/ into lib/crypto/x86/.
The new code organization makes a lot more sense for how this code
actually works and is developed. In particular, it makes it possible to
build each algorithm as a single module, with better inlining and dead
code elimination. For a more detailed explanation, see the patchset
which did this for the CRC library code:
https://lore.kernel.org/r/20250607200454.73587-1-ebiggers@kernel.org/.
Also see the patchset which did this for SHA-512:
https://lore.kernel.org/linux-crypto/20250616014019.415791-1-ebiggers@kernel.org/
This is just a preparatory commit, which does the move to get the files
into their new location but keeps them building the same way as before.
Later commits will make the actual improvements to the way the
arch-optimized code is integrated for each algorithm.
Add a gitignore entry for the removed directory arch/x86/lib/crypto/ so
that people don't accidentally commit leftover generated files.
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Reviewed-by: Sohil Mehta <sohil.mehta@intel.com>
Link: https://lore.kernel.org/r/20250619191908.134235-9-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Move the contents of arch/sparc/lib/crypto/ into lib/crypto/sparc/.
The new code organization makes a lot more sense for how this code
actually works and is developed. In particular, it makes it possible to
build each algorithm as a single module, with better inlining and dead
code elimination. For a more detailed explanation, see the patchset
which did this for the CRC library code:
https://lore.kernel.org/r/20250607200454.73587-1-ebiggers@kernel.org/.
Also see the patchset which did this for SHA-512:
https://lore.kernel.org/linux-crypto/20250616014019.415791-1-ebiggers@kernel.org/
This is just a preparatory commit, which does the move to get the files
into their new location but keeps them building the same way as before.
Later commits will make the actual improvements to the way the
arch-optimized code is integrated for each algorithm.
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Reviewed-by: Sohil Mehta <sohil.mehta@intel.com>
Link: https://lore.kernel.org/r/20250619191908.134235-8-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Move the contents of arch/s390/lib/crypto/ into lib/crypto/s390/.
The new code organization makes a lot more sense for how this code
actually works and is developed. In particular, it makes it possible to
build each algorithm as a single module, with better inlining and dead
code elimination. For a more detailed explanation, see the patchset
which did this for the CRC library code:
https://lore.kernel.org/r/20250607200454.73587-1-ebiggers@kernel.org/.
Also see the patchset which did this for SHA-512:
https://lore.kernel.org/linux-crypto/20250616014019.415791-1-ebiggers@kernel.org/
This is just a preparatory commit, which does the move to get the files
into their new location but keeps them building the same way as before.
Later commits will make the actual improvements to the way the
arch-optimized code is integrated for each algorithm.
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Reviewed-by: Sohil Mehta <sohil.mehta@intel.com>
Link: https://lore.kernel.org/r/20250619191908.134235-7-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Move the contents of arch/riscv/lib/crypto/ into lib/crypto/riscv/.
The new code organization makes a lot more sense for how this code
actually works and is developed. In particular, it makes it possible to
build each algorithm as a single module, with better inlining and dead
code elimination. For a more detailed explanation, see the patchset
which did this for the CRC library code:
https://lore.kernel.org/r/20250607200454.73587-1-ebiggers@kernel.org/.
Also see the patchset which did this for SHA-512:
https://lore.kernel.org/linux-crypto/20250616014019.415791-1-ebiggers@kernel.org/
This is just a preparatory commit, which does the move to get the files
into their new location but keeps them building the same way as before.
Later commits will make the actual improvements to the way the
arch-optimized code is integrated for each algorithm.
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Acked-by: Palmer Dabbelt <palmer@dabbelt.com>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Reviewed-by: Sohil Mehta <sohil.mehta@intel.com>
Link: https://lore.kernel.org/r/20250619191908.134235-6-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Move the contents of arch/powerpc/lib/crypto/ into lib/crypto/powerpc/.
The new code organization makes a lot more sense for how this code
actually works and is developed. In particular, it makes it possible to
build each algorithm as a single module, with better inlining and dead
code elimination. For a more detailed explanation, see the patchset
which did this for the CRC library code:
https://lore.kernel.org/r/20250607200454.73587-1-ebiggers@kernel.org/.
Also see the patchset which did this for SHA-512:
https://lore.kernel.org/linux-crypto/20250616014019.415791-1-ebiggers@kernel.org/
This is just a preparatory commit, which does the move to get the files
into their new location but keeps them building the same way as before.
Later commits will make the actual improvements to the way the
arch-optimized code is integrated for each algorithm.
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Reviewed-by: Sohil Mehta <sohil.mehta@intel.com>
Link: https://lore.kernel.org/r/20250619191908.134235-5-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Move the contents of arch/mips/lib/crypto/ into lib/crypto/mips/.
The new code organization makes a lot more sense for how this code
actually works and is developed. In particular, it makes it possible to
build each algorithm as a single module, with better inlining and dead
code elimination. For a more detailed explanation, see the patchset
which did this for the CRC library code:
https://lore.kernel.org/r/20250607200454.73587-1-ebiggers@kernel.org/.
Also see the patchset which did this for SHA-512:
https://lore.kernel.org/linux-crypto/20250616014019.415791-1-ebiggers@kernel.org/
This is just a preparatory commit, which does the move to get the files
into their new location but keeps them building the same way as before.
Later commits will make the actual improvements to the way the
arch-optimized code is integrated for each algorithm.
Add a gitignore entry for the removed directory arch/mips/lib/crypto/ so
that people don't accidentally commit leftover generated files.
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Reviewed-by: Sohil Mehta <sohil.mehta@intel.com>
Link: https://lore.kernel.org/r/20250619191908.134235-4-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Move the contents of arch/arm64/lib/crypto/ into lib/crypto/arm64/.
The new code organization makes a lot more sense for how this code
actually works and is developed. In particular, it makes it possible to
build each algorithm as a single module, with better inlining and dead
code elimination. For a more detailed explanation, see the patchset
which did this for the CRC library code:
https://lore.kernel.org/r/20250607200454.73587-1-ebiggers@kernel.org/.
Also see the patchset which did this for SHA-512:
https://lore.kernel.org/linux-crypto/20250616014019.415791-1-ebiggers@kernel.org/
This is just a preparatory commit, which does the move to get the files
into their new location but keeps them building the same way as before.
Later commits will make the actual improvements to the way the
arch-optimized code is integrated for each algorithm.
Add a gitignore entry for the removed directory arch/arm64/lib/crypto/
so that people don't accidentally commit leftover generated files.
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Reviewed-by: Sohil Mehta <sohil.mehta@intel.com>
Link: https://lore.kernel.org/r/20250619191908.134235-3-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Move the contents of arch/arm/lib/crypto/ into lib/crypto/arm/.
The new code organization makes a lot more sense for how this code
actually works and is developed. In particular, it makes it possible to
build each algorithm as a single module, with better inlining and dead
code elimination. For a more detailed explanation, see the patchset
which did this for the CRC library code:
https://lore.kernel.org/r/20250607200454.73587-1-ebiggers@kernel.org/.
Also see the patchset which did this for SHA-512:
https://lore.kernel.org/linux-crypto/20250616014019.415791-1-ebiggers@kernel.org/
This is just a preparatory commit, which does the move to get the files
into their new location but keeps them building the same way as before.
Later commits will make the actual improvements to the way the
arch-optimized code is integrated for each algorithm.
Add a gitignore entry for the removed directory arch/arm/lib/crypto/ so
that people don't accidentally commit leftover generated files.
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Reviewed-by: Sohil Mehta <sohil.mehta@intel.com>
Link: https://lore.kernel.org/r/20250619191908.134235-2-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Instead of exposing the x86-optimized SHA-512 code via x86-specific
crypto_shash algorithms, instead just implement the sha512_blocks()
library function. This is much simpler, it makes the SHA-512 (and
SHA-384) library functions be x86-optimized, and it fixes the
longstanding issue where the x86-optimized SHA-512 code was disabled by
default. SHA-512 still remains available through crypto_shash, but
individual architectures no longer need to handle it.
To match sha512_blocks(), change the type of the nblocks parameter of
the assembly functions from int to size_t. The assembly functions
actually already treated it as size_t.
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250630160320.2888-15-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Instead of exposing the sparc-optimized SHA-512 code via sparc-specific
crypto_shash algorithms, instead just implement the sha512_blocks()
library function. This is much simpler, it makes the SHA-512 (and
SHA-384) library functions be sparc-optimized, and it fixes the
longstanding issue where the sparc-optimized SHA-512 code was disabled
by default. SHA-512 still remains available through crypto_shash, but
individual architectures no longer need to handle it.
To match sha512_blocks(), change the type of the nblocks parameter of
the assembly function from int to size_t. The assembly function
actually already treated it as size_t.
Note: to see the diff from arch/sparc/crypto/sha512_glue.c to
lib/crypto/sparc/sha512.h, view this commit with 'git show -M10'.
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250630160320.2888-14-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Instead of exposing the s390-optimized SHA-512 code via s390-specific
crypto_shash algorithms, instead just implement the sha512_blocks()
library function. This is much simpler, it makes the SHA-512 (and
SHA-384) library functions be s390-optimized, and it fixes the
longstanding issue where the s390-optimized SHA-512 code was disabled by
default. SHA-512 still remains available through crypto_shash, but
individual architectures no longer need to handle it.
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250630160320.2888-13-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Instead of exposing the riscv-optimized SHA-512 code via riscv-specific
crypto_shash algorithms, instead just implement the sha512_blocks()
library function. This is much simpler, it makes the SHA-512 (and
SHA-384) library functions be riscv-optimized, and it fixes the
longstanding issue where the riscv-optimized SHA-512 code was disabled
by default. SHA-512 still remains available through crypto_shash, but
individual architectures no longer need to handle it.
To match sha512_blocks(), change the type of the nblocks parameter of
the assembly function from int to size_t. The assembly function
actually already treated it as size_t.
Note: to see the diff from arch/riscv/crypto/sha512-riscv64-glue.c to
lib/crypto/riscv/sha512.h, view this commit with 'git show -M10'.
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250630160320.2888-12-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Instead of exposing the mips-optimized SHA-512 code via mips-specific
crypto_shash algorithms, instead just implement the sha512_blocks()
library function. This is much simpler, it makes the SHA-512 (and
SHA-384) library functions be mips-optimized, and it fixes the
longstanding issue where the mips-optimized SHA-512 code was disabled by
default. SHA-512 still remains available through crypto_shash, but
individual architectures no longer need to handle it.
Note: to see the diff from
arch/mips/cavium-octeon/crypto/octeon-sha512.c to
lib/crypto/mips/sha512.h, view this commit with 'git show -M10'.
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250630160320.2888-11-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Instead of exposing the arm64-optimized SHA-512 code via arm64-specific
crypto_shash algorithms, instead just implement the sha512_blocks()
library function. This is much simpler, it makes the SHA-512 (and
SHA-384) library functions be arm64-optimized, and it fixes the
longstanding issue where the arm64-optimized SHA-512 code was disabled
by default. SHA-512 still remains available through crypto_shash, but
individual architectures no longer need to handle it.
To match sha512_blocks(), change the type of the nblocks parameter of
the assembly functions from int or 'unsigned int' to size_t. Update the
ARMv8 CE assembly function accordingly. The scalar assembly function
actually already treated it as size_t.
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250630160320.2888-9-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Instead of exposing the arm-optimized SHA-512 code via arm-specific
crypto_shash algorithms, instead just implement the sha512_blocks()
library function. This is much simpler, it makes the SHA-512 (and
SHA-384) library functions be arm-optimized, and it fixes the
longstanding issue where the arm-optimized SHA-512 code was disabled by
default. SHA-512 still remains available through crypto_shash, but
individual architectures no longer need to handle it.
To match sha512_blocks(), change the type of the nblocks parameter of
the assembly functions from int to size_t. The assembly functions
actually already treated it as size_t.
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250630160320.2888-8-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Since HMAC support is commonly needed and is fairly simple, include it
as a first-class citizen of the SHA-512 library.
The API supports both incremental and one-shot computation, and either
preparing the key ahead of time or just using a raw key. The
implementation is much more streamlined than crypto/hmac.c.
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250630160320.2888-4-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Add basic support for SHA-384 and SHA-512 to lib/crypto/.
Various in-kernel users will be able to use this instead of the
old-school crypto API, which is harder to use and has more overhead.
The basic support added by this commit consists of the API and its
documentation, backed by a C implementation of the algorithms.
sha512_block_generic() is derived from crypto/sha512_generic.c.
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250630160320.2888-3-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Fix build warnings with W=1 that started appearing after
commit a934a57a42 ("scripts/misc-check: check missing #include
<linux/export.h> when W=1").
While at it, also sort the include lists alphabetically. (Keep
asm/irqflags.h last, as otherwise it doesn't build on alpha.)
This handles all of lib/crypto/, but not arch/*/lib/crypto/. The
exports in arch/*/lib/crypto/ will go away when the code is properly
integrated into lib/crypto/ as planned.
Link: https://lore.kernel.org/r/20250613184814.50173-1-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
or aren't considered necessary for -stable kernels. 5 are for MM.
-----BEGIN PGP SIGNATURE-----
iHUEABYKAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCaF8vtQAKCRDdBJ7gKXxA
jlK9AP9Syx5isoE7MAMKjr9iI/2z+NRaCCro/VM4oQk8m2cNFgD/ZsL9YMhjZlcL
bMIVUZ9E+yf1w9dLeHLoDba+pnF7Wwc=
=vdkO
-----END PGP SIGNATURE-----
Merge tag 'mm-hotfixes-stable-2025-06-27-16-56' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull misc fixes from Andrew Morton:
"16 hotfixes.
6 are cc:stable and the remainder address post-6.15 issues or aren't
considered necessary for -stable kernels. 5 are for MM"
* tag 'mm-hotfixes-stable-2025-06-27-16-56' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
MAINTAINERS: add Lorenzo as THP co-maintainer
mailmap: update Duje Mihanović's email address
selftests/mm: fix validate_addr() helper
crashdump: add CONFIG_KEYS dependency
mailmap: correct name for a historical account of Zijun Hu
mailmap: add entries for Zijun Hu
fuse: fix runtime warning on truncate_folio_batch_exceptionals()
scripts/gdb: fix dentry_name() lookup
mm/damon/sysfs-schemes: free old damon_sysfs_scheme_filter->memcg_path on write
mm/alloc_tag: fix the kmemleak false positive issue in the allocation of the percpu variable tag->counters
lib/group_cpus: fix NULL pointer dereference from group_cpus_evenly()
mm/hugetlb: remove unnecessary holding of hugetlb_lock
MAINTAINERS: add missing files to mm page alloc section
MAINTAINERS: add tree entry to mm init block
mm: add OOM killer maintainer structure
fs/proc/task_mmu: fix PAGE_IS_PFNZERO detection for the huge zero folio
* .rodata is no longer linkd into PT_DYNAMIC, it was not supposed to be
there in the first place and resultst in invalid (but unused) entries.
This manifests as at least warnings in llvm-readelf.
* A fix for runtime constants with all-0 upper 32-bits. This should
only manifest on MMU=n kernels.
* A fix for context save/restore on systems using the T-Head vector
extensions.
* A fix for a conflicting "+r"/"r" register constraint in the VDSO
getrandom syscall wrapper, which is undefined behavior in clang.
* A fix for a missing register clobber in the RVV raid6 implementation.
This manifests as a NULL pointer reference on some compilers, but
could trigger in other ways.
* Misaligned accesses from userspace at faulting addresses are now
handled correctly.
* A fix for an incorrect optimization that allowed access_ok() to mark
invalid addresses as accessible, which can result in userspace
triggering BUG()s.
* A few fixes for build warnings, and an update to Drew's email address.
-----BEGIN PGP SIGNATURE-----
iQJNBAABCAA3FiEEKzw3R0RoQ7JKlDp6LhMZ81+7GIkFAmhe80kZHHBhbG1lcmRh
YmJlbHRAZ29vZ2xlLmNvbQAKCRAuExnzX7sYicV6EACT/5384tdpYSQ6WQ4K2mT2
XxPbrYTJ4jrhZMugnfe1LHBokeBGoGPRK11Dr/PyNJ71oeeDF7opv0kxAfqsiOO3
QrwUE/4zhGgEzs7Z6D8UgYiqVDfb4aMU+oZ0qIfy+r+cB4F9M65TIejdVj99V6Hu
V9cjJ4ABM9KfaZhD5BvoqflblYtwuSg/VYsUmZH6aolDyadzTy4rWcPk1jdFJDQt
tIEsXjc92KNAKGSFe8DDZjjhM216Th/nUsZcxI2DLRQjjHPNEthkAgLNltQGocU9
gJ8U3IqfazgnqcZAlrr7BXlWYlBFH/wGXVsxuBL5LPov19RcTkjl2PWH7T08yyuv
lCGXrfkz3hSu+Sa9A40w4LptrKNWUEFJztaPkQ68gn1ZQP7KB/rsWp+82dCqhT35
RNxmSznLyTsHFRXR2n9fZrWX/F/LwxY7vaH7cTZUDkMHI8F7WP/3tlihxPCQaUHD
dIb+osch8puxG3YjO7H99WrpJamNNw3+L1l2lXtXTRmXdxE+x7fyatmHX98mY8IC
7NXGOdNNIEvv4i9vzSphYQHBOT3tBVfz40z878qfSL3xYHG3ZLMIsWuynaWDMI73
QprwAPmdFxdmJrHyIY6gIiyrscNHz5WLMjkG4K+jXlsBBmDxJMAY5zzNdFoeUVDz
tjnDY4DYc4fCnteKSA/hpw==
=42TO
-----END PGP SIGNATURE-----
Merge tag 'riscv-for-linus-5.16-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux
Pull RISC-V Fixes for 5.16-rc4
- .rodata is no longer linkd into PT_DYNAMIC.
It was not supposed to be there in the first place and resulted in
invalid (but unused) entries. This manifests as at least warnings in
llvm-readelf
- A fix for runtime constants with all-0 upper 32-bits. This should
only manifest on MMU=n kernels
- A fix for context save/restore on systems using the T-Head vector
extensions
- A fix for a conflicting "+r"/"r" register constraint in the VDSO
getrandom syscall wrapper, which is undefined behavior in clang
- A fix for a missing register clobber in the RVV raid6 implementation.
This manifests as a NULL pointer reference on some compilers, but
could trigger in other ways
- Misaligned accesses from userspace at faulting addresses are now
handled correctly
- A fix for an incorrect optimization that allowed access_ok() to mark
invalid addresses as accessible, which can result in userspace
triggering BUG()s
- A few fixes for build warnings, and an update to Drew's email address
* tag 'riscv-for-linus-5.16-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
riscv: export boot_cpu_hartid
Revert "riscv: Define TASK_SIZE_MAX for __access_ok()"
riscv: Fix sparse warning in vendor_extensions/sifive.c
Revert "riscv: misaligned: fix sleeping function called during misaligned access handling"
MAINTAINERS: Update Drew Fustini's email address
RISC-V: uaccess: Wrap the get_user_8 uaccess macro
raid6: riscv: Fix NULL pointer dereference caused by a missing clobber
RISC-V: vDSO: Correct inline assembly constraints in the getrandom syscall wrapper
riscv: vector: Fix context save/restore with xtheadvector
riscv: fix runtime constant support for nommu kernels
riscv: vdso: Exclude .rodata from the PT_DYNAMIC segment
To accommodate varying hardware performance and use cases,
the default kunit test case timeout (currently 300 seconds)
is now configurable. Users can adjust the timeout by
either setting the 'timeout' module parameter or the
KUNIT_DEFAULT_TIMEOUT Kconfig option to their desired
timeout in seconds.
Link: https://lore.kernel.org/r/20250626171730.1765004-1-marievic@google.com
Signed-off-by: Marie Zhussupova <marievic@google.com>
Reviewed-by: David Gow <davidgow@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Merge a fix from Jeff from a stable commit ID:
* ref_tracker: do xarray and workqueue job initializations earlier
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The kernel test robot reported an oops that occurred when attempting to
deregister a dentry from the xarray during subsys_initcall().
The ref_tracker xarrays and workqueue job are being initialized in
late_initcall() which is too late. Move those to postcore_initcall()
instead.
Fixes: 65b584f536 ("ref_tracker: automatically register a file in debugfs for a ref_tracker_dir")
Reported-by: kernel test robot <oliver.sang@intel.com>
Closes: https://lore.kernel.org/oe-lkp/202506251406.c28f2adb-lkp@intel.com
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Link: https://patch.msgid.link/20250626-reftrack-dbgfs-v1-1-812102e2a394@kernel.org
When loading a module, as long as the module has memory allocation
operations, kmemleak produces a false positive report that resembles the
following:
unreferenced object (percpu) 0x7dfd232a1650 (size 16):
comm "modprobe", pid 1301, jiffies 4294940249
hex dump (first 16 bytes on cpu 2):
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
backtrace (crc 0):
kmemleak_alloc_percpu+0xb4/0xd0
pcpu_alloc_noprof+0x700/0x1098
load_module+0xd4/0x348
codetag_module_init+0x20c/0x450
codetag_load_module+0x70/0xb8
load_module+0xef8/0x1608
init_module_from_file+0xec/0x158
idempotent_init_module+0x354/0x608
__arm64_sys_finit_module+0xbc/0x150
invoke_syscall+0xd4/0x258
el0_svc_common.constprop.0+0xb4/0x240
do_el0_svc+0x48/0x68
el0_svc+0x40/0xf8
el0t_64_sync_handler+0x10c/0x138
el0t_64_sync+0x1ac/0x1b0
This is because the module can only indirectly reference
alloc_tag_counters through the alloc_tag section, which misleads kmemleak.
However, we don't have a kmemleak ignore interface for percpu allocations
yet. So let's create one and invoke it for tag->counters.
[gehao@kylinos.cn: fix build error when CONFIG_DEBUG_KMEMLEAK=n, s/igonore/ignore/]
Link: https://lkml.kernel.org/r/20250620093102.2416767-1-hao.ge@linux.dev
Link: https://lkml.kernel.org/r/20250619183154.2122608-1-hao.ge@linux.dev
Fixes: 12ca42c237 ("alloc_tag: allocate percpu counters for module tags dynamically")
Signed-off-by: Hao Ge <gehao@kylinos.cn>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Suren Baghdasaryan <surenb@google.com> [lib/alloc_tag.c]
Cc: Kent Overstreet <kent.overstreet@linux.dev>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
While testing null_blk with configfs, echo 0 > poll_queues will trigger
following panic:
BUG: kernel NULL pointer dereference, address: 0000000000000010
Oops: Oops: 0000 [#1] SMP NOPTI
CPU: 27 UID: 0 PID: 920 Comm: bash Not tainted 6.15.0-02023-gadbdb95c8696-dirty #1238 PREEMPT(undef)
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.1-2.fc37 04/01/2014
RIP: 0010:__bitmap_or+0x48/0x70
Call Trace:
<TASK>
__group_cpus_evenly+0x822/0x8c0
group_cpus_evenly+0x2d9/0x490
blk_mq_map_queues+0x1e/0x110
null_map_queues+0xc9/0x170 [null_blk]
blk_mq_update_queue_map+0xdb/0x160
blk_mq_update_nr_hw_queues+0x22b/0x560
nullb_update_nr_hw_queues+0x71/0xf0 [null_blk]
nullb_device_poll_queues_store+0xa4/0x130 [null_blk]
configfs_write_iter+0x109/0x1d0
vfs_write+0x26e/0x6f0
ksys_write+0x79/0x180
__x64_sys_write+0x1d/0x30
x64_sys_call+0x45c4/0x45f0
do_syscall_64+0xa5/0x240
entry_SYSCALL_64_after_hwframe+0x76/0x7e
Root cause is that numgrps is set to 0, and ZERO_SIZE_PTR is returned from
kcalloc(), and later ZERO_SIZE_PTR will be deferenced.
Fix the problem by checking numgrps first in group_cpus_evenly(), and
return NULL directly if numgrps is zero.
[yukuai3@huawei.com: also fix the non-SMP version]
Link: https://lkml.kernel.org/r/20250620010958.1265984-1-yukuai1@huaweicloud.com
Link: https://lkml.kernel.org/r/20250619132655.3318883-1-yukuai1@huaweicloud.com
Fixes: 6a6dcae8f4 ("blk-mq: Build default queue map via group_cpus_evenly()")
Signed-off-by: Yu Kuai <yukuai3@huawei.com>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Reviewed-by: Jens Axboe <axboe@kernel.dk>
Cc: ErKun Yang <yangerkun@huawei.com>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Thomas Gleinxer <tglx@linutronix.de>
Cc: "zhangyi (F)" <yi.zhang@huawei.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Currently, the in-kernel kunit test case timeout is 300 seconds. (There
is a separate timeout mechanism for the whole test execution in
kunit.py, but that's unrelated.) However, tests marked 'slow' or 'very
slow' may timeout, particularly on slower machines.
Implement a multiplier to the test-case timeout, so that slower tests
have longer to complete:
- DEFAULT -> 1x default timeout
- KUNIT_SPEED_SLOW -> 3x default timeout
- KUNIT_SPEED_VERY_SLOW -> 12x default timeout
A further change is planned to allow user configuration of the
default/base timeout to allow people with faster or slower machines to
adjust these to their use-cases.
Link: https://lore.kernel.org/r/20250614084711.2654593-2-davidgow@google.com
Signed-off-by: Ujwal Jain <ujwaljain@google.com>
Co-developed-by: David Gow <davidgow@google.com>
Signed-off-by: David Gow <davidgow@google.com>
Reviewed-by: Rae Moar <rmoar@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Add test cases for static and dynamic minor number allocation and
deallocation.
While at it, improve description and test suite name.
Some of the cases include:
- that static and dynamic allocation reserved the expected minors.
- that registering duplicate minors or duplicate names will fail.
- that failing to create a sysfs file (due to duplicate names) will
deallocate the dynamic minor correctly.
- that dynamic allocation does not allocate a minor number in the static
range.
- that there are no collisions when mixing dynamic and static allocations.
- that opening devices with various minor device numbers work.
- that registering a static number in the dynamic range won't conflict with
a dynamic allocation.
This last test verifies the bug fixed by commit 6d04d2b554 ("misc:
misc_minor_alloc to use ida for all dynamic/misc dynamic minors") has not
regressed.
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@igalia.com>
Link: https://lore.kernel.org/r/20250612-misc-dynrange-v5-1-6f35048f7273@igalia.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Add a simple stress test for lib/ratelimit.c
To run on x86:
./tools/testing/kunit/kunit.py run --arch x86_64 --kconfig_add CONFIG_RATELIMIT_KUNIT_TEST=y --kconfig_add CONFIG_SMP=y --qemu_args "-smp 4" lib_ratelimit
On a 16-CPU system, the "4" in "-smp 4" can be varied between 1 and 8.
Larger numbers have higher probabilities of introducing delays that
break the smoke test. In the extreme case, increasing the number to
larger than the number of CPUs in the underlying system is an excellent
way to get a test failure.
Link: https://lore.kernel.org/all/fbe93a52-365e-47fe-93a4-44a44547d601@paulmck-laptop/
Link: https://lore.kernel.org/all/20250423115409.3425-1-spasswolf@web.de/
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Cc: Petr Mladek <pmladek@suse.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Kuniyuki Iwashima <kuniyu@amazon.com>
Cc: Mateusz Guzik <mjguzik@gmail.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: John Ogness <john.ogness@linutronix.de>
Cc: Sergey Senozhatsky <senozhatsky@chromium.org>
Cc: Jon Pan-Doh <pandoh@google.com>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Karolina Stolarek <karolina.stolarek@oracle.com>
The selftest fails most of the times when running in qemu with
a kernel configured with CONFIG_HZ = 250:
> test_ratelimit_smoke: 1 callbacks suppressed
> # test_ratelimit_smoke: ASSERTION FAILED at lib/tests/test_ratelimit.c:28
> Expected ___ratelimit(&testrl, "test_ratelimit_smoke") == (false), but
> ___ratelimit(&testrl, "test_ratelimit_smoke") == 1 (0x1)
> (false) == 0 (0x0)
Try to make the test slightly more reliable by calling the problematic
ratelimit in the middle of the interval.
Signed-off-by: Petr Mladek <pmladek@suse.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Add a simple single-threaded smoke test for lib/ratelimit.c
To run on x86:
make ARCH=x86_64 mrproper
./tools/testing/kunit/kunit.py run --arch x86_64 --kconfig_add CONFIG_RATELIMIT_KUNIT_TEST=y --kconfig_add CONFIG_SMP=y lib_ratelimit
This will fail on old ___ratelimit(), and subsequent patches provide
the fixes that are required.
[ paulmck: Apply timeout and kunit feedback from Petr Mladek. ]
Link: https://lore.kernel.org/all/fbe93a52-365e-47fe-93a4-44a44547d601@paulmck-laptop/
Link: https://lore.kernel.org/all/20250423115409.3425-1-spasswolf@web.de/
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Kuniyuki Iwashima <kuniyu@amazon.com>
Cc: Mateusz Guzik <mjguzik@gmail.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: John Ogness <john.ogness@linutronix.de>
Cc: Sergey Senozhatsky <senozhatsky@chromium.org>
Cc: Jon Pan-Doh <pandoh@google.com>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Karolina Stolarek <karolina.stolarek@oracle.com>
With sanitizers enabled, this function uses a lot of stack, causing
a harmless warning:
lib/test_objagg.c: In function 'test_hints_case.constprop':
lib/test_objagg.c:994:1: error: the frame size of 1440 bytes is larger than 1408 bytes [-Werror=frame-larger-than=]
Most of this is from the two 'struct world' structures. Since most of
the work in this function is duplicated for the two, split it up into
separate functions that each use one of them.
The combined stack usage is still the same here, but there is no warning
any more, and the code is still safe because of the known call chain.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Link: https://patch.msgid.link/20250620111907.3395296-1-arnd@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
or aren't considered necessary for -stable kernels. Only 4 are for MM.
- The 3 patch series `Revert "bcache: update min_heap_callbacks to use
default builtin swap"' from Kuan-Wei Chiu backs out the author's recent
min_heap changes due to a performance regression. A fix for this
regression has been developed but we felt it best to go back to the
known-good version to give the new code more bake time.
- A lot of MAINTAINERS maintenance. I like to get these changes
upstreamed promptly because they can't break things and more
accurate/complete MAINTAINERS info hopefully improves the speed and
accuracy of our responses to submitters and reporters.
-----BEGIN PGP SIGNATURE-----
iHUEABYKAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCaFizWwAKCRDdBJ7gKXxA
jhivAQDGQXgzgzPCu/5/fTQjjq+D/8M2QjGxNy4o1itKoK+fYAEAzQGTL/8ay9FY
yhcipreU4A3lrxf94iOidiBCYkZaOgk=
=kFFb
-----END PGP SIGNATURE-----
Merge tag 'mm-hotfixes-stable-2025-06-22-18-52' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull misc fixes from Andrew Morton:
"20 hotfixes. 7 are cc:stable and the remainder address post-6.15
issues or aren't considered necessary for -stable kernels. Only 4 are
for MM.
- The series `Revert "bcache: update min_heap_callbacks to use
default builtin swap"' from Kuan-Wei Chiu backs out the author's
recent min_heap changes due to a performance regression.
A fix for this regression has been developed but we felt it best to
go back to the known-good version to give the new code more bake
time.
- A lot of MAINTAINERS maintenance.
I like to get these changes upstreamed promptly because they can't
break things and more accurate/complete MAINTAINERS info hopefully
improves the speed and accuracy of our responses to submitters and
reporters"
* tag 'mm-hotfixes-stable-2025-06-22-18-52' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
MAINTAINERS: add additional mmap-related files to mmap section
MAINTAINERS: add memfd, shmem quota files to shmem section
MAINTAINERS: add stray rmap file to mm rmap section
MAINTAINERS: add hugetlb_cgroup.c to hugetlb section
MAINTAINERS: add further init files to mm init block
MAINTAINERS: update maintainers for HugeTLB
maple_tree: fix MA_STATE_PREALLOC flag in mas_preallocate()
MAINTAINERS: add missing test files to mm gup section
MAINTAINERS: add missing mm/workingset.c file to mm reclaim section
selftests/mm: skip uprobe vma merge test if uprobes are not enabled
bcache: remove unnecessary select MIN_HEAP
Revert "bcache: remove heap-related macros and switch to generic min_heap"
Revert "bcache: update min_heap_callbacks to use default builtin swap"
selftests/mm: add configs to fix testcase failure
kho: initialize tail pages for higher order folios properly
MAINTAINERS: add linux-mm@ list to Kexec Handover
mm: userfaultfd: fix race of userfaultfd_move and swap cache
mm/gup: revert "mm: gup: fix infinite loop within __get_longterm_locked"
selftests/mm: increase timeout from 180 to 900 seconds
mm/shmem, swap: fix softlockup with mTHP swapin
and reinstates a Kconfig option to control the extra self-tests.
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEn51F/lCuNhUwmDeSxycdCkmxi6cFAmhSgQIACgkQxycdCkmx
i6eWig//aNg4YL30eTh41eTWTCiA1PLZpyOE2/Wz7q/Yg4M0Refn85A+tREm18q+
uwuZKAoFz8VaF0trqSQQ3PFzZaJWWRn0yLqeToxGyd7sY9kBh93FdQLub8wTxO0F
qDPLnAR+Gt7VAGcYSjhyB/TCsJ5h6oRN87qMIr8g807SiIB6mHiuXxJAAKy1U7OD
cXafp3HTkzUjgk/wbj7qSK6HJR3Cq3o/3JmsE/D7yvJRH1Bx7mNoiRpEX17CkgQX
qVZmLj8lE4HzFpTLKBAY8sXlzxscN+rHnS5WUhTqWL1hAI2b52p1moJPzT9QM/Zb
yI+x1DbO21Pvr4mZJ/hX18Y9VvTbea0hkD/wFD+hKJyQ9j70B8/bBeT/sOxKqDZn
0G1o9UyVTNdw4m2m/6lYJBgG0yiuD3hZID+Wjgq6lOsfoVBThU3CWq11NW98HQKz
0VUWztcG7JTqM1wUwwjlMXnm8+WKwiuYqYZCwBl8o0Ii29/Sm0pGMXtiDqmWFWLA
a4FJNFxiKEfVA95yRuRPfEM7KMwRWdw2C9YGe6hk3kcUbfDYSJykUme/USFzz8X8
5lmwWESNggggQEw9BxUAILIzRZwsDhCakgRjd11JRbNjrNTwXIbP9+nv+LH91mPK
zm5DJqyqSUVr2iXeQYYH/etyRsMX+dAuWPrFvvjuDBb8/fgEce4=
=6/TP
-----END PGP SIGNATURE-----
Merge tag 'v6.16-p5' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
Pull crypto fixes from Herbert Xu:
"This fixes a regression in ahash (broken fallback finup) and
reinstates a Kconfig option to control the extra self-tests"
* tag 'v6.16-p5' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
crypto: ahash - Fix infinite recursion in ahash_def_finup
crypto: testmgr - reinstate kconfig control over full self-tests
Temporarily clear the preallocation flag when explicitly requesting
allocations. Pre-existing allocations are already counted against the
request through mas_node_count_gfp(), but the allocations will not happen
if the MA_STATE_PREALLOC flag is set. This flag is meant to avoid
re-allocating in bulk allocation mode, and to detect issues with
preallocation calculations.
The MA_STATE_PREALLOC flag should also always be set on zero allocations
so that detection of underflow allocations will print a WARN_ON() during
consumption.
User visible effect of this flaw is a WARN_ON() followed by a null pointer
dereference when subsequent requests for larger number of nodes is
ignored, such as the vma merge retry in mmap_region() caused by drivers
altering the vma flags (which happens in v6.6, at least)
Link: https://lkml.kernel.org/r/20250616184521.3382795-3-Liam.Howlett@oracle.com
Fixes: 54a611b605 ("Maple Tree: add new data structure")
Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Reported-by: Zhaoyang Huang <zhaoyang.huang@unisoc.com>
Reported-by: Hailong Liu <hailong.liu@oppo.com>
Link: https://lore.kernel.org/all/1652f7eb-a51b-4fee-8058-c73af63bacd1@oppo.com/
Link: https://lore.kernel.org/all/20250428184058.1416274-1-Liam.Howlett@oracle.com/
Link: https://lore.kernel.org/all/20250429014754.1479118-1-Liam.Howlett@oracle.com/
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Hailong Liu <hailong.liu@oppo.com>
Cc: zhangpeng.00@bytedance.com <zhangpeng.00@bytedance.com>
Cc: Steve Kang <Steve.Kang@unisoc.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Sidhartha Kumar <sidhartha.kumar@oracle.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Now that we have dentries and the ability to create meaningful symlinks
to them, don't keep a name string in each tracker. Switch the output
format to print "class@address", and drop the name field.
Also, add a kerneldoc header for ref_tracker_dir_init().
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Link: https://patch.msgid.link/20250618-reftrack-dbgfs-v15-9-24fc37ead144@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Add the ability for a subsystem to add a user-friendly symlink that
points to a ref_tracker_dir's debugfs file. Add a separate
debugfs_symlinks xarray and use that to track symlinks. The reaper
workqueue job will remove symlinks before their corresponding dentries.
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Link: https://patch.msgid.link/20250618-reftrack-dbgfs-v15-7-24fc37ead144@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Currently, there is no convenient way to see the info that the
ref_tracking infrastructure collects. Attempt to create a file in
debugfs when called from ref_tracker_dir_init().
The file is given the name "class@%px", as having the unmodified address
is helpful for debugging. This should be safe since this directory is only
accessible by root
While ref_tracker_dir_init() is generally called from a context where
sleeping is OK, ref_tracker_dir_exit() can be called from anywhere.
Thus, dentry cleanup must be handled asynchronously.
Add a new global xarray that has entries with the ref_tracker_dir
pointer as the index and the corresponding debugfs dentry pointer as the
value. Instead of removing the debugfs dentry, have
ref_tracker_dir_exit() set a mark on the xarray entry and schedule a
workqueue job. The workqueue job then walks the xarray looking for
marked entries, and removes their xarray entries and the debugfs
dentries.
Because of this, the debugfs dentry can outlive the corresponding
ref_tracker_dir. Have ref_tracker_debugfs_show() take extra care to
ensure that it's safe to dereference the dir pointer before using it.
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Link: https://patch.msgid.link/20250618-reftrack-dbgfs-v15-6-24fc37ead144@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Allow pr_ostream to also output directly to a seq_file without an
intermediate buffer. The first caller of +ref_tracker_dir_seq_print()
will come in a later patch, so mark that __maybe_unused for now. That
designation will be removed once it is used.
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Link: https://patch.msgid.link/20250618-reftrack-dbgfs-v15-5-24fc37ead144@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
A later patch in the series will be adding debugfs files for each
ref_tracker that get created in ref_tracker_dir_init(). The format will
be "class@%px". The current "name" string can vary between
ref_tracker_dir objects of the same type, so it's not suitable for this
purpose.
Add a new "class" string to the ref_tracker dir that describes the
the type of object (sans any individual info for that object).
Also, in the i915 driver, gate the creation of debugfs files on whether
the dentry pointer is still set to NULL. CI has shown that the
ref_tracker_dir can be initialized more than once.
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Link: https://patch.msgid.link/20250618-reftrack-dbgfs-v15-4-24fc37ead144@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
In a later patch, we'll be adding a 3rd mechanism for outputting
ref_tracker info via seq_file. Instead of a conditional, have the caller
set a pointer to an output function in struct ostream. As part of this,
the log prefix must be explicitly passed in, as it's too late for the
pr_fmt macro.
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Link: https://patch.msgid.link/20250618-reftrack-dbgfs-v15-3-24fc37ead144@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Add a new "ref_tracker" directory in debugfs. Each individual refcount
tracker can register files under there to display info about
currently-held references.
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Krzysztof Karas <krzysztof.karas@intel.com>
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Link: https://patch.msgid.link/20250618-reftrack-dbgfs-v15-2-24fc37ead144@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
requires adding support for a number of new FW commands so
it's quite large in terms of LoC. The rest is relatively small.
Current release - fix to a fix:
- ptp: fix breakage after ptp_vclock_in_use() rework
Current release - regressions:
- openvswitch: allocate struct ovs_pcpu_storage dynamically, static
allocation may exhaust module loader limit on smaller systems
Previous releases - regressions:
- tcp: fix tcp_packet_delayed() for peers with no selective ACK support
Previous releases - always broken:
- wifi: ath12k: don't activate more links than firmware supports
- tcp: make sure sockets open via passive TFO have valid NAPI ID
- eth: bnxt_en: update MRU and RSS table of RSS contexts on queue reset,
prevent Rx queues from silently hanging after queue reset
- NFC: uart: set tty->disc_data only in success path
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEE6jPA+I1ugmIBA4hXMUZtbf5SIrsFAmhUPLgACgkQMUZtbf5S
IrvrXxAAvREuPy2BhCrQW77TeVW2ikBlGYYp+dCIL4r4kfqquHU3C+R/GZ5yRbcA
fiRMkRv7J4xhHOT8AnFmFiWGz6QEJfESy3jSQUfcLssHeandEjJbcZ0qVunEpE4e
rt0leygfY+axEP3ewoGQEP7Ds2Ku+L3BrIetwsUZPFo9pVAtGJ4XOV56SRYPP7vL
7toaHcxXtx5vL9Wlb9vulom9kfZU4/hfDMS9/FES+aE4QnRDB+sh0v6wL4pocLrj
7swSqWCY7vjd7vJck4Ta/O9+bL6kz1wG4Qvv5enXGJlksiBgCpShfbzGMZTbegeP
8gCj4heAATCRmzCmfkRk5sWzPqedtLI5rzb7jUYYF4wiZpJwjwLzrDxopazFogXs
kKDCD3lxDmUTV4AkT/5hn8Rlf4CApMrhPs+zhzQQ620Sj/G3k2lyepKmKjOUSAZ4
fEFKmP9Mzt3fKt2sOi5ETlrpuiSv5CCAZfnCg7UoBXrQM1Rxyn+utLMeGM6oHXRa
iH9GwVN2cycEXl16y8HVCui8lzBYyZKVbyiRBM6k3Pl6ZozDuHvCRmHgs8g9pTja
3hGe2wjYcsi+HKrF1zDQrTJbhKzsNMpJej0KIZ06xjyXAgr1VV0noEq/iQsPEj4z
033DotqhyoFXc6ERmOaca8LfE2kHolwT+0leO+i0sbdmVlSFxB8=
=2mAa
-----END PGP SIGNATURE-----
Merge tag 'net-6.16-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Jakub Kicinski:
"Including fixes from wireless.
The ath12k fix to avoid FW crashes requires adding support for a
number of new FW commands so it's quite large in terms of LoC. The
rest is relatively small.
Current release - fix to a fix:
- ptp: fix breakage after ptp_vclock_in_use() rework
Current release - regressions:
- openvswitch: allocate struct ovs_pcpu_storage dynamically, static
allocation may exhaust module loader limit on smaller systems
Previous releases - regressions:
- tcp: fix tcp_packet_delayed() for peers with no selective ACK
support
Previous releases - always broken:
- wifi: ath12k: don't activate more links than firmware supports
- tcp: make sure sockets open via passive TFO have valid NAPI ID
- eth: bnxt_en: update MRU and RSS table of RSS contexts on queue
reset, prevent Rx queues from silently hanging after queue reset
- NFC: uart: set tty->disc_data only in success path"
* tag 'net-6.16-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (59 commits)
net: airoha: Differentiate hwfd buffer size for QDMA0 and QDMA1
net: airoha: Compute number of descriptors according to reserved memory size
tools: ynl: fix mixing ops and notifications on one socket
net: atm: fix /proc/net/atm/lec handling
net: atm: add lec_mutex
mlxbf_gige: return EPROBE_DEFER if PHY IRQ is not available
net: airoha: Always check return value from airoha_ppe_foe_get_entry()
NFC: nci: uart: Set tty->disc_data only in success path
calipso: Fix null-ptr-deref in calipso_req_{set,del}attr().
MAINTAINERS: Remove Shannon Nelson from MAINTAINERS file
net: lan743x: fix potential out-of-bounds write in lan743x_ptp_io_event_clock_get()
eth: fbnic: avoid double free when failing to DMA-map FW msg
tcp: fix passive TFO socket having invalid NAPI ID
selftests: net: add test for passive TFO socket NAPI ID
selftests: net: add passive TFO test binary
selftests: netdevsim: improve lib.sh include in peer.sh
tipc: fix null-ptr-deref when acquiring remote ip of ethernet bearer
Octeontx2-pf: Fix Backpresure configuration
net: ftgmac100: select FIXED_PHY
net: ethtool: remove duplicate defines for family info
...