Implement sha3_absorb_blocks() and sha3_keccakf() using the hardware-
accelerated SHA-3 support in Message-Security-Assist Extension 6.
This accelerates the SHA3-224, SHA3-256, SHA3-384, SHA3-512, and
SHAKE256 library functions.
Note that arch/s390/crypto/ already has SHA-3 code that uses this
extension, but it is exposed only via crypto_shash. This commit brings
the same acceleration to the SHA-3 library. The arch/s390/crypto/
version will become redundant and be removed in later changes.
Tested-by: Harald Freudenberger <freude@linux.ibm.com>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20251026055032.1413733-11-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Instead of exposing the arm64-optimized SHA-3 code via arm64-specific
crypto_shash algorithms, instead just implement the sha3_absorb_blocks()
and sha3_keccakf() library functions. This is much simpler, it makes
the SHA-3 library functions be arm64-optimized, and it fixes the
longstanding issue where the arm64-optimized SHA-3 code was disabled by
default. SHA-3 still remains available through crypto_shash, but
individual architectures no longer need to handle it.
Note: to see the diff from arch/arm64/crypto/sha3-ce-glue.c to
lib/crypto/arm64/sha3.h, view this commit with 'git show -M10'.
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20251026055032.1413733-10-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Since the SHA-3 algorithms are FIPS-approved, add the boot-time
self-test which is apparently required. This closely follows the
corresponding SHA-1, SHA-256, and SHA-512 tests.
Tested-by: Harald Freudenberger <freude@linux.ibm.com>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20251026055032.1413733-8-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
In crypto/sha3_generic.c, the keccakf() function calls keccakf_round()
to do four of Keccak-f's five step mappings. However, it does not do
the Iota step mapping - presumably because that is dependent on round
number, whereas Theta, Rho, Pi and Chi are not.
Note that the keccakf_round() function needs to be explicitly
non-inlined on certain architectures as gcc's produced output will (or
used to) use over 1KiB of stack space if inlined.
Now, this code was copied more or less verbatim into lib/crypto/sha3.c,
so that has the same aesthetic issue. Fix this there by passing the
round number into sha3_keccakf_one_round_generic() and doing the Iota
step mapping there.
crypto/sha3_generic.c is left untouched as that will be converted to use
lib/crypto/sha3.c at some point.
Suggested-by: Eric Biggers <ebiggers@kernel.org>
Signed-off-by: David Howells <dhowells@redhat.com>
Tested-by: Harald Freudenberger <freude@linux.ibm.com>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20251026055032.1413733-5-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Add SHA-3 support to lib/crypto/. All six algorithms in the SHA-3
family are supported: four digests (SHA3-224, SHA3-256, SHA3-384, and
SHA3-512) and two extendable-output functions (SHAKE128 and SHAKE256).
The SHAKE algorithms will be required for ML-DSA.
[EB: simplified the API to use fewer types and functions, fixed bug that
sometimes caused incorrect SHAKE output, cleaned up the
documentation, dropped an ad-hoc test that was inconsistent with
the rest of lib/crypto/, and many other cleanups]
Signed-off-by: David Howells <dhowells@redhat.com>
Co-developed-by: Eric Biggers <ebiggers@kernel.org>
Tested-by: Harald Freudenberger <freude@linux.ibm.com>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20251026055032.1413733-4-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
On big endian arm kernels, the arm optimized Curve25519 code produces
incorrect outputs and fails the Curve25519 test. This has been true
ever since this code was added.
It seems that hardly anyone (or even no one?) actually uses big endian
arm kernels. But as long as they're ostensibly supported, we should
disable this code on them so that it's not accidentally used.
Note: for future-proofing, use !CPU_BIG_ENDIAN instead of
CPU_LITTLE_ENDIAN. Both of these are arch-specific options that could
get removed in the future if big endian support gets dropped.
Fixes: d8f1308a02 ("crypto: arm/curve25519 - wire up NEON implementation")
Cc: stable@vger.kernel.org
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20251104054906.716914-1-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Commit 2f13daee2a ("lib/crypto/curve25519-hacl64: Disable KASAN with
clang-17 and older") inadvertently disabled KASAN in curve25519-hacl64.o
for GCC unconditionally because clang-min-version will always evaluate
to nothing for GCC. Add a check for CONFIG_CC_IS_CLANG to avoid applying
the workaround for GCC, which is only needed for clang-17 and older.
Cc: stable@vger.kernel.org
Fixes: 2f13daee2a ("lib/crypto/curve25519-hacl64: Disable KASAN with clang-17 and older")
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20251103-curve25519-hacl64-fix-kasan-workaround-v2-1-ab581cbd8035@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Clarify that the return values of vsprintf() and sprintf() exclude the
trailing NUL character.
Signed-off-by: Thorsten Blum <thorsten.blum@linux.dev>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Link: https://patch.msgid.link/20251103090913.2066-2-thorsten.blum@linux.dev
Signed-off-by: Petr Mladek <pmladek@suse.com>
- Formally adopt Kconfig in MAINTAINERS
- Fix install-extmod-build for more O= paths
- Align end of .modinfo to fix Authenticode calculation in EDK2
- Restore dynamic check for '-fsanitize=kernel-memory' in
CONFIG_HAVE_KMSAN_COMPILER to ensure backend target has support for
it
- Initialize locale in menuconfig and nconfig to fix UTF-8 terminals
that may not support VT100 ACS by default like PuTTY
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
-----BEGIN PGP SIGNATURE-----
iHUEABYKAB0WIQR74yXHMTGczQHYypIdayaRccAalgUCaQWN/gAKCRAdayaRccAa
ljUzAQCGh5zhfwsCfMc5M7/9v2FiEBKNzTc/xTK04vWdZ3A39QEA6XuyVx+QPBkS
ts7gPrR6NC7AokdqTHXKhBW/ebaNOw0=
=GmCb
-----END PGP SIGNATURE-----
Merge tag 'kbuild-fixes-6.18-2' of git://git.kernel.org/pub/scm/linux/kernel/git/kbuild/linux
Pull Kbuild fixes from Nathan Chancellor:
- Formally adopt Kconfig in MAINTAINERS
- Fix install-extmod-build for more O= paths
- Align end of .modinfo to fix Authenticode calculation in EDK2
- Restore dynamic check for '-fsanitize=kernel-memory' in
CONFIG_HAVE_KMSAN_COMPILER to ensure backend target has support
for it
- Initialize locale in menuconfig and nconfig to fix UTF-8 terminals
that may not support VT100 ACS by default like PuTTY
* tag 'kbuild-fixes-6.18-2' of git://git.kernel.org/pub/scm/linux/kernel/git/kbuild/linux:
kconfig/nconf: Initialize the default locale at startup
kconfig/mconf: Initialize the default locale at startup
KMSAN: Restore dynamic check for '-fsanitize=kernel-memory'
kbuild: align modinfo section for Secureboot Authenticode EDK2 compat
kbuild: install-extmod-build: Fix when given dir outside the build dir
MAINTAINERS: Update Kconfig section
Fixes log overwrite in param_tests and fixes incorrect cast of priv
pointer in test_dev_action(). Updates email address for Rae Moar in
MAINTAINERS KUnit entry.
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEPZKym/RZuOCGeA/kCwJExA0NQxwFAmkDxT0ACgkQCwJExA0N
QxyDaxAAxyE7WU1vUg7AakYo80p5wku/ehahQCicePW0cvKiSwI+VJ3o4A9e7qxI
l0xnvLaisqPzwrSZ4GxCi/0L50OpoFTCAyEeey88tNeqEcjR/b4CDYfJoftCau+b
/4+jluhuCbSC3HQbq77aww1Vgoo184hisAdCvRj0qiPKLk7+D4x1xKMKGCBZYyNP
Jq2sb4H7TJ9rKpRoN9xi+9YWmJGI8EPQHJ1A4FBoS2vk6JyOGlbpJcID1XROtNwz
19tQ9eNarcHSq7yf/Z20BHv0v53ZEvn/c9GYEd1Z5/NtAvPIbyY1U8EhDH9smoAs
0+bDUddZOj1JoQ5Q5TXTx7onKLFpecAIhSGPjBn7mETmIdv0lbtfcJ6aBe0FQTz3
jFfSOBPwd1ng5RVM6FXsJf+JW11Cyba/reGvgtcZDrg8hMluIWUK4N3761KYQnt/
+IoJeJ3A791HNzLDoXKeA6uR0w55LtkJpEnDWi1UqAZE+OOUSx+p1LcGgUAIHxyU
MRaIS4+ZoeVmgopQCSDk2KbiCp8d+yek6Upbm5ivNrwJ00MZqgH5oZA/O8W/shMw
0vjc7mVTpefCSq3EudkQKCT+cB7nh3WAH6devsUNuZlQ7kL1o5Css5bmVF0Iswa7
j+KFuZEiOlOmZgQ+zpzrQWgI+Pu+ZqQg4W8KA3NXwg86blARRfg=
=stUG
-----END PGP SIGNATURE-----
Merge tag 'linux_kselftest-kunit-fixes-6.18-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest
Pull kunit fixes from Shuah Khan:
"Fix log overwrite in param_tests and fixes incorrect cast of priv
pointer in test_dev_action().
Update email address for Rae Moar in MAINTAINERS KUnit entry"
* tag 'linux_kselftest-kunit-fixes-6.18-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
MAINTAINERS: Update KUnit email address for Rae Moar
kunit: prevent log overwrite in param_tests
kunit: test_dev_action: Correctly cast 'priv' pointer to long*
Migrate the arm-optimized BLAKE2b code from arch/arm/crypto/ to
lib/crypto/arm/. This makes the BLAKE2b library able to use it, and it
also simplifies the code because it's easier to integrate with the
library than crypto_shash.
This temporarily makes the arm-optimized BLAKE2b code unavailable via
crypto_shash. A later commit reimplements the blake2b-* crypto_shash
algorithms on top of the BLAKE2b library API, making it available again.
Note that as per the lib/crypto/ convention, the optimized code is now
enabled by default. So, this also fixes the longstanding issue where
the optimized BLAKE2b code was not enabled by default.
To see the diff from arch/arm/crypto/blake2b-neon-glue.c to
lib/crypto/arm/blake2b.h, view this commit with 'git show -M10'.
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20251018043106.375964-8-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Add a library API for BLAKE2b, closely modeled after the BLAKE2s API.
This will allow in-kernel users such as btrfs to use BLAKE2b without
going through the generic crypto layer. In addition, as usual the
BLAKE2b crypto_shash algorithms will be reimplemented on top of this.
Note: to create lib/crypto/blake2b.c I made a copy of
lib/crypto/blake2s.c and made the updates from BLAKE2s => BLAKE2b. This
way, the BLAKE2s and BLAKE2b code is kept consistent. Therefore, it
borrows the SPDX-License-Identifier and Copyright from
lib/crypto/blake2s.c rather than crypto/blake2b_generic.c.
The library API uses 'struct blake2b_ctx', consistent with other
lib/crypto/ APIs. The existing 'struct blake2b_state' will be removed
once the blake2b crypto_shash algorithms are updated to stop using it.
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20251018043106.375964-7-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
A couple more small cleanups to the BLAKE2s code before these things get
propagated into the BLAKE2b code:
- Drop 'const' from some non-pointer function parameters. It was a bit
excessive and not conventional.
- Rename 'block' argument of blake2s_compress*() to 'data'. This is for
consistency with the SHA-* code, and also to avoid the implication
that it points to a singular "block".
No functional changes.
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20251018043106.375964-4-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
For consistency with the SHA-1, SHA-2, SHA-3 (in development), and MD5
library APIs, rename blake2s_state to blake2s_ctx.
As a refresher, the ctx name:
- Is a bit shorter.
- Avoids confusion with the compression function state, which is also
often called the state (but is just part of the full context).
- Is consistent with OpenSSL.
Not a big deal, of course. But consistency is nice. With a BLAKE2b
library API about to be added, this is a convenient time to update this.
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20251018043106.375964-3-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Reorder the parameters of blake2s() from (out, in, key, outlen, inlen,
keylen) to (key, keylen, in, inlen, out, outlen).
This aligns BLAKE2s with the common conventions of pairing buffers and
their lengths, and having outputs follow inputs. This is widely used
elsewhere in lib/crypto/ and crypto/, and even elsewhere in the BLAKE2s
code itself such as blake2s_init_key() and blake2s_final(). So
blake2s() was a bit of an exception.
Notably, this results in the same order as hmac_*_usingrawkey().
Note that since the type signature changed, it's not possible for a
blake2s() call site to be silently missed.
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20251018043106.375964-2-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Add FIPS cryptographic algorithm self-tests for all SHA-1 and SHA-2
algorithms. Following the "Implementation Guidance for FIPS 140-3"
document, to achieve this it's sufficient to just test a single test
vector for each of HMAC-SHA1, HMAC-SHA256, and HMAC-SHA512.
Just run these tests in the initcalls, following the example of e.g.
crypto/kdf_sp800108.c. Note that this should meet the FIPS self-test
requirement even in the built-in case, given that the initcalls run
before userspace, storage, network, etc. are accessible.
This does not fix a regression, seeing as lib/ has had SHA-1 support
since 2005 and SHA-256 support since 2018. Neither ever had FIPS
self-tests. Moreover, fips=1 support has always been an unfinished
feature upstream. However, with lib/ now being used more widely, it's
now seeing more scrutiny and people seem to want these now [1][2].
[1] https://lore.kernel.org/r/3226361.1758126043@warthog.procyon.org.uk/
[2] https://lore.kernel.org/r/f31dbb22-0add-481c-aee0-e337a7731f8e@oracle.com/
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20251011001047.51886-1-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Commit 5ff8c11775 ("KMSAN: Remove tautological checks") changed
CONFIG_HAVE_KMSAN_COMPILER from a dynamic check for
'-fsanitize=kernel-memory' to just being true for CONFIG_CC_IS_CLANG.
This missed the fact that not all architectures supported
'-fsanitize=kernel-memory' at the same time. For example, SystemZ / s390
gained support for KMSAN in clang-18 [1], so builds with clang-15
through clang-17 can select KMSAN but they error with:
clang-16: error: unsupported option '-fsanitize=kernel-memory' for target 's390x-unknown-linux-gnu'
Restore the cc-option check for '-fsanitize=kernel-memory' to make sure
the compiler target properly supports '-fsanitize=kernel-memory'. The
check for '-msan-disable-checks=1' does not need to be restored because
all supported clang versions for building the kernel support it.
Fixes: 5ff8c11775 ("KMSAN: Remove tautological checks")
Link: a3e56a8792 [1]
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/r/202510220236.AVuXXCYy-lkp@intel.com/
Acked-by: Nicolas Schier <nsc@kernel.org>
Link: https://patch.msgid.link/20251023-fix-kmsan-check-s390-clang-v1-1-4e6df477a4cc@kernel.org
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
When running parameterized tests, each test case is initialized with
kunit_init_test(). This function takes the test_case->log as a parameter
but it clears it via string_stream_clear() on each iteration.
This results in only the log from the last parameter being preserved in
the test_case->log and the results from the previous parameters are lost
from the debugfs entry.
Fix this by manually setting the param_test.log to the test_case->log
after it has been initialized. This prevents kunit_init_test() from
clearing the log on each iteration.
Link: https://lore.kernel.org/r/20251024190101.2091549-1-cmllamas@google.com
Fixes: 4b59300ba4 ("kunit: Add parent kunit for parameterized test context")
Signed-off-by: Carlos Llamas <cmllamas@google.com>
Reviewed-by: David Gow <davidgow@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
freader_fetch currently reads from at most two folios. When a read spans
into a third folio, the overflow bytes are copied adjacent to the second
folio’s data instead of being handled as a separate folio.
This patch modifies fetch algorithm to support reading from many folios.
Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com>
Reviewed-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20251026203853.135105-5-mykyta.yatsenko5@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Move struct freader and prototypes of the functions operating on it into
the buildid.h.
This allows reusing freader outside buildid, e.g. for file dynptr
support added later.
Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com>
Link: https://lore.kernel.org/r/20251026203853.135105-4-mykyta.yatsenko5@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
This old alias for in_hardirq() has been marked as deprecated since
2020; remove the stragglers.
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://patch.msgid.link/20251024180654.1691095-1-willy@infradead.org
The previous implementation incorrectly assumed the original type of
'priv' was void**, leading to an unnecessary and misleading
cast. Correct the cast of the 'priv' pointer in test_dev_action() to
its actual type, long*, removing an unnecessary cast.
As an additional benefit, this fixes an out-of-bounds CHERI fault on
hardware with architectural capabilities. The original implementation
tried to store a capability-sized pointer using the priv
pointer. However, the priv pointer's capability only granted access to
the memory region of its original long type, leading to a bounds
violation since the size of a long is smaller than the size of a
capability. This change ensures that the pointer usage respects the
capabilities' bounds.
Link: https://lore.kernel.org/r/20251017092814.80022-1-florian.schmaus@codasip.com
Fixes: d03c720e03 ("kunit: Add APIs for managing devices")
Reviewed-by: David Gow <davidgow@google.com>
Signed-off-by: Florian Schmaus <florian.schmaus@codasip.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
For consistency with the other function templates, change
_subtree_search_*() to use the user-supplied ITSTATIC rather than the
hard-coded 'static'.
Acked-by: Petr Mladek <pmladek@suse.com>
Tested-by: Joe Lawrence <joe.lawrence@redhat.com>
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
KHO test stores physical addresses of the preserved folios directly in
fdt. Use kho_preserve_vmalloc() instead of it and kho_restore_vmalloc()
to retrieve the addresses after kexec.
This makes the test more scalable from one side and adds tests coverage
for kho_preserve_vmalloc() from the other.
Link: https://lkml.kernel.org/r/20250921054458.4043761-5-rppt@kernel.org
Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Reviewed-by: Pratyush Yadav <pratyush@kernel.org>
Cc: Alexander Graf <graf@amazon.com>
Cc: Baoquan He <bhe@redhat.com>
Cc: Changyuan Lyu <changyuanl@google.com>
Cc: Chris Li <chrisl@kernel.org>
Cc: Jason Gunthorpe <jgg@nvidia.com>
Cc: Pasha Tatashin <pasha.tatashin@soleen.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-----BEGIN PGP SIGNATURE-----
iIoEABYKADIWIQQdXVVFGN5XqKr1Hj7LwZzRsCrn5QUCaOAhtRQcem9oYXJAbGlu
dXguaWJtLmNvbQAKCRDLwZzRsCrn5bGkAP91Vptp6rJj3R0N0M338tzkVaeybA+V
fjOHnuB8AvF+FwD/XGLF+lcIFiB8GizXGf/OoyXJwYwciNU7hEdONh5dzQ0=
=euve
-----END PGP SIGNATURE-----
Merge tag 'integrity-v6.18' of git://git.kernel.org/pub/scm/linux/kernel/git/zohar/linux-integrity
Pull integrity updates from Mimi Zohar:
"Just a couple of changes: crypto code cleanup and a IMA xattr bug fix"
* tag 'integrity-v6.18' of git://git.kernel.org/pub/scm/linux/kernel/git/zohar/linux-integrity:
ima: don't clear IMA_DIGSIG flag when setting or removing non-IMA xattr
lib/digsig: Use SHA-1 library instead of crypto_shash
integrity: Select CRYPTO from INTEGRITY_ASYMMETRIC_KEYS
Drivers:
- Add ciphertext hiding support to ccp.
- Add hashjoin, gather and UDMA data move features to hisilicon.
- Add lz4 and lz77_only to hisilicon.
- Add xilinx hwrng driver.
- Add ti driver with ecb/cbc aes support.
- Add ring buffer idle and command queue telemetry for GEN6 in qat.
Others:
- Use rcu_dereference_all to stop false alarms in rhashtable.
- Fix CPU number wraparound in padata.
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEn51F/lCuNhUwmDeSxycdCkmxi6cFAmjbpJsACgkQxycdCkmx
i6fuTRAAzv5o0MIw4Kc7EEU3zMgFSX0FdcTUPY+eiFrWZrSrvUVW+jYcH9ppO8J7
offAYSZYatcyyU9+u8X22CQNKLdXnKQQ0YymWO35TOpvVxveUM1bqEEV1ZK0xaXD
hlJTLoFIsPaVVhi8CW+ZNhDJBwJHNCv7Yi9TUB6sC7rilWWbJ5LzbEVw3Rtg81Lx
0hcuGX2LrpsHOVVWYxGdJ534Kt2lrkt+8/gWOFg3ap3RVQ39tohEjS2Adm2p8eiX
zIdru/aYd89EcYoxuFyylX2d/OLmMAQpFsADy/Fys26eeOWtqggH62V1LAiSyEqw
vLRBCVKpLhlbNNfnUs0f5nqjjYEUrNk9SA4rgoxITwKoucbWBQMS4zWJTEDKz29n
iBBqHsukGpwVOE6RY8BzR/QNJKhZCSsJpGkagS1v6VPa5P1QomuKftGXKB7JKXKz
xoyk+DhJyA8rkb/E5J9Ni7+Tb08Y4zvJ1dpCQHZMlln3DKkK+kk3gkpoxXMZwBV2
LbEMGTI+sfnAfqkGCJYAZR9gDJ5LQDR9jy/Ds5jvPuVvvjyY5LY/bjETqGPF2QVs
Rz2Sg0RHl7PVZOP6QgbQzkV7SkJrZfyu5iYd0ZfUqZr7BaHLOHJG/E/HlUW3/mXu
OjD+Q5gPhiOdc/qn+32+QERTDCFQdbByv0h7khGQA5vHE3XCu8E=
=knnk
-----END PGP SIGNATURE-----
Merge tag 'v6.18-p1' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
Pull crypto updates from Herbert Xu:
"Drivers:
- Add ciphertext hiding support to ccp
- Add hashjoin, gather and UDMA data move features to hisilicon
- Add lz4 and lz77_only to hisilicon
- Add xilinx hwrng driver
- Add ti driver with ecb/cbc aes support
- Add ring buffer idle and command queue telemetry for GEN6 in qat
Others:
- Use rcu_dereference_all to stop false alarms in rhashtable
- Fix CPU number wraparound in padata"
* tag 'v6.18-p1' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (78 commits)
dt-bindings: rng: hisi-rng: convert to DT schema
crypto: doc - Add explicit title heading to API docs
hwrng: ks-sa - fix division by zero in ks_sa_rng_init
KEYS: X.509: Fix Basic Constraints CA flag parsing
crypto: anubis - simplify return statement in anubis_mod_init
crypto: hisilicon/qm - set NULL to qm->debug.qm_diff_regs
crypto: hisilicon/qm - clear all VF configurations in the hardware
crypto: hisilicon - enable error reporting again
crypto: hisilicon/qm - mask axi error before memory init
crypto: hisilicon/qm - invalidate queues in use
crypto: qat - Return pointer directly in adf_ctl_alloc_resources
crypto: aspeed - Fix dma_unmap_sg() direction
rhashtable: Use rcu_dereference_all and rcu_dereference_all_check
crypto: comp - Use same definition of context alloc and free ops
crypto: omap - convert from tasklet to BH workqueue
crypto: qat - Replace kzalloc() + copy_from_user() with memdup_user()
crypto: caam - double the entropy delay interval for retry
padata: WQ_PERCPU added to alloc_workqueue users
padata: replace use of system_unbound_wq with system_dfl_wq
crypto: cryptd - WQ_PERCPU added to alloc_workqueue users
...
Now that a SHA-1 library API is available, use it instead of
crypto_shash. This is simpler and faster.
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de>
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
- The 3 patch series "ida: Remove the ida_simple_xxx() API" from
Christophe Jaillet completes the removal of this legacy IDR API.
- The 9 patch series "panic: introduce panic status function family"
from Jinchao Wang provides a number of cleanups to the panic code and
its various helpers, which were rather ad-hoc and scattered all over the
place.
- The 5 patch series "tools/delaytop: implement real-time keyboard
interaction support" from Fan Yu adds a few nice user-facing usability
changes to the delaytop monitoring tool.
- The 3 patch series "efi: Fix EFI boot with kexec handover (KHO)" from
Evangelos Petrongonas fixes a panic which was happening with the
combination of EFI and KHO.
- The 2 patch series "Squashfs: performance improvement and a sanity
check" from Phillip Lougher teaches squashfs's lseek() about
SEEK_DATA/SEEK_HOLE. A mere 150x speedup was measured for a well-chosen
microbenchmark.
- Plus another 50-odd singleton patches all over the place.
-----BEGIN PGP SIGNATURE-----
iHUEABYIAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCaN78zwAKCRDdBJ7gKXxA
jhLeAQCddTv0XtSUTrvBvmrJVUBrQQeJc+LtNopMIjfAF/WAWAEAogSVKxg+HHEB
GaVixx4zDriNzEqrqiCx9rm4l+YooQA=
=XRe0
-----END PGP SIGNATURE-----
Merge tag 'mm-nonmm-stable-2025-10-02-15-29' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull non-MM updates from Andrew Morton:
- "ida: Remove the ida_simple_xxx() API" from Christophe Jaillet
completes the removal of this legacy IDR API
- "panic: introduce panic status function family" from Jinchao Wang
provides a number of cleanups to the panic code and its various
helpers, which were rather ad-hoc and scattered all over the place
- "tools/delaytop: implement real-time keyboard interaction support"
from Fan Yu adds a few nice user-facing usability changes to the
delaytop monitoring tool
- "efi: Fix EFI boot with kexec handover (KHO)" from Evangelos
Petrongonas fixes a panic which was happening with the combination of
EFI and KHO
- "Squashfs: performance improvement and a sanity check" from Phillip
Lougher teaches squashfs's lseek() about SEEK_DATA/SEEK_HOLE. A mere
150x speedup was measured for a well-chosen microbenchmark
- plus another 50-odd singleton patches all over the place
* tag 'mm-nonmm-stable-2025-10-02-15-29' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (75 commits)
Squashfs: reject negative file sizes in squashfs_read_inode()
kallsyms: use kmalloc_array() instead of kmalloc()
MAINTAINERS: update Sibi Sankar's email address
Squashfs: add SEEK_DATA/SEEK_HOLE support
Squashfs: add additional inode sanity checking
lib/genalloc: fix device leak in of_gen_pool_get()
panic: remove CONFIG_PANIC_ON_OOPS_VALUE
ocfs2: fix double free in user_cluster_connect()
checkpatch: suppress strscpy warnings for userspace tools
cramfs: fix incorrect physical page address calculation
kernel: prevent prctl(PR_SET_PDEATHSIG) from racing with parent process exit
Squashfs: fix uninit-value in squashfs_get_parent
kho: only fill kimage if KHO is finalized
ocfs2: avoid extra calls to strlen() after ocfs2_sprintf_system_inode_name()
kernel/sys.c: fix the racy usage of task_lock(tsk->group_leader) in sys_prlimit64() paths
sched/task.h: fix the wrong comment on task_lock() nesting with tasklist_lock
coccinelle: platform_no_drv_owner: handle also built-in drivers
coccinelle: of_table: handle SPI device ID tables
lib/decompress: use designated initializers for struct compress_format
efi: support booting with kexec handover (KHO)
...
- The 3 patch series "mm, swap: improve cluster scan strategy" from
Kairui Song improves performance and reduces the failure rate of swap
cluster allocation.
- The 4 patch series "support large align and nid in Rust allocators"
from Vitaly Wool permits Rust allocators to set NUMA node and large
alignment when perforning slub and vmalloc reallocs.
- The 2 patch series "mm/damon/vaddr: support stat-purpose DAMOS" from
Yueyang Pan extend DAMOS_STAT's handling of the DAMON operations sets
for virtual address spaces for ops-level DAMOS filters.
- The 3 patch series "execute PROCMAP_QUERY ioctl under per-vma lock"
from Suren Baghdasaryan reduces mmap_lock contention during reads of
/proc/pid/maps.
- The 2 patch series "mm/mincore: minor clean up for swap cache
checking" from Kairui Song performs some cleanup in the swap code.
- The 11 patch series "mm: vm_normal_page*() improvements" from David
Hildenbrand provides code cleanup in the pagemap code.
- The 5 patch series "add persistent huge zero folio support" from
Pankaj Raghav provides a block layer speedup by optionalls making the
huge_zero_pagepersistent, instead of releasing it when its refcount
falls to zero.
- The 3 patch series "kho: fixes and cleanups" from Mike Rapoport adds a
few touchups to the recently added Kexec Handover feature.
- The 10 patch series "mm: make mm->flags a bitmap and 64-bit on all
arches" from Lorenzo Stoakes turns mm_struct.flags into a bitmap. To
end the constant struggle with space shortage on 32-bit conflicting with
64-bit's needs.
- The 2 patch series "mm/swapfile.c and swap.h cleanup" from Chris Li
cleans up some swap code.
- The 7 patch series "selftests/mm: Fix false positives and skip
unsupported tests" from Donet Tom fixes a few things in our selftests
code.
- The 7 patch series "prctl: extend PR_SET_THP_DISABLE to only provide
THPs when advised" from David Hildenbrand "allows individual processes
to opt-out of THP=always into THP=madvise, without affecting other
workloads on the system".
It's a long story - the [1/N] changelog spells out the considerations.
- The 11 patch series "Add and use memdesc_flags_t" from Matthew Wilcox
gets us started on the memdesc project. Please see
https://kernelnewbies.org/MatthewWilcox/Memdescs and
https://blogs.oracle.com/linux/post/introducing-memdesc.
- The 3 patch series "Tiny optimization for large read operations" from
Chi Zhiling improves the efficiency of the pagecache read path.
- The 5 patch series "Better split_huge_page_test result check" from Zi
Yan improves our folio splitting selftest code.
- The 2 patch series "test that rmap behaves as expected" from Wei Yang
adds some rmap selftests.
- The 3 patch series "remove write_cache_pages()" from Christoph Hellwig
removes that function and converts its two remaining callers.
- The 2 patch series "selftests/mm: uffd-stress fixes" from Dev Jain
fixes some UFFD selftests issues.
- The 3 patch series "introduce kernel file mapped folios" from Boris
Burkov introduces the concept of "kernel file pages". Using these
permits btrfs to account its metadata pages to the root cgroup, rather
than to the cgroups of random inappropriate tasks.
- The 2 patch series "mm/pageblock: improve readability of some
pageblock handling" from Wei Yang provides some readability improvements
to the page allocator code.
- The 11 patch series "mm/damon: support ARM32 with LPAE" from SeongJae
Park teaches DAMON to understand arm32 highmem.
- The 4 patch series "tools: testing: Use existing atomic.h for
vma/maple tests" from Brendan Jackman performs some code cleanups and
deduplication under tools/testing/.
- The 2 patch series "maple_tree: Fix testing for 32bit compiles" from
Liam Howlett fixes a couple of 32-bit issues in
tools/testing/radix-tree.c.
- The 2 patch series "kasan: unify kasan_enabled() and remove
arch-specific implementations" from Sabyrzhan Tasbolatov moves KASAN
arch-specific initialization code into a common arch-neutral
implementation.
- The 3 patch series "mm: remove zpool" from Johannes Weiner removes
zspool - an indirection layer which now only redirects to a single thing
(zsmalloc).
- The 2 patch series "mm: task_stack: Stack handling cleanups" from
Pasha Tatashin makes a couple of cleanups in the fork code.
- The 37 patch series "mm: remove nth_page()" from David Hildenbrand
makes rather a lot of adjustments at various nth_page() callsites,
eventually permitting the removal of that undesirable helper function.
- The 2 patch series "introduce kasan.write_only option in hw-tags" from
Yeoreum Yun creates a KASAN read-only mode for ARM, using that
architecture's memory tagging feature. It is felt that a read-only mode
KASAN is suitable for use in production systems rather than debug-only.
- The 3 patch series "mm: hugetlb: cleanup hugetlb folio allocation"
from Kefeng Wang does some tidying in the hugetlb folio allocation code.
- The 12 patch series "mm: establish const-correctness for pointer
parameters" from Max Kellermann makes quite a number of the MM API
functions more accurate about the constness of their arguments. This
was getting in the way of subsystems (in this case CEPH) when they
attempt to improving their own const/non-const accuracy.
- The 7 patch series "Cleanup free_pages() misuse" from Vishal Moola
fixes a number of code sites which were confused over when to use
free_pages() vs __free_pages().
- The 3 patch series "Add Rust abstraction for Maple Trees" from Alice
Ryhl makes the mapletree code accessible to Rust. Required by nouveau
and by its forthcoming successor: the new Rust Nova driver.
- The 2 patch series "selftests/mm: split_huge_page_test:
split_pte_mapped_thp improvements" from David Hildenbrand adds a fix and
some cleanups to the thp selftesting code.
- The 14 patch series "mm, swap: introduce swap table as swap cache
(phase I)" from Chris Li and Kairui Song is the first step along the
path to implementing "swap tables" - a new approach to swap allocation
and state tracking which is expected to yield speed and space
improvements. This patchset itself yields a 5-20% performance benefit
in some situations.
- The 3 patch series "Some ptdesc cleanups" from Matthew Wilcox utilizes
the new memdesc layer to clean up the ptdesc code a little.
- The 3 patch series "Fix va_high_addr_switch.sh test failure" from
Chunyu Hu fixes some issues in our 5-level pagetable selftesting code.
- The 2 patch series "Minor fixes for memory allocation profiling" from
Suren Baghdasaryan addresses a couple of minor issues in relatively new
memory allocation profiling feature.
- The 3 patch series "Small cleanups" from Matthew Wilcox has a few
cleanups in preparation for more memdesc work.
- The 2 patch series "mm/damon: add addr_unit for DAMON_LRU_SORT and
DAMON_RECLAIM" from Quanmin Yan makes some changes to DAMON in
furtherance of supporting arm highmem.
- The 2 patch series "selftests/mm: Add -Wunreachable-code and fix
warnings" from Muhammad Anjum adds that compiler check to selftests code
and fixes the fallout, by removing dead code.
- The 10 patch series "Improvements to Victim Process Thawing and OOM
Reaper Traversal Order" from zhongjinji makes a number of improvements
in the OOM killer: mainly thawing a more appropriate group of victim
threads so they can release resources.
- The 5 patch series "mm/damon: misc fixups and improvements for 6.18"
from SeongJae Park is a bunch of small and unrelated fixups for DAMON.
- The 7 patch series "mm/damon: define and use DAMON initialization
check function" from SeongJae Park implement reliability and
maintainability improvements to a recently-added bug fix.
- The 2 patch series "mm/damon/stat: expose auto-tuned intervals and
non-idle ages" from SeongJae Park provides additional transparency to
userspace clients of the DAMON_STAT information.
- The 2 patch series "Expand scope of khugepaged anonymous collapse"
from Dev Jain removes some constraints on khubepaged's collapsing of
anon VMAs. It also increases the success rate of MADV_COLLAPSE against
an anon vma.
- The 2 patch series "mm: do not assume file == vma->vm_file in
compat_vma_mmap_prepare()" from Lorenzo Stoakes moves us further towards
removal of file_operations.mmap(). This patchset concentrates upon
clearing up the treatment of stacked filesystems.
- The 6 patch series "mm: Improve mlock tracking for large folios" from
Kiryl Shutsemau provides some fixes and improvements to mlock's tracking
of large folios. /proc/meminfo's "Mlocked" field became more accurate.
- The 2 patch series "mm/ksm: Fix incorrect accounting of KSM counters
during fork" from Donet Tom fixes several user-visible KSM stats
inaccuracies across forks and adds selftest code to verify these
counters.
- The 2 patch series "mm_slot: fix the usage of mm_slot_entry" from Wei
Yang addresses some potential but presently benign issues in KSM's
mm_slot handling.
-----BEGIN PGP SIGNATURE-----
iHUEABYKAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCaN3cywAKCRDdBJ7gKXxA
jtaPAQDmIuIu7+XnVUK5V11hsQ/5QtsUeLHV3OsAn4yW5/3dEQD/UddRU08ePN+1
2VRB0EwkLAdfMWW7TfiNZ+yhuoiL/AA=
=4mhY
-----END PGP SIGNATURE-----
Merge tag 'mm-stable-2025-10-01-19-00' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull MM updates from Andrew Morton:
- "mm, swap: improve cluster scan strategy" from Kairui Song improves
performance and reduces the failure rate of swap cluster allocation
- "support large align and nid in Rust allocators" from Vitaly Wool
permits Rust allocators to set NUMA node and large alignment when
perforning slub and vmalloc reallocs
- "mm/damon/vaddr: support stat-purpose DAMOS" from Yueyang Pan extend
DAMOS_STAT's handling of the DAMON operations sets for virtual
address spaces for ops-level DAMOS filters
- "execute PROCMAP_QUERY ioctl under per-vma lock" from Suren
Baghdasaryan reduces mmap_lock contention during reads of
/proc/pid/maps
- "mm/mincore: minor clean up for swap cache checking" from Kairui Song
performs some cleanup in the swap code
- "mm: vm_normal_page*() improvements" from David Hildenbrand provides
code cleanup in the pagemap code
- "add persistent huge zero folio support" from Pankaj Raghav provides
a block layer speedup by optionalls making the
huge_zero_pagepersistent, instead of releasing it when its refcount
falls to zero
- "kho: fixes and cleanups" from Mike Rapoport adds a few touchups to
the recently added Kexec Handover feature
- "mm: make mm->flags a bitmap and 64-bit on all arches" from Lorenzo
Stoakes turns mm_struct.flags into a bitmap. To end the constant
struggle with space shortage on 32-bit conflicting with 64-bit's
needs
- "mm/swapfile.c and swap.h cleanup" from Chris Li cleans up some swap
code
- "selftests/mm: Fix false positives and skip unsupported tests" from
Donet Tom fixes a few things in our selftests code
- "prctl: extend PR_SET_THP_DISABLE to only provide THPs when advised"
from David Hildenbrand "allows individual processes to opt-out of
THP=always into THP=madvise, without affecting other workloads on the
system".
It's a long story - the [1/N] changelog spells out the considerations
- "Add and use memdesc_flags_t" from Matthew Wilcox gets us started on
the memdesc project. Please see
https://kernelnewbies.org/MatthewWilcox/Memdescs and
https://blogs.oracle.com/linux/post/introducing-memdesc
- "Tiny optimization for large read operations" from Chi Zhiling
improves the efficiency of the pagecache read path
- "Better split_huge_page_test result check" from Zi Yan improves our
folio splitting selftest code
- "test that rmap behaves as expected" from Wei Yang adds some rmap
selftests
- "remove write_cache_pages()" from Christoph Hellwig removes that
function and converts its two remaining callers
- "selftests/mm: uffd-stress fixes" from Dev Jain fixes some UFFD
selftests issues
- "introduce kernel file mapped folios" from Boris Burkov introduces
the concept of "kernel file pages". Using these permits btrfs to
account its metadata pages to the root cgroup, rather than to the
cgroups of random inappropriate tasks
- "mm/pageblock: improve readability of some pageblock handling" from
Wei Yang provides some readability improvements to the page allocator
code
- "mm/damon: support ARM32 with LPAE" from SeongJae Park teaches DAMON
to understand arm32 highmem
- "tools: testing: Use existing atomic.h for vma/maple tests" from
Brendan Jackman performs some code cleanups and deduplication under
tools/testing/
- "maple_tree: Fix testing for 32bit compiles" from Liam Howlett fixes
a couple of 32-bit issues in tools/testing/radix-tree.c
- "kasan: unify kasan_enabled() and remove arch-specific
implementations" from Sabyrzhan Tasbolatov moves KASAN arch-specific
initialization code into a common arch-neutral implementation
- "mm: remove zpool" from Johannes Weiner removes zspool - an
indirection layer which now only redirects to a single thing
(zsmalloc)
- "mm: task_stack: Stack handling cleanups" from Pasha Tatashin makes a
couple of cleanups in the fork code
- "mm: remove nth_page()" from David Hildenbrand makes rather a lot of
adjustments at various nth_page() callsites, eventually permitting
the removal of that undesirable helper function
- "introduce kasan.write_only option in hw-tags" from Yeoreum Yun
creates a KASAN read-only mode for ARM, using that architecture's
memory tagging feature. It is felt that a read-only mode KASAN is
suitable for use in production systems rather than debug-only
- "mm: hugetlb: cleanup hugetlb folio allocation" from Kefeng Wang does
some tidying in the hugetlb folio allocation code
- "mm: establish const-correctness for pointer parameters" from Max
Kellermann makes quite a number of the MM API functions more accurate
about the constness of their arguments. This was getting in the way
of subsystems (in this case CEPH) when they attempt to improving
their own const/non-const accuracy
- "Cleanup free_pages() misuse" from Vishal Moola fixes a number of
code sites which were confused over when to use free_pages() vs
__free_pages()
- "Add Rust abstraction for Maple Trees" from Alice Ryhl makes the
mapletree code accessible to Rust. Required by nouveau and by its
forthcoming successor: the new Rust Nova driver
- "selftests/mm: split_huge_page_test: split_pte_mapped_thp
improvements" from David Hildenbrand adds a fix and some cleanups to
the thp selftesting code
- "mm, swap: introduce swap table as swap cache (phase I)" from Chris
Li and Kairui Song is the first step along the path to implementing
"swap tables" - a new approach to swap allocation and state tracking
which is expected to yield speed and space improvements. This
patchset itself yields a 5-20% performance benefit in some situations
- "Some ptdesc cleanups" from Matthew Wilcox utilizes the new memdesc
layer to clean up the ptdesc code a little
- "Fix va_high_addr_switch.sh test failure" from Chunyu Hu fixes some
issues in our 5-level pagetable selftesting code
- "Minor fixes for memory allocation profiling" from Suren Baghdasaryan
addresses a couple of minor issues in relatively new memory
allocation profiling feature
- "Small cleanups" from Matthew Wilcox has a few cleanups in
preparation for more memdesc work
- "mm/damon: add addr_unit for DAMON_LRU_SORT and DAMON_RECLAIM" from
Quanmin Yan makes some changes to DAMON in furtherance of supporting
arm highmem
- "selftests/mm: Add -Wunreachable-code and fix warnings" from Muhammad
Anjum adds that compiler check to selftests code and fixes the
fallout, by removing dead code
- "Improvements to Victim Process Thawing and OOM Reaper Traversal
Order" from zhongjinji makes a number of improvements in the OOM
killer: mainly thawing a more appropriate group of victim threads so
they can release resources
- "mm/damon: misc fixups and improvements for 6.18" from SeongJae Park
is a bunch of small and unrelated fixups for DAMON
- "mm/damon: define and use DAMON initialization check function" from
SeongJae Park implement reliability and maintainability improvements
to a recently-added bug fix
- "mm/damon/stat: expose auto-tuned intervals and non-idle ages" from
SeongJae Park provides additional transparency to userspace clients
of the DAMON_STAT information
- "Expand scope of khugepaged anonymous collapse" from Dev Jain removes
some constraints on khubepaged's collapsing of anon VMAs. It also
increases the success rate of MADV_COLLAPSE against an anon vma
- "mm: do not assume file == vma->vm_file in compat_vma_mmap_prepare()"
from Lorenzo Stoakes moves us further towards removal of
file_operations.mmap(). This patchset concentrates upon clearing up
the treatment of stacked filesystems
- "mm: Improve mlock tracking for large folios" from Kiryl Shutsemau
provides some fixes and improvements to mlock's tracking of large
folios. /proc/meminfo's "Mlocked" field became more accurate
- "mm/ksm: Fix incorrect accounting of KSM counters during fork" from
Donet Tom fixes several user-visible KSM stats inaccuracies across
forks and adds selftest code to verify these counters
- "mm_slot: fix the usage of mm_slot_entry" from Wei Yang addresses
some potential but presently benign issues in KSM's mm_slot handling
* tag 'mm-stable-2025-10-01-19-00' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (372 commits)
mm: swap: check for stable address space before operating on the VMA
mm: convert folio_page() back to a macro
mm/khugepaged: use start_addr/addr for improved readability
hugetlbfs: skip VMAs without shareable locks in hugetlb_vmdelete_list
alloc_tag: fix boot failure due to NULL pointer dereference
mm: silence data-race in update_hiwater_rss
mm/memory-failure: don't select MEMORY_ISOLATION
mm/khugepaged: remove definition of struct khugepaged_mm_slot
mm/ksm: get mm_slot by mm_slot_entry() when slot is !NULL
hugetlb: increase number of reserving hugepages via cmdline
selftests/mm: add fork inheritance test for ksm_merging_pages counter
mm/ksm: fix incorrect KSM counter handling in mm_struct during fork
drivers/base/node: fix double free in register_one_node()
mm: remove PMD alignment constraint in execmem_vmalloc()
mm/memory_hotplug: fix typo 'esecially' -> 'especially'
mm/rmap: improve mlock tracking for large folios
mm/filemap: map entire large folio faultaround
mm/fault: try to map the entire file folio in finish_fault()
mm/rmap: mlock large folios in try_to_unmap_one()
mm/rmap: fix a mlock race condition in folio_referenced_one()
...
-----BEGIN PGP SIGNATURE-----
iQFPBAABCAA5FiEEe7vIQRWZI0iWSE3xu+CwddJFiJoFAmja74IbFIAAAAAABAAO
bWFudTIsMi41KzEuMTEsMiwyAAoJELvgsHXSRYiacR4H/04aBsr7LZnTJVeZLQwK
HKoOwXBqiQyqPdjKXGKnp7Mh9gRp2W3V11VsYTuDJNUS+Vz5YXW0z8cRnUfZ3SYs
l+GZC3vZeAy2EVJE1U6Mb673hU8vziI80IO2q/tGzaj9a+wC3L0lemc+YFQTwG+u
pMtt8zU2vHRjgkx8TNNqJBBOLLDV+RzIl8pqXVnh4eju6x6ZdreGnjXaePYMdjG0
fXLf9XwIeWREqbfeOCEOB50Ts71kkdiOeskwnJyfCTDT8WTu3zC/dICqfh66e3Gg
8hQKvMsuKpm/FwbtgdB0WvaDjENH6PmY+ubLYVxwvNpcsTSqfe0IYGm+HpUP+TPf
m+Y=
=w+JL
-----END PGP SIGNATURE-----
Merge tag 'slab-for-6.18' of git://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab
Pull slab updates from Vlastimil Babka:
- A new layer for caching objects for allocation and free via percpu
arrays called sheaves.
The aim is to combine the good parts of SLAB (lower-overhead and
simpler percpu caching, compared to SLUB) without the past issues
with arrays for freeing remote NUMA node objects and their flushing.
It also allows more efficient kfree_rcu(), and cheaper object
preallocations for cases where the exact number of objects is
unknown, but an upper bound is.
Currently VMAs and maple nodes are using this new caching, with a
plan to enable it for all caches and remove the complex SLUB fastpath
based on cpu (partial) slabs and this_cpu_cmpxchg_double().
(Vlastimil Babka, with Liam Howlett and Pedro Falcato for the maple
tree changes)
- Re-entrant kmalloc_nolock(), which allows opportunistic allocations
from NMI and tracing/kprobe contexts.
Building on prior page allocator and memcg changes, it will result in
removing BPF-specific caches on top of slab (Alexei Starovoitov)
- Various fixes and cleanups. (Kuan-Wei Chiu, Matthew Wilcox, Suren
Baghdasaryan, Ye Liu)
* tag 'slab-for-6.18' of git://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab: (40 commits)
slab: Introduce kmalloc_nolock() and kfree_nolock().
slab: Reuse first bit for OBJEXTS_ALLOC_FAIL
slab: Make slub local_(try)lock more precise for LOCKDEP
mm: Introduce alloc_frozen_pages_nolock()
mm: Allow GFP_ACCOUNT to be used in alloc_pages_nolock().
locking/local_lock: Introduce local_lock_is_locked().
maple_tree: Convert forking to use the sheaf interface
maple_tree: Add single node allocation support to maple state
maple_tree: Prefilled sheaf conversion and testing
tools/testing: Add support for prefilled slab sheafs
maple_tree: Replace mt_free_one() with kfree()
maple_tree: Use kfree_rcu in ma_free_rcu
testing/radix-tree/maple: Hack around kfree_rcu not existing
tools/testing: include maple-shim.c in maple.c
maple_tree: use percpu sheaves for maple_node_cache
mm, vma: use percpu sheaves for vm_area_struct cache
tools/testing: Add support for changes to slab for sheaves
slab: allow NUMA restricted allocations to use percpu sheaves
tools/testing/vma: Implement vm_refcnt reset
slab: skip percpu sheaves for remote object freeing
...
-----BEGIN PGP SIGNATURE-----
iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAmjbLCgQHGF4Ym9lQGtl
cm5lbC5kawAKCRD301j7KXHgpoY0D/9J+11BC88pBxCrLKv/V2TwCNokRMi0dU3L
r3EUdA46k0oXmvb6ueZqIcfY2e+IX7rdQkaRbh1zRdsNejqHo4548C3ePWGdBAcM
OdNEGfpehO0aD0td1+mK/NxoJMLhbs5QraPanz+SOkGZOKeF+vGCga5PUDivsr5J
16T9yb7i+isENLdAc2RJbZVyAphqHQlo5GHi5ZIKOVi5cNt8GU/q2sQl7NYmGvHd
aq37svvZHFOhLRajP959Fw9WOxEYITewzQ4UYf1FZjUodJUxO+vCnP0ooBQRlyu8
1B4PYWwSE+Vn3GkQE0Om+mzo9AVPOiLmoAWGxdgJBMyEkZndocr46XEslXOufQ1Z
T3Gu19G6jCxcyByNVhjVnaajYKmvSQAy1w75m4XlfqTRm4f9Om+LAJavUk3RuaOL
7lXKQ7Ql1/Tby9Jmf8afjYYXXotNDNku6rz2P3qtOwAA26mNJfgVt0rO+8XGRDe9
ioLbCkTjslYMc/Oh4jSsbrspsVALbaQMq/Dmah8k0EWb4QAHVgCJyGBoff3hOboI
jD6B1enaKOQVgcjWcjm/FjOk3jv2h3v4X26YWQZTvEc/1PnSnST78Zi/ePhzDdmt
sBALUAS37TfTgNMzrhbHl5Zs13k0C0XyANuayuKuo5hlNnC1wbdap+5FZJOmpuOB
YT+VkYnaOA==
=kOmc
-----END PGP SIGNATURE-----
Merge tag 'for-6.18/block-20250929' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux
Pull block updates from Jens Axboe:
- NVMe pull request via Keith:
- FC target fixes (Daniel)
- Authentication fixes and updates (Martin, Chris)
- Admin controller handling (Kamaljit)
- Target lockdep assertions (Max)
- Keep-alive updates for discovery (Alastair)
- Suspend quirk (Georg)
- MD pull request via Yu:
- Add support for a lockless bitmap.
A key feature for the new bitmap are that the IO fastpath is
lockless. If a user issues lots of write IO to the same bitmap
bit in a short time, only the first write has additional overhead
to update bitmap bit, no additional overhead for the following
writes.
By supporting only resync or recover written data, means in the
case creating new array or replacing with a new disk, there is no
need to do a full disk resync/recovery.
- Switch ->getgeo() and ->bios_param() to using struct gendisk rather
than struct block_device.
- Rust block changes via Andreas. This series adds configuration via
configfs and remote completion to the rnull driver. The series also
includes a set of changes to the rust block device driver API: a few
cleanup patches, and a few features supporting the rnull changes.
The series removes the raw buffer formatting logic from
`kernel::block` and improves the logic available in `kernel::string`
to support the same use as the removed logic.
- floppy arch cleanups
- Reduce the number of dereferencing needed for ublk commands
- Restrict supported sockets for nbd. Mostly done to eliminate a class
of issues perpetually reported by syzbot, by using nonsensical socket
setups.
- A few s390 dasd block fixes
- Fix a few issues around atomic writes
- Improve DMA interation for integrity requests
- Improve how iovecs are treated with regards to O_DIRECT aligment
constraints.
We used to require each segment to adhere to the constraints, now
only the request as a whole needs to.
- Clean up and improve p2p support, enabling use of p2p for metadata
payloads
- Improve locking of request lookup, using SRCU where appropriate
- Use page references properly for brd, avoiding very long RCU sections
- Fix ordering of recursively submitted IOs
- Clean up and improve updating nr_requests for a live device
- Various fixes and cleanups
* tag 'for-6.18/block-20250929' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux: (164 commits)
s390/dasd: enforce dma_alignment to ensure proper buffer validation
s390/dasd: Return BLK_STS_INVAL for EINVAL from do_dasd_request
ublk: remove redundant zone op check in ublk_setup_iod()
nvme: Use non zero KATO for persistent discovery connections
nvmet: add safety check for subsys lock
nvme-core: use nvme_is_io_ctrl() for I/O controller check
nvme-core: do ioccsz/iorcsz validation only for I/O controllers
nvme-core: add method to check for an I/O controller
blk-cgroup: fix possible deadlock while configuring policy
blk-mq: fix null-ptr-deref in blk_mq_free_tags() from error path
blk-mq: Fix more tag iteration function documentation
selftests: ublk: fix behavior when fio is not installed
ublk: don't access ublk_queue in ublk_unmap_io()
ublk: pass ublk_io to __ublk_complete_rq()
ublk: don't access ublk_queue in ublk_need_complete_req()
ublk: don't access ublk_queue in ublk_check_commit_and_fetch()
ublk: don't pass ublk_queue to ublk_fetch()
ublk: don't access ublk_queue in ublk_config_io_buf()
ublk: don't access ublk_queue in ublk_check_fetch_buf()
ublk: pass q_id and tag to __ublk_check_and_get_req()
...
- Extend modules.builtin.modinfo to include module aliases from
MODULE_DEVICE_TABLE for builtin modules so that userspace tools (such
as kmod) can verify that a particular module alias will be handled by
a builtin module.
- Bump the minimum version of LLVM for building the kernel to 15.0.0.
- Upgrade several userspace API checks in headers_check.pl to errors.
- Unify and consolidate CONFIG_WERROR / W=e handling.
- Turn assembler and linker warnings into errors with CONFIG_WERROR /
W=e.
- Respect CONFIG_WERROR / W=e when building userspace programs
(userprogs).
- Enable -Werror unconditionally when building host programs
(hostprogs).
- Support copy_file_range() and data segment alignment in gen_init_cpio
to improve performance on filesystems that support reflinks such as
btrfs and XFS.
- Miscellaneous small changes to scripts and configuration files.
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
-----BEGIN PGP SIGNATURE-----
iHUEABYKAB0WIQR74yXHMTGczQHYypIdayaRccAalgUCaNrp6QAKCRAdayaRccAa
ljxRAP4hYocKXeWsiJzkTB199P4QUGWf220a9elBmtdJEed07gD/VBnCbSOxG3RO
vS8qbJHwxUFL7a+mDV8RIVXSt99NpAg=
=psG/
-----END PGP SIGNATURE-----
Merge tag 'kbuild-6.18-1' of git://git.kernel.org/pub/scm/linux/kernel/git/kbuild/linux
Pull Kbuild updates from Nathan Chancellor:
- Extend modules.builtin.modinfo to include module aliases from
MODULE_DEVICE_TABLE for builtin modules so that userspace tools (such
as kmod) can verify that a particular module alias will be handled by
a builtin module
- Bump the minimum version of LLVM for building the kernel to 15.0.0
- Upgrade several userspace API checks in headers_check.pl to errors
- Unify and consolidate CONFIG_WERROR / W=e handling
- Turn assembler and linker warnings into errors with CONFIG_WERROR /
W=e
- Respect CONFIG_WERROR / W=e when building userspace programs
(userprogs)
- Enable -Werror unconditionally when building host programs
(hostprogs)
- Support copy_file_range() and data segment alignment in gen_init_cpio
to improve performance on filesystems that support reflinks such as
btrfs and XFS
- Miscellaneous small changes to scripts and configuration files
* tag 'kbuild-6.18-1' of git://git.kernel.org/pub/scm/linux/kernel/git/kbuild/linux: (47 commits)
modpost: Initialize builtin_modname to stop SIGSEGVs
Documentation: kbuild: note CONFIG_DEBUG_EFI in reproducible builds
kbuild: vmlinux.unstripped should always depend on .vmlinux.export.o
modpost: Create modalias for builtin modules
modpost: Add modname to mod_device_table alias
scsi: Always define blogic_pci_tbl structure
kbuild: extract modules.builtin.modinfo from vmlinux.unstripped
kbuild: keep .modinfo section in vmlinux.unstripped
kbuild: always create intermediate vmlinux.unstripped
s390: vmlinux.lds.S: Reorder sections
KMSAN: Remove tautological checks
objtool: Drop noinstr hack for KCSAN_WEAK_MEMORY
lib/Kconfig.debug: Drop CLANG_VERSION check from DEBUG_INFO_DWARF_TOOLCHAIN_DEFAULT
riscv: Remove ld.lld version checks from many TOOLCHAIN_HAS configs
riscv: Unconditionally use linker relaxation
riscv: Remove version check for LTO_CLANG selects
powerpc: Drop unnecessary initializations in __copy_inst_from_kernel_nofault()
mips: Unconditionally select ARCH_HAS_CURRENT_STACK_POINTER
arm64: Remove tautological LLVM Kconfig conditions
ARM: Clean up definition of ARM_HAS_GROUP_RELOCS
...
- A seven patch series adds a new parameterized test features
KUnit parameterized tests currently support two primary methods for
getting parameters:
1. Defining custom logic within a generate_params() function.
2. Using the KUNIT_ARRAY_PARAM() and KUNIT_ARRAY_PARAM_DESC()
macros with a pre-defined static array and passing
the created *_gen_params() to KUNIT_CASE_PARAM().
These methods present limitations when dealing with dynamically
generated parameter arrays, or in scenarios where populating parameters
sequentially via generate_params() is inefficient or overly complex.
These limitations are fixed with a parameterized test method.
- Fixes issues in kunit build artifacts cleanup,
- Fixes parsing skipped test problem in kselftest framework,
- Enables PCI on UML without triggering WARN()
- a few other fixes and adds support for new configs such as MIPS
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEPZKym/RZuOCGeA/kCwJExA0NQxwFAmjbB0QACgkQCwJExA0N
QxxpUQ/8CslEjTv2+/LA122eGDtg8np61W1MTNEQslYyiobhZ9CXeFw4yJlTTb0w
hShFK1pGAWgkVCVKIJOaeItY0hF3BIxyVtlQxDUKvukpMnLvZRI0KhG7p8aYnCq+
jfQRy4gqwjaHyLQekQ1v6vRdHdTfh5mB6OUOCYa7QPQuKhOkBOaq2ZJ+9eFnkPXl
KUGTcC3pL0jQQmaSgo7zFUbdGSq0JZkNbpMj0fAYB0zs+MSpfkAnQYEw3qaxU1/Z
lu+CQdom+77Kt0r7UyEb6xUBeRveqYAWv6oFKq3CBzhlEGCkqVv6zgNg48qMC0QO
cVW0E61u8bVAalhb7hOT3QsEp1k01Lr2N9/VBLP4LRb2HMlD2qlXDo6ftBjNgx/y
m5Kukh1wQoOTaTJekdDMlaRXwApdn3MegheXE83n4dr44C5oyhlR8fFk/0Y6gSEU
RKUg8ZwUNUhCdw4VgBn8zoSkK18D74feRoustmuMS00I/8to3iGvQ7kn3nz2fPMb
AaZ6a2K95gk9TrD4UZdHF1aPHDudn3e8g6LdSOV/NoRNI9H+fXfhvxcLwW8WCqOw
u5qfTxNUk4mPQrCs9hcDnWm9Ppuu5YeY989rKkA7j4z71dhyMOPynpOM+vXRijB8
9zRcvyls87tgDAquIZvTp6Q/oEmi60OF40DRC9MjgOw0Iw7feIc=
=VfHu
-----END PGP SIGNATURE-----
Merge tag 'linux_kselftest-kunit-6.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest
Pull kunit updates from Shuah Khan:
- New parameterized test features
KUnit parameterized tests supported two primary methods for getting
parameters:
- Defining custom logic within a generate_params() function.
- Using the KUNIT_ARRAY_PARAM() and KUNIT_ARRAY_PARAM_DESC() macros
with a pre-defined static array and passing the created
*_gen_params() to KUNIT_CASE_PARAM().
These methods present limitations when dealing with dynamically
generated parameter arrays, or in scenarios where populating
parameters sequentially via generate_params() is inefficient or
overly complex.
These limitations are fixed with a parameterized test method
- Fix issues in kunit build artifacts cleanup
- Fix parsing skipped test problem in kselftest framework
- Enable PCI on UML without triggering WARN()
- a few other fixes and adds support for new configs such as MIPS
* tag 'linux_kselftest-kunit-6.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
kunit: Extend kconfig help text for KUNIT_UML_PCI
rust: kunit: allow `cfg` on `test`s
kunit: qemu_configs: Add MIPS configurations
kunit: Enable PCI on UML without triggering WARN()
Documentation: kunit: Document new parameterized test features
kunit: Add example parameterized test with direct dynamic parameter array setup
kunit: Add example parameterized test with shared resource management using the Resource API
kunit: Enable direct registration of parameter arrays to a KUnit test
kunit: Pass parameterized test context to generate_params()
kunit: Introduce param_init/exit for parameterized test context management
kunit: Add parent kunit for parameterized test context
kunit: tool: Accept --raw_output=full as an alias of 'all'
kunit: tool: Parse skipped tests from kselftest.h
kunit: Always descend into kunit directory during build
- Further consolidation of the VDSO infrastructure and the common data
store.
- Simplification of the related Kconfig logic
- Improve the VDSO selftest suite
-----BEGIN PGP SIGNATURE-----
iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAmjaUJgTHHRnbHhAbGlu
dXRyb25peC5kZQAKCRCmGPVMDXSYoT5VD/9l+TP42vJzwygPArefVM9QijAc4k+g
iptaz5M+Lhk0BlwMUu502Tp1zcwGq5529LlE2P72DuYoZRtZKSFCGES2mye0JiAs
A5Vk8d1zicg3IIGjYgw39shA9ghzuzV3gZdgHVqQ6024uSeKxUY3SqF156GGfPFG
azXWDV7i4RGQvKLm2ye8m7JNYCE3P9f8Wh+GvPaq7qBlW1MsMCWC4FP5qu4iStf3
mol9uA/EM2zSDwkpIpq6P5L2TpIH1dGDNGxVf/HG3J16UhNXzzXv/rxuKHKPNGWm
MQZ8WdPjF+uwdUsNXaImC6aCE0hoQ1Ga7GE2YA3pMQYfySeiJ/fA5I1KfhlUCiHD
oUgnb6PQfZjaBDrIODKcrFHAYc5pTOyJXnJ6q1Lc1yrOfoVcIpMaRXHO/phCeuwI
61gfb9ggbFEH/7cL26Z9XGSqh4BDCKVP5Z+y98OnNEzV74ScpX13/szB3VgdN6LW
tmTOeMPYCo4FQoPw8OV1xkeYOKD2JyqH+garnxDZNFjaVaeUkzlAMKz+1JUjxRZj
UpUhZ8y5KFVjgKkqZHxhucr9cnwV/xd4n+IzZeG4l6XZp4VyirG+5QpAikGYe+Jq
WFU4Ao1xvevyizrJEhoGJKznwtrNrsy6AQ+wLQxYiRR4aTnng8n9T/ry2DVvhJf8
diYWxuRTR7KMxg==
=i3UB
-----END PGP SIGNATURE-----
Merge tag 'timers-vdso-2025-09-29' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull VDSO updates from Thomas Gleixner:
- Further consolidation of the VDSO infrastructure and the common data
store
- Simplification of the related Kconfig logic
- Improve the VDSO selftest suite
* tag 'timers-vdso-2025-09-29' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
selftests: vDSO: Drop vdso_test_clock_getres
selftests: vDSO: vdso_test_abi: Add tests for clock_gettime64()
selftests: vDSO: vdso_test_abi: Test CPUTIME clocks
selftests: vDSO: vdso_test_abi: Use explicit indices for name array
selftests: vDSO: vdso_test_abi: Drop clock availability tests
selftests: vDSO: vdso_test_abi: Use ksft_finished()
selftests: vDSO: vdso_test_abi: Correctly skip whole test with missing vDSO
selftests: vDSO: Fix -Wunitialized in powerpc VDSO_CALL() wrapper
vdso: Add struct __kernel_old_timeval forward declaration to gettime.h
vdso: Gate VDSO_GETRANDOM behind HAVE_GENERIC_VDSO
vdso: Drop Kconfig GENERIC_VDSO_TIME_NS
vdso: Drop Kconfig GENERIC_VDSO_DATA_STORE
vdso: Drop kconfig GENERIC_COMPAT_VDSO
vdso: Drop kconfig GENERIC_VDSO_32
riscv: vdso: Untangle Kconfig logic
time: Build generic update_vsyscall() only with generic time vDSO
vdso/gettimeofday: Remove !CONFIG_TIME_NS stubs
vdso: Move ENABLE_COMPAT_VDSO from core to arm64
ARM: VDSO: Remove cntvct_ok global variable
vdso/datastore: Gate time data behind CONFIG_GENERIC_GETTIMEOFDAY
- Address the inconsistent shutdown sequence of per CPU clockevents on
CPU hotplug, which onoly removed it from the core but failed to invoke
the actual device driver shutdown callback. This keeps the timer
active, which prevents power savings and causes pointless noise in
virtualization.
- Encapsulate the open coded access to the hrtimer clock base, which is a
private implementation detail, so that the implementation can be
changed without breaking a lot of usage sites.
- Enhance the debug output of the clocksource watchdog to provide better
information for analysis.
- The usual set of cleanups and enhancements all over the place
-----BEGIN PGP SIGNATURE-----
iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAmjaT+ETHHRnbHhAbGlu
dXRyb25peC5kZQAKCRCmGPVMDXSYoT8HD/47VRqHbFKazGcwZ4+OLKfFTq0PKzIU
kIhimNMSKVqiPlmWhud/4idGX5JFqMP3dS9ZaDlp7xCtu1ScCqzH72kbCrab0l4g
wjRgneGWsHhv06PY50Ty9FusFSetkjA8XSYavFV9HZNgCRIvho958akdckpazMcF
i4V5WTIVkwhGynhcGwqBXcmHASUa7x6gEjZYBbX6wspPl2Wk5z+vn/zAVIHAozwS
9GPw8HlUeKqM72U12mWGkt4KLjy2gzoupvx9vD4giXRvqkt09rfHtRqvEdkew+/G
ZANhmTJqfA1AiihYP30fHgY5lQbNQG+9O10UKhlUyrlBKPKHKu6dIuJQRSBy0j59
Bqef4UubPBlMP4xRgTbt11SFrjuqDo68bTIDGmQNQvb1BXSXhGA08PDPbIKkFdJh
8cXwUHzjLkDk0tL6kVXRWlsZcCmjz87TVS7PufGE3EIZkh0J2Ob+g7nFS3PsORKW
65ROSRoYmh4CRh/l9CTNbPWFvz3jahIEnxlrSc70o0A4DUeeWTwT/M+MoNOtSJuV
D4qSD35JsFaEbKFfSJwF/0CybG3Nx3UPI8nvEEYQtS4D+BCJHy7INg1P/NtWw+vb
W7puAFd+WbjSN14uQfYkYWC53vQ22isW0nU8W6G5NbMRD5K4WIld4UhDksJIGCLe
2LZ0VHxv4gaXkg==
=A9Tu
-----END PGP SIGNATURE-----
Merge tag 'timers-core-2025-09-29' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull timer core updates from Thomas Gleixner:
- Address the inconsistent shutdown sequence of per CPU clockevents on
CPU hotplug, which only removed it from the core but failed to invoke
the actual device driver shutdown callback. This kept the timer
active, which prevented power savings and caused pointless noise in
virtualization.
- Encapsulate the open coded access to the hrtimer clock base, which is
a private implementation detail, so that the implementation can be
changed without breaking a lot of usage sites.
- Enhance the debug output of the clocksource watchdog to provide
better information for analysis.
- The usual set of cleanups and enhancements all over the place
* tag 'timers-core-2025-09-29' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
time: Fix spelling mistakes in comments
clocksource: Print durations for sync check unconditionally
LoongArch: Remove clockevents shutdown call on offlining
tick: Do not set device to detached state in tick_shutdown()
hrtimer: Reorder branches in hrtimer_clockid_to_base()
hrtimer: Remove hrtimer_clock_base:: Get_time
hrtimer: Use hrtimer_cb_get_time() helper
media: pwm-ir-tx: Avoid direct access to hrtimer clockbase
ALSA: hrtimer: Avoid direct access to hrtimer clockbase
lib: test_objpool: Avoid direct access to hrtimer clockbase
sched/core: Avoid direct access to hrtimer clockbase
timers/itimer: Avoid direct access to hrtimer clockbase
posix-timers: Avoid direct access to hrtimer clockbase
jiffies: Remove obsolete SHIFTED_HZ comment
First set of RISC-V updates for the v6.18 merge window, including:
- Replacement of __ASSEMBLY__ with __ASSEMBLER__ in header files (other
architectures have already merged this type of cleanup)
- The introduction of ioremap_wc() for RISC-V
- Cleanup of the RISC-V kprobes code to use mostly-extant macros rather than
open code
- A RISC-V kprobes unit test
- An architecture-specific endianness swap macro set implementation,
leveraging some dedicated RISC-V instructions for this purpose if they
are available
- The ability to identity and communicate to userspace the presence of a
MIPS P8700-specific ISA extension, and to leverage its MIPS-specific PAUSE
implementation in cpu_relax()
- Several other miscellaneous cleanups
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEElRDoIDdEz9/svf2Kx4+xDQu9KksFAmjaMVIACgkQx4+xDQu9
Kkva4g/9GE4gzwM+KHlc4e1sfsaTo4oXe9eV+Hj3gUJdM+g8dCtNFchPRCKFjHYb
X9lm2YVL9Q3cZ7yWWy/DJtZ66Yz9foQ4laX7uYbjsdbdjbsAXLXhjNp6uk4nBrqb
uW7Uq+Qel8qq66J/B4Z/U0UzF3e5MaptQ6sLuNRONY9OJUxG76zTSJVKwiVNsGaX
8W59b7ALFlCIwCVyXGm/KO1EELjl8FKDENWXFE1v6T8XvfXfYuhcvUMp84ebbzHV
D4kQrO3nxKQVgKEdCtW8xxt0/aWkYQ8zbQg8bD0gDDzMYb5uiJDcMFaa8ZuEYiLg
EcAJX8LmE5GGlTcJf8/jxvaA87hisNsGFvNPXX1OuI26w2TUNC2u80wE6Q6a3aNu
74oUtZEaPhdFuB31A0rALC8Zb2zalnwwjAL7xRlyAozUuye8Ej7mE7w97WTjIfWz
7ZL19/+C1uawljzYLn+FJBIcl1wTyvOgx6T4TJlmOsa4OFnCreFx+a0Az6+Rx1GC
XGsRyDkGPSabIbNVGxUCyTr4w+VpW1WlDjrjwyLC0DQTFW8wyP44W9/K2scp+CqJ
bSCcAz8QtGAeZ5UlSmXYTOV69xXqZPaom7fVk5RFHoy24en5DSo7kj1NJ3ChupRD
8ACpALcIgw/VEo0Tyqqy3dyVhPDuYaZMY3WgGGj9Cz18U37e/Ho=
=nK3D
-----END PGP SIGNATURE-----
Merge tag 'riscv-for-linus-6.18-mw1' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux
Pull RISC-V updates from Paul Walmsley
- Replacement of __ASSEMBLY__ with __ASSEMBLER__ in header files (other
architectures have already merged this type of cleanup)
- The introduction of ioremap_wc() for RISC-V
- Cleanup of the RISC-V kprobes code to use mostly-extant macros rather
than open code
- A RISC-V kprobes unit test
- An architecture-specific endianness swap macro set implementation,
leveraging some dedicated RISC-V instructions for this purpose if
they are available
- The ability to identity and communicate to userspace the presence
of a MIPS P8700-specific ISA extension, and to leverage its
MIPS-specific PAUSE implementation in cpu_relax()
- Several other miscellaneous cleanups
* tag 'riscv-for-linus-6.18-mw1' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: (39 commits)
riscv: errata: Fix the PAUSE Opcode for MIPS P8700
riscv: hwprobe: Document MIPS xmipsexectl vendor extension
riscv: hwprobe: Add MIPS vendor extension probing
riscv: Add xmipsexectl instructions
riscv: Add xmipsexectl as a vendor extension
dt-bindings: riscv: Add xmipsexectl ISA extension description
riscv: cpufeature: add validation for zfa, zfh and zfhmin
perf: riscv: skip empty batches in counter start
selftests: riscv: Add README for RISC-V KSelfTest
riscv: sbi: Switch to new sys-off handler API
riscv: Move vendor errata definitions to new header
RISC-V: ACPI: enable parsing the BGRT table
riscv: Enable ARCH_HAVE_NMI_SAFE_CMPXCHG
riscv: pi: use 'targets' instead of extra-y in Makefile
riscv: introduce asm/swab.h
riscv: mmap(): use unsigned offset type in riscv_sys_mmap
drivers/perf: riscv: Remove redundant ternary operators
riscv: mm: Use mmu-type from FDT to limit SATP mode
riscv: mm: Return intended SATP mode for noXlvl options
riscv: kprobes: Remove duplication of RV_EXTRACT_ITYPE_IMM
...
- Clean up usage of TRAILING_OVERLAP() (Gustavo A. R. Silva)
- lkdtm: fortify: Fix potential NULL dereference on kmalloc failure
(Junjie Cao)
- Add str_assert_deassert() helper (Lad Prabhakar)
- gcc-plugins: Remove TODO_verify_il for GCC >= 16
- kconfig: Fix BrokenPipeError warnings in selftests
- kconfig: Add transitional symbol attribute for migration support
- kcfi: Rename CONFIG_CFI_CLANG to CONFIG_CFI
-----BEGIN PGP SIGNATURE-----
iHUEABYKAB0WIQRSPkdeREjth1dHnSE2KwveOeQkuwUCaNraNQAKCRA2KwveOeQk
u/DkAPwKPP5BSmVR2wkdpQaXIr3PGA+cbBYp34DMJNujZ9piIwD/WZ+HfGTLoERy
+2Q6HLj9hUdd+Rx3IZ8/w1QmnhUIUAU=
=AwV9
-----END PGP SIGNATURE-----
Merge tag 'hardening-v6.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux
Pull hardening updates from Kees Cook:
"One notable addition is the creation of the 'transitional' keyword for
kconfig so CONFIG renaming can go more smoothly.
This has been a long-standing deficiency, and with the renaming of
CONFIG_CFI_CLANG to CONFIG_CFI (since GCC will soon have KCFI
support), this came up again.
The breadth of the diffstat is mainly this renaming.
- Clean up usage of TRAILING_OVERLAP() (Gustavo A. R. Silva)
- lkdtm: fortify: Fix potential NULL dereference on kmalloc failure
(Junjie Cao)
- Add str_assert_deassert() helper (Lad Prabhakar)
- gcc-plugins: Remove TODO_verify_il for GCC >= 16
- kconfig: Fix BrokenPipeError warnings in selftests
- kconfig: Add transitional symbol attribute for migration support
- kcfi: Rename CONFIG_CFI_CLANG to CONFIG_CFI"
* tag 'hardening-v6.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
lib/string_choices: Add str_assert_deassert() helper
kcfi: Rename CONFIG_CFI_CLANG to CONFIG_CFI
kconfig: Add transitional symbol attribute for migration support
kconfig: Fix BrokenPipeError warnings in selftests
gcc-plugins: Remove TODO_verify_il for GCC >= 16
stddef: Introduce __TRAILING_OVERLAP()
stddef: Remove token-pasting in TRAILING_OVERLAP()
lkdtm: fortify: Fix potential NULL dereference on kmalloc failure
- PCI: Fix theoretical underflow in use of ffs().
- Universally apply __attribute_const__ to all architecture's ffs()-family
of functions.
- Add KUnit tests for ffs() behavior and const-ness.
-----BEGIN PGP SIGNATURE-----
iHUEABYKAB0WIQRSPkdeREjth1dHnSE2KwveOeQkuwUCaNrWngAKCRA2KwveOeQk
u3ZGAPwJTscARU4MspnqpbuAV601dG1TNoJG+8JYH84r+R2jjQEAlmBZB0jaHbC2
qFWjHivD/0ofvihKfAPFgxlakyV1XAg=
=diXF
-----END PGP SIGNATURE-----
Merge tag 'ffs-const-v6.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux
Pull ffs const-attribute cleanups from Kees Cook:
"While working on various hardening refactoring a while back we
encountered inconsistencies in the application of __attribute_const__
on the ffs() family of functions.
This series fixes this across all archs and adds KUnit tests.
Notably, this found a theoretical underflow in PCI (also fixed here)
and uncovered an inefficiency in ARC (fixed in the ARC arch PR). I
kept the series separate from the general hardening PR since it is a
stand-alone "topic".
- PCI: Fix theoretical underflow in use of ffs().
- Universally apply __attribute_const__ to all architecture's
ffs()-family of functions.
- Add KUnit tests for ffs() behavior and const-ness"
* tag 'ffs-const-v6.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
KUnit: ffs: Validate all the __attribute_const__ annotations
sparc: Add __attribute_const__ to ffs()-family implementations
xtensa: Add __attribute_const__ to ffs()-family implementations
s390: Add __attribute_const__ to ffs()-family implementations
parisc: Add __attribute_const__ to ffs()-family implementations
mips: Add __attribute_const__ to ffs()-family implementations
m68k: Add __attribute_const__ to ffs()-family implementations
openrisc: Add __attribute_const__ to ffs()-family implementations
riscv: Add __attribute_const__ to ffs()-family implementations
hexagon: Add __attribute_const__ to ffs()-family implementations
alpha: Add __attribute_const__ to ffs()-family implementations
sh: Add __attribute_const__ to ffs()-family implementations
powerpc: Add __attribute_const__ to ffs()-family implementations
x86: Add __attribute_const__ to ffs()-family implementations
csky: Add __attribute_const__ to ffs()-family implementations
bitops: Add __attribute_const__ to generic ffs()-family implementations
KUnit: Introduce ffs()-family tests
PCI: Test for bit underflow in pcie_set_readrq()
Add support for 2-way interleaved SHA-256 hashing to lib/crypto/, and
make fsverity use it for faster file data verification. This improves
fsverity performance on many x86_64 and arm64 processors.
Later, I plan to make dm-verity use this too.
-----BEGIN PGP SIGNATURE-----
iIoEABYIADIWIQSacvsUNc7UX4ntmEPzXCl4vpKOKwUCaNg4/RQcZWJpZ2dlcnNA
a2VybmVsLm9yZwAKCRDzXCl4vpKOK4fMAP9Xz00JNDfJ+mOVHIYOhAlWFGnug0X1
cvoRf4QXchNlbwD9HTJQQDQXnbsPy3QPrUVfl2FqCW7c6vRlBJijhD6j4wE=
=6dCR
-----END PGP SIGNATURE-----
Merge tag 'fsverity-for-linus' of git://git.kernel.org/pub/scm/fs/fsverity/linux
Pull interleaved SHA-256 hashing support from Eric Biggers:
"Optimize fsverity with 2-way interleaved hashing
Add support for 2-way interleaved SHA-256 hashing to lib/crypto/, and
make fsverity use it for faster file data verification. This improves
fsverity performance on many x86_64 and arm64 processors.
Later, I plan to make dm-verity use this too"
* tag 'fsverity-for-linus' of git://git.kernel.org/pub/scm/fs/fsverity/linux:
fsverity: Use 2-way interleaved SHA-256 hashing when supported
fsverity: Remove inode parameter from fsverity_hash_block()
lib/crypto: tests: Add tests and benchmark for sha256_finup_2x()
lib/crypto: x86/sha256: Add support for 2-way interleaved hashing
lib/crypto: arm64/sha256: Add support for 2-way interleaved hashing
lib/crypto: sha256: Add support for 2-way interleaved hashing
- Add a RISC-V optimized implementation of Poly1305. This code was
written by Andy Polyakov and contributed by Zhihang Shao.
- Migrate the MD5 code into lib/crypto/, and add KUnit tests for MD5.
Yes, it's still the 90s, and several kernel subsystems are still using
MD5 for legacy use cases. As long as that remains the case, it's
helpful to clean it up in the same way as I've been doing for other
algorithms. Later, I plan to convert most of these users of MD5 to use
the new MD5 library API instead of the generic crypto API.
- Simplify the organization of the ChaCha, Poly1305, BLAKE2s, and
Curve25519 code. Consolidate these into one module per algorithm,
and centralize the configuration and build process. This is the same
reorganization that has already been successful for SHA-1 and SHA-2.
- Remove the unused crypto_kpp API for Curve25519.
- Migrate the BLAKE2s and Curve25519 self-tests to KUnit.
- Always enable the architecture-optimized BLAKE2s code.
Due to interdependencies between test and non-test code, both are
included in this pull request. The broken-down diffstat is as follows:
Tests: 735 insertions(+), 1917 deletions(-)
RISC-V Poly1305: 877 insertions(+), 1 deletion(-)
Other: 1777 insertions(+), 3117 deletions(-)
Besides the new RISC-V code which is an addition, there are quite a
few simplifications due to the improved code organization for multiple
algorithms, the removal of the unused crypto_kpp API for Curve25519
and redundant tests, and the redesign of the BLAKE2s test.
-----BEGIN PGP SIGNATURE-----
iIoEABYIADIWIQSacvsUNc7UX4ntmEPzXCl4vpKOKwUCaNgwUhQcZWJpZ2dlcnNA
a2VybmVsLm9yZwAKCRDzXCl4vpKOK3EnAP96hB1wD12DvIovGCmWnnlbzOt+CoK2
B5CW74eYEZiSbwD7BiKPDqvSmLzEBtbKmOSwRvxKuQ2uGGef3USFKYVCiw0=
=DY5R
-----END PGP SIGNATURE-----
Merge tag 'libcrypto-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux
Pull crypto library updates from Eric Biggers:
- Add a RISC-V optimized implementation of Poly1305. This code was
written by Andy Polyakov and contributed by Zhihang Shao.
- Migrate the MD5 code into lib/crypto/, and add KUnit tests for MD5.
Yes, it's still the 90s, and several kernel subsystems are still
using MD5 for legacy use cases. As long as that remains the case,
it's helpful to clean it up in the same way as I've been doing for
other algorithms.
Later, I plan to convert most of these users of MD5 to use the new
MD5 library API instead of the generic crypto API.
- Simplify the organization of the ChaCha, Poly1305, BLAKE2s, and
Curve25519 code.
Consolidate these into one module per algorithm, and centralize the
configuration and build process. This is the same reorganization that
has already been successful for SHA-1 and SHA-2.
- Remove the unused crypto_kpp API for Curve25519.
- Migrate the BLAKE2s and Curve25519 self-tests to KUnit.
- Always enable the architecture-optimized BLAKE2s code.
* tag 'libcrypto-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux: (38 commits)
crypto: md5 - Implement export_core() and import_core()
wireguard: kconfig: simplify crypto kconfig selections
lib/crypto: tests: Enable Curve25519 test when CRYPTO_SELFTESTS
lib/crypto: curve25519: Consolidate into single module
lib/crypto: curve25519: Move a couple functions out-of-line
lib/crypto: tests: Add Curve25519 benchmark
lib/crypto: tests: Migrate Curve25519 self-test to KUnit
crypto: curve25519 - Remove unused kpp support
crypto: testmgr - Remove curve25519 kpp tests
crypto: x86/curve25519 - Remove unused kpp support
crypto: powerpc/curve25519 - Remove unused kpp support
crypto: arm/curve25519 - Remove unused kpp support
crypto: hisilicon/hpre - Remove unused curve25519 kpp support
lib/crypto: tests: Add KUnit tests for BLAKE2s
lib/crypto: blake2s: Consolidate into single C translation unit
lib/crypto: blake2s: Move generic code into blake2s.c
lib/crypto: blake2s: Always enable arch-optimized BLAKE2s code
lib/crypto: blake2s: Remove obsolete self-test
lib/crypto: x86/blake2s: Reduce size of BLAKE2S_SIGMA2
lib/crypto: chacha: Consolidate into single module
...
Update crc_kunit to test the CRC functions in softirq and hardirq
contexts, similar to what the lib/crypto/ KUnit tests do. Move the
helper function needed to do this into a common header.
This is useful mainly to test fallback code paths for when
FPU/SIMD/vector registers are unusable.
-----BEGIN PGP SIGNATURE-----
iIoEABYIADIWIQSacvsUNc7UX4ntmEPzXCl4vpKOKwUCaNgevxQcZWJpZ2dlcnNA
a2VybmVsLm9yZwAKCRDzXCl4vpKOK42cAP4qrRd2p4lLm4lntiL9igO0AVUl7Lsj
2hhYcVWNj24wwgD/bBb5aCCKfreOQbjMldYjL+UlRiYqQAsstlfcwY7EegM=
=aEsr
-----END PGP SIGNATURE-----
Merge tag 'crc-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux
Pull CRC updates from Eric Biggers:
"Update crc_kunit to test the CRC functions in softirq and hardirq
contexts, similar to what the lib/crypto/ KUnit tests do. Move the
helper function needed to do this into a common header.
This is useful mainly to test fallback code paths for when
FPU/SIMD/vector registers are unusable"
* tag 'crc-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux:
Documentation/staging: Fix typo and incorrect citation in crc32.rst
lib/crc: Drop inline from all *_mod_init_arch() functions
lib/crc: Use underlying functions instead of crypto_simd_usable()
lib/crc: crc_kunit: Test CRC computation in interrupt contexts
kunit, lib/crypto: Move run_irq_test() to common header
Use the generic interface which should result in less bulk allocations
during a forking.
A part of this is to abstract the freeing of the sheaf or maple state
allocations into its own function so mas_destroy() and the tree
duplication code can use the same functionality to return any unused
resources.
[andriy.shevchenko@linux.intel.com: remove unused mt_alloc_bulk()]
Signed-off-by: Liam R. Howlett <Liam.Howlett@Oracle.com>
Reviewed-by: Suren Baghdasaryan <surenb@google.com>
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
The fast path through a write will require replacing a single node in
the tree. Using a sheaf (32 nodes) is too heavy for the fast path, so
special case the node store operation by just allocating one node in the
maple state.
Signed-off-by: Liam R. Howlett <Liam.Howlett@Oracle.com>
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Use prefilled sheaves instead of bulk allocations. This should speed up
the allocations and the return path of unused allocations.
Remove the push and pop of nodes from the maple state as this is now
handled by the slab layer with sheaves.
Testing has been removed as necessary since the features of the tree
have been reduced.
Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Reviewed-by: Suren Baghdasaryan <surenb@google.com>
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
kfree() is a little shorter and works with kmem_cache_alloc'd pointers
too. Also lets us remove one more helper.
Signed-off-by: Pedro Falcato <pfalcato@suse.de>
Reviewed-by: Suren Baghdasaryan <surenb@google.com>
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
kfree_rcu is an optimized version of call_rcu + kfree. It used to not be
possible to call it on non-kmalloc objects, but this restriction was
lifted ever since SLOB was dropped from the kernel, and since commit
6c6c47b063 ("mm, slab: call kvfree_rcu_barrier() from kmem_cache_destroy()").
Thus, replace call_rcu + mt_free_rcu with kfree_rcu.
Signed-off-by: Pedro Falcato <pfalcato@suse.de>
Reviewed-by: Harry Yoo <harry.yoo@oracle.com>
Reviewed-by: Suren Baghdasaryan <surenb@google.com>
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Setup the maple_node_cache with percpu sheaves of size 32 to hopefully
improve its performance. Note this will not immediately take advantage
of sheaf batching of kfree_rcu() operations due to the maple tree using
call_rcu with custom callbacks. The followup changes to maple tree will
change that and also make use of the prefilled sheaves functionality.
Reviewed-by: Sidhartha Kumar <sidhartha.kumar@oracle.com>
Reviewed-by: Suren Baghdasaryan <surenb@google.com>
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Bulk insert mode was added to facilitate forking faster, but forking now
uses __mt_dup() to duplicate the tree.
The addition of sheaves has made the bulk allocations difficult to
maintain - since the expected entries would preallocate into the maple
state. A big part of the maple state node allocation was the ability to
push nodes back onto the state for later use, which was essential to the
bulk insert algorithm.
Remove mas_expected_entries() and mas_destroy_rebalance() functions as
well as the MA_STATE_BULK and MA_STATE_REBALANCE maple state flags since
there are no users anymore. Drop the associated testing as well.
Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Reviewed-by: Suren Baghdasaryan <surenb@google.com>
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Commit 16f5dfbc85 ("gfp: include __GFP_NOWARN in GFP_NOWAIT") made
GFP_NOWAIT implicitly include __GFP_NOWARN.
Therefore, explicit __GFP_NOWARN combined with GFP_NOWAIT (e.g.,
`GFP_NOWAIT | __GFP_NOWARN`) is now redundant. Let's clean up these
redundant flags across subsystems.
No functional changes.
Link: https://lkml.kernel.org/r/20250804125657.482109-1-rongqianfeng@vivo.com
Signed-off-by: Qianfeng Rong <rongqianfeng@vivo.com>
Reviewed-by: Wei Yang <richard.weiyang@gmail.com>
Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Make sure to drop the reference taken when looking up the genpool platform
device in of_gen_pool_get() before returning the pool.
Note that holding a reference to a device does typically not prevent its
devres managed resources from being released so there is no point in
keeping the reference.
Link: https://lkml.kernel.org/r/20250924080207.18006-1-johan@kernel.org
Fixes: 9375db07ad ("genalloc: add devres support, allow to find a managed pool by device")
Signed-off-by: Johan Hovold <johan@kernel.org>
Cc: Philipp Zabel <p.zabel@pengutronix.de>
Cc: Vladimir Zapolskiy <vz@mleia.com>
Cc: <stable@vger.kernel.org> [3.10+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
There's really no need for this since it's 0 or 1 when
CONFIG_PANIC_ON_OOPS is disabled/enabled, so just use IS_ENABLED()
instead. The extra symbol goes back to the original code adding it in
commit 2a01bb3885 ("panic: Make panic_on_oops configurable").
Link: https://lkml.kernel.org/r/20250924094303.18521-2-johannes@sipsolutions.net
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
The kernel's CFI implementation uses the KCFI ABI specifically, and is
not strictly tied to a particular compiler. In preparation for GCC
supporting KCFI, rename CONFIG_CFI_CLANG to CONFIG_CFI (along with
associated options).
Use new "transitional" Kconfig option for old CONFIG_CFI_CLANG that will
enable CONFIG_CFI during olddefconfig.
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Reviewed-by: Nathan Chancellor <nathan@kernel.org>
Link: https://lore.kernel.org/r/20250923213422.1105654-3-kees@kernel.org
Signed-off-by: Kees Cook <kees@kernel.org>
Switch 'compressed_formats[]' to the more modern and flexible designated
initializers. This improves readability and allows struct fields to be
reordered. Also use a more concise sentinel marker.
Remove the curly braces around the for loop while we're at it.
No functional changes intended.
Link: https://lkml.kernel.org/r/20250910232350.1308206-2-thorsten.blum@linux.dev
Signed-off-by: Thorsten Blum <thorsten.blum@linux.dev>
Reviewed-by: Kuan-Wei Chiu <visitorckw@gmail.com>
Cc: David Sterba <dsterba@suse.com>
Cc: Nick Terrell <terrelln@fb.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
While rare, memory allocation profiling can contain inaccurate counters if
slab object extension vector allocation fails. That allocation might
succeed later but prior to that, slab allocations that would have used
that object extension vector will not be accounted for. To indicate
incorrect counters, "accurate:no" marker is appended to the call site line
in the /proc/allocinfo output. Bump up /proc/allocinfo version to reflect
the change in the file format and update documentation.
Example output with invalid counters:
allocinfo - version: 2.0
0 0 arch/x86/kernel/kdebugfs.c:105 func:create_setup_data_nodes
0 0 arch/x86/kernel/alternative.c:2090 func:alternatives_smp_module_add
0 0 arch/x86/kernel/alternative.c:127 func:__its_alloc accurate:no
0 0 arch/x86/kernel/fpu/regset.c:160 func:xstateregs_set
0 0 arch/x86/kernel/fpu/xstate.c:1590 func:fpstate_realloc
0 0 arch/x86/kernel/cpu/aperfmperf.c:379 func:arch_enable_hybrid_capacity_scale
0 0 arch/x86/kernel/cpu/amd_cache_disable.c:258 func:init_amd_l3_attrs
49152 48 arch/x86/kernel/cpu/mce/core.c:2709 func:mce_device_create accurate:no
32768 1 arch/x86/kernel/cpu/mce/genpool.c:132 func:mce_gen_pool_create
0 0 arch/x86/kernel/cpu/mce/amd.c:1341 func:mce_threshold_create_device
[surenb@google.com: document new "accurate:no" marker]
Fixes: 39d117e04d15 ("alloc_tag: mark inaccurate allocation counters in /proc/allocinfo output")
[akpm@linux-foundation.org: simplification per Usama, reflow text]
[akpm@linux-foundation.org: add newline to prevent docs warning, per Randy]
Link: https://lkml.kernel.org/r/20250915230224.4115531-1-surenb@google.com
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Suggested-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Shakeel Butt <shakeel.butt@linux.dev>
Acked-by: Usama Arif <usamaarif642@gmail.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: David Rientjes <rientjes@google.com>
Cc: David Wang <00107082@163.com>
Cc: Kent Overstreet <kent.overstreet@linux.dev>
Cc: Pasha Tatashin <pasha.tatashin@soleen.com>
Cc: Roman Gushchin <roman.gushchin@linux.dev>
Cc: Sourav Panda <souravpanda@google.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Memory profiling can be shut down due to reasons like a failure during
initialization. When this happens, the user should not be able to
re-enable it. Current sysctrl interface does not handle this properly and
will allow re-enabling memory profiling. Fix this by checking for this
condition during sysctrl write operation.
Link: https://lkml.kernel.org/r/20250915212756.3998938-3-surenb@google.com
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Acked-by: Shakeel Butt <shakeel.butt@linux.dev>
Acked-by: Usama Arif <usamaarif642@gmail.com>
Cc: David Wang <00107082@163.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Kent Overstreet <kent.overstreet@linux.dev>
Cc: Pasha Tatashin <pasha.tatashin@soleen.com>
Cc: Sourav Panda <souravpanda@google.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Patch series "Minor fixes for memory allocation profiling", v2.
Over the last couple months I gathered a few reports of minor issues in
memory allocation profiling which are addressed in this patchset.
This patch (of 2):
When bulk-freeing an array of pages use release_pages() instead of freeing
them page-by-page.
Link: https://lkml.kernel.org/r/20250915212756.3998938-1-surenb@google.com
Link: https://lkml.kernel.org/r/20250915212756.3998938-2-surenb@google.com
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Suggested-by: Andrew Morton <akpm@linux-foundation.org>
Suggested-by: Usama Arif <usamaarif642@gmail.com>
Acked-by: Shakeel Butt <shakeel.butt@linux.dev>
Acked-by: Usama Arif <usamaarif642@gmail.com>
Cc: David Wang <00107082@163.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Kent Overstreet <kent.overstreet@linux.dev>
Cc: Pasha Tatashin <pasha.tatashin@soleen.com>
Cc: Sourav Panda <souravpanda@google.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Patch series "kasan: unify kasan_enabled() and remove arch-specific
implementations", v6.
This patch series addresses the fragmentation in KASAN initialization
across architectures by introducing a unified approach that eliminates
duplicate static keys and arch-specific kasan_arch_is_ready()
implementations.
The core issue is that different architectures have inconsistent approaches
to KASAN readiness tracking:
- PowerPC, LoongArch, and UML arch, each implement own kasan_arch_is_ready()
- Only HW_TAGS mode had a unified static key (kasan_flag_enabled)
- Generic and SW_TAGS modes relied on arch-specific solutions
or always-on behavior
This patch (of 2):
Introduce CONFIG_ARCH_DEFER_KASAN to identify architectures [1] that need
to defer KASAN initialization until shadow memory is properly set up, and
unify the static key infrastructure across all KASAN modes.
[1] PowerPC, UML, LoongArch selects ARCH_DEFER_KASAN.
The core issue is that different architectures haveinconsistent approaches
to KASAN readiness tracking:
- PowerPC, LoongArch, and UML arch, each implement own
kasan_arch_is_ready()
- Only HW_TAGS mode had a unified static key (kasan_flag_enabled)
- Generic and SW_TAGS modes relied on arch-specific solutions or always-on
behavior
This patch addresses the fragmentation in KASAN initialization across
architectures by introducing a unified approach that eliminates duplicate
static keys and arch-specific kasan_arch_is_ready() implementations.
Let's replace kasan_arch_is_ready() with existing kasan_enabled() check,
which examines the static key being enabled if arch selects
ARCH_DEFER_KASAN or has HW_TAGS mode support. For other arch,
kasan_enabled() checks the enablement during compile time.
Now KASAN users can use a single kasan_enabled() check everywhere.
Link: https://lkml.kernel.org/r/20250810125746.1105476-1-snovitoll@gmail.com
Link: https://lkml.kernel.org/r/20250810125746.1105476-2-snovitoll@gmail.com
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=217049
Signed-off-by: Sabyrzhan Tasbolatov <snovitoll@gmail.com>
Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Reviewed-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com> #powerpc
Cc: Alexander Gordeev <agordeev@linux.ibm.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Alexandre Ghiti <alex@ghiti.fr>
Cc: Alexandre Ghiti <alexghiti@rivosinc.com>
Cc: Andrey Konovalov <andreyknvl@gmail.com>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Baoquan He <bhe@redhat.com>
Cc: David Gow <davidgow@google.com>
Cc: Dmitriy Vyukov <dvyukov@google.com>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Huacai Chen <chenhuacai@loongson.cn>
Cc: Marco Elver <elver@google.com>
Cc: Qing Zhang <zhangqing@loongson.cn>
Cc: Sabyrzhan Tasbolatov <snovitoll@gmail.com>
Cc: Vincenzo Frascino <vincenzo.frascino@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-----BEGIN PGP SIGNATURE-----
iQFSBAABCgA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAmjHMcoeHHRvcnZhbGRz
QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiG5bwH/23w8iGB4hf7L/7Z
e7blX42Pe9EXA1uK62iWmwEjDvBuJ7TmVfXH09qYJ56fj6/rJEdpQwtBMd4ypL81
QA/7lq5UEl0apPzMN86J8EHCzmjNzv7o+UtEd4C/hPFEZHZJa5Hqj9CBglSwSCEn
fTkLk7Gl6s8SfzBQ/rXX6/ZChAB/RleVWabDlIQMDz++/+9DZ0aqphj+5bYSqysL
ROQOaj4LOICuLfrup9J61hKNBoF7Dv3sO20vc+Iic0XHRPZ6/lKCnHgCUsqVIOOQ
L4kDT7XKQg+n3ttjrMe84/8iHZdWtf8VMWrtniPT8e1YGYuMpavVplgIcFoFCoNm
Qa7NPDs=
=rZeT
-----END PGP SIGNATURE-----
gpgsig -----BEGIN PGP SIGNATURE-----
iHUEABYKAB0WIQR74yXHMTGczQHYypIdayaRccAalgUCaM3AYQAKCRAdayaRccAa
lkrsAQCfR0LymE8Hq+Vfk65DK4qZxigaXGTfg5n3xlPhTAh/iQEA02N0/ReHOOdH
nQde8709saIFE5axIMFvdWzbFPDtWwE=
=eIkf
-----END PGP SIGNATURE-----
Merge 6.17-rc6 into kbuild-next
Commit bd7c231212 ("pinctrl: meson: Fix typo in device table macro")
is needed in kbuild-next to avoid a build error with a future change.
While at it, address the conflict between commit 41f9049cff ("riscv:
Only allow LTO with CMODEL_MEDANY") and commit 6578a1ff6a ("riscv:
Remove version check for LTO_CLANG selects"), as reported by Stephen
Rothwell [1].
Link: https://lore.kernel.org/20250908134913.68778b7b@canb.auug.org.au/ [1]
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Add an implementation of sha256_finup_2x_arch() for x86_64. It
interleaves the computation of two SHA-256 hashes using the x86 SHA-NI
instructions. dm-verity and fs-verity will take advantage of this for
greatly improved performance on capable CPUs.
This increases the throughput of SHA-256 hashing 4096-byte messages by
the following amounts on the following CPUs:
Intel Ice Lake (server): 4%
Intel Sapphire Rapids: 38%
Intel Emerald Rapids: 38%
AMD Zen 1 (Threadripper 1950X): 84%
AMD Zen 4 (EPYC 9B14): 98%
AMD Zen 5 (Ryzen 9 9950X): 64%
For now, this seems to benefit AMD more than Intel. This seems to be
because current AMD CPUs support concurrent execution of the SHA-NI
instructions, but unfortunately current Intel CPUs don't, except for the
sha256msg2 instruction. Hopefully future Intel CPUs will support SHA-NI
on more execution ports. Zen 1 supports 2 concurrent sha256rnds2, and
Zen 4 supports 4 concurrent sha256rnds2, which suggests that even better
performance may be achievable on Zen 4 by interleaving more than two
hashes. However, doing so poses a number of trade-offs, and furthermore
Zen 5 goes back to supporting "only" 2 concurrent sha256rnds2.
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250915160819.140019-4-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Add an implementation of sha256_finup_2x_arch() for arm64. It
interleaves the computation of two SHA-256 hashes using the ARMv8
SHA-256 instructions. dm-verity and fs-verity will take advantage of
this for greatly improved performance on capable CPUs.
This increases the throughput of SHA-256 hashing 4096-byte messages by
the following amounts on the following CPUs:
ARM Cortex-X1: 70%
ARM Cortex-X3: 68%
ARM Cortex-A76: 65%
ARM Cortex-A715: 43%
ARM Cortex-A510: 25%
ARM Cortex-A55: 8%
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250915160819.140019-3-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Many arm64 and x86_64 CPUs can compute two SHA-256 hashes in nearly the
same speed as one, if the instructions are interleaved. This is because
SHA-256 is serialized block-by-block, and two interleaved hashes take
much better advantage of the CPU's instruction-level parallelism.
Meanwhile, a very common use case for SHA-256 hashing in the Linux
kernel is dm-verity and fs-verity. Both use a Merkle tree that has a
fixed block size, usually 4096 bytes with an empty or 32-byte salt
prepended. Usually, many blocks need to be hashed at a time. This is
an ideal scenario for 2-way interleaved hashing.
To enable this optimization, add a new function sha256_finup_2x() to the
SHA-256 library API. It computes the hash of two equal-length messages,
starting from a common initial context.
For now it always falls back to sequential processing. Later patches
will wire up arm64 and x86_64 optimized implementations.
Note that the interleaving factor could in principle be higher than 2x.
However, that runs into many practical difficulties and CPU throughput
limitations. Thus, both the implementations I'm adding are 2x. In the
interest of using the simplest solution, the API matches that.
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250915160819.140019-2-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Since wp$$==wq$$, it doesn't need to load the same data twice, use move
instruction to replace one of the loads to let the program run faster.
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Signed-off-by: Chunyan Zhang <zhangchunyan@iscas.ac.cn>
Link: https://lore.kernel.org/r/20250718072711.3865118-3-zhangchunyan@iscas.ac.cn
Signed-off-by: Paul Walmsley <pjw@kernel.org>
These two C files don't reference things defined in simd.h or types.h
so remove these redundant #inclusions.
Fixes: 6093faaf95 ("raid6: Add RISC-V SIMD syndrome and recovery calculations")
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Signed-off-by: Chunyan Zhang <zhangchunyan@iscas.ac.cn>
Reviewed-by: Nutty Liu <liujingqi@lanxincomputing.com>
Link: https://lore.kernel.org/r/20250718072711.3865118-2-zhangchunyan@iscas.ac.cn
Signed-off-by: Paul Walmsley <pjw@kernel.org>
Various KUnit tests require PCI infrastructure to work. All normal
platforms enable PCI by default, but UML does not. Enabling PCI from
.kunitconfig files is problematic as it would not be portable. So in
commit 6fc3a8636a ("kunit: tool: Enable virtio/PCI by default on UML")
PCI was enabled by way of CONFIG_UML_PCI_OVER_VIRTIO=y. However
CONFIG_UML_PCI_OVER_VIRTIO requires additional configuration of
CONFIG_UML_PCI_OVER_VIRTIO_DEVICE_ID or will otherwise trigger a WARN() in
virtio_pcidev_init(). However there is no one correct value for
UML_PCI_OVER_VIRTIO_DEVICE_ID which could be used by default.
This warning is confusing when debugging test failures.
On the other hand, the functionality of CONFIG_UML_PCI_OVER_VIRTIO is not
used at all, given that it is completely non-functional as indicated by
the WARN() in question. Instead it is only used as a way to enable
CONFIG_UML_PCI which itself is not directly configurable.
Instead of going through CONFIG_UML_PCI_OVER_VIRTIO, introduce a custom
configuration option which enables CONFIG_UML_PCI without triggering
warnings or building dead code.
Link: https://lore.kernel.org/r/20250908-kunit-uml-pci-v2-1-d8eba5f73c9d@linutronix.de
Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Reviewed-by: Johannes Berg <johannes@sipsolutions.net>
Reviewed-by: David Gow <davidgow@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Previously btree_merge() called btree_last() only to test existence, then
performed an extra btree_lookup() to fetch the value. This patch changes
it to directly use the value returned by btree_last(), avoiding redundant
lookups and simplifying the merge loop.
Link: https://lkml.kernel.org/r/20250826161741.686704-1-409411716@gms.tku.edu.tw
Signed-off-by: Guan-Chun Wu <409411716@gms.tku.edu.tw>
Cc: Kuan-Wei Chiu <visitorckw@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
The helper this_cpu_in_panic() duplicated logic already provided by
panic_on_this_cpu().
Remove this_cpu_in_panic() and switch all users to panic_on_this_cpu().
This simplifies the code and avoids having two helpers for the same check.
Link: https://lkml.kernel.org/r/20250825022947.1596226-8-wangjinchao600@gmail.com
Signed-off-by: Jinchao Wang <wangjinchao600@gmail.com>
Cc: Anna Schumaker <anna.schumaker@oracle.com>
Cc: Baoquan He <bhe@redhat.com>
Cc: "Darrick J. Wong" <djwong@kernel.org>
Cc: Dave Young <dyoung@redhat.com>
Cc: Doug Anderson <dianders@chromium.org>
Cc: "Guilherme G. Piccoli" <gpiccoli@igalia.com>
Cc: Helge Deller <deller@gmx.de>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: Joanthan Cameron <Jonathan.Cameron@huawei.com>
Cc: Joel Granados <joel.granados@kernel.org>
Cc: John Ogness <john.ogness@linutronix.de>
Cc: Kees Cook <kees@kernel.org>
Cc: Li Huafei <lihuafei1@huawei.com>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: Luo Gengkun <luogengkun@huaweicloud.com>
Cc: Max Kellermann <max.kellermann@ionos.com>
Cc: Nam Cao <namcao@linutronix.de>
Cc: oushixiong <oushixiong@kylinos.cn>
Cc: Petr Mladek <pmladek@suse.com>
Cc: Qianqiang Liu <qianqiang.liu@163.com>
Cc: Sergey Senozhatsky <senozhatsky@chromium.org>
Cc: Sohil Mehta <sohil.mehta@intel.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Tejun Heo <tj@kernel.org>
Cc: Thomas Gleinxer <tglx@linutronix.de>
Cc: Thomas Zimemrmann <tzimmermann@suse.de>
Cc: Thorsten Blum <thorsten.blum@linux.dev>
Cc: Ville Syrjala <ville.syrjala@linux.intel.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: Yicong Yang <yangyicong@hisilicon.com>
Cc: Yunhui Cui <cuiyunhui@bytedance.com>
Cc: Yury Norov (NVIDIA) <yury.norov@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Generalization of panic_print's dump function [1] has been merged, and
this patchset is to address some remaining issues, like adding note of the
obsoletion of 'panic_print' cmdline parameter, refining the kernel
document for panic_print, and hardening some string management.
This patch (of 4):
It is a normal case that bitmask parameter is 0, so pre-initialize the
names[] to null string to cover this case.
Also remove the superfluous "+1" in names[sizeof(sys_info_avail) + 1],
which is needed for 'strlen()', but not for 'sizeof()'.
Link: https://lkml.kernel.org/r/20250825025701.81921-1-feng.tang@linux.alibaba.com
Link: https://lkml.kernel.org/r/20250825025701.81921-2-feng.tang@linux.alibaba.com
Link: Link: https://lkml.kernel.org/r/20250703021004.42328-1-feng.tang@linux.alibaba.com [1]
Signed-off-by: Feng Tang <feng.tang@linux.alibaba.com>
Suggested-by: Petr Mladek <pmladek@suse.com>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Cc: Askar Safin <safinaskar@zohomail.com>
Cc: John Ogness <john.ogness@linutronix.de>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Lance Yang <lance.yang@linux.dev>
Cc: "Paul E . McKenney" <paulmck@kernel.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Replace the ternary (enable ? "on" : "off") with the str_on_off() helper
from string_choices.h. This improves readability by replacing the
three-operand ternary with a single function call, ensures consistent
string output, and allows potential string deduplication by the linker,
resulting in a slightly smaller binary.
Link: https://lkml.kernel.org/r/20250814093827.237980-1-visitorckw@gmail.com
Signed-off-by: Kuan-Wei Chiu <visitorckw@gmail.com>
Cc: Ching-Chun (Jim) Huang <jserv@ccns.ncku.edu.tw>
Cc: Kent Overstreet <kent.overstreet@linux.dev>
Cc: Suren Baghdasaryan <surenb@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Replace ternary (condition ? "true" : "false") expressions with the
str_true_false() helper from string_choices.h. This improves readability
by replacing the three-operand ternary with a single function call,
ensures consistent string output, and allows potential string
deduplication by the linker, resulting in a slightly smaller binary.
Link: https://lkml.kernel.org/r/20250814095033.244034-1-visitorckw@gmail.com
Signed-off-by: Kuan-Wei Chiu <visitorckw@gmail.com>
Cc: Kuan-Wei Chiu <visitorckw@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
kzalloc() has already been initialized to full 0 space, there is no need
to use memset() to initialize again.
Link: https://lkml.kernel.org/r/20250811082739.378284-1-liaoyuanhong@vivo.com
Signed-off-by: Liao Yuanhong <liaoyuanhong@vivo.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Commit 16f5dfbc85 ("gfp: include __GFP_NOWARN in GFP_NOWAIT") made
GFP_NOWAIT implicitly include __GFP_NOWARN.
Therefore, explicit __GFP_NOWARN combined with GFP_NOWAIT (e.g.,
`GFP_NOWAIT | __GFP_NOWARN`) is now redundant. Let's clean up these
redundant flags across subsystems.
No functional changes.
Link: https://lkml.kernel.org/r/20250805023031.331718-1-rongqianfeng@vivo.com
Signed-off-by: Qianfeng Rong <rongqianfeng@vivo.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
MAPLE_PARENT_RANGE32 should be 0x02 as a 32 bit node is indicated by the
bit pattern 0b010 which is the hex value 0x02. There are no users
currently, so there is no associated bug with this wrong value.
Fix typo Note -> Node and replace x with b to indicate binary values.
Link: https://lkml.kernel.org/r/20250826151344.403286-1-sidhartha.kumar@oracle.com
Fixes: 54a611b605 ("Maple Tree: add new data structure")
Signed-off-by: Sidhartha Kumar <sidhartha.kumar@oracle.com>
Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
The result of integer comparison already evaluates to bool. No need for
explicit conversion.
No functional impact.
Link: https://lkml.kernel.org/r/20250819070457.486348-1-zhao.xichao@vivo.com
Signed-off-by: Xichao Zhao <zhao.xichao@vivo.com>
Reviewed-by: Alistair Popple <apopple@nvidia.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
* Update kho_test_save() so that folios array won't be freed when
returning from the function and the fdt will be freed on error
* Reset state->nr_folios to 0 in kho_test_generate_data() on error
* Simplify allocation of folios info in fdt.
Link: https://lkml.kernel.org/r/20250811082510.4154080-3-rppt@kernel.org
Fixes: b753522bed ("kho: add test for kexec handover")
Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Reported-by: Pratyush Yadav <pratyush@kernel.org>
Closes: https://lore.kernel.org/all/mafs0zfcjcepf.fsf@kernel.org
Reviewed-by: Pratyush Yadav <pratyush@kernel.org>
Cc: Alexander Graf <graf@amazon.com>
Cc: Baoquan He <bhe@redhat.com>
Cc: Changyuan Lyu <changyuanl@google.com>
Cc: Pasha Tatashin <pasha.tatashin@soleen.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Thomas Weißschuh <linux@weissschuh.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Commit 16f5dfbc85 ("gfp: include __GFP_NOWARN in GFP_NOWAIT") made
GFP_NOWAIT implicitly include __GFP_NOWARN.
Therefore, explicit __GFP_NOWARN combined with GFP_NOWAIT (e.g.,
`GFP_NOWAIT | __GFP_NOWARN`) is now redundant. Let's clean up these
redundant flags across subsystems.
No functional changes.
Link: https://lkml.kernel.org/r/20250804130018.484321-1-rongqianfeng@vivo.com
Signed-off-by: Qianfeng Rong <rongqianfeng@vivo.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reimplement k[v]realloc_node() to be able to set node and alignment should
a user need to do so. In order to do that while retaining the maximal
backward compatibility, add k[v]realloc_node_align() functions and
redefine the rest of API using these new ones.
While doing that, we also keep the number of _noprof variants to a
minimum, which implies some changes to the existing users of older _noprof
functions, that basically being bcachefs.
With that change we also provide the ability for the Rust part of the
kernel to set node and alignment in its K[v]xxx [re]allocations.
Link: https://lkml.kernel.org/r/20250806124147.1724658-1-vitaly.wool@konsulko.se
Signed-off-by: Vitaly Wool <vitaly.wool@konsulko.se>
Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Alice Ryhl <aliceryhl@google.com>
Cc: Danilo Krummrich <dakr@kernel.org>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Jann Horn <jannh@google.com>
Cc: Kent Overstreet <kent.overstreet@linux.dev>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Uladzislau Rezki (Sony) <urezki@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
No more callers.
Signed-off-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Mike Snitzer <snitzer@kernel.org>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
The field timer->base->get_time is a private implementation detail and
should not be accessed outside of the hrtimer core.
Switch to the equivalent helper.
Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/all/20250821-hrtimer-cleanup-get_time-v2-4-3ae822e5bfbd@linutronix.de
While tracking down a problem where constant expressions used by
BUILD_BUG_ON() suddenly stopped working[1], we found that an added static
initializer was convincing the compiler that it couldn't track the state
of the prior statically initialized value. Tracing this down found that
ffs() was used in the initializer macro, but since it wasn't marked with
__attribute_const__, the compiler had to assume the function might
change variable states as a side-effect (which is not true for ffs(),
which provides deterministic math results).
Validate all the __attibute_const__ annotations were found for all
architectures by reproducing the specific problem encountered in the
original bug report.
Build and run tested with everything I could reach with KUnit:
$ ./tools/testing/kunit/kunit.py run --arch=x86_64 ffs
$ ./tools/testing/kunit/kunit.py run --arch=i386 ffs
$ ./tools/testing/kunit/kunit.py run --arch=arm64 --make_options "CROSS_COMPILE=aarch64-linux-gnu-" ffs
$ ./tools/testing/kunit/kunit.py run --arch=arm --make_options "CROSS_COMPILE=arm-linux-gnueabi-" ffs
$ ./tools/testing/kunit/kunit.py run --arch=powerpc ffs
$ ./tools/testing/kunit/kunit.py run --arch=powerpc32 ffs
$ ./tools/testing/kunit/kunit.py run --arch=powerpcle ffs
$ ./tools/testing/kunit/kunit.py run --arch=m68k ffs
$ ./tools/testing/kunit/kunit.py run --arch=loongarch ffs
$ ./tools/testing/kunit/kunit.py run --arch=s390 --make_options "CROSS_COMPILE=s390x-linux-gnu-" ffs
$ ./tools/testing/kunit/kunit.py run --arch=riscv --make_options "CROSS_COMPILE=riscv64-linux-gnu-" ffs
$ ./tools/testing/kunit/kunit.py run --arch=riscv32 --make_options "CROSS_COMPILE=riscv64-linux-gnu-" ffs
$ ./tools/testing/kunit/kunit.py run --arch=sparc --make_options "CROSS_COMPILE=sparc64-linux-gnu-" ffs
$ ./tools/testing/kunit/kunit.py run --arch=sparc64 --make_options "CROSS_COMPILE=sparc64-linux-gnu-" ffs
$ ./tools/testing/kunit/kunit.py run --arch=alpha --make_options "CROSS_COMPILE=alpha-linux-gnu-" ffs
$ ./tools/testing/kunit/kunit.py run --arch=sh --make_options "CROSS_COMPILE=sh4-linux-gnu-" ffs
Closes: https://github.com/KSPP/linux/issues/364
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/20250804164417.1612371-17-kees@kernel.org
Signed-off-by: Kees Cook <kees@kernel.org>
While tracking down a problem where constant expressions used by
BUILD_BUG_ON() suddenly stopped working[1], we found that an added static
initializer was convincing the compiler that it couldn't track the state
of the prior statically initialized value. Tracing this down found that
ffs() was used in the initializer macro, but since it wasn't marked with
__attribute__const__, the compiler had to assume the function might
change variable states as a side-effect (which is not true for ffs(),
which provides deterministic math results).
Add missing __attribute_const__ annotations to generic implementations of
ffs(), __ffs(), fls(), and __fls() functions. These are pure mathematical
functions that always return the same result for the same input with no
side effects, making them eligible for compiler optimization.
Build tested with x86_64 defconfig using GCC 14.2.0, which should validate
the implementations when used by ARM, ARM64, LoongArch, Microblaze,
NIOS2, and SPARC32 architectures.
Link: https://github.com/KSPP/linux/issues/364 [1]
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/20250804164417.1612371-2-kees@kernel.org
Signed-off-by: Kees Cook <kees@kernel.org>
Add KUnit tests for ffs()-family bit scanning functions: ffs(), __ffs(),
fls(), __fls(), fls64(), __ffs64(), and ffz(). The tests validate
mathematical relationships (e.g. ffs(x) == __ffs(x) + 1), and test zero
values, power-of-2 patterns, maximum values, and sparse bit patterns.
Build and run tested with everything I could reach with KUnit:
$ ./tools/testing/kunit/kunit.py run --arch=x86_64 ffs
$ ./tools/testing/kunit/kunit.py run --arch=i386 ffs
$ ./tools/testing/kunit/kunit.py run --arch=arm64 --make_options "CROSS_COMPILE=aarch64-linux-gnu-" ffs
$ ./tools/testing/kunit/kunit.py run --arch=arm --make_options "CROSS_COMPILE=arm-linux-gnueabihf-" ffs
$ ./tools/testing/kunit/kunit.py run --arch=powerpc ffs
$ ./tools/testing/kunit/kunit.py run --arch=powerpc32 ffs
$ ./tools/testing/kunit/kunit.py run --arch=powerpcle ffs
$ ./tools/testing/kunit/kunit.py run --arch=riscv --make_options "CROSS_COMPILE=riscv64-linux-gnu-" ffs
$ ./tools/testing/kunit/kunit.py run --arch=riscv32 --make_options "CROSS_COMPILE=riscv64-linux-gnu-" ffs
$ ./tools/testing/kunit/kunit.py run --arch=s390 --make_options "CROSS_COMPILE=s390x-linux-gnu-" ffs
$ ./tools/testing/kunit/kunit.py run --arch=m68k ffs
$ ./tools/testing/kunit/kunit.py run --arch=loongarch ffs
$ ./tools/testing/kunit/kunit.py run --arch=mips --make_options "CROSS_COMPILE=mipsel-linux-gnu-" ffs
$ ./tools/testing/kunit/kunit.py run --arch=sparc --make_options "CROSS_COMPILE=sparc64-linux-gnu-" ffs
$ ./tools/testing/kunit/kunit.py run --arch=sparc64 --make_options "CROSS_COMPILE=sparc64-linux-gnu-" ffs
$ ./tools/testing/kunit/kunit.py run --arch=alpha --make_options "CROSS_COMPILE=alpha-linux-gnu-" ffs
$ ./tools/testing/kunit/kunit.py run --arch=sh --make_options "CROSS_COMPILE=sh4-linux-gnu-" ffs
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/20250804164417.1612371-1-kees@kernel.org
Signed-off-by: Kees Cook <kees@kernel.org>
Now that the Curve25519 library has been disentangled from CRYPTO,
adding CRYPTO_SELFTESTS as a default value of
CRYPTO_LIB_CURVE25519_KUNIT_TEST no longer causes a recursive kconfig
dependency. Do this, which makes this option consistent with the other
crypto KUnit test options in the same file.
Link: https://lore.kernel.org/r/20250906213523.84915-12-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Reorganize the Curve25519 library code:
- Build a single libcurve25519 module, instead of up to three modules:
libcurve25519, libcurve25519-generic, and an arch-specific module.
- Move the arch-specific Curve25519 code from arch/$(SRCARCH)/crypto/ to
lib/crypto/$(SRCARCH)/. Centralize the build rules into
lib/crypto/Makefile and lib/crypto/Kconfig.
- Include the arch-specific code directly in lib/crypto/curve25519.c via
a header, rather than using a separate .c file.
- Eliminate the entanglement with CRYPTO. CRYPTO_LIB_CURVE25519 no
longer selects CRYPTO, and the arch-specific Curve25519 code no longer
depends on CRYPTO.
This brings Curve25519 in line with the latest conventions for
lib/crypto/, used by other algorithms. The exception is that I kept the
generic code in separate translation units for now. (Some of the
function names collide between the x86 and generic Curve25519 code. And
the Curve25519 functions are very long anyway, so inlining doesn't
matter as much for Curve25519 as it does for some other algorithms.)
Link: https://lore.kernel.org/r/20250906213523.84915-11-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Move the Curve25519 test from an ad-hoc self-test to a KUnit test.
Generally keep the same test logic for now, just translated to KUnit.
There's one exception, which is that I dropped the incomplete test of
curve25519_generic(). The approach I'm taking to cover the different
implementations with the KUnit tests is to just rely on booting kernels
in QEMU with different '-cpu' options, rather than try to make the tests
(incompletely) test multiple implementations on one CPU. This way, both
the test and the library API are simpler.
This commit makes the file lib/crypto/curve25519.c no longer needed, as
its only purpose was to call the self-test. However, keep it for now,
since a later commit will add code to it again.
Temporarily omit the default value of CRYPTO_SELFTESTS that the other
lib/crypto/ KUnit tests have. It would cause a recursive kconfig
dependency, since the Curve25519 code is still entangled with CRYPTO. A
later commit will fix that.
Link: https://lore.kernel.org/r/20250906213523.84915-8-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
All architectures implementing time-related functionality in the vDSO are
using the generic vDSO library which handles time namespaces properly.
Remove the now unnecessary Kconfig symbol.
Enables the use of time namespaces on architectures, which use the
generic vDSO but did not enable GENERIC_VDSO_TIME_NS, namely MIPS and arm.
Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Link: https://lore.kernel.org/all/20250826-vdso-cleanups-v1-10-d9b65750e49f@linutronix.de
All users of the generic vDSO library also use the generic vDSO datastore.
Remove the now unnecessary Kconfig symbol.
Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Link: https://lore.kernel.org/all/20250826-vdso-cleanups-v1-9-d9b65750e49f@linutronix.de
All calls of these functions are already gated behind CONFIG_TIME_NS. The
compiler will already optimize them away if time namespaces are disabled.
Drop the unnecessary stubs.
Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/all/20250826-vdso-cleanups-v1-4-d9b65750e49f@linutronix.de
When the generic vDSO does not provide time functions, as for example on
riscv32, then the time data store is not necessary.
Avoid allocating these time data pages when not used.
Fixes: df7fcbefa7 ("vdso: Add generic time data storage")
Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/all/20250826-vdso-cleanups-v1-1-d9b65750e49f@linutronix.de
Add a KUnit test suite for BLAKE2s. Most of the core test logic is in
the previously-added hash-test-template.h. This commit just adds the
actual KUnit suite, commits the generated test vectors to the tree so
that gen-hash-testvecs.py won't have to be run at build time, and adds a
few BLAKE2s-specific test cases.
This is the replacement for blake2s-selftest, which an earlier commit
removed. Improvements over blake2s-selftest include integration with
KUnit, more comprehensive test cases, and support for benchmarking.
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250827151131.27733-13-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
As was done with the other algorithms, reorganize the BLAKE2s code so
that the generic implementation and the arch-specific "glue" code is
consolidated into a single translation unit, so that the compiler will
inline the functions and automatically decide whether to include the
generic code in the resulting binary or not.
Similarly, also consolidate the build rules into
lib/crypto/{Makefile,Kconfig}. This removes the last uses of
lib/crypto/{arm,x86}/{Makefile,Kconfig}, so remove those too.
Don't keep the !KMSAN dependency. It was needed only for other
algorithms such as ChaCha that initialize memory from assembly code.
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250827151131.27733-12-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Move blake2s_compress_generic() from blake2s-generic.c to blake2s.c.
For now it's still guarded by CONFIG_CRYPTO_LIB_BLAKE2S_GENERIC, but
this prepares for changing it to a 'static __maybe_unused' function and
just using the compiler to automatically decide its inclusion.
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250827151131.27733-11-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
When support for a crypto algorithm is enabled, the arch-optimized
implementation of that algorithm should be enabled too. We've learned
this the hard way many times over the years: people regularly forget to
enable the arch-optimized implementations of the crypto algorithms,
resulting in significant performance being left on the table.
Currently, BLAKE2s support is always enabled ('obj-y'), since random.c
uses it. Therefore, the arch-optimized BLAKE2s code, which exists for
ARM and x86_64, should be always enabled too. Let's do that.
Note that the effect on kernel image size is very small and should not
be a concern. On ARM, enabling CRYPTO_BLAKE2S_ARM actually *shrinks*
the kernel size by about 1200 bytes, since the ARM-optimized
blake2s_compress() completely replaces the generic blake2s_compress().
On x86_64, enabling CRYPTO_BLAKE2S_X86 increases the kernel size by
about 1400 bytes, as the generic blake2s_compress() is still included as
a fallback; however, for context, that is only about a quarter the size
of the generic blake2s_compress(). The x86_64 optimized BLAKE2s code
uses much less icache at runtime than the generic code.
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250827151131.27733-10-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Save 480 bytes of .rodata by replacing the .long constants with .bytes,
and using the vpmovzxbd instruction to expand them.
Also update the code to do the loads before incrementing %rax rather
than after. This avoids the need for the first load to use an offset.
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250827151131.27733-8-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Consolidate the ChaCha code into a single module (excluding
chacha-block-generic.c which remains always built-in for random.c),
similar to various other algorithms:
- Each arch now provides a header file lib/crypto/$(SRCARCH)/chacha.h,
replacing lib/crypto/$(SRCARCH)/chacha*.c. The header defines
chacha_crypt_arch() and hchacha_block_arch(). It is included by
lib/crypto/chacha.c, and thus the code gets built into the single
libchacha module, with improved inlining in some cases.
- Whether arch-optimized ChaCha is buildable is now controlled centrally
by lib/crypto/Kconfig instead of by lib/crypto/$(SRCARCH)/Kconfig.
The conditions for enabling it remain the same as before, and it
remains enabled by default.
- Any additional arch-specific translation units for the optimized
ChaCha code, such as assembly files, are now compiled by
lib/crypto/Makefile instead of lib/crypto/$(SRCARCH)/Makefile.
This removes the last use for the Makefile and Kconfig files in the
arm64, mips, powerpc, riscv, and s390 subdirectories of lib/crypto/. So
also remove those files and the references to them.
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250827151131.27733-7-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Rename libchacha.c to chacha.c to make the naming consistent with other
algorithms and allow additional source files to be added to the
libchacha module. This file currently contains chacha_crypt_generic(),
but it will soon be updated to contain chacha_crypt().
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250827151131.27733-6-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Rename chacha.c to chacha-block-generic.c to free up the name chacha.c
for the high-level API entry points (chacha_crypt() and
hchacha_block()), similar to the other algorithms.
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250827151131.27733-5-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
This is a straight import of the OpenSSL/CRYPTOGAMS Poly1305
implementation for riscv authored by Andy Polyakov. The file
'poly1305-riscv.pl' is taken straight from
https://github.com/dot-asm/cryptogams commit
5e3fba73576244708a752fa61a8e93e587f271bb. This patch was tested on
SpacemiT X60, with 2~2.5x improvement over generic implementation.
Signed-off-by: Chunyan Zhang <zhangchunyan@iscas.ac.cn>
Signed-off-by: Zhihang Shao <zhihang.shao.iscas@gmail.com>
[EB: ported to lib/crypto/riscv/]
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250829152513.92459-4-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Consolidate the Poly1305 code into a single module, similar to various
other algorithms (SHA-1, SHA-256, SHA-512, etc.):
- Each arch now provides a header file lib/crypto/$(SRCARCH)/poly1305.h,
replacing lib/crypto/$(SRCARCH)/poly1305*.c. The header defines
poly1305_block_init(), poly1305_blocks(), poly1305_emit(), and
optionally poly1305_mod_init_arch(). It is included by
lib/crypto/poly1305.c, and thus the code gets built into the single
libpoly1305 module, with improved inlining in some cases.
- Whether arch-optimized Poly1305 is buildable is now controlled
centrally by lib/crypto/Kconfig instead of by
lib/crypto/$(SRCARCH)/Kconfig. The conditions for enabling it remain
the same as before, and it remains enabled by default. (The PPC64 one
remains unconditionally disabled due to 'depends on BROKEN'.)
- Any additional arch-specific translation units for the optimized
Poly1305 code, such as assembly files, are now compiled by
lib/crypto/Makefile instead of lib/crypto/$(SRCARCH)/Makefile.
A special consideration is needed because the Adiantum code uses the
poly1305_core_*() functions directly. For now, just carry forward that
approach. This means retaining the CRYPTO_LIB_POLY1305_GENERIC kconfig
symbol, and keeping the poly1305_core_*() functions in separate
translation units. So it's not quite as streamlined I've done with the
other hash functions, but we still get a single libpoly1305 module.
Note: to see the diff from the arm, arm64, and x86 .c files to the new
.h files, view this commit with 'git show -M10'.
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250829152513.92459-3-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Now that the minimum supported version of LLVM for building the kernel
has been bumped to 15.0.0, two KMSAN checks can be cleaned up.
CONFIG_HAVE_KMSAN_COMPILER will always be true when using clang so
remove the cc-option test and use a simple check for CONFIG_CC_IS_CLANG.
CONFIG_HAVE_KMSAN_PARAM_RETVAL will always be true so it can be removed
outright.
Acked-by: Marco Elver <elver@google.com>
Reviewed-by: Kees Cook <kees@kernel.org>
Link: https://lore.kernel.org/r/20250821-bump-min-llvm-ver-15-v2-12-635f3294e5f0@kernel.org
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Now that the minimum supported version of LLVM for building the kernel
has been bumped to 15.0.0, __no_kcsan will always ensure that the thread
sanitizer functions are not generated, so remove the check for tsan
functions in is_profiling_func() and the always true depends and
unnecessary select lines in KCSAN_WEAK_MEMORY.
Acked-by: Marco Elver <elver@google.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infraded.org>
Reviewed-by: Kees Cook <kees@kernel.org>
Link: https://lore.kernel.org/r/20250821-bump-min-llvm-ver-15-v2-11-635f3294e5f0@kernel.org
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Now that the minimum supported version of LLVM for building the kernel
has been bumped to 15.0.0, the CLANG_VERSION check for older than 14.0.0
is always false, so remove it.
Reviewed-by: Kees Cook <kees@kernel.org>
Link: https://lore.kernel.org/r/20250821-bump-min-llvm-ver-15-v2-10-635f3294e5f0@kernel.org
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Drop 'inline' from all the *_mod_init_arch() functions so that the
compiler will warn about any bugs where they are unused due to not being
wired up properly. (There are no such bugs currently, so this just
establishes a more robust convention for the future. Of course, these
functions also tend to get inlined anyway, regardless of the keyword.)
Link: https://lore.kernel.org/r/20250816020457.432040-1-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Add a KUnit test suite for the MD5 library functions, including the
corresponding HMAC support. The core test logic is in the
previously-added hash-test-template.h. This commit just adds the actual
KUnit suite, and it adds the generated test vectors to the tree so that
gen-hash-testvecs.py won't have to be run at build time.
Link: https://lore.kernel.org/r/20250805222855.10362-8-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Introduce example_params_test_with_init_dynamic_arr(). This new
KUnit test demonstrates directly assigning a dynamic parameter
array, using the kunit_register_params_array() macro, to a
parameterized test context.
It highlights the use of param_init() and param_exit() for
initialization and exit of a parameterized test, and their
registration to the test case with KUNIT_CASE_PARAM_WITH_INIT().
Link: https://lore.kernel.org/r/20250826091341.1427123-7-davidgow@google.com
Reviewed-by: Rae Moar <rmoar@google.com>
Reviewed-by: David Gow <davidgow@google.com>
Signed-off-by: Marie Zhussupova <marievic@google.com>
Signed-off-by: David Gow <davidgow@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Add example_params_test_with_init() to illustrate how to manage
shared resources across a parameterized KUnit test. This example
showcases the use of the new param_init() function and its registration
to a test using the KUNIT_CASE_PARAM_WITH_INIT() macro.
Additionally, the test demonstrates how to directly pass a parameter array
to the parameterized test context via kunit_register_params_array()
and leveraging the Resource API for shared resource management.
Link: https://lore.kernel.org/r/20250826091341.1427123-6-davidgow@google.com
Reviewed-by: Rae Moar <rmoar@google.com>
Reviewed-by: David Gow <davidgow@google.com>
Signed-off-by: Marie Zhussupova <marievic@google.com>
Signed-off-by: David Gow <davidgow@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
KUnit parameterized tests currently support two primary methods f
or getting parameters:
1. Defining custom logic within a generate_params() function.
2. Using the KUNIT_ARRAY_PARAM() and KUNIT_ARRAY_PARAM_DESC()
macros with a pre-defined static array and passing
the created *_gen_params() to KUNIT_CASE_PARAM().
These methods present limitations when dealing with dynamically
generated parameter arrays, or in scenarios where populating parameters
sequentially via generate_params() is inefficient or overly complex.
This patch addresses these limitations by adding a new `params_array`
field to `struct kunit`, of the type `kunit_params`. The
`struct kunit_params` is designed to store the parameter array itself,
along with essential metadata including the parameter count, parameter
size, and a get_description() function for providing custom descriptions
for individual parameters.
The `params_array` field can be populated by calling the new
kunit_register_params_array() macro from within a param_init() function.
This will register the array as part of the parameterized test context.
The user will then need to pass kunit_array_gen_params() to the
KUNIT_CASE_PARAM_WITH_INIT() macro as the generator function, if not
providing their own. kunit_array_gen_params() is a KUnit helper that will
use the registered array to generate parameters.
The arrays passed to KUNIT_ARRAY_PARAM(,DESC) will also be registered to
the parameterized test context for consistency as well as for higher
availability of the parameter count that will be used for outputting a KTAP
test plan for a parameterized test.
This modification provides greater flexibility to the KUnit framework,
allowing testers to easily register and utilize both dynamic and static
parameter arrays.
Link: https://lore.kernel.org/r/20250826091341.1427123-5-davidgow@google.com
Reviewed-by: David Gow <davidgow@google.com>
Reviewed-by: Rae Moar <rmoar@google.com>
Signed-off-by: Marie Zhussupova <marievic@google.com>
[Only output the test plan if using kunit_array_gen_params --David]
Signed-off-by: David Gow <davidgow@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
To enable more complex parameterized testing scenarios, the
generate_params() function needs additional context beyond just
the previously generated parameter. This patch modifies the
generate_params() function signature to include an extra
`struct kunit *test` argument, giving test users access to the
parameterized test context when generating parameters.
The `struct kunit *test` argument was added as the first parameter
to the function signature as it aligns with the convention of other
KUnit functions that accept `struct kunit *test` first. This also
mirrors the "this" or "self" reference found in object-oriented
programming languages.
This patch also modifies xe_pci_live_device_gen_param() in xe_pci.c
and nthreads_gen_params() in kcsan_test.c to reflect this signature
change.
Link: https://lore.kernel.org/r/20250826091341.1427123-4-davidgow@google.com
Reviewed-by: David Gow <davidgow@google.com>
Reviewed-by: Rae Moar <rmoar@google.com>
Acked-by: Marco Elver <elver@google.com>
Acked-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Marie Zhussupova <marievic@google.com>
[Catch some additional gen_params signatures in drm/xe/tests --David]
Signed-off-by: David Gow <davidgow@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Add (*param_init) and (*param_exit) function pointers to
`struct kunit_case`. Users will be able to set them via the new
KUNIT_CASE_PARAM_WITH_INIT() macro.
param_init/exit will be invoked by kunit_run_tests() once before and once
after the parameterized test, respectively. They will receive the
`struct kunit` that holds the parameterized test context; facilitating
init and exit for shared state.
This patch also sets param_init/exit to None in rust/kernel/kunit.rs.
Link: https://lore.kernel.org/r/20250826091341.1427123-3-davidgow@google.com
Reviewed-by: Rae Moar <rmoar@google.com>
Reviewed-by: David Gow <davidgow@google.com>
Signed-off-by: Marie Zhussupova <marievic@google.com>
Signed-off-by: David Gow <davidgow@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Currently, KUnit parameterized tests lack a mechanism to share
resources across parameter runs because the same `struct kunit`
instance is cleaned up and reused for each run.
This patch introduces parameterized test context, enabling test
users to share resources between parameter runs. It also allows
setting up resources that need to be available for all parameter
runs only once, which is helpful in cases where setup is expensive.
To establish a parameterized test context, this patch adds a
parent pointer field to `struct kunit`. This allows resources added
to the parent `struct kunit` to be shared and accessible across all
parameter runs.
In kunit_run_tests(), the default `struct kunit` created is now
designated to act as the parameterized test context whenever a test
is parameterized.
Subsequently, a new `struct kunit` is made for each parameter run, and
its parent pointer is set to the `struct kunit` that holds the
parameterized test context.
Link: https://lore.kernel.org/r/20250826091341.1427123-2-davidgow@google.com
Reviewed-by: David Gow <davidgow@google.com>
Reviewed-by: Rae Moar <rmoar@google.com>
Signed-off-by: Marie Zhussupova <marievic@google.com>
Signed-off-by: David Gow <davidgow@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Instead of exposing the sparc-optimized MD5 code via sparc-specific
crypto_shash algorithms, instead just implement the md5_blocks() library
function. This is much simpler, it makes the MD5 library functions be
sparc-optimized, and it fixes the longstanding issue where the
sparc-optimized MD5 code was disabled by default. MD5 still remains
available through crypto_shash, but individual architectures no longer
need to handle it.
Note: to see the diff from arch/sparc/crypto/md5_glue.c to
lib/crypto/sparc/md5.h, view this commit with 'git show -M10'.
Link: https://lore.kernel.org/r/20250805222855.10362-6-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Instead of exposing the powerpc-optimized MD5 code via powerpc-specific
crypto_shash algorithms, instead just implement the md5_blocks() library
function. This is much simpler, it makes the MD5 library functions be
powerpc-optimized, and it fixes the longstanding issue where the
powerpc-optimized MD5 code was disabled by default. MD5 still remains
available through crypto_shash, but individual architectures no longer
need to handle it.
Link: https://lore.kernel.org/r/20250805222855.10362-5-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Instead of exposing the mips-optimized MD5 code via mips-specific
crypto_shash algorithms, instead just implement the md5_blocks() library
function. This is much simpler, it makes the MD5 library functions be
mips-optimized, and it fixes the longstanding issue where the
mips-optimized MD5 code was disabled by default. MD5 still remains
available through crypto_shash, but individual architectures no longer
need to handle it.
Note: to see the diff from arch/mips/cavium-octeon/crypto/octeon-md5.c
to lib/crypto/mips/md5.h, view this commit with 'git show -M10'.
Link: https://lore.kernel.org/r/20250805222855.10362-3-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Add library functions for MD5, including HMAC support. The MD5
implementation is derived from crypto/md5.c. This closely mirrors the
corresponding SHA-1 and SHA-2 changes.
Like SHA-1 and SHA-2, support for architecture-optimized MD5
implementations is included. I originally proposed dropping those, but
unfortunately there is an AF_ALG user of the PowerPC MD5 code
(https://lore.kernel.org/r/c4191597-341d-4fd7-bc3d-13daf7666c41@csgroup.eu/),
and dropping that code would be viewed as a performance regression. We
don't add new software algorithm implementations purely for AF_ALG, as
escalating to kernel mode merely to do calculations that could be done
in userspace is inefficient and is completely the wrong design. But
since this one already existed, it gets grandfathered in for now. An
objection was also raised to dropping the SPARC64 MD5 code because it
utilizes the CPU's direct support for MD5, although it remains unclear
that anyone is using that. Regardless, we'll keep these around for now.
Note that while MD5 is a legacy algorithm that is vulnerable to
practical collision attacks, it still has various in-kernel users that
implement legacy protocols. Switching to a simple library API, which is
the way the code should have been organized originally, will greatly
simplify their code. For example:
MD5:
drivers/md/dm-crypt.c (for lmk IV generation)
fs/nfsd/nfs4recover.c
fs/ecryptfs/
fs/smb/client/
net/{ipv4,ipv6}/ (for TCP-MD5 signatures)
HMAC-MD5:
fs/smb/client/
fs/smb/server/
(Also net/sctp/ if it continues using HMAC-MD5 for cookie generation.
However, that use case has the flexibility to upgrade to a more modern
algorithm, which I'll be proposing instead.)
As usual, the "md5" and "hmac(md5)" crypto_shash algorithms will also be
reimplemented on top of these library functions. For "hmac(md5)" this
will provide a faster, more streamlined implementation.
Link: https://lore.kernel.org/r/20250805222855.10362-2-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Since sha512_kunit tests the fallback code paths without using
crypto_simd_disabled_for_test, make the SHA-512 code just use the
underlying may_use_simd() and irq_fpu_usable() functions directly
instead of crypto_simd_usable(). This eliminates an unnecessary layer.
Link: https://lore.kernel.org/r/20250731223651.136939-1-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Since sha256_kunit tests the fallback code paths without using
crypto_simd_disabled_for_test, make the SHA-256 code just use the
underlying may_use_simd() and irq_fpu_usable() functions directly
instead of crypto_simd_usable(). This eliminates an unnecessary layer.
While doing this, also add likely() annotations, and fix a minor
inconsistency where the static keys in the sha256.h files were in a
different place than in the corresponding sha1.h and sha512.h files.
Link: https://lore.kernel.org/r/20250731223510.136650-1-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
__ubsan_handle_divrem_overflow() incorrectly uses the RHS to report.
It always reports the same log: division of -1 by -1. But it should
report division of LHS by -1.
Signed-off-by: Junhui Pei <paradoxskin233@gmail.com>
Fixes: c6d308534a ("UBSAN: run-time undefined behavior sanity checker")
Link: https://lore.kernel.org/r/20250602153841.62935-1-paradoxskin233@gmail.com
Signed-off-by: Kees Cook <kees@kernel.org>
Fix a regression where 'make clean' stopped removing some of the
generated assembly files on arm and arm64.
-----BEGIN PGP SIGNATURE-----
iIoEABYIADIWIQSacvsUNc7UX4ntmEPzXCl4vpKOKwUCaKaR/xQcZWJpZ2dlcnNA
a2VybmVsLm9yZwAKCRDzXCl4vpKOK486AP4z3+DbFNj3WHLA/O3uFxgR4eiUW9tl
gVtzslK5j2rxLwEAuOrY0qAhzX2dJ0/KN6y7T1qtOEZoGIZ2j85HDvMh3wU=
=d+ja
-----END PGP SIGNATURE-----
Merge tag 'libcrypto-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux
Pull crypto library fixes from Eric Biggers:
"Fix a regression where 'make clean' stopped removing some of the
generated assembly files on arm and arm64"
* tag 'libcrypto-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux:
lib/crypto: ensure generated *.S files are removed on make clean
lib/crypto: sha: Update Kconfig help for SHA1 and SHA256
The NEED_* macros do an implicit goto in case the safety bounds checks
fail. Add the unlikely hints as this is the error case and not a hot
path. The final assembly is slightly shorter and some jumps got
reordered according to the hints.
text data bss dec hex filename
2294 16 0 2310 906 pre/lzo1x_decompress_safe.o
2277 16 0 2293 8f5 post/lzo1x_decompress_safe.o
text data bss dec hex filename
3444 48 0 3492 da4 pre/lzo1x_compress_safe.o
3372 48 0 3420 d5c post/lzo1x_compress_safe.o
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Drop 'inline' from all the *_mod_init_arch() functions so that the
compiler will warn about any bugs where they are unused due to not being
wired up properly. (There are no such bugs currently, so this just
establishes a more robust convention for the future. Of course, these
functions also tend to get inlined anyway, regardless of the keyword.)
Link: https://lore.kernel.org/r/20250816020240.431545-1-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
For kbuild to properly clean up these build artifacts in the subdirectory,
even after CONFIG_KUNIT changed do disabled, the directory needs to be
processed always.
Pushing the special logic for hook.o into the kunit Makefile also makes the
logic easier to understand.
Link: https://lore.kernel.org/r/20250813-kunit-always-descend-v1-1-7bbd387ff13b@linutronix.de
Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Reviewed-by: David Gow <davidgow@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
make clean does not check the kernel config when removing files. As
such, additions to clean-files under CONFIG_ARM or CONFIG_ARM64 are not
evaluated. For example, when building on arm64, this means that
lib/crypto/arm64/sha{256,512}-core.S are left over after make clean.
Set clean-files unconditionally to ensure that make clean removes these
files.
Fixes: e96cb9507f ("lib/crypto: sha256: Consolidate into single module")
Fixes: 24c91b62ac ("lib/crypto: arm/sha512: Migrate optimized SHA-512 code to library")
Fixes: 60e3f1e9b7 ("lib/crypto: arm64/sha512: Migrate optimized SHA-512 code to library")
Signed-off-by: Tal Zussman <tz2294@columbia.edu>
Link: https://lore.kernel.org/r/20250814-crypto_clean-v2-1-659a2dc86302@columbia.edu
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Current release - regressions:
- netfilter: nft_set_pipapo:
- don't return bogus extension pointer
- fix null deref for empty set
Current release - new code bugs:
- core: prevent deadlocks when enabling NAPIs with mixed kthread config
- eth: netdevsim: Fix wild pointer access in nsim_queue_free().
Previous releases - regressions:
- page_pool: allow enabling recycling late, fix false positive warning
- sched: ets: use old 'nbands' while purging unused classes
- xfrm:
- restore GSO for SW crypto
- bring back device check in validate_xmit_xfrm
- tls: handle data disappearing from under the TLS ULP
- ptp: prevent possible ABBA deadlock in ptp_clock_freerun()
- eth: bnxt: fill data page pool with frags if PAGE_SIZE > BNXT_RX_PAGE_SIZE
- eth: hv_netvsc: fix panic during namespace deletion with VF
Previous releases - always broken:
- netfilter: fix refcount leak on table dump
- vsock: do not allow binding to VMADDR_PORT_ANY
- sctp: linearize cloned gso packets in sctp_rcv
- eth: hibmcge: fix the division by zero issue
- eth: microchip: fix KSZ8863 reset problem
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
-----BEGIN PGP SIGNATURE-----
iQJGBAABCAAwFiEEg1AjqC77wbdLX2LbKSR5jcyPE6QFAmidxhgSHHBhYmVuaUBy
ZWRoYXQuY29tAAoJECkkeY3MjxOkPckP/AhJ0kARPgo72OhElbW6KvkHhXVNUTJb
6m15j9Z8ybQPBupSorxUEBx7pxczv/GBloN1xoJqUY/p7B6kLO2HDOpaReNFjZvF
J9hPl6ZF6CWzwgfcUfwI3UB1zhALKyHDClclfd8FBoKjAYXCrZXuPv/AV4oqYsA1
y7g24zpI76Cu+M+Nf5YhrlIVSmQ1/DXX8gifdcHFYnAKmCn7KxNv2lwvm2/yE2lL
9/Xl/D1cG/BiAaCUUXR4BP8RU5gdW72+lM3qmC/QFO/7jcBaoE89Y9Anona8p0PQ
dqDerHd0GDUH9QA6bht3asCS+IW+Zo2gf25o53OzlYIMAxDmEZLUBCwetJhvNJBq
DUQ6agzfNRxsCnlOc4JhMOqNr7rdU7d+9c9KuZWA/m8KRWdlvTYGJd/qzSlTWOhq
s9228dl+4oTb9Mnq8Bqafi42+TImeOyFRW9ZgF8ptjlF0l/lyv6moIrRCmVXppRZ
awABNDdG+i004XmAOAeOhjbUT7clLkLr+KEnsfH16qCa2o3dM6rlhvWYp2sucVJf
SyRvMdz5VqMLgruefpQS/DuK52UklpRawgvgngzU6UDYQUaxQKToeusMjRU7xUnW
hVI1y7/oNH6+r7Zr/iLTLKRR007B+RVC7VSbeMpxmAW+n6puMb+z7RnrJlnFapGM
qqqtk2/jItuK
=Ydk1
-----END PGP SIGNATURE-----
Merge tag 'net-6.17-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Paolo Abeni:
"Including fixes from Netfilter and IPsec.
Current release - regressions:
- netfilter: nft_set_pipapo:
- don't return bogus extension pointer
- fix null deref for empty set
Current release - new code bugs:
- core: prevent deadlocks when enabling NAPIs with mixed kthread
config
- eth: netdevsim: Fix wild pointer access in nsim_queue_free().
Previous releases - regressions:
- page_pool: allow enabling recycling late, fix false positive
warning
- sched: ets: use old 'nbands' while purging unused classes
- xfrm:
- restore GSO for SW crypto
- bring back device check in validate_xmit_xfrm
- tls: handle data disappearing from under the TLS ULP
- ptp: prevent possible ABBA deadlock in ptp_clock_freerun()
- eth:
- bnxt: fill data page pool with frags if PAGE_SIZE > BNXT_RX_PAGE_SIZE
- hv_netvsc: fix panic during namespace deletion with VF
Previous releases - always broken:
- netfilter: fix refcount leak on table dump
- vsock: do not allow binding to VMADDR_PORT_ANY
- sctp: linearize cloned gso packets in sctp_rcv
- eth:
- hibmcge: fix the division by zero issue
- microchip: fix KSZ8863 reset problem"
* tag 'net-6.17-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (54 commits)
net: usb: asix_devices: add phy_mask for ax88772 mdio bus
net: kcm: Fix race condition in kcm_unattach()
selftests: net/forwarding: test purge of active DWRR classes
net/sched: ets: use old 'nbands' while purging unused classes
bnxt: fill data page pool with frags if PAGE_SIZE > BNXT_RX_PAGE_SIZE
netdevsim: Fix wild pointer access in nsim_queue_free().
net: mctp: Fix bad kfree_skb in bind lookup test
netfilter: nf_tables: reject duplicate device on updates
ipvs: Fix estimator kthreads preferred affinity
netfilter: nft_set_pipapo: fix null deref for empty set
selftests: tls: test TCP stealing data from under the TLS socket
tls: handle data disappearing from under the TLS ULP
ptp: prevent possible ABBA deadlock in ptp_clock_freerun()
ixgbe: prevent from unwanted interface name changes
devlink: let driver opt out of automatic phys_port_name generation
net: prevent deadlocks when enabling NAPIs with mixed kthread config
net: update NAPI threaded config even for disabled NAPIs
selftests: drv-net: don't assume device has only 2 queues
docs: Fix name for net.ipv4.udp_child_hash_entries
riscv: dts: thead: Add APB clocks for TH1520 GMACs
...
As Kees points out, this is a kernel address leak, and debugging is
not a sufficiently good reason to expose the real kernel address.
Fixes: 65b584f536 ("ref_tracker: automatically register a file in debugfs for a ref_tracker_dir")
Reported-by: Kees Cook <kees@kernel.org>
Closes: https://lore.kernel.org/netdev/202507301603.62E553F93@keescook/
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Since crc_kunit now tests the fallback code paths without using
crypto_simd_disabled_for_test, make the CRC code just use the underlying
may_use_simd() and irq_fpu_usable() functions directly instead of
crypto_simd_usable(). This eliminates an unnecessary layer.
Take the opportunity to add likely() and unlikely() annotations as well.
Link: https://lore.kernel.org/r/20250811182631.376302-4-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Test that if CRCs are computed in task, softirq, and hardirq context
concurrently, then all results are as expected. Implement this using
kunit_run_irq_test() which is also used by the lib/crypto/ tests.
As with the corresponding lib/crypto/ tests, the purpose of this is to
test fallback code paths and to exercise edge cases in the
architecture's support for in-kernel FPU/SIMD/vector.
Remove the code from crc_test() that sometimes disabled interrupts, as
that was just an incomplete attempt to achieve similar test coverage.
Link: https://lore.kernel.org/r/20250811182631.376302-3-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
-----BEGIN PGP SIGNATURE-----
iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAmiWLjoQHGF4Ym9lQGtl
cm5lbC5kawAKCRD301j7KXHgpvveD/9vbvp3XaF0LagRJLH0fcdhcxL7Z+IHD+7U
v5vICMeoeBhhhOtPJ0y+h/9LMLQWFYDFl6drkY0atSSxp/CK6CB25qFhIDsoA6Qk
RBM/qZ64z4Uxvlc+VQmCqI2EMc/ZrYtrcr7jsornwORoTSEKXVHdyO5k7Q9002Sw
XNWc0bZKIibFlgOk12Wnd8ZS5RWHw1uViUcreojcGVZAVR+BuHNGGoa3xq0bLiHU
ERbQXfjaN28R+eo4E1euCtdf++7tW2kFjClrDmLcszdb27E2+MWMA6AKMiSTBE2k
2e2TvJUcGZs1s8atqSIIjBtmwQW3rKws33zODLMONzOP8CIErcaniHxyDSaxJIJr
kjsdKnwlziL3xVnwQcpgnVOPvvDSKZ4OKEqx8rAuYTqiknpz3uhbt/7EqumuPLHr
e7Rz0MnFolrVN7KZOHQ5CPJIezkEAOAEpItLdfc5cfLS06pbeTN3j+dJZp+tUohi
WP/K3l2N3C5pkXA0ilAzshRF20Rwv/09M85BoqWocTLBJY7WqyIKXywCNdX81wkv
tpbQvp2MpPkJXUIbAh5484BOfCfx9vkYVm2cam2UxXJhR6VfrQCjYfXIjfpqF4jp
q7xxNesUezrOqB2Q/cKxw8dKOaRtO1XzVnmwutBrcKgqqLezMwUTDDjQYe8l6p1Z
40E74tsJwQ==
=EQ7g
-----END PGP SIGNATURE-----
Merge tag 'block-6.17-20250808' of git://git.kernel.dk/linux
Pull more block updates from Jens Axboe:
- MD pull request via Yu:
- mddev null-ptr-dereference fix, by Erkun
- md-cluster fail to remove the faulty disk regression fix, by
Heming
- minor cleanup, by Li Nan and Jinchao
- mdadm lifetime regression fix reported by syzkaller, by Yu Kuai
- MD pull request via Christoph
- add support for getting the FDP featuee in fabrics passthru path
(Nitesh Shetty)
- add capability to connect to an administrative controller
(Kamaljit Singh)
- fix a leak on sgl setup error (Keith Busch)
- initialize discovery subsys after debugfs is initialized
(Mohamed Khalfella)
- fix various comment typos (Bjorn Helgaas)
- remove unneeded semicolons (Jiapeng Chong)
- nvmet debugfs ordering issue fix
- Fix UAF in the tag_set in zloop
- Ensure sbitmap shallow depth covers entire set
- Reduce lock roundtrips in io context lookup
- Move scheduler tags alloc/free out of elevator and freeze lock, to
fix some lockdep found issues
- Improve robustness of queue limits checking
- Fix a regression with IO priorities, if no io context exists
* tag 'block-6.17-20250808' of git://git.kernel.dk/linux: (26 commits)
lib/sbitmap: make sbitmap_get_shallow() internal
lib/sbitmap: convert shallow_depth from one word to the whole sbitmap
nvmet: exit debugfs after discovery subsystem exits
block, bfq: Reorder struct bfq_iocq_bfqq_data
md: make rdev_addable usable for rcu mode
md/raid1: remove struct pool_info and related code
md/raid1: change r1conf->r1bio_pool to a pointer type
block: ensure discard_granularity is zero when discard is not supported
zloop: fix KASAN use-after-free of tag set
block: Fix default IO priority if there is no IO context
nvme: fix various comment typos
nvme-auth: remove unneeded semicolon
nvme-pci: fix leak on sgl setup error
nvmet: initialize discovery subsys after debugfs is initialized
nvme: add capability to connect to an administrative controller
nvmet: add support for FDP in fabrics passthru path
md: rename recovery_cp to resync_offset
md/md-cluster: handle REMOVE message earlier
md: fix create on open mddev lifetime regression
block: fix potential deadlock while running nr_hw_queue update
...
Because it's only used in sbitmap.c
Signed-off-by: Yu Kuai <yukuai3@huawei.com>
Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Link: https://lore.kernel.org/r/20250807032413.1469456-3-yukuai1@huaweicloud.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Currently elevators will record internal 'async_depth' to throttle
asynchronous requests, and they both calculate shallow_dpeth based on
sb->shift, with the respect that sb->shift is the available tags in one
word.
However, sb->shift is not the availbale tags in the last word, see
__map_depth:
if (index == sb->map_nr - 1)
return sb->depth - (index << sb->shift);
For consequence, if the last word is used, more tags can be get than
expected, for example, assume nr_requests=256 and there are four words,
in the worst case if user set nr_requests=32, then the first word is
the last word, and still use bits per word, which is 64, to calculate
async_depth is wrong.
One the ohter hand, due to cgroup qos, bfq can allow only one request
to be allocated, and set shallow_dpeth=1 will still allow the number
of words request to be allocated.
Fix this problems by using shallow_depth to the whole sbitmap instead
of per word, also change kyber, mq-deadline and bfq to follow this,
a new helper __map_depth_with_shallow() is introduced to calculate
available bits in each word.
Signed-off-by: Yu Kuai <yukuai3@huawei.com>
Link: https://lore.kernel.org/r/20250807032413.1469456-2-yukuai1@huaweicloud.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Resolve conflicts with this commit that was developed in parallel
during the merge window:
8c8efa93db ("x86/bug: Add ARCH_WARN_ASM macro for BUG/WARN asm code sharing with Rust")
Conflicts:
arch/riscv/include/asm/bug.h
arch/x86/include/asm/bug.h
Signed-off-by: Ingo Molnar <mingo@kernel.org>
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEESH4wyp42V4tXvYsjUqAMR0iAlPIFAmiQpykACgkQUqAMR0iA
lPJcrg/9Hez6+zO7LECCn5VkuK5oJWR5CyCfwx14ki8UF38djQGU2frckI5837rE
MnVoEBexZunK5SXy4MAy7bTCitzR+lMqNtP5uq9J2ovlSPtNlfuJRDr7uGQLDtSS
M5KZ1qsZnhgwLYeNhfVVToHgp+OwIQb2GcgYmYc8k03fUI1NQpdxIM46DzoTj+06
x6qgrNsmmJbm8E73VWBByJAEFoq9ugjny8Rt+tYMi/CmhgZpp0ZyF1r5dYfYX/KS
VS8UQY//aZOFhNsQUAXwP7Ym00CYRgTg7Na+MHivYLXmYGH2gF6tWQhX/eEgHKcJ
RTmUbLFx70fdBbjJMxv2k8vyMk2sy6sTfJHPqM/NS/Fb0tSPBXQJG/EexzfoqiBc
wcjgOPkeALIosVdFdTqXxjoIGOP8rqsU4t6Y6WFjJlWK04SBVjxBUofytRdQSxkG
5Sb0rFVGKrKIkXaVkt4byPa1/BDpfNhfKMYPtQ56pv2VNUgzfye4prUAZHE5pLnK
8nixeeMtKDFFCBpn6rG5wZW7k2mK5FrWGZUfdfxdK1gWQ1y0kqGy5wa3lNZLcxlH
l3AtOYoDeWM2DjDVO6WCj8ambEWkbjbGg7tC9TI3F0NvRJSYytTb6npMqb3Gwhcb
U4NgT+Ho0GJ/5BLUye8HMfhvrGoCfRCeptHtEFXAK7pzKyjc0+c=
=Mocd
-----END PGP SIGNATURE-----
Merge tag 'printk-for-6.17' of git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux
Pull printk updates from Petr Mladek:
- Add new "hash_pointers=[auto|always|never]" boot parameter to force
the hashing even with "slab_debug" enabled
- Allow to stop CPU, after losing nbcon console ownership during
panic(), even without proper NMI
- Allow to use the printk kthread immediately even for the 1st
registered nbcon
- Compiler warning removal
* tag 'printk-for-6.17' of git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux:
printk: nbcon: Allow reacquire during panic
printk: Allow to use the printk kthread immediately even for 1st nbcon
slab: Decouple slab_debug and no_hash_pointers
vsprintf: Use __diag macros to disable '-Wsuggest-attribute=format'
compiler-gcc.h: Introduce __diag_GCC_all
- The 2 patch series "squashfs: Remove page->mapping references" from
Matthew Wilcox gets us closer to being able to remove page->mapping.
- The 5 patch series "relayfs: misc changes" from Jason Xing does some
maintenance and minor feature addition work in relayfs.
- The 5 patch series "kdump: crashkernel reservation from CMA" from Jiri
Bohac switches us from static preallocation of the kdump crashkernel's
working memory over to dynamic allocation. So the difficulty of
a-priori estimation of the second kernel's needs is removed and the
first kernel obtains extra memory.
- The 5 patch series "generalize panic_print's dump function to be used
by other kernel parts" from Feng Tang implements some consolidation and
rationalizatio of the various ways in which a faiing kernel splats
information at the operator.
-----BEGIN PGP SIGNATURE-----
iHUEABYKAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCaI+82gAKCRDdBJ7gKXxA
jj4JAP9xb+w9DrBY6sa+7KTPIb+aTqQ7Zw3o9O2m+riKQJv6jAEA6aEwRnDA0451
fDT5IqVlCWGvnVikdZHSnvhdD7TGsQ0=
=rT71
-----END PGP SIGNATURE-----
Merge tag 'mm-nonmm-stable-2025-08-03-12-47' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull non-MM updates from Andrew Morton:
"Significant patch series in this pull request:
- "squashfs: Remove page->mapping references" (Matthew Wilcox) gets
us closer to being able to remove page->mapping
- "relayfs: misc changes" (Jason Xing) does some maintenance and
minor feature addition work in relayfs
- "kdump: crashkernel reservation from CMA" (Jiri Bohac) switches
us from static preallocation of the kdump crashkernel's working
memory over to dynamic allocation. So the difficulty of a-priori
estimation of the second kernel's needs is removed and the first
kernel obtains extra memory
- "generalize panic_print's dump function to be used by other
kernel parts" (Feng Tang) implements some consolidation and
rationalization of the various ways in which a failing kernel
splats information at the operator
* tag 'mm-nonmm-stable-2025-08-03-12-47' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (80 commits)
tools/getdelays: add backward compatibility for taskstats version
kho: add test for kexec handover
delaytop: enhance error logging and add PSI feature description
samples: Kconfig: fix spelling mistake "instancess" -> "instances"
fat: fix too many log in fat_chain_add()
scripts/spelling.txt: add notifer||notifier to spelling.txt
xen/xenbus: fix typo "notifer"
net: mvneta: fix typo "notifer"
drm/xe: fix typo "notifer"
cxl: mce: fix typo "notifer"
KVM: x86: fix typo "notifer"
MAINTAINERS: add maintainers for delaytop
ucount: use atomic_long_try_cmpxchg() in atomic_long_inc_below()
ucount: fix atomic_long_inc_below() argument type
kexec: enable CMA based contiguous allocation
stackdepot: make max number of pools boot-time configurable
lib/xxhash: remove unused functions
init/Kconfig: restore CONFIG_BROKEN help text
lib/raid6: update recov_rvv.c zero page usage
docs: update docs after introducing delaytop
...
to use the module data structures in combination with the already no-op stub
module functions, even when support for modules is disabled in the kernel
configuration. This change follows the kernel's coding style for conditional
compilation and allows kunit code to drop all CONFIG_MODULES ifdefs, which is
also part of the changes. This should allow others part of the kernel to do the
same cleanup.
Note that this had a conflict with sysctl changes [1] but should be fixed now as I
rebased on top.
The remaining changes include a fix for module name length handling which could
potentially lead to the removal of an incorrect module, and various cleanups.
The module name fix and related cleanup has been in linux-next since Thursday
(July 31) while the rest of the changes for a bit more than 3 weeks.
Note that this currently has conflicts in next with kbuild's tree [2].
Link: https://lore.kernel.org/all/20250714175916.774e6d79@canb.auug.org.au/ [1]
Link: https://lore.kernel.org/all/20250801132941.6815d93d@canb.auug.org.au/ [2]
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEE73Ua4R8Pc+G5xjxTQJ6jxB8ZUfsFAmiPQgkACgkQQJ6jxB8Z
UfuPTA//XrRguJFBhQh6cUWqVleTNQJuhjiPsOSO5S52aVET4wsrnRNeM2eM5oqw
0+6ELvhIJINQ1LjpOP8D67d8P5Ds1/qM1pbQIkQsoKiEj6E7Q4dXH5N0uyf/BzO3
HaosLG9cpqcomlSorYEiYoPjqy9EChQzsi+YAYWAB+fW6bvU/AdUHTRH88m3ppBJ
Y22BTTPOKKyj5/QgfY+kwH8TTnrzCzY8aoOqW7uimLI5h4c9dFQ2PigRJnoMfDG1
11w5VshOTzZJvNFrUk5GVSirwlxdJDbW6dKfG0DD5+eNWK5dfIEc+/EcuhaGoPvO
Euwv8VQubdxHTAG6kzHI0MtxAVQUM1gyz8zHiu18eW++GTtnTFs6m8E6H9AC176G
nDkUh3qSxJN2HHgxtS9VUExEEZpYqtWeB9Zts8K3oSWvTaQenHWpVHPADkxzS4JU
Jvkjq8SiKo+RqHxaOKfyf1RfOtYe5tjMCLrP7zX39d1+cwGxuc6mip/omY9HFDgn
op132fYdt24JSHoioJDzRz9mTfvj3nICEmgX4D4WDQx5lP27CUcLugPnBNHPp0fu
5hL+ajy8M8nq4zm/42Y+F7VS74TIA6mSnJKs9dMCknUWueD6HrDEU9xHi1YMpUMZ
cBUSpU+P94dCIScwEzkp926vDnHyxCHLbpF1Jsq5qNNdj7AelHk=
=4bGB
-----END PGP SIGNATURE-----
Merge tag 'modules-6.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/modules/linux
Pull module updates from Daniel Gomez:
"This is a small set of changes for modules, primarily to extend module
users to use the module data structures in combination with the
already no-op stub module functions, even when support for modules is
disabled in the kernel configuration. This change follows the kernel's
coding style for conditional compilation and allows kunit code to drop
all CONFIG_MODULES ifdefs, which is also part of the changes. This
should allow others part of the kernel to do the same cleanup.
The remaining changes include a fix for module name length handling
which could potentially lead to the removal of an incorrect module,
and various cleanups"
* tag 'modules-6.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/modules/linux:
module: Rename MAX_PARAM_PREFIX_LEN to __MODULE_NAME_LEN
tracing: Replace MAX_PARAM_PREFIX_LEN with MODULE_NAME_LEN
module: Restore the moduleparam prefix length check
module: Remove unnecessary +1 from last_unloaded_module::name size
module: Prevent silent truncation of module name in delete_module(2)
kunit: test: Drop CONFIG_MODULE ifdeffery
module: make structure definitions always visible
module: move 'struct module_use' to internal.h
Testing kexec handover requires a kernel driver that will generate some
data and preserve it with KHO on the first boot and then restore that data
and verify it was preserved properly after kexec.
To facilitate such test, along with the kernel driver responsible for data
generation, preservation and restoration add a script that runs a kernel
in a VM with a minimal /init. The /init enables KHO, loads a kernel image
for kexec and runs kexec reboot. After the boot of the kexeced kernel,
the driver verifies that the data was properly preserved.
[rppt@kernel.org: fix section mismatch]
Link: https://lkml.kernel.org/r/aIiRC8fXiOXKbPM_@kernel.org
Link: https://lkml.kernel.org/r/20250727083733.2590139-1-rppt@kernel.org
Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Cc: Alexander Graf <graf@amazon.com>
Cc: Changyuan Lyu <changyuanl@google.com>
Cc: Pasha Tatashin <pasha.tatashin@soleen.com>
Cc: Pratyush Yadav <pratyush@kernel.org>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
We're hitting the WARN in depot_init_pool() about reaching the stack depot
limit because we have long stacks that don't dedup very well.
Introduce a new start-up parameter to allow users to set the number of
maximum stack depot pools.
Link: https://lkml.kernel.org/r/20250718153928.94229-1-matt@readmodwrite.com
Signed-off-by: Matt Fleming <mfleming@cloudflare.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Marco Elver <elver@google.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Andrey Konovalov <andreyknvl@gmail.com>
Cc: Dmitriy Vyukov <dvyukov@google.com>
Cc: Oscar Salvador <osalvador@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
xxh32_digest() and xxh32_update() were added in 2017 in the original
xxhash commit, but have remained unused.
Remove them.
Link: https://lkml.kernel.org/r/20250716133245.243363-1-linux@treblig.org
Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Cc: Dave Gilbert <linux@treblig.org>
Cc: Nick Terrell <terrelln@fb.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
- The 4 patch series "mm: ksm: prevent KSM from breaking merging of new
VMAs" from Lorenzo Stoakes addresses an issue with KSM's
PR_SET_MEMORY_MERGE mode: newly mapped VMAs were not eligible for
merging with existing adjacent VMAs.
- The 4 patch series "mm/damon: introduce DAMON_STAT for simple and
practical access monitoring" from SeongJae Park adds a new kernel module
which simplifies the setup and usage of DAMON in production
environments.
- The 6 patch series "stop passing a writeback_control to swap/shmem
writeout" from Christoph Hellwig is a cleanup to the writeback code
which removes a couple of pointers from struct writeback_control.
- The 7 patch series "drivers/base/node.c: optimization and cleanups"
from Donet Tom contains largely uncorrelated cleanups to the NUMA node
setup and management code.
- The 4 patch series "mm: userfaultfd: assorted fixes and cleanups" from
Tal Zussman does some maintenance work on the userfaultfd code.
- The 5 patch series "Readahead tweaks for larger folios" from Ryan
Roberts implements some tuneups for pagecache readahead when it is
reading into order>0 folios.
- The 4 patch series "selftests/mm: Tweaks to the cow test" from Mark
Brown provides some cleanups and consistency improvements to the
selftests code.
- The 4 patch series "Optimize mremap() for large folios" from Dev Jain
does that. A 37% reduction in execution time was measured in a
memset+mremap+munmap microbenchmark.
- The 5 patch series "Remove zero_user()" from Matthew Wilcox expunges
zero_user() in favor of the more modern memzero_page().
- The 3 patch series "mm/huge_memory: vmf_insert_folio_*() and
vmf_insert_pfn_pud() fixes" from David Hildenbrand addresses some warts
which David noticed in the huge page code. These were not known to be
causing any issues at this time.
- The 3 patch series "mm/damon: use alloc_migrate_target() for
DAMOS_MIGRATE_{HOT,COLD" from SeongJae Park provides some cleanup and
consolidation work in DAMON.
- The 3 patch series "use vm_flags_t consistently" from Lorenzo Stoakes
uses vm_flags_t in places where we were inappropriately using other
types.
- The 3 patch series "mm/memfd: Reserve hugetlb folios before
allocation" from Vivek Kasireddy increases the reliability of large page
allocation in the memfd code.
- The 14 patch series "mm: Remove pXX_devmap page table bit and pfn_t
type" from Alistair Popple removes several now-unneeded PFN_* flags.
- The 5 patch series "mm/damon: decouple sysfs from core" from SeongJae
Park implememnts some cleanup and maintainability work in the DAMON
sysfs layer.
- The 5 patch series "madvise cleanup" from Lorenzo Stoakes does quite a
lot of cleanup/maintenance work in the madvise() code.
- The 4 patch series "madvise anon_name cleanups" from Vlastimil Babka
provides additional cleanups on top or Lorenzo's effort.
- The 11 patch series "Implement numa node notifier" from Oscar Salvador
creates a standalone notifier for NUMA node memory state changes.
Previously these were lumped under the more general memory on/offline
notifier.
- The 6 patch series "Make MIGRATE_ISOLATE a standalone bit" from Zi Yan
cleans up the pageblock isolation code and fixes a potential issue which
doesn't seem to cause any problems in practice.
- The 5 patch series "selftests/damon: add python and drgn based DAMON
sysfs functionality tests" from SeongJae Park adds additional drgn- and
python-based DAMON selftests which are more comprehensive than the
existing selftest suite.
- The 5 patch series "Misc rework on hugetlb faulting path" from Oscar
Salvador fixes a rather obscure deadlock in the hugetlb fault code and
follows that fix with a series of cleanups.
- The 3 patch series "cma: factor out allocation logic from
__cma_declare_contiguous_nid" from Mike Rapoport rationalizes and cleans
up the highmem-specific code in the CMA allocator.
- The 28 patch series "mm/migration: rework movable_ops page migration
(part 1)" from David Hildenbrand provides cleanups and
future-preparedness to the migration code.
- The 2 patch series "mm/damon: add trace events for auto-tuned
monitoring intervals and DAMOS quota" from SeongJae Park adds some
tracepoints to some DAMON auto-tuning code.
- The 6 patch series "mm/damon: fix misc bugs in DAMON modules" from
SeongJae Park does that.
- The 6 patch series "mm/damon: misc cleanups" from SeongJae Park also
does what it claims.
- The 4 patch series "mm: folio_pte_batch() improvements" from David
Hildenbrand cleans up the large folio PTE batching code.
- The 13 patch series "mm/damon/vaddr: Allow interleaving in
migrate_{hot,cold} actions" from SeongJae Park facilitates dynamic
alteration of DAMON's inter-node allocation policy.
- The 3 patch series "Remove unmap_and_put_page()" from Vishal Moola
provides a couple of page->folio conversions.
- The 4 patch series "mm: per-node proactive reclaim" from Davidlohr
Bueso implements a per-node control of proactive reclaim - beyond the
current memcg-based implementation.
- The 14 patch series "mm/damon: remove damon_callback" from SeongJae
Park replaces the damon_callback interface with a more general and
powerful damon_call()+damos_walk() interface.
- The 10 patch series "mm/mremap: permit mremap() move of multiple VMAs"
from Lorenzo Stoakes implements a number of mremap cleanups (of course)
in preparation for adding new mremap() functionality: newly permit the
remapping of multiple VMAs when the user is specifying MREMAP_FIXED. It
still excludes some specialized situations where this cannot be
performed reliably.
- The 3 patch series "drop hugetlb_free_pgd_range()" from Anthony Yznaga
switches some sparc hugetlb code over to the generic version and removes
the thus-unneeded hugetlb_free_pgd_range().
- The 4 patch series "mm/damon/sysfs: support periodic and automated
stats update" from SeongJae Park augments the present
userspace-requested update of DAMON sysfs monitoring files. Automatic
update is now provided, along with a tunable to control the update
interval.
- The 4 patch series "Some randome fixes and cleanups to swapfile" from
Kemeng Shi does what is claims.
- The 4 patch series "mm: introduce snapshot_page" from Luiz Capitulino
and David Hildenbrand provides (and uses) a means by which debug-style
functions can grab a copy of a pageframe and inspect it locklessly
without tripping over the races inherent in operating on the live
pageframe directly.
- The 6 patch series "use per-vma locks for /proc/pid/maps reads" from
Suren Baghdasaryan addresses the large contention issues which can be
triggered by reads from that procfs file. Latencies are reduced by more
than half in some situations. The series also introduces several new
selftests for the /proc/pid/maps interface.
- The 6 patch series "__folio_split() clean up" from Zi Yan cleans up
__folio_split()!
- The 7 patch series "Optimize mprotect() for large folios" from Dev
Jain provides some quite large (>3x) speedups to mprotect() when dealing
with large folios.
- The 2 patch series "selftests/mm: reuse FORCE_READ to replace "asm
volatile("" : "+r" (XXX));" and some cleanup" from wang lian does some
cleanup work in the selftests code.
- The 3 patch series "tools/testing: expand mremap testing" from Lorenzo
Stoakes extends the mremap() selftest in several ways, including adding
more checking of Lorenzo's recently added "permit mremap() move of
multiple VMAs" feature.
- The 22 patch series "selftests/damon/sysfs.py: test all parameters"
from SeongJae Park extends the DAMON sysfs interface selftest so that it
tests all possible user-requested parameters. Rather than the present
minimal subset.
-----BEGIN PGP SIGNATURE-----
iHUEABYIAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCaIqcCgAKCRDdBJ7gKXxA
jkVBAQCCn9DR1QP0CRk961ot0cKzOgioSc0aA03DPb2KXRt2kQEAzDAz0ARurFhL
8BzbvI0c+4tntHLXvIlrC33n9KWAOQM=
=XsFy
-----END PGP SIGNATURE-----
Merge tag 'mm-stable-2025-07-30-15-25' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull MM updates from Andrew Morton:
"As usual, many cleanups. The below blurbiage describes 42 patchsets.
21 of those are partially or fully cleanup work. "cleans up",
"cleanup", "maintainability", "rationalizes", etc.
I never knew the MM code was so dirty.
"mm: ksm: prevent KSM from breaking merging of new VMAs" (Lorenzo Stoakes)
addresses an issue with KSM's PR_SET_MEMORY_MERGE mode: newly
mapped VMAs were not eligible for merging with existing adjacent
VMAs.
"mm/damon: introduce DAMON_STAT for simple and practical access monitoring" (SeongJae Park)
adds a new kernel module which simplifies the setup and usage of
DAMON in production environments.
"stop passing a writeback_control to swap/shmem writeout" (Christoph Hellwig)
is a cleanup to the writeback code which removes a couple of
pointers from struct writeback_control.
"drivers/base/node.c: optimization and cleanups" (Donet Tom)
contains largely uncorrelated cleanups to the NUMA node setup and
management code.
"mm: userfaultfd: assorted fixes and cleanups" (Tal Zussman)
does some maintenance work on the userfaultfd code.
"Readahead tweaks for larger folios" (Ryan Roberts)
implements some tuneups for pagecache readahead when it is reading
into order>0 folios.
"selftests/mm: Tweaks to the cow test" (Mark Brown)
provides some cleanups and consistency improvements to the
selftests code.
"Optimize mremap() for large folios" (Dev Jain)
does that. A 37% reduction in execution time was measured in a
memset+mremap+munmap microbenchmark.
"Remove zero_user()" (Matthew Wilcox)
expunges zero_user() in favor of the more modern memzero_page().
"mm/huge_memory: vmf_insert_folio_*() and vmf_insert_pfn_pud() fixes" (David Hildenbrand)
addresses some warts which David noticed in the huge page code.
These were not known to be causing any issues at this time.
"mm/damon: use alloc_migrate_target() for DAMOS_MIGRATE_{HOT,COLD" (SeongJae Park)
provides some cleanup and consolidation work in DAMON.
"use vm_flags_t consistently" (Lorenzo Stoakes)
uses vm_flags_t in places where we were inappropriately using other
types.
"mm/memfd: Reserve hugetlb folios before allocation" (Vivek Kasireddy)
increases the reliability of large page allocation in the memfd
code.
"mm: Remove pXX_devmap page table bit and pfn_t type" (Alistair Popple)
removes several now-unneeded PFN_* flags.
"mm/damon: decouple sysfs from core" (SeongJae Park)
implememnts some cleanup and maintainability work in the DAMON
sysfs layer.
"madvise cleanup" (Lorenzo Stoakes)
does quite a lot of cleanup/maintenance work in the madvise() code.
"madvise anon_name cleanups" (Vlastimil Babka)
provides additional cleanups on top or Lorenzo's effort.
"Implement numa node notifier" (Oscar Salvador)
creates a standalone notifier for NUMA node memory state changes.
Previously these were lumped under the more general memory
on/offline notifier.
"Make MIGRATE_ISOLATE a standalone bit" (Zi Yan)
cleans up the pageblock isolation code and fixes a potential issue
which doesn't seem to cause any problems in practice.
"selftests/damon: add python and drgn based DAMON sysfs functionality tests" (SeongJae Park)
adds additional drgn- and python-based DAMON selftests which are
more comprehensive than the existing selftest suite.
"Misc rework on hugetlb faulting path" (Oscar Salvador)
fixes a rather obscure deadlock in the hugetlb fault code and
follows that fix with a series of cleanups.
"cma: factor out allocation logic from __cma_declare_contiguous_nid" (Mike Rapoport)
rationalizes and cleans up the highmem-specific code in the CMA
allocator.
"mm/migration: rework movable_ops page migration (part 1)" (David Hildenbrand)
provides cleanups and future-preparedness to the migration code.
"mm/damon: add trace events for auto-tuned monitoring intervals and DAMOS quota" (SeongJae Park)
adds some tracepoints to some DAMON auto-tuning code.
"mm/damon: fix misc bugs in DAMON modules" (SeongJae Park)
does that.
"mm/damon: misc cleanups" (SeongJae Park)
also does what it claims.
"mm: folio_pte_batch() improvements" (David Hildenbrand)
cleans up the large folio PTE batching code.
"mm/damon/vaddr: Allow interleaving in migrate_{hot,cold} actions" (SeongJae Park)
facilitates dynamic alteration of DAMON's inter-node allocation
policy.
"Remove unmap_and_put_page()" (Vishal Moola)
provides a couple of page->folio conversions.
"mm: per-node proactive reclaim" (Davidlohr Bueso)
implements a per-node control of proactive reclaim - beyond the
current memcg-based implementation.
"mm/damon: remove damon_callback" (SeongJae Park)
replaces the damon_callback interface with a more general and
powerful damon_call()+damos_walk() interface.
"mm/mremap: permit mremap() move of multiple VMAs" (Lorenzo Stoakes)
implements a number of mremap cleanups (of course) in preparation
for adding new mremap() functionality: newly permit the remapping
of multiple VMAs when the user is specifying MREMAP_FIXED. It still
excludes some specialized situations where this cannot be performed
reliably.
"drop hugetlb_free_pgd_range()" (Anthony Yznaga)
switches some sparc hugetlb code over to the generic version and
removes the thus-unneeded hugetlb_free_pgd_range().
"mm/damon/sysfs: support periodic and automated stats update" (SeongJae Park)
augments the present userspace-requested update of DAMON sysfs
monitoring files. Automatic update is now provided, along with a
tunable to control the update interval.
"Some randome fixes and cleanups to swapfile" (Kemeng Shi)
does what is claims.
"mm: introduce snapshot_page" (Luiz Capitulino and David Hildenbrand)
provides (and uses) a means by which debug-style functions can grab
a copy of a pageframe and inspect it locklessly without tripping
over the races inherent in operating on the live pageframe
directly.
"use per-vma locks for /proc/pid/maps reads" (Suren Baghdasaryan)
addresses the large contention issues which can be triggered by
reads from that procfs file. Latencies are reduced by more than
half in some situations. The series also introduces several new
selftests for the /proc/pid/maps interface.
"__folio_split() clean up" (Zi Yan)
cleans up __folio_split()!
"Optimize mprotect() for large folios" (Dev Jain)
provides some quite large (>3x) speedups to mprotect() when dealing
with large folios.
"selftests/mm: reuse FORCE_READ to replace "asm volatile("" : "+r" (XXX));" and some cleanup" (wang lian)
does some cleanup work in the selftests code.
"tools/testing: expand mremap testing" (Lorenzo Stoakes)
extends the mremap() selftest in several ways, including adding
more checking of Lorenzo's recently added "permit mremap() move of
multiple VMAs" feature.
"selftests/damon/sysfs.py: test all parameters" (SeongJae Park)
extends the DAMON sysfs interface selftest so that it tests all
possible user-requested parameters. Rather than the present minimal
subset"
* tag 'mm-stable-2025-07-30-15-25' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (370 commits)
MAINTAINERS: add missing headers to mempory policy & migration section
MAINTAINERS: add missing file to cgroup section
MAINTAINERS: add MM MISC section, add missing files to MISC and CORE
MAINTAINERS: add missing zsmalloc file
MAINTAINERS: add missing files to page alloc section
MAINTAINERS: add missing shrinker files
MAINTAINERS: move memremap.[ch] to hotplug section
MAINTAINERS: add missing mm_slot.h file THP section
MAINTAINERS: add missing interval_tree.c to memory mapping section
MAINTAINERS: add missing percpu-internal.h file to per-cpu section
mm/page_alloc: remove trace_mm_alloc_contig_migrate_range_info()
selftests/damon: introduce _common.sh to host shared function
selftests/damon/sysfs.py: test runtime reduction of DAMON parameters
selftests/damon/sysfs.py: test non-default parameters runtime commit
selftests/damon/sysfs.py: generalize DAMON context commit assertion
selftests/damon/sysfs.py: generalize monitoring attributes commit assertion
selftests/damon/sysfs.py: generalize DAMOS schemes commit assertion
selftests/damon/sysfs.py: test DAMOS filters commitment
selftests/damon/sysfs.py: generalize DAMOS scheme commit assertion
selftests/damon/sysfs.py: test DAMOS destinations commitment
...
The definitions of GENMASK() and GENMASK_ULL() do not depend any more
on __GENMASK() and __GENMASK_ULL(). Duplicate the existing unit tests
so that __GENMASK{,ULL}() are still covered.
Because __GENMASK() and __GENMASK_ULL() do use GENMASK_INPUT_CHECK(),
drop the TEST_GENMASK_FAILURES negative tests.
It would be good to have a small assembly test case for GENMASK*() in
case somebody decides to unify both in the future. However, I lack
expertise in assembly to do so. Instead add a FIXME message to
highlight the absence of the asm unit test.
Signed-off-by: Vincent Mailhol <mailhol.vincent@wanadoo.fr>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Yury Norov (NVIDIA) <yury.norov@gmail.com>
The function stubs exposed by module.h allow the code to compile properly
without the ifdeffery. The generated object code stays the same, as the
compiler can optimize away all the dead code.
As the code is still typechecked developer errors can be detected faster.
Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Acked-by: David Gow <davidgow@google.com>
Reviewed-by: Daniel Gomez <da.gomez@samsung.com>
Link: https://lore.kernel.org/r/20250711-kunit-ifdef-modules-v2-3-39443decb1f8@linutronix.de
Signed-off-by: Daniel Gomez <da.gomez@samsung.com>
Core & protocols
----------------
- Wrap datapath globals into net_aligned_data, to avoid false sharing.
- Preserve MSG_ZEROCOPY in forwarding (e.g. out of a container).
- Add SO_INQ and SCM_INQ support to AF_UNIX.
- Add SIOCINQ support to AF_VSOCK.
- Add TCP_MAXSEG sockopt to MPTCP.
- Add IPv6 force_forwarding sysctl to enable forwarding per interface.
- Make TCP validation of whether packet fully fits in the receive
window and the rcv_buf more strict. With increased use of HW
aggregation a single "packet" can be multiple 100s of kB.
- Add MSG_MORE flag to optimize large TCP transmissions via sockmap,
improves latency up to 33% for sockmap users.
- Convert TCP send queue handling from tasklet to BH workque.
- Improve BPF iteration over TCP sockets to see each socket exactly once.
- Remove obsolete and unused TCP RFC3517/RFC6675 loss recovery code.
- Support enabling kernel threads for NAPI processing on per-NAPI
instance basis rather than a whole device. Fully stop the kernel NAPI
thread when threaded NAPI gets disabled. Previously thread would stick
around until ifdown due to tricky synchronization.
- Allow multicast routing to take effect on locally-generated packets.
- Add output interface argument for End.X in segment routing.
- MCTP: add support for gateway routing, improve bind() handling.
- Don't require rtnl_lock when fetching an IPv6 neighbor over Netlink.
- Add a new neighbor flag ("extern_valid"), which cedes refresh
responsibilities to userspace. This is needed for EVPN multi-homing
where a neighbor entry for a multi-homed host needs to be synced
across all the VTEPs among which the host is multi-homed.
- Support NUD_PERMANENT for proxy neighbor entries.
- Add a new queuing discipline for IETF RFC9332 DualQ Coupled AQM.
- Add sequence numbers to netconsole messages. Unregister netconsole's
console when all net targets are removed. Code refactoring.
Add a number of selftests.
- Align IPSec inbound SA lookup to RFC 4301. Only SPI and protocol
should be used for an inbound SA lookup.
- Support inspecting ref_tracker state via DebugFS.
- Don't force bonding advertisement frames tx to ~333 ms boundaries.
Add broadcast_neighbor option to send ARP/ND on all bonded links.
- Allow providing upcall pid for the 'execute' command in openvswitch.
- Remove DCCP support from Netfilter's conntrack.
- Disallow multiple packet duplications in the queuing layer.
- Prevent use of deprecated iptables code on PREEMPT_RT.
Driver API
----------
- Support RSS and hashing configuration over ethtool Netlink.
- Add dedicated ethtool callbacks for getting and setting hashing fields.
- Add support for power budget evaluation strategy in PSE /
Power-over-Ethernet. Generate Netlink events for overcurrent etc.
- Support DPLL phase offset monitoring across all device inputs.
Support providing clock reference and SYNC over separate DPLL
inputs.
- Support traffic classes in devlink rate API for bandwidth management.
- Remove rtnl_lock dependency from UDP tunnel port configuration.
Device drivers
--------------
- Add a new Broadcom driver for 800G Ethernet (bnge).
- Add a standalone driver for Microchip ZL3073x DPLL.
- Remove IBM's NETIUCV device driver.
- Ethernet high-speed NICs:
- Broadcom (bnxt):
- support zero-copy Tx of DMABUF memory
- take page size into account for page pool recycling rings
- Intel (100G, ice, idpf):
- idpf: XDP and AF_XDP support preparations
- idpf: add flow steering
- add link_down_events statistic
- clean up the TSPLL code
- preparations for live VM migration
- nVidia/Mellanox:
- support zero-copy Rx/Tx interfaces (DMABUF and io_uring)
- optimize context memory usage for matchers
- expose serial numbers in devlink info
- support PCIe congestion metrics
- Meta (fbnic):
- add 25G, 50G, and 100G link modes to phylink
- support dumping FW logs
- Marvell/Cavium:
- support for CN20K generation of the Octeon chips
- Amazon:
- add HW clock (without timestamping, just hypervisor time access)
- Ethernet virtual:
- VirtIO net:
- support segmentation of UDP-tunnel-encapsulated packets
- Google (gve):
- support packet timestamping and clock synchronization
- Microsoft vNIC:
- add handler for device-originated servicing events
- allow dynamic MSI-X vector allocation
- support Tx bandwidth clamping
- Ethernet NICs consumer, and embedded:
- AMD:
- amd-xgbe: hardware timestamping and PTP clock support
- Broadcom integrated MACs (bcmgenet, bcmasp):
- use napi_complete_done() return value to support NAPI polling
- add support for re-starting auto-negotiation
- Broadcom switches (b53):
- support BCM5325 switches
- add bcm63xx EPHY power control
- Synopsys (stmmac):
- lots of code refactoring and cleanups
- TI:
- icssg-prueth: read firmware-names from device tree
- icssg: PRP offload support
- Microchip:
- lan78xx: convert to PHYLINK for improved PHY and MAC management
- ksz: add KSZ8463 switch support
- Intel:
- support similar queue priority scheme in multi-queue and
time-sensitive networking (taprio)
- support packet pre-emption in both
- RealTek (r8169):
- enable EEE at 5Gbps on RTL8126
- Airoha:
- add PPPoE offload support
- MDIO bus controller for Airoha AN7583
- Ethernet PHYs:
- support for the IPQ5018 internal GE PHY
- micrel KSZ9477 switch-integrated PHYs:
- add MDI/MDI-X control support
- add RX error counters
- add cable test support
- add Signal Quality Indicator (SQI) reporting
- dp83tg720: improve reset handling and reduce link recovery time
- support bcm54811 (and its MII-Lite interface type)
- air_en8811h: support resume/suspend
- support PHY counters for QCA807x and QCA808x
- support WoL for QCA807x
- CAN drivers:
- rcar_canfd: support for Transceiver Delay Compensation
- kvaser: report FW versions via devlink dev info
- WiFi:
- extended regulatory info support (6 GHz)
- add statistics and beacon monitor for Multi-Link Operation (MLO)
- support S1G aggregation, improve S1G support
- add Radio Measurement action fields
- support per-radio RTS threshold
- some work around how FIPS affects wifi, which was wrong (RC4 is used
by TKIP, not only WEP)
- improvements for unsolicited probe response handling
- WiFi drivers:
- RealTek (rtw88):
- IBSS mode for SDIO devices
- RealTek (rtw89):
- BT coexistence for MLO/WiFi7
- concurrent station + P2P support
- support for USB devices RTL8851BU/RTL8852BU
- Intel (iwlwifi):
- use embedded PNVM in (to be released) FW images to fix
compatibility issues
- many cleanups (unused FW APIs, PCIe code, WoWLAN)
- some FIPS interoperability
- MediaTek (mt76):
- firmware recovery improvements
- more MLO work
- Qualcomm/Atheros (ath12k):
- fix scan on multi-radio devices
- more EHT/Wi-Fi 7 features
- encapsulation/decapsulation offload
- Broadcom (brcm80211):
- support SDIO 43751 device
- Bluetooth:
- hci_event: add support for handling LE BIG Sync Lost event
- ISO: add socket option to report packet seqnum via CMSG
- ISO: support SCM_TIMESTAMPING for ISO TS
- Bluetooth drivers:
- intel_pcie: support Function Level Reset
- nxpuart: add support for 4M baudrate
- nxpuart: implement powerup sequence, reset, FW dump, and FW loading
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEE6jPA+I1ugmIBA4hXMUZtbf5SIrsFAmiFgLgACgkQMUZtbf5S
IrvafxAAnQRwYBoIG+piCILx6z5pRvBGHkmEQ4AQgSCFuq2eO3ubwMFIqEybfma1
5+QFjUZAV3OgGgKRBS2KGWxtSzdiF+/JGV1VOIN67sX3Mm0a2QgjA4n5CgKL0FPr
o6BEzjX5XwG1zvGcBNQ5BZ19xUUKjoZQgTtnea8sZ57Fsp5RtRgmYRqoewNvNk/n
uImh0NFsDVb0UeOpSzC34VD9l1dJvLGdui4zJAjno/vpvmT1DkXjoK419J/r52SS
X+5WgsfJ6DkjHqVN1tIhhK34yWqBOcwGFZJgEnWHMkFIl2FqRfFKMHyqtfLlVnLA
mnIpSyz8Sq2AHtx0TlgZ3At/Ri8p5+yYJgHOXcDKyABa8y8Zf4wrycmr6cV9JLuL
z54nLEVnJuvfDVDVJjsLYdJXyhMpZFq6+uAItdxKaw8Ugp/QqG4QtoRj+XIHz4ZW
z6OohkCiCzTwEISFK+pSTxPS30eOxq43kCspcvuLiwCCStJBRkRb5GdZA4dm7LA+
1Od4ADAkHjyrFtBqTyyC2scX8UJ33DlAIpAYyIeS6w9Cj9EXxtp1z33IAAAZ03MW
jJwIaJuc8bK2fWKMmiG7ucIXjPo4t//KiWlpkwwqLhPbjZgfDAcxq1AC2TLoqHBL
y4EOgKpHDCMAghSyiFIAn2JprGcEt8dp+11B0JRXIn4Pm/eYDH8=
=lqbe
-----END PGP SIGNATURE-----
Merge tag 'net-next-6.17' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next
Pull networking updates from Jakub Kicinski:
"Core & protocols:
- Wrap datapath globals into net_aligned_data, to avoid false sharing
- Preserve MSG_ZEROCOPY in forwarding (e.g. out of a container)
- Add SO_INQ and SCM_INQ support to AF_UNIX
- Add SIOCINQ support to AF_VSOCK
- Add TCP_MAXSEG sockopt to MPTCP
- Add IPv6 force_forwarding sysctl to enable forwarding per interface
- Make TCP validation of whether packet fully fits in the receive
window and the rcv_buf more strict. With increased use of HW
aggregation a single "packet" can be multiple 100s of kB
- Add MSG_MORE flag to optimize large TCP transmissions via sockmap,
improves latency up to 33% for sockmap users
- Convert TCP send queue handling from tasklet to BH workque
- Improve BPF iteration over TCP sockets to see each socket exactly
once
- Remove obsolete and unused TCP RFC3517/RFC6675 loss recovery code
- Support enabling kernel threads for NAPI processing on per-NAPI
instance basis rather than a whole device. Fully stop the kernel
NAPI thread when threaded NAPI gets disabled. Previously thread
would stick around until ifdown due to tricky synchronization
- Allow multicast routing to take effect on locally-generated packets
- Add output interface argument for End.X in segment routing
- MCTP: add support for gateway routing, improve bind() handling
- Don't require rtnl_lock when fetching an IPv6 neighbor over Netlink
- Add a new neighbor flag ("extern_valid"), which cedes refresh
responsibilities to userspace. This is needed for EVPN multi-homing
where a neighbor entry for a multi-homed host needs to be synced
across all the VTEPs among which the host is multi-homed
- Support NUD_PERMANENT for proxy neighbor entries
- Add a new queuing discipline for IETF RFC9332 DualQ Coupled AQM
- Add sequence numbers to netconsole messages. Unregister
netconsole's console when all net targets are removed. Code
refactoring. Add a number of selftests
- Align IPSec inbound SA lookup to RFC 4301. Only SPI and protocol
should be used for an inbound SA lookup
- Support inspecting ref_tracker state via DebugFS
- Don't force bonding advertisement frames tx to ~333 ms boundaries.
Add broadcast_neighbor option to send ARP/ND on all bonded links
- Allow providing upcall pid for the 'execute' command in openvswitch
- Remove DCCP support from Netfilter's conntrack
- Disallow multiple packet duplications in the queuing layer
- Prevent use of deprecated iptables code on PREEMPT_RT
Driver API:
- Support RSS and hashing configuration over ethtool Netlink
- Add dedicated ethtool callbacks for getting and setting hashing
fields
- Add support for power budget evaluation strategy in PSE /
Power-over-Ethernet. Generate Netlink events for overcurrent etc
- Support DPLL phase offset monitoring across all device inputs.
Support providing clock reference and SYNC over separate DPLL
inputs
- Support traffic classes in devlink rate API for bandwidth
management
- Remove rtnl_lock dependency from UDP tunnel port configuration
Device drivers:
- Add a new Broadcom driver for 800G Ethernet (bnge)
- Add a standalone driver for Microchip ZL3073x DPLL
- Remove IBM's NETIUCV device driver
- Ethernet high-speed NICs:
- Broadcom (bnxt):
- support zero-copy Tx of DMABUF memory
- take page size into account for page pool recycling rings
- Intel (100G, ice, idpf):
- idpf: XDP and AF_XDP support preparations
- idpf: add flow steering
- add link_down_events statistic
- clean up the TSPLL code
- preparations for live VM migration
- nVidia/Mellanox:
- support zero-copy Rx/Tx interfaces (DMABUF and io_uring)
- optimize context memory usage for matchers
- expose serial numbers in devlink info
- support PCIe congestion metrics
- Meta (fbnic):
- add 25G, 50G, and 100G link modes to phylink
- support dumping FW logs
- Marvell/Cavium:
- support for CN20K generation of the Octeon chips
- Amazon:
- add HW clock (without timestamping, just hypervisor time access)
- Ethernet virtual:
- VirtIO net:
- support segmentation of UDP-tunnel-encapsulated packets
- Google (gve):
- support packet timestamping and clock synchronization
- Microsoft vNIC:
- add handler for device-originated servicing events
- allow dynamic MSI-X vector allocation
- support Tx bandwidth clamping
- Ethernet NICs consumer, and embedded:
- AMD:
- amd-xgbe: hardware timestamping and PTP clock support
- Broadcom integrated MACs (bcmgenet, bcmasp):
- use napi_complete_done() return value to support NAPI polling
- add support for re-starting auto-negotiation
- Broadcom switches (b53):
- support BCM5325 switches
- add bcm63xx EPHY power control
- Synopsys (stmmac):
- lots of code refactoring and cleanups
- TI:
- icssg-prueth: read firmware-names from device tree
- icssg: PRP offload support
- Microchip:
- lan78xx: convert to PHYLINK for improved PHY and MAC management
- ksz: add KSZ8463 switch support
- Intel:
- support similar queue priority scheme in multi-queue and
time-sensitive networking (taprio)
- support packet pre-emption in both
- RealTek (r8169):
- enable EEE at 5Gbps on RTL8126
- Airoha:
- add PPPoE offload support
- MDIO bus controller for Airoha AN7583
- Ethernet PHYs:
- support for the IPQ5018 internal GE PHY
- micrel KSZ9477 switch-integrated PHYs:
- add MDI/MDI-X control support
- add RX error counters
- add cable test support
- add Signal Quality Indicator (SQI) reporting
- dp83tg720: improve reset handling and reduce link recovery time
- support bcm54811 (and its MII-Lite interface type)
- air_en8811h: support resume/suspend
- support PHY counters for QCA807x and QCA808x
- support WoL for QCA807x
- CAN drivers:
- rcar_canfd: support for Transceiver Delay Compensation
- kvaser: report FW versions via devlink dev info
- WiFi:
- extended regulatory info support (6 GHz)
- add statistics and beacon monitor for Multi-Link Operation (MLO)
- support S1G aggregation, improve S1G support
- add Radio Measurement action fields
- support per-radio RTS threshold
- some work around how FIPS affects wifi, which was wrong (RC4 is
used by TKIP, not only WEP)
- improvements for unsolicited probe response handling
- WiFi drivers:
- RealTek (rtw88):
- IBSS mode for SDIO devices
- RealTek (rtw89):
- BT coexistence for MLO/WiFi7
- concurrent station + P2P support
- support for USB devices RTL8851BU/RTL8852BU
- Intel (iwlwifi):
- use embedded PNVM in (to be released) FW images to fix
compatibility issues
- many cleanups (unused FW APIs, PCIe code, WoWLAN)
- some FIPS interoperability
- MediaTek (mt76):
- firmware recovery improvements
- more MLO work
- Qualcomm/Atheros (ath12k):
- fix scan on multi-radio devices
- more EHT/Wi-Fi 7 features
- encapsulation/decapsulation offload
- Broadcom (brcm80211):
- support SDIO 43751 device
- Bluetooth:
- hci_event: add support for handling LE BIG Sync Lost event
- ISO: add socket option to report packet seqnum via CMSG
- ISO: support SCM_TIMESTAMPING for ISO TS
- Bluetooth drivers:
- intel_pcie: support Function Level Reset
- nxpuart: add support for 4M baudrate
- nxpuart: implement powerup sequence, reset, FW dump, and FW loading"
* tag 'net-next-6.17' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1742 commits)
dpll: zl3073x: Fix build failure
selftests: bpf: fix legacy netfilter options
ipv6: annotate data-races around rt->fib6_nsiblings
ipv6: fix possible infinite loop in fib6_info_uses_dev()
ipv6: prevent infinite loop in rt6_nlmsg_size()
ipv6: add a retry logic in net6_rt_notify()
vrf: Drop existing dst reference in vrf_ip6_input_dst
net/sched: taprio: align entry index attr validation with mqprio
net: fsl_pq_mdio: use dev_err_probe
selftests: rtnetlink.sh: remove esp4_offload after test
vsock: remove unnecessary null check in vsock_getname()
igb: xsk: solve negative overflow of nb_pkts in zerocopy mode
stmmac: xsk: fix negative overflow of budget in zerocopy mode
dt-bindings: ieee802154: Convert at86rf230.txt yaml format
net: dsa: microchip: Disable PTP function of KSZ8463
net: dsa: microchip: Setup fiber ports for KSZ8463
net: dsa: microchip: Write switch MAC address differently for KSZ8463
net: dsa: microchip: Use different registers for KSZ8463
net: dsa: microchip: Add KSZ8463 switch support to KSZ DSA driver
dt-bindings: net: dsa: microchip: Add KSZ8463 switch support
...
* Move sysctls out of the kern_table array
This is the final move of ctl_tables into their respective subsystems. Only 5
(out of the original 50) will remain in kernel/sysctl.c file; these handle
either sysctl or common arch variables.
By decentralizing sysctl registrations, subsystem maintainers regain control
over their sysctl interfaces, improving maintainability and reducing the
likelihood of merge conflicts.
* docs: Remove false positives from check-sysctl-docs
Stopped falsely identifying sysctls as undocumented or unimplemented in the
check-sysctl-docs script. This script can now be used to automatically
identify if documentation is missing.
* Testing
All these have been in linux-next since rc3, giving them a solid 3 to 4 weeks
worth of testing. Additionally, sysctl selftests and kunit were also run
locally on my x86_64
-----BEGIN PGP SIGNATURE-----
iQGzBAABCgAdFiEErkcJVyXmMSXOyyeQupfNUreWQU8FAmiAvd8ACgkQupfNUreW
QU+9nAv/dtxaKoL4BXJSzsA2+49bbo9QfiK5Vjz1wSRYRQTb+jhGr9QdS5hG+NeX
uN2ilvcNQqW7ENdiblU10lvcbPjIn2hw4lbMcpv/+QXnrudtGYlBFXlkWqW5nv7X
AVvHU8y3uzfs6JbRIpROUA7Cn2cDOlfP2mMtwxCXR3iP+orS1ziuVEi1JRoirIyG
iq5I/1rJMJBU3FjqqDTq6yljspLx8AlXO1yc5xUxAM67IcY4ew3ZTxqiZr6M9AhV
DUbR2lu/88wcFNERt8DJmuQ50dSGGqOEpK3FURTmkwtMFxzNLmenFDQeBKKahz3Q
2ntXSDfp2y+ppZNmcOP8tZZkra03Xpy1DQyoOgQ2r9uGekPxyr+wmKXwYPOeJIPO
YWTNBm8omX9qr49zVzaZ1f2foRGfgStHL6aa6xLIf34zzScSDEPtO3og2+5Hw/30
gnp+7v9E19uKpoE6oiGE0PtiFzAi/I6nFxzG2RRqrlMLFXyKVccTKygzY6tCnI3P
6144s/Bt
=R369
-----END PGP SIGNATURE-----
Merge tag 'sysctl-6.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/sysctl/sysctl
Pull sysctl updates from Joel Granados:
- Move sysctls out of the kern_table array
This is the final move of ctl_tables into their respective
subsystems. Only 5 (out of the original 50) will remain in
kernel/sysctl.c file; these handle either sysctl or common arch
variables.
By decentralizing sysctl registrations, subsystem maintainers regain
control over their sysctl interfaces, improving maintainability and
reducing the likelihood of merge conflicts.
- docs: Remove false positives from check-sysctl-docs
Stopped falsely identifying sysctls as undocumented or unimplemented
in the check-sysctl-docs script. This script can now be used to
automatically identify if documentation is missing.
* tag 'sysctl-6.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/sysctl/sysctl: (23 commits)
docs: Downgrade arm64 & riscv from titles to comment
docs: Replace spaces with tabs in check-sysctl-docs
docs: Remove colon from ctltable title in vm.rst
docs: Add awk section for ucount sysctl entries
docs: Use skiplist when checking sysctl admin-guide
docs: nixify check-sysctl-docs
sysctl: rename kern_table -> sysctl_subsys_table
kernel/sys.c: Move overflow{uid,gid} sysctl into kernel/sys.c
uevent: mv uevent_helper into kobject_uevent.c
sysctl: Removed unused variable
sysctl: Nixify sysctl.sh
sysctl: Remove superfluous includes from kernel/sysctl.c
sysctl: Remove (very) old file changelog
sysctl: Move sysctl_panic_on_stackoverflow to kernel/panic.c
sysctl: move cad_pid into kernel/pid.c
sysctl: Move tainted ctl_table into kernel/panic.c
Input: sysrq: mv sysrq into drivers/tty/sysrq.c
fork: mv threads-max into kernel/fork.c
parisc/power: Move soft-power into power.c
mm: move randomize_va_space into memory.c
...
- Standardize on the __ASSEMBLER__ macro that is provided by GCC
and Clang compilers and replace __ASSEMBLY__ with __ASSEMBLER__
in both uapi and non-uapi headers
- Explicitly include <linux/export.h> in architecture and driver
files which contain an EXPORT_SYMBOL() and remove the include
from the files which do not contain the EXPORT_SYMBOL()
- Use the full title of "z/Architecture Principles of Operation"
manual and the name of a section where facility bits are listed
- Use -D__DISABLE_EXPORTS for files in arch/s390/boot to avoid
unnecessary slowing down of the build and confusing external
kABI tools that process symtypes data
- Print additional unrecoverable machine check information to make
the root cause analysis easier
- Move cmpxchg_user_key() handling to uaccess library code, since
the generated code is large anyway and there is no benefit if it
is inlined
- Fix a problem when cmpxchg_user_key() is executing a code with a
non-default key: if a system is IPL-ed with "LOAD NORMAL", and
the previous system used storage keys where the fetch-protection
bit was set for some pages, and the cmpxchg_user_key() is located
within such page, a protection exception happens
- Either the external call or emergency signal order is used to send
an IPI to a remote CPU. Use the external order only, since it is at
least as good and sometimes even better, than the emergency signal
- In case of an early crash the early program check handler prints
more or less random value of the last breaking event address, since
it is not initialized properly. Copy the last breaking event address
from the lowcore to pt_regs to address this
- During STP synchronization check udelay() can not be used, since the
first CPU modifies tod_clock_base and get_tod_clock_monotonic() might
return a non-monotonic time. Instead, busy-loop on other CPUs, while
the the first CPU actually handles the synchronization operation
- When debugging the early kernel boot using QEMU with the -S flag and
GDB attached, skip the decompressor and start directly in kernel
- Rename PAI Crypto event 4210 according to z16 and z17 "z/Architecture
Principles of Operation" manual
- Remove the in-kernel time steering support in favour of the new s390
PTP driver, which allows the kernel clock steered more precisely
- Remove a possible false-positive warning in pte_free_defer(), which
could be triggered in a valid case KVM guest process is initializing
-----BEGIN PGP SIGNATURE-----
iI0EABYKADUWIQQrtrZiYVkVzKQcYivNdxKlNrRb8AUCaIJQThccYWdvcmRlZXZA
bGludXguaWJtLmNvbQAKCRDNdxKlNrRb8FI2APwPnlrj6ZVXzNA6dw0fSUt697rS
NlaHEORXL8KcfoQh8QD/WwHUe1VNtDG1R5bBn0guR+UytVgR9Tt7LxyKfIgT3ws=
=tdMb
-----END PGP SIGNATURE-----
Merge tag 's390-6.17-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux
Pull s390 updates from Alexander Gordeev:
- Standardize on the __ASSEMBLER__ macro that is provided by GCC and
Clang compilers and replace __ASSEMBLY__ with __ASSEMBLER__ in both
uapi and non-uapi headers
- Explicitly include <linux/export.h> in architecture and driver files
which contain an EXPORT_SYMBOL() and remove the include from the
files which do not contain the EXPORT_SYMBOL()
- Use the full title of "z/Architecture Principles of Operation" manual
and the name of a section where facility bits are listed
- Use -D__DISABLE_EXPORTS for files in arch/s390/boot to avoid
unnecessary slowing down of the build and confusing external kABI
tools that process symtypes data
- Print additional unrecoverable machine check information to make the
root cause analysis easier
- Move cmpxchg_user_key() handling to uaccess library code, since the
generated code is large anyway and there is no benefit if it is
inlined
- Fix a problem when cmpxchg_user_key() is executing a code with a
non-default key: if a system is IPL-ed with "LOAD NORMAL", and the
previous system used storage keys where the fetch-protection bit was
set for some pages, and the cmpxchg_user_key() is located within such
page, a protection exception happens
- Either the external call or emergency signal order is used to send an
IPI to a remote CPU. Use the external order only, since it is at
least as good and sometimes even better, than the emergency signal
- In case of an early crash the early program check handler prints more
or less random value of the last breaking event address, since it is
not initialized properly. Copy the last breaking event address from
the lowcore to pt_regs to address this
- During STP synchronization check udelay() can not be used, since the
first CPU modifies tod_clock_base and get_tod_clock_monotonic() might
return a non-monotonic time. Instead, busy-loop on other CPUs, while
the the first CPU actually handles the synchronization operation
- When debugging the early kernel boot using QEMU with the -S flag and
GDB attached, skip the decompressor and start directly in kernel
- Rename PAI Crypto event 4210 according to z16 and z17 "z/Architecture
Principles of Operation" manual
- Remove the in-kernel time steering support in favour of the new s390
PTP driver, which allows the kernel clock steered more precisely
- Remove a possible false-positive warning in pte_free_defer(), which
could be triggered in a valid case KVM guest process is initializing
* tag 's390-6.17-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (29 commits)
s390/mm: Remove possible false-positive warning in pte_free_defer()
s390/stp: Default to enabled
s390/stp: Remove leap second support
s390/time: Remove in-kernel time steering
s390/sclp: Use monotonic clock in sclp_sync_wait()
s390/smp: Use monotonic clock in smp_emergency_stop()
s390/time: Use monotonic clock in get_cycles()
s390/pai_crypto: Rename PAI Crypto event 4210
scripts/gdb/symbols: make lx-symbols skip the s390 decompressor
s390/boot: Introduce jump_to_kernel() function
s390/stp: Remove udelay from stp_sync_clock()
s390/early: Copy last breaking event address to pt_regs
s390/smp: Remove conditional emergency signal order code usage
s390/uaccess: Merge cmpxchg_user_key() inline assemblies
s390/uaccess: Prevent kprobes on cmpxchg_user_key() functions
s390/uaccess: Initialize code pages executed with non-default access key
s390/skey: Provide infrastructure for executing with non-default access key
s390/uaccess: Make cmpxchg_user_key() library code
s390/page: Add memory clobber to page_set_storage_key()
s390/page: Cleanup page_set_storage_key() inline assemblies
...
Core scheduler changes:
- Better tracking of maximum lag of tasks in presence of different
slices duration, for better handling of lag in the fair
scheduler. (Vincent Guittot)
- Clean up and standardize #if/#else/#endif markers throughout
the entire scheduler code base (Ingo Molnar)
- Make SMP unconditional: build the SMP scheduler's
data structures and logic on UP kernel too, even though
they are not used, to simplify the scheduler and remove
around 200 #ifdef/[#else]/#endif blocks from the
scheduler. (Ingo Molnar)
- Reorganize cgroup bandwidth control interface handling
for better interfacing with sched_ext (Tejun Heo)
Balancing:
- Bump sd->max_newidle_lb_cost when newidle balance fails (Chris Mason)
- Remove sched_domain_topology_level::flags to simplify the code (Prateek Nayak)
- Simplify and clean up build_sched_topology() (Li Chen)
- Optimize build_sched_topology() on large machines (Li Chen)
Real-time scheduling:
- Add initial version of proxy execution: a mechanism for mutex-owning
tasks to inherit the scheduling context of higher priority waiters.
Currently limited to a single runqueue and conditional on CONFIG_EXPERT,
and other limitations. (John Stultz, Peter Zijlstra, Valentin Schneider)
- Deadline scheduler (Juri Lelli):
- Fix dl_servers initialization order (Juri Lelli)
- Fix DL scheduler's root domain reinitialization logic (Juri Lelli)
- Fix accounting bugs after global limits change (Juri Lelli)
- Fix scalability regression by implementing less agressive dl_server handling
(Peter Zijlstra)
PSI:
- Improve scalability by optimizing psi_group_change() cpu_clock() usage
(Peter Zijlstra)
Rust changes:
- Make Task, CondVar and PollCondVar methods inline to avoid unnecessary
function calls (Kunwu Chan, Panagiotis Foliadis)
- Add might_sleep() support for Rust code: Rust's "#[track_caller]"
mechanism is used so that Rust's might_sleep() doesn't need to be
defined as a macro (Fujita Tomonori)
- Introduce file_from_location() (Boqun Feng)
Debugging & instrumentation:
- Make clangd usable with scheduler source code files again (Peter Zijlstra)
- tools: Add root_domains_dump.py which dumps root domains info (Juri Lelli)
- tools: Add dl_bw_dump.py for printing bandwidth accounting info (Juri Lelli)
Misc cleanups & fixes:
- Remove play_idle() (Feng Lee)
- Fix check_preemption_disabled() (Sebastian Andrzej Siewior)
- Do not call __put_task_struct() on RT if pi_blocked_on is set
(Luis Claudio R. Goncalves)
- Correct the comment in place_entity() (wang wei)
Signed-off-by: Ingo Molnar <mingo@kernel.org>
-----BEGIN PGP SIGNATURE-----
iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAmiHHNIRHG1pbmdvQGtl
cm5lbC5vcmcACgkQEnMQ0APhK1g7DhAAg9aMW33PuC24A4hCS1XQay6j3rgmR5qC
AOqDofj/CY4Q374HQtOl4m5CYZB/G5csRv6TZliWQKhAy9vr6VWddoyOMJYOAlAx
XRurl1Z3MriOMD6DPgNvtHd5PrR5Un8ygALgT+32d0PRz27KNXORW5TyvEf2Bv4r
BX4/GazlOlK0PdGUdZl0q/3dtkU4Wr5IifQzT8KbarOSBbNwZwVcg+83hLW5gJMx
LgMGLaAATmiN7VuvJWNDATDfEOmOvQOu8veoS8TuP1AOVeJPfPT2JVh9Jen5V1/5
3w1RUOkUI2mQX+cujWDW3koniSxjsA1OegXfHnFkF5BXp4q5e54k6D5sSh1xPFDX
iDhkU5jsbKkkJS2ulD6Vi4bIAct3apMl4IrbJn/OYOLcUVI8WuunHs4UPPEuESAS
TuQExKSdj4Ntrzo3pWEy8kX3/Z9VGa+WDzwsPUuBSvllB5Ir/jjKgvkxPA6zGsiY
rbkmZT8qyI01IZ/GXqfI2AQYCGvgp+SOvFPi755ZlELTQS6sUkGZH2/2M5XnKA9t
Z1wB2iwttoS1VQInx0HgiiAGrXrFkr7IzSIN2T+CfWIqilnL7+nTxzwlJtC206P4
DB97bF6azDtJ6yh1LetRZ1ZMX/Gr56Cy0Z6USNoOu+a12PLqlPk9+fPBBpkuGcdy
BRk8KgysEuk=
=8T0v
-----END PGP SIGNATURE-----
Merge tag 'sched-core-2025-07-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull scheduler updates from Ingo Molnar:
"Core scheduler changes:
- Better tracking of maximum lag of tasks in presence of different
slices duration, for better handling of lag in the fair scheduler
(Vincent Guittot)
- Clean up and standardize #if/#else/#endif markers throughout the
entire scheduler code base (Ingo Molnar)
- Make SMP unconditional: build the SMP scheduler's data structures
and logic on UP kernel too, even though they are not used, to
simplify the scheduler and remove around 200 #ifdef/[#else]/#endif
blocks from the scheduler (Ingo Molnar)
- Reorganize cgroup bandwidth control interface handling for better
interfacing with sched_ext (Tejun Heo)
Balancing:
- Bump sd->max_newidle_lb_cost when newidle balance fails (Chris
Mason)
- Remove sched_domain_topology_level::flags to simplify the code
(Prateek Nayak)
- Simplify and clean up build_sched_topology() (Li Chen)
- Optimize build_sched_topology() on large machines (Li Chen)
Real-time scheduling:
- Add initial version of proxy execution: a mechanism for
mutex-owning tasks to inherit the scheduling context of higher
priority waiters.
Currently limited to a single runqueue and conditional on
CONFIG_EXPERT, and other limitations (John Stultz, Peter Zijlstra,
Valentin Schneider)
- Deadline scheduler (Juri Lelli):
- Fix dl_servers initialization order (Juri Lelli)
- Fix DL scheduler's root domain reinitialization logic (Juri
Lelli)
- Fix accounting bugs after global limits change (Juri Lelli)
- Fix scalability regression by implementing less agressive
dl_server handling (Peter Zijlstra)
PSI:
- Improve scalability by optimizing psi_group_change() cpu_clock()
usage (Peter Zijlstra)
Rust changes:
- Make Task, CondVar and PollCondVar methods inline to avoid
unnecessary function calls (Kunwu Chan, Panagiotis Foliadis)
- Add might_sleep() support for Rust code: Rust's "#[track_caller]"
mechanism is used so that Rust's might_sleep() doesn't need to be
defined as a macro (Fujita Tomonori)
- Introduce file_from_location() (Boqun Feng)
Debugging & instrumentation:
- Make clangd usable with scheduler source code files again (Peter
Zijlstra)
- tools: Add root_domains_dump.py which dumps root domains info (Juri
Lelli)
- tools: Add dl_bw_dump.py for printing bandwidth accounting info
(Juri Lelli)
Misc cleanups & fixes:
- Remove play_idle() (Feng Lee)
- Fix check_preemption_disabled() (Sebastian Andrzej Siewior)
- Do not call __put_task_struct() on RT if pi_blocked_on is set (Luis
Claudio R. Goncalves)
- Correct the comment in place_entity() (wang wei)"
* tag 'sched-core-2025-07-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (84 commits)
sched/idle: Remove play_idle()
sched: Do not call __put_task_struct() on rt if pi_blocked_on is set
sched: Start blocked_on chain processing in find_proxy_task()
sched: Fix proxy/current (push,pull)ability
sched: Add an initial sketch of the find_proxy_task() function
sched: Fix runtime accounting w/ split exec & sched contexts
sched: Move update_curr_task logic into update_curr_se
locking/mutex: Add p->blocked_on wrappers for correctness checks
locking/mutex: Rework task_struct::blocked_on
sched: Add CONFIG_SCHED_PROXY_EXEC & boot argument to enable/disable
sched/topology: Remove sched_domain_topology_level::flags
x86/smpboot: avoid SMT domain attach/destroy if SMT is not enabled
x86/smpboot: moves x86_topology to static initialize and truncate
x86/smpboot: remove redundant CONFIG_SCHED_SMT
smpboot: introduce SDTL_INIT() helper to tidy sched topology setup
tools/sched: Add dl_bw_dump.py for printing bandwidth accounting info
tools/sched: Add root_domains_dump.py which dumps root domains info
sched/deadline: Fix accounting after global limits change
sched/deadline: Reset extra_bw to max_bw when clearing root domains
sched/deadline: Initialize dl_servers after SMP
...
Changes
-------
* Add trivial kunit test for ratelimit
* Make the ratelimit test more reliable (Petr Mladek)
* Add stress test for ratelimit
-----BEGIN PGP SIGNATURE-----
iQJHBAABCgAxFiEEbK7UrM+RBIrCoViJnr8S83LZ+4wFAmiBbmkTHHBhdWxtY2tA
a2VybmVsLm9yZwAKCRCevxLzctn7jGDHD/wLrUvsUk494xhGudBd+uSy4JPtC3TZ
s53I/IwUwmWUzPFW3kCwwDmq6LDVRg2I7kPh/6uAniMsOXF2watNrGloDu3ZJfqP
Z3H5pXQDHi8Y/vzU/S/jUIjGqw9Zaop/3zVcNBdYH593Vnie8QTn6+SFtgBW93+J
c0ScaazRrBBri8n/S5dC0wcjebQQjOeThGhOOeKIQHWauqNhLwqoRGKyNGNHVfux
2jQIRThv1vcVpjMQo5D3g6prZpI6XBQ3D3EkHOHjGChW6G89EEvtdUghlB56SG2d
mK7CPUd7zI/QI/33p4wsbqffZUO5j0Tveoir8/Xw69YYpP1HVAU8g6B5Y6eqs7Gv
4s8r9XgCmwFDrl33Gkgm7JBPOqixTu3zk+MBAAeo8MKrox1hamms6pBI0KWUf9QW
UOGEW4GfPc4VS7WRF9/JCTMpBZ9tBB7eTBYMy4hfrHBhsMKXfy7xHwK139PUu/JS
W6O8BQnBfJ+ZOSq5oJpeV/hiyVrk/0nQfqjzMfA8zyF5IlGC8x5MiYXsFn61VJw2
k8WmfF//pU6O3NDwfJLmv2Oi5NY/0QuI597TxGm/Cw0Ns/qtTtQ/BEZZsQ0V2fxg
bJoFRrxOBI3fw9UPzW5LGfNUcMG5VsxuQTgRglrul9+WQIsDkZqyRQD+FP2T7hfr
IcXU+V5M4nryYw==
=DDJ8
-----END PGP SIGNATURE-----
Merge tag 'ratelimit.2025.07.23a' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu
Pull ratelimit test updates from Paul McKenney:
"Add functional and stress tests:
- Add trivial kunit test for ratelimit
- Make the ratelimit test more reliable (Petr Mladek)
- Add stress test for ratelimit"
* tag 'ratelimit.2025.07.23a' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu:
lib: Add stress test for ratelimit
lib: Make the ratelimit test more reliable
lib: Add trivial kunit test for ratelimit
- Introduce support for auxiliary timekeepers
PTP clocks can be disconnected from the universal CLOCK_TAI reality
for various reasons including regularatory requirements for
functional safety redundancy.
The kernel so far only supports a single notion of time, which means
that all clocks are correlated in frequency and only differ by
offset to each other.
Access to non-correlated PTP clocks has been available so far only
through the file descriptor based "POSIX clock IDs", which are
subject to locking and have to go all the way out to the hardware.
The access is not only horribly slow, as it has to go all the way out
to the NIC/PTP hardware, but that also prevents the kernel to read
the time of such clocks e.g. from the network stack, where it is
required for TSN networking both on the transmit and receive side
unless the hardware provides offloading.
The auxiliary clocks provide a mechanism to support arbitrary clocks
which are not correlated to the system clock. This is not restricted
to the PTP use case on purpose as there is no kernel side association
of these clocks to a particular PTP device because that's a pure user
space configuration decision. Having them independent allows to
utilize them for other purposes and also enables them to be tested
without hardware dependencies.
To avoid pointless overhead these clocks have to be enabled
individualy via a new sysfs interface to reduce the overhead to a
single compare in the hotpath if they are enabled at the Kconfig
level at all.
These clocks utilize the existing timekeeping/NTP infrastructures,
which has been made possible over the recent releases by incrementaly
converting these infrastructures over from a single static instance
to a multi-instance pointer based implementation without any
performance regression reported.
The auxiliary clocks provide the same "emulation" of a "correct"
clock as the existing CLOCK_* variants do with an independent
instance of data and provide the same steering mechanism through the
existing sys_clock_adjtime() interface, which has been confirmed to
work by the chronyd(8) maintainer.
That allows to provide lockless kernel internal and VDSO support so
that applications and kernel internal functionalities can access
these clocks without restrictions and at the same performance as the
existing system clocks.
- Avoid double notifications in the adjtimex() syscall. Not a big issue,
but a trivial to avoid latency source.
-----BEGIN PGP SIGNATURE-----
iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAmiGo/MTHHRnbHhAbGlu
dXRyb25peC5kZQAKCRCmGPVMDXSYoWTIEACy/HC2OD7IbAzECgwQUvo59xvmw6ak
p1wRNYpTjOLUgWZjcl/jQfh7jYe60/0PrIzYgAeltXSGwOVtqQDrUIWrTKrAHOUa
wqKUCEfCucTUJRKLQ1Ktnjy/2Pp0Ojpf32Av0v/wgLUMxQk9Av39UdQwMOGyoHOa
07//lrVzfYfqe5Ne7cmuZSbVcHlKyWpXtSvPiVhyk+tHZea4646Pz17sBeVsefps
41mxZBRk7VNiE8yWtRWYKcaXxE/0nYkptjhXOqgmNRTGB/WfyKavDYVLWe31XPrI
G3/QcAAJHBEYZgoGMHRn76L+NWNqnuxeFPtSOaGceBks7HwCKdfvrAwaJzJ1Mr22
IxeoPm0Fzdrtjy2L2PM1txGdEI2iErprCNtQolM4BCQzUslsukP/Fts+SOujgpnC
SJ9rbIIeKJzt2dW6kQai+xGw6WIpQyus7Lbt0sEcyBdWi5Bqvh9g1ZXUn8SHY2xx
/aaEBe2J1RGGhNhHD6bSLYMAXoKDPcpZIrwO+2N96Z18uYee0FU5g3JVKGfuuTdo
wYfpK79xsFmhaBDj8pYAJoU3y/v+WycE2pP2oFgBhN49Xcxo1yYIm5ECb9dYesfD
4nW/Av9bI/NR7J4MilXvw2hSmfapgNgcUGb6sgEuSD7M8/UXkfFbcbqp+RcUBiQj
0XGiNzqLnwgzBQ==
=TTjt
-----END PGP SIGNATURE-----
Merge tag 'timers-ptp-2025-07-27' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull timekeeping and VDSO updates from Thomas Gleixner:
- Introduce support for auxiliary timekeepers
PTP clocks can be disconnected from the universal CLOCK_TAI reality
for various reasons including regularatory requirements for
functional safety redundancy.
The kernel so far only supports a single notion of time, which means
that all clocks are correlated in frequency and only differ by offset
to each other.
Access to non-correlated PTP clocks has been available so far only
through the file descriptor based "POSIX clock IDs", which are
subject to locking and have to go all the way out to the hardware.
The access is not only horribly slow, as it has to go all the way out
to the NIC/PTP hardware, but that also prevents the kernel to read
the time of such clocks e.g. from the network stack, where it is
required for TSN networking both on the transmit and receive side
unless the hardware provides offloading.
The auxiliary clocks provide a mechanism to support arbitrary clocks
which are not correlated to the system clock. This is not restricted
to the PTP use case on purpose as there is no kernel side association
of these clocks to a particular PTP device because that's a pure user
space configuration decision. Having them independent allows to
utilize them for other purposes and also enables them to be tested
without hardware dependencies.
To avoid pointless overhead these clocks have to be enabled
individualy via a new sysfs interface to reduce the overhead to a
single compare in the hotpath if they are enabled at the Kconfig
level at all.
These clocks utilize the existing timekeeping/NTP infrastructures,
which has been made possible over the recent releases by incrementaly
converting these infrastructures over from a single static instance
to a multi-instance pointer based implementation without any
performance regression reported.
The auxiliary clocks provide the same "emulation" of a "correct"
clock as the existing CLOCK_* variants do with an independent
instance of data and provide the same steering mechanism through the
existing sys_clock_adjtime() interface, which has been confirmed to
work by the chronyd(8) maintainer.
That allows to provide lockless kernel internal and VDSO support so
that applications and kernel internal functionalities can access
these clocks without restrictions and at the same performance as the
existing system clocks.
- Avoid double notifications in the adjtimex() syscall. Not a big
issue, but a trivial to avoid latency source.
* tag 'timers-ptp-2025-07-27' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (39 commits)
vdso/gettimeofday: Add support for auxiliary clocks
vdso/vsyscall: Update auxiliary clock data in the datapage
vdso: Introduce aux_clock_resolution_ns()
vdso/gettimeofday: Introduce vdso_get_timestamp()
vdso/gettimeofday: Introduce vdso_set_timespec()
vdso/gettimeofday: Introduce vdso_clockid_valid()
vdso/gettimeofday: Return bool from clock_gettime() helpers
vdso/gettimeofday: Return bool from clock_getres() helpers
vdso/helpers: Add helpers for seqlocks of single vdso_clock
vdso/vsyscall: Split up __arch_update_vsyscall() into __arch_update_vdso_clock()
vdso/vsyscall: Introduce a helper to fill clock configurations
timekeeping: Remove the temporary CLOCK_AUX workaround
timekeeping: Provide ktime_get_clock_ts64()
timekeeping: Provide interface to control auxiliary clocks
timekeeping: Provide update for auxiliary timekeepers
timekeeping: Provide adjtimex() for auxiliary clocks
timekeeping: Prepare do_adtimex() for auxiliary clocks
timekeeping: Make do_adjtimex() reusable
timekeeping: Add auxiliary clock support to __timekeeping_inject_offset()
timekeeping: Make timekeeping_inject_offset() reusable
...
Corrects MODULE_IMPORT_NS() syntax documentation, makes kunit_test timeout
configurable via a module parameter and a Kconfig option, fixes longest
symbol length test, adds a test for static stub, and adjusts kunit_test
timeout based on test_{suite,case} speed.
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEPZKym/RZuOCGeA/kCwJExA0NQxwFAmiH3FMACgkQCwJExA0N
QxzTJQ/5AZiQXo6dbUTkmHuG14YZtWVnZ9AJMqP7tfr5s3kXRL8v96RqbLAV+2GY
jZfZkcQInzQRw464NmJcad9V0fPEUQo8Hn5lx2iPQpAUy8c0EFim4vXs9zrO246R
VFx9USrGBlhm6NAtPbbvTgcUtigaciur9/uEb4r4lTWB12dgYqfjGsII0ZYoNEUg
AXlDq950AnZktnXoN/4/fybwtB9NnYriobUn7jrDDcVoXtpSCMgWyBxsWhpEnl6P
TRMcYjs6mC1Txbfrhj9GiS/qcAVg4g9Ft3JtFc3MosyWF8tA/v9RTMZcPTqSpoDu
di94Mb4FXm9NpEk6nFCLLR+zYXbTsLLkRUmE1Xx6/uCmmocF2vHH302rH6exZ5iT
N9jXPPGyEk9FimZSKqf7hoZpPK5lDSqQhTttBJaNEw3j+Gx4cil0Sz02mHZc/5nE
ng1DjMsf1a5XRWoyxWjXuWcD1sjvLl3yvLHxD50QTSNsTVr6DJTLUipeBqC7xxGc
1uHZDjnMWAgHr8XHF4hyIy/eNPjns/htCAoFCspB8iLhMEkwQmjExaT/Z5lOwvMs
QamUz7SCcGPqUe+qG9zS3l4PqybZQvDgLygBNwDZokfLolHCLKizzISfDz0ktaK8
JyBCX3pxZfuAnBs8q5rt7NV4xOgMpp/xOIxqejWZ+CN5ku3vTb0=
=WIJH
-----END PGP SIGNATURE-----
Merge tag 'linux_kselftest-kunit-6.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest
Pull kunit updates from Shuah Khan:
"Correct MODULE_IMPORT_NS() syntax documentation, make kunit_test
timeout configurable via a module parameter and a Kconfig option, fix
longest symbol length test, add a test for static stub, and adjust
kunit_test timeout based on test_{suite,case} speed"
* tag 'linux_kselftest-kunit-6.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
kunit: fix longest symbol length test
kunit: Make default kunit_test timeout configurable via both a module parameter and a Kconfig option
kunit: Adjust kunit_test timeout based on test_{suite,case} speed
kunit: Add test for static stub
Documentation: kunit: Correct MODULE_IMPORT_NS() syntax
Here is the big set of char/misc/iio and other smaller driver subsystems
for 6.17-rc1. It's a big set this time around, with the huge majority
being in the iio subsystem with new drivers and dts files being added
there.
Highlights include:
- IIO driver updates, additions, and changes making more code const
and cleaning up some init logic
- bus_type constant conversion changes
- misc device test functions added
- rust miscdevice minor fixup
- unused function removals for some drivers
- mei driver updates
- mhi driver updates
- interconnect driver updates
- Android binder updates and test infrastructure added
- small cdx driver updates
- small comedi fixes
- small nvmem driver updates
- small pps driver updates
- some acrn virt driver fixes for printk messages
- other small driver updates
All of these have been in linux-next with no reported issues.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCaIeYDg8cZ3JlZ0Brcm9h
aC5jb20ACgkQMUfUDdst+yk3dwCdF6xoEVZaDkLM5IF3XKWWgdYxjNsAoKUy2kUx
YtmVh4S0tMmugfeRGac7
=3Wxi
-----END PGP SIGNATURE-----
Merge tag 'char-misc-6.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc
Pull char / misc / IIO / other driver updates from Greg KH:
"Here is the big set of char/misc/iio and other smaller driver
subsystems for 6.17-rc1. It's a big set this time around, with the
huge majority being in the iio subsystem with new drivers and dts
files being added there.
Highlights include:
- IIO driver updates, additions, and changes making more code const
and cleaning up some init logic
- bus_type constant conversion changes
- misc device test functions added
- rust miscdevice minor fixup
- unused function removals for some drivers
- mei driver updates
- mhi driver updates
- interconnect driver updates
- Android binder updates and test infrastructure added
- small cdx driver updates
- small comedi fixes
- small nvmem driver updates
- small pps driver updates
- some acrn virt driver fixes for printk messages
- other small driver updates
All of these have been in linux-next with no reported issues"
* tag 'char-misc-6.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (292 commits)
binder: Use seq_buf in binder_alloc kunit tests
binder: Add copyright notice to new kunit files
misc: ti_fpc202: Switch to of_fwnode_handle()
bus: moxtet: Use dev_fwnode()
pc104: move PC104 option to drivers/Kconfig
drivers: virt: acrn: Don't use %pK through printk
comedi: fix race between polling and detaching
interconnect: qcom: Add Milos interconnect provider driver
dt-bindings: interconnect: document the RPMh Network-On-Chip Interconnect in Qualcomm Milos SoC
mei: more prints with client prefix
mei: bus: use cldev in prints
bus: mhi: host: pci_generic: Add Telit FN990B40 modem support
bus: mhi: host: Detect events pointing to unexpected TREs
bus: mhi: host: pci_generic: Add Foxconn T99W696 modem
bus: mhi: host: Use str_true_false() helper
bus: mhi: host: pci_generic: Add support for EM929x and set MRU to 32768 for better performance.
bus: mhi: host: Fix endianness of BHI vector table
bus: mhi: host: pci_generic: Disable runtime PM for QDU100
bus: mhi: host: pci_generic: Fix the modem name of Foxconn T99W640
dt-bindings: interconnect: qcom,msm8998-bwmon: Allow 'nonposted-mmio'
...
Add KUnit test suites for the Poly1305, SHA-1, SHA-224, SHA-256,
SHA-384, and SHA-512 library functions.
These are the first KUnit tests for lib/crypto/. So in addition to
being useful tests for these specific algorithms, they also establish
some conventions for lib/crypto/ testing going forwards.
The new tests are fairly comprehensive: more comprehensive than the
generic crypto infrastructure's tests. They use a variety of
techniques to check for the types of implementation bugs that tend to
occur in the real world, rather than just naively checking some test
vectors. (Interestingly, poly1305_kunit found a bug in QEMU.)
The core test logic is shared by all six algorithms, rather than being
duplicated for each algorithm.
Each algorithm's test suite also optionally includes a benchmark.
-----BEGIN PGP SIGNATURE-----
iIoEABYIADIWIQSacvsUNc7UX4ntmEPzXCl4vpKOKwUCaIZ+WhQcZWJpZ2dlcnNA
a2VybmVsLm9yZwAKCRDzXCl4vpKOK+C4AQCOi7iOlFLouUfu9klrovp3i/iSMhyQ
gUEHPGSelBy4wQD+NnrLGIdpCcaDAzyWpT4TxG6esN2/97ewh4VUa2MDuQQ=
=rS0F
-----END PGP SIGNATURE-----
Merge tag 'libcrypto-tests-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux
Pull crypto library test updates from Eric Biggers:
"Add KUnit test suites for the Poly1305, SHA-1, SHA-224, SHA-256,
SHA-384, and SHA-512 library functions.
These are the first KUnit tests for lib/crypto/. So in addition to
being useful tests for these specific algorithms, they also establish
some conventions for lib/crypto/ testing going forwards.
The new tests are fairly comprehensive: more comprehensive than the
generic crypto infrastructure's tests. They use a variety of
techniques to check for the types of implementation bugs that tend to
occur in the real world, rather than just naively checking some test
vectors. (Interestingly, poly1305_kunit found a bug in QEMU)
The core test logic is shared by all six algorithms, rather than being
duplicated for each algorithm.
Each algorithm's test suite also optionally includes a benchmark"
* tag 'libcrypto-tests-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux:
lib/crypto: tests: Annotate worker to be on stack
lib/crypto: tests: Add KUnit tests for SHA-1 and HMAC-SHA1
lib/crypto: tests: Add KUnit tests for Poly1305
lib/crypto: tests: Add KUnit tests for SHA-384 and SHA-512
lib/crypto: tests: Add KUnit tests for SHA-224 and SHA-256
lib/crypto: tests: Add hash-test-template.h and gen-hash-testvecs.py
This is the main crypto library pull request for 6.17. The main focus
this cycle is on reorganizing the SHA-1 and SHA-2 code, providing
high-quality library APIs for SHA-1 and SHA-2 including HMAC support,
and establishing conventions for lib/crypto/ going forward:
- Migrate the SHA-1 and SHA-512 code (and also SHA-384 which shares
most of the SHA-512 code) into lib/crypto/. This includes both the
generic and architecture-optimized code. Greatly simplify how the
architecture-optimized code is integrated. Add an easy-to-use
library API for each SHA variant, including HMAC support. Finally,
reimplement the crypto_shash support on top of the library API.
- Apply the same reorganization to the SHA-256 code (and also SHA-224
which shares most of the SHA-256 code). This is a somewhat smaller
change, due to my earlier work on SHA-256. But this brings in all
the same additional improvements that I made for SHA-1 and SHA-512.
There are also some smaller changes:
- Move the architecture-optimized ChaCha, Poly1305, and BLAKE2s code
from arch/$(SRCARCH)/lib/crypto/ to lib/crypto/$(SRCARCH)/. For
these algorithms it's just a move, not a full reorganization yet.
- Fix the MIPS chacha-core.S to build with the clang assembler.
- Fix the Poly1305 functions to work in all contexts.
- Fix a performance regression in the x86_64 Poly1305 code.
- Clean up the x86_64 SHA-NI optimized SHA-1 assembly code.
Note that since the new organization of the SHA code is much simpler,
the diffstat of this pull request is negative, despite the addition of
new fully-documented library APIs for multiple SHA and HMAC-SHA
variants. These APIs will allow further simplifications across the
kernel as users start using them instead of the old-school crypto API.
(I've already written a lot of such conversion patches, removing over
1000 more lines of code. But most of those will target 6.18 or later.)
-----BEGIN PGP SIGNATURE-----
iIoEABYIADIWIQSacvsUNc7UX4ntmEPzXCl4vpKOKwUCaIZ93BQcZWJpZ2dlcnNA
a2VybmVsLm9yZwAKCRDzXCl4vpKOK8HCAQD3O9P0qd6wscne5XuRwaybzKHQ2AqU
OlhlDZWQQEvYAgD/aa6KP/DS+8RKGj0TBn6bACAJyXyDygFXq5a5s9pGzAs=
=UmMM
-----END PGP SIGNATURE-----
Merge tag 'libcrypto-updates-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux
Pull crypto library updates from Eric Biggers:
"This is the main crypto library pull request for 6.17. The main focus
this cycle is on reorganizing the SHA-1 and SHA-2 code, providing
high-quality library APIs for SHA-1 and SHA-2 including HMAC support,
and establishing conventions for lib/crypto/ going forward:
- Migrate the SHA-1 and SHA-512 code (and also SHA-384 which shares
most of the SHA-512 code) into lib/crypto/. This includes both the
generic and architecture-optimized code. Greatly simplify how the
architecture-optimized code is integrated. Add an easy-to-use
library API for each SHA variant, including HMAC support. Finally,
reimplement the crypto_shash support on top of the library API.
- Apply the same reorganization to the SHA-256 code (and also SHA-224
which shares most of the SHA-256 code). This is a somewhat smaller
change, due to my earlier work on SHA-256. But this brings in all
the same additional improvements that I made for SHA-1 and SHA-512.
There are also some smaller changes:
- Move the architecture-optimized ChaCha, Poly1305, and BLAKE2s code
from arch/$(SRCARCH)/lib/crypto/ to lib/crypto/$(SRCARCH)/. For
these algorithms it's just a move, not a full reorganization yet.
- Fix the MIPS chacha-core.S to build with the clang assembler.
- Fix the Poly1305 functions to work in all contexts.
- Fix a performance regression in the x86_64 Poly1305 code.
- Clean up the x86_64 SHA-NI optimized SHA-1 assembly code.
Note that since the new organization of the SHA code is much simpler,
the diffstat of this pull request is negative, despite the addition of
new fully-documented library APIs for multiple SHA and HMAC-SHA
variants.
These APIs will allow further simplifications across the kernel as
users start using them instead of the old-school crypto API. (I've
already written a lot of such conversion patches, removing over 1000
more lines of code. But most of those will target 6.18 or later)"
* tag 'libcrypto-updates-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux: (67 commits)
lib/crypto: arm64/sha512-ce: Drop compatibility macros for older binutils
lib/crypto: x86/sha1-ni: Convert to use rounds macros
lib/crypto: x86/sha1-ni: Minor optimizations and cleanup
crypto: sha1 - Remove sha1_base.h
lib/crypto: x86/sha1: Migrate optimized code into library
lib/crypto: sparc/sha1: Migrate optimized code into library
lib/crypto: s390/sha1: Migrate optimized code into library
lib/crypto: powerpc/sha1: Migrate optimized code into library
lib/crypto: mips/sha1: Migrate optimized code into library
lib/crypto: arm64/sha1: Migrate optimized code into library
lib/crypto: arm/sha1: Migrate optimized code into library
crypto: sha1 - Use same state format as legacy drivers
crypto: sha1 - Wrap library and add HMAC support
lib/crypto: sha1: Add HMAC support
lib/crypto: sha1: Add SHA-1 library functions
lib/crypto: sha1: Rename sha1_init() to sha1_init_raw()
crypto: x86/sha1 - Rename conflicting symbol
lib/crypto: sha2: Add hmac_sha*_init_usingrawkey()
lib/crypto: arm/poly1305: Remove unneeded empty weak function
lib/crypto: x86/poly1305: Fix performance regression on short messages
...
Updates for the kernel's CRC (cyclic redundancy check) code:
- Reorganize the architecture-optimized CRC code. It now lives in
lib/crc/$(SRCARCH)/ rather than arch/$(SRCARCH)/lib/, and it is no
longer artificially split into separate generic and arch modules.
This allows better inlining and dead code elimination. The generic
CRC code is also no longer exported, simplifying the API. (This
mirrors the similar changes to SHA-1 and SHA-2 in lib/crypto/,
which can be found in the "Crypto library updates" pull request.)
- Improve crc32c() performance on newer x86_64 CPUs on long messages
by enabling the VPCLMULQDQ optimized code.
- Simplify the crypto_shash wrappers for crc32_le() and crc32c().
Register just one shash algorithm for each that uses the (fully
optimized) library functions, instead of unnecessarily providing
direct access to the generic CRC code.
- Remove unused and obsolete drivers for hardware CRC engines.
- Remove CRC-32 combination functions that are no longer used.
- Add kerneldoc for crc32_le(), crc32_be(), and crc32c().
- Convert the crc32() macro to an inline function.
-----BEGIN PGP SIGNATURE-----
iIoEABYIADIWIQSacvsUNc7UX4ntmEPzXCl4vpKOKwUCaIZ8rRQcZWJpZ2dlcnNA
a2VybmVsLm9yZwAKCRDzXCl4vpKOK3yOAP9OuoCirD42ZHNSgQeGTzhhZ2jCHiPN
BPvHChwtE2MSRwEA0ddNX36aOiEKmpjog3TMllOIBz7wBrwZV7KgoX75+AU=
=uAY8
-----END PGP SIGNATURE-----
Merge tag 'crc-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux
Pull CRC updates from Eric Biggers:
- Reorganize the architecture-optimized CRC code
It now lives in lib/crc/$(SRCARCH)/ rather than arch/$(SRCARCH)/lib/,
and it is no longer artificially split into separate generic and arch
modules. This allows better inlining and dead code elimination
The generic CRC code is also no longer exported, simplifying the API.
(This mirrors the similar changes to SHA-1 and SHA-2 in lib/crypto/,
which can be found in the "Crypto library updates" pull request)
- Improve crc32c() performance on newer x86_64 CPUs on long messages by
enabling the VPCLMULQDQ optimized code
- Simplify the crypto_shash wrappers for crc32_le() and crc32c()
Register just one shash algorithm for each that uses the (fully
optimized) library functions, instead of unnecessarily providing
direct access to the generic CRC code
- Remove unused and obsolete drivers for hardware CRC engines
- Remove CRC-32 combination functions that are no longer used
- Add kerneldoc for crc32_le(), crc32_be(), and crc32c()
- Convert the crc32() macro to an inline function
* tag 'crc-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux: (26 commits)
lib/crc: x86/crc32c: Enable VPCLMULQDQ optimization where beneficial
lib/crc: x86: Reorganize crc-pclmul static_call initialization
lib/crc: crc64: Add include/linux/crc64.h to kernel-api.rst
lib/crc: crc32: Change crc32() from macro to inline function and remove cast
nvmem: layouts: Switch from crc32() to crc32_le()
lib/crc: crc32: Document crc32_le(), crc32_be(), and crc32c()
lib/crc: Explicitly include <linux/export.h>
lib/crc: Remove ARCH_HAS_* kconfig symbols
lib/crc: x86: Migrate optimized CRC code into lib/crc/
lib/crc: sparc: Migrate optimized CRC code into lib/crc/
lib/crc: s390: Migrate optimized CRC code into lib/crc/
lib/crc: riscv: Migrate optimized CRC code into lib/crc/
lib/crc: powerpc: Migrate optimized CRC code into lib/crc/
lib/crc: mips: Migrate optimized CRC code into lib/crc/
lib/crc: loongarch: Migrate optimized CRC code into lib/crc/
lib/crc: arm64: Migrate optimized CRC code into lib/crc/
lib/crc: arm: Migrate optimized CRC code into lib/crc/
lib/crc: Prepare for arch-optimized code in subdirs of lib/crc/
lib/crc: Move files into lib/crc/
lib/crc32: Remove unused combination support
...
- Introduce and start using TRAILING_OVERLAP() helper for fixing
embedded flex array instances (Gustavo A. R. Silva)
- mux: Convert mux_control_ops to a flex array member in mux_chip
(Thorsten Blum)
- string: Group str_has_prefix() and strstarts() (Andy Shevchenko)
- Remove KCOV instrumentation from __init and __head (Ritesh Harjani,
Kees Cook)
- Refactor and rename stackleak feature to support Clang
- Add KUnit test for seq_buf API
- Fix KUnit fortify test under LTO
-----BEGIN PGP SIGNATURE-----
iHUEABYKAB0WIQRSPkdeREjth1dHnSE2KwveOeQkuwUCaIfUkgAKCRA2KwveOeQk
uypLAP92r6f47sWcOw/5B9aVffX6Bypsb7dqBJQpCNxI5U1xcAEAiCrZ98UJyOeQ
JQgnXd4N67K4EsS2JDc+FutRn3Yi+A8=
=+5Bq
-----END PGP SIGNATURE-----
Merge tag 'hardening-v6.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux
Pull hardening updates from Kees Cook:
- Introduce and start using TRAILING_OVERLAP() helper for fixing
embedded flex array instances (Gustavo A. R. Silva)
- mux: Convert mux_control_ops to a flex array member in mux_chip
(Thorsten Blum)
- string: Group str_has_prefix() and strstarts() (Andy Shevchenko)
- Remove KCOV instrumentation from __init and __head (Ritesh Harjani,
Kees Cook)
- Refactor and rename stackleak feature to support Clang
- Add KUnit test for seq_buf API
- Fix KUnit fortify test under LTO
* tag 'hardening-v6.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: (22 commits)
sched/task_stack: Add missing const qualifier to end_of_stack()
kstack_erase: Support Clang stack depth tracking
kstack_erase: Add -mgeneral-regs-only to silence Clang warnings
init.h: Disable sanitizer coverage for __init and __head
kstack_erase: Disable kstack_erase for all of arm compressed boot code
x86: Handle KCOV __init vs inline mismatches
arm64: Handle KCOV __init vs inline mismatches
s390: Handle KCOV __init vs inline mismatches
arm: Handle KCOV __init vs inline mismatches
mips: Handle KCOV __init vs inline mismatch
powerpc/mm/book3s64: Move kfence and debug_pagealloc related calls to __init section
configs/hardening: Enable CONFIG_INIT_ON_FREE_DEFAULT_ON
configs/hardening: Enable CONFIG_KSTACK_ERASE
stackleak: Split KSTACK_ERASE_CFLAGS from GCC_PLUGINS_CFLAGS
stackleak: Rename stackleak_track_stack to __sanitizer_cov_stack_depth
stackleak: Rename STACKLEAK to KSTACK_ERASE
seq_buf: Introduce KUnit tests
string: Group str_has_prefix() and strstarts()
kunit/fortify: Add back "volatile" for sizeof() constants
acpi: nfit: intel: avoid multiple -Wflex-array-member-not-at-end warnings
...
-----BEGIN PGP SIGNATURE-----
iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAmiHdZ8QHGF4Ym9lQGtl
cm5lbC5kawAKCRD301j7KXHgptRED/9o3dQ1QHL5yNM/AyCCGox0V4zra8qGS/Vc
cBWpAVrmPGRw0IYlLZENtN9PdwKcbMzJq3l6cxeC7dBnAZP0AxTzP4YYJYUNVsqo
WtJ3d/k5+cVp0OyOp4uabaqNeMeLoPk9/JXe1Ml2KxtDmHtj5yee0JRh7zlPZmZj
tsrpIUTeHgAPn6yR1EI+0ybx/mjCb05Mv2Y8gF5hkUPA2PuON+MTFixJmqoy2ySh
n+22mz/prqlyOSYh/VVv1+9jcQ94wMjcW0JIpg9lM3Kg8BCPU4IetvO1UiX6X33v
154zEh2aJJDBx+yORS4BM4JMXjRZI7lYea2dkHM8Cajctu1Wpja9bNwnK9ibXvEc
WtyBwztleLbAZef25fA/W87JE23fGa/r3nwIb2cF4QqkAFslCvhjA93WkOzNJCgQ
qsWOrlCh3IK2NUu4b1Ncs3ZHOPvc51+zzjMzC6SUr54xhrxDK+gngDPhRy7XDqWJ
DTMpIlr366o8GdJqnib0/e/CPBrThS6Vl6u0tgLnNbwdpK1svgo/uHW5ksKvDqHX
kGEIhyRRJJC+4wyl4dsYKXa2twcyFrlWdAE+pZguEC2nZRYqYl9uXftOtvfp1x0y
/skDX0FIDjvyjRqCLcqF03FSGqwCGS8WuWXZjPhVhcfz47NvbHeFDh1G/jMzsbpj
S9zrPve/DQ==
=e86T
-----END PGP SIGNATURE-----
Merge tag 'for-6.17/block-20250728' of git://git.kernel.dk/linux
Pull block updates from Jens Axboe:
- MD pull request via Yu:
- call del_gendisk synchronously (Xiao)
- cleanup unused variable (John)
- cleanup workqueue flags (Ryo)
- fix faulty rdev can't be removed during resync (Qixing)
- NVMe pull request via Christoph:
- try PCIe function level reset on init failure (Keith Busch)
- log TLS handshake failures at error level (Maurizio Lombardi)
- pci-epf: do not complete commands twice if nvmet_req_init()
fails (Rick Wertenbroek)
- misc cleanups (Alok Tiwari)
- Removal of the pktcdvd driver
This has been more than a decade coming at this point, and some
recently revealed breakages that had it causing issues even for cases
where it isn't required made me re-pull the trigger on this one. It's
known broken and nobody has stepped up to maintain the code
- Series for ublk supporting batch commands, enabling the use of
multishot where appropriate
- Speed up ublk exit handling
- Fix for the two-stage elevator fixing which could leak data
- Convert NVMe to use the new IOVA based API
- Increase default max transfer size to something more reasonable
- Series fixing write operations on zoned DM devices
- Add tracepoints for zoned block device operations
- Prep series working towards improving blk-mq queue management in the
presence of isolated CPUs
- Don't allow updating of the block size of a loop device that is
currently under exclusively ownership/open
- Set chunk sectors from stacked device stripe size and use it for the
atomic write size limit
- Switch to folios in bcache read_super()
- Fix for CD-ROM MRW exit flush handling
- Various tweaks, fixes, and cleanups
* tag 'for-6.17/block-20250728' of git://git.kernel.dk/linux: (94 commits)
block: restore two stage elevator switch while running nr_hw_queue update
cdrom: Call cdrom_mrw_exit from cdrom_release function
sunvdc: Balance device refcount in vdc_port_mpgroup_check
nvme-pci: try function level reset on init failure
dm: split write BIOs on zone boundaries when zone append is not emulated
block: use chunk_sectors when evaluating stacked atomic write limits
dm-stripe: limit chunk_sectors to the stripe size
md/raid10: set chunk_sectors limit
md/raid0: set chunk_sectors limit
block: sanitize chunk_sectors for atomic write limits
ilog2: add max_pow_of_two_factor()
nvmet: pci-epf: Do not complete commands twice if nvmet_req_init() fails
nvme-tcp: log TLS handshake failures at error level
docs: nvme: fix grammar in nvme-pci-endpoint-target.rst
nvme: fix typo in status code constant for self-test in progress
nvmet: remove redundant assignment of error code in nvmet_ns_enable()
nvme: fix incorrect variable in io cqes error message
nvme: fix multiple spelling and grammar issues in host drivers
block: fix blk_zone_append_update_request_bio() kernel-doc
md/raid10: fix set but not used variable in sync_request_write()
...
Move both uevent_helper table into lib/kobject_uevent.c. Place the
registration early in the initcall order with postcore_initcall.
This is part of a greater effort to move ctl tables into their
respective subsystems which will reduce the merge conflicts in
kernel/sysctl.c.
Signed-off-by: Joel Granados <joel.granados@kernel.org>
In preparation for adding Clang sanitizer coverage stack depth tracking
that can support stack depth callbacks:
- Add the new top-level CONFIG_KSTACK_ERASE option which will be
implemented either with the stackleak GCC plugin, or with the Clang
stack depth callback support.
- Rename CONFIG_GCC_PLUGIN_STACKLEAK as needed to CONFIG_KSTACK_ERASE,
but keep it for anything specific to the GCC plugin itself.
- Rename all exposed "STACKLEAK" names and files to "KSTACK_ERASE" (named
for what it does rather than what it protects against), but leave as
many of the internals alone as possible to avoid even more churn.
While here, also split "prev_lowest_stack" into CONFIG_KSTACK_ERASE_METRICS,
since that's the only place it is referenced from.
Suggested-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20250717232519.2984886-1-kees@kernel.org
Signed-off-by: Kees Cook <kees@kernel.org>
The following warning traceback is seen if object debugging is enabled
with the new crypto test code.
ODEBUG: object 9000000106237c50 is on stack 9000000106234000, but NOT annotated.
------------[ cut here ]------------
WARNING: lib/debugobjects.c:655 at lookup_object_or_alloc.part.0+0x19c/0x1f4, CPU#0: kunit_try_catch/468
...
This also results in a boot stall when running the code in qemu:loongarch.
Initializing the worker with INIT_WORK_ONSTACK() fixes the problem.
Fixes: 950a81224e ("lib/crypto: tests: Add hash-test-template.h and gen-hash-testvecs.py")
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250721231917.3182029-1-linux@roeck-us.net
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Now that the oldest supported binutils version is 2.30, the macros that
emit the SHA-512 instructions as '.inst' words are no longer needed. So
drop them. No change in the generated machine code.
Changed from the original patch by Ard Biesheuvel:
(https://lore.kernel.org/r/20250515142702.2592942-2-ardb+git@google.com):
- Reduced scope to just SHA-512
- Added comment that explains why "sha3" is used instead of "sha2"
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250718220706.475240-1-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
The assembly code that does all 80 rounds of SHA-1 is highly repetitive.
Replace it with 20 expansions of a macro that does 4 rounds, using the
macro arguments and .if directives to handle the slight variations
between rounds. This reduces the length of sha1-ni-asm.S by 129 lines
while still producing the exact same object file. This mirrors
sha256-ni-asm.S which uses this same strategy.
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250718191900.42877-3-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
- Store the previous state in %xmm8-%xmm9 instead of spilling it to the
stack. There are plenty of unused XMM registers here, so there is no
reason to spill to the stack. (While 32-bit code is limited to
%xmm0-%xmm7, this is 64-bit code, so it's free to use %xmm8-%xmm15.)
- Remove the unnecessary check for nblocks == 0. sha1_ni_transform() is
always passed a positive nblocks.
- To get an XMM register with 'e' in the high dword and the rest zeroes,
just zeroize the register using pxor, then load 'e'. Previously the
code loaded 'e', then zeroized the lower dwords by AND-ing with a
constant, which was slightly less efficient.
- Instead of computing &DATA_PTR[NBLOCKS << 6] and stopping when
DATA_PTR reaches that value, instead just decrement NBLOCKS on each
iteration and stop when it reaches 0. This is fewer instructions.
- Rename DIGEST_PTR to STATE_PTR. It points to the SHA-1 internal
state, not a SHA-1 digest value.
This commit shrinks the code size of sha1_ni_transform() from 624 bytes
to 589 bytes and also shrinks rodata by 16 bytes.
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250718191900.42877-2-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Improve crc32c() performance on lengths >= 512 bytes by using
crc32_lsb_vpclmul_avx512() instead of crc32c_x86_3way(), when the CPU
supports VPCLMULQDQ and has a "good" implementation of AVX-512. For now
that means AMD Zen 4 and later, and Intel Sapphire Rapids and later.
Pass crc32_lsb_vpclmul_avx512() the table of constants needed to make it
use the CRC-32C polynomial.
Rationale: VPCLMULQDQ performance has improved on newer CPUs, making
crc32_lsb_vpclmul_avx512() faster than crc32c_x86_3way(), even though
crc32_lsb_vpclmul_avx512() is designed for generic 32-bit CRCs and does
not utilize x86_64's dedicated CRC-32C instructions.
Performance results for len=4096 using crc_kunit:
CPU Before (MB/s) After (MB/s)
====================== ============= ============
AMD Zen 4 (Genoa) 19868 28618
AMD Zen 5 (Ryzen AI 9 365) 24080 46940
AMD Zen 5 (Turin) 29566 58468
Intel Sapphire Rapids 22340 73794
Intel Emerald Rapids 24696 78666
Performance results for len=512 using crc_kunit:
CPU Before (MB/s) After (MB/s)
====================== ============= ============
AMD Zen 4 (Genoa) 7251 7758
AMD Zen 5 (Ryzen AI 9 365) 17481 19135
AMD Zen 5 (Turin) 21332 25424
Intel Sapphire Rapids 18886 29312
Intel Emerald Rapids 19675 29045
That being said, in the above benchmarks the ZMM registers are "warm",
so they don't quite tell the whole story. While significantly improved
from older Intel CPUs, Intel still has ~2000 ns of ZMM warm-up time
where 512-bit instructions execute 4 times more slowly than they
normally do. In contrast, AMD does better and has virtually zero ZMM
warm-up time (at most ~60 ns). Thus, while this change is always
beneficial on AMD, strictly speaking there are cases in which it is not
beneficial on Intel, e.g. a small number of 512-byte messages with
"cold" ZMM registers. But typically, it is beneficial even on Intel.
Note that on AMD Zen 3--5, crc32c() performance could be further
improved with implementations that interleave crc32q and VPCLMULQDQ
instructions. Unfortunately, it appears that a different such
implementation would be optimal on *each* of these microarchitectures.
Such improvements are left for future work. This commit just improves
the way that we choose the implementations we already have.
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250719224938.126512-3-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Reorganize the crc-pclmul static_call initialization to place more of
the logic in the *_mod_init_arch() functions instead of in the
INIT_CRC_PCLMUL macro. This provides the flexibility to do more than a
single static_call update for each CPU feature check. Right away,
optimize crc64_mod_init_arch() to check the CPU features just once
instead of twice, doing both the crc64_msb and crc64_lsb static_call
updates together. A later commit will also use this to initialize an
additional static_key when crc32_lsb_vpclmul_avx512() is enabled.
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250719224938.126512-2-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Update lib/raid6/recov_rvv.c, for 1857fcc847 ("lib/raid6: replace custom
zero page with ZERO_PAGE"), per Klara.
Link: https://lkml.kernel.org/r/aFkUnXWtxcgOTVkw@gondor.apana.org.au
Fixes: 1857fcc847 ("lib/raid6: replace custom zero page with ZERO_PAGE")
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Song Liu <song@kernel.org>
Cc: Yu Kuai <yukuai3@huawei.com>
Cc: Klara Modin <klarasmodin@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Patch series "Optimize GCD performance on RISC-V by selecting
implementation at runtime", v3.
The current implementation of gcd() selects between the binary GCD and the
odd-even GCD algorithm at compile time, depending on whether
CONFIG_CPU_NO_EFFICIENT_FFS is set. On platforms like RISC-V, however,
this compile-time decision can be misleading: even when the compiler emits
ctz instructions based on the assumption that they are efficient (as is
the case when CONFIG_RISCV_ISA_ZBB is enabled), the actual hardware may
lack support for the Zbb extension. In such cases, ffs() falls back to a
software implementation at runtime, making the binary GCD algorithm
significantly slower than the odd-even variant.
To address this, we introduce a static key to allow runtime selection
between the binary and odd-even GCD implementations. On RISC-V, the
kernel now checks for Zbb support during boot. If Zbb is unavailable, the
static key is disabled so that gcd() consistently uses the more efficient
odd-even algorithm in that scenario. Additionally, to further reduce code
size, we select CONFIG_CPU_NO_EFFICIENT_FFS automatically when
CONFIG_RISCV_ISA_ZBB is not enabled, avoiding compilation of the unused
binary GCD implementation entirely on systems where it would never be
executed.
This series ensures that the most efficient GCD algorithm is used in
practice and avoids compiling unnecessary code based on hardware
capabilities and kernel configuration.
This patch (of 3):
On platforms like RISC-V, the compiler may generate hardware FFS
instructions even if the underlying CPU does not actually support them.
Currently, the GCD implementation is chosen at compile time based on
CONFIG_CPU_NO_EFFICIENT_FFS, which can result in suboptimal behavior on
such systems.
Introduce a static key, efficient_ffs_key, to enable runtime selection
between the binary GCD (using ffs) and the odd-even GCD implementation.
This allows the kernel to default to the faster binary GCD when FFS is
efficient, while retaining the ability to fall back when needed.
Link: https://lkml.kernel.org/r/20250606134758.1308400-1-visitorckw@gmail.com
Link: https://lkml.kernel.org/r/20250606134758.1308400-2-visitorckw@gmail.com
Co-developed-by: Yu-Chun Lin <eleanor15x@gmail.com>
Signed-off-by: Yu-Chun Lin <eleanor15x@gmail.com>
Signed-off-by: Kuan-Wei Chiu <visitorckw@gmail.com>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Ching-Chun (Jim) Huang <jserv@ccns.ncku.edu.tw>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Alexandre Ghiti <alexghiti@rivosinc.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Bitmap definition for 'panic_print' is hard to remember and decode. Add
'panic_sys_info='sysctl to take human readable string like
"tasks,mem,timers,locks,ftrace,..." and translate it into bitmap.
The detailed mapping is:
SYS_INFO_TASKS "tasks"
SYS_INFO_MEM "mem"
SYS_INFO_TIMERS "timers"
SYS_INFO_LOCKS "locks"
SYS_INFO_FTRACE "ftrace"
SYS_INFO_ALL_CPU_BT "all_bt"
SYS_INFO_BLOCKED_TASKS "blocked_tasks"
[nathan@kernel.org: add __maybe_unused to sys_info_avail]
Link: https://lkml.kernel.org/r/20250708-fix-clang-sys_info_avail-warning-v1-1-60d239eacd64@kernel.org
Link: https://lkml.kernel.org/r/20250703021004.42328-4-feng.tang@linux.alibaba.com
Signed-off-by: Feng Tang <feng.tang@linux.alibaba.com>
Suggested-by: Petr Mladek <pmladek@suse.com>
Cc: John Ogness <john.ogness@linutronix.de>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Lance Yang <lance.yang@linux.dev>
Cc: "Paul E . McKenney" <paulmck@kernel.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
'panic_print' was introduced to help debugging kernel panic by dumping
different kinds of system information like tasks' call stack, memory,
ftrace buffer, etc. Actually this function could also be used to help
debugging other cases like task-hung, soft/hard lockup, etc. where user
may need the snapshot of system info at that time.
Extract system info dump function related code from panic.c to separate
file sys_info.[ch], for wider usage by other kernel parts for debugging.
Also modify the macro names about singulars/plurals.
Link: https://lkml.kernel.org/r/20250703021004.42328-3-feng.tang@linux.alibaba.com
Signed-off-by: Feng Tang <feng.tang@linux.alibaba.com>
Suggested-by: Petr Mladek <pmladek@suse.com>
Cc: John Ogness <john.ogness@linutronix.de>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Lance Yang <lance.yang@linux.dev>
Cc: "Paul E . McKenney" <paulmck@kernel.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Expose the auxiliary clocks through the vDSO.
Architectures not using the generic vDSO time framework,
namely SPARC64, are not supported.
Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/all/20250701-vdso-auxclock-v1-12-df7d9f87b9b8@linutronix.de
The internal helpers are effectively using boolean results,
while pretending to use error numbers.
Switch the return type to bool for more clarity.
Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/all/20250701-vdso-auxclock-v1-6-df7d9f87b9b8@linutronix.de
Tests can allocate from virtual memory using kunit_vm_mmap(), which
transparently creates and attaches an mm_struct to the test runner if
one is not already attached. This is suitable for most cases, except for
when the code under test must access a task's mm before performing an
mmap. Expose kunit_attach_mm() as part of the interface for those
cases. This does not change the existing behavior.
Cc: David Gow <davidgow@google.com>
Signed-off-by: Tiffany Yang <ynaffit@google.com>
Reviewed-by: Carlos Llamas <cmllamas@google.com>
Link: https://lore.kernel.org/r/20250714185321.2417234-4-ynaffit@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
It seems the Clang can see through OPTIMIZER_HIDE_VAR when the constant
is coming from sizeof. Adding "volatile" back to these variables solves
this false positive without reintroducing the issues that originally led
to switching to OPTIMIZER_HIDE_VAR in the first place[1].
Reported-by: Nathan Chancellor <nathan@kernel.org>
Closes: https://github.com/ClangBuiltLinux/linux/issues/2075 [1]
Cc: Jannik Glückert <jannik.glueckert@gmail.com>
Suggested-by: Nathan Chancellor <nathan@kernel.org>
Fixes: 6ee149f61b ("kunit/fortify: Replace "volatile" with OPTIMIZER_HIDE_VAR()")
Reviewed-by: Nathan Chancellor <nathan@kernel.org>
Link: https://lore.kernel.org/r/20250628234034.work.800-kees@kernel.org
Signed-off-by: Kees Cook <kees@kernel.org>
Add a KUnit test suite for the SHA-1 library functions, including the
corresponding HMAC support. The core test logic is in the
previously-added hash-test-template.h. This commit just adds the actual
KUnit suite, and it adds the generated test vectors to the tree so that
gen-hash-testvecs.py won't have to be run at build time.
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250712232329.818226-16-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Add a KUnit test suite for the Poly1305 functions. Most of its test
cases are instantiated from hash-test-template.h, which is also used by
the SHA-2 tests. A couple additional test cases are also included to
test edge cases specific to Poly1305.
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250709200112.258500-5-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Add KUnit test suites for the SHA-384 and SHA-512 library functions,
including the corresponding HMAC support. The core test logic is in the
previously-added hash-test-template.h. This commit just adds the actual
KUnit suites, and it adds the generated test vectors to the tree so that
gen-hash-testvecs.py won't have to be run at build time.
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250709200112.258500-4-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Add KUnit test suites for the SHA-224 and SHA-256 library functions,
including the corresponding HMAC support. The core test logic is in the
previously-added hash-test-template.h. This commit just adds the actual
KUnit suites, and it adds the generated test vectors to the tree so that
gen-hash-testvecs.py won't have to be run at build time.
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250709200112.258500-3-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Add hash-test-template.h which generates the following KUnit test cases
for hash functions:
test_hash_test_vectors
test_hash_all_lens_up_to_4096
test_hash_incremental_updates
test_hash_buffer_overruns
test_hash_overlaps
test_hash_alignment_consistency
test_hash_ctx_zeroization
test_hash_interrupt_context_1
test_hash_interrupt_context_2
test_hmac (when HMAC is supported)
benchmark_hash (when CONFIG_CRYPTO_LIB_BENCHMARK=y)
The initial use cases for this will be sha224_kunit, sha256_kunit,
sha384_kunit, sha512_kunit, and poly1305_kunit.
Add a Python script gen-hash-testvecs.py which generates the test
vectors required by test_hash_test_vectors,
test_hash_all_lens_up_to_4096, and test_hmac.
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250709200112.258500-2-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Instead of exposing the x86-optimized SHA-1 code via x86-specific
crypto_shash algorithms, instead just implement the sha1_blocks()
library function. This is much simpler, it makes the SHA-1 library
functions be x86-optimized, and it fixes the longstanding issue where
the x86-optimized SHA-1 code was disabled by default. SHA-1 still
remains available through crypto_shash, but individual architectures no
longer need to handle it.
To match sha1_blocks(), change the type of the nblocks parameter of the
assembly functions from int to size_t. The assembly functions actually
already treated it as size_t.
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250712232329.818226-14-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Instead of exposing the sparc-optimized SHA-1 code via sparc-specific
crypto_shash algorithms, instead just implement the sha1_blocks()
library function. This is much simpler, it makes the SHA-1 library
functions be sparc-optimized, and it fixes the longstanding issue where
the sparc-optimized SHA-1 code was disabled by default. SHA-1 still
remains available through crypto_shash, but individual architectures no
longer need to handle it.
Note: to see the diff from arch/sparc/crypto/sha1_glue.c to
lib/crypto/sparc/sha1.h, view this commit with 'git show -M10'.
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250712232329.818226-13-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Instead of exposing the s390-optimized SHA-1 code via s390-specific
crypto_shash algorithms, instead just implement the sha1_blocks()
library function. This is much simpler, it makes the SHA-1 library
functions be s390-optimized, and it fixes the longstanding issue where
the s390-optimized SHA-1 code was disabled by default. SHA-1 still
remains available through crypto_shash, but individual architectures no
longer need to handle it.
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250712232329.818226-12-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Instead of exposing the powerpc-optimized SHA-1 code via
powerpc-specific crypto_shash algorithms, instead just implement the
sha1_blocks() library function. This is much simpler, it makes the
SHA-1 library functions be powerpc-optimized, and it fixes the
longstanding issue where the powerpc-optimized SHA-1 code was disabled
by default. SHA-1 still remains available through crypto_shash, but
individual architectures no longer need to handle it.
Note: to see the diff from arch/powerpc/crypto/sha1-spe-glue.c to
lib/crypto/powerpc/sha1.h, view this commit with 'git show -M10'.
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250712232329.818226-11-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Instead of exposing the mips-optimized SHA-1 code via mips-specific
crypto_shash algorithms, instead just implement the sha1_blocks()
library function. This is much simpler, it makes the SHA-1 library
functions be mips-optimized, and it fixes the longstanding issue where
the mips-optimized SHA-1 code was disabled by default. SHA-1 still
remains available through crypto_shash, but individual architectures no
longer need to handle it.
Note: to see the diff from arch/mips/cavium-octeon/crypto/octeon-sha1.c
to lib/crypto/mips/sha1.h, view this commit with 'git show -M10'.
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250712232329.818226-10-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Instead of exposing the arm64-optimized SHA-1 code via arm64-specific
crypto_shash algorithms, instead just implement the sha1_blocks()
library function. This is much simpler, it makes the SHA-1 library
functions be arm64-optimized, and it fixes the longstanding issue where
the arm64-optimized SHA-1 code was disabled by default. SHA-1 still
remains available through crypto_shash, but individual architectures no
longer need to handle it.
Remove support for SHA-1 finalization from assembly code, since the
library does not yet support architecture-specific overrides of the
finalization. (Support for that has been omitted for now, for
simplicity and because usually it isn't performance-critical.)
To match sha1_blocks(), change the type of the nblocks parameter and the
return value of __sha1_ce_transform() from int to size_t. Update the
assembly code accordingly.
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250712232329.818226-9-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Instead of exposing the arm-optimized SHA-1 code via arm-specific
crypto_shash algorithms, instead just implement the sha1_blocks()
library function. This is much simpler, it makes the SHA-1 library
functions be arm-optimized, and it fixes the longstanding issue where
the arm-optimized SHA-1 code was disabled by default. SHA-1 still
remains available through crypto_shash, but individual architectures no
longer need to handle it.
To match sha1_blocks(), change the type of the nblocks parameter of the
assembly functions from int to size_t. The assembly functions actually
already treated it as size_t.
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250712232329.818226-8-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Add HMAC support to the SHA-1 library, again following what was done for
SHA-2. Besides providing the basis for a more streamlined "hmac(sha1)"
shash, this will also be useful for multiple in-kernel users such as
net/sctp/auth.c, net/ipv6/seg6_hmac.c, and
security/keys/trusted-keys/trusted_tpm1.c. Those are currently using
crypto_shash, but using the library functions would be much simpler.
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250712232329.818226-5-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Add a library interface for SHA-1, following the SHA-2 one. As was the
case with SHA-2, this will be useful for various in-kernel users. The
crypto_shash interface will be reimplemented on top of it as well.
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250712232329.818226-4-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Rename the existing sha1_init() to sha1_init_raw(), since it conflicts
with the upcoming library function. This will later be removed, but
this keeps the kernel building for the introduction of the library.
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250712232329.818226-3-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
While the HMAC library functions support both incremental and one-shot
computation and both prepared and raw keys, the combination of raw key
+ incremental was missing. It turns out that several potential users of
the HMAC library functions (tpm2-sessions.c, smb2transport.c,
trusted_tpm1.c) want exactly that.
Therefore, add the missing functions hmac_sha*_init_usingrawkey().
Implement them in an optimized way that directly initializes the HMAC
context without a separate key preparation step.
Reimplement the one-shot raw key functions hmac_sha*_usingrawkey() on
top of the new functions, which makes them a bit more efficient.
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250711215844.41715-1-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Fix poly1305-armv4.pl to not do '.globl poly1305_blocks_neon' when
poly1305_blocks_neon() is not defined. Then, remove the empty __weak
definition of poly1305_blocks_neon(), which was still needed only
because of that unnecessary globl statement. (It also used to be needed
because the compiler could generate calls to it when
CONFIG_KERNEL_MODE_NEON=n, but that has been fixed.)
Thanks to Arnd Bergmann for reporting that the globl statement in the
asm file was still depending on the weak symbol.
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250711212822.6372-1-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
The test align_shift_alloc_test is expected to fail. Reporting the test
as fail confuses to be a genuine failure. Introduce widely used xfail
sematics to address the issue.
Note: a warn_alloc dump similar to below is still expected:
Call Trace:
<TASK>
dump_stack_lvl+0x64/0x80
warn_alloc+0x137/0x1b0
? __get_vm_area_node+0x134/0x140
Snippet of dmesg after change:
Summary: random_size_align_alloc_test passed: 1 failed: 0 xfailed: 0 ..
Summary: align_shift_alloc_test passed: 0 failed: 0 xfailed: 1 ..
Summary: pcpu_alloc_test passed: 1 failed: 0 xfailed: 0 ..
Link: https://lkml.kernel.org/r/20250702064319.885-1-raghavendra.kt@amd.com
Signed-off-by: Raghavendra K T <raghavendra.kt@amd.com>
Reviewed-by: "Uladzislau Rezki (Sony)" <urezki@gmail.com>
Cc: Dev Jain <dev.jain@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Add comments explaining the fields for maple_metadata, since "end" is
ambiguous and "gap" can be confused as the largest gap, whereas it is
actually the offset of the largest gap.
Add comment for mas_ascend() to explain, whose min and max we are trying
to find. Explain that, for example, if we are already on offset zero,
then the parent min is mas->min, otherwise we need to walk up to find the
implied pivot min.
Link: https://lkml.kernel.org/r/20250703063338.51509-1-dev.jain@arm.com
Signed-off-by: Dev Jain <dev.jain@arm.com>
Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Wei Yang <richard.weiyang@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Restore the len >= 288 condition on using the AVX implementation, which
was incidentally removed by commit 318c53ae02 ("crypto: x86/poly1305 -
Add block-only interface"). This check took into account the overhead
in key power computation, kernel-mode "FPU", and tail handling
associated with the AVX code. Indeed, restoring this check slightly
improves performance for len < 256 as measured using poly1305_kunit on
an "AMD Ryzen AI 9 365" (Zen 5) CPU:
Length Before After
====== ========== ==========
1 30 MB/s 36 MB/s
16 516 MB/s 598 MB/s
64 1700 MB/s 1882 MB/s
127 2265 MB/s 2651 MB/s
128 2457 MB/s 2827 MB/s
200 2702 MB/s 3238 MB/s
256 3841 MB/s 3768 MB/s
511 4580 MB/s 4585 MB/s
512 5430 MB/s 5398 MB/s
1024 7268 MB/s 7305 MB/s
3173 8999 MB/s 8948 MB/s
4096 9942 MB/s 9921 MB/s
16384 10557 MB/s 10545 MB/s
While the optimal threshold for this CPU might be slightly lower than
288 (see the len == 256 case), other CPUs would need to be tested too,
and these sorts of benchmarks can underestimate the true cost of
kernel-mode "FPU". Therefore, for now just restore the 288 threshold.
Fixes: 318c53ae02 ("crypto: x86/poly1305 - Add block-only interface")
Cc: stable@vger.kernel.org
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250706231100.176113-6-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Restore the SIMD usability check and base conversion that were removed
by commit 318c53ae02 ("crypto: x86/poly1305 - Add block-only
interface").
This safety check is cheap and is well worth eliminating a footgun.
While the Poly1305 functions should not be called when SIMD registers
are unusable, if they are anyway, they should just do the right thing
instead of corrupting random tasks' registers and/or computing incorrect
MACs. Fixing this is also needed for poly1305_kunit to pass.
Just use irq_fpu_usable() instead of the original crypto_simd_usable(),
since poly1305_kunit won't rely on crypto_simd_disabled_for_test.
Fixes: 318c53ae02 ("crypto: x86/poly1305 - Add block-only interface")
Cc: stable@vger.kernel.org
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250706231100.176113-5-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Restore the SIMD usability check that was removed by commit a59e5468a9
("crypto: arm64/poly1305 - Add block-only interface").
This safety check is cheap and is well worth eliminating a footgun.
While the Poly1305 functions should not be called when SIMD registers
are unusable, if they are anyway, they should just do the right thing
instead of corrupting random tasks' registers and/or computing incorrect
MACs. Fixing this is also needed for poly1305_kunit to pass.
Just use may_use_simd() instead of the original crypto_simd_usable(),
since poly1305_kunit won't rely on crypto_simd_disabled_for_test.
Fixes: a59e5468a9 ("crypto: arm64/poly1305 - Add block-only interface")
Cc: stable@vger.kernel.org
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250706231100.176113-4-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Restore the SIMD usability check that was removed by commit 773426f477
("crypto: arm/poly1305 - Add block-only interface").
This safety check is cheap and is well worth eliminating a footgun.
While the Poly1305 functions should not be called when SIMD registers
are unusable, if they are anyway, they should just do the right thing
instead of corrupting random tasks' registers and/or computing incorrect
MACs. Fixing this is also needed for poly1305_kunit to pass.
Just use may_use_simd() instead of the original crypto_simd_usable(),
since poly1305_kunit won't rely on crypto_simd_disabled_for_test.
Fixes: 773426f477 ("crypto: arm/poly1305 - Add block-only interface")
Cc: stable@vger.kernel.org
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250706231100.176113-3-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Commit cac5cefbad ("sched/smp: Make SMP unconditional")
migrate_disable() even on UP builds.
Commit 06ddd17521 ("sched/smp: Always define is_percpu_thread() and
scheduler_ipi()") made is_percpu_thread() check the affinity mask
instead replying always true for UP mask.
As a consequence smp_processor_id() now complains if invoked within a
migrate_disable() section because is_percpu_thread() checks its mask and
the migration check is left out.
Make migration check unconditional of SMP.
Fixes: cac5cefbad ("sched/smp: Make SMP unconditional")
Closes: https://lore.kernel.org/oe-lkp/202507100448.6b88d6f1-lkp@intel.com
Reported-by: kernel test robot <oliver.sang@intel.com>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Chen Yu <yu.c.chen@intel.com>
Link: https://lore.kernel.org/r/20250710082748.-DPO1rjO@linutronix.de
With sanitizers enabled, this function uses a lot of stack, causing a
harmless warning:
lib/test_objagg.c: In function 'test_hints_case.constprop':
lib/test_objagg.c:994:1: error: the frame size of 1440 bytes is larger than 1408 bytes [-Werror=frame-larger-than=]
Most of this is from the two 'struct world' structures. Since most of the
work in this function is duplicated for the two, split it up into separate
functions that each use one of them.
The combined stack usage is still the same here, but there is no warning
any more, and the code is still safe because of the known call chain.
Link: https://lkml.kernel.org/r/20250620111907.3395296-1-arnd@kernel.org
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Cc: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
The current implementation forces a compile-time 1/0 division, which
generates an undefined instruction (ud2 on x86) rather than a proper
runtime division-by-zero exception.
Change to trigger an actual div-by-0 exception at runtime, consistent with
other division operations. Use a non-1 dividend to prevent the compiler
from optimizing the division into a comparison.
Link: https://lkml.kernel.org/r/q246p466-1453-qon9-29so-37105116009q@onlyvoer.pbz
Signed-off-by: Nicolas Pitre <npitre@baylibre.com>
Cc: Biju Das <biju.das.jz@bp.renesas.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Weißschuh <linux@weissschuh.net>
Cc: Uwe Kleine-König <u.kleine-koenig@baylibre.com>
Cc: David Laight <david.laight.linux@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Restoring maple status to ma_active on overflow/underflow when mas->node
was NULL could have happened in the past, but was masked by a bug in
mas_walk(). Add test cases that triggered the bug when the node was
mas->node prior to fixing the maple state setup.
Add a few extra tests around restoring the active maple status.
Link: https://lore.kernel.org/all/202506191556.6bfc7b93-lkp@intel.com/
Link: https://lkml.kernel.org/r/20250624154823.52221-2-Liam.Howlett@oracle.com
Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Reviewed-by: Wei Yang <richard.weiyang@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
During the initial call with a maple state, an error status may be set
before a valid node is populated into the maple state node. Subsequent
calls with the maple state may restore the state into an active state with
no node set. This was masked by the mas_walk() always resetting the
status to ma_start and result in an extra walk in this rare scenario.
Don't restore the state to active unless there is a value in the structs
node. This also allows mas_walk() to be fixed to use the active state
without exposing an issue.
User visible results are marginal performance improvements when an active
state can be restored and used instead of rewalking the tree.
Stable is not Cc'ed because the existing code is stable and the
performance gains are not worth the risk.
Link: https://lore.kernel.org/all/20250611011253.19515-1-richard.weiyang@gmail.com/
Link: https://lore.kernel.org/all/20250407231354.11771-1-richard.weiyang@gmail.com/
Link: https://lore.kernel.org/all/202506191556.6bfc7b93-lkp@intel.com/
Link: https://lkml.kernel.org/r/20250624154823.52221-1-Liam.Howlett@oracle.com
Fixes: a8091f039c ("maple_tree: add MAS_UNDERFLOW and MAS_OVERFLOW states")
Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Reported-by: Wei Yang <richard.weiyang@gmail.com>
Reported-by: kernel test robot <oliver.sang@intel.com>
Closes: https://lore.kernel.org/oe-lkp/202506191556.6bfc7b93-lkp@intel.com
Reviewed-by: Wei Yang <richard.weiyang@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
When the vmalloc test is built into the kernel, it runs automatically
during the boot. The current-default "run_test_mask" includes all test
cases, including those which are designed to fail and which trigger kernel
warnings.
These kernel splats can be misinterpreted as actual kernel bugs, leading
to false alarms and unnecessary reports.
To address this, limit the default test mask to only the first few tests
which are expected to pass cleanly. These tests are safe and should not
generate any warnings unless there is a real bug.
Users who wish to explicitly run specific test cases have to pass the
run_test_mask as a boot parameter or at module load time.
Link: https://lkml.kernel.org/r/20250623184035.581229-2-urezki@gmail.com
Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
Reviewed-by: Baoquan He <bhe@redhat.com>
Cc: Harry Yoo <harry.yoo@oracle.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: David Wang <00107082@163.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
When the vmalloc test code is compiled as a built-in, use late_initcall()
instead of module_init() to defer a vmalloc test execution until most
subsystems are up and running.
It avoids interfering with components that may not yet be initialized at
module_init() time. For example, there was a recent report of memory
profiling infrastructure not being ready early enough leading to kernel
crash.
By using late_initcall() in the built-in case, we ensure the tests are run
at a safer point during a boot sequence.
Link: https://lkml.kernel.org/r/20250623184035.581229-1-urezki@gmail.com
Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
Reviewed-by: Baoquan He <bhe@redhat.com>
Cc: Harry Yoo <harry.yoo@oracle.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: David Wang <00107082@163.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Use the underflow goto label to set the status to ma_underflow and return
NULL, as is being done elsewhere.
[akpm@linux-foundation.org: add newline, per Liam (and remove one, per akpm)]
Link: https://lkml.kernel.org/r/20250624080748.4855-1-dev.jain@arm.com
Signed-off-by: Dev Jain <dev.jain@arm.com>
Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Reviewed-by: Wei Yang <richard.weiyang@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Recently discovered this entry while checking kallsyms on ARM64:
ffff800083e509c0 D _shared_alloc_tag
If ARCH_NEEDS_WEAK_PER_CPU is not defined(it is only defined for s390 and
alpha architectures), there's no need to statically define the percpu
variable _shared_alloc_tag.
Therefore, we need to implement isolation for this purpose.
When building the core kernel code for s390 or alpha architectures,
ARCH_NEEDS_WEAK_PER_CPU remains undefined (as it is gated by #if
defined(MODULE)). However, when building modules for these architectures,
the macro is explicitly defined.
Therefore, we remove all instances of ARCH_NEEDS_WEAK_PER_CPU from the
code and introduced CONFIG_ARCH_MODULE_NEEDS_WEAK_PER_CPU to replace the
relevant logic. We can now conditionally define the perpcu variable
_shared_alloc_tag based on CONFIG_ARCH_MODULE_NEEDS_WEAK_PER_CPU. This
allows architectures (such as s390/alpha) that require weak definitions
for percpu variables in modules to include the definition, while others
can omit it via compile-time exclusion.
Link: https://lkml.kernel.org/r/20250618015809.1235761-1-hao.ge@linux.dev
Signed-off-by: Hao Ge <gehao@kylinos.cn>
Suggested-by: Suren Baghdasaryan <surenb@google.com>
Acked-by: Alexander Gordeev <agordeev@linux.ibm.com> [s390]
Acked-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Cc: Chistoph Lameter <cl@linux.com>
Cc: Christian Borntraeger <borntraeger@linux.ibm.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Dennis Zhou <dennis@kernel.org>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Kent Overstreet <kent.overstreet@linux.dev>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Richard Henderson <richard.henderson@linaro.org>
Cc: Sven Schnelle <svens@linux.ibm.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
When reading /proc/allocinfo, for each read syscall, seq_file would invoke
start/stop callbacks. In start callback, a memory is alloced to store
iterator and the iterator would start from beginning to walk linearly to
current read position.
seq_file read() takes at most 4096 bytes, even if read with a larger user
space buffer, meaning read out all of /proc/allocinfo, tens of read
syscalls are needed. For example, a 306036 bytes allocinfo files need 76
reads:
$ sudo cat /proc/allocinfo | wc
3964 16678 306036
$ sudo strace -T -e read cat /proc/allocinfo
...
read(3, " 4096 1 arch/x86/k"..., 131072) = 4063 <0.000062>
...
read(3, " 0 0 sound/core"..., 131072) = 4021 <0.000150>
...
For those n=3964 lines, each read takes about m=3964/76=52 lines,
since iterator restart from beginning for each read(),
it would move forward
m steps on 1st read
2*m steps on 2nd read
3*m steps on 3rd read
...
n steps on last read
As read() along, those linear seek steps make read() calls slower and
slower. Adding those up, codetag iterator moves about O(n*n/m) steps,
making data structure traversal take significant part of the whole
reading. Profiling when stress reading /proc/allocinfo confirms it:
vfs_read(99.959% 1677299/1677995)
proc_reg_read_iter(99.856% 1674881/1677299)
seq_read_iter(99.959% 1674191/1674881)
allocinfo_start(75.664% 1266755/1674191)
codetag_next_ct(79.217% 1003487/1266755) <---
srso_return_thunk(1.264% 16011/1266755)
__kmalloc_cache_noprof(0.102% 1296/1266755)
...
allocinfo_show(21.287% 356378/1674191)
allocinfo_next(1.530% 25621/1674191)
codetag_next_ct() takes major part.
A private data alloced at open() time can be used to carry iterator alive
across read() calls, and avoid the memory allocation and iterator reset
for each read(). This way, only O(1) memory allocation and O(n) steps
iterating, and `time` shows performance improvement from ~7ms to ~4ms.
Profiling with the change:
vfs_read(99.865% 1581073/1583214)
proc_reg_read_iter(99.485% 1572934/1581073)
seq_read_iter(99.846% 1570519/1572934)
allocinfo_show(87.428% 1373074/1570519)
seq_buf_printf(83.695% 1149196/1373074)
seq_buf_putc(1.917% 26321/1373074)
_find_next_bit(1.531% 21023/1373074)
...
codetag_to_text(0.490% 6727/1373074)
...
allocinfo_next(6.275% 98543/1570519)
...
allocinfo_start(0.369% 5790/1570519)
...
Now seq_buf_printf() takes major part.
Link: https://lkml.kernel.org/r/20250609064408.112783-1-00107082@163.com
Signed-off-by: David Wang <00107082@163.com>
Acked-by: Suren Baghdasaryan <surenb@google.com>
Cc: Kent Overstreet <kent.overstreet@linux.dev>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Codetag iterator use <id,address> pair to guarantee the validness. But
both id and address can be reused, there is theoretical possibility when
module inserted right after another module removed, kmalloc returns an
address same as the address kfree by previous module and IDR key reuses
the key recently removed.
Add a sequence number to codetag_module and code_iterator, the sequence
number is strickly incremented whenever a module is loaded. An iterator
is valid if and only if its sequence number match codetag_module's.
Link: https://lkml.kernel.org/r/20250609064200.112639-1-00107082@163.com
Signed-off-by: David Wang <00107082@163.com>
Acked-by: Suren Baghdasaryan <surenb@google.com>
Cc: Kent Overstreet <kent.overstreet@linux.dev>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
The various test ioctl handlers use arrays of 64 integers that add up to
1KiB of stack data, which in turn leads to exceeding the warning limit in
some configurations:
lib/test_hmm.c:935:12: error: stack frame size (1408) exceeds limit (1280)
in 'dmirror_migrate_to_device' [-Werror,-Wframe-larger-than]
Use half the size for these arrays, in order to stay under the warning
limits. The code can already deal with arbitrary lengths, but this may be
a little less efficient.
Link: https://lkml.kernel.org/r/20250610092159.2639515-1-arnd@kernel.org
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Cc: Alistair Popple <apopple@nvidia.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: Jeff Johnson <jeff.johnson@oss.qualcomm.com>
Cc: Jerome Glisse <jglisse@redhat.com>
Cc: Thorsten Blum <thorsten.blum@linux.dev>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Suppose xas is pointing somewhere near the end of the multi-entry batch.
Then it may happen that the computed slot already falls beyond the batch,
thus breaking the loop due to !xa_is_sibling(), and computing the wrong
order.
For example, suppose we have a shift-6 node having an order-9 entry => 8 -
1 = 7 siblings, so assume the slots are at offset 0 till 7 in this node.
If xas->xa_offset is 6, then the code will compute order as 1 +
xas->xa_node->shift = 7. Therefore, the order computation must start from
the beginning of the multi-slot entries, that is, the non-sibling entry.
Thus ensure that the caller is aware of this by triggering a BUG when the
entry is a sibling entry. Note that this BUG_ON() is only active while
running selftests, so there is no overhead in a running kernel.
Link: https://lkml.kernel.org/r/20250604041533.91198-1-dev.jain@arm.com
Signed-off-by: Dev Jain <dev.jain@arm.com>
Acked-by: Zi Yan <ziy@nvidia.com>
Cc: "Aneesh Kumar K.V" <aneesh.kumar@kernel.org>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
On destroy, we should set each node dead. But current code miss this when
the maple tree has only the root node.
The reason is mt_destroy_walk() leverage mte_destroy_descend() to set node
dead, but this is skipped since the only root node is a leaf.
Fixes this by setting the node dead if it is a leaf.
Link: https://lore.kernel.org/all/20250407231354.11771-1-richard.weiyang@gmail.com/
Link: https://lkml.kernel.org/r/20250624191841.64682-1-Liam.Howlett@oracle.com
Fixes: 54a611b605 ("Maple Tree: add new data structure")
Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Reviewed-by: Dev Jain <dev.jain@arm.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
alloc_tag_top_users() attempts to lock alloc_tag_cttype->mod_lock even
when the alloc_tag_cttype is not allocated because:
1) alloc tagging is disabled because mem profiling is disabled
(!alloc_tag_cttype)
2) alloc tagging is enabled, but not yet initialized (!alloc_tag_cttype)
3) alloc tagging is enabled, but failed initialization
(!alloc_tag_cttype or IS_ERR(alloc_tag_cttype))
In all cases, alloc_tag_cttype is not allocated, and therefore
alloc_tag_top_users() should not attempt to acquire the semaphore.
This leads to a crash on memory allocation failure by attempting to
acquire a non-existent semaphore:
Oops: general protection fault, probably for non-canonical address 0xdffffc000000001b: 0000 [#3] SMP KASAN NOPTI
KASAN: null-ptr-deref in range [0x00000000000000d8-0x00000000000000df]
CPU: 2 UID: 0 PID: 1 Comm: systemd Tainted: G D 6.16.0-rc2 #1 VOLUNTARY
Tainted: [D]=DIE
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-debian-1.16.2-1 04/01/2014
RIP: 0010:down_read_trylock+0xaa/0x3b0
Code: d0 7c 08 84 d2 0f 85 a0 02 00 00 8b 0d df 31 dd 04 85 c9 75 29 48 b8 00 00 00 00 00 fc ff df 48 8d 6b 68 48 89 ea 48 c1 ea 03 <80> 3c 02 00 0f 85 88 02 00 00 48 3b 5b 68 0f 85 53 01 00 00 65 ff
RSP: 0000:ffff8881002ce9b8 EFLAGS: 00010016
RAX: dffffc0000000000 RBX: 0000000000000070 RCX: 0000000000000000
RDX: 000000000000001b RSI: 000000000000000a RDI: 0000000000000070
RBP: 00000000000000d8 R08: 0000000000000001 R09: ffffed107dde49d1
R10: ffff8883eef24e8b R11: ffff8881002cec20 R12: 1ffff11020059d37
R13: 00000000003fff7b R14: ffff8881002cec20 R15: dffffc0000000000
FS: 00007f963f21d940(0000) GS:ffff888458ca6000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f963f5edf71 CR3: 000000010672c000 CR4: 0000000000350ef0
Call Trace:
<TASK>
codetag_trylock_module_list+0xd/0x20
alloc_tag_top_users+0x369/0x4b0
__show_mem+0x1cd/0x6e0
warn_alloc+0x2b1/0x390
__alloc_frozen_pages_noprof+0x12b9/0x21a0
alloc_pages_mpol+0x135/0x3e0
alloc_slab_page+0x82/0xe0
new_slab+0x212/0x240
___slab_alloc+0x82a/0xe00
</TASK>
As David Wang points out, this issue became easier to trigger after commit
780138b123 ("alloc_tag: check mem_profiling_support in alloc_tag_init").
Before the commit, the issue occurred only when it failed to allocate and
initialize alloc_tag_cttype or if a memory allocation fails before
alloc_tag_init() is called. After the commit, it can be easily triggered
when memory profiling is compiled but disabled at boot.
To properly determine whether alloc_tag_init() has been called and its
data structures initialized, verify that alloc_tag_cttype is a valid
pointer before acquiring the semaphore. If the variable is NULL or an
error value, it has not been properly initialized. In such a case, just
skip and do not attempt to acquire the semaphore.
[harry.yoo@oracle.com: v3]
Link: https://lkml.kernel.org/r/20250624072513.84219-1-harry.yoo@oracle.com
Link: https://lkml.kernel.org/r/20250620195305.1115151-1-harry.yoo@oracle.com
Fixes: 780138b123 ("alloc_tag: check mem_profiling_support in alloc_tag_init")
Fixes: 1438d349d1 ("lib: add memory allocations report in show_mem()")
Signed-off-by: Harry Yoo <harry.yoo@oracle.com>
Reported-by: kernel test robot <oliver.sang@intel.com>
Closes: https://lore.kernel.org/oe-lkp/202506181351.bba867dd-lkp@intel.com
Acked-by: Suren Baghdasaryan <surenb@google.com>
Tested-by: Raghavendra K T <raghavendra.kt@amd.com>
Cc: Casey Chen <cachen@purestorage.com>
Cc: David Wang <00107082@163.com>
Cc: Kent Overstreet <kent.overstreet@linux.dev>
Cc: Yuanyuan Zhong <yzhong@purestorage.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
The internal helpers are effectively using boolean results,
while pretending to use error numbers.
Switch the return type to bool for more clarity.
Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/all/20250701-vdso-auxclock-v1-5-df7d9f87b9b8@linutronix.de
Generalize node_random() and make it available to general bitmaps and
cpumasks users.
Notice, find_first_bit() is generally faster than find_nth_bit(), and we
employ it when there's a single set bit in the bitmap.
See commit 3e061d924f ("lib/nodemask: optimize node_random for
nodemask with single NUMA node").
CC: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: "Yury Norov [NVIDIA]" <yury.norov@gmail.com>
crypto/hash_info.c just contains a couple of arrays that map HASH_ALGO_*
algorithm IDs to properties of those algorithms. It is compiled only
when CRYPTO_HASH_INFO=y, but currently CRYPTO_HASH_INFO depends on
CRYPTO. Since this can be useful without the old-school crypto API,
move it into lib/crypto/ so that it no longer depends on CRYPTO.
This eliminates the need for FS_VERITY to select CRYPTO after it's been
converted to use lib/crypto/.
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250630172224.46909-2-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Since sha256_blocks() is called only with nblocks >= 1, remove
unnecessary checks for nblocks == 0 from the x86 SHA-256 assembly code.
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250704023958.73274-3-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
As I did for sha512_blocks(), reorganize x86's sha256_blocks() to be
just a static_call. To achieve that, for each assembly function add a C
function that handles the kernel-mode FPU section and fallback. While
this increases total code size slightly, the amount of code actually
executed on a given system does not increase, and it is slightly more
efficient since it eliminates the extra static_key. It also makes the
assembly functions be called with standard direct calls instead of
static calls, eliminating the need for ANNOTATE_NOENDBR.
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250704023958.73274-2-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
The BLOCK_HASH_UPDATE_BLOCKS macro is difficult to read. For now, let's
just write the update explicitly in the straightforward way, mirroring
sha512_update(). It's possible that we'll bring back a macro for this
later, but it needs to be properly justified and hopefully a bit more
readable.
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250630160645.3198-14-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Consolidate the CPU-based SHA-256 code into a single module, following
what I did with SHA-512:
- Each arch now provides a header file lib/crypto/$(SRCARCH)/sha256.h,
replacing lib/crypto/$(SRCARCH)/sha256.c. The header defines
sha256_blocks() and optionally sha256_mod_init_arch(). It is included
by lib/crypto/sha256.c, and thus the code gets built into the single
libsha256 module, with proper inlining and dead code elimination.
- sha256_blocks_generic() is moved from lib/crypto/sha256-generic.c into
lib/crypto/sha256.c. It's now a static function marked with
__maybe_unused, so the compiler automatically eliminates it in any
cases where it's not used.
- Whether arch-optimized SHA-256 is buildable is now controlled
centrally by lib/crypto/Kconfig instead of by
lib/crypto/$(SRCARCH)/Kconfig. The conditions for enabling it remain
the same as before, and it remains enabled by default.
- Any additional arch-specific translation units for the optimized
SHA-256 code (such as assembly files) are now compiled by
lib/crypto/Makefile instead of lib/crypto/$(SRCARCH)/Makefile.
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250630160645.3198-13-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Since HMAC support is commonly needed and is fairly simple, include it
as a first-class citizen of the SHA-256 library.
The API supports both incremental and one-shot computation, and either
preparing the key ahead of time or just using a raw key. The
implementation is much more streamlined than crypto/hmac.c.
I've kept it consistent with the HMAC-SHA384 and HMAC-SHA512 code as
much as possible.
Testing of these functions will be via sha224_kunit and sha256_kunit,
added by a later commit.
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250630160645.3198-9-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
The previous commit made the SHA-256 compression function state be
strongly typed, but it wasn't propagated all the way down to the
implementations of it. Do that now.
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250630160645.3198-8-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Currently the SHA-224 and SHA-256 library functions can be mixed
arbitrarily, even in ways that are incorrect, for example using
sha224_init() and sha256_final(). This is because they operate on the
same structure, sha256_state.
Introduce stronger typing, as I did for SHA-384 and SHA-512.
Also as I did for SHA-384 and SHA-512, use the names *_ctx instead of
*_state. The *_ctx names have the following small benefits:
- They're shorter.
- They avoid an ambiguity with the compression function state.
- They're consistent with the well-known OpenSSL API.
- Users usually name the variable 'sctx' anyway, which suggests that
*_ctx would be the more natural name for the actual struct.
Therefore: update the SHA-224 and SHA-256 APIs, implementation, and
calling code accordingly.
In the new structs, also strongly-type the compression function state.
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250630160645.3198-7-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Add a one-shot SHA-224 computation function sha224(), for consistency
with sha256(), sha384(), and sha512() which all already exist.
Similarly, add sha224_update(). While for now it's identical to
sha256_update(), omitting it makes the API harder to use since users
have to "know" which functions are the same between SHA-224 and SHA-256.
Also, this is a prerequisite for using different context types for each.
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250630160645.3198-6-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Instead of having both sha256_blocks_arch() and sha256_blocks_simd(),
instead have just sha256_blocks_arch() which uses the most efficient
implementation that is available in the calling context.
This is simpler, as it reduces the API surface. It's also safer, since
sha256_blocks_arch() just works in all contexts, including contexts
where the FPU/SIMD/vector registers cannot be used. This doesn't mean
that SHA-256 computations *should* be done in such contexts, but rather
we should just do the right thing instead of corrupting a random task's
registers. Eliminating this footgun and simplifying the code is well
worth the very small performance cost of doing the check.
Note: in the case of arm and arm64, what used to be sha256_blocks_arch()
is renamed back to its original name of sha256_block_data_order().
sha256_blocks_arch() is now used for the higher-level dispatch function.
This renaming also required an update to lib/crypto/arm64/sha512.h,
since sha2-armv8.pl is shared by both SHA-256 and SHA-512.
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250630160645.3198-5-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
First, move the declarations of sha224_init/update/final to be just
above the corresponding SHA-256 code, matching the order that I used for
SHA-384 and SHA-512. In sha2.h, the end result is that SHA-224,
SHA-256, SHA-384, and SHA-512 are all in the logical order.
Second, move sha224_block_init() and sha256_block_init() to be just
below crypto_sha256_state. In later changes, these functions as well as
struct crypto_sha256_state will no longer be used by the library
functions. They'll remain just for some legacy offload drivers. This
gets them into a logical place in the file for that.
No code changes other than reordering.
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250630160645.3198-4-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Current release - new code bugs:
- eth: txgbe: fix the issue of TX failure
- eth: ngbe: specify IRQ vector when the number of VFs is 7
Previous releases - regressions:
- sched: always pass notifications when child class becomes empty
- ipv4: fix stat increase when udp early demux drops the packet
- bluetooth: prevent unintended pause by checking if advertising is active
- virtio: fix error reporting in virtqueue_resize
- eth: virtio-net:
- ensure the received length does not exceed allocated size
- fix the xsk frame's length check
- eth: lan78xx: fix WARN in __netif_napi_del_locked on disconnect
Previous releases - always broken:
- bluetooth: mesh: check instances prior disabling advertising
- eth: idpf: convert control queue mutex to a spinlock
- eth: dpaa2: fix xdp_rxq_info leak
- eth: amd-xgbe: align CL37 AN sequence as per databook
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
-----BEGIN PGP SIGNATURE-----
iQJGBAABCAAwFiEEg1AjqC77wbdLX2LbKSR5jcyPE6QFAmhmfzQSHHBhYmVuaUBy
ZWRoYXQuY29tAAoJECkkeY3MjxOk/GMP/ixlapKjTP/ggGIFO0nEDTm1tAFnhQl3
bBuwBDoGPjalb46WBO24SFSFYqvZwV6ZIYxCxCeBfmkPyEun0FBX6xjqUIZqohTZ
u5ZSmKFkODMoxQWAG0hXBGvfeKg/GBMWJT761o5IB2XvknRlqHq6uufUBcalvlJK
t58ykSYp2wjfowXSRQ4jEZnr4HZzVuvarhbCB9hJWv206fdk4LiC07teHB1VhW4w
LYmBQChp8SXDFCCYZajum0cNCzx78q90lGzz+MEErVXdXXnRVeqRAUY+k4Vd/Fz+
0OY1vZJ7xgFpy2ns3Z6TH8D41P9whBI8jUYXZ5nA45J8N5wdEQo8oVHlRe9a6Y/E
0oC+DPahhSQAq8BKGFtYSyyURGJvd4+TpQP/LV4e83myReW8i0ZKtyXVgH0Cibwb
529l6wIXBAcLK03tyYwmoCI2VjJbRoMV3nMCeiACCtDExK1YCa3dhjQ82fa8voLc
MIn7zXAGf12IKca39ZapRrdaooaqvSG4htxTn94vEqScNu0wi1cymvG47h9bDrES
cPyS4/MIUH0sduSDVL5PpFYfIDhqS3mpc0e8Nc3pOy7VLQ9kvtBX37OaO/tX5aeh
SWU+8q8y1Cnq0+mcUUHpENFMOgZEC5UO6rdeaJB3Nu0vlHlDEZoEkUXSkHEfsf2F
aodwE/oPyQCg
=O7OS
-----END PGP SIGNATURE-----
Merge tag 'net-6.16-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Paolo Abeni:
"Including fixes from Bluetooth.
Current release - new code bugs:
- eth:
- txgbe: fix the issue of TX failure
- ngbe: specify IRQ vector when the number of VFs is 7
Previous releases - regressions:
- sched: always pass notifications when child class becomes empty
- ipv4: fix stat increase when udp early demux drops the packet
- bluetooth: prevent unintended pause by checking if advertising is active
- virtio: fix error reporting in virtqueue_resize
- eth:
- virtio-net:
- ensure the received length does not exceed allocated size
- fix the xsk frame's length check
- lan78xx: fix WARN in __netif_napi_del_locked on disconnect
Previous releases - always broken:
- bluetooth: mesh: check instances prior disabling advertising
- eth:
- idpf: convert control queue mutex to a spinlock
- dpaa2: fix xdp_rxq_info leak
- amd-xgbe: align CL37 AN sequence as per databook"
* tag 'net-6.16-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (38 commits)
vsock/vmci: Clear the vmci transport packet properly when initializing it
dt-bindings: net: sophgo,sg2044-dwmac: Drop status from the example
net: ngbe: specify IRQ vector when the number of VFs is 7
net: wangxun: revert the adjustment of the IRQ vector sequence
net: txgbe: request MISC IRQ in ndo_open
virtio_net: Enforce minimum TX ring size for reliability
virtio_net: Cleanup '2+MAX_SKB_FRAGS'
virtio_ring: Fix error reporting in virtqueue_resize
virtio-net: xsk: rx: fix the frame's length check
virtio-net: use the check_mergeable_len helper
virtio-net: remove redundant truesize check with PAGE_SIZE
virtio-net: ensure the received length does not exceed allocated size
net: ipv4: fix stat increase when udp early demux drops the packet
net: libwx: fix the incorrect display of the queue number
amd-xgbe: do not double read link status
net/sched: Always pass notifications when child class becomes empty
nui: Fix dma_mapping_error() check
rose: fix dangling neighbour pointers in rose_rt_device_down()
enic: fix incorrect MTU comparison in enic_change_mtu()
amd-xgbe: align CL37 AN sequence as per databook
...
Smatch complains that the error message isn't set in the caller:
lib/test_objagg.c:923 test_hints_case2()
error: uninitialized symbol 'errmsg'.
This static checker warning only showed up after a recent refactoring
but the bug dates back to when the code was originally added. This
likely doesn't affect anything in real life.
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/r/202506281403.DsuyHFTZ-lkp@intel.com/
Fixes: 0a020d416d ("lib: introduce initial implementation of object aggregation manager")
Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/8548f423-2e3b-4bb7-b816-5041de2762aa@sabinyo.mountain
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
group_cpu_evenly() might have allocated less groups then requested:
group_cpu_evenly()
__group_cpus_evenly()
alloc_nodes_groups()
# allocated total groups may be less than numgrps when
# active total CPU number is less then numgrps
In this case, the caller will do an out of bound access because the
caller assumes the masks returned has numgrps.
Return the number of groups created so the caller can limit the access
range accordingly.
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Daniel Wagner <wagi@kernel.org>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20250617-isolcpus-queue-counters-v1-1-13923686b54b@kernel.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Fix build warnings with W=1 that started appearing after
commit a934a57a42 ("scripts/misc-check: check missing #include
<linux/export.h> when W=1"). While at it, sort the include lists
alphabetically.
Link: https://lore.kernel.org/r/20250612183852.114878-1-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
These symbols are no longer used, so remove them.
Reviewed-by: "Martin K. Petersen" <martin.petersen@oracle.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Acked-by: "Jason A. Donenfeld" <Jason@zx2c4.com>
Link: https://lore.kernel.org/r/20250607200454.73587-13-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Move the x86-optimized CRC code from arch/x86/lib/crc* into its new
location in lib/crc/x86/, and wire it up in the new way. This new way
of organizing the CRC code eliminates the need to artificially split the
code for each CRC variant into separate arch and generic modules,
enabling better inlining and dead code elimination. For more details,
see "lib/crc: Prepare for arch-optimized code in subdirs of lib/crc/".
Reviewed-by: "Martin K. Petersen" <martin.petersen@oracle.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Acked-by: "Jason A. Donenfeld" <Jason@zx2c4.com>
Link: https://lore.kernel.org/r/20250607200454.73587-12-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Move the sparc-optimized CRC code from arch/sparc/lib/crc* into its new
location in lib/crc/sparc/, and wire it up in the new way. This new way
of organizing the CRC code eliminates the need to artificially split the
code for each CRC variant into separate arch and generic modules,
enabling better inlining and dead code elimination. For more details,
see "lib/crc: Prepare for arch-optimized code in subdirs of lib/crc/".
Reviewed-by: "Martin K. Petersen" <martin.petersen@oracle.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Acked-by: "Jason A. Donenfeld" <Jason@zx2c4.com>
Link: https://lore.kernel.org/r/20250607200454.73587-11-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Move the s390-optimized CRC code from arch/s390/lib/crc* into its new
location in lib/crc/s390/, and wire it up in the new way. This new way
of organizing the CRC code eliminates the need to artificially split the
code for each CRC variant into separate arch and generic modules,
enabling better inlining and dead code elimination. For more details,
see "lib/crc: Prepare for arch-optimized code in subdirs of lib/crc/".
Reviewed-by: "Martin K. Petersen" <martin.petersen@oracle.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Acked-by: "Jason A. Donenfeld" <Jason@zx2c4.com>
Link: https://lore.kernel.org/r/20250607200454.73587-10-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Move the riscv-optimized CRC code from arch/riscv/lib/crc* into its new
location in lib/crc/riscv/, and wire it up in the new way. This new way
of organizing the CRC code eliminates the need to artificially split the
code for each CRC variant into separate arch and generic modules,
enabling better inlining and dead code elimination. For more details,
see "lib/crc: Prepare for arch-optimized code in subdirs of lib/crc/".
Reviewed-by: "Martin K. Petersen" <martin.petersen@oracle.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Acked-by: "Jason A. Donenfeld" <Jason@zx2c4.com>
Link: https://lore.kernel.org/r/20250607200454.73587-9-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Move the powerpc-optimized CRC code from arch/powerpc/lib/crc* into its
new location in lib/crc/powerpc/, and wire it up in the new way. This
new way of organizing the CRC code eliminates the need to artificially
split the code for each CRC variant into separate arch and generic
modules, enabling better inlining and dead code elimination. For more
details, see "lib/crc: Prepare for arch-optimized code in subdirs of
lib/crc/".
Reviewed-by: "Martin K. Petersen" <martin.petersen@oracle.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Acked-by: "Jason A. Donenfeld" <Jason@zx2c4.com>
Link: https://lore.kernel.org/r/20250607200454.73587-8-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Move the mips-optimized CRC code from arch/mips/lib/crc* into its new
location in lib/crc/mips/, and wire it up in the new way. This new way
of organizing the CRC code eliminates the need to artificially split the
code for each CRC variant into separate arch and generic modules,
enabling better inlining and dead code elimination. For more details,
see "lib/crc: Prepare for arch-optimized code in subdirs of lib/crc/".
Reviewed-by: "Martin K. Petersen" <martin.petersen@oracle.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Acked-by: "Jason A. Donenfeld" <Jason@zx2c4.com>
Link: https://lore.kernel.org/r/20250607200454.73587-7-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Move the loongarch-optimized CRC code from arch/loongarch/lib/crc* into
its new location in lib/crc/loongarch/, and wire it up in the new way.
This new way of organizing the CRC code eliminates the need to
artificially split the code for each CRC variant into separate arch and
generic modules, enabling better inlining and dead code elimination.
For more details, see "lib/crc: Prepare for arch-optimized code in
subdirs of lib/crc/".
Reviewed-by: "Martin K. Petersen" <martin.petersen@oracle.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Acked-by: "Jason A. Donenfeld" <Jason@zx2c4.com>
Link: https://lore.kernel.org/r/20250607200454.73587-6-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Move the arm64-optimized CRC code from arch/arm64/lib/crc* into its new
location in lib/crc/arm64/, and wire it up in the new way. This new way
of organizing the CRC code eliminates the need to artificially split the
code for each CRC variant into separate arch and generic modules,
enabling better inlining and dead code elimination. For more details,
see "lib/crc: Prepare for arch-optimized code in subdirs of lib/crc/".
Reviewed-by: "Martin K. Petersen" <martin.petersen@oracle.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Acked-by: "Jason A. Donenfeld" <Jason@zx2c4.com>
Link: https://lore.kernel.org/r/20250607200454.73587-5-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Move the arm-optimized CRC code from arch/arm/lib/crc* into its new
location in lib/crc/arm/, and wire it up in the new way. This new way
of organizing the CRC code eliminates the need to artificially split the
code for each CRC variant into separate arch and generic modules,
enabling better inlining and dead code elimination. For more details,
see "lib/crc: Prepare for arch-optimized code in subdirs of lib/crc/".
Reviewed-by: "Martin K. Petersen" <martin.petersen@oracle.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Acked-by: "Jason A. Donenfeld" <Jason@zx2c4.com>
Link: https://lore.kernel.org/r/20250607200454.73587-4-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Rework how lib/crc/ supports arch-optimized code. First, instead of the
arch-optimized CRC code being in arch/$(SRCARCH)/lib/, it will now be in
lib/crc/$(SRCARCH)/. Second, the API functions (e.g. crc32c()),
arch-optimized functions (e.g. crc32c_arch()), and generic functions
(e.g. crc32c_base()) will now be part of a single module for each CRC
type, allowing better inlining and dead code elimination. The second
change is made possible by the first.
As an example, consider CONFIG_CRC32=m on x86. We'll now have just
crc32.ko instead of both crc32-x86.ko and crc32.ko. The two modules
were already coupled together and always both got loaded together via
direct symbol dependency, so the separation provided no benefit.
Note: later I'd like to apply the same design to lib/crypto/ too, where
often the API functions are out-of-line so this will work even better.
In those cases, for each algorithm we currently have 3 modules all
coupled together, e.g. libsha256.ko, libsha256-generic.ko, and
sha256-x86.ko. We should have just one, inline things properly, and
rely on the compiler's dead code elimination to decide the inclusion of
the generic code instead of manually setting it via kconfig.
Having arch-specific code outside arch/ was somewhat controversial when
Zinc proposed it back in 2018. But I don't think the concerns are
warranted. It's better from a technical perspective, as it enables the
improvements mentioned above. This model is already successfully used
in other places in the kernel such as lib/raid6/. The community of each
architecture still remains free to work on the code, even if it's not in
arch/. At the time there was also a desire to put the library code in
the same files as the old-school crypto API, but that was a mistake; now
that the library is separate, that's no longer a constraint either.
Reviewed-by: "Martin K. Petersen" <martin.petersen@oracle.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Acked-by: "Jason A. Donenfeld" <Jason@zx2c4.com>
Link: https://lore.kernel.org/r/20250607200454.73587-3-ebiggers@kernel.org
Link: https://lore.kernel.org/r/20250612054514.142728-1-ebiggers@kernel.org
Link: https://lore.kernel.org/r/20250621012221.4351-1-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Move all CRC files in lib/ into a subdirectory lib/crc/ to keep them
from cluttering up the main lib/ directory.
Reviewed-by: "Martin K. Petersen" <martin.petersen@oracle.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Acked-by: "Jason A. Donenfeld" <Jason@zx2c4.com>
Link: https://lore.kernel.org/r/20250607200454.73587-2-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Remove crc32_le_combine() and crc32_le_shift(), since they are no longer
used.
Although combination is an interesting thing that can be done with CRCs,
it turned out that none of the users of it in the kernel were even close
to being worthwhile. All were much better off simply chaining the CRCs
or processing zeroes.
Let's remove the CRC32 combination code for now. It can come back
(potentially optimized with carryless multiplication instructions) if
there is ever a case where it would actually be worthwhile.
Link: https://lore.kernel.org/r/20250607032228.27868-1-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
The MIPS32r2 ChaCha code has never been buildable with the clang
assembler. First, clang doesn't support the 'rotl' pseudo-instruction:
error: unknown instruction, did you mean: rol, rotr?
Second, clang requires that both operands of the 'wsbh' instruction be
explicitly given:
error: too few operands for instruction
To fix this, align the code with the real instruction set by (1) using
the real instruction 'rotr' instead of the nonstandard pseudo-
instruction 'rotl', and (2) explicitly giving both operands to 'wsbh'.
To make removing the use of 'rotl' a bit easier, also remove the
unnecessary special-casing for big endian CPUs at
.Lchacha_mips_xor_bytes. The tail handling is actually
endian-independent since it processes one byte at a time. On big endian
CPUs the old code byte-swapped SAVED_X, then iterated through it in
reverse order. But the byteswap and reverse iteration canceled out.
Tested with chacha20poly1305-selftest in QEMU using "-M malta" with both
little endian and big endian mips32r2 kernels.
Fixes: 49aa7c00ed ("crypto: mips/chacha - import 32r2 ChaCha code from Zinc")
Cc: stable@vger.kernel.org
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202505080409.EujEBwA0-lkp@intel.com/
Link: https://lore.kernel.org/r/20250619225535.679301-1-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Move the contents of arch/x86/lib/crypto/ into lib/crypto/x86/.
The new code organization makes a lot more sense for how this code
actually works and is developed. In particular, it makes it possible to
build each algorithm as a single module, with better inlining and dead
code elimination. For a more detailed explanation, see the patchset
which did this for the CRC library code:
https://lore.kernel.org/r/20250607200454.73587-1-ebiggers@kernel.org/.
Also see the patchset which did this for SHA-512:
https://lore.kernel.org/linux-crypto/20250616014019.415791-1-ebiggers@kernel.org/
This is just a preparatory commit, which does the move to get the files
into their new location but keeps them building the same way as before.
Later commits will make the actual improvements to the way the
arch-optimized code is integrated for each algorithm.
Add a gitignore entry for the removed directory arch/x86/lib/crypto/ so
that people don't accidentally commit leftover generated files.
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Reviewed-by: Sohil Mehta <sohil.mehta@intel.com>
Link: https://lore.kernel.org/r/20250619191908.134235-9-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Move the contents of arch/sparc/lib/crypto/ into lib/crypto/sparc/.
The new code organization makes a lot more sense for how this code
actually works and is developed. In particular, it makes it possible to
build each algorithm as a single module, with better inlining and dead
code elimination. For a more detailed explanation, see the patchset
which did this for the CRC library code:
https://lore.kernel.org/r/20250607200454.73587-1-ebiggers@kernel.org/.
Also see the patchset which did this for SHA-512:
https://lore.kernel.org/linux-crypto/20250616014019.415791-1-ebiggers@kernel.org/
This is just a preparatory commit, which does the move to get the files
into their new location but keeps them building the same way as before.
Later commits will make the actual improvements to the way the
arch-optimized code is integrated for each algorithm.
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Reviewed-by: Sohil Mehta <sohil.mehta@intel.com>
Link: https://lore.kernel.org/r/20250619191908.134235-8-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Move the contents of arch/s390/lib/crypto/ into lib/crypto/s390/.
The new code organization makes a lot more sense for how this code
actually works and is developed. In particular, it makes it possible to
build each algorithm as a single module, with better inlining and dead
code elimination. For a more detailed explanation, see the patchset
which did this for the CRC library code:
https://lore.kernel.org/r/20250607200454.73587-1-ebiggers@kernel.org/.
Also see the patchset which did this for SHA-512:
https://lore.kernel.org/linux-crypto/20250616014019.415791-1-ebiggers@kernel.org/
This is just a preparatory commit, which does the move to get the files
into their new location but keeps them building the same way as before.
Later commits will make the actual improvements to the way the
arch-optimized code is integrated for each algorithm.
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Reviewed-by: Sohil Mehta <sohil.mehta@intel.com>
Link: https://lore.kernel.org/r/20250619191908.134235-7-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Move the contents of arch/riscv/lib/crypto/ into lib/crypto/riscv/.
The new code organization makes a lot more sense for how this code
actually works and is developed. In particular, it makes it possible to
build each algorithm as a single module, with better inlining and dead
code elimination. For a more detailed explanation, see the patchset
which did this for the CRC library code:
https://lore.kernel.org/r/20250607200454.73587-1-ebiggers@kernel.org/.
Also see the patchset which did this for SHA-512:
https://lore.kernel.org/linux-crypto/20250616014019.415791-1-ebiggers@kernel.org/
This is just a preparatory commit, which does the move to get the files
into their new location but keeps them building the same way as before.
Later commits will make the actual improvements to the way the
arch-optimized code is integrated for each algorithm.
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Acked-by: Palmer Dabbelt <palmer@dabbelt.com>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Reviewed-by: Sohil Mehta <sohil.mehta@intel.com>
Link: https://lore.kernel.org/r/20250619191908.134235-6-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Move the contents of arch/powerpc/lib/crypto/ into lib/crypto/powerpc/.
The new code organization makes a lot more sense for how this code
actually works and is developed. In particular, it makes it possible to
build each algorithm as a single module, with better inlining and dead
code elimination. For a more detailed explanation, see the patchset
which did this for the CRC library code:
https://lore.kernel.org/r/20250607200454.73587-1-ebiggers@kernel.org/.
Also see the patchset which did this for SHA-512:
https://lore.kernel.org/linux-crypto/20250616014019.415791-1-ebiggers@kernel.org/
This is just a preparatory commit, which does the move to get the files
into their new location but keeps them building the same way as before.
Later commits will make the actual improvements to the way the
arch-optimized code is integrated for each algorithm.
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Reviewed-by: Sohil Mehta <sohil.mehta@intel.com>
Link: https://lore.kernel.org/r/20250619191908.134235-5-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Move the contents of arch/mips/lib/crypto/ into lib/crypto/mips/.
The new code organization makes a lot more sense for how this code
actually works and is developed. In particular, it makes it possible to
build each algorithm as a single module, with better inlining and dead
code elimination. For a more detailed explanation, see the patchset
which did this for the CRC library code:
https://lore.kernel.org/r/20250607200454.73587-1-ebiggers@kernel.org/.
Also see the patchset which did this for SHA-512:
https://lore.kernel.org/linux-crypto/20250616014019.415791-1-ebiggers@kernel.org/
This is just a preparatory commit, which does the move to get the files
into their new location but keeps them building the same way as before.
Later commits will make the actual improvements to the way the
arch-optimized code is integrated for each algorithm.
Add a gitignore entry for the removed directory arch/mips/lib/crypto/ so
that people don't accidentally commit leftover generated files.
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Reviewed-by: Sohil Mehta <sohil.mehta@intel.com>
Link: https://lore.kernel.org/r/20250619191908.134235-4-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Move the contents of arch/arm64/lib/crypto/ into lib/crypto/arm64/.
The new code organization makes a lot more sense for how this code
actually works and is developed. In particular, it makes it possible to
build each algorithm as a single module, with better inlining and dead
code elimination. For a more detailed explanation, see the patchset
which did this for the CRC library code:
https://lore.kernel.org/r/20250607200454.73587-1-ebiggers@kernel.org/.
Also see the patchset which did this for SHA-512:
https://lore.kernel.org/linux-crypto/20250616014019.415791-1-ebiggers@kernel.org/
This is just a preparatory commit, which does the move to get the files
into their new location but keeps them building the same way as before.
Later commits will make the actual improvements to the way the
arch-optimized code is integrated for each algorithm.
Add a gitignore entry for the removed directory arch/arm64/lib/crypto/
so that people don't accidentally commit leftover generated files.
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Reviewed-by: Sohil Mehta <sohil.mehta@intel.com>
Link: https://lore.kernel.org/r/20250619191908.134235-3-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Move the contents of arch/arm/lib/crypto/ into lib/crypto/arm/.
The new code organization makes a lot more sense for how this code
actually works and is developed. In particular, it makes it possible to
build each algorithm as a single module, with better inlining and dead
code elimination. For a more detailed explanation, see the patchset
which did this for the CRC library code:
https://lore.kernel.org/r/20250607200454.73587-1-ebiggers@kernel.org/.
Also see the patchset which did this for SHA-512:
https://lore.kernel.org/linux-crypto/20250616014019.415791-1-ebiggers@kernel.org/
This is just a preparatory commit, which does the move to get the files
into their new location but keeps them building the same way as before.
Later commits will make the actual improvements to the way the
arch-optimized code is integrated for each algorithm.
Add a gitignore entry for the removed directory arch/arm/lib/crypto/ so
that people don't accidentally commit leftover generated files.
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Reviewed-by: Sohil Mehta <sohil.mehta@intel.com>
Link: https://lore.kernel.org/r/20250619191908.134235-2-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Instead of exposing the x86-optimized SHA-512 code via x86-specific
crypto_shash algorithms, instead just implement the sha512_blocks()
library function. This is much simpler, it makes the SHA-512 (and
SHA-384) library functions be x86-optimized, and it fixes the
longstanding issue where the x86-optimized SHA-512 code was disabled by
default. SHA-512 still remains available through crypto_shash, but
individual architectures no longer need to handle it.
To match sha512_blocks(), change the type of the nblocks parameter of
the assembly functions from int to size_t. The assembly functions
actually already treated it as size_t.
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250630160320.2888-15-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Instead of exposing the sparc-optimized SHA-512 code via sparc-specific
crypto_shash algorithms, instead just implement the sha512_blocks()
library function. This is much simpler, it makes the SHA-512 (and
SHA-384) library functions be sparc-optimized, and it fixes the
longstanding issue where the sparc-optimized SHA-512 code was disabled
by default. SHA-512 still remains available through crypto_shash, but
individual architectures no longer need to handle it.
To match sha512_blocks(), change the type of the nblocks parameter of
the assembly function from int to size_t. The assembly function
actually already treated it as size_t.
Note: to see the diff from arch/sparc/crypto/sha512_glue.c to
lib/crypto/sparc/sha512.h, view this commit with 'git show -M10'.
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250630160320.2888-14-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Instead of exposing the s390-optimized SHA-512 code via s390-specific
crypto_shash algorithms, instead just implement the sha512_blocks()
library function. This is much simpler, it makes the SHA-512 (and
SHA-384) library functions be s390-optimized, and it fixes the
longstanding issue where the s390-optimized SHA-512 code was disabled by
default. SHA-512 still remains available through crypto_shash, but
individual architectures no longer need to handle it.
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250630160320.2888-13-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Instead of exposing the riscv-optimized SHA-512 code via riscv-specific
crypto_shash algorithms, instead just implement the sha512_blocks()
library function. This is much simpler, it makes the SHA-512 (and
SHA-384) library functions be riscv-optimized, and it fixes the
longstanding issue where the riscv-optimized SHA-512 code was disabled
by default. SHA-512 still remains available through crypto_shash, but
individual architectures no longer need to handle it.
To match sha512_blocks(), change the type of the nblocks parameter of
the assembly function from int to size_t. The assembly function
actually already treated it as size_t.
Note: to see the diff from arch/riscv/crypto/sha512-riscv64-glue.c to
lib/crypto/riscv/sha512.h, view this commit with 'git show -M10'.
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250630160320.2888-12-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Instead of exposing the mips-optimized SHA-512 code via mips-specific
crypto_shash algorithms, instead just implement the sha512_blocks()
library function. This is much simpler, it makes the SHA-512 (and
SHA-384) library functions be mips-optimized, and it fixes the
longstanding issue where the mips-optimized SHA-512 code was disabled by
default. SHA-512 still remains available through crypto_shash, but
individual architectures no longer need to handle it.
Note: to see the diff from
arch/mips/cavium-octeon/crypto/octeon-sha512.c to
lib/crypto/mips/sha512.h, view this commit with 'git show -M10'.
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250630160320.2888-11-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Instead of exposing the arm64-optimized SHA-512 code via arm64-specific
crypto_shash algorithms, instead just implement the sha512_blocks()
library function. This is much simpler, it makes the SHA-512 (and
SHA-384) library functions be arm64-optimized, and it fixes the
longstanding issue where the arm64-optimized SHA-512 code was disabled
by default. SHA-512 still remains available through crypto_shash, but
individual architectures no longer need to handle it.
To match sha512_blocks(), change the type of the nblocks parameter of
the assembly functions from int or 'unsigned int' to size_t. Update the
ARMv8 CE assembly function accordingly. The scalar assembly function
actually already treated it as size_t.
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250630160320.2888-9-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Instead of exposing the arm-optimized SHA-512 code via arm-specific
crypto_shash algorithms, instead just implement the sha512_blocks()
library function. This is much simpler, it makes the SHA-512 (and
SHA-384) library functions be arm-optimized, and it fixes the
longstanding issue where the arm-optimized SHA-512 code was disabled by
default. SHA-512 still remains available through crypto_shash, but
individual architectures no longer need to handle it.
To match sha512_blocks(), change the type of the nblocks parameter of
the assembly functions from int to size_t. The assembly functions
actually already treated it as size_t.
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250630160320.2888-8-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Since HMAC support is commonly needed and is fairly simple, include it
as a first-class citizen of the SHA-512 library.
The API supports both incremental and one-shot computation, and either
preparing the key ahead of time or just using a raw key. The
implementation is much more streamlined than crypto/hmac.c.
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250630160320.2888-4-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Add basic support for SHA-384 and SHA-512 to lib/crypto/.
Various in-kernel users will be able to use this instead of the
old-school crypto API, which is harder to use and has more overhead.
The basic support added by this commit consists of the API and its
documentation, backed by a C implementation of the algorithms.
sha512_block_generic() is derived from crypto/sha512_generic.c.
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250630160320.2888-3-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Fix build warnings with W=1 that started appearing after
commit a934a57a42 ("scripts/misc-check: check missing #include
<linux/export.h> when W=1").
While at it, also sort the include lists alphabetically. (Keep
asm/irqflags.h last, as otherwise it doesn't build on alpha.)
This handles all of lib/crypto/, but not arch/*/lib/crypto/. The
exports in arch/*/lib/crypto/ will go away when the code is properly
integrated into lib/crypto/ as planned.
Link: https://lore.kernel.org/r/20250613184814.50173-1-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
or aren't considered necessary for -stable kernels. 5 are for MM.
-----BEGIN PGP SIGNATURE-----
iHUEABYKAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCaF8vtQAKCRDdBJ7gKXxA
jlK9AP9Syx5isoE7MAMKjr9iI/2z+NRaCCro/VM4oQk8m2cNFgD/ZsL9YMhjZlcL
bMIVUZ9E+yf1w9dLeHLoDba+pnF7Wwc=
=vdkO
-----END PGP SIGNATURE-----
Merge tag 'mm-hotfixes-stable-2025-06-27-16-56' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull misc fixes from Andrew Morton:
"16 hotfixes.
6 are cc:stable and the remainder address post-6.15 issues or aren't
considered necessary for -stable kernels. 5 are for MM"
* tag 'mm-hotfixes-stable-2025-06-27-16-56' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
MAINTAINERS: add Lorenzo as THP co-maintainer
mailmap: update Duje Mihanović's email address
selftests/mm: fix validate_addr() helper
crashdump: add CONFIG_KEYS dependency
mailmap: correct name for a historical account of Zijun Hu
mailmap: add entries for Zijun Hu
fuse: fix runtime warning on truncate_folio_batch_exceptionals()
scripts/gdb: fix dentry_name() lookup
mm/damon/sysfs-schemes: free old damon_sysfs_scheme_filter->memcg_path on write
mm/alloc_tag: fix the kmemleak false positive issue in the allocation of the percpu variable tag->counters
lib/group_cpus: fix NULL pointer dereference from group_cpus_evenly()
mm/hugetlb: remove unnecessary holding of hugetlb_lock
MAINTAINERS: add missing files to mm page alloc section
MAINTAINERS: add tree entry to mm init block
mm: add OOM killer maintainer structure
fs/proc/task_mmu: fix PAGE_IS_PFNZERO detection for the huge zero folio
* .rodata is no longer linkd into PT_DYNAMIC, it was not supposed to be
there in the first place and resultst in invalid (but unused) entries.
This manifests as at least warnings in llvm-readelf.
* A fix for runtime constants with all-0 upper 32-bits. This should
only manifest on MMU=n kernels.
* A fix for context save/restore on systems using the T-Head vector
extensions.
* A fix for a conflicting "+r"/"r" register constraint in the VDSO
getrandom syscall wrapper, which is undefined behavior in clang.
* A fix for a missing register clobber in the RVV raid6 implementation.
This manifests as a NULL pointer reference on some compilers, but
could trigger in other ways.
* Misaligned accesses from userspace at faulting addresses are now
handled correctly.
* A fix for an incorrect optimization that allowed access_ok() to mark
invalid addresses as accessible, which can result in userspace
triggering BUG()s.
* A few fixes for build warnings, and an update to Drew's email address.
-----BEGIN PGP SIGNATURE-----
iQJNBAABCAA3FiEEKzw3R0RoQ7JKlDp6LhMZ81+7GIkFAmhe80kZHHBhbG1lcmRh
YmJlbHRAZ29vZ2xlLmNvbQAKCRAuExnzX7sYicV6EACT/5384tdpYSQ6WQ4K2mT2
XxPbrYTJ4jrhZMugnfe1LHBokeBGoGPRK11Dr/PyNJ71oeeDF7opv0kxAfqsiOO3
QrwUE/4zhGgEzs7Z6D8UgYiqVDfb4aMU+oZ0qIfy+r+cB4F9M65TIejdVj99V6Hu
V9cjJ4ABM9KfaZhD5BvoqflblYtwuSg/VYsUmZH6aolDyadzTy4rWcPk1jdFJDQt
tIEsXjc92KNAKGSFe8DDZjjhM216Th/nUsZcxI2DLRQjjHPNEthkAgLNltQGocU9
gJ8U3IqfazgnqcZAlrr7BXlWYlBFH/wGXVsxuBL5LPov19RcTkjl2PWH7T08yyuv
lCGXrfkz3hSu+Sa9A40w4LptrKNWUEFJztaPkQ68gn1ZQP7KB/rsWp+82dCqhT35
RNxmSznLyTsHFRXR2n9fZrWX/F/LwxY7vaH7cTZUDkMHI8F7WP/3tlihxPCQaUHD
dIb+osch8puxG3YjO7H99WrpJamNNw3+L1l2lXtXTRmXdxE+x7fyatmHX98mY8IC
7NXGOdNNIEvv4i9vzSphYQHBOT3tBVfz40z878qfSL3xYHG3ZLMIsWuynaWDMI73
QprwAPmdFxdmJrHyIY6gIiyrscNHz5WLMjkG4K+jXlsBBmDxJMAY5zzNdFoeUVDz
tjnDY4DYc4fCnteKSA/hpw==
=42TO
-----END PGP SIGNATURE-----
Merge tag 'riscv-for-linus-5.16-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux
Pull RISC-V Fixes for 5.16-rc4
- .rodata is no longer linkd into PT_DYNAMIC.
It was not supposed to be there in the first place and resulted in
invalid (but unused) entries. This manifests as at least warnings in
llvm-readelf
- A fix for runtime constants with all-0 upper 32-bits. This should
only manifest on MMU=n kernels
- A fix for context save/restore on systems using the T-Head vector
extensions
- A fix for a conflicting "+r"/"r" register constraint in the VDSO
getrandom syscall wrapper, which is undefined behavior in clang
- A fix for a missing register clobber in the RVV raid6 implementation.
This manifests as a NULL pointer reference on some compilers, but
could trigger in other ways
- Misaligned accesses from userspace at faulting addresses are now
handled correctly
- A fix for an incorrect optimization that allowed access_ok() to mark
invalid addresses as accessible, which can result in userspace
triggering BUG()s
- A few fixes for build warnings, and an update to Drew's email address
* tag 'riscv-for-linus-5.16-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
riscv: export boot_cpu_hartid
Revert "riscv: Define TASK_SIZE_MAX for __access_ok()"
riscv: Fix sparse warning in vendor_extensions/sifive.c
Revert "riscv: misaligned: fix sleeping function called during misaligned access handling"
MAINTAINERS: Update Drew Fustini's email address
RISC-V: uaccess: Wrap the get_user_8 uaccess macro
raid6: riscv: Fix NULL pointer dereference caused by a missing clobber
RISC-V: vDSO: Correct inline assembly constraints in the getrandom syscall wrapper
riscv: vector: Fix context save/restore with xtheadvector
riscv: fix runtime constant support for nommu kernels
riscv: vdso: Exclude .rodata from the PT_DYNAMIC segment
To accommodate varying hardware performance and use cases,
the default kunit test case timeout (currently 300 seconds)
is now configurable. Users can adjust the timeout by
either setting the 'timeout' module parameter or the
KUNIT_DEFAULT_TIMEOUT Kconfig option to their desired
timeout in seconds.
Link: https://lore.kernel.org/r/20250626171730.1765004-1-marievic@google.com
Signed-off-by: Marie Zhussupova <marievic@google.com>
Reviewed-by: David Gow <davidgow@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Merge a fix from Jeff from a stable commit ID:
* ref_tracker: do xarray and workqueue job initializations earlier
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The kernel test robot reported an oops that occurred when attempting to
deregister a dentry from the xarray during subsys_initcall().
The ref_tracker xarrays and workqueue job are being initialized in
late_initcall() which is too late. Move those to postcore_initcall()
instead.
Fixes: 65b584f536 ("ref_tracker: automatically register a file in debugfs for a ref_tracker_dir")
Reported-by: kernel test robot <oliver.sang@intel.com>
Closes: https://lore.kernel.org/oe-lkp/202506251406.c28f2adb-lkp@intel.com
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Link: https://patch.msgid.link/20250626-reftrack-dbgfs-v1-1-812102e2a394@kernel.org
When loading a module, as long as the module has memory allocation
operations, kmemleak produces a false positive report that resembles the
following:
unreferenced object (percpu) 0x7dfd232a1650 (size 16):
comm "modprobe", pid 1301, jiffies 4294940249
hex dump (first 16 bytes on cpu 2):
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
backtrace (crc 0):
kmemleak_alloc_percpu+0xb4/0xd0
pcpu_alloc_noprof+0x700/0x1098
load_module+0xd4/0x348
codetag_module_init+0x20c/0x450
codetag_load_module+0x70/0xb8
load_module+0xef8/0x1608
init_module_from_file+0xec/0x158
idempotent_init_module+0x354/0x608
__arm64_sys_finit_module+0xbc/0x150
invoke_syscall+0xd4/0x258
el0_svc_common.constprop.0+0xb4/0x240
do_el0_svc+0x48/0x68
el0_svc+0x40/0xf8
el0t_64_sync_handler+0x10c/0x138
el0t_64_sync+0x1ac/0x1b0
This is because the module can only indirectly reference
alloc_tag_counters through the alloc_tag section, which misleads kmemleak.
However, we don't have a kmemleak ignore interface for percpu allocations
yet. So let's create one and invoke it for tag->counters.
[gehao@kylinos.cn: fix build error when CONFIG_DEBUG_KMEMLEAK=n, s/igonore/ignore/]
Link: https://lkml.kernel.org/r/20250620093102.2416767-1-hao.ge@linux.dev
Link: https://lkml.kernel.org/r/20250619183154.2122608-1-hao.ge@linux.dev
Fixes: 12ca42c237 ("alloc_tag: allocate percpu counters for module tags dynamically")
Signed-off-by: Hao Ge <gehao@kylinos.cn>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Suren Baghdasaryan <surenb@google.com> [lib/alloc_tag.c]
Cc: Kent Overstreet <kent.overstreet@linux.dev>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
While testing null_blk with configfs, echo 0 > poll_queues will trigger
following panic:
BUG: kernel NULL pointer dereference, address: 0000000000000010
Oops: Oops: 0000 [#1] SMP NOPTI
CPU: 27 UID: 0 PID: 920 Comm: bash Not tainted 6.15.0-02023-gadbdb95c8696-dirty #1238 PREEMPT(undef)
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.1-2.fc37 04/01/2014
RIP: 0010:__bitmap_or+0x48/0x70
Call Trace:
<TASK>
__group_cpus_evenly+0x822/0x8c0
group_cpus_evenly+0x2d9/0x490
blk_mq_map_queues+0x1e/0x110
null_map_queues+0xc9/0x170 [null_blk]
blk_mq_update_queue_map+0xdb/0x160
blk_mq_update_nr_hw_queues+0x22b/0x560
nullb_update_nr_hw_queues+0x71/0xf0 [null_blk]
nullb_device_poll_queues_store+0xa4/0x130 [null_blk]
configfs_write_iter+0x109/0x1d0
vfs_write+0x26e/0x6f0
ksys_write+0x79/0x180
__x64_sys_write+0x1d/0x30
x64_sys_call+0x45c4/0x45f0
do_syscall_64+0xa5/0x240
entry_SYSCALL_64_after_hwframe+0x76/0x7e
Root cause is that numgrps is set to 0, and ZERO_SIZE_PTR is returned from
kcalloc(), and later ZERO_SIZE_PTR will be deferenced.
Fix the problem by checking numgrps first in group_cpus_evenly(), and
return NULL directly if numgrps is zero.
[yukuai3@huawei.com: also fix the non-SMP version]
Link: https://lkml.kernel.org/r/20250620010958.1265984-1-yukuai1@huaweicloud.com
Link: https://lkml.kernel.org/r/20250619132655.3318883-1-yukuai1@huaweicloud.com
Fixes: 6a6dcae8f4 ("blk-mq: Build default queue map via group_cpus_evenly()")
Signed-off-by: Yu Kuai <yukuai3@huawei.com>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Reviewed-by: Jens Axboe <axboe@kernel.dk>
Cc: ErKun Yang <yangerkun@huawei.com>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Thomas Gleinxer <tglx@linutronix.de>
Cc: "zhangyi (F)" <yi.zhang@huawei.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Currently, the in-kernel kunit test case timeout is 300 seconds. (There
is a separate timeout mechanism for the whole test execution in
kunit.py, but that's unrelated.) However, tests marked 'slow' or 'very
slow' may timeout, particularly on slower machines.
Implement a multiplier to the test-case timeout, so that slower tests
have longer to complete:
- DEFAULT -> 1x default timeout
- KUNIT_SPEED_SLOW -> 3x default timeout
- KUNIT_SPEED_VERY_SLOW -> 12x default timeout
A further change is planned to allow user configuration of the
default/base timeout to allow people with faster or slower machines to
adjust these to their use-cases.
Link: https://lore.kernel.org/r/20250614084711.2654593-2-davidgow@google.com
Signed-off-by: Ujwal Jain <ujwaljain@google.com>
Co-developed-by: David Gow <davidgow@google.com>
Signed-off-by: David Gow <davidgow@google.com>
Reviewed-by: Rae Moar <rmoar@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Add test cases for static and dynamic minor number allocation and
deallocation.
While at it, improve description and test suite name.
Some of the cases include:
- that static and dynamic allocation reserved the expected minors.
- that registering duplicate minors or duplicate names will fail.
- that failing to create a sysfs file (due to duplicate names) will
deallocate the dynamic minor correctly.
- that dynamic allocation does not allocate a minor number in the static
range.
- that there are no collisions when mixing dynamic and static allocations.
- that opening devices with various minor device numbers work.
- that registering a static number in the dynamic range won't conflict with
a dynamic allocation.
This last test verifies the bug fixed by commit 6d04d2b554 ("misc:
misc_minor_alloc to use ida for all dynamic/misc dynamic minors") has not
regressed.
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@igalia.com>
Link: https://lore.kernel.org/r/20250612-misc-dynrange-v5-1-6f35048f7273@igalia.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Add a simple stress test for lib/ratelimit.c
To run on x86:
./tools/testing/kunit/kunit.py run --arch x86_64 --kconfig_add CONFIG_RATELIMIT_KUNIT_TEST=y --kconfig_add CONFIG_SMP=y --qemu_args "-smp 4" lib_ratelimit
On a 16-CPU system, the "4" in "-smp 4" can be varied between 1 and 8.
Larger numbers have higher probabilities of introducing delays that
break the smoke test. In the extreme case, increasing the number to
larger than the number of CPUs in the underlying system is an excellent
way to get a test failure.
Link: https://lore.kernel.org/all/fbe93a52-365e-47fe-93a4-44a44547d601@paulmck-laptop/
Link: https://lore.kernel.org/all/20250423115409.3425-1-spasswolf@web.de/
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Cc: Petr Mladek <pmladek@suse.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Kuniyuki Iwashima <kuniyu@amazon.com>
Cc: Mateusz Guzik <mjguzik@gmail.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: John Ogness <john.ogness@linutronix.de>
Cc: Sergey Senozhatsky <senozhatsky@chromium.org>
Cc: Jon Pan-Doh <pandoh@google.com>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Karolina Stolarek <karolina.stolarek@oracle.com>
The selftest fails most of the times when running in qemu with
a kernel configured with CONFIG_HZ = 250:
> test_ratelimit_smoke: 1 callbacks suppressed
> # test_ratelimit_smoke: ASSERTION FAILED at lib/tests/test_ratelimit.c:28
> Expected ___ratelimit(&testrl, "test_ratelimit_smoke") == (false), but
> ___ratelimit(&testrl, "test_ratelimit_smoke") == 1 (0x1)
> (false) == 0 (0x0)
Try to make the test slightly more reliable by calling the problematic
ratelimit in the middle of the interval.
Signed-off-by: Petr Mladek <pmladek@suse.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Add a simple single-threaded smoke test for lib/ratelimit.c
To run on x86:
make ARCH=x86_64 mrproper
./tools/testing/kunit/kunit.py run --arch x86_64 --kconfig_add CONFIG_RATELIMIT_KUNIT_TEST=y --kconfig_add CONFIG_SMP=y lib_ratelimit
This will fail on old ___ratelimit(), and subsequent patches provide
the fixes that are required.
[ paulmck: Apply timeout and kunit feedback from Petr Mladek. ]
Link: https://lore.kernel.org/all/fbe93a52-365e-47fe-93a4-44a44547d601@paulmck-laptop/
Link: https://lore.kernel.org/all/20250423115409.3425-1-spasswolf@web.de/
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Kuniyuki Iwashima <kuniyu@amazon.com>
Cc: Mateusz Guzik <mjguzik@gmail.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: John Ogness <john.ogness@linutronix.de>
Cc: Sergey Senozhatsky <senozhatsky@chromium.org>
Cc: Jon Pan-Doh <pandoh@google.com>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Karolina Stolarek <karolina.stolarek@oracle.com>
With sanitizers enabled, this function uses a lot of stack, causing
a harmless warning:
lib/test_objagg.c: In function 'test_hints_case.constprop':
lib/test_objagg.c:994:1: error: the frame size of 1440 bytes is larger than 1408 bytes [-Werror=frame-larger-than=]
Most of this is from the two 'struct world' structures. Since most of
the work in this function is duplicated for the two, split it up into
separate functions that each use one of them.
The combined stack usage is still the same here, but there is no warning
any more, and the code is still safe because of the known call chain.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Link: https://patch.msgid.link/20250620111907.3395296-1-arnd@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
or aren't considered necessary for -stable kernels. Only 4 are for MM.
- The 3 patch series `Revert "bcache: update min_heap_callbacks to use
default builtin swap"' from Kuan-Wei Chiu backs out the author's recent
min_heap changes due to a performance regression. A fix for this
regression has been developed but we felt it best to go back to the
known-good version to give the new code more bake time.
- A lot of MAINTAINERS maintenance. I like to get these changes
upstreamed promptly because they can't break things and more
accurate/complete MAINTAINERS info hopefully improves the speed and
accuracy of our responses to submitters and reporters.
-----BEGIN PGP SIGNATURE-----
iHUEABYKAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCaFizWwAKCRDdBJ7gKXxA
jhivAQDGQXgzgzPCu/5/fTQjjq+D/8M2QjGxNy4o1itKoK+fYAEAzQGTL/8ay9FY
yhcipreU4A3lrxf94iOidiBCYkZaOgk=
=kFFb
-----END PGP SIGNATURE-----
Merge tag 'mm-hotfixes-stable-2025-06-22-18-52' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull misc fixes from Andrew Morton:
"20 hotfixes. 7 are cc:stable and the remainder address post-6.15
issues or aren't considered necessary for -stable kernels. Only 4 are
for MM.
- The series `Revert "bcache: update min_heap_callbacks to use
default builtin swap"' from Kuan-Wei Chiu backs out the author's
recent min_heap changes due to a performance regression.
A fix for this regression has been developed but we felt it best to
go back to the known-good version to give the new code more bake
time.
- A lot of MAINTAINERS maintenance.
I like to get these changes upstreamed promptly because they can't
break things and more accurate/complete MAINTAINERS info hopefully
improves the speed and accuracy of our responses to submitters and
reporters"
* tag 'mm-hotfixes-stable-2025-06-22-18-52' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
MAINTAINERS: add additional mmap-related files to mmap section
MAINTAINERS: add memfd, shmem quota files to shmem section
MAINTAINERS: add stray rmap file to mm rmap section
MAINTAINERS: add hugetlb_cgroup.c to hugetlb section
MAINTAINERS: add further init files to mm init block
MAINTAINERS: update maintainers for HugeTLB
maple_tree: fix MA_STATE_PREALLOC flag in mas_preallocate()
MAINTAINERS: add missing test files to mm gup section
MAINTAINERS: add missing mm/workingset.c file to mm reclaim section
selftests/mm: skip uprobe vma merge test if uprobes are not enabled
bcache: remove unnecessary select MIN_HEAP
Revert "bcache: remove heap-related macros and switch to generic min_heap"
Revert "bcache: update min_heap_callbacks to use default builtin swap"
selftests/mm: add configs to fix testcase failure
kho: initialize tail pages for higher order folios properly
MAINTAINERS: add linux-mm@ list to Kexec Handover
mm: userfaultfd: fix race of userfaultfd_move and swap cache
mm/gup: revert "mm: gup: fix infinite loop within __get_longterm_locked"
selftests/mm: increase timeout from 180 to 900 seconds
mm/shmem, swap: fix softlockup with mTHP swapin
and reinstates a Kconfig option to control the extra self-tests.
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEn51F/lCuNhUwmDeSxycdCkmxi6cFAmhSgQIACgkQxycdCkmx
i6eWig//aNg4YL30eTh41eTWTCiA1PLZpyOE2/Wz7q/Yg4M0Refn85A+tREm18q+
uwuZKAoFz8VaF0trqSQQ3PFzZaJWWRn0yLqeToxGyd7sY9kBh93FdQLub8wTxO0F
qDPLnAR+Gt7VAGcYSjhyB/TCsJ5h6oRN87qMIr8g807SiIB6mHiuXxJAAKy1U7OD
cXafp3HTkzUjgk/wbj7qSK6HJR3Cq3o/3JmsE/D7yvJRH1Bx7mNoiRpEX17CkgQX
qVZmLj8lE4HzFpTLKBAY8sXlzxscN+rHnS5WUhTqWL1hAI2b52p1moJPzT9QM/Zb
yI+x1DbO21Pvr4mZJ/hX18Y9VvTbea0hkD/wFD+hKJyQ9j70B8/bBeT/sOxKqDZn
0G1o9UyVTNdw4m2m/6lYJBgG0yiuD3hZID+Wjgq6lOsfoVBThU3CWq11NW98HQKz
0VUWztcG7JTqM1wUwwjlMXnm8+WKwiuYqYZCwBl8o0Ii29/Sm0pGMXtiDqmWFWLA
a4FJNFxiKEfVA95yRuRPfEM7KMwRWdw2C9YGe6hk3kcUbfDYSJykUme/USFzz8X8
5lmwWESNggggQEw9BxUAILIzRZwsDhCakgRjd11JRbNjrNTwXIbP9+nv+LH91mPK
zm5DJqyqSUVr2iXeQYYH/etyRsMX+dAuWPrFvvjuDBb8/fgEce4=
=6/TP
-----END PGP SIGNATURE-----
Merge tag 'v6.16-p5' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
Pull crypto fixes from Herbert Xu:
"This fixes a regression in ahash (broken fallback finup) and
reinstates a Kconfig option to control the extra self-tests"
* tag 'v6.16-p5' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
crypto: ahash - Fix infinite recursion in ahash_def_finup
crypto: testmgr - reinstate kconfig control over full self-tests
Temporarily clear the preallocation flag when explicitly requesting
allocations. Pre-existing allocations are already counted against the
request through mas_node_count_gfp(), but the allocations will not happen
if the MA_STATE_PREALLOC flag is set. This flag is meant to avoid
re-allocating in bulk allocation mode, and to detect issues with
preallocation calculations.
The MA_STATE_PREALLOC flag should also always be set on zero allocations
so that detection of underflow allocations will print a WARN_ON() during
consumption.
User visible effect of this flaw is a WARN_ON() followed by a null pointer
dereference when subsequent requests for larger number of nodes is
ignored, such as the vma merge retry in mmap_region() caused by drivers
altering the vma flags (which happens in v6.6, at least)
Link: https://lkml.kernel.org/r/20250616184521.3382795-3-Liam.Howlett@oracle.com
Fixes: 54a611b605 ("Maple Tree: add new data structure")
Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Reported-by: Zhaoyang Huang <zhaoyang.huang@unisoc.com>
Reported-by: Hailong Liu <hailong.liu@oppo.com>
Link: https://lore.kernel.org/all/1652f7eb-a51b-4fee-8058-c73af63bacd1@oppo.com/
Link: https://lore.kernel.org/all/20250428184058.1416274-1-Liam.Howlett@oracle.com/
Link: https://lore.kernel.org/all/20250429014754.1479118-1-Liam.Howlett@oracle.com/
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Hailong Liu <hailong.liu@oppo.com>
Cc: zhangpeng.00@bytedance.com <zhangpeng.00@bytedance.com>
Cc: Steve Kang <Steve.Kang@unisoc.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Sidhartha Kumar <sidhartha.kumar@oracle.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Now that we have dentries and the ability to create meaningful symlinks
to them, don't keep a name string in each tracker. Switch the output
format to print "class@address", and drop the name field.
Also, add a kerneldoc header for ref_tracker_dir_init().
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Link: https://patch.msgid.link/20250618-reftrack-dbgfs-v15-9-24fc37ead144@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Add the ability for a subsystem to add a user-friendly symlink that
points to a ref_tracker_dir's debugfs file. Add a separate
debugfs_symlinks xarray and use that to track symlinks. The reaper
workqueue job will remove symlinks before their corresponding dentries.
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Link: https://patch.msgid.link/20250618-reftrack-dbgfs-v15-7-24fc37ead144@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Currently, there is no convenient way to see the info that the
ref_tracking infrastructure collects. Attempt to create a file in
debugfs when called from ref_tracker_dir_init().
The file is given the name "class@%px", as having the unmodified address
is helpful for debugging. This should be safe since this directory is only
accessible by root
While ref_tracker_dir_init() is generally called from a context where
sleeping is OK, ref_tracker_dir_exit() can be called from anywhere.
Thus, dentry cleanup must be handled asynchronously.
Add a new global xarray that has entries with the ref_tracker_dir
pointer as the index and the corresponding debugfs dentry pointer as the
value. Instead of removing the debugfs dentry, have
ref_tracker_dir_exit() set a mark on the xarray entry and schedule a
workqueue job. The workqueue job then walks the xarray looking for
marked entries, and removes their xarray entries and the debugfs
dentries.
Because of this, the debugfs dentry can outlive the corresponding
ref_tracker_dir. Have ref_tracker_debugfs_show() take extra care to
ensure that it's safe to dereference the dir pointer before using it.
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Link: https://patch.msgid.link/20250618-reftrack-dbgfs-v15-6-24fc37ead144@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Allow pr_ostream to also output directly to a seq_file without an
intermediate buffer. The first caller of +ref_tracker_dir_seq_print()
will come in a later patch, so mark that __maybe_unused for now. That
designation will be removed once it is used.
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Link: https://patch.msgid.link/20250618-reftrack-dbgfs-v15-5-24fc37ead144@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
A later patch in the series will be adding debugfs files for each
ref_tracker that get created in ref_tracker_dir_init(). The format will
be "class@%px". The current "name" string can vary between
ref_tracker_dir objects of the same type, so it's not suitable for this
purpose.
Add a new "class" string to the ref_tracker dir that describes the
the type of object (sans any individual info for that object).
Also, in the i915 driver, gate the creation of debugfs files on whether
the dentry pointer is still set to NULL. CI has shown that the
ref_tracker_dir can be initialized more than once.
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Link: https://patch.msgid.link/20250618-reftrack-dbgfs-v15-4-24fc37ead144@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
In a later patch, we'll be adding a 3rd mechanism for outputting
ref_tracker info via seq_file. Instead of a conditional, have the caller
set a pointer to an output function in struct ostream. As part of this,
the log prefix must be explicitly passed in, as it's too late for the
pr_fmt macro.
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Link: https://patch.msgid.link/20250618-reftrack-dbgfs-v15-3-24fc37ead144@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Add a new "ref_tracker" directory in debugfs. Each individual refcount
tracker can register files under there to display info about
currently-held references.
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Krzysztof Karas <krzysztof.karas@intel.com>
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Link: https://patch.msgid.link/20250618-reftrack-dbgfs-v15-2-24fc37ead144@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
requires adding support for a number of new FW commands so
it's quite large in terms of LoC. The rest is relatively small.
Current release - fix to a fix:
- ptp: fix breakage after ptp_vclock_in_use() rework
Current release - regressions:
- openvswitch: allocate struct ovs_pcpu_storage dynamically, static
allocation may exhaust module loader limit on smaller systems
Previous releases - regressions:
- tcp: fix tcp_packet_delayed() for peers with no selective ACK support
Previous releases - always broken:
- wifi: ath12k: don't activate more links than firmware supports
- tcp: make sure sockets open via passive TFO have valid NAPI ID
- eth: bnxt_en: update MRU and RSS table of RSS contexts on queue reset,
prevent Rx queues from silently hanging after queue reset
- NFC: uart: set tty->disc_data only in success path
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEE6jPA+I1ugmIBA4hXMUZtbf5SIrsFAmhUPLgACgkQMUZtbf5S
IrvrXxAAvREuPy2BhCrQW77TeVW2ikBlGYYp+dCIL4r4kfqquHU3C+R/GZ5yRbcA
fiRMkRv7J4xhHOT8AnFmFiWGz6QEJfESy3jSQUfcLssHeandEjJbcZ0qVunEpE4e
rt0leygfY+axEP3ewoGQEP7Ds2Ku+L3BrIetwsUZPFo9pVAtGJ4XOV56SRYPP7vL
7toaHcxXtx5vL9Wlb9vulom9kfZU4/hfDMS9/FES+aE4QnRDB+sh0v6wL4pocLrj
7swSqWCY7vjd7vJck4Ta/O9+bL6kz1wG4Qvv5enXGJlksiBgCpShfbzGMZTbegeP
8gCj4heAATCRmzCmfkRk5sWzPqedtLI5rzb7jUYYF4wiZpJwjwLzrDxopazFogXs
kKDCD3lxDmUTV4AkT/5hn8Rlf4CApMrhPs+zhzQQ620Sj/G3k2lyepKmKjOUSAZ4
fEFKmP9Mzt3fKt2sOi5ETlrpuiSv5CCAZfnCg7UoBXrQM1Rxyn+utLMeGM6oHXRa
iH9GwVN2cycEXl16y8HVCui8lzBYyZKVbyiRBM6k3Pl6ZozDuHvCRmHgs8g9pTja
3hGe2wjYcsi+HKrF1zDQrTJbhKzsNMpJej0KIZ06xjyXAgr1VV0noEq/iQsPEj4z
033DotqhyoFXc6ERmOaca8LfE2kHolwT+0leO+i0sbdmVlSFxB8=
=2mAa
-----END PGP SIGNATURE-----
Merge tag 'net-6.16-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Jakub Kicinski:
"Including fixes from wireless.
The ath12k fix to avoid FW crashes requires adding support for a
number of new FW commands so it's quite large in terms of LoC. The
rest is relatively small.
Current release - fix to a fix:
- ptp: fix breakage after ptp_vclock_in_use() rework
Current release - regressions:
- openvswitch: allocate struct ovs_pcpu_storage dynamically, static
allocation may exhaust module loader limit on smaller systems
Previous releases - regressions:
- tcp: fix tcp_packet_delayed() for peers with no selective ACK
support
Previous releases - always broken:
- wifi: ath12k: don't activate more links than firmware supports
- tcp: make sure sockets open via passive TFO have valid NAPI ID
- eth: bnxt_en: update MRU and RSS table of RSS contexts on queue
reset, prevent Rx queues from silently hanging after queue reset
- NFC: uart: set tty->disc_data only in success path"
* tag 'net-6.16-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (59 commits)
net: airoha: Differentiate hwfd buffer size for QDMA0 and QDMA1
net: airoha: Compute number of descriptors according to reserved memory size
tools: ynl: fix mixing ops and notifications on one socket
net: atm: fix /proc/net/atm/lec handling
net: atm: add lec_mutex
mlxbf_gige: return EPROBE_DEFER if PHY IRQ is not available
net: airoha: Always check return value from airoha_ppe_foe_get_entry()
NFC: nci: uart: Set tty->disc_data only in success path
calipso: Fix null-ptr-deref in calipso_req_{set,del}attr().
MAINTAINERS: Remove Shannon Nelson from MAINTAINERS file
net: lan743x: fix potential out-of-bounds write in lan743x_ptp_io_event_clock_get()
eth: fbnic: avoid double free when failing to DMA-map FW msg
tcp: fix passive TFO socket having invalid NAPI ID
selftests: net: add test for passive TFO socket NAPI ID
selftests: net: add passive TFO test binary
selftests: netdevsim: improve lib.sh include in peer.sh
tipc: fix null-ptr-deref when acquiring remote ip of ethernet bearer
Octeontx2-pf: Fix Backpresure configuration
net: ftgmac100: select FIXED_PHY
net: ethtool: remove duplicate defines for family info
...
- Fix a regression in the arm64 Poly1305 code
- Fix a couple compiler warnings
-----BEGIN PGP SIGNATURE-----
iIoEABYIADIWIQSacvsUNc7UX4ntmEPzXCl4vpKOKwUCaFMXvhQcZWJpZ2dlcnNA
Z29vZ2xlLmNvbQAKCRDzXCl4vpKOK1wvAP4mnXzTFhaGjaz/ZXwnD++H69s/rKH+
EKrNpcqYiwnn2wD/RLREXAnkrU2WRZCSI4H1D2J9NbQe+X9xZ7ieJpsPNgM=
=AXpO
-----END PGP SIGNATURE-----
Merge tag 'libcrypto-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux
Pull crypto library fixes from Eric Biggers:
- Fix a regression in the arm64 Poly1305 code
- Fix a couple compiler warnings
* tag 'libcrypto-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux:
lib/crypto/poly1305: Fix arm64's poly1305_blocks_arch()
lib/crypto/curve25519-hacl64: Disable KASAN with clang-17 and older
lib/crypto: Annotate crypto strings with nonstring
__kunit_activate_static_stub() works effectively as
kunit_deactivate_static_stub() if `replacement_addr` is NULL.
Add a test case to catch the issue discovered in
commit 772e50a76e ("kunit: Fix wrong parameter to kunit_deactivate_static_stub()").
For running the test:
$ ./tools/testing/kunit/kunit.py run \
--arch=x86_64 \
kunit_stub
Fixed change log:
Shuah Khan <skhan@linuxfoundation.org>
Link: https://lore.kernel.org/r/20250612084834.587576-1-tzungbi@kernel.org
Signed-off-by: Tzung-Bi Shih <tzungbi@kernel.org>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Remove include <linux/export.h> from all files which do not contain an
EXPORT_SYMBOL().
See commit 7d95680d64 ("scripts/misc-check: check unnecessary #include
<linux/export.h> when W=1") for more details.
Acked-by: Alexander Gordeev <agordeev@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
pldmfw calls crc32 code and depends on it being enabled, else
there is a link error as follows. So PLDMFW should select CRC32.
lib/pldmfw/pldmfw.o: In function `pldmfw_flash_image':
pldmfw.c:(.text+0x70f): undefined reference to `crc32_le_base'
This problem was introduced by commit b8265621f4 ("Add pldmfw library
for PLDM firmware update").
It manifests as of commit d69ea414c9 ("ice: implement device flash
update via devlink").
And is more likely to occur as of commit 9ad19171b6 ("lib/crc: remove
unnecessary prompt for CONFIG_CRC32 and drop 'default y'").
Found by chance while exercising builds based on tinyconfig.
Fixes: b8265621f4 ("Add pldmfw library for PLDM firmware update")
Signed-off-by: Simon Horman <horms@kernel.org>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://patch.msgid.link/20250613-pldmfw-crc32-v1-1-f3fad109eee6@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
After commit 6f110a5e4f ("Disable SLUB_TINY for build testing"), which
causes CONFIG_KASAN to be enabled in allmodconfig again, arm64
allmodconfig builds with clang-17 and older show an instance of
-Wframe-larger-than (which breaks the build with CONFIG_WERROR=y):
lib/crypto/curve25519-hacl64.c:757:6: error: stack frame size (2336) exceeds limit (2048) in 'curve25519_generic' [-Werror,-Wframe-larger-than]
757 | void curve25519_generic(u8 mypublic[CURVE25519_KEY_SIZE],
| ^
When KASAN is disabled, the stack usage is roughly quartered:
lib/crypto/curve25519-hacl64.c:757:6: error: stack frame size (608) exceeds limit (128) in 'curve25519_generic' [-Werror,-Wframe-larger-than]
757 | void curve25519_generic(u8 mypublic[CURVE25519_KEY_SIZE],
| ^
Using '-Rpass-analysis=stack-frame-layout' shows the following variables
and many, many 8-byte spills when KASAN is enabled:
Offset: [SP-144], Type: Variable, Align: 8, Size: 40
Offset: [SP-464], Type: Variable, Align: 8, Size: 320
Offset: [SP-784], Type: Variable, Align: 8, Size: 320
Offset: [SP-864], Type: Variable, Align: 32, Size: 80
Offset: [SP-896], Type: Variable, Align: 32, Size: 32
Offset: [SP-1016], Type: Variable, Align: 8, Size: 120
When KASAN is disabled, there are still spills but not at many and the
variables list is smaller:
Offset: [SP-192], Type: Variable, Align: 32, Size: 80
Offset: [SP-224], Type: Variable, Align: 32, Size: 32
Offset: [SP-344], Type: Variable, Align: 8, Size: 120
Disable KASAN for this file when using clang-17 or older to avoid
blowing out the stack, clearing up the warning.
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Acked-by: "Jason A. Donenfeld" <Jason@zx2c4.com>
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250609-curve25519-hacl64-disable-kasan-clang-v1-1-08ea0ac5ccff@kernel.org
Signed-off-by: Eric Biggers <ebiggers@google.com>
Annotate various keys, ivs, and other byte arrays with __nonstring so
that static initializers will not complain about truncating the trailing
NUL byte under GCC 15 with -Wunterminated-string-initialization enabled.
Silences many warnings like:
../lib/crypto/aesgcm.c:642:27: warning: initializer-string for array of 'unsigned char' truncates NUL terminator but destination lacks 'nonstring' attribute (13 chars into 12 available) [-Wunterminated-string-initialization]
642 | .iv = "\xca\xfe\xba\xbe\xfa\xce\xdb\xad"
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Signed-off-by: Kees Cook <kees@kernel.org>
Link: https://lore.kernel.org/r/20250529173113.work.760-kees@kernel.org
Signed-off-by: Eric Biggers <ebiggers@google.com>
Commit 698de82278 ("crypto: testmgr - make it easier to enable the
full set of tests") removed support for building kernels that run only
the "fast" set of crypto self-tests by default. This assumed that
nearly everyone actually wanted the full set of tests, *if* they had
already chosen to enable the tests at all.
Unfortunately, it turns out that both Debian and Fedora intentionally
have the crypto self-tests enabled in their production kernels. And for
production kernels we do need to keep the testing time down, which
implies just running the "fast" tests, not the full set of tests.
For Fedora, a reason for enabling the tests in production is that they
are being (mis)used to meet the FIPS 140-3 pre-operational testing
requirement.
However, the other reason for enabling the tests in production, which
applies to both distros, is that they provide some value in protecting
users from buggy drivers. Unfortunately, the crypto/ subsystem has many
buggy and untested drivers for off-CPU hardware accelerators on rare
platforms. These broken drivers get shipped to users, and there have
been multiple examples of the tests preventing these buggy drivers from
being used. So effectively, the tests are being relied on in production
kernels. I think this is kind of crazy (untested drivers should just
not be enabled at all), but that seems to be how things work currently.
Thus, reintroduce a kconfig option that controls the level of testing.
Call it CRYPTO_SELFTESTS_FULL instead of the original name
CRYPTO_MANAGER_EXTRA_TESTS, which was slightly misleading.
Moreover, given the "production kernel" use case, make CRYPTO_SELFTESTS
depend on EXPERT instead of DEBUG_KERNEL.
I also haven't reinstated all the #ifdefs in crypto/testmgr.c. Instead,
just rely on the compiler to optimize out unused code.
Fixes: 40b9969796 ("crypto: testmgr - replace CRYPTO_MANAGER_DISABLE_TESTS with CRYPTO_SELFTESTS")
Fixes: 698de82278 ("crypto: testmgr - make it easier to enable the full set of tests")
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Allow configurability of the inclusion of more detailed
WARN_ON() strings, to be implemented in subsequent
commits.
Since the full cost will be around 100K more memory on
an x86 defconfig, disable it by default.
Provide the WARN_CONDITION_STR() macro to allow the conditional
passing of extra strings to lower level BUG/WARN handlers.
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: linux-arch@vger.kernel.org
Link: https://lore.kernel.org/r/20250515124644.2958810-4-mingo@kernel.org
When running the raid6 user-space test program on RISC-V QEMU, there's a
segmentation fault which seems caused by accessing a NULL pointer,
which is the pointer variable p/q in raid6_rvv*_gen/xor_syndrome_real(),
p/q should have been equal to dptr[x], but when I use GDB command to
see its value, which was 0x10 like below:
"
Program received signal SIGSEGV, Segmentation fault.
0x0000000000011062 in raid6_rvv2_xor_syndrome_real (disks=<optimized out>, start=0, stop=<optimized out>, bytes=4096, ptrs=<optimized out>) at rvv.c:386
(gdb) p p
$1 = (u8 *) 0x10 <error: Cannot access memory at address 0x10>
"
The issue was found to be related with:
1) Compile optimization
There's no segmentation fault if compiling the raid6test program with
the optimization flag -O0.
2) The RISC-V vector command vsetvli
If not used t0 as the first parameter in vsetvli, there's no
segmentation fault either.
This patch selects the 2nd solution to fix the issue.
[Palmer: The actual issue here is a missing clobber in the vsetvli code.
It's a little tricky: we've already probed for VLENB so we don't need to
look at the output register, we just need to have an X register in the
instruction as that's the form required to actually set VL. Thus we
clobber a register, and without describing that we end up breaking
compilers.]
Fixes: 6093faaf95 ("raid6: Add RISC-V SIMD syndrome and recovery calculations")
Signed-off-by: Chunyan Zhang <zhangchunyan@iscas.ac.cn>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20250610101234.1100660-3-zhangchunyan@iscas.ac.cn
Signed-off-by: Palmer Dabbelt <palmer@dabbelt.com>
Some system owners use slab_debug=FPZ (or similar) as a hardening option,
but do not want to be forced into having kernel addresses exposed due
to the implicit "no_hash_pointers" boot param setting.[1]
Introduce the "hash_pointers" boot param, which defaults to "auto"
(the current behavior), but also includes "always" (forcing on hashing
even when "slab_debug=..." is defined), and "never". The existing
"no_hash_pointers" boot param becomes an alias for "hash_pointers=never".
This makes it possible to boot with "slab_debug=FPZ hash_pointers=always".
Link: https://github.com/KSPP/linux/issues/368 [1]
Fixes: 792702911f ("slub: force on no_hash_pointers when slub_debug is enabled")
Co-developed-by: Sergio Perez Gonzalez <sperezglz@gmail.com>
Signed-off-by: Sergio Perez Gonzalez <sperezglz@gmail.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: David Rientjes <rientjes@google.com>
Reviewed-by: Bagas Sanjaya <bagasdotme@gmail.com>
Signed-off-by: Kees Cook <kees@kernel.org>
Reviewed-by: Harry Yoo <harry.yoo@oracle.com>
Acked-by: Rafael Aquini <raquini@redhat.com>
Tested-by: Petr Mladek <pmladek@suse.com>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Link: https://patch.msgid.link/20250415170232.it.467-kees@kernel.org
[kees@kernel.org: Add note about hash_pointers into slab_debug kernel parameter documentation.]
Signed-off-by: Petr Mladek <pmladek@suse.com>
or aren't considered necessary for -stable kernels. 11 are for MM.
-----BEGIN PGP SIGNATURE-----
iHUEABYKAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCaENzlAAKCRDdBJ7gKXxA
joNYAP9n38QNDUoRR6ChFikzzY77q4alD2NL0aqXBZdcSRXoUgEAlQ8Ea+t6xnzp
GnH+cnsA6FDp4F6lIoZBdENJyBYrkQE=
=ud9O
-----END PGP SIGNATURE-----
Merge tag 'mm-hotfixes-stable-2025-06-06-16-02' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull misc fixes from Andrew Morton:
"13 hotfixes.
6 are cc:stable and the remainder address post-6.15 issues or aren't
considered necessary for -stable kernels. 11 are for MM"
* tag 'mm-hotfixes-stable-2025-06-06-16-02' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
kernel/rcu/tree_stall: add /sys/kernel/rcu_stall_count
MAINTAINERS: add mm swap section
kmsan: test: add module description
MAINTAINERS: add tlb trace events to MMU GATHER AND TLB INVALIDATION
mm/hugetlb: fix huge_pmd_unshare() vs GUP-fast race
mm/hugetlb: unshare page tables during VMA split, not before
MAINTAINERS: add Alistair as reviewer of mm memory policy
iov_iter: use iov_offset for length calculation in iov_iter_aligned_bvec
mm/mempolicy: fix incorrect freeing of wi_kobj
alloc_tag: handle module codetag load errors as module load failures
mm/madvise: handle madvise_lock() failure during race unwinding
mm: fix vmstat after removing NR_BOUNCE
KVM: s390: rename PROT_NONE to PROT_TYPE_DUMMY
* Support for the FWFT SBI extension, which is part of SBI 3.0 and a
dependency for many new SBI and ISA extensions.
* Support for getrandom() in the VDSO.
* Support for mseal.
* Optimized routines for raid6 syndrome and recovery calculations.
* kexec_file() supports loading Image-formatted kernel binaries.
* Improvements to the instruction patching framework to allow for atomic
instruction patching, along with rules as to how systems need to
behave in order to function correctly.
* Support for a handful of new ISA extensions: Svinval, Zicbop, Zabha,
some SiFive vendor extensions.
* Various fixes and cleanups, including: misaligned access handling, perf
symbol mangling, module loading, PUD THPs, and improved uaccess
routines.
-----BEGIN PGP SIGNATURE-----
iQJNBAABCAA3FiEEKzw3R0RoQ7JKlDp6LhMZ81+7GIkFAmhDLP8ZHHBhbG1lcmRh
YmJlbHRAZ29vZ2xlLmNvbQAKCRAuExnzX7sYiZhFD/4+Zikkld812VjFb9dTF+Wj
n/x9h86zDwAEFgf2BMIpUQhHru6vtdkO2l/Ky6mQblTPMWLafF4eK85yCsf84sQ0
+RX4sOMLZ0+qvqxKX+aOFe9JXOWB0QIQuPvgBfDDOV4UTm60sglIxwqOpKcsBEHs
2nplXXjiv0ckaMFLos8xlwu1uy4A/jMfT3Y9FDcABxYCqBoKOZ1frcL9ezJZbHbv
BoOKLDH8ZypFxIG/eQ511lIXXtrnLas0l4jHWjrfsWu6pmXTgJasKtbGuH3LoLnM
G/4qvHufR6lpVUOIL5L0V6PpsmYwDi/ciFIFlc8NH2oOZil3qiVaGSEbJIkWGFu9
8lWTXQWnbinZbfg2oYbWp8GlwI70vKomtDyYNyB9q9Cq9jyiTChMklRNODr4764j
ZiEnzc/l4KyvaxUg8RLKCT595lKECiUDnMytbIbunJu05HBqRCoGpBtMVzlQsyUd
ybkRt3BA7eOR8/xFA7ZZQeJofmiu2yxkBs5ggMo8UnSragw27hmv/OA0mWMXEuaD
aaWc4ZKpKqf7qLchLHOvEl5ORUhsisyIJgZwOqdme5rQoWorVtr51faA4AKwFAN4
vcKgc5qJjK8vnpW+rl3LNJF9LtH+h4TgmUI853vUlukPoH2oqRkeKVGSkxG0iAze
eQy2VjP1fJz6ciRtJZn9aw==
=cZGy
-----END PGP SIGNATURE-----
Merge tag 'riscv-for-linus-6.16-mw1' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux
Pull RISC-V updates from Palmer Dabbelt:
- Support for the FWFT SBI extension, which is part of SBI 3.0 and a
dependency for many new SBI and ISA extensions
- Support for getrandom() in the VDSO
- Support for mseal
- Optimized routines for raid6 syndrome and recovery calculations
- kexec_file() supports loading Image-formatted kernel binaries
- Improvements to the instruction patching framework to allow for
atomic instruction patching, along with rules as to how systems need
to behave in order to function correctly
- Support for a handful of new ISA extensions: Svinval, Zicbop, Zabha,
some SiFive vendor extensions
- Various fixes and cleanups, including: misaligned access handling,
perf symbol mangling, module loading, PUD THPs, and improved uaccess
routines
* tag 'riscv-for-linus-6.16-mw1' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: (69 commits)
riscv: uaccess: Only restore the CSR_STATUS SUM bit
RISC-V: vDSO: Wire up getrandom() vDSO implementation
riscv: enable mseal sysmap for RV64
raid6: Add RISC-V SIMD syndrome and recovery calculations
riscv: mm: Add support for Svinval extension
RISC-V: Documentation: Add enough title underlines to CMODX
riscv: Improve Kconfig help for RISCV_ISA_V_PREEMPTIVE
MAINTAINERS: Update Atish's email address
riscv: uaccess: do not do misaligned accesses in get/put_user()
riscv: process: use unsigned int instead of unsigned long for put_user()
riscv: make unsafe user copy routines use existing assembly routines
riscv: hwprobe: export Zabha extension
riscv: Make regs_irqs_disabled() more clear
perf symbols: Ignore mapping symbols on riscv
RISC-V: Kconfig: Fix help text of CMDLINE_EXTEND
riscv: module: Optimize PLT/GOT entry counting
riscv: Add support for PUD THP
riscv: xchg: Prefetch the destination word for sc.w
riscv: Add ARCH_HAS_PREFETCH[W] support with Zicbop
riscv: Add support for Zicbop
...
If iov_offset is non-zero, then we need to consider iov_offset in length
calculation, otherwise we might pass smaller IOs such as 512 bytes, in
below scenario [1].
This issue is reproducible using lib-uring test/fixed-seg.c application
with fixed buffer on a 512 LBA formatted device.
[1]
At present we pass the alignment check, for 512 LBA formatted devices,
len_mask = 511 when IO is smaller, i->count = 512 has an offset,
i->io_offset = 3584 with bvec values, bvec->bv_offset = 256,
bvec->bv_len = 3840. In short, the first 256 bytes are in the current
page, next 256 bytes are in the another page. Ideally we expect to
fail the IO.
I can think of 2 userspace scenarios where we experience this.
a: From userspace, we observe a different behaviour when device LBA
size is 512 vs 4096 bytes. For 4096 LBA formatted device, I see the
same liburing test [2] failing, whereas 512 the test passes without
this. This is reproducible everytime.
[2] https://github.com/axboe/liburing/
b: Although I was not able to reproduce the below condition, but I
suspect below case should be possible from user space for devices
with 512 LBA formatted device. Lets say from userspace while
allocating a virtually single chunk of memory, if we get 2 physical
chunk of memory, and IO happens to be at the boundary of first
physical chunk with length crossing first chunk, then we allow IOs
to proceed and hence we might map wrong physical address length and
proceed with IO rather than failing.
: --- a/test/fixed-seg.c
: +++ b/test/fixed-seg.c
: @@ -64,7 +64,7 @@ static int test(struct io_uring *ring, int fd, int
: vec_off)
: return T_EXIT_FAIL;
: }
:
: - ret = read_it(ring, fd, 4096, vec_off);
: + ret = read_it(ring, fd, 4096, 7*512 + 256);
: if (ret) {
: fprintf(stderr, "4096 0 failed\n");
: return T_EXIT_FAIL;
Effectively this is a write crossing the page boundary.
Link: https://lkml.kernel.org/r/20250428095849.11709-1-nj.shetty@samsung.com
Fixes: 2263639f96 ("iov_iter: streamline iovec/bvec alignment iteration")
Reviewed-by: Jens Axboe <axboe@kernel.dk>
Reviewed-by: Anuj Gupta <anuj20.g@samsung.com>
Signed-off-by: Nitesh Shetty <nj.shetty@samsung.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Keith Busch <kbusch@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Failures inside codetag_load_module() are currently ignored. As a result
an error there would not cause a module load failure and freeing of the
associated resources. Correct this behavior by propagating the error code
to the caller and handling possible errors. With this change, error to
allocate percpu counters, which happens at this stage, will not be ignored
and will cause a module load failure and freeing of resources. With this
change we also do not need to disable memory allocation profiling when
this error happens, instead we fail to load the module.
Link: https://lkml.kernel.org/r/20250521160602.1940771-1-surenb@google.com
Fixes: 10075262888b ("alloc_tag: allocate percpu counters for module tags dynamically")
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Reported-by: Casey Chen <cachen@purestorage.com>
Closes: https://lore.kernel.org/all/20250520231620.15259-1-cachen@purestorage.com/
Cc: Daniel Gomez <da.gomez@samsung.com>
Cc: David Wang <00107082@163.com>
Cc: Kent Overstreet <kent.overstreet@linux.dev>
Cc: Luis Chamberalin <mcgrof@kernel.org>
Cc: Petr Pavlu <petr.pavlu@suse.com>
Cc: Sami Tolvanen <samitolvanen@google.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Bitmap updates for 6.16-rc1 include:
- dead code cleanups for cpumasks and nodemasks (me);
- fixed-width flavors of GENMASK() and BIT() (Vincent, Lucas and me);
- FIELD_MODIFY() helper (Luo);
- for_each_node_with_cpus() optimization (me);
- bitmap-str fixes (Andy).
-----BEGIN PGP SIGNATURE-----
iQGzBAABCgAdFiEEi8GdvG6xMhdgpu/4sUSA/TofvsgFAmg43+EACgkQsUSA/Tof
vsjn9wwAgQmfIgANEcFo2Gz3OjNIGHRyolG+dXYpf3OytlnbMlTnMev+LASl9VXL
LML+2gZsQtMff1Ac7VJzufg5OHg2tTKyozWFPOG5mc+ysBL24d0Pd/Ki8mLlbbTl
EQMu5uExXP4qja1vHLF+kEEYMCzWVTRwOE1H/nauu/OUonuOFLlIiXhgJHrmO22Z
qHzBAto2bqE/Jy6OneYWiXLtIOcJhfoS2a+Xczf4cBZH6nqeepg7Vnudmd10IUJc
BlXcD4Qt/uG6MTS96knNslcgV1Q5Nfkch9JPLu/bzO34nR0VBB1VqMos37fAHQln
RhSxd6LdqGxA5ZafivN5YIwHrCJIr7yi6m+92kazX9baqf1Lh7opAK4NgV2o7Rm9
v9SlX+Rcb+HyyoLI7Fh+hVyNdrymbAo4KDzZDk+yHuAYGKHdxgeK5D2GTtpm6ZiJ
u515P6zFi7h2jPMTCOWwCBwpeHIDL1hyf9St5yvtDEEeWGRy3y7MkRmxnI4QVlTN
b0in+YW5
=vATl
-----END PGP SIGNATURE-----
Merge tag 'bitmap-for-6.16-rc1' of https://github.com/norov/linux
Pull bitmap updates from Yury Norov:
- dead code cleanups for cpumasks and nodemasks (me)
- fixed-width flavors of GENMASK() and BIT() (Vincent, Lucas and me)
- FIELD_MODIFY() helper (Luo)
- for_each_node_with_cpus() optimization (me)
- bitmap-str fixes (Andy)
* tag 'bitmap-for-6.16-rc1' of https://github.com/norov/linux:
topology: make for_each_node_with_cpus() O(N)
bitfield: Add FIELD_MODIFY() helper
bitmap-str: Add missing header(s)
bitmap-str: Get rid of 'extern' for function prototypes
build_bug.h: more user friendly error messages in BUILD_BUG_ON_ZERO()
test_bits: add tests for BIT_U*()
test_bits: add tests for GENMASK_U*()
drm/i915: Convert REG_GENMASK*() to fixed-width GENMASK_U*()
bits: introduce fixed-type BIT_U*()
bits: introduce fixed-type GENMASK_U*()
bits: add comments and newlines to #if, #else and #endif directives
cpumask: drop cpumask_assign_cpu()
riscv: switch set_icache_stale_mask() to using non-atomic assign_cpu()
cpumask: add non-atomic __assign_cpu()
nodemask: drop nodes_shift
Sergey Senozhatsky adds infrastructure for passing algorithm-specific
parameters into zram. A single parameter `winbits' is implemented at
this time.
- The 5 patch series "memcg: nmi-safe kmem charging" from Shakeel Butt
makes memcg charging nmi-safe, which is required by BFP, which can
operate in NMI context.
- The 5 patch series "Some random fixes and cleanup to shmem" from
Kemeng Shi implements small fixes and cleanups in the shmem code.
- The 2 patch series "Skip mm selftests instead when kernel features are
not present" from Zi Yan fixes some issues in the MM selftest code.
- The 2 patch series "mm/damon: build-enable essential DAMON components
by default" from SeongJae Park reworks DAMON Kconfig to make it easier
to enable CONFIG_DAMON.
- The 2 patch series "sched/numa: add statistics of numa balance task
migration" from Libo Chen adds more info into sysfs and procfs files to
improve visibility into the NUMA balancer's task migration activity.
- The 4 patch series "selftests/mm: cow and gup_longterm cleanups" from
Mark Brown provides various updates to some of the MM selftests to make
them play better with the overall containing framework.
-----BEGIN PGP SIGNATURE-----
iHUEABYKAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCaDzA9wAKCRDdBJ7gKXxA
js8sAP9V3COg+vzTmimzP3ocTkkbbIJzDfM6nXpE2EQ4BR3ejwD+NsIT2ZLtTF6O
LqAZpgO7ju6wMjR/lM30ebCq5qFbZAw=
=oruw
-----END PGP SIGNATURE-----
Merge tag 'mm-stable-2025-06-01-14-06' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull more MM updates from Andrew Morton:
- "zram: support algorithm-specific parameters" from Sergey Senozhatsky
adds infrastructure for passing algorithm-specific parameters into
zram. A single parameter `winbits' is implemented at this time.
- "memcg: nmi-safe kmem charging" from Shakeel Butt makes memcg
charging nmi-safe, which is required by BFP, which can operate in NMI
context.
- "Some random fixes and cleanup to shmem" from Kemeng Shi implements
small fixes and cleanups in the shmem code.
- "Skip mm selftests instead when kernel features are not present" from
Zi Yan fixes some issues in the MM selftest code.
- "mm/damon: build-enable essential DAMON components by default" from
SeongJae Park reworks DAMON Kconfig to make it easier to enable
CONFIG_DAMON.
- "sched/numa: add statistics of numa balance task migration" from Libo
Chen adds more info into sysfs and procfs files to improve visibility
into the NUMA balancer's task migration activity.
- "selftests/mm: cow and gup_longterm cleanups" from Mark Brown
provides various updates to some of the MM selftests to make them
play better with the overall containing framework.
* tag 'mm-stable-2025-06-01-14-06' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (43 commits)
mm/khugepaged: clean up refcount check using folio_expected_ref_count()
selftests/mm: fix test result reporting in gup_longterm
selftests/mm: report unique test names for each cow test
selftests/mm: add helper for logging test start and results
selftests/mm: use standard ksft_finished() in cow and gup_longterm
selftests/damon/_damon_sysfs: skip testcases if CONFIG_DAMON_SYSFS is disabled
sched/numa: add statistics of numa balance task
sched/numa: fix task swap by skipping kernel threads
tools/testing: check correct variable in open_procmap()
tools/testing/vma: add missing function stub
mm/gup: update comment explaining why gup_fast() disables IRQs
selftests/mm: two fixes for the pfnmap test
mm/khugepaged: fix race with folio split/free using temporary reference
mm: add CONFIG_PAGE_BLOCK_ORDER to select page block order
mmu_notifiers: remove leftover stub macros
selftests/mm: deduplicate test names in madv_populate
kcov: rust: add flags for KCOV with Rust
mm: rust: make CONFIG_MMU ifdefs more narrow
mmu_gather: move tlb flush for VM_PFNMAP/VM_MIXEDMAP vmas into free_pgtables()
mm/damon/Kconfig: enable CONFIG_DAMON by default
...
- randstruct: gcc-plugin: Fix attribute addition with GCC 15
- ubsan: integer-overflow: depend on BROKEN to keep this out of CI
- overflow: Introduce __DEFINE_FLEX for having no initializer
- wifi: iwlwifi: mld: Work around Clang loop unrolling bug
-----BEGIN PGP SIGNATURE-----
iHUEABYKAB0WIQRSPkdeREjth1dHnSE2KwveOeQkuwUCaDx5RgAKCRA2KwveOeQk
u4KjAP9tpSeAc2cKb2ZeuVV2dVSf689jR/fxPbgyy2yWIJrPogD/bsFs+LTCXnwB
/Rk838ZZJEB0eXKoKk/LKmaN2UMSMgQ=
=6m7/
-----END PGP SIGNATURE-----
Merge tag 'hardening-v6.16-rc1-fix1-take2' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux
Pull hardening fixes from Kees Cook:
- randstruct: gcc-plugin: Fix attribute addition with GCC 15
- ubsan: integer-overflow: depend on BROKEN to keep this out of CI
- overflow: Introduce __DEFINE_FLEX for having no initializer
- wifi: iwlwifi: mld: Work around Clang loop unrolling bug
[ Take two after a jump scare due to some repo rewriting by 'b4' - Linus ]
* tag 'hardening-v6.16-rc1-fix1-take2' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
randstruct: gcc-plugin: Fix attribute addition
overflow: Introduce __DEFINE_FLEX for having no initializer
ubsan: integer-overflow: depend on BROKEN to keep this out of CI
wifi: iwlwifi: mld: Work around Clang loop unrolling bug
All callers now use copy_folio_from_iter_atomic(), so convert
copy_page_from_iter_atomic(). While I'm in there, use kmap_local_folio()
and pagefault_disable() instead of kmap_atomic(). That allows preemption
and/or task migration to happen during the copy_from_user(). Also use the
new folio_test_partial_kmap() predicate instead of open-coding it.
Link: https://lkml.kernel.org/r/20250514170607.3000994-4-willy@infradead.org
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Hugh Dickins <hughd@google.com>
Cc: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
semaphore" from Lance Yang enhances the hung task detector. The
detector presently dumps the blocking tasks's stack when it is blocked
on a mutex. Lance's series extends this to semaphores.
- The 2 patch series "nilfs2: improve sanity checks in dirty state
propagation" from Wentao Liang addresses a couple of minor flaws in
nilfs2.
- The 2 patch series "scripts/gdb: Fixes related to lx_per_cpu()" from
Illia Ostapyshyn fixes a couple of issues in the gdb scripts.
- The 9 patch series "Support kdump with LUKS encryption by reusing LUKS
volume keys" from Coiby Xu addresses a usability problem with kdump.
When the dump device is LUKS-encrypted, the kdump kernel may not have
the keys to the encrypted filesystem. A full writeup of this is in the
series [0/N] cover letter.
- The 2 patch series "sysfs: add counters for lockups and stalls" from
Max Kellermann adds /sys/kernel/hardlockup_count and
/sys/kernel/hardlockup_count and /sys/kernel/rcu_stall_count.
- The 3 patch series "fork: Page operation cleanups in the fork code"
from Pasha Tatashin implements a number of code cleanups in fork.c.
- The 3 patch series "scripts/gdb/symbols: determine KASLR offset on
s390 during early boot" from Ilya Leoshkevich fixes some s390 issues in
the gdb scripts.
-----BEGIN PGP SIGNATURE-----
iHUEABYKAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCaDuCvQAKCRDdBJ7gKXxA
jrkxAQCnFAp/uK9ckkbN4nfpJ0+OMY36C+A+dawSDtuRsIkXBAEAq3e6MNAUdg5W
Ca0cXdgSIq1Op7ZKEA+66Km6Rfvfow8=
=g45L
-----END PGP SIGNATURE-----
Merge tag 'mm-nonmm-stable-2025-05-31-15-28' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull non-MM updates from Andrew Morton:
- "hung_task: extend blocking task stacktrace dump to semaphore" from
Lance Yang enhances the hung task detector.
The detector presently dumps the blocking tasks's stack when it is
blocked on a mutex. Lance's series extends this to semaphores
- "nilfs2: improve sanity checks in dirty state propagation" from
Wentao Liang addresses a couple of minor flaws in nilfs2
- "scripts/gdb: Fixes related to lx_per_cpu()" from Illia Ostapyshyn
fixes a couple of issues in the gdb scripts
- "Support kdump with LUKS encryption by reusing LUKS volume keys" from
Coiby Xu addresses a usability problem with kdump.
When the dump device is LUKS-encrypted, the kdump kernel may not have
the keys to the encrypted filesystem. A full writeup of this is in
the series [0/N] cover letter
- "sysfs: add counters for lockups and stalls" from Max Kellermann adds
/sys/kernel/hardlockup_count and /sys/kernel/hardlockup_count and
/sys/kernel/rcu_stall_count
- "fork: Page operation cleanups in the fork code" from Pasha Tatashin
implements a number of code cleanups in fork.c
- "scripts/gdb/symbols: determine KASLR offset on s390 during early
boot" from Ilya Leoshkevich fixes some s390 issues in the gdb
scripts
* tag 'mm-nonmm-stable-2025-05-31-15-28' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (67 commits)
llist: make llist_add_batch() a static inline
delayacct: remove redundant code and adjust indentation
squashfs: add optional full compressed block caching
crash_dump, nvme: select CONFIGFS_FS as built-in
scripts/gdb/symbols: determine KASLR offset on s390 during early boot
scripts/gdb/symbols: factor out pagination_off()
scripts/gdb/symbols: factor out get_vmlinux()
kernel/panic.c: format kernel-doc comments
mailmap: update and consolidate Casey Connolly's name and email
nilfs2: remove wbc->for_reclaim handling
fork: define a local GFP_VMAP_STACK
fork: check charging success before zeroing stack
fork: clean-up naming of vm_stack/vm_struct variables in vmap stacks code
fork: clean-up ifdef logic around stack allocation
kernel/rcu/tree_stall: add /sys/kernel/rcu_stall_count
kernel/watchdog: add /sys/kernel/{hard,soft}lockup_count
x86/crash: make the page that stores the dm crypt keys inaccessible
x86/crash: pass dm crypt keys to kdump kernel
Revert "x86/mm: Remove unused __set_memory_prot()"
crash_dump: retrieve dm crypt keys in kdump kernel
...
simplifies the act of creating a pte which addresses the first page in a
folio and reduces the amount of plumbing which architecture must
implement to provide this.
- The 8 patch series "Misc folio patches for 6.16" from Matthew Wilcox
is a shower of largely unrelated folio infrastructure changes which
clean things up and better prepare us for future work.
- The 3 patch series "memory,x86,acpi: hotplug memory alignment
advisement" from Gregory Price adds early-init code to prevent x86 from
leaving physical memory unused when physical address regions are not
aligned to memory block size.
- The 2 patch series "mm/compaction: allow more aggressive proactive
compaction" from Michal Clapinski provides some tuning of the (sadly,
hard-coded (more sadly, not auto-tuned)) thresholds for our invokation
of proactive compaction. In a simple test case, the reduction of a guest
VM's memory consumption was dramatic.
- The 8 patch series "Minor cleanups and improvements to swap freeing
code" from Kemeng Shi provides some code cleaups and a small efficiency
improvement to this part of our swap handling code.
- The 6 patch series "ptrace: introduce PTRACE_SET_SYSCALL_INFO API"
from Dmitry Levin adds the ability for a ptracer to modify syscalls
arguments. At this time we can alter only "system call information that
are used by strace system call tampering, namely, syscall number,
syscall arguments, and syscall return value.
This series should have been incorporated into mm.git's "non-MM"
branch, but I goofed.
- The 3 patch series "fs/proc: extend the PAGEMAP_SCAN ioctl to report
guard regions" from Andrei Vagin extends the info returned by the
PAGEMAP_SCAN ioctl against /proc/pid/pagemap. This permits CRIU to more
efficiently get at the info about guard regions.
- The 2 patch series "Fix parameter passed to page_mapcount_is_type()"
from Gavin Shan implements that fix. No runtime effect is expected
because validate_page_before_insert() happens to fix up this error.
- The 3 patch series "kernel/events/uprobes: uprobe_write_opcode()
rewrite" from David Hildenbrand basically brings uprobe text poking into
the current decade. Remove a bunch of hand-rolled implementation in
favor of using more current facilities.
- The 3 patch series "mm/ptdump: Drop assumption that pxd_val() is u64"
from Anshuman Khandual provides enhancements and generalizations to the
pte dumping code. This might be needed when 128-bit Page Table
Descriptors are enabled for ARM.
- The 12 patch series "Always call constructor for kernel page tables"
from Kevin Brodsky "ensures that the ctor/dtor is always called for
kernel pgtables, as it already is for user pgtables". This permits the
addition of more functionality such as "insert hooks to protect page
tables". This change does result in various architectures performing
unnecesary work, but this is fixed up where it is anticipated to occur.
- The 9 patch series "Rust support for mm_struct, vm_area_struct, and
mmap" from Alice Ryhl adds plumbing to permit Rust access to core MM
structures.
- The 3 patch series "fix incorrectly disallowed anonymous VMA merges"
from Lorenzo Stoakes takes advantage of some VMA merging opportunities
which we've been missing for 15 years.
- The 4 patch series "mm/madvise: batch tlb flushes for MADV_DONTNEED
and MADV_FREE" from SeongJae Park optimizes process_madvise()'s TLB
flushing. Instead of flushing each address range in the provided iovec,
we batch the flushing across all the iovec entries. The syscall's cost
was approximately halved with a microbenchmark which was designed to
load this particular operation.
- The 6 patch series "Track node vacancy to reduce worst case allocation
counts" from Sidhartha Kumar makes the maple tree smarter about its node
preallocation. stress-ng mmap performance increased by single-digit
percentages and the amount of unnecessarily preallocated memory was
dramaticelly reduced.
- The 3 patch series "mm/gup: Minor fix, cleanup and improvements" from
Baoquan He removes a few unnecessary things which Baoquan noted when
reading the code.
- The 3 patch series ""Enhance sysfs handling for memory hotplug in
weighted interleave" from Rakie Kim "enhances the weighted interleave
policy in the memory management subsystem by improving sysfs handling,
fixing memory leaks, and introducing dynamic sysfs updates for memory
hotplug support". Fixes things on error paths which we are unlikely to
hit.
- The 7 patch series "mm/damon: auto-tune DAMOS for NUMA setups
including tiered memory" from SeongJae Park introduces new DAMOS quota
goal metrics which eliminate the manual tuning which is required when
utilizing DAMON for memory tiering.
- The 5 patch series "mm/vmalloc.c: code cleanup and improvements" from
Baoquan He provides cleanups and small efficiency improvements which
Baoquan found via code inspection.
- The 2 patch series "vmscan: enforce mems_effective during demotion"
from Gregory Price "changes reclaim to respect cpuset.mems_effective
during demotion when possible". because "presently, reclaim explicitly
ignores cpuset.mems_effective when demoting, which may cause the cpuset
settings to violated." "This is useful for isolating workloads on a
multi-tenant system from certain classes of memory more consistently."
- The 2 patch series ""Clean up split_huge_pmd_locked() and remove
unnecessary folio pointers" from Gavin Guo provides minor cleanups and
efficiency gains in in the huge page splitting and migrating code.
- The 3 patch series "Use kmem_cache for memcg alloc" from Huan Yang
creates a slab cache for `struct mem_cgroup', yielding improved memory
utilization.
- The 4 patch series "add max arg to swappiness in memory.reclaim and
lru_gen" from Zhongkun He adds a new "max" argument to the "swappiness="
argument for memory.reclaim MGLRU's lru_gen. This directs proactive
reclaim to reclaim from only anon folios rather than file-backed folios.
- The 17 patch series "kexec: introduce Kexec HandOver (KHO)" from Mike
Rapoport is the first step on the path to permitting the kernel to
maintain existing VMs while replacing the host kernel via file-based
kexec. At this time only memblock's reserve_mem is preserved.
- The 7 patch series "mm: Introduce for_each_valid_pfn()" from David
Woodhouse provides and uses a smarter way of looping over a pfn range.
By skipping ranges of invalid pfns.
- The 2 patch series "sched/numa: Skip VMA scanning on memory pinned to
one NUMA node via cpuset.mems" from Libo Chen removes a lot of pointless
VMA scanning when a task is pinned a single NUMA mode. Dramatic
performance benefits were seen in some real world cases.
- The 2 patch series "JFS: Implement migrate_folio for
jfs_metapage_aops" from Shivank Garg addresses a warning which occurs
during memory compaction when using JFS.
- The 4 patch series "move all VMA allocation, freeing and duplication
logic to mm" from Lorenzo Stoakes moves some VMA code from kernel/fork.c
into the more appropriate mm/vma.c.
- The 6 patch series "mm, swap: clean up swap cache mapping helper" from
Kairui Song provides code consolidation and cleanups related to the
folio_index() function.
- The 2 patch series "mm/gup: Cleanup memfd_pin_folios()" from Vishal
Moola does that.
- The 8 patch series "memcg: Fix test_memcg_min/low test failures" from
Waiman Long addresses some bogus failures which are being reported by
the test_memcontrol selftest.
- The 3 patch series "eliminate mmap() retry merge, add .mmap_prepare
hook" from Lorenzo Stoakes commences the deprecation of
file_operations.mmap() in favor of the new
file_operations.mmap_prepare(). The latter is more restrictive and
prevents drivers from messing with things in ways which, amongst other
problems, may defeat VMA merging.
- The 4 patch series "memcg: decouple memcg and objcg stocks"" from
Shakeel Butt decouples the per-cpu memcg charge cache from the objcg's
one. This is a step along the way to making memcg and objcg charging
NMI-safe, which is a BPF requirement.
- The 6 patch series "mm/damon: minor fixups and improvements for code,
tests, and documents" from SeongJae Park is "yet another batch of
miscellaneous DAMON changes. Fix and improve minor problems in code,
tests and documents."
- The 7 patch series "memcg: make memcg stats irq safe" from Shakeel
Butt converts memcg stats to be irq safe. Another step along the way to
making memcg charging and stats updates NMI-safe, a BPF requirement.
- The 4 patch series "Let unmap_hugepage_range() and several related
functions take folio instead of page" from Fan Ni provides folio
conversions in the hugetlb code.
-----BEGIN PGP SIGNATURE-----
iHUEABYKAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCaDt5qgAKCRDdBJ7gKXxA
ju6XAP9nTiSfRz8Cz1n5LJZpFKEGzLpSihCYyR6P3o1L9oe3mwEAlZ5+XAwk2I5x
Qqb/UGMEpilyre1PayQqOnct3aSL9Ao=
=tYYm
-----END PGP SIGNATURE-----
Merge tag 'mm-stable-2025-05-31-14-50' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull MM updates from Andrew Morton:
- "Add folio_mk_pte()" from Matthew Wilcox simplifies the act of
creating a pte which addresses the first page in a folio and reduces
the amount of plumbing which architecture must implement to provide
this.
- "Misc folio patches for 6.16" from Matthew Wilcox is a shower of
largely unrelated folio infrastructure changes which clean things up
and better prepare us for future work.
- "memory,x86,acpi: hotplug memory alignment advisement" from Gregory
Price adds early-init code to prevent x86 from leaving physical
memory unused when physical address regions are not aligned to memory
block size.
- "mm/compaction: allow more aggressive proactive compaction" from
Michal Clapinski provides some tuning of the (sadly, hard-coded (more
sadly, not auto-tuned)) thresholds for our invokation of proactive
compaction. In a simple test case, the reduction of a guest VM's
memory consumption was dramatic.
- "Minor cleanups and improvements to swap freeing code" from Kemeng
Shi provides some code cleaups and a small efficiency improvement to
this part of our swap handling code.
- "ptrace: introduce PTRACE_SET_SYSCALL_INFO API" from Dmitry Levin
adds the ability for a ptracer to modify syscalls arguments. At this
time we can alter only "system call information that are used by
strace system call tampering, namely, syscall number, syscall
arguments, and syscall return value.
This series should have been incorporated into mm.git's "non-MM"
branch, but I goofed.
- "fs/proc: extend the PAGEMAP_SCAN ioctl to report guard regions" from
Andrei Vagin extends the info returned by the PAGEMAP_SCAN ioctl
against /proc/pid/pagemap. This permits CRIU to more efficiently get
at the info about guard regions.
- "Fix parameter passed to page_mapcount_is_type()" from Gavin Shan
implements that fix. No runtime effect is expected because
validate_page_before_insert() happens to fix up this error.
- "kernel/events/uprobes: uprobe_write_opcode() rewrite" from David
Hildenbrand basically brings uprobe text poking into the current
decade. Remove a bunch of hand-rolled implementation in favor of
using more current facilities.
- "mm/ptdump: Drop assumption that pxd_val() is u64" from Anshuman
Khandual provides enhancements and generalizations to the pte dumping
code. This might be needed when 128-bit Page Table Descriptors are
enabled for ARM.
- "Always call constructor for kernel page tables" from Kevin Brodsky
ensures that the ctor/dtor is always called for kernel pgtables, as
it already is for user pgtables.
This permits the addition of more functionality such as "insert hooks
to protect page tables". This change does result in various
architectures performing unnecesary work, but this is fixed up where
it is anticipated to occur.
- "Rust support for mm_struct, vm_area_struct, and mmap" from Alice
Ryhl adds plumbing to permit Rust access to core MM structures.
- "fix incorrectly disallowed anonymous VMA merges" from Lorenzo
Stoakes takes advantage of some VMA merging opportunities which we've
been missing for 15 years.
- "mm/madvise: batch tlb flushes for MADV_DONTNEED and MADV_FREE" from
SeongJae Park optimizes process_madvise()'s TLB flushing.
Instead of flushing each address range in the provided iovec, we
batch the flushing across all the iovec entries. The syscall's cost
was approximately halved with a microbenchmark which was designed to
load this particular operation.
- "Track node vacancy to reduce worst case allocation counts" from
Sidhartha Kumar makes the maple tree smarter about its node
preallocation.
stress-ng mmap performance increased by single-digit percentages and
the amount of unnecessarily preallocated memory was dramaticelly
reduced.
- "mm/gup: Minor fix, cleanup and improvements" from Baoquan He removes
a few unnecessary things which Baoquan noted when reading the code.
- ""Enhance sysfs handling for memory hotplug in weighted interleave"
from Rakie Kim "enhances the weighted interleave policy in the memory
management subsystem by improving sysfs handling, fixing memory
leaks, and introducing dynamic sysfs updates for memory hotplug
support". Fixes things on error paths which we are unlikely to hit.
- "mm/damon: auto-tune DAMOS for NUMA setups including tiered memory"
from SeongJae Park introduces new DAMOS quota goal metrics which
eliminate the manual tuning which is required when utilizing DAMON
for memory tiering.
- "mm/vmalloc.c: code cleanup and improvements" from Baoquan He
provides cleanups and small efficiency improvements which Baoquan
found via code inspection.
- "vmscan: enforce mems_effective during demotion" from Gregory Price
changes reclaim to respect cpuset.mems_effective during demotion when
possible. because presently, reclaim explicitly ignores
cpuset.mems_effective when demoting, which may cause the cpuset
settings to violated.
This is useful for isolating workloads on a multi-tenant system from
certain classes of memory more consistently.
- "Clean up split_huge_pmd_locked() and remove unnecessary folio
pointers" from Gavin Guo provides minor cleanups and efficiency gains
in in the huge page splitting and migrating code.
- "Use kmem_cache for memcg alloc" from Huan Yang creates a slab cache
for `struct mem_cgroup', yielding improved memory utilization.
- "add max arg to swappiness in memory.reclaim and lru_gen" from
Zhongkun He adds a new "max" argument to the "swappiness=" argument
for memory.reclaim MGLRU's lru_gen.
This directs proactive reclaim to reclaim from only anon folios
rather than file-backed folios.
- "kexec: introduce Kexec HandOver (KHO)" from Mike Rapoport is the
first step on the path to permitting the kernel to maintain existing
VMs while replacing the host kernel via file-based kexec. At this
time only memblock's reserve_mem is preserved.
- "mm: Introduce for_each_valid_pfn()" from David Woodhouse provides
and uses a smarter way of looping over a pfn range. By skipping
ranges of invalid pfns.
- "sched/numa: Skip VMA scanning on memory pinned to one NUMA node via
cpuset.mems" from Libo Chen removes a lot of pointless VMA scanning
when a task is pinned a single NUMA mode.
Dramatic performance benefits were seen in some real world cases.
- "JFS: Implement migrate_folio for jfs_metapage_aops" from Shivank
Garg addresses a warning which occurs during memory compaction when
using JFS.
- "move all VMA allocation, freeing and duplication logic to mm" from
Lorenzo Stoakes moves some VMA code from kernel/fork.c into the more
appropriate mm/vma.c.
- "mm, swap: clean up swap cache mapping helper" from Kairui Song
provides code consolidation and cleanups related to the folio_index()
function.
- "mm/gup: Cleanup memfd_pin_folios()" from Vishal Moola does that.
- "memcg: Fix test_memcg_min/low test failures" from Waiman Long
addresses some bogus failures which are being reported by the
test_memcontrol selftest.
- "eliminate mmap() retry merge, add .mmap_prepare hook" from Lorenzo
Stoakes commences the deprecation of file_operations.mmap() in favor
of the new file_operations.mmap_prepare().
The latter is more restrictive and prevents drivers from messing with
things in ways which, amongst other problems, may defeat VMA merging.
- "memcg: decouple memcg and objcg stocks"" from Shakeel Butt decouples
the per-cpu memcg charge cache from the objcg's one.
This is a step along the way to making memcg and objcg charging
NMI-safe, which is a BPF requirement.
- "mm/damon: minor fixups and improvements for code, tests, and
documents" from SeongJae Park is yet another batch of miscellaneous
DAMON changes. Fix and improve minor problems in code, tests and
documents.
- "memcg: make memcg stats irq safe" from Shakeel Butt converts memcg
stats to be irq safe. Another step along the way to making memcg
charging and stats updates NMI-safe, a BPF requirement.
- "Let unmap_hugepage_range() and several related functions take folio
instead of page" from Fan Ni provides folio conversions in the
hugetlb code.
* tag 'mm-stable-2025-05-31-14-50' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (285 commits)
mm: pcp: increase pcp->free_count threshold to trigger free_high
mm/hugetlb: convert use of struct page to folio in __unmap_hugepage_range()
mm/hugetlb: refactor __unmap_hugepage_range() to take folio instead of page
mm/hugetlb: refactor unmap_hugepage_range() to take folio instead of page
mm/hugetlb: pass folio instead of page to unmap_ref_private()
memcg: objcg stock trylock without irq disabling
memcg: no stock lock for cpu hot-unplug
memcg: make __mod_memcg_lruvec_state re-entrant safe against irqs
memcg: make count_memcg_events re-entrant safe against irqs
memcg: make mod_memcg_state re-entrant safe against irqs
memcg: move preempt disable to callers of memcg_rstat_updated
memcg: memcg_rstat_updated re-entrant safe against irqs
mm: khugepaged: decouple SHMEM and file folios' collapse
selftests/eventfd: correct test name and improve messages
alloc_tag: check mem_profiling_support in alloc_tag_init
Docs/damon: update titles and brief introductions to explain DAMOS
selftests/damon/_damon_sysfs: read tried regions directories in order
mm/damon/tests/core-kunit: add a test for damos_set_filters_default_reject()
mm/damon/paddr: remove unused variable, folio_list, in damon_pa_stat()
mm/damon/sysfs-schemes: fix wrong comment on damons_sysfs_quota_goal_metric_strs
...
x86 already uses gcc-8 as the minimum version, this changes all other
architectures to the same version. gcc-8 is used is Debian 10 and Red
Hat Enterprise Linux 8, both of which are still supported, and binutils
2.30 is the oldest corresponding version on those. Ubuntu Pro 18.04 and
SUSE Linux Enterprise Server 15 both use gcc-7 as the system compiler
but additionally include toolchains that remain supported.
With the new minimum toolchain versions, a number of workarounds for older
versions can be dropped, in particular on x86_64 and arm64. Importantly,
the updated compiler version allows removing two of the five remaining
gcc plugins, as support for sancov and structeak features is already
included in modern compiler versions.
I tried collecting the known changes that are possible based on the
new toolchain version, but expect that more cleanups will be possible.
Since this touches multiple architectures, I merged the patches through
the asm-generic tree.
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEo6/YBQwIrVS28WGKmmx57+YAGNkFAmg6vNMACgkQmmx57+YA
GNkOmg/+LtR9B2P27GPBeG8HnLTZ8hKELiyYeSk6ZFgQv5hevE37HV35Yru7e7gu
wcF6CgYr8ff4CVcHM7y0790oGew1thkqq5CklFIH0EwCDJx/mWfZR1SS2jfZIEWM
HSDOlQQd1S8oWia14tSnQos3nW3CB9/ABVTHH+Wvl3xn48WMRvgK2LJgGLuxJrt8
5DD9auHiLjchWB5tB4DU98IgWWgFUGMTsI6IayZ4dkF4CdWqd89h0Y3pjJYeBgHS
mPxzR2q8WjEmG9hp7QuZQgn/pAYleJAwHvvkoLrkQ2ieqx3FjWiwFbQp4CG1Sc8L
eBR1lnkqS2z/e7xJLfe86fOoKWWu4I0tZKhRan/0+UOGm5nXrGpqSxKS8ZDsRuAp
3fvyhIp1cYSa7Xkok8BFhLEFR0tguXJXnXBc3tWE5VXIfFNd0Ohh1GUYhXDAqWKh
i0jN9dSNhokM3AqBi6qZl5kmBnRA3UsIaOg3QRrqN8IlBPp+u7i5xsrJIUWvD95o
TO06admmLcCJT8n6ZfNVfRjBgzu8+t54UVaDx9YYwxoNGOSFwqOb8CSPTWPxLmDr
RKDUOvO8DBlP7uFz9neP+LxluA3DjurRZvb0z0AmCZ8/RXEmTMCyfP5a6esxquXt
0Bqo6hM9q+TeXTHNS1CNvqLSWWikw+AzS/ZPPvriYFn5lxtbq6c=
=pdDC
-----END PGP SIGNATURE-----
Merge tag 'gcc-minimum-version-6.16' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic
Pull compiler version requirement update from Arnd Bergmann:
"Require gcc-8 and binutils-2.30
x86 already uses gcc-8 as the minimum version, this changes all other
architectures to the same version. gcc-8 is used is Debian 10 and Red
Hat Enterprise Linux 8, both of which are still supported, and
binutils 2.30 is the oldest corresponding version on those.
Ubuntu Pro 18.04 and SUSE Linux Enterprise Server 15 both use gcc-7 as
the system compiler but additionally include toolchains that remain
supported.
With the new minimum toolchain versions, a number of workarounds for
older versions can be dropped, in particular on x86_64 and arm64.
Importantly, the updated compiler version allows removing two of the
five remaining gcc plugins, as support for sancov and structeak
features is already included in modern compiler versions.
I tried collecting the known changes that are possible based on the
new toolchain version, but expect that more cleanups will be possible.
Since this touches multiple architectures, I merged the patches
through the asm-generic tree."
* tag 'gcc-minimum-version-6.16' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic:
Makefile.kcov: apply needed compiler option unconditionally in CFLAGS_KCOV
Documentation: update binutils-2.30 version reference
gcc-plugins: remove SANCOV gcc plugin
Kbuild: remove structleak gcc plugin
arm64: drop binutils version checks
raid6: skip avx512 checks
kbuild: require gcc-8 and binutils-2.30
DT Bindings:
- Convert all remaining interrupt-controller bindings to DT schema
- Convert Rockchip CDN-DP and Freescale TCON, M4IF, TigerP, LDB, PPC
PMC, imx-drm, and ftm-quaddec to DT schema
- Add bindings for fsl,vf610-pit, fsl,ls1021a-wdt, sgx,vz89te,
maxim,max30208, ti,lp8864, and fairphone,fp5-sndcard
- Add top-level constraints for renesas,vsp1 and renesas,fcp
- Add missing constraint in amlogic,pinctrl-a4 'group' nodes
- Adjust the allowed properties for dwc3-xilinx, sony,imx219,
pci-iommu, and renesas,dsi
- Add EcoNet vendor prefix
- Fix the reserved-memory.yaml in fsl,qman-fqd
- Drop obsolete numa.txt and cpu-topology.txt which are schemas in
dtschema now
- Drop Renesas RZ/N1S bindings
- Ensure Arm cpu nodes don't allow undocumented properties. Add all
the properties which are in use and undocumented. Drop the Mediatek
cpufreq binding which is not a binding, but just what DT properties
the driver uses.
- Add compatibles for Renesas RZ/G3E and RZ/V2N Mali Bifrost GPU
- Update documentation on defining child nodes with separate schemas
- Add bindings to PSCI MAINTAINERS entry
DT core:
- Add new functions to simplify driver handling of 'memory-region'
properties. Users to be added next cycle.
- Simplify of_dma_set_restricted_buffer() to use of_for_each_phandle()
- Add missing unlock on error in unittest_data_add()
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEktVUI4SxYhzZyEuo+vtdtY28YcMFAmg3L7IACgkQ+vtdtY28
YcPHew//dSH7WHwx+AwQag8svrBZbx+GrGqZlyGEGJYmF2o9TJ8d7tDCS5oPGNoV
TObns4F1DQuX7YVrke5tYIiFMyWUmu+f8CSHg9a4Ifo+Gf5QqEhzxe1CfW6Y7VZv
EIRrlCcW8VpTBuphsJMp6TLYof/mSBj4Ma1fRgp0H3CF0h1I5bM07jMCol+7fwT9
0ZMEOiFOFx0HeOBVCmPvfETX1+gflnlTD+aULJwYky2Tzj6FLLWNTf94ca2iMjCd
A4g9lJTaasRukL1RiHYoQYECgh57f3VMPRxI5wNPPVF7r3cHL7pjGzEt2vOkFDWC
xNkIsNPrCu14He17vrh6XrNn1KMIOkLtE9yCcysE2OgdlOXfdfN6Ryz4gm1BPaFo
ZDNZgs840r3gcQXjvCnSMq/Wxnwrka+x5vVT9VbDTV+1NWgFTpQZXqfiukxnuATa
K0X8hW7pWatNhmT11rIfcp7WUIs0bpJ+J03ptzmYsvH4qyZpzpMZvbBoAZ9RA+E1
dEFr7ISxhC4LlzjafqluUtBdaxKEAk8alzA9Z/OTKxDB+IiFTqztqIP3wSSApyhw
GBdRy8iC1zU7/TbdmAVDtrT1+xvqH5On1DsjU8y96O/2dR9OKT/6Tv1ozPkcssDV
EIeAZycS+6cFeusRxySMrOeGFQ0LiU2Qj/e9ugm07g+HMgY5jbI=
=gMwN
-----END PGP SIGNATURE-----
Merge tag 'devicetree-for-6.16' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux
Pull devicetree updates from Rob Herring:
"DT Bindings:
- Convert all remaining interrupt-controller bindings to DT schema
- Convert Rockchip CDN-DP and Freescale TCON, M4IF, TigerP, LDB, PPC
PMC, imx-drm, and ftm-quaddec to DT schema
- Add bindings for fsl,vf610-pit, fsl,ls1021a-wdt, sgx,vz89te,
maxim,max30208, ti,lp8864, and fairphone,fp5-sndcard
- Add top-level constraints for renesas,vsp1 and renesas,fcp
- Add missing constraint in amlogic,pinctrl-a4 'group' nodes
- Adjust the allowed properties for dwc3-xilinx, sony,imx219,
pci-iommu, and renesas,dsi
- Add EcoNet vendor prefix
- Fix the reserved-memory.yaml in fsl,qman-fqd
- Drop obsolete numa.txt and cpu-topology.txt which are schemas in
dtschema now
- Drop Renesas RZ/N1S bindings
- Ensure Arm cpu nodes don't allow undocumented properties. Add all
the properties which are in use and undocumented. Drop the Mediatek
cpufreq binding which is not a binding, but just what DT properties
the driver uses.
- Add compatibles for Renesas RZ/G3E and RZ/V2N Mali Bifrost GPU
- Update documentation on defining child nodes with separate schemas
- Add bindings to PSCI MAINTAINERS entry
DT core:
- Add new functions to simplify driver handling of 'memory-region'
properties. Users to be added next cycle.
- Simplify of_dma_set_restricted_buffer() to use
of_for_each_phandle()
- Add missing unlock on error in unittest_data_add()"
* tag 'devicetree-for-6.16' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux: (87 commits)
dt-bindings: timer: Add fsl,vf610-pit.yaml
dt-bindings: gpu: mali-bifrost: Add compatible for RZ/G3E SoC
ASoC: dt-bindings: qcom,sm8250: Add Fairphone 5 sound card
dt-bindings: arm/cpus: Allow 2 power-domains entries
dt-bindings: usb: dwc3-xilinx: allow dma-coherent
media: dt-bindings: sony,imx219: Allow props from video-interface-devices
dt-bindings: soundwire: qcom: Document v2.1.0 version of IP block
dt-bindings: watchdog: fsl-imx-wdt: add compatible string fsl,ls1021a-wdt
dt-bindings: pinctrl: amlogic,pinctrl-a4: Add missing constraint on allowed 'group' node properties
dt-bindings: display: rockchip: Convert cdn-dp-rockchip.txt to yaml
dt-bindings: display: bridge: renesas,dsi: allow properties from dsi-controller
dt-bindings: trivial-devices: Add VZ89TE to trivial
media: dt-bindings: renesas,vsp1: add top-level constraints
media: dt-bindings: renesas,fcp: add top-level constraints
dt-bindings: trivial-devices: Add Maxim max30208
dt-bindings: soc: fsl,qman-fqd: Fix reserved-memory.yaml reference
dt-bindings: interrupt-controller: Convert ti,omap-intc-irq to DT schema
dt-bindings: interrupt-controller: Convert ti,omap4-wugen-mpu to DT schema
dt-bindings: interrupt-controller: Convert ti,keystone-irq to DT schema
dt-bindings: interrupt-controller: Convert technologic,ts4800-irqc to DT schema
...
* Add large stage-2 mapping (THP) support for non-protected guests when
pKVM is enabled, clawing back some performance.
* Enable nested virtualisation support on systems that support it,
though it is disabled by default.
* Add UBSAN support to the standalone EL2 object used in nVHE/hVHE and
protected modes.
* Large rework of the way KVM tracks architecture features and links
them with the effects of control bits. While this has no functional
impact, it ensures correctness of emulation (the data is automatically
extracted from the published JSON files), and helps dealing with the
evolution of the architecture.
* Significant changes to the way pKVM tracks ownership of pages,
avoiding page table walks by storing the state in the hypervisor's
vmemmap. This in turn enables the THP support described above.
* New selftest checking the pKVM ownership transition rules
* Fixes for FEAT_MTE_ASYNC being accidentally advertised to guests
even if the host didn't have it.
* Fixes for the address translation emulation, which happened to be
rather buggy in some specific contexts.
* Fixes for the PMU emulation in NV contexts, decoupling PMCR_EL0.N
from the number of counters exposed to a guest and addressing a
number of issues in the process.
* Add a new selftest for the SVE host state being corrupted by a
guest.
* Keep HCR_EL2.xMO set at all times for systems running with the
kernel at EL2, ensuring that the window for interrupts is slightly
bigger, and avoiding a pretty bad erratum on the AmpereOne HW.
* Add workaround for AmpereOne's erratum AC04_CPU_23, which suffers
from a pretty bad case of TLB corruption unless accesses to HCR_EL2
are heavily synchronised.
* Add a per-VM, per-ITS debugfs entry to dump the state of the ITS
tables in a human-friendly fashion.
* and the usual random cleanups.
LoongArch:
* Don't flush tlb if the host supports hardware page table walks.
* Add KVM selftests support.
RISC-V:
* Add vector registers to get-reg-list selftest
* VCPU reset related improvements
* Remove scounteren initialization from VCPU reset
* Support VCPU reset from userspace using set_mpstate() ioctl
x86:
* Initial support for TDX in KVM. This finally makes it possible to use the
TDX module to run confidential guests on Intel processors. This is quite a
large series, including support for private page tables (managed by the
TDX module and mirrored in KVM for efficiency), forwarding some TDVMCALLs
to userspace, and handling several special VM exits from the TDX module.
This has been in the works for literally years and it's not really possible
to describe everything here, so I'll defer to the various merge commits
up to and including commit 7bcf7246c4 ("Merge branch 'kvm-tdx-finish-initial'
into HEAD").
-----BEGIN PGP SIGNATURE-----
iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmg02hwUHHBib256aW5p
QHJlZGhhdC5jb20ACgkQv/vSX3jHroNnkwf/db4xeWKSMseCIvBVR+ObDn3LXhwT
hAgmTkDkP1zq9RfbfJSbUA1DXRwfP+f1sWySLMWECkFEQW9fGIJF9fOQRDSXKmhX
158U3+FEt+3jxLRCGFd4zyXAqyY3C8JSkPUyJZxCpUbXtB5tdDNac4rZAXKDULwe
sUi0OW/kFDM2yt369pBGQAGdN+75/oOrYISGOSvMXHxjccNqvveX8MUhpBjYIuuj
73iBWmsfv3vCtam56Racz3C3v44ie498PmWFtnB0R+CVfWfrnUAaRiGWx+egLiBW
dBPDiZywMn++prmphEUFgaStDTQy23JBLJ8+RvHkp+o5GaTISKJB3nedZQ==
=adZU
-----END PGP SIGNATURE-----
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull kvm updates from Paolo Bonzini:
"As far as x86 goes this pull request "only" includes TDX host support.
Quotes are appropriate because (at 6k lines and 100+ commits) it is
much bigger than the rest, which will come later this week and
consists mostly of bugfixes and selftests. s390 changes will also come
in the second batch.
ARM:
- Add large stage-2 mapping (THP) support for non-protected guests
when pKVM is enabled, clawing back some performance.
- Enable nested virtualisation support on systems that support it,
though it is disabled by default.
- Add UBSAN support to the standalone EL2 object used in nVHE/hVHE
and protected modes.
- Large rework of the way KVM tracks architecture features and links
them with the effects of control bits. While this has no functional
impact, it ensures correctness of emulation (the data is
automatically extracted from the published JSON files), and helps
dealing with the evolution of the architecture.
- Significant changes to the way pKVM tracks ownership of pages,
avoiding page table walks by storing the state in the hypervisor's
vmemmap. This in turn enables the THP support described above.
- New selftest checking the pKVM ownership transition rules
- Fixes for FEAT_MTE_ASYNC being accidentally advertised to guests
even if the host didn't have it.
- Fixes for the address translation emulation, which happened to be
rather buggy in some specific contexts.
- Fixes for the PMU emulation in NV contexts, decoupling PMCR_EL0.N
from the number of counters exposed to a guest and addressing a
number of issues in the process.
- Add a new selftest for the SVE host state being corrupted by a
guest.
- Keep HCR_EL2.xMO set at all times for systems running with the
kernel at EL2, ensuring that the window for interrupts is slightly
bigger, and avoiding a pretty bad erratum on the AmpereOne HW.
- Add workaround for AmpereOne's erratum AC04_CPU_23, which suffers
from a pretty bad case of TLB corruption unless accesses to HCR_EL2
are heavily synchronised.
- Add a per-VM, per-ITS debugfs entry to dump the state of the ITS
tables in a human-friendly fashion.
- and the usual random cleanups.
LoongArch:
- Don't flush tlb if the host supports hardware page table walks.
- Add KVM selftests support.
RISC-V:
- Add vector registers to get-reg-list selftest
- VCPU reset related improvements
- Remove scounteren initialization from VCPU reset
- Support VCPU reset from userspace using set_mpstate() ioctl
x86:
- Initial support for TDX in KVM.
This finally makes it possible to use the TDX module to run
confidential guests on Intel processors. This is quite a large
series, including support for private page tables (managed by the
TDX module and mirrored in KVM for efficiency), forwarding some
TDVMCALLs to userspace, and handling several special VM exits from
the TDX module.
This has been in the works for literally years and it's not really
possible to describe everything here, so I'll defer to the various
merge commits up to and including commit 7bcf7246c4 ('Merge
branch 'kvm-tdx-finish-initial' into HEAD')"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (248 commits)
x86/tdx: mark tdh_vp_enter() as __flatten
Documentation: virt/kvm: remove unreferenced footnote
RISC-V: KVM: lock the correct mp_state during reset
KVM: arm64: Fix documentation for vgic_its_iter_next()
KVM: arm64: np-guest CMOs with PMD_SIZE fixmap
KVM: arm64: Stage-2 huge mappings for np-guests
KVM: arm64: Add a range to pkvm_mappings
KVM: arm64: Convert pkvm_mappings to interval tree
KVM: arm64: Add a range to __pkvm_host_test_clear_young_guest()
KVM: arm64: Add a range to __pkvm_host_wrprotect_guest()
KVM: arm64: Add a range to __pkvm_host_unshare_guest()
KVM: arm64: Add a range to __pkvm_host_share_guest()
KVM: arm64: Introduce for_each_hyp_page
KVM: arm64: Handle huge mappings for np-guest CMOs
KVM: arm64: nv: Release faulted-in VNCR page from mmu_lock critical section
KVM: arm64: nv: Handle TLBI S1E2 for VNCR invalidation with mmu_lock held
KVM: arm64: nv: Hold mmu_lock when invalidating VNCR SW-TLB before translating
RISC-V: KVM: add KVM_CAP_RISCV_MP_STATE_RESET
RISC-V: KVM: Remove scounteren initialization
KVM: RISC-V: remove unnecessary SBI reset state
...
Core
----
- Implement the Device Memory TCP transmit path, allowing zero-copy
data transmission on top of TCP from e.g. GPU memory to the wire.
- Move all the IPv6 routing tables management outside the RTNL scope,
under its own lock and RCU. The route control path is now 3x times
faster.
- Convert queue related netlink ops to instance lock, reducing
again the scope of the RTNL lock. This improves the control plane
scalability.
- Refactor the software crc32c implementation, removing unneeded
abstraction layers and improving significantly the related
micro-benchmarks.
- Optimize the GRO engine for UDP-tunneled traffic, for a 10%
performance improvement in related stream tests.
- Cover more per-CPU storage with local nested BH locking; this is a
prep work to remove the current per-CPU lock in local_bh_disable()
on PREMPT_RT.
- Introduce and use nlmsg_payload helper, combining buffer bounds
verification with accessing payload carried by netlink messages.
Netfilter
---------
- Rewrite the procfs conntrack table implementation, improving
considerably the dump performance. A lot of user-space tools
still use this interface.
- Implement support for wildcard netdevice in netdev basechain
and flowtables.
- Integrate conntrack information into nft trace infrastructure.
- Export set count and backend name to userspace, for better
introspection.
BPF
---
- BPF qdisc support: BPF-qdisc can be implemented with BPF struct_ops
programs and can be controlled in similar way to traditional qdiscs
using the "tc qdisc" command.
- Refactor the UDP socket iterator, addressing long standing issues
WRT duplicate hits or missed sockets.
Protocols
---------
- Improve TCP receive buffer auto-tuning and increase the default
upper bound for the receive buffer; overall this improves the single
flow maximum thoughput on 200Gbs link by over 60%.
- Add AFS GSSAPI security class to AF_RXRPC; it provides transport
security for connections to the AFS fileserver and VL server.
- Improve TCP multipath routing, so that the sources address always
matches the nexthop device.
- Introduce SO_PASSRIGHTS for AF_UNIX, to allow disabling SCM_RIGHTS,
and thus preventing DoS caused by passing around problematic FDs.
- Retire DCCP socket. DCCP only receives updates for bugs, and major
distros disable it by default. Its removal allows for better
organisation of TCP fields to reduce the number of cache lines hit
in the fast path.
- Extend TCP drop-reason support to cover PAWS checks.
Driver API
----------
- Reorganize PTP ioctl flag support to require an explicit opt-in for
the drivers, avoiding the problem of drivers not rejecting new
unsupported flags.
- Converted several device drivers to timestamping APIs.
- Introduce per-PHY ethtool dump helpers, improving the support for
dump operations targeting PHYs.
Tests and tooling
-----------------
- Add support for classic netlink in user space C codegen, so that
ynl-c can now read, create and modify links, routes addresses and
qdisc layer configuration.
- Add ynl sub-types for binary attributes, allowing ynl-c to output
known struct instead of raw binary data, clarifying the classic
netlink output.
- Extend MPTCP selftests to improve the code-coverage.
- Add tests for XDP tail adjustment in AF_XDP.
New hardware / drivers
----------------------
- OpenVPN virtual driver: offload OpenVPN data channels processing
to the kernel-space, increasing the data transfer throughput WRT
the user-space implementation.
- Renesas glue driver for the gigabit ethernet RZ/V2H(P) SoC.
- Broadcom asp-v3.0 ethernet driver.
- AMD Renoir ethernet device.
- ReakTek MT9888 2.5G ethernet PHY driver.
- Aeonsemi 10G C45 PHYs driver.
Drivers
-------
- Ethernet high-speed NICs:
- nVidia/Mellanox (mlx5):
- refactor the stearing table handling to reduce significantly
the amount of memory used
- add support for complex matches in H/W flow steering
- improve flow streeing error handling
- convert to netdev instance locking
- Intel (100G, ice, igb, ixgbe, idpf):
- ice: add switchdev support for LLDP traffic over VF
- ixgbe: add firmware manipulation and regions devlink support
- igb: introduce support for frame transmission premption
- igb: adds persistent NAPI configuration
- idpf: introduce RDMA support
- idpf: add initial PTP support
- Meta (fbnic):
- extend hardware stats coverage
- add devlink dev flash support
- Broadcom (bnxt):
- add support for RX-side device memory TCP
- Wangxun (txgbe):
- implement support for udp tunnel offload
- complete PTP and SRIOV support for AML 25G/10G devices
- Ethernet NICs embedded and virtual:
- Google (gve):
- add device memory TCP TX support
- Amazon (ena):
- support persistent per-NAPI config
- Airoha:
- add H/W support for L2 traffic offload
- add per flow stats for flow offloading
- RealTek (rtl8211): add support for WoL magic packet
- Synopsys (stmmac):
- dwmac-socfpga 1000BaseX support
- add Loongson-2K3000 support
- introduce support for hardware-accelerated VLAN stripping
- Broadcom (bcmgenet):
- expose more H/W stats
- Freescale (enetc, dpaa2-eth):
- enetc: add MAC filter, VLAN filter RSS and loopback support
- dpaa2-eth: convert to H/W timestamping APIs
- vxlan: convert FDB table to rhashtable, for better scalabilty
- veth: apply qdisc backpressure on full ring to reduce TX drops
- Ethernet switches:
- Microchip (kzZ88x3): add ETS scheduler support
- Ethernet PHYs:
- RealTek (rtl8211):
- add support for WoL magic packet
- add support for PHY LEDs
- CAN:
- Adds RZ/G3E CANFD support to the rcar_canfd driver.
- Preparatory work for CAN-XL support.
- Add self-tests framework with support for CAN physical interfaces.
- WiFi:
- mac80211:
- scan improvements with multi-link operation (MLO)
- Qualcomm (ath12k):
- enable AHB support for IPQ5332
- add monitor interface support to QCN9274
- add multi-link operation support to WCN7850
- add 802.11d scan offload support to WCN7850
- monitor mode for WCN7850, better 6 GHz regulatory
- Qualcomm (ath11k):
- restore hibernation support
- MediaTek (mt76):
- WiFi-7 improvements
- implement support for mt7990
- Intel (iwlwifi):
- enhanced multi-link single-radio (EMLSR) support on 5 GHz links
- rework device configuration
- RealTek (rtw88):
- improve throughput for RTL8814AU
- RealTek (rtw89):
- add multi-link operation support
- STA/P2P concurrency improvements
- support different SAR configs by antenna
- Bluetooth:
- introduce HCI Driver protocol
- btintel_pcie: do not generate coredump for diagnostic events
- btusb: add HCI Drv commands for configuring altsetting
- btusb: add RTL8851BE device 0x0bda:0xb850
- btusb: add new VID/PID 13d3/3584 for MT7922
- btusb: add new VID/PID 13d3/3630 and 13d3/3613 for MT7925
- btnxpuart: implement host-wakeup feature
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
-----BEGIN PGP SIGNATURE-----
iQJGBAABCAAwFiEEg1AjqC77wbdLX2LbKSR5jcyPE6QFAmg3D64SHHBhYmVuaUBy
ZWRoYXQuY29tAAoJECkkeY3MjxOkcIsQAK2eEc+BxQer975wzvtMg6gF9eoex4a+
rZ7jxfDzDtNvTauoQsrpehDZp0FnySaVGCU36lHGB2OvDnhCpPc5hXzKDWQpOuqQ
SHrGG3/6FTbdTG/HfHUcbNyrUzIf53SADSObiQ3qg4gyEQ3sCpcOKtVtMcU8rvsY
/HqMnsJWFaROUMjMtCcnUSgjmeY9kBvha3sTXUqgeRugEOCvZD7z4rpqFIcQqHw7
e2Fi8dwIXEYNxqPp6MRq2qdyUTewCRruE8ZIMAFuhtfYeMElUZMPlqlMENX3AzTQ
cr0EgwcFOUxRA7oZRxhoBNBsVXavtSpQr4ZDoWplxP4aQ37n5tc1E9Q72axpB/Og
FbJRl6GvWYnCd8071BczgmfHlKaTAigPvt2Z4r6JjM5I/Bij/IZ3k+On1OTuOAj/
EqfFkdZ0a5cfKrwUMP+oSGtSAywkMVUtnIKJlZeRbjSj2432sCfe2jVAlS8ELM43
3LUgXYrAKtA87g171LlsRu5EEpI5QmqPb+i5LpPlEXe2TJEgPisyfecJ3NafF/2+
j575lm+TFNm9NTNhGGjDPEvw0djI5wSGGMe9J4gC74eWi6s5t6C4cuUf84TKWdwR
x+9H0IB7rfFncAwXHJuUUtzd+fPHaYzs5dDGbSgMQOXr1cr1wlubCK8mQ1r/Wt/a
3GjFIOQKW2Q5
=t/Tz
-----END PGP SIGNATURE-----
Merge tag 'net-next-6.16' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next
Pull networking updates from Paolo Abeni:
"Core:
- Implement the Device Memory TCP transmit path, allowing zero-copy
data transmission on top of TCP from e.g. GPU memory to the wire.
- Move all the IPv6 routing tables management outside the RTNL scope,
under its own lock and RCU. The route control path is now 3x times
faster.
- Convert queue related netlink ops to instance lock, reducing again
the scope of the RTNL lock. This improves the control plane
scalability.
- Refactor the software crc32c implementation, removing unneeded
abstraction layers and improving significantly the related
micro-benchmarks.
- Optimize the GRO engine for UDP-tunneled traffic, for a 10%
performance improvement in related stream tests.
- Cover more per-CPU storage with local nested BH locking; this is a
prep work to remove the current per-CPU lock in local_bh_disable()
on PREMPT_RT.
- Introduce and use nlmsg_payload helper, combining buffer bounds
verification with accessing payload carried by netlink messages.
Netfilter:
- Rewrite the procfs conntrack table implementation, improving
considerably the dump performance. A lot of user-space tools still
use this interface.
- Implement support for wildcard netdevice in netdev basechain and
flowtables.
- Integrate conntrack information into nft trace infrastructure.
- Export set count and backend name to userspace, for better
introspection.
BPF:
- BPF qdisc support: BPF-qdisc can be implemented with BPF struct_ops
programs and can be controlled in similar way to traditional qdiscs
using the "tc qdisc" command.
- Refactor the UDP socket iterator, addressing long standing issues
WRT duplicate hits or missed sockets.
Protocols:
- Improve TCP receive buffer auto-tuning and increase the default
upper bound for the receive buffer; overall this improves the
single flow maximum thoughput on 200Gbs link by over 60%.
- Add AFS GSSAPI security class to AF_RXRPC; it provides transport
security for connections to the AFS fileserver and VL server.
- Improve TCP multipath routing, so that the sources address always
matches the nexthop device.
- Introduce SO_PASSRIGHTS for AF_UNIX, to allow disabling SCM_RIGHTS,
and thus preventing DoS caused by passing around problematic FDs.
- Retire DCCP socket. DCCP only receives updates for bugs, and major
distros disable it by default. Its removal allows for better
organisation of TCP fields to reduce the number of cache lines hit
in the fast path.
- Extend TCP drop-reason support to cover PAWS checks.
Driver API:
- Reorganize PTP ioctl flag support to require an explicit opt-in for
the drivers, avoiding the problem of drivers not rejecting new
unsupported flags.
- Converted several device drivers to timestamping APIs.
- Introduce per-PHY ethtool dump helpers, improving the support for
dump operations targeting PHYs.
Tests and tooling:
- Add support for classic netlink in user space C codegen, so that
ynl-c can now read, create and modify links, routes addresses and
qdisc layer configuration.
- Add ynl sub-types for binary attributes, allowing ynl-c to output
known struct instead of raw binary data, clarifying the classic
netlink output.
- Extend MPTCP selftests to improve the code-coverage.
- Add tests for XDP tail adjustment in AF_XDP.
New hardware / drivers:
- OpenVPN virtual driver: offload OpenVPN data channels processing to
the kernel-space, increasing the data transfer throughput WRT the
user-space implementation.
- Renesas glue driver for the gigabit ethernet RZ/V2H(P) SoC.
- Broadcom asp-v3.0 ethernet driver.
- AMD Renoir ethernet device.
- ReakTek MT9888 2.5G ethernet PHY driver.
- Aeonsemi 10G C45 PHYs driver.
Drivers:
- Ethernet high-speed NICs:
- nVidia/Mellanox (mlx5):
- refactor the steering table handling to significantly
reduce the amount of memory used
- add support for complex matches in H/W flow steering
- improve flow streeing error handling
- convert to netdev instance locking
- Intel (100G, ice, igb, ixgbe, idpf):
- ice: add switchdev support for LLDP traffic over VF
- ixgbe: add firmware manipulation and regions devlink support
- igb: introduce support for frame transmission premption
- igb: adds persistent NAPI configuration
- idpf: introduce RDMA support
- idpf: add initial PTP support
- Meta (fbnic):
- extend hardware stats coverage
- add devlink dev flash support
- Broadcom (bnxt):
- add support for RX-side device memory TCP
- Wangxun (txgbe):
- implement support for udp tunnel offload
- complete PTP and SRIOV support for AML 25G/10G devices
- Ethernet NICs embedded and virtual:
- Google (gve):
- add device memory TCP TX support
- Amazon (ena):
- support persistent per-NAPI config
- Airoha:
- add H/W support for L2 traffic offload
- add per flow stats for flow offloading
- RealTek (rtl8211): add support for WoL magic packet
- Synopsys (stmmac):
- dwmac-socfpga 1000BaseX support
- add Loongson-2K3000 support
- introduce support for hardware-accelerated VLAN stripping
- Broadcom (bcmgenet):
- expose more H/W stats
- Freescale (enetc, dpaa2-eth):
- enetc: add MAC filter, VLAN filter RSS and loopback support
- dpaa2-eth: convert to H/W timestamping APIs
- vxlan: convert FDB table to rhashtable, for better scalabilty
- veth: apply qdisc backpressure on full ring to reduce TX drops
- Ethernet switches:
- Microchip (kzZ88x3): add ETS scheduler support
- Ethernet PHYs:
- RealTek (rtl8211):
- add support for WoL magic packet
- add support for PHY LEDs
- CAN:
- Adds RZ/G3E CANFD support to the rcar_canfd driver.
- Preparatory work for CAN-XL support.
- Add self-tests framework with support for CAN physical interfaces.
- WiFi:
- mac80211:
- scan improvements with multi-link operation (MLO)
- Qualcomm (ath12k):
- enable AHB support for IPQ5332
- add monitor interface support to QCN9274
- add multi-link operation support to WCN7850
- add 802.11d scan offload support to WCN7850
- monitor mode for WCN7850, better 6 GHz regulatory
- Qualcomm (ath11k):
- restore hibernation support
- MediaTek (mt76):
- WiFi-7 improvements
- implement support for mt7990
- Intel (iwlwifi):
- enhanced multi-link single-radio (EMLSR) support on 5 GHz links
- rework device configuration
- RealTek (rtw88):
- improve throughput for RTL8814AU
- RealTek (rtw89):
- add multi-link operation support
- STA/P2P concurrency improvements
- support different SAR configs by antenna
- Bluetooth:
- introduce HCI Driver protocol
- btintel_pcie: do not generate coredump for diagnostic events
- btusb: add HCI Drv commands for configuring altsetting
- btusb: add RTL8851BE device 0x0bda:0xb850
- btusb: add new VID/PID 13d3/3584 for MT7922
- btusb: add new VID/PID 13d3/3630 and 13d3/3613 for MT7925
- btnxpuart: implement host-wakeup feature"
* tag 'net-next-6.16' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1611 commits)
selftests/bpf: Fix bpf selftest build warning
selftests: netfilter: Fix skip of wildcard interface test
net: phy: mscc: Stop clearing the the UDPv4 checksum for L2 frames
net: openvswitch: Fix the dead loop of MPLS parse
calipso: Don't call calipso functions for AF_INET sk.
selftests/tc-testing: Add a test for HFSC eltree double add with reentrant enqueue behaviour on netem
net_sched: hfsc: Address reentrant enqueue adding class to eltree twice
octeontx2-pf: QOS: Refactor TC_HTB_LEAF_DEL_LAST callback
octeontx2-pf: QOS: Perform cache sync on send queue teardown
net: mana: Add support for Multi Vports on Bare metal
net: devmem: ncdevmem: remove unused variable
net: devmem: ksft: upgrade rx test to send 1K data
net: devmem: ksft: add 5 tuple FS support
net: devmem: ksft: add exit_wait to make rx test pass
net: devmem: ksft: add ipv4 support
net: devmem: preserve sockc_err
page_pool: fix ugly page_pool formatting
net: devmem: move list_add to net_devmem_bind_dmabuf.
selftests: netfilter: nft_queue.sh: include file transfer duration in log message
net: phy: mscc: Fix memory leak when using one step timestamping
...
new drivers:
- bring in the asahi uapi header standalone
- nova-drm: stub driver
rust dependencies (for nova-core):
- auxiliary
- bus abstractions
- driver registration
- sample driver
- devres changes from driver-core
- revocable changes
core:
- add Apple fourcc modifiers
- add virtio capset definitions
- extend EXPORT_SYNC_FILE for timeline syncobjs
- convert to devm_platform_ioremap_resource
- refactor shmem helper page pinning
- DP powerup/down link helpers
- remove disgusting turds
- extended %p4cc in vsprintf.c to support fourcc prints
- change vsprintf %p4cn to %p4chR, remove %p4cn
- Add drm_file_err function
- IN_FORMATS_ASYNC property
- move sitronix from tiny to their own subdir
rust:
- add drm core infrastructure rust abstractions
(device/driver, ioctl, file, gem)
dma-buf:
- adjust sg handling to not cache map on attach
- allow setting dma-device for import
- Add a helper to sort and deduplicate dma_fence arrays
docs:
- updated drm scheduler docs
- fbdev todo update
- fb rendering
- actual brightness
ttm:
- fix delayed destroy resv object
bridge:
- add kunit tests
- convert tc358775 to atomic
- convert drivers to devm_drm_bridge_alloc
- convert rk3066_hdmi to bridge driver
scheduler:
- add kunit tests
panel:
- refcount panels to improve lifetime handling
- Powertip PH128800T004-ZZA01
- NLT NL13676BC25-03F, Tianma TM070JDHG34-00
- Himax HX8279/HX8279-D DDIC
- Visionox G2647FB105
- Sitronix ST7571
- ZOTAC rotation quirk
vkms:
- allow attaching more displays
i915:
- xe3lpd display updates
- vrr refactor
- intel_display struct conversions
- xe2hpd memory type identification
- add link rate/count to i915_display_info
- cleanup VGA plane handling
- refactor HDCP GSC
- fix SLPC wait boosting reference counting
- add 20ms delay to engine reset
- fix fence release on early probe errors
xe:
- SRIOV updates
- BMG PCI ID update
- support separate firmware for each GT
- SVM fix, prelim SVM multi-device work
- export fan speed
- temp disable d3cold on BMG
- backup VRAM in PM notifier instead of suspend/freeze
- update xe_ttm_access_memory to use GPU for non-visible access
- fix guc_info debugfs for VFs
- use copy_from_user instead of __copy_from_user
- append PCIe gen5 limitations to xe_firmware document
amdgpu:
- DSC cleanup
- DC Scaling updates
- Fused I2C-over-AUX updates
- DMUB updates
- Use drm_file_err in amdgpu
- Enforce isolation updates
- Use new dma_fence helpers
- USERQ fixes
- Documentation updates
- SR-IOV updates
- RAS updates
- PSP 12 cleanups
- GC 9.5 updates
- SMU 13.x updates
- VCN / JPEG SR-IOV updates
amdkfd:
- Update error messages for SDMA
- Userptr updates
- XNACK fixes
radeon:
- CIK doorbell cleanup
nouveau:
- add support for NVIDIA r570 GSP firmware
- enable Hopper/Blackwell support
nova-core:
- fix task list
- register definition infrastructure
- move firmware into own rust module
- register auxiliary device for nova-drm
nova-drm:
- initial driver skeleton
msm:
- GPU:
- ACD (adaptive clock distribution) for X1-85
- drop fictional address_space_size
- improve GMU HFI response time out robustness
- fix crash when throttling during boot
- DPU:
- use single CTL path for flushing on DPU 5.x+
- improve SSPP allocation code for better sharing
- Enabled SmartDMA on SM8150, SC8180X, SC8280XP, SM8550
- Added SAR2130P support
- Disabled DSC support on MSM8937, MSM8917, MSM8953, SDM660
- DP:
- switch to new audio helpers
- better LTTPR handling
- DSI:
- Added support for SA8775P
- Added SAR2130P support
- HDMI:
- Switched to use new helpers for ACR data
- Fixed old standing issue of HPD not working in some cases
amdxdna:
- add dma-buf support
- allow empty command submits
renesas:
- add dma-buf support
- add zpos, alpha, blend support
panthor:
- fail properly for NO_MMAP bos
- add SET_LABEL ioctl
- debugfs BO dumping support
imagination:
- update DT bindings
- support TI AM68 GPU
hibmc:
- improve interrupt handling and HPD support
virtio:
- add panic handler support
rockchip:
- add RK3588 support
- add DP AUX bus panel support
ivpu:
- add heartbeat based hangcheck
mediatek:
- prepares support for MT8195/99 HDMIv2/DDCv2
anx7625:
- improve HPD
tegra:
- speed up firmware loading
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEEKbZHaGwW9KfbeusDHTzWXnEhr4FAmg2aVAACgkQDHTzWXnE
hr6DjhAApr2fZjugU3EmpsARdcIWgEd+X65R97ef7RlUGqBKm2joSwZGOhH0oBsG
9WyO92Qzu6XMe8OibKqY4D2hir9UPz5v+uEWe3q9CzZGbNyAwyVRjVkaKpnI9upv
1dmHFI7HgPu6qbz6RfPIfgALBLXvVXMaQ4+ZgN/cLtZFa+OLAV5ByqWsRPPXZFb0
F/pQGQ4ursglfA+LH3SVPfnTN53lu93IlM5/Os9OQQGj+44w94zQ6DCm7CY1AugH
n+RM/0Yv7WaoF1ByeOtq4FcrmLRrd+ozsvITbRZqhOx7zS/mhP8LRzAwgKWOYzSh
puKunyQiSdHR7FSqSi8uyY3YumcLWNa/17LMKoTf+KqweJbKGE7RVBuFBn6WUdPb
AYHZrSB4USAeyahdrrsU+q7ltu5urs5ckpbXsRurMiaUz/BLim1PIm3N5FDLPY7B
PD1n1FcMUv3CmJT5Y+aNIQgmf1/dETESRTSAgSoOo3gNp6jdRCYqSuWIBsppibWT
26+tyz0/FGhE50QviHzg0Sv+jd/g93fN6snNlV8wNFMviq3bC69Toa+y3qJ5e7UC
/42R7nCWdkCZJfr6E67rOaahe9TDV/LXLqPErwptOkdK8sMchaIgF+deybgTtTi/
zGRBfjLvb5ocYBmPbeGX4mtXNRpyZ3o9I0QUyGUO4zMwFXmFwn0=
=jpVr
-----END PGP SIGNATURE-----
Merge tag 'drm-next-2025-05-28' of https://gitlab.freedesktop.org/drm/kernel
Pull drm updates from Dave Airlie:
"As part of building up nova-core/nova-drm pieces we've brought in some
rust abstractions through this tree, aux bus being the main one, with
devres changes also in the driver-core tree. Along with the drm core
abstractions and enough nova-core/nova-drm to use them. This is still
all stub work under construction, to build the nova driver upstream.
The other big NVIDIA related one is nouveau adds support for
Hopper/Blackwell GPUs, this required a new GSP firmware update to
570.144, and a bunch of rework in order to support multiple fw
interfaces.
There is also the introduction of an asahi uapi header file as a
precursor to getting the real driver in later, but to unblock
userspace mesa packages while the driver is trapped behind rust
enablement.
Otherwise it's the usual mixture of stuff all over, amdgpu, i915/xe,
and msm being the main ones, and some changes to vsprintf.
new drivers:
- bring in the asahi uapi header standalone
- nova-drm: stub driver
rust dependencies (for nova-core):
- auxiliary
- bus abstractions
- driver registration
- sample driver
- devres changes from driver-core
- revocable changes
core:
- add Apple fourcc modifiers
- add virtio capset definitions
- extend EXPORT_SYNC_FILE for timeline syncobjs
- convert to devm_platform_ioremap_resource
- refactor shmem helper page pinning
- DP powerup/down link helpers
- extended %p4cc in vsprintf.c to support fourcc prints
- change vsprintf %p4cn to %p4chR, remove %p4cn
- Add drm_file_err function
- IN_FORMATS_ASYNC property
- move sitronix from tiny to their own subdir
rust:
- add drm core infrastructure rust abstractions
(device/driver, ioctl, file, gem)
dma-buf:
- adjust sg handling to not cache map on attach
- allow setting dma-device for import
- Add a helper to sort and deduplicate dma_fence arrays
docs:
- updated drm scheduler docs
- fbdev todo update
- fb rendering
- actual brightness
ttm:
- fix delayed destroy resv object
bridge:
- add kunit tests
- convert tc358775 to atomic
- convert drivers to devm_drm_bridge_alloc
- convert rk3066_hdmi to bridge driver
scheduler:
- add kunit tests
panel:
- refcount panels to improve lifetime handling
- Powertip PH128800T004-ZZA01
- NLT NL13676BC25-03F, Tianma TM070JDHG34-00
- Himax HX8279/HX8279-D DDIC
- Visionox G2647FB105
- Sitronix ST7571
- ZOTAC rotation quirk
vkms:
- allow attaching more displays
i915:
- xe3lpd display updates
- vrr refactor
- intel_display struct conversions
- xe2hpd memory type identification
- add link rate/count to i915_display_info
- cleanup VGA plane handling
- refactor HDCP GSC
- fix SLPC wait boosting reference counting
- add 20ms delay to engine reset
- fix fence release on early probe errors
xe:
- SRIOV updates
- BMG PCI ID update
- support separate firmware for each GT
- SVM fix, prelim SVM multi-device work
- export fan speed
- temp disable d3cold on BMG
- backup VRAM in PM notifier instead of suspend/freeze
- update xe_ttm_access_memory to use GPU for non-visible access
- fix guc_info debugfs for VFs
- use copy_from_user instead of __copy_from_user
- append PCIe gen5 limitations to xe_firmware document
amdgpu:
- DSC cleanup
- DC Scaling updates
- Fused I2C-over-AUX updates
- DMUB updates
- Use drm_file_err in amdgpu
- Enforce isolation updates
- Use new dma_fence helpers
- USERQ fixes
- Documentation updates
- SR-IOV updates
- RAS updates
- PSP 12 cleanups
- GC 9.5 updates
- SMU 13.x updates
- VCN / JPEG SR-IOV updates
amdkfd:
- Update error messages for SDMA
- Userptr updates
- XNACK fixes
radeon:
- CIK doorbell cleanup
nouveau:
- add support for NVIDIA r570 GSP firmware
- enable Hopper/Blackwell support
nova-core:
- fix task list
- register definition infrastructure
- move firmware into own rust module
- register auxiliary device for nova-drm
nova-drm:
- initial driver skeleton
msm:
- GPU:
- ACD (adaptive clock distribution) for X1-85
- drop fictional address_space_size
- improve GMU HFI response time out robustness
- fix crash when throttling during boot
- DPU:
- use single CTL path for flushing on DPU 5.x+
- improve SSPP allocation code for better sharing
- Enabled SmartDMA on SM8150, SC8180X, SC8280XP, SM8550
- Added SAR2130P support
- Disabled DSC support on MSM8937, MSM8917, MSM8953, SDM660
- DP:
- switch to new audio helpers
- better LTTPR handling
- DSI:
- Added support for SA8775P
- Added SAR2130P support
- HDMI:
- Switched to use new helpers for ACR data
- Fixed old standing issue of HPD not working in some cases
amdxdna:
- add dma-buf support
- allow empty command submits
renesas:
- add dma-buf support
- add zpos, alpha, blend support
panthor:
- fail properly for NO_MMAP bos
- add SET_LABEL ioctl
- debugfs BO dumping support
imagination:
- update DT bindings
- support TI AM68 GPU
hibmc:
- improve interrupt handling and HPD support
virtio:
- add panic handler support
rockchip:
- add RK3588 support
- add DP AUX bus panel support
ivpu:
- add heartbeat based hangcheck
mediatek:
- prepares support for MT8195/99 HDMIv2/DDCv2
anx7625:
- improve HPD
tegra:
- speed up firmware loading
* tag 'drm-next-2025-05-28' of https://gitlab.freedesktop.org/drm/kernel: (1627 commits)
drm/nouveau/tegra: Fix error pointer vs NULL return in nvkm_device_tegra_resource_addr()
drm/xe: Default auto_link_downgrade status to false
drm/xe/guc: Make creation of SLPC debugfs files conditional
drm/i915/display: Add check for alloc_ordered_workqueue() and alloc_workqueue()
drm/i915/dp_mst: Work around Thunderbolt sink disconnect after SINK_COUNT_ESI read
drm/i915/ptl: Use everywhere the correct DDI port clock select mask
drm/nouveau/kms: add support for GB20x
drm/dp: add option to disable zero sized address only transactions.
drm/nouveau: add support for GB20x
drm/nouveau/gsp: add hal for fifo.chan.doorbell_handle
drm/nouveau: add support for GB10x
drm/nouveau/gf100-: track chan progress with non-WFI semaphore release
drm/nouveau/nv50-: separate CHANNEL_GPFIFO handling out from CHANNEL_DMA
drm/nouveau: add helper functions for allocating pinned/cpu-mapped bos
drm/nouveau: add support for GH100
drm/nouveau: improve handling of 64-bit BARs
drm/nouveau/gv100-: switch to volta semaphore methods
drm/nouveau/gsp: support deeper page tables in COPY_SERVER_RESERVED_PDES
drm/nouveau/gsp: init client VMMs with NV0080_CTRL_DMA_SET_PAGE_DIRECTORY
drm/nouveau/gsp: fetch level shift and PDE from BAR2 VMM
...
- Update overflow helpers to ease refactoring of on-stack flex array
instances (Gustavo A. R. Silva, Kees Cook)
- lkdtm: Use SLAB_NO_MERGE instead of constructors (Harry Yoo)
- Simplify CONFIG_CC_HAS_COUNTED_BY (Jan Hendrik Farr)
- Disable u64 usercopy KUnit test on 32-bit SPARC (Thomas Weißschuh)
- Add missed designated initializers now exposed by fixed randstruct
(Nathan Chancellor, Kees Cook)
- Document compilers versions for __builtin_dynamic_object_size
- Remove ARM_SSP_PER_TASK GCC plugin
- Fix GCC plugin randstruct, add selftests, and restore COMPILE_TEST
builds
- Kbuild: induce full rebuilds when dependencies change with GCC plugins,
the Clang sanitizer .scl file, or the randstruct seed.
- Kbuild: Switch from -Wvla to -Wvla-larger-than=1
- Correct several __nonstring uses for -Wunterminated-string-initialization
-----BEGIN PGP SIGNATURE-----
iHUEABYKAB0WIQRSPkdeREjth1dHnSE2KwveOeQkuwUCaDUq9gAKCRA2KwveOeQk
u+ZCAQDhqpOE/yn5gfjyplIvaTtzj9CaW6g11AmPYrimJCuj3QD9G+0o35kzlXOw
f0ZIj2U7LFNgbLos+20hQwhMFf1Zhgg=
=OYzD
-----END PGP SIGNATURE-----
Merge tag 'hardening-v6.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux
Pull hardening updates from Kees Cook:
- Update overflow helpers to ease refactoring of on-stack flex array
instances (Gustavo A. R. Silva, Kees Cook)
- lkdtm: Use SLAB_NO_MERGE instead of constructors (Harry Yoo)
- Simplify CONFIG_CC_HAS_COUNTED_BY (Jan Hendrik Farr)
- Disable u64 usercopy KUnit test on 32-bit SPARC (Thomas Weißschuh)
- Add missed designated initializers now exposed by fixed randstruct
(Nathan Chancellor, Kees Cook)
- Document compilers versions for __builtin_dynamic_object_size
- Remove ARM_SSP_PER_TASK GCC plugin
- Fix GCC plugin randstruct, add selftests, and restore COMPILE_TEST
builds
- Kbuild: induce full rebuilds when dependencies change with GCC
plugins, the Clang sanitizer .scl file, or the randstruct seed.
- Kbuild: Switch from -Wvla to -Wvla-larger-than=1
- Correct several __nonstring uses for -Wunterminated-string-initialization
* tag 'hardening-v6.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: (23 commits)
Revert "hardening: Disable GCC randstruct for COMPILE_TEST"
lib/tests: randstruct: Add deep function pointer layout test
lib/tests: Add randstruct KUnit test
randstruct: gcc-plugin: Remove bogus void member
net: qede: Initialize qede_ll_ops with designated initializer
scsi: qedf: Use designated initializer for struct qed_fcoe_cb_ops
md/bcache: Mark __nonstring look-up table
integer-wrap: Force full rebuild when .scl file changes
randstruct: Force full rebuild when seed changes
gcc-plugins: Force full rebuild when plugins change
kbuild: Switch from -Wvla to -Wvla-larger-than=1
hardening: simplify CONFIG_CC_HAS_COUNTED_BY
overflow: Fix direct struct member initialization in _DEFINE_FLEX()
kunit/overflow: Add tests for STACK_FLEX_ARRAY_SIZE() helper
overflow: Add STACK_FLEX_ARRAY_SIZE() helper
input/joystick: magellan: Mark __nonstring look-up table const
watchdog: exar: Shorten identity name to fit correctly
mod_devicetable: Enlarge the maximum platform_device_id name length
overflow: Clarify expectations for getting DEFINE_FLEX variable sizes
compiler_types: Identify compiler versions for __builtin_dynamic_object_size
...
* Move kern_table members out of kernel/sysctl.c
Moved a subset (tracing, panic, signal, stack_tracer and sparc) out of the
kern_table array. The goal is for kern_table to only have sysctl elements. All
this increases modularity by placing the ctl_tables closer to where they are
used while reducing the chances of merge conflicts in kernel/sysctl.c.
* Fixed sysctl unit test panic by relocating it to selftests
* Testing
These have been in linux-next from rc2, so they have had more than a month
worth of testing.
-----BEGIN PGP SIGNATURE-----
iQGzBAABCgAdFiEErkcJVyXmMSXOyyeQupfNUreWQU8FAmgwLsAACgkQupfNUreW
QU9ghwv/VKZW+IXEvSjc8OiwntWkL7e5ddHY6O2Vf44MzhBefLTXmfx2HfkEA0Xw
RaOQ28Hf/zQL83RqHHnXqI7JdGWQJUm8bCPwk4H3DCaF8qOfPVvblVYmfNL2auSY
oyRRpRzZuY5EtKcrNjiHFHL2WIC8KvPVwS748oHY1eZY7kn1fcs8DDnNO4iuWop+
uJeDxu87wkRCFXF3DIM+MAHRvxSa8GHtZvb9EjAl/EHMbAyVSz3uTb7FdQDdnE09
s7P30EC03RHtgi3sd2Ku04dJsHLz7VErvpToxSH2KFlcdpJuWuCSCTT8XaD8kII8
kYYCxNpmPOf4LzEy/J2vVZB0PSHrHvuQCH7iGy+8wOPk9GHTOMkKMMXVmeGnAsef
AiosPYroxXp/nBFcuNs6/1LKpsdpFr2F6u6oMgbzLaW1Xe/oc+6oynuOgeVj9LuM
FrSxSwaVvpdwHYHujYPQAAWIgKRzITiEXnCgtSyohFquKb+7E8ZspwjOqYH2xWMQ
WwABNRqY
=45X2
-----END PGP SIGNATURE-----
Merge tag 'sysctl-6.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/sysctl/sysctl
Pull sysctl updates from Joel Granados:
- Move kern_table members out of kernel/sysctl.c
Moved a subset (tracing, panic, signal, stack_tracer and sparc) out
of the kern_table array. The goal is for kern_table to only have
sysctl elements. All this increases modularity by placing the
ctl_tables closer to where they are used while reducing the chances
of merge conflicts in kernel/sysctl.c.
- Fixed sysctl unit test panic by relocating it to selftests
* tag 'sysctl-6.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/sysctl/sysctl:
sysctl: Close test ctl_headers with a for loop
sysctl: call sysctl tests with a for loop
sysctl: Add 0012 to test the u8 range check
sysctl: move u8 register test to lib/test_sysctl.c
sparc: mv sparc sysctls into their own file under arch/sparc/kernel
stack_tracer: move sysctl registration to kernel/trace/trace_stack.c
tracing: Move trace sysctls into trace.c
signal: Move signal ctl tables into signal.c
panic: Move panic ctl tables into panic.c
The function is small enough that it should be, and it's a (very) hot path
for io_uring. Doing this actually reduces my vmlinux text size for my
standard build/test box.
Before:
axboe@r7625 ~/g/linux (test)> size vmlinux
text data bss dec hex filename
19892174 5938310 2470432 28300916 1afd674 vmlinux
After:
axboe@r7625 ~/g/linux (test)> size vmlinux
text data bss dec hex filename
19891878 5938310 2470436 28300624 1afd550 vmlinux
Link: https://lkml.kernel.org/r/f1d104c6-7ac8-457a-a53d-6bb741421b2f@kernel.dk
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
- Add Platform Temperature Control (PTC) support to the Intel int340x
thermal driver (Srinivas Pandruvada).
- Make the Hisilicon thermal driver compile by default when ARCH_HISI
is set (Krzysztof Kozlowski).
- Clean up printk() format by using %pC instead of %pCn in the bcm2835
thermal driver (Luca Ceresoli).
- Fix variable name coding style in the AmLogic thermal driver (Enrique
Isidoro Vazquez Ramos).
- Fix missing debugfs entry removal on failure by using the devm_
variant in the LVTS thermal driver (AngeloGioacchino Del Regno).
- Remove the unused lvts_debugfs_exit() function as the devm_ variant
introduced before takes care of removing the debugfs entry in the
LVTS driver (Arnd Bergmann).
- Add the Airoha EN7581 thermal sensor support along with its DT
bindings (Christian Marangi).
- Add ipq5018 compatible string DT binding, cleanup and add its suppot
to the QCom Tsens thermal driver (Sricharan Ramabadhran, George
Moussalem).
- Fix comments typos in the Airoha driver (Christian Marangi, Colin Ian
King).
- Address a sparse warning by making a local variable static in the
QCom thermal driver (George Moussalem).
- Fix the usage of the _SCP control method in the driver for ACPI
thermal zones (Armin Wolf).
-----BEGIN PGP SIGNATURE-----
iQFGBAABCAAwFiEEcM8Aw/RY0dgsiRUR7l+9nS/U47UFAmg0jdQSHHJqd0Byand5
c29ja2kubmV0AAoJEO5fvZ0v1OO1uPkH/2hZsBDB0sKr9nLN+V1tprdhflZxSRIB
qD65DxWXJ6pumXctaO6WYD1Vf8drO0X3kOcdpHrb+R4Im8qBz290FoUPi3FzUmNM
Qq6erheB/h4FP6EFKJmf5vWCn23nLAT0YtVS6yP9+4DKrqAoGAIlVjxcKc6+tcf/
ZBPWNUNuis+Xk6FD300X2gE1OIh5ZfIvKSh/RnExIqAqRYV8rtGCUdsqid4Rn+Jb
4mzDVnXW5TiJpfoRf5/0gxMYtcTcOIxbtAPAOnXw+4aJZtNK/oN5AqLbf9TNz1vM
ZSHBR0kPDxIa0ppT8SfvzPL5gE6lrxx05WDOvr7PEWHG1GzSFIPkMNc=
=Un/6
-----END PGP SIGNATURE-----
Merge tag 'thermal-6.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull thermal control updates from Rafael Wysocki:
"These add support for a new feature, Platform Temperature Control
(PTC), to the Intel int340x thermal driver, add support for the Airoha
EN7581 thermal sensor and the IPQ5018 platform, fix up the ACPI
thermal zones handling, fix other assorted issues and clean up code
Specifics:
- Add Platform Temperature Control (PTC) support to the Intel int340x
thermal driver (Srinivas Pandruvada)
- Make the Hisilicon thermal driver compile by default when ARCH_HISI
is set (Krzysztof Kozlowski)
- Clean up printk() format by using %pC instead of %pCn in the
bcm2835 thermal driver (Luca Ceresoli)
- Fix variable name coding style in the AmLogic thermal driver
(Enrique Isidoro Vazquez Ramos)
- Fix missing debugfs entry removal on failure by using the devm_
variant in the LVTS thermal driver (AngeloGioacchino Del Regno)
- Remove the unused lvts_debugfs_exit() function as the devm_ variant
introduced before takes care of removing the debugfs entry in the
LVTS driver (Arnd Bergmann)
- Add the Airoha EN7581 thermal sensor support along with its DT
bindings (Christian Marangi)
- Add ipq5018 compatible string DT binding, cleanup and add its
suppot to the QCom Tsens thermal driver (Sricharan Ramabadhran,
George Moussalem)
- Fix comments typos in the Airoha driver (Christian Marangi, Colin
Ian King)
- Address a sparse warning by making a local variable static in the
QCom thermal driver (George Moussalem)
- Fix the usage of the _SCP control method in the driver for ACPI
thermal zones (Armin Wolf)"
* tag 'thermal-6.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
thermal: qcom: ipq5018: make ops_ipq5018 struct static
thermal/drivers/airoha: Fix spelling mistake "calibrarion" -> "calibration"
ACPI: thermal: Execute _SCP before reading trip points
ACPI: OSI: Stop advertising support for "3.0 _SCP Extensions"
thermal/drivers/airoha: Fix spelling mistake
thermal/drivers/qcom/tsens: Add support for IPQ5018 tsens
thermal/drivers/qcom/tsens: Add support for tsens v1 without RPM
thermal/drivers/qcom/tsens: Update conditions to strictly evaluate for IP v2+
dt-bindings: thermal: qcom-tsens: Add ipq5018 compatible
thermal/drivers: Add support for Airoha EN7581 thermal sensor
dt-bindings: thermal: Add support for Airoha EN7581 thermal sensor
thermal/drivers/mediatek/lvts: Remove unused lvts_debugfs_exit
thermal/drivers/mediatek/lvts: Fix debugfs unregister on failure
thermal/drivers/amlogic: Rename Uptat to uptat to follow kernel coding style
vsprintf: remove redundant and unused %pCn format specifier
thermal/drivers/bcm2835: Use %pC instead of %pCn
thermal/drivers/hisi: Do not enable by default during compile testing
thermal: int340x: processor_thermal: Platform temperature control documentation
thermal: intel: int340x: Enable platform temperature control
thermal: intel: int340x: Add platform temperature control interface
We've received a lot of activities in this cycle, mostly about leaf
driver codes rather than the core part, but with a good mixture of
code cleanups and new driver additions. Below are some highlights:
* ASoC:
- Support for automatically enumerating DAIs from standards conforming
SoundWire SDCA devices; not much used as of this writing, rather for
future implementations
- Conversion of quite a few drivers to newer GPIO APIs
- Continued cleanups and helper usages in allover places
- Support for a wider range of Intel AVS platforms
- Support for AMD ACP 7.x platforms, Cirrus Logic CS35L63 and CS48L32
Everest Semiconductor ES8375 and ES8389, Longsoon-1 AC'97
controllers, nVidia Tegra264, Richtek ALC203 and RT9123 and Rockchip
SAI controllers
* HD-audio:
- Lots of cleanups of TAS2781 codec drivers
- A new HD-audio control bound via ACPI for Nvidia
- Support for Tegra264, Intel WCL, usual new codec quirks
* USB-audio:
- Fix a race at removal of MIDI device
- Pioneer DJM-V10 support, Scarlett2 driver cleanups
* Misc:
- Cleanups of deprecated PCI functions
- Removal of unused / dead function codes
-----BEGIN PGP SIGNATURE-----
iQJCBAABCAAsFiEEIXTw5fNLNI7mMiVaLtJE4w1nLE8FAmg0KfoOHHRpd2FpQHN1
c2UuZGUACgkQLtJE4w1nLE/Opg/9H5ZaQpuSj9z5YiG6q3gNzy7lsfcvCqoAqgLW
w9tVo5cXFH7t9+9EZUhB73sxI0VWNJsF83l+vnMqxCn/SkUzey3CPThiGQuhJtjh
oRsqeTxxhuHjOXDapnbHJ2r9rMoAqmnabATdQYKKkYZEV8bBeBQQWFLNGtoBCE24
xmIsyvM7lycOTZaf43uUQVJNqV86ZxV78y7Zoit5l11iZZY78j0c7i7naxSHd+Vr
WVsxd90urSHBo6EsXyHMtaqBrWLTQmhA5v9UU0k6wm7DLKrNmb1fUo0vQlKk/EXn
VipFz6V90IzHRbti6nZefSi2UjwaTneHa+FTspPZjOPG19q+h4MCF0s1gUNou6YG
nqSLU+T37TEZeWpNurhiAwDNKax3/F4Pt7Hz+u4pMcnx25bNvKCb5LMgNU9l9stV
Ar9X4rC5zfqdSsHTFOUgndV+GilqTgUk2efCW89fH2BmkZGM4Xd0JRp+xy2ECvzl
RQq4PPvKcqt0/9GphLkLhpQCh5rWpXahVsmxH7GVrtMUlvRYd+FbKUrlalwOfJqE
j8SJLQKe3yHztH+AXIaIigLaDA0qtCGjnEGSokKGXmCdFH1Pmdm+mnrPFw/wrCXv
9uvWZvEAhqP5TiH5n8Yw50n8p4X7IDNBALeKFukQVi7qKV4R/aYWN1IaQMoZfVUZ
duVWPZg=
=ApEf
-----END PGP SIGNATURE-----
Merge tag 'sound-6.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
Pull sound updates from Takashi Iwai:
"We've received a lot of activities in this cycle, mostly about leaf
driver codes rather than the core part, but with a good mixture of
code cleanups and new driver additions. Below are some highlights:
ASoC:
- Support for automatically enumerating DAIs from standards
conforming SoundWire SDCA devices; not much used as of this
writing, rather for future implementations
- Conversion of quite a few drivers to newer GPIO APIs
- Continued cleanups and helper usages in allover places
- Support for a wider range of Intel AVS platforms
- Support for AMD ACP 7.x platforms, Cirrus Logic CS35L63 and CS48L32
Everest Semiconductor ES8375 and ES8389, Longsoon-1 AC'97
controllers, nVidia Tegra264, Richtek ALC203 and RT9123 and
Rockchip SAI controllers
HD-audio:
- Lots of cleanups of TAS2781 codec drivers
- A new HD-audio control bound via ACPI for Nvidia
- Support for Tegra264, Intel WCL, usual new codec quirks
USB-audio:
- Fix a race at removal of MIDI device
- Pioneer DJM-V10 support, Scarlett2 driver cleanups
Misc:
- Cleanups of deprecated PCI functions
- Removal of unused / dead function codes"
* tag 'sound-6.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: (364 commits)
firmware: cs_dsp: Fix OOB memory read access in KUnit test
ASoC: codecs: add support for ES8375
ASoC: dt-bindings: Add Everest ES8375 audio CODEC
ALSA: hda: acpi: Make driver's match data const static
ALSA: hda: acpi: Use SYSTEM_SLEEP_PM_OPS()
ALSA: atmel: Replace deprecated strcpy() with strscpy()
ALSA: core: fix up bus match const issues.
ASoC: wm_adsp: Make cirrus_dir const
ASoC: tegra: Tegra264 support in isomgr_bw
ASoC: tegra: AHUB: Add Tegra264 support
ASoC: tegra: ADX: Add Tegra264 support
ASoC: tegra: AMX: Add Tegra264 support
ASoC: tegra: I2S: Add Tegra264 support
ASoC: tegra: Update PLL rate for Tegra264
ASoC: tegra: ASRC: Update ARAM address
ASoC: tegra: ADMAIF: Add Tegra264 support
ASoC: tegra: CIF: Add Tegra264 support
dt-bindings: ASoC: Document Tegra264 APE support
dt-bindings: ASoC: admaif: Add missing properties
ASoC: dt-bindings: audio-graph-card2: reference audio-graph routing property
...
Changes
-------
* Reduce open-coded use of ratelimit_state structure fields.
* Convert the ->missed field to atomic_t.
* Count misses that are due to lock contention.
* Eliminate jiffies=0 special case.
* Reduce ___ratelimit() false-positive rate limiting (Petr Mladek).
* Allow zero ->burst to hard-disable rate limiting.
* Optimize away atomic operations when a miss is guaranteed.
* Warn if ->interval or ->burst are negative (Petr Mladek).
* Simplify the resulting code.
A smoke test and stress test have been created, but they are not yet ready
for mainline. With luck, we will offer them for the v6.17 merge window.
-----BEGIN PGP SIGNATURE-----
iQJHBAABCgAxFiEEbK7UrM+RBIrCoViJnr8S83LZ+4wFAmgzWdgTHHBhdWxtY2tA
a2VybmVsLm9yZwAKCRCevxLzctn7jBnZD/wKD+5f8qwEuaib901yLb/s4ZkS3Aly
mcPLAcTSFc7jp3c188V2qPAAgobQW+NnOLZ/TuB34tvS/Ngm/Yo1EPiHD2AXPzfY
FPlgmjvOQQQo9dfA1PNOegh/aKYhMJrho85cilcM1TojuSSVbo1lG1FbuvqMJ9Ub
jPHRB6KaFDnhwJkWHJ4Fjl1z1TQcyxjrBoswEcMCKapNqrm6IiXLvw03Nme3wa7F
tr30xue5GV/FyHMa14g/8GSpZ88Lr5VGsOoC0wz2KhfMsZcFRkmslgm8mxHawwXj
MbQaW7Th+fD7H0pC/lbHIiKaXvizYbQCPXr4qf5gfqNf4R/BHE1QdSTmP45kjEXO
CmZEwwVx8cVdyoY9N9udDhNZly/U83G3F1n38jdM2SCPjn3F8BAanZRwakLEvmCC
XUQ0bvzQvJl0LM5ktyaaMZdWf3p6uah7otryCPdsA5V7BFgyx5ZHniXY6v1JgfhX
2nmYRO3vEoQ39Z+vPtu7DvS64oe/aYgkSIoG68rKhSb4S+nIdIZ8zwB1af4Nkk8e
YZwlwjIRw+RCu/QET4GXLE1tHQ031kWR/xG8nDnNGE3XCjnfRArAcdMZNDlc3U5k
GT1g8zOJR2jmlEvWUZwpmclc1yeHXTK1P9nOHmzhxw26eiXmY353PZcND9Ktnt/a
RH550D0vUqFM7A==
=ivk+
-----END PGP SIGNATURE-----
Merge tag 'ratelimit.2025.05.25a' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu
Pull rate-limit updates from Paul McKenney:
"lib/ratelimit: Reduce false-positive and silent misses:
- Reduce open-coded use of ratelimit_state structure fields.
- Convert the ->missed field to atomic_t.
- Count misses that are due to lock contention.
- Eliminate jiffies=0 special case.
- Reduce ___ratelimit() false-positive rate limiting (Petr Mladek).
- Allow zero ->burst to hard-disable rate limiting.
- Optimize away atomic operations when a miss is guaranteed.
- Warn if ->interval or ->burst are negative (Petr Mladek).
- Simplify the resulting code.
A smoke test and stress test have been created, but they are not yet
ready for mainline. With luck, we will offer them for the v6.17 merge
window"
* tag 'ratelimit.2025.05.25a' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu:
ratelimit: Drop redundant accesses to burst
ratelimit: Use nolock_ret restructuring to collapse common case code
ratelimit: Use nolock_ret label to collapse lock-failure code
ratelimit: Use nolock_ret label to save a couple of lines of code
ratelimit: Simplify common-case exit path
ratelimit: Warn if ->interval or ->burst are negative
ratelimit: Avoid atomic decrement under lock if already rate-limited
ratelimit: Avoid atomic decrement if already rate-limited
ratelimit: Don't flush misses counter if RATELIMIT_MSG_ON_RELEASE
ratelimit: Force re-initialization when rate-limiting re-enabled
ratelimit: Allow zero ->burst to disable ratelimiting
ratelimit: Reduce ___ratelimit() false-positive rate limiting
ratelimit: Avoid jiffies=0 special case
ratelimit: Count misses due to lock contention
ratelimit: Convert the ->missed field to atomic_t
drm/amd/pm: Avoid open-coded use of ratelimit_state structure's internals
drm/i915: Avoid open-coded use of ratelimit_state structure's ->missed field
random: Avoid open-coded use of ratelimit_state structure's ->missed field
ratelimit: Create functions to handle ratelimit_state internals
multiple architectures can share the fs API for manipulating their
respective hw resource control implementation. This is the second step
in the work towards sharing the resctrl filesystem interface, the next
one being plugging ARM's MPAM into the aforementioned fs API.
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmg0UDwACgkQEsHwGGHe
VUqsZw//SNSNcVHF7Gz2YvHrMXGYQFBETScg6fRWn/pTe3x1NrKEJedzMANXpAIy
1sBAsfDSOyi8MxIZnvMYapLcRdfLGAD+6FQTkyu/IQ3oSsjAxPgrTXornhxUswMY
LUs40hCv/UaEMkg35NVrRqDlT973kWLwA4iDNNnm6IGtrC8qv4EmdJvgVWHyPTjk
D80KA5ta+iPzK4l8noBrqyhUIZN3ZAJVJLrjS3Tx/gabuolLURE6p4IdlF/O6WzC
4NcqUjpwDeFpHpl2M9QJLVEKXHxKz9zZF2gLpT8Eon/ftqqQigBjzsUx/FKp07hZ
fe2AiQsd4gN9GZa3BGX+Lv+bjvyFadARsOoFbY45szuiUb0oceaRYtFF1ihmO0bV
bD4nAROE1kAfZpr/9ZRZT63LfE/DAm9TR1YBsViq1rrJvp4odvL15YbdOlIDHZD3
SmxhTxAokj058MRnhGdHoiMtPa54iw186QYDp0KxLQHLrToBPd7RBtRE8jsYrqrv
2EvwUxYKyO4vtwr9tzr0ZfptZ/DEsGovoTYD5EtlEGjotQUqsmi5Rxx4+SEQuwFw
CKSJ3j73gpxqDXTujjOe9bCeeXJqyEbrIkaWpkiBRwm5of7eFPG3Sw74jaCGvm4L
NM4UufMSDtyVAKfu3HmPkGhujHv0/7h1zYND51aW+GXEroKxy9s=
=eNCr
-----END PGP SIGNATURE-----
Merge tag 'x86_cache_for_v6.16_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 resource control updates from Borislav Petkov:
"Carve out the resctrl filesystem-related code into fs/resctrl/ so that
multiple architectures can share the fs API for manipulating their
respective hw resource control implementation.
This is the second step in the work towards sharing the resctrl
filesystem interface, the next one being plugging ARM's MPAM into the
aforementioned fs API"
* tag 'x86_cache_for_v6.16_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (25 commits)
MAINTAINERS: Add reviewers for fs/resctrl
x86,fs/resctrl: Move the resctrl filesystem code to live in /fs/resctrl
x86/resctrl: Always initialise rid field in rdt_resources_all[]
x86/resctrl: Relax some asm #includes
x86/resctrl: Prefer alloc(sizeof(*foo)) idiom in rdt_init_fs_context()
x86/resctrl: Squelch whitespace anomalies in resctrl core code
x86/resctrl: Move pseudo lock prototypes to include/linux/resctrl.h
x86/resctrl: Fix types in resctrl_arch_mon_ctx_{alloc,free}() stubs
x86/resctrl: Move enum resctrl_event_id to resctrl.h
x86/resctrl: Move the filesystem bits to headers visible to fs/resctrl
fs/resctrl: Add boiler plate for external resctrl code
x86/resctrl: Add 'resctrl' to the title of the resctrl documentation
x86/resctrl: Split trace.h
x86/resctrl: Expand the width of domid by replacing mon_data_bits
x86/resctrl: Add end-marker to the resctrl_event_id enum
x86/resctrl: Move is_mba_sc() out of core.c
x86/resctrl: Drop __init/__exit on assorted symbols
x86/resctrl: Resctrl_exit() teardown resctrl but leave the mount point
x86/resctrl: Check all domains are offline in resctrl_exit()
x86/resctrl: Rename resctrl_sched_in() to begin with "resctrl_arch_"
...
- Enables qemu_config for riscv32, sparc 64-bit, PowerPC 32-bit BE and
64-bit LE.
- Enables CONFIG_SPARC32 to clearly differentiate between sparc 32-bit
and 64-bit configurations.
- Enables CONFIG_CPU_BIG_ENDIAN to clearly differentiate between powerpc
LE and BE configurations.
- Add feature to list available architectures to kunit tool.
- Fixes to bugs and changes to documentation.
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEPZKym/RZuOCGeA/kCwJExA0NQxwFAmgwylUACgkQCwJExA0N
Qxz1/w/+NVxyC80omepHOxoi8Y+j5vWSlQ3S4n+HThyeb4srJlvJuidZN+zrRIEU
QLKo4edUDa25I+JkXnnw/8gLG1000PESM/VjC6NIzhCrEHc/cX/WS9OlwbR5JkpW
wOlxju3vWYrfy5i+LSYiH+v822WlSZsznH9AM9w+ws4FWoRNN+7LiEitj2LElrWI
XHpEYDsBhePu2cFQgAdHaQ5YcMmG64KqeX5Xj+cGjtURciijQDxVuX5k03n0qkbn
YrH4U0it8ZqxJTImpExnqlP2G8uZS72hE91vK0hwW9oZ+/k9vVeBQWJ13JvNKveb
+zHGUimJ0n6d5HB5ZFaUAazkbZHKq7mWPp86lBYNiRXsseDwwqDSva5tfUxzXGQP
MNDK2wrHusH0beRdzdlPjynJABxiOnczmBOAj/Y6t+ZgC2D8BHvyvj+Pecdz3Tr0
YPLgpY72LqxTKDCvOf91IeT8EFaAjyGfVcN6SvUJ/zjPc7UcopM2DEQvlSZcU/Ac
EAzgWmFfkwowSWwr8Q4GOXuKzkMj65QhouJrSCoBjuHQuUxa2MZxAbv5snPmSLs0
635MdPU9uPAEvZKJTbVlXKBMCM3FrS8Buy+dHFEuUiUy0i3PndV5aJUQpIxwM/Du
+DjQkREy6wsIHXSTu4lm+zoZ0yVKQdEZJgsYjY+JfRQ25cLWgiQ=
=32OH
-----END PGP SIGNATURE-----
Merge tag 'linux_kselftest-kunit-6.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest
Pull kunit updates from Shuah Khan:
- Enable qemu_config for riscv32, sparc 64-bit, PowerPC 32-bit BE and
64-bit LE
- Enable CONFIG_SPARC32 to clearly differentiate between sparc 32-bit
and 64-bit configurations
- Enable CONFIG_CPU_BIG_ENDIAN to clearly differentiate between powerpc
LE and BE configurations
- Add feature to list available architectures to kunit tool
- Fixes to bugs and changes to documentation
* tag 'linux_kselftest-kunit-6.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
kunit: Fix wrong parameter to kunit_deactivate_static_stub()
kunit: tool: add test counts to JSON output
Documentation: kunit: improve example on testing static functions
kunit: executor: Remove const from kunit_filter_suites() allocation type
kunit: qemu_configs: Disable faulting tests on 32-bit SPARC
kunit: qemu_configs: Add 64-bit SPARC configuration
kunit: qemu_configs: sparc: Explicitly enable CONFIG_SPARC32=y
kunit: qemu_configs: Add PowerPC 32-bit BE and 64-bit LE
kunit: qemu_configs: powerpc: Explicitly enable CONFIG_CPU_BIG_ENDIAN=y
kunit: tool: Implement listing of available architectures
kunit: qemu_configs: Add riscv32 config
kunit: configs: Enable CONFIG_INIT_STACK_ALL_PATTERN in all_tests
API:
- Fix memcpy_sglist to handle partially overlapping SG lists.
- Use memcpy_sglist to replace null skcipher.
- Rename CRYPTO_TESTS to CRYPTO_BENCHMARK.
- Flip CRYPTO_MANAGER_DISABLE_TEST into CRYPTO_SELFTESTS.
- Hide CRYPTO_MANAGER.
- Add delayed freeing of driver crypto_alg structures.
Compression:
- Allocate large buffers on first use instead of initialisation in scomp.
- Drop destination linearisation buffer in scomp.
- Move scomp stream allocation into acomp.
- Add acomp scatter-gather walker.
- Remove request chaining.
- Add optional async request allocation.
Hashing:
- Remove request chaining.
- Add optional async request allocation.
- Move partial block handling into API.
- Add ahash support to hmac.
- Fix shash documentation to disallow usage in hard IRQs.
Algorithms:
- Remove unnecessary SIMD fallback code on x86 and arm/arm64.
- Drop avx10_256 xts(aes)/ctr(aes) on x86.
- Improve avx-512 optimisations for xts(aes).
- Move chacha arch implementations into lib/crypto.
- Move poly1305 into lib/crypto and drop unused Crypto API algorithm.
- Disable powerpc/poly1305 as it has no SIMD fallback.
- Move sha256 arch implementations into lib/crypto.
- Convert deflate to acomp.
- Set block size correctly in cbcmac.
Drivers:
- Do not use sg_dma_len before mapping in sun8i-ss.
- Fix warm-reboot failure by making shutdown do more work in qat.
- Add locking in zynqmp-sha.
- Remove cavium/zip.
- Add support for PCI device 0x17D8 to ccp.
- Add qat_6xxx support in qat.
- Add support for RK3576 in rockchip-rng.
- Add support for i.MX8QM in caam.
Others:
- Fix irq_fpu_usable/kernel_fpu_begin inconsistency during CPU bring-up.
- Add new SEV/SNP platform shutdown API in ccp.
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEn51F/lCuNhUwmDeSxycdCkmxi6cFAmgz47AACgkQxycdCkmx
i6fvKRAAr4Xa903L0r1Q1P1alQqoFFCqimUWeH72m68LiWynHWi0lUo0z/+tKweg
mnPStz7/Ha9HRHJjdNCMPnlJqXQDkuH3bIOuBJCwduDuhHo9VGOd46XGzmGMv3gb
HKuZhI0lk7pznK3CSyD/2nHmbDCHD+7feTZSBMoN9mm875+aSoM6fdxgak8uPFcq
KbB1L+hObTn2kAPSqRrNOR8/xG2N7hdH8eax7Li+LAtqYNVT5HvWVECsB/CKRPfB
sgAv3UTzcIFapSSHUHaONppSeoqPAIAeV7SdQhJvlT+EUUR/h/B6+D9OUQQqbphQ
LBalgTnqMKl0ymDEQFQ6QyYCat9ZfNmDft2WcXEsxc8PxImkgJI1W3B8O51sOjbG
78D8JqVQ96dleo4FsBhM2wfG0b41JM6zU4raC4vS7a3qsUS+Q1MpehvcS1iORicy
SpGdE8e7DLlxKhzWyW1xJnbrtMZDC7Sa2hUnxrvP0/xOvRhChKscRVtWcf0a5q7X
8JmuvwVSOJuSbQ3MeFbQvpo5lR9+0WsNjM6e9miiH6Y7vZUKmWcq2yDp377qVzeh
7NK6+OwGIQZZExrmtPw2BXwssT9Eg+ks6Y7g2Ne7yzvrjVNfEPY7Cws/5w7p8mRS
qhrcpbJNFlWgD7YYkmGZFTQ8DCN25ipP8lklO/hbcfchqLE/o1o=
=O8L5
-----END PGP SIGNATURE-----
Merge tag 'v6.16-p1' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
Pull crypto updates from Herbert Xu:
"API:
- Fix memcpy_sglist to handle partially overlapping SG lists
- Use memcpy_sglist to replace null skcipher
- Rename CRYPTO_TESTS to CRYPTO_BENCHMARK
- Flip CRYPTO_MANAGER_DISABLE_TEST into CRYPTO_SELFTESTS
- Hide CRYPTO_MANAGER
- Add delayed freeing of driver crypto_alg structures
Compression:
- Allocate large buffers on first use instead of initialisation in scomp
- Drop destination linearisation buffer in scomp
- Move scomp stream allocation into acomp
- Add acomp scatter-gather walker
- Remove request chaining
- Add optional async request allocation
Hashing:
- Remove request chaining
- Add optional async request allocation
- Move partial block handling into API
- Add ahash support to hmac
- Fix shash documentation to disallow usage in hard IRQs
Algorithms:
- Remove unnecessary SIMD fallback code on x86 and arm/arm64
- Drop avx10_256 xts(aes)/ctr(aes) on x86
- Improve avx-512 optimisations for xts(aes)
- Move chacha arch implementations into lib/crypto
- Move poly1305 into lib/crypto and drop unused Crypto API algorithm
- Disable powerpc/poly1305 as it has no SIMD fallback
- Move sha256 arch implementations into lib/crypto
- Convert deflate to acomp
- Set block size correctly in cbcmac
Drivers:
- Do not use sg_dma_len before mapping in sun8i-ss
- Fix warm-reboot failure by making shutdown do more work in qat
- Add locking in zynqmp-sha
- Remove cavium/zip
- Add support for PCI device 0x17D8 to ccp
- Add qat_6xxx support in qat
- Add support for RK3576 in rockchip-rng
- Add support for i.MX8QM in caam
Others:
- Fix irq_fpu_usable/kernel_fpu_begin inconsistency during CPU bring-up
- Add new SEV/SNP platform shutdown API in ccp"
* tag 'v6.16-p1' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (382 commits)
x86/fpu: Fix irq_fpu_usable() to return false during CPU onlining
crypto: qat - add missing header inclusion
crypto: api - Redo lookup on EEXIST
Revert "crypto: testmgr - Add hash export format testing"
crypto: marvell/cesa - Do not chain submitted requests
crypto: powerpc/poly1305 - add depends on BROKEN for now
Revert "crypto: powerpc/poly1305 - Add SIMD fallback"
crypto: ccp - Add missing tee info reg for teev2
crypto: ccp - Add missing bootloader info reg for pspv5
crypto: sun8i-ce - move fallback ahash_request to the end of the struct
crypto: octeontx2 - Use dynamic allocated memory region for lmtst
crypto: octeontx2 - Initialize cptlfs device info once
crypto: xts - Only add ecb if it is not already there
crypto: lrw - Only add ecb if it is not already there
crypto: testmgr - Add hash export format testing
crypto: testmgr - Use ahash for generic tfm
crypto: hmac - Add ahash support
crypto: testmgr - Ignore EEXIST on shash allocation
crypto: algapi - Add driver template support to crypto_inst_setname
crypto: shash - Set reqsize in shash_alg
...
Cleanups for the kernel's CRC (cyclic redundancy check) code:
- Use __ro_after_init where appropriate
- Remove unnecessary static_key on s390
- Rename some source code files
- Rename the crc32 and crc32c crypto API modules
- Use subsys_initcall instead of arch_initcall
- Restore maintainers for crc_kunit.c
- Fold crc16_byte() into crc16.c
- Add some SPDX license identifiers
-----BEGIN PGP SIGNATURE-----
iIoEABYIADIWIQSacvsUNc7UX4ntmEPzXCl4vpKOKwUCaDNd3xQcZWJpZ2dlcnNA
Z29vZ2xlLmNvbQAKCRDzXCl4vpKOKz0tAQCDqDA4Jd/54nnKpChMlKH8MTQDuwfz
8GHZi50mn4Rw5gD/f+hOGItPfswBId/+MZy+rKWL7bE2e9DdJdtoqRRtwA4=
=RWFl
-----END PGP SIGNATURE-----
Merge tag 'crc-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux
Pull CRC updates from Eric Biggers:
"Cleanups for the kernel's CRC (cyclic redundancy check) code:
- Use __ro_after_init where appropriate
- Remove unnecessary static_key on s390
- Rename some source code files
- Rename the crc32 and crc32c crypto API modules
- Use subsys_initcall instead of arch_initcall
- Restore maintainers for crc_kunit.c
- Fold crc16_byte() into crc16.c
- Add some SPDX license identifiers"
* tag 'crc-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux:
lib/crc32: add SPDX license identifier
lib/crc16: unexport crc16_table and crc16_byte()
w1: ds2406: use crc16() instead of crc16_byte() loop
MAINTAINERS: add crc_kunit.c back to CRC LIBRARY
lib/crc: make arch-optimized code use subsys_initcall
crypto: crc32 - remove "generic" from file and module names
x86/crc: drop "glue" from filenames
sparc/crc: drop "glue" from filenames
s390/crc: drop "glue" from filenames
powerpc/crc: rename crc32-vpmsum_core.S to crc-vpmsum-template.S
powerpc/crc: drop "glue" from filenames
arm64/crc: drop "glue" from filenames
arm/crc: drop "glue" from filenames
s390/crc32: Remove no-op module init and exit functions
s390/crc32: Remove have_vxrs static key
lib/crc: make the CPU feature static keys __ro_after_init
When a module gets unloaded it checks whether any of its tags are still in
use and if so, we keep the memory containing module's allocation tags
alive until all tags are unused. However percpu counters referenced by
the tags are freed by free_module(). This will lead to UAF if the memory
allocated by a module is accessed after module was unloaded.
To fix this we allocate percpu counters for module allocation tags
dynamically and we keep it alive for tags which are still in use after
module unloading. This also removes the requirement of a larger
PERCPU_MODULE_RESERVE when memory allocation profiling is enabled because
percpu memory for counters does not need to be reserved anymore.
Link: https://lkml.kernel.org/r/20250517000739.5930-1-surenb@google.com
Fixes: 0db6f8d782 ("alloc_tag: load module tags into separate contiguous memory")
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Reported-by: David Wang <00107082@163.com>
Closes: https://lore.kernel.org/all/20250516131246.6244-1-00107082@163.com/
Tested-by: David Wang <00107082@163.com>
Cc: Christoph Lameter (Ampere) <cl@gentwo.org>
Cc: Dennis Zhou <dennis@kernel.org>
Cc: Kent Overstreet <kent.overstreet@linux.dev>
Cc: Pasha Tatashin <pasha.tatashin@soleen.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
If mem_profiling_support is false, for example by
sysctl.vm.mem_profiling=never, alloc_tag_init should skip module tags
allocation, codetag type registration and procfs init.
Link: https://lkml.kernel.org/r/20250513182602.121843-1-cachen@purestorage.com
Signed-off-by: Casey Chen <cachen@purestorage.com>
Reviewed-by: Yuanyuan Zhong <yzhong@purestorage.com>
Acked-by: Suren Baghdasaryan <surenb@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
crc32c_combine() and crc32c_shift() are no longer used (except by the
KUnit test that tests them), and their current implementation is very
slow. Remove them.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Link: https://patch.msgid.link/20250519175012.36581-8-ebiggers@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
kunit_deactivate_static_stub() accepts real_fn_addr instead of
replacement_addr. In the case, it always passes NULL to
kunit_deactivate_static_stub().
Fix it.
Link: https://lore.kernel.org/r/20250520082050.2254875-1-tzungbi@kernel.org
Fixes: e047c5eaa7 ("kunit: Expose 'static stub' API to redirect functions")
Signed-off-by: Tzung-Bi Shih <tzungbi@kernel.org>
Reviewed-by: David Gow <davidgow@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
%pC and %pCn print the same string, and commit 900cca2944 ("lib/vsprintf:
add %pC{,n,r} format specifiers for clocks") introducing them does not
clarify any intended difference. It can be assumed %pC is a default for
%pCn as some other specifiers do, but not all are consistent with this
policy. Moreover there is now no other suffix other than 'n', which makes a
default not really useful.
All users in the kernel were using %pC except for one which has been
converted. So now remove %pCn and all the unnecessary extra code and
documentation.
Acked-by: Stephen Boyd <sboyd@kernel.org>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Luca Ceresoli <luca.ceresoli@bootlin.com>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Tested-by: Petr Mladek <pmladek@suse.com>
Reviewed-by: Yanteng Si <si.yanteng@linux.dev>
Link: https://lore.kernel.org/r/20250311-vsprintf-pcn-v2-2-0af40fc7dee4@bootlin.com
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Not all drivers require send_package_data or send_component_table when
updating firmware. Instead of forcing drivers to implement a stub allow
these functions to go undefined.
Signed-off-by: Lee Trager <lee@trager.us>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Simon Horman <horms@kernel.org>
Acked-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://patch.msgid.link/20250512190109.2475614-2-lee@trager.us
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
lib/crc32.c and include/linux/crc32.h got missed by the bulk SPDX
conversion because of the nonstandard explanation of the license.
However, crc32.c clearly states that it's licensed under the GNU General
Public License, Version 2. And the comment in crc32.h clearly indicates
that it's meant to have the same license as crc32.c. Therefore, apply
SPDX-License-Identifier: GPL-2.0-only to both files.
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Link: https://lore.kernel.org/r/20250514052409.194822-1-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@google.com>
Fix the documentation for __xa_cmpxchg to actually describe the
cmpxch-like semantics correctly, based on the version for xa_cmpxchg.
Link: https://lkml.kernel.org/r/20250507051656.3900864-1-hch@lst.de
Signed-off-by: Christoph Hellwig <hch@lst.de>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Currently the full set of crypto self-tests requires
CONFIG_CRYPTO_MANAGER_EXTRA_TESTS=y. This is problematic in two ways.
First, developers regularly overlook this option. Second, the
description of the tests as "extra" sometimes gives the impression that
it is not required that all algorithms pass these tests.
Given that the main use case for the crypto self-tests is for
developers, make enabling CONFIG_CRYPTO_SELFTESTS=y just enable the full
set of crypto self-tests by default.
The slow tests can still be disabled by adding the command-line
parameter cryptomgr.noextratests=1, soon to be renamed to
cryptomgr.noslowtests=1. The only known use case for doing this is for
people trying to use the crypto self-tests to satisfy the FIPS 140-3
pre-operational self-testing requirements when the kernel is being
validated as a FIPS 140-3 cryptographic module.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The negative-sense of CRYPTO_MANAGER_DISABLE_TESTS is a longstanding
mistake that regularly causes confusion. Especially bad is that you can
have CRYPTO=n && CRYPTO_MANAGER_DISABLE_TESTS=n, which is ambiguous.
Replace CRYPTO_MANAGER_DISABLE_TESTS with CRYPTO_SELFTESTS which has the
expected behavior.
The tests continue to be disabled by default.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Add explicit array bounds to the function prototypes for the parameters
that didn't already get handled by the conversion to use chacha_state:
- chacha_block_*():
Change 'u8 *out' or 'u8 *stream' to u8 out[CHACHA_BLOCK_SIZE].
- hchacha_block_*():
Change 'u32 *out' or 'u32 *stream' to u32 out[HCHACHA_OUT_WORDS].
- chacha_init():
Change 'const u32 *key' to 'const u32 key[CHACHA_KEY_WORDS]'.
Change 'const u8 *iv' to 'const u8 iv[CHACHA_IV_SIZE]'.
No functional changes. This just makes it clear when fixed-size arrays
are expected.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Now that the ChaCha state matrix is strongly-typed, add a helper
function chacha_zeroize_state() which zeroizes it. Then convert all
applicable callers to use it instead of direct memzero_explicit. No
functional changes.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Use struct assignment instead of memcpy() in lib/crypto/chacha.c where
appropriate. No functional change.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The ChaCha state matrix is 16 32-bit words. Currently it is represented
in the code as a raw u32 array, or even just a pointer to u32. This
weak typing is error-prone. Instead, introduce struct chacha_state:
struct chacha_state {
u32 x[16];
};
Convert all ChaCha and HChaCha functions to use struct chacha_state.
No functional changes.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Acked-by: Kent Overstreet <kent.overstreet@linux.dev>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
sprint_OID() was added as part of 2012's commit 4f73175d03 ("X.509: Add
utility functions to render OIDs as strings") but it hasn't been used.
Remove it.
Note that there's also 'sprint_oid' (lower case) which is used in a lot of
places; that's left as is except for fixing its case in a comment.
Link: https://lkml.kernel.org/r/20250501010502.326472-1-linux@treblig.org
Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Right now test_kmod has hardcoded dependencies on btrfs/xfs. That is not
optimal since you end up needing to select/build them, but it is not
really required since other fs could be selected for the testing. Also,
we can't change the default/driver module used for testing on
initialization.
Thus make it more generic: introduce two module parameters (start_driver
and start_test_fs), which allow to select which modules/fs to use for the
testing on test_kmod initialization. Then it's up to the user to select
which modules/fs to use for testing based on his config. However, keep
test_module as required default.
This way, config/modules becomes selectable as when the testing is done
from selftests (userspace).
While at it, also change trigger_config_run_type, since at module
initialization we already set the defaults at __kmod_config_init and
should not need to do it again in test_kmod_init(), thus we can avoid to
again set test_driver/test_fs.
Link: https://lkml.kernel.org/r/20250418165047.702487-1-herton@redhat.com
Signed-off-by: Herton R. Krzesinski <herton@redhat.com>
Reviewed-by: Luis Chambelrain <mcgrof@kernel.org>
Cc: Daniel Gomez <da.gomez@samsung.com>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Petr Pavlu <petr.pavlu@suse.com>
Cc: Sami Tolvanen <samitolvanen@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
sg_next() is a short function called frequently in I/O paths. Define it
in the header file so it can be inlined into its callers.
Link: https://lkml.kernel.org/r/20250416160615.3571958-1-csander@purestorage.com
Signed-off-by: Caleb Sander Mateos <csander@purestorage.com>
Cc: Eric Biggers <ebiggers@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Current errseq implementation depends on a very special precondition that
macro MAX_ERRNO must be (2^n - 1).
Eliminate the limitation by
- redefining macro ERRSEQ_SHIFT
- defining a new macro ERRNO_MASK instead of MAX_ERRNO for errno mask.
There is no plan to change the value of MAX_ERRNO, but this makes the
implementation more generic and eliminates the BUILD_BUG_ON().
Link: https://lkml.kernel.org/r/20250407-improve_errseq-v1-1-7b27cbeb8298@quicinc.com
Signed-off-by: Zijun Hu <quic_zijuhu@quicinc.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
In some places in the kernel there is a design pattern for sysfs
attributes to use kstrtobool() in store() and str_enabled_disabled() in
show().
This is counterintuitive to interact with because kstrtobool() takes
on/off but str_enabled_disabled() shows enabled/disabled. Some of those
sysfs uses could switch to str_on_off() but for some attributes
enabled/disabled really makes more sense.
Add support for kstrtobool() to accept enabled/disabled.
Link: https://lkml.kernel.org/r/20250321022538.1532445-1-superm1@kernel.org
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Replace `sr` with `Sr`. The condition `!tmp1 || rb_is_black(tmp1)`
ensures that `tmp1` (which is `sibling->rb_right`) is either NULL or a
black node. Therefore, the right child of the sibling must be black, and
the example should use `Sr` instead of `sr`.
Link: https://lkml.kernel.org/r/20250403112614.570140-1-johnny1001s000602@gmail.com
Signed-off-by: Chisheng Chen <johnny1001s000602@gmail.com>
Cc: Hsin Chang Yu <zxcvb600870024@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Remove the dependency on module loading ("m") for the vmalloc test suite,
enabling it to be built directly into the kernel, so both ("=m") and
("=y") are supported.
Motivation:
- Faster debugging/testing of vmalloc code;
- It allows to configure the test via kernel-boot parameters.
Configuration example:
test_vmalloc.nr_threads=64
test_vmalloc.run_test_mask=7
test_vmalloc.sequential_test_order=1
Link: https://lkml.kernel.org/r/20250417161216.88318-2-urezki@gmail.com
Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
Reviewed-by: Baoquan He <bhe@redhat.com>
Reviewed-by: Adrian Huang <ahuang12@lenovo.com>
Tested-by: Adrian Huang <ahuang12@lenovo.com>
Cc: Christop Hellwig <hch@infradead.org>
Cc: Mateusz Guzik <mjguzik@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
The test has the initialization step during which threads are created. To
prevent the workers from starting prematurely a write lock was previously
used by the main setup thread, while each worker would block on a read
lock.
Replace this RWSEM based synchronization with a simpler SRCU based
approach. Which does two basic steps:
- Main thread wraps the setup phase in an SRCU read-side critical
section. Pair of srcu_read_lock()/srcu_read_unlock().
- Each worker calls synchronize_srcu() on entry, ensuring it waits for
the initialization phase to be completed.
This patch eliminates the need for down_read()/up_read() and
down_write()/up_write() pairs thus simplifying the logic and improving
clarity.
[urezki@gmail.com: fix compile error with CONFIG_TINY_RCU]
Link: https://lkml.kernel.org/r/20250420142029.103169-1-urezki@gmail.com
Link: https://lkml.kernel.org/r/20250417161216.88318-1-urezki@gmail.com
Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
Reviewed-by: Adrian Huang <ahuang12@lenovo.com>
Tested-by: Adrian Huang <ahuang12@lenovo.com>
Cc: Baoquan He <bhe@redhat.com>
Cc: Christop Hellwig <hch@infradead.org>
Cc: Mateusz Guzik <mjguzik@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Move the unlikely case that mas->store_type is invalid to be the last
evaluated case and put liklier cases higher up.
Link: https://lkml.kernel.org/r/20250410191446.2474640-7-sidhartha.kumar@oracle.com
Signed-off-by: Sidhartha Kumar <sidhartha.kumar@oracle.com>
Suggested-by: Liam R. Howlett <liam.howlett@oracle.com>
Reviewed-by: Liam R. Howlett <Liam.Howlett@Oracle.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Wei Yang <richard.weiyang@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
In order to support rebalancing and spanning stores using less than the
worst case number of nodes, we need to track more than just the vacant
height. Using only vacant height to reduce the worst case maple node
allocation count can lead to a shortcoming of nodes in the following
scenarios.
For rebalancing writes, when a leaf node becomes insufficient, it may be
combined with a sibling into a single node. This means that the parent
node which has entries for this children will lose one entry. If this
parent node was just meeting the minimum entries, losing one entry will
now cause this parent node to be insufficient. This leads to a cascading
operation of rebalancing at different levels and can lead to more node
allocations than simply using vacant height can return.
For spanning writes, a similar situation occurs. At the location at which
a spanning write is detected, the number of ancestor nodes may similarly
need to rebalanced into a smaller number of nodes and the same cascading
situation could occur.
To use less than the full height of the tree for the number of
allocations, we also need to track the height at which a non-leaf node
cannot become insufficient. This means even if a rebalance occurs to a
child of this node, it currently has enough entries that it can lose one
without any further action. This field is stored in the maple write state
as sufficient height. In mas_prealloc_calc() when figuring out how many
nodes to allocate, we check if the vacant node is lower in the tree than a
sufficient node (has a larger value). If it is, we cannot use the vacant
height and must use the difference in the height and sufficient height as
the basis for the number of nodes needed.
An off by one bug was also discovered in mast_overflow() where it is using
>= rather than >. This caused extra iterations of the
mas_spanning_rebalance() loop and lead to unneeded allocations. A test is
also added to check the number of allocations is correct.
Link: https://lkml.kernel.org/r/20250410191446.2474640-6-sidhartha.kumar@oracle.com
Signed-off-by: Sidhartha Kumar <sidhartha.kumar@oracle.com>
Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Wei Yang <richard.weiyang@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
This allows support for using the vacant height to calculate the worst
case number of nodes needed for wr_rebalance operation.
mas_spanning_rebalance() was seen to perform unnecessary node allocations.
We can reduce allocations by breaking early during the rebalancing loop
once we realize that we have ascended to a common ancestor.
Link: https://lkml.kernel.org/r/20250410191446.2474640-5-sidhartha.kumar@oracle.com
Signed-off-by: Sidhartha Kumar <sidhartha.kumar@oracle.com>
Suggested-by: Liam Howlett <liam.howlett@oracle.com>
Reviewed-by: Wei Yang <richard.weiyang@gmail.com>
Reviewed-by: Liam R. Howlett <Liam.Howlett@Oracle.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
In order to determine the store type for a maple tree operation, a walk of
the tree is done through mas_wr_walk(). This function descends the tree
until a spanning write is detected or we reach a leaf node. While
descending, keep track of the height at which we encounter a node with
available space. This is done by checking if mas->end is less than the
number of slots a given node type can fit.
Now that the height of the vacant node is tracked, we can use the
difference between the height of the tree and the height of the vacant
node to know how many levels we will have to propagate creating new nodes.
Update mas_prealloc_calc() to consider the vacant height and reduce the
number of worst-case allocations.
Rebalancing and spanning stores are not supported and fall back to using
the full height of the tree for allocations.
Update preallocation testing assertions to take into account vacant
height.
Link: https://lkml.kernel.org/r/20250410191446.2474640-4-sidhartha.kumar@oracle.com
Signed-off-by: Sidhartha Kumar <sidhartha.kumar@oracle.com>
Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Wei Yang <richard.weiyang@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
For the maple tree, the root node is defined to have a depth of 0 with a
height of 1. Each level down from the node, these values are incremented
by 1. Various code paths define a root with depth 1 which is inconsisent
with the definition. Modify the code to be consistent with this
definition.
In mas_spanning_rebalance(), l_mas.depth was being used to track the
height based on the number of iterations done in the main loop. This
information was then used in mas_put_in_tree() to set the height. Rather
than overload the l_mas.depth field to track height, simply keep track of
height in the local variable new_height and directly pass this to
mas_wmb_replace() which will be passed into mas_put_in_tree(). This
allows up to remove writes to l_mas.depth.
Link: https://lkml.kernel.org/r/20250410191446.2474640-3-sidhartha.kumar@oracle.com
Signed-off-by: Sidhartha Kumar <sidhartha.kumar@oracle.com>
Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Wei Yang <richard.weiyang@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Patch series "Track node vacancy to reduce worst case allocation counts", v5.
================ overview ========================
Currently, the maple tree preallocates the worst case number of nodes for
given store type by taking into account the whole height of the tree.
This comes from a worst case scenario of every node in the tree being full
and having to propagate node allocation upwards until we reach the root of
the tree. This can be optimized if there are vacancies in nodes that are
at a lower depth than the root node. This series implements tracking the
level at which there is a vacant node so we only need to allocate until
this level is reached, rather than always using the full height of the
tree. The ma_wr_state struct is modified to add a field which keeps track
of the vacant height and is updated during walks of the tree. This value
is then read in mas_prealloc_calc() when we decide how many nodes to
allocate.
For rebalancing and spanning stores, we also need to track the lowest
height at which a node has 1 more entry than the minimum sufficient number
of entries. This is because rebalancing can cause a parent node to become
insufficient which results in further node allocations. In this case, we
need to use the sufficient height as the worst case rather than the vacant
height.
patch 1-2: preparatory patches
patch 3: implement vacant height tracking + update the tests
patch 4: support vacant height tracking for rebalancing writes
patch 5: implement sufficient height tracking
patch 6: reorder switch case statements
================ results =========================
Bpftrace was used to profile the allocation path for requesting new maple
nodes while running stress-ng mmap 120s. The histograms below represent
requests to kmem_cache_alloc_bulk() and show the count argument. This
represnts how many maple nodes the caller is requesting in
kmem_cache_alloc_bulk()
command: stress-ng --mmap 4 --timeout 120
mm-unstable
@bulk_alloc_req:
[3, 4) 4 | |
[4, 5) 54170 |@ |
[5, 6) 0 | |
[6, 7) 893057 |@@@@@@@@@@@@@@@@@@@@ |
[7, 8) 4 | |
[8, 9) 2230287 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@|
[9, 10) 55811 |@ |
[10, 11) 77834 |@ |
[11, 12) 0 | |
[12, 13) 1368684 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ |
[13, 14) 0 | |
[14, 15) 0 | |
[15, 16) 367197 |@@@@@@@@ |
@maple_node_total: 46,630,160
@total_vmas: 46184591
mm-unstable + this series
@bulk_alloc_req:
[2, 3) 198 | |
[3, 4) 4 | |
[4, 5) 43 | |
[5, 6) 0 | |
[6, 7) 1069503 |@@@@@@@@@@@@@@@@@@@@@ |
[7, 8) 4 | |
[8, 9) 2597268 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@|
[9, 10) 472191 |@@@@@@@@@ |
[10, 11) 191904 |@@@ |
[11, 12) 0 | |
[12, 13) 247316 |@@@@ |
[13, 14) 0 | |
[14, 15) 0 | |
[15, 16) 98769 |@ |
@maple_node_total: 37,813,856
@total_vmas: 43493287
This represents a ~19% reduction in the number of bulk maple nodes allocated.
For more reproducible results, a historgram of the return value of
mas_prealloc_calc() is displayed while running the maple_tree_tests whcih
have a deterministic store pattern
mas_prealloc_calc() return value mm-unstable
1 : (12068)
3 : (11836)
5 : ***** (271192)
7 : ************************************************** (2329329)
9 : *********** (534186)
10 : (435)
11 : *************** (704306)
13 : ******** (409781)
mas_prealloc_calc() return value mm-unstable + this series
1 : (12070)
3 : ************************************************** (3548777)
5 : ******** (633458)
7 : (65081)
9 : (11224)
10 : (341)
11 : (2973)
13 : (68)
do_mmap latency was also measured for regressions:
command: stress-ng --mmap 4 --timeout 120
mm-unstable:
avg = 7162 nsecs, total: 16101821292 nsecs, count: 2248034
mm-unstable + this series:
avg = 6689 nsecs, total: 15135391764 nsecs, count: 2262726
stress-ng --mmap4 --timeout 120
with vacant_height:
stress-ng: info: [257] 21526312 Maple Tree Read 0.176 M/sec
stress-ng: info: [257] 339979348 Maple Tree Write 2.774 M/sec
without vacant_height:
stress-ng: info: [8228] 20968900 Maple Tree Read 0.171 M/sec
stress-ng: info: [8228] 312214648 Maple Tree Write 2.547 M/sec
This represents an increase of ~3% read throughput and ~9% increase in
write throughput.
This patch (of 6):
In a subsequent patch, mas_prealloc_calc() will need to access fields only
in the ma_wr_state. Convert the function to take in a ma_wr_state and
modify all callers. There is no functional change.
Link: https://lkml.kernel.org/r/20250410191446.2474640-1-sidhartha.kumar@oracle.com
Link: https://lkml.kernel.org/r/20250410191446.2474640-2-sidhartha.kumar@oracle.com
Signed-off-by: Sidhartha Kumar <sidhartha.kumar@oracle.com>
Reviewed-by: Liam R. Howlett <Liam.Howlett@Oracle.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Wei Yang <richard.weiyang@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Change xa_alloc_cyclic() to return 0 even on wrap-around. Do the same for
xa_alloc_cyclic_irq() and xa_alloc_cyclic_bh().
This will prevent any future bug of treating return of 1 as an error:
int ret = xa_alloc_cyclic(...)
if (ret) // currently mishandles ret==1
goto failure;
If there will be someone interested in when wrap-around occurs, there is
still __xa_alloc_cyclic() that behaves as before. For now there is no
such user.
Link: https://lkml.kernel.org/r/20250320102219.8101-1-przemyslaw.kitszel@intel.com
Signed-off-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Suggested-by: Matthew Wilcox <willy@infradead.org>
Link: https://lore.kernel.org/netdev/Z9gUd-5t8b5NX2wE@casper.infradead.org
Cc: Andriy Shevchenko <andriy.shevchenko@intel.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Cc: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
ITER_XARRAY is exclusively used with xarrays that contain folios, not
pages, so extract folio pointers from it, not page pointers. Removes a
use of find_subpage().
Link: https://lkml.kernel.org/r/20250402210612.2444135-5-willy@infradead.org
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
ITER_XARRAY is exclusively used with xarrays that contain folios, not
pages, so extract folio pointers from it, not page pointers. Removes a
hidden call to compound_head() and a use of find_subpage().
Link: https://lkml.kernel.org/r/20250402210612.2444135-4-willy@infradead.org
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Now that there is the "burst <= 0" fastpath, for all later code, burst
must be strictly greater than zero. Therefore, drop the redundant checks
of this local variable.
Link: https://lore.kernel.org/all/fbe93a52-365e-47fe-93a4-44a44547d601@paulmck-laptop/
Link: https://lore.kernel.org/all/20250423115409.3425-1-spasswolf@web.de/
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Kuniyuki Iwashima <kuniyu@amazon.com>
Cc: Mateusz Guzik <mjguzik@gmail.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: John Ogness <john.ogness@linutronix.de>
Cc: Sergey Senozhatsky <senozhatsky@chromium.org>
Now that unlock_ret releases the lock, then falls into nolock_ret, which
handles ->missed based on the value of ret, the common-case lock-held
code can be collapsed into a single "if" statement with a single-statement
"then" clause.
Yes, we could go further and just assign the "if" condition to ret,
but in the immortal words of MSDOS, "Are you sure?".
Link: https://lore.kernel.org/all/fbe93a52-365e-47fe-93a4-44a44547d601@paulmck-laptop/
Link: https://lore.kernel.org/all/20250423115409.3425-1-spasswolf@web.de/
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Kuniyuki Iwashima <kuniyu@amazon.com>
Cc: Mateusz Guzik <mjguzik@gmail.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: John Ogness <john.ogness@linutronix.de>
Cc: Sergey Senozhatsky <senozhatsky@chromium.org>
Now that we have a nolock_ret label that handles ->missed correctly
based on the value of ret, we can eliminate a local variable and collapse
several "if" statements on the lock-acquisition-failure code path.
Link: https://lore.kernel.org/all/fbe93a52-365e-47fe-93a4-44a44547d601@paulmck-laptop/
Link: https://lore.kernel.org/all/20250423115409.3425-1-spasswolf@web.de/
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Kuniyuki Iwashima <kuniyu@amazon.com>
Cc: Mateusz Guzik <mjguzik@gmail.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: John Ogness <john.ogness@linutronix.de>
Cc: Sergey Senozhatsky <senozhatsky@chromium.org>
Create a nolock_ret label in order to start consolidating the unlocked
return paths that conditionally invoke ratelimit_state_inc_miss().
Link: https://lore.kernel.org/all/fbe93a52-365e-47fe-93a4-44a44547d601@paulmck-laptop/
Link: https://lore.kernel.org/all/20250423115409.3425-1-spasswolf@web.de/
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Kuniyuki Iwashima <kuniyu@amazon.com>
Cc: Mateusz Guzik <mjguzik@gmail.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: John Ogness <john.ogness@linutronix.de>
Cc: Sergey Senozhatsky <senozhatsky@chromium.org>
By making "ret" always be initialized, and moving the final call to
ratelimit_state_inc_miss() out from under the lock, we save a goto and
a couple lines of code. This also saves a couple of lines of code from
the unconditional enable/disable slowpath.
Link: https://lore.kernel.org/all/fbe93a52-365e-47fe-93a4-44a44547d601@paulmck-laptop/
Link: https://lore.kernel.org/all/20250423115409.3425-1-spasswolf@web.de/
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Kuniyuki Iwashima <kuniyu@amazon.com>
Cc: Mateusz Guzik <mjguzik@gmail.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: John Ogness <john.ogness@linutronix.de>
Cc: Sergey Senozhatsky <senozhatsky@chromium.org>
Currently, ___ratelimit() treats a negative ->interval or ->burst as
if it was zero, but this is an accident of the current implementation.
Therefore, splat in this case, which might have the benefit of detecting
use of uninitialized ratelimit_state structures on the one hand or easing
addition of new features on the other.
Link: https://lore.kernel.org/all/fbe93a52-365e-47fe-93a4-44a44547d601@paulmck-laptop/
Link: https://lore.kernel.org/all/20250423115409.3425-1-spasswolf@web.de/
Signed-off-by: Petr Mladek <pmladek@suse.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Kuniyuki Iwashima <kuniyu@amazon.com>
Cc: Mateusz Guzik <mjguzik@gmail.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: John Ogness <john.ogness@linutronix.de>
Cc: Sergey Senozhatsky <senozhatsky@chromium.org>
Currently, if the lock is acquired, the code unconditionally does
an atomic decrement on ->rs_n_left, even if that atomic operation is
guaranteed to return a limit-rate verdict. A limit-rate verdict will
in fact be the common case when something is spewing into a rate limit.
This unconditional atomic operation incurs needless overhead and also
raises the spectre of counter wrap.
Therefore, do the atomic decrement only if there is some chance that
rates won't be limited.
Link: https://lore.kernel.org/all/fbe93a52-365e-47fe-93a4-44a44547d601@paulmck-laptop/
Link: https://lore.kernel.org/all/20250423115409.3425-1-spasswolf@web.de/
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Kuniyuki Iwashima <kuniyu@amazon.com>
Cc: Mateusz Guzik <mjguzik@gmail.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: John Ogness <john.ogness@linutronix.de>
Cc: Sergey Senozhatsky <senozhatsky@chromium.org>
Currently, if the lock could not be acquired, the code unconditionally
does an atomic decrement on ->rs_n_left, even if that atomic operation
is guaranteed to return a limit-rate verdict. This incurs needless
overhead and also raises the spectre of counter wrap.
Therefore, do the atomic decrement only if there is some chance that
rates won't be limited.
Link: https://lore.kernel.org/all/fbe93a52-365e-47fe-93a4-44a44547d601@paulmck-laptop/
Link: https://lore.kernel.org/all/20250423115409.3425-1-spasswolf@web.de/
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Kuniyuki Iwashima <kuniyu@amazon.com>
Cc: Mateusz Guzik <mjguzik@gmail.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: John Ogness <john.ogness@linutronix.de>
Cc: Sergey Senozhatsky <senozhatsky@chromium.org>
Restore the previous semantics where the misses counter is unchanged if
the RATELIMIT_MSG_ON_RELEASE flag is set.
Link: https://lore.kernel.org/all/fbe93a52-365e-47fe-93a4-44a44547d601@paulmck-laptop/
Link: https://lore.kernel.org/all/20250423115409.3425-1-spasswolf@web.de/
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Kuniyuki Iwashima <kuniyu@amazon.com>
Cc: Mateusz Guzik <mjguzik@gmail.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: John Ogness <john.ogness@linutronix.de>
Cc: Sergey Senozhatsky <senozhatsky@chromium.org>
Currently, if rate limiting is disabled, ___ratelimit() does an immediate
early return with no state changes. This can result in false-positive
drops when re-enabling rate limiting. Therefore, mark the ratelimit_state
structure "uninitialized" when rate limiting is disabled.
[ paulmck: Apply Petr Mladek feedback. ]
Link: https://lore.kernel.org/all/fbe93a52-365e-47fe-93a4-44a44547d601@paulmck-laptop/
Link: https://lore.kernel.org/all/20250423115409.3425-1-spasswolf@web.de/
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Kuniyuki Iwashima <kuniyu@amazon.com>
Cc: Mateusz Guzik <mjguzik@gmail.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: John Ogness <john.ogness@linutronix.de>
Cc: Sergey Senozhatsky <senozhatsky@chromium.org>
If ->interval is zero, then rate-limiting will be disabled.
Alternatively, if interval is greater than zero and ->burst is zero,
then rate-limiting will be applied unconditionally. The point of this
distinction is to handle current users that pass zero-initialized
ratelimit_state structures to ___ratelimit(), and in such cases the
->lock field will be uninitialized. Acquiring ->lock in this case is
clearly not a strategy to win.
Therefore, make this classification be lockless.
Note that although negative ->interval and ->burst happen to be treated
as if they were zero, this is an accident of the current implementation.
The semantics of negative values for these fields is subject to change
without notice. Especially given that Bert Karwatzki determined that
no current calls to ___ratelimit() ever have negative values for these
fields.
This commit replaces an earlier buggy versions.
Link: https://lore.kernel.org/all/fbe93a52-365e-47fe-93a4-44a44547d601@paulmck-laptop/
Link: https://lore.kernel.org/all/20250423115409.3425-1-spasswolf@web.de/
Reported-by: Bert Karwatzki <spasswolf@web.de>
Reported-by: "Aithal, Srikanth" <sraithal@amd.com>
Closes: https://lore.kernel.org/all/20250423115409.3425-1-spasswolf@web.de/
Reported-by: Mark Brown <broonie@kernel.org>
Closes: https://lore.kernel.org/all/257c3b91-e30f-48be-9788-d27a4445a416@sirena.org.uk/
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Tested-by: "Aithal, Srikanth" <sraithal@amd.com>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Kuniyuki Iwashima <kuniyu@amazon.com>
Cc: Mateusz Guzik <mjguzik@gmail.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: John Ogness <john.ogness@linutronix.de>
Cc: Sergey Senozhatsky <senozhatsky@chromium.org>
The ___ratelimit() function special-cases the jiffies-counter value of zero
as "uninitialized". This works well on 64-bit systems, where the jiffies
counter is not going to return to zero for more than half a billion years
on systems with HZ=1000, but similar 32-bit systems take less than 50 days
to wrap the jiffies counter. And although the consequences of wrapping the
jiffies counter seem to be limited to minor confusion on the duration of
the rate-limiting interval that happens to end at time zero, it is almost
no work to avoid this confusion.
Therefore, introduce a RATELIMIT_INITIALIZED bit to the ratelimit_state
structure's ->flags field so that a ->begin value of zero is no longer
special.
Link: https://lore.kernel.org/all/fbe93a52-365e-47fe-93a4-44a44547d601@paulmck-laptop/
Link: https://lore.kernel.org/all/20250423115409.3425-1-spasswolf@web.de/
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Kuniyuki Iwashima <kuniyu@amazon.com>
Cc: Mateusz Guzik <mjguzik@gmail.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: John Ogness <john.ogness@linutronix.de>
Cc: Sergey Senozhatsky <senozhatsky@chromium.org>
The ___ratelimit() function simply returns zero ("do ratelimiting")
if the trylock fails, but does not adjust the ->missed field. This
means that the resulting dropped printk()s are dropped silently, which
could seriously confuse people trying to do console-log-based debugging.
Therefore, increment the ->missed field upon trylock failure.
Link: https://lore.kernel.org/all/fbe93a52-365e-47fe-93a4-44a44547d601@paulmck-laptop/
Link: https://lore.kernel.org/all/20250423115409.3425-1-spasswolf@web.de/
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Kuniyuki Iwashima <kuniyu@amazon.com>
Cc: Mateusz Guzik <mjguzik@gmail.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: John Ogness <john.ogness@linutronix.de>
Cc: Sergey Senozhatsky <senozhatsky@chromium.org>
The ratelimit_state structure's ->missed field is sometimes incremented
locklessly, and it would be good to avoid lost counts. This is also
needed to count the number of misses due to trylock failure. Therefore,
convert the ratelimit_state structure's ->missed field to atomic_t.
Link: https://lore.kernel.org/all/fbe93a52-365e-47fe-93a4-44a44547d601@paulmck-laptop/
Link: https://lore.kernel.org/all/20250423115409.3425-1-spasswolf@web.de/
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Kuniyuki Iwashima <kuniyu@amazon.com>
Cc: Mateusz Guzik <mjguzik@gmail.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: John Ogness <john.ogness@linutronix.de>
Cc: Sergey Senozhatsky <senozhatsky@chromium.org>
A number of ratelimit use cases do open-coded access to the
ratelimit_state structure's ->missed field. This works, but is a bit
messy and makes it more annoying to make changes to this field.
Therefore, provide a ratelimit_state_inc_miss() function that increments
the ->missed field, a ratelimit_state_get_miss() function that reads
out the ->missed field, and a ratelimit_state_reset_miss() function
that reads out that field, but that also resets its value to zero.
These functions will replace client-code open-coded uses of ->missed.
In addition, a new ratelimit_state_reset_interval() function encapsulates
what was previously open-coded lock acquisition and direct field updates.
[ paulmck: Apply kernel test robot feedback. ]
Link: https://lore.kernel.org/all/fbe93a52-365e-47fe-93a4-44a44547d601@paulmck-laptop/
Link: https://lore.kernel.org/all/20250423115409.3425-1-spasswolf@web.de/
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Kuniyuki Iwashima <kuniyu@amazon.com>
Cc: Mateusz Guzik <mjguzik@gmail.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: John Ogness <john.ogness@linutronix.de>
Cc: Sergey Senozhatsky <senozhatsky@chromium.org>
The recent fix in commit c2ea09b193d2 ("randstruct: gcc-plugin: Remove
bogus void member") has fixed another issue: it was not always detecting
composite structures made only of function pointers and structures of
function pointers. Add a test for this case, and break out the layout
tests since this issue is actually a problem for Clang as well[1].
Link: https://github.com/llvm/llvm-project/issues/138355 [1]
Link: https://lore.kernel.org/r/20250502224116.work.591-kees@kernel.org
Signed-off-by: Kees Cook <kees@kernel.org>
Variable Length Arrays (VLAs) on the stack must not be used in the kernel.
Function parameter VLAs[1] should be usable, but -Wvla will warn for
those. For example, this will produce a warning but it is not using a
stack VLA:
int something(size_t n, int array[n]) { ...
Clang has no way yet to distinguish between the VLA types[2], so
depend on GCC for now to keep stack VLAs out of the tree by using GCC's
-Wvla-larger-than=N option (though GCC may split -Wvla similarly[3] to
how Clang is planning to).
While GCC 8+ supports -Wvla-larger-than, only 9+ supports ...=0[4],
so use -Wvla-larger-than=1. Adjust mm/kasan/Makefile to remove it from
CFLAGS (GCC <9 appears unable to disable the warning correctly[5]).
The VLA usage in lib/test_ubsan.c was removed in commit 9d7ca61b13
("lib/test_ubsan.c: VLA no longer used in kernel") so the lib/Makefile
disabling of VLA checking can be entirely removed.
Link: https://en.cppreference.com/w/c/language/array [1]
Link: https://github.com/llvm/llvm-project/issues/57098 [2]
Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98217 [3]
Link: https://lore.kernel.org/lkml/7780883c-0ac8-4aaa-b850-469e33b50672@linux.ibm.com/ [4]
Link: https://lore.kernel.org/r/202505071331.4iOzqmuE-lkp@intel.com/ [5]
Reviewed-by: Nathan Chancellor <nathan@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Tested-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com>
Link: https://lore.kernel.org/r/20250418213235.work.532-kees@kernel.org
Signed-off-by: Kees Cook <kees@kernel.org>
Add a new Kconfig CONFIG_UBSAN_KVM_EL2 for KVM which enables
UBSAN for EL2 code (in protected/nvhe/hvhe) modes.
This will re-use the same checks enabled for the kernel for
the hypervisor. The only difference is that for EL2 it always
emits a "brk" instead of implementing hooks as the hypervisor
can't print reports.
The KVM code will re-use the same code for the kernel
"report_ubsan_failure()" so #ifdefs are changed to also have this
code for CONFIG_UBSAN_KVM_EL2
Signed-off-by: Mostafa Saleh <smostafa@google.com>
Reviewed-by: Kees Cook <kees@kernel.org>
Link: https://lore.kernel.org/r/20250430162713.1997569-4-smostafa@google.com
Signed-off-by: Marc Zyngier <maz@kernel.org>
report_ubsan_failure() doesn't use argument regs, and soon it will
be called from the hypervisor context were regs are not available.
So, remove the unused argument.
Signed-off-by: Mostafa Saleh <smostafa@google.com>
Acked-by: Kees Cook <kees@kernel.org>
Link: https://lore.kernel.org/r/20250430162713.1997569-3-smostafa@google.com
Signed-off-by: Marc Zyngier <maz@kernel.org>
Split the lib poly1305 code just as was done with sha256. Make
the main library code conditional on LIB_POLY1305 instead of
LIB_POLY1305_GENERIC.
Reported-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com>
Fixes: 10a6d72ea3 ("crypto: lib/poly1305 - Use block-only interface")
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-----BEGIN PGP SIGNATURE-----
iQFSBAABCgA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAmgX1CgeHHRvcnZhbGRz
QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGxiIH/A7LHlVatGEQgRFi
0JALDgcuGTMtMU1qD43rv8Z1GXqTpCAlaBt9D1C9cUH/86MGyBTVRWgVy0wkaU2U
8QSfFWQIbrdaIzelHtzmAv5IDtb+KrcX1iYGLcMb6ZYaWkv8/CMzMX1nkgxEr1QT
37Xo3/F17yJumAdNQxdRhVLGy2d3X5rScecpufwh97sMwoddllMCDs2LIoeSAYpG
376/wzni09G2fADa8MEKqcaMue4qcf0FOo/gOkT8YwFGSZLKa6uumlBLg04QoCt0
foK2vfcci1q4H4ZbCu3uQESYGLQHY0f2ICDCwC3m25VF9a81TmlbC3MLum3vhmKe
RtLDcXg=
=xyaI
-----END PGP SIGNATURE-----
BackMerge tag 'v6.15-rc5' into drm-next
Linux 6.15-rc5, requested by tzimmerman for fixes required in drm-next.
Signed-off-by: Dave Airlie <airlied@redhat.com>
Use the BLOCK_HASH_UPDATE_BLOCKS helper instead of duplicating
partial block handling.
Also remove the unused lib/sha256 force-generic interface.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Add an internal sha256_finup helper and move the finalisation code
from __sha256_final into it.
Also add sha256_choose_blocks and CRYPTO_ARCH_HAVE_LIB_SHA256_SIMD
so that the Crypto API can use the SIMD block function unconditionally.
The Crypto API must not be used in hard IRQs and there is no reason
to have a fallback path for hardirqs.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Follow best practices by changing the length parameters to size_t and
explicitly specifying the length of the output digest arrays.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Instead of providing crypto_shash algorithms for the arch-optimized
SHA-256 code, instead implement the SHA-256 library. This is much
simpler, it makes the SHA-256 library functions be arch-optimized, and
it fixes the longstanding issue where the arch-optimized SHA-256 was
disabled by default. SHA-256 still remains available through
crypto_shash, but individual architectures no longer need to handle it.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
As has been done for various other algorithms, rework the design of the
SHA-256 library to support arch-optimized implementations, and make
crypto/sha256.c expose both generic and arch-optimized shash algorithms
that wrap the library functions.
This allows users of the SHA-256 library functions to take advantage of
the arch-optimized code, and this makes it much simpler to integrate
SHA-256 for each architecture.
Note that sha256_base.h is not used in the new design. It will be
removed once all the architecture-specific code has been updated.
Move the generic block function into its own module to avoid a circular
dependency from libsha256.ko => sha256-$ARCH.ko => libsha256.ko.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Add export and import functions to maintain existing export format.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Now that every architecture provides a block function, use that
to implement the lib/poly1305 and remove the old per-arch code.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Add a block-only interface for poly1305. Implement the generic
code first.
Also use the generic partial block helper.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Merge mainline to pick up bcachefs poly1305 patch 4bf4b5046d
("bcachefs: use library APIs for ChaCha20 and Poly1305"). This
is a prerequisite for removing the poly1305 shash algorithm.
With the minimum gcc version raised to 8.1, all supported compilers
now understand the -fsanitize-coverage=trace-pc option, and there
is no longer a need for the separate compiler plugin.
Since only gcc-5 was able to use the plugin for several year now,
it was already likely unused.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
gcc-12 and higher support the -ftrivial-auto-var-init= flag, after
gcc-8 is the minimum version, this is half of the supported ones, and
the vast majority of the versions that users are actually likely to
have, so it seems like a good time to stop having the fallback
plugin implementation
Older toolchains are still able to build kernels normally without
this plugin, but won't be able to use variable initialization..
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
It is no longer necessary to check for CONFIG_AS_AVX512, since the minimum
assembler version is now from binutils-2.30 and this always supports it.
Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Commit a3e8fe814a ("x86/build: Raise the minimum GCC version to 8.1")
raised the minimum compiler version as enforced by Kbuild to gcc-8.1
and clang-15 for x86.
This is actually the same gcc version that has been discussed as the
minimum for all architectures several times in the past, with little
objection. A previous concern was the kernel for SLE15-SP7 needing to
be built with gcc-7. As this ended up still using linux-6.4 and there
is no plan for an SP8, this is no longer a problem.
Change it for all architectures and adjust the documentation accordingly.
A few version checks can be removed in the process. The binutils
version 2.30 is the lowest version used in combination with gcc-8 on
common distros, so use that as the corresponding minimum.
Link: https://lore.kernel.org/lkml/20240925150059.3955569-32-ardb+git@google.com/
Link: https://lore.kernel.org/lkml/871q7yxrgv.wl-tiwai@suse.de/
Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Add some additional tests in lib/tests/test_bits.c to cover the
expected results of the fixed type BIT_U*() macros.
Signed-off-by: Vincent Mailhol <mailhol.vincent@wanadoo.fr>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Yury Norov <yury.norov@gmail.com>
Add some additional tests in lib/tests/test_bits.c to cover the
expected/non-expected values of the fixed-type GENMASK_U*() macros.
Also check that the result value matches the expected type. Since
those are known at build time, use static_assert() instead of normal
kunit tests.
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Vincent Mailhol <mailhol.vincent@wanadoo.fr>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Yury Norov <yury.norov@gmail.com>
In preparation for making the kmalloc family of allocators type aware,
we need to make sure that the returned type from the allocation matches
the type of the variable being assigned. (Before, the allocator would
always return "void *", which can be implicitly cast to any pointer type.)
The assigned type is "struct kunit_suite **" but the returned type will
be "struct kunit_suite * const *". Since it isn't generally possible
to remove the const qualifier, adjust the allocation type to match
the assignment.
Link: https://lore.kernel.org/r/20250426062433.work.124-kees@kernel.org
Signed-off-by: Kees Cook <kees@kernel.org>
Reviewed-by: David Gow <davidgow@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
The generic FourCC format always prints the data using the big endian
order. It is generic because it allows to read the data using a custom
ordering.
The current code uses "n" for reading data in the reverse host ordering.
It makes the 4 variants [hnbl] consistent with the generic printing
of IPv4 addresses.
Unfortunately, it creates confusion on big endian systems. For example,
it shows the data &(u32)0x67503030 as
%p4cn 00Pg (0x30305067)
But people expect that the ordering stays the same. The network ordering
is a big-endian ordering.
The problem is that the semantic is not the same. The modifiers affect
the output ordering of IPv4 addresses while they affect the reading order
in case of FourCC code.
Avoid the confusion by replacing the "n" modifier with "hR", aka
reverse host ordering. It is inspired by the existing %p[mM]R printf
format.
Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
Closes: https://lore.kernel.org/r/CAMuHMdV9tX=TG7E_CrSF=2PY206tXf+_yYRuacG48EWEtJLo-Q@mail.gmail.com
Signed-off-by: Petr Mladek <pmladek@suse.com>
Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Aditya Garg <gargaditya08@live.com>
Link: https://lore.kernel.org/r/20250428123132.578771-1-pmladek@suse.com
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
usercopy of 64 bit values does not work on 32-bit SPARC:
# usercopy_test_valid: EXPECTATION FAILED at lib/tests/usercopy_kunit.c:209
Expected val_u64 == 0x5a5b5c5d6a6b6c6d, but
val_u64 == 1515936861 (0x5a5b5c5d)
0x5a5b5c5d6a6b6c6d == 6510899242581322861 (0x5a5b5c5d6a6b6c6d)
Disable the test.
Fixes: 4c5d7bc637 ("usercopy: Add tests for all get_user() sizes")
Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Link: https://lore.kernel.org/r/20250416-kunit-sparc-usercopy-v1-1-a772054db3af@linutronix.de
Signed-off-by: Kees Cook <kees@kernel.org>
Now that the architecture-optimized Poly1305 kconfig symbols are defined
regardless of CRYPTO, there is no need for CRYPTO_LIB_POLY1305 to select
CRYPTO. So, remove that. This makes the indirection through the
CRYPTO_LIB_POLY1305_INTERNAL symbol unnecessary, so get rid of that and
just use CRYPTO_LIB_POLY1305 directly. Finally, make the fallback to
the generic implementation use a default value instead of a select; this
makes it consistent with how the arch-optimized code gets enabled and
also with how CRYPTO_LIB_BLAKE2S_GENERIC gets enabled.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Now that the architecture-optimized ChaCha kconfig symbols are defined
regardless of CRYPTO, there is no need for CRYPTO_LIB_CHACHA to select
CRYPTO. So, remove that. This makes the indirection through the
CRYPTO_LIB_CHACHA_INTERNAL symbol unnecessary, so get rid of that and
just use CRYPTO_LIB_CHACHA directly. Finally, make the fallback to the
generic implementation use a default value instead of a select; this
makes it consistent with how the arch-optimized code gets enabled and
also with how CRYPTO_LIB_BLAKE2S_GENERIC gets enabled.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Continue disentangling the crypto library functions from the generic
crypto infrastructure by moving the x86 BLAKE2s, ChaCha, and Poly1305
library functions into a new directory arch/x86/lib/crypto/ that does
not depend on CRYPTO. This mirrors the distinction between crypto/ and
lib/crypto/.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Continue disentangling the crypto library functions from the generic
crypto infrastructure by moving the s390 ChaCha library functions into a
new directory arch/s390/lib/crypto/ that does not depend on CRYPTO.
This mirrors the distinction between crypto/ and lib/crypto/.
Acked-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Continue disentangling the crypto library functions from the generic
crypto infrastructure by moving the riscv ChaCha library functions into
a new directory arch/riscv/lib/crypto/ that does not depend on CRYPTO.
This mirrors the distinction between crypto/ and lib/crypto/.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Continue disentangling the crypto library functions from the generic
crypto infrastructure by moving the powerpc ChaCha and Poly1305 library
functions into a new directory arch/powerpc/lib/crypto/ that does not
depend on CRYPTO. This mirrors the distinction between crypto/ and
lib/crypto/.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Continue disentangling the crypto library functions from the generic
crypto infrastructure by moving the mips ChaCha and Poly1305 library
functions into a new directory arch/mips/lib/crypto/ that does not
depend on CRYPTO. This mirrors the distinction between crypto/ and
lib/crypto/.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Continue disentangling the crypto library functions from the generic
crypto infrastructure by moving the arm64 ChaCha and Poly1305 library
functions into a new directory arch/arm64/lib/crypto/ that does not
depend on CRYPTO. This mirrors the distinction between crypto/ and
lib/crypto/.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Continue disentangling the crypto library functions from the generic
crypto infrastructure by moving the arm BLAKE2s, ChaCha, and Poly1305
library functions into a new directory arch/arm/lib/crypto/ that does
not depend on CRYPTO. This mirrors the distinction between crypto/ and
lib/crypto/.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Now that all sm3_base users have been converted to use the API
partial block handling, remove the partial block helpers as well
as the lib/crypto functions.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Now that all sha256_base users have been converted to use the API
partial block handling, remove the partial block helpers.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
%p4cc is designed for DRM/V4L2 FourCCs with their specific quirks, but
it's useful to be able to print generic 4-character codes formatted as
an integer. Extend it to add format specifiers for printing generic
32-bit FourCCs with various endian semantics:
%p4ch Host byte order
%p4cn Network byte order
%p4cl Little-endian
%p4cb Big-endian
The endianness determines how bytes are interpreted as a u32, and the
FourCC is then always printed MSByte-first (this is the opposite of
V4L/DRM FourCCs). This covers most practical cases, e.g. %p4cn would
allow printing LSByte-first FourCCs stored in host endian order
(other than the hex form being in character order, not the integer
value).
Acked-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Tested-by: Petr Mladek <pmladek@suse.com>
Signed-off-by: Hector Martin <marcan@marcan.st>
Signed-off-by: Aditya Garg <gargaditya08@live.com>
Reviewed-by: Kees Cook <kees@kernel.org>
Link: https://lore.kernel.org/r/PN3PR01MB9597B01823415CB7FCD3BC27B8B52@PN3PR01MB9597.INDPRD01.PROD.OUTLOOK.COM
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
- lib/prime_numbers: KUnit test should not select PRIME_NUMBERS
(Geert Uytterhoeven)
- ubsan: Fix panic from test_ubsan_out_of_bounds (Mostafa Saleh)
- ubsan: Remove 'default UBSAN' from UBSAN_INTEGER_WRAP (Nathan Chancellor)
- string: Add load_unaligned_zeropad() code path to sized_strscpy()
(Peter Collingbourne)
- kasan: Add strscpy() test to trigger tag fault on arm64 (Vincenzo
Frascino)
- Disable GCC randstruct for COMPILE_TEST
-----BEGIN PGP SIGNATURE-----
iHUEABYKAB0WIQRSPkdeREjth1dHnSE2KwveOeQkuwUCaAKv9QAKCRA2KwveOeQk
u160AP90D0BTkrwYIt1oRMOlN0LX0oipfFDiOKrxuZpgfwqYgwD/XHlTCglva+Kl
1Y0T/wUpA4tL8XoKtcs/kBzsWNyI6wU=
=4ji5
-----END PGP SIGNATURE-----
Merge tag 'hardening-v6.15-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux
Pull hardening fixes from Kees Cook:
- lib/prime_numbers: KUnit test should not select PRIME_NUMBERS (Geert
Uytterhoeven)
- ubsan: Fix panic from test_ubsan_out_of_bounds (Mostafa Saleh)
- ubsan: Remove 'default UBSAN' from UBSAN_INTEGER_WRAP (Nathan
Chancellor)
- string: Add load_unaligned_zeropad() code path to sized_strscpy()
(Peter Collingbourne)
- kasan: Add strscpy() test to trigger tag fault on arm64 (Vincenzo
Frascino)
- Disable GCC randstruct for COMPILE_TEST
* tag 'hardening-v6.15-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
lib/prime_numbers: KUnit test should not select PRIME_NUMBERS
ubsan: Fix panic from test_ubsan_out_of_bounds
lib/Kconfig.ubsan: Remove 'default UBSAN' from UBSAN_INTEGER_WRAP
hardening: Disable GCC randstruct for COMPILE_TEST
kasan: Add strscpy() test to trigger tag fault on arm64
string: Add load_unaligned_zeropad() code path to sized_strscpy()
or aren't considered necessary for -stable kernels.
22 patches are for MM, 9 are otherwise.
-----BEGIN PGP SIGNATURE-----
iHUEABYKAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCaABuqgAKCRDdBJ7gKXxA
jkZ7AQCkxrzIxZs7uUcHZNIGpNhbhg0Dl07j6txgf7piCBSk4wD+LX6skmC2CXLF
QWDhw1+dKHY/Ha0TSQkXUlMTjAP1mA4=
=3vRc
-----END PGP SIGNATURE-----
Merge tag 'mm-hotfixes-stable-2025-04-16-19-59' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull misc hotfixes from Andrew Morton:
"31 hotfixes.
9 are cc:stable and the remainder address post-6.15 issues or aren't
considered necessary for -stable kernels.
22 patches are for MM, 9 are otherwise"
* tag 'mm-hotfixes-stable-2025-04-16-19-59' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (31 commits)
MAINTAINERS: update HUGETLB reviewers
mm: fix apply_to_existing_page_range()
selftests/mm: fix compiler -Wmaybe-uninitialized warning
alloc_tag: handle incomplete bulk allocations in vm_module_tags_populate
mailmap: add entry for Jean-Michel Hautbois
mm: (un)track_pfn_copy() fix + doc improvements
mm: fix filemap_get_folios_contig returning batches of identical folios
mm/hugetlb: add a line break at the end of the format string
selftests: mincore: fix tmpfs mincore test failure
mm/hugetlb: fix set_max_huge_pages() when there are surplus pages
mm/cma: report base address of single range correctly
mm: page_alloc: speed up fallbacks in rmqueue_bulk()
kunit: slub: add module description
mm/kasan: add module decription
ucs2_string: add module description
zlib: add module description
fpga: tests: add module descriptions
samples/livepatch: add module descriptions
ASN.1: add module description
mm/vma: add give_up_on_oom option on modify/merge, use in uffd release
...
These fields are no longer needed, so remove them.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Enabling a (modular) test should not silently enable additional kernel
functionality, as that may increase the attack vector of a product.
Fix this by making PRIME_NUMBERS_KUNIT_TEST depend on PRIME_NUMBERS
instead of selecting it.
After this, one can safely enable CONFIG_KUNIT_ALL_TESTS=m to build
modules for all appropriate tests for ones system, without pulling in
extra unwanted functionality, while still allowing a tester to manually
enable PRIME_NUMBERS and this test suite on a system where PRIME_NUMBERS
is not enabled by default. Resurrect CONFIG_PRIME_NUMBERS=m in
tools/testing/selftests/lib/config for the latter use case.
Fixes: 313b38a6ec ("lib/prime_numbers: convert self-test to KUnit")
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Acked-by: Tamir Duberstein <tamird@gmail.com>
Link: https://lore.kernel.org/r/40f8a40eef4930d3ac9febd205bc171eb04e171c.1744641237.git.geert@linux-m68k.org
Signed-off-by: Kees Cook <kees@kernel.org>
Running lib_ubsan.ko on arm64 (without CONFIG_UBSAN_TRAP) panics the
kernel:
[ 31.616546] Kernel panic - not syncing: stack-protector: Kernel stack is corrupted in: test_ubsan_out_of_bounds+0x158/0x158 [test_ubsan]
[ 31.646817] CPU: 3 UID: 0 PID: 179 Comm: insmod Not tainted 6.15.0-rc2 #1 PREEMPT
[ 31.648153] Hardware name: linux,dummy-virt (DT)
[ 31.648970] Call trace:
[ 31.649345] show_stack+0x18/0x24 (C)
[ 31.650960] dump_stack_lvl+0x40/0x84
[ 31.651559] dump_stack+0x18/0x24
[ 31.652264] panic+0x138/0x3b4
[ 31.652812] __ktime_get_real_seconds+0x0/0x10
[ 31.653540] test_ubsan_load_invalid_value+0x0/0xa8 [test_ubsan]
[ 31.654388] init_module+0x24/0xff4 [test_ubsan]
[ 31.655077] do_one_initcall+0xd4/0x280
[ 31.655680] do_init_module+0x58/0x2b4
That happens because the test corrupts other data in the stack:
400: d5384108 mrs x8, sp_el0
404: f9426d08 ldr x8, [x8, #1240]
408: f85f83a9 ldur x9, [x29, #-8]
40c: eb09011f cmp x8, x9
410: 54000301 b.ne 470 <test_ubsan_out_of_bounds+0x154> // b.any
As there is no guarantee the compiler will order the local variables
as declared in the module:
volatile char above[4] = { }; /* Protect surrounding memory. */
volatile int arr[4];
volatile char below[4] = { }; /* Protect surrounding memory. */
There is another problem where the out-of-bound index is 5 which is larger
than the extra surrounding memory for protection.
So, use a struct to enforce the ordering, and fix the index to be 4.
Also, remove some of the volatiles and rely on OPTIMIZER_HIDE_VAR()
Signed-off-by: Mostafa Saleh <smostafa@google.com>
Link: https://lore.kernel.org/r/20250415203354.4109415-1-smostafa@google.com
Signed-off-by: Kees Cook <kees@kernel.org>
CONFIG_UBSAN_INTEGER_WRAP is 'default UBSAN', which is problematic for a
couple of reasons.
The first is that this sanitizer is under active development on the
compiler side to come up with a solution that is maintainable on the
compiler side and usable on the kernel side. As a result of this, there
are many warnings when the sanitizer is enabled that have no clear path
to resolution yet but users may see them and report them in the meantime.
The second is that this option was renamed from
CONFIG_UBSAN_SIGNED_WRAP, meaning that if a configuration has
CONFIG_UBSAN=y but CONFIG_UBSAN_SIGNED_WRAP=n and it is upgraded via
olddefconfig (common in non-interactive scenarios such as CI),
CONFIG_UBSAN_INTEGER_WRAP will be silently enabled again.
Remove 'default UBSAN' from CONFIG_UBSAN_INTEGER_WRAP until it is ready
for regular usage and testing from a broader community than the folks
actively working on the feature.
Cc: stable@vger.kernel.org
Fixes: 557f8c582a ("ubsan: Reintroduce signed overflow sanitizer")
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Link: https://lore.kernel.org/r/20250414-drop-default-ubsan-integer-wrap-v1-1-392522551d6b@kernel.org
Signed-off-by: Kees Cook <kees@kernel.org>
The call to read_word_at_a_time() in sized_strscpy() is problematic
with MTE because it may trigger a tag check fault when reading
across a tag granule (16 bytes) boundary. To make this code
MTE compatible, let's start using load_unaligned_zeropad()
on architectures where it is available (i.e. architectures that
define CONFIG_DCACHE_WORD_ACCESS). Because load_unaligned_zeropad()
takes care of page boundaries as well as tag granule boundaries,
also disable the code preventing crossing page boundaries when using
load_unaligned_zeropad().
Signed-off-by: Peter Collingbourne <pcc@google.com>
Link: https://linux-review.googlesource.com/id/If4b22e43b5a4ca49726b4bf98ada827fdf755548
Fixes: 94ab5b61ee ("kasan, arm64: enable CONFIG_KASAN_HW_TAGS")
Cc: stable@vger.kernel.org
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Link: https://lore.kernel.org/r/20250403000703.2584581-2-pcc@google.com
Signed-off-by: Kees Cook <kees@kernel.org>
As more tests are added, the exit function gets longer than it should
be. Condense the un-register calls into a for loop to make it easier to
add/remove tests.
Reviewed-by: Kees Cook <kees@kernel.org>
Signed-off-by: Joel Granados <joel.granados@kernel.org>
As we add more test functions in lib/tests_sysctl the main test function
(test_sysctl_init) grows. Condense the logic to make it easier to
add/remove tests.
Reviewed-by: Kees Cook <kees@kernel.org>
Signed-off-by: Joel Granados <joel.granados@kernel.org>
If the test added in commit b5ffbd1396 ("sysctl: move the extra1/2
boundary check of u8 to sysctl_check_table_array") is run as a module, a
lingering reference to the module is left behind, and a 'sysctl -a'
leads to a panic.
To reproduce
CONFIG_KUNIT=y
CONFIG_SYSCTL_KUNIT_TEST=m
Then run these commands:
modprobe sysctl-test
rmmod sysctl-test
sysctl -a
The panic varies but generally looks something like this:
BUG: unable to handle page fault for address: ffffa4571c0c7db4
#PF: supervisor read access in kernel mode
#PF: error_code(0x0000) - not-present page
PGD 100000067 P4D 100000067 PUD 100351067 PMD 114f5e067 PTE 0
Oops: Oops: 0000 [#1] SMP NOPTI
... ... ...
RIP: 0010:proc_sys_readdir+0x166/0x2c0
... ... ...
Call Trace:
<TASK>
iterate_dir+0x6e/0x140
__se_sys_getdents+0x6e/0x100
do_syscall_64+0x70/0x150
entry_SYSCALL_64_after_hwframe+0x76/0x7e
Move the test to lib/test_sysctl.c where the registration reference is
handled on module exit
Fixes: b5ffbd1396 ("sysctl: move the extra1/2 boundary check of u8 to sysctl_check_table_array")
Reviewed-by: Kees Cook <kees@kernel.org>
Signed-off-by: Joel Granados <joel.granados@kernel.org>
Merge series from James Calligeros <jcalligeros99@gmail.com>:
This series introduces a number of changes to the drivers for
the Texas Instruments TAS2764 and TAS2770 amplifiers in order to
introduce (and improve in the case of TAS2770) support for the
variants of these amps found in Apple Silicon Macs.
Apple's variant of TAS2764 is known as SN012776, and as always with
Apple is a subtly incompatible variant with a number of quirks. It
is not publicly available. The TAS2770 variant is known as TAS5770L,
and does not require incompatible handling.
Much as with the Cirrus codec patches, I do not
expect that we will get any official acknowledgement that these parts
exist from TI, however I would be delighted to be proven wrong.
This series has been living in the downstream Asahi kernel tree[1]
for over two years, and has been tested by many thousands of users
by this point[2].
v4 drops the TDM idle TX slot behaviour patches. I experimented with
the API discussed in v3, however this did not work on any of the machines
I tested it with. More tweaking is probably needed.
[1] https://github.com/AsahiLinux/linux/tree/asahi-wip
[2] https://stats.asahilinux.org/
alloc_pages_bulk_node() may partially succeed and allocate fewer than the
requested nr_pages. There are several conditions under which this can
occur, but we have encountered the case where CONFIG_PAGE_OWNER is enabled
causing all bulk allocations to always fallback to single page allocations
due to commit 187ad460b8 ("mm/page_alloc: avoid page allocator recursion
with pagesets.lock held").
Currently vm_module_tags_populate() immediately fails when
alloc_pages_bulk_node() returns fewer than the requested number of pages.
When this happens memory allocation profiling gets disabled, for example
[ 14.297583] [9: modprobe: 465] Failed to allocate memory for allocation tags in the module scsc_wlan. Memory allocation profiling is disabled!
[ 14.299339] [9: modprobe: 465] modprobe: Failed to insmod '/vendor/lib/modules/scsc_wlan.ko' with args '': Out of memory
This patch causes vm_module_tags_populate() to retry bulk allocations for
the remaining memory instead of failing immediately which will avoid the
disablement of memory allocation profiling.
Link: https://lkml.kernel.org/r/20250409225111.3770347-1-tjmercier@google.com
Fixes: 0f9b685626 ("alloc_tag: populate memory for module tags as needed")
Signed-off-by: T.J. Mercier <tjmercier@google.com>
Reported-by: Janghyuck Kim <janghyuck.kim@samsung.com>
Acked-by: Suren Baghdasaryan <surenb@google.com>
Cc: Kent Overstreet <kent.overstreet@linux.dev>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Modules without a description now cause a warning:
WARNING: modpost: missing MODULE_DESCRIPTION() in lib/tests/slub_kunit.o
Link: https://lkml.kernel.org/r/20250324173242.1501003-10-arnd@kernel.org
Fixes: 6c6c1fc09d ("modpost: require a MODULE_DESCRIPTION()")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Cc: Guenetr Roeck <linux@roeck-us.net>
Cc: Jeff Johnson <jeff.johnson@oss.qualcomm.com>
Cc: Masahiro Yamada <masahiroy@kernel.org>
Cc: Pei Xiao <xiaopei01@kylinos.cn>
Cc: Rae Moar <rmoar@google.com>
Cc: Stehen Rothwell <sfr@canb.auug.org.au>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Modules without a description now cause a warning:
WARNING: modpost: missing MODULE_DESCRIPTION() in lib/ucs2_string.o
Link: https://lkml.kernel.org/r/20250324173242.1501003-7-arnd@kernel.org
Fixes: 6c6c1fc09d ("modpost: require a MODULE_DESCRIPTION()")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Cc: Jeff Johnson <jeff.johnson@oss.qualcomm.com>
Cc: Masahiro Yamada <masahiroy@kernel.org>
Cc: Stehen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Modules without a description now cause a warning:
WARNING: modpost: missing MODULE_DESCRIPTION() in lib/zlib_inflate/zlib_inflate.o
Link: https://lkml.kernel.org/r/20250324173242.1501003-6-arnd@kernel.org
Fixes: 6c6c1fc09d ("modpost: require a MODULE_DESCRIPTION()")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Cc: Jeff Johnson <jeff.johnson@oss.qualcomm.com>
Cc: Masahiro Yamada <masahiroy@kernel.org>
Cc: Stehen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
This is needed to avoid a build warning:
WARNING: modpost: missing MODULE_DESCRIPTION() in lib/asn1_decoder.o
Link: https://lkml.kernel.org/r/20250324173242.1501003-2-arnd@kernel.org
Fixes: 6c6c1fc09d ("modpost: require a MODULE_DESCRIPTION()")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Tested-by: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Jeff Johnson <jeff.johnson@oss.qualcomm.com>
Cc: Masahiro Yamada <masahiroy@kernel.org>
Cc: Stehen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
When testing EROFS file-backed mount over v9fs on qemu, I encountered a
folio UAF issue. The page sanity check reports the following call trace.
The root cause is that pages in bvec are coalesced across a folio bounary.
The refcount of all non-slab folios should be increased to ensure
p9_releas_pages can put them correctly.
BUG: Bad page state in process md5sum pfn:18300
page: refcount:0 mapcount:0 mapping:00000000d5ad8e4e index:0x60 pfn:0x18300
head: order:0 mapcount:0 entire_mapcount:0 nr_pages_mapped:0 pincount:0
aops:z_erofs_aops ino:30b0f dentry name(?):"GoogleExtServicesCn.apk"
flags: 0x100000000000041(locked|head|node=0|zone=1)
raw: 0100000000000041 dead000000000100 dead000000000122 ffff888014b13bd0
raw: 0000000000000060 0000000000000020 00000000ffffffff 0000000000000000
head: 0100000000000041 dead000000000100 dead000000000122 ffff888014b13bd0
head: 0000000000000060 0000000000000020 00000000ffffffff 0000000000000000
head: 0100000000000000 0000000000000000 ffffffffffffffff 0000000000000000
head: 0000000000000010 0000000000000000 00000000ffffffff 0000000000000000
page dumped because: PAGE_FLAGS_CHECK_AT_FREE flag(s) set
Call Trace:
dump_stack_lvl+0x53/0x70
bad_page+0xd4/0x220
__free_pages_ok+0x76d/0xf30
__folio_put+0x230/0x320
p9_release_pages+0x179/0x1f0
p9_virtio_zc_request+0xa2a/0x1230
p9_client_zc_rpc.constprop.0+0x247/0x700
p9_client_read_once+0x34d/0x810
p9_client_read+0xf3/0x150
v9fs_issue_read+0x111/0x360
netfs_unbuffered_read_iter_locked+0x927/0x1390
netfs_unbuffered_read_iter+0xa2/0xe0
vfs_iocb_iter_read+0x2c7/0x460
erofs_fileio_rq_submit+0x46b/0x5b0
z_erofs_runqueue+0x1203/0x21e0
z_erofs_readahead+0x579/0x8b0
read_pages+0x19f/0xa70
page_cache_ra_order+0x4ad/0xb80
filemap_readahead.isra.0+0xe7/0x150
filemap_get_pages+0x7aa/0x1890
filemap_read+0x320/0xc80
vfs_read+0x6c6/0xa30
ksys_read+0xf9/0x1c0
do_syscall_64+0x9e/0x1a0
entry_SYSCALL_64_after_hwframe+0x71/0x79
Link: https://lkml.kernel.org/r/20250401144712.1377719-1-shengyong1@xiaomi.com
Fixes: b9c0e49abf ("mm: decline to manipulate the refcount on a slab page")
Signed-off-by: Sheng Yong <shengyong1@xiaomi.com>
Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
The GCC specific warning '-Wsuggest-attribute=format' is disabled around
va_format() using raw #pragma statements, which includes an
'#ifndef __clang__' to avoid a warning about an unknown warning option
from clang (which recognizes '#pragma GCC' for compatibility reasons):
lib/vsprintf.c:1703:32: error: unknown warning group '-Wsuggest-attribute=format', ignored [-Werror,-Wunknown-warning-option]
1703 | #pragma GCC diagnostic ignored "-Wsuggest-attribute=format"
| ^
While the current solution works, it is not visually appealing. The
kernel already has some infrastructure that wraps these #pragma
statements to give more specific control over diagnostics without
needing #ifdef blocks for different compilers. Convert the existing
statements over to the __diag macros.
Closes: https://lore.kernel.org/r/CAHk-=wgfX9nBGE0Ap9GjhOy7Mn=RSy=rx0MvqfYFFDx31KJXqQ@mail.gmail.com
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Tested-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Link: https://patch.msgid.link/20250404-vsprintf-convert-pragmas-to-__diag-v1-2-5d6c5c55b2bd@kernel.org
Signed-off-by: Petr Mladek <pmladek@suse.com>
Finish cleaning up the CRC kconfig options by removing the remaining
unnecessary prompts and an unnecessary 'default y', removing
CONFIG_LIBCRC32C, and documenting all the CRC library options.
-----BEGIN PGP SIGNATURE-----
iIoEABYIADIWIQSacvsUNc7UX4ntmEPzXCl4vpKOKwUCZ/P7QhQcZWJpZ2dlcnNA
Z29vZ2xlLmNvbQAKCRDzXCl4vpKOKyoOAQCynFcS1dWuD27S+SdUREmBjMAoZo5M
zdsIvlPv9KLycgD/QX5lXjW3KIYY6jQ8vHUuLVwfDl/JEp4GJS9dLGU+agg=
=0R1T
-----END PGP SIGNATURE-----
Merge tag 'crc-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux
Pull CRC cleanups from Eric Biggers:
"Finish cleaning up the CRC kconfig options by removing the remaining
unnecessary prompts and an unnecessary 'default y', removing
CONFIG_LIBCRC32C, and documenting all the CRC library options"
* tag 'crc-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux:
lib/crc: remove CONFIG_LIBCRC32C
lib/crc: document all the CRC library kconfig options
lib/crc: remove unnecessary prompt for CONFIG_CRC_ITU_T
lib/crc: remove unnecessary prompt for CONFIG_CRC_T10DIF
lib/crc: remove unnecessary prompt for CONFIG_CRC16
lib/crc: remove unnecessary prompt for CONFIG_CRC_CCITT
lib/crc: remove unnecessary prompt for CONFIG_CRC32 and drop 'default y'
Existing parse_inte_array_user() works with __user buffers only.
Separate array parsing from __user bits so the functionality can be
utilized with kernel buffers too.
Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-by: Amadeusz Sławiński <amadeuszx.slawinski@linux.intel.com>
Signed-off-by: Cezary Rojewski <cezary.rojewski@intel.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Link: https://patch.msgid.link/20250404090337.3564117-2-cezary.rojewski@intel.com
Signed-off-by: Mark Brown <broonie@kernel.org>
- Improve performance in gendwarfksyms
- Remove deprecated EXTRA_*FLAGS and KBUILD_ENABLE_EXTRA_GCC_CHECKS
- Support CONFIG_HEADERS_INSTALL for ARCH=um
- Use more relative paths to sources files for better reproducibility
- Support the loong64 Debian architecture
- Add Kbuild bash completion
- Introduce intermediate vmlinux.unstripped for architectures that need
static relocations to be stripped from the final vmlinux
- Fix versioning in Debian packages for -rc releases
- Treat missing MODULE_DESCRIPTION() as an error
- Convert Nios2 Makefiles to use the generic rule for built-in DTB
- Add debuginfo support to the RPM package
-----BEGIN PGP SIGNATURE-----
iQJJBAABCgAzFiEEbmPs18K1szRHjPqEPYsBB53g2wYFAmfxp2EVHG1hc2FoaXJv
eUBrZXJuZWwub3JnAAoJED2LAQed4NsGkIUP/AgNiP6or6fmY5+HSyjlrdutBWAh
QNW0AiKh5vytmBIv63/i103OE0SRbt+U6IApn9c7FQKkeuyIlD1e9NfSwFMZixmP
P7t6JqDCL61G5d3W2Iisqle1cpBoVvNgUwu0k3sTSXl0vNsDbiyxcCzQzLhZMKsd
O+Ppwp3zNGE2vIUwpIjzJsR5Dt/Z5MfuKDi4UShsyWpFZ1rg9X93YKc9QJOXjKwj
4Np2x2cukDo2oz4uXuZQ8F1+bOFsKYoilCwjtxlrC6BO0lSPiJsRTN6nGJ0ejns9
GGD56mBNGcGk+NEPGhAMQmZHqNAP4JfjEvAgaoSBn0Rdnjd9Cj/2T+4n61xkR4Wu
MXCP/LEJ3MyctmkZjUq+0fDAe2wjxuaAG15kAHCha+9KxIG2NzHbf2XXb4E49DDU
2rw3fqA41/cKCq1ZEaqRn3pZZgU6ysfsEW42JmnNxO+7zz9k8RX4rk8CVaVIEUuw
Xojkis//KnE6+OCBe6Tb0H2Rzo0JF3AG2eNF4zY/xnc562FRIMS19WYS38tKZng6
Gr1BRG0bA4t9mf2Vck1W1LcAb3Jh0mddtyrgYKhbcwq0YOj2q/H6F50DkC+wL282
wvhV6B/vKAH8BByEWAn3rBcN0N+w/VFc0uPCz//tkoAm4nPg8PvKq63JHPrHsyZe
mOMhifoiVbjF4KFo
=GiQ6
-----END PGP SIGNATURE-----
Merge tag 'kbuild-v6.15' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild
Pull Kbuild updates from Masahiro Yamada:
- Improve performance in gendwarfksyms
- Remove deprecated EXTRA_*FLAGS and KBUILD_ENABLE_EXTRA_GCC_CHECKS
- Support CONFIG_HEADERS_INSTALL for ARCH=um
- Use more relative paths to sources files for better reproducibility
- Support the loong64 Debian architecture
- Add Kbuild bash completion
- Introduce intermediate vmlinux.unstripped for architectures that need
static relocations to be stripped from the final vmlinux
- Fix versioning in Debian packages for -rc releases
- Treat missing MODULE_DESCRIPTION() as an error
- Convert Nios2 Makefiles to use the generic rule for built-in DTB
- Add debuginfo support to the RPM package
* tag 'kbuild-v6.15' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: (40 commits)
kbuild: rpm-pkg: build a debuginfo RPM
kconfig: merge_config: use an empty file as initfile
nios2: migrate to the generic rule for built-in DTB
rust: kbuild: skip `--remap-path-prefix` for `rustdoc`
kbuild: pacman-pkg: hardcode module installation path
kbuild: deb-pkg: don't set KBUILD_BUILD_VERSION unconditionally
modpost: require a MODULE_DESCRIPTION()
kbuild: make all file references relative to source root
x86: drop unnecessary prefix map configuration
kbuild: deb-pkg: add comment about future removal of KDEB_COMPRESS
kbuild: Add a help message for "headers"
kbuild: deb-pkg: remove "version" variable in mkdebian
kbuild: deb-pkg: fix versioning for -rc releases
Documentation/kbuild: Fix indentation in modules.rst example
x86: Get rid of Makefile.postlink
kbuild: Create intermediate vmlinux build with relocations preserved
kbuild: Introduce Kconfig symbol for linking vmlinux with relocations
kbuild: link-vmlinux.sh: Make output file name configurable
kbuild: do not generate .tmp_vmlinux*.map when CONFIG_VMLINUX_MAP=y
Revert "kheaders: Ignore silly-rename files"
...
Now that LIBCRC32C does nothing besides select CRC32, make every option
that selects LIBCRC32C instead select CRC32 directly. Then remove
LIBCRC32C.
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: "Martin K. Petersen" <martin.petersen@oracle.com>
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250401221600.24878-8-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@google.com>
Previous commits removed all the original CRC kconfig help text, since
it was oriented towards people configuring the kernel, and the options
are no longer user-selectable. However, it's still useful for there to
be help text for kernel developers. Add this.
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: "Martin K. Petersen" <martin.petersen@oracle.com>
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250401221600.24878-7-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@google.com>
All modules that need CONFIG_CRC_ITU_T already select it, so there is no
need to bother users about the option.
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: "Martin K. Petersen" <martin.petersen@oracle.com>
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250401221600.24878-6-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@google.com>
All modules that need CONFIG_CRC_T10DIF already select it, so there is no
need to bother users about the option.
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: "Martin K. Petersen" <martin.petersen@oracle.com>
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250401221600.24878-5-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@google.com>
All modules that need CONFIG_CRC16 already select it, so there is no
need to bother users about the option.
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: "Martin K. Petersen" <martin.petersen@oracle.com>
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250401221600.24878-4-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@google.com>
All modules that need CONFIG_CRC_CCITT already select it, so there is no
need to bother users about the option.
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: "Martin K. Petersen" <martin.petersen@oracle.com>
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250401221600.24878-3-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@google.com>
All modules that need CONFIG_CRC32 already select it, so there is no
need to bother users about the option, nor to default it to y.
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: "Martin K. Petersen" <martin.petersen@oracle.com>
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250401221600.24878-2-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@google.com>
-----BEGIN PGP SIGNATURE-----
iHUEABYKAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCZ+4Y2wAKCRDdBJ7gKXxA
jpqDAQDE7mee8FW6be6dAD+dAdHgSsKZ9vUm4zQTMsSYTmCaowEAxx3ro7NEO4fk
ekxRJGlv0PNRssMbFzMCzR5ig+kzBww=
=OX46
-----END PGP SIGNATURE-----
Merge tag 'mm-nonmm-stable-2025-04-02-22-12' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull more non-MM updates from Andrew Morton:
"One bugfix and a couple of small late-arriving updates"
* tag 'mm-nonmm-stable-2025-04-02-22-12' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
lib: scatterlist: fix sg_split_phys to preserve original scatterlist offsets
lib/sort.c: add _nonatomic() variants with cond_resched()
mailmap: add an entry for Nicolas Schier
from Mike Rapoport fixes a couple of issues with the just-merged "arch,
mm: reduce code duplication in mem_init()" series.
- The 4 patch series "MAINTAINERS: add my isub-entries to MM part." from
Mike Rapoport does some maintenance on MAINTAINERS.
- The 6 patch series "remove tlb_remove_page_ptdesc()" from Qi Zheng
does some cleanup work to the page mapping code.
- The 7 patch series "mseal system mappings" from Jeff Xu permits
sealing of "system mappings", such as vdso, vvar, vvar_vclock, vectors
(arm compat-mode), sigpage (arm compat-mode).
- Plus the usual shower of singleton patches.
-----BEGIN PGP SIGNATURE-----
iHUEABYKAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCZ+4XpgAKCRDdBJ7gKXxA
jnwtAP43Rp3zyWf034fEypea36xQqcsy4I7YUTdZEgnFS7LCZwEApM97JvGHsYEr
Ns9Zhnh+E3RWASfOAzJoVZVrAaMovg4=
=MyVR
-----END PGP SIGNATURE-----
Merge tag 'mm-stable-2025-04-02-22-07' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull more MM updates from Andrew Morton:
- The series "mm: fixes for fallouts from mem_init() cleanup" from Mike
Rapoport fixes a couple of issues with the just-merged "arch, mm:
reduce code duplication in mem_init()" series
- The series "MAINTAINERS: add my isub-entries to MM part." from Mike
Rapoport does some maintenance on MAINTAINERS
- The series "remove tlb_remove_page_ptdesc()" from Qi Zheng does some
cleanup work to the page mapping code
- The series "mseal system mappings" from Jeff Xu permits sealing of
"system mappings", such as vdso, vvar, vvar_vclock, vectors (arm
compat-mode), sigpage (arm compat-mode)
- Plus the usual shower of singleton patches
* tag 'mm-stable-2025-04-02-22-07' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (31 commits)
mseal sysmap: add arch-support txt
mseal sysmap: enable s390
selftest: test system mappings are sealed
mseal sysmap: update mseal.rst
mseal sysmap: uprobe mapping
mseal sysmap: enable arm64
mseal sysmap: enable x86-64
mseal sysmap: generic vdso vvar mapping
selftests: x86: test_mremap_vdso: skip if vdso is msealed
mseal sysmap: kernel config and header change
mm: pgtable: remove tlb_remove_page_ptdesc()
x86: pgtable: convert to use tlb_remove_ptdesc()
riscv: pgtable: unconditionally use tlb_remove_ptdesc()
mm: pgtable: convert some architectures to use tlb_remove_ptdesc()
mm: pgtable: change pt parameter of tlb_remove_ptdesc() to struct ptdesc*
mm: pgtable: make generic tlb_remove_table() use struct ptdesc
microblaze/mm: put mm_cmdline_setup() in .init.text section
mm/memory_hotplug: fix call folio_test_large with tail page in do_migrate_range
MAINTAINERS: mm: add entry for secretmem
MAINTAINERS: mm: add entry for numa memblocks and numa emulation
...
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEESH4wyp42V4tXvYsjUqAMR0iAlPIFAmftIx0ACgkQUqAMR0iA
lPJKUxAArJaWC7EXtDtd8u8Rl/CYpIEaMdPd7V+XA5sqdUyjkSJI+jRswonpOsNX
Zn9pbGMds1LNBXm1NO9039+2TrPSJFTCGK6OgeCJC17/O31wnnm0LhZiU+JElgfi
iQI5fdTnc3sB37bsjkvEUr9HFizRxY2fHHMWZ8ngiLfkKfki4ET+1u/yf7CraRk1
6+LK9mM/WyytP6gYaSlL5YYVYs9fNcR/ND6IQgpfIN15/fOAOXWbMB1jE2iDRzqt
MQUD4+DTYQYmeS6jQ4ToZdx3Ql9NwcP2nJnA5fxXeqPFHc/SgRS6KqOPQgQUD4tV
N4q6ozLPlzDFeHVHMhPz/PzlSEn0zC1ZX87xXCUAilnkJpbEujcPxf44R/3RHu3d
y7kmCRj0RwgHpLIwzLH5POrF4il9/wVlyZFRaYBPMkj09l0WBwYvfMhlnzvAtCP8
pRKqHkjJ1FOWQFJyn98ONqcCmm2pZ8XKW2enikAhISVXcptI/1lIQ6IIpRdTjte1
r60CbiJ7UFL+TrVqsWBuqWQRi5u5HykPkZiCL/YYXzZmrl3zLO+0ti9YzEU8Yrzd
K1VAB/1aK/MDrTgOI+VaqlPq79uJBwtbrflgFhFBKAKsqTpBcsZUv9/1KHthnqXV
Y84SsY2XpoGtjn58mU6eEc+8lLTOTDVXs+ZZL4/M3maW7ygNiYY=
=Biv4
-----END PGP SIGNATURE-----
Merge tag 'printk-for-6.15-2' of git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux
Pull more printk updates from Petr Mladek:
- Silence warnings about candidates for ‘gnu_print’ format attribute
* tag 'printk-for-6.15-2' of git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux:
vsnprintf: Silence false positive GCC warning for va_format()
vsnprintf: Drop unused const char fmt * in va_format()
vsnprintf: Mark binary printing functions with __printf() attribute
tracing: Mark binary printing functions with __printf() attribute
seq_file: Mark binary printing functions with __printf() attribute
seq_buf: Mark binary printing functions with __printf() attribute
Two significant new items:
- Allow reporting IOMMU HW events to userspace when the events are clearly
linked to a device. This is linked to the VIOMMU object and is intended to
be used by a VMM to forward HW events to the virtual machine as part of
emulating a vIOMMU. ARM SMMUv3 is the first driver to use this
mechanism. Like the existing fault events the data is delivered through
a simple FD returning event records on read().
- PASID support in VFIO. "Process Address Space ID" is a PCI feature that
allows the device to tag all PCI DMA operations with an ID. The IOMMU
will then use the ID to select a unique translation for those DMAs. This
is part of Intel's vIOMMU support as VT-D HW requires the hypervisor to
manage each PASID entry. The support is generic so any VFIO user could
attach any translation to a PASID, and the support should work on ARM
SMMUv3 as well. AMD requires additional driver work.
Some minor updates, along with fixes:
- Prevent using nested parents with fault's, no driver support today
- Put a single "cookie_type" value in the iommu_domain to indicate what
owns the various opaque owner fields
-----BEGIN PGP SIGNATURE-----
iHUEABYKAB0WIQRRRCHOFoQz/8F5bUaFwuHvBreFYQUCZ+q6NgAKCRCFwuHvBreF
YZ3zAQDbl4/Z0O+CLN2AXq4Zeiyq1HTSoF94hzqmm7lQ17zTIwD8CCdyLXHvupaq
tkBIv5IovpaxlrSk6M0kh2K8vPCk9Qk=
=CIM3
-----END PGP SIGNATURE-----
Merge tag 'for-linus-iommufd' of git://git.kernel.org/pub/scm/linux/kernel/git/jgg/iommufd
Pull iommufd updates from Jason Gunthorpe:
"Two significant new items:
- Allow reporting IOMMU HW events to userspace when the events are
clearly linked to a device.
This is linked to the VIOMMU object and is intended to be used by a
VMM to forward HW events to the virtual machine as part of
emulating a vIOMMU. ARM SMMUv3 is the first driver to use this
mechanism. Like the existing fault events the data is delivered
through a simple FD returning event records on read().
- PASID support in VFIO.
The "Process Address Space ID" is a PCI feature that allows the
device to tag all PCI DMA operations with an ID. The IOMMU will
then use the ID to select a unique translation for those DMAs. This
is part of Intel's vIOMMU support as VT-D HW requires the
hypervisor to manage each PASID entry.
The support is generic so any VFIO user could attach any
translation to a PASID, and the support should work on ARM SMMUv3
as well. AMD requires additional driver work.
Some minor updates, along with fixes:
- Prevent using nested parents with fault's, no driver support today
- Put a single "cookie_type" value in the iommu_domain to indicate
what owns the various opaque owner fields"
* tag 'for-linus-iommufd' of git://git.kernel.org/pub/scm/linux/kernel/git/jgg/iommufd: (49 commits)
iommufd: Test attach before detaching pasid
iommufd: Fix iommu_vevent_header tables markup
iommu: Convert unreachable() to BUG()
iommufd: Balance veventq->num_events inc/dec
iommufd: Initialize the flags of vevent in iommufd_viommu_report_event()
iommufd/selftest: Add coverage for reporting max_pasid_log2 via IOMMU_HW_INFO
iommufd: Extend IOMMU_GET_HW_INFO to report PASID capability
vfio: VFIO_DEVICE_[AT|DE]TACH_IOMMUFD_PT support pasid
vfio-iommufd: Support pasid [at|de]tach for physical VFIO devices
ida: Add ida_find_first_range()
iommufd/selftest: Add coverage for iommufd pasid attach/detach
iommufd/selftest: Add test ops to test pasid attach/detach
iommufd/selftest: Add a helper to get test device
iommufd/selftest: Add set_dev_pasid in mock iommu
iommufd: Allow allocating PASID-compatible domain
iommu/vt-d: Add IOMMU_HWPT_ALLOC_PASID support
iommufd: Enforce PASID-compatible domain for RID
iommufd: Support pasid attach/replace
iommufd: Enforce PASID-compatible domain in PASID path
iommufd/device: Add pasid_attach array to track per-PASID attach
...
The split_sg_phys function was incorrectly setting the offsets of all
scatterlist entries (except the first) to 0. Only the first scatterlist
entry's offset and length needs to be modified to account for the skip.
Setting the rest entries' offsets to 0 could lead to incorrect data
access.
I am using this function in a crypto driver that I'm currently developing
(not yet sent to mailing list). During testing, it was observed that the
output scatterlists (except the first one) contained incorrect garbage
data.
I narrowed this issue down to the call of sg_split(). Upon debugging
inside this function, I found that this resetting of offset is the cause
of the problem, causing the subsequent scatterlists to point to incorrect
memory locations in a page. By removing this code, I am obtaining
expected data in all the split output scatterlists. Thus, this was indeed
causing observable runtime effects!
This patch removes the offending code, ensuring that the page offsets in
the input scatterlist are preserved in the output scatterlist.
Link: https://lkml.kernel.org/r/20250319111437.1969903-1-t-pratham@ti.com
Fixes: f8bcbe62ac ("lib: scatterlist: add sg splitting function")
Signed-off-by: T Pratham <t-pratham@ti.com>
Cc: Robert Jarzmik <robert.jarzmik@free.fr>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Kamlesh Gurudasani <kamlesh@ti.com>
Cc: Praneeth Bajjuri <praneeth@ti.com>
Cc: Vignesh Raghavendra <vigneshr@ti.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
With the introduction of the generic vdso data storage the VM_SEALED_SYSMAP
vm flag must be moved from the architecture specific
_install_special_mapping() call [1] [2] which maps the vvar mapping to
generic code.
[1] https://lkml.kernel.org/r/20250305021711.3867874-4-jeffxu@google.com
[2] https://lkml.kernel.org/r/20250305021711.3867874-5-jeffxu@google.com
Link: https://lkml.kernel.org/r/20250311123326.2686682-2-hca@linux.ibm.com
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Alexander Gordeev <agordeev@linux.ibm.com>
Cc: Christian Borntraeger <borntraeger@linux.ibm.com>
Cc: Jeff Xu <jeffxu@chromium.org>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Sven Schnelle <svens@linux.ibm.com>
Cc: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Here is the big set of char, misc, iio, and other smaller driver
subsystems for 6.15-rc1. Lots of stuff in here, including:
- loads of IIO changes and driver updates
- counter driver updates
- w1 driver updates
- faux conversions for some drivers that were abusing the platform bus
interface
- coresight driver updates
- rust miscdevice binding updates based on real-world-use
- other minor driver updates
All of these have been in linux-next with no reported issues for quite a
while.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCZ+mNdQ8cZ3JlZ0Brcm9h
aC5jb20ACgkQMUfUDdst+ylktACfYJix41jCCDbiFjnu7Hz4OIdcrUsAnRyF164M
1n5MhEhsEmvQj7WBwQLE
=AmmW
-----END PGP SIGNATURE-----
Merge tag 'char-misc-6.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc
Pull char / misc / IIO driver updates from Greg KH:
"Here is the big set of char, misc, iio, and other smaller driver
subsystems for 6.15-rc1. Lots of stuff in here, including:
- loads of IIO changes and driver updates
- counter driver updates
- w1 driver updates
- faux conversions for some drivers that were abusing the platform
bus interface
- coresight driver updates
- rust miscdevice binding updates based on real-world-use
- other minor driver updates
All of these have been in linux-next with no reported issues for quite
a while"
* tag 'char-misc-6.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (292 commits)
samples: rust_misc_device: fix markup in top-level docs
Coresight: Fix a NULL vs IS_ERR() bug in probe
misc: lis3lv02d: convert to use faux_device
tlclk: convert to use faux_device
regulator: dummy: convert to use the faux device interface
bus: mhi: host: Fix race between unprepare and queue_buf
coresight: configfs: Constify struct config_item_type
doc: iio: ad7380: describe offload support
iio: ad7380: add support for SPI offload
iio: light: Add check for array bounds in veml6075_read_int_time_ms
iio: adc: ti-ads7924 Drop unnecessary function parameters
staging: iio: ad9834: Use devm_regulator_get_enable()
staging: iio: ad9832: Use devm_regulator_get_enable()
iio: gyro: bmg160_spi: add of_match_table
dt-bindings: iio: adc: Add i.MX94 and i.MX95 support
iio: adc: ad7768-1: remove unnecessary locking
Documentation: ABI: add wideband filter type to sysfs-bus-iio
iio: adc: ad7768-1: set MOSI idle state to prevent accidental reset
iio: adc: ad7768-1: Fix conversion result sign
iio: adc: ad7124: Benefit of dev = indio_dev->dev.parent in ad7124_parse_channel_config()
...
reservation" from Sourabh Jain changes powerpc's kexec code to use more
of the generic layers.
- The 2 patch series "get_maintainer: report subsystem status
separately" from Vlastimil Babka makes some long-requested improvements
to the get_maintainer output.
- The 4 patch series "ucount: Simplify refcounting with rcuref_t" from
Sebastian Siewior cleans up and optimizing the refcounting in the ucount
code.
- The 12 patch series "reboot: support runtime configuration of
emergency hw_protection action" from Ahmad Fatoum improves the ability
for a driver to perform an emergency system shutdown or reboot.
- The 16 patch series "Converge on using secs_to_jiffies() part two"
from Easwar Hariharan performs further migrations from
msecs_to_jiffies() to secs_to_jiffies().
- The 7 patch series "lib/interval_tree: add some test cases and
cleanup" from Wei Yang permits more userspace testing of kernel library
code, adds some more tests and performs some cleanups.
- The 2 patch series "hung_task: Dump the blocking task stacktrace" from
Masami Hiramatsu arranges for the hung_task detector to dump the stack
of the blocking task and not just that of the blocked task.
- The 4 patch series "resource: Split and use DEFINE_RES*() macros" from
Andy Shevchenko provides some cleanups to the resource definition
macros.
- Plus the usual shower of singleton patches - please see the individual
changelogs for details.
-----BEGIN PGP SIGNATURE-----
iHUEABYKAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCZ+nuqwAKCRDdBJ7gKXxA
jtNqAQDxqJpjWkzn4yN9CNSs1ivVx3fr6SqazlYCrt3u89WQvwEA1oRrGpETzUGq
r6khQUIcQImPPcjFqEFpuiSOU0MBZA0=
=Kii8
-----END PGP SIGNATURE-----
Merge tag 'mm-nonmm-stable-2025-03-30-18-23' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull non-MM updates from Andrew Morton:
- The series "powerpc/crash: use generic crashkernel reservation" from
Sourabh Jain changes powerpc's kexec code to use more of the generic
layers.
- The series "get_maintainer: report subsystem status separately" from
Vlastimil Babka makes some long-requested improvements to the
get_maintainer output.
- The series "ucount: Simplify refcounting with rcuref_t" from
Sebastian Siewior cleans up and optimizing the refcounting in the
ucount code.
- The series "reboot: support runtime configuration of emergency
hw_protection action" from Ahmad Fatoum improves the ability for a
driver to perform an emergency system shutdown or reboot.
- The series "Converge on using secs_to_jiffies() part two" from Easwar
Hariharan performs further migrations from msecs_to_jiffies() to
secs_to_jiffies().
- The series "lib/interval_tree: add some test cases and cleanup" from
Wei Yang permits more userspace testing of kernel library code, adds
some more tests and performs some cleanups.
- The series "hung_task: Dump the blocking task stacktrace" from Masami
Hiramatsu arranges for the hung_task detector to dump the stack of
the blocking task and not just that of the blocked task.
- The series "resource: Split and use DEFINE_RES*() macros" from Andy
Shevchenko provides some cleanups to the resource definition macros.
- Plus the usual shower of singleton patches - please see the
individual changelogs for details.
* tag 'mm-nonmm-stable-2025-03-30-18-23' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (77 commits)
mailmap: consolidate email addresses of Alexander Sverdlin
fs/procfs: fix the comment above proc_pid_wchan()
relay: use kasprintf() instead of fixed buffer formatting
resource: replace open coded variant of DEFINE_RES()
resource: replace open coded variants of DEFINE_RES_*_NAMED()
resource: replace open coded variant of DEFINE_RES_NAMED_DESC()
resource: split DEFINE_RES_NAMED_DESC() out of DEFINE_RES_NAMED()
samples: add hung_task detector mutex blocking sample
hung_task: show the blocker task if the task is hung on mutex
kexec_core: accept unaccepted kexec segments' destination addresses
watchdog/perf: optimize bytes copied and remove manual NUL-termination
lib/interval_tree: fix the comment of interval_tree_span_iter_next_gap()
lib/interval_tree: skip the check before go to the right subtree
lib/interval_tree: add test case for span iteration
lib/interval_tree: add test case for interval_tree_iter_xxx() helpers
lib/rbtree: add random seed
lib/rbtree: split tests
lib/rbtree: enable userland test suite for rbtree related data structure
checkpatch: describe --min-conf-desc-length
scripts/gdb/symbols: determine KASLR offset on s390
...
Uros Bizjak uses x86 named address space qualifiers to provide
compile-time checking of percpu area accesses.
This has caused a small amount of fallout - two or three issues were
reported. In all cases the calling code was founf to be incorrect.
- The 4 patch series "Some cleanup for memcg" from Chen Ridong
implements some relatively monir cleanups for the memcontrol code.
- The 17 patch series "mm: fixes for device-exclusive entries (hmm)"
from David Hildenbrand fixes a boatload of issues which David found then
using device-exclusive PTE entries when THP is enabled. More work is
needed, but this makes thins better - our own HMM selftests now succeed.
- The 2 patch series "mm: zswap: remove z3fold and zbud" from Yosry
Ahmed remove the z3fold and zbud implementations. They have been
deprecated for half a year and nobody has complained.
- The 5 patch series "mm: further simplify VMA merge operation" from
Lorenzo Stoakes implements numerous simplifications in this area. No
runtime effects are anticipated.
- The 4 patch series "mm/madvise: remove redundant mmap_lock operations
from process_madvise()" from SeongJae Park rationalizes the locking in
the madvise() implementation. Performance gains of 20-25% were observed
in one MADV_DONTNEED microbenchmark.
- The 12 patch series "Tiny cleanup and improvements about SWAP code"
from Baoquan He contains a number of touchups to issues which Baoquan
noticed when working on the swap code.
- The 2 patch series "mm: kmemleak: Usability improvements" from Catalin
Marinas implements a couple of improvements to the kmemleak user-visible
output.
- The 2 patch series "mm/damon/paddr: fix large folios access and
schemes handling" from Usama Arif provides a couple of fixes for DAMON's
handling of large folios.
- The 3 patch series "mm/damon/core: fix wrong and/or useless
damos_walk() behaviors" from SeongJae Park fixes a few issues with the
accuracy of kdamond's walking of DAMON regions.
- The 3 patch series "expose mapping wrprotect, fix fb_defio use" from
Lorenzo Stoakes changes the interaction between framebuffer deferred-io
and core MM. No functional changes are anticipated - this is
preparatory work for the future removal of page structure fields.
- The 4 patch series "mm/damon: add support for hugepage_size DAMOS
filter" from Usama Arif adds a DAMOS filter which permits the filtering
by huge page sizes.
- The 4 patch series "mm: permit guard regions for file-backed/shmem
mappings" from Lorenzo Stoakes extends the guard region feature from its
present "anon mappings only" state. The feature now covers shmem and
file-backed mappings.
- The 4 patch series "mm: batched unmap lazyfree large folios during
reclamation" from Barry Song cleans up and speeds up the unmapping for
pte-mapped large folios.
- The 18 patch series "reimplement per-vma lock as a refcount" from
Suren Baghdasaryan puts the vm_lock back into the vma. Our reasons for
pulling it out were largely bogus and that change made the code more
messy. This patchset provides small (0-10%) improvements on one
microbenchmark.
- The 5 patch series "Docs/mm/damon: misc DAMOS filters documentation
fixes and improves" from SeongJae Park does some maintenance work on the
DAMON docs.
- The 27 patch series "hugetlb/CMA improvements for large systems" from
Frank van der Linden addresses a pile of issues which have been observed
when using CMA on large machines.
- The 2 patch series "mm/damon: introduce DAMOS filter type for unmapped
pages" from SeongJae Park enables users of DMAON/DAMOS to filter my the
page's mapped/unmapped status.
- The 19 patch series "zsmalloc/zram: there be preemption" from Sergey
Senozhatsky teaches zram to run its compression and decompression
operations preemptibly.
- The 12 patch series "selftests/mm: Some cleanups from trying to run
them" from Brendan Jackman fixes a pile of unrelated issues which
Brendan encountered while runnimg our selftests.
- The 2 patch series "fs/proc/task_mmu: add guard region bit to pagemap"
from Lorenzo Stoakes permits userspace to use /proc/pid/pagemap to
determine whether a particular page is a guard page.
- The 7 patch series "mm, swap: remove swap slot cache" from Kairui Song
removes the swap slot cache from the allocation path - it simply wasn't
being effective.
- The 5 patch series "mm: cleanups for device-exclusive entries (hmm)"
from David Hildenbrand implements a number of unrelated cleanups in this
code.
- The 5 patch series "mm: Rework generic PTDUMP configs" from Anshuman
Khandual implements a number of preparatoty cleanups to the
GENERIC_PTDUMP Kconfig logic.
- The 8 patch series "mm/damon: auto-tune aggregation interval" from
SeongJae Park implements a feedback-driven automatic tuning feature for
DAMON's aggregation interval tuning.
- The 5 patch series "Fix lazy mmu mode" from Ryan Roberts fixes some
issues in powerpc, sparc and x86 lazy MMU implementations. Ryan did
this in preparation for implementing lazy mmu mode for arm64 to optimize
vmalloc.
- The 2 patch series "mm/page_alloc: Some clarifications for migratetype
fallback" from Brendan Jackman reworks some commentary to make the code
easier to follow.
- The 3 patch series "page_counter cleanup and size reduction" from
Shakeel Butt cleans up the page_counter code and fixes a size increase
which we accidentally added late last year.
- The 3 patch series "Add a command line option that enables control of
how many threads should be used to allocate huge pages" from Thomas
Prescher does that. It allows the careful operator to significantly
reduce boot time by tuning the parallalization of huge page
initialization.
- The 3 patch series "Fix calculations in trace_balance_dirty_pages()
for cgwb" from Tang Yizhou fixes the tracing output from the dirty page
balancing code.
- The 9 patch series "mm/damon: make allow filters after reject filters
useful and intuitive" from SeongJae Park improves the handling of allow
and reject filters. Behaviour is made more consistent and the
documention is updated accordingly.
- The 5 patch series "Switch zswap to object read/write APIs" from Yosry
Ahmed updates zswap to the new object read/write APIs and thus permits
the removal of some legacy code from zpool and zsmalloc.
- The 6 patch series "Some trivial cleanups for shmem" from Baolin Wang
does as it claims.
- The 20 patch series "fs/dax: Fix ZONE_DEVICE page reference counts"
from Alistair Popple regularizes the weird ZONE_DEVICE page refcount
handling in DAX, permittig the removal of a number of special-case
checks.
- The 4 patch series "refactor mremap and fix bug" from Lorenzo Stoakes
is a preparatoty refactoring and cleanup of the mremap() code.
- The 20 patch series "mm: MM owner tracking for large folios (!hugetlb)
+ CONFIG_NO_PAGE_MAPCOUNT" from David Hildenbrand reworks the manner in
which we determine whether a large folio is known to be mapped
exclusively into a single MM.
- The 8 patch series "mm/damon: add sysfs dirs for managing DAMOS
filters based on handling layers" from SeongJae Park adds a couple of
new sysfs directories to ease the management of DAMON/DAMOS filters.
- The 13 patch series "arch, mm: reduce code duplication in mem_init()"
from Mike Rapoport consolidates many per-arch implementations of
mem_init() into code generic code, where that is practical.
- The 13 patch series "mm/damon/sysfs: commit parameters online via
damon_call()" from SeongJae Park continues the cleaning up of sysfs
access to DAMON internal data.
- The 3 patch series "mm: page_ext: Introduce new iteration API" from
Luiz Capitulino reworks the page_ext initialization to fix a boot-time
crash which was observed with an unusual combination of compile and
cmdline options.
- The 8 patch series "Buddy allocator like (or non-uniform) folio split"
from Zi Yan reworks the code to split a folio into smaller folios. The
main benefit is lessened memory consumption: fewer post-split folios are
generated.
- The 2 patch series "Minimize xa_node allocation during xarry split"
from Zi Yan reduces the number of xarray xa_nodes which are generated
during an xarray split.
- The 2 patch series "drivers/base/memory: Two cleanups" from Gavin Shan
performs some maintenance work on the drivers/base/memory code.
- The 3 patch series "Add tracepoints for lowmem reserves, watermarks
and totalreserve_pages" from Martin Liu adds some more tracepoints to
the page allocator code.
- The 4 patch series "mm/madvise: cleanup requests validations and
classifications" from SeongJae Park cleans up some warts which SeongJae
observed during his earlier madvise work.
- The 3 patch series "mm/hwpoison: Fix regressions in memory failure
handling" from Shuai Xue addresses two quite serious regressions which
Shuai has observed in the memory-failure implementation.
- The 5 patch series "mm: reliable huge page allocator" from Johannes
Weiner makes huge page allocations cheaper and more reliable by reducing
fragmentation.
- The 5 patch series "Minor memcg cleanups & prep for memdescs" from
Matthew Wilcox is preparatory work for the future implementation of
memdescs.
- The 4 patch series "track memory used by balloon drivers" from Nico
Pache introduces a way to track memory used by our various balloon
drivers.
- The 2 patch series "mm/damon: introduce DAMOS filter type for active
pages" from Nhat Pham permits users to filter for active/inactive pages,
separately for file and anon pages.
- The 2 patch series "Adding Proactive Memory Reclaim Statistics" from
Hao Jia separates the proactive reclaim statistics from the direct
reclaim statistics.
- The 2 patch series "mm/vmscan: don't try to reclaim hwpoison folio"
from Jinjiang Tu fixes our handling of hwpoisoned pages within the
reclaim code.
-----BEGIN PGP SIGNATURE-----
iHQEABYKAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCZ+nZaAAKCRDdBJ7gKXxA
jsOWAPiP4r7CJHMZRK4eyJOkvS1a1r+TsIarrFZtjwvf/GIfAQCEG+JDxVfUaUSF
Ee93qSSLR1BkNdDw+931Pu0mXfbnBw==
=Pn2K
-----END PGP SIGNATURE-----
Merge tag 'mm-stable-2025-03-30-16-52' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull MM updates from Andrew Morton:
- The series "Enable strict percpu address space checks" from Uros
Bizjak uses x86 named address space qualifiers to provide
compile-time checking of percpu area accesses.
This has caused a small amount of fallout - two or three issues were
reported. In all cases the calling code was found to be incorrect.
- The series "Some cleanup for memcg" from Chen Ridong implements some
relatively monir cleanups for the memcontrol code.
- The series "mm: fixes for device-exclusive entries (hmm)" from David
Hildenbrand fixes a boatload of issues which David found then using
device-exclusive PTE entries when THP is enabled. More work is
needed, but this makes thins better - our own HMM selftests now
succeed.
- The series "mm: zswap: remove z3fold and zbud" from Yosry Ahmed
remove the z3fold and zbud implementations. They have been deprecated
for half a year and nobody has complained.
- The series "mm: further simplify VMA merge operation" from Lorenzo
Stoakes implements numerous simplifications in this area. No runtime
effects are anticipated.
- The series "mm/madvise: remove redundant mmap_lock operations from
process_madvise()" from SeongJae Park rationalizes the locking in the
madvise() implementation. Performance gains of 20-25% were observed
in one MADV_DONTNEED microbenchmark.
- The series "Tiny cleanup and improvements about SWAP code" from
Baoquan He contains a number of touchups to issues which Baoquan
noticed when working on the swap code.
- The series "mm: kmemleak: Usability improvements" from Catalin
Marinas implements a couple of improvements to the kmemleak
user-visible output.
- The series "mm/damon/paddr: fix large folios access and schemes
handling" from Usama Arif provides a couple of fixes for DAMON's
handling of large folios.
- The series "mm/damon/core: fix wrong and/or useless damos_walk()
behaviors" from SeongJae Park fixes a few issues with the accuracy of
kdamond's walking of DAMON regions.
- The series "expose mapping wrprotect, fix fb_defio use" from Lorenzo
Stoakes changes the interaction between framebuffer deferred-io and
core MM. No functional changes are anticipated - this is preparatory
work for the future removal of page structure fields.
- The series "mm/damon: add support for hugepage_size DAMOS filter"
from Usama Arif adds a DAMOS filter which permits the filtering by
huge page sizes.
- The series "mm: permit guard regions for file-backed/shmem mappings"
from Lorenzo Stoakes extends the guard region feature from its
present "anon mappings only" state. The feature now covers shmem and
file-backed mappings.
- The series "mm: batched unmap lazyfree large folios during
reclamation" from Barry Song cleans up and speeds up the unmapping
for pte-mapped large folios.
- The series "reimplement per-vma lock as a refcount" from Suren
Baghdasaryan puts the vm_lock back into the vma. Our reasons for
pulling it out were largely bogus and that change made the code more
messy. This patchset provides small (0-10%) improvements on one
microbenchmark.
- The series "Docs/mm/damon: misc DAMOS filters documentation fixes and
improves" from SeongJae Park does some maintenance work on the DAMON
docs.
- The series "hugetlb/CMA improvements for large systems" from Frank
van der Linden addresses a pile of issues which have been observed
when using CMA on large machines.
- The series "mm/damon: introduce DAMOS filter type for unmapped pages"
from SeongJae Park enables users of DMAON/DAMOS to filter my the
page's mapped/unmapped status.
- The series "zsmalloc/zram: there be preemption" from Sergey
Senozhatsky teaches zram to run its compression and decompression
operations preemptibly.
- The series "selftests/mm: Some cleanups from trying to run them" from
Brendan Jackman fixes a pile of unrelated issues which Brendan
encountered while runnimg our selftests.
- The series "fs/proc/task_mmu: add guard region bit to pagemap" from
Lorenzo Stoakes permits userspace to use /proc/pid/pagemap to
determine whether a particular page is a guard page.
- The series "mm, swap: remove swap slot cache" from Kairui Song
removes the swap slot cache from the allocation path - it simply
wasn't being effective.
- The series "mm: cleanups for device-exclusive entries (hmm)" from
David Hildenbrand implements a number of unrelated cleanups in this
code.
- The series "mm: Rework generic PTDUMP configs" from Anshuman Khandual
implements a number of preparatoty cleanups to the GENERIC_PTDUMP
Kconfig logic.
- The series "mm/damon: auto-tune aggregation interval" from SeongJae
Park implements a feedback-driven automatic tuning feature for
DAMON's aggregation interval tuning.
- The series "Fix lazy mmu mode" from Ryan Roberts fixes some issues in
powerpc, sparc and x86 lazy MMU implementations. Ryan did this in
preparation for implementing lazy mmu mode for arm64 to optimize
vmalloc.
- The series "mm/page_alloc: Some clarifications for migratetype
fallback" from Brendan Jackman reworks some commentary to make the
code easier to follow.
- The series "page_counter cleanup and size reduction" from Shakeel
Butt cleans up the page_counter code and fixes a size increase which
we accidentally added late last year.
- The series "Add a command line option that enables control of how
many threads should be used to allocate huge pages" from Thomas
Prescher does that. It allows the careful operator to significantly
reduce boot time by tuning the parallalization of huge page
initialization.
- The series "Fix calculations in trace_balance_dirty_pages() for cgwb"
from Tang Yizhou fixes the tracing output from the dirty page
balancing code.
- The series "mm/damon: make allow filters after reject filters useful
and intuitive" from SeongJae Park improves the handling of allow and
reject filters. Behaviour is made more consistent and the documention
is updated accordingly.
- The series "Switch zswap to object read/write APIs" from Yosry Ahmed
updates zswap to the new object read/write APIs and thus permits the
removal of some legacy code from zpool and zsmalloc.
- The series "Some trivial cleanups for shmem" from Baolin Wang does as
it claims.
- The series "fs/dax: Fix ZONE_DEVICE page reference counts" from
Alistair Popple regularizes the weird ZONE_DEVICE page refcount
handling in DAX, permittig the removal of a number of special-case
checks.
- The series "refactor mremap and fix bug" from Lorenzo Stoakes is a
preparatoty refactoring and cleanup of the mremap() code.
- The series "mm: MM owner tracking for large folios (!hugetlb) +
CONFIG_NO_PAGE_MAPCOUNT" from David Hildenbrand reworks the manner in
which we determine whether a large folio is known to be mapped
exclusively into a single MM.
- The series "mm/damon: add sysfs dirs for managing DAMOS filters based
on handling layers" from SeongJae Park adds a couple of new sysfs
directories to ease the management of DAMON/DAMOS filters.
- The series "arch, mm: reduce code duplication in mem_init()" from
Mike Rapoport consolidates many per-arch implementations of
mem_init() into code generic code, where that is practical.
- The series "mm/damon/sysfs: commit parameters online via
damon_call()" from SeongJae Park continues the cleaning up of sysfs
access to DAMON internal data.
- The series "mm: page_ext: Introduce new iteration API" from Luiz
Capitulino reworks the page_ext initialization to fix a boot-time
crash which was observed with an unusual combination of compile and
cmdline options.
- The series "Buddy allocator like (or non-uniform) folio split" from
Zi Yan reworks the code to split a folio into smaller folios. The
main benefit is lessened memory consumption: fewer post-split folios
are generated.
- The series "Minimize xa_node allocation during xarry split" from Zi
Yan reduces the number of xarray xa_nodes which are generated during
an xarray split.
- The series "drivers/base/memory: Two cleanups" from Gavin Shan
performs some maintenance work on the drivers/base/memory code.
- The series "Add tracepoints for lowmem reserves, watermarks and
totalreserve_pages" from Martin Liu adds some more tracepoints to the
page allocator code.
- The series "mm/madvise: cleanup requests validations and
classifications" from SeongJae Park cleans up some warts which
SeongJae observed during his earlier madvise work.
- The series "mm/hwpoison: Fix regressions in memory failure handling"
from Shuai Xue addresses two quite serious regressions which Shuai
has observed in the memory-failure implementation.
- The series "mm: reliable huge page allocator" from Johannes Weiner
makes huge page allocations cheaper and more reliable by reducing
fragmentation.
- The series "Minor memcg cleanups & prep for memdescs" from Matthew
Wilcox is preparatory work for the future implementation of memdescs.
- The series "track memory used by balloon drivers" from Nico Pache
introduces a way to track memory used by our various balloon drivers.
- The series "mm/damon: introduce DAMOS filter type for active pages"
from Nhat Pham permits users to filter for active/inactive pages,
separately for file and anon pages.
- The series "Adding Proactive Memory Reclaim Statistics" from Hao Jia
separates the proactive reclaim statistics from the direct reclaim
statistics.
- The series "mm/vmscan: don't try to reclaim hwpoison folio" from
Jinjiang Tu fixes our handling of hwpoisoned pages within the reclaim
code.
* tag 'mm-stable-2025-03-30-16-52' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (431 commits)
mm/page_alloc: remove unnecessary __maybe_unused in order_to_pindex()
x86/mm: restore early initialization of high_memory for 32-bits
mm/vmscan: don't try to reclaim hwpoison folio
mm/hwpoison: introduce folio_contain_hwpoisoned_page() helper
cgroup: docs: add pswpin and pswpout items in cgroup v2 doc
mm: vmscan: split proactive reclaim statistics from direct reclaim statistics
selftests/mm: speed up split_huge_page_test
selftests/mm: uffd-unit-tests support for hugepages > 2M
docs/mm/damon/design: document active DAMOS filter type
mm/damon: implement a new DAMOS filter type for active pages
fs/dax: don't disassociate zero page entries
MM documentation: add "Unaccepted" meminfo entry
selftests/mm: add commentary about 9pfs bugs
fork: use __vmalloc_node() for stack allocation
docs/mm: Physical Memory: Populate the "Zones" section
xen: balloon: update the NR_BALLOON_PAGES state
hv_balloon: update the NR_BALLOON_PAGES state
balloon_compaction: update the NR_BALLOON_PAGES state
meminfo: add a per node counter for balloon drivers
mm: remove references to folio in __memcg_kmem_uncharge_page()
...
Toolchain and infrastructure:
- Extract the 'pin-init' API from the 'kernel' crate and make it into
a standalone crate.
In order to do this, the contents are rearranged so that they can
easily be kept in sync with the version maintained out-of-tree that
other projects have started to use too (or plan to, like QEMU).
This will reduce the maintenance burden for Benno, who will now have
his own sub-tree, and will simplify future expected changes like the
move to use 'syn' to simplify the implementation.
- Add '#[test]'-like support based on KUnit.
We already had doctests support based on KUnit, which takes the
examples in our Rust documentation and runs them under KUnit.
Now, we are adding the beginning of the support for "normal" tests,
similar to those the '#[test]' tests in userspace Rust. For instance:
#[kunit_tests(my_suite)]
mod tests {
#[test]
fn my_test() {
assert_eq!(1 + 1, 2);
}
}
Unlike with doctests, the 'assert*!'s do not map to the KUnit
assertion APIs yet.
- Check Rust signatures at compile time for functions called from C by
name.
In particular, introduce a new '#[export]' macro that can be placed
in the Rust function definition. It will ensure that the function
declaration on the C side matches the signature on the Rust function:
#[export]
pub unsafe extern "C" fn my_function(a: u8, b: i32) -> usize {
// ...
}
The macro essentially forces the compiler to compare the types of
the actual Rust function and the 'bindgen'-processed C signature.
These cases are rare so far. In the future, we may consider
introducing another tool, 'cbindgen', to generate C headers
automatically. Even then, having these functions explicitly marked
may be a good idea anyway.
- Enable the 'raw_ref_op' Rust feature: it is already stable, and
allows us to use the new '&raw' syntax, avoiding a couple macros.
After everyone has migrated, we will disallow the macros.
- Pass the correct target to 'bindgen' on Usermode Linux.
- Fix 'rusttest' build in macOS.
'kernel' crate:
- New 'hrtimer' module: add support for setting up intrusive timers
without allocating when starting the timer. Add support for
'Pin<Box<_>>', 'Arc<_>', 'Pin<&_>' and 'Pin<&mut _>' as pointer types
for use with timer callbacks. Add support for setting clock source
and timer mode.
- New 'dma' module: add a simple DMA coherent allocator abstraction and
a test sample driver.
- 'list' module: make the linked list 'Cursor' point between elements,
rather than at an element, which is more convenient to us and allows
for cursors to empty lists; and document it with examples of how to
perform common operations with the provided methods.
- 'str' module: implement a few traits for 'BStr' as well as the
'strip_prefix()' method.
- 'sync' module: add 'Arc::as_ptr'.
- 'alloc' module: add 'Box::into_pin'.
- 'error' module: extend the 'Result' documentation, including a few
examples on different ways of handling errors, a warning about using
methods that may panic, and links to external documentation.
'macros' crate:
- 'module' macro: add the 'authors' key to support multiple authors.
The original key will be kept until everyone has migrated.
Documentation:
- Add error handling sections.
MAINTAINERS:
- Add Danilo Krummrich as reviewer of the Rust "subsystem".
- Add 'RUST [PIN-INIT]' entry with Benno Lossin as maintainer. It has
its own sub-tree.
- Add sub-tree for 'RUST [ALLOC]'.
- Add 'DMA MAPPING HELPERS DEVICE DRIVER API [RUST]' entry with Abdiel
Janulgue as primary maintainer. It will go through the sub-tree of
the 'RUST [ALLOC]' entry.
- Add 'HIGH-RESOLUTION TIMERS [RUST]' entry with Andreas Hindborg as
maintainer. It has its own sub-tree.
And a few other cleanups and improvements.
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEPjU5OPd5QIZ9jqqOGXyLc2htIW0FAmfpQgAACgkQGXyLc2ht
IW35CQ//VOIFKtG6qgHVMIxrmpT7YFsrAU41h+cHT2lzy5KiTqSYlCgd18SJ+Iyy
vi1ylfdyqOpH5EoO+opPN2H4E+VUlRJg7BkZrT4p1lgGDEKg1mtR/825TxquLNFM
A653f3FvK/scMb6X43kWNKGK/jnxlfxBGmUwIY4/p7+adIuZzXnNbPkV9XYGLx3r
8KIBKJ9gM52eXoCoF8XJpg6Vg/0rYWIet32OzYF0PvzSAOqUlH4keu15jeUo+59V
tgCzAkc2yV3oSo721KYlpPeCPKI5iVCzIcwT0n8fqraXtgGnaFPe5XF16U9Qvrjv
vRp5/dePAHwsOcj5ErzOgLMqGa1sqY76lxDI05PNcBJ8fBAhNEV/rpCTXs/wRagQ
xUZOdsQyEn0V/BOtV+dnwu410dElEeJdOAeojSYFm1gUay43a0e6yIboxn3Ylnfx
8jONSokZ/UFHX3wOFNqHeXsY+REB8Qq8OZXjNBZVFpKHNsICWA0G3BcCRnB1815k
0v7seSdrST78EJ/A5nM0a9gghuLzYgAN04SDx0FzKjb2mHs3PiVfXDvrNMCJ0pBW
zbF9RlvszKZStY5tpxdZ5Zh+f7rfYcnJHYhNpoP7DJr136iWP+NnHbk1lK6+o4WY
lPVdMMgUSUlEXIHgK2ebcb/I1KBrDYiPktmvKAFLrH3qVzhkLAU=
=PCxf
-----END PGP SIGNATURE-----
Merge tag 'rust-6.15' of git://git.kernel.org/pub/scm/linux/kernel/git/ojeda/linux
Pull Rust updates from Miguel Ojeda:
"Toolchain and infrastructure:
- Extract the 'pin-init' API from the 'kernel' crate and make it into
a standalone crate.
In order to do this, the contents are rearranged so that they can
easily be kept in sync with the version maintained out-of-tree that
other projects have started to use too (or plan to, like QEMU).
This will reduce the maintenance burden for Benno, who will now
have his own sub-tree, and will simplify future expected changes
like the move to use 'syn' to simplify the implementation.
- Add '#[test]'-like support based on KUnit.
We already had doctests support based on KUnit, which takes the
examples in our Rust documentation and runs them under KUnit.
Now, we are adding the beginning of the support for "normal" tests,
similar to those the '#[test]' tests in userspace Rust. For
instance:
#[kunit_tests(my_suite)]
mod tests {
#[test]
fn my_test() {
assert_eq!(1 + 1, 2);
}
}
Unlike with doctests, the 'assert*!'s do not map to the KUnit
assertion APIs yet.
- Check Rust signatures at compile time for functions called from C
by name.
In particular, introduce a new '#[export]' macro that can be placed
in the Rust function definition. It will ensure that the function
declaration on the C side matches the signature on the Rust
function:
#[export]
pub unsafe extern "C" fn my_function(a: u8, b: i32) -> usize {
// ...
}
The macro essentially forces the compiler to compare the types of
the actual Rust function and the 'bindgen'-processed C signature.
These cases are rare so far. In the future, we may consider
introducing another tool, 'cbindgen', to generate C headers
automatically. Even then, having these functions explicitly marked
may be a good idea anyway.
- Enable the 'raw_ref_op' Rust feature: it is already stable, and
allows us to use the new '&raw' syntax, avoiding a couple macros.
After everyone has migrated, we will disallow the macros.
- Pass the correct target to 'bindgen' on Usermode Linux.
- Fix 'rusttest' build in macOS.
'kernel' crate:
- New 'hrtimer' module: add support for setting up intrusive timers
without allocating when starting the timer. Add support for
'Pin<Box<_>>', 'Arc<_>', 'Pin<&_>' and 'Pin<&mut _>' as pointer
types for use with timer callbacks. Add support for setting clock
source and timer mode.
- New 'dma' module: add a simple DMA coherent allocator abstraction
and a test sample driver.
- 'list' module: make the linked list 'Cursor' point between
elements, rather than at an element, which is more convenient to us
and allows for cursors to empty lists; and document it with
examples of how to perform common operations with the provided
methods.
- 'str' module: implement a few traits for 'BStr' as well as the
'strip_prefix()' method.
- 'sync' module: add 'Arc::as_ptr'.
- 'alloc' module: add 'Box::into_pin'.
- 'error' module: extend the 'Result' documentation, including a few
examples on different ways of handling errors, a warning about
using methods that may panic, and links to external documentation.
'macros' crate:
- 'module' macro: add the 'authors' key to support multiple authors.
The original key will be kept until everyone has migrated.
Documentation:
- Add error handling sections.
MAINTAINERS:
- Add Danilo Krummrich as reviewer of the Rust "subsystem".
- Add 'RUST [PIN-INIT]' entry with Benno Lossin as maintainer. It has
its own sub-tree.
- Add sub-tree for 'RUST [ALLOC]'.
- Add 'DMA MAPPING HELPERS DEVICE DRIVER API [RUST]' entry with
Abdiel Janulgue as primary maintainer. It will go through the
sub-tree of the 'RUST [ALLOC]' entry.
- Add 'HIGH-RESOLUTION TIMERS [RUST]' entry with Andreas Hindborg as
maintainer. It has its own sub-tree.
And a few other cleanups and improvements"
* tag 'rust-6.15' of git://git.kernel.org/pub/scm/linux/kernel/git/ojeda/linux: (71 commits)
rust: dma: add `Send` implementation for `CoherentAllocation`
rust: macros: fix `make rusttest` build on macOS
rust: block: refactor to use `&raw mut`
rust: enable `raw_ref_op` feature
rust: uaccess: name the correct function
rust: rbtree: fix comments referring to Box instead of KBox
rust: hrtimer: add maintainer entry
rust: hrtimer: add clocksource selection through `ClockId`
rust: hrtimer: add `HrTimerMode`
rust: hrtimer: implement `HrTimerPointer` for `Pin<Box<T>>`
rust: alloc: add `Box::into_pin`
rust: hrtimer: implement `UnsafeHrTimerPointer` for `Pin<&mut T>`
rust: hrtimer: implement `UnsafeHrTimerPointer` for `Pin<&T>`
rust: hrtimer: add `hrtimer::ScopedHrTimerPointer`
rust: hrtimer: add `UnsafeHrTimerPointer`
rust: hrtimer: allow timer restart from timer handler
rust: str: implement `strip_prefix` for `BStr`
rust: str: implement `AsRef<BStr>` for `[u8]` and `BStr`
rust: str: implement `Index` for `BStr`
rust: str: implement `PartialEq` for `BStr`
...
- Use RCU instead of RCU-sched
The mix of rcu_read_lock(), rcu_read_lock_sched() and preempt_disable()
in the module code and its users has been replaced with just
rcu_read_lock().
- The rest of changes are smaller fixes and updates.
The changes have been on linux-next for at least 2 weeks, with the RCU
cleanup present for 2 months. One performance problem was reported with the
RCU change when KASAN + lockdep were enabled, but it was effectively
addressed by the already merged ee57ab5a32 ("locking/lockdep: Disable
KASAN instrumentation of lockdep.c").
-----BEGIN PGP SIGNATURE-----
iQFIBAABCAAyFiEEIduBR9MnFA82q/jtumpXJwqY6poFAmfmwrsUHHBldHIucGF2
bHVAc3VzZS5jb20ACgkQumpXJwqY6prWxgf/S7Pvdywm10vJ6fooYa+GxXNMwhyh
XRjZ4m9gjeTNf2KLwX0XHv0XZeFHOmHfjd3iI+pS6CXZnCFTN9J3XPLYsrTxXUb6
U6zzLf8Zsz8TzeI4dgvSBsZln7oICSACkAgdJCq23hpNKeaeRo91dgiZaIwyZJG3
FekqSFtP7pYhfFoNkrFKysqbgl1+RWWZ79L2qRJA0bPzVFlvRUuh6cOHQw+8RMqf
BYLwnArjTkW8AcXpxIXSiwphDHVZ81B96xoplavyoprA5FDpv1W+8y4DtxdWFn+1
QVWCs/ZV3KrwXWpZev625w3fIOOIXILqRINOzLfvXTw+1xFS3TzSQEpVeg==
=4OKc
-----END PGP SIGNATURE-----
Merge tag 'modules-6.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/modules/linux
Pull modules updates from Petr Pavlu:
- Use RCU instead of RCU-sched
The mix of rcu_read_lock(), rcu_read_lock_sched() and
preempt_disable() in the module code and its users has
been replaced with just rcu_read_lock()
- The rest of changes are smaller fixes and updates
* tag 'modules-6.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/modules/linux: (32 commits)
MAINTAINERS: Update the MODULE SUPPORT section
module: Remove unnecessary size argument when calling strscpy()
module: Replace deprecated strncpy() with strscpy()
params: Annotate struct module_param_attrs with __counted_by()
bug: Use RCU instead RCU-sched to protect module_bug_list.
static_call: Use RCU in all users of __module_text_address().
kprobes: Use RCU in all users of __module_text_address().
bpf: Use RCU in all users of __module_text_address().
jump_label: Use RCU in all users of __module_text_address().
jump_label: Use RCU in all users of __module_address().
x86: Use RCU in all users of __module_address().
cfi: Use RCU while invoking __module_address().
powerpc/ftrace: Use RCU in all users of __module_text_address().
LoongArch: ftrace: Use RCU in all users of __module_text_address().
LoongArch/orc: Use RCU in all users of __module_address().
arm64: module: Use RCU in all users of __module_text_address().
ARM: module: Use RCU in all users of __module_text_address().
module: Use RCU in all users of __module_text_address().
module: Use RCU in all users of __module_address().
module: Use RCU in search_module_extables().
...
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEE+soXsSLHKoYyzcli6rmadz2vbToFAmfkHCQACgkQ6rmadz2v
bTrzWhAAnDcJsGgSJ9EbElpTfgBWE7aijXo/MsPxxRhORc0uR6MnhPx1iADP4KYj
lTGEIBgRuDG3qaM4EXpPd32rUJJHv8hot7z9zfvUgSuFNLZEHWXJtz/i4ileOxin
08zV+zA5WL2fqamAmMRFMI37DeSWy3xU0/qlbWgNnURjPjRri6CF4rVFUWq+QMY+
XP8ITD/6nOLUR6Bq2M18aHnk2VJWkxVP9Oi+vz1VHbOjKaJC7ATa1+Q4qMqWyTb1
8IAYWiZR1ZPc214ITaspVzLoLb/wxHxy3QMrdAWAL6sjp0B4J8YxIq1qsBuR1FN7
TxTRQND/+LjqrAgs5AmFqz3ndKmahjGQWnQEh/rDYJtx+sLJk9hfsMIDF8Wmxuwl
RftdV0g9bPljR5Qgc9i8DNtEjoAbNjoP8xLjt9HfQakVl8V9jPe0bxZ5tJDf+T0M
n/VgEjaRzdXqFOLal6Z5wl/jkIn1l1kWQuCMI2z5Z0Ls+PlYX56xdZxfK2Rh3m+e
3W89vqj9ytJ3rZKG8DRsxukuHwnJ+Gia3XI2h/5cc8kEM5ss1Ase8oIkmrwaLd9x
+zVXNoDCCPRQgTStwItW+2YdFmE9uijhEZh9yPwT1/rtFuKd0oSebVIpjih/bGqH
mMN9gYO4+ArSbqku9X2lP3VjMOf6M6SZGm+PzG25PAMGzjqGqwk=
=AHTr
-----END PGP SIGNATURE-----
Merge tag 'bpf_try_alloc_pages' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Pull bpf try_alloc_pages() support from Alexei Starovoitov:
"The pull includes work from Sebastian, Vlastimil and myself with a lot
of help from Michal and Shakeel.
This is a first step towards making kmalloc reentrant to get rid of
slab wrappers: bpf_mem_alloc, kretprobe's objpool, etc. These patches
make page allocator safe from any context.
Vlastimil kicked off this effort at LSFMM 2024:
https://lwn.net/Articles/974138/
and we continued at LSFMM 2025:
https://lore.kernel.org/all/CAADnVQKfkGxudNUkcPJgwe3nTZ=xohnRshx9kLZBTmR_E1DFEg@mail.gmail.com/
Why:
SLAB wrappers bind memory to a particular subsystem making it
unavailable to the rest of the kernel. Some BPF maps in production
consume Gbytes of preallocated memory. Top 5 in Meta: 1.5G, 1.2G,
1.1G, 300M, 200M. Once we have kmalloc that works in any context BPF
map preallocation won't be necessary.
How:
Synchronous kmalloc/page alloc stack has multiple stages going from
fast to slow: cmpxchg16 -> slab_alloc -> new_slab -> alloc_pages ->
rmqueue_pcplist -> __rmqueue, where rmqueue_pcplist was already
relying on trylock.
This set changes rmqueue_bulk/rmqueue_buddy to attempt a trylock and
return ENOMEM if alloc_flags & ALLOC_TRYLOCK. It then wraps this
functionality into try_alloc_pages() helper. We make sure that the
logic is sane in PREEMPT_RT.
End result: try_alloc_pages()/free_pages_nolock() are safe to call
from any context.
try_kmalloc() for any context with similar trylock approach will
follow. It will use try_alloc_pages() when slab needs a new page.
Though such try_kmalloc/page_alloc() is an opportunistic allocator,
this design ensures that the probability of successful allocation of
small objects (up to one page in size) is high.
Even before we have try_kmalloc(), we already use try_alloc_pages() in
BPF arena implementation and it's going to be used more extensively in
BPF"
* tag 'bpf_try_alloc_pages' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next:
mm: Fix the flipped condition in gfpflags_allow_spinning()
bpf: Use try_alloc_pages() to allocate pages for bpf needs.
mm, bpf: Use memcg in try_alloc_pages().
memcg: Use trylock to access memcg stock_lock.
mm, bpf: Introduce free_pages_nolock()
mm, bpf: Introduce try_alloc_pages() for opportunistic page allocation
locking/local_lock: Introduce localtry_lock_t
- Add sorting of mcount locations at build time
- Rework uaccess functions with C exception handling to shorten inline
assembly size and enable full inlining. This yields near-optimal code
for small constant copies with a ~40kb kernel size increase
- Add support for a configurable STRICT_MM_TYPECHECKS which allows to
generate better code, but also allows to have type checking for
debug builds
- Optimize get_lowcore() for common callers with alternatives that
nearly revert to the pre-relocated lowcore code, while also slightly
reducing syscall entry and exit time
- Convert MACHINE_HAS_* checks for single facility tests into cpu_has_*
style macros that call test_facility(), and for features with additional
conditions, add a new ALT_TYPE_FEATURE alternative to provide a static
branch via alternative patching. Also, move machine feature detection
to the decompressor for early patching and add debugging functionality
to easily show which alternatives are patched
- Add exception table support to early boot / startup code to get rid
of the open coded exception handling
- Use asm_inline for all inline assemblies with EX_TABLE or ALTERNATIVE
to ensure correct inlining and unrolling decisions
- Remove 2k page table leftovers now that s390 has been switched to
always allocate 4k page tables
- Split kfence pool into 4k mappings in arch_kfence_init_pool() and
remove the architecture-specific kfence_split_mapping()
- Use READ_ONCE_NOCHECK() in regs_get_kernel_stack_nth() to silence
spurious KASAN warnings from opportunistic ftrace argument tracing
- Force __atomic_add_const() variants on s390 to always return void,
ensuring compile errors for improper usage
- Remove s390's ioremap_wt() and pgprot_writethrough() due to mismatched
semantics and lack of known users, relying on asm-generic fallbacks
- Signal eventfd in vfio-ap to notify userspace when the guest AP
configuration changes, including during mdev removal
- Convert mdev_types from an array to a pointer in vfio-ccw and vfio-ap
drivers to avoid fake flex array confusion
- Cleanup trap code
- Remove references to the outdated linux390@de.ibm.com address
- Other various small fixes and improvements all over the code
-----BEGIN PGP SIGNATURE-----
iQEzBAABCAAdFiEE3QHqV+H2a8xAv27vjYWKoQLXFBgFAmfmuPwACgkQjYWKoQLX
FBgTDAgAjKmZ5OYjACRfYepTvKk9SDqa2CBlQZ+BhbAXEVIrxKnv8OkImAXoWNsM
mFxiCxAHWdcD+nqTrxFsXhkNLsndijlwnj/IqZgvy6R/3yNtBlAYRPLujOmVrsQB
dWB8Dl38p63Ip1JfAqyabiAOUjfhrclRcM5FX5tgciXA6N/vhY3OM6k0+k7wN4Nj
Dei/rCrnYRXTrFQgtM4w8JTIrwdnXjeKvaTYCflh4Q5ISJ7TceSF7cqq8HOs5hhK
o2ciaoTdx212522CIsxeN3Ls3jrn8bCOCoOeSCysc5RL84grAuFnmjSajo1LFide
S/TQtHXYy78Wuei9xvHi561ogiv/ww==
=Kxgc
-----END PGP SIGNATURE-----
Merge tag 's390-6.15-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux
Pull s390 updates from Vasily Gorbik:
- Add sorting of mcount locations at build time
- Rework uaccess functions with C exception handling to shorten inline
assembly size and enable full inlining. This yields near-optimal code
for small constant copies with a ~40kb kernel size increase
- Add support for a configurable STRICT_MM_TYPECHECKS which allows to
generate better code, but also allows to have type checking for debug
builds
- Optimize get_lowcore() for common callers with alternatives that
nearly revert to the pre-relocated lowcore code, while also slightly
reducing syscall entry and exit time
- Convert MACHINE_HAS_* checks for single facility tests into cpu_has_*
style macros that call test_facility(), and for features with
additional conditions, add a new ALT_TYPE_FEATURE alternative to
provide a static branch via alternative patching. Also, move machine
feature detection to the decompressor for early patching and add
debugging functionality to easily show which alternatives are patched
- Add exception table support to early boot / startup code to get rid
of the open coded exception handling
- Use asm_inline for all inline assemblies with EX_TABLE or ALTERNATIVE
to ensure correct inlining and unrolling decisions
- Remove 2k page table leftovers now that s390 has been switched to
always allocate 4k page tables
- Split kfence pool into 4k mappings in arch_kfence_init_pool() and
remove the architecture-specific kfence_split_mapping()
- Use READ_ONCE_NOCHECK() in regs_get_kernel_stack_nth() to silence
spurious KASAN warnings from opportunistic ftrace argument tracing
- Force __atomic_add_const() variants on s390 to always return void,
ensuring compile errors for improper usage
- Remove s390's ioremap_wt() and pgprot_writethrough() due to
mismatched semantics and lack of known users, relying on asm-generic
fallbacks
- Signal eventfd in vfio-ap to notify userspace when the guest AP
configuration changes, including during mdev removal
- Convert mdev_types from an array to a pointer in vfio-ccw and vfio-ap
drivers to avoid fake flex array confusion
- Cleanup trap code
- Remove references to the outdated linux390@de.ibm.com address
- Other various small fixes and improvements all over the code
* tag 's390-6.15-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (78 commits)
s390: Use inline qualifier for all EX_TABLE and ALTERNATIVE inline assemblies
s390/kfence: Split kfence pool into 4k mappings in arch_kfence_init_pool()
s390/ptrace: Avoid KASAN false positives in regs_get_kernel_stack_nth()
s390/boot: Ignore vmlinux.map
s390/sysctl: Remove "vm/allocate_pgste" sysctl
s390: Remove 2k vs 4k page table leftovers
s390/tlb: Use mm_has_pgste() instead of mm_alloc_pgste()
s390/lowcore: Use lghi instead llilh to clear register
s390/syscall: Merge __do_syscall() and do_syscall()
s390/spinlock: Implement SPINLOCK_LOCKVAL with inline assembly
s390/smp: Implement raw_smp_processor_id() with inline assembly
s390/current: Implement current with inline assembly
s390/lowcore: Use inline qualifier for get_lowcore() inline assembly
s390: Move s390 sysctls into their own file under arch/s390
s390/syscall: Simplify syscall_get_arguments()
s390/vfio-ap: Notify userspace that guest's AP config changed when mdev removed
s390: Remove ioremap_wt() and pgprot_writethrough()
s390/mm: Add configurable STRICT_MM_TYPECHECKS
s390/mm: Convert pgste_val() into function
s390/mm: Convert pgprot_val() into function
...
va_format() is using vsnprintf(), and GCC compiler (Debian 14.2.0-17)
is not happy about this:
lib/vsprintf.c:1704:9: error: function ‘va_format’ might be a candidate for ‘gnu_print ’ format attribute [-Werror=suggest-attribute=format]
Fix the compilation errors (`make W=1` when CONFIG_WERROR=y, which is default)
by silencing the false positive GCC warning.
Suggested-by: Rasmus Villemoes <ravi@prevas.dk>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-by: Kees Cook <kees@kernel.org>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Link: https://lore.kernel.org/r/20250321144822.324050-7-andriy.shevchenko@linux.intel.com
Signed-off-by: Petr Mladek <pmladek@suse.com>
va_format() doesn't use original formatting string, drop that
argument as it's done elsewhere in similar cases.
Suggested-by: Rasmus Villemoes <ravi@prevas.dk>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-by: Kees Cook <kees@kernel.org>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Link: https://lore.kernel.org/r/20250321144822.324050-6-andriy.shevchenko@linux.intel.com
Signed-off-by: Petr Mladek <pmladek@suse.com>
kunit tool:
- Changes to kunit tool to use qboot on QEMU x86_64, and build GDB scripts.
- Fixes kunit tool bug in parsing test plan.
- Adds test to kunit tool to check parsing late test plan.
kunit:
- Clarifies kunit_skip() argument name.
- Adds Kunit check for the longest symbol length.
- Changes qemu_configs for sparc to use Zilog console.
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEPZKym/RZuOCGeA/kCwJExA0NQxwFAmfkvDYACgkQCwJExA0N
Qxwljg//ZRoF/Jncvlb0vapnOIYywHbJEPRVTKfNurRjhb7stAX7CpLKXing4Gtq
ewy3UXRaAZKg1BvugDYWoUsDDD5o7jx6y9rOMOWM+aAHPzYgxY6gbIzyUVolNZg/
50/ANMhT0bvME8KBB2k2l6p1NAblzOpH3zH35CCDL/40eVodwMPrhq0V5AqccOaE
C5Bn+tDiviS6Icw+b/mVUw8fvmoJSTSKvdjaSeRAqThJN3KtqBVyX383++A1zNqy
Y6tItu9wG06FDjuQ1miOlSMwhgMEYK4TS4GwbX4PUucR8ETaZNUXVviMRou7vMEa
GGOdtsBG3CBgFNtO2VK1qJLWbJesw2G9+w2oIZ2KQKtyfoF7nDMj+DBO2QD/T+GB
u2g/xlSDJ5PTzZBMVKENDMy+C9Q+ux8Y2PsQ0fTCdpYgadytKYBFA23EAiZaMdKa
d1AweNvFS5gi8WkpS8SyMjs0D5pZnKMgHQqOIfRFjCi0HXsGE9RJfkOjLOzRnaOc
zldLAgDcrhtdG8Xin08bux5UuCoqg/e/RJiXF+xQLLJkE7cltN/CuWMrHX4kija+
8xmJtj4Oe0p7JCwnIaXjLAQDuFfxHYHM9wM0nKm+YpVJLPSWqSXk4+xtQEOlvZhN
DJW61ez+pYVCmXuIZ/bgeRzpwXJMfALmI3kn+UtCYwqdTt6Xhp8=
=h8xS
-----END PGP SIGNATURE-----
Merge tag 'linux_kselftest-kunit-6.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest
Pull kunit updates from Shuah Khan:
"kunit tool:
- Changes to kunit tool to use qboot on QEMU x86_64, and build GDB
scripts
- Fixes kunit tool bug in parsing test plan
- Adds test to kunit tool to check parsing late test plan
kunit:
- Clarifies kunit_skip() argument name
- Adds Kunit check for the longest symbol length
- Changes qemu_configs for sparc to use Zilog console"
* tag 'linux_kselftest-kunit-6.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
kunit: tool: add test to check parsing late test plan
kunit: tool: Fix bug in parsing test plan
Kunit to check the longest symbol length
kunit: Clarify kunit_skip() argument name
kunit: tool: Build GDB scripts
kunit: qemu_configs: sparc: use Zilog console
kunit: tool: Use qboot on QEMU x86_64
This is mainly set of cleanups of asm-generic/io.h, resolving problems
with inconsistent semantics of ioread64/iowrite64 that were causing
runtime and build issues.
The "GENERIC_IOMAP" version that switches between inb()/outb() and
readb()/writeb() style accessors is now only used on architectures that
have PC-style ISA devices that are not memory mapped (x86, uml, m68k-q40
and powerpc-powernv), while alpha and parisc use a more complicated
variant and everything else just maps the ioread interfaces to plan MMIO
(readb/writeb etc).
In addition there are two small changes from Raag Jadav to simplify
the asm-generic/io.h indirect inclusions and from Jann Horn to fix
a corner case with read_word_at_a_time.
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEiK/NIGsWEZVxh/FrYKtH/8kJUicFAmfkb2MACgkQYKtH/8kJ
UicZMg//Va7h0cZBAM64yvHH9SJ1JrM2u4oZNspvcWuncpqaDp3/lFAUBf1m0m46
PhZ8mJmVm/qD7DH8uJRA4kI9t0hjeI1nwb2Pgo60omEpZKY2nIJMsJMIluQYEdAt
nthz9RUvNOu0WSR/zMVmLfEAtncNewJzyUrlTnoQnIM9S+WQ8e5f1TxZbaz754Cb
XYOpfZNj4nyP3wXtMedee3eZiKKxs/OcZBLoGyKnrBIkUbHCucXsAL962SoI3AXr
pMjAIVNC1588fhOc2fA9Jl3K73j8Tj7/34UM+ztd5wxI1lwepxq4EDOCyJrhF5Oh
z7oZ4laGoIc4i1aSrUWFK10TrcSBvC9D3zvUjYL8ryYw3HrpB3VppcObpCBtpWZS
97LGSlwq8UmkQOXt8xFzffOEDSh97ojxJAvUUUtuQtnS7PbkmyZ/OCnddBb0F7pa
Bg68mzzZHm8/WUCMXwKxh+GA+qVZsMsPaPaexS/aG/TuV7+Mnj93GY1GSkj3Qzaw
T9eUuGnFRCvSHU/WJ/Lrl4X1dFdWgHAbSOMNZBVfRFgSUt1ypChV1Sqt2jEfe6Uv
dEeD84vZ0uhTsLoFVv/V4xY0osGKL+kAAtEwszLPfmP43kH+jC7cD3+CSTHW0IgV
EHuFcjv2CraTF3wvX8Mph6ivoh1EwW/ycFm2mw8onloUUZaoMHM=
=6j9g
-----END PGP SIGNATURE-----
Merge tag 'asm-generic-6.15-2' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic
Pull asm-generic updates from Arnd Bergmann:
"This is mainly set of cleanups of asm-generic/io.h, resolving problems
with inconsistent semantics of ioread64/iowrite64 that were causing
runtime and build issues.
The "GENERIC_IOMAP" version that switches between inb()/outb() and
readb()/writeb() style accessors is now only used on architectures
that have PC-style ISA devices that are not memory mapped (x86, uml,
m68k-q40 and powerpc-powernv), while alpha and parisc use a more
complicated variant and everything else just maps the ioread
interfaces to plan MMIO (readb/writeb etc).
In addition there are two small changes from Raag Jadav to simplify
the asm-generic/io.h indirect inclusions and from Jann Horn to fix a
corner case with read_word_at_a_time"
* tag 'asm-generic-6.15-2' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic:
rwonce: fix crash by removing READ_ONCE() for unaligned read
rwonce: handle KCSAN like KASAN in read_word_at_a_time()
m68k: coldfire: select PCI_IOMAP for PCI
mips: export pci_iounmap()
mips: fix PCI_IOBASE definition
m68k/nommu: stop using GENERIC_IOMAP
mips: drop GENERIC_IOMAP wrapper
powerpc: asm/io.h: remove split ioread64/iowrite64 helpers
parisc: stop using asm-generic/iomap.h
sh: remove duplicate ioread/iowrite helpers
alpha: stop using asm-generic/iomap.h
io.h: drop unused headers
drm/draw: include missing headers
asm-generic/io.h: rework split ioread64/iowrite64 helpers
Core & protocols
----------------
- Continue Netlink conversions to per-namespace RTNL lock
(IPv4 routing, routing rules, routing next hops, ARP ioctls).
- Continue extending the use of netdev instance locks. As a driver
opt-in protect queue operations and (in due course) ethtool
operations with the instance lock and not RTNL lock.
- Support collecting TCP timestamps (data submitted, sent, acked)
in BPF, allowing for transparent (to the application) and lower
overhead tracking of TCP RPC performance.
- Tweak existing networking Rx zero-copy infra to support zero-copy
Rx via io_uring.
- Optimize MPTCP performance in single subflow mode by 29%.
- Enable GRO on packets which went thru XDP CPU redirect (were queued
for processing on a different CPU). Improving TCP stream performance
up to 2x.
- Improve performance of contended connect() by 200% by searching
for an available 4-tuple under RCU rather than a spin lock.
Bring an additional 229% improvement by tweaking hash distribution.
- Avoid unconditionally touching sk_tsflags on RX, improving
performance under UDP flood by as much as 10%.
- Avoid skb_clone() dance in ping_rcv() to improve performance under
ping flood.
- Avoid FIB lookup in netfilter if socket is available, 20% perf win.
- Rework network device creation (in-kernel) API to more clearly
identify network namespaces and their roles.
There are up to 4 namespace roles but we used to have just 2 netns
pointer arguments, interpreted differently based on context.
- Use sysfs_break_active_protection() instead of trylock to avoid
deadlocks between unregistering objects and sysfs access.
- Add a new sysctl and sockopt for capping max retransmit timeout
in TCP.
- Support masking port and DSCP in routing rule matches.
- Support dumping IPv4 multicast addresses with RTM_GETMULTICAST.
- Support specifying at what time packet should be sent on AF_XDP
sockets.
- Expose TCP ULP diagnostic info (for TLS and MPTCP) to non-admin users.
- Add Netlink YAML spec for WiFi (nl80211) and conntrack.
- Introduce EXPORT_IPV6_MOD() and EXPORT_IPV6_MOD_GPL() for symbols
which only need to be exported when IPv6 support is built as a module.
- Age FDB entries based on Rx not Tx traffic in VxLAN, similar
to normal bridging.
- Allow users to specify source port range for GENEVE tunnels.
- netconsole: allow attaching kernel release, CPU ID and task name
to messages as metadata
Driver API
----------
- Continue rework / fixing of Energy Efficient Ethernet (EEE) across
the SW layers. Delegate the responsibilities to phylink where possible.
Improve its handling in phylib.
- Support symmetric OR-XOR RSS hashing algorithm.
- Support tracking and preserving IRQ affinity by NAPI itself.
- Support loopback mode speed selection for interface selftests.
Device drivers
--------------
- Remove the IBM LCS driver for s390.
- Remove the sb1000 cable modem driver.
- Add support for SFP module access over SMBus.
- Add MCTP transport driver for MCTP-over-USB.
- Enable XDP metadata support in multiple drivers.
- Ethernet high-speed NICs:
- Broadcom (bnxt):
- add PCIe TLP Processing Hints (TPH) support for new AMD platforms
- support dumping RoCE queue state for debug
- opt into instance locking
- Intel (100G, ice, idpf):
- ice: rework MSI-X IRQ management and distribution
- ice: support for E830 devices
- iavf: add support for Rx timestamping
- iavf: opt into instance locking
- nVidia/Mellanox:
- mlx4: use page pool memory allocator for Rx
- mlx5: support for one PTP device per hardware clock
- mlx5: support for 200Gbps per-lane link modes
- mlx5: move IPSec policy check after decryption
- AMD/Solarflare:
- support FW flashing via devlink
- Cisco (enic):
- use page pool memory allocator for Rx
- enable 32, 64 byte CQEs
- get max rx/tx ring size from the device
- Meta (fbnic):
- support flow steering and RSS configuration
- report queue stats
- support TCP segmentation
- support IRQ coalescing
- support ring size configuration
- Marvell/Cavium:
- support AF_XDP
- Wangxun:
- support for PTP clock and timestamping
- Huawei (hibmcge):
- checksum offload
- add more statistics
- Ethernet virtual:
- VirtIO net:
- aggressively suppress Tx completions, improve perf by 96% with
1 CPU and 55% with 2 CPUs
- expose NAPI to IRQ mapping and persist NAPI settings
- Google (gve):
- support XDP in DQO RDA Queue Format
- opt into instance locking
- Microsoft vNIC:
- support BIG TCP
- Ethernet NICs consumer, and embedded:
- Synopsys (stmmac):
- cleanup Tx and Tx clock setting and other link-focused cleanups
- enable SGMII and 2500BASEX mode switching for Intel platforms
- support Sophgo SG2044
- Broadcom switches (b53):
- support for BCM53101
- TI:
- iep: add perout configuration support
- icssg: support XDP
- Cadence (macb):
- implement BQL
- Xilinx (axinet):
- support dynamic IRQ moderation and changing coalescing at runtime
- implement BQL
- report standard stats
- MediaTek:
- support phylink managed EEE
- Intel:
- igc: don't restart the interface on every XDP program change
- RealTek (r8169):
- support reading registers of internal PHYs directly
- increase max jumbo packet size on RTL8125/RTL8126
- Airoha:
- support for RISC-V NPU packet processing unit
- enable scatter-gather and support MTU up to 9kB
- Tehuti (tn40xx):
- support cards with TN4010 MAC and an Aquantia AQR105 PHY
- Ethernet PHYs:
- support for TJA1102S, TJA1121
- dp83tg720: add randomized polling intervals for link detection
- dp83822: support changing the transmit amplitude voltage
- support for LEDs on 88q2xxx
- CAN:
- canxl: support Remote Request Substitution bit access
- flexcan: add S32G2/S32G3 SoC
- WiFi:
- remove cooked monitor support
- strict mode for better AP testing
- basic EPCS support
- OMI RX bandwidth reduction support
- batman-adv: add support for jumbo frames
- WiFi drivers:
- RealTek (rtw88):
- support RTL8814AE and RTL8814AU
- RealTek (rtw89):
- switch using wiphy_lock and wiphy_work
- add BB context to manipulate two PHY as preparation of MLO
- improve BT-coexistence mechanism to play A2DP smoothly
- Intel (iwlwifi):
- add new iwlmld sub-driver for latest HW/FW combinations
- MediaTek (mt76):
- preparation for mt7996 Multi-Link Operation (MLO) support
- Qualcomm/Atheros (ath12k):
- continued work on MLO
- Silabs (wfx):
- Wake-on-WLAN support
- Bluetooth:
- add support for skb TX SND/COMPLETION timestamping
- hci_core: enable buffer flow control for SCO/eSCO
- coredump: log devcd dumps into the monitor
- Bluetooth drivers:
- intel: add support to configure TX power
- nxp: handle bootloader error during cmd5 and cmd7
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEE6jPA+I1ugmIBA4hXMUZtbf5SIrsFAmfkLC8ACgkQMUZtbf5S
Irsb5g/+L7oKOf0ALbaV9kxFsoz8AymZfAW9i/27F07omGJGpks8oX6j6rQLgIRO
OQOFcp7XEdDh1+jh82gHVuPrw2/6lchLtW8ARtzdiQKFr5DRjrsbtua6GRc8iBqA
DIRCBFoV2HuMkF39Vr09HMa9AZAT7QR2RLsRGpSq8E8Z8xxKz0X7oujs10PFpMTE
IVKhTrVrk+NDot/IU2hzVpnpup+0ld+T2/ZaBklJGcU8uDffImsqNepHRyCG5UC3
xz74Ju23MAj24Gct+og0yFUooF+lUltKyVm0FYCDCY3bASTwgY01NR3kEH/0NQvM
cywLzd/ngHm/SMD2ggVAHkjZUieiIVHdaZ53dgjDeBOQoVP6p0dgUK7EumXX8Mx4
8ReR2UiGoYRPaq9c4o+IjG4K027MwVK2p+mF1a6MLa+20XcyMbev8FIRbbHtC/V4
z5/FsOAxcuICWkA1hU9bODrrGzIqemmdRgKG8sGuTJCt/kYGAn72/TCATGNSaCJ0
00n2jN1aepa7wtywHJ5MhVzxN9iQX7+geUHXz0BI+lK4e1Pmk+vjGksymb9ai2fk
eQAUV9ekub6q68/J16scD7XeOUM37bTLiMBQeIF8UtZBOJscKiS71zn9QP9Twwxv
P2pm01RDZUI+z5ZX3hc12Pm1vjRHaAh9S1JpAw/pTOVlQ+mAJEM=
=XY0S
-----END PGP SIGNATURE-----
Merge tag 'net-next-6.15' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next
Pull networking updates from Jakub Kicinski:
"Core & protocols:
- Continue Netlink conversions to per-namespace RTNL lock
(IPv4 routing, routing rules, routing next hops, ARP ioctls)
- Continue extending the use of netdev instance locks. As a driver
opt-in protect queue operations and (in due course) ethtool
operations with the instance lock and not RTNL lock.
- Support collecting TCP timestamps (data submitted, sent, acked) in
BPF, allowing for transparent (to the application) and lower
overhead tracking of TCP RPC performance.
- Tweak existing networking Rx zero-copy infra to support zero-copy
Rx via io_uring.
- Optimize MPTCP performance in single subflow mode by 29%.
- Enable GRO on packets which went thru XDP CPU redirect (were queued
for processing on a different CPU). Improving TCP stream
performance up to 2x.
- Improve performance of contended connect() by 200% by searching for
an available 4-tuple under RCU rather than a spin lock. Bring an
additional 229% improvement by tweaking hash distribution.
- Avoid unconditionally touching sk_tsflags on RX, improving
performance under UDP flood by as much as 10%.
- Avoid skb_clone() dance in ping_rcv() to improve performance under
ping flood.
- Avoid FIB lookup in netfilter if socket is available, 20% perf win.
- Rework network device creation (in-kernel) API to more clearly
identify network namespaces and their roles. There are up to 4
namespace roles but we used to have just 2 netns pointer arguments,
interpreted differently based on context.
- Use sysfs_break_active_protection() instead of trylock to avoid
deadlocks between unregistering objects and sysfs access.
- Add a new sysctl and sockopt for capping max retransmit timeout in
TCP.
- Support masking port and DSCP in routing rule matches.
- Support dumping IPv4 multicast addresses with RTM_GETMULTICAST.
- Support specifying at what time packet should be sent on AF_XDP
sockets.
- Expose TCP ULP diagnostic info (for TLS and MPTCP) to non-admin
users.
- Add Netlink YAML spec for WiFi (nl80211) and conntrack.
- Introduce EXPORT_IPV6_MOD() and EXPORT_IPV6_MOD_GPL() for symbols
which only need to be exported when IPv6 support is built as a
module.
- Age FDB entries based on Rx not Tx traffic in VxLAN, similar to
normal bridging.
- Allow users to specify source port range for GENEVE tunnels.
- netconsole: allow attaching kernel release, CPU ID and task name to
messages as metadata
Driver API:
- Continue rework / fixing of Energy Efficient Ethernet (EEE) across
the SW layers. Delegate the responsibilities to phylink where
possible. Improve its handling in phylib.
- Support symmetric OR-XOR RSS hashing algorithm.
- Support tracking and preserving IRQ affinity by NAPI itself.
- Support loopback mode speed selection for interface selftests.
Device drivers:
- Remove the IBM LCS driver for s390
- Remove the sb1000 cable modem driver
- Add support for SFP module access over SMBus
- Add MCTP transport driver for MCTP-over-USB
- Enable XDP metadata support in multiple drivers
- Ethernet high-speed NICs:
- Broadcom (bnxt):
- add PCIe TLP Processing Hints (TPH) support for new AMD
platforms
- support dumping RoCE queue state for debug
- opt into instance locking
- Intel (100G, ice, idpf):
- ice: rework MSI-X IRQ management and distribution
- ice: support for E830 devices
- iavf: add support for Rx timestamping
- iavf: opt into instance locking
- nVidia/Mellanox:
- mlx4: use page pool memory allocator for Rx
- mlx5: support for one PTP device per hardware clock
- mlx5: support for 200Gbps per-lane link modes
- mlx5: move IPSec policy check after decryption
- AMD/Solarflare:
- support FW flashing via devlink
- Cisco (enic):
- use page pool memory allocator for Rx
- enable 32, 64 byte CQEs
- get max rx/tx ring size from the device
- Meta (fbnic):
- support flow steering and RSS configuration
- report queue stats
- support TCP segmentation
- support IRQ coalescing
- support ring size configuration
- Marvell/Cavium:
- support AF_XDP
- Wangxun:
- support for PTP clock and timestamping
- Huawei (hibmcge):
- checksum offload
- add more statistics
- Ethernet virtual:
- VirtIO net:
- aggressively suppress Tx completions, improve perf by 96%
with 1 CPU and 55% with 2 CPUs
- expose NAPI to IRQ mapping and persist NAPI settings
- Google (gve):
- support XDP in DQO RDA Queue Format
- opt into instance locking
- Microsoft vNIC:
- support BIG TCP
- Ethernet NICs consumer, and embedded:
- Synopsys (stmmac):
- cleanup Tx and Tx clock setting and other link-focused
cleanups
- enable SGMII and 2500BASEX mode switching for Intel platforms
- support Sophgo SG2044
- Broadcom switches (b53):
- support for BCM53101
- TI:
- iep: add perout configuration support
- icssg: support XDP
- Cadence (macb):
- implement BQL
- Xilinx (axinet):
- support dynamic IRQ moderation and changing coalescing at
runtime
- implement BQL
- report standard stats
- MediaTek:
- support phylink managed EEE
- Intel:
- igc: don't restart the interface on every XDP program change
- RealTek (r8169):
- support reading registers of internal PHYs directly
- increase max jumbo packet size on RTL8125/RTL8126
- Airoha:
- support for RISC-V NPU packet processing unit
- enable scatter-gather and support MTU up to 9kB
- Tehuti (tn40xx):
- support cards with TN4010 MAC and an Aquantia AQR105 PHY
- Ethernet PHYs:
- support for TJA1102S, TJA1121
- dp83tg720: add randomized polling intervals for link detection
- dp83822: support changing the transmit amplitude voltage
- support for LEDs on 88q2xxx
- CAN:
- canxl: support Remote Request Substitution bit access
- flexcan: add S32G2/S32G3 SoC
- WiFi:
- remove cooked monitor support
- strict mode for better AP testing
- basic EPCS support
- OMI RX bandwidth reduction support
- batman-adv: add support for jumbo frames
- WiFi drivers:
- RealTek (rtw88):
- support RTL8814AE and RTL8814AU
- RealTek (rtw89):
- switch using wiphy_lock and wiphy_work
- add BB context to manipulate two PHY as preparation of MLO
- improve BT-coexistence mechanism to play A2DP smoothly
- Intel (iwlwifi):
- add new iwlmld sub-driver for latest HW/FW combinations
- MediaTek (mt76):
- preparation for mt7996 Multi-Link Operation (MLO) support
- Qualcomm/Atheros (ath12k):
- continued work on MLO
- Silabs (wfx):
- Wake-on-WLAN support
- Bluetooth:
- add support for skb TX SND/COMPLETION timestamping
- hci_core: enable buffer flow control for SCO/eSCO
- coredump: log devcd dumps into the monitor
- Bluetooth drivers:
- intel: add support to configure TX power
- nxp: handle bootloader error during cmd5 and cmd7"
* tag 'net-next-6.15' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1681 commits)
unix: fix up for "apparmor: add fine grained af_unix mediation"
mctp: Fix incorrect tx flow invalidation condition in mctp-i2c
net: usb: asix: ax88772: Increase phy_name size
net: phy: Introduce PHY_ID_SIZE — minimum size for PHY ID string
net: libwx: fix Tx L4 checksum
net: libwx: fix Tx descriptor content for some tunnel packets
atm: Fix NULL pointer dereference
net: tn40xx: add pci-id of the aqr105-based Tehuti TN4010 cards
net: tn40xx: prepare tn40xx driver to find phy of the TN9510 card
net: tn40xx: create swnode for mdio and aqr105 phy and add to mdiobus
net: phy: aquantia: add essential functions to aqr105 driver
net: phy: aquantia: search for firmware-name in fwnode
net: phy: aquantia: add probe function to aqr105 for firmware loading
net: phy: Add swnode support to mdiobus_scan
gve: add XDP DROP and PASS support for DQ
gve: update XDP allocation path support RX buffer posting
gve: merge packet buffer size fields
gve: update GQ RX to use buf_size
gve: introduce config-based allocation for XDP
gve: remove xdp_xsk_done and xdp_xsk_wakeup statistics
...
upstream tag v1.5.7-kernel, which is signed by upstream's signing key
EF8FE99528B52FFD.
Link: https://github.com/facebook/zstd/releases/tag/v1.5.7
Link: https://github.com/facebook/zstd/releases/tag/v1.5.7-kernel
Link: https://keyserver.ubuntu.com/pks/lookup?search=EF8FE99528B52FFD&fingerprint=on&op=index
Signed-off-by: Nick Terrell <terrelln@fb.com>
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEEmIwAqlFIzbQodPwyuzRpqaNEqPUFAmfjI5oACgkQuzRpqaNE
qPUZJA//foLgy1etgBSTaUbCCIFxOLguFjmH2qfs/0yGX1ekhlv5jXobyUmKhYVM
q0WR3G4lS1MNC40T9zNoKR0GfmZGyrCjOlGkwMEwdNYc+4y5sWujbckE+Xl/mgld
Gz1NEEentFNIeC5htnBX797PJldqawHl6OYax/+6gVZyeLPYfbNYtTGy30fQIvcz
vdIR/KCR2XzHn8+xah1zga5Ey/8LAXpgoYY9Pu3J3HWFRTV35laUe0nZ8EQ1mW3q
nGritp0453RFJgD1wHewp1CgJx9lAixPAMPZ5BCOqOxsCxyalbvefWc6u/cS3zJM
KEeKChyF6k5VqaW4A9jVeKq+HoGfngYjFJmELeKG4vK1d2UwMeDZRJ2IfkKej7xK
0awM0E0LO95H0mWEPhI3bmNbcfOLiJ4TIdWcr/sztF8Vv7fxKkK67Bwk4NTYmyPv
sgFZMEyw0eFYNf8/0j9FCATu7AgmbF3yes4vExuAy0cgZaiNxOxaAspRM2A8Tmdf
WWiAIsS6ZYwp6L1Gm6Rva+GRB15I3wevxOuEJ4kTsVVgvzgLQ+N6Fn5H5g2zgb1Q
hgRlJx6ivyRpoaJhbBB7tqNsK38lQ53i0DHQ21jkBHEPFmRzLnvC+D205Dz3tQK5
kwPAGOCbxoiQbqzhY4NOm75ZPxzy8OW7ygjow0HaX6fgv9Y9n9s=
=+maf
-----END PGP SIGNATURE-----
Merge tag 'zstd-linus-v6.15-rc1' of https://github.com/terrelln/linux
Pull zstd updates from Nick Terrell:
"Update zstd to the latest upstream release v1.5.7.
The two major motivations for updating Zstandard are to keep the code
up to date, and to expose API's needed by Intel for the QAT
compression accelerator.
Imported cleanly from the upstream tag v1.5.7-kernel, which is signed
by upstream's signing key EF8FE99528B52FFD"
Link: https://github.com/facebook/zstd/releases/tag/v1.5.7
Link: https://github.com/facebook/zstd/releases/tag/v1.5.7-kernel
Link: https://keyserver.ubuntu.com/pks/lookup?search=EF8FE99528B52FFD&fingerprint=on&op=index
* tag 'zstd-linus-v6.15-rc1' of https://github.com/terrelln/linux:
zstd: Import upstream v1.5.7
Another set of improvements to the kernel's CRC (cyclic redundancy
check) code:
- Rework the CRC64 library functions to be directly optimized, like what
I did last cycle for the CRC32 and CRC-T10DIF library functions.
- Rewrite the x86 PCLMULQDQ-optimized CRC code, and add VPCLMULQDQ
support and acceleration for crc64_be and crc64_nvme.
- Rewrite the riscv Zbc-optimized CRC code, and add acceleration for
crc_t10dif, crc64_be, and crc64_nvme.
- Remove crc_t10dif and crc64_rocksoft from the crypto API, since they
are no longer needed there.
- Rename crc64_rocksoft to crc64_nvme, as the old name was incorrect.
- Add kunit test cases for crc64_nvme and crc7.
- Eliminate redundant functions for calculating the Castagnoli CRC32,
settling on just crc32c().
- Remove unnecessary prompts from some of the CRC kconfig options.
- Further optimize the x86 crc32c code.
-----BEGIN PGP SIGNATURE-----
iIoEABYIADIWIQSacvsUNc7UX4ntmEPzXCl4vpKOKwUCZ+CGGhQcZWJpZ2dlcnNA
Z29vZ2xlLmNvbQAKCRDzXCl4vpKOK3wRAP4tbnzawUmlIHIF0hleoADXehUgAhMt
NZn15mGvyiuwIQEA8W9qvnLdFXZkdxhxAEvDDFjyrRauL6eGtr/GvCx4AQY=
=wmKG
-----END PGP SIGNATURE-----
Merge tag 'crc-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux
Pull CRC updates from Eric Biggers:
"Another set of improvements to the kernel's CRC (cyclic redundancy
check) code:
- Rework the CRC64 library functions to be directly optimized, like
what I did last cycle for the CRC32 and CRC-T10DIF library
functions
- Rewrite the x86 PCLMULQDQ-optimized CRC code, and add VPCLMULQDQ
support and acceleration for crc64_be and crc64_nvme
- Rewrite the riscv Zbc-optimized CRC code, and add acceleration for
crc_t10dif, crc64_be, and crc64_nvme
- Remove crc_t10dif and crc64_rocksoft from the crypto API, since
they are no longer needed there
- Rename crc64_rocksoft to crc64_nvme, as the old name was incorrect
- Add kunit test cases for crc64_nvme and crc7
- Eliminate redundant functions for calculating the Castagnoli CRC32,
settling on just crc32c()
- Remove unnecessary prompts from some of the CRC kconfig options
- Further optimize the x86 crc32c code"
* tag 'crc-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux: (36 commits)
x86/crc: drop the avx10_256 functions and rename avx10_512 to avx512
lib/crc: remove unnecessary prompt for CONFIG_CRC64
lib/crc: remove unnecessary prompt for CONFIG_LIBCRC32C
lib/crc: remove unnecessary prompt for CONFIG_CRC8
lib/crc: remove unnecessary prompt for CONFIG_CRC7
lib/crc: remove unnecessary prompt for CONFIG_CRC4
lib/crc7: unexport crc7_be_syndrome_table
lib/crc_kunit.c: update comment in crc_benchmark()
lib/crc_kunit.c: add test and benchmark for crc7_be()
x86/crc32: optimize tail handling for crc32c short inputs
riscv/crc64: add Zbc optimized CRC64 functions
riscv/crc-t10dif: add Zbc optimized CRC-T10DIF function
riscv/crc32: reimplement the CRC32 functions using new template
riscv/crc: add "template" for Zbc optimized CRC functions
x86/crc: add ANNOTATE_NOENDBR to suppress objtool warnings
x86/crc32: improve crc32c_arch() code generation with clang
x86/crc64: implement crc64_be and crc64_nvme using new template
x86/crc-t10dif: implement crc_t10dif using new template
x86/crc32: implement crc32_le using new template
x86/crc: add "template" for [V]PCLMULQDQ based CRC functions
...
- Consolidate the VDSO storage
The VDSO data storage and data layout has been largely architecture
specific for historical reasons. That increases the maintenance effort
and causes inconsistencies over and over.
There is no real technical reason for architecture specific layouts and
implementations. The architecture specific details can easily be
integrated into a generic layout, which also reduces the amount of
duplicated code for managing the mappings.
Convert all architectures over to a unified layout and common mapping
infrastructure. This splits the VDSO data layout into subsystem
specific blocks, timekeeping, random and architecture parts, which
provides a better structure and allows to improve and update the
functionalities without conflict and interaction.
- Rework the timekeeping data storage
The current implementation is designed for exposing system timekeeping
accessors, which was good enough at the time when it was designed.
PTP and Time Sensitive Networking (TSN) change that as there are
requirements to expose independent PTP clocks, which are not related to
system timekeeping.
Replace the monolithic data storage by a structured layout, which
allows to add support for independent PTP clocks on top while reusing
both the data structures and the time accessor implementations.
-----BEGIN PGP SIGNATURE-----
iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAmfgSWUTHHRnbHhAbGlu
dXRyb25peC5kZQAKCRCmGPVMDXSYoYGED/0f/M8YyacAyErDYW4ufW+zh2sUidSf
GVlK0Jn5BMljOoye+y2XfTxuvvXxEDjJNYiJm2uKGPdV29tjNXreGK39XyNqXPu5
jwR4f/IN/QVSM2nCO6jyydMz8ympJ2k6M4RewwmxXBL2KsUzzJWSKTgRNqM5Tdjs
1RhJMjkQVTiiSYerBpHXYCeZLM7/VEfZ120uuzVAYPXo0/R6zuyF7IBgIao9hbfO
IQeCMLLfpDQHQhwquTA8ZbWqQusiEoSYHT+kTDa3eXDDbE/2UklAUs9gaatI979x
73zs0Yqxyx2iIGaghACWOAbKdcBWBeCYDw5fFwYVKn4VMQi1+wcxbtOYL767jp9o
vfkLXGilXcVkvDjv4fH+e1NoJXXBxq1Ug1silKdOeJzenQF8Q1i3tavkWUVCNfwH
qyOIM72NiCEWbYBDcz0lwBxEAyO4o0E6NP1bDc4y50VedEYIbXwSh0QGrdev1abn
rjY9vsuUR9oznmZ6BRPPxMTY87gOSHoKvqydgSZUACEgLV9346f5qZf341OReYai
MXUmXOM4+LdyaM1+Mec8ppvjMbLw+736NZyZtT2InusEBE+Ddp25L3hYiWnklJu8
2uwv0AoyrwaJ8y6ADOX4thcLZq0gND0Z/Ayz/XvpeI30eftsGUCt5KOVlqwfwOkI
4EQKvk2fAixPxg==
=rwei
-----END PGP SIGNATURE-----
Merge tag 'timers-vdso-2025-03-23' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull VDSO infrastructure updates from Thomas Gleixner:
- Consolidate the VDSO storage
The VDSO data storage and data layout has been largely architecture
specific for historical reasons. That increases the maintenance
effort and causes inconsistencies over and over.
There is no real technical reason for architecture specific layouts
and implementations. The architecture specific details can easily be
integrated into a generic layout, which also reduces the amount of
duplicated code for managing the mappings.
Convert all architectures over to a unified layout and common mapping
infrastructure. This splits the VDSO data layout into subsystem
specific blocks, timekeeping, random and architecture parts, which
provides a better structure and allows to improve and update the
functionalities without conflict and interaction.
- Rework the timekeeping data storage
The current implementation is designed for exposing system
timekeeping accessors, which was good enough at the time when it was
designed.
PTP and Time Sensitive Networking (TSN) change that as there are
requirements to expose independent PTP clocks, which are not related
to system timekeeping.
Replace the monolithic data storage by a structured layout, which
allows to add support for independent PTP clocks on top while reusing
both the data structures and the time accessor implementations.
* tag 'timers-vdso-2025-03-23' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (55 commits)
sparc/vdso: Always reject undefined references during linking
x86/vdso: Always reject undefined references during linking
vdso: Rework struct vdso_time_data and introduce struct vdso_clock
vdso: Move architecture related data before basetime data
powerpc/vdso: Prepare introduction of struct vdso_clock
arm64/vdso: Prepare introduction of struct vdso_clock
x86/vdso: Prepare introduction of struct vdso_clock
time/namespace: Prepare introduction of struct vdso_clock
vdso/namespace: Rename timens_setup_vdso_data() to reflect new vdso_clock struct
vdso/vsyscall: Prepare introduction of struct vdso_clock
vdso/gettimeofday: Prepare helper functions for introduction of struct vdso_clock
vdso/gettimeofday: Prepare do_coarse_timens() for introduction of struct vdso_clock
vdso/gettimeofday: Prepare do_coarse() for introduction of struct vdso_clock
vdso/gettimeofday: Prepare do_hres_timens() for introduction of struct vdso_clock
vdso/gettimeofday: Prepare do_hres() for introduction of struct vdso_clock
vdso/gettimeofday: Prepare introduction of struct vdso_clock
vdso/helpers: Prepare introduction of struct vdso_clock
vdso/datapage: Define vdso_clock to prepare for multiple PTP clocks
vdso: Make vdso_time_data cacheline aligned
arm64: Make asm/cache.h compatible with vDSO
...
hrtimers are initialized with hrtimer_init() and a subsequent store to
the callback pointer. This turned out to be suboptimal for the upcoming
Rust integration and is obviously a silly implementation to begin with.
This cleanup replaces the hrtimer_init(T); T->function = cb; sequence
with hrtimer_setup(T, cb);
The conversion was done with Coccinelle and a few manual fixups.
Once the conversion has completely landed in mainline, hrtimer_init()
will be removed and the hrtimer::function becomes a private member.
-----BEGIN PGP SIGNATURE-----
iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAmff5jQTHHRnbHhAbGlu
dXRyb25peC5kZQAKCRCmGPVMDXSYoVvRD/wKtuwmiA66NJFgXC0qVq82A6fO3bY8
GBdbfysDJIbqGu5PTcULTbJ8qkqv3jeLUv6CcXvS4sZ7y/uJQl2lzf8yrD/0bbwc
rLI6sHiPSZmK93kNVN4X5H7kvt7cE/DYC9nnEOgK3BY5FgKc4n9887d4aVBhL8Lv
ODwVXvZ+xi351YCj7qRyPU24zt/p4tkkT1o2k4a0HBluqLI0D+V20fke9IERUL8r
d1uWKlcn0TqYDesE8HXKIhbst3gx52rMJrXBJDHwFmG6v8Pj1fkTXCVpPo8QcBz8
OTVkpomN9f/Tx4+GZwhZOF86LhLL3OhxD6pT7JhFCXdmSGv+Ez8uyk1YZysM/XpV
Juy/1yAcBpDIDkmhMFGdAAn48Nn9Fotty0r4je60zSEp1d/4QMXcFme29qr2JTUE
iWnQ/HD6DxUjVHqy7CYvvo26Xegg1C7qgyOVt4PYZwAM1VKF5P3kzYTb4SAdxtop
Tpji1sfW9QV08jqMNo6XntD32DSP9S2HqjO9LwBw700jnx2jjJ35fcJs6iodMOUn
gckIZLMn3L0OoglPdyA5O7SNTbKE7aFiRKdnT/cJtR3Fa39Qu27CwC5gfiyuie9I
Q+LG8GLuYSBHXAR+PBK4GWlzJ7Dn8k3eqmbnLeKpRMsU6ZzcttgA64xhaviN2wN0
iJbvLJeisXr3GA==
=bYAX
-----END PGP SIGNATURE-----
Merge tag 'timers-cleanups-2025-03-23' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull timer cleanups from Thomas Gleixner:
"A treewide hrtimer timer cleanup
hrtimers are initialized with hrtimer_init() and a subsequent store to
the callback pointer. This turned out to be suboptimal for the
upcoming Rust integration and is obviously a silly implementation to
begin with.
This cleanup replaces the hrtimer_init(T); T->function = cb; sequence
with hrtimer_setup(T, cb);
The conversion was done with Coccinelle and a few manual fixups.
Once the conversion has completely landed in mainline, hrtimer_init()
will be removed and the hrtimer::function becomes a private member"
* tag 'timers-cleanups-2025-03-23' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (100 commits)
wifi: rt2x00: Switch to use hrtimer_update_function()
io_uring: Use helper function hrtimer_update_function()
serial: xilinx_uartps: Use helper function hrtimer_update_function()
ASoC: fsl: imx-pcm-fiq: Switch to use hrtimer_setup()
RDMA: Switch to use hrtimer_setup()
virtio: mem: Switch to use hrtimer_setup()
drm/vmwgfx: Switch to use hrtimer_setup()
drm/xe/oa: Switch to use hrtimer_setup()
drm/vkms: Switch to use hrtimer_setup()
drm/msm: Switch to use hrtimer_setup()
drm/i915/request: Switch to use hrtimer_setup()
drm/i915/uncore: Switch to use hrtimer_setup()
drm/i915/pmu: Switch to use hrtimer_setup()
drm/i915/perf: Switch to use hrtimer_setup()
drm/i915/gvt: Switch to use hrtimer_setup()
drm/i915/huc: Switch to use hrtimer_setup()
drm/amdgpu: Switch to use hrtimer_setup()
stm class: heartbeat: Switch to use hrtimer_setup()
i2c: Switch to use hrtimer_setup()
iio: Switch to use hrtimer_setup()
...
Executing dql_reset after setting a non-zero value for limit_min can
lead to an unreasonable situation where dql->limit is less than
dql->limit_min.
For instance, after setting
/sys/class/net/eth*/queues/tx-0/byte_queue_limits/limit_min,
an ifconfig down/up operation might cause the ethernet driver to call
netdev_tx_reset_queue, which in turn invokes dql_reset.
In this case, dql->limit is reset to 0 while dql->limit_min remains
non-zero value, which is unexpected. The limit should always be
greater than or equal to limit_min.
Signed-off-by: Jing Su <jingsusu@didiglobal.com>
Link: https://patch.msgid.link/Z9qHD1s/NEuQBdgH@pilot-ThinkCentre-M930t-N000
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
There is no helpers for user to check if a given ID is allocated or not,
neither a helper to loop all the allocated IDs in an IDA and do something
for cleanup. With the two needs, a helper to get the lowest allocated ID
of a range and two variants based on it.
Caller can check if a given ID is allocated or not by:
bool ida_exists(struct ida *ida, unsigned int id)
Caller can iterate all allocated IDs by:
int id;
while ((id = ida_find_first(&pasid_ida)) >= 0) {
//anything to do with the allocated ID
ida_free(pasid_ida, pasid);
}
Link: https://patch.msgid.link/r/20250321180143.8468-2-yi.l.liu@intel.com
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Suggested-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Acked-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Yi Liu <yi.l.liu@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
two locking commits in the locking tree,
part of the locking-core-2025-03-22 pull request. ]
x86 CPU features support:
- Generate the <asm/cpufeaturemasks.h> header based on build config
(H. Peter Anvin, Xin Li)
- x86 CPUID parsing updates and fixes (Ahmed S. Darwish)
- Introduce the 'setcpuid=' boot parameter (Brendan Jackman)
- Enable modifying CPU bug flags with '{clear,set}puid='
(Brendan Jackman)
- Utilize CPU-type for CPU matching (Pawan Gupta)
- Warn about unmet CPU feature dependencies (Sohil Mehta)
- Prepare for new Intel Family numbers (Sohil Mehta)
Percpu code:
- Standardize & reorganize the x86 percpu layout and
related cleanups (Brian Gerst)
- Convert the stackprotector canary to a regular percpu
variable (Brian Gerst)
- Add a percpu subsection for cache hot data (Brian Gerst)
- Unify __pcpu_op{1,2}_N() macros to __pcpu_op_N() (Uros Bizjak)
- Construct __percpu_seg_override from __percpu_seg (Uros Bizjak)
MM:
- Add support for broadcast TLB invalidation using AMD's INVLPGB instruction
(Rik van Riel)
- Rework ROX cache to avoid writable copy (Mike Rapoport)
- PAT: restore large ROX pages after fragmentation
(Kirill A. Shutemov, Mike Rapoport)
- Make memremap(MEMREMAP_WB) map memory as encrypted by default
(Kirill A. Shutemov)
- Robustify page table initialization (Kirill A. Shutemov)
- Fix flush_tlb_range() when used for zapping normal PMDs (Jann Horn)
- Clear _PAGE_DIRTY for kernel mappings when we clear _PAGE_RW
(Matthew Wilcox)
KASLR:
- x86/kaslr: Reduce KASLR entropy on most x86 systems,
to support PCI BAR space beyond the 10TiB region
(CONFIG_PCI_P2PDMA=y) (Balbir Singh)
CPU bugs:
- Implement FineIBT-BHI mitigation (Peter Zijlstra)
- speculation: Simplify and make CALL_NOSPEC consistent (Pawan Gupta)
- speculation: Add a conditional CS prefix to CALL_NOSPEC (Pawan Gupta)
- RFDS: Exclude P-only parts from the RFDS affected list (Pawan Gupta)
System calls:
- Break up entry/common.c (Brian Gerst)
- Move sysctls into arch/x86 (Joel Granados)
Intel LAM support updates: (Maciej Wieczor-Retman)
- selftests/lam: Move cpu_has_la57() to use cpuinfo flag
- selftests/lam: Skip test if LAM is disabled
- selftests/lam: Test get_user() LAM pointer handling
AMD SMN access updates:
- Add SMN offsets to exclusive region access (Mario Limonciello)
- Add support for debugfs access to SMN registers (Mario Limonciello)
- Have HSMP use SMN through AMD_NODE (Yazen Ghannam)
Power management updates: (Patryk Wlazlyn)
- Allow calling mwait_play_dead with an arbitrary hint
- ACPI/processor_idle: Add FFH state handling
- intel_idle: Provide the default enter_dead() handler
- Eliminate mwait_play_dead_cpuid_hint()
Bootup:
Build system:
- Raise the minimum GCC version to 8.1 (Brian Gerst)
- Raise the minimum LLVM version to 15.0.0
(Nathan Chancellor)
Kconfig: (Arnd Bergmann)
- Add cmpxchg8b support back to Geode CPUs
- Drop 32-bit "bigsmp" machine support
- Rework CONFIG_GENERIC_CPU compiler flags
- Drop configuration options for early 64-bit CPUs
- Remove CONFIG_HIGHMEM64G support
- Drop CONFIG_SWIOTLB for PAE
- Drop support for CONFIG_HIGHPTE
- Document CONFIG_X86_INTEL_MID as 64-bit-only
- Remove old STA2x11 support
- Only allow CONFIG_EISA for 32-bit
Headers:
- Replace __ASSEMBLY__ with __ASSEMBLER__ in UAPI and non-UAPI headers
(Thomas Huth)
Assembly code & machine code patching:
- x86/alternatives: Simplify alternative_call() interface (Josh Poimboeuf)
- x86/alternatives: Simplify callthunk patching (Peter Zijlstra)
- KVM: VMX: Use named operands in inline asm (Josh Poimboeuf)
- x86/hyperv: Use named operands in inline asm (Josh Poimboeuf)
- x86/traps: Cleanup and robustify decode_bug() (Peter Zijlstra)
- x86/kexec: Merge x86_32 and x86_64 code using macros from <asm/asm.h>
(Uros Bizjak)
- Use named operands in inline asm (Uros Bizjak)
- Improve performance by using asm_inline() for atomic locking instructions
(Uros Bizjak)
Earlyprintk:
- Harden early_serial (Peter Zijlstra)
NMI handler:
- Add an emergency handler in nmi_desc & use it in nmi_shootdown_cpus()
(Waiman Long)
Miscellaneous fixes and cleanups:
- by Ahmed S. Darwish, Andy Shevchenko, Ard Biesheuvel,
Artem Bityutskiy, Borislav Petkov, Brendan Jackman, Brian Gerst,
Dan Carpenter, Dr. David Alan Gilbert, H. Peter Anvin,
Ingo Molnar, Josh Poimboeuf, Kevin Brodsky, Mike Rapoport,
Lukas Bulwahn, Maciej Wieczor-Retman, Max Grobecker,
Patryk Wlazlyn, Pawan Gupta, Peter Zijlstra,
Philip Redkin, Qasim Ijaz, Rik van Riel, Thomas Gleixner,
Thorsten Blum, Tom Lendacky, Tony Luck, Uros Bizjak,
Vitaly Kuznetsov, Xin Li, liuye.
Signed-off-by: Ingo Molnar <mingo@kernel.org>
-----BEGIN PGP SIGNATURE-----
iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAmfenkQRHG1pbmdvQGtl
cm5lbC5vcmcACgkQEnMQ0APhK1g1FRAAi6OFTSn/5aeLMI0IMNBxJ6ddQiFc3imd
7+C/vU5nul4CyDs8mKyj/+f/DDrbkG9lKz3VG631Yl237lXHjD8XWcVMeC/1z/q0
3zInDIloE9/nBHRPkF6F7fARBLBZ0LFgaBsGrCo7mwpGybiQdqGcqcxllvTbtXaw
OHta4q6ok+lBDNlfc0v6H4cRnzhmmlKu6Ng0j6UI3V7uFhi3vtxas32ltDQtzorq
2+jbV6/+kbrrv+xPC+jlzOFhTEKRupNPQXmvyQteoQg6G3kqAKMDvBthGXd1rHuX
Qa+BoDIifE/2NiVeRwNrhoqYH/pHCzUzDREW5IW8+ca+4XNKuzAC6EuC8CeCzyK1
q8ZjZjooQW4zEeVFeJYllHONzJYfxfSH5CLsnbcuhq99yfGlrQhF1qL72/Omn1w/
DfPJM8Zt5zyKvLqUg3Md+fkVCO2wyDNhB61QPzRgHF+yD+rvuDpoqvUWir+w7cSn
fwEDVZGXlFx6dumtSrqRaTd1nvFt80s8yP2ll09DMvGQ8D/yruS7hndGAmmJVCSW
NAfd8pSjq5v2+ux2UR92/Cc3VF3SjaUqHBOp/Nq9rESya18ZVa3cJpHhVYYtPIVf
THW0h07RIkGVKs1uq+5ekLCr/8uAZg58UPIqmhTuW0ttymRHCNfohR45FQZzy+0M
tJj1oc2TIZw=
=Dcb3
-----END PGP SIGNATURE-----
Merge tag 'x86-core-2025-03-22' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull core x86 updates from Ingo Molnar:
"x86 CPU features support:
- Generate the <asm/cpufeaturemasks.h> header based on build config
(H. Peter Anvin, Xin Li)
- x86 CPUID parsing updates and fixes (Ahmed S. Darwish)
- Introduce the 'setcpuid=' boot parameter (Brendan Jackman)
- Enable modifying CPU bug flags with '{clear,set}puid=' (Brendan
Jackman)
- Utilize CPU-type for CPU matching (Pawan Gupta)
- Warn about unmet CPU feature dependencies (Sohil Mehta)
- Prepare for new Intel Family numbers (Sohil Mehta)
Percpu code:
- Standardize & reorganize the x86 percpu layout and related cleanups
(Brian Gerst)
- Convert the stackprotector canary to a regular percpu variable
(Brian Gerst)
- Add a percpu subsection for cache hot data (Brian Gerst)
- Unify __pcpu_op{1,2}_N() macros to __pcpu_op_N() (Uros Bizjak)
- Construct __percpu_seg_override from __percpu_seg (Uros Bizjak)
MM:
- Add support for broadcast TLB invalidation using AMD's INVLPGB
instruction (Rik van Riel)
- Rework ROX cache to avoid writable copy (Mike Rapoport)
- PAT: restore large ROX pages after fragmentation (Kirill A.
Shutemov, Mike Rapoport)
- Make memremap(MEMREMAP_WB) map memory as encrypted by default
(Kirill A. Shutemov)
- Robustify page table initialization (Kirill A. Shutemov)
- Fix flush_tlb_range() when used for zapping normal PMDs (Jann Horn)
- Clear _PAGE_DIRTY for kernel mappings when we clear _PAGE_RW
(Matthew Wilcox)
KASLR:
- x86/kaslr: Reduce KASLR entropy on most x86 systems, to support PCI
BAR space beyond the 10TiB region (CONFIG_PCI_P2PDMA=y) (Balbir
Singh)
CPU bugs:
- Implement FineIBT-BHI mitigation (Peter Zijlstra)
- speculation: Simplify and make CALL_NOSPEC consistent (Pawan Gupta)
- speculation: Add a conditional CS prefix to CALL_NOSPEC (Pawan
Gupta)
- RFDS: Exclude P-only parts from the RFDS affected list (Pawan
Gupta)
System calls:
- Break up entry/common.c (Brian Gerst)
- Move sysctls into arch/x86 (Joel Granados)
Intel LAM support updates: (Maciej Wieczor-Retman)
- selftests/lam: Move cpu_has_la57() to use cpuinfo flag
- selftests/lam: Skip test if LAM is disabled
- selftests/lam: Test get_user() LAM pointer handling
AMD SMN access updates:
- Add SMN offsets to exclusive region access (Mario Limonciello)
- Add support for debugfs access to SMN registers (Mario Limonciello)
- Have HSMP use SMN through AMD_NODE (Yazen Ghannam)
Power management updates: (Patryk Wlazlyn)
- Allow calling mwait_play_dead with an arbitrary hint
- ACPI/processor_idle: Add FFH state handling
- intel_idle: Provide the default enter_dead() handler
- Eliminate mwait_play_dead_cpuid_hint()
Build system:
- Raise the minimum GCC version to 8.1 (Brian Gerst)
- Raise the minimum LLVM version to 15.0.0 (Nathan Chancellor)
Kconfig: (Arnd Bergmann)
- Add cmpxchg8b support back to Geode CPUs
- Drop 32-bit "bigsmp" machine support
- Rework CONFIG_GENERIC_CPU compiler flags
- Drop configuration options for early 64-bit CPUs
- Remove CONFIG_HIGHMEM64G support
- Drop CONFIG_SWIOTLB for PAE
- Drop support for CONFIG_HIGHPTE
- Document CONFIG_X86_INTEL_MID as 64-bit-only
- Remove old STA2x11 support
- Only allow CONFIG_EISA for 32-bit
Headers:
- Replace __ASSEMBLY__ with __ASSEMBLER__ in UAPI and non-UAPI
headers (Thomas Huth)
Assembly code & machine code patching:
- x86/alternatives: Simplify alternative_call() interface (Josh
Poimboeuf)
- x86/alternatives: Simplify callthunk patching (Peter Zijlstra)
- KVM: VMX: Use named operands in inline asm (Josh Poimboeuf)
- x86/hyperv: Use named operands in inline asm (Josh Poimboeuf)
- x86/traps: Cleanup and robustify decode_bug() (Peter Zijlstra)
- x86/kexec: Merge x86_32 and x86_64 code using macros from
<asm/asm.h> (Uros Bizjak)
- Use named operands in inline asm (Uros Bizjak)
- Improve performance by using asm_inline() for atomic locking
instructions (Uros Bizjak)
Earlyprintk:
- Harden early_serial (Peter Zijlstra)
NMI handler:
- Add an emergency handler in nmi_desc & use it in
nmi_shootdown_cpus() (Waiman Long)
Miscellaneous fixes and cleanups:
- by Ahmed S. Darwish, Andy Shevchenko, Ard Biesheuvel, Artem
Bityutskiy, Borislav Petkov, Brendan Jackman, Brian Gerst, Dan
Carpenter, Dr. David Alan Gilbert, H. Peter Anvin, Ingo Molnar,
Josh Poimboeuf, Kevin Brodsky, Mike Rapoport, Lukas Bulwahn, Maciej
Wieczor-Retman, Max Grobecker, Patryk Wlazlyn, Pawan Gupta, Peter
Zijlstra, Philip Redkin, Qasim Ijaz, Rik van Riel, Thomas Gleixner,
Thorsten Blum, Tom Lendacky, Tony Luck, Uros Bizjak, Vitaly
Kuznetsov, Xin Li, liuye"
* tag 'x86-core-2025-03-22' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (211 commits)
zstd: Increase DYNAMIC_BMI2 GCC version cutoff from 4.8 to 11.0 to work around compiler segfault
x86/asm: Make asm export of __ref_stack_chk_guard unconditional
x86/mm: Only do broadcast flush from reclaim if pages were unmapped
perf/x86/intel, x86/cpu: Replace Pentium 4 model checks with VFM ones
perf/x86/intel, x86/cpu: Simplify Intel PMU initialization
x86/headers: Replace __ASSEMBLY__ with __ASSEMBLER__ in non-UAPI headers
x86/headers: Replace __ASSEMBLY__ with __ASSEMBLER__ in UAPI headers
x86/locking/atomic: Improve performance by using asm_inline() for atomic locking instructions
x86/asm: Use asm_inline() instead of asm() in clwb()
x86/asm: Use CLFLUSHOPT and CLWB mnemonics in <asm/special_insns.h>
x86/hweight: Use asm_inline() instead of asm()
x86/hweight: Use ASM_CALL_CONSTRAINT in inline asm()
x86/hweight: Use named operands in inline asm()
x86/stackprotector/64: Only export __ref_stack_chk_guard on CONFIG_SMP
x86/head/64: Avoid Clang < 17 stack protector in startup code
x86/kexec: Merge x86_32 and x86_64 code using macros from <asm/asm.h>
x86/runtime-const: Add the RUNTIME_CONST_PTR assembly macro
x86/cpu/intel: Limit the non-architectural constant_tsc model checks
x86/mm/pat: Replace Intel x86_model checks with VFM ones
x86/cpu/intel: Fix fast string initialization for extended Families
...
[ Merge note, these two commits are identical:
- f3fa0e40df ("sched/clock: Don't define sched_clock_irqtime as static key")
- b9f2b29b94 ("sched: Don't define sched_clock_irqtime as static key")
The first one is a cherry-picked version of the second, and the first one
is already upstream. ]
Core & fair scheduler changes:
- Cancel the slice protection of the idle entity (Zihan Zhou)
- Reduce the default slice to avoid tasks getting an extra tick
(Zihan Zhou)
- Force propagating min_slice of cfs_rq when {en,de}queue tasks
(Tianchen Ding)
- Refactor can_migrate_task() to elimate looping (I Hsin Cheng)
- Add unlikey branch hints to several system calls (Colin Ian King)
- Optimize current_clr_polling() on certain architectures (Yujun Dong)
Deadline scheduler: (Juri Lelli)
- Remove redundant dl_clear_root_domain call
- Move dl_rebuild_rd_accounting to cpuset.h
Uclamp:
- Use the uclamp_is_used() helper instead of open-coding it (Xuewen Yan)
- Optimize sched_uclamp_used static key enabling (Xuewen Yan)
Scheduler topology support: (Juri Lelli)
- Ignore special tasks when rebuilding domains
- Add wrappers for sched_domains_mutex
- Generalize unique visiting of root domains
- Rebuild root domain accounting after every update
- Remove partition_and_rebuild_sched_domains
- Stop exposing partition_sched_domains_locked
RSEQ: (Michael Jeanson)
- Update kernel fields in lockstep with CONFIG_DEBUG_RSEQ=y
- Fix segfault on registration when rseq_cs is non-zero
- selftests: Add rseq syscall errors test
- selftests: Ensure the rseq ABI TLS is actually 1024 bytes
Membarriers:
- Fix redundant load of membarrier_state (Nysal Jan K.A.)
Scheduler debugging:
- Introduce and use preempt_model_str() (Sebastian Andrzej Siewior)
- Make CONFIG_SCHED_DEBUG unconditional (Ingo Molnar)
Fixes and cleanups:
- Always save/restore x86 TSC sched_clock() on suspend/resume
(Guilherme G. Piccoli)
- Misc fixes and cleanups (Thorsten Blum, Juri Lelli,
Sebastian Andrzej Siewior)
Signed-off-by: Ingo Molnar <mingo@kernel.org>
-----BEGIN PGP SIGNATURE-----
iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAmfejsoRHG1pbmdvQGtl
cm5lbC5vcmcACgkQEnMQ0APhK1ivkhAAwBF2tYRBS1oIHcC/OKK3JJoHVDp2LFbU
9sm5S3ZlGD/Ns2fbpY+9A8UFgUFfjYiTSV7hvf2B9Vge0XSxTmMNFu/MdxLBbo9r
w6GSeNcNDQKpjEGLkrmPFsa2fiYI4dmH0IzDbS9V2cNPk470QBKjAKXNPaSER691
n2wLnQq+m5o4gXnPjnSz6RrrzisRnm2GOWnDV/iqR47pZFNlX2wWlo3s5r7//Hw0
a+QfEfpgKehhy/VSDXmSAgpqnNffjc78yBV6LNoVUddwahnOWiQMS3XViOqgy5VO
jUGBrzW+sKkdRMBppxwJ/0XWgHGC27amIgnU0ZE5u+eiUEu8H9qWl1cRCFxyeB0O
8+WNfwmkH+FPWUdsn84kdePhSsZy6HfM6h44Xe0hx1V7tQXEXfbPzK3TnQg8Ktt1
Ky6ctbZt4cGpqGQuIqvba21A/racrD/DgvB7mHeZksnqZoKTDwxhT/nlQGpuwPoy
SJYd1ynFVJvfC69SwMdwnaglimvEZx1GfT0o5XtCMslY5NkWCou5u+e65WX7ccU5
94wBCwI1/+KiFMJZp6TlPw07Q/Hsj9dDryxLc3OunU3zMVt3++1GZnBS/eb5FN1A
5TlAEpxgH9c5Q4/XJFCvx21DrTVHSuIrR6naK91bgqHo0cEfaHGtO/ejPtJGSfxe
YIFnnu1dhRw=
=SuQE
-----END PGP SIGNATURE-----
Merge tag 'sched-core-2025-03-22' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull scheduler updates from Ingo Molnar:
"Core & fair scheduler changes:
- Cancel the slice protection of the idle entity (Zihan Zhou)
- Reduce the default slice to avoid tasks getting an extra tick
(Zihan Zhou)
- Force propagating min_slice of cfs_rq when {en,de}queue tasks
(Tianchen Ding)
- Refactor can_migrate_task() to elimate looping (I Hsin Cheng)
- Add unlikey branch hints to several system calls (Colin Ian King)
- Optimize current_clr_polling() on certain architectures (Yujun
Dong)
Deadline scheduler: (Juri Lelli)
- Remove redundant dl_clear_root_domain call
- Move dl_rebuild_rd_accounting to cpuset.h
Uclamp:
- Use the uclamp_is_used() helper instead of open-coding it (Xuewen
Yan)
- Optimize sched_uclamp_used static key enabling (Xuewen Yan)
Scheduler topology support: (Juri Lelli)
- Ignore special tasks when rebuilding domains
- Add wrappers for sched_domains_mutex
- Generalize unique visiting of root domains
- Rebuild root domain accounting after every update
- Remove partition_and_rebuild_sched_domains
- Stop exposing partition_sched_domains_locked
RSEQ: (Michael Jeanson)
- Update kernel fields in lockstep with CONFIG_DEBUG_RSEQ=y
- Fix segfault on registration when rseq_cs is non-zero
- selftests: Add rseq syscall errors test
- selftests: Ensure the rseq ABI TLS is actually 1024 bytes
Membarriers:
- Fix redundant load of membarrier_state (Nysal Jan K.A.)
Scheduler debugging:
- Introduce and use preempt_model_str() (Sebastian Andrzej Siewior)
- Make CONFIG_SCHED_DEBUG unconditional (Ingo Molnar)
Fixes and cleanups:
- Always save/restore x86 TSC sched_clock() on suspend/resume
(Guilherme G. Piccoli)
- Misc fixes and cleanups (Thorsten Blum, Juri Lelli, Sebastian
Andrzej Siewior)"
* tag 'sched-core-2025-03-22' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (40 commits)
cpuidle, sched: Use smp_mb__after_atomic() in current_clr_polling()
sched/debug: Remove CONFIG_SCHED_DEBUG
sched/debug: Remove CONFIG_SCHED_DEBUG from self-test config files
sched/debug, Documentation: Remove (most) CONFIG_SCHED_DEBUG references from documentation
sched/debug: Make CONFIG_SCHED_DEBUG functionality unconditional
sched/debug: Make 'const_debug' tunables unconditional __read_mostly
sched/debug: Change SCHED_WARN_ON() to WARN_ON_ONCE()
rseq/selftests: Fix namespace collision with rseq UAPI header
include/{topology,cpuset}: Move dl_rebuild_rd_accounting to cpuset.h
sched/topology: Stop exposing partition_sched_domains_locked
cgroup/cpuset: Remove partition_and_rebuild_sched_domains
sched/topology: Remove redundant dl_clear_root_domain call
sched/deadline: Rebuild root domain accounting after every update
sched/deadline: Generalize unique visiting of root domains
sched/topology: Wrappers for sched_domains_mutex
sched/deadline: Ignore special tasks when rebuilding domains
tracing: Use preempt_model_str()
xtensa: Rely on generic printing of preemption model
x86: Rely on generic printing of preemption model
s390: Rely on generic printing of preemption model
...
- The biggest change is the new option to automatically fail
the build on objtool warnings: CONFIG_OBJTOOL_WERROR.
While there are no currently known unfixed false positives
left, such an expansion in the severity of objtool warnings
inevitably creates a risk of build failures, so it's disabled by
default and depends on !COMPILE_TEST, so it shouldn't be enabled
on allyesconfig/allmodconfig builds and won't be forced on people
who just accept build-time defaults in 'make oldconfig'.
While the option is strongly recommended, only people who enable
it explicitly should see it.
(Josh Poimboeuf)
- Disable branch profiling in noinstr code with a broad
brush that includes all of arch/x86/ and kernel/sched/. (Josh Poimboeuf)
- Create backup object files on objtool errors and print exact
objtool arguments to make failure analysis easier (Josh Poimboeuf)
- Improve noreturn handling (Josh Poimboeuf)
- Improve rodata handling (Tiezhu Yang)
- Support jump tables, switch tables and goto tables on LoongArch (Tiezhu Yang)
- Misc cleanups and fixes (Josh Poimboeuf, David Engraf, Ingo Molnar)
Signed-off-by: Ingo Molnar <mingo@kernel.org>
-----BEGIN PGP SIGNATURE-----
iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAmfefkARHG1pbmdvQGtl
cm5lbC5vcmcACgkQEnMQ0APhK1inlRAAvd9Hom2qh9Iu+KYYF58vsg9zsxWZA6I1
blouKI4SUA8Xjuw6Nihx+emaPaMW1boGSLTNsFzrCa3S1+4UHTTp/Y8snZWJ/Mc/
Peg52N6u/LIcoQM+vNJYRtd9y4wabX87vl0qTxte0kB0Neps3/yQvxtUa2K1srXp
8nwHK+PdzNsgPuIrIiNc9ymsPvbqFHmVIRRVNVKX4BlPJi2kJs9kx43kszweQR/X
kW/bs1315m1HS5i02K0Zs/XdOZHLsk9ERu+aviBJV1txrgZIukATIqbODiI+3RZX
0oa3KxfzEVFN2k3OukrezV2INzETkN+oOSTAZIUOqwSVe+8rdQVBSdYT6svYn/yy
aS8Bi5Mm1nfizTU+cRrzU7FxWCmwsxm9r4fHPTV8Owjxg0uoGk/E/qlvERuR2rpA
p2tHMo1lp2Yo+VZBZPfm5KDHFG4tSGhF9eav2bqSI7/Kf5AWxRl8kBs5iLrcxsXh
4qk3FalnuM7A+1McAUcBJAvM897Yie0s2G83ZipyYyA6U3LSBhBMWh9FlIiAjuIh
YnX6IFkW9tVzVZpJFEGQn+2Ewl5Y2Go5bokKk03vkWCZCgg+hEUsVh6Cnm1ocZpO
Ll3/UF4i8XjjyuAuHDn6mzyHIgch2xRN02v7dJSb09+O9b8vIoPoSqbWSLJUOBqf
r6UesXDG8mY=
=iswJ
-----END PGP SIGNATURE-----
Merge tag 'objtool-core-2025-03-22' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull objtool updates from Ingo Molnar:
- The biggest change is the new option to automatically fail the build
on objtool warnings: CONFIG_OBJTOOL_WERROR.
While there are no currently known unfixed false positives left, such
an expansion in the severity of objtool warnings inevitably creates a
risk of build failures, so it's disabled by default and depends on
!COMPILE_TEST, so it shouldn't be enabled on
allyesconfig/allmodconfig builds and won't be forced on people who
just accept build-time defaults in 'make oldconfig'.
While the option is strongly recommended, only people who enable it
explicitly should see it.
(Josh Poimboeuf)
- Disable branch profiling in noinstr code with a broad brush that
includes all of arch/x86/ and kernel/sched/. (Josh Poimboeuf)
- Create backup object files on objtool errors and print exact objtool
arguments to make failure analysis easier (Josh Poimboeuf)
- Improve noreturn handling (Josh Poimboeuf)
- Improve rodata handling (Tiezhu Yang)
- Support jump tables, switch tables and goto tables on LoongArch
(Tiezhu Yang)
- Misc cleanups and fixes (Josh Poimboeuf, David Engraf, Ingo Molnar)
* tag 'objtool-core-2025-03-22' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (22 commits)
tracing: Disable branch profiling in noinstr code
objtool: Use O_CREAT with explicit mode mask
objtool: Add CONFIG_OBJTOOL_WERROR
objtool: Create backup on error and print args
objtool: Change "warning:" to "error:" for --Werror
objtool: Add --Werror option
objtool: Add --output option
objtool: Upgrade "Linked object detected" warning to error
objtool: Consolidate option validation
objtool: Remove --unret dependency on --rethunk
objtool: Increase per-function WARN_FUNC() rate limit
objtool: Update documentation
objtool: Improve __noreturn annotation warning
objtool: Fix error handling inconsistencies in check()
x86/traps: Make exc_double_fault() consistently noreturn
LoongArch: Enable jump table for objtool
objtool/LoongArch: Add support for goto table
objtool/LoongArch: Add support for switch table
objtool: Handle PC relative relocation type
objtool: Handle different entry size of rodata
...
-----BEGIN PGP SIGNATURE-----
iQEzBAABCAAdFiEEe7vIQRWZI0iWSE3xu+CwddJFiJoFAmfb4r0ACgkQu+CwddJF
iJq6NQf/WNEQAoRY1DEeQiBAvixTYry0j/w1dumpValvt/lybccMwwhWho5i17/o
2J4nif5L5O6D+jZWyz76fx2bcn7GjhteiKtzuVI0mSdDXyYLBLVGa9dMrE1/0kxy
51HnldCLfNmC3qp0pG2E7j2chsxDbTwz4ZPiEAW9kzpvgfEWmfydejzv5+ROFQm7
gH3vRJ7H5enxp2a52DovBN1JllYK9uxMTM3Pq1L37n9Hm1zIR+swbI/3VhklRN4C
nrO6my6GU2+bMQTvPKwuHBIHUH7yS6Z411wCotPmRO0jfLMq/UY5lthgWpqvsC+o
XtgULoikQbcd8kts9g71bHSEinwlGw==
=whkW
-----END PGP SIGNATURE-----
Merge tag 'slab-for-6.15' of git://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab
Pull slab updates from Vlastimil Babka:
- Move the TINY_RCU kvfree_rcu() implementation from RCU to SLAB
subsystem and cleanup its integration (Vlastimil Babka)
Following the move of the TREE_RCU batching kvfree_rcu()
implementation in 6.14, move also the simpler TINY_RCU variant.
Refactor the #ifdef guards so that the simple implementation is also
used with SLUB_TINY.
Remove the need for RCU to recognize fake callback function pointers
(__is_kvfree_rcu_offset()) when handling call_rcu() by implementing a
callback that calculates the object's address from the embedded
rcu_head address without knowing its offset.
- Improve kmalloc cache randomization in kvmalloc (GONG Ruiqi)
Due to an extra layer of function call, all kvmalloc() allocations
used the same set of random caches. Thanks to moving the kvmalloc()
implementation to slub.c, this is improved and randomization now
works for kvmalloc.
- Various improvements to debugging, testing and other cleanups (Hyesoo
Yu, Lilith Gkini, Uladzislau Rezki, Matthew Wilcox, Kevin Brodsky, Ye
Bin)
* tag 'slab-for-6.15' of git://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab:
slub: Handle freelist cycle in on_freelist()
mm/slab: call kmalloc_noprof() unconditionally in kmalloc_array_noprof()
slab: Mark large folios for debugging purposes
kunit, slub: Add test_kfree_rcu_wq_destroy use case
mm, slab: cleanup slab_bug() parameters
mm: slub: call WARN() when detecting a slab corruption
mm: slub: Print the broken data before restoring them
slab: Achieve better kmalloc caches randomization in kvmalloc
slab: Adjust placement of __kvmalloc_node_noprof
mm/slab: simplify SLAB_* flag handling
slab: don't batch kvfree_rcu() with SLUB_TINY
rcu, slab: use a regular callback function for kvfree_rcu
rcu: remove trace_rcu_kvfree_callback
slab, rcu: move TINY_RCU variant of kvfree_rcu() to SLAB
- loadpin: remove unsupported MODULE_COMPRESS_NONE (Arulpandiyan Vadivel)
- samples/check-exec: Fix script name (Mickaël Salaün)
- yama: remove needless locking in yama_task_prctl() (Oleg Nesterov)
- lib/string_choices: Sort by function name (R Sundar)
- hardening: Allow default HARDENED_USERCOPY to be set at compile time
(Mel Gorman)
- uaccess: Split out compile-time checks into ucopysize.h
- kbuild: clang: Support building UM with SUBARCH=i386
- x86: Enable i386 FORTIFY_SOURCE on Clang 16+
- ubsan/overflow: Rework integer overflow sanitizer option
- Add missing __nonstring annotations for callers of memtostr*()/strtomem*()
- Add __must_be_noncstr() and have memtostr*()/strtomem*() check for it
- Introduce __nonstring_array for silencing future GCC 15 warnings
-----BEGIN PGP SIGNATURE-----
iHUEABYKAB0WIQRSPkdeREjth1dHnSE2KwveOeQkuwUCZ9hGrgAKCRA2KwveOeQk
u1WvAQC3ZxFu3b0Omfmht2pPqCltf2UOQNvUx3egjoeXpUaNSgD+Lxr/T4xksy7E
jHh7rCYDkruOWs3DHA5JjRQcf0BBLQo=
=FTQp
-----END PGP SIGNATURE-----
Merge tag 'hardening-v6.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux
Pull hardening updates from Kees Cook:
"As usual, it's scattered changes all over. Patches touching things
outside of our traditional areas in the tree have been Acked by
maintainers or were trivial changes:
- loadpin: remove unsupported MODULE_COMPRESS_NONE (Arulpandiyan
Vadivel)
- samples/check-exec: Fix script name (Mickaël Salaün)
- yama: remove needless locking in yama_task_prctl() (Oleg Nesterov)
- lib/string_choices: Sort by function name (R Sundar)
- hardening: Allow default HARDENED_USERCOPY to be set at compile
time (Mel Gorman)
- uaccess: Split out compile-time checks into ucopysize.h
- kbuild: clang: Support building UM with SUBARCH=i386
- x86: Enable i386 FORTIFY_SOURCE on Clang 16+
- ubsan/overflow: Rework integer overflow sanitizer option
- Add missing __nonstring annotations for callers of
memtostr*()/strtomem*()
- Add __must_be_noncstr() and have memtostr*()/strtomem*() check for
it
- Introduce __nonstring_array for silencing future GCC 15 warnings"
* tag 'hardening-v6.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: (26 commits)
compiler_types: Introduce __nonstring_array
hardening: Enable i386 FORTIFY_SOURCE on Clang 16+
x86/build: Remove -ffreestanding on i386 with GCC
ubsan/overflow: Enable ignorelist parsing and add type filter
ubsan/overflow: Enable pattern exclusions
ubsan/overflow: Rework integer overflow sanitizer option to turn on everything
samples/check-exec: Fix script name
yama: don't abuse rcu_read_lock/get_task_struct in yama_task_prctl()
kbuild: clang: Support building UM with SUBARCH=i386
loadpin: remove MODULE_COMPRESS_NONE as it is no longer supported
lib/string_choices: Rearrange functions in sorted order
string.h: Validate memtostr*()/strtomem*() arguments more carefully
compiler.h: Introduce __must_be_noncstr()
nilfs2: Mark on-disk strings as nonstring
uapi: stddef.h: Introduce __kernel_nonstring
x86/tdx: Mark message.bytes as nonstring
string: kunit: Mark nonstring test strings as __nonstring
scsi: qla2xxx: Mark device strings as nonstring
scsi: mpt3sas: Mark device strings as nonstring
scsi: mpi3mr: Mark device strings as nonstring
...
- move lib/ selftests into lib/tests/ (Kees Cook, Gabriela Bittencourt,
Luis Felipe Hernandez, Lukas Bulwahn, Tamir Duberstein)
- lib/math: Add int_log test suite (Bruno Sobreira França)
- lib/math: Add Kunit test suite for gcd() (Yu-Chun Lin)
- lib/tests/kfifo_kunit.c: add tests for the kfifo structure (Diego Vieira)
- unicode: refactor selftests into KUnit (Gabriela Bittencourt)
- lib/prime_numbers: convert self-test to KUnit (Tamir Duberstein)
- printf: convert self-test to KUnit (Tamir Duberstein)
- scanf: convert self-test to KUnit (Tamir Duberstein)
-----BEGIN PGP SIGNATURE-----
iHUEABYKAB0WIQRSPkdeREjth1dHnSE2KwveOeQkuwUCZ9hCvgAKCRA2KwveOeQk
u1nzAP9F/vQZUPkU9ADuqdcbyyEXlTzNk8R5rC2e1w+uKzJx+QD9EAbeCv9ZLdC0
KQF0pWVYCJtiSMEhkiMS/bMmpRCgwQ8=
=VYZG
-----END PGP SIGNATURE-----
Merge tag 'move-lib-kunit-v6.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux
Pull lib kunit selftest move from Kees Cook:
"This is a one-off tree to coordinate the move of selftests out of lib/
and into lib/tests/. A separate tree was used for this to keep the
paths sane with all the work in the same place.
- move lib/ selftests into lib/tests/ (Kees Cook, Gabriela
Bittencourt, Luis Felipe Hernandez, Lukas Bulwahn, Tamir
Duberstein)
- lib/math: Add int_log test suite (Bruno Sobreira França)
- lib/math: Add Kunit test suite for gcd() (Yu-Chun Lin)
- lib/tests/kfifo_kunit.c: add tests for the kfifo structure (Diego
Vieira)
- unicode: refactor selftests into KUnit (Gabriela Bittencourt)
- lib/prime_numbers: convert self-test to KUnit (Tamir Duberstein)
- printf: convert self-test to KUnit (Tamir Duberstein)
- scanf: convert self-test to KUnit (Tamir Duberstein)"
* tag 'move-lib-kunit-v6.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: (21 commits)
scanf: break kunit into test cases
scanf: convert self-test to KUnit
scanf: remove redundant debug logs
scanf: implicate test line in failure messages
printf: implicate test line in failure messages
printf: break kunit into test cases
printf: convert self-test to KUnit
kunit/fortify: Replace "volatile" with OPTIMIZER_HIDE_VAR()
kunit/fortify: Expand testing of __compiletime_strlen()
kunit/stackinit: Use fill byte different from Clang i386 pattern
kunit/overflow: Fix DEFINE_FLEX tests for counted_by
selftests: remove reference to prime_numbers.sh
MAINTAINERS: adjust entries in FORTIFY_SOURCE and KERNEL HARDENING
lib/prime_numbers: convert self-test to KUnit
lib/math: Add Kunit test suite for gcd()
unicode: kunit: change tests filename and path
unicode: kunit: refactor selftest to kunit tests
lib/tests/kfifo_kunit.c: add tests for the kfifo structure
lib: Move KUnit tests into tests/ subdirectory
lib/math: Add int_log test suite
...
-----BEGIN PGP SIGNATURE-----
iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCZ90sHgAKCRCRxhvAZXjc
omOhAP42jYMtpeE78b7W5UP8YdpyMVtkbgpDqJYirdKDx9QtCwEA4QKR14SKH4G2
s3fJEh5PbBFzkE7pjPGdTy2S5EfDlAo=
=DBbG
-----END PGP SIGNATURE-----
Merge tag 'vfs-6.15-rc1.initramfs' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs
Pull vfs initramfs updates from Christian Brauner:
"This adds basic kunit test coverage for initramfs unpacking and cleans
up some buffer handling issues and inefficiencies"
* tag 'vfs-6.15-rc1.initramfs' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
MAINTAINERS: append initramfs files to the VFS section
initramfs: avoid static buffer for error message
initramfs: fix hardlink hash leak without TRAILER
initramfs: reuse name_len for dir mtime tracking
initramfs: allocate heap buffers together
initramfs: avoid memcpy for hex header fields
vsprintf: add simple_strntoul
initramfs_test: kunit tests for initramfs unpacking
init: add initramfs_internal.h
-----BEGIN PGP SIGNATURE-----
iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCZ90p4AAKCRCRxhvAZXjc
ojMIAP9atkG3u7+490+NGWLdulQlaHnD51Owa9MiW87UfKpsTQEArwi/NrJqXJNT
PFQ2xIa5TxG+9haChR89w3kjZ6b/hgs=
=iDkx
-----END PGP SIGNATURE-----
Merge tag 'vfs-6.15-rc1.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs
Pull misc vfs updates from Christian Brauner:
"Features:
- Add CONFIG_DEBUG_VFS infrastucture:
- Catch invalid modes in open
- Use the new debug macros in inode_set_cached_link()
- Use debug-only asserts around fd allocation and install
- Place f_ref to 3rd cache line in struct file to resolve false
sharing
Cleanups:
- Start using anon_inode_getfile_fmode() helper in various places
- Don't take f_lock during SEEK_CUR if exclusion is guaranteed by
f_pos_lock
- Add unlikely() to kcmp()
- Remove legacy ->remount_fs method from ecryptfs after port to the
new mount api
- Remove invalidate_inodes() in favour of evict_inodes()
- Simplify ep_busy_loopER by removing unused argument
- Avoid mmap sem relocks when coredumping with many missing pages
- Inline getname()
- Inline new_inode_pseudo() and de-staticize alloc_inode()
- Dodge an atomic in putname if ref == 1
- Consistently deref the files table with rcu_dereference_raw()
- Dedup handling of struct filename init and refcounts bumps
- Use wq_has_sleeper() in end_dir_add()
- Drop the lock trip around I_NEW wake up in evict()
- Load the ->i_sb pointer once in inode_sb_list_{add,del}
- Predict not reaching the limit in alloc_empty_file()
- Tidy up do_sys_openat2() with likely/unlikely
- Call inode_sb_list_add() outside of inode hash lock
- Sort out fd allocation vs dup2 race commentary
- Turn page_offset() into a wrapper around folio_pos()
- Remove locking in exportfs around ->get_parent() call
- try_lookup_one_len() does not need any locks in autofs
- Fix return type of several functions from long to int in open
- Fix return type of several functions from long to int in ioctls
Fixes:
- Fix watch queue accounting mismatch"
* tag 'vfs-6.15-rc1.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: (30 commits)
fs: sort out fd allocation vs dup2 race commentary, take 2
fs: call inode_sb_list_add() outside of inode hash lock
fs: tidy up do_sys_openat2() with likely/unlikely
fs: predict not reaching the limit in alloc_empty_file()
fs: load the ->i_sb pointer once in inode_sb_list_{add,del}
fs: drop the lock trip around I_NEW wake up in evict()
fs: use wq_has_sleeper() in end_dir_add()
VFS/autofs: try_lookup_one_len() does not need any locks
fs: dedup handling of struct filename init and refcounts bumps
fs: consistently deref the files table with rcu_dereference_raw()
exportfs: remove locking around ->get_parent() call.
fs: use debug-only asserts around fd allocation and install
fs: dodge an atomic in putname if ref == 1
vfs: Remove invalidate_inodes()
ecryptfs: remove NULL remount_fs from super_operations
watch_queue: fix pipe accounting mismatch
fs: place f_ref to 3rd cache line in struct file to resolve false sharing
epoll: simplify ep_busy_loop by removing always 0 argument
fs: Turn page_offset() into a wrapper around folio_pos()
kcmp: improve performance adding an unlikely hint to task comparisons
...
'man dpkg-deb' describes as follows:
DPKG_DEB_COMPRESSOR_TYPE
Sets the compressor type to use (since dpkg 1.21.10).
The -Z option overrides this value.
When commit 1a7f0a34ea ("builddeb: allow selection of .deb compressor")
was applied, dpkg-deb did not support this environment variable.
Later, dpkg commit c10aeffc6d71 ("dpkg-deb: Add support for
DPKG_DEB_COMPRESSOR_TYPE/LEVEL") introduced support for
DPKG_DEB_COMPRESSOR_TYPE, which provides the same functionality as
KDEB_COMPRESS.
KDEB_COMPRESS is still useful for users of older dpkg versions, but I
would like to remove this redundant functionality in the future.
This commit adds comments to notify users of the planned removal and to
encourage migration to DPKG_DEB_COMPRESSOR_TYPE where possible.
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
CONFIG_TRACE_BRANCH_PROFILING inserts a call to ftrace_likely_update()
for each use of likely() or unlikely(). That breaks noinstr rules if
the affected function is annotated as noinstr.
Disable branch profiling for files with noinstr functions. In addition
to some individual files, this also includes the entire arch/x86
subtree, as well as the kernel/entry, drivers/cpuidle, and drivers/idle
directories, all of which are noinstr-heavy.
Due to the nature of how sched binaries are built by combining multiple
.c files into one, branch profiling is disabled more broadly across the
sched code than would otherwise be needed.
This fixes many warnings like the following:
vmlinux.o: warning: objtool: do_syscall_64+0x40: call to ftrace_likely_update() leaves .noinstr.text section
vmlinux.o: warning: objtool: __rdgsbase_inactive+0x33: call to ftrace_likely_update() leaves .noinstr.text section
vmlinux.o: warning: objtool: handle_bug.isra.0+0x198: call to ftrace_likely_update() leaves .noinstr.text section
...
Reported-by: Ingo Molnar <mingo@kernel.org>
Suggested-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: https://lore.kernel.org/r/fb94fc9303d48a5ed370498f54500cc4c338eb6d.1742586676.git.jpoimboe@kernel.org
Patch series "hung_task: Dump the blocking task stacktrace", v4.
The hung_task detector is very useful for detecting the lockup. However,
since it only dumps the blocked (uninterruptible sleep) processes, it is
not enough to identify the root cause of that lockup.
For example, if a process holds a mutex and sleep an event in
interruptible state long time, the other processes will wait on the mutex
in uninterruptible state. In this case, the waiter processes are dumped,
but the blocker process is not shown because it is sleep in interruptible
state.
This adds a feature to dump the blocker task which holds a mutex
when detecting a hung task. e.g.
INFO: task cat:115 blocked for more than 122 seconds.
Not tainted 6.14.0-rc3-00003-ga8946be3de00 #156
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
task:cat state:D stack:13432 pid:115 tgid:115 ppid:106 task_flags:0x400100 flags:0x00000002
Call Trace:
<TASK>
__schedule+0x731/0x960
? schedule_preempt_disabled+0x54/0xa0
schedule+0xb7/0x140
? __mutex_lock+0x51b/0xa60
? __mutex_lock+0x51b/0xa60
schedule_preempt_disabled+0x54/0xa0
__mutex_lock+0x51b/0xa60
read_dummy+0x23/0x70
full_proxy_read+0x6a/0xc0
vfs_read+0xc2/0x340
? __pfx_direct_file_splice_eof+0x10/0x10
? do_sendfile+0x1bd/0x2e0
ksys_read+0x76/0xe0
do_syscall_64+0xe3/0x1c0
? exc_page_fault+0xa9/0x1d0
entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x4840cd
RSP: 002b:00007ffe99071828 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 00000000004840cd
RDX: 0000000000001000 RSI: 00007ffe99071870 RDI: 0000000000000003
RBP: 00007ffe99071870 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000001000000 R11: 0000000000000246 R12: 0000000000001000
R13: 00000000132fd3a0 R14: 0000000000000001 R15: ffffffffffffffff
</TASK>
INFO: task cat:115 is blocked on a mutex likely owned by task cat:114.
task:cat state:S stack:13432 pid:114 tgid:114 ppid:106 task_flags:0x400100 flags:0x00000002
Call Trace:
<TASK>
__schedule+0x731/0x960
? schedule_timeout+0xa8/0x120
schedule+0xb7/0x140
schedule_timeout+0xa8/0x120
? __pfx_process_timeout+0x10/0x10
msleep_interruptible+0x3e/0x60
read_dummy+0x2d/0x70
full_proxy_read+0x6a/0xc0
vfs_read+0xc2/0x340
? __pfx_direct_file_splice_eof+0x10/0x10
? do_sendfile+0x1bd/0x2e0
ksys_read+0x76/0xe0
do_syscall_64+0xe3/0x1c0
? exc_page_fault+0xa9/0x1d0
entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x4840cd
RSP: 002b:00007ffe3e0147b8 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 00000000004840cd
RDX: 0000000000001000 RSI: 00007ffe3e014800 RDI: 0000000000000003
RBP: 00007ffe3e014800 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000001000000 R11: 0000000000000246 R12: 0000000000001000
R13: 000000001a0a93a0 R14: 0000000000000001 R15: ffffffffffffffff
</TASK>
TBD: We can extend this feature to cover other locks like rwsem and
rt_mutex, but rwsem requires to dump all the tasks which acquire and wait
that rwsem. We can follow the waiter link but the output will be a bit
different compared with mutex case.
This patch (of 2):
The "hung_task" shows a long-time uninterruptible slept task, but most
often, it's blocked on a mutex acquired by another task. Without dumping
such a task, investigating the root cause of the hung task problem is very
difficult.
This introduce task_struct::blocker_mutex to point the mutex lock which
this task is waiting for. Since the mutex has "owner" information, we can
find the owner task and dump it with hung tasks.
Note: the owner can be changed while dumping the owner task, so
this is "likely" the owner of the mutex.
With this change, the hung task shows blocker task's info like below;
INFO: task cat:115 blocked for more than 122 seconds.
Not tainted 6.14.0-rc3-00003-ga8946be3de00 #156
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
task:cat state:D stack:13432 pid:115 tgid:115 ppid:106 task_flags:0x400100 flags:0x00000002
Call Trace:
<TASK>
__schedule+0x731/0x960
? schedule_preempt_disabled+0x54/0xa0
schedule+0xb7/0x140
? __mutex_lock+0x51b/0xa60
? __mutex_lock+0x51b/0xa60
schedule_preempt_disabled+0x54/0xa0
__mutex_lock+0x51b/0xa60
read_dummy+0x23/0x70
full_proxy_read+0x6a/0xc0
vfs_read+0xc2/0x340
? __pfx_direct_file_splice_eof+0x10/0x10
? do_sendfile+0x1bd/0x2e0
ksys_read+0x76/0xe0
do_syscall_64+0xe3/0x1c0
? exc_page_fault+0xa9/0x1d0
entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x4840cd
RSP: 002b:00007ffe99071828 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 00000000004840cd
RDX: 0000000000001000 RSI: 00007ffe99071870 RDI: 0000000000000003
RBP: 00007ffe99071870 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000001000000 R11: 0000000000000246 R12: 0000000000001000
R13: 00000000132fd3a0 R14: 0000000000000001 R15: ffffffffffffffff
</TASK>
INFO: task cat:115 is blocked on a mutex likely owned by task cat:114.
task:cat state:S stack:13432 pid:114 tgid:114 ppid:106 task_flags:0x400100 flags:0x00000002
Call Trace:
<TASK>
__schedule+0x731/0x960
? schedule_timeout+0xa8/0x120
schedule+0xb7/0x140
schedule_timeout+0xa8/0x120
? __pfx_process_timeout+0x10/0x10
msleep_interruptible+0x3e/0x60
read_dummy+0x2d/0x70
full_proxy_read+0x6a/0xc0
vfs_read+0xc2/0x340
? __pfx_direct_file_splice_eof+0x10/0x10
? do_sendfile+0x1bd/0x2e0
ksys_read+0x76/0xe0
do_syscall_64+0xe3/0x1c0
? exc_page_fault+0xa9/0x1d0
entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x4840cd
RSP: 002b:00007ffe3e0147b8 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 00000000004840cd
RDX: 0000000000001000 RSI: 00007ffe3e014800 RDI: 0000000000000003
RBP: 00007ffe3e014800 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000001000000 R11: 0000000000000246 R12: 0000000000001000
R13: 000000001a0a93a0 R14: 0000000000000001 R15: ffffffffffffffff
</TASK>
[akpm@linux-foundation.org: implement debug_show_blocker() in C rather than in CPP]
Link: https://lkml.kernel.org/r/174046694331.2194069.15472952050240807469.stgit@mhiramat.tok.corp.google.com
Link: https://lkml.kernel.org/r/174046695384.2194069.16796289525958195643.stgit@mhiramat.tok.corp.google.com
Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Reviewed-by: Waiman Long <longman@redhat.com>
Reviewed-by: Lance Yang <ioworker0@gmail.com>
Reviewed-by: Sergey Senozhatsky <senozhatsky@chromium.org>
Cc: Anna Schumaker <anna.schumaker@oracle.com>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Joel Granados <joel.granados@kernel.org>
Cc: Kent Overstreet <kent.overstreet@linux.dev>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Tomasz Figa <tfiga@chromium.org>
Cc: Will Deacon <will@kernel.org>
Cc: Yongliang Gao <leonylgao@tencent.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Any driver that needs these library functions should already be selecting
the corresponding Kconfig symbols, so there is no real point in making
these visible.
The original patch that made these user selectable described problems
with drivers failing to select the code they use, but for consistency
it's better to always use 'select' on a symbol than to mix it with
'depends on'.
Fixes: e56e189855 ("lib/crypto: add prompts back to crypto libraries")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Jarkko Sakkinen <jarkko@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Add kmap_local support to the scatterlist iterator. Use it for
all the helper functions in lib/scatterlist.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Due to pending percpu improvements in -next, GCC9 and GCC10 are
crashing during the build with:
lib/zstd/compress/huf_compress.c:1033:1: internal compiler error: Segmentation fault
1033 | {
| ^
Please submit a full bug report,
with preprocessed source if appropriate.
See <file:///usr/share/doc/gcc-9/README.Bugs> for instructions.
The DYNAMIC_BMI2 feature is a known-challenging feature of
the ZSTD library, with an existing GCC quirk turning it off
for GCC versions below 4.8.
Increase the DYNAMIC_BMI2 version cutoff to GCC 11.0 - GCC 10.5
is the last version known to crash.
Reported-by: Michael Kelley <mhklinux@outlook.com>
Debugged-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: https://lore.kernel.org/r/SN6PR02MB415723FBCD79365E8D72CA5FD4D82@SN6PR02MB4157.namprd02.prod.outlook.com
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cross-merge networking fixes after downstream PR (net-6.14-rc8).
Conflict:
tools/testing/selftests/net/Makefile
03544faad7 ("selftest: net: add proc_net_pktgen")
3ed61b8938 ("selftests: net: test for lwtunnel dst ref loops")
tools/testing/selftests/net/config:
85cb3711ac ("selftests: net: Add test cases for link and peer netns")
3ed61b8938 ("selftests: net: test for lwtunnel dst ref loops")
Adjacent commits:
tools/testing/selftests/net/Makefile
c935af429e ("selftests: net: add support for testing SO_RCVMARK and SO_RCVPRIORITY")
355d940f4d ("Revert "selftests: Add IPv6 link-local address generation tests for GRE devices."")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
For more than a decade, CONFIG_SCHED_DEBUG=y has been enabled
in all the major Linux distributions:
/boot/config-6.11.0-19-generic:CONFIG_SCHED_DEBUG=y
The reason is that while originally CONFIG_SCHED_DEBUG started
out as a debugging feature, over the years (decades ...) it has
grown various bits of statistics, instrumentation and
control knobs that are useful for sysadmin and general software
development purposes as well.
But within the kernel we still pretend that there's a choice,
and sometimes code that is seemingly 'debug only' creates overhead
that should be optimized in reality.
So make it all official and make CONFIG_SCHED_DEBUG unconditional.
Now that all uses of CONFIG_SCHED_DEBUG are removed from
the code by previous patches, remove the Kconfig option as well.
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Tested-by: Shrikanth Hegde <sshegde@linux.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Juri Lelli <juri.lelli@redhat.com>
Cc: Vincent Guittot <vincent.guittot@linaro.org>
Cc: Dietmar Eggemann <dietmar.eggemann@arm.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Ben Segall <bsegall@google.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Valentin Schneider <vschneid@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: https://lore.kernel.org/r/20250317104257.3496611-6-mingo@kernel.org
There are a few places in the tree which compute the length of the
string representation of a MAC address as 3 * ETH_ALEN - 1. Define a
constant for this and use it where relevant. No functionality changes
are expected.
Signed-off-by: Uday Shankar <ushankar@purestorage.com>
Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Acked-by: Johannes Berg <johannes@sipsolutions.net>
Reviewed-by: Breno Leitao <leitao@debian.org>
Reviewed-by: Simon Horman <horms@verge.net.au>
Link: https://patch.msgid.link/20250312-netconsole-v6-1-3437933e79b8@purestorage.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Patch series "Minimize xa_node allocation during xarry split", v3.
When splitting a multi-index entry in XArray from order-n to order-m,
existing xas_split_alloc()+xas_split() approach requires 2^(n %
XA_CHUNK_SHIFT) xa_node allocations. But its callers,
__filemap_add_folio() and shmem_split_large_entry(), use at most 1
xa_node. To minimize xa_node allocation and remove the limitation of no
split from order-12 (or above) to order-0 (or anything between 0 and
5)[1], xas_try_split() was added[2], which allocates (n / XA_CHUNK_SHIFT -
m / XA_CHUNK_SHIFT) xa_node. It is used for non-uniform folio split, but
can be used by __filemap_add_folio() and shmem_split_large_entry().
xas_split_alloc() and xas_split() split an order-9 to order-0:
---------------------------------
| | | | | | | | |
| 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 |
| | | | | | | | |
---------------------------------
| | | |
------- --- --- -------
| | ... | |
V V V V
----------- ----------- ----------- -----------
| xa_node | | xa_node | ... | xa_node | | xa_node |
----------- ----------- ----------- -----------
xas_try_split() splits an order-9 to order-0:
---------------------------------
| | | | | | | | |
| 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 |
| | | | | | | | |
---------------------------------
|
|
V
-----------
| xa_node |
-----------
xas_try_split() is designed to be called iteratively with n = m + 1.
xas_try_split_mini_order() is added to minmize the number of calls to
xas_try_split() by telling the caller the next minimal order to split to
instead of n - 1. Splitting order-n to order-m when m= l * XA_CHUNK_SHIFT
does not require xa_node allocation and requires 1 xa_node when n=l *
XA_CHUNK_SHIFT and m = n - 1, so it is OK to use xas_try_split() with n >
m + 1 when no new xa_node is needed.
xfstests quick group test passed on xfs and tmpfs.
[1] https://lore.kernel.org/linux-mm/Z6YX3RznGLUD07Ao@casper.infradead.org/
[2] https://lore.kernel.org/linux-mm/20250226210032.2044041-1-ziy@nvidia.com/
This patch (of 2):
During __filemap_add_folio(), a shadow entry is covering n slots and a
folio covers m slots with m < n is to be added. Instead of splitting all
n slots, only the m slots covered by the folio need to be split and the
remaining n-m shadow entries can be retained with orders ranging from m to
n-1. This method only requires
(n/XA_CHUNK_SHIFT) - (m/XA_CHUNK_SHIFT)
new xa_nodes instead of
(n % XA_CHUNK_SHIFT) * ((n/XA_CHUNK_SHIFT) - (m/XA_CHUNK_SHIFT))
new xa_nodes, compared to the original xas_split_alloc() + xas_split()
one. For example, to insert an order-0 folio when an order-9 shadow entry
is present (assuming XA_CHUNK_SHIFT is 6), 1 xa_node is needed instead of
8.
xas_try_split_min_order() is introduced to reduce the number of calls to
xas_try_split() during split.
Link: https://lkml.kernel.org/r/20250314222113.711703-1-ziy@nvidia.com
Link: https://lkml.kernel.org/r/20250314222113.711703-2-ziy@nvidia.com
Signed-off-by: Zi Yan <ziy@nvidia.com>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Kairui Song <kasong@tencent.com>
Cc: Miaohe Lin <linmiaohe@huawei.com>
Cc: Mattew Wilcox <willy@infradead.org>
Cc: David Hildenbrand <david@redhat.com>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Kefeng Wang <wangkefeng.wang@huawei.com>
Cc: Kirill A. Shuemov <kirill.shutemov@linux.intel.com>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: Yang Shi <yang@os.amperecomputing.com>
Cc: Yu Zhao <yuzhao@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Patch series "Buddy allocator like (or non-uniform) folio split", v10.
This patchset adds a new buddy allocator like (or non-uniform) large folio
split from a order-n folio to order-m with m < n. It reduces
1. the total number of after-split folios from 2^(n-m) to n-m+1;
2. the amount of memory needed for multi-index xarray split from 2^(n/6-m/6) to
n/6-m/6, assuming XA_CHUNK_SHIFT=6;
3. keep more large folios after a split from all order-m folios to
order-(n-1) to order-m folios.
For example, to split an order-9 to order-0, folio split generates 10 (or
11 for anonymous memory) folios instead of 512, allocates 1 xa_node
instead of 8, and leaves 1 order-8, 1 order-7, ..., 1 order-1 and 2
order-0 folios (or 4 order-0 for anonymous memory) instead of 512 order-0
folios.
Instead of duplicating existing split_huge_page*() code, __folio_split()
is introduced as the shared backend code for both
split_huge_page_to_list_to_order() and folio_split(). __folio_split() can
support both uniform split and buddy allocator like (or non-uniform)
split. All existing split_huge_page*() users can be gradually converted
to use folio_split() if possible. In this patchset, I converted
truncate_inode_partial_folio() to use folio_split().
xfstests quick group passed for both tmpfs and xfs. I also
semi-replicated Hugh's test[12] and ran it without any issue for almost 24
hours.
This patch (of 8):
A preparation patch for non-uniform folio split, which always split a
folio into half iteratively, and minimal xarray entry split.
Currently, xas_split_alloc() and xas_split() always split all slots from a
multi-index entry. They cost the same number of xa_node as the
to-be-split slots. For example, to split an order-9 entry, which takes
2^(9-6)=8 slots, assuming XA_CHUNK_SHIFT is 6 (!CONFIG_BASE_SMALL), 8
xa_node are needed. Instead xas_try_split() is intended to be used
iteratively to split the order-9 entry into 2 order-8 entries, then split
one order-8 entry, based on the given index, to 2 order-7 entries, ...,
and split one order-1 entry to 2 order-0 entries. When splitting the
order-6 entry and a new xa_node is needed, xas_try_split() will try to
allocate one if possible. As a result, xas_try_split() would only need 1
xa_node instead of 8.
When a new xa_node is needed during the split, xas_try_split() can try to
allocate one but no more. -ENOMEM will be return if a node cannot be
allocated. -EINVAL will be return if a sibling node is split or cascade
split happens, where two or more new nodes are needed, and these are not
supported by xas_try_split().
xas_split_alloc() and xas_split() split an order-9 to order-0:
---------------------------------
| | | | | | | | |
| 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 |
| | | | | | | | |
---------------------------------
| | | |
------- --- --- -------
| | ... | |
V V V V
----------- ----------- ----------- -----------
| xa_node | | xa_node | ... | xa_node | | xa_node |
----------- ----------- ----------- -----------
xas_try_split() splits an order-9 to order-0:
---------------------------------
| | | | | | | | |
| 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 |
| | | | | | | | |
---------------------------------
|
|
V
-----------
| xa_node |
-----------
Link: https://lkml.kernel.org/r/20250307174001.242794-1-ziy@nvidia.com
Link: https://lkml.kernel.org/r/20250307174001.242794-2-ziy@nvidia.com
Signed-off-by: Zi Yan <ziy@nvidia.com>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Kefeng Wang <wangkefeng.wang@huawei.com>
Cc: Kirill A. Shuemov <kirill.shutemov@linux.intel.com>
Cc: Miaohe Lin <linmiaohe@huawei.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: Yang Shi <yang@os.amperecomputing.com>
Cc: Yu Zhao <yuzhao@google.com>
Cc: Zi Yan <ziy@nvidia.com>
Cc: Kairui Song <kasong@tencent.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Zone device pages are used to represent various type of device memory
managed by device drivers. Currently compound zone device pages are not
supported. This is because MEMORY_DEVICE_FS_DAX pages are the only user
of higher order zone device pages and have their own page reference
counting.
A future change will unify FS DAX reference counting with normal page
reference counting rules and remove the special FS DAX reference counting.
Supporting that requires compound zone device pages.
Supporting compound zone device pages requires compound_head() to
distinguish between head and tail pages whilst still preserving the
special struct page fields that are specific to zone device pages.
A tail page is distinguished by having bit zero being set in
page->compound_head, with the remaining bits pointing to the head page.
For zone device pages page->compound_head is shared with page->pgmap.
The page->pgmap field must be common to all pages within a folio, even if
the folio spans memory sections. Therefore pgmap is the same for both
head and tail pages and can be moved into the folio and we can use the
standard scheme to find compound_head from a tail page.
Link: https://lkml.kernel.org/r/67055d772e6102accf85161d0b57b0b3944292bf.1740713401.git-series.apopple@nvidia.com
Signed-off-by: Alistair Popple <apopple@nvidia.com>
Signed-off-by: Balbir Singh <balbirs@nvidia.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Acked-by: David Hildenbrand <david@redhat.com>
Tested-by: Alison Schofield <alison.schofield@intel.com>
Cc: Alexander Gordeev <agordeev@linux.ibm.com>
Cc: Asahi Lina <lina@asahilina.net>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Christian Borntraeger <borntraeger@linux.ibm.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Chunyan Zhang <zhang.lyra@gmail.com>
Cc: "Darrick J. Wong" <djwong@kernel.org>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Dave Jiang <dave.jiang@intel.com>
Cc: Gerald Schaefer <gerald.schaefer@linux.ibm.com>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: Ira Weiny <ira.weiny@intel.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: linmiaohe <linmiaohe@huawei.com>
Cc: Logan Gunthorpe <logang@deltatee.com>
Cc: Matthew Wilcow (Oracle) <willy@infradead.org>
Cc: Michael "Camp Drill Sergeant" Ellerman <mpe@ellerman.id.au>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Sven Schnelle <svens@linux.ibm.com>
Cc: Ted Ts'o <tytso@mit.edu>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Vishal Verma <vishal.l.verma@intel.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: WANG Xuerui <kernel@xen0n.name>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
The comment of interval_tree_span_iter_next_gap() is not exact, nodes[1]
is not always !NULL.
There are threes cases here. If there is an interior hole, the statement
is correct. If there is a tailing hole or the contiguous used range span
to the end, nodes[1] is NULL.
Link: https://lkml.kernel.org/r/20250310074938.26756-8-richard.weiyang@gmail.com
Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Michel Lespinasse <michel@lespinasse.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Verify interval_tree_span_iter_xxx() helpers works as expected.
Link: https://lkml.kernel.org/r/20250310074938.26756-6-richard.weiyang@gmail.com
Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Michel Lespinasse <michel@lespinasse.org>
Cc: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Current test use pseudo rand function with fixed seed, which means the
test data is the same pattern each time.
Add random seed parameter to randomize the test.
Link: https://lkml.kernel.org/r/20250310074938.26756-4-richard.weiyang@gmail.com
Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Michel Lespinasse <michel@lespinasse.org>
Cc: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Current tests are gathered in one big function.
Split tests into its own function for better understanding and also it
is a preparation for introducing new test cases.
Link: https://lkml.kernel.org/r/20250310074938.26756-3-richard.weiyang@gmail.com
Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Michel Lespinasse <michel@lespinasse.org>
Cc: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Objtool warnings can be indicative of crashes, broken live patching, or
even boot failures. Ignoring them is not recommended.
Add CONFIG_OBJTOOL_WERROR to upgrade objtool warnings to errors by
enabling the objtool --Werror option. Also set --backtrace to print the
branches leading up to the warning, which can help considerably when
debugging certain warnings.
To avoid breaking bots too badly for now, make it the default for real
world builds only (!COMPILE_TEST).
Co-developed-by: Brendan Jackman <jackmanb@google.com>
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/3e7c109313ff15da6c80788965cc7450115b0196.1741975349.git.jpoimboe@kernel.org
Use preempt_model_str() to print the current preemption model. Use
pr_warn() instead of printk() to pass a loglevel. This makes it part of
generic WARN/ BUG traces.
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/20250314160810.2373416-3-bigeasy@linutronix.de
Patch series "mm: cleanups for device-exclusive entries (hmm)", v2.
Some smaller device-exclusive cleanups I have lying around.
This patch (of 5):
The caller now always passes a single page; let's simplify, and return "0"
on success.
Link: https://lkml.kernel.org/r/20250226132257.2826043-1-david@redhat.com
Link: https://lkml.kernel.org/r/20250226132257.2826043-2-david@redhat.com
Signed-off-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Alistair Popple <apopple@nvidia.com>
Cc: Jason Gunthorpe <jgg@nvidia.com>
Cc: Jérôme Glisse <jglisse@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Replace the int type with size_t for variables representing array sizes
and indices in the min-heap implementation. Using size_t aligns with
standard practices for size-related variables and avoids potential issues
on platforms where int may be insufficient to represent all valid sizes or
indices.
Link: https://lkml.kernel.org/r/20250215165618.1757219-1-visitorckw@gmail.com
Signed-off-by: Kuan-Wei Chiu <visitorckw@gmail.com>
Cc: Ching-Chun (Jim) Huang <jserv@ccns.ncku.edu.tw>
Cc: Yu-Chun Lin <eleanor15x@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
The macro is prehistoric, and only exists to help those readers who don't
know what memcmp() returns if memory areas differ. This is pretty well
documented, so the macro looks excessive.
Now that the only user of the macro depends on DEBUG_ZLIB config, GCC
warns about unused macro if the library is built with W=2 against
defconfig. So drop it for good.
Link: https://lkml.kernel.org/r/20250205212933.68695-1-yury.norov@gmail.com
Signed-off-by: Yury Norov <yury.norov@gmail.com>
Reviewed-by: Sergey Senozhatsky <senozhatsky@chromium.org>
Reviewed-by: Mikhail Zaslonko <zaslonko@linux.ibm.com>
Cc: Heiko Carsten <heiko.carstens@de.ibm.com>
Cc: Ilya Leoshkevich <iii@linux.ibm.com>
Cc: Joe Perches <joe@perches.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
In the operation of plist_requeue(), "node" is deleted from the list
before queueing it back to the list again, which involves looping to find
the tail of same-prio entries.
If "node" is the head of same-prio entries which means its prio_list is on
the priority list, then "node_next" can be retrieve immediately by the
next entry of prio_list, instead of looping nodes on node_list.
The shortcut implementation can benefit plist_requeue() running the below
test, and the test result is shown in the following table.
One can observe from the test result that when the number of nodes of
same-prio entries is smaller, then the probability of hitting the shortcut
can be bigger, thus the benefit can be more significant.
While it tends to behave almost the same for long same-prio entries, since
the probability of taking the shortcut is much smaller.
-----------------------------------------------------------------------
| Test size | 200 | 400 | 600 | 800 | 1000 |
-----------------------------------------------------------------------
| new_plist_requeue | 271521| 1007913| 2148033| 4346792| 12200940|
-----------------------------------------------------------------------
| old_plist_requeue | 301395| 1105544| 2488301| 4632980| 12217275|
-----------------------------------------------------------------------
The test is done on x86_64 architecture with v6.9 kernel and
Intel(R) Core(TM) i7-2600 CPU @ 3.40GHz.
Test script( executed in kernel module mode ):
int init_module(void)
{
unsigned int test_data[test_size];
/* Split the list into 10 different priority
* , when test_size is larger, the number of
* nodes within each priority is larger.
*/
for (i = 0; i < ARRAY_SIZE(test_data); i++) {
test_data[i] = i % 10;
}
ktime_t start, end, time_elapsed = 0;
plist_head_init(&test_head_local);
for (i = 0; i < ARRAY_SIZE(test_node_local); i++) {
plist_node_init(test_node_local + i, 0);
test_node_local[i].prio = test_data[i];
}
for (i = 0; i < ARRAY_SIZE(test_node_local); i++) {
if (plist_node_empty(test_node_local + i)) {
plist_add(test_node_local + i, &test_head_local);
}
}
for (i = 0; i < ARRAY_SIZE(test_node_local); i += 1) {
start = ktime_get();
plist_requeue(test_node_local + i, &test_head_local);
end = ktime_get();
time_elapsed += (end - start);
}
pr_info("plist_requeue() elapsed time : %lld, size %d\n", time_elapsed, test_size);
return 0;
}
[akpm@linux-foundation.org: tweak comment and code layout]
Link: https://lkml.kernel.org/r/20250119062408.77638-1-richard120310@gmail.com
Signed-off-by: I Hsin Cheng <richard120310@gmail.com>
Cc: Ching-Chun (Jim) Huang <jserv@ccns.ncku.edu.tw>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Remove a BUG_ON() right before a WARN_ON() with the same condition.
Calling WARN_ON() and BUG_ON() here is definitely wrong. Since the goal is
generally to remove BUG_ON() invocations from the kernel, keep only the
WARN_ON().
Link: https://lkml.kernel.org/r/20250213114453.1078318-1-ptesarik@suse.com
Fixes: 067311d33e ("maple_tree: separate ma_state node from status")
Signed-off-by: Petr Tesarik <ptesarik@suse.com>
Reviewed-by: Liam R. Howlett <Liam.Howlett@Oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Utilize ma_dead_node() in mte_dead_node(). It can prevent decoding the
maple enode for a second time. Use the "node" to find parent for
comparison.
Link: https://lkml.kernel.org/r/20250211071850.330632-1-richard120310@gmail.com
Signed-off-by: I Hsin Cheng <richard120310@gmail.com>
Reviewed-by: Liam R. Howlett <Liam.Howlett@Oracle.com>
Cc: Ching-Chun (Jim) Huang <jserv@ccns.ncku.edu.tw>
Cc: Shuah khan <skhan@linuxfoundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
There's no mas->status of "mas_start", what the function is checking is
whether mas->status equals to "ma_start". Correct the comment for the
function.
Link: https://lkml.kernel.org/r/20250209181023.228856-1-richard120310@gmail.com
Signed-off-by: I Hsin Cheng <richard120310@gmail.com>
Reviewed-by: Liam R. Howlett <howlett@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Refactor code to avoid extra mem_alloc_profiling_enabled() checks inside
pgalloc_tag_get() function which is often called after that check was
already done.
Link: https://lkml.kernel.org/r/20250201231803.2661189-1-surenb@google.com
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Reviewed-by: Shakeel Butt <shakeel.butt@linux.dev>
Cc: David Wang <00107082@163.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Kent Overstreet <kent.overstreet@linux.dev>
Cc: Minchan Kim <minchan@google.com>
Cc: Pasha Tatashin <pasha.tatashin@soleen.com>
Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Sourav Panda <souravpanda@google.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Yu Zhao <yuzhao@google.com>
Cc: Zhenhua Huang <quic_zhenhuah@quicinc.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
The single "real" user in the tree of make_device_exclusive_range() always
requests making only a single address exclusive. The current
implementation is hard to fix for properly supporting anonymous THP /
large folios and for avoiding messing with rmap walks in weird ways.
So let's always process a single address/page and return folio + page to
minimize page -> folio lookups. This is a preparation for further
changes.
Reject any non-anonymous or hugetlb folios early, directly after GUP.
While at it, extend the documentation of make_device_exclusive() to
clarify some things.
Link: https://lkml.kernel.org/r/20250210193801.781278-4-david@redhat.com
Signed-off-by: David Hildenbrand <david@redhat.com>
Acked-by: Simona Vetter <simona.vetter@ffwll.ch>
Reviewed-by: Alistair Popple <apopple@nvidia.com>
Tested-by: Alistair Popple <apopple@nvidia.com>
Cc: Alex Shi <alexs@kernel.org>
Cc: Danilo Krummrich <dakr@kernel.org>
Cc: Dave Airlie <airlied@gmail.com>
Cc: Jann Horn <jannh@google.com>
Cc: Jason Gunthorpe <jgg@nvidia.com>
Cc: Jerome Glisse <jglisse@redhat.com>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Karol Herbst <kherbst@redhat.com>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Lyude <lyude@redhat.com>
Cc: "Masami Hiramatsu (Google)" <mhiramat@kernel.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Pasha Tatashin <pasha.tatashin@soleen.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: SeongJae Park <sj@kernel.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Yanteng Si <si.yanteng@linux.dev>
Cc: Barry Song <v-songbaohua@oppo.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Slab pages now have a refcount of 0, so nobody should be trying to
manipulate the refcount on them. Doing so has little effect; the object
could be freed and reallocated to a different purpose, although the slab
itself would not be until the refcount was put making it behave rather
like TYPESAFE_BY_RCU.
Unfortunately, __iov_iter_get_pages_alloc() does take a refcount. Fix
that to not change the refcount, and make put_page() silently not change
the refcount. get_page() warns so that we can fix any other callers that
need to be changed.
Long-term, networking needs to stop taking a refcount on the pages that it
uses and rely on the caller to hold whatever references are necessary to
make the memory stable. In the medium term, more page types are going to
hav a zero refcount, so we'll want to move get_page() and put_page() out
of line.
Link: https://lkml.kernel.org/r/20250310143544.1216127-1-willy@infradead.org
Fixes: 9aec2fb0fd (slab: allocate frozen pages)
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reported-by: Hannes Reinecke <hare@suse.de>
Closes: https://lore.kernel.org/all/08c29e4b-2f71-4b6d-8046-27e407214d8c@suse.com/
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
The longest length of a symbol (KSYM_NAME_LEN) was increased to 512
in the reference [1]. This patch adds kunit test suite to check the longest
symbol length. These tests verify that the longest symbol length defined
is supported.
This test can also help other efforts for longer symbol length,
like [2].
The test suite defines one symbol with the longest possible length.
The first test verify that functions with names of the created
symbol, can be called or not.
The second test, verify that the symbols are created (or
not) in the kernel symbol table.
[1] https://lore.kernel.org/lkml/20220802015052.10452-6-ojeda@kernel.org/
[2] https://lore.kernel.org/lkml/20240605032120.3179157-1-song@kernel.org/
Link: https://lore.kernel.org/r/20250302221518.76874-1-sergio.collado@gmail.com
Tested-by: Martin Rodriguez Reboredo <yakoyoku@gmail.com>
Reviewed-by: Shuah Khan <skhan@linuxfoundation.org>
Reviewed-by: Rae Moar <rmoar@google.com>
Signed-off-by: Sergio González Collado <sergio.collado@gmail.com>
Link: https://github.com/Rust-for-Linux/linux/issues/504
Reviewed-by: Rae Moar <rmoar@google.com>
Acked-by: David Gow <davidgow@google.com>
Signed-off-by: Shuah Khan <shuah@kernel.org>
userprogs sometimes need access to UAPI headers.
This is currently not possible for Usermode Linux, as UM is only
a pseudo architecture built on top of a regular architecture and does
not have its own UAPI.
Instead use the UAPI headers from the underlying regular architecture.
Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Use `suite_init` and move some tests into `scanf_test_cases`. This
gives us nicer output in the event of a failure.
Reviewed-by: David Gow <davidgow@google.com>
Signed-off-by: Tamir Duberstein <tamird@gmail.com>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Tested-by: Petr Mladek <pmladek@suse.com>
Link: https://lore.kernel.org/r/20250307-scanf-kunit-convert-v9-4-b98820fa39ff@gmail.com
Signed-off-by: Kees Cook <kees@kernel.org>
Convert the scanf() self-test to a KUnit test.
In the interest of keeping the patch reasonably-sized this doesn't
refactor the tests into proper parameterized tests - it's all one big
test case.
Reviewed-by: David Gow <davidgow@google.com>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Tested-by: Petr Mladek <pmladek@suse.com>
Signed-off-by: Tamir Duberstein <tamird@gmail.com>
Link: https://lore.kernel.org/r/20250307-scanf-kunit-convert-v9-3-b98820fa39ff@gmail.com
Signed-off-by: Kees Cook <kees@kernel.org>
Remove `pr_debug` calls which emit information already contained in
`pr_warn` calls that occur on test failure. This reduces unhelpful test
verbosity.
Note that a `pr_debug` removed from `_check_numbers_template` appears to
have been the only guard against silent false positives, but in fact
this condition is handled in `_test`; it is only possible for `n_args`
to be `0` in `_check_numbers_template` if the test explicitly expects it
*and* `vsscanf` returns `0`, matching the expectation.
Reviewed-by: Petr Mladek <pmladek@suse.com>
Signed-off-by: Tamir Duberstein <tamird@gmail.com>
Link: https://lore.kernel.org/r/20250307-scanf-kunit-convert-v9-2-b98820fa39ff@gmail.com
Signed-off-by: Kees Cook <kees@kernel.org>
This improves the failure output by pointing to the failing line at the
top level of the test.
Reviewed-by: Petr Mladek <pmladek@suse.com>
Tested-by: Petr Mladek <pmladek@suse.com>
Signed-off-by: Tamir Duberstein <tamird@gmail.com>
Link: https://lore.kernel.org/r/20250307-scanf-kunit-convert-v9-1-b98820fa39ff@gmail.com
Signed-off-by: Kees Cook <kees@kernel.org>
Cross-merge networking fixes after downstream PR (net-6.14-rc6).
Conflicts:
tools/testing/selftests/drivers/net/ping.py
75cc19c8ff ("selftests: drv-net: add xdp cases for ping.py")
de94e86974 ("selftests: drv-net: store addresses in dict indexed by ipver")
https://lore.kernel.org/netdev/20250311115758.17a1d414@canb.auug.org.au/
net/core/devmem.c
a70f891e0f ("net: devmem: do not WARN conditionally after netdev_rx_queue_restart()")
1d22d3060b ("net: drop rtnl_lock for queue_mgmt operations")
https://lore.kernel.org/netdev/20250313114929.43744df1@canb.auug.org.au/
Adjacent changes:
tools/testing/selftests/net/Makefile
6f50175cca ("selftests: Add IPv6 link-local address generation tests for GRE devices.")
2e5584e0f9 ("selftests/net: expand cmsg_ipv6.sh with ipv4")
drivers/net/ethernet/broadcom/bnxt/bnxt.c
661958552e ("eth: bnxt: do not use BNXT_VNIC_NTUPLE unconditionally in queue restart logic")
fe96d717d3 ("bnxt_en: Extend queue stop/start for TX rings")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
In addition to keeping the kernel's copy of zstd up to date, this update
was requested by Intel to expose upstream's APIs that allow QAT to accelerate
the LZ match finding stage of Zstd.
This patch is imported from the upstream tag v1.5.7-kernel [0], which is signed
with upstream's signing key EF8FE99528B52FFD [1]. It was imported from upstream
using this command:
export ZSTD=/path/to/repo/zstd/
export LINUX=/path/to/repo/linux/
cd "$ZSTD/contrib/linux-kernel"
git checkout v1.5.7-kernel
make import LINUX="$LINUX"
This patch has been tested on x86-64, and has been boot tested with
a zstd compressed kernel & initramfs on i386 and aarch64. I benchmarked
the patch on x86-64 with gcc-14.2.1 on an Intel i9-9900K by measruing the
performance of compressed filesystem reads and writes.
Component, Level, Size delta, C. time delta, D. time delta
Btrfs , 1, +0.00%, -6.1%, +1.4%
Btrfs , 3, +0.00%, -9.8%, +3.0%
Btrfs , 5, +0.00%, +1.7%, +1.4%
Btrfs , 7, +0.00%, -1.9%, +2.7%
Btrfs , 9, +0.00%, -3.4%, +3.7%
Btrfs , 15, +0.00%, -0.3%, +3.6%
SquashFS , 1, +0.00%, N/A, +1.9%
The major changes that impact the kernel use cases for each version are:
v1.5.7: https://github.com/facebook/zstd/releases/tag/v1.5.7
* Add zstd_compress_sequences_and_literals() for use by Intel's QAT driver
to implement Zstd compression acceleration in the kernel.
* Fix an underflow bug in 32-bit builds that can cause data corruption when
processing more than 4GB of data with a single `ZSTD_CCtx` object, when an
input crosses the 4GB boundry. I don't believe this impacts any current kernel
use cases, because the `ZSTD_CCtx` is typically reconstructed between
compressions.
* Levels 1-4 see 5-10% compression speed improvements for inputs smaller than
128KB.
v1.5.6: https://github.com/facebook/zstd/releases/tag/v1.5.6
* Improved compression ratio for the highest compression levels. I don't expect
these see much use however, due to their slow speeds.
v1.5.5: https://github.com/facebook/zstd/releases/tag/v1.5.5
* Fix a rare corruption bug that can trigger on levels 13 and above.
* Improve compression speed of levels 5-11 on incompressible data.
v1.5.4: https://github.com/facebook/zstd/releases/tag/v1.5.4
* Improve copmression speed of levels 5-11 on ARM.
* Improve dictionary compression speed.
Signed-off-by: Nick Terrell <terrelln@fb.com>
This improves the failure output by pointing to the failing line at the
top level of the test, e.g.:
# test_number: EXPECTATION FAILED at lib/printf_kunit.c:103
lib/printf_kunit.c:167: vsnprintf(buf, 256, "%#-12x", ...) wrote '0x1234abcd ', expected '0x1234abce '
# test_number: EXPECTATION FAILED at lib/printf_kunit.c:142
lib/printf_kunit.c:167: kvasprintf(..., "%#-12x", ...) returned '0x1234abcd ', expected '0x1234abce '
Signed-off-by: Tamir Duberstein <tamird@gmail.com>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Tested-by: Petr Mladek <pmladek@suse.com>
Link: https://lore.kernel.org/r/20250307-printf-kunit-convert-v6-3-4d85c361c241@gmail.com
Signed-off-by: Kees Cook <kees@kernel.org>
Move all tests into `printf_test_cases`. This gives us nicer output in
the event of a failure.
Combine `plain_format` and `plain_hash` into `hash_pointer` since
they're testing the same scenario.
Signed-off-by: Tamir Duberstein <tamird@gmail.com>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Tested-by: Petr Mladek <pmladek@suse.com>
Link: https://lore.kernel.org/r/20250307-printf-kunit-convert-v6-2-4d85c361c241@gmail.com
Signed-off-by: Kees Cook <kees@kernel.org>
Convert the printf() self-test to a KUnit test.
In the interest of keeping the patch reasonably-sized this doesn't
refactor the tests into proper parameterized tests - it's all one big
test case.
Signed-off-by: Tamir Duberstein <tamird@gmail.com>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Tested-by: Petr Mladek <pmladek@suse.com>
Link: https://lore.kernel.org/r/20250307-printf-kunit-convert-v6-1-4d85c361c241@gmail.com
Signed-off-by: Kees Cook <kees@kernel.org>
All modules that need CONFIG_CRC64 already select it, so there is no
need to bother users about the option.
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250304230712.167600-6-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@google.com>
All modules that need CONFIG_LIBCRC32C already select it, so there is no
need to bother users about the option.
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250304230712.167600-5-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@google.com>
None of the CRC library functions use __pure anymore, so the comment in
crc_benchmark() is outdated. But the comment was not really correct
anyway, since the CRC computation could (in principle) be optimized out
regardless of __pure. Update the comment to have a proper explanation.
Link: https://lore.kernel.org/r/20250305015830.37813-1-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@google.com>
The list module_bug_list relies on module_mutex for writer
synchronisation. The list is already RCU style.
The list removal is synchronized with modules' synchronize_rcu() in
free_module().
Use RCU read lock protection instead of RCU-sched.
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/20250108090457.512198-29-bigeasy@linutronix.de
Signed-off-by: Petr Pavlu <petr.pavlu@suse.com>
Use "#!/usr/bin/env bash" instead of "#!/bin/bash". This is necessary
for nix environments as they only provide /usr/bin/env at the standard
location.
Signed-off-by: Joel Granados <joel.granados@kernel.org>
Acked-by: Luis Chamberlain <mcgrof@kernel.org>
Link: https://lore.kernel.org/r/20250122-jag-nix-ify-v1-1-addb3170f93c@kernel.org
Signed-off-by: Petr Pavlu <petr.pavlu@suse.com>
This moves the rust_fmt_argument function over to use the new #[export]
macro, which will verify at compile-time that the function signature
matches what is in the header file.
Reviewed-by: Andreas Hindborg <a.hindborg@kernel.org>
Reviewed-by: Tamir Duberstein <tamird@gmail.com>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Alice Ryhl <aliceryhl@google.com>
Acked-by: Petr Mladek <pmladek@suse.com>
Link: https://lore.kernel.org/r/20250303-export-macro-v3-4-41fbad85a27f@google.com
[ Removed period as requested by Andy. - Miguel ]
Signed-off-by: Miguel Ojeda <ojeda@kernel.org>
Without this change, the rest of this series will emit the following
error message:
error[E0308]: `if` and `else` have incompatible types
--> <linux>/rust/kernel/print.rs:22:22
|
21 | #[export]
| --------- expected because of this
22 | unsafe extern "C" fn rust_fmt_argument(
| ^^^^^^^^^^^^^^^^^ expected `u8`, found `i8`
|
= note: expected fn item `unsafe extern "C" fn(*mut u8, *mut u8, *mut c_void) -> *mut u8 {bindings::rust_fmt_argument}`
found fn item `unsafe extern "C" fn(*mut i8, *mut i8, *const c_void) -> *mut i8 {print::rust_fmt_argument}`
The error may be different depending on the architecture.
To fix this, change the void pointer argument to use a const pointer,
and change the imports to use crate::ffi instead of core::ffi for
integer types.
Fixes: 787983da77 ("vsprintf: add new `%pA` format specifier")
Reviewed-by: Tamir Duberstein <tamird@gmail.com>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Alice Ryhl <aliceryhl@google.com>
Acked-by: Petr Mladek <pmladek@suse.com>
Link: https://lore.kernel.org/r/20250303-export-macro-v3-1-41fbad85a27f@google.com
Signed-off-by: Miguel Ojeda <ojeda@kernel.org>
or aren't considered necessary for -stable kernels.
26 are for MM and 7 are for non-MM.
- "mm: memory_failure: unmap poisoned folio during migrate properly"
from Ma Wupeng fixes a couple of two year old bugs involving the
migration of hwpoisoned folios.
- "selftests/damon: three fixes for false results" from SeongJae Park
fixes three one year old bugs in the SAMON selftest code.
The remainder are singletons and doubletons. Please see the individual
changelogs for details.
-----BEGIN PGP SIGNATURE-----
iHUEABYKAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCZ8zgnAAKCRDdBJ7gKXxA
jmTzAP9LsTIpkPkRXDpwxPR/Si5KPwOkE6sGj4ETEqbX3vUvcAEA/Lp7oafc7Vqr
XxlC1VFush1ZK29Tecxzvnapl2/VSAs=
=zBvY
-----END PGP SIGNATURE-----
Merge tag 'mm-hotfixes-stable-2025-03-08-16-27' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull misc fixes from Andrew Morton:
"33 hotfixes. 24 are cc:stable and the remainder address post-6.13
issues or aren't considered necessary for -stable kernels.
26 are for MM and 7 are for non-MM.
- "mm: memory_failure: unmap poisoned folio during migrate properly"
from Ma Wupeng fixes a couple of two year old bugs involving the
migration of hwpoisoned folios.
- "selftests/damon: three fixes for false results" from SeongJae Park
fixes three one year old bugs in the SAMON selftest code.
The remainder are singletons and doubletons. Please see the individual
changelogs for details"
* tag 'mm-hotfixes-stable-2025-03-08-16-27' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (33 commits)
mm/page_alloc: fix uninitialized variable
rapidio: add check for rio_add_net() in rio_scan_alloc_net()
rapidio: fix an API misues when rio_add_net() fails
MAINTAINERS: .mailmap: update Sumit Garg's email address
Revert "mm/page_alloc.c: don't show protection in zone's ->lowmem_reserve[] for empty zone"
mm: fix finish_fault() handling for large folios
mm: don't skip arch_sync_kernel_mappings() in error paths
mm: shmem: remove unnecessary warning in shmem_writepage()
userfaultfd: fix PTE unmapping stack-allocated PTE copies
userfaultfd: do not block on locking a large folio with raised refcount
mm: zswap: use ATOMIC_LONG_INIT to initialize zswap_stored_pages
mm: shmem: fix potential data corruption during shmem swapin
mm: fix kernel BUG when userfaultfd_move encounters swapcache
selftests/damon/damon_nr_regions: sort collected regiosn before checking with min/max boundaries
selftests/damon/damon_nr_regions: set ops update for merge results check to 100ms
selftests/damon/damos_quota: make real expectation of quota exceeds
include/linux/log2.h: mark is_power_of_2() with __always_inline
NFS: fix nfs_release_folio() to not deadlock via kcompactd writeback
mm, swap: avoid BUG_ON in relocate_cluster()
mm: swap: use correct step in loop to wait all clusters in wait_for_allocation()
...
To support multiple PTP clocks, the VDSO data structure needs to be
reworked. All clock specific data will end up in struct vdso_clock and in
struct vdso_time_data there will be an array of VDSO clocks.
Now that all preparatory changes are in place:
Split the clock related struct members into a separate struct
vdso_clock. Make sure all users are aware, that vdso_time_data is no longer
initialized as an array and vdso_clock is now the array inside
vdso_data. Remove the vdso_clock define, which mapped it to vdso_time_data
for the transition.
Signed-off-by: Anna-Maria Behnsen <anna-maria@linutronix.de>
Signed-off-by: Nam Cao <namcao@linutronix.de>
Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/all/20250303-vdso-clock-v1-19-c1b5c69a166f@linutronix.de
To support multiple PTP clocks, the VDSO data structure needs to be
reworked. All clock specific data will end up in struct vdso_clock and in
struct vdso_time_data there will be array of VDSO clocks. At the moment,
vdso_clock is simply a define which maps vdso_clock to vdso_time_data.
For time namespaces, vdso_time_data needs to be set up. But only the clock
related part of the vdso_data thats requires this setup. To reflect the
future struct vdso_clock, rename timens_setup_vdso_data() to
timns_setup_vdso_clock_data().
No functional change.
Signed-off-by: Anna-Maria Behnsen <anna-maria@linutronix.de>
Signed-off-by: Nam Cao <namcao@linutronix.de>
Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/all/20250303-vdso-clock-v1-13-c1b5c69a166f@linutronix.de
To support multiple PTP clocks, the VDSO data structure needs to be
reworked. All clock specific data will end up in struct vdso_clock and in
struct vdso_time_data there will be array of VDSO clocks. At the moment,
vdso_clock is simply a define which maps vdso_clock to vdso_time_data.
To prepare for the rework of the data structures, replace the struct
vdso_time_data pointer argument of the helper functions with struct
vdso_clock pointer where applicable.
No functional change.
Signed-off-by: Anna-Maria Behnsen <anna-maria@linutronix.de>
Signed-off-by: Nam Cao <namcao@linutronix.de>
Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/all/20250303-vdso-clock-v1-11-c1b5c69a166f@linutronix.de
To support multiple PTP clocks, the VDSO data structure needs to be
reworked. All clock specific data will end up in struct vdso_clock and in
struct vdso_time_data there will be array of VDSO clocks. At the moment,
vdso_clock is simply a define which maps vdso_clock to vdso_time_data.
Prepare for the rework of these structures by adding a struct vdso_clock
pointer argument to do_coarse_time_ns(), and replace the struct
vdso_time_data pointer with the new pointer argument where applicable.
No functional change.
Signed-off-by: Anna-Maria Behnsen <anna-maria@linutronix.de>
Signed-off-by: Nam Cao <namcao@linutronix.de>
Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/all/20250303-vdso-clock-v1-10-c1b5c69a166f@linutronix.de
To support multiple PTP clocks, the VDSO data structure needs to be
reworked. All clock specific data will end up in struct vdso_clock and in
struct vdso_time_data there will be array of VDSO clocks. At the moment,
vdso_clock is simply a define which maps vdso_clock to vdso_time_data.
Prepare for the rework of these structures by adding a struct vdso_clock
pointer argument to do_coarse(), and replace the struct vdso_time_data
pointer with the new pointer argument where applicable.
No functional change.
Signed-off-by: Anna-Maria Behnsen <anna-maria@linutronix.de>
Signed-off-by: Nam Cao <namcao@linutronix.de>
Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/all/20250303-vdso-clock-v1-9-c1b5c69a166f@linutronix.de
To support multiple PTP clocks, the VDSO data structure needs to be
reworked. All clock specific data will end up in struct vdso_clock and in
struct vdso_time_data there will be array of VDSO clocks. At the moment,
vdso_clock is simply a define which maps vdso_clock to vdso_time_data.
Prepare for the rework of these structures by adding a struct vdso_clock
pointer argument to do_hres_timens(), and replace the struct vdso_time_data
pointer with the new pointer argument where applicable.
No functional change.
Signed-off-by: Anna-Maria Behnsen <anna-maria@linutronix.de>
Signed-off-by: Nam Cao <namcao@linutronix.de>
Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/all/20250303-vdso-clock-v1-8-c1b5c69a166f@linutronix.de
To support multiple PTP clocks, the VDSO data structure needs to be
reworked. All clock specific data will end up in struct vdso_clock and in
struct vdso_time_data there will be array of VDSO clocks. At the moment,
vdso_clock is simply a define which maps vdso_clock to vdso_time_data.
Prepare for the rework of these structures by adding a struct vdso_clock
pointer argument to do_hres(), and replace the struct vdso_time_data
pointer with the new pointer argument where applicable.
No functional change.
Signed-off-by: Anna-Maria Behnsen <anna-maria@linutronix.de>
Signed-off-by: Nam Cao <namcao@linutronix.de>
Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/all/20250303-vdso-clock-v1-7-c1b5c69a166f@linutronix.de
To support multiple PTP clocks, the VDSO data structure needs to be
reworked. All clock specific data will end up in struct vdso_clock and in
struct vdso_time_data there will be array of VDSO clocks. At the moment,
vdso_clock is simply a define which maps vdso_clock to vdso_time_data.
Prepare all functions which need the pointer to the vdso_clock array to
work correctly after introducing the new struct. Where applicable, replace
the struct vdso_time_data pointer by a struct vdso_clock pointer.
No functional change.
Signed-off-by: Anna-Maria Behnsen <anna-maria@linutronix.de>
Signed-off-by: Nam Cao <namcao@linutronix.de>
Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/all/20250303-vdso-clock-v1-6-c1b5c69a166f@linutronix.de
cpio extraction currently does a memcpy to ensure that the archive hex
fields are null terminated for simple_strtoul(). simple_strntoul() will
allow us to avoid the memcpy.
Signed-off-by: David Disseldorp <ddiss@suse.de>
Link: https://lore.kernel.org/r/20250304061020.9815-4-ddiss@suse.de
Signed-off-by: Christian Brauner <brauner@kernel.org>
The ChaCha20-Poly1305 library code uses the sg_miter API to process
input presented via scatterlists, except for the special case where the
digest buffer is not covered entirely by the same scatterlist entry as
the last byte of input. In that case, it uses scatterwalk_map_and_copy()
to access the memory in the input scatterlist where the digest is stored.
This results in a dependency on crypto/scatterwalk.c and therefore on
CONFIG_CRYPTO_ALGAPI, which is unnecessary, as the sg_miter API already
provides this functionality via sg_copy_to_buffer(). So use that
instead, and drop the dependencies on CONFIG_CRYPTO_ALGAPI and
CONFIG_CRYPTO.
Reported-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Eric Biggers <ebiggers@kernel.org>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Unlike the decompression code, the compression code in LZO never
checked for output overruns. It instead assumes that the caller
always provides enough buffer space, disregarding the buffer length
provided by the caller.
Add a safe compression interface that checks for the end of buffer
before each write. Use the safe interface in crypto/lzo.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Limit integer wrap-around mitigation to only the "size_t" type (for
now). Notably this covers all special functions/builtins that return
"size_t", like sizeof(). This remains an experimental feature and is
likely to be replaced with type annotations.
Reviewed-by: Justin Stitt <justinstitt@google.com>
Link: https://lore.kernel.org/r/20250307041914.937329-3-kees@kernel.org
Signed-off-by: Kees Cook <kees@kernel.org>
To make integer wrap-around mitigation actually useful, the associated
sanitizers must not instrument cases where the wrap-around is explicitly
defined (e.g. "-2UL"), being tested for (e.g. "if (a + b < a)"), or
where it has no impact on code flow (e.g. "while (var--)"). Enable
pattern exclusions for the integer wrap sanitizers.
Reviewed-by: Justin Stitt <justinstitt@google.com>
Link: https://lore.kernel.org/r/20250307041914.937329-2-kees@kernel.org
Signed-off-by: Kees Cook <kees@kernel.org>
Since we're going to approach integer overflow mitigation a type at a
time, we need to enable all of the associated sanitizers, and then opt
into types one at a time.
Rename the existing "signed wrap" sanitizer to just the entire topic area:
"integer wrap". Enable the implicit integer truncation sanitizers, with
required callbacks and tests.
Notably, this requires features (currently) only available in Clang,
so we can depend on the cc-option tests to determine availability
instead of doing version tests.
Link: https://lore.kernel.org/r/20250307041914.937329-1-kees@kernel.org
Signed-off-by: Kees Cook <kees@kernel.org>
Cross-merge networking fixes after downstream PR (net-6.14-rc6).
Conflicts:
net/ethtool/cabletest.c
2bcf4772e4 ("net: ethtool: try to protect all callback with netdev instance lock")
637399bf7e ("net: ethtool: netlink: Allow NULL nlattrs when getting a phy_device")
No Adjacent changes.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The byte initialization values used with -ftrivial-auto-var-init=pattern
(CONFIG_INIT_STACK_ALL_PATTERN=y) depends on the compiler, architecture,
and byte position relative to struct member types. On i386 with Clang,
this includes the 0xFF value, which means it looks like nothing changes
between the leaf byte filling pass and the expected "stack wiping"
pass of the stackinit test.
Use the byte fill value of 0x99 instead, fixing the test for i386 Clang
builds.
Reported-by: ernsteiswuerfel
Closes: https://github.com/ClangBuiltLinux/linux/issues/2071
Fixes: 8c30d32b1a ("lib/test_stackinit: Handle Clang auto-initialization pattern")
Tested-by: Nathan Chancellor <nathan@kernel.org>
Link: https://lore.kernel.org/r/20250304225606.work.030-kees@kernel.org
Signed-off-by: Kees Cook <kees@kernel.org>
Unfortunately, __builtin_dynamic_object_size() does not take into account
flexible array sizes, even when they are sized by __counted_by. As a
result, the size tests for the flexible arrays need to be separated to
get an accurate check of the compiler's behavior. While at it, fully test
sizeof, __struct_size (bdos(..., 0)), and __member_size (bdos(..., 1)).
I still think this is a compiler design issue, but there's not much to
be done about it currently beyond adjusting these tests. GCC and Clang
agree on this behavior at least. :)
Reported-by: "Thomas Weißschuh" <linux@weissschuh.net>
Closes: https://lore.kernel.org/lkml/e1a1531d-6968-4ae8-a3b5-5ea0547ec4b3@t-8ch.de/
Fixes: 9dd5134c61 ("kunit/overflow: Adjust for __counted_by with DEFINE_RAW_FLEX()")
Signed-off-by: Kees Cook <kees@kernel.org>
Instead of having a private cpu_has_vx() implementation use the new common
cpu feature method. Move the facility detection to the decompressor so it
matches all other cpu features.
Reviewed-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Add a test_kfree_rcu_wq_destroy test to verify a kmem_cache_destroy()
from a workqueue context. The problem is that, before destroying any
cache the kvfree_rcu_barrier() is invoked to guarantee that in-flight
freed objects are flushed.
The _barrier() function queues and flushes its own internal workers
which might conflict with a workqueue type a kmem-cache gets destroyed
from.
One example is when a WQ_MEM_RECLAIM workqueue is flushing !WQ_MEM_RECLAIM
events which leads to a kernel splat. See the check_flush_dependency() in
the workqueue.c file.
If this test does not emits any kernel warning, it is passed.
Reviewed-by: Keith Busch <kbusch@kernel.org>
Co-developed-by: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
The ARCH_MAY_HAVE patch missed arm64, mips and s390. But it may
also lead to arch options being enabled but ineffective because
of modular/built-in conflicts.
As the primary user of all these options wireguard is selecting
the arch options anyway, make the same selections at the lib/crypto
option level and hide the arch options from the user.
Instead of selecting them centrally from lib/crypto, simply set
the default of each arch option as suggested by Eric Biggers.
Change the Crypto API generic algorithms to select the top-level
lib/crypto options instead of the generic one as otherwise there
is no way to enable the arch options (Eric Biggers). Introduce a
set of INTERNAL options to work around dependency cycles on the
CONFIG_CRYPTO symbol.
Fixes: 1047e21aec ("crypto: lib/Kconfig - Fix lib built-in failure when arch is modular")
Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Arnd Bergmann <arnd@kernel.org>
Closes: https://lore.kernel.org/oe-kbuild-all/202502232152.JC84YDLp-lkp@intel.com/
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
There are two incompatible sets of definitions of these eight functions:
On 64-bit architectures setting CONFIG_HAS_IOPORT, they turn into
either pair of 32-bit PIO (inl/outl) accesses or a single 64-bit MMIO
(readq/writeq). On other 64-bit architectures, they are always split
into 32-bit accesses.
Depending on which header gets included in a driver, there are
additionally definitions for ioread64()/iowrite64() that are
expected to produce a 64-bit register MMIO access on all 64-bit
architectures.
To separate the conflicting definitions, make the version in
include/linux/io-64-nonatomic-*.h visible on all architectures
but pick the one from lib/iomap.c on architectures that set
CONFIG_GENERIC_IOMAP in place of the default fallback.
Acked-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
In preparation for strtomem*() checking that its destination is a
__nonstring, annotate "nonstring" and "nonstring_small" variables
accordingly.
Signed-off-by: Kees Cook <kees@kernel.org>
Replace X86_CMPXCHG64 with X86_CX8, as CX8 is the name of the CPUID
flag, thus to make it consistent with X86_FEATURE_CX8 defined in
<asm/cpufeatures.h>.
No functional change intended.
Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
Signed-off-by: Xin Li (Intel) <xin@zytor.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20250228082338.73859-2-xin@zytor.com
Introduce free_pages_nolock() that can free pages without taking locks.
It relies on trylock and can be called from any context.
Since spin_trylock() cannot be used in PREEMPT_RT from hard IRQ or NMI
it uses lockless link list to stash the pages which will be freed
by subsequent free_pages() from good context.
Do not use llist unconditionally. BPF maps continuously
allocate/free, so we cannot unconditionally delay the freeing to
llist. When the memory becomes free make it available to the
kernel and BPF users right away if possible, and fallback to
llist as the last resort.
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Reviewed-by: Shakeel Butt <shakeel.butt@linux.dev>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/r/20250222024427.30294-4-alexei.starovoitov@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Tracing BPF programs execute from tracepoints and kprobes where
running context is unknown, but they need to request additional
memory. The prior workarounds were using pre-allocated memory and
BPF specific freelists to satisfy such allocation requests.
Instead, introduce gfpflags_allow_spinning() condition that signals
to the allocator that running context is unknown.
Then rely on percpu free list of pages to allocate a page.
try_alloc_pages() -> get_page_from_freelist() -> rmqueue() ->
rmqueue_pcplist() will spin_trylock to grab the page from percpu
free list. If it fails (due to re-entrancy or list being empty)
then rmqueue_bulk()/rmqueue_buddy() will attempt to
spin_trylock zone->lock and grab the page from there.
spin_trylock() is not safe in PREEMPT_RT when in NMI or in hard IRQ.
Bailout early in such case.
The support for gfpflags_allow_spinning() mode for free_page and memcg
comes in the next patches.
This is a first step towards supporting BPF requirements in SLUB
and getting rid of bpf_mem_alloc.
That goal was discussed at LSFMM: https://lwn.net/Articles/974138/
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Reviewed-by: Shakeel Butt <shakeel.butt@linux.dev>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/r/20250222024427.30294-3-alexei.starovoitov@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
A common task for most drivers is to remember the user-set CPU affinity
to its IRQs. On each netdev reset, the driver should re-assign the user's
settings to the IRQs. Unify this task across all drivers by moving the CPU
affinity to napi->config.
However, to move the CPU affinity to core, we also need to move aRFS
rmap management since aRFS uses its own IRQ notifiers.
For the aRFS, add a new netdev flag "rx_cpu_rmap_auto". Drivers supporting
aRFS should set the flag via netif_enable_cpu_rmap() and core will allocate
and manage the aRFS rmaps. Freeing the rmap is also done by core when the
netdev is freed. For better IRQ affinity management, move the IRQ rmap
notifier inside the napi_struct and add new notify.notify and
notify.release functions: netif_irq_cpu_rmap_notify() and
netif_napi_affinity_release().
Now we have the aRFS rmap management in core, add CPU affinity mask to
napi_config. To delegate the CPU affinity management to the core, drivers
must:
1 - set the new netdev flag "irq_affinity_auto":
netif_enable_irq_affinity(netdev)
2 - create the napi with persistent config:
netif_napi_add_config()
3 - bind an IRQ to the napi instance: netif_napi_set_irq()
the core will then make sure to use re-assign affinity to the napi's
IRQ.
The default IRQ mask is set to one cpu starting from the closest NUMA.
Signed-off-by: Ahmed Zaki <ahmed.zaki@intel.com>
Link: https://patch.msgid.link/20250224232228.990783-2-ahmed.zaki@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Now that we have cpumask_next_wrap() wired to generic find_next_bit_wrap(),
the old implementation is not needed.
Signed-off-by: Yury Norov <yury.norov@gmail.com>
Now that cpumask_next{_and}_wrap() is wired to generic
find_next_bit_wrap(), we can use it in cpumask_any{_and}_distribute().
This automatically makes the cpumask_*_distribute() functions to use
small_cpumask_bits instead of nr_cpumask_bits, which itself is a good
optimization.
Signed-off-by: Yury Norov <yury.norov@gmail.com>
The next patch aligns implementation of cpumask_next_wrap() with the
find_next_bit_wrap(), and it changes function signature.
To make the transition smooth, this patch deprecates current
implementation by adding an _old suffix. The following patches switch
current users to the new implementation one by one.
No functional changes were intended.
Signed-off-by: Yury Norov <yury.norov@gmail.com>
The HAVE_ARCH Kconfig options in lib/crypto try to solve the
modular versus built-in problem, but it still fails when the
the LIB option (e.g., CRYPTO_LIB_CURVE25519) is selected externally.
Fix this by introducing a level of indirection with ARCH_MAY_HAVE
Kconfig options, these then go on to select the ARCH_HAVE options
if the ARCH Kconfig options matches that of the LIB option.
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202501230223.ikroNDr1-lkp@intel.com/
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>