mirror of
https://github.com/torvalds/linux.git
synced 2026-05-14 17:32:42 +02:00
4e9ad033b4
46075 Commits
| Author | SHA1 | Message | Date | |
|---|---|---|---|---|
|
|
ad52c55e1d |
Power management updates for 6.13-rc1
- Update the amd-pstate driver to set the initial scaling frequency
policy lower bound to be the lowest non-linear frequency (Dhananjay
Ugwekar).
- Enable amd-pstate by default on servers starting with newer AMD Epyc
processors (Swapnil Sapkal).
- Align more codepaths between shared memory and MSR designs in
amd-pstate (Dhananjay Ugwekar).
- Clean up amd-pstate code to rename functions and remove redundant
calls (Dhananjay Ugwekar, Mario Limonciello).
- Do other assorted fixes and cleanups in amd-pstate (Dhananjay Ugwekar
and Mario Limonciello).
- Change the Balance-performance EPP value for Granite Rapids in the
intel_pstate driver to a more performance-biased one (Srinivas
Pandruvada).
- Simplify MSR read on the boot CPU in the ACPI cpufreq driver (Chang
S. Bae).
- Ensure sugov_eas_rebuild_sd() is always called when sugov_init()
succeeds to always enforce sched domains rebuild in case EAS needs
to be enabled (Christian Loehle).
- Switch cpufreq back to platform_driver::remove() (Uwe Kleine-König).
- Use proper frequency unit names in cpufreq (Marcin Juszkiewicz).
- Add a built-in idle states table for Granite Rapids Xeon D to the
intel_idle driver (Artem Bityutskiy).
- Fix some typos in comments in the cpuidle core and drivers (Shen
Lichuan).
- Remove iowait influence from the menu cpuidle governor (Christian
Loehle).
- Add min/max available performance state limits to the Energy Model
management code (Lukasz Luba).
- Update pm-graph to v5.13 (Todd Brandt).
- Add documentation for some recently introduced cpupower utility
options (Tor Vic).
- Make cpupower inform users where cpufreq-bench.conf should be located
when opening it fails (Peng Fan).
- Allow overriding cross-compiling env params in cpupower (Peng Fan).
- Add compile_commands.json to .gitignore in cpupower (John B. Wyatt
IV).
- Improve disable c_state block in cpupower bindings and add a test to
confirm that CPU state is disabled to it (John B. Wyatt IV).
- Add Chinese Simplified translation to cpupower (Kieran Moy).
- Add checks for xgettext and msgfmt to cpupower (Siddharth Menon).
-----BEGIN PGP SIGNATURE-----
iQJGBAABCAAwFiEE4fcc61cGeeHD/fCwgsRv/nhiVHEFAmc3r6sSHHJqd0Byand5
c29ja2kubmV0AAoJEILEb/54YlRxQMUQALNEbh/Ko1d+avq0sfvyPw18BZjEiQw7
M+L0GydLW6tXLYOrD+ZTASksdDhHbK0iuFr1Gca2cZi0Dl+1XF9sy70ITTqzCDIA
8qj1JrPmRYI0KXCfiSSke0W9fU18IdxVX3I7XezVqBl0ICzsroN5wliCkmEnVOU9
LQkw0fyYr7gev4GFEGSJ7WzfPxci0d6J9pYnafFlDEE28WpKz/cyOzYuSghX5lmG
ISHIVNIM6lqNgXyQirConvhrlg60XAyw5k5jqAYZbe78T+dqhH7lr9sDi7c4XxkG
syeiOOyjpiBMZv1rSjIUapi8AfJHyqH7B6KyTgiulIy31x8Dji62925B63CSahkM
AminAq0lYkqbhIcqEr4sW0JQ/oW3iX4cZ3TJXTUL+vFByR0ZF81tgQcXufhrcvBs
ViNugcX0q1vDX3lZsm9L6UHXN2yhUb36sgreUvbGfwnE79tuR/eUnAukTWBfXau/
TWnyDiQn1CjZcfHB+YAPYZNyUHHqjoIJwzfJLwnsaHgFA80YcSwfSC9kcogCawK1
NCyfs29lAccWsrOul5iARJu8pLw1X//UfDEmVNrBD+1hveKYMrjjiQXnPoVVnNhc
J5T2q5S1QeO05+wf8WaZ7MbRNzHLj0A3gYHSVPWNclxFwsQjqCHHZS2qz8MTX+f6
W6/eZuvmMbG7
=w8QT
-----END PGP SIGNATURE-----
Merge tag 'pm-6.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull power management updates from Rafael Wysocki:
"The amd-pstate cpufreq driver gets the majority of changes this time.
They are mostly fixes and cleanups, but one of them causes it to
become the default cpufreq driver on some AMD server platforms.
Apart from that, the menu cpuidle governor is modified to not use
iowait any more, the intel_idle gets a custom C-states table for
Granite Rapids Xeon D, and the intel_pstate driver will use a more
aggressive Balance- performance default EPP value on Granite Rapids
now.
There are also some fixes, cleanups and tooling updates.
Specifics:
- Update the amd-pstate driver to set the initial scaling frequency
policy lower bound to be the lowest non-linear frequency (Dhananjay
Ugwekar)
- Enable amd-pstate by default on servers starting with newer AMD
Epyc processors (Swapnil Sapkal)
- Align more codepaths between shared memory and MSR designs in
amd-pstate (Dhananjay Ugwekar)
- Clean up amd-pstate code to rename functions and remove redundant
calls (Dhananjay Ugwekar, Mario Limonciello)
- Do other assorted fixes and cleanups in amd-pstate (Dhananjay
Ugwekar and Mario Limonciello)
- Change the Balance-performance EPP value for Granite Rapids in the
intel_pstate driver to a more performance-biased one (Srinivas
Pandruvada)
- Simplify MSR read on the boot CPU in the ACPI cpufreq driver (Chang
S. Bae)
- Ensure sugov_eas_rebuild_sd() is always called when sugov_init()
succeeds to always enforce sched domains rebuild in case EAS needs
to be enabled (Christian Loehle)
- Switch cpufreq back to platform_driver::remove() (Uwe Kleine-König)
- Use proper frequency unit names in cpufreq (Marcin Juszkiewicz)
- Add a built-in idle states table for Granite Rapids Xeon D to the
intel_idle driver (Artem Bityutskiy)
- Fix some typos in comments in the cpuidle core and drivers (Shen
Lichuan)
- Remove iowait influence from the menu cpuidle governor (Christian
Loehle)
- Add min/max available performance state limits to the Energy Model
management code (Lukasz Luba)
- Update pm-graph to v5.13 (Todd Brandt)
- Add documentation for some recently introduced cpupower utility
options (Tor Vic)
- Make cpupower inform users where cpufreq-bench.conf should be
located when opening it fails (Peng Fan)
- Allow overriding cross-compiling env params in cpupower (Peng Fan)
- Add compile_commands.json to .gitignore in cpupower (John B. Wyatt
IV)
- Improve disable c_state block in cpupower bindings and add a test
to confirm that CPU state is disabled to it (John B. Wyatt IV)
- Add Chinese Simplified translation to cpupower (Kieran Moy)
- Add checks for xgettext and msgfmt to cpupower (Siddharth Menon)"
* tag 'pm-6.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (38 commits)
cpufreq: intel_pstate: Update Balance-performance EPP for Granite Rapids
cpufreq: ACPI: Simplify MSR read on the boot CPU
sched/cpufreq: Ensure sd is rebuilt for EAS check
intel_idle: add Granite Rapids Xeon D support
PM: EM: Add min/max available performance state limits
cpufreq/amd-pstate: Move registration after static function call update
cpufreq/amd-pstate: Push adjust_perf vfunc init into cpu_init
cpufreq/amd-pstate: Align offline flow of shared memory and MSR based systems
cpufreq/amd-pstate: Call cppc_set_epp_perf in the reenable function
cpufreq/amd-pstate: Do not attempt to clear MSR_AMD_CPPC_ENABLE
cpufreq/amd-pstate: Rename functions that enable CPPC
cpufreq/amd-pstate-ut: Add fix for min freq unit test
amd-pstate: Switch to amd-pstate by default on some Server platforms
amd-pstate: Set min_perf to nominal_perf for active mode performance gov
cpufreq/amd-pstate: Remove the redundant amd_pstate_set_driver() call
cpufreq/amd-pstate: Remove the switch case in amd_pstate_init()
cpufreq/amd-pstate: Call amd_pstate_set_driver() in amd_pstate_register_driver()
cpufreq/amd-pstate: Call amd_pstate_register() in amd_pstate_init()
cpufreq/amd-pstate: Set the initial min_freq to lowest_nonlinear_freq
cpufreq/amd-pstate: Remove the redundant verify() function
...
|
||
|
|
8a7fa81137 |
Random number generator updates for Linux 6.13-rc1.
-----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEEq5lC5tSkz8NBJiCnSfxwEqXeA64FAmc6oE0ACgkQSfxwEqXe A65n5BAAtNmfBJhYRiC6Svsg7+ktHmhCAHoHwnP7sv+bjs81FRAEv21CsfI+02Nb zUvaPuyiLtYzlWxzE5Yg44v1cADHAq+QZE1Fg5yl7ge6zPZ3+S1pv/8suNSyyI2M PKvh1sb4OkUtqplveYSuP1J87u55zAtV9mP9qC3hSlY3XkeQUObt9Awss8peOMdv sH2AxwBlRkqFXpY2worxlfg3p5iLemb3AUZ3f0Jc6fRmOagSJCt7i4mDrWo3EXke 90Ao8ypY0x3YVGRFACHnxCS53X20HGwLxm7jdicfriMCzAJ6JQR6asO+NYnXR+Ev 9Za3UquVHP6HbQGWj6d1k5k2nF+IbkTHTgFBPRK/CY9ZpVbP04B2K7tE1gmT81wj AscRGi9RBVBPKAUguyi99MXYlprFG/ZTLOux3hvdarv5u0bP94eXmy1FrRM+IO0r u4BiQ39FlkDdtRxjzKfCiKkMrf3NmFEciZJhxCnflzmOBaj64r1hRt/ea8Bjxvp3 a4k0MfULmcEn2JwPiT1/Swz45ypZQc4OgbP87SCU8P0a23r21r2oK+9v3No/rCzB TI0fP6ykDTFQoiKUOSg1mJmkipdjeDyQ9E+0XIDsKd+T8Yv9rFoaV6RWoMrkt4AJ Yea9+V+XEI8F3SjhdD4OL/s3/+bjTjnRHDaXnJf2XzGmXcuvnbs= =o4ww -----END PGP SIGNATURE----- Merge tag 'random-6.13-rc1-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/crng/random Pull random number generator updates from Jason Donenfeld: "This contains a single series from Uros to replace uses of <linux/random.h> with prandom.h or other more specific headers as needed, in order to avoid a circular header issue. Uros' goal is to be able to use percpu.h from prandom.h, which will then allow him to define __percpu in percpu.h rather than in compiler_types.h" * tag 'random-6.13-rc1-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/crng/random: prandom: Include <linux/percpu.h> in <linux/prandom.h> random: Do not include <linux/prandom.h> in <linux/random.h> netem: Include <linux/prandom.h> in sch_netem.c lib/test_scanf: Include <linux/prandom.h> instead of <linux/random.h> lib/test_parman: Include <linux/prandom.h> instead of <linux/random.h> bpf/tests: Include <linux/prandom.h> instead of <linux/random.h> lib/rbtree-test: Include <linux/prandom.h> instead of <linux/random.h> random32: Include <linux/prandom.h> instead of <linux/random.h> kunit: string-stream-test: Include <linux/prandom.h> lib/interval_tree_test.c: Include <linux/prandom.h> instead of <linux/random.h> bpf: Include <linux/prandom.h> instead of <linux/random.h> scsi: libfcoe: Include <linux/prandom.h> instead of <linux/random.h> fscrypt: Include <linux/once.h> in fs/crypto/keyring.c mtd: tests: Include <linux/prandom.h> instead of <linux/random.h> media: vivid: Include <linux/prandom.h> in vivid-vid-cap.c drm/lib: Include <linux/prandom.h> instead of <linux/random.h> drm/i915/selftests: Include <linux/prandom.h> instead of <linux/random.h> crypto: testmgr: Include <linux/prandom.h> instead of <linux/random.h> x86/kaslr: Include <linux/prandom.h> instead of <linux/random.h> |
||
|
|
02b2f1a7b8 |
This update includes the following changes:
API:
- Add sig driver API.
- Remove signing/verification from akcipher API.
- Move crypto_simd_disabled_for_test to lib/crypto.
- Add WARN_ON for return values from driver that indicates memory corruption.
Algorithms:
- Provide crc32-arch and crc32c-arch through Crypto API.
- Optimise crc32c code size on x86.
- Optimise crct10dif on arm/arm64.
- Optimise p10-aes-gcm on powerpc.
- Optimise aegis128 on x86.
- Output full sample from test interface in jitter RNG.
- Retry without padata when it fails in pcrypt.
Drivers:
- Add support for Airoha EN7581 TRNG.
- Add support for STM32MP25x platforms in stm32.
- Enable iproc-r200 RNG driver on BCMBCA.
- Add Broadcom BCM74110 RNG driver.
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEn51F/lCuNhUwmDeSxycdCkmxi6cFAmc6sQsACgkQxycdCkmx
i6dfHxAAnkI65TE6agZq9DlkEU4ZqOsxxdk0MsGIhbCUTxW3KENzu9vtKjnvg9T/
Ou0d2J49ny87Y4zaA59Wf/Q1+gg5YSQR5kelonpfrPLkCkJjr72HZpyCHv8TTzEC
uHHoVj9cnPIF5/yfiqQsrWT1ACip9vn+slyVPaMJV1qR6gnvnSALtsg4e/vKHkn7
ZMaf2pZ2ROYXdB02nMK5KQcCrxD64MQle/yQepY44eYjnT+XclkqPdi6o1nUSpj/
RFAeY0jFSTu0pj3DqT48TnU/LiiNLlFOZrGjCdEySoac63vmTtKqfYDmrRaFz4hB
sucxbgJ3xnnYseRijtfXnxaD/IkDJln+ipGNQKAZLfOVMDCTxPdYGmOpobMTXMS+
0sY0eAHgqr23P9pOp+sOzcAEFIqg6llAYQVWx3Zl4vpXBUuxzg6AqmHnPicnck7y
Lw1cJhQxij2De3dG2ZL/0dgQxMjGN/YfCM8SSg6l+Xn3j4j47rqJNH2ZsmXtbJ2n
kTkmemmWdgRR1IvgQQGsvyKs9ThkcEDW+IzW26SUv3Clvru2NSkX4ZPHbezZQf+D
R0wMZsW3Fw7Zymerz1GIBSqdLnsyFWtIAjukDpOR6ordPgOBeDt76v6tw5vL2/II
KYoeN1pdEEecwuhAsEvCryT5ZG4noBeNirf/ElWAfEybgcXiTks=
=T8pa
-----END PGP SIGNATURE-----
Merge tag 'v6.13-p1' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
Pull crypto updates from Herbert Xu:
"API:
- Add sig driver API
- Remove signing/verification from akcipher API
- Move crypto_simd_disabled_for_test to lib/crypto
- Add WARN_ON for return values from driver that indicates memory
corruption
Algorithms:
- Provide crc32-arch and crc32c-arch through Crypto API
- Optimise crc32c code size on x86
- Optimise crct10dif on arm/arm64
- Optimise p10-aes-gcm on powerpc
- Optimise aegis128 on x86
- Output full sample from test interface in jitter RNG
- Retry without padata when it fails in pcrypt
Drivers:
- Add support for Airoha EN7581 TRNG
- Add support for STM32MP25x platforms in stm32
- Enable iproc-r200 RNG driver on BCMBCA
- Add Broadcom BCM74110 RNG driver"
* tag 'v6.13-p1' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (112 commits)
crypto: marvell/cesa - fix uninit value for struct mv_cesa_op_ctx
crypto: cavium - Fix an error handling path in cpt_ucode_load_fw()
crypto: aesni - Move back to module_init
crypto: lib/mpi - Export mpi_set_bit
crypto: aes-gcm-p10 - Use the correct bit to test for P10
hwrng: amd - remove reference to removed PPC_MAPLE config
crypto: arm/crct10dif - Implement plain NEON variant
crypto: arm/crct10dif - Macroify PMULL asm code
crypto: arm/crct10dif - Use existing mov_l macro instead of __adrl
crypto: arm64/crct10dif - Remove remaining 64x64 PMULL fallback code
crypto: arm64/crct10dif - Use faster 16x64 bit polynomial multiply
crypto: arm64/crct10dif - Remove obsolete chunking logic
crypto: bcm - add error check in the ahash_hmac_init function
crypto: caam - add error check to caam_rsa_set_priv_key_form
hwrng: bcm74110 - Add Broadcom BCM74110 RNG driver
dt-bindings: rng: add binding for BCM74110 RNG
padata: Clean up in padata_do_multithreaded()
crypto: inside-secure - Fix the return value of safexcel_xcbcmac_cra_init()
crypto: qat - Fix missing destroy_workqueue in adf_init_aer()
crypto: rsassa-pkcs1 - Reinstate support for legacy protocols
...
|
||
|
|
311e062ad5 |
CSD-lock diagnostic updates for v6.13
This commit switches from sched_clock() to ktime_get_mono_fast_ns(), which on x86 switches from the rdtsc instruction to the rdtscp instruction, thus avoiding instruction reorderings that cause false-positive reports of CSD-lock stalls of almost 2^46 nanoseconds. These false positives are rare, but really are seen in the wild. -----BEGIN PGP SIGNATURE----- iQJHBAABCgAxFiEEbK7UrM+RBIrCoViJnr8S83LZ+4wFAmc5XvETHHBhdWxtY2tA a2VybmVsLm9yZwAKCRCevxLzctn7jNUCD/9NqeuxsVcumybbjlHs/IbJt47qTPVk 1O+mpLiKfscw/ndfvqJe1RU+IOUJUPBPzBPUWvZQZ2SzeU03oOI4/szFttDdXSi3 0uI9qOJn3auk2+cdU7CxXOLSiWYEWlMjWvN6d34QeLh7smLkendxH2wo2fkL9kf0 DzvosOrlyNWGZPUQrb1TRW7RKGE7vap8x7tK/p1qMO2xmaPeIX7dfiY38CJC5fjj +n8i1aZIxLFc65I0/Z+nGTMFrktzbYjJik6k++QZzHx+GiXaCkgfidZFspj3uPXW CPa6KxheCrdmFV4A/TVnKYJyutoGeheMjwlVfz0YOSe8J5/N3F9RfDFBYedt2fL+ 11gRpOg5hz61AsyxZ1+iViW0guXoVzn2uwQ5rkou9184fBXPuwH1MAwBcsKYwQig Frd0ZzyrqGHCHwDWtBfAb+qC17b5krsa+fKkjiPFDRDRB5N2hh67tcquOE3wzvrG oAHEZgeFwxZQYGIZ7uITebyThe9NvkBRyrJLvxUEpvF2MoI0yJaqoAwkHieSl1vD KJJ+o+HxVa3D/WCWxTNCjDyxvCMJpFHFWB3h8+hi+X2UleRGrDiIDmIdddMM1/gr meYjZ/c1/t7Y14zwzp/SxUHFJ4U8U2jI23/K5ldJVH34k6XwccU06NYWjHYokXgT ZRMCJ8sAcTtlhw== =8GQJ -----END PGP SIGNATURE----- Merge tag 'csd-lock.2024.11.16a' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu Pull CSD-lock update from Paul McKenney: "This switches from sched_clock() to ktime_get_mono_fast_ns(), which on x86 switches from the rdtsc instruction to the rdtscp instruction, thus avoiding instruction reorderings that cause false-positive reports of CSD-lock stalls of almost 2^46 nanoseconds. These false positives are rare, but really are seen in the wild" * tag 'csd-lock.2024.11.16a' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu: locking/csd-lock: Switch from sched_clock() to ktime_get_mono_fast_ns() |
||
|
|
d7d4102f0a |
scftorture changes for v6.13
o Avoid divide operation. o Fix cleanup code waiting for IPI handlers. o Move memory allocations out of preempt-disable region of code for PREEMPT_RT compatibility. o Use a lockless list to avoid freeing memory while interrupts are disabled, again for PREEMPT_RT compatibility. o Make lockless list scf_add_to_free_list() correctly handle freeing a NULL pointer. -----BEGIN PGP SIGNATURE----- iQJHBAABCgAxFiEEbK7UrM+RBIrCoViJnr8S83LZ+4wFAmc5X0gTHHBhdWxtY2tA a2VybmVsLm9yZwAKCRCevxLzctn7jDVMEACQRdJ0NYxygGFpUzDj2Er2wdOtBG0E n1NOqmNX7nlBL8BzseCFa2OiVbvggE7+ynAGcqISzDLZGE6aa4/HwKLkxSGB62UV WMXNiJE+t4bb1TsdMwLcQnOmmDniy6ID0NIEA8YHEEZltuDNQGQfjB8ynJewwNmY yMU90JDwVvDVmM9+AXUqYYRAar1gR5k7jknQbnXqb+6xT/kMEu+B1z5BGiMB3Z5L LylobI+3OZTY417tgJU/iSeRZbLZn7Xs6pxOcJMpeFvvYMn4mkYaUX+WUOU9oTQd h91wGxRouTQpS41zGNI5HcqnTtevrnmtXNROyUkei1aipvnq8N9HR11UJDXWgSV4 24dH8qZVzTv+/cWIuNA3uUH+hu7kFZztQQQeIJdenm3CBtEYIK4ssrlyXUM7U5AY JQOjeEzApQLht++VTjGSS3CZhODLCTQU+IeQH1ChM1EZz2M9gsv9RqKfXrnFTDnO 6UrLNa2YCpvQCEeNj2i8TaFHZAInGTcNFHjhxd+kA4SsCDygi9PYxKq6xVadLVZs Kwj6kpgPpatQzZ5w7Il9RF+qTgpOnbqB52JFt3rGjQg8uALfDo5S85wurhvu6+GC Qy7XvDWhmUn8fZwvlRO+DABBWOYmXeAVHKxWA3VBxO3O454Pxx5IuVSW4213GFVz 58sAl0WwK8Jscg== =ElHE -----END PGP SIGNATURE----- Merge tag 'scftorture.2024.11.16a' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu Pull scftorture updates from Paul McKenney: - Avoid divide operation - Fix cleanup code waiting for IPI handlers - Move memory allocations out of preempt-disable region of code for PREEMPT_RT compatibility - Use a lockless list to avoid freeing memory while interrupts are disabled, again for PREEMPT_RT compatibility - Make lockless list scf_add_to_free_list() correctly handle freeing a NULL pointer * tag 'scftorture.2024.11.16a' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu: scftorture: Handle NULL argument passed to scf_add_to_free_list(). scftorture: Use a lock-less list to free memory. scftorture: Move memory allocation outside of preempt_disable region. scftorture: Wait until scf_cleanup_handler() completes. scftorture: Avoid additional div operation. |
||
|
|
ba1f9c8fe3 |
arm64 updates for 6.13:
* Support for running Linux in a protected VM under the Arm Confidential
Compute Architecture (CCA)
* Guarded Control Stack user-space support. Current patches follow the
x86 ABI of implicitly creating a shadow stack on clone(). Subsequent
patches (already on the list) will add support for clone3() allowing
finer-grained control of the shadow stack size and placement from libc
* AT_HWCAP3 support (not running out of HWCAP2 bits yet but we are
getting close with the upcoming dpISA support)
* Other arch features:
- In-kernel use of the memcpy instructions, FEAT_MOPS (previously only
exposed to user; uaccess support not merged yet)
- MTE: hugetlbfs support and the corresponding kselftests
- Optimise CRC32 using the PMULL instructions
- Support for FEAT_HAFT enabling ARCH_HAS_NONLEAF_PMD_YOUNG
- Optimise the kernel TLB flushing to use the range operations
- POE/pkey (permission overlays): further cleanups after bringing the
signal handler in line with the x86 behaviour for 6.12
* arm64 perf updates:
- Support for the NXP i.MX91 PMU in the existing IMX driver
- Support for Ampere SoCs in the Designware PCIe PMU driver
- Support for Marvell's 'PEM' PCIe PMU present in the 'Odyssey' SoC
- Support for Samsung's 'Mongoose' CPU PMU
- Support for PMUv3.9 finer-grained userspace counter access control
- Switch back to platform_driver::remove() now that it returns 'void'
- Add some missing events for the CXL PMU driver
* Miscellaneous arm64 fixes/cleanups:
- Page table accessors cleanup: type updates, drop unused macros,
reorganise arch_make_huge_pte() and clean up pte_mkcont(), sanity
check addresses before runtime P4D/PUD folding
- Command line override for ID_AA64MMFR0_EL1.ECV (advertising the
FEAT_ECV for the generic timers) allowing Linux to boot with
firmware deployments that don't set SCTLR_EL3.ECVEn
- ACPI/arm64: tighten the check for the array of platform timer
structures and adjust the error handling procedure in
gtdt_parse_timer_block()
- Optimise the cache flush for the uprobes xol slot (skip if no
change) and other uprobes/kprobes cleanups
- Fix the context switching of tpidrro_el0 when kpti is enabled
- Dynamic shadow call stack fixes
- Sysreg updates
- Various arm64 kselftest improvements
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEE5RElWfyWxS+3PLO2a9axLQDIXvEFAmc5POIACgkQa9axLQDI
XvEDYA//a3eeNkgMuGdnSCVcLz+zy+oNwAwboG/4X1DqL8jiCbI4npwugPx95RIA
YZOUvo9T2aL3OyefpUHll4gFHqx9OwoZIig2F70TEUmlPsGUbh0KBkdfQF3xZPdl
EwV0kHSGEqMWMBwsGJGwgCYrUaf1MUQzh1GBl7VJ2ts5XsJBaBeOyKkysij26wtZ
V+aHq2IUx7qQS7+HC/4P6IoHxKziFcsCMovaKaynP4cw9xXBQbDMcNlHEwndOMyk
pu2zrv7GG0j3KQuVP/2Alf5FKhmI0GVGP/6Nc/zsOmw96w8Kf7HfzEtkHawr2aRq
rqg/c9ivzDn1p+fUBo4ZYtrRk4IAY+yKu6hdzdLTP5+bQrBTWTO9rjQVBm9FAGYT
sCdEj1NqzvExvNHD7X6ut/GJ05lmce3K+qeSXSEysN9gqiT3eomYWMXrD2V2lxzb
rIDDcb/icfaqjt14Mksh19r/rzNeq7noj9CGSmcqw0BHZfHzl38Lai6pdfYzCNyn
vCM/c4c1D/WWX8/lifO1JZVbhDk1jy82Iphg2KEhL8iKPxDsKBBZLmYuU1oa7tMo
WryGAz9+GQwd+W9chFuaOEtMnzvW2scEJ5Eb2fEf0Qj0aEurkL+C9dZR6o1GN77V
DBUxtU628Ef4PJJGfbNCwZzdd8UPYG3a/mKfQQ3dz0oz2LySlW4=
=wDot
-----END PGP SIGNATURE-----
Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 updates from Catalin Marinas:
- Support for running Linux in a protected VM under the Arm
Confidential Compute Architecture (CCA)
- Guarded Control Stack user-space support. Current patches follow the
x86 ABI of implicitly creating a shadow stack on clone(). Subsequent
patches (already on the list) will add support for clone3() allowing
finer-grained control of the shadow stack size and placement from
libc
- AT_HWCAP3 support (not running out of HWCAP2 bits yet but we are
getting close with the upcoming dpISA support)
- Other arch features:
- In-kernel use of the memcpy instructions, FEAT_MOPS (previously
only exposed to user; uaccess support not merged yet)
- MTE: hugetlbfs support and the corresponding kselftests
- Optimise CRC32 using the PMULL instructions
- Support for FEAT_HAFT enabling ARCH_HAS_NONLEAF_PMD_YOUNG
- Optimise the kernel TLB flushing to use the range operations
- POE/pkey (permission overlays): further cleanups after bringing
the signal handler in line with the x86 behaviour for 6.12
- arm64 perf updates:
- Support for the NXP i.MX91 PMU in the existing IMX driver
- Support for Ampere SoCs in the Designware PCIe PMU driver
- Support for Marvell's 'PEM' PCIe PMU present in the 'Odyssey' SoC
- Support for Samsung's 'Mongoose' CPU PMU
- Support for PMUv3.9 finer-grained userspace counter access
control
- Switch back to platform_driver::remove() now that it returns
'void'
- Add some missing events for the CXL PMU driver
- Miscellaneous arm64 fixes/cleanups:
- Page table accessors cleanup: type updates, drop unused macros,
reorganise arch_make_huge_pte() and clean up pte_mkcont(), sanity
check addresses before runtime P4D/PUD folding
- Command line override for ID_AA64MMFR0_EL1.ECV (advertising the
FEAT_ECV for the generic timers) allowing Linux to boot with
firmware deployments that don't set SCTLR_EL3.ECVEn
- ACPI/arm64: tighten the check for the array of platform timer
structures and adjust the error handling procedure in
gtdt_parse_timer_block()
- Optimise the cache flush for the uprobes xol slot (skip if no
change) and other uprobes/kprobes cleanups
- Fix the context switching of tpidrro_el0 when kpti is enabled
- Dynamic shadow call stack fixes
- Sysreg updates
- Various arm64 kselftest improvements
* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (168 commits)
arm64: tls: Fix context-switching of tpidrro_el0 when kpti is enabled
kselftest/arm64: Try harder to generate different keys during PAC tests
kselftest/arm64: Don't leak pipe fds in pac.exec_sign_all()
arm64/ptrace: Clarify documentation of VL configuration via ptrace
kselftest/arm64: Corrupt P0 in the irritator when testing SSVE
acpi/arm64: remove unnecessary cast
arm64/mm: Change protval as 'pteval_t' in map_range()
kselftest/arm64: Fix missing printf() argument in gcs/gcs-stress.c
kselftest/arm64: Add FPMR coverage to fp-ptrace
kselftest/arm64: Expand the set of ZA writes fp-ptrace does
kselftets/arm64: Use flag bits for features in fp-ptrace assembler code
kselftest/arm64: Enable build of PAC tests with LLVM=1
kselftest/arm64: Check that SVCR is 0 in signal handlers
selftests/mm: Fix unused function warning for aarch64_write_signal_pkey()
kselftest/arm64: Fix printf() compiler warnings in the arm64 syscall-abi.c tests
kselftest/arm64: Fix printf() warning in the arm64 MTE prctl() test
kselftest/arm64: Fix printf() compiler warnings in the arm64 fp tests
kselftest/arm64: Fix build with stricter assemblers
arm64/scs: Drop unused prototype __pi_scs_patch_vmlinux()
arm64/scs: Deal with 64-bit relative offsets in FDE frames
...
|
||
|
|
5591fd5e03 |
lsm/stable-6.13 PR 20241112
-----BEGIN PGP SIGNATURE----- iQJIBAABCAAyFiEES0KozwfymdVUl37v6iDy2pc3iXMFAmcztFcUHHBhdWxAcGF1 bC1tb29yZS5jb20ACgkQ6iDy2pc3iXPvFQ/+KYwRe3g6gFSu7tRA34okHtUopvpF KGAaic06c8oy85gSX4B2Xk4HINCgXVUuRi9Z+0yExRWvvBXRRdQRUj1Vdbj4KOEG sRsIA1j1YhPU3wyhkAqwpJ97sQE1v9Xb3xizGwTfQKGQkd+cvtHg0QKM08/jPQYq bbbcSxoVsUzh8+idAq1UMfdoTsMh2xeCW7Q1+dbBINJykNzKiqEEc21xgBxeomST lSG9XFP3BJr1RBlb4Ux+J8YL+2G/rDBWZh1sR5+t31kgClSgs3CMBRFdTATvplKk e9vrcUF8wR7xWWnDmmdobHa462qUt6BWifYarX9RTomGBugZfYDOR/C+jpb+xZwd +tZfL6HSOVeBtQ/Zu1bs18eS5i2dj7GxFN7GPY2qXIPvsW5Acwcx1CCK6oNDmX05 1cOaNuZRYBDye4eAnT3yufnJ34VO80UQIfKTE6dqrX0XtCFYomTxb+Km0qM3utl5 ubr3Krp6GmVs65lIvtnIhDKSlcNIBbJfH64vdQNnOn/8FvkovGqp2eaX+0wBhROM 8KgbqntXU4/DgQuDiP01g13mTDeTGdcfyRWKcKMI/CzI/WASPZBpVuqX6xWXh3bs NlZmJ/7+Y48Xp2FvaEchQ/A8ppyIrigMLloZ8yAHf2P1z9g6wBNRCrsScdSQVx63 ArxHLRY44pUOnPs= =m/yY -----END PGP SIGNATURE----- Merge tag 'lsm-pr-20241112' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/lsm Pull lsm updates from Paul Moore: "Thirteen patches, all focused on moving away from the current 'secid' LSM identifier to a richer 'lsm_prop' structure. This move will help reduce the translation that is necessary in many LSMs, offering better performance, and make it easier to support different LSMs in the future" * tag 'lsm-pr-20241112' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/lsm: lsm: remove lsm_prop scaffolding netlabel,smack: use lsm_prop for audit data audit: change context data from secid to lsm_prop lsm: create new security_cred_getlsmprop LSM hook audit: use an lsm_prop in audit_names lsm: use lsm_prop in security_inode_getsecid lsm: use lsm_prop in security_current_getsecid audit: update shutdown LSM data lsm: use lsm_prop in security_ipc_getsecid audit: maintain an lsm_prop in audit_context lsm: add lsmprop_to_secctx hook lsm: use lsm_prop in security_audit_rule_match lsm: add the lsm_prop data structure |
||
|
|
a8220b0ca7 |
audit/stable-6.13 PR 20241112
-----BEGIN PGP SIGNATURE----- iQJIBAABCAAyFiEES0KozwfymdVUl37v6iDy2pc3iXMFAmcztDIUHHBhdWxAcGF1 bC1tb29yZS5jb20ACgkQ6iDy2pc3iXOI6g//dAY0z6TVYzGWSbsSim+ZBDycMAjA AVwQONdQTkQBO9MEw6C7HIQeECQn+jm52gTDSpRxeUfJBbO/KTPbm3e0TQ0vGT1A ED7QW2u3BkEM/8mMY8UOPPx+PWO7yb08gMZd+WSKGuhL34Ypsa1zm2Pf5hjiX7S+ eGXJ/5IMaCQcCevR0EpMz8T1VgidJRRhl0HfaNALt4FR+4Ppsn6upMQtOZ9mmr7Q IQpL0ZlOJiSjoYRpOmNfGM94ikS+H8b7OC0EjJzRyetw7laaHqmM/OtLWqlgyOdZ B2oZ3q0J79wBOJMZxHf09rodNmhl686nHeDPOnpGKahjsNON7LFua13b+UqHzHHE QlMdquZpO2QNaXxfN+H9S8VOe7rcGfLO1yElhP+ydpfX4DHHUGSv22Gu1jmAmR8V Uyem7zZWTAkcK0zx0w9MjNN+IgD2uI+r175eL/jfOZUFqYnG2696KEBksnd8k4Vr fP/99MGn+juX8zhMTUUxcNfcYISPwJjLAT1mA/conhXHD8SSACFK963cGjuo8snI 1QA1qMfW4sEMphTBiit4fDd2+2V6MPCR+uMovvzqTXOC1tO8FuSkJWmcXQbY0LkT JOQsVrp7XqZDecvmERYn2EAmO68XlVOJD+Bx8i4b4W3u5hlp5uG3WAv/QEDfVVdf y+gs16bgWb9fyS0= =SOyQ -----END PGP SIGNATURE----- Merge tag 'audit-pr-20241112' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/audit Pull audit updates from Paul Moore: "The audit patches are minimal this time around with one patch to correct some kdoc function parameters and one to leverage the `str_yes_no()` function; nothing very exciting" * tag 'audit-pr-20241112' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/audit: audit: Use str_yes_no() helper function audit: Reorganize kerneldoc parameter names |
||
|
|
0f25f0e4ef |
the bulk of struct fd memory safety stuff
Making sure that struct fd instances are destroyed in the same
scope where they'd been created, getting rid of reassignments
and passing them by reference, converting to CLASS(fd{,_pos,_raw}).
We are getting very close to having the memory safety of that stuff
trivial to verify.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
-----BEGIN PGP SIGNATURE-----
iHUEABYIAB0WIQQqUNBr3gm4hGXdBJlZ7Krx/gZQ6wUCZzdikAAKCRBZ7Krx/gZQ
69nJAQCmbQHK3TGUbQhOw6MJXOK9ezpyEDN3FZb4jsu38vTIdgEA6OxAYDO2m2g9
CN18glYmD3wRyU6Bwl4vGODouSJvDgA=
=gVH3
-----END PGP SIGNATURE-----
Merge tag 'pull-fd' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull 'struct fd' class updates from Al Viro:
"The bulk of struct fd memory safety stuff
Making sure that struct fd instances are destroyed in the same scope
where they'd been created, getting rid of reassignments and passing
them by reference, converting to CLASS(fd{,_pos,_raw}).
We are getting very close to having the memory safety of that stuff
trivial to verify"
* tag 'pull-fd' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (28 commits)
deal with the last remaing boolean uses of fd_file()
css_set_fork(): switch to CLASS(fd_raw, ...)
memcg_write_event_control(): switch to CLASS(fd)
assorted variants of irqfd setup: convert to CLASS(fd)
do_pollfd(): convert to CLASS(fd)
convert do_select()
convert vfs_dedupe_file_range().
convert cifs_ioctl_copychunk()
convert media_request_get_by_fd()
convert spu_run(2)
switch spufs_calls_{get,put}() to CLASS() use
convert cachestat(2)
convert do_preadv()/do_pwritev()
fdget(), more trivial conversions
fdget(), trivial conversions
privcmd_ioeventfd_assign(): don't open-code eventfd_ctx_fdget()
o2hb_region_dev_store(): avoid goto around fdget()/fdput()
introduce "fd_pos" class, convert fdget_pos() users to it.
fdget_raw() users: switch to CLASS(fd_raw)
convert vmsplice() to CLASS(fd)
...
|
||
|
|
a5ca574796 |
vfs-6.13.usercopy
-----BEGIN PGP SIGNATURE----- iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCZzchMwAKCRCRxhvAZXjc okICAP4h6tDl7dgTv8GkL0tgaHi/36m+ilctXbEtIe9fbkc/fQD8D5t6jYaz47gu zVY7qOrtQOQ/diNavzxyky99Uh3dKgo= =lwkw -----END PGP SIGNATURE----- Merge tag 'vfs-6.13.usercopy' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs Pull copy_struct_to_user helper from Christian Brauner: "This adds a copy_struct_to_user() helper which is a companion helper to the already widely used copy_struct_from_user(). It copies a struct from kernel space to userspace, in a way that guarantees backwards-compatibility for struct syscall arguments as long as future struct extensions are made such that all new fields are appended to the old struct, and zeroed-out new fields have the same meaning as the old struct. The first user is sched_getattr() system call but the new extensible pidfs ioctl will be ported to it as well" * tag 'vfs-6.13.usercopy' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: sched_getattr: port to copy_struct_to_user uaccess: add copy_struct_to_user helper |
||
|
|
4c797b11a8 |
vfs-6.13.file
-----BEGIN PGP SIGNATURE-----
iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCZzcW4gAKCRCRxhvAZXjc
okF+AP9xTMb2SlnRPBOBd9yFcmVXmQi86TSCUPAEVb+wIldGYwD/RIOdvXYJlp9v
RgJkU1DC3ddkXtONNDY6gFaP+siIWA0=
=gMc7
-----END PGP SIGNATURE-----
Merge tag 'vfs-6.13.file' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs
Pull vfs file updates from Christian Brauner:
"This contains changes the changes for files for this cycle:
- Introduce a new reference counting mechanism for files.
As atomic_inc_not_zero() is implemented with a try_cmpxchg() loop
it has O(N^2) behaviour under contention with N concurrent
operations and it is in a hot path in __fget_files_rcu().
The rcuref infrastructures remedies this problem by using an
unconditional increment relying on safe- and dead zones to make
this work and requiring rcu protection for the data structure in
question. This not just scales better it also introduces overflow
protection.
However, in contrast to generic rcuref, files require a memory
barrier and thus cannot rely on *_relaxed() atomic operations and
also require to be built on atomic_long_t as having massive amounts
of reference isn't unheard of even if it is just an attack.
This adds a file specific variant instead of making this a generic
library.
This has been tested by various people and it gives consistent
improvement up to 3-5% on workloads with loads of threads.
- Add a fastpath for find_next_zero_bit(). Skip 2-levels searching
via find_next_zero_bit() when there is a free slot in the word that
contains the next fd. This improves pts/blogbench-1.1.0 read by 8%
and write by 4% on Intel ICX 160.
- Conditionally clear full_fds_bits since it's very likely that a bit
in full_fds_bits has been cleared during __clear_open_fds(). This
improves pts/blogbench-1.1.0 read up to 13%, and write up to 5% on
Intel ICX 160.
- Get rid of all lookup_*_fdget_rcu() variants. They were used to
lookup files without taking a reference count. That became invalid
once files were switched to SLAB_TYPESAFE_BY_RCU and now we're
always taking a reference count. Switch to an already existing
helper and remove the legacy variants.
- Remove pointless includes of <linux/fdtable.h>.
- Avoid cmpxchg() in close_files() as nobody else has a reference to
the files_struct at that point.
- Move close_range() into fs/file.c and fold __close_range() into it.
- Cleanup calling conventions of alloc_fdtable() and expand_files().
- Merge __{set,clear}_close_on_exec() into one.
- Make __set_open_fd() set cloexec as well instead of doing it in two
separate steps"
* tag 'vfs-6.13.file' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
selftests: add file SLAB_TYPESAFE_BY_RCU recycling stressor
fs: port files to file_ref
fs: add file_ref
expand_files(): simplify calling conventions
make __set_open_fd() set cloexec state as well
fs: protect backing files with rcu
file.c: merge __{set,clear}_close_on_exec()
alloc_fdtable(): change calling conventions.
fs/file.c: add fast path in find_next_fd()
fs/file.c: conditionally clear full_fds
fs/file.c: remove sanity_check and add likely/unlikely in alloc_fd()
move close_range(2) into fs/file.c, fold __close_range() into it
close_files(): don't bother with xchg()
remove pointless includes of <linux/fdtable.h>
get rid of ...lookup...fdget_rcu() family
|
||
|
|
6ac81fd55e |
vfs-6.13.mgtime
-----BEGIN PGP SIGNATURE-----
iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCZzcScQAKCRCRxhvAZXjc
oj+5AP4k822a77wc/3iPFk379naIvQ4dsrgemh0/Pb6ZvzvkFQEAi3vFCfzCDR2x
SkJF/RwXXKZv6U31QXMRt2Qo6wfBuAc=
=nVlm
-----END PGP SIGNATURE-----
Merge tag 'vfs-6.13.mgtime' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs
Pull vfs multigrain timestamps from Christian Brauner:
"This is another try at implementing multigrain timestamps. This time
with significant help from the timekeeping maintainers to reduce the
performance impact.
Thomas provided a base branch that contains the required timekeeping
interfaces for the VFS. It serves as the base for the multi-grain
timestamp work:
- Multigrain timestamps allow the kernel to use fine-grained
timestamps when an inode's attributes is being actively observed
via ->getattr(). With this support, it's possible for a file to get
a fine-grained timestamp, and another modified after it to get a
coarse-grained stamp that is earlier than the fine-grained time. If
this happens then the files can appear to have been modified in
reverse order, which breaks VFS ordering guarantees.
To prevent this, a floor value is maintained for multigrain
timestamps. Whenever a fine-grained timestamp is handed out, record
it, and when later coarse-grained stamps are handed out, ensure
they are not earlier than that value. If the coarse-grained
timestamp is earlier than the fine-grained floor, return the floor
value instead.
The timekeeper changes add a static singleton atomic64_t into
timekeeper.c that is used to keep track of the latest fine-grained
time ever handed out. This is tracked as a monotonic ktime_t value
to ensure that it isn't affected by clock jumps. Because it is
updated at different times than the rest of the timekeeper object,
the floor value is managed independently of the timekeeper via a
cmpxchg() operation, and sits on its own cacheline.
Two new public timekeeper interfaces are added:
(1) ktime_get_coarse_real_ts64_mg() fills a timespec64 with the
later of the coarse-grained clock and the floor time
(2) ktime_get_real_ts64_mg() gets the fine-grained clock value,
and tries to swap it into the floor. A timespec64 is filled
with the result.
- The VFS has always used coarse-grained timestamps when updating the
ctime and mtime after a change. This has the benefit of allowing
filesystems to optimize away a lot metadata updates, down to around
1 per jiffy, even when a file is under heavy writes.
Unfortunately, this has always been an issue when we're exporting
via NFSv3, which relies on timestamps to validate caches. A lot of
changes can happen in a jiffy, so timestamps aren't sufficient to
help the client decide when to invalidate the cache. Even with
NFSv4, a lot of exported filesystems don't properly support a
change attribute and are subject to the same problems with
timestamp granularity. Other applications have similar issues with
timestamps (e.g backup applications).
If we were to always use fine-grained timestamps, that would
improve the situation, but that becomes rather expensive, as the
underlying filesystem would have to log a lot more metadata
updates.
This adds a way to only use fine-grained timestamps when they are
being actively queried. Use the (unused) top bit in
inode->i_ctime_nsec as a flag that indicates whether the current
timestamps have been queried via stat() or the like. When it's set,
we allow the kernel to use a fine-grained timestamp iff it's
necessary to make the ctime show a different value.
This solves the problem of being able to distinguish the timestamp
between updates, but introduces a new problem: it's now possible
for a file being changed to get a fine-grained timestamp. A file
that is altered just a bit later can then get a coarse-grained one
that appears older than the earlier fine-grained time. This
violates timestamp ordering guarantees.
This is where the earlier mentioned timkeeping interfaces help. A
global monotonic atomic64_t value is kept that acts as a timestamp
floor. When we go to stamp a file, we first get the latter of the
current floor value and the current coarse-grained time. If the
inode ctime hasn't been queried then we just attempt to stamp it
with that value.
If it has been queried, then first see whether the current coarse
time is later than the existing ctime. If it is, then we accept
that value. If it isn't, then we get a fine-grained time and try to
swap that into the global floor. Whether that succeeds or fails, we
take the resulting floor time, convert it to realtime and try to
swap that into the ctime.
We take the result of the ctime swap whether it succeeds or fails,
since either is just as valid.
Filesystems can opt into this by setting the FS_MGTIME fstype flag.
Others should be unaffected (other than being subject to the same
floor value as multigrain filesystems)"
* tag 'vfs-6.13.mgtime' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
fs: reduce pointer chasing in is_mgtime() test
tmpfs: add support for multigrain timestamps
btrfs: convert to multigrain timestamps
ext4: switch to multigrain timestamps
xfs: switch to multigrain timestamps
Documentation: add a new file documenting multigrain timestamps
fs: add percpu counters for significant multigrain timestamp events
fs: tracepoints around multigrain timestamp events
fs: handle delegated timestamps in setattr_copy_mgtime
timekeeping: Add percpu counter for tracking floor swap events
timekeeping: Add interfaces for handling timestamps with a floor value
fs: have setattr_copy handle multigrain timestamps appropriately
fs: add infrastructure for multigrain timestamps
|
||
|
|
4a5df37964 |
10 hotfixes, 7 of which are cc:stable. All singletons, please see the
changelogs for details. -----BEGIN PGP SIGNATURE----- iHUEABYIAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCZzkr6AAKCRDdBJ7gKXxA jsb2AP9HCOI4w9rQTmBdnaefXytS7fiiPq+LVNpjJ0NGXX2FSgD/e1NM0wi8KevQ npcvlqTcXtRSJvYNF904aTNyDn+Kuw0= =KFGY -----END PGP SIGNATURE----- Merge tag 'mm-hotfixes-stable-2024-11-16-15-33' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull hotfixes from Andrew Morton: "10 hotfixes, 7 of which are cc:stable. All singletons, please see the changelogs for details" * tag 'mm-hotfixes-stable-2024-11-16-15-33' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: mm: revert "mm: shmem: fix data-race in shmem_getattr()" ocfs2: uncache inode which has failed entering the group mm: fix NULL pointer dereference in alloc_pages_bulk_noprof mm, doc: update read_ahead_kb for MADV_HUGEPAGE fs/proc/task_mmu: prevent integer overflow in pagemap_scan_get_args() sched/task_stack: fix object_is_on_stack() for KASAN tagged pointers crash, powerpc: default to CRASH_DUMP=n on PPC_BOOK3S_32 mm/mremap: fix address wraparound in move_page_tables() tools/mm: fix compile error mm, swap: fix allocation and scanning race with swapoff |
||
|
|
b5a24181e4 |
Ring buffer fixes for 6.12:
- Revert: "ring-buffer: Do not have boot mapped buffers hook to CPU hotplug" A crash that happened on cpu hotplug was actually caused by the incorrect ref counting that was fixed by commit |
||
|
|
923c256e37 |
Merge branches 'pm-cpuidle' and 'pm-em'
Merge cpuidle and Energy Model changes for 6.13-rc1: - Add a built-in idle states table for Granite Rapids Xeon D to the intel_idle driver (Artem Bityutskiy). - Fix some typos in comments in the cpuidle core and drivers (Shen Lichuan). - Remove iowait influence from the menu cpuidle governor (Christian Loehle). - Add min/max available performance state limits to the Energy Model management code (Lukasz Luba). * pm-cpuidle: intel_idle: add Granite Rapids Xeon D support cpuidle: Correct some typos in comments cpuidle: menu: Remove iowait influence * pm-em: PM: EM: Add min/max available performance state limits |
||
|
|
d79944b094 |
sched_ext: One more fix for v6.12-rc7
ops.cpu_acquire() was being invoked with the wrong kfunc mask allowing the operation to call kfuncs which shouldn't be allowed. Fix it by using SCX_KF_REST instead, which is trivial and low risk. -----BEGIN PGP SIGNATURE----- iIQEABYKACwWIQTfIjM1kS57o3GsC/uxYfJx3gVYGQUCZzamXw4cdGpAa2VybmVs Lm9yZwAKCRCxYfJx3gVYGRReAP4/JQ1mKkJv+9nTZkW9OcFFHGVVhrprOUEEFk5j pmHwPAD8DTBMMS/BCQOoXDdiB9uU7ut6M8VdsIj1jmJkMja+eQI= =942J -----END PGP SIGNATURE----- Merge tag 'sched_ext-for-6.12-rc7-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/sched_ext Pull sched_ext fix from Tejun Heo: "One more fix for v6.12-rc7 ops.cpu_acquire() was being invoked with the wrong kfunc mask allowing the operation to call kfuncs which shouldn't be allowed. Fix it by using SCX_KF_REST instead, which is trivial and low risk" * tag 'sched_ext-for-6.12-rc7-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/sched_ext: sched_ext: ops.cpu_acquire() should be called with SCX_KF_REST |
||
|
|
31daa34315 |
crash, powerpc: default to CRASH_DUMP=n on PPC_BOOK3S_32
Fixes boot failures on 6.9 on PPC_BOOK3S_32 machines using Open Firmware.
On these machines, the kernel refuses to boot from non-zero
PHYSICAL_START, which occurs when CRASH_DUMP is on.
Since most PPC_BOOK3S_32 machines boot via Open Firmware, it should
default to off for them. Users booting via some other mechanism can still
turn it on explicitly.
Does not change the default on any other architectures for the
time being.
Link: https://lkml.kernel.org/r/20240917163720.1644584-1-dave@vasilevsky.ca
Fixes:
|
||
|
|
f946cae86d |
scftorture: Handle NULL argument passed to scf_add_to_free_list().
Dan reported that after the rework the newly introduced
scf_add_to_free_list() may get a NULL pointer passed. This replaced
kfree() which was fine with a NULL pointer but scf_add_to_free_list()
isn't.
Let scf_add_to_free_list() handle NULL pointer.
Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
Closes: https://lore.kernel.org/all/2375aa2c-3248-4ffa-b9b0-f0a24c50f237@stanley.mountain
Fixes:
|
||
|
|
a4af89cc50 |
sched_ext: ops.cpu_acquire() should be called with SCX_KF_REST
ops.cpu_acquire() is currently called with 0 kf_maks which is interpreted as
SCX_KF_UNLOCKED which allows all unlocked kfuncs, but ops.cpu_acquire() is
called from balance_one() under the rq lock and should only be allowed call
kfuncs that are safe under the rq lock. Update it to use SCX_KF_REST.
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: David Vernet <void@manifault.com>
Cc: Zhao Mengmeng <zhaomzhao@126.com>
Link: http://lkml.kernel.org/r/ZzYvf2L3rlmjuKzh@slm.duckdns.org
Fixes:
|
||
|
|
09663753bb |
tracing/ring-buffer: Clear all memory mapped CPU ring buffers on first recording
The events of a memory mapped ring buffer from the previous boot should
not be mixed in with events from the current boot. There's meta data that
is used to handle KASLR so that function names can be shown properly.
Also, since the timestamps of the previous boot have no meaning to the
timestamps of the current boot, having them intermingled in a buffer can
also cause confusion because there could possibly be events in the future.
When a trace is activated the meta data is reset so that the pointers of
are now processed for the new address space. The trace buffers are reset
when tracing starts for the first time. The problem here is that the reset
only happens on online CPUs. If a CPU is offline, it does not get reset.
To demonstrate the issue, a previous boot had tracing enabled in the boot
mapped ring buffer on reboot. On the following boot, tracing has not been
started yet so the function trace from the previous boot is still visible.
# trace-cmd show -B boot_mapped -c 3 | tail
<idle>-0 [003] d.h2. 156.462395: __rcu_read_lock <-cpu_emergency_disable_virtualization
<idle>-0 [003] d.h2. 156.462396: vmx_emergency_disable_virtualization_cpu <-cpu_emergency_disable_virtualization
<idle>-0 [003] d.h2. 156.462396: __rcu_read_unlock <-__sysvec_reboot
<idle>-0 [003] d.h2. 156.462397: stop_this_cpu <-__sysvec_reboot
<idle>-0 [003] d.h2. 156.462397: set_cpu_online <-stop_this_cpu
<idle>-0 [003] d.h2. 156.462397: disable_local_APIC <-stop_this_cpu
<idle>-0 [003] d.h2. 156.462398: clear_local_APIC <-disable_local_APIC
<idle>-0 [003] d.h2. 156.462574: mcheck_cpu_clear <-stop_this_cpu
<idle>-0 [003] d.h2. 156.462575: mce_intel_feature_clear <-stop_this_cpu
<idle>-0 [003] d.h2. 156.462575: lmce_supported <-mce_intel_feature_clear
Now, if CPU 3 is taken offline, and tracing is started on the memory
mapped ring buffer, the events from the previous boot in the CPU 3 ring
buffer is not reset. Now those events are using the meta data from the
current boot and produces just hex values.
# echo 0 > /sys/devices/system/cpu/cpu3/online
# trace-cmd start -B boot_mapped -p function
# trace-cmd show -B boot_mapped -c 3 | tail
<idle>-0 [003] d.h2. 156.462395: 0xffffffff9a1e3194 <-0xffffffff9a0f655e
<idle>-0 [003] d.h2. 156.462396: 0xffffffff9a0a1d24 <-0xffffffff9a0f656f
<idle>-0 [003] d.h2. 156.462396: 0xffffffff9a1e6bc4 <-0xffffffff9a0f7323
<idle>-0 [003] d.h2. 156.462397: 0xffffffff9a0d12b4 <-0xffffffff9a0f732a
<idle>-0 [003] d.h2. 156.462397: 0xffffffff9a1458d4 <-0xffffffff9a0d12e2
<idle>-0 [003] d.h2. 156.462397: 0xffffffff9a0faed4 <-0xffffffff9a0d12e7
<idle>-0 [003] d.h2. 156.462398: 0xffffffff9a0faaf4 <-0xffffffff9a0faef2
<idle>-0 [003] d.h2. 156.462574: 0xffffffff9a0e3444 <-0xffffffff9a0d12ef
<idle>-0 [003] d.h2. 156.462575: 0xffffffff9a0e4964 <-0xffffffff9a0d12ef
<idle>-0 [003] d.h2. 156.462575: 0xffffffff9a0e3fb0 <-0xffffffff9a0e496f
Reset all CPUs when starting a boot mapped ring buffer for the first time,
and not just the online CPUs.
Fixes:
|
||
|
|
580bb355bc |
Revert: "ring-buffer: Do not have boot mapped buffers hook to CPU hotplug"
A crash happened when testing cpu hotplug with respect to the memory mapped ring buffers. It was assumed that the hot plug code was adding a per CPU buffer that was already created that caused the crash. The real problem was due to ref counting and was fixed by commit |
||
|
|
5a4332062e |
Merge branches 'for-next/gcs', 'for-next/probes', 'for-next/asm-offsets', 'for-next/tlb', 'for-next/misc', 'for-next/mte', 'for-next/sysreg', 'for-next/stacktrace', 'for-next/hwcap3', 'for-next/kselftest', 'for-next/crc32', 'for-next/guest-cca', 'for-next/haft' and 'for-next/scs', remote-tracking branch 'arm64/for-next/perf' into for-next/core
* arm64/for-next/perf:
perf: Switch back to struct platform_driver::remove()
perf: arm_pmuv3: Add support for Samsung Mongoose PMU
dt-bindings: arm: pmu: Add Samsung Mongoose core compatible
perf/dwc_pcie: Fix typos in event names
perf/dwc_pcie: Add support for Ampere SoCs
ARM: pmuv3: Add missing write_pmuacr()
perf/marvell: Marvell PEM performance monitor support
perf/arm_pmuv3: Add PMUv3.9 per counter EL0 access control
perf/dwc_pcie: Convert the events with mixed case to lowercase
perf/cxlpmu: Support missing events in 3.1 spec
perf: imx_perf: add support for i.MX91 platform
dt-bindings: perf: fsl-imx-ddr: Add i.MX91 compatible
drivers perf: remove unused field pmu_node
* for-next/gcs: (42 commits)
: arm64 Guarded Control Stack user-space support
kselftest/arm64: Fix missing printf() argument in gcs/gcs-stress.c
arm64/gcs: Fix outdated ptrace documentation
kselftest/arm64: Ensure stable names for GCS stress test results
kselftest/arm64: Validate that GCS push and write permissions work
kselftest/arm64: Enable GCS for the FP stress tests
kselftest/arm64: Add a GCS stress test
kselftest/arm64: Add GCS signal tests
kselftest/arm64: Add test coverage for GCS mode locking
kselftest/arm64: Add a GCS test program built with the system libc
kselftest/arm64: Add very basic GCS test program
kselftest/arm64: Always run signals tests with GCS enabled
kselftest/arm64: Allow signals tests to specify an expected si_code
kselftest/arm64: Add framework support for GCS to signal handling tests
kselftest/arm64: Add GCS as a detected feature in the signal tests
kselftest/arm64: Verify the GCS hwcap
arm64: Add Kconfig for Guarded Control Stack (GCS)
arm64/ptrace: Expose GCS via ptrace and core files
arm64/signal: Expose GCS state in signal frames
arm64/signal: Set up and restore the GCS context for signal handlers
arm64/mm: Implement map_shadow_stack()
...
* for-next/probes:
: Various arm64 uprobes/kprobes cleanups
arm64: insn: Simulate nop instruction for better uprobe performance
arm64: probes: Remove probe_opcode_t
arm64: probes: Cleanup kprobes endianness conversions
arm64: probes: Move kprobes-specific fields
arm64: probes: Fix uprobes for big-endian kernels
arm64: probes: Fix simulate_ldr*_literal()
arm64: probes: Remove broken LDR (literal) uprobe support
* for-next/asm-offsets:
: arm64 asm-offsets.c cleanup (remove unused offsets)
arm64: asm-offsets: remove PREEMPT_DISABLE_OFFSET
arm64: asm-offsets: remove DMA_{TO,FROM}_DEVICE
arm64: asm-offsets: remove VM_EXEC and PAGE_SZ
arm64: asm-offsets: remove MM_CONTEXT_ID
arm64: asm-offsets: remove COMPAT_{RT_,SIGFRAME_REGS_OFFSET
arm64: asm-offsets: remove VMA_VM_*
arm64: asm-offsets: remove TSK_ACTIVE_MM
* for-next/tlb:
: TLB flushing optimisations
arm64: optimize flush tlb kernel range
arm64: tlbflush: add __flush_tlb_range_limit_excess()
* for-next/misc:
: Miscellaneous patches
arm64: tls: Fix context-switching of tpidrro_el0 when kpti is enabled
arm64/ptrace: Clarify documentation of VL configuration via ptrace
acpi/arm64: remove unnecessary cast
arm64/mm: Change protval as 'pteval_t' in map_range()
arm64: uprobes: Optimize cache flushes for xol slot
acpi/arm64: Adjust error handling procedure in gtdt_parse_timer_block()
arm64: fix .data.rel.ro size assertion when CONFIG_LTO_CLANG
arm64/ptdump: Test both PTE_TABLE_BIT and PTE_VALID for block mappings
arm64/mm: Sanity check PTE address before runtime P4D/PUD folding
arm64/mm: Drop setting PTE_TYPE_PAGE in pte_mkcont()
ACPI: GTDT: Tighten the check for the array of platform timer structures
arm64/fpsimd: Fix a typo
arm64: Expose ID_AA64ISAR1_EL1.XS to sanitised feature consumers
arm64: Return early when break handler is found on linked-list
arm64/mm: Re-organize arch_make_huge_pte()
arm64/mm: Drop _PROT_SECT_DEFAULT
arm64: Add command-line override for ID_AA64MMFR0_EL1.ECV
arm64: head: Drop SWAPPER_TABLE_SHIFT
arm64: cpufeature: add POE to cpucap_is_possible()
arm64/mm: Change pgattr_change_is_safe() arguments as pteval_t
* for-next/mte:
: Various MTE improvements
selftests: arm64: add hugetlb mte tests
hugetlb: arm64: add mte support
* for-next/sysreg:
: arm64 sysreg updates
arm64/sysreg: Update ID_AA64MMFR1_EL1 to DDI0601 2024-09
* for-next/stacktrace:
: arm64 stacktrace improvements
arm64: preserve pt_regs::stackframe during exec*()
arm64: stacktrace: unwind exception boundaries
arm64: stacktrace: split unwind_consume_stack()
arm64: stacktrace: report recovered PCs
arm64: stacktrace: report source of unwind data
arm64: stacktrace: move dump_backtrace() to kunwind_stack_walk()
arm64: use a common struct frame_record
arm64: pt_regs: swap 'unused' and 'pmr' fields
arm64: pt_regs: rename "pmr_save" -> "pmr"
arm64: pt_regs: remove stale big-endian layout
arm64: pt_regs: assert pt_regs is a multiple of 16 bytes
* for-next/hwcap3:
: Add AT_HWCAP3 support for arm64 (also wire up AT_HWCAP4)
arm64: Support AT_HWCAP3
binfmt_elf: Wire up AT_HWCAP3 at AT_HWCAP4
* for-next/kselftest: (30 commits)
: arm64 kselftest fixes/cleanups
kselftest/arm64: Try harder to generate different keys during PAC tests
kselftest/arm64: Don't leak pipe fds in pac.exec_sign_all()
kselftest/arm64: Corrupt P0 in the irritator when testing SSVE
kselftest/arm64: Add FPMR coverage to fp-ptrace
kselftest/arm64: Expand the set of ZA writes fp-ptrace does
kselftets/arm64: Use flag bits for features in fp-ptrace assembler code
kselftest/arm64: Enable build of PAC tests with LLVM=1
kselftest/arm64: Check that SVCR is 0 in signal handlers
kselftest/arm64: Fix printf() compiler warnings in the arm64 syscall-abi.c tests
kselftest/arm64: Fix printf() warning in the arm64 MTE prctl() test
kselftest/arm64: Fix printf() compiler warnings in the arm64 fp tests
kselftest/arm64: Fix build with stricter assemblers
kselftest/arm64: Test signal handler state modification in fp-stress
kselftest/arm64: Provide a SIGUSR1 handler in the kernel mode FP stress test
kselftest/arm64: Implement irritators for ZA and ZT
kselftest/arm64: Remove unused ADRs from irritator handlers
kselftest/arm64: Correct misleading comments on fp-stress irritators
kselftest/arm64: Poll less often while waiting for fp-stress children
kselftest/arm64: Increase frequency of signal delivery in fp-stress
kselftest/arm64: Fix encoding for SVE B16B16 test
...
* for-next/crc32:
: Optimise CRC32 using PMULL instructions
arm64/crc32: Implement 4-way interleave using PMULL
arm64/crc32: Reorganize bit/byte ordering macros
arm64/lib: Handle CRC-32 alternative in C code
* for-next/guest-cca:
: Support for running Linux as a guest in Arm CCA
arm64: Document Arm Confidential Compute
virt: arm-cca-guest: TSM_REPORT support for realms
arm64: Enable memory encrypt for Realms
arm64: mm: Avoid TLBI when marking pages as valid
arm64: Enforce bounce buffers for realm DMA
efi: arm64: Map Device with Prot Shared
arm64: rsi: Map unprotected MMIO as decrypted
arm64: rsi: Add support for checking whether an MMIO is protected
arm64: realm: Query IPA size from the RMM
arm64: Detect if in a realm and set RIPAS RAM
arm64: rsi: Add RSI definitions
* for-next/haft:
: Support for arm64 FEAT_HAFT
arm64: pgtable: Warn unexpected pmdp_test_and_clear_young()
arm64: Enable ARCH_HAS_NONLEAF_PMD_YOUNG
arm64: Add support for FEAT_HAFT
arm64: setup: name 'tcr2' register
arm64/sysreg: Update ID_AA64MMFR1_EL1 register
* for-next/scs:
: Dynamic shadow call stack fixes
arm64/scs: Drop unused prototype __pi_scs_patch_vmlinux()
arm64/scs: Deal with 64-bit relative offsets in FDE frames
arm64/scs: Fix handling of DWARF augmentation data in CIE/FDE frames
|
||
|
|
70d8b6485b |
sched/cpufreq: Ensure sd is rebuilt for EAS check
Ensure sugov_eas_rebuild_sd() is always called when sugov_init()
succeeds. The out goto initialized sugov without forcing the rebuild.
Previously the missing call to sugov_eas_rebuild_sd() could lead to EAS
not being enabled on boot when it should have been, because it requires
all policies to be controlled by schedutil while they might not have
been initialized yet.
Fixes:
|
||
|
|
3022e9d00e |
sched_ext: Fixes for v6.12-rc7
- The fair sched class currently has a bug where its balance() returns true telling the sched core that it has tasks to run but then NULL from pick_task(). This makes sched core call sched_ext's pick_task() without preceding balance() which can lead to stalls in partial mode. For now, work around by detecting the condition and forcing the CPU to go through another scheduling cycle. - Add a missing newline to an error message and fix drgn introspection tool which went out of sync. -----BEGIN PGP SIGNATURE----- iIQEABYKACwWIQTfIjM1kS57o3GsC/uxYfJx3gVYGQUCZzI8sw4cdGpAa2VybmVs Lm9yZwAKCRCxYfJx3gVYGb5KAP40b/o6TyAFDG+Hn6GxyxQT7rcAUMXsdB2bcEpg /IjmzQEAwbHU5KP5vQXV6XHv+2V7Rs7u6ZqFtDnL88N0A9hf3wk= =7hL8 -----END PGP SIGNATURE----- Merge tag 'sched_ext-for-6.12-rc7-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/sched_ext Pull sched_ext fixes from Tejun Heo: - The fair sched class currently has a bug where its balance() returns true telling the sched core that it has tasks to run but then NULL from pick_task(). This makes sched core call sched_ext's pick_task() without preceding balance() which can lead to stalls in partial mode. For now, work around by detecting the condition and forcing the CPU to go through another scheduling cycle. - Add a missing newline to an error message and fix drgn introspection tool which went out of sync. * tag 'sched_ext-for-6.12-rc7-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/sched_ext: sched_ext: Handle cases where pick_task_scx() is called without preceding balance_scx() sched_ext: Update scx_show_state.py to match scx_ops_bypass_depth's new type sched_ext: Add a missing newline at the end of an error message |
||
|
|
28e43197c4 |
20 hotfixes, 14 of which are cc:stable.
Three affect DAMON. Lorenzo's five-patch series to address the
mmap_region error handling is here also.
Apart from that, various singletons.
-----BEGIN PGP SIGNATURE-----
iHUEABYKAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCZzBVmAAKCRDdBJ7gKXxA
ju42AQD0EEnzW+zFyI+E7x5FwCmLL6ofmzM8Sw9YrKjaeShdZgEAhcyS2Rc/AaJq
Uty2ZvVMDF2a9p9gqHfKKARBXEbN2w0=
=n+lO
-----END PGP SIGNATURE-----
Merge tag 'mm-hotfixes-stable-2024-11-09-22-40' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull misc fixes from Andrew Morton:
"20 hotfixes, 14 of which are cc:stable.
Three affect DAMON. Lorenzo's five-patch series to address the
mmap_region error handling is here also.
Apart from that, various singletons"
* tag 'mm-hotfixes-stable-2024-11-09-22-40' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
mailmap: add entry for Thorsten Blum
ocfs2: remove entry once instead of null-ptr-dereference in ocfs2_xa_remove()
signal: restore the override_rlimit logic
fs/proc: fix compile warning about variable 'vmcore_mmap_ops'
ucounts: fix counter leak in inc_rlimit_get_ucounts()
selftests: hugetlb_dio: check for initial conditions to skip in the start
mm: fix docs for the kernel parameter ``thp_anon=``
mm/damon/core: avoid overflow in damon_feed_loop_next_input()
mm/damon/core: handle zero schemes apply interval
mm/damon/core: handle zero {aggregation,ops_update} intervals
mm/mlock: set the correct prev on failure
objpool: fix to make percpu slot allocation more robust
mm/page_alloc: keep track of free highatomic
mm: resolve faulty mmap_region() error path behaviour
mm: refactor arch_calc_vm_flag_bits() and arm64 MTE handling
mm: refactor map_deny_write_exec()
mm: unconditionally close VMAs on error
mm: avoid unsafe VMA hook invocation when error arises on mmap hook
mm/thp: fix deferred split unqueue naming and locking
mm/thp: fix deferred split queue not partially_mapped
|
||
|
|
e45f0ab6ee |
padata: Clean up in padata_do_multithreaded()
In commit
|
||
|
|
a6250aa251 |
sched_ext: Handle cases where pick_task_scx() is called without preceding balance_scx()
sched_ext dispatches tasks from the BPF scheduler from balance_scx() and thus every pick_task_scx() call must be preceded by balance_scx(). While this usually holds, due to a bug, there are cases where the fair class's balance() returns true indicating that it has tasks to run on the CPU and thus terminating balance() calls but fails to actually find the next task to run when pick_task() is called. In such cases, pick_task_scx() can be called without preceding balance_scx(). Detect this condition using SCX_RQ_BAL_PENDING flags. If detected, keep running the previous task if possible and avoid stalling from entering idle without balancing. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/Ztj_h5c2LYsdXYbA@slm.duckdns.org |
||
|
|
4788c861ad |
scftorture: Use a lock-less list to free memory.
scf_handler() is used as a SMP function call. This function is always invoked in IRQ-context even with forced-threading enabled. This function frees memory which not allowed on PREEMPT_RT because the locking underneath is using sleeping locks. Add a per-CPU scf_free_pool where each SMP functions adds its memory to be freed. This memory is then freed by scftorture_invoker() on each iteration. On the majority of invocations the number of items is less than five. If the thread sleeps/ gets delayed the number exceed 350 but did not reach 400 in testing. These were the spikes during testing. The bulk free of 64 pointers at once should improve the give-back if the list grows. The list size is ~1.3 items per invocations. Having one global scf_free_pool with one cleaning thread let the list grow to over 10.000 items with 32 CPUs (again, spikes not the average) especially if the CPU went to sleep. The per-CPU part looks like a good compromise. Reported-by: "Paul E. McKenney" <paulmck@kernel.org> Closes: https://lore.kernel.org/lkml/41619255-cdc2-4573-a360-7794fc3614f7@paulmck-laptop/ Tested-by: Paul E. McKenney <paulmck@kernel.org> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Reviewed-by: Boqun Feng <boqun.feng@gmail.com> Tested-by: Boqun Feng <boqun.feng@gmail.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org> |
||
|
|
64bdaf963c |
scftorture: Move memory allocation outside of preempt_disable region.
Memory allocations can not happen within regions with explicit disabled preemption PREEMPT_RT. The problem is that the locking structures underneath are sleeping locks. Move the memory allocation outside of the preempt-disabled section. Keep the GFP_ATOMIC for the allocation to behave like a "ememergncy allocation". Tested-by: Paul E. McKenney <paulmck@kernel.org> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Reviewed-by: Boqun Feng <boqun.feng@gmail.com> Tested-by: Boqun Feng <boqun.feng@gmail.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org> |
||
|
|
43082cd579 |
scftorture: Wait until scf_cleanup_handler() completes.
The smp_call_function() needs to be invoked with the wait flag set to wait until scf_cleanup_handler() is done. This ensures that all SMP function calls, that have been queued earlier, complete at this point. Tested-by: Paul E. McKenney <paulmck@kernel.org> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Reviewed-by: Boqun Feng <boqun.feng@gmail.com> Tested-by: Boqun Feng <boqun.feng@gmail.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org> |
||
|
|
42eeb3b573 |
scftorture: Avoid additional div operation.
Replace "scfp->cpu % nr_cpu_ids" with "cpu". This has been computed earlier. Tested-by: Paul E. McKenney <paulmck@kernel.org> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Reviewed-by: Boqun Feng <boqun.feng@gmail.com> Tested-by: Boqun Feng <boqun.feng@gmail.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org> |
||
|
|
9e05e5c7ee |
signal: restore the override_rlimit logic
Prior to commit |
||
|
|
432dc0654c |
ucounts: fix counter leak in inc_rlimit_get_ucounts()
The inc_rlimit_get_ucounts() increments the specified rlimit counter and
then checks its limit. If the value exceeds the limit, the function
returns an error without decrementing the counter.
Link: https://lkml.kernel.org/r/20241101191940.3211128-1-roman.gushchin@linux.dev
Fixes:
|
||
|
|
7758b20611 |
Fix tracefs mount options:
The commit |
||
|
|
f7d1b585e1 |
sched_ext: Add a missing newline at the end of an error message
Signed-off-by: Tejun Heo <tj@kernel.org> |
||
|
|
5609296750 |
PM: EM: Add min/max available performance state limits
On some devices there are HW dependencies for shared frequency and voltage between devices. It will impact Energy Aware Scheduler (EAS) decision, where CPUs share the voltage & frequency domain with other CPUs or devices e.g. - Mid CPUs + Big CPU - Little CPU + L3 cache in DSU - some other device + Little CPUs Detailed explanation of one example: When the L3 cache frequency is increased, the affected Little CPUs might run at higher voltage and frequency. That higher voltage causes higher CPU power and thus more energy is used for running the tasks. This is important for background running tasks, which try to run on energy efficient CPUs. Therefore, add performance state limits which are applied for the device (in this case CPU). This is important on SoCs with HW dependencies mentioned above so that the Energy Aware Scheduler (EAS) does not use performance states outside the valid min-max range for energy calculation. Signed-off-by: Lukasz Luba <lukasz.luba@arm.com> Link: https://patch.msgid.link/20241030164126.1263793-2-lukasz.luba@arm.com Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> |
||
|
|
b019b4a670 |
A single fix for posix CPU timers
When a thread is cloned, the posix CPU timers are not inherited. If the parent has a CPU timer armed the corresponding tick dependency in the tasks tick_dep_mask is set and copied to the new thread, which means the new thread and all decendants will prevent the system to go into full NOHZ operation. Clear the tick dependency mask in copy_process() to fix this. -----BEGIN PGP SIGNATURE----- iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAmcnT/oTHHRnbHhAbGlu dXRyb25peC5kZQAKCRCmGPVMDXSYoZw5D/0bqjjMYiE0ws8nuuXN1gL2T1wt6P2C 2zzEKnk6nsxnGbMfFs7XifDkqSOHNVro7F6kCkz6cH4U/VSK6R2nNONbufz4mNWk 6uDvdlZps30ekmxN+C3fB6S/4MNVOhXFFXjsQhT/PxX/CfibVP5fATLtcRLq9Lfx mV3nnxKTHPjDGz2/2QRyYpA3G1fzOj/l2QeJsVAIC8GxCo7drLVe0yn5Yt+13zhU JMjgt9ox4PKFsjaXrqvt1yKNTtb+evjYSIVXxIg60oxUkfva6XFLXJv3rjbARUNj aqBHfkZ1/d2Hwc0WexDZfvhNWeCqnfUA+db7ALSYICbNd37EVxWVZA2TwfkkWKSt RDq3xX6NJSd71h0lxDvzv7Ph3NUq23rq3LycAkjqfhiFjPQmE6axtnioXcR5mtVp q9tilB/3I4zj4BIYPfd9KowkdclRSK+B3Oo0DTyuVhKLggF0UD3poDxT4HxnBCFB uKDV8GDsoD8Ksjsl0/X/D4oorqLYAT0tG9gxMw1Kii16gijhhu5qeqTWY+qs9ieg 2J+Ku83QLQgljvy2s7y0AnMZMxaeKN5YMs6zNV+yGAoTFyft3CnBMKalPHfOCI0A fdKHi2aKm+lLrp/UIG2Yw9N1xfcrGWA2moH+9dw6zTyAei+TInP+WYndI7VR5EDf 3KvW3OoRf0sywA== =83r/ -----END PGP SIGNATURE----- Merge tag 'timers-urgent-2024-11-03' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull timer fix from Thomas Gleixner: "A single fix for posix CPU timers. When a thread is cloned, the posix CPU timers are not inherited. If the parent has a CPU timer armed the corresponding tick dependency in the tasks tick_dep_mask is set and copied to the new thread, which means the new thread and all decendants will prevent the system to go into full NOHZ operation. Clear the tick dependency mask in copy_process() to fix this" * tag 'timers-urgent-2024-11-03' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: posix-cpu-timers: Clear TICK_DEP_BIT_POSIX_TIMER on clone |
||
|
|
33e83ffe4c |
A few scheduler fixes:
- Plug a race between pick_next_task_fair() and try_to_wake_up() where
both try to write to the same task, even though both paths hold a
runqueue lock, but obviously from different runqueues.
The problem is that the store to task::on_rq in __block_task() is
visible to try_to_wake_up() which assumes that the task is not queued.
Both sides then operate on the same task.
Cure it by rearranging __block_task() so the the store to task::on_rq is
the last operation on the task.
- Prevent a potential NULL pointer dereference in task_numa_work()
task_numa_work() iterates the VMAs of a process. A concurrent unmap of
the address space can result in a NULL pointer return from vma_next()
which is unchecked.
Add the missing NULL pointer check to prevent this.
- Operate on the correct scheduler policy in task_should_scx()
task_should_scx() returns true when a task should be handled by sched
EXT. It checks the tasks scheduling policy.
This fails when the check is done before a policy has been set.
Cure it by handing the policy into task_should_scx() so it operates
on the requested value.
- Add the missing handling of sched EXT in the delayed dequeue
mechanism. This was simply forgotten.
-----BEGIN PGP SIGNATURE-----
iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAmcnTqATHHRnbHhAbGlu
dXRyb25peC5kZQAKCRCmGPVMDXSYoX/aD/4yvskeG9i7wAj2NOdDTAs1K0gLURt+
nHDb1YkoIOXOfanaG7ZdBWb4sYnsnLX/KIhVsDQiXACFr6G0IjQ1zaN1iRtEkH79
5BfVi98gAXdFU3y+EGqyaqiAp7MFOBTmsfJi5095fX0L+2aViSAjDEvHzvvC/hXD
tmq47vFQEgIZPSxljEaKPaNmyDM+geusv5lX/lABH5MG0fYsT85VV6BQ2T1LsN1O
WFBLD/uPEOSXumyZW8nV8yE2PioLDJz8W+uSnr38/HCH99mtJApqZyskaagKtr0g
vLhOfoaYVR/j5ODUk6LExZ8zy140zDzUWzC5+RNnyb8jQf/Lx88fTNZY8/Wsm5m9
oKtoiGzkL0LG/c05Cjh/vqReK26qILK4+ynDGaowDmTlUTS2jeNZL1ABlIwWkaLP
5TDegJPkoUA1Z4YegxtRFROGHp1J+lfbqz537bghMaqdJXMaG84qjSszsPz9NbS9
F7K63JKjfXAF6N8bhKvZk4jAbD97EYf3B0o8E69TjoZxaiuKf00xK7HGWmuQD3u3
lOHkfIZzf5b7ELNgcketCYsbJvxbI4oQrp/9V425ORSr1Ih2GxCT51/x/NlFHoEH
ujIjAe2YQyLhb26M0RG8Xao3BPT7RGMR058C8lwxtPLuPNIwB8MqCsXmU9xlEypg
iexGnsj6zXTddg==
=4mie
-----END PGP SIGNATURE-----
Merge tag 'sched-urgent-2024-11-03' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull scheduler fixes from Thomas Gleixner:
- Plug a race between pick_next_task_fair() and try_to_wake_up() where
both try to write to the same task, even though both paths hold a
runqueue lock, but obviously from different runqueues.
The problem is that the store to task::on_rq in __block_task() is
visible to try_to_wake_up() which assumes that the task is not
queued. Both sides then operate on the same task.
Cure it by rearranging __block_task() so the the store to task::on_rq
is the last operation on the task.
- Prevent a potential NULL pointer dereference in task_numa_work()
task_numa_work() iterates the VMAs of a process. A concurrent unmap
of the address space can result in a NULL pointer return from
vma_next() which is unchecked.
Add the missing NULL pointer check to prevent this.
- Operate on the correct scheduler policy in task_should_scx()
task_should_scx() returns true when a task should be handled by sched
EXT. It checks the tasks scheduling policy.
This fails when the check is done before a policy has been set.
Cure it by handing the policy into task_should_scx() so it operates
on the requested value.
- Add the missing handling of sched EXT in the delayed dequeue
mechanism. This was simply forgotten.
* tag 'sched-urgent-2024-11-03' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
sched/ext: Fix scx vs sched_delayed
sched: Pass correct scheduling policy to __setscheduler_class
sched/numa: Fix the potential null pointer dereference in task_numa_work()
sched: Fix pick_next_task_fair() vs try_to_wake_up() race
|
||
|
|
68f05b251b |
A single fix for perf:
perf_event_clear_cpumask() uses list_for_each_entry_rcu() without being in a RCU read side critical section, which triggers a "suspicious RCU usage" warning. It turns out that the list walk does not be RCU protected because the write side lock is held in this contxt. Change it to a regular list walk. -----BEGIN PGP SIGNATURE----- iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAmcnSfoTHHRnbHhAbGlu dXRyb25peC5kZQAKCRCmGPVMDXSYoVERD/0fMj1Y79mC2DoAB9XBy6d7xumHbDlN Jt4v3Hq0oADjqrGAro/XDbAm9y7gK9b5BhV2UH4ehvJj+WeesGRikV26Fdh3WuWo aTyNTh7PxsMoNRqIWWBiGbumHya9+INZKyKFAMD/WtQ3Av2emws0nnm9uv+eJzVZ zr1+NiofUDsu1I04E6zVXBra3aLqIbsWg5NOCsJAdW/9AKE+GQMA0/aw14Z2ftqH Mry4PqW4aGcTnCRNtoaHHwbP4677ZXX6pQnbUTGYZ4ywJJFKQ54YH1mUqUUP6cOo GWg20gVK4PTkJSt2nL/I+i1RVq7Ipw725e540XEAFDsDVj5jy/rJbmrmyUys6sr7 Xu6cXbjAs/kV/A9TB1wBsb+iMUnHTNbRWMS1d8bsxaUWSIe9wouDJHAIumCMr3B3 qALdXxHqppPZuccMFWHyxAClJEY8YEp9+n32BMpePASLhv3JBJHOUSn8HWr+GIgC N4slnJvLevETlO0HcQ3IUifwqfQBJ6O0Kyu0IXmrb3aCV9TzrbE1iZDgv6HbZBVP FsUaMBB/se24R/4zxSsH+u7yLFcgEJKVWVzngXzNoUvRX8xF4um6x1y89049Q0CC iGdRq3/fV/b/Tp7wvEuIxCr0GPUi28OCZTwjESmluUIS6ZSd83oDajBXix725hk+ 1YIwANTHMeBadQ== =qlTN -----END PGP SIGNATURE----- Merge tag 'perf-urgent-2024-11-03' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf fix from Thomas Gleixner: "perf_event_clear_cpumask() uses list_for_each_entry_rcu() without being in a RCU read side critical section, which triggers a 'suspicious RCU usage' warning. It turns out that the list walk does not be RCU protected because the write side lock is held in this context. Change it to a regular list walk" * tag 'perf-urgent-2024-11-03' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: perf: Fix missing RCU reader protection in perf_event_clear_cpumask() |
||
|
|
8f0b844adc |
Two fixes for the interrupt subsystem:
- Fix an off-by-one error in the failure path of msi_domain_alloc(),
which causes the cleanup loop to terminate early and leaking the first
allocated interrupt.
- Handle a corner case in GIC-V4 versus a lazily mapped Virtual
Processing Element (VPE). If the VPE has not been mapped because the
guest has not yet emitted a mapping command, then the set_affinity()
callback returns an error code, which causes the vCPU management to fail.
Return success in this case without touching the hardware. This will be
done later when the guest issues the mapping command.
-----BEGIN PGP SIGNATURE-----
iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAmcnSPwTHHRnbHhAbGlu
dXRyb25peC5kZQAKCRCmGPVMDXSYoYp6D/9VtT4/JrR8mc44OIfAPKdFfQFkAowB
+lqyv+8PXl41/BY2gMv6kdoaHZzI/nJspbgs8nh4DfIOfhoZXo+jVe0ewn0kvgEw
NzEsN2l5TfFWk6W3pJ/dzklPEepja1Ju9/7E9bHX0sJlZwfl43PGJqqBXQjKyhJB
+NXlqCh66P137V6LgTBobjfO8B+gdbZn80+LHtBsA7M8dEyK7zdYINM3OwK3li0V
umNTsvabimxY7om8xZVI03h8wedABG+/itINzfiEu3fR9Dpp4gwQrbOzQTxion7S
4WkbVCh2OKiEJGcjstzHeaNYZatCvkEKyvSBIRDrI2+JCJlnFax8fhZn9w65ExMv
BeU0mG/ip6tfH9ieaqm82IT7yYX4PPv+ma6L3BGmdDkM1o0z317Orm/mbcE4a6MD
EPxnUxOEGqBKc+ylsvZiHriYRtUsyxR2y343XSuCZuYZHpdB1IN+Q1qFBoNY0MlU
q7igpXj6FM0qD3zadz5H4Kb4Sj09oWMnhGJCUMEqknOzd1U0cBwWsIvuNUq3VWCe
8P9arwFK4fa7B1YZ8cgLVw9JYqazpdY1GOn6k0lBDdF0tnieP4NaOvWs/imlkt4s
kX9Qr/mFoA2EV4vBiURPsK43TlGpRm0kEJgIeElvsXVFlmiTBkVFS2CR7Ep2dFs4
ezIo15/4GmlRsQ==
=6Li6
-----END PGP SIGNATURE-----
Merge tag 'irq-urgent-2024-11-03' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull irq fixes from Thomas Gleixner:
- Fix an off-by-one error in the failure path of msi_domain_alloc(),
which causes the cleanup loop to terminate early and leaking the
first allocated interrupt.
- Handle a corner case in GIC-V4 versus a lazily mapped Virtual
Processing Element (VPE). If the VPE has not been mapped because the
guest has not yet emitted a mapping command, then the set_affinity()
callback returns an error code, which causes the vCPU management to
fail.
Return success in this case without touching the hardware. This will
be done later when the guest issues the mapping command.
* tag 'irq-urgent-2024-11-03' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
irqchip/gic-v4: Correctly deal with set_affinity on lazily-mapped VPEs
genirq/msi: Fix off-by-one error in msi_domain_alloc()
|
||
|
|
457a654939 |
css_set_fork(): switch to CLASS(fd_raw, ...)
reference acquired there by fget_raw() is not stashed anywhere - we could as well borrow instead. Reviewed-by: Christian Brauner <brauner@kernel.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> |
||
|
|
8152f82010 |
fdget(), more trivial conversions
all failure exits prior to fdget() leave the scope, all matching fdput() are immediately followed by leaving the scope. [xfs_ioc_commit_range() chunk moved here as well] Reviewed-by: Christian Brauner <brauner@kernel.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> |
||
|
|
6348be02ee |
fdget(), trivial conversions
fdget() is the first thing done in scope, all matching fdput() are immediately followed by leaving the scope. Reviewed-by: Christian Brauner <brauner@kernel.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> |
||
|
|
048181992c |
fdget_raw() users: switch to CLASS(fd_raw)
Reviewed-by: Christian Brauner <brauner@kernel.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> |
||
|
|
4dd53b84ff |
get rid of perf_fget_light(), convert kernel/events/core.c to CLASS(fd)
Lift fdget() and fdput() out of perf_fget_light(), turning it into is_perf_file(struct fd f). The life gets easier in both callers if we do fdget() unconditionally, including the case when we are given -1 instead of a descriptor - that avoids a reassignment in perf_event_open(2) and it avoids a nasty temptation in _perf_ioctl() where we must *not* lift output_event out of scope for output. Reviewed-by: Christian Brauner <brauner@kernel.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> |
||
|
|
05e555642c |
regularize emptiness checks in fini_module(2) and vfs_dedupe_file_range()
With few exceptions emptiness checks are done as fd_file(...) in boolean context (usually something like if (!fd_file(f))...); those will be taken care of later. However, there's a couple of places where we do those checks as 'store fd_file(...) into a variable, then check if this variable is NULL' and those are harder to spot. Get rid of those now. use fd_empty() instead of extracting file and then checking it for NULL. Reviewed-by: Christian Brauner <brauner@kernel.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> |
||
|
|
fa17cb4b3b |
tracing: Document tracefs gid mount option
Commit |
||
|
|
5635f18942 |
BPF fixes:
- Fix BPF verifier to force a checkpoint when the program's jump history becomes too long (Eduard Zingerman) - Add several fixes to the BPF bits iterator addressing issues like memory leaks and overflow problems (Hou Tao) - Fix an out-of-bounds write in trie_get_next_key (Byeonguk Jeong) - Fix BPF test infra's LIVE_FRAME frame update after a page has been recycled (Toke Høiland-Jørgensen) - Fix BPF verifier and undo the 40-bytes extra stack space for bpf_fastcall patterns due to various bugs (Eduard Zingerman) - Fix a BPF sockmap race condition which could trigger a NULL pointer dereference in sock_map_link_update_prog (Cong Wang) - Fix tcp_bpf_recvmsg_parser to retrieve seq_copied from tcp_sk under the socket lock (Jiayuan Chen) Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> -----BEGIN PGP SIGNATURE----- iIsEABYIADMWIQTFp0I1jqZrAX+hPRXbK58LschIgwUCZyQO/RUcZGFuaWVsQGlv Z2VhcmJveC5uZXQACgkQ2yufC7HISIO2vAD+NAng11x6W9tnIOVDHTwvsWL4aafQ pmf1zda90bwCIyIA/07ptFPWOH+WTmWqP8pZ9PGY5279KAxurZZDud0SOwIO =28aY -----END PGP SIGNATURE----- Merge tag 'bpf-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf Pull bpf fixes from Daniel Borkmann: - Fix BPF verifier to force a checkpoint when the program's jump history becomes too long (Eduard Zingerman) - Add several fixes to the BPF bits iterator addressing issues like memory leaks and overflow problems (Hou Tao) - Fix an out-of-bounds write in trie_get_next_key (Byeonguk Jeong) - Fix BPF test infra's LIVE_FRAME frame update after a page has been recycled (Toke Høiland-Jørgensen) - Fix BPF verifier and undo the 40-bytes extra stack space for bpf_fastcall patterns due to various bugs (Eduard Zingerman) - Fix a BPF sockmap race condition which could trigger a NULL pointer dereference in sock_map_link_update_prog (Cong Wang) - Fix tcp_bpf_recvmsg_parser to retrieve seq_copied from tcp_sk under the socket lock (Jiayuan Chen) * tag 'bpf-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf: bpf, test_run: Fix LIVE_FRAME frame update after a page has been recycled selftests/bpf: Add three test cases for bits_iter bpf: Use __u64 to save the bits in bits iterator bpf: Check the validity of nr_words in bpf_iter_bits_new() bpf: Add bpf_mem_alloc_check_size() helper bpf: Free dynamically allocated bits in bpf_iter_bits_destroy() bpf: disallow 40-bytes extra stack for bpf_fastcall patterns selftests/bpf: Add test for trie_get_next_key() bpf: Fix out-of-bounds write in trie_get_next_key() selftests/bpf: Test with a very short loop bpf: Force checkpoint when jmp history is too long bpf: fix filed access without lock sock_map: fix a NULL pointer dereference in sock_map_link_update_prog() |
||
|
|
69d5e722be |
sched/ext: Fix scx vs sched_delayed
Commit |
||
|
|
e133938367 |
bpf: Use __u64 to save the bits in bits iterator
On 32-bit hosts (e.g., arm32), when a bpf program passes a u64 to bpf_iter_bits_new(), bpf_iter_bits_new() will use bits_copy to store the content of the u64. However, bits_copy is only 4 bytes, leading to stack corruption. The straightforward solution would be to replace u64 with unsigned long in bpf_iter_bits_new(). However, this introduces confusion and problems for 32-bit hosts because the size of ulong in bpf program is 8 bytes, but it is treated as 4-bytes after passed to bpf_iter_bits_new(). Fix it by changing the type of both bits and bit_count from unsigned long to u64. However, the change is not enough. The main reason is that bpf_iter_bits_next() uses find_next_bit() to find the next bit and the pointer passed to find_next_bit() is an unsigned long pointer instead of a u64 pointer. For 32-bit little-endian host, it is fine but it is not the case for 32-bit big-endian host. Because under 32-bit big-endian host, the first iterated unsigned long will be the bits 32-63 of the u64 instead of the expected bits 0-31. Therefore, in addition to changing the type, swap the two unsigned longs within the u64 for 32-bit big-endian host. Signed-off-by: Hou Tao <houtao1@huawei.com> Link: https://lore.kernel.org/r/20241030100516.3633640-5-houtao@huaweicloud.com Signed-off-by: Alexei Starovoitov <ast@kernel.org> |