linux

mirror of https://github.com/torvalds/linux.git synced 2026-05-12 16:18:45 +02:00

Author	SHA1	Message	Date
Arnaldo Carvalho de Melo	47c68eb15a	perf header: Sanity check HEADER_HYBRID_TOPOLOGY Add upper bound check on nr_nodes in process_hybrid_topology() to harden against malformed perf.data files (reuses MAX_PMU_MAPPINGS, 4096). Cc: Ian Rogers <irogers@google.com> Assisted-by: Claude Code:claude-opus-4-6 Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2026-04-13 23:21:53 -07:00
Arnaldo Carvalho de Melo	110a661708	perf header: Sanity check HEADER_CACHE Add upper bound check on cache entry count in process_cache() to harden against malformed perf.data files (max 32768). Cc: Jiri Olsa <jolsa@kernel.org> Cc: Ian Rogers <irogers@google.com> Assisted-by: Claude Code:claude-opus-4-6 Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2026-04-13 23:21:53 -07:00
Arnaldo Carvalho de Melo	6830e20c92	perf header: Sanity check HEADER_GROUP_DESC Add upper bound check on nr_groups in process_group_desc() to harden against malformed perf.data files (max 32768), and move the env assignment after validation. Cc: Namhyung Kim <namhyung@kernel.org> Cc: Ian Rogers <irogers@google.com> Assisted-by: Claude Code:claude-opus-4-6 Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2026-04-13 23:21:53 -07:00
Arnaldo Carvalho de Melo	f613a6d694	perf header: Sanity check HEADER_PMU_MAPPINGS Add upper bound check on pmu_num in process_pmu_mappings() to harden against malformed perf.data files (max 4096). Cc: Ian Rogers <irogers@google.com> Assisted-by: Claude Code:claude-opus-4-6 Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2026-04-13 23:21:53 -07:00
Arnaldo Carvalho de Melo	a881fc5603	perf header: Sanity check HEADER_MEM_TOPOLOGY Add validation to process_mem_topology() to harden against malformed perf.data files: - Upper bound check on nr_nodes (reuses MAX_NUMA_NODES, 4096) - Minimum section size check before allocating This is particularly important here since nr is u64, making unbounded values especially dangerous. Cc: Jiri Olsa <jolsa@kernel.org> Cc: Ian Rogers <irogers@google.com> Assisted-by: Claude Code:claude-opus-4-6 Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2026-04-13 23:21:53 -07:00
Arnaldo Carvalho de Melo	4ba223016b	perf header: Sanity check HEADER_NUMA_TOPOLOGY Add validation to process_numa_topology() to harden against malformed perf.data files: - Upper bound check on nr_nodes (max 4096) - Minimum section size check before allocating Cc: Jiri Olsa <jolsa@kernel.org> Cc: Ian Rogers <irogers@google.com> Assisted-by: Claude Code:claude-opus-4-6 Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2026-04-13 23:21:53 -07:00
Arnaldo Carvalho de Melo	22a2e2b292	perf header: Sanity check HEADER_CPU_TOPOLOGY Add validation to process_cpu_topology() to harden against malformed perf.data files: - Verify nr_cpus_avail was initialized (HEADER_NRCPUS processed first) - Bounds check sibling counts (cores, threads, dies) against nr_cpus_avail - Fix two bare 'return -1' that leaked env->cpu by using 'goto free_cpu' Cc: Jiri Olsa <jolsa@kernel.org> Cc: Ian Rogers <irogers@google.com> Assisted-by: Claude Code:claude-opus-4-6 Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2026-04-13 23:21:53 -07:00
Arnaldo Carvalho de Melo	376ce5a9f7	perf header: Sanity check HEADER_NRCPUS and HEADER_CPU_DOMAIN_INFO While working on some cleanups sashiko questioned about pre-existing issues, namely lacking sanity checks for perf.data headers, add some with the help of Claude. Cc: Ian Rogers <irogers@google.com> Cc: Swapnil Sapkal <swapnil.sapkal@amd.com> Assisted-by: Claude Code:claude-opus-4-6 Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2026-04-13 23:21:52 -07:00
Arnaldo Carvalho de Melo	06452a412e	perf header: Bump up the max number of command line args allowed We need to do some upper limit validation, bump up the arbitrary limit as per suggestion of Sashiko about command line wildcard expansion ending up with more than 32768 args. Link: https://sashiko.dev/#/patchset/20260408172846.96360-1-acme%40kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2026-04-13 23:21:52 -07:00
Arnaldo Carvalho de Melo	f823d7efb8	perf header: Validate nr_domains when reading HEADER_CPU_DOMAIN_INFO Further validate the HEADER_CPU_DOMAIN_INFO fields, this time checking the nr_domains field. Assisted-by: Claude Code:claude-opus-4-6 Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2026-04-13 23:21:52 -07:00
Aditya Garg	ca5ee0e918	tools: hv: Fix cross-compilation Use the native ARCH only in case it is not set, this will allow the cross-compilation where ARCH is explicitly set. Additionally, simplify the ARCH check to build the fcopy daemon only for x86 and x86_64. Fixes: `82b0945ce2` ("tools: hv: Add new fcopy application based on uio driver") Reported-by: Adrian Vladu <avladu@cloudbasesolutions.com> Closes: https://lore.kernel.org/linux-hyperv/PR3PR09MB54119DB2FD76977C62D8DD6AB04D2@PR3PR09MB5411.eurprd09.prod.outlook.com/ Co-developed-by: Saurabh Sengar <ssengar@linux.microsoft.com> Signed-off-by: Saurabh Sengar <ssengar@linux.microsoft.com> Signed-off-by: Aditya Garg <gargaditya@linux.microsoft.com> Reviewed-by: Roman Kisel <romank@linux.microsoft.com> Signed-off-by: Wei Liu <wei.liu@kernel.org>	2026-04-14 04:43:26 +00:00
Linus Torvalds	1334d2a3b3	gpio updates for v7.1-rc1 GPIO core: - defer probe on software node lookups when the remote software node exists but has not been registered as a firmware node yet - unify GPIO hog handling by moving code duplicated in OF and ACPI modules into GPIO core and allow setting up hogs with software nodes - allow matching GPIO controllers by secondary firmware node if matching by primary does not succeed - demote deferral warnings to debug level as they are quite normal when using software nodes which don't support fw_devlink yet - disable the legacy GPIO character device uAPI v1 supprt in Kconfig by default - rework several core functions in preparation for the upcoming Revocable helper library for protecting resources against sudden removal, this reduces the number of SRCU dereferences in GPIO core - simplify file descriptor logic in GPIO character device code by using FD_PREPARE() - introduce a header defining symbols used by both GPIO consumers and providers to avoid having to include provider-specific headers from drivers which only consume GPIOs - replace snprintf() with strscpy() where formatting is not required New drivers: - add the gpio-by-pinctrl generic driver using the ARM SCMI protocol to control GPIOs (along with SCMI changes pulled from the pinctrl tree) - add a driver providing support for handling of platform events via GPIO-signalled ACPI events (used on Intel Nova Lake and later platforms) Driver changes: - extend the gpio-kempld driver with support for more recent models, interrupts and setting/getting multiple values at once - improve interrupt handling in gpio-brcmstb - add support for multi-SoC systems in gpio-tegra186 - make sure we return correct values from the .get() callbacks in several GPIO drivers by normalizing any values other than 0, 1 or negative error numbers - use flexible arrays in several drivers to reduce the number of required memory allocations - simplify synchronous waiting for virtual drivers to probe and remove the dedicated, a bit overengineered helper library dev-sync-probe - remove unneeded Kconfig dependencies on OF_GPIO in several drivers and subsystems - convert the two remaining users of of_get_named_gpio() to using GPIO descriptors and remove the (no longer used) function along with the header that declares it - add missing includes in gpio-mmio - shrink and simplify code in gpio-max732x by using guard(mutex) - remove duplicated code handling the 'ngpios' property from gpio-ts4800, it's already handled in GPIO core - use correct variable type in gpio-aspeed - add support for a new model in gpio-realtek-otto - allow to specify the active-low setting of simulated hogs over the configfs interface (in addition to existing devicetree support) in gpio-sim Bug fixes: - clear the OF_POPULATED flag on hog nodes in GPIO chip remove path on OF systems - fix resource leaks in error path in gpiochip_add_data_with_key() - drop redundant device reference in gpio-mpsse Tests: - add selftests for use-after-free cases in GPIO character device code DT bindings: - add a DT binding document for SCMI based, gpio-over-pinctrl devices - fix interrupt description in microchip,mpfs-gpio - add new compatible for gpio-realtek-otto - describe the resets of the mpfs-gpio controller - fix maintainer's email in gpio-delay bindings - remove the binding document for cavium,thunder-8890 as the corresponding device is bound over PCI and not firmware nodes Documentation: - update the recommended way of converting legacy boards to using software nodes for GPIO description - describe GPIO line value semantics - misc updates to kerneldocs Misc: - convert OMAP1 ams-delta board to using GPIO hogs described with software nodes -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEEkeUTLeW1Rh17omX8BZ0uy/82hMMFAmnYsngACgkQBZ0uy/82 hMO+Tw/+N8eX1GOWkEdZBZRzd6QW+2qjmeMgvizMUu2CzfYpcuO9kpOqiSjguj2S 6hODOajwU6EDdrxEHy7zJrAc6Tw1xIQeCnmSsFjC+4OePsCP6bU0QBgdb0V5waIx 3kEdBsM3Msw48SCDFTUQJ0XyUD/VP4TZXzhDmU0X1OJsGYDSMchiBytNpZrFt18q qTq/NJm6mT6h5XlTeTCmfBcf/TG7MZhAPzXw8YZp+ZIgDsTRtD/P6CAZgJ0OU9f4 MQwJO5+JFkTO7XhL+qOfJcnKnC2lRHaa7mJiSQ+XS43NOqO7NGeGH2l7hU/Lx/fR NIIZk27uBRV1akpnMGtgbqL2A8SFeH5yj3/o6S4rp9IzDwOouN+1seaL2RUHpTns TgIm037MNIZI8eQ2lSA9/+f4vwF1bml8mA/6lVBGHI9ZcaZbcUyWjXPcsuVIyiqY HlV+A3sVjchpaH9Eie78nbVm2X7Wm5slEazXAl3zVjlekQut+Fp+xoBWwulEjp+H 7PZXqLP2hV/Xw4C2mn/zwwokzj+1S1DPW5Inn5Y7Qi6/j4GmvdVmI1zBculrf7jj GE6UUz+Vm9v6oE+q19jsksudVrDPapASYV/TLGQZk48IXhy+KxB8lep6L2rEAQtS YgBltnjJlbrx99u1sZkPECzgCRYSgm59Lt0aOv93CL2fURI1Seg= =VknJ -----END PGP SIGNATURE----- Merge tag 'gpio-updates-for-v7.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux Pull gpio updates from Bartosz Golaszewski: "For this merge window we have two new drivers: support for GPIO-signalled ACPI events on Intel platforms and a generic GPIO-over-pinctrl driver using the ARM SCMI protocol for controlling pins. Several things have been reworked in GPIO core: we unduplicated GPIO hog handling, reduced the number of SRCU locks and dereferences, improved support for software-node-based lookup and removed more legacy code after converting remaining users to modern alternatives. There's also a number of driver reworks and refactoring, documentation updates, some bug-fixes and new tests. GPIO core: - defer probe on software node lookups when the remote software node exists but has not been registered as a firmware node yet - unify GPIO hog handling by moving code duplicated in OF and ACPI modules into GPIO core and allow setting up hogs with software nodes - allow matching GPIO controllers by secondary firmware node if matching by primary does not succeed - demote deferral warnings to debug level as they are quite normal when using software nodes which don't support fw_devlink yet - disable the legacy GPIO character device uAPI v1 supprt in Kconfig by default - rework several core functions in preparation for the upcoming Revocable helper library for protecting resources against sudden removal, this reduces the number of SRCU dereferences in GPIO core - simplify file descriptor logic in GPIO character device code by using FD_PREPARE() - introduce a header defining symbols used by both GPIO consumers and providers to avoid having to include provider-specific headers from drivers which only consume GPIOs - replace snprintf() with strscpy() where formatting is not required New drivers: - add the gpio-by-pinctrl generic driver using the ARM SCMI protocol to control GPIOs (along with SCMI changes pulled from the pinctrl tree) - add a driver providing support for handling of platform events via GPIO-signalled ACPI events (used on Intel Nova Lake and later platforms) Driver changes: - extend the gpio-kempld driver with support for more recent models, interrupts and setting/getting multiple values at once - improve interrupt handling in gpio-brcmstb - add support for multi-SoC systems in gpio-tegra186 - make sure we return correct values from the .get() callbacks in several GPIO drivers by normalizing any values other than 0, 1 or negative error numbers - use flexible arrays in several drivers to reduce the number of required memory allocations - simplify synchronous waiting for virtual drivers to probe and remove the dedicated, a bit overengineered helper library dev-sync-probe - remove unneeded Kconfig dependencies on OF_GPIO in several drivers and subsystems - convert the two remaining users of of_get_named_gpio() to using GPIO descriptors and remove the (no longer used) function along with the header that declares it - add missing includes in gpio-mmio - shrink and simplify code in gpio-max732x by using guard(mutex) - remove duplicated code handling the 'ngpios' property from gpio-ts4800, it's already handled in GPIO core - use correct variable type in gpio-aspeed - add support for a new model in gpio-realtek-otto - allow to specify the active-low setting of simulated hogs over the configfs interface (in addition to existing devicetree support) in gpio-sim Bug fixes: - clear the OF_POPULATED flag on hog nodes in GPIO chip remove path on OF systems - fix resource leaks in error path in gpiochip_add_data_with_key() - drop redundant device reference in gpio-mpsse Tests: - add selftests for use-after-free cases in GPIO character device code DT bindings: - add a DT binding document for SCMI based, gpio-over-pinctrl devices - fix interrupt description in microchip,mpfs-gpio - add new compatible for gpio-realtek-otto - describe the resets of the mpfs-gpio controller - fix maintainer's email in gpio-delay bindings - remove the binding document for cavium,thunder-8890 as the corresponding device is bound over PCI and not firmware nodes Documentation: - update the recommended way of converting legacy boards to using software nodes for GPIO description - describe GPIO line value semantics - misc updates to kerneldocs Misc: - convert OMAP1 ams-delta board to using GPIO hogs described with software nodes" * tag 'gpio-updates-for-v7.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux: (79 commits) gpio: swnode: defer probe on references to unregistered software nodes dt-bindings: gpio: cavium,thunder-8890: Remove DT binding Documentation: gpio: update the preferred method for using software node lookup gpio: gpio-by-pinctrl: s/used to do/is used to do/ gpio: aspeed: fix unsigned long int declaration gpio: rockchip: convert to dynamic GPIO base allocation gpio: remove dev-sync-probe gpio: virtuser: stop using dev-sync-probe gpio: aggregator: stop using dev-sync-probe gpio: sim: stop using dev-sync-probe gpio: Add Intel Nova Lake ACPI GPIO events driver gpiolib: Make deferral warnings debug messages gpiolib: fix hogs with multiple lines gpio: fix up CONFIG_OF dependencies gpio: gpio-by-pinctrl: add pinctrl based generic GPIO driver gpio: dt-bindings: Add GPIO on top of generic pin control firmware: arm_scmi: Allow PINCTRL_REQUEST to return EOPNOTSUPP pinctrl: scmi: ignore PIN_CONFIG_PERSIST_STATE pinctrl: scmi: Delete PIN_CONFIG_OUTPUT_IMPEDANCE_OHMS support pinctrl: scmi: Add SCMI_PIN_INPUT_VALUE ...	2026-04-13 20:10:58 -07:00
Linus Torvalds	d7c8087a9c	Power management updates for 7.1-rc1 - Update qcom-hw DT bindings to include Eliza hardware (Abel Vesa) - Update cpufreq-dt-platdev blocklist (Faruque Ansari) - Minor updates to driver and dt-bindings for Tegra (Thierry Reding, Rosen Penev) - Add MAINTAINERS entry for CPPC driver (Viresh Kumar) - Add support for new features: CPPC performance priority, Dynamic EPP, Raw EPP, and new unit tests for them to amd-pstate (Gautham Shenoy, Mario Limonciello) - Fix sysfs files being present when HW missing and broken/outdated documentation in the amd-pstate driver (Ninad Naik, Gautham Shenoy) - Pass the policy to cpufreq_driver->adjust_perf() to avoid using cpufreq_cpu_get() in the .adjust_perf() callback in amd-pstate which leads to a scheduling-while-atomic bug (K Prateek Nayak) - Clean up dead code in Kconfig for cpufreq (Julian Braha) - Remove max_freq_req update for pre-existing cpufreq policy and add a boost_freq_req QoS request to save the boost constraint instead of overwriting the last scaling_max_freq constraint (Pierre Gondois) - Embed cpufreq QoS freq_req objects in cpufreq policy so they all are allocated in one go along with the policy to simplify lifetime rules and avoid error handling issues (Viresh Kumar) - Use DMI max speed when CPPC is unavailable in the acpi-cpufreq scaling driver (Henry Tseng) - Switch policy_is_shared() in cpufreq to using cpumask_nth() instead of cpumask_weight() because the former is more efficient (Yury Norov) - Use sysfs_emit() in sysfs show functions for cpufreq governor attributes (Thorsten Blum) - Update intel_pstate to stop returning an error when "off" is written to its status sysfs attribute while the driver is already off (Fabio De Francesco) - Include current frequency in the debug message printed by __cpufreq_driver_target() (Pengjie Zhang) - Refine stopped tick handling in the menu cpuidle governor and rearrange stopped tick handling in the teo cpuidle governor (Rafael Wysocki) - Add Panther Lake C-states table to the intel_idle driver (Artem Bityutskiy) - Clean up dead dependencies on CPU_IDLE in Kconfig (Julian Braha) - Simplify cpuidle_register_device() with guard() (Huisong Li) - Use performance level if available to distinguish between rates in OPP debugfs (Manivannan Sadhasivam) - Fix scoped_guard in dev_pm_opp_xlate_required_opp() (Viresh Kumar) - Return -ENODATA if the snapshot image is not loaded (Alberto Garcia) - Remove inclusion of crypto/hash.h from hibernate_64.c on x86 (Eric Biggers) - Clean up and rearrange the intel_rapl power capping driver to make the respective interface drivers (TPMI, MSR, and MMOI) hold their own settings and primitives and consolidate PL4 and PMU support flags into rapl_defaults (Kuppuswamy Sathyanarayanan) - Correct kernel-doc function parameter names in the power capping core code (Randy Dunlap) - Remove unneeded casting for HZ_PER_KHZ in devfreq (Andy Shevchenko) - Use _visible attribute to replace create/remove_sysfs_files() in devfreq (Pengjie Zhang) - Add Tegra114 support to activity monitor device in tegra30-devfreq as a preparation to upcoming EMC controller support (Svyatoslav Ryhel) - Fix mistakes in cpupower man pages, add the boost and epp options to the cpupower-frequency-info man page, and add the perf-bias option to the cpupower-info man page (Roberto Ricci) - Remove unnecessary extern declarations from getopt.h in arguments parsing functions in cpufreq-set, cpuidle-info, cpuidle-set, cpupower-info, and cpupower-set utilities (Kaushlendra Kumar) -----BEGIN PGP SIGNATURE----- iQFGBAABCAAwFiEEcM8Aw/RY0dgsiRUR7l+9nS/U47UFAmnY9TISHHJqd0Byand5 c29ja2kubmV0AAoJEO5fvZ0v1OO1G9gH/j5mEqfPpiwX6fQ/ZwOGdNOOPVA5w9j4 KPHSMwMD5lZkoaZfasp2vt27KY5SOoVVvRZ2DKkFJ3Jai4I3cUPZYypga2nre1ag tgzX4vOjcw2r40Eda6ezWl1h4mca/xJJBX7xH2+hn1JY+Y1in37g50CqMIjKh96z Uugkk6UZytL1XcF55PMhIUgDf6pDtRT5UOW9xOKOkUt8FVWTJ7ei3HaWyV5kDmVq b5eQ42+OH7y6sWNnoKczFd8fStvh6J/avoJurBEvcOQhMcjaIaB48G19+KjDg73E NjrVcgG20P2rltBvV2d0J1TKskZHkaP7XjIeWfkwjGZhee3FL7ssS/g= =fRCO -----END PGP SIGNATURE----- Merge tag 'pm-7.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull power management updates from Rafael Wysocki: "Once again, cpufreq is the most active development area, mostly because of the new feature additions and documentation updates in the amd-pstate driver, but there are also changes in the cpufreq core related to boost support and other assorted updates elsewhere. Next up are power capping changes due to the major cleanup of the Intel RAPL driver. On the cpuidle front, a new C-states table for Intel Panther Lake is added to the intel_idle driver, the stopped tick handling in the menu and teo governors is updated, and there are a couple of cleanups. Apart from the above, support for Tegra114 is added to devfreq and there are assorted cleanups of that code, there are also two updates of the operating performance points (OPP) library, two minor updates related to hibernation, and cpupower utility man pages updates and cleanups. Specifics: - Update qcom-hw DT bindings to include Eliza hardware (Abel Vesa) - Update cpufreq-dt-platdev blocklist (Faruque Ansari) - Minor updates to driver and dt-bindings for Tegra (Thierry Reding, Rosen Penev) - Add MAINTAINERS entry for CPPC driver (Viresh Kumar) - Add support for new features: CPPC performance priority, Dynamic EPP, Raw EPP, and new unit tests for them to amd-pstate (Gautham Shenoy, Mario Limonciello) - Fix sysfs files being present when HW missing and broken/outdated documentation in the amd-pstate driver (Ninad Naik, Gautham Shenoy) - Pass the policy to cpufreq_driver->adjust_perf() to avoid using cpufreq_cpu_get() in the .adjust_perf() callback in amd-pstate which leads to a scheduling-while-atomic bug (K Prateek Nayak) - Clean up dead code in Kconfig for cpufreq (Julian Braha) - Remove max_freq_req update for pre-existing cpufreq policy and add a boost_freq_req QoS request to save the boost constraint instead of overwriting the last scaling_max_freq constraint (Pierre Gondois) - Embed cpufreq QoS freq_req objects in cpufreq policy so they all are allocated in one go along with the policy to simplify lifetime rules and avoid error handling issues (Viresh Kumar) - Use DMI max speed when CPPC is unavailable in the acpi-cpufreq scaling driver (Henry Tseng) - Switch policy_is_shared() in cpufreq to using cpumask_nth() instead of cpumask_weight() because the former is more efficient (Yury Norov) - Use sysfs_emit() in sysfs show functions for cpufreq governor attributes (Thorsten Blum) - Update intel_pstate to stop returning an error when "off" is written to its status sysfs attribute while the driver is already off (Fabio De Francesco) - Include current frequency in the debug message printed by __cpufreq_driver_target() (Pengjie Zhang) - Refine stopped tick handling in the menu cpuidle governor and rearrange stopped tick handling in the teo cpuidle governor (Rafael Wysocki) - Add Panther Lake C-states table to the intel_idle driver (Artem Bityutskiy) - Clean up dead dependencies on CPU_IDLE in Kconfig (Julian Braha) - Simplify cpuidle_register_device() with guard() (Huisong Li) - Use performance level if available to distinguish between rates in OPP debugfs (Manivannan Sadhasivam) - Fix scoped_guard in dev_pm_opp_xlate_required_opp() (Viresh Kumar) - Return -ENODATA if the snapshot image is not loaded (Alberto Garcia) - Remove inclusion of crypto/hash.h from hibernate_64.c on x86 (Eric Biggers) - Clean up and rearrange the intel_rapl power capping driver to make the respective interface drivers (TPMI, MSR, and MMOI) hold their own settings and primitives and consolidate PL4 and PMU support flags into rapl_defaults (Kuppuswamy Sathyanarayanan) - Correct kernel-doc function parameter names in the power capping core code (Randy Dunlap) - Remove unneeded casting for HZ_PER_KHZ in devfreq (Andy Shevchenko) - Use _visible attribute to replace create/remove_sysfs_files() in devfreq (Pengjie Zhang) - Add Tegra114 support to activity monitor device in tegra30-devfreq as a preparation to upcoming EMC controller support (Svyatoslav Ryhel) - Fix mistakes in cpupower man pages, add the boost and epp options to the cpupower-frequency-info man page, and add the perf-bias option to the cpupower-info man page (Roberto Ricci) - Remove unnecessary extern declarations from getopt.h in arguments parsing functions in cpufreq-set, cpuidle-info, cpuidle-set, cpupower-info, and cpupower-set utilities (Kaushlendra Kumar)" * tag 'pm-7.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (74 commits) cpufreq/amd-pstate: Add POWER_SUPPLY select for dynamic EPP cpupower: remove extern declarations in cmd functions cpuidle: Simplify cpuidle_register_device() with guard() PM / devfreq: tegra30-devfreq: add support for Tegra114 PM / devfreq: use _visible attribute to replace create/remove_sysfs_files() PM / devfreq: Remove unneeded casting for HZ_PER_KHZ MAINTAINERS: amd-pstate: Step down as maintainer, add Prateek as reviewer cpufreq: Pass the policy to cpufreq_driver->adjust_perf() cpufreq/amd-pstate: Pass the policy to amd_pstate_update() cpufreq/amd-pstate-ut: Add a unit test for raw EPP cpufreq/amd-pstate: Add support for raw EPP writes cpufreq/amd-pstate: Add support for platform profile class cpufreq/amd-pstate: add kernel command line to override dynamic epp cpufreq/amd-pstate: Add dynamic energy performance preference Documentation: amd-pstate: fix dead links in the reference section cpufreq/amd-pstate: Cache the max frequency in cpudata Documentation/amd-pstate: Add documentation for amd_pstate_floor_{freq,count} Documentation/amd-pstate: List amd_pstate_prefcore_ranking sysfs file Documentation/amd-pstate: List amd_pstate_hw_prefcore sysfs file amd-pstate-ut: Add a testcase to validate the visibility of driver attributes ...	2026-04-13 19:47:52 -07:00
Linus Torvalds	4793dae01f	Driver core changes for 7.1-rc1 - debugfs: - Fix NULL pointer dereference in debugfs_create_str() - Fix misplaced EXPORT_SYMBOL_GPL for debugfs_create_str() - Fix soundwire debugfs NULL pointer dereference from uninitialized firmware_file - device property: - Make fwnode flags modifications thread safe; widen the field to unsigned long and use set_bit() / clear_bit() based accessors - Document how to check for the property presence - devres: - Separate struct devres_node from its "subclasses" (struct devres, struct devres_group); give struct devres_node its own release and free callbacks for per-type dispatch - Introduce struct devres_action for devres actions, avoiding the ARCH_DMA_MINALIGN alignment overhead of struct devres - Export struct devres_node and its init/add/remove/dbginfo primitives for use by Rust Devres<T> - Fix missing node debug info in devm_krealloc() - Use guard(spinlock_irqsave) where applicable; consolidate unlock paths in devres_release_group() - driver_override: - Convert PCI, WMI, vdpa, s390/cio, s390/ap, and fsl-mc to the generic driver_override infrastructure, replacing per-bus driver_override strings, sysfs attributes, and match logic; fixes a potential UAF from unsynchronized access to driver_override in bus match() callbacks - Simplify __device_set_driver_override() logic - kernfs: - Send IN_DELETE_SELF and IN_IGNORED inotify events on kernfs file and directory removal - Add corresponding selftests for memcg - platform: - Allow attaching software nodes when creating platform devices via a new 'swnode' field in struct platform_device_info - Add kerneldoc for struct platform_device_info - software node: - Move software node initialization from postcore_initcall() to driver_init(), making it available early in the boot process - Move kernel_kobj initialization (ksysfs_init) earlier to support the above - Remove software_node_exit(); dead code in a built-in unit - SoC: - Introduce of_machine_read_compatible() and of_machine_read_model() OF helpers and export soc_attr_read_machine() to replace direct accesses to of_root from SoC drivers; also enables CONFIG_COMPILE_TEST coverage for these drivers - sysfs: - Constify attribute group array pointers to 'const struct attribute_group const ' in sysfs functions, device_add_groups() / device_remove_groups(), and struct class - Rust: - Devres: - Embed struct devres_node directly in Devres<T> instead of going through devm_add_action(), avoiding the extra allocation and the unnecessary ARCH_DMA_MINALIGN alignment - I/O: - Turn IoCapable from a marker trait into a functional trait carrying the raw I/O accessor implementation (io_read / io_write), providing working defaults for the per-type Io methods - Add RelaxedMmio wrapper type, making relaxed accessors usable in code generic over the Io trait - Remove overloaded per-type Io methods and per-backend macros from Mmio and PCI ConfigSpace - I/O (Register): - Add IoLoc trait and generic read/write/update methods to the Io trait, making I/O operations parameterizable by typed locations - Add register! macro for defining hardware register types with typed bitfield accessors backed by Bounded values; supports direct, relative, and array register addressing - Add write_reg() / try_write_reg() and LocatedRegister trait - Update PCI sample driver to demonstrate the register! macro Example: ``` register! { /// UART control register. CTRL(u32) @ 0x18 { /// Receiver enable. 19:19 rx_enable => bool; /// Parity configuration. 14:13 parity ?=> Parity; } /// FIFO watermark and counter register. WATER(u32) @ 0x2c { /// Number of datawords in the receive FIFO. 26:24 rx_count; /// RX interrupt threshold. 17:16 rx_water; } } impl WATER { fn rx_above_watermark(&self) -> bool { self.rx_count() > self.rx_water() } } fn init(bar: &pci::Bar<BAR0_SIZE>) { let water = WATER::zeroed() .with_const_rx_water::<1>(); // > 3 would not compile bar.write_reg(water); let ctrl = CTRL::zeroed() .with_parity(Parity::Even) .with_rx_enable(true); bar.write_reg(ctrl); } fn handle_rx(bar: &pci::Bar<BAR0_SIZE>) { if bar.read(WATER).rx_above_watermark() { // drain the FIFO } } fn set_parity(bar: &pci::Bar<BAR0_SIZE>, parity: Parity) { bar.update(CTRL, \|r\| r.with_parity(parity)); } ``` - IRQ: - Move 'static bounds from where clauses to trait declarations for IRQ handler traits - Misc: - Enable the generic_arg_infer Rust feature - Extend Bounded with shift operations, single-bit bool conversion, and const get() - Misc: - Make deferred_probe_timeout default a Kconfig option - Drop auxiliary_dev_pm_ops; the PM core falls back to driver PM callbacks when no bus type PM ops are set - Add conditional guard support for device_lock() - Add ksysfs.c to the DRIVER CORE MAINTAINERS entry - Fix kernel-doc warnings in base.h - Fix stale reference to memory_block_add_nid() in documentation -----BEGIN PGP SIGNATURE----- iHUEABYKAB0WIQS2q/xV6QjXAdC7k+1FlHeO1qrKLgUCadl5SwAKCRBFlHeO1qrK LpjDAQCSG3vYznwrngfpmRU5bCB9sdUy/pZiX5px1357+amJkwEA9LgIVQvtHAZW ZXcQ7Jr+mR3mJEdlatbkWHp3w1VHqAQ= =y1DV -----END PGP SIGNATURE----- Merge tag 'driver-core-7.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/driver-core/driver-core Pull driver core updates from Danilo Krummrich: "debugfs: - Fix NULL pointer dereference in debugfs_create_str() - Fix misplaced EXPORT_SYMBOL_GPL for debugfs_create_str() - Fix soundwire debugfs NULL pointer dereference from uninitialized firmware_file device property: - Make fwnode flags modifications thread safe; widen the field to unsigned long and use set_bit() / clear_bit() based accessors - Document how to check for the property presence devres: - Separate struct devres_node from its "subclasses" (struct devres, struct devres_group); give struct devres_node its own release and free callbacks for per-type dispatch - Introduce struct devres_action for devres actions, avoiding the ARCH_DMA_MINALIGN alignment overhead of struct devres - Export struct devres_node and its init/add/remove/dbginfo primitives for use by Rust Devres<T> - Fix missing node debug info in devm_krealloc() - Use guard(spinlock_irqsave) where applicable; consolidate unlock paths in devres_release_group() driver_override: - Convert PCI, WMI, vdpa, s390/cio, s390/ap, and fsl-mc to the generic driver_override infrastructure, replacing per-bus driver_override strings, sysfs attributes, and match logic; fixes a potential UAF from unsynchronized access to driver_override in bus match() callbacks - Simplify __device_set_driver_override() logic kernfs: - Send IN_DELETE_SELF and IN_IGNORED inotify events on kernfs file and directory removal - Add corresponding selftests for memcg platform: - Allow attaching software nodes when creating platform devices via a new 'swnode' field in struct platform_device_info - Add kerneldoc for struct platform_device_info software node: - Move software node initialization from postcore_initcall() to driver_init(), making it available early in the boot process - Move kernel_kobj initialization (ksysfs_init) earlier to support the above - Remove software_node_exit(); dead code in a built-in unit SoC: - Introduce of_machine_read_compatible() and of_machine_read_model() OF helpers and export soc_attr_read_machine() to replace direct accesses to of_root from SoC drivers; also enables CONFIG_COMPILE_TEST coverage for these drivers sysfs: - Constify attribute group array pointers to 'const struct attribute_group const ' in sysfs functions, device_add_groups() / device_remove_groups(), and struct class Rust: - Devres: - Embed struct devres_node directly in Devres<T> instead of going through devm_add_action(), avoiding the extra allocation and the unnecessary ARCH_DMA_MINALIGN alignment - I/O: - Turn IoCapable from a marker trait into a functional trait carrying the raw I/O accessor implementation (io_read / io_write), providing working defaults for the per-type Io methods - Add RelaxedMmio wrapper type, making relaxed accessors usable in code generic over the Io trait - Remove overloaded per-type Io methods and per-backend macros from Mmio and PCI ConfigSpace - I/O (Register): - Add IoLoc trait and generic read/write/update methods to the Io trait, making I/O operations parameterizable by typed locations - Add register! macro for defining hardware register types with typed bitfield accessors backed by Bounded values; supports direct, relative, and array register addressing - Add write_reg() / try_write_reg() and LocatedRegister trait - Update PCI sample driver to demonstrate the register! macro Example: ``` register! { /// UART control register. CTRL(u32) @ 0x18 { /// Receiver enable. 19:19 rx_enable => bool; /// Parity configuration. 14:13 parity ?=> Parity; } /// FIFO watermark and counter register. WATER(u32) @ 0x2c { /// Number of datawords in the receive FIFO. 26:24 rx_count; /// RX interrupt threshold. 17:16 rx_water; } } impl WATER { fn rx_above_watermark(&self) -> bool { self.rx_count() > self.rx_water() } } fn init(bar: &pci::Bar<BAR0_SIZE>) { let water = WATER::zeroed() .with_const_rx_water::<1>(); // > 3 would not compile bar.write_reg(water); let ctrl = CTRL::zeroed() .with_parity(Parity::Even) .with_rx_enable(true); bar.write_reg(ctrl); } fn handle_rx(bar: &pci::Bar<BAR0_SIZE>) { if bar.read(WATER).rx_above_watermark() { // drain the FIFO } } fn set_parity(bar: &pci::Bar<BAR0_SIZE>, parity: Parity) { bar.update(CTRL, \|r\| r.with_parity(parity)); } ``` - IRQ: - Move 'static bounds from where clauses to trait declarations for IRQ handler traits - Misc: - Enable the generic_arg_infer Rust feature - Extend Bounded with shift operations, single-bit bool conversion, and const get() Misc: - Make deferred_probe_timeout default a Kconfig option - Drop auxiliary_dev_pm_ops; the PM core falls back to driver PM callbacks when no bus type PM ops are set - Add conditional guard support for device_lock() - Add ksysfs.c to the DRIVER CORE MAINTAINERS entry - Fix kernel-doc warnings in base.h - Fix stale reference to memory_block_add_nid() in documentation" * tag 'driver-core-7.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/driver-core/driver-core: (67 commits) bus: fsl-mc: use generic driver_override infrastructure s390/ap: use generic driver_override infrastructure s390/cio: use generic driver_override infrastructure vdpa: use generic driver_override infrastructure platform/wmi: use generic driver_override infrastructure PCI: use generic driver_override infrastructure driver core: make software nodes available earlier software node: remove software_node_exit() kernel: ksysfs: initialize kernel_kobj earlier MAINTAINERS: add ksysfs.c to the DRIVER CORE entry drivers/base/memory: fix stale reference to memory_block_add_nid() device property: Document how to check for the property presence soundwire: debugfs: initialize firmware_file to empty string debugfs: fix placement of EXPORT_SYMBOL_GPL for debugfs_create_str() debugfs: check for NULL pointer in debugfs_create_str() driver core: Make deferred_probe_timeout default a Kconfig option driver core: simplify __device_set_driver_override() clearing logic driver core: auxiliary bus: Drop auxiliary_dev_pm_ops device property: Make modifications of fwnode "flags" thread safe rust: devres: embed struct devres_node directly ...	2026-04-13 19:03:11 -07:00
Linus Torvalds	d568788baa	hardening updates for v7.1-rc1 - randomize_kstack: Improve implementation across arches (Ryan Roberts) - lkdtm/fortify: Drop unneeded FORTIFY_STR_OBJECT test - refcount: Remove unused __signed_wrap function annotations -----BEGIN PGP SIGNATURE----- iHUEABYKAB0WIQRSPkdeREjth1dHnSE2KwveOeQkuwUCad16PwAKCRA2KwveOeQk u7crAP4qz8gXCjes76KsZm/YQS8PtOG5JroAVu5Oa4ohw0RfaQD+K/XLow1plcNF 4Bi8zSuv2ifcLysh9qEAbx5+wcHijgo= =woB3 -----END PGP SIGNATURE----- Merge tag 'hardening-v7.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux Pull hardening updates from Kees Cook: - randomize_kstack: Improve implementation across arches (Ryan Roberts) - lkdtm/fortify: Drop unneeded FORTIFY_STR_OBJECT test - refcount: Remove unused __signed_wrap function annotations * tag 'hardening-v7.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: lkdtm/fortify: Drop unneeded FORTIFY_STR_OBJECT test refcount: Remove unused __signed_wrap function annotations randomize_kstack: Unify random source across arches randomize_kstack: Maintain kstack_offset per task	2026-04-13 17:52:29 -07:00
Linus Torvalds	cea4a90faf	seccomp update for v7.1-rc1 - selftests: Add hard-coded __NR_uprobe for x86_64 (Oleg Nesterov) -----BEGIN PGP SIGNATURE----- iHUEABYKAB0WIQRSPkdeREjth1dHnSE2KwveOeQkuwUCad146wAKCRA2KwveOeQk u9LrAP4hemiMrgvEIo2DPVJ8H+QiRZtsJvFLFC/yvdgrllxyUQEA24RjRKXoZcL4 G9hD3FL31MTcbTV9Gdf0bOlLCqqUqg4= =7sDF -----END PGP SIGNATURE----- Merge tag 'seccomp-v7.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux Pull seccomp update from Kees Cook: - selftests: Add hard-coded __NR_uprobe for x86_64 (Oleg Nesterov) * tag 'seccomp-v7.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: selftests/seccomp: Add hard-coded __NR_uprobe for x86_64	2026-04-13 17:45:25 -07:00
Linus Torvalds	d142ab35ee	CRC updates for 7.1 - Several improvements related to crc_kunit, to align with the standard KUnit conventions and make it easier for developers and CI systems to run this test suite - Add an arm64-optimized implementation of CRC64-NVME - Remove unused code for big endian arm64 -----BEGIN PGP SIGNATURE----- iIoEABYIADIWIQSacvsUNc7UX4ntmEPzXCl4vpKOKwUCadWAVxQcZWJpZ2dlcnNA a2VybmVsLm9yZwAKCRDzXCl4vpKOK4zEAP9Asz9U6aA8XdlYa66nSORpUz0zQxVL tIvSOaQis6K0LwEAm4xABXc+rrVHqYhePRRqDteAEOzhBOX9mwY8wPjR3Ag= =VknF -----END PGP SIGNATURE----- Merge tag 'crc-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux Pull CRC updates from Eric Biggers: - Several improvements related to crc_kunit, to align with the standard KUnit conventions and make it easier for developers and CI systems to run this test suite - Add an arm64-optimized implementation of CRC64-NVME - Remove unused code for big endian arm64 * tag 'crc-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux: lib/crc: arm64: Simplify intrinsics implementation lib/crc: arm64: Use existing macros for kernel-mode FPU cflags lib/crc: arm64: Drop unnecessary chunking logic from crc64 lib/crc: arm64: Assume a little-endian kernel lib/crc: arm64: add NEON accelerated CRC64-NVMe implementation lib/crc: arm64: Drop check for CONFIG_KERNEL_MODE_NEON crypto: crc32c - Remove another outdated comment crypto: crc32c - Remove more outdated usage information kunit: configs: Enable all CRC tests in all_tests.config lib/crc: tests: Add a .kunitconfig file lib/crc: tests: Add CRC_ENABLE_ALL_FOR_KUNIT lib/crc: tests: Make crc_kunit test only the enabled CRC variants	2026-04-13 17:36:04 -07:00
Linus Torvalds	370c388319	Crypto library updates for 7.1 - Migrate more hash algorithms from the traditional crypto subsystem to lib/crypto/. Like the algorithms migrated earlier (e.g. SHA-), this simplifies the implementations, improves performance, enables further simplifications in calling code, and solves various other issues: - AES CBC-based MACs (AES-CMAC, AES-XCBC-MAC, and AES-CBC-MAC) - Support these algorithms in lib/crypto/ using the AES library and the existing arm64 assembly code - Reimplement the traditional crypto API's "cmac(aes)", "xcbc(aes)", and "cbcmac(aes)" on top of the library - Convert mac80211 to use the AES-CMAC library. Note: several other subsystems can use it too and will be converted later - Drop the broken, nonstandard, and likely unused support for "xcbc(aes)" with key lengths other than 128 bits - Enable optimizations by default - GHASH - Migrate the standalone GHASH code into lib/crypto/ - Integrate the GHASH code more closely with the very similar POLYVAL code, and improve the generic GHASH implementation to resist cache-timing attacks and use much less memory - Reimplement the AES-GCM library and the "gcm" crypto_aead template on top of the GHASH library. Remove "ghash" from the crypto_shash API, as it's no longer needed - Enable optimizations by default - SM3 - Migrate the kernel's existing SM3 code into lib/crypto/, and reimplement the traditional crypto API's "sm3" on top of it - I don't recommend using SM3, but this cleanup is worthwhile to organize the code the same way as other algorithms - Testing improvements - Add a KUnit test suite for each of the new library APIs - Migrate the existing ChaCha20Poly1305 test to KUnit - Make the KUnit all_tests.config enable all crypto library tests - Move the test kconfig options to the Runtime Testing menu - Other updates to arch-optimized crypto code - Optimize SHA-256 for Zhaoxin CPUs using the Padlock Hash Engine - Remove some MD5 implementations that are no longer worth keeping - Drop big endian and voluntary preemption support from the arm64 code, as those configurations are no longer supported on arm64 - Make jitterentropy and samples/tsm-mr use the crypto library APIs Note: the overall diffstat is neutral, but when the test code is excluded it is significantly negative: Tests: 13 files changed, 1982 insertions(+), 888 deletions(-) Non-test: 141 files changed, 2897 insertions(+), 3987 deletions(-) All: 154 files changed, 4879 insertions(+), 4875 deletions(-) -----BEGIN PGP SIGNATURE----- iIoEABYIADIWIQSacvsUNc7UX4ntmEPzXCl4vpKOKwUCadWPyxQcZWJpZ2dlcnNA a2VybmVsLm9yZwAKCRDzXCl4vpKOK8QCAQD0i98miI1mu01RKuEwrBzmn7L/2sUH ReYV/dFDtnN0GwD+KMCiNAM2XTVLRKq5t3OxPHpKZ4y+gZwRowAJeFA02Q8= =5rip -----END PGP SIGNATURE----- Merge tag 'libcrypto-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux Pull crypto library updates from Eric Biggers: - Migrate more hash algorithms from the traditional crypto subsystem to lib/crypto/ Like the algorithms migrated earlier (e.g. SHA-), this simplifies the implementations, improves performance, enables further simplifications in calling code, and solves various other issues: - AES CBC-based MACs (AES-CMAC, AES-XCBC-MAC, and AES-CBC-MAC) - Support these algorithms in lib/crypto/ using the AES library and the existing arm64 assembly code - Reimplement the traditional crypto API's "cmac(aes)", "xcbc(aes)", and "cbcmac(aes)" on top of the library - Convert mac80211 to use the AES-CMAC library. Note: several other subsystems can use it too and will be converted later - Drop the broken, nonstandard, and likely unused support for "xcbc(aes)" with key lengths other than 128 bits - Enable optimizations by default - GHASH - Migrate the standalone GHASH code into lib/crypto/ - Integrate the GHASH code more closely with the very similar POLYVAL code, and improve the generic GHASH implementation to resist cache-timing attacks and use much less memory - Reimplement the AES-GCM library and the "gcm" crypto_aead template on top of the GHASH library. Remove "ghash" from the crypto_shash API, as it's no longer needed - Enable optimizations by default - SM3 - Migrate the kernel's existing SM3 code into lib/crypto/, and reimplement the traditional crypto API's "sm3" on top of it - I don't recommend using SM3, but this cleanup is worthwhile to organize the code the same way as other algorithms - Testing improvements: - Add a KUnit test suite for each of the new library APIs - Migrate the existing ChaCha20Poly1305 test to KUnit - Make the KUnit all_tests.config enable all crypto library tests - Move the test kconfig options to the Runtime Testing menu - Other updates to arch-optimized crypto code: - Optimize SHA-256 for Zhaoxin CPUs using the Padlock Hash Engine - Remove some MD5 implementations that are no longer worth keeping - Drop big endian and voluntary preemption support from the arm64 code, as those configurations are no longer supported on arm64 - Make jitterentropy and samples/tsm-mr use the crypto library APIs * tag 'libcrypto-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux: (66 commits) lib/crypto: arm64: Assume a little-endian kernel arm64: fpsimd: Remove obsolete cond_yield macro lib/crypto: arm64/sha3: Remove obsolete chunking logic lib/crypto: arm64/sha512: Remove obsolete chunking logic lib/crypto: arm64/sha256: Remove obsolete chunking logic lib/crypto: arm64/sha1: Remove obsolete chunking logic lib/crypto: arm64/poly1305: Remove obsolete chunking logic lib/crypto: arm64/gf128hash: Remove obsolete chunking logic lib/crypto: arm64/chacha: Remove obsolete chunking logic lib/crypto: arm64/aes: Remove obsolete chunking logic lib/crypto: Include <crypto/utils.h> instead of <crypto/algapi.h> lib/crypto: aesgcm: Don't disable IRQs during AES block encryption lib/crypto: aescfb: Don't disable IRQs during AES block encryption lib/crypto: tests: Migrate ChaCha20Poly1305 self-test to KUnit lib/crypto: sparc: Drop optimized MD5 code lib/crypto: mips: Drop optimized MD5 code lib: Move crypto library tests to Runtime Testing menu crypto: sm3 - Remove 'struct sm3_state' crypto: sm3 - Remove the original "sm3_block_generic()" crypto: sm3 - Remove sm3_base.h ...	2026-04-13 17:31:39 -07:00
Linus Torvalds	7fe6ac157b	for-7.1/block-20260411 -----BEGIN PGP SIGNATURE----- iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAmna0tgQHGF4Ym9lQGtl cm5lbC5kawAKCRD301j7KXHgptEbD/0ZMEsz5pcN+/bpM9Qva5lVVkByRieua+JA T7L+JMcEigp1Hf2idAPlv1e9dbrtgOGhkjZNlbZenP2MHXBmbUTnzTWDKW5w0ZQ4 UqnVC7fMmxzI57DPt7iG/1WQo8O6QPHWwBof5ZXn0b83qwByTB2oVkAb9ysT7CdM wGk5KnPRLIAWf5o+aZ4LoWE+196jQiszx1m6U58FTqnCgvJ/GyKyrgzx+uvGUgF+ owZT/6TrN7cN9A68fOnmcjEZ7beZXygOQPTn32sF9rEOi8JsgK71EE2LofdVVSNU ES/tyKVJbSNDgUH2b0T84rErT4MtZcw5J29V3k7CVndC+DcT2uLSroPz3lYQjDg9 TLeq7ZLjnyoBG+muboWdXcvBKn3aKLec3nfVSbz6J1xb/Z22gWYy5TZbrGnGH8fJ zBiyKkHMaZi55IdTDWQT3a48h36qFh0Y2wbvZ6uhyYOfXHyj4pA4ccJZgFfmf4ZG flVRFGEL9Tqc82lB8dfy9DBp0ZQSjeBUCd+gyDKjiuWVau5L5iTUeMMkt8yr7qbg PY+ATJcHk5S5zwM2xcZUt5EcHBBbCaKQ6DdRZKwzMMUvCjHlvnWvENVjUtRa9Dng 1vUKpB/e5NGpqD05Iqgyai+OD9/tALc4sUEI2yQ7/dk9pKIXQ4RE9HR/pSkgbjeR LGokj08cgg== =ga3t -----END PGP SIGNATURE----- Merge tag 'for-7.1/block-20260411' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux Pull block updates from Jens Axboe: - Add shared memory zero-copy I/O support for ublk, bypassing per-I/O copies between kernel and userspace by matching registered buffer PFNs at I/O time. Includes selftests. - Refactor bio integrity to support filesystem initiated integrity operations and arbitrary buffer alignment. - Clean up bio allocation, splitting bio_alloc_bioset() into clear fast and slow paths. Add bio_await() and bio_submit_or_kill() helpers, unify synchronous bi_end_io callbacks. - Fix zone write plug refcount handling and plug removal races. Add support for serializing zone writes at QD=1 for rotational zoned devices, yielding significant throughput improvements. - Add SED-OPAL ioctls for Single User Mode management and a STACK_RESET command. - Add io_uring passthrough (uring_cmd) support to the BSG layer. - Replace pp_buf in partition scanning with struct seq_buf. - zloop improvements and cleanups. - drbd genl cleanup, switching to pre_doit/post_doit. - NVMe pull request via Keith: - Fabrics authentication updates - Enhanced block queue limits support - Workqueue usage updates - A new write zeroes device quirk - Tagset cleanup fix for loop device - MD pull requests via Yu Kuai: - Fix raid5 soft lockup in retry_aligned_read() - Fix raid10 deadlock with check operation and nowait requests - Fix raid1 overlapping writes on writemostly disks - Fix sysfs deadlock on array_state=clear - Proactive RAID-5 parity building with llbitmap, with write_zeroes_unmap optimization for initial sync - Fix llbitmap barrier ordering, rdev skipping, and bitmap_ops version mismatch fallback - Fix bcache use-after-free and uninitialized closure - Validate raid5 journal metadata payload size - Various cleanups - Various other fixes, improvements, and cleanups * tag 'for-7.1/block-20260411' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux: (146 commits) ublk: fix tautological comparison warning in ublk_ctrl_reg_buf scsi: bsg: fix buffer overflow in scsi_bsg_uring_cmd() block: refactor blkdev_zone_mgmt_ioctl MAINTAINERS: update ublk driver maintainer email Documentation: ublk: address review comments for SHMEM_ZC docs ublk: allow buffer registration before device is started ublk: replace xarray with IDA for shmem buffer index allocation ublk: simplify PFN range loop in __ublk_ctrl_reg_buf ublk: verify all pages in multi-page bvec fall within registered range ublk: widen ublk_shmem_buf_reg.len to __u64 for 4GB buffer support xfs: use bio_await in xfs_zone_gc_reset_sync block: add a bio_submit_or_kill helper block: factor out a bio_await helper block: unify the synchronous bi_end_io callbacks xfs: fix number of GC bvecs selftests/ublk: add read-only buffer registration test selftests/ublk: add filesystem fio verify test for shmem_zc selftests/ublk: add hugetlbfs shmem_zc test for loop target selftests/ublk: add shared memory zero-copy test selftests/ublk: add UBLK_F_SHMEM_ZC support for loop target ...	2026-04-13 15:51:31 -07:00
Linus Torvalds	b8f82cb0d8	Landlock update for v7.1-rc1 -----BEGIN PGP SIGNATURE----- iIYEABYKAC4WIQSVyBthFV4iTW/VU1/l49DojIL20gUCadfgCxAcbWljQGRpZ2lr b2QubmV0AAoJEOXj0OiMgvbSl5cA/0QZjJ+4V2DVzJQM5qzmNK9He9uYaOs7F2Ks xRvg7IebAPwMEcVY+CVQxD+YGj08UgM753yx4CRbhsu4k5mowEEJDQ== =Lz9R -----END PGP SIGNATURE----- Merge tag 'landlock-7.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/mic/linux Pull Landlock update from Mickaël Salaün: "This adds a new Landlock access right for pathname UNIX domain sockets thanks to a new LSM hook, and a few fixes" * tag 'landlock-7.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/mic/linux: (23 commits) landlock: Document fallocate(2) as another truncation corner case landlock: Document FS access right for pathname UNIX sockets selftests/landlock: Simplify ruleset creation and enforcement in fs_test selftests/landlock: Check that coredump sockets stay unrestricted selftests/landlock: Audit test for LANDLOCK_ACCESS_FS_RESOLVE_UNIX selftests/landlock: Test LANDLOCK_ACCESS_FS_RESOLVE_UNIX selftests/landlock: Replace access_fs_16 with ACCESS_ALL in fs_test samples/landlock: Add support for named UNIX domain socket restrictions landlock: Clarify BUILD_BUG_ON check in scoping logic landlock: Control pathname UNIX domain socket resolution by path landlock: Use mem_is_zero() in is_layer_masks_allowed() lsm: Add LSM hook security_unix_find landlock: Fix kernel-doc warning for pointer-to-array parameters landlock: Fix formatting in tsync.c landlock: Improve kernel-doc "Return:" section consistency landlock: Add missing kernel-doc "Return:" sections selftests/landlock: Fix format warning for __u64 in net_test selftests/landlock: Skip stale records in audit_match_record() selftests/landlock: Drain stale audit records on init selftests/landlock: Fix socket file descriptor leaks in audit helpers ...	2026-04-13 15:42:19 -07:00
Linus Torvalds	ef3da345cc	vfs-7.1-rc1.misc Please consider pulling these changes from the signed vfs-7.1-rc1.misc tag. Thanks! Christian -----BEGIN PGP SIGNATURE----- iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCadjZCwAKCRCRxhvAZXjc ohhBAQCAmQMlMRAXAgUZFYMTZpeQlcujP5rv+/vT2Tf/xS76YwD/dRDaw1FH294+ qtk/Z1NjleNixzE2sld1K9J32NxeyAc= =+g9q -----END PGP SIGNATURE----- Merge tag 'vfs-7.1-rc1.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs Pull misc vfs updates from Christian Brauner: "Features: - coredump: add tracepoint for coredump events - fs: hide file and bfile caches behind runtime const machinery Fixes: - fix architecture-specific compat_ftruncate64 implementations - dcache: Limit the minimal number of bucket to two - fs/omfs: reject s_sys_blocksize smaller than OMFS_DIR_START - fs/mbcache: cancel shrink work before destroying the cache - dcache: permit dynamic_dname()s up to NAME_MAX Cleanups: - remove or unexport unused fs_context infrastructure - trivial ->setattr cleanups - selftests/filesystems: Assume that TIOCGPTPEER is defined - writeback: fix kernel-doc function name mismatch for wb_put_many() - autofs: replace manual symlink buffer allocation in autofs_dir_symlink - init/initramfs.c: trivial fix: FSM -> Finite-state machine - fs: remove stale and duplicate forward declarations - readdir: Introduce dirent_size() - fs: Replace user_access_{begin/end} by scoped user access - kernel: acct: fix duplicate word in comment - fs: write a better comment in step_into() concerning .mnt assignment - fs: attr: fix comment formatting and spelling issues" * tag 'vfs-7.1-rc1.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: (28 commits) dcache: permit dynamic_dname()s up to NAME_MAX fs: attr: fix comment formatting and spelling issues fs: hide file and bfile caches behind runtime const machinery fs: write a better comment in step_into() concerning .mnt assignment proc: rename proc_notify_change to proc_setattr proc: rename proc_setattr to proc_nochmod_setattr affs: rename affs_notify_change to affs_setattr adfs: rename adfs_notify_change to adfs_setattr hfs: update comments on hfs_inode_setattr kernel: acct: fix duplicate word in comment fs: Replace user_access_{begin/end} by scoped user access readdir: Introduce dirent_size() coredump: add tracepoint for coredump events fs: remove do_sys_truncate fs: pass on FTRUNCATE_* flags to do_truncate fs: fix archiecture-specific compat_ftruncate64 fs: remove stale and duplicate forward declarations init/initramfs.c: trivial fix: FSM -> Finite-state machine autofs: replace manual symlink buffer allocation in autofs_dir_symlink fs/mbcache: cancel shrink work before destroying the cache ...	2026-04-13 14:20:11 -07:00
Linus Torvalds	07c3ef5822	vfs-7.1-rc1.pidfs Please consider pulling these changes from the signed vfs-7.1-rc1.pidfs tag. Thanks! Christian -----BEGIN PGP SIGNATURE----- iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCadjZCwAKCRCRxhvAZXjc omfuAQDckt5g7vxBr9hKdyrq1//nsu44fst/mRqr2iSYjuKfPQD/VN6Lw9e56Y/q l4hHxsPPrSSxbijwng7im36iPIGdfwI= =BbFh -----END PGP SIGNATURE----- Merge tag 'vfs-7.1-rc1.pidfs' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs Pull clone and pidfs updates from Christian Brauner: "Add three new clone3() flags for pidfd-based process lifecycle management. CLONE_AUTOREAP: CLONE_AUTOREAP makes a child process auto-reap on exit without ever becoming a zombie. This is a per-process property in contrast to the existing auto-reap mechanism via SA_NOCLDWAIT or SIG_IGN for SIGCHLD which applies to all children of a given parent. Currently the only way to automatically reap children is to set SA_NOCLDWAIT or SIG_IGN on SIGCHLD. This is a parent-scoped property affecting all children which makes it unsuitable for libraries or applications that need selective auto-reaping of specific children while still being able to wait() on others. CLONE_AUTOREAP stores an autoreap flag in the child's signal_struct. When the child exits do_notify_parent() checks this flag and causes exit_notify() to transition the task directly to EXIT_DEAD. Since the flag lives on the child it survives reparenting: if the original parent exits and the child is reparented to a subreaper or init the child still auto-reaps when it eventually exits. This is cleaner than forcing the subreaper to get SIGCHLD and then reaping it. If the parent doesn't care the subreaper won't care. If there's a subreaper that would care it would be easy enough to add a prctl() that either just turns back on SIGCHLD and turns off auto-reaping or a prctl() that just notifies the subreaper whenever a child is reparented to it. CLONE_AUTOREAP can be combined with CLONE_PIDFD to allow the parent to monitor the child's exit via poll() and retrieve exit status via PIDFD_GET_INFO. Without CLONE_PIDFD it provides a fire-and-forget pattern. No exit signal is delivered so exit_signal must be zero. CLONE_THREAD and CLONE_PARENT are rejected: CLONE_THREAD because autoreap is a process-level property, and CLONE_PARENT because an autoreap child reparented via CLONE_PARENT could become an invisible zombie under a parent that never calls wait(). The flag is not inherited by the autoreap process's own children. Each child that should be autoreaped must be explicitly created with CLONE_AUTOREAP. CLONE_NNP: CLONE_NNP sets no_new_privs on the child at clone time. Unlike prctl(PR_SET_NO_NEW_PRIVS) which a process sets on itself, CLONE_NNP allows the parent to impose no_new_privs on the child at creation without affecting the parent's own privileges. CLONE_THREAD is rejected because threads share credentials. CLONE_NNP is useful on its own for any spawn-and-sandbox pattern but was specifically introduced to enable unprivileged usage of CLONE_PIDFD_AUTOKILL. CLONE_PIDFD_AUTOKILL: This flag ties a child's lifetime to the pidfd returned from clone3(). When the last reference to the struct file created by clone3() is closed the kernel sends SIGKILL to the child. A pidfd obtained via pidfd_open() for the same process does not keep the child alive and does not trigger autokill - only the specific struct file from clone3() has this property. This is useful for container runtimes, service managers, and sandboxed subprocess execution - any scenario where the child must die if the parent crashes or abandons the pidfd or just wants a throwaway helper process. CLONE_PIDFD_AUTOKILL requires both CLONE_PIDFD and CLONE_AUTOREAP. It requires CLONE_PIDFD because the whole point is tying the child's lifetime to the pidfd. It requires CLONE_AUTOREAP because a killed child with no one to reap it would become a zombie - the primary use case is the parent crashing or abandoning the pidfd so no one is around to call waitpid(). CLONE_THREAD is rejected because autokill targets a process not a thread. If CLONE_NNP is specified together with CLONE_PIDFD_AUTOKILL an unprivileged user may spawn a process that is autokilled. The child cannot escalate privileges via setuid/setgid exec after being spawned. If CLONE_PIDFD_AUTOKILL is specified without CLONE_NNP the caller must have have CAP_SYS_ADMIN in its user namespace" * tag 'vfs-7.1-rc1.pidfs' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: selftests: check pidfd_info->coredump_code correctness pidfds: add coredump_code field to pidfd_info kselftest/coredump: reintroduce null pointer dereference selftests/pidfd: add CLONE_PIDFD_AUTOKILL tests selftests/pidfd: add CLONE_NNP tests selftests/pidfd: add CLONE_AUTOREAP tests pidfd: add CLONE_PIDFD_AUTOKILL clone: add CLONE_NNP clone: add CLONE_AUTOREAP	2026-04-13 13:27:11 -07:00
Linus Torvalds	c8db08110c	vfs-7.1-rc1.xattr Please consider pulling these changes from the signed vfs-7.1-rc1.xattr tag. Thanks! Christian -----BEGIN PGP SIGNATURE----- iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCadjZCgAKCRCRxhvAZXjc olWpAQCf//do9MtHmqTjgVfMIv18vw5woI8Bdx/X3m5S9e2D9gD/QddDnkFiVMmu txsT+N8BGuW+jbAkYwax5LnUFGUTYgY= =a4gm -----END PGP SIGNATURE----- Merge tag 'vfs-7.1-rc1.xattr' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs Pull vfs xattr updates from Christian Brauner: "This reworks the simple_xattr infrastructure and adds support for user.* extended attributes on sockets. The simple_xattr subsystem currently uses an rbtree protected by a reader-writer spinlock. This series replaces the rbtree with an rhashtable giving O(1) average-case lookup with RCU-based lockless reads. This sped up concurrent access patterns on tmpfs quite a bit and it's an overall easy enough conversion to do and gets rid or rwlock_t. The conversion is done incrementally: a new rhashtable path is added alongside the existing rbtree, consumers are migrated one at a time (shmem, kernfs, pidfs), and then the rbtree code is removed. All three consumers switch from embedded structs to pointer-based lazy allocation so the rhashtable overhead is only paid for inodes that actually use xattrs. With this infrastructure in place the series adds support for user.* xattrs on sockets. Path-based AF_UNIX sockets inherit xattr support from the underlying filesystem (e.g. tmpfs) but sockets in sockfs - that is everything created via socket() including abstract namespace AF_UNIX sockets - had no xattr support at all. The xattr_permission() checks are reworked to allow user.* xattrs on S_IFSOCK inodes. Sockfs sockets get per-inode limits of 128 xattrs and 128KB total value size matching the limits already in use for kernfs. The practical motivation comes from several directions. systemd and GNOME are expanding their use of Varlink as an IPC mechanism. For D-Bus there are tools like dbus-monitor that can observe IPC traffic across the system but this only works because D-Bus has a central broker. For Varlink there is no broker and there is currently no way to identify which sockets speak Varlink. With user.* xattrs on sockets a service can label its socket with the IPC protocol it speaks (e.g., user.varlink=1) and an eBPF program can then selectively capture traffic on those sockets. Enumerating bound sockets via netlink combined with these xattr labels gives a way to discover all Varlink IPC entrypoints for debugging and introspection. Similarly, systemd-journald wants to use xattrs on the /dev/log socket for protocol negotiation to indicate whether RFC 5424 structured syslog is supported or whether only the legacy RFC 3164 format should be used. In containers these labels are particularly useful as high-privilege or more complicated solutions for socket identification aren't available. The series comes with comprehensive selftests covering path-based AF_UNIX sockets, sockfs socket operations, per-inode limit enforcement, and xattr operations across multiple address families (AF_INET, AF_INET6, AF_NETLINK, AF_PACKET)" * tag 'vfs-7.1-rc1.xattr' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: selftests/xattr: test xattrs on various socket families selftests/xattr: sockfs socket xattr tests selftests/xattr: path-based AF_UNIX socket xattr tests xattr: support extended attributes on sockets xattr,net: support limited amount of extended attributes on sockfs sockets xattr: move user limits for xattrs to generic infra xattr: switch xattr_permission() to switch statement xattr: add xattr_permission_error() xattr: remove rbtree-based simple_xattr infrastructure pidfs: adapt to rhashtable-based simple_xattrs kernfs: adapt to rhashtable-based simple_xattrs with lazy allocation shmem: adapt to rhashtable-based simple_xattrs with lazy allocation xattr: add rhashtable-based simple_xattr infrastructure xattr: add rcu_head and rhash_head to struct simple_xattr	2026-04-13 10:10:28 -07:00
Cao Ruichuang	f8e0a5a174	selftests/ftrace: Quote check_requires comparisons check_requires() compares requirement strings that can contain shell pattern characters such as '[' and ']'. Under /bin/sh, the unquoted test expressions can emit 'unexpected operator' warnings while parsing README-backed requirements. Quote the relevant comparisons and path checks so the helper handles those patterns without spurious shell warnings. Validated by rerunning fprobe_syntax_errors.tc and confirming the previous '/bin/sh: unexpected operator' lines disappear from the detailed ftracetest log. Signed-off-by: Cao Ruichuang <create0818@163.com> Reviewed-by: Steven Rostedt (Google) <rostedt@goodmis.org> Link: https://lore.kernel.org/r/20260408043212.8063-1-create0818@163.com Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>	2026-04-13 11:05:39 -06:00
Ricardo B. Marlière	7e47389142	selftests: Preserve subtarget failures in all/install Track failures explicitly in the top-level selftests all/install loops. The current code multiplies `ret` by each sub-make exit status. For example, with `TARGETS=net`, the implicit `net/lib` dependency runs after `net`, so a failed `net` build can be followed by a successful `net/lib` build and reset the final result to success. Set `ret` to 1 on any non-zero sub-make exit code and keep it sticky, so the top-level make returns failure when any selected selftest target fails. Signed-off-by: Ricardo B. Marlière <rbm@suse.com> Link: https://lore.kernel.org/r/20260320-selftests-fixes-v1-5-79144f76be01@suse.com Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>	2026-04-13 11:05:39 -06:00
Ricardo B. Marlière	d6ea9f404b	selftests/run_kselftest.sh: Allow choosing per-test log directory The --per-test-log option currently hard-codes /tmp. However, the system under test will most likely have tmpfs mounted there. Since it's not clear which filenames the log files will have, the user should be able to specify a persistent directory to store the logs. Keeping those logs are important because the run_kselftest.sh runner will only yield KTAP output, trimming information that is otherwise available through running individual tests directly. Allow --per-test-log to take an optional directory argument. Keep the existing behaviour when the option is passed without an argument, but if a directory is provided, create it if needed, reject non-directory paths and non-writable directories, canonicalize it, and have runner.sh write per-test logs there instead of /tmp. This also makes relative paths safe by resolving them before the runner changes into a collection directory. Signed-off-by: Ricardo B. Marlière <rbm@suse.com> Link: https://lore.kernel.org/r/20260320-selftests-fixes-v1-4-79144f76be01@suse.com Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>	2026-04-13 11:05:39 -06:00
Ricardo B. Marlière	a82e076f4a	selftests/run_kselftest.sh: Resolve BASE_DIR with pwd -P run_kselftest.sh only needs to canonicalize the directory containing the script itself. Use shell-native path resolution for that by changing into the directory and calling pwd -P. This avoids depending on either realpath or readlink -f while still producing a physical absolute path for BASE_DIR. Signed-off-by: Ricardo B. Marlière <rbm@suse.com> Link: https://lore.kernel.org/r/20260320-selftests-fixes-v1-3-79144f76be01@suse.com Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>	2026-04-13 11:05:39 -06:00
Paolo Bonzini	6b80203187	- ESA nesting support - 4k memslots - LPSW/E fix -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEEwGNS88vfc9+v45Yq41TmuOI4ufgFAmncpfkACgkQ41TmuOI4 ufj4ehAA0fTpaA4VdUbF/uH1o4BLu/hElPXhJYnyDa6hUK0XiFS6bpouz50wTMz/ QjbmM+uCLKxVBK2FPE0cPj3iobvlfTTgP0tNkgwHDFlLfuZ9914cxYc4HYPrRJ/y Ey+6TT4ynkf2mihiLFHKKuBPi4DjfC3rAjy8ZHOnNh5ro+00uXVCGhssBUKvXNST X45q6JaN6p3eDVjC/ov/K593BJgMoW5x/kDmoyICuhDYs+8TiY+n+61BdVARKdtu 3+vwkjQ/mrl+IwJMvfeH+nO2qnjREc6EZd9YTJOCheThhELw0tX4jeha4PldeeZY fg+8uObSmbzxcmsvWRGTuVpobEBpOqRP9sdADxF77dq1ExFXwthXFT8AQw8NzI2k leU8DQqXVUOkykmpvacV96AGlYrRWb47806TdVM+fJmLkvmt0llS/MK6fQNz+Jlb okFx1kLnqSKz7x0O6Avgz/+F6yjFAwTp7mwKmd8bHzKCkLCYq8Gl6WPxx/peFY0P dwEwq0k89Wld7gjkAXwtwjttIrQcwghacqBCJAu4cA/3NnM2DCAPf3gSiY1PoYPX 06ZUYBzLH8wQJRZLToWpYvH9xOOfMmTETx7LDsYuMztxyesS+ReR/dVkCCei/2oD KeoGD0vBA0d8/wW+ZmB6YYxUiWT0WOllb/9s26NG/7lCTY1UgLI= =YTUc -----END PGP SIGNATURE----- Merge tag 'kvm-s390-next-7.1-1' of https://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into HEAD - ESA nesting support - 4k memslots - LPSW/E fix	2026-04-13 19:01:15 +02:00
Paolo Bonzini	92cdeac6a4	KVM SVM changes for 7.1 - Fix and optimize IRQ window inhibit handling for AVIC (the tracking needs to be per-vCPU, e.g. so that KVM doesn't prematurely re-enable AVIC if multiple vCPUs have to-be-injected IRQs). - Fix an undefined behavior warning where a crafty userspace can read the "avic" module param before it's fully initialized. - Fix a (likely benign) bug in the "OS-visible workarounds" handling, where KVM could clobber state when enabling virtualization on multiple CPUs in parallel, and clean up and optimize the code. - Drop a WARN in KVM_MEMORY_ENCRYPT_REG_REGION where KVM complains about a "too large" size based purely on user input, and clean up and harden the related pinning code. - Disallow synchronizing a VMSA of an already-launched/encrypted vCPU, as doing so for an SNP guest will trigger an RMP violation #PF and crash the host. - Protect all of sev_mem_enc_register_region() with kvm->lock to ensure sev_guest() is stable for the entire of the function. - Lock all vCPUs when synchronizing VMSAs for SNP guests to ensure the VMSA page isn't actively being used. - Overhaul KVM's APIs for detecting SEV+ guests so that VM-scoped queries are required to hold kvm->lock (KVM has had multiple bugs due "is SEV?" checks becoming stale), enforced by lockdep. Add and use vCPU-scoped APIs when possible/appropriate, as all checks that originate from a vCPU are guaranteed to be stable. - Convert a pile of kvm->lock SEV code to guard(). -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEEKTobbabEP7vbhhN9OlYIJqCjN/0FAmnZK4wACgkQOlYIJqCj N/2uOQ/+LzGQD7myCn47rUhiMo/aY3qjrS+u6PSuFeEMFyaATiWpf/s50hIMHh+/ VCRAptKgL0PBV/RbOqhZdx4Zn/Yb/NNBwraqc7xQgMOlQwFedOetuFtRveJ4z6Af 8ycwMxYYtz6SbaT+R3AdK51Nb8S2ZRpd082CiaLgChVcdodkeFtS5KVBqrlBGB21 EKFbW+QXMHrpmGbgZ8YWMrL5UCSmJFG8ZztcncNfsLS6WxbUjdo/MEiLEDIsrXZd oGViwmnY7hcJ5ClcF8UMPtXHHP1+EOk6BKAsmYguG3qUxbX+EEbymb8o16k+h6iw ybUZWF7cq44Pl1FModTFAB5LQPg6z6XNhjZ8L+0kjAI05lvszf3QDtezQ+BF24tW S18x6yCIpdEJ3VxM4r5Yqf10CRbxMtHKU6EUjL7C4KNNYOz2sX+Tqgi/uHtbgzUJ zPG9faY5M3hMjfj5tOCpy/fAEF3fD1mg4GE8pfXZa8d/ppqI4hU0ASpFzw/d4LnH PJSaeJhmmEIdRj+RtIGIRSZ9flHM61/+clKngaoR+c/mPQPnDbapivl2kgKWbVJ4 47c44pYQLTWI01nuwcEILCEj8D1mABJygPjNoO79b2mitmYazMnO42mV3lI5oP0c QyzX7sSed6ImIRn8xadfE+tIz3ji9r/ak+ekZvdNiqiNEoi2YG8= =AjgE -----END PGP SIGNATURE----- Merge tag 'kvm-x86-svm-7.1' of https://github.com/kvm-x86/linux into HEAD KVM SVM changes for 7.1 - Fix and optimize IRQ window inhibit handling for AVIC (the tracking needs to be per-vCPU, e.g. so that KVM doesn't prematurely re-enable AVIC if multiple vCPUs have to-be-injected IRQs). - Fix an undefined behavior warning where a crafty userspace can read the "avic" module param before it's fully initialized. - Fix a (likely benign) bug in the "OS-visible workarounds" handling, where KVM could clobber state when enabling virtualization on multiple CPUs in parallel, and clean up and optimize the code. - Drop a WARN in KVM_MEMORY_ENCRYPT_REG_REGION where KVM complains about a "too large" size based purely on user input, and clean up and harden the related pinning code. - Disallow synchronizing a VMSA of an already-launched/encrypted vCPU, as doing so for an SNP guest will trigger an RMP violation #PF and crash the host. - Protect all of sev_mem_enc_register_region() with kvm->lock to ensure sev_guest() is stable for the entire of the function. - Lock all vCPUs when synchronizing VMSAs for SNP guests to ensure the VMSA page isn't actively being used. - Overhaul KVM's APIs for detecting SEV+ guests so that VM-scoped queries are required to hold kvm->lock (KVM has had multiple bugs due "is SEV?" checks becoming stale), enforced by lockdep. Add and use vCPU-scoped APIs when possible/appropriate, as all checks that originate from a vCPU are guaranteed to be stable. - Convert a pile of kvm->lock SEV code to guard().	2026-04-13 19:00:43 +02:00
Linus Torvalds	28483203f7	RCU changes for v7.1 NOCB CPU management: - Consolidate rcu_nocb_cpu_offload() and rcu_nocb_cpu_deoffload() to reduce code duplication. - Extract nocb_bypass_needs_flush() helper to reduce duplication in NOCB bypass path. rcutorture/torture infrastructure: - Add NOCB01 config for RCU_LAZY torture testing. - Add NOCB02 config for NOCB poll mode testing. - Add TRIVIAL-PREEMPT config for textbook-style preemptible RCU torture. - Test call_srcu() with preemption both disabled and enabled. - Remove kvm-check-branches.sh in favor of kvm-series.sh. - Make hangs more visible in torture.sh output. - Add informative message for tests without a recheck file. - Fix numeric test comparison in srcu_lockdep.sh. - Use torture_shutdown_init() in refscale and rcuscale instead of open-coded shutdown functions. - Fix modulo-zero error in torture_hrtimeout_ns(). SRCU: - Fix SRCU read flavor macro comments. - Fix s/they disables/they disable/ typo in srcu_read_unlock_fast(). RCU Tasks: - Document that RCU Tasks Trace grace periods now imply RCU grace periods. - Remove unnecessary smp_store_release() in cblist_init_generic(). RCU stall: - Add BOOTPARAM_RCU_STALL_PANIC Kconfig option to allow triggering a kernel panic on RCU stall via kernel boot parameter. -----BEGIN PGP SIGNATURE----- iQJKBAABCgA0FiEEcoCIrlGe4gjE06JJqA4nf2o45hAFAmnMBEAWHGpvZWxhZ25l bGZAbnZpZGlhLmNvbQAKCRCoDid/ajjmEBCdD/4rUNIocKBmSJWHFeDzl8rvMGSx RISjeaj8m+HUsQRSq7hecygUkSbGEQuj8KwFDJOsmuzhMRg3hR7yEz1AEphQiiIR TN0IdGblQd57KqAk6sHsXA5mNjSWZfJQuOkvHsMxIJdhhDRDHPTGtaCBwlYoLc6b de31LCvCfRL1QgiNAznTlPLRyehI9/+6VASLvKbmi3sQQv9B/KBBWydkryvMJbfW uc05SUn2qQq2h/NNeYvo6RllXWMPqO4/0ll8Y71ltwFhg8JdS9qXDSFX5cWlLISA uBA4pC9bk0xi7RGE5v4n9+fNNTr64Tjtd9QoxFvd/Jb6jdKDGjabwojoAYXCsdNw r8Dp3fKwGofruhUqgO6LyHyG54Ro2paVYjsqr5HW9C6jalZ9+HmH5wOnKw2h5E/i VRAM6MpwQ869ZeHhDxF8meRKf3+E+UIR95qfNZkABcFnwxgXheT+RASP/UOnoTM7 Zexuzp9c6GQu34MTt0Rz9vNfJta+ZmPlnccan1T3DgoHvcWip7IDhUD1Vqerd9/4 VK4JeeKI0t/fR5ydjs5qiA/O5caIFqtQTD0Ag2vLGQiP72pe2lRroIjD3xNZ2bWS DN4k6ZCxR2hdPUcWzdYlGe2pYmvAlqCBxZ4fNgrs55uP2EuqW+Y8aJd9RwvRPb2B wmTZepOh7Ct9+cjTBA== =wjsx -----END PGP SIGNATURE----- Merge tag 'rcu.2026.03.31a' of git://git.kernel.org/pub/scm/linux/kernel/git/rcu/linux Pull RCU updates from Joel Fernandes: "NOCB CPU management: - Consolidate rcu_nocb_cpu_offload() and rcu_nocb_cpu_deoffload() to reduce code duplication - Extract nocb_bypass_needs_flush() helper to reduce duplication in NOCB bypass path rcutorture/torture infrastructure: - Add NOCB01 config for RCU_LAZY torture testing - Add NOCB02 config for NOCB poll mode testing - Add TRIVIAL-PREEMPT config for textbook-style preemptible RCU torture - Test call_srcu() with preemption both disabled and enabled - Remove kvm-check-branches.sh in favor of kvm-series.sh - Make hangs more visible in torture.sh output - Add informative message for tests without a recheck file - Fix numeric test comparison in srcu_lockdep.sh - Use torture_shutdown_init() in refscale and rcuscale instead of open-coded shutdown functions - Fix modulo-zero error in torture_hrtimeout_ns(). SRCU: - Fix SRCU read flavor macro comments - Fix s/they disables/they disable/ typo in srcu_read_unlock_fast() RCU Tasks: - Document that RCU Tasks Trace grace periods now imply RCU grace periods - Remove unnecessary smp_store_release() in cblist_init_generic()" * tag 'rcu.2026.03.31a' of git://git.kernel.org/pub/scm/linux/kernel/git/rcu/linux: rcutorture: Test call_srcu() with preemption disabled and not rcu: Add BOOTPARAM_RCU_STALL_PANIC Kconfig option torture: Avoid modulo-zero error in torture_hrtimeout_ns() rcu/nocb: Extract nocb_bypass_needs_flush() to reduce duplication rcu/nocb: Consolidate rcu_nocb_cpu_offload/deoffload functions rcu-tasks: Remove unnecessary smp_store_release() in cblist_init_generic() rcutorture: Add NOCB02 config for nocb poll mode testing rcutorture: Add NOCB01 config for RCU_LAZY torture testing rcu-tasks: Document that RCU Tasks Trace grace periods now imply RCU grace periods srcu: Fix s/they disables/they disable/ typo in srcu_read_unlock_fast() srcu: Fix SRCU read flavor macro comments rcuscale: Ditch rcu_scale_shutdown in favor of torture_shutdown_init() refscale: Ditch ref_scale_shutdown in favor of torture_shutdown_init() rcutorture: Fix numeric "test" comparison in srcu_lockdep.sh torture: Print informative message for test without recheck file torture: Make hangs more visible in torture.sh output kvm-check-branches.sh: Remove in favor of kvm-series.sh rcutorture: Add a textbook-style trivial preemptible RCU	2026-04-13 09:36:45 -07:00
Kuba Piecuch	7e311bafb9	tools/sched_ext: Add explicit cast from void* in RESIZE_ARRAY() This fixes the following compilation error when using the header from C++ code: error: assigning to 'struct scx_flux__data_uei_dump ' from incompatible type 'void ' Signed-off-by: Kuba Piecuch <jpiecuch@google.com> Signed-off-by: Tejun Heo <tj@kernel.org>	2026-04-13 06:14:11 -10:00
Kuba Piecuch	4615361f0b	sched_ext: Make string params of __ENUM_set() const A small change to improve type safety/const correctness. __COMPAT_read_enum() already has const string parameters. It fixes a warning when using the header in C++ code: error: ISO C++11 does not allow conversion from string literal to 'char *' [-Werror,-Wwritable-strings] That's because string literals have type char[N] in C and const char[N] in C++. Signed-off-by: Kuba Piecuch <jpiecuch@google.com> Signed-off-by: Tejun Heo <tj@kernel.org>	2026-04-13 06:14:05 -10:00
Tejun Heo	3d3667f265	tools/sched_ext: Kick home CPU for stranded tasks in scx_qmap scx_qmap uses global BPF queue maps (BPF_MAP_TYPE_QUEUE) that any CPU's ops.dispatch() can pop from. When a CPU pops a task that can't run on it (e.g. a pinned per-CPU kthread), it inserts the task into SHARED_DSQ. consume_dispatch_q() then skips the task due to affinity mismatch, leaving it stranded until some CPU in its allowed mask calls ops.dispatch(). This doesn't cause indefinite stalls -- the periodic tick keeps firing (can_stop_idle_tick() returns false when softirq is pending) -- but can cause noticeable scheduling delays. After inserting to SHARED_DSQ, kick the task's home CPU if this CPU can't run it. There's a small race window where the home CPU can enter idle before the kick lands -- if a per-CPU kthread like ksoftirqd is the stranded task, this can trigger a "NOHZ tick-stop error" warning. The kick arrives shortly after and the home CPU drains the task. Rather than fully eliminating the warning by routing pinned tasks to local or global DSQs, the current code keeps them going through the normal BPF queue path and documents the race and the resulting warning in detail. scx_qmap is an example scheduler and having tasks go through the usual dispatch path is useful for testing. The detailed comment also serves as a reference for other schedulers that may encounter similar warnings. Reviewed-by: Andrea Righi <arighi@nvidia.com> Signed-off-by: Tejun Heo <tj@kernel.org>	2026-04-13 06:13:59 -10:00
Linus Torvalds	fdcbb1bc06	Merge branch 'nocache-cleanup' This series cleans up some of the special user copy functions naming and semantics. In particular, get rid of the (very traditional) double underscore names and behavior: the whole "optimize away the range check" model has been largely excised from the other user accessors because it's so subtle and can be unsafe, but also because it's just not a relevant optimization any more. To do that, a couple of drivers that misused the "user" copies as kernel copies in order to get non-temporal stores had to be fixed up, but that kind of code should never have been allowed anyway. The x86-only "nocache" version was also renamed to more accurately reflect what it actually does. This was all done because I looked at this code due to a report by Jann Horn, and I just couldn't stand the inconsistent naming, the horrible semantics, and the random misuse of these functions. This code should probably be cleaned up further, but it's at least slightly closer to normal semantics. I had a more intrusive series that went even further in trying to normalize the semantics, but that ended up hitting so many other inconsistencies between different architectures in this area (eg 'size_t' vs 'unsigned long' vs 'int' as size arguments, and various iovec check differences that Vasily Gorbik pointed out) that I ended up with this more limited version that fixed the worst of the issues. Reported-by: Jann Horn <jannh@google.com> Tested-by: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/all/CAHk-=wgg1QVWNWG-UCFo1hx0zqrPnB3qhPzUTrWNft+MtXQXig@mail.gmail.com/ * nocache-cleanup: x86-64/arm64/powerpc: clean up and rename __copy_from_user_flushcache x86: rename and clean up __copy_from_user_inatomic_nocache() x86-64: rename misleadingly named '__copy_user_nocache()' function	2026-04-13 08:39:51 -07:00
Marc Harvey	d3870724eb	selftests: net: Add tests for team driver decoupled tx and rx control Use ping and tcpdump to verify that independent rx and tx enablement of team driver member interfaces works as intended. Signed-off-by: Marc Harvey <marcharvey@google.com> Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com> Link: https://patch.msgid.link/20260409-teaming-driver-internal-v7-10-f47e7589685d@google.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2026-04-13 15:09:49 +02:00
Marc Harvey	10407eebe8	selftests: net: Add test for enablement of ports with teamd There are no tests that verify enablement and disablement of team driver ports with teamd. This should work even with changes to the enablement option, so it is important to test. This test sets up an active-backup network configuration across two network namespaces, and tries to send traffic while changing which link is the active one. Also increase the team test timeout to 300 seconds, because gracefully killing teamd can take 30 seconds for each instance. Signed-off-by: Marc Harvey <marcharvey@google.com> Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com> Link: https://patch.msgid.link/20260409-teaming-driver-internal-v7-5-f47e7589685d@google.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2026-04-13 15:09:49 +02:00
Marc Harvey	05e352444b	selftests: net: Add tests for failover of team-aggregated ports There are currently no kernel tests that verify the effect of setting the enabled team driver option. In a followup patch, there will be changes to this option, so it will be important to make sure it still behaves as it does now. The test verifies that tcp continues to work across two different team devices in separate network namespaces, even when member links are manually disabled. Signed-off-by: Marc Harvey <marcharvey@google.com> Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com> Link: https://patch.msgid.link/20260409-teaming-driver-internal-v7-4-f47e7589685d@google.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2026-04-13 15:09:49 +02:00
Paolo Bonzini	ea8bc95fbb	KVM nested SVM changes for 7.1 (with one common x86 fix) - To minimize the probability of corrupting guest state, defer KVM's non-architectural delivery of exception payloads (e.g. CR2 and DR6) until consumption of the payload is imminent, and force delivery of the payload in all paths where userspace saves relevant state. - Use vcpu->arch.cr2 when updating vmcb12's CR2 on nested #VMEXIT to fix a bug where L2's CR2 can get corrupted after a save/restore, e.g. if the VM is migrated while L2 is faulting in memory. - Fix a class of nSVM bugs where some fields written by the CPU are not synchronized from vmcb02 to cached vmcb12 after VMRUN, and so are not up-to-date when saved by KVM_GET_NESTED_STATE. - Fix a class of bugs where the ordering between KVM_SET_NESTED_STATE and KVM_SET_{S}REGS could cause vmcb02 to be incorrectly initialized after save+restore. - Add a variety of missing nSVM consistency checks. - Fix several bugs where KVM failed to correctly update VMCB fields on nested #VMEXIT. - Fix several bugs where KVM failed to correctly synthesize #UD or #GP for SVM-related instructions. - Add support for save+restore of virtualized LBRs (on SVM). - Refactor various helpers and macros to improve clarity and (hopefully) make the code easier to maintain. - Aggressively sanitize fields when copying from vmcb12 to guard against unintentionally allowing L1 to utilize yet-to-be-defined features. - Fix several bugs where KVM botched rAX legality checks when emulating SVM instructions. Note, KVM is still flawed in that KVM doesn't address size prefix overrides for 64-bit guests; this should probably be documented as a KVM erratum. - Fail emulation of VMRUN/VMLOAD/VMSAVE if mapping vmcb12 fails instead of somewhat arbitrarily synthesizing #GP (i.e. don't bastardize AMD's already- sketchy behavior of generating #GP if for "unsupported" addresses). - Cache all used vmcb12 fields to further harden against TOCTOU bugs. -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEEKTobbabEP7vbhhN9OlYIJqCjN/0FAmnZfbwACgkQOlYIJqCj N/0pVRAAkys8LLtIekQtEVkaX3EPaXk0lGGmnzXbihgHFsS5lMAS4tcsr7oyk4TI rvJUGmkaTKTboQdTaCq0G7lwCu5hMuXsZ10WvmKfivMFxy3kSppqfffux5zVXng2 U/8oyJSorkX1WPC7d5QAZYMqqcSwQaR+a0FxowghGWBXMRHylerSuH00CiGr6Ron QQbZaKBNtkYwYFNos2tLuT4tueyFogk8FPAmdejEQ9CMxUjeAivlKm8JVXaDvGik lyPYbJJLukjuxSYGYmeRyGLLwK7VBGkFHQp/KBYSBgzGdweabhsQa1Z0CGm24+w1 q626W0sxsq97dZ0cd7oE6Cw+AdlMBK+mjpxB9gX4uLGyYlnFkdJV7OSlHVTR9d96 cqKduT0JvlBnVb7Yd5jyaGVl1YD62p0nwcrTuWidR5IJ16b4mYwwPzvkkQKHLt64 VAhH8lBVtATtblI9gfsbwGezV74xXnuLb0L1G7xeh1VIWu7pubFdqyRwIA+qiXQa OkyxzoDlFl+QF2Uf3cBCFMojBOrSZRiGiLzIkUnjBsN4N2uOPYTsQEfr9BXVVcv7 obT9xl/wUwry2fAJhUL+IBCDE42+8C62UaWT5KJHQLttBL7Mm06e75hFN5ObbE/x nExL+NmAcsSUUbbdojjnD0KWxYKkosNiONBVrjqqXdmBjmzzOvI= =ys7N -----END PGP SIGNATURE----- Merge tag 'kvm-x86-nested-7.1' of https://github.com/kvm-x86/linux into HEAD KVM nested SVM changes for 7.1 (with one common x86 fix) - To minimize the probability of corrupting guest state, defer KVM's non-architectural delivery of exception payloads (e.g. CR2 and DR6) until consumption of the payload is imminent, and force delivery of the payload in all paths where userspace saves relevant state. - Use vcpu->arch.cr2 when updating vmcb12's CR2 on nested #VMEXIT to fix a bug where L2's CR2 can get corrupted after a save/restore, e.g. if the VM is migrated while L2 is faulting in memory. - Fix a class of nSVM bugs where some fields written by the CPU are not synchronized from vmcb02 to cached vmcb12 after VMRUN, and so are not up-to-date when saved by KVM_GET_NESTED_STATE. - Fix a class of bugs where the ordering between KVM_SET_NESTED_STATE and KVM_SET_{S}REGS could cause vmcb02 to be incorrectly initialized after save+restore. - Add a variety of missing nSVM consistency checks. - Fix several bugs where KVM failed to correctly update VMCB fields on nested #VMEXIT. - Fix several bugs where KVM failed to correctly synthesize #UD or #GP for SVM-related instructions. - Add support for save+restore of virtualized LBRs (on SVM). - Refactor various helpers and macros to improve clarity and (hopefully) make the code easier to maintain. - Aggressively sanitize fields when copying from vmcb12 to guard against unintentionally allowing L1 to utilize yet-to-be-defined features. - Fix several bugs where KVM botched rAX legality checks when emulating SVM instructions. Note, KVM is still flawed in that KVM doesn't address size prefix overrides for 64-bit guests; this should probably be documented as a KVM erratum. - Fail emulation of VMRUN/VMLOAD/VMSAVE if mapping vmcb12 fails instead of somewhat arbitrarily synthesizing #GP (i.e. don't bastardize AMD's already- sketchy behavior of generating #GP if for "unsupported" addresses). - Cache all used vmcb12 fields to further harden against TOCTOU bugs.	2026-04-13 13:01:50 +02:00
Paolo Bonzini	c13008ed3d	KVM selftests changes for 7.1 - Add support for Hygon CPUs in KVM selftests. - Fix a bug in the MSR test where it would get false failures on AMD/Hygon CPUs with exactly one of RDPID or RDTSCP. - Add an MADV_COLLAPSE testcase for guest_memfd as a regression test for a bug where the kernel would attempt to collapse guest_memfd folios against KVM's will. -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEEKTobbabEP7vbhhN9OlYIJqCjN/0FAmnZJ9oACgkQOlYIJqCj N/1vxBAAkpV4HCG6Q5BV/2jxnRLDwRllg8YOAQk1p+DttmOraNe0PI8zYSzZG+f1 Np8HnWwjhEiQZLlBpcTpab937mlaCK45xr1CPLfqIvAD16sIZIt0/g/DDIOH+IkQ NJGSyQySVBoo50ln8WzC3vDmxaVKZntFzhilRYY+g+Kgo+mgAnuLHM1grPwM/oVU q621DGQHDeJzeovMNy+bJoZg75AybZIV+GvBlF1/pZXUMkZp7K7z8NJeilKcptBb vCeIfNwDSdPZ5zTPfAoPhts90IGkIdgsmVhG3/j29OApbiADj5Wgcdgae96BkYPD hleeQXrTNNDfQKYhHdNkl+d+9Ab/j4/dQ5gAwtciF+LkgT7HPs+q6t9qNAYAJvRS c7FJNZPQqtVr1chbc3nI7FOBJwK+R9UY9biHR1DE1Uwfpuh+fjhCMiSFr8MktSWO GQ6ZgTvXbkwqYsjGJNCCn03egkj+2B4i18De0j8Lzmrz0FVv1Y1WIQZ+vaEwoF9g hzpGvyMarqJR2QGezthHGhjO6eRbeZkTq9Ya1t6NYfNQIvkfy86nl6b5CUJjZ4/1 Xj7mNeOsfRO/Ez+LwO60jXsUI7YGmcFKOpivhI5QtUCwEqj8NkJ4xmYerB1K8E8L 5llsPETeWxGm4FDf/hadmLy9FBzE7Gltd9/oDHPKdE6wunvsAiI= =aySp -----END PGP SIGNATURE----- Merge tag 'kvm-x86-selftests-7.1' of https://github.com/kvm-x86/linux into HEAD KVM selftests changes for 7.1 - Add support for Hygon CPUs in KVM selftests. - Fix a bug in the MSR test where it would get false failures on AMD/Hygon CPUs with exactly one of RDPID or RDTSCP. - Add an MADV_COLLAPSE testcase for guest_memfd as a regression test for a bug where the kernel would attempt to collapse guest_memfd folios against KVM's will.	2026-04-13 11:53:46 +02:00
Paolo Bonzini	e74c3a8891	KVM/arm64 updates for 7.1 * New features: - Add support for tracing in the standalone EL2 hypervisor code, which should help both debugging and performance analysis. This comes with a full infrastructure for 'remote' trace buffers that can be exposed by non-kernel entities such as firmware. - Add support for GICv5 Per Processor Interrupts (PPIs), as the starting point for supporting the new GIC architecture in KVM. - Finally add support for pKVM protected guests, with anonymous memory being used as a backing store. About time! * Improvements and bug fixes: - Rework the dreaded user_mem_abort() function to make it more maintainable, reducing the amount of state being exposed to the various helpers and rendering a substantial amount of state immutable. - Expand the Stage-2 page table dumper to support NV shadow page tables on a per-VM basis. - Tidy up the pKVM PSCI proxy code to be slightly less hard to follow. - Fix both SPE and TRBE in non-VHE configurations so that they do not generate spurious, out of context table walks that ultimately lead to very bad HW lockups. - A small set of patches fixing the Stage-2 MMU freeing in error cases. - Tighten-up accepted SMC immediate value to be only #0 for host SMCCC calls. - The usual cleanups and other selftest churn. -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEEn9UcU+C1Yxj9lZw9I9DQutE9ekMFAmnWdswACgkQI9DQutE9 ekNYvBAAxj5Zmsx8sJ2CYDTJc2w4XkEjSgDugA+J/s0TMgrzExeBlWCstdhVTncy 68nwOjQl3TotnIrt7q36kko9u7IdD0pHNrk34NtlggLjHfB61n9SNcAA6j4F6zJa GFkHpJSrSnZuUPqapkDnlyhuPkgTIAkEUk2Am9siksSfY4HvRyHZJm2FTdxsdIBn NN9wvQqw2wefTXOQ8gS+oHbPVp1cPbwrF2a3EhzXXv/6W3mUBstXgsijgo07UzCp W6vHCv2wqHbHdf67z3Q3hL+VXlVH6oHlyW99/swqISvqRkH/iSB90+oUojnMRrSm yB6Wmhh8jboCaajWMJhG+veZw+7GMXU4nOrGd1rbnY8cwRl/TQ5YibhRm7DIdvjO xeUluTLJ0NdweQUwE2k4OlgKOuGang3E2p0clmkUO4SstA48MdqR/kpST6guIlWw U5syuNaaaiuwP5QOi9qZmMCNmQ3ZfnZG3nseJFdoyGjhVhf5jyQyv4Du9vGZQFF/ Zkg7yTqC4OWiC+3GkW9YYAySM1MyetivLtd47PGzHPTdtaZziWhNvQ0y+8QjQ+R+ CJNvyS/DvsT7epSya4sLgMP1ZAlih9xkz5sQ6k8NJLBYYXi0v33qwqditErgLLyj S4Ci4WNhHHWIusvCVM7JUBkH0AElpmi506f7F6iHoFLlkYR4t9U= =/SuQ -----END PGP SIGNATURE----- Merge tag 'kvmarm-7.1' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD KVM/arm64 updates for 7.1 * New features: - Add support for tracing in the standalone EL2 hypervisor code, which should help both debugging and performance analysis. This comes with a full infrastructure for 'remote' trace buffers that can be exposed by non-kernel entities such as firmware. - Add support for GICv5 Per Processor Interrupts (PPIs), as the starting point for supporting the new GIC architecture in KVM. - Finally add support for pKVM protected guests, with anonymous memory being used as a backing store. About time! * Improvements and bug fixes: - Rework the dreaded user_mem_abort() function to make it more maintainable, reducing the amount of state being exposed to the various helpers and rendering a substantial amount of state immutable. - Expand the Stage-2 page table dumper to support NV shadow page tables on a per-VM basis. - Tidy up the pKVM PSCI proxy code to be slightly less hard to follow. - Fix both SPE and TRBE in non-VHE configurations so that they do not generate spurious, out of context table walks that ultimately lead to very bad HW lockups. - A small set of patches fixing the Stage-2 MMU freeing in error cases. - Tighten-up accepted SMC immediate value to be only #0 for host SMCCC calls. - The usual cleanups and other selftest churn.	2026-04-13 11:49:54 +02:00
Paolo Bonzini	05578316ca	LoongArch KVM changes for v7.1 1. Use CSR_CRMD_PLV in kvm_arch_vcpu_in_kernel(). 2. Let vcpu_is_preempted() a macro & some enhanments. 3. Add DMSINTC irqchip in kernel support. 4. Add KVM PMU test cases for tools/selftests. -----BEGIN PGP SIGNATURE----- iQJKBAABCAA0FiEEzOlt8mkP+tbeiYy5AoYrw/LiJnoFAmnXiPMWHGNoZW5odWFj YWlAa2VybmVsLm9yZwAKCRAChivD8uImehVBD/4v8Y6S4Sxkc/EBUDKbPwLGGhGR aZEe2dzHr10C/mx7Q2cYwjKhR9bpgPcBe0xiEomxAopLwK15qMai2mnNRX6SJA8P 00J/3xWpKR6XsgsMv2KF9XvqdT1SlnzOC04D2v/wbkjlebWaCRIZgWG7yoRTHRIj TOpzf7XFBOnNpuzg94DjXsgAlSOo0qbHAMGMgbQ3k7OKzomAIlD4ljCyPD+JdvCz T7jW7n4Nho1SoOYPeWwXyxbIeorgtRB3JQ8RakMCjkJYyChICe1BGXJ66qeTLizd G5GOhiePtU5LLXQlRUU/uOLmxsJ5jZjJWs3tfsQOFz9f2i8JmF5nSw3DqmpTaQSF IF3v+3Iu9o+1dUBPsZVUjPWORWuRSFrXnnrUF3JPBZazXPwJHq8Gvbt3z6QFE8RO Z+Z9zDDcVrSWfJkYV3uHocPPnkCNTcIUdT2QFAWZYkBQCVlbbKET43dY0MbeFR9R n+mQcQVJOfp/a5oXwQyiiov6c67JX9yTT8wCB3tyPJVsLsiCOR8hN9UHSiQBc+Sx TLCuSkt0uVgwkTEM+pnJqLofRZGc9A6z8RPubwCgyxJp3+YPX5d3FtgHYcdNmAfK fQ2ILp7K0L52FcjVSr3uV8QacqUMhxLknODdjBhBcU0sh2V7yJPd9zoJq41xjtLy e8PC1D6NHGneLiuCCA== =zbdN -----END PGP SIGNATURE----- Merge tag 'loongarch-kvm-7.1' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson into HEAD LoongArch KVM changes for v7.1 1. Use CSR_CRMD_PLV in kvm_arch_vcpu_in_kernel(). 2. Let vcpu_is_preempted() a macro & some enhanments. 3. Add DMSINTC irqchip in kernel support. 4. Add KVM PMU test cases for tools/selftests.	2026-04-13 11:46:11 +02:00
Paolo Bonzini	d880d2a9c6	KVM/riscv changes for 7.1 - Fix steal time shared memory alignment checks - Fix vector context allocation leak - Fix array out-of-bounds in pmu_ctr_read() and pmu_fw_ctr_read_hi() - Fix double-free of sdata in kvm_pmu_clear_snapshot_area() - Fix integer overflow in kvm_pmu_validate_counter_mask() - Fix shift-out-of-bounds in make_xfence_request() - Fix lost write protection on huge pages during dirty logging - Split huge pages during fault handling for dirty logging - Skip CSR restore if VCPU is reloaded on the same core - Implement kvm_arch_has_default_irqchip() for KVM selftests - Factored-out ISA checks into separate sources - Added hideleg to struct kvm_vcpu_config - Factored-out VCPU config into separate sources - Support configuration of per-VM HGATP mode from KVM user space -----BEGIN PGP SIGNATURE----- iQIyBAABCgAdFiEEZdn75s5e6LHDQ+f/rUjsVaLHLAcFAmnbi74ACgkQrUjsVaLH LAejGw/3XRqexEOrxJ74GvAylGtGkQRuTw003mhIBIyshosYw6PiOSHEuAu7TQLc N074wt0QSfwbmQmeNaa0q3gIqb7Sp6gC0Eidurv1zHXuKSaAGbppnKD0VOZibtm6 CK4+HQqBXFxf2mMeJSQX7+EOWNO+rf2jfw80c3SiKTnPE8mtb8Xfn3G6Zw22UmBZ gOrDrW4IijNKNkrBItb8V1IJgsfFIdUY+1Il2n1MSRuuqQL+tJcmHWXEPs7GdfMf 9siV7asCfhKdf6xDys/Px42DBgQxLASG72Q8X2cESCxkO1kDplJZAt1AitAGfzXL bk20uAWikO0j7/Su93pWDOwMxRqb8c9dIMrnyRsCpmx1ovWA/odEnkrhslS1lafN hpUr6DWmTrE+2I6PW9UgBRAbleMomtK411fLhZh28PSyvp42sxG8841PwUNd/hFP lfcFO/ksJS7HFxQK5RaGZaMUuSMsgtZAu5P7zGGZm9uQh60eaHRF1vFcA+c5GAHe hibOegHjdYMkcLFqbVJDegU6a5+kohO3I8R21WXItp+VQjGvezOWuTDJtcR+jPAJ RNe4R1QeRgbsIwRuP9+fm6QnLeTPmBfxkCPFS1FvzrimLWskcQxqPWUq+eDvNDp+ EdnE1KPjPsTykWq2baiG70+0xnetHP4ZGIrWwOQNOREJPEGOqg== =LbnE -----END PGP SIGNATURE----- Merge tag 'kvm-riscv-7.1-1' of https://github.com/kvm-riscv/linux into HEAD KVM/riscv changes for 7.1 - Fix steal time shared memory alignment checks - Fix vector context allocation leak - Fix array out-of-bounds in pmu_ctr_read() and pmu_fw_ctr_read_hi() - Fix double-free of sdata in kvm_pmu_clear_snapshot_area() - Fix integer overflow in kvm_pmu_validate_counter_mask() - Fix shift-out-of-bounds in make_xfence_request() - Fix lost write protection on huge pages during dirty logging - Split huge pages during fault handling for dirty logging - Skip CSR restore if VCPU is reloaded on the same core - Implement kvm_arch_has_default_irqchip() for KVM selftests - Factored-out ISA checks into separate sources - Added hideleg to struct kvm_vcpu_config - Factored-out VCPU config into separate sources - Support configuration of per-VM HGATP mode from KVM user space	2026-04-13 11:42:26 +02:00
Sun Jian	f1cc94665d	selftests/bpf: cover short IPv4/IPv6 inputs with adjust_room Add a selftest covering ETH_HLEN-sized IPv4/IPv6 EtherType inputs for bpf_prog_test_run_skb(). Reuse a single zero-initialized struct ethhdr eth_hlen and set eth_hlen.h_proto from the per-test h_proto field. Also add a dedicated tc_adjust_room program and route the short IPv4/IPv6 cases to it, so the selftest actually exercises the bpf_skb_adjust_room() path from the report. Signed-off-by: Sun Jian <sun.jian.kdev@gmail.com> Link: https://lore.kernel.org/r/20260408034623.180320-3-sun.jian.kdev@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-04-12 15:42:57 -07:00
Alexei Starovoitov	47687a29b2	selftests/bpf: Use memfd_create instead of shm_open in cgroup_iter_memcg Replace shm_open/shm_unlink with memfd_create in the shmem subtest. shm_open requires /dev/shm to be mounted, which is not always available in test environments, causing the test to fail with ENOENT. memfd_create creates an anonymous shmem-backed fd without any filesystem dependency while exercising the same shmem accounting path. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20260412210636.47516-1-alexei.starovoitov@gmail.com Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>	2026-04-12 23:46:32 +02:00
Alexei Starovoitov	fa2942918a	Merge patch series "bpf: Fix OOB in pcpu_init_value and add a test" xulang <xulang@uniontech.com> says: ==================== Fix OOB read when copying element from a BPF_MAP_TYPE_CGROUP_STORAGE map to another pcpu map with the same value_size that is not rounded up to 8 bytes, and add a test case to reproduce the issue. The root cause is that pcpu_init_value() uses copy_map_value_long() which rounds up the copy size to 8 bytes, but CGROUP_STORAGE map values are not 8-byte aligned (e.g., 4-byte). This causes a 4-byte OOB read when the copy is performed. ==================== Link: https://lore.kernel.org/r/7653EEEC2BAB17DF+20260402073948.2185396-1-xulang@uniontech.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-04-12 13:36:55 -07:00
Lang Xu	171609f047	selftests/bpf: Add test for cgroup storage OOB read Add a test case to reproduce the out-of-bounds read issue when copying from a cgroup storage map to a pcpu map with a value_size not rounded up to 8 bytes. The test creates: 1. A CGROUP_STORAGE map with 4-byte value (not 8-byte aligned) 2. A LRU_PERCPU_HASH map with 4-byte value (same size) When a socket is created in the cgroup, the BPF program triggers bpf_map_update_elem() which calls copy_map_value_long(). This function rounds up the copy size to 8 bytes, but the cgroup storage buffer is only 4 bytes, causing an OOB read (before the fix). Signed-off-by: Lang Xu <xulang@uniontech.com> Link: https://lore.kernel.org/r/D63BF0DBFF1EA122+20260402074236.2187154-2-xulang@uniontech.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-04-12 13:36:32 -07:00
Paul Chaignon	2fefa9c81a	selftests/bpf: Fix reg_bounds to match new tnum-based refinement Commit `efc11a6678` ("bpf: Improve bounds when tnum has a single possible value") improved the bounds refinement to detect when the tnum and u64 range overlap in a single value (and the bounds can thus be set to that value). Eduard then noticed that it broke the slow-mode reg_bounds selftests because they don't have an equivalent logic and are therefore unable to refine the bounds as much as the verifier. The following test case illustrates this. ACTUAL TRUE1: scalar(u64=0xffffffff00000000,u32=0,s64=0xffffffff00000000,s32=0) EXPECTED TRUE1: scalar(u64=[0xfffffffe00000001; 0xffffffff00000000],u32=0,s64=[0xfffffffe00000001; 0xffffffff00000000],s32=0) [...] #323/1007 reg_bounds_gen_consts_s64_s32/(s64)[0xfffffffe00000001; 0xffffffff00000000] (s32)<op> S64_MIN:FAIL with the verifier logs: [...] 19: w0 = w6 ; R0=scalar(smin=0,smax=umax=0xffffffff, var_off=(0x0; 0xffffffff)) R6=scalar(smin=0xfffffffe00000001,smax=0xffffffff00000000, umin=0xfffffffe00000001,umax=0xffffffff00000000, var_off=(0xfffffffe00000000; 0x1ffffffff)) 20: w0 = w7 ; R0=0 R7=0x8000000000000000 21: if w6 == w7 goto pc+3 [...] from 21 to 25: [...] 25: w0 = w6 ; R0=0 R6=0xffffffff00000000 ; ^ ; unexpected refined value 26: w0 = w7 ; R0=0 R7=0x8000000000000000 27: exit When w6 == w7 is true, the verifier can deduce that the R6's tnum is equal to (0xfffffffe00000000; 0x100000000) and then use that information to refine the bounds: the tnum only overlap with the u64 range in 0xffffffff00000000. The reg_bounds selftest doesn't know about tnums and therefore fails to perform the same refinement. This issue happens when the tnum carries information that cannot be represented in the ranges, as otherwise the selftest could reach the same refined value using just the ranges. The tnum thus needs to represent non-contiguous values (ex., R6's tnum above, after the condition). The only way this can happen in the reg_bounds selftest is at the boundary between the 32 and 64bit ranges. We therefore only need to handle that case. This patch fixes the selftest refinement logic by checking if the u32 and u64 ranges overlap in a single value. If so, the ranges can be set to that value. We need to handle two cases: either they overlap in umin64... u64 values matching u32 range: xxx xxx xxx xxx \|--------------------------------------\| u64 range: 0 xxxxx UMAX64 or in umax64: u64 values matching u32 range: xxx xxx xxx xxx \|--------------------------------------\| u64 range: 0 xxxxx UMAX64 To detect the first case, we decrease umax64 to the maximum value that matches the u32 range. If that happens to be umin64, then umin64 is the only overlap. We proceed similarly for the second case, increasing umin64 to the minimum value that matches the u32 range. Note this is similar to how the verifier handles the general case using tnum, but we don't need to care about a single-value overlap in the middle of the range. That case is not possible when comparing two ranges. This patch also adds two test cases reproducing this bug as part of the normal test runs (without SLOW_TESTS=1). Fixes: `efc11a6678` ("bpf: Improve bounds when tnum has a single possible value") Reported-by: Eduard Zingerman <eddyz87@gmail.com> Closes: https://lore.kernel.org/bpf/4e6dd64a162b3cab3635706ae6abfdd0be4db5db.camel@gmail.com/ Signed-off-by: Paul Chaignon <paul.chaignon@gmail.com> Link: https://lore.kernel.org/r/ada9UuSQi2SE2IfB@mail.gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-04-12 13:15:31 -07:00
Emil Tsalapatis	4c5f21d4df	selftests/bpf: Add tests for non-arena/arena operations Add a selftest that ensures instructions with arena source and non-arena destination registers are accepted by the verifier. Signed-off-by: Emil Tsalapatis <emil@etsalapatis.com> Acked-by: Song Liu <song@kernel.org> Link: https://lore.kernel.org/r/20260412174546.18684-3-emil@etsalapatis.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-04-12 12:48:11 -07:00
Menglong Dong	f0e16ac716	bpftool: add missing fsession to the usage and docs of bpftool Add the fsession attach type to the usage of bpftool in do_help(). Meanwhile, add it to the bash-completion and bpftool-prog.rst too. Acked-by: Leon Hwang <leon.hwang@linux.dev> Acked-by: Quentin Monnet <qmo@kernel.org> Signed-off-by: Menglong Dong <dongml2@chinatelecom.cn> Link: https://lore.kernel.org/r/20260412060346.142007-4-dongml2@chinatelecom.cn Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-04-12 12:42:38 -07:00
Menglong Dong	9fd19e3ed7	bpf: add missing fsession to the verifier log The fsession attach type is missed in the verifier log in check_get_func_ip(), bpf_check_attach_target() and check_attach_btf_id(). Update them to make the verifier log proper. Meanwhile, update the corresponding selftests. Acked-by: Leon Hwang <leon.hwang@linux.dev> Signed-off-by: Menglong Dong <dongml2@chinatelecom.cn> Link: https://lore.kernel.org/r/20260412060346.142007-2-dongml2@chinatelecom.cn Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-04-12 12:42:38 -07:00
Jiayuan Chen	04013c3ca0	selftests/bpf: Add tests for sock_ops ctx access with same src/dst register Add selftests to verify SOCK_OPS_GET_SK() and SOCK_OPS_GET_FIELD() correctly return NULL/zero when dst_reg == src_reg and is_fullsock == 0. Three subtests are included: - get_sk: ctx->sk with same src/dst register (SOCK_OPS_GET_SK) - get_field: ctx->snd_cwnd with same src/dst register (SOCK_OPS_GET_FIELD) - get_sk_diff_reg: ctx->sk with different src/dst register (baseline) Each BPF program uses inline asm (__naked) to force specific register allocation, reads is_fullsock first, then loads the field using the same (or different) register. The test triggers TCP_NEW_SYN_RECV via a TCP handshake and checks that the result is NULL/zero when is_fullsock == 0. Reviewed-by: Sun Jian <sun.jian.kdev@gmail.com> Signed-off-by: Jiayuan Chen <jiayuan.chen@linux.dev> Acked-by: Martin KaFai Lau <martin.lau@kernel.org> Link: https://patch.msgid.link/20260407022720.162151-3-jiayuan.chen@linux.dev Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-04-12 12:28:05 -07:00
Ian Rogers	fab205e492	perf sample: Fix documentation typo s/PEF/PERF/ Signed-off-by: Ian Rogers <irogers@google.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2026-04-12 12:12:11 -07:00
Hangbin Liu	594ba44771	tools: ynl: ethtool: add --dbg-small-recv option Add a --dbg-small-recv debug option to control the recv() buffer size used by YNL, matching the same option already present in cli.py. This is useful if user need to get large netlink message. Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Link: https://patch.msgid.link/20260408-b4-ynl_ethtool-v2-3-7623a5e8f70b@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-04-12 11:23:49 -07:00
Hangbin Liu	1c43d471a5	tools: ynl: ethtool: use doit instead of dumpit for per-device GET Rename the local helper doit() to do_set() and dumpit() to do_get() to better reflect their purpose. Convert do_get() to use ynl.do() with an explicit device header instead of ynl.dump() followed by client-side filtering. This is more efficient as the kernel only processes and returns data for the requested device, rather than dumping all devices across the netns. Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Link: https://patch.msgid.link/20260408-b4-ynl_ethtool-v2-2-7623a5e8f70b@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-04-12 11:23:49 -07:00
Hangbin Liu	22ef8a263c	tools: ynl: move ethtool.py to selftest We have converted all the samples to selftests. This script is the last piece of random "PoC" code we still have lying around. Let's move it to tests. Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Link: https://patch.msgid.link/20260408-b4-ynl_ethtool-v2-1-7623a5e8f70b@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-04-12 11:23:49 -07:00
Joe Damato	5d3b12d1a2	selftests: drv-net: Add USO test Add a simple test for USO. Tests both ipv4 and ipv6 with several full segments and a partial segment. Suggested-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Joe Damato <joe@dama.to> Link: https://patch.msgid.link/20260408230607.2019402-11-joe@dama.to Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-04-12 10:54:33 -07:00
Florian Westphal	6111954266	selftests: netfilter: nft_tproxy.sh: adjust to socat changes Like `e65d8b6f30` ("selftests: drv-net: adjust to socat changes") we need to add shut-none for this test too. The extra 0-packet can trigger a second (unexpected) reply from the server. Fixes: `7e37e0eacd` ("selftests: netfilter: nft_tproxy.sh: add tcp tests") Reported-by: Jakub Kicinski <kuba@kernel.org> Closes: https://lore.kernel.org/netdev/20260408152432.24b8ad0d@kernel.org/ Suggested-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Florian Westphal <fw@strlen.de> Link: https://patch.msgid.link/20260409224506.27072-1-fw@strlen.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-04-12 09:41:16 -07:00
Jakub Kicinski	e46ff213f7	selftests: net: py: add test case filtering and listing When developing new test cases and reproducing failures in existing ones we currently have to run the entire test which can take minutes to finish. Add command line options for test selection, modeled after kselftest_harness.h: -l list tests (filtered, if filters were specified) -t name include test -T name exclude test Since we don't have as clean separation into fixture / variant / test as kselftest_harness this is not really a 1 to 1 match. We have to lean on glob patterns instead. Like in kselftest_harness filters are evaluated in order, first match wins. If only exclusions are specified everything else is included and vice versa. Glob patterns (*, ?, [) are supported in addition to exact matching. Reviewed-by: Willem de Bruijn <willemb@google.com> Tested-by: Gal Pressman <gal@nvidia.com> Reviewed-by: Breno Leitao <leitao@debian.org> Link: https://patch.msgid.link/20260410013921.1710295-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-04-12 09:09:09 -07:00
Alexei Starovoitov	ae3f8ca2ba	bpf: Move checks for reserved fields out of the main pass Check reserved fields of each insn once in a prepass instead of repeatedly rechecking them during the main verifier pass. Link: https://lore.kernel.org/r/20260411200932.41797-1-alexei.starovoitov@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-04-11 15:24:41 -07:00
Eduard Zingerman	335a6ca041	selftests/bpf: inline TEST_TAG constants in test_loader.c After str_has_pfx() refactoring each TEST_TAG_* / TEST_BTF_PATH constant is used exactly once. Since constant definitions are not shared between BPF-side bpf_misc.h and userspace side test_loader.c, there is no need in the additional redirection layer. Acked-by: Ihor Solodrai <ihor.solodrai@linux.dev> Reviewed-by: Puranjay Mohan <puranjay@kernel.org> Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/r/20260410-selftests-global-tags-ordering-v2-4-c566ec9781bf@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-04-11 07:17:06 -07:00
Eduard Zingerman	713db9fd03	selftests/bpf: impose global ordering for test decl_tags Impose global ordering for all decl tags used by test_loader.c based tests (__success, __failure, __msg, etc): - change every tag to expand as __attribute__((btf_decl_tag("comment:" XSTR(__COUNTER__) ...))) - change parse_test_spec() to collect all decl tags before processing and sort them using strverscmp(). The ordering is necessary for gcc-bpf. Neither GCC nor the C standard defines the order in which function attributes are consumed. While Clang tends to preserve definition order, GCC may process them out of sequence. This inconsistency causes BPF tests with multiple __msg entries to fail when compiled with GCC. Signed-off-by: Cupertino Miranda <cupertino.miranda@oracle.com> Reviewed-by: Puranjay Mohan <puranjay@kernel.org> Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/r/20260410-selftests-global-tags-ordering-v2-3-c566ec9781bf@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-04-11 07:17:06 -07:00
Eduard Zingerman	5160e584c3	selftests/bpf: make str_has_pfx return pointer past the prefix Change str_has_pfx() to return a pointer to the first character after the prefix, thus eliminating the repetitive (s + sizeof(PFX) - 1) patterns. Acked-by: Ihor Solodrai <ihor.solodrai@linux.dev> Acked-by: Mykyta Yatsenko <yatsenko@meta.com> Reviewed-by: Puranjay Mohan <puranjay@kernel.org> Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/r/20260410-selftests-global-tags-ordering-v2-2-c566ec9781bf@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-04-11 07:17:06 -07:00
Eduard Zingerman	cdd54fe98c	selftests/bpf: fix __jited_unpriv tag name __jited_unpriv was using "test_jited=" as its tag name, same as the priv variant __jited. Fix by using "test_jited_unpriv=". Fixes: `7d743e4c75` ("selftests/bpf: __jited test tag to check disassembly after jit") Acked-by: Ihor Solodrai <ihor.solodrai@linux.dev> Reviewed-by: Puranjay Mohan <puranjay@kernel.org> Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/r/20260410-selftests-global-tags-ordering-v2-1-c566ec9781bf@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-04-11 07:17:06 -07:00
Zongmin Zhou	cfcd7b29e5	usbip: tools: add hint when no exported devices are found When refresh_exported_devices() finds no devices, it's helpful to inform users about potential causes. This could be due to: 1. The usbip driver module is not loaded. 2. No devices have been exported yet. Add an informational message to guide users when ndevs == 0. Also update the condition in usbip_host_driver_open() and usbip_device_driver_open() to check both ret and ndevs == 0, and change err() to info(). Message visibility by scenario: - usbipd (console mode): Show on console/serial, this allows instant visibility for debugging. - usbipd -D (daemon mode): Message logged to syslog, can keep logs for later traceability in production. Also can use "journalctl -f" to trace on console. Suggested-by: Shuah Khan <skhan@linuxfoundation.org> Signed-off-by: Zongmin Zhou <zhouzongmin@kylinos.cn> Reviewed-by: Shuah Khan <skhan@linuxfoundation.org> Link: https://patch.msgid.link/20260402083204.53179-1-min_halo@163.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2026-04-11 12:02:00 +02:00
Amery Hung	78ee02a966	selftests/bpf: Remove kmalloc tracing from local storage create bench Remove the raw_tp/kmalloc BPF program and its associated reporting from the local storage create benchmark. The kmalloc count per create is not a useful metric as different code paths use different allocators (e.g. kmalloc_nolock vs kzalloc), introducing noise that makes the number hard to interpret. Keep total_creates in the summary output as it is useful for normalizing perf statistics collected alongside the benchmark. Signed-off-by: Amery Hung <ameryhung@gmail.com> Link: https://lore.kernel.org/r/20260411015419.114016-2-ameryhung@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-04-10 21:22:31 -07:00
Daniel Borkmann	497fa510ee	selftests/bpf: Add test for add_const base_id consistency Add a test to verifier_linked_scalars that exercises the base_id consistency check for BPF_ADD_CONST linked scalars during state pruning. With the fix, pruning fails and the verifier discovers the true branch's R3 is too wide for the stack access. # LDLIBS=-static PKG_CONFIG='pkg-config --static' ./vmtest.sh -- ./test_progs -t verifier_linked_scalars [...] #613/22 verifier_linked_scalars/scalars_stale_delta_from_cleared_id:OK #613/23 verifier_linked_scalars/scalars_stale_delta_from_cleared_id_alu32:OK #613/24 verifier_linked_scalars/linked scalars: add_const base_id must be consistent for pruning:OK #613 verifier_linked_scalars:OK Summary: 1/24 PASSED, 0 SKIPPED, 0 FAILED Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/r/20260410232651.559778-2-daniel@iogearbox.net Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-04-10 17:39:09 -07:00
Linus Torvalds	e774d5f1bc	RISC-V updates for v7.0-rc8 Before v7.0 is released, fix a few issues with the CFI patchset, merged earlier in v7.0-rc, that primarily affect interfaces to non-kernel code: - Improve the prctl() interface for per-task indirect branch landing pad control to expand abbreviations and to resemble the speculation control prctl() interface - Expand the "LP" and "SS" abbreviations in the ptrace uapi header file to "branch landing pad" and "shadow stack", to improve readability - Fix a typo in a CFI-related macro name in the ptrace uapi header file - Ensure that the indirect branch tracking state and shadow stack state are unlocked immediately after an exec() on the new task so that libc subsequently can control it - While working in this area, clean up the kernel-internal, cross-architecture prctl() function names by expanding the abbreviations mentioned above -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEElRDoIDdEz9/svf2Kx4+xDQu9KksFAmnYP5YACgkQx4+xDQu9 KkuoPQ//Yye5D+35EqfA12yP96Vrtg0QCKiqMotz3yLo0T7zh5KosAs/QIE5eQi7 vWRnCld5PsFa0ZS2822oPfQo8pKVO1y7M2ecFWSwaOWq865Xs82M/puqEQF3GFCS 219cg1dTVBGvvKSf4MINUBRprfZmZRT9pzhSk79qHEbHKzwCDk7uah51iUdyPJyd KX3hshYMLq3rooTHR2wD/ChTpV+pCrt2rSUVbW8+sTUWDfv2sTLauHmemKw7LpdW C0SulXvcYkGyiqsB5AXW9x2ttJ5hX9diPb73XS6eBCU0CaMl9BVZWNKeqhEMJxKR wmqIadD8pelf7Jh7wGAbNW4hWqTsO3xRpZH38Y/cGLdhs3cqvKjEmT3fOFWUP9bP hWv5027gVXVSOmvxhPiUJs7D5WWAz4Q64JZfdJSmDdEWVXcI0v/hzdukuPw4iiT6 DaqOyClTcwc+j1jawFTICXTF7wXfvZT5sjulrmPk1HX4nZ5padKpfQ77AdKHF9Q6 9pC25QHQk42h/R4ynA4lm15YnCOfYvjP25hU7K64gQnqO6qBrolfrA4kJOmdYv/g 1IXsA2YZafJbcXwyFZjWy50uu5gaCM5JhRRFdUrjmB6j3gv9HfBlWJXQywReUjPo Kq4tnFppxzFVm23COj9j5kyjsFjUhZ8KCft3+n7lrndeOCk5Z3E= =5/Ct -----END PGP SIGNATURE----- Merge tag 'riscv-for-linus-v7.0-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux Pull RISC-V updates from Paul Walmsley: "Before v7.0 is released, fix a few issues with the CFI patchset, merged earlier in v7.0-rc, that primarily affect interfaces to non-kernel code: - Improve the prctl() interface for per-task indirect branch landing pad control to expand abbreviations and to resemble the speculation control prctl() interface - Expand the "LP" and "SS" abbreviations in the ptrace uapi header file to "branch landing pad" and "shadow stack", to improve readability - Fix a typo in a CFI-related macro name in the ptrace uapi header file - Ensure that the indirect branch tracking state and shadow stack state are unlocked immediately after an exec() on the new task so that libc subsequently can control it - While working in this area, clean up the kernel-internal, cross-architecture prctl() function names by expanding the abbreviations mentioned above" * tag 'riscv-for-linus-v7.0-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: prctl: cfi: change the branch landing pad prctl()s to be more descriptive riscv: ptrace: cfi: expand "SS" references to "shadow stack" in uapi headers prctl: rename branch landing pad implementation functions to be more explicit riscv: ptrace: expand "LP" references to "branch landing pads" in uapi headers riscv: cfi: clear CFI lock status in start_thread() riscv: ptrace: cfi: fix "PRACE" typo in uapi header	2026-04-10 17:27:08 -07:00
Andy Roulin	20ae6d76e3	selftests: net: add bridge STP mode selection test Add a selftest for the IFLA_BR_STP_MODE bridge attribute that verifies: 1. stp_mode defaults to auto on new bridges 2. stp_mode can be toggled between user, kernel, and auto 3. Changing stp_mode while STP is active is rejected with -EBUSY 4. Re-setting the same stp_mode while STP is active succeeds 5. stp_mode user in a network namespace yields userspace STP (stp_state=2) 6. stp_mode kernel forces kernel STP (stp_state=1) 7. stp_mode auto in a netns preserves traditional fallback to kernel STP 8. stp_mode and stp_state can be set atomically in a single message 9. stp_mode persists across STP disable/enable cycles Test 5 is the key use case: it demonstrates that userspace STP can now be enabled in non-init network namespaces by setting stp_mode to user before enabling STP. Test 8 verifies the atomic usage pattern where both attributes are set in a single netlink message, which is supported because br_changelink() processes IFLA_BR_STP_MODE before IFLA_BR_STP_STATE. The test gracefully skips if the installed iproute2 does not support the stp_mode attribute. Assisted-by: Claude:claude-opus-4-6 Reviewed-by: Ido Schimmel <idosch@nvidia.com> Acked-by: Nikolay Aleksandrov <nikolay@nvidia.com> Signed-off-by: Andy Roulin <aroulin@nvidia.com> Link: https://patch.msgid.link/20260405205224.3163000-4-aroulin@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-04-10 15:52:25 -07:00
Dimitri Daskalakis	a66374a3eb	selftests: drv-net: ntuple: Add dst-ip, src-port, dst-port fields Extend the ntuple flow steering test to cover dst-ip, src-port, and dst-port fields. The test supports arbitrary combinations of the fields, for now we test src_ip/dst_ip, and src_ip/dst_ip/src_port/dst_port. The tests currently match full fields, but we can consider adding support for masked fields in the future. TAP version 13 1..24 ok 1 ntuple.queue.tcp4.src_ip ok 2 ntuple.queue.tcp4.dst_ip ok 3 ntuple.queue.tcp4.src_port ok 4 ntuple.queue.tcp4.dst_port ok 5 ntuple.queue.tcp4.src_ip.dst_ip ok 6 ntuple.queue.tcp4.src_ip.dst_ip.src_port.dst_port ok 7 ntuple.queue.udp4.src_ip ok 8 ntuple.queue.udp4.dst_ip ok 9 ntuple.queue.udp4.src_port ok 10 ntuple.queue.udp4.dst_port ok 11 ntuple.queue.udp4.src_ip.dst_ip ok 12 ntuple.queue.udp4.src_ip.dst_ip.src_port.dst_port ok 13 ntuple.queue.tcp6.src_ip ok 14 ntuple.queue.tcp6.dst_ip ok 15 ntuple.queue.tcp6.src_port ok 16 ntuple.queue.tcp6.dst_port ok 17 ntuple.queue.tcp6.src_ip.dst_ip ok 18 ntuple.queue.tcp6.src_ip.dst_ip.src_port.dst_port ok 19 ntuple.queue.udp6.src_ip ok 20 ntuple.queue.udp6.dst_ip ok 21 ntuple.queue.udp6.src_port ok 22 ntuple.queue.udp6.dst_port ok 23 ntuple.queue.udp6.src_ip.dst_ip ok 24 ntuple.queue.udp6.src_ip.dst_ip.src_port.dst_port # Totals: pass:24 fail:0 xfail:0 xpass:0 skip:0 error:0 Signed-off-by: Dimitri Daskalakis <daskald@meta.com> Link: https://patch.msgid.link/20260407164954.2977820-3-dimitri.daskalakis1@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-04-10 15:32:11 -07:00
Dimitri Daskalakis	18589df934	selftests: drv-net: Add ntuple (NFC) flow steering test Add a test for ethtool NFC (ntuple) flow steering rules. The test creates an ntuple rule matching on various flow fields and verifies that traffic is steered to the correct queue. The test forces all traffic to queue 0 via the indirection table, then installs an ntuple rule to steer select traffic to a specific queue. The test then verifies the expected number of packets is received on the queue. This test has variants for TCP/UDP over IPv4/IPv6, with rules matching the source IP. Additional match fields will be added in the next commit. TAP version 13 1..4 ok 1 ntuple.queue.tcp4.src_ip ok 2 ntuple.queue.udp4.src_ip ok 3 ntuple.queue.tcp6.src_ip ok 4 ntuple.queue.udp6.src_ip # Totals: pass:4 fail:0 xfail:0 xpass:0 skip:0 error:0 Signed-off-by: Dimitri Daskalakis <daskald@meta.com> Link: https://patch.msgid.link/20260407164954.2977820-2-dimitri.daskalakis1@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-04-10 15:32:11 -07:00
Alexei Starovoitov	2cb27158ad	bpf: poison dead stack slots As a sanity check poison stack slots that stack liveness determined to be dead, so that any read from such slots will cause program rejection. If stack liveness logic is incorrect the poison can cause valid program to be rejected, but it also will prevent unsafe program to be accepted. Allow global subprogs "read" poisoned stack slots. The static stack liveness determined that subprog doesn't read certain stack slots, but sizeof(arg_type) based global subprog validation isn't accurate enough to know which slots will actually be read by the callee, so it needs to check full sizeof(arg_type) at the caller. Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/r/20260410-patch-set-v4-14-5d4eecb343db@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-04-10 15:13:38 -07:00
Alexei Starovoitov	27417e5eb9	selftests/bpf: add new tests for static stack liveness analysis Add a bunch of new tests to verify the static stack liveness analysis. Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/r/20260410-patch-set-v4-13-5d4eecb343db@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-04-10 15:13:38 -07:00
Alexei Starovoitov	957c30c067	selftests/bpf: adjust verifier_log buffers The new liveness analysis in liveness.c adds verbose output at BPF_LOG_LEVEL2, making the verifier log for good_prog exceed the 1024-byte reference buffer. When the reference is truncated in fixed mode, the rolling mode captures the actual tail of the full log, which doesn't match the truncated reference. The fix is to increase the buffer sizes in the test. Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/r/20260410-patch-set-v4-12-5d4eecb343db@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-04-10 15:13:38 -07:00
Alexei Starovoitov	b42eb55f6c	selftests/bpf: update existing tests due to liveness changes The verifier cleans all dead registers and stack slots in the current state. Adjust expected output in tests or insert dummy stack/register reads. Also update verifier_live_stack tests to adhere to new logging scheme. Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/r/20260410-patch-set-v4-11-5d4eecb343db@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-04-10 15:13:37 -07:00
Alan Maguire	0e4dc6fbdd	selftests/bpf: Add BTF sanitize test covering BTF layout Add test that fakes up a feature cache of supported BPF features to simulate an older kernel that does not support BTF layout information. Ensure that BTF is sanitized correctly to remove layout info between types and strings, and that all offsets and lengths are adjusted appropriately. Signed-off-by: Alan Maguire <alan.maguire@oracle.com> Link: https://lore.kernel.org/r/20260408165735.843763-3-alan.maguire@oracle.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-04-10 12:34:36 -07:00
Alan Maguire	7419fcadd1	libbpf: Allow use of feature cache for non-token cases Allow bpf object feat_cache assignment in BPF selftests to simulate missing features via inclusion of libbpf_internal.h and use of bpf_object_set_feat_cache() and bpf_object__sanitize_btf() to test BTF sanitization for cases where missing features are simulated. Signed-off-by: Alan Maguire <alan.maguire@oracle.com> Link: https://lore.kernel.org/r/20260408165735.843763-2-alan.maguire@oracle.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-04-10 12:34:36 -07:00
Venkat Rao Bagalkote	aacee214d5	selftests/bpf: Remove test_access_variable_array test_access_variable_array relied on accessing struct sched_domain::span to validate variable-length array handling via BTF. Recent scheduler refactoring removed or hid this field, causing the test to fail to build. Given that this test depends on internal scheduler structures that are subject to refactoring, and equivalent variable-length array coverage already exists via bpf_testmod-based tests, remove test_access_variable_array entirely. Link: https://lore.kernel.org/all/177434340048.1647592.8586759362906719839.tip-bot2@tip-bot2/ Signed-off-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com> Tested-by: Naveen Kumar Thummalapenta <naveen66@linux.ibm.com> Link: https://lore.kernel.org/r/20260410105404.91126-1-venkat88@linux.ibm.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-04-10 12:32:53 -07:00
Leo Yan	4e03d6494f	perf arm_spe: Improve SIMD flags setting Fill in ASE and SME operations for the SIMD arch field. Also set the predicate flags for SVE and SME, but differences between them: SME does not have a predicate flag, so the setting is based on events. SVE provides a predicate flag to indicate whether the predicate is disabled, which allows it to be distinguished into four cases: full predicates, empty predicates, fully predicated, and disabled predicates. After: perf report -s +simd ... 0.06% 0.06% sve-test sve-test [.] setz [p] SVE 0.06% 0.06% sve-test [kernel.kallsyms] [k] do_raw_spin_lock 0.06% 0.06% sve-test sve-test [.] getz [p] SVE 0.06% 0.06% sve-test [kernel.kallsyms] [k] timekeeping_advance 0.06% 0.06% sve-test sve-test [.] getz [d] SVE 0.06% 0.06% sve-test [kernel.kallsyms] [k] update_load_avg 0.06% 0.06% sve-test sve-test [.] getz [e] SVE 0.05% 0.05% sve-test sve-test [.] setz [e] SVE 0.05% 0.05% sve-test [kernel.kallsyms] [k] update_curr 0.05% 0.05% sve-test sve-test [.] setz [d] SVE 0.05% 0.05% sve-test [kernel.kallsyms] [k] do_raw_spin_unlock 0.05% 0.05% sve-test [kernel.kallsyms] [k] timekeeping_update_from_shadow.constprop.0 0.05% 0.05% sve-test sve-test [.] getz [f] SVE 0.05% 0.05% sve-test sve-test [.] setz [f] SVE Reviewed-by: James Clark <james.clark@linaro.org> Reviewed-by: Ian Rogers <irogers@google.com> Signed-off-by: Leo Yan <leo.yan@arm.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2026-04-10 09:52:06 -07:00
Leo Yan	54940f1526	perf report: Update document for SIMD flags Update SIMD architecture and predicate flags. Reviewed-by: James Clark <james.clark@linaro.org> Reviewed-by: Ian Rogers <irogers@google.com> Signed-off-by: Leo Yan <leo.yan@arm.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2026-04-10 09:52:06 -07:00
Leo Yan	0f648fc245	perf sort: Sort disabled and full predicated flags According to the Arm ARM (ARM DDI 0487, L.a), section D18.2.6 "Events packet", apart from the empty predicate and partial predicates, an SVE or SME operation can be predicate-disabled or full predicated. To provide complete results, introduce two predicate types for these cases. Reviewed-by: James Clark <james.clark@linaro.org> Reviewed-by: Ian Rogers <irogers@google.com> Signed-off-by: Leo Yan <leo.yan@arm.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2026-04-10 09:52:06 -07:00
Leo Yan	faaf70f938	perf sort: Support sort ASE and SME Support sort Advance SIMD extension (ASE) and SME. Reviewed-by: James Clark <james.clark@linaro.org> Reviewed-by: Ian Rogers <irogers@google.com> Signed-off-by: Leo Yan <leo.yan@arm.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2026-04-10 09:52:06 -07:00
Linus Torvalds	96463e4e02	Turbostat Fixes Fix a memory allocation issue that could corrupt output values or SEGV Fix a perf initilization issue that could exit on some HW + kernels. Minor fixes. -----BEGIN PGP SIGNATURE----- iQJIBAABCgAyFiEE67dNfPFP+XUaA73mB9BFOha3NhcFAmnY/PkUHGxlbi5icm93 bkBpbnRlbC5jb20ACgkQB9BFOha3NhcTIA//QsbmJV7qddOe/kOhS7izRyVjuaLQ QIyIrSLSa5+49oT4uRhZxuzDm6N6zVykFf+yxK5E2TSf+JfvzfwjZJQEgUPqHtkt vietoJheI4voMgmDQk39oxQmYyxmXuD3TRgvY+EN+QEIuWFA3/J9YaOulQMICrRy TX3juoZ3mxUpO8LC6+t3Yd8Kl1JEl3cKj3LdX7cMIchLM2cSTlbVFGSP0y5f0mu/ m54+JLSeSCFyqfin8rkgLYar8+AIIrHVak57rQ3qKBUOfoJlMaJEgXC4/sKdEzAW zKld2cFfzzXd2LnShN9Su7L9VCF09POKpD7P9bk9rDcp/MfsOmlKRctRvnY4T7Bq si+AflAYzYtHlq0LQbw2tHsKLFnqIJziDzz+YHl9UxcK0X17Gkp32NSQOX7Djvdu g/HDMHudnyoq0dAObieQgevYAWYmfKTNeQpJk+d+27X9HTzNXCDe70D774+DhM73 ViDU5zPrFBc1Z9ScQmgbe4z2N3MvzWGkodJRseb1v44ClH+Z6sCG4RYxYpxj0Zyu 0rR5GBb6ymuDUAQYEVuk2/F05bgmp3oXhoaddC+bO115XGEfe9sR7PBk3p9LYSM5 KirznZsjE8qI8GUXdL4EztTtIUJ5sx9CWb5FqP6k7XovrPCoMhSsbbo0vUbLCK4C eh0+iPh3sAf0NZs= =xJLy -----END PGP SIGNATURE----- Merge tag 'turbostat-fixes-for-7.0' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux Pull turbostat fixes from Len Brown: - Fix a memory allocation issue that could corrupt output values or SEGV - Fix a perf initilization issue that could exit on some HW + kernels - Minor fixes * tag 'turbostat-fixes-for-7.0' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux: tools/power turbostat: Allow execution to continue after perf_l2_init() failure tools/power turbostat: Fix delimiter bug in print functions tools/power turbostat: Fix --show/--hide for individual cpuidle counters tools/power turbostat: Fix incorrect format variable tools/power turbostat: Consistently use print_float_value() tools/power/turbostat: Fix microcode patch level output for AMD/Hygon tools/power turbostat: Eliminate unnecessary data structure allocation tools/power turbostat: Fix swidle header vs data display tools/power turbostat: Fix illegal memory access when SMT is present and disabled	2026-04-10 08:36:32 -07:00
Catalin Marinas	480a9e57cc	Merge branches 'for-next/misc', 'for-next/tlbflush', 'for-next/ttbr-macros-cleanup', 'for-next/kselftest', 'for-next/feat_lsui', 'for-next/mpam', 'for-next/hotplug-batched-tlbi', 'for-next/bbml2-fixes', 'for-next/sysreg', 'for-next/generic-entry' and 'for-next/acpi', remote-tracking branches 'arm64/for-next/perf' and 'arm64/for-next/read-once' into for-next/core * arm64/for-next/perf: : Perf updates perf/arm-cmn: Fix resource_size_t printk specifier in arm_cmn_init_dtc() perf/arm-cmn: Fix incorrect error check for devm_ioremap() perf: add NVIDIA Tegra410 C2C PMU perf: add NVIDIA Tegra410 CPU Memory Latency PMU perf/arm_cspmu: nvidia: Add Tegra410 PCIE-TGT PMU perf/arm_cspmu: nvidia: Add Tegra410 PCIE PMU perf/arm_cspmu: Add arm_cspmu_acpi_dev_get perf/arm_cspmu: nvidia: Add Tegra410 UCF PMU perf/arm_cspmu: nvidia: Rename doc to Tegra241 perf/arm-cmn: Stop claiming entire iomem region arm64: cpufeature: Use pmuv3_implemented() function arm64: cpufeature: Make PMUVer and PerfMon unsigned KVM: arm64: Read PMUVer as unsigned * arm64/for-next/read-once: : Fixes for __READ_ONCE() with CONFIG_LTO=y arm64, compiler-context-analysis: Permit alias analysis through __READ_ONCE() with CONFIG_LTO=y arm64: Optimize __READ_ONCE() with CONFIG_LTO=y * for-next/misc: : Miscellaneous cleanups/fixes arm64: rsi: use linear-map alias for realm config buffer arm64: Kconfig: fix duplicate word in CMDLINE help text arm64: mte: Skip TFSR_EL1 checks and barriers in synchronous tag check mode arm64/hwcap: Generate the KERNEL_HWCAP_ definitions for the hwcaps arm64: kexec: Remove duplicate allocation for trans_pgd arm64: mm: Use generic enum pgtable_level arm64: scs: Remove redundant save/restore of SCS SP on entry to/from EL0 arm64: remove ARCH_INLINE_* * for-next/tlbflush: : Refactor the arm64 TLB invalidation API and implementation arm64: mm: __ptep_set_access_flags must hint correct TTL arm64: mm: Provide level hint for flush_tlb_page() arm64: mm: Wrap flush_tlb_page() around __do_flush_tlb_range() arm64: mm: More flags for __flush_tlb_range() arm64: mm: Refactor __flush_tlb_range() to take flags arm64: mm: Refactor flush_tlb_page() to use __tlbi_level_asid() arm64: mm: Simplify __flush_tlb_range_limit_excess() arm64: mm: Simplify __TLBI_RANGE_NUM() macro arm64: mm: Re-implement the __flush_tlb_range_op macro in C arm64: mm: Inline __TLBI_VADDR_RANGE() into __tlbi_range() arm64: mm: Push __TLBI_VADDR() into __tlbi_level() arm64: mm: Implicitly invalidate user ASID based on TLBI operation arm64: mm: Introduce a C wrapper for by-range TLB invalidation arm64: mm: Re-implement the __tlbi_level macro as a C function * for-next/ttbr-macros-cleanup: : Cleanups of the TTBR1_* macros arm64/mm: Directly use TTBRx_EL1_CnP arm64/mm: Directly use TTBRx_EL1_ASID_MASK arm64/mm: Describe TTBR1_BADDR_4852_OFFSET * for-next/kselftest: : arm64 kselftest updates selftests/arm64: Implement cmpbr_sigill() to hwcap test * for-next/feat_lsui: : Futex support using FEAT_LSUI instructions to avoid toggling PAN arm64: armv8_deprecated: Disable swp emulation when FEAT_LSUI present arm64: Kconfig: Add support for LSUI KVM: arm64: Use CAST instruction for swapping guest descriptor arm64: futex: Support futex with FEAT_LSUI arm64: futex: Refactor futex atomic operation KVM: arm64: kselftest: set_id_regs: Add test for FEAT_LSUI KVM: arm64: Expose FEAT_LSUI to guests arm64: cpufeature: Add FEAT_LSUI * for-next/mpam: (40 commits) : Expose MPAM to user-space via resctrl: : - Add architecture context-switch and hiding of the feature from KVM. : - Add interface to allow MPAM to be exposed to user-space using resctrl. : - Add errata workaoround for some existing platforms. : - Add documentation for using MPAM and what shape of platforms can use resctrl arm64: mpam: Add initial MPAM documentation arm_mpam: Quirk CMN-650's CSU NRDY behaviour arm_mpam: Add workaround for T241-MPAM-6 arm_mpam: Add workaround for T241-MPAM-4 arm_mpam: Add workaround for T241-MPAM-1 arm_mpam: Add quirk framework arm_mpam: resctrl: Call resctrl_init() on platforms that can support resctrl arm64: mpam: Select ARCH_HAS_CPU_RESCTRL arm_mpam: resctrl: Add empty definitions for assorted resctrl functions arm_mpam: resctrl: Update the rmid reallocation limit arm_mpam: resctrl: Add resctrl_arch_rmid_read() arm_mpam: resctrl: Allow resctrl to allocate monitors arm_mpam: resctrl: Add support for csu counters arm_mpam: resctrl: Add monitor initialisation and domain boilerplate arm_mpam: resctrl: Add kunit test for control format conversions arm_mpam: resctrl: Add support for 'MB' resource arm_mpam: resctrl: Wait for cacheinfo to be ready arm_mpam: resctrl: Add rmid index helpers arm_mpam: resctrl: Convert to/from MPAMs fixed-point formats arm_mpam: resctrl: Hide CDP emulation behind CONFIG_EXPERT ... * for-next/hotplug-batched-tlbi: : arm64/mm: Enable batched TLB flush in unmap_hotplug_range() arm64/mm: Reject memory removal that splits a kernel leaf mapping arm64/mm: Enable batched TLB flush in unmap_hotplug_range() * for-next/bbml2-fixes: : Fixes for realm guest and BBML2_NOABORT arm64: mm: Remove pmd_sect() and pud_sect() arm64: mm: Handle invalid large leaf mappings correctly arm64: mm: Fix rodata=full block mapping support for realm guests * for-next/sysreg: : arm64 sysreg updates arm64/sysreg: Update ID_AA64SMFR0_EL1 description to DDI0601 2025-12 arm64/sysreg: Update ID_AA64ZFR0_EL1 description to DDI0601 2025-12 arm64/sysreg: Update ID_AA64FPFR0_EL1 description to DDI0601 2025-12 arm64/sysreg: Update ID_AA64ISAR2_EL1 description to DDI0601 2025-12 arm64/sysreg: Update ID_AA64ISAR0_EL1 description to DDI0601 2025-12 arm64/sysreg: Update SMIDR_EL1 to DDI0601 2025-06 * for-next/generic-entry: : More arm64 refactoring towards using the generic entry code arm64: Check DAIF (and PMR) at task-switch time arm64: entry: Use split preemption logic arm64: entry: Use irqentry_{enter_from,exit_to}_kernel_mode() arm64: entry: Consistently prefix arm64-specific wrappers arm64: entry: Don't preempt with SError or Debug masked entry: Split preemption from irqentry_exit_to_kernel_mode() entry: Split kernel mode logic from irqentry_{enter,exit}() entry: Move irqentry_enter() prototype later entry: Remove local_irq_{enable,disable}_exit_to_user() entry: Fix stale comment for irqentry_enter() * for-next/acpi: : arm64 ACPI updates ACPI: AGDI: fix missing newline in error message	2026-04-10 14:22:24 +01:00
David Arcari	ba893caead	tools/power turbostat: Allow execution to continue after perf_l2_init() failure Currently, if perf_l2_init() fails turbostat exits after issuing the following error (which was encountered on AlderLake): turbostat: perf_l2_init(cpu0, 0x0, 0xff24) REFS: Invalid argument This occurs because perf_l2_init() calls err(). However, the code has been written in such a manner that it is able to perform cleanup and continue. Therefore, this issue can be addressed by changing the appropriate calls to err() to warnx(). Additionally, correct the PMU type arguments passed to the warning strings in the ecore and lcore blocks so the logs accurately reflect the failing counter type. Signed-off-by: David Arcari <darcari@redhat.com> Signed-off-by: Len Brown <len.brown@intel.com>	2026-04-10 09:04:32 -04:00
Rafael J. Wysocki	83e990310d	Merge branch 'pm-cpufreq' Merge cpufreq updates for 7.1-rc1: - Update qcom-hw DT bindings to include Eliza hardware (Abel Vesa) - Update cpufreq-dt-platdev blocklist (Faruque Ansari) - Minor updates to driver and dt-bindings for Tegra (Thierry Reding, Rosen Penev) - Add MAINTAINERS entry for CPPC driver (Viresh Kumar) - Add support for new features: CPPC performance priority, Dynamic EPP, Raw EPP, and new unit tests for them to amd-pstate (Gautham Shenoy, Mario Limonciello) - Fix sysfs files being present when HW missing and broken/outdated documentation in the amd-pstate driver (Ninad Naik, Gautham Shenoy) - Pass the policy to cpufreq_driver->adjust_perf() to avoid using cpufreq_cpu_get() in the .adjust_perf() callback in amd-pstate which leads to a scheduling-while-atomic bug (K Prateek Nayak) - Clean up dead code in Kconfig for cpufreq (Julian Braha) - Remove max_freq_req update for pre-existing cpufreq policy and add a boost_freq_req QoS request to save the boost constraint instead of overwriting the last scaling_max_freq constraint (Pierre Gondois) - Embed cpufreq QoS freq_req objects in cpufreq policy so they all are allocated in one go along with the policy to simplify lifetime rules and avoid error handling issues (Viresh Kumar) - Use DMI max speed when CPPC is unavailable in the acpi-cpufreq scaling driver (Henry Tseng) - Switch policy_is_shared() in cpufreq to using cpumask_nth() instead of cpumask_weight() because the former is more efficient (Yury Norov) - Use sysfs_emit() in sysfs show functions for cpufreq governor attributes (Thorsten Blum) - Update intel_pstate to stop returning an error when "off" is written to its status sysfs attribute while the driver is already off (Fabio De Francesco) - Include current frequency in the debug message printed by __cpufreq_driver_target() (Pengjie Zhang) * pm-cpufreq: (38 commits) cpufreq/amd-pstate: Add POWER_SUPPLY select for dynamic EPP MAINTAINERS: amd-pstate: Step down as maintainer, add Prateek as reviewer cpufreq: Pass the policy to cpufreq_driver->adjust_perf() cpufreq/amd-pstate: Pass the policy to amd_pstate_update() cpufreq/amd-pstate-ut: Add a unit test for raw EPP cpufreq/amd-pstate: Add support for raw EPP writes cpufreq/amd-pstate: Add support for platform profile class cpufreq/amd-pstate: add kernel command line to override dynamic epp cpufreq/amd-pstate: Add dynamic energy performance preference Documentation: amd-pstate: fix dead links in the reference section cpufreq/amd-pstate: Cache the max frequency in cpudata Documentation/amd-pstate: Add documentation for amd_pstate_floor_{freq,count} Documentation/amd-pstate: List amd_pstate_prefcore_ranking sysfs file Documentation/amd-pstate: List amd_pstate_hw_prefcore sysfs file amd-pstate-ut: Add a testcase to validate the visibility of driver attributes amd-pstate-ut: Add module parameter to select testcases amd-pstate: Introduce a tracepoint trace_amd_pstate_cppc_req2() amd-pstate: Add sysfs support for floor_freq and floor_count amd-pstate: Add support for CPPC_REQ2 and FLOOR_PERF x86/cpufeatures: Add AMD CPPC Performance Priority feature. ...	2026-04-10 12:05:32 +02:00
fangqiurong	dcd47f27c0	selftests/sched_ext: Fix wrong DSQ ID in peek_dsq error message The error path after scx_bpf_create_dsq(real_dsq_id, ...) was reporting test_dsq_id instead of real_dsq_id in the error message, which would mislead debugging. Signed-off-by: fangqiurong <fangqiurong@kylinos.cn> Signed-off-by: Tejun Heo <tj@kernel.org>	2026-04-09 23:00:44 -10:00
Hangbin Liu	42f9b4c6ef	tools: ynl: tests: fix leading space on Makefile target The ../generated/protos.a rule had a spurious leading space before the target name. In make, target rules must start at column 0; only recipe lines are indented with a tab. The extra space caused make to misparse the rule. Remove the leading space to match the style of the adjacent ../lib/ynl.a rule. Fixes: `e0aa0c6175` ("tools: ynl: move samples to tests") Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Link: https://patch.msgid.link/20260408-ynl_makefile-v1-1-f9624acc2ad9@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-04-09 20:41:40 -07:00
Jakub Kicinski	3d2c3d2eea	selftests: net: py: explicitly forbid multiple ksft_run() calls People (do people still write code or is it all AI?) seem to not get that ksft_run() can only be called once. If we call it multiple times KTAP parsers will likely cut off after the first batch has finished. Link: https://patch.msgid.link/20260408221952.819822-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-04-09 20:38:33 -07:00
Cosmin Ratiu	26555673bc	selftests: Add MACsec VLAN propagation traffic test Add VLAN filter propagation tests through offloaded MACsec devices via actual traffic. The tests create MACsec tunnels with matching SAs on both endpoints, stack VLANs on top, and verify connectivity with ping. Covered: - Offloaded MACsec with VLAN (filters propagate to HW) - Software MACsec with VLAN (no HW filter propagation) - Offload on/off toggle and verifying traffic still works On netdevsim this makes use of the VLAN filter debugfs file to actually validate that filters are applied/removed correctly. On real hardware the traffic should validate actual VLAN filter propagation. Signed-off-by: Cosmin Ratiu <cratiu@nvidia.com> Reviewed-by: Sabrina Dubroca <sd@queasysnail.net> Link: https://patch.msgid.link/20260408115240.1636047-4-cratiu@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-04-09 19:38:42 -07:00
Cosmin Ratiu	e1ab601bb2	selftests: Migrate nsim-only MACsec tests to Python Move MACsec offload API and ethtool feature tests from tools/testing/selftests/drivers/net/netdevsim/macsec-offload.sh to tools/testing/selftests/drivers/net/macsec.py using the NetDrvEnv framework so tests can run against both netdevsim (default) and real hardware (NETIF=ethX). As some real hardware requires MACsec to use encryption, add that to the tests. Netdevsim-specific limit checks (max SecY, max RX SC) were moved into separate test cases to avoid failures on real hardware. Signed-off-by: Cosmin Ratiu <cratiu@nvidia.com> Reviewed-by: Sabrina Dubroca <sd@queasysnail.net> Link: https://patch.msgid.link/20260408115240.1636047-2-cratiu@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-04-09 19:38:41 -07:00
Jakub Kicinski	1508922588	Merge branch 'netkit-support-for-io_uring-zero-copy-and-af_xdp' Daniel Borkmann says: ==================== netkit: Support for io_uring zero-copy and AF_XDP Containers use virtual netdevs to route traffic from a physical netdev in the host namespace. They do not have access to the physical netdev in the host and thus can't use memory providers or AF_XDP that require reconfiguring/restarting queues in the physical netdev. This patchset adds the concept of queue leasing to virtual netdevs that allow containers to use memory providers and AF_XDP at native speed. Leased queues are bound to a real queue in a physical netdev and act as a proxy. Memory providers and AF_XDP operations take an ifindex and queue id, so containers would pass in an ifindex for a virtual netdev and a queue id of a leased queue, which then gets proxied to the underlying real queue. We have implemented support for this concept in netkit and tested the latter against Nvidia ConnectX-6 (mlx5) as well as Broadcom BCM957504 (bnxt_en) 100G NICs. For more details see the individual patches. ==================== Link: https://patch.msgid.link/20260402231031.447597-1-daniel@iogearbox.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-04-09 18:24:35 -07:00
David Wei	65d657d806	selftests/net: Add queue leasing tests with netkit Add extensive selftests for netkit queue leasing, using io_uring zero copy test binary inside of a netns with netkit. This checks that memory providers can be bound against virtual queues in a netkit within a netns that are leasing from a physical netdev in the default netns. Also add various test cases around corner cases for the queue creation itself as well as queue info dumping and teardown in case of netkit in device pair and single mode. Signed-off-by: David Wei <dw@davidwei.uk> Co-developed-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://patch.msgid.link/20260402231031.447597-15-daniel@iogearbox.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-04-09 18:21:47 -07:00
Daniel Borkmann	7789c6bb76	net: Add queue-create operation Add a ynl netdev family operation called queue-create that creates a new queue on a netdevice: name: queue-create attribute-set: queue flags: [admin-perm] do: request: attributes: - ifindex - type - lease reply: &queue-create-op attributes: - id This is a generic operation such that it can be extended for various use cases in future. Right now it is mandatory to specify ifindex, the queue type which is enforced to rx and a lease. The newly created queue id is returned to the caller. A queue from a virtual device can have a lease which refers to another queue from a physical device. This is useful for memory providers and AF_XDP operations which take an ifindex and queue id to allow applications to bind against virtual devices in containers. The lease couples both queues together and allows to proxy the operations from a virtual device in a container to the physical device. In future, the nested lease attribute can be lifted and made optional for other use-cases such as dynamic queue creation for physical netdevs. The lack of lease and the specification of the physical device as an ifindex will imply that we need a real queue to be allocated. Similarly, the queue type enforcement to rx can then be lifted as well to support tx. An early implementation had only driver-specific integration [0], but in order for other virtual devices to reuse, it makes sense to have this as a generic API in core net. For leasing queues, the virtual netdev must have real_num_rx_queues less than num_rx_queues at the time of calling queue-create. The queue-type must be rx as only rx queues are supported for leasing for now. We also enforce that the queue-create ifindex must point to a virtual device, and that the nested lease attribute's ifindex must point to a physical device. The nested lease attribute set contains a netns-id attribute which is optional and can specify a netns-id relative to the caller's netns. It requires cap_net_admin and if the netns-id attribute is not specified, the lease ifindex will be retrieved from the current netns. Also, it is modeled as an s32 type similarly as done elsewhere in the stack. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Co-developed-by: David Wei <dw@davidwei.uk> Signed-off-by: David Wei <dw@davidwei.uk> Acked-by: Stanislav Fomichev <sdf@fomichev.me> Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org> Link: https://bpfconf.ebpf.io/bpfconf2025/bpfconf2025_material/lsfmmbpf_2025_netkit_borkmann.pdf [0] Link: https://patch.msgid.link/20260402231031.447597-2-daniel@iogearbox.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-04-09 18:21:45 -07:00
Thomas Weißschuh	b070dc3629	selftests/nolibc: use gcc 15 Newer compilers tend to detect more problematic code. Update the testsuite to use gcc 15.2.0 by default. Signed-off-by: Thomas Weißschuh <linux@weissschuh.net> Link: https://patch.msgid.link/20260408-nolibc-gcc-15-v1-3-330d0c40f894@weissschuh.net	2026-04-09 23:25:45 +02:00
Thomas Weißschuh	3495279d05	tools/nolibc: support UBSAN on gcc The UBSAN implementation in gcc requires a slightly different function attribute to skip instrumentation. Extend __nolibc_no_sanitize_undefined to also handle gcc. Signed-off-by: Thomas Weißschuh <linux@weissschuh.net> Link: https://patch.msgid.link/20260408-nolibc-gcc-15-v1-2-330d0c40f894@weissschuh.net	2026-04-09 23:25:40 +02:00
Thomas Weißschuh	08ab958072	tools/nolibc: create __nolibc_no_sanitize_ubsan The logic to disable UBSAN will become a bit more complicated. Move it out into compiler.h, so crt.h stays readable. Signed-off-by: Thomas Weißschuh <linux@weissschuh.net> Link: https://patch.msgid.link/20260408-nolibc-gcc-15-v1-1-330d0c40f894@weissschuh.net	2026-04-09 23:25:35 +02:00
Jakub Kicinski	b6e39e4846	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Cross-merge networking fixes after downstream PR (net-7.0-rc8). Conflicts: net/ipv6/seg6_iptunnel.c `c3812651b5` ("seg6: separate dst_cache for input and output paths in seg6 lwtunnel") `78723a62b9` ("seg6: add per-route tunnel source address") https://lore.kernel.org/adZhwtOYfo-0ImSa@sirena.org.uk net/ipv4/icmp.c `fde29fd934` ("ipv4: icmp: fix null-ptr-deref in icmp_build_probe()") `d98adfbdd5` ("ipv4: drop ipv6_stub usage and use direct function calls") https://lore.kernel.org/adO3dccqnr6j-BL9@sirena.org.uk Adjacent changes: drivers/net/ethernet/stmicro/stmmac/chain_mode.c `51f4e090b9` ("net: stmmac: fix integer underflow in chain mode") `6b4286e055` ("net: stmmac: rename STMMAC_GET_ENTRY() -> STMMAC_NEXT_ENTRY()") Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-04-09 13:20:59 -07:00
Daniel Borkmann	8697bdd67b	selftests/bpf: Add test for stale pkt range after scalar arithmetic Extend the verifier_direct_packet_access BPF selftests to exercise the verifier code paths which ensure that the pkt range is cleared after add/sub alu with a known scalar. The tests reject the invalid access. # LDLIBS=-static PKG_CONFIG='pkg-config --static' ./vmtest.sh -- ./test_progs -t verifier_direct [...] #592/35 verifier_direct_packet_access/direct packet access: pkt_range cleared after sub with known scalar:OK #592/36 verifier_direct_packet_access/direct packet access: pkt_range cleared after add with known scalar:OK #592/37 verifier_direct_packet_access/direct packet access: test3:OK #592/38 verifier_direct_packet_access/direct packet access: test3 @unpriv:OK #592/39 verifier_direct_packet_access/direct packet access: test34 (non-linear, cgroup_skb/ingress, too short eth):OK #592/40 verifier_direct_packet_access/direct packet access: test35 (non-linear, cgroup_skb/ingress, too short 1):OK #592/41 verifier_direct_packet_access/direct packet access: test36 (non-linear, cgroup_skb/ingress, long enough):OK #592 verifier_direct_packet_access:OK [...] Summary: 2/47 PASSED, 0 SKIPPED, 0 FAILED Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/r/20260409155016.536608-2-daniel@iogearbox.net Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-04-09 13:11:31 -07:00
Linus Torvalds	a55f7f5f29	Including fixes from netfilter, IPsec and wireless. This is again considerably bigger than the old average. No known outstanding regressions. Current release - regressions: - net: increase IP_TUNNEL_RECURSION_LIMIT to 5 - eth: ice: fix PTP timestamping broken by SyncE code on E825C Current release - new code bugs: - eth: stmmac: dwmac-motorcomm: fix eFUSE MAC address read failure Previous releases - regressions: - core: fix cross-cache free of KFENCE-allocated skb head - sched: act_csum: validate nested VLAN headers - rxrpc: fix call removal to use RCU safe deletion - xfrm: - wait for RCU readers during policy netns exit - fix refcount leak in xfrm_migrate_policy_find - wifi: rt2x00usb: fix devres lifetime - mptcp: fix slab-use-after-free in __inet_lookup_established - ipvs: fix NULL deref in ip_vs_add_service error path - eth: airoha: fix memory leak in airoha_qdma_rx_process() - eth: lan966x: fix use-after-free and leak in lan966x_fdma_reload() Previous releases - always broken: - ipv6: ioam: fix potential NULL dereferences in __ioam6_fill_trace_data() - ipv4: nexthop: avoid duplicate NHA_HW_STATS_ENABLE on nexthop group dump - bridge: guard local VLAN-0 FDB helpers against NULL vlan group - xsk: tailroom reservation and MTU validation - rxrpc: - fix to request an ack if window is limited - fix RESPONSE authenticator parser OOB read - netfilter: nft_ct: fix use-after-free in timeout object destroy - batman-adv: hold claim backbone gateways by reference - eth: stmmac: fix PTP ref clock for Tegra234 - eth: idpf: fix PREEMPT_RT raw/bh spinlock nesting for async VC handling - eth: ipa: fix GENERIC_CMD register field masks for IPA v5.0+ Signed-off-by: Paolo Abeni <pabeni@redhat.com> -----BEGIN PGP SIGNATURE----- iQJGBAABCgAwFiEEg1AjqC77wbdLX2LbKSR5jcyPE6QFAmnXtnsSHHBhYmVuaUBy ZWRoYXQuY29tAAoJECkkeY3MjxOkZeYQAKfZCL4rCkeO7VuoZn8lMN4YrBqVphuU MFpLKnvU8muDamBSmXGwpsdryrzQdUtEl0C7E/YyKO8TKpmFkjQRKe/Ay5XSsmJi fqjQiZIC9TKgVbJJbQZ4yZqOO2EZXHMRx8awnDjIwIrSLTyJtD29XaJqvmm+rojw uAVECbXpVOWdRVyIgHf0N3y99ItvwQycv6npjXWGHDryGVH1uXz4CiWFgltFd827 MgNx5gZ7wn6ls1B4E1EsIXZeCnVOoNMUBX+CtkSl7ctZD/nvqLZ0PqGEViqGZ+w7 kEK9jWWvsmST3j0wG4IldbnQJORZrDXR5lAmvOJILxUDD4jG4zaqHPYs4ELS5sHK E1QOs6uNBNvu40neGe7zcH4DpQzv5/W5yj0ELPBZJhV/5madjEpETOh6yO7EJRBl sdd32LD0z8wFt8yJGEbXM7YC4A8tzNagWF0wKpRqbiKFlWHdJffwqcmEe6+2CiXx rg0q2DAfvTesmzdMgGuk4ZOeczfZ9JbxPYA0IYrUegYmbI6tAuCK5slaKGOwoyml hX2lXNBxaVmTk7F9Qq6I9Ona78XqO0Tg0UBzC2dIsQITvkue7ItJBpkurOwYSOGt a8SAVV0JwXSfPquKlOfLhagPZcuQuTQfIqRKVqM47KPPO/i99okRXQbfJGrpHJKM 8bzRl6654nAs =uzl/ -----END PGP SIGNATURE----- Merge tag 'net-7.0-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Paolo Abeni: "Including fixes from netfilter, IPsec and wireless. This is again considerably bigger than the old average. No known outstanding regressions. Current release - regressions: - net: increase IP_TUNNEL_RECURSION_LIMIT to 5 - eth: ice: fix PTP timestamping broken by SyncE code on E825C Current release - new code bugs: - eth: stmmac: dwmac-motorcomm: fix eFUSE MAC address read failure Previous releases - regressions: - core: fix cross-cache free of KFENCE-allocated skb head - sched: act_csum: validate nested VLAN headers - rxrpc: fix call removal to use RCU safe deletion - xfrm: - wait for RCU readers during policy netns exit - fix refcount leak in xfrm_migrate_policy_find - wifi: rt2x00usb: fix devres lifetime - mptcp: fix slab-use-after-free in __inet_lookup_established - ipvs: fix NULL deref in ip_vs_add_service error path - eth: - airoha: fix memory leak in airoha_qdma_rx_process() - lan966x: fix use-after-free and leak in lan966x_fdma_reload() Previous releases - always broken: - ipv6: ioam: fix potential NULL dereferences in __ioam6_fill_trace_data() - ipv4: nexthop: avoid duplicate NHA_HW_STATS_ENABLE on nexthop group dump - bridge: guard local VLAN-0 FDB helpers against NULL vlan group - xsk: tailroom reservation and MTU validation - rxrpc: - fix to request an ack if window is limited - fix RESPONSE authenticator parser OOB read - netfilter: nft_ct: fix use-after-free in timeout object destroy - batman-adv: hold claim backbone gateways by reference - eth: - stmmac: fix PTP ref clock for Tegra234 - idpf: fix PREEMPT_RT raw/bh spinlock nesting for async VC handling - ipa: fix GENERIC_CMD register field masks for IPA v5.0+" * tag 'net-7.0-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (104 commits) net: lan966x: fix use-after-free and leak in lan966x_fdma_reload() net: lan966x: fix page pool leak in error paths net: lan966x: fix page_pool error handling in lan966x_fdma_rx_alloc_page_pool() nfc: pn533: allocate rx skb before consuming bytes l2tp: Drop large packets with UDP encap net: ipa: fix event ring index not programmed for IPA v5.0+ net: ipa: fix GENERIC_CMD register field masks for IPA v5.0+ MAINTAINERS: Add Prashanth as additional maintainer for amd-xgbe driver devlink: Fix incorrect skb socket family dumping af_unix: read UNIX_DIAG_VFS data under unix_state_lock Revert "mptcp: add needs_id for netlink appending addr" mptcp: fix slab-use-after-free in __inet_lookup_established net: txgbe: leave space for null terminators on property_entry net: ioam6: fix OOB and missing lock rxrpc: proc: size address buffers for %pISpc output rxrpc: only handle RESPONSE during service challenge rxrpc: Fix buffer overread in rxgk_do_verify_authenticator() rxrpc: Fix leak of rxgk context in rxgk_verify_response() rxrpc: Fix integer overflow in rxgk_verify_response() rxrpc: Fix missing error checks for rxkad encryption/decryption failure ...	2026-04-09 08:39:25 -07:00
Will Deacon	f8d5e7066d	Merge branches 'fixes', 'arm/smmu/updates', 'arm/smmu/bindings', 'riscv', 'intel/vt-d', 'amd/amd-vi' and 'core' into next	2026-04-09 13:18:27 +01:00
Song Gao	e47b8e1db9	KVM: LoongArch: selftests: Add PMU overflow interrupt test Extend the PMU test suite to cover overflow interrupts. The test enables the PMI (Performance Monitor Interrupt), sets counter 0 to one less than the overflow value, and verifies that an interrupt is raised when the counter overflows. A guest interrupt handler checks the interrupt cause and disables further PMU interrupts upon success. Signed-off-by: Song Gao <gaosong@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>	2026-04-09 18:56:38 +08:00
Song Gao	11c8401927	KVM: LoongArch: selftests: Add basic PMU event counting test Introduce a basic PMU test that verifies hardware event counting for four performance counters. The test enables the events for CPU cycles, instructions retired, branch instructions, and branch misses, runs a fixed number of loops, and checks that the counter values fall within expected ranges. It also validates that the host supports PMU and that the VM feature is enabled. Signed-off-by: Song Gao <gaosong@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>	2026-04-09 18:56:37 +08:00
Song Gao	fa19ea9a7b	KVM: LoongArch: selftests: Add cpucfg read/write helpers Add helper macros and functions to read and write CPU configuration registers (cpucfg) from the guest and from the VMM. This interface is required in upcoming selftests for querying and setting CPU features, such as PMU capabilities. Signed-off-by: Song Gao <gaosong@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>	2026-04-09 18:56:37 +08:00
Takashi Iwai	3f44bccdd6	Merge branch 'for-linus' into for-next Pull 7.0-devel branch for further development of HD-audio codec quirks. Signed-off-by: Takashi Iwai <tiwai@suse.de>	2026-04-09 08:32:17 +02:00
Thomas Richter	4cf1f549bb	perf test: Make perf trace BTF general tests exclusive Running both tests cases 126 128 together causes the first test case 126 to fail: # for i in $(seq 3); do ./perf test 'perf trace BTF general tests' \ 'perf trace record and replay'; done 126: perf trace BTF general tests : FAILED! 128: perf trace record and replay : Ok 126: perf trace BTF general tests : FAILED! 128: perf trace record and replay : Ok 126: perf trace BTF general tests : FAILED! 128: perf trace record and replay : Ok # Test case 126 fails because test case 128 runs concurrently as can be observed using a ps -ef \| grep perf output list on a different window. Both do a perf trace command concurrently. Make test case 'perf trace BTF general tests' exclusive. Output after: # for i in $(seq 3); do ./perf test 'perf trace BTF general tests' \ 'perf trace record and replay'; done 127: perf trace BTF general tests : Ok 155: perf trace record and replay : Ok 127: perf trace BTF general tests : Ok 155: perf trace record and replay : Ok 127: perf trace BTF general tests : Ok 155: perf trace record and replay : Ok # Signed-off-by: Thomas Richter <tmricht@linux.ibm.com> Acked-by: Howard Chu <howardchu95@gmail.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2026-04-08 21:42:25 -07:00
Or Har-Toov	2a8e912352	selftest: netdevsim: Add resource dump and scope filter test Add resource_dump_test() which verifies dumping resources for all devices and ports, and tests that scope=dev returns only device-level resources and scope=port returns only port resources. Skip if userspace does not support the scope parameter. Signed-off-by: Or Har-Toov <ohartoov@nvidia.com> Reviewed-by: Moshe Shemesh <moshe@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/20260407194107.148063-12-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-04-08 19:55:39 -07:00
Or Har-Toov	3961353771	selftest: netdevsim: Add devlink port resource doit test Tests that querying a specific port handle returns the expected resource name and size. Signed-off-by: Or Har-Toov <ohartoov@nvidia.com> Reviewed-by: Moshe Shemesh <moshe@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/20260407194107.148063-9-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-04-08 19:55:39 -07:00
Leon Hwang	5ae4ba98d7	selftests/drivers/net: Add an xdp test to xdp.py In "bpf: Disallow freplace on XDP with mismatched xdp_has_frags values" [1], this XDP test is suggested to add to xdp.py. 1. Verify the failure of updating frag-capable prog with non-frag-capable prog, when the frag-capable prog attaches to mtu=9k driver. The test has been verified against Mellanox CX6 and Intel 82599ES NICs. With dropping other tests, here is the test log. # ethtool -i eth0 driver: mlx5_core version: 6.19.0-061900-generic # NETIF=eth0 python3 xdp.py TAP version 13 1..1 ok 1 xdp.test_xdp_native_update_mb_to_sb # Totals: pass:1 fail:0 xfail:0 xpass:0 skip:0 error:0 # ethtool -i eth0 driver: ixgbe version: 6.19.0-061900-generic # NETIF=eth0 python3 xdp.py TAP version 13 1..1 # CMD: ip link set dev eth0 xdpdrv obj /path/to/tools/testing/selftests/net/lib/xdp_dummy.bpf.o sec xdp.frags # EXIT: 2 # STDERR: RTNETLINK answers: Invalid argument ok 1 xdp.test_xdp_native_update_mb_to_sb # SKIP device does not support multi-buffer XDP # Totals: pass:0 fail:0 xfail:0 xpass:0 skip:1 error:0 Signed-off-by: Leon Hwang <leon.huangfu@shopee.com> Link: https://patch.msgid.link/20260406072655.368173-1-leon.huangfu@shopee.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-04-08 19:48:56 -07:00
Ioana Ciornei	fff75dba79	selftests: forwarding: lib: rewrite processing of command line arguments The piece of code which processes the command line arguments and populates NETIFS based on them is really unobvious. Rewrite it so that the intention is clear and the code is easy to follow. Suggested-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Link: https://patch.msgid.link/20260407102058.867279-1-ioana.ciornei@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-04-08 19:26:44 -07:00
Ian Rogers	80b549be27	perf data: Clean up use_stdio and structures use_stdio was associated with struct perf_data and not perf_data_file meaning there was implicit use of fd rather than fptr that may not be safe. For example, in perf_data_file__write. Reorganize perf_data_file to better abstract use_stdio, add kernel-doc and more consistently use perf_data__ accessors so that use_stdio is better respected. Signed-off-by: Ian Rogers <irogers@google.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2026-04-08 19:21:17 -07:00
Arnaldo Carvalho de Melo	19a9ed115f	perf tools: Replace basename() calls with perf_basename() As noticed in a sashiko review for a patch adding a missing libgen.h in a file using basename(): https://sashiko.dev/#/patchset/20260402001740.2220481-1-acme%40kernel.org So avoid these subtleties and instead reuse the gnu_basename() function we had in srcline.c, renaming it to perf_basename() and replace basename() calls with it, simplifying several cases by removing now needless strdups. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2026-04-08 19:21:05 -07:00
Arnaldo Carvalho de Melo	fbfb858552	perf tools: Use calloc() where applicable Instead of using zalloc(nr_entries * sizeof_entry) that is what calloc() does. In some places where linux/zalloc.h isn't needed, remove it, add when needed and was getting it indirectly. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2026-04-08 19:21:05 -07:00
Arnaldo Carvalho de Melo	7507abd16a	perf header: Do validation of perf.data HEADER_CPU_DOMAIN_INFO As suggested in an unrelated sashiko review: https://sashiko.dev/#/patchset/20260407195145.2372104-1-acme%40kernel.org " Could a malformed perf.data file provide out-of-bounds values for cpu and domain? These variables are read directly from the file and used as indices for cd_map and cd_map[cpu]->domains without any validation against env->nr_cpus_avail or max_sched_domains. Similar to the issue above, this is an existing lack of validation that becomes apparent when looking at the allocation boundaries. " Validate it. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2026-04-08 19:21:05 -07:00
Arnaldo Carvalho de Melo	fc32ae6df8	perf header: Use a max number of command line args Sashiko suggests we use some reasonable max number of args to avoid overflows when reading perf.data files, do it. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2026-04-08 19:21:05 -07:00
Arnaldo Carvalho de Melo	c89f35def8	perf bench: Constify tables Those tables and variables don't change, better capture this by explicitely using 'const'. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2026-04-08 19:21:05 -07:00
Ian Rogers	046fd8206d	perf tools: Make more global variables static `make check` will run sparse on the perf code base. A frequent warning is "warning: symbol '...' was not declared. Should it be static?" Go through and make global definitions without declarations static. In some cases it is deliberate due to dlsym accessing the symbol, this change doesn't clean up the missing declarations for perf test suites. Sometimes things can opportunistically be made const. Making somethings static exposed unused functions warnings, so restructuring of ifdefs was necessary for that. These changes reduce the size of the perf binary by 568 bytes. Committer notes: Refreshed the patch, the original one fell thru the cracks, updated the size reduction. Remove the trace-event-scripting.c changes, break the build, noticed with container builds and with sashiko: https://sashiko.dev/#/patchset/20260401215306.2152898-1-acme%40kernel.org Also make two variables static to address another sashiko review comment: https://sashiko.dev/#/patchset/20260402001740.2220481-1-acme%40kernel.org Signed-off-by: Ian Rogers <irogers@google.com> Acked-by: Ankur Arora <ankur.a.arora@oracle.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Albert Ou <aou@eecs.berkeley.edu> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Ghiti <alex@ghiti.fr> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Guo Ren <guoren@kernel.org> Cc: Howard Chu <howardchu95@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Leo Yan <leo.yan@arm.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Paul Walmsley <pjw@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Yujie Liu <yujie.liu@intel.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2026-04-08 19:21:04 -07:00
Arnaldo Carvalho de Melo	e5cce1b9c8	perf util: Kill die() prototype, dead for a long time In `fef2a73516` ("perf tools: Kill die()") the die() function was removed, but not the prototype in util.h, now when building with LIBPERL=1, during a 'make -C tools/perf build-test' routine test, it is failing as perl likes die() calls and then this clashes with this remnant, remove it. Fixes: `fef2a73516` ("perf tools: Kill die()") Reviewed-by: Ian Rogers <irogers@google.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2026-04-08 19:21:04 -07:00
Arnaldo Carvalho de Melo	d3e01be6da	perf symbols: Make variable receiving result strrchr() const Fixing: util/symbol.c: In function ‘symbol__config_symfs’: util/symbol.c:2499:20: error: assignment discards ‘const’ qualifier from pointer target type [-Werror=discarded-qualifiers] 2499 \| layout_str = strrchr(dir, ','); \| With recent gcc/glibc. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2026-04-08 19:21:04 -07:00
Jakub Kicinski	ea0f90d1ed	ipsec-next-2026-04-08 -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEEH7ZpcWbFyOOp6OJbrB3Eaf9PW7cFAmnWIe8ACgkQrB3Eaf9P W7dDqRAAho59mSQlQAaoj6lkPlBCR8/TZrEHWXeTZvWzzyILiE8GJGdkMoUOk47S QR2YJ7xTg/eAALJFFPCKj82k5GOt2CjOo30BS901zdBhSZbN/H+tW57QfYRegR3o BFZ0eBCDc5FHQYRl8QbCi2XtF4Sqr8erLIvNwfaOiuoPCZmoehD2kyMpPhb/w9qQ DD0OsYWjZuhBP+MwHGCsmtMBoesVKI/86HV0LpeyH7uU+928Tf+TcACJzkLMrUcE AwrvTL3Mvp2ljsm9mw6mElyiAqemQHM87yg8BrR7NoXlahAEOJx8UWchKpAgGXv5 bO8ng0Y8lNcuG+tN7rVk4/KeyjGNSW6ubRKfZbast6aoj5LfUhOIxxMTyYOEU5rH wKbIX00ilONs8S+kK/S4D0/1EdszOB/WVUTN5yEH1+FxkpvMGs3LUfhEjzfk9Lnz sT1ZF65YNwR0qa1SaIU4kYM543mlr/CrFgoPx5VOu0+jG+xCVWiC8fy+/SD688ht VTQGf8Y6gGX0yRMYJeauHHCBeMwbF7WEu7MYSi+4+7uUCYexh700QpOjaYLrTpgS NLpT9JPvuyWQ389DjJ+h5cpTqIsLrNs6+SXo+mZ6nkubGe+HRKZnLFwXj+41p3hE tUv+EcZTKDa+YVGymVcjORC5JjqvJXXklqQFeROuoamdJF0M96c= =0E3L -----END PGP SIGNATURE----- Merge tag 'ipsec-next-2026-04-08' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec-next Steffen Klassert says: ==================== pull request (net-next): ipsec-next 2026-04-08 1) Update outdated comment in xfrm_dst_check(). From kexinsun. 2) Drop support for HMAC-RIPEMD-160 from IPsec. From Eric Biggers. * tag 'ipsec-next-2026-04-08' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec-next: xfrm: Drop support for HMAC-RIPEMD-160 xfrm: update outdated comment ==================== Link: https://patch.msgid.link/20260408094258.148555-1-steffen.klassert@secunet.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-04-08 18:51:54 -07:00
Daniel Borkmann	e0fcb42bc6	selftests/bpf: Add tests for ld_{abs,ind} failure path in subprogs Extend the verifier_ld_ind BPF selftests with subprogs containing ld_{abs,ind} and craft the test in a way where the invalid register read is rejected in the fixed case. Also add a success case each, and add additional coverage related to the BTF return type enforcement. # LDLIBS=-static PKG_CONFIG='pkg-config --static' ./vmtest.sh -- ./test_progs -t verifier_ld_ind [...] #611/1 verifier_ld_ind/ld_ind: check calling conv, r1:OK #611/2 verifier_ld_ind/ld_ind: check calling conv, r1 @unpriv:OK #611/3 verifier_ld_ind/ld_ind: check calling conv, r2:OK #611/4 verifier_ld_ind/ld_ind: check calling conv, r2 @unpriv:OK #611/5 verifier_ld_ind/ld_ind: check calling conv, r3:OK #611/6 verifier_ld_ind/ld_ind: check calling conv, r3 @unpriv:OK #611/7 verifier_ld_ind/ld_ind: check calling conv, r4:OK #611/8 verifier_ld_ind/ld_ind: check calling conv, r4 @unpriv:OK #611/9 verifier_ld_ind/ld_ind: check calling conv, r5:OK #611/10 verifier_ld_ind/ld_ind: check calling conv, r5 @unpriv:OK #611/11 verifier_ld_ind/ld_ind: check calling conv, r7:OK #611/12 verifier_ld_ind/ld_ind: check calling conv, r7 @unpriv:OK #611/13 verifier_ld_ind/ld_abs: subprog early exit on ld_abs failure:OK #611/14 verifier_ld_ind/ld_ind: subprog early exit on ld_ind failure:OK #611/15 verifier_ld_ind/ld_abs: subprog with both paths safe:OK #611/16 verifier_ld_ind/ld_ind: subprog with both paths safe:OK #611/17 verifier_ld_ind/ld_abs: reject void return subprog:OK #611/18 verifier_ld_ind/ld_ind: reject void return subprog:OK #611 verifier_ld_ind:OK Summary: 1/18 PASSED, 0 SKIPPED, 0 FAILED Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/r/20260408191242.526279-4-daniel@iogearbox.net Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-04-08 18:43:28 -07:00
Cheng-Yang Chou	ff1befcb16	selftests/sched_ext: Improve runner error reporting for invalid arguments Report an error for './runner foo' (positional arg instead of -t) and for './runner -t foo' when the filter matches no tests. Previously both cases produced no error output. Pre-scan the test list before the main loop so the error is reported immediately, avoiding spurious SKIP output from '-s' when no tests match. Signed-off-by: Cheng-Yang Chou <yphbchou0911@gmail.com> Signed-off-by: Tejun Heo <tj@kernel.org>	2026-04-08 15:20:44 -10:00
Varun R Mallya	c7cab53f9d	selftests/bpf: Add test to ensure kprobe_multi is not sleepable Add a selftest to ensure that kprobe_multi programs cannot be attached using the BPF_F_SLEEPABLE flag. This test succeeds when the kernel rejects attachment of kprobe_multi when the BPF_F_SLEEPABLE flag is set. Suggested-by: Leon Hwang <leon.hwang@linux.dev> Signed-off-by: Varun R Mallya <varunrmallya@gmail.com> Link: https://lore.kernel.org/r/20260408190137.101418-3-varunrmallya@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-04-08 18:15:56 -07:00
Christian Bruel	1d3225cb5d	selftests: pci_endpoint: Skip BAR subrange test on -ENOSPC In pci-epf-test.c, set the STATUS_NO_RESOURCE status bit if pci_epc_set_bar() returns -ENOSPC. This status bit is used to indicate that there are not enough inbound window resources to allocate the subrange. In pci_endpoint_test.c, return -ENOSPC instead of -EIO when STATUS_NO_RESOURCE is set. In pci_endpoint_test.c, skip the BAR subrange test if -ENOSPC, i.e., there are not enough inbound window resources to run the test. Signed-off-by: Christian Bruel <christian.bruel@foss.st.com> [mani: commit log] Signed-off-by: Manivannan Sadhasivam <mani@kernel.org> [bhelgaas: squash related commits] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Niklas Cassel <cassel@kernel.org> Reviewed-by: Frank Li <Frank.Li@nxp.com> Reviewed-by: Koichiro Den <den@valinux.co.jp> Link: https://patch.msgid.link/20260407-skip-bar_subrange-tests-if-enospc-v4-1-6f2e65f2298c@foss.st.com Link: https://patch.msgid.link/20260407-skip-bar_subrange-tests-if-enospc-v4-2-6f2e65f2298c@foss.st.com Link: https://patch.msgid.link/20260407-skip-bar_subrange-tests-if-enospc-v4-3-6f2e65f2298c@foss.st.com	2026-04-08 14:41:39 -05:00
Ian Rogers	f552b132e4	perf maps: Fix copy_from that can break sorted by name order When an parent is copied into a child the name array is populated in address not name order. Make sure the name array isn't flagged as sorted. Fixes: `659ad3492b` ("perf maps: Switch from rbtree to lazily sorted array for addresses") Signed-off-by: Ian Rogers <irogers@google.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2026-04-08 10:28:49 -07:00
Ian Rogers	c4f3ff3289	perf maps: Fix fixup_overlap_and_insert that can break sorted by name order When an entry in the address array is replaced, the corresponding name entry is replaced. The entries names may sort differently and so it is important that the sorted by name property be cleared on the maps. Fixes: `0d11fab327` ("perf maps: Fixup maps_by_name when modifying maps_by_address") Signed-off-by: Ian Rogers <irogers@google.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2026-04-08 10:28:49 -07:00
Ian Rogers	b01741b285	perf maps: Move getting debug_file to verbose path Getting debug_file can trigger warnings if not set. Avoid getting these warnings by pushing the use under the controlling if. Signed-off-by: Ian Rogers <irogers@google.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2026-04-08 10:28:49 -07:00
Thomas Richter	83674a7829	perf addr2line: Remove global variable addr2line_timeout_ms Remove global variable addr2line_timeout_ms and add it as a member to symbol_conf structure. Signed-off-by: Thomas Richter <tmricht@linux.ibm.com> Reviewed-by: Ian Rogers <irogers@google.com> [namhyung: move the initialization to util/symbol.c] Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2026-04-08 10:28:49 -07:00
Thomas Richter	59f6de4e8f	perf config: Make symbol_conf::addr2line_disable_warn configurable Make symbol_conf::addr2line_disable_warn configurable by reading the perfconfig file. Use section core and addr2line-disable-warn = value. Update documentation. Example: # perf config -l core.addr2line-timeout=5000 core.addr2line-disable-warn=1 # Signed-off-by: Thomas Richter <tmricht@linux.ibm.com> Reviewed-by: Ian Rogers <irogers@google.com> Suggested-by: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2026-04-08 10:28:49 -07:00
Thomas Richter	bb7aeeaa21	perf config: Rename symbol_conf::disable_add2line_warn Rename member symbol_conf::disable_add2line_warn to symbol_conf::addr2line_disable_warn to make it consistent with other addr2line_xxx constants. Signed-off-by: Thomas Richter <tmricht@linux.ibm.com> Reviewed-by: Ian Rogers <irogers@google.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2026-04-08 10:28:49 -07:00
Fernando Fernandez Mancera	dde1a6084c	selftests: nft_queue.sh: add a parallel stress test Introduce a new stress test to check for race conditions in the nfnetlink_queue subsystem, where an entry is freed while another CPU is concurrently walking the global rhashtable. To trigger this, `nf_queue.c` is extended with two new flags: * -O (out-of-order): Buffers packet IDs and flushes them in reverse. * -b (bogus verdicts): Floods the kernel with non-existent packet IDs. The bogus verdict loop forces the kernel's lookup function to perform full rhashtable bucket traversals (-ENOENT). Combined with reverse-order flushing and heavy parallel UDP/ping flooding across 8 queues, this puts the nfnetlink_queue code under pressure. Joint work with Florian Westphal. Signed-off-by: Fernando Fernandez Mancera <fmancera@suse.de> Signed-off-by: Florian Westphal <fw@strlen.de>	2026-04-08 13:34:51 +02:00
Marc Zyngier	94b4ae79eb	Merge branch kvm-arm64/misc-7.1 into kvmarm-master/next * kvm-arm64/misc-7.1: KVM: arm64: selftests: Avoid testing the IMPDEF behavior KVM: arm64: Destroy stage-2 page-table in kvm_arch_destroy_vm() KVM: arm64: Don't leave mmu->pgt dangling on kvm_init_stage2_mmu() error KVM: arm64: Prevent the host from using an smc with imm16 != 0 Signed-off-by: Marc Zyngier <maz@kernel.org>	2026-04-08 12:26:11 +01:00
Marc Zyngier	d77f4792db	Merge branch kvm-arm64/vgic-fixes-7.1 into kvmarm-master/next * kvm-arm64/vgic-fixes-7.1: : . : FIrst pass at fixing a number of vgic-v5 bugs that were found : after the merge of the initial series. : . KVM: arm64: Advertise ID_AA64PFR2_EL1.GCIE KVM: arm64: vgic-v5: Fold PPI state for all exposed PPIs KVM: arm64: set_id_regs: Allow GICv3 support to be set at runtime KVM: arm64: Don't advertises GICv3 in ID_PFR1_EL1 if AArch32 isn't supported KVM: arm64: Correctly plumb ID_AA64PFR2_EL1 into pkvm idreg handling KVM: arm64: Move GICv5 timer PPI validation into timer_irqs_are_valid() KVM: arm64: Remove evaluation of timer state in kvm_cpu_has_pending_timer() KVM: arm64: Kill arch_timer_context::direct field KVM: arm64: vgic-v5: Correctly set dist->ready once initialised KVM: arm64: vgic-v5: Make the effective priority mask a strict limit KVM: arm64: vgic-v5: Cast vgic_apr to u32 to avoid undefined behaviours KVM: arm64: vgic-v5: Transfer edge pending state to ICH_PPI_PENDRx_EL2 KVM: arm64: vgic-v5: Hold config_lock while finalizing GICv5 PPIs KVM: arm64: Account for RESx bits in __compute_fgt() KVM: arm64: Fix writeable mask for ID_AA64PFR2_EL1 arm64: Fix field references for ICH_PPI_DVIR[01]_EL2 KVM: arm64: Don't skip per-vcpu NV initialisation KVM: arm64: vgic: Don't reset cpuif/redist addresses at finalize time Signed-off-by: Marc Zyngier <maz@kernel.org>	2026-04-08 12:26:00 +01:00
Marc Zyngier	f8078d51ee	Merge branch kvm-arm64/vgic-v5-ppi into kvmarm-master/next * kvm-arm64/vgic-v5-ppi: (40 commits) : . : Add initial GICv5 support for KVM guests, only adding PPI support : for the time being. Patches courtesy of Sascha Bischoff. : : From the cover letter: : : "This is v7 of the patch series to add the virtual GICv5 [1] device : (vgic_v5). Only PPIs are supported by this initial series, and the : vgic_v5 implementation is restricted to the CPU interface, : only. Further patch series are to follow in due course, and will add : support for SPIs, LPIs, the GICv5 IRS, and the GICv5 ITS." : . KVM: arm64: selftests: Add no-vgic-v5 selftest KVM: arm64: selftests: Introduce a minimal GICv5 PPI selftest KVM: arm64: gic-v5: Communicate userspace-driveable PPIs via a UAPI Documentation: KVM: Introduce documentation for VGICv5 KVM: arm64: gic-v5: Probe for GICv5 device KVM: arm64: gic-v5: Set ICH_VCTLR_EL2.En on boot KVM: arm64: gic-v5: Introduce kvm_arm_vgic_v5_ops and register them KVM: arm64: gic-v5: Hide FEAT_GCIE from NV GICv5 guests KVM: arm64: gic: Hide GICv5 for protected guests KVM: arm64: gic-v5: Mandate architected PPI for PMU emulation on GICv5 KVM: arm64: gic-v5: Enlighten arch timer for GICv5 irqchip/gic-v5: Introduce minimal irq_set_type() for PPIs KVM: arm64: gic-v5: Initialise ID and priority bits when resetting vcpu KVM: arm64: gic-v5: Create and initialise vgic_v5 KVM: arm64: gic-v5: Support GICv5 interrupts with KVM_IRQ_LINE KVM: arm64: gic-v5: Implement direct injection of PPIs KVM: arm64: Introduce set_direct_injection irq_op KVM: arm64: gic-v5: Trap and mask guest ICC_PPI_ENABLERx_EL1 writes KVM: arm64: gic-v5: Check for pending PPIs KVM: arm64: gic-v5: Clear TWI if single task running ... Signed-off-by: Marc Zyngier <maz@kernel.org>	2026-04-08 12:22:35 +01:00
Marc Zyngier	2de32a25a3	Merge branch kvm-arm64/hyp-tracing into kvmarm-master/next * kvm-arm64/hyp-tracing: (40 commits) : . : EL2 tracing support, adding both 'remote' ring-buffer : infrastructure and the tracing itself, courtesy of : Vincent Donnefort. From the cover letter: : : "The growing set of features supported by the hypervisor in protected : mode necessitates debugging and profiling tools. Tracefs is the : ideal candidate for this task: : : * It is simple to use and to script. : : * It is supported by various tools, from the trace-cmd CLI to the : Android web-based perfetto. : : * The ring-buffer, where are stored trace events consists of linked : pages, making it an ideal structure for sharing between kernel and : hypervisor. : : This series first introduces a new generic way of creating remote events and : remote buffers. Then it adds support to the pKVM hypervisor." : . tracing: selftests: Extend hotplug testing for trace remotes tracing: Non-consuming read for trace remotes with an offline CPU tracing: Adjust cmd_check_undefined to show unexpected undefined symbols tracing: Restore accidentally removed SPDX tag KVM: arm64: avoid unused-variable warning tracing: Generate undef symbols allowlist for simple_ring_buffer KVM: arm64: tracing: add ftrace dependency tracing: add more symbols to whitelist tracing: Update undefined symbols allow list for simple_ring_buffer KVM: arm64: Fix out-of-tree build for nVHE/pKVM tracing tracing: selftests: Add hypervisor trace remote tests KVM: arm64: Add selftest event support to nVHE/pKVM hyp KVM: arm64: Add hyp_enter/hyp_exit events to nVHE/pKVM hyp KVM: arm64: Add event support to the nVHE/pKVM hyp and trace remote KVM: arm64: Add trace reset to the nVHE/pKVM hyp KVM: arm64: Sync boot clock with the nVHE/pKVM hyp KVM: arm64: Add trace remote for the nVHE/pKVM hyp KVM: arm64: Add tracing capability for the nVHE/pKVM hyp KVM: arm64: Support unaligned fixmap in the pKVM hyp KVM: arm64: Initialise hyp_nr_cpus for nVHE hyp ... Signed-off-by: Marc Zyngier <maz@kernel.org>	2026-04-08 12:21:51 +01:00
Andrea Mayer	32dfd742f0	selftests: seg6: add test for dst_cache isolation in seg6 lwtunnel Add a selftest that verifies the dst_cache in seg6 lwtunnel is not shared between the input (forwarding) and output (locally generated) paths. The test creates three namespaces (ns_src, ns_router, ns_dst) connected in a line. An SRv6 encap route on ns_router encapsulates traffic destined to cafe::1 with SID fc00::100. The SID is reachable only for forwarded traffic (from ns_src) via an ip rule matching the ingress interface (iif veth-r0 lookup 100), and blackholed in the main table. The test verifies that: 1. A packet generated locally on ns_router does not reach ns_dst with an empty cache, since the SID is blackholed; 2. A forwarded packet from ns_src populates the input cache from table 100 and reaches ns_dst; 3. A packet generated locally on ns_router still does not reach ns_dst after the input cache is populated, confirming the output path does not reuse the input cache entry. Both the forwarded and local packets are pinned to the same CPU with taskset, since dst_cache is per-cpu. Cc: Shuah Khan <shuah@kernel.org> Signed-off-by: Andrea Mayer <andrea.mayer@uniroma2.it> Reviewed-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Reviewed-by: Justin Iurman <justin.iurman@gmail.com> Link: https://patch.msgid.link/20260404004405.4057-3-andrea.mayer@uniroma2.it Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-04-07 20:20:56 -07:00
Daniel Golle	efaa71faf2	selftests: net: bridge_vlan_mcast: wait for h1 before querier check The querier-interval test adds h1 (currently a slave of the VRF created by simple_if_init) to a temporary bridge br1 acting as an outside IGMP querier. The kernel VRF driver (drivers/net/vrf.c) calls cycle_netdev() on every slave add and remove, toggling the interface admin-down then up. Phylink takes the PHY down during the admin-down half of that cycle. Since h1 and swp1 are cable-connected, swp1 also loses its link may need several seconds to re-negotiate. Use setup_wait_dev $h1 0 which waits for h1 to return to UP state, so the test can rely on the link being back up at this point. Fixes: `4d8610ee8b` ("selftests: net: bridge: add vlan mcast_querier_interval tests") Signed-off-by: Daniel Golle <daniel@makrotopia.org> Reviewed-by: Alexander Sverdlin <alexander.sverdlin@siemens.com> Link: https://patch.msgid.link/c830f130860fd2efae08bfb9e5b25fd028e58ce5.1775424423.git.daniel@makrotopia.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-04-07 20:16:16 -07:00
Jakub Kicinski	e65d8b6f30	selftests: drv-net: adjust to socat changes socat v1.8.1.0 now defaults to shut-null, it sends an extra 0-length UDP packet when sender disconnects. This breaks our tests which expect the exact packet sequence. Add shut-none which was the old default where necessary. Acked-by: Stanislav Fomichev <sdf@fomichev.me> Reviewed-by: Joe Damato <joe@dama.to> Reviewed-by: Breno Leitao <leitao@debian.org> Link: https://patch.msgid.link/20260404230103.2719103-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-04-07 18:54:03 -07:00
Amery Hung	4cfb09a383	selftests/bpf: Test overwriting referenced dynptr Test overwriting referenced dynptr and clones to make sure it is only allow when there is at least one other dynptr with the same ref_obj_id. Also make sure slice is still invalidated after the dynptr's stack slot is destroyed. Signed-off-by: Amery Hung <ameryhung@gmail.com> Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Link: https://lore.kernel.org/r/20260406150548.1354271-3-ameryhung@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-04-07 18:20:49 -07:00
Daniel Borkmann	cac16ce1e3	selftests/bpf: Add tests for stale delta leaking through id reassignment Extend the verifier_linked_scalars BPF selftest with a stale delta test such that the div-by-zero path is rejected in the fixed case. # LDLIBS=-static PKG_CONFIG='pkg-config --static' ./vmtest.sh -- ./test_progs -t verifier_linked_scalars [...] ./test_progs -t verifier_linked_scalars #612/1 verifier_linked_scalars/scalars: find linked scalars:OK #612/2 verifier_linked_scalars/sync_linked_regs_preserves_id:OK #612/3 verifier_linked_scalars/scalars_neg:OK #612/4 verifier_linked_scalars/scalars_neg_sub:OK #612/5 verifier_linked_scalars/scalars_neg_alu32_add:OK #612/6 verifier_linked_scalars/scalars_neg_alu32_sub:OK #612/7 verifier_linked_scalars/scalars_pos:OK #612/8 verifier_linked_scalars/scalars_sub_neg_imm:OK #612/9 verifier_linked_scalars/scalars_double_add:OK #612/10 verifier_linked_scalars/scalars_sync_delta_overflow:OK #612/11 verifier_linked_scalars/scalars_sync_delta_overflow_large_range:OK #612/12 verifier_linked_scalars/scalars_alu32_big_offset:OK #612/13 verifier_linked_scalars/scalars_alu32_basic:OK #612/14 verifier_linked_scalars/scalars_alu32_wrap:OK #612/15 verifier_linked_scalars/scalars_alu32_zext_linked_reg:OK #612/16 verifier_linked_scalars/scalars_alu32_alu64_cross_type:OK #612/17 verifier_linked_scalars/scalars_alu32_alu64_regsafe_pruning:OK #612/18 verifier_linked_scalars/alu32_negative_offset:OK #612/19 verifier_linked_scalars/spurious_precision_marks:OK #612/20 verifier_linked_scalars/scalars_self_add_clears_id:OK #612/21 verifier_linked_scalars/scalars_self_add_alu32_clears_id:OK #612/22 verifier_linked_scalars/scalars_stale_delta_from_cleared_id:OK #612/23 verifier_linked_scalars/scalars_stale_delta_from_cleared_id_alu32:OK #612 verifier_linked_scalars:OK Summary: 1/23 PASSED, 0 SKIPPED, 0 FAILED Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/r/20260407192421.508817-4-daniel@iogearbox.net Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-04-07 18:15:43 -07:00
Daniel Borkmann	ed2eecdc0c	selftests/bpf: Add tests for delta tracking when src_reg == dst_reg Extend the verifier_linked_scalars BPF selftest with a rX += rX test such that the div-by-zero path is rejected in the fixed case. # LDLIBS=-static PKG_CONFIG='pkg-config --static' ./vmtest.sh -- ./test_progs -t verifier_linked_scalars [...] ./test_progs -t verifier_linked_scalars #612/1 verifier_linked_scalars/scalars: find linked scalars:OK #612/2 verifier_linked_scalars/sync_linked_regs_preserves_id:OK #612/3 verifier_linked_scalars/scalars_neg:OK #612/4 verifier_linked_scalars/scalars_neg_sub:OK #612/5 verifier_linked_scalars/scalars_neg_alu32_add:OK #612/6 verifier_linked_scalars/scalars_neg_alu32_sub:OK #612/7 verifier_linked_scalars/scalars_pos:OK #612/8 verifier_linked_scalars/scalars_sub_neg_imm:OK #612/9 verifier_linked_scalars/scalars_double_add:OK #612/10 verifier_linked_scalars/scalars_sync_delta_overflow:OK #612/11 verifier_linked_scalars/scalars_sync_delta_overflow_large_range:OK #612/12 verifier_linked_scalars/scalars_alu32_big_offset:OK #612/13 verifier_linked_scalars/scalars_alu32_basic:OK #612/14 verifier_linked_scalars/scalars_alu32_wrap:OK #612/15 verifier_linked_scalars/scalars_alu32_zext_linked_reg:OK #612/16 verifier_linked_scalars/scalars_alu32_alu64_cross_type:OK #612/17 verifier_linked_scalars/scalars_alu32_alu64_regsafe_pruning:OK #612/18 verifier_linked_scalars/alu32_negative_offset:OK #612/19 verifier_linked_scalars/spurious_precision_marks:OK #612/20 verifier_linked_scalars/scalars_self_add_clears_id:OK #612/21 verifier_linked_scalars/scalars_self_add_alu32_clears_id:OK #612 verifier_linked_scalars:OK Summary: 1/21 PASSED, 0 SKIPPED, 0 FAILED Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/r/20260407192421.508817-3-daniel@iogearbox.net Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-04-07 18:15:43 -07:00
Andrey Grodzovsky	cea4323f1c	selftests/bpf: Add tests for kprobe attachment with duplicate symbols bpf_fentry_shadow_test exists in both vmlinux (net/bpf/test_run.c) and bpf_testmod (bpf_testmod.c), creating a duplicate symbol condition when bpf_testmod is loaded. Add subtests that verify kprobe behavior with this duplicate symbol: In attach_probe: - dup-sym-{default,legacy,perf,link}: unqualified attach succeeds across all four modes, preferring vmlinux over module shadow. - MOD:SYM qualification attaches to the module version. In kprobe_multi_test: - dup_sym: kprobe_multi attach with kprobe and kretprobe succeeds. bpf_fentry_shadow_test is not invoked via test_run, so tests verify attach and detach succeed without triggering the probe. Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@crowdstrike.com> Link: https://lore.kernel.org/r/20260407203912.1787502-3-andrey.grodzovsky@crowdstrike.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-04-07 16:28:12 -07:00
Qi Tang	a4985a1755	selftests/bpf: add test for nullable PTR_TO_BUF access Add iter_buf_null_fail with two tests and a test runner: - iter_buf_null_deref: verifier must reject direct dereference of ctx->key (PTR_TO_BUF \| PTR_MAYBE_NULL) without a null check - iter_buf_null_check_ok: verifier must accept dereference after an explicit null check Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Reviewed-by: Amery Hung <ameryhung@gmail.com> Signed-off-by: Qi Tang <tpluszz77@gmail.com> Link: https://lore.kernel.org/r/20260407145421.4315-1-tpluszz77@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-04-07 15:53:45 -07:00
Kumar Kartikeya Dwivedi	a8aa306741	selftests/bpf: Allow prog name matching for tests with __description For tests that carry a __description tag, allow matching on both the description string and program name for convenience. Before this commit, the description string must be spelt out to filter the tests. Suggested-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Link: https://lore.kernel.org/r/20260407145606.3991770-1-memxor@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-04-07 12:24:29 -07:00
Günther Noack	dc75f89046	selftests/landlock: Simplify ruleset creation and enforcement in fs_test * Add enforce_fs() for defining and enforcing a ruleset in one step * In some places, dropped "ASSERT_LE(0, fd)" checks after create_ruleset() call -- create_ruleset() already checks that. * In some places, rename "file_fd" to "fd" if it is not needed to disambiguate any more. Signed-off-by: Günther Noack <gnoack3000@gmail.com> Link: https://lore.kernel.org/r/20260327164838.38231-12-gnoack3000@gmail.com [mic: Tweak subjet] Signed-off-by: Mickaël Salaün <mic@digikod.net>	2026-04-07 18:51:10 +02:00
Günther Noack	f433fd3fa2	selftests/landlock: Check that coredump sockets stay unrestricted Even when a process is restricted with the new LANDLOCK_ACCESS_FS_RESOLVE_UNIX right, the kernel can continue writing its coredump to the configured coredump socket. In the test, we create a local server and rewire the system to write coredumps into it. We then create a child process within a Landlock domain where LANDLOCK_ACCESS_FS_RESOLVE_UNIX is restricted and make the process crash. The test uses SO_PEERCRED to check that the connecting client process is the expected one. Includes a fix by Mickaël Salaün for setting the EUID to 0 (see [1]). Link[1]: https://lore.kernel.org/all/20260218.ohth8theu8Yi@digikod.net/ Suggested-by: Mickaël Salaün <mic@digikod.net> Signed-off-by: Günther Noack <gnoack3000@gmail.com> Link: https://lore.kernel.org/r/20260327164838.38231-11-gnoack3000@gmail.com Signed-off-by: Mickaël Salaün <mic@digikod.net>	2026-04-07 18:51:10 +02:00
Günther Noack	0f42f5be0b	selftests/landlock: Audit test for LANDLOCK_ACCESS_FS_RESOLVE_UNIX Add an audit test to check that Landlock denials from LANDLOCK_ACCESS_FS_RESOLVE_UNIX result in audit logs in the expected format. (There is one audit test for each filesystem access right, so we should add one for LANDLOCK_ACCESS_FS_RESOLVE_UNIX as well.) Signed-off-by: Günther Noack <gnoack3000@gmail.com> Link: https://lore.kernel.org/r/20260327164838.38231-10-gnoack3000@gmail.com Signed-off-by: Mickaël Salaün <mic@digikod.net>	2026-04-07 18:51:09 +02:00
Günther Noack	9da41c65c9	selftests/landlock: Test LANDLOCK_ACCESS_FS_RESOLVE_UNIX * Extract common helpers from an existing IOCTL test that also uses pathname unix(7) sockets. * These tests use the common scoped domains fixture which is also used in other Landlock scoping tests and which was used in Tingmao Wang's earlier patch set in [1]. These tests exercise the cross product of the following scenarios: * Stream connect(), Datagram connect(), Datagram sendmsg() and Seqpacket connect(). * Child-to-parent and parent-to-child communication * The Landlock policy configuration as listed in the scoped_domains fixture. * In the default variant, Landlock domains are only placed where prescribed in the fixture. * In the "ALL_DOMAINS" variant, Landlock domains are also placed in the places where the fixture says to omit them, but with a LANDLOCK_RULE_PATH_BENEATH that allows connection. Cc: Justin Suess <utilityemal77@gmail.com> Cc: Tingmao Wang <m@maowtm.org> Cc: Mickaël Salaün <mic@digikod.net> Link[1]: https://lore.kernel.org/all/53b9883648225d5a08e82d2636ab0b4fda003bc9.1767115163.git.m@maowtm.org/ Signed-off-by: Günther Noack <gnoack3000@gmail.com> Link: https://lore.kernel.org/r/20260327164838.38231-9-gnoack3000@gmail.com Signed-off-by: Mickaël Salaün <mic@digikod.net>	2026-04-07 18:51:09 +02:00
Günther Noack	db8201a3fa	selftests/landlock: Replace access_fs_16 with ACCESS_ALL in fs_test The access_fs_16 variable was originally intended to stay frozen at 16 access rights so that audit tests would not need updating when new access rights are added. Now that we have 17 access rights, the name is confusing. Replace all uses of access_fs_16 with ACCESS_ALL and delete the variable. Suggested-by: Mickaël Salaün <mic@digikod.net> Signed-off-by: Günther Noack <gnoack3000@gmail.com> Link: https://lore.kernel.org/r/20260327164838.38231-8-gnoack3000@gmail.com Signed-off-by: Mickaël Salaün <mic@digikod.net>	2026-04-07 18:51:08 +02:00
Günther Noack	ae97330d1b	landlock: Control pathname UNIX domain socket resolution by path * Add a new access right LANDLOCK_ACCESS_FS_RESOLVE_UNIX, which controls the lookup operations for named UNIX domain sockets. The resolution happens during connect() and sendmsg() (depending on socket type). * Change access_mask_t from u16 to u32 (see below) * Hook into the path lookup in unix_find_bsd() in af_unix.c, using a LSM hook. Make policy decisions based on the new access rights * Increment the Landlock ABI version. * Minor test adaptations to keep the tests working. * Document the design rationale for scoped access rights, and cross-reference it from the header documentation. With this access right, access is granted if either of the following conditions is met: * The target socket's filesystem path was allow-listed using a LANDLOCK_RULE_PATH_BENEATH rule, or: * The target socket was created in the same Landlock domain in which LANDLOCK_ACCESS_FS_RESOLVE_UNIX was restricted. In case of a denial, connect() and sendmsg() return EACCES, which is the same error as it is returned if the user does not have the write bit in the traditional UNIX file system permissions of that file. The access_mask_t type grows from u16 to u32 to make space for the new access right. This also doubles the size of struct layer_access_masks from 32 byte to 64 byte. To avoid memory layout inconsistencies between architectures (especially m68k), pack and align struct access_masks [2]. Document the (possible future) interaction between scoped flags and other access rights in struct landlock_ruleset_attr, and summarize the rationale, as discussed in code review leading up to [3]. This feature was created with substantial discussion and input from Justin Suess, Tingmao Wang and Mickaël Salaün. Cc: Tingmao Wang <m@maowtm.org> Cc: Justin Suess <utilityemal77@gmail.com> Cc: Kuniyuki Iwashima <kuniyu@google.com> Suggested-by: Jann Horn <jannh@google.com> Link[1]: https://github.com/landlock-lsm/linux/issues/36 Link[2]: https://lore.kernel.org/all/20260401.Re1Eesu1Yaij@digikod.net/ Link[3]: https://lore.kernel.org/all/20260205.8531e4005118@gnoack.org/ Signed-off-by: Günther Noack <gnoack3000@gmail.com> Acked-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Link: https://lore.kernel.org/r/20260327164838.38231-5-gnoack3000@gmail.com [mic: Fix kernel-doc formatting, pack and align access_masks] Signed-off-by: Mickaël Salaün <mic@digikod.net>	2026-04-07 18:51:06 +02:00
Mickaël Salaün	a060ac0b8c	selftests/landlock: Fix format warning for __u64 in net_test On architectures where __u64 is unsigned long (e.g. powerpc64), using %llx to format a __u64 triggers a -Wformat warning because %llx expects unsigned long long. Cast the argument to unsigned long long. Cc: Günther Noack <gnoack@google.com> Cc: stable@vger.kernel.org Fixes: `a549d055a2` ("selftests/landlock: Add network tests") Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/r/202604020206.62zgOTeP-lkp@intel.com/ Reviewed-by: Günther Noack <gnoack3000@gmail.com> Link: https://lore.kernel.org/r/20260402192608.1458252-6-mic@digikod.net Signed-off-by: Mickaël Salaün <mic@digikod.net>	2026-04-07 18:51:03 +02:00
Mickaël Salaün	07c2572a87	selftests/landlock: Skip stale records in audit_match_record() Domain deallocation records are emitted asynchronously from kworker threads (via free_ruleset_work()). Stale deallocation records from a previous test can arrive during the current test's deallocation read loop and be picked up by audit_match_record() instead of the expected record, causing a domain ID mismatch. The audit.layers test (which creates 16 nested domains) is particularly vulnerable because it reads 16 deallocation records in sequence, providing a large window for stale records to interleave. The same issue affects audit_flags.signal, where deallocation records from a previous test (audit.layers) can leak into the next test and be picked up by audit_match_record() instead of the expected record. Fix this by continuing to read records when the type matches but the content pattern does not. Stale records are silently consumed, and the loop only stops when both type and pattern match (or the socket times out with -EAGAIN). Additionally, extend matches_log_domain_deallocated() with an expected_domain_id parameter. When set, the regex pattern includes the specific domain ID as a literal hex value, so that deallocation records for a different domain do not match the pattern at all. This handles the case where the stale record has the same denial count as the expected one (e.g. both have denials=1), which the type+pattern loop alone cannot distinguish. Callers that already know the expected domain ID (from a prior denial or allocation record) now pass it to filter precisely. When expected_domain_id is set, matches_log_domain_deallocated() also temporarily increases the socket timeout to audit_tv_dom_drop (1 second) to wait for the asynchronous kworker deallocation, and restores audit_tv_default afterward. This removes the need for callers to manage the timeout switch manually. Cc: Günther Noack <gnoack@google.com> Cc: stable@vger.kernel.org Fixes: `6a500b2297` ("selftests/landlock: Add tests for audit flags and domain IDs") Link: https://lore.kernel.org/r/20260402192608.1458252-5-mic@digikod.net Signed-off-by: Mickaël Salaün <mic@digikod.net>	2026-04-07 18:51:02 +02:00
Mickaël Salaün	3647a4977f	selftests/landlock: Drain stale audit records on init Non-audit Landlock tests generate audit records as side effects when audit_enabled is non-zero (e.g. from boot configuration). These records accumulate in the kernel audit backlog while no audit daemon socket is open. When the next test opens a new netlink socket and registers as the audit daemon, the stale backlog is delivered, causing baseline record count checks to fail spuriously. Fix this by draining all pending records in audit_init() right after setting the receive timeout. The 1-usec SO_RCVTIMEO causes audit_recv() to return -EAGAIN once the backlog is empty, naturally terminating the drain loop. Domain deallocation records are emitted asynchronously from a work queue, so they may still arrive after the drain. Remove records.domain == 0 checks that are not preceded by audit_match_record() calls, which would otherwise consume stale records before the count. Document this constraint above audit_count_records(). Increasing the drain timeout to catch in-flight deallocation records was considered but rejected: a longer timeout adds latency to every audit_init() call even when no stale record is pending, and any fixed timeout is still not guaranteed to catch all records under load. Removing the unprotected checks is simpler and avoids the spurious failures. Cc: Günther Noack <gnoack@google.com> Cc: stable@vger.kernel.org Fixes: `6a500b2297` ("selftests/landlock: Add tests for audit flags and domain IDs") Reviewed-by: Günther Noack <gnoack3000@gmail.com> Link: https://lore.kernel.org/r/20260402192608.1458252-4-mic@digikod.net Signed-off-by: Mickaël Salaün <mic@digikod.net>	2026-04-07 18:51:01 +02:00
Mickaël Salaün	9143d79033	selftests/landlock: Fix socket file descriptor leaks in audit helpers audit_init() opens a netlink socket and configures it, but leaks the file descriptor if audit_set_status() or setsockopt() fails. Fix this by jumping to an error path that closes the socket before returning. Apply the same fix to audit_init_with_exe_filter(), which leaks the file descriptor from audit_init() if audit_init_filter_exe() or audit_filter_exe() fails, and to audit_cleanup(), which leaks it if audit_init_filter_exe() fails in FIXTURE_TEARDOWN_PARENT(). Cc: Günther Noack <gnoack@google.com> Cc: stable@vger.kernel.org Fixes: `6a500b2297` ("selftests/landlock: Add tests for audit flags and domain IDs") Reviewed-by: Günther Noack <gnoack3000@gmail.com> Link: https://lore.kernel.org/r/20260402192608.1458252-3-mic@digikod.net Signed-off-by: Mickaël Salaün <mic@digikod.net>	2026-04-07 18:51:01 +02:00
Mickaël Salaün	b566f7a4f0	selftests/landlock: Fix snprintf truncation checks in audit helpers snprintf() returns the number of characters that would have been written, excluding the terminating NUL byte. When the output is truncated, this return value equals or exceeds the buffer size. Fix matches_log_domain_allocated() and matches_log_domain_deallocated() to detect truncation with ">=" instead of ">". Cc: Günther Noack <gnoack@google.com> Cc: stable@vger.kernel.org Fixes: `6a500b2297` ("selftests/landlock: Add tests for audit flags and domain IDs") Reviewed-by: Günther Noack <gnoack3000@gmail.com> Link: https://lore.kernel.org/r/20260402192608.1458252-2-mic@digikod.net Signed-off-by: Mickaël Salaün <mic@digikod.net>	2026-04-07 18:51:00 +02:00
Mickaël Salaün	e75e38055b	landlock: Allow TSYNC with LOG_SUBDOMAINS_OFF and fd=-1 LANDLOCK_RESTRICT_SELF_TSYNC does not allow LANDLOCK_RESTRICT_SELF_LOG_SUBDOMAINS_OFF with ruleset_fd=-1, preventing a multithreaded process from atomically propagating subdomain log muting to all threads without creating a domain layer. Relax the fd=-1 condition to accept TSYNC alongside LOG_SUBDOMAINS_OFF, and update the documentation accordingly. Add flag validation tests for all TSYNC combinations with ruleset_fd=-1, and audit tests verifying both transition directions: muting via TSYNC (logged to not logged) and override via TSYNC (not logged to logged). Cc: Günther Noack <gnoack@google.com> Cc: stable@vger.kernel.org Fixes: `42fc7e6543` ("landlock: Multithreading support for landlock_restrict_self()") Reviewed-by: Günther Noack <gnoack3000@gmail.com> Link: https://lore.kernel.org/r/20260407164107.2012589-2-mic@digikod.net Signed-off-by: Mickaël Salaün <mic@digikod.net>	2026-04-07 18:51:00 +02:00
Mickaël Salaün	874c8f8382	landlock: Fix LOG_SUBDOMAINS_OFF inheritance across fork() hook_cred_transfer() only copies the Landlock security blob when the source credential has a domain. This is inconsistent with landlock_restrict_self() which can set LOG_SUBDOMAINS_OFF on a credential without creating a domain (via the ruleset_fd=-1 path): the field is committed but not preserved across fork() because the child's prepare_creds() calls hook_cred_transfer() which skips the copy when domain is NULL. This breaks the documented use case where a process mutes subdomain logs before forking sandboxed children: the children lose the muting and their domains produce unexpected audit records. Fix this by unconditionally copying the Landlock credential blob. Cc: Günther Noack <gnoack@google.com> Cc: Jann Horn <jannh@google.com> Cc: stable@vger.kernel.org Fixes: `ead9079f75` ("landlock: Add LANDLOCK_RESTRICT_SELF_LOG_SUBDOMAINS_OFF") Reviewed-by: Günther Noack <gnoack3000@gmail.com> Link: https://lore.kernel.org/r/20260407164107.2012589-1-mic@digikod.net Signed-off-by: Mickaël Salaün <mic@digikod.net>	2026-04-07 18:50:56 +02:00
Weiming Shi	1c22483a2c	bpf: reject negative CO-RE accessor indices in bpf_core_parse_spec() CO-RE accessor strings are colon-separated indices that describe a path from a root BTF type to a target field, e.g. "0:1:2" walks through nested struct members. bpf_core_parse_spec() parses each component with sscanf("%d"), so negative values like -1 are silently accepted. The subsequent bounds checks (access_idx >= btf_vlen(t)) only guard the upper bound and always pass for negative values because C integer promotion converts the __u16 btf_vlen result to int, making the comparison (int)(-1) >= (int)(N) false for any positive N. When -1 reaches btf_member_bit_offset() it gets cast to u32 0xffffffff, producing an out-of-bounds read far past the members array. A crafted BPF program with a negative CO-RE accessor on any struct that exists in vmlinux BTF (e.g. task_struct) crashes the kernel deterministically during BPF_PROG_LOAD on any system with CONFIG_DEBUG_INFO_BTF=y (default on major distributions). The bug is reachable with CAP_BPF: BUG: unable to handle page fault for address: ffffed11818b6626 #PF: supervisor read access in kernel mode #PF: error_code(0x0000) - not-present page Oops: Oops: 0000 [#1] SMP KASAN NOPTI CPU: 0 UID: 0 PID: 85 Comm: poc Not tainted 7.0.0-rc6 #18 PREEMPT(full) RIP: 0010:bpf_core_parse_spec (tools/lib/bpf/relo_core.c:354) RAX: 00000000ffffffff Call Trace: <TASK> bpf_core_calc_relo_insn (tools/lib/bpf/relo_core.c:1321) bpf_core_apply (kernel/bpf/btf.c:9507) check_core_relo (kernel/bpf/verifier.c:19475) bpf_check (kernel/bpf/verifier.c:26031) bpf_prog_load (kernel/bpf/syscall.c:3089) __sys_bpf (kernel/bpf/syscall.c:6228) </TASK> CO-RE accessor indices are inherently non-negative (struct member index, array element index, or enumerator index), so reject them immediately after parsing. Fixes: `ddc7c30426` ("libbpf: implement BPF CO-RE offset relocation algorithm") Reported-by: Xiang Mei <xmei5@asu.edu> Signed-off-by: Weiming Shi <bestswngs@gmail.com> Reviewed-by: Emil Tsalapatis <emil@etsalapatis.com> Acked-by: Paul Chaignon <paul.chaignon@gmail.com> Link: https://lore.kernel.org/r/20260404161221.961828-2-bestswngs@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-04-07 08:27:55 -07:00
Claudio Imbrenda	857e92662c	KVM: s390: selftests: enable some common memory-related tests Enable the following tests on s390: * memslot_modification_stress_test * memslot_perf_test * mmu_stress_test Since the first two tests are now supported on all architectures, move them into TEST_GEN_PROGS_COMMON and out of the indiviual architectures. Reviewed-by: Steffen Eiden <seiden@linux.ibm.com> Signed-off-by: Claudio Imbrenda <imbrenda@linux.ibm.com>	2026-04-07 17:20:30 +02:00
Claudio Imbrenda	c10e2771c7	KVM: selftests: Remove 1M alignment requirement for s390 Remove the 1M memslot alignment requirement for s390, since it is not needed anymore. Reviewed-by: Steffen Eiden <seiden@linux.ibm.com> Signed-off-by: Claudio Imbrenda <imbrenda@linux.ibm.com>	2026-04-07 17:07:27 +02:00
Ming Lei	affb5f67d7	selftests/ublk: add read-only buffer registration test Add --rdonly_shmem_buf option to kublk that registers shared memory buffers with UBLK_SHMEM_BUF_READ_ONLY (read-only pinning without FOLL_WRITE) and mmaps with PROT_READ only. Add test_shmemzc_04.sh which exercises the new flag with a null target, hugetlbfs buffer, and write workload. Write I/O works because the server only reads from the shared buffer — the data flows from client to kernel to the shared pages, and the server reads them out. Signed-off-by: Ming Lei <ming.lei@redhat.com> Link: https://patch.msgid.link/20260331153207.3635125-11-ming.lei@redhat.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2026-04-07 07:42:39 -06:00
Ming Lei	12075992c6	selftests/ublk: add filesystem fio verify test for shmem_zc Add test_shmemzc_03.sh which exercises shmem_zc through the full filesystem stack: mkfs ext4 on the ublk device, mount it, then run fio verify on a file inside the filesystem with --mem=mmaphuge. Extend _mkfs_mount_test() to accept an optional command that runs between mount and umount. The function cd's into the mount directory so the command can use relative file paths. Existing callers that pass only the device are unaffected. Signed-off-by: Ming Lei <ming.lei@redhat.com> Link: https://patch.msgid.link/20260331153207.3635125-10-ming.lei@redhat.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2026-04-07 07:42:23 -06:00
Ming Lei	d486650332	selftests/ublk: add hugetlbfs shmem_zc test for loop target Add test_shmem_zc_02.sh which tests the UBLK_IO_F_SHMEM_ZC zero-copy path on the loop target using a hugetlbfs shared buffer. Both kublk and fio mmap the same hugetlbfs file with MAP_SHARED, sharing physical pages. The kernel's PFN matching enables zero-copy — the loop target reads/writes directly from the shared buffer to the backing file. Uses standard fio --mem=mmaphuge:<path> (supported since fio 1.10), no patched fio required. Signed-off-by: Ming Lei <ming.lei@redhat.com> Link: https://patch.msgid.link/20260331153207.3635125-9-ming.lei@redhat.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2026-04-07 07:42:23 -06:00
Ming Lei	2f1e9468bd	selftests/ublk: add shared memory zero-copy test Add test_shmem_zc_01.sh which tests UBLK_IO_F_SHMEM_ZC on the null target using a hugetlbfs shared buffer. Both kublk (--htlb) and fio (--mem=mmaphuge:<path>) mmap the same hugetlbfs file with MAP_SHARED, sharing physical pages. The kernel PFN match enables zero-copy I/O. Uses standard fio --mem=mmaphuge:<path> (supported since fio 1.10), no patched fio required. Signed-off-by: Ming Lei <ming.lei@redhat.com> Link: https://patch.msgid.link/20260331153207.3635125-8-ming.lei@redhat.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2026-04-07 07:42:23 -06:00
Ming Lei	ec20aa44ac	selftests/ublk: add UBLK_F_SHMEM_ZC support for loop target Add loop_queue_shmem_zc_io() which handles I/O requests marked with UBLK_IO_F_SHMEM_ZC. When the kernel sets this flag, the request data lives in a registered shared memory buffer — decode index + offset from iod->addr and use the server's mmap as the I/O buffer. The dispatch check in loop_queue_tgt_rw_io() routes SHMEM_ZC requests to this new function, bypassing the normal buffer registration path. Signed-off-by: Ming Lei <ming.lei@redhat.com> Link: https://patch.msgid.link/20260331153207.3635125-7-ming.lei@redhat.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2026-04-07 07:42:23 -06:00
Ming Lei	166b476b8d	selftests/ublk: add shared memory zero-copy support in kublk Add infrastructure for UBLK_F_SHMEM_ZC shared memory zero-copy: - kublk.h: struct ublk_shmem_entry and table for tracking registered shared memory buffers - kublk.c: per-device unix socket listener that accepts memfd registrations from clients via SCM_RIGHTS fd passing. The listener mmaps the memfd and registers the VA range with the kernel for PFN matching. Also adds --shmem_zc command line option. - kublk.c: --htlb <path> option to open a pre-allocated hugetlbfs file, mmap it with MAP_SHARED\|MAP_POPULATE, and register it with the kernel via ublk_ctrl_reg_buf(). Any process that mmaps the same hugetlbfs file shares the same physical pages, enabling zero-copy without socket-based fd passing. Signed-off-by: Ming Lei <ming.lei@redhat.com> Link: https://patch.msgid.link/20260331153207.3635125-6-ming.lei@redhat.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2026-04-07 07:41:55 -06:00
Rafael J. Wysocki	2acabc866c	linux-cpupower-7.1-rc1 - Fixes errors in cpupower-frequency-info short option names to its manpage. - Fixes cpupower-idle-info perf option name to its manpage. - Adds boost and epp options to cpupower-frequency-info to its manpage. - Adds description for perf-bias option to cpupower-info to its manpage. - Removes unnecessary extern declarations from getopt.h in arguments parsing functions in cpufreq-set, cpuidle-info, cpuidle-set, cpupower-info, and cpupower-set utilities. These functions are defined getopt.h file. -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEEPZKym/RZuOCGeA/kCwJExA0NQxwFAmnT+88ACgkQCwJExA0N QxzwFQ/7BaLieo6RjYGnnu+Z91o1O0Hs0q2ivS1qXQ1eefN45kBF0hR7pK6e+fWi kUXyh9j9E+itiBsZTHxZk6l0oPgw/t9mJJkIvo7Y/F6va5hJkSUdykMsEpDj8rZp MwLHaXTyZnZBW/W4lBsrgS0/mHNsQm9ru5KRGu6qlDu5Sbb4NokvPPkERh4FPWdi u5jsusZn96IkgftvpJ1ilBtVjvJtB7dSMm6NOg0nJXHmzFbFj+k3B0yW1E3fjojx qZVSohPXs1d167yy7KL1nYm7TMhFRQHooSv2/jdFEVJ9nD6IwFX8mmP7Y+2a/Wh9 adiGAbkZx8vt5deqiAgV4UYFkXMronqRz+kbqk8OnHicQL1zub4K2i09vWLdvE4u kHFF2z2ZSyTYDvn8ttX5MeL/NkygWfD4ubtL9iG6AMLx0JOgFXoYefg1WL1PG76v xOR+DQVzPyzIknTkxgtrRCMrXIt9+nJKwf0BCX27P7hHl1hlSRJ9qsgw0mkF+Sjq rdoVTPDdyQIOD2+863sTpMDeH3T/IfjF2Jn9aTbWFUkFYhdnYuIAK7RK4gC0LZ6B DUdNqva6qeaG+o5FSwgp0oHUOIemObBA4Cr0LzkZR/IZSP7yIPNAYn5EWqgdve5s 6XzX8GFHMSKUfpJoHaUjjxZ3ursj0xw3MHrhv5t4npsx3yKHOIk= =f1Hu -----END PGP SIGNATURE----- Merge tag 'linux-cpupower-7.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux Pull cpupower utility updates for 7.1-rc1 from Shuah Khan: "- Fixes errors in cpupower-frequency-info short option names to its manpage. - Fixes cpupower-idle-info perf option name to its manpage. - Adds boost and epp options to cpupower-frequency-info to its manpage. - Adds description for perf-bias option to cpupower-info to its manpage. - Removes unnecessary extern declarations from getopt.h in arguments parsing functions in cpufreq-set, cpuidle-info, cpuidle-set, cpupower-info, and cpupower-set utilities. These functions are defined getopt.h file." * tag 'linux-cpupower-7.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux: cpupower: remove extern declarations in cmd functions cpupower-info.1: describe the --perf-bias option cpupower-frequency-info.1: document --boost and --epp options cpupower-frequency-info.1: use the proper name of the --perf option cpupower-idle-info.1: fix short option names	2026-04-07 15:39:38 +02:00
Qingfang Deng	dfecb0c5af	selftests: net: add tests for PPP Add ping and iperf3 tests for ppp_async.c and pppoe.c. Signed-off-by: Qingfang Deng <qingfang.deng@linux.dev> Link: https://patch.msgid.link/20260403034908.30017-1-qingfang.deng@linux.dev Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2026-04-07 12:08:46 +02:00
Eric Biggers	05d42dc8ab	xfrm: Drop support for HMAC-RIPEMD-160 Drop support for HMAC-RIPEMD-160 from IPsec to reduce the UAPI surface and simplify future maintenance. It's almost certainly unused. RIPEMD-160 received some attention in the early 2000s when SHA-* weren't quite as well established. But it never received much adoption outside of certain niches such as Bitcoin. It's actually unclear that Linux + IPsec + HMAC-RIPEMD-160 has ever been used, even historically. When support for it was added in 2003, it was done so in a "cleanup" commit without any justification [1]. It didn't actually work until someone happened to fix it 5 years later [2]. That person didn't use or test it either [3]. Finally, also note that "hmac(rmd160)" is by far the slowest of the algorithms in aalg_list[]. Of course, today IPsec is usually used with an AEAD, such as AES-GCM. But even for IPsec users still using a dedicated auth algorithm, they almost certainly aren't using, and shouldn't use, HMAC-RIPEMD-160. Thus, let's just drop support for it. Note: no kconfig update is needed, since CRYPTO_RMD160 wasn't actually being selected anyway. References: [1] linux-history commit d462985fc1941a47 ("[IPSEC]: Clean up key manager algorithm handling.") [2] linux commit `a13366c632` ("xfrm: xfrm_algo: correct usage of RIPEMD-160") [3] https://lore.kernel.org/all/1212340578-15574-1-git-send-email-rueegsegger@swiss-it.ch Signed-off-by: Eric Biggers <ebiggers@kernel.org> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>	2026-04-07 10:47:58 +02:00
Thomas Weißschuh	598b670af3	selftests/nolibc: don't skip tests for unimplemented syscalls anymore The automatic skipping of tests on ENOSYS returns was introduced in commit `349afc8a52` ("selftests/nolibc: skip tests for unimplemented syscalls"). It handled the fact that nolibc would return ENOSYS for many syscall wrappers on riscv32. Nowadays nolibc handles all these correctly, so this logic is not used anymore. To make missing nolibc functionality more obvious fail the tests again if something is not implemented. Revert the mentioned commit again. Signed-off-by: Thomas Weißschuh <linux@weissschuh.net> Acked-by: Willy Tarreau <w@1wt.eu> Link: https://patch.msgid.link/20260406-nolibc-no-skip-enosys-v1-2-c046b1ac7d73@weissschuh.net/	2026-04-07 09:29:26 +02:00
Thomas Weißschuh	9a5206f256	selftests/nolibc: explicitly handle ENOSYS from ptrace() The automatic ENOSYS handling in EXPECT_SYSER() is about to be removed. ptrace() will return legitimately return ENOSYS on qemu-user, so handle it explicitly. Signed-off-by: Thomas Weißschuh <linux@weissschuh.net> Acked-by: Willy Tarreau <w@1wt.eu> Link: https://patch.msgid.link/20260406-nolibc-no-skip-enosys-v1-1-c046b1ac7d73@weissschuh.net/	2026-04-07 09:28:32 +02:00
Thomas Weißschuh	ce834c9cb9	tools/nolibc: add byteorder conversions Add some standard functions to convert between different byte orders. Conveniently the UAPI headers provide all the necessary functionality. Signed-off-by: Thomas Weißschuh <linux@weissschuh.net> Acked-by: Willy Tarreau <w@1wt.eu> Link: https://patch.msgid.link/20260405-nolibc-bswap-v1-1-f7699ca9cee0@weissschuh.net	2026-04-07 09:27:25 +02:00
Thomas Weißschuh	2eb64b936d	tools/nolibc: add the _syscall() macro The standard syscall() function or macro uses the libc return value convention. Errors returned from the kernel as negative values are stored in errno and -1 is returned. Users who want to avoid using errno don't have a way to call raw syscalls and check the returned error. Add a new macro _syscall() which works like the standard syscall() but passes through the return value from the kernel unchanged. The naming scheme and return values match the named _sys_foo() system call wrappers already part of nolibc. Signed-off-by: Thomas Weißschuh <linux@weissschuh.net> Acked-by: Willy Tarreau <w@1wt.eu> Link: https://patch.msgid.link/20260405-nolibc-syscall-v1-3-e5b12bc63211@weissschuh.net	2026-04-07 09:27:07 +02:00
Thomas Weißschuh	022bbb5a41	tools/nolibc: move the call to __sysret() into syscall() __sysret() transforms the return value from the kernel into the libc return value convention. There is no reason for it to be called in the middle of the internals of the syscall() implementation macros. Move the call up, directly into syscall(), to make the code simpler. Signed-off-by: Thomas Weißschuh <linux@weissschuh.net> Acked-by: Willy Tarreau <w@1wt.eu> Link: https://patch.msgid.link/20260405-nolibc-syscall-v1-2-e5b12bc63211@weissschuh.net	2026-04-07 09:27:02 +02:00
Thomas Weißschuh	3f5059f01d	tools/nolibc: rename the internal macros used in syscall() These macros are the internal implementation of syscall(). They can not be used by users. Align them with the standard naming scheme for internal symbols. The current name also prevents the addition of an application-usable _syscall() symbol. Signed-off-by: Thomas Weißschuh <linux@weissschuh.net> Acked-by: Willy Tarreau <w@1wt.eu> Link: https://patch.msgid.link/20260405-nolibc-syscall-v1-1-e5b12bc63211@weissschuh.net	2026-04-07 09:26:57 +02:00
Matthieu Baerts (NGI0)	c4a5cb2f00	selftests: mptcp: join: recreate signal endp with same ID In this "delete re-add signal" MPTCP Join subtest, the endpoint linked to the initial subflow is removed, but readded once with different ID. It appears that there was an issue when reusing the same ID, recently fixed by commit `d191101dee` ("mptcp: pm: in-kernel: always set ID as avail when rm endp"). The test then now reuses the same ID the first time, but continue to use another one (88) the second time. This should then cover more cases. Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/615 Reviewed-by: Geliang Tang <geliang@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://patch.msgid.link/20260403-net-next-mptcp-msg_eor-misc-v1-5-b0b33bea3fed@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-04-06 19:14:30 -07:00
Stefano Garzarella	24ad7ff668	vsock/test: fix send_buf()/recv_buf() EINTR handling When send() or recv() returns -1 with errno == EINTR, the code skips the break but still adds the return value to nwritten/nread, making it decrease by 1. This leads to wrong buffer offsets and wrong bytes count. Fix it by explicitly continuing the loop on EINTR, so the return value is only added when it is positive. Fixes: `a8ed71a27e` ("vsock/test: add recv_buf() utility function") Fixes: `12329bd51f` ("vsock/test: add send_buf() utility function") Signed-off-by: Stefano Garzarella <sgarzare@redhat.com> Reviewed-by: Luigi Leonardi <leonardi@redhat.com> Link: https://patch.msgid.link/20260403093251.30662-1-sgarzare@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-04-06 18:46:03 -07:00
Maciej Fijalkowski	62838e363e	selftests: bpf: adjust rx_dropped xskxceiver's test to respect tailroom Since we have changed how big user defined headroom in umem can be, change the logic in testapp_stats_rx_dropped() so we pass updated headroom validation in xdp_umem_reg() and still drop half of frames. Test works on non-mbuf setup so __xsk_pool_get_rx_frame_size() that is called on xsk_rcv_check() will not account skb_shared_info size. Taking the tailroom size into account in test being fixed is needed as xdp_umem_reg() defaults to respect it. Reviewed-by: Björn Töpel <bjorn@kernel.org> Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Link: https://patch.msgid.link/20260402154958.562179-9-maciej.fijalkowski@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-04-06 18:43:52 -07:00
Maciej Fijalkowski	16546954e1	selftests: bpf: have a separate variable for drop test Currently two different XDP programs share a static variable for different purposes (picking where to redirect on shared umem test & whether to drop a packet). This can be a problem when running full test suite - idx can be written by shared umem test and this value can cause a false behavior within XDP drop half test. Introduce a dedicated variable for drop half test so that these two don't step on each other toes. There is no real need for using __sync_fetch_and_add here as XSK tests are executed on single CPU. Reviewed-by: Björn Töpel <bjorn@kernel.org> Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Link: https://patch.msgid.link/20260402154958.562179-8-maciej.fijalkowski@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-04-06 18:43:52 -07:00
Maciej Fijalkowski	3197c51ce2	selftests: bpf: fix pkt grow tests Skip tail adjust tests in xskxceiver for SKB mode as it is not very friendly for it. multi-buffer case does not work as xdp_rxq_info that is registered for generic XDP does not report ::frag_size. The non-mbuf path copies packet via skb_pp_cow_data() which only accounts for headroom, leaving us with no tailroom and causing underlying XDP prog to drop packets therefore. For multi-buffer test on other modes, change the amount of bytes we use for growth, assume worst-case scenario and take care of headroom and tailroom. Reviewed-by: Björn Töpel <bjorn@kernel.org> Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Link: https://patch.msgid.link/20260402154958.562179-7-maciej.fijalkowski@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-04-06 18:43:51 -07:00
Maciej Fijalkowski	c5866a6be4	selftests: bpf: introduce a common routine for reading procfs Parametrize current way of getting MAX_SKB_FRAGS value from {sys,proc}fs so that it can be re-used to get cache line size of system's CPU. All that just to mimic and compute size of kernel's struct skb_shared_info which for xsk and test suite interpret as tailroom. Introduce two variables to ifobject struct that will carry count of skb frags and tailroom size. Do the reading and computing once, at the beginning of test suite execution in xskxceiver, but for test_progs such way is not possible as in this environment each test setups and torns down ifobject structs. Reviewed-by: Björn Töpel <bjorn@kernel.org> Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Link: https://patch.msgid.link/20260402154958.562179-6-maciej.fijalkowski@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-04-06 18:43:51 -07:00
Anton Protopopov	1c2e217ad3	selftests/bpf: Add more tests for loading insn arrays with offsets A `gotox rX` instruction accepts only values of type PTR_TO_INSN. The only way to create such a value is to load it from a map of type insn_array: rX = *(rY + offset) # rY was read from an insn_array ... gotox rX Add instruction-level and C-level selftests to validate loads with nonzero offsets. Signed-off-by: Anton Protopopov <a.s.protopopov@gmail.com> Link: https://lore.kernel.org/r/20260406160141.36943-3-a.s.protopopov@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-04-06 18:38:32 -07:00
Jakub Kicinski	3b45559f6c	selftests: net: py: color the basics in the output Sometimes it's hard to spot the ok / not ok lines in the output. This is especially true for the GRO tests which retries a lot so there's a wall of non-fatal output printed. Try to color the crucial lines green / red / yellow when running in a terminal. Acked-by: Stanislav Fomichev <sdf@fomichev.me> Reviewed-by: Willem de Bruijn <willemb@google.com> Acked-by: Joe Damato <joe@dama.to> Link: https://patch.msgid.link/20260402215444.1589893-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-04-06 17:47:59 -07:00
Kumar Kartikeya Dwivedi	171580e432	selftests/bpf: Add tests for syscall ctx accesses beyond U16_MAX Ensure we reject programs that access beyond the maximum syscall ctx size, i.e. U16_MAX either through direct accesses or helpers/kfuncs. Reviewed-by: Emil Tsalapatis <emil@etsalapatis.com> Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Link: https://lore.kernel.org/r/20260406194403.1649608-8-memxor@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-04-06 15:27:27 -07:00
Kumar Kartikeya Dwivedi	0dca817f4d	selftests/bpf: Add tests for unaligned syscall ctx accesses Add coverage for unaligned access with fixed offsets and variable offsets, and through helpers or kfuncs. Reviewed-by: Emil Tsalapatis <emil@etsalapatis.com> Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Link: https://lore.kernel.org/r/20260406194403.1649608-7-memxor@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-04-06 15:27:27 -07:00
Kumar Kartikeya Dwivedi	02c68b10d8	selftests/bpf: Test modified syscall ctx for ARG_PTR_TO_CTX Ensure that global subprogs and tail calls can only accept an unmodified PTR_TO_CTX for syscall programs. For all other program types, fixed or variable offsets on PTR_TO_CTX is rejected when passed into an argument of any call instruction type, through the unified logic of check_func_arg_reg_off. Finally, add a positive example of a case that should succeed with all our previous changes. Reviewed-by: Emil Tsalapatis <emil@etsalapatis.com> Acked-by: Puranjay Mohan <puranjay@kernel.org> Acked-by: Mykyta Yatsenko <yatsenko@meta.com> Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Link: https://lore.kernel.org/r/20260406194403.1649608-6-memxor@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-04-06 15:27:27 -07:00
Kumar Kartikeya Dwivedi	5a34139b27	selftests/bpf: Add syscall ctx variable offset tests Add various tests to exercise fixed and variable offsets on PTR_TO_CTX for syscall programs, and cover disallowed cases for other program types lacking convert_ctx_access callback. Load verifier_ctx with CAP_SYS_ADMIN so that kfunc related logic can be tested. While at it, convert assembly tests to C. Unfortunately, ctx_pointer_to_helper_2's unpriv case conflicts with usage of kfuncs in the file and cannot be run. Reviewed-by: Emil Tsalapatis <emil@etsalapatis.com> Acked-by: Puranjay Mohan <puranjay@kernel.org> Acked-by: Mykyta Yatsenko <yatsenko@meta.com> Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Link: https://lore.kernel.org/r/20260406194403.1649608-5-memxor@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-04-06 15:27:26 -07:00
Kumar Kartikeya Dwivedi	02f500ce01	selftests/bpf: Convert ctx tests from ASM to C Convert existing tests from ASM to C, in prep for future changes to add more comprehensive tests. Reviewed-by: Emil Tsalapatis <emil@etsalapatis.com> Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Link: https://lore.kernel.org/r/20260406194403.1649608-4-memxor@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-04-06 15:27:26 -07:00
Kumar Kartikeya Dwivedi	ae5ef001aa	bpf: Support variable offsets for syscall PTR_TO_CTX Allow accessing PTR_TO_CTX with variable offsets in syscall programs. Fixed offsets are already enabled for all program types that do not convert their ctx accesses, since the changes we made in the commit `de6c7d99f8` ("bpf: Relax fixed offset check for PTR_TO_CTX"). Note that we also lift the restriction on passing syscall context into helpers, which was not permitted before, and passing modified syscall context into kfuncs. The structure of check_mem_access can be mostly shared and preserved, but we must use check_mem_region_access to correctly verify access with variable offsets. The check made in check_helper_mem_access is hardened to only allow PTR_TO_CTX for syscall programs to be passed in as helper memory. This was the original intention of the existing code anyway, and it makes little sense for other program types' context to be utilized as a memory buffer. In case a convincing example presents itself in the future, this check can be relaxed further. We also no longer use the last-byte access to simulate helper memory access, but instead go through check_mem_region_access. Since this no longer updates our max_ctx_offset, we must do so manually, to keep track of the maximum offset at which the program ctx may be accessed. Take care to ensure that when arg_type is ARG_PTR_TO_CTX, we do not relax any fixed or variable offset constraints around PTR_TO_CTX even in syscall programs, and require them to be passed unmodified. There are several reasons why this is necessary. First, if we pass a modified ctx, then the global subprog's accesses will not update the max_ctx_offset to its true maximum offset, and can lead to out of bounds accesses. Second, tail called program (or extension program replacing global subprog) where their max_ctx_offset exceeds the program they are being called from can also cause issues. For the latter, unmodified PTR_TO_CTX is the first requirement for the fix, the second is ensuring max_ctx_offset >= the program they are being called from, which has to be a separate change not made in this commit. All in all, we can hint using arg_type when we expect ARG_PTR_TO_CTX and make our relaxation around offsets conditional on it. Drop coverage of syscall tests from verifier_ctx.c temporarily for negative cases until they are updated in subsequent commits. Reviewed-by: Emil Tsalapatis <emil@etsalapatis.com> Acked-by: Puranjay Mohan <puranjay@kernel.org> Acked-by: Mykyta Yatsenko <yatsenko@meta.com> Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Link: https://lore.kernel.org/r/20260406194403.1649608-2-memxor@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-04-06 15:27:26 -07:00
David Gow	8f260b02ee	kunit: tool: Terminate kernel under test on SIGINT kunit.py will attempt to catch SIGINT / ^C in order to ensure the TTY isn't messed up, but never actually attempts to terminate the running kernel (be it UML or QEMU). This can lead to a bit of frustration if the kernel has crashed or hung. Terminate the kernel process in the signal handler, if it's running. This requires plumbing through the process handle in a few more places (and having some checks to see if the kernel is still running in places where it may have already been killed). Reported-by: Andy Shevchenko <andriy.shevchenko@intel.com> Closes: https://lore.kernel.org/all/aaFmiAmg9S18EANA@smile.fi.intel.com/ Signed-off-by: David Gow <david@davidgow.net> Reviewed-by: Andy Shevchenko <andriy.shevchenko@intel.com> Tested-by: Andy Shevchenko <andriy.shevchenko@intel.com> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>	2026-04-06 14:09:04 -06:00
Shuvam Pandey	e42c349f4c	kunit: tool: skip stty when stdin is not a tty run_kernel() cleanup and signal_handler() invoke stty unconditionally. When stdin is not a tty (for example in CI or unit tests), this writes noise to stderr. Call stty only when stdin is a tty. Add regression tests for these paths: - run_kernel() with non-tty stdin - signal_handler() with non-tty stdin - signal_handler() with tty stdin Signed-off-by: Shuvam Pandey <shuvampandey1@gmail.com> Reviewed-by: David Gow <david@davidgow.net> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>	2026-04-06 14:07:29 -06:00
David Gow	b73f50ffd4	kunit: tool: Recommend --raw_output=all if no KTAP found If no KTAP header is found in the kernel output (e.g., because the kernel crashed before the KUnit executor was run), it's very useful to re-run the test with --raw_output=all, as that will show any error output (such as a stacktrace, log message, BUG, etc). This is not particularly intuitive, however, as --raw_output=all is not well known. Add an extra log line to advertise --raw_output=all in this case, as it's a terrible user experience to just get "Did any KUnit tests run?" Signed-off-by: David Gow <david@davidgow.net> Reviewed-by: Andy Shevchenko <andriy.shevchenko@intel.com> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>	2026-04-06 13:47:52 -06:00
Ryota Sakamoto	b5f92fc4a7	kunit: Add --list_suites to show suites Currently, kunit.py allows listing all individual tests via --list_tests. However, users often need to see only the available test suites. Add --list_suites to show suites. This option parses the test list output from the kernel and prints only the suite names. Example of the output of --list_suites: example_init miscdev_init printk-ringbuffer Signed-off-by: Ryota Sakamoto <sakamo.ryota@gmail.com> Reviewed-by: David Gow <david@davidgow.net> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>	2026-04-06 13:47:52 -06:00
Cheng-Yang Chou	a3c3fb2f86	tools/sched_ext: Fix off-by-one in scx_sdt payload zeroing scx_alloc_free_idx() zeroes the payload of a freed arena allocation one word at a time. The loop bound was alloc->pool.elem_size / 8, but elem_size includes sizeof(struct sdt_data) (the 8-byte union sdt_id header). This caused the loop to write one extra u64 past the allocation, corrupting the tid field of the adjacent pool element. Fix the loop bound to (elem_size - sizeof(struct sdt_data)) / 8 so only the payload portion is zeroed. Test plan: - Add a temporary sanity check in scx_task_free() before the free call: if (mval->data->tid.idx != mval->tid.idx) scx_bpf_error("tid corruption: arena=%d storage=%d", mval->data->tid.idx, (int)mval->tid.idx); - stress-ng --fork 100 -t 10 & sudo ./build/bin/scx_sdt Without this fix, running scx_sdt under fork-heavy load triggers the corruption error. With the fix applied, the same workload completes without error. Fixes: `36929ebd17` ("tools/sched_ext: add arena based scheduler") Signed-off-by: Cheng-Yang Chou <yphbchou0911@gmail.com> Reviewed-by: Emil Tsalapatis <emil@etsalapatis.com> Signed-off-by: Tejun Heo <tj@kernel.org>	2026-04-06 08:06:24 -10:00
Thomas Weißschuh	08b96aa962	selftests/nolibc: only use libgcc when really necessary nolibc should work without libgcc to be compatible with as many toolchains as possible. Currently the functionality tested by nolibc-test does not contain any dependencies, make sure it stays this way by not linking libgcc anymore. On the ppc target GCC always emits references to '_restgpr_' functions, so keep linking libgcc there. Signed-off-by: Thomas Weißschuh <linux@weissschuh.net> Acked-by: Willy Tarreau <w@1wt.eu> Link: https://patch.msgid.link/20260404-nolibc-libgcc-v1-1-eb3ecfe0e176@weissschuh.net	2026-04-06 19:46:54 +02:00
Thomas Weißschuh	e70a7bb575	selftests/nolibc: test the memory allocator The memory allocator has not seen any testing so far. Add a simple testcase for it. Suggested-by: Willy Tarreau <w@1wt.eu> Link: https://lore.kernel.org/lkml/adDRK8D6YBZgv36H@1wt.eu/ Signed-off-by: Thomas Weißschuh <linux@weissschuh.net> Acked-by: Willy Tarreau <w@1wt.eu> Link: https://patch.msgid.link/20260404-nolibc-asprintf-v2-2-17d2d0df9763@weissschuh.net	2026-04-06 19:46:52 +02:00
Thomas Weißschuh	1e3c374e9f	tools/nolibc: check for overflow in calloc() without divisions On some architectures without native division instructions the division can generate calls into libgcc/compiler-rt. This library might not be available, so its use should be avoided. Use the compiler builtin to check for overflows without needing a division. The builtin has been available since GCC 3 and clang 3.8. Signed-off-by: Thomas Weißschuh <linux@weissschuh.net> Acked-by: Willy Tarreau <w@1wt.eu> Link: https://patch.msgid.link/20260404-nolibc-asprintf-v2-1-17d2d0df9763@weissschuh.net	2026-04-06 19:46:52 +02:00
Thomas Weißschuh	12496aad10	tools/nolibc: add support for asprintf() Add support for dynamically allocating formatted strings through asprintf() and vasprintf(). Signed-off-by: Thomas Weißschuh <linux@weissschuh.net> Acked-by: Willy Tarreau <w@1wt.eu> Link: https://patch.msgid.link/20260401-nolibc-asprintf-v1-3-46292313439f@weissschuh.net	2026-04-06 19:46:51 +02:00
Kaushlendra Kumar	2fd3b83cac	cpupower: remove extern declarations in cmd functions extern char *optarg and extern int optind, opterr, optopt are already declared by <getopt.h>, which is included at the top of the file. Repeating extern declarations inside a function body is misleading and unnecessary. Signed-off-by: Kaushlendra Kumar <kaushlendra.kumar@intel.com> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>	2026-04-06 11:25:32 -06:00
Uday Shankar	320f9b1c6a	selftests: ublk: test that teardown after incomplete recovery completes Before the fix, teardown of a ublk server that was attempting to recover a device, but died when it had submitted a nonempty proper subset of the fetch commands to any queue would loop forever. Add a test to verify that, after the fix, teardown completes. This is done by: - Adding a new argument to the fault_inject target that causes it die after fetching a nonempty proper subset of the IOs to a queue - Using that argument in a new test while trying to recover an already-created device - Attempting to delete the ublk device at the end of the test; this hangs forever if teardown from the fault-injected ublk server never completed. It was manually verified that the test passes with the fix and hangs without it. Signed-off-by: Uday Shankar <ushankar@purestorage.com> Reviewed-by: Ming Lei <ming.lei@redhat.com> Link: https://patch.msgid.link/20260405-cancel-v2-2-02d711e643c2@purestorage.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2026-04-06 08:37:48 -06:00
Namhyung Kim	dc647eb009	perf test: Skip sched stats test for !root Running perf sched stats requires root and it fails to open the schedstat file for regular users. Let's skip the test. $ perf sched stats true Failed to open /proc/sys/kernel/sched_schedstats Reviewed-by: Ian Rogers <irogers@google.com> Tested-by: Swapnil Sapkal <swapnil.sapkal@amd.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2026-04-05 23:27:51 -07:00
Ian Rogers	c9ef786c09	perf cgroup: Update metric leader in evlist__expand_cgroup When the evlist is expanded the metric leader wasn't being updated. As the original evsel is deleted this creates a use-after-free in stat-shadow's prepare_metric. This was detected running the "perf stat --bpf-counters --for-each-cgroup test" with sanitizers. The change itself puts the copied evsel into the priv field (known unused because of evsel__clone use) and then in a second pass over the list updates the copied values using the priv pointer. Fixes: `d1c5a0e86a` ("perf stat: Add --for-each-cgroup option") Signed-off-by: Ian Rogers <irogers@google.com> Acked-by: Sun Jian <sun.jian.kdev@gmail.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2026-04-05 23:23:33 -07:00
Ian Rogers	aeae075a03	perf sample: Add evsel to struct perf_sample Add the evsel from evsel__parse_sample into the struct perf_sample. Sometimes we want to alter the evsel associated with a sample, such as with off-cpu bpf-output events. In general the evsel and perf_sample are passed as a pair, but this makes an altered evsel something of a chore to keep checking for and setting up. Later patches will remove passing an evsel with the perf_sample and switch to just using the perf_sample's value. Signed-off-by: Ian Rogers <irogers@google.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2026-04-05 23:12:15 -07:00
Ian Rogers	ad5ceacd48	perf sample: Make sure perf_sample__init/exit are used The deferred stack trace code wasn't using perf_sample__init/exit. Add the deferred stack trace clean up to perf_sample__exit which requires proper NULL initialization in perf_sample__init. Make the perf_sample__exit robust to being called more than once by using zfree. Make the error paths in evsel__parse_sample exit the sample. Add a merged_callchain boolean to capture that callchain is allocated, deferred_callchain doen't suffice for this. Pack the struct variables to avoid padding bytes for this. Similiarly powerpc_vpadtl_sample wasn't using perf_sample__init/exit, use it for consistency and potential issues with uninitialized variables. Similarly guest_session__inject_events in builtin-inject wasn't using perf_sample_init/exit. The lifetime management for fetched events is somewhat complex there, but when an event is fetched the sample should be initialized and needs exiting on error. The sample may be left in place so that future injects have access to it. Signed-off-by: Ian Rogers <irogers@google.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2026-04-05 23:12:15 -07:00
Ian Rogers	8a7a23b27d	perf sample: Document struct perf_sample Add kernel-doc for struct perf_sample capturing the somewhat unusual population of fields and lifetime relationships. Signed-off-by: Ian Rogers <irogers@google.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2026-04-05 23:12:13 -07:00
Ricky Ringler	c66cf8c593	perf tools: Save cln_size header Store cacheline size during perf record in header, so that cacheline size can be used for other features, like sort keys for perf report. Testing example with feat enabled: $ perf record ./Example $ perf report --header-only \| grep -C 3 cacheline CPU_DOMAIN_INFO info available, use -I to display e_machine : 62 e_flags : 0 cacheline size: 64 missing features: TRACING_DATA BUILD_ID BRANCH_STACK GROUP_DESC AUXTRACE \ STAT CLOCKID DIR_FORMAT COMPRESSED CLOCK_DATA ======== [namhyung: Update the commit message and remove blank lines] Signed-off-by: Ricky Ringler <ricky.ringler@proton.me> Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2026-04-05 22:30:52 -07:00
Ian Rogers	f1d78f5c9b	perf tests sched stats: Write output to temp file Writing to the perf.data file can fail in various contexts such as continual test. Other tests write to a mktemp-ed file, make the "perf sched stats tests" follow this convention. Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Swapnil Sapkal <swapnil.sapkal@amd.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2026-04-05 22:21:42 -07:00
Namhyung Kim	7f5b8d5e6d	perf sched: Avoid crash for unexpected perf sched stats report Doing a `perf sched record` then `perf sched stats report` crashes as the tp_handler isn't set. Add a dummy tp_handler for it rather than adding an extra check. Reported-by: Ian Rogers <irogers@google.com> Reviewed-by: Ian Rogers <irogers@google.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2026-04-05 22:21:42 -07:00
Alexis Lothoré (eBPF Foundation)	f254fb58dd	selftests/bpf: remove unused toggle in tc_tunnel tc_tunnel test is based on a send_and_test_data function which takes a subtest configuration, and a boolean indicating whether the connection is supposed to fail or not. This boolean is systematically passed to true, and is a remnant from the first (not integrated) attempts to convert tc_tunnel to test_progs: those versions validated for example that a connection properly fails when only one side of the connection has tunneling enabled. This specific testing has not been integrated because it involved large timeouts which increased quite a lot the test duration, for little added value. Remove the unused boolean from send_and_test_data to simplify the generic part of subtests. Signed-off-by: Alexis Lothoré (eBPF Foundation) <alexis.lothore@bootlin.com> Acked-by: Paul Chaignon <paul.chaignon@gmail.com> Link: https://lore.kernel.org/r/20260403-tc_tunnel_cleanup-v1-1-4f1bb113d3ab@bootlin.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-04-05 18:45:48 -07:00
Weiming Shi	262b857da6	selftests/bpf: add get_next_key boundary test for cgroup_storage Verify that bpf_map__get_next_key() correctly returns -ENOENT when called on the last (and only) key in a cgroup_storage map. Before the fix in the previous patch, this would succeed with bogus key data instead of failing. Suggested-by: Paul Chaignon <paul.chaignon@gmail.com> Signed-off-by: Weiming Shi <bestswngs@gmail.com> Acked-by: Paul Chaignon <paul.chaignon@gmail.com> Link: https://lore.kernel.org/r/20260403132951.43533-3-bestswngs@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-04-05 18:45:05 -07:00
Mykyta Yatsenko	f64eb44ce9	selftests/bpf: Add torn write detection test for htab BPF_F_LOCK Add a consistency subtest to htab_reuse that detects torn writes caused by the BPF_F_LOCK lockless update racing with element reallocation in alloc_htab_elem(). The test uses three thread roles started simultaneously via a pipe: - locked updaters: BPF_F_LOCK\|BPF_EXIST in-place updates - delete+update workers: delete then BPF_ANY\|BPF_F_LOCK insert - locked readers: BPF_F_LOCK lookup checking value consistency Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com> Link: https://lore.kernel.org/r/20260401-bpf_map_torn_writes-v1-2-782d071c55e7@meta.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-04-05 18:37:32 -07:00
Linus Torvalds	85fb6da43a	RISC-V updates for v7.0-rc7 - Fix a CONFIG_SPARSEMEM crash on RV32 by avoiding early phys_to_page() - Prevent runtime const infrastructure from being used by modules, similar to what was done for x86 - Avoid problems when shutting down ACPI systems with IOMMUs by adding a device dependency between IOMMU and devices that use it - Fix a bug where the CPU pointer masking state isn't properly reset when tagged addresses aren't enabled for a task - Fix some incorrect register assignments, and add some missing ones, in kgdb support code - Fix compilation of non-kernel code that uses the ptrace uapi header by replacing BIT() with _BITUL() - Fix compilation of the validate_v_ptrace kselftest by working around kselftest macro expansion issues -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEElRDoIDdEz9/svf2Kx4+xDQu9KksFAmnSgysACgkQx4+xDQu9 KksznQ//UKuNcpTgGoTOSAi9m5XrLNG7B0Z2Es5n3IuuFLeX4uFwD8pJjUouAqja Y89HKHcbuawAZLxoEj5QImbFxyM6zgdA24R2kM76+Ds5nMM4hetL1hR1Gphs1ghs Vg/klLkSQ/QkV8xTZlWe9A3s96PeiYKgwQUYdENjL/OXWjTbi4Ho/EQYjsXWGyuc sGkWVbGeqPhNlv8bMcA11kM8rCsvyhFnAC5yIbmybmup6ObzS1tEnOXodp1jVDlZ TPzi7SyjSLiTbsaJGZ1O5oFXSrr8zBLFt2RinR7rUt/8Aq8c5xSSvK9n808jytNP ubIgqWjW3wGjzbZfQw4WhOIihtAsp2VssWZlt1p0Q7EGOx0g+/zMA6Uq1VVIuEML +Xm6BwxLFm43NDSa7HPtytCoN/qqIQmiRkiLAG7WHL3mSkYDXYjTXZxTmp0awJ8R WTlZsQFQlnNd8VydP++cwqi/lCPPqWqZbc8ys0lLt57+oe6eE91W3a4jXnIn/5YR dtHLdmHF6xG3pVdilEfFgH7CkA1DMlFox5qQRFx4lLWBY7tTEY1S2o1tmIG1zqKd QTcaO1VbuobTLAy06kD8XNUNh8jzW0zedk37BcxA+J+1B59c0N9J7rW8rkRYu4Le eeIy9p8kPWUB/JfcMY+6jKUjZgQL9un8M4PpVZ/uWJDxQVDJcRs= =d0PH -----END PGP SIGNATURE----- Merge tag 'riscv-for-linus-7.0-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux Pull RISC-V fixes from Paul Walmsley: - Fix a CONFIG_SPARSEMEM crash on RV32 by avoiding early phys_to_page() - Prevent runtime const infrastructure from being used by modules, similar to what was done for x86 - Avoid problems when shutting down ACPI systems with IOMMUs by adding a device dependency between IOMMU and devices that use it - Fix a bug where the CPU pointer masking state isn't properly reset when tagged addresses aren't enabled for a task - Fix some incorrect register assignments, and add some missing ones, in kgdb support code - Fix compilation of non-kernel code that uses the ptrace uapi header by replacing BIT() with _BITUL() - Fix compilation of the validate_v_ptrace kselftest by working around kselftest macro expansion issues * tag 'riscv-for-linus-7.0-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: ACPI: RIMT: Add dependency between iommu and devices selftests: riscv: Add braces around EXPECT_EQ() riscv: use _BITUL macro rather than BIT() in ptrace uapi and kselftests riscv: Reset pmm when PR_TAGGED_ADDR_ENABLE is not set riscv: make runtime const not usable by modules riscv: patch: Avoid early phys_to_page() riscv: kgdb: fix several debug register assignment bugs	2026-04-05 14:43:47 -07:00
Lorenzo Stoakes (Oracle)	62c65fd740	mm: add mmap_action_map_kernel_pages[_full]() A user can invoke mmap_action_map_kernel_pages() to specify that the mapping should map kernel pages starting from desc->start of a specified number of pages specified in an array. In order to implement this, adjust mmap_action_prepare() to be able to return an error code, as it makes sense to assert that the specified parameters are valid as quickly as possible as well as updating the VMA flags to include VMA_MIXEDMAP_BIT as necessary. This provides an mmap_prepare equivalent of vm_insert_pages(). We additionally update the existing vm_insert_pages() code to use range_in_vma() and add a new range_in_vma_desc() helper function for the mmap_prepare case, sharing the code between the two in range_is_subset(). We add both mmap_action_map_kernel_pages() and mmap_action_map_kernel_pages_full() to allow for both partial and full VMA mappings. We update the documentation to reflect the new features. Finally, we update the VMA tests accordingly to reflect the changes. Link: https://lkml.kernel.org/r/926ac961690d856e67ec847bee2370ab3c6b9046.1774045440.git.ljs@kernel.org Signed-off-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org> Reviewed-by: Suren Baghdasaryan <surenb@google.com> Acked-by: Vlastimil Babka (SUSE) <vbabka@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Torgue <alexandre.torgue@foss.st.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Bodo Stroesser <bostroesser@gmail.com> Cc: Christian Brauner <brauner@kernel.org> Cc: Clemens Ladisch <clemens@ladisch.de> Cc: David Hildenbrand <david@kernel.org> Cc: David Howells <dhowells@redhat.com> Cc: Dexuan Cui <decui@microsoft.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Haiyang Zhang <haiyangz@microsoft.com> Cc: Jan Kara <jack@suse.cz> Cc: Jann Horn <jannh@google.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: K. Y. Srinivasan <kys@microsoft.com> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Long Li <longli@microsoft.com> Cc: Marc Dionne <marc.dionne@auristor.com> Cc: "Martin K. Petersen" <martin.petersen@oracle.com> Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Miquel Raynal <miquel.raynal@bootlin.com> Cc: Pedro Falcato <pfalcato@suse.de> Cc: Richard Weinberger <richard@nod.at> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: Vignesh Raghavendra <vigneshr@ti.com> Cc: Wei Liu <wei.liu@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2026-04-05 13:53:45 -07:00
Lorenzo Stoakes (Oracle)	668937b7b2	mm: allow handling of stacked mmap_prepare hooks in more drivers While the conversion of mmap hooks to mmap_prepare is underway, we will encounter situations where mmap hooks need to invoke nested mmap_prepare hooks. The nesting of mmap hooks is termed 'stacking'. In order to flexibly facilitate the conversion of custom mmap hooks in drivers which stack, we must split up the existing __compat_vma_mmap() function into two separate functions: * compat_set_desc_from_vma() - This allows the setting of a vm_area_desc object's fields to the relevant fields of a VMA. * __compat_vma_mmap() - Once an mmap_prepare hook has been executed upon a vm_area_desc object, this function performs any mmap actions specified by the mmap_prepare hook and then invokes its vm_ops->mapped() hook if any were specified. In ordinary cases, where a file's f_op->mmap_prepare() hook simply needs to be invoked in a stacked mmap() hook, compat_vma_mmap() can be used. However some drivers define their own nested hooks, which are invoked in turn by another hook. A concrete example is vmbus_channel->mmap_ring_buffer(), which is invoked in turn by bin_attribute->mmap(): vmbus_channel->mmap_ring_buffer() has a signature of: int (mmap_ring_buffer)(struct vmbus_channel channel, struct vm_area_struct vma); And bin_attribute->mmap() has a signature of: int (mmap)(struct file , struct kobject , const struct bin_attribute attr, struct vm_area_struct vma); And so compat_vma_mmap() cannot be used here for incremental conversion of hooks from mmap() to mmap_prepare(). There are many such instances like this, where conversion to mmap_prepare would otherwise cascade to a huge change set due to nesting of this kind. The changes in this patch mean we could now instead convert vmbus_channel->mmap_ring_buffer() to vmbus_channel->mmap_prepare_ring_buffer(), and implement something like: struct vm_area_desc desc; int err; compat_set_desc_from_vma(&desc, file, vma); err = channel->mmap_prepare_ring_buffer(channel, &desc); if (err) return err; return __compat_vma_mmap(&desc, vma); Allowing us to incrementally update this logic, and other logic like it. Unfortunately, as part of this change, we need to be able to flexibly assign to the VMA descriptor, so have to remove some of the const declarations within the structure. Also update the VMA tests to reflect the changes. Link: https://lkml.kernel.org/r/24aac3019dd34740e788d169fccbe3c62781e648.1774045440.git.ljs@kernel.org Signed-off-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org> Acked-by: Vlastimil Babka (SUSE) <vbabka@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Torgue <alexandre.torgue@foss.st.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Bodo Stroesser <bostroesser@gmail.com> Cc: Christian Brauner <brauner@kernel.org> Cc: Clemens Ladisch <clemens@ladisch.de> Cc: David Hildenbrand <david@kernel.org> Cc: David Howells <dhowells@redhat.com> Cc: Dexuan Cui <decui@microsoft.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Haiyang Zhang <haiyangz@microsoft.com> Cc: Jan Kara <jack@suse.cz> Cc: Jann Horn <jannh@google.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: K. Y. Srinivasan <kys@microsoft.com> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Long Li <longli@microsoft.com> Cc: Marc Dionne <marc.dionne@auristor.com> Cc: "Martin K. Petersen" <martin.petersen@oracle.com> Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Miquel Raynal <miquel.raynal@bootlin.com> Cc: Pedro Falcato <pfalcato@suse.de> Cc: Richard Weinberger <richard@nod.at> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Vignesh Raghavendra <vigneshr@ti.com> Cc: Wei Liu <wei.liu@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2026-04-05 13:53:44 -07:00
Lorenzo Stoakes (Oracle)	a1b7fb40cb	mm: add mmap_action_simple_ioremap() Currently drivers use vm_iomap_memory() as a simple helper function for I/O remapping memory over a range starting at a specified physical address over a specified length. In order to utilise this from mmap_prepare, separate out the core logic into __simple_ioremap_prep(), update vm_iomap_memory() to use it, and add simple_ioremap_prepare() to do the same with a VMA descriptor object. We also add MMAP_SIMPLE_IO_REMAP and relevant fields to the struct mmap_action type to permit this operation also. We use mmap_action_ioremap() to set up the actual I/O remap operation once we have checked and figured out the parameters, which makes simple_ioremap_prepare() easy to implement. We then add mmap_action_simple_ioremap() to allow drivers to make use of this mode. We update the mmap_prepare documentation to describe this mode. Finally, we update the VMA tests to reflect this change. Link: https://lkml.kernel.org/r/a08ef1c4542202684da63bb37f459d5dbbeddd91.1774045440.git.ljs@kernel.org Signed-off-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org> Reviewed-by: Suren Baghdasaryan <surenb@google.com> Acked-by: Vlastimil Babka (SUSE) <vbabka@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Torgue <alexandre.torgue@foss.st.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Bodo Stroesser <bostroesser@gmail.com> Cc: Christian Brauner <brauner@kernel.org> Cc: Clemens Ladisch <clemens@ladisch.de> Cc: David Hildenbrand <david@kernel.org> Cc: David Howells <dhowells@redhat.com> Cc: Dexuan Cui <decui@microsoft.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Haiyang Zhang <haiyangz@microsoft.com> Cc: Jan Kara <jack@suse.cz> Cc: Jann Horn <jannh@google.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: K. Y. Srinivasan <kys@microsoft.com> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Long Li <longli@microsoft.com> Cc: Marc Dionne <marc.dionne@auristor.com> Cc: "Martin K. Petersen" <martin.petersen@oracle.com> Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Miquel Raynal <miquel.raynal@bootlin.com> Cc: Pedro Falcato <pfalcato@suse.de> Cc: Richard Weinberger <richard@nod.at> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: Vignesh Raghavendra <vigneshr@ti.com> Cc: Wei Liu <wei.liu@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2026-04-05 13:53:43 -07:00
Lorenzo Stoakes (Oracle)	c50ca15dd4	mm: add vm_ops->mapped hook Previously, when a driver needed to do something like establish a reference count, it could do so in the mmap hook in the knowledge that the mapping would succeed. With the introduction of f_op->mmap_prepare this is no longer the case, as it is invoked prior to actually establishing the mapping. mmap_prepare is not appropriate for this kind of thing as it is called before any merge might take place, and after which an error might occur meaning resources could be leaked. To take this into account, introduce a new vm_ops->mapped callback which is invoked when the VMA is first mapped (though notably - not when it is merged - which is correct and mirrors existing mmap/open/close behaviour). We do better that vm_ops->open() here, as this callback can return an error, at which point the VMA will be unmapped. Note that vm_ops->mapped() is invoked after any mmap action is complete (such as I/O remapping). We intentionally do not expose the VMA at this point, exposing only the fields that could be used, and an output parameter in case the operation needs to update the vma->vm_private_data field. In order to deal with stacked filesystems which invoke inner filesystem's mmap() invocations, add __compat_vma_mapped() and invoke it on vfs_mmap() (via compat_vma_mmap()) to ensure that the mapped callback is handled when an mmap() caller invokes a nested filesystem's mmap_prepare() callback. Update the mmap_prepare documentation to describe the mapped hook and make it clear what its intended use is. The vm_ops->mapped() call is handled by the mmap complete logic to ensure the same code paths are handled by both the compatibility and VMA layers. Additionally, update VMA userland test headers to reflect the change. Link: https://lkml.kernel.org/r/4c5e98297eb0aae9565c564e1c296a112702f144.1774045440.git.ljs@kernel.org Signed-off-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org> Acked-by: Vlastimil Babka (SUSE) <vbabka@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Torgue <alexandre.torgue@foss.st.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Bodo Stroesser <bostroesser@gmail.com> Cc: Christian Brauner <brauner@kernel.org> Cc: Clemens Ladisch <clemens@ladisch.de> Cc: David Hildenbrand <david@kernel.org> Cc: David Howells <dhowells@redhat.com> Cc: Dexuan Cui <decui@microsoft.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Haiyang Zhang <haiyangz@microsoft.com> Cc: Jan Kara <jack@suse.cz> Cc: Jann Horn <jannh@google.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: K. Y. Srinivasan <kys@microsoft.com> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Long Li <longli@microsoft.com> Cc: Marc Dionne <marc.dionne@auristor.com> Cc: "Martin K. Petersen" <martin.petersen@oracle.com> Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Miquel Raynal <miquel.raynal@bootlin.com> Cc: Pedro Falcato <pfalcato@suse.de> Cc: Richard Weinberger <richard@nod.at> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Vignesh Raghavendra <vigneshr@ti.com> Cc: Wei Liu <wei.liu@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2026-04-05 13:53:43 -07:00
Lorenzo Stoakes (Oracle)	382c0f2895	mm: have mmap_action_complete() handle the rmap lock and unmap Rather than have the callers handle this both the rmap lock release and unmapping the VMA on error, handle it within the mmap_action_complete() logic where it makes sense to, being careful not to unlock twice. This simplifies the logic and makes it harder to make mistake with this, while retaining correct behaviour with regard to avoiding deadlocks. Also replace the call_action_complete() function with a direct invocation of mmap_action_complete() as the abstraction is no longer required. Also update the VMA tests to reflect this change. Link: https://lkml.kernel.org/r/8d1ee8ebd3542d006a47e8382fb80cf5b57ecf10.1774045440.git.ljs@kernel.org Signed-off-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org> Acked-by: Vlastimil Babka (SUSE) <vbabka@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Torgue <alexandre.torgue@foss.st.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Bodo Stroesser <bostroesser@gmail.com> Cc: Christian Brauner <brauner@kernel.org> Cc: Clemens Ladisch <clemens@ladisch.de> Cc: David Hildenbrand <david@kernel.org> Cc: David Howells <dhowells@redhat.com> Cc: Dexuan Cui <decui@microsoft.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Haiyang Zhang <haiyangz@microsoft.com> Cc: Jan Kara <jack@suse.cz> Cc: Jann Horn <jannh@google.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: K. Y. Srinivasan <kys@microsoft.com> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Long Li <longli@microsoft.com> Cc: Marc Dionne <marc.dionne@auristor.com> Cc: "Martin K. Petersen" <martin.petersen@oracle.com> Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Miquel Raynal <miquel.raynal@bootlin.com> Cc: Pedro Falcato <pfalcato@suse.de> Cc: Richard Weinberger <richard@nod.at> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Vignesh Raghavendra <vigneshr@ti.com> Cc: Wei Liu <wei.liu@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2026-04-05 13:53:42 -07:00
Lorenzo Stoakes (Oracle)	33506d4bae	mm: switch the rmap lock held option off in compat layer In the mmap_prepare compatibility layer, we don't need to hold the rmap lock, as we are being called from an .mmap handler. The .mmap_prepare hook, when invoked in the VMA logic, is called prior to the VMA being instantiated, but the completion hook is called after the VMA is linked into the maple tree, meaning rmap walkers can reach it. The mmap hook does not link the VMA into the tree, so this cannot happen. Therefore it's safe to simply disable this in the mmap_prepare compatibility layer. Also update VMA tests code to reflect current compatibility layer state. [akpm@linux-foundation.org: fix comment typo, per Vlastimil] Link: https://lkml.kernel.org/r/dda74230d26a1fcd79a3efab61fa4101dd1cac64.1774045440.git.ljs@kernel.org Signed-off-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org> Acked-by: Vlastimil Babka (SUSE) <vbabka@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Torgue <alexandre.torgue@foss.st.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Bodo Stroesser <bostroesser@gmail.com> Cc: Christian Brauner <brauner@kernel.org> Cc: Clemens Ladisch <clemens@ladisch.de> Cc: David Hildenbrand <david@kernel.org> Cc: David Howells <dhowells@redhat.com> Cc: Dexuan Cui <decui@microsoft.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Haiyang Zhang <haiyangz@microsoft.com> Cc: Jan Kara <jack@suse.cz> Cc: Jann Horn <jannh@google.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: K. Y. Srinivasan <kys@microsoft.com> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Long Li <longli@microsoft.com> Cc: Marc Dionne <marc.dionne@auristor.com> Cc: "Martin K. Petersen" <martin.petersen@oracle.com> Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Miquel Raynal <miquel.raynal@bootlin.com> Cc: Pedro Falcato <pfalcato@suse.de> Cc: Richard Weinberger <richard@nod.at> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Vignesh Raghavendra <vigneshr@ti.com> Cc: Wei Liu <wei.liu@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2026-04-05 13:53:42 -07:00
Lorenzo Stoakes (Oracle)	827e97cf4b	mm: document vm_operations_struct->open the same as close() Describe when the operation is invoked and the context in which it is invoked, matching the description already added for vm_op->close(). While we're here, update all outdated references to an 'area' field for VMAs to the more consistent 'vma'. Link: https://lkml.kernel.org/r/7d0ca833c12014320f0fa00f816f95e6e10076f2.1774045440.git.ljs@kernel.org Signed-off-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org> Acked-by: Vlastimil Babka (SUSE) <vbabka@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Torgue <alexandre.torgue@foss.st.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Bodo Stroesser <bostroesser@gmail.com> Cc: Christian Brauner <brauner@kernel.org> Cc: Clemens Ladisch <clemens@ladisch.de> Cc: David Hildenbrand <david@kernel.org> Cc: David Howells <dhowells@redhat.com> Cc: Dexuan Cui <decui@microsoft.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Haiyang Zhang <haiyangz@microsoft.com> Cc: Jan Kara <jack@suse.cz> Cc: Jann Horn <jannh@google.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: K. Y. Srinivasan <kys@microsoft.com> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Long Li <longli@microsoft.com> Cc: Marc Dionne <marc.dionne@auristor.com> Cc: "Martin K. Petersen" <martin.petersen@oracle.com> Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Miquel Raynal <miquel.raynal@bootlin.com> Cc: Pedro Falcato <pfalcato@suse.de> Cc: Richard Weinberger <richard@nod.at> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Vignesh Raghavendra <vigneshr@ti.com> Cc: Wei Liu <wei.liu@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2026-04-05 13:53:42 -07:00
Lorenzo Stoakes (Oracle)	3e4bb27068	mm: various small mmap_prepare cleanups Patch series "mm: expand mmap_prepare functionality and usage", v4. This series expands the mmap_prepare functionality, which is intended to replace the deprecated f_op->mmap hook which has been the source of bugs and security issues for some time. This series starts with some cleanup of existing mmap_prepare logic, then adds documentation for the mmap_prepare call to make it easier for filesystem and driver writers to understand how it works. It then importantly adds a vm_ops->mapped hook, a key feature that was missing from mmap_prepare previously - this is invoked when a driver which specifies mmap_prepare has successfully been mapped but not merged with another VMA. mmap_prepare is invoked prior to a merge being attempted, so you cannot manipulate state such as reference counts as if it were a new mapping. The vm_ops->mapped hook allows a driver to perform tasks required at this stage, and provides symmetry against subsequent vm_ops->open,close calls. The series uses this to correct the afs implementation which wrongly manipulated reference count at mmap_prepare time. It then adds an mmap_prepare equivalent of vm_iomap_memory() - mmap_action_simple_ioremap(), then uses this to update a number of drivers. It then splits out the mmap_prepare compatibility layer (which allows for invocation of mmap_prepare hooks in an mmap() hook) in such a way as to allow for more incremental implementation of mmap_prepare hooks. It then uses this to extend mmap_prepare usage in drivers. Finally it adds an mmap_prepare equivalent of vm_map_pages(), which lays the foundation for future work which will extend mmap_prepare to DMA coherent mappings. This patch (of 21): Rather than passing arbitrary fields, pass a vm_area_desc pointer to mmap prepare functions to mmap prepare, and an action and vma pointer to mmap complete in order to put all the action-specific logic in the function actually doing the work. Additionally, allow mmap prepare functions to return an error so we can error out as soon as possible if there is something logically incorrect in the input. Update remap_pfn_range_prepare() to properly check the input range for the CoW case. Also remove io_remap_pfn_range_complete(), as we can simply set up the fields correctly in io_remap_pfn_range_prepare() and use remap_pfn_range_complete() for this. While we're here, make remap_pfn_range_prepare_vma() a little neater, and pass mmap_action directly to call_action_complete(). Then, update compat_vma_mmap() to perform its logic directly, as __compat_vma_map() is not used by anything so we don't need to export it. Also update compat_vma_mmap() to use vfs_mmap_prepare() rather than calling the mmap_prepare op directly. Finally, update the VMA userland tests to reflect the changes. Link: https://lkml.kernel.org/r/cover.1774045440.git.ljs@kernel.org Link: https://lkml.kernel.org/r/99f408e4694f44ab12bdc55fe0bd9685d3bd1117.1774045440.git.ljs@kernel.org Signed-off-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org> Acked-by: Vlastimil Babka (SUSE) <vbabka@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Torgue <alexandre.torgue@foss.st.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Bodo Stroesser <bostroesser@gmail.com> Cc: Christian Brauner <brauner@kernel.org> Cc: Clemens Ladisch <clemens@ladisch.de> Cc: David Hildenbrand <david@kernel.org> Cc: David Howells <dhowells@redhat.com> Cc: Dexuan Cui <decui@microsoft.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Haiyang Zhang <haiyangz@microsoft.com> Cc: Jan Kara <jack@suse.cz> Cc: Jann Horn <jannh@google.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: K. Y. Srinivasan <kys@microsoft.com> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Long Li <longli@microsoft.com> Cc: Marc Dionne <marc.dionne@auristor.com> Cc: "Martin K. Petersen" <martin.petersen@oracle.com> Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Miquel Raynal <miquel.raynal@bootlin.com> Cc: Pedro Falcato <pfalcato@suse.de> Cc: Richard Weinberger <richard@nod.at> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Vignesh Raghavendra <vigneshr@ti.com> Cc: Wei Liu <wei.liu@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2026-04-05 13:53:41 -07:00
Lorenzo Stoakes (Oracle)	90cb921c4d	mm/vma: convert __mmap_region() to use vma_flags_t Update the mmap() implementation logic implemented in __mmap_region() and functions invoked by it. The mmap_region() function converts its input vm_flags_t parameter to a vma_flags_t value which it then passes to __mmap_region() which uses the vma_flags_t value consistently from then on. As part of the change, we convert map_deny_write_exec() to using vma_flags_t (it was incorrectly using unsigned long before), and place it in vma.h, as it is only used internal to mm. With this change, we eliminate the legacy is_shared_maywrite_vm_flags() helper function which is now no longer required. We are also able to update the MMAP_STATE() and VMG_MMAP_STATE() macros to use the vma_flags_t value. Finally, we update the VMA tests to reflect the change. Link: https://lkml.kernel.org/r/1fc33a404c962f02da778da100387cc19bd62153.1774034900.git.ljs@kernel.org Signed-off-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org> Acked-by: Vlastimil Babka (SUSE) <vbabka@kernel.org> Cc: Albert Ou <aou@eecs.berkeley.edu> Cc: Alexander Gordeev <agordeev@linux.ibm.com> Cc: Alexandre Ghiti <alex@ghiti.fr> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Anton Ivanov <anton.ivanov@cambridgegreys.com> Cc: "Borislav Petkov (AMD)" <bp@alien8.de> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Chengming Zhou <chengming.zhou@linux.dev> Cc: Christian Borntraeger <borntraeger@linux.ibm.com> Cc: Christian Brauner <brauner@kernel.org> Cc: David Hildenbrand <david@kernel.org> Cc: Dinh Nguyen <dinguyen@kernel.org> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Huacai Chen <chenhuacai@kernel.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jan Kara <jack@suse.cz> Cc: Jann Horn <jannh@google.com> Cc: Johannes Berg <johannes@sipsolutions.net> Cc: Kees Cook <kees@kernel.org> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Madhavan Srinivasan <maddy@linux.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michal Hocko <mhocko@suse.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Ondrej Mosnacek <omosnace@redhat.com> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Paul Moore <paul@paul-moore.com> Cc: Pedro Falcato <pfalcato@suse.de> Cc: Richard Weinberger <richard@nod.at> Cc: Russell King <linux@armlinux.org.uk> Cc: Stephen Smalley <stephen.smalley.work@gmail.com> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Sven Schnelle <svens@linux.ibm.com> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Vineet Gupta <vgupta@kernel.org> Cc: WANG Xuerui <kernel@xen0n.name> Cc: Will Deacon <will@kernel.org> Cc: xu xin <xu.xin16@zte.com.cn> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2026-04-05 13:53:41 -07:00
Lorenzo Stoakes (Oracle)	a06eb2f827	mm/vma: convert vma_modify_flags[_uffd]() to use vma_flags_t Update the vma_modify_flags() and vma_modify_flags_uffd() functions to accept a vma_flags_t parameter rather than a vm_flags_t one, and propagate the changes as needed to implement this change. Also add vma_flags_reset_once() in replacement of vm_flags_reset_once(). We still need to be careful here because we need to avoid tearing, so maintain the assumption that the first system word set of flags are the only ones that require protection from tearing, and retain this functionality. We can copy the remainder of VMA flags above 64 bits normally. But hopefully by the time that happens, we will have replaced the logic that requires these WRITE_ONCE()'s with something else. We also replace instances of vm_flags_reset() with a simple write of VMA flags. We are no longer perform a number of checks, most notable of all the VMA flags asserts becase: 1. We might be operating on a VMA that is not yet added to the tree. 2. We might be operating on a VMA that is now detached. 3. Really in all but core code, you should be using vma_desc_xxx(). 4. Other VMA fields are manipulated with no such checks. 5. It'd be egregious to have to add variants of flag functions just to account for cases such as the above, especially when we don't do so for other VMA fields. Drivers are the problematic cases and why it was especially important (and also for debug as VMA locks were introduced), the mmap_prepare work is solving this generally. Additionally, we can fairly safely assume by this point the soft dirty flags are being set correctly, so it's reasonable to drop this also. Finally, update the VMA tests to reflect this. Link: https://lkml.kernel.org/r/51afbb2b8c3681003cc7926647e37335d793836e.1774034900.git.ljs@kernel.org Signed-off-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org> Acked-by: Vlastimil Babka (SUSE) <vbabka@kernel.org> Cc: Albert Ou <aou@eecs.berkeley.edu> Cc: Alexander Gordeev <agordeev@linux.ibm.com> Cc: Alexandre Ghiti <alex@ghiti.fr> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Anton Ivanov <anton.ivanov@cambridgegreys.com> Cc: "Borislav Petkov (AMD)" <bp@alien8.de> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Chengming Zhou <chengming.zhou@linux.dev> Cc: Christian Borntraeger <borntraeger@linux.ibm.com> Cc: Christian Brauner <brauner@kernel.org> Cc: David Hildenbrand <david@kernel.org> Cc: Dinh Nguyen <dinguyen@kernel.org> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Huacai Chen <chenhuacai@kernel.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jan Kara <jack@suse.cz> Cc: Jann Horn <jannh@google.com> Cc: Johannes Berg <johannes@sipsolutions.net> Cc: Kees Cook <kees@kernel.org> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Madhavan Srinivasan <maddy@linux.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michal Hocko <mhocko@suse.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Ondrej Mosnacek <omosnace@redhat.com> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Paul Moore <paul@paul-moore.com> Cc: Pedro Falcato <pfalcato@suse.de> Cc: Richard Weinberger <richard@nod.at> Cc: Russell King <linux@armlinux.org.uk> Cc: Stephen Smalley <stephen.smalley.work@gmail.com> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Sven Schnelle <svens@linux.ibm.com> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Vineet Gupta <vgupta@kernel.org> Cc: WANG Xuerui <kernel@xen0n.name> Cc: Will Deacon <will@kernel.org> Cc: xu xin <xu.xin16@zte.com.cn> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2026-04-05 13:53:41 -07:00
Lorenzo Stoakes (Oracle)	e2963f639f	tools: bitmap: add missing bitmap_copy() implementation I need this for changes I am making to keep the VMA tests running correctly. Link: https://lkml.kernel.org/r/4dcb2fb959137e9fe58a23e21cebcea97de41a1f.1774034900.git.ljs@kernel.org Signed-off-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org> Cc: Albert Ou <aou@eecs.berkeley.edu> Cc: Alexander Gordeev <agordeev@linux.ibm.com> Cc: Alexandre Ghiti <alex@ghiti.fr> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Anton Ivanov <anton.ivanov@cambridgegreys.com> Cc: "Borislav Petkov (AMD)" <bp@alien8.de> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Chengming Zhou <chengming.zhou@linux.dev> Cc: Christian Borntraeger <borntraeger@linux.ibm.com> Cc: Christian Brauner <brauner@kernel.org> Cc: David Hildenbrand <david@kernel.org> Cc: Dinh Nguyen <dinguyen@kernel.org> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Huacai Chen <chenhuacai@kernel.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jan Kara <jack@suse.cz> Cc: Jann Horn <jannh@google.com> Cc: Johannes Berg <johannes@sipsolutions.net> Cc: Kees Cook <kees@kernel.org> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Madhavan Srinivasan <maddy@linux.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michal Hocko <mhocko@suse.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Ondrej Mosnacek <omosnace@redhat.com> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Paul Moore <paul@paul-moore.com> Cc: Pedro Falcato <pfalcato@suse.de> Cc: Richard Weinberger <richard@nod.at> Cc: Russell King <linux@armlinux.org.uk> Cc: Stephen Smalley <stephen.smalley.work@gmail.com> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Sven Schnelle <svens@linux.ibm.com> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Vineet Gupta <vgupta@kernel.org> Cc: Vlastimil Babka (SUSE) <vbabka@kernel.org> Cc: WANG Xuerui <kernel@xen0n.name> Cc: Will Deacon <will@kernel.org> Cc: xu xin <xu.xin16@zte.com.cn> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2026-04-05 13:53:41 -07:00
Lorenzo Stoakes (Oracle)	769669bd9c	mm/vma: convert as much as we can in mm/vma.c to vma_flags_t Now we have established a good foundation for vm_flags_t to vma_flags_t changes, update mm/vma.c to utilise vma_flags_t wherever possible. We are able to convert VM_STARTGAP_FLAGS entirely as this is only used in mm/vma.c, and to account for the fact we can't use VM_NONE to make life easier, place the definition of this within existing #ifdef's to be cleaner. Generally the remaining changes are mechanical. Also update the VMA tests to reflect the changes. Link: https://lkml.kernel.org/r/5fdeaf8af9a12c2a5d68497495f52fa627d05a5b.1774034900.git.ljs@kernel.org Signed-off-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org> Acked-by: Vlastimil Babka (SUSE) <vbabka@kernel.org> Cc: Albert Ou <aou@eecs.berkeley.edu> Cc: Alexander Gordeev <agordeev@linux.ibm.com> Cc: Alexandre Ghiti <alex@ghiti.fr> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Anton Ivanov <anton.ivanov@cambridgegreys.com> Cc: "Borislav Petkov (AMD)" <bp@alien8.de> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Chengming Zhou <chengming.zhou@linux.dev> Cc: Christian Borntraeger <borntraeger@linux.ibm.com> Cc: Christian Brauner <brauner@kernel.org> Cc: David Hildenbrand <david@kernel.org> Cc: Dinh Nguyen <dinguyen@kernel.org> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Huacai Chen <chenhuacai@kernel.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jan Kara <jack@suse.cz> Cc: Jann Horn <jannh@google.com> Cc: Johannes Berg <johannes@sipsolutions.net> Cc: Kees Cook <kees@kernel.org> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Madhavan Srinivasan <maddy@linux.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michal Hocko <mhocko@suse.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Ondrej Mosnacek <omosnace@redhat.com> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Paul Moore <paul@paul-moore.com> Cc: Pedro Falcato <pfalcato@suse.de> Cc: Richard Weinberger <richard@nod.at> Cc: Russell King <linux@armlinux.org.uk> Cc: Stephen Smalley <stephen.smalley.work@gmail.com> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Sven Schnelle <svens@linux.ibm.com> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Vineet Gupta <vgupta@kernel.org> Cc: WANG Xuerui <kernel@xen0n.name> Cc: Will Deacon <will@kernel.org> Cc: xu xin <xu.xin16@zte.com.cn> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2026-04-05 13:53:41 -07:00
Lorenzo Stoakes (Oracle)	a6f14fb593	tools/testing/vma: update VMA tests to test vma_clear_flags[_mask]() The tests have existing flag clearing logic, so simply expand this to use the new VMA-specific flag clearing helpers. Also correct some trivial formatting issue in a macro define. Link: https://lkml.kernel.org/r/f5da681d3c33039dd4a838188385796eb8d58373.1774034900.git.ljs@kernel.org Signed-off-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org> Cc: Albert Ou <aou@eecs.berkeley.edu> Cc: Alexander Gordeev <agordeev@linux.ibm.com> Cc: Alexandre Ghiti <alex@ghiti.fr> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Anton Ivanov <anton.ivanov@cambridgegreys.com> Cc: "Borislav Petkov (AMD)" <bp@alien8.de> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Chengming Zhou <chengming.zhou@linux.dev> Cc: Christian Borntraeger <borntraeger@linux.ibm.com> Cc: Christian Brauner <brauner@kernel.org> Cc: David Hildenbrand <david@kernel.org> Cc: Dinh Nguyen <dinguyen@kernel.org> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Huacai Chen <chenhuacai@kernel.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jan Kara <jack@suse.cz> Cc: Jann Horn <jannh@google.com> Cc: Johannes Berg <johannes@sipsolutions.net> Cc: Kees Cook <kees@kernel.org> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Madhavan Srinivasan <maddy@linux.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michal Hocko <mhocko@suse.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Ondrej Mosnacek <omosnace@redhat.com> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Paul Moore <paul@paul-moore.com> Cc: Pedro Falcato <pfalcato@suse.de> Cc: Richard Weinberger <richard@nod.at> Cc: Russell King <linux@armlinux.org.uk> Cc: Stephen Smalley <stephen.smalley.work@gmail.com> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Sven Schnelle <svens@linux.ibm.com> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Vineet Gupta <vgupta@kernel.org> Cc: Vlastimil Babka (SUSE) <vbabka@kernel.org> Cc: WANG Xuerui <kernel@xen0n.name> Cc: Will Deacon <will@kernel.org> Cc: xu xin <xu.xin16@zte.com.cn> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2026-04-05 13:53:41 -07:00
Lorenzo Stoakes (Oracle)	d720b81d01	mm/vma: introduce vma_clear_flags[_mask]() Introduce a helper function and helper macro to easily clear a VMA's flags using the new vma_flags_t vma->flags field: * vma_clear_flags_mask() - Clears all of the flags in a specified mask in the VMA's flags field. * vma_clear_flags() - Clears all of the specified individual VMA flag bits in a VMA's flags field. Also update the VMA tests to reflect the change. Link: https://lkml.kernel.org/r/9bd15da35c2c90e7441265adf01b5c2d3b5c6d41.1774034900.git.ljs@kernel.org Signed-off-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org> Acked-by: Vlastimil Babka (SUSE) <vbabka@kernel.org> Cc: Albert Ou <aou@eecs.berkeley.edu> Cc: Alexander Gordeev <agordeev@linux.ibm.com> Cc: Alexandre Ghiti <alex@ghiti.fr> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Anton Ivanov <anton.ivanov@cambridgegreys.com> Cc: "Borislav Petkov (AMD)" <bp@alien8.de> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Chengming Zhou <chengming.zhou@linux.dev> Cc: Christian Borntraeger <borntraeger@linux.ibm.com> Cc: Christian Brauner <brauner@kernel.org> Cc: David Hildenbrand <david@kernel.org> Cc: Dinh Nguyen <dinguyen@kernel.org> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Huacai Chen <chenhuacai@kernel.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jan Kara <jack@suse.cz> Cc: Jann Horn <jannh@google.com> Cc: Johannes Berg <johannes@sipsolutions.net> Cc: Kees Cook <kees@kernel.org> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Madhavan Srinivasan <maddy@linux.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michal Hocko <mhocko@suse.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Ondrej Mosnacek <omosnace@redhat.com> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Paul Moore <paul@paul-moore.com> Cc: Pedro Falcato <pfalcato@suse.de> Cc: Richard Weinberger <richard@nod.at> Cc: Russell King <linux@armlinux.org.uk> Cc: Stephen Smalley <stephen.smalley.work@gmail.com> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Sven Schnelle <svens@linux.ibm.com> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Vineet Gupta <vgupta@kernel.org> Cc: WANG Xuerui <kernel@xen0n.name> Cc: Will Deacon <will@kernel.org> Cc: xu xin <xu.xin16@zte.com.cn> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2026-04-05 13:53:40 -07:00
Lorenzo Stoakes (Oracle)	3a6455d56b	mm: convert do_brk_flags() to use vma_flags_t In order to be able to do this, we need to change VM_DATA_DEFAULT_FLAGS and friends and update the architecture-specific definitions also. We then have to update some KSM logic to handle VMA flags, and introduce VMA_STACK_FLAGS to define the vma_flags_t equivalent of VM_STACK_FLAGS. We also introduce two helper functions for use during the time we are converting legacy flags to vma_flags_t values - vma_flags_to_legacy() and legacy_to_vma_flags(). This enables us to iteratively make changes to break these changes up into separate parts. We use these explicitly here to keep VM_STACK_FLAGS around for certain users which need to maintain the legacy vm_flags_t values for the time being. We are no longer able to rely on the simple VM_xxx being set to zero if the feature is not enabled, so in the case of VM_DROPPABLE we introduce VMA_DROPPABLE as the vma_flags_t equivalent, which is set to EMPTY_VMA_FLAGS if the droppable flag is not available. While we're here, we make the description of do_brk_flags() into a kdoc comment, as it almost was already. We use vma_flags_to_legacy() to not need to update the vm_get_page_prot() logic as this time. Note that in create_init_stack_vma() we have to replace the BUILD_BUG_ON() with a VM_WARN_ON_ONCE() as the tested values are no longer build time available. We also update mprotect_fixup() to use VMA flags where possible, though we have to live with a little duplication between vm_flags_t and vma_flags_t values for the time being until further conversions are made. While we're here, update VM_SPECIAL to be defined in terms of VMA_SPECIAL_FLAGS now we have vma_flags_to_legacy(). Finally, we update the VMA tests to reflect these changes. Link: https://lkml.kernel.org/r/d02e3e45d9a33d7904b149f5604904089fd640ae.1774034900.git.ljs@kernel.org Signed-off-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org> Acked-by: Paul Moore <paul@paul-moore.com> [SELinux] Acked-by: Vlastimil Babka (SUSE) <vbabka@kernel.org> Cc: Albert Ou <aou@eecs.berkeley.edu> Cc: Alexander Gordeev <agordeev@linux.ibm.com> Cc: Alexandre Ghiti <alex@ghiti.fr> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Anton Ivanov <anton.ivanov@cambridgegreys.com> Cc: "Borislav Petkov (AMD)" <bp@alien8.de> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Chengming Zhou <chengming.zhou@linux.dev> Cc: Christian Borntraeger <borntraeger@linux.ibm.com> Cc: Christian Brauner <brauner@kernel.org> Cc: David Hildenbrand <david@kernel.org> Cc: Dinh Nguyen <dinguyen@kernel.org> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Huacai Chen <chenhuacai@kernel.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jan Kara <jack@suse.cz> Cc: Jann Horn <jannh@google.com> Cc: Johannes Berg <johannes@sipsolutions.net> Cc: Kees Cook <kees@kernel.org> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Madhavan Srinivasan <maddy@linux.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michal Hocko <mhocko@suse.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Ondrej Mosnacek <omosnace@redhat.com> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Pedro Falcato <pfalcato@suse.de> Cc: Richard Weinberger <richard@nod.at> Cc: Russell King <linux@armlinux.org.uk> Cc: Stephen Smalley <stephen.smalley.work@gmail.com> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Sven Schnelle <svens@linux.ibm.com> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Vineet Gupta <vgupta@kernel.org> Cc: WANG Xuerui <kernel@xen0n.name> Cc: Will Deacon <will@kernel.org> Cc: xu xin <xu.xin16@zte.com.cn> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2026-04-05 13:53:40 -07:00
Lorenzo Stoakes (Oracle)	bbbc17cb02	tools/testing/vma: test vma_flags_count,vma[_flags]_test_single_mask Update the VMA tests to assert that vma_flags_count() behaves as expected, as well as vma_flags_test_single_mask() and vma_test_single_mask(). For the test functions we can simply update the existing vma_test(), et al. test to also test the single_mask variants. We also add some explicit testing of an empty VMA flag to this test to ensure this is handled properly. In order to test vma_flags_count() we simply take an existing set of flags and gradually remove flags ensuring the count remains as expected throughout. We also update the vma[_flags]_test_all() tests to make clear the semantics that we expect vma[_flags]_test_all(..., EMPTY_VMA_FLAGS) to return true, as trivially, all flags of none are always set in VMA flags. Link: https://lkml.kernel.org/r/4af95d559cd2af0ba3388de1e1386b9f94c0e009.1774034900.git.ljs@kernel.org Signed-off-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org> Cc: Albert Ou <aou@eecs.berkeley.edu> Cc: Alexander Gordeev <agordeev@linux.ibm.com> Cc: Alexandre Ghiti <alex@ghiti.fr> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Anton Ivanov <anton.ivanov@cambridgegreys.com> Cc: "Borislav Petkov (AMD)" <bp@alien8.de> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Chengming Zhou <chengming.zhou@linux.dev> Cc: Christian Borntraeger <borntraeger@linux.ibm.com> Cc: Christian Brauner <brauner@kernel.org> Cc: David Hildenbrand <david@kernel.org> Cc: Dinh Nguyen <dinguyen@kernel.org> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Huacai Chen <chenhuacai@kernel.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jan Kara <jack@suse.cz> Cc: Jann Horn <jannh@google.com> Cc: Johannes Berg <johannes@sipsolutions.net> Cc: Kees Cook <kees@kernel.org> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Madhavan Srinivasan <maddy@linux.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michal Hocko <mhocko@suse.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Ondrej Mosnacek <omosnace@redhat.com> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Paul Moore <paul@paul-moore.com> Cc: Pedro Falcato <pfalcato@suse.de> Cc: Richard Weinberger <richard@nod.at> Cc: Russell King <linux@armlinux.org.uk> Cc: Stephen Smalley <stephen.smalley.work@gmail.com> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Sven Schnelle <svens@linux.ibm.com> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Vineet Gupta <vgupta@kernel.org> Cc: Vlastimil Babka (SUSE) <vbabka@kernel.org> Cc: WANG Xuerui <kernel@xen0n.name> Cc: Will Deacon <will@kernel.org> Cc: xu xin <xu.xin16@zte.com.cn> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2026-04-05 13:53:40 -07:00
Lorenzo Stoakes (Oracle)	e79d1c500f	mm: introduce vma_flags_count() and vma[_flags]_test_single_mask() vma_flags_count() determines how many bits are set in VMA flags, using bitmap_weight(). vma_flags_test_single_mask() determines if a vma_flags_t set of flags contains a single flag specified as another vma_flags_t value, or if the sought flag mask is empty, it is defined to return false. This is useful when we want to declare a VMA flag as optionally a single flag in a mask or empty depending on kernel configuration. This allows us to have VM_NONE-like semantics when checking whether the flag is set. In a subsequent patch, we introduce the use of VMA_DROPPABLE of type vma_flags_t using precisely these semantics. It would be actively confusing to use vma_flags_test_any_single_mask() for this (and vma_flags_test_all_mask() is not correct to use here, as it trivially returns true when tested against an empty vma flags mask). We introduce vma_flags_count() to be able to assert that the compared flag mask is singular or empty, checked when CONFIG_DEBUG_VM is enabled. Also update the VMA tests as part of this change. Link: https://lkml.kernel.org/r/cd778dd02b9f2a01eb54d25a49dea8ec2ddf7753.1774034900.git.ljs@kernel.org Signed-off-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org> Acked-by: Vlastimil Babka (SUSE) <vbabka@kernel.org> Cc: Albert Ou <aou@eecs.berkeley.edu> Cc: Alexander Gordeev <agordeev@linux.ibm.com> Cc: Alexandre Ghiti <alex@ghiti.fr> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Anton Ivanov <anton.ivanov@cambridgegreys.com> Cc: "Borislav Petkov (AMD)" <bp@alien8.de> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Chengming Zhou <chengming.zhou@linux.dev> Cc: Christian Borntraeger <borntraeger@linux.ibm.com> Cc: Christian Brauner <brauner@kernel.org> Cc: David Hildenbrand <david@kernel.org> Cc: Dinh Nguyen <dinguyen@kernel.org> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Huacai Chen <chenhuacai@kernel.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jan Kara <jack@suse.cz> Cc: Jann Horn <jannh@google.com> Cc: Johannes Berg <johannes@sipsolutions.net> Cc: Kees Cook <kees@kernel.org> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Madhavan Srinivasan <maddy@linux.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michal Hocko <mhocko@suse.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Ondrej Mosnacek <omosnace@redhat.com> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Paul Moore <paul@paul-moore.com> Cc: Pedro Falcato <pfalcato@suse.de> Cc: Richard Weinberger <richard@nod.at> Cc: Russell King <linux@armlinux.org.uk> Cc: Stephen Smalley <stephen.smalley.work@gmail.com> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Sven Schnelle <svens@linux.ibm.com> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Vineet Gupta <vgupta@kernel.org> Cc: WANG Xuerui <kernel@xen0n.name> Cc: Will Deacon <will@kernel.org> Cc: xu xin <xu.xin16@zte.com.cn> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2026-04-05 13:53:40 -07:00
Lorenzo Stoakes (Oracle)	63cdb667d1	tools/testing/vma: update VMA flag tests to test vma_test[_any_mask]() Update the existing test logic to assert that vma_test(), vma_test_any() and vma_test_any_mask() (implicitly tested via vma_test_any()) are functioning correctly. We already have tests for other variants like this, so it's simply a matter of expanding those tests to also include tests for the VMA-specific helpers. Link: https://lkml.kernel.org/r/dea3e97c6c3dd86f1a3f1a0703241b03f6e3a33f.1774034900.git.ljs@kernel.org Signed-off-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org> Cc: Albert Ou <aou@eecs.berkeley.edu> Cc: Alexander Gordeev <agordeev@linux.ibm.com> Cc: Alexandre Ghiti <alex@ghiti.fr> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Anton Ivanov <anton.ivanov@cambridgegreys.com> Cc: "Borislav Petkov (AMD)" <bp@alien8.de> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Chengming Zhou <chengming.zhou@linux.dev> Cc: Christian Borntraeger <borntraeger@linux.ibm.com> Cc: Christian Brauner <brauner@kernel.org> Cc: David Hildenbrand <david@kernel.org> Cc: Dinh Nguyen <dinguyen@kernel.org> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Huacai Chen <chenhuacai@kernel.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jan Kara <jack@suse.cz> Cc: Jann Horn <jannh@google.com> Cc: Johannes Berg <johannes@sipsolutions.net> Cc: Kees Cook <kees@kernel.org> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Madhavan Srinivasan <maddy@linux.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michal Hocko <mhocko@suse.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Ondrej Mosnacek <omosnace@redhat.com> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Paul Moore <paul@paul-moore.com> Cc: Pedro Falcato <pfalcato@suse.de> Cc: Richard Weinberger <richard@nod.at> Cc: Russell King <linux@armlinux.org.uk> Cc: Stephen Smalley <stephen.smalley.work@gmail.com> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Sven Schnelle <svens@linux.ibm.com> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Vineet Gupta <vgupta@kernel.org> Cc: Vlastimil Babka (SUSE) <vbabka@kernel.org> Cc: WANG Xuerui <kernel@xen0n.name> Cc: Will Deacon <will@kernel.org> Cc: xu xin <xu.xin16@zte.com.cn> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2026-04-05 13:53:40 -07:00
Lorenzo Stoakes (Oracle)	fb67bba5d9	mm/vma: introduce vma_test[_any[_mask]](), and make inlining consistent Introduce helper functions and macros to make it convenient to test flags and flag masks for VMAs, specifically: * vma_test() - determine if a single VMA flag is set in a VMA. * vma_test_any_mask() - determine if any flags in a vma_flags_t value are set in a VMA. * vma_test_any() - Helper macro to test if any of specific flags are set. Also, there are a mix of 'inline's and '__always_inline's in VMA helper function declarations, update to consistently use __always_inline. Finally, update the VMA tests to reflect the changes. Link: https://lkml.kernel.org/r/be1d71f08307d747a82232cbd8664a88c0f41419.1774034900.git.ljs@kernel.org Signed-off-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org> Acked-by: Vlastimil Babka (SUSE) <vbabka@kernel.org> Cc: Albert Ou <aou@eecs.berkeley.edu> Cc: Alexander Gordeev <agordeev@linux.ibm.com> Cc: Alexandre Ghiti <alex@ghiti.fr> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Anton Ivanov <anton.ivanov@cambridgegreys.com> Cc: "Borislav Petkov (AMD)" <bp@alien8.de> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Chengming Zhou <chengming.zhou@linux.dev> Cc: Christian Borntraeger <borntraeger@linux.ibm.com> Cc: Christian Brauner <brauner@kernel.org> Cc: David Hildenbrand <david@kernel.org> Cc: Dinh Nguyen <dinguyen@kernel.org> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Huacai Chen <chenhuacai@kernel.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jan Kara <jack@suse.cz> Cc: Jann Horn <jannh@google.com> Cc: Johannes Berg <johannes@sipsolutions.net> Cc: Kees Cook <kees@kernel.org> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Madhavan Srinivasan <maddy@linux.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michal Hocko <mhocko@suse.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Ondrej Mosnacek <omosnace@redhat.com> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Paul Moore <paul@paul-moore.com> Cc: Pedro Falcato <pfalcato@suse.de> Cc: Richard Weinberger <richard@nod.at> Cc: Russell King <linux@armlinux.org.uk> Cc: Stephen Smalley <stephen.smalley.work@gmail.com> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Sven Schnelle <svens@linux.ibm.com> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Vineet Gupta <vgupta@kernel.org> Cc: WANG Xuerui <kernel@xen0n.name> Cc: Will Deacon <will@kernel.org> Cc: xu xin <xu.xin16@zte.com.cn> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2026-04-05 13:53:40 -07:00
Lorenzo Stoakes (Oracle)	a8add93f80	tools/testing/vma: test that legacy flag helpers work correctly Update the existing compare_legacy_flags() predicate function to assert that legacy_to_vma_flags() and vma_flags_to_legacy() behave as expected. Link: https://lkml.kernel.org/r/3374e50053adb65818fde948ae3488e1e29ae8b1.1774034900.git.ljs@kernel.org Signed-off-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org> Cc: Albert Ou <aou@eecs.berkeley.edu> Cc: Alexander Gordeev <agordeev@linux.ibm.com> Cc: Alexandre Ghiti <alex@ghiti.fr> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Anton Ivanov <anton.ivanov@cambridgegreys.com> Cc: "Borislav Petkov (AMD)" <bp@alien8.de> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Chengming Zhou <chengming.zhou@linux.dev> Cc: Christian Borntraeger <borntraeger@linux.ibm.com> Cc: Christian Brauner <brauner@kernel.org> Cc: David Hildenbrand <david@kernel.org> Cc: Dinh Nguyen <dinguyen@kernel.org> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Huacai Chen <chenhuacai@kernel.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jan Kara <jack@suse.cz> Cc: Jann Horn <jannh@google.com> Cc: Johannes Berg <johannes@sipsolutions.net> Cc: Kees Cook <kees@kernel.org> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Madhavan Srinivasan <maddy@linux.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michal Hocko <mhocko@suse.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Ondrej Mosnacek <omosnace@redhat.com> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Paul Moore <paul@paul-moore.com> Cc: Pedro Falcato <pfalcato@suse.de> Cc: Richard Weinberger <richard@nod.at> Cc: Russell King <linux@armlinux.org.uk> Cc: Stephen Smalley <stephen.smalley.work@gmail.com> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Sven Schnelle <svens@linux.ibm.com> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Vineet Gupta <vgupta@kernel.org> Cc: Vlastimil Babka (SUSE) <vbabka@kernel.org> Cc: WANG Xuerui <kernel@xen0n.name> Cc: Will Deacon <will@kernel.org> Cc: xu xin <xu.xin16@zte.com.cn> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2026-04-05 13:53:39 -07:00
Lorenzo Stoakes (Oracle)	c8555bc95d	mm/vma: introduce [vma_flags,legacy]_to_[legacy,vma_flags]() helpers While we are still converting VMA flags from vma_flags_t to vm_flags_t, introduce helpers to convert between the two to allow for iterative development without having to 'change the world' in a single commit'. Also update VMA flags tests to reflect the change. Finally, refresh vma_flags_overwrite_word(), vma_flag_overwrite_word_once(), vma_flags_set_word() and vma_flags_clear_word() in the VMA tests to reflect current kernel implementations - this should make no functional difference, but keeps the logic consistent between the two. Link: https://lkml.kernel.org/r/d3569470dbb3ae79134ca7c3eb3fc4df7086e874.1774034900.git.ljs@kernel.org Signed-off-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org> Acked-by: Vlastimil Babka (SUSE) <vbabka@kernel.org> Cc: Albert Ou <aou@eecs.berkeley.edu> Cc: Alexander Gordeev <agordeev@linux.ibm.com> Cc: Alexandre Ghiti <alex@ghiti.fr> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Anton Ivanov <anton.ivanov@cambridgegreys.com> Cc: "Borislav Petkov (AMD)" <bp@alien8.de> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Chengming Zhou <chengming.zhou@linux.dev> Cc: Christian Borntraeger <borntraeger@linux.ibm.com> Cc: Christian Brauner <brauner@kernel.org> Cc: David Hildenbrand <david@kernel.org> Cc: Dinh Nguyen <dinguyen@kernel.org> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Huacai Chen <chenhuacai@kernel.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jan Kara <jack@suse.cz> Cc: Jann Horn <jannh@google.com> Cc: Johannes Berg <johannes@sipsolutions.net> Cc: Kees Cook <kees@kernel.org> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Madhavan Srinivasan <maddy@linux.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michal Hocko <mhocko@suse.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Ondrej Mosnacek <omosnace@redhat.com> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Paul Moore <paul@paul-moore.com> Cc: Pedro Falcato <pfalcato@suse.de> Cc: Richard Weinberger <richard@nod.at> Cc: Russell King <linux@armlinux.org.uk> Cc: Stephen Smalley <stephen.smalley.work@gmail.com> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Sven Schnelle <svens@linux.ibm.com> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Vineet Gupta <vgupta@kernel.org> Cc: WANG Xuerui <kernel@xen0n.name> Cc: Will Deacon <will@kernel.org> Cc: xu xin <xu.xin16@zte.com.cn> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2026-04-05 13:53:39 -07:00
Lorenzo Stoakes (Oracle)	3ee5845382	mm/vma: introduce vma_flags_same[_mask/_pair]() Add helpers to determine if two sets of VMA flags are precisely the same, that is - that every flag set one is set in another, and neither contain any flags not set in the other. We also introduce vma_flags_same_pair() for cases where we want to compare two sets of VMA flags which are both non-const values. Also update the VMA tests to reflect the change, we already implicitly test that this functions correctly having used it for testing purposes previously. Link: https://lkml.kernel.org/r/4f764bf619e77205837c7c819b62139ef6337ca3.1774034900.git.ljs@kernel.org Signed-off-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org> Acked-by: Vlastimil Babka (SUSE) <vbabka@kernel.org> Cc: Albert Ou <aou@eecs.berkeley.edu> Cc: Alexander Gordeev <agordeev@linux.ibm.com> Cc: Alexandre Ghiti <alex@ghiti.fr> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Anton Ivanov <anton.ivanov@cambridgegreys.com> Cc: "Borislav Petkov (AMD)" <bp@alien8.de> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Chengming Zhou <chengming.zhou@linux.dev> Cc: Christian Borntraeger <borntraeger@linux.ibm.com> Cc: Christian Brauner <brauner@kernel.org> Cc: David Hildenbrand <david@kernel.org> Cc: Dinh Nguyen <dinguyen@kernel.org> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Huacai Chen <chenhuacai@kernel.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jan Kara <jack@suse.cz> Cc: Jann Horn <jannh@google.com> Cc: Johannes Berg <johannes@sipsolutions.net> Cc: Kees Cook <kees@kernel.org> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Madhavan Srinivasan <maddy@linux.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michal Hocko <mhocko@suse.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Ondrej Mosnacek <omosnace@redhat.com> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Paul Moore <paul@paul-moore.com> Cc: Pedro Falcato <pfalcato@suse.de> Cc: Richard Weinberger <richard@nod.at> Cc: Russell King <linux@armlinux.org.uk> Cc: Stephen Smalley <stephen.smalley.work@gmail.com> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Sven Schnelle <svens@linux.ibm.com> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Vineet Gupta <vgupta@kernel.org> Cc: WANG Xuerui <kernel@xen0n.name> Cc: Will Deacon <will@kernel.org> Cc: xu xin <xu.xin16@zte.com.cn> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2026-04-05 13:53:39 -07:00
Lorenzo Stoakes (Oracle)	b22a48ec09	tools/testing/vma: add simple test for append_vma_flags() Add a simple test for append_vma_flags() to assert that it behaves as expected. Additionally, include the VMA_REMAP_FLAGS definition in the VMA tests to allow us to use this value in the testing. Link: https://lkml.kernel.org/r/eebd946c5325ad7fae93027245a562eb1aeb68a2.1774034900.git.ljs@kernel.org Signed-off-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org> Cc: Albert Ou <aou@eecs.berkeley.edu> Cc: Alexander Gordeev <agordeev@linux.ibm.com> Cc: Alexandre Ghiti <alex@ghiti.fr> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Anton Ivanov <anton.ivanov@cambridgegreys.com> Cc: "Borislav Petkov (AMD)" <bp@alien8.de> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Chengming Zhou <chengming.zhou@linux.dev> Cc: Christian Borntraeger <borntraeger@linux.ibm.com> Cc: Christian Brauner <brauner@kernel.org> Cc: David Hildenbrand <david@kernel.org> Cc: Dinh Nguyen <dinguyen@kernel.org> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Huacai Chen <chenhuacai@kernel.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jan Kara <jack@suse.cz> Cc: Jann Horn <jannh@google.com> Cc: Johannes Berg <johannes@sipsolutions.net> Cc: Kees Cook <kees@kernel.org> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Madhavan Srinivasan <maddy@linux.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michal Hocko <mhocko@suse.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Ondrej Mosnacek <omosnace@redhat.com> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Paul Moore <paul@paul-moore.com> Cc: Pedro Falcato <pfalcato@suse.de> Cc: Richard Weinberger <richard@nod.at> Cc: Russell King <linux@armlinux.org.uk> Cc: Stephen Smalley <stephen.smalley.work@gmail.com> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Sven Schnelle <svens@linux.ibm.com> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Vineet Gupta <vgupta@kernel.org> Cc: Vlastimil Babka (SUSE) <vbabka@kernel.org> Cc: WANG Xuerui <kernel@xen0n.name> Cc: Will Deacon <will@kernel.org> Cc: xu xin <xu.xin16@zte.com.cn> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2026-04-05 13:53:39 -07:00
Lorenzo Stoakes (Oracle)	e8d464f4a9	mm/vma: add append_vma_flags() helper In order to be able to efficiently combine VMA flag masks with additional VMA flag bits we need to extend the concept introduced in mk_vma_flags() and __mk_vma_flags() by allowing the specification of a VMA flag mask to append VMA flag bits to. Update __mk_vma_flags() to allow for this and update mk_vma_flags() accordingly, and also provide append_vma_flags() to allow for the caller to specify which VMA flags mask to append to. Finally, update the VMA flags tests to reflect the change. Link: https://lkml.kernel.org/r/9f928cd4688270002f2c0c3777fcc9b49cc7a8ea.1774034900.git.ljs@kernel.org Signed-off-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org> Acked-by: Vlastimil Babka (SUSE) <vbabka@kernel.org> Cc: Albert Ou <aou@eecs.berkeley.edu> Cc: Alexander Gordeev <agordeev@linux.ibm.com> Cc: Alexandre Ghiti <alex@ghiti.fr> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Anton Ivanov <anton.ivanov@cambridgegreys.com> Cc: "Borislav Petkov (AMD)" <bp@alien8.de> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Chengming Zhou <chengming.zhou@linux.dev> Cc: Christian Borntraeger <borntraeger@linux.ibm.com> Cc: Christian Brauner <brauner@kernel.org> Cc: David Hildenbrand <david@kernel.org> Cc: Dinh Nguyen <dinguyen@kernel.org> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Huacai Chen <chenhuacai@kernel.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jan Kara <jack@suse.cz> Cc: Jann Horn <jannh@google.com> Cc: Johannes Berg <johannes@sipsolutions.net> Cc: Kees Cook <kees@kernel.org> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Madhavan Srinivasan <maddy@linux.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michal Hocko <mhocko@suse.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Ondrej Mosnacek <omosnace@redhat.com> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Paul Moore <paul@paul-moore.com> Cc: Pedro Falcato <pfalcato@suse.de> Cc: Richard Weinberger <richard@nod.at> Cc: Russell King <linux@armlinux.org.uk> Cc: Stephen Smalley <stephen.smalley.work@gmail.com> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Sven Schnelle <svens@linux.ibm.com> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Vineet Gupta <vgupta@kernel.org> Cc: WANG Xuerui <kernel@xen0n.name> Cc: Will Deacon <will@kernel.org> Cc: xu xin <xu.xin16@zte.com.cn> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2026-04-05 13:53:39 -07:00
Lorenzo Stoakes (Oracle)	06531d2bf3	tools/testing/vma: fix VMA flag tests The VMA tests are incorrectly referencing NUM_VMA_FLAGS, which doesn't exist, rather they should reference NUM_VMA_FLAG_BITS. Additionally, remove the custom-written implementation of __mk_vma_flags() as this means we are not testing the code as present in the kernel, rather add the actual __mk_vma_flags() to dup.h and add #ifdef's to handle declarations differently depending on NUM_VMA_FLAG_BITS. Link: https://lkml.kernel.org/r/b19c63af3d5efdfe712bf5d5f89368a5360a60f7.1774034900.git.ljs@kernel.org Signed-off-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org> Cc: Albert Ou <aou@eecs.berkeley.edu> Cc: Alexander Gordeev <agordeev@linux.ibm.com> Cc: Alexandre Ghiti <alex@ghiti.fr> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Anton Ivanov <anton.ivanov@cambridgegreys.com> Cc: "Borislav Petkov (AMD)" <bp@alien8.de> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Chengming Zhou <chengming.zhou@linux.dev> Cc: Christian Borntraeger <borntraeger@linux.ibm.com> Cc: Christian Brauner <brauner@kernel.org> Cc: David Hildenbrand <david@kernel.org> Cc: Dinh Nguyen <dinguyen@kernel.org> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Huacai Chen <chenhuacai@kernel.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jan Kara <jack@suse.cz> Cc: Jann Horn <jannh@google.com> Cc: Johannes Berg <johannes@sipsolutions.net> Cc: Kees Cook <kees@kernel.org> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Madhavan Srinivasan <maddy@linux.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michal Hocko <mhocko@suse.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Ondrej Mosnacek <omosnace@redhat.com> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Paul Moore <paul@paul-moore.com> Cc: Pedro Falcato <pfalcato@suse.de> Cc: Richard Weinberger <richard@nod.at> Cc: Russell King <linux@armlinux.org.uk> Cc: Stephen Smalley <stephen.smalley.work@gmail.com> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Sven Schnelle <svens@linux.ibm.com> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Vineet Gupta <vgupta@kernel.org> Cc: Vlastimil Babka (SUSE) <vbabka@kernel.org> Cc: WANG Xuerui <kernel@xen0n.name> Cc: Will Deacon <will@kernel.org> Cc: xu xin <xu.xin16@zte.com.cn> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2026-04-05 13:53:39 -07:00
Lorenzo Stoakes (Oracle)	7ec1885a7e	mm/vma: use new VMA flags for sticky flags logic Use the new vma_flags_t flags implementation to perform the logic around sticky flags and what flags are ignored on VMA merge. We make use of the new vma_flags_empty(), vma_flags_diff_pair(), and vma_flags_and_mask() functionality. Also update the VMA tests accordingly. Link: https://lkml.kernel.org/r/369574f06360ffa44707047e3b58eb4897345fba.1774034900.git.ljs@kernel.org Signed-off-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org> Acked-by: Vlastimil Babka (SUSE) <vbabka@kernel.org> Cc: Albert Ou <aou@eecs.berkeley.edu> Cc: Alexander Gordeev <agordeev@linux.ibm.com> Cc: Alexandre Ghiti <alex@ghiti.fr> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Anton Ivanov <anton.ivanov@cambridgegreys.com> Cc: "Borislav Petkov (AMD)" <bp@alien8.de> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Chengming Zhou <chengming.zhou@linux.dev> Cc: Christian Borntraeger <borntraeger@linux.ibm.com> Cc: Christian Brauner <brauner@kernel.org> Cc: David Hildenbrand <david@kernel.org> Cc: Dinh Nguyen <dinguyen@kernel.org> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Huacai Chen <chenhuacai@kernel.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jan Kara <jack@suse.cz> Cc: Jann Horn <jannh@google.com> Cc: Johannes Berg <johannes@sipsolutions.net> Cc: Kees Cook <kees@kernel.org> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Madhavan Srinivasan <maddy@linux.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michal Hocko <mhocko@suse.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Ondrej Mosnacek <omosnace@redhat.com> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Paul Moore <paul@paul-moore.com> Cc: Pedro Falcato <pfalcato@suse.de> Cc: Richard Weinberger <richard@nod.at> Cc: Russell King <linux@armlinux.org.uk> Cc: Stephen Smalley <stephen.smalley.work@gmail.com> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Sven Schnelle <svens@linux.ibm.com> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Vineet Gupta <vgupta@kernel.org> Cc: WANG Xuerui <kernel@xen0n.name> Cc: Will Deacon <will@kernel.org> Cc: xu xin <xu.xin16@zte.com.cn> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2026-04-05 13:53:38 -07:00
Lorenzo Stoakes (Oracle)	bd44d91d0c	tools/testing/vma: convert bulk of test code to vma_flags_t Convert the test code to utilise vma_flags_t as opposed to the deprecate vm_flags_t as much as possible. As part of this change, add VMA_STICKY_FLAGS and VMA_SPECIAL_FLAGS as early versions of what these defines will look like in the kernel logic once this logic is implemented. Link: https://lkml.kernel.org/r/df90efe29300bd899989f695be4ae3adc901a828.1774034900.git.ljs@kernel.org Signed-off-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org> Cc: Albert Ou <aou@eecs.berkeley.edu> Cc: Alexander Gordeev <agordeev@linux.ibm.com> Cc: Alexandre Ghiti <alex@ghiti.fr> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Anton Ivanov <anton.ivanov@cambridgegreys.com> Cc: "Borislav Petkov (AMD)" <bp@alien8.de> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Chengming Zhou <chengming.zhou@linux.dev> Cc: Christian Borntraeger <borntraeger@linux.ibm.com> Cc: Christian Brauner <brauner@kernel.org> Cc: David Hildenbrand <david@kernel.org> Cc: Dinh Nguyen <dinguyen@kernel.org> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Huacai Chen <chenhuacai@kernel.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jan Kara <jack@suse.cz> Cc: Jann Horn <jannh@google.com> Cc: Johannes Berg <johannes@sipsolutions.net> Cc: Kees Cook <kees@kernel.org> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Madhavan Srinivasan <maddy@linux.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michal Hocko <mhocko@suse.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Ondrej Mosnacek <omosnace@redhat.com> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Paul Moore <paul@paul-moore.com> Cc: Pedro Falcato <pfalcato@suse.de> Cc: Richard Weinberger <richard@nod.at> Cc: Russell King <linux@armlinux.org.uk> Cc: Stephen Smalley <stephen.smalley.work@gmail.com> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Sven Schnelle <svens@linux.ibm.com> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Vineet Gupta <vgupta@kernel.org> Cc: Vlastimil Babka (SUSE) <vbabka@kernel.org> Cc: WANG Xuerui <kernel@xen0n.name> Cc: Will Deacon <will@kernel.org> Cc: xu xin <xu.xin16@zte.com.cn> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2026-04-05 13:53:38 -07:00
Lorenzo Stoakes (Oracle)	8228e42b5f	mm/vma: add further vma_flags_t unions In order to utilise the new vma_flags_t type, we currently place it in union with legacy vm_flags fields of type vm_flags_t to make the transition smoother. Add vma_flags_t union entries for mm->def_flags and vmg->vm_flags - mm->def_vma_flags and vmg->vma_flags respectively. Once the conversion is complete, these will be replaced with vma_flags_t entries alone. Also update the VMA tests to reflect the change. Link: https://lkml.kernel.org/r/d507d542c089ba132e9da53f2ff7f80ca117c3b4.1774034900.git.ljs@kernel.org Signed-off-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org> Acked-by: Vlastimil Babka (SUSE) <vbabka@kernel.org> Cc: Albert Ou <aou@eecs.berkeley.edu> Cc: Alexander Gordeev <agordeev@linux.ibm.com> Cc: Alexandre Ghiti <alex@ghiti.fr> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Anton Ivanov <anton.ivanov@cambridgegreys.com> Cc: "Borislav Petkov (AMD)" <bp@alien8.de> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Chengming Zhou <chengming.zhou@linux.dev> Cc: Christian Borntraeger <borntraeger@linux.ibm.com> Cc: Christian Brauner <brauner@kernel.org> Cc: David Hildenbrand <david@kernel.org> Cc: Dinh Nguyen <dinguyen@kernel.org> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Huacai Chen <chenhuacai@kernel.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jan Kara <jack@suse.cz> Cc: Jann Horn <jannh@google.com> Cc: Johannes Berg <johannes@sipsolutions.net> Cc: Kees Cook <kees@kernel.org> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Madhavan Srinivasan <maddy@linux.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michal Hocko <mhocko@suse.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Ondrej Mosnacek <omosnace@redhat.com> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Paul Moore <paul@paul-moore.com> Cc: Pedro Falcato <pfalcato@suse.de> Cc: Richard Weinberger <richard@nod.at> Cc: Russell King <linux@armlinux.org.uk> Cc: Stephen Smalley <stephen.smalley.work@gmail.com> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Sven Schnelle <svens@linux.ibm.com> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Vineet Gupta <vgupta@kernel.org> Cc: WANG Xuerui <kernel@xen0n.name> Cc: Will Deacon <will@kernel.org> Cc: xu xin <xu.xin16@zte.com.cn> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2026-04-05 13:53:38 -07:00
Lorenzo Stoakes (Oracle)	e4fd34b84b	tools/testing/vma: add unit tests flag empty, diff_pair, and[_mask] Add VMA unit tests to assert that: * vma_flags_empty() * vma_flags_diff_pair() * vma_flags_and_mask() * vma_flags_and() All function as expected. In additional to the added tests, in order to make testing easier, add vma_flags_same_mask() and vma_flags_same() for testing only. If/when these are required in kernel code, they can be moved over. Also add ASSERT_FLAGS_[NOT_]SAME[_MASK](), ASSERT_FLAGS_[NON]EMPTY() test helpers to make asserting flag state easier and more convenient. Link: https://lkml.kernel.org/r/471ce7ceb1d32e5fc9c0660966b9eacdf899b4d1.1774034900.git.ljs@kernel.org Signed-off-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org> Cc: Albert Ou <aou@eecs.berkeley.edu> Cc: Alexander Gordeev <agordeev@linux.ibm.com> Cc: Alexandre Ghiti <alex@ghiti.fr> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Anton Ivanov <anton.ivanov@cambridgegreys.com> Cc: "Borislav Petkov (AMD)" <bp@alien8.de> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Chengming Zhou <chengming.zhou@linux.dev> Cc: Christian Borntraeger <borntraeger@linux.ibm.com> Cc: Christian Brauner <brauner@kernel.org> Cc: David Hildenbrand <david@kernel.org> Cc: Dinh Nguyen <dinguyen@kernel.org> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Huacai Chen <chenhuacai@kernel.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jan Kara <jack@suse.cz> Cc: Jann Horn <jannh@google.com> Cc: Johannes Berg <johannes@sipsolutions.net> Cc: Kees Cook <kees@kernel.org> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Madhavan Srinivasan <maddy@linux.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michal Hocko <mhocko@suse.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Ondrej Mosnacek <omosnace@redhat.com> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Paul Moore <paul@paul-moore.com> Cc: Pedro Falcato <pfalcato@suse.de> Cc: Richard Weinberger <richard@nod.at> Cc: Russell King <linux@armlinux.org.uk> Cc: Stephen Smalley <stephen.smalley.work@gmail.com> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Sven Schnelle <svens@linux.ibm.com> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Vineet Gupta <vgupta@kernel.org> Cc: Vlastimil Babka (SUSE) <vbabka@kernel.org> Cc: WANG Xuerui <kernel@xen0n.name> Cc: Will Deacon <will@kernel.org> Cc: xu xin <xu.xin16@zte.com.cn> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2026-04-05 13:53:38 -07:00
Lorenzo Stoakes (Oracle)	6bc0987d0b	mm/vma: add vma_flags_empty(), vma_flags_and(), vma_flags_diff_pair() Patch series "mm/vma: convert vm_flags_t to vma_flags_t in vma code", v4. This series converts a lot of the existing use of the legacy vm_flags_t data type to the new vma_flags_t type which replaces it. In order to do so it adds a number of additional helpers: * vma_flags_empty() - Determines whether a vma_flags_t value has no bits set. * vma_flags_and() - Performs a bitwise AND between two vma_flags_t values. * vma_flags_diff_pair() - Determines which flags are not shared between a pair of VMA flags (typically non-constant values) * append_vma_flags() - Similar to mk_vma_flags(), but allows a vma_flags_t value to be specified (typically a constant value) which will be copied and appended to to create a new vma_flags_t value, with additional flags specified to append to it. * vma_flags_same() - Determines if a vma_flags_t value is exactly equal to a set of VMA flags. * vma_flags_same_mask() - Determines if a vma_flags_t value is eactly equal to another vma_flags_t value (typically constant). * vma_flags_same_pair() - Determines if a pair of vma_flags_t values are exactly equal to one another (typically both non-constant). * vma_flags_to_legacy() - Converts a vma_flags_t value to a vm_flags_t value, used to enable more iterative introduction of the use of vma_flags_t. * legacy_to_vma_flags() - Converts a vm_flags_t value to a vma_flags-t value, for the same purpose. * vma_flags_test_single_mask() - Tests whether a vma_flags_t value contain the single flag specified in an input vma_flags_t flag mask, or if that flag mask is empty, is defined to return false. Useful for config-predicated VMA flag mask defines. * vma_test() - Tests whether a VMA's flags contain a specific singular VMA flag. * vma_test_any() - Tests whether a VMA's flags contain any of a set of VMA flags. * vma_test_any_mask() - Tests whether a VMA's flags contain any of the flags specified in another, typically constant, vma_flags_t value. * vma_test_single_mask() - Tests whether a VMA's flags contain the single flag specified in an input vma_flags_t flag mask, or if that flag mask is empty, is defined to return false. Useful for config-predicated VMA flag mask defines. * vma_clear_flags() - Clears a specific set of VMA flags from a vma_flags_t value. * vma_clear_flags_mask() - Clears those flag set in a vma_flags_t value (typically constant) from a (typically not constant) vma_flags_t value. The series mostly focuses on the the VMA specific code, especially that contained in mm/vma.c and mm/vma.h. It updates both brk() and mmap() logic to utils vma_flags_t values as much as is practiaclly possible at this point, changing surrounding logic to be able to do so. It also updates the vma_modify_xxx() functions where they interact with VMA flags directly to use vm_flags_t values where possible. There is extensive testing added in the VMA userland tests to assert that all of these new VMA flag functions work correctly. This patch (of 25): Firstly, add the ability to determine if VMA flags are empty, that is no flags are set in a vma_flags_t value. Next, add the ability to obtain the equivalent of the bitwise and of two vma_flags_t values, via vma_flags_and_mask(). Next, add the ability to obtain the difference between two sets of VMA flags, that is the equivalent to the exclusive bitwise OR of the two sets of flags, via vma_flags_diff_pair(). vma_flags_xxx_mask() typically operates on a pointer to a vma_flags_t value, which is assumed to be an lvalue of some kind (such as a field in a struct or a stack variable) and an rvalue of some kind (typically a constant set of VMA flags obtained e.g. via mk_vma_flags() or equivalent). However vma_flags_diff_pair() is intended to operate on two lvalues, so use the _pair() suffix to make this clear. Finally, update VMA userland tests to add these helpers. We also port bitmap_xor() and __bitmap_xor() to the tools/ headers and source to allow the tests to work with vma_flags_diff_pair(). Link: https://lkml.kernel.org/r/cover.1774034900.git.ljs@kernel.org Link: https://lkml.kernel.org/r/53ab55b7da91425775e42c03177498ad6de88ef4.1774034900.git.ljs@kernel.org Signed-off-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org> Acked-by: Vlastimil Babka (SUSE) <vbabka@kernel.org> Cc: Albert Ou <aou@eecs.berkeley.edu> Cc: Alexander Gordeev <agordeev@linux.ibm.com> Cc: Alexandre Ghiti <alex@ghiti.fr> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Anton Ivanov <anton.ivanov@cambridgegreys.com> Cc: "Borislav Petkov (AMD)" <bp@alien8.de> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Chengming Zhou <chengming.zhou@linux.dev> Cc: Christian Borntraeger <borntraeger@linux.ibm.com> Cc: Christian Brauner <brauner@kernel.org> Cc: David Hildenbrand <david@kernel.org> Cc: Dinh Nguyen <dinguyen@kernel.org> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Huacai Chen <chenhuacai@kernel.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jan Kara <jack@suse.cz> Cc: Jann Horn <jannh@google.com> Cc: Johannes Berg <johannes@sipsolutions.net> Cc: Kees Cook <kees@kernel.org> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Madhavan Srinivasan <maddy@linux.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michal Hocko <mhocko@suse.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Ondrej Mosnacek <omosnace@redhat.com> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Paul Moore <paul@paul-moore.com> Cc: Pedro Falcato <pfalcato@suse.de> Cc: Richard Weinberger <richard@nod.at> Cc: Russell King <linux@armlinux.org.uk> Cc: Stephen Smalley <stephen.smalley.work@gmail.com> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Sven Schnelle <svens@linux.ibm.com> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Vineet Gupta <vgupta@kernel.org> Cc: WANG Xuerui <kernel@xen0n.name> Cc: Will Deacon <will@kernel.org> Cc: xu xin <xu.xin16@zte.com.cn> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2026-04-05 13:53:38 -07:00
Zi Yan	224f129261	selftests/mm: add folio_split() and filemap_get_entry() race test The added folio_split_race_test is a modified C port of the race condition test from [1]. The test creates shmem huge pages, where the main thread punches holes in the shmem to cause folio_split() in the kernel and a set of 16 threads reads the shmem to cause filemap_get_entry() in the kernel. filemap_get_entry() reads the folio and xarray split by folio_split() locklessly. The original test[2] is written in rust and uses memfd (shmem backed). This C port uses shmem directly and use a single process. Note: the initial rust to C conversion is done by Cursor. Link: https://lore.kernel.org/all/CAKNNEtw5_kZomhkugedKMPOG-sxs5Q5OLumWJdiWXv+C9Yct0w@mail.gmail.com/ [1] Link: https://github.com/dfinity/thp-madv-remove-test [2] Link: https://lkml.kernel.org/r/20260323163717.184107-1-ziy@nvidia.com Co-developed-by: Bas van Dijk <bas@dfinity.org> Signed-off-by: Bas van Dijk <bas@dfinity.org> Co-developed-by: Adam Bratschi-Kaye <adam.bratschikaye@dfinity.org> Signed-off-by: Adam Bratschi-Kaye <adam.bratschikaye@dfinity.org> Signed-off-by: Zi Yan <ziy@nvidia.com> Cc: Baolin Wang <baolin.wang@linux.alibaba.com> Cc: Barry Song <baohua@kernel.org> Cc: David Hildenbrand <david@kernel.org> Cc: Dev Jain <dev.jain@arm.com> Cc: Hugh Dickins <hughd@google.com> Cc: Lance Yang <lance.yang@linux.dev> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Michal Hocko <mhocko@suse.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Nico Pache <npache@redhat.com> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: Shuah Khan <shuah@kernel.org> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Vlastimil Babka <vbabka@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2026-04-05 13:53:36 -07:00
Lorenzo Stoakes (Oracle)	2d1e54aab6	mm: abstract reading sysctl_max_map_count, and READ_ONCE() Concurrent reads and writes of sysctl_max_map_count are possible, so we should READ_ONCE() and WRITE_ONCE(). The sysctl procfs logic already enforces WRITE_ONCE(), so abstract the read side with get_sysctl_max_map_count(). While we're here, also move the field to mm/internal.h and add the getter there since only mm interacts with it, there's no need for anybody else to have access. Finally, update the VMA userland tests to reflect the change. Link: https://lkml.kernel.org/r/0715259eb37cbdfde4f9e5db92a20ec7110a1ce5.1773249037.git.ljs@kernel.org Signed-off-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org> Reviewed-by: Pedro Falcato <pfalcato@suse.de> Cc: Jann Horn <jannh@google.com> Cc: Jianzhou Zhao <luckd0g@163.com> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Oscar Salvador <osalvador@suse.de> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Vlastimil Babka <vbabka@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2026-04-05 13:53:28 -07:00
Waiman Long	2d028f3e4b	selftest: memcg: skip memcg_sock test if address family not supported The test_memcg_sock test in memcontrol.c sets up an IPv6 socket and send data over it to consume memory and verify that memory.stat.sock and memory.current values are close. On systems where IPv6 isn't enabled or not configured to support SOCK_STREAM, the test_memcg_sock test always fails. When the socket() call fails, there is no way we can test the memory consumption and verify the above claim. I believe it is better to just skip the test in this case instead of reporting a test failure hinting that there may be something wrong with the memcg code. Link: https://lkml.kernel.org/r/20260311200526.885899-1-longman@redhat.com Fixes: `5f8f019380` ("selftests: cgroup/memcontrol: add basic test for socket accounting") Signed-off-by: Waiman Long <longman@redhat.com> Acked-by: Michal Koutný <mkoutny@suse.com> Acked-by: Shakeel Butt <shakeel.butt@linux.dev> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Michal Hocko <mhocko@kernel.org> Cc: Michal Koutný <mkoutny@suse.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Muchun Song <muchun.song@linux.dev> Cc: Roman Gushchin <roman.gushchin@linux.dev> Cc: Shuah Khan <shuah@kernel.org> Cc: Tejun Heo <tj@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2026-04-05 13:53:27 -07:00
Mike Rapoport (Microsoft)	f08f610ea0	selftests/mm: pagemap_ioctl: remove hungarian notation Replace lpBaseAddress with addr and dwRegionSize with size. Link: https://lkml.kernel.org/r/20260311180737.3767545-1-rppt@kernel.org Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org> Acked-by: David Hildenbrand (Arm) <david@kernel.org> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Lorenzo Stoakes <ljs@kernel.org> Cc: Michal Hocko <mhocko@suse.com> Cc: Shuah Khan <shuah@kernel.org> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Vlastimil Babka <vbabka@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2026-04-05 13:53:27 -07:00
SeongJae Park	ddac713da3	selftests/damon/sysfs.py: test goal_tuner commit Extend the near-full DAMON parameters commit selftest to commit goal_tuner and confirm the internal status is updated as expected. Link: https://lkml.kernel.org/r/20260310010529.91162-12-sj@kernel.org Signed-off-by: SeongJae Park <sj@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2026-04-05 13:53:26 -07:00
SeongJae Park	c2b0cb96e7	selftests/damon/drgn_dump_damon_status: support quota goal_tuner dumping Update drgn_dump_damon_status.py, which is being used to dump the in-kernel DAMON status for tests, to dump goal_tuner setup status. Link: https://lkml.kernel.org/r/20260310010529.91162-11-sj@kernel.org Signed-off-by: SeongJae Park <sj@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2026-04-05 13:53:26 -07:00
SeongJae Park	c00863bc7c	selftests/damon/_damon_sysfs: support goal_tuner setup Add support of goal_tuner setup to the test-purpose DAMON sysfs interface control helper, _damon_sysfs.py. Link: https://lkml.kernel.org/r/20260310010529.91162-10-sj@kernel.org Signed-off-by: SeongJae Park <sj@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2026-04-05 13:53:26 -07:00
Anthony Yznaga	d239462787	mm: prevent droppable mappings from being locked Droppable mappings must not be lockable. There is a check for VMAs with VM_DROPPABLE set in mlock_fixup() along with checks for other types of unlockable VMAs which ensures this when calling mlock()/mlock2(). For mlockall(MCL_FUTURE), the check for unlockable VMAs is different. In apply_mlockall_flags(), if the flags parameter has MCL_FUTURE set, the current task's mm's default VMA flag field mm->def_flags has VM_LOCKED applied to it. VM_LOCKONFAULT is also applied if MCL_ONFAULT is also set. When these flags are set as default in this manner they are cleared in __mmap_complete() for new mappings that do not support mlock. A check for VM_DROPPABLE in __mmap_complete() is missing resulting in droppable mappings created with VM_LOCKED set. To fix this and reduce that chance of similar bugs in the future, introduce and use vma_supports_mlock(). Link: https://lkml.kernel.org/r/20260310155821.17869-1-anthony.yznaga@oracle.com Fixes: `9651fcedf7` ("mm: add MAP_DROPPABLE for designating always lazily freeable mappings") Signed-off-by: Anthony Yznaga <anthony.yznaga@oracle.com> Suggested-by: David Hildenbrand <david@kernel.org> Acked-by: David Hildenbrand (Arm) <david@kernel.org> Reviewed-by: Pedro Falcato <pfalcato@suse.de> Reviewed-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org> Tested-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org> Cc: Jann Horn <jannh@google.com> Cc: Jason A. Donenfeld <jason@zx2c4.com> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Shuah Khan <shuah@kernel.org> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Vlastimil Babka <vbabka@kernel.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2026-04-05 13:53:25 -07:00
David Hildenbrand (Arm)	a9496e9e4b	mm: move vma_mmu_pagesize() from hugetlb to vma.c vma_mmu_pagesize() is also queried on non-hugetlb VMAs and does not really belong into hugetlb.c. PPC64 provides a custom overwrite with CONFIG_HUGETLB_PAGE, see arch/powerpc/mm/book3s64/slice.c, so we cannot easily make this a static inline function. So let's move it to vma.c and add some proper kerneldoc. To make vma tests happy, add a simple vma_kernel_pagesize() stub in tools/testing/vma/include/custom.h. Link: https://lkml.kernel.org/r/20260309151901.123947-3-david@kernel.org Signed-off-by: David Hildenbrand (Arm) <david@kernel.org> Reviewed-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org> Acked-by: Mike Rapoport (Microsoft) <rppt@kernel.org> Cc: "Christophe Leroy (CS GROUP)" <chleroy@kernel.org> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Jann Horn <jannh@google.com> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Madhavan Srinivasan <maddy@linux.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michal Hocko <mhocko@suse.com> Cc: Muchun Song <muchun.song@linux.dev> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Oscar Salvador <osalvador@suse.de> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Pedro Falcato <pfalcato@suse.de> Cc: Suren Baghdasaryan <surenb@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2026-04-05 13:53:23 -07:00

... 3 4 5 6 7 ...

52918 Commits