mirror of
https://github.com/torvalds/linux.git
synced 2026-05-12 08:08:03 +02:00
master
2881 Commits
| Author | SHA1 | Message | Date | |
|---|---|---|---|---|
|
|
9055c64567 |
memblock: updates for 7.0-rc1
* improve debugability of reserve_mem kernel parameter handling with print outs in case of a failure and debugfs info showing what was actually reserved * Make memblock_free_late() and free_reserved_area() use the same core logic for freeing the memory to buddy and ensure it takes care of updating memblock arrays when ARCH_KEEP_MEMBLOCK is enabled. -----BEGIN PGP SIGNATURE----- iQFEBAABCgAuFiEEeOVYVaWZL5900a/pOQOGJssO/ZEFAmnjRmsQHHJwcHRAa2Vy bmVsLm9yZwAKCRA5A4Ymyw79kYh0CAC4NpZGFqpEBep1eQcfqsPH05dvp1LUXDNk i5GwS2ht/F5D9GcD+EyoYRQjRM8k+XZyOe3sqEF01Uav/rHAv3XrITg/pfiA92AR K7CvQv4NvyQqUNcv/mEb+P8niriJ4oHRXCag9inop1jo/x3Mym07oEy73rknAx9r ZQKwoFNOM/QQGVb9hZUANKCkE8cAsUXG89yEOH0n17FOahC0PZbK/vxjeO+br3IL HxEoC5l1j4cUauf8XEhsVXXdch0iqit/fB3ROePYFNCx7koVYHk6Yl1w++AM0RUA ypOmfPsSiqLY2ciuTIAnpTeMfQkkhEmMI3mp6T5BUBwSKJxLRaSM =c1xd -----END PGP SIGNATURE----- Merge tag 'memblock-v7.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rppt/memblock Pull memblock updates from Mike Rapoport: - improve debuggability of reserve_mem kernel parameter handling with print outs in case of a failure and debugfs info showing what was actually reserved - Make memblock_free_late() and free_reserved_area() use the same core logic for freeing the memory to buddy and ensure it takes care of updating memblock arrays when ARCH_KEEP_MEMBLOCK is enabled. * tag 'memblock-v7.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rppt/memblock: x86/alternative: delay freeing of smp_locks section memblock: warn when freeing reserved memory before memory map is initialized memblock, treewide: make memblock_free() handle late freeing memblock: make free_reserved_area() update memblock if ARCH_KEEP_MEMBLOCK=y memblock: extract page freeing from free_reserved_area() into a helper memblock: make free_reserved_area() more robust mm: move free_reserved_area() to mm/memblock.c powerpc: opal-core: pair alloc_pages_exact() with free_pages_exact() powerpc: fadump: pair alloc_pages_exact() with free_pages_exact() memblock: reserve_mem: fix end caclulation in reserve_mem_release_by_name() memblock: move reserve_bootmem_range() to memblock.c and make it static memblock: Add reserve_mem debugfs info memblock: Print out errors on reserve_mem parser |
||
|
|
334fbe734e |
mm.git review status for linus..mm-stable
Everything: Total patches: 368 Reviews/patch: 1.56 Reviewed rate: 74% Excluding DAMON: Total patches: 316 Reviews/patch: 1.77 Reviewed rate: 81% Excluding DAMON and zram: Total patches: 306 Reviews/patch: 1.81 Reviewed rate: 82% Excluding DAMON, zram and maple_tree: Total patches: 276 Reviews/patch: 2.01 Reviewed rate: 91% Significant patch series in this merge: - The 30 patch series "maple_tree: Replace big node with maple copy" from Liam Howlett is mainly prepararatory work for ongoing development but it does reduce stack usage and is an improvement. - The 12 patch series "mm, swap: swap table phase III: remove swap_map" from Kairui Song offers memory savings by removing the static swap_map. It also yields some CPU savings and implements several cleanups. - The 2 patch series "mm: memfd_luo: preserve file seals" from Pratyush Yadav adds file seal preservation to LUO's memfd code. - The 2 patch series "mm: zswap: add per-memcg stat for incompressible pages" from Jiayuan Chen adds additional userspace stats reportng to zswap. - The 4 patch series "arch, mm: consolidate empty_zero_page" from Mike Rapoport implements some cleanups for our handling of ZERO_PAGE() and zero_pfn. - The 2 patch series "mm/kmemleak: Improve scan_should_stop() implementation" from Zhongqiu Han provides an robustness improvement and some cleanups in the kmemleak code. - The 4 patch series "Improve khugepaged scan logic" from Vernon Yang "improves the khugepaged scan logic and reduces CPU consumption by prioritizing scanning tasks that access memory frequently". - The 2 patch series "Make KHO Stateless" from Jason Miu simplifies Kexec Handover by "transitioning KHO from an xarray-based metadata tracking system with serialization to a radix tree data structure that can be passed directly to the next kernel" - The 3 patch series "mm: vmscan: add PID and cgroup ID to vmscan tracepoints" from Thomas Ballasi and Steven Rostedt enhances vmscan's tracepointing. - The 5 patch series "mm: arch/shstk: Common shadow stack mapping helper and VM_NOHUGEPAGE" from Catalin Marinas is a cleanup for the shadow stack code: remove per-arch code in favour of a generic implementation. - The 2 patch series "Fix KASAN support for KHO restored vmalloc regions" from Pasha Tatashin fixes a WARN() which can be emitted the KHO restores a vmalloc area. - The 4 patch series "mm: Remove stray references to pagevec" from Tal Zussman provides several cleanups, mainly udpating references to "struct pagevec", which became folio_batch three years ago. - The 17 patch series "mm: Eliminate fake head pages from vmemmap optimization" from Kiryl Shutsemau simplifies the HugeTLB vmemmap optimization (HVO) by changing how tail pages encode their relationship to the head page. - The 2 patch series "mm/damon/core: improve DAMOS quota efficiency for core layer filters" from SeongJae Park improves two problematic behaviors of DAMOS that makes it less efficient when core layer filters are used. - The 3 patch series "mm/damon: strictly respect min_nr_regions" from SeongJae Park improves DAMON usability by extending the treatment of the min_nr_regions user-settable parameter. - The 3 patch series "mm/page_alloc: pcp locking cleanup" from Vlastimil Babka is a proper fix for a previously hotfixed SMP=n issue. Code simplifications and cleanups ennsed. - The 16 patch series "mm: cleanups around unmapping / zapping" from David Hildenbrand implements "a bunch of cleanups around unmapping and zapping. Mostly simplifications, code movements, documentation and renaming of zapping functions". - The 6 patch series "support batched checking of the young flag for MGLRU" from Baolin Wang supports batched checking of the young flag for MGLRU. It's part cleanups; one benchmark shows large performance benefits for arm64. - The 5 patch series "memcg: obj stock and slab stat caching cleanups" from Johannes Weiner provides memcg cleanup and robustness improvements. - The 5 patch series "Allow order zero pages in page reporting" from Yuvraj Sakshith enhances page_reporting's free page reporting - it is presently and undesirably order-0 pages when reporting free memory. - The 6 patch series "mm: vma flag tweaks" from Lorenzo Stoakes is cleanup work following from the recent conversion of the VMA flags to a bitmap. - The 10 patch series "mm/damon: add optional debugging-purpose sanity checks" from SeongJae Park adds some more developer-facing debug checks into DAMON core. - The 2 patch series "mm/damon: test and document power-of-2 min_region_sz requirement" from SeongJae Park adds an additional DAMON kunit test and makes some adjustments to the addr_unit parameter handling. - The 3 patch series "mm/damon/core: make passed_sample_intervals comparisons overflow-safe" from SeongJae Park fixes a hard-to-hit time overflow issue in DAMON core. - The 7 patch series "mm/damon: improve/fixup/update ratio calculation, test and documentation" from SeongJae Park is a "batch of misc/minor improvements and fixups" for DAMON. - The 4 patch series "mm: move vma_(kernel|mmu)_pagesize() out of hugetlb.c" from David Hildenbrand fixes a possible issue with dax-device when CONFIG_HUGETLB=n. Some code movement was required. - The 6 patch series "zram: recompression cleanups and tweaks" from Sergey Senozhatsky provides "a somewhat random mix of fixups, recompression cleanups and improvements" in the zram code. - The 11 patch series "mm/damon: support multiple goal-based quota tuning algorithms" from SeongJae Park extend DAMOS quotas goal auto-tuning to support multiple tuning algorithms that users can select. - The 4 patch series "mm: thp: reduce unnecessary start_stop_khugepaged()" from Breno Leitao fixes the khugpaged sysfs handling so we no longer spam the logs with reams of junk when starting/stopping khugepaged. - The 3 patch series "mm: improve map count checks" from Lorenzo Stoakes provides some cleanups and slight fixes in the mremap, mmap and vma code. - The 5 patch series "mm/damon: support addr_unit on default monitoring targets for modules" from SeongJae Park extends the use of DAMON core's addr_unit tunable. - The 5 patch series "mm: khugepaged cleanups and mTHP prerequisites" from Nico Pache provides cleanups in the khugepaged and is a base for Nico's planned khugepaged mTHP support. - The 15 patch series "mm: memory hot(un)plug and SPARSEMEM cleanups" from David Hildenbrand implements code movement and cleanups in the memhotplug and sparsemem code. - The 2 patch series "mm: remove CONFIG_ARCH_ENABLE_MEMORY_HOTREMOVE and cleanup CONFIG_MIGRATION" from David Hildenbrand rationalizes some memhotplug Kconfig support. - The 6 patch series "change young flag check functions to return bool" from Baolin Wang is "a cleanup patchset to change all young flag check functions to return bool". - The 3 patch series "mm/damon/sysfs: fix memory leak and NULL dereference issues" from Josh Law and SeongJae Park fixes a few potential DAMON bugs. - The 25 patch series "mm/vma: convert vm_flags_t to vma_flags_t in vma code" from "converts a lot of the existing use of the legacy vm_flags_t data type to the new vma_flags_t type which replaces it". Mainly in the vma code. - The 21 patch series "mm: expand mmap_prepare functionality and usage" from Lorenzo Stoakes "expands the mmap_prepare functionality, which is intended to replace the deprecated f_op->mmap hook which has been the source of bugs and security issues for some time". Cleanups, documentation, extension of mmap_prepare into filesystem drivers. - The 13 patch series "mm/huge_memory: refactor zap_huge_pmd()" from Lorenzo Stoakes simplifies and cleans up zap_huge_pmd(). Additional cleanups around vm_normal_folio_pmd() and the softleaf functionality are performed. -----BEGIN PGP SIGNATURE----- iHUEABYIAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCad3HDQAKCRDdBJ7gKXxA jrUQAPwNhPk5nPSxnyxjAeQtOBHqgCdnICeEismLajPKd9aYRgEA0s2XAu3tSUYi GrBnWImHG3s4ePQxVcPCegWTsOUrXgQ= =1Q7o -----END PGP SIGNATURE----- Merge tag 'mm-stable-2026-04-13-21-45' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull MM updates from Andrew Morton: - "maple_tree: Replace big node with maple copy" (Liam Howlett) Mainly prepararatory work for ongoing development but it does reduce stack usage and is an improvement. - "mm, swap: swap table phase III: remove swap_map" (Kairui Song) Offers memory savings by removing the static swap_map. It also yields some CPU savings and implements several cleanups. - "mm: memfd_luo: preserve file seals" (Pratyush Yadav) File seal preservation to LUO's memfd code - "mm: zswap: add per-memcg stat for incompressible pages" (Jiayuan Chen) Additional userspace stats reportng to zswap - "arch, mm: consolidate empty_zero_page" (Mike Rapoport) Some cleanups for our handling of ZERO_PAGE() and zero_pfn - "mm/kmemleak: Improve scan_should_stop() implementation" (Zhongqiu Han) A robustness improvement and some cleanups in the kmemleak code - "Improve khugepaged scan logic" (Vernon Yang) Improve khugepaged scan logic and reduce CPU consumption by prioritizing scanning tasks that access memory frequently - "Make KHO Stateless" (Jason Miu) Simplify Kexec Handover by transitioning KHO from an xarray-based metadata tracking system with serialization to a radix tree data structure that can be passed directly to the next kernel - "mm: vmscan: add PID and cgroup ID to vmscan tracepoints" (Thomas Ballasi and Steven Rostedt) Enhance vmscan's tracepointing - "mm: arch/shstk: Common shadow stack mapping helper and VM_NOHUGEPAGE" (Catalin Marinas) Cleanup for the shadow stack code: remove per-arch code in favour of a generic implementation - "Fix KASAN support for KHO restored vmalloc regions" (Pasha Tatashin) Fix a WARN() which can be emitted the KHO restores a vmalloc area - "mm: Remove stray references to pagevec" (Tal Zussman) Several cleanups, mainly udpating references to "struct pagevec", which became folio_batch three years ago - "mm: Eliminate fake head pages from vmemmap optimization" (Kiryl Shutsemau) Simplify the HugeTLB vmemmap optimization (HVO) by changing how tail pages encode their relationship to the head page - "mm/damon/core: improve DAMOS quota efficiency for core layer filters" (SeongJae Park) Improve two problematic behaviors of DAMOS that makes it less efficient when core layer filters are used - "mm/damon: strictly respect min_nr_regions" (SeongJae Park) Improve DAMON usability by extending the treatment of the min_nr_regions user-settable parameter - "mm/page_alloc: pcp locking cleanup" (Vlastimil Babka) The proper fix for a previously hotfixed SMP=n issue. Code simplifications and cleanups ensued - "mm: cleanups around unmapping / zapping" (David Hildenbrand) A bunch of cleanups around unmapping and zapping. Mostly simplifications, code movements, documentation and renaming of zapping functions - "support batched checking of the young flag for MGLRU" (Baolin Wang) Batched checking of the young flag for MGLRU. It's part cleanups; one benchmark shows large performance benefits for arm64 - "memcg: obj stock and slab stat caching cleanups" (Johannes Weiner) memcg cleanup and robustness improvements - "Allow order zero pages in page reporting" (Yuvraj Sakshith) Enhance free page reporting - it is presently and undesirably order-0 pages when reporting free memory. - "mm: vma flag tweaks" (Lorenzo Stoakes) Cleanup work following from the recent conversion of the VMA flags to a bitmap - "mm/damon: add optional debugging-purpose sanity checks" (SeongJae Park) Add some more developer-facing debug checks into DAMON core - "mm/damon: test and document power-of-2 min_region_sz requirement" (SeongJae Park) An additional DAMON kunit test and makes some adjustments to the addr_unit parameter handling - "mm/damon/core: make passed_sample_intervals comparisons overflow-safe" (SeongJae Park) Fix a hard-to-hit time overflow issue in DAMON core - "mm/damon: improve/fixup/update ratio calculation, test and documentation" (SeongJae Park) A batch of misc/minor improvements and fixups for DAMON - "mm: move vma_(kernel|mmu)_pagesize() out of hugetlb.c" (David Hildenbrand) Fix a possible issue with dax-device when CONFIG_HUGETLB=n. Some code movement was required. - "zram: recompression cleanups and tweaks" (Sergey Senozhatsky) A somewhat random mix of fixups, recompression cleanups and improvements in the zram code - "mm/damon: support multiple goal-based quota tuning algorithms" (SeongJae Park) Extend DAMOS quotas goal auto-tuning to support multiple tuning algorithms that users can select - "mm: thp: reduce unnecessary start_stop_khugepaged()" (Breno Leitao) Fix the khugpaged sysfs handling so we no longer spam the logs with reams of junk when starting/stopping khugepaged - "mm: improve map count checks" (Lorenzo Stoakes) Provide some cleanups and slight fixes in the mremap, mmap and vma code - "mm/damon: support addr_unit on default monitoring targets for modules" (SeongJae Park) Extend the use of DAMON core's addr_unit tunable - "mm: khugepaged cleanups and mTHP prerequisites" (Nico Pache) Cleanups to khugepaged and is a base for Nico's planned khugepaged mTHP support - "mm: memory hot(un)plug and SPARSEMEM cleanups" (David Hildenbrand) Code movement and cleanups in the memhotplug and sparsemem code - "mm: remove CONFIG_ARCH_ENABLE_MEMORY_HOTREMOVE and cleanup CONFIG_MIGRATION" (David Hildenbrand) Rationalize some memhotplug Kconfig support - "change young flag check functions to return bool" (Baolin Wang) Cleanups to change all young flag check functions to return bool - "mm/damon/sysfs: fix memory leak and NULL dereference issues" (Josh Law and SeongJae Park) Fix a few potential DAMON bugs - "mm/vma: convert vm_flags_t to vma_flags_t in vma code" (Lorenzo Stoakes) Convert a lot of the existing use of the legacy vm_flags_t data type to the new vma_flags_t type which replaces it. Mainly in the vma code. - "mm: expand mmap_prepare functionality and usage" (Lorenzo Stoakes) Expand the mmap_prepare functionality, which is intended to replace the deprecated f_op->mmap hook which has been the source of bugs and security issues for some time. Cleanups, documentation, extension of mmap_prepare into filesystem drivers - "mm/huge_memory: refactor zap_huge_pmd()" (Lorenzo Stoakes) Simplify and clean up zap_huge_pmd(). Additional cleanups around vm_normal_folio_pmd() and the softleaf functionality are performed. * tag 'mm-stable-2026-04-13-21-45' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (369 commits) mm: fix deferred split queue races during migration mm/khugepaged: fix issue with tracking lock mm/huge_memory: add and use has_deposited_pgtable() mm/huge_memory: add and use normal_or_softleaf_folio_pmd() mm: add softleaf_is_valid_pmd_entry(), pmd_to_softleaf_folio() mm/huge_memory: separate out the folio part of zap_huge_pmd() mm/huge_memory: use mm instead of tlb->mm mm/huge_memory: remove unnecessary sanity checks mm/huge_memory: deduplicate zap deposited table call mm/huge_memory: remove unnecessary VM_BUG_ON_PAGE() mm/huge_memory: add a common exit path to zap_huge_pmd() mm/huge_memory: handle buggy PMD entry in zap_huge_pmd() mm/huge_memory: have zap_huge_pmd return a boolean, add kdoc mm/huge: avoid big else branch in zap_huge_pmd() mm/huge_memory: simplify vma_is_specal_huge() mm: on remap assert that input range within the proposed VMA mm: add mmap_action_map_kernel_pages[_full]() uio: replace deprecated mmap hook with mmap_prepare in uio_info drivers: hv: vmbus: replace deprecated mmap hook with mmap_prepare mm: allow handling of stacked mmap_prepare hooks in more drivers ... |
||
|
|
5bdb4078e1 |
sched_ext: Changes for v7.1
- Cgroup sub-scheduler groundwork. Multiple BPF schedulers can be attached to cgroups and the dispatch path is made hierarchical. This involves substantial restructuring of the core dispatch, bypass, watchdog, and dump paths to be per-scheduler, along with new infrastructure for scheduler ownership enforcement, lifecycle management, and cgroup subtree iteration. The enqueue path is not yet updated and will follow in a later cycle. - scx_bpf_dsq_reenq() generalized to support any DSQ including remote local DSQs and user DSQs. Built on top of this, SCX_ENQ_IMMED guarantees that tasks dispatched to local DSQs either run immediately or get reenqueued back through ops.enqueue(), giving schedulers tighter control over queueing latency. Also useful for opportunistic CPU sharing across sub-schedulers. - ops.dequeue() was only invoked when the core knew a task was in BPF data structures, missing scheduling property change events and skipping callbacks for non-local DSQ dispatches from ops.select_cpu(). Fixed to guarantee exactly one ops.dequeue() call when a task leaves BPF scheduler custody. - Kfunc access validation moved from runtime to BPF verifier time, removing runtime mask enforcement. - Idle SMT sibling prioritization in the idle CPU selection path. - Documentation, selftest, and tooling updates. Misc bug fixes and cleanups. - Merges from tip/sched-core, cgroup/for-7.1, and for-7.0-fixes to resolve dependencies and conflicts for the above changes. -----BEGIN PGP SIGNATURE----- iIQEABYKACwWIQTfIjM1kS57o3GsC/uxYfJx3gVYGQUCad0uaA4cdGpAa2VybmVs Lm9yZwAKCRCxYfJx3gVYGbktAQD2HrKdydyEfefz/n4mNpIXh/DFYX49NgKYcgUh sKy4ngD/Sy7nAZS2zwM+36PN6jBV7+cfuoaiKPgCstPFeGsvPwU= =fsgj -----END PGP SIGNATURE----- Merge tag 'sched_ext-for-7.1' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/sched_ext Pull sched_ext updates from Tejun Heo: - cgroup sub-scheduler groundwork Multiple BPF schedulers can be attached to cgroups and the dispatch path is made hierarchical. This involves substantial restructuring of the core dispatch, bypass, watchdog, and dump paths to be per-scheduler, along with new infrastructure for scheduler ownership enforcement, lifecycle management, and cgroup subtree iteration The enqueue path is not yet updated and will follow in a later cycle - scx_bpf_dsq_reenq() generalized to support any DSQ including remote local DSQs and user DSQs Built on top of this, SCX_ENQ_IMMED guarantees that tasks dispatched to local DSQs either run immediately or get reenqueued back through ops.enqueue(), giving schedulers tighter control over queueing latency Also useful for opportunistic CPU sharing across sub-schedulers - ops.dequeue() was only invoked when the core knew a task was in BPF data structures, missing scheduling property change events and skipping callbacks for non-local DSQ dispatches from ops.select_cpu() Fixed to guarantee exactly one ops.dequeue() call when a task leaves BPF scheduler custody - Kfunc access validation moved from runtime to BPF verifier time, removing runtime mask enforcement - Idle SMT sibling prioritization in the idle CPU selection path - Documentation, selftest, and tooling updates. Misc bug fixes and cleanups * tag 'sched_ext-for-7.1' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/sched_ext: (134 commits) tools/sched_ext: Add explicit cast from void* in RESIZE_ARRAY() sched_ext: Make string params of __ENUM_set() const tools/sched_ext: Kick home CPU for stranded tasks in scx_qmap sched_ext: Drop spurious warning on kick during scheduler disable sched_ext: Warn on task-based SCX op recursion sched_ext: Rename scx_kf_allowed_on_arg_tasks() to scx_kf_arg_task_ok() sched_ext: Remove runtime kfunc mask enforcement sched_ext: Add verifier-time kfunc context filter sched_ext: Drop redundant rq-locked check from scx_bpf_task_cgroup() sched_ext: Decouple kfunc unlocked-context check from kf_mask sched_ext: Fix ops.cgroup_move() invocation kf_mask and rq tracking sched_ext: Track @p's rq lock across set_cpus_allowed_scx -> ops.set_cpumask sched_ext: Add select_cpu kfuncs to scx_kfunc_ids_unlocked sched_ext: Drop TRACING access to select_cpu kfuncs selftests/sched_ext: Fix wrong DSQ ID in peek_dsq error message sched_ext: Documentation: improve accuracy of task lifecycle pseudo-code selftests/sched_ext: Improve runner error reporting for invalid arguments sched_ext: Documentation: Fix scx_bpf_move_to_local kfunc name sched_ext: Documentation: Add ops.dequeue() to task lifecycle tools/sched_ext: Fix off-by-one in scx_sdt payload zeroing ... |
||
|
|
1c3b68f0d5 |
Scheduler changes for v7.1:
Fair scheduling updates:
- Skip SCHED_IDLE rq for SCHED_IDLE tasks (Christian Loehle)
- Remove superfluous rcu_read_lock() in the wakeup path (K Prateek Nayak)
- Simplify the entry condition for update_idle_cpu_scan() (K Prateek Nayak)
- Simplify SIS_UTIL handling in select_idle_cpu() (K Prateek Nayak)
- Avoid overflow in enqueue_entity() (K Prateek Nayak)
- Update overutilized detection (Vincent Guittot)
- Prevent negative lag increase during delayed dequeue (Vincent Guittot)
- Clear buddies for preempt_short (Vincent Guittot)
- Implement more complex proportional newidle balance (Peter Zijlstra)
- Increase weight bits for avg_vruntime (Peter Zijlstra)
- Use full weight to __calc_delta() (Peter Zijlstra)
RT and DL scheduling updates:
- Fix incorrect schedstats for rt and dl thread (Dengjun Su)
- Skip group schedulable check with rt_group_sched=0 (Michal Koutný)
- Move group schedulability check to sched_rt_global_validate()
(Michal Koutný)
- Add reporting of runtime left & abs deadline to sched_getattr()
for DEADLINE tasks (Tommaso Cucinotta)
Scheduling topology updates by K Prateek Nayak:
- Compute sd_weight considering cpuset partitions
- Extract "imb_numa_nr" calculation into a separate helper
- Allocate per-CPU sched_domain_shared in s_data
- Switch to assigning "sd->shared" from s_data
- Remove sched_domain_shared allocation with sd_data
Energy-aware scheduling updates:
- Filter false overloaded_group case for EAS (Vincent Guittot)
- PM: EM: Switch to rcu_dereference_all() in wakeup path
(Dietmar Eggemann)
Infrastructure updates:
- Replace use of system_unbound_wq with system_dfl_wq (Marco Crivellari)
Proxy scheduling updates by John Stultz:
- Make class_schedulers avoid pushing current, and get rid of proxy_tag_curr()
- Minimise repeated sched_proxy_exec() checking
- Fix potentially missing balancing with Proxy Exec
- Fix and improve task::blocked_on et al handling
- Add assert_balance_callbacks_empty() helper
- Add logic to zap balancing callbacks if we pick again
- Move attach_one_task() and attach_task() helpers to sched.h
- Handle blocked-waiter migration (and return migration)
- Add K Prateek Nayak to scheduler reviewers for proxy execution
Misc cleanups and fixes by John Stultz, Joseph Salisbury,
Peter Zijlstra, K Prateek Nayak, Michal Koutný, Randy Dunlap,
Shrikanth Hegde, Vincent Guittot, Zhan Xusheng, Xie Yuanbin
and Vincent Guittot.
Signed-off-by: Ingo Molnar <mingo@kernel.org>
-----BEGIN PGP SIGNATURE-----
iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAmncq4oRHG1pbmdvQGtl
cm5lbC5vcmcACgkQEnMQ0APhK1gxoA/8DD0SsMhBLaZLi+LAdY5fD6rGjOLGBtxz
NgwN8CAvPIFH7qFzPjAk7WtVXoKjF62sRDFvUaBEsliflRzOkBkYr3SnUYRORyBB
VRj7D6ymuWhxnhYsy8+Hviu/93c3GyEO59IYU0wIShxBzYBxqDfNxWvEUQte2Cin
1yFy4CICJeGpsBv9Ev+0LtesxtF5bnaioawbAYcpc2IdYsK+nsMKRvkwg1YSdLmh
v9+vIYuQBrclBn3OR7dsv2krBev5qodYtDZFwdJagE+6aaQv2zhWIfhetPpkzwrq
zhuzVZH+E9404Pn5EqJaw7KmU9eyBBwIUVqBaQfH73eSe5PY0tiSrpPU9foocUjo
4Td9sL11SLzjwpM4bIijW0ezZY8y+4Q0A21GwdcwAx3LPstXcF5GIjQ76dVFPRKN
Unbt6o+9O9NvMLg8CLzwonlFzOoLOrL+5eKJs+caOuOikT+cXnBQrukgB4ck3RAD
PIVD8XnufJTCKiDvx2vravLXsWiA2cg7citVsgc8y5FBcdhzv3YVqXd/lGkqg+09
7rVqE6NRDlkk4G4KZACTK45YVcVwXhQlMU/qiS0IduHdD0NtL9DPnQvdfzQWQehO
30cJ5vZ+fqbHspJ8AdPuqntUyfEvPTCbCT4Ou/AEcvO8NRQu2gplcq9mF4U46WZG
GBPWXvGHzM8=
=NjyS
-----END PGP SIGNATURE-----
Merge tag 'sched-core-2026-04-13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull scheduler updates from Ingo Molnar:
"Fair scheduling updates:
- Skip SCHED_IDLE rq for SCHED_IDLE tasks (Christian Loehle)
- Remove superfluous rcu_read_lock() in the wakeup path (K Prateek Nayak)
- Simplify the entry condition for update_idle_cpu_scan() (K Prateek Nayak)
- Simplify SIS_UTIL handling in select_idle_cpu() (K Prateek Nayak)
- Avoid overflow in enqueue_entity() (K Prateek Nayak)
- Update overutilized detection (Vincent Guittot)
- Prevent negative lag increase during delayed dequeue (Vincent Guittot)
- Clear buddies for preempt_short (Vincent Guittot)
- Implement more complex proportional newidle balance (Peter Zijlstra)
- Increase weight bits for avg_vruntime (Peter Zijlstra)
- Use full weight to __calc_delta() (Peter Zijlstra)
RT and DL scheduling updates:
- Fix incorrect schedstats for rt and dl thread (Dengjun Su)
- Skip group schedulable check with rt_group_sched=0 (Michal Koutný)
- Move group schedulability check to sched_rt_global_validate()
(Michal Koutný)
- Add reporting of runtime left & abs deadline to sched_getattr()
for DEADLINE tasks (Tommaso Cucinotta)
Scheduling topology updates by K Prateek Nayak:
- Compute sd_weight considering cpuset partitions
- Extract "imb_numa_nr" calculation into a separate helper
- Allocate per-CPU sched_domain_shared in s_data
- Switch to assigning "sd->shared" from s_data
- Remove sched_domain_shared allocation with sd_data
Energy-aware scheduling updates:
- Filter false overloaded_group case for EAS (Vincent Guittot)
- PM: EM: Switch to rcu_dereference_all() in wakeup path
(Dietmar Eggemann)
Infrastructure updates:
- Replace use of system_unbound_wq with system_dfl_wq (Marco Crivellari)
Proxy scheduling updates by John Stultz:
- Make class_schedulers avoid pushing current, and get rid of proxy_tag_curr()
- Minimise repeated sched_proxy_exec() checking
- Fix potentially missing balancing with Proxy Exec
- Fix and improve task::blocked_on et al handling
- Add assert_balance_callbacks_empty() helper
- Add logic to zap balancing callbacks if we pick again
- Move attach_one_task() and attach_task() helpers to sched.h
- Handle blocked-waiter migration (and return migration)
- Add K Prateek Nayak to scheduler reviewers for proxy execution
Misc cleanups and fixes by John Stultz, Joseph Salisbury, Peter
Zijlstra, K Prateek Nayak, Michal Koutný, Randy Dunlap, Shrikanth
Hegde, Vincent Guittot, Zhan Xusheng, Xie Yuanbin and Vincent Guittot"
* tag 'sched-core-2026-04-13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (46 commits)
sched/eevdf: Clear buddies for preempt_short
sched/rt: Cleanup global RT bandwidth functions
sched/rt: Move group schedulability check to sched_rt_global_validate()
sched/rt: Skip group schedulable check with rt_group_sched=0
sched/fair: Avoid overflow in enqueue_entity()
sched: Use u64 for bandwidth ratio calculations
sched/fair: Prevent negative lag increase during delayed dequeue
sched/fair: Use sched_energy_enabled()
sched: Handle blocked-waiter migration (and return migration)
sched: Move attach_one_task and attach_task helpers to sched.h
sched: Add logic to zap balance callbacks if we pick again
sched: Add assert_balance_callbacks_empty helper
sched/locking: Add special p->blocked_on==PROXY_WAKING value for proxy return-migration
sched: Fix modifying donor->blocked on without proper locking
locking: Add task::blocked_lock to serialize blocked_on state
sched: Fix potentially missing balancing with Proxy Exec
sched: Minimise repeated sched_proxy_exec() checking
sched: Make class_schedulers avoid pushing current, and get rid of proxy_tag_curr()
MAINTAINERS: Add K Prateek Nayak to scheduler reviewers
sched/core: Get this cpu once in ttwu_queue_cond()
...
|
||
|
|
f21f7b5162 |
Update to the VDSO subsystem:
- Make the handling of compat functions consistent and more robust
- Rework the underlying data store so that it is dynamically
allocated, which allows the conversion of the last holdout SPARC64
to the generic VDSO implementation
- Rework the SPARC64 VDSO to utilize the generic implementation
- Mop up the left overs of the non-generic VDSO support in the core
code.
- Expand the VDSO selftest and make them more robust
- Allow time namespaces to be enabled independently of the generic
VDSO support, which was not possible before due to SPARC64 not
using it.
- Various cleanups and improvements in the related code.
-----BEGIN PGP SIGNATURE-----
iQJEBAABCgAuFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAmnb0v8QHHRnbHhAa2Vy
bmVsLm9yZwAKCRCmGPVMDXSYocfqD/9ywgnvwRH6B612mY4PI3qCbLHs6n9f78aH
YwyXmmfBZ5vt1ZtptHD+BAxiIMm9GC+/exdj5zhcOWucnBVhorcloE6evxhkJAMn
RhTQFKkEmcA/UV2Yfct9r+33kgZRyu4IIul4J7hgn2o5T1BqwZbOil0W/O5adr5P
MDLxjT1OLV80ZZWI9qbWcR/aR7W7sHcdwfVPPqjhombRY7f391Mo3dZeM5C2y55x
8TXCEqVpN1RJzFinWEgQN7QpP4OmF0rRuXSrDQpkH6pk/+RSqNlT/QGG7MJtmCQR
E6CeBjNRUn318KiroaGyTKlM9xsL3gNoiCY24ZTwzZxx3g5gSAR3KTCTJhQU0hpu
Svxj+ksqEAyW7fAOIsbce6W8fUPKC2KM+juXgPKcqZ5hjE2fALD+eEYMlq00jSiu
sj71007cM9tZKOXPdWs3Fv7AY2Yj7iiRiRz9gv1wqS1z7ybxiaFjxjLYYakej0tr
rmwBDEGhNow7msZZttr01BRZk9hDUWfIiJtL+0BrgRLNzst2A7WoagtZ2s0Z7Psl
RjtWgYNBDJ878xK0J+Djqb9TyLraGWZShIIna9uYCAJX9i954xfKJ//NOnUkZhcl
jslDLHhdttyJ+TmgIsc1ntUGvYvHqH5ywQpyDfWepMKyIYdaJLHOr2K6bwFnGHdw
uocXvLrkXw==
=8ixX
-----END PGP SIGNATURE-----
Merge tag 'timers-vdso-2026-04-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull vdso updates from Thomas Gleixner:
- Make the handling of compat functions consistent and more robust
- Rework the underlying data store so that it is dynamically allocated,
which allows the conversion of the last holdout SPARC64 to the
generic VDSO implementation
- Rework the SPARC64 VDSO to utilize the generic implementation
- Mop up the left overs of the non-generic VDSO support in the core
code
- Expand the VDSO selftest and make them more robust
- Allow time namespaces to be enabled independently of the generic VDSO
support, which was not possible before due to SPARC64 not using it
- Various cleanups and improvements in the related code
* tag 'timers-vdso-2026-04-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (51 commits)
timens: Use task_lock guard in timens_get*()
timens: Use mutex guard in proc_timens_set_offset()
timens: Simplify some calls to put_time_ns()
timens: Add a __free() wrapper for put_time_ns()
timens: Remove dependency on the vDSO
vdso/timens: Move functions to new file
selftests: vDSO: vdso_test_correctness: Add a test for time()
selftests: vDSO: vdso_test_correctness: Use facilities from parse_vdso.c
selftests: vDSO: vdso_test_correctness: Handle different tv_usec types
selftests: vDSO: vdso_test_correctness: Drop SYS_getcpu fallbacks
selftests: vDSO: vdso_test_gettimeofday: Remove nolibc checks
Revert "selftests: vDSO: parse_vdso: Use UAPI headers instead of libc headers"
random: vDSO: Remove ifdeffery
random: vDSO: Trim vDSO includes
vdso/datapage: Trim down unnecessary includes
vdso/datapage: Remove inclusion of gettimeofday.h
vdso/helpers: Explicitly include vdso/processor.h
vdso/gettimeofday: Add explicit includes
random: vDSO: Add explicit includes
MIPS: vdso: Explicitly include asm/vdso/vdso.h
...
|
||
|
|
5d0d362330 |
Kbuild/Kconfig updates for 7.1
Kbuild changes
==============
* tools/build: Reject unexpected values for LLVM=
* kbuild: uapi: remove usage of toolchain headers
* kbuild: Switch from '-fms-extensions' to '-fms-anonymous-structs'
when available (currently: clang >= 23.0.0)
* kbuild: Reduce the number of compiler-generated suffixes for clang
thin-lto build
* kbuild: reduce output spam ("GEN Makefile") when building out of tree
* check-uapi: improve portability for testing headers
* uapi: also test UAPI headers against C++ compilers
* kbuild: vdso_install: drop build ID architecture allow-list
* checksyscalls: only run when necessary
* Documentation: kbuild: Update the debug information notes in
reproducible-builds.rst
* kconfig: forbid multiple entries with the same symbol in a choice
* kbuild: expand inlining hints with -fdiagnostics-show-inlining-chain
Kconfig changes
===============
* kconfig: Error out on duplicated kconfig inclusion
Cc: Alexander Coffin <alex@cyberialabs.net>
Cc: Ard Biesheuvel <ardb@kernel.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Bill Wendling <morbo@google.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Dodji Seketeli <dodji@seketeli.org>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Helge Deller <deller@gmx.de>
Cc: John Moon <john@jmoon.dev>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Josh Poimboeuf <jpoimboe@kernel.org>
Cc: Justin Stitt <justinstitt@google.com>
Cc: Kees Cook <kees@kernel.org>
Cc: Masahiro Yamada <masahiroy@kernel.org>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Nick Desaulniers <nick.desaulniers+lkml@gmail.com>
Cc: Shuah Khan <skhan@linuxfoundation.org>
Cc: Song Liu <song@kernel.org>
Cc: Thomas Weißschuh <linux@weissschuh.net>
Cc: Yonghong Song <yonghong.song@linux.dev>
Cc: kernel-team@fb.com
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-efi@vger.kernel.org
Cc: linux-hexagon@vger.kernel.org
Cc: linux-kbuild@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: linux-parisc@vger.kernel.org
Cc: linux-s390@vger.kernel.org
Cc: linuxppc-dev@lists.ozlabs.org
Cc: llvm@lists.linux.dev
Cc: loongarch@lists.linux.dev
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEh0E3p4c3JKeBvsLGB1IKcBYmEmkFAmnatXEACgkQB1IKcBYm
Eml6ww/9Hja/CTBoF+ZgMXN/9VcQhzNonPXIp8IGarX3+LCPh8RfUEywaOLnvR/U
fE6FEIcwDw0M5drS0hEH7t1Xowc6AhDX05lKBj3aGBgn6JqGGQFAfnysQd5z0cwW
Y/8+bMm+Y2XQ/xZNa0J92+3evPO04U7+2kCSVD051ZhRdmK4n290u4YsTgoKs7Fm
1SBIr+tsFa1zMOG6r+J4uCLxXNnujQ5XcejnlmdBM0o19f9kttvVkYKuBVdXPHf4
JaTLti22Td8SklDKMmkSRg+Ul/Wh2x8D8tP98VQAJe5B3f4Uk6YAu1BMrbQaX5Rk
5SsGbhBEeOTDc4qCaS8DS+FJQU6T9W9cf/9+tBY510fXxAIonz5cPB06q5xeJWCd
IkVB3KpmaVxo2B54Cy4b/fvd1J3VMkmFjBQWMNwkq6cnCG1ZK/b6Jmvh9BQSNctl
IYJxWKBjlddrMuvZEMI0CewVq4GmarTLiOpweghDg8OYqya4E6PfOUGnaWMrWT5c
2E8ZMnQSb68yFUaXK+Sy+Pw2Nig/VvxCUxHdaarHi/RmGeoN5dMGfjj/gGZvZrHt
NUGt6qe+X62P0ZAUR8p+GpRcU3+p3uLhCyO7dkwqgLVZTnaXy5XtUQ/uyh2G60hv
eJlFfrn8QXplvzrxcSTJya6PunoIhuWh2BfKhf0RDymJTPyMbBc=
=+wTC
-----END PGP SIGNATURE-----
Merge tag 'kbuild-7.1-1' of git://git.kernel.org/pub/scm/linux/kernel/git/kbuild/linux
Pull Kbuild/Kconfig updates from Nicolas Schier:
"Kbuild:
- reject unexpected values for LLVM=
- uapi: remove usage of toolchain headers
- switch from '-fms-extensions' to '-fms-anonymous-structs' when
available (currently: clang >= 23.0.0)
- reduce the number of compiler-generated suffixes for clang thin-lto
build
- reduce output spam ("GEN Makefile") when building out of tree
- improve portability for testing headers
- also test UAPI headers against C++ compilers
- drop build ID architecture allow-list in vdso_install
- only run checksyscalls when necessary
- update the debug information notes in reproducible-builds.rst
- expand inlining hints with -fdiagnostics-show-inlining-chain
Kconfig:
- forbid multiple entries with the same symbol in a choice
- error out on duplicated kconfig inclusion"
* tag 'kbuild-7.1-1' of git://git.kernel.org/pub/scm/linux/kernel/git/kbuild/linux: (35 commits)
kbuild: expand inlining hints with -fdiagnostics-show-inlining-chain
kconfig: forbid multiple entries with the same symbol in a choice
Documentation: kbuild: Update the debug information notes in reproducible-builds.rst
checksyscalls: move instance functionality into generic code
checksyscalls: only run when necessary
checksyscalls: fail on all intermediate errors
checksyscalls: move path to reference table to a variable
kbuild: vdso_install: drop build ID architecture allow-list
kbuild: vdso_install: gracefully handle images without build ID
kbuild: vdso_install: hide readelf warnings
kbuild: vdso_install: split out the readelf invocation
kbuild: uapi: also test UAPI headers against C++ compilers
kbuild: uapi: provide a C++ compatible dummy definition of NULL
kbuild: uapi: handle UML in architecture-specific exclusion lists
kbuild: uapi: move all include path flags together
kbuild: uapi: move some compiler arguments out of the command definition
check-uapi: use dummy libc includes
check-uapi: honor ${CROSS_COMPILE} setting
check-uapi: link into shared objects
kbuild: reduce output spam when building out of tree
...
|
||
|
|
4793dae01f |
Driver core changes for 7.1-rc1
- debugfs:
- Fix NULL pointer dereference in debugfs_create_str()
- Fix misplaced EXPORT_SYMBOL_GPL for debugfs_create_str()
- Fix soundwire debugfs NULL pointer dereference from uninitialized
firmware_file
- device property:
- Make fwnode flags modifications thread safe; widen the field to
unsigned long and use set_bit() / clear_bit() based accessors
- Document how to check for the property presence
- devres:
- Separate struct devres_node from its "subclasses" (struct devres,
struct devres_group); give struct devres_node its own release and
free callbacks for per-type dispatch
- Introduce struct devres_action for devres actions, avoiding the
ARCH_DMA_MINALIGN alignment overhead of struct devres
- Export struct devres_node and its init/add/remove/dbginfo
primitives for use by Rust Devres<T>
- Fix missing node debug info in devm_krealloc()
- Use guard(spinlock_irqsave) where applicable; consolidate unlock
paths in devres_release_group()
- driver_override:
- Convert PCI, WMI, vdpa, s390/cio, s390/ap, and fsl-mc to the
generic driver_override infrastructure, replacing per-bus
driver_override strings, sysfs attributes, and match logic; fixes
a potential UAF from unsynchronized access to driver_override in
bus match() callbacks
- Simplify __device_set_driver_override() logic
- kernfs:
- Send IN_DELETE_SELF and IN_IGNORED inotify events on kernfs
file and directory removal
- Add corresponding selftests for memcg
- platform:
- Allow attaching software nodes when creating platform devices via
a new 'swnode' field in struct platform_device_info
- Add kerneldoc for struct platform_device_info
- software node:
- Move software node initialization from postcore_initcall() to
driver_init(), making it available early in the boot process
- Move kernel_kobj initialization (ksysfs_init) earlier to support
the above
- Remove software_node_exit(); dead code in a built-in unit
- SoC:
- Introduce of_machine_read_compatible() and of_machine_read_model()
OF helpers and export soc_attr_read_machine() to replace direct
accesses to of_root from SoC drivers; also enables
CONFIG_COMPILE_TEST coverage for these drivers
- sysfs:
- Constify attribute group array pointers to
'const struct attribute_group *const *' in sysfs functions,
device_add_groups() / device_remove_groups(), and struct class
- Rust:
- Devres:
- Embed struct devres_node directly in Devres<T> instead of going
through devm_add_action(), avoiding the extra allocation and
the unnecessary ARCH_DMA_MINALIGN alignment
- I/O:
- Turn IoCapable from a marker trait into a functional trait
carrying the raw I/O accessor implementation (io_read /
io_write), providing working defaults for the per-type Io
methods
- Add RelaxedMmio wrapper type, making relaxed accessors usable
in code generic over the Io trait
- Remove overloaded per-type Io methods and per-backend macros
from Mmio and PCI ConfigSpace
- I/O (Register):
- Add IoLoc trait and generic read/write/update methods to the Io
trait, making I/O operations parameterizable by typed locations
- Add register! macro for defining hardware register types with
typed bitfield accessors backed by Bounded values; supports
direct, relative, and array register addressing
- Add write_reg() / try_write_reg() and LocatedRegister trait
- Update PCI sample driver to demonstrate the register! macro
Example:
```
register! {
/// UART control register.
CTRL(u32) @ 0x18 {
/// Receiver enable.
19:19 rx_enable => bool;
/// Parity configuration.
14:13 parity ?=> Parity;
}
/// FIFO watermark and counter register.
WATER(u32) @ 0x2c {
/// Number of datawords in the receive FIFO.
26:24 rx_count;
/// RX interrupt threshold.
17:16 rx_water;
}
}
impl WATER {
fn rx_above_watermark(&self) -> bool {
self.rx_count() > self.rx_water()
}
}
fn init(bar: &pci::Bar<BAR0_SIZE>) {
let water = WATER::zeroed()
.with_const_rx_water::<1>(); // > 3 would not compile
bar.write_reg(water);
let ctrl = CTRL::zeroed()
.with_parity(Parity::Even)
.with_rx_enable(true);
bar.write_reg(ctrl);
}
fn handle_rx(bar: &pci::Bar<BAR0_SIZE>) {
if bar.read(WATER).rx_above_watermark() {
// drain the FIFO
}
}
fn set_parity(bar: &pci::Bar<BAR0_SIZE>, parity: Parity) {
bar.update(CTRL, |r| r.with_parity(parity));
}
```
- IRQ:
- Move 'static bounds from where clauses to trait declarations
for IRQ handler traits
- Misc:
- Enable the generic_arg_infer Rust feature
- Extend Bounded with shift operations, single-bit bool conversion,
and const get()
- Misc:
- Make deferred_probe_timeout default a Kconfig option
- Drop auxiliary_dev_pm_ops; the PM core falls back to driver PM
callbacks when no bus type PM ops are set
- Add conditional guard support for device_lock()
- Add ksysfs.c to the DRIVER CORE MAINTAINERS entry
- Fix kernel-doc warnings in base.h
- Fix stale reference to memory_block_add_nid() in documentation
-----BEGIN PGP SIGNATURE-----
iHUEABYKAB0WIQS2q/xV6QjXAdC7k+1FlHeO1qrKLgUCadl5SwAKCRBFlHeO1qrK
LpjDAQCSG3vYznwrngfpmRU5bCB9sdUy/pZiX5px1357+amJkwEA9LgIVQvtHAZW
ZXcQ7Jr+mR3mJEdlatbkWHp3w1VHqAQ=
=y1DV
-----END PGP SIGNATURE-----
Merge tag 'driver-core-7.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/driver-core/driver-core
Pull driver core updates from Danilo Krummrich:
"debugfs:
- Fix NULL pointer dereference in debugfs_create_str()
- Fix misplaced EXPORT_SYMBOL_GPL for debugfs_create_str()
- Fix soundwire debugfs NULL pointer dereference from uninitialized
firmware_file
device property:
- Make fwnode flags modifications thread safe; widen the field to
unsigned long and use set_bit() / clear_bit() based accessors
- Document how to check for the property presence
devres:
- Separate struct devres_node from its "subclasses" (struct devres,
struct devres_group); give struct devres_node its own release and
free callbacks for per-type dispatch
- Introduce struct devres_action for devres actions, avoiding the
ARCH_DMA_MINALIGN alignment overhead of struct devres
- Export struct devres_node and its init/add/remove/dbginfo
primitives for use by Rust Devres<T>
- Fix missing node debug info in devm_krealloc()
- Use guard(spinlock_irqsave) where applicable; consolidate unlock
paths in devres_release_group()
driver_override:
- Convert PCI, WMI, vdpa, s390/cio, s390/ap, and fsl-mc to the
generic driver_override infrastructure, replacing per-bus
driver_override strings, sysfs attributes, and match logic; fixes a
potential UAF from unsynchronized access to driver_override in bus
match() callbacks
- Simplify __device_set_driver_override() logic
kernfs:
- Send IN_DELETE_SELF and IN_IGNORED inotify events on kernfs file
and directory removal
- Add corresponding selftests for memcg
platform:
- Allow attaching software nodes when creating platform devices via a
new 'swnode' field in struct platform_device_info
- Add kerneldoc for struct platform_device_info
software node:
- Move software node initialization from postcore_initcall() to
driver_init(), making it available early in the boot process
- Move kernel_kobj initialization (ksysfs_init) earlier to support
the above
- Remove software_node_exit(); dead code in a built-in unit
SoC:
- Introduce of_machine_read_compatible() and of_machine_read_model()
OF helpers and export soc_attr_read_machine() to replace direct
accesses to of_root from SoC drivers; also enables
CONFIG_COMPILE_TEST coverage for these drivers
sysfs:
- Constify attribute group array pointers to
'const struct attribute_group *const *' in sysfs functions,
device_add_groups() / device_remove_groups(), and struct class
Rust:
- Devres:
- Embed struct devres_node directly in Devres<T> instead of going
through devm_add_action(), avoiding the extra allocation and the
unnecessary ARCH_DMA_MINALIGN alignment
- I/O:
- Turn IoCapable from a marker trait into a functional trait
carrying the raw I/O accessor implementation (io_read /
io_write), providing working defaults for the per-type Io
methods
- Add RelaxedMmio wrapper type, making relaxed accessors usable in
code generic over the Io trait
- Remove overloaded per-type Io methods and per-backend macros
from Mmio and PCI ConfigSpace
- I/O (Register):
- Add IoLoc trait and generic read/write/update methods to the Io
trait, making I/O operations parameterizable by typed locations
- Add register! macro for defining hardware register types with
typed bitfield accessors backed by Bounded values; supports
direct, relative, and array register addressing
- Add write_reg() / try_write_reg() and LocatedRegister trait
- Update PCI sample driver to demonstrate the register! macro
Example:
```
register! {
/// UART control register.
CTRL(u32) @ 0x18 {
/// Receiver enable.
19:19 rx_enable => bool;
/// Parity configuration.
14:13 parity ?=> Parity;
}
/// FIFO watermark and counter register.
WATER(u32) @ 0x2c {
/// Number of datawords in the receive FIFO.
26:24 rx_count;
/// RX interrupt threshold.
17:16 rx_water;
}
}
impl WATER {
fn rx_above_watermark(&self) -> bool {
self.rx_count() > self.rx_water()
}
}
fn init(bar: &pci::Bar<BAR0_SIZE>) {
let water = WATER::zeroed()
.with_const_rx_water::<1>(); // > 3 would not compile
bar.write_reg(water);
let ctrl = CTRL::zeroed()
.with_parity(Parity::Even)
.with_rx_enable(true);
bar.write_reg(ctrl);
}
fn handle_rx(bar: &pci::Bar<BAR0_SIZE>) {
if bar.read(WATER).rx_above_watermark() {
// drain the FIFO
}
}
fn set_parity(bar: &pci::Bar<BAR0_SIZE>, parity: Parity) {
bar.update(CTRL, |r| r.with_parity(parity));
}
```
- IRQ:
- Move 'static bounds from where clauses to trait declarations for
IRQ handler traits
- Misc:
- Enable the generic_arg_infer Rust feature
- Extend Bounded with shift operations, single-bit bool
conversion, and const get()
Misc:
- Make deferred_probe_timeout default a Kconfig option
- Drop auxiliary_dev_pm_ops; the PM core falls back to driver PM
callbacks when no bus type PM ops are set
- Add conditional guard support for device_lock()
- Add ksysfs.c to the DRIVER CORE MAINTAINERS entry
- Fix kernel-doc warnings in base.h
- Fix stale reference to memory_block_add_nid() in documentation"
* tag 'driver-core-7.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/driver-core/driver-core: (67 commits)
bus: fsl-mc: use generic driver_override infrastructure
s390/ap: use generic driver_override infrastructure
s390/cio: use generic driver_override infrastructure
vdpa: use generic driver_override infrastructure
platform/wmi: use generic driver_override infrastructure
PCI: use generic driver_override infrastructure
driver core: make software nodes available earlier
software node: remove software_node_exit()
kernel: ksysfs: initialize kernel_kobj earlier
MAINTAINERS: add ksysfs.c to the DRIVER CORE entry
drivers/base/memory: fix stale reference to memory_block_add_nid()
device property: Document how to check for the property presence
soundwire: debugfs: initialize firmware_file to empty string
debugfs: fix placement of EXPORT_SYMBOL_GPL for debugfs_create_str()
debugfs: check for NULL pointer in debugfs_create_str()
driver core: Make deferred_probe_timeout default a Kconfig option
driver core: simplify __device_set_driver_override() clearing logic
driver core: auxiliary bus: Drop auxiliary_dev_pm_ops
device property: Make modifications of fwnode "flags" thread safe
rust: devres: embed struct devres_node directly
...
|
||
|
|
d568788baa |
hardening updates for v7.1-rc1
- randomize_kstack: Improve implementation across arches (Ryan Roberts) - lkdtm/fortify: Drop unneeded FORTIFY_STR_OBJECT test - refcount: Remove unused __signed_wrap function annotations -----BEGIN PGP SIGNATURE----- iHUEABYKAB0WIQRSPkdeREjth1dHnSE2KwveOeQkuwUCad16PwAKCRA2KwveOeQk u7crAP4qz8gXCjes76KsZm/YQS8PtOG5JroAVu5Oa4ohw0RfaQD+K/XLow1plcNF 4Bi8zSuv2ifcLysh9qEAbx5+wcHijgo= =woB3 -----END PGP SIGNATURE----- Merge tag 'hardening-v7.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux Pull hardening updates from Kees Cook: - randomize_kstack: Improve implementation across arches (Ryan Roberts) - lkdtm/fortify: Drop unneeded FORTIFY_STR_OBJECT test - refcount: Remove unused __signed_wrap function annotations * tag 'hardening-v7.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: lkdtm/fortify: Drop unneeded FORTIFY_STR_OBJECT test refcount: Remove unused __signed_wrap function annotations randomize_kstack: Unify random source across arches randomize_kstack: Maintain kstack_offset per task |
||
|
|
ef3da345cc |
vfs-7.1-rc1.misc
Please consider pulling these changes from the signed vfs-7.1-rc1.misc tag.
Thanks!
Christian
-----BEGIN PGP SIGNATURE-----
iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCadjZCwAKCRCRxhvAZXjc
ohhBAQCAmQMlMRAXAgUZFYMTZpeQlcujP5rv+/vT2Tf/xS76YwD/dRDaw1FH294+
qtk/Z1NjleNixzE2sld1K9J32NxeyAc=
=+g9q
-----END PGP SIGNATURE-----
Merge tag 'vfs-7.1-rc1.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs
Pull misc vfs updates from Christian Brauner:
"Features:
- coredump: add tracepoint for coredump events
- fs: hide file and bfile caches behind runtime const machinery
Fixes:
- fix architecture-specific compat_ftruncate64 implementations
- dcache: Limit the minimal number of bucket to two
- fs/omfs: reject s_sys_blocksize smaller than OMFS_DIR_START
- fs/mbcache: cancel shrink work before destroying the cache
- dcache: permit dynamic_dname()s up to NAME_MAX
Cleanups:
- remove or unexport unused fs_context infrastructure
- trivial ->setattr cleanups
- selftests/filesystems: Assume that TIOCGPTPEER is defined
- writeback: fix kernel-doc function name mismatch for wb_put_many()
- autofs: replace manual symlink buffer allocation in autofs_dir_symlink
- init/initramfs.c: trivial fix: FSM -> Finite-state machine
- fs: remove stale and duplicate forward declarations
- readdir: Introduce dirent_size()
- fs: Replace user_access_{begin/end} by scoped user access
- kernel: acct: fix duplicate word in comment
- fs: write a better comment in step_into() concerning .mnt assignment
- fs: attr: fix comment formatting and spelling issues"
* tag 'vfs-7.1-rc1.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: (28 commits)
dcache: permit dynamic_dname()s up to NAME_MAX
fs: attr: fix comment formatting and spelling issues
fs: hide file and bfile caches behind runtime const machinery
fs: write a better comment in step_into() concerning .mnt assignment
proc: rename proc_notify_change to proc_setattr
proc: rename proc_setattr to proc_nochmod_setattr
affs: rename affs_notify_change to affs_setattr
adfs: rename adfs_notify_change to adfs_setattr
hfs: update comments on hfs_inode_setattr
kernel: acct: fix duplicate word in comment
fs: Replace user_access_{begin/end} by scoped user access
readdir: Introduce dirent_size()
coredump: add tracepoint for coredump events
fs: remove do_sys_truncate
fs: pass on FTRUNCATE_* flags to do_truncate
fs: fix archiecture-specific compat_ftruncate64
fs: remove stale and duplicate forward declarations
init/initramfs.c: trivial fix: FSM -> Finite-state machine
autofs: replace manual symlink buffer allocation in autofs_dir_symlink
fs/mbcache: cancel shrink work before destroying the cache
...
|
||
|
|
26ff969926 |
Rust changes for v7.1
Toolchain and infrastructure:
- Bump the minimum Rust version to 1.85.0 (and 'bindgen' to 0.71.1).
As proposed in LPC 2025 and the Maintainers Summit [1], we are going
to follow Debian Stable's Rust versions as our minimum versions.
Debian Trixie was released on 2025-08-09 with a Rust 1.85.0 and
'bindgen' 0.71.1 toolchain, which is a fair amount of time for e.g.
kernel developers to upgrade.
Other major distributions support a Rust version that is high enough
as well, including:
+ Arch Linux.
+ Fedora Linux.
+ Gentoo Linux.
+ Nix.
+ openSUSE Slowroll and openSUSE Tumbleweed.
+ Ubuntu 25.10 and 26.04 LTS. In addition, 24.04 LTS using
their versioned packages.
The merged patch series comes with the associated cleanups and
simplifications treewide that can be performed thanks to both bumps,
as well as documentation updates.
In addition, start using 'bindgen''s '--with-attribute-custom-enum'
feature to set the 'cfi_encoding' attribute for the 'lru_status' enum
used in Binder.
Link: https://lwn.net/Articles/1050174/ [1]
- Add experimental Kconfig option ('CONFIG_RUST_INLINE_HELPERS') that
inlines C helpers into Rust.
Essentially, it performs a step similar to LTO, but just for the
helpers, i.e. very local and fast.
It relies on 'llvm-link' and its '--internalize' flag, and requires
a compatible LLVM between Clang and 'rustc' (i.e. same major version,
'CONFIG_RUSTC_CLANG_LLVM_COMPATIBLE'). It is only enabled for two
architectures for now.
The result is a measurable speedup in different workloads that
different users have tested. For instance, for the null block driver,
it amounts to a 2%.
- Support global per-version flags.
While we already have per-version flags in many places, we didn't
have a place to set global ones that depend on the compiler version,
i.e. in 'rust_common_flags', which sometimes is needed to e.g. tweak
the lints set per version.
Use that to allow the 'clippy::precedence' lint for Rust < 1.86.0,
since it had a change in behavior.
- Support overriding the crate name and apply it to Rust Binder, which
wanted the module to be called 'rust_binder'.
- Add the remaining '__rust_helper' annotations (started in the
previous cycle).
'kernel' crate:
- Introduce the 'const_assert!' macro: a more powerful version of
'static_assert!' that can refer to generics inside functions or
implementation bodies, e.g.:
fn f<const N: usize>() {
const_assert!(N > 1);
}
fn g<T>() {
const_assert!(size_of::<T>() > 0, "T cannot be ZST");
}
In addition, reorganize our set of build-time assertion macros
('{build,const,static_assert}!') to live in the 'build_assert'
module.
Finally, improve the docs as well to clarify how these are different
from one another and how to pick the right one to use, and their
equivalence (if any) to the existing C ones for extra clarity.
- 'sizes' module: add 'SizeConstants' trait.
This gives us typed 'SZ_*' constants (avoiding casts) for use in
device address spaces where the address width depends on the hardware
(e.g. 32-bit MMIO windows, 64-bit GPU framebuffers, etc.), e.g.:
let gpu_heap = 14 * u64::SZ_1M;
let mmio_window = u32::SZ_16M;
- 'clk' module: implement 'Send' and 'Sync' for 'Clk' and thus simplify
the users in Tyr and PWM.
- 'ptr' module: add 'const_align_up'.
- 'str' module: improve the documentation of the 'c_str!' macro to
explain that one should only use it for non-literal cases (for the
other case we instead use C string literals, e.g. 'c"abc"').
- Disallow the use of 'CStr::{as_ptr,from_ptr}' and clean one such use
in the 'task' module.
- 'sync' module: finish the move of 'ARef' and 'AlwaysRefCounted'
outside of the 'types' module, i.e. update the last remaining
instances and finally remove the re-exports.
- 'error' module: clarify that 'from_err_ptr' can return 'Ok(NULL)',
including runtime-tested examples.
The intention is to hopefully prevent UB that assumes the result of
the function is not 'NULL' if successful. This originated from a case
of UB I noticed in 'regulator' that created a 'NonNull' on it.
Timekeeping:
- Expand the example section in the 'HrTimer' documentation.
- Mark the 'ClockSource' trait as unsafe to ensure valid values for
'ktime_get()'.
- Add 'Delta::from_nanos()'.
'pin-init' crate:
- Replace the 'Zeroable' impls for 'Option<NonZero*>' with impls of
'ZeroableOption' for 'NonZero*'.
- Improve feature gate handling for unstable features.
- Declutter the documentation of implementations of 'Zeroable' for
tuples.
- Replace uses of 'addr_of[_mut]!' with '&raw [mut]'.
rust-analyzer:
- Add type annotations to 'generate_rust_analyzer.py'.
- Add support for scripts written in Rust ('generate_rust_target.rs',
'rustdoc_test_builder.rs', 'rustdoc_test_gen.rs').
- Refactor 'generate_rust_analyzer.py' to explicitly identify host and
target crates, improve readability, and reduce duplication.
And some other fixes, cleanups and improvements.
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEPjU5OPd5QIZ9jqqOGXyLc2htIW0FAmnZVNQACgkQGXyLc2ht
IW09aA/9GIbluNhc5xNvfkMvv9Ki70TK+e/W78pQWoRlSmZU1MO6R5K2rMN+iYlu
98S53EO38P5wBWOjIVFHm9mD1b59T945gcyGk9DxxFdl6I5mFKGZvE0Z8onTE/9b
GUnO5dlWjmEwTfwD0csr4moLC8eoCGVmGpe4TEfvscAISeZJZwQ90UCoNSFy6TQS
rJyzmIOBraZPrf1qptt3Sk6KY3b9HaxLv3kh1TAPYH0Dmrhhp+ckHvn5lT8uB8ZW
xr1ThoP44Zwm+nq6JahiK1NWFXTs12vpoCQLbckJsN8r3GTmt9CfHll/0UcW5W7i
bCUeCJDNwfbpVALNmQxHjtkvmDAuhqypxCTFSMMrWS66LOUaKxZ+u0ioi/1Ljfp4
tCR1Uzpr3QD6c8rK0hJ28vW/5DjoqkMMwUDeUm6c36msST37xrDZPa/vN+VLxxhK
H8sQ3SyvE0JdK8wBvd/pHGHv+RvIdi7cbV5H/WqBpwzCcupExuXiKBdFHeVIfXkQ
zn7lsZtnBuL+hLpG1pz6BoCTW1KbR38YomaKupElkYCUYytu0H+0Af/lkK3HhviM
9uynUVsn0+JaS9QvogArW/d+I0w49yjRHkWxfXIJZd0+mkT9V3JrGY7/iXwewl5R
fRRP0hMx0vhY4f/Uss1qEu3RPfsafxnU1NBiVRZZtc37azSOKjE=
=xRA/
-----END PGP SIGNATURE-----
Merge tag 'rust-7.1' of git://git.kernel.org/pub/scm/linux/kernel/git/ojeda/linux
Pull Rust updates from Miguel Ojeda:
"Toolchain and infrastructure:
- Bump the minimum Rust version to 1.85.0 (and 'bindgen' to 0.71.1).
As proposed in LPC 2025 and the Maintainers Summit [1], we are
going to follow Debian Stable's Rust versions as our minimum
versions.
Debian Trixie was released on 2025-08-09 with a Rust 1.85.0 and
'bindgen' 0.71.1 toolchain, which is a fair amount of time for e.g.
kernel developers to upgrade.
Other major distributions support a Rust version that is high
enough as well, including:
+ Arch Linux.
+ Fedora Linux.
+ Gentoo Linux.
+ Nix.
+ openSUSE Slowroll and openSUSE Tumbleweed.
+ Ubuntu 25.10 and 26.04 LTS. In addition, 24.04 LTS using
their versioned packages.
The merged patch series comes with the associated cleanups and
simplifications treewide that can be performed thanks to both
bumps, as well as documentation updates.
In addition, start using 'bindgen''s '--with-attribute-custom-enum'
feature to set the 'cfi_encoding' attribute for the 'lru_status'
enum used in Binder.
Link: https://lwn.net/Articles/1050174/ [1]
- Add experimental Kconfig option ('CONFIG_RUST_INLINE_HELPERS') that
inlines C helpers into Rust.
Essentially, it performs a step similar to LTO, but just for the
helpers, i.e. very local and fast.
It relies on 'llvm-link' and its '--internalize' flag, and requires
a compatible LLVM between Clang and 'rustc' (i.e. same major
version, 'CONFIG_RUSTC_CLANG_LLVM_COMPATIBLE'). It is only enabled
for two architectures for now.
The result is a measurable speedup in different workloads that
different users have tested. For instance, for the null block
driver, it amounts to a 2%.
- Support global per-version flags.
While we already have per-version flags in many places, we didn't
have a place to set global ones that depend on the compiler
version, i.e. in 'rust_common_flags', which sometimes is needed to
e.g. tweak the lints set per version.
Use that to allow the 'clippy::precedence' lint for Rust < 1.86.0,
since it had a change in behavior.
- Support overriding the crate name and apply it to Rust Binder,
which wanted the module to be called 'rust_binder'.
- Add the remaining '__rust_helper' annotations (started in the
previous cycle).
'kernel' crate:
- Introduce the 'const_assert!' macro: a more powerful version of
'static_assert!' that can refer to generics inside functions or
implementation bodies, e.g.:
fn f<const N: usize>() {
const_assert!(N > 1);
}
fn g<T>() {
const_assert!(size_of::<T>() > 0, "T cannot be ZST");
}
In addition, reorganize our set of build-time assertion macros
('{build,const,static_assert}!') to live in the 'build_assert'
module.
Finally, improve the docs as well to clarify how these are
different from one another and how to pick the right one to use,
and their equivalence (if any) to the existing C ones for extra
clarity.
- 'sizes' module: add 'SizeConstants' trait.
This gives us typed 'SZ_*' constants (avoiding casts) for use in
device address spaces where the address width depends on the
hardware (e.g. 32-bit MMIO windows, 64-bit GPU framebuffers, etc.),
e.g.:
let gpu_heap = 14 * u64::SZ_1M;
let mmio_window = u32::SZ_16M;
- 'clk' module: implement 'Send' and 'Sync' for 'Clk' and thus
simplify the users in Tyr and PWM.
- 'ptr' module: add 'const_align_up'.
- 'str' module: improve the documentation of the 'c_str!' macro to
explain that one should only use it for non-literal cases (for the
other case we instead use C string literals, e.g. 'c"abc"').
- Disallow the use of 'CStr::{as_ptr,from_ptr}' and clean one such
use in the 'task' module.
- 'sync' module: finish the move of 'ARef' and 'AlwaysRefCounted'
outside of the 'types' module, i.e. update the last remaining
instances and finally remove the re-exports.
- 'error' module: clarify that 'from_err_ptr' can return 'Ok(NULL)',
including runtime-tested examples.
The intention is to hopefully prevent UB that assumes the result of
the function is not 'NULL' if successful. This originated from a
case of UB I noticed in 'regulator' that created a 'NonNull' on it.
Timekeeping:
- Expand the example section in the 'HrTimer' documentation.
- Mark the 'ClockSource' trait as unsafe to ensure valid values for
'ktime_get()'.
- Add 'Delta::from_nanos()'.
'pin-init' crate:
- Replace the 'Zeroable' impls for 'Option<NonZero*>' with impls of
'ZeroableOption' for 'NonZero*'.
- Improve feature gate handling for unstable features.
- Declutter the documentation of implementations of 'Zeroable' for
tuples.
- Replace uses of 'addr_of[_mut]!' with '&raw [mut]'.
rust-analyzer:
- Add type annotations to 'generate_rust_analyzer.py'.
- Add support for scripts written in Rust ('generate_rust_target.rs',
'rustdoc_test_builder.rs', 'rustdoc_test_gen.rs').
- Refactor 'generate_rust_analyzer.py' to explicitly identify host
and target crates, improve readability, and reduce duplication.
And some other fixes, cleanups and improvements"
* tag 'rust-7.1' of git://git.kernel.org/pub/scm/linux/kernel/git/ojeda/linux: (79 commits)
rust: sizes: add SizeConstants trait for device address space constants
rust: kernel: update `file_with_nul` comment
rust: kbuild: allow `clippy::precedence` for Rust < 1.86.0
rust: kbuild: support global per-version flags
rust: declare cfi_encoding for lru_status
docs: rust: general-information: use real example
docs: rust: general-information: simplify Kconfig example
docs: rust: quick-start: remove GDB/Binutils mention
docs: rust: quick-start: remove Nix "unstable channel" note
docs: rust: quick-start: remove Gentoo "testing" note
docs: rust: quick-start: add Ubuntu 26.04 LTS and remove subsection title
docs: rust: quick-start: update minimum Ubuntu version
docs: rust: quick-start: update Ubuntu versioned packages
docs: rust: quick-start: openSUSE provides `rust-src` package nowadays
rust: kbuild: remove "dummy parameter" workaround for `bindgen` < 0.71.1
rust: kbuild: update `bindgen --rust-target` version and replace comment
rust: rust_is_available: remove warning for `bindgen` < 0.69.5 && libclang >= 19.1
rust: rust_is_available: remove warning for `bindgen` 0.66.[01]
rust: bump `bindgen` minimum supported version to 0.71.1 (Debian Trixie)
rust: block: update `const_refs_to_static` MSRV TODO comment
...
|
||
|
|
93553d9922 |
rust: kbuild: remove "dummy parameter" workaround for bindgen < 0.71.1
Until the version bump of `bindgen`, we needed to pass a dummy parameter to avoid failing the `--version` call. Thus remove it. Reviewed-by: Tamir Duberstein <tamird@kernel.org> Reviewed-by: Gary Guo <gary@garyguo.net> Link: https://patch.msgid.link/20260405235309.418950-22-ojeda@kernel.org Signed-off-by: Miguel Ojeda <ojeda@kernel.org> |
||
|
|
4ab22c543f |
rust: remove RUSTC_HAS_COERCE_POINTEE and simplify code
With the Rust version bump in place, the `RUSTC_HAS_COERCE_POINTEE` Kconfig (automatic) option is always true. Thus remove the option and simplify the code. In particular, this includes removing our use of the predecessor unstable features we used with Rust < 1.84.0 (`coerce_unsized`, `dispatch_from_dyn` and `unsize`). Reviewed-by: Tamir Duberstein <tamird@kernel.org> Acked-by: Danilo Krummrich <dakr@kernel.org> Reviewed-by: Gary Guo <gary@garyguo.net> Link: https://patch.msgid.link/20260405235309.418950-11-ojeda@kernel.org Signed-off-by: Miguel Ojeda <ojeda@kernel.org> |
||
|
|
9b398d0565 |
rust: remove RUSTC_HAS_SLICE_AS_FLATTENED and simplify code
With the Rust version bump in place, the `RUSTC_HAS_SLICE_AS_FLATTENED` Kconfig (automatic) option is always true. Thus remove the option and simplify the code. In particular, this includes removing the `slice` module which contained the temporary slice helpers, i.e. the `AsFlattened` extension trait and its `impl`s. Reviewed-by: Tamir Duberstein <tamird@kernel.org> Reviewed-by: Gary Guo <gary@garyguo.net> Link: https://patch.msgid.link/20260405235309.418950-10-ojeda@kernel.org Signed-off-by: Miguel Ojeda <ojeda@kernel.org> |
||
|
|
b28711ac98 |
rust: simplify RUSTC_VERSION Kconfig conditions
With the Rust version bump in place, several Kconfig conditions based on `RUSTC_VERSION` are always true. Thus simplify them. The minimum supported major LLVM version by our new Rust minimum version is now LLVM 18, instead of LLVM 16. However, there are no possible cleanups for `RUSTC_LLVM_VERSION`. Reviewed-by: Tamir Duberstein <tamird@kernel.org> Reviewed-by: Gary Guo <gary@garyguo.net> Link: https://patch.msgid.link/20260405235309.418950-9-ojeda@kernel.org Signed-off-by: Miguel Ojeda <ojeda@kernel.org> |
||
|
|
6ebf98d71f |
mm: introduce CONFIG_NUMA_MIGRATION and simplify CONFIG_MIGRATION
CONFIG_MEMORY_HOTREMOVE, CONFIG_COMPACTION and CONFIG_CMA all select CONFIG_MIGRATION, because they require it to work (users). Only CONFIG_NUMA_BALANCING and CONFIG_BALLOON_MIGRATION depend on CONFIG_MIGRATION. CONFIG_BALLOON_MIGRATION is not an actual user, but an implementation of migration support, so the dependency is correct (CONFIG_BALLOON_MIGRATION does not make any sense without CONFIG_MIGRATION). However, kconfig-language.rst clearly states "In general use select only for non-visible symbols". So far CONFIG_MIGRATION is user-visible ... and the dependencies rather confusing. The whole reason why CONFIG_MIGRATION is user-visible is because of CONFIG_NUMA: some users might want CONFIG_NUMA but not page migration support. Let's clean all that up by introducing a dedicated CONFIG_NUMA_MIGRATION config option for that purpose only. Make CONFIG_NUMA_BALANCING that so far depended on CONFIG_NUMA && CONFIG_MIGRATION to depend on CONFIG_MIGRATION instead. CONFIG_NUMA_MIGRATION will depend on CONFIG_NUMA && CONFIG_MMU. CONFIG_NUMA_MIGRATION is user-visible and will default to "y". We use that default so new configs will automatically enable it, just like it was the case with CONFIG_MIGRATION. The downside is that some configs that used to have CONFIG_MIGRATION=n might get it re-enabled by CONFIG_NUMA_MIGRATION=y, which shouldn't be a problem. CONFIG_MIGRATION is now a non-visible config option. Any code that select CONFIG_MIGRATION (as before) must depend directly or indirectly on CONFIG_MMU. CONFIG_NUMA_MIGRATION is responsible for any NUMA migration code, which is mempolicy migration code, memory-tiering code, and move_pages() code in migrate.c. CONFIG_NUMA_BALANCING uses its functionality. Note that this implies that with CONFIG_NUMA_MIGRATION=n, move_pages() will not be available even though CONFIG_MIGRATION=y, which is an expected change. In migrate.c, we can remove the CONFIG_NUMA check as both CONFIG_NUMA_MIGRATION and CONFIG_NUMA_BALANCING depend on it. With this change, CONFIG_MIGRATION is an internal config, all users of migration selects CONFIG_MIGRATION, and only CONFIG_BALLOON_MIGRATION depends on it. Link: https://lkml.kernel.org/r/20260319-config_migration-v1-2-42270124966f@kernel.org Signed-off-by: David Hildenbrand (Arm) <david@kernel.org> Reviewed-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org> Acked-by: Zi Yan <ziy@nvidia.com> Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com> Cc: Albert Ou <aou@eecs.berkeley.edu> Cc: Alexander Gordeev <agordeev@linux.ibm.com> Cc: Alexandre Ghiti <alex@ghiti.fr> Cc: Alistair Popple <apopple@nvidia.com> Cc: "Borislav Petkov (AMD)" <bp@alien8.de> Cc: Byungchul Park <byungchul@sk.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Christian Borntraeger <borntraeger@linux.ibm.com> Cc: Gregory Price <gourry@gourry.net> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Huacai Chen <chenhuacai@kernel.org> Cc: "Huang, Ying" <ying.huang@linux.alibaba.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Joshua Hahn <joshua.hahnjy@gmail.com> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Madhavan Srinivasan <maddy@linux.ibm.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michal Hocko <mhocko@suse.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Rakie Kim <rakie.kim@sk.com> Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Sven Schnelle <svens@linux.ibm.com> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: WANG Xuerui <kernel@xen0n.name> Cc: Will Deacon <will@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> |
||
|
|
9617b5b62c |
kernel: ksysfs: initialize kernel_kobj earlier
Software nodes depend on kernel_kobj which is initialized pretty late into the boot process - as a core_initcall(). Ahead of moving the software node initialization to driver_init() we must first make kernel_kobj available before it. Make ksysfs_init() visible in a new header - ksysfs.h - and call it in do_basic_setup() right before driver_init(). Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com> Link: https://patch.msgid.link/20260402-nokia770-gpio-swnodes-v5-1-d730db3dd299@oss.qualcomm.com Signed-off-by: Danilo Krummrich <dakr@kernel.org> |
||
|
|
fa4a1ff8ab |
locking: Add task::blocked_lock to serialize blocked_on state
So far, we have been able to utilize the mutex::wait_lock for serializing the blocked_on state, but when we move to proxying across runqueues, we will need to add more state and a way to serialize changes to this state in contexts where we don't hold the mutex::wait_lock. So introduce the task::blocked_lock, which nests under the mutex::wait_lock in the locking order, and rework the locking to use it. Signed-off-by: John Stultz <jstultz@google.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: K Prateek Nayak <kprateek.nayak@amd.com> Link: https://patch.msgid.link/20260324191337.1841376-5-jstultz@google.com |
||
|
|
b2129a3951 |
memblock: make free_reserved_area() update memblock if ARCH_KEEP_MEMBLOCK=y
On architectures that keep memblock after boot, freeing of reserved memory with free_reserved_area() is paired with an update of memblock arrays, usually by a call to memblock_free(). Make free_reserved_area() directly update memblock.reserved when ARCH_KEEP_MEMBLOCK is enabled. Remove the now-redundant explicit memblock_free() call from arm64::free_initmem() and the #ifdef CONFIG_ARCH_KEEP_MEMBLOCK block from the generic free_initrd_mem(). Link: https://patch.msgid.link/20260323074836.3653702-8-rppt@kernel.org Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org> |
||
|
|
e90f97ce20 |
kbuild: rust: add CONFIG_RUSTC_CLANG_LLVM_COMPATIBLE
This config detects if Rust and Clang have matching LLVM major version. All IR or bitcode operations (e.g. LTO) rely on LLVM major version to be matching, otherwise it may generate errors, or worse, miscompile silently due to change of IR semantics. It's usually suggested to use the exact same LLVM version, but this can be difficult to guarantee. Rust's suggestion [1] is also major-version only, so I think this check is sufficient for the kernel. Link: https://doc.rust-lang.org/rustc/linker-plugin-lto.html [1] Reviewed-by: Andreas Hindborg <a.hindborg@kernel.org> Signed-off-by: Gary Guo <gary@garyguo.net> Signed-off-by: Matthew Maurer <mmaurer@google.com> Signed-off-by: Alice Ryhl <aliceryhl@google.com> Reviewed-by: Nathan Chancellor <nathan@kernel.org> Tested-by: Nathan Chancellor <nathan@kernel.org> Reviewed-by: Nicolas Schier <nsc@kernel.org> Tested-by: Nicolas Schier <nsc@kernel.org> Tested-by: Andreas Hindborg <a.hindborg@kernel.org> Link: https://patch.msgid.link/20260203-inline-helpers-v2-1-beb8547a03c9@google.com [ Fixed typo. - Miguel ] Signed-off-by: Miguel Ojeda <ojeda@kernel.org> |
||
|
|
1b6c89285d |
timens: Remove dependency on the vDSO
Previously, missing time namespace support in the vDSO meant that time namespaces needed to be disabled globally. This was expressed in a hard dependency on the generic vDSO library. This also meant that architectures without any vDSO or only a stub vDSO could not enable time namespaces. Now that all architectures using a real vDSO are using the generic library, that dependency is not necessary anymore. Remove the dependency and let all architectures enable time namespaces. Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@kernel.org> Link: https://patch.msgid.link/20260326-vdso-timens-decoupling-v2-2-c82693a7775f@linutronix.de |
||
|
|
d2a43e7f89 |
hardening fixes for v7.0-rc6
- fix required Clang version for CC_HAS_COUNTED_BY_PTR (Nathan Chancellor) - update Coccinelle script used for kmalloc_obj -----BEGIN PGP SIGNATURE----- iHUEABYKAB0WIQRSPkdeREjth1dHnSE2KwveOeQkuwUCacRQmQAKCRA2KwveOeQk u0QhAQCX0xv8Tt2OoP2vPRp6VpXcJNyPkm/h/ncvWuX+XyJCAAD/SgUUeDN4WMEl PN6bH0uwH6Q+6Xi8xEfbNs7cOfHqDQw= =ltch -----END PGP SIGNATURE----- Merge tag 'hardening-v7.0-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux Pull hardening fixes from Kees Cook: - fix required Clang version for CC_HAS_COUNTED_BY_PTR (Nathan Chancellor) - update Coccinelle script used for kmalloc_obj * tag 'hardening-v7.0-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: init/Kconfig: Require a release version of clang-22 for CC_HAS_COUNTED_BY_PTR coccinelle: kmalloc_obj: Remove default GFP_KERNEL arg |
||
|
|
a96ef5848c |
randomize_kstack: Unify random source across arches
Previously different architectures were using random sources of
differing strength and cost to decide the random kstack offset. A number
of architectures (loongarch, powerpc, s390, x86) were using their
timestamp counter, at whatever the frequency happened to be. Other
arches (arm64, riscv) were using entropy from the crng via
get_random_u16().
There have been concerns that in some cases the timestamp counters may
be too weak, because they can be easily guessed or influenced by user
space. And get_random_u16() has been shown to be too costly for the
level of protection kstack offset randomization provides.
So let's use a common, architecture-agnostic source of entropy; a
per-cpu prng, seeded at boot-time from the crng. This has a few
benefits:
- We can remove choose_random_kstack_offset(); That was only there to
try to make the timestamp counter value a bit harder to influence
from user space [*].
- The architecture code is simplified. All it has to do now is call
add_random_kstack_offset() in the syscall path.
- The strength of the randomness can be reasoned about independently
of the architecture.
- Arches previously using get_random_u16() now have much faster
syscall paths, see below results.
[*] Additionally, this gets rid of some redundant work on s390 and x86.
Before this patch, those architectures called
choose_random_kstack_offset() under arch_exit_to_user_mode_prepare(),
which is also called for exception returns to userspace which were *not*
syscalls (e.g. regular interrupts). Getting rid of
choose_random_kstack_offset() avoids a small amount of redundant work
for the non-syscall cases.
In some configurations, add_random_kstack_offset() will now call
instrumentable code, so for a couple of arches, I have moved the call a
bit later to the first point where instrumentation is allowed. This
doesn't impact the efficacy of the mechanism.
There have been some claims that a prng may be less strong than the
timestamp counter if not regularly reseeded. But the prng has a period
of about 2^113. So as long as the prng state remains secret, it should
not be possible to guess. If the prng state can be accessed, we have
bigger problems.
Additionally, we are only consuming 6 bits to randomize the stack, so
there are only 64 possible random offsets. I assert that it would be
trivial for an attacker to brute force by repeating their attack and
waiting for the random stack offset to be the desired one. The prng
approach seems entirely proportional to this level of protection.
Performance data are provided below. The baseline is v6.18 with rndstack
on for each respective arch. (I)/(R) indicate statistically significant
improvement/regression. arm64 platform is AWS Graviton3 (m7g.metal).
x86_64 platform is AWS Sapphire Rapids (m7i.24xlarge):
+-----------------+--------------+---------------+---------------+
| Benchmark | Result Class | per-cpu-prng | per-cpu-prng |
| | | arm64 (metal) | x86_64 (VM) |
+=================+==============+===============+===============+
| syscall/getpid | mean (ns) | (I) -9.50% | (I) -17.65% |
| | p99 (ns) | (I) -59.24% | (I) -24.41% |
| | p99.9 (ns) | (I) -59.52% | (I) -28.52% |
+-----------------+--------------+---------------+---------------+
| syscall/getppid | mean (ns) | (I) -9.52% | (I) -19.24% |
| | p99 (ns) | (I) -59.25% | (I) -25.03% |
| | p99.9 (ns) | (I) -59.50% | (I) -28.17% |
+-----------------+--------------+---------------+---------------+
| syscall/invalid | mean (ns) | (I) -10.31% | (I) -18.56% |
| | p99 (ns) | (I) -60.79% | (I) -20.06% |
| | p99.9 (ns) | (I) -61.04% | (I) -25.04% |
+-----------------+--------------+---------------+---------------+
I tested an earlier version of this change on x86 bare metal and it
showed a smaller but still significant improvement. The bare metal
system wasn't available this time around so testing was done in a VM
instance. I'm guessing the cost of rdtsc is higher for VMs.
Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
Link: https://patch.msgid.link/20260303150840.3789438-3-ryan.roberts@arm.com
Signed-off-by: Kees Cook <kees@kernel.org>
|
||
|
|
37beb42560 |
randomize_kstack: Maintain kstack_offset per task
kstack_offset was previously maintained per-cpu, but this caused a
couple of issues. So let's instead make it per-task.
Issue 1: add_random_kstack_offset() and choose_random_kstack_offset()
expected and required to be called with interrupts and preemption
disabled so that it could manipulate per-cpu state. But arm64, loongarch
and risc-v are calling them with interrupts and preemption enabled. I
don't _think_ this causes any functional issues, but it's certainly
unexpected and could lead to manipulating the wrong cpu's state, which
could cause a minor performance degradation due to bouncing the cache
lines. By maintaining the state per-task those functions can safely be
called in preemptible context.
Issue 2: add_random_kstack_offset() is called before executing the
syscall and expands the stack using a previously chosen random offset.
choose_random_kstack_offset() is called after executing the syscall and
chooses and stores a new random offset for the next syscall. With
per-cpu storage for this offset, an attacker could force cpu migration
during the execution of the syscall and prevent the offset from being
updated for the original cpu such that it is predictable for the next
syscall on that cpu. By maintaining the state per-task, this problem
goes away because the per-task random offset is updated after the
syscall regardless of which cpu it is executing on.
Fixes:
|
||
|
|
e73b1d7210 |
sched_ext: Fix build errors and unused label warning in non-cgroup configs
When building with SCHED_CLASS_EXT=y but CGROUPS=n, clang reports errors
for undeclared cgroup_put() and cgroup_get() calls, and a warning for the
unused err_stop_helper label.
EXT_SUB_SCHED is def_bool y depending only on SCHED_CLASS_EXT, but it
fundamentally requires cgroups (cgroup_path, cgroup_get, cgroup_put,
cgroup_id, etc.). Add the missing CGROUPS dependency to EXT_SUB_SCHED in
init/Kconfig.
Guard cgroup_put() and cgroup_get() in the common paths with:
#if defined(CONFIG_EXT_GROUP_SCHED) || defined(CONFIG_EXT_SUB_SCHED)
Guard the err_stop_helper label with #ifdef CONFIG_EXT_SUB_SCHED since
all gotos targeting it are inside that same ifdef block.
Tested with both CGROUPS enabled and disabled.
Fixes:
|
||
|
|
7a618ca9b9 |
init/Kconfig: Require a release version of clang-22 for CC_HAS_COUNTED_BY_PTR
Commit |
||
|
|
49e64d7136
|
init/initramfs.c: trivial fix: FSM -> Finite-state machine
FSM means "finite-state machine", but I think this is not obvious to everyone. Signed-off-by: Askar Safin <safinaskar@gmail.com> Link: https://patch.msgid.link/20260319232721.452950-1-safinaskar@gmail.com Signed-off-by: Christian Brauner <brauner@kernel.org> |
||
|
|
0d3fccf68d
|
kbuild: Use '-fms-anonymous-structs' if it is available
Clang recently added '-fms-anonymous-structs' [1] to specifically enable the Microsoft tagged anonymous structure / union extension, for which the kernel added '-fms-extensions' in commit |
||
|
|
05988dba11 |
vdso/datastore: Allocate data pages dynamically
Allocating the data pages as part of the kernel image does not work on SPARC. The MMU will raise a fault when userspace tries to access them. Allocate the data pages through the page allocator instead. Unused pages in the vDSO VMA are still allocated to keep the virtual addresses aligned. Switch the mapping from PFNs to 'struct page' as that is required for dynamically allocated pages. This also aligns the allocation of the datapages with the code pages and is a prerequisite for mlockall() support. VM_MIXEDMAP is necessary for the call to vmf_insert_page() in the timens prefault path to work. The data pages need to be order-0, non-compound pages so that the mapping to userspace and the different orderings work. These pages are also used by the timekeeping, random pool and architecture initialization code. Some of these are running before the page allocator is available. To keep these subsytems working without changes, introduce early, statically data storage which will then replaced by the real one as soon as that is available. Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@kernel.org> Reviewed-by: Christophe Leroy (CS GROUP) <chleroy@kernel.org> Link: https://patch.msgid.link/20260304-vdso-sparc64-generic-2-v6-3-d8eb3b0e1410@linutronix.de |
||
|
|
ebeca1f930 |
sched_ext: Introduce cgroup sub-sched support
A system often runs multiple workloads especially in multi-tenant server environments where a system is split into partitions servicing separate more-or-less independent workloads each requiring an application-specific scheduler. To support such and other use cases, sched_ext is in the process of growing multiple scheduler support. When partitioning a system in terms of CPUs for such use cases, an oft-taken approach is hard partitioning the system using cpuset. While it would be possible to tie sched_ext multiple scheduler support to cpuset partitions, such an approach would have fundamental limitations stemming from the lack of dynamism and flexibility. Users often don't care which specific CPUs are assigned to which workload and want to take advantage of optimizations which are enabled by running workloads on a larger machine - e.g. opportunistic over-commit, improving latency critical workload characteristics while maintaining bandwidth fairness, employing control mechanisms based on different criteria than on-CPU time for e.g. flexible memory bandwidth isolation, packing similar parts from different workloads on same L3s to improve cache efficiency, and so on. As this sort of dynamic behaviors are impossible or difficult to implement with hard partitioning, sched_ext is implementing cgroup sub-sched support where schedulers can be attached to the cgroup hierarchy and a parent scheduler is responsible for controlling the CPUs that each child can use at any given moment. This makes CPU distribution dynamically controlled by BPF allowing high flexibility. This patch adds the skeletal sched_ext cgroup sub-sched support: - sched_ext_ops.sub_cgroup_id and .sub_attach/detach() are added. Non-zero sub_cgroup_id indicates that the scheduler is to be attached to the identified cgroup. A sub-sched is attached to the cgroup iff the nearest ancestor scheduler implements .sub_attach() and grants the attachment. Max nesting depth is limited by SCX_SUB_MAX_DEPTH. - When a scheduler exits, all its descendant schedulers are exited together. Also, cgroup.scx_sched added which points to the effective scheduler instance for the cgroup. This is updated on scheduler init/exit and inherited on cgroup online. When a cgroup is offlined, the attached scheduler is automatically exited. - Sub-sched support is gated on CONFIG_EXT_SUB_SCHED which is automatically enabled if both SCX and cgroups are enabled. Sub-sched support is not tied to the CPU controller but rather the cgroup hierarchy itself. This is intentional as the support for cpu.weight and cpu.max based resource control is orthogonal to sub-sched support. Note that CONFIG_CGROUPS around cgroup subtree iteration support for scx_task_iter is replaced with CONFIG_EXT_SUB_SCHED for consistency. - This allows loading sub-scheds and most framework operations such as propagating disable down the hierarchy work. However, sub-scheds are not operational yet and all tasks stay with the root sched. This will serve as the basis for building up full sub-sched support. - DSQs point to the scx_sched they belong to. - scx_qmap is updated to allow attachment of sub-scheds and also serving as sub-scheds. - scx_is_descendant() is added but not yet used in this patch. It is used by later changes in the series and placed here as this is where the function belongs. Signed-off-by: Tejun Heo <tj@kernel.org> Reviewed-by: Andrea Righi <arighi@nvidia.com> |
||
|
|
3ad66a34cc |
io_uring-7.0-20260305
-----BEGIN PGP SIGNATURE----- iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAmmqPTAQHGF4Ym9lQGtl cm5lbC5kawAKCRD301j7KXHgpueIEACHF0uws+uZSEsy9LUyC9ha8+5YN9szIJ3K QUGxa9pmCnQG5K50KpxyEYP6buaJDy1smJGgD2obkeGncC4w6xKK2kQTQV1U+C1C YA+7B/3HLhz5AWS6GIbRy6VZ599I4evlF8W79dX8BTnF8Y1ddkSuUnKx//q0AoQZ hr3foglcFlchy8JuQ2/MpxzfOouvNMdMmeUN4O+t8iXDrmePFYIOgrLcT+ObgC5D SXWx2cc3hMJ35hcSzedMWEBFcXnkX9nfh8Hd/+uPRcKsIwS8kCo6z01GoT/BCPRA jdrxAfoYSL16HPfq6GU52n6iCaRd+5NK+tt/ECCzTxGL32Hadrr+nxw4O7g3Q96u 07zeaqHSoTGUchtlqrGjALQLP2yxdACEjxMh3rfdStRv3x3bbbVVDdioVEzPukCr EBA+AbqaaG3LIYXwcY+15zx5NrAfeBAP1RjLgoV0s2ch4ghEqvnZGY4NLBDkcQ2R 97tM9+OdecBrsnlQr5GBoDbwpqc2pDEqSjkYDwoXqvqXs0DrMRq2MQw1Hjjh7Z7G FZx1KNTiLB/YQ0sSyMcUKnH+qBA0FxwN/C6dDnRjj4dH5YsoeG/GhsS3B00a+0yE S3MKrsf+uN21OYLVPSTEN6qS+02ZvK6E/Aw7/fk2IV60JMeM5KvCccmxa53dKls8 iyEJ7nVLOg== =xyKA -----END PGP SIGNATURE----- Merge tag 'io_uring-7.0-20260305' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux Pull io_uring fixes from Jens Axboe: - Fix a typo in the mock_file help text - Fix a comment regarding IORING_SETUP_TASKRUN_FLAG in the io_uring.h UAPI header - Use READ_ONCE() for reading refill queue entries - Reject SEND_VECTORIZED for fixed buffer sends, as it isn't implemented. Currently this flag is silently ignored This is in preparation for making these work, but first we need a fixup so that older kernels will correctly reject them - Ensure "0" means default for the rx page size * tag 'io_uring-7.0-20260305' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux: io_uring/zcrx: use READ_ONCE with user shared RQEs io_uring/mock: Fix typo in help text io_uring/net: reject SEND_VECTORIZED when unsupported io_uring: correct comment for IORING_SETUP_TASKRUN_FLAG io_uring/zcrx: don't set rx_page_size when not requested |
||
|
|
9e7dc228bb |
io_uring/mock: Fix typo in help text
Fix the spelling of "subsystem". Signed-off-by: J. Neuschäfer <j.ne@posteo.net> Signed-off-by: Jens Axboe <axboe@kernel.dk> |
||
|
|
f709859f33 |
init/Kconfig: Adjust fixed clang version for __builtin_counted_by_ref
Commit |
||
|
|
323bbfcf1e |
Convert 'alloc_flex' family to use the new default GFP_KERNEL argument
This is the exact same thing as the 'alloc_obj()' version, only much smaller because there are a lot fewer users of the *alloc_flex() interface. As with alloc_obj() version, this was done entirely with mindless brute force, using the same script, except using 'flex' in the pattern rather than 'objs*'. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
||
|
|
bf4afc53b7 |
Convert 'alloc_obj' family to use the new default GFP_KERNEL argument
This was done entirely with mindless brute force, using
git grep -l '\<k[vmz]*alloc_objs*(.*, GFP_KERNEL)' |
xargs sed -i 's/\(alloc_objs*(.*\), GFP_KERNEL)/\1)/'
to convert the new alloc_obj() users that had a simple GFP_KERNEL
argument to just drop that argument.
Note that due to the extreme simplicity of the scripting, any slightly
more complex cases spread over multiple lines would not be triggered:
they definitely exist, but this covers the vast bulk of the cases, and
the resulting diff is also then easier to check automatically.
For the same reason the 'flex' versions will be done as a separate
conversion.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
||
|
|
69050f8d6d |
treewide: Replace kmalloc with kmalloc_obj for non-scalar types
This is the result of running the Coccinelle script from scripts/coccinelle/api/kmalloc_objs.cocci. The script is designed to avoid scalar types (which need careful case-by-case checking), and instead replace kmalloc-family calls that allocate struct or union object instances: Single allocations: kmalloc(sizeof(TYPE), ...) are replaced with: kmalloc_obj(TYPE, ...) Array allocations: kmalloc_array(COUNT, sizeof(TYPE), ...) are replaced with: kmalloc_objs(TYPE, COUNT, ...) Flex array allocations: kmalloc(struct_size(PTR, FAM, COUNT), ...) are replaced with: kmalloc_flex(*PTR, FAM, COUNT, ...) (where TYPE may also be *VAR) The resulting allocations no longer return "void *", instead returning "TYPE *". Signed-off-by: Kees Cook <kees@kernel.org> |
||
|
|
d39a1d7486 |
compiler_types: Disable __builtin_counted_by_ref for Clang
Unfortunately, there is a corner case of __builtin_counted_by_ref() usage that crashes[1] Clang since support was introduced in Clang 19. Disable it prior to Clang 22. Found while tested kmalloc_obj treewide refactoring (via kmalloc_flex() usage). Link: https://github.com/llvm/llvm-project/issues/182575 [1] Signed-off-by: Kees Cook <kees@kernel.org> |
||
|
|
136114e0ab |
mm.git review status for linus..mm-nonmm-stable
Total patches: 107 Reviews/patch: 1.07 Reviewed rate: 67% - The 2 patch series "ocfs2: give ocfs2 the ability to reclaim suballocator free bg" from Heming Zhao saves disk space by teaching ocfs2 to reclaim suballocator block group space. - The 4 patch series "Add ARRAY_END(), and use it to fix off-by-one bugs" from Alejandro Colomar adds the ARRAY_END() macro and uses it in various places. - The 2 patch series "vmcoreinfo: support VMCOREINFO_BYTES larger than PAGE_SIZE" from Pnina Feder makes the vmcore code future-safe, if VMCOREINFO_BYTES ever exceeds the page size. - The 7 patch series "kallsyms: Prevent invalid access when showing module buildid" from Petr Mladek cleans up kallsyms code related to module buildid and fixes an invalid access crash when printing backtraces. - The 3 patch series "Address page fault in ima_restore_measurement_list()" from Harshit Mogalapalli fixes a kexec-related crash that can occur when booting the second-stage kernel on x86. - The 6 patch series "kho: ABI headers and Documentation updates" from Mike Rapoport updates the kexec handover ABI documentation. - The 4 patch series "Align atomic storage" from Finn Thain adds the __aligned attribute to atomic_t and atomic64_t definitions to get natural alignment of both types on csky, m68k, microblaze, nios2, openrisc and sh. - The 2 patch series "kho: clean up page initialization logic" from Pratyush Yadav simplifies the page initialization logic in kho_restore_page(). - The 6 patch series "Unload linux/kernel.h" from Yury Norov moves several things out of kernel.h and into more appropriate places. - The 7 patch series "don't abuse task_struct.group_leader" from Oleg Nesterov removes the usage of ->group_leader when it is "obviously unnecessary". - The 5 patch series "list private v2 & luo flb" from Pasha Tatashin adds some infrastructure improvements to the live update orchestrator. -----BEGIN PGP SIGNATURE----- iHUEABYKAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCaY4giAAKCRDdBJ7gKXxA jgusAQDnKkP8UWTqXPC1jI+OrDJGU5ciAx8lzLeBVqMKzoYk9AD/TlhT2Nlx+Ef6 0HCUHUD0FMvAw/7/Dfc6ZKxwBEIxyww= =mmsH -----END PGP SIGNATURE----- Merge tag 'mm-nonmm-stable-2026-02-12-10-48' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull non-MM updates from Andrew Morton: - "ocfs2: give ocfs2 the ability to reclaim suballocator free bg" saves disk space by teaching ocfs2 to reclaim suballocator block group space (Heming Zhao) - "Add ARRAY_END(), and use it to fix off-by-one bugs" adds the ARRAY_END() macro and uses it in various places (Alejandro Colomar) - "vmcoreinfo: support VMCOREINFO_BYTES larger than PAGE_SIZE" makes the vmcore code future-safe, if VMCOREINFO_BYTES ever exceeds the page size (Pnina Feder) - "kallsyms: Prevent invalid access when showing module buildid" cleans up kallsyms code related to module buildid and fixes an invalid access crash when printing backtraces (Petr Mladek) - "Address page fault in ima_restore_measurement_list()" fixes a kexec-related crash that can occur when booting the second-stage kernel on x86 (Harshit Mogalapalli) - "kho: ABI headers and Documentation updates" updates the kexec handover ABI documentation (Mike Rapoport) - "Align atomic storage" adds the __aligned attribute to atomic_t and atomic64_t definitions to get natural alignment of both types on csky, m68k, microblaze, nios2, openrisc and sh (Finn Thain) - "kho: clean up page initialization logic" simplifies the page initialization logic in kho_restore_page() (Pratyush Yadav) - "Unload linux/kernel.h" moves several things out of kernel.h and into more appropriate places (Yury Norov) - "don't abuse task_struct.group_leader" removes the usage of ->group_leader when it is "obviously unnecessary" (Oleg Nesterov) - "list private v2 & luo flb" adds some infrastructure improvements to the live update orchestrator (Pasha Tatashin) * tag 'mm-nonmm-stable-2026-02-12-10-48' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (107 commits) watchdog/hardlockup: simplify perf event probe and remove per-cpu dependency procfs: fix missing RCU protection when reading real_parent in do_task_stat() watchdog/softlockup: fix sample ring index wrap in need_counting_irqs() kcsan, compiler_types: avoid duplicate type issues in BPF Type Format kho: fix doc for kho_restore_pages() tests/liveupdate: add in-kernel liveupdate test liveupdate: luo_flb: introduce File-Lifecycle-Bound global state liveupdate: luo_file: Use private list list: add kunit test for private list primitives list: add primitives for private list manipulations delayacct: fix uapi timespec64 definition panic: add panic_force_cpu= parameter to redirect panic to a specific CPU netclassid: use thread_group_leader(p) in update_classid_task() RDMA/umem: don't abuse current->group_leader drm/pan*: don't abuse current->group_leader drm/amd: kill the outdated "Only the pthreads threading model is supported" checks drm/amdgpu: don't abuse current->group_leader android/binder: use same_thread_group(proc->tsk, current) in binder_mmap() android/binder: don't abuse current->group_leader kho: skip memoryless NUMA nodes when reserving scratch areas ... |
||
|
|
4cff5c05e0 |
mm.git review status for linus..mm-stable
Everything:
Total patches: 325
Reviews/patch: 1.39
Reviewed rate: 72%
Excluding DAMON:
Total patches: 262
Reviews/patch: 1.63
Reviewed rate: 82%
Excluding DAMON and zram:
Total patches: 248
Reviews/patch: 1.72
Reviewed rate: 86%
- The 14 patch series "powerpc/64s: do not re-activate batched TLB
flush" from Alexander Gordeev makes arch_{enter|leave}_lazy_mmu_mode()
nest properly.
It adds a generic enter/leave layer and switches architectures to use
it. Various hacks were removed in the process.
- The 7 patch series "zram: introduce compressed data writeback" from
Richard Chang and Sergey Senozhatsky implements data compression for
zram writeback.
- The 8 patch series "mm: folio_zero_user: clear page ranges" from David
Hildenbrand adds clearing of contiguous page ranges for hugepages.
Large improvements during demand faulting are demonstrated.
- The 2 patch series "memcg cleanups" from Chen Ridong tideis up some
memcg code.
- The 12 patch series "mm/damon: introduce {,max_}nr_snapshots and
tracepoint for damos stats" from SeongJae Park improves DAMOS stat's
provided information, deterministic control, and readability.
- The 3 patch series "selftests/mm: hugetlb cgroup charging: robustness
fixes" from Li Wang fixes a few issues in the hugetlb cgroup charging
selftests.
- The 5 patch series "Fix va_high_addr_switch.sh test failure - again"
from Chunyu Hu addresses several issues in the va_high_addr_switch test.
- The 5 patch series "mm/damon/tests/core-kunit: extend existing test
scenarios" from Shu Anzai improves the KUnit test coverage for DAMON.
- The 2 patch series "mm/khugepaged: fix dirty page handling for
MADV_COLLAPSE" from Shivank Garg fixes a glitch in khugepaged which was
causing madvise(MADV_COLLAPSE) to transiently return -EAGAIN.
- The 29 patch series "arch, mm: consolidate hugetlb early reservation"
from Mike Rapoport reworks and consolidates a pile of straggly code
related to reservation of hugetlb memory from bootmem and creation of
CMA areas for hugetlb.
- The 9 patch series "mm: clean up anon_vma implementation" from Lorenzo
Stoakes cleans up the anon_vma implementation in various ways.
- The 3 patch series "tweaks for __alloc_pages_slowpath()" from
Vlastimil Babka does a little streamlining of the page allocator's
slowpath code.
- The 8 patch series "memcg: separate private and public ID namespaces"
from Shakeel Butt cleans up the memcg ID code and prevents the
internal-only private IDs from being exposed to userspace.
- The 6 patch series "mm: hugetlb: allocate frozen gigantic folio" from
Kefeng Wang cleans up the allocation of frozen folios and avoids some
atomic refcount operations.
- The 11 patch series "mm/damon: advance DAMOS-based LRU sorting" from
SeongJae Park improves DAMOS's movement of memory betewwn the active and
inactive LRUs and adds auto-tuning of the ratio-based quotas and of
monitoring intervals.
- The 18 patch series "Support page table check on PowerPC" from Andrew
Donnellan makes CONFIG_PAGE_TABLE_CHECK_ENFORCED work on powerpc.
- The 3 patch series "nodemask: align nodes_and{,not} with underlying
bitmap ops" from Yury Norov makes nodes_and() and nodes_andnot()
propagate the return values from the underlying bit operations, enabling
some cleanup in calling code.
- The 5 patch series "mm/damon: hide kdamond and kdamond_lock from API
callers" from SeongJae Park cleans up some DAMON internal interfaces.
- The 4 patch series "mm/khugepaged: cleanups and scan limit fix" from
Shivank Garg does some cleanup work in khupaged and fixes a scan limit
accounting issue.
- The 24 patch series "mm: balloon infrastructure cleanups" from David
Hildenbrand goes to town on the balloon infrastructure and its page
migration function. Mainly cleanups, also some locking simplification.
- The 2 patch series "mm/vmscan: add tracepoint and reason for
kswapd_failures reset" from Jiayuan Chen adds additional tracepoints to
the page reclaim code.
- The 3 patch series "Replace wq users and add WQ_PERCPU to
alloc_workqueue() users" from Marco Crivellari is part of Marco's
kernel-wide migration from the legacy workqueue APIs over to the
preferred unbound workqueues.
- The 9 patch series "Various mm kselftests improvements/fixes" from
Kevin Brodsky provides various unrelated improvements/fixes for the mm
kselftests.
- The 5 patch series "mm: accelerate gigantic folio allocation" from
Kefeng Wang greatly speeds up gigantic folio allocation, mainly by
avoiding unnecessary work in pfn_range_valid_contig().
- The 5 patch series "selftests/damon: improve leak detection and wss
estimation reliability" from SeongJae Park improves the reliability of
two of the DAMON selftests.
- The 8 patch series "mm/damon: cleanup kdamond, damon_call(), damos
filter and DAMON_MIN_REGION" from SeongJae Park does some cleanup work
in the core DAMON code.
- The 8 patch series "Docs/mm/damon: update intro, modules, maintainer
profile, and misc" from SeongJae Park performs maintenance work on the
DAMON documentation.
- The 10 patch series "mm: add and use vma_assert_stabilised() helper"
from Lorenzo Stoakes refactors and cleans up the core VMA code. The
main aim here is to be able to use the mmap write lock's lockdep state
to perform various assertions regarding the locking which the VMA code
requires.
- The 19 patch series "mm, swap: swap table phase II: unify swapin use"
from Kairui Song removes some old swap code (swap cache bypassing and
swap synchronization) which wasn't working very well. Various other
cleanups and simplifications were made. The end result is a 20% speedup
in one benchmark.
- The 8 patch series "enable PT_RECLAIM on more 64-bit architectures"
from Qi Zheng makes PT_RECLAIM available on 64-bit alpha, loongarch,
mips, parisc, um, Various cleanups were performed along the way.
-----BEGIN PGP SIGNATURE-----
iHUEABYKAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCaY1HfAAKCRDdBJ7gKXxA
jqhZAP9H8ZlKKqCEgnr6U5XXmJ63Ep2FDQpl8p35yr9yVuU9+gEAgfyWiJ43l1fP
rT0yjsUW3KQFBi/SEA3R6aYarmoIBgI=
=+HLt
-----END PGP SIGNATURE-----
Merge tag 'mm-stable-2026-02-11-19-22' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull MM updates from Andrew Morton:
- "powerpc/64s: do not re-activate batched TLB flush" makes
arch_{enter|leave}_lazy_mmu_mode() nest properly (Alexander Gordeev)
It adds a generic enter/leave layer and switches architectures to use
it. Various hacks were removed in the process.
- "zram: introduce compressed data writeback" implements data
compression for zram writeback (Richard Chang and Sergey Senozhatsky)
- "mm: folio_zero_user: clear page ranges" adds clearing of contiguous
page ranges for hugepages. Large improvements during demand faulting
are demonstrated (David Hildenbrand)
- "memcg cleanups" tidies up some memcg code (Chen Ridong)
- "mm/damon: introduce {,max_}nr_snapshots and tracepoint for damos
stats" improves DAMOS stat's provided information, deterministic
control, and readability (SeongJae Park)
- "selftests/mm: hugetlb cgroup charging: robustness fixes" fixes a few
issues in the hugetlb cgroup charging selftests (Li Wang)
- "Fix va_high_addr_switch.sh test failure - again" addresses several
issues in the va_high_addr_switch test (Chunyu Hu)
- "mm/damon/tests/core-kunit: extend existing test scenarios" improves
the KUnit test coverage for DAMON (Shu Anzai)
- "mm/khugepaged: fix dirty page handling for MADV_COLLAPSE" fixes a
glitch in khugepaged which was causing madvise(MADV_COLLAPSE) to
transiently return -EAGAIN (Shivank Garg)
- "arch, mm: consolidate hugetlb early reservation" reworks and
consolidates a pile of straggly code related to reservation of
hugetlb memory from bootmem and creation of CMA areas for hugetlb
(Mike Rapoport)
- "mm: clean up anon_vma implementation" cleans up the anon_vma
implementation in various ways (Lorenzo Stoakes)
- "tweaks for __alloc_pages_slowpath()" does a little streamlining of
the page allocator's slowpath code (Vlastimil Babka)
- "memcg: separate private and public ID namespaces" cleans up the
memcg ID code and prevents the internal-only private IDs from being
exposed to userspace (Shakeel Butt)
- "mm: hugetlb: allocate frozen gigantic folio" cleans up the
allocation of frozen folios and avoids some atomic refcount
operations (Kefeng Wang)
- "mm/damon: advance DAMOS-based LRU sorting" improves DAMOS's movement
of memory betewwn the active and inactive LRUs and adds auto-tuning
of the ratio-based quotas and of monitoring intervals (SeongJae Park)
- "Support page table check on PowerPC" makes
CONFIG_PAGE_TABLE_CHECK_ENFORCED work on powerpc (Andrew Donnellan)
- "nodemask: align nodes_and{,not} with underlying bitmap ops" makes
nodes_and() and nodes_andnot() propagate the return values from the
underlying bit operations, enabling some cleanup in calling code
(Yury Norov)
- "mm/damon: hide kdamond and kdamond_lock from API callers" cleans up
some DAMON internal interfaces (SeongJae Park)
- "mm/khugepaged: cleanups and scan limit fix" does some cleanup work
in khupaged and fixes a scan limit accounting issue (Shivank Garg)
- "mm: balloon infrastructure cleanups" goes to town on the balloon
infrastructure and its page migration function. Mainly cleanups, also
some locking simplification (David Hildenbrand)
- "mm/vmscan: add tracepoint and reason for kswapd_failures reset" adds
additional tracepoints to the page reclaim code (Jiayuan Chen)
- "Replace wq users and add WQ_PERCPU to alloc_workqueue() users" is
part of Marco's kernel-wide migration from the legacy workqueue APIs
over to the preferred unbound workqueues (Marco Crivellari)
- "Various mm kselftests improvements/fixes" provides various unrelated
improvements/fixes for the mm kselftests (Kevin Brodsky)
- "mm: accelerate gigantic folio allocation" greatly speeds up gigantic
folio allocation, mainly by avoiding unnecessary work in
pfn_range_valid_contig() (Kefeng Wang)
- "selftests/damon: improve leak detection and wss estimation
reliability" improves the reliability of two of the DAMON selftests
(SeongJae Park)
- "mm/damon: cleanup kdamond, damon_call(), damos filter and
DAMON_MIN_REGION" does some cleanup work in the core DAMON code
(SeongJae Park)
- "Docs/mm/damon: update intro, modules, maintainer profile, and misc"
performs maintenance work on the DAMON documentation (SeongJae Park)
- "mm: add and use vma_assert_stabilised() helper" refactors and cleans
up the core VMA code. The main aim here is to be able to use the mmap
write lock's lockdep state to perform various assertions regarding
the locking which the VMA code requires (Lorenzo Stoakes)
- "mm, swap: swap table phase II: unify swapin use" removes some old
swap code (swap cache bypassing and swap synchronization) which
wasn't working very well. Various other cleanups and simplifications
were made. The end result is a 20% speedup in one benchmark (Kairui
Song)
- "enable PT_RECLAIM on more 64-bit architectures" makes PT_RECLAIM
available on 64-bit alpha, loongarch, mips, parisc, and um. Various
cleanups were performed along the way (Qi Zheng)
* tag 'mm-stable-2026-02-11-19-22' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (325 commits)
mm/memory: handle non-split locks correctly in zap_empty_pte_table()
mm: move pte table reclaim code to memory.c
mm: make PT_RECLAIM depends on MMU_GATHER_RCU_TABLE_FREE
mm: convert __HAVE_ARCH_TLB_REMOVE_TABLE to CONFIG_HAVE_ARCH_TLB_REMOVE_TABLE config
um: mm: enable MMU_GATHER_RCU_TABLE_FREE
parisc: mm: enable MMU_GATHER_RCU_TABLE_FREE
mips: mm: enable MMU_GATHER_RCU_TABLE_FREE
LoongArch: mm: enable MMU_GATHER_RCU_TABLE_FREE
alpha: mm: enable MMU_GATHER_RCU_TABLE_FREE
mm: change mm/pt_reclaim.c to use asm/tlb.h instead of asm-generic/tlb.h
mm/damon/stat: remove __read_mostly from memory_idle_ms_percentiles
zsmalloc: make common caches global
mm: add SPDX id lines to some mm source files
mm/zswap: use %pe to print error pointers
mm/vmscan: use %pe to print error pointers
mm/readahead: fix typo in comment
mm: khugepaged: fix NR_FILE_PAGES and NR_SHMEM in collapse_file()
mm: refactor vma_map_pages to use vm_insert_pages
mm/damon: unify address range representation with damon_addr_range
mm/cma: replace snprintf with strscpy in cma_new_area
...
|
||
|
|
41f1a08645 |
Kbuild/Kconfig updates for 7.0
Kbuild changes
==============
* Drop '*_probe' pattern from modpost section check allowlist, which hid
legitimate warnings (Johan Hovold)
* Disable -Wtype-limits altogether, instead of enabling at W=2 (Vincent
Mailhol)
* Improve UAPI testing to skip testing headers that require a libc when
CONFIG_CC_CAN_LINK is not set, opening up testing of headers with no
libc dependencies to more environments (Thomas Weißschuh)
* Update gendwarfksyms documentation with required dependencies (Jihan
LIN)
* Reject invalid LLVM= values to avoid unintentionally falling back to
system toolchain (Thomas Weißschuh)
* Add a script to help run the kernel build process in a container for
consistent environments and testing (Guillaume Tucker)
* Simplify kallsyms by getting rid of the relative base (Ard Biesheuvel)
* Performance and usability improvements to scripts/make_fit.py (Simon
Glass)
* Minor various clean ups and fixes
Kconfig changes
===============
* Move XPM icons to individual files, clearing up GTK deprecation
warnings (Rostislav Krasny)
* Support
depends on FOO if BAR
as syntactic sugar for
depends on FOO || !BAR' (Nicolas Pitre, Graham Roff)
* Refactor merge_config.sh to use awk over shell/sed/grep, dramatically
speeding up processing large number of config fragments (Anders
Roxell, Mikko Rapeli)
-----BEGIN PGP SIGNATURE-----
iHUEABYKAB0WIQR74yXHMTGczQHYypIdayaRccAalgUCaYpgQwAKCRAdayaRccAa
liOGAQCqMI42YMLqljFcPu3B/3f43xhDBCXAhquPBIMhbgt+aAEAmmo3uMLHKSRV
XZDKkq13HMMV3Zlmrn5Xk/tzk+hkwwk=
=WYl4
-----END PGP SIGNATURE-----
Merge tag 'kbuild-7.0-1' of git://git.kernel.org/pub/scm/linux/kernel/git/kbuild/linux
Pull Kbuild/Kconfig updates from Nathan Chancellor:
"Kbuild:
- Drop '*_probe' pattern from modpost section check allowlist, which
hid legitimate warnings (Johan Hovold)
- Disable -Wtype-limits altogether, instead of enabling at W=2
(Vincent Mailhol)
- Improve UAPI testing to skip testing headers that require a libc
when CONFIG_CC_CAN_LINK is not set, opening up testing of headers
with no libc dependencies to more environments (Thomas Weißschuh)
- Update gendwarfksyms documentation with required dependencies
(Jihan LIN)
- Reject invalid LLVM= values to avoid unintentionally falling back
to system toolchain (Thomas Weißschuh)
- Add a script to help run the kernel build process in a container
for consistent environments and testing (Guillaume Tucker)
- Simplify kallsyms by getting rid of the relative base (Ard
Biesheuvel)
- Performance and usability improvements to scripts/make_fit.py
(Simon Glass)
- Minor various clean ups and fixes
Kconfig:
- Move XPM icons to individual files, clearing up GTK deprecation
warnings (Rostislav Krasny)
- Support
depends on FOO if BAR
as syntactic sugar for
depends on FOO || !BAR
(Nicolas Pitre, Graham Roff)
- Refactor merge_config.sh to use awk over shell/sed/grep,
dramatically speeding up processing large number of config
fragments (Anders Roxell, Mikko Rapeli)"
* tag 'kbuild-7.0-1' of git://git.kernel.org/pub/scm/linux/kernel/git/kbuild/linux: (39 commits)
kbuild: remove dependency of run-command on config
scripts/make_fit: Compress dtbs in parallel
scripts/make_fit: Support a few more parallel compressors
kbuild: Support a FIT_EXTRA_ARGS environment variable
scripts/make_fit: Move dtb processing into a function
scripts/make_fit: Support an initial ramdisk
scripts/make_fit: Speed up operation
rust: kconfig: Don't require RUST_IS_AVAILABLE for rustc-option
MAINTAINERS: Add scripts/install.sh into Kbuild entry
modpost: Amend ppc64 save/restfpr symnames for -Os build
MIPS: tools: relocs: Ship a definition of R_MIPS_PC32
streamline_config.pl: remove superfluous exclamation mark
kbuild: dummy-tools: Add python3
scripts: kconfig: merge_config.sh: warn on duplicate input files
scripts: kconfig: merge_config.sh: use awk in checks too
scripts: kconfig: merge_config.sh: refactor from shell/sed/grep to awk
kallsyms: Get rid of kallsyms relative base
mips: Add support for PC32 relocations in vmlinux
Documentation: dev-tools: add container.rst page
scripts: add tool to run containerized builds
...
|
||
|
|
36ae1c45b2 |
Scheduler changes for v7.0:
Scheduler Kconfig space updates:
- Further consolidate configurable preemption modes: reduce
the number of architectures that are allowed to offer
PREEMPT_NONE and PREEMPT_VOLUNTARY, reducing the number
of preemption models from four to just two: 'full' and 'lazy'
on up-to-date architectures (arm64, loongarch, powerpc,
riscv, s390, x86).
None and voluntary are only available as legacy features
on platforms that don't implement lazy preemption yet,
or which don't even support preemption.
The goal is to eventually remove cond_resched() and
voluntary preemption altogether.
(Peter Zijlstra)
RSEQ based 'scheduler time slice extension' support:
This allows a thread to request a time slice extension when it
enters a critical section to avoid contention on a resource when
the thread is scheduled out inside of the critical section.
- Add fields and constants for time slice extension
- Provide static branch for time slice extensions
- Add statistics for time slice extensions
- Add prctl() to enable time slice extensions
- Implement sys_rseq_slice_yield()
- Implement syscall entry work for time slice extensions
- Implement time slice extension enforcement timer
- Reset slice extension when scheduled
- Implement rseq_grant_slice_extension()
- entry: Hook up rseq time slice extension
- selftests: Implement time slice extension test
(Thomas Gleixner)
- Allow registering RSEQ with slice extension
- Move slice_ext_nsec to debugfs
- Lower default slice extension
- selftests/rseq: Add rseq slice histogram script
(Peter Zijlstra)
Scheduler performance/scalability improvements:
- Update rq->avg_idle when a task is moved to an idle CPU,
which improves the scalability of various workloads.
(Shubhang Kaushik)
- Reorder fields in 'struct rq' for better caching
(Blake Jones)
- Fair scheduler SMP NOHZ balancing code speedups:
- Move checking for nohz cpus after time check
- Change likelyhood of nohz.nr_cpus
- Remove nohz.nr_cpus and use weight of cpumask instead
(Shrikanth Hegde)
- Avoid false sharing for sched_clock_irqtime (Wangyang Guo)
- Drop useless cpumask_empty() in find_energy_efficient_cpu()
- Simplify task_numa_find_cpu()
- Use cpumask_weight_and() in sched_balance_find_dst_group()
(Yury Norov)
DL scheduler updates:
- Add a deadline server for sched_ext tasks (by Andrea Righi and
Joel Fernandes, with fixes by Peter Zijlstra)
RT scheduler updates:
- Skip currently executing CPU in rto_next_cpu() (Chen Jinghuang)
Entry code updates and performance improvements, which is part of the
scheduler tree in this cycle due to interdependencies with the RSEQ
based time slice extension work:
- Remove unused syscall argument from syscall_trace_enter()
- Rework syscall_exit_to_user_mode_work() for architecture reuse
- Add arch_ptrace_report_syscall_entry/exit()
- Inline syscall_exit_work() and syscall_trace_enter()
(Jinjie Ruan)
Scheduler core updates:
- Rework sched_class::wakeup_preempt() and rq_modified_*()
- Avoid rq->lock bouncing in sched_balance_newidle()
- Rename rcu_dereference_check_sched_domain() =>
rcu_dereference_sched_domain()
- <linux/compiler_types.h>: Add the __signed_scalar_typeof() helper
(Peter Zijlstra)
Fair scheduler updates/refactoring:
- Fold the sched_avg update
- Change rcu_dereference_check_sched_domain() to rcu-sched
- Switch to rcu_dereference_all()
- Remove superfluous rcu_read_lock()
- Limit hrtick work
(Peter Zijlstra)
- Join two #ifdef CONFIG_FAIR_GROUP_SCHED blocks
- Clean up comments in 'struct cfs_rq'
- Separate se->vlag from se->vprot
- Rename cfs_rq::avg_load to cfs_rq::sum_weight
- Rename cfs_rq::avg_vruntime to ::sum_w_vruntime & helper functions
- Introduce and use the vruntime_cmp() and vruntime_op() wrappers
for wrapped-signed aritmetics
- Sort out 'blocked_load*' namespace noise
(Ingo Molnar)
Scheduler debugging code updates:
- Export hidden tracepoints to modules (Gabriele Monaco)
- Convert copy_from_user() + kstrtouint() to kstrtouint_from_user()
(Fushuai Wang)
- Add assertions to QUEUE_CLASS (Peter Zijlstra)
- hrtimer: Fix tracing oddity (Thomas Gleixner)
Misc fixes and cleanups:
- Re-evaluate scheduling when migrating queued tasks out of
throttled cgroups (Zicheng Qu)
- Remove task_struct->faults_disabled_mapping (Christoph Hellwig)
- Fix math notation errors in avg_vruntime comment (Zhan Xusheng)
- sched/cpufreq: Use %pe format for PTR_ERR() printing (zenghongling)
Signed-off-by: Ingo Molnar <mingo@kernel.org>
-----BEGIN PGP SIGNATURE-----
iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAmmJj+IRHG1pbmdvQGtl
cm5lbC5vcmcACgkQEnMQ0APhK1grtQ//WyXYGVE/WicdqslfaCY2Mr0uJnL0tLSM
CJp+0LROdkmy+ChJmftO8RgjCUSsjhC4/xcBhUQXApf/ffQi3b2jH6nkTp/Z64Ms
p2IXLkBiZjwdcO6fGbB0JE2G1J4hGRC5BlqfgkZzWidMf3kIbmrHg99mVWGzODLY
N/cPW4d0WGf9TScl1FgEiOqgF3czMLlqvTDJqaFMpsTzSUcRBnrG4xushb4W/bBx
573eqxgZJ6urNSGu8niY9PAl9F7gskXW3YxI3k8SH7VmJKSevWlwI9vMEhcRDzud
E0XxD7J8iPOKtr7ypXm7anMBv4jWVUdAnPbYi4TDsyDDU/HguqMqT1McTGn8wQ+F
jmdhmMC9/TEIzq93SNLbCYieibqDsmJoNVFFi0FWfPLMtYbcZd5a884SIz532vx4
DegdlDXdazUwhxzDiQR3sq1CsHXpxNS2YdrpadAtF/r2gU86DQjsEew8yBvXi7bb
Wrkzpax70sU1AFI23wJQkEb/OnnXyehAHAhhQN6GVvuiGr9P7C02WLEGLlmSmJrx
zl2F750P76yhTfGcvTfJ/5LTfSB+yRozGvcdXnIkyzWotY6a2D1MKNusAfVax+IR
kyfAWqVdxBhlKnqYbu92lTogvnPh3Lymd6G4TZZRkSH2jixyGd2oS7nZaDBAeBEM
NHQtr9R+KyU=
=Xj2f
-----END PGP SIGNATURE-----
Merge tag 'sched-core-2026-02-09' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull scheduler updates from Ingo Molnar:
"Scheduler Kconfig space updates:
- Further consolidate configurable preemption modes (Peter Zijlstra)
Reduce the number of architectures that are allowed to offer
PREEMPT_NONE and PREEMPT_VOLUNTARY, reducing the number of
preemption models from four to just two: 'full' and 'lazy' on
up-to-date architectures (arm64, loongarch, powerpc, riscv, s390,
x86).
None and voluntary are only available as legacy features on
platforms that don't implement lazy preemption yet, or which don't
even support preemption.
The goal is to eventually remove cond_resched() and voluntary
preemption altogether.
RSEQ based 'scheduler time slice extension' support (Thomas Gleixner
and Peter Zijlstra):
This allows a thread to request a time slice extension when it enters
a critical section to avoid contention on a resource when the thread
is scheduled out inside of the critical section.
- Add fields and constants for time slice extension
- Provide static branch for time slice extensions
- Add statistics for time slice extensions
- Add prctl() to enable time slice extensions
- Implement sys_rseq_slice_yield()
- Implement syscall entry work for time slice extensions
- Implement time slice extension enforcement timer
- Reset slice extension when scheduled
- Implement rseq_grant_slice_extension()
- entry: Hook up rseq time slice extension
- selftests: Implement time slice extension test
- Allow registering RSEQ with slice extension
- Move slice_ext_nsec to debugfs
- Lower default slice extension
- selftests/rseq: Add rseq slice histogram script
Scheduler performance/scalability improvements:
- Update rq->avg_idle when a task is moved to an idle CPU, which
improves the scalability of various workloads (Shubhang Kaushik)
- Reorder fields in 'struct rq' for better caching (Blake Jones)
- Fair scheduler SMP NOHZ balancing code speedups (Shrikanth Hegde):
- Move checking for nohz cpus after time check
- Change likelyhood of nohz.nr_cpus
- Remove nohz.nr_cpus and use weight of cpumask instead
- Avoid false sharing for sched_clock_irqtime (Wangyang Guo)
- Cleanups (Yury Norov):
- Drop useless cpumask_empty() in find_energy_efficient_cpu()
- Simplify task_numa_find_cpu()
- Use cpumask_weight_and() in sched_balance_find_dst_group()
DL scheduler updates:
- Add a deadline server for sched_ext tasks (by Andrea Righi and Joel
Fernandes, with fixes by Peter Zijlstra)
RT scheduler updates:
- Skip currently executing CPU in rto_next_cpu() (Chen Jinghuang)
Entry code updates and performance improvements (Jinjie Ruan)
This is part of the scheduler tree in this cycle due to inter-
dependencies with the RSEQ based time slice extension work:
- Remove unused syscall argument from syscall_trace_enter()
- Rework syscall_exit_to_user_mode_work() for architecture reuse
- Add arch_ptrace_report_syscall_entry/exit()
- Inline syscall_exit_work() and syscall_trace_enter()
Scheduler core updates (Peter Zijlstra):
- Rework sched_class::wakeup_preempt() and rq_modified_*()
- Avoid rq->lock bouncing in sched_balance_newidle()
- Rename rcu_dereference_check_sched_domain() =>
rcu_dereference_sched_domain()
- <linux/compiler_types.h>: Add the __signed_scalar_typeof() helper
Fair scheduler updates/refactoring (Peter Zijlstra and Ingo Molnar):
- Fold the sched_avg update
- Change rcu_dereference_check_sched_domain() to rcu-sched
- Switch to rcu_dereference_all()
- Remove superfluous rcu_read_lock()
- Limit hrtick work
- Join two #ifdef CONFIG_FAIR_GROUP_SCHED blocks
- Clean up comments in 'struct cfs_rq'
- Separate se->vlag from se->vprot
- Rename cfs_rq::avg_load to cfs_rq::sum_weight
- Rename cfs_rq::avg_vruntime to ::sum_w_vruntime & helper functions
- Introduce and use the vruntime_cmp() and vruntime_op() wrappers for
wrapped-signed aritmetics
- Sort out 'blocked_load*' namespace noise
Scheduler debugging code updates:
- Export hidden tracepoints to modules (Gabriele Monaco)
- Convert copy_from_user() + kstrtouint() to kstrtouint_from_user()
(Fushuai Wang)
- Add assertions to QUEUE_CLASS (Peter Zijlstra)
- hrtimer: Fix tracing oddity (Thomas Gleixner)
Misc fixes and cleanups:
- Re-evaluate scheduling when migrating queued tasks out of throttled
cgroups (Zicheng Qu)
- Remove task_struct->faults_disabled_mapping (Christoph Hellwig)
- Fix math notation errors in avg_vruntime comment (Zhan Xusheng)
- sched/cpufreq: Use %pe format for PTR_ERR() printing
(zenghongling)"
* tag 'sched-core-2026-02-09' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (64 commits)
sched: Re-evaluate scheduling when migrating queued tasks out of throttled cgroups
sched/cpufreq: Use %pe format for PTR_ERR() printing
sched/rt: Skip currently executing CPU in rto_next_cpu()
sched/clock: Avoid false sharing for sched_clock_irqtime
selftests/sched_ext: Add test for DL server total_bw consistency
selftests/sched_ext: Add test for sched_ext dl_server
sched/debug: Fix dl_server (re)start conditions
sched/debug: Add support to change sched_ext server params
sched_ext: Add a DL server for sched_ext tasks
sched/debug: Stop and start server based on if it was active
sched/debug: Fix updating of ppos on server write ops
sched/deadline: Clear the defer params
entry: Inline syscall_exit_work() and syscall_trace_enter()
entry: Add arch_ptrace_report_syscall_entry/exit()
entry: Rework syscall_exit_to_user_mode_work() for architecture reuse
entry: Remove unused syscall argument from syscall_trace_enter()
sched: remove task_struct->faults_disabled_mapping
sched: Update rq->avg_idle when a task is moved to an idle CPU
selftests/rseq: Add rseq slice histogram script
hrtimer: Fix trace oddity
...
|
||
|
|
4d84667627 |
Performance events changes for v7.0:
x86 PMU driver updates:
- Add support for the core PMU for Intel Diamond Rapids (DMR) CPUs.
Compared to previous iterations of the Intel PMU code, there's
been a lot of changes, which center around three main areas:
- Introduce the OFF-MODULE RESPONSE (OMR) facility to
replace the Off-Core Response (OCR) facility
- New PEBS data source encoding layout
- Support the new "RDPMC user disable" feature
(Dapeng Mi)
- Likewise, a large series adds uncore PMU support for
Intel Diamond Rapids (DMR) CPUs, which center around these
four main areas:
- DMR may have two Integrated I/O and Memory Hub (IMH) dies,
separate from the compute tile (CBB) dies. Each CBB and
each IMH die has its own discovery domain.
- Unlike prior CPUs that retrieve the global discovery table
portal exclusively via PCI or MSR, DMR uses PCI for IMH PMON
discovery and MSR for CBB PMON discovery.
- DMR introduces several new PMON types: SCA, HAMVF, D2D_ULA,
UBR, PCIE4, CRS, CPC, ITC, OTC, CMS, and PCIE6.
- IIO free-running counters in DMR are MMIO-based, unlike SPR.
(Zide Chen)
- Also add support for Add missing PMON units for Intel Panther Lake,
and support Nova Lake (NVL), which largely maps to Panther Lake.
(Zide Chen)
- KVM integration: Add support for mediated vPMUs (by Kan Liang
and Sean Christopherson, with fixes and cleanups by Peter Zijlstra,
Sandipan Das and Mingwei Zhang)
- Add Intel cstate driver to support for Wildcat Lake (WCL)
CPUs, which are a low-power variant of Panther Lake.
(Zide Chen)
- Add core, cstate and MSR PMU support for the Airmont NP Intel CPU
(aka MaxLinear Lightning Mountain), which maps to the existing
Airmont code. (Martin Schiller)
Performance enhancements:
- core: Speed up kexec shutdown by avoiding unnecessary
cross CPU calls. (Jan H. Schönherr)
- core: Fix slow perf_event_task_exit() with LBR callstacks
(Namhyung Kim)
User-space stack unwinding support:
- Various cleanups and refactorings in preparation to generalize
the unwinding code for other architectures. (Jens Remus)
Uprobes updates:
- Transition from kmap_atomic to kmap_local_page (Keke Ming)
- Fix incorrect lockdep condition in filter_chain() (Breno Leitao)
- Fix XOL allocation failure for 32-bit tasks (Oleg Nesterov)
Misc fixes and cleanups:
- s390: Remove kvm_types.h from Kbuild (Randy Dunlap)
- x86/intel/uncore: Convert comma to semicolon (Chen Ni)
- x86/uncore: Clean up const mismatch (Greg Kroah-Hartman)
- x86/ibs: Fix typo in dc_l2tlb_miss comment (Xiang-Bin Shi)
Signed-off-by: Ingo Molnar <mingo@kernel.org>
-----BEGIN PGP SIGNATURE-----
iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAmmJhTURHG1pbmdvQGtl
cm5lbC5vcmcACgkQEnMQ0APhK1i/qw/9F/sjMqbxH8d3kPXB8wk2eUuSkynQ2aNw
Zec9qtfCC5N1U9b2D7ywGJRscTmWYnX/3BKTyzFuyA6SDz6buAgDDIGPlHi+9Fww
+RUUS3lQ7N3pVWZ4Ifu3kbh3Vz4lkQuOXhfcjiyIMS6QIxfrcSLFoKHK+2V6PeU+
x0k+THHz/Ymg+DIpqSjqil1yrKaUmU9xRrbnyy6zJB1duREQrkYBhIWL1+bcd7SA
89RVAGXQ+sWzVQMPaKrMkZj6GavOCB7zseigiiwjBRLznukS2OulDDe8zR6pCJZp
wbdc7TR/nCm+QtNfkHlOmTQvsPAXiXNyXe5Vi8aFjGc0uMGhHaeiL9ah/bwsKA5m
Bm5Y7oVSmBlCJbcr/CTrGYkb+WwvLIgPCwVkn4FYPlsWv+U92qTOx9q7qKmIDFaj
1oUXCwoHbYrYnoZqZqPp2h689m0Lh/lsGhy0QRt8aGnKu0SDjMaqRGbYWH0UI0kA
aDnZstVHG76RnVi0143q8HcvvjNZb82NL0cS749tY/YcwH4kUEGj2XuSK2Ar887T
H0oDJHXijMlGXWqO5bK3WMoCQajR7nRyqBo7/rKYj20OjXmwoXS3vel77/s8WFo2
fUUC469MacDzyxdBNutnkJvvcvsUio3r4MWFsEEWQk2nUE58PtN8YM8j/FdNYpql
zAZ4Jx/A0RM=
=4SRB
-----END PGP SIGNATURE-----
Merge tag 'perf-core-2026-02-09' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull performance event updates from Ingo Molnar:
"x86 PMU driver updates:
- Add support for the core PMU for Intel Diamond Rapids (DMR) CPUs
(Dapeng Mi)
Compared to previous iterations of the Intel PMU code, there's been
a lot of changes, which center around three main areas:
- Introduce the OFF-MODULE RESPONSE (OMR) facility to replace the
Off-Core Response (OCR) facility
- New PEBS data source encoding layout
- Support the new "RDPMC user disable" feature
- Likewise, a large series adds uncore PMU support for Intel Diamond
Rapids (DMR) CPUs (Zide Chen)
This centers around these four main areas:
- DMR may have two Integrated I/O and Memory Hub (IMH) dies,
separate from the compute tile (CBB) dies. Each CBB and each IMH
die has its own discovery domain.
- Unlike prior CPUs that retrieve the global discovery table
portal exclusively via PCI or MSR, DMR uses PCI for IMH PMON
discovery and MSR for CBB PMON discovery.
- DMR introduces several new PMON types: SCA, HAMVF, D2D_ULA, UBR,
PCIE4, CRS, CPC, ITC, OTC, CMS, and PCIE6.
- IIO free-running counters in DMR are MMIO-based, unlike SPR.
- Also add support for Add missing PMON units for Intel Panther Lake,
and support Nova Lake (NVL), which largely maps to Panther Lake.
(Zide Chen)
- KVM integration: Add support for mediated vPMUs (by Kan Liang and
Sean Christopherson, with fixes and cleanups by Peter Zijlstra,
Sandipan Das and Mingwei Zhang)
- Add Intel cstate driver to support for Wildcat Lake (WCL) CPUs,
which are a low-power variant of Panther Lake (Zide Chen)
- Add core, cstate and MSR PMU support for the Airmont NP Intel CPU
(aka MaxLinear Lightning Mountain), which maps to the existing
Airmont code (Martin Schiller)
Performance enhancements:
- Speed up kexec shutdown by avoiding unnecessary cross CPU calls
(Jan H. Schönherr)
- Fix slow perf_event_task_exit() with LBR callstacks (Namhyung Kim)
User-space stack unwinding support:
- Various cleanups and refactorings in preparation to generalize the
unwinding code for other architectures (Jens Remus)
Uprobes updates:
- Transition from kmap_atomic to kmap_local_page (Keke Ming)
- Fix incorrect lockdep condition in filter_chain() (Breno Leitao)
- Fix XOL allocation failure for 32-bit tasks (Oleg Nesterov)
Misc fixes and cleanups:
- s390: Remove kvm_types.h from Kbuild (Randy Dunlap)
- x86/intel/uncore: Convert comma to semicolon (Chen Ni)
- x86/uncore: Clean up const mismatch (Greg Kroah-Hartman)
- x86/ibs: Fix typo in dc_l2tlb_miss comment (Xiang-Bin Shi)"
* tag 'perf-core-2026-02-09' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (58 commits)
s390: remove kvm_types.h from Kbuild
uprobes: Fix incorrect lockdep condition in filter_chain()
x86/ibs: Fix typo in dc_l2tlb_miss comment
x86/uprobes: Fix XOL allocation failure for 32-bit tasks
perf/x86/intel/uncore: Convert comma to semicolon
perf/x86/intel: Add support for rdpmc user disable feature
perf/x86: Use macros to replace magic numbers in attr_rdpmc
perf/x86/intel: Add core PMU support for Novalake
perf/x86/intel: Add support for PEBS memory auxiliary info field in NVL
perf/x86/intel: Add core PMU support for DMR
perf/x86/intel: Add support for PEBS memory auxiliary info field in DMR
perf/x86/intel: Support the 4 new OMR MSRs introduced in DMR and NVL
perf/core: Fix slow perf_event_task_exit() with LBR callstacks
perf/core: Speed up kexec shutdown by avoiding unnecessary cross CPU calls
uprobes: use kmap_local_page() for temporary page mappings
arm/uprobes: use kmap_local_page() in arch_uprobe_copy_ixol()
mips/uprobes: use kmap_local_page() in arch_uprobe_copy_ixol()
arm64/uprobes: use kmap_local_page() in arch_uprobe_copy_ixol()
riscv/uprobes: use kmap_local_page() in arch_uprobe_copy_ixol()
perf/x86/intel/uncore: Add Nova Lake support
...
|
||
|
|
f17b474e36 |
bpf-next-7.0
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEE+soXsSLHKoYyzcli6rmadz2vbToFAmmGmrgACgkQ6rmadz2v
bTq6NxAAkCHosxzGn9GYYBV8xhrBJoJJDCyEbQ4nR0XNY+zaWnuykmiPP9w1aOAM
zm/po3mQB2pZjetvlrPrgG5RLgBCAUHzqVGy0r+phUvD3vbohKlmSlMm2kiXOb9N
T01BgLWsyqN2ZcNFvORdSsftqIJUHcXxU6RdupGD60sO5XM9ty5cwyewLX8GBOas
UN2bOhbK2DpqYWUvtv+3Q3ykxoStMSkXZvDRurwLKl4RHeLjXZXPo8NjnfBlk/F2
vdFo/F4NO4TmhOave6UPXvKb4yo9IlBRmiPAl0RmNKBxenY8j9XuV/xZxU6YgzDn
+SQfDK+CKQ4IYIygE+fqd4e5CaQrnjmPPcIw12AB2CF0LimY9Xxyyk6FSAhMN7wm
GTVh5K2C3Dk3OiRQk4G58EvQ5QcxzX98IeeCpcckMUkPsFWHRvF402WMUcv9SWpD
DsxxPkfENY/6N67EvH0qcSe/ikdUorQKFl4QjXKwsMCd5WhToeP4Z7Ck1gVSNkAh
9CX++mLzg333Lpsc4SSIuk9bEPpFa5cUIKUY7GCsCiuOXciPeMDP3cGSd5LioqxN
qWljs4Z88QDM2LJpAh8g4m3sA7bMhES3nPmdlI5CfgBcVyLW8D8CqQq4GEZ1McwL
Ky084+lEosugoVjRejrdMMEOsqAfcbkTr2b8jpuAZdwJKm6p/bw=
=cBdK
-----END PGP SIGNATURE-----
Merge tag 'bpf-next-7.0' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Pull bpf updates from Alexei Starovoitov:
- Support associating BPF program with struct_ops (Amery Hung)
- Switch BPF local storage to rqspinlock and remove recursion detection
counters which were causing false positives (Amery Hung)
- Fix live registers marking for indirect jumps (Anton Protopopov)
- Introduce execution context detection BPF helpers (Changwoo Min)
- Improve verifier precision for 32bit sign extension pattern
(Cupertino Miranda)
- Optimize BTF type lookup by sorting vmlinux BTF and doing binary
search (Donglin Peng)
- Allow states pruning for misc/invalid slots in iterator loops (Eduard
Zingerman)
- In preparation for ASAN support in BPF arenas teach libbpf to move
global BPF variables to the end of the region and enable arena kfuncs
while holding locks (Emil Tsalapatis)
- Introduce support for implicit arguments in kfuncs and migrate a
number of them to new API. This is a prerequisite for cgroup
sub-schedulers in sched-ext (Ihor Solodrai)
- Fix incorrect copied_seq calculation in sockmap (Jiayuan Chen)
- Fix ORC stack unwind from kprobe_multi (Jiri Olsa)
- Speed up fentry attach by using single ftrace direct ops in BPF
trampolines (Jiri Olsa)
- Require frozen map for calculating map hash (KP Singh)
- Fix lock entry creation in TAS fallback in rqspinlock (Kumar
Kartikeya Dwivedi)
- Allow user space to select cpu in lookup/update operations on per-cpu
array and hash maps (Leon Hwang)
- Make kfuncs return trusted pointers by default (Matt Bobrowski)
- Introduce "fsession" support where single BPF program is executed
upon entry and exit from traced kernel function (Menglong Dong)
- Allow bpf_timer and bpf_wq use in all programs types (Mykyta
Yatsenko, Andrii Nakryiko, Kumar Kartikeya Dwivedi, Alexei
Starovoitov)
- Make KF_TRUSTED_ARGS the default for all kfuncs and clean up their
definition across the tree (Puranjay Mohan)
- Allow BPF arena calls from non-sleepable context (Puranjay Mohan)
- Improve register id comparison logic in the verifier and extend
linked registers with negative offsets (Puranjay Mohan)
- In preparation for BPF-OOM introduce kfuncs to access memcg events
(Roman Gushchin)
- Use CFI compatible destructor kfunc type (Sami Tolvanen)
- Add bitwise tracking for BPF_END in the verifier (Tianci Cao)
- Add range tracking for BPF_DIV and BPF_MOD in the verifier (Yazhou
Tang)
- Make BPF selftests work with 64k page size (Yonghong Song)
* tag 'bpf-next-7.0' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (268 commits)
selftests/bpf: Fix outdated test on storage->smap
selftests/bpf: Choose another percpu variable in bpf for btf_dump test
selftests/bpf: Remove test_task_storage_map_stress_lookup
selftests/bpf: Update task_local_storage/task_storage_nodeadlock test
selftests/bpf: Update task_local_storage/recursion test
selftests/bpf: Update sk_storage_omem_uncharge test
bpf: Switch to bpf_selem_unlink_nofail in bpf_local_storage_{map_free, destroy}
bpf: Support lockless unlink when freeing map or local storage
bpf: Prepare for bpf_selem_unlink_nofail()
bpf: Remove unused percpu counter from bpf_local_storage_map_free
bpf: Remove cgroup local storage percpu counter
bpf: Remove task local storage percpu counter
bpf: Change local_storage->lock and b->lock to rqspinlock
bpf: Convert bpf_selem_unlink to failable
bpf: Convert bpf_selem_link_map to failable
bpf: Convert bpf_selem_unlink_map to failable
bpf: Select bpf_local_storage_map_bucket based on bpf_local_storage
selftests/xsk: fix number of Tx frags in invalid packet
selftests/xsk: properly handle batch ending in the middle of a packet
bpf: Prevent reentrance into call_rcu_tasks_trace()
...
|
||
|
|
85f24b0ace |
hardening updates for v7.0-rc1
- Various missed __counted_by annotations (Thorsten Blum)
- Various missed -Wflex-array-member-not-at-end fixes (Gustavo A. R. Silva)
- Avoid leftover tempfiles for interrupted compile-time FORTIFY tests
(Nicolas Schier)
- Remove non-existant CONFIG_UBSAN_REPORT_FULL from docs (Stefan Wiehler)
- fortify: Use C arithmetic not FIELD_xxx() in FORTIFY_REASON defines
(David Laight)
- Add __counted_by_ptr attribute, tests, and first user (Bill Wendling,
Kees Cook)
- Update MAINTAINERS file to make hardening section not include pstore
-----BEGIN PGP SIGNATURE-----
iHUEABYKAB0WIQRSPkdeREjth1dHnSE2KwveOeQkuwUCaYopPQAKCRA2KwveOeQk
u5QDAQDj9ygsbzLj9EtXtU3T03GTvix4Rx7RkaBAMPSDEJGhNQD+M3dP6Z2ogEMz
1Km6dAC7nTEujsVFur9BVpyEgoBjjQM=
=G1gb
-----END PGP SIGNATURE-----
Merge tag 'hardening-v7.0-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux
Pull hardening updates from Kees Cook:
"Mostly small cleanups and various scattered annotations and flex array
warning fixes that we reviewed by unlanded in other trees. Introduces
new annotation for expanding counted_by to pointer members, now that
compiler behavior between GCC and Clang has been normalized.
- Various missed __counted_by annotations (Thorsten Blum)
- Various missed -Wflex-array-member-not-at-end fixes (Gustavo A. R.
Silva)
- Avoid leftover tempfiles for interrupted compile-time FORTIFY tests
(Nicolas Schier)
- Remove non-existant CONFIG_UBSAN_REPORT_FULL from docs (Stefan
Wiehler)
- fortify: Use C arithmetic not FIELD_xxx() in FORTIFY_REASON defines
(David Laight)
- Add __counted_by_ptr attribute, tests, and first user (Bill
Wendling, Kees Cook)
- Update MAINTAINERS file to make hardening section not include
pstore"
* tag 'hardening-v7.0-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
MAINTAINERS: pstore: Remove L: entry
nfp: tls: Avoid -Wflex-array-member-not-at-end warnings
carl9170: Avoid -Wflex-array-member-not-at-end warning
coredump: Use __counted_by_ptr for struct core_name::corename
lkdtm/bugs: Add __counted_by_ptr() test PTR_BOUNDS
compiler_types.h: Attributes: Add __counted_by_ptr macro
fortify: Cleanup temp file also on non-successful exit
fortify: Rename temporary file to match ignore pattern
fortify: Use C arithmetic not FIELD_xxx() in FORTIFY_REASON defines
ecryptfs: Annotate struct ecryptfs_message with __counted_by
fs/xattr: Annotate struct simple_xattr with __counted_by
crypto: af_alg - Annotate struct af_alg_iv with __counted_by
Kconfig.ubsan: Remove CONFIG_UBSAN_REPORT_FULL from documentation
drm/nouveau: fifo: Avoid -Wflex-array-member-not-at-end warning
|
||
|
|
d16738a4e7 |
The kthread code provides an infrastructure which manages the preferred
affinity of unbound kthreads (node or custom cpumask) against housekeeping (CPU isolation) constraints and CPU hotplug events. One crucial missing piece is the handling of cpuset: when an isolated partition is created, deleted, or its CPUs updated, all the unbound kthreads in the top cpuset become indifferently affine to _all_ the non-isolated CPUs, possibly breaking their preferred affinity along the way. Solve this with performing the kthreads affinity update from cpuset to the kthreads consolidated relevant code instead so that preferred affinities are honoured and applied against the updated cpuset isolated partitions. The dispatch of the new isolated cpumasks to timers, workqueues and kthreads is performed by housekeeping, as per the nice Tejun's suggestion. As a welcome side effect, HK_TYPE_DOMAIN then integrates both the set from boot defined domain isolation (through isolcpus=) and cpuset isolated partitions. Housekeeping cpumasks are now modifiable with a specific RCU based synchronization. A big step toward making nohz_full= also mutable through cpuset in the future. -----BEGIN PGP SIGNATURE----- iQJPBAABCAA5FiEEd76+gtGM8MbftQlOhSRUR1COjHcFAmmE0mYbFIAAAAAABAAO bWFudTIsMi41KzEuMTEsMiwyAAoJEIUkVEdQjox36eMP/0Ls/ArfYVi/MNAXWlpy rAt6m9Y/X9GBcDM/VI9BXq1ZX4qEr2XjJ8UUb8cM08uHEAt0ErlmpRxREwJFrKbI H4jzg5EwO0D0c6MnvgQJEAwkHxQVIjsxG9DovRIjxyW4ycx3aSsRg/f2VKyWoLvY 7ZT7CbLFE+I/MQh2ZgUu/9pnCDQVR2anss2WYIej5mmgFL5pyEv3YvYgKYVyK08z sXyNxpP976g2d9ECJ9OtFJV9we6mlqxlG0MVCiv/Uxh7DBjxWWPsLvlmLAXggQ03 +0GW+nnutDaKz83pgS7Z4zum/+Oa+I1dTLIN27pARUNcMCYip7njM2KNpJwPdov3 +fAIODH2JVX1xewT+U1cCq6gdI55ejbwdQYGFV075dKBUxKQeIyrghvfC3Ga6aKQ Gw3y68jdrXOw6iyfHR5k/0Mnu2/FDKUW2fZxLKm55PvNZP5jQFmSlz9wyiwwyb3m UUSgThj6Ozodxks8hDX41rGVezCcm1ni+qNSiNIs8HPaaZQrwbnvKHQFBBJHQzJP rJ39VWBx3Hq/ly71BOR6pCzoZsfS1f85YKhJ4vsfjLO6BfhI16nBat89eROSRKcz XptyWqW0PgAD0teDuMCTPNuUym/viBHALXHKuSO12CIizacvftiGcmaQNPlLiiFZ /Dr2+aOhwYw3UD6djn3u94M9 =nWGh -----END PGP SIGNATURE----- Merge tag 'kthread-for-7.0' of git://git.kernel.org/pub/scm/linux/kernel/git/frederic/linux-dynticks Pull kthread updates from Frederic Weisbecker: "The kthread code provides an infrastructure which manages the preferred affinity of unbound kthreads (node or custom cpumask) against housekeeping (CPU isolation) constraints and CPU hotplug events. One crucial missing piece is the handling of cpuset: when an isolated partition is created, deleted, or its CPUs updated, all the unbound kthreads in the top cpuset become indifferently affine to _all_ the non-isolated CPUs, possibly breaking their preferred affinity along the way. Solve this with performing the kthreads affinity update from cpuset to the kthreads consolidated relevant code instead so that preferred affinities are honoured and applied against the updated cpuset isolated partitions. The dispatch of the new isolated cpumasks to timers, workqueues and kthreads is performed by housekeeping, as per the nice Tejun's suggestion. As a welcome side effect, HK_TYPE_DOMAIN then integrates both the set from boot defined domain isolation (through isolcpus=) and cpuset isolated partitions. Housekeeping cpumasks are now modifiable with a specific RCU based synchronization. A big step toward making nohz_full= also mutable through cpuset in the future" * tag 'kthread-for-7.0' of git://git.kernel.org/pub/scm/linux/kernel/git/frederic/linux-dynticks: (33 commits) doc: Add housekeeping documentation kthread: Document kthread_affine_preferred() kthread: Comment on the purpose and placement of kthread_affine_node() call kthread: Honour kthreads preferred affinity after cpuset changes sched/arm64: Move fallback task cpumask to HK_TYPE_DOMAIN sched: Switch the fallback task allowed cpumask to HK_TYPE_DOMAIN kthread: Rely on HK_TYPE_DOMAIN for preferred affinity management kthread: Include kthreadd to the managed affinity list kthread: Include unbound kthreads in the managed affinity list kthread: Refine naming of affinity related fields PCI: Remove superfluous HK_TYPE_WQ check sched/isolation: Remove HK_TYPE_TICK test from cpu_is_isolated() cpuset: Remove cpuset_cpu_is_isolated() timers/migration: Remove superfluous cpuset isolation test cpuset: Propagate cpuset isolation update to timers through housekeeping cpuset: Propagate cpuset isolation update to workqueue through housekeeping PCI: Flush PCI probe workqueue on cpuset isolated partition change sched/isolation: Flush vmstat workqueues on cpuset isolated partition change sched/isolation: Flush memcg workqueues on cpuset isolated partition change cpuset: Update HK_TYPE_DOMAIN cpumask from cpuset ... |
||
|
|
9e355113f0 |
vfs-7.0-rc1.misc
Please consider pulling these changes from the signed vfs-7.0-rc1.misc tag.
Thanks!
Christian
-----BEGIN PGP SIGNATURE-----
iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCaYX49QAKCRCRxhvAZXjc
ojrZAQD1VJzY46r5FnAVf4jlEHyjIbDnZCP/n+c4x6XnqpU6EQEAgB0yAtAGP6+u
SBuytElqHoTT5VtmEXTAabCNQ9Ks8wo=
=JwZz
-----END PGP SIGNATURE-----
Merge tag 'vfs-7.0-rc1.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs
Pull misc vfs updates from Christian Brauner:
"This contains a mix of VFS cleanups, performance improvements, API
fixes, documentation, and a deprecation notice.
Scalability and performance:
- Rework pid allocation to only take pidmap_lock once instead of
twice during alloc_pid(), improving thread creation/teardown
throughput by 10-16% depending on false-sharing luck. Pad the
namespace refcount to reduce false-sharing
- Track file lock presence via a flag in ->i_opflags instead of
reading ->i_flctx, avoiding false-sharing with ->i_readcount on
open/close hot paths. Measured 4-16% improvement on 24-core
open-in-a-loop benchmarks
- Use a consume fence in locks_inode_context() to match the
store-release/load-consume idiom, eliminating a hardware fence on
some architectures
- Annotate cdev_lock with __cacheline_aligned_in_smp to prevent
false-sharing
- Remove a redundant DCACHE_MANAGED_DENTRY check in
__follow_mount_rcu() that never fires since the caller already
verifies it, eliminating a 100% mispredicted branch
- Fix a 100% mispredicted likely() in devcgroup_inode_permission()
that became wrong after a prior code reorder
Bug fixes and correctness:
- Make insert_inode_locked() wait for inode destruction instead of
skipping, fixing a corner case where two matching inodes could
exist in the hash
- Move f_mode initialization before file_ref_init() in alloc_file()
to respect the SLAB_TYPESAFE_BY_RCU ordering contract
- Add a WARN_ON_ONCE guard in try_to_free_buffers() for folios with
no buffers attached, preventing a null pointer dereference when
AS_RELEASE_ALWAYS is set but no release_folio op exists
- Fix select restart_block to store end_time as timespec64, avoiding
truncation of tv_sec on 32-bit architectures
- Make dump_inode() use get_kernel_nofault() to safely access inode
and superblock fields, matching the dump_mapping() pattern
API modernization:
- Make posix_acl_to_xattr() allocate the buffer internally since
every single caller was doing it anyway. Reduces boilerplate and
unnecessary error checking across ~15 filesystems
- Replace deprecated simple_strtoul() with kstrtoul() for the
ihash_entries, dhash_entries, mhash_entries, and mphash_entries
boot parameters, adding proper error handling
- Convert chardev code to use guard(mutex) and __free(kfree) cleanup
patterns
- Replace min_t() with min() or umin() in VFS code to avoid silently
truncating unsigned long to unsigned int
- Gate LOOKUP_RCU assertions behind CONFIG_DEBUG_VFS since callers
already check the flag
Deprecation:
- Begin deprecating legacy BSD process accounting (acct(2)). The
interface has numerous footguns and better alternatives exist
(eBPF)
Documentation:
- Fix and complete kernel-doc for struct export_operations, removing
duplicated documentation between ReST and source
- Fix kernel-doc warnings for __start_dirop() and ilookup5_nowait()
Testing:
- Add a kunit test for initramfs cpio handling of entries with
filesize > PATH_MAX
Misc:
- Add missing <linux/init_task.h> include in fs_struct.c"
* tag 'vfs-7.0-rc1.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: (28 commits)
posix_acl: make posix_acl_to_xattr() alloc the buffer
fs: make insert_inode_locked() wait for inode destruction
initramfs_test: kunit test for cpio.filesize > PATH_MAX
fs: improve dump_inode() to safely access inode fields
fs: add <linux/init_task.h> for 'init_fs'
docs: exportfs: Use source code struct documentation
fs: move initializing f_mode before file_ref_init()
exportfs: Complete kernel-doc for struct export_operations
exportfs: Mark struct export_operations functions at kernel-doc
exportfs: Fix kernel-doc output for get_name()
acct(2): begin the deprecation of legacy BSD process accounting
device_cgroup: remove branch hint after code refactor
VFS: fix __start_dirop() kernel-doc warnings
fs: Describe @isnew parameter in ilookup5_nowait()
fs/namei: Remove redundant DCACHE_MANAGED_DENTRY check in __follow_mount_rcu
fs: only assert on LOOKUP_RCU when built with CONFIG_DEBUG_VFS
select: store end_time as timespec64 in restart block
chardev: Switch to guard(mutex) and __free(kfree)
namespace: Replace simple_strtoul with kstrtoul to parse boot params
dcache: Replace simple_strtoul with kstrtoul in set_dhash_entries
...
|
||
|
|
c84bb79f70 |
vfs-7.0-rc1.nullfs
Please consider pulling these changes from the signed vfs-7.0-rc1.nullfs tag.
Thanks!
Christian
-----BEGIN PGP SIGNATURE-----
iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCaYX49gAKCRCRxhvAZXjc
olG7AQD9TywOR0HC9PMT8jrhC1TKODnZ4H1aLNlYVltzfJ09xwEAwFSGO4rQmGAF
aZdD0RQw4bkf7IC1PIZHEGUqmVXJCQ8=
=NvyI
-----END PGP SIGNATURE-----
Merge tag 'vfs-7.0-rc1.nullfs' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs
Pull vfs nullfs update from Christian Brauner:
"Add a completely catatonic minimal pseudo filesystem called "nullfs"
and make pivot_root() work in the initramfs.
Currently pivot_root() does not work on the real rootfs because it
cannot be unmounted. Userspace has to recursively delete initramfs
contents manually before continuing boot, using the fragile
switch_root sequence (overmount + chroot).
Add nullfs, a minimal immutable filesystem that serves as the true
root of the mount hierarchy. The mutable rootfs (tmpfs/ramfs) is
mounted on top of it. This allows userspace to simply:
chdir(new_root);
pivot_root(".", ".");
umount2(".", MNT_DETACH);
without the traditional switch_root workarounds. systemd already
handles this correctly. It tries pivot_root() first and falls back
to MS_MOVE only when that fails.
This also means rootfs mounts in unprivileged namespaces no longer
need MNT_LOCKED, since the immutable nullfs guarantees nothing can be
revealed by unmounting the covering mount.
nullfs is a single-instance filesystem (get_tree_single()) marked
SB_NOUSER | SB_I_NOEXEC | SB_I_NODEV with an immutable empty root
directory. This means sooner or later it can be used to overmount
other directories to hide their contents without any additional
protection needed.
We enable it unconditionally. If we see any real regression we'll
hide it behind a boot option.
nullfs has extensions beyond this in the future. It will serve as a
concept to support the creation of completely empty mount namespaces -
which is work coming up in the next cycle"
* tag 'vfs-7.0-rc1.nullfs' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
fs: use nullfs unconditionally as the real rootfs
docs: mention nullfs
fs: add immutable rootfs
fs: add init_pivot_root()
fs: ensure that internal tmpfs mount gets mount id zero
|
||
|
|
996812c453 |
vfs-7.0-rc1.initrd
Please consider pulling these changes from the signed vfs-7.0-rc1.initrd tag. Thanks! Christian -----BEGIN PGP SIGNATURE----- iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCaYX49gAKCRCRxhvAZXjc ordBAQD4d6Y5Zvr852s9deMTDv+ng3bFk1YHqybGe3wEuATstwD/QcvAoNFW9Nn5 n7/268Nk6jTEygT7Fm3tn42SnwOxhgU= =T203 -----END PGP SIGNATURE----- Merge tag 'vfs-7.0-rc1.initrd' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs Pull vfs initrd removal from Christian Brauner: "Remove the deprecated linuxrc-based initrd code path and related dead code. The linuxrc initrd path was deprecated in 2020 and this series completes its removal. If we see real-life regressions we'll revert. The core change removes handle_initrd() and init_linuxrc() — the entire flow that ran /linuxrc from an initrd, pivoted roots, and handed off to the real root filesystem. With that gone, initrd_load() becomes void (no longer short-circuits prepare_namespace()), rd_load_image() is simplified to always load /initrd.image instead of taking a path, and rd_load_disk() is deleted. The /proc/sys/kernel/real-root-dev sysctl and its backing variable are removed since they only existed for linuxrc to communicate the real root device back to the kernel. The no-op load_ramdisk= and prompt_ramdisk= parameters are dropped, and noinitrd and ramdisk_start= gain deprecation warnings. Initramfs is entirely unaffected. The non-linuxrc initrd path (root=/dev/ram0) is preserved but now carries a deprecation warning targeting January 2027 removal" * tag 'vfs-7.0-rc1.initrd' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: init: remove /proc/sys/kernel/real-root-dev initrd: remove deprecated code path (linuxrc) init: remove deprecated "load_ramdisk" and "prompt_ramdisk" command line parameters |
||
|
|
23f09dcc0a |
cpuset: Propagate cpuset isolation update to workqueue through housekeeping
Until now, cpuset would propagate isolated partition changes to workqueues so that unbound workers get properly reaffined. Since housekeeping now centralizes, synchronize and propagates isolation cpumask changes, perform the work from that subsystem for consolidation and consistency purposes. For simplification purpose, the target function is adapted to take the new housekeeping mask instead of the isolated mask. Suggested-by: Tejun Heo <tj@kernel.org> Signed-off-by: Frederic Weisbecker <frederic@kernel.org> Reviewed-by: Waiman Long <longman@redhat.com> Acked-by: Tejun Heo <tj@kernel.org> Cc: "Michal Koutný" <mkoutny@suse.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Lai Jiangshan <jiangshanlai@gmail.com> Cc: Marco Crivellari <marco.crivellari@suse.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Tejun Heo <tj@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Waiman Long <longman@redhat.com> Cc: cgroups@vger.kernel.org |
||
|
|
d49004c5f0 |
arch, mm: consolidate initialization of nodes, zones and memory map
To initialize node, zone and memory map data structures every architecture calls free_area_init() during setup_arch() and passes it an array of zone limits. Beside code duplication it creates "interesting" ordering cases between allocation and initialization of hugetlb and the memory map. Some architectures allocate hugetlb pages very early in setup_arch() in certain cases, some only create hugetlb CMA areas in setup_arch() and sometimes hugetlb allocations happen mm_core_init(). With arch_zone_limits_init() helper available now on all architectures it is no longer necessary to call free_area_init() from architecture setup code. Rather core MM initialization can call arch_zone_limits_init() in a single place. This allows to unify ordering of hugetlb vs memory map allocation and initialization. Remove the call to free_area_init() from architecture specific code and place it in a new mm_core_init_early() function that is called immediately after setup_arch(). After this refactoring it is possible to consolidate hugetlb allocations and eliminate differences in ordering of hugetlb and memory map initialization among different architectures. As the first step of this consolidation move hugetlb_bootmem_alloc() to mm_core_early_init(). Link: https://lkml.kernel.org/r/20260111082105.290734-24-rppt@kernel.org Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org> Cc: Alexander Gordeev <agordeev@linux.ibm.com> Cc: Alex Shi <alexs@kernel.org> Cc: Andreas Larsson <andreas@gaisler.com> Cc: "Borislav Petkov (AMD)" <bp@alien8.de> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: David Hildenbrand <david@kernel.org> Cc: David S. Miller <davem@davemloft.net> Cc: Dinh Nguyen <dinguyen@kernel.org> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Guo Ren <guoren@kernel.org> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Helge Deller <deller@gmx.de> Cc: Huacai Chen <chenhuacai@kernel.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Johannes Berg <johannes@sipsolutions.net> Cc: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Klara Modin <klarasmodin@gmail.com> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Cc: Magnus Lindholm <linmag7@gmail.com> Cc: Matt Turner <mattst88@gmail.com> Cc: Max Filippov <jcmvbkbc@gmail.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michal Hocko <mhocko@suse.com> Cc: Michal Simek <monstr@monstr.eu> Cc: Muchun Song <muchun.song@linux.dev> Cc: Oscar Salvador <osalvador@suse.de> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Pratyush Yadav <pratyush@kernel.org> Cc: Richard Weinberger <richard@nod.at> Cc: "Ritesh Harjani (IBM)" <ritesh.list@gmail.com> Cc: Russell King <linux@armlinux.org.uk> Cc: Stafford Horne <shorne@gmail.com> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Vineet Gupta <vgupta@kernel.org> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Will Deacon <will@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> |
||
|
|
499f86de4f |
init/main: read bootconfig header with get_unaligned_le32()
get_boot_config_from_initrd() scans up to 3 bytes before initrd_end to handle GRUB 4-byte alignment. As a result, the bootconfig header immediately preceding the magic may be unaligned. Read the size and checksum fields with get_unaligned_le32() instead of casting to u32 * and using le32_to_cpu(), avoiding potential unaligned access and silencing sparse "cast to restricted __le32" warnings. Sparse warnings (gcc + C=1): init/main.c:292:16: warning: cast to restricted __le32 init/main.c:293:16: warning: cast to restricted __le32 No functional change intended. Link: https://lkml.kernel.org/r/20260113101532.1630770-1-sun.jian.kdev@gmail.com Signed-off-by: Sun Jian <sun.jian.kdev@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Thomas Gleixner <tglx@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> |