linux

mirror of https://github.com/torvalds/linux.git synced 2026-07-27 17:47:41 +02:00

Author	SHA1	Message	Date
Thomas Gleixner	34594da765	genirq/proc: Increase default interrupt number precision to four Quite some architectures have four character wide acronyms for architecture specific interrupts like IPI, NMI, etc. The default precision of printing the Linux device interrupt numbers is three, which causes quite some code to play games with adding or omitting space after the acronym and the colon in order to keep the per CPU numbers properly aligned. Increase the default number precision to four in the core code and get rid of the space games all over the place. At the same time align all architecture specific descriptor texts left so that they show up in the same column as the interrupt chip names, which makes the output more uniform accross architectures. Fix up the GDB script to this new scheme as well. Signed-off-by: Thomas Gleixner <tglx@kernel.org> Link: https://patch.msgid.link/20260517194931.839482411@kernel.org	2026-05-26 16:21:14 +02:00
Linus Torvalds	065c4e67cc	Mostly cleanups and small things, notably: - musl libc compatibility - vDSO installation fix - TLB sync race fix for recent SMP support - build fix for 32-bit with Clang 20/21 -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEEpeA8sTs3M8SN2hR410qiO8sPaAAFAmnmKS4ACgkQ10qiO8sP aAD2+w//dOOblgUYgQJUXIxHpS7Gcb3Tm+a7ujC23q/kWf/pc8milCSf+zoxzUXL 23Vwh4Gt4KrHKp8lG1gU3xZqV0qwhXNi5HO2hMpB0ioIVpX3TcrUhFbp/Oirvhgi 3PvnvsFtUlW82DFgewB98tefXZSAlG/pg+RjQ3weHfEo+xQbjYc+kR8o59tN8LNR Ea4rrxyjsr3KN2yBNaFpDkMchudP6XWgKByAZBxZ2FofC3zuVRCyF8ThDfQl/3/W muSqX+2iuKjGpmxV0XWt72hYOhNYjBtDY7f4EPe6sbUy+PU6SjD9h/s7VTyVHgZR 3Sii9AQLLJNYPoglExMfmWfeUnJCUJNNTLUze+ZtnhURZQYTvyJRzVmKj6fDPjK2 jGEKXanfZCK9Cfgy2f2xbQxCxhAVwz6QT0XaQO2dZBXa0anzG+2HM0Zn8MNa9jbU +Lm11k1jd1QBifr+5zeni98KHt2mf77blCny8TraODgLNgWUVi5kMkPF4bZgD4Qj udMU9lOkTD08R89hG/Le9TsB+NIpPauyNxDHUpC/VDterFdZqFvmOFT6afTo/4RZ nXNVdL1tn+7O7v0bLdbyhXwj2her1GDbe6HZ5eTNqmjcOthcgI3gF2stDfFhEbNb /wMHnpGPncMeEI8YWtWOFA4FA5T32+LafLCKhuRJdaw0+f/NMOo= =oovZ -----END PGP SIGNATURE----- Merge tag 'uml-for-7.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/uml/linux Pull uml updates from Johannes Berg: "Mostly cleanups and small things, notably: - musl libc compatibility - vDSO installation fix - TLB sync race fix for recent SMP support - build fix for 32-bit with Clang 20/21" * tag 'uml-for-7.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/uml/linux: um: Disable GCOV_PROFILE_ALL on 32-bit UML with Clang 20/21 um: drivers: call kernel_strrchr() explicitly in cow_user.c um: Replace strncpy() with strnlen()+memcpy_and_pad() in strncpy_chunk_from_user() x86/um: fix vDSO installation um: Remove CONFIG_FRAME_WARN from x86_64_defconfig um: Fix pte_read() and pte_exec() for kernel mappings um: Fix potential race condition in TLB sync um: time-travel: clean up kernel-doc warnings um: avoid struct sigcontext redefinition with musl um: fix address-of CMSG_DATA() rvalue in stub	2026-04-20 16:36:46 -07:00
Mike Rapoport (Microsoft)	6215d9f447	arch, mm: consolidate empty_zero_page Reduce 22 declarations of empty_zero_page to 3 and 23 declarations of ZERO_PAGE() to 4. Every architecture defines empty_zero_page that way or another, but for the most of them it is always a page aligned page in BSS and most definitions of ZERO_PAGE do virt_to_page(empty_zero_page). Move Linus vetted x86 definition of empty_zero_page and ZERO_PAGE() to the core MM and drop these definitions in architectures that do not implement colored zero page (MIPS and s390). ZERO_PAGE() remains a macro because turning it to a wrapper for a static inline causes severe pain in header dependencies. For the most part the change is mechanical, with these being noteworthy: * alpha: aliased empty_zero_page with ZERO_PGE that was also used for boot parameters. Switching to a generic empty_zero_page removes the aliasing and keeps ZERO_PGE for boot parameters only * arm64: uses __pa_symbol() in ZERO_PAGE() so that definition of ZERO_PAGE() is kept intact. * m68k/parisc/um: allocated empty_zero_page from memblock, although they do not support zero page coloring and having it in BSS will work fine. * sparc64 can have empty_zero_page in BSS rather allocate it, but it can't use virt_to_page() for BSS. Keep it's definition of ZERO_PAGE() but instead of allocating it, make mem_map_zero point to empty_zero_page. * sh: used empty_zero_page for boot parameters at the very early boot. Rename the parameters page to boot_params_page and let sh use the generic empty_zero_page. * hexagon: had an amusing comment about empty_zero_page /* A handy thing to have if one has the RAM. Declared in head.S */ that unfortunately had to go :) Link: https://lkml.kernel.org/r/20260211103141.3215197-4-rppt@kernel.org Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org> Acked-by: Helge Deller <deller@gmx.de> [parisc] Tested-by: Helge Deller <deller@gmx.de> [parisc] Reviewed-by: Christophe Leroy (CS GROUP) <chleroy@kernel.org> Acked-by: Dave Hansen <dave.hansen@linux.intel.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Acked-by: Magnus Lindholm <linmag7@gmail.com> [alpha] Acked-by: Dinh Nguyen <dinguyen@kernel.org> [nios2] Acked-by: Andreas Larsson <andreas@gaisler.com> [sparc] Acked-by: David Hildenbrand (Arm) <david@kernel.org> Acked-by: Liam R. Howlett <Liam.Howlett@oracle.com> Cc: "Borislav Petkov (AMD)" <bp@alien8.de> Cc: David S. Miller <davem@davemloft.net> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Guo Ren <guoren@kernel.org> Cc: Huacai Chen <chenhuacai@kernel.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Johannes Berg <johannes@sipsolutions.net> Cc: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de> Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Cc: Madhavan Srinivasan <maddy@linux.ibm.com> Cc: Matt Turner <mattst88@gmail.com> Cc: Max Filippov <jcmvbkbc@gmail.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michal Hocko <mhocko@suse.com> Cc: Michal Simek <monstr@monstr.eu> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Richard Weinberger <richard@nod.at> Cc: Russell King <linux@armlinux.org.uk> Cc: Stafford Horne <shorne@gmail.com> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Vineet Gupta <vgupta@kernel.org> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Will Deacon <will@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2026-04-05 13:53:01 -07:00
Kees Cook	8aae2da610	um: Replace strncpy() with strnlen()+memcpy_and_pad() in strncpy_chunk_from_user() Replace the deprecated[1] strncpy() with strnlen() on the source followed by memcpy_and_pad(). This function is a chunk callback for UML's strncpy_from_user() implementation, called by buffer_op() to process userspace memory one page at a time. The source is a kernel-mapped userspace address that is not guaranteed to be NUL-terminated; "len" bounds how many bytes to read from it. By measuring the source string length first with strnlen(), we avoid reading past the NUL terminator in the source. memcpy_and_pad() then copies the string content and zero-fills the remainder of the chunk, preserving the original strncpy() behavior exactly: copy up to the first NUL, then pad with zeros to the full length. strtomem_pad() would be the idiomatic helper for this strnlen() + memcpy_and_pad() pattern, but it requires a compile-time-determinable destination size (via ARRAY_SIZE()). Here the destination is a char * into a caller-provided buffer and the chunk length is a runtime value, so the explicit two-step is necessary. No behavioral change: the same bytes are written to the destination (string content followed by zero padding), the pointer advances by the same amount, and the NUL-found return condition is unchanged. Link: https://github.com/KSPP/linux/issues/90 [1] Signed-off-by: Kees Cook <kees@kernel.org> Link: https://patch.msgid.link/20260323171713.work.839-kees@kernel.org Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2026-03-27 08:21:09 +01:00
Tiwei Bie	cd4126d48f	um: Fix pte_read() and pte_exec() for kernel mappings The pte_read() and pte_exec() helpers are only used during the TLB sync to determine the read/exec permissions for mmap. However, for kernel mappings, they will always return 0. This leads to kern_map() having to unconditionally set the exec flag to 1 and the read flag unexpectedly always being 0. Remove the unnecessary check for the _PAGE_USER bit in these helpers to ensure that the kernel mapping permissions can be correctly determined. Signed-off-by: Tiwei Bie <tiwei.btw@antgroup.com> Link: https://patch.msgid.link/20260302235224.1915380-3-tiwei.btw@antgroup.com Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2026-03-21 10:42:24 +01:00
Tiwei Bie	102331b66b	um: Fix potential race condition in TLB sync During the TLB sync, we need to traverse and modify the page table, so we should hold the page table lock. Since full SMP support for threads within the same process is still missing, let's disable the split page table lock for simplicity. Fixes: `1e4ee5135d` ("um: Add initial SMP support") Signed-off-by: Tiwei Bie <tiwei.btw@antgroup.com> Link: https://patch.msgid.link/20260302235224.1915380-2-tiwei.btw@antgroup.com Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2026-03-21 10:42:24 +01:00
Marcel W. Wysocki	4076f73298	um: fix address-of CMSG_DATA() rvalue in stub The UML stub takes the address of CMSG_DATA(fd_msg): fd_map = (void )&CMSG_DATA(fd_msg); CMSG_DATA() is specified by POSIX to return unsigned char . Taking its address is semantically wrong -- the intent is to get a pointer to the control message data, which is exactly what CMSG_DATA() already returns. This happens to compile with glibc because glibc's primary CMSG_DATA definition accesses a flexible array member: #define CMSG_DATA(cmsg) ((cmsg)->__cmsg_data) An array lvalue can have its address taken, and &array yields the same address as array. However, glibc also has an alternative definition that uses pointer arithmetic (returning an rvalue), and musl's definition always uses pointer arithmetic: /* musl / #define CMSG_DATA(cmsg) \ ((unsigned char )(((struct cmsghdr *)(cmsg)) + 1)) Taking the address of an rvalue is a hard error in C, so the current code fails to compile with musl libc. Remove the erroneous & operator. The resulting code is correct regardless of the CMSG_DATA implementation -- it simply assigns the data pointer, which is what the subsequent code (fd_map[--num_fds]) expects. No functional change with glibc; fixes the build with musl. Signed-off-by: Marcel W. Wysocki <maci.stgn@gmail.com> Link: https://patch.msgid.link/20260215142803.1455757-1-maci.stgn@gmail.com Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2026-03-21 10:41:44 +01:00
Nathan Chancellor	8678591b47	kbuild: Split .modinfo out from ELF_DETAILS Commit `3e86e4d74c` ("kbuild: keep .modinfo section in vmlinux.unstripped") added .modinfo to ELF_DETAILS while removing it from COMMON_DISCARDS, as it was needed in vmlinux.unstripped and ELF_DETAILS was present in all architecture specific vmlinux linker scripts. While this shuffle is fine for vmlinux, ELF_DETAILS and COMMON_DISCARDS may be used by other linker scripts, such as the s390 and x86 compressed boot images, which may not expect to have a .modinfo section. In certain circumstances, this could result in a bootloader failing to load the compressed kernel [1]. Commit `ddc6cbef3e` ("s390/boot/vmlinux.lds.S: Ensure bzImage ends with SecureBoot trailer") recently addressed this for the s390 bzImage but the same bug remains for arm, parisc, and x86. The presence of .modinfo in the x86 bzImage was the root cause of the issue worked around with commit `d50f210913` ("kbuild: align modinfo section for Secureboot Authenticode EDK2 compat"). misc.c in arch/x86/boot/compressed includes lib/decompress_unzstd.c, which in turn includes lib/xxhash.c and its MODULE_LICENSE / MODULE_DESCRIPTION macros due to the STATIC definition. Split .modinfo out from ELF_DETAILS into its own macro and handle it in all vmlinux linker scripts. Discard .modinfo in the places where it was previously being discarded from being in COMMON_DISCARDS, as it has never been necessary in those uses. Cc: stable@vger.kernel.org Fixes: `3e86e4d74c` ("kbuild: keep .modinfo section in vmlinux.unstripped") Reported-by: Ed W <lists@wildgooses.com> Closes: https://lore.kernel.org/587f25e0-a80e-46a5-9f01-87cb40cfa377@wildgooses.com/ [1] Tested-by: Ed W <lists@wildgooses.com> # x86_64 Link: https://patch.msgid.link/20260225-separate-modinfo-from-elf-details-v1-1-387ced6baf4b@kernel.org Signed-off-by: Nathan Chancellor <nathan@kernel.org>	2026-02-26 11:50:19 -07:00
Kees Cook	69050f8d6d	treewide: Replace kmalloc with kmalloc_obj for non-scalar types This is the result of running the Coccinelle script from scripts/coccinelle/api/kmalloc_objs.cocci. The script is designed to avoid scalar types (which need careful case-by-case checking), and instead replace kmalloc-family calls that allocate struct or union object instances: Single allocations: kmalloc(sizeof(TYPE), ...) are replaced with: kmalloc_obj(TYPE, ...) Array allocations: kmalloc_array(COUNT, sizeof(TYPE), ...) are replaced with: kmalloc_objs(TYPE, COUNT, ...) Flex array allocations: kmalloc(struct_size(PTR, FAM, COUNT), ...) are replaced with: kmalloc_flex(PTR, FAM, COUNT, ...) (where TYPE may also be VAR) The resulting allocations no longer return "void ", instead returning "TYPE ". Signed-off-by: Kees Cook <kees@kernel.org>	2026-02-21 01:02:28 -08:00
Linus Torvalds	4cff5c05e0	mm.git review status for linus..mm-stable Everything: Total patches: 325 Reviews/patch: 1.39 Reviewed rate: 72% Excluding DAMON: Total patches: 262 Reviews/patch: 1.63 Reviewed rate: 82% Excluding DAMON and zram: Total patches: 248 Reviews/patch: 1.72 Reviewed rate: 86% - The 14 patch series "powerpc/64s: do not re-activate batched TLB flush" from Alexander Gordeev makes arch_{enter\|leave}_lazy_mmu_mode() nest properly. It adds a generic enter/leave layer and switches architectures to use it. Various hacks were removed in the process. - The 7 patch series "zram: introduce compressed data writeback" from Richard Chang and Sergey Senozhatsky implements data compression for zram writeback. - The 8 patch series "mm: folio_zero_user: clear page ranges" from David Hildenbrand adds clearing of contiguous page ranges for hugepages. Large improvements during demand faulting are demonstrated. - The 2 patch series "memcg cleanups" from Chen Ridong tideis up some memcg code. - The 12 patch series "mm/damon: introduce {,max_}nr_snapshots and tracepoint for damos stats" from SeongJae Park improves DAMOS stat's provided information, deterministic control, and readability. - The 3 patch series "selftests/mm: hugetlb cgroup charging: robustness fixes" from Li Wang fixes a few issues in the hugetlb cgroup charging selftests. - The 5 patch series "Fix va_high_addr_switch.sh test failure - again" from Chunyu Hu addresses several issues in the va_high_addr_switch test. - The 5 patch series "mm/damon/tests/core-kunit: extend existing test scenarios" from Shu Anzai improves the KUnit test coverage for DAMON. - The 2 patch series "mm/khugepaged: fix dirty page handling for MADV_COLLAPSE" from Shivank Garg fixes a glitch in khugepaged which was causing madvise(MADV_COLLAPSE) to transiently return -EAGAIN. - The 29 patch series "arch, mm: consolidate hugetlb early reservation" from Mike Rapoport reworks and consolidates a pile of straggly code related to reservation of hugetlb memory from bootmem and creation of CMA areas for hugetlb. - The 9 patch series "mm: clean up anon_vma implementation" from Lorenzo Stoakes cleans up the anon_vma implementation in various ways. - The 3 patch series "tweaks for __alloc_pages_slowpath()" from Vlastimil Babka does a little streamlining of the page allocator's slowpath code. - The 8 patch series "memcg: separate private and public ID namespaces" from Shakeel Butt cleans up the memcg ID code and prevents the internal-only private IDs from being exposed to userspace. - The 6 patch series "mm: hugetlb: allocate frozen gigantic folio" from Kefeng Wang cleans up the allocation of frozen folios and avoids some atomic refcount operations. - The 11 patch series "mm/damon: advance DAMOS-based LRU sorting" from SeongJae Park improves DAMOS's movement of memory betewwn the active and inactive LRUs and adds auto-tuning of the ratio-based quotas and of monitoring intervals. - The 18 patch series "Support page table check on PowerPC" from Andrew Donnellan makes CONFIG_PAGE_TABLE_CHECK_ENFORCED work on powerpc. - The 3 patch series "nodemask: align nodes_and{,not} with underlying bitmap ops" from Yury Norov makes nodes_and() and nodes_andnot() propagate the return values from the underlying bit operations, enabling some cleanup in calling code. - The 5 patch series "mm/damon: hide kdamond and kdamond_lock from API callers" from SeongJae Park cleans up some DAMON internal interfaces. - The 4 patch series "mm/khugepaged: cleanups and scan limit fix" from Shivank Garg does some cleanup work in khupaged and fixes a scan limit accounting issue. - The 24 patch series "mm: balloon infrastructure cleanups" from David Hildenbrand goes to town on the balloon infrastructure and its page migration function. Mainly cleanups, also some locking simplification. - The 2 patch series "mm/vmscan: add tracepoint and reason for kswapd_failures reset" from Jiayuan Chen adds additional tracepoints to the page reclaim code. - The 3 patch series "Replace wq users and add WQ_PERCPU to alloc_workqueue() users" from Marco Crivellari is part of Marco's kernel-wide migration from the legacy workqueue APIs over to the preferred unbound workqueues. - The 9 patch series "Various mm kselftests improvements/fixes" from Kevin Brodsky provides various unrelated improvements/fixes for the mm kselftests. - The 5 patch series "mm: accelerate gigantic folio allocation" from Kefeng Wang greatly speeds up gigantic folio allocation, mainly by avoiding unnecessary work in pfn_range_valid_contig(). - The 5 patch series "selftests/damon: improve leak detection and wss estimation reliability" from SeongJae Park improves the reliability of two of the DAMON selftests. - The 8 patch series "mm/damon: cleanup kdamond, damon_call(), damos filter and DAMON_MIN_REGION" from SeongJae Park does some cleanup work in the core DAMON code. - The 8 patch series "Docs/mm/damon: update intro, modules, maintainer profile, and misc" from SeongJae Park performs maintenance work on the DAMON documentation. - The 10 patch series "mm: add and use vma_assert_stabilised() helper" from Lorenzo Stoakes refactors and cleans up the core VMA code. The main aim here is to be able to use the mmap write lock's lockdep state to perform various assertions regarding the locking which the VMA code requires. - The 19 patch series "mm, swap: swap table phase II: unify swapin use" from Kairui Song removes some old swap code (swap cache bypassing and swap synchronization) which wasn't working very well. Various other cleanups and simplifications were made. The end result is a 20% speedup in one benchmark. - The 8 patch series "enable PT_RECLAIM on more 64-bit architectures" from Qi Zheng makes PT_RECLAIM available on 64-bit alpha, loongarch, mips, parisc, um, Various cleanups were performed along the way. -----BEGIN PGP SIGNATURE----- iHUEABYKAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCaY1HfAAKCRDdBJ7gKXxA jqhZAP9H8ZlKKqCEgnr6U5XXmJ63Ep2FDQpl8p35yr9yVuU9+gEAgfyWiJ43l1fP rT0yjsUW3KQFBi/SEA3R6aYarmoIBgI= =+HLt -----END PGP SIGNATURE----- Merge tag 'mm-stable-2026-02-11-19-22' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull MM updates from Andrew Morton: - "powerpc/64s: do not re-activate batched TLB flush" makes arch_{enter\|leave}_lazy_mmu_mode() nest properly (Alexander Gordeev) It adds a generic enter/leave layer and switches architectures to use it. Various hacks were removed in the process. - "zram: introduce compressed data writeback" implements data compression for zram writeback (Richard Chang and Sergey Senozhatsky) - "mm: folio_zero_user: clear page ranges" adds clearing of contiguous page ranges for hugepages. Large improvements during demand faulting are demonstrated (David Hildenbrand) - "memcg cleanups" tidies up some memcg code (Chen Ridong) - "mm/damon: introduce {,max_}nr_snapshots and tracepoint for damos stats" improves DAMOS stat's provided information, deterministic control, and readability (SeongJae Park) - "selftests/mm: hugetlb cgroup charging: robustness fixes" fixes a few issues in the hugetlb cgroup charging selftests (Li Wang) - "Fix va_high_addr_switch.sh test failure - again" addresses several issues in the va_high_addr_switch test (Chunyu Hu) - "mm/damon/tests/core-kunit: extend existing test scenarios" improves the KUnit test coverage for DAMON (Shu Anzai) - "mm/khugepaged: fix dirty page handling for MADV_COLLAPSE" fixes a glitch in khugepaged which was causing madvise(MADV_COLLAPSE) to transiently return -EAGAIN (Shivank Garg) - "arch, mm: consolidate hugetlb early reservation" reworks and consolidates a pile of straggly code related to reservation of hugetlb memory from bootmem and creation of CMA areas for hugetlb (Mike Rapoport) - "mm: clean up anon_vma implementation" cleans up the anon_vma implementation in various ways (Lorenzo Stoakes) - "tweaks for __alloc_pages_slowpath()" does a little streamlining of the page allocator's slowpath code (Vlastimil Babka) - "memcg: separate private and public ID namespaces" cleans up the memcg ID code and prevents the internal-only private IDs from being exposed to userspace (Shakeel Butt) - "mm: hugetlb: allocate frozen gigantic folio" cleans up the allocation of frozen folios and avoids some atomic refcount operations (Kefeng Wang) - "mm/damon: advance DAMOS-based LRU sorting" improves DAMOS's movement of memory betewwn the active and inactive LRUs and adds auto-tuning of the ratio-based quotas and of monitoring intervals (SeongJae Park) - "Support page table check on PowerPC" makes CONFIG_PAGE_TABLE_CHECK_ENFORCED work on powerpc (Andrew Donnellan) - "nodemask: align nodes_and{,not} with underlying bitmap ops" makes nodes_and() and nodes_andnot() propagate the return values from the underlying bit operations, enabling some cleanup in calling code (Yury Norov) - "mm/damon: hide kdamond and kdamond_lock from API callers" cleans up some DAMON internal interfaces (SeongJae Park) - "mm/khugepaged: cleanups and scan limit fix" does some cleanup work in khupaged and fixes a scan limit accounting issue (Shivank Garg) - "mm: balloon infrastructure cleanups" goes to town on the balloon infrastructure and its page migration function. Mainly cleanups, also some locking simplification (David Hildenbrand) - "mm/vmscan: add tracepoint and reason for kswapd_failures reset" adds additional tracepoints to the page reclaim code (Jiayuan Chen) - "Replace wq users and add WQ_PERCPU to alloc_workqueue() users" is part of Marco's kernel-wide migration from the legacy workqueue APIs over to the preferred unbound workqueues (Marco Crivellari) - "Various mm kselftests improvements/fixes" provides various unrelated improvements/fixes for the mm kselftests (Kevin Brodsky) - "mm: accelerate gigantic folio allocation" greatly speeds up gigantic folio allocation, mainly by avoiding unnecessary work in pfn_range_valid_contig() (Kefeng Wang) - "selftests/damon: improve leak detection and wss estimation reliability" improves the reliability of two of the DAMON selftests (SeongJae Park) - "mm/damon: cleanup kdamond, damon_call(), damos filter and DAMON_MIN_REGION" does some cleanup work in the core DAMON code (SeongJae Park) - "Docs/mm/damon: update intro, modules, maintainer profile, and misc" performs maintenance work on the DAMON documentation (SeongJae Park) - "mm: add and use vma_assert_stabilised() helper" refactors and cleans up the core VMA code. The main aim here is to be able to use the mmap write lock's lockdep state to perform various assertions regarding the locking which the VMA code requires (Lorenzo Stoakes) - "mm, swap: swap table phase II: unify swapin use" removes some old swap code (swap cache bypassing and swap synchronization) which wasn't working very well. Various other cleanups and simplifications were made. The end result is a 20% speedup in one benchmark (Kairui Song) - "enable PT_RECLAIM on more 64-bit architectures" makes PT_RECLAIM available on 64-bit alpha, loongarch, mips, parisc, and um. Various cleanups were performed along the way (Qi Zheng) * tag 'mm-stable-2026-02-11-19-22' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (325 commits) mm/memory: handle non-split locks correctly in zap_empty_pte_table() mm: move pte table reclaim code to memory.c mm: make PT_RECLAIM depends on MMU_GATHER_RCU_TABLE_FREE mm: convert __HAVE_ARCH_TLB_REMOVE_TABLE to CONFIG_HAVE_ARCH_TLB_REMOVE_TABLE config um: mm: enable MMU_GATHER_RCU_TABLE_FREE parisc: mm: enable MMU_GATHER_RCU_TABLE_FREE mips: mm: enable MMU_GATHER_RCU_TABLE_FREE LoongArch: mm: enable MMU_GATHER_RCU_TABLE_FREE alpha: mm: enable MMU_GATHER_RCU_TABLE_FREE mm: change mm/pt_reclaim.c to use asm/tlb.h instead of asm-generic/tlb.h mm/damon/stat: remove __read_mostly from memory_idle_ms_percentiles zsmalloc: make common caches global mm: add SPDX id lines to some mm source files mm/zswap: use %pe to print error pointers mm/vmscan: use %pe to print error pointers mm/readahead: fix typo in comment mm: khugepaged: fix NR_FILE_PAGES and NR_SHMEM in collapse_file() mm: refactor vma_map_pages to use vm_insert_pages mm/damon: unify address range representation with damon_addr_range mm/cma: replace snprintf with strscpy in cma_new_area ...	2026-02-12 11:32:37 -08:00
Mike Rapoport (Microsoft)	d49004c5f0	arch, mm: consolidate initialization of nodes, zones and memory map To initialize node, zone and memory map data structures every architecture calls free_area_init() during setup_arch() and passes it an array of zone limits. Beside code duplication it creates "interesting" ordering cases between allocation and initialization of hugetlb and the memory map. Some architectures allocate hugetlb pages very early in setup_arch() in certain cases, some only create hugetlb CMA areas in setup_arch() and sometimes hugetlb allocations happen mm_core_init(). With arch_zone_limits_init() helper available now on all architectures it is no longer necessary to call free_area_init() from architecture setup code. Rather core MM initialization can call arch_zone_limits_init() in a single place. This allows to unify ordering of hugetlb vs memory map allocation and initialization. Remove the call to free_area_init() from architecture specific code and place it in a new mm_core_init_early() function that is called immediately after setup_arch(). After this refactoring it is possible to consolidate hugetlb allocations and eliminate differences in ordering of hugetlb and memory map initialization among different architectures. As the first step of this consolidation move hugetlb_bootmem_alloc() to mm_core_early_init(). Link: https://lkml.kernel.org/r/20260111082105.290734-24-rppt@kernel.org Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org> Cc: Alexander Gordeev <agordeev@linux.ibm.com> Cc: Alex Shi <alexs@kernel.org> Cc: Andreas Larsson <andreas@gaisler.com> Cc: "Borislav Petkov (AMD)" <bp@alien8.de> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: David Hildenbrand <david@kernel.org> Cc: David S. Miller <davem@davemloft.net> Cc: Dinh Nguyen <dinguyen@kernel.org> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Guo Ren <guoren@kernel.org> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Helge Deller <deller@gmx.de> Cc: Huacai Chen <chenhuacai@kernel.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Johannes Berg <johannes@sipsolutions.net> Cc: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Klara Modin <klarasmodin@gmail.com> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Cc: Magnus Lindholm <linmag7@gmail.com> Cc: Matt Turner <mattst88@gmail.com> Cc: Max Filippov <jcmvbkbc@gmail.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michal Hocko <mhocko@suse.com> Cc: Michal Simek <monstr@monstr.eu> Cc: Muchun Song <muchun.song@linux.dev> Cc: Oscar Salvador <osalvador@suse.de> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Pratyush Yadav <pratyush@kernel.org> Cc: Richard Weinberger <richard@nod.at> Cc: "Ritesh Harjani (IBM)" <ritesh.list@gmail.com> Cc: Russell King <linux@armlinux.org.uk> Cc: Stafford Horne <shorne@gmail.com> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Vineet Gupta <vgupta@kernel.org> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Will Deacon <will@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2026-01-26 20:02:18 -08:00
Mike Rapoport (Microsoft)	531de7f02d	um: introduce arch_zone_limits_init() Move calculations of zone limits to a dedicated arch_zone_limits_init() function. Later MM core will use this function as an architecture specific callback during nodes and zones initialization and thus there won't be a need to call free_area_init() from every architecture. Link: https://lkml.kernel.org/r/20260111082105.290734-21-rppt@kernel.org Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org> Cc: Alexander Gordeev <agordeev@linux.ibm.com> Cc: Alex Shi <alexs@kernel.org> Cc: Andreas Larsson <andreas@gaisler.com> Cc: "Borislav Petkov (AMD)" <bp@alien8.de> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: David Hildenbrand <david@kernel.org> Cc: David S. Miller <davem@davemloft.net> Cc: Dinh Nguyen <dinguyen@kernel.org> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Guo Ren <guoren@kernel.org> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Helge Deller <deller@gmx.de> Cc: Huacai Chen <chenhuacai@kernel.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Johannes Berg <johannes@sipsolutions.net> Cc: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Klara Modin <klarasmodin@gmail.com> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Cc: Magnus Lindholm <linmag7@gmail.com> Cc: Matt Turner <mattst88@gmail.com> Cc: Max Filippov <jcmvbkbc@gmail.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michal Hocko <mhocko@suse.com> Cc: Michal Simek <monstr@monstr.eu> Cc: Muchun Song <muchun.song@linux.dev> Cc: Oscar Salvador <osalvador@suse.de> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Pratyush Yadav <pratyush@kernel.org> Cc: Richard Weinberger <richard@nod.at> Cc: "Ritesh Harjani (IBM)" <ritesh.list@gmail.com> Cc: Russell King <linux@armlinux.org.uk> Cc: Stafford Horne <shorne@gmail.com> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Vineet Gupta <vgupta@kernel.org> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Will Deacon <will@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2026-01-26 20:02:17 -08:00
Marco Elver	4f109baeea	um: Fix incorrect __acquires/__releases annotations With Clang's context analysis, the compiler is a bit more strict about what goes into the __acquires/__releases annotations and can't refer to non-existent variables. On an UM build, mm_id.h is transitively included into mm_types.h, and we can observe the following error (if context analysis is enabled in e.g. stackdepot.c): In file included from lib/stackdepot.c:17: In file included from include/linux/debugfs.h:15: In file included from include/linux/fs.h:5: In file included from include/linux/fs/super.h:5: In file included from include/linux/fs/super_types.h:7: In file included from include/linux/list_lru.h:14: In file included from include/linux/xarray.h:16: In file included from include/linux/gfp.h:7: In file included from include/linux/mmzone.h:22: In file included from include/linux/mm_types.h:26: In file included from arch/um/include/asm/mmu.h:12: >> arch/um/include/shared/skas/mm_id.h:24:54: error: use of undeclared identifier 'turnstile' 24 \| void enter_turnstile(struct mm_id mm_id) __acquires(turnstile); \| ^~~~~~~~~ arch/um/include/shared/skas/mm_id.h:25:53: error: use of undeclared identifier 'turnstile' 25 \| void exit_turnstile(struct mm_id mm_id) __releases(turnstile); \| ^~~~~~~~~ One (discarded) option was to use token_context_lock(turnstile) to just define a token with the already used name, but that would not allow the compiler to distinguish between different mm_id-dependent instances. Another constraint is that struct mm_id is only declared and incomplete in the header, so even if we tried to construct an expression to get to the mutex instance, this would fail (including more headers transitively everywhere should also be avoided). Instead, just declare an mm_id-dependent helper to return the mutex, and use the mm_id-dependent call expression in the __acquires/__releases attributes; the compiler will consider the identity of the mutex to be the call expression. Then using __get_turnstile() in the lock/unlock wrappers (with context analysis enabled for mmu.c) the compiler will be able to verify the implementation of the wrappers as-is. We leave context analysis disabled in arch/um/kernel/skas/ for now. This change is a preparatory change to allow enabling context analysis in subsystems that include any of the above headers. No functional change intended. Closes: https://lore.kernel.org/oe-kbuild-all/202512171220.vHlvhpCr-lkp@intel.com/ Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Marco Elver <elver@google.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://patch.msgid.link/20251219154418.3592607-23-elver@google.com	2026-01-05 16:43:32 +01:00
Linus Torvalds	08b8ddac1f	Address various objtool scalability bugs/inefficiencies exposed by allmodconfig builds, plus improve the quality of alternatives instructions generated code and disassembly. Signed-off-by: Ingo Molnar <mingo@kernel.org> -----BEGIN PGP SIGNATURE----- iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAmk0FVoRHG1pbmdvQGtl cm5lbC5vcmcACgkQEnMQ0APhK1iClg//dHY58dvrsp5Fzo10XgU99/kwzNEgl2b5 SMrSEbliTrehdpG4vBvig9tAMZurxOVIf6yDBEtV45XfD6w3tw6EFYpO1was9wTE R/80Ze6BAEeao782xN3sCpakU1Ogwbxhe4jYFZKE/WVbP9ZaeCI8qeBj3RAuOQ9y PCJzjD5fl9c2cAGDqCJEswxIptpP7eXoBo/V3Txf46M8/ffFcXdJbHN3HRBlszVs 5I9Wb2/vFmwJ4Yi4EO8H7KfzwaXA8wW/MJSDcM24P2/+o5iTqSLNd+rADFMW3XF2 /8b3uAy/6A6tT3ek1teNoM7qB9hRpM1pmpFwgjjTkjl8yamEp6P/W99qUN+UmfV+ NTiW9sz7ShhVTMCdALIljyjmji318crKYQBDulAHuEACpodcBg/GUGfuUcrjSRB/ C7PLatOpfMCODPRGPH4+8Wg8nnBGvOEjjODZBjAq2yU5aJnBeLPmbK2mtcaJtKi+ R0T2LIsNgmnEa4wRZbH8i4jXsgcbe6gD45Tx3qZpss7D4d9IyRWPO8v6GegFUpvh dw8qBqhgi1FzryZ/5uwh5IzkVq+iXHqkPBsV9w7CVSFF1Kc5w1/l7MXsEjkc7Xe3 qMjc43qsN0H/7ngoIA7yp4m7q87gqJMzReIfeIF4pGVtoULGQ+drN0jjQE/SHiKS /EM8IAAk0pU= =2DKc -----END PGP SIGNATURE----- Merge tag 'objtool-urgent-2025-12-06' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull objtool fixes from Ingo Molnar: "Address various objtool scalability bugs/inefficiencies exposed by allmodconfig builds, plus improve the quality of alternatives instructions generated code and disassembly" * tag 'objtool-urgent-2025-12-06' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: objtool: Simplify .annotate_insn code generation output some more objtool: Add more robust signal error handling, detect and warn about stack overflows objtool: Remove newlines and tabs from annotation macros objtool: Consolidate annotation macros x86/asm: Remove ANNOTATE_DATA_SPECIAL usage x86/alternative: Remove ANNOTATE_DATA_SPECIAL usage objtool: Fix stack overflow in validate_branch()	2025-12-06 11:56:51 -08:00
Linus Torvalds	399ead3a6d	Apart from the usual small churn, we have - initial SMP support (only kernel) - major vDSO cleanups (and fixes for 32-bit) -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEEpeA8sTs3M8SN2hR410qiO8sPaAAFAmkyoG8ACgkQ10qiO8sP aAAmVxAAgxi7ZWBgmQS5+CAtlehy81Nen09rcPcwMEDwWoNA6/ePsroqPijpgBGx Ayc0IAbGHs/LN1UZu1HevTkD5ydg6nw3aQzv51Yu+A27MLGUnwPHyYc/rggWr9Zc SlqF5iRp4Wp52M7HqAHv1UzoQdDYtVgKbpSaAV07KwJNTSIWAIhn464MfIUfSh92 AsX6+o8jhns7L7Bx99Tfb9MiPDFQzXRmkLmE56SCpgYC1cXNFRgVKCaNz9w27OH7 hYlFJ1QLg0wmli0K0yZVG+Vh5mtgnulw5oM4ZZKwtnmjIgtBvT59jAXi7dWiJ581 Yt7KqlJE0Fp9XwIgAMVlwZsa3PUJ5A4QGtM9lxcgQf/DNpiiO5CEiiHqYhRS0nDM GAF9lpMjArJt5lTp6jO6gqe8ykoCl2cYbP+Pyad4D2SgIZZtajmgOzugXBNEB/xj LM4a8s8a89lRbTsOKylrRlMz5AIbjcwgKaXSvIauNc5b40kdrOS2fY+Z9IfebQL7 0/WywZO6VHibdY7iJTmTvktQ9LClpM93GqCcNi7W/8zn1awdPhDS+8SxJXUZcH1y eUELEl8mfc60/dszFuKwdI+BGJUDURBUlW4SwLcr/PDp/6FcqRCnIdsYtkuKHhbG driH3E4JuMM6HAjJVz7JxqSXXaIVvj2wZbRjF4Xqg4I9lmN7cbY= =CNVA -----END PGP SIGNATURE----- Merge tag 'uml-for-linux-6.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/uml/linux Pull UML updates from Johannes Berg: "Apart from the usual small churn, we have - initial SMP support (only kernel) - major vDSO cleanups (and fixes for 32-bit)" * tag 'uml-for-linux-6.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/uml/linux: (33 commits) um: Disable KASAN_INLINE when STATIC_LINK is selected um: Don't rename vmap to kernel_vmap um: drivers: virtio: use string choices helper um: Always set up AT_HWCAP and AT_PLATFORM x86/um: Remove FIXADDR_USER_START and FIXADDR_USE_END um: Remove __access_ok_vsyscall() um: Remove redundant range check from __access_ok_vsyscall() um: Remove fixaddr_user_init() x86/um: Drop gate area handling x86/um: Do not inherit vDSO from host um: Split out default elf_aux_hwcap x86/um: Move ELF_PLATFORM fallback to x86-specific code um: Split out default elf_aux_platform um: Avoid circular dependency on asm-offsets in pgtable.h um: Enable SMP support on x86 asm-generic: percpu: Add assembly guard um: vdso: Remove getcpu support on x86 um: Add initial SMP support um: Define timers on a per-CPU basis um: Determine sleep based on need_resched() ...	2025-12-05 16:30:56 -08:00
Linus Torvalds	4d38b88fd1	printk changes for 6.19 -----BEGIN PGP SIGNATURE----- iQJPBAABCAA5FiEESH4wyp42V4tXvYsjUqAMR0iAlPIFAmktlbUbFIAAAAAABAAO bWFudTIsMi41KzEuMTEsMiwyAAoJEFKgDEdIgJTyevsP/1z98/wfCaSCquIq4H8S OTqFGybGgYQt1NmMj2cGPpbAE3LJNYORT0A4tcoqOTy1Z5xbQz63rO3clSI/e7Mf n4ZZ7NvkE40i8et1BjqtZa9dSkAv4QLYH73KrtNeuTr5tqvHo1x8FakUH6gQnb1k QOOebvbVXnOb+rh89j1GZShrLFcCil0psjp165WHAYE/3PyFBgYGLMCgwLqS+W3H re5Q4sl/ySXpMFF/XN1Kww48FWxy/h+YQFCxZwuWlUcXtVjqZ+BN+keb7AqaFQ7R dC2exV2W0RBoupEJR/FWHoXrm/bDDLhzqRaMvoggLJrMJ9L6V0WdIhaFA4qzoG63 paJGFjUfmDX3dpPsAddq7kKeevCz4a2/HwFKhiBqqq4tdHuely7wZgnoFO7ovgmu DYDCXHtpJuWZR3WJ5I/V/sJ9i9KFXhhyWcKVf13QTAFiCaA09aeSAcUWNYNaaxbn nu6IkUxdIVnWIEBgcYH6jz1DrPGreYLYuD4bVb2gdZoP0r3tnMpG6xfSNIUueSGd VFAKW9PJYaj7Id+jgACH6V+gQ22L600xJDdL1bPjRbGE0LD7vlz2F1MZTq3BFJFn hUxJeOZplHX+TPophdvH4MO9VLmydWLUyJiDBP1yA8M9XZms/5s7IJJ1RYXqUCcf qEB4L7W1+Qy1R/lzf2PU9X4R =FnfO -----END PGP SIGNATURE----- Merge tag 'printk-for-6.19' of git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux Pull printk updates from Petr Mladek: - Allow creaing nbcon console drivers with an unsafe write_atomic() callback that can only be called by the final nbcon_atomic_flush_unsafe(). Otherwise, the driver would rely on the kthread. It is going to be used as the-best-effort approach for an experimental nbcon netconsole driver, see https://lore.kernel.org/r/20251121-nbcon-v1-2-503d17b2b4af@debian.org Note that a safe .write_atomic() callback is supposed to work in NMI context. But some networking drivers are not safe even in IRQ context: https://lore.kernel.org/r/oc46gdpmmlly5o44obvmoatfqo5bhpgv7pabpvb6sjuqioymcg@gjsma3ghoz35 In an ideal world, all networking drivers would be fixed first and the atomic flush would be blocked only in NMI context. But it brings the question how reliable networking drivers are when the system is in a bad state. They might block flushing more reliable serial consoles which are more suitable for serious debugging anyway. - Allow to use the last 4 bytes of the printk ring buffer. - Prevent queuing IRQ work and block printk kthreads when consoles are suspended. Otherwise, they create non-necessary churn or even block the suspend. - Release console_lock() between each record in the kthread used for legacy consoles on RT. It might significantly speed up the boot. - Release nbcon context between each record in the atomic flush. It prevents stalls of the related printk kthread after it has lost the ownership in the middle of a record - Add support for NBCON consoles into KDB - Add %ptsP modifier for printing struct timespec64 and use it where possible - Misc code clean up * tag 'printk-for-6.19' of git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux: (48 commits) printk: Use console_is_usable on console_unblank arch: um: kmsg_dump: Use console_is_usable drivers: serial: kgdboc: Drop checks for CON_ENABLED and CON_BOOT lib/vsprintf: Unify FORMAT_STATE_NUM handlers printk: Avoid irq_work for printk_deferred() on suspend printk: Avoid scheduling irq_work on suspend printk: Allow printk_trigger_flush() to flush all types tracing: Switch to use %ptSp scsi: snic: Switch to use %ptSp scsi: fnic: Switch to use %ptSp s390/dasd: Switch to use %ptSp ptp: ocp: Switch to use %ptSp pps: Switch to use %ptSp PCI: epf-test: Switch to use %ptSp net: dsa: sja1105: Switch to use %ptSp mmc: mmc_test: Switch to use %ptSp media: av7110: Switch to use %ptSp ipmi: Switch to use %ptSp igb: Switch to use %ptSp e1000e: Switch to use %ptSp ...	2025-12-03 12:42:36 -08:00
Marcos Paulo de Souza	4c70ab110b	arch: um: kmsg_dump: Use console_is_usable All consoles found on for_each_console are registered, meaning that all of them have the CON_ENABLED flag set. Since NBCON was introduced it's important to check if a given console also implements the NBCON callbacks. The function console_is_usable does exactly that. Signed-off-by: Marcos Paulo de Souza <mpdesouza@suse.com> Reviewed-by: Petr Mladek <pmladek@suse.com> Link: https://patch.msgid.link/20251121-printk-cleanup-part2-v2-2-57b8b78647f4@suse.com Signed-off-by: Petr Mladek <pmladek@suse.com>	2025-11-27 15:54:50 +01:00
Thomas Weißschuh	78fdfc9fc4	um: Remove fixaddr_user_init() With the removal of the vDSO passthrough from the host, FIXADDR_USER_START is always 0 and fixaddr_user_init() is dead code. Remove it. Signed-off-by: Thomas Weißschuh <linux@weissschuh.net> Link: https://patch.msgid.link/20251028-uml-remove-32bit-pseudo-vdso-v1-6-e930063eff5f@weissschuh.net Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2025-11-06 13:02:33 +01:00
Tiwei Bie	1e4ee5135d	um: Add initial SMP support Add initial symmetric multi-processing (SMP) support to UML. With this support enabled, users can tell UML to start multiple virtual processors, each represented as a separate host thread. In UML, kthreads and normal threads (when running in kernel mode) can be scheduled and executed simultaneously on different virtual processors. However, the userspace code of normal threads still runs within their respective single-threaded stubs. That is, SMP support is currently available both within the kernel and across different processes, but still remains limited within threads of the same process in userspace. Signed-off-by: Tiwei Bie <tiwei.btw@antgroup.com> Link: https://patch.msgid.link/20251027001815.1666872-6-tiwei.bie@linux.dev Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2025-10-27 16:41:15 +01:00
Tiwei Bie	9c82de55d4	um: Define timers on a per-CPU basis Define timers on a per-CPU basis to enable each CPU to have its own timer. This is a preparation for adding SMP support. Signed-off-by: Tiwei Bie <tiwei.btw@antgroup.com> Link: https://patch.msgid.link/20251027001815.1666872-5-tiwei.bie@linux.dev Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2025-10-27 16:41:15 +01:00
Tiwei Bie	2670917c2f	um: Determine sleep based on need_resched() With SMP and NO_HZ enabled, the CPU may still need to sleep even if the timer is disarmed. Switch to deciding whether to sleep based on pending resched. Additionally, because disabling IRQs does not block SIGALRM, it is also necessary to check for any pending timer alarms. This is a preparation for adding SMP support. Signed-off-by: Tiwei Bie <tiwei.btw@antgroup.com> Link: https://patch.msgid.link/20251027001815.1666872-4-tiwei.bie@linux.dev Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2025-10-27 16:41:15 +01:00
Tiwei Bie	9e5a9f1c9b	um: Turn signals_* into thread-local variables Turn signals_enabled, signals_pending and signals_active into thread-local variables. This enables us to control and track signals independently on each CPU thread. This is a preparation for adding SMP support. Signed-off-by: Tiwei Bie <tiwei.btw@antgroup.com> Link: https://patch.msgid.link/20251027001815.1666872-3-tiwei.bie@linux.dev Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2025-10-27 16:41:15 +01:00
Tiwei Bie	6aaf00d14e	um: Do not disable kmalloc in initial_thread_cb() Currently, initial_thread_cb() temporarily disables kmalloc when it invokes the callback, allowing the callback to bypass kmalloc. This is unnecessary for the current users of initial_thread_cb(), and we should avoid memory allocations that are not under the control of the UML kernel. Therefore, let's stop temporarily disabling kmalloc in initial_thread_cb(). Signed-off-by: Tiwei Bie <tiwei.btw@antgroup.com> Link: https://patch.msgid.link/20251027001815.1666872-2-tiwei.bie@linux.dev Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2025-10-27 16:41:14 +01:00
Tiwei Bie	a7f7dbae94	um: Remove file-based iomem emulation support The file-based iomem emulation was introduced to support writing paravirtualized drivers based on emulated iomem regions. However, the only driver that makes use of it is an example driver called mmapper, which was written over two decades ago. We now have several modern device emulation mechanisms, such as vhost-user-based virtio-uml. Remove the file-based iomem emulation support to reduce the maintenance burden. Signed-off-by: Tiwei Bie <tiwei.btw@antgroup.com> Link: https://patch.msgid.link/20251027054519.1996090-5-tiwei.bie@linux.dev Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2025-10-27 16:37:12 +01:00
Tiwei Bie	9c84022c1d	um: Replace UML_ROUND_UP() with PAGE_ALIGN() Although UML_ROUND_UP() is defined in a shared header file, it depends on the PAGE_SIZE and PAGE_MASK macros, so it can only be used in kernel code. Considering its name is not very clear and its functionality is the same as PAGE_ALIGN(), replace its usages with a direct call to PAGE_ALIGN() and remove it. Signed-off-by: Tiwei Bie <tiwei.btw@antgroup.com> Link: https://patch.msgid.link/20251027054519.1996090-4-tiwei.bie@linux.dev Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2025-10-27 16:37:12 +01:00
Tiwei Bie	de20326748	um: Use PAGE_ALIGN() for address alignment Use PAGE_ALIGN() instead of open-coded calculations. Signed-off-by: Tiwei Bie <tiwei.btw@antgroup.com> Link: https://patch.msgid.link/20251027054519.1996090-3-tiwei.bie@linux.dev Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2025-10-27 16:37:12 +01:00
Tiwei Bie	691ff59148	um: Make host_task_size a local variable Currently, host_task_size is a global variable, but it is only used in linux_main() to compute stub_start and task_size. Make it a local variable to limit its scope to where it is actually needed. Signed-off-by: Tiwei Bie <tiwei.btw@antgroup.com> Link: https://patch.msgid.link/20251027054519.1996090-2-tiwei.bie@linux.dev Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2025-10-27 16:37:12 +01:00
Johannes Berg	6e3fc802ab	um: move asm-offsets generation into a single file There's nothing subarch dependent here, and it's odd that includes need to be done in the subarch, and then entries defined in the common file. Simplify the whole thing from three files into one. Link: https://patch.msgid.link/20251007071452.367989-4-johannes@sipsolutions.net Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2025-10-27 15:07:44 +01:00
Johannes Berg	7b5d441696	um: init cpu_tasks[] earlier This is currently done in uml_finishsetup(), but e.g. with KCOV enabled we'll crash because some init code can call into e.g. memparse(), which has coverage annotations, and then the checks in check_kcov_mode() crash because current is NULL. Simply initialize the cpu_tasks[] array statically, which fixes the crash. For the later SMP work, it seems to have not really caused any problems yet, but initialize all of the entries anyway. Link: https://patch.msgid.link/20250924113214.c76cd74d0583.I974f691ebb1a2b47915bd2b04cc38e5263b9447f@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2025-10-27 15:05:50 +01:00
Linus Torvalds	fc282d1731	updates for UML, notably - minor preparations for SMP support - SPARSE_IRQ support for kunit - help output cleanups -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEEpeA8sTs3M8SN2hR410qiO8sPaAAFAmjjjKQACgkQ10qiO8sP aAA/Sw//SVjKUs6SVnaglxPQl/qRksULuakpejp2n6fdWL6EC25W6w1Zpl+iV0Yx asMYb6u9cZQxk8bBYXg3luXQF9YdE7MorcaOBQKNxJX5PZheekexTjvV2An0RcM8 t/g4hQXMFN7xv/k7KQ+DTw9l2jEKK21MltsOvB6VXWyBQ+cnvR66nVzzootH9vNR GELDhd/OFx8M9XMumTNLLHw4GsdLbtUKTFOxnjnBfJf4K0xa/O2wInE0/zrpnsZ4 M6Aj73nxeZ5peKIKmqgzxHcJiwjVx2jwqzwjE2EVmxlUNZjGZ9asTGMnmYHwd7YF 6Z+SrI6vQhBBC4SrDU2NEJxyNg876zjt+ua+D71IRjd0wQhms5nWJ1kDNmpQwWbh r6RkbddPlpPASQne30EXXZKwOsYQc2IT3Yf1GkOa+2RzNu0WIhQUkEfi8J+n8Jle ERPDdP9dKHuuPHmZ/CvjLm+PEIVkUKlq0R8+uiWFG8AbBjdyGlue5C4kixfhoZ6I yuiO/GMRotNaG6Nlr5rfpuWSNSWjY5kNpMFsfV+kVUWPCJavk/MXFFiIZfS4SI12 0GzIE7k1rwKo52EsV4gMnRtuvPioJ/mz335RfxT9ZOfaed9hw/Cl3d8aIw7WuNXz CFwHvSVe51fMaLJCLCiWKpBPcDioMymI72znWLqgHwqDvv9GJSM= =lpSi -----END PGP SIGNATURE----- Merge tag 'uml-for-linux-6.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/uml/linux Pull uml updates from Johannes Berg: - minor preparations for SMP support - SPARSE_IRQ support for kunit - help output cleanups * tag 'uml-for-linux-6.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/uml/linux: um: Remove unused ipi_pipe field from cpuinfo_um um: Remove unused cpu_data and current_cpu_data macros um: Stop tracking virtual CPUs via mm_cpumask() um: Centralize stub size calculations um: Remove outdated comment about STUB_DATA_PAGES um: Remove unused offset and child_err fields from stub_data um: Indent time-travel help messages um: Fix help message for ssl-non-raw um: vector: Fix indentation for help message um: Add missing trailing newline to help messages um: virtio-pci: implement .shutdown() um: Support SPARSE_IRQ	2025-10-06 12:10:55 -07:00
Linus Torvalds	8804d970fa	Summary of significant series in this pull request: - The 3 patch series "mm, swap: improve cluster scan strategy" from Kairui Song improves performance and reduces the failure rate of swap cluster allocation. - The 4 patch series "support large align and nid in Rust allocators" from Vitaly Wool permits Rust allocators to set NUMA node and large alignment when perforning slub and vmalloc reallocs. - The 2 patch series "mm/damon/vaddr: support stat-purpose DAMOS" from Yueyang Pan extend DAMOS_STAT's handling of the DAMON operations sets for virtual address spaces for ops-level DAMOS filters. - The 3 patch series "execute PROCMAP_QUERY ioctl under per-vma lock" from Suren Baghdasaryan reduces mmap_lock contention during reads of /proc/pid/maps. - The 2 patch series "mm/mincore: minor clean up for swap cache checking" from Kairui Song performs some cleanup in the swap code. - The 11 patch series "mm: vm_normal_page() improvements" from David Hildenbrand provides code cleanup in the pagemap code. - The 5 patch series "add persistent huge zero folio support" from Pankaj Raghav provides a block layer speedup by optionalls making the huge_zero_pagepersistent, instead of releasing it when its refcount falls to zero. - The 3 patch series "kho: fixes and cleanups" from Mike Rapoport adds a few touchups to the recently added Kexec Handover feature. - The 10 patch series "mm: make mm->flags a bitmap and 64-bit on all arches" from Lorenzo Stoakes turns mm_struct.flags into a bitmap. To end the constant struggle with space shortage on 32-bit conflicting with 64-bit's needs. - The 2 patch series "mm/swapfile.c and swap.h cleanup" from Chris Li cleans up some swap code. - The 7 patch series "selftests/mm: Fix false positives and skip unsupported tests" from Donet Tom fixes a few things in our selftests code. - The 7 patch series "prctl: extend PR_SET_THP_DISABLE to only provide THPs when advised" from David Hildenbrand "allows individual processes to opt-out of THP=always into THP=madvise, without affecting other workloads on the system". It's a long story - the [1/N] changelog spells out the considerations. - The 11 patch series "Add and use memdesc_flags_t" from Matthew Wilcox gets us started on the memdesc project. Please see https://kernelnewbies.org/MatthewWilcox/Memdescs and https://blogs.oracle.com/linux/post/introducing-memdesc. - The 3 patch series "Tiny optimization for large read operations" from Chi Zhiling improves the efficiency of the pagecache read path. - The 5 patch series "Better split_huge_page_test result check" from Zi Yan improves our folio splitting selftest code. - The 2 patch series "test that rmap behaves as expected" from Wei Yang adds some rmap selftests. - The 3 patch series "remove write_cache_pages()" from Christoph Hellwig removes that function and converts its two remaining callers. - The 2 patch series "selftests/mm: uffd-stress fixes" from Dev Jain fixes some UFFD selftests issues. - The 3 patch series "introduce kernel file mapped folios" from Boris Burkov introduces the concept of "kernel file pages". Using these permits btrfs to account its metadata pages to the root cgroup, rather than to the cgroups of random inappropriate tasks. - The 2 patch series "mm/pageblock: improve readability of some pageblock handling" from Wei Yang provides some readability improvements to the page allocator code. - The 11 patch series "mm/damon: support ARM32 with LPAE" from SeongJae Park teaches DAMON to understand arm32 highmem. - The 4 patch series "tools: testing: Use existing atomic.h for vma/maple tests" from Brendan Jackman performs some code cleanups and deduplication under tools/testing/. - The 2 patch series "maple_tree: Fix testing for 32bit compiles" from Liam Howlett fixes a couple of 32-bit issues in tools/testing/radix-tree.c. - The 2 patch series "kasan: unify kasan_enabled() and remove arch-specific implementations" from Sabyrzhan Tasbolatov moves KASAN arch-specific initialization code into a common arch-neutral implementation. - The 3 patch series "mm: remove zpool" from Johannes Weiner removes zspool - an indirection layer which now only redirects to a single thing (zsmalloc). - The 2 patch series "mm: task_stack: Stack handling cleanups" from Pasha Tatashin makes a couple of cleanups in the fork code. - The 37 patch series "mm: remove nth_page()" from David Hildenbrand makes rather a lot of adjustments at various nth_page() callsites, eventually permitting the removal of that undesirable helper function. - The 2 patch series "introduce kasan.write_only option in hw-tags" from Yeoreum Yun creates a KASAN read-only mode for ARM, using that architecture's memory tagging feature. It is felt that a read-only mode KASAN is suitable for use in production systems rather than debug-only. - The 3 patch series "mm: hugetlb: cleanup hugetlb folio allocation" from Kefeng Wang does some tidying in the hugetlb folio allocation code. - The 12 patch series "mm: establish const-correctness for pointer parameters" from Max Kellermann makes quite a number of the MM API functions more accurate about the constness of their arguments. This was getting in the way of subsystems (in this case CEPH) when they attempt to improving their own const/non-const accuracy. - The 7 patch series "Cleanup free_pages() misuse" from Vishal Moola fixes a number of code sites which were confused over when to use free_pages() vs __free_pages(). - The 3 patch series "Add Rust abstraction for Maple Trees" from Alice Ryhl makes the mapletree code accessible to Rust. Required by nouveau and by its forthcoming successor: the new Rust Nova driver. - The 2 patch series "selftests/mm: split_huge_page_test: split_pte_mapped_thp improvements" from David Hildenbrand adds a fix and some cleanups to the thp selftesting code. - The 14 patch series "mm, swap: introduce swap table as swap cache (phase I)" from Chris Li and Kairui Song is the first step along the path to implementing "swap tables" - a new approach to swap allocation and state tracking which is expected to yield speed and space improvements. This patchset itself yields a 5-20% performance benefit in some situations. - The 3 patch series "Some ptdesc cleanups" from Matthew Wilcox utilizes the new memdesc layer to clean up the ptdesc code a little. - The 3 patch series "Fix va_high_addr_switch.sh test failure" from Chunyu Hu fixes some issues in our 5-level pagetable selftesting code. - The 2 patch series "Minor fixes for memory allocation profiling" from Suren Baghdasaryan addresses a couple of minor issues in relatively new memory allocation profiling feature. - The 3 patch series "Small cleanups" from Matthew Wilcox has a few cleanups in preparation for more memdesc work. - The 2 patch series "mm/damon: add addr_unit for DAMON_LRU_SORT and DAMON_RECLAIM" from Quanmin Yan makes some changes to DAMON in furtherance of supporting arm highmem. - The 2 patch series "selftests/mm: Add -Wunreachable-code and fix warnings" from Muhammad Anjum adds that compiler check to selftests code and fixes the fallout, by removing dead code. - The 10 patch series "Improvements to Victim Process Thawing and OOM Reaper Traversal Order" from zhongjinji makes a number of improvements in the OOM killer: mainly thawing a more appropriate group of victim threads so they can release resources. - The 5 patch series "mm/damon: misc fixups and improvements for 6.18" from SeongJae Park is a bunch of small and unrelated fixups for DAMON. - The 7 patch series "mm/damon: define and use DAMON initialization check function" from SeongJae Park implement reliability and maintainability improvements to a recently-added bug fix. - The 2 patch series "mm/damon/stat: expose auto-tuned intervals and non-idle ages" from SeongJae Park provides additional transparency to userspace clients of the DAMON_STAT information. - The 2 patch series "Expand scope of khugepaged anonymous collapse" from Dev Jain removes some constraints on khubepaged's collapsing of anon VMAs. It also increases the success rate of MADV_COLLAPSE against an anon vma. - The 2 patch series "mm: do not assume file == vma->vm_file in compat_vma_mmap_prepare()" from Lorenzo Stoakes moves us further towards removal of file_operations.mmap(). This patchset concentrates upon clearing up the treatment of stacked filesystems. - The 6 patch series "mm: Improve mlock tracking for large folios" from Kiryl Shutsemau provides some fixes and improvements to mlock's tracking of large folios. /proc/meminfo's "Mlocked" field became more accurate. - The 2 patch series "mm/ksm: Fix incorrect accounting of KSM counters during fork" from Donet Tom fixes several user-visible KSM stats inaccuracies across forks and adds selftest code to verify these counters. - The 2 patch series "mm_slot: fix the usage of mm_slot_entry" from Wei Yang addresses some potential but presently benign issues in KSM's mm_slot handling. -----BEGIN PGP SIGNATURE----- iHUEABYKAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCaN3cywAKCRDdBJ7gKXxA jtaPAQDmIuIu7+XnVUK5V11hsQ/5QtsUeLHV3OsAn4yW5/3dEQD/UddRU08ePN+1 2VRB0EwkLAdfMWW7TfiNZ+yhuoiL/AA= =4mhY -----END PGP SIGNATURE----- Merge tag 'mm-stable-2025-10-01-19-00' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull MM updates from Andrew Morton: - "mm, swap: improve cluster scan strategy" from Kairui Song improves performance and reduces the failure rate of swap cluster allocation - "support large align and nid in Rust allocators" from Vitaly Wool permits Rust allocators to set NUMA node and large alignment when perforning slub and vmalloc reallocs - "mm/damon/vaddr: support stat-purpose DAMOS" from Yueyang Pan extend DAMOS_STAT's handling of the DAMON operations sets for virtual address spaces for ops-level DAMOS filters - "execute PROCMAP_QUERY ioctl under per-vma lock" from Suren Baghdasaryan reduces mmap_lock contention during reads of /proc/pid/maps - "mm/mincore: minor clean up for swap cache checking" from Kairui Song performs some cleanup in the swap code - "mm: vm_normal_page() improvements" from David Hildenbrand provides code cleanup in the pagemap code - "add persistent huge zero folio support" from Pankaj Raghav provides a block layer speedup by optionalls making the huge_zero_pagepersistent, instead of releasing it when its refcount falls to zero - "kho: fixes and cleanups" from Mike Rapoport adds a few touchups to the recently added Kexec Handover feature - "mm: make mm->flags a bitmap and 64-bit on all arches" from Lorenzo Stoakes turns mm_struct.flags into a bitmap. To end the constant struggle with space shortage on 32-bit conflicting with 64-bit's needs - "mm/swapfile.c and swap.h cleanup" from Chris Li cleans up some swap code - "selftests/mm: Fix false positives and skip unsupported tests" from Donet Tom fixes a few things in our selftests code - "prctl: extend PR_SET_THP_DISABLE to only provide THPs when advised" from David Hildenbrand "allows individual processes to opt-out of THP=always into THP=madvise, without affecting other workloads on the system". It's a long story - the [1/N] changelog spells out the considerations - "Add and use memdesc_flags_t" from Matthew Wilcox gets us started on the memdesc project. Please see https://kernelnewbies.org/MatthewWilcox/Memdescs and https://blogs.oracle.com/linux/post/introducing-memdesc - "Tiny optimization for large read operations" from Chi Zhiling improves the efficiency of the pagecache read path - "Better split_huge_page_test result check" from Zi Yan improves our folio splitting selftest code - "test that rmap behaves as expected" from Wei Yang adds some rmap selftests - "remove write_cache_pages()" from Christoph Hellwig removes that function and converts its two remaining callers - "selftests/mm: uffd-stress fixes" from Dev Jain fixes some UFFD selftests issues - "introduce kernel file mapped folios" from Boris Burkov introduces the concept of "kernel file pages". Using these permits btrfs to account its metadata pages to the root cgroup, rather than to the cgroups of random inappropriate tasks - "mm/pageblock: improve readability of some pageblock handling" from Wei Yang provides some readability improvements to the page allocator code - "mm/damon: support ARM32 with LPAE" from SeongJae Park teaches DAMON to understand arm32 highmem - "tools: testing: Use existing atomic.h for vma/maple tests" from Brendan Jackman performs some code cleanups and deduplication under tools/testing/ - "maple_tree: Fix testing for 32bit compiles" from Liam Howlett fixes a couple of 32-bit issues in tools/testing/radix-tree.c - "kasan: unify kasan_enabled() and remove arch-specific implementations" from Sabyrzhan Tasbolatov moves KASAN arch-specific initialization code into a common arch-neutral implementation - "mm: remove zpool" from Johannes Weiner removes zspool - an indirection layer which now only redirects to a single thing (zsmalloc) - "mm: task_stack: Stack handling cleanups" from Pasha Tatashin makes a couple of cleanups in the fork code - "mm: remove nth_page()" from David Hildenbrand makes rather a lot of adjustments at various nth_page() callsites, eventually permitting the removal of that undesirable helper function - "introduce kasan.write_only option in hw-tags" from Yeoreum Yun creates a KASAN read-only mode for ARM, using that architecture's memory tagging feature. It is felt that a read-only mode KASAN is suitable for use in production systems rather than debug-only - "mm: hugetlb: cleanup hugetlb folio allocation" from Kefeng Wang does some tidying in the hugetlb folio allocation code - "mm: establish const-correctness for pointer parameters" from Max Kellermann makes quite a number of the MM API functions more accurate about the constness of their arguments. This was getting in the way of subsystems (in this case CEPH) when they attempt to improving their own const/non-const accuracy - "Cleanup free_pages() misuse" from Vishal Moola fixes a number of code sites which were confused over when to use free_pages() vs __free_pages() - "Add Rust abstraction for Maple Trees" from Alice Ryhl makes the mapletree code accessible to Rust. Required by nouveau and by its forthcoming successor: the new Rust Nova driver - "selftests/mm: split_huge_page_test: split_pte_mapped_thp improvements" from David Hildenbrand adds a fix and some cleanups to the thp selftesting code - "mm, swap: introduce swap table as swap cache (phase I)" from Chris Li and Kairui Song is the first step along the path to implementing "swap tables" - a new approach to swap allocation and state tracking which is expected to yield speed and space improvements. This patchset itself yields a 5-20% performance benefit in some situations - "Some ptdesc cleanups" from Matthew Wilcox utilizes the new memdesc layer to clean up the ptdesc code a little - "Fix va_high_addr_switch.sh test failure" from Chunyu Hu fixes some issues in our 5-level pagetable selftesting code - "Minor fixes for memory allocation profiling" from Suren Baghdasaryan addresses a couple of minor issues in relatively new memory allocation profiling feature - "Small cleanups" from Matthew Wilcox has a few cleanups in preparation for more memdesc work - "mm/damon: add addr_unit for DAMON_LRU_SORT and DAMON_RECLAIM" from Quanmin Yan makes some changes to DAMON in furtherance of supporting arm highmem - "selftests/mm: Add -Wunreachable-code and fix warnings" from Muhammad Anjum adds that compiler check to selftests code and fixes the fallout, by removing dead code - "Improvements to Victim Process Thawing and OOM Reaper Traversal Order" from zhongjinji makes a number of improvements in the OOM killer: mainly thawing a more appropriate group of victim threads so they can release resources - "mm/damon: misc fixups and improvements for 6.18" from SeongJae Park is a bunch of small and unrelated fixups for DAMON - "mm/damon: define and use DAMON initialization check function" from SeongJae Park implement reliability and maintainability improvements to a recently-added bug fix - "mm/damon/stat: expose auto-tuned intervals and non-idle ages" from SeongJae Park provides additional transparency to userspace clients of the DAMON_STAT information - "Expand scope of khugepaged anonymous collapse" from Dev Jain removes some constraints on khubepaged's collapsing of anon VMAs. It also increases the success rate of MADV_COLLAPSE against an anon vma - "mm: do not assume file == vma->vm_file in compat_vma_mmap_prepare()" from Lorenzo Stoakes moves us further towards removal of file_operations.mmap(). This patchset concentrates upon clearing up the treatment of stacked filesystems - "mm: Improve mlock tracking for large folios" from Kiryl Shutsemau provides some fixes and improvements to mlock's tracking of large folios. /proc/meminfo's "Mlocked" field became more accurate - "mm/ksm: Fix incorrect accounting of KSM counters during fork" from Donet Tom fixes several user-visible KSM stats inaccuracies across forks and adds selftest code to verify these counters - "mm_slot: fix the usage of mm_slot_entry" from Wei Yang addresses some potential but presently benign issues in KSM's mm_slot handling * tag 'mm-stable-2025-10-01-19-00' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (372 commits) mm: swap: check for stable address space before operating on the VMA mm: convert folio_page() back to a macro mm/khugepaged: use start_addr/addr for improved readability hugetlbfs: skip VMAs without shareable locks in hugetlb_vmdelete_list alloc_tag: fix boot failure due to NULL pointer dereference mm: silence data-race in update_hiwater_rss mm/memory-failure: don't select MEMORY_ISOLATION mm/khugepaged: remove definition of struct khugepaged_mm_slot mm/ksm: get mm_slot by mm_slot_entry() when slot is !NULL hugetlb: increase number of reserving hugepages via cmdline selftests/mm: add fork inheritance test for ksm_merging_pages counter mm/ksm: fix incorrect KSM counter handling in mm_struct during fork drivers/base/node: fix double free in register_one_node() mm: remove PMD alignment constraint in execmem_vmalloc() mm/memory_hotplug: fix typo 'esecially' -> 'especially' mm/rmap: improve mlock tracking for large folios mm/filemap: map entire large folio faultaround mm/fault: try to map the entire file folio in finish_fault() mm/rmap: mlock large folios in try_to_unmap_one() mm/rmap: fix a mlock race condition in folio_referenced_one() ...	2025-10-02 18:18:33 -07:00
Linus Torvalds	6c7340a7a8	Scheduler updates for v6.18: Core scheduler changes: - Make migrate_{en,dis}able() inline, to improve performance (Menglong Dong) - Move STDL_INIT() functions out-of-line (Peter Zijlstra) - Unify the SCHED_{SMT,CLUSTER,MC} Kconfig (Peter Zijlstra) Fair scheduling: - Defer throttling when tasks exit to user-space, to reduce the chance & impact of throttle-preemption with held locks and other resources. (Aaron Lu, Valentin Schneider) - Get rid of sched_domains_curr_level hack for tl->cpumask(), as the warning was getting triggered on certain topologies. (Peter Zijlstra) Misc cleanups & fixes: - Header cleanups (Menglong Dong) - Fix race in push_dl_task() (Harshit Agarwal) Signed-off-by: Ingo Molnar <mingo@kernel.org> -----BEGIN PGP SIGNATURE----- iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAmjWn0IRHG1pbmdvQGtl cm5lbC5vcmcACgkQEnMQ0APhK1i9ng/+KJYEsNilPA6nGzZLurU3HXm6S5NrNBPc xJU43eo3QdFUymKqYaCoXCwaMah3m8fFW+DC8ZoEvWC5wHnlIw6+u/ZEtsWQ4R8N 8+sjQddQeb9Gez+nkC7pXX8h1nqlhHyGWU88+9hOyEEMk21KgH7tJ5gOuGQdPhiA v24D8dkS1sT6CAeqZAycYIkos+kfNNV+wAEQCXAxfvSSslpzJVZcj9kdQInxF4O4 9+WrvFZFVYJWdJgmgfbpBMw7N8a4Gpv+Lr6U0ZVXHNeLjNdn60I+5Me+9BW9GRcO pPL50RfGgGgVIqyEHvBhhz8p0tlKUJo9NkRJ9hmNB5SKT3P442SU789ppTYgI1il 5P4g3TiuomqPVuo99/mYHYOpT6NkYC1fkwMP9Hfk4NRUX0EA6lKXelVnMXaC3Z7R U+0xVYtK4BBLRFBo4HIEM1HChSIcOnutuufFX4lPPt+ewnAmJwSt619w+lBF0v+n cpiLIlQ74LrYlrULw3bznnwzh5dV8XHm4CT3XAShm6qB97/SaLr0rI5W7njdSvte ym9z4yFWidawowvLWjFX1CykSnX/T5MV+uzuIH64MEIA39B4VKJWCAC3l3AMJn/5 Ckhf93B5uG0q+wUtE66x/JrN7wtbz9PMpwh01fXNWs/eWc1VKkVrvq+PmHODngmz drSUoI1P570= =8ZTZ -----END PGP SIGNATURE----- Merge tag 'sched-core-2025-09-26' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler updates from Ingo Molnar: "Core scheduler changes: - Make migrate_{en,dis}able() inline, to improve performance (Menglong Dong) - Move STDL_INIT() functions out-of-line (Peter Zijlstra) - Unify the SCHED_{SMT,CLUSTER,MC} Kconfig (Peter Zijlstra) Fair scheduling: - Defer throttling to when tasks exit to user-space, to reduce the chance & impact of throttle-preemption with held locks and other resources (Aaron Lu, Valentin Schneider) - Get rid of sched_domains_curr_level hack for tl->cpumask(), as the warning was getting triggered on certain topologies (Peter Zijlstra) Misc cleanups & fixes: - Header cleanups (Menglong Dong) - Fix race in push_dl_task() (Harshit Agarwal)" * tag 'sched-core-2025-09-26' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: sched: Fix some typos in include/linux/preempt.h sched: Make migrate_{en,dis}able() inline rcu: Replace preempt.h with sched.h in include/linux/rcupdate.h arch: Add the macro COMPILE_OFFSETS to all the asm-offsets.c sched/fair: Do not balance task to a throttled cfs_rq sched/fair: Do not special case tasks in throttled hierarchy sched/fair: update_cfs_group() for throttled cfs_rqs sched/fair: Propagate load for throttled cfs_rq sched/fair: Get rid of throttled_lb_pair() sched/fair: Task based throttle time accounting sched/fair: Switch to task based throttle model sched/fair: Implement throttle task work and related helpers sched/fair: Add related data structure for task based throttle sched: Unify the SCHED_{SMT,CLUSTER,MC} Kconfig sched: Move STDL_INIT() functions out-of-line sched/fair: Get rid of sched_domains_curr_level hack for tl->cpumask() sched/deadline: Fix race in push_dl_task()	2025-09-30 10:35:11 -07:00
Menglong Dong	35561bab76	arch: Add the macro COMPILE_OFFSETS to all the asm-offsets.c The include/generated/asm-offsets.h is generated in Kbuild during compiling from arch/SRCARCH/kernel/asm-offsets.c. When we want to generate another similar offset header file, circular dependency can happen. For example, we want to generate a offset file include/generated/test.h, which is included in include/sched/sched.h. If we generate asm-offsets.h first, it will fail, as include/sched/sched.h is included in asm-offsets.c and include/generated/test.h doesn't exist; If we generate test.h first, it can't success neither, as include/generated/asm-offsets.h is included by it. In x86_64, the macro COMPILE_OFFSETS is used to avoid such circular dependency. We can generate asm-offsets.h first, and if the COMPILE_OFFSETS is defined, we don't include the "generated/test.h". And we define the macro COMPILE_OFFSETS for all the asm-offsets.c for this purpose. Signed-off-by: Menglong Dong <dongml2@chinatelecom.cn> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>	2025-09-25 09:57:15 +02:00
Sabyrzhan Tasbolatov	1e338f4d99	kasan: introduce ARCH_DEFER_KASAN and unify static key across modes Patch series "kasan: unify kasan_enabled() and remove arch-specific implementations", v6. This patch series addresses the fragmentation in KASAN initialization across architectures by introducing a unified approach that eliminates duplicate static keys and arch-specific kasan_arch_is_ready() implementations. The core issue is that different architectures have inconsistent approaches to KASAN readiness tracking: - PowerPC, LoongArch, and UML arch, each implement own kasan_arch_is_ready() - Only HW_TAGS mode had a unified static key (kasan_flag_enabled) - Generic and SW_TAGS modes relied on arch-specific solutions or always-on behavior This patch (of 2): Introduce CONFIG_ARCH_DEFER_KASAN to identify architectures [1] that need to defer KASAN initialization until shadow memory is properly set up, and unify the static key infrastructure across all KASAN modes. [1] PowerPC, UML, LoongArch selects ARCH_DEFER_KASAN. The core issue is that different architectures haveinconsistent approaches to KASAN readiness tracking: - PowerPC, LoongArch, and UML arch, each implement own kasan_arch_is_ready() - Only HW_TAGS mode had a unified static key (kasan_flag_enabled) - Generic and SW_TAGS modes relied on arch-specific solutions or always-on behavior This patch addresses the fragmentation in KASAN initialization across architectures by introducing a unified approach that eliminates duplicate static keys and arch-specific kasan_arch_is_ready() implementations. Let's replace kasan_arch_is_ready() with existing kasan_enabled() check, which examines the static key being enabled if arch selects ARCH_DEFER_KASAN or has HW_TAGS mode support. For other arch, kasan_enabled() checks the enablement during compile time. Now KASAN users can use a single kasan_enabled() check everywhere. Link: https://lkml.kernel.org/r/20250810125746.1105476-1-snovitoll@gmail.com Link: https://lkml.kernel.org/r/20250810125746.1105476-2-snovitoll@gmail.com Closes: https://bugzilla.kernel.org/show_bug.cgi?id=217049 Signed-off-by: Sabyrzhan Tasbolatov <snovitoll@gmail.com> Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu> Reviewed-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com> #powerpc Cc: Alexander Gordeev <agordeev@linux.ibm.com> Cc: Alexander Potapenko <glider@google.com> Cc: Alexandre Ghiti <alex@ghiti.fr> Cc: Alexandre Ghiti <alexghiti@rivosinc.com> Cc: Andrey Konovalov <andreyknvl@gmail.com> Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com> Cc: Baoquan He <bhe@redhat.com> Cc: David Gow <davidgow@google.com> Cc: Dmitriy Vyukov <dvyukov@google.com> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Huacai Chen <chenhuacai@loongson.cn> Cc: Marco Elver <elver@google.com> Cc: Qing Zhang <zhangqing@loongson.cn> Cc: Sabyrzhan Tasbolatov <snovitoll@gmail.com> Cc: Vincenzo Frascino <vincenzo.frascino@arm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2025-09-21 14:21:58 -07:00
Tiwei Bie	e66ae377fe	um: Remove unused ipi_pipe field from cpuinfo_um It's no longer used after the removal of the SMP implementation in TT mode by commit `28fa468f53` ("um: Remove broken SMP support"). While at it, remove the outdated comment. Signed-off-by: Tiwei Bie <tiwei.btw@antgroup.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2025-09-10 14:23:58 +02:00
Tiwei Bie	e047f9af9d	um: Centralize stub size calculations Currently, the stub size is calculated in multiple places. Define a macro that performs the calculation so that the code is easier to read and maintain. Signed-off-by: Tiwei Bie <tiwei.btw@antgroup.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2025-09-10 14:23:52 +02:00
Tiwei Bie	4c134c2a5f	um: Indent time-travel help messages Indent the help messages for time-travel to make them consistent with the format of other help messages. Signed-off-by: Tiwei Bie <tiwei.btw@antgroup.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2025-09-10 14:23:45 +02:00
Tiwei Bie	26577cfbe1	um: Add missing trailing newline to help messages Some help messages are missing a trailing newline. They should end with two newlines, but only one is present. Add the missing newline to make the --help output more readable. Signed-off-by: Tiwei Bie <tiwei.btw@antgroup.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2025-09-10 14:23:37 +02:00
Sinan Nalkaya	35fae10aaf	um: Support SPARSE_IRQ Motivation: IRQ KUnit tests are going to require CONFIG_SPARSE_IRQ [1] in order to: (a) reliably allocate additional (fake) IRQs and (b) ensure we can test managed affinity, which is only supported with SPARSE_IRQ. It seems that the only thing necessary for ARCH=um is to tell the genirq core to skip over our preallocated NR_IRQS. Tested with: $ ./tools/testing/kunit/kunit.py run [...] [13:55:58] Testing complete. Ran 676 tests: passed: 646, skipped: 30 [...] This compares with pre-patch results: Ran 672 tests: passed: 644, skipped: 28 i.e., we no longer skip tests that 'depend on SPARSE_IRQ', and existing tests all pass. [1] [PATCH v2 4/6] genirq/test: Depend on SPARSE_IRQ https://lore.kernel.org/all/CABVgOSngoD0fh1WEkUCEwSdk0Joypo3dA_Y_SjW+K=nVDnZs3Q@mail.gmail.com/ Signed-off-by: Sinan Nalkaya <sardok@gmail.com> [Brian: Adapted Sinan's patch; rewrote commit message] Signed-off-by: Brian Norris <briannorris@chromium.org> Tested-by: David Gow <davidgow@google.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2025-09-10 14:23:30 +02:00
Simon Schuster	bbc46b23af	arch: copy_thread: pass clone_flags as u64 With the introduction of clone3 in commit `7f192e3cd3` ("fork: add clone3") the effective bit width of clone_flags on all architectures was increased from 32-bit to 64-bit, with a new type of u64 for the flags. However, for most consumers of clone_flags the interface was not changed from the previous type of unsigned long. While this works fine as long as none of the new 64-bit flag bits (CLONE_CLEAR_SIGHAND and CLONE_INTO_CGROUP) are evaluated, this is still undesirable in terms of the principle of least surprise. Thus, this commit fixes all relevant interfaces of the copy_thread function that is called from copy_process to consistently pass clone_flags as u64, so that no truncation to 32-bit integers occurs on 32-bit architectures. Signed-off-by: Simon Schuster <schuster.simon@siemens-energy.com> Link: https://lore.kernel.org/20250901-nios2-implement-clone3-v2-3-53fcf5577d57@siemens-energy.com Fixes: `c5febea095` ("fork: Pass struct kernel_clone_args into copy_thread") Acked-by: Guo Ren (Alibaba Damo Academy) <guoren@kernel.org> Acked-by: Andreas Larsson <andreas@gaisler.com> # sparc Acked-by: David Hildenbrand <david@redhat.com> Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> # m68k Reviewed-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Christian Brauner <brauner@kernel.org>	2025-09-01 15:31:34 +02:00
Tiwei Bie	f7e9077a16	um: Stop tracking stub's PID via userspace_pid[] The PID of the stub process can be obtained from current_mm_id(). There is no need to track it via userspace_pid[]. Stop doing that to simplify the code. Signed-off-by: Tiwei Bie <tiwei.btw@antgroup.com> Link: https://patch.msgid.link/20250711065021.2535362-4-tiwei.bie@linux.dev Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2025-07-13 19:42:49 +02:00
Tiwei Bie	409a0c00c4	um: Make mm_list and mm_list_lock static They are only used within mmu.c. Make them static. Signed-off-by: Tiwei Bie <tiwei.btw@antgroup.com> Link: https://patch.msgid.link/20250708090403.1067440-3-tiwei.bie@linux.dev Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2025-07-11 08:49:18 +02:00
Thomas Weißschuh	b9e2f2246e	um: Re-evaluate thread flags repeatedly The thread flags may change during their processing. For example a task_work can queue a new signal to be sent. This signal should be delivered before returning to usespace again. Evaluate the flags repeatedly similar to other architectures. Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de> Reviewed-by: Nam Cao <namcao@linutronix.de> Link: https://patch.msgid.link/20250704-uml-thread_flags-v1-1-0e293fd8d627@linutronix.de Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2025-07-11 08:49:02 +02:00
Johannes Berg	ac1ad16f10	um: simplify syscall header files Since Thomas's recent commit 2af10530639b ("um/x86: Add system call table to header file") , we now have two extern declarations of the syscall table, one internal and one external, and they don't even match on 32-bit. Clean this up and remove all the extra code. Reviewed-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de> Link: https://patch.msgid.link/20250704141243.a68366f6acc3.If8587a4aafdb90644fc6d0b2f5e31a2d1887915f@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2025-07-11 08:49:02 +02:00
Thomas Weißschuh	2a713f04ed	um/ptrace: Implement HAVE_SYSCALL_TRACEPOINTS Implement syscall tracepoints through the generic tracing infrastructure. Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de> Reviewed-by: Nam Cao <namcao@linutronix.de> Link: https://patch.msgid.link/20250703-uml-have_syscall_tracepoints-v1-2-23c1d3808578@linutronix.de Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2025-07-11 08:49:02 +02:00
Linus Torvalds	8630c59e99	Kbuild updates for v6.16 - Add support for the EXPORT_SYMBOL_GPL_FOR_MODULES() macro, which exports a symbol only to specified modules - Improve ABI handling in gendwarfksyms - Forcibly link lib-y objects to vmlinux even if CONFIG_MODULES=n - Add checkers for redundant or missing <linux/export.h> inclusion - Deprecate the extra-y syntax - Fix a genksyms bug when including enum constants from .symref files -----BEGIN PGP SIGNATURE----- iQJJBAABCgAzFiEEbmPs18K1szRHjPqEPYsBB53g2wYFAmhEZc4VHG1hc2FoaXJv eUBrZXJuZWwub3JnAAoJED2LAQed4NsGVAgQAKLRdBGga1kBJJFIkUOHWC5+g/je U/dO5rGnuOLviWDexC6QT8AQV2N+dQXhB11x+KacSu1bwowsEvwuegtA6VqwbETs tyWmB0PftEzVyPfc+Rjfy0LDfKkiKkm4RhXiMwcem/rlw45gvJXrVU7jJin9fI3A So8glpOAX+mEizUHkjZkS51nkYCZFDsn7hVo0X43vqjeFrrFGLEQ5xas4Ci+dkY3 9g8Q5bFL8CC5PHjSO8wFftCcAWwTukAht6CSSb522MKGnCVZ9RxTmRwEPXrBmXtS 5eWa8yg6y0tFVmot8iwZGBYleAWDNsj0a2j2oVjUN+EF91sk3WQApJVNBok/nQFb 4MgO3N3UXZdy4tYkBX8tMgOcGkfjZAFoNxSUm5oVouh9NyT0dpqYHhJHBNVbVJoF igQWeVOYcioDjeU1iXnP2cw64q44ROfxmOpDxOSRz9PTM6CCya1R0m/zzBLV6Lwk rzlXk1LLf+jIfgmS5RLlkCgrXS1U0vNGXxQH9Ui9dZSEtzdU7qt5WQ/Rz44bEBhS OeIlJfMMx6QYJztJc/BaUjkKsutTkII52QctRbRCj/nKswHd8SnHV+xk1c2WPxrg yKq10rPpdg1BcvmODY6cmcndt7ogDRfkogm2gvGQIBZEglRimpmpg51sZQRD0ueE 0rt12TmktsLbglB4 =Dy49 -----END PGP SIGNATURE----- Merge tag 'kbuild-v6.16' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild Pull Kbuild updates from Masahiro Yamada: - Add support for the EXPORT_SYMBOL_GPL_FOR_MODULES() macro, which exports a symbol only to specified modules - Improve ABI handling in gendwarfksyms - Forcibly link lib-y objects to vmlinux even if CONFIG_MODULES=n - Add checkers for redundant or missing <linux/export.h> inclusion - Deprecate the extra-y syntax - Fix a genksyms bug when including enum constants from .symref files * tag 'kbuild-v6.16' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: (28 commits) genksyms: Fix enum consts from a reference affecting new values arch: use always-$(KBUILD_BUILTIN) for vmlinux.lds kbuild: set y instead of 1 to KBUILD_{BUILTIN,MODULES} efi/libstub: use 'targets' instead of extra-y in Makefile module: make __mod_device_table__* symbols static scripts/misc-check: check unnecessary #include <linux/export.h> when W=1 scripts/misc-check: check missing #include <linux/export.h> when W=1 scripts/misc-check: add double-quotes to satisfy shellcheck kbuild: move W=1 check for scripts/misc-check to top-level Makefile scripts/tags.sh: allow to use alternative ctags implementation kconfig: introduce menu type enum docs: symbol-namespaces: fix reST warning with literal block kbuild: link lib-y objects to vmlinux forcibly even when CONFIG_MODULES=n tinyconfig: enable CONFIG_LD_DEAD_CODE_DATA_ELIMINATION docs/core-api/symbol-namespaces: drop table of contents and section numbering modpost: check forbidden MODULE_IMPORT_NS("module:") at compile time kbuild: move kbuild syntax processing to scripts/Makefile.build Makefile: remove dependency on archscripts for header installation Documentation/kbuild: Add new gendwarfksyms kABI rules Documentation/kbuild: Drop section numbers ...	2025-06-07 10:05:35 -07:00
Masahiro Yamada	e21efe833e	arch: use always-$(KBUILD_BUILTIN) for vmlinux.lds The extra-y syntax is deprecated. Instead, use always-$(KBUILD_BUILTIN), which behaves equivalently. Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Acked-by: Johannes Berg <johannes@sipsolutions.net> Reviewed-by: Nicolas Schier <n.schier@avm.de>	2025-06-07 14:38:07 +09:00
Linus Torvalds	cfc4ca8986	Notable changes: - remove obsolete network transports - remove PCI IO port support - start adding seccomp-based process handling instead of ptrace -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEEpeA8sTs3M8SN2hR410qiO8sPaAAFAmhBYTIACgkQ10qiO8sP aACobQ//ZggBPinLNWXep4pcfK0/x1mx76cKVIpf1TSI6BpG1kQmkpOIxYDE6JTv yo1Ydoy7CMs+xxkDRpsm85qcq8BHhK4Ebfg/jYmRSCSKxtWEeNHJv3RmauQGAxym iGLR4Wd7dju0ywiOSAr66cZ0OYHKUbT2j4Vxybb8YG5sJ2s3YVJBYsiGJDtmjF9q ezySizAhW8KLScSiqWDruHUq7yEGWa8fp2RNPKT5WhOobZRAJI5upFNwHh0dINaK 8Qntui4IgG922toBVS26g8ZwV6iJlBUsDttpWZEW1xFBxvxhWI5temW1LVBTvs8M mTCiKRd/oGwgtzNmWwXPzW7oJbBA/IlYtGognmaPgjwomyeGmWbnIWsB/1VV1QL4 5+1+zGQzs8xnN2TsOkIQSiWEEkolreG8NFFY2PZPxiSH6lvkYvlin76DbA+HbmWR oU8GBKAwJmn15yxPuRRaCtUaVr4M+siIfBVp5NCgvlnc6scCWVdGlT9e59D6T886 ZCY4O3UOzhzi9f0xCMx8+XVGjCPntlqLJJQCnSTrtS0+E7B78CxYNZRSLQ83HLa/ ivDA3fu/rvBON/gRYqd1YDOy0NkRddDZLQEwiedRkRSI5TZdEDQZMnOFdqSDEd/D doWw8M3m6g5o2zTOF6XkU9Se1VhkkRDUgxQ+AqLCoMIoM3WVby8= =iHzS -----END PGP SIGNATURE----- Merge tag 'uml-for-linux-6.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/uml/linux Pull UML updates from Johannes Berg: "The only really new thing is the long-standing seccomp work (originally from 2021!). Wven if it still isn't enabled by default due to security concerns it can still be used e.g. for tests. - remove obsolete network transports - remove PCI IO port support - start adding seccomp-based process handling instead of ptrace" * tag 'uml-for-linux-6.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/uml/linux: (29 commits) um: remove "extern" from implementation of sigchld_handler um: fix unused variable warning um: fix SECCOMP 32bit xstate register restore um: pass FD for memory operations when needed um: Add SECCOMP support detection and initialization um: Implement kernel side of SECCOMP based process handling um: Track userspace children dying in SECCOMP mode um: Add helper functions to get/set state for SECCOMP um: Add stub side of SECCOMP/futex based process handling um: Move faultinfo extraction into userspace routine um: vector: Use mac_pton() for MAC address parsing um: vector: Clean up and modernize log messages um: chan_kern: use raw spinlock for irqs_to_free_lock MAINTAINERS: remove obsolete file entry in TUN/TAP DRIVER um: Fix tgkill compile error on old host OSes um: stop using PCI port I/O um: Remove legacy network transport infrastructure um: vector: Eliminate the dependency on uml_net um: Remove obsolete legacy network transports um/asm: Replace "REP; NOP" with PAUSE mnemonic ...	2025-06-05 11:45:33 -07:00
Benjamin Berg	e56a50ff7c	um: remove "extern" from implementation of sigchld_handler There is no need to mark the function as extern in the implementation. Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202506051226.X8r7X5aa-lkp@intel.com/ Fixes: `8420e08fe3` ("um: Track userspace children dying in SECCOMP mode") Signed-off-by: Benjamin Berg <benjamin.berg@intel.com> Link: https://patch.msgid.link/20250605050325.1077208-2-benjamin@sipsolutions.net Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2025-06-05 11:12:13 +02:00
Benjamin Berg	e92e255285	um: pass FD for memory operations when needed Instead of always sharing the FDs with the userspace process, only hand over the FDs needed for mmap when required. The idea is that userspace might be able to force the stub into executing an mmap syscall, however, it will not be able to manipulate the control flow sufficiently to have access to an FD that would allow mapping arbitrary memory. Security wise, we need to be sure that only the expected syscalls are executed after the kernel sends FDs through the socket. This is currently not the case, as userspace can trivially jump to the rt_sigreturn syscall instruction to execute any syscall that the stub is permitted to do. With this, it can trick the kernel to send the FD, which in turn allows userspace to freely map any physical memory. As such, this is currently not secure. However, in principle the approach should be fine with a more strict SECCOMP filter and a careful review of the stub control flow (as userspace can prepare a stack). With some care, it is likely possible to extend the security model to SMP if desired. Signed-off-by: Benjamin Berg <benjamin.berg@intel.com> Link: https://patch.msgid.link/20250602130052.545733-8-benjamin@sipsolutions.net Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2025-06-02 16:20:10 +02:00

1 2 3 4 5 ...

1123 Commits