Commit Graph

618 Commits

Author SHA1 Message Date
Kees Cook
69050f8d6d treewide: Replace kmalloc with kmalloc_obj for non-scalar types
This is the result of running the Coccinelle script from
scripts/coccinelle/api/kmalloc_objs.cocci. The script is designed to
avoid scalar types (which need careful case-by-case checking), and
instead replace kmalloc-family calls that allocate struct or union
object instances:

Single allocations:	kmalloc(sizeof(TYPE), ...)
are replaced with:	kmalloc_obj(TYPE, ...)

Array allocations:	kmalloc_array(COUNT, sizeof(TYPE), ...)
are replaced with:	kmalloc_objs(TYPE, COUNT, ...)

Flex array allocations:	kmalloc(struct_size(PTR, FAM, COUNT), ...)
are replaced with:	kmalloc_flex(*PTR, FAM, COUNT, ...)

(where TYPE may also be *VAR)

The resulting allocations no longer return "void *", instead returning
"TYPE *".

Signed-off-by: Kees Cook <kees@kernel.org>
2026-02-21 01:02:28 -08:00
Linus Torvalds
eeccf287a2 mm.git review status for linus..mm-stable
Total patches:       36
 Reviews/patch:       1.77
 Reviewed rate:       83%
 
 - The 2 patch series "mm/vmscan: fix demotion targets checks in
   reclaim/demotion" from Bing Jiao fixes a couple of issues in the
   demotion code - pages were failed demotion and were finding themselves
   demoted into disallowed nodes.
 
 - The 11 patch series "Remove XA_ZERO from error recovery of dup_mmap()"
   from Liam Howlett fixes a rare mapledtree race and performs a number of
   cleanups.
 
 - The 13 patch series "mm: add bitmap VMA flag helpers and convert all
   mmap_prepare to use them" from Lorenzo Stoakes implements a lot of
   cleanups following on from the conversion of the VMA flags into a
   bitmap.
 
 - The 5 patch series "support batch checking of references and unmapping
   for large folios" from Baolin Wang implements batching to greatly
   improve the performance of reclaiming clean file-backed large folios.
 
 - The 3 patch series "selftests/mm: add memory failure selftests" from
   Miaohe Lin does as claimed.
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCaZaIEQAKCRDdBJ7gKXxA
 jj73AQCQDwLoipDiQRGyjB5BDYydymWuDoiB1tlDPHfYAP3b/QD/UQtVlOEXqwM3
 naOKs3NQ1pwnfhDaQMirGw2eAnJ1SQY=
 =6Iif
 -----END PGP SIGNATURE-----

Merge tag 'mm-stable-2026-02-18-19-48' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull more MM  updates from Andrew Morton:

 - "mm/vmscan: fix demotion targets checks in reclaim/demotion" fixes a
   couple of issues in the demotion code - pages were failed demotion
   and were finding themselves demoted into disallowed nodes (Bing Jiao)

 - "Remove XA_ZERO from error recovery of dup_mmap()" fixes a rare
   mapledtree race and performs a number of cleanups (Liam Howlett)

 - "mm: add bitmap VMA flag helpers and convert all mmap_prepare to use
   them" implements a lot of cleanups following on from the conversion
   of the VMA flags into a bitmap (Lorenzo Stoakes)

 - "support batch checking of references and unmapping for large folios"
   implements batching to greatly improve the performance of reclaiming
   clean file-backed large folios (Baolin Wang)

 - "selftests/mm: add memory failure selftests" does as claimed (Miaohe
   Lin)

* tag 'mm-stable-2026-02-18-19-48' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (36 commits)
  mm/page_alloc: clear page->private in free_pages_prepare()
  selftests/mm: add memory failure dirty pagecache test
  selftests/mm: add memory failure clean pagecache test
  selftests/mm: add memory failure anonymous page test
  mm: rmap: support batched unmapping for file large folios
  arm64: mm: implement the architecture-specific clear_flush_young_ptes()
  arm64: mm: support batch clearing of the young flag for large folios
  arm64: mm: factor out the address and ptep alignment into a new helper
  mm: rmap: support batched checks of the references for large folios
  tools/testing/vma: add VMA userland tests for VMA flag functions
  tools/testing/vma: separate out vma_internal.h into logical headers
  tools/testing/vma: separate VMA userland tests into separate files
  mm: make vm_area_desc utilise vma_flags_t only
  mm: update all remaining mmap_prepare users to use vma_flags_t
  mm: update shmem_[kernel]_file_*() functions to use vma_flags_t
  mm: update secretmem to use VMA flags on mmap_prepare
  mm: update hugetlbfs to use VMA flags on mmap_prepare
  mm: add basic VMA flag operation helper functions
  tools: bitmap: add missing bitmap_[subset(), andnot()]
  mm: add mk_vma_flags() bitmap flag macro helper
  ...
2026-02-18 20:50:32 -08:00
Linus Torvalds
75a452d31b Changes for 7.0-rc1
Added:
     improve readahead for bitmap initialization and large directory scans
 	fsync files by syncing parent inodes
 	drop of preallocated clusters for sparse and compressed files
 	zero-fill folios beyond i_valid in ntfs_read_folio()
 	implement llseek SEEK_DATA/SEEK_HOLE by scanning data runs
 	implement iomap-based file operations
 	allow explicit boolean acl/prealloc mount options
 	a fall-through between switch labels
 	a delayed-allocation (delalloc) support
 
 Fixed:
     check return value of indx_find to avoid infinite loop
 	initialize new folios before use
 	an infinite loop in attr_load_runs_range on inconsistent metadata
 	an infinite loop triggered by zero-sized ATTR_LIST
 	ntfs_mount_options leak in ntfs_fill_super()
 	a deadlock in ni_read_folio_cmpr
 	a circular locking dependency in run_unpack_ex
 	prevent infinite loops caused by the next valid being the same
 	restore NULL folio initialization in ntfs_writepages()
 	a slab-out-of-bounds read in DeleteIndexEntryRoot
 
 Changed:
     allow readdir() to finish after directory mutations without rewinddir()
 	handle attr_set_size() errors when truncating files
 	make ntfs_writeback_ops static
 	refactor duplicate kmemdup pattern in do_action()
 	avoid calling run_get_entry() when run == NULL in ntfs_read_run_nb_ra()
 
 Replaced:
 	use wait_on_buffer() directly
 	rename ni_readpage_cmpr into ni_read_folio_cmpr
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEh0DEKNP0I9IjwfWEqbAzH4MkB7YFAmmUuGYACgkQqbAzH4Mk
 B7ZEHw/+LI38Sd6kkEJZLz2eMi1oL4FwS604tB9Y2k9KrxiPYbO4u2aEgomaCgTy
 QXdeOQWaCJ2Zj0sNVHPoeF9x2neTWu+3DValLxDyL6C+joYYqFPgfkSMTfMOeX+o
 Adf0WxGQSJ74Xnxn9dZER+rO50/V6xF0m5E8G9f78+fF+iN6zW8+tqc0wjkbgfNq
 2qHJ5pyvN7izkzBV9ZYGt7UeLgoGE7JmKeuw/MyFgqCkh4k9yethK7N2cGYnUFyc
 4RGuZloro5K7YlSgtvOufeLWoXNaP1rd8q+/skY1yJsJZbGknEWP11Ph1N67lW3b
 VuqcHFKTNvj2fuEm+T+YMpnzRXEAoGNaAocn+sv1Pd6SBuI05xzWhCM+DWxDQWlN
 fQhCMphf5FUhRfOLgitXnkgBM9wQAgRrE98+8jPmkaCxYJYviSYeKMkC0QfF09rf
 P4Ct1lYbdZNcYjD0EGVISJL3KxQ3XPK26qVqdumOQk+30s85GlAvCfTPoNcwXmxS
 xx/gnTFFIGFzyZyyMACRu9EdXZktPlBq70nWUMVfva5aGq0t+rKZSydiwQLCYRHj
 NtRtg2O5Qd1QumdpjhQRsX1NC8UU1/VpAnixiS7FMvxo7bw5Ksnk+qiL5Ocao7lC
 3Fd/95WYsUJZLkF/J8ayTkhexTyRcl2QZvDzRX00yEseIpQiu2U=
 =Y5Q9
 -----END PGP SIGNATURE-----

Merge tag 'ntfs3_for_7.0' of https://github.com/Paragon-Software-Group/linux-ntfs3

Pull ntfs3 updates from Konstantin Komarov:
 "New code:
   - improve readahead for bitmap initialization and large directory scans
   - fsync files by syncing parent inodes
   - drop of preallocated clusters for sparse and compressed files
   - zero-fill folios beyond i_valid in ntfs_read_folio()
   - implement llseek SEEK_DATA/SEEK_HOLE by scanning data runs
   - implement iomap-based file operations
   - allow explicit boolean acl/prealloc mount options
   - fall-through between switch labels
   - delayed-allocation (delalloc) support

  Fixes:
   - check return value of indx_find to avoid infinite loop
   - initialize new folios before use
   - infinite loop in attr_load_runs_range on inconsistent metadata
   - infinite loop triggered by zero-sized ATTR_LIST
   - ntfs_mount_options leak in ntfs_fill_super()
   - deadlock in ni_read_folio_cmpr
   - circular locking dependency in run_unpack_ex
   - prevent infinite loops caused by the next valid being the same
   - restore NULL folio initialization in ntfs_writepages()
   - slab-out-of-bounds read in DeleteIndexEntryRoot

  Updates:
   - allow readdir() to finish after directory mutations without rewinddir()
   - handle attr_set_size() errors when truncating files
   - make ntfs_writeback_ops static
   - refactor duplicate kmemdup pattern in do_action()
   - avoid calling run_get_entry() when run == NULL in ntfs_read_run_nb_ra()

  Replaced:
   - use wait_on_buffer() directly
   - rename ni_readpage_cmpr into ni_read_folio_cmpr"

* tag 'ntfs3_for_7.0' of https://github.com/Paragon-Software-Group/linux-ntfs3: (26 commits)
  fs/ntfs3: add delayed-allocation (delalloc) support
  fs/ntfs3: avoid calling run_get_entry() when run == NULL in ntfs_read_run_nb_ra()
  fs/ntfs3: add fall-through between switch labels
  fs/ntfs3: allow explicit boolean acl/prealloc mount options
  fs/ntfs3: Fix slab-out-of-bounds read in DeleteIndexEntryRoot
  ntfs3: Restore NULL folio initialization in ntfs_writepages()
  ntfs3: Refactor duplicate kmemdup pattern in do_action()
  fs/ntfs3: prevent infinite loops caused by the next valid being the same
  fs/ntfs3: make ntfs_writeback_ops static
  ntfs3: fix circular locking dependency in run_unpack_ex
  fs/ntfs3: implement iomap-based file operations
  fs/ntfs3: fix deadlock in ni_read_folio_cmpr
  fs/ntfs3: implement llseek SEEK_DATA/SEEK_HOLE by scanning data runs
  fs/ntfs3: zero-fill folios beyond i_valid in ntfs_read_folio()
  fs/ntfs3: handle attr_set_size() errors when truncating files
  fs/ntfs3: drop preallocated clusters for sparse and compressed files
  fs/ntfs3: fsync files by syncing parent inodes
  fs/ntfs3: fix ntfs_mount_options leak in ntfs_fill_super()
  fs/ntfs3: allow readdir() to finish after directory mutations without rewinddir()
  fs/ntfs3: improve readahead for bitmap initialization and large directory scans
  ...
2026-02-17 15:37:06 -08:00
Konstantin Komarov
10d7c95af0
fs/ntfs3: add delayed-allocation (delalloc) support
This patch implements delayed allocation (delalloc) in ntfs3 driver.

It introduces an in-memory delayed-runlist (run_da) and the helpers to
track, reserve and later convert those delayed reservations into real
clusters at writeback time. The change keeps on-disk formats untouched and
focuses on pagecache integration, correctness and safe interaction with
fallocate, truncate, and dio/iomap paths.

Key points:

- add run_da (delay-allocated run tree) and bookkeeping for delayed clusters.

- mark ranges as delalloc (DELALLOC_LCN) instead of immediately allocating.
  Actual allocation performed later (writeback / attr_set_size_ex / explicit
  flush paths).

- direct i/o / iomap paths updated to avoid dio collisions with
  delalloc: dio falls back or forces allocation of delayed blocks before
  proceeding.

- punch/collapse/truncate/fallocate check and cancel delay-alloc reservations.
  Sparse/compressed files handled specially.

- free-space checks updated (ntfs_check_free_space) to account for reserved
  delalloc clusters and MFT record budgeting.

- delayed allocations are committed on last writer (file release) and on
  explicit allocation flush paths.

Tested-by: syzbot@syzkaller.appspotmail.com
Reported-by: syzbot+2bd8e813c7f767aa9bb1@syzkaller.appspotmail.com
Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
2026-02-16 17:23:51 +01:00
Lorenzo Stoakes
5bd2c0650a mm: update all remaining mmap_prepare users to use vma_flags_t
We will be shortly removing the vm_flags_t field from vm_area_desc so we
need to update all mmap_prepare users to only use the dessc->vma_flags
field.

This patch achieves that and makes all ancillary changes required to make
this possible.

This lays the groundwork for future work to eliminate the use of
vm_flags_t in vm_area_desc altogether and more broadly throughout the
kernel.

While we're here, we take the opportunity to replace VM_REMAP_FLAGS with
VMA_REMAP_FLAGS, the vma_flags_t equivalent.

No functional changes intended.

Link: https://lkml.kernel.org/r/fb1f55323799f09fe6a36865b31550c9ec67c225.1769097829.git.lorenzo.stoakes@oracle.com
Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Acked-by: Damien Le Moal <dlemoal@kernel.org>	[zonefs]
Acked-by: "Darrick J. Wong" <djwong@kernel.org>
Acked-by: Pedro Falcato <pfalcato@suse.de>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: Barry Song <baohua@kernel.org>
Cc: David Hildenbrand <david@kernel.org>
Cc: Dev Jain <dev.jain@arm.com>
Cc: Jason Gunthorpe <jgg@nvidia.com>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Zi Yan <ziy@nvidia.com>
Cc: Jarkko Sakkinen <jarkko@kernel.org>
Cc: Yury Norov <ynorov@nvidia.com>
Cc: Chris Mason <clm@fb.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-02-12 15:42:58 -08:00
Linus Torvalds
136114e0ab mm.git review status for linus..mm-nonmm-stable
Total patches:       107
 Reviews/patch:       1.07
 Reviewed rate:       67%
 
 - The 2 patch series "ocfs2: give ocfs2 the ability to reclaim
   suballocator free bg" from Heming Zhao saves disk space by teaching
   ocfs2 to reclaim suballocator block group space.
 
 - The 4 patch series "Add ARRAY_END(), and use it to fix off-by-one
   bugs" from Alejandro Colomar adds the ARRAY_END() macro and uses it in
   various places.
 
 - The 2 patch series "vmcoreinfo: support VMCOREINFO_BYTES larger than
   PAGE_SIZE" from Pnina Feder makes the vmcore code future-safe, if
   VMCOREINFO_BYTES ever exceeds the page size.
 
 - The 7 patch series "kallsyms: Prevent invalid access when showing
   module buildid" from Petr Mladek cleans up kallsyms code related to
   module buildid and fixes an invalid access crash when printing
   backtraces.
 
 - The 3 patch series "Address page fault in
   ima_restore_measurement_list()" from Harshit Mogalapalli fixes a
   kexec-related crash that can occur when booting the second-stage kernel
   on x86.
 
 - The 6 patch series "kho: ABI headers and Documentation updates" from
   Mike Rapoport updates the kexec handover ABI documentation.
 
 - The 4 patch series "Align atomic storage" from Finn Thain adds the
   __aligned attribute to atomic_t and atomic64_t definitions to get
   natural alignment of both types on csky, m68k, microblaze, nios2,
   openrisc and sh.
 
 - The 2 patch series "kho: clean up page initialization logic" from
   Pratyush Yadav simplifies the page initialization logic in
   kho_restore_page().
 
 - The 6 patch series "Unload linux/kernel.h" from Yury Norov moves
   several things out of kernel.h and into more appropriate places.
 
 - The 7 patch series "don't abuse task_struct.group_leader" from Oleg
   Nesterov removes the usage of ->group_leader when it is "obviously
   unnecessary".
 
 - The 5 patch series "list private v2 & luo flb" from Pasha Tatashin
   adds some infrastructure improvements to the live update orchestrator.
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCaY4giAAKCRDdBJ7gKXxA
 jgusAQDnKkP8UWTqXPC1jI+OrDJGU5ciAx8lzLeBVqMKzoYk9AD/TlhT2Nlx+Ef6
 0HCUHUD0FMvAw/7/Dfc6ZKxwBEIxyww=
 =mmsH
 -----END PGP SIGNATURE-----

Merge tag 'mm-nonmm-stable-2026-02-12-10-48' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull non-MM updates from Andrew Morton:

 - "ocfs2: give ocfs2 the ability to reclaim suballocator free bg" saves
   disk space by teaching ocfs2 to reclaim suballocator block group
   space (Heming Zhao)

 - "Add ARRAY_END(), and use it to fix off-by-one bugs" adds the
   ARRAY_END() macro and uses it in various places (Alejandro Colomar)

 - "vmcoreinfo: support VMCOREINFO_BYTES larger than PAGE_SIZE" makes
   the vmcore code future-safe, if VMCOREINFO_BYTES ever exceeds the
   page size (Pnina Feder)

 - "kallsyms: Prevent invalid access when showing module buildid" cleans
   up kallsyms code related to module buildid and fixes an invalid
   access crash when printing backtraces (Petr Mladek)

 - "Address page fault in ima_restore_measurement_list()" fixes a
   kexec-related crash that can occur when booting the second-stage
   kernel on x86 (Harshit Mogalapalli)

 - "kho: ABI headers and Documentation updates" updates the kexec
   handover ABI documentation (Mike Rapoport)

 - "Align atomic storage" adds the __aligned attribute to atomic_t and
   atomic64_t definitions to get natural alignment of both types on
   csky, m68k, microblaze, nios2, openrisc and sh (Finn Thain)

 - "kho: clean up page initialization logic" simplifies the page
   initialization logic in kho_restore_page() (Pratyush Yadav)

 - "Unload linux/kernel.h" moves several things out of kernel.h and into
   more appropriate places (Yury Norov)

 - "don't abuse task_struct.group_leader" removes the usage of
   ->group_leader when it is "obviously unnecessary" (Oleg Nesterov)

 - "list private v2 & luo flb" adds some infrastructure improvements to
   the live update orchestrator (Pasha Tatashin)

* tag 'mm-nonmm-stable-2026-02-12-10-48' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (107 commits)
  watchdog/hardlockup: simplify perf event probe and remove per-cpu dependency
  procfs: fix missing RCU protection when reading real_parent in do_task_stat()
  watchdog/softlockup: fix sample ring index wrap in need_counting_irqs()
  kcsan, compiler_types: avoid duplicate type issues in BPF Type Format
  kho: fix doc for kho_restore_pages()
  tests/liveupdate: add in-kernel liveupdate test
  liveupdate: luo_flb: introduce File-Lifecycle-Bound global state
  liveupdate: luo_file: Use private list
  list: add kunit test for private list primitives
  list: add primitives for private list manipulations
  delayacct: fix uapi timespec64 definition
  panic: add panic_force_cpu= parameter to redirect panic to a specific CPU
  netclassid: use thread_group_leader(p) in update_classid_task()
  RDMA/umem: don't abuse current->group_leader
  drm/pan*: don't abuse current->group_leader
  drm/amd: kill the outdated "Only the pthreads threading model is supported" checks
  drm/amdgpu: don't abuse current->group_leader
  android/binder: use same_thread_group(proc->tsk, current) in binder_mmap()
  android/binder: don't abuse current->group_leader
  kho: skip memoryless NUMA nodes when reserving scratch areas
  ...
2026-02-12 12:13:01 -08:00
Linus Torvalds
26c9342bb7 struct filename series
[mostly] sanitize struct filename hanling
 
 Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQQqUNBr3gm4hGXdBJlZ7Krx/gZQ6wUCaYlcJgAKCRBZ7Krx/gZQ
 6xlKAP9c9J13sJ/mcobsj1Ov7nSHISNbnYqvRRCu09Wq3UQvJgEApNQYOEdLtpff
 zUnWOAQ0nOKY7w9VMLkRRustXpuGjAc=
 =Fld4
 -----END PGP SIGNATURE-----

Merge tag 'pull-filename' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs

Pull vfs 'struct filename' updates from Al Viro:
 "[Mostly] sanitize struct filename handling"

* tag 'pull-filename' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (68 commits)
  sysfs(2): fs_index() argument is _not_ a pathname
  alpha: switch osf_mount() to strndup_user()
  ksmbd: use CLASS(filename_kernel)
  mqueue: switch to CLASS(filename)
  user_statfs(): switch to CLASS(filename)
  statx: switch to CLASS(filename_maybe_null)
  quotactl_block(): switch to CLASS(filename)
  chroot(2): switch to CLASS(filename)
  move_mount(2): switch to CLASS(filename_maybe_null)
  namei.c: switch user pathname imports to CLASS(filename{,_flags})
  namei.c: convert getname_kernel() callers to CLASS(filename_kernel)
  do_f{chmod,chown,access}at(): use CLASS(filename_uflags)
  do_readlinkat(): switch to CLASS(filename_flags)
  do_sys_truncate(): switch to CLASS(filename)
  do_utimes_path(): switch to CLASS(filename_uflags)
  chdir(2): unspaghettify a bit...
  do_fchownat(): unspaghettify a bit...
  fspick(2): use CLASS(filename_flags)
  name_to_handle_at(): use CLASS(filename_uflags)
  vfs_open_tree(): use CLASS(filename_uflags)
  ...
2026-02-09 16:58:28 -08:00
Linus Torvalds
9e355113f0 vfs-7.0-rc1.misc
Please consider pulling these changes from the signed vfs-7.0-rc1.misc tag.
 
 Thanks!
 Christian
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCaYX49QAKCRCRxhvAZXjc
 ojrZAQD1VJzY46r5FnAVf4jlEHyjIbDnZCP/n+c4x6XnqpU6EQEAgB0yAtAGP6+u
 SBuytElqHoTT5VtmEXTAabCNQ9Ks8wo=
 =JwZz
 -----END PGP SIGNATURE-----

Merge tag 'vfs-7.0-rc1.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs

Pull misc vfs updates from Christian Brauner:
 "This contains a mix of VFS cleanups, performance improvements, API
  fixes, documentation, and a deprecation notice.

  Scalability and performance:

   - Rework pid allocation to only take pidmap_lock once instead of
     twice during alloc_pid(), improving thread creation/teardown
     throughput by 10-16% depending on false-sharing luck. Pad the
     namespace refcount to reduce false-sharing

   - Track file lock presence via a flag in ->i_opflags instead of
     reading ->i_flctx, avoiding false-sharing with ->i_readcount on
     open/close hot paths. Measured 4-16% improvement on 24-core
     open-in-a-loop benchmarks

   - Use a consume fence in locks_inode_context() to match the
     store-release/load-consume idiom, eliminating a hardware fence on
     some architectures

   - Annotate cdev_lock with __cacheline_aligned_in_smp to prevent
     false-sharing

   - Remove a redundant DCACHE_MANAGED_DENTRY check in
     __follow_mount_rcu() that never fires since the caller already
     verifies it, eliminating a 100% mispredicted branch

   - Fix a 100% mispredicted likely() in devcgroup_inode_permission()
     that became wrong after a prior code reorder

  Bug fixes and correctness:

   - Make insert_inode_locked() wait for inode destruction instead of
     skipping, fixing a corner case where two matching inodes could
     exist in the hash

   - Move f_mode initialization before file_ref_init() in alloc_file()
     to respect the SLAB_TYPESAFE_BY_RCU ordering contract

   - Add a WARN_ON_ONCE guard in try_to_free_buffers() for folios with
     no buffers attached, preventing a null pointer dereference when
     AS_RELEASE_ALWAYS is set but no release_folio op exists

   - Fix select restart_block to store end_time as timespec64, avoiding
     truncation of tv_sec on 32-bit architectures

   - Make dump_inode() use get_kernel_nofault() to safely access inode
     and superblock fields, matching the dump_mapping() pattern

  API modernization:

   - Make posix_acl_to_xattr() allocate the buffer internally since
     every single caller was doing it anyway. Reduces boilerplate and
     unnecessary error checking across ~15 filesystems

   - Replace deprecated simple_strtoul() with kstrtoul() for the
     ihash_entries, dhash_entries, mhash_entries, and mphash_entries
     boot parameters, adding proper error handling

   - Convert chardev code to use guard(mutex) and __free(kfree) cleanup
     patterns

   - Replace min_t() with min() or umin() in VFS code to avoid silently
     truncating unsigned long to unsigned int

   - Gate LOOKUP_RCU assertions behind CONFIG_DEBUG_VFS since callers
     already check the flag

  Deprecation:

   - Begin deprecating legacy BSD process accounting (acct(2)). The
     interface has numerous footguns and better alternatives exist
     (eBPF)

  Documentation:

   - Fix and complete kernel-doc for struct export_operations, removing
     duplicated documentation between ReST and source

   - Fix kernel-doc warnings for __start_dirop() and ilookup5_nowait()

  Testing:

   - Add a kunit test for initramfs cpio handling of entries with
     filesize > PATH_MAX

  Misc:

   - Add missing <linux/init_task.h> include in fs_struct.c"

* tag 'vfs-7.0-rc1.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: (28 commits)
  posix_acl: make posix_acl_to_xattr() alloc the buffer
  fs: make insert_inode_locked() wait for inode destruction
  initramfs_test: kunit test for cpio.filesize > PATH_MAX
  fs: improve dump_inode() to safely access inode fields
  fs: add <linux/init_task.h> for 'init_fs'
  docs: exportfs: Use source code struct documentation
  fs: move initializing f_mode before file_ref_init()
  exportfs: Complete kernel-doc for struct export_operations
  exportfs: Mark struct export_operations functions at kernel-doc
  exportfs: Fix kernel-doc output for get_name()
  acct(2): begin the deprecation of legacy BSD process accounting
  device_cgroup: remove branch hint after code refactor
  VFS: fix __start_dirop() kernel-doc warnings
  fs: Describe @isnew parameter in ilookup5_nowait()
  fs/namei: Remove redundant DCACHE_MANAGED_DENTRY check in __follow_mount_rcu
  fs: only assert on LOOKUP_RCU when built with CONFIG_DEBUG_VFS
  select: store end_time as timespec64 in restart block
  chardev: Switch to guard(mutex) and __free(kfree)
  namespace: Replace simple_strtoul with kstrtoul to parse boot params
  dcache: Replace simple_strtoul with kstrtoul in set_dhash_entries
  ...
2026-02-09 15:13:05 -08:00
Konstantin Komarov
c5226b96c0
fs/ntfs3: avoid calling run_get_entry() when run == NULL in ntfs_read_run_nb_ra()
When ntfs_read_run_nb_ra() is invoked with run == NULL the code later
assumes run is valid and may call run_get_entry(NULL, ...), and also
uses clen/idx without initializing them. Smatch reported uninitialized
variable warnings and this can lead to undefined behaviour. This patch
fixes it.

Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
Closes: https://lore.kernel.org/r/202512230646.v5hrYXL0-lkp@intel.com/
Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
2026-02-09 16:14:33 +01:00
Konstantin Komarov
c1f221c1be
fs/ntfs3: add fall-through between switch labels
Add fall-through to fix the warning in ntfs_fs_parse_param().

Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202602041402.uojBz5QY-lkp@intel.com/
Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
2026-02-09 10:59:28 +01:00
Konstantin Komarov
3c6248937f
fs/ntfs3: allow explicit boolean acl/prealloc mount options
This patch improves mount option parsing by allowing explicit boolean
values for acl and prealloc. Previously those options were exposed only
as presence/absence flags.

Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
2026-02-04 01:41:00 +01:00
Jiasheng Jiang
b2bc7c44ed
fs/ntfs3: Fix slab-out-of-bounds read in DeleteIndexEntryRoot
In the 'DeleteIndexEntryRoot' case of the 'do_action' function, the
entry size ('esize') is retrieved from the log record without adequate
bounds checking.

Specifically, the code calculates the end of the entry ('e2') using:
    e2 = Add2Ptr(e1, esize);

It then calculates the size for memmove using 'PtrOffset(e2, ...)',
which subtracts the end pointer from the buffer limit. If 'esize' is
maliciously large, 'e2' exceeds the used buffer size. This results in
a negative offset which, when cast to size_t for memmove, interprets
as a massive unsigned integer, leading to a heap buffer overflow.

This commit adds a check to ensure that the entry size ('esize') strictly
fits within the remaining used space of the index header before performing
memory operations.

Fixes: b46acd6a6a ("fs/ntfs3: Add NTFS journal")
Signed-off-by: Jiasheng Jiang <jiashengjiangcool@gmail.com>
Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
2026-01-27 19:57:57 +01:00
Randy Dunlap
24c776355f kernel.h: drop hex.h and update all hex.h users
Remove <linux/hex.h> from <linux/kernel.h> and update all users/callers of
hex.h interfaces to directly #include <linux/hex.h> as part of the process
of putting kernel.h on a diet.

Removing hex.h from kernel.h means that 36K C source files don't have to
pay the price of parsing hex.h for the roughly 120 C source files that
need it.

This change has been build-tested with allmodconfig on most ARCHes.  Also,
all users/callers of <linux/hex.h> in the entire source tree have been
updated if needed (if not already #included).

Link: https://lkml.kernel.org/r/20251215005206.2362276-1-rdunlap@infradead.org
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@intel.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Yury Norov (NVIDIA) <yury.norov@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-01-20 19:44:19 -08:00
Nathan Chancellor
ca1ceddfaa
ntfs3: Restore NULL folio initialization in ntfs_writepages()
Clang warns (or errors with CONFIG_WERROR=y):

  fs/ntfs3/inode.c:1021:6: error: variable 'folio' is used uninitialized whenever 'if' condition is true [-Werror,-Wsometimes-uninitialized]
   1021 |         if (is_resident(ni)) {
        |             ^~~~~~~~~~~~~~~
  fs/ntfs3/inode.c:1024:48: note: uninitialized use occurs here
   1024 |                 while ((folio = writeback_iter(mapping, wbc, folio, &err)))
        |                                                              ^~~~~

folio should be initialized to NULL for the first iteration of
writeback_iter() to start the loop properly. Restore the NULL
initialization of folio that was lost in the recent iomap conversion to
clear up the warning.

Fixes: 099ef9a ("fs/ntfs3: implement iomap-based file operations")
Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
Closes: https://lore.kernel.org/oe-kbuild-all/202601010644.FIhOXy6Y-lkp@intel.com/
Closes: https://lore.kernel.org/r/202601010513.axd56bks-lkp@intel.com/
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
[almaz.alexandrovich@paragon-software.com: added a few more tags]
Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
2026-01-16 14:18:06 +01:00
Miklos Szeredi
6cbfdf8947
posix_acl: make posix_acl_to_xattr() alloc the buffer
Without exception all caller do that.  So move the allocation into the
helper.

This reduces boilerplate and removes unnecessary error checking.

Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Link: https://patch.msgid.link/20260115122341.556026-1-mszeredi@redhat.com
Signed-off-by: Christian Brauner <brauner@kernel.org>
2026-01-16 10:51:12 +01:00
Baolin Liu
6b3c83df9a
ntfs3: Refactor duplicate kmemdup pattern in do_action()
Extract the repeated pattern of duplicating attribute and updating
OpenAttr into a helper function to reduce code duplication and improve
maintainability.

Signed-off-by: Baolin Liu <liubaolin@kylinos.cn>
Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
2026-01-15 05:58:02 +01:00
Edward Adam Davis
27b75ca4e5
fs/ntfs3: prevent infinite loops caused by the next valid being the same
When processing valid within the range [valid : pos), if valid cannot
be retrieved correctly, for example, if the retrieved valid value is
always the same, this can trigger a potential infinite loop, similar
to the hung problem reported by syzbot [1].

Adding a check for the valid value within the loop body, and terminating
the loop and returning -EINVAL if the value is the same as the current
value, can prevent this.

[1]
INFO: task syz.4.21:6056 blocked for more than 143 seconds.
Call Trace:
 rwbase_write_lock+0x14f/0x750 kernel/locking/rwbase_rt.c:244
 inode_lock include/linux/fs.h:1027 [inline]
 ntfs_file_write_iter+0xe6/0x870 fs/ntfs3/file.c:1284

Fixes: 4342306f0f ("fs/ntfs3: Add file operations and implementation")
Reported-by: syzbot+bcf9e1868c1a0c7e04f1@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=bcf9e1868c1a0c7e04f1
Signed-off-by: Edward Adam Davis <eadavis@qq.com>
Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
2026-01-15 05:58:01 +01:00
sunliming
1dad2fff02
fs/ntfs3: make ntfs_writeback_ops static
Fix below sparse warnings:
fs/ntfs3/inode.c:972:34: sparse: sparse: symbol 'ntfs_writeback_ops' was not declared.
Should it be static?

Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202601061424.nbKLNwC5-lkp@intel.com/
Signed-off-by: sunliming <sunliming@kylinos.cn>
Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
2026-01-15 05:58:00 +01:00
Szymon Wilczek
08ce2fee1b
ntfs3: fix circular locking dependency in run_unpack_ex
Syzbot reported a circular locking dependency between wnd->rw_lock
(sbi->used.bitmap) and ni->file.run_lock.

The deadlock scenario:
1. ntfs_extend_mft() takes ni->file.run_lock then wnd->rw_lock.
2. run_unpack_ex() takes wnd->rw_lock then tries to acquire
   ni->file.run_lock inside ntfs_refresh_zone().

This creates an AB-BA deadlock.

Fix this by using down_read_trylock() instead of down_read() when
acquiring run_lock in run_unpack_ex(). If the lock is contended,
skip ntfs_refresh_zone() - the MFT zone will be refreshed on the
next MFT operation. This breaks the circular dependency since we
never block waiting for run_lock while holding wnd->rw_lock.

Reported-by: syzbot+d27edf9f96ae85939222@syzkaller.appspotmail.com
Tested-by: syzbot+d27edf9f96ae85939222@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=d27edf9f96ae85939222
Signed-off-by: Szymon Wilczek <swilczek.lx@gmail.com>
Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
2026-01-15 05:57:48 +01:00
Al Viro
ca2a04e84a ntfs: ->d_compare() must not block
... so don't use __getname() there.  Switch it (and ntfs_d_hash(), while
we are at it) to kmalloc(PATH_MAX, GFP_NOWAIT).  Yes, ntfs_d_hash()
almost certainly can do with smaller allocations, but let ntfs folks
deal with that - keep the allocation size as-is for now.

Stop abusing names_cachep in ntfs, period - various uses of that thing
in there have nothing to do with pathnames; just use k[mz]alloc() and
be done with that.  For now let's keep sizes as-in, but AFAICS none of
the users actually want PATH_MAX.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2026-01-13 15:16:44 -05:00
Jeff Layton
6aaa1d6337
ntfs3: add setlease file operation
Add the setlease file_operation to ntfs_file_operations,
ntfs_legacy_file_operations, ntfs_dir_operations, and
ntfs_legacy_dir_operations, pointing to generic_setlease.  A future
patch will change the default behavior to reject lease attempts with
-EINVAL when there is no setlease file operation defined. Add
generic_setlease to retain the ability to set leases on this
filesystem.

Signed-off-by: Jeff Layton <jlayton@kernel.org>
Link: https://patch.msgid.link/20260108-setlease-6-20-v1-14-ea4dec9b67fa@kernel.org
Acked-by: Al Viro <viro@zeniv.linux.org.uk>
Acked-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Christian Brauner <brauner@kernel.org>
2026-01-12 10:55:47 +01:00
Konstantin Komarov
099ef9ab92
fs/ntfs3: implement iomap-based file operations
This patch modifies the ntfs3 driver by replacing the buffer_head-based
operations with the iomap ones.

Implementation details:
- Implements core iomap operations (ntfs_iomap_begin/end) for block mapping:
    Proper handling of resident attributes via IOMAP_INLINE.
    Support for sparse files through IOMAP_HOLE semantics.
    Correct unwritten extent handling for zeroing operations.
- Replaces custom implementations with standardized iomap helpers:
    Converts buffered reads to use iomap_read_folio and iomap_readahead.
    Implements iomap_file_buffered_write for write operations.
    Uses iomap_dio_rw for direct I/O paths.
    Migrates zero range operations to iomap_zero_range.
- Preserves special handling paths for compressed files
- Implements proper EOF/valid data size management during writes

Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
2025-12-29 13:33:32 +00:00
Szymon Wilczek
e37a75bb86
fs/ntfs3: fix deadlock in ni_read_folio_cmpr
Syzbot reported a task hung in ni_readpage_cmpr (now ni_read_folio_cmpr).
This is caused by a lock inversion deadlock involving the inode mutex
(ni_lock) and page locks.

Scenario:
1. Task A enters ntfs_read_folio() for page X. It acquires ni_lock.
2. Task A calls ni_read_folio_cmpr(), which attempts to lock all pages in
   the compressed frame (including page Y).
3. Concurrently, Task B (e.g., via readahead) has locked page Y and
   calls ntfs_read_folio().
4. Task B waits for ni_lock (held by A).
5. Task A waits for page Y lock (held by B).
   -> DEADLOCK.

The fix is to restructure locking: do not take ni_lock in ntfs_read_folio().
Instead, acquire ni_lock inside ni_read_folio_cmpr() ONLY AFTER all required
page locks for the frame have been successfully acquired. This restores the
correct lock ordering (Page Lock -> ni_lock) consistent with VFS.

Reported-by: syzbot+5af33dd272b913b65880@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=5af33dd272b913b65880
Fixes: f35590ee26 ("fs/ntfs3: remove ntfs_bio_pages and use page cache for compressed I/O")
Signed-off-by: Szymon Wilczek <swilczek.lx@gmail.com>
[almaz.alexandrovich@paragon-software.com: ni_readpage_cmpr was renamed to ni_read_folio_cmpr]
Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
2025-12-29 13:33:31 +00:00
Konstantin Komarov
c613269677
fs/ntfs3: implement llseek SEEK_DATA/SEEK_HOLE by scanning data runs
The generic llseek implementation does not understand ntfs data runs,
sparse regions, or compression semantics, and therefore cannot correctly
locate data or holes in files.

Add a filesystem-specific llseek handler that scans attribute data runs
to find the next data or hole starting at the given offset. Handle
resident attributes, sparse runs, compressed holes, and the implicit hole
at end-of-file.

Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
2025-12-29 13:33:31 +00:00
Konstantin Komarov
356fa24816
fs/ntfs3: zero-fill folios beyond i_valid in ntfs_read_folio()
Handle ntfs_read_folio() early when the folio offset is beyond i_valid
by zero-filling the folio and marking it uptodate. This avoids needless
I/O and locking, improves read performance.

Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
2025-12-29 13:33:30 +00:00
Konstantin Komarov
576248a34b
fs/ntfs3: handle attr_set_size() errors when truncating files
If attr_set_size() fails while truncating down, the error is silently
ignored and the inode may be left in an inconsistent state.

Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
2025-12-29 13:33:30 +00:00
Konstantin Komarov
3a6aba7f3c
fs/ntfs3: drop preallocated clusters for sparse and compressed files
Do not keep preallocated clusters for sparsed and compressed files.
Preserving preallocation in these cases causes fsx failures when running
with sparse files and preallocation enabled.

Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
2025-12-29 13:33:29 +00:00
Konstantin Komarov
dcd9d6a471
fs/ntfs3: fsync files by syncing parent inodes
Some xfstests expect fsync() on a file or directory to also persist
directory metadata up the parent chain. Using generic_file_fsync() is not
sufficient for ntfs, because parent directories are not explicitly
written out.

Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
2025-12-29 13:33:19 +00:00
Baokun Li
f7edab0cee
fs/ntfs3: fix ntfs_mount_options leak in ntfs_fill_super()
In ntfs_fill_super(), the fc->fs_private pointer is set to NULL without
first freeing the memory it points to. This causes the subsequent call to
ntfs_fs_free() to skip freeing the ntfs_mount_options structure.

This results in a kmemleak report:

  unreferenced object 0xff1100015378b800 (size 32):
    comm "mount", pid 582, jiffies 4294890685
    hex dump (first 32 bytes):
      00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
      00 00 00 00 00 00 00 00 ed ff ed ff 00 04 00 00  ................
    backtrace (crc ed541d8c):
      __kmalloc_cache_noprof+0x424/0x5a0
      __ntfs_init_fs_context+0x47/0x590
      alloc_fs_context+0x5d8/0x960
      __x64_sys_fsopen+0xb1/0x190
      do_syscall_64+0x50/0x1f0
      entry_SYSCALL_64_after_hwframe+0x76/0x7e

This issue can be reproduced using the following commands:
        fallocate -l 100M test.file
        mount test.file /tmp/test

Since sbi->options is duplicated from fc->fs_private and does not
directly use the memory allocated for fs_private, it is unnecessary to
set fc->fs_private to NULL.

Additionally, this patch simplifies the code by utilizing the helper
function put_mount_options() instead of open-coding the cleanup logic.

Reported-by: syzbot+23aee7afc440fe803545@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=23aee7afc440fe803545
Fixes: aee4d5a521 ("ntfs3: fix double free of sbi->options->nls and clarify ownership of fc->fs_private")
Signed-off-by: Baokun Li <libaokun1@huawei.com>
Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
2025-12-19 19:04:02 +01:00
Konstantin Komarov
dffc7f2f17
fs/ntfs3: allow readdir() to finish after directory mutations without rewinddir()
This patch introduces a per-directory version counter that increments on
each directory modification (indx_insert_entry() / indx_delete_entry()).
ntfs_readdir() uses this version to detect whether the directory has
changed since enumeration began. If readdir() reaches end-of-directory
but the version has changed, the walk restarts from the beginning of the
index tree instead of returning prematurely. This provides rmdir-like
behavior for tools that remove entries as they enumerate them.

Prior to this change, bonnie++ directory operations could fail due to
premature termination of readdir() during concurrent index updates.
With this patch applied, bonnie++ completes successfully with no errors.

Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
2025-12-19 19:04:01 +01:00
Konstantin Komarov
989e29450e
fs/ntfs3: improve readahead for bitmap initialization and large directory scans
Previously sequential reads operations relied solely on single-page reads,
causing the block layer to perform many synchronous I/O requests,
especially for large volumes or large directories. This patch introduces
explicit readahead via page_cache_sync_readahead() and file_ra_state to
reduce I/O latency and improve sequential throughput.

Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
2025-12-19 19:04:01 +01:00
Konstantin Komarov
4248f563f0
fs/ntfs3: rename ni_readpage_cmpr into ni_read_folio_cmpr
The old "readpage" naming is still used in ni_readpage_cmpr(), even though
the vfs has transitioned to the folio-based read_folio() API.

This patch performs a straightforward renaming of the helper:
ni_readpage_cmpr() -> ni_read_folio_cmpr().

Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
2025-12-19 19:04:00 +01:00
Jaehun Gou
06909b2549
fs: ntfs3: fix infinite loop triggered by zero-sized ATTR_LIST
We found an infinite loop bug in the ntfs3 file system that can lead to a
Denial-of-Service (DoS) condition.

A malformed NTFS image can cause an infinite loop when an ATTR_LIST attribute
indicates a zero data size while the driver allocates memory for it.

When ntfs_load_attr_list() processes a resident ATTR_LIST with data_size set
to zero, it still allocates memory because of al_aligned(0). This creates an
inconsistent state where ni->attr_list.size is zero, but ni->attr_list.le is
non-null. This causes ni_enum_attr_ex to incorrectly assume that no attribute
list exists and enumerates only the primary MFT record. When it finds
ATTR_LIST, the code reloads it and restarts the enumeration, repeating
indefinitely. The mount operation never completes, hanging the kernel thread.

This patch adds validation to ensure that data_size is non-zero before memory
allocation. When a zero-sized ATTR_LIST is detected, the function returns
-EINVAL, preventing a DoS vulnerability.

Co-developed-by: Seunghun Han <kkamagui@gmail.com>
Signed-off-by: Seunghun Han <kkamagui@gmail.com>
Co-developed-by: Jihoon Kwon <kjh010315@gmail.com>
Signed-off-by: Jihoon Kwon <kjh010315@gmail.com>
Signed-off-by: Jaehun Gou <p22gone@gmail.com>
Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
2025-12-19 19:03:59 +01:00
Jaehun Gou
4b90f16e4b
fs: ntfs3: fix infinite loop in attr_load_runs_range on inconsistent metadata
We found an infinite loop bug in the ntfs3 file system that can lead to a
Denial-of-Service (DoS) condition.

A malformed NTFS image can cause an infinite loop when an attribute header
indicates an empty run list, while directory entries reference it as
containing actual data. In NTFS, setting evcn=-1 with svcn=0 is a valid way
to represent an empty run list, and run_unpack() correctly handles this by
checking if evcn + 1 equals svcn and returning early without parsing any run
data. However, this creates a problem when there is metadata inconsistency,
where the attribute header claims to be empty (evcn=-1) but the caller
expects to read actual data. When run_unpack() immediately returns success
upon seeing this condition, it leaves the runs_tree uninitialized with
run->runs as a NULL. The calling function attr_load_runs_range() assumes
that a successful return means that the runs were loaded and sets clen to 0,
expecting the next run_lookup_entry() call to succeed. Because runs_tree
remains uninitialized, run_lookup_entry() continues to fail, and the loop
increments vcn by zero (vcn += 0), leading to an infinite loop.

This patch adds a retry counter to detect when run_lookup_entry() fails
consecutively after attr_load_runs_vcn(). If the run is still not found on
the second attempt, it indicates corrupted metadata and returns -EINVAL,
preventing the Denial-of-Service (DoS) vulnerability.

Co-developed-by: Seunghun Han <kkamagui@gmail.com>
Signed-off-by: Seunghun Han <kkamagui@gmail.com>
Co-developed-by: Jihoon Kwon <kjh010315@gmail.com>
Signed-off-by: Jihoon Kwon <kjh010315@gmail.com>
Signed-off-by: Jaehun Gou <p22gone@gmail.com>
Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
2025-12-19 19:03:58 +01:00
Lalit Shankar Chowdhury
fac760f524
fs/ntfs3: Use wait_on_buffer() directly
wait_on_buffer() checks buffer_locked() internally
before calling __wait_on_buffer().

Signed-off-by: Lalit Shankar Chowdhury <lalitshankarch@gmail.com>
Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
2025-12-19 19:03:58 +01:00
Bartlomiej Kubik
f223ebffa1
fs/ntfs3: Initialize new folios before use
KMSAN reports an uninitialized value in longest_match_std(), invoked
from ntfs_compress_write(). When new folios are allocated without being
marked uptodate and ni_read_frame() is skipped because the caller expects
the frame to be completely overwritten, some reserved folios may remain
only partially filled, leaving the rest memory uninitialized.

Fixes: 584f60ba22 ("ntfs3: Convert ntfs_get_frame_pages() to use a folio")
Tested-by: syzbot+08d8956768c96a2c52cf@syzkaller.appspotmail.com
Reported-by: syzbot+08d8956768c96a2c52cf@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=08d8956768c96a2c52cf

Signed-off-by: Bartlomiej Kubik <kubik.bartlomiej@gmail.com>
Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
2025-12-19 19:03:57 +01:00
Jaehun Gou
1732053c8a
fs: ntfs3: check return value of indx_find to avoid infinite loop
We found an infinite loop bug in the ntfs3 file system that can lead to a
Denial-of-Service (DoS) condition.

A malformed dentry in the ntfs3 filesystem can cause the kernel to hang
during the lookup operations. By setting the HAS_SUB_NODE flag in an
INDEX_ENTRY within a directory's INDEX_ALLOCATION block and manipulating the
VCN pointer, an attacker can cause the indx_find() function to repeatedly
read the same block, allocating 4 KB of memory each time. The kernel lacks
VCN loop detection and depth limits, causing memory exhaustion and an OOM
crash.

This patch adds a return value check for fnd_push() to prevent a memory
exhaustion vulnerability caused by infinite loops. When the index exceeds the
size of the fnd->nodes array, fnd_push() returns -EINVAL. The indx_find()
function checks this return value and stops processing, preventing further
memory allocation.

Co-developed-by: Seunghun Han <kkamagui@gmail.com>
Signed-off-by: Seunghun Han <kkamagui@gmail.com>
Co-developed-by: Jihoon Kwon <kjh010315@gmail.com>
Signed-off-by: Jihoon Kwon <kjh010315@gmail.com>
Signed-off-by: Jaehun Gou <p22gone@gmail.com>
Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
2025-12-19 19:03:41 +01:00
Linus Torvalds
7203ca412f Significant patch series in this merge are as follows:
- The 10 patch series "__vmalloc()/kvmalloc() and no-block support" from
   Uladzislau Rezki reworks the vmalloc() code to support non-blocking
   allocations (GFP_ATOIC, GFP_NOWAIT).
 
 - The 2 patch series "ksm: fix exec/fork inheritance" from xu xin fixes
   a rare case where the KSM MMF_VM_MERGE_ANY prctl state is not inherited
   across fork/exec.
 
 - The 4 patch series "mm/zswap: misc cleanup of code and documentations"
   from SeongJae Park does some light maintenance work on the zswap code.
 
 - The 5 patch series "mm/page_owner: add debugfs files 'show_handles'
   and 'show_stacks_handles'" from Mauricio Faria de Oliveira enhances the
   /sys/kernel/debug/page_owner debug feature.  It adds unique identifiers
   to differentiate the various stack traces so that userspace monitoring
   tools can better match stack traces over time.
 
 - The 2 patch series "mm/page_alloc: pcp->batch cleanups" from Joshua
   Hahn makes some minor alterations to the page allocator's per-cpu-pages
   feature.
 
 - The 2 patch series "Improve UFFDIO_MOVE scalability by removing
   anon_vma lock" from Lokesh Gidra addresses a scalability issue in
   userfaultfd's UFFDIO_MOVE operation.
 
 - The 2 patch series "kasan: cleanups for kasan_enabled() checks" from
   Sabyrzhan Tasbolatov performs some cleanup in the KASAN code.
 
 - The 2 patch series "drivers/base/node: fold node register and
   unregister functions" from Donet Tom cleans up the NUMA node handling
   code a little.
 
 - The 4 patch series "mm: some optimizations for prot numa" from Kefeng
   Wang provides some cleanups and small optimizations to the NUMA
   allocation hinting code.
 
 - The 5 patch series "mm/page_alloc: Batch callers of
   free_pcppages_bulk" from Joshua Hahn addresses long lock hold times at
   boot on large machines.  These were causing (harmless) softlockup
   warnings.
 
 - The 2 patch series "optimize the logic for handling dirty file folios
   during reclaim" from Baolin Wang removes some now-unnecessary work from
   page reclaim.
 
 - The 10 patch series "mm/damon: allow DAMOS auto-tuned for per-memcg
   per-node memory usage" from SeongJae Park enhances the DAMOS auto-tuning
   feature.
 
 - The 2 patch series "mm/damon: fixes for address alignment issues in
   DAMON_LRU_SORT and DAMON_RECLAIM" from Quanmin Yan fixes DAMON_LRU_SORT
   and DAMON_RECLAIM with certain userspace configuration.
 
 - The 15 patch series "expand mmap_prepare functionality, port more
   users" from Lorenzo Stoakes enhances the new(ish)
   file_operations.mmap_prepare() method and ports additional callsites
   from the old ->mmap() over to ->mmap_prepare().
 
 - The 8 patch series "Fix stale IOTLB entries for kernel address space"
   from Lu Baolu fixes a bug (and possible security issue on non-x86) in
   the IOMMU code.  In some situations the IOMMU could be left hanging onto
   a stale kernel pagetable entry.
 
 - The 4 patch series "mm/huge_memory: cleanup __split_unmapped_folio()"
   from Wei Yang cleans up and optimizes the folio splitting code.
 
 - The 5 patch series "mm, swap: misc cleanup and bugfix" from Kairui
   Song implements some cleanups and a minor fix in the swap discard code.
 
 - The 8 patch series "mm/damon: misc documentation fixups" from SeongJae
   Park does as advertised.
 
 - The 9 patch series "mm/damon: support pin-point targets removal" from
   SeongJae Park permits userspace to remove a specific monitoring target
   in the middle of the current targets list.
 
 - The 2 patch series "mm: MISC follow-up patches for linux/pgalloc.h"
   from Harry Yoo implements a couple of cleanups related to mm header file
   inclusion.
 
 - The 2 patch series "mm/swapfile.c: select swap devices of default
   priority round robin" from Baoquan He improves the selection of swap
   devices for NUMA machines.
 
 - The 3 patch series "mm: Convert memory block states (MEM_*) macros to
   enums" from Israel Batista changes the memory block labels from macros
   to enums so they will appear in kernel debug info.
 
 - The 3 patch series "ksm: perform a range-walk to jump over holes in
   break_ksm" from Pedro Demarchi Gomes addresses an inefficiency when KSM
   unmerges an address range.
 
 - The 22 patch series "mm/damon/tests: fix memory bugs in kunit tests"
   from SeongJae Park fixes leaks and unhandled malloc() failures in DAMON
   userspace unit tests.
 
 - The 2 patch series "some cleanups for pageout()" from Baolin Wang
   cleans up a couple of minor things in the page scanner's
   writeback-for-eviction code.
 
 - The 2 patch series "mm/hugetlb: refactor sysfs/sysctl interfaces" from
   Hui Zhu moves hugetlb's sysfs/sysctl handling code into a new file.
 
 - The 9 patch series "introduce VM_MAYBE_GUARD and make it sticky" from
   Lorenzo Stoakes makes the VMA guard regions available in /proc/pid/smaps
   and improves the mergeability of guarded VMAs.
 
 - The 2 patch series "mm: perform guard region install/remove under VMA
   lock" from Lorenzo Stoakes reduces mmap lock contention for callers
   performing VMA guard region operations.
 
 - The 2 patch series "vma_start_write_killable" from Matthew Wilcox
   starts work in permitting applications to be killed when they are
   waiting on a read_lock on the VMA lock.
 
 - The 11 patch series "mm/damon/tests: add more tests for online
   parameters commit" from SeongJae Park adds additional userspace testing
   of DAMON's "commit" feature.
 
 - The 9 patch series "mm/damon: misc cleanups" from SeongJae Park does
   that.
 
 - The 2 patch series "make VM_SOFTDIRTY a sticky VMA flag" from Lorenzo
   Stoakes addresses the possible loss of a VMA's VM_SOFTDIRTY flag when
   that VMA is merged with another.
 
 - The 16 patch series "mm: support device-private THP" from Balbir Singh
   introduces support for Transparent Huge Page (THP) migration in zone
   device-private memory.
 
 - The 3 patch series "Optimize folio split in memory failure" from Zi
   Yan optimizes folio split operations in the memory failure code.
 
 - The 2 patch series "mm/huge_memory: Define split_type and consolidate
   split support checks" from Wei Yang provides some more cleanups in the
   folio splitting code.
 
 - The 16 patch series "mm: remove is_swap_[pte, pmd]() + non-swap
   entries, introduce leaf entries" from Lorenzo Stoakes cleans up our
   handling of pagetable leaf entries by introducing the concept of
   'software leaf entries', of type softleaf_t.
 
 - The 4 patch series "reparent the THP split queue" from Muchun Song
   reparents the THP split queue to its parent memcg.  This is in
   preparation for addressing the long-standing "dying memcg" problem,
   wherein dead memcg's linger for too long, consuming memory resources.
 
 - The 3 patch series "unify PMD scan results and remove redundant
   cleanup" from Wei Yang does a little cleanup in the hugepage collapse
   code.
 
 - The 6 patch series "zram: introduce writeback bio batching" from
   Sergey Senozhatsky improves zram writeback efficiency by introducing
   batched bio writeback support.
 
 - The 4 patch series "memcg: cleanup the memcg stats interfaces" from
   Shakeel Butt cleans up our handling of the interrupt safety of some
   memcg stats.
 
 - The 4 patch series "make vmalloc gfp flags usage more apparent" from
   Vishal Moola cleans up vmalloc's handling of incoming GFP flags.
 
 - The 6 patch series "mm: Add soft-dirty and uffd-wp support for RISC-V"
   from Chunyan Zhang teches soft dirty and userfaultfd write protect
   tracking to use RISC-V's Svrsw60t59b extension.
 
 - The 5 patch series "mm: swap: small fixes and comment cleanups" from
   Youngjun Park fixes a small bug and cleans up some of the swap code.
 
 - The 4 patch series "initial work on making VMA flags a bitmap" from
   Lorenzo Stoakes starts work on converting the vma struct's flags to a
   bitmap, so we stop running out of them, especially on 32-bit.
 
 - The 2 patch series "mm/swapfile: fix and cleanup swap list iterations"
   from Youngjun Park addresses a possible bug in the swap discard code and
   cleans things up a little.
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCaTEb0wAKCRDdBJ7gKXxA
 jjfIAP94W4EkCCwNOupnChoG+YWw/JW21anXt5NN+i5svn1yugEAwzvv6A+cAFng
 o+ug/fyrfPZG7PLp2R8WFyGIP0YoBA4=
 =IUzS
 -----END PGP SIGNATURE-----

Merge tag 'mm-stable-2025-12-03-21-26' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull MM updates from Andrew Morton:

  "__vmalloc()/kvmalloc() and no-block support" (Uladzislau Rezki)
     Rework the vmalloc() code to support non-blocking allocations
     (GFP_ATOIC, GFP_NOWAIT)

  "ksm: fix exec/fork inheritance" (xu xin)
     Fix a rare case where the KSM MMF_VM_MERGE_ANY prctl state is not
     inherited across fork/exec

  "mm/zswap: misc cleanup of code and documentations" (SeongJae Park)
     Some light maintenance work on the zswap code

  "mm/page_owner: add debugfs files 'show_handles' and 'show_stacks_handles'" (Mauricio Faria de Oliveira)
     Enhance the /sys/kernel/debug/page_owner debug feature by adding
     unique identifiers to differentiate the various stack traces so
     that userspace monitoring tools can better match stack traces over
     time

  "mm/page_alloc: pcp->batch cleanups" (Joshua Hahn)
     Minor alterations to the page allocator's per-cpu-pages feature

  "Improve UFFDIO_MOVE scalability by removing anon_vma lock" (Lokesh Gidra)
     Address a scalability issue in userfaultfd's UFFDIO_MOVE operation

  "kasan: cleanups for kasan_enabled() checks" (Sabyrzhan Tasbolatov)

  "drivers/base/node: fold node register and unregister functions" (Donet Tom)
     Clean up the NUMA node handling code a little

  "mm: some optimizations for prot numa" (Kefeng Wang)
     Cleanups and small optimizations to the NUMA allocation hinting
     code

  "mm/page_alloc: Batch callers of free_pcppages_bulk" (Joshua Hahn)
     Address long lock hold times at boot on large machines. These were
     causing (harmless) softlockup warnings

  "optimize the logic for handling dirty file folios during reclaim" (Baolin Wang)
     Remove some now-unnecessary work from page reclaim

  "mm/damon: allow DAMOS auto-tuned for per-memcg per-node memory usage" (SeongJae Park)
     Enhance the DAMOS auto-tuning feature

  "mm/damon: fixes for address alignment issues in DAMON_LRU_SORT and DAMON_RECLAIM" (Quanmin Yan)
     Fix DAMON_LRU_SORT and DAMON_RECLAIM with certain userspace
     configuration

  "expand mmap_prepare functionality, port more users" (Lorenzo Stoakes)
     Enhance the new(ish) file_operations.mmap_prepare() method and port
     additional callsites from the old ->mmap() over to ->mmap_prepare()

  "Fix stale IOTLB entries for kernel address space" (Lu Baolu)
     Fix a bug (and possible security issue on non-x86) in the IOMMU
     code. In some situations the IOMMU could be left hanging onto a
     stale kernel pagetable entry

  "mm/huge_memory: cleanup __split_unmapped_folio()" (Wei Yang)
     Clean up and optimize the folio splitting code

  "mm, swap: misc cleanup and bugfix" (Kairui Song)
     Some cleanups and a minor fix in the swap discard code

  "mm/damon: misc documentation fixups" (SeongJae Park)

  "mm/damon: support pin-point targets removal" (SeongJae Park)
     Permit userspace to remove a specific monitoring target in the
     middle of the current targets list

  "mm: MISC follow-up patches for linux/pgalloc.h" (Harry Yoo)
     A couple of cleanups related to mm header file inclusion

  "mm/swapfile.c: select swap devices of default priority round robin" (Baoquan He)
     improve the selection of swap devices for NUMA machines

  "mm: Convert memory block states (MEM_*) macros to enums" (Israel Batista)
     Change the memory block labels from macros to enums so they will
     appear in kernel debug info

  "ksm: perform a range-walk to jump over holes in break_ksm" (Pedro Demarchi Gomes)
     Address an inefficiency when KSM unmerges an address range

  "mm/damon/tests: fix memory bugs in kunit tests" (SeongJae Park)
     Fix leaks and unhandled malloc() failures in DAMON userspace unit
     tests

  "some cleanups for pageout()" (Baolin Wang)
     Clean up a couple of minor things in the page scanner's
     writeback-for-eviction code

  "mm/hugetlb: refactor sysfs/sysctl interfaces" (Hui Zhu)
     Move hugetlb's sysfs/sysctl handling code into a new file

  "introduce VM_MAYBE_GUARD and make it sticky" (Lorenzo Stoakes)
     Make the VMA guard regions available in /proc/pid/smaps and
     improves the mergeability of guarded VMAs

  "mm: perform guard region install/remove under VMA lock" (Lorenzo Stoakes)
     Reduce mmap lock contention for callers performing VMA guard region
     operations

  "vma_start_write_killable" (Matthew Wilcox)
     Start work on permitting applications to be killed when they are
     waiting on a read_lock on the VMA lock

  "mm/damon/tests: add more tests for online parameters commit" (SeongJae Park)
     Add additional userspace testing of DAMON's "commit" feature

  "mm/damon: misc cleanups" (SeongJae Park)

  "make VM_SOFTDIRTY a sticky VMA flag" (Lorenzo Stoakes)
     Address the possible loss of a VMA's VM_SOFTDIRTY flag when that
     VMA is merged with another

  "mm: support device-private THP" (Balbir Singh)
     Introduce support for Transparent Huge Page (THP) migration in zone
     device-private memory

  "Optimize folio split in memory failure" (Zi Yan)

  "mm/huge_memory: Define split_type and consolidate split support checks" (Wei Yang)
     Some more cleanups in the folio splitting code

  "mm: remove is_swap_[pte, pmd]() + non-swap entries, introduce leaf entries" (Lorenzo Stoakes)
     Clean up our handling of pagetable leaf entries by introducing the
     concept of 'software leaf entries', of type softleaf_t

  "reparent the THP split queue" (Muchun Song)
     Reparent the THP split queue to its parent memcg. This is in
     preparation for addressing the long-standing "dying memcg" problem,
     wherein dead memcg's linger for too long, consuming memory
     resources

  "unify PMD scan results and remove redundant cleanup" (Wei Yang)
     A little cleanup in the hugepage collapse code

  "zram: introduce writeback bio batching" (Sergey Senozhatsky)
     Improve zram writeback efficiency by introducing batched bio
     writeback support

  "memcg: cleanup the memcg stats interfaces" (Shakeel Butt)
     Clean up our handling of the interrupt safety of some memcg stats

  "make vmalloc gfp flags usage more apparent" (Vishal Moola)
     Clean up vmalloc's handling of incoming GFP flags

  "mm: Add soft-dirty and uffd-wp support for RISC-V" (Chunyan Zhang)
     Teach soft dirty and userfaultfd write protect tracking to use
     RISC-V's Svrsw60t59b extension

  "mm: swap: small fixes and comment cleanups" (Youngjun Park)
     Fix a small bug and clean up some of the swap code

  "initial work on making VMA flags a bitmap" (Lorenzo Stoakes)
     Start work on converting the vma struct's flags to a bitmap, so we
     stop running out of them, especially on 32-bit

  "mm/swapfile: fix and cleanup swap list iterations" (Youngjun Park)
     Address a possible bug in the swap discard code and clean things
     up a little

[ This merge also reverts commit ebb9aeb980 ("vfio/nvgrace-gpu:
  register device memory for poison handling") because it looks
  broken to me, I've asked for clarification   - Linus ]

* tag 'mm-stable-2025-12-03-21-26' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (321 commits)
  mm: fix vma_start_write_killable() signal handling
  mm/swapfile: use plist_for_each_entry in __folio_throttle_swaprate
  mm/swapfile: fix list iteration when next node is removed during discard
  fs/proc/task_mmu.c: fix make_uffd_wp_huge_pte() huge pte handling
  mm/kfence: add reboot notifier to disable KFENCE on shutdown
  memcg: remove inc/dec_lruvec_kmem_state helpers
  selftests/mm/uffd: initialize char variable to Null
  mm: fix DEBUG_RODATA_TEST indentation in Kconfig
  mm: introduce VMA flags bitmap type
  tools/testing/vma: eliminate dependency on vma->__vm_flags
  mm: simplify and rename mm flags function for clarity
  mm: declare VMA flags by bit
  zram: fix a spelling mistake
  mm/page_alloc: optimize lowmem_reserve max lookup using its semantic monotonicity
  mm/vmscan: skip increasing kswapd_failures when reclaim was boosted
  pagemap: update BUDDY flag documentation
  mm: swap: remove scan_swap_map_slots() references from comments
  mm: swap: change swap_alloc_slow() to void
  mm, swap: remove redundant comment for read_swap_cache_async
  mm, swap: use SWP_SOLIDSTATE to determine if swap is rotational
  ...
2025-12-05 13:52:43 -08:00
Linus Torvalds
559e608c46 Changes for 6.19-rc1
Added:
     support timestamps prior to epoch;
     do not overwrite uptodate pages;
 	disable readahead for compressed files;
 	setting of dummy blocksize to read boot_block when mounting;
 	the run_lock initialization when loading $Extend;
 	initialization of allocated memory before use;
 	support for the NTFS3_IOC_SHUTDOWN ioctl;
 	check for minimum alignment when performing direct I/O reads;
 	check for shutdown in fsync.
 
 Fixed:
     mount failure for sparse runs in run_unpack();
 	use-after-free of sbi->options in cmp_fnames;
 	KMSAN uninit bug after failed mi_read in mi_format_new;
 	uninit error after buffer allocation by __getname();
 	KMSAN uninit-value in ni_create_attr_list;
 	double free of sbi->options->nls and ownership of fc->fs_private;
 	incorrect vcn adjustments in attr_collapse_range();
 	mode update when ACL can be reduced to mode;
 	memory leaks in add sub record.
 
 Changed:
     refactor code, updated terminology, spelling;
 	do not kmap pages in (de)compression code;
 	after ntfs_look_free_mft(), code that fails must put mi;
 	default mount options for "acl" and "prealloc".
 
 Replaced:
 	use unsafe_memcpy() to avoid memcpy size warning;
 	ntfs_bio_pages with page cache for compressed files.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEh0DEKNP0I9IjwfWEqbAzH4MkB7YFAmkwF6UACgkQqbAzH4Mk
 B7aG3w/9H3iUua41pQWH1g94PW7qhRxyN+2hVvnlJvNRB5bV/rhVHnEEkyxNHjau
 I+pknP2KceUrHYQO032b3kuuwab7sMCfVhVQCIzQ1dsp0V3HFwx5IWz8hGiAMrAN
 7pIQuUqqG5NxArVM028HWUXOq1myjRWxuwLTEaSYDOT0/Fl+a/nXKhlg+S5Js8K1
 FlkorvWdmXKUp3Dp96yzlcmcbqNFCeyfAlUu57KDx1wLjSvzEnCol8VHpSl9myc3
 FET6RcsL00jy3Tr3/6xBk6ef+9Eoas6hiafLaHay6jMkzW2hPzVwNgUsGOYQ0IwQ
 6jIWiTsAGfi6/GZRVQrEMFr4pvlIlG//uTezfu9pE7ld6wMwYfT4ZvDuHMYvshiw
 keRIEeF4oCakut9Cy8OVh0FtUCv/RYaT332rKJhtFZyCnewqwHd29/ihjB9yNnDx
 qruFEpjBHOffsbcd0PLDsATutGi8sjd4eNyv1KuRf2xKL3fZ22d79LOzBHRGhAnR
 1MyBNo6Q9eF3W8hbf1aMJ0X3O76EVwXxuAWc32EwHn/35XMjb9ajjmn0G1Xeix7+
 0N/cdXZO4fiZhj2fOVUuLduHpoNCcaQya0GD6YJ+jgVnFSs0ms/pGDryQa2kooqu
 ewav36eEmtU6WOXAxMKVYzok1A+/fMtpDK7cOvW61Vz91X6woCU=
 =RGJu
 -----END PGP SIGNATURE-----

Merge tag 'ntfs3_for_6.19' of https://github.com/Paragon-Software-Group/linux-ntfs3

Pull ntfs3 updates from Konstantin Komarov:
 "New code:
   - support timestamps prior to epoch
   - do not overwrite uptodate pages
   - disable readahead for compressed files
   - setting of dummy blocksize to read boot_block when mounting
   - the run_lock initialization when loading $Extend
   - initialization of allocated memory before use
   - support for the NTFS3_IOC_SHUTDOWN ioctl
   - check for minimum alignment when performing direct I/O reads
   - check for shutdown in fsync

  Fixes:
   - mount failure for sparse runs in run_unpack()
   - use-after-free of sbi->options in cmp_fnames
   - KMSAN uninit bug after failed mi_read in mi_format_new
   - uninit error after buffer allocation by __getname()
   - KMSAN uninit-value in ni_create_attr_list
   - double free of sbi->options->nls and ownership of fc->fs_private
   - incorrect vcn adjustments in attr_collapse_range()
   - mode update when ACL can be reduced to mode
   - memory leaks in add sub record

  Changes:
   - refactor code, updated terminology, spelling
   - do not kmap pages in (de)compression code
   - after ntfs_look_free_mft(), code that fails must put mft_inode
   - default mount options for "acl" and "prealloc"

  Replaced:
   - use unsafe_memcpy() to avoid memcpy size warning
   - ntfs_bio_pages with page cache for compressed files"

* tag 'ntfs3_for_6.19' of https://github.com/Paragon-Software-Group/linux-ntfs3: (26 commits)
  fs/ntfs3: check for shutdown in fsync
  fs/ntfs3: change the default mount options for "acl" and "prealloc"
  fs/ntfs3: Prevent memory leaks in add sub record
  fs/ntfs3: out1 also needs to put mi
  fs/ntfs3: Fix spelling mistake "recommened" -> "recommended"
  fs/ntfs3: update mode in xattr when ACL can be reduced to mode
  fs/ntfs3: check minimum alignment for direct I/O
  fs/ntfs3: implement NTFS3_IOC_SHUTDOWN ioctl
  fs/ntfs3: correct attr_collapse_range when file is too fragmented
  ntfs3: fix double free of sbi->options->nls and clarify ownership of fc->fs_private
  fs/ntfs3: Initialize allocated memory before use
  fs/ntfs3: remove ntfs_bio_pages and use page cache for compressed I/O
  ntfs3: avoid memcpy size warning
  fs/ntfs3: fix KMSAN uninit-value in ni_create_attr_list
  ntfs3: init run lock for extend inode
  ntfs: set dummy blocksize to read boot_block when mounting
  fs/ntfs3: disable readahead for compressed files
  ntfs3: Fix uninit buffer allocated by __getname()
  ntfs3: fix uninit memory after failed mi_read in mi_format_new
  ntfs3: fix use-after-free of sbi->options in cmp_fnames
  ...
2025-12-03 20:45:43 -08:00
Linus Torvalds
afdf0fb340 vfs-6.19-rc1.fs_header
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCaSmOZgAKCRCRxhvAZXjc
 oq2EAQD09y/qVU81E7Qg7Cn4n5/3WTlnQjx0aSvhb4p6dFUcFwD+K9uVJNP8x8tA
 xTaPt59nZbEX9BIAwtLChSPa4CZsnwM=
 =XrvE
 -----END PGP SIGNATURE-----

Merge tag 'vfs-6.19-rc1.fs_header' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs

Pull fs header updates from Christian Brauner:
 "This contains initial work to start splitting up fs.h.

  Begin the long-overdue work of splitting up the monolithic fs.h
  header. The header has grown to over 3000 lines and includes types and
  functions for many different subsystems, making it difficult to
  navigate and causing excessive compilation dependencies.

  This series introduces new focused headers for superblock-related
  code:

   - Rename fs_types.h to fs_dirent.h to better reflect its actual
     content (directory entry types)

   - Add fs/super_types.h containing superblock type definitions

   - Add fs/super.h containing superblock function declarations

  This is the first step in a longer effort to modularize the VFS
  headers.

  Cleanups:

   - Inode Field Layout Optimization (Mateusz Guzik)

     Move inode fields used during fast path lookup closer together to
     improve cache locality during path resolution.

   - current_umask() Optimization (Mateusz Guzik)

     Inline current_umask() and move it to fs_struct.h. This improves
     performance by avoiding function call overhead for this
     frequently-used function, and places it in a more appropriate
     header since it operates on fs_struct"

* tag 'vfs-6.19-rc1.fs_header' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
  fs: move inode fields used during fast path lookup closer together
  fs: inline current_umask() and move it to fs_struct.h
  fs: add fs/super.h header
  fs: add fs/super_types.h header
  fs: rename fs_types.h to fs_dirent.h
2025-12-01 14:18:01 -08:00
Konstantin Komarov
1b2ae190ea
fs/ntfs3: check for shutdown in fsync
Ensure fsync() returns -EIO when the ntfs3 filesystem is in forced
shutdown, instead of silently succeeding via generic_file_fsync().

Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
2025-11-19 09:21:36 +01:00
Konstantin Komarov
bcbb8d0afd
fs/ntfs3: change the default mount options for "acl" and "prealloc"
Switch the "acl" and "prealloc" mount parameters to fsparam_flag_no(),
making them enabled by default and allowing users to disable them with
"noacl" and "noprealloc".

Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
2025-11-18 13:56:28 +01:00
Edward Adam Davis
ccc4e86d1c
fs/ntfs3: Prevent memory leaks in add sub record
If a rb node with the same ino already exists in the rb tree, the newly
alloced mft_inode in ni_add_subrecord() will not have its memory cleaned
up, which leads to the memory leak issue reported by syzbot.

The best option to avoid this issue is to put the newly alloced mft node
when a rb node with the same ino already exists in the rb tree and return
the rb node found in the rb tree to the parent layer.

syzbot reported:
BUG: memory leak
unreferenced object 0xffff888110bef280 (size 128):
  backtrace (crc 126a088f):
    ni_add_subrecord+0x31/0x180 fs/ntfs3/frecord.c:317
    ntfs_look_free_mft+0xf0/0x790 fs/ntfs3/fsntfs.c:715

BUG: memory leak
unreferenced object 0xffff888109093400 (size 1024):
  backtrace (crc 7197c55e):
    mi_init+0x2b/0x50 fs/ntfs3/record.c:105
    mi_format_new+0x40/0x220 fs/ntfs3/record.c:422

Fixes: 4342306f0f ("fs/ntfs3: Add file operations and implementation")
Reported-by: syzbot+3932ccb896e06f7414c9@syzkaller.appspotmail.com
Signed-off-by: Edward Adam Davis <eadavis@qq.com>
Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
2025-11-18 13:56:27 +01:00
Edward Adam Davis
4d78d1173a
fs/ntfs3: out1 also needs to put mi
After ntfs_look_free_mft() executes successfully, all subsequent code
that fails to execute must put mi.

Fixes: 4342306f0f ("fs/ntfs3: Add file operations and implementation")
Signed-off-by: Edward Adam Davis <eadavis@qq.com>
Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
2025-11-18 13:56:12 +01:00
Colin Ian King
2469f2e78d
fs/ntfs3: Fix spelling mistake "recommened" -> "recommended"
There is a spelling mistake in a ntfs_info message. Fix it.

Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
2025-11-17 09:09:15 +01:00
Konstantin Komarov
266ab6d02a
fs/ntfs3: update mode in xattr when ACL can be reduced to mode
If a file's ACL can be reduced to standard mode bits, update mode
accordingly, persist the change, and update the cached ACL. This keeps
mode and ACL consistent and avoids redundant xattrs.

Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
2025-11-17 09:09:14 +01:00
Konstantin Komarov
d8e1e0d33d
fs/ntfs3: check minimum alignment for direct I/O
Add a check for minimum alignment when performing direct I/O reads. If the
file offset or user buffer is not aligned to the device's logical block
size, fall back to buffered I/O instead of continuing with unaligned direct I/O.

Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
2025-11-17 09:09:13 +01:00
Konstantin Komarov
ae91dfe389
fs/ntfs3: implement NTFS3_IOC_SHUTDOWN ioctl
Add support for the NTFS3_IOC_SHUTDOWN ioctl, allowing userspace to
request a filesystem shutdown. The ioctl number is shared with other
filesystems such as ext4, exfat, and f2fs.

Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
2025-11-17 09:09:13 +01:00
Konstantin Komarov
2109b08024
fs/ntfs3: correct attr_collapse_range when file is too fragmented
Fix incorrect VCN adjustments in attr_collapse_range() that caused
filesystem errors or corruption on very fragmented NTFS files when
performing collapse-range operations.

Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
2025-11-17 09:08:49 +01:00
Lorenzo Stoakes
54c58a2f5f mm: add vma_desc_size(), vma_desc_pages() helpers
It's useful to be able to determine the size of a VMA descriptor range
used on f_op->mmap_prepare, expressed both in bytes and pages, so add
helpers for both and update code that could make use of it to do so.

Link: https://lkml.kernel.org/r/74ef338203c9ff08a9ace73a8f1f6116a79112a0.1760959442.git.lorenzo.stoakes@oracle.com
Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Acked-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Pedro Falcato <pfalcato@suse.de>
Cc: Alexander Gordeev <agordeev@linux.ibm.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Andreas Larsson <andreas@gaisler.com>
Cc: Andrey Konovalov <andreyknvl@gmail.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: Baoquan He <bhe@redhat.com>
Cc: Chatre, Reinette <reinette.chatre@intel.com>
Cc: Christian Borntraeger <borntraeger@linux.ibm.com>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Dave Jiang <dave.jiang@intel.com>
Cc: Dave Martin <dave.martin@arm.com>
Cc: Dave Young <dyoung@redhat.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Dmitriy Vyukov <dvyukov@google.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Guo Ren <guoren@kernel.org>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: James Morse <james.morse@arm.com>
Cc: Jann Horn <jannh@google.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Kevin Tian <kevin.tian@intel.com>
Cc: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Muchun Song <muchun.song@linux.dev>
Cc: Nicolas Pitre <nico@fluxnic.net>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Robin Murohy <robin.murphy@arm.com>
Cc: Sumanth Korikkar <sumanthk@linux.ibm.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Sven Schnelle <svens@linux.ibm.com>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: "Uladzislau Rezki (Sony)" <urezki@gmail.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Vishal Verma <vishal.l.verma@intel.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-11-16 17:28:11 -08:00