mirror of
https://github.com/torvalds/linux.git
synced 2026-05-28 00:53:34 +02:00
master
3163 Commits
| Author | SHA1 | Message | Date | |
|---|---|---|---|---|
|
|
292a2bcd17 |
Fixing livelocks in shrink_dcache_tree()
If shrink_dcache_tree() finds a dentry in the middle of being killed by another thread, it has to wait until the victim finishes dying, gets detached from the tree and ceases to pin its parent. The way we used to deal with that amounted to busy-wait; unfortunately, it's not just inefficient but can lead to reliably reproducible hard livelocks. Solved by having shrink_dentry_tree() attach a completion to such dentry, with dentry_unlist() calling complete() on all objects attached to it. With a bit of care it can be done without growing struct dentry or adding overhead in normal case. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> -----BEGIN PGP SIGNATURE----- iHUEABYKAB0WIQQqUNBr3gm4hGXdBJlZ7Krx/gZQ6wUCaec/ugAKCRBZ7Krx/gZQ 6y77AP9l/IL36Ic45p45FCTirA6LFyIvZ5Gixm3Xk64Pi1Y3nAEAqQ5UOVnJc907 RyrCpcI6vnO8a67MptchOxK9d1bIxQw= =A8JL -----END PGP SIGNATURE----- Merge tag 'pull-dcache-busy-wait' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull dcache busy loop updates from Al Viro: "Fix livelocks in shrink_dcache_tree() If shrink_dcache_tree() finds a dentry in the middle of being killed by another thread, it has to wait until the victim finishes dying, gets detached from the tree and ceases to pin its parent. The way we used to deal with that amounted to busy-wait; unfortunately, it's not just inefficient but can lead to reliably reproducible hard livelocks. Solved by having shrink_dentry_tree() attach a completion to such dentry, with dentry_unlist() calling complete() on all objects attached to it. With a bit of care it can be done without growing struct dentry or adding overhead in normal case" * tag 'pull-dcache-busy-wait' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: get rid of busy-waiting in shrink_dcache_tree() dcache.c: more idiomatic "positives are not allowed" sanity checks struct dentry: make ->d_u anonymous for_each_alias(): helper macro for iterating through dentries of given inode |
||
|
|
a436a0b847 |
Various clean ups and bug fixes in ext4 for 7.1:
* Refactor code paths involved with partial block zero-out in
prearation for converting ext4 to use iomap for buffered writes.
* Remove use of d_alloc() from ext4 in preparation for the deprecation
of this interface.
* Replace some J_ASSERTS with a journal abort so we can avoid a kernel
panic for a localized file system error
* Simplify various code paths in mballoc, move_extent, and fast commit
* Fix rare deadlock in jbd2_journal_cancel_revoke() that can be
triggered by generic/013 when blocksize < pagesize.
* Fix memory leak when releasing an extended attribute when its
value is stored in an ea_inode
* Fix various potential kunit test bugs in fs/ext4/extents.c
* Fix potential out-of-bounds access in check_xattr() with a corrupted
file system
* Make the jbd2_inode dirty range tracking safe for lockless reads
* Avoid a WARN_ON when writeback files due to a corrupted file system;
we already print an ext4 warning indicatign that data will be lost,
so the WARN_ON is not necessary and doesn't add any new information
-----BEGIN PGP SIGNATURE-----
iQEzBAABCAAdFiEEK2m5VNv+CHkogTfJ8vlZVpUNgaMFAmniS4wACgkQ8vlZVpUN
gaM1lgf/e9Csrz5Zxyf+VEdE5+kQjuaT0feTe/gUMM53jqMiT7cC30JBF1istkaz
O3qBe6x8O5iHDv1Z/umcI+A4GtAUj6SsTjwggQPJyJvUqHDPsruGfyD6wBmmNyJ2
N9Y3OZ8I5W7c7ecES6/a7W7VheSe9kkgPDD5eBhw7R4l71Q+B2foW4snQ8wcGTNm
uO/QosbaOArGsXHTTcOmOdwxEDIUV16g/iJMhszCI1gaukmL14DIzFtwYkateUBD
Wi1+tmZmMvqSkKJ/6PELLoawiUPhTL6OuPANyIid/1Yv5dxMec8XiO/CXRt7r2X3
q9Nrpqfs3fxv63r2IUJ7+yQVYGXK9g==
=MG/0
-----END PGP SIGNATURE-----
Merge tag 'ext4_for_linux-7.0-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4
Pull ext4 updates from Ted Ts'o:
- Refactor code paths involved with partial block zero-out in
prearation for converting ext4 to use iomap for buffered writes
- Remove use of d_alloc() from ext4 in preparation for the deprecation
of this interface
- Replace some J_ASSERTS with a journal abort so we can avoid a kernel
panic for a localized file system error
- Simplify various code paths in mballoc, move_extent, and fast commit
- Fix rare deadlock in jbd2_journal_cancel_revoke() that can be
triggered by generic/013 when blocksize < pagesize
- Fix memory leak when releasing an extended attribute when its value
is stored in an ea_inode
- Fix various potential kunit test bugs in fs/ext4/extents.c
- Fix potential out-of-bounds access in check_xattr() with a corrupted
file system
- Make the jbd2_inode dirty range tracking safe for lockless reads
- Avoid a WARN_ON when writeback files due to a corrupted file system;
we already print an ext4 warning indicatign that data will be lost,
so the WARN_ON is not necessary and doesn't add any new information
* tag 'ext4_for_linux-7.0-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: (37 commits)
jbd2: fix deadlock in jbd2_journal_cancel_revoke()
ext4: fix missing brelse() in ext4_xattr_inode_dec_ref_all()
ext4: fix possible null-ptr-deref in mbt_kunit_exit()
ext4: fix possible null-ptr-deref in extents_kunit_exit()
ext4: fix the error handling process in extents_kunit_init).
ext4: call deactivate_super() in extents_kunit_exit()
ext4: fix miss unlock 'sb->s_umount' in extents_kunit_init()
ext4: fix bounds check in check_xattrs() to prevent out-of-bounds access
ext4: zero post-EOF partial block before appending write
ext4: move pagecache_isize_extended() out of active handle
ext4: remove ctime/mtime update from ext4_alloc_file_blocks()
ext4: unify SYNC mode checks in fallocate paths
ext4: ensure zeroed partial blocks are persisted in SYNC mode
ext4: move zero partial block range functions out of active handle
ext4: pass allocate range as loff_t to ext4_alloc_file_blocks()
ext4: remove handle parameters from zero partial block functions
ext4: move ordered data handling out of ext4_block_do_zero_range()
ext4: rename ext4_block_zero_page_range() to ext4_block_zero_range()
ext4: factor out journalled block zeroing range
ext4: rename and extend ext4_block_truncate_page()
...
|
||
|
|
440d6635b2 |
mm.git review status for linus..mm-nonmm-stable
Total patches: 126
Reviews/patch: 0.92
Reviewed rate: 76%
- The 2 patch series "pid: make sub-init creation retryable" from Oleg
Nesterov increases the robustness of our creation of init in a new
namespace. By clearing away some historical cruft which is no longer
needed. Also some documentation fixups are provided.
- The 2 patch series "selftests/fchmodat2: Error handling and general"
from Mark Brown has a fixup and a cleanup for the fchmodat2() syscall
selftest.
- The 3 patch series "lib: polynomial: Move to math/ and clean up" from
Andy Shevchenko does as advertised.
- The 3 patch series "hung_task: Provide runtime reset interface for
hung task detector" from Aaron Tomlin gives administrators the ability
to zero out /proc/sys/kernel/hung_task_detect_count.
- The 2 patch series "tools/getdelays: use the static UAPI headers from
tools/include/uapi" from Thomas Weißschuh teaches getdelays to use the
in-kernel UAPI headers rather than the system-provided ones.
- The 5 patch series "watchdog/hardlockup: Improvements to hardlockup"
from Mayank Rungta provides several cleanups and fixups to the
hardlockup detector code and its documentation.
- The 2 patch series "lib/bch: fix undefined behavior from signed
left-shifts" from Josh Law provides a couple of small/theoretical fixes
in the bch code.
- The 2 patch series "ocfs2/dlm: fix two bugs in dlm_match_regions()"
from Junrui Luo does what is claims.
- The 27 patch series "cleanup the RAID5 XOR library" from Christoph
Hellwig is a quite far-reaching cleanup to this code. I can't do better
than to quote Christoph:
The XOR library used for the RAID5 parity is a bit of a mess right
now. The main file sits in crypto/ despite not being cryptography and
not using the crypto API, with the generic implementations sitting in
include/asm-generic and the arch implementations sitting in an asm/
header in theory. The latter doesn't work for many cases, so
architectures often build the code directly into the core kernel, or
create another module for the architecture code.
Change this to a single module in lib/ that also contains the
architecture optimizations, similar to the library work Eric Biggers
has done for the CRC and crypto libraries later. After that it
changes to better calling conventions that allow for smarter
architecture implementations (although none is contained here yet),
and uses static_call to avoid indirection function call overhead.
- The 2 patch series "lib/list_sort: Clean up list_sort() scheduling
workarounds" from Kuan-Wei Chiu cleans up this library code by removing
a hacky thing which was added for UBIFS, which UBIFS doesn't actually
need.
- The 5 patch series "Fix bugs in extract_iter_to_sg()" from Christian
Ehrhardt fixes a few bugs in the scatterlist code, adds in-kernel tests
for the now-fixed bugs and fixes a leak in the test itself.
- The 3 patch series "kdump: Enable LUKS-encrypted dump target support
in ARM64 and PowerPC" from Coiby Xu eenables support of the
LUKS-encrypted device dump target on arm64 and powerpc.
- The 4 patch series "ocfs2: consolidate extent list validation into
block read callbacks" from Joseph Qi addresses ocfs2's validation of
extent list fields - cleanup, simplification, robustness. (Kernel test
robot loves mounting corrupted fs images!)
-----BEGIN PGP SIGNATURE-----
iHUEABYIAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCad90rQAKCRDdBJ7gKXxA
jl7rAQD4/Rq7ZSSnEv6FS4gOwc3MgTdWcZZaXkqL1KiWyYhRwAEA+cVCO344+AKb
znBOjet/hUr+/kBwyViifiC8LHzchwM=
=Nfnf
-----END PGP SIGNATURE-----
Merge tag 'mm-nonmm-stable-2026-04-15-04-20' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull non-MM updates from Andrew Morton:
- "pid: make sub-init creation retryable" (Oleg Nesterov)
Make creation of init in a new namespace more robust by clearing away
some historical cruft which is no longer needed. Also some
documentation fixups
- "selftests/fchmodat2: Error handling and general" (Mark Brown)
Fix and a cleanup for the fchmodat2() syscall selftest
- "lib: polynomial: Move to math/ and clean up" (Andy Shevchenko)
- "hung_task: Provide runtime reset interface for hung task detector"
(Aaron Tomlin)
Give administrators the ability to zero out
/proc/sys/kernel/hung_task_detect_count
- "tools/getdelays: use the static UAPI headers from
tools/include/uapi" (Thomas Weißschuh)
Teach getdelays to use the in-kernel UAPI headers rather than the
system-provided ones
- "watchdog/hardlockup: Improvements to hardlockup" (Mayank Rungta)
Several cleanups and fixups to the hardlockup detector code and its
documentation
- "lib/bch: fix undefined behavior from signed left-shifts" (Josh Law)
A couple of small/theoretical fixes in the bch code
- "ocfs2/dlm: fix two bugs in dlm_match_regions()" (Junrui Luo)
- "cleanup the RAID5 XOR library" (Christoph Hellwig)
A quite far-reaching cleanup to this code. I can't do better than to
quote Christoph:
"The XOR library used for the RAID5 parity is a bit of a mess right
now. The main file sits in crypto/ despite not being cryptography
and not using the crypto API, with the generic implementations
sitting in include/asm-generic and the arch implementations
sitting in an asm/ header in theory. The latter doesn't work for
many cases, so architectures often build the code directly into
the core kernel, or create another module for the architecture
code.
Change this to a single module in lib/ that also contains the
architecture optimizations, similar to the library work Eric
Biggers has done for the CRC and crypto libraries later. After
that it changes to better calling conventions that allow for
smarter architecture implementations (although none is contained
here yet), and uses static_call to avoid indirection function call
overhead"
- "lib/list_sort: Clean up list_sort() scheduling workarounds"
(Kuan-Wei Chiu)
Clean up this library code by removing a hacky thing which was added
for UBIFS, which UBIFS doesn't actually need
- "Fix bugs in extract_iter_to_sg()" (Christian Ehrhardt)
Fix a few bugs in the scatterlist code, add in-kernel tests for the
now-fixed bugs and fix a leak in the test itself
- "kdump: Enable LUKS-encrypted dump target support in ARM64 and
PowerPC" (Coiby Xu)
Enable support of the LUKS-encrypted device dump target on arm64 and
powerpc
- "ocfs2: consolidate extent list validation into block read callbacks"
(Joseph Qi)
Cleanup, simplify, and make more robust ocfs2's validation of extent
list fields (Kernel test robot loves mounting corrupted fs images!)
* tag 'mm-nonmm-stable-2026-04-15-04-20' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (127 commits)
ocfs2: validate group add input before caching
ocfs2: validate bg_bits during freefrag scan
ocfs2: fix listxattr handling when the buffer is full
doc: watchdog: fix typos etc
update Sean's email address
ocfs2: use get_random_u32() where appropriate
ocfs2: split transactions in dio completion to avoid credit exhaustion
ocfs2: remove redundant l_next_free_rec check in __ocfs2_find_path()
ocfs2: validate extent block list fields during block read
ocfs2: remove empty extent list check in ocfs2_dx_dir_lookup_rec()
ocfs2: validate dx_root extent list fields during block read
ocfs2: fix use-after-free in ocfs2_fault() when VM_FAULT_RETRY
ocfs2: handle invalid dinode in ocfs2_group_extend
.get_maintainer.ignore: add Askar
ocfs2: validate bg_list extent bounds in discontig groups
checkpatch: exclude forward declarations of const structs
tools/accounting: handle truncated taskstats netlink messages
taskstats: set version in TGID exit notifications
ocfs2/heartbeat: fix slot mapping rollback leaks on error paths
arm64,ppc64le/kdump: pass dm-crypt keys to kdump kernel
...
|
||
|
|
334fbe734e |
mm.git review status for linus..mm-stable
Everything: Total patches: 368 Reviews/patch: 1.56 Reviewed rate: 74% Excluding DAMON: Total patches: 316 Reviews/patch: 1.77 Reviewed rate: 81% Excluding DAMON and zram: Total patches: 306 Reviews/patch: 1.81 Reviewed rate: 82% Excluding DAMON, zram and maple_tree: Total patches: 276 Reviews/patch: 2.01 Reviewed rate: 91% Significant patch series in this merge: - The 30 patch series "maple_tree: Replace big node with maple copy" from Liam Howlett is mainly prepararatory work for ongoing development but it does reduce stack usage and is an improvement. - The 12 patch series "mm, swap: swap table phase III: remove swap_map" from Kairui Song offers memory savings by removing the static swap_map. It also yields some CPU savings and implements several cleanups. - The 2 patch series "mm: memfd_luo: preserve file seals" from Pratyush Yadav adds file seal preservation to LUO's memfd code. - The 2 patch series "mm: zswap: add per-memcg stat for incompressible pages" from Jiayuan Chen adds additional userspace stats reportng to zswap. - The 4 patch series "arch, mm: consolidate empty_zero_page" from Mike Rapoport implements some cleanups for our handling of ZERO_PAGE() and zero_pfn. - The 2 patch series "mm/kmemleak: Improve scan_should_stop() implementation" from Zhongqiu Han provides an robustness improvement and some cleanups in the kmemleak code. - The 4 patch series "Improve khugepaged scan logic" from Vernon Yang "improves the khugepaged scan logic and reduces CPU consumption by prioritizing scanning tasks that access memory frequently". - The 2 patch series "Make KHO Stateless" from Jason Miu simplifies Kexec Handover by "transitioning KHO from an xarray-based metadata tracking system with serialization to a radix tree data structure that can be passed directly to the next kernel" - The 3 patch series "mm: vmscan: add PID and cgroup ID to vmscan tracepoints" from Thomas Ballasi and Steven Rostedt enhances vmscan's tracepointing. - The 5 patch series "mm: arch/shstk: Common shadow stack mapping helper and VM_NOHUGEPAGE" from Catalin Marinas is a cleanup for the shadow stack code: remove per-arch code in favour of a generic implementation. - The 2 patch series "Fix KASAN support for KHO restored vmalloc regions" from Pasha Tatashin fixes a WARN() which can be emitted the KHO restores a vmalloc area. - The 4 patch series "mm: Remove stray references to pagevec" from Tal Zussman provides several cleanups, mainly udpating references to "struct pagevec", which became folio_batch three years ago. - The 17 patch series "mm: Eliminate fake head pages from vmemmap optimization" from Kiryl Shutsemau simplifies the HugeTLB vmemmap optimization (HVO) by changing how tail pages encode their relationship to the head page. - The 2 patch series "mm/damon/core: improve DAMOS quota efficiency for core layer filters" from SeongJae Park improves two problematic behaviors of DAMOS that makes it less efficient when core layer filters are used. - The 3 patch series "mm/damon: strictly respect min_nr_regions" from SeongJae Park improves DAMON usability by extending the treatment of the min_nr_regions user-settable parameter. - The 3 patch series "mm/page_alloc: pcp locking cleanup" from Vlastimil Babka is a proper fix for a previously hotfixed SMP=n issue. Code simplifications and cleanups ennsed. - The 16 patch series "mm: cleanups around unmapping / zapping" from David Hildenbrand implements "a bunch of cleanups around unmapping and zapping. Mostly simplifications, code movements, documentation and renaming of zapping functions". - The 6 patch series "support batched checking of the young flag for MGLRU" from Baolin Wang supports batched checking of the young flag for MGLRU. It's part cleanups; one benchmark shows large performance benefits for arm64. - The 5 patch series "memcg: obj stock and slab stat caching cleanups" from Johannes Weiner provides memcg cleanup and robustness improvements. - The 5 patch series "Allow order zero pages in page reporting" from Yuvraj Sakshith enhances page_reporting's free page reporting - it is presently and undesirably order-0 pages when reporting free memory. - The 6 patch series "mm: vma flag tweaks" from Lorenzo Stoakes is cleanup work following from the recent conversion of the VMA flags to a bitmap. - The 10 patch series "mm/damon: add optional debugging-purpose sanity checks" from SeongJae Park adds some more developer-facing debug checks into DAMON core. - The 2 patch series "mm/damon: test and document power-of-2 min_region_sz requirement" from SeongJae Park adds an additional DAMON kunit test and makes some adjustments to the addr_unit parameter handling. - The 3 patch series "mm/damon/core: make passed_sample_intervals comparisons overflow-safe" from SeongJae Park fixes a hard-to-hit time overflow issue in DAMON core. - The 7 patch series "mm/damon: improve/fixup/update ratio calculation, test and documentation" from SeongJae Park is a "batch of misc/minor improvements and fixups" for DAMON. - The 4 patch series "mm: move vma_(kernel|mmu)_pagesize() out of hugetlb.c" from David Hildenbrand fixes a possible issue with dax-device when CONFIG_HUGETLB=n. Some code movement was required. - The 6 patch series "zram: recompression cleanups and tweaks" from Sergey Senozhatsky provides "a somewhat random mix of fixups, recompression cleanups and improvements" in the zram code. - The 11 patch series "mm/damon: support multiple goal-based quota tuning algorithms" from SeongJae Park extend DAMOS quotas goal auto-tuning to support multiple tuning algorithms that users can select. - The 4 patch series "mm: thp: reduce unnecessary start_stop_khugepaged()" from Breno Leitao fixes the khugpaged sysfs handling so we no longer spam the logs with reams of junk when starting/stopping khugepaged. - The 3 patch series "mm: improve map count checks" from Lorenzo Stoakes provides some cleanups and slight fixes in the mremap, mmap and vma code. - The 5 patch series "mm/damon: support addr_unit on default monitoring targets for modules" from SeongJae Park extends the use of DAMON core's addr_unit tunable. - The 5 patch series "mm: khugepaged cleanups and mTHP prerequisites" from Nico Pache provides cleanups in the khugepaged and is a base for Nico's planned khugepaged mTHP support. - The 15 patch series "mm: memory hot(un)plug and SPARSEMEM cleanups" from David Hildenbrand implements code movement and cleanups in the memhotplug and sparsemem code. - The 2 patch series "mm: remove CONFIG_ARCH_ENABLE_MEMORY_HOTREMOVE and cleanup CONFIG_MIGRATION" from David Hildenbrand rationalizes some memhotplug Kconfig support. - The 6 patch series "change young flag check functions to return bool" from Baolin Wang is "a cleanup patchset to change all young flag check functions to return bool". - The 3 patch series "mm/damon/sysfs: fix memory leak and NULL dereference issues" from Josh Law and SeongJae Park fixes a few potential DAMON bugs. - The 25 patch series "mm/vma: convert vm_flags_t to vma_flags_t in vma code" from "converts a lot of the existing use of the legacy vm_flags_t data type to the new vma_flags_t type which replaces it". Mainly in the vma code. - The 21 patch series "mm: expand mmap_prepare functionality and usage" from Lorenzo Stoakes "expands the mmap_prepare functionality, which is intended to replace the deprecated f_op->mmap hook which has been the source of bugs and security issues for some time". Cleanups, documentation, extension of mmap_prepare into filesystem drivers. - The 13 patch series "mm/huge_memory: refactor zap_huge_pmd()" from Lorenzo Stoakes simplifies and cleans up zap_huge_pmd(). Additional cleanups around vm_normal_folio_pmd() and the softleaf functionality are performed. -----BEGIN PGP SIGNATURE----- iHUEABYIAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCad3HDQAKCRDdBJ7gKXxA jrUQAPwNhPk5nPSxnyxjAeQtOBHqgCdnICeEismLajPKd9aYRgEA0s2XAu3tSUYi GrBnWImHG3s4ePQxVcPCegWTsOUrXgQ= =1Q7o -----END PGP SIGNATURE----- Merge tag 'mm-stable-2026-04-13-21-45' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull MM updates from Andrew Morton: - "maple_tree: Replace big node with maple copy" (Liam Howlett) Mainly prepararatory work for ongoing development but it does reduce stack usage and is an improvement. - "mm, swap: swap table phase III: remove swap_map" (Kairui Song) Offers memory savings by removing the static swap_map. It also yields some CPU savings and implements several cleanups. - "mm: memfd_luo: preserve file seals" (Pratyush Yadav) File seal preservation to LUO's memfd code - "mm: zswap: add per-memcg stat for incompressible pages" (Jiayuan Chen) Additional userspace stats reportng to zswap - "arch, mm: consolidate empty_zero_page" (Mike Rapoport) Some cleanups for our handling of ZERO_PAGE() and zero_pfn - "mm/kmemleak: Improve scan_should_stop() implementation" (Zhongqiu Han) A robustness improvement and some cleanups in the kmemleak code - "Improve khugepaged scan logic" (Vernon Yang) Improve khugepaged scan logic and reduce CPU consumption by prioritizing scanning tasks that access memory frequently - "Make KHO Stateless" (Jason Miu) Simplify Kexec Handover by transitioning KHO from an xarray-based metadata tracking system with serialization to a radix tree data structure that can be passed directly to the next kernel - "mm: vmscan: add PID and cgroup ID to vmscan tracepoints" (Thomas Ballasi and Steven Rostedt) Enhance vmscan's tracepointing - "mm: arch/shstk: Common shadow stack mapping helper and VM_NOHUGEPAGE" (Catalin Marinas) Cleanup for the shadow stack code: remove per-arch code in favour of a generic implementation - "Fix KASAN support for KHO restored vmalloc regions" (Pasha Tatashin) Fix a WARN() which can be emitted the KHO restores a vmalloc area - "mm: Remove stray references to pagevec" (Tal Zussman) Several cleanups, mainly udpating references to "struct pagevec", which became folio_batch three years ago - "mm: Eliminate fake head pages from vmemmap optimization" (Kiryl Shutsemau) Simplify the HugeTLB vmemmap optimization (HVO) by changing how tail pages encode their relationship to the head page - "mm/damon/core: improve DAMOS quota efficiency for core layer filters" (SeongJae Park) Improve two problematic behaviors of DAMOS that makes it less efficient when core layer filters are used - "mm/damon: strictly respect min_nr_regions" (SeongJae Park) Improve DAMON usability by extending the treatment of the min_nr_regions user-settable parameter - "mm/page_alloc: pcp locking cleanup" (Vlastimil Babka) The proper fix for a previously hotfixed SMP=n issue. Code simplifications and cleanups ensued - "mm: cleanups around unmapping / zapping" (David Hildenbrand) A bunch of cleanups around unmapping and zapping. Mostly simplifications, code movements, documentation and renaming of zapping functions - "support batched checking of the young flag for MGLRU" (Baolin Wang) Batched checking of the young flag for MGLRU. It's part cleanups; one benchmark shows large performance benefits for arm64 - "memcg: obj stock and slab stat caching cleanups" (Johannes Weiner) memcg cleanup and robustness improvements - "Allow order zero pages in page reporting" (Yuvraj Sakshith) Enhance free page reporting - it is presently and undesirably order-0 pages when reporting free memory. - "mm: vma flag tweaks" (Lorenzo Stoakes) Cleanup work following from the recent conversion of the VMA flags to a bitmap - "mm/damon: add optional debugging-purpose sanity checks" (SeongJae Park) Add some more developer-facing debug checks into DAMON core - "mm/damon: test and document power-of-2 min_region_sz requirement" (SeongJae Park) An additional DAMON kunit test and makes some adjustments to the addr_unit parameter handling - "mm/damon/core: make passed_sample_intervals comparisons overflow-safe" (SeongJae Park) Fix a hard-to-hit time overflow issue in DAMON core - "mm/damon: improve/fixup/update ratio calculation, test and documentation" (SeongJae Park) A batch of misc/minor improvements and fixups for DAMON - "mm: move vma_(kernel|mmu)_pagesize() out of hugetlb.c" (David Hildenbrand) Fix a possible issue with dax-device when CONFIG_HUGETLB=n. Some code movement was required. - "zram: recompression cleanups and tweaks" (Sergey Senozhatsky) A somewhat random mix of fixups, recompression cleanups and improvements in the zram code - "mm/damon: support multiple goal-based quota tuning algorithms" (SeongJae Park) Extend DAMOS quotas goal auto-tuning to support multiple tuning algorithms that users can select - "mm: thp: reduce unnecessary start_stop_khugepaged()" (Breno Leitao) Fix the khugpaged sysfs handling so we no longer spam the logs with reams of junk when starting/stopping khugepaged - "mm: improve map count checks" (Lorenzo Stoakes) Provide some cleanups and slight fixes in the mremap, mmap and vma code - "mm/damon: support addr_unit on default monitoring targets for modules" (SeongJae Park) Extend the use of DAMON core's addr_unit tunable - "mm: khugepaged cleanups and mTHP prerequisites" (Nico Pache) Cleanups to khugepaged and is a base for Nico's planned khugepaged mTHP support - "mm: memory hot(un)plug and SPARSEMEM cleanups" (David Hildenbrand) Code movement and cleanups in the memhotplug and sparsemem code - "mm: remove CONFIG_ARCH_ENABLE_MEMORY_HOTREMOVE and cleanup CONFIG_MIGRATION" (David Hildenbrand) Rationalize some memhotplug Kconfig support - "change young flag check functions to return bool" (Baolin Wang) Cleanups to change all young flag check functions to return bool - "mm/damon/sysfs: fix memory leak and NULL dereference issues" (Josh Law and SeongJae Park) Fix a few potential DAMON bugs - "mm/vma: convert vm_flags_t to vma_flags_t in vma code" (Lorenzo Stoakes) Convert a lot of the existing use of the legacy vm_flags_t data type to the new vma_flags_t type which replaces it. Mainly in the vma code. - "mm: expand mmap_prepare functionality and usage" (Lorenzo Stoakes) Expand the mmap_prepare functionality, which is intended to replace the deprecated f_op->mmap hook which has been the source of bugs and security issues for some time. Cleanups, documentation, extension of mmap_prepare into filesystem drivers - "mm/huge_memory: refactor zap_huge_pmd()" (Lorenzo Stoakes) Simplify and clean up zap_huge_pmd(). Additional cleanups around vm_normal_folio_pmd() and the softleaf functionality are performed. * tag 'mm-stable-2026-04-13-21-45' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (369 commits) mm: fix deferred split queue races during migration mm/khugepaged: fix issue with tracking lock mm/huge_memory: add and use has_deposited_pgtable() mm/huge_memory: add and use normal_or_softleaf_folio_pmd() mm: add softleaf_is_valid_pmd_entry(), pmd_to_softleaf_folio() mm/huge_memory: separate out the folio part of zap_huge_pmd() mm/huge_memory: use mm instead of tlb->mm mm/huge_memory: remove unnecessary sanity checks mm/huge_memory: deduplicate zap deposited table call mm/huge_memory: remove unnecessary VM_BUG_ON_PAGE() mm/huge_memory: add a common exit path to zap_huge_pmd() mm/huge_memory: handle buggy PMD entry in zap_huge_pmd() mm/huge_memory: have zap_huge_pmd return a boolean, add kdoc mm/huge: avoid big else branch in zap_huge_pmd() mm/huge_memory: simplify vma_is_specal_huge() mm: on remap assert that input range within the proposed VMA mm: add mmap_action_map_kernel_pages[_full]() uio: replace deprecated mmap hook with mmap_prepare in uio_info drivers: hv: vmbus: replace deprecated mmap hook with mmap_prepare mm: allow handling of stacked mmap_prepare hooks in more drivers ... |
||
|
|
70b672833f |
ocfs2: validate group add input before caching
[BUG]
OCFS2_IOC_GROUP_ADD can trigger a BUG_ON in
ocfs2_set_new_buffer_uptodate():
kernel BUG at fs/ocfs2/uptodate.c:509!
Oops: invalid opcode: 0000 [#1] SMP KASAN NOPTI
RIP: 0010:ocfs2_set_new_buffer_uptodate+0x194/0x1e0 fs/ocfs2/uptodate.c:509
Code: ffffe88f 42b9fe4c 89e64889 dfe8b4df
Call Trace:
ocfs2_group_add+0x3f1/0x1510 fs/ocfs2/resize.c:507
ocfs2_ioctl+0x309/0x6e0 fs/ocfs2/ioctl.c:887
vfs_ioctl fs/ioctl.c:51 [inline]
__do_sys_ioctl fs/ioctl.c:597 [inline]
__se_sys_ioctl fs/ioctl.c:583 [inline]
__x64_sys_ioctl+0x197/0x1e0 fs/ioctl.c:583
x64_sys_call+0x1144/0x26a0 arch/x86/include/generated/asm/syscalls_64.h:17
do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
do_syscall_64+0x93/0xf80 arch/x86/entry/syscall_64.c:94
entry_SYSCALL_64_after_hwframe+0x76/0x7e
RIP: 0033:0x7bbfb55a966d
[CAUSE]
ocfs2_group_add() calls ocfs2_set_new_buffer_uptodate() on a
user-controlled group block before ocfs2_verify_group_and_input()
validates that block number. That helper is only valid for newly
allocated metadata and asserts that the block is not already present in
the chosen metadata cache. The code also uses INODE_CACHE(inode) even
though the group descriptor belongs to main_bm_inode and later journal
accesses use that cache context instead.
[FIX]
Validate the on-disk group descriptor before caching it, then add it to
the metadata cache tracked by INODE_CACHE(main_bm_inode). Keep the
validation failure path separate from the later cleanup path so we only
remove the buffer from that cache after it has actually been inserted.
This keeps the group buffer lifetime consistent across validation,
journaling, and cleanup.
Link: https://lkml.kernel.org/r/20260410020209.3786348-1-gality369@gmail.com
Fixes:
|
||
|
|
8f687eeed3 |
ocfs2: validate bg_bits during freefrag scan
[BUG]
A crafted filesystem can trigger an out-of-bounds bitmap walk when
OCFS2_IOC_INFO is issued with OCFS2_INFO_FL_NON_COHERENT.
BUG: KASAN: use-after-free in instrument_atomic_read include/linux/instrumented.h:68 [inline]
BUG: KASAN: use-after-free in _test_bit include/asm-generic/bitops/instrumented-non-atomic.h:141 [inline]
BUG: KASAN: use-after-free in test_bit_le include/asm-generic/bitops/le.h:21 [inline]
BUG: KASAN: use-after-free in ocfs2_info_freefrag_scan_chain fs/ocfs2/ioctl.c:495 [inline]
BUG: KASAN: use-after-free in ocfs2_info_freefrag_scan_bitmap fs/ocfs2/ioctl.c:588 [inline]
BUG: KASAN: use-after-free in ocfs2_info_handle_freefrag fs/ocfs2/ioctl.c:662 [inline]
BUG: KASAN: use-after-free in ocfs2_info_handle_request+0x1c66/0x3370 fs/ocfs2/ioctl.c:754
Read of size 8 at addr ffff888031bce000 by task syz.0.636/1435
Call Trace:
__dump_stack lib/dump_stack.c:94 [inline]
dump_stack_lvl+0xbe/0x130 lib/dump_stack.c:120
print_address_description mm/kasan/report.c:378 [inline]
print_report+0xd1/0x650 mm/kasan/report.c:482
kasan_report+0xfb/0x140 mm/kasan/report.c:595
check_region_inline mm/kasan/generic.c:186 [inline]
kasan_check_range+0x11c/0x200 mm/kasan/generic.c:200
__kasan_check_read+0x11/0x20 mm/kasan/shadow.c:31
instrument_atomic_read include/linux/instrumented.h:68 [inline]
_test_bit include/asm-generic/bitops/instrumented-non-atomic.h:141 [inline]
test_bit_le include/asm-generic/bitops/le.h:21 [inline]
ocfs2_info_freefrag_scan_chain fs/ocfs2/ioctl.c:495 [inline]
ocfs2_info_freefrag_scan_bitmap fs/ocfs2/ioctl.c:588 [inline]
ocfs2_info_handle_freefrag fs/ocfs2/ioctl.c:662 [inline]
ocfs2_info_handle_request+0x1c66/0x3370 fs/ocfs2/ioctl.c:754
ocfs2_info_handle+0x18d/0x2a0 fs/ocfs2/ioctl.c:828
ocfs2_ioctl+0x632/0x6e0 fs/ocfs2/ioctl.c:913
vfs_ioctl fs/ioctl.c:51 [inline]
__do_sys_ioctl fs/ioctl.c:597 [inline]
__se_sys_ioctl fs/ioctl.c:583 [inline]
__x64_sys_ioctl+0x197/0x1e0 fs/ioctl.c:583
...
[CAUSE]
ocfs2_info_freefrag_scan_chain() uses on-disk bg_bits directly as the
bitmap scan limit. The coherent path reads group descriptors through
ocfs2_read_group_descriptor(), which validates the descriptor before
use. The non-coherent path uses ocfs2_read_blocks_sync() instead and
skips that validation, so an impossible bg_bits value can drive the
bitmap walk past the end of the block.
[FIX]
Compute the bitmap capacity from the filesystem format with
ocfs2_group_bitmap_size(), report descriptors whose bg_bits exceeds
that limit, and clamp the scan to the computed capacity. This keeps the
freefrag report going while avoiding reads beyond the buffer.
Link: https://lkml.kernel.org/r/20260410034220.3825769-1-gality369@gmail.com
Fixes:
|
||
|
|
d12f558e62 |
ocfs2: fix listxattr handling when the buffer is full
[BUG] If an OCFS2 inode has both inline and block-based xattrs, listxattr() can return a size larger than the caller's buffer when the inline names consume that buffer exactly. kernel BUG at mm/usercopy.c:102! Oops: invalid opcode: 0000 [#1] SMP KASAN NOPTI RIP: 0010:usercopy_abort+0xb7/0xd0 mm/usercopy.c:102 Call Trace: __check_heap_object+0xe3/0x120 mm/slub.c:8243 check_heap_object mm/usercopy.c:196 [inline] __check_object_size mm/usercopy.c:250 [inline] __check_object_size+0x5c5/0x780 mm/usercopy.c:215 check_object_size include/linux/ucopysize.h:22 [inline] check_copy_size include/linux/ucopysize.h:59 [inline] copy_to_user include/linux/uaccess.h:219 [inline] listxattr+0xb0/0x170 fs/xattr.c:926 filename_listxattr fs/xattr.c:958 [inline] path_listxattrat+0x137/0x320 fs/xattr.c:988 __do_sys_listxattr fs/xattr.c:1001 [inline] __se_sys_listxattr fs/xattr.c:998 [inline] __x64_sys_listxattr+0x7f/0xd0 fs/xattr.c:998 ... [CAUSE] Commit |
||
|
|
6c9340a2ff |
ocfs2: use get_random_u32() where appropriate
Use the typed random integer helpers instead of get_random_bytes() when filling a single integer variable. The helpers return the value directly, require no pointer or size argument, and better express intent. Link: https://lkml.kernel.org/r/20260405154720.4732-1-devnexen@gmail.com Signed-off-by: David Carlier <devnexen@gmail.com Acked-by: Joseph Qi <joseph.qi@linux.alibaba.com> Cc: Mark Fasheh <mark@fasheh.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Changwei Ge <gechangwei@live.cn> Cc: Jun Piao <piaojun@huawei.com> Cc: Heming Zhao <heming.zhao@suse.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> |
||
|
|
d647c5b2fb |
ocfs2: split transactions in dio completion to avoid credit exhaustion
During ocfs2 dio operations, JBD2 may report warnings via following
call trace:
ocfs2_dio_end_io_write
ocfs2_mark_extent_written
ocfs2_change_extent_flag
ocfs2_split_extent
ocfs2_try_to_merge_extent
ocfs2_extend_rotate_transaction
ocfs2_extend_trans
jbd2__journal_restart
start_this_handle
output: JBD2: kworker/6:2 wants too many credits credits:5450 rsv_credits:0 max:5449
To prevent exceeding the credits limit, modify ocfs2_dio_end_io_write() to
handle extents in a batch of transaction.
Additionally, relocate ocfs2_del_inode_from_orphan(). The orphan inode
should only be removed from the orphan list after the extent tree update
is complete. This ensures that if a crash occurs in the middle of extent
tree updates, we won't leave stale blocks beyond EOF.
This patch also changes the logic for updating the inode size and removing
orphan, making it similar to ext4_dio_write_end_io(). Both operations are
performed only when everything looks good.
Finally, thanks to Jans and Joseph for providing the bug fix prototype and
suggestions.
Link: https://lkml.kernel.org/r/20260402134328.27334-2-heming.zhao@suse.com
Signed-off-by: Heming Zhao <heming.zhao@suse.com>
Suggested-by: Jan Kara <jack@suse.cz>
Suggested-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Cc: Mark Fasheh <mark@fasheh.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Changwei Ge <gechangwei@live.cn>
Cc: Jun Piao <piaojun@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
||
|
|
510a750287 |
ocfs2: remove redundant l_next_free_rec check in __ocfs2_find_path()
The l_next_free_rec > l_count check after ocfs2_read_extent_block() in __ocfs2_find_path() is now redundant, as ocfs2_validate_extent_block() already performs this validation at block read time. Remove the duplicate check to avoid maintaining the same validation in two places. Link: https://lkml.kernel.org/r/20260403090803.3860971-5-joseph.qi@linux.alibaba.com Signed-off-by: Joseph Qi <joseph.qi@linux.alibaba.com> Reviewed-by: Heming Zhao <heming.zhao@suse.com> Cc: Changwei Ge <gechangwei@live.cn> Cc: Joel Becker <jlbec@evilplan.org> Cc: Jun Piao <piaojun@huawei.com> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Mark Fasheh <mark@fasheh.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> |
||
|
|
af5e456c0b |
ocfs2: validate extent block list fields during block read
Add extent list validation to ocfs2_validate_extent_block() so that
corrupted on-disk fields are caught early at block read time rather than
during extent tree traversal.
Two checks are added:
- l_count must equal the expected value from
ocfs2_extent_recs_per_eb(), catching blocks with a corrupted record
count before any array iteration.
- l_next_free_rec must not exceed l_count, preventing out-of-bounds
access when iterating over extent records.
Link: https://lkml.kernel.org/r/20260403090803.3860971-4-joseph.qi@linux.alibaba.com
Signed-off-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Reviewed-by: Heming Zhao <heming.zhao@suse.com>
Cc: Changwei Ge <gechangwei@live.cn>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Jun Piao <piaojun@huawei.com>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Mark Fasheh <mark@fasheh.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
||
|
|
4ae9cca37e |
ocfs2: remove empty extent list check in ocfs2_dx_dir_lookup_rec()
The full extent list check is introduced by commit
|
||
|
|
775c17386a |
ocfs2: validate dx_root extent list fields during block read
Patch series "ocfs2: consolidate extent list validation into block read callbacks". ocfs2 validates extent list fields (l_count, l_next_free_rec) at various points during extent tree traversal. This is fragile because each caller must remember to check for corrupted on-disk data before using it. This series moves those checks into the block read validation callbacks (ocfs2_validate_dx_root and ocfs2_validate_extent_block), so corrupted fields are caught early at block read time. Redundant post-read checks are then removed. This patch (of 4): Move the extent list l_count validation from ocfs2_dx_dir_lookup_rec() into ocfs2_validate_dx_root(), so that corrupted on-disk fields are caught early at block read time rather than during directory lookups. Additionally, add a l_next_free_rec <= l_count check to prevent out-of-bounds access when iterating over extent records. Both checks are skipped for inline dx roots (OCFS2_DX_FLAG_INLINE), which use dr_entries instead of dr_list. Link: https://lkml.kernel.org/r/20260403090803.3860971-1-joseph.qi@linux.alibaba.com Link: https://lkml.kernel.org/r/20260403090803.3860971-2-joseph.qi@linux.alibaba.com Signed-off-by: Joseph Qi <joseph.qi@linux.alibaba.com> Reviewed-by: Heming Zhao <heming.zhao@suse.com> Cc: Mark Fasheh <mark@fasheh.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Changwei Ge <gechangwei@live.cn> Cc: Jun Piao <piaojun@huawei.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> |
||
|
|
7de554cabf |
ocfs2: fix use-after-free in ocfs2_fault() when VM_FAULT_RETRY
filemap_fault() may drop the mmap_lock before returning VM_FAULT_RETRY,
as documented in mm/filemap.c:
"If our return value has VM_FAULT_RETRY set, it's because the mmap_lock
may be dropped before doing I/O or by lock_folio_maybe_drop_mmap()."
When this happens, a concurrent munmap() can call remove_vma() and free
the vm_area_struct via RCU. The saved 'vma' pointer in ocfs2_fault() then
becomes a dangling pointer, and the subsequent trace_ocfs2_fault() call
dereferences it -- a use-after-free.
Fix this by saving ip_blkno as a plain integer before calling
filemap_fault(), and removing vma from the trace event. Since
ip_blkno is copied by value before the lock can be dropped, it
remains valid regardless of what happens to the vma or inode
afterward.
Link: https://lkml.kernel.org/r/20260410083816.34951-1-tejas.bharambe@outlook.com
Fixes:
|
||
|
|
4a1c0ddc6e |
ocfs2: handle invalid dinode in ocfs2_group_extend
[BUG]
kernel BUG at fs/ocfs2/resize.c:308!
Oops: invalid opcode: 0000 [#1] SMP KASAN NOPTI
RIP: 0010:ocfs2_group_extend+0x10aa/0x1ae0 fs/ocfs2/resize.c:308
Code: 8b8520ff ffff83f8 860f8580 030000e8 5cc3c1fe
Call Trace:
...
ocfs2_ioctl+0x175/0x6e0 fs/ocfs2/ioctl.c:869
vfs_ioctl fs/ioctl.c:51 [inline]
__do_sys_ioctl fs/ioctl.c:597 [inline]
__se_sys_ioctl fs/ioctl.c:583 [inline]
__x64_sys_ioctl+0x197/0x1e0 fs/ioctl.c:583
x64_sys_call+0x1144/0x26a0 arch/x86/include/generated/asm/syscalls_64.h:17
do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
do_syscall_64+0x93/0xf80 arch/x86/entry/syscall_64.c:94
entry_SYSCALL_64_after_hwframe+0x76/0x7e
...
[CAUSE]
ocfs2_group_extend() assumes that the global bitmap inode block
returned from ocfs2_inode_lock() has already been validated and
BUG_ONs when the signature is not a dinode. That assumption is too
strong for crafted filesystems because the JBD2-managed buffer path
can bypass structural validation and return an invalid dinode to the
resize ioctl.
[FIX]
Validate the dinode explicitly in ocfs2_group_extend(). If the global
bitmap buffer does not contain a valid dinode, report filesystem
corruption with ocfs2_error() and fail the resize operation instead of
crashing the kernel.
Link: https://lkml.kernel.org/r/20260401092303.3709187-1-gality369@gmail.com
Fixes:
|
||
|
|
6110d18e20 |
ocfs2: validate bg_list extent bounds in discontig groups
[BUG]
Running ocfs2 on a corrupted image with a discontiguous block
group whose bg_list.l_next_free_rec is set to an excessively
large value triggers a KASAN use-after-free crash:
BUG: KASAN: use-after-free in ocfs2_bg_discontig_fix_by_rec fs/ocfs2/suballoc.c:1678 [inline]
BUG: KASAN: use-after-free in ocfs2_bg_discontig_fix_result+0x4a4/0x560 fs/ocfs2/suballoc.c:1715
Read of size 4 at addr ffff88801a85f000 by task syz.0.115/552
Call Trace:
...
__asan_report_load4_noabort+0x14/0x30 mm/kasan/report_generic.c:380
ocfs2_bg_discontig_fix_by_rec fs/ocfs2/suballoc.c:1678 [inline]
ocfs2_bg_discontig_fix_result+0x4a4/0x560 fs/ocfs2/suballoc.c:1715
ocfs2_search_one_group fs/ocfs2/suballoc.c:1752 [inline]
ocfs2_claim_suballoc_bits+0x13c3/0x1cd0 fs/ocfs2/suballoc.c:1984
ocfs2_claim_new_inode+0x2e7/0x8a0 fs/ocfs2/suballoc.c:2292
ocfs2_mknod_locked.constprop.0+0x121/0x2a0 fs/ocfs2/namei.c:637
ocfs2_mknod+0xc71/0x2400 fs/ocfs2/namei.c:384
ocfs2_create+0x158/0x390 fs/ocfs2/namei.c:676
lookup_open.isra.0+0x10a1/0x1460 fs/namei.c:3796
open_last_lookups fs/namei.c:3895 [inline]
path_openat+0x11fe/0x2ce0 fs/namei.c:4131
do_filp_open+0x1f6/0x430 fs/namei.c:4161
do_sys_openat2+0x117/0x1c0 fs/open.c:1437
do_sys_open fs/open.c:1452 [inline]
__do_sys_openat fs/open.c:1468 [inline]
...
[CAUSE]
ocfs2_bg_discontig_fix_result() iterates over bg->bg_list.l_recs[]
using l_next_free_rec as the upper bound without any sanity check:
for (i = 0; i < le16_to_cpu(bg->bg_list.l_next_free_rec); i++) {
rec = &bg->bg_list.l_recs[i];
l_next_free_rec is read directly from the on-disk group descriptor and
is trusted blindly. On a 4 KiB block device, bg_list.l_recs[] can hold
at most 235 entries (ocfs2_extent_recs_per_gd(sb)). A corrupted or
crafted filesystem image can set l_next_free_rec to an arbitrarily
large value, causing the loop to index past the end of the group
descriptor buffer_head data page and into an adjacent freed page.
[FIX]
Validate discontiguous bg_list.l_count against
ocfs2_extent_recs_per_gd(sb), then reject l_next_free_rec values that
exceed l_count. This keeps the on-disk extent list self-consistent and
matches how the rest of ocfs2 uses l_count as the extent-list bound.
Link: https://lkml.kernel.org/r/20260401021622.3560952-1-gality369@gmail.com
Signed-off-by: ZhengYuan Huang <gality369@gmail.com>
Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Cc: Mark Fasheh <mark@fasheh.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Changwei Ge <gechangwei@live.cn>
Cc: Jun Piao <piaojun@huawei.com>
Cc: Heming Zhao <heming.zhao@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
||
|
|
5686459423 |
ocfs2/heartbeat: fix slot mapping rollback leaks on error paths
o2hb_map_slot_data() allocates hr_tmp_block, hr_slots, hr_slot_data, and pages in stages. If a later allocation fails, the current code returns without unwinding the earlier allocations. o2hb_region_dev_store() also leaves slot mapping resources behind when setup aborts, and it keeps hr_aborted_start/hr_node_deleted set across retries. That leaves stale state behind after a failed start. Factor the slot cleanup into o2hb_unmap_slot_data(), use it from both o2hb_map_slot_data() and o2hb_region_release(), and call it from the dev_store() rollback after stopping a started heartbeat thread. While freeing pages, clear each hr_slot_data entry as it is released, and reset the start state before each new setup attempt. This closes the slot mapping leak on allocation/setup failure paths and keeps failed setup attempts retryable. Link: https://lkml.kernel.org/r/20260330153428.19586-1-yufan.chen@linux.dev Signed-off-by: Yufan Chen <ericterminal@gmail.com> Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com> Cc: Mark Fasheh <mark@fasheh.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Changwei Ge <gechangwei@live.cn> Cc: Jun Piao <piaojun@huawei.com> Cc: Heming Zhao <heming.zhao@suse.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> |
||
|
|
fc825e513c |
vfs-7.1-rc1.bh.metadata
Please consider pulling these changes from the signed vfs-7.1-rc1.bh.metadata tag. Thanks! Christian -----BEGIN PGP SIGNATURE----- iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCadjZCwAKCRCRxhvAZXjc os67AQCd65HW/XVVw01846OH5Cqw7vFYBa7HipkQPebX3NjCPgEAm4w8ywqKUe5o rRLkZVIDBgkMGhH7Af+y2Ru5WZZiLwc= =G3WM -----END PGP SIGNATURE----- Merge tag 'vfs-7.1-rc1.bh.metadata' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs Pull vfs buffer_head updates from Christian Brauner: "This cleans up the mess that has accumulated over the years in metadata buffer_head tracking for inodes. It moves the tracking into dedicated structure in filesystem-private part of the inode (so that we don't use private_list, private_data, and private_lock in struct address_space), and also moves couple other users of private_data and private_list so these are removed from struct address_space saving 3 longs in struct inode for 99% of inodes" * tag 'vfs-7.1-rc1.bh.metadata' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: (42 commits) fs: Drop i_private_list from address_space fs: Drop mapping_metadata_bhs from address space ext4: Track metadata bhs in fs-private inode part minix: Track metadata bhs in fs-private inode part udf: Track metadata bhs in fs-private inode part fat: Track metadata bhs in fs-private inode part bfs: Track metadata bhs in fs-private inode part affs: Track metadata bhs in fs-private inode part ext2: Track metadata bhs in fs-private inode part fs: Provide functions for handling mapping_metadata_bhs directly fs: Switch inode_has_buffers() to take mapping_metadata_bhs fs: Make bhs point to mapping_metadata_bhs fs: Move metadata bhs tracking to a separate struct fs: Fold fsync_buffers_list() into sync_mapping_buffers() fs: Drop osync_buffers_list() kvm: Use private inode list instead of i_private_list fs: Remove i_private_data aio: Stop using i_private_data and i_private_lock hugetlbfs: Stop using i_private_data fs: Stop using i_private_data for metadata bh tracking ... |
||
|
|
b7d74ea0fd |
vfs-7.1-rc1.kino
Please consider pulling these changes from the signed vfs-7.1-rc1.kino tag. Thanks! Christian -----BEGIN PGP SIGNATURE----- iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCadjZCgAKCRCRxhvAZXjc otmnAP4sbsxZQdz2TG2hJuOwnEZOkkxZQOUMc3ERVyZaWXIeTAEA7e5M+8FpoG9n 8ipO76UoaXdGLESrqVdp9EOhLqOW7QY= =uMeJ -----END PGP SIGNATURE----- Merge tag 'vfs-7.1-rc1.kino' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs Pull vfs i_ino updates from Christian Brauner: "For historical reasons, the inode->i_ino field is an unsigned long, which means that it's 32 bits on 32 bit architectures. This has caused a number of filesystems to implement hacks to hash a 64-bit identifier into a 32-bit field, and deprives us of a universal identifier field for an inode. This changes the inode->i_ino field from an unsigned long to a u64. This shouldn't make any material difference on 64-bit hosts, but 32-bit hosts will see struct inode grow by at least 4 bytes. This could have effects on slabcache sizes and field alignment. The bulk of the changes are to format strings and tracepoints, since the kernel itself doesn't care that much about the i_ino field. The first patch changes some vfs function arguments, so check that one out carefully. With this change, we may be able to shrink some inode structures. For instance, struct nfs_inode has a fileid field that holds the 64-bit inode number. With this set of changes, that field could be eliminated. I'd rather leave that sort of cleanups for later just to keep this simple" * tag 'vfs-7.1-rc1.kino' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: nilfs2: fix 64-bit division operations in nilfs_bmap_find_target_in_group() EVM: add comment describing why ino field is still unsigned long vfs: remove externs from fs.h on functions modified by i_ino widening treewide: fix missed i_ino format specifier conversions ext4: fix signed format specifier in ext4_load_inode trace event treewide: change inode->i_ino from unsigned long to u64 nilfs2: widen trace event i_ino fields to u64 f2fs: widen trace event i_ino fields to u64 ext4: widen trace event i_ino fields to u64 zonefs: widen trace event i_ino fields to u64 hugetlbfs: widen trace event i_ino fields to u64 ext2: widen trace event i_ino fields to u64 cachefiles: widen trace event i_ino fields to u64 vfs: widen trace event i_ino fields to u64 net: change sock.sk_ino and sock_i_ino() to u64 audit: widen ino fields to u64 vfs: widen inode hash/lookup functions to u64 |
||
|
|
be81084e03 |
ocfs2: use jbd2 jinode dirty range accessor
ocfs2 journal commit callback reads jbd2_inode dirty range fields without holding journal->j_list_lock. Use jbd2_jinode_get_dirty_range() to get the range in bytes. Suggested-by: Jan Kara <jack@suse.cz> Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Li Chen <me@linux.beauty> Link: https://patch.msgid.link/20260306085643.465275-4-me@linux.beauty Signed-off-by: Theodore Ts'o <tytso@mit.edu> |
||
|
|
7bc5da4842 |
ocfs2: fix out-of-bounds write in ocfs2_write_end_inline
KASAN reports a use-after-free write of 4086 bytes in
ocfs2_write_end_inline, called from ocfs2_write_end_nolock during a
copy_file_range splice fallback on a corrupted ocfs2 filesystem mounted on
a loop device. The actual bug is an out-of-bounds write past the inode
block buffer, not a true use-after-free. The write overflows into an
adjacent freed page, which KASAN reports as UAF.
The root cause is that ocfs2_try_to_write_inline_data trusts the on-disk
id_count field to determine whether a write fits in inline data. On a
corrupted filesystem, id_count can exceed the physical maximum inline data
capacity, causing writes to overflow the inode block buffer.
Call trace (crash path):
vfs_copy_file_range (fs/read_write.c:1634)
do_splice_direct
splice_direct_to_actor
iter_file_splice_write
ocfs2_file_write_iter
generic_perform_write
ocfs2_write_end
ocfs2_write_end_nolock (fs/ocfs2/aops.c:1949)
ocfs2_write_end_inline (fs/ocfs2/aops.c:1915)
memcpy_from_folio <-- KASAN: write OOB
So add id_count upper bound check in ocfs2_validate_inode_block() to
alongside the existing i_size check to fix it.
Link: https://lkml.kernel.org/r/20260403063830.3662739-1-joseph.qi@linux.alibaba.com
Signed-off-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Reported-by: syzbot+62c1793956716ea8b28a@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=62c1793956716ea8b28a
Cc: Mark Fasheh <mark@fasheh.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Changwei Ge <gechangwei@live.cn>
Cc: Jun Piao <piaojun@huawei.com>
Cc: Heming Zhao <heming.zhao@suse.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
||
|
|
ab5193e919 |
fs: remove unncessary pagevec.h includes
Remove unused pagevec.h includes from .c files. These were found with the following command: grep -rl '#include.*pagevec\.h' --include='*.c' | while read f; do grep -qE 'PAGEVEC_SIZE|folio_batch' "$f" || echo "$f" done There are probably more removal candidates in .h files, but those are more complex to analyze. Link: https://lkml.kernel.org/r/20260225-pagevec_cleanup-v2-2-716868cc2d11@columbia.edu Signed-off-by: Tal Zussman <tz2294@columbia.edu> Reviewed-by: Jan Kara <jack@suse.cz> Acked-by: Zi Yan <ziy@nvidia.com> Acked-by: Chris Li <chrisl@kernel.org> Reviewed-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org> Cc: Christian Brauner <brauner@kernel.org> Cc: David Hildenbrand (Arm) <david@kernel.org> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> |
||
|
|
408d8af01f |
for_each_alias(): helper macro for iterating through dentries of given inode
Most of the places using d_alias are loops iterating through all aliases for given inode; introduce a helper macro (for_each_alias(dentry, inode)) and convert open-coded instances of such loop to it. They are easier to read that way and it reduces the noise on the next steps. You _must_ hold inode->i_lock over that thing. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> |
||
|
|
01b61e8dda |
ocfs2/dlm: fix off-by-one in dlm_match_regions() region comparison
The local-vs-remote region comparison loop uses '<=' instead of '<',
causing it to read one entry past the valid range of qr_regions. The
other loops in the same function correctly use '<'.
Fix the loop condition to use '<' for consistency and correctness.
Link: https://lkml.kernel.org/r/SYBPR01MB78813DA26B50EC5E01F00566AF7BA@SYBPR01MB7881.ausprd01.prod.outlook.com
Fixes:
|
||
|
|
7ab3fbb01b |
ocfs2/dlm: validate qr_numregions in dlm_match_regions()
Patch series "ocfs2/dlm: fix two bugs in dlm_match_regions()".
In dlm_match_regions(), the qr_numregions field from a DLM_QUERY_REGION
network message is used to drive loops over the qr_regions buffer without
sufficient validation. This series fixes two issues:
- Patch 1 adds a bounds check to reject messages where qr_numregions
exceeds O2NM_MAX_REGIONS. The o2net layer only validates message
byte length; it does not constrain field values, so a crafted message
can set qr_numregions up to 255 and trigger out-of-bounds reads past
the 1024-byte qr_regions buffer.
- Patch 2 fixes an off-by-one in the local-vs-remote comparison loop,
which uses '<=' instead of '<', reading one entry past the valid range
even when qr_numregions is within bounds.
This patch (of 2):
The qr_numregions field from a DLM_QUERY_REGION network message is used
directly as loop bounds in dlm_match_regions() without checking against
O2NM_MAX_REGIONS. Since qr_regions is sized for at most O2NM_MAX_REGIONS
(32) entries, a crafted message with qr_numregions > 32 causes
out-of-bounds reads past the qr_regions buffer.
Add a bounds check for qr_numregions before entering the loops.
Link: https://lkml.kernel.org/r/SYBPR01MB7881A334D02ACEE5E0645801AF7BA@SYBPR01MB7881.ausprd01.prod.outlook.com
Link: https://lkml.kernel.org/r/SYBPR01MB788166F524AD04E262E174BEAF7BA@SYBPR01MB7881.ausprd01.prod.outlook.com
Fixes:
|
||
|
|
7aa89307fc |
ocfs2: remove redundant error code assignment
Remove the error assignment for variable 'ret' during correct code execution. In subsequent execution, variable 'ret' is overwritten. Found by Linux Verification Center (linuxtesting.org) with SVACE. Link: https://lkml.kernel.org/r/20260307234809.88421-1-a.velichayshiy@ispras.ru Signed-off-by: Alexey Velichayshiy <a.velichayshiy@ispras.ru> Acked-by: Joseph Qi <joseph.qi@linux.alibaba.com> Cc: Mark Fasheh <mark@fasheh.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Changwei Ge <gechangwei@live.cn> Cc: Jun Piao <piaojun@huawei.com> Cc: Heming Zhao <heming.zhao@suse.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> |
||
|
|
b02da26a99 |
ocfs2: fix possible deadlock between unlink and dio_end_io_write
ocfs2_unlink takes orphan dir inode_lock first and then ip_alloc_sem,
while in ocfs2_dio_end_io_write, it acquires these locks in reverse order.
This creates an ABBA lock ordering violation on lock classes
ocfs2_sysfile_lock_key[ORPHAN_DIR_SYSTEM_INODE] and
ocfs2_file_ip_alloc_sem_key.
Lock Chain #0 (orphan dir inode_lock -> ip_alloc_sem):
ocfs2_unlink
ocfs2_prepare_orphan_dir
ocfs2_lookup_lock_orphan_dir
inode_lock(orphan_dir_inode) <- lock A
__ocfs2_prepare_orphan_dir
ocfs2_prepare_dir_for_insert
ocfs2_extend_dir
ocfs2_expand_inline_dir
down_write(&oi->ip_alloc_sem) <- Lock B
Lock Chain #1 (ip_alloc_sem -> orphan dir inode_lock):
ocfs2_dio_end_io_write
down_write(&oi->ip_alloc_sem) <- Lock B
ocfs2_del_inode_from_orphan()
inode_lock(orphan_dir_inode) <- Lock A
Deadlock Scenario:
CPU0 (unlink) CPU1 (dio_end_io_write)
------ ------
inode_lock(orphan_dir_inode)
down_write(ip_alloc_sem)
down_write(ip_alloc_sem)
inode_lock(orphan_dir_inode)
Since ip_alloc_sem is to protect allocation changes, which is unrelated
with operations in ocfs2_del_inode_from_orphan. So move
ocfs2_del_inode_from_orphan out of ip_alloc_sem to fix the deadlock.
Link: https://lkml.kernel.org/r/20260306032211.1016452-1-joseph.qi@linux.alibaba.com
Reported-by: syzbot+67b90111784a3eac8c04@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=67b90111784a3eac8c04
Fixes:
|
||
|
|
19aa667ace |
ocfs2: fix deadlock when creating quota file
syzbot detected a circular locking dependency. the scenarios:
CPU0 CPU1
---- ----
lock(&ocfs2_quota_ip_alloc_sem_key);
lock(&ocfs2_sysfile_lock_key[USER_QUOTA_SYSTEM_INODE]);
lock(&ocfs2_quota_ip_alloc_sem_key);
lock(&ocfs2_sysfile_lock_key[ORPHAN_DIR_SYSTEM_INODE]);
or:
CPU0 CPU1
---- ----
lock(&ocfs2_quota_ip_alloc_sem_key);
lock(&dquot->dq_lock);
lock(&ocfs2_quota_ip_alloc_sem_key);
lock(&ocfs2_sysfile_lock_key[ORPHAN_DIR_SYSTEM_INODE]);
Following are the code paths for above scenarios:
path_openat
ocfs2_create
ocfs2_mknod
+ ocfs2_reserve_new_inode
| ocfs2_reserve_suballoc_bits
| inode_lock(alloc_inode) //C0: hold INODE_ALLOC_SYSTEM_INODE
| //ocfs2_free_alloc_context(inode_ac) is called at the end of
| //caller ocfs2_mknod to handle the release
|
+ ocfs2_get_init_inode
__dquot_initialize
dqget
ocfs2_acquire_dquot
+ ocfs2_lock_global_qf
| down_write(&OCFS2_I(oinfo->dqi_gqinode)->ip_alloc_sem)//A2:grabbing
+ ocfs2_create_local_dquot
down_write(&OCFS2_I(lqinode)->ip_alloc_sem)//A3:grabbing
evict
ocfs2_evict_inode
ocfs2_delete_inode
ocfs2_wipe_inode
+ inode_lock(orphan_dir_inode) //B0:hold
+ ...
+ ocfs2_remove_inode
inode_lock(inode_alloc_inode) //INODE_ALLOC_SYSTEM_INODE
down_write(&inode->i_rwsem) //C1:grabbing
generic_file_direct_write
ocfs2_direct_IO
__blockdev_direct_IO
dio_complete
ocfs2_dio_end_io
ocfs2_dio_end_io_write
+ down_write(&oi->ip_alloc_sem) //A0:hold
+ ocfs2_del_inode_from_orphan
inode_lock(orphan_dir_inode) //B1:grabbing
Root cause for the circular locking:
DIO completion path:
holds oi->ip_alloc_sem and is trying to acquire the orphan_dir_inode lock.
evict path:
holds the orphan_dir_inode lock and is trying to acquire the
inode_alloc_inode lock.
ocfs2_mknod path:
Holds the inode_alloc_inode lock (to allocate a new quota file) and is
blocked waiting for oi->ip_alloc_sem in ocfs2_acquire_dquot().
How to fix:
Replace down_write() with down_write_trylock() in ocfs2_acquire_dquot().
If acquiring oi->ip_alloc_sem fails, return -EBUSY to abort the file
creation routine and break the deadlock.
Link: https://lkml.kernel.org/r/20260302061707.7092-1-heming.zhao@suse.com
Signed-off-by: Heming Zhao <heming.zhao@suse.com>
Reported-by: syzbot+78359d5fbb04318c35e9@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=78359d5fbb04318c35e9
Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Cc: Mark Fasheh <mark@fasheh.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Changwei Ge <gechangwei@live.cn>
Cc: Jun Piao <piaojun@huawei.com>
Cc: Heming Zhao <heming.zhao@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
||
|
|
70450fcfd2
|
ocfs2: Drop pointless sync_mapping_buffers() calls
ocfs2 never calls mark_buffer_dirty_inode() and thus its metadata buffers list is always empty. Drop the pointless sync_mapping_buffers() calls. CC: Joel Becker <jlbec@evilplan.org> CC: Joseph Qi <joseph.qi@linux.alibaba.com> CC: ocfs2-devel@lists.linux.dev Signed-off-by: Jan Kara <jack@suse.cz> Link: https://patch.msgid.link/20260326095354.16340-46-jack@suse.cz Tested-by: syzbot@syzkaller.appspotmail.com Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com> Signed-off-by: Christian Brauner <brauner@kernel.org> |
||
|
|
0b2600f81c
|
treewide: change inode->i_ino from unsigned long to u64
On 32-bit architectures, unsigned long is only 32 bits wide, which causes 64-bit inode numbers to be silently truncated. Several filesystems (NFS, XFS, BTRFS, etc.) can generate inode numbers that exceed 32 bits, and this truncation can lead to inode number collisions and other subtle bugs on 32-bit systems. Change the type of inode->i_ino from unsigned long to u64 to ensure that inode numbers are always represented as 64-bit values regardless of architecture. Update all format specifiers treewide from %lu/%lx to %llu/%llx to match the new type, along with corresponding local variable types. This is the bulk treewide conversion. Earlier patches in this series handled trace events separately to allow trace field reordering for better struct packing on 32-bit. Signed-off-by: Jeff Layton <jlayton@kernel.org> Link: https://patch.msgid.link/20260304-iino-u64-v3-12-2257ad83d372@kernel.org Acked-by: Damien Le Moal <dlemoal@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Christian Brauner <brauner@kernel.org> |
||
|
|
32a92f8c89 |
Convert more 'alloc_obj' cases to default GFP_KERNEL arguments
This converts some of the visually simpler cases that have been split over multiple lines. I only did the ones that are easy to verify the resulting diff by having just that final GFP_KERNEL argument on the next line. Somebody should probably do a proper coccinelle script for this, but for me the trivial script actually resulted in an assertion failure in the middle of the script. I probably had made it a bit _too_ trivial. So after fighting that far a while I decided to just do some of the syntactically simpler cases with variations of the previous 'sed' scripts. The more syntactically complex multi-line cases would mostly really want whitespace cleanup anyway. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
||
|
|
323bbfcf1e |
Convert 'alloc_flex' family to use the new default GFP_KERNEL argument
This is the exact same thing as the 'alloc_obj()' version, only much smaller because there are a lot fewer users of the *alloc_flex() interface. As with alloc_obj() version, this was done entirely with mindless brute force, using the same script, except using 'flex' in the pattern rather than 'objs*'. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
||
|
|
bf4afc53b7 |
Convert 'alloc_obj' family to use the new default GFP_KERNEL argument
This was done entirely with mindless brute force, using
git grep -l '\<k[vmz]*alloc_objs*(.*, GFP_KERNEL)' |
xargs sed -i 's/\(alloc_objs*(.*\), GFP_KERNEL)/\1)/'
to convert the new alloc_obj() users that had a simple GFP_KERNEL
argument to just drop that argument.
Note that due to the extreme simplicity of the scripting, any slightly
more complex cases spread over multiple lines would not be triggered:
they definitely exist, but this covers the vast bulk of the cases, and
the resulting diff is also then easier to check automatically.
For the same reason the 'flex' versions will be done as a separate
conversion.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
||
|
|
69050f8d6d |
treewide: Replace kmalloc with kmalloc_obj for non-scalar types
This is the result of running the Coccinelle script from scripts/coccinelle/api/kmalloc_objs.cocci. The script is designed to avoid scalar types (which need careful case-by-case checking), and instead replace kmalloc-family calls that allocate struct or union object instances: Single allocations: kmalloc(sizeof(TYPE), ...) are replaced with: kmalloc_obj(TYPE, ...) Array allocations: kmalloc_array(COUNT, sizeof(TYPE), ...) are replaced with: kmalloc_objs(TYPE, COUNT, ...) Flex array allocations: kmalloc(struct_size(PTR, FAM, COUNT), ...) are replaced with: kmalloc_flex(*PTR, FAM, COUNT, ...) (where TYPE may also be *VAR) The resulting allocations no longer return "void *", instead returning "TYPE *". Signed-off-by: Kees Cook <kees@kernel.org> |
||
|
|
136114e0ab |
mm.git review status for linus..mm-nonmm-stable
Total patches: 107 Reviews/patch: 1.07 Reviewed rate: 67% - The 2 patch series "ocfs2: give ocfs2 the ability to reclaim suballocator free bg" from Heming Zhao saves disk space by teaching ocfs2 to reclaim suballocator block group space. - The 4 patch series "Add ARRAY_END(), and use it to fix off-by-one bugs" from Alejandro Colomar adds the ARRAY_END() macro and uses it in various places. - The 2 patch series "vmcoreinfo: support VMCOREINFO_BYTES larger than PAGE_SIZE" from Pnina Feder makes the vmcore code future-safe, if VMCOREINFO_BYTES ever exceeds the page size. - The 7 patch series "kallsyms: Prevent invalid access when showing module buildid" from Petr Mladek cleans up kallsyms code related to module buildid and fixes an invalid access crash when printing backtraces. - The 3 patch series "Address page fault in ima_restore_measurement_list()" from Harshit Mogalapalli fixes a kexec-related crash that can occur when booting the second-stage kernel on x86. - The 6 patch series "kho: ABI headers and Documentation updates" from Mike Rapoport updates the kexec handover ABI documentation. - The 4 patch series "Align atomic storage" from Finn Thain adds the __aligned attribute to atomic_t and atomic64_t definitions to get natural alignment of both types on csky, m68k, microblaze, nios2, openrisc and sh. - The 2 patch series "kho: clean up page initialization logic" from Pratyush Yadav simplifies the page initialization logic in kho_restore_page(). - The 6 patch series "Unload linux/kernel.h" from Yury Norov moves several things out of kernel.h and into more appropriate places. - The 7 patch series "don't abuse task_struct.group_leader" from Oleg Nesterov removes the usage of ->group_leader when it is "obviously unnecessary". - The 5 patch series "list private v2 & luo flb" from Pasha Tatashin adds some infrastructure improvements to the live update orchestrator. -----BEGIN PGP SIGNATURE----- iHUEABYKAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCaY4giAAKCRDdBJ7gKXxA jgusAQDnKkP8UWTqXPC1jI+OrDJGU5ciAx8lzLeBVqMKzoYk9AD/TlhT2Nlx+Ef6 0HCUHUD0FMvAw/7/Dfc6ZKxwBEIxyww= =mmsH -----END PGP SIGNATURE----- Merge tag 'mm-nonmm-stable-2026-02-12-10-48' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull non-MM updates from Andrew Morton: - "ocfs2: give ocfs2 the ability to reclaim suballocator free bg" saves disk space by teaching ocfs2 to reclaim suballocator block group space (Heming Zhao) - "Add ARRAY_END(), and use it to fix off-by-one bugs" adds the ARRAY_END() macro and uses it in various places (Alejandro Colomar) - "vmcoreinfo: support VMCOREINFO_BYTES larger than PAGE_SIZE" makes the vmcore code future-safe, if VMCOREINFO_BYTES ever exceeds the page size (Pnina Feder) - "kallsyms: Prevent invalid access when showing module buildid" cleans up kallsyms code related to module buildid and fixes an invalid access crash when printing backtraces (Petr Mladek) - "Address page fault in ima_restore_measurement_list()" fixes a kexec-related crash that can occur when booting the second-stage kernel on x86 (Harshit Mogalapalli) - "kho: ABI headers and Documentation updates" updates the kexec handover ABI documentation (Mike Rapoport) - "Align atomic storage" adds the __aligned attribute to atomic_t and atomic64_t definitions to get natural alignment of both types on csky, m68k, microblaze, nios2, openrisc and sh (Finn Thain) - "kho: clean up page initialization logic" simplifies the page initialization logic in kho_restore_page() (Pratyush Yadav) - "Unload linux/kernel.h" moves several things out of kernel.h and into more appropriate places (Yury Norov) - "don't abuse task_struct.group_leader" removes the usage of ->group_leader when it is "obviously unnecessary" (Oleg Nesterov) - "list private v2 & luo flb" adds some infrastructure improvements to the live update orchestrator (Pasha Tatashin) * tag 'mm-nonmm-stable-2026-02-12-10-48' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (107 commits) watchdog/hardlockup: simplify perf event probe and remove per-cpu dependency procfs: fix missing RCU protection when reading real_parent in do_task_stat() watchdog/softlockup: fix sample ring index wrap in need_counting_irqs() kcsan, compiler_types: avoid duplicate type issues in BPF Type Format kho: fix doc for kho_restore_pages() tests/liveupdate: add in-kernel liveupdate test liveupdate: luo_flb: introduce File-Lifecycle-Bound global state liveupdate: luo_file: Use private list list: add kunit test for private list primitives list: add primitives for private list manipulations delayacct: fix uapi timespec64 definition panic: add panic_force_cpu= parameter to redirect panic to a specific CPU netclassid: use thread_group_leader(p) in update_classid_task() RDMA/umem: don't abuse current->group_leader drm/pan*: don't abuse current->group_leader drm/amd: kill the outdated "Only the pthreads threading model is supported" checks drm/amdgpu: don't abuse current->group_leader android/binder: use same_thread_group(proc->tsk, current) in binder_mmap() android/binder: don't abuse current->group_leader kho: skip memoryless NUMA nodes when reserving scratch areas ... |
||
|
|
5138c936c2 |
ocfs2: fix reflink preserve cleanup issue
commit |
||
|
|
c62e7e6444 |
ocfs2: add check for free bits before allocation in ocfs2_move_extent()
Add a check to verify the group descriptor has enough free bits before attempting allocation in ocfs2_move_extent(). This prevents a kernel BUG_ON crash in ocfs2_block_group_set_bits() when the move_extents ioctl is called on a crafted or corrupted filesystem. The existing validation in ocfs2_validate_gd_self() only checks static metadata consistency (bg_free_bits_count <= bg_bits) when the descriptor is first read from disk. However, during move_extents operations, multiple allocations can exhaust the free bits count below the requested allocation size, triggering BUG_ON(le16_to_cpu(bg->bg_free_bits_count) < num_bits). The debug trace shows the issue clearly: - Block group 32 validated with bg_free_bits_count=427 - Repeated allocations decreased count: 427 -> 171 -> 43 -> ... -> 1 - Final request for 2 bits with only 1 available triggers BUG_ON By adding an early check in ocfs2_move_extent() right after ocfs2_find_victim_alloc_group(), we return -ENOSPC gracefully instead of crashing the kernel. This also avoids unnecessary work in ocfs2_probe_alloc_group() and __ocfs2_move_extent() when the allocation will fail. Link: https://lkml.kernel.org/r/20260104133504.14810-1-kartikey406@gmail.com Signed-off-by: Deepanshu Kartikey <kartikey406@gmail.com> Reported-by: syzbot+7960178e777909060224@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=7960178e777909060224 Link: https://lore.kernel.org/all/20251231115801.293726-1-kartikey406@gmail.com/T/ [v1] Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com> Cc: Mark Fasheh <mark@fasheh.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Changwei Ge <gechangwei@live.cn> Cc: Jun Piao <piaojun@huawei.com> Cc: Heming Zhao <heming.zhao@suse.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> |
||
|
|
77983f611f |
ocfs2: adjust function name reference
There is no function dlm_mast_regions(). However, dlm_match_regions() is passed the buffer "local", which it uses internally, so it seems like dlm_match_regions() was intended. Link: https://lkml.kernel.org/r/20251230142513.95467-1-Julia.Lawall@inria.fr Signed-off-by: Julia Lawall <Julia.Lawall@inria.fr> Acked-by: Joseph Qi <joseph.qi@linux.alibaba.com> Cc: Mark Fasheh <mark@fasheh.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Changwei Ge <gechangwei@live.cn> Cc: Jun Piao <piaojun@huawei.com> Cc: Heming Zhao <heming.zhao@suse.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> |
||
|
|
29300f929e |
ocfs2: annotate more flexible array members with __counted_by_le()
Annotate flexible array members of 'struct ocfs2_local_alloc' and 'struct ocfs2_inline_data' with '__counted_by_le()' attribute to improve array bounds checking when CONFIG_UBSAN_BOUNDS is enabled, and prefer the convenient 'memset()' over an explicit loop to simplify 'ocfs2_clear_local_alloc()'. Link: https://lkml.kernel.org/r/20251021105518.119953-1-dmantipov@yandex.ru Signed-off-by: Dmitry Antipov <dmantipov@yandex.ru> Reviewed-by: Heming Zhao <heming.zhao@suse.com> Cc: Mark Fasheh <mark@fasheh.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Joseph Qi <jiangqi903@gmail.com> Cc: Changwei Ge <gechangwei@live.cn> Cc: Jun Piao <piaojun@huawei.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> |
||
|
|
e0b0f2834c |
ocfs2: fix oob in __ocfs2_find_path
syzbot constructed a corrupted image, which resulted in el->l_count from the b-tree extent block being 0. Since the length of the l_recs array depends on l_count, reading its member e_blkno triggered the out-of-bounds access reported by syzbot in [1]. The loop terminates when l_count is 0, similar to when next_free is 0. [1] UBSAN: array-index-out-of-bounds in fs/ocfs2/alloc.c:1838:11 index 0 is out of range for type 'struct ocfs2_extent_rec[] __counted_by(l_count)' (aka 'struct ocfs2_extent_rec[]') Call Trace: __ocfs2_find_path+0x606/0xa40 fs/ocfs2/alloc.c:1838 ocfs2_find_leaf+0xab/0x1c0 fs/ocfs2/alloc.c:1946 ocfs2_get_clusters_nocache+0x172/0xc60 fs/ocfs2/extent_map.c:418 ocfs2_get_clusters+0x505/0xa70 fs/ocfs2/extent_map.c:631 ocfs2_extent_map_get_blocks+0x202/0x6a0 fs/ocfs2/extent_map.c:678 ocfs2_read_virt_blocks+0x286/0x930 fs/ocfs2/extent_map.c:1001 ocfs2_read_dir_block fs/ocfs2/dir.c:521 [inline] ocfs2_find_entry_el fs/ocfs2/dir.c:728 [inline] ocfs2_find_entry+0x3e4/0x2090 fs/ocfs2/dir.c:1120 ocfs2_find_files_on_disk+0xdf/0x310 fs/ocfs2/dir.c:2023 ocfs2_lookup_ino_from_name+0x52/0x100 fs/ocfs2/dir.c:2045 _ocfs2_get_system_file_inode fs/ocfs2/sysfile.c:136 [inline] ocfs2_get_system_file_inode+0x326/0x770 fs/ocfs2/sysfile.c:112 ocfs2_init_global_system_inodes+0x319/0x660 fs/ocfs2/super.c:461 ocfs2_initialize_super fs/ocfs2/super.c:2196 [inline] ocfs2_fill_super+0x4432/0x65b0 fs/ocfs2/super.c:993 get_tree_bdev_flags+0x40e/0x4d0 fs/super.c:1691 vfs_get_tree+0x92/0x2a0 fs/super.c:1751 fc_mount fs/namespace.c:1199 [inline] Link: https://lkml.kernel.org/r/tencent_4D99464FA28D9225BE0DBA923F5DF6DD8C07@qq.com Signed-off-by: Edward Adam Davis <eadavis@qq.com> Reported-by: syzbot+151afab124dfbc5f15e6@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=151afab124dfbc5f15e6 Reviewed-by: Heming Zhao <heming.zhao@suse.com> Acked-by: Joseph Qi <joseph.qi@linux.alibaba.com> Cc: Mark Fasheh <mark@fasheh.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Changwei Ge <gechangwei@live.cn> Cc: Jun Piao <piaojun@huawei.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> |
||
|
|
4e9f69c062 |
ocfs2: add validate function for slot map blocks
When the filesystem is being mounted, the kernel panics while the data regarding slot map allocation to the local node, is being written to the disk. This occurs because the value of slot map buffer head block number, which should have been greater than or equal to `OCFS2_SUPER_BLOCK_BLKNO` (evaluating to 2) is less than it, indicative of disk metadata corruption. This triggers BUG_ON(bh->b_blocknr < OCFS2_SUPER_BLOCK_BLKNO) in ocfs2_write_block(), causing the kernel to panic. This is fixed by introducing function ocfs2_validate_slot_map_block() to validate slot map blocks. It first checks if the buffer head passed to it is up to date and valid, else it panics the kernel at that point itself. Further, it contains an if condition block, which checks if `bh->b_blocknr` is lesser than `OCFS2_SUPER_BLOCK_BLKNO`; if yes, then ocfs2_error is called, which prints the error log, for debugging purposes, and the return value of ocfs2_error() is returned. If the if condition is false, value 0 is returned by ocfs2_validate_slot_map_block(). This function is used as validate function in calls to ocfs2_read_blocks() in ocfs2_refresh_slot_info() and ocfs2_map_slot_buffers(). Link: https://lkml.kernel.org/r/20251215184600.13147-1-activprithvi@gmail.com Signed-off-by: Prithvi Tambewagh <activprithvi@gmail.com> Reported-by: syzbot+c818e5c4559444f88aa0@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=c818e5c4559444f88aa0 Tested-by: <syzbot+c818e5c4559444f88aa0@syzkaller.appspotmail.com> Reviewed-by: Heming Zhao <heming.zhao@suse.com> Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com> Cc: Mark Fasheh <mark@fasheh.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Changwei Ge <gechangwei@live.cn> Cc: Jun Piao <piaojun@huawei.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> |
||
|
|
d3cd8de2e1 |
ocfs2: adjust ocfs2_xa_remove_entry() to match UBSAN boundary checks
After introducing
|
||
|
|
1524af3685 |
ocfs2: validate inline data i_size during inode read
When reading an inode from disk, ocfs2_validate_inode_block() performs various sanity checks but does not validate the size of inline data. If the filesystem is corrupted, an inode's i_size can exceed the actual inline data capacity (id_count). This causes ocfs2_dir_foreach_blk_id() to iterate beyond the inline data buffer, triggering a use-after-free when accessing directory entries from freed memory. In the syzbot report: - i_size was 1099511627576 bytes (~1TB) - Actual inline data capacity (id_count) is typically <256 bytes - A garbage rec_len (54648) caused ctx->pos to jump out of bounds - This triggered a UAF in ocfs2_check_dir_entry() Fix by adding a validation check in ocfs2_validate_inode_block() to ensure inodes with inline data have i_size <= id_count. This catches the corruption early during inode read and prevents all downstream code from operating on invalid data. Link: https://lkml.kernel.org/r/20251212052132.16750-1-kartikey406@gmail.com Signed-off-by: Deepanshu Kartikey <kartikey406@gmail.com> Reported-by: syzbot+c897823f699449cc3eb4@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=c897823f699449cc3eb4 Tested-by: syzbot+c897823f699449cc3eb4@syzkaller.appspotmail.com Link: https://lore.kernel.org/all/20251211115231.3560028-1-kartikey406@gmail.com/T/ [v1] Link: https://lore.kernel.org/all/20251212040400.6377-1-kartikey406@gmail.com/T/ [v2] Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com> Cc: Mark Fasheh <mark@fasheh.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Changwei Ge <gechangwei@live.cn> Cc: Jun Piao <piaojun@huawei.com> Cc: Heming Zhao <heming.zhao@suse.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> |
||
|
|
688dab01c3 |
ocfs2: validate i_refcount_loc when refcount flag is set
Add validation in ocfs2_validate_inode_block() to check that if an inode has OCFS2_HAS_REFCOUNT_FL set, it must also have a valid i_refcount_loc. A corrupted filesystem image can have this inconsistent state, which later triggers a BUG_ON in ocfs2_remove_refcount_tree() when the inode is being wiped during unlink. Catch this corruption early during inode validation to fail gracefully instead of crashing the kernel. Link: https://lkml.kernel.org/r/20251212055826.20929-1-kartikey406@gmail.com Signed-off-by: Deepanshu Kartikey <kartikey406@gmail.com> Reported-by: syzbot+6d832e79d3efe1c46743@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=6d832e79d3efe1c46743 Tested-by: syzbot+6d832e79d3efe1c46743@syzkaller.appspotmail.com Link: https://lore.kernel.org/all/20251208084407.3021466-1-kartikey406@gmail.com/T/ [v1] Link: https://lore.kernel.org/all/20251212045646.9988-1-kartikey406@gmail.com/T/ [v2] Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com> Cc: Mark Fasheh <mark@fasheh.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Changwei Ge <gechangwei@live.cn> Cc: Jun Piao <piaojun@huawei.com> Cc: Heming Zhao <heming.zhao@suse.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> |
||
|
|
9677a51abd |
ocfs2: constify struct configfs_item_operations and configfs_group_operations
'struct configfs_item_operations' and 'configfs_group_operations' are not modified in this driver. Constifying these structures moves some data to a read-only section, so increases overall security, especially when the structure holds some function pointers. On a x86_64, with allmodconfig, as an example: Before: ====== text data bss dec hex filename 74011 19312 5280 98603 1812b fs/ocfs2/cluster/heartbeat.o After: ===== text data bss dec hex filename 74171 19152 5280 98603 1812b fs/ocfs2/cluster/heartbeat.o Link: https://lkml.kernel.org/r/7c7c00ba328e5e514d8debee698154039e9640dd.1765708880.git.christophe.jaillet@wanadoo.fr Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Acked-by: Joseph Qi <joseph.qi@linux.alibaba.com> Cc: Mark Fasheh <mark@fasheh.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Changwei Ge <gechangwei@live.cn> Cc: Jun Piao <piaojun@huawei.com> Cc: Heming Zhao <heming.zhao@suse.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> |
||
|
|
fd4d53bde9 |
ocfs2: detect released suballocator BG for fh_to_[dentry|parent]
After ocfs2 gained the ability to reclaim suballocator free block group
(BGs), a suballocator block group may be released. This change causes the
xfstest case generic/426 to fail.
generic/426 expects return value -ENOENT or -ESTALE, but the current code
triggers -EROFS.
Call stack before ocfs2 gained the ability to reclaim bg:
ocfs2_fh_to_dentry //or ocfs2_fh_to_parent
ocfs2_get_dentry
+ ocfs2_test_inode_bit
| ocfs2_test_suballoc_bit
| + ocfs2_read_group_descriptor //Since ocfs2 never releases the bg,
| | //the bg block was always found.
| + *res = ocfs2_test_bit //unlink was called, and the bit is zero
|
+ if (!set) //because the above *res is 0
status = -ESTALE //the generic/426 expected return value
Current call stack that triggers -EROFS:
ocfs2_get_dentry
ocfs2_test_inode_bit
ocfs2_test_suballoc_bit
ocfs2_read_group_descriptor
+ if reading a released bg, validation fails and triggers -EROFS
How to fix:
Since the read BG is already released, we must avoid triggering -EROFS.
With this commit, we use ocfs2_read_hint_group_descriptor() to detect the
released BG block. This approach quietly handles this type of error and
returns -EINVAL, which triggers the caller's existing conversion path to
-ESTALE.
[dan.carpenter@linaro.org: fix uninitialized variable]
Link: https://lkml.kernel.org/r/dc37519fd2470909f8c65e26c5131b8b6dde2a5c.1766043917.git.dan.carpenter@linaro.org
Link: https://lkml.kernel.org/r/20251212074505.25962-3-heming.zhao@suse.com
Signed-off-by: Heming Zhao <heming.zhao@suse.com>
Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
Reviewed-by: Su Yue <glass.su@suse.com>
Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Cc: Mark Fasheh <mark@fasheh.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Changwei Ge <gechangwei@live.cn>
Cc: Jun Piao <piaojun@huawei.com>
Cc: Heming Zhao <heming.zhao@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
||
|
|
4a54331616 |
ocfs2: give ocfs2 the ability to reclaim suballocator free bg
Patch series "ocfs2: give ocfs2 the ability to reclaim suballocator free bg", v6. This patch (of 2): The current ocfs2 code can't reclaim suballocator block group space. In some cases, this causes ocfs2 to hold onto a lot of space. For example, when creating lots of small files, the space is held/managed by the '//inode_alloc'. After the user deletes all the small files, the space never returns to the '//global_bitmap'. This issue prevents ocfs2 from providing the needed space even when there is enough free space in a small ocfs2 volume. This patch gives ocfs2 the ability to reclaim suballocator free space when the block group is freed. For performance reasons, this patch keeps the first suballocator block group active. Link: https://lkml.kernel.org/r/20251212074505.25962-2-heming.zhao@suse.com Signed-off-by: Heming Zhao <heming.zhao@suse.com> Reviewed-by: Su Yue <glass.su@suse.com> Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com> Cc: Changwei Ge <gechangwei@live.cn> Cc: Joel Becker <jlbec@evilplan.org> Cc: Jun Piao <piaojun@huawei.com> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Mark Fasheh <mark@fasheh.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> |
||
|
|
f15d315027
|
ocfs2: add setlease file operation
Add the setlease file_operation to ocfs2_fops, ocfs2_dops, ocfs2_fops_no_plocks, and ocfs2_dops_no_plocks, pointing to generic_setlease. A future patch will change the default behavior to reject lease attempts with -EINVAL when there is no setlease file operation defined. Add generic_setlease to retain the ability to set leases on this filesystem. Signed-off-by: Jeff Layton <jlayton@kernel.org> Link: https://patch.msgid.link/20260108-setlease-6-20-v1-15-ea4dec9b67fa@kernel.org Acked-by: Al Viro <viro@zeniv.linux.org.uk> Acked-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Christian Brauner <brauner@kernel.org> |
||
|
|
9d9c1cfec0 |
There are no significant series in this small merge. Please see the
individual changelogs for details. -----BEGIN PGP SIGNATURE----- iHUEABYIAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCaTsgBwAKCRDdBJ7gKXxA jkSIAP4jD66nEC2QyKTiv9XvXi8rpKz6RGAHNZSam0ucI5WKswEAoflmlqsoD/Kk sN3YLwLztzCJIYU7tT3IRaPK0irwDAo= =nkux -----END PGP SIGNATURE----- Merge tag 'mm-nonmm-stable-2025-12-11-11-47' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull misc updates from Andrew Morton: "There are no significant series in this small merge. Please see the individual changelogs for details" [ Editor's note: it's mainly ocfs2 and a couple of random fixes ] * tag 'mm-nonmm-stable-2025-12-11-11-47' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: mm: memfd_luo: add CONFIG_SHMEM dependency mm: shmem: avoid build warning for CONFIG_SHMEM=n ocfs2: fix memory leak in ocfs2_merge_rec_left() ocfs2: invalidate inode if i_mode is zero after block read ocfs2: avoid -Wflex-array-member-not-at-end warning ocfs2: convert remaining read-only checks to ocfs2_emergency_state ocfs2: add ocfs2_emergency_state helper and apply to setattr checkpatch: add uninitialized pointer with __free attribute check args: fix documentation to reflect the correct numbers ocfs2: fix kernel BUG in ocfs2_find_victim_chain liveupdate: luo_core: fix redundant bound check in luo_ioctl() ocfs2: validate inline xattr size and entry count in ocfs2_xattr_ibody_list fs/fat: remove unnecessary wrapper fat_max_cache() ocfs2: replace deprecated strcpy with strscpy ocfs2: check tl_used after reading it from trancate log inode liveupdate: luo_file: don't use invalid list iterator |
||
|
|
2214ec4bf8 |
ocfs2: fix memory leak in ocfs2_merge_rec_left()
In 'ocfs2_merge_rec_left()', do not reset 'left_path' to NULL after
move, thus allowing 'ocfs2_free_path()' to free it before return.
Link: https://lkml.kernel.org/r/20251205065159.392749-1-dmantipov@yandex.ru
Fixes:
|