Commit Graph

3202 Commits

Author SHA1 Message Date
Linus Torvalds
4da0dd95be gfs2 changes
- Fix possible data loss during inode evict.
 
 - Fix a race during bufdata allocation.
 
 - More careful cleaning up during a withdraw.
 
 - Prevent excessive log flushing under memory pressure.
 
 - Various other minor fixes and cleanups.
 -----BEGIN PGP SIGNATURE-----
 
 iQJIBAABCgAyFiEEJZs3krPW0xkhLMTc1b+f6wMTZToFAmneMFUUHGFncnVlbmJh
 QHJlZGhhdC5jb20ACgkQ1b+f6wMTZTp87g//aaGm5/y6IWIT7Q5QKX4ks3eOC20B
 zYYA+EqQsoK6xgYoPfpNMhmtW3OvjrP9sc4mr9ylQEklLTPurvicHUR3RGRDK92k
 0ZqsBz6eHKIEmLT3Pc9sB+VE76Po0CWQxAhYcI/bEPBNpIggqnQ3sSC5JycV+JeJ
 Ws+cudhNyVyrvB+HcBCVfmLk2XLPXMJ0xMCM02KNrzQAdNJNDQhRVdVbPIFBAYor
 dqvOvyaD2vgZe6lFpZ34D9leyj2iqdUONzLZah4EFJJv8uRK8BHZA5R/4fPuCZEh
 goGOkES3YA5C+JTvIhM1C3jhSASi6/lgRx4xqcY88f+FUXdDauTrOpcqvAeZchA7
 6M5L9KdVvoPYmsz6v+Nuq90lTPQ0/p+KkaHrRKz7Yw+kVh6Cz4xI2Ve08X+RSOv5
 M1LYdNXTcmvqjyDnnixSUUMVg+9nPCsES0zHG6mEOi9B7Opk6j/8RV6+CPBZH7UK
 CBoe7bp0j0I0bscp+feuhUD+Xzd8f1ZYeFSsWwCFkiJZMQN7a41KVQS9L9ohc0Lk
 EvfS9wimT5RHR/sQz0pQb7hQZcok9HIrJjTlpoufnLAJvYHGYPIMy7sn3b/sgQYM
 VJS7JPBOOH5NU1PENRIa/dXDMysmfV/N/8cEWn6yh+TJNdLdeR3hFyU8T9P9gQRK
 WK3F36B5pwRol9w=
 =DRrw
 -----END PGP SIGNATURE-----

Merge tag 'gfs2-for-7.1' of git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2

Pull gfs2 updates from Andreas Gruenbacher:

 - Fix possible data loss during inode evict

 - Fix a race during bufdata allocation

 - More careful cleaning up during a withdraw

 - Prevent excessive log flushing under memory pressure

 - Various other minor fixes and cleanups

* tag 'gfs2-for-7.1' of git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2:
  gfs2: prevent NULL pointer dereference during unmount
  gfs2: hide error messages after withdraw
  gfs2: wait for withdraw earlier during unmount
  gfs2: inode directory consistency checks
  gfs2: gfs2_log_flush withdraw fixes
  gfs2: add some missing log locking
  gfs2: fix address space truncation during withdraw
  gfs2: drain ail under sd_log_flush_lock
  gfs2: bufdata allocation race
  gfs2: Remove trans_drain code duplication
  gfs2: Move gfs2_remove_from_journal to log.c
  gfs2: Get rid of gfs2_log_[un]lock helpers
  gfs2: less aggressive low-memory log flushing
  gfs2: Fix data loss during inode evict
  gfs2: minor evict_[un]linked_inode cleanup
  gfs2: Avoid unnecessary transactions in evict_linked_inode
  gfs2: Remove unnecessary check in gfs2_evict_inode
  gfs2: Call unlock_new_inode before d_instantiate
2026-04-15 19:12:04 -07:00
Linus Torvalds
334fbe734e mm.git review status for linus..mm-stable
Everything:
 
 Total patches:       368
 Reviews/patch:       1.56
 Reviewed rate:       74%
 
 Excluding DAMON:
 
 Total patches:       316
 Reviews/patch:       1.77
 Reviewed rate:       81%
 
 Excluding DAMON and zram:
 
 Total patches:       306
 Reviews/patch:       1.81
 Reviewed rate:       82%
 
 Excluding DAMON, zram and maple_tree:
 
 Total patches:       276
 Reviews/patch:       2.01
 Reviewed rate:       91%
 
 Significant patch series in this merge:
 
 - The 30 patch series "maple_tree: Replace big node with maple copy"
   from Liam Howlett is mainly prepararatory work for ongoing development
   but it does reduce stack usage and is an improvement.
 
 - The 12 patch series "mm, swap: swap table phase III: remove swap_map"
   from Kairui Song offers memory savings by removing the static swap_map.
   It also yields some CPU savings and implements several cleanups.
 
 - The 2 patch series "mm: memfd_luo: preserve file seals" from Pratyush
   Yadav adds file seal preservation to LUO's memfd code.
 
 - The 2 patch series "mm: zswap: add per-memcg stat for incompressible
   pages" from Jiayuan Chen adds additional userspace stats reportng to
   zswap.
 
 - The 4 patch series "arch, mm: consolidate empty_zero_page" from Mike
   Rapoport implements some cleanups for our handling of ZERO_PAGE() and
   zero_pfn.
 
 - The 2 patch series "mm/kmemleak: Improve scan_should_stop()
   implementation" from Zhongqiu Han provides an robustness improvement and
   some cleanups in the kmemleak code.
 
 - The 4 patch series "Improve khugepaged scan logic" from Vernon Yang
   "improves the khugepaged scan logic and reduces CPU consumption by
   prioritizing scanning tasks that access memory frequently".
 
 - The 2 patch series "Make KHO Stateless" from Jason Miu simplifies
   Kexec Handover by "transitioning KHO from an xarray-based metadata
   tracking system with serialization to a radix tree data structure that
   can be passed directly to the next kernel"
 
 - The 3 patch series "mm: vmscan: add PID and cgroup ID to vmscan
   tracepoints" from Thomas Ballasi and Steven Rostedt enhances vmscan's
   tracepointing.
 
 - The 5 patch series "mm: arch/shstk: Common shadow stack mapping helper
   and VM_NOHUGEPAGE" from Catalin Marinas is a cleanup for the shadow
   stack code: remove per-arch code in favour of a generic implementation.
 
 - The 2 patch series "Fix KASAN support for KHO restored vmalloc
   regions" from Pasha Tatashin fixes a WARN() which can be emitted the KHO
   restores a vmalloc area.
 
 - The 4 patch series "mm: Remove stray references to pagevec" from Tal
   Zussman provides several cleanups, mainly udpating references to "struct
   pagevec", which became folio_batch three years ago.
 
 - The 17 patch series "mm: Eliminate fake head pages from vmemmap
   optimization" from Kiryl Shutsemau simplifies the HugeTLB vmemmap
   optimization (HVO) by changing how tail pages encode their relationship
   to the head page.
 
 - The 2 patch series "mm/damon/core: improve DAMOS quota efficiency for
   core layer filters" from SeongJae Park improves two problematic
   behaviors of DAMOS that makes it less efficient when core layer filters
   are used.
 
 - The 3 patch series "mm/damon: strictly respect min_nr_regions" from
   SeongJae Park improves DAMON usability by extending the treatment of the
   min_nr_regions user-settable parameter.
 
 - The 3 patch series "mm/page_alloc: pcp locking cleanup" from Vlastimil
   Babka is a proper fix for a previously hotfixed SMP=n issue.  Code
   simplifications and cleanups ennsed.
 
 - The 16 patch series "mm: cleanups around unmapping / zapping" from
   David Hildenbrand implements "a bunch of cleanups around unmapping and
   zapping.  Mostly simplifications, code movements, documentation and
   renaming of zapping functions".
 
 - The 6 patch series "support batched checking of the young flag for
   MGLRU" from Baolin Wang supports batched checking of the young flag for
   MGLRU.  It's part cleanups; one benchmark shows large performance
   benefits for arm64.
 
 - The 5 patch series "memcg: obj stock and slab stat caching cleanups"
   from Johannes Weiner provides memcg cleanup and robustness improvements.
 
 - The 5 patch series "Allow order zero pages in page reporting" from
   Yuvraj Sakshith enhances page_reporting's free page reporting - it is
   presently and undesirably order-0 pages when reporting free memory.
 
 - The 6 patch series "mm: vma flag tweaks" from Lorenzo Stoakes is
   cleanup work following from the recent conversion of the VMA flags to a
   bitmap.
 
 - The 10 patch series "mm/damon: add optional debugging-purpose sanity
   checks" from SeongJae Park adds some more developer-facing debug checks
   into DAMON core.
 
 - The 2 patch series "mm/damon: test and document power-of-2
   min_region_sz requirement" from SeongJae Park adds an additional DAMON
   kunit test and makes some adjustments to the addr_unit parameter
   handling.
 
 - The 3 patch series "mm/damon/core: make passed_sample_intervals
   comparisons overflow-safe" from SeongJae Park fixes a hard-to-hit time
   overflow issue in DAMON core.
 
 - The 7 patch series "mm/damon: improve/fixup/update ratio calculation,
   test and documentation" from SeongJae Park is a "batch of misc/minor
   improvements and fixups" for DAMON.
 
 - The 4 patch series "mm: move vma_(kernel|mmu)_pagesize() out of
   hugetlb.c" from David Hildenbrand fixes a possible issue with dax-device
   when CONFIG_HUGETLB=n.  Some code movement was required.
 
 - The 6 patch series "zram: recompression cleanups and tweaks" from
   Sergey Senozhatsky provides "a somewhat random mix of fixups,
   recompression cleanups and improvements" in the zram code.
 
 - The 11 patch series "mm/damon: support multiple goal-based quota
   tuning algorithms" from SeongJae Park extend DAMOS quotas goal
   auto-tuning to support multiple tuning algorithms that users can select.
 
 - The 4 patch series "mm: thp: reduce unnecessary
   start_stop_khugepaged()" from Breno Leitao fixes the khugpaged sysfs
   handling so we no longer spam the logs with reams of junk when
   starting/stopping khugepaged.
 
 - The 3 patch series "mm: improve map count checks" from Lorenzo Stoakes
   provides some cleanups and slight fixes in the mremap, mmap and vma
   code.
 
 - The 5 patch series "mm/damon: support addr_unit on default monitoring
   targets for modules" from SeongJae Park extends the use of DAMON core's
   addr_unit tunable.
 
 - The 5 patch series "mm: khugepaged cleanups and mTHP prerequisites"
   from Nico Pache provides cleanups in the khugepaged and is a base for
   Nico's planned khugepaged mTHP support.
 
 - The 15 patch series "mm: memory hot(un)plug and SPARSEMEM cleanups"
   from David Hildenbrand implements code movement and cleanups in the
   memhotplug and sparsemem code.
 
 - The 2 patch series "mm: remove CONFIG_ARCH_ENABLE_MEMORY_HOTREMOVE and
   cleanup CONFIG_MIGRATION" from David Hildenbrand rationalizes some
   memhotplug Kconfig support.
 
 - The 6 patch series "change young flag check functions to return bool"
   from Baolin Wang is "a cleanup patchset to change all young flag check
   functions to return bool".
 
 - The 3 patch series "mm/damon/sysfs: fix memory leak and NULL
   dereference issues" from Josh Law and SeongJae Park fixes a few
   potential DAMON bugs.
 
 - The 25 patch series "mm/vma: convert vm_flags_t to vma_flags_t in vma
   code" from "converts a lot of the existing use of the legacy vm_flags_t
   data type to the new vma_flags_t type which replaces it".  Mainly in the
   vma code.
 
 - The 21 patch series "mm: expand mmap_prepare functionality and usage"
   from Lorenzo Stoakes "expands the mmap_prepare functionality, which is
   intended to replace the deprecated f_op->mmap hook which has been the
   source of bugs and security issues for some time".  Cleanups,
   documentation, extension of mmap_prepare into filesystem drivers.
 
 - The 13 patch series "mm/huge_memory: refactor zap_huge_pmd()" from
   Lorenzo Stoakes simplifies and cleans up zap_huge_pmd().  Additional
   cleanups around vm_normal_folio_pmd() and the softleaf functionality are
   performed.
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCad3HDQAKCRDdBJ7gKXxA
 jrUQAPwNhPk5nPSxnyxjAeQtOBHqgCdnICeEismLajPKd9aYRgEA0s2XAu3tSUYi
 GrBnWImHG3s4ePQxVcPCegWTsOUrXgQ=
 =1Q7o
 -----END PGP SIGNATURE-----

Merge tag 'mm-stable-2026-04-13-21-45' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull MM updates from Andrew Morton:

 - "maple_tree: Replace big node with maple copy" (Liam Howlett)

   Mainly prepararatory work for ongoing development but it does reduce
   stack usage and is an improvement.

 - "mm, swap: swap table phase III: remove swap_map" (Kairui Song)

   Offers memory savings by removing the static swap_map. It also yields
   some CPU savings and implements several cleanups.

 - "mm: memfd_luo: preserve file seals" (Pratyush Yadav)

   File seal preservation to LUO's memfd code

 - "mm: zswap: add per-memcg stat for incompressible pages" (Jiayuan
   Chen)

   Additional userspace stats reportng to zswap

 - "arch, mm: consolidate empty_zero_page" (Mike Rapoport)

   Some cleanups for our handling of ZERO_PAGE() and zero_pfn

 - "mm/kmemleak: Improve scan_should_stop() implementation" (Zhongqiu
   Han)

   A robustness improvement and some cleanups in the kmemleak code

 - "Improve khugepaged scan logic" (Vernon Yang)

   Improve khugepaged scan logic and reduce CPU consumption by
   prioritizing scanning tasks that access memory frequently

 - "Make KHO Stateless" (Jason Miu)

   Simplify Kexec Handover by transitioning KHO from an xarray-based
   metadata tracking system with serialization to a radix tree data
   structure that can be passed directly to the next kernel

 - "mm: vmscan: add PID and cgroup ID to vmscan tracepoints" (Thomas
   Ballasi and Steven Rostedt)

   Enhance vmscan's tracepointing

 - "mm: arch/shstk: Common shadow stack mapping helper and
   VM_NOHUGEPAGE" (Catalin Marinas)

   Cleanup for the shadow stack code: remove per-arch code in favour of
   a generic implementation

 - "Fix KASAN support for KHO restored vmalloc regions" (Pasha Tatashin)

   Fix a WARN() which can be emitted the KHO restores a vmalloc area

 - "mm: Remove stray references to pagevec" (Tal Zussman)

   Several cleanups, mainly udpating references to "struct pagevec",
   which became folio_batch three years ago

 - "mm: Eliminate fake head pages from vmemmap optimization" (Kiryl
   Shutsemau)

   Simplify the HugeTLB vmemmap optimization (HVO) by changing how tail
   pages encode their relationship to the head page

 - "mm/damon/core: improve DAMOS quota efficiency for core layer
   filters" (SeongJae Park)

   Improve two problematic behaviors of DAMOS that makes it less
   efficient when core layer filters are used

 - "mm/damon: strictly respect min_nr_regions" (SeongJae Park)

   Improve DAMON usability by extending the treatment of the
   min_nr_regions user-settable parameter

 - "mm/page_alloc: pcp locking cleanup" (Vlastimil Babka)

   The proper fix for a previously hotfixed SMP=n issue. Code
   simplifications and cleanups ensued

 - "mm: cleanups around unmapping / zapping" (David Hildenbrand)

   A bunch of cleanups around unmapping and zapping. Mostly
   simplifications, code movements, documentation and renaming of
   zapping functions

 - "support batched checking of the young flag for MGLRU" (Baolin Wang)

   Batched checking of the young flag for MGLRU. It's part cleanups; one
   benchmark shows large performance benefits for arm64

 - "memcg: obj stock and slab stat caching cleanups" (Johannes Weiner)

   memcg cleanup and robustness improvements

 - "Allow order zero pages in page reporting" (Yuvraj Sakshith)

   Enhance free page reporting - it is presently and undesirably order-0
   pages when reporting free memory.

 - "mm: vma flag tweaks" (Lorenzo Stoakes)

   Cleanup work following from the recent conversion of the VMA flags to
   a bitmap

 - "mm/damon: add optional debugging-purpose sanity checks" (SeongJae
   Park)

   Add some more developer-facing debug checks into DAMON core

 - "mm/damon: test and document power-of-2 min_region_sz requirement"
   (SeongJae Park)

   An additional DAMON kunit test and makes some adjustments to the
   addr_unit parameter handling

 - "mm/damon/core: make passed_sample_intervals comparisons
   overflow-safe" (SeongJae Park)

   Fix a hard-to-hit time overflow issue in DAMON core

 - "mm/damon: improve/fixup/update ratio calculation, test and
   documentation" (SeongJae Park)

   A batch of misc/minor improvements and fixups for DAMON

 - "mm: move vma_(kernel|mmu)_pagesize() out of hugetlb.c" (David
   Hildenbrand)

   Fix a possible issue with dax-device when CONFIG_HUGETLB=n. Some code
   movement was required.

 - "zram: recompression cleanups and tweaks" (Sergey Senozhatsky)

   A somewhat random mix of fixups, recompression cleanups and
   improvements in the zram code

 - "mm/damon: support multiple goal-based quota tuning algorithms"
   (SeongJae Park)

   Extend DAMOS quotas goal auto-tuning to support multiple tuning
   algorithms that users can select

 - "mm: thp: reduce unnecessary start_stop_khugepaged()" (Breno Leitao)

   Fix the khugpaged sysfs handling so we no longer spam the logs with
   reams of junk when starting/stopping khugepaged

 - "mm: improve map count checks" (Lorenzo Stoakes)

   Provide some cleanups and slight fixes in the mremap, mmap and vma
   code

 - "mm/damon: support addr_unit on default monitoring targets for
   modules" (SeongJae Park)

   Extend the use of DAMON core's addr_unit tunable

 - "mm: khugepaged cleanups and mTHP prerequisites" (Nico Pache)

   Cleanups to khugepaged and is a base for Nico's planned khugepaged
   mTHP support

 - "mm: memory hot(un)plug and SPARSEMEM cleanups" (David Hildenbrand)

   Code movement and cleanups in the memhotplug and sparsemem code

 - "mm: remove CONFIG_ARCH_ENABLE_MEMORY_HOTREMOVE and cleanup
   CONFIG_MIGRATION" (David Hildenbrand)

   Rationalize some memhotplug Kconfig support

 - "change young flag check functions to return bool" (Baolin Wang)

   Cleanups to change all young flag check functions to return bool

 - "mm/damon/sysfs: fix memory leak and NULL dereference issues" (Josh
   Law and SeongJae Park)

   Fix a few potential DAMON bugs

 - "mm/vma: convert vm_flags_t to vma_flags_t in vma code" (Lorenzo
   Stoakes)

   Convert a lot of the existing use of the legacy vm_flags_t data type
   to the new vma_flags_t type which replaces it. Mainly in the vma
   code.

 - "mm: expand mmap_prepare functionality and usage" (Lorenzo Stoakes)

   Expand the mmap_prepare functionality, which is intended to replace
   the deprecated f_op->mmap hook which has been the source of bugs and
   security issues for some time. Cleanups, documentation, extension of
   mmap_prepare into filesystem drivers

 - "mm/huge_memory: refactor zap_huge_pmd()" (Lorenzo Stoakes)

   Simplify and clean up zap_huge_pmd(). Additional cleanups around
   vm_normal_folio_pmd() and the softleaf functionality are performed.

* tag 'mm-stable-2026-04-13-21-45' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (369 commits)
  mm: fix deferred split queue races during migration
  mm/khugepaged: fix issue with tracking lock
  mm/huge_memory: add and use has_deposited_pgtable()
  mm/huge_memory: add and use normal_or_softleaf_folio_pmd()
  mm: add softleaf_is_valid_pmd_entry(), pmd_to_softleaf_folio()
  mm/huge_memory: separate out the folio part of zap_huge_pmd()
  mm/huge_memory: use mm instead of tlb->mm
  mm/huge_memory: remove unnecessary sanity checks
  mm/huge_memory: deduplicate zap deposited table call
  mm/huge_memory: remove unnecessary VM_BUG_ON_PAGE()
  mm/huge_memory: add a common exit path to zap_huge_pmd()
  mm/huge_memory: handle buggy PMD entry in zap_huge_pmd()
  mm/huge_memory: have zap_huge_pmd return a boolean, add kdoc
  mm/huge: avoid big else branch in zap_huge_pmd()
  mm/huge_memory: simplify vma_is_specal_huge()
  mm: on remap assert that input range within the proposed VMA
  mm: add mmap_action_map_kernel_pages[_full]()
  uio: replace deprecated mmap hook with mmap_prepare in uio_info
  drivers: hv: vmbus: replace deprecated mmap hook with mmap_prepare
  mm: allow handling of stacked mmap_prepare hooks in more drivers
  ...
2026-04-15 12:59:16 -07:00
Linus Torvalds
91a4855d6c Networking changes for 7.1.
Core & protocols
 ----------------
 
  - Support HW queue leasing, allowing containers to be granted access
    to HW queues for zero-copy operations and AF_XDP.
 
  - Number of code moves to help the compiler with inlining.
    Avoid output arguments for returning drop reason where possible.
 
  - Rework drop handling within qdiscs to include more metadata
    about the reason and dropping qdisc in the tracepoints.
 
  - Remove the rtnl_lock use from IP Multicast Routing.
 
  - Pack size information into the Rx Flow Steering table pointer
    itself. This allows making the table itself a flat array of u32s,
    thus making the table allocation size a power of two.
 
  - Report TCP delayed ack timer information via socket diag.
 
  - Add ip_local_port_step_width sysctl to allow distributing the randomly
    selected ports more evenly throughout the allowed space.
 
  - Add support for per-route tunsrc in IPv6 segment routing.
 
  - Start work of switching sockopt handling to iov_iter.
 
  - Improve dynamic recvbuf sizing in MPTCP, limit burstiness and avoid
    buffer size drifting up.
 
  - Support MSG_EOR in MPTCP.
 
  - Add stp_mode attribute to the bridge driver for STP mode selection.
    This addresses concerns about call_usermodehelper() usage.
 
  - Remove UDP-Lite support (as announced in 2023).
 
  - Remove support for building IPv6 as a module.
    Remove the now unnecessary function calling indirection.
 
 Cross-tree stuff
 ----------------
 
  - Move Michael MIC code from generic crypto into wireless,
    it's considered insecure but some WiFi networks still need it.
 
 Netfilter
 ---------
 
  - Switch nft_fib_ipv6 module to no longer need temporary dst_entry
    object allocations by using fib6_lookup() + RCU.
    Florian W reports this gets us ~13% higher packet rate.
 
  - Convert IPVS's global __ip_vs_mutex to per-net service_mutex and
    switch the service tables to be per-net. Convert some code that
    walks the service lists to use RCU instead of the service_mutex.
 
  - Add more opinionated input validation to lower security exposure.
 
  - Make IPVS hash tables to be per-netns and resizable.
 
 Wireless
 --------
 
  - Finished assoc frame encryption/EPPKE/802.1X-over-auth.
 
  - Radar detection improvements.
 
  - Add 6 GHz incumbent signal detection APIs.
 
  - Multi-link support for FILS, probe response templates and
    client probing.
 
  - New APIs and mac80211 support for NAN (Neighbor Aware Networking,
    aka Wi-Fi Aware) so less work must be in firmware.
 
 Driver API
 ----------
 
  - Add numerical ID for devlink instances (to avoid having to create
    fake bus/device pairs just to have an ID). Support shared devlink
    instances which span multiple PFs.
 
  - Add standard counters for reporting pause storm events
    (implement in mlx5 and fbnic).
 
  - Add configuration API for completion writeback buffering
    (implement in mana).
 
  - Support driver-initiated change of RSS context sizes.
 
  - Support DPLL monitoring input frequency (implement in zl3073x).
 
  - Support per-port resources in devlink (implement in mlx5).
 
 Misc
 ----
 
  - Expand the YAML spec for Netfilter.
 
 Drivers
 -------
 
  - Software:
    - macvlan: support multicast rx for bridge ports with shared source
      MAC address
    - team: decouple receive and transmit enablement for IEEE 802.3ad
      LACP "independent control"
 
  - Ethernet high-speed NICs:
    - nVidia/Mellanox:
      - support high order pages in zero-copy mode (for payload
        coalescing)
      - support multiple packets in a page (for systems with 64kB pages)
    - Broadcom 25-400GE (bnxt):
      - implement XDP RSS hash metadata extraction
      - add software fallback for UDP GSO, lowering the IOMMU cost
    - Broadcom 800GE (bnge):
      - add link status and configuration handling
      - add various HW and SW statistics
    - Marvell/Cavium:
      - NPC HW block support for cn20k
    - Huawei (hinic3):
      - add mailbox / control queue
      - add rx VLAN offload
      - add driver info and link management
 
  - Ethernet NICs:
    - Marvell/Aquantia:
      - support reading SFP module info on some AQC100 cards
    - Realtek PCI (r8169):
      - add support for RTL8125cp
    - Realtek USB (r8152):
      - support for the RTL8157 5Gbit chip
      - add 2500baseT EEE status/configuration support
 
  - Ethernet NICs embedded and off-the-shelf IP:
    - Synopsys (stmmac):
      - cleanup and reorganize SerDes handling and PCS support
      - cleanup descriptor handling and per-platform data
      - cleanup and consolidate MDIO defines and handling
      - shrink driver memory use for internal structures
      - improve Tx IRQ coalescing
      - improve TCP segmentation handling
      - add support for Spacemit K3
    - Cadence (macb):
      - support PHYs that have inband autoneg disabled with GEM
      - support IEEE 802.3az EEE
      - rework usrio capabilities and handling
    - AMD (xgbe):
      - improve power management for S0i3
      - improve TX resilience for link-down handling
 
  - Virtual:
    - Google cloud vNIC:
      - support larger ring sizes in DQO-QPL mode
      - improve HW-GRO handling
      - support UDP GSO for DQO format
    - PCIe NTB:
      - support queue count configuration
 
  - Ethernet PHYs:
    - automatically disable PHY autonomous EEE if MAC is in charge
    - Broadcom:
      - add BCM84891/BCM84892 support
    - Micrel:
      - support for LAN9645X internal PHY
    - Realtek:
      - add RTL8224 pair order support
      - support PHY LEDs on RTL8211F-VD
      - support spread spectrum clocking (SSC)
    - Maxlinear:
      - add PHY-level statistics via ethtool
 
  - Ethernet switches:
    - Maxlinear (mxl862xx):
      - support for bridge offloading
      - support for VLANs
      - support driver statistics
 
  - Bluetooth:
    - large number of fixes and new device IDs
    - Mediatek:
      - support MT6639 (MT7927)
      - support MT7902 SDIO
 
  - WiFi:
    - Intel (iwlwifi):
      - UNII-9 and continuing UHR work
    - MediaTek (mt76):
      - mt7996/mt7925 MLO fixes/improvements
      - mt7996 NPU support (HW eth/wifi traffic offload)
    - Qualcomm (ath12k):
      - monitor mode support on IPQ5332
      - basic hwmon temperature reporting
      - support IPQ5424
    - Realtek:
      - add USB RX aggregation to improve performance
      - add USB TX flow control by tracking in-flight URBs
 
  - Cellular:
    - IPA v5.2 support
 
 Signed-off-by: Jakub Kicinski <kuba@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEE6jPA+I1ugmIBA4hXMUZtbf5SIrsFAmnelNoACgkQMUZtbf5S
 IrtWFw//WyiXuEiGawVQONnbu1dtR+3nw/cvNpSYi0IM66vbRUB9n+9fxm2MIyG4
 4jI/c/X/fxIvUxEqGez3yPn5P7KqkQR8WRYwkxrMYKRpXeukN0IDk5Euew5DskCe
 wtBKNJOQWKdKXff0bLQoJ9dHWYuJ2IMRVil5M3fhUbeUOXeyJD7Yn1w2ICvJAbj+
 T/Hw7sEtchNaHp6h6SbaQfahkUFHQG5peNoETkZF4UDF6ALGY29WH91GXeO2lrgN
 IxX203KtaavV0oU8T0oixZgOc57Ns081YfFL/F1JP2HV6lgkwhuq+zxCrRTi1c9M
 HPTXgwD7Z80Y74nM3YTLrPfoMOP8GLBZgdV3rUpwmteM26+gMTm+O1zHUur5ZoGy
 D6TaMFguPTIqiRyrARa9xY/J6r9TQkc2Wfu4bIuPndKFg8xPoepuEObODnh0+5Hg
 4j4pdFhIo2huENhSg7kVb/yl+1q68SFwM3RqTmx+OhCa0AyjcKIKgt/UBhismdnG
 r8obxzb+nXeJc2rRDuwNMwlBlcMSbep27uGt64zeHMMXVhTVqOoytNaL/X/ZpH2m
 A0DscUrpHvb36IoDPtanc6irP+JOh5Xe7Nw5qhkgwsMc7hlf8SyyHB4OUBBaz1qA
 ETSnHlfwklRmXSpWqH2LyGXjdOQpDKP46+h0W3dttMD2/cRBqYo=
 =EhQZ
 -----END PGP SIGNATURE-----

Merge tag 'net-next-7.1' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next

Pull networking updates from Jakub Kicinski:
 "Core & protocols:

   - Support HW queue leasing, allowing containers to be granted access
     to HW queues for zero-copy operations and AF_XDP

   - Number of code moves to help the compiler with inlining. Avoid
     output arguments for returning drop reason where possible

   - Rework drop handling within qdiscs to include more metadata about
     the reason and dropping qdisc in the tracepoints

   - Remove the rtnl_lock use from IP Multicast Routing

   - Pack size information into the Rx Flow Steering table pointer
     itself. This allows making the table itself a flat array of u32s,
     thus making the table allocation size a power of two

   - Report TCP delayed ack timer information via socket diag

   - Add ip_local_port_step_width sysctl to allow distributing the
     randomly selected ports more evenly throughout the allowed space

   - Add support for per-route tunsrc in IPv6 segment routing

   - Start work of switching sockopt handling to iov_iter

   - Improve dynamic recvbuf sizing in MPTCP, limit burstiness and avoid
     buffer size drifting up

   - Support MSG_EOR in MPTCP

   - Add stp_mode attribute to the bridge driver for STP mode selection.
     This addresses concerns about call_usermodehelper() usage

   - Remove UDP-Lite support (as announced in 2023)

   - Remove support for building IPv6 as a module. Remove the now
     unnecessary function calling indirection

  Cross-tree stuff:

   - Move Michael MIC code from generic crypto into wireless, it's
     considered insecure but some WiFi networks still need it

  Netfilter:

   - Switch nft_fib_ipv6 module to no longer need temporary dst_entry
     object allocations by using fib6_lookup() + RCU.

     Florian W reports this gets us ~13% higher packet rate

   - Convert IPVS's global __ip_vs_mutex to per-net service_mutex and
     switch the service tables to be per-net. Convert some code that
     walks the service lists to use RCU instead of the service_mutex

   - Add more opinionated input validation to lower security exposure

   - Make IPVS hash tables to be per-netns and resizable

  Wireless:

   - Finished assoc frame encryption/EPPKE/802.1X-over-auth

   - Radar detection improvements

   - Add 6 GHz incumbent signal detection APIs

   - Multi-link support for FILS, probe response templates and client
     probing

   - New APIs and mac80211 support for NAN (Neighbor Aware Networking,
     aka Wi-Fi Aware) so less work must be in firmware

  Driver API:

   - Add numerical ID for devlink instances (to avoid having to create
     fake bus/device pairs just to have an ID). Support shared devlink
     instances which span multiple PFs

   - Add standard counters for reporting pause storm events (implement
     in mlx5 and fbnic)

   - Add configuration API for completion writeback buffering (implement
     in mana)

   - Support driver-initiated change of RSS context sizes

   - Support DPLL monitoring input frequency (implement in zl3073x)

   - Support per-port resources in devlink (implement in mlx5)

  Misc:

   - Expand the YAML spec for Netfilter

  Drivers

   - Software:
      - macvlan: support multicast rx for bridge ports with shared
        source MAC address
      - team: decouple receive and transmit enablement for IEEE 802.3ad
        LACP "independent control"

   - Ethernet high-speed NICs:
      - nVidia/Mellanox:
         - support high order pages in zero-copy mode (for payload
           coalescing)
         - support multiple packets in a page (for systems with 64kB
           pages)
      - Broadcom 25-400GE (bnxt):
         - implement XDP RSS hash metadata extraction
         - add software fallback for UDP GSO, lowering the IOMMU cost
      - Broadcom 800GE (bnge):
         - add link status and configuration handling
         - add various HW and SW statistics
      - Marvell/Cavium:
         - NPC HW block support for cn20k
      - Huawei (hinic3):
         - add mailbox / control queue
         - add rx VLAN offload
         - add driver info and link management

   - Ethernet NICs:
      - Marvell/Aquantia:
         - support reading SFP module info on some AQC100 cards
      - Realtek PCI (r8169):
         - add support for RTL8125cp
      - Realtek USB (r8152):
         - support for the RTL8157 5Gbit chip
         - add 2500baseT EEE status/configuration support

   - Ethernet NICs embedded and off-the-shelf IP:
      - Synopsys (stmmac):
         - cleanup and reorganize SerDes handling and PCS support
         - cleanup descriptor handling and per-platform data
         - cleanup and consolidate MDIO defines and handling
         - shrink driver memory use for internal structures
         - improve Tx IRQ coalescing
         - improve TCP segmentation handling
         - add support for Spacemit K3
      - Cadence (macb):
         - support PHYs that have inband autoneg disabled with GEM
         - support IEEE 802.3az EEE
         - rework usrio capabilities and handling
      - AMD (xgbe):
         - improve power management for S0i3
         - improve TX resilience for link-down handling

   - Virtual:
      - Google cloud vNIC:
         - support larger ring sizes in DQO-QPL mode
         - improve HW-GRO handling
         - support UDP GSO for DQO format
      - PCIe NTB:
         - support queue count configuration

   - Ethernet PHYs:
      - automatically disable PHY autonomous EEE if MAC is in charge
      - Broadcom:
         - add BCM84891/BCM84892 support
      - Micrel:
         - support for LAN9645X internal PHY
      - Realtek:
         - add RTL8224 pair order support
         - support PHY LEDs on RTL8211F-VD
         - support spread spectrum clocking (SSC)
      - Maxlinear:
         - add PHY-level statistics via ethtool

   - Ethernet switches:
      - Maxlinear (mxl862xx):
         - support for bridge offloading
         - support for VLANs
         - support driver statistics

   - Bluetooth:
      - large number of fixes and new device IDs
      - Mediatek:
         - support MT6639 (MT7927)
         - support MT7902 SDIO

   - WiFi:
      - Intel (iwlwifi):
         - UNII-9 and continuing UHR work
      - MediaTek (mt76):
         - mt7996/mt7925 MLO fixes/improvements
         - mt7996 NPU support (HW eth/wifi traffic offload)
      - Qualcomm (ath12k):
         - monitor mode support on IPQ5332
         - basic hwmon temperature reporting
         - support IPQ5424
      - Realtek:
         - add USB RX aggregation to improve performance
         - add USB TX flow control by tracking in-flight URBs

   - Cellular:
      - IPA v5.2 support"

* tag 'net-next-7.1' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1561 commits)
  net: pse-pd: fix kernel-doc function name for pse_control_find_by_id()
  wireguard: device: use exit_rtnl callback instead of manual rtnl_lock in pre_exit
  wireguard: allowedips: remove redundant space
  tools: ynl: add sample for wireguard
  wireguard: allowedips: Use kfree_rcu() instead of call_rcu()
  MAINTAINERS: Add netkit selftest files
  selftests/net: Add additional test coverage in nk_qlease
  selftests/net: Split netdevsim tests from HW tests in nk_qlease
  tools/ynl: Make YnlFamily closeable as a context manager
  net: airoha: Add missing PPE configurations in airoha_ppe_hw_init()
  net: airoha: Fix VIP configuration for AN7583 SoC
  net: caif: clear client service pointer on teardown
  net: strparser: fix skb_head leak in strp_abort_strp()
  net: usb: cdc-phonet: fix skb frags[] overflow in rx_complete()
  selftests/bpf: add test for xdp_master_redirect with bond not up
  net, bpf: fix null-ptr-deref in xdp_master_redirect() for down master
  net: airoha: Remove PCE_MC_EN_MASK bit in REG_FE_PCE_CFG configuration
  sctp: disable BH before calling udp_tunnel_xmit_skb()
  sctp: fix missing encap_port propagation for GSO fragments
  net: airoha: Rely on net_device pointer in ETS callbacks
  ...
2026-04-14 18:36:10 -07:00
Linus Torvalds
fc825e513c vfs-7.1-rc1.bh.metadata
Please consider pulling these changes from the signed vfs-7.1-rc1.bh.metadata tag.
 
 Thanks!
 Christian
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCadjZCwAKCRCRxhvAZXjc
 os67AQCd65HW/XVVw01846OH5Cqw7vFYBa7HipkQPebX3NjCPgEAm4w8ywqKUe5o
 rRLkZVIDBgkMGhH7Af+y2Ru5WZZiLwc=
 =G3WM
 -----END PGP SIGNATURE-----

Merge tag 'vfs-7.1-rc1.bh.metadata' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs

Pull vfs buffer_head updates from Christian Brauner:
 "This cleans up the mess that has accumulated over the years in
  metadata buffer_head tracking for inodes.

  It moves the tracking into dedicated structure in filesystem-private
  part of the inode (so that we don't use private_list, private_data,
  and private_lock in struct address_space), and also moves couple other
  users of private_data and private_list so these are removed from
  struct address_space saving 3 longs in struct inode for 99% of inodes"

* tag 'vfs-7.1-rc1.bh.metadata' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: (42 commits)
  fs: Drop i_private_list from address_space
  fs: Drop mapping_metadata_bhs from address space
  ext4: Track metadata bhs in fs-private inode part
  minix: Track metadata bhs in fs-private inode part
  udf: Track metadata bhs in fs-private inode part
  fat: Track metadata bhs in fs-private inode part
  bfs: Track metadata bhs in fs-private inode part
  affs: Track metadata bhs in fs-private inode part
  ext2: Track metadata bhs in fs-private inode part
  fs: Provide functions for handling mapping_metadata_bhs directly
  fs: Switch inode_has_buffers() to take mapping_metadata_bhs
  fs: Make bhs point to mapping_metadata_bhs
  fs: Move metadata bhs tracking to a separate struct
  fs: Fold fsync_buffers_list() into sync_mapping_buffers()
  fs: Drop osync_buffers_list()
  kvm: Use private inode list instead of i_private_list
  fs: Remove i_private_data
  aio: Stop using i_private_data and i_private_lock
  hugetlbfs: Stop using i_private_data
  fs: Stop using i_private_data for metadata bh tracking
  ...
2026-04-13 12:46:42 -07:00
Linus Torvalds
0e58e3f1c5 vfs-7.1-rc1.writeback
Please consider pulling these changes from the signed vfs-7.1-rc1.writeback tag.
 
 Thanks!
 Christian
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCadjZCgAKCRCRxhvAZXjc
 oh/7AP9kfWK3ElRTV/+EfRs3tOAXD2b7VfO5jMbbSIfRjWgW5AEAiMULmQFpLOY8
 cPEY/c5E9t8GLZqdsOJL8LQwz8YYAAk=
 =BtbD
 -----END PGP SIGNATURE-----

Merge tag 'vfs-7.1-rc1.writeback' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs

Pull vfs writeback updates from Christian Brauner:
 "This introduces writeback helper APIs and converts f2fs, gfs2 and nfs
  to stop accessing writeback internals directly"

* tag 'vfs-7.1-rc1.writeback' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
  nfs: stop using writeback internals for WB_WRITEBACK accounting
  gfs2: stop using writeback internals for dirty_exceeded check
  f2fs: stop using writeback internals for dirty_exceeded checks
  writeback: prep helpers for dirty-limit and writeback accounting
2026-04-13 10:08:01 -07:00
Andreas Gruenbacher
74b4dbb946 gfs2: prevent NULL pointer dereference during unmount
When flushing out outstanding glock work during an unmount, gfs2_log_flush()
can be called when sdp->sd_jdesc has already been deallocated and sdp->sd_jdesc
is NULL.  Commit 35264909e9 ("gfs2: Fix NULL pointer dereference in
gfs2_log_flush") added a check for that to gfs2_log_flush() itself, but it
missed the sdp->sd_jdesc dereference in gfs2_log_release().  Fix that.

Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Dan Carpenter <error27@gmail.com>
Closes: https://lore.kernel.org/r/202604071139.HNJiCaAi-lkp@intel.com/
Fixes: 35264909e9 ("gfs2: Fix NULL pointer dereference in gfs2_log_flush")
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2026-04-07 22:20:00 +02:00
Andreas Gruenbacher
734f0b4b9b gfs2: hide error messages after withdraw
In gfs2_evict_inode(), don't issue error messages once a withdraw has already
occurred.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2026-04-07 22:20:00 +02:00
Andreas Gruenbacher
f458aafc5c gfs2: wait for withdraw earlier during unmount
During an unmount, wait for potential withdraw to complete before calling
gfs2_make_fs_ro().  This will allow gfs2_make_fs_ro() to skip much of its work.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2026-04-07 22:20:00 +02:00
Andreas Gruenbacher
b89e487bfc gfs2: inode directory consistency checks
In gfs2_dinode_in(), only allow directories to have the GFS2_DIF_EXHASH
flag set.  This will prevent other parts of the code from treating
regular inodes as directories based on the presence of that flag.

In sweep_bh_for_rgrps() and __gfs2_free_blocks(), check if the
GFS2_DIF_EXHASH flag is set instead of checking if i_depth is non-zero.
This matches what the directory code does.  (The i_depth checks were
introduced in commit 6d3117b412 ("GFS2: Wipe directory hash table
metadata when deallocating a directory").)

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2026-04-07 22:20:00 +02:00
Andreas Gruenbacher
bb47cce7a1 gfs2: gfs2_log_flush withdraw fixes
When a withdraw occurs in gfs2_log_flush() and we are left with an unsubmitted
bio, fail that bio.  Otherwise, the bh's in that bio will remain locked and
gfs2_evict_inode() -> truncate_inode_pages() -> gfs2_invalidate_folio() ->
gfs2_discard() will hang trying to discard the locked bh's.

In addition, when gfs2_log_flush() fails to submit a new transaction, unpin the
buffers in the failing transaction like gfs2_remove_from_journal() does.  If
any of the bd's are on the ail2 list, leave them there and do_withdraw() ->
gfs2_withdraw_glocks() -> inode_go_inval() -> truncate_inode_pages() ->
gfs2_invalidate_folio() -> gfs2_discard() will remove them.  They will be freed
in gfs2_release_folio().

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2026-04-07 22:20:00 +02:00
Andreas Gruenbacher
fe2c8d0511 gfs2: add some missing log locking
Function gfs2_logd() calls the log flushing functions gfs2_ail1_start(),
gfs2_ail1_wait(), and gfs2_ail1_empty() without holding sdp->sd_log_flush_lock,
but these functions require exclusion against concurrent transactions.

To fix that, add a non-locking __gfs2_log_flush() function.  Then, in
gfs2_logd(), take sdp->sd_log_flush_lock before calling the above mentioned log
flushing functions and __gfs2_log_flush().

Fixes: 5e4c7632aa ("gfs2: Issue revokes more intelligently")
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2026-04-07 22:20:00 +02:00
Andreas Gruenbacher
f4e4c4e6ac gfs2: fix address space truncation during withdraw
When a withdrawn filesystem's inodes are being evicted, the address spaces of
those inodes still need to be truncated but we can no longer start new
transactions.  We still don't want gfs2_invalidate_folio() to race with
gfs2_log_flush(), so take a read lock on sdp->sd_log_flush_lock in that case.
(It may not be obvious, but gfs2_invalidate_folio() is a jdata-only address
space operation.)

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2026-04-07 22:19:59 +02:00
Andreas Gruenbacher
7d2da6ed17 gfs2: drain ail under sd_log_flush_lock
When a withdraw is carried out, call gfs2_ail_drain() under the
sdp->sd_log_flush_lock.  This isn't strictly necessary but should be easier to
read, and more robust against possible future bugs.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2026-04-07 22:19:22 +02:00
Tal Zussman
4e1d77a8f3 folio_batch: rename pagevec.h to folio_batch.h
struct pagevec was removed in commit 1e0877d58b ("mm: remove struct
pagevec").  Rename include/linux/pagevec.h to reflect reality and update
includes tree-wide.  Add the new filename to MAINTAINERS explicitly, as it
no longer matches the "include/linux/page[-_]*" pattern in MEMORY
MANAGEMENT - CORE.

Link: https://lkml.kernel.org/r/20260225-pagevec_cleanup-v2-3-716868cc2d11@columbia.edu
Signed-off-by: Tal Zussman <tz2294@columbia.edu>
Acked-by: David Hildenbrand (Arm) <david@kernel.org>
Reviewed-by: Jan Kara <jack@suse.cz>
Acked-by: Zi Yan <ziy@nvidia.com>
Reviewed-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org>
Cc: Chris Li <chrisl@kernel.org>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-04-05 13:53:07 -07:00
Andreas Gruenbacher
6e1a833df9 gfs2: bufdata allocation race
The locking in gfs2_trans_add_data() and gfs2_trans_add_meta() doesn't
follow the usual coding pattern of checking bh->b_private under lock,
allocating a new bufdata object with the locks dropped, and re-checking
once the lock has been reacquired.  Both functions set bh->b_private
without holding the buffer lock.  Fix that.

Also, in gfs2_trans_add_meta(), taking the folio lock during the
allocation doesn't actually do anything useful.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2026-03-30 12:01:52 +02:00
Fernando Fernandez Mancera
309b905dee ipv6: convert CONFIG_IPV6 to built-in only and clean up Kconfigs
Maintaining a modular IPv6 stack offers image size savings for specific
setups, this benefit is outweighed by the architectural burden it
imposes on the subsystems on implementation and maintenance. Therefore,
drop it.

Change CONFIG_IPV6 from tristate to bool. Remove all Kconfig
dependencies across the tree that explicitly checked for IPV6=m. In
addition, remove MODULE_DESCRIPTION(), MODULE_ALIAS(), MODULE_AUTHOR()
and MODULE_LICENSE().

This is also replacing module_init() by device_initcall(). It is not
possible to use fs_initcall() as IPv4 does because that creates a race
condition on IPv6 addrconf.

Finally, modify the default configs from CONFIG_IPV6=m to CONFIG_IPV6=y
except for m68k as according to the bloat-o-meter the image is
increasing by 330KB~ and that isn't acceptable. Instead, disable IPv6 on
this architecture by default. This is aligned with m68k RAM requirements
and recommendations [1].

[1] http://www.linux-m68k.org/faq/ram.html

Signed-off-by: Fernando Fernandez Mancera <fmancera@suse.de>
Tested-by: Ricardo B. Marlière <rbm@suse.com>
Acked-by: Krzysztof Kozlowski <krzk@kernel.org> # arm64
Link: https://patch.msgid.link/20260325120928.15848-2-fmancera@suse.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-03-29 11:21:22 -07:00
Jan Kara
7e5ccdd88c
gfs2: Don't zero i_private_data
Remove the explicit zeroing of mapping->i_private_data since this
field is no longer used.

CC: Andreas Gruenbacher <agruenba@redhat.com>
CC: gfs2@lists.linux.dev
Reviewed-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Link: https://patch.msgid.link/20260326095354.16340-44-jack@suse.cz
Tested-by: syzbot@syzkaller.appspotmail.com
Signed-off-by: Christian Brauner <brauner@kernel.org>
2026-03-26 15:03:26 +01:00
Andreas Gruenbacher
9e34adb1cc gfs2: Remove trans_drain code duplication
Rename trans_drain() to gfs2_trans_drain().

Add a new gfs2_trans_drain_list() helper and use it in
gfs2_trans_drain() to reduce code duplication.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2026-03-23 20:55:28 +01:00
Andreas Gruenbacher
10866892c7 gfs2: Move gfs2_remove_from_journal to log.c
Move gfs2_remove_from_journal() from meta_io.c to log.c and fix a minor
indentation glitch.

With that, gfs2_remove_from_ail() is now only used inside log.c, so it
can be made static.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2026-03-23 20:55:28 +01:00
Andreas Gruenbacher
5a15907f99 gfs2: Get rid of gfs2_log_[un]lock helpers
These two helpers only hide the locking operation; they do not make
the code more readable.

Created with:

sed -i -e 's:gfs2_log_unlock(sdp):spin_unlock(\&sdp->sd_log_lock):' \
       -e 's:gfs2_log_lock(sdp):spin_lock(\&sdp->sd_log_lock):'

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2026-03-23 20:55:28 +01:00
Andreas Gruenbacher
7288185ce8 gfs2: less aggressive low-memory log flushing
It turns out that for some workloads, the fix in commit b74cd55aa9
("gfs2: low-memory forced flush fixes") causes the number of forced log
flushes to increase to a degree that the overall filesystem performance
drops significantly.  Address that by forcing a log flush only when
gfs2_writepages cannot make any progress rather than when it cannot make
"enough" progress.

Fixes: b74cd55aa9 ("gfs2: low-memory forced flush fixes")
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2026-03-23 20:55:28 +01:00
Andreas Gruenbacher
bd67f17718 gfs2: Fix data loss during inode evict
When gfs2_evict_inode() is called on an inode with unwritten data in the
page cache, the page cache needs to be written before it can be
truncated.  This doesn't always happen.  Fix that by changing
gfs2_evict_inode() to always either call evict_linked_inode() or
evict_unlinked_inode().

Inside evict_unlinked_inode(), first check if the inode is dirty.  If it
is, make sure the inode glock is held and write back the data and
metadata.  If it isn't, skip those steps.

Also, make sure that gfs2_evict_inode() calls gfs2_evict_inode() and
evict_unlinked_inode() only if ip->i_gl is not NULL; this avoids
unnecessary complications there.

Fixes xfstest generic/211.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2026-03-23 20:55:27 +01:00
Andreas Gruenbacher
2b34a9e760 gfs2: minor evict_[un]linked_inode cleanup
Add gl helper variables in evict_unlinked_inode() and
evict_linked_inode().  This patch isn't very interesting by itself, but
it makes the next patch more readable.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2026-03-23 20:55:27 +01:00
Andreas Gruenbacher
e2de65130d gfs2: Avoid unnecessary transactions in evict_linked_inode
In evict_linked_inode(), the truncate_inode_pages() calls are carried
out inside a transaction.  This code was added to what was then function
gfs2_delete_inode() in commit 16615be18c ("[GFS2] Clean up journaled
data writing").

These transactions are only used for creating revokes for the jdata
buffers in the journal, so don't create such transactions when we know
that the address space doesn't contain any jdata buffers for this inode
and truncate the metadata address space outside of the transaction.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2026-03-23 20:55:24 +01:00
Andreas Gruenbacher
0ac82bc7b7 gfs2: Remove unnecessary check in gfs2_evict_inode
We are no longer using LM_FLAG_TRY or LM_FLAG_TRY_1CB during inode
evict, so ret cannot be GLR_TRYFAILED here.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2026-02-24 18:22:51 +01:00
Andreas Gruenbacher
2ff7cf7e06 gfs2: Call unlock_new_inode before d_instantiate
As Neil Brown describes in detail in the link referenced below, new
inodes must be unlocked before they can be instantiated.

An even better fix is to use d_instantiate_new(), which combines
d_instantiate() and unlock_new_inode().

Fixes: 3d36e57ff7 ("gfs2: gfs2_create_inode rework")
Reported-by: syzbot+0ea5108a1f5fb4fcc2d8@syzkaller.appspotmail.com
Link: https://lore.kernel.org/linux-fsdevel/177153754005.8396.8777398743501764194@noble.neil.brown.name/
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2026-02-23 12:31:12 +01:00
Linus Torvalds
bf4afc53b7 Convert 'alloc_obj' family to use the new default GFP_KERNEL argument
This was done entirely with mindless brute force, using

    git grep -l '\<k[vmz]*alloc_objs*(.*, GFP_KERNEL)' |
        xargs sed -i 's/\(alloc_objs*(.*\), GFP_KERNEL)/\1)/'

to convert the new alloc_obj() users that had a simple GFP_KERNEL
argument to just drop that argument.

Note that due to the extreme simplicity of the scripting, any slightly
more complex cases spread over multiple lines would not be triggered:
they definitely exist, but this covers the vast bulk of the cases, and
the resulting diff is also then easier to check automatically.

For the same reason the 'flex' versions will be done as a separate
conversion.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2026-02-21 17:09:51 -08:00
Kees Cook
69050f8d6d treewide: Replace kmalloc with kmalloc_obj for non-scalar types
This is the result of running the Coccinelle script from
scripts/coccinelle/api/kmalloc_objs.cocci. The script is designed to
avoid scalar types (which need careful case-by-case checking), and
instead replace kmalloc-family calls that allocate struct or union
object instances:

Single allocations:	kmalloc(sizeof(TYPE), ...)
are replaced with:	kmalloc_obj(TYPE, ...)

Array allocations:	kmalloc_array(COUNT, sizeof(TYPE), ...)
are replaced with:	kmalloc_objs(TYPE, COUNT, ...)

Flex array allocations:	kmalloc(struct_size(PTR, FAM, COUNT), ...)
are replaced with:	kmalloc_flex(*PTR, FAM, COUNT, ...)

(where TYPE may also be *VAR)

The resulting allocations no longer return "void *", instead returning
"TYPE *".

Signed-off-by: Kees Cook <kees@kernel.org>
2026-02-21 01:02:28 -08:00
Kundan Kumar
8cab8dc0e1
gfs2: stop using writeback internals for dirty_exceeded check
Convert gfs2 dirty_exceeded handling to use the writeback core helper
instead of accessing writeback directly.

Reviewed-by: Jeff Layton <jlayton@kernel.org>
Reviewed-by: Andreas Gruenbacher <agruenba@redhat.com>
Suggested-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Kundan Kumar <kundan.kumar@samsung.com>
Signed-off-by: Anuj Gupta <anuj20.g@samsung.com>
Link: https://patch.msgid.link/20260213054634.79785-4-kundan.kumar@samsung.com
Signed-off-by: Christian Brauner <brauner@kernel.org>
2026-02-17 13:25:14 +01:00
Linus Torvalds
136114e0ab mm.git review status for linus..mm-nonmm-stable
Total patches:       107
 Reviews/patch:       1.07
 Reviewed rate:       67%
 
 - The 2 patch series "ocfs2: give ocfs2 the ability to reclaim
   suballocator free bg" from Heming Zhao saves disk space by teaching
   ocfs2 to reclaim suballocator block group space.
 
 - The 4 patch series "Add ARRAY_END(), and use it to fix off-by-one
   bugs" from Alejandro Colomar adds the ARRAY_END() macro and uses it in
   various places.
 
 - The 2 patch series "vmcoreinfo: support VMCOREINFO_BYTES larger than
   PAGE_SIZE" from Pnina Feder makes the vmcore code future-safe, if
   VMCOREINFO_BYTES ever exceeds the page size.
 
 - The 7 patch series "kallsyms: Prevent invalid access when showing
   module buildid" from Petr Mladek cleans up kallsyms code related to
   module buildid and fixes an invalid access crash when printing
   backtraces.
 
 - The 3 patch series "Address page fault in
   ima_restore_measurement_list()" from Harshit Mogalapalli fixes a
   kexec-related crash that can occur when booting the second-stage kernel
   on x86.
 
 - The 6 patch series "kho: ABI headers and Documentation updates" from
   Mike Rapoport updates the kexec handover ABI documentation.
 
 - The 4 patch series "Align atomic storage" from Finn Thain adds the
   __aligned attribute to atomic_t and atomic64_t definitions to get
   natural alignment of both types on csky, m68k, microblaze, nios2,
   openrisc and sh.
 
 - The 2 patch series "kho: clean up page initialization logic" from
   Pratyush Yadav simplifies the page initialization logic in
   kho_restore_page().
 
 - The 6 patch series "Unload linux/kernel.h" from Yury Norov moves
   several things out of kernel.h and into more appropriate places.
 
 - The 7 patch series "don't abuse task_struct.group_leader" from Oleg
   Nesterov removes the usage of ->group_leader when it is "obviously
   unnecessary".
 
 - The 5 patch series "list private v2 & luo flb" from Pasha Tatashin
   adds some infrastructure improvements to the live update orchestrator.
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCaY4giAAKCRDdBJ7gKXxA
 jgusAQDnKkP8UWTqXPC1jI+OrDJGU5ciAx8lzLeBVqMKzoYk9AD/TlhT2Nlx+Ef6
 0HCUHUD0FMvAw/7/Dfc6ZKxwBEIxyww=
 =mmsH
 -----END PGP SIGNATURE-----

Merge tag 'mm-nonmm-stable-2026-02-12-10-48' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull non-MM updates from Andrew Morton:

 - "ocfs2: give ocfs2 the ability to reclaim suballocator free bg" saves
   disk space by teaching ocfs2 to reclaim suballocator block group
   space (Heming Zhao)

 - "Add ARRAY_END(), and use it to fix off-by-one bugs" adds the
   ARRAY_END() macro and uses it in various places (Alejandro Colomar)

 - "vmcoreinfo: support VMCOREINFO_BYTES larger than PAGE_SIZE" makes
   the vmcore code future-safe, if VMCOREINFO_BYTES ever exceeds the
   page size (Pnina Feder)

 - "kallsyms: Prevent invalid access when showing module buildid" cleans
   up kallsyms code related to module buildid and fixes an invalid
   access crash when printing backtraces (Petr Mladek)

 - "Address page fault in ima_restore_measurement_list()" fixes a
   kexec-related crash that can occur when booting the second-stage
   kernel on x86 (Harshit Mogalapalli)

 - "kho: ABI headers and Documentation updates" updates the kexec
   handover ABI documentation (Mike Rapoport)

 - "Align atomic storage" adds the __aligned attribute to atomic_t and
   atomic64_t definitions to get natural alignment of both types on
   csky, m68k, microblaze, nios2, openrisc and sh (Finn Thain)

 - "kho: clean up page initialization logic" simplifies the page
   initialization logic in kho_restore_page() (Pratyush Yadav)

 - "Unload linux/kernel.h" moves several things out of kernel.h and into
   more appropriate places (Yury Norov)

 - "don't abuse task_struct.group_leader" removes the usage of
   ->group_leader when it is "obviously unnecessary" (Oleg Nesterov)

 - "list private v2 & luo flb" adds some infrastructure improvements to
   the live update orchestrator (Pasha Tatashin)

* tag 'mm-nonmm-stable-2026-02-12-10-48' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (107 commits)
  watchdog/hardlockup: simplify perf event probe and remove per-cpu dependency
  procfs: fix missing RCU protection when reading real_parent in do_task_stat()
  watchdog/softlockup: fix sample ring index wrap in need_counting_irqs()
  kcsan, compiler_types: avoid duplicate type issues in BPF Type Format
  kho: fix doc for kho_restore_pages()
  tests/liveupdate: add in-kernel liveupdate test
  liveupdate: luo_flb: introduce File-Lifecycle-Bound global state
  liveupdate: luo_file: Use private list
  list: add kunit test for private list primitives
  list: add primitives for private list manipulations
  delayacct: fix uapi timespec64 definition
  panic: add panic_force_cpu= parameter to redirect panic to a specific CPU
  netclassid: use thread_group_leader(p) in update_classid_task()
  RDMA/umem: don't abuse current->group_leader
  drm/pan*: don't abuse current->group_leader
  drm/amd: kill the outdated "Only the pthreads threading model is supported" checks
  drm/amdgpu: don't abuse current->group_leader
  android/binder: use same_thread_group(proc->tsk, current) in binder_mmap()
  android/binder: don't abuse current->group_leader
  kho: skip memoryless NUMA nodes when reserving scratch areas
  ...
2026-02-12 12:13:01 -08:00
Linus Torvalds
7141433fbe gfs2 changes
- Prevent rename() from failing with -ESTALE when there are locking
   conflicts and retry the operation instead.
 
 - Don't fail when fiemap triggers a page fault (xfstest generic/742).
 
 - Fix another locking request cancellation bug.
 
 - Minor other fixes and cleanups.
 -----BEGIN PGP SIGNATURE-----
 
 iQJIBAABCgAyFiEEJZs3krPW0xkhLMTc1b+f6wMTZToFAmmJ5mQUHGFncnVlbmJh
 QHJlZGhhdC5jb20ACgkQ1b+f6wMTZTooqg/+MzASj1LM37uKjYPAQkfF6nvujd9G
 BPMspoT+JzZDc0+btNPxsoOgwmju2ZeCY2ZNzGXbQ/V2wcTT3ZnuWummHOwhs07G
 XilDp+Ohzk4IcQ1uCvOpIMY7mmRdSTbCo/Ztny/nPLxHOfbe6AgWo+YU/Saxh6xI
 ndkbzSq0yjW7zjxdMSKjVbRAhvQGaW892s9orjb36isVDj+4hvA/aIWLrm7Zm5sf
 xWDlzx9mUMMqlM8zImUedUjyG8zyTSBz80NthQmnRtt6fZ+Hau0DCwbkbKz9J8mH
 Pksm7Xz6I9eft0axEQh7KrvCiHWCEUl3zoknbC/QtusJcx/r7Vpe48lBimIN7xi0
 u1e2k7MPpOeVPzsP1nhKxD9W/IRC/WaKZCKb9fYmtbhzYVk8gE/lrLLGTWnATTa2
 X6OCBpdg5niohXeVDNmE9ZtU5xsB/UH9AZ8p2+iPhhUC5dPPcF//H7mXNQhDC6m9
 Az7KxPu4VlTiRD9YCU+SV7uBM1YLocLQg8qLN4mdrxPegRs2EuVVEuQxb3isGrQV
 yMl61bbazc028AOeWu0iu7iggUXczwdjlEZVRyBpmjJjD6JwZ40WUKl8JqqiNH/K
 09CWNQO7GZOmOcXzCHz34sE2bekv3OMnncJ/z+zFdSnwc71r7Eo5H+XG7Vyf7hQ9
 OPHJe0tsyMDKPl8=
 =GBfL
 -----END PGP SIGNATURE-----

Merge tag 'gfs2-for-7.0' of git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2

Pull gfs2 updates from Andreas Gruenbacher:

 - Prevent rename() from failing with -ESTALE when there are locking
   conflicts and retry the operation instead

 - Don't fail when fiemap triggers a page fault (xfstest generic/742)

 - Fix another locking request cancellation bug

 - Minor other fixes and cleanups

* tag 'gfs2-for-7.0' of git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2:
  gfs2: fiemap page fault fix
  gfs2: fix memory leaks in gfs2_fill_super error path
  gfs2: Fix use-after-free in iomap inline data write path
  gfs2: Fix slab-use-after-free in qd_put
  gfs2: Introduce glock_{type,number,sbd} helpers
  gfs2: gfs2_glock_hold cleanup
  gfs: Use fixed GL_GLOCK_MIN_HOLD time
  gfs2: Fix gfs2_log_get_bio argument type
  gfs2: gfs2_chain_bio start sector fix
  gfs2: Initialize bio->bi_opf early
  gfs2: Rename gfs2_log_submit_{bio -> write}
  gfs2: Do not cancel internal demote requests
  gfs2: run_queue cleanup
  gfs2: Retries missing in gfs2_{rename,exchange}
  gfs2: glock cancelation flag fix
2026-02-09 16:29:57 -08:00
Linus Torvalds
9e355113f0 vfs-7.0-rc1.misc
Please consider pulling these changes from the signed vfs-7.0-rc1.misc tag.
 
 Thanks!
 Christian
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCaYX49QAKCRCRxhvAZXjc
 ojrZAQD1VJzY46r5FnAVf4jlEHyjIbDnZCP/n+c4x6XnqpU6EQEAgB0yAtAGP6+u
 SBuytElqHoTT5VtmEXTAabCNQ9Ks8wo=
 =JwZz
 -----END PGP SIGNATURE-----

Merge tag 'vfs-7.0-rc1.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs

Pull misc vfs updates from Christian Brauner:
 "This contains a mix of VFS cleanups, performance improvements, API
  fixes, documentation, and a deprecation notice.

  Scalability and performance:

   - Rework pid allocation to only take pidmap_lock once instead of
     twice during alloc_pid(), improving thread creation/teardown
     throughput by 10-16% depending on false-sharing luck. Pad the
     namespace refcount to reduce false-sharing

   - Track file lock presence via a flag in ->i_opflags instead of
     reading ->i_flctx, avoiding false-sharing with ->i_readcount on
     open/close hot paths. Measured 4-16% improvement on 24-core
     open-in-a-loop benchmarks

   - Use a consume fence in locks_inode_context() to match the
     store-release/load-consume idiom, eliminating a hardware fence on
     some architectures

   - Annotate cdev_lock with __cacheline_aligned_in_smp to prevent
     false-sharing

   - Remove a redundant DCACHE_MANAGED_DENTRY check in
     __follow_mount_rcu() that never fires since the caller already
     verifies it, eliminating a 100% mispredicted branch

   - Fix a 100% mispredicted likely() in devcgroup_inode_permission()
     that became wrong after a prior code reorder

  Bug fixes and correctness:

   - Make insert_inode_locked() wait for inode destruction instead of
     skipping, fixing a corner case where two matching inodes could
     exist in the hash

   - Move f_mode initialization before file_ref_init() in alloc_file()
     to respect the SLAB_TYPESAFE_BY_RCU ordering contract

   - Add a WARN_ON_ONCE guard in try_to_free_buffers() for folios with
     no buffers attached, preventing a null pointer dereference when
     AS_RELEASE_ALWAYS is set but no release_folio op exists

   - Fix select restart_block to store end_time as timespec64, avoiding
     truncation of tv_sec on 32-bit architectures

   - Make dump_inode() use get_kernel_nofault() to safely access inode
     and superblock fields, matching the dump_mapping() pattern

  API modernization:

   - Make posix_acl_to_xattr() allocate the buffer internally since
     every single caller was doing it anyway. Reduces boilerplate and
     unnecessary error checking across ~15 filesystems

   - Replace deprecated simple_strtoul() with kstrtoul() for the
     ihash_entries, dhash_entries, mhash_entries, and mphash_entries
     boot parameters, adding proper error handling

   - Convert chardev code to use guard(mutex) and __free(kfree) cleanup
     patterns

   - Replace min_t() with min() or umin() in VFS code to avoid silently
     truncating unsigned long to unsigned int

   - Gate LOOKUP_RCU assertions behind CONFIG_DEBUG_VFS since callers
     already check the flag

  Deprecation:

   - Begin deprecating legacy BSD process accounting (acct(2)). The
     interface has numerous footguns and better alternatives exist
     (eBPF)

  Documentation:

   - Fix and complete kernel-doc for struct export_operations, removing
     duplicated documentation between ReST and source

   - Fix kernel-doc warnings for __start_dirop() and ilookup5_nowait()

  Testing:

   - Add a kunit test for initramfs cpio handling of entries with
     filesize > PATH_MAX

  Misc:

   - Add missing <linux/init_task.h> include in fs_struct.c"

* tag 'vfs-7.0-rc1.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: (28 commits)
  posix_acl: make posix_acl_to_xattr() alloc the buffer
  fs: make insert_inode_locked() wait for inode destruction
  initramfs_test: kunit test for cpio.filesize > PATH_MAX
  fs: improve dump_inode() to safely access inode fields
  fs: add <linux/init_task.h> for 'init_fs'
  docs: exportfs: Use source code struct documentation
  fs: move initializing f_mode before file_ref_init()
  exportfs: Complete kernel-doc for struct export_operations
  exportfs: Mark struct export_operations functions at kernel-doc
  exportfs: Fix kernel-doc output for get_name()
  acct(2): begin the deprecation of legacy BSD process accounting
  device_cgroup: remove branch hint after code refactor
  VFS: fix __start_dirop() kernel-doc warnings
  fs: Describe @isnew parameter in ilookup5_nowait()
  fs/namei: Remove redundant DCACHE_MANAGED_DENTRY check in __follow_mount_rcu
  fs: only assert on LOOKUP_RCU when built with CONFIG_DEBUG_VFS
  select: store end_time as timespec64 in restart block
  chardev: Switch to guard(mutex) and __free(kfree)
  namespace: Replace simple_strtoul with kstrtoul to parse boot params
  dcache: Replace simple_strtoul with kstrtoul in set_dhash_entries
  ...
2026-02-09 15:13:05 -08:00
Linus Torvalds
aa2a0fcd4c vfs-7.0-rc1.leases
Please consider pulling these changes from the signed vfs-7.0-rc1.leases tag.
 
 Thanks!
 Christian
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCaYX49gAKCRCRxhvAZXjc
 olR/AP40iNOTRn7LosXbRWqGGZqzy9v64QYoLzk3QdsWuGmbRAD/egNQzof8mkAf
 IscefWTOjY7xyDzmEBEBnfHftgMiEwM=
 =zre0
 -----END PGP SIGNATURE-----

Merge tag 'vfs-7.0-rc1.leases' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs

Pull vfs lease updates from Christian Brauner:
 "This contains updates for lease support to require filesystems to
  explicitly opt-in to lease support

  Currently kernel_setlease() falls through to generic_setlease() when a
  a filesystem does not define ->setlease(), silently granting lease
  support to every filesystem regardless of whether it is prepared for
  it.

  This is a poor default: most filesystems never intended to support
  leases, and the silent fallthrough makes it impossible to distinguish
  "supports leases" from "never thought about it".

  This inverts the default. It adds explicit

	.setlease = generic_setlease;

  assignments to every in-tree filesystem that should retain lease
  support, then changes kernel_setlease() to return -EINVAL when
  ->setlease is NULL.

  With the new default in place, simple_nosetlease() is redundant and
  is removed along with all references to it"

* tag 'vfs-7.0-rc1.leases' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: (25 commits)
  fuse: add setlease file operation
  fs: remove simple_nosetlease()
  filelock: default to returning -EINVAL when ->setlease operation is NULL
  xfs: add setlease file operation
  ufs: add setlease file operation
  udf: add setlease file operation
  tmpfs: add setlease file operation
  squashfs: add setlease file operation
  overlayfs: add setlease file operation
  orangefs: add setlease file operation
  ocfs2: add setlease file operation
  ntfs3: add setlease file operation
  nilfs2: add setlease file operation
  jfs: add setlease file operation
  jffs2: add setlease file operation
  gfs2: add a setlease file operation
  fat: add setlease file operation
  f2fs: add setlease file operation
  exfat: add setlease file operation
  ext4: add setlease file operation
  ...
2026-02-09 11:59:07 -08:00
Linus Torvalds
74554251df vfs-7.0-rc1.nonblocking_timestamps
Please consider pulling these changes from the signed vfs-7.0-rc1.nonblocking_timestamps tag.
 
 Thanks!
 Christian
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCaYX49gAKCRCRxhvAZXjc
 oqNMAQCjHw9iwYDu63n96QAipWopJb8onqc0rTEvi0OOl1zDNwEAufN3EqTzV3uQ
 JbNgSwBWD/+ICd2aUOuAX0GgU6teyAQ=
 =lJlI
 -----END PGP SIGNATURE-----

Merge tag 'vfs-7.0-rc1.nonblocking_timestamps' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs

Pull vfs timestamp updates from Christian Brauner:
 "This contains the changes to support non-blocking timestamp updates.

  Since commit 66fa3cedf1 ("fs: Add async write file modification
  handling") file_update_time_flags() unconditionally returns -EAGAIN
  when any timestamp needs updating and IOCB_NOWAIT is set. This makes
  non-blocking direct writes impossible on file systems with granular
  enough timestamps, which in practice means all of them.

  This reworks the timestamp update path to propagate IOCB_NOWAIT
  through ->update_time so that file systems which can update timestamps
  without blocking are no longer penalized.

  With that groundwork in place, the core change passes IOCB_NOWAIT into
  ->update_time and returns -EAGAIN only when the file system indicates
  it would block.

  XFS implements non-blocking timestamp updates by using the new
  ->sync_lazytime and open-coding generic_update_time without the
  S_NOWAIT check, since the lazytime path through the generic helpers
  can never block in XFS"

* tag 'vfs-7.0-rc1.nonblocking_timestamps' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
  xfs: enable non-blocking timestamp updates
  xfs: implement ->sync_lazytime
  fs: refactor file_update_time_flags
  fs: add support for non-blocking timestamp updates
  fs: add a ->sync_lazytime method
  fs: factor out a sync_lazytime helper
  fs: refactor ->update_time handling
  fat: cleanup the flags for fat_truncate_time
  nfs: split nfs_update_timestamps
  fs: allow error returns from generic_update_time
  fs: remove inode_update_time
2026-02-09 11:25:01 -08:00
Andreas Gruenbacher
e411d74cc5 gfs2: fiemap page fault fix
In gfs2_fiemap(), we are calling iomap_fiemap() while holding the inode
glock.  This can lead to recursive glock taking if the fiemap buffer is
memory mapped to the same inode and accessing it triggers a page fault.

Fix by disabling page faults for iomap_fiemap() and faulting in the
buffer by hand if necessary.

Fixes xfstest generic/742.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2026-02-05 18:00:45 +01:00
Deepanshu Kartikey
da6f5bbc2e gfs2: fix memory leaks in gfs2_fill_super error path
Fix two memory leaks in the gfs2_fill_super() error handling path when
transitioning a filesystem to read-write mode fails.

First leak: kthread objects (thread_struct, task_struct, etc.)
When gfs2_freeze_lock_shared() fails after init_threads() succeeds, the
created kernel threads (logd and quotad) are never destroyed. This
occurs because the fail_per_node label doesn't call
gfs2_destroy_threads().

Second leak: quota bitmap buffer (8192 bytes)
When gfs2_make_fs_rw() fails after gfs2_quota_init() succeeds but
before other operations complete, the allocated quota bitmap is never
freed.

The fix moves thread cleanup to the fail_per_node label to handle all
error paths uniformly. gfs2_destroy_threads() is safe to call
unconditionally as it checks for NULL pointers. Quota cleanup is added
in gfs2_make_fs_rw() to properly handle the withdrawal case where
quota initialization succeeds but the filesystem is then withdrawn.

Thread leak backtrace (gfs2_freeze_lock_shared failure):
  unreferenced object 0xffff88801d7bca80 (size 4480):
    copy_process+0x3a1/0x4670 kernel/fork.c:2422
    kernel_clone+0xf3/0x6e0 kernel/fork.c:2779
    kthread_create_on_node+0x100/0x150 kernel/kthread.c:478
    init_threads+0xab/0x350 fs/gfs2/ops_fstype.c:611
    gfs2_fill_super+0xe5c/0x1240 fs/gfs2/ops_fstype.c:1265

Quota leak backtrace (gfs2_make_fs_rw failure):
  unreferenced object 0xffff88812de7c000 (size 8192):
    gfs2_quota_init+0xe5/0x820 fs/gfs2/quota.c:1409
    gfs2_make_fs_rw+0x7a/0xe0 fs/gfs2/super.c:149
    gfs2_fill_super+0xfbb/0x1240 fs/gfs2/ops_fstype.c:1275

Reported-by: syzbot+aac438d7a1c44071e04b@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=aac438d7a1c44071e04b
Fixes: 6c7410f449 ("gfs2: gfs2_freeze_lock_shared cleanup")
Fixes: b66f723bb5 ("gfs2: Improve gfs2_make_fs_rw error handling")
Link: https://lore.kernel.org/all/20260131062509.77974-1-kartikey406@gmail.com/T/ [v1]
Signed-off-by: Deepanshu Kartikey <kartikey406@gmail.com>
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2026-02-03 10:57:54 +01:00
Deepanshu Kartikey
faddeb8483 gfs2: Fix use-after-free in iomap inline data write path
The inline data buffer head (dibh) is being released prematurely in
gfs2_iomap_begin() via release_metapath() while iomap->inline_data
still points to dibh->b_data. This causes a use-after-free when
iomap_write_end_inline() later attempts to write to the inline data
area.

The bug sequence:
1. gfs2_iomap_begin() calls gfs2_meta_inode_buffer() to read inode
   metadata into dibh
2. Sets iomap->inline_data = dibh->b_data + sizeof(struct gfs2_dinode)
3. Calls release_metapath() which calls brelse(dibh), dropping refcount
   to 0
4. kswapd reclaims the page (~39ms later in the syzbot report)
5. iomap_write_end_inline() tries to memcpy() to iomap->inline_data
6. KASAN detects use-after-free write to freed memory

Fix by storing dibh in iomap->private and incrementing its refcount
with get_bh() in gfs2_iomap_begin(). The buffer is then properly
released in gfs2_iomap_end() after the inline write completes,
ensuring the page stays alive for the entire iomap operation.

Note: A C reproducer is not available for this issue. The fix is based
on analysis of the KASAN report and code review showing the buffer head
is freed before use.

[agruenba: Take buffer head reference in gfs2_iomap_begin() to avoid
leaks in gfs2_iomap_get() and gfs2_iomap_alloc().]

Reported-by: syzbot+ea1cd4aa4d1e98458a55@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=ea1cd4aa4d1e98458a55
Fixes: d0a22a4b03 ("gfs2: Fix iomap write page reclaim deadlock")
Signed-off-by: Deepanshu Kartikey <kartikey406@gmail.com>
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2026-01-30 13:21:57 +01:00
Linus Torvalds
fcb70a56f4 vfs-6.19-rc8.fixes
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCaXc4IwAKCRCRxhvAZXjc
 oo0jAQDOV580l4wHiY6eT1QGY2QYa7u8fYDOi6mqfgHa+EH5twD9ETnQ0xQHIKYP
 oruFJXLf3ihBBsum+pTpAO2XFVjM7Qs=
 =pM8o
 -----END PGP SIGNATURE-----

Merge tag 'vfs-6.19-rc8.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs

Pull vfs fixes from Christian Brauner:

 - Fix the the buggy conversion of fuse_reverse_inval_entry() introduced
   during the creation rework

 - Disallow nfs delegation requests for directories by setting
   simple_nosetlease()

 - Require an opt-in for getting readdir flag bits outside of S_DT_MASK
   set in d_type

 - Fix scheduling delayed writeback work by only scheduling when the
   dirty time expiry interval is non-zero and cancel the delayed work if
   the interval is set to zero

 - Use rounded_jiffies_interval for dirty time work

 - Check the return value of sb_set_blocksize() for romfs

 - Wait for batched folios to be stable in __iomap_get_folio()

 - Use private naming for fuse hash size

 - Fix the stale dentry cleanup to prevent a race that causes a UAF

* tag 'vfs-6.19-rc8.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
  vfs: document d_dispose_if_unused()
  fuse: shrink once after all buckets have been scanned
  fuse: clean up fuse_dentry_tree_work()
  fuse: add need_resched() before unlocking bucket
  fuse: make sure dentry is evicted if stale
  fuse: fix race when disposing stale dentries
  fuse: use private naming for fuse hash size
  writeback: use round_jiffies_relative for dirtytime_work
  iomap: wait for batched folios to be stable in __iomap_get_folio
  romfs: check sb_set_blocksize() return value
  docs: clarify that dirtytime_expire_seconds=0 disables writeback
  writeback: fix 100% CPU usage when dirtytime_expire_interval is 0
  readdir: require opt-in for d_type flags
  vboxsf: don't allow delegations to be set on directories
  ceph: don't allow delegations to be set on directories
  gfs2: don't allow delegations to be set on directories
  9p: don't allow delegations to be set on directories
  smb/client: properly disallow delegations on directories
  nfs: properly disallow delegation requests on directories
  fuse: fix conversion of fuse_reverse_inval_entry() to start_removing()
2026-01-26 09:30:48 -08:00
Andreas Gruenbacher
22150a7d40 gfs2: Fix slab-use-after-free in qd_put
Commit a475c5dd16 ("gfs2: Free quota data objects synchronously")
started freeing quota data objects during filesystem shutdown instead of
putting them back onto the LRU list, but it failed to remove these
objects from the LRU list, causing LRU list corruption.  This caused
use-after-free when the shrinker (gfs2_qd_shrink_scan) tried to access
already-freed objects on the LRU list.

Fix this by removing qd objects from the LRU list before freeing them in
qd_put().

Initial fix from Deepanshu Kartikey <kartikey406@gmail.com>.

Fixes: a475c5dd16 ("gfs2: Free quota data objects synchronously")
Reported-by: syzbot+046b605f01802054bff0@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=046b605f01802054bff0
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2026-01-26 14:28:18 +01:00
Andreas Gruenbacher
0ec49e7ea6 gfs2: Introduce glock_{type,number,sbd} helpers
Introduce glock_type(), glock_number(), and glock_sbd() helpers for
accessing a glock's type, number, and super block pointer more easily.

Created with Coccinelle using the following semantic patch:

@@ struct gfs2_glock *gl; @@
- gl->gl_name.ln_type
+ glock_type(gl)

@@ struct gfs2_glock *gl; @@
- gl->gl_name.ln_number
+ glock_number(gl)

@@ struct gfs2_glock *gl; @@
- gl->gl_name.ln_sbd
+ glock_sbd(gl)

glock_sbd() is a macro because it is used with const as well as
non-const struct gfs2_glock * arguments.

Instances in macro definitions, particularly in tracepoint definitions,
replaced by hand.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2026-01-26 14:28:18 +01:00
Andreas Gruenbacher
d3b39fcb39 gfs2: gfs2_glock_hold cleanup
Use lockref_get_not_dead() instead of an unguarded __lockref_is_dead()
check.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2026-01-26 14:28:18 +01:00
Andreas Gruenbacher
536f48e8bb gfs: Use fixed GL_GLOCK_MIN_HOLD time
GL_GLOCK_MIN_HOLD represents the minimum time (in jiffies) that a glock
should be held before being eligible for release.  It is currently
defined as 10, meaning that the duration depends on the timer interrupt
frequency (CONFIG_HZ).  Change that time to a constant 10ms independent
of CONFIG_HZ.  On CONFIG_HZ=1000 systems, the value remains the same.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2026-01-26 14:28:17 +01:00
Andreas Gruenbacher
c45fefe3a9 gfs2: Fix gfs2_log_get_bio argument type
Fix the type of gfs2_log_get_bio()'s op argument: callers pass in a
blk_opf_t value and the function passes that value on as a blk_opf_t
value, so the current argument type makes no sense.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2026-01-26 14:28:17 +01:00
Andreas Gruenbacher
08ca56ffcd gfs2: gfs2_chain_bio start sector fix
Pass the start sector into gfs2_chain_bio(): the new bio isn't
necessarily contiguous with the previous one.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2026-01-26 14:28:17 +01:00
Andreas Gruenbacher
4a94f052e0 gfs2: Initialize bio->bi_opf early
Pass the right blk_opf_t value to bio_alloc() so that ->bi_ops is
initialized correctly and doesn't have to be changed later.  Adjust the
call chain to pass that value through to where it is needed (and only
there).

Add a separate blk_opf_t argument to gfs2_chain_bio() instead of copying
the value from the previous bio.

Fixes: 8a157e0a0a ("gfs2: Fix use of bio_chain")
Reported-by: syzbot+f6539d4ce3f775aee0cc@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=f6539d4ce3f775aee0cc
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2026-01-26 14:28:17 +01:00
Andreas Gruenbacher
59d81037d3 gfs2: Rename gfs2_log_submit_{bio -> write}
Rename gfs2_log_submit_bio() to gfs2_log_submit_write(): this function
isn't used for submitting log reads.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2026-01-26 14:28:17 +01:00
Andreas Gruenbacher
4928c36536 gfs2: Do not cancel internal demote requests
Trying to change the state of a glock may result in a "conversion
deadlock" error, indicating that the requested state transition would
cause a deadlock.  In this case, we unlock and retry the state change.
It makes no sense to try canceling those unlock requests, but the
current code is not aware of that.

In addition, if a locking request is canceled shortly after it is made,
the cancelation request can currently overtake the locking request.
This may result in deadlocks.

Fix both of these bugs by repurposing the GLF_PENDING_REPLY flag into a
GLF_MAY_CANCEL flag which is set only when the current locking request
can be canceled.  When trying to cancel a locking request in
gfs2_glock_dq(), wait for this flag to be set.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2026-01-26 14:28:17 +01:00
Andreas Gruenbacher
5e3319932a gfs2: run_queue cleanup
In run_queue(), instead of always setting the GLF_LOCK flag, only set it
when the flag is actually needed.  This avoids having to undo the flag
setting later.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2026-01-26 14:28:17 +01:00
Andreas Gruenbacher
11d763f0b0 gfs2: Retries missing in gfs2_{rename,exchange}
Fix a bug in gfs2's asynchronous glock handling for rename and exchange
operations.  The original async implementation from commit ad26967b9a
("gfs2: Use async glocks for rename") mentioned that retries were needed
but never implemented them, causing operations to fail with -ESTALE
instead of retrying on timeout.

Also makes the waiting interruptible.

In addition, the timeouts used were too high for situations in which
timing out is a rare but expected scenario.  Switch to shorter timeouts
with randomization and exponentional backoff.

Fixes: ad26967b9a ("gfs2: Use async glocks for rename")
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2026-01-26 14:28:17 +01:00
Andreas Gruenbacher
f8f04248c7 gfs2: glock cancelation flag fix
When an asynchronous glock holder is dequeued that hasn't been granted
yet (HIF_HOLDER not set) and no dlm operation is in progress on behalf
of that holder (GLF_LOCK not set), the dequeuing takes place in
__gfs2_glock_dq().  There, we are not clearing the HIF_WAIT flag and
waking up waiters.  Fix that.

This bug prevents the same holder from being enqueued later (gfs2_glock_nq())
without first reinitializing it (gfs2_holder_reinit()).  The code
doesn't currently use this pattern, but this will change in the next
commit.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2026-01-26 14:28:17 +01:00