linux/arch
Baolin Wang 9970a9a27f arm64: mm: implement the architecture-specific test_and_clear_young_ptes()
Implement the Arm64 architecture-specific test_and_clear_young_ptes() to
enable batched checking of young flags, improving performance during large
folio reclamation when MGLRU is enabled.

While we're at it, simplify ptep_test_and_clear_young() by calling
test_and_clear_young_ptes().  Since callers guarantee that PTEs are
present before calling these functions, we can use pte_cont() to check the
CONT_PTE flag instead of pte_valid_cont().
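
The relationship between the two helpers can be sketched in plain userspace C.
This is only an illustrative model, not the arm64 code from this patch: the
`sketch_` names, the `sketch_pte_t` type, and the position of the young bit are
assumptions made for the example, and the real implementation additionally has
to handle contiguous (CONT_PTE) mappings and atomic updates.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical userspace stand-in for a page table entry (not the kernel's pte_t). */
typedef uint64_t sketch_pte_t;

/* Illustrative "young" (accessed) flag; the bit position is an assumption of this sketch. */
#define SKETCH_PTE_YOUNG ((sketch_pte_t)1 << 10)

/*
 * Batched helper: clear the young flag on nr consecutive entries and
 * report whether any of them had it set.
 */
static bool sketch_test_and_clear_young_ptes(sketch_pte_t *ptep, size_t nr)
{
    bool young = false;

    assert(ptep != NULL);
    for (size_t i = 0; i < nr; i++) {
        if (ptep[i] & SKETCH_PTE_YOUNG) {
            ptep[i] &= ~SKETCH_PTE_YOUNG;
            young = true;
        }
    }
    return young;
}

/* Single-entry helper expressed as the batched helper with nr == 1. */
static bool sketch_ptep_test_and_clear_young(sketch_pte_t *ptep)
{
    return sketch_test_and_clear_young_ptes(ptep, 1);
}
```

In this model a caller walking a large folio makes one call per batch of
entries rather than one call per entry, which is the shape of the saving the
commit message describes.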

Performance testing:

Enable MGLRU, then allocate 10G of clean file-backed folios with mmap() in a
memory cgroup, and try to reclaim 8G of file-backed folios via the
memory.reclaim interface.  I observe a 60%+ performance improvement on
my Arm64 32-core server (and about a 15% improvement on my X86 machine).

W/o patchset:
real	0m0.470s
user	0m0.000s
sys	0m0.470s

W/ patchset:
real	0m0.180s
user	0m0.001s
sys	0m0.179s

Link: https://lkml.kernel.org/r/7f891d42a720cc2e57862f3b79e4f774404f313c.1772778858.git.baolin.wang@linux.alibaba.com
Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com>
Reviewed-by: Rik van Riel <riel@surriel.com>
Reviewed-by: David Hildenbrand (Arm) <david@kernel.org>
Cc: Alistair Popple <apopple@nvidia.com>
Cc: Axel Rasmussen <axelrasmussen@google.com>
Cc: Barry Song <baohua@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Dev Jain <dev.jain@arm.com>
Cc: Jann Horn <jannh@google.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Lorenzo Stoakes (Oracle) <ljs@kernel.org>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Qi Zheng <zhengqi.arch@bytedance.com>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: Shakeel Butt <shakeel.butt@linux.dev>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Wei Xu <weixugc@google.com>
Cc: Will Deacon <will@kernel.org>
Cc: Yuanchu Xie <yuanchu@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-04-05 13:53:16 -07:00
alpha arch, mm: consolidate empty_zero_page 2026-04-05 13:53:01 -07:00
arc arch, mm: consolidate empty_zero_page 2026-04-05 13:53:01 -07:00
arm arch, mm: consolidate empty_zero_page 2026-04-05 13:53:01 -07:00
arm64 arm64: mm: implement the architecture-specific test_and_clear_young_ptes() 2026-04-05 13:53:16 -07:00
csky arch, mm: consolidate empty_zero_page 2026-04-05 13:53:01 -07:00
hexagon arch, mm: consolidate empty_zero_page 2026-04-05 13:53:01 -07:00
loongarch LoongArch/mm: align vmemmap to maximal folio size 2026-04-05 13:53:08 -07:00
m68k arch, mm: consolidate empty_zero_page 2026-04-05 13:53:01 -07:00
microblaze arch, mm: consolidate empty_zero_page 2026-04-05 13:53:01 -07:00
mips mm: cache struct page for empty_zero_page and return it from ZERO_PAGE() 2026-04-05 13:53:01 -07:00
nios2 arch, mm: consolidate empty_zero_page 2026-04-05 13:53:01 -07:00
openrisc arch, mm: consolidate empty_zero_page 2026-04-05 13:53:01 -07:00
parisc arch, mm: consolidate empty_zero_page 2026-04-05 13:53:01 -07:00
powerpc mm: rename zap_vma_pages() to zap_vma() 2026-04-05 13:53:14 -07:00
riscv riscv/mm: align vmemmap to maximal folio size 2026-04-05 13:53:08 -07:00
s390 mm: rename zap_page_range_single() to zap_vma_range() 2026-04-05 13:53:15 -07:00
sh arch, mm: consolidate empty_zero_page 2026-04-05 13:53:01 -07:00
sparc mm: cache struct page for empty_zero_page and return it from ZERO_PAGE() 2026-04-05 13:53:01 -07:00
um arch, mm: consolidate empty_zero_page 2026-04-05 13:53:01 -07:00
x86 mm: rename zap_vma_ptes() to zap_special_vma_range() 2026-04-05 13:53:15 -07:00
xtensa arch, mm: consolidate empty_zero_page 2026-04-05 13:53:01 -07:00
.gitignore
Kconfig sched: Move clock related paravirt code to kernel/sched 2026-01-12 15:39:14 +01:00