linux/include
Vernon Yang 0562041977 mm: khugepaged: skip lazy-free folios
For example, create three tasks: hot1 -> cold -> hot2.  After all three
tasks are created, each allocates 128 MB of memory.  The hot1 and hot2
tasks continuously access their 128 MB of memory, while the cold task
only accesses its memory briefly and then calls madvise(MADV_FREE).
However, khugepaged still prioritizes scanning the cold task and only
scans the hot2 task after completing the scan of the cold task.
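
As a rough userspace illustration of what the cold task does, the sketch
below maps anonymous memory, touches it once, and then marks it lazy-free
with madvise(MADV_FREE).  The helper names map_and_touch() and
mark_lazyfree() are hypothetical and are not taken from the actual test
program; the MADV_FREE fallback value is an assumption for older libc
headers.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>

#ifndef MADV_FREE
#define MADV_FREE 8 /* Linux value; assumed fallback for older headers */
#endif

/*
 * Hypothetical helper: map len bytes of private anonymous memory and
 * write to every byte so that all pages are faulted in.
 * Returns the mapping, or NULL if mmap() fails.
 */
static void *map_and_touch(size_t len)
{
    assert(len > 0);

    void *buf = mmap(NULL, len, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (buf == MAP_FAILED)
        return NULL;

    memset(buf, 0xa5, len); /* the "brief access" of the cold task */
    return buf;
}

/*
 * Hypothetical helper: tell the kernel the range may be freed lazily.
 * Returns 0 on success, -1 with errno set on failure (for example
 * EINVAL on kernels without MADV_FREE support).
 */
static int mark_lazyfree(void *buf, size_t len)
{
    return madvise(buf, len, MADV_FREE);
}
```

A cold task along these lines would call map_and_touch(128UL << 20),
then mark_lazyfree() on the result, and then stay alive without
accessing the range again.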

All folios in a VM_DROPPABLE mapping are lazyfree.  Collapsing maintains
that property, so we can just collapse, and future memory pressure will
free the memory.  In contrast, collapsing in a !VM_DROPPABLE mapping
does not maintain that property: the collapsed folio will not be
lazyfree, and future memory pressure will not be able to free it.

So if the user has explicitly informed us via MADV_FREE that this memory
will be freed, and this vma does not have the VM_DROPPABLE flag, it is
appropriate for khugepaged to simply skip it, thereby avoiding
unnecessary scan and collapse operations and reducing wasted CPU time.
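
The skip rule described above can be modelled as a small predicate.  This
is only a userspace sketch of the decision, not the kernel code: the
struct and function names (model_folio, model_vma, model_should_skip)
are hypothetical stand-ins for the real folio and vma state.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; not the kernel's struct folio or vma types. */
struct model_folio {
    bool lazyfree;  /* folio was marked lazy-free via MADV_FREE */
};

struct model_vma {
    bool droppable; /* mapping carries VM_DROPPABLE */
};

/*
 * Model of the rule in the text: skip a lazy-free folio unless the
 * mapping is VM_DROPPABLE, where collapsing keeps the folio lazy-free.
 */
static bool model_should_skip(const struct model_folio *folio,
                              const struct model_vma *vma)
{
    assert(folio != NULL && vma != NULL);
    return folio->lazyfree && !vma->droppable;
}
```

Under this model, only the combination of a lazy-free folio and a
!VM_DROPPABLE mapping is skipped; every other combination is still
eligible for scanning and collapse.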

Here are the performance test results:
(Higher is better for throughput; lower is better for all other metrics)

Testing on x86_64 machine:

| task hot2           | without patch | with patch    |  delta  |
|---------------------|---------------|---------------|---------|
| total access time   |  3.14 sec     |  2.93 sec     | -6.69%  |
| cycles per access   |  4.96         |  2.21         | -55.44% |
| Throughput          |  104.38 M/sec |  111.89 M/sec | +7.19%  |
| dTLB-load-misses    |  284814532    |  69597236     | -75.56% |

Testing on qemu-system-x86_64 -enable-kvm:

| task hot2           | without patch | with patch    |  delta  |
|---------------------|---------------|---------------|---------|
| total access time   |  3.35 sec     |  2.96 sec     | -11.64% |
| cycles per access   |  7.29         |  2.07         | -71.60% |
| Throughput          |  97.67 M/sec  |  110.77 M/sec | +13.41% |
| dTLB-load-misses    |  241600871    |  3216108      | -98.67% |

[vernon2gm@gmail.com: add comment about VM_DROPPABLE in code, make it clearer]
  Link: https://lkml.kernel.org/r/i4uowkt4h2ev47obm5h2vtd4zbk6fyw5g364up7kkjn2vmcikq@auepvqethj5r
Link: https://lkml.kernel.org/r/20260221093918.1456187-5-vernon2gm@gmail.com
Signed-off-by: Vernon Yang <yanglincheng@kylinos.cn>
Acked-by: David Hildenbrand (arm) <david@kernel.org>
Reviewed-by: Lance Yang <lance.yang@linux.dev>
Reviewed-by: Barry Song <baohua@kernel.org>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: Dev Jain <dev.jain@arm.com>
Cc: Liam Howlett <Liam.Howlett@oracle.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Nico Pache <npache@redhat.com>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Zi Yan <ziy@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-04-05 13:53:03 -07:00
acpi mailbox: platform and core updates 2026-02-14 11:13:32 -08:00
asm-generic kbuild: Split .modinfo out from ELF_DETAILS 2026-02-26 11:50:19 -07:00
clocksource
crypto Networking changes for 7.0 2026-02-11 19:31:52 -08:00
cxl
drm drm/dp: Add definition for Panel Replay full-line granularity 2026-03-04 15:26:08 +02:00
dt-bindings phy-for-7.0 2026-02-17 11:40:04 -08:00
hyperv Revert "mshv: expose the scrub partition hypercall" 2026-03-11 16:54:24 +00:00
keys
kunit kunit: irq: Ensure timer doesn't fire too frequently 2026-02-24 14:44:21 -08:00
kvm
linux mm: add folio_test_lazyfree helper 2026-04-05 13:53:03 -07:00
math-emu
media [GIT PULL for v7.0] media updates 2026-02-11 12:20:25 -08:00
memory
misc
net Just a few updates: 2026-03-18 19:25:41 -07:00
pcmcia
ras
rdma RDMA/core: Check id_priv->restricted_node_type in cma_listen_on_dev() 2026-02-25 07:50:10 -05:00
rv rv: Fix multiple definition of __pcpu_unique_da_mon_this 2026-02-20 13:12:00 +01:00
scsi SCSI misc on 20260212 2026-02-12 15:43:02 -08:00
soc
sound ASoC: Fixes for v7.0 2026-03-05 17:22:14 +01:00
target
trace mm: khugepaged: skip lazy-free folios 2026-04-05 13:53:03 -07:00
uapi ARM: 2026-03-15 12:22:10 -07:00
ufs
vdso
video
xen xen/xenbus: better handle backend crash 2026-03-04 15:31:40 +01:00
Kbuild