Commit Graph

113 Commits

Author SHA1 Message Date
SeongJae Park
0199390a6b mm/damon/sysfs: dealloc repeat_call_control if damon_call() fails
damon_call() for repeat_call_control of DAMON_SYSFS could fail if somehow
the kdamond is stopped before the damon_call().  It could happen, for
example, when te damon context was made for monitroing of a virtual
address processes, and the process is terminated immediately, before the
damon_call() invocation.  In the case, the dyanmically allocated
repeat_call_control is not deallocated and leaked.

Fix the leak by deallocating the repeat_call_control under the
damon_call() failure.

This issue is discovered by sashiko [1].

Link: https://lkml.kernel.org/r/20260327003224.55752-1-sj@kernel.org
Link: https://lore.kernel.org/20260320020630.962-1-sj@kernel.org [1]
Fixes: 04a06b139e ("mm/damon/sysfs: use dynamically allocated repeat mode damon_call_control")
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: <stable@vger.kernel.org>	[6.17+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-04-06 11:13:42 -07:00
Josh Law
6557004a8b mm/damon/sysfs: check contexts->nr in repeat_call_fn
damon_sysfs_repeat_call_fn() calls damon_sysfs_upd_tuned_intervals(),
damon_sysfs_upd_schemes_stats(), and
damon_sysfs_upd_schemes_effective_quotas() without checking contexts->nr. 
If nr_contexts is set to 0 via sysfs while DAMON is running, these
functions dereference contexts_arr[0] and cause a NULL pointer
dereference.  Add the missing check.

For example, the issue can be reproduced using DAMON sysfs interface and
DAMON user-space tool (damo) [1] like below.

    $ sudo damo start --refresh_interval 1s
    $ echo 0 | sudo tee \
            /sys/kernel/mm/damon/admin/kdamonds/0/contexts/nr_contexts

Link: https://patch.msgid.link/20260320163559.178101-3-objecting@objecting.org
Link: https://lkml.kernel.org/r/20260321175427.86000-4-sj@kernel.org
Link: https://github.com/damonitor/damo [1]
Fixes: d809a7c64b ("mm/damon/sysfs: implement refresh_ms file internal work")
Signed-off-by: Josh Law <objecting@objecting.org>
Reviewed-by: SeongJae Park <sj@kernel.org>
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: <stable@vger.kernel.org>	[6.17+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-03-27 20:48:38 -07:00
Josh Law
1bfe9fb5ed mm/damon/sysfs: check contexts->nr before accessing contexts_arr[0]
Multiple sysfs command paths dereference contexts_arr[0] without first
verifying that kdamond->contexts->nr == 1.  A user can set nr_contexts to
0 via sysfs while DAMON is running, causing NULL pointer dereferences.

In more detail, the issue can be triggered by privileged users like
below.

First, start DAMON and make contexts directory empty
(kdamond->contexts->nr == 0).

    # damo start
    # cd /sys/kernel/mm/damon/admin/kdamonds/0
    # echo 0 > contexts/nr_contexts

Then, each of below commands will cause the NULL pointer dereference.

    # echo update_schemes_stats > state
    # echo update_schemes_tried_regions > state
    # echo update_schemes_tried_bytes > state
    # echo update_schemes_effective_quotas > state
    # echo update_tuned_intervals > state

Guard all commands (except OFF) at the entry point of
damon_sysfs_handle_cmd().

Link: https://lkml.kernel.org/r/20260321175427.86000-3-sj@kernel.org
Fixes: 0ac32b8aff ("mm/damon/sysfs: support DAMOS stats")
Signed-off-by: Josh Law <objecting@objecting.org>
Reviewed-by: SeongJae Park <sj@kernel.org>
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: <stable@vger.kernel.org>	[5.18+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-03-27 20:48:38 -07:00
Josh Law
7fe000eb32 mm/damon/sysfs: fix param_ctx leak on damon_sysfs_new_test_ctx() failure
Patch series "mm/damon/sysfs: fix memory leak and NULL dereference
issues", v4.

DAMON_SYSFS can leak memory under allocation failure, and do NULL pointer
dereference when a privileged user make wrong sequences of control.  Fix
those.


This patch (of 3):

When damon_sysfs_new_test_ctx() fails in damon_sysfs_commit_input(),
param_ctx is leaked because the early return skips the cleanup at the out
label.  Destroy param_ctx before returning.

Link: https://lkml.kernel.org/r/20260321175427.86000-1-sj@kernel.org
Link: https://lkml.kernel.org/r/20260321175427.86000-2-sj@kernel.org
Fixes: f0c5118ebb ("mm/damon/sysfs: catch commit test ctx alloc failure")
Signed-off-by: Josh Law <objecting@objecting.org>
Reviewed-by: SeongJae Park <sj@kernel.org>
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: <stable@vger.kernel.org>	[6.18+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-03-27 20:48:37 -07:00
Linus Torvalds
32a92f8c89 Convert more 'alloc_obj' cases to default GFP_KERNEL arguments
This converts some of the visually simpler cases that have been split
over multiple lines.  I only did the ones that are easy to verify the
resulting diff by having just that final GFP_KERNEL argument on the next
line.

Somebody should probably do a proper coccinelle script for this, but for
me the trivial script actually resulted in an assertion failure in the
middle of the script.  I probably had made it a bit _too_ trivial.

So after fighting that far a while I decided to just do some of the
syntactically simpler cases with variations of the previous 'sed'
scripts.

The more syntactically complex multi-line cases would mostly really want
whitespace cleanup anyway.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2026-02-21 20:03:00 -08:00
Linus Torvalds
bf4afc53b7 Convert 'alloc_obj' family to use the new default GFP_KERNEL argument
This was done entirely with mindless brute force, using

    git grep -l '\<k[vmz]*alloc_objs*(.*, GFP_KERNEL)' |
        xargs sed -i 's/\(alloc_objs*(.*\), GFP_KERNEL)/\1)/'

to convert the new alloc_obj() users that had a simple GFP_KERNEL
argument to just drop that argument.

Note that due to the extreme simplicity of the scripting, any slightly
more complex cases spread over multiple lines would not be triggered:
they definitely exist, but this covers the vast bulk of the cases, and
the resulting diff is also then easier to check automatically.

For the same reason the 'flex' versions will be done as a separate
conversion.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2026-02-21 17:09:51 -08:00
Kees Cook
69050f8d6d treewide: Replace kmalloc with kmalloc_obj for non-scalar types
This is the result of running the Coccinelle script from
scripts/coccinelle/api/kmalloc_objs.cocci. The script is designed to
avoid scalar types (which need careful case-by-case checking), and
instead replace kmalloc-family calls that allocate struct or union
object instances:

Single allocations:	kmalloc(sizeof(TYPE), ...)
are replaced with:	kmalloc_obj(TYPE, ...)

Array allocations:	kmalloc_array(COUNT, sizeof(TYPE), ...)
are replaced with:	kmalloc_objs(TYPE, COUNT, ...)

Flex array allocations:	kmalloc(struct_size(PTR, FAM, COUNT), ...)
are replaced with:	kmalloc_flex(*PTR, FAM, COUNT, ...)

(where TYPE may also be *VAR)

The resulting allocations no longer return "void *", instead returning
"TYPE *".

Signed-off-by: Kees Cook <kees@kernel.org>
2026-02-21 01:02:28 -08:00
SeongJae Park
cc1db8dff8 mm/damon: rename min_sz_region of damon_ctx to min_region_sz
'min_sz_region' field of 'struct damon_ctx' represents the minimum size of
each DAMON region for the context.  'struct damos_access_pattern' has a
field of the same name.  It confuses readers and makes 'grep' less optimal
for them.  Rename it to 'min_region_sz'.

Link: https://lkml.kernel.org/r/20260117175256.82826-9-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-01-31 14:22:47 -08:00
SeongJae Park
dfb1b0c9dc mm/damon: rename DAMON_MIN_REGION to DAMON_MIN_REGION_SZ
The macro is for the default minimum size of each DAMON region.  There was
a case that a reader was confused if it is the minimum number of total
DAMON regions, which is set on damon_attrs->min_nr_regions.  Make the name
more explicit.

Link: https://lkml.kernel.org/r/20260117175256.82826-8-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-01-31 14:22:46 -08:00
SeongJae Park
f54b51ce31 mm/damon/sysfs: use damon_kdamond_pid()
DAMON sysfs interface directly uses damon_ctx->kdamond field with manual
synchronization using damon_ctx->kdamond_lock, to get the pid of the
kdamond.  Use a new dedicated function for the purpose, namely
damon_kdamond_pid(), since that doesn't require manual and error-prone
synchronization.

Avoid use of kdamond_lock outside of the core.

Link: https://lkml.kernel.org/r/20260115152047.68415-3-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-01-26 20:02:37 -08:00
SeongJae Park
9814cc832b mm/damon/sysfs: cleanup attrs subdirs on context dir setup failure
When a context DAMON sysfs directory setup is failed after setup of attrs/
directory, subdirectories of attrs/ directory are not cleaned up.  As a
result, DAMON sysfs interface is nearly broken until the system reboots,
and the memory for the unremoved directory is leaked.

Cleanup the directories under such failures.

Link: https://lkml.kernel.org/r/20251225023043.18579-3-sj@kernel.org
Fixes: c951cd3b89 ("mm/damon: implement a minimal stub for sysfs-based DAMON interface")
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: chongjiapeng <jiapeng.chong@linux.alibaba.com>
Cc: <stable@vger.kernel.org> # 5.18.x
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-01-14 22:16:23 -08:00
SeongJae Park
a24ca8ebb0 mm/damon/sysfs: cleanup intervals subdirs on attrs dir setup failure
Patch series "mm/damon/sysfs: free setup failures generated zombie sub-sub
dirs".

Some DAMON sysfs directory setup functions generates its sub and sub-sub
directories.  For example, 'monitoring_attrs/' directory setup creates
'intervals/' and 'intervals/intervals_goal/' directories under
'monitoring_attrs/' directory.  When such sub-sub directories are
successfully made but followup setup is failed, the setup function should
recursively clean up the subdirectories.

However, such setup functions are only dereferencing sub directory
reference counters.  As a result, under certain setup failures, the
sub-sub directories keep having non-zero reference counters.  It means the
directories cannot be removed like zombies, and the memory for the
directories cannot be freed.

The user impact of this issue is limited due to the following reasons.

When the issue happens, the zombie directories are still taking the path. 
Hence attempts to generate the directories again will fail, without
additional memory leak.  This means the upper bound memory leak is
limited.  Nonetheless this also implies controlling DAMON with a feature
that requires the setup-failed sysfs files will be impossible until the
system reboots.

Also, the setup operations are quite simple.  The certain failures would
hence only rarely happen, and are difficult to artificially trigger.


This patch (of 4):

When attrs/ DAMON sysfs directory setup is failed after setup of
intervals/ directory, intervals/intervals_goal/ directory is not cleaned
up.  As a result, DAMON sysfs interface is nearly broken until the system
reboots, and the memory for the unremoved directory is leaked.

Cleanup the directory under such failures.

Link: https://lkml.kernel.org/r/20251225023043.18579-1-sj@kernel.org
Link: https://lkml.kernel.org/r/20251225023043.18579-2-sj@kernel.org
Fixes: 8fbbcbeaaf ("mm/damon/sysfs: implement intervals tuning goal directory")
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: chongjiapeng <jiapeng.chong@linux.alibaba.com>
Cc: <stable@vger.kernel.org> # 6.15.x
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-01-14 22:16:23 -08:00
SeongJae Park
e35afdf228 mm/damon/sysfs: implement obsolete_target file
There is no good way to remove DAMON targets in the middle of the existing
targets list.  It restricts efficient and flexible DAMON use cases. 
Improve the usability by implementing a new DAMON sysfs interface file,
namely obsolete_target, under each target directory.  It is connected to
the obsolete field of parameters commit-source targets, so allows removing
arbitrary targets in the middle of existing targets list.

Note that the sysfs files are not automatically updated.  For example,
let's suppose there are three targets in the running context, and a user
removes the third target using this feature.  If the user writes 'commit'
to the kdamond 'state' file again, DAMON sysfs interface will again try to
remove the third target.  But because there is no matching target in the
running context, the commit will fail.  It is the user's responsibility to
understand resulting DAMON internal targets list change, and construct
sysfs files (using nr_targets and other sysfs files) to correctly
represent it.

Also note that this is arguably an improvement rather than a fix of broken
things.

Link: https://lkml.kernel.org/r/20251023012535.69625-4-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Reported-by: Bijan Tabatabai <bijan311@gmail.com>
Closes: https://github.com/damonitor/damo/issues/36
Reviewed-by: Bijan Tabatabai <bijan311@gmail.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-11-16 17:28:23 -08:00
SeongJae Park
60bd24f272 mm/damon/sysfs: test commit input against realistic destination
DAMON sysfs interface tests if given online parameters update request is
valid, by committing those using the DAMON kernel API, to a test-purpose
destination context.  The test-purpose destination context is constructed
using damon_new_ctx(), so it has no target, no scheme.

If a source target has the obsolete field set, the test-purpose commit
will fail because damon_commit_targets() fails when there is a source
obsolete target that cannot find its matching destination target.  DAMON
sysfs interface is not letting users set the field for now, so there is no
problem.  However, the following commit will support that.  Also there
could be similar future changes that making commit fails based on current
context structure.

Make the test purpose commit destination context similar to the current
running one, by committing the running one to the test purpose context,
before doing the real test-purpose commit.

Link: https://lkml.kernel.org/r/20251023012535.69625-3-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Reviewed-by: Bijan Tabatabai <bijan311@gmail.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-11-16 17:28:23 -08:00
jianyun.gao
b6c46600bf mm: fix some typos in mm module
Below are some typos in the code comments:

  intevals ==> intervals
  addesses ==> addresses
  unavaliable ==> unavailable
  facor ==> factor
  droping ==> dropping
  exlusive ==> exclusive
  decription ==> description
  confict ==> conflict
  desriptions ==> descriptions
  otherwize ==> otherwise
  vlaue ==> value
  cheching ==> checking
  exisitng ==> existing
  modifed ==> modified
  differenciate ==> differentiate
  refernece ==> reference
  permissons ==> permissions
  indepdenent ==> independent
  spliting ==> splitting

Just fix it.

Link: https://lkml.kernel.org/r/20250929002608.1633825-1-jianyungao89@gmail.com
Signed-off-by: jianyun.gao <jianyungao89@gmail.com>
Reviewed-by: SeongJae Park <sj@kernel.org>
Reviewed-by: Wei Yang <richard.weiyang@gmail.com>
Reviewed-by: Dev Jain <dev.jain@arm.com>
Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Acked-by: Chris Li <chrisl@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-11-16 17:27:52 -08:00
Quanmin Yan
9fd7bb5083 mm/damon/sysfs: change next_update_jiffies to a global variable
In DAMON's damon_sysfs_repeat_call_fn(), time_before() is used to compare
the current jiffies with next_update_jiffies to determine whether to
update the sysfs files at this moment.

On 32-bit systems, the kernel initializes jiffies to "-5 minutes" to make
jiffies wrap bugs appear earlier. However, this causes time_before() in
damon_sysfs_repeat_call_fn() to unexpectedly return true during the first
5 minutes after boot on 32-bit systems (see [1] for more explanation,
which fixes another jiffies-related issue before). As a result, DAMON
does not update sysfs files during that period.

There is also an issue unrelated to the system's word size[2]: if the
user stops DAMON just after next_update_jiffies is updated and restarts
it after 'refresh_ms' or a longer delay, next_update_jiffies will retain
an older value, causing time_before() to return false and the update to
happen earlier than expected.

Fix these issues by making next_update_jiffies a global variable and
initializing it each time DAMON is started.

Link: https://lkml.kernel.org/r/20251030020746.967174-3-yanquanmin1@huawei.com
Link: https://lkml.kernel.org/r/20250822025057.1740854-1-ekffu200098@gmail.com [1]
Link: https://lore.kernel.org/all/20251029013038.66625-1-sj@kernel.org/ [2]
Fixes: d809a7c64b ("mm/damon/sysfs: implement refresh_ms file internal work")
Suggested-by: SeongJae Park <sj@kernel.org>
Reviewed-by: SeongJae Park <sj@kernel.org>
Signed-off-by: Quanmin Yan <yanquanmin1@huawei.com>
Cc: Kefeng Wang <wangkefeng.wang@huawei.com>
Cc: ze zuo <zuoze1@huawei.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-11-09 21:19:45 -08:00
SeongJae Park
139e7a572a mm/damon/sysfs: dealloc commit test ctx always
The damon_ctx for testing online DAMON parameters commit inputs is
deallocated only when the test fails.  This means memory is leaked for
every successful online DAMON parameters commit.  Fix the leak by always
deallocating it.

Link: https://lkml.kernel.org/r/20251003201455.41448-3-sj@kernel.org
Fixes: 4c9ea539ad ("mm/damon/sysfs: validate user inputs from damon_sysfs_commit_input()")
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: <stable@vger.kernel.org>	[6.15+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-10-15 13:24:33 -07:00
SeongJae Park
f0c5118ebb mm/damon/sysfs: catch commit test ctx alloc failure
Patch series "mm/damon/sysfs: fix commit test damon_ctx [de]allocation".

DAMON sysfs interface dynamically allocates and uses a damon_ctx object
for testing if given inputs for online DAMON parameters update is valid.
The object is being used without an allocation failure check, and leaked
when the test succeeds.  Fix the two bugs.


This patch (of 2):

The damon_ctx for testing online DAMON parameters commit inputs is used
without its allocation failure check.  This could result in an invalid
memory access.  Fix it by directly returning an error when the allocation
failed.

Link: https://lkml.kernel.org/r/20251003201455.41448-1-sj@kernel.org
Link: https://lkml.kernel.org/r/20251003201455.41448-2-sj@kernel.org
Fixes: 4c9ea539ad ("mm/damon/sysfs: validate user inputs from damon_sysfs_commit_input()")
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: <stable@vger.kernel.org>	[6.15+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-10-15 13:24:33 -07:00
Linus Torvalds
8804d970fa Summary of significant series in this pull request:
- The 3 patch series "mm, swap: improve cluster scan strategy" from
   Kairui Song improves performance and reduces the failure rate of swap
   cluster allocation.
 
 - The 4 patch series "support large align and nid in Rust allocators"
   from Vitaly Wool permits Rust allocators to set NUMA node and large
   alignment when perforning slub and vmalloc reallocs.
 
 - The 2 patch series "mm/damon/vaddr: support stat-purpose DAMOS" from
   Yueyang Pan extend DAMOS_STAT's handling of the DAMON operations sets
   for virtual address spaces for ops-level DAMOS filters.
 
 - The 3 patch series "execute PROCMAP_QUERY ioctl under per-vma lock"
   from Suren Baghdasaryan reduces mmap_lock contention during reads of
   /proc/pid/maps.
 
 - The 2 patch series "mm/mincore: minor clean up for swap cache
   checking" from Kairui Song performs some cleanup in the swap code.
 
 - The 11 patch series "mm: vm_normal_page*() improvements" from David
   Hildenbrand provides code cleanup in the pagemap code.
 
 - The 5 patch series "add persistent huge zero folio support" from
   Pankaj Raghav provides a block layer speedup by optionalls making the
   huge_zero_pagepersistent, instead of releasing it when its refcount
   falls to zero.
 
 - The 3 patch series "kho: fixes and cleanups" from Mike Rapoport adds a
   few touchups to the recently added Kexec Handover feature.
 
 - The 10 patch series "mm: make mm->flags a bitmap and 64-bit on all
   arches" from Lorenzo Stoakes turns mm_struct.flags into a bitmap.  To
   end the constant struggle with space shortage on 32-bit conflicting with
   64-bit's needs.
 
 - The 2 patch series "mm/swapfile.c and swap.h cleanup" from Chris Li
   cleans up some swap code.
 
 - The 7 patch series "selftests/mm: Fix false positives and skip
   unsupported tests" from Donet Tom fixes a few things in our selftests
   code.
 
 - The 7 patch series "prctl: extend PR_SET_THP_DISABLE to only provide
   THPs when advised" from David Hildenbrand "allows individual processes
   to opt-out of THP=always into THP=madvise, without affecting other
   workloads on the system".
 
   It's a long story - the [1/N] changelog spells out the considerations.
 
 - The 11 patch series "Add and use memdesc_flags_t" from Matthew Wilcox
   gets us started on the memdesc project.  Please see
   https://kernelnewbies.org/MatthewWilcox/Memdescs and
   https://blogs.oracle.com/linux/post/introducing-memdesc.
 
 - The 3 patch series "Tiny optimization for large read operations" from
   Chi Zhiling improves the efficiency of the pagecache read path.
 
 - The 5 patch series "Better split_huge_page_test result check" from Zi
   Yan improves our folio splitting selftest code.
 
 - The 2 patch series "test that rmap behaves as expected" from Wei Yang
   adds some rmap selftests.
 
 - The 3 patch series "remove write_cache_pages()" from Christoph Hellwig
   removes that function and converts its two remaining callers.
 
 - The 2 patch series "selftests/mm: uffd-stress fixes" from Dev Jain
   fixes some UFFD selftests issues.
 
 - The 3 patch series "introduce kernel file mapped folios" from Boris
   Burkov introduces the concept of "kernel file pages".  Using these
   permits btrfs to account its metadata pages to the root cgroup, rather
   than to the cgroups of random inappropriate tasks.
 
 - The 2 patch series "mm/pageblock: improve readability of some
   pageblock handling" from Wei Yang provides some readability improvements
   to the page allocator code.
 
 - The 11 patch series "mm/damon: support ARM32 with LPAE" from SeongJae
   Park teaches DAMON to understand arm32 highmem.
 
 - The 4 patch series "tools: testing: Use existing atomic.h for
   vma/maple tests" from Brendan Jackman performs some code cleanups and
   deduplication under tools/testing/.
 
 - The 2 patch series "maple_tree: Fix testing for 32bit compiles" from
   Liam Howlett fixes a couple of 32-bit issues in
   tools/testing/radix-tree.c.
 
 - The 2 patch series "kasan: unify kasan_enabled() and remove
   arch-specific implementations" from Sabyrzhan Tasbolatov moves KASAN
   arch-specific initialization code into a common arch-neutral
   implementation.
 
 - The 3 patch series "mm: remove zpool" from Johannes Weiner removes
   zspool - an indirection layer which now only redirects to a single thing
   (zsmalloc).
 
 - The 2 patch series "mm: task_stack: Stack handling cleanups" from
   Pasha Tatashin makes a couple of cleanups in the fork code.
 
 - The 37 patch series "mm: remove nth_page()" from David Hildenbrand
   makes rather a lot of adjustments at various nth_page() callsites,
   eventually permitting the removal of that undesirable helper function.
 
 - The 2 patch series "introduce kasan.write_only option in hw-tags" from
   Yeoreum Yun creates a KASAN read-only mode for ARM, using that
   architecture's memory tagging feature.  It is felt that a read-only mode
   KASAN is suitable for use in production systems rather than debug-only.
 
 - The 3 patch series "mm: hugetlb: cleanup hugetlb folio allocation"
   from Kefeng Wang does some tidying in the hugetlb folio allocation code.
 
 - The 12 patch series "mm: establish const-correctness for pointer
   parameters" from Max Kellermann makes quite a number of the MM API
   functions more accurate about the constness of their arguments.  This
   was getting in the way of subsystems (in this case CEPH) when they
   attempt to improving their own const/non-const accuracy.
 
 - The 7 patch series "Cleanup free_pages() misuse" from Vishal Moola
   fixes a number of code sites which were confused over when to use
   free_pages() vs __free_pages().
 
 - The 3 patch series "Add Rust abstraction for Maple Trees" from Alice
   Ryhl makes the mapletree code accessible to Rust.  Required by nouveau
   and by its forthcoming successor: the new Rust Nova driver.
 
 - The 2 patch series "selftests/mm: split_huge_page_test:
   split_pte_mapped_thp improvements" from David Hildenbrand adds a fix and
   some cleanups to the thp selftesting code.
 
 - The 14 patch series "mm, swap: introduce swap table as swap cache
   (phase I)" from Chris Li and Kairui Song is the first step along the
   path to implementing "swap tables" - a new approach to swap allocation
   and state tracking which is expected to yield speed and space
   improvements.  This patchset itself yields a 5-20% performance benefit
   in some situations.
 
 - The 3 patch series "Some ptdesc cleanups" from Matthew Wilcox utilizes
   the new memdesc layer to clean up the ptdesc code a little.
 
 - The 3 patch series "Fix va_high_addr_switch.sh test failure" from
   Chunyu Hu fixes some issues in our 5-level pagetable selftesting code.
 
 - The 2 patch series "Minor fixes for memory allocation profiling" from
   Suren Baghdasaryan addresses a couple of minor issues in relatively new
   memory allocation profiling feature.
 
 - The 3 patch series "Small cleanups" from Matthew Wilcox has a few
   cleanups in preparation for more memdesc work.
 
 - The 2 patch series "mm/damon: add addr_unit for DAMON_LRU_SORT and
   DAMON_RECLAIM" from Quanmin Yan makes some changes to DAMON in
   furtherance of supporting arm highmem.
 
 - The 2 patch series "selftests/mm: Add -Wunreachable-code and fix
   warnings" from Muhammad Anjum adds that compiler check to selftests code
   and fixes the fallout, by removing dead code.
 
 - The 10 patch series "Improvements to Victim Process Thawing and OOM
   Reaper Traversal Order" from zhongjinji makes a number of improvements
   in the OOM killer: mainly thawing a more appropriate group of victim
   threads so they can release resources.
 
 - The 5 patch series "mm/damon: misc fixups and improvements for 6.18"
   from SeongJae Park is a bunch of small and unrelated fixups for DAMON.
 
 - The 7 patch series "mm/damon: define and use DAMON initialization
   check function" from SeongJae Park implement reliability and
   maintainability improvements to a recently-added bug fix.
 
 - The 2 patch series "mm/damon/stat: expose auto-tuned intervals and
   non-idle ages" from SeongJae Park provides additional transparency to
   userspace clients of the DAMON_STAT information.
 
 - The 2 patch series "Expand scope of khugepaged anonymous collapse"
   from Dev Jain removes some constraints on khubepaged's collapsing of
   anon VMAs.  It also increases the success rate of MADV_COLLAPSE against
   an anon vma.
 
 - The 2 patch series "mm: do not assume file == vma->vm_file in
   compat_vma_mmap_prepare()" from Lorenzo Stoakes moves us further towards
   removal of file_operations.mmap().  This patchset concentrates upon
   clearing up the treatment of stacked filesystems.
 
 - The 6 patch series "mm: Improve mlock tracking for large folios" from
   Kiryl Shutsemau provides some fixes and improvements to mlock's tracking
   of large folios.  /proc/meminfo's "Mlocked" field became more accurate.
 
 - The 2 patch series "mm/ksm: Fix incorrect accounting of KSM counters
   during fork" from Donet Tom fixes several user-visible KSM stats
   inaccuracies across forks and adds selftest code to verify these
   counters.
 
 - The 2 patch series "mm_slot: fix the usage of mm_slot_entry" from Wei
   Yang addresses some potential but presently benign issues in KSM's
   mm_slot handling.
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCaN3cywAKCRDdBJ7gKXxA
 jtaPAQDmIuIu7+XnVUK5V11hsQ/5QtsUeLHV3OsAn4yW5/3dEQD/UddRU08ePN+1
 2VRB0EwkLAdfMWW7TfiNZ+yhuoiL/AA=
 =4mhY
 -----END PGP SIGNATURE-----

Merge tag 'mm-stable-2025-10-01-19-00' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull MM updates from Andrew Morton:

 - "mm, swap: improve cluster scan strategy" from Kairui Song improves
   performance and reduces the failure rate of swap cluster allocation

 - "support large align and nid in Rust allocators" from Vitaly Wool
   permits Rust allocators to set NUMA node and large alignment when
   perforning slub and vmalloc reallocs

 - "mm/damon/vaddr: support stat-purpose DAMOS" from Yueyang Pan extend
   DAMOS_STAT's handling of the DAMON operations sets for virtual
   address spaces for ops-level DAMOS filters

 - "execute PROCMAP_QUERY ioctl under per-vma lock" from Suren
   Baghdasaryan reduces mmap_lock contention during reads of
   /proc/pid/maps

 - "mm/mincore: minor clean up for swap cache checking" from Kairui Song
   performs some cleanup in the swap code

 - "mm: vm_normal_page*() improvements" from David Hildenbrand provides
   code cleanup in the pagemap code

 - "add persistent huge zero folio support" from Pankaj Raghav provides
   a block layer speedup by optionalls making the
   huge_zero_pagepersistent, instead of releasing it when its refcount
   falls to zero

 - "kho: fixes and cleanups" from Mike Rapoport adds a few touchups to
   the recently added Kexec Handover feature

 - "mm: make mm->flags a bitmap and 64-bit on all arches" from Lorenzo
   Stoakes turns mm_struct.flags into a bitmap. To end the constant
   struggle with space shortage on 32-bit conflicting with 64-bit's
   needs

 - "mm/swapfile.c and swap.h cleanup" from Chris Li cleans up some swap
   code

 - "selftests/mm: Fix false positives and skip unsupported tests" from
   Donet Tom fixes a few things in our selftests code

 - "prctl: extend PR_SET_THP_DISABLE to only provide THPs when advised"
   from David Hildenbrand "allows individual processes to opt-out of
   THP=always into THP=madvise, without affecting other workloads on the
   system".

   It's a long story - the [1/N] changelog spells out the considerations

 - "Add and use memdesc_flags_t" from Matthew Wilcox gets us started on
   the memdesc project. Please see

      https://kernelnewbies.org/MatthewWilcox/Memdescs and
      https://blogs.oracle.com/linux/post/introducing-memdesc

 - "Tiny optimization for large read operations" from Chi Zhiling
   improves the efficiency of the pagecache read path

 - "Better split_huge_page_test result check" from Zi Yan improves our
   folio splitting selftest code

 - "test that rmap behaves as expected" from Wei Yang adds some rmap
   selftests

 - "remove write_cache_pages()" from Christoph Hellwig removes that
   function and converts its two remaining callers

 - "selftests/mm: uffd-stress fixes" from Dev Jain fixes some UFFD
   selftests issues

 - "introduce kernel file mapped folios" from Boris Burkov introduces
   the concept of "kernel file pages". Using these permits btrfs to
   account its metadata pages to the root cgroup, rather than to the
   cgroups of random inappropriate tasks

 - "mm/pageblock: improve readability of some pageblock handling" from
   Wei Yang provides some readability improvements to the page allocator
   code

 - "mm/damon: support ARM32 with LPAE" from SeongJae Park teaches DAMON
   to understand arm32 highmem

 - "tools: testing: Use existing atomic.h for vma/maple tests" from
   Brendan Jackman performs some code cleanups and deduplication under
   tools/testing/

 - "maple_tree: Fix testing for 32bit compiles" from Liam Howlett fixes
   a couple of 32-bit issues in tools/testing/radix-tree.c

 - "kasan: unify kasan_enabled() and remove arch-specific
   implementations" from Sabyrzhan Tasbolatov moves KASAN arch-specific
   initialization code into a common arch-neutral implementation

 - "mm: remove zpool" from Johannes Weiner removes zspool - an
   indirection layer which now only redirects to a single thing
   (zsmalloc)

 - "mm: task_stack: Stack handling cleanups" from Pasha Tatashin makes a
   couple of cleanups in the fork code

 - "mm: remove nth_page()" from David Hildenbrand makes rather a lot of
   adjustments at various nth_page() callsites, eventually permitting
   the removal of that undesirable helper function

 - "introduce kasan.write_only option in hw-tags" from Yeoreum Yun
   creates a KASAN read-only mode for ARM, using that architecture's
   memory tagging feature. It is felt that a read-only mode KASAN is
   suitable for use in production systems rather than debug-only

 - "mm: hugetlb: cleanup hugetlb folio allocation" from Kefeng Wang does
   some tidying in the hugetlb folio allocation code

 - "mm: establish const-correctness for pointer parameters" from Max
   Kellermann makes quite a number of the MM API functions more accurate
   about the constness of their arguments. This was getting in the way
   of subsystems (in this case CEPH) when they attempt to improving
   their own const/non-const accuracy

 - "Cleanup free_pages() misuse" from Vishal Moola fixes a number of
   code sites which were confused over when to use free_pages() vs
   __free_pages()

 - "Add Rust abstraction for Maple Trees" from Alice Ryhl makes the
   mapletree code accessible to Rust. Required by nouveau and by its
   forthcoming successor: the new Rust Nova driver

 - "selftests/mm: split_huge_page_test: split_pte_mapped_thp
   improvements" from David Hildenbrand adds a fix and some cleanups to
   the thp selftesting code

 - "mm, swap: introduce swap table as swap cache (phase I)" from Chris
   Li and Kairui Song is the first step along the path to implementing
   "swap tables" - a new approach to swap allocation and state tracking
   which is expected to yield speed and space improvements. This
   patchset itself yields a 5-20% performance benefit in some situations

 - "Some ptdesc cleanups" from Matthew Wilcox utilizes the new memdesc
   layer to clean up the ptdesc code a little

 - "Fix va_high_addr_switch.sh test failure" from Chunyu Hu fixes some
   issues in our 5-level pagetable selftesting code

 - "Minor fixes for memory allocation profiling" from Suren Baghdasaryan
   addresses a couple of minor issues in relatively new memory
   allocation profiling feature

 - "Small cleanups" from Matthew Wilcox has a few cleanups in
   preparation for more memdesc work

 - "mm/damon: add addr_unit for DAMON_LRU_SORT and DAMON_RECLAIM" from
   Quanmin Yan makes some changes to DAMON in furtherance of supporting
   arm highmem

 - "selftests/mm: Add -Wunreachable-code and fix warnings" from Muhammad
   Anjum adds that compiler check to selftests code and fixes the
   fallout, by removing dead code

 - "Improvements to Victim Process Thawing and OOM Reaper Traversal
   Order" from zhongjinji makes a number of improvements in the OOM
   killer: mainly thawing a more appropriate group of victim threads so
   they can release resources

 - "mm/damon: misc fixups and improvements for 6.18" from SeongJae Park
   is a bunch of small and unrelated fixups for DAMON

 - "mm/damon: define and use DAMON initialization check function" from
   SeongJae Park implement reliability and maintainability improvements
   to a recently-added bug fix

 - "mm/damon/stat: expose auto-tuned intervals and non-idle ages" from
   SeongJae Park provides additional transparency to userspace clients
   of the DAMON_STAT information

 - "Expand scope of khugepaged anonymous collapse" from Dev Jain removes
   some constraints on khubepaged's collapsing of anon VMAs. It also
   increases the success rate of MADV_COLLAPSE against an anon vma

 - "mm: do not assume file == vma->vm_file in compat_vma_mmap_prepare()"
   from Lorenzo Stoakes moves us further towards removal of
   file_operations.mmap(). This patchset concentrates upon clearing up
   the treatment of stacked filesystems

 - "mm: Improve mlock tracking for large folios" from Kiryl Shutsemau
   provides some fixes and improvements to mlock's tracking of large
   folios. /proc/meminfo's "Mlocked" field became more accurate

 - "mm/ksm: Fix incorrect accounting of KSM counters during fork" from
   Donet Tom fixes several user-visible KSM stats inaccuracies across
   forks and adds selftest code to verify these counters

 - "mm_slot: fix the usage of mm_slot_entry" from Wei Yang addresses
   some potential but presently benign issues in KSM's mm_slot handling

* tag 'mm-stable-2025-10-01-19-00' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (372 commits)
  mm: swap: check for stable address space before operating on the VMA
  mm: convert folio_page() back to a macro
  mm/khugepaged: use start_addr/addr for improved readability
  hugetlbfs: skip VMAs without shareable locks in hugetlb_vmdelete_list
  alloc_tag: fix boot failure due to NULL pointer dereference
  mm: silence data-race in update_hiwater_rss
  mm/memory-failure: don't select MEMORY_ISOLATION
  mm/khugepaged: remove definition of struct khugepaged_mm_slot
  mm/ksm: get mm_slot by mm_slot_entry() when slot is !NULL
  hugetlb: increase number of reserving hugepages via cmdline
  selftests/mm: add fork inheritance test for ksm_merging_pages counter
  mm/ksm: fix incorrect KSM counter handling in mm_struct during fork
  drivers/base/node: fix double free in register_one_node()
  mm: remove PMD alignment constraint in execmem_vmalloc()
  mm/memory_hotplug: fix typo 'esecially' -> 'especially'
  mm/rmap: improve mlock tracking for large folios
  mm/filemap: map entire large folio faultaround
  mm/fault: try to map the entire file folio in finish_fault()
  mm/rmap: mlock large folios in try_to_unmap_one()
  mm/rmap: fix a mlock race condition in folio_referenced_one()
  ...
2025-10-02 18:18:33 -07:00
Akinobu Mita
06195ee967 mm/damon/sysfs: do not ignore callback's return value in damon_sysfs_damon_call()
The callback return value is ignored in damon_sysfs_damon_call(), which
means that it is not possible to detect invalid user input when writing
commands such as 'commit' to
/sys/kernel/mm/damon/admin/kdamonds/<K>/state.  Fix it.

Link: https://lkml.kernel.org/r/20250920132546.5822-1-akinobu.mita@gmail.com
Fixes: f64539dcdb ("mm/damon/sysfs: use damon_call() for update_schemes_stats")
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Reviewed-by: SeongJae Park <sj@kernel.org>
Cc: <stable@vger.kernel.org>	[6.14+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-09-25 16:10:35 -07:00
SeongJae Park
8d009da32f mm/damon/sysfs: set damon_ctx->min_sz_region only for paddr use case
damon_ctx->addr_unit is respected only for physical address space
monitoring use case.  Meanwhile, damon_ctx->min_sz_region is used by the
core layer for aligning regions, regardless of whether it is set for
physical address space monitoring or virtual address spaces monitoring. 
And it is set as 'DAMON_MIN_REGION / damon_ctx->addr_unit'.  Hence, if
user sets ->addr_unit on virtual address spaces monitoring mode, regions
can be unexpectedly aligned in <PAGE_SIZE granularity.  It shouldn't cause
crash-like issues but make monitoring and DAMOS behavior difficult to
understand.

Fix the unexpected behavior by setting ->min_sz_region only when it is
configured for physical address space monitoring.

The issue was found from a result of Chris' experiments that thankfully
shared with me off-list.

Link: https://lkml.kernel.org/r/20250917160041.53187-1-sj@kernel.org
Fixes: d8f867fa08 ("mm/damon: add damon_ctx->min_sz_region")
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: Chris Mason <clm@fb.com>
Cc: Kefeng Wang <wangkefeng.wang@huawei.com>
Cc: ze zuo <zuoze1@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-09-23 14:14:16 -07:00
Andrew Morton
bc9950b56f Merge branch 'mm-hotfixes-stable' into mm-stable in order to pick up
changes required by mm-stable material: hugetlb and damon.
2025-09-21 14:19:36 -07:00
Quanmin Yan
d8f867fa08 mm/damon: add damon_ctx->min_sz_region
Adopting addr_unit would make DAMON_MINREGION 'addr_unit * 4096' bytes and
cause data alignment issues[1].

Add damon_ctx->min_sz_region to change DAMON_MIN_REGION from a global
macro value to per-context variable.

Link: https://lkml.kernel.org/r/20250828171242.59810-12-sj@kernel.org
Link: https://lore.kernel.org/all/527714dd-0e33-43ab-bbbd-d89670ba79e7@huawei.com [1]
Signed-off-by: Quanmin Yan <yanquanmin1@huawei.com>
Signed-off-by: SeongJae Park <sj@kernel.org>
Reviewed-by: SeongJae Park <sj@kernel.org>
Cc: David Hildenbrand <david@redhat.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Kefeng Wang <wangkefeng.wang@huawei.com>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: ze zuo <zuoze1@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-09-13 16:55:24 -07:00
SeongJae Park
540a2aebc6 mm/damon/sysfs: implement addr_unit file under context dir
Only DAMON kernel API callers can use addr_unit parameter.  Implement a
sysfs file to let DAMON sysfs ABI users use it.

Additionally, addr_unit must be set to a non-zero value.

Link: https://lkml.kernel.org/r/20250828171242.59810-8-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Signed-off-by: Quanmin Yan <yanquanmin1@huawei.com>
Reviewed-by: SeongJae Park <sj@kernel.org>
Cc: David Hildenbrand <david@redhat.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Kefeng Wang <wangkefeng.wang@huawei.com>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: ze zuo <zuoze1@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-09-13 16:55:23 -07:00
SeongJae Park
04a06b139e mm/damon/sysfs: use dynamically allocated repeat mode damon_call_control
DAMON sysfs interface is using a single global repeat mode
damon_call_control variable for refresh_ms handling, for all DAMON
contexts.  As a result, when there are more than one context, the single
global damon_call_control is unexpectedly over-written (corrupted). 
Particularly the ->link field is overwritten by the multiple contexts and
this can cause a user hangup, and/or a kernel crash.  Fix it by using
dynamically allocated damon_call_control object per DAMON context.

Link: https://lkml.kernel.org/r/20250908201513.60802-3-sj@kernel.org
Link: https://lore.kernel.org/20250904011738.930-1-yunjeong.mun@sk.com [1]
Link: https://lore.kernel.org/20250905035411.39501-1-sj@kernel.org [2]
Fixes: d809a7c64b ("mm/damon/sysfs: implement refresh_ms file internal work")
Signed-off-by: SeongJae Park <sj@kernel.org>
Reported-by: Yunjeong Mun <yunjeong.mun@sk.com>
Closes: https://lore.kernel.org/20250904011738.930-1-yunjeong.mun@sk.com
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-09-13 13:05:36 -07:00
Stanislav Fort
3260a3f082 mm/damon/sysfs: fix use-after-free in state_show()
state_show() reads kdamond->damon_ctx without holding damon_sysfs_lock. 
This allows a use-after-free race:

CPU 0                         CPU 1
-----                         -----
state_show()                  damon_sysfs_turn_damon_on()
ctx = kdamond->damon_ctx;     mutex_lock(&damon_sysfs_lock);
                              damon_destroy_ctx(kdamond->damon_ctx);
                              kdamond->damon_ctx = NULL;
                              mutex_unlock(&damon_sysfs_lock);
damon_is_running(ctx);        /* ctx is freed */
mutex_lock(&ctx->kdamond_lock); /* UAF */

(The race can also occur with damon_sysfs_kdamonds_rm_dirs() and
damon_sysfs_kdamond_release(), which free or replace the context under
damon_sysfs_lock.)

Fix by taking damon_sysfs_lock before dereferencing the context, mirroring
the locking used in pid_show().

The bug has existed since state_show() first accessed kdamond->damon_ctx.

Link: https://lkml.kernel.org/r/20250905101046.2288-1-disclosure@aisle.com
Fixes: a61ea561c8 ("mm/damon/sysfs: link DAMON for virtual address spaces monitoring")
Signed-off-by: Stanislav Fort <disclosure@aisle.com>
Reported-by: Stanislav Fort <disclosure@aisle.com>
Reviewed-by: SeongJae Park <sj@kernel.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-09-08 23:45:12 -07:00
SeongJae Park
d809a7c64b mm/damon/sysfs: implement refresh_ms file internal work
Only minimum file operations for refresh_ms file is implemented.  Further
implement its designed behavior, the periodic essential files content
update, using repeat mode damon_call().

If non-zero value is written to the file, update DAMON sysfs files for
auto-tuned monitoring intervals, DAMOS stats, and auto-tuned DAMOS quota
values, which are essential to be monitored in most DAMON use cases.  The
user-written non-zero value becomes the time delay between the update.  If
zero is written to the file, the periodic refresh is disabled.

Link: https://lkml.kernel.org/r/20250717055448.56976-3-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-07-24 19:12:33 -07:00
SeongJae Park
b907494768 mm/damon/sysfs: implement refresh_ms file under kdamond directory
Patch series "mm/damon/sysfs: support periodic and automated stats
update".

DAMON sysfs interface provides files for reading DAMON internal status
including auto-tuned monitoring intervals, DAMOS stats, DAMOS action
applied regions, and auto-tuned DAMOS effective quota.  Among those,
auto-tuned monitoring intervals, DAMOS stats and auto-tuned DAMOS
effective quota are essential for common DAMON/S use cases.

The content of the files are not automatically updated, though.  Users
should manually request updates of the contents by writing a special
command to 'state' file of each kdamond directory.  This interface is good
for minimizing overhead, but causes the below problems.

First, the usage is cumbersome.  This is arguably not a big problem, since
the user-space tool (damo) can do this instead of the user.

Second, it can be too slow.  The update request is not directly handled by
the sysfs interface but kdamond thread.  And kdamond threads wake up only
once per the sampling interval.  Hence if sampling interval is not short,
each update request could take too long time.  The recommended sampling
interval setup is asking DAMON to automatically tune it, within a range
between 5 milliseconds and 10 seconds.  On production systems it is not
very rare to have a few seconds sampling interval as a result of the
auto-tuning, so this can disturb observing DAMON internal status.

Finally, parallel update requests can conflict with each other.  When
parallel update requests are received, DAMON sysfs interface simply
returns -EBUSY to one of the requests.  DAMON user-space tool is hence
implementing its own backoff mechanism, but this can make the operation
even slower.

Introduce a new sysfs file, namely refresh_ms, for asking DAMON sysfs
interface to repeat the update of the above mentioned essential contents
with a user-specified time delay.  If non-zero value is written to the
file, DAMON sysfs interface does the updates for essential DAMON internal
status including auto-tuned monitoring intervals, DAMOS stats, and
auto-tuned DAMOS quotas using the user-written value as the time delay. 
In other words, it is similar to periodically writing
'update_schemes_stats', 'update_schemes_effective_quotas', and
'update_tuned_intervals' keywords to the 'state' file.  If zero is written
to the file, the automatic refresh is disabled.


This patch (of 4):

Implement a new DAMON sysfs file named 'refresh_ms' under each kdamond
directory.  The file will be used as a control knob of automatic refresh
of a few DAMON internal status files.  This commit implements only minimum
file operations, though.  The automatic refresh feature will be
implemented by the following commit.

Link: https://lkml.kernel.org/r/20250717055448.56976-1-sj@kernel.org
Link: https://lkml.kernel.org/r/20250717055448.56976-2-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-07-24 19:12:33 -07:00
SeongJae Park
0c96decca5 mm/damon/sysfs: remove damon_sysfs_before_terminate()
DAMON core layer does target cleanup on its own.  Remove duplicated and
unnecessarily selective cleanup attempts in DAMON sysfs interface.

Link: https://lkml.kernel.org/r/20250712195016.151108-14-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-07-19 18:59:57 -07:00
SeongJae Park
f59ae147ab mm/damon/sysfs: remove damon_sysfs_destroy_targets()
The function was introduced for putting pids and deallocating unnecessary
targets.  Hence it is called before damon_destroy_ctx().  Now vaddr puts
pid for each target destruction (cleanup_target()).  damon_destroy_ctx()
deallocates the targets anyway.  So damon_sysfs_destroy_targets() has no
reason to exist.  Remove it.

Link: https://lkml.kernel.org/r/20250712195016.151108-12-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-07-19 18:59:56 -07:00
SeongJae Park
ff01aba6e4 mm/damon/vaddr: put pid in cleanup_target()
Implement cleanup_target() callback for [f]vaddr, which calls put_pid()
for each target that will be destroyed.  Also remove redundant put_pid()
calls in core, sysfs and sample modules, which were required to be done
redundantly due to the lack of such self cleanup in vaddr.

Link: https://lkml.kernel.org/r/20250712195016.151108-11-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-07-19 18:59:56 -07:00
SeongJae Park
7114bc5e01 mm/damon/core: add cleanup_target() ops callback
Some DAMON operation sets may need additional cleanup per target.  For
example, [f]vaddr need to put pids of each target.  Each user and core
logic is doing that redundantly.  Add another DAMON ops callback that will
be used for doing such cleanups in operations set layer.

[sj@kernel.org: add kernel-doc comment for damon_operations->cleanup_target]
  Link: https://lkml.kernel.org/r/20250715185239.89152-2-sj@kernel.org
[sj@kernel.org: remove damon_ctx->callback kernel-doc comment]
  Link: https://lkml.kernel.org/r/20250715185239.89152-3-sj@kernel.org
Link: https://lkml.kernel.org/r/20250712195016.151108-10-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-07-19 18:59:56 -07:00
SeongJae Park
e000df9ff1 mm/damon/sysfs: don't hold kdamond_lock in before_terminate()
damon_sysfs_before_terminate() is a DAMON callback that is executed from
the kdamond's context.  Hence it is safe to access DAMON context internal
data.  But the function is unnecessarily holding kdamond_lock of the
context.  It is just unnecessary.  Remove the locking code.

Link: https://lkml.kernel.org/r/20250705175000.56259-6-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-07-19 18:59:44 -07:00
SeongJae Park
d2b5be741a mm/damon/sysfs: use DAMON core API damon_is_running()
DAMON core implements a static function to see if a given DAMON context is
running.  DAMON sysfs interface is implementing the same one on its own. 
Make the core function non-static and reuse it from the DAMON sysfs
interface.

Link: https://lkml.kernel.org/r/20250705175000.56259-5-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-07-19 18:59:44 -07:00
SeongJae Park
d1600be2f6 mm/damon/sysfs: decouple from damon_ops_id
Decouple DAMON sysfs interface from damon_ops_id.  For this, define and
use new mm/damon/sysfs.c internal data structure that maps the user-space
keywords and damon_ops_id, instead of having the implicit and unflexible
array index rule.

Link: https://lkml.kernel.org/r/20250622213759.50930-6-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-07-09 22:42:21 -07:00
SeongJae Park
d682f5f643 mm/damon/sysfs: remove damon_sysfs_cmd_request and its readers
damon_sysfs_cmd_request is DAMON sysfs interface's own synchronization
mechanism for accessing DAMON internal data via damon_callback hooks.  All
the users are now migrated to damon_call() and damos_walk(), so nobody
really uses it.  No one writes to the data structure but reading code is
still remained.  Remove the reading code and the entire data structure.

Link: https://lkml.kernel.org/r/20250306175908.66300-8-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-03-17 22:06:55 -07:00
SeongJae Park
311f34ff85 mm/damon/sysfs: remove damon_sysfs_cmd_request_callback() and its callers
damon_sysfs_cmd_request_callback() is the damon_callback hook functions
that were used to handle user requests that need to read and/or write
DAMON internal data.  All the usages are now updated to use damon_call()
or damos_walk(), though.  Remove it and its callers.

Link: https://lkml.kernel.org/r/20250306175908.66300-7-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-03-17 22:06:55 -07:00
SeongJae Park
8b40db0edf mm/damon/sysfs: remove damon_sysfs_cmd_request code from damon_sysfs_handle_cmd()
damon_sysfs_handle_cmd() handles user requests that it can directly handle
on its own.  For requests that need to be handled from damon_callback
hooks, it uses DAMON sysfs interface's own synchronous damon_callback
hooks management mechanism, namely damon_sysfs_cmd_request.  Now all user
requests are handled without damon_callback hooks, so
damon_sysfs_cmd_request client code in damon_sysfs_andle_cmd() does
nothing in real.  Remove the unnecessary code.

Link: https://lkml.kernel.org/r/20250306175908.66300-6-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-03-17 22:06:55 -07:00
SeongJae Park
3301f1861d mm/damon/sysfs: handle commit command using damon_call()
DAMON sysfs interface is using damon_callback->after_aggregation hook with
its self-implemented synchronization mechanism for the hook.  It is
inefficient, complicated, and take up to one aggregation interval to
complete, which can be long on some configs.

Use damon_call() instead.  It provides a synchronization mechanism that
built inside DAMON's core layer, so more efficient than DAMON sysfs
interface's own one.  Also it isolates the implementation inside the core
layer, and hence it makes the code easier to maintain.  Finally, it takes
up to one sampling interval, which is much shorter than the aggregation
interval in common setups.

Link: https://lkml.kernel.org/r/20250306175908.66300-5-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-03-17 22:06:54 -07:00
SeongJae Park
4c9ea539ad mm/damon/sysfs: validate user inputs from damon_sysfs_commit_input()
Patch series "mm/damon/sysfs: commit parameters online via damon_call()".

Due to the lack of ways to synchronously access DAMON internal data, DAMON
sysfs interface is using damon_callback hooks with its own synchronization
mechanism.  The mechanism is built on top of damon_callback hooks in an
ineifficient and complicated way.

Patch series "mm/damon: replace most damon_callback usages in sysfs with
new core functions", which starts with commit e035320fd3
("mm/damon/sysfs-schemes: remove unnecessary schemes existence check in
damon_sysfs_schemes_clear_regions()") introduced two new DAMON kernel API
functions that providing the synchronous access, replaced most
damon_callback hooks usage in DAMON sysfs interface, and cleaned up
unnecessary code.

Continue the replacement and cleanup works.  Update the last DAMON sysfs'
usage of its own synchronization mechanism, namely online DAMON parameters
commit, to use damon_call() instead of the damon_callback hooks and the
hard-to-maintain core-external synchronization mechanism.  Then remove the
no more be used code due to the change, and more unused code that just not
yet cleaned up.

The first four patches (patches 1-4) of this series makes DAMON sysfs
interface's online parameters commit to use damon_call().  Then, following
three patches (patches 5-7) remove the DAMON sysfs interface's own
synchronization mechanism and its usages, which is no more be used by
anyone due to the first four patches.  Finally, six patches (8-13) do more
cleanup of outdated comment and unused code.


This patch (of 13):

Online DAMON parameters commit via DAMON sysfs interface can make kdamond
stop.  This behavior was made because it can make the implementation
simpler.  The implementation tries committing the parameter without
validation.  If it finds something wrong in the middle of the parameters
update, it returns error without reverting the partially committed
parameters back.  It is safe though, since it immediately breaks kdamond
main loop in the case of the error return.

Users can make the wrong parameters by mistake, though.  Stopping kdamond
in the case is not very useful behavior.  Also this makes it difficult to
utilize damon_call() instead of damon_callback hook for online parameters
update, since damon_call() cannot immediately break kdamond main loop in
the middle.

Validate the input parameters and return error when it fails before
starting parameters updates.  In case of mistakenly wrong parameters,
kdamond can continue running with the old and valid parameters.

Link: https://lkml.kernel.org/r/20250306175908.66300-1-sj@kernel.org
Link: https://lkml.kernel.org/r/20250306175908.66300-2-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-03-17 22:06:54 -07:00
SeongJae Park
1077605396 mm/damon/sysfs: implement a command to update auto-tuned monitoring intervals
DAMON kernel API callers can show auto-tuned sampling and aggregation
intervals from the monmitoring attributes data structure.  That can be
useful for debugging or tuning of the feature.  DAMON user-space ABI users
has no way to see that, though.  Implement a new DAMON sysfs interface
command, namely 'update_tuned_intervals', for the purpose.  If the command
is written to the kdamond state file, the tuned sampling and aggregation
intervals will be updated to the corresponding sysfs interface files.

Link: https://lkml.kernel.org/r/20250303221726.484227-6-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-03-17 00:05:33 -07:00
SeongJae Park
0622c68d0a mm/damon/sysfs: commit intervals tuning goal
Connect DAMON sysfs interface for sampling and aggregation intervals
auto-tuning with DAMON core API, so that users can really use the feature
using the sysfs files.

Link: https://lkml.kernel.org/r/20250303221726.484227-5-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-03-17 00:05:33 -07:00
SeongJae Park
8fbbcbeaaf mm/damon/sysfs: implement intervals tuning goal directory
Implement DAMON sysfs interface directory and its files for setting DAMON
sampling and aggregation intervals auto-tuning goal.

Link: https://lkml.kernel.org/r/20250303221726.484227-4-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-03-17 00:05:33 -07:00
SeongJae Park
a2a60f9e57 mm/damon/sysfs-schemes: expose per-region filter-passed bytes
Per-region operations set-handled DAMOS filters passed memory size
information is provided to only DAMON core API users.  Further expose it
to the user space by adding a new DAMON sysfs interface file under each
scheme tried region directory.

Link: https://lkml.kernel.org/r/20250106193401.109161-14-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-01-25 20:22:28 -08:00
SeongJae Park
cfc33a7d2d mm/damon/core: pass per-region filter-passed bytes to damos_walk_control->walk_fn()
Total size of memory that passed DAMON operations set layer-handled DAMOS
filters per scheme is provided to DAMON core API and ABI (sysfs interface)
users.  Having it per-region in non-accumulated way can provide it in
finer granularity.  Provide it to damos_walk() core API users, by passing
the data to damos_walk_control->walk_fn().

Link: https://lkml.kernel.org/r/20250106193401.109161-13-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-01-25 20:22:28 -08:00
SeongJae Park
ee14cbc6f8 mm/damon/sysfs: remove unused code for schemes tried regions update
DAMON sysfs interface was using damon_callback with its own complicated
synchronization logics to update DAMOS scheme applied regions directories
and files.  But it is replaced to use damos_walk(), and the additional
synchronization logics are no more being used.  Remove those.

Link: https://lkml.kernel.org/r/20250103174400.54890-11-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-01-25 20:22:26 -08:00
SeongJae Park
66178e4ec3 mm/damon/sysfs: use damos_walk() for update_schemes_tried_{bytes,regions}
DAMON sysfs interface uses damon_callback with its own complicated
synchronization facility to handle update_schemes_tried_bytes and
update_schemes_tried_regions commands.  But damos_walk() can support the
use case without the additional synchronizations.  Convert the code to use
damos_walk() instead.

Link: https://lkml.kernel.org/r/20250103174400.54890-10-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-01-25 20:22:26 -08:00
SeongJae Park
9a5aa3349b mm/damon/sysfs: use damon_call() for update_schemes_effective_quotas
DAMON sysfs interface uses damon_callback with its own synchronization
facility to handle update_schemes_effective_quotas command.  But
damon_call() can support the use case without the additional
synchronizations.  Convert the code to use damon_call() instead.

Link: https://lkml.kernel.org/r/20250103174400.54890-7-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-01-25 20:22:25 -08:00
SeongJae Park
60d2c527bd mm/damon/sysfs: use damon_call() for commit_schemes_quota_goals
DAMON sysfs interface uses damon_callback with its own synchronization
facility to handle commit_schemes_quota_goals command.  But damon_call()
can support the use case without the additional synchronizations.  Convert
the code to use damon_call() instead.

Link: https://lkml.kernel.org/r/20250103174400.54890-6-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-01-25 20:22:25 -08:00
SeongJae Park
f64539dcdb mm/damon/sysfs: use damon_call() for update_schemes_stats
DAMON sysfs interface uses damon_callback with its own synchronization
facility to handle update_schemes_stats kdamond command.  But damon_call()
can support the use case without the additional synchronizations.  Convert
the code to use damon_call() instead.

Link: https://lkml.kernel.org/r/20250103174400.54890-5-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-01-25 20:22:25 -08:00