selftests/damon/wss_estimation: test for up to 160 MiB working set size

DAMON reads and writes the Accessed bits of page tables without manually
flushing the TLB, for two reasons.  First, it minimizes the overhead.
Second, real systems that need DAMON are expected to be memory intensive
enough to cause periodic TLB flushes.  For test setups that use small test
workloads, however, the system's TLB could be big enough to cover all or
most of the test workload's accesses.  In this case, no page table walk
happens, the Accessed bits are not set again, and DAMON cannot observe any
access from the test workload.

The test workload for DAMON's working set size estimation selftest is such
a case.  It accesses only a 10 MiB working set, and it turned out that some
test setups have TLBs large enough to cover the 10 MiB data accesses.  As
a result, whether the test passes depends on the test machine.

Make the test more reliable by retrying with a doubled working set size,
from 10 MiB up to 160 MiB, whenever it fails.
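
The retry schedule described above can be sketched as follows.  This is a
minimal illustration, not the selftest itself: candidate_sizes_mib() and
first_passing_size() are hypothetical helper names, run_test stands in for
the real estimation run, and the 4 KiB page size is an assumption used only
to show how many pages each working set spans.

```python
def candidate_sizes_mib(start_mib=10, max_mib=160):
    """Yield working set sizes in MiB, doubling from start_mib up to max_mib."""
    size = start_mib
    while size <= max_mib:
        yield size
        size *= 2


def first_passing_size(run_test, start_mib=10, max_mib=160):
    """Return the first size (MiB) for which run_test(size in bytes) is True.

    run_test is a stand-in for the real estimation run.  Returns None if
    every candidate size fails.
    """
    for size_mib in candidate_sizes_mib(start_mib, max_mib):
        if run_test(size_mib * 1024 * 1024):
            return size_mib
    return None


PAGE_SIZE = 4096  # assumption: 4 KiB base pages

for size_mib in candidate_sizes_mib():
    nr_pages = size_mib * 1024 * 1024 // PAGE_SIZE
    print('%d MiB spans %d pages of %d bytes' % (size_mib, nr_pages, PAGE_SIZE))
```

With the default arguments the candidates are 10, 20, 40, 80 and 160 MiB,
so a failing machine runs the workload at most five times.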

Link: https://lkml.kernel.org/r/20260117020731.226785-3-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
SeongJae Park 2026-01-16 18:07:25 -08:00 committed by Andrew Morton
parent 94a62284ed
commit 891d206e27

@@ -6,9 +6,8 @@ import time
 
 import _damon_sysfs
 
-def main():
-    # access two 10 MiB memory regions, 2 second per each
-    sz_region = 10 * 1024 * 1024
+def pass_wss_estimation(sz_region):
+    # access two regions of given size, 2 seconds per each region
     proc = subprocess.Popen(['./access_memory', '2', '%d' % sz_region, '2000'])
     kdamonds = _damon_sysfs.Kdamonds([_damon_sysfs.Kdamond(
             contexts=[_damon_sysfs.DamonCtx(
@@ -36,20 +35,38 @@ def main():
 
         wss_collected.append(
                 kdamonds.kdamonds[0].contexts[0].schemes[0].tried_bytes)
+    err = kdamonds.stop()
+    if err is not None:
+        print('kdamond stop failed: %s' % err)
+        exit(1)
 
     wss_collected.sort()
     acceptable_error_rate = 0.2
     for percentile in [50, 75]:
         sample = wss_collected[int(len(wss_collected) * percentile / 100)]
         error_rate = abs(sample - sz_region) / sz_region
-        print('%d-th percentile (%d) error %f' %
-                (percentile, sample, error_rate))
+        print('%d-th percentile error %f (expect %d, result %d)' %
+                (percentile, error_rate, sz_region, sample))
         if error_rate > acceptable_error_rate:
             print('the error rate is not acceptable (> %f)' %
                     acceptable_error_rate)
             print('samples are as below')
             print('\n'.join(['%d' % wss for wss in wss_collected]))
-            exit(1)
+            return False
+    return True
+
+def main():
+    # DAMON doesn't flush TLB. If the system has large TLB that can cover
+    # whole test working set, DAMON cannot see the access. Test up to 160 MiB
+    # test working set.
+    sz_region_mb = 10
+    max_sz_region_mb = 160
+    while sz_region_mb <= max_sz_region_mb:
+        test_pass = pass_wss_estimation(sz_region_mb * 1024 * 1024)
+        if test_pass is True:
+            exit(0)
+        sz_region_mb *= 2
+    exit(1)
 
 if __name__ == '__main__':
     main()
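
The pass/fail criterion in the diff above compares the 50th and 75th
percentile samples against the expected region size with a 20% tolerance.
Below is a standalone sketch of that check, assuming the samples are plain
byte counts.  percentile_error_rates() and within_tolerance() are
hypothetical names, and the sample values are invented for illustration
rather than taken from a real DAMON run.

```python
def percentile_error_rates(samples, expected, percentiles=(50, 75)):
    """Map each percentile to the relative error of its sample vs. expected."""
    ordered = sorted(samples)
    rates = {}
    for percentile in percentiles:
        # Same indexing as the selftest: truncate len * percentile / 100.
        sample = ordered[int(len(ordered) * percentile / 100)]
        rates[percentile] = abs(sample - expected) / expected
    return rates


def within_tolerance(samples, expected, acceptable_error_rate=0.2):
    """Return True if every checked percentile is within the tolerance."""
    rates = percentile_error_rates(samples, expected)
    return all(rate <= acceptable_error_rate for rate in rates.values())


MIB = 1024 * 1024
expected = 10 * MIB

# Invented samples: estimates near 10 MiB, plus one early zero reading.
good_samples = [0, 9 * MIB, 10 * MIB, 11 * MIB]
# Invented samples: no access was ever observed, as in the TLB-covered case.
blind_samples = [0, 0, 0, 0]

print(within_tolerance(good_samples, expected))
print(within_tolerance(blind_samples, expected))
```

When DAMON sees no access at all, every sample is zero, the error rate is
1.0 at both percentiles, and the check fails, which is what triggers the
retry with a larger working set.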