Commit Graph

51718 Commits

Author SHA1 Message Date
Thomas Gleixner
010b7723c0 rseq: Don't advertise time slice extensions if disabled
If time slice extensions have been disabled on the kernel command line,
then advertising them in RSEQ flags is wrong.

Adjust the conditionals to reflect reality, fixup the misleading comments
about the gap of these flags and the rseq::flags field.

Fixes: d6200245c7 ("rseq: Allow registering RSEQ with slice extension")
Signed-off-by: Thomas Gleixner <tglx@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
Tested-by: Dmitry Vyukov <dvyukov@google.com>
Link: https://patch.msgid.link/20260428224427.437059375%40kernel.org
Cc: stable@vger.kernel.org
2026-05-01 21:32:20 +02:00
Thomas Gleixner
2cb68e4512 rseq: Set rseq::cpu_id_start to 0 on unregistration
The RSEQ rework changed that to RSEQ_CPU_UNINITILIZED, which is obviously
incompatible. Revert back to the original behavior.

Fixes: 0f085b4188 ("rseq: Provide and use rseq_set_ids()")
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Thomas Gleixner <tglx@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
Tested-by: Dmitry Vyukov <dvyukov@google.com>
Link: https://patch.msgid.link/20260428224427.271566313%40kernel.org
Cc: stable@vger.kernel.org
2026-05-01 21:32:20 +02:00
Zicheng Qu
3da56dc063 sched/fair: Clear rel_deadline when initializing forked entities
A yield-triggered crash can happen when a newly forked sched_entity
enters the fair class with se->rel_deadline unexpectedly set.

The failing sequence is:

  1. A task is forked while se->rel_deadline is still set.
  2. __sched_fork() initializes vruntime, vlag and other sched_entity
     state, but does not clear rel_deadline.
  3. On the first enqueue, enqueue_entity() calls place_entity().
  4. Because se->rel_deadline is set, place_entity() treats se->deadline
     as a relative deadline and converts it to an absolute deadline by
     adding the current vruntime.
  5. However, the forked entity's deadline is not a valid inherited
     relative deadline for this new scheduling instance, so the conversion
     produces an abnormally large deadline.
  6. If the task later calls sched_yield(), yield_task_fair() advances
     se->vruntime to se->deadline.
  7. The inflated vruntime is then used by the following enqueue path,
     where the vruntime-derived key can overflow when multiplied by the
     entity weight.
  8. This corrupts cfs_rq->sum_w_vruntime, breaks EEVDF eligibility
     calculation, and can eventually make all entities appear ineligible.
     pick_next_entity() may then return NULL unexpectedly, leading to a
     later NULL dereference.

A captured trace shows the effect clearly. Before yield, the entity's
vruntime was around:

  9834017729983308

After yield_task_fair() executed:

  se->vruntime = se->deadline

the vruntime jumped to:

  19668035460670230

and the deadline was later advanced further to:

  19668035463470230

This shows that the deadline had already become abnormally large before
yield_task_fair() copied it into vruntime.

rel_deadline is only meaningful when se->deadline really carries a
relative deadline that still needs to be placed against vruntime. A
freshly forked sched_entity should not inherit or retain this state.
Clear se->rel_deadline in __sched_fork(), together with the other
sched_entity runtime state, so that the first enqueue does not interpret
the new entity's deadline as a stale relative deadline.

Fixes: 82e9d0456e ("sched/fair: Avoid re-setting virtual deadline on 'migrations'")
Analyzed-by: Hui Tang <tanghui20@huawei.com>
Analyzed-by: Zhang Qiao <zhangqiao22@huawei.com>
Signed-off-by: Zicheng Qu <quzicheng@huawei.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://patch.msgid.link/20260424071113.1199600-1-quzicheng@huawei.com
2026-04-28 09:19:54 +02:00
Vincent Guittot
ac8e69e693 sched/fair: Fix wakeup_preempt_fair() vs delayed dequeue
Similar to how pick_next_entity() must dequeue delayed entities, so too must
wakeup_preempt_fair(). Any delayed task being found means it is eligible and
hence past the 0-lag point, ready for removal.

Worse, by not removing delayed entities from consideration, it can skew the
preemption decision, with the end result that a short slice wakeup will not
result in a preemption.

                     tip/sched/core  tip/sched/core    +this patch
cyclictest slice  (ms) (default)2.8             8               8
hackbench slice   (ms) (default)2.8            20              20
Total Samples          |    22559           22595           22683
Average           (us) |      157              64( 59%)        59(  8%)
Median (P50)      (us) |       57              57(  0%)        58(- 2%)
90th Percentile   (us) |       64              60(  6%)        60(  0%)
99th Percentile   (us) |     2407              67( 97%)        67(  0%)
99.9th Percentile (us) |     3400            2288( 33%)       727( 68%)
Maximum           (us) |     5037            9252(-84%)      7461( 19%)

Fixes: f12e148892 ("sched/fair: Prepare pick_next_task() for delayed dequeue")
Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://patch.msgid.link/20260422093400.319251-1-vincent.guittot@linaro.org
2026-04-28 09:19:54 +02:00
Peter Zijlstra
c5cd6fd75b sched/fair: Fix the negative lag increase fix
Vincent reported that my rework of his original patch lost a little
something.

Specifically it got the return value wrong; it should not compare
against the old se->vlag, but rather against the current value. Since
the thing that matters is if the effective vruntime of an entity is
affected and the thing needs repositioning or not.

Fixes: 059258b0d4 ("sched/fair: Prevent negative lag increase during delayed dequeue")
Reported-by: Vincent Guittot <vincent.guittot@linaro.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Vincent Guittot <vincent.guittot@linaro.org>
Link: https://patch.msgid.link/20260423094107.GT3102624%40noisy.programming.kicks-ass.net
2026-04-28 09:19:54 +02:00
Linus Torvalds
27d128c1cf ring-buffer fix for 7.1:
- Fix accounting of persistent ring buffer rewind
 
   On boot up, the head page is moved back to the earliest point of
   the saved ring buffer. This is because the ring buffer being read by
   user space on a crash may not save the part it read. Rewinding the head
   page back to the earliest saved position helps keep those events from
   being lost.
 
   The number of events is also read during boot up and displayed in the
   stats file in the tracefs directory. It's also used for other accounting
   as well. On boot up, the "reader page" is accounted for but a rewind may
   put it back into the buffer and then the reader page may be accounted
   for again.
 
   Save off the original reader page and skip accounting it when scanning
   the pages in the ring buffer.
 -----BEGIN PGP SIGNATURE-----
 
 iIoEABYKADIWIQRRSw7ePDh/lE+zeZMp5XQQmuv6qgUCaevLRxQccm9zdGVkdEBn
 b29kbWlzLm9yZwAKCRAp5XQQmuv6qsNBAQCO9T+3kbDLD/DxRBYjeMXhTg2iY2OV
 SHV9jEzcYTahmwEAqVCFY9sNxVYXait/C924SmHCSi1N0HrIuEzIEd29iwk=
 =bv+t
 -----END PGP SIGNATURE-----

Merge tag 'trace-ring-buffer-v7.1-3' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace

Pull ring-buffer fix from Steven Rostedt:

 - Fix accounting of persistent ring buffer rewind

   On boot up, the head page is moved back to the earliest point of the
   saved ring buffer. This is because the ring buffer being read by user
   space on a crash may not save the part it read. Rewinding the head
   page back to the earliest saved position helps keep those events from
   being lost.

   The number of events is also read during boot up and displayed in the
   stats file in the tracefs directory. It's also used for other
   accounting as well. On boot up, the "reader page" is accounted for
   but a rewind may put it back into the buffer and then the reader page
   may be accounted for again.

   Save off the original reader page and skip accounting it when
   scanning the pages in the ring buffer.

* tag 'trace-ring-buffer-v7.1-3' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
  ring-buffer: Do not double count the reader_page
2026-04-24 15:17:23 -07:00
Masami Hiramatsu (Google)
92d5a60672 ring-buffer: Do not double count the reader_page
Since the cpu_buffer->reader_page is updated if there are unwound
pages. After that update, we should skip the page if it is the
original reader_page, because the original reader_page is already
checked.

Cc: stable@vger.kernel.org
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Ian Rogers <irogers@google.com>
Link: https://patch.msgid.link/177701353063.2223789.1471163147644103306.stgit@mhiramat.tok.corp.google.com
Fixes: ca296d32ec ("tracing: ring_buffer: Rewind persistent ring buffer on reboot")
Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2026-04-24 15:34:39 -04:00
Linus Torvalds
892c894b4b Misc locking fixes:
- Fix ww_mutex regression, which caused hangs/pauses in some DRM drivers
  - Fix rtmutex proxy-rollback bug
 
 Signed-off-by: Ingo Molnar <mingo@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAmnrdJERHG1pbmdvQGtl
 cm5lbC5vcmcACgkQEnMQ0APhK1jnJw//bi0bzZnBYeU4ukrqnyVWUZS7lKCnl+3M
 Ys31Ls38nXTQPhs+qbgUlqLzybJ/96s7uattPIW8+0Jkr+PfQguCvT+QiXs0SFzC
 aEp6PcZAD1z+QQY1zj2c1TM414IdyWbkShRxsPULb1j8nYJ6mO50UzP4zFdl7QwH
 gwZRgPkriUhD3vfgtHX3U0rshrvFHu9Ed6uuMkLFxaEiHn0h/4rZa0q9vonxw+cU
 64MabrxtPyQA97Mjr/YVsJjodt5bzkBmKNDCu9YxfqFQ0LTXyi1DZG/zdFHmT1Kf
 WcRIXbCx27HcR+NOWRze4rby2SR27NnPAZC/rR7Y6MX0IHGg1m2tM9DIGnDw71Su
 3HkQ1W3KhQFO0uvbhkKCXk1SAfBrJwh0g3p91sgj/tZyaXXdEYRChhpie5jMy3eg
 3k49QfYHigQZCZa8IH0PjrxLzDHo8S2tB75X6dWlxC/UeN/6X1G+Z9a52G9kZrFJ
 ZhrxY5/WVF4LbeIYupFbjN393jVN4WPEJKN3Esouuc+bluqMjUjl65v12yTQZeFy
 7pObrM9nJ4FijWjt+OCHUUZLwt+k5zttdUjgEh0zonx94JtpDA0qjkpPtbYgCFak
 WQxwD+qs1r8yv+msqkQ65/ir5GT6GyAzQnQAQbAQyB4itC9iyDhoS6mzkFBsw3bk
 TxfGszSV5ng=
 =6wf3
 -----END PGP SIGNATURE-----

Merge tag 'locking-urgent-2026-04-24' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull locking fixes from Ingo Molnar:

 - Fix ww_mutex regression, which caused hangs/pauses in some DRM drivers

 - Fix rtmutex proxy-rollback bug

* tag 'locking-urgent-2026-04-24' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  locking/mutex: Fix ww_mutex wait_list operations
  rtmutex: Use waiter::task instead of current in remove_waiter()
2026-04-24 10:14:29 -07:00
Peter Zijlstra
0adc92b910 locking/mutex: Fix ww_mutex wait_list operations
Chaitanya, John and Mikhail reported commit 25500ba7e7 ("locking/mutex:
Remove the list_head from struct mutex") wrecked ww_mutex.

Specifically there were 2 issues:

 - __ww_waiter_prev() had the termination condition wrong; it would terminate
   when the previous entry was the first, which results in a truncated
   iteration: W3, W2, (no W1).

 - __mutex_add_waiter(@pos != NULL), as used by __ww_waiter_add() /
   __ww_mutex_add_waiter(); this inserts @waiter before @pos (which is what
   list_add_tail() does). But this should then also update lock->first_waiter.

Much thanks to Prateek for spotting the __mutex_add_waiter() issue!

Fixes: 25500ba7e7 ("locking/mutex: Remove the list_head from struct mutex")
Reported-by: "Borah, Chaitanya Kumar" <chaitanya.kumar.borah@intel.com>
Closes: https://lore.kernel.org/r/af005996-05e9-4336-8450-d14ca652ba5d%40intel.com
Reported-by: John Stultz <jstultz@google.com>
Closes: https://lore.kernel.org/r/CANDhNCq%3Doizzud3hH3oqGzTrcjB8OwGeineJ3mwZuGdDWG8fRQ%40mail.gmail.com
Reported-by: Mikhail Gavrilov <mikhail.v.gavrilov@gmail.com>
Closes: https://lore.kernel.org/r/CABXGCsO5fKq2nD9nO8yO1z50ZzgCPWqueNXHANjntaswoOh2Dg@mail.gmail.com
Debugged-by: K Prateek Nayak <kprateek.nayak@amd.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: K Prateek Nayak <kprateek.nayak@amd.com>
Tested-by: Mikhail Gavrilov <mikhail.v.gavrilov@gmail.com>
Link: https://patch.msgid.link/20260422092335.GH3102924%40noisy.programming.kicks-ass.net
2026-04-23 10:05:49 +02:00
Linus Torvalds
1e18ed5727 ring-buffer fix for v7.1
- Make undefsyms_base.c into a real file
 
   The file undefsyms_base.c is used to catch any symbols used by a remote
   ring buffer that is made for use of a pKVM hypervisor. As it doesn't share
   the same text as the rest of the kernel, referencing any symbols within
   the kernel will make it fail to be built for the standalone hypervisor.
 
   A file was created by the Makefile that checked for any symbols that could
   cause issues. There's no reason to have this file created by the Makefile,
   just create it as a normal file instead.
 -----BEGIN PGP SIGNATURE-----
 
 iIoEABYKADIWIQRRSw7ePDh/lE+zeZMp5XQQmuv6qgUCaekbsxQccm9zdGVkdEBn
 b29kbWlzLm9yZwAKCRAp5XQQmuv6qr0MAQCHMFK56uzmMvQJg5Tfhtb+grY7ryo4
 j0rjLkNQBovmxwD/XylBDkhxr8IQ72jU04e+xuXzrqoT18U7XvQEVpPgkQQ=
 =dQi0
 -----END PGP SIGNATURE-----

Merge tag 'trace-ring-buffer-v7.1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace

Pull ring-buffer fix from Steven Rostedt:

 - Make undefsyms_base.c into a real file

   The file undefsyms_base.c is used to catch any symbols used by a
   remote ring buffer that is made for use of a pKVM hypervisor. As it
   doesn't share the same text as the rest of the kernel, referencing
   any symbols within the kernel will make it fail to be built for the
   standalone hypervisor.

   A file was created by the Makefile that checked for any symbols that
   could cause issues. There's no reason to have this file created by
   the Makefile, just create it as a normal file instead.

* tag 'trace-ring-buffer-v7.1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
  tracing: Make undefsyms_base.c a first-class citizen
2026-04-22 14:47:52 -07:00
Linus Torvalds
38ee6e1fb6 kgdb patches for 7.1
Only a very small update for kgdb this cycle: a single patch from Kexin Sun
 that fixes some outdated comments.
 
 Signed-off-by: Daniel Thompson (RISCstar) <danielt@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEELzVBU1D3lWq6cKzwfOMlXTn3iKEFAmno2OcACgkQfOMlXTn3
 iKHVNw/+IXsSmpUTllkwlzmib+D7zYVEoyLOdoBRQyJDjjT9Qe811budkIs02sIk
 NUgMBnffInU73lCw2TbRpKag9+dl8ZglKFvM3UpElFzY5ePnUc/fHeFtsXdzVjSK
 PVzL3fCtN66R1MuMeCdMNjhtSVHNJCrclj0We39JRgli7kWK1zH6kY8X/j/ExBTR
 9S5e3/9Yn/V9Z2Z33lwBMCKLHyd/5r0ezKbvfVGmqT/sTE/IbTcTfVMipEHGtHfA
 JHs5PC+1obJ9jeq/1c1fV8tZ8kO+VjGI0yUibbWEBmPx0pVUW/FG1z+WwhlbcyOl
 cyhKQ7d4h4e718M+bWAXeoRGk9bXyAWHqWojM98Xh+TvR2H20qAzUC44qeT9tt7B
 SOfvZM4Xwsn+uYqmthtMqLMqe8VIX2WbUVjsntYZ46fRTq8RAUY3z7yLv6bm7N4l
 p4i7/cdCsbzi+tN6uaKANywvt9ygEAcPILGfMhfIS4Pa7aQ/JVjKgxKhP1bUuH9u
 ftqIcJwvX3d8MDQY4FsLpsRToj3rADfr35HnQCc7fmqRIFi7++fXzPerktdF/5Sd
 6NMR0+mdFewdW9JfYKGZMTybf9XXyC+F6HlKgNOBddPje7qQlLUIQBajn71W753i
 TNKI5Go6idMgWEukNVXGD2XqObyKD2eBgCj/42Kc8bDfo7D66UA=
 =MegD
 -----END PGP SIGNATURE-----

Merge tag 'kgdb-7.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/danielt/linux

Pull kgdb update from Daniel Thompson:
 "Only a very small update for kgdb this cycle: a single patch from
  Kexin Sun that fixes some outdated comments"

* tag 'kgdb-7.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/danielt/linux:
  kgdb: update outdated references to kgdb_wait()
2026-04-22 14:26:58 -07:00
Paolo Bonzini
5335e318ad tracing: Make undefsyms_base.c a first-class citizen
Linus points out that dumping undefsyms_base.c form the Makefile
is rather ugly, and that a much better course of action would be
to have this file as a first-class citizen in the git tree.

This allows some extra cleanup in the Makefile, and the removal of
the .gitignore file in kernel/trace.

Cc: Marc Zyngier <maz@kernel.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Link: https://lore.kernel.org/r/CAHk-=wieqGd_XKpu8UxDoyADZx8TDe8CF3RmkUXt5N_9t5Pf_w@mail.gmail.com
Link: https://lore.kernel.org/all/20260421095446.2951646-1-maz@kernel.org/
Link: https://patch.msgid.link/20260421100455.324333-1-pbonzini@redhat.com
Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Reviewed-by: Nathan Chancellor <nathan@kernel.org>
Tested-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2026-04-22 11:24:41 -04:00
Masami Hiramatsu (Google)
476c5bbae6 tracing/fprobe: Fix to unregister ftrace_ops if it is empty on module unloading
Fix fprobe to unregister ftrace_ops if corresponding type of fprobe
does not exist on the fprobe_ip_table and it is expected to be empty
when unloading modules.

Since ftrace thinks that the empty hash means everything to be traced,
if we set fprobes only on the unloaded module, all functions are traced
unexpectedly after unloading module.
e.g.

 # modprobe xt_LOG.ko
 # echo 'f:test log_tg*' > dynamic_events
 # echo 1 > events/fprobes/test/enable
 # cat enabled_functions
log_tg [xt_LOG] (1)             tramp: 0xffffffffa0004000 (fprobe_ftrace_entry+0x0/0x490) ->fprobe_ftrace_entry+0x0/0x490
log_tg_check [xt_LOG] (1)               tramp: 0xffffffffa0004000 (fprobe_ftrace_entry+0x0/0x490) ->fprobe_ftrace_entry+0x0/0x490
log_tg_destroy [xt_LOG] (1)             tramp: 0xffffffffa0004000 (fprobe_ftrace_entry+0x0/0x490) ->fprobe_ftrace_entry+0x0/0x490
 # rmmod xt_LOG
 # wc -l enabled_functions
34085 enabled_functions

Link: https://lore.kernel.org/all/177669368776.132053.10042301916765771279.stgit@mhiramat.tok.corp.google.com/

Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
2026-04-22 09:24:13 +09:00
Kexin Sun
256e5254ef kgdb: update outdated references to kgdb_wait()
The function kgdb_wait() was folded into the static function
kgdb_cpu_enter() by commit 62fae31219 ("kgdb: eliminate
kgdb_wait(), all cpus enter the same way").  Update the four stale
references accordingly:

 - include/linux/kgdb.h and arch/x86/kernel/kgdb.c: the
   kgdb_roundup_cpus() kdoc describes what other CPUs are rounded up
   to call.  Because kgdb_cpu_enter() is static, the correct public
   entry point is kgdb_handle_exception(); also fix a pre-existing
   grammar error ("get them be" -> "get them into") and reflow the
   text.

 - kernel/debug/debug_core.c: replace with the generic description
   "the debug trap handler", since the actual entry path is
   architecture-specific.

 - kernel/debug/gdbstub.c: kgdb_cpu_enter() is correct here (it
   describes internal state, not a call target); add the missing
   parentheses.

Suggested-by: Daniel Thompson <daniel@riscstar.com>
Assisted-by: unnamed:deepseek-v3.2 coccinelle
Signed-off-by: Kexin Sun <kexinsun@smail.nju.edu.cn>
2026-04-21 16:41:54 +01:00
Masami Hiramatsu (Google)
0ac0058a74 tracing/fprobe: Check the same type fprobe on table as the unregistered one
Commit 2c67dc457b ("tracing: fprobe: optimization for entry only case")
introduced a different ftrace_ops for entry-only fprobes.

However, when unregistering an fprobe, the kernel only checks if another
fprobe exists at the same address, without checking which type of fprobe
it is.
If different fprobes are registered at the same address, the same address
will be registered in both fgraph_ops and ftrace_ops, but only one of
them will be deleted when unregistering. (the one removed first will not
be deleted from the ops).

This results in junk entries remaining in either fgraph_ops or ftrace_ops.
For example:
 =======
 cd /sys/kernel/tracing

 # 'Add entry and exit events on the same place'
 echo 'f:event1 vfs_read' >> dynamic_events
 echo 'f:event2 vfs_read%return' >> dynamic_events

 # 'Enable both of them'
 echo 1 > events/fprobes/enable
 cat enabled_functions
vfs_read (2)            ->arch_ftrace_ops_list_func+0x0/0x210

 # 'Disable and remove exit event'
 echo 0 > events/fprobes/event2/enable
 echo -:event2 >> dynamic_events

 # 'Disable and remove all events'
 echo 0 > events/fprobes/enable
 echo > dynamic_events

 # 'Add another event'
 echo 'f:event3 vfs_open%return' > dynamic_events
 cat dynamic_events
f:fprobes/event3 vfs_open%return

 echo 1 > events/fprobes/enable
 cat enabled_functions
vfs_open (1)            tramp: 0xffffffffa0001000 (ftrace_graph_func+0x0/0x60) ->ftrace_graph_func+0x0/0x60    subops: {ent:fprobe_fgraph_entry+0x0/0x620 ret:fprobe_return+0x0/0x150}
vfs_read (1)            tramp: 0xffffffffa0001000 (ftrace_graph_func+0x0/0x60) ->ftrace_graph_func+0x0/0x60    subops: {ent:fprobe_fgraph_entry+0x0/0x620 ret:fprobe_return+0x0/0x150}
 =======

As you can see, an entry for the vfs_read remains.

To fix this issue, when unregistering, the kernel should also check if
there is the same type of fprobes still exist at the same address, and
if not, delete its entry from either fgraph_ops or ftrace_ops.

Link: https://lore.kernel.org/all/177669367993.132053.10553046138528674802.stgit@mhiramat.tok.corp.google.com/

Fixes: 2c67dc457b ("tracing: fprobe: optimization for entry only case")
Cc: stable@vger.kernel.org
Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
2026-04-22 00:03:10 +09:00
Masami Hiramatsu (Google)
aa72812b49 tracing/fprobe: Avoid kcalloc() in rcu_read_lock section
fprobe_remove_node_in_module() is called under RCU read locked, but
this invokes kcalloc() if there are more than 8 fprobes installed
on the module. Sashiko warns it because kcalloc() can sleep [1].

 [1] https://sashiko.dev/#/patchset/177552432201.853249.5125045538812833325.stgit%40mhiramat.tok.corp.google.com

To fix this issue, expand the batch size to 128 and do not expand
the fprobe_addr_list, but just cancel walking on fprobe_ip_table,
update fgraph/ftrace_ops and retry the loop again.

Link: https://lore.kernel.org/all/177669367206.132053.1493637946869032744.stgit@mhiramat.tok.corp.google.com/

Fixes: 0de4c70d04 ("tracing: fprobe: use rhltable for fprobe_ip_table")
Cc: stable@vger.kernel.org
Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
2026-04-22 00:02:59 +09:00
Masami Hiramatsu (Google)
845947aca6 tracing/fprobe: Remove fprobe from hash in failure path
When register_fprobe_ips() fails, it tries to remove a list of
fprobe_hash_node from fprobe_ip_table, but it missed to remove
fprobe itself from fprobe_table. Moreover, when removing
the fprobe_hash_node which is added to rhltable once, it must
use kfree_rcu() after removing from rhltable.

To fix these issues, this reuses unregister_fprobe() internal
code to rollback the half-way registered fprobe.

Link: https://lore.kernel.org/all/177669366417.132053.17874946321744910456.stgit@mhiramat.tok.corp.google.com/

Fixes: 4346ba1604 ("fprobe: Rewrite fprobe on function-graph tracer")
Cc: stable@vger.kernel.org
Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
2026-04-21 23:59:57 +09:00
Masami Hiramatsu (Google)
1aec9e5c3e tracing/fprobe: Unregister fprobe even if memory allocation fails
unregister_fprobe() can fail under memory pressure because of memory
allocation failure, but this maybe called from module unloading, and
usually there is no way to retry it. Moreover. trace_fprobe does not
check the return value.

To fix this problem, unregister fprobe and fprobe_hash_node even if
working memory allocation fails.
Anyway, if the last fprobe is removed, the filter will be freed.

Link: https://lore.kernel.org/all/177669365629.132053.8433032896213721288.stgit@mhiramat.tok.corp.google.com/

Fixes: 4346ba1604 ("fprobe: Rewrite fprobe on function-graph tracer")
Cc: stable@vger.kernel.org
Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
2026-04-21 23:59:39 +09:00
Masami Hiramatsu (Google)
6ad51ada17 tracing/fprobe: Reject registration of a registered fprobe before init
Reject registration of a registered fprobe which is on the fprobe
hash table before initializing fprobe.
The add_fprobe_hash() checks this re-register fprobe, but since
fprobe_init() clears hlist_array field, it is too late to check it.
It has to check the re-registration before touncing fprobe.

Link: https://lore.kernel.org/all/177669364845.132053.18375367916162315835.stgit@mhiramat.tok.corp.google.com/

Fixes: 4346ba1604 ("fprobe: Rewrite fprobe on function-graph tracer")
Cc: stable@vger.kernel.org
Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
2026-04-21 23:59:29 +09:00
Linus Torvalds
b4e07588e7 tracing: tell git to ignore the generated 'undefsyms_base.c' file
This odd file was added to automatically figure out tool-generated
symbols.

Honestly, it *should* have been just a real honest-to-goodness regular
file in git, instead of having strange code to generate it in the
Makefile, but that is not how that silly thing works.  So now we need to
ignore it explicitly.

Fixes: 1211907ac0 ("tracing: Generate undef symbols allowlist for simple_ring_buffer")
Cc: Vincent Donnefort <vdonnefort@google.com>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Steven Rostedt (Google) <rostedt@goodmis.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Marc Zyngier <maz@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2026-04-20 17:25:56 -07:00
Linus Torvalds
b66cb4f156 printk changes for 7.1
-----BEGIN PGP SIGNATURE-----
 
 iQJPBAABCAA5FiEESH4wyp42V4tXvYsjUqAMR0iAlPIFAmnmEggbFIAAAAAABAAO
 bWFudTIsMi41KzEuMTIsMiwyAAoJEFKgDEdIgJTyjQwP/j2a9EzdI4+DfGrmA56m
 /heo44ObpFJppYaWEyAN7xX8Xpm7ErokLjZxVxhgQY7hGU5WLx1CmslnJbfWYkWy
 r7q92Kd/QIDmsvwHzlE0xdaX3rp8AXp3O2iIhcDfv9FRe+UultrESBpw9JbRYXXQ
 SLAFU1iOFMxyuzvhW/2l/B+PQDm0uoRMpMWTsLXo2JtNOsIGewNyV/7dhOumm+RD
 /0lgVA9jMAQA33j4Hkr38REe8lYH7aGl66x1mZhDg+kYb7w7ndKW/QN21OJiP+2D
 s4moi2/VLmC/UcxoDAOQTArKYkrYc2nzLMJRMaIun8jcNfCHEqaQAW2MTzSDAC78
 +atdlWrfIxykORA31lTtjh4o5vg43sQPjFZCJr3RxNLfxFy15NULhT75QFrxe9sA
 W3gs4Rz+LeynoQbJNCn8hgK0xcpKLtzC4dMY7dfQo6uyq2YnJauEvioJKbUYyoDh
 3l3uYfnzHcfRUb+yNtIkNCiYXwrpTAnSnifCbWO3smWYxhCdAT14Rval/fxtTjSe
 sTphd7m32U56WpWX0lYQbY001nMzso+gN/eQSK20IzrhvghwYUqvCKNm+fIM9tjR
 nQBxka+B7XhO27hNV4QIT8ZQRUkabQQAseEFhLQI2Ptgw3eV0s85gMP60Lg4gpZ0
 7OGA/VM+xrLqXvsFeDDWccSg
 =O6QR
 -----END PGP SIGNATURE-----

Merge tag 'printk-for-7.1' of git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux

Pull printk updates from Petr Mladek:

 - Fix printk ring buffer initialization and sanity checks

 - Workaround printf kunit test compilation with gcc < 12.1

 - Add IPv6 address printf format tests

 - Misc code and documentation cleanup

* tag 'printk-for-7.1' of git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux:
  printf: Compile the kunit test with DISABLE_BRANCH_PROFILING DISABLE_BRANCH_PROFILING
  lib/vsprintf: use bool for local decode variable
  lib/hexdump: print_hex_dump_bytes() calls print_hex_dump_debug()
  printk: ringbuffer: fix errors in comments
  printk_ringbuffer: Add sanity check for 0-size data
  printk_ringbuffer: Fix get_data() size sanity check
  printf: add IPv6 address format tests
  printk: Fix _DESCS_COUNT type for 64-bit systems
2026-04-20 15:42:18 -07:00
Linus Torvalds
ccbc9fdb32 Fix timer stalls caused by incorrect handling of
the dev->next_event_forced flag.
 
 Signed-off-by: Ingo Molnar <mingo@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAmnl1/4RHG1pbmdvQGtl
 cm5lbC5vcmcACgkQEnMQ0APhK1g/ww/7BI4CyQUJLSCpYMjvkj+87Rrfd5u6FGqt
 jz0dGeQpY0LvRGSqASwICe+1r0zwHF+xDsUfJvA13mRaPM6D6bEU+JE6ffK8B6T9
 EIyYwEwQ2a7DrdIu9+FCXTwqXDUoGLFsguD50b4qupQKFcDlwgZbg4UAWi/ptau9
 Ww+5T/+sfw/SMR9EwXBSKH79N0gOOGpDNfGtpDv+0X0qPQvo9QGAxMfIUgMf7ZaA
 y55agXi5iOdM+mAIrE69WLinBzrBvXHWNr66/SaMadQs93I1hU54sLpir4ft1yCs
 WnDtTRWG11Y0HBHUqqgbnN8BR/2VIFDVe9BtRDoDUD70iEJ8TGqJjOvF8v7C00MK
 ets0zNel9Rqbz9wjrjTekPYUHfC/t9qqzV77c0TdU1IR6FArf/OT9Ge34AVr60EX
 a5s4aX7ECLjwuTwgQPLXsSedOD0eQndf/VYdEQ86fTUfyyujVg2NCxbFEfDr3eho
 SbjcNv1UQ1WY/7miJzYaiA2aVNtwuX25YNI+t3f3pX/1tGqmx9oB1tNzJqgGfuN9
 3/Rx3uP0kH+gpbw1lKAugFJazOEHDLJG8LBgF0PYbmlVdIGn57IuBlQL5GDLe77O
 G+sFUhrLNpkrIJVhNWODkM+K/z9vvKzENiRgG4hB0oAbQEUuqTA6ppayKWWs8KDb
 9fdgLSkdjtI=
 =GPvm
 -----END PGP SIGNATURE-----

Merge tag 'timers-urgent-2026-04-20' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull timer fix from Ingo Molnar:
 "Fix timer stalls caused by incorrect handling of the
  dev->next_event_forced flag"

* tag 'timers-urgent-2026-04-20' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  clockevents: Add missing resets of the next_event_forced flag
2026-04-20 15:30:08 -07:00
Keenan Dong
3bfdc63936 rtmutex: Use waiter::task instead of current in remove_waiter()
remove_waiter() is used by the slowlock paths, but it is also used for
proxy-lock rollback in rt_mutex_start_proxy_lock() when invoked from
futex_requeue().

In the latter case waiter::task is not current, but remove_waiter()
operates on current for the dequeue operation. That results in several
problems:

  1) the rbtree dequeue happens without waiter::task::pi_lock being held

  2) the waiter task's pi_blocked_on state is not cleared, which leaves a
     dangling pointer primed for UAF around.

  3) rt_mutex_adjust_prio_chain() operates on the wrong top priority waiter
     task

Use waiter::task instead of current in all related operations in
remove_waiter() to cure those problems.

[ tglx: Fixup rt_mutex_adjust_prio_chain(), add a comment and amend the
  	changelog ]

Fixes: 8161239a8b ("rtmutex: Simplify PI algorithm and make highest prio task get lock")
Reported-by: Yuan Tan <yuantan098@gmail.com>
Reported-by: Yifan Wu <yifanwucs@gmail.com>
Reported-by: Juefei Pu <tomapufckgml@gmail.com>
Reported-by: Xin Liu <bird@lzu.edu.cn>
Signed-off-by: Keenan Dong <keenanat2000@gmail.com>
Signed-off-by: Thomas Gleixner <tglx@kernel.org>
Cc: stable@vger.kernel.org
2026-04-21 00:22:31 +02:00
Petr Mladek
add9d911be Merge branch 'rework/prb-fixes' into for-linus 2026-04-20 13:42:01 +02:00
Linus Torvalds
40735a683b mm.git review status for linus..mm-stable
Everything:
 
 Total patches:       121
 Reviews/patch:       2.11
 Reviewed rate:       90%
 
 Excluding DAMON:
 
 Total patches:       113
 Reviews/patch:       2.25
 Reviewed rate:       96%
 
 - The 33 patch series "Eliminate Dying Memory Cgroup" from Qi Zheng and
   Muchun Song addresses the longstanding "dying memcg problem".  A
   situation wherein a no-longer-used memory control group will hang around
   for an extended period pointlessly consuming memory.  The [0/N]
   changelog has a good overview of this work.
 
 - The 3 patch series "fix unexpected type conversions and potential
   overflows" from Qi Zheng fixes a couple of potential 32-bit/64-bit
   issues which were identified during review of the "Eliminate Dying
   Memory Cgroup" series.
 
 - The 6 patch series "kho: history: track previous kernel version and
   kexec boot count" from Breno Leitao uses Kexec Handover (KHO) to pass
   the previous kernel's version string and the number of kexec reboots
   since the last cold boot to the next kernel, and prints it at boot time.
 
 - The 4 patch series "liveupdate: prevent double preservation" from
   Pasha Tatashin teaches LUO to avoid managing the same file across
   different active sessions.
 
 - The 10 patch series "liveupdate: Fix module unloading and unregister
   API" from Pasha Tatashin addresses an issue with how LUO handles module
   reference counting and unregistration during module unloading.
 
 - The 2 patch series "zswap pool per-CPU acomp_ctx simplifications" from
   Kanchana Sridhar simplifies and cleans up the zswap crypto compression
   handling and improves the lifecycle management of zswap pool's per-CPU
   acomp_ctx resources.
 
 - The 2 patch series "mm/damon/core: fix damon_call()/damos_walk() vs
   kdmond exit race" from SeongJae Park addresses unlikely but possible
   leaks and deadlocks in damon_call() and damon_walk().
 
 - The 2 patch series "mm/damon/core: validate damos_quota_goal->nid"
   from SeongJae Park fixes a couple of root-only wild pointer
   dereferences.
 
 - The 2 patch series "Docs/admin-guide/mm/damon: warn commit_inputs vs
   other params race" from SeongJae Park updates the DAMON documentation to
   warn operators about potential races which can occur if the
   commit_inputs parameter is altered at the wrong time.
 
 - The 3 patch series "Minor hmm_test fixes and cleanups" from Alistair
   Popple implements two bugfixes a cleanup for the HMM kernel selftests.
 
 - The 6 patch series "Modify memfd_luo code" from Chenghao Duan provides
   cleanups, simplifications and speedups in the memfd_lou code.
 
 - The 4 patch series "mm, kvm: allow uffd support in guest_memfd" from
   Mike Rapoport enables support for userfaultfd in guest_memfd.
 
 - The 6 patch series "selftests/mm: skip several tests when thp is not
   available" from Chunyu Hu fixes several issues in the selftests code
   which were causing breakage when the tests were run on CONFIG_THP=n
   kernels.
 
 - The 2 patch series "mm/mprotect: micro-optimization work" from Pedro
   Falcato implements a couple of nice speedups for mprotect().
 
 - The 3 patch series "MAINTAINERS: update KHO and LIVE UPDATE entries"
   from Pratyush Yadav reflects upcoming changes in the maintenance of KHO,
   LUO, memfd_luo, kexec, crash, kdump and probably other kexec-based
   things - they are being moved out of mm.git and into a new git tree.
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCaeNL/wAKCRDdBJ7gKXxA
 jt7EAQCEEQvYYTjld+8HJKsCbavY4pEfci7z4SBiQyIPjRracQD/ZfjXnzL7ucc1
 b6q6G4TcslvIDBgzVkk9G2BVn2oCoAg=
 =3ozv
 -----END PGP SIGNATURE-----

Merge tag 'mm-stable-2026-04-18-02-14' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull more MM updates from Andrew Morton:

 - "Eliminate Dying Memory Cgroup" (Qi Zheng and Muchun Song)

   Address the longstanding "dying memcg problem". A situation wherein a
   no-longer-used memory control group will hang around for an extended
   period pointlessly consuming memory

 - "fix unexpected type conversions and potential overflows" (Qi Zheng)

   Fix a couple of potential 32-bit/64-bit issues which were identified
   during review of the "Eliminate Dying Memory Cgroup" series

 - "kho: history: track previous kernel version and kexec boot count"
   (Breno Leitao)

   Use Kexec Handover (KHO) to pass the previous kernel's version string
   and the number of kexec reboots since the last cold boot to the next
   kernel, and print it at boot time

 - "liveupdate: prevent double preservation" (Pasha Tatashin)

   Teach LUO to avoid managing the same file across different active
   sessions

 - "liveupdate: Fix module unloading and unregister API" (Pasha
   Tatashin)

   Address an issue with how LUO handles module reference counting and
   unregistration during module unloading

 - "zswap pool per-CPU acomp_ctx simplifications" (Kanchana Sridhar)

   Simplify and clean up the zswap crypto compression handling and
   improve the lifecycle management of zswap pool's per-CPU acomp_ctx
   resources

 - "mm/damon/core: fix damon_call()/damos_walk() vs kdmond exit race"
   (SeongJae Park)

   Address unlikely but possible leaks and deadlocks in damon_call() and
   damon_walk()

 - "mm/damon/core: validate damos_quota_goal->nid" (SeongJae Park)

   Fix a couple of root-only wild pointer dereferences

 - "Docs/admin-guide/mm/damon: warn commit_inputs vs other params race"
   (SeongJae Park)

   Update the DAMON documentation to warn operators about potential
   races which can occur if the commit_inputs parameter is altered at
   the wrong time

 - "Minor hmm_test fixes and cleanups" (Alistair Popple)

   Bugfixes and a cleanup for the HMM kernel selftests

 - "Modify memfd_luo code" (Chenghao Duan)

   Cleanups, simplifications and speedups to the memfd_lou code

 - "mm, kvm: allow uffd support in guest_memfd" (Mike Rapoport)

   Support for userfaultfd in guest_memfd

 - "selftests/mm: skip several tests when thp is not available" (Chunyu
   Hu)

   Fix several issues in the selftests code which were causing breakage
   when the tests were run on CONFIG_THP=n kernels

 - "mm/mprotect: micro-optimization work" (Pedro Falcato)

   A couple of nice speedups for mprotect()

 - "MAINTAINERS: update KHO and LIVE UPDATE entries" (Pratyush Yadav)

   Document upcoming changes in the maintenance of KHO, LUO, memfd_luo,
   kexec, crash, kdump and probably other kexec-based things - they are
   being moved out of mm.git and into a new git tree

* tag 'mm-stable-2026-04-18-02-14' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (121 commits)
  MAINTAINERS: add page cache reviewer
  mm/vmscan: avoid false-positive -Wuninitialized warning
  MAINTAINERS: update Dave's kdump reviewer email address
  MAINTAINERS: drop include/linux/liveupdate from LIVE UPDATE
  MAINTAINERS: drop include/linux/kho/abi/ from KHO
  MAINTAINERS: update KHO and LIVE UPDATE maintainers
  MAINTAINERS: update kexec/kdump maintainers entries
  mm/migrate_device: remove dead migration entry check in migrate_vma_collect_huge_pmd()
  selftests: mm: skip charge_reserved_hugetlb without killall
  userfaultfd: allow registration of ranges below mmap_min_addr
  mm/vmstat: fix vmstat_shepherd double-scheduling vmstat_update
  mm/hugetlb: fix early boot crash on parameters without '=' separator
  zram: reject unrecognized type= values in recompress_store()
  docs: proc: document ProtectionKey in smaps
  mm/mprotect: special-case small folios when applying permissions
  mm/mprotect: move softleaf code out of the main function
  mm: remove '!root_reclaim' checking in should_abort_scan()
  mm/sparse: fix comment for section map alignment
  mm/page_io: use sio->len for PSWPIN accounting in sio_read_complete()
  selftests/mm: transhuge_stress: skip the test when thp not available
  ...
2026-04-19 08:01:17 -07:00
Linus Torvalds
9055c64567 memblock: updates for 7.0-rc1
* improve debugability of reserve_mem kernel parameter handling with print
   outs in case of a failure and debugfs info showing what was actually
   reserved
 * Make memblock_free_late() and free_reserved_area() use the same core
   logic for freeing the memory to buddy and ensure it takes care of
   updating memblock arrays when ARCH_KEEP_MEMBLOCK is enabled.
 -----BEGIN PGP SIGNATURE-----
 
 iQFEBAABCgAuFiEEeOVYVaWZL5900a/pOQOGJssO/ZEFAmnjRmsQHHJwcHRAa2Vy
 bmVsLm9yZwAKCRA5A4Ymyw79kYh0CAC4NpZGFqpEBep1eQcfqsPH05dvp1LUXDNk
 i5GwS2ht/F5D9GcD+EyoYRQjRM8k+XZyOe3sqEF01Uav/rHAv3XrITg/pfiA92AR
 K7CvQv4NvyQqUNcv/mEb+P8niriJ4oHRXCag9inop1jo/x3Mym07oEy73rknAx9r
 ZQKwoFNOM/QQGVb9hZUANKCkE8cAsUXG89yEOH0n17FOahC0PZbK/vxjeO+br3IL
 HxEoC5l1j4cUauf8XEhsVXXdch0iqit/fB3ROePYFNCx7koVYHk6Yl1w++AM0RUA
 ypOmfPsSiqLY2ciuTIAnpTeMfQkkhEmMI3mp6T5BUBwSKJxLRaSM
 =c1xd
 -----END PGP SIGNATURE-----

Merge tag 'memblock-v7.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rppt/memblock

Pull memblock updates from Mike Rapoport:

 - improve debuggability of reserve_mem kernel parameter handling with
   print outs in case of a failure and debugfs info showing what was
   actually reserved

 - Make memblock_free_late() and free_reserved_area() use the same core
   logic for freeing the memory to buddy and ensure it takes care of
   updating memblock arrays when ARCH_KEEP_MEMBLOCK is enabled.

* tag 'memblock-v7.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rppt/memblock:
  x86/alternative: delay freeing of smp_locks section
  memblock: warn when freeing reserved memory before memory map is initialized
  memblock, treewide: make memblock_free() handle late freeing
  memblock: make free_reserved_area() update memblock if ARCH_KEEP_MEMBLOCK=y
  memblock: extract page freeing from free_reserved_area() into a helper
  memblock: make free_reserved_area() more robust
  mm: move free_reserved_area() to mm/memblock.c
  powerpc: opal-core: pair alloc_pages_exact() with free_pages_exact()
  powerpc: fadump: pair alloc_pages_exact() with free_pages_exact()
  memblock: reserve_mem: fix end caclulation in reserve_mem_release_by_name()
  memblock: move reserve_bootmem_range() to memblock.c and make it static
  memblock: Add reserve_mem debugfs info
  memblock: Print out errors on reserve_mem parser
2026-04-18 11:29:14 -07:00
Pasha Tatashin
68750e820b liveupdate: defer file handler module refcounting to active sessions
Stop pinning modules indefinitely upon file handler registration. 
Instead, dynamically increment the module reference count only when a live
update session actively uses the file handler (e.g., during preservation
or deserialization), and release it when the session ends.

This allows modules providing live update handlers to be gracefully
unloaded when no live update is in progress.

Link: https://lore.kernel.org/20260327033335.696621-11-pasha.tatashin@soleen.com
Signed-off-by: Pasha Tatashin <pasha.tatashin@soleen.com>
Reviewed-by: Pratyush Yadav (Google) <pratyush@kernel.org>
Cc: David Matlack <dmatlack@google.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Samiullah Khawaja <skhawaja@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-04-18 00:10:50 -07:00
Pasha Tatashin
2ab7207e7e liveupdate: make unregister functions return void
Change liveupdate_unregister_file_handler and liveupdate_unregister_flb to
return void instead of an error code.  This follows the design principle
that unregistration during module unload should not fail, as the unload
cannot be stopped at that point.

Link: https://lore.kernel.org/20260327033335.696621-10-pasha.tatashin@soleen.com
Signed-off-by: Pasha Tatashin <pasha.tatashin@soleen.com>
Reviewed-by: Pratyush Yadav (Google) <pratyush@kernel.org>
Cc: David Matlack <dmatlack@google.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Samiullah Khawaja <skhawaja@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-04-18 00:10:50 -07:00
Pasha Tatashin
074488008d liveupdate: remove liveupdate_test_unregister()
Now that file handler unregistration automatically unregisters all
associated file handlers (FLBs), the liveupdate_test_unregister() function
is no longer needed.  Remove it along with its usages and declarations.

Link: https://lore.kernel.org/20260327033335.696621-9-pasha.tatashin@soleen.com
Signed-off-by: Pasha Tatashin <pasha.tatashin@soleen.com>
Reviewed-by: Pratyush Yadav (Google) <pratyush@kernel.org>
Cc: David Matlack <dmatlack@google.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Samiullah Khawaja <skhawaja@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-04-18 00:10:50 -07:00
Pasha Tatashin
5ee1c7d641 liveupdate: auto unregister FLBs on file handler unregistration
To ensure that unregistration is always successful and doesn't leave
dangling resources, introduce auto-unregistration of FLBs: when a file
handler is unregistered, all FLBs associated with it are automatically
unregistered.

Introduce a new helper luo_flb_unregister_all() which unregisters all FLBs
linked to the given file handler.

Link: https://lore.kernel.org/20260327033335.696621-8-pasha.tatashin@soleen.com
Signed-off-by: Pasha Tatashin <pasha.tatashin@soleen.com>
Reviewed-by: Pratyush Yadav (Google) <pratyush@kernel.org>
Cc: David Matlack <dmatlack@google.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Samiullah Khawaja <skhawaja@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-04-18 00:10:50 -07:00
Pasha Tatashin
118c390824 liveupdate: remove luo_session_quiesce()
Now that FLB module references are handled dynamically during active
sessions, we can safely remove the luo_session_quiesce() and
luo_session_resume() mechanism.

Link: https://lore.kernel.org/20260327033335.696621-7-pasha.tatashin@soleen.com
Signed-off-by: Pasha Tatashin <pasha.tatashin@soleen.com>
Reviewed-by: Pratyush Yadav (Google) <pratyush@kernel.org>
Cc: David Matlack <dmatlack@google.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Samiullah Khawaja <skhawaja@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-04-18 00:10:50 -07:00
Pasha Tatashin
76be9983df liveupdate: defer FLB module refcounting to active sessions
Stop pinning modules indefinitely upon FLB registration.  Instead,
dynamically take a module reference when the FLB is actively used in a
session (e.g., during preserve and retrieve) and release it when the
session concludes.

This allows modules providing FLB operations to be cleanly unloaded when
not in active use by the live update orchestrator.

Link: https://lore.kernel.org/20260327033335.696621-6-pasha.tatashin@soleen.com
Signed-off-by: Pasha Tatashin <pasha.tatashin@soleen.com>
Reviewed-by: Samiullah Khawaja <skhawaja@google.com>
Reviewed-by: Pratyush Yadav (Google) <pratyush@kernel.org>
Cc: David Matlack <dmatlack@google.com>
Cc: Mike Rapoport <rppt@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-04-18 00:10:50 -07:00
Pasha Tatashin
6b2b22f7c8 liveupdate: protect FLB lists with luo_register_rwlock
Because liveupdate FLB objects will soon drop their persistent module
references when registered, list traversals must be protected against
concurrent module unloading.

To provide this protection, utilize the global luo_register_rwlock.  It
protects the global registry of FLBs and the handler's specific list of
FLB dependencies.

Read locks are used during concurrent list traversals (e.g., during
preservation and serialization).  Write locks are taken during
registration and unregistration.

Link: https://lore.kernel.org/20260327033335.696621-5-pasha.tatashin@soleen.com
Signed-off-by: Pasha Tatashin <pasha.tatashin@soleen.com>
Reviewed-by: Pratyush Yadav (Google) <pratyush@kernel.org>
Cc: David Matlack <dmatlack@google.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Samiullah Khawaja <skhawaja@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-04-18 00:10:49 -07:00
Pasha Tatashin
9e1e185845 liveupdate: protect file handler list with rwsem
Because liveupdate file handlers will no longer hold a module reference
when registered, we must ensure that the access to the handler list is
protected against concurrent module unloading.

Utilize the global luo_register_rwlock to protect the global registry of
file handlers.  Read locks are taken during list traversals in
luo_preserve_file() and luo_file_deserialize().  Write locks are taken
during registration and unregistration.

Link: https://lore.kernel.org/20260327033335.696621-4-pasha.tatashin@soleen.com
Signed-off-by: Pasha Tatashin <pasha.tatashin@soleen.com>
Reviewed-by: Pratyush Yadav (Google) <pratyush@kernel.org>
Cc: David Matlack <dmatlack@google.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Samiullah Khawaja <skhawaja@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-04-18 00:10:49 -07:00
Pasha Tatashin
38fb71ace2 liveupdate: synchronize lazy initialization of FLB private state
The luo_flb_get_private() function, which is responsible for lazily
initializing the private state of FLB objects, can be called concurrently
from multiple threads.  This creates a data race on the 'initialized' flag
and can lead to multiple executions of mutex_init() and INIT_LIST_HEAD()
on the same memory.

Introduce a static spinlock (luo_flb_init_lock) local to the function to
synchronize the initialization path.  Use smp_load_acquire() and
smp_store_release() for memory ordering between the fast path and the slow
path.

Link: https://lore.kernel.org/20260327033335.696621-3-pasha.tatashin@soleen.com
Signed-off-by: Pasha Tatashin <pasha.tatashin@soleen.com>
Reviewed-by: Pratyush Yadav <pratyush@kernel.org>
Cc: David Matlack <dmatlack@google.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Samiullah Khawaja <skhawaja@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-04-18 00:10:49 -07:00
Pasha Tatashin
277f4e5e39 liveupdate: safely print untrusted strings
Patch series "liveupdate: Fix module unloading and unregister API", v3.

This patch series addresses an issue with how LUO handles module reference
counting and unregistration during a module unload (e.g., via rmmod).

Currently, modules that register live update file handlers are pinned for
the entire duration they are registered.  This prevents the modules from
being unloaded gracefully, even when no live update session is in
progress.

Furthermore, if a module is forcefully unloaded, the unregistration
functions return an error (e.g.  -EBUSY) if a session is active, which is
ignored by the kernel's module unload path, leaving dangling pointers in
the LUO global lists.

To resolve these issues, this series introduces the following changes:
1. Adds a global read-write semaphore (luo_register_rwlock) to protect
   the registration lists for both file handlers and FLBs.
2. Reduces the scope of module reference counting for file handlers and
   FLBs. Instead of pinning modules indefinitely upon registration,
   references are now taken only when they are actively used in a live
   update session (e.g., during preservation, retrieval, or
   deserialization).
3. Removes the global luo_session_quiesce() mechanism since module
   unload behavior now handles active sessions implicitly.
4. Introduces auto-unregistration of FLBs during file handler
   unregistration to prevent leaving dangling resources.
5. Changes the unregistration functions to return void instead of
   an error code.
6. Fixes a data race in luo_flb_get_private() by introducing a spinlock
   for thread-safe lazy initialization.
7. Strengthens security by using %.*s when printing untrusted deserialized
   compatible strings and session names to prevent out-of-bounds reads.


This patch (of 10):

Deserialized strings from KHO data (such as file handler compatible
strings and session names) are provided by the previous kernel and might
not be null-terminated if the data is corrupted or maliciously crafted.

When printing these strings in error messages, use the %.*s format
specifier with the maximum buffer size to prevent out-of-bounds reads into
adjacent kernel memory.

Link: https://lore.kernel.org/20260327033335.696621-1-pasha.tatashin@soleen.com
Link: https://lore.kernel.org/20260327033335.696621-2-pasha.tatashin@soleen.com
Signed-off-by: Pasha Tatashin <pasha.tatashin@soleen.com>
Reviewed-by: Pratyush Yadav (Google) <pratyush@kernel.org>
Cc: David Matlack <dmatlack@google.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Samiullah Khawaja <skhawaja@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-04-18 00:10:49 -07:00
Pasha Tatashin
00d0b37237 liveupdate: prevent double management of files
Patch series "liveupdate: prevent double preservation", v4.

Currently, LUO does not prevent the same file from being managed twice
across different active sessions.

Because LUO preserves files of absolutely different types: memfd, and
upcoming vfiofd [1], iommufd [2], guestmefd (and possible kvmfd/cpufd).
There is no common private data or guarantee on how to prevent that the
same file is not preserved twice beside using inode or some slower and
expensive method like hashtables.


This patch (of 4)

Currently, LUO does not prevent the same file from being managed twice
across different active sessions.

Use a global xarray luo_preserved_files to keep track of file identifiers
being preserved by LUO.  Update luo_preserve_file() to check and insert
the file identifier into this xarray when it is preserved, and erase it in
luo_file_unpreserve_files() when it is released.

To allow handlers to define what constitutes a "unique" file (e.g.,
different struct file objects pointing to the same hardware resource), add
a get_id() callback to struct liveupdate_file_ops.  If not provided, the
default identifier is the struct file pointer itself.

This ensures that the same file (or resource) cannot be managed by
multiple sessions.  If another session attempts to preserve an already
managed file, it will now fail with -EBUSY.

Link: https://lore.kernel.org/20260326163943.574070-1-pasha.tatashin@soleen.com
Link: https://lore.kernel.org/20260326163943.574070-2-pasha.tatashin@soleen.com
Link: https://lore.kernel.org/all/20260129212510.967611-1-dmatlack@google.com [1]
Link: https://lore.kernel.org/all/20260203220948.2176157-1-skhawaja@google.com [2]
Signed-off-by: Pasha Tatashin <pasha.tatashin@soleen.com>
Reviewed-by: Samiullah Khawaja <skhawaja@google.com>
Reviewed-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Cc: David Matlack <dmatlack@google.com>
Cc: Pratyush Yadav <pratyush@kernel.org>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Christian Brauner <brauner@kernel.org> 
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-04-18 00:10:49 -07:00
Breno Leitao
76aa46b9e4 kho: kexec-metadata: track previous kernel chain
Use Kexec Handover (KHO) to pass the previous kernel's version string and
the number of kexec reboots since the last cold boot to the next kernel,
and print it at boot time.

Example output:
    [    0.000000] KHO: exec from: 6.19.0-rc4-next-20260107 (count 1)

Motivation
==========

Bugs that only reproduce when kexecing from specific kernel versions are
difficult to diagnose.  These issues occur when a buggy kernel kexecs into
a new kernel, with the bug manifesting only in the second kernel.

Recent examples include the following commits:

 * commit eb22663125 ("x86/boot: Fix page table access in
   5-level to 4-level paging transition")
 * commit 77d48d39e9 ("efistub/tpm: Use ACPI reclaim memory
   for event log to avoid corruption")
 * commit 64b45dd46e ("x86/efi: skip memattr table on kexec
   boot")

As kexec-based reboots become more common, these version-dependent bugs
are appearing more frequently.  At scale, correlating crashes to the
previous kernel version is challenging, especially when issues only occur
in specific transition scenarios.

Implementation
==============

The kexec metadata is stored as a plain C struct (struct
kho_kexec_metadata) rather than FDT format, for simplicity and direct
field access.  It is registered via kho_add_subtree() as a separate
subtree, keeping it independent from the core KHO ABI.  This design
choice:

 - Keeps the core KHO ABI minimal and stable
 - Allows the metadata format to evolve independently
 - Avoids requiring version bumps for all KHO consumers (LUO, etc.)
   when the metadata format changes

The struct kho_kexec_metadata contains two fields:
 - previous_release: The kernel version that initiated the kexec
 - kexec_count: Number of kexec boots since last cold boot

On cold boot, kexec_count starts at 0 and increments with each kexec.  The
count helps identify issues that only manifest after multiple consecutive
kexec reboots.

[leitao@debian.org: call kho_kexec_metadata_init() for both boot paths]
  Link: https://lore.kernel.org/all/20260309-kho-v8-5-c3abcf4ac750@debian.org/ [1]
  Link: https://lore.kernel.org/20260409-kho_fix_merge_issue-v1-1-710c84ceaa85@debian.org
Link: https://lore.kernel.org/20260316-kho-v9-5-ed6dcd951988@debian.org
Signed-off-by: Breno Leitao <leitao@debian.org>
Acked-by: SeongJae Park <sj@kernel.org>
Reviewed-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Reviewed-by: Pratyush Yadav <pratyush@kernel.org>
Cc: Alexander Graf <graf@amazon.com>
Cc: David Hildenbrand <david@kernel.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: "Liam R. Howlett" <Liam.Howlett@oracle.com>
Cc: Lorenzo Stoakes <ljs@kernel.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Pasha Tatashin <pasha.tatashin@soleen.com>
Cc: Shuah Khan <skhan@linuxfoundation.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-04-18 00:10:48 -07:00
Breno Leitao
062dd306d9 kho: fix kho_in_debugfs_init() to handle non-FDT blobs
kho_in_debugfs_init() calls fdt_totalsize() to determine blob sizes, which
assumes all blobs are FDTs.  This breaks for non-FDT blobs like struct
kho_kexec_metadata.

Fix this by reading the "blob-size" property from the FDT (persisted by
kho_add_subtree()) instead of calling fdt_totalsize().  Also rename local
variables from fdt_phys/sub_fdt to blob_phys/blob for consistency with the
non-FDT-specific naming.

Link: https://lore.kernel.org/20260316-kho-v9-4-ed6dcd951988@debian.org
Signed-off-by: Breno Leitao <leitao@debian.org>
Reviewed-by: Pratyush Yadav <pratyush@kernel.org>
Cc: Alexander Graf <graf@amazon.com>
Cc: David Hildenbrand <david@kernel.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: "Liam R. Howlett" <Liam.Howlett@oracle.com>
Cc: Lorenzo Stoakes <ljs@kernel.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport (Microsoft) <rppt@kernel.org>
Cc: Pasha Tatashin <pasha.tatashin@soleen.com>
Cc: SeongJae Park <sj@kernel.org>
Cc: Shuah Khan <skhan@linuxfoundation.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-04-18 00:10:48 -07:00
Breno Leitao
85e4139282 kho: persist blob size in KHO FDT
kho_add_subtree() accepts a size parameter but only forwards it to
debugfs.  The size is not persisted in the KHO FDT, so it is lost across
kexec.  This makes it impossible for the incoming kernel to determine the
blob size without understanding the blob format.

Store the blob size as a "blob-size" property in the KHO FDT alongside the
"preserved-data" physical address.  This allows the receiving kernel to
recover the size for any blob regardless of format.

Also extend kho_retrieve_subtree() with an optional size output parameter
so callers can learn the blob size without needing to understand the blob
format.  Update all callers to pass NULL for the new parameter.

Link: https://lore.kernel.org/20260316-kho-v9-3-ed6dcd951988@debian.org
Signed-off-by: Breno Leitao <leitao@debian.org>
Reviewed-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Reviewed-by: Pratyush Yadav <pratyush@kernel.org>
Cc: Alexander Graf <graf@amazon.com>
Cc: David Hildenbrand <david@kernel.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: "Liam R. Howlett" <Liam.Howlett@oracle.com>
Cc: Lorenzo Stoakes <ljs@kernel.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Pasha Tatashin <pasha.tatashin@soleen.com>
Cc: SeongJae Park <sj@kernel.org>
Cc: Shuah Khan <skhan@linuxfoundation.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-04-18 00:10:48 -07:00
Breno Leitao
4916ae3867 kho: rename fdt parameter to blob in kho_add/remove_subtree()
Since kho_add_subtree() now accepts arbitrary data blobs (not just FDTs),
rename the parameter from 'fdt' to 'blob' to better reflect its purpose. 
Apply the same rename to kho_remove_subtree() for consistency.

Also rename kho_debugfs_fdt_add() and kho_debugfs_fdt_remove() to
kho_debugfs_blob_add() and kho_debugfs_blob_remove() respectively, with
the same parameter rename from 'fdt' to 'blob'.

Link: https://lore.kernel.org/20260316-kho-v9-2-ed6dcd951988@debian.org
Signed-off-by: Breno Leitao <leitao@debian.org>
Reviewed-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Reviewed-by: Pratyush Yadav <pratyush@kernel.org>
Cc: Alexander Graf <graf@amazon.com>
Cc: David Hildenbrand <david@kernel.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: "Liam R. Howlett" <Liam.Howlett@oracle.com>
Cc: Lorenzo Stoakes <ljs@kernel.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Pasha Tatashin <pasha.tatashin@soleen.com>
Cc: SeongJae Park <sj@kernel.org>
Cc: Shuah Khan <skhan@linuxfoundation.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-04-18 00:10:48 -07:00
Breno Leitao
d9e4142e76 kho: add size parameter to kho_add_subtree()
Patch series "kho: history: track previous kernel version and kexec boot
count", v9.

Use Kexec Handover (KHO) to pass the previous kernel's version string and
the number of kexec reboots since the last cold boot to the next kernel,
and print it at boot time.

Example
=======
	[    0.000000] Linux version 6.19.0-rc3-upstream-00047-ge5d992347849
	...
	[    0.000000] KHO: exec from: 6.19.0-rc4-next-20260107upstream-00004-g3071b0dc4498 (count 1)

Motivation
==========

Bugs that only reproduce when kexecing from specific kernel versions are
difficult to diagnose.  These issues occur when a buggy kernel kexecs into
a new kernel, with the bug manifesting only in the second kernel.

Recent examples include:

 * eb22663125 ("x86/boot: Fix page table access in 5-level to 4-level paging transition")
 * 77d48d39e9 ("efistub/tpm: Use ACPI reclaim memory for event log to avoid corruption")
 * 64b45dd46e ("x86/efi: skip memattr table on kexec boot")

As kexec-based reboots become more common, these version-dependent bugs
are appearing more frequently.  At scale, correlating crashes to the
previous kernel version is challenging, especially when issues only occur
in specific transition scenarios.

Some bugs manifest only after multiple consecutive kexec reboots. 
Tracking the kexec count helps identify these cases (this metric is
already used by live update sub-system).

KHO provides a reliable mechanism to pass information between kernels.  By
carrying the previous kernel's release string and kexec count forward, we
can print this context at boot time to aid debugging.

The goal of this feature is to have this information being printed in
early boot, so, users can trace back kernel releases in kexec.  Systemd is
not helpful because we cannot assume that the previous kernel has systemd
or even write access to the disk (common when using Linux as bootloaders)


This patch (of 6):

kho_add_subtree() assumes the fdt argument is always an FDT and calls
fdt_totalsize() on it in the debugfs code path.  This assumption will
break if a caller passes arbitrary data instead of an FDT.

When CONFIG_KEXEC_HANDOVER_DEBUGFS is enabled, kho_debugfs_fdt_add() calls
__kho_debugfs_fdt_add(), which executes:

    f->wrapper.size = fdt_totalsize(fdt);

Fix this by adding an explicit size parameter to kho_add_subtree() so
callers specify the blob size.  This allows subtrees to contain arbitrary
data formats, not just FDTs.  Update all callers:

  - memblock.c: use fdt_totalsize(fdt)
  - luo_core.c: use fdt_totalsize(fdt_out)
  - test_kho.c: use fdt_totalsize()
  - kexec_handover.c (root fdt): use fdt_totalsize(kho_out.fdt)

Also update __kho_debugfs_fdt_add() to receive the size explicitly instead
of computing it internally via fdt_totalsize().  In kho_in_debugfs_init(),
pass fdt_totalsize() for the root FDT and sub-blobs since all current
users are FDTs.  A subsequent patch will persist the size in the KHO FDT
so the incoming side can handle non-FDT blobs correctly.

Link: https://lore.kernel.org/20260323110747.193569-1-duanchenghao@kylinos.cn
Link: https://lore.kernel.org/20260316-kho-v9-1-ed6dcd951988@debian.org
Signed-off-by: Breno Leitao <leitao@debian.org>
Suggested-by: Pratyush Yadav <pratyush@kernel.org>
Reviewed-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Reviewed-by: Pratyush Yadav <pratyush@kernel.org>
Cc: Alexander Graf <graf@amazon.com>
Cc: David Hildenbrand <david@kernel.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: "Liam R. Howlett" <Liam.Howlett@oracle.com>
Cc: Lorenzo Stoakes <ljs@kernel.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Pasha Tatashin <pasha.tatashin@soleen.com>
Cc: SeongJae Park <sj@kernel.org>
Cc: Shuah Khan <skhan@linuxfoundation.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-04-18 00:10:48 -07:00
Qi Zheng
8285917d6f mm: memcontrol: prepare for reparenting non-hierarchical stats
To resolve the dying memcg issue, we need to reparent LRU folios of child
memcg to its parent memcg.  This could cause problems for non-hierarchical
stats.

As Yosry Ahmed pointed out:

In short, if memory is charged to a dying cgroup at the time of
reparenting, when the memory gets uncharged the stats updates will occur
at the parent. This will update both hierarchical and non-hierarchical
stats of the parent, which would corrupt the parent's non-hierarchical
stats (because those counters were never incremented when the memory was
charged).

Now we have the following two types of non-hierarchical stats, and they
are only used in CONFIG_MEMCG_V1:

a. memcg->vmstats->state_local[i]
b. pn->lruvec_stats->state_local[i]

To ensure that these non-hierarchical stats work properly, we need to
reparent these non-hierarchical stats after reparenting LRU folios. To
this end, this commit makes the following preparations:

1. implement reparent_state_local() to reparent non-hierarchical stats
2. make css_killed_work_fn() to be called in rcu work, and implement
   get_non_dying_memcg_start() and get_non_dying_memcg_end() to avoid race
   between mod_memcg_state()/mod_memcg_lruvec_state()
   and reparent_state_local()

Link: https://lore.kernel.org/e862995c45a7101a541284b6ebee5e5c32c89066.1772711148.git.zhengqi.arch@bytedance.com
Co-developed-by: Yosry Ahmed <yosry@kernel.org>
Signed-off-by: Yosry Ahmed <yosry@kernel.org>
Signed-off-by: Qi Zheng <zhengqi.arch@bytedance.com>
Acked-by: Shakeel Butt <shakeel.butt@linux.dev>
Cc: Allen Pais <apais@linux.microsoft.com>
Cc: Axel Rasmussen <axelrasmussen@google.com>
Cc: Baoquan He <bhe@redhat.com>
Cc: Chengming Zhou <chengming.zhou@linux.dev>
Cc: Chen Ridong <chenridong@huawei.com>
Cc: David Hildenbrand <david@kernel.org>
Cc: Hamza Mahfooz <hamzamahfooz@linux.microsoft.com>
Cc: Harry Yoo <harry.yoo@oracle.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Imran Khan <imran.f.khan@oracle.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Kamalesh Babulal <kamalesh.babulal@oracle.com>
Cc: Lance Yang <lance.yang@linux.dev>
Cc: Liam Howlett <Liam.Howlett@oracle.com>
Cc: Lorenzo Stoakes (Oracle) <ljs@kernel.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Michal Koutný <mkoutny@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Muchun Song <muchun.song@linux.dev>
Cc: Muchun Song <songmuchun@bytedance.com>
Cc: Nhat Pham <nphamcs@gmail.com>
Cc: Roman Gushchin <roman.gushchin@linux.dev>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Usama Arif <usamaarif642@gmail.com>
Cc: Vlastimil Babka <vbabka@kernel.org>
Cc: Wei Xu <weixugc@google.com>
Cc: Yuanchu Xie <yuanchu@google.com>
Cc: Zi Yan <ziy@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-04-18 00:10:47 -07:00
Linus Torvalds
eb0d6d97c2 bpf-fixes
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEE+soXsSLHKoYyzcli6rmadz2vbToFAmnihOkACgkQ6rmadz2v
 bTqjQA/+K6R/teQRwVmP1GDrfBjz2TXUzCN1WQQLzbnJNR96Mzq72+aTWjza89BK
 yEUP379qiOeUfEyyV7DNfHh8hAclUAMKuvI3T3pshLQhpOS0+YcpfbakEZbos+My
 AzEGhGl2nhT7S5twHFznCpuSaLgqldHkdAy4BZIiFkOS5lPBX9CU++OAslFPM+f8
 R28JQYWuv2/b1mRsz8zDmQQXxwH/Rpz9hdJKcpm/kCYYBay3cAFV7ArFJfn+Y5se
 9I6mTwNQ+xtSxtsmR/lftlGo1Vv9ah6qM9gKwgju0SkNrS+9UBlNUSmTrJk1fz+d
 SxdppCrqxwHY3UVd62eF4fWWgusC+oMuKzTh6d+D/ZkKvnEjdAx5XQ7uUQyYhKil
 G12vvKWcHit0Qz9RAhqlEEZ+GIpFTtLql6aW7pRmQKE8/vmQwAD1HBqNqWYKjokW
 btlJ3fUOGu8VHtnYbI3FN6VsK8BU9t/xMny9Fys9X4KmtWBLsm4udmiorV9uC+w6
 xV2s+x+ahythTEzVICB6BlQotSRyMd9kR5qisJsetWk+7NBY0Bwn7C0kfVGepHh0
 WerFSYdSifTvBWQjXnvqmAX7YspmpZvevw8PCtoPq1xq5d1FrYu1K5GX/xzpy+pH
 p13afkbN7Mk6OwteFefD1B0ofug3V9sx3HBI72ENs1Z+hh1KdOQ=
 =79I2
 -----END PGP SIGNATURE-----

Merge tag 'bpf-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf

Pull bpf fixes from Alexei Starovoitov:
 "Most of the diff stat comes from Xu Kuohai's fix to emit ENDBR/BTI,
  since all JITs had to be touched to move constant blinding out and
  pass bpf_verifier_env in.

   - Fix use-after-free in arena_vm_close on fork (Alexei Starovoitov)

   - Dissociate struct_ops program with map if map_update fails (Amery
     Hung)

   - Fix out-of-range and off-by-one bugs in arm64 JIT (Daniel Borkmann)

   - Fix precedence bug in convert_bpf_ld_abs alignment check (Daniel
     Borkmann)

   - Fix arg tracking for imprecise/multi-offset in BPF_ST/STX insns
     (Eduard Zingerman)

   - Copy token from main to subprogs to fix missing kallsyms (Eduard
     Zingerman)

   - Prevent double close and leak of btf objects in libbpf (Jiri Olsa)

   - Fix af_unix null-ptr-deref in sockmap (Michal Luczaj)

   - Fix NULL deref in map_kptr_match_type for scalar regs (Mykyta
     Yatsenko)

   - Avoid unnecessary IPIs. Remove redundant bpf_flush_icache() in
     arm64 and riscv JITs (Puranjay Mohan)

   - Fix out of bounds access. Validate node_id in arena_alloc_pages()
     (Puranjay Mohan)

   - Reject BPF-to-BPF calls and callbacks in arm32 JIT (Puranjay Mohan)

   - Refactor all JITs to pass bpf_verifier_env to emit ENDBR/BTI for
     indirect jump targets on x86-64, arm64 JITs (Xu Kuohai)

   - Allow UTF-8 literals in bpf_bprintf_prepare() (Yihan Ding)"

* tag 'bpf-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf: (32 commits)
  bpf, arm32: Reject BPF-to-BPF calls and callbacks in the JIT
  bpf: Dissociate struct_ops program with map if map_update fails
  bpf: Validate node_id in arena_alloc_pages()
  libbpf: Prevent double close and leak of btf objects
  selftests/bpf: cover UTF-8 trace_printk output
  bpf: allow UTF-8 literals in bpf_bprintf_prepare()
  selftests/bpf: Reject scalar store into kptr slot
  bpf: Fix NULL deref in map_kptr_match_type for scalar regs
  bpf: Fix precedence bug in convert_bpf_ld_abs alignment check
  bpf, arm64: Emit BTI for indirect jump target
  bpf, x86: Emit ENDBR for indirect jump targets
  bpf: Add helper to detect indirect jump targets
  bpf: Pass bpf_verifier_env to JIT
  bpf: Move constants blinding out of arch-specific JITs
  bpf, sockmap: Take state lock for af_unix iter
  bpf, sockmap: Fix af_unix null-ptr-deref in proto update
  selftests/bpf: Extend bpf_iter_unix to attempt deadlocking
  bpf, sockmap: Fix af_unix iter deadlock
  bpf, sockmap: Annotate af_unix sock:: Sk_state data-races
  selftests/bpf: verify kallsyms entries for token-loaded subprograms
  ...
2026-04-17 15:58:22 -07:00
Amery Hung
f75aeb2de8 bpf: Dissociate struct_ops program with map if map_update fails
Currently, when bpf_struct_ops_map_update_elem() fails, the programs'
st_ops_assoc will remain set. They may become dangling pointers if the
map is freed later, but they will never be dereferenced since the
struct_ops attachment did not succeed. However, if one of the programs
is subsequently attached as part of another struct_ops map, its
st_ops_assoc will be poisoned even though its old st_ops_assoc was stale
from a failed attachment.

Fix the spurious poisoned st_ops_assoc by dissociating struct_ops
programs with a map if the attachment fails. Move
bpf_prog_assoc_struct_ops() to after *plink++ to make sure
bpf_prog_disassoc_struct_ops() will not miss a program when iterating
st_map->links.

Note that, dissociating a program from a map requires some attention as
it must not reset a poisoned st_ops_assoc or a st_ops_assoc pointing to
another map. The former is already guarded in
bpf_prog_disassoc_struct_ops(). The latter also will not happen since
st_ops_assoc of programs in st_map->links are set by
bpf_prog_assoc_struct_ops(), which can only be poisoned or pointing to
the current map.

Signed-off-by: Amery Hung <ameryhung@gmail.com>
Link: https://lore.kernel.org/r/20260417174900.2895486-1-ameryhung@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2026-04-17 12:04:14 -07:00
Linus Torvalds
87768582a4 dma-mapping updates for Linux 7.0:
- added support for batched cache sync, what improves performance of
   dma_map/unmap_sg() operations on ARM64 architecture (Barry Song)
 
 - introduced DMA_ATTR_CC_SHARED attribute for explicitly shared memory
   used in confidential computing (Jiri Pirko)
 
 - refactored spaghetti-like code in drivers/of/of_reserved_mem.c and its
   clients (Marek Szyprowski, shared branch with device-tree updates to
   avoid merge conflicts)
 
 - prepared Contiguous Memory Allocator related code for making dma-buf
   drivers modularized (Maxime Ripard)
 
 - added support for benchmarking dma_map_sg() calls to tools/dma utility
   (Qinxin Xia)
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQSrngzkoBtlA8uaaJ+Jp1EFxbsSRAUCaeCbdQAKCRCJp1EFxbsS
 RHbWAQCt70dzrU0lu0omTR1HdDP4GTYfuM6nZR91e8/itGN1+QD/XH4I/0wuybzk
 v5uxbIC6lR3abQRc3YNRXfi+i5j26A4=
 =Oee2
 -----END PGP SIGNATURE-----

Merge tag 'dma-mapping-7.1-2026-04-16' of git://git.kernel.org/pub/scm/linux/kernel/git/mszyprowski/linux

Pull dma-mapping updates from Marek Szyprowski:

 - added support for batched cache sync, what improves performance of
   dma_map/unmap_sg() operations on ARM64 architecture (Barry Song)

 - introduced DMA_ATTR_CC_SHARED attribute for explicitly shared memory
   used in confidential computing (Jiri Pirko)

 - refactored spaghetti-like code in drivers/of/of_reserved_mem.c and
   its clients (Marek Szyprowski, shared branch with device-tree updates
   to avoid merge conflicts)

 - prepared Contiguous Memory Allocator related code for making dma-buf
   drivers modularized (Maxime Ripard)

 - added support for benchmarking dma_map_sg() calls to tools/dma
   utility (Qinxin Xia)

* tag 'dma-mapping-7.1-2026-04-16' of git://git.kernel.org/pub/scm/linux/kernel/git/mszyprowski/linux: (24 commits)
  dma-buf: heaps: system: document system_cc_shared heap
  dma-buf: heaps: system: add system_cc_shared heap for explicitly shared memory
  dma-mapping: introduce DMA_ATTR_CC_SHARED for shared memory
  mm: cma: Export cma_alloc(), cma_release() and cma_get_name()
  dma: contiguous: Export dev_get_cma_area()
  dma: contiguous: Make dma_contiguous_default_area static
  dma: contiguous: Make dev_get_cma_area() a proper function
  dma: contiguous: Turn heap registration logic around
  of: reserved_mem: rework fdt_init_reserved_mem_node()
  of: reserved_mem: clarify fdt_scan_reserved_mem*() functions
  of: reserved_mem: rearrange code a bit
  of: reserved_mem: replace CMA quirks by generic methods
  of: reserved_mem: switch to ops based OF_DECLARE()
  of: reserved_mem: use -ENODEV instead of -ENOENT
  of: reserved_mem: remove fdt node from the structure
  dma-mapping: fix false kernel-doc comment marker
  dma-mapping: Support batch mode for dma_direct_{map,unmap}_sg
  dma-mapping: Separate DMA sync issuing and completion waiting
  arm64: Provide dcache_inval_poc_nosync helper
  arm64: Provide dcache_clean_poc_nosync helper
  ...
2026-04-17 11:12:42 -07:00
Puranjay Mohan
2845989f2e bpf: Validate node_id in arena_alloc_pages()
arena_alloc_pages() accepts a plain int node_id and forwards it through
the entire allocation chain without any bounds checking.

Validate node_id before passing it down the allocation chain in
arena_alloc_pages().

Fixes: 317460317a ("bpf: Introduce bpf_arena.")
Signed-off-by: Puranjay Mohan <puranjay@kernel.org>
Reviewed-by: Emil Tsalapatis <emil@etsalapatis.com>
Link: https://lore.kernel.org/r/20260417152135.1383754-1-puranjay@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2026-04-17 10:12:55 -07:00
Linus Torvalds
0b6bc3dbe6 tracing latency updates for 7.1:
- Add TIMERLAT_ALIGN osnoise option
 
   Add a timer alignment option for timerlat that makes it work like the
   cyclictest -A option. timelat creates threads to test the latency of the
   kernel. The alignment option will have these threads trigger at the
   alignment offsets from each other. Instead of having each thread wake up
   at the exact same time, if the alignment is set to "20" each thread will
   wake up at 20 microseconds from the previous one.
 -----BEGIN PGP SIGNATURE-----
 
 iIoEABYKADIWIQRRSw7ePDh/lE+zeZMp5XQQmuv6qgUCaeIBaBQccm9zdGVkdEBn
 b29kbWlzLm9yZwAKCRAp5XQQmuv6qqM8AQCyz4uL1lCLQtJb5thG8V9QFRkYh5F3
 +DyuRNoht3ijyAD+K4nZPAu4F09feWuHkssONtSZECKEuN6EhiHE4XX6Pgc=
 =WYYW
 -----END PGP SIGNATURE-----

Merge tag 'trace-latency-v7.1' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace

Pull tracing latency update from Steven Rostedt:

 - Add TIMERLAT_ALIGN osnoise option

   Add a timer alignment option for timerlat that makes it work like the
   cyclictest -A option. timelat creates threads to test the latency of
   the kernel. The alignment option will have these threads trigger at
   the alignment offsets from each other. Instead of having each thread
   wake up at the exact same time, if the alignment is set to "20" each
   thread will wake up at 20 microseconds from the previous one.

* tag 'trace-latency-v7.1' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
  tracing/osnoise: Add option to align tlat threads
2026-04-17 10:12:11 -07:00
Linus Torvalds
cb30bf881c tracing updates for v7.1:
- Fix printf format warning for bprintf
 
   sunrpc uses a trace_printk() that triggers a printf warning during the
   compile. Move the __printf() attribute around for when debugging is not
   enabled the warning will go away.
 
 - Remove redundant check for EVENT_FILE_FL_FREED in event_filter_write()
 
   The FREED flag is checked in the call to event_file_file() and then
   checked again right afterward, which is unneeded.
 
 - Clean up event_file_file() and event_file_data() helpers
 
   These helper functions played a different role in the past, but now with
   eventfs, the READ_ONCE() isn't needed. Simplify the code a bit and also
   add a warning to event_file_data() if the file or its data is not present.
 
 - Remove updating file->private_data in tracing open
 
   All access to the file private data is handled by the helper functions,
   which do not use file->private_data. Stop updating it on open.
 
 - Show ENUM names in function arguments via BTF in function tracing
 
   When showing the function arguments when func-args option is set for
   function tracing, if one of the arguments is found to be an enum, show the
   name of the enum instead of its number.
 
 - Add new trace_call__##name() API for tracepoints
 
   Tracepoints are enabled via static_branch() blocks, where when not
   enabled, there's only a nop that is in the code where the execution will
   just skip over it. When tracing is enabled, the nop is converted to a
   direct jump to the tracepoint code. Sometimes more calculations are
   required to be performed to update the parameters of the tracepoint. In
   this case, trace_##name##_enabled() is called which is a static_branch()
   that gets enabled only when the tracepoint is enabled. This allows the
   extra calculations to also be skipped by the nop:
 
   if (trace_foo_enabled()) {
       x = bar();
       trace_foo(x);
   }
 
   Where the x=bar() is only performed when foo is enabled. The problem with
   this approach is that there's now two static_branch() calls. One for
   checking if the tracepoint is enabled, and then again to know if the
   tracepoint should be called. The second one is redundant.
 
   Introduce trace_call__foo() that will call the foo() tracepoint directly
   without doing a static_branch():
 
   if (trace_foo_enabled()) {
       x = bar();
       trace_call__foo();
   }
 
 - Update various locations to use the new trace_call__##name() API
 
 - Move snapshot code out of trace.c
 
   Cleaning up trace.c to not be a "dump all", move the snapshot code out of
   it and into a new trace_snapshot.c file.
 
 - Clean up some "%*.s" to "%*s"
 
 - Allow boot kernel command line options to be called multiple times
 
   Have options like:
 
     ftrace_filter=foo ftrace_filter=bar ftrace_filter=zoo
 
   Equal to:
 
     ftrace_filter=foo,bar,zoo
 
 - Fix ipi_raise event CPU field to be a CPU field
 
   The ipi_raise target_cpus field is defined as a __bitmask(). There is now a
   __cpumask() field definition. Update the field to use that.
 
 - Have hist_field_name() use a snprintf() and not a series of strcat()
 
   It's safer to use snprintf() that a series of strcat().
 
 - Fix tracepoint regfunc balancing
 
   A tracepoint can define a "reg" and "unreg" function that gets called
   before the tracepoint is enabled, and after it is disabled respectively.
   But on error, after the "reg" func is called and the tracepoint is not
   enabled, the "unreg" function is not called to tear down what the "reg"
   function performed.
 
 - Fix output that shows what histograms are enabled
 
   Event variables are displayed incorrectly in the histogram output.
 
   Instead of "sched.sched_wakeup.$var", it is showing
   "$sched.sched_wakeup.var" where the '$' is in the incorrect location.
 
 - Some other simple cleanups.
 -----BEGIN PGP SIGNATURE-----
 
 iIoEABYKADIWIQRRSw7ePDh/lE+zeZMp5XQQmuv6qgUCaeCpvxQccm9zdGVkdEBn
 b29kbWlzLm9yZwAKCRAp5XQQmuv6qt2WAP44m85BbAjBqJe4WR103eOXV+bREBta
 dRoReKJOMe519gEAp0rK/HoCvHgHhIGe3gaGdIsNhnaxoFyNWMG/wokoLAY=
 =Hg6+
 -----END PGP SIGNATURE-----

Merge tag 'trace-v7.1' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace

Pull tracing updates from Steven Rostedt:

 - Fix printf format warning for bprintf

   sunrpc uses a trace_printk() that triggers a printf warning during
   the compile. Move the __printf() attribute around for when debugging
   is not enabled the warning will go away

 - Remove redundant check for EVENT_FILE_FL_FREED in
   event_filter_write()

   The FREED flag is checked in the call to event_file_file() and then
   checked again right afterward, which is unneeded

 - Clean up event_file_file() and event_file_data() helpers

   These helper functions played a different role in the past, but now
   with eventfs, the READ_ONCE() isn't needed. Simplify the code a bit
   and also add a warning to event_file_data() if the file or its data
   is not present

 - Remove updating file->private_data in tracing open

   All access to the file private data is handled by the helper
   functions, which do not use file->private_data. Stop updating it on
   open

 - Show ENUM names in function arguments via BTF in function tracing

   When showing the function arguments when func-args option is set for
   function tracing, if one of the arguments is found to be an enum,
   show the name of the enum instead of its number

 - Add new trace_call__##name() API for tracepoints

   Tracepoints are enabled via static_branch() blocks, where when not
   enabled, there's only a nop that is in the code where the execution
   will just skip over it. When tracing is enabled, the nop is converted
   to a direct jump to the tracepoint code. Sometimes more calculations
   are required to be performed to update the parameters of the
   tracepoint. In this case, trace_##name##_enabled() is called which is
   a static_branch() that gets enabled only when the tracepoint is
   enabled. This allows the extra calculations to also be skipped by the
   nop:

	if (trace_foo_enabled()) {
		x = bar();
		trace_foo(x);
	}

   Where the x=bar() is only performed when foo is enabled. The problem
   with this approach is that there's now two static_branch() calls. One
   for checking if the tracepoint is enabled, and then again to know if
   the tracepoint should be called. The second one is redundant

   Introduce trace_call__foo() that will call the foo() tracepoint
   directly without doing a static_branch():

	if (trace_foo_enabled()) {
		x = bar();
		trace_call__foo();
	}

 - Update various locations to use the new trace_call__##name() API

 - Move snapshot code out of trace.c

   Cleaning up trace.c to not be a "dump all", move the snapshot code
   out of it and into a new trace_snapshot.c file

 - Clean up some "%*.s" to "%*s"

 - Allow boot kernel command line options to be called multiple times

   Have options like:

	ftrace_filter=foo ftrace_filter=bar ftrace_filter=zoo

   Equal to:

	ftrace_filter=foo,bar,zoo

 - Fix ipi_raise event CPU field to be a CPU field

   The ipi_raise target_cpus field is defined as a __bitmask(). There is
   now a __cpumask() field definition. Update the field to use that

 - Have hist_field_name() use a snprintf() and not a series of strcat()

   It's safer to use snprintf() that a series of strcat()

 - Fix tracepoint regfunc balancing

   A tracepoint can define a "reg" and "unreg" function that gets called
   before the tracepoint is enabled, and after it is disabled
   respectively. But on error, after the "reg" func is called and the
   tracepoint is not enabled, the "unreg" function is not called to tear
   down what the "reg" function performed

 - Fix output that shows what histograms are enabled

   Event variables are displayed incorrectly in the histogram output

   Instead of "sched.sched_wakeup.$var", it is showing
   "$sched.sched_wakeup.var" where the '$' is in the incorrect location

 - Some other simple cleanups

* tag 'trace-v7.1' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace: (24 commits)
  selftests/ftrace: Add test case for fully-qualified variable references
  tracing: Fix fully-qualified variable reference printing in histograms
  tracepoint: balance regfunc() on func_add() failure in tracepoint_add_func()
  tracing: Rebuild full_name on each hist_field_name() call
  tracing: Report ipi_raise target CPUs as cpumask
  tracing: Remove duplicate latency_fsnotify() stub
  tracing: Preserve repeated trace_trigger boot parameters
  tracing: Append repeated boot-time tracing parameters
  tracing: Remove spurious default precision from show_event_trigger/filter formats
  cpufreq: Use trace_call__##name() at guarded tracepoint call sites
  tracing: Remove tracing_alloc_snapshot() when snapshot isn't defined
  tracing: Move snapshot code out of trace.c and into trace_snapshot.c
  mm: damon: Use trace_call__##name() at guarded tracepoint call sites
  btrfs: Use trace_call__##name() at guarded tracepoint call sites
  spi: Use trace_call__##name() at guarded tracepoint call sites
  i2c: Use trace_call__##name() at guarded tracepoint call sites
  kernel: Use trace_call__##name() at guarded tracepoint call sites
  tracepoint: Add trace_call__##name() API
  tracing: trace_mmap.h: fix a kernel-doc warning
  tracing: Pretty-print enum parameters in function arguments
  ...
2026-04-17 09:43:12 -07:00
Linus Torvalds
c9e03d5948 Probes updates for v7.1
- fprobe: do not zero out unused fgraph_data. This removes unneeded
   memset of fgraph_data in fprobe entry handler.
 -----BEGIN PGP SIGNATURE-----
 
 iQFPBAABCgA5FiEEh7BulGwFlgAOi5DV2/sHvwUrPxsFAmnhhjEbHG1hc2FtaS5o
 aXJhbWF0c3VAZ21haWwuY29tAAoJENv7B78FKz8bfBMH/21EM4dL2oDrsR3Ud7G+
 R1LYB959jkztJlIKQ9xGBdUpyMd97JYszknpTG4QHb7XW4FrcvZMiHeT9AW6ZBzw
 zJkWVbSCst7sNYpjmqFcf/p3uUJp3/jHNG6byQ2/oWW9bilWPitY1LDL8u/VwzWN
 ZN35B09bw2IgKXbxCNoNlASdV8w/nnVI+CEjKSbLJBVQffBzU9losLt5Qu1i5w/A
 Q5O5ZLkeawQhRzilxVAqae7U949MzRvn28EiLyMJ4rJA0vWH+ijg7vFYmC4hnFde
 GNzJ7q7S87BmApTsSVT+QBCpIGd2XjLYfiYZBBoZIZE2wXI4fxgF+cd05XE51hrG
 mTs=
 =ycSD
 -----END PGP SIGNATURE-----

Merge tag 'probes-v7.1' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace

Pull fprobe update from Masami Hiramatsu:

 - do not zero out unused fgraph_data. This removes unneeded memset of
   fgraph_data in fprobe entry handler.

* tag 'probes-v7.1' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
  tracing: fprobe: do not zero out unused fgraph_data
2026-04-17 09:18:32 -07:00