Commit Graph

3474 Commits

Author SHA1 Message Date
Helge Deller
41ca998fbe parisc: Fix 64-bit kernel build when CONFIG_COMPAT=n
VDSO32_SYMBOL() is used in signal.c, defining the value to zero avoids
liker issues when CONFIG_COMPAT=n.

Signed-off-by: Helge Deller <deller@gmx.de>
2026-05-01 19:09:30 +02:00
Helge Deller
e6a650acbd parisc: Fix build failure for 32-bit kernel with PA2.0 instruction set
The CONFIG_PA11 option can not be used as a reliable check if we build a
32-bit kernel which needs the 32-bit VDSO.
Instead depend on CONFIG_64BIT and CONFIG_COMPAT only.

Reported-by: Christoph Biedl <linux-kernel.bfrz@manchmal.in-ulm.de>
Tested-by: Christoph Biedl <linux-kernel.bfrz@manchmal.in-ulm.de>
Signed-off-by: Helge Deller <deller@gmx.de>
2026-04-30 09:10:07 +02:00
Johan Hovold
b8425ceefe parisc: drivers: switch to dynamic root device
Driver core expects devices to be dynamically allocated and will, for
example, complain loudly if a device that lacks a release function
is ever freed.

Use root_device_register() to allocate and register the root device
instead of open coding using a static device.

While at it, drop the redundant additional reference taken at init.

Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Helge Deller <deller@gmx.de>
2026-04-28 17:18:52 +02:00
Linus Torvalds
eb5249b125 parisc architecture fixes and updates for kernel v7.1-rc1:
- A fix to make modules on 32-bit parisc architecture work again
 - Drop ip_fast_csum() inline assembly to avoid unaligned memory accesses
 - Allow to build kernel without 32-bit VDSO
 - Reference leak fix in error path in LED driver
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQS86RI+GtKfB8BJu973ErUQojoPXwUCaeOk+QAKCRD3ErUQojoP
 X6fjAQD0DK47ZT3fuUGVH0T1mg6kZ2a7fSrNkjbGW+B87jlZVQD/ez6eLdZ7+B4f
 k+sQ9Y0e18Ypgi1uvtNFe1lzX25jrQg=
 =GnPk
 -----END PGP SIGNATURE-----

Merge tag 'parisc-for-7.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux

Pull parisc architecture updates from Helge Deller:

 - A fix to make modules on 32-bit parisc architecture work again

 - Drop ip_fast_csum() inline assembly to avoid unaligned memory
   accesses

 - Allow to build kernel without 32-bit VDSO

 - Reference leak fix in error path in LED driver

* tag 'parisc-for-7.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux:
  parisc: led: fix reference leak on failed device registration
  module.lds.S: Fix modules on 32-bit parisc architecture
  parisc: Allow to build without VDSO32
  parisc: Include 32-bit VDSO only when building for 32-bit or compat mode
  parisc: Allow to disable COMPAT mode on 64-bit kernel
  parisc: Fix default stack size when COMPAT=n
  parisc: Fix signal code to depend on CONFIG_COMPAT instead of CONFIG_64BIT
  parisc: is_compat_task() shall return false for COMPAT=n
  parisc: Avoid compat syscalls when COMPAT=n
  parisc: _llseek syscall is only available for 32-bit userspace
  parisc: Drop ip_fast_csum() inline assembly implementation
  parisc: update outdated comments for renamed ccio_alloc_consistent()
2026-04-18 11:37:36 -07:00
Linus Torvalds
eb0d6d97c2 bpf-fixes
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEE+soXsSLHKoYyzcli6rmadz2vbToFAmnihOkACgkQ6rmadz2v
 bTqjQA/+K6R/teQRwVmP1GDrfBjz2TXUzCN1WQQLzbnJNR96Mzq72+aTWjza89BK
 yEUP379qiOeUfEyyV7DNfHh8hAclUAMKuvI3T3pshLQhpOS0+YcpfbakEZbos+My
 AzEGhGl2nhT7S5twHFznCpuSaLgqldHkdAy4BZIiFkOS5lPBX9CU++OAslFPM+f8
 R28JQYWuv2/b1mRsz8zDmQQXxwH/Rpz9hdJKcpm/kCYYBay3cAFV7ArFJfn+Y5se
 9I6mTwNQ+xtSxtsmR/lftlGo1Vv9ah6qM9gKwgju0SkNrS+9UBlNUSmTrJk1fz+d
 SxdppCrqxwHY3UVd62eF4fWWgusC+oMuKzTh6d+D/ZkKvnEjdAx5XQ7uUQyYhKil
 G12vvKWcHit0Qz9RAhqlEEZ+GIpFTtLql6aW7pRmQKE8/vmQwAD1HBqNqWYKjokW
 btlJ3fUOGu8VHtnYbI3FN6VsK8BU9t/xMny9Fys9X4KmtWBLsm4udmiorV9uC+w6
 xV2s+x+ahythTEzVICB6BlQotSRyMd9kR5qisJsetWk+7NBY0Bwn7C0kfVGepHh0
 WerFSYdSifTvBWQjXnvqmAX7YspmpZvevw8PCtoPq1xq5d1FrYu1K5GX/xzpy+pH
 p13afkbN7Mk6OwteFefD1B0ofug3V9sx3HBI72ENs1Z+hh1KdOQ=
 =79I2
 -----END PGP SIGNATURE-----

Merge tag 'bpf-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf

Pull bpf fixes from Alexei Starovoitov:
 "Most of the diff stat comes from Xu Kuohai's fix to emit ENDBR/BTI,
  since all JITs had to be touched to move constant blinding out and
  pass bpf_verifier_env in.

   - Fix use-after-free in arena_vm_close on fork (Alexei Starovoitov)

   - Dissociate struct_ops program with map if map_update fails (Amery
     Hung)

   - Fix out-of-range and off-by-one bugs in arm64 JIT (Daniel Borkmann)

   - Fix precedence bug in convert_bpf_ld_abs alignment check (Daniel
     Borkmann)

   - Fix arg tracking for imprecise/multi-offset in BPF_ST/STX insns
     (Eduard Zingerman)

   - Copy token from main to subprogs to fix missing kallsyms (Eduard
     Zingerman)

   - Prevent double close and leak of btf objects in libbpf (Jiri Olsa)

   - Fix af_unix null-ptr-deref in sockmap (Michal Luczaj)

   - Fix NULL deref in map_kptr_match_type for scalar regs (Mykyta
     Yatsenko)

   - Avoid unnecessary IPIs. Remove redundant bpf_flush_icache() in
     arm64 and riscv JITs (Puranjay Mohan)

   - Fix out of bounds access. Validate node_id in arena_alloc_pages()
     (Puranjay Mohan)

   - Reject BPF-to-BPF calls and callbacks in arm32 JIT (Puranjay Mohan)

   - Refactor all JITs to pass bpf_verifier_env to emit ENDBR/BTI for
     indirect jump targets on x86-64, arm64 JITs (Xu Kuohai)

   - Allow UTF-8 literals in bpf_bprintf_prepare() (Yihan Ding)"

* tag 'bpf-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf: (32 commits)
  bpf, arm32: Reject BPF-to-BPF calls and callbacks in the JIT
  bpf: Dissociate struct_ops program with map if map_update fails
  bpf: Validate node_id in arena_alloc_pages()
  libbpf: Prevent double close and leak of btf objects
  selftests/bpf: cover UTF-8 trace_printk output
  bpf: allow UTF-8 literals in bpf_bprintf_prepare()
  selftests/bpf: Reject scalar store into kptr slot
  bpf: Fix NULL deref in map_kptr_match_type for scalar regs
  bpf: Fix precedence bug in convert_bpf_ld_abs alignment check
  bpf, arm64: Emit BTI for indirect jump target
  bpf, x86: Emit ENDBR for indirect jump targets
  bpf: Add helper to detect indirect jump targets
  bpf: Pass bpf_verifier_env to JIT
  bpf: Move constants blinding out of arch-specific JITs
  bpf, sockmap: Take state lock for af_unix iter
  bpf, sockmap: Fix af_unix null-ptr-deref in proto update
  selftests/bpf: Extend bpf_iter_unix to attempt deadlocking
  bpf, sockmap: Fix af_unix iter deadlock
  bpf, sockmap: Annotate af_unix sock:: Sk_state data-races
  selftests/bpf: verify kallsyms entries for token-loaded subprograms
  ...
2026-04-17 15:58:22 -07:00
Helge Deller
1221365f55 module.lds.S: Fix modules on 32-bit parisc architecture
On the 32-bit parisc architecture, we always used the
-ffunction-sections compiler option to tell the compiler to put the
functions into seperate text sections. This is necessary, otherwise
"big" kernel modules like ext4 or ipv6 fail to load because some
branches won't be able to reach their stubs.

Commit 1ba9f89794 ("vmlinux.lds: Unify TEXT_MAIN, DATA_MAIN, and related
macros") broke this for parisc because all text sections will get
unconditionally merged now.

Introduce the ARCH_WANTS_MODULES_TEXT_SECTIONS config option which
avoids the text section merge for modules, and fix this issue by
enabling this option by default for 32-bit parisc.

Fixes: 1ba9f89794 ("vmlinux.lds: Unify TEXT_MAIN, DATA_MAIN, and related macros")
Cc: Josh Poimboeuf <jpoimboe@kernel.org>
Cc: stable@vger.kernel.org # v6.19+
Suggested-by: Sami Tolvanen <samitolvanen@google.com>
Reviewed-by: Petr Pavlu <petr.pavlu@suse.com>
Signed-off-by: Helge Deller <deller@gmx.de>
2026-04-17 15:46:46 +02:00
Helge Deller
3dce917902 parisc: Allow to build without VDSO32
When building for 64-bit and without CONFIG_COMPAT, leave out the
vdso32 binary.

Signed-off-by: Helge Deller <deller@gmx.de>
2026-04-17 15:46:45 +02:00
Helge Deller
ba56cdf133 parisc: Include 32-bit VDSO only when building for 32-bit or compat mode
Signed-off-by: Helge Deller <deller@gmx.de>
2026-04-17 15:46:45 +02:00
Helge Deller
bc4021c4e9 parisc: Allow to disable COMPAT mode on 64-bit kernel
Although we don't yet have a 64-bit userspace, allowing to disable
the compat mode should be possible.

Signed-off-by: Helge Deller <deller@gmx.de>
2026-04-17 15:46:45 +02:00
Helge Deller
35493b28e7 parisc: Fix default stack size when COMPAT=n
The CONFIG_STACK_MAX_DEFAULT_SIZE_MB config option does not exist
when CONFIG_COMPAT is disabled. Use default 1 GB stack in this case.

Signed-off-by: Helge Deller <deller@gmx.de>
2026-04-17 15:46:45 +02:00
Helge Deller
7dc9ee6e5e parisc: Fix signal code to depend on CONFIG_COMPAT instead of CONFIG_64BIT
The signal handler code used CONFIG_64BIT to decide if compat handling
code should be compiled in. Fix it to use CONFIG_COMPAT instead.
This allows to disable CONFIG_COMPAT even when running a 64-bit kernel.

Signed-off-by: Helge Deller <deller@gmx.de>
2026-04-17 15:46:45 +02:00
Helge Deller
b5d5faba0f parisc: is_compat_task() shall return false for COMPAT=n
Signed-off-by: Helge Deller <deller@gmx.de>
2026-04-17 15:46:45 +02:00
Helge Deller
97bfda4520 parisc: Avoid compat syscalls when COMPAT=n
Drop unnecessary code and syscall tables when we run a 64-bit
kernel with conpat mode disabled.

Signed-off-by: Helge Deller <deller@gmx.de>
2026-04-17 15:46:45 +02:00
Helge Deller
da3680f564 parisc: _llseek syscall is only available for 32-bit userspace
Cc: stable@vger.kernel.org
Signed-off-by: Helge Deller <deller@gmx.de>
2026-04-17 11:32:46 +02:00
Helge Deller
3dd31a370c parisc: Drop ip_fast_csum() inline assembly implementation
The assembly code of ip_fast_csum() triggers unaligned access warnings
if the IP header isn't correctly aligned:

 Kernel: unaligned access to 0x173d22e76 in inet_gro_receive+0xbc/0x2e8 (iir 0x0e8810b6)
 Kernel: unaligned access to 0x173d22e7e in inet_gro_receive+0xc4/0x2e8 (iir 0x0e88109a)
 Kernel: unaligned access to 0x173d22e82 in inet_gro_receive+0xc8/0x2e8 (iir 0x0e90109d)
 Kernel: unaligned access to 0x173d22e7a in inet_gro_receive+0xd0/0x2e8 (iir 0x0e9810b8)
 Kernel: unaligned access to 0x173d22e86 in inet_gro_receive+0xdc/0x2e8 (iir 0x0e8810b8)

We have the option to a) ignore the warnings, b) work around it by
adding more code to check for alignment, or c) to switch to the generic
implementation and rely on the compiler to optimize the code.

Let's go with c), because a) isn't nice, and b) would effectively lead
to an implementation which is basically equal to c).

Signed-off-by: Helge Deller <deller@gmx.de>
Cc: stable@vger.kernel.org # v7.0+
2026-04-17 11:32:46 +02:00
Xu Kuohai
d9ef13f727 bpf: Pass bpf_verifier_env to JIT
Pass bpf_verifier_env to bpf_int_jit_compile(). The follow-up patch will
use env->insn_aux_data in the JIT stage to detect indirect jump targets.

Since bpf_prog_select_runtime() can be called by cbpf and lib/test_bpf.c
code without verifier, introduce helper __bpf_prog_select_runtime()
to accept the env parameter.

Remove the call to bpf_prog_select_runtime() in bpf_prog_load(), and
switch to call __bpf_prog_select_runtime() in the verifier, with env
variable passed. The original bpf_prog_select_runtime() is preserved for
cbpf and lib/test_bpf.c, where env is NULL.

Now all constants blinding calls are moved into the verifier, except
the cbpf and lib/test_bpf.c cases. The instructions arrays are adjusted
by bpf_patch_insn_data() function for normal cases, so there is no need
to call adjust_insn_arrays() in bpf_jit_blind_constants(). Remove it.

Reviewed-by: Anton Protopopov <a.s.protopopov@gmail.com> # v8
Reviewed-by: Emil Tsalapatis <emil@etsalapatis.com> # v12
Acked-by: Hengqi Chen <hengqi.chen@gmail.com> # v14
Signed-off-by: Xu Kuohai <xukuohai@huawei.com>
Link: https://lore.kernel.org/r/20260416064341.151802-3-xukuohai@huaweicloud.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2026-04-16 07:03:40 -07:00
Xu Kuohai
d3e945223e bpf: Move constants blinding out of arch-specific JITs
During the JIT stage, constants blinding rewrites instructions but only
rewrites the private instruction copy of the JITed subprog, leaving the
global env->prog->insnsi and env->insn_aux_data untouched. This causes a
mismatch between subprog instructions and the global state, making it
difficult to use the global data in the JIT.

To avoid this mismatch, and given that all arch-specific JITs already
support constants blinding, move it to the generic verifier code, and
switch to rewrite the global env->prog->insnsi with the global states
adjusted, as other rewrites in the verifier do.

This removes the constants blinding calls in each JIT, which are largely
duplicated code across architectures.

Since constants blinding is only required for JIT, and there are two
JIT entry functions, jit_subprogs() for BPF programs with multiple
subprogs and bpf_prog_select_runtime() for programs with no subprogs,
move the constants blinding invocation into these two functions.

In the verifier path, bpf_patch_insn_data() is used to keep global
verifier auxiliary data in sync with patched instructions. A key
question is whether this global auxiliary data should be restored
on the failure path.

Besides instructions, bpf_patch_insn_data() adjusts:
  - prog->aux->poke_tab
  - env->insn_array_maps
  - env->subprog_info
  - env->insn_aux_data

For prog->aux->poke_tab, it is only used by JIT or only meaningful after
JIT succeeds, so it does not need to be restored on the failure path.

For env->insn_array_maps, when JIT fails, programs using insn arrays
are rejected by bpf_insn_array_ready() due to missing JIT addresses.
Hence, env->insn_array_maps is only meaningful for JIT and does not need
to be restored.

For subprog_info, if jit_subprogs fails and CONFIG_BPF_JIT_ALWAYS_ON
is not enabled, kernel falls back to interpreter. In this case,
env->subprog_info is used to determine subprogram stack depth. So it
must be restored on failure.

For env->insn_aux_data, it is freed by clear_insn_aux_data() at the
end of bpf_check(). Before freeing, clear_insn_aux_data() loops over
env->insn_aux_data to release jump targets recorded in it. The loop
uses env->prog->len as the array length, but this length no longer
matches the actual size of the adjusted env->insn_aux_data array after
constants blinding.

To address it, a simple approach is to keep insn_aux_data as adjusted
after failure, since it will be freed shortly, and record its actual size
for the loop in clear_insn_aux_data(). But since clear_insn_aux_data()
uses the same index to loop over both env->prog->insnsi and env->insn_aux_data,
this approach results in incorrect index for the insnsi array. So an
alternative approach is adopted: clone the original env->insn_aux_data
before blinding and restore it after failure, similar to env->prog.

For classic BPF programs, constants blinding works as before since it
is still invoked from bpf_prog_select_runtime().

Reviewed-by: Anton Protopopov <a.s.protopopov@gmail.com> # v8
Reviewed-by: Hari Bathini <hbathini@linux.ibm.com> # powerpc jit
Reviewed-by: Pu Lehui <pulehui@huawei.com> # riscv jit
Acked-by: Hengqi Chen <hengqi.chen@gmail.com> # loongarch jit
Signed-off-by: Xu Kuohai <xukuohai@huawei.com>
Link: https://lore.kernel.org/r/20260416064341.151802-2-xukuohai@huaweicloud.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2026-04-16 07:03:40 -07:00
Linus Torvalds
40286d6379 pci-v7.1-changes
-----BEGIN PGP SIGNATURE-----
 
 iQJIBAABCgAyFiEEgMe7l+5h9hnxdsnuWYigwDrT+vwFAmnfwfMUHGJoZWxnYWFz
 QGdvb2dsZS5jb20ACgkQWYigwDrT+vxwIRAAlN1h5er8aFDbjON5YMXBZqlQmzaC
 bjlUHgwm7HkdErTFozyuqhE8QUO1kCm4uMQzeyJdfY9nRWqMDOuKYxMD5j0exk+o
 4tbbJg6Xx4dq7Qrawy9PhxyQm/PDAcvs+FRRlGala+qq9o3fxPDOAZVDE/1C8qFQ
 Jd7GGd7NZn/NN4xrqST4RQHjO8fwaMwmksWCStsb79kfesQWP6kLADGfIMcWxNUB
 2s+oTnK6Hw0tkBv56n6i8mbb0EzS3/RN1daTevGAta1rmfUVVtWGRZ4paMvv0Owi
 Rl5+O5Jz6/c1qiXZbUqu5CRQPIy7Dr3JPvURcZX6qbsV8PzWXZr0Wi+geWefGOnp
 55y+3OT0vdBGAuXLJhrcU7Clzq9D/TZOt8oTI8IFArUfDlmrAIdozPn7gr+VGre5
 QuKymSk3XWtyIbe4o8UeZ4f9g0y6ZY1XvtvB7K1tze+OOmqlkfq966+z8aZuGOKx
 ZvAU/NIat5H02EgB4dEVOP8R5vPZlXGT0RLGl1JWRypPWyZDbVVA3z927qRQG5md
 IsVq8WaIrB1zyl9g37lZeEaYwP/qCIQsHkMGPYcP4wdOQEV9AQqi5pmjMXnWyQJD
 PR1nvmTKW7USRCJ+pz8xPhZh0cj3ENaddORTD3I/0CGVV0y452bU/5rr4T+K04bK
 PCJBpxTIDuWDwXc=
 =FFRz
 -----END PGP SIGNATURE-----

Merge tag 'pci-v7.1-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci

Pull pci updates from Bjorn Helgaas:
 "Enumeration:

   - Allow TLP Processing Hints to be enabled for RCiEPs (George Abraham
     P)

   - Enable AtomicOps only if we know the Root Port supports them (Gerd
     Bayer)

   - Don't enable AtomicOps for RCiEPs since none of them need Atomic
     Ops and we can't tell whether the Root Complex would support them
     (Gerd Bayer)

   - Leave Precision Time Measurement disabled until a driver enables it
     to avoid PCIe errors (Mika Westerberg)

   - Make pci_set_vga_state() fail if bridge doesn't support VGA
     routing, i.e., PCI_BRIDGE_CTL_VGA is not writable, and return
     errors to vga_get() callers including userspace via
     /dev/vga_arbiter (Simon Richter)

   - Validate max-link-speed from DT in j721e, brcmstb, mediatek-gen3,
     rzg3s drivers (where the actual controller constraints are known),
     and remove validation from the generic OF DT accessor (Hans Zhang)

   - Remove pc110pad driver (no longer useful after 486 CPU support
     removed) and no_pci_devices() (pc110pad was the last user) (Dmitry
     Torokhov, Heiner Kallweit)

  Resource management:

   - Prevent assigning space to unimplemented bridge windows; previously
     we mistakenly assumed prefetchable window existed and assigned
     space and put a BAR there (Ahmed Naseef)

   - Avoid shrinking bridge windows to fit in the initial Root Port
     window; fixes one problem with devices with large BARs connected
     via switches, e.g., Thunderbolt (Ilpo Järvinen)

   - Pass full extent of empty space, not just the aligned space, to
     resource_alignf callback so free space before the requested
     alignment can be used (Ilpo Järvinen)

   - Place small resources before larger ones for better utilization of
     address space (Ilpo Järvinen)

   - Fix alignment calculation for resource size larger than align,
     e.g., bridge windows larger than the 1MB required alignment (Ilpo
     Järvinen)

  Reset:

   - Update slot handling so all ARI functions are treated as being in
     the same slot. They're all reset by Secondary Bus Reset, but
     previously drivers of ARI functions that appeared to be on a
     non-zero device weren't notified and fatal hardware errors could
     result (Keith Busch)

   - Make sysfs reset_subordinate hotplug safe to avoid spurious hotplug
     events (Keith Busch)

   - Hide Secondary Bus Reset ('bus') from sysfs reset_methods if masked
     by CXL because it has no effect (Vidya Sagar)

   - Avoid FLR for AMD NPU device, where it causes the device to hang
     (Lizhi Hou)

  Error handling:

   - Clear only error bits in PCIe Device Status to avoid accidentally
     clearing Emergency Power Reduction Detected (Shuai Xue)

   - Check for AER errors even in devices without drivers (Lukas Wunner)

   - Initialize ratelimit info so DPC and EDR paths log AER error
     information (Kuppuswamy Sathyanarayanan)

  Power control:

   - Add UPD720201/UPD720202 USB 3.0 xHCI Host Controller .compatible so
     generic pwrctrl driver can control it (Neil Armstrong)

  Hotplug:

   - Set LED_HW_PLUGGABLE for NPEM hotplug-capable ports so LED core
     doesn't complain when setting brightness fails because the endpoint
     is gone (Richard Cheng)

  Peer-to-peer DMA:

   - Allow wildcards in list of host bridges that support peer-to-peer
     DMA between hierarchy domains and add all Google SoCs (Jacob
     Moroni)

  Endpoint framework:

   - Advertise dynamic inbound mapping support in pci-epf-test and
     update host pci_endpoint_test to skip doorbell testing if not
     advertised by endpoint (Koichiro Den)

   - Return 0, not remaining timeout, when MHI eDMA ops complete so
     mhi_ep_ring_add_element() doesn't interpret non-zero as failure
     (Daniel Hodges)

   - Remove vntb and ntb duplicate resource teardown that leads to oops
     when .allow_link() fails or .drop_link() is called (Koichiro Den)

   - Disable vntb delayed work before clearing BAR mappings and
     doorbells to avoid oops caused by doing the work after resources
     have been torn down (Koichiro Den)

   - Add a way to describe reserved subregions within BARs, e.g.,
     platform-owned fixed register windows, and use it for the RK3588
     BAR4 DMA ctrl window (Koichiro Den)

   - Add BAR_DISABLED for BARs that will never be available to an EPF
     driver, and change some BAR_RESERVED annotations to BAR_DISABLED
     (Niklas Cassel)

   - Add NTB .get_dma_dev() callback for cases where DMA API requires a
     different device, e.g., vNTB devices (Koichiro Den)

   - Add reserved region types for MSI-X Table and PBA so Endpoint
     controllers can them as describe hardware-owned regions in a
     BAR_RESERVED BAR (Manikanta Maddireddy)

   - Make Tegra194/234 BAR0 programmable and remove 1MB size limit
     (Manikanta Maddireddy)

   - Expose Tegra BAR2 (MSI-X) and BAR4 (DMA) as 64-bit BAR_RESERVED
     (Manikanta Maddireddy)

   - Add Tegra194 and Tegra234 device table entries to pci_endpoint_test
     (Manikanta Maddireddy)

   - Skip the BAR subrange selftest if there are not enough inbound
     window resources to run the test (Christian Bruel)

  New native PCIe controller drivers:

   - Add DT binding and driver for Andes QiLai SoC PCIe host controller
     (Randolph Lin)

   - Add DT binding and driver for ESWIN PCIe Root Complex (Senchuan
     Zhang)

  Baikal T-1 PCIe controller driver:

   - Remove driver since it never quite became usable (Andy Shevchenko)

  Cadence PCIe controller driver:

   - Implement byte/word config reads with dword (32-bit) reads because
     some Cadence controllers don't support sub-dword accesses (Aksh
     Garg)

  CIX Sky1 PCIe controller driver:

   - Add 'power-domains' to DT binding for SCMI power domain (Gary Yang)

  Freescale i.MX6 PCIe controller driver:

   - Add i.MX94 and i.MX943 to fsl,imx6q-pcie-ep DT binding (Richard
     Zhu)

   - Delay instead of polling for L2/L3 Ready after PME_Turn_off when
     suspending i.MX6SX because LTSSM registers are inaccessible
     (Richard Zhu)

   - Separate PERST# assertion (for resetting endpoints) from core reset
     (for resetting the RC itself) to prepare for new DTs with PERST#
     GPIO in per-Root Port nodes (Sherry Sun)

   - Retain Root Port MSI capability on i.MX7D, i.MX8MM, and i.MX8MQ so
     MSI from downstream devices will work (Richard Zhu)

   - Fix i.MX95 reference clock source selection when internal refclk is
     used (Franz Schnyder)

  Freescale Layerscape PCIe controller driver:

   - Allow building as a removable module (Sascha Hauer)

  MediaTek PCIe Gen3 controller driver:

   - Use dev_err_probe() to simplify error paths and make deferred probe
     messages visible in /sys/kernel/debug/devices_deferred (Chen-Yu
     Tsai)

   - Power off device if setup fails (Chen-Yu Tsai)

   - Integrate new pwrctrl API to enable power control for WiFi/BT
     adapters on mainboard or in PCIe or M.2 slots (Chen-Yu Tsai)

  NVIDIA Tegra194 PCIe controller driver:

   - Poll less aggressively and non-atomically for PME_TO_Ack during
     transition to L2 (Vidya Sagar)

   - Disable LTSSM after transition to Detect on surprise link down to
     stop toggling between Polling and Detect (Manikanta Maddireddy)

   - Don't force the device into the D0 state before L2 when suspending
     or shutting down the controller (Vidya Sagar)

   - Disable PERST# IRQ only in Endpoint mode because it's not
     registered in Root Port mode (Manikanta Maddireddy)

   - Handle 'nvidia,refclk-select' as optional (Vidya Sagar)

   - Disable direct speed change in Endpoint mode so link speed change
     is controlled by the host (Vidya Sagar)

   - Set LTR values before link up to avoid bogus LTR messages with 0
     latency (Vidya Sagar)

   - Allow system suspend when the Endpoint link is down (Vidya Sagar)

   - Use DWC IP core version, not Tegra custom values, to avoid DWC core
     version check warnings (Manikanta Maddireddy)

   - Apply ECRC workaround to devices based on DesignWare 5.00a as well
     as 4.90a (Manikanta Maddireddy)

   - Disable PM Substate L1.2 in Endpoint mode to work around Tegra234
     erratum (Vidya Sagar)

   - Delay post-PERST# cleanup until core is powered on to avoid CBB
     timeout (Manikanta Maddireddy)

   - Assert CLKREQ# so switches that forward it to their downstream side
     can bring up those links successfully (Vidya Sagar)

   - Calibrate pipe to UPHY for Endpoint mode to reset stale PLL state
     from any previous bad link state (Vidya Sagar)

   - Remove IRQF_ONESHOT flag from Endpoint interrupt registration so
     DMA driver and Endpoint controller driver can share the interrupt
     line (Vidya Sagar)

   - Enable DMA interrupt to support DMA in both Root Port and Endpoint
     modes (Vidya Sagar)

   - Enable hardware link retraining after link goes down in Endpoint
     mode (Vidya Sagar)

   - Add DT binding and driver support for core clock monitoring (Vidya
     Sagar)

  Qualcomm PCIe controller driver:

   - Advertise 'Hot-Plug Capable' and set 'No Command Completed Support'
     since Qcom Root Ports support hotplug events like DL_Up/Down and
     can accept writes to Slot Control without delays between writes
     (Krishna Chaitanya Chundru)

  Renesas R-Car PCIe controller driver:

   - Mark Endpoint BAR0 and BAR2 as Resizable (Koichiro Den)

   - Reduce EPC BAR alignment requirement to 4K (Koichiro Den)

  Renesas RZ/G3S PCIe controller driver:

   - Add RZ/G3E to DT binding and to driver (John Madieu)

   - Assert (not deassert) resets in probe error path (John Madieu)

   - Assert resets in suspend path in reverse order they were deasserted
     during probe (John Madieu)

   - Rework inbound window algorithm to prevent mapping more than
     intended region and enforce alignment on size, to prepare for
     RZ/G3E support (John Madieu)

  Rockchip DesignWare PCIe controller driver:

   - Add tracepoints for PCIe controller LTSSM transitions and link rate
     changes (Shawn Lin)

   - Trace LTSSM events collected by the dw-rockchip debug FIFO (Shawn
     Lin)

  SOPHGO PCIe controller driver:

   - Disable ASPM L0s and L1 on Sophgo 2042 PCIe Root Ports that
     advertise support for them (Yao Zi)

  Synopsys DesignWare PCIe controller driver:

   - Continue with system suspend even if an Endpoint doesn't respond
     with PME_TO_Ack message (Manivannan Sadhasivam)

   - Set Endpoint MSI-X Table Size in the correct function of a
     multi-function device when configuring MSI-X, not in Function 0
     (Aksh Garg)

   - Set Max Link Width and Max Link Speed for all functions of a
     multi-function device, not just Function 0 (Aksh Garg)

   - Expose PCIe event counters in groups 5-7 in debugfs (Hans Zhang)

  Miscellaneous:

   - Warn only once about invalid ACS kernel parameter format (Richard
     Cheng)

   - Suppress FW_BUG warning when writing sysfs 'numa_node' with the
     current value (Li RongQing)

   - Drop redundant 'depends on PCI' from Kconfig (Julian Braha)"

* tag 'pci-v7.1-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci: (165 commits)
  PCI/P2PDMA: Add Google SoCs to the P2P DMA host bridge list
  PCI/P2PDMA: Allow wildcard Device IDs in host bridge list
  PCI: sg2042: Avoid L0s and L1 on Sophgo 2042 PCIe Root Ports
  PCI: cadence: Add flags for disabling ASPM capability for broken Root Ports
  PCI: tegra194: Add core monitor clock support
  dt-bindings: PCI: tegra194: Add monitor clock support
  PCI: tegra194: Enable hardware hot reset mode in Endpoint mode
  PCI: tegra194: Enable DMA interrupt
  PCI: tegra194: Remove IRQF_ONESHOT flag during Endpoint interrupt registration
  PCI: tegra194: Calibrate pipe to UPHY for Endpoint mode
  PCI: tegra194: Assert CLKREQ# explicitly by default
  PCI: tegra194: Fix CBB timeout caused by DBI access before core power-on
  PCI: tegra194: Disable L1.2 capability of Tegra234 EP
  PCI: dwc: Apply ECRC workaround to DesignWare 5.00a as well
  PCI: tegra194: Use DWC IP core version
  PCI: tegra194: Free up Endpoint resources during remove()
  PCI: tegra194: Allow system suspend when the Endpoint link is not up
  PCI: tegra194: Set LTR message request before PCIe link up in Endpoint mode
  PCI: tegra194: Disable direct speed change for Endpoint mode
  PCI: tegra194: Use devm_gpiod_get_optional() to parse "nvidia,refclk-select"
  ...
2026-04-15 14:41:21 -07:00
Linus Torvalds
334fbe734e mm.git review status for linus..mm-stable
Everything:
 
 Total patches:       368
 Reviews/patch:       1.56
 Reviewed rate:       74%
 
 Excluding DAMON:
 
 Total patches:       316
 Reviews/patch:       1.77
 Reviewed rate:       81%
 
 Excluding DAMON and zram:
 
 Total patches:       306
 Reviews/patch:       1.81
 Reviewed rate:       82%
 
 Excluding DAMON, zram and maple_tree:
 
 Total patches:       276
 Reviews/patch:       2.01
 Reviewed rate:       91%
 
 Significant patch series in this merge:
 
 - The 30 patch series "maple_tree: Replace big node with maple copy"
   from Liam Howlett is mainly prepararatory work for ongoing development
   but it does reduce stack usage and is an improvement.
 
 - The 12 patch series "mm, swap: swap table phase III: remove swap_map"
   from Kairui Song offers memory savings by removing the static swap_map.
   It also yields some CPU savings and implements several cleanups.
 
 - The 2 patch series "mm: memfd_luo: preserve file seals" from Pratyush
   Yadav adds file seal preservation to LUO's memfd code.
 
 - The 2 patch series "mm: zswap: add per-memcg stat for incompressible
   pages" from Jiayuan Chen adds additional userspace stats reportng to
   zswap.
 
 - The 4 patch series "arch, mm: consolidate empty_zero_page" from Mike
   Rapoport implements some cleanups for our handling of ZERO_PAGE() and
   zero_pfn.
 
 - The 2 patch series "mm/kmemleak: Improve scan_should_stop()
   implementation" from Zhongqiu Han provides an robustness improvement and
   some cleanups in the kmemleak code.
 
 - The 4 patch series "Improve khugepaged scan logic" from Vernon Yang
   "improves the khugepaged scan logic and reduces CPU consumption by
   prioritizing scanning tasks that access memory frequently".
 
 - The 2 patch series "Make KHO Stateless" from Jason Miu simplifies
   Kexec Handover by "transitioning KHO from an xarray-based metadata
   tracking system with serialization to a radix tree data structure that
   can be passed directly to the next kernel"
 
 - The 3 patch series "mm: vmscan: add PID and cgroup ID to vmscan
   tracepoints" from Thomas Ballasi and Steven Rostedt enhances vmscan's
   tracepointing.
 
 - The 5 patch series "mm: arch/shstk: Common shadow stack mapping helper
   and VM_NOHUGEPAGE" from Catalin Marinas is a cleanup for the shadow
   stack code: remove per-arch code in favour of a generic implementation.
 
 - The 2 patch series "Fix KASAN support for KHO restored vmalloc
   regions" from Pasha Tatashin fixes a WARN() which can be emitted the KHO
   restores a vmalloc area.
 
 - The 4 patch series "mm: Remove stray references to pagevec" from Tal
   Zussman provides several cleanups, mainly udpating references to "struct
   pagevec", which became folio_batch three years ago.
 
 - The 17 patch series "mm: Eliminate fake head pages from vmemmap
   optimization" from Kiryl Shutsemau simplifies the HugeTLB vmemmap
   optimization (HVO) by changing how tail pages encode their relationship
   to the head page.
 
 - The 2 patch series "mm/damon/core: improve DAMOS quota efficiency for
   core layer filters" from SeongJae Park improves two problematic
   behaviors of DAMOS that makes it less efficient when core layer filters
   are used.
 
 - The 3 patch series "mm/damon: strictly respect min_nr_regions" from
   SeongJae Park improves DAMON usability by extending the treatment of the
   min_nr_regions user-settable parameter.
 
 - The 3 patch series "mm/page_alloc: pcp locking cleanup" from Vlastimil
   Babka is a proper fix for a previously hotfixed SMP=n issue.  Code
   simplifications and cleanups ennsed.
 
 - The 16 patch series "mm: cleanups around unmapping / zapping" from
   David Hildenbrand implements "a bunch of cleanups around unmapping and
   zapping.  Mostly simplifications, code movements, documentation and
   renaming of zapping functions".
 
 - The 6 patch series "support batched checking of the young flag for
   MGLRU" from Baolin Wang supports batched checking of the young flag for
   MGLRU.  It's part cleanups; one benchmark shows large performance
   benefits for arm64.
 
 - The 5 patch series "memcg: obj stock and slab stat caching cleanups"
   from Johannes Weiner provides memcg cleanup and robustness improvements.
 
 - The 5 patch series "Allow order zero pages in page reporting" from
   Yuvraj Sakshith enhances page_reporting's free page reporting - it is
   presently and undesirably order-0 pages when reporting free memory.
 
 - The 6 patch series "mm: vma flag tweaks" from Lorenzo Stoakes is
   cleanup work following from the recent conversion of the VMA flags to a
   bitmap.
 
 - The 10 patch series "mm/damon: add optional debugging-purpose sanity
   checks" from SeongJae Park adds some more developer-facing debug checks
   into DAMON core.
 
 - The 2 patch series "mm/damon: test and document power-of-2
   min_region_sz requirement" from SeongJae Park adds an additional DAMON
   kunit test and makes some adjustments to the addr_unit parameter
   handling.
 
 - The 3 patch series "mm/damon/core: make passed_sample_intervals
   comparisons overflow-safe" from SeongJae Park fixes a hard-to-hit time
   overflow issue in DAMON core.
 
 - The 7 patch series "mm/damon: improve/fixup/update ratio calculation,
   test and documentation" from SeongJae Park is a "batch of misc/minor
   improvements and fixups" for DAMON.
 
 - The 4 patch series "mm: move vma_(kernel|mmu)_pagesize() out of
   hugetlb.c" from David Hildenbrand fixes a possible issue with dax-device
   when CONFIG_HUGETLB=n.  Some code movement was required.
 
 - The 6 patch series "zram: recompression cleanups and tweaks" from
   Sergey Senozhatsky provides "a somewhat random mix of fixups,
   recompression cleanups and improvements" in the zram code.
 
 - The 11 patch series "mm/damon: support multiple goal-based quota
   tuning algorithms" from SeongJae Park extend DAMOS quotas goal
   auto-tuning to support multiple tuning algorithms that users can select.
 
 - The 4 patch series "mm: thp: reduce unnecessary
   start_stop_khugepaged()" from Breno Leitao fixes the khugpaged sysfs
   handling so we no longer spam the logs with reams of junk when
   starting/stopping khugepaged.
 
 - The 3 patch series "mm: improve map count checks" from Lorenzo Stoakes
   provides some cleanups and slight fixes in the mremap, mmap and vma
   code.
 
 - The 5 patch series "mm/damon: support addr_unit on default monitoring
   targets for modules" from SeongJae Park extends the use of DAMON core's
   addr_unit tunable.
 
 - The 5 patch series "mm: khugepaged cleanups and mTHP prerequisites"
   from Nico Pache provides cleanups in the khugepaged and is a base for
   Nico's planned khugepaged mTHP support.
 
 - The 15 patch series "mm: memory hot(un)plug and SPARSEMEM cleanups"
   from David Hildenbrand implements code movement and cleanups in the
   memhotplug and sparsemem code.
 
 - The 2 patch series "mm: remove CONFIG_ARCH_ENABLE_MEMORY_HOTREMOVE and
   cleanup CONFIG_MIGRATION" from David Hildenbrand rationalizes some
   memhotplug Kconfig support.
 
 - The 6 patch series "change young flag check functions to return bool"
   from Baolin Wang is "a cleanup patchset to change all young flag check
   functions to return bool".
 
 - The 3 patch series "mm/damon/sysfs: fix memory leak and NULL
   dereference issues" from Josh Law and SeongJae Park fixes a few
   potential DAMON bugs.
 
 - The 25 patch series "mm/vma: convert vm_flags_t to vma_flags_t in vma
   code" from "converts a lot of the existing use of the legacy vm_flags_t
   data type to the new vma_flags_t type which replaces it".  Mainly in the
   vma code.
 
 - The 21 patch series "mm: expand mmap_prepare functionality and usage"
   from Lorenzo Stoakes "expands the mmap_prepare functionality, which is
   intended to replace the deprecated f_op->mmap hook which has been the
   source of bugs and security issues for some time".  Cleanups,
   documentation, extension of mmap_prepare into filesystem drivers.
 
 - The 13 patch series "mm/huge_memory: refactor zap_huge_pmd()" from
   Lorenzo Stoakes simplifies and cleans up zap_huge_pmd().  Additional
   cleanups around vm_normal_folio_pmd() and the softleaf functionality are
   performed.
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCad3HDQAKCRDdBJ7gKXxA
 jrUQAPwNhPk5nPSxnyxjAeQtOBHqgCdnICeEismLajPKd9aYRgEA0s2XAu3tSUYi
 GrBnWImHG3s4ePQxVcPCegWTsOUrXgQ=
 =1Q7o
 -----END PGP SIGNATURE-----

Merge tag 'mm-stable-2026-04-13-21-45' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull MM updates from Andrew Morton:

 - "maple_tree: Replace big node with maple copy" (Liam Howlett)

   Mainly prepararatory work for ongoing development but it does reduce
   stack usage and is an improvement.

 - "mm, swap: swap table phase III: remove swap_map" (Kairui Song)

   Offers memory savings by removing the static swap_map. It also yields
   some CPU savings and implements several cleanups.

 - "mm: memfd_luo: preserve file seals" (Pratyush Yadav)

   File seal preservation to LUO's memfd code

 - "mm: zswap: add per-memcg stat for incompressible pages" (Jiayuan
   Chen)

   Additional userspace stats reportng to zswap

 - "arch, mm: consolidate empty_zero_page" (Mike Rapoport)

   Some cleanups for our handling of ZERO_PAGE() and zero_pfn

 - "mm/kmemleak: Improve scan_should_stop() implementation" (Zhongqiu
   Han)

   A robustness improvement and some cleanups in the kmemleak code

 - "Improve khugepaged scan logic" (Vernon Yang)

   Improve khugepaged scan logic and reduce CPU consumption by
   prioritizing scanning tasks that access memory frequently

 - "Make KHO Stateless" (Jason Miu)

   Simplify Kexec Handover by transitioning KHO from an xarray-based
   metadata tracking system with serialization to a radix tree data
   structure that can be passed directly to the next kernel

 - "mm: vmscan: add PID and cgroup ID to vmscan tracepoints" (Thomas
   Ballasi and Steven Rostedt)

   Enhance vmscan's tracepointing

 - "mm: arch/shstk: Common shadow stack mapping helper and
   VM_NOHUGEPAGE" (Catalin Marinas)

   Cleanup for the shadow stack code: remove per-arch code in favour of
   a generic implementation

 - "Fix KASAN support for KHO restored vmalloc regions" (Pasha Tatashin)

   Fix a WARN() which can be emitted the KHO restores a vmalloc area

 - "mm: Remove stray references to pagevec" (Tal Zussman)

   Several cleanups, mainly udpating references to "struct pagevec",
   which became folio_batch three years ago

 - "mm: Eliminate fake head pages from vmemmap optimization" (Kiryl
   Shutsemau)

   Simplify the HugeTLB vmemmap optimization (HVO) by changing how tail
   pages encode their relationship to the head page

 - "mm/damon/core: improve DAMOS quota efficiency for core layer
   filters" (SeongJae Park)

   Improve two problematic behaviors of DAMOS that makes it less
   efficient when core layer filters are used

 - "mm/damon: strictly respect min_nr_regions" (SeongJae Park)

   Improve DAMON usability by extending the treatment of the
   min_nr_regions user-settable parameter

 - "mm/page_alloc: pcp locking cleanup" (Vlastimil Babka)

   The proper fix for a previously hotfixed SMP=n issue. Code
   simplifications and cleanups ensued

 - "mm: cleanups around unmapping / zapping" (David Hildenbrand)

   A bunch of cleanups around unmapping and zapping. Mostly
   simplifications, code movements, documentation and renaming of
   zapping functions

 - "support batched checking of the young flag for MGLRU" (Baolin Wang)

   Batched checking of the young flag for MGLRU. It's part cleanups; one
   benchmark shows large performance benefits for arm64

 - "memcg: obj stock and slab stat caching cleanups" (Johannes Weiner)

   memcg cleanup and robustness improvements

 - "Allow order zero pages in page reporting" (Yuvraj Sakshith)

   Enhance free page reporting - it is presently and undesirably order-0
   pages when reporting free memory.

 - "mm: vma flag tweaks" (Lorenzo Stoakes)

   Cleanup work following from the recent conversion of the VMA flags to
   a bitmap

 - "mm/damon: add optional debugging-purpose sanity checks" (SeongJae
   Park)

   Add some more developer-facing debug checks into DAMON core

 - "mm/damon: test and document power-of-2 min_region_sz requirement"
   (SeongJae Park)

   An additional DAMON kunit test and makes some adjustments to the
   addr_unit parameter handling

 - "mm/damon/core: make passed_sample_intervals comparisons
   overflow-safe" (SeongJae Park)

   Fix a hard-to-hit time overflow issue in DAMON core

 - "mm/damon: improve/fixup/update ratio calculation, test and
   documentation" (SeongJae Park)

   A batch of misc/minor improvements and fixups for DAMON

 - "mm: move vma_(kernel|mmu)_pagesize() out of hugetlb.c" (David
   Hildenbrand)

   Fix a possible issue with dax-device when CONFIG_HUGETLB=n. Some code
   movement was required.

 - "zram: recompression cleanups and tweaks" (Sergey Senozhatsky)

   A somewhat random mix of fixups, recompression cleanups and
   improvements in the zram code

 - "mm/damon: support multiple goal-based quota tuning algorithms"
   (SeongJae Park)

   Extend DAMOS quotas goal auto-tuning to support multiple tuning
   algorithms that users can select

 - "mm: thp: reduce unnecessary start_stop_khugepaged()" (Breno Leitao)

   Fix the khugpaged sysfs handling so we no longer spam the logs with
   reams of junk when starting/stopping khugepaged

 - "mm: improve map count checks" (Lorenzo Stoakes)

   Provide some cleanups and slight fixes in the mremap, mmap and vma
   code

 - "mm/damon: support addr_unit on default monitoring targets for
   modules" (SeongJae Park)

   Extend the use of DAMON core's addr_unit tunable

 - "mm: khugepaged cleanups and mTHP prerequisites" (Nico Pache)

   Cleanups to khugepaged and is a base for Nico's planned khugepaged
   mTHP support

 - "mm: memory hot(un)plug and SPARSEMEM cleanups" (David Hildenbrand)

   Code movement and cleanups in the memhotplug and sparsemem code

 - "mm: remove CONFIG_ARCH_ENABLE_MEMORY_HOTREMOVE and cleanup
   CONFIG_MIGRATION" (David Hildenbrand)

   Rationalize some memhotplug Kconfig support

 - "change young flag check functions to return bool" (Baolin Wang)

   Cleanups to change all young flag check functions to return bool

 - "mm/damon/sysfs: fix memory leak and NULL dereference issues" (Josh
   Law and SeongJae Park)

   Fix a few potential DAMON bugs

 - "mm/vma: convert vm_flags_t to vma_flags_t in vma code" (Lorenzo
   Stoakes)

   Convert a lot of the existing use of the legacy vm_flags_t data type
   to the new vma_flags_t type which replaces it. Mainly in the vma
   code.

 - "mm: expand mmap_prepare functionality and usage" (Lorenzo Stoakes)

   Expand the mmap_prepare functionality, which is intended to replace
   the deprecated f_op->mmap hook which has been the source of bugs and
   security issues for some time. Cleanups, documentation, extension of
   mmap_prepare into filesystem drivers

 - "mm/huge_memory: refactor zap_huge_pmd()" (Lorenzo Stoakes)

   Simplify and clean up zap_huge_pmd(). Additional cleanups around
   vm_normal_folio_pmd() and the softleaf functionality are performed.

* tag 'mm-stable-2026-04-13-21-45' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (369 commits)
  mm: fix deferred split queue races during migration
  mm/khugepaged: fix issue with tracking lock
  mm/huge_memory: add and use has_deposited_pgtable()
  mm/huge_memory: add and use normal_or_softleaf_folio_pmd()
  mm: add softleaf_is_valid_pmd_entry(), pmd_to_softleaf_folio()
  mm/huge_memory: separate out the folio part of zap_huge_pmd()
  mm/huge_memory: use mm instead of tlb->mm
  mm/huge_memory: remove unnecessary sanity checks
  mm/huge_memory: deduplicate zap deposited table call
  mm/huge_memory: remove unnecessary VM_BUG_ON_PAGE()
  mm/huge_memory: add a common exit path to zap_huge_pmd()
  mm/huge_memory: handle buggy PMD entry in zap_huge_pmd()
  mm/huge_memory: have zap_huge_pmd return a boolean, add kdoc
  mm/huge: avoid big else branch in zap_huge_pmd()
  mm/huge_memory: simplify vma_is_specal_huge()
  mm: on remap assert that input range within the proposed VMA
  mm: add mmap_action_map_kernel_pages[_full]()
  uio: replace deprecated mmap hook with mmap_prepare in uio_info
  drivers: hv: vmbus: replace deprecated mmap hook with mmap_prepare
  mm: allow handling of stacked mmap_prepare hooks in more drivers
  ...
2026-04-15 12:59:16 -07:00
Linus Torvalds
91a4855d6c Networking changes for 7.1.
Core & protocols
 ----------------
 
  - Support HW queue leasing, allowing containers to be granted access
    to HW queues for zero-copy operations and AF_XDP.
 
  - Number of code moves to help the compiler with inlining.
    Avoid output arguments for returning drop reason where possible.
 
  - Rework drop handling within qdiscs to include more metadata
    about the reason and dropping qdisc in the tracepoints.
 
  - Remove the rtnl_lock use from IP Multicast Routing.
 
  - Pack size information into the Rx Flow Steering table pointer
    itself. This allows making the table itself a flat array of u32s,
    thus making the table allocation size a power of two.
 
  - Report TCP delayed ack timer information via socket diag.
 
  - Add ip_local_port_step_width sysctl to allow distributing the randomly
    selected ports more evenly throughout the allowed space.
 
  - Add support for per-route tunsrc in IPv6 segment routing.
 
  - Start work of switching sockopt handling to iov_iter.
 
  - Improve dynamic recvbuf sizing in MPTCP, limit burstiness and avoid
    buffer size drifting up.
 
  - Support MSG_EOR in MPTCP.
 
  - Add stp_mode attribute to the bridge driver for STP mode selection.
    This addresses concerns about call_usermodehelper() usage.
 
  - Remove UDP-Lite support (as announced in 2023).
 
  - Remove support for building IPv6 as a module.
    Remove the now unnecessary function calling indirection.
 
 Cross-tree stuff
 ----------------
 
  - Move Michael MIC code from generic crypto into wireless,
    it's considered insecure but some WiFi networks still need it.
 
 Netfilter
 ---------
 
  - Switch nft_fib_ipv6 module to no longer need temporary dst_entry
    object allocations by using fib6_lookup() + RCU.
    Florian W reports this gets us ~13% higher packet rate.
 
  - Convert IPVS's global __ip_vs_mutex to per-net service_mutex and
    switch the service tables to be per-net. Convert some code that
    walks the service lists to use RCU instead of the service_mutex.
 
  - Add more opinionated input validation to lower security exposure.
 
  - Make IPVS hash tables to be per-netns and resizable.
 
 Wireless
 --------
 
  - Finished assoc frame encryption/EPPKE/802.1X-over-auth.
 
  - Radar detection improvements.
 
  - Add 6 GHz incumbent signal detection APIs.
 
  - Multi-link support for FILS, probe response templates and
    client probing.
 
  - New APIs and mac80211 support for NAN (Neighbor Aware Networking,
    aka Wi-Fi Aware) so less work must be in firmware.
 
 Driver API
 ----------
 
  - Add numerical ID for devlink instances (to avoid having to create
    fake bus/device pairs just to have an ID). Support shared devlink
    instances which span multiple PFs.
 
  - Add standard counters for reporting pause storm events
    (implement in mlx5 and fbnic).
 
  - Add configuration API for completion writeback buffering
    (implement in mana).
 
  - Support driver-initiated change of RSS context sizes.
 
  - Support DPLL monitoring input frequency (implement in zl3073x).
 
  - Support per-port resources in devlink (implement in mlx5).
 
 Misc
 ----
 
  - Expand the YAML spec for Netfilter.
 
 Drivers
 -------
 
  - Software:
    - macvlan: support multicast rx for bridge ports with shared source
      MAC address
    - team: decouple receive and transmit enablement for IEEE 802.3ad
      LACP "independent control"
 
  - Ethernet high-speed NICs:
    - nVidia/Mellanox:
      - support high order pages in zero-copy mode (for payload
        coalescing)
      - support multiple packets in a page (for systems with 64kB pages)
    - Broadcom 25-400GE (bnxt):
      - implement XDP RSS hash metadata extraction
      - add software fallback for UDP GSO, lowering the IOMMU cost
    - Broadcom 800GE (bnge):
      - add link status and configuration handling
      - add various HW and SW statistics
    - Marvell/Cavium:
      - NPC HW block support for cn20k
    - Huawei (hinic3):
      - add mailbox / control queue
      - add rx VLAN offload
      - add driver info and link management
 
  - Ethernet NICs:
    - Marvell/Aquantia:
      - support reading SFP module info on some AQC100 cards
    - Realtek PCI (r8169):
      - add support for RTL8125cp
    - Realtek USB (r8152):
      - support for the RTL8157 5Gbit chip
      - add 2500baseT EEE status/configuration support
 
  - Ethernet NICs embedded and off-the-shelf IP:
    - Synopsys (stmmac):
      - cleanup and reorganize SerDes handling and PCS support
      - cleanup descriptor handling and per-platform data
      - cleanup and consolidate MDIO defines and handling
      - shrink driver memory use for internal structures
      - improve Tx IRQ coalescing
      - improve TCP segmentation handling
      - add support for Spacemit K3
    - Cadence (macb):
      - support PHYs that have inband autoneg disabled with GEM
      - support IEEE 802.3az EEE
      - rework usrio capabilities and handling
    - AMD (xgbe):
      - improve power management for S0i3
      - improve TX resilience for link-down handling
 
  - Virtual:
    - Google cloud vNIC:
      - support larger ring sizes in DQO-QPL mode
      - improve HW-GRO handling
      - support UDP GSO for DQO format
    - PCIe NTB:
      - support queue count configuration
 
  - Ethernet PHYs:
    - automatically disable PHY autonomous EEE if MAC is in charge
    - Broadcom:
      - add BCM84891/BCM84892 support
    - Micrel:
      - support for LAN9645X internal PHY
    - Realtek:
      - add RTL8224 pair order support
      - support PHY LEDs on RTL8211F-VD
      - support spread spectrum clocking (SSC)
    - Maxlinear:
      - add PHY-level statistics via ethtool
 
  - Ethernet switches:
    - Maxlinear (mxl862xx):
      - support for bridge offloading
      - support for VLANs
      - support driver statistics
 
  - Bluetooth:
    - large number of fixes and new device IDs
    - Mediatek:
      - support MT6639 (MT7927)
      - support MT7902 SDIO
 
  - WiFi:
    - Intel (iwlwifi):
      - UNII-9 and continuing UHR work
    - MediaTek (mt76):
      - mt7996/mt7925 MLO fixes/improvements
      - mt7996 NPU support (HW eth/wifi traffic offload)
    - Qualcomm (ath12k):
      - monitor mode support on IPQ5332
      - basic hwmon temperature reporting
      - support IPQ5424
    - Realtek:
      - add USB RX aggregation to improve performance
      - add USB TX flow control by tracking in-flight URBs
 
  - Cellular:
    - IPA v5.2 support
 
 Signed-off-by: Jakub Kicinski <kuba@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEE6jPA+I1ugmIBA4hXMUZtbf5SIrsFAmnelNoACgkQMUZtbf5S
 IrtWFw//WyiXuEiGawVQONnbu1dtR+3nw/cvNpSYi0IM66vbRUB9n+9fxm2MIyG4
 4jI/c/X/fxIvUxEqGez3yPn5P7KqkQR8WRYwkxrMYKRpXeukN0IDk5Euew5DskCe
 wtBKNJOQWKdKXff0bLQoJ9dHWYuJ2IMRVil5M3fhUbeUOXeyJD7Yn1w2ICvJAbj+
 T/Hw7sEtchNaHp6h6SbaQfahkUFHQG5peNoETkZF4UDF6ALGY29WH91GXeO2lrgN
 IxX203KtaavV0oU8T0oixZgOc57Ns081YfFL/F1JP2HV6lgkwhuq+zxCrRTi1c9M
 HPTXgwD7Z80Y74nM3YTLrPfoMOP8GLBZgdV3rUpwmteM26+gMTm+O1zHUur5ZoGy
 D6TaMFguPTIqiRyrARa9xY/J6r9TQkc2Wfu4bIuPndKFg8xPoepuEObODnh0+5Hg
 4j4pdFhIo2huENhSg7kVb/yl+1q68SFwM3RqTmx+OhCa0AyjcKIKgt/UBhismdnG
 r8obxzb+nXeJc2rRDuwNMwlBlcMSbep27uGt64zeHMMXVhTVqOoytNaL/X/ZpH2m
 A0DscUrpHvb36IoDPtanc6irP+JOh5Xe7Nw5qhkgwsMc7hlf8SyyHB4OUBBaz1qA
 ETSnHlfwklRmXSpWqH2LyGXjdOQpDKP46+h0W3dttMD2/cRBqYo=
 =EhQZ
 -----END PGP SIGNATURE-----

Merge tag 'net-next-7.1' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next

Pull networking updates from Jakub Kicinski:
 "Core & protocols:

   - Support HW queue leasing, allowing containers to be granted access
     to HW queues for zero-copy operations and AF_XDP

   - Number of code moves to help the compiler with inlining. Avoid
     output arguments for returning drop reason where possible

   - Rework drop handling within qdiscs to include more metadata about
     the reason and dropping qdisc in the tracepoints

   - Remove the rtnl_lock use from IP Multicast Routing

   - Pack size information into the Rx Flow Steering table pointer
     itself. This allows making the table itself a flat array of u32s,
     thus making the table allocation size a power of two

   - Report TCP delayed ack timer information via socket diag

   - Add ip_local_port_step_width sysctl to allow distributing the
     randomly selected ports more evenly throughout the allowed space

   - Add support for per-route tunsrc in IPv6 segment routing

   - Start work of switching sockopt handling to iov_iter

   - Improve dynamic recvbuf sizing in MPTCP, limit burstiness and avoid
     buffer size drifting up

   - Support MSG_EOR in MPTCP

   - Add stp_mode attribute to the bridge driver for STP mode selection.
     This addresses concerns about call_usermodehelper() usage

   - Remove UDP-Lite support (as announced in 2023)

   - Remove support for building IPv6 as a module. Remove the now
     unnecessary function calling indirection

  Cross-tree stuff:

   - Move Michael MIC code from generic crypto into wireless, it's
     considered insecure but some WiFi networks still need it

  Netfilter:

   - Switch nft_fib_ipv6 module to no longer need temporary dst_entry
     object allocations by using fib6_lookup() + RCU.

     Florian W reports this gets us ~13% higher packet rate

   - Convert IPVS's global __ip_vs_mutex to per-net service_mutex and
     switch the service tables to be per-net. Convert some code that
     walks the service lists to use RCU instead of the service_mutex

   - Add more opinionated input validation to lower security exposure

   - Make IPVS hash tables to be per-netns and resizable

  Wireless:

   - Finished assoc frame encryption/EPPKE/802.1X-over-auth

   - Radar detection improvements

   - Add 6 GHz incumbent signal detection APIs

   - Multi-link support for FILS, probe response templates and client
     probing

   - New APIs and mac80211 support for NAN (Neighbor Aware Networking,
     aka Wi-Fi Aware) so less work must be in firmware

  Driver API:

   - Add numerical ID for devlink instances (to avoid having to create
     fake bus/device pairs just to have an ID). Support shared devlink
     instances which span multiple PFs

   - Add standard counters for reporting pause storm events (implement
     in mlx5 and fbnic)

   - Add configuration API for completion writeback buffering (implement
     in mana)

   - Support driver-initiated change of RSS context sizes

   - Support DPLL monitoring input frequency (implement in zl3073x)

   - Support per-port resources in devlink (implement in mlx5)

  Misc:

   - Expand the YAML spec for Netfilter

  Drivers

   - Software:
      - macvlan: support multicast rx for bridge ports with shared
        source MAC address
      - team: decouple receive and transmit enablement for IEEE 802.3ad
        LACP "independent control"

   - Ethernet high-speed NICs:
      - nVidia/Mellanox:
         - support high order pages in zero-copy mode (for payload
           coalescing)
         - support multiple packets in a page (for systems with 64kB
           pages)
      - Broadcom 25-400GE (bnxt):
         - implement XDP RSS hash metadata extraction
         - add software fallback for UDP GSO, lowering the IOMMU cost
      - Broadcom 800GE (bnge):
         - add link status and configuration handling
         - add various HW and SW statistics
      - Marvell/Cavium:
         - NPC HW block support for cn20k
      - Huawei (hinic3):
         - add mailbox / control queue
         - add rx VLAN offload
         - add driver info and link management

   - Ethernet NICs:
      - Marvell/Aquantia:
         - support reading SFP module info on some AQC100 cards
      - Realtek PCI (r8169):
         - add support for RTL8125cp
      - Realtek USB (r8152):
         - support for the RTL8157 5Gbit chip
         - add 2500baseT EEE status/configuration support

   - Ethernet NICs embedded and off-the-shelf IP:
      - Synopsys (stmmac):
         - cleanup and reorganize SerDes handling and PCS support
         - cleanup descriptor handling and per-platform data
         - cleanup and consolidate MDIO defines and handling
         - shrink driver memory use for internal structures
         - improve Tx IRQ coalescing
         - improve TCP segmentation handling
         - add support for Spacemit K3
      - Cadence (macb):
         - support PHYs that have inband autoneg disabled with GEM
         - support IEEE 802.3az EEE
         - rework usrio capabilities and handling
      - AMD (xgbe):
         - improve power management for S0i3
         - improve TX resilience for link-down handling

   - Virtual:
      - Google cloud vNIC:
         - support larger ring sizes in DQO-QPL mode
         - improve HW-GRO handling
         - support UDP GSO for DQO format
      - PCIe NTB:
         - support queue count configuration

   - Ethernet PHYs:
      - automatically disable PHY autonomous EEE if MAC is in charge
      - Broadcom:
         - add BCM84891/BCM84892 support
      - Micrel:
         - support for LAN9645X internal PHY
      - Realtek:
         - add RTL8224 pair order support
         - support PHY LEDs on RTL8211F-VD
         - support spread spectrum clocking (SSC)
      - Maxlinear:
         - add PHY-level statistics via ethtool

   - Ethernet switches:
      - Maxlinear (mxl862xx):
         - support for bridge offloading
         - support for VLANs
         - support driver statistics

   - Bluetooth:
      - large number of fixes and new device IDs
      - Mediatek:
         - support MT6639 (MT7927)
         - support MT7902 SDIO

   - WiFi:
      - Intel (iwlwifi):
         - UNII-9 and continuing UHR work
      - MediaTek (mt76):
         - mt7996/mt7925 MLO fixes/improvements
         - mt7996 NPU support (HW eth/wifi traffic offload)
      - Qualcomm (ath12k):
         - monitor mode support on IPQ5332
         - basic hwmon temperature reporting
         - support IPQ5424
      - Realtek:
         - add USB RX aggregation to improve performance
         - add USB TX flow control by tracking in-flight URBs

   - Cellular:
      - IPA v5.2 support"

* tag 'net-next-7.1' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1561 commits)
  net: pse-pd: fix kernel-doc function name for pse_control_find_by_id()
  wireguard: device: use exit_rtnl callback instead of manual rtnl_lock in pre_exit
  wireguard: allowedips: remove redundant space
  tools: ynl: add sample for wireguard
  wireguard: allowedips: Use kfree_rcu() instead of call_rcu()
  MAINTAINERS: Add netkit selftest files
  selftests/net: Add additional test coverage in nk_qlease
  selftests/net: Split netdevsim tests from HW tests in nk_qlease
  tools/ynl: Make YnlFamily closeable as a context manager
  net: airoha: Add missing PPE configurations in airoha_ppe_hw_init()
  net: airoha: Fix VIP configuration for AN7583 SoC
  net: caif: clear client service pointer on teardown
  net: strparser: fix skb_head leak in strp_abort_strp()
  net: usb: cdc-phonet: fix skb frags[] overflow in rx_complete()
  selftests/bpf: add test for xdp_master_redirect with bond not up
  net, bpf: fix null-ptr-deref in xdp_master_redirect() for down master
  net: airoha: Remove PCE_MC_EN_MASK bit in REG_FE_PCE_CFG configuration
  sctp: disable BH before calling udp_tunnel_xmit_skb()
  sctp: fix missing encap_port propagation for GSO fragments
  net: airoha: Rely on net_device pointer in ETS callbacks
  ...
2026-04-14 18:36:10 -07:00
Linus Torvalds
c1fe867b5b Updates for the timer/timekeeping core:
- A rework of the hrtimer subsystem to reduce the overhead for frequently
     armed timers, especially the hrtick scheduler timer.
 
       - Better timer locality decision
 
       - Simplification of the evaluation of the first expiry time by
         keeping track of the neighbor timers in the RB-tree by providing a
         RB-tree variant with neighbor links. That avoids walking the
         RB-tree on removal to find the next expiry time, but even more
         important allows to quickly evaluate whether a timer which is
         rearmed changes the position in the RB-tree with the modified
         expiry time or not. If not, the dequeue/enqueue sequence which both
         can end up in rebalancing can be completely avoided.
 
       - Deferred reprogramming of the underlying clock event device. This
         optimizes for the situation where a hrtimer callback sets the need
         resched bit. In that case the code attempts to defer the
         re-programming of the clock event device up to the point where the
         scheduler has picked the next task and has the next hrtick timer
         armed. In case that there is no immediate reschedule or soft
         interrupts have to be handled before reaching the reschedule point
         in the interrupt entry code the clock event is reprogrammed in one
         of those code paths to prevent that the timer becomes stale.
 
       - Support for clocksource coupled clockevents
 
       	The TSC deadline timer is coupled to the TSC. The next event is
       	programmed in TSC time. Currently this is done by converting the
       	CLOCK_MONOTONIC based expiry value into a relative timeout,
       	converting it into TSC ticks, reading the TSC adding the delta
       	ticks and writing the deadline MSR.
 
 	As the timekeeping core has the conversion factors for the TSC
 	already, the whole back and forth conversion can be completely
 	avoided. The timekeeping core calculates the reverse conversion
 	factors from nanoseconds to TSC ticks and utilizes the base
 	timestamps of TSC and CLOCK_MONOTONIC which are updated once per
 	tick. This allows a direct conversion into the TSC deadline value
 	without reading the time and as a bonus keeps the deadline
 	conversion in sync with the TSC conversion factors, which are
 	updated by adjtimex() on systems with NTP/PTP enabled.
 
      - Allow inlining of the clocksource read and clockevent write
        functions when they are tiny enough, e.g. on x86 RDTSC and WRMSR.
 
     With all those enhancements in place a hrtick enabled scheduler
     provides the same performance as without hrtick. But also other hrtimer
     users obviously benefit from these optimizations.
 
   - Robustness improvements and cleanups of historical sins in the hrtimer
     and timekeeping code.
 
   - Rewrite of the clocksource watchdog.
 
     The clocksource watchdog code has over time reached the state of an
     impenetrable maze of duct tape and staples. The original design, which was
     made in the context of systems far smaller than today, is based on the
     assumption that the to be monitored clocksource (TSC) can be trivially
     compared against a known to be stable clocksource (HPET/ACPI-PM timer).
 
     Over the years this rather naive approach turned out to have major
     flaws. Long delays between the watchdog invocations can cause wrap
     arounds of the reference clocksource. The access to the reference
     clocksource degrades on large multi-sockets systems dure to
     interconnect congestion. This has been addressed with various
     heuristics which degraded the accuracy of the watchdog to the point
     that it fails to detect actual TSC problems on older hardware which
     exposes slow inter CPU drifts due to firmware manipulating the TSC to
     hide SMI time.
 
     The rewrite addresses this by:
 
       - Restricting the validation against the reference clocksource to the
         boot CPU which is usually closest to the legacy block which
         contains the reference clocksource (HPET/ACPI-PM).
 
       - Do a round robin validation betwen the boot CPU and the other CPUs
         based only on the TSC with an algorithm similar to the TSC
         synchronization code during CPU hotplug.
 
       - Being more leniant versus remote timeouts
 
   - The usual tiny fixes, cleanups and enhancements all over the place
 -----BEGIN PGP SIGNATURE-----
 
 iQJEBAABCgAuFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAmnbxdMQHHRnbHhAa2Vy
 bmVsLm9yZwAKCRCmGPVMDXSYoeq6EAC4h9wuBr5yCkxmog1Bhlk9cnK0oX1THb7V
 Q4z6DrYAiDXP6z4IDwqSW+3vvmNw1QXOeqpyMTiiIcQ5mNSs1IDnCt5HOEwY+ICm
 fiSUMYkXkH6xdFWspYWFkD7aExHJRT3hd/bo+WnXGHhHclPj5NHZssLMIDboHrzX
 jLV1hljmthfwg/uOXDGmQUPRFjqr2ZQjo7zGA5SwfVg8Krz7g/qRVy2wUns9TdW/
 NYwihDm1YV7qkK/+f1GnMdd70toqb1OZo/fS9FPbBrPLdyi8V+UbnFSUeZu8kCwV
 KubAzjLZR4xYCnrlaHhoi208GMd0TOvHMOrdAA0zkQHfhmszGl4N0pbF/EI29Ft2
 tQG/FUTG+nzgNOrMCPN2nr3u/UOXLP+gO+2hkyyQjqUP35IyaTYQn10JgBmPTdJL
 Ab6E8WL9gTMCd+t/bVjdU/B8W9ruMihKBtWkTfMBCcQ9uNJFCEGzrcMF8hzFYRTs
 /4rMDr3NlGYydAnbKPj6bkC5gtjBvh/L08kOdUFyXCMSqIzvJkZJ4241ogl1Awi6
 VfdwjF5KZCQo3M1ujpep+1L010wC/yulqLt2brQMO9Nt05dRhgwM3lxy7cnlMNm3
 NdfMgi+OG0CzQ+ZUpvo20hCgTDUVgWN9g5R3rar8FJX+Ym3T+ZoEKlShZF+fSRjf
 YAUIbUyi7A==
 =2qc8
 -----END PGP SIGNATURE-----

Merge tag 'timers-core-2026-04-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull timer core updates from Thomas Gleixner:

 - A rework of the hrtimer subsystem to reduce the overhead for
   frequently armed timers, especially the hrtick scheduler timer:

     - Better timer locality decision

     - Simplification of the evaluation of the first expiry time by
       keeping track of the neighbor timers in the RB-tree by providing
       a RB-tree variant with neighbor links. That avoids walking the
       RB-tree on removal to find the next expiry time, but even more
       important allows to quickly evaluate whether a timer which is
       rearmed changes the position in the RB-tree with the modified
       expiry time or not. If not, the dequeue/enqueue sequence which
       both can end up in rebalancing can be completely avoided.

     - Deferred reprogramming of the underlying clock event device. This
       optimizes for the situation where a hrtimer callback sets the
       need resched bit. In that case the code attempts to defer the
       re-programming of the clock event device up to the point where
       the scheduler has picked the next task and has the next hrtick
       timer armed. In case that there is no immediate reschedule or
       soft interrupts have to be handled before reaching the reschedule
       point in the interrupt entry code the clock event is reprogrammed
       in one of those code paths to prevent that the timer becomes
       stale.

     - Support for clocksource coupled clockevents

       The TSC deadline timer is coupled to the TSC. The next event is
       programmed in TSC time. Currently this is done by converting the
       CLOCK_MONOTONIC based expiry value into a relative timeout,
       converting it into TSC ticks, reading the TSC adding the delta
       ticks and writing the deadline MSR.

       As the timekeeping core has the conversion factors for the TSC
       already, the whole back and forth conversion can be completely
       avoided. The timekeeping core calculates the reverse conversion
       factors from nanoseconds to TSC ticks and utilizes the base
       timestamps of TSC and CLOCK_MONOTONIC which are updated once per
       tick. This allows a direct conversion into the TSC deadline value
       without reading the time and as a bonus keeps the deadline
       conversion in sync with the TSC conversion factors, which are
       updated by adjtimex() on systems with NTP/PTP enabled.

     - Allow inlining of the clocksource read and clockevent write
       functions when they are tiny enough, e.g. on x86 RDTSC and WRMSR.

   With all those enhancements in place a hrtick enabled scheduler
   provides the same performance as without hrtick. But also other
   hrtimer users obviously benefit from these optimizations.

 - Robustness improvements and cleanups of historical sins in the
   hrtimer and timekeeping code.

 - Rewrite of the clocksource watchdog.

   The clocksource watchdog code has over time reached the state of an
   impenetrable maze of duct tape and staples. The original design,
   which was made in the context of systems far smaller than today, is
   based on the assumption that the to be monitored clocksource (TSC)
   can be trivially compared against a known to be stable clocksource
   (HPET/ACPI-PM timer).

   Over the years this rather naive approach turned out to have major
   flaws. Long delays between the watchdog invocations can cause wrap
   arounds of the reference clocksource. The access to the reference
   clocksource degrades on large multi-sockets systems dure to
   interconnect congestion. This has been addressed with various
   heuristics which degraded the accuracy of the watchdog to the point
   that it fails to detect actual TSC problems on older hardware which
   exposes slow inter CPU drifts due to firmware manipulating the TSC to
   hide SMI time.

   The rewrite addresses this by:

     - Restricting the validation against the reference clocksource to
       the boot CPU which is usually closest to the legacy block which
       contains the reference clocksource (HPET/ACPI-PM).

     - Do a round robin validation betwen the boot CPU and the other
       CPUs based only on the TSC with an algorithm similar to the TSC
       synchronization code during CPU hotplug.

     - Being more leniant versus remote timeouts

 - The usual tiny fixes, cleanups and enhancements all over the place

* tag 'timers-core-2026-04-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (75 commits)
  alarmtimer: Access timerqueue node under lock in suspend
  hrtimer: Fix incorrect #endif comment for BITS_PER_LONG check
  posix-timers: Fix stale function name in comment
  timers: Get this_cpu once while clearing the idle state
  clocksource: Rewrite watchdog code completely
  clocksource: Don't use non-continuous clocksources as watchdog
  x86/tsc: Handle CLOCK_SOURCE_VALID_FOR_HRES correctly
  MIPS: Don't select CLOCKSOURCE_WATCHDOG
  parisc: Remove unused clocksource flags
  hrtimer: Add a helper to retrieve a hrtimer from its timerqueue node
  hrtimer: Remove trailing comma after HRTIMER_MAX_CLOCK_BASES
  hrtimer: Mark index and clockid of clock base as const
  hrtimer: Drop unnecessary pointer indirection in hrtimer_expire_entry event
  hrtimer: Drop spurious space in 'enum hrtimer_base_type'
  hrtimer: Don't zero-initialize ret in hrtimer_nanosleep()
  hrtimer: Remove hrtimer_get_expires_ns()
  timekeeping: Mark offsets array as const
  timekeeping/auxclock: Consistently use raw timekeeper for tk_setup_internals()
  timer_list: Print offset as signed integer
  tracing: Use explicit array size instead of sentinel elements in symbol printing
  ...
2026-04-14 10:27:07 -07:00
Linus Torvalds
5d0d362330 Kbuild/Kconfig updates for 7.1
Kbuild changes
 ==============
 
   * tools/build: Reject unexpected values for LLVM=
 
   * kbuild: uapi: remove usage of toolchain headers
 
   * kbuild: Switch from '-fms-extensions' to '-fms-anonymous-structs'
     when available (currently: clang >= 23.0.0)
 
   * kbuild: Reduce the number of compiler-generated suffixes for clang
     thin-lto build
 
   * kbuild: reduce output spam ("GEN Makefile") when building out of tree
 
   * check-uapi: improve portability for testing headers
 
   * uapi: also test UAPI headers against C++ compilers
 
   * kbuild: vdso_install: drop build ID architecture allow-list
 
   * checksyscalls: only run when necessary
 
   * Documentation: kbuild: Update the debug information notes in
     reproducible-builds.rst
 
   * kconfig: forbid multiple entries with the same symbol in a choice
 
   * kbuild: expand inlining hints with -fdiagnostics-show-inlining-chain
 
 Kconfig changes
 ===============
 
   * kconfig: Error out on duplicated kconfig inclusion
 
 Cc: Alexander Coffin <alex@cyberialabs.net>
 Cc: Ard Biesheuvel <ardb@kernel.org>
 Cc: Arnd Bergmann <arnd@arndb.de>
 Cc: Bill Wendling <morbo@google.com>
 Cc: David Howells <dhowells@redhat.com>
 Cc: Dodji Seketeli <dodji@seketeli.org>
 Cc: H. Peter Anvin <hpa@zytor.com>
 Cc: Helge Deller <deller@gmx.de>
 Cc: John Moon <john@jmoon.dev>
 Cc: Jonathan Corbet <corbet@lwn.net>
 Cc: Josh Poimboeuf <jpoimboe@kernel.org>
 Cc: Justin Stitt <justinstitt@google.com>
 Cc: Kees Cook <kees@kernel.org>
 Cc: Masahiro Yamada <masahiroy@kernel.org>
 Cc: Nathan Chancellor <nathan@kernel.org>
 Cc: Nick Desaulniers <nick.desaulniers+lkml@gmail.com>
 Cc: Shuah Khan <skhan@linuxfoundation.org>
 Cc: Song Liu <song@kernel.org>
 Cc: Thomas Weißschuh <linux@weissschuh.net>
 Cc: Yonghong Song <yonghong.song@linux.dev>
 Cc: kernel-team@fb.com
 Cc: linux-arm-kernel@lists.infradead.org
 Cc: linux-efi@vger.kernel.org
 Cc: linux-hexagon@vger.kernel.org
 Cc: linux-kbuild@vger.kernel.org
 Cc: linux-kernel@vger.kernel.org
 Cc: linux-parisc@vger.kernel.org
 Cc: linux-s390@vger.kernel.org
 Cc: linuxppc-dev@lists.ozlabs.org
 Cc: llvm@lists.linux.dev
 Cc: loongarch@lists.linux.dev
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEh0E3p4c3JKeBvsLGB1IKcBYmEmkFAmnatXEACgkQB1IKcBYm
 Eml6ww/9Hja/CTBoF+ZgMXN/9VcQhzNonPXIp8IGarX3+LCPh8RfUEywaOLnvR/U
 fE6FEIcwDw0M5drS0hEH7t1Xowc6AhDX05lKBj3aGBgn6JqGGQFAfnysQd5z0cwW
 Y/8+bMm+Y2XQ/xZNa0J92+3evPO04U7+2kCSVD051ZhRdmK4n290u4YsTgoKs7Fm
 1SBIr+tsFa1zMOG6r+J4uCLxXNnujQ5XcejnlmdBM0o19f9kttvVkYKuBVdXPHf4
 JaTLti22Td8SklDKMmkSRg+Ul/Wh2x8D8tP98VQAJe5B3f4Uk6YAu1BMrbQaX5Rk
 5SsGbhBEeOTDc4qCaS8DS+FJQU6T9W9cf/9+tBY510fXxAIonz5cPB06q5xeJWCd
 IkVB3KpmaVxo2B54Cy4b/fvd1J3VMkmFjBQWMNwkq6cnCG1ZK/b6Jmvh9BQSNctl
 IYJxWKBjlddrMuvZEMI0CewVq4GmarTLiOpweghDg8OYqya4E6PfOUGnaWMrWT5c
 2E8ZMnQSb68yFUaXK+Sy+Pw2Nig/VvxCUxHdaarHi/RmGeoN5dMGfjj/gGZvZrHt
 NUGt6qe+X62P0ZAUR8p+GpRcU3+p3uLhCyO7dkwqgLVZTnaXy5XtUQ/uyh2G60hv
 eJlFfrn8QXplvzrxcSTJya6PunoIhuWh2BfKhf0RDymJTPyMbBc=
 =+wTC
 -----END PGP SIGNATURE-----

Merge tag 'kbuild-7.1-1' of git://git.kernel.org/pub/scm/linux/kernel/git/kbuild/linux

Pull Kbuild/Kconfig updates from Nicolas Schier:
 "Kbuild:
   - reject unexpected values for LLVM=
   - uapi: remove usage of toolchain headers
   - switch from '-fms-extensions' to '-fms-anonymous-structs' when
     available (currently: clang >= 23.0.0)
   - reduce the number of compiler-generated suffixes for clang thin-lto
     build
   - reduce output spam ("GEN Makefile") when building out of tree
   - improve portability for testing headers
   - also test UAPI headers against C++ compilers
   - drop build ID architecture allow-list in vdso_install
   - only run checksyscalls when necessary
   - update the debug information notes in reproducible-builds.rst
   - expand inlining hints with -fdiagnostics-show-inlining-chain

  Kconfig:
   - forbid multiple entries with the same symbol in a choice
   - error out on duplicated kconfig inclusion"

* tag 'kbuild-7.1-1' of git://git.kernel.org/pub/scm/linux/kernel/git/kbuild/linux: (35 commits)
  kbuild: expand inlining hints with -fdiagnostics-show-inlining-chain
  kconfig: forbid multiple entries with the same symbol in a choice
  Documentation: kbuild: Update the debug information notes in reproducible-builds.rst
  checksyscalls: move instance functionality into generic code
  checksyscalls: only run when necessary
  checksyscalls: fail on all intermediate errors
  checksyscalls: move path to reference table to a variable
  kbuild: vdso_install: drop build ID architecture allow-list
  kbuild: vdso_install: gracefully handle images without build ID
  kbuild: vdso_install: hide readelf warnings
  kbuild: vdso_install: split out the readelf invocation
  kbuild: uapi: also test UAPI headers against C++ compilers
  kbuild: uapi: provide a C++ compatible dummy definition of NULL
  kbuild: uapi: handle UML in architecture-specific exclusion lists
  kbuild: uapi: move all include path flags together
  kbuild: uapi: move some compiler arguments out of the command definition
  check-uapi: use dummy libc includes
  check-uapi: honor ${CROSS_COMPILE} setting
  check-uapi: link into shared objects
  kbuild: reduce output spam when building out of tree
  ...
2026-04-14 09:18:40 -07:00
Linus Torvalds
ef3da345cc vfs-7.1-rc1.misc
Please consider pulling these changes from the signed vfs-7.1-rc1.misc tag.
 
 Thanks!
 Christian
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCadjZCwAKCRCRxhvAZXjc
 ohhBAQCAmQMlMRAXAgUZFYMTZpeQlcujP5rv+/vT2Tf/xS76YwD/dRDaw1FH294+
 qtk/Z1NjleNixzE2sld1K9J32NxeyAc=
 =+g9q
 -----END PGP SIGNATURE-----

Merge tag 'vfs-7.1-rc1.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs

Pull misc vfs updates from Christian Brauner:
 "Features:
   - coredump: add tracepoint for coredump events
   - fs: hide file and bfile caches behind runtime const machinery

  Fixes:
   - fix architecture-specific compat_ftruncate64 implementations
   - dcache: Limit the minimal number of bucket to two
   - fs/omfs: reject s_sys_blocksize smaller than OMFS_DIR_START
   - fs/mbcache: cancel shrink work before destroying the cache
   - dcache: permit dynamic_dname()s up to NAME_MAX

  Cleanups:
   - remove or unexport unused fs_context infrastructure
   - trivial ->setattr cleanups
   - selftests/filesystems: Assume that TIOCGPTPEER is defined
   - writeback: fix kernel-doc function name mismatch for wb_put_many()
   - autofs: replace manual symlink buffer allocation in autofs_dir_symlink
   - init/initramfs.c: trivial fix: FSM -> Finite-state machine
   - fs: remove stale and duplicate forward declarations
   - readdir: Introduce dirent_size()
   - fs: Replace user_access_{begin/end} by scoped user access
   - kernel: acct: fix duplicate word in comment
   - fs: write a better comment in step_into() concerning .mnt assignment
   - fs: attr: fix comment formatting and spelling issues"

* tag 'vfs-7.1-rc1.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: (28 commits)
  dcache: permit dynamic_dname()s up to NAME_MAX
  fs: attr: fix comment formatting and spelling issues
  fs: hide file and bfile caches behind runtime const machinery
  fs: write a better comment in step_into() concerning .mnt assignment
  proc: rename proc_notify_change to proc_setattr
  proc: rename proc_setattr to proc_nochmod_setattr
  affs: rename affs_notify_change to affs_setattr
  adfs: rename adfs_notify_change to adfs_setattr
  hfs: update comments on hfs_inode_setattr
  kernel: acct: fix duplicate word in comment
  fs: Replace user_access_{begin/end} by scoped user access
  readdir: Introduce dirent_size()
  coredump: add tracepoint for coredump events
  fs: remove do_sys_truncate
  fs: pass on FTRUNCATE_* flags to do_truncate
  fs: fix archiecture-specific compat_ftruncate64
  fs: remove stale and duplicate forward declarations
  init/initramfs.c: trivial fix: FSM -> Finite-state machine
  autofs: replace manual symlink buffer allocation in autofs_dir_symlink
  fs/mbcache: cancel shrink work before destroying the cache
  ...
2026-04-13 14:20:11 -07:00
Jakub Kicinski
118cbd428e Final updates, notably:
- crypto: move Michael MIC code into wireless (only)
  - mac80211:
    - multi-link 4-addr support
    - NAN data support (but no drivers yet)
  - ath10k: DT quirk to make it work on some devices
  - ath12k: IPQ5424 support
  - rtw89: USB improvements for performance
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEpeA8sTs3M8SN2hR410qiO8sPaAAFAmnYm4gACgkQ10qiO8sP
 aABbhw/+Jb3JGnUT3PGqE+uyDsP5+uj6/thHVjugbtgFZeeNE6rAdprma1OzBbIF
 BZCgKivNRA4UHehYlZtgnp5Hm5bYCEF6ntGk5a7SLx6HuVn1JWR4cqc1zfCkmRmH
 7PtvShCy+rx2t6q/O5mZ3s2z4cSbauzvJ3s5j/qme/lZV5K6Hrx6HnJ6fudyBHOT
 vS6Ahl2tfFIyek6Qfs5xfUFzcNY4kEw6O8yMJ8F+ZgV4fXWZybmo4Ld6j/w0veiF
 0Wo9XYNQVu0EBqTkprsKwnZ0bmfQq03hivmupTm9b4gtXQjakb9o/wG2gQilKYkF
 ycS39cW3TxAGQDs/2U4l1CreumeRXtlq9OUywDu9arEICkJ0C7ierR3azo3fNeDB
 WomDYREtV7g3KOzoC6T0Zxivgdkg66W2ZVWvvWIHdSWCY+hYK+JsGXK6fZ5uAyo2
 5BOq/PQtK/m/3DTwR+lF3KYeUB8r+Q/FiJ1kMih6lAK+dZGDGhQPjN+seaVsIpfw
 cZVHu6hAmt7ks2zxbKpBmdFbyKjQu3Yk1UZNXtOKPuLzwKGdTECiRlJimJsOffcn
 lcveT93oz5zGSCwH1pEw0nIlsnEVGFQjxCOVln7eOs66i8vnJKMb/8OpvcxFqjsL
 rvItAvw+cPWrLXLB9VkxqZg2ndrWe9gGh+JJtRkGXhEjC4U9Ll0=
 =lf+O
 -----END PGP SIGNATURE-----

Merge tag 'wireless-next-2026-04-10' of https://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next

Johannes Berg says:

====================
Final updates, notably:
 - crypto: move Michael MIC code into wireless (only)
 - mac80211:
   - multi-link 4-addr support
   - NAN data support (but no drivers yet)
 - ath10k: DT quirk to make it work on some devices
 - ath12k: IPQ5424 support
 - rtw89: USB improvements for performance

* tag 'wireless-next-2026-04-10' of https://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next: (124 commits)
  wifi: cfg80211: Explicitly include <linux/export.h> in michael-mic.c
  wifi: ath10k: Add device-tree quirk to skip host cap QMI requests
  dt-bindings: wireless: ath10k: Add quirk to skip host cap QMI requests
  crypto: Remove michael_mic from crypto_shash API
  wifi: ipw2x00: Use michael_mic() from cfg80211
  wifi: ath12k: Use michael_mic() from cfg80211
  wifi: ath11k: Use michael_mic() from cfg80211
  wifi: mac80211, cfg80211: Export michael_mic() and move it to cfg80211
  wifi: ipw2x00: Rename michael_mic() to libipw_michael_mic()
  wifi: libertas_tf: refactor endpoint lookup
  wifi: libertas: refactor endpoint lookup
  wifi: at76c50x: refactor endpoint lookup
  wifi: ath12k: Enable IPQ5424 WiFi device support
  wifi: ath12k: Add CE remap hardware parameters for IPQ5424
  wifi: ath12k: add ath12k_hw_regs for IPQ5424
  wifi: ath12k: add ath12k_hw_version_map entry for IPQ5424
  wifi: ath12k: Add ath12k_hw_params for IPQ5424
  dt-bindings: net: wireless: add ath12k wifi device IPQ5424
  wifi: ath10k: fix station lookup failure during disconnect
  wifi: ath12k: Create symlink for each radio in a wiphy
  ...
====================

Link: https://patch.msgid.link/20260410064703.735099-3-johannes@sipsolutions.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-12 09:17:42 -07:00
Thomas Gleixner
ff1c0c5d07 Merge branch 'timers/urgent' into timers/core
to resolve the conflict with urgent fixes.
2026-04-11 07:58:33 +02:00
Eric Biggers
8c6d03b7a2 crypto: Remove michael_mic from crypto_shash API
Remove the "michael_mic" crypto_shash algorithm, since it's no longer
used.  Its only users were wireless drivers, which have now been
converted to use the michael_mic() function instead.

It makes sense that no other users ever appeared: Michael MIC is an
insecure algorithm that is specific to WPA TKIP, which itself was an
interim security solution to replace the broken WEP standard.

Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Link: https://patch.msgid.link/20260408030651.80336-7-ebiggers@kernel.org
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2026-04-08 10:11:37 +02:00
Jakub Kicinski
e6b7e1a10c eth: remove the driver for acenic / tigon1&2
The entire git history for this driver looks like tree-wide
and automated cleanups. There's even more coming now with
AI, so let's try to delete it instead.

Acked-by: Jes Sorensen <jes@trained-monkey.org>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Link: https://patch.msgid.link/20260403220501.2263835-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-06 18:52:27 -07:00
Baolin Wang
06c4dfa3ce mm: change to return bool for ptep_clear_flush_young()/clear_flush_young_ptes()
The ptep_clear_flush_young() and clear_flush_young_ptes() are used to
clear the young flag and flush the TLB, returning whether the young flag
was set.  Change the return type to bool to make the intention clearer.

Link: https://lkml.kernel.org/r/24af5144b96103631594501f77d4525f2475c1be.1774075004.git.baolin.wang@linux.alibaba.com
Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com>
Reviewed-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
Reviewed-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org>
Acked-by: David Hildenbrand (Arm) <david@kernel.org>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-04-05 13:53:35 -07:00
Baolin Wang
a62ca3f40f mm: change to return bool for ptep_test_and_clear_young()
Patch series "change young flag check functions to return bool", v2.

This is a cleanup patchset to change all young flag check functions to
return bool, as discussed with David in the previous thread[1].  Since
callers only care about whether the young flag was set, returning bool
makes the intention clearer.  No functional changes intended.


This patch (of 6):

Callers use ptep_test_and_clear_young() to clear the young flag and check
whether it was set.  Change the return type to bool to make the intention
clearer.

Link: https://lkml.kernel.org/r/cover.1774075004.git.baolin.wang@linux.alibaba.com
Link: https://lkml.kernel.org/r/57e70efa9703d43959aa645246ea3cbdba14fa17.1774075004.git.baolin.wang@linux.alibaba.com
Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com>
Reviewed-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
Reviewed-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org>
Acked-by: David Hildenbrand (Arm) <david@kernel.org>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-04-05 13:53:35 -07:00
Mike Rapoport (Microsoft)
6215d9f447 arch, mm: consolidate empty_zero_page
Reduce 22 declarations of empty_zero_page to 3 and 23 declarations of
ZERO_PAGE() to 4.

Every architecture defines empty_zero_page that way or another, but for the
most of them it is always a page aligned page in BSS and most definitions
of ZERO_PAGE do virt_to_page(empty_zero_page).

Move Linus vetted x86 definition of empty_zero_page and ZERO_PAGE() to the
core MM and drop these definitions in architectures that do not implement
colored zero page (MIPS and s390).

ZERO_PAGE() remains a macro because turning it to a wrapper for a static
inline causes severe pain in header dependencies.

For the most part the change is mechanical, with these being noteworthy:

* alpha: aliased empty_zero_page with ZERO_PGE that was also used for boot
  parameters. Switching to a generic empty_zero_page removes the aliasing
  and keeps ZERO_PGE for boot parameters only
* arm64: uses __pa_symbol() in ZERO_PAGE() so that definition of
  ZERO_PAGE() is kept intact.
* m68k/parisc/um: allocated empty_zero_page from memblock,
  although they do not support zero page coloring and having it in BSS
  will work fine.
* sparc64 can have empty_zero_page in BSS rather allocate it, but it
  can't use virt_to_page() for BSS. Keep it's definition of ZERO_PAGE()
  but instead of allocating it, make mem_map_zero point to
  empty_zero_page.
* sh: used empty_zero_page for boot parameters at the very early boot.
  Rename the parameters page to boot_params_page and let sh use the generic
  empty_zero_page.
* hexagon: had an amusing comment about empty_zero_page

	/* A handy thing to have if one has the RAM. Declared in head.S */

  that unfortunately had to go :)

Link: https://lkml.kernel.org/r/20260211103141.3215197-4-rppt@kernel.org
Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Acked-by: Helge Deller <deller@gmx.de>		[parisc]
Tested-by: Helge Deller <deller@gmx.de>		[parisc]
Reviewed-by: Christophe Leroy (CS GROUP) <chleroy@kernel.org>
Acked-by: Dave Hansen <dave.hansen@linux.intel.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Magnus Lindholm <linmag7@gmail.com>	[alpha]
Acked-by: Dinh Nguyen <dinguyen@kernel.org>	[nios2]
Acked-by: Andreas Larsson <andreas@gaisler.com>	[sparc]
Acked-by: David Hildenbrand (Arm) <david@kernel.org>
Acked-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: "Borislav Petkov (AMD)" <bp@alien8.de>
Cc: David S. Miller <davem@davemloft.net>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Guo Ren <guoren@kernel.org>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Johannes Berg <johannes@sipsolutions.net>
Cc: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Max Filippov <jcmvbkbc@gmail.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Richard Weinberger <richard@nod.at>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Stafford Horne <shorne@gmail.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vineet Gupta <vgupta@kernel.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-04-05 13:53:01 -07:00
Ilpo Järvinen
9036bd0efc PCI: Align head space better
When a bridge window contains big and small resource(s), the small
resource(s) may not amount to the half of the size of the big resource
which would allow calculate_head_align() to shrink the head alignment.
This results in always placing the small resource(s) after the big
resource.

In general, it would be good to be able to place the small resource(s)
before the big resource to achieve better utilization of the address space.
In the cases where the large resource can only fit at the end of the
window, it is even required.

However, carrying the information over from pbus_size_mem() and
calculate_head_align() to __pci_assign_resource() and
pcibios_align_resource() is not easy with the current data structures.

A somewhat hacky way to move the non-aligning tail part to the head is
possible within pcibios_align_resource(). The free space between the start
of the free space span and the aligned start address can be compared with
the non-aligning remainder of the size. If the free space is larger than
the remainder, placing the remainder before the start address is possible.
This relocation should generally work, because PCI resources consist only
power-of-2 atoms.

Various arch requirements may still need to override the relocation, so the
relocation is only applied selectively in such cases.

Closes: https://bugzilla.kernel.org/show_bug.cgi?id=221205
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Xifer <xiferdev@gmail.com>
Link: https://patch.msgid.link/20260324165633.4583-10-ilpo.jarvinen@linux.intel.com
2026-03-27 10:19:08 -05:00
Ilpo Järvinen
38ec53e16f parisc/PCI: Clean up align handling
Caller of pcibios_align_resource() (__find_resource_space()) already aligns
the start address by 'alignment' so aligning is only necessary if align >
alignment.

Change also to use ALIGN() instead of open-coding.

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Link: https://patch.msgid.link/20260324165633.4583-8-ilpo.jarvinen@linux.intel.com
2026-03-27 10:19:08 -05:00
Ilpo Järvinen
f699bcc8bc resource: Pass full extent of empty space to resource_alignf callback
__find_resource_space() calculates the full extent of empty space but only
passes the aligned space to resource_alignf callback. In some situations,
the callback may choose take advantage of the free space before the
requested alignment.

Pass the full extent of the calculated empty space to resource_alignf
callback as an additional parameter.

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Xifer <xiferdev@gmail.com>
Link: https://patch.msgid.link/20260324165633.4583-3-ilpo.jarvinen@linux.intel.com
2026-03-27 10:18:39 -05:00
Christoph Hellwig
e43dce8a0b
fs: fix archiecture-specific compat_ftruncate64
The "small" argument to do_sys_ftruncate indicates if > 32-bit size
should be reject, but all the arch-specific compat ftruncate64
implementations get this wrong.  Merge do_sys_ftruncate and
ksys_ftruncate, replace the integer as boolean small flag with a
descriptive one about LFS semantics, and use it correctly in the
architecture-specific ftruncate64 implementations.

Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Fixes: 3dd681d944 ("arm64: 32-bit (compat) applications support")
Signed-off-by: Christoph Hellwig <hch@lst.de>
Link: https://patch.msgid.link/20260323070205.2939118-2-hch@lst.de
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Christian Brauner <brauner@kernel.org>
2026-03-23 12:41:57 +01:00
Ingo Molnar
f6472b1793 Linux 7.0-rc4
-----BEGIN PGP SIGNATURE-----
 
 iQFSBAABCgA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAmm3G/UeHHRvcnZhbGRz
 QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGZJUH/R0vQ3Vha48QDEic
 1NREwaHxAoTFi0i3y7OPPklqrP2V09D1qg4Q6fExYQVTQgV6F2DRjVbyPKrmr4ay
 BA6aHrUdnFngYHpDlI1b1r7rJiAIN4WFHl7StO70bS+EB+UPsP9cfP3CKXUfKfqT
 kyHXzUrd5QnjYmlb9rQw1E6rzsRamNtGUtZf7TwDidJYjtm3sPeDHUkjyRy4xkYd
 UouIu6W7UXoicl38bJAgaWBY5BiYtjN6ktnY4/gcqDeqYd7mTM3Eb1B+OSXgFfip
 F0OYfJhfWn+63WnPA+1I5jXWC1UrdVXTMK/NTYjhmGlfdmkLcWDlNGtu+qKZbpwj
 fmF3Kyo=
 =6nX1
 -----END PGP SIGNATURE-----

Merge tag 'v7.0-rc4' into timers/core, to resolve conflict

Resolve conflict between this change in the upstream kernel:

  4c652a4772 ("rseq: Mark rseq_arm_slice_extension_timer() __always_inline")

... and this pending change in timers/core:

  0e98eb1481 ("entry: Prepare for deferred hrtimer rearming")

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2026-03-21 08:02:36 +01:00
Helge Deller
2c98a8fbd6 parisc: Flush correct cache in cacheflush() syscall
The assembly flush instructions were swapped for I- and D-cache flags:

SYSCALL_DEFINE3(cacheflush, ...)
{
	if (cache & DCACHE) {
			"fic ...\n"
	}
	if (cache & ICACHE && error == 0) {
			"fdc ...\n"
	}

Fix it by using fdc for DCACHE, and fic for ICACHE flushing.

Reported-by: Felix Lechner <felix.lechner@lease-up.com>
Fixes: c6d96328fe ("parisc: Add cacheflush() syscall")
Cc: <stable@vger.kernel.org> # v6.5+
Signed-off-by: Helge Deller <deller@gmx.de>
2026-03-15 09:28:49 +01:00
Nathan Chancellor
ec4c28276c
kbuild: Consolidate C dialect options
Introduce CC_FLAGS_DIALECT to make it easier to update the various
places in the tree that rely on the GNU C standard and Microsoft
extensions flags atomically. All remaining uses of '-std=gnu11' and
'-fms-extensions' are in the tools directory (which has its own build
system) and other standalone Makefiles. This will allow the kernel to
use a narrower option to enable the Microsoft anonymous tagged structure
extension in a simpler manner. Place the CC_FLAGS_DIALECT block after
the configuration include (so that a future change can move the
selection of the flag to Kconfig) but before the
arch/$(SRCARCH)/Makefile include (so that CC_FLAGS_DIALECT is available
for use in those Makefiles).

Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Reviewed-by: Nicolas Schier <nsc@kernel.org>
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Acked-by: Helge Deller <deller@gmx.de>  # parisc
Link: https://patch.msgid.link/20260223-fms-anonymous-structs-v1-1-8ee406d3c36c@kernel.org
Signed-off-by: Nicolas Schier <nsc@kernel.org>
2026-03-12 12:52:37 +01:00
Thomas Gleixner
2a14f7c5ee parisc: Remove unused clocksource flags
PARISC does not enable the clocksource watchdog, so the VERIFY flags are
pointless as they are not evaluated. Remove them from the clocksource.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@kernel.org>
Acked-by: Helge Deller <deller@gmx.de>
Link: https://patch.msgid.link/20260123231521.655892451@kernel.org
2026-03-12 12:23:26 +01:00
Linus Torvalds
6deccafcb4 parisc architecture fixes for kernel v7.0-rc3:
Three initial kernel mapping fixes
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQS86RI+GtKfB8BJu973ErUQojoPXwUCaayE4AAKCRD3ErUQojoP
 X4U4AQDtHPc9nlM3areu5yTQnOcPTExuEoIpvBm9ktwNCdrwCgEAt4tqv3hhxCvG
 /lwb6XBCHfyw3d/AsTRbOIH1MGCnaQQ=
 =itGt
 -----END PGP SIGNATURE-----

Merge tag 'parisc-for-7.0-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux

Pull parisc fixes from Helge Deller:
 "While testing Sasha Levin's 'kallsyms: embed source file:line info in
  kernel stack traces' patch series, which increases the typical kernel
  image size, I found some issues with the parisc initial kernel mapping
  which may prevent the kernel to boot.

  The three small patches here fix this"

* tag 'parisc-for-7.0-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux:
  parisc: Fix initial page table creation for boot
  parisc: Check kernel mapping earlier at bootup
  parisc: Increase initial mapping to 64 MB with KALLSYMS
2026-03-07 12:38:16 -08:00
Helge Deller
8475d8fe21 parisc: Fix initial page table creation for boot
The KERNEL_INITIAL_ORDER value defines the initial size (usually 32 or
64 MB) of the page table during bootup. Up until now the whole area was
initialized with PTE entries, but there was no check if we filled too
many entries.  Change the code to fill up with so many entries that the
"_end" symbol can be reached by the kernel, but not more entries than
actually fit into the initial PTE tables.

Signed-off-by: Helge Deller <deller@gmx.de>
Cc: <stable@vger.kernel.org> # v6.0+
2026-03-06 11:33:13 +01:00
Helge Deller
17c144f110 parisc: Check kernel mapping earlier at bootup
The check if the initial mapping is sufficient needs to happen much
earlier during bootup. Move this test directly to the start_parisc()
function and use native PDC iodc functions to print the warning, because
panic() and printk() are not functional yet.

This fixes boot when enabling various KALLSYSMS options which need
much more space.

Signed-off-by: Helge Deller <deller@gmx.de>
Cc: <stable@vger.kernel.org> # v6.0+
2026-03-06 11:33:13 +01:00
Helge Deller
8e732934fb parisc: Increase initial mapping to 64 MB with KALLSYMS
The 32MB initial kernel mapping can become too small when CONFIG_KALLSYMS
is used. Increase the mapping to 64 MB in this case.

Signed-off-by: Helge Deller <deller@gmx.de>
Cc: <stable@vger.kernel.org> # v6.0+
2026-03-06 11:33:13 +01:00
Nathan Chancellor
8678591b47
kbuild: Split .modinfo out from ELF_DETAILS
Commit 3e86e4d74c ("kbuild: keep .modinfo section in
vmlinux.unstripped") added .modinfo to ELF_DETAILS while removing it
from COMMON_DISCARDS, as it was needed in vmlinux.unstripped and
ELF_DETAILS was present in all architecture specific vmlinux linker
scripts. While this shuffle is fine for vmlinux, ELF_DETAILS and
COMMON_DISCARDS may be used by other linker scripts, such as the s390
and x86 compressed boot images, which may not expect to have a .modinfo
section. In certain circumstances, this could result in a bootloader
failing to load the compressed kernel [1].

Commit ddc6cbef3e ("s390/boot/vmlinux.lds.S: Ensure bzImage ends with
SecureBoot trailer") recently addressed this for the s390 bzImage but
the same bug remains for arm, parisc, and x86. The presence of .modinfo
in the x86 bzImage was the root cause of the issue worked around with
commit d50f210913 ("kbuild: align modinfo section for Secureboot
Authenticode EDK2 compat"). misc.c in arch/x86/boot/compressed includes
lib/decompress_unzstd.c, which in turn includes lib/xxhash.c and its
MODULE_LICENSE / MODULE_DESCRIPTION macros due to the STATIC definition.

Split .modinfo out from ELF_DETAILS into its own macro and handle it in
all vmlinux linker scripts. Discard .modinfo in the places where it was
previously being discarded from being in COMMON_DISCARDS, as it has
never been necessary in those uses.

Cc: stable@vger.kernel.org
Fixes: 3e86e4d74c ("kbuild: keep .modinfo section in vmlinux.unstripped")
Reported-by: Ed W <lists@wildgooses.com>
Closes: https://lore.kernel.org/587f25e0-a80e-46a5-9f01-87cb40cfa377@wildgooses.com/ [1]
Tested-by: Ed W <lists@wildgooses.com> # x86_64
Link: https://patch.msgid.link/20260225-separate-modinfo-from-elf-details-v1-1-387ced6baf4b@kernel.org
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
2026-02-26 11:50:19 -07:00
Linus Torvalds
bf4afc53b7 Convert 'alloc_obj' family to use the new default GFP_KERNEL argument
This was done entirely with mindless brute force, using

    git grep -l '\<k[vmz]*alloc_objs*(.*, GFP_KERNEL)' |
        xargs sed -i 's/\(alloc_objs*(.*\), GFP_KERNEL)/\1)/'

to convert the new alloc_obj() users that had a simple GFP_KERNEL
argument to just drop that argument.

Note that due to the extreme simplicity of the scripting, any slightly
more complex cases spread over multiple lines would not be triggered:
they definitely exist, but this covers the vast bulk of the cases, and
the resulting diff is also then easier to check automatically.

For the same reason the 'flex' versions will be done as a separate
conversion.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2026-02-21 17:09:51 -08:00
Kees Cook
69050f8d6d treewide: Replace kmalloc with kmalloc_obj for non-scalar types
This is the result of running the Coccinelle script from
scripts/coccinelle/api/kmalloc_objs.cocci. The script is designed to
avoid scalar types (which need careful case-by-case checking), and
instead replace kmalloc-family calls that allocate struct or union
object instances:

Single allocations:	kmalloc(sizeof(TYPE), ...)
are replaced with:	kmalloc_obj(TYPE, ...)

Array allocations:	kmalloc_array(COUNT, sizeof(TYPE), ...)
are replaced with:	kmalloc_objs(TYPE, COUNT, ...)

Flex array allocations:	kmalloc(struct_size(PTR, FAM, COUNT), ...)
are replaced with:	kmalloc_flex(*PTR, FAM, COUNT, ...)

(where TYPE may also be *VAR)

The resulting allocations no longer return "void *", instead returning
"TYPE *".

Signed-off-by: Kees Cook <kees@kernel.org>
2026-02-21 01:02:28 -08:00
Linus Torvalds
4cff5c05e0 mm.git review status for linus..mm-stable
Everything:
 
 Total patches:       325
 Reviews/patch:       1.39
 Reviewed rate:       72%
 
 Excluding DAMON:
 
 Total patches:       262
 Reviews/patch:       1.63
 Reviewed rate:       82%
 
 Excluding DAMON and zram:
 
 Total patches:       248
 Reviews/patch:       1.72
 Reviewed rate:       86%
 
 - The 14 patch series "powerpc/64s: do not re-activate batched TLB
   flush" from Alexander Gordeev makes arch_{enter|leave}_lazy_mmu_mode()
   nest properly.
 
   It adds a generic enter/leave layer and switches architectures to use
   it.  Various hacks were removed in the process.
 
 - The 7 patch series "zram: introduce compressed data writeback" from
   Richard Chang and Sergey Senozhatsky implements data compression for
   zram writeback.
 
 - The 8 patch series "mm: folio_zero_user: clear page ranges" from David
   Hildenbrand adds clearing of contiguous page ranges for hugepages.
   Large improvements during demand faulting are demonstrated.
 
 - The 2 patch series "memcg cleanups" from Chen Ridong tideis up some
   memcg code.
 
 - The 12 patch series "mm/damon: introduce {,max_}nr_snapshots and
   tracepoint for damos stats" from SeongJae Park improves DAMOS stat's
   provided information, deterministic control, and readability.
 
 - The 3 patch series "selftests/mm: hugetlb cgroup charging: robustness
   fixes" from Li Wang fixes a few issues in the hugetlb cgroup charging
   selftests.
 
 - The 5 patch series "Fix va_high_addr_switch.sh test failure - again"
   from Chunyu Hu addresses several issues in the va_high_addr_switch test.
 
 - The 5 patch series "mm/damon/tests/core-kunit: extend existing test
   scenarios" from Shu Anzai improves the KUnit test coverage for DAMON.
 
 - The 2 patch series "mm/khugepaged: fix dirty page handling for
   MADV_COLLAPSE" from Shivank Garg fixes a glitch in khugepaged which was
   causing madvise(MADV_COLLAPSE) to transiently return -EAGAIN.
 
 - The 29 patch series "arch, mm: consolidate hugetlb early reservation"
   from Mike Rapoport reworks and consolidates a pile of straggly code
   related to reservation of hugetlb memory from bootmem and creation of
   CMA areas for hugetlb.
 
 - The 9 patch series "mm: clean up anon_vma implementation" from Lorenzo
   Stoakes cleans up the anon_vma implementation in various ways.
 
 - The 3 patch series "tweaks for __alloc_pages_slowpath()" from
   Vlastimil Babka does a little streamlining of the page allocator's
   slowpath code.
 
 - The 8 patch series "memcg: separate private and public ID namespaces"
   from Shakeel Butt cleans up the memcg ID code and prevents the
   internal-only private IDs from being exposed to userspace.
 
 - The 6 patch series "mm: hugetlb: allocate frozen gigantic folio" from
   Kefeng Wang cleans up the allocation of frozen folios and avoids some
   atomic refcount operations.
 
 - The 11 patch series "mm/damon: advance DAMOS-based LRU sorting" from
   SeongJae Park improves DAMOS's movement of memory betewwn the active and
   inactive LRUs and adds auto-tuning of the ratio-based quotas and of
   monitoring intervals.
 
 - The 18 patch series "Support page table check on PowerPC" from Andrew
   Donnellan makes CONFIG_PAGE_TABLE_CHECK_ENFORCED work on powerpc.
 
 - The 3 patch series "nodemask: align nodes_and{,not} with underlying
   bitmap ops" from Yury Norov makes nodes_and() and nodes_andnot()
   propagate the return values from the underlying bit operations, enabling
   some cleanup in calling code.
 
 - The 5 patch series "mm/damon: hide kdamond and kdamond_lock from API
   callers" from SeongJae Park cleans up some DAMON internal interfaces.
 
 - The 4 patch series "mm/khugepaged: cleanups and scan limit fix" from
   Shivank Garg does some cleanup work in khupaged and fixes a scan limit
   accounting issue.
 
 - The 24 patch series "mm: balloon infrastructure cleanups" from David
   Hildenbrand goes to town on the balloon infrastructure and its page
   migration function.  Mainly cleanups, also some locking simplification.
 
 - The 2 patch series "mm/vmscan: add tracepoint and reason for
   kswapd_failures reset" from Jiayuan Chen adds additional tracepoints to
   the page reclaim code.
 
 - The 3 patch series "Replace wq users and add WQ_PERCPU to
   alloc_workqueue() users" from Marco Crivellari is part of Marco's
   kernel-wide migration from the legacy workqueue APIs over to the
   preferred unbound workqueues.
 
 - The 9 patch series "Various mm kselftests improvements/fixes" from
   Kevin Brodsky provides various unrelated improvements/fixes for the mm
   kselftests.
 
 - The 5 patch series "mm: accelerate gigantic folio allocation" from
   Kefeng Wang greatly speeds up gigantic folio allocation, mainly by
   avoiding unnecessary work in pfn_range_valid_contig().
 
 - The 5 patch series "selftests/damon: improve leak detection and wss
   estimation reliability" from SeongJae Park improves the reliability of
   two of the DAMON selftests.
 
 - The 8 patch series "mm/damon: cleanup kdamond, damon_call(), damos
   filter and DAMON_MIN_REGION" from SeongJae Park does some cleanup work
   in the core DAMON code.
 
 - The 8 patch series "Docs/mm/damon: update intro, modules, maintainer
   profile, and misc" from SeongJae Park performs maintenance work on the
   DAMON documentation.
 
 - The 10 patch series "mm: add and use vma_assert_stabilised() helper"
   from Lorenzo Stoakes refactors and cleans up the core VMA code.  The
   main aim here is to be able to use the mmap write lock's lockdep state
   to perform various assertions regarding the locking which the VMA code
   requires.
 
 - The 19 patch series "mm, swap: swap table phase II: unify swapin use"
   from Kairui Song removes some old swap code (swap cache bypassing and
   swap synchronization) which wasn't working very well.  Various other
   cleanups and simplifications were made.  The end result is a 20% speedup
   in one benchmark.
 
 - The 8 patch series "enable PT_RECLAIM on more 64-bit architectures"
   from Qi Zheng makes PT_RECLAIM available on 64-bit alpha, loongarch,
   mips, parisc, um,  Various cleanups were performed along the way.
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCaY1HfAAKCRDdBJ7gKXxA
 jqhZAP9H8ZlKKqCEgnr6U5XXmJ63Ep2FDQpl8p35yr9yVuU9+gEAgfyWiJ43l1fP
 rT0yjsUW3KQFBi/SEA3R6aYarmoIBgI=
 =+HLt
 -----END PGP SIGNATURE-----

Merge tag 'mm-stable-2026-02-11-19-22' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull MM updates from Andrew Morton:

 - "powerpc/64s: do not re-activate batched TLB flush" makes
   arch_{enter|leave}_lazy_mmu_mode() nest properly (Alexander Gordeev)

   It adds a generic enter/leave layer and switches architectures to use
   it. Various hacks were removed in the process.

 - "zram: introduce compressed data writeback" implements data
   compression for zram writeback (Richard Chang and Sergey Senozhatsky)

 - "mm: folio_zero_user: clear page ranges" adds clearing of contiguous
   page ranges for hugepages. Large improvements during demand faulting
   are demonstrated (David Hildenbrand)

 - "memcg cleanups" tidies up some memcg code (Chen Ridong)

 - "mm/damon: introduce {,max_}nr_snapshots and tracepoint for damos
   stats" improves DAMOS stat's provided information, deterministic
   control, and readability (SeongJae Park)

 - "selftests/mm: hugetlb cgroup charging: robustness fixes" fixes a few
   issues in the hugetlb cgroup charging selftests (Li Wang)

 - "Fix va_high_addr_switch.sh test failure - again" addresses several
   issues in the va_high_addr_switch test (Chunyu Hu)

 - "mm/damon/tests/core-kunit: extend existing test scenarios" improves
   the KUnit test coverage for DAMON (Shu Anzai)

 - "mm/khugepaged: fix dirty page handling for MADV_COLLAPSE" fixes a
   glitch in khugepaged which was causing madvise(MADV_COLLAPSE) to
   transiently return -EAGAIN (Shivank Garg)

 - "arch, mm: consolidate hugetlb early reservation" reworks and
   consolidates a pile of straggly code related to reservation of
   hugetlb memory from bootmem and creation of CMA areas for hugetlb
   (Mike Rapoport)

 - "mm: clean up anon_vma implementation" cleans up the anon_vma
   implementation in various ways (Lorenzo Stoakes)

 - "tweaks for __alloc_pages_slowpath()" does a little streamlining of
   the page allocator's slowpath code (Vlastimil Babka)

 - "memcg: separate private and public ID namespaces" cleans up the
   memcg ID code and prevents the internal-only private IDs from being
   exposed to userspace (Shakeel Butt)

 - "mm: hugetlb: allocate frozen gigantic folio" cleans up the
   allocation of frozen folios and avoids some atomic refcount
   operations (Kefeng Wang)

 - "mm/damon: advance DAMOS-based LRU sorting" improves DAMOS's movement
   of memory betewwn the active and inactive LRUs and adds auto-tuning
   of the ratio-based quotas and of monitoring intervals (SeongJae Park)

 - "Support page table check on PowerPC" makes
   CONFIG_PAGE_TABLE_CHECK_ENFORCED work on powerpc (Andrew Donnellan)

 - "nodemask: align nodes_and{,not} with underlying bitmap ops" makes
   nodes_and() and nodes_andnot() propagate the return values from the
   underlying bit operations, enabling some cleanup in calling code
   (Yury Norov)

 - "mm/damon: hide kdamond and kdamond_lock from API callers" cleans up
   some DAMON internal interfaces (SeongJae Park)

 - "mm/khugepaged: cleanups and scan limit fix" does some cleanup work
   in khupaged and fixes a scan limit accounting issue (Shivank Garg)

 - "mm: balloon infrastructure cleanups" goes to town on the balloon
   infrastructure and its page migration function. Mainly cleanups, also
   some locking simplification (David Hildenbrand)

 - "mm/vmscan: add tracepoint and reason for kswapd_failures reset" adds
   additional tracepoints to the page reclaim code (Jiayuan Chen)

 - "Replace wq users and add WQ_PERCPU to alloc_workqueue() users" is
   part of Marco's kernel-wide migration from the legacy workqueue APIs
   over to the preferred unbound workqueues (Marco Crivellari)

 - "Various mm kselftests improvements/fixes" provides various unrelated
   improvements/fixes for the mm kselftests (Kevin Brodsky)

 - "mm: accelerate gigantic folio allocation" greatly speeds up gigantic
   folio allocation, mainly by avoiding unnecessary work in
   pfn_range_valid_contig() (Kefeng Wang)

 - "selftests/damon: improve leak detection and wss estimation
   reliability" improves the reliability of two of the DAMON selftests
   (SeongJae Park)

 - "mm/damon: cleanup kdamond, damon_call(), damos filter and
   DAMON_MIN_REGION" does some cleanup work in the core DAMON code
   (SeongJae Park)

 - "Docs/mm/damon: update intro, modules, maintainer profile, and misc"
   performs maintenance work on the DAMON documentation (SeongJae Park)

 - "mm: add and use vma_assert_stabilised() helper" refactors and cleans
   up the core VMA code. The main aim here is to be able to use the mmap
   write lock's lockdep state to perform various assertions regarding
   the locking which the VMA code requires (Lorenzo Stoakes)

 - "mm, swap: swap table phase II: unify swapin use" removes some old
   swap code (swap cache bypassing and swap synchronization) which
   wasn't working very well. Various other cleanups and simplifications
   were made. The end result is a 20% speedup in one benchmark (Kairui
   Song)

 - "enable PT_RECLAIM on more 64-bit architectures" makes PT_RECLAIM
   available on 64-bit alpha, loongarch, mips, parisc, and um. Various
   cleanups were performed along the way (Qi Zheng)

* tag 'mm-stable-2026-02-11-19-22' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (325 commits)
  mm/memory: handle non-split locks correctly in zap_empty_pte_table()
  mm: move pte table reclaim code to memory.c
  mm: make PT_RECLAIM depends on MMU_GATHER_RCU_TABLE_FREE
  mm: convert __HAVE_ARCH_TLB_REMOVE_TABLE to CONFIG_HAVE_ARCH_TLB_REMOVE_TABLE config
  um: mm: enable MMU_GATHER_RCU_TABLE_FREE
  parisc: mm: enable MMU_GATHER_RCU_TABLE_FREE
  mips: mm: enable MMU_GATHER_RCU_TABLE_FREE
  LoongArch: mm: enable MMU_GATHER_RCU_TABLE_FREE
  alpha: mm: enable MMU_GATHER_RCU_TABLE_FREE
  mm: change mm/pt_reclaim.c to use asm/tlb.h instead of asm-generic/tlb.h
  mm/damon/stat: remove __read_mostly from memory_idle_ms_percentiles
  zsmalloc: make common caches global
  mm: add SPDX id lines to some mm source files
  mm/zswap: use %pe to print error pointers
  mm/vmscan: use %pe to print error pointers
  mm/readahead: fix typo in comment
  mm: khugepaged: fix NR_FILE_PAGES and NR_SHMEM in collapse_file()
  mm: refactor vma_map_pages to use vm_insert_pages
  mm/damon: unify address range representation with damon_addr_range
  mm/cma: replace snprintf with strscpy in cma_new_area
  ...
2026-02-12 11:32:37 -08:00
Linus Torvalds
8ad8d24d96 parisc architecture fixes and updates for kernel v7.0-rc1:
- Fix device reference leak in error path
 - Check if system provides a 64-bit free running platform counter
 - Minor fixes in debug code
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQS86RI+GtKfB8BJu973ErUQojoPXwUCaYuPEgAKCRD3ErUQojoP
 X3z9AQCfvU14/vkMab/iXt7mn7yf1CZbvo7cczAMA8XTRvLNUQD7BL+6tHqSxwCh
 rZax1BQR8TKArZYWkdOADtldog0plQ4=
 =NhTL
 -----END PGP SIGNATURE-----

Merge tag 'parisc-for-7.0-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux

Pull parisc updates from Helge Deller:

 - Fix device reference leak in error path

 - Check if system provides a 64-bit free running platform counter

 - Minor fixes in debug code

* tag 'parisc-for-7.0-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux:
  parisc: lba_pci: Add debug code to show IO and PA ranges
  parisc: Detect 64-bit free running platform counter
  parisc: Fix minor printk issues in iosapic debug code
  parisc: Enhance debug code for PAT firmware
  parisc: Add PDC PAT call to get free running 64-bit counter
  parisc: Fix module path output in qemu tables
  parisc: Export model name for MPE/ix
  parisc: Prevent interrupts during reboot
  parisc: Print hardware IDs as 4 digit hex strings
  parisc: kernel: replace kfree() with put_device() in create_tree_node()
2026-02-10 21:42:10 -08:00
Linus Torvalds
f1c538ca81 Updates for the VDSO subsystem:
- Provide the missing 64-bit variant of clock_getres()
 
     This allows the extension of CONFIG_COMPAT_32BIT_TIME to the vDSO and
     finally the removal of 32-bit time types from the kernel and UAPI.
 
   - Remove the useless and broken getcpu_cache from the VDSO
 
     The intention was to provide a trivial way to retrieve the CPU number from
     the VDSO, but as the VDSO data is per process there is no way to make it
     work.
 
   - Switch get/put_unaligned() from packed struct to memcpy()
 
     The packed struct violates strict aliasing rules which requires to pass
     -fno-strict-aliasing to the compiler. As this are scalar values
     __builtin_memcpy() turns them into simple loads and stores
 
   - Use __typeof_unqual__() for __unqual_scalar_typeof()
 
     The get/put_unaligned() changes triggered a new sparse warning when __beNN
     types are used with get/put_unaligned() as sparse builds add a special
     'bitwise' attribute to them which prevents sparse to evaluate the Generic
     in __unqual_scalar_typeof().
 
     Newer sparse versions support __typeof_unqual__() which avoids the problem,
     but requires a recent sparse install. So this adds a sanity check to sparse
     builds, which validates that sparse is available and capable of handling it.
 
   - Force inline __cvdso_clock_getres_common()
 
     Compilers sometimes un-inline agressively, which results in function call
     overhead and problems with automatic stack variable initialization.
 
     Interestingly enough the force inlining results in smaller code than the
     un-inlined variant produced by GCC when optimizing for size.
 -----BEGIN PGP SIGNATURE-----
 
 iQJEBAABCgAuFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAmmJ2eAQHHRnbHhAa2Vy
 bmVsLm9yZwAKCRCmGPVMDXSYoXhQD/4mjneVlRBbQ6Nt9LxxhzPlYXvED6u8b4SX
 R//DQ4qqqagh2fSpE57tr3f56HqhUmXptGJSgEvr6tVKoyugsLqP6sN9J9J3o181
 jnPQR0VBP9dihQZ5X93pKo9NZWxLIn0uLD0RsdCbE9NTx4ciw4VIPFYQqkC4Rw6b
 jiRrDR2l8EhV8cmxB6puW5WaQ932M6Awabw9RumzwH3MzIIlbc5Ero51S9eS64LL
 byU5XeWUe295W1Gxze5RHHJWyNQEyx1eUCFfe3LWvfpz7FMzc2AQsKnIJDzW3GiO
 UGu5MGptbLpG+ccvhVEs6/Ls5pWXcoCw4WuDNAunCCOmqda98oDniKf2LwRRbLT0
 nAfLNatMnhXdTPk2zbS45z9uipUQAGKmVAE3/LVqB+ekcutmIGMyqHgR75QX0b4l
 CQPkC9rBsV6gGsScWTnhRydhqioNO/uhhrQv0vEXnKZa0ysTbgZKt3JDZbgUEL2B
 uDxXKyrqjpnqDZKlMMaoLtwd+l+T80ya4/NhHd4ZNGUpTUrHVw2H47lgE7ahCxEk
 /SvXTZSU4Jp8sVQIQ+J6y5z2AQ/xGy++zvNKiZMyP9fQuPqhiDqYkLpmOp61bARx
 wqyVsfGXZYSB1l16AiSC/CyUDcMqqsFohGXQ/Yf0SOiVtXu2WUyfofY1N0IXIGu4
 WOVV9mH1yg==
 =0ogX
 -----END PGP SIGNATURE-----

Merge tag 'timers-vdso-2026-02-09' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull VDSO updates from Thomas Gleixner:

 - Provide the missing 64-bit variant of clock_getres()

   This allows the extension of CONFIG_COMPAT_32BIT_TIME to the vDSO and
   finally the removal of 32-bit time types from the kernel and UAPI.

 - Remove the useless and broken getcpu_cache from the VDSO

   The intention was to provide a trivial way to retrieve the CPU number
   from the VDSO, but as the VDSO data is per process there is no way to
   make it work.

 - Switch get/put_unaligned() from packed struct to memcpy()

   The packed struct violates strict aliasing rules which requires to
   pass -fno-strict-aliasing to the compiler. As this are scalar values
   __builtin_memcpy() turns them into simple loads and stores

 - Use __typeof_unqual__() for __unqual_scalar_typeof()

   The get/put_unaligned() changes triggered a new sparse warning when
   __beNN types are used with get/put_unaligned() as sparse builds add a
   special 'bitwise' attribute to them which prevents sparse to evaluate
   the Generic in __unqual_scalar_typeof().

   Newer sparse versions support __typeof_unqual__() which avoids the
   problem, but requires a recent sparse install. So this adds a sanity
   check to sparse builds, which validates that sparse is available and
   capable of handling it.

 - Force inline __cvdso_clock_getres_common()

   Compilers sometimes un-inline agressively, which results in function
   call overhead and problems with automatic stack variable
   initialization.

   Interestingly enough the force inlining results in smaller code than
   the un-inlined variant produced by GCC when optimizing for size.

* tag 'timers-vdso-2026-02-09' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  vdso/gettimeofday: Force inlining of __cvdso_clock_getres_common()
  x86/percpu: Make CONFIG_USE_X86_SEG_SUPPORT work with sparse
  compiler: Use __typeof_unqual__() for __unqual_scalar_typeof()
  powerpc/vdso: Provide clock_getres_time64()
  tools headers: Remove unneeded ignoring of warnings in unaligned.h
  tools headers: Update the linux/unaligned.h copy with the kernel sources
  vdso: Switch get/put_unaligned() from packed struct to memcpy()
  parisc: Inline a type punning version of get_unaligned_le32()
  vdso: Remove struct getcpu_cache
  MIPS: vdso: Provide getres_time64() for 32-bit ABIs
  arm64: vdso32: Provide clock_getres_time64()
  ARM: VDSO: Provide clock_getres_time64()
  ARM: VDSO: Patch out __vdso_clock_getres() if unavailable
  x86/vdso: Provide clock_getres_time64() for x86-32
  selftests: vDSO: vdso_test_abi: Add test for clock_getres_time64()
  selftests: vDSO: vdso_test_abi: Use UAPI system call numbers
  selftests: vDSO: vdso_config: Add configurations for clock_getres_time64()
  vdso: Add prototype for __vdso_clock_getres_time64()
2026-02-10 17:02:23 -08:00
Linus Torvalds
36ae1c45b2 Scheduler changes for v7.0:
Scheduler Kconfig space updates:
 
  - Further consolidate configurable preemption modes: reduce
    the number of architectures that are allowed to offer
    PREEMPT_NONE and PREEMPT_VOLUNTARY, reducing the number
    of preemption models from four to just two: 'full' and 'lazy'
    on up-to-date architectures (arm64, loongarch, powerpc,
    riscv, s390, x86).
 
    None and voluntary are only available as legacy features
    on platforms that don't implement lazy preemption yet,
    or which don't even support preemption.
 
    The goal is to eventually remove cond_resched() and
    voluntary preemption altogether.
 
    (Peter Zijlstra)
 
 RSEQ based 'scheduler time slice extension' support:
 
 This allows a thread to request a time slice extension when it
 enters a critical section to avoid contention on a resource when
 the thread is scheduled out inside of the critical section.
 
  - Add fields and constants for time slice extension
  - Provide static branch for time slice extensions
  - Add statistics for time slice extensions
  - Add prctl() to enable time slice extensions
  - Implement sys_rseq_slice_yield()
  - Implement syscall entry work for time slice extensions
  - Implement time slice extension enforcement timer
  - Reset slice extension when scheduled
  - Implement rseq_grant_slice_extension()
  - entry: Hook up rseq time slice extension
  - selftests: Implement time slice extension test
 
    (Thomas Gleixner)
 
  - Allow registering RSEQ with slice extension
  - Move slice_ext_nsec to debugfs
  - Lower default slice extension
  - selftests/rseq: Add rseq slice histogram script
 
    (Peter Zijlstra)
 
 Scheduler performance/scalability improvements:
 
  - Update rq->avg_idle when a task is moved to an idle CPU,
    which improves the scalability of various workloads.
    (Shubhang Kaushik)
 
  - Reorder fields in 'struct rq' for better caching
    (Blake Jones)
 
  - Fair scheduler SMP NOHZ balancing code speedups:
 
    - Move checking for nohz cpus after time check
    - Change likelyhood of nohz.nr_cpus
    - Remove nohz.nr_cpus and use weight of cpumask instead
 
      (Shrikanth Hegde)
 
  - Avoid false sharing for sched_clock_irqtime (Wangyang Guo)
 
  - Drop useless cpumask_empty() in find_energy_efficient_cpu()
  - Simplify task_numa_find_cpu()
  - Use cpumask_weight_and() in sched_balance_find_dst_group()
 
    (Yury Norov)
 
 DL scheduler updates:
 
  - Add a deadline server for sched_ext tasks (by Andrea Righi and
    Joel Fernandes, with fixes by Peter Zijlstra)
 
 RT scheduler updates:
 
  - Skip currently executing CPU in rto_next_cpu() (Chen Jinghuang)
 
 Entry code updates and performance improvements, which is part of the
 scheduler tree in this cycle due to interdependencies with the RSEQ
 based time slice extension work:
 
   - Remove unused syscall argument from syscall_trace_enter()
   - Rework syscall_exit_to_user_mode_work() for architecture reuse
   - Add arch_ptrace_report_syscall_entry/exit()
   - Inline syscall_exit_work() and syscall_trace_enter()
 
     (Jinjie Ruan)
 
 Scheduler core updates:
 
  - Rework sched_class::wakeup_preempt() and rq_modified_*()
  - Avoid rq->lock bouncing in sched_balance_newidle()
  - Rename rcu_dereference_check_sched_domain() =>
           rcu_dereference_sched_domain()
  - <linux/compiler_types.h>: Add the __signed_scalar_typeof() helper
 
    (Peter Zijlstra)
 
 Fair scheduler updates/refactoring:
 
  - Fold the sched_avg update
  - Change rcu_dereference_check_sched_domain() to rcu-sched
  - Switch to rcu_dereference_all()
  - Remove superfluous rcu_read_lock()
  - Limit hrtick work
 
    (Peter Zijlstra)
 
  - Join two #ifdef CONFIG_FAIR_GROUP_SCHED blocks
  - Clean up comments in 'struct cfs_rq'
  - Separate se->vlag from se->vprot
  - Rename cfs_rq::avg_load to cfs_rq::sum_weight
  - Rename cfs_rq::avg_vruntime to ::sum_w_vruntime & helper functions
  - Introduce and use the vruntime_cmp() and vruntime_op() wrappers
    for wrapped-signed aritmetics
  - Sort out 'blocked_load*' namespace noise
 
    (Ingo Molnar)
 
 Scheduler debugging code updates:
 
  - Export hidden tracepoints to modules (Gabriele Monaco)
 
  - Convert copy_from_user() + kstrtouint() to kstrtouint_from_user()
    (Fushuai Wang)
 
  - Add assertions to QUEUE_CLASS (Peter Zijlstra)
 
  - hrtimer: Fix tracing oddity (Thomas Gleixner)
 
 Misc fixes and cleanups:
 
  - Re-evaluate scheduling when migrating queued tasks out of
    throttled cgroups (Zicheng Qu)
 
  - Remove task_struct->faults_disabled_mapping (Christoph Hellwig)
 
  - Fix math notation errors in avg_vruntime comment (Zhan Xusheng)
 
  - sched/cpufreq: Use %pe format for PTR_ERR() printing (zenghongling)
 
 Signed-off-by: Ingo Molnar <mingo@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAmmJj+IRHG1pbmdvQGtl
 cm5lbC5vcmcACgkQEnMQ0APhK1grtQ//WyXYGVE/WicdqslfaCY2Mr0uJnL0tLSM
 CJp+0LROdkmy+ChJmftO8RgjCUSsjhC4/xcBhUQXApf/ffQi3b2jH6nkTp/Z64Ms
 p2IXLkBiZjwdcO6fGbB0JE2G1J4hGRC5BlqfgkZzWidMf3kIbmrHg99mVWGzODLY
 N/cPW4d0WGf9TScl1FgEiOqgF3czMLlqvTDJqaFMpsTzSUcRBnrG4xushb4W/bBx
 573eqxgZJ6urNSGu8niY9PAl9F7gskXW3YxI3k8SH7VmJKSevWlwI9vMEhcRDzud
 E0XxD7J8iPOKtr7ypXm7anMBv4jWVUdAnPbYi4TDsyDDU/HguqMqT1McTGn8wQ+F
 jmdhmMC9/TEIzq93SNLbCYieibqDsmJoNVFFi0FWfPLMtYbcZd5a884SIz532vx4
 DegdlDXdazUwhxzDiQR3sq1CsHXpxNS2YdrpadAtF/r2gU86DQjsEew8yBvXi7bb
 Wrkzpax70sU1AFI23wJQkEb/OnnXyehAHAhhQN6GVvuiGr9P7C02WLEGLlmSmJrx
 zl2F750P76yhTfGcvTfJ/5LTfSB+yRozGvcdXnIkyzWotY6a2D1MKNusAfVax+IR
 kyfAWqVdxBhlKnqYbu92lTogvnPh3Lymd6G4TZZRkSH2jixyGd2oS7nZaDBAeBEM
 NHQtr9R+KyU=
 =Xj2f
 -----END PGP SIGNATURE-----

Merge tag 'sched-core-2026-02-09' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull scheduler updates from Ingo Molnar:
 "Scheduler Kconfig space updates:

   - Further consolidate configurable preemption modes (Peter Zijlstra)

     Reduce the number of architectures that are allowed to offer
     PREEMPT_NONE and PREEMPT_VOLUNTARY, reducing the number of
     preemption models from four to just two: 'full' and 'lazy' on
     up-to-date architectures (arm64, loongarch, powerpc, riscv, s390,
     x86).

     None and voluntary are only available as legacy features on
     platforms that don't implement lazy preemption yet, or which don't
     even support preemption.

     The goal is to eventually remove cond_resched() and voluntary
     preemption altogether.

  RSEQ based 'scheduler time slice extension' support (Thomas Gleixner
  and Peter Zijlstra):

  This allows a thread to request a time slice extension when it enters
  a critical section to avoid contention on a resource when the thread
  is scheduled out inside of the critical section.

   - Add fields and constants for time slice extension
   - Provide static branch for time slice extensions
   - Add statistics for time slice extensions
   - Add prctl() to enable time slice extensions
   - Implement sys_rseq_slice_yield()
   - Implement syscall entry work for time slice extensions
   - Implement time slice extension enforcement timer
   - Reset slice extension when scheduled
   - Implement rseq_grant_slice_extension()
   - entry: Hook up rseq time slice extension
   - selftests: Implement time slice extension test
   - Allow registering RSEQ with slice extension
   - Move slice_ext_nsec to debugfs
   - Lower default slice extension
   - selftests/rseq: Add rseq slice histogram script

  Scheduler performance/scalability improvements:

   - Update rq->avg_idle when a task is moved to an idle CPU, which
     improves the scalability of various workloads (Shubhang Kaushik)

   - Reorder fields in 'struct rq' for better caching (Blake Jones)

   - Fair scheduler SMP NOHZ balancing code speedups (Shrikanth Hegde):
      - Move checking for nohz cpus after time check
      - Change likelyhood of nohz.nr_cpus
      - Remove nohz.nr_cpus and use weight of cpumask instead

   - Avoid false sharing for sched_clock_irqtime (Wangyang Guo)

   - Cleanups (Yury Norov):
      - Drop useless cpumask_empty() in find_energy_efficient_cpu()
      - Simplify task_numa_find_cpu()
      - Use cpumask_weight_and() in sched_balance_find_dst_group()

  DL scheduler updates:

   - Add a deadline server for sched_ext tasks (by Andrea Righi and Joel
     Fernandes, with fixes by Peter Zijlstra)

  RT scheduler updates:

   - Skip currently executing CPU in rto_next_cpu() (Chen Jinghuang)

  Entry code updates and performance improvements (Jinjie Ruan)

  This is part of the scheduler tree in this cycle due to inter-
  dependencies with the RSEQ based time slice extension work:

    - Remove unused syscall argument from syscall_trace_enter()
    - Rework syscall_exit_to_user_mode_work() for architecture reuse
    - Add arch_ptrace_report_syscall_entry/exit()
    - Inline syscall_exit_work() and syscall_trace_enter()

  Scheduler core updates (Peter Zijlstra):

   - Rework sched_class::wakeup_preempt() and rq_modified_*()
   - Avoid rq->lock bouncing in sched_balance_newidle()
   - Rename rcu_dereference_check_sched_domain() =>
            rcu_dereference_sched_domain()
   - <linux/compiler_types.h>: Add the __signed_scalar_typeof() helper

  Fair scheduler updates/refactoring (Peter Zijlstra and Ingo Molnar):

   - Fold the sched_avg update
   - Change rcu_dereference_check_sched_domain() to rcu-sched
   - Switch to rcu_dereference_all()
   - Remove superfluous rcu_read_lock()
   - Limit hrtick work
   - Join two #ifdef CONFIG_FAIR_GROUP_SCHED blocks
   - Clean up comments in 'struct cfs_rq'
   - Separate se->vlag from se->vprot
   - Rename cfs_rq::avg_load to cfs_rq::sum_weight
   - Rename cfs_rq::avg_vruntime to ::sum_w_vruntime & helper functions
   - Introduce and use the vruntime_cmp() and vruntime_op() wrappers for
     wrapped-signed aritmetics
   - Sort out 'blocked_load*' namespace noise

  Scheduler debugging code updates:

   - Export hidden tracepoints to modules (Gabriele Monaco)

   - Convert copy_from_user() + kstrtouint() to kstrtouint_from_user()
     (Fushuai Wang)

   - Add assertions to QUEUE_CLASS (Peter Zijlstra)

   - hrtimer: Fix tracing oddity (Thomas Gleixner)

  Misc fixes and cleanups:

   - Re-evaluate scheduling when migrating queued tasks out of throttled
     cgroups (Zicheng Qu)

   - Remove task_struct->faults_disabled_mapping (Christoph Hellwig)

   - Fix math notation errors in avg_vruntime comment (Zhan Xusheng)

   - sched/cpufreq: Use %pe format for PTR_ERR() printing
     (zenghongling)"

* tag 'sched-core-2026-02-09' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (64 commits)
  sched: Re-evaluate scheduling when migrating queued tasks out of throttled cgroups
  sched/cpufreq: Use %pe format for PTR_ERR() printing
  sched/rt: Skip currently executing CPU in rto_next_cpu()
  sched/clock: Avoid false sharing for sched_clock_irqtime
  selftests/sched_ext: Add test for DL server total_bw consistency
  selftests/sched_ext: Add test for sched_ext dl_server
  sched/debug: Fix dl_server (re)start conditions
  sched/debug: Add support to change sched_ext server params
  sched_ext: Add a DL server for sched_ext tasks
  sched/debug: Stop and start server based on if it was active
  sched/debug: Fix updating of ppos on server write ops
  sched/deadline: Clear the defer params
  entry: Inline syscall_exit_work() and syscall_trace_enter()
  entry: Add arch_ptrace_report_syscall_entry/exit()
  entry: Rework syscall_exit_to_user_mode_work() for architecture reuse
  entry: Remove unused syscall argument from syscall_trace_enter()
  sched: remove task_struct->faults_disabled_mapping
  sched: Update rq->avg_idle when a task is moved to an idle CPU
  selftests/rseq: Add rseq slice histogram script
  hrtimer: Fix trace oddity
  ...
2026-02-10 12:50:10 -08:00
Qi Zheng
46231ba5f4 parisc: mm: enable MMU_GATHER_RCU_TABLE_FREE
On a 64-bit system, madvise(MADV_DONTNEED) may cause a large number of
empty PTE page table pages (such as 100GB+).  To resolve this problem,
first enable MMU_GATHER_RCU_TABLE_FREE to prepare for enabling the
PT_RECLAIM feature, which resolves this problem.

Link: https://lkml.kernel.org/r/b827939046dbc94bc7c585cdbed8522baab75b15.1769515122.git.zhengqi.arch@bytedance.com
Signed-off-by: Qi Zheng <zhengqi.arch@bytedance.com>
Cc: "James E.J. Bottomley" <James.Bottomley@HansenPartnership.com>
Cc: Helge Deller <deller@gmx.de>
Cc: Andreas Larsson <andreas@gaisler.com>
Cc: "Aneesh Kumar K.V" <aneesh.kumar@kernel.org>
Cc: Anton Ivanov <anton.ivanov@cambridgegreys.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: David Hildenbrand (Red Hat) <david@kernel.org>
Cc: Dev Jain <dev.jain@arm.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Johannes Berg <johannes@sipsolutions.net>
Cc: Lance Yang <ioworker0@gmail.com>
Cc: "Liam R. Howlett" <Liam.Howlett@oracle.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Magnus Lindholm <linmag7@gmail.com>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Richard Henderson <richard.henderson@linaro.org>
Cc: Richard Weinberger <richard@nod.at>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Thomas Gleixner <tglx@kernel.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: WANG Xuerui <kernel@xen0n.name>
Cc: Wei Yang <richard.weiyang@gmail.com>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-02-06 15:47:18 -08:00