Commit Graph

120 Commits

Author SHA1 Message Date
Sourabh Jain
b3a97f9484 powerpc/kdump: fix KASAN sanitization flag for core_$(BITS).o
KASAN instrumentation is intended to be disabled for the kexec core
code, but the existing Makefile entry misses the object suffix. As a
result, the flag is not applied correctly to core_$(BITS).o.

So when KASAN is enabled, kexec_copy_flush and copy_segments in
kexec/core_64.c are instrumented, which can result in accesses to
shadow memory via normal address translation paths. Since these run
with the MMU disabled, such accesses may trigger page faults
(bad_page_fault) that cannot be handled in the kdump path, ultimately
causing a hang and preventing the kdump kernel from booting. The same
is true for kexec as well, since the same functions are used there.

Update the entry to include the “.o” suffix so that KASAN
instrumentation is properly disabled for this object file.

Fixes: 2ab2d5794f ("powerpc/kasan: Disable address sanitization in kexec paths")
Reported-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com>
Closes: https://lore.kernel.org/all/1dee8891-8bcc-46b4-93f3-fc3a774abd5b@linux.ibm.com/
Cc: stable@vger.kernel.org
Reviewed-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
Tested-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com>
Acked-by: Mahesh Salgaonkar <mahesh@linux.ibm.com>
Reviewed-by: Aboorva Devarajan <aboorvad@linux.ibm.com>
Tested-by: Aboorva Devarajan <aboorvad@linux.ibm.com>
Signed-off-by: Sourabh Jain <sourabhjain@linux.ibm.com>
Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Link: https://patch.msgid.link/20260407124349.1698552-1-sourabhjain@linux.ibm.com
2026-05-06 07:31:27 +05:30
Linus Torvalds
440d6635b2 mm.git review status for linus..mm-nonmm-stable
Total patches:       126
 Reviews/patch:       0.92
 Reviewed rate:       76%
 
 - The 2 patch series "pid: make sub-init creation retryable" from Oleg
   Nesterov increases the robustness of our creation of init in a new
   namespace.  By clearing away some historical cruft which is no longer
   needed.  Also some documentation fixups are provided.
 
 - The 2 patch series "selftests/fchmodat2: Error handling and general"
   from Mark Brown has a fixup and a cleanup for the fchmodat2() syscall
   selftest.
 
 - The 3 patch series "lib: polynomial: Move to math/ and clean up" from
   Andy Shevchenko does as advertised.
 
 - The 3 patch series "hung_task: Provide runtime reset interface for
   hung task detector" from Aaron Tomlin gives administrators the ability
   to zero out /proc/sys/kernel/hung_task_detect_count.
 
 - The 2 patch series "tools/getdelays: use the static UAPI headers from
   tools/include/uapi" from Thomas Weißschuh teaches getdelays to use the
   in-kernel UAPI headers rather than the system-provided ones.
 
 - The 5 patch series "watchdog/hardlockup: Improvements to hardlockup"
   from Mayank Rungta provides several cleanups and fixups to the
   hardlockup detector code and its documentation.
 
 - The 2 patch series "lib/bch: fix undefined behavior from signed
   left-shifts" from Josh Law provides a couple of small/theoretical fixes
   in the bch code.
 
 - The 2 patch series "ocfs2/dlm: fix two bugs in dlm_match_regions()"
   from Junrui Luo does what is claims.
 
 - The 27 patch series "cleanup the RAID5 XOR library" from Christoph
   Hellwig is a quite far-reaching cleanup to this code.  I can't do better
   than to quote Christoph:
 
     The XOR library used for the RAID5 parity is a bit of a mess right
     now.  The main file sits in crypto/ despite not being cryptography and
     not using the crypto API, with the generic implementations sitting in
     include/asm-generic and the arch implementations sitting in an asm/
     header in theory.  The latter doesn't work for many cases, so
     architectures often build the code directly into the core kernel, or
     create another module for the architecture code.
 
     Change this to a single module in lib/ that also contains the
     architecture optimizations, similar to the library work Eric Biggers
     has done for the CRC and crypto libraries later.  After that it
     changes to better calling conventions that allow for smarter
     architecture implementations (although none is contained here yet),
     and uses static_call to avoid indirection function call overhead.
 
 - The 2 patch series "lib/list_sort: Clean up list_sort() scheduling
   workarounds" from Kuan-Wei Chiu cleans up this library code by removing
   a hacky thing which was added for UBIFS, which UBIFS doesn't actually
   need.
 
 - The 5 patch series "Fix bugs in extract_iter_to_sg()" from Christian
   Ehrhardt fixes a few bugs in the scatterlist code, adds in-kernel tests
   for the now-fixed bugs and fixes a leak in the test itself.
 
 - The 3 patch series "kdump: Enable LUKS-encrypted dump target support
   in ARM64 and PowerPC" from Coiby Xu eenables support of the
   LUKS-encrypted device dump target on arm64 and powerpc.
 
 - The 4 patch series "ocfs2: consolidate extent list validation into
   block read callbacks" from Joseph Qi addresses ocfs2's validation of
   extent list fields - cleanup, simplification, robustness.  (Kernel test
   robot loves mounting corrupted fs images!)
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCad90rQAKCRDdBJ7gKXxA
 jl7rAQD4/Rq7ZSSnEv6FS4gOwc3MgTdWcZZaXkqL1KiWyYhRwAEA+cVCO344+AKb
 znBOjet/hUr+/kBwyViifiC8LHzchwM=
 =Nfnf
 -----END PGP SIGNATURE-----

Merge tag 'mm-nonmm-stable-2026-04-15-04-20' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull non-MM updates from Andrew Morton:

 - "pid: make sub-init creation retryable" (Oleg Nesterov)

   Make creation of init in a new namespace more robust by clearing away
   some historical cruft which is no longer needed. Also some
   documentation fixups

 - "selftests/fchmodat2: Error handling and general" (Mark Brown)

   Fix and a cleanup for the fchmodat2() syscall selftest

 - "lib: polynomial: Move to math/ and clean up" (Andy Shevchenko)

 - "hung_task: Provide runtime reset interface for hung task detector"
   (Aaron Tomlin)

   Give administrators the ability to zero out
   /proc/sys/kernel/hung_task_detect_count

 - "tools/getdelays: use the static UAPI headers from
   tools/include/uapi" (Thomas Weißschuh)

   Teach getdelays to use the in-kernel UAPI headers rather than the
   system-provided ones

 - "watchdog/hardlockup: Improvements to hardlockup" (Mayank Rungta)

   Several cleanups and fixups to the hardlockup detector code and its
   documentation

 - "lib/bch: fix undefined behavior from signed left-shifts" (Josh Law)

   A couple of small/theoretical fixes in the bch code

 - "ocfs2/dlm: fix two bugs in dlm_match_regions()" (Junrui Luo)

 - "cleanup the RAID5 XOR library" (Christoph Hellwig)

   A quite far-reaching cleanup to this code. I can't do better than to
   quote Christoph:

     "The XOR library used for the RAID5 parity is a bit of a mess right
      now. The main file sits in crypto/ despite not being cryptography
      and not using the crypto API, with the generic implementations
      sitting in include/asm-generic and the arch implementations
      sitting in an asm/ header in theory. The latter doesn't work for
      many cases, so architectures often build the code directly into
      the core kernel, or create another module for the architecture
      code.

      Change this to a single module in lib/ that also contains the
      architecture optimizations, similar to the library work Eric
      Biggers has done for the CRC and crypto libraries later. After
      that it changes to better calling conventions that allow for
      smarter architecture implementations (although none is contained
      here yet), and uses static_call to avoid indirection function call
      overhead"

 - "lib/list_sort: Clean up list_sort() scheduling workarounds"
   (Kuan-Wei Chiu)

   Clean up this library code by removing a hacky thing which was added
   for UBIFS, which UBIFS doesn't actually need

 - "Fix bugs in extract_iter_to_sg()" (Christian Ehrhardt)

   Fix a few bugs in the scatterlist code, add in-kernel tests for the
   now-fixed bugs and fix a leak in the test itself

 - "kdump: Enable LUKS-encrypted dump target support in ARM64 and
   PowerPC" (Coiby Xu)

   Enable support of the LUKS-encrypted device dump target on arm64 and
   powerpc

 - "ocfs2: consolidate extent list validation into block read callbacks"
   (Joseph Qi)

   Cleanup, simplify, and make more robust ocfs2's validation of extent
   list fields (Kernel test robot loves mounting corrupted fs images!)

* tag 'mm-nonmm-stable-2026-04-15-04-20' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (127 commits)
  ocfs2: validate group add input before caching
  ocfs2: validate bg_bits during freefrag scan
  ocfs2: fix listxattr handling when the buffer is full
  doc: watchdog: fix typos etc
  update Sean's email address
  ocfs2: use get_random_u32() where appropriate
  ocfs2: split transactions in dio completion to avoid credit exhaustion
  ocfs2: remove redundant l_next_free_rec check in __ocfs2_find_path()
  ocfs2: validate extent block list fields during block read
  ocfs2: remove empty extent list check in ocfs2_dx_dir_lookup_rec()
  ocfs2: validate dx_root extent list fields during block read
  ocfs2: fix use-after-free in ocfs2_fault() when VM_FAULT_RETRY
  ocfs2: handle invalid dinode in ocfs2_group_extend
  .get_maintainer.ignore: add Askar
  ocfs2: validate bg_list extent bounds in discontig groups
  checkpatch: exclude forward declarations of const structs
  tools/accounting: handle truncated taskstats netlink messages
  taskstats: set version in TGID exit notifications
  ocfs2/heartbeat: fix slot mapping rollback leaks on error paths
  arm64,ppc64le/kdump: pass dm-crypt keys to kdump kernel
  ...
2026-04-16 20:11:56 -07:00
Coiby Xu
e3a84be1ec arm64,ppc64le/kdump: pass dm-crypt keys to kdump kernel
CONFIG_CRASH_DM_CRYPT has been introduced to support LUKS-encrypted
device dump target by addressing two challenges [1],
 - Kdump kernel may not be able to decrypt the LUKS partition. For some
   machines, a system administrator may not have a chance to enter the
   password to decrypt the device in kdump initramfs after the 1st kernel
   crashes

 - LUKS2 by default use the memory-hard Argon2 key derivation function
   which is quite memory-consuming compared to the limited memory reserved
   for kdump.

To also enable this feature for ARM64 and PowerPC, the missing piece is to
let the kdump kernel know where to find the dm-crypt keys which are
randomly stored in memory reserved for kdump.  Introduce a new device tree
property dmcryptkeys [2] as similar to elfcorehdr to pass the memory
address of the stored info of dm-crypt keys to the kdump kernel.  Since
this property is only needed by the kdump kernel, it won't be exposed to
userspace.

Link: https://lkml.kernel.org/r/20260225060347.718905-4-coxu@redhat.com
Link: https://lore.kernel.org/all/20250502011246.99238-1-coxu@redhat.com/ [1]
Link: https://github.com/devicetree-org/dt-schema/pull/181 [2]
Signed-off-by: Coiby Xu <coxu@redhat.com>
Acked-by: Rob Herring (Arm) <robh@kernel.org>
Reviewed-by: Sourabh Jain <sourabhjain@linux.ibm.com>
Cc: Arnaud Lefebvre <arnaud.lefebvre@clever-cloud.com>
Cc: Baoquan he <bhe@redhat.com>
Cc: Dave Young <dyoung@redhat.com>
Cc: Kairui Song <ryncsn@gmail.com>
Cc: Pingfan Liu <kernelfans@gmail.com>
Cc: Krzysztof Kozlowski <krzk@kernel.org>
Cc: Thomas Staudt <tstaudt@de.ibm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Christophe Leroy (CS GROUP) <chleroy@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-04-02 23:36:24 -07:00
Sourabh Jain
f53b24d1fa powerpc/crash: Update backup region offset in elfcorehdr on memory hotplug
When elfcorehdr is prepared for kdump, the program header representing
the first 64 KB of memory is expected to have its offset point to the
backup region. This is required because purgatory copies the first 64 KB
of the crashed kernel memory to this backup region following a kernel
crash. This allows the capture kernel to use the first 64 KB of memory
to place the exception vectors and other required data.

When elfcorehdr is recreated due to memory hotplug, the offset of
the program header representing the first 64 KB is not updated.
As a result, the capture kernel exports the first 64 KB at offset
0, even though the data actually resides in the backup region.

Fix this by calling sync_backup_region_phdr() to update the program
header offset in the elfcorehdr created during memory hotplug.

sync_backup_region_phdr() works for images loaded via the
kexec_file_load syscall. However, it does not work for kexec_load,
because image->arch.backup_start is not initialized in that case.
So introduce machine_kexec_post_load() to process the elfcorehdr
prepared by kexec-tools and initialize image->arch.backup_start for
kdump images loaded via kexec_load syscall.

Rename update_backup_region_phdr() to sync_backup_region_phdr() and
extend it to synchronize the backup region offset between the kdump
image and the ELF core header. The helper now supports updating either
the kdump image from the ELF program header or updating the ELF program
header from the kdump image, avoiding code duplication.

Define ARCH_HAS_KIMAGE_ARCH and struct kimage_arch when
CONFIG_KEXEC_FILE or CONFIG_CRASH_DUMP is enabled so that
kimage->arch.backup_start is available with the kexec_load system call.

This patch depends on the patch titled
"powerpc/crash: fix backup region offset update to elfcorehdr".

Fixes: 849599b702 ("powerpc/crash: add crash memory hotplug support")
Reviewed-by: Aditya Gupta <adityag@linux.ibm.com>
Signed-off-by: Sourabh Jain <sourabhjain@linux.ibm.com>
Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Link: https://patch.msgid.link/20260312083051.1935737-3-sourabhjain@linux.ibm.com
2026-04-01 09:21:04 +05:30
Sourabh Jain
789335cacd powerpc/crash: fix backup region offset update to elfcorehdr
update_backup_region_phdr() in file_load_64.c iterates over all the
program headers in the kdump kernel’s elfcorehdr and updates the
p_offset of the program header whose physical address starts at 0.

However, the loop logic is incorrect because the program header pointer
is not updated during iteration. Since elfcorehdr typically contains
PT_NOTE entries first, the PT_LOAD program header with physical address
0 is never reached. As a result, its p_offset is not updated to point to
the backup region.

Because of this behavior, the capture kernel exports the first 64 KB of
the crashed kernel’s memory at offset 0, even though that memory
actually lives in the backup region. When a crash happens, purgatory
copies the first 64 KB of the crashed kernel’s memory into the backup
region so the capture kernel can safely use it.

This has not caused problems so far because the first 64 KB is usually
identical in both the crashed and capture kernels. However, this is
just an assumption and is not guaranteed to always hold true.

Fix update_backup_region_phdr() to correctly update the p_offset of the
program header with a starting physical address of 0 by correcting the
logic used to iterate over the program headers.

Fixes: cb350c1f1f ("powerpc/kexec_file: Prepare elfcore header for crashing kernel")
Reviewed-by: Aditya Gupta <adityag@linux.ibm.com>
Signed-off-by: Sourabh Jain <sourabhjain@linux.ibm.com>
Reviewed-by: Hari Bathini <hbathini@linux.ibm.com>
Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Link: https://patch.msgid.link/20260312083051.1935737-2-sourabhjain@linux.ibm.com
2026-04-01 09:21:04 +05:30
Sourabh Jain
04e707cb77 powerpc/crash: adjust the elfcorehdr size
With crash hotplug support enabled, additional memory is allocated to
the elfcorehdr kexec segment to accommodate resources added during
memory hotplug events. However, the kdump FDT is not updated with the
same size, which can result in elfcorehdr corruption in the kdump
kernel.

Update elf_headers_sz (the kimage member representing the size of the
elfcorehdr kexec segment) to reflect the total memory allocated for the
elfcorehdr segment instead of the elfcorehdr buffer size at the time of
kdump load. This allows of_kexec_alloc_and_setup_fdt() to reserve the
full elfcorehdr memory in the kdump FDT and prevents elfcorehdr
corruption.

Fixes: 849599b702 ("powerpc/crash: add crash memory hotplug support")
Reviewed-by: Hari Bathini <hbathini@linux.ibm.com>
Signed-off-by: Sourabh Jain <sourabhjain@linux.ibm.com>
Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Link: https://patch.msgid.link/20260227171801.2238847-1-sourabhjain@linux.ibm.com
2026-03-04 11:12:48 +05:30
Sourabh Jain
20197b967a powerpc/kexec/core: use big-endian types for crash variables
Use explicit word-sized big-endian types for kexec and crash related
variables. This makes the endianness unambiguous and avoids type
mismatches that trigger sparse warnings.

The change addresses sparse warnings like below (seen on both 32-bit
and 64-bit builds):

CHECK   ../arch/powerpc/kexec/core.c
sparse:    expected unsigned int static [addressable] [toplevel] [usertype] crashk_base
sparse:    got restricted __be32 [usertype]
sparse: warning: incorrect type in assignment (different base types)
sparse:    expected unsigned int static [addressable] [toplevel] [usertype] crashk_size
sparse:    got restricted __be32 [usertype]
sparse: warning: incorrect type in assignment (different base types)
sparse:    expected unsigned long long static [addressable] [toplevel] mem_limit
sparse:    got restricted __be32 [usertype]
sparse: warning: incorrect type in assignment (different base types)
sparse:    expected unsigned int static [addressable] [toplevel] [usertype] kernel_end
sparse:    got restricted __be32 [usertype]

No functional change intended.

Fixes: ea961a828f ("powerpc: Fix endian issues in kexec and crash dump code")
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202512221405.VHPKPjnp-lkp@intel.com/
Signed-off-by: Sourabh Jain <sourabhjain@linux.ibm.com>
Tested-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com>
Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Link: https://patch.msgid.link/20251224151257.28672-1-sourabhjain@linux.ibm.com
2026-03-04 11:08:57 +05:30
Nysal Jan K.A.
c2296a1e42 powerpc/kexec: Enable SMT before waking offline CPUs
If SMT is disabled or a partial SMT state is enabled, when a new kernel
image is loaded for kexec, on reboot the following warning is observed:

kexec: Waking offline cpu 228.
WARNING: CPU: 0 PID: 9062 at arch/powerpc/kexec/core_64.c:223 kexec_prepare_cpus+0x1b0/0x1bc
[snip]
 NIP kexec_prepare_cpus+0x1b0/0x1bc
 LR  kexec_prepare_cpus+0x1a0/0x1bc
 Call Trace:
  kexec_prepare_cpus+0x1a0/0x1bc (unreliable)
  default_machine_kexec+0x160/0x19c
  machine_kexec+0x80/0x88
  kernel_kexec+0xd0/0x118
  __do_sys_reboot+0x210/0x2c4
  system_call_exception+0x124/0x320
  system_call_vectored_common+0x15c/0x2ec

This occurs as add_cpu() fails due to cpu_bootable() returning false for
CPUs that fail the cpu_smt_thread_allowed() check or non primary
threads if SMT is disabled.

Fix the issue by enabling SMT and resetting the number of SMT threads to
the number of threads per core, before attempting to wake up all present
CPUs.

Fixes: 38253464bc ("cpu/SMT: Create topology_smt_thread_allowed()")
Reported-by: Sachin P Bappalige <sachinpb@linux.ibm.com>
Cc: stable@vger.kernel.org # v6.6+
Reviewed-by: Srikar Dronamraju <srikar@linux.ibm.com>
Signed-off-by: Nysal Jan K.A. <nysal@linux.ibm.com>
Tested-by: Samir M <samir@linux.ibm.com>
Reviewed-by: Sourabh Jain <sourabhjain@linux.ibm.com>
Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Link: https://patch.msgid.link/20251028105516.26258-1-nysal@linux.ibm.com
2025-12-22 17:53:37 +05:30
Ben Collins
38c64dfe0a kexec: Include kernel-end even without crashkernel
Certain versions of kexec don't even work without kernel-end being
added to the device-tree. Add it even if crash-kernel is disabled.

Signed-off-by: Ben Collins <bcollins@kernel.org>
Reviewed-by: Sourabh Jain <sourabhjain@linux.ibm.com>
Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Link: https://patch.msgid.link/2025042122-inescapable-mandrill-8a5ff2@boujee-and-buff
2025-11-11 14:34:15 +05:30
Sourabh Jain
7afe2383ef powerpc/kdump: Fix size calculation for hot-removed memory ranges
The elfcorehdr segment in the kdump image stores information about the
memory regions (called crash memory ranges) that the kdump kernel must
capture.

When a memory hot-remove event occurs, the kernel regenerates the
elfcorehdr for the currently loaded kdump image to remove the
hot-removed memory from the crash memory ranges.

Call chain:
remove_mem_range()
update_crash_elfcorehdr()
arch_crash_handle_hotplug_event()
crash_handle_hotplug_event()

While removing the hot-removed memory from the crash memory ranges in
remove_mem_range(), if the removed memory lies within an existing crash
range, that range is split into two. During this split, the size of the
second range was being calculated incorrectly.

This leads to dump capture failure with makedumpfile with below error:

$ makedumpfile -l -d 31 /proc/vmcore /tmp/vmcore

readpage_elf: Attempt to read non-existent page at 0xbbdab0000.
readmem: type_addr: 0, addr:c000000bbdab7f00, size:16
validate_mem_section: Can't read mem_section array.
readpage_elf: Attempt to read non-existent page at 0xbbdab0000.
readmem: type_addr: 0, addr:c000000bbdab7f00, size:8
get_mm_sparsemem: Can't get the address of mem_section.

The updated crash memory range in PT_LOAD entry is holding incorrect
data (checkout FileSiz and MemSiz):

readelf -a /proc/vmcore
<snip...>
  Type           Offset             VirtAddr           PhysAddr
                 FileSiz            MemSiz              Flags  Align
  LOAD           0x0000000b013d0000 0xc000000b80000000 0x0000000b80000000
                 0xffffffffc0000000 0xffffffffc0000000  RWE    0x0
<snip...>

Update the size calculation for the new crash memory range to fix this
issue.

Note: This problem will not occur if the kdump kernel is loaded or
reloaded after a memory hot-remove operation.

Fixes: 849599b702 ("powerpc/crash: add crash memory hotplug support")
Reported-by: Shirisha G <shirisha@linux.ibm.com>
Signed-off-by: Sourabh Jain <sourabhjain@linux.ibm.com>
Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Link: https://patch.msgid.link/20251105033941.1752287-1-sourabhjain@linux.ibm.com
2025-11-11 14:11:55 +05:30
Sourabh Jain
b4a96ab50f powerpc/kdump: Add support for crashkernel CMA reservation
Commit 35c18f2933 ("Add a new optional ",cma" suffix to the
crashkernel= command line option") and commit ab475510e0 ("kdump:
implement reserve_crashkernel_cma") added CMA support for kdump
crashkernel reservation.

Extend crashkernel CMA reservation support to powerpc.

The following changes are made to enable CMA reservation on powerpc:

- Parse and obtain the CMA reservation size along with other crashkernel
  parameters
- Call reserve_crashkernel_cma() to allocate the CMA region for kdump
- Include the CMA-reserved ranges in the usable memory ranges for the
  kdump kernel to use.
- Exclude the CMA-reserved ranges from the crash kernel memory to
  prevent them from being exported through /proc/vmcore.

With the introduction of the CMA crashkernel regions,
crash_exclude_mem_range() needs to be called multiple times to exclude
both crashk_res and crashk_cma_ranges from the crash memory ranges. To
avoid repetitive logic for validating mem_ranges size and handling
reallocation when required, this functionality is moved to a new wrapper
function crash_exclude_mem_range_guarded().

To ensure proper CMA reservation, reserve_crashkernel_cma() is called
after pageblock_order is initialized.

Update kernel-parameters.txt to document CMA support for crashkernel on
powerpc architecture.

Signed-off-by: Sourabh Jain <sourabhjain@linux.ibm.com>
Reviewed-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Link: https://patch.msgid.link/20251107080334.708028-1-sourabhjain@linux.ibm.com
2025-11-11 14:11:08 +05:30
Jiri Bohac
35c18f2933 Add a new optional ",cma" suffix to the crashkernel= command line option
Patch series "kdump: crashkernel reservation from CMA", v5.

This series implements a way to reserve additional crash kernel memory
using CMA.

Currently, all the memory for the crash kernel is not usable by the 1st
(production) kernel.  It is also unmapped so that it can't be corrupted by
the fault that will eventually trigger the crash.  This makes sense for
the memory actually used by the kexec-loaded crash kernel image and initrd
and the data prepared during the load (vmcoreinfo, ...).  However, the
reserved space needs to be much larger than that to provide enough
run-time memory for the crash kernel and the kdump userspace.  Estimating
the amount of memory to reserve is difficult.  Being too careful makes
kdump likely to end in OOM, being too generous takes even more memory from
the production system.  Also, the reservation only allows reserving a
single contiguous block (or two with the "low" suffix).  I've seen systems
where this fails because the physical memory is fragmented.

By reserving additional crashkernel memory from CMA, the main crashkernel
reservation can be just large enough to fit the kernel and initrd image,
minimizing the memory taken away from the production system.  Most of the
run-time memory for the crash kernel will be memory previously available
to userspace in the production system.  As this memory is no longer
wasted, the reservation can be done with a generous margin, making kdump
more reliable.  Kernel memory that we need to preserve for dumping is
normally not allocated from CMA, unless it is explicitly allocated as
movable.  Currently this is only the case for memory ballooning and zswap.
Such movable memory will be missing from the vmcore.  User data is
typically not dumped by makedumpfile.  When dumping of user data is
intended this new CMA reservation cannot be used.

There are five patches in this series:

The first adds a new ",cma" suffix to the recenly introduced generic
crashkernel parsing code.  parse_crashkernel() takes one more argument to
store the cma reservation size.

The second patch implements reserve_crashkernel_cma() which performs the
reservation.  If the requested size is not available in a single range,
multiple smaller ranges will be reserved.

The third patch updates Documentation/, explicitly mentioning the
potential DMA corruption of the CMA-reserved memory.

The fourth patch adds a short delay before booting the kdump kernel,
allowing pending DMA transfers to finish.

The fifth patch enables the functionality for x86 as a proof of
concept. There are just three things every arch needs to do:
- call reserve_crashkernel_cma()
- include the CMA-reserved ranges in the physical memory map
- exclude the CMA-reserved ranges from the memory available
  through /proc/vmcore by excluding them from the vmcoreinfo
  PT_LOAD ranges.

Adding other architectures is easy and I can do that as soon as this
series is merged.

With this series applied, specifying
	crashkernel=100M craskhernel=1G,cma
on the command line will make a standard crashkernel reservation
of 100M, where kexec will load the kernel and initrd.

An additional 1G will be reserved from CMA, still usable by the production
system.  The crash kernel will have 1.1G memory available.  The 100M can
be reliably predicted based on the size of the kernel and initrd.

The new cma suffix is completely optional. When no
crashkernel=size,cma is specified, everything works as before.


This patch (of 5):

Add a new cma_size parameter to parse_crashkernel().  When not NULL, call
__parse_crashkernel to parse the CMA reservation size from
"crashkernel=size,cma" and store it in cma_size.

Set cma_size to NULL in all calls to parse_crashkernel().

Link: https://lkml.kernel.org/r/aEqnxxfLZMllMC8I@dwarf.suse.cz
Link: https://lkml.kernel.org/r/aEqoQckgoTQNULnh@dwarf.suse.cz
Signed-off-by: Jiri Bohac <jbohac@suse.cz>
Cc: Baoquan He <bhe@redhat.com>
Cc: Dave Young <dyoung@redhat.com>
Cc: Donald Dutile <ddutile@redhat.com>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Philipp Rudo <prudo@redhat.com>
Cc: Pingfan Liu <piliu@redhat.com>
Cc: Tao Liu <ltao@redhat.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: David Hildenbrand <david@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-07-19 19:08:22 -07:00
Eddie James
882b25af26 powerpc/crash: Fix non-smp kexec preparation
In non-smp configurations, crash_kexec_prepare is never called in
the crash shutdown path. One result of this is that the crashing_cpu
variable is never set, preventing crash_save_cpu from storing the
NT_PRSTATUS elf note in the core dump.

Fixes: c7255058b5 ("powerpc/crash: save cpu register data in crash_smp_send_stop()")
Signed-off-by: Eddie James <eajames@linux.ibm.com>
Reviewed-by: Hari Bathini <hbathini@linux.ibm.com>
Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Link: https://patch.msgid.link/20250211162054.857762-1-eajames@linux.ibm.com
2025-04-29 11:40:46 +05:30
Linus Torvalds
d6b02199cd - The 7 patch series "powerpc/crash: use generic crashkernel
reservation" from Sourabh Jain changes powerpc's kexec code to use more
   of the generic layers.
 
 - The 2 patch series "get_maintainer: report subsystem status
   separately" from Vlastimil Babka makes some long-requested improvements
   to the get_maintainer output.
 
 - The 4 patch series "ucount: Simplify refcounting with rcuref_t" from
   Sebastian Siewior cleans up and optimizing the refcounting in the ucount
   code.
 
 - The 12 patch series "reboot: support runtime configuration of
   emergency hw_protection action" from Ahmad Fatoum improves the ability
   for a driver to perform an emergency system shutdown or reboot.
 
 - The 16 patch series "Converge on using secs_to_jiffies() part two"
   from Easwar Hariharan performs further migrations from
   msecs_to_jiffies() to secs_to_jiffies().
 
 - The 7 patch series "lib/interval_tree: add some test cases and
   cleanup" from Wei Yang permits more userspace testing of kernel library
   code, adds some more tests and performs some cleanups.
 
 - The 2 patch series "hung_task: Dump the blocking task stacktrace" from
   Masami Hiramatsu arranges for the hung_task detector to dump the stack
   of the blocking task and not just that of the blocked task.
 
 - The 4 patch series "resource: Split and use DEFINE_RES*() macros" from
   Andy Shevchenko provides some cleanups to the resource definition
   macros.
 
 - Plus the usual shower of singleton patches - please see the individual
   changelogs for details.
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCZ+nuqwAKCRDdBJ7gKXxA
 jtNqAQDxqJpjWkzn4yN9CNSs1ivVx3fr6SqazlYCrt3u89WQvwEA1oRrGpETzUGq
 r6khQUIcQImPPcjFqEFpuiSOU0MBZA0=
 =Kii8
 -----END PGP SIGNATURE-----

Merge tag 'mm-nonmm-stable-2025-03-30-18-23' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull non-MM updates from Andrew Morton:

 - The series "powerpc/crash: use generic crashkernel reservation" from
   Sourabh Jain changes powerpc's kexec code to use more of the generic
   layers.

 - The series "get_maintainer: report subsystem status separately" from
   Vlastimil Babka makes some long-requested improvements to the
   get_maintainer output.

 - The series "ucount: Simplify refcounting with rcuref_t" from
   Sebastian Siewior cleans up and optimizing the refcounting in the
   ucount code.

 - The series "reboot: support runtime configuration of emergency
   hw_protection action" from Ahmad Fatoum improves the ability for a
   driver to perform an emergency system shutdown or reboot.

 - The series "Converge on using secs_to_jiffies() part two" from Easwar
   Hariharan performs further migrations from msecs_to_jiffies() to
   secs_to_jiffies().

 - The series "lib/interval_tree: add some test cases and cleanup" from
   Wei Yang permits more userspace testing of kernel library code, adds
   some more tests and performs some cleanups.

 - The series "hung_task: Dump the blocking task stacktrace" from Masami
   Hiramatsu arranges for the hung_task detector to dump the stack of
   the blocking task and not just that of the blocked task.

 - The series "resource: Split and use DEFINE_RES*() macros" from Andy
   Shevchenko provides some cleanups to the resource definition macros.

 - Plus the usual shower of singleton patches - please see the
   individual changelogs for details.

* tag 'mm-nonmm-stable-2025-03-30-18-23' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (77 commits)
  mailmap: consolidate email addresses of Alexander Sverdlin
  fs/procfs: fix the comment above proc_pid_wchan()
  relay: use kasprintf() instead of fixed buffer formatting
  resource: replace open coded variant of DEFINE_RES()
  resource: replace open coded variants of DEFINE_RES_*_NAMED()
  resource: replace open coded variant of DEFINE_RES_NAMED_DESC()
  resource: split DEFINE_RES_NAMED_DESC() out of DEFINE_RES_NAMED()
  samples: add hung_task detector mutex blocking sample
  hung_task: show the blocker task if the task is hung on mutex
  kexec_core: accept unaccepted kexec segments' destination addresses
  watchdog/perf: optimize bytes copied and remove manual NUL-termination
  lib/interval_tree: fix the comment of interval_tree_span_iter_next_gap()
  lib/interval_tree: skip the check before go to the right subtree
  lib/interval_tree: add test case for span iteration
  lib/interval_tree: add test case for interval_tree_iter_xxx() helpers
  lib/rbtree: add random seed
  lib/rbtree: split tests
  lib/rbtree: enable userland test suite for rbtree related data structure
  checkpatch: describe --min-conf-desc-length
  scripts/gdb/symbols: determine KASLR offset on s390
  ...
2025-04-01 10:06:52 -07:00
Sourabh Jain
e3185ee438 powerpc/crash: use generic crashkernel reservation
Commit 0ab97169aa ("crash_core: add generic function to do reservation")
added a generic function to reserve crashkernel memory.  So let's use the
same function on powerpc and remove the architecture-specific code that
essentially does the same thing.

The generic crashkernel reservation also provides a way to split the
crashkernel reservation into high and low memory reservations, which can
be enabled for powerpc in the future.

Along with moving to the generic crashkernel reservation, the code related
to finding the base address for the crashkernel has been separated into
its own function name get_crash_base() for better readability and
maintainability.

Link: https://lkml.kernel.org/r/20250131113830.925179-8-sourabhjain@linux.ibm.com
Signed-off-by: Sourabh Jain <sourabhjain@linux.ibm.com>
Reviewed-by: Mahesh Salgaonkar <mahesh@linux.ibm.com>
Acked-by: Hari Bathini <hbathini@linux.ibm.com>
Cc: Baoquan he <bhe@redhat.com>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-03-16 22:30:48 -07:00
Sourabh Jain
021b5b303c powerpc/crash: preserve user-specified memory limit
Commit 59d58189f3 ("crash: fix crash memory reserve exceed system memory
bug") fails crashkernel parsing if the crash size is found to be higher
than system RAM, which makes the memory_limit adjustment code ineffective
due to an early exit from reserve_crashkernel().

Regardless lets not violate the user-specified memory limit by adjusting
it.  Remove this adjustment to ensure all reservations stay within the
limit.  Commit f94f5ac079 ("powerpc/fadump: Don't update the
user-specified memory limit") did the same for fadump.

Link: https://lkml.kernel.org/r/20250131113830.925179-6-sourabhjain@linux.ibm.com
Signed-off-by: Sourabh Jain <sourabhjain@linux.ibm.com>
Reviewed-by: Mahesh Salgaonkar <mahesh@linux.ibm.com>
Acked-by: Hari Bathini <hbathini@linux.ibm.com>
Cc: Baoquan he <bhe@redhat.com>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-03-16 22:30:48 -07:00
Sourabh Jain
6e5250eaa6 powerpc/crash: use generic APIs to locate memory hole for kdump
On PowerPC, the memory reserved for the crashkernel can contain components
like RTAS, TCE, OPAL, etc., which should be avoided when loading kexec
segments into crashkernel memory.  Due to these special components,
PowerPC has its own set of APIs to locate holes in the crashkernel memory
for loading kexec segments for kdump.  However, for loading kexec segments
in the kexec case, PowerPC already uses generic APIs to locate holes.

The previous patch in this series, titled "crash: Let arch decide usable
memory range in reserved area," introduced arch-specific hook to handle
such special regions in the crashkernel area.  So, switch PowerPC to use
the generic APIs to locate memory holes for kdump and remove the redundant
PowerPC-specific APIs.

Link: https://lkml.kernel.org/r/20250131113830.925179-5-sourabhjain@linux.ibm.com
Signed-off-by: Sourabh Jain <sourabhjain@linux.ibm.com>
Cc: Baoquan he <bhe@redhat.com>
Cc: Hari Bathini <hbathini@linux.ibm.com>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Mahesh Salgaonkar <mahesh@linux.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-03-16 22:30:48 -07:00
Christophe Leroy
861efb8a48 powerpc/kexec: fix physical address calculation in clear_utlb_entry()
In relocate_32.S, function clear_utlb_entry() goes into real mode. To
do so, it has to calculate the physical address based on the virtual
address. To get the virtual address it uses 'bl' which is problematic
(see commit c974809a26 ("powerpc/vdso: Avoid link stack corruption
in __get_datapage()")). In addition, the calculation is done on a
wrong address because 'bl' loads LR with the address of the following
instruction, not the address of the target. So when the target is not
the instruction following the 'bl' instruction, it may lead to
unexpected behaviour.

Fix it by re-writing the code so that is goes via another path which
is based 'bcl 20,31,.+4' which is the right instruction to use for that.

Fixes: 6834302003 ("powerpc/47x: Kernel support for KEXEC")
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Link: https://patch.msgid.link/dc4f9616fba9c05c5dbf9b4b5480eb1c362adc17.1741256651.git.christophe.leroy@csgroup.eu
2025-03-10 10:10:55 +05:30
Eliav Farber
bad6722e47 kexec: Consolidate machine_kexec_mask_interrupts() implementation
Consolidate the machine_kexec_mask_interrupts implementation into a common
function located in a new file: kernel/irq/kexec.c. This removes duplicate
implementations from architecture-specific files in arch/arm, arch/arm64,
arch/powerpc, and arch/riscv, reducing code duplication and improving
maintainability.

The new implementation retains architecture-specific behavior for
CONFIG_GENERIC_IRQ_KEXEC_CLEAR_VM_FORWARD, which was previously implemented
for ARM64. When enabled (currently for ARM64), it clears the active state
of interrupts forwarded to virtual machines (VMs) before handling other
interrupt masking operations.

Signed-off-by: Eliav Farber <farbere@amazon.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/all/20241204142003.32859-2-farbere@amazon.com
2024-12-11 20:32:34 +01:00
Zhang Zekun
83b5a407fb powerpc/kexec: Fix return of uninitialized variable
of_property_read_u64() can fail and leave the variable uninitialized,
which will then be used. Return error if reading the property failed.

Fixes: 2e6bd221d9 ("powerpc/kexec_file: Enable early kernel OPAL calls")
Signed-off-by: Zhang Zekun <zhangzekun11@huawei.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://patch.msgid.link/20240930075628.125138-1-zhangzekun11@huawei.com
2024-11-15 00:29:46 +11:00
Linus Torvalds
3c3ff7be97 powerpc updates for 6.11
- Remove support for 40x CPUs & platforms.
 
  - Add support to the 64-bit BPF JIT for cpu v4 instructions.
 
  - Fix PCI hotplug driver crash on powernv.
 
  - Fix doorbell emulation for KVM on PAPR guests (nestedv2).
 
  - Fix KVM nested guest handling of some less used SPRs.
 
  - Online NUMA nodes with no CPU/memory if they have a PCI device attached.
 
  - Reduce memory overhead of enabling kfence on 64-bit Radix MMU kernels.
 
  - Reimplement the iommu table_group_ops for pseries for VFIO SPAPR TCE.
 
 Thanks to: Anjali K, Artem Savkov, Athira Rajeev, Breno Leitao, Brian King,
 Celeste Liu, Christophe Leroy, Esben Haabendal, Gaurav Batra, Gautam Menghani,
 Haren Myneni, Hari Bathini, Jeff Johnson, Krishna Kumar, Krzysztof Kozlowski,
 Nathan Lynch, Nicholas Piggin, Nick Bowler, Nilay Shroff, Rob Herring (Arm),
 Shawn Anastasio, Shivaprasad G Bhat, Sourabh Jain, Srikar Dronamraju, Timothy
 Pearson, Uwe Kleine-König, Vaibhav Jain.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCAAxFiEEJFGtCPCthwEv2Y/bUevqPMjhpYAFAmaaUNITHG1wZUBlbGxl
 cm1hbi5pZC5hdQAKCRBR6+o8yOGlgDA+D/4o7OZ+SY0plTlMKSy3hW/SRXVj/byA
 CCKdizNY+3Rf/+K7KhuLOUPXhZOemLPE0xfKS3ND4mIEKCswzzXqmi6kjPH0qd8q
 qUhkHbt/LNpNJzZOYYw+usaklMTMdZtAl/jD9WEvGwgu2EYHgrujRIq04kEI1b0e
 OPiRnXOZcfevRBepQmYZKHvFlCRRa5vvsQcvLfY64yFqD0AsKTHgIi/48Dn33pb2
 hqHYyV1tZA3uT86Z1TgF1OG83VOSDsgc19Sb2xn14O9aJJ7lD2TOgVa4P4FfBlXA
 TXYYGQwK31ymGVWGcGfebVdC1ECeTem9n28vlk5I0NO9xNgPok/Ov4DAiZ+u1G0E
 3CXRDx9Uz2yPcGBJI2dpxfp2iw83Ad2DtBzAdukMD36xnC7xfrQz+W9SQfbcPJ8e
 I5SMAstWuLNgrX7YkjAOnXh1N41kht/mdV6KHdcMxPc7jOtAD65gUOZcgwYLeXlT
 Av17Ax0PMbiQ1BpFe2KNr/0T9Ba5k5rN7oDSKncDAq4uX8LcZKHj4bSHT9KroT1C
 q+GERspoCYp2VDMO742Jm7KTmQDHsS5y4Q+iSdOR8cQBXF613FaryDxSoJZhg2pf
 C2zIVED13RGcjIFcWlv73iA6QpBsphM+WWFz7mjULyJhxFQwm6BYt+Wy6jFu84oH
 sOgvPH8YyaK2uA==
 =eHVd
 -----END PGP SIGNATURE-----

Merge tag 'powerpc-6.11-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux

Pull powerpc updates from Michael Ellerman:

 - Remove support for 40x CPUs & platforms

 - Add support to the 64-bit BPF JIT for cpu v4 instructions

 - Fix PCI hotplug driver crash on powernv

 - Fix doorbell emulation for KVM on PAPR guests (nestedv2)

 - Fix KVM nested guest handling of some less used SPRs

 - Online NUMA nodes with no CPU/memory if they have a PCI device
   attached

 - Reduce memory overhead of enabling kfence on 64-bit Radix MMU kernels

 - Reimplement the iommu table_group_ops for pseries for VFIO SPAPR TCE

Thanks to: Anjali K, Artem Savkov, Athira Rajeev, Breno Leitao, Brian
King, Celeste Liu, Christophe Leroy, Esben Haabendal, Gaurav Batra,
Gautam Menghani, Haren Myneni, Hari Bathini, Jeff Johnson, Krishna
Kumar, Krzysztof Kozlowski, Nathan Lynch, Nicholas Piggin, Nick Bowler,
Nilay Shroff, Rob Herring (Arm), Shawn Anastasio, Shivaprasad G Bhat,
Sourabh Jain, Srikar Dronamraju, Timothy Pearson, Uwe Kleine-König, and
Vaibhav Jain.

* tag 'powerpc-6.11-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (57 commits)
  Documentation/powerpc: Mention 40x is removed
  powerpc: Remove 40x leftovers
  macintosh/therm_windtunnel: fix module unload.
  powerpc: Check only single values are passed to CPU/MMU feature checks
  powerpc/xmon: Fix disassembly CPU feature checks
  powerpc: Drop clang workaround for builtin constant checks
  powerpc64/bpf: jit support for signed division and modulo
  powerpc64/bpf: jit support for sign extended mov
  powerpc64/bpf: jit support for sign extended load
  powerpc64/bpf: jit support for unconditional byte swap
  powerpc64/bpf: jit support for 32bit offset jmp instruction
  powerpc/pci: Hotplug driver bridge support
  pci/hotplug/pnv_php: Fix hotplug driver crash on Powernv
  powerpc/configs: Update defconfig with now user-visible CONFIG_FSL_IFC
  powerpc: add missing MODULE_DESCRIPTION() macros
  macintosh/mac_hid: add MODULE_DESCRIPTION()
  KVM: PPC: add missing MODULE_DESCRIPTION() macros
  powerpc/kexec: Use of_property_read_reg()
  powerpc/64s/radix/kfence: map __kfence_pool at page granularity
  powerpc/pseries/iommu: Define spapr_tce_table_group_ops only with CONFIG_IOMMU_API
  ...
2024-07-19 21:00:33 -07:00
Rob Herring (Arm)
cf08b628cd powerpc/kexec: Use of_property_read_reg()
Replace open-coded parsing of "reg" property with
of_property_read_reg().

Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
[mpe: Rename end to size, add comment to the bail out case]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20240702210344.722364-1-robh@kernel.org
2024-07-04 21:55:18 +10:00
Nicholas Piggin
21a741eb75 powerpc/pseries: Fix scv instruction crash with kexec
kexec on pseries disables AIL (reloc_on_exc), required for scv
instruction support, before other CPUs have been shut down. This means
they can execute scv instructions after AIL is disabled, which causes an
interrupt at an unexpected entry location that crashes the kernel.

Change the kexec sequence to disable AIL after other CPUs have been
brought down.

As a refresher, the real-mode scv interrupt vector is 0x17000, and the
fixed-location head code probably couldn't easily deal with implementing
such high addresses so it was just decided not to support that interrupt
at all.

Fixes: 7fa95f9ada ("powerpc/64s: system call support for scv/rfscv instructions")
Cc: stable@vger.kernel.org # v5.9+
Reported-by: Sourabh Jain <sourabhjain@linux.ibm.com>
Closes: https://lore.kernel.org/3b4b2943-49ad-4619-b195-bc416f1d1409@linux.ibm.com
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Tested-by: Gautam Menghani <gautam@linux.ibm.com>
Tested-by: Sourabh Jain <sourabhjain@linux.ibm.com>
Link: https://msgid.link/20240625134047.298759-1-npiggin@gmail.com
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2024-06-28 22:05:44 +10:00
Sourabh Jain
932bed4121 powerpc/kexec_file: fix cpus node update to FDT
While updating the cpus node, commit 40c753993e ("powerpc/kexec_file:
 Use current CPU info while setting up FDT") first deletes all subnodes
under the /cpus node. However, while adding sub-nodes back, it missed
adding cpus subnodes whose device_type != "cpu", such as l2-cache*,
l3-cache*, ibm,powerpc-cpu-features.

Fix this by only deleting cpus sub-nodes of device_type == "cpus" and
then adding all available nodes with device_type == "cpu".

Fixes: 40c753993e ("powerpc/kexec_file: Use current CPU info while setting up FDT")
Signed-off-by: Sourabh Jain <sourabhjain@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20240510102235.2269496-3-sourabhjain@linux.ibm.com
2024-06-17 22:48:45 +10:00
Sourabh Jain
0d3ff06733 powerpc/kexec_file: fix extra size calculation for kexec FDT
While setting up the FDT for kexec, CPU nodes that are added after the
system boots and reserved memory ranges are incorporated into the
initial_boot_params (base FDT).

However, they are not taken into account when determining the additional
size needed for the kexec FDT. As a result, kexec fails to load,
generating the following error:

[1116.774451] Error updating memory reserve map: FDT_ERR_NOSPACE
kexec_file_load failed: No such process

Therefore, consider the extra size for CPU nodes added post-system boot
and reserved memory ranges while preparing the kexec FDT.

While adding a new parameter to the setup_new_fdt_ppc64 function, it was
noticed that there were a couple of unused parameters, so they were
removed.

Signed-off-by: Sourabh Jain <sourabhjain@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20240510102235.2269496-2-sourabhjain@linux.ibm.com
2024-06-17 22:48:45 +10:00
Sourabh Jain
9803af2911 powerpc/crash: remove unnecessary NULL check before kvfree()
Fix the following coccicheck build warning:

arch/powerpc/kexec/crash.c:488:2-8: WARNING: NULL check before some
freeing functions is not needed.

Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202404261048.skfV5DDB-lkp@intel.com/
Signed-off-by: Sourabh Jain <sourabhjain@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20240502182040.774759-1-sourabhjain@linux.ibm.com
2024-05-03 12:20:50 +10:00
Sourabh Jain
849599b702 powerpc/crash: add crash memory hotplug support
Extend the arch crash hotplug handler, as introduced by the patch title
("powerpc: add crash CPU hotplug support"), to also support memory
add/remove events.

Elfcorehdr describes the memory of the crash kernel to capture the
kernel; hence, it needs to be updated if memory resources change due to
memory add/remove events. Therefore, arch_crash_handle_hotplug_event()
is updated to recreate the elfcorehdr and replace it with the previous
one on memory add/remove events.

The memblock list is used to prepare the elfcorehdr. In the case of
memory hot remove, the memblock list is updated after the arch crash
hotplug handler is triggered, as depicted in Figure 1. Thus, the
hot-removed memory is explicitly removed from the crash memory ranges
to ensure that the memory ranges added to elfcorehdr do not include the
hot-removed memory.

    Memory remove
          |
          v
    Offline pages
          |
          v
 Initiate memory notify call <----> crash hotplug handler
 chain for MEM_OFFLINE event
          |
          v
 Update memblock list

 	Figure 1

There are two system calls, `kexec_file_load` and `kexec_load`, used to
load the kdump image. A few changes have been made to ensure that the
kernel can safely update the elfcorehdr component of the kdump image for
both system calls.

For the kexec_file_load syscall, kdump image is prepared in the kernel.
To support an increasing number of memory regions, the elfcorehdr is
built with extra buffer space to ensure that it can accommodate
additional memory ranges in future.

For the kexec_load syscall, the elfcorehdr is updated only if the
KEXEC_CRASH_HOTPLUG_SUPPORT kexec flag is passed to the kernel by the
kexec tool. Passing this flag to the kernel indicates that the
elfcorehdr is built to accommodate additional memory ranges and the
elfcorehdr segment is not considered for SHA calculation, making it safe
to update.

The changes related to this feature are kept under the CRASH_HOTPLUG
config, and it is enabled by default.

Signed-off-by: Sourabh Jain <sourabhjain@linux.ibm.com>
Acked-by: Hari Bathini <hbathini@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20240326055413.186534-7-sourabhjain@linux.ibm.com
2024-04-23 15:00:04 +10:00
Sourabh Jain
b741092d59 powerpc/crash: add crash CPU hotplug support
Due to CPU/Memory hotplug or online/offline events, the elfcorehdr
(which describes the CPUs and memory of the crashed kernel) and FDT
(Flattened Device Tree) of kdump image becomes outdated. Consequently,
attempting dump collection with an outdated elfcorehdr or FDT can lead
to failed or inaccurate dump collection.

Going forward, CPU hotplug or online/offline events are referred as
CPU/Memory add/remove events.

The current solution to address the above issue involves monitoring the
CPU/Memory add/remove events in userspace using udev rules and whenever
there are changes in CPU and memory resources, the entire kdump image
is loaded again. The kdump image includes kernel, initrd, elfcorehdr,
FDT, purgatory. Given that only elfcorehdr and FDT get outdated due to
CPU/Memory add/remove events, reloading the entire kdump image is
inefficient. More importantly, kdump remains inactive for a substantial
amount of time until the kdump reload completes.

To address the aforementioned issue, commit 2472627561 ("crash: add
generic infrastructure for crash hotplug support") added a generic
infrastructure that allows architectures to selectively update the kdump
image component during CPU or memory add/remove events within the kernel
itself.

In the event of a CPU or memory add/remove events, the generic crash
hotplug event handler, `crash_handle_hotplug_event()`, is triggered. It
then acquires the necessary locks to update the kdump image and invokes
the architecture-specific crash hotplug handler,
`arch_crash_handle_hotplug_event()`, to update the required kdump image
components.

This patch adds crash hotplug handler for PowerPC and enable support to
update the kdump image on CPU add/remove events. Support for memory
add/remove events is added in a subsequent patch with the title
"powerpc: add crash memory hotplug support"

As mentioned earlier, only the elfcorehdr and FDT kdump image components
need to be updated in the event of CPU or memory add/remove events.
However, on PowerPC architecture crash hotplug handler only updates the
FDT to enable crash hotplug support for CPU add/remove events. Here's
why.

The elfcorehdr on PowerPC is built with possible CPUs, and thus, it does
not need an update on CPU add/remove events. On the other hand, the FDT
needs to be updated on CPU add events to include the newly added CPU. If
the FDT is not updated and the kernel crashes on a newly added CPU, the
kdump kernel will fail to boot due to the unavailability of the crashing
CPU in the FDT. During the early boot, it is expected that the boot CPU
must be a part of the FDT; otherwise, the kernel will raise a BUG and
fail to boot. For more information, refer to commit 36ae37e343
("powerpc: Make boot_cpuid common between 32 and 64-bit"). Since it is
okay to have an offline CPU in the kdump FDT, no action is taken in case
of CPU removal.

There are two system calls, `kexec_file_load` and `kexec_load`, used to
load the kdump image. Few changes have been made to ensure kernel can
safely update the FDT of kdump image loaded using both system calls.

For kexec_file_load syscall the kdump image is prepared in kernel. So to
support an increasing number of CPUs, the FDT is constructed with extra
buffer space to ensure it can accommodate a possible number of CPU
nodes. Additionally, a call to fdt_pack (which trims the unused space
once the FDT is prepared) is avoided if this feature is enabled.

For the kexec_load syscall, the FDT is updated only if the
KEXEC_CRASH_HOTPLUG_SUPPORT kexec flag is passed to the kernel by
userspace (kexec tools). When userspace passes this flag to the kernel,
it indicates that the FDT is built to accommodate possible CPUs, and the
FDT segment is excluded from SHA calculation, making it safe to update.

The changes related to this feature are kept under the CRASH_HOTPLUG
config, and it is enabled by default.

Signed-off-by: Sourabh Jain <sourabhjain@linux.ibm.com>
Acked-by: Hari Bathini <hbathini@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20240326055413.186534-6-sourabhjain@linux.ibm.com
2024-04-23 15:00:04 +10:00
Sourabh Jain
0857beff9c powerpc/kexec: make the update_cpus_node() function public
Move the update_cpus_node() from kexec/{file_load_64.c => core_64.c}
to allow other kexec components to use it.

Later in the series, this function is used for in-kernel updates
to the kdump image during CPU/memory hotplug or online/offline events for
both kexec_load and kexec_file_load syscalls.

No functional changes are intended.

Signed-off-by: Sourabh Jain <sourabhjain@linux.ibm.com>
Acked-by: Hari Bathini <hbathini@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20240326055413.186534-5-sourabhjain@linux.ibm.com
2024-04-23 14:59:51 +10:00
Sourabh Jain
f5f0da5a7b powerpc/kexec: move *_memory_ranges functions to ranges.c
Move the following functions form kexec/{file_load_64.c => ranges.c} and
make them public so that components other than KEXEC_FILE can also use
these functions.
1. get_exclude_memory_ranges
2. get_reserved_memory_ranges
3. get_crash_memory_ranges
4. get_usable_memory_ranges

Later in the series get_crash_memory_ranges function is utilized for
in-kernel updates to kdump image during CPU/Memory hotplug or
online/offline events for both kexec_load and kexec_file_load syscalls.

Since the above functions are moved to ranges.c, some of the helper
functions in ranges.c are no longer required to be public. Mark them as
static and removed them from kexec_ranges.h header file.

Finally, remove the CONFIG_KEXEC_FILE build dependency for range.c
because it is required for other config, such as CONFIG_CRASH_DUMP.

No functional changes are intended.

Signed-off-by: Sourabh Jain <sourabhjain@linux.ibm.com>
Acked-by: Hari Bathini <hbathini@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20240326055413.186534-4-sourabhjain@linux.ibm.com
2024-04-23 14:59:01 +10:00
Hari Bathini
5c4233cc09 powerpc/kdump: Split KEXEC_CORE and CRASH_DUMP dependency
Remove CONFIG_CRASH_DUMP dependency on CONFIG_KEXEC. CONFIG_KEXEC_CORE
was used at places where CONFIG_CRASH_DUMP or CONFIG_CRASH_RESERVE was
appropriate. Replace with appropriate #ifdefs to support CONFIG_KEXEC
and !CONFIG_CRASH_DUMP configuration option. Also, make CONFIG_FA_DUMP
dependent on CONFIG_CRASH_DUMP to avoid unmet dependencies for FA_DUMP
with !CONFIG_KEXEC_CORE configuration option.

Signed-off-by: Hari Bathini <hbathini@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20240226103010.589537-4-hbathini@linux.ibm.com
2024-03-17 13:34:00 +11:00
Hari Bathini
33f2cc0a2e powerpc/kexec: split CONFIG_KEXEC_FILE and CONFIG_CRASH_DUMP
CONFIG_KEXEC_FILE does not have to select CONFIG_CRASH_DUMP. Move
some code under CONFIG_CRASH_DUMP to support CONFIG_KEXEC_FILE and
!CONFIG_CRASH_DUMP case.

Signed-off-by: Hari Bathini <hbathini@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20240226103010.589537-3-hbathini@linux.ibm.com
2024-03-17 13:34:00 +11:00
Linus Torvalds
66a27abac3 powerpc updates for 6.9
- Add AT_HWCAP3 and AT_HWCAP4 aux vector entries for future use by glibc.
 
  - Add support for recognising the Power11 architected and raw PVRs.
 
  - Add support for nr_cpus=n on the command line where the boot CPU is >= n.
 
  - Add ppcxx_allmodconfig targets for all 32-bit sub-arches.
 
  - Other small features, cleanups and fixes.
 
 Thanks to: Akanksha J N, Brian King, Christophe Leroy, Dawei Li, Geoff Levand,
 Greg Kroah-Hartman, Jan-Benedict Glaw, Kajol Jain, Kunwu Chan, Li zeming,
 Madhavan Srinivasan, Masahiro Yamada, Nathan Chancellor, Nicholas Piggin, Peter
 Bergner, Qiheng Lin, Randy Dunlap, Ricardo B. Marliere, Rob Herring, Sathvika
 Vasireddy, Shrikanth Hegde, Uwe Kleine-König, Vaibhav Jain, Wen Xiong.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCAAxFiEEJFGtCPCthwEv2Y/bUevqPMjhpYAFAmX01vgTHG1wZUBlbGxl
 cm1hbi5pZC5hdQAKCRBR6+o8yOGlgJ4bEACVsxXXjbjl+WKgWNjHsM7sVwUX/sSV
 z43iVycLPXDqochSkkgKjyIEFowaWhjgWVHFHmUXWxB5FjjFEEoH4FPo3VB0IY48
 VoSFT6PhzqXDrGmt2fWsJ+k6zUyJZa8pNS38DHg1yuuYDAa0KWxd3E/x/r0qzsbr
 vcas1uWcDWgjoUDMBuJpyx0sYTl6+mR9HlZuM4+aNQdzhTFU/jK69hAN0RFvryes
 K2/fLgI0fgLZpQDogCn4HV1/4uixi1eEFlVNXkwvMYDpQVo2FqiBaWLF0hNLWNCk
 kvm/fYIJhdFoNlp38jVKv0KJnBhW7aAs3prF+8B3YL2B23rLnvA6ZLZKHcdBAeLb
 8PJMRrbAbmVxOnVSAG0fgU+0dEdkJQ+0ABqa+usMOV7xIPg9uIui1YrKT1KVq6Fs
 KyGHM5EQuBC/P6bTsKO6X+1beY2QIfwWxaIkoo8pj6d0WU69qU4u+LzQiDO4XR0L
 UQQguB1Qo8yaip3rHXhuv0hlnMNVAVye56Zw63uq1MWGkewRKSkY91Ms02L+pXpF
 r6+96xoFB0ulKZFnyxyBdkj2iC0426fHtTiiJFfQ4R1fiibPKtAx9P59WYnqymVh
 QsSYqlgC2/jWzRgqJTweLp/XQK8fWqmFkNmCGDN1N9Sij9Xjx/8aZb5dvwJkSBnK
 rZ4ObxBoaCPbPA==
 =K9Ok
 -----END PGP SIGNATURE-----

Merge tag 'powerpc-6.9-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux

Pull powerpc updates from Michael Ellerman:

 - Add AT_HWCAP3 and AT_HWCAP4 aux vector entries for future use
   by glibc

 - Add support for recognising the Power11 architected and raw PVRs

 - Add support for nr_cpus=n on the command line where the
   boot CPU is >= n

 - Add ppcxx_allmodconfig targets for all 32-bit sub-arches

 - Other small features, cleanups and fixes

Thanks to Akanksha J N, Brian King, Christophe Leroy, Dawei Li, Geoff
Levand, Greg Kroah-Hartman, Jan-Benedict Glaw, Kajol Jain, Kunwu Chan,
Li zeming, Madhavan Srinivasan, Masahiro Yamada, Nathan Chancellor,
Nicholas Piggin, Peter Bergner, Qiheng Lin, Randy Dunlap, Ricardo B.
Marliere, Rob Herring, Sathvika Vasireddy, Shrikanth Hegde, Uwe
Kleine-König, Vaibhav Jain, and Wen Xiong.

* tag 'powerpc-6.9-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (71 commits)
  powerpc/macio: Make remove callback of macio driver void returned
  powerpc/83xx: Fix build failure with FPU=n
  powerpc/64s: Fix get_hugepd_cache_index() build failure
  powerpc/4xx: Fix warp_gpio_leds build failure
  powerpc/amigaone: Make several functions static
  powerpc/embedded6xx: Fix no previous prototype for avr_uart_send() etc.
  macintosh/adb: make adb_dev_class constant
  powerpc: xor_vmx: Add '-mhard-float' to CFLAGS
  powerpc/fsl: Fix mfpmr() asm constraint error
  powerpc: Remove cpu-as-y completely
  powerpc/fsl: Modernise mt/mfpmr
  powerpc/fsl: Fix mfpmr build errors with newer binutils
  powerpc/64s: Use .machine power4 around dcbt
  powerpc/64s: Move dcbt/dcbtst sequence into a macro
  powerpc/mm: Code cleanup for __hash_page_thp
  powerpc/hv-gpci: Fix the H_GET_PERF_COUNTER_INFO hcall return value checks
  powerpc/irq: Allow softirq to hardirq stack transition
  powerpc: Stop using of_root
  powerpc/machdep: Define 'compatibles' property in ppc_md and use it
  of: Reimplement of_machine_is_compatible() using of_machine_compatible_match()
  ...
2024-03-15 17:53:48 -07:00
Christophe Leroy
2a066ae118 powerpc: Stop using of_root
Replace all usages of of_root by of_find_node_by_path("/")

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Reviewed-by: Rob Herring <robh@kernel.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20231214103152.12269-5-mpe@ellerman.id.au
2024-03-03 22:20:29 +11:00
Sathvika Vasireddy
6035e7e354 powerpc/32: Curb objtool unannotated intra-function call warning
objtool throws the following warning:
arch/powerpc/kexec/relocate_32.o: warning: objtool: .text+0x2bc: unannotated intra-function call

Fix this warning by annotating intra-function call, using
ANNOTATE_INTRA_FUNCTION_CALL macro, to indicate that the branch target
is valid.

Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Sathvika Vasireddy <sv@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20221215115258.80810-1-sv@linux.ibm.com
2024-03-03 22:20:28 +11:00
Baoquan He
199da8714c arch, crash: move arch_crash_save_vmcoreinfo() out to file vmcore_info.c
Nathan reported below building error:

=====
$ curl -LSso .config https://git.alpinelinux.org/aports/plain/community/linux-edge/config-edge.armv7
$ make -skj"$(nproc)" ARCH=arm CROSS_COMPILE=arm-linux-gnueabi- olddefconfig all
..
arm-linux-gnueabi-ld: arch/arm/kernel/machine_kexec.o: in function `arch_crash_save_vmcoreinfo':
machine_kexec.c:(.text+0x488): undefined reference to `vmcoreinfo_append_str'
====

On architecutres, like arm, s390, ppc, sh, function
arch_crash_save_vmcoreinfo() is located in machine_kexec.c and it can
only be compiled in when CONFIG_KEXEC_CORE=y.

That's not right because arch_crash_save_vmcoreinfo() is used to export
arch specific vmcoreinfo. CONFIG_VMCORE_INFO is supposed to control its
compiling in. However, CONFIG_VMVCORE_INFO could be independent of
CONFIG_KEXEC_CORE, e.g CONFIG_PROC_KCORE=y will select CONFIG_VMVCORE_INFO.
Or CONFIG_KEXEC/CONFIG_KEXEC_FILE is set while CONFIG_CRASH_DUMP is
not set, it will report linking error.

So, on arm, s390, ppc and sh, move arch_crash_save_vmcoreinfo out to
a new file vmcore_info.c. Let CONFIG_VMCORE_INFO decide if compiling in
arch_crash_save_vmcoreinfo().

[akpm@linux-foundation.org: remove stray newlines at eof]
Link: https://lkml.kernel.org/r/20240129135033.157195-3-bhe@redhat.com
Signed-off-by: Baoquan He <bhe@redhat.com>
Reported-by: Nathan Chancellor <nathan@kernel.org>
Closes: https://lore.kernel.org/all/20240126045551.GA126645@dev-arch.thelio-3990X/T/#u
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Hari Bathini <hbathini@linux.ibm.com>
Cc: Klara Modin <klarasmodin@gmail.com>
Cc: Michael Kelley <mhklinux@outlook.com>
Cc: Pingfan Liu <piliu@redhat.com>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Yang Li <yang.lee@linux.alibaba.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-02-23 17:48:25 -08:00
Linus Torvalds
9f2a635235 Quite a lot of kexec work this time around. Many singleton patches in
many places.  The notable patch series are:
 
 - nilfs2 folio conversion from Matthew Wilcox in "nilfs2: Folio
   conversions for file paths".
 
 - Additional nilfs2 folio conversion from Ryusuke Konishi in "nilfs2:
   Folio conversions for directory paths".
 
 - IA64 remnant removal in Heiko Carstens's "Remove unused code after
   IA-64 removal".
 
 - Arnd Bergmann has enabled the -Wmissing-prototypes warning everywhere
   in "Treewide: enable -Wmissing-prototypes".  This had some followup
   fixes:
 
   - Nathan Chancellor has cleaned up the hexagon build in the series
     "hexagon: Fix up instances of -Wmissing-prototypes".
 
   - Nathan also addressed some s390 warnings in "s390: A couple of
     fixes for -Wmissing-prototypes".
 
   - Arnd Bergmann addresses the same warnings for MIPS in his series
     "mips: address -Wmissing-prototypes warnings".
 
 - Baoquan He has made kexec_file operate in a top-down-fitting manner
   similar to kexec_load in the series "kexec_file: Load kernel at top of
   system RAM if required"
 
 - Baoquan He has also added the self-explanatory "kexec_file: print out
   debugging message if required".
 
 - Some checkstack maintenance work from Tiezhu Yang in the series
   "Modify some code about checkstack".
 
 - Douglas Anderson has disentangled the watchdog code's logging when
   multiple reports are occurring simultaneously.  The series is "watchdog:
   Better handling of concurrent lockups".
 
 - Yuntao Wang has contributed some maintenance work on the crash code in
   "crash: Some cleanups and fixes".
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCZZ2R6AAKCRDdBJ7gKXxA
 juCVAP4t76qUISDOSKugB/Dn5E4Nt9wvPY9PcufnmD+xoPsgkQD+JVl4+jd9+gAV
 vl6wkJDiJO5JZ3FVtBtC3DFA/xHtVgk=
 =kQw+
 -----END PGP SIGNATURE-----

Merge tag 'mm-nonmm-stable-2024-01-09-10-33' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull non-MM updates from Andrew Morton:
 "Quite a lot of kexec work this time around. Many singleton patches in
  many places. The notable patch series are:

   - nilfs2 folio conversion from Matthew Wilcox in 'nilfs2: Folio
     conversions for file paths'.

   - Additional nilfs2 folio conversion from Ryusuke Konishi in 'nilfs2:
     Folio conversions for directory paths'.

   - IA64 remnant removal in Heiko Carstens's 'Remove unused code after
     IA-64 removal'.

   - Arnd Bergmann has enabled the -Wmissing-prototypes warning
     everywhere in 'Treewide: enable -Wmissing-prototypes'. This had
     some followup fixes:

      - Nathan Chancellor has cleaned up the hexagon build in the series
        'hexagon: Fix up instances of -Wmissing-prototypes'.

      - Nathan also addressed some s390 warnings in 's390: A couple of
        fixes for -Wmissing-prototypes'.

      - Arnd Bergmann addresses the same warnings for MIPS in his series
        'mips: address -Wmissing-prototypes warnings'.

   - Baoquan He has made kexec_file operate in a top-down-fitting manner
     similar to kexec_load in the series 'kexec_file: Load kernel at top
     of system RAM if required'

   - Baoquan He has also added the self-explanatory 'kexec_file: print
     out debugging message if required'.

   - Some checkstack maintenance work from Tiezhu Yang in the series
     'Modify some code about checkstack'.

   - Douglas Anderson has disentangled the watchdog code's logging when
     multiple reports are occurring simultaneously. The series is
     'watchdog: Better handling of concurrent lockups'.

   - Yuntao Wang has contributed some maintenance work on the crash code
     in 'crash: Some cleanups and fixes'"

* tag 'mm-nonmm-stable-2024-01-09-10-33' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (157 commits)
  crash_core: fix and simplify the logic of crash_exclude_mem_range()
  x86/crash: use SZ_1M macro instead of hardcoded value
  x86/crash: remove the unused image parameter from prepare_elf_headers()
  kdump: remove redundant DEFAULT_CRASH_KERNEL_LOW_SIZE
  scripts/decode_stacktrace.sh: strip unexpected CR from lines
  watchdog: if panicking and we dumped everything, don't re-enable dumping
  watchdog/hardlockup: use printk_cpu_sync_get_irqsave() to serialize reporting
  watchdog/softlockup: use printk_cpu_sync_get_irqsave() to serialize reporting
  watchdog/hardlockup: adopt softlockup logic avoiding double-dumps
  kexec_core: fix the assignment to kimage->control_page
  x86/kexec: fix incorrect end address passed to kernel_ident_mapping_init()
  lib/trace_readwrite.c:: replace asm-generic/io with linux/io
  nilfs2: cpfile: fix some kernel-doc warnings
  stacktrace: fix kernel-doc typo
  scripts/checkstack.pl: fix no space expression between sp and offset
  x86/kexec: fix incorrect argument passed to kexec_dprintk()
  x86/kexec: use pr_err() instead of kexec_dprintk() when an error occurs
  nilfs2: add missing set_freezable() for freezable kthread
  kernel: relay: remove relay_file_splice_read dead code, doesn't work
  docs: submit-checklist: remove all of "make namespacecheck"
  ...
2024-01-09 11:46:20 -08:00
Baoquan He
63b642e952 kexec_file, power: print out debugging message if required
Then when specifying '-d' for kexec_file_load interface, loaded locations
of kernel/initrd/cmdline etc can be printed out to help debug.

Here replace pr_debug() with the newly added kexec_dprintk() in kexec_file
loading related codes.

Link: https://lkml.kernel.org/r/20231213055747.61826-7-bhe@redhat.com
Signed-off-by: Baoquan He <bhe@redhat.com>
Cc: Conor Dooley <conor@kernel.org>
Cc: Joe Perches <joe@perches.com>
Cc: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-12-20 15:02:57 -08:00
Aditya Gupta
a143892cb7 powerpc: add cpu_spec.cpu_features to vmcoreinfo
CPU features can be determined in makedumpfile, using
'cur_cpu_spec.cpu_features'.

This provides more data to makedumpfile about the crashed system, and
can help in filtering the vmcore accordingly.

Signed-off-by: Aditya Gupta <adityag@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20230920105706.853626-2-adityag@linux.ibm.com
2023-12-13 22:26:23 +11:00
Heiko Carstens
0eb5085c38 arch: remove ARCH_TASK_STRUCT_ON_STACK
IA-64 was the only architecture which selected ARCH_TASK_STRUCT_ON_STACK.
IA-64 was removed with commit cf8e865810 ("arch: Remove Itanium (IA-64)
architecture"). Therefore remove support for ARCH_TASK_STRUCT_ON_STACK
as well.

Note: this also reveals a potential bug in powerpc code, which makes use of
__init_task_data without selecting ARCH_TASK_STRUCT_ON_STACK which makes
__init_task_data a no-op. This is broken since commit d11ed3ab31 ("Expand
INIT_TASK() in init/init_task.c and remove") from 2018 and needs to be
addressed separately.

Link: https://lkml.kernel.org/r/20231116133638.1636277-4-hca@linux.ibm.com
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-12-10 17:21:31 -08:00
Linus Torvalds
707df298cb powerpc updates for 6.7
- Add support for KVM running as a nested hypervisor under development versions
    of PowerVM, using the new PAPR nested virtualisation API.
 
  - Add support for the BPF prog pack allocator.
 
  - A rework of the non-server MMU handling to support execute-only on all platforms.
 
  - Some optimisations & cleanups for the powerpc qspinlock code.
 
  - Various other small features and fixes.
 
 Thanks to: Aboorva Devarajan, Aditya Gupta, Amit Machhiwal, Benjamin Gray,
 Christophe Leroy, Dr. David Alan Gilbert, Gaurav Batra, Gautam Menghani, Geert
 Uytterhoeven, Haren Myneni, Hari Bathini, Joel Stanley, Jordan Niethe, Julia
 Lawall, Kautuk Consul, Kuan-Wei Chiu, Michael Neuling, Minjie Du, Muhammad
 Muzammil, Naveen N Rao, Nicholas Piggin, Nick Child, Nysal Jan K.A, Peter
 Lafreniere, Rob Herring, Sachin Sant, Sebastian Andrzej Siewior, Shrikanth
 Hegde, Srikar Dronamraju, Stanislav Kinsburskii, Vaibhav Jain, Wang Yufen, Yang
 Yingliang, Yuan Tan.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCAAxFiEEJFGtCPCthwEv2Y/bUevqPMjhpYAFAmVEf38THG1wZUBlbGxl
 cm1hbi5pZC5hdQAKCRBR6+o8yOGlgMKgD/4vmPVcBE31xCAuuksrVvmMDRsCoC8N
 IJe4A5dHda1tYgdN2YdeK4LBszv5pWICjf2xZHlNh+L0s3Vxpngd4ycAWGPfDAyk
 SOlM24NCKl5j3327QZEt+iZVmJeTSnrmjxO0A1y04yvzLrfvFT7mbP4EXoidjShd
 GNb/EoH9kkCFn65zulc+lN2itQEX6Ht2GQTAz5z5GKtF6d1zZGM8ftOW+SQ5LeU3
 5JOkQtMtwAKhzBiglA4BB3pQyjaOOkPaTaj/WLoxx5tbVaCkV4wrFq48Bmtbm7E3
 kYkMNoI3IsC615GqY1CaRs/RSpMt74tIVh3tstSecHWRIwNGnfF6zeZpKLvJSs8k
 Qa5greGWMUDuJdDg9oDwAX2AKtO+3byI2v1hKE+sMhMh0eeMtDP9WIrIRg4BDjKL
 mq8RffXLTCtepehgfwBpoZbcvFSwFUMwuihBD7+bDMZQeDbtuFdZ2ouMFXBP9M1n
 cuv4KySouvKv9Xp5EeCkHlpL7QmSqrtSHOPYjoPeLueJYlmjheWdreLM9p7Nl2ma
 5wBxLpdLCGCpDJOyGgWNoQRHXucBNlU97DLx2V70nXG4wvvRyXh9EZ6I2niPSdPx
 N3LJnINz4MJ52Gd1KWJvufOyJlLwXxuI07rzCq67ZegpEPh+baWqVcPscuKU8+q0
 dSh2DPCht8gw1A==
 =ddT4
 -----END PGP SIGNATURE-----

Merge tag 'powerpc-6.7-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux

Pull powerpc updates from Michael Ellerman:

 - Add support for KVM running as a nested hypervisor under development
   versions of PowerVM, using the new PAPR nested virtualisation API

 - Add support for the BPF prog pack allocator

 - A rework of the non-server MMU handling to support execute-only on
   all platforms

 - Some optimisations & cleanups for the powerpc qspinlock code

 - Various other small features and fixes

Thanks to Aboorva Devarajan, Aditya Gupta, Amit Machhiwal, Benjamin
Gray, Christophe Leroy, Dr. David Alan Gilbert, Gaurav Batra, Gautam
Menghani, Geert Uytterhoeven, Haren Myneni, Hari Bathini, Joel Stanley,
Jordan Niethe, Julia Lawall, Kautuk Consul, Kuan-Wei Chiu, Michael
Neuling, Minjie Du, Muhammad Muzammil, Naveen N Rao, Nicholas Piggin,
Nick Child, Nysal Jan K.A, Peter Lafreniere, Rob Herring, Sachin Sant,
Sebastian Andrzej Siewior, Shrikanth Hegde, Srikar Dronamraju, Stanislav
Kinsburskii, Vaibhav Jain, Wang Yufen, Yang Yingliang, and Yuan Tan.

* tag 'powerpc-6.7-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (100 commits)
  powerpc/vmcore: Add MMU information to vmcoreinfo
  Revert "powerpc: add `cur_cpu_spec` symbol to vmcoreinfo"
  powerpc/bpf: use bpf_jit_binary_pack_[alloc|finalize|free]
  powerpc/bpf: rename powerpc64_jit_data to powerpc_jit_data
  powerpc/bpf: implement bpf_arch_text_invalidate for bpf_prog_pack
  powerpc/bpf: implement bpf_arch_text_copy
  powerpc/code-patching: introduce patch_instructions()
  powerpc/32s: Implement local_flush_tlb_page_psize()
  powerpc/pseries: use kfree_sensitive() in plpks_gen_password()
  powerpc/code-patching: Perform hwsync in __patch_instruction() in case of failure
  powerpc/fsl_msi: Use device_get_match_data()
  powerpc: Remove cpm_dp...() macros
  powerpc/qspinlock: Rename yield_propagate_owner tunable
  powerpc/qspinlock: Propagate sleepy if previous waiter is preempted
  powerpc/qspinlock: don't propagate the not-sleepy state
  powerpc/qspinlock: propagate owner preemptedness rather than CPU number
  powerpc/qspinlock: stop queued waiters trying to set lock sleepy
  powerpc/perf: Fix disabling BHRB and instruction sampling
  powerpc/trace: Add support for HAVE_FUNCTION_ARG_ACCESS_API
  powerpc/tools: Pass -mabi=elfv2 to gcc-check-mprofile-kernel.sh
  ...
2023-11-03 10:07:39 -10:00
Aditya Gupta
36e826b568 powerpc/vmcore: Add MMU information to vmcoreinfo
Since below commit, address mapping for vmemmap has changed for Radix
MMU, where address mapping is stored in kernel page table itself,
instead of earlier used 'vmemmap_list'.

    commit 368a0590d9 ("powerpc/book3s64/vmemmap: switch radix to use
    a different vmemmap handling function")

Hence with upstream kernel, in case of Radix MMU, makedumpfile fails
to do address translation for vmemmap addresses, as it depended on
vmemmap_list, which can now be empty.

While fixing the address translation in makedumpfile, it was identified
that currently makedumpfile cannot distinguish between Hash MMU and
Radix MMU, unless VMLINUX is passed with -x flag to makedumpfile. And
hence fails to assign offsets and shifts correctly (such as in L4 to
PGDIR offset calculation in makedumpfile).

For getting the MMU, makedumpfile uses `cur_cpu_spec.mmu_features`.

Add `cur_cpu_spec` symbol and offset of `mmu_features` in the `cpu_spec`
struct, to VMCOREINFO, so that makedumpfile can assign the offsets
correctly, without needing a VMLINUX.

Also, even along with `cur_cpu_spec->mmu_features` makedumpfile has to
depend on the 'MMU_FTR_TYPE_RADIX' flag in mmu_features, implying kernel
developers need to be cautious of changes to 'MMU_FTR_*' defines.

A more stable approach was suggested in the below thread by contributors:
 https://lore.kernel.org/linuxppc-dev/20230920105706.853626-1-adityag@linux.ibm.com/

The suggestion was to add whether 'RADIX_MMU' is enabled in vmcoreinfo

This patch also implements the suggestion, by adding 'RADIX_MMU' in
vmcoreinfo, which makedumpfile can use to get whether the crashed system
had RADIX MMU (in which case 'NUMBER(RADIX_MMU)=1') or not (in which
case 'NUMBER(RADIX_MMU)=0')

Fixes: 368a0590d9 ("powerpc/book3s64/vmemmap: switch radix to use a different vmemmap handling function")
Reported-by: Sachin Sant <sachinp@linux.ibm.com>
Signed-off-by: Aditya Gupta <adityag@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20231023072612.50874-1-adityag@linux.ibm.com
2023-10-25 16:03:49 +11:00
Michael Ellerman
357673120a Revert "powerpc: add cur_cpu_spec symbol to vmcoreinfo"
This reverts commit 7135b921b3.

I applied this commit prematurely while there was still discussion
ongoing. Revert it so the final patch can be applied cleanly.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2023-10-25 16:03:49 +11:00
Benjamin Gray
2b4a6cc9a1 powerpc: Annotate endianness of various variables and functions
Sparse reports several endianness warnings on variables and functions
that are consistently treated as big endian. There are no
multi-endianness shenanigans going on here so fix these low hanging
fruit up in one patch.

All changes are just type annotations; no endianness switching
operations are introduced by this patch.

Signed-off-by: Benjamin Gray <bgray@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20231011053711.93427-7-bgray@linux.ibm.com
2023-10-19 17:12:47 +11:00
Baoquan He
a9e1a3d84e crash_core: change the prototype of function parse_crashkernel()
Add two parameters 'low_size' and 'high' to function parse_crashkernel(),
later crashkernel=,high|low parsing will be added.  Make adjustments in
all call sites of parse_crashkernel() in arch.

Link: https://lkml.kernel.org/r/20230914033142.676708-3-bhe@redhat.com
Signed-off-by: Baoquan He <bhe@redhat.com>
Reviewed-by: Zhen Lei <thunder.leizhen@huawei.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Chen Jiahao <chenjiahao16@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-10-04 10:41:58 -07:00
Aditya Gupta
7135b921b3 powerpc: add cur_cpu_spec symbol to vmcoreinfo
Presently, while reading a vmcore, makedumpfile uses
`cur_cpu_spec.mmu_features` to decide whether the crashed system had
RADIX MMU or not.

Currently, makedumpfile fails to get the `cur_cpu_spec` symbol (unless
a vmlinux is passed with the `-x` flag to makedumpfile), and hence
assigns offsets and shifts (such as pgd_offset_l4) incorrecly considering
MMU to be hash MMU.

Add `cur_cpu_spec` symbol and offset of `mmu_features` in the
`cpu_spec` struct, to VMCOREINFO, so that the symbol address and offset
is accessible to makedumpfile, without needing the vmlinux file

Signed-off-by: Aditya Gupta <adityag@linux.ibm.com>
Reported-by: Sachin Sant <sachinp@linux.ibm.com>
Tested-by: Sachin Sant <sachinp@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20230911091409.415662-1-adityag@linux.ibm.com
2023-09-18 12:23:27 +10:00
Julia Lawall
06b627c123 powerpc/kexec_file: add missing of_node_put
for_each_node_with_property performs an of_node_get on each
iteration, so a break out of the loop requires an
of_node_put.

This was done using the Coccinelle semantic patch
iterators/for_each_child.cocci

Signed-off-by: Julia Lawall <Julia.Lawall@inria.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20230907095521.14053-7-Julia.Lawall@inria.fr
2023-09-18 12:23:26 +10:00
Michal Suchanek
89c9ce1c99 powerpc: Move DMA64_PROPNAME define to a header
Avoid redefining the same value in multiple source.

Signed-off-by: Michal Suchanek <msuchanek@suse.de>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20230817162411.429-1-msuchanek@suse.de
2023-08-18 17:03:15 +10:00
Arnd Bergmann
0f7ce21ab5 powerpc: mark more local variables as volatile
A while ago I created a2305e3de8 ("powerpc: mark local variables
around longjmp as volatile") in order to allow building powerpc with
-Wextra enabled on gcc-11.

I tried this again with gcc-13 and found two more of the same issues,
presumably based on slightly different optimization paths being taken
here:

arch/powerpc/xmon/xmon.c:3306:27: error: variable 'mm' might be clobbered by 'longjmp' or 'vfork' [-Werror=clobbered]
arch/powerpc/kexec/crash.c:353:22: error: variable 'i' might be clobbered by 'longjmp' or 'vfork' [-Werror=clobbered]

I checked a bunch of randconfigs and found only these two, so just
address them the same way as the others.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20230809131024.2039647-1-arnd@kernel.org
2023-08-14 21:54:04 +10:00
Laurent Dufour
7f96539437 powerpc/kexec: fix minor typo
Function name in the descriptor was not correct.

Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202307251721.bUGcsCeQ-lkp@intel.com/
Signed-off-by: Laurent Dufour <ldufour@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20230725132759.53975-1-ldufour@linux.ibm.com
2023-08-02 22:22:19 +10:00