linux/drivers/gpu/drm/amd/amdkfd
Xiaogang Chen 81665e35f1 drm/amdkfd: Check if there are kfd porcesses using adev by kfd_processes_count
During GPU hot-unplug, the driver needs to check whether any kfd processes
are still using the GPU being removed before cleaning up the device's
resources. The current driver checks whether kfd_processes_table is empty.
However, kfd processes are not terminated immediately after they are removed
from kfd_processes_table. They remain alive and may access the device until
the kfd_process_wq work queue has run.

Check the kfd->kfd_processes_count value instead. It is updated after a kfd
process is uninitialized, once its reference count drops to zero.

Fixes: 6cca686dfc ("drm/amdkfd: kfd driver supports hot unplug/replug amdgpu devices")
Signed-off-by: Xiaogang Chen <xiaogang.chen@amd.com>
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit d12d05c4bc4c15585130af43e897923ff292df7b)
2026-05-05 10:18:23 -04:00
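
The window described in the commit message above can be illustrated with a small userspace model. This is only a sketch under stated assumptions, not the driver's code: struct model_dev and every model_* function are hypothetical names invented here, table_entries stands in for the kfd_processes_table lookup, and processes_count stands in for kfd->kfd_processes_count. The deferred cleanup that the driver runs from kfd_process_wq is modeled as an ordinary function call.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace model; none of these names exist in the driver. */
struct model_dev {
	int table_entries;          /* stands in for entries in kfd_processes_table */
	atomic_int processes_count; /* stands in for kfd->kfd_processes_count */
};

static void model_dev_init(struct model_dev *dev)
{
	dev->table_entries = 0;
	atomic_init(&dev->processes_count, 0);
}

/* A process is created: it is visible in the table and it is counted. */
static void model_process_create(struct model_dev *dev)
{
	dev->table_entries++;
	atomic_fetch_add(&dev->processes_count, 1);
}

/*
 * Teardown, step 1: the process leaves the table. Its deferred cleanup has
 * not run yet, so in the real driver it could still touch the device.
 */
static void model_process_unhash(struct model_dev *dev)
{
	assert(dev->table_entries > 0);
	dev->table_entries--;
}

/*
 * Teardown, step 2: the deferred work runs after the last reference is
 * dropped. Only now does the per-device count go down.
 */
static void model_process_release_work(struct model_dev *dev)
{
	int prev = atomic_fetch_sub(&dev->processes_count, 1);

	assert(prev > 0);
	(void)prev; /* silence unused warning when built with NDEBUG */
}

/* Old-style check: reports idle as soon as the table is empty. */
static bool model_busy_by_table(struct model_dev *dev)
{
	return dev->table_entries != 0;
}

/* New-style check: reports idle only after the cleanup has completed. */
static bool model_busy_by_count(struct model_dev *dev)
{
	return atomic_load(&dev->processes_count) != 0;
}
```

In this model, between model_process_unhash() and model_process_release_work() the table-based check already reports the device as idle while the count-based check still reports it as busy. That interval corresponds to the one the commit closes.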
cik_event_interrupt.c drm/amdkfd: Don't expect signal mailbox update 2026-03-17 10:31:39 -04:00
cik_int.h
cik_regs.h
cwsr_trap_handler_gfx8.asm
cwsr_trap_handler_gfx9.asm
cwsr_trap_handler_gfx10.asm
cwsr_trap_handler_gfx12.asm drm/amdkfd: fix CWSR trap handler 2026-02-26 11:20:10 -05:00
cwsr_trap_handler.h drm/amdkfd: gfx12.1 trap handler instruction fixup for VOP3PX 2026-01-28 16:21:21 -05:00
Kconfig dma-buf: Always build with DMABUF_MOVE_NOTIFY 2026-01-27 10:45:11 +01:00
kfd_chardev.c drm/amdkfd: Make all TLB-flushes heavy-weight 2026-05-05 10:13:54 -04:00
kfd_crat.c drm/amdkfd: Check for NULL return values 2026-02-19 12:16:11 -05:00
kfd_crat.h
kfd_debug.c drm/amdkfd: Check for NULL return values 2026-02-19 12:16:11 -05:00
kfd_debug.h drm/amdkfd: fix gfx11 restrictions on debugging cooperative launch 2026-01-20 21:50:12 -05:00
kfd_debugfs.c Convert 'alloc_obj' family to use the new default GFP_KERNEL argument 2026-02-21 17:09:51 -08:00
kfd_device_queue_manager_cik.c
kfd_device_queue_manager_v9.c
kfd_device_queue_manager_v10.c
kfd_device_queue_manager_v11.c
kfd_device_queue_manager_v12_1.c drm/amdgpu: Update MES VM_CNTX_CNTL for XNACK off for GFX 12.1 2025-12-10 17:39:09 -05:00
kfd_device_queue_manager_v12.c
kfd_device_queue_manager_vi.c
kfd_device_queue_manager.c drm/amdkfd: Make all TLB-flushes heavy-weight 2026-05-05 10:13:54 -04:00
kfd_device_queue_manager.h drm/amdgpu: Check for multiplication overflow in checkpoint stack size 2026-03-06 16:33:59 -05:00
kfd_device.c drm/amdkfd: Check if there are kfd porcesses using adev by kfd_processes_count 2026-05-05 10:18:23 -04:00
kfd_doorbell.c
kfd_events.c drm/amdkfd: Don't expect signal mailbox update 2026-03-17 10:31:39 -04:00
kfd_events.h drm/amdkfd: Don't expect signal mailbox update 2026-03-17 10:31:39 -04:00
kfd_flat_memory.c drm/amdgpu: GFX12.1 scratch memory limit up to 57-bit 2026-03-04 11:42:26 -05:00
kfd_int_process_v9.c drm/amdkfd: Don't expect signal mailbox update 2026-03-17 10:31:39 -04:00
kfd_int_process_v10.c drm/amdkfd: Don't expect signal mailbox update 2026-03-17 10:31:39 -04:00
kfd_int_process_v11.c drm/amdkfd: Don't expect signal mailbox update 2026-03-17 10:31:39 -04:00
kfd_int_process_v12_1.c drm/amdkfd: Switch to dev_* printk stuff in kfd_int_process_v12_1.c 2026-03-30 15:13:14 -04:00
kfd_interrupt.c
kfd_kernel_queue.c Convert 'alloc_obj' family to use the new default GFP_KERNEL argument 2026-02-21 17:09:51 -08:00
kfd_kernel_queue.h
kfd_migrate.c drm/amdgpu: allocate move entities dynamically 2026-03-30 15:16:15 -04:00
kfd_migrate.h drm/amdgpu: update the functions to use amdgpu version of hmm 2025-10-13 14:14:36 -04:00
kfd_module.c
kfd_mqd_manager_cik.c drm/amdkfd: Removed commented line for MQD queue priority 2026-02-25 16:28:10 -05:00
kfd_mqd_manager_v9.c drm/amd: Fix MQD and control stack alignment for non-4K 2026-03-30 14:36:04 -04:00
kfd_mqd_manager_v10.c drm/amdkfd: Removed commented line for MQD queue priority 2026-02-25 16:28:10 -05:00
kfd_mqd_manager_v11.c drm/amdkfd: Removed commented line for MQD queue priority 2026-02-25 16:28:10 -05:00
kfd_mqd_manager_v12_1.c drm/amdkfd: Removed commented line for MQD queue priority 2026-02-25 16:28:10 -05:00
kfd_mqd_manager_v12.c drm/amdkfd: Removed commented line for MQD queue priority 2026-02-25 16:28:10 -05:00
kfd_mqd_manager_vi.c drm/amdgpu: Check for multiplication overflow in checkpoint stack size 2026-03-06 16:33:59 -05:00
kfd_mqd_manager.c Convert 'alloc_obj' family to use the new default GFP_KERNEL argument 2026-02-21 17:09:51 -08:00
kfd_mqd_manager.h drm/amdgpu: Check for multiplication overflow in checkpoint stack size 2026-03-06 16:33:59 -05:00
kfd_packet_manager_v9.c amdkfd: remove DIQ support 2025-12-08 13:56:42 -05:00
kfd_packet_manager_vi.c amdkfd: remove DIQ support 2025-12-08 13:56:42 -05:00
kfd_packet_manager.c
kfd_pm4_headers_ai.h
kfd_pm4_headers_aldebaran.h
kfd_pm4_headers_vi.h
kfd_pm4_headers.h
kfd_pm4_opcodes.h
kfd_priv.h drm/amdkfd: Make all TLB-flushes heavy-weight 2026-05-05 10:13:54 -04:00
kfd_process_queue_manager.c drm/amdkfd: Update queue properties for metadata ring 2026-03-17 10:30:22 -04:00
kfd_process.c amd/amdkfd: add WQ_UNBOUND to alloc_workqueue users 2026-04-03 13:52:33 -04:00
kfd_queue.c drm/amdkfd: Fix queue preemption/eviction failures by aligning control stack size to GPU page size 2026-03-30 15:16:39 -04:00
kfd_smi_events.c Convert 'alloc_obj' family to use the new default GFP_KERNEL argument 2026-02-21 17:09:51 -08:00
kfd_smi_events.h
kfd_svm.c drm/amdkfd: Make all TLB-flushes heavy-weight 2026-05-05 10:13:54 -04:00
kfd_svm.h drm/amdgpu: update the functions to use amdgpu version of hmm 2025-10-13 14:14:36 -04:00
kfd_topology.c drm/amdkfd: Add upper bound check for num_of_nodes 2026-04-23 12:54:45 -04:00
kfd_topology.h drm/amdgpu: reduce the full gpu access time in amdgpu_device_init. 2025-12-08 13:56:38 -05:00
Makefile drm/amdkfd: Add interrupt handling for GFX 12.1.0 2025-12-08 14:13:11 -05:00
soc15_int.h