linux

mirror of https://github.com/torvalds/linux.git synced 2026-06-05 13:06:59 +02:00

Author	SHA1	Message	Date
John Harrison	16b7e65d29	drm/xe/guc: Track FAST_REQ H2Gs to report where errors came from Most H2G messages are FAST_REQ which means no synchronous response is expected. The messages are sent as fire-and-forget with no tracking. However, errors can still be returned when something goes unexpectedly wrong. That leads to confusion due to not being able to match up the error response to the originating H2G. So add support for tracking the FAST_REQ H2Gs and matching up an error response to its originator. This is only enabled in XE_DEBUG builds given that such errors should never happen in a working system and there is an overhead for the tracking. Further, if XE_DEBUG_GUC is enabled then even more memory and time is used to record the call stack of each H2G and report that with the error. That makes it much easier to work out where a specific H2G came from if there are multiple code paths that can send it. v2: Some re-wording of comments and prints, more consistent use of #if vs stub functions - review feedback from Daniele & Michal). v3: Split config change to separate patch, improve a debug print (review feedback from Michal). v4: Bunch of minor tweaks (review feedback from Michal). Original-i915-code: Michal Wajdeczko <michal.wajdeczko@intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Link: https://lore.kernel.org/r/20250512215324.1457009-5-John.C.Harrison@Intel.com	2025-05-15 12:27:37 -07:00
John Harrison	d7d97890e2	drm/xe/guc: Rename CONFIG_XE_LARGE_GUC_BUFFER Rename XE_LARGE_GUC_BUFFER to XE_DEBUG_GUC to allow for more debug only code (in subsequent patch) without adding more config defines that each control only a single thing. Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Link: https://lore.kernel.org/r/20250512215324.1457009-4-John.C.Harrison@Intel.com	2025-05-15 12:27:36 -07:00
John Harrison	12373b30e2	drm/xe/guc: Add missing H2G error code definitions These error codes are not actually used in the driver but it is extremely useful to have them available to understand error messages. v2: Add a bunch more error codes and drop 'status' from names (review feedback by Michal W). v3: Drop 'SUCCESS' response as meaningless in current API (review feedback by Michal W). Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Link: https://lore.kernel.org/r/20250512215324.1457009-3-John.C.Harrison@Intel.com	2025-05-15 12:27:34 -07:00
John Harrison	fddf8cdd4b	drm/xe/guc: Remove double blank line An earlier patch moved a drm_print a few lines lower but accidentally left a double blank line behind. So fix that. Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Link: https://lore.kernel.org/r/20250512215324.1457009-2-John.C.Harrison@Intel.com	2025-05-15 12:27:33 -07:00
Lucas De Marchi	eaa287069a	drm/xe/guc_submit: Simplify and fix diff calculation With a u32 type, there's no need to check which one is greater: the current is always the latest and if it's less than the previous, it's because it wrapped: just do the unsigned calculation that will lead to the same result, or better the correct one. It fixes an off-by-one in the wrapped calculation, however that doesn't really matter for the timeout calculation. Reviewed-by: Raag Jadav <raag.jadav@intel.com> Reviewed-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://lore.kernel.org/r/20250513-time-wrap-v1-1-fba9a69a65c8@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>	2025-05-15 06:14:17 -07:00
Michal Wajdeczko	3dbab383e3	drm/xe/guc: Don't allocate managed BO for each policy change We shouldn't use xe_managed_bo_create_from_data() to allocate temporary BO, as it will be released only on unload and every change in wedge_mode policy will consume resources (including precious GGTT). Instead just switchover to GuC buffer cache. Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: John Harrison <John.C.Harrison@Intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://lore.kernel.org/r/20250512220018.172-3-michal.wajdeczko@intel.com	2025-05-15 12:29:55 +02:00
Michal Wajdeczko	b86babc9d9	drm/xe/guc: Unblock GuC buffer cache for all modes Today we were using GuC buffer cache only in the PF mode, but shortly we will want to use it also in native and VF mode. Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Link: https://lore.kernel.org/r/20250512220018.172-2-michal.wajdeczko@intel.com	2025-05-15 12:29:54 +02:00
Himal Prasad Ghimiray	5aee6e33e1	drm/xe/vm: Add debug prints for SVM range prefetch Introduce debug logs for the prefetch operation of SVM ranges. Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://lore.kernel.org/r/20250513040228.470682-16-himal.prasad.ghimiray@intel.com Signed-off-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>	2025-05-14 19:25:54 +05:30
Himal Prasad Ghimiray	09ba0a8f06	drm/xe/svm: Implement prefetch support for SVM ranges This commit adds prefetch support for SVM ranges, utilizing the existing ioctl vm_bind functionality to achieve this. v2: rebase v3: - use xa_for_each() instead of manual loop - check range is valid and in preferred location before adding to xarray - Fix naming conventions - Fix return condition as -ENODATA instead of -EAGAIN (Matthew Brost) - Handle sparsely populated cpu vma range (Matthew Brost) v4: - fix end address to find next cpu vma in case of -ENOENT v5: - Move find next vma logic to drm gpusvm layer - Avoid mixing declaration and logic v6: - Use new function names - Move eviction logic to prefetch_ranges v7: - devmem_only assigned 0 - nit address v8: - initialize ctx with 0 Cc: Matthew Brost <matthew.brost@intel.com> Acked-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://lore.kernel.org/r/20250513040228.470682-15-himal.prasad.ghimiray@intel.com Signed-off-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>	2025-05-14 19:25:54 +05:30
Himal Prasad Ghimiray	c904d4e2d7	drm/xe/svm: Add xe_svm_find_vma_start() helper Add helper xe_svm_find_vma_start() function to determine start of cpu vma in input range. Reviewed-by: Matthew Brost <matthew.brost@intel.com> Acked-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Link: https://lore.kernel.org/r/20250513040228.470682-14-himal.prasad.ghimiray@intel.com Signed-off-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>	2025-05-14 19:25:54 +05:30
Himal Prasad Ghimiray	72fa870957	drm/gpusvm: Introduce drm_gpusvm_find_vma_start() function The drm_gpusvm_find_vma_start() function is used to determine the starting address of a CPU VMA within a specified user range. If the range does not contain any VMA, the function returns ULONG_MAX. v2 - Rename function as drm_gpusvm_find_vma_start() (Matthew Brost) - mmget/mmput v3 - s/mmget/mmget_not_zero/ Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://lore.kernel.org/r/20250513040228.470682-13-himal.prasad.ghimiray@intel.com Signed-off-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>	2025-05-14 19:25:54 +05:30
Himal Prasad Ghimiray	6275362f18	drm/xe/svm: Add xe_svm_range_validate() and xe_svm_range_migrate_to_smem() The xe_svm_range_validate() function checks if a range is valid and located in the desired memory region. xe_svm_range_migrate_to_smem() checks if range have pages in devmem and migrate them to smem. v2 - Fix function stub in xe_svm.h - Fix doc v3 (Matthew Brost) - Remove extra new line - s/range->base.flags.has_devmem_pages/xe_svm_range_in_vram v4 (Matthew Brost) - s/xe_svm_range_in_vram/range->base.flags.has_devmem_pages - Move eviction logic to separate function Reviewed-by: Matthew Brost <matthew.brost@intel.com> Acked-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Link: https://lore.kernel.org/r/20250513040228.470682-12-himal.prasad.ghimiray@intel.com Signed-off-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>	2025-05-14 19:25:54 +05:30
Himal Prasad Ghimiray	cc795e0410	drm/xe/svm: Make xe_svm_range_needs_migrate_to_vram() public xe_svm_range_needs_migrate_to_vram() determines whether range needs migration to vram or not, modify it to accept region preference parameter too, so we can use it in prefetch too. v2 - add assert instead of warn (Matthew Brost) Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://lore.kernel.org/r/20250513040228.470682-11-himal.prasad.ghimiray@intel.com Signed-off-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>	2025-05-14 19:25:54 +05:30
Himal Prasad Ghimiray	e0ff0d7cf9	drm/xe/svm: Refactor usage of drm_gpusvm* function in xe_svm Define xe_svm_range_find_or_insert function wrapping drm_gpusvm_range_find_or_insert for reusing in prefetch. Define xe_svm_range_get_pages function wrapping drm_gpusvm_range_get_pages for reusing in prefetch. -v2 pass pagefault defined drm_gpu_svm context as parameter in xe_svm_range_find_or_insert(Matthew Brost) Cc: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Acked-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Link: https://lore.kernel.org/r/20250513040228.470682-10-himal.prasad.ghimiray@intel.com Signed-off-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>	2025-05-14 19:25:54 +05:30
Himal Prasad Ghimiray	da05e5ddc6	drm/xe: Rename lookup_vma function to xe_find_vma_by_addr This update renames the lookup_vma function to xe_vm_find_vma_by_addr and makes it accessible externally. The function, which looks up a VMA by its address within a specified VM, will be utilized in upcoming patches. v2 - Fix doc Reviewed-by: Matthew Brost <matthew.brost@intel.com> Acked-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Link: https://lore.kernel.org/r/20250513040228.470682-9-himal.prasad.ghimiray@intel.com Signed-off-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>	2025-05-14 19:25:54 +05:30
Himal Prasad Ghimiray	bd1d1b46fe	drm/xe/vm: Add an identifier in xe_vma_ops for svm prefetch Add a flag in xe_vma_ops to determine whether it has svm prefetch ops or not. v2: - s/false/0 (Matthew Brost) v3: - s/XE_VMA_OPS_HAS_SVM_PREFETCH/XE_VMA_OPS_FLAG_HAS_SVM_PREFETCH Suggested-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Acked-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Link: https://lore.kernel.org/r/20250513040228.470682-8-himal.prasad.ghimiray@intel.com Signed-off-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>	2025-05-14 19:25:54 +05:30
Himal Prasad Ghimiray	34ebb62723	drm/xe/vm: Update xe_vma_ops_incr_pt_update_ops to take an increment value Prefetch for SVM ranges can have more than one operation to increment, hence modify the function to accept an increment value as input. v2: - Call xe_vma_ops_incr_pt_update_ops only once for REMAP (Matthew Brost) - Add check for 0 ops v3: - s/u8/int for inc_val and num_remap_ops (Matthew Brost) Suggested-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Acked-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Link: https://lore.kernel.org/r/20250513040228.470682-7-himal.prasad.ghimiray@intel.com Signed-off-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>	2025-05-14 19:25:54 +05:30
Himal Prasad Ghimiray	da2eb41004	drm/xe/svm: Make xe_svm_range_* end/start/size public These functions will be used in prefetch too, therefore make them public. v2 - Fix kernel doc Reviewed-by: Matthew Brost <matthew.brost@intel.com> Acked-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Link: https://lore.kernel.org/r/20250513040228.470682-6-himal.prasad.ghimiray@intel.com Signed-off-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>	2025-05-14 19:25:53 +05:30
Himal Prasad Ghimiray	18211ff4d5	drm/xe/svm: Make to_xe_range a public function The to_xe_range function will be used in other files. Therefore, make it public and add kernel-doc documentation Reviewed-by: Matthew Brost <matthew.brost@intel.com> Acked-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Link: https://lore.kernel.org/r/20250513040228.470682-5-himal.prasad.ghimiray@intel.com Signed-off-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>	2025-05-14 19:25:53 +05:30
Himal Prasad Ghimiray	eb07c2fc10	drm/xe/svm: Helper to add tile masks to svm ranges Introduce a helper to add tile mask of binding present and invalidated for the range. Add a lockdep_assert to ensure it is protected by GPU SVM notifier lock. -v7 rebased Suggested-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://lore.kernel.org/r/20250513040228.470682-4-himal.prasad.ghimiray@intel.com Signed-off-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>	2025-05-14 19:25:53 +05:30
Himal Prasad Ghimiray	686a526dad	drm/xe: Make xe_svm_alloc_vram public This function will be used in prefetch too, hence make it public. v2: - Add kernel-doc (Matthew Brost) - Rebase v3: - Move CONFIG_DRM_XE_DEVMEM_MIRROR stub out to xe_svm.c (Matthew Brost) Reviewed-by: Matthew Brost <matthew.brost@intel.com> Acked-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Link: https://lore.kernel.org/r/20250513040228.470682-3-himal.prasad.ghimiray@intel.com Signed-off-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>	2025-05-14 19:25:53 +05:30
Himal Prasad Ghimiray	745df157e4	drm/xe: Introduce xe_vma_op_prefetch_range struct for prefetch of ranges Add xe_vma_op_prefetch_range struct for svm ranges prefetching, including an xarray of SVM range pointers, range count, and target memory region. -v2: Fix doc Reviewed-by: Matthew Brost <matthew.brost@intel.com> Acked-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Link: https://lore.kernel.org/r/20250513040228.470682-2-himal.prasad.ghimiray@intel.com Signed-off-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>	2025-05-14 19:25:53 +05:30
Umesh Nerlige Ramappa	82b98cadb0	drm/xe: Add WA BB to capture active context utilization Context Timestamp (CTX_TIMESTAMP) in the LRC accumulates the run ticks of the context, but only gets updated when the context switches out. In order to check how long a context has been active before it switches out, two things are required: (1) Determine if the context is running: To do so, we program the WA BB to set an initial value for CTX_TIMESTAMP in the LRC. The value chosen is 1 since 0 is the initial value when the LRC is initialized. During a query, we just check for this value to determine if the context is active. If the context switched out, it would overwrite this location with the actual CTX_TIMESTAMP MMIO value. Note that WA BB runs as the last part of the context restore, so reusing this LRC location will not clobber anything. (2) Calculate the time that the context has been active for: The CTX_TIMESTAMP ticks only when the context is active. If a context is active, we just use the CTX_TIMESTAMP MMIO as the new value of utilization. While doing so, we need to read the CTX_TIMESTAMP MMIO for the specific engine instance. Since we do not know which instance the context is running on until it is scheduled, we also read the ENGINE_ID MMIO in the WA BB and store it in the PPHSWP. Using the above 2 instructions in a WA BB, capture active context utilization. v2: (Matt Brost) - This breaks TDR, fix it by saving the CTX_TIMESTAMP register "drm/xe: Save CTX_TIMESTAMP mmio value instead of LRC value" - Drop tile from LRC if using gt "drm/xe: Save the gt pointer in LRC and drop the tile" v3: - Remove helpers for bb_per_ctx_ptr (Matt) - Add define for context active value (Matt) - Use 64 bit CTX TIMESTAMP for platforms that support it. For platforms that don't, live with the rare race. (Matt, Lucas) - Convert engine id to hwe and get the MMIO value (Lucas) - Correct commit message on when WA BB runs (Lucas) v4: - s/GRAPHICS_VER(...)/xe->info.has_64bit_timestamp/ (Matt) - Drop support for active utilization on a VF (CI failure) - In xe_lrc_init ensure the lrc value is 0 to begin with (CI regression) v5: - Minor checkpatch fix - Squash into previous commit and make TDR use 32-bit time - Update code comment to match commit msg Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/4532 Suggested-by: Lucas De Marchi <lucas.demarchi@intel.com> Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://lore.kernel.org/r/20250509161159.2173069-8-umesh.nerlige.ramappa@intel.com	2025-05-12 14:33:25 -07:00
Umesh Nerlige Ramappa	741d3ef8b8	drm/xe: Save the gt pointer in lrc and drop the tile Save the gt pointer in the lrc so that it can used for gt based helpers. Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://lore.kernel.org/r/20250509161159.2173069-7-umesh.nerlige.ramappa@intel.com	2025-05-12 14:33:24 -07:00
Umesh Nerlige Ramappa	38b14233e5	drm/xe: Save CTX_TIMESTAMP mmio value instead of LRC value For determining actual job execution time, save the current value of the CTX_TIMESTAMP register rather than the value saved in LRC since the current register value is the closest to the start time of the job. v2: Define MI_STORE_REGISTER_MEM to fix compile error v3: Place MI_STORE_REGISTER_MEM sorted by MI_INSTR (Lucas) Fixes: `65921374c4` ("drm/xe: Emit ctx timestamp copy in ring ops") Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://lore.kernel.org/r/20250509161159.2173069-6-umesh.nerlige.ramappa@intel.com	2025-05-12 14:33:23 -07:00
Matthew Brost	1b894c2246	drm/xe: Add atomic_svm_timeslice_ms debugfs entry Add some informal control for atomic SVM fault GPU timeslice to be able to play around with values and tweak performance. v2: - Reduce timeslice default value to 5ms Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com> Link: https://lore.kernel.org/r/20250512135500.1405019-6-matthew.brost@intel.com	2025-05-12 13:49:18 -07:00
Matthew Brost	a5d8d3be1d	drm/xe: Timeslice GPU on atomic SVM fault Ensure GPU can make forward progress on an atomic SVM GPU fault by giving the GPU a timeslice of 5ms v2: - Reduce timeslice to 5ms - Double timeslice on retry - Split out GPU SVM changes into independent patch v5: - Double timeslice in a few more places Fixes: `2f118c9491` ("drm/xe: Add SVM VRAM migration") Cc: stable@vger.kernel.org Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com> Link: https://lore.kernel.org/r/20250512135500.1405019-5-matthew.brost@intel.com	2025-05-12 13:49:18 -07:00
Matthew Brost	8dc1812b5b	drm/gpusvm: Add timeslicing support to GPU SVM Add timeslicing support to GPU SVM which will guarantee the GPU a minimum execution time on piece of physical memory before migration back to CPU. Intended to implement strict migration policies which require memory to be in a certain placement for correct execution. Required for shared CPU and GPU atomics on certain devices. Fixes: `99624bdff8` ("drm/gpusvm: Add support for GPU Shared Virtual Memory") Cc: stable@vger.kernel.org Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com> Link: https://lore.kernel.org/r/20250512135500.1405019-4-matthew.brost@intel.com	2025-05-12 13:49:18 -07:00
Matthew Brost	a9ac0fa455	drm/xe: Strict migration policy for atomic SVM faults Mixing GPU and CPU atomics does not work unless a strict migration policy of GPU atomics must be device memory. Enforce a policy of must be in VRAM with a retry loop of 3 attempts, if retry loop fails abort fault. Removing always_migrate_to_vram modparam as we now have real migration policy. v2: - Only retry migration on atomics - Drop alway migrate modparam v3: - Only set vram_only on DGFX (Himal) - Bail on get_pages failure if vram_only and retry count exceeded (Himal) - s/vram_only/devmem_only - Update xe_svm_range_is_valid to accept devmem_only argument v4: - Fix logic bug get_pages failure v5: - Fix commit message (Himal) - Mention removing always_migrate_to_vram in commit message (Lucas) - Fix xe_svm_range_is_valid to check for devmem pages - Bail on devmem_only && !migrate_devmem (Thomas) v6: - Add READ_ONCE barriers for opportunistic checks (Thomas) - Pair READ_ONCE with WRITE_ONCE (Thomas) v7: - Adjust comments (Thomas) Fixes: `2f118c9491` ("drm/xe: Add SVM VRAM migration") Cc: stable@vger.kernel.org Signed-off-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Acked-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com> Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Link: https://lore.kernel.org/r/20250512135500.1405019-3-matthew.brost@intel.com	2025-05-12 13:49:08 -07:00
Himal Prasad Ghimiray	8a9b978ebd	drm/gpusvm: Introduce devmem_only flag for allocation This commit adds a new flag, devmem_only, to the drm_gpusvm structure. The purpose of this flag is to ensure that the get_pages function allocates memory exclusively from the device's memory. If the allocation from device memory fails, the function will return an -EFAULT error. Required for shared CPU and GPU atomics on certain devices. v3: - s/vram_only/devmem_only/ Fixes: `99624bdff8` ("drm/gpusvm: Add support for GPU Shared Virtual Memory") Cc: stable@vger.kernel.org Signed-off-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://lore.kernel.org/r/20250512135500.1405019-2-matthew.brost@intel.com	2025-05-12 13:48:48 -07:00
Aradhya Bhatia	e5c13e2c50	drm/xe/xe2hpg: Add Wa_22021007897 Add Wa_22021007897 for the Xe2_HPG (graphics version: 20.01) IP. It is a permanent workaround, and applicable on all the steppings. Reviewed-by: Gustavo Sousa <gustavo.sousa@intel.com> Reviewed-by: Tejas Upadhyay <tejas.upadhyay@intel.com> Signed-off-by: Aradhya Bhatia <aradhya.bhatia@intel.com> Link: https://lore.kernel.org/r/20250512065004.2576-1-aradhya.bhatia@intel.com Signed-off-by: Matt Roper <matthew.d.roper@intel.com>	2025-05-12 13:23:47 -07:00
Tomasz Lis	cef88d1265	drm/xe/vf: Fixup CTB send buffer messages after migration During post-migration recovery of a VF, it is necessary to update GGTT references included in messages which are going to be sent to GuC. GuC will start consuming messages after VF KMD will inform it about fixups being done; before that, the VF KMD is expected to update any H2G messages which are already in send buffer but were not consumed by GuC. Only a small subset of messages allowed for VFs have GGTT references in them. This patch adds the functionality to parse the CTB send ring buffer and shift addresses contained within. While fixing the CTB content, ct->lock is not taken. This means the only barrier taken remains GGTT address lock - which is ok, because only requests with GGTT addresses matter, but it also means tail changes can happen during the CTB fixups execution (which may be ignored as any new messages will not have anything to fix). The GGTT address locking will be introduced in a future series. v2: removed storing shift as that's now done in VMA nodes patch; macros to inlines; warns to asserts; log messages fixes (Michal) v3: removed inline keywords, enums for offsets in CTB messages, less error messages, if return unused then made functs void (Michal) v4: update the cached head before starting fixups v5: removed/updated comments, wrapped lines, converted assert into error, enums for offsets to separate patch, reused xe_map_rd v6: define xe_map__array() macros, support CTB wrap which divides a message, updated comments, moved one function to an earlier patch v7: renamed few functions, wider use on previously introduced helper, separate cases in parsing messges, documented a static funct v8: Introduced more helpers, fixed coding style mistakes v9: Move xe_map() functs to macros, add asserts, add debug print v10: Errors in place of some asserts, style fixes v11: Fixed invalid conditionals, added debug-only local pointer v12: Removed redundant __maybe_unused Signed-off-by: Tomasz Lis <tomasz.lis@intel.com> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Link: https://lore.kernel.org/r/20250512114018.361843-5-tomasz.lis@intel.com	2025-05-12 15:53:38 +02:00
Tomasz Lis	e327592cc9	drm/xe/guc: Introduce enum with offsets for context register H2Gs Some GuC messages are constructed with incrementing dword counter rather than referencing specific DWORDs, as described in GuC interface specification. This change introduces the definitions of DWORD numbers for parameters which will need to be referenced in a CTB parser to be added in a following patch. To ensure correctness of these DWORDs, verification in form of asserts was added to the message construction code. v2: Renamed enum members, added ones for single context registration, modified asserts to check values rather than indexes. v3: Reordered assert args to take less lines v4: Added lengths v5: Renamed MULTI_LRC_MSG_LEN to MULTI_LRC_MSG_MIN_LEN Suggested-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Signed-off-by: Tomasz Lis <tomasz.lis@intel.com> Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Link: https://lore.kernel.org/r/20250512114018.361843-4-tomasz.lis@intel.com	2025-05-12 15:53:37 +02:00
Tomasz Lis	3e693945b1	drm/xe/vf: Shifting GGTT area post migration We have only one GGTT for all IOV functions, with each VF having assigned a range of addresses for its use. After migration, a VF can receive a different range of addresses than it had initially. This implements shifting GGTT addresses within drm_mm nodes, so that VMAs stay valid after migration. This will make the driver use new addresses when accessing GGTT from the moment the shifting ends. By taking the ggtt->lock for the period of VMA fixups, this change also adds constraint on that mutex. Any locks used during the recovery cannot ever wait for hardware response - because after migration, the hardware will not do anything until fixups are finished. v2: Moved some functs to xe_ggtt.c; moved shift computation to just after querying; improved documentation; switched some warns to asserts; skipping fixups when GGTT shift eq 0; iterating through tiles (Michal) v3: Updated kerneldocs, removed unused funct, properly allocate balloning nodes if non existent v4: Re-used ballooning functions from VF init, used bool in place of standard error codes v5: Renamed one function v6: Subject tag change, several kerneldocs updated, some functions renamed, some moved, added several asserts, shuffled declarations of variables, revealed more detail in high level functions v7: Fixed typos, added `_locked` suffix to some functs, improved readability of asserts, removed unneeded conditional v8: Moved one function, removed implementation detail from kerneldoc, added asserts v9: Code shuffling without much change, and one param rename v10: Minor error path change, added printing the shift via debugfs Signed-off-by: Tomasz Lis <tomasz.lis@intel.com> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Link: https://lore.kernel.org/r/20250512114018.361843-3-tomasz.lis@intel.com	2025-05-12 15:53:35 +02:00
Tomasz Lis	dd39212b5f	drm/xe/vf: Divide GGTT ballooning into allocation and insertion The balloon nodes, which are used to fill areas of GGTT inaccessible for a specific VF, were allocated and inserted into GGTT within one function. To be able to re-use that insertion code during VF migration recovery, we need to split it. This patch separates allocation (init/fini functs) from the insertion of balloons (balloon/deballoon functs). Locks are also moved to ensure calls from post-migration recovery worker will not cause a deadlock. v2: Moved declarations to proper header v3: Rephrased description, introduced "_locked" versions of some functs, more lockdep checks, some functions renamed, altered error handling, added missing kerneldocs. v4: Suffixed more functs with `_locked`, moved lockdep asserts, fixed finalization in error path, added asserts v5: Renamed another few functs, used xe_ggtt_node_allocated(), moved lockdep back again to avoid null dereference, added asserts, improved comments v6: Changed params of cleanup_ggtt() Signed-off-by: Tomasz Lis <tomasz.lis@intel.com> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Link: https://lore.kernel.org/r/20250512114018.361843-2-tomasz.lis@intel.com	2025-05-12 15:53:33 +02:00
Thomas Hellström	5dd933e33b	drm/xe: Make the gem shrinker drm managed Make the xe drm shrinker drm managed like many other resources created at device creation time. Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://lore.kernel.org/r/20250508113015.3374-1-thomas.hellstrom@linux.intel.com	2025-05-12 10:01:31 +02:00
Thomas Hellström	243bf99e2f	drm/xe: Fix the gem shrinker name The xe buffer object shrinker name is visible in the <debugfs>/shrinker directory and most if not all other shinkers follow a naming convention that looks like <subsystem>-<driver>_<objects>:<unique> Follow the same convention for xe, changing the name to drm-xe_gem:<unique>. Other shrinkers typically use the device node for <unique> but since drm drivers typically don't have a single unique device- node, instead use the unique name in the drm device. Fixes: `00c8efc318` ("drm/xe: Add a shrinker for xe bos") Cc: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Francois Dugast <francois.dugast@intel.com> Link: https://lore.kernel.org/r/20250508112931.3347-1-thomas.hellstrom@linux.intel.com	2025-05-09 10:32:27 +02:00
Raag Jadav	252c471197	drm/xe/doc: Wire up PCIe Gen5 limitations Append PCIe Gen5 limitations to xe_firmware document. Signed-off-by: Raag Jadav <raag.jadav@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://lore.kernel.org/r/20250506054835.3395220-4-raag.jadav@intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2025-05-07 15:31:11 -04:00
Raag Jadav	0e414bf7ad	drm/xe: Expose PCIe link downgrade attributes Expose sysfs attributes for PCIe link downgrade capability and status. v2: Move from debugfs to sysfs (Lucas, Rodrigo, Badal) Rework macros and their naming (Rodrigo) v3: Use sysfs_create_files() (Riana) Fix checkpatch warning (Riana) v4: s/downspeed/downgrade (Lucas, Rodrigo, Riana) v5: Use PCIe Gen agnostic naming (Rodrigo) v6: s/pcie_gen/auto_link (Lucas) Signed-off-by: Raag Jadav <raag.jadav@intel.com> Reviewed-by: Riana Tauro <riana.tauro@intel.com> Link: https://lore.kernel.org/r/20250506054835.3395220-3-raag.jadav@intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2025-05-07 15:31:11 -04:00
Raag Jadav	f3e875b3c0	drm/xe: Move xe_device_sysfs_init() to xe_device_probe() Since xe_device_sysfs_init() exposes device specific attributes, a better place for it is xe_device_probe(). Signed-off-by: Raag Jadav <raag.jadav@intel.com> Reviewed-by: Riana Tauro <riana.tauro@intel.com> Link: https://lore.kernel.org/r/20250506054835.3395220-2-raag.jadav@intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2025-05-07 15:31:10 -04:00
Shuicheng Lin	432cd94efd	drm/xe: Release force wake first then runtime power xe_force_wake_get() is dependent on xe_pm_runtime_get(), so for the release path, xe_force_wake_put() should be called first then xe_pm_runtime_put(). Combine the error path and normal path together with goto. Fixes: `85d547608e` ("drm/xe/xe_gt_debugfs: Update handling of xe_force_wake_get return") Cc: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Shuicheng Lin <shuicheng.lin@intel.com> Reviewed-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com> Link: https://lore.kernel.org/r/20250507022302.2187527-1-shuicheng.lin@intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2025-05-07 15:29:59 -04:00
Shuicheng Lin	9d80698bcd	drm/xe: Add config control for svm flush work Without CONFIG_DRM_XE_GPUSVM set, GPU SVM is not initialized thus below warning pops. Refine the flush work code to be controlled by the config to avoid below warning: " [ 453.132028] ------------[ cut here ]------------ [ 453.132527] WARNING: CPU: 9 PID: 4491 at kernel/workqueue.c:4205 __flush_work+0x379/0x3a0 [ 453.133355] Modules linked in: xe drm_ttm_helper ttm gpu_sched drm_buddy drm_suballoc_helper drm_gpuvm drm_exec [ 453.134352] CPU: 9 UID: 0 PID: 4491 Comm: xe_exec_mix_mod Tainted: G U W 6.15.0-rc3+ #7 PREEMPT(full) [ 453.135405] Tainted: [U]=USER, [W]=WARN ... [ 453.136921] RIP: 0010:__flush_work+0x379/0x3a0 [ 453.137417] Code: 8b 45 00 48 8b 55 08 89 c7 48 c1 e8 04 83 e7 08 83 e0 0f 83 cf 02 89 c6 48 0f ba 6d 00 03 e9 d5 fe ff ff 0f 0b e9 db fd ff ff <0f> 0b 45 31 e4 e9 d1 fd ff ff 0f 0b e9 03 ff ff ff 0f 0b e9 d6 fe [ 453.139250] RSP: 0018:ffffc90000c67b18 EFLAGS: 00010246 [ 453.139782] RAX: 0000000000000000 RBX: ffff888108a24000 RCX: 0000000000002000 [ 453.140521] RDX: 0000000000000001 RSI: 0000000000000000 RDI: ffff8881016d61c8 [ 453.141253] RBP: ffff8881016d61c8 R08: 0000000000000000 R09: 0000000000000000 [ 453.141985] R10: 0000000000000000 R11: 0000000008a24000 R12: 0000000000000001 [ 453.142709] R13: 0000000000000002 R14: 0000000000000000 R15: ffff888107db8c00 [ 453.143450] FS: 00007f44853d4c80(0000) GS:ffff8882f469b000(0000) knlGS:0000000000000000 [ 453.144276] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 453.144853] CR2: 00007f4487629228 CR3: 00000001016aa000 CR4: 00000000000406f0 [ 453.145594] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 453.146320] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 453.147061] Call Trace: [ 453.147336] <TASK> [ 453.147579] ? tick_nohz_tick_stopped+0xd/0x30 [ 453.148067] ? xas_load+0x9/0xb0 [ 453.148435] ? xa_load+0x6f/0xb0 [ 453.148781] __xe_vm_bind_ioctl+0xbd5/0x1500 [xe] [ 453.149338] ? dev_printk_emit+0x48/0x70 [ 453.149762] ? _dev_printk+0x57/0x80 [ 453.150148] ? drm_ioctl+0x17c/0x440 [ 453.150544] ? __drm_dev_vprintk+0x36/0x90 [ 453.150983] ? __pfx_xe_vm_bind_ioctl+0x10/0x10 [xe] [ 453.151575] ? drm_ioctl_kernel+0x9f/0xf0 [ 453.151998] ? __pfx_xe_vm_bind_ioctl+0x10/0x10 [xe] [ 453.152560] drm_ioctl_kernel+0x9f/0xf0 [ 453.152968] drm_ioctl+0x20f/0x440 [ 453.153332] ? __pfx_xe_vm_bind_ioctl+0x10/0x10 [xe] [ 453.153893] ? ioctl_has_perm.constprop.0.isra.0+0xae/0x100 [ 453.154489] ? memory_bm_test_bit+0x5/0x60 [ 453.154935] xe_drm_ioctl+0x47/0x70 [xe] [ 453.155419] __x64_sys_ioctl+0x8d/0xc0 [ 453.155824] do_syscall_64+0x47/0x110 [ 453.156228] entry_SYSCALL_64_after_hwframe+0x76/0x7e " v2 (Matt): refine commit message to have more details add Fixes tag move the code to xe_svm.h which already have the config remove a blank line per codestyle suggestion Fixes: `63f6e480d1` ("drm/xe: Add SVM garbage collector") Cc: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Shuicheng Lin <shuicheng.lin@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Link: https://lore.kernel.org/r/20250502170052.1787973-1-shuicheng.lin@intel.com	2025-05-07 11:11:15 -07:00
Harish Chegondi	aef87a5fdb	drm/xe: Use copy_from_user() instead of __copy_from_user() copy_from_user() has more checks and is more safer than __copy_from_user() Suggested-by: Kees Cook <kees@kernel.org> Signed-off-by: Harish Chegondi <harish.chegondi@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com> Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com> Link: https://lore.kernel.org/r/acabf20aa8621c7bc8de09b1bffb8d14b5376484.1746126614.git.harish.chegondi@intel.com	2025-05-07 09:27:40 -07:00
Daniele Ceraolo Spurio	12370bfcc4	drm/xe/gsc: do not flush the GSC worker from the reset path The workqueue used for the reset worker is marked as WQ_MEM_RECLAIM, while the GSC one isn't (and can't be as we need to do memory allocations in the gsc worker). Therefore, we can't flush the latter from the former. The reason why we had such a flush was to avoid interrupting either the GSC FW load or in progress GSC proxy operations. GSC proxy operations fall into 2 categories: 1) GSC proxy init: this only happens once immediately after GSC FW load and does not support being interrupted. The only way to recover from an interruption of the proxy init is to do an FLR and re-load the GSC. 2) GSC proxy request: this can happen in response to a request that the driver sends to the GSC. If this is interrupted, the GSC FW will timeout and the driver request will be failed, but overall the GSC will keep working fine. Flushing the work allowed us to avoid interruption in both cases (unless the hang came from the GSC engine itself, in which case we're toast anyway). However, a failure on a proxy request is tolerable if we're in a scenario where we're triggering a GT reset (i.e., something is already gone pretty wrong), so what we really need to avoid is interrupting the init flow, which we can do by polling on the register that reports when the proxy init is complete (as that ensure us that all the load and init operations have been completed). Note that during suspend we still want to do a flush of the worker to make sure it completes any operations involving the HW before the power is cut. v2: fix spelling in commit msg, rename waiter function (Julia) Fixes: `dd0e89e5ed` ("drm/xe/gsc: GSC FW load") Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/4830 Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: John Harrison <John.C.Harrison@Intel.com> Cc: Alan Previn <alan.previn.teres.alexis@intel.com> Cc: <stable@vger.kernel.org> # v6.8+ Reviewed-by: Julia Filipchuk <julia.filipchuk@intel.com> Link: https://lore.kernel.org/r/20250502155104.2201469-1-daniele.ceraolospurio@intel.com	2025-05-05 13:36:52 -07:00
Matthew Brost	3182f3634f	drm/xe: Do not print timedout job message on killed exec queues If a user ctrl-c an app while something is running on the GPU, jobs are expected to timeout. Do not spam dmesg with timedout job messages in this case. Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://lore.kernel.org/r/20250428175505.935694-1-matthew.brost@intel.com	2025-05-01 09:43:05 -07:00
Arnd Bergmann	9c088a5c0d	drm/xe: fix devcoredump chunk alignmnent calculation The device core dumps are copied in 1.5GB chunks, which leads to a link-time error on 32-bit builds because of the 64-bit division not getting trivially turned into mask and shift operations: ERROR: modpost: "__moddi3" [drivers/gpu/drm/xe/xe.ko] undefined! On top of this, I noticed that the ALIGN_DOWN() usage here cannot work because that is only defined for power-of-two alignments. Change ALIGN_DOWN into an explicit div_u64_rem() that avoids the link error and hopefully produces the right results. Doing a 1.5GB kvmalloc() does seem a bit suspicious as well, e.g. this will clearly fail on any 32-bit platform and is also likely to run out of memory on 64-bit systems under memory pressure, so using a much smaller power-of-two chunk size might be a good idea instead. v2: - Always call div_u64_rem (Matt) Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202504251238.JsNgFeFc-lkp@intel.com/ Fixes: `c4a2e5f865` ("drm/xe: Add devcoredump chunking") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://lore.kernel.org/r/20250501012545.1045247-1-matthew.brost@intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2025-05-01 11:48:56 -04:00
Daniele Ceraolo Spurio	dba7d17d50	drm/xe/vf: Fix guc_info debugfs for VFs The guc_info debugfs attempts to read a bunch of registers that the VFs doesn't have access to, so fix it by skipping the reads. Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/4775 Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Lukasz Laguna <lukasz.laguna@intel.com> Reviewed-by: Lukasz Laguna <lukasz.laguna@intel.com> Link: https://lore.kernel.org/r/20250423173908.1571412-1-daniele.ceraolospurio@intel.com	2025-04-29 13:20:57 -07:00
Dafna Hirschfeld	f64cf7b681	drm/gpusvm: set has_dma_mapping inside mapping loop The 'has_dma_mapping' flag should be set once there is a mapping so it could be unmapped in case of error. v2: - Resend for CI Fixes: `99624bdff8` ("drm/gpusvm: Add support for GPU Shared Virtual Memory") Signed-off-by: Dafna Hirschfeld <dafna.hirschfeld@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Link: https://lore.kernel.org/r/20250428024752.881292-1-matthew.brost@intel.com	2025-04-29 11:18:13 -07:00
Tejas Upadhyay	70a2585e58	drm/xe/tests/mocs: Hold XE_FORCEWAKE_ALL for LNCF regs LNCF registers report wrong values when XE_FORCEWAKE_GT only is held. Holding XE_FORCEWAKE_ALL ensures correct operations on LNCF regs. V2(Himal): - Use xe_force_wake_ref_has_domain Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/1999 Fixes: `a6a4ea6d7d` ("drm/xe: Add mocs kunit") Reviewed-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20250428082357.1730068-1-tejas.upadhyay@intel.com Signed-off-by: Tejas Upadhyay <tejas.upadhyay@intel.com>	2025-04-29 17:59:53 +05:30
Thomas Hellström	1bb53d05ba	Merge drm/drm-next into drm-xe-next Additional backmerge to avoid excessive diffstats when sending PR. Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>	2025-04-28 17:42:49 +02:00

1 2 3 4 5 ...

1352029 Commits