Merge tag 'drm-xe-next-2026-03-02' of https://gitlab.freedesktop.org/drm/xe/kernel into drm-next
UAPI Changes:
- restrict multi-lrc to VCS/VECS engines (Xin Wang)
- Introduce a flag to disallow vm overcommit in fault mode (Thomas)
- update used tracking kernel-doc (Auld, Fixes)
- Some bind queue fixes (Auld, Fixes)

Cross-subsystem Changes:
- Split drm_suballoc_new() into SA alloc and init helpers (Satya, Fixes)
- pass pagemap_addr by reference (Arnd, Fixes)
- Revert "drm/pagemap: Disable device-to-device migration" (Thomas)
- Fix unbalanced unlock in drm_gpusvm_scan_mm (Maciej, Fixes)
- Small GPUSVM fixes (Brost, Fixes)
- Fix xe SVM configs (Thomas, Fixes)

Core Changes:
- Fix a hmm_range_fault() livelock / starvation problem (Thomas, Fixes)

Driver Changes:
- Fix leak on xa_store failure (Shuicheng, Fixes)
- Correct implementation of Wa_16025250150 (Roper, Fixes)
- Refactor context init into xe_lrc_ctx_init (Raag)
- Fix GSC proxy cleanup on early initialization failure (Zhanjun)
- Fix exec queue creation during post-migration recovery (Tomasz, Fixes)
- Apply windower hardware filtering setting on Xe3 and Xe3p (Roper)
- Free ctx_restore_mid_bb in release (Shuicheng, Fixes)
- Drop stale MCR steering TODO comment (Roper)
- dGPU memory optimizations (Brost)
- Do not preempt fence signaling CS instructions (Brost, Fixes)
- Revert "drm/xe/compat: Remove unused i915_reg.h from compat header" (Uma)
- Don't expose display modparam if no display support (Wajdeczko)
- Some VRAM flag improvements (Wajdeczko)
- Misc fix for xe_guc_ct.c (Shuicheng, Fixes)
- Remove unused i915_reg.h from compat header (Uma)
- Workaround cleanup & simplification (Roper)
- Add prefetch pagefault support for Xe3p (Varun)
- Fix fs_reclaim deadlock caused by CCS save/restore (Satya, Fixes)
- Cleanup partially initialized sync on parse failure (Shuicheng, Fixes)
- Allow to change VFs VRAM quota using sysfs (Michal)
- Increase GuC log sizes in debug builds (Tomasz)
- Wa_18041344222 changes (Harish)
- Add Wa_14026781792 (Niton)
- Add debugfs facility to catch RTP mistakes (Roper)
- Convert GT stats to per-cpu counters (Brost)
- Prevent unintended VRAM channel creation (Karthik)
- Privatize struct xe_ggtt (Maarten)
- remove unnecessary struct dram_info forward declaration (Jani)
- pagefault refactors (Brost)
- Apply Wa_14024997852 (Arvind)
- Redirect faults to dummy page for wedged device (Raag, Fixes)
- Force EXEC_QUEUE_FLAG_KERNEL for kernel internal VMs (Piotr)
- Stop applying Wa_16018737384 from Xe3 onward (Roper)
- Add new XeCore fuse registers to VF runtime regs (Roper)
- Update xe_device_declare_wedged() error log (Raag)
- Make xe_modparam.force_vram_bar_size signed (Shuicheng, Fixes)
- Avoid reading media version when media GT is disabled (Piotr, Fixes)
- Fix handling of Wa_14019988906 & Wa_14019877138 (Roper, Fixes)
- Basic enabling patches for Xe3p_LPG and NVL-P (Gustavo, Roper, Shekhar)
- Avoid double-adjust in 64-bit reads (Shuicheng, Fixes)
- Allow VF to initialize MCR tables (Wajdeczko)
- Add Wa_14025883347 for GuC DMA failure on reset (Anirban)
- Add bounds check on pat_index to prevent OOB kernel read in madvise (Jia, Fixes)
- Fix the address range assert in ggtt_get_pte helper (Winiarski)
- XeCore fuse register changes (Roper)
- Add more info to powergate_info debugfs (Vinay)
- Separate out GuC RC code (Vinay)
- Fix g2g_test_array indexing (Pallavi)
- Mutual exclusivity between CCS-mode and PF (Nareshkumar, Fixes)
- Some more _types.h cleanups (Wajdeczko)
- Fix sysfs initialization (Wajdeczko, Fixes)
- Drop unnecessary goto in xe_device_create (Roper)
- Disable D3Cold for BMG only on specific platforms (Karthik, Fixes)
- Add sriov.admin_only_pf attribute (Wajdeczko)
- replace old wq(s), add WQ_PERCPU to alloc_workqueue (Marco)
- Make MMIO communication more robust (Wajdeczko)
- Fix warning of kerneldoc (Shuicheng, Fixes)
- Fix topology query pointer advance (Shuicheng, Fixes)
- use entry_dump callbacks for xe2+ PAT dumps (Xin Wang)
- Fix kernel-doc warning in GuC scheduler ABI header (Chaitanya, Fixes)
- Fix CFI violation in debugfs access (Daniele, Fixes)
- Apply WA_16028005424 to Media (Balasubramani)
- Fix typo in function kernel-doc (Wajdeczko)
- Protect priority against concurrent access (Niranjana)
- Fix nvm aux resource cleanup (Shuicheng, Fixes)
- Fix is_bound() pci_dev lifetime (Shuicheng, Fixes)
- Use CLASS() for forcewake in xe_gt_enable_comp_1wcoh (Shuicheng)
- Reset VF GuC state on fini (Wajdeczko)
- Move _THIS_IP_ usage from xe_vm_create() to dedicated function (Nathan Chancellor, Fixes)
- Unregister drm device on probe error (Shuicheng, Fixes)
- Disable DCC on PTL (Vinay, Fixes)
- Fix Wa_18022495364 (Tvrtko, Fixes)
- Skip address copy for sync-only execs (Shuicheng, Fixes)
- derive mem copy capability from graphics version (Nitin, Fixes)
- Use DRM_BUDDY_CONTIGUOUS_ALLOCATION for contiguous allocations (Sanjay)
- Context based TLB invalidations (Brost)
- Enable multi_queue on xe3p_xpc (Brost, Niranjana)
- Remove check for gt in xe_query (Nakshtra)
- Reduce LRC timestamp stuck message on VFs to notice (Brost, Fixes)

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Matthew Brost <matthew.brost@intel.com>
Link: https://patch.msgid.link/aaYR5G2MHjOEMXPW@lstrano-desk.jf.intel.com
commit 17b95278ae
@@ -129,6 +129,37 @@ Description:
 		-EIO if FW refuses to change the provisioning.
 
+What:		/sys/bus/pci/drivers/xe/.../sriov_admin/.bulk_profile/vram_quota
+What:		/sys/bus/pci/drivers/xe/.../sriov_admin/vf<n>/profile/vram_quota
+Date:		February 2026
+KernelVersion:	7.0
+Contact:	intel-xe@lists.freedesktop.org
+Description:
+		These files allow performing initial VFs VRAM provisioning prior
+		to the VFs being enabled, or changing the VFs VRAM provisioning
+		once the VFs are enabled.
+		Any non-zero initial VRAM provisioning will block VFs
+		auto-provisioning.
+		Without initial VRAM provisioning, these files will show the
+		result of the VRAM auto-provisioning performed by the PF once
+		the VFs are enabled.
+		Once the VFs are disabled, all VRAM provisioning will be
+		released.
+		These files are visible only on discrete Intel Xe platforms
+		with VRAM and are writeable only if dynamic VFs VRAM
+		provisioning is supported.
+
+		.bulk_profile/vram_quota: (WO) unsigned integer
+		The amount of the provisioned VRAM in [bytes] for each VF.
+		The actual quota value might be aligned per HW/FW requirements.
+
+		profile/vram_quota: (RW) unsigned integer
+		The amount of the provisioned VRAM in [bytes] for this VF.
+		The actual quota value might be aligned per HW/FW requirements.
+
+		Default is 0 (unprovisioned).
+
+		Writes to these attributes may fail with errors like:
+		-EINVAL if the provided input is malformed or not recognized,
+		-EPERM if the change is not applicable on the given HW/FW,
+		-EIO if FW refuses to change the provisioning.
+
 What:		/sys/bus/pci/drivers/xe/.../sriov_admin/vf<n>/stop
 Date:		October 2025
 KernelVersion:	6.19
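A minimal userspace sketch of the provisioning flow described by the ABI entry above. The PCI address ("0000:03:00.0") and the quota value are illustrative assumptions; a failed write surfaces the -EINVAL/-EPERM/-EIO cases listed in the documentation.

	/*
	 * Hypothetical example: provision 2 GiB of VRAM for VF1 before the
	 * VFs are enabled. The device address is an assumption for
	 * illustration only.
	 */
	#include <fcntl.h>
	#include <stdio.h>
	#include <unistd.h>

	int main(void)
	{
		const char *attr = "/sys/bus/pci/drivers/xe/0000:03:00.0/"
				   "sriov_admin/vf1/profile/vram_quota";
		const char buf[] = "2147483648";	/* 2 GiB, in bytes */
		int fd = open(attr, O_WRONLY);

		if (fd < 0) {
			perror("open");
			return 1;
		}
		/* Errors map to the -EINVAL/-EPERM/-EIO cases in the ABI doc. */
		if (write(fd, buf, sizeof(buf) - 1) < 0)
			perror("write");
		close(fd);
		return 0;
	}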
@@ -31,6 +31,9 @@ GuC Power Conservation (PC)
 .. kernel-doc:: drivers/gpu/drm/xe/xe_guc_pc.c
    :doc: GuC Power Conservation (PC)
 
+.. kernel-doc:: drivers/gpu/drm/xe/xe_guc_rc.c
+   :doc: GuC Render C-states (GuC RC)
+
 PCIe Gen5 Limitations
 =====================
 
@@ -819,7 +819,7 @@ enum drm_gpusvm_scan_result drm_gpusvm_scan_mm(struct drm_gpusvm_range *range,
 
 		if (!(pfns[i] & HMM_PFN_VALID)) {
 			state = DRM_GPUSVM_SCAN_UNPOPULATED;
-			goto err_free;
+			break;
 		}
 
 		page = hmm_pfn_to_page(pfns[i]);
@@ -856,9 +856,9 @@ enum drm_gpusvm_scan_result drm_gpusvm_scan_mm(struct drm_gpusvm_range *range,
 		i += 1ul << drm_gpusvm_hmm_pfn_to_order(pfns[i], i, npages);
 	}
 
-err_free:
 	drm_gpusvm_notifier_unlock(range->gpusvm);
 
+err_free:
 	kvfree(pfns);
 	return state;
 }
@@ -1495,7 +1495,7 @@ int drm_gpusvm_get_pages(struct drm_gpusvm *gpusvm,
 			}
 			zdd = page->zone_device_data;
 			if (pagemap != page_pgmap(page)) {
-				if (i > 0) {
+				if (pagemap) {
 					err = -EOPNOTSUPP;
 					goto err_unmap;
 				}
@@ -1572,6 +1572,7 @@ int drm_gpusvm_get_pages(struct drm_gpusvm *gpusvm,
 	return 0;
 
 err_unmap:
+	svm_pages->flags.has_dma_mapping = true;
 	__drm_gpusvm_unmap_pages(gpusvm, svm_pages, num_dma_mapped);
 	drm_gpusvm_notifier_unlock(gpusvm);
 err_free:
@@ -480,18 +480,8 @@ int drm_pagemap_migrate_to_devmem(struct drm_pagemap_devmem *devmem_allocation,
 		.start = start,
 		.end = end,
 		.pgmap_owner = pagemap->owner,
-		/*
-		 * FIXME: MIGRATE_VMA_SELECT_DEVICE_PRIVATE intermittently
-		 * causes 'xe_exec_system_allocator --r *race*no*' to trigger an
-		 * engine reset and a hard hang due to getting stuck on a folio
-		 * lock. This should work and needs to be root-caused. The only
-		 * downside of not selecting MIGRATE_VMA_SELECT_DEVICE_PRIVATE
-		 * is that device-to-device migrations won't work; instead,
-		 * memory will bounce through system memory. This path should be
-		 * rare and only occur when the madvise attributes of memory are
-		 * changed or atomics are being used.
-		 */
-		.flags = MIGRATE_VMA_SELECT_SYSTEM | MIGRATE_VMA_SELECT_DEVICE_COHERENT,
+		.flags = MIGRATE_VMA_SELECT_SYSTEM | MIGRATE_VMA_SELECT_DEVICE_COHERENT |
+			 MIGRATE_VMA_SELECT_DEVICE_PRIVATE,
 	};
 	unsigned long i, npages = npages_in_range(start, end);
 	unsigned long own_pages = 0, migrated_pages = 0;
@@ -293,45 +293,66 @@ static bool drm_suballoc_next_hole(struct drm_suballoc_manager *sa_manager,
 }
 
 /**
- * drm_suballoc_new() - Make a suballocation.
+ * drm_suballoc_alloc() - Allocate an uninitialized suballoc object.
+ * @gfp: gfp flags used for memory allocation.
+ *
+ * Allocate memory for an uninitialized suballoc object. The intended usage is
+ * to allocate memory for the suballoc object outside of a reclaim-tainted
+ * context and then initialize it at a later time in a reclaim-tainted context.
+ *
+ * drm_suballoc_free() should be used to release the memory if the returned
+ * suballoc object is still in the uninitialized state.
+ *
+ * Return: a new uninitialized suballoc object, or an ERR_PTR(-ENOMEM).
+ */
+struct drm_suballoc *drm_suballoc_alloc(gfp_t gfp)
+{
+	struct drm_suballoc *sa;
+
+	sa = kmalloc_obj(*sa, gfp);
+	if (!sa)
+		return ERR_PTR(-ENOMEM);
+
+	sa->manager = NULL;
+
+	return sa;
+}
+EXPORT_SYMBOL(drm_suballoc_alloc);
+
+/**
+ * drm_suballoc_insert() - Initialize a suballocation and insert a hole.
  * @sa_manager: pointer to the sa_manager
+ * @sa: The struct drm_suballoc.
  * @size: number of bytes we want to suballocate.
- * @gfp: gfp flags used for memory allocation. Typically GFP_KERNEL but
- *       the argument is provided for suballocations from reclaim context or
- *       where the caller wants to avoid pipelining rather than wait for
- *       reclaim.
  * @intr: Whether to perform waits interruptible. This should typically
  *        always be true, unless the caller needs to propagate a
  *        non-interruptible context from above layers.
  * @align: Alignment. Must not exceed the default manager alignment.
  *         If @align is zero, then the manager alignment is used.
  *
- * Try to make a suballocation of size @size, which will be rounded
- * up to the alignment specified in drm_suballoc_manager_init().
+ * Try to make a suballocation on a pre-allocated suballoc object of size @size,
+ * which will be rounded up to the alignment specified in
+ * drm_suballoc_manager_init().
  *
- * Return: a new suballocated bo, or an ERR_PTR.
+ * Return: zero on success, errno on failure.
  */
-struct drm_suballoc *
-drm_suballoc_new(struct drm_suballoc_manager *sa_manager, size_t size,
-		 gfp_t gfp, bool intr, size_t align)
+int drm_suballoc_insert(struct drm_suballoc_manager *sa_manager,
+			struct drm_suballoc *sa, size_t size,
+			bool intr, size_t align)
 {
 	struct dma_fence *fences[DRM_SUBALLOC_MAX_QUEUES];
 	unsigned int tries[DRM_SUBALLOC_MAX_QUEUES];
 	unsigned int count;
 	int i, r;
-	struct drm_suballoc *sa;
 
 	if (WARN_ON_ONCE(align > sa_manager->align))
-		return ERR_PTR(-EINVAL);
+		return -EINVAL;
 	if (WARN_ON_ONCE(size > sa_manager->size || !size))
-		return ERR_PTR(-EINVAL);
+		return -EINVAL;
 
 	if (!align)
 		align = sa_manager->align;
 
-	sa = kmalloc_obj(*sa, gfp);
-	if (!sa)
-		return ERR_PTR(-ENOMEM);
 	sa->manager = sa_manager;
 	sa->fence = NULL;
 	INIT_LIST_HEAD(&sa->olist);
@@ -348,7 +369,7 @@ drm_suballoc_new(struct drm_suballoc_manager *sa_manager, size_t size,
 		if (drm_suballoc_try_alloc(sa_manager, sa,
 					   size, align)) {
 			spin_unlock(&sa_manager->wq.lock);
-			return sa;
+			return 0;
 		}
 
 		/* see if we can skip over some allocations */
@@ -385,8 +406,48 @@ drm_suballoc_new(struct drm_suballoc_manager *sa_manager, size_t size,
 	} while (!r);
 
 	spin_unlock(&sa_manager->wq.lock);
-	kfree(sa);
-	return ERR_PTR(r);
+	sa->manager = NULL;
+	return r;
+}
+EXPORT_SYMBOL(drm_suballoc_insert);
+
+/**
+ * drm_suballoc_new() - Make a suballocation.
+ * @sa_manager: pointer to the sa_manager
+ * @size: number of bytes we want to suballocate.
+ * @gfp: gfp flags used for memory allocation. Typically GFP_KERNEL but
+ *       the argument is provided for suballocations from reclaim context or
+ *       where the caller wants to avoid pipelining rather than wait for
+ *       reclaim.
+ * @intr: Whether to perform waits interruptible. This should typically
+ *        always be true, unless the caller needs to propagate a
+ *        non-interruptible context from above layers.
+ * @align: Alignment. Must not exceed the default manager alignment.
+ *         If @align is zero, then the manager alignment is used.
+ *
+ * Try to make a suballocation of size @size, which will be rounded
+ * up to the alignment specified in drm_suballoc_manager_init().
+ *
+ * Return: a new suballocated bo, or an ERR_PTR.
+ */
+struct drm_suballoc *
+drm_suballoc_new(struct drm_suballoc_manager *sa_manager, size_t size,
+		 gfp_t gfp, bool intr, size_t align)
+{
+	struct drm_suballoc *sa;
+	int err;
+
+	sa = drm_suballoc_alloc(gfp);
+	if (IS_ERR(sa))
+		return sa;
+
+	err = drm_suballoc_insert(sa_manager, sa, size, intr, align);
+	if (err) {
+		drm_suballoc_free(sa, NULL);
+		return ERR_PTR(err);
+	}
+
+	return sa;
 }
 EXPORT_SYMBOL(drm_suballoc_new);
 
@@ -405,6 +466,11 @@ void drm_suballoc_free(struct drm_suballoc *suballoc,
 	if (!suballoc)
 		return;
 
+	if (!suballoc->manager) {
+		kfree(suballoc);
+		return;
+	}
+
 	sa_manager = suballoc->manager;
 
 	spin_lock(&sa_manager->wq.lock);
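A sketch of how a caller might use the split API introduced above: the object is allocated up front, where GFP_KERNEL reclaim is still permitted, and only inserted later from the reclaim-tainted path. The surrounding names (my_prepare(), my_commit()) are invented placeholders, not part of the series.

	/* Hypothetical caller illustrating the alloc-early / insert-late split. */
	static struct drm_suballoc *my_prepare(void)
	{
		/* Safe to reclaim here: plain memory allocation, no manager state. */
		return drm_suballoc_alloc(GFP_KERNEL);
	}

	static int my_commit(struct drm_suballoc_manager *mgr,
			     struct drm_suballoc *sa, size_t size)
	{
		int err;

		/* Reclaim-tainted context: no memory allocation in here. */
		err = drm_suballoc_insert(mgr, sa, size, true, 0);
		if (err)
			drm_suballoc_free(sa, NULL); /* frees the uninitialized object */
		return err;
	}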
@@ -1500,6 +1500,7 @@ static const struct {
 	INTEL_PTL_IDS(INTEL_DISPLAY_DEVICE, &ptl_desc),
 	INTEL_WCL_IDS(INTEL_DISPLAY_DEVICE, &ptl_desc),
 	INTEL_NVLS_IDS(INTEL_DISPLAY_DEVICE, &nvl_desc),
+	INTEL_NVLP_IDS(INTEL_DISPLAY_DEVICE, &nvl_desc),
 };
 
 static const struct {
@@ -74,6 +74,7 @@ xe-y += xe_bb.o \
 	xe_guc_log.o \
 	xe_guc_pagefault.o \
 	xe_guc_pc.o \
+	xe_guc_rc.o \
 	xe_guc_submit.o \
 	xe_guc_tlb_inval.o \
 	xe_heci_gsc.o \
@@ -256,7 +256,7 @@ static int __xe_pin_fb_vma_ggtt(const struct intel_framebuffer *fb,
 	size = intel_rotation_info_size(&view->rotated) * XE_PAGE_SIZE;
 
 	pte = xe_ggtt_encode_pte_flags(ggtt, bo, xe->pat.idx[XE_CACHE_NONE]);
-	vma->node = xe_ggtt_node_insert_transform(ggtt, bo, pte,
+	vma->node = xe_ggtt_insert_node_transform(ggtt, bo, pte,
 						  ALIGN(size, align), align,
 						  view->type == I915_GTT_VIEW_NORMAL ?
 						  NULL : write_ggtt_rotated_node,
@@ -352,8 +352,7 @@ static void __xe_unpin_fb_vma(struct i915_vma *vma)
 
 	if (vma->dpt)
 		xe_bo_unpin_map_no_vm(vma->dpt);
-	else if (!xe_ggtt_node_allocated(vma->bo->ggtt_node[tile_id]) ||
-		 vma->bo->ggtt_node[tile_id] != vma->node)
+	else if (vma->bo->ggtt_node[tile_id] != vma->node)
 		xe_ggtt_node_remove(vma->node, false);
 
 	ttm_bo_reserve(&vma->bo->ttm, false, false, NULL);
@@ -55,6 +55,7 @@
 #define PIPELINE_SELECT				GFXPIPE_SINGLE_DW_CMD(0x1, 0x4)
 
+#define CMD_3DSTATE_DRAWING_RECTANGLE_FAST	GFXPIPE_3D_CMD(0x0, 0x0)
 #define CMD_3DSTATE_CUSTOM_SAMPLE_PATTERN	GFXPIPE_3D_CMD(0x0, 0x2)
 #define CMD_3DSTATE_CLEAR_PARAMS		GFXPIPE_3D_CMD(0x0, 0x4)
 #define CMD_3DSTATE_DEPTH_BUFFER		GFXPIPE_3D_CMD(0x0, 0x5)
 #define CMD_3DSTATE_STENCIL_BUFFER		GFXPIPE_3D_CMD(0x0, 0x6)
@@ -138,8 +139,16 @@
 #define CMD_3DSTATE_SBE_MESH			GFXPIPE_3D_CMD(0x0, 0x82)
 #define CMD_3DSTATE_CPSIZE_CONTROL_BUFFER	GFXPIPE_3D_CMD(0x0, 0x83)
 #define CMD_3DSTATE_COARSE_PIXEL		GFXPIPE_3D_CMD(0x0, 0x89)
+#define CMD_3DSTATE_MESH_SHADER_DATA_EXT	GFXPIPE_3D_CMD(0x0, 0x8A)
+#define CMD_3DSTATE_TASK_SHADER_DATA_EXT	GFXPIPE_3D_CMD(0x0, 0x8B)
+#define CMD_3DSTATE_VIEWPORT_STATE_POINTERS_CC_2	GFXPIPE_3D_CMD(0x0, 0x8D)
+#define CMD_3DSTATE_CC_STATE_POINTERS_2		GFXPIPE_3D_CMD(0x0, 0x8E)
+#define CMD_3DSTATE_SCISSOR_STATE_POINTERS_2	GFXPIPE_3D_CMD(0x0, 0x8F)
+#define CMD_3DSTATE_BLEND_STATE_POINTERS_2	GFXPIPE_3D_CMD(0x0, 0xA0)
+#define CMD_3DSTATE_VIEWPORT_STATE_POINTERS_SF_CLIP_2	GFXPIPE_3D_CMD(0x0, 0xA1)
+
 #define CMD_3DSTATE_DRAWING_RECTANGLE		GFXPIPE_3D_CMD(0x1, 0x0)
 #define CMD_3DSTATE_URB_MEMORY			GFXPIPE_3D_CMD(0x1, 0x1)
 #define CMD_3DSTATE_CHROMA_KEY			GFXPIPE_3D_CMD(0x1, 0x4)
 #define CMD_3DSTATE_POLY_STIPPLE_OFFSET		GFXPIPE_3D_CMD(0x1, 0x6)
 #define CMD_3DSTATE_POLY_STIPPLE_PATTERN	GFXPIPE_3D_CMD(0x1, 0x7)
@@ -160,5 +169,6 @@
 #define CMD_3DSTATE_SUBSLICE_HASH_TABLE		GFXPIPE_3D_CMD(0x1, 0x1F)
 #define CMD_3DSTATE_SLICE_TABLE_STATE_POINTERS	GFXPIPE_3D_CMD(0x1, 0x20)
 #define CMD_3DSTATE_PTBR_TILE_PASS_INFO		GFXPIPE_3D_CMD(0x1, 0x22)
+#define CMD_3DSTATE_SLICE_TABLE_STATE_POINTER_2	GFXPIPE_3D_CMD(0x1, 0xA0)
 
 #endif
@@ -58,7 +58,7 @@
 #define   MCR_SLICE(slice)			REG_FIELD_PREP(MCR_SLICE_MASK, slice)
 #define   MCR_SUBSLICE_MASK			REG_GENMASK(26, 24)
 #define   MCR_SUBSLICE(subslice)		REG_FIELD_PREP(MCR_SUBSLICE_MASK, subslice)
-#define   MTL_MCR_GROUPID			REG_GENMASK(11, 8)
+#define   MTL_MCR_GROUPID			REG_GENMASK(12, 8)
 #define   MTL_MCR_INSTANCEID			REG_GENMASK(3, 0)
 
 #define PS_INVOCATION_COUNT			XE_REG(0x2348)
@@ -100,6 +100,9 @@
 #define VE1_AUX_INV				XE_REG(0x42b8)
 #define   AUX_INV				REG_BIT(0)
 
+#define GAMSTLB_CTRL2				XE_REG_MCR(0x4788)
+#define   STLB_SINGLE_BANK_MODE			REG_BIT(11)
+
 #define XE2_LMEM_CFG				XE_REG(0x48b0)
 
 #define XE2_GAMWALK_CTRL			0x47e4
@@ -107,6 +110,9 @@
 #define XE2_GAMWALK_CTRL_3D			XE_REG_MCR(XE2_GAMWALK_CTRL)
 #define   EN_CMP_1WCOH_GW			REG_BIT(14)
 
+#define MMIOATSREQLIMIT_GAM_WALK_3D		XE_REG_MCR(0x47f8)
+#define   DIS_ATS_WRONLY_PG			REG_BIT(18)
+
 #define XEHP_FLAT_CCS_BASE_ADDR			XE_REG_MCR(0x4910)
 #define   XEHP_FLAT_CCS_PTR			REG_GENMASK(31, 8)
 
@@ -125,6 +131,7 @@
 #define   VS_HIT_MAX_VALUE_MASK			REG_GENMASK(25, 20)
 #define   DIS_MESH_PARTIAL_AUTOSTRIP		REG_BIT(16)
 #define   DIS_MESH_AUTOSTRIP			REG_BIT(15)
+#define   DIS_TE_PATCH_CTRL			REG_BIT(4)
 
 #define VFLSKPD					XE_REG_MCR(0x62a8, XE_REG_OPTION_MASKED)
 #define   DIS_PARTIAL_AUTOSTRIP			REG_BIT(9)
@@ -169,6 +176,7 @@
 #define COMMON_SLICE_CHICKEN4			XE_REG(0x7300, XE_REG_OPTION_MASKED)
 #define   SBE_PUSH_CONSTANT_BEHIND_FIX_ENABLE	REG_BIT(12)
 #define   DISABLE_TDC_LOAD_BALANCING_CALC	REG_BIT(6)
+#define   HW_FILTERING				REG_BIT(5)
 
 #define COMMON_SLICE_CHICKEN3			XE_REG(0x7304, XE_REG_OPTION_MASKED)
 #define XEHP_COMMON_SLICE_CHICKEN3		XE_REG_MCR(0x7304, XE_REG_OPTION_MASKED)
@@ -210,6 +218,9 @@
 
 #define GSCPSMI_BASE				XE_REG(0x880c)
 
+#define CCCHKNREG2				XE_REG_MCR(0x881c)
+#define   LOCALITYDIS				REG_BIT(7)
+
 #define CCCHKNREG1				XE_REG_MCR(0x8828)
 #define   L3CMPCTRL				REG_BIT(23)
 #define   ENCOMPPERFFIX				REG_BIT(18)
@@ -253,6 +264,8 @@
 #define XE2_GT_COMPUTE_DSS_2			XE_REG(0x914c)
 #define XE2_GT_GEOMETRY_DSS_1			XE_REG(0x9150)
 #define XE2_GT_GEOMETRY_DSS_2			XE_REG(0x9154)
+#define XE3P_XPC_GT_GEOMETRY_DSS_3		XE_REG(0x915c)
+#define XE3P_XPC_GT_COMPUTE_DSS_3		XE_REG(0x9160)
 
 #define SERVICE_COPY_ENABLE			XE_REG(0x9170)
 #define   FUSE_SERVICE_COPY_ENABLE_MASK		REG_GENMASK(7, 0)
@@ -367,6 +380,7 @@
 #define FORCEWAKE_RENDER			XE_REG(0xa278)
 
 #define POWERGATE_DOMAIN_STATUS			XE_REG(0xa2a0)
+#define   GSC_AWAKE_STATUS			REG_BIT(8)
 #define   MEDIA_SLICE3_AWAKE_STATUS		REG_BIT(4)
 #define   MEDIA_SLICE2_AWAKE_STATUS		REG_BIT(3)
 #define   MEDIA_SLICE1_AWAKE_STATUS		REG_BIT(2)
@@ -420,6 +434,8 @@
 #define   LSN_DIM_Z_WGT(value)			REG_FIELD_PREP(LSN_DIM_Z_WGT_MASK, value)
 
 #define L3SQCREG2				XE_REG_MCR(0xb104)
+#define   L3_SQ_DISABLE_COAMA_2WAY_COH		REG_BIT(30)
+#define   L3_SQ_DISABLE_COAMA			REG_BIT(22)
 #define   COMPMEMRD256BOVRFETCHEN		REG_BIT(20)
 
 #define L3SQCREG3				XE_REG_MCR(0xb108)
@@ -459,6 +475,8 @@
 #define   FORCE_MISS_FTLB			REG_BIT(3)
 
 #define XEHP_GAMSTLB_CTRL			XE_REG_MCR(0xcf4c)
+#define   BANK_HASH_MODE			REG_GENMASK(27, 26)
+#define   BANK_HASH_4KB_MODE			REG_FIELD_PREP(BANK_HASH_MODE, 0x3)
 #define   CONTROL_BLOCK_CLKGATE_DIS		REG_BIT(12)
 #define   EGRESS_BLOCK_CLKGATE_DIS		REG_BIT(11)
 #define   TAG_BLOCK_CLKGATE_DIS			REG_BIT(7)
@@ -550,11 +568,16 @@
 #define   UGM_FRAGMENT_THRESHOLD_TO_3		REG_BIT(58 - 32)
 #define   DIS_CHAIN_2XSIMD8			REG_BIT(55 - 32)
 #define   XE2_ALLOC_DPA_STARVE_FIX_DIS		REG_BIT(47 - 32)
+#define   SAMPLER_LD_LSC_DISABLE		REG_BIT(45 - 32)
 #define   ENABLE_SMP_LD_RENDER_SURFACE_CONTROL	REG_BIT(44 - 32)
 #define   FORCE_SLM_FENCE_SCOPE_TO_TILE		REG_BIT(42 - 32)
 #define   FORCE_UGM_FENCE_SCOPE_TO_TILE		REG_BIT(41 - 32)
 #define   MAXREQS_PER_BANK			REG_GENMASK(39 - 32, 37 - 32)
 #define   DISABLE_128B_EVICTION_COMMAND_UDW	REG_BIT(36 - 32)
+#define   LSCFE_SAME_ADDRESS_ATOMICS_COALESCING_DISABLE	REG_BIT(35 - 32)
+
+#define ROW_CHICKEN5				XE_REG_MCR(0xe7f0)
+#define   CPSS_AWARE_DIS			REG_BIT(3)
 
 #define SARB_CHICKEN1				XE_REG_MCR(0xe90c)
 #define   COMP_CKN_IN				REG_GENMASK(30, 29)
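As a side note on the MTL_MCR_GROUPID change above: widening the mask from bits 11:8 to 12:8 doubles the addressable steering groups from 16 to 32. A small illustrative sketch using the same field helpers (the group/instance values below are arbitrary, not taken from the patch):

	/* Illustrative only: compose an MCR steering selector with the widened
	 * group field. REG_FIELD_PREP() masks and shifts the value into place;
	 * group 17 needs bit 12, which the old REG_GENMASK(11, 8) could not hold.
	 */
	static u32 example_mcr_steer(void)
	{
		return REG_FIELD_PREP(MTL_MCR_GROUPID, 17) |
		       REG_FIELD_PREP(MTL_MCR_INSTANCEID, 2);
	}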
@@ -40,6 +40,9 @@
 #define   GS_BOOTROM_JUMP_PASSED		REG_FIELD_PREP(GS_BOOTROM_MASK, 0x76)
 #define   GS_MIA_IN_RESET			REG_BIT(0)
 
+#define BOOT_HASH_CHK				XE_REG(0xc010)
+#define   GUC_BOOT_UKERNEL_VALID		REG_BIT(31)
+
 #define GUC_HEADER_INFO				XE_REG(0xc014)
 
 #define GUC_WOPCM_SIZE				XE_REG(0xc050)
@@ -83,7 +86,12 @@
 #define   GUC_WOPCM_OFFSET_MASK			REG_GENMASK(31, GUC_WOPCM_OFFSET_SHIFT)
 #define   HUC_LOADING_AGENT_GUC			REG_BIT(1)
 #define   GUC_WOPCM_OFFSET_VALID		REG_BIT(0)
 
+#define GUC_SRAM_STATUS				XE_REG(0xc398)
+#define   GUC_SRAM_HANDLING_MASK		REG_GENMASK(8, 7)
+
 #define GUC_MAX_IDLE_COUNT			XE_REG(0xc3e4)
+#define   GUC_IDLE_FLOW_DISABLE			REG_BIT(31)
 #define GUC_PMTIMESTAMP_LO			XE_REG(0xc3e8)
 #define GUC_PMTIMESTAMP_HI			XE_REG(0xc3ec)
@@ -11,14 +11,26 @@
 #include "xe_pci_test.h"
 
 #define TEST_MAX_VFS 63
+#define TEST_VRAM 0x37a800000ull
 
 static void pf_set_admin_mode(struct xe_device *xe, bool enable)
 {
 	/* should match logic of xe_sriov_pf_admin_only() */
 	xe->info.probe_display = !enable;
 	xe->sriov.pf.admin_only = enable;
 	KUNIT_EXPECT_EQ(kunit_get_current_test(), enable, xe_sriov_pf_admin_only(xe));
 }
 
+static void pf_set_usable_vram(struct xe_device *xe, u64 usable)
+{
+	struct xe_tile *tile = xe_device_get_root_tile(xe);
+	struct kunit *test = kunit_get_current_test();
+
+	KUNIT_ASSERT_NOT_ERR_OR_NULL(test, tile);
+	xe->mem.vram->usable_size = usable;
+	tile->mem.vram->usable_size = usable;
+	KUNIT_ASSERT_EQ(test, usable, xe_vram_region_usable_size(tile->mem.vram));
+}
+
 static const void *num_vfs_gen_param(struct kunit *test, const void *prev, char *desc)
 {
 	unsigned long next = 1 + (unsigned long)prev;
@@ -34,9 +46,11 @@ static int pf_gt_config_test_init(struct kunit *test)
 {
 	struct xe_pci_fake_data fake = {
 		.sriov_mode = XE_SRIOV_MODE_PF,
-		.platform = XE_TIGERLAKE, /* any random platform with SR-IOV */
+		.platform = XE_BATTLEMAGE, /* any random DGFX platform with SR-IOV */
 		.subplatform = XE_SUBPLATFORM_NONE,
+		.graphics_verx100 = 2001,
 	};
+	struct xe_vram_region *vram;
 	struct xe_device *xe;
 	struct xe_gt *gt;
 
@@ -50,6 +64,19 @@ static int pf_gt_config_test_init(struct kunit *test)
 	KUNIT_ASSERT_NOT_ERR_OR_NULL(test, gt);
 	test->priv = gt;
 
+	/* pretend it has some VRAM */
+	KUNIT_ASSERT_TRUE(test, IS_DGFX(xe));
+	vram = kunit_kzalloc(test, sizeof(*vram), GFP_KERNEL);
+	KUNIT_ASSERT_NOT_ERR_OR_NULL(test, vram);
+	vram->usable_size = TEST_VRAM;
+	xe->mem.vram = vram;
+	xe->tiles[0].mem.vram = vram;
+
+	/* pretend we have a valid LMTT */
+	KUNIT_ASSERT_TRUE(test, xe_device_has_lmtt(xe));
+	KUNIT_ASSERT_GE(test, GRAPHICS_VERx100(xe), 1260);
+	xe->tiles[0].sriov.pf.lmtt.ops = &lmtt_ml_ops;
+
 	/* pretend it can support up to 63 VFs */
 	xe->sriov.pf.device_total_vfs = TEST_MAX_VFS;
 	xe->sriov.pf.driver_max_vfs = TEST_MAX_VFS;
@@ -189,13 +216,80 @@ static void fair_ggtt(struct kunit *test)
 	KUNIT_ASSERT_EQ(test, SZ_2G, pf_profile_fair_ggtt(gt, num_vfs));
 }
 
+static const u64 vram_sizes[] = {
+	SZ_4G - SZ_512M,
+	SZ_8G + SZ_4G - SZ_512M,
+	SZ_16G - SZ_512M,
+	SZ_32G - SZ_512M,
+	SZ_64G - SZ_512M,
+	TEST_VRAM,
+};
+
+static void u64_param_get_desc(const u64 *p, char *desc)
+{
+	string_get_size(*p, 1, STRING_UNITS_2, desc, KUNIT_PARAM_DESC_SIZE);
+}
+
+KUNIT_ARRAY_PARAM(vram_size, vram_sizes, u64_param_get_desc);
+
+static void fair_vram_1vf(struct kunit *test)
+{
+	const u64 usable = *(const u64 *)test->param_value;
+	struct xe_gt *gt = test->priv;
+	struct xe_device *xe = gt_to_xe(gt);
+
+	pf_set_admin_mode(xe, false);
+	pf_set_usable_vram(xe, usable);
+
+	KUNIT_EXPECT_NE(test, 0, pf_profile_fair_lmem(gt, 1));
+	KUNIT_EXPECT_GE(test, usable, pf_profile_fair_lmem(gt, 1));
+	KUNIT_EXPECT_TRUE(test, is_power_of_2(pf_profile_fair_lmem(gt, 1)));
+	KUNIT_EXPECT_GE(test, usable - pf_profile_fair_lmem(gt, 1), pf_profile_fair_lmem(gt, 1));
+}
+
+static void fair_vram_1vf_admin_only(struct kunit *test)
+{
+	const u64 usable = *(const u64 *)test->param_value;
+	struct xe_gt *gt = test->priv;
+	struct xe_device *xe = gt_to_xe(gt);
+
+	pf_set_admin_mode(xe, true);
+	pf_set_usable_vram(xe, usable);
+
+	KUNIT_EXPECT_NE(test, 0, pf_profile_fair_lmem(gt, 1));
+	KUNIT_EXPECT_GE(test, usable, pf_profile_fair_lmem(gt, 1));
+	KUNIT_EXPECT_LT(test, usable - pf_profile_fair_lmem(gt, 1), pf_profile_fair_lmem(gt, 1));
+	KUNIT_EXPECT_TRUE(test, IS_ALIGNED(pf_profile_fair_lmem(gt, 1), SZ_1G));
+}
+
+static void fair_vram(struct kunit *test)
+{
+	unsigned int num_vfs = (unsigned long)test->param_value;
+	struct xe_gt *gt = test->priv;
+	struct xe_device *xe = gt_to_xe(gt);
+	u64 alignment = pf_get_lmem_alignment(gt);
+	char size[10];
+
+	pf_set_admin_mode(xe, false);
+
+	string_get_size(pf_profile_fair_lmem(gt, num_vfs), 1, STRING_UNITS_2, size, sizeof(size));
+	kunit_info(test, "fair %s %llx\n", size, pf_profile_fair_lmem(gt, num_vfs));
+
+	KUNIT_EXPECT_TRUE(test, is_power_of_2(pf_profile_fair_lmem(gt, num_vfs)));
+	KUNIT_EXPECT_TRUE(test, IS_ALIGNED(pf_profile_fair_lmem(gt, num_vfs), alignment));
+	KUNIT_EXPECT_GE(test, TEST_VRAM, num_vfs * pf_profile_fair_lmem(gt, num_vfs));
+}
+
 static struct kunit_case pf_gt_config_test_cases[] = {
 	KUNIT_CASE(fair_contexts_1vf),
 	KUNIT_CASE(fair_doorbells_1vf),
 	KUNIT_CASE(fair_ggtt_1vf),
+	KUNIT_CASE_PARAM(fair_vram_1vf, vram_size_gen_params),
+	KUNIT_CASE_PARAM(fair_vram_1vf_admin_only, vram_size_gen_params),
 	KUNIT_CASE_PARAM(fair_contexts, num_vfs_gen_param),
 	KUNIT_CASE_PARAM(fair_doorbells, num_vfs_gen_param),
 	KUNIT_CASE_PARAM(fair_ggtt, num_vfs_gen_param),
+	KUNIT_CASE_PARAM(fair_vram, num_vfs_gen_param),
 	{}
 };
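The invariants these new tests assert (the per-VF share is a power of two, suitably aligned, and num_vfs shares fit within the usable VRAM) suggest roughly the following shape for the fair-share computation. This is a reverse-engineered sketch for illustration only, not the driver's actual pf_profile_fair_lmem() implementation:

	/* Sketch of a fair VRAM split satisfying the test invariants above. */
	#include <linux/log2.h>
	#include <linux/math64.h>

	static u64 fair_vram_share(u64 usable, unsigned int num_vfs)
	{
		if (!num_vfs || usable < num_vfs)
			return 0;

		/* Largest power-of-two share such that num_vfs shares still fit. */
		return rounddown_pow_of_two(div_u64(usable, num_vfs));
	}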
@@ -38,12 +38,8 @@ static struct xe_bo *replacement_xe_managed_bo_create_pin_map(struct xe_device *
 	if (flags & XE_BO_FLAG_GGTT) {
 		struct xe_ggtt *ggtt = tile->mem.ggtt;
 
-		bo->ggtt_node[tile->id] = xe_ggtt_node_init(ggtt);
+		bo->ggtt_node[tile->id] = xe_ggtt_insert_node(ggtt, xe_bo_size(bo), SZ_4K);
 		KUNIT_ASSERT_NOT_ERR_OR_NULL(test, bo->ggtt_node[tile->id]);
-
-		KUNIT_ASSERT_EQ(test, 0,
-				xe_ggtt_node_insert(bo->ggtt_node[tile->id],
-						    xe_bo_size(bo), SZ_4K));
 	}
 
 	return bo;
@@ -48,6 +48,38 @@ struct g2g_test_payload {
 	u32 seqno;
 };
 
+static int slot_index_from_gts(struct xe_gt *tx_gt, struct xe_gt *rx_gt)
+{
+	struct xe_device *xe = gt_to_xe(tx_gt);
+	int idx = 0, found = 0, id, tx_idx, rx_idx;
+	struct xe_gt *gt;
+	struct kunit *test = kunit_get_current_test();
+
+	for (id = 0; id < xe->info.tile_count * xe->info.max_gt_per_tile; id++) {
+		gt = xe_device_get_gt(xe, id);
+		if (!gt)
+			continue;
+
+		if (gt == tx_gt) {
+			tx_idx = idx;
+			found++;
+		}
+
+		if (gt == rx_gt) {
+			rx_idx = idx;
+			found++;
+		}
+
+		if (found == 2)
+			break;
+
+		idx++;
+	}
+
+	if (found != 2)
+		KUNIT_FAIL(test, "GT index not found");
+
+	return (tx_idx * xe->info.gt_count) + rx_idx;
+}
+
 static void g2g_test_send(struct kunit *test, struct xe_guc *guc,
 			  u32 far_tile, u32 far_dev,
 			  struct g2g_test_payload *payload)
@@ -163,7 +195,7 @@ int xe_guc_g2g_test_notification(struct xe_guc *guc, u32 *msg, u32 len)
 		goto done;
 	}
 
-	idx = (tx_gt->info.id * xe->info.gt_count) + rx_gt->info.id;
+	idx = slot_index_from_gts(tx_gt, rx_gt);
 
 	if (xe->g2g_test_array[idx] != payload->seqno - 1) {
 		xe_gt_err(rx_gt, "G2G: Seqno mismatch %d vs %d for %d:%d -> %d:%d!\n",
@@ -180,13 +212,17 @@ int xe_guc_g2g_test_notification(struct xe_guc *guc, u32 *msg, u32 len)
 	return ret;
 }
 
+#define G2G_WAIT_TIMEOUT_MS	100
+#define G2G_WAIT_POLL_MS	1
+
 /*
  * Send the given seqno from all GuCs to all other GuCs in tile/GT order
  */
 static void g2g_test_in_order(struct kunit *test, struct xe_device *xe, u32 seqno)
 {
 	struct xe_gt *near_gt, *far_gt;
-	int i, j;
+	int i, j, waited;
+	u32 idx;
 
 	for_each_gt(near_gt, xe, i) {
 		u32 near_tile = gt_to_tile(near_gt)->id;
@@ -205,6 +241,27 @@ static void g2g_test_in_order(struct kunit *test, struct xe_device *xe, u32 seqno)
 			payload.rx_dev = far_dev;
 			payload.rx_tile = far_tile;
 			payload.seqno = seqno;
+
+			/* Calculate idx for event-based wait */
+			idx = slot_index_from_gts(near_gt, far_gt);
+			waited = 0;
+
+			/*
+			 * Wait for the previous seqno to be acknowledged before
+			 * sending, to avoid queuing too many back-to-back
+			 * messages and causing a test timeout. The actual
+			 * correctness of the message is checked later in
+			 * xe_guc_g2g_test_notification().
+			 */
+			while (xe->g2g_test_array[idx] != (seqno - 1)) {
+				msleep(G2G_WAIT_POLL_MS);
+				waited += G2G_WAIT_POLL_MS;
+				if (waited >= G2G_WAIT_TIMEOUT_MS) {
+					kunit_info(test, "Timeout waiting! tx gt: %d, rx gt: %d\n",
+						   near_gt->info.id, far_gt->info.id);
+					break;
+				}
+			}
+
 			g2g_test_send(test, &near_gt->uc.guc, far_tile, far_dev, &payload);
 		}
 	}
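The slot_index_from_gts() helper exists because GT driver IDs can be sparse (a disabled media GT leaves a hole in the ID space), so indexing the flat g2g_test_array with raw info.id values could skew or overrun the slot computation. A worked example under assumed values:

	/* Assumed example: gt_count == 3 with one hole in the raw GT IDs.
	 * The helper assigns compact indices 0, 1, 2 to the GTs that exist,
	 * so a message from compact GT 2 to compact GT 1 lands in slot
	 * 2 * gt_count + 1 == 7, regardless of the raw IDs involved.
	 */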
@@ -19,6 +19,8 @@ static void check_graphics_ip(struct kunit *test)
 	const struct xe_ip *param = test->param_value;
 	const struct xe_graphics_desc *graphics = param->desc;
 	u64 mask = graphics->hw_engine_mask;
+	u8 fuse_regs = graphics->num_geometry_xecore_fuse_regs +
+		       graphics->num_compute_xecore_fuse_regs;
 
 	/* RCS, CCS, and BCS engines are allowed on the graphics IP */
 	mask &= ~(XE_HW_ENGINE_RCS_MASK |
@@ -27,6 +29,12 @@ static void check_graphics_ip(struct kunit *test)
 
 	/* Any remaining engines are an error */
 	KUNIT_ASSERT_EQ(test, mask, 0);
+
+	/*
+	 * All graphics IPs should have at least one geometry and/or compute
+	 * XeCore fuse register.
+	 */
+	KUNIT_ASSERT_GE(test, fuse_regs, 1);
 }
 
 static void check_media_ip(struct kunit *test)
@@ -322,7 +322,8 @@ static void xe_rtp_process_to_sr_tests(struct kunit *test)
 		count_rtp_entries++;
 
 	xe_rtp_process_ctx_enable_active_tracking(&ctx, &active, count_rtp_entries);
-	xe_rtp_process_to_sr(&ctx, param->entries, count_rtp_entries, reg_sr);
+	xe_rtp_process_to_sr(&ctx, param->entries, count_rtp_entries,
+			     reg_sr, false);
 
 	xa_for_each(&reg_sr->xa, idx, sre) {
 		if (idx == param->expected_reg.addr)
@@ -59,16 +59,51 @@ struct xe_bb *xe_bb_new(struct xe_gt *gt, u32 dwords, bool usm)
 	return ERR_PTR(err);
 }
 
-struct xe_bb *xe_bb_ccs_new(struct xe_gt *gt, u32 dwords,
-			    enum xe_sriov_vf_ccs_rw_ctxs ctx_id)
+/**
+ * xe_bb_alloc() - Allocate a new batch buffer structure
+ * @gt: the &xe_gt
+ *
+ * Allocates and initializes a new xe_bb structure with an associated
+ * uninitialized suballoc object.
+ *
+ * Returns: Batch buffer structure or an ERR_PTR(-ENOMEM).
+ */
+struct xe_bb *xe_bb_alloc(struct xe_gt *gt)
 {
 	struct xe_bb *bb = kmalloc_obj(*bb);
-	struct xe_device *xe = gt_to_xe(gt);
-	struct xe_sa_manager *bb_pool;
 	int err;
 
 	if (!bb)
 		return ERR_PTR(-ENOMEM);
 
+	bb->bo = xe_sa_bo_alloc(GFP_KERNEL);
+	if (IS_ERR(bb->bo)) {
+		err = PTR_ERR(bb->bo);
+		goto err;
+	}
+
+	return bb;
+
+err:
+	kfree(bb);
+	return ERR_PTR(err);
+}
+
+/**
+ * xe_bb_init() - Initialize a batch buffer with memory from a sub-allocator pool
+ * @bb: Batch buffer structure to initialize
+ * @bb_pool: Suballoc memory pool to allocate from
+ * @dwords: Number of dwords to be allocated
+ *
+ * Initializes the batch buffer by allocating memory from the specified
+ * suballoc pool.
+ *
+ * Return: 0 on success, negative error code on failure.
+ */
+int xe_bb_init(struct xe_bb *bb, struct xe_sa_manager *bb_pool, u32 dwords)
+{
+	int err;
+
 	/*
 	 * We need to allocate space for the requested number of dwords &
 	 * one additional MI_BATCH_BUFFER_END dword. Since the whole SA
@@ -76,22 +111,14 @@ struct xe_bb *xe_bb_ccs_new(struct xe_gt *gt, u32 dwords,
 	 * is not overwritten when the last chunk of SA is allocated for BB.
 	 * So, this extra DW acts as a guard here.
 	 */
-	bb_pool = xe->sriov.vf.ccs.contexts[ctx_id].mem.ccs_bb_pool;
-	bb->bo = xe_sa_bo_new(bb_pool, 4 * (dwords + 1));
-
-	if (IS_ERR(bb->bo)) {
-		err = PTR_ERR(bb->bo);
-		goto err;
-	}
+	err = xe_sa_bo_init(bb_pool, bb->bo, 4 * (dwords + 1));
+	if (err)
+		return err;
 
 	bb->cs = xe_sa_bo_cpu_addr(bb->bo);
 	bb->len = 0;
 
-	return bb;
-err:
-	kfree(bb);
-	return ERR_PTR(err);
+	return 0;
 }
 
 static struct xe_sched_job *
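A sketch of the intended caller pattern for the new xe_bb split, mirroring the fs_reclaim fix this series makes for CCS save/restore: allocation happens early, and only xe_bb_init() runs from the reclaim-sensitive path. The wrapper name is invented, and freeing a partially initialized bb via xe_bb_free() is an assumption about the updated API:

	/* Hypothetical caller of the two-phase batch buffer API. */
	static int my_setup_bb(struct xe_gt *gt, struct xe_sa_manager *bb_pool,
			       u32 num_dwords, struct xe_bb **out)
	{
		struct xe_bb *bb = xe_bb_alloc(gt);	/* may allocate memory */
		int err;

		if (IS_ERR(bb))
			return PTR_ERR(bb);

		/* Later, from the fs_reclaim-tainted path: no allocation here. */
		err = xe_bb_init(bb, bb_pool, num_dwords);
		if (err) {
			xe_bb_free(bb, NULL);	/* assumed to handle this state */
			return err;
		}

		*out = bb;
		return 0;
	}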
@@ -12,12 +12,12 @@ struct dma_fence;
 
 struct xe_gt;
 struct xe_exec_queue;
+struct xe_sa_manager;
 struct xe_sched_job;
-enum xe_sriov_vf_ccs_rw_ctxs;
 
 struct xe_bb *xe_bb_new(struct xe_gt *gt, u32 dwords, bool usm);
-struct xe_bb *xe_bb_ccs_new(struct xe_gt *gt, u32 dwords,
-			    enum xe_sriov_vf_ccs_rw_ctxs ctx_id);
+struct xe_bb *xe_bb_alloc(struct xe_gt *gt);
+int xe_bb_init(struct xe_bb *bb, struct xe_sa_manager *bb_pool, u32 dwords);
 struct xe_sched_job *xe_bb_create_job(struct xe_exec_queue *q,
 				      struct xe_bb *bb);
 struct xe_sched_job *xe_bb_create_migration_job(struct xe_exec_queue *q,
@@ -512,8 +512,8 @@ static struct ttm_tt *xe_ttm_tt_create(struct ttm_buffer_object *ttm_bo,
 	/*
 	 * Display scanout is always non-coherent with the CPU cache.
 	 *
-	 * For Xe_LPG and beyond, PPGTT PTE lookups are also
-	 * non-coherent and require a CPU:WC mapping.
+	 * For Xe_LPG and beyond, up to (but excluding) NVL-P, PPGTT PTE
+	 * lookups are also non-coherent and require a CPU:WC mapping.
 	 */
 	if ((!bo->cpu_caching && bo->flags & XE_BO_FLAG_SCANOUT) ||
 	    (!xe->info.has_cached_pt && bo->flags & XE_BO_FLAG_PAGETABLE))
@@ -15,6 +15,7 @@
 
 #include "instructions/xe_mi_commands.h"
 #include "xe_configfs.h"
+#include "xe_defaults.h"
 #include "xe_gt_types.h"
 #include "xe_hw_engine_types.h"
 #include "xe_module.h"
@@ -263,6 +264,7 @@ struct xe_config_group_device {
 		bool enable_psmi;
 		struct {
 			unsigned int max_vfs;
+			bool admin_only_pf;
 		} sriov;
 	} config;
 
@@ -280,7 +282,8 @@ static const struct xe_config_device device_defaults = {
 	.survivability_mode = false,
 	.enable_psmi = false,
 	.sriov = {
-		.max_vfs = UINT_MAX,
+		.max_vfs = XE_DEFAULT_MAX_VFS,
+		.admin_only_pf = XE_DEFAULT_ADMIN_ONLY_PF,
 	},
 };
 
@@ -830,6 +833,7 @@ static void xe_config_device_release(struct config_item *item)
 
 	mutex_destroy(&dev->lock);
 
+	kfree(dev->config.ctx_restore_mid_bb[0].cs);
 	kfree(dev->config.ctx_restore_post_bb[0].cs);
 	kfree(dev);
 }
@@ -896,10 +900,40 @@ static ssize_t sriov_max_vfs_store(struct config_item *item, const char *page, size_t len)
 	return len;
 }
 
+static ssize_t sriov_admin_only_pf_show(struct config_item *item, char *page)
+{
+	struct xe_config_group_device *dev = to_xe_config_group_device(item->ci_parent);
+
+	guard(mutex)(&dev->lock);
+
+	return sprintf(page, "%s\n", str_yes_no(dev->config.sriov.admin_only_pf));
+}
+
+static ssize_t sriov_admin_only_pf_store(struct config_item *item, const char *page, size_t len)
+{
+	struct xe_config_group_device *dev = to_xe_config_group_device(item->ci_parent);
+	bool admin_only_pf;
+	int ret;
+
+	guard(mutex)(&dev->lock);
+
+	if (is_bound(dev))
+		return -EBUSY;
+
+	ret = kstrtobool(page, &admin_only_pf);
+	if (ret)
+		return ret;
+
+	dev->config.sriov.admin_only_pf = admin_only_pf;
+	return len;
+}
+
 CONFIGFS_ATTR(sriov_, max_vfs);
+CONFIGFS_ATTR(sriov_, admin_only_pf);
 
 static struct configfs_attribute *xe_config_sriov_attrs[] = {
 	&sriov_attr_max_vfs,
+	&sriov_attr_admin_only_pf,
 	NULL,
 };
 
@@ -910,6 +944,8 @@ static bool xe_config_sriov_is_visible(struct config_item *item,
 
 	if (attr == &sriov_attr_max_vfs && dev->mode != XE_SRIOV_MODE_PF)
 		return false;
+	if (attr == &sriov_attr_admin_only_pf && dev->mode != XE_SRIOV_MODE_PF)
+		return false;
 
 	return true;
 }
@@ -1063,6 +1099,7 @@ static void dump_custom_dev_config(struct pci_dev *pdev,
 	PRI_CUSTOM_ATTR("%llx", engines_allowed);
 	PRI_CUSTOM_ATTR("%d", enable_psmi);
 	PRI_CUSTOM_ATTR("%d", survivability_mode);
+	PRI_CUSTOM_ATTR("%u", sriov.admin_only_pf);
 
 #undef PRI_CUSTOM_ATTR
 }
@@ -1241,6 +1278,32 @@ u32 xe_configfs_get_ctx_restore_post_bb(struct pci_dev *pdev,
 }
 
 #ifdef CONFIG_PCI_IOV
+/**
+ * xe_configfs_admin_only_pf() - Get the PF's operational mode.
+ * @pdev: the &pci_dev device
+ *
+ * Find the configfs group that belongs to the PCI device and return a flag
+ * indicating whether the PF driver should be dedicated to VFs management only.
+ *
+ * If the configfs group is not present, the driver's default value is used.
+ *
+ * Return: true if the PF driver is dedicated to VFs administration only.
+ */
+bool xe_configfs_admin_only_pf(struct pci_dev *pdev)
+{
+	struct xe_config_group_device *dev = find_xe_config_group_device(pdev);
+	bool admin_only_pf;
+
+	if (!dev)
+		return XE_DEFAULT_ADMIN_ONLY_PF;
+
+	scoped_guard(mutex, &dev->lock)
+		admin_only_pf = dev->config.sriov.admin_only_pf;
+
+	config_group_put(&dev->group);
+
+	return admin_only_pf;
+}
+
 /**
  * xe_configfs_get_max_vfs() - Get number of VFs that could be managed
  * @pdev: the &pci_dev device
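Since the admin_only_pf store rejects changes once the driver is bound (-EBUSY), the attribute has to be set through configfs before probe. A hedged userspace sketch; the per-device group path ("/sys/kernel/config/xe/0000:03:00.0") follows xe's existing configfs convention and the exact attribute location under the group is an assumption:

	#include <errno.h>
	#include <fcntl.h>
	#include <stdio.h>
	#include <sys/stat.h>
	#include <unistd.h>

	int main(void)
	{
		const char *dir = "/sys/kernel/config/xe/0000:03:00.0";
		const char *attr =
			"/sys/kernel/config/xe/0000:03:00.0/sriov/admin_only_pf";
		int fd;

		/* Creating the directory instantiates the device config group. */
		if (mkdir(dir, 0755) && errno != EEXIST) {
			perror("mkdir");
			return 1;
		}

		fd = open(attr, O_WRONLY);
		if (fd < 0) {
			perror("open");	/* attribute is hidden unless mode is PF */
			return 1;
		}
		/* Fails with -EBUSY once the driver is already bound. */
		if (write(fd, "1\n", 2) < 0)
			perror("write");
		close(fd);
		return 0;
	}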
@@ -8,7 +8,9 @@
 #include <linux/limits.h>
 #include <linux/types.h>
 
-#include <xe_hw_engine_types.h>
+#include "xe_defaults.h"
+#include "xe_hw_engine_types.h"
+#include "xe_module.h"
 
 struct pci_dev;
 
@@ -29,6 +31,7 @@ u32 xe_configfs_get_ctx_restore_post_bb(struct pci_dev *pdev,
 					const u32 **cs);
 #ifdef CONFIG_PCI_IOV
 unsigned int xe_configfs_get_max_vfs(struct pci_dev *pdev);
+bool xe_configfs_admin_only_pf(struct pci_dev *pdev);
 #endif
 #else
 static inline int xe_configfs_init(void) { return 0; }
@@ -45,7 +48,16 @@ static inline u32 xe_configfs_get_ctx_restore_mid_bb(struct pci_dev *pdev,
 static inline u32 xe_configfs_get_ctx_restore_post_bb(struct pci_dev *pdev,
 						      enum xe_engine_class class,
 						      const u32 **cs) { return 0; }
-static inline unsigned int xe_configfs_get_max_vfs(struct pci_dev *pdev) { return UINT_MAX; }
+#ifdef CONFIG_PCI_IOV
+static inline unsigned int xe_configfs_get_max_vfs(struct pci_dev *pdev)
+{
+	return xe_modparam.max_vfs;
+}
+static inline bool xe_configfs_admin_only_pf(struct pci_dev *pdev)
+{
+	return XE_DEFAULT_ADMIN_ONLY_PF;
+}
+#endif
 #endif
 
 #endif
drivers/gpu/drm/xe/xe_defaults.h (new file, 26 lines)
@@ -0,0 +1,26 @@
+/* SPDX-License-Identifier: MIT */
+/*
+ * Copyright © 2026 Intel Corporation
+ */
+#ifndef _XE_DEFAULTS_H_
+#define _XE_DEFAULTS_H_
+
+#include "xe_device_types.h"
+
+#if IS_ENABLED(CONFIG_DRM_XE_DEBUG)
+#define XE_DEFAULT_GUC_LOG_LEVEL	3
+#else
+#define XE_DEFAULT_GUC_LOG_LEVEL	1
+#endif
+
+#define XE_DEFAULT_PROBE_DISPLAY	IS_ENABLED(CONFIG_DRM_XE_DISPLAY)
+#define XE_DEFAULT_VRAM_BAR_SIZE	0
+#define XE_DEFAULT_FORCE_PROBE		CONFIG_DRM_XE_FORCE_PROBE
+#define XE_DEFAULT_MAX_VFS		~0
+#define XE_DEFAULT_MAX_VFS_STR		"unlimited"
+#define XE_DEFAULT_ADMIN_ONLY_PF	false
+#define XE_DEFAULT_WEDGED_MODE		XE_WEDGED_MODE_UPON_CRITICAL_ERROR
+#define XE_DEFAULT_WEDGED_MODE_STR	"upon-critical-error"
+#define XE_DEFAULT_SVM_NOTIFIER_SIZE	512
+
+#endif
@@ -356,7 +356,7 @@ static void devcoredump_snapshot(struct xe_devcoredump *coredump,
 
 	xe_engine_snapshot_capture_for_queue(q);
 
-	queue_work(system_unbound_wq, &ss->work);
+	queue_work(system_dfl_wq, &ss->work);
 
 	dma_fence_end_signalling(cookie);
 }
@@ -26,6 +26,7 @@
 #include "xe_bo.h"
 #include "xe_bo_evict.h"
 #include "xe_debugfs.h"
+#include "xe_defaults.h"
 #include "xe_devcoredump.h"
 #include "xe_device_sysfs.h"
 #include "xe_dma_buf.h"
@@ -455,16 +456,16 @@ struct xe_device *xe_device_create(struct pci_dev *pdev,
 			      xe->drm.anon_inode->i_mapping,
 			      xe->drm.vma_offset_manager, 0);
 	if (WARN_ON(err))
-		goto err;
+		return ERR_PTR(err);
 
 	xe_bo_dev_init(&xe->bo_device);
 	err = drmm_add_action_or_reset(&xe->drm, xe_device_destroy, NULL);
 	if (err)
-		goto err;
+		return ERR_PTR(err);
 
 	err = xe_shrinker_create(xe);
 	if (err)
-		goto err;
+		return ERR_PTR(err);
 
 	xe->info.devid = pdev->device;
 	xe->info.revid = pdev->revision;
@@ -474,7 +475,7 @@ struct xe_device *xe_device_create(struct pci_dev *pdev,
 
 	err = xe_irq_init(xe);
 	if (err)
-		goto err;
+		return ERR_PTR(err);
 
 	xe_validation_device_init(&xe->val);
 
@@ -484,7 +485,7 @@ struct xe_device *xe_device_create(struct pci_dev *pdev,
 
 	err = xe_pagemap_shrinker_create(xe);
 	if (err)
-		goto err;
+		return ERR_PTR(err);
 
 	xa_init_flags(&xe->usm.asid_to_vm, XA_FLAGS_ALLOC);
 
@@ -503,13 +504,13 @@ struct xe_device *xe_device_create(struct pci_dev *pdev,
 
 	err = xe_bo_pinned_init(xe);
 	if (err)
-		goto err;
+		return ERR_PTR(err);
 
 	xe->preempt_fence_wq = alloc_ordered_workqueue("xe-preempt-fence-wq",
 						       WQ_MEM_RECLAIM);
 	xe->ordered_wq = alloc_ordered_workqueue("xe-ordered-wq", 0);
-	xe->unordered_wq = alloc_workqueue("xe-unordered-wq", 0, 0);
-	xe->destroy_wq = alloc_workqueue("xe-destroy-wq", 0, 0);
+	xe->unordered_wq = alloc_workqueue("xe-unordered-wq", WQ_PERCPU, 0);
+	xe->destroy_wq = alloc_workqueue("xe-destroy-wq", WQ_PERCPU, 0);
 	if (!xe->ordered_wq || !xe->unordered_wq ||
 	    !xe->preempt_fence_wq || !xe->destroy_wq) {
 		/*
@@ -517,18 +518,14 @@ struct xe_device *xe_device_create(struct pci_dev *pdev,
 		 * drmm_add_action_or_reset register above
 		 */
 		drm_err(&xe->drm, "Failed to allocate xe workqueues\n");
-		err = -ENOMEM;
-		goto err;
+		return ERR_PTR(-ENOMEM);
 	}
 
 	err = drmm_mutex_init(&xe->drm, &xe->pmt.lock);
 	if (err)
-		goto err;
+		return ERR_PTR(err);
 
 	return xe;
-
-err:
-	return ERR_PTR(err);
 }
 ALLOW_ERROR_INJECTION(xe_device_create, ERRNO); /* See xe_pci_probe() */
 
@@ -743,7 +740,7 @@ int xe_device_probe_early(struct xe_device *xe)
 	assert_lmem_ready(xe);
 
 	xe->wedged.mode = xe_device_validate_wedged_mode(xe, xe_modparam.wedged_mode) ?
-			  XE_WEDGED_MODE_DEFAULT : xe_modparam.wedged_mode;
+			  XE_DEFAULT_WEDGED_MODE : xe_modparam.wedged_mode;
 	drm_dbg(&xe->drm, "wedged_mode: setting mode (%u) %s\n",
 		xe->wedged.mode, xe_wedged_mode_to_string(xe->wedged.mode));
 
@@ -1311,7 +1308,8 @@ void xe_device_declare_wedged(struct xe_device *xe)
 	xe->needs_flr_on_fini = true;
 	drm_err(&xe->drm,
 		"CRITICAL: Xe has declared device %s as wedged.\n"
-		"IOCTLs and executions are blocked. Only a rebind may clear the failure\n"
+		"IOCTLs and executions are blocked.\n"
+		"For recovery procedure, refer to https://docs.kernel.org/gpu/drm-uapi.html#device-wedging\n"
 		"Please file a _new_ bug report at https://gitlab.freedesktop.org/drm/xe/kernel/issues/new\n",
 		dev_name(xe->drm.dev));
 }
@@ -1374,3 +1372,28 @@ const char *xe_wedged_mode_to_string(enum xe_wedged_mode mode)
 		return "<invalid>";
 	}
 }
+
+/**
+ * xe_device_asid_to_vm() - Find a VM from an ASID
+ * @xe: the &xe_device
+ * @asid: Address space ID
+ *
+ * Find a VM from an ASID and take a reference to the VM, which the caller
+ * must drop. Reclaim safe.
+ *
+ * Return: VM on success, ERR_PTR on failure
+ */
+struct xe_vm *xe_device_asid_to_vm(struct xe_device *xe, u32 asid)
+{
+	struct xe_vm *vm;
+
+	down_read(&xe->usm.lock);
+	vm = xa_load(&xe->usm.asid_to_vm, asid);
+	if (vm)
+		xe_vm_get(vm);
+	else
+		vm = ERR_PTR(-EINVAL);
+	up_read(&xe->usm.lock);
+
+	return vm;
+}
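A sketch of the intended calling pattern for the new xe_device_asid_to_vm() helper: the lookup takes a VM reference under usm.lock, so every successful call must be paired with xe_vm_put(). The fault-handler shape and name below are illustrative assumptions:

	/* Hypothetical pagefault-style caller of xe_device_asid_to_vm(). */
	static int my_handle_fault(struct xe_device *xe, u32 asid)
	{
		struct xe_vm *vm = xe_device_asid_to_vm(xe, asid);

		if (IS_ERR(vm))
			return PTR_ERR(vm);	/* -EINVAL: stale or unknown ASID */

		/* ... service the fault against vm ... */

		xe_vm_put(vm);
		return 0;
	}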
@@ -12,6 +12,8 @@
 #include "xe_gt_types.h"
 #include "xe_sriov.h"
 
+struct xe_vm;
+
 static inline struct xe_device *to_xe_device(const struct drm_device *dev)
 {
 	return container_of(dev, struct xe_device, drm);
@@ -60,13 +62,6 @@ static inline struct xe_tile *xe_device_get_root_tile(struct xe_device *xe)
 	return &xe->tiles[0];
 }
 
-/*
- * Highest GT/tile count for any platform. Used only for memory allocation
- * sizing. Any logic looping over GTs or mapping userspace GT IDs into GT
- * structures should use the per-platform xe->info.max_gt_per_tile instead.
- */
-#define XE_MAX_GT_PER_TILE 2
-
 static inline struct xe_gt *xe_device_get_gt(struct xe_device *xe, u8 gt_id)
 {
 	struct xe_tile *tile;
@@ -114,6 +109,11 @@ static inline struct xe_gt *xe_root_mmio_gt(struct xe_device *xe)
 	return xe_device_get_root_tile(xe)->primary_gt;
 }
 
+static inline struct xe_mmio *xe_root_tile_mmio(struct xe_device *xe)
+{
+	return &xe->tiles[0].mmio;
+}
+
 static inline bool xe_device_uc_enabled(struct xe_device *xe)
 {
 	return !xe->info.force_execlist;
@@ -204,6 +204,8 @@ int xe_is_injection_active(void);
 
 bool xe_is_xe_file(const struct file *file);
 
+struct xe_vm *xe_device_asid_to_vm(struct xe_device *xe, u32 asid);
+
 /*
  * Occasionally it is seen that the G2H worker starts running after a delay of more than
  * a second even after being queued and activated by the Linux workqueue subsystem. This
@ -15,9 +15,6 @@
|
|||
#include "xe_devcoredump_types.h"
|
||||
#include "xe_heci_gsc.h"
|
||||
#include "xe_late_bind_fw_types.h"
|
||||
#include "xe_lmtt_types.h"
|
||||
#include "xe_memirq_types.h"
|
||||
#include "xe_mert.h"
|
||||
#include "xe_oa_types.h"
|
||||
#include "xe_pagefault_types.h"
|
||||
#include "xe_platform_types.h"
|
||||
|
|
@ -29,14 +26,13 @@
|
|||
#include "xe_sriov_vf_ccs_types.h"
|
||||
#include "xe_step_types.h"
|
||||
#include "xe_survivability_mode_types.h"
|
||||
#include "xe_tile_sriov_vf_types.h"
|
||||
#include "xe_tile_types.h"
|
||||
#include "xe_validation.h"
|
||||
|
||||
#if IS_ENABLED(CONFIG_DRM_XE_DEBUG)
|
||||
#define TEST_VM_OPS_ERROR
|
||||
#endif
|
||||
|
||||
struct dram_info;
|
||||
struct drm_pagemap_shrinker;
|
||||
struct intel_display;
|
||||
struct intel_dg_nvm_dev;
|
||||
|
|
@ -62,9 +58,6 @@ enum xe_wedged_mode {
|
|||
XE_WEDGED_MODE_UPON_ANY_HANG_NO_RESET = 2,
|
||||
};
|
||||
|
||||
#define XE_WEDGED_MODE_DEFAULT XE_WEDGED_MODE_UPON_CRITICAL_ERROR
|
||||
#define XE_WEDGED_MODE_DEFAULT_STR "upon-critical-error"
|
||||
|
||||
#define XE_BO_INVALID_OFFSET LONG_MAX
|
||||
|
||||
#define GRAPHICS_VER(xe) ((xe)->info.graphics_verx100 / 100)
|
||||
|
|
@ -79,6 +72,13 @@ enum xe_wedged_mode {
|
|||
#define XE_GT1 1
|
||||
#define XE_MAX_TILES_PER_DEVICE (XE_GT1 + 1)
|
||||
|
||||
/*
|
||||
* Highest GT/tile count for any platform. Used only for memory allocation
|
||||
* sizing. Any logic looping over GTs or mapping userspace GT IDs into GT
|
||||
* structures should use the per-platform xe->info.max_gt_per_tile instead.
|
||||
*/
|
||||
#define XE_MAX_GT_PER_TILE 2
|
||||
|
||||
#define XE_MAX_ASID (BIT(20))
|
||||
|
||||
#define IS_PLATFORM_STEP(_xe, _platform, min_step, max_step) \
|
||||
|
|
@ -91,168 +91,6 @@ enum xe_wedged_mode {
|
|||
(_xe)->info.step.graphics >= (min_step) && \
|
||||
(_xe)->info.step.graphics < (max_step))
|
||||
|
||||
#define tile_to_xe(tile__) \
|
||||
_Generic(tile__, \
|
||||
const struct xe_tile * : (const struct xe_device *)((tile__)->xe), \
|
||||
struct xe_tile * : (tile__)->xe)
|
||||
|
||||
/**
|
||||
* struct xe_mmio - register mmio structure
|
||||
*
|
||||
* Represents an MMIO region that the CPU may use to access registers. A
|
||||
* region may share its IO map with other regions (e.g., all GTs within a
|
||||
* tile share the same map with their parent tile, but represent different
|
||||
* subregions of the overall IO space).
|
||||
*/
|
||||
struct xe_mmio {
|
||||
/** @tile: Backpointer to tile, used for tracing */
|
||||
struct xe_tile *tile;
|
||||
|
||||
/** @regs: Map used to access registers. */
|
||||
void __iomem *regs;
|
||||
|
||||
/**
|
||||
* @sriov_vf_gt: Backpointer to GT.
|
||||
*
|
||||
* This pointer is only set for GT MMIO regions and only when running
|
||||
* as an SRIOV VF structure
|
||||
*/
|
||||
struct xe_gt *sriov_vf_gt;
|
||||
|
||||
/**
|
||||
* @regs_size: Length of the register region within the map.
|
||||
*
|
||||
* The size of the iomap set in *regs is generally larger than the
|
||||
* register mmio space since it includes unused regions and/or
|
||||
* non-register regions such as the GGTT PTEs.
|
||||
*/
|
||||
size_t regs_size;
|
||||
|
||||
/** @adj_limit: adjust MMIO address if address is below this value */
|
||||
u32 adj_limit;
|
||||
|
||||
/** @adj_offset: offset to add to MMIO address when adjusting */
|
||||
u32 adj_offset;
|
||||
};

/**
 * struct xe_tile - hardware tile structure
 *
 * From a driver perspective, a "tile" is effectively a complete GPU, containing
 * an SGunit, 1-2 GTs, and (for discrete platforms) VRAM.
 *
 * Multi-tile platforms effectively bundle multiple GPUs behind a single PCI
 * device and designate one "root" tile as being responsible for external PCI
 * communication. PCI BAR0 exposes the GGTT and MMIO register space for each
 * tile in a stacked layout, and PCI BAR2 exposes the local memory associated
 * with each tile similarly. Device-wide interrupts can be enabled/disabled
 * at the root tile, and the MSTR_TILE_INTR register will report which tiles
 * have interrupts that need servicing.
 */
struct xe_tile {
	/** @xe: Backpointer to tile's PCI device */
	struct xe_device *xe;

	/** @id: ID of the tile */
	u8 id;

	/**
	 * @primary_gt: Primary GT
	 */
	struct xe_gt *primary_gt;

	/**
	 * @media_gt: Media GT
	 *
	 * Only present on devices with media version >= 13.
	 */
	struct xe_gt *media_gt;

	/**
	 * @mmio: MMIO info for a tile.
	 *
	 * Each tile has its own 16MB space in BAR0, laid out as:
	 * * 0-4MB: registers
	 * * 4MB-8MB: reserved
	 * * 8MB-16MB: global GTT
	 */
	struct xe_mmio mmio;

	/** @mem: memory management info for tile */
	struct {
		/**
		 * @mem.kernel_vram: kernel-dedicated VRAM info for tile.
		 *
		 * Although VRAM is associated with a specific tile, it can
		 * still be accessed by all tiles' GTs.
		 */
		struct xe_vram_region *kernel_vram;

		/**
		 * @mem.vram: general purpose VRAM info for tile.
		 *
		 * Although VRAM is associated with a specific tile, it can
		 * still be accessed by all tiles' GTs.
		 */
		struct xe_vram_region *vram;

		/** @mem.ggtt: Global graphics translation table */
		struct xe_ggtt *ggtt;

		/**
		 * @mem.kernel_bb_pool: Pool from which batchbuffers are allocated.
		 *
		 * Media GT shares a pool with its primary GT.
		 */
		struct xe_sa_manager *kernel_bb_pool;

		/**
		 * @mem.reclaim_pool: Pool from which PRLs are allocated.
		 *
		 * Only the main GT has page reclaim list allocations.
		 */
		struct xe_sa_manager *reclaim_pool;
	} mem;

	/** @sriov: tile level virtualization data */
	union {
		struct {
			/** @sriov.pf.lmtt: Local Memory Translation Table. */
			struct xe_lmtt lmtt;
		} pf;
		struct {
			/** @sriov.vf.ggtt_balloon: GGTT regions excluded from use. */
			struct xe_ggtt_node *ggtt_balloon[2];
			/** @sriov.vf.self_config: VF configuration data */
			struct xe_tile_sriov_vf_selfconfig self_config;
		} vf;
	} sriov;

	/** @memirq: Memory Based Interrupts. */
	struct xe_memirq memirq;

	/** @csc_hw_error_work: worker to report CSC HW errors */
	struct work_struct csc_hw_error_work;

	/** @pcode: tile's PCODE */
	struct {
		/** @pcode.lock: protecting tile's PCODE mailbox data */
		struct mutex lock;
	} pcode;

	/** @migrate: Migration helper for vram blits and clearing */
	struct xe_migrate *migrate;

	/** @sysfs: sysfs' kobj used by xe_tile_sysfs */
	struct kobject *sysfs;

	/** @debugfs: debugfs directory associated with this tile */
	struct dentry *debugfs;

	/** @mert: MERT-related data */
	struct xe_mert mert;
};

/**
 * struct xe_device - Top level struct of Xe device
 */

@@ -300,6 +138,8 @@ struct xe_device {
		u8 tile_count;
		/** @info.max_gt_per_tile: Number of GT IDs allocated to each tile */
		u8 max_gt_per_tile;
		/** @info.multi_lrc_mask: bitmask of engine classes which support multi-lrc */
		u8 multi_lrc_mask;
		/** @info.gt_count: Total number of GTs for entire device */
		u8 gt_count;
		/** @info.vm_max_level: Max VM level */

@@ -353,6 +193,8 @@ struct xe_device {
		u8 has_pre_prod_wa:1;
		/** @info.has_pxp: Device has PXP support */
		u8 has_pxp:1;
		/** @info.has_ctx_tlb_inval: Has context based TLB invalidations */
		u8 has_ctx_tlb_inval:1;
		/** @info.has_range_tlb_inval: Has range based TLB invalidations */
		u8 has_range_tlb_inval:1;
		/** @info.has_soc_remapper_sysctrl: Has SoC remapper system controller */

@@ -559,10 +401,12 @@ struct xe_device {
		const struct xe_pat_table_entry *table;
		/** @pat.n_entries: Number of PAT entries */
		int n_entries;
		/** @pat.ats_entry: PAT entry for PCIe ATS responses */
		/** @pat.pat_ats: PAT entry for PCIe ATS responses */
		const struct xe_pat_table_entry *pat_ats;
		/** @pat.pta_entry: PAT entry for page table accesses */
		const struct xe_pat_table_entry *pat_pta;
		/** @pat.pat_primary_pta: primary GT PAT entry for page table accesses */
		const struct xe_pat_table_entry *pat_primary_pta;
		/** @pat.pat_media_pta: media GT PAT entry for page table accesses */
		const struct xe_pat_table_entry *pat_media_pta;
		u32 idx[__XE_CACHE_LEVEL_COUNT];
	} pat;

@@ -152,8 +152,10 @@ static void __xe_exec_queue_free(struct xe_exec_queue *q)
	if (xe_exec_queue_is_multi_queue(q))
		xe_exec_queue_group_cleanup(q);

	if (q->vm)
	if (q->vm) {
		xe_vm_remove_exec_queue(q->vm, q);
		xe_vm_put(q->vm);
	}

	if (q->xef)
		xe_file_put(q->xef);

@@ -224,9 +226,12 @@ static struct xe_exec_queue *__xe_exec_queue_alloc(struct xe_device *xe,
	q->ring_ops = gt->ring_ops[hwe->class];
	q->ops = gt->exec_queue_ops;
	INIT_LIST_HEAD(&q->lr.link);
	INIT_LIST_HEAD(&q->vm_exec_queue_link);
	INIT_LIST_HEAD(&q->multi_gt_link);
	INIT_LIST_HEAD(&q->hw_engine_group_link);
	INIT_LIST_HEAD(&q->pxp.link);
	spin_lock_init(&q->multi_queue.lock);
	spin_lock_init(&q->lrc_lookup_lock);
	q->multi_queue.priority = XE_MULTI_QUEUE_PRIORITY_NORMAL;

	q->sched_props.timeslice_us = hwe->eclass->sched_props.timeslice_us;

@@ -266,6 +271,66 @@ static struct xe_exec_queue *__xe_exec_queue_alloc(struct xe_device *xe,
	return q;
}

static void xe_exec_queue_set_lrc(struct xe_exec_queue *q, struct xe_lrc *lrc, u16 idx)
{
	xe_assert(gt_to_xe(q->gt), idx < q->width);

	scoped_guard(spinlock, &q->lrc_lookup_lock)
		q->lrc[idx] = lrc;
}

/**
 * xe_exec_queue_get_lrc() - Get the LRC from exec queue.
 * @q: The exec queue instance.
 * @idx: Index within multi-LRC array.
 *
 * Retrieves the LRC at the given index for the exec queue under lock
 * and takes a reference.
 *
 * Return: Pointer to LRC on success, NULL on lookup failure.
 */
struct xe_lrc *xe_exec_queue_get_lrc(struct xe_exec_queue *q, u16 idx)
{
	struct xe_lrc *lrc;

	xe_assert(gt_to_xe(q->gt), idx < q->width);

	scoped_guard(spinlock, &q->lrc_lookup_lock) {
		lrc = q->lrc[idx];
		if (lrc)
			xe_lrc_get(lrc);
	}

	return lrc;
}
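
A short usage sketch for the lookup helper above; on success the caller owns a reference and must drop it (assumes the surrounding xe driver context):

	/* Illustrative caller: operate on the LRC at idx, if still present. */
	static int touch_lrc_sketch(struct xe_exec_queue *q, u16 idx)
	{
		struct xe_lrc *lrc = xe_exec_queue_get_lrc(q, idx);

		if (!lrc)
			return -ENOENT;	/* raced with teardown/recreation */

		/* ... lrc is safe to use here, a reference is held ... */

		xe_lrc_put(lrc);
		return 0;
	}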

/**
 * xe_exec_queue_lrc() - Get the LRC from exec queue.
 * @q: The exec queue instance.
 *
 * Retrieves the primary LRC for the exec queue. Note that this function
 * returns only the first LRC instance, even when multiple parallel LRCs
 * are configured. This function does not take a reference, so none needs
 * to be released after use.
 *
 * Return: Pointer to the primary LRC.
 */
struct xe_lrc *xe_exec_queue_lrc(struct xe_exec_queue *q)
{
	return q->lrc[0];
}

static void __xe_exec_queue_fini(struct xe_exec_queue *q)
{
	int i;

	q->ops->fini(q);

	for (i = 0; i < q->width; ++i)
		xe_lrc_put(q->lrc[i]);
}

static int __xe_exec_queue_init(struct xe_exec_queue *q, u32 exec_queue_flags)
{
	int i, err;

@@ -303,38 +368,37 @@ static int __xe_exec_queue_init(struct xe_exec_queue *q, u32 exec_queue_flags)
	 * from the moment vCPU resumes execution.
	 */
	for (i = 0; i < q->width; ++i) {
		struct xe_lrc *lrc;
		struct xe_lrc *__lrc = NULL;
		int marker;

		xe_gt_sriov_vf_wait_valid_ggtt(q->gt);
		lrc = xe_lrc_create(q->hwe, q->vm, q->replay_state,
				    xe_lrc_ring_size(), q->msix_vec, flags);
		if (IS_ERR(lrc)) {
			err = PTR_ERR(lrc);
			goto err_lrc;
		}
		do {
			struct xe_lrc *lrc;

		/* Pairs with READ_ONCE in xe_exec_queue_contexts_hwsp_rebase */
		WRITE_ONCE(q->lrc[i], lrc);
			marker = xe_gt_sriov_vf_wait_valid_ggtt(q->gt);

			lrc = xe_lrc_create(q->hwe, q->vm, q->replay_state,
					    xe_lrc_ring_size(), q->msix_vec, flags);
			if (IS_ERR(lrc)) {
				err = PTR_ERR(lrc);
				goto err_lrc;
			}

			xe_exec_queue_set_lrc(q, lrc, i);

			if (__lrc)
				xe_lrc_put(__lrc);
			__lrc = lrc;

		} while (marker != xe_vf_migration_fixups_complete_count(q->gt));
	}

	return 0;

err_lrc:
	for (i = i - 1; i >= 0; --i)
		xe_lrc_put(q->lrc[i]);
	__xe_exec_queue_fini(q);
	return err;
}
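
The do/while above is a seqcount-style retry: sample the fixup-completion marker, create the LRC, and redo the creation if a VF migration fixup completed in between. A self-contained sketch of the same pattern; every name below is hypothetical:

	#include <stdatomic.h>
	#include <stdlib.h>

	/* Bumped by a (hypothetical) fixup pass whenever it completes. */
	static _Atomic unsigned int fixup_generation;

	struct object { int state; };

	static struct object *build_once(void)
	{
		return calloc(1, sizeof(struct object)); /* may observe pre-fixup state */
	}

	/* Rebuild until no fixup pass completed while we were building. */
	static struct object *build_stable(void)
	{
		struct object *obj, *prev = NULL;
		unsigned int gen;

		do {
			gen = atomic_load(&fixup_generation);
			obj = build_once();
			if (!obj) {
				free(prev);
				return NULL;
			}
			free(prev);	/* drop the stale attempt */
			prev = obj;
		} while (gen != atomic_load(&fixup_generation));

		return obj;
	}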

static void __xe_exec_queue_fini(struct xe_exec_queue *q)
{
	int i;

	q->ops->fini(q);

	for (i = 0; i < q->width; ++i)
		xe_lrc_put(q->lrc[i]);
}

struct xe_exec_queue *xe_exec_queue_create(struct xe_device *xe, struct xe_vm *vm,
					   u32 logical_mask, u16 width,
					   struct xe_hw_engine *hwe, u32 flags,

@@ -1180,6 +1244,11 @@ int xe_exec_queue_create_ioctl(struct drm_device *dev, void *data,
	if (XE_IOCTL_DBG(xe, !hwe))
		return -EINVAL;

	/* multi-lrc is only supported on select engine classes */
	if (XE_IOCTL_DBG(xe, args->width > 1 &&
			 !(xe->info.multi_lrc_mask & BIT(hwe->class))))
		return -EOPNOTSUPP;

	vm = xe_vm_lookup(xef, args->vm_id);
	if (XE_IOCTL_DBG(xe, !vm))
		return -ENOENT;
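
Per the UAPI change in this series ("restrict multi-lrc to VCS/VECS engines"), only the video classes should have their bit set in info.multi_lrc_mask. The mask's init site is not part of this diff, so the following is an assumption-laden sketch of how such a mask works:

	#define BIT(n) (1u << (n))

	enum engine_class {			/* illustrative subset */
		CLASS_RENDER,
		CLASS_VIDEO_DECODE,		/* VCS */
		CLASS_VIDEO_ENHANCE,		/* VECS */
	};

	static const unsigned int multi_lrc_mask =
		BIT(CLASS_VIDEO_DECODE) | BIT(CLASS_VIDEO_ENHANCE);

	static int check_width(unsigned int width, enum engine_class class)
	{
		if (width > 1 && !(multi_lrc_mask & BIT(class)))
			return -1;	/* would be -EOPNOTSUPP in the driver */
		return 0;
	}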

@@ -1233,6 +1302,8 @@ int xe_exec_queue_create_ioctl(struct drm_device *dev, void *data,
	}

	q->xef = xe_file_get(xef);
	if (eci[0].engine_class != DRM_XE_ENGINE_CLASS_VM_BIND)
		xe_vm_add_exec_queue(vm, q);

	/* user id alloc must always be last in ioctl to prevent UAF */
	err = xa_alloc(&xef->exec_queue.xa, &id, q, xa_limit_32b, GFP_KERNEL);

@@ -1283,21 +1354,6 @@ int xe_exec_queue_get_property_ioctl(struct drm_device *dev, void *data,
	return ret;
}

/**
 * xe_exec_queue_lrc() - Get the LRC from exec queue.
 * @q: The exec_queue.
 *
 * Retrieves the primary LRC for the exec queue. Note that this function
 * returns only the first LRC instance, even when multiple parallel LRCs
 * are configured.
 *
 * Return: Pointer to LRC on success, error on failure
 */
struct xe_lrc *xe_exec_queue_lrc(struct xe_exec_queue *q)
{
	return q->lrc[0];
}

/**
 * xe_exec_queue_is_lr() - Whether an exec_queue is long-running
 * @q: The exec_queue

@@ -1657,14 +1713,14 @@ int xe_exec_queue_contexts_hwsp_rebase(struct xe_exec_queue *q, void *scratch)
	for (i = 0; i < q->width; ++i) {
		struct xe_lrc *lrc;

		/* Pairs with WRITE_ONCE in __xe_exec_queue_init */
		lrc = READ_ONCE(q->lrc[i]);
		lrc = xe_exec_queue_get_lrc(q, i);
		if (!lrc)
			continue;

		xe_lrc_update_memirq_regs_with_address(lrc, q->hwe, scratch);
		xe_lrc_update_hwctx_regs_with_address(lrc);
		err = xe_lrc_setup_wa_bb_with_scratch(lrc, q->hwe, scratch);
		xe_lrc_put(lrc);
		if (err)
			break;
	}

@@ -160,6 +160,7 @@ void xe_exec_queue_update_run_ticks(struct xe_exec_queue *q);
int xe_exec_queue_contexts_hwsp_rebase(struct xe_exec_queue *q, void *scratch);

struct xe_lrc *xe_exec_queue_lrc(struct xe_exec_queue *q);
struct xe_lrc *xe_exec_queue_get_lrc(struct xe_exec_queue *q, u16 idx);

/**
 * xe_exec_queue_idle_skip_suspend() - Can exec queue skip suspend

@@ -66,6 +66,8 @@ struct xe_exec_queue_group {
	bool sync_pending;
	/** @banned: Group banned */
	bool banned;
	/** @stopped: Group is stopped, protected by list_lock */
	bool stopped;
};

/**

@@ -159,8 +161,13 @@ struct xe_exec_queue {
		struct xe_exec_queue_group *group;
		/** @multi_queue.link: Link into group's secondary queues list */
		struct list_head link;
		/** @multi_queue.priority: Queue priority within the multi-queue group */
		/**
		 * @multi_queue.priority: Queue priority within the multi-queue group.
		 * It is protected by @multi_queue.lock.
		 */
		enum xe_multi_queue_priority priority;
		/** @multi_queue.lock: Lock for protecting certain members */
		spinlock_t lock;
		/** @multi_queue.pos: Position of queue within the multi-queue group */
		u8 pos;
		/** @multi_queue.valid: Queue belongs to a multi queue group */

@@ -211,6 +218,9 @@ struct xe_exec_queue {
		struct dma_fence *last_fence;
	} tlb_inval[XE_EXEC_QUEUE_TLB_INVAL_COUNT];

	/** @vm_exec_queue_link: Link to track exec queue within a VM's list of exec queues. */
	struct list_head vm_exec_queue_link;

	/** @pxp: PXP info tracking */
	struct {
		/** @pxp.type: PXP session type used by this queue */

@@ -247,6 +257,11 @@ struct xe_exec_queue {
	u64 tlb_flush_seqno;
	/** @hw_engine_group_link: link into exec queues in the same hw engine group */
	struct list_head hw_engine_group_link;
	/**
	 * @lrc_lookup_lock: Lock protecting lrc array access. Only needed where
	 * lookups can run in parallel with queue creation.
	 */
	spinlock_t lrc_lookup_lock;
	/** @lrc: logical ring context for this exec queue */
	struct xe_lrc *lrc[] __counted_by(width);
};
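
The trailing flexible array is annotated with __counted_by(width) so FORTIFY/UBSAN-style bounds checks know its runtime length. A generic sketch of the allocation pattern that pairs with the annotation (struct name is illustrative, kernel context assumed):

	/* Illustrative: trailing array sized by the 'width' counter member. */
	struct lrc_holder {
		u16 width;
		void *slot[] __counted_by(width);
	};

	static struct lrc_holder *holder_alloc(u16 width)
	{
		struct lrc_holder *h = kzalloc(struct_size(h, slot, width), GFP_KERNEL);

		if (h)
			h->width = width;	/* set the counter before touching the array */
		return h;
	}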

@@ -301,6 +316,8 @@ struct xe_exec_queue_ops {
	void (*resume)(struct xe_exec_queue *q);
	/** @reset_status: check exec queue reset status */
	bool (*reset_status)(struct xe_exec_queue *q);
	/** @active: check exec queue is active */
	bool (*active)(struct xe_exec_queue *q);
};

#endif

@@ -421,7 +421,7 @@ static void execlist_exec_queue_kill(struct xe_exec_queue *q)
static void execlist_exec_queue_destroy(struct xe_exec_queue *q)
{
	INIT_WORK(&q->execlist->destroy_async, execlist_exec_queue_destroy_async);
	queue_work(system_unbound_wq, &q->execlist->destroy_async);
	queue_work(system_dfl_wq, &q->execlist->destroy_async);
}

static int execlist_exec_queue_set_priority(struct xe_exec_queue *q,

@@ -468,6 +468,12 @@ static bool execlist_exec_queue_reset_status(struct xe_exec_queue *q)
	return false;
}

static bool execlist_exec_queue_active(struct xe_exec_queue *q)
{
	/* NIY */
	return false;
}

static const struct xe_exec_queue_ops execlist_exec_queue_ops = {
	.init = execlist_exec_queue_init,
	.kill = execlist_exec_queue_kill,

@@ -480,6 +486,7 @@ static const struct xe_exec_queue_ops execlist_exec_queue_ops = {
	.suspend_wait = execlist_exec_queue_suspend_wait,
	.resume = execlist_exec_queue_resume,
	.reset_status = execlist_exec_queue_reset_status,
	.active = execlist_exec_queue_active,
};

int xe_execlist_init(struct xe_gt *gt)

@@ -148,12 +148,6 @@ static int domain_sleep_wait(struct xe_gt *gt,
	return __domain_wait(gt, domain, false);
}

#define for_each_fw_domain_masked(domain__, mask__, fw__, tmp__) \
	for (tmp__ = (mask__); tmp__; tmp__ &= ~BIT(ffs(tmp__) - 1)) \
		for_each_if((domain__ = ((fw__)->domains + \
					 (ffs(tmp__) - 1))) && \
			    domain__->reg_ctl.addr)

/**
 * xe_force_wake_get() : Increase the domain refcount
 * @fw: struct xe_force_wake

@@ -266,3 +260,43 @@ void xe_force_wake_put(struct xe_force_wake *fw, unsigned int fw_ref)
	xe_gt_WARN(gt, ack_fail, "Forcewake domain%s %#x failed to acknowledge sleep request\n",
		   str_plural(hweight_long(ack_fail)), ack_fail);
}

const char *xe_force_wake_domain_to_str(enum xe_force_wake_domain_id id)
{
	switch (id) {
	case XE_FW_DOMAIN_ID_GT:
		return "GT";
	case XE_FW_DOMAIN_ID_RENDER:
		return "Render";
	case XE_FW_DOMAIN_ID_MEDIA:
		return "Media";
	case XE_FW_DOMAIN_ID_MEDIA_VDBOX0:
		return "VDBox0";
	case XE_FW_DOMAIN_ID_MEDIA_VDBOX1:
		return "VDBox1";
	case XE_FW_DOMAIN_ID_MEDIA_VDBOX2:
		return "VDBox2";
	case XE_FW_DOMAIN_ID_MEDIA_VDBOX3:
		return "VDBox3";
	case XE_FW_DOMAIN_ID_MEDIA_VDBOX4:
		return "VDBox4";
	case XE_FW_DOMAIN_ID_MEDIA_VDBOX5:
		return "VDBox5";
	case XE_FW_DOMAIN_ID_MEDIA_VDBOX6:
		return "VDBox6";
	case XE_FW_DOMAIN_ID_MEDIA_VDBOX7:
		return "VDBox7";
	case XE_FW_DOMAIN_ID_MEDIA_VEBOX0:
		return "VEBox0";
	case XE_FW_DOMAIN_ID_MEDIA_VEBOX1:
		return "VEBox1";
	case XE_FW_DOMAIN_ID_MEDIA_VEBOX2:
		return "VEBox2";
	case XE_FW_DOMAIN_ID_MEDIA_VEBOX3:
		return "VEBox3";
	case XE_FW_DOMAIN_ID_GSC:
		return "GSC";
	default:
		return "Unknown";
	}
}
@@ -19,6 +19,17 @@ unsigned int __must_check xe_force_wake_get(struct xe_force_wake *fw,
					    enum xe_force_wake_domains domains);
void xe_force_wake_put(struct xe_force_wake *fw, unsigned int fw_ref);

const char *xe_force_wake_domain_to_str(enum xe_force_wake_domain_id id);

#define for_each_fw_domain_masked(domain__, mask__, fw__, tmp__) \
	for (tmp__ = (mask__); tmp__; tmp__ &= ~BIT(ffs(tmp__) - 1)) \
		for_each_if(((domain__) = ((fw__)->domains + \
					   (ffs(tmp__) - 1))) && \
			    (domain__)->reg_ctl.addr)

#define for_each_fw_domain(domain__, fw__, tmp__) \
	for_each_fw_domain_masked((domain__), (fw__)->initialized_domains, (fw__), (tmp__))

static inline int
xe_force_wake_ref(struct xe_force_wake *fw,
		  enum xe_force_wake_domains domain)

@@ -69,9 +69,8 @@
/**
 * struct xe_ggtt_node - A node in GGTT.
 *
 * This struct needs to be initialized (only-once) with xe_ggtt_node_init() before any node
 * insertion, reservation, or 'ballooning'.
 * It will, then, be finalized by either xe_ggtt_node_remove() or xe_ggtt_node_deballoon().
 * This struct is allocated with xe_ggtt_insert_node(,_transform) or xe_ggtt_insert_bo(,_at).
 * It will be deallocated using xe_ggtt_node_remove().
 */
struct xe_ggtt_node {
	/** @ggtt: Back pointer to xe_ggtt where this region will be inserted at */

@@ -84,6 +83,61 @@ struct xe_ggtt_node {
	bool invalidate_on_remove;
};

/**
 * struct xe_ggtt_pt_ops - GGTT Page table operations
 * Which can vary from platform to platform.
 */
struct xe_ggtt_pt_ops {
	/** @pte_encode_flags: Encode PTE flags for a given BO */
	u64 (*pte_encode_flags)(struct xe_bo *bo, u16 pat_index);

	/** @ggtt_set_pte: Directly write into GGTT's PTE */
	xe_ggtt_set_pte_fn ggtt_set_pte;

	/** @ggtt_get_pte: Directly read from GGTT's PTE */
	u64 (*ggtt_get_pte)(struct xe_ggtt *ggtt, u64 addr);
};

/**
 * struct xe_ggtt - Main GGTT struct
 *
 * In general, each tile can contain its own Global Graphics Translation Table
 * (GGTT) instance.
 */
struct xe_ggtt {
	/** @tile: Back pointer to tile where this GGTT belongs */
	struct xe_tile *tile;
	/** @start: Start offset of GGTT */
	u64 start;
	/** @size: Total usable size of this GGTT */
	u64 size;

#define XE_GGTT_FLAGS_64K BIT(0)
	/**
	 * @flags: Flags for this GGTT
	 * Acceptable flags:
	 * - %XE_GGTT_FLAGS_64K - if PTE size is 64K. Otherwise, regular is 4K.
	 */
	unsigned int flags;
	/** @scratch: Internal object allocation used as a scratch page */
	struct xe_bo *scratch;
	/** @lock: Mutex lock to protect GGTT data */
	struct mutex lock;
	/**
	 * @gsm: The iomem pointer to the actual location of the translation
	 * table located in the GSM for easy PTE manipulation
	 */
	u64 __iomem *gsm;
	/** @pt_ops: Page Table operations per platform */
	const struct xe_ggtt_pt_ops *pt_ops;
	/** @mm: The memory manager used to manage individual GGTT allocations */
	struct drm_mm mm;
	/** @access_count: counts GGTT writes */
	unsigned int access_count;
	/** @wq: Dedicated unordered work queue to process node removals */
	struct workqueue_struct *wq;
};

static u64 xelp_ggtt_pte_flags(struct xe_bo *bo, u16 pat_index)
{
	u64 pte = XE_PAGE_PRESENT;

@@ -193,7 +247,7 @@ static void xe_ggtt_set_pte_and_flush(struct xe_ggtt *ggtt, u64 addr, u64 pte)
static u64 xe_ggtt_get_pte(struct xe_ggtt *ggtt, u64 addr)
{
	xe_tile_assert(ggtt->tile, !(addr & XE_PTE_MASK));
	xe_tile_assert(ggtt->tile, addr < ggtt->size);
	xe_tile_assert(ggtt->tile, addr < ggtt->start + ggtt->size);

	return readq(&ggtt->gsm[addr >> XE_PTE_SHIFT]);
}
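
Worked example of the index math above, assuming 4 KiB GGTT pages so that XE_PTE_SHIFT is 12 (an assumption; the real value comes from the driver headers): each PTE is one qword in the GSM, so GGTT address 0x10000 resolves to gsm[0x10000 >> 12] = gsm[16].

	/* Illustrative arithmetic only. */
	static unsigned long pte_index_sketch(unsigned long ggtt_addr)
	{
		const unsigned int pte_shift = 12;	/* assumed 4 KiB pages */

		return ggtt_addr >> pte_shift;		/* 0x10000 -> 16 */
	}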

@@ -299,7 +353,7 @@ static void __xe_ggtt_init_early(struct xe_ggtt *ggtt, u64 start, u64 size)
{
	ggtt->start = start;
	ggtt->size = size;
	drm_mm_init(&ggtt->mm, start, size);
	drm_mm_init(&ggtt->mm, 0, size);
}

int xe_ggtt_init_kunit(struct xe_ggtt *ggtt, u32 start, u32 size)

@@ -347,9 +401,15 @@ int xe_ggtt_init_early(struct xe_ggtt *ggtt)
		ggtt_start = wopcm;
		ggtt_size = (gsm_size / 8) * (u64)XE_PAGE_SIZE - ggtt_start;
	} else {
		/* GGTT is expected to be 4GiB */
		ggtt_start = wopcm;
		ggtt_size = SZ_4G - ggtt_start;
		ggtt_start = xe_tile_sriov_vf_ggtt_base(ggtt->tile);
		ggtt_size = xe_tile_sriov_vf_ggtt(ggtt->tile);

		if (ggtt_start < wopcm ||
		    ggtt_start + ggtt_size > GUC_GGTT_TOP) {
			xe_tile_err(ggtt->tile, "Invalid GGTT configuration: %#llx-%#llx\n",
				    ggtt_start, ggtt_start + ggtt_size - 1);
			return -ERANGE;
		}
	}

	ggtt->gsm = ggtt->tile->mmio.regs + SZ_8M;

@@ -367,7 +427,7 @@ int xe_ggtt_init_early(struct xe_ggtt *ggtt)
	else
		ggtt->pt_ops = &xelp_pt_ops;

	ggtt->wq = alloc_workqueue("xe-ggtt-wq", WQ_MEM_RECLAIM, 0);
	ggtt->wq = alloc_workqueue("xe-ggtt-wq", WQ_MEM_RECLAIM | WQ_PERCPU, 0);
	if (!ggtt->wq)
		return -ENOMEM;

@@ -377,17 +437,7 @@ int xe_ggtt_init_early(struct xe_ggtt *ggtt)
	if (err)
		return err;

	err = devm_add_action_or_reset(xe->drm.dev, dev_fini_ggtt, ggtt);
	if (err)
		return err;

	if (IS_SRIOV_VF(xe)) {
		err = xe_tile_sriov_vf_prepare_ggtt(ggtt->tile);
		if (err)
			return err;
	}

	return 0;
	return devm_add_action_or_reset(xe->drm.dev, dev_fini_ggtt, ggtt);
}
ALLOW_ERROR_INJECTION(xe_ggtt_init_early, ERRNO); /* See xe_pci_probe() */

@@ -401,12 +451,17 @@ static void xe_ggtt_initial_clear(struct xe_ggtt *ggtt)
	/* Display may have allocated inside ggtt, so be careful with clearing here */
	mutex_lock(&ggtt->lock);
	drm_mm_for_each_hole(hole, &ggtt->mm, start, end)
		xe_ggtt_clear(ggtt, start, end - start);
		xe_ggtt_clear(ggtt, ggtt->start + start, end - start);

	xe_ggtt_invalidate(ggtt);
	mutex_unlock(&ggtt->lock);
}

static void ggtt_node_fini(struct xe_ggtt_node *node)
{
	kfree(node);
}

static void ggtt_node_remove(struct xe_ggtt_node *node)
{
	struct xe_ggtt *ggtt = node->ggtt;

@@ -418,7 +473,7 @@ static void ggtt_node_remove(struct xe_ggtt_node *node)

	mutex_lock(&ggtt->lock);
	if (bound)
		xe_ggtt_clear(ggtt, node->base.start, node->base.size);
		xe_ggtt_clear(ggtt, xe_ggtt_node_addr(node), xe_ggtt_node_size(node));
	drm_mm_remove_node(&node->base);
	node->base.size = 0;
	mutex_unlock(&ggtt->lock);

@@ -432,7 +487,7 @@ static void ggtt_node_remove(struct xe_ggtt_node *node)
	drm_dev_exit(idx);

free_node:
	xe_ggtt_node_fini(node);
	ggtt_node_fini(node);
}

static void ggtt_node_remove_work_func(struct work_struct *work)

@@ -538,169 +593,38 @@ static void xe_ggtt_invalidate(struct xe_ggtt *ggtt)
		ggtt_invalidate_gt_tlb(ggtt->tile->media_gt);
}

static void xe_ggtt_dump_node(struct xe_ggtt *ggtt,
			      const struct drm_mm_node *node, const char *description)
{
	char buf[10];

	if (IS_ENABLED(CONFIG_DRM_XE_DEBUG)) {
		string_get_size(node->size, 1, STRING_UNITS_2, buf, sizeof(buf));
		xe_tile_dbg(ggtt->tile, "GGTT %#llx-%#llx (%s) %s\n",
			    node->start, node->start + node->size, buf, description);
	}
}

/**
 * xe_ggtt_node_insert_balloon_locked - prevent allocation of specified GGTT addresses
 * @node: the &xe_ggtt_node to hold reserved GGTT node
 * @start: the starting GGTT address of the reserved region
 * @end: the end GGTT address of the reserved region
 *
 * To be used in cases where ggtt->lock is already taken.
 * Use xe_ggtt_node_remove_balloon_locked() to release a reserved GGTT node.
 *
 * Return: 0 on success or a negative error code on failure.
 */
int xe_ggtt_node_insert_balloon_locked(struct xe_ggtt_node *node, u64 start, u64 end)
{
	struct xe_ggtt *ggtt = node->ggtt;
	int err;

	xe_tile_assert(ggtt->tile, start < end);
	xe_tile_assert(ggtt->tile, IS_ALIGNED(start, XE_PAGE_SIZE));
	xe_tile_assert(ggtt->tile, IS_ALIGNED(end, XE_PAGE_SIZE));
	xe_tile_assert(ggtt->tile, !drm_mm_node_allocated(&node->base));
	lockdep_assert_held(&ggtt->lock);

	node->base.color = 0;
	node->base.start = start;
	node->base.size = end - start;

	err = drm_mm_reserve_node(&ggtt->mm, &node->base);

	if (xe_tile_WARN(ggtt->tile, err, "Failed to balloon GGTT %#llx-%#llx (%pe)\n",
			 node->base.start, node->base.start + node->base.size, ERR_PTR(err)))
		return err;

	xe_ggtt_dump_node(ggtt, &node->base, "balloon");
	return 0;
}
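
Ballooning is how a VF keeps the allocator away from GGTT ranges it does not own: with ggtt->lock held, reserve the spans below and above the VF's provisioned window (compare the two-element @sriov.vf.ggtt_balloon array earlier in this series). A hedged sketch of that flow; the helper names are real, but the exact call sites are not shown in this diff:

	/* Illustrative: exclude [wopcm_end, vf_start) and [vf_end, ggtt_end). */
	static int balloon_vf_window_sketch(struct xe_ggtt_node *below,
					    struct xe_ggtt_node *above,
					    u64 wopcm_end, u64 vf_start,
					    u64 vf_end, u64 ggtt_end)
	{
		int err;

		err = xe_ggtt_node_insert_balloon_locked(below, wopcm_end, vf_start);
		if (err)
			return err;

		err = xe_ggtt_node_insert_balloon_locked(above, vf_end, ggtt_end);
		if (err)
			xe_ggtt_node_remove_balloon_locked(below);

		return err;
	}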

/**
 * xe_ggtt_node_remove_balloon_locked - release a reserved GGTT region
 * @node: the &xe_ggtt_node with reserved GGTT region
 *
 * To be used in cases where ggtt->lock is already taken.
 * See xe_ggtt_node_insert_balloon_locked() for details.
 */
void xe_ggtt_node_remove_balloon_locked(struct xe_ggtt_node *node)
{
	if (!xe_ggtt_node_allocated(node))
		return;

	lockdep_assert_held(&node->ggtt->lock);

	xe_ggtt_dump_node(node->ggtt, &node->base, "remove-balloon");

	drm_mm_remove_node(&node->base);
}

static void xe_ggtt_assert_fit(struct xe_ggtt *ggtt, u64 start, u64 size)
{
	struct xe_tile *tile = ggtt->tile;

	xe_tile_assert(tile, start >= ggtt->start);
	xe_tile_assert(tile, start + size <= ggtt->start + ggtt->size);
}

/**
 * xe_ggtt_shift_nodes_locked - Shift GGTT nodes to adjust for a change in usable address range.
 * xe_ggtt_shift_nodes() - Shift GGTT nodes to adjust for a change in usable address range.
 * @ggtt: the &xe_ggtt struct instance
 * @shift: change to the location of area provisioned for current VF
 * @new_start: new location of area provisioned for current VF
 *
 * This function moves all nodes from the GGTT VM to a temp list. These nodes are expected
 * to represent allocations in range formerly assigned to current VF, before the range changed.
 * When the GGTT VM is completely clear of any nodes, they are re-added with shifted offsets.
 * Ensure that all struct &xe_ggtt_node are moved to the @new_start base address
 * by changing the base offset of the GGTT.
 *
 * The function has no way of failing, because it shifts existing nodes without
 * any additional processing. If the nodes existed successfully at the old address,
 * they will do the same at the new one. A failure inside this function would indicate
 * that the list of nodes was either already damaged, or that the shift brings the
 * address range outside of valid bounds. Both cases justify an assert rather than
 * an error code.
 * This function may be called multiple times during recovery, but if
 * @new_start is unchanged from the current base, it's a noop.
 *
 * @new_start should be a value between xe_wopcm_size() and #GUC_GGTT_TOP.
 */
void xe_ggtt_shift_nodes_locked(struct xe_ggtt *ggtt, s64 shift)
void xe_ggtt_shift_nodes(struct xe_ggtt *ggtt, u64 new_start)
{
	struct xe_tile *tile __maybe_unused = ggtt->tile;
	struct drm_mm_node *node, *tmpn;
	LIST_HEAD(temp_list_head);
	guard(mutex)(&ggtt->lock);

	lockdep_assert_held(&ggtt->lock);
	xe_tile_assert(ggtt->tile, new_start >= xe_wopcm_size(tile_to_xe(ggtt->tile)));
	xe_tile_assert(ggtt->tile, new_start + ggtt->size <= GUC_GGTT_TOP);

	if (IS_ENABLED(CONFIG_DRM_XE_DEBUG))
		drm_mm_for_each_node_safe(node, tmpn, &ggtt->mm)
			xe_ggtt_assert_fit(ggtt, node->start + shift, node->size);

	drm_mm_for_each_node_safe(node, tmpn, &ggtt->mm) {
		drm_mm_remove_node(node);
		list_add(&node->node_list, &temp_list_head);
	}

	list_for_each_entry_safe(node, tmpn, &temp_list_head, node_list) {
		list_del(&node->node_list);
		node->start += shift;
		drm_mm_reserve_node(&ggtt->mm, node);
		xe_tile_assert(tile, drm_mm_node_allocated(node));
	}
	/* pairs with READ_ONCE in xe_ggtt_node_addr() */
	WRITE_ONCE(ggtt->start, new_start);
}
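
Worked example of the relocation this enables (addresses made up for illustration): a node stored at drm_mm-relative offset 0x1000 resolves to absolute 0x201000 while the GGTT base is 0x200000; after xe_ggtt_shift_nodes(ggtt, 0x400000) the same untouched node resolves to 0x401000, because only the base changed:

	/* Illustrative arithmetic, matching xe_ggtt_node_addr() further below. */
	static u64 node_abs_addr_sketch(u64 node_rel_start, u64 ggtt_base)
	{
		return ggtt_base + node_rel_start;	/* 0x200000 + 0x1000 */
	}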

static int xe_ggtt_node_insert_locked(struct xe_ggtt_node *node,
static int xe_ggtt_insert_node_locked(struct xe_ggtt_node *node,
				      u32 size, u32 align, u32 mm_flags)
{
	return drm_mm_insert_node_generic(&node->ggtt->mm, &node->base, size, align, 0,
					  mm_flags);
}

/**
 * xe_ggtt_node_insert - Insert a &xe_ggtt_node into the GGTT
 * @node: the &xe_ggtt_node to be inserted
 * @size: size of the node
 * @align: alignment constraint of the node
 *
 * It cannot be called without first having called xe_ggtt_init() once.
 *
 * Return: 0 on success or a negative error code on failure.
 */
int xe_ggtt_node_insert(struct xe_ggtt_node *node, u32 size, u32 align)
{
	int ret;

	if (!node || !node->ggtt)
		return -ENOENT;

	mutex_lock(&node->ggtt->lock);
	ret = xe_ggtt_node_insert_locked(node, size, align,
					 DRM_MM_INSERT_HIGH);
	mutex_unlock(&node->ggtt->lock);

	return ret;
}

/**
 * xe_ggtt_node_init - Initialize %xe_ggtt_node struct
 * @ggtt: the &xe_ggtt where the new node will later be inserted/reserved.
 *
 * This function will allocate the struct %xe_ggtt_node and return its pointer.
 * This struct will then be freed after the node removal upon xe_ggtt_node_remove()
 * or xe_ggtt_node_remove_balloon_locked().
 *
 * Having %xe_ggtt_node struct allocated doesn't mean that the node is already
 * allocated in GGTT. Only xe_ggtt_node_insert(), allocation through
 * xe_ggtt_node_insert_transform(), or xe_ggtt_node_insert_balloon_locked() will
 * ensure the node is inserted or reserved in GGTT.
 *
 * Return: A pointer to %xe_ggtt_node struct on success. An ERR_PTR otherwise.
 **/
struct xe_ggtt_node *xe_ggtt_node_init(struct xe_ggtt *ggtt)
static struct xe_ggtt_node *ggtt_node_init(struct xe_ggtt *ggtt)
{
	struct xe_ggtt_node *node = kzalloc_obj(*node, GFP_NOFS);

@@ -714,30 +638,31 @@ struct xe_ggtt_node *xe_ggtt_node_init(struct xe_ggtt *ggtt)
}

/**
 * xe_ggtt_node_fini - Forcibly finalize %xe_ggtt_node struct
 * @node: the &xe_ggtt_node to be freed
 * xe_ggtt_insert_node - Insert a &xe_ggtt_node into the GGTT
 * @ggtt: the &xe_ggtt into which the node should be inserted.
 * @size: size of the node
 * @align: alignment constraint of the node
 *
 * If anything went wrong with either xe_ggtt_node_insert(), xe_ggtt_node_insert_locked(),
 * or xe_ggtt_node_insert_balloon_locked(); and this @node is not going to be reused, then,
 * this function needs to be called to free the %xe_ggtt_node struct
 **/
void xe_ggtt_node_fini(struct xe_ggtt_node *node)
{
	kfree(node);
}

/**
 * xe_ggtt_node_allocated - Check if node is allocated in GGTT
 * @node: the &xe_ggtt_node to be inspected
 *
 * Return: True if allocated, False otherwise.
 * Return: &xe_ggtt_node on success or an ERR_PTR on failure.
 */
bool xe_ggtt_node_allocated(const struct xe_ggtt_node *node)
struct xe_ggtt_node *xe_ggtt_insert_node(struct xe_ggtt *ggtt, u32 size, u32 align)
{
	if (!node || !node->ggtt)
		return false;
	struct xe_ggtt_node *node;
	int ret;

	return drm_mm_node_allocated(&node->base);
	node = ggtt_node_init(ggtt);
	if (IS_ERR(node))
		return node;

	guard(mutex)(&ggtt->lock);
	ret = xe_ggtt_insert_node_locked(node, size, align,
					 DRM_MM_INSERT_HIGH);
	if (ret) {
		ggtt_node_fini(node);
		return ERR_PTR(ret);
	}

	return node;
}
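
Usage sketch for the combined allocate-and-insert helper above; one call replaces the old init-then-insert pair and returns either a live node or an ERR_PTR, with nothing for the caller to clean up on failure (sizes below are arbitrary):

	/* Illustrative caller of the new helper. */
	static struct xe_ggtt_node *grab_ggtt_space_sketch(struct xe_ggtt *ggtt)
	{
		struct xe_ggtt_node *node;

		node = xe_ggtt_insert_node(ggtt, SZ_64K, SZ_4K);
		if (IS_ERR(node))
			return node;

		/* ... use it; later released via xe_ggtt_node_remove(node, false) */
		return node;
	}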

/**

@@ -770,7 +695,7 @@ static void xe_ggtt_map_bo(struct xe_ggtt *ggtt, struct xe_ggtt_node *node,
	if (XE_WARN_ON(!node))
		return;

	start = node->base.start;
	start = xe_ggtt_node_addr(node);
	end = start + xe_bo_size(bo);

	if (!xe_bo_is_vram(bo) && !xe_bo_is_stolen(bo)) {

@@ -811,7 +736,7 @@ void xe_ggtt_map_bo_unlocked(struct xe_ggtt *ggtt, struct xe_bo *bo)
}

/**
 * xe_ggtt_node_insert_transform - Insert a newly allocated &xe_ggtt_node into the GGTT
 * xe_ggtt_insert_node_transform - Insert a newly allocated &xe_ggtt_node into the GGTT
 * @ggtt: the &xe_ggtt where the node will be inserted/reserved.
 * @bo: The bo to be transformed
 * @pte_flags: The extra GGTT flags to add to mapping.

@@ -825,7 +750,7 @@ void xe_ggtt_map_bo_unlocked(struct xe_ggtt *ggtt, struct xe_bo *bo)
 *
 * Return: A pointer to %xe_ggtt_node struct on success. An ERR_PTR otherwise.
 */
struct xe_ggtt_node *xe_ggtt_node_insert_transform(struct xe_ggtt *ggtt,
struct xe_ggtt_node *xe_ggtt_insert_node_transform(struct xe_ggtt *ggtt,
						   struct xe_bo *bo, u64 pte_flags,
						   u64 size, u32 align,
						   xe_ggtt_transform_cb transform, void *arg)

@@ -833,7 +758,7 @@ struct xe_ggtt_node *xe_ggtt_node_insert_transform(struct xe_ggtt *ggtt,
	struct xe_ggtt_node *node;
	int ret;

	node = xe_ggtt_node_init(ggtt);
	node = ggtt_node_init(ggtt);
	if (IS_ERR(node))
		return ERR_CAST(node);

@@ -842,7 +767,7 @@ struct xe_ggtt_node *xe_ggtt_node_insert_transform(struct xe_ggtt *ggtt,
		goto err;
	}

	ret = xe_ggtt_node_insert_locked(node, size, align, 0);
	ret = xe_ggtt_insert_node_locked(node, size, align, 0);
	if (ret)
		goto err_unlock;

@@ -857,7 +782,7 @@ struct xe_ggtt_node *xe_ggtt_node_insert_transform(struct xe_ggtt *ggtt,
err_unlock:
	mutex_unlock(&ggtt->lock);
err:
	xe_ggtt_node_fini(node);
	ggtt_node_fini(node);
	return ERR_PTR(ret);
}

@@ -883,7 +808,7 @@ static int __xe_ggtt_insert_bo_at(struct xe_ggtt *ggtt, struct xe_bo *bo,

	xe_pm_runtime_get_noresume(tile_to_xe(ggtt->tile));

	bo->ggtt_node[tile_id] = xe_ggtt_node_init(ggtt);
	bo->ggtt_node[tile_id] = ggtt_node_init(ggtt);
	if (IS_ERR(bo->ggtt_node[tile_id])) {
		err = PTR_ERR(bo->ggtt_node[tile_id]);
		bo->ggtt_node[tile_id] = NULL;

@@ -891,10 +816,30 @@ static int __xe_ggtt_insert_bo_at(struct xe_ggtt *ggtt, struct xe_bo *bo,
	}

	mutex_lock(&ggtt->lock);
	/*
	 * When inheriting the initial framebuffer, the framebuffer is
	 * physically located at VRAM address 0, and usually at GGTT address 0 too.
	 *
	 * The display code will ask for a GGTT allocation between end of BO and
	 * remainder of GGTT, unaware that the start is reserved by WOPCM.
	 */
	if (start >= ggtt->start)
		start -= ggtt->start;
	else
		start = 0;

	/* Should never happen, but since we handle start, fail gracefully for end */
	if (end >= ggtt->start)
		end -= ggtt->start;
	else
		end = 0;

	xe_tile_assert(ggtt->tile, end >= start + xe_bo_size(bo));

	err = drm_mm_insert_node_in_range(&ggtt->mm, &bo->ggtt_node[tile_id]->base,
					  xe_bo_size(bo), alignment, 0, start, end, 0);
	if (err) {
		xe_ggtt_node_fini(bo->ggtt_node[tile_id]);
		ggtt_node_fini(bo->ggtt_node[tile_id]);
		bo->ggtt_node[tile_id] = NULL;
	} else {
		u16 cache_mode = bo->flags & XE_BO_FLAG_NEEDS_UC ? XE_CACHE_NONE : XE_CACHE_WB;

@@ -1002,18 +947,16 @@ static u64 xe_encode_vfid_pte(u16 vfid)
	return FIELD_PREP(GGTT_PTE_VFID, vfid) | XE_PAGE_PRESENT;
}

static void xe_ggtt_assign_locked(struct xe_ggtt *ggtt, const struct drm_mm_node *node, u16 vfid)
static void xe_ggtt_assign_locked(const struct xe_ggtt_node *node, u16 vfid)
{
	u64 start = node->start;
	u64 size = node->size;
	struct xe_ggtt *ggtt = node->ggtt;
	u64 start = xe_ggtt_node_addr(node);
	u64 size = xe_ggtt_node_size(node);
	u64 end = start + size - 1;
	u64 pte = xe_encode_vfid_pte(vfid);

	lockdep_assert_held(&ggtt->lock);

	if (!drm_mm_node_allocated(node))
		return;

	while (start < end) {
		ggtt->pt_ops->ggtt_set_pte(ggtt, start, pte);
		start += XE_PAGE_SIZE;

@@ -1033,9 +976,8 @@ static void xe_ggtt_assign_locked(struct xe_ggtt *ggtt, const struct drm_mm_node
 */
void xe_ggtt_assign(const struct xe_ggtt_node *node, u16 vfid)
{
	mutex_lock(&node->ggtt->lock);
	xe_ggtt_assign_locked(node->ggtt, &node->base, vfid);
	mutex_unlock(&node->ggtt->lock);
	guard(mutex)(&node->ggtt->lock);
	xe_ggtt_assign_locked(node, vfid);
}

/**

@@ -1057,14 +999,14 @@ int xe_ggtt_node_save(struct xe_ggtt_node *node, void *dst, size_t size, u16 vfi
	if (!node)
		return -ENOENT;

	guard(mutex)(&node->ggtt->lock);
	ggtt = node->ggtt;
	guard(mutex)(&ggtt->lock);

	if (xe_ggtt_node_pt_size(node) != size)
		return -EINVAL;

	ggtt = node->ggtt;
	start = node->base.start;
	end = start + node->base.size - 1;
	start = xe_ggtt_node_addr(node);
	end = start + xe_ggtt_node_size(node) - 1;

	while (start < end) {
		pte = ggtt->pt_ops->ggtt_get_pte(ggtt, start);

@@ -1097,14 +1039,14 @@ int xe_ggtt_node_load(struct xe_ggtt_node *node, const void *src, size_t size, u
	if (!node)
		return -ENOENT;

	guard(mutex)(&node->ggtt->lock);
	ggtt = node->ggtt;
	guard(mutex)(&ggtt->lock);

	if (xe_ggtt_node_pt_size(node) != size)
		return -EINVAL;

	ggtt = node->ggtt;
	start = node->base.start;
	end = start + node->base.size - 1;
	start = xe_ggtt_node_addr(node);
	end = start + xe_ggtt_node_size(node) - 1;

	while (start < end) {
		vfid_pte = u64_replace_bits(*buf++, vfid, GGTT_PTE_VFID);

@@ -1211,7 +1153,8 @@ u64 xe_ggtt_read_pte(struct xe_ggtt *ggtt, u64 offset)
 */
u64 xe_ggtt_node_addr(const struct xe_ggtt_node *node)
{
	return node->base.start;
	/* pairs with WRITE_ONCE in xe_ggtt_shift_nodes() */
	return node->base.start + READ_ONCE(node->ggtt->start);
}

/**

@@ -9,6 +9,7 @@
#include "xe_ggtt_types.h"

struct drm_printer;
struct xe_bo;
struct xe_tile;
struct drm_exec;

@@ -17,23 +18,18 @@ int xe_ggtt_init_early(struct xe_ggtt *ggtt);
int xe_ggtt_init_kunit(struct xe_ggtt *ggtt, u32 reserved, u32 size);
int xe_ggtt_init(struct xe_ggtt *ggtt);

struct xe_ggtt_node *xe_ggtt_node_init(struct xe_ggtt *ggtt);
void xe_ggtt_node_fini(struct xe_ggtt_node *node);
int xe_ggtt_node_insert_balloon_locked(struct xe_ggtt_node *node,
				       u64 start, u64 size);
void xe_ggtt_node_remove_balloon_locked(struct xe_ggtt_node *node);
void xe_ggtt_shift_nodes_locked(struct xe_ggtt *ggtt, s64 shift);
void xe_ggtt_shift_nodes(struct xe_ggtt *ggtt, u64 new_base);
u64 xe_ggtt_start(struct xe_ggtt *ggtt);
u64 xe_ggtt_size(struct xe_ggtt *ggtt);

int xe_ggtt_node_insert(struct xe_ggtt_node *node, u32 size, u32 align);
struct xe_ggtt_node *
xe_ggtt_node_insert_transform(struct xe_ggtt *ggtt,
xe_ggtt_insert_node(struct xe_ggtt *ggtt, u32 size, u32 align);
struct xe_ggtt_node *
xe_ggtt_insert_node_transform(struct xe_ggtt *ggtt,
			      struct xe_bo *bo, u64 pte,
			      u64 size, u32 align,
			      xe_ggtt_transform_cb transform, void *arg);
void xe_ggtt_node_remove(struct xe_ggtt_node *node, bool invalidate);
bool xe_ggtt_node_allocated(const struct xe_ggtt_node *node);
size_t xe_ggtt_node_pt_size(const struct xe_ggtt_node *node);
void xe_ggtt_map_bo_unlocked(struct xe_ggtt *ggtt, struct xe_bo *bo);
int xe_ggtt_insert_bo(struct xe_ggtt *ggtt, struct xe_bo *bo, struct drm_exec *exec);

@@ -6,72 +6,16 @@
#ifndef _XE_GGTT_TYPES_H_
#define _XE_GGTT_TYPES_H_

#include <linux/types.h>
#include <drm/drm_mm.h>

#include "xe_pt_types.h"

struct xe_bo;
struct xe_ggtt;
struct xe_ggtt_node;
struct xe_gt;

/**
 * struct xe_ggtt - Main GGTT struct
 *
 * In general, each tile can contain its own Global Graphics Translation Table
 * (GGTT) instance.
 */
struct xe_ggtt {
	/** @tile: Back pointer to tile where this GGTT belongs */
	struct xe_tile *tile;
	/** @start: Start offset of GGTT */
	u64 start;
	/** @size: Total usable size of this GGTT */
	u64 size;

#define XE_GGTT_FLAGS_64K BIT(0)
	/**
	 * @flags: Flags for this GGTT
	 * Acceptable flags:
	 * - %XE_GGTT_FLAGS_64K - if PTE size is 64K. Otherwise, regular is 4K.
	 */
	unsigned int flags;
	/** @scratch: Internal object allocation used as a scratch page */
	struct xe_bo *scratch;
	/** @lock: Mutex lock to protect GGTT data */
	struct mutex lock;
	/**
	 * @gsm: The iomem pointer to the actual location of the translation
	 * table located in the GSM for easy PTE manipulation
	 */
	u64 __iomem *gsm;
	/** @pt_ops: Page Table operations per platform */
	const struct xe_ggtt_pt_ops *pt_ops;
	/** @mm: The memory manager used to manage individual GGTT allocations */
	struct drm_mm mm;
	/** @access_count: counts GGTT writes */
	unsigned int access_count;
	/** @wq: Dedicated unordered work queue to process node removals */
	struct workqueue_struct *wq;
};

typedef void (*xe_ggtt_set_pte_fn)(struct xe_ggtt *ggtt, u64 addr, u64 pte);
typedef void (*xe_ggtt_transform_cb)(struct xe_ggtt *ggtt,
				     struct xe_ggtt_node *node,
				     u64 pte_flags,
				     xe_ggtt_set_pte_fn set_pte, void *arg);
/**
 * struct xe_ggtt_pt_ops - GGTT Page table operations
 * Which can vary from platform to platform.
 */
struct xe_ggtt_pt_ops {
	/** @pte_encode_flags: Encode PTE flags for a given BO */
	u64 (*pte_encode_flags)(struct xe_bo *bo, u16 pat_index);

	/** @ggtt_set_pte: Directly write into GGTT's PTE */
	xe_ggtt_set_pte_fn ggtt_set_pte;

	/** @ggtt_get_pte: Directly read from GGTT's PTE */
	u64 (*ggtt_get_pte)(struct xe_ggtt *ggtt, u64 addr);
};

#endif
@@ -435,15 +435,11 @@ static int proxy_channel_alloc(struct xe_gsc *gsc)
	return 0;
}

static void xe_gsc_proxy_remove(void *arg)
static void xe_gsc_proxy_stop(struct xe_gsc *gsc)
{
	struct xe_gsc *gsc = arg;
	struct xe_gt *gt = gsc_to_gt(gsc);
	struct xe_device *xe = gt_to_xe(gt);

	if (!gsc->proxy.component_added)
		return;

	/* disable HECI2 IRQs */
	scoped_guard(xe_pm_runtime, xe) {
		CLASS(xe_force_wake, fw_ref)(gt_to_fw(gt), XE_FW_GSC);

@@ -455,6 +451,30 @@ static void xe_gsc_proxy_remove(void *arg)
	}

	xe_gsc_wait_for_worker_completion(gsc);
	gsc->proxy.started = false;
}

static void xe_gsc_proxy_remove(void *arg)
{
	struct xe_gsc *gsc = arg;
	struct xe_gt *gt = gsc_to_gt(gsc);
	struct xe_device *xe = gt_to_xe(gt);

	if (!gsc->proxy.component_added)
		return;

	/*
	 * GSC proxy start is an async process that can be ongoing during
	 * Xe module load/unload. Using devm managed action to register
	 * xe_gsc_proxy_stop could cause issues if Xe module unload has
	 * already started when the action is registered, potentially leading
	 * to the cleanup being called at the wrong time. Therefore, instead
	 * of registering a separate devm action to undo what is done in
	 * proxy start, we call it from here, but only if the start has
	 * completed successfully (tracked with the 'started' flag).
	 */
	if (gsc->proxy.started)
		xe_gsc_proxy_stop(gsc);

	component_del(xe->drm.dev, &xe_gsc_proxy_component_ops);
	gsc->proxy.component_added = false;

@@ -510,6 +530,7 @@ int xe_gsc_proxy_init(struct xe_gsc *gsc)
 */
int xe_gsc_proxy_start(struct xe_gsc *gsc)
{
	struct xe_gt *gt = gsc_to_gt(gsc);
	int err;

	/* enable the proxy interrupt in the GSC shim layer */

@@ -521,12 +542,18 @@ int xe_gsc_proxy_start(struct xe_gsc *gsc)
	 */
	err = xe_gsc_proxy_request_handler(gsc);
	if (err)
		return err;
		goto err_irq_disable;

	if (!xe_gsc_proxy_init_done(gsc)) {
		xe_gt_err(gsc_to_gt(gsc), "GSC FW reports proxy init not completed\n");
		return -EIO;
		xe_gt_err(gt, "GSC FW reports proxy init not completed\n");
		err = -EIO;
		goto err_irq_disable;
	}

	gsc->proxy.started = true;
	return 0;

err_irq_disable:
	gsc_proxy_irq_toggle(gsc, false);
	return err;
}

@@ -58,6 +58,8 @@ struct xe_gsc {
		struct mutex mutex;
		/** @proxy.component_added: whether the component has been added */
		bool component_added;
		/** @proxy.started: whether the proxy has been started */
		bool started;
		/** @proxy.bo: object to store message to and from the GSC */
		struct xe_bo *bo;
		/** @proxy.to_gsc: map of the memory used to send messages to the GSC */

@@ -33,6 +33,7 @@
#include "xe_gt_printk.h"
#include "xe_gt_sriov_pf.h"
#include "xe_gt_sriov_vf.h"
#include "xe_gt_stats.h"
#include "xe_gt_sysfs.h"
#include "xe_gt_topology.h"
#include "xe_guc_exec_queue_types.h"

@@ -141,15 +142,14 @@ static void xe_gt_disable_host_l2_vram(struct xe_gt *gt)
static void xe_gt_enable_comp_1wcoh(struct xe_gt *gt)
{
	struct xe_device *xe = gt_to_xe(gt);
	unsigned int fw_ref;
	u32 reg;

	if (IS_SRIOV_VF(xe))
		return;

	if (GRAPHICS_VER(xe) >= 30 && xe->info.has_flat_ccs) {
		fw_ref = xe_force_wake_get(gt_to_fw(gt), XE_FW_GT);
		if (!fw_ref)
		CLASS(xe_force_wake, fw_ref)(gt_to_fw(gt), XE_FW_GT);
		if (!fw_ref.domains)
			return;

		reg = xe_gt_mcr_unicast_read_any(gt, XE2_GAMREQSTRM_CTRL);

@@ -163,8 +163,6 @@ static void xe_gt_enable_comp_1wcoh(struct xe_gt *gt)
			reg |= EN_CMP_1WCOH_GW;
			xe_gt_mcr_multicast_write(gt, XE2_GAMWALK_CTRL_3D, reg);
		}

		xe_force_wake_put(gt_to_fw(gt), fw_ref);
	}
}

@@ -500,6 +498,10 @@ int xe_gt_init_early(struct xe_gt *gt)
	if (err)
		return err;

	err = xe_gt_stats_init(gt);
	if (err)
		return err;

	CLASS(xe_force_wake, fw_ref)(gt_to_fw(gt), XE_FW_GT);
	if (!fw_ref.domains)
		return -ETIMEDOUT;

@@ -894,7 +896,6 @@ static void gt_reset_worker(struct work_struct *w)
	if (IS_SRIOV_PF(gt_to_xe(gt)))
		xe_gt_sriov_pf_stop_prepare(gt);

	xe_uc_gucrc_disable(&gt->uc);
	xe_uc_stop_prepare(&gt->uc);
	xe_pagefault_reset(gt_to_xe(gt), gt);
@@ -13,6 +13,7 @@
#include "xe_gt_sysfs.h"
#include "xe_mmio.h"
#include "xe_sriov.h"
#include "xe_sriov_pf.h"

static void __xe_gt_apply_ccs_mode(struct xe_gt *gt, u32 num_engines)
{

@@ -88,6 +89,11 @@ void xe_gt_apply_ccs_mode(struct xe_gt *gt)
	__xe_gt_apply_ccs_mode(gt, gt->ccs_mode);
}

static bool gt_ccs_mode_default(struct xe_gt *gt)
{
	return gt->ccs_mode == 1;
}

static ssize_t
num_cslices_show(struct device *kdev,
		 struct device_attribute *attr, char *buf)

@@ -117,12 +123,6 @@ ccs_mode_store(struct device *kdev, struct device_attribute *attr,
	u32 num_engines, num_slices;
	int ret;

	if (IS_SRIOV(xe)) {
		xe_gt_dbg(gt, "Can't change compute mode when running as %s\n",
			  xe_sriov_mode_to_string(xe_device_sriov_mode(xe)));
		return -EOPNOTSUPP;
	}

	ret = kstrtou32(buff, 0, &num_engines);
	if (ret)
		return ret;

@@ -139,21 +139,35 @@ ccs_mode_store(struct device *kdev, struct device_attribute *attr,
	}

	/* CCS mode can only be updated when there are no drm clients */
	mutex_lock(&xe->drm.filelist_mutex);
	guard(mutex)(&xe->drm.filelist_mutex);
	if (!list_empty(&xe->drm.filelist)) {
		mutex_unlock(&xe->drm.filelist_mutex);
		xe_gt_dbg(gt, "Rejecting compute mode change as there are active drm clients\n");
		return -EBUSY;
	}

	if (gt->ccs_mode != num_engines) {
		xe_gt_info(gt, "Setting compute mode to %d\n", num_engines);
		gt->ccs_mode = num_engines;
		xe_gt_record_user_engines(gt);
		xe_gt_reset(gt);
	if (gt->ccs_mode == num_engines)
		return count;

	/*
	 * Changing default CCS mode is only allowed when there
	 * are no VFs. Try to lockdown PF to find out.
	 */
	if (gt_ccs_mode_default(gt) && IS_SRIOV_PF(xe)) {
		ret = xe_sriov_pf_lockdown(xe);
		if (ret) {
			xe_gt_dbg(gt, "Can't change CCS Mode: VFs are enabled\n");
			return ret;
		}
	}

	mutex_unlock(&xe->drm.filelist_mutex);
	xe_gt_info(gt, "Setting compute mode to %d\n", num_engines);
	gt->ccs_mode = num_engines;
	xe_gt_record_user_engines(gt);
	xe_gt_reset(gt);

	/* We may end PF lockdown once CCS mode is default again */
	if (gt_ccs_mode_default(gt) && IS_SRIOV_PF(xe))
		xe_sriov_pf_end_lockdown(xe);

	return count;
}

@@ -155,6 +155,30 @@ static int register_save_restore(struct xe_gt *gt, struct drm_printer *p)
	return 0;
}

/*
 * Check the registers referenced on a save-restore list and report any
 * save-restore entries that did not get applied.
 */
static int register_save_restore_check(struct xe_gt *gt, struct drm_printer *p)
{
	struct xe_hw_engine *hwe;
	enum xe_hw_engine_id id;

	CLASS(xe_force_wake, fw_ref)(gt_to_fw(gt), XE_FORCEWAKE_ALL);
	if (!xe_force_wake_ref_has_domain(fw_ref.domains, XE_FORCEWAKE_ALL)) {
		drm_printf(p, "ERROR: Could not acquire forcewake\n");
		return -ETIMEDOUT;
	}

	xe_reg_sr_readback_check(&gt->reg_sr, gt, p);
	for_each_hw_engine(hwe, gt, id)
		xe_reg_sr_readback_check(&hwe->reg_sr, gt, p);
	for_each_hw_engine(hwe, gt, id)
		xe_reg_sr_lrc_check(&hwe->reg_lrc, gt, hwe, p);

	return 0;
}

static int rcs_default_lrc(struct xe_gt *gt, struct drm_printer *p)
{
	xe_lrc_dump_default(p, gt, XE_ENGINE_CLASS_RENDER);

@@ -209,6 +233,8 @@ static const struct drm_info_list vf_safe_debugfs_list[] = {
	{ "default_lrc_vecs", .show = xe_gt_debugfs_show_with_rpm, .data = vecs_default_lrc },
	{ "hwconfig", .show = xe_gt_debugfs_show_with_rpm, .data = hwconfig },
	{ "pat_sw_config", .show = xe_gt_debugfs_simple_show, .data = xe_pat_dump_sw_config },
	{ "register-save-restore-check",
	  .show = xe_gt_debugfs_show_with_rpm, .data = register_save_restore_check },
};

/* everything else should be added here */

@@ -168,6 +168,24 @@ void xe_gt_idle_disable_pg(struct xe_gt *gt)
	xe_mmio_write32(&gt->mmio, POWERGATE_ENABLE, gtidle->powergate_enable);
}

static void force_wake_domains_show(struct xe_gt *gt, struct drm_printer *p)
{
	struct xe_force_wake_domain *domain;
	struct xe_force_wake *fw = gt_to_fw(gt);
	unsigned int tmp;
	unsigned long flags;

	spin_lock_irqsave(&fw->lock, flags);
	for_each_fw_domain(domain, fw, tmp) {
		drm_printf(p, "%s.ref_count=%u, %s.fwake=0x%x\n",
			   xe_force_wake_domain_to_str(domain->id),
			   READ_ONCE(domain->ref),
			   xe_force_wake_domain_to_str(domain->id),
			   xe_mmio_read32(&gt->mmio, domain->reg_ctl));
	}
	spin_unlock_irqrestore(&fw->lock, flags);
}

/**
 * xe_gt_idle_pg_print - Xe powergating info
 * @gt: GT object

@@ -254,6 +272,13 @@ int xe_gt_idle_pg_print(struct xe_gt *gt, struct drm_printer *p)
	drm_printf(p, "Media Samplers Power Gating Enabled: %s\n",
		   str_yes_no(pg_enabled & MEDIA_SAMPLERS_POWERGATE_ENABLE));

	if (gt->info.engine_mask & BIT(XE_HW_ENGINE_GSCCS0)) {
		drm_printf(p, "GSC Power Gate Status: %s\n",
			   str_up_down(pg_status & GSC_AWAKE_STATUS));
	}

	force_wake_domains_show(gt, p);

	return 0;
}
|
||||
|
||||
|
|
|
|||
|
|
@@ -201,7 +201,7 @@ static const struct xe_mmio_range xe2lpg_dss_steering_table[] = {
	{ 0x009680, 0x0096FF },		/* DSS */
	{ 0x00D800, 0x00D87F },		/* SLICE */
	{ 0x00DC00, 0x00DCFF },		/* SLICE */
-	{ 0x00DE80, 0x00E8FF },		/* DSS (0xE000-0xE0FF reserved) */
+	{ 0x00DE00, 0x00E8FF },		/* DSS (0xE000-0xE0FF reserved) */
	{ 0x00E980, 0x00E9FF },		/* SLICE */
	{ 0x013000, 0x0133FF },		/* DSS (0x13000-0x131FF), SLICE (0x13200-0x133FF) */
	{},

@@ -280,6 +280,19 @@ static const struct xe_mmio_range xe3p_xpc_instance0_steering_table[] = {
	{},
};

+static const struct xe_mmio_range xe3p_lpg_instance0_steering_table[] = {
+	{ 0x004000, 0x004AFF },		/* GAM, rsvd, GAMWKR */
+	{ 0x008700, 0x00887F },		/* NODE */
+	{ 0x00B000, 0x00B3FF },		/* NODE, L3BANK */
+	{ 0x00B500, 0x00B6FF },		/* PSMI */
+	{ 0x00C800, 0x00CFFF },		/* GAM */
+	{ 0x00D880, 0x00D8FF },		/* NODE */
+	{ 0x00DD00, 0x00DD7F },		/* MEMPIPE */
+	{ 0x00F000, 0x00FFFF },		/* GAM, GAMWKR */
+	{ 0x013400, 0x0135FF },		/* MEMPIPE */
+	{},
+};
+
static void init_steering_l3bank(struct xe_gt *gt)
{
	struct xe_device *xe = gt_to_xe(gt);

@@ -505,9 +518,6 @@ void xe_gt_mcr_init_early(struct xe_gt *gt)

	spin_lock_init(&gt->mcr_lock);

-	if (IS_SRIOV_VF(xe))
-		return;
-
	if (gt->info.type == XE_GT_TYPE_MEDIA) {
		drm_WARN_ON(&xe->drm, MEDIA_VER(xe) < 13);

@@ -522,17 +532,14 @@ void xe_gt_mcr_init_early(struct xe_gt *gt)
		}
	} else {
		if (GRAPHICS_VERx100(xe) == 3511) {
			/*
			 * TODO: there are some ranges in bspec with missing
			 * termination: [0x00B000, 0x00B0FF] and
			 * [0x00D880, 0x00D8FF] (NODE); [0x00B100, 0x00B3FF]
			 * (L3BANK). Update them here once bspec is updated.
			 */
			gt->steering[DSS].ranges = xe3p_xpc_xecore_steering_table;
			gt->steering[GAM1].ranges = xe3p_xpc_gam_grp1_steering_table;
			gt->steering[INSTANCE0].ranges = xe3p_xpc_instance0_steering_table;
			gt->steering[L3BANK].ranges = xelpg_l3bank_steering_table;
			gt->steering[NODE].ranges = xe3p_xpc_node_steering_table;
+		} else if (GRAPHICS_VERx100(xe) >= 3510) {
+			gt->steering[DSS].ranges = xe2lpg_dss_steering_table;
+			gt->steering[INSTANCE0].ranges = xe3p_lpg_instance0_steering_table;
		} else if (GRAPHICS_VER(xe) >= 20) {
			gt->steering[DSS].ranges = xe2lpg_dss_steering_table;
			gt->steering[SQIDI_PSMI].ranges = xe2lpg_sqidi_psmi_steering_table;

@@ -568,9 +575,6 @@ void xe_gt_mcr_init_early(struct xe_gt *gt)
 */
void xe_gt_mcr_init(struct xe_gt *gt)
{
-	if (IS_SRIOV_VF(gt_to_xe(gt)))
-		return;
-
	/* Select non-terminated steering target for each type */
	for (int i = 0; i < NUM_STEERING_TYPES; i++) {
		gt->steering[i].initialized = true;
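For context on how these steering tables are consumed: each table is a list of {start, end} register ranges terminated by an empty entry, and an offset is steered when it falls inside any range. A minimal lookup sketch under that assumption — the helper name here is illustrative, not the driver's actual API:

	#include <linux/types.h>

	struct xe_mmio_range {
		u32 start;
		u32 end;
	};

	/* Returns true if @reg falls inside any range; {} terminates the table. */
	static bool reg_in_steering_table(const struct xe_mmio_range *table, u32 reg)
	{
		for (; table->end; table++)
			if (reg >= table->start && reg <= table->end)
				return true;
		return false;
	}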
@@ -279,7 +279,7 @@ static u32 encode_config_ggtt(u32 *cfg, const struct xe_gt_sriov_config *config,
{
	struct xe_ggtt_node *node = config->ggtt_region;

-	if (!xe_ggtt_node_allocated(node))
+	if (!node)
		return 0;

	return encode_ggtt(cfg, xe_ggtt_node_addr(node), xe_ggtt_node_size(node), details);

@@ -482,23 +482,9 @@ static int pf_distribute_config_ggtt(struct xe_tile *tile, unsigned int vfid, u6
	return err ?: err2;
}

-static void pf_release_ggtt(struct xe_tile *tile, struct xe_ggtt_node *node)
-{
-	if (xe_ggtt_node_allocated(node)) {
-		/*
-		 * explicit GGTT PTE assignment to the PF using xe_ggtt_assign()
-		 * is redundant, as PTE will be implicitly re-assigned to PF by
-		 * the xe_ggtt_clear() called by below xe_ggtt_remove_node().
-		 */
-		xe_ggtt_node_remove(node, false);
-	} else {
-		xe_ggtt_node_fini(node);
-	}
-}
-
static void pf_release_vf_config_ggtt(struct xe_gt *gt, struct xe_gt_sriov_config *config)
{
-	pf_release_ggtt(gt_to_tile(gt), config->ggtt_region);
+	xe_ggtt_node_remove(config->ggtt_region, false);
	config->ggtt_region = NULL;
}

@@ -517,7 +503,7 @@ static int pf_provision_vf_ggtt(struct xe_gt *gt, unsigned int vfid, u64 size)

	size = round_up(size, alignment);

-	if (xe_ggtt_node_allocated(config->ggtt_region)) {
+	if (config->ggtt_region) {
		err = pf_distribute_config_ggtt(tile, vfid, 0, 0);
		if (unlikely(err))
			return err;

@@ -528,19 +514,15 @@ static int pf_provision_vf_ggtt(struct xe_gt *gt, unsigned int vfid, u64 size)
		if (unlikely(err))
			return err;
	}
-	xe_gt_assert(gt, !xe_ggtt_node_allocated(config->ggtt_region));
+	xe_gt_assert(gt, !config->ggtt_region);

	if (!size)
		return 0;

-	node = xe_ggtt_node_init(ggtt);
+	node = xe_ggtt_insert_node(ggtt, size, alignment);
	if (IS_ERR(node))
		return PTR_ERR(node);

-	err = xe_ggtt_node_insert(node, size, alignment);
-	if (unlikely(err))
-		goto err;
-
	xe_ggtt_assign(node, vfid);
	xe_gt_sriov_dbg_verbose(gt, "VF%u assigned GGTT %llx-%llx\n",
				vfid, xe_ggtt_node_addr(node), xe_ggtt_node_addr(node) + size - 1);

@@ -552,7 +534,7 @@ static int pf_provision_vf_ggtt(struct xe_gt *gt, unsigned int vfid, u64 size)
	config->ggtt_region = node;
	return 0;
err:
-	pf_release_ggtt(tile, node);
+	xe_ggtt_node_remove(node, false);
	return err;
}

@@ -562,7 +544,7 @@ static u64 pf_get_vf_config_ggtt(struct xe_gt *gt, unsigned int vfid)
	struct xe_ggtt_node *node = config->ggtt_region;

	xe_gt_assert(gt, xe_gt_is_main_type(gt));
-	return xe_ggtt_node_allocated(node) ? xe_ggtt_node_size(node) : 0;
+	return node ? xe_ggtt_node_size(node) : 0;
}

/**
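The hunks above collapse the old two-step node lifecycle (init, then insert, with a separate fini for never-inserted nodes) into a single call that either returns a fully inserted node or an ERR_PTR, which also lets a non-NULL pointer stand in for the old "allocated" check. A sketch of the resulting caller pattern, assuming only the semantics implied by this diff:

	struct xe_ggtt_node *node;

	node = xe_ggtt_insert_node(ggtt, size, alignment);	/* alloc + insert in one step */
	if (IS_ERR(node))
		return PTR_ERR(node);	/* nothing to clean up on failure */

	/* ... use node; a non-NULL pointer now implies it is allocated ... */

	xe_ggtt_node_remove(node, false);	/* single teardown path */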
@@ -1469,8 +1451,8 @@ int xe_gt_sriov_pf_config_set_fair_dbs(struct xe_gt *gt, unsigned int vfid,

static u64 pf_get_lmem_alignment(struct xe_gt *gt)
{
-	/* this might be platform dependent */
-	return SZ_2M;
+	return xe_device_has_lmtt(gt_to_xe(gt)) ?
+	       xe_lmtt_page_size(&gt_to_tile(gt)->sriov.pf.lmtt) : XE_PAGE_SIZE;
}

static u64 pf_get_min_spare_lmem(struct xe_gt *gt)

@@ -1645,13 +1627,15 @@ static int pf_provision_vf_lmem(struct xe_gt *gt, unsigned int vfid, u64 size)
	struct xe_device *xe = gt_to_xe(gt);
	struct xe_tile *tile = gt_to_tile(gt);
	struct xe_bo *bo;
+	u64 alignment;
	int err;

	xe_gt_assert(gt, vfid);
	xe_gt_assert(gt, IS_DGFX(xe));
	xe_gt_assert(gt, xe_gt_is_main_type(gt));

-	size = round_up(size, pf_get_lmem_alignment(gt));
+	alignment = pf_get_lmem_alignment(gt);
+	size = round_up(size, alignment);

	if (config->lmem_obj) {
		err = pf_distribute_config_lmem(gt, vfid, 0);

@@ -1667,12 +1651,12 @@ static int pf_provision_vf_lmem(struct xe_gt *gt, unsigned int vfid, u64 size)
	if (!size)
		return 0;

-	xe_gt_assert(gt, pf_get_lmem_alignment(gt) == SZ_2M);
+	xe_gt_assert(gt, alignment == XE_PAGE_SIZE || alignment == SZ_2M);
	bo = xe_bo_create_pin_range_novm(xe, tile,
					 ALIGN(size, PAGE_SIZE), 0, ~0ull,
					 ttm_bo_type_kernel,
					 XE_BO_FLAG_VRAM_IF_DGFX(tile) |
-					 XE_BO_FLAG_NEEDS_2M |
					 XE_BO_FLAG_VRAM(tile->mem.vram) |
+					 (alignment == SZ_2M ? XE_BO_FLAG_NEEDS_2M : 0) |
					 XE_BO_FLAG_PINNED |
					 XE_BO_FLAG_PINNED_LATE_RESTORE |
					 XE_BO_FLAG_FORCE_USER_VRAM);
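Since the alignment now comes from the LMTT page size rather than a hard-coded 2M, round_up() must cope with either granularity. A tiny self-contained illustration of the identity used here (values are examples only):

	/* round_up(x, a), for power-of-two a, is (x + a - 1) & ~(a - 1) */
	u64 size = 3 * SZ_1M;

	size = round_up(size, SZ_2M);	/* 3 MiB -> 4 MiB */
	size = round_up(size, SZ_4K);	/* unchanged: already 4K-aligned */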
@@ -1754,7 +1738,44 @@ int xe_gt_sriov_pf_config_set_lmem(struct xe_gt *gt, unsigned int vfid, u64 size
}

/**
- * xe_gt_sriov_pf_config_bulk_set_lmem - Provision many VFs with LMEM.
+ * xe_gt_sriov_pf_config_bulk_set_lmem_locked() - Provision many VFs with LMEM.
 * @gt: the &xe_gt (can't be media)
 * @vfid: starting VF identifier (can't be 0)
 * @num_vfs: number of VFs to provision
 * @size: requested LMEM size
 *
 * This function can only be called on PF.
 *
 * Return: 0 on success or a negative error code on failure.
 */
+int xe_gt_sriov_pf_config_bulk_set_lmem_locked(struct xe_gt *gt, unsigned int vfid,
+					       unsigned int num_vfs, u64 size)
+{
+	unsigned int n;
+	int err = 0;
+
+	lockdep_assert_held(xe_gt_sriov_pf_master_mutex(gt));
+	xe_gt_assert(gt, xe_device_has_lmtt(gt_to_xe(gt)));
+	xe_gt_assert(gt, IS_SRIOV_PF(gt_to_xe(gt)));
+	xe_gt_assert(gt, xe_gt_is_main_type(gt));
+	xe_gt_assert(gt, vfid);
+
+	if (!num_vfs)
+		return 0;
+
+	for (n = vfid; n < vfid + num_vfs; n++) {
+		err = pf_provision_vf_lmem(gt, n, size);
+		if (err)
+			break;
+	}
+
+	return pf_config_bulk_set_u64_done(gt, vfid, num_vfs, size,
+					   pf_get_vf_config_lmem,
+					   "LMEM", n, err);
+}
+
+/**
+ * xe_gt_sriov_pf_config_bulk_set_lmem() - Provision many VFs with LMEM.
+ * @gt: the &xe_gt (can't be media)
+ * @vfid: starting VF identifier (can't be 0)
+ * @num_vfs: number of VFs to provision
@@ -1767,26 +1788,52 @@ int xe_gt_sriov_pf_config_set_lmem(struct xe_gt *gt, unsigned int vfid, u64 size
int xe_gt_sriov_pf_config_bulk_set_lmem(struct xe_gt *gt, unsigned int vfid,
					unsigned int num_vfs, u64 size)
{
-	unsigned int n;
-	int err = 0;
+	guard(mutex)(xe_gt_sriov_pf_master_mutex(gt));

+	return xe_gt_sriov_pf_config_bulk_set_lmem_locked(gt, vfid, num_vfs, size);
+}
+
+/**
+ * xe_gt_sriov_pf_config_get_lmem_locked() - Get VF's LMEM quota.
+ * @gt: the &xe_gt
+ * @vfid: the VF identifier (can't be 0 == PFID)
+ *
+ * This function can only be called on PF.
+ *
+ * Return: VF's LMEM quota.
+ */
+u64 xe_gt_sriov_pf_config_get_lmem_locked(struct xe_gt *gt, unsigned int vfid)
+{
+	lockdep_assert_held(xe_gt_sriov_pf_master_mutex(gt));
+	xe_gt_assert(gt, IS_SRIOV_PF(gt_to_xe(gt)));
+	xe_gt_assert(gt, vfid);
+
+	return pf_get_vf_config_lmem(gt, vfid);
+}
+
+/**
+ * xe_gt_sriov_pf_config_set_lmem_locked() - Provision VF with LMEM.
+ * @gt: the &xe_gt (can't be media)
+ * @vfid: the VF identifier (can't be 0 == PFID)
+ * @size: requested LMEM size
+ *
+ * This function can only be called on PF.
+ */
+int xe_gt_sriov_pf_config_set_lmem_locked(struct xe_gt *gt, unsigned int vfid, u64 size)
+{
+	int err;
+
+	lockdep_assert_held(xe_gt_sriov_pf_master_mutex(gt));
+	xe_gt_assert(gt, xe_device_has_lmtt(gt_to_xe(gt)));
+	xe_gt_assert(gt, IS_SRIOV_PF(gt_to_xe(gt)));
+	xe_gt_assert(gt, xe_gt_is_main_type(gt));
+	xe_gt_assert(gt, vfid);

-	if (!num_vfs)
-		return 0;
+	err = pf_provision_vf_lmem(gt, vfid, size);

-	mutex_lock(xe_gt_sriov_pf_master_mutex(gt));
-	for (n = vfid; n < vfid + num_vfs; n++) {
-		err = pf_provision_vf_lmem(gt, n, size);
-		if (err)
-			break;
-	}
-	mutex_unlock(xe_gt_sriov_pf_master_mutex(gt));

-	return pf_config_bulk_set_u64_done(gt, vfid, num_vfs, size,
-					   xe_gt_sriov_pf_config_get_lmem,
-					   "LMEM", n, err);
+	return pf_config_set_u64_done(gt, vfid, size,
+				      pf_get_vf_config_lmem(gt, vfid),
+				      "LMEM", err);
}

static struct xe_bo *pf_get_vf_config_lmem_obj(struct xe_gt *gt, unsigned int vfid)
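The unlocked wrapper above relies on the kernel's scope-based cleanup helpers from <linux/cleanup.h>: guard(mutex)(...) takes the mutex and releases it automatically when the enclosing scope exits, so every return path is covered without explicit unlock calls. A minimal sketch of the pattern, with illustrative names:

	#include <linux/cleanup.h>
	#include <linux/mutex.h>

	static DEFINE_MUTEX(cfg_lock);

	static int update_config_locked(int value);	/* expects cfg_lock held */

	static int update_config(int value)
	{
		guard(mutex)(&cfg_lock);	/* unlocked automatically on any return */

		if (value < 0)
			return -EINVAL;		/* no explicit mutex_unlock() needed */

		return update_config_locked(value);
	}

Splitting the API into locked and unlocked variants, as done here, lets callers that already hold the master mutex (such as the fair-provisioning path below) reuse the same logic without recursive locking.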
@@ -1856,6 +1903,81 @@ static u64 pf_estimate_fair_lmem(struct xe_gt *gt, unsigned int num_vfs)
	return fair;
}

+static u64 pf_profile_fair_lmem(struct xe_gt *gt, unsigned int num_vfs)
+{
+	struct xe_tile *tile = gt_to_tile(gt);
+	bool admin_only_pf = xe_sriov_pf_admin_only(tile->xe);
+	u64 usable = xe_vram_region_usable_size(tile->mem.vram);
+	u64 spare = pf_get_min_spare_lmem(gt);
+	u64 available = usable > spare ? usable - spare : 0;
+	u64 shareable = ALIGN_DOWN(available, SZ_1G);
+	u64 alignment = pf_get_lmem_alignment(gt);
+	u64 fair;
+
+	if (admin_only_pf)
+		fair = div_u64(shareable, num_vfs);
+	else
+		fair = div_u64(shareable, 1 + num_vfs);
+
+	if (!admin_only_pf && fair)
+		fair = rounddown_pow_of_two(fair);
+
+	return ALIGN_DOWN(fair, alignment);
+}
+
+static void __pf_show_provisioning_lmem(struct xe_gt *gt, unsigned int first_vf,
+					unsigned int num_vfs, bool provisioned)
+{
+	unsigned int allvfs = 1 + xe_gt_sriov_pf_get_totalvfs(gt); /* PF plus VFs */
+	unsigned long *bitmap __free(bitmap) = bitmap_zalloc(allvfs, GFP_KERNEL);
+	unsigned int weight;
+	unsigned int n;
+
+	if (!bitmap)
+		return;
+
+	for (n = first_vf; n < first_vf + num_vfs; n++) {
+		if (!!pf_get_vf_config_lmem(gt, VFID(n)) == provisioned)
+			bitmap_set(bitmap, n, 1);
+	}
+
+	weight = bitmap_weight(bitmap, allvfs);
+	if (!weight)
+		return;
+
+	xe_gt_sriov_info(gt, "VF%s%*pbl %s provisioned with VRAM\n",
+			 weight > 1 ? "s " : "", allvfs, bitmap,
+			 provisioned ? "already" : "not");
+}
+
+static void pf_show_all_provisioned_lmem(struct xe_gt *gt)
+{
+	__pf_show_provisioning_lmem(gt, VFID(1), xe_gt_sriov_pf_get_totalvfs(gt), true);
+}
+
+static void pf_show_unprovisioned_lmem(struct xe_gt *gt, unsigned int first_vf,
+				       unsigned int num_vfs)
+{
+	__pf_show_provisioning_lmem(gt, first_vf, num_vfs, false);
+}
+
+static bool pf_needs_provision_lmem(struct xe_gt *gt, unsigned int first_vf,
+				    unsigned int num_vfs)
+{
+	unsigned int vfid;
+
+	for (vfid = first_vf; vfid < first_vf + num_vfs; vfid++) {
+		if (pf_get_vf_config_lmem(gt, vfid)) {
+			pf_show_all_provisioned_lmem(gt);
+			pf_show_unprovisioned_lmem(gt, first_vf, num_vfs);
+			return false;
+		}
+	}
+
+	pf_show_all_provisioned_lmem(gt);
+	return true;
+}
+
/**
 * xe_gt_sriov_pf_config_set_fair_lmem - Provision many VFs with fair LMEM.
 * @gt: the &xe_gt (can't be media)
@@ -1869,6 +1991,7 @@ static u64 pf_estimate_fair_lmem(struct xe_gt *gt, unsigned int num_vfs)
int xe_gt_sriov_pf_config_set_fair_lmem(struct xe_gt *gt, unsigned int vfid,
					unsigned int num_vfs)
{
+	u64 profile;
	u64 fair;

	xe_gt_assert(gt, vfid);

@@ -1878,14 +2001,22 @@ int xe_gt_sriov_pf_config_set_fair_lmem(struct xe_gt *gt, unsigned int vfid,
	if (!xe_device_has_lmtt(gt_to_xe(gt)))
		return 0;

-	mutex_lock(xe_gt_sriov_pf_master_mutex(gt));
-	fair = pf_estimate_fair_lmem(gt, num_vfs);
-	mutex_unlock(xe_gt_sriov_pf_master_mutex(gt));
+	guard(mutex)(xe_gt_sriov_pf_master_mutex(gt));

+	if (!pf_needs_provision_lmem(gt, vfid, num_vfs))
+		return 0;
+
+	fair = pf_estimate_fair_lmem(gt, num_vfs);
	if (!fair)
		return -ENOSPC;

-	return xe_gt_sriov_pf_config_bulk_set_lmem(gt, vfid, num_vfs, fair);
+	profile = pf_profile_fair_lmem(gt, num_vfs);
+	fair = min(fair, profile);
+	if (fair < profile)
+		xe_gt_sriov_info(gt, "Using non-profile provisioning (%s %llu vs %llu)\n",
+				 "VRAM", fair, profile);
+
+	return xe_gt_sriov_pf_config_bulk_set_lmem_locked(gt, vfid, num_vfs, fair);
}

/**
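To make the profile math above concrete, here is a worked example under assumed numbers: with 16 GiB of usable VRAM, a 1 GiB minimum spare, three VFs and the PF not in admin-only mode, shareable = ALIGN_DOWN(15 GiB, 1 GiB) = 15 GiB, fair = 15 GiB / (1 + 3) ≈ 3.75 GiB, rounded down to the nearest power of two = 2 GiB, and finally aligned down to the LMTT page size. The same arithmetic as a standalone sketch:

	u64 usable = 16ull * SZ_1G, spare = SZ_1G;
	unsigned int num_vfs = 3;
	u64 shareable = ALIGN_DOWN(usable - spare, SZ_1G);	/* 15 GiB */
	u64 fair = div_u64(shareable, 1 + num_vfs);		/* ~3.75 GiB */

	fair = rounddown_pow_of_two(fair);			/* 2 GiB */
	fair = ALIGN_DOWN(fair, SZ_2M);				/* example LMTT alignment */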
@@ -2576,7 +2707,7 @@ int xe_gt_sriov_pf_config_release(struct xe_gt *gt, unsigned int vfid, bool forc

static void pf_sanitize_ggtt(struct xe_ggtt_node *ggtt_region, unsigned int vfid)
{
-	if (xe_ggtt_node_allocated(ggtt_region))
+	if (ggtt_region)
		xe_ggtt_assign(ggtt_region, vfid);
}

@@ -3035,7 +3166,7 @@ int xe_gt_sriov_pf_config_print_ggtt(struct xe_gt *gt, struct drm_printer *p)

	for (n = 1; n <= total_vfs; n++) {
		config = &gt->sriov.pf.vfs[n].config;
-		if (!xe_ggtt_node_allocated(config->ggtt_region))
+		if (!config->ggtt_region)
			continue;

		string_get_size(xe_ggtt_node_size(config->ggtt_region), 1, STRING_UNITS_2,
@@ -36,6 +36,10 @@ int xe_gt_sriov_pf_config_set_lmem(struct xe_gt *gt, unsigned int vfid, u64 size
int xe_gt_sriov_pf_config_set_fair_lmem(struct xe_gt *gt, unsigned int vfid, unsigned int num_vfs);
int xe_gt_sriov_pf_config_bulk_set_lmem(struct xe_gt *gt, unsigned int vfid, unsigned int num_vfs,
					u64 size);
+u64 xe_gt_sriov_pf_config_get_lmem_locked(struct xe_gt *gt, unsigned int vfid);
+int xe_gt_sriov_pf_config_set_lmem_locked(struct xe_gt *gt, unsigned int vfid, u64 size);
+int xe_gt_sriov_pf_config_bulk_set_lmem_locked(struct xe_gt *gt, unsigned int vfid,
+					       unsigned int num_vfs, u64 size);
struct xe_bo *xe_gt_sriov_pf_config_get_lmem_obj(struct xe_gt *gt, unsigned int vfid);

u32 xe_gt_sriov_pf_config_get_exec_quantum(struct xe_gt *gt, unsigned int vfid);

@@ -1259,7 +1259,7 @@ int xe_gt_sriov_pf_control_process_restore_data(struct xe_gt *gt, unsigned int v
}

/**
- * xe_gt_sriov_pf_control_trigger restore_vf() - Start an SR-IOV VF migration data restore sequence.
+ * xe_gt_sriov_pf_control_trigger_restore_vf() - Start an SR-IOV VF migration data restore sequence.
 * @gt: the &xe_gt
 * @vfid: the VF identifier
 *
@@ -111,6 +111,8 @@ static const struct xe_reg ver_35_runtime_regs[] = {
	XE2_GT_COMPUTE_DSS_2,			/* _MMIO(0x914c) */
	XE2_GT_GEOMETRY_DSS_1,			/* _MMIO(0x9150) */
	XE2_GT_GEOMETRY_DSS_2,			/* _MMIO(0x9154) */
+	XE3P_XPC_GT_GEOMETRY_DSS_3,		/* _MMIO(0x915c) */
+	XE3P_XPC_GT_COMPUTE_DSS_3,		/* _MMIO(0x9160) */
	SERVICE_COPY_ENABLE,			/* _MMIO(0x9170) */
};
@@ -488,16 +488,12 @@ u32 xe_gt_sriov_vf_gmdid(struct xe_gt *gt)
static int vf_get_ggtt_info(struct xe_gt *gt)
{
	struct xe_tile *tile = gt_to_tile(gt);
-	struct xe_ggtt *ggtt = tile->mem.ggtt;
	struct xe_guc *guc = &gt->uc.guc;
	u64 start, size, ggtt_size;
-	s64 shift;
	int err;

	xe_gt_assert(gt, IS_SRIOV_VF(gt_to_xe(gt)));

-	guard(mutex)(&ggtt->lock);
-
	err = guc_action_query_single_klv64(guc, GUC_KLV_VF_CFG_GGTT_START_KEY, &start);
	if (unlikely(err))
		return err;

@@ -509,8 +505,21 @@ static int vf_get_ggtt_info(struct xe_gt *gt)
	if (!size)
		return -ENODATA;

-	xe_tile_sriov_vf_ggtt_base_store(tile, start);
	ggtt_size = xe_tile_sriov_vf_ggtt(tile);
-	if (ggtt_size && ggtt_size != size) {
+	if (!ggtt_size) {
+		/*
+		 * This function is called once during xe_guc_init_noalloc(),
+		 * at which point ggtt_size = 0 and we have to initialize everything,
+		 * and GGTT is not yet initialized.
+		 *
+		 * Return early as there's nothing to fixup.
+		 */
+		xe_tile_sriov_vf_ggtt_store(tile, size);
+		return 0;
+	}
+
+	if (ggtt_size != size) {
		xe_gt_sriov_err(gt, "Unexpected GGTT reassignment: %lluK != %lluK\n",
				size / SZ_1K, ggtt_size / SZ_1K);
		return -EREMCHG;

@@ -519,21 +528,13 @@ static int vf_get_ggtt_info(struct xe_gt *gt)
	xe_gt_sriov_dbg_verbose(gt, "GGTT %#llx-%#llx = %lluK\n",
				start, start + size - 1, size / SZ_1K);

-	shift = start - (s64)xe_tile_sriov_vf_ggtt_base(tile);
	xe_tile_sriov_vf_ggtt_base_store(tile, start);
	xe_tile_sriov_vf_ggtt_store(tile, size);

-	if (shift && shift != start) {
-		xe_gt_sriov_info(gt, "Shifting GGTT base by %lld to 0x%016llx\n",
-				 shift, start);
-		xe_tile_sriov_vf_fixup_ggtt_nodes_locked(gt_to_tile(gt), shift);
-	}
-
-	if (xe_sriov_vf_migration_supported(gt_to_xe(gt))) {
-		WRITE_ONCE(gt->sriov.vf.migration.ggtt_need_fixes, false);
-		smp_wmb(); /* Ensure above write visible before wake */
-		wake_up_all(&gt->sriov.vf.migration.wq);
-	}
+	/*
+	 * This function can be called repeatedly from post migration fixups,
+	 * at which point we inform the GGTT of the new base address.
+	 * xe_ggtt_shift_nodes() may be called multiple times for each migration,
+	 * but will be a noop if the base is unchanged.
+	 */
+	xe_ggtt_shift_nodes(tile->mem.ggtt, start);

	return 0;
}
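xe_ggtt_shift_nodes() is handed the new base address and is expected to be a no-op when the base is unchanged, which is what makes the repeated post-migration calls safe. Its implementation is not part of this diff; a purely hypothetical sketch of what such a fixup has to do internally, with illustrative types and field names that are not the driver's actual layout:

	#include <linux/list.h>
	#include <linux/types.h>

	struct my_node { u64 offset; struct list_head link; };
	struct my_ggtt { u64 base; struct list_head nodes; };

	/* Illustrative only: relocate every node by the delta between bases. */
	static void shift_nodes_sketch(struct my_ggtt *ggtt, u64 new_base)
	{
		s64 shift = new_base - ggtt->base;
		struct my_node *node;

		if (!shift)
			return;	/* repeated calls after one migration are no-ops */

		list_for_each_entry(node, &ggtt->nodes, link)
			node->offset += shift;	/* references rewritten at new offsets */
		ggtt->base = new_base;
	}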
@@ -839,6 +840,13 @@ static void xe_gt_sriov_vf_default_lrcs_hwsp_rebase(struct xe_gt *gt)
		xe_default_lrc_update_memirq_regs_with_address(hwe);
}

+static void vf_post_migration_mark_fixups_done(struct xe_gt *gt)
+{
+	WRITE_ONCE(gt->sriov.vf.migration.ggtt_need_fixes, false);
+	smp_wmb(); /* Ensure above write visible before wake */
+	wake_up_all(&gt->sriov.vf.migration.wq);
+}
+
static void vf_start_migration_recovery(struct xe_gt *gt)
{
	bool started;

@@ -1269,6 +1277,8 @@ static int vf_post_migration_fixups(struct xe_gt *gt)
	if (err)
		return err;

+	atomic_inc(&gt->sriov.vf.migration.fixups_complete_count);
+
	return 0;
}

@@ -1373,6 +1383,7 @@ static void vf_post_migration_recovery(struct xe_gt *gt)
	if (err)
		goto fail;

+	vf_post_migration_mark_fixups_done(gt);
	vf_post_migration_rearm(gt);

	err = vf_post_migration_resfix_done(gt, marker);

@@ -1507,19 +1518,49 @@ static bool vf_valid_ggtt(struct xe_gt *gt)
}

/**
- * xe_gt_sriov_vf_wait_valid_ggtt() - VF wait for valid GGTT addresses
- * @gt: the &xe_gt
+ * xe_vf_migration_fixups_complete_count() - Get count of VF fixups completions.
+ * @gt: the &xe_gt instance which contains affected Global GTT
 *
+ * Return: number of times VF fixups were completed since driver
+ * probe, or 0 if migration is not available, or -1 if fixups are
+ * pending or being applied right now.
 */
-void xe_gt_sriov_vf_wait_valid_ggtt(struct xe_gt *gt)
+int xe_vf_migration_fixups_complete_count(struct xe_gt *gt)
+{
+	if (!IS_SRIOV_VF(gt_to_xe(gt)) ||
+	    !xe_sriov_vf_migration_supported(gt_to_xe(gt)))
+		return 0;
+
+	/* should never match fixups_complete_count value */
+	if (!vf_valid_ggtt(gt))
+		return -1;
+
+	return atomic_read(&gt->sriov.vf.migration.fixups_complete_count);
+}
+
+/**
+ * xe_gt_sriov_vf_wait_valid_ggtt() - wait for valid GGTT nodes and address refs
+ * @gt: the &xe_gt instance which contains affected Global GTT
+ *
+ * Return: number of times VF fixups were completed since driver
+ * probe, or 0 if migration is not available.
+ */
+int xe_gt_sriov_vf_wait_valid_ggtt(struct xe_gt *gt)
{
+	int ret;
+
+	/*
+	 * this condition needs to be identical to one in
+	 * xe_vf_migration_fixups_complete_count()
+	 */
	if (!IS_SRIOV_VF(gt_to_xe(gt)) ||
	    !xe_sriov_vf_migration_supported(gt_to_xe(gt)))
-		return;
+		return 0;

	ret = wait_event_interruptible_timeout(gt->sriov.vf.migration.wq,
					       vf_valid_ggtt(gt),
					       HZ * 5);
	xe_gt_WARN_ON(gt, !ret);
+
+	return atomic_read(&gt->sriov.vf.migration.fixups_complete_count);
}
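The flag/waitqueue handshake above pairs WRITE_ONCE() + smp_wmb() + wake_up_all() on the producer side with wait_event_interruptible_timeout() on the consumer side; the barrier makes the flag update visible before any waiter is woken, and the wait side re-checks the condition itself, so a spurious wakeup is harmless. A minimal sketch of the same pattern, with illustrative names:

	#include <linux/wait.h>

	static DECLARE_WAIT_QUEUE_HEAD(fixup_wq);
	static bool fixups_pending = true;

	static void mark_fixups_done(void)
	{
		WRITE_ONCE(fixups_pending, false);
		smp_wmb();		/* publish the flag before waking waiters */
		wake_up_all(&fixup_wq);
	}

	static long wait_for_fixups(void)
	{
		/* 0 on timeout, -ERESTARTSYS on signal, >0 remaining jiffies */
		return wait_event_interruptible_timeout(fixup_wq,
							!READ_ONCE(fixups_pending),
							5 * HZ);
	}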
@@ -39,6 +39,7 @@ void xe_gt_sriov_vf_print_config(struct xe_gt *gt, struct drm_printer *p);
void xe_gt_sriov_vf_print_runtime(struct xe_gt *gt, struct drm_printer *p);
void xe_gt_sriov_vf_print_version(struct xe_gt *gt, struct drm_printer *p);

-void xe_gt_sriov_vf_wait_valid_ggtt(struct xe_gt *gt);
+int xe_gt_sriov_vf_wait_valid_ggtt(struct xe_gt *gt);
+int xe_vf_migration_fixups_complete_count(struct xe_gt *gt);

#endif

@@ -54,6 +54,8 @@ struct xe_gt_sriov_vf_migration {
	wait_queue_head_t wq;
	/** @scratch: Scratch memory for VF recovery */
	void *scratch;
+	/** @fixups_complete_count: Counts completed fixups stages */
+	atomic_t fixups_complete_count;
	/** @debug: Debug hooks for delaying migration */
	struct {
		/**

@@ -73,7 +75,7 @@ struct xe_gt_sriov_vf_migration {
	bool recovery_queued;
	/** @recovery_inprogress: VF post migration recovery in progress */
	bool recovery_inprogress;
-	/** @ggtt_need_fixes: VF GGTT needs fixes */
+	/** @ggtt_need_fixes: VF GGTT and references to it need fixes */
	bool ggtt_need_fixes;
};
@@ -3,12 +3,37 @@
 * Copyright © 2024 Intel Corporation
 */

#include <linux/atomic.h>

#include <drm/drm_managed.h>
#include <drm/drm_print.h>

#include "xe_device.h"
#include "xe_gt_stats.h"
#include "xe_gt_types.h"

+static void xe_gt_stats_fini(struct drm_device *drm, void *arg)
+{
+	struct xe_gt *gt = arg;
+
+	free_percpu(gt->stats);
+}
+
+/**
+ * xe_gt_stats_init() - Initialize GT statistics
+ * @gt: GT structure
+ *
+ * Allocate per-CPU GT statistics. Using per-CPU stats allows increments
+ * to occur without cross-CPU atomics.
+ *
+ * Return: 0 on success, -ENOMEM on failure.
+ */
+int xe_gt_stats_init(struct xe_gt *gt)
+{
+	gt->stats = alloc_percpu(struct xe_gt_stats);
+	if (!gt->stats)
+		return -ENOMEM;
+
+	return drmm_add_action_or_reset(&gt_to_xe(gt)->drm, xe_gt_stats_fini,
+					gt);
+}
+
/**
 * xe_gt_stats_incr - Increments the specified stats counter

@@ -23,7 +48,7 @@ void xe_gt_stats_incr(struct xe_gt *gt, const enum xe_gt_stats_id id, int incr)
	if (id >= __XE_GT_STATS_NUM_IDS)
		return;

-	atomic64_add(incr, &gt->stats.counters[id]);
+	this_cpu_add(gt->stats->counters[id], incr);
}

#define DEF_STAT_STR(ID, name) [XE_GT_STATS_ID_##ID] = name

@@ -35,6 +60,7 @@ static const char *const stat_description[__XE_GT_STATS_NUM_IDS] = {
	DEF_STAT_STR(SVM_TLB_INVAL_US, "svm_tlb_inval_us"),
	DEF_STAT_STR(VMA_PAGEFAULT_COUNT, "vma_pagefault_count"),
	DEF_STAT_STR(VMA_PAGEFAULT_KB, "vma_pagefault_kb"),
+	DEF_STAT_STR(INVALID_PREFETCH_PAGEFAULT_COUNT, "invalid_prefetch_pagefault_count"),
	DEF_STAT_STR(SVM_4K_PAGEFAULT_COUNT, "svm_4K_pagefault_count"),
	DEF_STAT_STR(SVM_64K_PAGEFAULT_COUNT, "svm_64K_pagefault_count"),
	DEF_STAT_STR(SVM_2M_PAGEFAULT_COUNT, "svm_2M_pagefault_count"),

@@ -94,23 +120,37 @@ int xe_gt_stats_print_info(struct xe_gt *gt, struct drm_printer *p)
{
	enum xe_gt_stats_id id;

-	for (id = 0; id < __XE_GT_STATS_NUM_IDS; ++id)
-		drm_printf(p, "%s: %lld\n", stat_description[id],
-			   atomic64_read(&gt->stats.counters[id]));
+	for (id = 0; id < __XE_GT_STATS_NUM_IDS; ++id) {
+		u64 total = 0;
+		int cpu;
+
+		for_each_possible_cpu(cpu) {
+			struct xe_gt_stats *s = per_cpu_ptr(gt->stats, cpu);
+
+			total += s->counters[id];
+		}
+
+		drm_printf(p, "%s: %lld\n", stat_description[id], total);
+	}

	return 0;
}

/**
- * xe_gt_stats_clear - Clear the GT stats
+ * xe_gt_stats_clear() - Clear the GT stats
 * @gt: GT structure
 *
- * This clear (zeros) all the available GT stats.
+ * Clear (zero) all available GT stats. Note that if the stats are being
+ * updated while this function is running, the results may be unpredictable.
+ * Intended to be called on an idle GPU.
 */
void xe_gt_stats_clear(struct xe_gt *gt)
{
-	int id;
+	int cpu;

-	for (id = 0; id < ARRAY_SIZE(gt->stats.counters); ++id)
-		atomic64_set(&gt->stats.counters[id], 0);
+	for_each_possible_cpu(cpu) {
+		struct xe_gt_stats *s = per_cpu_ptr(gt->stats, cpu);
+
+		memset(s, 0, sizeof(*s));
+	}
}
@@ -14,10 +14,16 @@ struct xe_gt;
struct drm_printer;

#ifdef CONFIG_DEBUG_FS
+int xe_gt_stats_init(struct xe_gt *gt);
int xe_gt_stats_print_info(struct xe_gt *gt, struct drm_printer *p);
void xe_gt_stats_clear(struct xe_gt *gt);
void xe_gt_stats_incr(struct xe_gt *gt, const enum xe_gt_stats_id id, int incr);
#else
+static inline int xe_gt_stats_init(struct xe_gt *gt)
+{
+	return 0;
+}
+
static inline void
xe_gt_stats_incr(struct xe_gt *gt, const enum xe_gt_stats_id id,
		 int incr)
@@ -6,6 +6,8 @@
#ifndef _XE_GT_STATS_TYPES_H_
#define _XE_GT_STATS_TYPES_H_

+#include <linux/types.h>
+
enum xe_gt_stats_id {
	XE_GT_STATS_ID_SVM_PAGEFAULT_COUNT,
	XE_GT_STATS_ID_TLB_INVAL,

@@ -13,6 +15,7 @@ enum xe_gt_stats_id {
	XE_GT_STATS_ID_SVM_TLB_INVAL_US,
	XE_GT_STATS_ID_VMA_PAGEFAULT_COUNT,
	XE_GT_STATS_ID_VMA_PAGEFAULT_KB,
+	XE_GT_STATS_ID_INVALID_PREFETCH_PAGEFAULT_COUNT,
	XE_GT_STATS_ID_SVM_4K_PAGEFAULT_COUNT,
	XE_GT_STATS_ID_SVM_64K_PAGEFAULT_COUNT,
	XE_GT_STATS_ID_SVM_2M_PAGEFAULT_COUNT,

@@ -58,4 +61,21 @@ enum xe_gt_stats_id {
	__XE_GT_STATS_NUM_IDS,
};

+/**
+ * struct xe_gt_stats - Per-CPU GT statistics counters
+ * @counters: Array of 64-bit counters indexed by &enum xe_gt_stats_id
+ *
+ * This structure is used for high-frequency, per-CPU statistics collection
+ * in the Xe driver. By using a per-CPU allocation and ensuring the structure
+ * is cache-line aligned, we avoid the performance-heavy atomics and cache
+ * coherency traffic.
+ *
+ * Updates to these counters should be performed using the this_cpu_add()
+ * macro to ensure they are atomic with respect to local interrupts and
+ * preemption-safe without the overhead of explicit locking.
+ */
+struct xe_gt_stats {
+	u64 counters[__XE_GT_STATS_NUM_IDS];
+} ____cacheline_aligned;
+
#endif
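This structure is the payload of the per-CPU conversion: one cache-line-aligned counter block per CPU, updated locklessly with this_cpu_add() and folded into a total only on the (rare) read side. A condensed, self-contained sketch of the whole pattern, with illustrative names:

	#include <linux/percpu.h>

	struct counters { u64 val; } ____cacheline_aligned;
	static struct counters __percpu *stats;

	static int stats_setup(void)
	{
		stats = alloc_percpu(struct counters);
		return stats ? 0 : -ENOMEM;	/* paired with free_percpu() */
	}

	static void stat_inc(int delta)
	{
		this_cpu_add(stats->val, delta);	/* no cross-CPU atomics */
	}

	static u64 stat_read(void)
	{
		u64 total = 0;
		int cpu;

		for_each_possible_cpu(cpu)	/* fold per-CPU values on read */
			total += per_cpu_ptr(stats, cpu)->val;
		return total;
	}

The trade-off is that reads are racy with respect to in-flight increments, which is why the clear path above is documented as intended for an idle GPU.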
@@ -205,24 +205,6 @@ load_l3_bank_mask(struct xe_gt *gt, xe_l3_bank_mask_t l3_bank_mask)
	}
}

-static void
-get_num_dss_regs(struct xe_device *xe, int *geometry_regs, int *compute_regs)
-{
-	if (GRAPHICS_VER(xe) > 20) {
-		*geometry_regs = 3;
-		*compute_regs = 3;
-	} else if (GRAPHICS_VERx100(xe) == 1260) {
-		*geometry_regs = 0;
-		*compute_regs = 2;
-	} else if (GRAPHICS_VERx100(xe) >= 1250) {
-		*geometry_regs = 1;
-		*compute_regs = 1;
-	} else {
-		*geometry_regs = 1;
-		*compute_regs = 0;
-	}
-}
-
void
xe_gt_topology_init(struct xe_gt *gt)
{

@@ -230,29 +212,27 @@ xe_gt_topology_init(struct xe_gt *gt)
		XELP_GT_GEOMETRY_DSS_ENABLE,
		XE2_GT_GEOMETRY_DSS_1,
		XE2_GT_GEOMETRY_DSS_2,
+		XE3P_XPC_GT_GEOMETRY_DSS_3,
	};
	static const struct xe_reg compute_regs[] = {
		XEHP_GT_COMPUTE_DSS_ENABLE,
		XEHPC_GT_COMPUTE_DSS_ENABLE_EXT,
		XE2_GT_COMPUTE_DSS_2,
+		XE3P_XPC_GT_COMPUTE_DSS_3,
	};
-	int num_geometry_regs, num_compute_regs;
	struct xe_device *xe = gt_to_xe(gt);
	struct drm_printer p;

-	get_num_dss_regs(xe, &num_geometry_regs, &num_compute_regs);
-
	/*
	 * Register counts returned shouldn't exceed the number of registers
	 * passed as parameters below.
	 */
-	xe_gt_assert(gt, num_geometry_regs <= ARRAY_SIZE(geometry_regs));
-	xe_gt_assert(gt, num_compute_regs <= ARRAY_SIZE(compute_regs));
+	xe_gt_assert(gt, gt->info.num_geometry_xecore_fuse_regs <= ARRAY_SIZE(geometry_regs));
+	xe_gt_assert(gt, gt->info.num_compute_xecore_fuse_regs <= ARRAY_SIZE(compute_regs));

	load_dss_mask(gt, gt->fuse_topo.g_dss_mask,
-		      num_geometry_regs, geometry_regs);
+		      gt->info.num_geometry_xecore_fuse_regs, geometry_regs);
	load_dss_mask(gt, gt->fuse_topo.c_dss_mask,
-		      num_compute_regs, compute_regs);
+		      gt->info.num_compute_xecore_fuse_regs, compute_regs);

	load_eu_mask(gt, gt->fuse_topo.eu_mask_per_dss, &gt->fuse_topo.eu_type);
	load_l3_bank_mask(gt, gt->fuse_topo.l3_bank_mask);

@@ -330,15 +310,14 @@ xe_l3_bank_mask_ffs(const xe_l3_bank_mask_t mask)
 */
bool xe_gt_topology_has_dss_in_quadrant(struct xe_gt *gt, int quad)
{
-	struct xe_device *xe = gt_to_xe(gt);
	xe_dss_mask_t all_dss;
-	int g_dss_regs, c_dss_regs, dss_per_quad, quad_first;
+	int dss_per_quad, quad_first;

	bitmap_or(all_dss, gt->fuse_topo.g_dss_mask, gt->fuse_topo.c_dss_mask,
		  XE_MAX_DSS_FUSE_BITS);

-	get_num_dss_regs(xe, &g_dss_regs, &c_dss_regs);
-	dss_per_quad = 32 * max(g_dss_regs, c_dss_regs) / 4;
+	dss_per_quad = 32 * max(gt->info.num_geometry_xecore_fuse_regs,
+				gt->info.num_compute_xecore_fuse_regs) / 4;

	quad_first = xe_dss_mask_group_ffs(all_dss, dss_per_quad, quad);
@@ -35,7 +35,7 @@ enum xe_gt_eu_type {
	XE_GT_EU_TYPE_SIMD16,
};

-#define XE_MAX_DSS_FUSE_REGS		3
+#define XE_MAX_DSS_FUSE_REGS		4
#define XE_MAX_DSS_FUSE_BITS		(32 * XE_MAX_DSS_FUSE_REGS)
#define XE_MAX_EU_FUSE_REGS		1
#define XE_MAX_EU_FUSE_BITS		(32 * XE_MAX_EU_FUSE_REGS)

@@ -45,11 +45,6 @@ typedef unsigned long xe_dss_mask_t[BITS_TO_LONGS(XE_MAX_DSS_FUSE_BITS)];
typedef unsigned long xe_eu_mask_t[BITS_TO_LONGS(XE_MAX_EU_FUSE_BITS)];
typedef unsigned long xe_l3_bank_mask_t[BITS_TO_LONGS(XE_MAX_L3_BANK_MASK_BITS)];

-struct xe_mmio_range {
-	u32 start;
-	u32 end;
-};
-
/*
 * The hardware has multiple kinds of multicast register ranges that need
 * special register steering (and future platforms are expected to add

@@ -149,14 +144,21 @@ struct xe_gt {
		u8 id;
		/** @info.has_indirect_ring_state: GT has indirect ring state support */
		u8 has_indirect_ring_state:1;
+		/**
+		 * @info.num_geometry_xecore_fuse_regs: Number of 32-bit fuse
+		 * registers the geometry XeCore mask spans.
+		 */
+		u8 num_geometry_xecore_fuse_regs;
+		/**
+		 * @info.num_compute_xecore_fuse_regs: Number of 32-bit fuse
+		 * registers the compute XeCore mask spans.
+		 */
+		u8 num_compute_xecore_fuse_regs;
	} info;

#if IS_ENABLED(CONFIG_DEBUG_FS)
	/** @stats: GT stats */
-	struct {
-		/** @stats.counters: counters for various GT stats */
-		atomic64_t counters[__XE_GT_STATS_NUM_IDS];
-	} stats;
+	struct xe_gt_stats __percpu *stats;
#endif

	/**
@@ -35,11 +35,13 @@
#include "xe_guc_klv_helpers.h"
#include "xe_guc_log.h"
#include "xe_guc_pc.h"
+#include "xe_guc_rc.h"
#include "xe_guc_relay.h"
#include "xe_guc_submit.h"
#include "xe_memirq.h"
#include "xe_mmio.h"
#include "xe_platform_types.h"
+#include "xe_sleep.h"
#include "xe_sriov.h"
#include "xe_sriov_pf_migration.h"
#include "xe_uc.h"

@@ -211,9 +213,6 @@ static u32 guc_ctl_wa_flags(struct xe_guc *guc)
	    !xe_hw_engine_mask_per_class(gt, XE_ENGINE_CLASS_RENDER))
		flags |= GUC_WA_RCS_REGS_IN_CCS_REGS_LIST;

-	if (XE_GT_WA(gt, 1509372804))
-		flags |= GUC_WA_RENDER_RST_RC6_EXIT;
-
	if (XE_GT_WA(gt, 14018913170))
		flags |= GUC_WA_ENABLE_TSC_CHECK_ON_RC6;

@@ -668,6 +667,13 @@ static void guc_fini_hw(void *arg)
	guc_g2g_fini(guc);
}

+static void vf_guc_fini_hw(void *arg)
+{
+	struct xe_guc *guc = arg;
+
+	xe_gt_sriov_vf_reset(guc_to_gt(guc));
+}
+
/**
 * xe_guc_comm_init_early - early initialization of GuC communication
 * @guc: the &xe_guc to initialize

@@ -772,6 +778,10 @@ int xe_guc_init(struct xe_guc *guc)
	xe->info.has_page_reclaim_hw_assist = false;

	if (IS_SRIOV_VF(xe)) {
+		ret = devm_add_action_or_reset(xe->drm.dev, vf_guc_fini_hw, guc);
+		if (ret)
+			goto out;
+
		ret = xe_guc_ct_init(&guc->ct);
		if (ret)
			goto out;

@@ -869,6 +879,10 @@ int xe_guc_init_post_hwconfig(struct xe_guc *guc)
	if (ret)
		return ret;

+	ret = xe_guc_rc_init(guc);
+	if (ret)
+		return ret;
+
	ret = xe_guc_engine_activity_init(guc);
	if (ret)
		return ret;

@@ -900,6 +914,41 @@ int xe_guc_post_load_init(struct xe_guc *guc)
	return xe_guc_submit_enable(guc);
}

+/*
+ * Wa_14025883347: Prevent GuC firmware DMA failures during GuC-only reset by ensuring
+ * SRAM save/restore operations are complete before reset.
+ */
+static void guc_prevent_fw_dma_failure_on_reset(struct xe_guc *guc)
+{
+	struct xe_gt *gt = guc_to_gt(guc);
+	u32 boot_hash_chk, guc_status, sram_status;
+	int ret;
+
+	guc_status = xe_mmio_read32(&gt->mmio, GUC_STATUS);
+	if (guc_status & GS_MIA_IN_RESET)
+		return;
+
+	boot_hash_chk = xe_mmio_read32(&gt->mmio, BOOT_HASH_CHK);
+	if (!(boot_hash_chk & GUC_BOOT_UKERNEL_VALID))
+		return;
+
+	/* Disable idle flow during reset (GuC reset re-enables it automatically) */
+	xe_mmio_rmw32(&gt->mmio, GUC_MAX_IDLE_COUNT, 0, GUC_IDLE_FLOW_DISABLE);
+
+	ret = xe_mmio_wait32(&gt->mmio, GUC_STATUS, GS_UKERNEL_MASK,
+			     FIELD_PREP(GS_UKERNEL_MASK, XE_GUC_LOAD_STATUS_READY),
+			     100000, &guc_status, false);
+	if (ret)
+		xe_gt_warn(gt, "GuC not ready after disabling idle flow (GUC_STATUS: 0x%x)\n",
+			   guc_status);
+
+	ret = xe_mmio_wait32(&gt->mmio, GUC_SRAM_STATUS, GUC_SRAM_HANDLING_MASK,
+			     0, 5000, &sram_status, false);
+	if (ret)
+		xe_gt_warn(gt, "SRAM handling not complete (GUC_SRAM_STATUS: 0x%x)\n",
+			   sram_status);
+}
+
int xe_guc_reset(struct xe_guc *guc)
{
	struct xe_gt *gt = guc_to_gt(guc);

@@ -912,6 +961,9 @@ int xe_guc_reset(struct xe_guc *guc)
	if (IS_SRIOV_VF(gt_to_xe(gt)))
		return xe_gt_sriov_vf_bootstrap(gt);

+	if (XE_GT_WA(gt, 14025883347))
+		guc_prevent_fw_dma_failure_on_reset(guc);
+
	xe_mmio_write32(mmio, GDRST, GRDOM_GUC);

	ret = xe_mmio_wait32(mmio, GDRST, GRDOM_GUC, 0, 5000, &gdrst, false);

@@ -1388,17 +1440,21 @@ int xe_guc_auth_huc(struct xe_guc *guc, u32 rsa_addr)
	return xe_guc_ct_send_block(&guc->ct, action, ARRAY_SIZE(action));
}

+#define MAX_RETRIES_ON_FLR 2
+#define MIN_SLEEP_MS_ON_FLR 256
+
int xe_guc_mmio_send_recv(struct xe_guc *guc, const u32 *request,
			  u32 len, u32 *response_buf)
{
	struct xe_device *xe = guc_to_xe(guc);
	struct xe_gt *gt = guc_to_gt(guc);
	struct xe_mmio *mmio = &gt->mmio;
-	u32 header, reply;
	struct xe_reg reply_reg = xe_gt_is_media_type(gt) ?
		MED_VF_SW_FLAG(0) : VF_SW_FLAG(0);
	const u32 LAST_INDEX = VF_SW_FLAG_COUNT - 1;
-	bool lost = false;
+	unsigned int sleep_period_ms = 1;
+	unsigned int lost = 0;
+	u32 header;
	int ret;
	int i;

@@ -1430,21 +1486,25 @@ int xe_guc_mmio_send_recv(struct xe_guc *guc, const u32 *request,

	ret = xe_mmio_wait32(mmio, reply_reg, GUC_HXG_MSG_0_ORIGIN,
			     FIELD_PREP(GUC_HXG_MSG_0_ORIGIN, GUC_HXG_ORIGIN_GUC),
-			     50000, &reply, false);
+			     50000, &header, false);
	if (ret) {
		/* scratch registers might be cleared during FLR, try once more */
-		if (!reply && !lost) {
+		if (!header) {
+			if (++lost > MAX_RETRIES_ON_FLR) {
+				xe_gt_err(gt, "GuC mmio request %#x: lost, too many retries %u\n",
+					  request[0], lost);
+				return -ENOLINK;
+			}
			xe_gt_dbg(gt, "GuC mmio request %#x: lost, trying again\n", request[0]);
-			lost = true;
+			xe_sleep_relaxed_ms(MIN_SLEEP_MS_ON_FLR);
			goto retry;
		}
timeout:
		xe_gt_err(gt, "GuC mmio request %#x: no reply %#x\n",
-			  request[0], reply);
+			  request[0], header);
		return ret;
	}

	header = xe_mmio_read32(mmio, reply_reg);
	if (FIELD_GET(GUC_HXG_MSG_0_TYPE, header) ==
	    GUC_HXG_TYPE_NO_RESPONSE_BUSY) {
		/*

@@ -1480,6 +1540,8 @@ int xe_guc_mmio_send_recv(struct xe_guc *guc, const u32 *request,
	xe_gt_dbg(gt, "GuC mmio request %#x: retrying, reason %#x\n",
		  request[0], reason);

+	xe_sleep_exponential_ms(&sleep_period_ms, 256);
+
	goto retry;
}

@@ -1609,6 +1671,7 @@ void xe_guc_stop_prepare(struct xe_guc *guc)
	if (!IS_SRIOV_VF(guc_to_xe(guc))) {
		int err;

+		xe_guc_rc_disable(guc);
		err = xe_guc_pc_stop(&guc->pc);
		xe_gt_WARN(guc_to_gt(guc), err, "Failed to stop GuC PC: %pe\n",
			   ERR_PTR(err));
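Two call sites above (and more in the CT code below) use xe_sleep_exponential_ms(), whose implementation is not shown in this diff. Based on how it is invoked — a pointer to the current period, a cap, and a return value that the CT code accumulates into a total budget — a plausible reconstruction, offered as an assumption rather than the actual helper:

	#include <linux/delay.h>
	#include <linux/minmax.h>

	/*
	 * Hypothetical sketch: sleep the current period, then double it up to
	 * @max_ms; return how long we slept so callers can track a total budget.
	 */
	static unsigned int xe_sleep_exponential_ms_sketch(unsigned int *period_ms,
							   unsigned int max_ms)
	{
		unsigned int slept = *period_ms;

		msleep(slept);
		*period_ms = min(*period_ms << 1, max_ms);

		return slept;
	}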
@@ -32,6 +32,7 @@
#include "xe_guc_tlb_inval.h"
#include "xe_map.h"
#include "xe_pm.h"
+#include "xe_sleep.h"
#include "xe_sriov_vf.h"
#include "xe_trace_guc.h"

@@ -254,6 +255,7 @@ static bool g2h_fence_needs_alloc(struct g2h_fence *g2h_fence)

#define CTB_DESC_SIZE		ALIGN(sizeof(struct guc_ct_buffer_desc), SZ_2K)
#define CTB_H2G_BUFFER_OFFSET	(CTB_DESC_SIZE * 2)
+#define CTB_G2H_BUFFER_OFFSET	(CTB_DESC_SIZE * 2)
#define CTB_H2G_BUFFER_SIZE	(SZ_4K)
#define CTB_H2G_BUFFER_DWORDS	(CTB_H2G_BUFFER_SIZE / sizeof(u32))
#define CTB_G2H_BUFFER_SIZE	(SZ_128K)

@@ -274,14 +276,18 @@ static bool g2h_fence_needs_alloc(struct g2h_fence *g2h_fence)
 */
long xe_guc_ct_queue_proc_time_jiffies(struct xe_guc_ct *ct)
{
-	BUILD_BUG_ON(!IS_ALIGNED(CTB_H2G_BUFFER_SIZE, SZ_4));
+	BUILD_BUG_ON(!IS_ALIGNED(CTB_H2G_BUFFER_SIZE, SZ_4K));
	return (CTB_H2G_BUFFER_SIZE / SZ_4K) * HZ;
}

-static size_t guc_ct_size(void)
+static size_t guc_h2g_size(void)
{
-	return CTB_H2G_BUFFER_OFFSET + CTB_H2G_BUFFER_SIZE +
-		CTB_G2H_BUFFER_SIZE;
+	return CTB_H2G_BUFFER_OFFSET + CTB_H2G_BUFFER_SIZE;
+}
+
+static size_t guc_g2h_size(void)
+{
+	return CTB_G2H_BUFFER_OFFSET + CTB_G2H_BUFFER_SIZE;
+}

static void guc_ct_fini(struct drm_device *drm, void *arg)

@@ -310,7 +316,8 @@ int xe_guc_ct_init_noalloc(struct xe_guc_ct *ct)
	struct xe_gt *gt = ct_to_gt(ct);
	int err;

-	xe_gt_assert(gt, !(guc_ct_size() % PAGE_SIZE));
+	xe_gt_assert(gt, !(guc_h2g_size() % PAGE_SIZE));
+	xe_gt_assert(gt, !(guc_g2h_size() % PAGE_SIZE));

	err = drmm_mutex_init(&xe->drm, &ct->lock);
	if (err)

@@ -355,7 +362,7 @@ int xe_guc_ct_init(struct xe_guc_ct *ct)
	struct xe_tile *tile = gt_to_tile(gt);
	struct xe_bo *bo;

-	bo = xe_managed_bo_create_pin_map(xe, tile, guc_ct_size(),
+	bo = xe_managed_bo_create_pin_map(xe, tile, guc_h2g_size(),
					  XE_BO_FLAG_SYSTEM |
					  XE_BO_FLAG_GGTT |
					  XE_BO_FLAG_GGTT_INVALIDATE |

@@ -363,7 +370,17 @@ int xe_guc_ct_init(struct xe_guc_ct *ct)
	if (IS_ERR(bo))
		return PTR_ERR(bo);

-	ct->bo = bo;
+	ct->ctbs.h2g.bo = bo;
+
+	bo = xe_managed_bo_create_pin_map(xe, tile, guc_g2h_size(),
+					  XE_BO_FLAG_SYSTEM |
+					  XE_BO_FLAG_GGTT |
+					  XE_BO_FLAG_GGTT_INVALIDATE |
+					  XE_BO_FLAG_PINNED_NORESTORE);
+	if (IS_ERR(bo))
+		return PTR_ERR(bo);
+
+	ct->ctbs.g2h.bo = bo;

	return devm_add_action_or_reset(xe->drm.dev, guc_action_disable_ct, ct);
}

@@ -388,7 +405,7 @@ int xe_guc_ct_init_post_hwconfig(struct xe_guc_ct *ct)
	xe_assert(xe, !xe_guc_ct_enabled(ct));

	if (IS_DGFX(xe)) {
-		ret = xe_managed_bo_reinit_in_vram(xe, tile, &ct->bo);
+		ret = xe_managed_bo_reinit_in_vram(xe, tile, &ct->ctbs.h2g.bo);
		if (ret)
			return ret;
	}

@@ -438,8 +455,7 @@ static void guc_ct_ctb_g2h_init(struct xe_device *xe, struct guc_ctb *g2h,
	g2h->desc = IOSYS_MAP_INIT_OFFSET(map, CTB_DESC_SIZE);
	xe_map_memset(xe, &g2h->desc, 0, 0, sizeof(struct guc_ct_buffer_desc));

-	g2h->cmds = IOSYS_MAP_INIT_OFFSET(map, CTB_H2G_BUFFER_OFFSET +
-					  CTB_H2G_BUFFER_SIZE);
+	g2h->cmds = IOSYS_MAP_INIT_OFFSET(map, CTB_G2H_BUFFER_OFFSET);
}

static int guc_ct_ctb_h2g_register(struct xe_guc_ct *ct)

@@ -448,8 +464,8 @@ static int guc_ct_ctb_h2g_register(struct xe_guc_ct *ct)
	u32 desc_addr, ctb_addr, size;
	int err;

-	desc_addr = xe_bo_ggtt_addr(ct->bo);
-	ctb_addr = xe_bo_ggtt_addr(ct->bo) + CTB_H2G_BUFFER_OFFSET;
+	desc_addr = xe_bo_ggtt_addr(ct->ctbs.h2g.bo);
+	ctb_addr = xe_bo_ggtt_addr(ct->ctbs.h2g.bo) + CTB_H2G_BUFFER_OFFSET;
	size = ct->ctbs.h2g.info.size * sizeof(u32);

	err = xe_guc_self_cfg64(guc,

@@ -475,9 +491,8 @@ static int guc_ct_ctb_g2h_register(struct xe_guc_ct *ct)
	u32 desc_addr, ctb_addr, size;
	int err;

-	desc_addr = xe_bo_ggtt_addr(ct->bo) + CTB_DESC_SIZE;
-	ctb_addr = xe_bo_ggtt_addr(ct->bo) + CTB_H2G_BUFFER_OFFSET +
-		   CTB_H2G_BUFFER_SIZE;
+	desc_addr = xe_bo_ggtt_addr(ct->ctbs.g2h.bo) + CTB_DESC_SIZE;
+	ctb_addr = xe_bo_ggtt_addr(ct->ctbs.g2h.bo) + CTB_G2H_BUFFER_OFFSET;
	size = ct->ctbs.g2h.info.size * sizeof(u32);

	err = xe_guc_self_cfg64(guc,

@@ -604,9 +619,12 @@ static int __xe_guc_ct_start(struct xe_guc_ct *ct, bool needs_register)
	xe_gt_assert(gt, !xe_guc_ct_enabled(ct));

	if (needs_register) {
-		xe_map_memset(xe, &ct->bo->vmap, 0, 0, xe_bo_size(ct->bo));
-		guc_ct_ctb_h2g_init(xe, &ct->ctbs.h2g, &ct->bo->vmap);
-		guc_ct_ctb_g2h_init(xe, &ct->ctbs.g2h, &ct->bo->vmap);
+		xe_map_memset(xe, &ct->ctbs.h2g.bo->vmap, 0, 0,
+			      xe_bo_size(ct->ctbs.h2g.bo));
+		xe_map_memset(xe, &ct->ctbs.g2h.bo->vmap, 0, 0,
+			      xe_bo_size(ct->ctbs.g2h.bo));
+		guc_ct_ctb_h2g_init(xe, &ct->ctbs.h2g, &ct->ctbs.h2g.bo->vmap);
+		guc_ct_ctb_g2h_init(xe, &ct->ctbs.g2h, &ct->ctbs.g2h.bo->vmap);

		err = guc_ct_ctb_h2g_register(ct);
		if (err)

@@ -623,7 +641,7 @@ static int __xe_guc_ct_start(struct xe_guc_ct *ct, bool needs_register)
	ct->ctbs.h2g.info.broken = false;
	ct->ctbs.g2h.info.broken = false;
	/* Skip everything in H2G buffer */
-	xe_map_memset(xe, &ct->bo->vmap, CTB_H2G_BUFFER_OFFSET, 0,
+	xe_map_memset(xe, &ct->ctbs.h2g.bo->vmap, CTB_H2G_BUFFER_OFFSET, 0,
		      CTB_H2G_BUFFER_SIZE);
}

@@ -643,7 +661,7 @@ static int __xe_guc_ct_start(struct xe_guc_ct *ct, bool needs_register)
	spin_lock_irq(&ct->dead.lock);
	if (ct->dead.reason) {
		ct->dead.reason |= (1 << CT_DEAD_STATE_REARM);
-		queue_work(system_unbound_wq, &ct->dead.worker);
+		queue_work(system_dfl_wq, &ct->dead.worker);
	}
	spin_unlock_irq(&ct->dead.lock);
#endif

@@ -921,22 +939,22 @@ static int h2g_write(struct xe_guc_ct *ct, const u32 *action, u32 len,
	u32 full_len;
	struct iosys_map map = IOSYS_MAP_INIT_OFFSET(&h2g->cmds,
						     tail * sizeof(u32));
-	u32 desc_status;

	full_len = len + GUC_CTB_HDR_LEN;

	lockdep_assert_held(&ct->lock);
	xe_gt_assert(gt, full_len <= GUC_CTB_MSG_MAX_LEN);

-	desc_status = desc_read(xe, h2g, status);
-	if (desc_status) {
-		xe_gt_err(gt, "CT write: non-zero status: %u\n", desc_status);
-		goto corrupted;
-	}
-
	if (IS_ENABLED(CONFIG_DRM_XE_DEBUG)) {
		u32 desc_tail = desc_read(xe, h2g, tail);
		u32 desc_head = desc_read(xe, h2g, head);
+		u32 desc_status;
+
+		desc_status = desc_read(xe, h2g, status);
+		if (desc_status) {
+			xe_gt_err(gt, "CT write: non-zero status: %u\n", desc_status);
+			goto corrupted;
+		}

		if (tail != desc_tail) {
			desc_write(xe, h2g, status, desc_status | GUC_CTB_STATUS_MISMATCH);

@@ -1005,8 +1023,15 @@ static int h2g_write(struct xe_guc_ct *ct, const u32 *action, u32 len,
	/* Update descriptor */
	desc_write(xe, h2g, tail, h2g->info.tail);

-	trace_xe_guc_ctb_h2g(xe, gt->info.id, *(action - 1), full_len,
-			     desc_read(xe, h2g, head), h2g->info.tail);
+	/*
+	 * desc_read() performs a VRAM read which serializes the CPU and drains
+	 * posted writes on dGPU platforms. Tracepoints evaluate arguments even
+	 * when disabled, so guard the event to avoid adding µs-scale latency to
+	 * the fast H2G submission path when tracing is not active.
+	 */
+	if (trace_xe_guc_ctb_h2g_enabled())
+		trace_xe_guc_ctb_h2g(xe, gt->info.id, *(action - 1), full_len,
+				     desc_read(xe, h2g, head), h2g->info.tail);

	return 0;

@@ -1101,7 +1126,8 @@ static int dequeue_one_g2h(struct xe_guc_ct *ct);
 */
static bool guc_ct_send_wait_for_retry(struct xe_guc_ct *ct, u32 len,
				       u32 g2h_len, struct g2h_fence *g2h_fence,
-				       unsigned int *sleep_period_ms)
+				       unsigned int *sleep_period_ms,
+				       unsigned int *sleep_total_ms)
{
	struct xe_device *xe = ct_to_xe(ct);

@@ -1115,17 +1141,15 @@ static bool guc_ct_send_wait_for_retry(struct xe_guc_ct *ct, u32 len,
	if (!h2g_has_room(ct, len + GUC_CTB_HDR_LEN)) {
		struct guc_ctb *h2g = &ct->ctbs.h2g;

-		if (*sleep_period_ms == 1024)
+		if (*sleep_total_ms > 1000)
			return false;

		trace_xe_guc_ct_h2g_flow_control(xe, h2g->info.head, h2g->info.tail,
						 h2g->info.size,
						 h2g->info.space,
						 len + GUC_CTB_HDR_LEN);
-		msleep(*sleep_period_ms);
-		*sleep_period_ms <<= 1;
+		*sleep_total_ms += xe_sleep_exponential_ms(sleep_period_ms, 64);
	} else {
-		struct xe_device *xe = ct_to_xe(ct);
		struct guc_ctb *g2h = &ct->ctbs.g2h;
		int ret;

@@ -1147,7 +1171,7 @@ static bool guc_ct_send_wait_for_retry(struct xe_guc_ct *ct, u32 len,
	ret = dequeue_one_g2h(ct);
	if (ret < 0) {
		if (ret != -ECANCELED)
-			xe_gt_err(ct_to_gt(ct), "CTB receive failed (%pe)",
+			xe_gt_err(ct_to_gt(ct), "CTB receive failed (%pe)\n",
				  ERR_PTR(ret));
		return false;
	}

@@ -1161,6 +1185,7 @@ static int guc_ct_send_locked(struct xe_guc_ct *ct, const u32 *action, u32 len,
{
	struct xe_gt *gt = ct_to_gt(ct);
	unsigned int sleep_period_ms = 1;
+	unsigned int sleep_total_ms = 0;
	int ret;

	xe_gt_assert(gt, !g2h_len || !g2h_fence);

@@ -1173,7 +1198,7 @@ static int guc_ct_send_locked(struct xe_guc_ct *ct, const u32 *action, u32 len,

	if (unlikely(ret == -EBUSY)) {
		if (!guc_ct_send_wait_for_retry(ct, len, g2h_len, g2h_fence,
-						&sleep_period_ms))
+						&sleep_period_ms, &sleep_total_ms))
			goto broken;
		goto try_again;
	}

@@ -1322,7 +1347,7 @@ static int guc_ct_send_recv(struct xe_guc_ct *ct, const u32 *action, u32 len,
	 */
	mutex_lock(&ct->lock);
	if (!ret) {
-		xe_gt_err(gt, "Timed out wait for G2H, fence %u, action %04x, done %s",
+		xe_gt_err(gt, "Timed out wait for G2H, fence %u, action %04x, done %s\n",
			  g2h_fence.seqno, action[0], str_yes_no(g2h_fence.done));
		xa_erase(&ct->fence_lookup, g2h_fence.seqno);
		mutex_unlock(&ct->lock);

@@ -1832,7 +1857,7 @@ static void g2h_fast_path(struct xe_guc_ct *ct, u32 *msg, u32 len)
		ret = xe_guc_tlb_inval_done_handler(guc, payload, adj_len);
		break;
	default:
-		xe_gt_warn(gt, "NOT_POSSIBLE");
+		xe_gt_warn(gt, "NOT_POSSIBLE\n");
	}

	if (ret) {

@@ -1935,7 +1960,7 @@ static void receive_g2h(struct xe_guc_ct *ct)
	mutex_unlock(&ct->lock);

	if (unlikely(ret == -EPROTO || ret == -EOPNOTSUPP)) {
-		xe_gt_err(ct_to_gt(ct), "CT dequeue failed: %d", ret);
+		xe_gt_err(ct_to_gt(ct), "CT dequeue failed: %d\n", ret);
		CT_DEAD(ct, NULL, G2H_RECV);
		kick_reset(ct);
	}

@@ -1961,8 +1986,9 @@ static struct xe_guc_ct_snapshot *guc_ct_snapshot_alloc(struct xe_guc_ct *ct, bo
	if (!snapshot)
		return NULL;

-	if (ct->bo && want_ctb) {
-		snapshot->ctb_size = xe_bo_size(ct->bo);
+	if (ct->ctbs.h2g.bo && ct->ctbs.g2h.bo && want_ctb) {
+		snapshot->ctb_size = xe_bo_size(ct->ctbs.h2g.bo) +
+				     xe_bo_size(ct->ctbs.g2h.bo);
		snapshot->ctb = kmalloc(snapshot->ctb_size, atomic ? GFP_ATOMIC : GFP_KERNEL);
	}

@@ -2010,8 +2036,13 @@ static struct xe_guc_ct_snapshot *guc_ct_snapshot_capture(struct xe_guc_ct *ct,
		guc_ctb_snapshot_capture(xe, &ct->ctbs.g2h, &snapshot->g2h);
	}

-	if (ct->bo && snapshot->ctb)
-		xe_map_memcpy_from(xe, snapshot->ctb, &ct->bo->vmap, 0, snapshot->ctb_size);
+	if (ct->ctbs.h2g.bo && ct->ctbs.g2h.bo && snapshot->ctb) {
+		xe_map_memcpy_from(xe, snapshot->ctb, &ct->ctbs.h2g.bo->vmap, 0,
+				   xe_bo_size(ct->ctbs.h2g.bo));
+		xe_map_memcpy_from(xe, snapshot->ctb + xe_bo_size(ct->ctbs.h2g.bo),
+				   &ct->ctbs.g2h.bo->vmap, 0,
+				   xe_bo_size(ct->ctbs.g2h.bo));
+	}

	return snapshot;
}

@@ -2165,7 +2196,7 @@ static void ct_dead_capture(struct xe_guc_ct *ct, struct guc_ctb *ctb, u32 reaso

	spin_unlock_irqrestore(&ct->dead.lock, flags);

-	queue_work(system_unbound_wq, &(ct)->dead.worker);
+	queue_work(system_dfl_wq, &(ct)->dead.worker);
}

static void ct_dead_print(struct xe_dead_ct *dead)
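The net effect of this file's changes is that the single shared CT buffer object is split into two, one per direction, each holding its own descriptor space followed by its command ring at a fixed offset. A sketch of the resulting layout arithmetic, inferred from the defines and offsets in the hunks above (the 2K descriptor size is what ALIGN(sizeof(desc), SZ_2K) works out to in practice, assumed here for illustration):

	/*
	 * Per-direction BO layout after the split:
	 *
	 *   H2G BO: [desc @ 0]           [cmds @ 2 * CTB_DESC_SIZE,   4K ring]
	 *   G2H BO: [desc @ CTB_DESC_SIZE][cmds @ 2 * CTB_DESC_SIZE, 128K ring]
	 *
	 * Both BOs reserve two descriptor slots, apparently keeping each
	 * descriptor at the same offset it had in the pre-split shared BO.
	 */
	#define DESC_SZ		SZ_2K

	static size_t h2g_bo_size(void) { return 2 * DESC_SZ + SZ_4K; }   /* 8K   */
	static size_t g2h_bo_size(void) { return 2 * DESC_SZ + SZ_128K; } /* 132K */

Splitting the BOs lets the small H2G ring be restored in VRAM on dGPU while the large G2H ring is marked XE_BO_FLAG_PINNED_NORESTORE, which is consistent with the dGPU memory optimizations named in the changelog.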
@@ -39,6 +39,8 @@ struct guc_ctb_info {
 * struct guc_ctb - GuC command transport buffer (CTB)
 */
struct guc_ctb {
+	/** @bo: Xe BO for CTB */
+	struct xe_bo *bo;
	/** @desc: dma buffer map for CTB descriptor */
	struct iosys_map desc;
	/** @cmds: dma buffer map for CTB commands */

@@ -126,8 +128,6 @@ struct xe_fast_req_fence {
 * for the H2G and G2H requests sent and received through the buffers.
 */
struct xe_guc_ct {
-	/** @bo: Xe BO for CT */
-	struct xe_bo *bo;
	/** @lock: protects everything in CT layer */
	struct mutex lock;
	/** @fast_lock: protects G2H channel and credits */
@@ -261,7 +261,8 @@ struct xe_guc_pagefault_desc {
#define PFD_ACCESS_TYPE		GENMASK(1, 0)
#define PFD_FAULT_TYPE		GENMASK(3, 2)
#define PFD_VFID		GENMASK(9, 4)
-#define PFD_RSVD_1		GENMASK(11, 10)
+#define PFD_RSVD_1		BIT(10)
+#define PFD_PREFETCH		BIT(11) /* Only valid on Xe3+, reserved on prior platforms */
#define PFD_VIRTUAL_ADDR_LO	GENMASK(31, 12)
#define PFD_VIRTUAL_ADDR_LO_SHIFT 12

@@ -281,7 +282,7 @@ struct xe_guc_pagefault_reply {

	u32 dw1;
#define PFR_VFID		GENMASK(5, 0)
-#define PFR_RSVD_1		BIT(6)
+#define PFR_PREFETCH		BIT(6) /* Only valid on Xe3+, reserved on prior platforms */
#define PFR_ENG_INSTANCE	GENMASK(12, 7)
#define PFR_ENG_CLASS		GENMASK(15, 13)
#define PFR_PDATA		GENMASK(31, 16)
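These masks are consumed with the bitfield helpers from <linux/bitfield.h>: FIELD_GET() extracts a field from a register word using its GENMASK()/BIT() definition, and FIELD_PREP() shifts a value into place. A small self-contained illustration using the dw2 layout above:

	#include <linux/bitfield.h>
	#include <linux/bits.h>

	#define PFD_ACCESS_TYPE	GENMASK(1, 0)
	#define PFD_VFID	GENMASK(9, 4)
	#define PFD_PREFETCH	BIT(11)

	static u32 decode_vfid(u32 dw2)
	{
		return FIELD_GET(PFD_VFID, dw2);	/* bits 9:4 shifted down */
	}

	static u32 encode_vfid(u32 vfid)
	{
		return FIELD_PREP(PFD_VFID, vfid);	/* value shifted up into 9:4 */
	}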
@ -13,9 +13,13 @@ struct drm_printer;
|
|||
struct xe_device;
|
||||
|
||||
#if IS_ENABLED(CONFIG_DRM_XE_DEBUG_GUC)
|
||||
#define XE_GUC_LOG_EVENT_DATA_BUFFER_SIZE SZ_8M
|
||||
#define XE_GUC_LOG_EVENT_DATA_BUFFER_SIZE SZ_16M
|
||||
#define XE_GUC_LOG_CRASH_DUMP_BUFFER_SIZE SZ_1M
|
||||
#define XE_GUC_LOG_STATE_CAPTURE_BUFFER_SIZE SZ_2M
|
||||
#elif IS_ENABLED(CONFIG_DRM_XE_DEBUG)
|
||||
#define XE_GUC_LOG_EVENT_DATA_BUFFER_SIZE SZ_8M
|
||||
#define XE_GUC_LOG_CRASH_DUMP_BUFFER_SIZE SZ_1M
|
||||
#define XE_GUC_LOG_STATE_CAPTURE_BUFFER_SIZE SZ_1M
|
||||
#else
|
||||
#define XE_GUC_LOG_EVENT_DATA_BUFFER_SIZE SZ_64K
|
||||
#define XE_GUC_LOG_CRASH_DUMP_BUFFER_SIZE SZ_16K
|
||||
|
|
|
|||
|
|
@@ -8,15 +8,18 @@
#include "xe_guc_ct.h"
#include "xe_guc_pagefault.h"
#include "xe_pagefault.h"
#include "xe_pagefault_types.h"

static void guc_ack_fault(struct xe_pagefault *pf, int err)
{
	u32 vfid = FIELD_GET(PFD_VFID, pf->producer.msg[2]);
	u32 prefetch = FIELD_GET(PFD_PREFETCH, pf->producer.msg[2]);
	u32 engine_instance = FIELD_GET(PFD_ENG_INSTANCE, pf->producer.msg[0]);
	u32 engine_class = FIELD_GET(PFD_ENG_CLASS, pf->producer.msg[0]);
	u32 pdata = FIELD_GET(PFD_PDATA_LO, pf->producer.msg[0]) |
		    (FIELD_GET(PFD_PDATA_HI, pf->producer.msg[1]) <<
		     PFD_PDATA_HI_SHIFT);
	u32 asid = FIELD_GET(PFD_ASID, pf->producer.msg[1]);
	u32 action[] = {
		XE_GUC_ACTION_PAGE_FAULT_RES_DESC,

@@ -24,9 +27,10 @@ static void guc_ack_fault(struct xe_pagefault *pf, int err)
		FIELD_PREP(PFR_SUCCESS, !!err) |
		FIELD_PREP(PFR_REPLY, PFR_ACCESS) |
		FIELD_PREP(PFR_DESC_TYPE, FAULT_RESPONSE_DESC) |
-		FIELD_PREP(PFR_ASID, pf->consumer.asid),
+		FIELD_PREP(PFR_ASID, asid),

		FIELD_PREP(PFR_VFID, vfid) |
		FIELD_PREP(PFR_PREFETCH, err ? prefetch : 0) |
		FIELD_PREP(PFR_ENG_INSTANCE, engine_instance) |
		FIELD_PREP(PFR_ENG_CLASS, engine_class) |
		FIELD_PREP(PFR_PDATA, pdata),
@@ -75,12 +79,16 @@ int xe_guc_pagefault_handler(struct xe_guc *guc, u32 *msg, u32 len)
		(FIELD_GET(PFD_VIRTUAL_ADDR_LO, msg[2]) <<
		 PFD_VIRTUAL_ADDR_LO_SHIFT);
	pf.consumer.asid = FIELD_GET(PFD_ASID, msg[1]);
-	pf.consumer.access_type = FIELD_GET(PFD_ACCESS_TYPE, msg[2]);
-	pf.consumer.fault_type = FIELD_GET(PFD_FAULT_TYPE, msg[2]);
+	pf.consumer.access_type = FIELD_GET(PFD_ACCESS_TYPE, msg[2]) |
+		(FIELD_GET(PFD_PREFETCH, msg[2]) ? XE_PAGEFAULT_ACCESS_PREFETCH : 0);
	if (FIELD_GET(XE2_PFD_TRVA_FAULT, msg[0]))
-		pf.consumer.fault_level = XE_PAGEFAULT_LEVEL_NACK;
+		pf.consumer.fault_type_level = XE_PAGEFAULT_TYPE_LEVEL_NACK;
	else
-		pf.consumer.fault_level = FIELD_GET(PFD_FAULT_LEVEL, msg[0]);
+		pf.consumer.fault_type_level =
+			FIELD_PREP(XE_PAGEFAULT_LEVEL_MASK,
+				   FIELD_GET(PFD_FAULT_LEVEL, msg[0])) |
+			FIELD_PREP(XE_PAGEFAULT_TYPE_MASK,
+				   FIELD_GET(PFD_FAULT_TYPE, msg[2]));
	pf.consumer.engine_class = FIELD_GET(PFD_ENG_CLASS, msg[0]);
	pf.consumer.engine_instance = FIELD_GET(PFD_ENG_INSTANCE, msg[0]);
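A sketch of how a consumer could unpack the combined fault_type_level field. The XE_PAGEFAULT_LEVEL_MASK / XE_PAGEFAULT_TYPE_MASK layouts live in xe_pagefault_types.h and are not shown in this diff, so the example masks below are hypothetical placeholders:

	/* Hypothetical layout; the real masks live in xe_pagefault_types.h */
	#define EXAMPLE_LEVEL_MASK	GENMASK(2, 0)
	#define EXAMPLE_TYPE_MASK	GENMASK(4, 3)

	static void unpack_fault_type_level_example(u8 ftl, u8 *level, u8 *type)
	{
		*level = FIELD_GET(EXAMPLE_LEVEL_MASK, ftl);	/* page-walk level */
		*type = FIELD_GET(EXAMPLE_TYPE_MASK, ftl);	/* fault type */
	}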
@@ -92,6 +92,17 @@
 * Render-C states is also a GuC PC feature that is now enabled in Xe for
 * all platforms.
 *
 * Implementation details:
 * -----------------------
 * The implementation for GuC Power Management features is split as follows:
 *
 * xe_guc_rc: Logic for handling GuC RC
 * xe_gt_idle: Host side logic for RC6 and Coarse Power gating (CPG)
 * xe_guc_pc: Logic for all other SLPC related features
 *
 * There is some cross interaction between these where host C6 will need to be
 * enabled when we plan to skip GuC RC. Also, the GuC RC mode is currently
 * overridden through 0x3003 which is an SLPC H2G call.
 */

static struct xe_guc *pc_to_guc(struct xe_guc_pc *pc)
@@ -253,20 +264,35 @@ static int pc_action_unset_param(struct xe_guc_pc *pc, u8 id)
	return ret;
}

-static int pc_action_setup_gucrc(struct xe_guc_pc *pc, u32 mode)
-{
-	struct xe_guc_ct *ct = pc_to_ct(pc);
-	u32 action[] = {
-		GUC_ACTION_HOST2GUC_SETUP_PC_GUCRC,
-		mode,
-	};
-	int ret;
-
-	ret = xe_guc_ct_send(ct, action, ARRAY_SIZE(action), 0, 0);
-	if (ret && !(xe_device_wedged(pc_to_xe(pc)) && ret == -ECANCELED))
-		xe_gt_err(pc_to_gt(pc), "GuC RC enable mode=%u failed: %pe\n",
-			  mode, ERR_PTR(ret));
-	return ret;
-}
+/**
+ * xe_guc_pc_action_set_param() - Set value of SLPC param
+ * @pc: Xe_GuC_PC instance
+ * @id: Param id
+ * @value: Value to set
+ *
+ * This function can be used to set any SLPC param.
+ *
+ * Return: 0 on Success
+ */
+int xe_guc_pc_action_set_param(struct xe_guc_pc *pc, u8 id, u32 value)
+{
+	xe_device_assert_mem_access(pc_to_xe(pc));
+	return pc_action_set_param(pc, id, value);
+}
+
+/**
+ * xe_guc_pc_action_unset_param() - Revert to default value
+ * @pc: Xe_GuC_PC instance
+ * @id: Param id
+ *
+ * This function can be used to revert any SLPC param to its default value.
+ *
+ * Return: 0 on Success
+ */
+int xe_guc_pc_action_unset_param(struct xe_guc_pc *pc, u8 id)
+{
+	xe_device_assert_mem_access(pc_to_xe(pc));
+	return pc_action_unset_param(pc, id);
+}
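With these two helpers exported, callers outside xe_guc_pc.c can adjust individual SLPC parameters. For example, the GuC RC mode override that is removed from this file below can be expressed as a set/unset pair; a sketch, reusing SLPC_PARAM_PWRGATE_RC_MODE from the removed functions:

	/* Sketch: override then restore the GuC RC mode via the new helpers */
	static int override_rc_mode_example(struct xe_guc_pc *pc, u32 mode)
	{
		return xe_guc_pc_action_set_param(pc, SLPC_PARAM_PWRGATE_RC_MODE, mode);
	}

	static int unset_rc_mode_example(struct xe_guc_pc *pc)
	{
		return xe_guc_pc_action_unset_param(pc, SLPC_PARAM_PWRGATE_RC_MODE);
	}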

static u32 decode_freq(u32 raw)
@@ -1050,55 +1076,6 @@ int xe_guc_pc_restore_stashed_freq(struct xe_guc_pc *pc)
	return ret;
}

-/**
- * xe_guc_pc_gucrc_disable - Disable GuC RC
- * @pc: Xe_GuC_PC instance
- *
- * Disables GuC RC by taking control of RC6 back from GuC.
- *
- * Return: 0 on success, negative error code on error.
- */
-int xe_guc_pc_gucrc_disable(struct xe_guc_pc *pc)
-{
-	struct xe_device *xe = pc_to_xe(pc);
-	struct xe_gt *gt = pc_to_gt(pc);
-	int ret = 0;
-
-	if (xe->info.skip_guc_pc)
-		return 0;
-
-	ret = pc_action_setup_gucrc(pc, GUCRC_HOST_CONTROL);
-	if (ret)
-		return ret;
-
-	return xe_gt_idle_disable_c6(gt);
-}
-
-/**
- * xe_guc_pc_override_gucrc_mode - override GUCRC mode
- * @pc: Xe_GuC_PC instance
- * @mode: new value of the mode.
- *
- * Return: 0 on success, negative error code on error
- */
-int xe_guc_pc_override_gucrc_mode(struct xe_guc_pc *pc, enum slpc_gucrc_mode mode)
-{
-	guard(xe_pm_runtime)(pc_to_xe(pc));
-	return pc_action_set_param(pc, SLPC_PARAM_PWRGATE_RC_MODE, mode);
-}
-
-/**
- * xe_guc_pc_unset_gucrc_mode - unset GUCRC mode override
- * @pc: Xe_GuC_PC instance
- *
- * Return: 0 on success, negative error code on error
- */
-int xe_guc_pc_unset_gucrc_mode(struct xe_guc_pc *pc)
-{
-	guard(xe_pm_runtime)(pc_to_xe(pc));
-	return pc_action_unset_param(pc, SLPC_PARAM_PWRGATE_RC_MODE);
-}
-
static void pc_init_pcode_freq(struct xe_guc_pc *pc)
{
	u32 min = DIV_ROUND_CLOSEST(pc->rpn_freq, GT_FREQUENCY_MULTIPLIER);
@@ -1247,9 +1224,6 @@ int xe_guc_pc_start(struct xe_guc_pc *pc)
		return -ETIMEDOUT;

	if (xe->info.skip_guc_pc) {
-		if (xe->info.platform != XE_PVC)
-			xe_gt_idle_enable_c6(gt);
-
		/* Request max possible since dynamic freq mgmt is not enabled */
		pc_set_cur_freq(pc, UINT_MAX);
		return 0;

@@ -1291,15 +1265,6 @@ int xe_guc_pc_start(struct xe_guc_pc *pc)
	if (ret)
		return ret;

-	if (xe->info.platform == XE_PVC) {
-		xe_guc_pc_gucrc_disable(pc);
-		return 0;
-	}
-
-	ret = pc_action_setup_gucrc(pc, GUCRC_FIRMWARE_CONTROL);
-	if (ret)
-		return ret;
-
	/* Enable SLPC Optimized Strategy for compute */
	ret = pc_action_set_strategy(pc, SLPC_OPTIMIZED_STRATEGY_COMPUTE);
@@ -1319,10 +1284,8 @@ int xe_guc_pc_stop(struct xe_guc_pc *pc)
{
	struct xe_device *xe = pc_to_xe(pc);

-	if (xe->info.skip_guc_pc) {
-		xe_gt_idle_disable_c6(pc_to_gt(pc));
+	if (xe->info.skip_guc_pc)
		return 0;
-	}

	mutex_lock(&pc->freq_lock);
	pc->freq_ready = false;

@@ -1343,8 +1306,7 @@ static void xe_guc_pc_fini_hw(void *arg)
	if (xe_device_wedged(xe))
		return;

-	CLASS(xe_force_wake, fw_ref)(gt_to_fw(pc_to_gt(pc)), XE_FORCEWAKE_ALL);
-	xe_guc_pc_gucrc_disable(pc);
+	CLASS(xe_force_wake, fw_ref)(gt_to_fw(pc_to_gt(pc)), XE_FW_GT);
	XE_WARN_ON(xe_guc_pc_stop(pc));

	/* Bind requested freq to mert_freq_cap before unload */

@@ -9,16 +9,14 @@
#include <linux/types.h>

struct xe_guc_pc;
-enum slpc_gucrc_mode;
struct drm_printer;

int xe_guc_pc_init(struct xe_guc_pc *pc);
int xe_guc_pc_start(struct xe_guc_pc *pc);
int xe_guc_pc_stop(struct xe_guc_pc *pc);
-int xe_guc_pc_gucrc_disable(struct xe_guc_pc *pc);
-int xe_guc_pc_override_gucrc_mode(struct xe_guc_pc *pc, enum slpc_gucrc_mode mode);
-int xe_guc_pc_unset_gucrc_mode(struct xe_guc_pc *pc);
void xe_guc_pc_print(struct xe_guc_pc *pc, struct drm_printer *p);
+int xe_guc_pc_action_set_param(struct xe_guc_pc *pc, u8 id, u32 value);
+int xe_guc_pc_action_unset_param(struct xe_guc_pc *pc, u8 id);

u32 xe_guc_pc_get_act_freq(struct xe_guc_pc *pc);
int xe_guc_pc_get_cur_freq(struct xe_guc_pc *pc, u32 *freq);
drivers/gpu/drm/xe/xe_guc_rc.c (new file, 131 lines)

@@ -0,0 +1,131 @@
// SPDX-License-Identifier: MIT
/*
 * Copyright © 2026 Intel Corporation
 */

#include <drm/drm_print.h>

#include "abi/guc_actions_slpc_abi.h"
#include "xe_device.h"
#include "xe_force_wake.h"
#include "xe_gt.h"
#include "xe_gt_idle.h"
#include "xe_gt_printk.h"
#include "xe_guc.h"
#include "xe_guc_ct.h"
#include "xe_guc_pc.h"
#include "xe_guc_rc.h"
#include "xe_pm.h"

/**
 * DOC: GuC RC (Render C-states)
 *
 * GuC handles the GT transition to deeper C-states in conjunction with Pcode.
 * GuC RC can be enabled independently of the frequency component in SLPC,
 * which is also controlled by GuC.
 *
 * This file will contain all H2G related logic for handling Render C-states.
 * There are some calls to xe_gt_idle, where we enable host C6 when GuC RC is
 * skipped. GuC RC is mostly independent of xe_guc_pc with the exception of
 * functions that override the mode, for which we have to rely on the SLPC H2G
 * calls.
 */

static int guc_action_setup_gucrc(struct xe_guc *guc, u32 control)
{
	u32 action[] = {
		GUC_ACTION_HOST2GUC_SETUP_PC_GUCRC,
		control,
	};
	int ret;

	ret = xe_guc_ct_send(&guc->ct, action, ARRAY_SIZE(action), 0, 0);
	if (ret && !(xe_device_wedged(guc_to_xe(guc)) && ret == -ECANCELED))
		xe_gt_err(guc_to_gt(guc),
			  "GuC RC setup %s(%u) failed (%pe)\n",
			  control == GUCRC_HOST_CONTROL ? "HOST_CONTROL" :
			  control == GUCRC_FIRMWARE_CONTROL ? "FIRMWARE_CONTROL" :
			  "UNKNOWN", control, ERR_PTR(ret));
	return ret;
}

/**
 * xe_guc_rc_disable() - Disable GuC RC
 * @guc: Xe GuC instance
 *
 * Disables GuC RC by taking control of RC6 back from GuC.
 */
void xe_guc_rc_disable(struct xe_guc *guc)
{
	struct xe_device *xe = guc_to_xe(guc);
	struct xe_gt *gt = guc_to_gt(guc);

	if (!xe->info.skip_guc_pc && xe->info.platform != XE_PVC)
		if (guc_action_setup_gucrc(guc, GUCRC_HOST_CONTROL))
			return;

	xe_gt_WARN_ON(gt, xe_gt_idle_disable_c6(gt));
}

static void xe_guc_rc_fini_hw(void *arg)
{
	struct xe_guc *guc = arg;
	struct xe_device *xe = guc_to_xe(guc);
	struct xe_gt *gt = guc_to_gt(guc);

	if (xe_device_wedged(xe))
		return;

	CLASS(xe_force_wake, fw_ref)(gt_to_fw(gt), XE_FW_GT);
	xe_guc_rc_disable(guc);
}

/**
 * xe_guc_rc_init() - Init GuC RC
 * @guc: Xe GuC instance
 *
 * Add callback action for GuC RC
 *
 * Return: 0 on success, negative error code on error.
 */
int xe_guc_rc_init(struct xe_guc *guc)
{
	struct xe_device *xe = guc_to_xe(guc);
	struct xe_gt *gt = guc_to_gt(guc);

	xe_gt_assert(gt, xe_device_uc_enabled(xe));

	return devm_add_action_or_reset(xe->drm.dev, xe_guc_rc_fini_hw, guc);
}

/**
 * xe_guc_rc_enable() - Enable GuC RC feature if applicable
 * @guc: Xe GuC instance
 *
 * Enables GuC RC feature.
 *
 * Return: 0 on success, negative error code on error.
 */
int xe_guc_rc_enable(struct xe_guc *guc)
{
	struct xe_device *xe = guc_to_xe(guc);
	struct xe_gt *gt = guc_to_gt(guc);

	xe_gt_assert(gt, xe_device_uc_enabled(xe));

	CLASS(xe_force_wake, fw_ref)(gt_to_fw(gt), XE_FW_GT);
	if (!xe_force_wake_ref_has_domain(fw_ref.domains, XE_FW_GT))
		return -ETIMEDOUT;

	if (xe->info.platform == XE_PVC) {
		xe_guc_rc_disable(guc);
		return 0;
	}

	if (xe->info.skip_guc_pc) {
		xe_gt_idle_enable_c6(gt);
		return 0;
	}

	return guc_action_setup_gucrc(guc, GUCRC_FIRMWARE_CONTROL);
}
drivers/gpu/drm/xe/xe_guc_rc.h (new file, 16 lines)

@@ -0,0 +1,16 @@
/* SPDX-License-Identifier: MIT */
/*
 * Copyright © 2026 Intel Corporation
 */

#ifndef _XE_GUC_RC_H_
#define _XE_GUC_RC_H_

struct xe_guc;
enum slpc_gucrc_mode;

int xe_guc_rc_init(struct xe_guc *guc);
int xe_guc_rc_enable(struct xe_guc *guc);
void xe_guc_rc_disable(struct xe_guc *guc);

#endif
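Taken together with xe_guc_rc.c above, the expected lifecycle from the GuC init path would be roughly as follows. This is a sketch; the actual call sites are not part of this diff:

	/* Sketch of the expected lifecycle; real call sites not shown here */
	static int guc_rc_lifecycle_example(struct xe_guc *guc)
	{
		int ret;

		/* registers the devm cleanup action (xe_guc_rc_fini_hw) */
		ret = xe_guc_rc_init(guc);
		if (ret)
			return ret;

		/* FIRMWARE_CONTROL, or the host C6 / PVC fallbacks */
		return xe_guc_rc_enable(guc);
	}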
@@ -8,9 +8,7 @@
#include <linux/bitfield.h>
#include <linux/bitmap.h>
#include <linux/circ_buf.h>
-#include <linux/delay.h>
#include <linux/dma-fence-array.h>
-#include <linux/math64.h>

#include <drm/drm_managed.h>

@@ -42,6 +40,7 @@
#include "xe_pm.h"
#include "xe_ring_ops_types.h"
#include "xe_sched_job.h"
+#include "xe_sleep.h"
#include "xe_trace.h"
#include "xe_uc_fw.h"
#include "xe_vm.h"
@@ -556,6 +555,72 @@ static void xe_guc_exec_queue_trigger_cleanup(struct xe_exec_queue *q)
	xe_sched_tdr_queue_imm(&q->guc->sched);
}

static void xe_guc_exec_queue_group_stop(struct xe_exec_queue *q)
{
	struct xe_exec_queue *primary = xe_exec_queue_multi_queue_primary(q);
	struct xe_exec_queue_group *group = q->multi_queue.group;
	struct xe_exec_queue *eq, *next;
	LIST_HEAD(tmp);

	xe_gt_assert(guc_to_gt(exec_queue_to_guc(q)),
		     xe_exec_queue_is_multi_queue(q));

	mutex_lock(&group->list_lock);

	/*
	 * Stop all future queues from executing while the group is stopped.
	 */
	group->stopped = true;

	list_for_each_entry_safe(eq, next, &group->list, multi_queue.link)
		/*
		 * Refcount prevents an attempted removal from &group->list,
		 * temporary list allows safe iteration after dropping
		 * &group->list_lock.
		 */
		if (xe_exec_queue_get_unless_zero(eq))
			list_move_tail(&eq->multi_queue.link, &tmp);

	mutex_unlock(&group->list_lock);

	/* We cannot stop under list lock without getting inversions */
	xe_sched_submission_stop(&primary->guc->sched);
	list_for_each_entry(eq, &tmp, multi_queue.link)
		xe_sched_submission_stop(&eq->guc->sched);

	mutex_lock(&group->list_lock);
	list_for_each_entry_safe(eq, next, &tmp, multi_queue.link) {
		/*
		 * Corner case where we got banned while stopping and are not
		 * on &group->list
		 */
		if (READ_ONCE(group->banned))
			xe_guc_exec_queue_trigger_cleanup(eq);

		list_move_tail(&eq->multi_queue.link, &group->list);
		xe_exec_queue_put(eq);
	}
	mutex_unlock(&group->list_lock);
}

static void xe_guc_exec_queue_group_start(struct xe_exec_queue *q)
{
	struct xe_exec_queue *primary = xe_exec_queue_multi_queue_primary(q);
	struct xe_exec_queue_group *group = q->multi_queue.group;
	struct xe_exec_queue *eq;

	xe_gt_assert(guc_to_gt(exec_queue_to_guc(q)),
		     xe_exec_queue_is_multi_queue(q));

	xe_sched_submission_start(&primary->guc->sched);

	mutex_lock(&group->list_lock);
	group->stopped = false;
	list_for_each_entry(eq, &group->list, multi_queue.link)
		xe_sched_submission_start(&eq->guc->sched);
	mutex_unlock(&group->list_lock);
}
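The stop path above leans on xe_exec_queue_get_unless_zero() to pin each queue before parking it on the temporary list. A minimal sketch of that idiom, assuming the queue's refcount is a plain kref named refcount (the real helper lives in xe_exec_queue.h and may differ):

	#include <linux/kref.h>

	/* Sketch: take a reference only if the object is not already dying */
	static bool exec_queue_get_unless_zero_example(struct xe_exec_queue *q)
	{
		/* fails (returns false) once the refcount has hit zero */
		return kref_get_unless_zero(&q->refcount);
	}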

static void xe_guc_exec_queue_group_trigger_cleanup(struct xe_exec_queue *q)
{
	struct xe_exec_queue *primary = xe_exec_queue_multi_queue_primary(q);
@@ -738,6 +803,7 @@ static void xe_guc_exec_queue_group_cgp_sync(struct xe_guc *guc,
{
	struct xe_exec_queue_group *group = q->multi_queue.group;
	struct xe_device *xe = guc_to_xe(guc);
+	enum xe_multi_queue_priority priority;
	long ret;

	/*

@@ -761,7 +827,10 @@ static void xe_guc_exec_queue_group_cgp_sync(struct xe_guc *guc,
		return;
	}

-	xe_lrc_set_multi_queue_priority(q->lrc[0], q->multi_queue.priority);
+	scoped_guard(spinlock, &q->multi_queue.lock)
+		priority = q->multi_queue.priority;
+
+	xe_lrc_set_multi_queue_priority(q->lrc[0], priority);
	xe_guc_exec_queue_group_cgp_update(xe, q);

	WRITE_ONCE(group->sync_pending, true);
@@ -962,24 +1031,6 @@ static u32 wq_space_until_wrap(struct xe_exec_queue *q)
	return (WQ_SIZE - q->guc->wqi_tail);
}

-static inline void relaxed_ms_sleep(unsigned int delay_ms)
-{
-	unsigned long min_us, max_us;
-
-	if (!delay_ms)
-		return;
-
-	if (delay_ms > 20) {
-		msleep(delay_ms);
-		return;
-	}
-
-	min_us = mul_u32_u32(delay_ms, 1000);
-	max_us = min_us + 500;
-
-	usleep_range(min_us, max_us);
-}
-
static int wq_wait_for_space(struct xe_exec_queue *q, u32 wqi_size)
{
	struct xe_guc *guc = exec_queue_to_guc(q);

@@ -998,10 +1049,7 @@ static int wq_wait_for_space(struct xe_exec_queue *q, u32 wqi_size)
			return -ENODEV;
		}

-		msleep(sleep_period_ms);
-		sleep_total_ms += sleep_period_ms;
-		if (sleep_period_ms < 64)
-			sleep_period_ms <<= 1;
+		sleep_total_ms += xe_sleep_exponential_ms(&sleep_period_ms, 64);
		goto try_again;
	}
}
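The new xe_sleep_exponential_ms() helper is not shown in this series excerpt; a hypothetical sketch of its behavior, inferred purely from the open-coded backoff it replaces (sleep for the current period, double the period up to the cap, return the time slept):

	#include <linux/delay.h>

	/* Hypothetical sketch; the real helper lives in xe_sleep.h */
	static inline unsigned int xe_sleep_exponential_ms_sketch(unsigned int *period_ms,
								  unsigned int max_ms)
	{
		unsigned int slept = *period_ms;

		msleep(slept);			/* sleep for the current period */
		if (*period_ms < max_ms)
			*period_ms <<= 1;	/* double the period, capped at max */

		return slept;			/* caller accumulates total sleep time */
	}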
@@ -1414,7 +1462,7 @@ guc_exec_queue_timedout_job(struct drm_sched_job *drm_job)
{
	struct xe_sched_job *job = to_xe_sched_job(drm_job);
	struct drm_sched_job *tmp_job;
-	struct xe_exec_queue *q = job->q;
+	struct xe_exec_queue *q = job->q, *primary;
	struct xe_gpu_scheduler *sched = &q->guc->sched;
	struct xe_guc *guc = exec_queue_to_guc(q);
	const char *process_name = "no process";

@@ -1425,6 +1473,8 @@ guc_exec_queue_timedout_job(struct drm_sched_job *drm_job)

	xe_gt_assert(guc_to_gt(guc), !exec_queue_destroyed(q));

+	primary = xe_exec_queue_multi_queue_primary(q);
+
	/*
	 * TDR has fired before free job worker. Common if exec queue
	 * immediately closed after last fence signaled. Add back to pending

@@ -1436,7 +1486,10 @@ guc_exec_queue_timedout_job(struct drm_sched_job *drm_job)
		return DRM_GPU_SCHED_STAT_NO_HANG;

	/* Kill the run_job entry point */
-	xe_sched_submission_stop(sched);
+	if (xe_exec_queue_is_multi_queue(q))
+		xe_guc_exec_queue_group_stop(q);
+	else
+		xe_sched_submission_stop(sched);

	/* Must check all state after stopping scheduler */
	skip_timeout_check = exec_queue_reset(q) ||

@@ -1451,14 +1504,6 @@ guc_exec_queue_timedout_job(struct drm_sched_job *drm_job)
	if (xe_exec_queue_is_lr(q))
		xe_gt_assert(guc_to_gt(guc), skip_timeout_check);

-	/*
-	 * FIXME: In multi-queue scenario, the TDR must ensure that the whole
-	 * multi-queue group is off the HW before signaling the fences to avoid
-	 * possible memory corruptions. This means disabling scheduling on the
-	 * primary queue before or during the secondary queue's TDR. Need to
-	 * implement this in least obtrusive way.
-	 */
-
	/*
	 * If devcoredump not captured and GuC capture for the job is not ready
	 * do manual capture first and decide later if we need to use it

@@ -1485,10 +1530,11 @@ guc_exec_queue_timedout_job(struct drm_sched_job *drm_job)
	set_exec_queue_banned(q);

	/* Kick job / queue off hardware */
-	if (!wedged && (exec_queue_enabled(q) || exec_queue_pending_disable(q))) {
+	if (!wedged && (exec_queue_enabled(primary) ||
+			exec_queue_pending_disable(primary))) {
		int ret;

-		if (exec_queue_reset(q))
+		if (exec_queue_reset(primary))
			err = -EIO;

		if (xe_uc_fw_is_running(&guc->fw)) {

@@ -1497,8 +1543,8 @@ guc_exec_queue_timedout_job(struct drm_sched_job *drm_job)
			 * modifying state
			 */
			ret = wait_event_timeout(guc->ct.wq,
-						 (!exec_queue_pending_enable(q) &&
-						  !exec_queue_pending_disable(q)) ||
+						 (!exec_queue_pending_enable(primary) &&
+						  !exec_queue_pending_disable(primary)) ||
						 xe_guc_read_stopped(guc) ||
						 vf_recovery(guc), HZ * 5);
			if (vf_recovery(guc))

@@ -1506,7 +1552,7 @@ guc_exec_queue_timedout_job(struct drm_sched_job *drm_job)
			if (!ret || xe_guc_read_stopped(guc))
				goto trigger_reset;

-			disable_scheduling(q, skip_timeout_check);
+			disable_scheduling(primary, skip_timeout_check);
		}

		/*

@@ -1520,7 +1566,7 @@ guc_exec_queue_timedout_job(struct drm_sched_job *drm_job)
		smp_rmb();
		ret = wait_event_timeout(guc->ct.wq,
					 !xe_uc_fw_is_running(&guc->fw) ||
-					 !exec_queue_pending_disable(q) ||
+					 !exec_queue_pending_disable(primary) ||
					 xe_guc_read_stopped(guc) ||
					 vf_recovery(guc), HZ * 5);
		if (vf_recovery(guc))

@@ -1530,11 +1576,11 @@ guc_exec_queue_timedout_job(struct drm_sched_job *drm_job)
		if (!ret)
			xe_gt_warn(guc_to_gt(guc),
				   "Schedule disable failed to respond, guc_id=%d",
-				   q->guc->id);
-		xe_devcoredump(q, job,
+				   primary->guc->id);
+		xe_devcoredump(primary, job,
			       "Schedule disable failed to respond, guc_id=%d, ret=%d, guc_read=%d",
-			       q->guc->id, ret, xe_guc_read_stopped(guc));
-		xe_gt_reset_async(q->gt);
+			       primary->guc->id, ret, xe_guc_read_stopped(guc));
+		xe_gt_reset_async(primary->gt);
		xe_sched_tdr_queue_imm(sched);
		goto rearm;
	}

@@ -1580,12 +1626,13 @@ guc_exec_queue_timedout_job(struct drm_sched_job *drm_job)
	drm_sched_for_each_pending_job(tmp_job, &sched->base, NULL)
		xe_sched_job_set_error(to_xe_sched_job(tmp_job), -ECANCELED);

-	xe_sched_submission_start(sched);
-
-	if (xe_exec_queue_is_multi_queue(q))
+	if (xe_exec_queue_is_multi_queue(q)) {
+		xe_guc_exec_queue_group_start(q);
		xe_guc_exec_queue_group_trigger_cleanup(q);
-	else
+	} else {
+		xe_sched_submission_start(sched);
		xe_guc_exec_queue_trigger_cleanup(q);
+	}

	/*
	 * We want the job added back to the pending list so it gets freed; this

@@ -1599,7 +1646,10 @@ guc_exec_queue_timedout_job(struct drm_sched_job *drm_job)
	 * but there is not currently an easy way to do in DRM scheduler. With
	 * some thought, do this in a follow up.
	 */
-	xe_sched_submission_start(sched);
+	if (xe_exec_queue_is_multi_queue(q))
+		xe_guc_exec_queue_group_start(q);
+	else
+		xe_sched_submission_start(sched);
handle_vf_resume:
	return DRM_GPU_SCHED_STAT_NO_HANG;
}

@@ -1762,7 +1812,7 @@ static void __guc_exec_queue_process_msg_suspend(struct xe_sched_msg *msg)
			since_resume_ms;

		if (wait_ms > 0 && q->guc->resume_time)
-			relaxed_ms_sleep(wait_ms);
+			xe_sleep_relaxed_ms(wait_ms);

		set_exec_queue_suspended(q);
		disable_scheduling(q, false);

@@ -1965,6 +2015,8 @@ static int guc_exec_queue_init(struct xe_exec_queue *q)

		INIT_LIST_HEAD(&q->multi_queue.link);
		mutex_lock(&group->list_lock);
+		if (group->stopped)
+			WRITE_ONCE(q->guc->sched.base.pause_submit, true);
		list_add_tail(&q->multi_queue.link, &group->list);
		mutex_unlock(&group->list_lock);
	}
@@ -2111,15 +2163,22 @@ static int guc_exec_queue_set_multi_queue_priority(struct xe_exec_queue *q,

	xe_gt_assert(guc_to_gt(exec_queue_to_guc(q)), xe_exec_queue_is_multi_queue(q));

-	if (q->multi_queue.priority == priority ||
-	    exec_queue_killed_or_banned_or_wedged(q))
+	if (exec_queue_killed_or_banned_or_wedged(q))
		return 0;

	msg = kmalloc_obj(*msg);
	if (!msg)
		return -ENOMEM;

-	q->multi_queue.priority = priority;
+	scoped_guard(spinlock, &q->multi_queue.lock) {
+		if (q->multi_queue.priority == priority) {
+			kfree(msg);
+			return 0;
+		}
+
+		q->multi_queue.priority = priority;
+	}

	guc_exec_queue_add_msg(q, msg, SET_MULTI_QUEUE_PRIORITY);

	return 0;
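scoped_guard() comes from linux/cleanup.h and releases the lock on every exit path out of the block, which is what lets the early "kfree(msg); return 0;" above drop the spinlock safely. A minimal illustration (names are illustrative):

	#include <linux/cleanup.h>
	#include <linux/spinlock.h>

	/* Sketch: the guard drops the lock on both the early and the normal exit */
	static int guarded_update_example(spinlock_t *lock, int *val, int new)
	{
		scoped_guard(spinlock, lock) {
			if (*val == new)
				return 0;	/* lock is released here too */
			*val = new;
		}
		return 1;
	}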
@@ -2206,6 +2265,14 @@ static bool guc_exec_queue_reset_status(struct xe_exec_queue *q)
	return exec_queue_reset(q) || exec_queue_killed_or_banned_or_wedged(q);
}

+static bool guc_exec_queue_active(struct xe_exec_queue *q)
+{
+	struct xe_exec_queue *primary = xe_exec_queue_multi_queue_primary(q);
+
+	return exec_queue_enabled(primary) &&
+	       !exec_queue_pending_disable(primary);
+}
+
/*
 * All of these functions are an abstraction layer which other parts of Xe can
 * use to trap into the GuC backend. All of these functions, aside from init,

@@ -2225,6 +2292,7 @@ static const struct xe_exec_queue_ops guc_exec_queue_ops = {
	.suspend_wait = guc_exec_queue_suspend_wait,
	.resume = guc_exec_queue_resume,
	.reset_status = guc_exec_queue_reset_status,
+	.active = guc_exec_queue_active,
};

static void guc_exec_queue_stop(struct xe_guc *guc, struct xe_exec_queue *q)
@@ -6,15 +6,19 @@
#include "abi/guc_actions_abi.h"

#include "xe_device.h"
#include "xe_exec_queue.h"
#include "xe_exec_queue_types.h"
#include "xe_gt_stats.h"
#include "xe_gt_types.h"
#include "xe_guc.h"
#include "xe_guc_ct.h"
#include "xe_guc_exec_queue_types.h"
#include "xe_guc_tlb_inval.h"
#include "xe_force_wake.h"
#include "xe_mmio.h"
#include "xe_sa.h"
#include "xe_tlb_inval.h"
#include "xe_vm.h"

#include "regs/xe_guc_regs.h"
@@ -111,6 +115,38 @@ static int send_page_reclaim(struct xe_guc *guc, u32 seqno,
				     G2H_LEN_DW_PAGE_RECLAMATION, 1);
}

static u64 normalize_invalidation_range(struct xe_gt *gt, u64 *start, u64 *end)
{
	u64 orig_start = *start;
	u64 length = *end - *start;
	u64 align;

	if (length < SZ_4K)
		length = SZ_4K;

	align = roundup_pow_of_two(length);
	*start = ALIGN_DOWN(*start, align);
	*end = ALIGN(*end, align);
	length = align;
	while (*start + length < *end) {
		length <<= 1;
		*start = ALIGN_DOWN(orig_start, length);
	}

	if (length >= SZ_2M) {
		length = max_t(u64, SZ_16M, length);
		*start = ALIGN_DOWN(orig_start, length);
	}

	xe_gt_assert(gt, length >= SZ_4K);
	xe_gt_assert(gt, is_power_of_2(length));
	xe_gt_assert(gt, !(length & GENMASK(ilog2(SZ_16M) - 1,
					    ilog2(SZ_2M) + 1)));
	xe_gt_assert(gt, IS_ALIGNED(*start, length));

	return length;
}
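Worked example of the normalization above: for *start = 0x11000 and *end = 0x14000, length = 0x3000 rounds up to align = 0x4000; *start aligns down to 0x10000 and *end aligns up to 0x14000. Since 0x10000 + 0x4000 already reaches 0x14000, the widening loop does nothing and the sub-2M branch is not taken, so the function invalidates 16 KiB at 0x10000 and returns 0x4000, which the caller later encodes as ilog2(0x4000) - ilog2(SZ_4K) = 2.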

/*
 * Ensure that roundup_pow_of_two(length) doesn't overflow.
 * Note that roundup_pow_of_two() operates on unsigned long,
@@ -118,19 +154,21 @@ static int send_tlb_inval_ppgtt(...)
 */
#define MAX_RANGE_TLB_INVALIDATION_LENGTH (rounddown_pow_of_two(ULONG_MAX))

-static int send_tlb_inval_ppgtt(struct xe_tlb_inval *tlb_inval, u32 seqno,
-				u64 start, u64 end, u32 asid,
+static int send_tlb_inval_ppgtt(struct xe_guc *guc, u32 seqno, u64 start,
+				u64 end, u32 id, u32 type,
				struct drm_suballoc *prl_sa)
{
#define MAX_TLB_INVALIDATION_LEN 7
-	struct xe_guc *guc = tlb_inval->private;
	struct xe_gt *gt = guc_to_gt(guc);
+	struct xe_device *xe = guc_to_xe(guc);
	u32 action[MAX_TLB_INVALIDATION_LEN];
	u64 length = end - start;
	int len = 0, err;

-	if (guc_to_xe(guc)->info.force_execlist)
-		return -ECANCELED;
+	xe_gt_assert(gt, (type == XE_GUC_TLB_INVAL_PAGE_SELECTIVE &&
+			  !xe->info.has_ctx_tlb_inval) ||
+			 (type == XE_GUC_TLB_INVAL_PAGE_SELECTIVE_CTX &&
+			  xe->info.has_ctx_tlb_inval));

	action[len++] = XE_GUC_ACTION_TLB_INVALIDATION;
	action[len++] = !prl_sa ? seqno : TLB_INVALIDATION_SEQNO_INVALID;
@@ -138,55 +176,150 @@ static int send_tlb_inval_ppgtt(...)
	    length > MAX_RANGE_TLB_INVALIDATION_LENGTH) {
		action[len++] = MAKE_INVAL_OP(XE_GUC_TLB_INVAL_FULL);
	} else {
-		u64 orig_start = start;
-		u64 align;
-
-		if (length < SZ_4K)
-			length = SZ_4K;
-
-		/*
-		 * We need to invalidate a higher granularity if start address
-		 * is not aligned to length. When start is not aligned with
-		 * length we need to find the length large enough to create an
-		 * address mask covering the required range.
-		 */
-		align = roundup_pow_of_two(length);
-		start = ALIGN_DOWN(start, align);
-		end = ALIGN(end, align);
-		length = align;
-		while (start + length < end) {
-			length <<= 1;
-			start = ALIGN_DOWN(orig_start, length);
-		}
-
-		/*
-		 * Minimum invalidation size for a 2MB page that the hardware
-		 * expects is 16MB
-		 */
-		if (length >= SZ_2M) {
-			length = max_t(u64, SZ_16M, length);
-			start = ALIGN_DOWN(orig_start, length);
-		}
-
-		xe_gt_assert(gt, length >= SZ_4K);
-		xe_gt_assert(gt, is_power_of_2(length));
-		xe_gt_assert(gt, !(length & GENMASK(ilog2(SZ_16M) - 1,
-						    ilog2(SZ_2M) + 1)));
-		xe_gt_assert(gt, IS_ALIGNED(start, length));
+		u64 normalize_len = normalize_invalidation_range(gt, &start,
+								 &end);
+		bool need_flush = !prl_sa &&
+			seqno != TLB_INVALIDATION_SEQNO_INVALID;

-		/* Flush on NULL case, Media is not required to modify flush due to no PPC so NOP */
-		action[len++] = MAKE_INVAL_OP_FLUSH(XE_GUC_TLB_INVAL_PAGE_SELECTIVE, !prl_sa);
-		action[len++] = asid;
+		action[len++] = MAKE_INVAL_OP_FLUSH(type, need_flush);
+		action[len++] = id;
		action[len++] = lower_32_bits(start);
		action[len++] = upper_32_bits(start);
-		action[len++] = ilog2(length) - ilog2(SZ_4K);
+		action[len++] = ilog2(normalize_len) - ilog2(SZ_4K);
	}

	xe_gt_assert(gt, len <= MAX_TLB_INVALIDATION_LEN);
#undef MAX_TLB_INVALIDATION_LEN

	err = send_tlb_inval(guc, action, len);
-	if (!err && prl_sa)
+	if (!err && prl_sa) {
+		xe_gt_assert(gt, seqno != TLB_INVALIDATION_SEQNO_INVALID);
		err = send_page_reclaim(guc, seqno, xe_sa_bo_gpu_addr(prl_sa));
+	}
	return err;
}

static int send_tlb_inval_asid_ppgtt(struct xe_tlb_inval *tlb_inval, u32 seqno,
				     u64 start, u64 end, u32 asid,
				     struct drm_suballoc *prl_sa)
{
	struct xe_guc *guc = tlb_inval->private;

	lockdep_assert_held(&tlb_inval->seqno_lock);

	if (guc_to_xe(guc)->info.force_execlist)
		return -ECANCELED;

	return send_tlb_inval_ppgtt(guc, seqno, start, end, asid,
				    XE_GUC_TLB_INVAL_PAGE_SELECTIVE, prl_sa);
}

static int send_tlb_inval_ctx_ppgtt(struct xe_tlb_inval *tlb_inval, u32 seqno,
				    u64 start, u64 end, u32 asid,
				    struct drm_suballoc *prl_sa)
{
	struct xe_guc *guc = tlb_inval->private;
	struct xe_device *xe = guc_to_xe(guc);
	struct xe_exec_queue *q, *next, *last_q = NULL;
	struct xe_vm *vm;
	LIST_HEAD(tlb_inval_list);
	int err = 0, id = guc_to_gt(guc)->info.id;

	lockdep_assert_held(&tlb_inval->seqno_lock);

	if (xe->info.force_execlist)
		return -ECANCELED;

	vm = xe_device_asid_to_vm(xe, asid);
	if (IS_ERR(vm))
		return PTR_ERR(vm);

	down_read(&vm->exec_queues.lock);

	/*
	 * XXX: Randomly picking a threshold for now. This will need to be
	 * tuned based on expected UMD queue counts and performance profiling.
	 */
#define EXEC_QUEUE_COUNT_FULL_THRESHOLD 8
	if (vm->exec_queues.count[id] >= EXEC_QUEUE_COUNT_FULL_THRESHOLD) {
		u32 action[] = {
			XE_GUC_ACTION_TLB_INVALIDATION,
			seqno,
			MAKE_INVAL_OP(XE_GUC_TLB_INVAL_FULL),
		};

		err = send_tlb_inval(guc, action, ARRAY_SIZE(action));
		goto err_unlock;
	}
#undef EXEC_QUEUE_COUNT_FULL_THRESHOLD

	/*
	 * Move exec queues to a temporary list to issue invalidations. The exec
	 * queue must be active and a reference must be taken to prevent
	 * concurrent deregistrations.
	 *
	 * List modification is safe because we hold 'vm->exec_queues.lock' for
	 * reading, which prevents external modifications. Using a per-GT list
	 * is also safe since 'tlb_inval->seqno_lock' ensures no other GT users
	 * can enter this code path.
	 */
	list_for_each_entry_safe(q, next, &vm->exec_queues.list[id],
				 vm_exec_queue_link) {
		if (q->ops->active(q) && xe_exec_queue_get_unless_zero(q)) {
			last_q = q;
			list_move_tail(&q->vm_exec_queue_link, &tlb_inval_list);
		}
	}

	if (!last_q) {
		/*
		 * We can't break fence ordering for TLB invalidation jobs; if
		 * TLB invalidations are inflight, issue a dummy invalidation to
		 * maintain ordering. Nor can we safely move the seqno_recv when
		 * returning -ECANCELED if TLB invalidations are in flight. Use
		 * GGTT invalidation as dummy invalidation given ASID
		 * invalidations are unsupported here.
		 */
		if (xe_tlb_inval_idle(tlb_inval))
			err = -ECANCELED;
		else
			err = send_tlb_inval_ggtt(tlb_inval, seqno);
		goto err_unlock;
	}

	list_for_each_entry_safe(q, next, &tlb_inval_list, vm_exec_queue_link) {
		struct drm_suballoc *__prl_sa = NULL;
		int __seqno = TLB_INVALIDATION_SEQNO_INVALID;
		u32 type = XE_GUC_TLB_INVAL_PAGE_SELECTIVE_CTX;

		xe_assert(xe, q->vm == vm);

		if (err)
			goto unref;

		if (last_q == q) {
			__prl_sa = prl_sa;
			__seqno = seqno;
		}

		err = send_tlb_inval_ppgtt(guc, __seqno, start, end,
					   q->guc->id, type, __prl_sa);

unref:
		/*
		 * Must always return exec queue to original list / drop
		 * reference
		 */
		list_move_tail(&q->vm_exec_queue_link,
			       &vm->exec_queues.list[id]);
		xe_exec_queue_put(q);
	}

err_unlock:
	up_read(&vm->exec_queues.lock);
	xe_vm_put(vm);

	return err;
}
@@ -217,10 +350,19 @@ static long tlb_inval_timeout_delay(struct xe_tlb_inval *tlb_inval)
	return hw_tlb_timeout + 2 * delay;
}

-static const struct xe_tlb_inval_ops guc_tlb_inval_ops = {
+static const struct xe_tlb_inval_ops guc_tlb_inval_asid_ops = {
	.all = send_tlb_inval_all,
	.ggtt = send_tlb_inval_ggtt,
-	.ppgtt = send_tlb_inval_ppgtt,
+	.ppgtt = send_tlb_inval_asid_ppgtt,
	.initialized = tlb_inval_initialized,
	.flush = tlb_inval_flush,
	.timeout_delay = tlb_inval_timeout_delay,
};

+static const struct xe_tlb_inval_ops guc_tlb_inval_ctx_ops = {
+	.ggtt = send_tlb_inval_ggtt,
+	.all = send_tlb_inval_all,
+	.ppgtt = send_tlb_inval_ctx_ppgtt,
+	.initialized = tlb_inval_initialized,
+	.flush = tlb_inval_flush,
+	.timeout_delay = tlb_inval_timeout_delay,

@@ -237,8 +379,14 @@ static const struct xe_tlb_inval_ops guc_tlb_inval_ops = {
void xe_guc_tlb_inval_init_early(struct xe_guc *guc,
				 struct xe_tlb_inval *tlb_inval)
{
+	struct xe_device *xe = guc_to_xe(guc);
+
	tlb_inval->private = guc;
-	tlb_inval->ops = &guc_tlb_inval_ops;
+
+	if (xe->info.has_ctx_tlb_inval)
+		tlb_inval->ops = &guc_tlb_inval_ctx_ops;
+	else
+		tlb_inval->ops = &guc_tlb_inval_asid_ops;
}

/**

@@ -408,7 +408,8 @@ xe_hw_engine_setup_default_lrc_state(struct xe_hw_engine *hwe)
		},
	};

-	xe_rtp_process_to_sr(&ctx, lrc_setup, ARRAY_SIZE(lrc_setup), &hwe->reg_lrc);
+	xe_rtp_process_to_sr(&ctx, lrc_setup, ARRAY_SIZE(lrc_setup),
+			     &hwe->reg_lrc, true);
}

static void

@@ -472,7 +473,8 @@ hw_engine_setup_default_state(struct xe_hw_engine *hwe)
		},
	};

-	xe_rtp_process_to_sr(&ctx, engine_entries, ARRAY_SIZE(engine_entries), &hwe->reg_sr);
+	xe_rtp_process_to_sr(&ctx, engine_entries, ARRAY_SIZE(engine_entries),
+			     &hwe->reg_sr, false);
}

static const struct engine_info *find_engine_info(enum xe_engine_class class, int instance)

@@ -51,7 +51,8 @@ hw_engine_group_alloc(struct xe_device *xe)
	if (!group)
		return ERR_PTR(-ENOMEM);

-	group->resume_wq = alloc_workqueue("xe-resume-lr-jobs-wq", 0, 0);
+	group->resume_wq = alloc_workqueue("xe-resume-lr-jobs-wq", WQ_PERCPU,
+					   0);
	if (!group->resume_wq)
		return ERR_PTR(-ENOMEM);
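Assuming this follows the tree-wide workqueue transition in which unbound placement becomes the default (the same motivation as the system_dfl_wq switch in xe_guc_ct.c earlier in this section), WQ_PERCPU spells out what flags = 0 used to imply. A sketch of the two explicit choices:

	#include <linux/workqueue.h>

	/* Sketch: flags = 0 historically meant per-CPU; make placement explicit */
	static struct workqueue_struct *make_percpu_wq_example(void)
	{
		return alloc_workqueue("example-percpu-wq", WQ_PERCPU, 0);
	}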
@@ -27,7 +27,7 @@
#include "regs/xe_i2c_regs.h"
#include "regs/xe_irq_regs.h"

-#include "xe_device_types.h"
+#include "xe_device.h"
#include "xe_i2c.h"
#include "xe_mmio.h"
#include "xe_sriov.h"

@@ -57,6 +57,23 @@ static u64 lmtt_page_size(struct xe_lmtt *lmtt)
	return BIT_ULL(lmtt->ops->lmtt_pte_shift(0));
}

+/**
+ * xe_lmtt_page_size() - Get LMTT page size.
+ * @lmtt: the &xe_lmtt
+ *
+ * This function shall be called only by PF.
+ *
+ * Return: LMTT page size.
+ */
+u64 xe_lmtt_page_size(struct xe_lmtt *lmtt)
+{
+	lmtt_assert(lmtt, IS_SRIOV_PF(lmtt_to_xe(lmtt)));
+	lmtt_assert(lmtt, xe_device_has_lmtt(lmtt_to_xe(lmtt)));
+	lmtt_assert(lmtt, lmtt->ops);
+
+	return lmtt_page_size(lmtt);
+}
+
static struct xe_lmtt_pt *lmtt_pt_alloc(struct xe_lmtt *lmtt, unsigned int level)
{
	unsigned int num_entries = level ? lmtt->ops->lmtt_pte_num(level) : 0;

@@ -20,6 +20,7 @@ int xe_lmtt_prepare_pages(struct xe_lmtt *lmtt, unsigned int vfid, u64 range);
int xe_lmtt_populate_pages(struct xe_lmtt *lmtt, unsigned int vfid, struct xe_bo *bo, u64 offset);
void xe_lmtt_drop_pages(struct xe_lmtt *lmtt, unsigned int vfid);
u64 xe_lmtt_estimate_pt_size(struct xe_lmtt *lmtt, u64 size);
+u64 xe_lmtt_page_size(struct xe_lmtt *lmtt);
#else
static inline int xe_lmtt_init(struct xe_lmtt *lmtt) { return 0; }
static inline void xe_lmtt_init_hw(struct xe_lmtt *lmtt) { }
@@ -113,13 +113,17 @@ size_t xe_gt_lrc_hang_replay_size(struct xe_gt *gt, enum xe_engine_class class)
	/* Engine context image */
	switch (class) {
	case XE_ENGINE_CLASS_RENDER:
-		if (GRAPHICS_VER(xe) >= 20)
+		if (GRAPHICS_VERx100(xe) >= 3510)
+			size += 7 * SZ_4K;
+		else if (GRAPHICS_VER(xe) >= 20)
			size += 3 * SZ_4K;
		else
			size += 13 * SZ_4K;
		break;
	case XE_ENGINE_CLASS_COMPUTE:
-		if (GRAPHICS_VER(xe) >= 20)
+		if (GRAPHICS_VERx100(xe) >= 3510)
+			size += 5 * SZ_4K;
+		else if (GRAPHICS_VER(xe) >= 20)
			size += 2 * SZ_4K;
		else
			size += 13 * SZ_4K;

@@ -711,12 +715,13 @@ u32 xe_lrc_pphwsp_offset(struct xe_lrc *lrc)
#define __xe_lrc_pphwsp_offset xe_lrc_pphwsp_offset
#define __xe_lrc_regs_offset xe_lrc_regs_offset

-#define LRC_SEQNO_PPHWSP_OFFSET 512
-#define LRC_START_SEQNO_PPHWSP_OFFSET (LRC_SEQNO_PPHWSP_OFFSET + 8)
-#define LRC_CTX_JOB_TIMESTAMP_OFFSET (LRC_START_SEQNO_PPHWSP_OFFSET + 8)
+#define LRC_CTX_JOB_TIMESTAMP_OFFSET 512
#define LRC_ENGINE_ID_PPHWSP_OFFSET 1024
#define LRC_PARALLEL_PPHWSP_OFFSET 2048

+#define LRC_SEQNO_OFFSET 0
+#define LRC_START_SEQNO_OFFSET (LRC_SEQNO_OFFSET + 8)
+
u32 xe_lrc_regs_offset(struct xe_lrc *lrc)
{
	return xe_lrc_pphwsp_offset(lrc) + LRC_PPHWSP_SIZE;

@@ -743,14 +748,12 @@ size_t xe_lrc_skip_size(struct xe_device *xe)

static inline u32 __xe_lrc_seqno_offset(struct xe_lrc *lrc)
{
-	/* The seqno is stored in the driver-defined portion of PPHWSP */
-	return xe_lrc_pphwsp_offset(lrc) + LRC_SEQNO_PPHWSP_OFFSET;
+	return LRC_SEQNO_OFFSET;
}

static inline u32 __xe_lrc_start_seqno_offset(struct xe_lrc *lrc)
{
-	/* The start seqno is stored in the driver-defined portion of PPHWSP */
-	return xe_lrc_pphwsp_offset(lrc) + LRC_START_SEQNO_PPHWSP_OFFSET;
+	return LRC_START_SEQNO_OFFSET;
}
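With the seqno and start-seqno relocated out of the PPHWSP into the dedicated seqno_bo (added to struct xe_lrc later in this diff), the offsets become trivial constants into that BO. A sketch of what a CPU-side read looks like under this layout, reusing the existing xe_map.h and iosys_map helpers:

	/* Sketch: the seqno now lives at a fixed offset inside the system-memory BO */
	static u32 read_seqno_example(struct xe_lrc *lrc)
	{
		struct iosys_map map = lrc->seqno_bo->vmap;

		iosys_map_incr(&map, LRC_SEQNO_OFFSET);
		return xe_map_rd(lrc_to_xe(lrc), &map, 0, u32);
	}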

static u32 __xe_lrc_ctx_job_timestamp_offset(struct xe_lrc *lrc)
@@ -801,10 +804,11 @@ static inline u32 __xe_lrc_wa_bb_offset(struct xe_lrc *lrc)
	return xe_bo_size(lrc->bo) - LRC_WA_BB_SIZE;
}

-#define DECL_MAP_ADDR_HELPERS(elem) \
+#define DECL_MAP_ADDR_HELPERS(elem, bo_expr) \
static inline struct iosys_map __xe_lrc_##elem##_map(struct xe_lrc *lrc) \
{ \
-	struct iosys_map map = lrc->bo->vmap; \
+	struct xe_bo *bo = (bo_expr); \
+	struct iosys_map map = bo->vmap; \
\
	xe_assert(lrc_to_xe(lrc), !iosys_map_is_null(&map)); \
	iosys_map_incr(&map, __xe_lrc_##elem##_offset(lrc)); \

@@ -812,20 +816,22 @@ static inline struct iosys_map __xe_lrc_##elem##_map(struct xe_lrc *lrc) \
} \
static inline u32 __maybe_unused __xe_lrc_##elem##_ggtt_addr(struct xe_lrc *lrc) \
{ \
-	return xe_bo_ggtt_addr(lrc->bo) + __xe_lrc_##elem##_offset(lrc); \
+	struct xe_bo *bo = (bo_expr); \
+\
+	return xe_bo_ggtt_addr(bo) + __xe_lrc_##elem##_offset(lrc); \
} \

-DECL_MAP_ADDR_HELPERS(ring)
-DECL_MAP_ADDR_HELPERS(pphwsp)
-DECL_MAP_ADDR_HELPERS(seqno)
-DECL_MAP_ADDR_HELPERS(regs)
-DECL_MAP_ADDR_HELPERS(start_seqno)
-DECL_MAP_ADDR_HELPERS(ctx_job_timestamp)
-DECL_MAP_ADDR_HELPERS(ctx_timestamp)
-DECL_MAP_ADDR_HELPERS(ctx_timestamp_udw)
-DECL_MAP_ADDR_HELPERS(parallel)
-DECL_MAP_ADDR_HELPERS(indirect_ring)
-DECL_MAP_ADDR_HELPERS(engine_id)
+DECL_MAP_ADDR_HELPERS(ring, lrc->bo)
+DECL_MAP_ADDR_HELPERS(pphwsp, lrc->bo)
+DECL_MAP_ADDR_HELPERS(seqno, lrc->seqno_bo)
+DECL_MAP_ADDR_HELPERS(regs, lrc->bo)
+DECL_MAP_ADDR_HELPERS(start_seqno, lrc->seqno_bo)
+DECL_MAP_ADDR_HELPERS(ctx_job_timestamp, lrc->bo)
+DECL_MAP_ADDR_HELPERS(ctx_timestamp, lrc->bo)
+DECL_MAP_ADDR_HELPERS(ctx_timestamp_udw, lrc->bo)
+DECL_MAP_ADDR_HELPERS(parallel, lrc->bo)
+DECL_MAP_ADDR_HELPERS(indirect_ring, lrc->bo)
+DECL_MAP_ADDR_HELPERS(engine_id, lrc->bo)

#undef DECL_MAP_ADDR_HELPERS
|
@ -1032,6 +1038,7 @@ static void xe_lrc_finish(struct xe_lrc *lrc)
|
|||
{
|
||||
xe_hw_fence_ctx_finish(&lrc->fence_ctx);
|
||||
xe_bo_unpin_map_no_vm(lrc->bo);
|
||||
xe_bo_unpin_map_no_vm(lrc->seqno_bo);
|
||||
}
|
||||
|
||||
/*
|
||||
|
|
@@ -1431,53 +1438,16 @@ void xe_lrc_set_multi_queue_priority(struct xe_lrc *lrc, enum xe_multi_queue_pri
	lrc->desc |= FIELD_PREP(LRC_PRIORITY, xe_multi_queue_prio_to_lrc(lrc, priority));
}

-static int xe_lrc_init(struct xe_lrc *lrc, struct xe_hw_engine *hwe,
-		       struct xe_vm *vm, void *replay_state, u32 ring_size,
-		       u16 msix_vec,
-		       u32 init_flags)
+static int xe_lrc_ctx_init(struct xe_lrc *lrc, struct xe_hw_engine *hwe, struct xe_vm *vm,
+			   void *replay_state, u16 msix_vec, u32 init_flags)
{
	struct xe_gt *gt = hwe->gt;
-	const u32 lrc_size = xe_gt_lrc_size(gt, hwe->class);
-	u32 bo_size = ring_size + lrc_size + LRC_WA_BB_SIZE;
	struct xe_tile *tile = gt_to_tile(gt);
	struct xe_device *xe = gt_to_xe(gt);
	struct iosys_map map;
	u32 arb_enable;
-	u32 bo_flags;
	int err;

-	kref_init(&lrc->refcount);
-	lrc->gt = gt;
-	lrc->replay_size = xe_gt_lrc_hang_replay_size(gt, hwe->class);
-	lrc->size = lrc_size;
-	lrc->flags = 0;
-	lrc->ring.size = ring_size;
-	lrc->ring.tail = 0;
-
-	if (gt_engine_needs_indirect_ctx(gt, hwe->class)) {
-		lrc->flags |= XE_LRC_FLAG_INDIRECT_CTX;
-		bo_size += LRC_INDIRECT_CTX_BO_SIZE;
-	}
-
-	if (xe_gt_has_indirect_ring_state(gt))
-		lrc->flags |= XE_LRC_FLAG_INDIRECT_RING_STATE;
-
-	bo_flags = XE_BO_FLAG_VRAM_IF_DGFX(tile) | XE_BO_FLAG_GGTT |
-		   XE_BO_FLAG_GGTT_INVALIDATE;
-
-	if ((vm && vm->xef) || init_flags & XE_LRC_CREATE_USER_CTX) /* userspace */
-		bo_flags |= XE_BO_FLAG_PINNED_LATE_RESTORE | XE_BO_FLAG_FORCE_USER_VRAM;
-
-	lrc->bo = xe_bo_create_pin_map_novm(xe, tile,
-					    bo_size,
-					    ttm_bo_type_kernel,
-					    bo_flags, false);
-	if (IS_ERR(lrc->bo))
-		return PTR_ERR(lrc->bo);
-
-	xe_hw_fence_ctx_init(&lrc->fence_ctx, hwe->gt,
-			     hwe->fence_irq, hwe->name);
-
	/*
	 * Init Per-Process of HW status Page, LRC / context state to known
	 * values. If there's already a primed default_lrc, just copy it, otherwise

@@ -1489,7 +1459,7 @@ static int xe_lrc_ctx_init(...)
		xe_map_memset(xe, &map, 0, 0, LRC_PPHWSP_SIZE); /* PPHWSP */
		xe_map_memcpy_to(xe, &map, LRC_PPHWSP_SIZE,
				 gt->default_lrc[hwe->class] + LRC_PPHWSP_SIZE,
-				 lrc_size - LRC_PPHWSP_SIZE);
+				 lrc->size - LRC_PPHWSP_SIZE);
		if (replay_state)
			xe_map_memcpy_to(xe, &map, LRC_PPHWSP_SIZE,
					 replay_state, lrc->replay_size);

@@ -1497,21 +1467,16 @@ static int xe_lrc_ctx_init(...)
		void *init_data = empty_lrc_data(hwe);

		if (!init_data) {
-			err = -ENOMEM;
-			goto err_lrc_finish;
+			return -ENOMEM;
		}

-		xe_map_memcpy_to(xe, &map, 0, init_data, lrc_size);
+		xe_map_memcpy_to(xe, &map, 0, init_data, lrc->size);
		kfree(init_data);
	}

-	if (vm) {
+	if (vm)
		xe_lrc_set_ppgtt(lrc, vm);

-		if (vm->xef)
-			xe_drm_client_add_bo(vm->xef->client, lrc->bo);
-	}
-
	if (xe_device_has_msix(xe)) {
		xe_lrc_write_ctx_reg(lrc, CTX_INT_STATUS_REPORT_PTR,
				     xe_memirq_status_ptr(&tile->memirq, hwe));

@@ -1527,14 +1492,20 @@ static int xe_lrc_ctx_init(...)
		xe_lrc_write_indirect_ctx_reg(lrc, INDIRECT_CTX_RING_START,
					      __xe_lrc_ring_ggtt_addr(lrc));
		xe_lrc_write_indirect_ctx_reg(lrc, INDIRECT_CTX_RING_START_UDW, 0);
-		xe_lrc_write_indirect_ctx_reg(lrc, INDIRECT_CTX_RING_HEAD, 0);
+
+		/* Match head and tail pointers */
+		xe_lrc_write_indirect_ctx_reg(lrc, INDIRECT_CTX_RING_HEAD, lrc->ring.tail);
		xe_lrc_write_indirect_ctx_reg(lrc, INDIRECT_CTX_RING_TAIL, lrc->ring.tail);
+
		xe_lrc_write_indirect_ctx_reg(lrc, INDIRECT_CTX_RING_CTL,
					      RING_CTL_SIZE(lrc->ring.size) | RING_VALID);
	} else {
		xe_lrc_write_ctx_reg(lrc, CTX_RING_START, __xe_lrc_ring_ggtt_addr(lrc));
-		xe_lrc_write_ctx_reg(lrc, CTX_RING_HEAD, 0);
+
+		/* Match head and tail pointers */
+		xe_lrc_write_ctx_reg(lrc, CTX_RING_HEAD, lrc->ring.tail);
		xe_lrc_write_ctx_reg(lrc, CTX_RING_TAIL, lrc->ring.tail);
+
		xe_lrc_write_ctx_reg(lrc, CTX_RING_CTL,
				     RING_CTL_SIZE(lrc->ring.size) | RING_VALID);
	}

@@ -1583,12 +1554,76 @@ static int xe_lrc_ctx_init(...)
	err = setup_wa_bb(lrc, hwe);
	if (err)
-		goto err_lrc_finish;
+		return err;

	err = setup_indirect_ctx(lrc, hwe);

	return err;
}

+static int xe_lrc_init(struct xe_lrc *lrc, struct xe_hw_engine *hwe, struct xe_vm *vm,
+		       void *replay_state, u32 ring_size, u16 msix_vec, u32 init_flags)
+{
+	struct xe_gt *gt = hwe->gt;
+	const u32 lrc_size = xe_gt_lrc_size(gt, hwe->class);
+	u32 bo_size = ring_size + lrc_size + LRC_WA_BB_SIZE;
+	struct xe_tile *tile = gt_to_tile(gt);
+	struct xe_device *xe = gt_to_xe(gt);
+	struct xe_bo *bo;
+	u32 bo_flags;
+	int err;
+
+	kref_init(&lrc->refcount);
+	lrc->gt = gt;
+	lrc->replay_size = xe_gt_lrc_hang_replay_size(gt, hwe->class);
+	lrc->size = lrc_size;
+	lrc->flags = 0;
+	lrc->ring.size = ring_size;
+	lrc->ring.tail = 0;
+
+	if (gt_engine_needs_indirect_ctx(gt, hwe->class)) {
+		lrc->flags |= XE_LRC_FLAG_INDIRECT_CTX;
+		bo_size += LRC_INDIRECT_CTX_BO_SIZE;
+	}
+
+	if (xe_gt_has_indirect_ring_state(gt))
+		lrc->flags |= XE_LRC_FLAG_INDIRECT_RING_STATE;
+
+	bo_flags = XE_BO_FLAG_VRAM_IF_DGFX(tile) | XE_BO_FLAG_GGTT |
+		   XE_BO_FLAG_GGTT_INVALIDATE;
+
+	if ((vm && vm->xef) || init_flags & XE_LRC_CREATE_USER_CTX) /* userspace */
+		bo_flags |= XE_BO_FLAG_PINNED_LATE_RESTORE | XE_BO_FLAG_FORCE_USER_VRAM;
+
+	bo = xe_bo_create_pin_map_novm(xe, tile, bo_size,
+				       ttm_bo_type_kernel,
+				       bo_flags, false);
+	if (IS_ERR(bo))
+		return PTR_ERR(bo);
+
+	lrc->bo = bo;
+
+	bo = xe_bo_create_pin_map_novm(xe, tile, PAGE_SIZE,
+				       ttm_bo_type_kernel,
+				       XE_BO_FLAG_GGTT |
+				       XE_BO_FLAG_GGTT_INVALIDATE |
+				       XE_BO_FLAG_SYSTEM, false);
+	if (IS_ERR(bo)) {
+		err = PTR_ERR(bo);
+		goto err_lrc_finish;
+	}
+	lrc->seqno_bo = bo;
+
+	xe_hw_fence_ctx_init(&lrc->fence_ctx, hwe->gt,
+			     hwe->fence_irq, hwe->name);
+
+	err = xe_lrc_ctx_init(lrc, hwe, vm, replay_state, msix_vec, init_flags);
+	if (err)
+		goto err_lrc_finish;
+
+	if (vm && vm->xef)
+		xe_drm_client_add_bo(vm->xef->client, lrc->bo);
+
+	return 0;
+
err_lrc_finish:

@@ -1966,6 +2001,7 @@ static int dump_gfxpipe_command(struct drm_printer *p,
	MATCH(PIPELINE_SELECT);

	MATCH3D(3DSTATE_DRAWING_RECTANGLE_FAST);
+	MATCH3D(3DSTATE_CUSTOM_SAMPLE_PATTERN);
	MATCH3D(3DSTATE_CLEAR_PARAMS);
	MATCH3D(3DSTATE_DEPTH_BUFFER);
	MATCH3D(3DSTATE_STENCIL_BUFFER);

@@ -2049,8 +2085,16 @@ static int dump_gfxpipe_command(struct drm_printer *p,
	MATCH3D(3DSTATE_SBE_MESH);
	MATCH3D(3DSTATE_CPSIZE_CONTROL_BUFFER);
	MATCH3D(3DSTATE_COARSE_PIXEL);
+	MATCH3D(3DSTATE_MESH_SHADER_DATA_EXT);
+	MATCH3D(3DSTATE_TASK_SHADER_DATA_EXT);
+	MATCH3D(3DSTATE_VIEWPORT_STATE_POINTERS_CC_2);
+	MATCH3D(3DSTATE_CC_STATE_POINTERS_2);
+	MATCH3D(3DSTATE_SCISSOR_STATE_POINTERS_2);
+	MATCH3D(3DSTATE_BLEND_STATE_POINTERS_2);
+	MATCH3D(3DSTATE_VIEWPORT_STATE_POINTERS_SF_CLIP_2);

	MATCH3D(3DSTATE_DRAWING_RECTANGLE);
	MATCH3D(3DSTATE_URB_MEMORY);
	MATCH3D(3DSTATE_CHROMA_KEY);
	MATCH3D(3DSTATE_POLY_STIPPLE_OFFSET);
	MATCH3D(3DSTATE_POLY_STIPPLE_PATTERN);

@@ -2070,6 +2114,7 @@ static int dump_gfxpipe_command(struct drm_printer *p,
	MATCH3D(3DSTATE_SUBSLICE_HASH_TABLE);
	MATCH3D(3DSTATE_SLICE_TABLE_STATE_POINTERS);
	MATCH3D(3DSTATE_PTBR_TILE_PASS_INFO);
+	MATCH3D(3DSTATE_SLICE_TABLE_STATE_POINTER_2);

	default:
		drm_printf(p, "[%#010x] unknown GFXPIPE command (pipeline=%#x, opcode=%#x, subopcode=%#x), likely %d dwords\n",
@@ -2141,6 +2186,102 @@ void xe_lrc_dump_default(struct drm_printer *p,
	}
}

/*
 * Lookup the value of a register within the offset/value pairs of an
 * MI_LOAD_REGISTER_IMM instruction.
 *
 * Return -ENOENT if the register is not present in the MI_LRI instruction.
 */
static int lookup_reg_in_mi_lri(u32 offset, u32 *value,
				const u32 *dword_pair, int num_regs)
{
	for (int i = 0; i < num_regs; i++) {
		if (dword_pair[2 * i] == offset) {
			*value = dword_pair[2 * i + 1];
			return 0;
		}
	}

	return -ENOENT;
}

/*
 * Lookup the value of a register in a specific engine type's default LRC.
 *
 * Return -EINVAL if the default LRC doesn't exist, or -ENOENT if the register
 * cannot be found in the default LRC.
 */
int xe_lrc_lookup_default_reg_value(struct xe_gt *gt,
				    enum xe_engine_class hwe_class,
				    u32 offset,
				    u32 *value)
{
	u32 *dw;
	int remaining_dw, ret;

	if (!gt->default_lrc[hwe_class])
		return -EINVAL;

	/*
	 * Skip the beginning of the LRC since it contains the per-process
	 * hardware status page.
	 */
	dw = gt->default_lrc[hwe_class] + LRC_PPHWSP_SIZE;
	remaining_dw = (xe_gt_lrc_size(gt, hwe_class) - LRC_PPHWSP_SIZE) / 4;

	while (remaining_dw > 0) {
		u32 num_dw = instr_dw(*dw);

		if (num_dw > remaining_dw)
			num_dw = remaining_dw;

		switch (*dw & XE_INSTR_CMD_TYPE) {
		case XE_INSTR_MI:
			switch (*dw & MI_OPCODE) {
			case MI_BATCH_BUFFER_END:
				/* End of LRC; register not found */
				return -ENOENT;

			case MI_NOOP:
			case MI_TOPOLOGY_FILTER:
				/*
				 * MI_NOOP and MI_TOPOLOGY_FILTER don't have
				 * a length field and are always 1-dword
				 * instructions.
				 */
				remaining_dw--;
				dw++;
				break;

			case MI_LOAD_REGISTER_IMM:
				ret = lookup_reg_in_mi_lri(offset, value,
							   dw + 1, (num_dw - 1) / 2);
				if (ret == 0)
					return 0;

				fallthrough;

			default:
				/*
				 * Jump to next instruction based on length
				 * field.
				 */
				remaining_dw -= num_dw;
				dw += num_dw;
				break;
			}
			break;

		default:
			/* Jump to next instruction based on length field. */
			remaining_dw -= num_dw;
			dw += num_dw;
		}
	}

	return -ENOENT;
}
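A sketch of how a caller might use the new lookup, for example to recover a register's golden-context value for a workaround sanity check. The register offset below is a made-up placeholder, not a real register:

	/* Sketch: query a register's default-LRC value; offset is hypothetical */
	static void lookup_default_reg_example(struct xe_gt *gt)
	{
		u32 value;

		if (!xe_lrc_lookup_default_reg_value(gt, XE_ENGINE_CLASS_RENDER,
						     0x2000 /* placeholder */,
						     &value))
			drm_info(&gt_to_xe(gt)->drm, "default value: %#x\n", value);
	}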

struct instr_state {
	u32 instr;
	u16 num_dw;
|
|||
|
|
@ -75,7 +75,8 @@ static inline struct xe_lrc *xe_lrc_get(struct xe_lrc *lrc)
|
|||
*/
|
||||
static inline void xe_lrc_put(struct xe_lrc *lrc)
|
||||
{
|
||||
kref_put(&lrc->refcount, xe_lrc_destroy);
|
||||
if (lrc)
|
||||
kref_put(&lrc->refcount, xe_lrc_destroy);
|
||||
}
|
||||
|
||||
/**
|
||||
|
|
@ -133,6 +134,10 @@ size_t xe_lrc_skip_size(struct xe_device *xe);
|
|||
void xe_lrc_dump_default(struct drm_printer *p,
|
||||
struct xe_gt *gt,
|
||||
enum xe_engine_class);
|
||||
int xe_lrc_lookup_default_reg_value(struct xe_gt *gt,
|
||||
enum xe_engine_class hwe_class,
|
||||
u32 offset,
|
||||
u32 *value);
|
||||
|
||||
u32 *xe_lrc_emit_hwe_state_instructions(struct xe_exec_queue *q, u32 *cs);
|
||||
|
||||
|
|
|
|||
|
|
@ -22,6 +22,12 @@ struct xe_lrc {
|
|||
*/
|
||||
struct xe_bo *bo;
|
||||
|
||||
/**
|
||||
* @seqno_bo: Buffer object (memory) for seqno numbers. Always in system
|
||||
* memory as this a CPU read, GPU write path object.
|
||||
*/
|
||||
struct xe_bo *seqno_bo;
|
||||
|
||||
/** @size: size of the lrc and optional indirect ring state */
|
||||
u32 size;
|
||||
|
||||
|
|
|
|||
|
|
@ -25,6 +25,7 @@
|
|||
#include "xe_exec_queue.h"
|
||||
#include "xe_ggtt.h"
|
||||
#include "xe_gt.h"
|
||||
#include "xe_gt_printk.h"
|
||||
#include "xe_hw_engine.h"
|
||||
#include "xe_lrc.h"
|
||||
#include "xe_map.h"
|
||||
|
|
@@ -1148,65 +1149,73 @@ int xe_migrate_ccs_rw_copy(struct xe_tile *tile, struct xe_exec_queue *q,
 		size -= src_L0;
 	}

+	bb = xe_bb_alloc(gt);
+	if (IS_ERR(bb))
+		return PTR_ERR(bb);
+
 	bb_pool = ctx->mem.ccs_bb_pool;
-	guard(mutex)(xe_sa_bo_swap_guard(bb_pool));
-	xe_sa_bo_swap_shadow(bb_pool);
+	scoped_guard(mutex, xe_sa_bo_swap_guard(bb_pool)) {
+		xe_sa_bo_swap_shadow(bb_pool);

-	bb = xe_bb_ccs_new(gt, batch_size, read_write);
-	if (IS_ERR(bb)) {
-		drm_err(&xe->drm, "BB allocation failed.\n");
-		err = PTR_ERR(bb);
-		return err;
-	}
-
-	batch_size_allocated = batch_size;
-	size = xe_bo_size(src_bo);
-	batch_size = 0;
-
-	/*
-	 * Emit PTE and copy commands here.
-	 * The CCS copy command can only support limited size. If the size to be
-	 * copied is more than the limit, divide copy into chunks. So, calculate
-	 * sizes here again before copy command is emitted.
-	 */
-	while (size) {
-		batch_size += 10; /* Flush + ggtt addr + 2 NOP */
-		u32 flush_flags = 0;
-		u64 ccs_ofs, ccs_size;
-		u32 ccs_pt;
-
-		u32 avail_pts = max_mem_transfer_per_pass(xe) / LEVEL0_PAGE_TABLE_ENCODE_SIZE;
-
-		src_L0 = xe_migrate_res_sizes(m, &src_it);
-
-		batch_size += pte_update_size(m, false, src, &src_it, &src_L0,
-					      &src_L0_ofs, &src_L0_pt, 0, 0,
-					      avail_pts);
-
-		ccs_size = xe_device_ccs_bytes(xe, src_L0);
-		batch_size += pte_update_size(m, 0, NULL, &ccs_it, &ccs_size, &ccs_ofs,
-					      &ccs_pt, 0, avail_pts, avail_pts);
-		xe_assert(xe, IS_ALIGNED(ccs_it.start, PAGE_SIZE));
-		batch_size += EMIT_COPY_CCS_DW;
-
-		emit_pte(m, bb, src_L0_pt, false, true, &src_it, src_L0, src);
-
-		emit_pte(m, bb, ccs_pt, false, false, &ccs_it, ccs_size, src);
-
-		bb->len = emit_flush_invalidate(bb->cs, bb->len, flush_flags);
-		flush_flags = xe_migrate_ccs_copy(m, bb, src_L0_ofs, src_is_pltt,
-						  src_L0_ofs, dst_is_pltt,
-						  src_L0, ccs_ofs, true);
-		bb->len = emit_flush_invalidate(bb->cs, bb->len, flush_flags);
-
-		size -= src_L0;
-	}
-
-	xe_assert(xe, (batch_size_allocated == bb->len));
-	src_bo->bb_ccs[read_write] = bb;
-
-	xe_sriov_vf_ccs_rw_update_bb_addr(ctx);
-	xe_sa_bo_sync_shadow(bb->bo);
+		err = xe_bb_init(bb, bb_pool, batch_size);
+		if (err) {
+			xe_gt_err(gt, "BB allocation failed.\n");
+			xe_bb_free(bb, NULL);
+			return err;
+		}
+
+		batch_size_allocated = batch_size;
+		size = xe_bo_size(src_bo);
+		batch_size = 0;
+
+		/*
+		 * Emit PTE and copy commands here.
+		 * The CCS copy command can only support limited size. If the size to be
+		 * copied is more than the limit, divide copy into chunks. So, calculate
+		 * sizes here again before copy command is emitted.
+		 */
+		while (size) {
+			batch_size += 10; /* Flush + ggtt addr + 2 NOP */
+			u32 flush_flags = 0;
+			u64 ccs_ofs, ccs_size;
+			u32 ccs_pt;
+
+			u32 avail_pts = max_mem_transfer_per_pass(xe) /
+				LEVEL0_PAGE_TABLE_ENCODE_SIZE;
+
+			src_L0 = xe_migrate_res_sizes(m, &src_it);
+
+			batch_size += pte_update_size(m, false, src, &src_it, &src_L0,
+						      &src_L0_ofs, &src_L0_pt, 0, 0,
+						      avail_pts);
+
+			ccs_size = xe_device_ccs_bytes(xe, src_L0);
+			batch_size += pte_update_size(m, 0, NULL, &ccs_it, &ccs_size, &ccs_ofs,
+						      &ccs_pt, 0, avail_pts, avail_pts);
+			xe_assert(xe, IS_ALIGNED(ccs_it.start, PAGE_SIZE));
+			batch_size += EMIT_COPY_CCS_DW;
+
+			emit_pte(m, bb, src_L0_pt, false, true, &src_it, src_L0, src);
+
+			emit_pte(m, bb, ccs_pt, false, false, &ccs_it, ccs_size, src);
+
+			bb->len = emit_flush_invalidate(bb->cs, bb->len, flush_flags);
+			flush_flags = xe_migrate_ccs_copy(m, bb, src_L0_ofs, src_is_pltt,
+							  src_L0_ofs, dst_is_pltt,
+							  src_L0, ccs_ofs, true);
+			bb->len = emit_flush_invalidate(bb->cs, bb->len, flush_flags);
+
+			size -= src_L0;
+		}
+
+		xe_assert(xe, (batch_size_allocated == bb->len));
+		src_bo->bb_ccs[read_write] = bb;
+
+		xe_sriov_vf_ccs_rw_update_bb_addr(ctx);
+		xe_sa_bo_sync_shadow(bb->bo);
+	}
+
 	return 0;
 }
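For readers unfamiliar with the cleanup helpers used above: guard() and scoped_guard() come from <linux/cleanup.h>. A minimal illustration of the difference (the lock here is made up, not from this patch):

	static DEFINE_MUTEX(example_lock);

	static void example_guard(void)
	{
		guard(mutex)(&example_lock);	/* held until the function returns */
		/* ... critical section ... */
	}

	static void example_scoped_guard(void)
	{
		scoped_guard(mutex, &example_lock) {
			/* held only for the duration of this block */
		}
		/* lock already released here */
	}

Switching to scoped_guard() in the hunk above makes the end of the critical section explicit, so the final return 0 runs outside the lock.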
@@ -6,7 +6,7 @@
 #ifndef _XE_MMIO_H_
 #define _XE_MMIO_H_

-#include "xe_gt_types.h"
+#include "xe_mmio_types.h"

 struct xe_device;
 struct xe_reg;

@@ -37,11 +37,6 @@ static inline u32 xe_mmio_adjusted_addr(const struct xe_mmio *mmio, u32 addr)
 	return addr;
 }

-static inline struct xe_mmio *xe_root_tile_mmio(struct xe_device *xe)
-{
-	return &xe->tiles[0].mmio;
-}
-
 #ifdef CONFIG_PCI_IOV
 void xe_mmio_init_vf_view(struct xe_mmio *mmio, const struct xe_mmio *base, unsigned int vfid);
 #endif
drivers/gpu/drm/xe/xe_mmio_types.h (new file, 64 lines)
@@ -0,0 +1,64 @@
+/* SPDX-License-Identifier: MIT */
+/*
+ * Copyright © 2022-2026 Intel Corporation
+ */
+
+#ifndef _XE_MMIO_TYPES_H_
+#define _XE_MMIO_TYPES_H_
+
+#include <linux/types.h>
+
+struct xe_gt;
+struct xe_tile;
+
+/**
+ * struct xe_mmio - register mmio structure
+ *
+ * Represents an MMIO region that the CPU may use to access registers. A
+ * region may share its IO map with other regions (e.g., all GTs within a
+ * tile share the same map with their parent tile, but represent different
+ * subregions of the overall IO space).
+ */
+struct xe_mmio {
+	/** @tile: Backpointer to tile, used for tracing */
+	struct xe_tile *tile;
+
+	/** @regs: Map used to access registers. */
+	void __iomem *regs;
+
+	/**
+	 * @sriov_vf_gt: Backpointer to GT.
+	 *
+	 * This pointer is only set for GT MMIO regions and only when running
+	 * as an SRIOV VF.
+	 */
+	struct xe_gt *sriov_vf_gt;
+
+	/**
+	 * @regs_size: Length of the register region within the map.
+	 *
+	 * The size of the iomap set in *regs is generally larger than the
+	 * register mmio space since it includes unused regions and/or
+	 * non-register regions such as the GGTT PTEs.
+	 */
+	size_t regs_size;
+
+	/** @adj_limit: adjust MMIO address if address is below this value */
+	u32 adj_limit;
+
+	/** @adj_offset: offset to add to MMIO address when adjusting */
+	u32 adj_offset;
+};
+
+/**
+ * struct xe_mmio_range - register range structure
+ *
+ * @start: first register offset in the range.
+ * @end: last register offset in the range.
+ */
+struct xe_mmio_range {
+	u32 start;
+	u32 end;
+};
+
+#endif
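How the two adjustment fields are consumed can be seen in xe_mmio_adjusted_addr(), whose signature appears in the xe_mmio.h hunk above. A sketch of the usual logic, presented as an illustration rather than a verbatim copy of that helper:

	static inline u32 example_adjusted_addr(const struct xe_mmio *mmio, u32 addr)
	{
		/* Registers below adj_limit are relocated by adj_offset */
		if (addr < mmio->adj_limit)
			addr += mmio->adj_offset;

		return addr;
	}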
@@ -600,6 +600,7 @@ static unsigned int get_mocs_settings(struct xe_device *xe,
 		info->wb_index = 4;
 		info->unused_entries_index = 4;
 		break;
+	case XE_NOVALAKE_P:
 	case XE_NOVALAKE_S:
 	case XE_PANTHERLAKE:
 	case XE_LUNARLAKE:
@@ -10,6 +10,7 @@

 #include <drm/drm_module.h>

+#include "xe_defaults.h"
 #include "xe_device_types.h"
 #include "xe_drv.h"
 #include "xe_configfs.h"
@@ -19,51 +20,38 @@
 #include "xe_observation.h"
 #include "xe_sched_job.h"

-#if IS_ENABLED(CONFIG_DRM_XE_DEBUG)
-#define DEFAULT_GUC_LOG_LEVEL		3
-#else
-#define DEFAULT_GUC_LOG_LEVEL		1
-#endif
-
-#define DEFAULT_PROBE_DISPLAY		true
-#define DEFAULT_VRAM_BAR_SIZE		0
-#define DEFAULT_FORCE_PROBE		CONFIG_DRM_XE_FORCE_PROBE
-#define DEFAULT_MAX_VFS			~0
-#define DEFAULT_MAX_VFS_STR		"unlimited"
-#define DEFAULT_WEDGED_MODE		XE_WEDGED_MODE_DEFAULT
-#define DEFAULT_WEDGED_MODE_STR		XE_WEDGED_MODE_DEFAULT_STR
-#define DEFAULT_SVM_NOTIFIER_SIZE	512
-
 struct xe_modparam xe_modparam = {
-	.probe_display = DEFAULT_PROBE_DISPLAY,
-	.guc_log_level = DEFAULT_GUC_LOG_LEVEL,
-	.force_probe = DEFAULT_FORCE_PROBE,
+	.probe_display = XE_DEFAULT_PROBE_DISPLAY,
+	.guc_log_level = XE_DEFAULT_GUC_LOG_LEVEL,
+	.force_probe = XE_DEFAULT_FORCE_PROBE,
 #ifdef CONFIG_PCI_IOV
-	.max_vfs = DEFAULT_MAX_VFS,
+	.max_vfs = XE_DEFAULT_MAX_VFS,
 #endif
-	.wedged_mode = DEFAULT_WEDGED_MODE,
-	.svm_notifier_size = DEFAULT_SVM_NOTIFIER_SIZE,
+	.wedged_mode = XE_DEFAULT_WEDGED_MODE,
+	.svm_notifier_size = XE_DEFAULT_SVM_NOTIFIER_SIZE,
 	/* the rest are 0 by default */
 };

 module_param_named(svm_notifier_size, xe_modparam.svm_notifier_size, uint, 0600);
 MODULE_PARM_DESC(svm_notifier_size, "Set the svm notifier size in MiB, must be power of 2 "
-		 "[default=" __stringify(DEFAULT_SVM_NOTIFIER_SIZE) "]");
+		 "[default=" __stringify(XE_DEFAULT_SVM_NOTIFIER_SIZE) "]");

 module_param_named_unsafe(force_execlist, xe_modparam.force_execlist, bool, 0444);
 MODULE_PARM_DESC(force_execlist, "Force Execlist submission");

 #if IS_ENABLED(CONFIG_DRM_XE_DISPLAY)
 module_param_named(probe_display, xe_modparam.probe_display, bool, 0444);
 MODULE_PARM_DESC(probe_display, "Probe display HW, otherwise it's left untouched "
-		 "[default=" __stringify(DEFAULT_PROBE_DISPLAY) "])");
+		 "[default=" __stringify(XE_DEFAULT_PROBE_DISPLAY) "])");
 #endif

 module_param_named(vram_bar_size, xe_modparam.force_vram_bar_size, int, 0600);
 MODULE_PARM_DESC(vram_bar_size, "Set the vram bar size in MiB (<0=disable-resize, 0=max-needed-size, >0=force-size "
-		 "[default=" __stringify(DEFAULT_VRAM_BAR_SIZE) "])");
+		 "[default=" __stringify(XE_DEFAULT_VRAM_BAR_SIZE) "])");

 module_param_named(guc_log_level, xe_modparam.guc_log_level, int, 0600);
 MODULE_PARM_DESC(guc_log_level, "GuC firmware logging level (0=disable, 1=normal, 2..5=verbose-levels "
-		 "[default=" __stringify(DEFAULT_GUC_LOG_LEVEL) "])");
+		 "[default=" __stringify(XE_DEFAULT_GUC_LOG_LEVEL) "])");

 module_param_named_unsafe(guc_firmware_path, xe_modparam.guc_firmware_path, charp, 0400);
 MODULE_PARM_DESC(guc_firmware_path,
@@ -80,20 +68,20 @@ MODULE_PARM_DESC(gsc_firmware_path,
 module_param_named_unsafe(force_probe, xe_modparam.force_probe, charp, 0400);
 MODULE_PARM_DESC(force_probe,
		 "Force probe options for specified devices. See CONFIG_DRM_XE_FORCE_PROBE for details "
-		 "[default=" DEFAULT_FORCE_PROBE "])");
+		 "[default=" XE_DEFAULT_FORCE_PROBE "])");

 #ifdef CONFIG_PCI_IOV
 module_param_named(max_vfs, xe_modparam.max_vfs, uint, 0400);
 MODULE_PARM_DESC(max_vfs,
		 "Limit number of Virtual Functions (VFs) that could be managed. "
		 "(0=no VFs; N=allow up to N VFs "
-		 "[default=" DEFAULT_MAX_VFS_STR "])");
+		 "[default=" XE_DEFAULT_MAX_VFS_STR "])");
 #endif

 module_param_named_unsafe(wedged_mode, xe_modparam.wedged_mode, uint, 0600);
 MODULE_PARM_DESC(wedged_mode,
		 "Module's default policy for the wedged mode (0=never, 1=upon-critical-error, 2=upon-any-hang-no-reset "
-		 "[default=" DEFAULT_WEDGED_MODE_STR "])");
+		 "[default=" XE_DEFAULT_WEDGED_MODE_STR "])");

 static int xe_check_nomodeset(void)
 {
@@ -6,7 +6,7 @@
 #include <linux/intel_dg_nvm_aux.h>
 #include <linux/pci.h>

-#include "xe_device_types.h"
+#include "xe_device.h"
 #include "xe_mmio.h"
 #include "xe_nvm.h"
 #include "xe_pcode_api.h"

@@ -133,12 +133,10 @@ int xe_nvm_init(struct xe_device *xe)
 	if (WARN_ON(xe->nvm))
 		return -EFAULT;

-	xe->nvm = kzalloc_obj(*nvm);
-	if (!xe->nvm)
+	nvm = kzalloc_obj(*nvm);
+	if (!nvm)
 		return -ENOMEM;

-	nvm = xe->nvm;
-
 	nvm->writable_override = xe_nvm_writable_override(xe);
 	nvm->non_posted_erase = xe_nvm_non_posted_erase(xe);
 	nvm->bar.parent = &pdev->resource[0];

@@ -165,7 +163,6 @@ int xe_nvm_init(struct xe_device *xe)
 	if (ret) {
 		drm_err(&xe->drm, "xe-nvm aux init failed %d\n", ret);
 		kfree(nvm);
-		xe->nvm = NULL;
 		return ret;
 	}

@@ -173,8 +170,9 @@ int xe_nvm_init(struct xe_device *xe)
 	if (ret) {
 		drm_err(&xe->drm, "xe-nvm aux add failed %d\n", ret);
 		auxiliary_device_uninit(aux_dev);
-		xe->nvm = NULL;
 		return ret;
 	}

+	xe->nvm = nvm;
 	return devm_add_action_or_reset(xe->drm.dev, xe_nvm_fini, xe);
 }
@@ -29,7 +29,7 @@
 #include "xe_gt.h"
 #include "xe_gt_mcr.h"
 #include "xe_gt_printk.h"
-#include "xe_guc_pc.h"
+#include "xe_guc_rc.h"
 #include "xe_macros.h"
 #include "xe_mmio.h"
 #include "xe_oa.h"

@@ -873,10 +873,6 @@ static void xe_oa_stream_destroy(struct xe_oa_stream *stream)
 	xe_force_wake_put(gt_to_fw(gt), stream->fw_ref);
 	xe_pm_runtime_put(stream->oa->xe);

-	/* Wa_1509372804:pvc: Unset the override of GUCRC mode to enable rc6 */
-	if (stream->override_gucrc)
-		xe_gt_WARN_ON(gt, xe_guc_pc_unset_gucrc_mode(&gt->uc.guc.pc));
-
 	xe_oa_free_configs(stream);
 	xe_file_put(stream->xef);
 }

@@ -969,7 +965,7 @@ static void xe_oa_config_cb(struct dma_fence *fence, struct dma_fence_cb *cb)
 	struct xe_oa_fence *ofence = container_of(cb, typeof(*ofence), cb);

 	INIT_DELAYED_WORK(&ofence->work, xe_oa_fence_work_fn);
-	queue_delayed_work(system_unbound_wq, &ofence->work,
+	queue_delayed_work(system_dfl_wq, &ofence->work,
			   usecs_to_jiffies(NOA_PROGRAM_ADDITIONAL_DELAY_US));
 	dma_fence_put(fence);
 }

@@ -1760,19 +1756,6 @@ static int xe_oa_stream_init(struct xe_oa_stream *stream,
 		goto exit;
 	}

-	/*
-	 * GuC reset of engines causes OA to lose configuration
-	 * state. Prevent this by overriding GUCRC mode.
-	 */
-	if (XE_GT_WA(stream->gt, 1509372804)) {
-		ret = xe_guc_pc_override_gucrc_mode(&gt->uc.guc.pc,
-						    SLPC_GUCRC_MODE_GUCRC_NO_RC6);
-		if (ret)
-			goto err_free_configs;
-
-		stream->override_gucrc = true;
-	}
-
 	/* Take runtime pm ref and forcewake to disable RC6 */
 	xe_pm_runtime_get(stream->oa->xe);
 	stream->fw_ref = xe_force_wake_get(gt_to_fw(gt), XE_FORCEWAKE_ALL);

@@ -1823,9 +1806,6 @@ static int xe_oa_stream_init(struct xe_oa_stream *stream,
 err_fw_put:
 	xe_force_wake_put(gt_to_fw(gt), stream->fw_ref);
 	xe_pm_runtime_put(stream->oa->xe);
-	if (stream->override_gucrc)
-		xe_gt_WARN_ON(gt, xe_guc_pc_unset_gucrc_mode(&gt->uc.guc.pc));
 err_free_configs:
 	xe_oa_free_configs(stream);
 exit:
 	xe_file_put(stream->xef);

@@ -239,9 +239,6 @@ struct xe_oa_stream {
 	/** @poll_period_ns: hrtimer period for checking OA buffer for available data */
 	u64 poll_period_ns;

-	/** @override_gucrc: GuC RC has been overridden for the OA stream */
-	bool override_gucrc;
-
 	/** @oa_status: temporary storage for oa_status register value */
 	u32 oa_status;
@@ -136,7 +136,7 @@ static int xe_pagefault_handle_vma(struct xe_gt *gt, struct xe_vma *vma,
 static bool
 xe_pagefault_access_is_atomic(enum xe_pagefault_access_type access_type)
 {
-	return access_type == XE_PAGEFAULT_ACCESS_TYPE_ATOMIC;
+	return (access_type & XE_PAGEFAULT_ACCESS_TYPE_MASK) == XE_PAGEFAULT_ACCESS_TYPE_ATOMIC;
 }

 static struct xe_vm *xe_pagefault_asid_to_vm(struct xe_device *xe, u32 asid)

@@ -164,7 +164,7 @@ static int xe_pagefault_service(struct xe_pagefault *pf)
 	bool atomic;

 	/* Producer flagged this fault to be nacked */
-	if (pf->consumer.fault_level == XE_PAGEFAULT_LEVEL_NACK)
+	if (pf->consumer.fault_type_level == XE_PAGEFAULT_TYPE_LEVEL_NACK)
 		return -EFAULT;

 	vm = xe_pagefault_asid_to_vm(xe, pf->consumer.asid);

@@ -225,17 +225,20 @@ static void xe_pagefault_print(struct xe_pagefault *pf)
 {
 	xe_gt_info(pf->gt, "\n\tASID: %d\n"
		   "\tFaulted Address: 0x%08x%08x\n"
-		   "\tFaultType: %d\n"
-		   "\tAccessType: %d\n"
-		   "\tFaultLevel: %d\n"
+		   "\tFaultType: %lu\n"
+		   "\tAccessType: %lu\n"
+		   "\tFaultLevel: %lu\n"
		   "\tEngineClass: %d %s\n"
		   "\tEngineInstance: %d\n",
		   pf->consumer.asid,
		   upper_32_bits(pf->consumer.page_addr),
		   lower_32_bits(pf->consumer.page_addr),
-		   pf->consumer.fault_type,
-		   pf->consumer.access_type,
-		   pf->consumer.fault_level,
+		   FIELD_GET(XE_PAGEFAULT_TYPE_MASK,
+			     pf->consumer.fault_type_level),
+		   FIELD_GET(XE_PAGEFAULT_ACCESS_TYPE_MASK,
+			     pf->consumer.access_type),
+		   FIELD_GET(XE_PAGEFAULT_LEVEL_MASK,
+			     pf->consumer.fault_type_level),
		   pf->consumer.engine_class,
		   xe_hw_engine_class_to_str(pf->consumer.engine_class),
		   pf->consumer.engine_instance);

@@ -259,9 +262,15 @@ static void xe_pagefault_queue_work(struct work_struct *w)

 		err = xe_pagefault_service(&pf);
 		if (err) {
-			xe_pagefault_print(&pf);
-			xe_gt_info(pf.gt, "Fault response: Unsuccessful %pe\n",
-				   ERR_PTR(err));
+			if (!(pf.consumer.access_type & XE_PAGEFAULT_ACCESS_PREFETCH)) {
+				xe_pagefault_print(&pf);
+				xe_gt_info(pf.gt, "Fault response: Unsuccessful %pe\n",
+					   ERR_PTR(err));
+			} else {
+				xe_gt_stats_incr(pf.gt, XE_GT_STATS_ID_INVALID_PREFETCH_PAGEFAULT_COUNT, 1);
+				xe_gt_dbg(pf.gt, "Prefetch Fault response: Unsuccessful %pe\n",
+					  ERR_PTR(err));
+			}
 		}

 		pf.producer.ops->ack_fault(&pf, err);
@@ -68,24 +68,26 @@ struct xe_pagefault {
 		/** @consumer.asid: address space ID */
 		u32 asid;
 		/**
-		 * @consumer.access_type: access type, u8 rather than enum to
-		 * keep size compact
+		 * @consumer.access_type: access type and prefetch flag packed
+		 * into a u8.
 		 */
 		u8 access_type;
+#define XE_PAGEFAULT_ACCESS_TYPE_MASK	GENMASK(1, 0)
+#define XE_PAGEFAULT_ACCESS_PREFETCH	BIT(7)
 		/**
-		 * @consumer.fault_type: fault type, u8 rather than enum to
-		 * keep size compact
+		 * @consumer.fault_type_level: fault type and level, u8 rather
+		 * than enum to keep size compact
 		 */
-		u8 fault_type;
-#define XE_PAGEFAULT_LEVEL_NACK	0xff	/* Producer indicates nack fault */
-		/** @consumer.fault_level: fault level */
-		u8 fault_level;
+		u8 fault_type_level;
+#define XE_PAGEFAULT_TYPE_LEVEL_NACK	0xff	/* Producer indicates nack fault */
+#define XE_PAGEFAULT_LEVEL_MASK		GENMASK(3, 0)
+#define XE_PAGEFAULT_TYPE_MASK		GENMASK(7, 4)
 		/** @consumer.engine_class: engine class */
 		u8 engine_class;
 		/** @consumer.engine_instance: engine instance */
 		u8 engine_instance;
 		/** consumer.reserved: reserved bits for future expansion */
-		u8 reserved[7];
+		u64 reserved;
 	} consumer;
 	/**
	 * @producer: State for the producer (i.e., HW/FW interface). Populated
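The masks above are meant to be used with the bitfield helpers from <linux/bitfield.h>, as the xe_pagefault_print() hunk shows for the read side. A small illustrative round-trip (values made up, not from this patch):

	#include <linux/bitfield.h>

	static u8 example_pack_fault(u8 type, u8 level)
	{
		/* type lands in bits 7:4, level in bits 3:0 */
		return FIELD_PREP(XE_PAGEFAULT_TYPE_MASK, type) |
		       FIELD_PREP(XE_PAGEFAULT_LEVEL_MASK, level);
	}

	static void example_unpack_fault(u8 fault_type_level, u8 *type, u8 *level)
	{
		*type = FIELD_GET(XE_PAGEFAULT_TYPE_MASK, fault_type_level);
		*level = FIELD_GET(XE_PAGEFAULT_LEVEL_MASK, fault_type_level);
	}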
@@ -88,6 +88,7 @@ struct xe_pat_ops {
 	void (*program_media)(struct xe_gt *gt, const struct xe_pat_table_entry table[],
 			      int n_entries);
 	int (*dump)(struct xe_gt *gt, struct drm_printer *p);
+	void (*entry_dump)(struct drm_printer *p, const char *label, u32 pat, bool rsvd);
 };

 static const struct xe_pat_table_entry xelp_pat_table[] = {

@@ -123,7 +124,8 @@ static const struct xe_pat_table_entry xelpg_pat_table[] = {
  * - no_promote: 0=promotable, 1=no promote
  * - comp_en: 0=disable, 1=enable
  * - l3clos: L3 class of service (0-3)
- * - l3_policy: 0=WB, 1=XD ("WB - Transient Display"), 3=UC
+ * - l3_policy: 0=WB, 1=XD ("WB - Transient Display"),
+ *              2=XA ("WB - Transient App" for Xe3p), 3=UC
  * - l4_policy: 0=WB, 1=WT, 3=UC
  * - coh_mode: 0=no snoop, 2=1-way coherent, 3=2-way coherent
  *

@@ -252,6 +254,44 @@ static const struct xe_pat_table_entry xe3p_xpc_pat_table[] = {
 	[31] = XE3P_XPC_PAT( 0, 3, 0, 0, 3 ),
 };

+static const struct xe_pat_table_entry xe3p_primary_pat_pta = XE2_PAT(0, 0, 0, 0, 0, 3);
+static const struct xe_pat_table_entry xe3p_media_pat_pta = XE2_PAT(0, 0, 0, 0, 0, 2);
+
+static const struct xe_pat_table_entry xe3p_lpg_pat_table[] = {
+	[ 0] = XE2_PAT( 0, 0, 0, 0, 3, 0 ),
+	[ 1] = XE2_PAT( 0, 0, 0, 0, 3, 2 ),
+	[ 2] = XE2_PAT( 0, 0, 0, 0, 3, 3 ),
+	[ 3] = XE2_PAT( 0, 0, 0, 3, 3, 0 ),
+	[ 4] = XE2_PAT( 0, 0, 0, 3, 0, 2 ),
+	[ 5] = XE2_PAT( 0, 0, 0, 3, 3, 2 ),
+	[ 6] = XE2_PAT( 1, 0, 0, 1, 3, 0 ),
+	[ 7] = XE2_PAT( 0, 0, 0, 3, 0, 3 ),
+	[ 8] = XE2_PAT( 0, 0, 0, 3, 0, 0 ),
+	[ 9] = XE2_PAT( 0, 1, 0, 0, 3, 0 ),
+	[10] = XE2_PAT( 0, 1, 0, 3, 0, 0 ),
+	[11] = XE2_PAT( 1, 1, 0, 1, 3, 0 ),
+	[12] = XE2_PAT( 0, 1, 0, 3, 3, 0 ),
+	[13] = XE2_PAT( 0, 0, 0, 0, 0, 0 ),
+	[14] = XE2_PAT( 0, 1, 0, 0, 0, 0 ),
+	[15] = XE2_PAT( 1, 1, 0, 1, 1, 0 ),
+	[16] = XE2_PAT( 0, 1, 0, 0, 3, 2 ),
+	/* 17 is reserved; leave set to all 0's */
+	[18] = XE2_PAT( 1, 0, 0, 2, 3, 0 ),
+	[19] = XE2_PAT( 1, 0, 0, 2, 3, 2 ),
+	[20] = XE2_PAT( 0, 0, 1, 0, 3, 0 ),
+	[21] = XE2_PAT( 0, 1, 1, 0, 3, 0 ),
+	[22] = XE2_PAT( 0, 0, 1, 0, 3, 2 ),
+	[23] = XE2_PAT( 0, 0, 1, 0, 3, 3 ),
+	[24] = XE2_PAT( 0, 0, 2, 0, 3, 0 ),
+	[25] = XE2_PAT( 0, 1, 2, 0, 3, 0 ),
+	[26] = XE2_PAT( 0, 0, 2, 0, 3, 2 ),
+	[27] = XE2_PAT( 0, 0, 2, 0, 3, 3 ),
+	[28] = XE2_PAT( 0, 0, 3, 0, 3, 0 ),
+	[29] = XE2_PAT( 0, 1, 3, 0, 3, 0 ),
+	[30] = XE2_PAT( 0, 0, 3, 0, 3, 2 ),
+	[31] = XE2_PAT( 0, 0, 3, 0, 3, 3 ),
+};
+
 u16 xe_pat_index_get_coh_mode(struct xe_device *xe, u16 pat_index)
 {
 	WARN_ON(pat_index >= xe->pat.n_entries);

@@ -284,8 +324,10 @@ static void program_pat(struct xe_gt *gt, const struct xe_pat_table_entry table[

 	if (xe->pat.pat_ats)
 		xe_mmio_write32(&gt->mmio, XE_REG(_PAT_ATS), xe->pat.pat_ats->value);
-	if (xe->pat.pat_pta)
-		xe_mmio_write32(&gt->mmio, XE_REG(_PAT_PTA), xe->pat.pat_pta->value);
+	if (xe->pat.pat_primary_pta && xe_gt_is_main_type(gt))
+		xe_mmio_write32(&gt->mmio, XE_REG(_PAT_PTA), xe->pat.pat_primary_pta->value);
+	if (xe->pat.pat_media_pta && xe_gt_is_media_type(gt))
+		xe_mmio_write32(&gt->mmio, XE_REG(_PAT_PTA), xe->pat.pat_media_pta->value);
 }

 static void program_pat_mcr(struct xe_gt *gt, const struct xe_pat_table_entry table[],

@@ -301,8 +343,10 @@ static void program_pat_mcr(struct xe_gt *gt, const struct xe_pat_table_entry ta

 	if (xe->pat.pat_ats)
 		xe_gt_mcr_multicast_write(gt, XE_REG_MCR(_PAT_ATS), xe->pat.pat_ats->value);
-	if (xe->pat.pat_pta)
-		xe_gt_mcr_multicast_write(gt, XE_REG_MCR(_PAT_PTA), xe->pat.pat_pta->value);
+	if (xe->pat.pat_primary_pta && xe_gt_is_main_type(gt))
+		xe_gt_mcr_multicast_write(gt, XE_REG_MCR(_PAT_PTA), xe->pat.pat_primary_pta->value);
+	if (xe->pat.pat_media_pta && xe_gt_is_media_type(gt))
+		xe_gt_mcr_multicast_write(gt, XE_REG_MCR(_PAT_PTA), xe->pat.pat_media_pta->value);
 }

 static int xelp_dump(struct xe_gt *gt, struct drm_printer *p)

@@ -458,7 +502,7 @@ static int xe2_dump(struct xe_gt *gt, struct drm_printer *p)
 		pat = xe_gt_mcr_unicast_read_any(gt, XE_REG_MCR(_PAT_INDEX(i)));

 		xe_pat_index_label(label, sizeof(label), i);
-		xe2_pat_entry_dump(p, label, pat, !xe->pat.table[i].valid);
+		xe->pat.ops->entry_dump(p, label, pat, !xe->pat.table[i].valid);
 	}

 	/*

@@ -471,7 +515,7 @@ static int xe2_dump(struct xe_gt *gt, struct drm_printer *p)
 	pat = xe_gt_mcr_unicast_read_any(gt, XE_REG_MCR(_PAT_PTA));

 	drm_printf(p, "Page Table Access:\n");
-	xe2_pat_entry_dump(p, "PTA_MODE", pat, false);
+	xe->pat.ops->entry_dump(p, "PTA_MODE", pat, false);

 	return 0;
 }

@@ -480,44 +524,14 @@ static const struct xe_pat_ops xe2_pat_ops = {
 	.program_graphics = program_pat_mcr,
 	.program_media = program_pat,
 	.dump = xe2_dump,
+	.entry_dump = xe2_pat_entry_dump,
 };

-static int xe3p_xpc_dump(struct xe_gt *gt, struct drm_printer *p)
-{
-	struct xe_device *xe = gt_to_xe(gt);
-	u32 pat;
-	int i;
-	char label[PAT_LABEL_LEN];
-
-	CLASS(xe_force_wake, fw_ref)(gt_to_fw(gt), XE_FW_GT);
-	if (!fw_ref.domains)
-		return -ETIMEDOUT;
-
-	drm_printf(p, "PAT table: (* = reserved entry)\n");
-
-	for (i = 0; i < xe->pat.n_entries; i++) {
-		pat = xe_gt_mcr_unicast_read_any(gt, XE_REG_MCR(_PAT_INDEX(i)));
-
-		xe_pat_index_label(label, sizeof(label), i);
-		xe3p_xpc_pat_entry_dump(p, label, pat, !xe->pat.table[i].valid);
-	}
-
-	/*
-	 * Also print PTA_MODE, which describes how the hardware accesses
-	 * PPGTT entries.
-	 */
-	pat = xe_gt_mcr_unicast_read_any(gt, XE_REG_MCR(_PAT_PTA));
-
-	drm_printf(p, "Page Table Access:\n");
-	xe3p_xpc_pat_entry_dump(p, "PTA_MODE", pat, false);
-
-	return 0;
-}
-
 static const struct xe_pat_ops xe3p_xpc_pat_ops = {
 	.program_graphics = program_pat_mcr,
 	.program_media = program_pat,
-	.dump = xe3p_xpc_dump,
+	.dump = xe2_dump,
+	.entry_dump = xe3p_xpc_pat_entry_dump,
 };

 void xe_pat_init_early(struct xe_device *xe)

@@ -527,11 +541,26 @@ void xe_pat_init_early(struct xe_device *xe)
 		xe->pat.ops = &xe3p_xpc_pat_ops;
 		xe->pat.table = xe3p_xpc_pat_table;
 		xe->pat.pat_ats = &xe3p_xpc_pat_ats;
-		xe->pat.pat_pta = &xe3p_xpc_pat_pta;
+		xe->pat.pat_primary_pta = &xe3p_xpc_pat_pta;
+		xe->pat.pat_media_pta = &xe3p_xpc_pat_pta;
 		xe->pat.n_entries = ARRAY_SIZE(xe3p_xpc_pat_table);
 		xe->pat.idx[XE_CACHE_NONE] = 3;
 		xe->pat.idx[XE_CACHE_WT] = 3;	/* N/A (no display); use UC */
 		xe->pat.idx[XE_CACHE_WB] = 2;
+	} else if (GRAPHICS_VER(xe) == 35) {
+		xe->pat.ops = &xe2_pat_ops;
+		xe->pat.table = xe3p_lpg_pat_table;
+		xe->pat.pat_ats = &xe2_pat_ats;
+		if (!IS_DGFX(xe)) {
+			xe->pat.pat_primary_pta = &xe3p_primary_pat_pta;
+			xe->pat.pat_media_pta = &xe3p_media_pat_pta;
+		}
+		xe->pat.n_entries = ARRAY_SIZE(xe3p_lpg_pat_table);
+		xe->pat.idx[XE_CACHE_NONE] = 3;
+		xe->pat.idx[XE_CACHE_WT] = 15;
+		xe->pat.idx[XE_CACHE_WB] = 2;
+		xe->pat.idx[XE_CACHE_NONE_COMPRESSION] = 12;
+		xe->pat.idx[XE_CACHE_WB_COMPRESSION] = 16;
 	} else if (GRAPHICS_VER(xe) == 30 || GRAPHICS_VER(xe) == 20) {
 		xe->pat.ops = &xe2_pat_ops;
 		if (GRAPHICS_VER(xe) == 30) {

@@ -541,8 +570,10 @@ void xe_pat_init_early(struct xe_device *xe)
 			xe->pat.table = xe2_pat_table;
 		}
 		xe->pat.pat_ats = &xe2_pat_ats;
-		if (IS_DGFX(xe))
-			xe->pat.pat_pta = &xe2_pat_pta;
+		if (IS_DGFX(xe)) {
+			xe->pat.pat_primary_pta = &xe2_pat_pta;
+			xe->pat.pat_media_pta = &xe2_pat_pta;
+		}

 		/* Wa_16023588340. XXX: Should use XE_WA */
 		if (GRAPHICS_VERx100(xe) == 2001)

@@ -600,20 +631,17 @@ void xe_pat_init_early(struct xe_device *xe)
 			GRAPHICS_VER(xe), GRAPHICS_VERx100(xe) % 100);
 	}

-	/* VFs can't program nor dump PAT settings */
-	if (IS_SRIOV_VF(xe))
-		xe->pat.ops = NULL;
-
-	xe_assert(xe, !xe->pat.ops || xe->pat.ops->dump);
-	xe_assert(xe, !xe->pat.ops || xe->pat.ops->program_graphics);
-	xe_assert(xe, !xe->pat.ops || MEDIA_VER(xe) < 13 || xe->pat.ops->program_media);
+	xe_assert(xe, xe->pat.ops->dump);
+	xe_assert(xe, xe->pat.ops->program_graphics);
+	xe_assert(xe, MEDIA_VER(xe) < 13 || xe->pat.ops->program_media);
+	xe_assert(xe, GRAPHICS_VER(xe) < 20 || xe->pat.ops->entry_dump);
 }

 void xe_pat_init(struct xe_gt *gt)
 {
 	struct xe_device *xe = gt_to_xe(gt);

-	if (!xe->pat.ops)
+	if (IS_SRIOV_VF(xe))
 		return;

 	if (xe_gt_is_media_type(gt))

@@ -633,7 +661,7 @@ int xe_pat_dump(struct xe_gt *gt, struct drm_printer *p)
 {
 	struct xe_device *xe = gt_to_xe(gt);

-	if (!xe->pat.ops)
+	if (IS_SRIOV_VF(xe))
 		return -EOPNOTSUPP;

 	return xe->pat.ops->dump(gt, p);

@@ -649,6 +677,8 @@ int xe_pat_dump(struct xe_gt *gt, struct drm_printer *p)
 int xe_pat_dump_sw_config(struct xe_gt *gt, struct drm_printer *p)
 {
 	struct xe_device *xe = gt_to_xe(gt);
+	const struct xe_pat_table_entry *pta_entry = xe_gt_is_main_type(gt) ?
+		xe->pat.pat_primary_pta : xe->pat.pat_media_pta;
 	char label[PAT_LABEL_LEN];

 	if (!xe->pat.table || !xe->pat.n_entries)

@@ -658,12 +688,9 @@ int xe_pat_dump_sw_config(struct xe_gt *gt, struct drm_printer *p)
 	for (u32 i = 0; i < xe->pat.n_entries; i++) {
 		u32 pat = xe->pat.table[i].value;

-		if (GRAPHICS_VERx100(xe) == 3511) {
+		if (GRAPHICS_VER(xe) >= 20) {
 			xe_pat_index_label(label, sizeof(label), i);
-			xe3p_xpc_pat_entry_dump(p, label, pat, !xe->pat.table[i].valid);
-		} else if (GRAPHICS_VER(xe) == 30 || GRAPHICS_VER(xe) == 20) {
-			xe_pat_index_label(label, sizeof(label), i);
-			xe2_pat_entry_dump(p, label, pat, !xe->pat.table[i].valid);
+			xe->pat.ops->entry_dump(p, label, pat, !xe->pat.table[i].valid);
 		} else if (xe->info.platform == XE_METEORLAKE) {
 			xelpg_pat_entry_dump(p, i, pat);
 		} else if (xe->info.platform == XE_PVC) {

@@ -675,18 +702,18 @@ int xe_pat_dump_sw_config(struct xe_gt *gt, struct drm_printer *p)
 		}
 	}

-	if (xe->pat.pat_pta) {
-		u32 pat = xe->pat.pat_pta->value;
+	if (pta_entry) {
+		u32 pat = pta_entry->value;

 		drm_printf(p, "Page Table Access:\n");
-		xe2_pat_entry_dump(p, "PTA_MODE", pat, false);
+		xe->pat.ops->entry_dump(p, "PTA_MODE", pat, false);
 	}

 	if (xe->pat.pat_ats) {
 		u32 pat = xe->pat.pat_ats->value;

 		drm_printf(p, "PCIe ATS/PASID:\n");
-		xe2_pat_entry_dump(p, "PAT_ATS ", pat, false);
+		xe->pat.ops->entry_dump(p, "PAT_ATS ", pat, false);
 	}

 	drm_printf(p, "Cache Level:\n");
@@ -52,6 +52,7 @@ __diag_ignore_all("-Woverride-init", "Allow field overrides in table");

 static const struct xe_graphics_desc graphics_xelp = {
 	.hw_engine_mask = BIT(XE_HW_ENGINE_RCS0) | BIT(XE_HW_ENGINE_BCS0),
+	.num_geometry_xecore_fuse_regs = 1,
 };

 #define XE_HP_FEATURES \

@@ -62,6 +63,8 @@ static const struct xe_graphics_desc graphics_xehpg = {
 		BIT(XE_HW_ENGINE_RCS0) | BIT(XE_HW_ENGINE_BCS0) |
 		BIT(XE_HW_ENGINE_CCS0) | BIT(XE_HW_ENGINE_CCS1) |
 		BIT(XE_HW_ENGINE_CCS2) | BIT(XE_HW_ENGINE_CCS3),
+	.num_geometry_xecore_fuse_regs = 1,
+	.num_compute_xecore_fuse_regs = 1,

 	XE_HP_FEATURES,
 };

@@ -81,12 +84,15 @@ static const struct xe_graphics_desc graphics_xehpc = {
 	.has_asid = 1,
 	.has_atomic_enable_pte_bit = 1,
 	.has_usm = 1,
+	.num_compute_xecore_fuse_regs = 2,
 };

 static const struct xe_graphics_desc graphics_xelpg = {
 	.hw_engine_mask =
 		BIT(XE_HW_ENGINE_RCS0) | BIT(XE_HW_ENGINE_BCS0) |
 		BIT(XE_HW_ENGINE_CCS0),
+	.num_geometry_xecore_fuse_regs = 1,
+	.num_compute_xecore_fuse_regs = 1,

 	XE_HP_FEATURES,
 };

@@ -104,6 +110,15 @@ static const struct xe_graphics_desc graphics_xelpg = {

 static const struct xe_graphics_desc graphics_xe2 = {
 	XE2_GFX_FEATURES,
+	.num_geometry_xecore_fuse_regs = 3,
+	.num_compute_xecore_fuse_regs = 3,
+};
+
+static const struct xe_graphics_desc graphics_xe3p_lpg = {
+	XE2_GFX_FEATURES,
+	.multi_queue_engine_class_mask = BIT(XE_ENGINE_CLASS_COPY) | BIT(XE_ENGINE_CLASS_COMPUTE),
+	.num_geometry_xecore_fuse_regs = 3,
+	.num_compute_xecore_fuse_regs = 3,
 };

 static const struct xe_graphics_desc graphics_xe3p_xpc = {

@@ -112,6 +127,10 @@ static const struct xe_graphics_desc graphics_xe3p_xpc = {
 	.hw_engine_mask =
 		GENMASK(XE_HW_ENGINE_BCS8, XE_HW_ENGINE_BCS1) |
 		GENMASK(XE_HW_ENGINE_CCS3, XE_HW_ENGINE_CCS0),
+	.multi_queue_engine_class_mask = BIT(XE_ENGINE_CLASS_COPY) |
+					 BIT(XE_ENGINE_CLASS_COMPUTE),
+	.num_geometry_xecore_fuse_regs = 4,
+	.num_compute_xecore_fuse_regs = 4,
 };

 static const struct xe_media_desc media_xem = {

@@ -146,6 +165,7 @@ static const struct xe_ip graphics_ips[] = {
 	{ 3003, "Xe3_LPG", &graphics_xe2 },
 	{ 3004, "Xe3_LPG", &graphics_xe2 },
 	{ 3005, "Xe3_LPG", &graphics_xe2 },
+	{ 3510, "Xe3p_LPG", &graphics_xe3p_lpg },
 	{ 3511, "Xe3p_XPC", &graphics_xe3p_xpc },
 };
@@ -164,6 +184,10 @@ static const struct xe_ip media_ips[] = {
 	{ 3503, "Xe3p_HPM", &media_xelpmp },
 };

+#define MULTI_LRC_MASK \
+	.multi_lrc_mask = BIT(XE_ENGINE_CLASS_VIDEO_DECODE) | \
+			  BIT(XE_ENGINE_CLASS_VIDEO_ENHANCE)
+
 static const struct xe_device_desc tgl_desc = {
 	.pre_gmdid_graphics_ip = &graphics_ip_xelp,
 	.pre_gmdid_media_ip = &media_ip_xem,

@@ -174,6 +198,7 @@ static const struct xe_device_desc tgl_desc = {
 	.has_llc = true,
 	.has_sriov = true,
 	.max_gt_per_tile = 1,
+	MULTI_LRC_MASK,
 	.require_force_probe = true,
 	.va_bits = 48,
 	.vm_max_level = 3,

@@ -188,6 +213,7 @@ static const struct xe_device_desc rkl_desc = {
 	.has_display = true,
 	.has_llc = true,
 	.max_gt_per_tile = 1,
+	MULTI_LRC_MASK,
 	.require_force_probe = true,
 	.va_bits = 48,
 	.vm_max_level = 3,

@@ -205,6 +231,7 @@ static const struct xe_device_desc adl_s_desc = {
 	.has_llc = true,
 	.has_sriov = true,
 	.max_gt_per_tile = 1,
+	MULTI_LRC_MASK,
 	.require_force_probe = true,
 	.subplatforms = (const struct xe_subplatform_desc[]) {
 		{ XE_SUBPLATFORM_ALDERLAKE_S_RPLS, "RPLS", adls_rpls_ids },

@@ -226,6 +253,7 @@ static const struct xe_device_desc adl_p_desc = {
 	.has_llc = true,
 	.has_sriov = true,
 	.max_gt_per_tile = 1,
+	MULTI_LRC_MASK,
 	.require_force_probe = true,
 	.subplatforms = (const struct xe_subplatform_desc[]) {
 		{ XE_SUBPLATFORM_ALDERLAKE_P_RPLU, "RPLU", adlp_rplu_ids },

@@ -245,6 +273,7 @@ static const struct xe_device_desc adl_n_desc = {
 	.has_llc = true,
 	.has_sriov = true,
 	.max_gt_per_tile = 1,
+	MULTI_LRC_MASK,
 	.require_force_probe = true,
 	.va_bits = 48,
 	.vm_max_level = 3,

@@ -263,6 +292,7 @@ static const struct xe_device_desc dg1_desc = {
 	.has_gsc_nvm = 1,
 	.has_heci_gscfi = 1,
 	.max_gt_per_tile = 1,
+	MULTI_LRC_MASK,
 	.require_force_probe = true,
 	.va_bits = 48,
 	.vm_max_level = 3,

@@ -293,6 +323,7 @@ static const struct xe_device_desc ats_m_desc = {
 	.pre_gmdid_media_ip = &media_ip_xehpm,
 	.dma_mask_size = 46,
 	.max_gt_per_tile = 1,
+	MULTI_LRC_MASK,
 	.require_force_probe = true,

 	DG2_FEATURES,

@@ -305,6 +336,7 @@ static const struct xe_device_desc dg2_desc = {
 	.pre_gmdid_media_ip = &media_ip_xehpm,
 	.dma_mask_size = 46,
 	.max_gt_per_tile = 1,
+	MULTI_LRC_MASK,
 	.require_force_probe = true,

 	DG2_FEATURES,

@@ -323,6 +355,7 @@ static const __maybe_unused struct xe_device_desc pvc_desc = {
 	.has_heci_gscfi = 1,
 	.max_gt_per_tile = 1,
 	.max_remote_tiles = 1,
+	MULTI_LRC_MASK,
 	.require_force_probe = true,
 	.va_bits = 57,
 	.vm_max_level = 4,

@@ -338,6 +371,7 @@ static const struct xe_device_desc mtl_desc = {
 	.has_display = true,
 	.has_pxp = true,
 	.max_gt_per_tile = 2,
+	MULTI_LRC_MASK,
 	.va_bits = 48,
 	.vm_max_level = 3,
 };

@@ -349,6 +383,7 @@ static const struct xe_device_desc lnl_desc = {
 	.has_flat_ccs = 1,
 	.has_pxp = true,
 	.max_gt_per_tile = 2,
+	MULTI_LRC_MASK,
 	.needs_scratch = true,
 	.va_bits = 48,
 	.vm_max_level = 4,

@@ -373,6 +408,7 @@ static const struct xe_device_desc bmg_desc = {
 	.has_soc_remapper_telem = true,
 	.has_sriov = true,
 	.max_gt_per_tile = 2,
+	MULTI_LRC_MASK,
 	.needs_scratch = true,
 	.subplatforms = (const struct xe_subplatform_desc[]) {
 		{ XE_SUBPLATFORM_BATTLEMAGE_G21, "G21", bmg_g21_ids },

@@ -391,6 +427,7 @@ static const struct xe_device_desc ptl_desc = {
 	.has_pre_prod_wa = 1,
 	.has_pxp = true,
 	.max_gt_per_tile = 2,
+	MULTI_LRC_MASK,
 	.needs_scratch = true,
 	.needs_shared_vf_gt_wq = true,
 	.va_bits = 48,

@@ -404,6 +441,7 @@ static const struct xe_device_desc nvls_desc = {
 	.has_flat_ccs = 1,
 	.has_pre_prod_wa = 1,
 	.max_gt_per_tile = 2,
+	MULTI_LRC_MASK,
 	.require_force_probe = true,
 	.va_bits = 48,
 	.vm_max_level = 4,
@@ -425,11 +463,27 @@ static const struct xe_device_desc cri_desc = {
 	.has_soc_remapper_telem = true,
 	.has_sriov = true,
 	.max_gt_per_tile = 2,
+	MULTI_LRC_MASK,
 	.require_force_probe = true,
 	.va_bits = 57,
 	.vm_max_level = 4,
 };

+static const struct xe_device_desc nvlp_desc = {
+	PLATFORM(NOVALAKE_P),
+	.dma_mask_size = 46,
+	.has_cached_pt = true,
+	.has_display = true,
+	.has_flat_ccs = 1,
+	.has_page_reclaim_hw_assist = true,
+	.has_pre_prod_wa = true,
+	.max_gt_per_tile = 2,
+	MULTI_LRC_MASK,
+	.require_force_probe = true,
+	.va_bits = 48,
+	.vm_max_level = 4,
+};
+
 #undef PLATFORM
 __diag_pop();

@@ -459,6 +513,7 @@ static const struct pci_device_id pciidlist[] = {
 	INTEL_WCL_IDS(INTEL_VGA_DEVICE, &ptl_desc),
 	INTEL_NVLS_IDS(INTEL_VGA_DEVICE, &nvls_desc),
 	INTEL_CRI_IDS(INTEL_PCI_DEVICE, &cri_desc),
+	INTEL_NVLP_IDS(INTEL_VGA_DEVICE, &nvlp_desc),
 	{ }
 };
 MODULE_DEVICE_TABLE(pci, pciidlist);

@@ -710,6 +765,7 @@ static int xe_info_init_early(struct xe_device *xe,
 	xe->info.skip_pcode = desc->skip_pcode;
 	xe->info.needs_scratch = desc->needs_scratch;
 	xe->info.needs_shared_vf_gt_wq = desc->needs_shared_vf_gt_wq;
+	xe->info.multi_lrc_mask = desc->multi_lrc_mask;

 	xe->info.probe_display = IS_ENABLED(CONFIG_DRM_XE_DISPLAY) &&
 				 xe_modparam.probe_display &&

@@ -786,6 +842,8 @@ static struct xe_gt *alloc_primary_gt(struct xe_tile *tile,
 	gt->info.has_indirect_ring_state = graphics_desc->has_indirect_ring_state;
 	gt->info.multi_queue_engine_class_mask = graphics_desc->multi_queue_engine_class_mask;
 	gt->info.engine_mask = graphics_desc->hw_engine_mask;
+	gt->info.num_geometry_xecore_fuse_regs = graphics_desc->num_geometry_xecore_fuse_regs;
+	gt->info.num_compute_xecore_fuse_regs = graphics_desc->num_compute_xecore_fuse_regs;

 	/*
 	 * Before media version 13, the media IP was part of the primary GT

@@ -892,6 +950,7 @@ static int xe_info_init(struct xe_device *xe,
 		xe->info.has_device_atomics_on_smem = 1;

 	xe->info.has_range_tlb_inval = graphics_desc->has_range_tlb_inval;
+	xe->info.has_ctx_tlb_inval = graphics_desc->has_ctx_tlb_inval;
 	xe->info.has_usm = graphics_desc->has_usm;
 	xe->info.has_64bit_timestamp = graphics_desc->has_64bit_timestamp;
 	xe->info.has_mem_copy_instr = GRAPHICS_VER(xe) >= 20;
@@ -30,6 +30,7 @@ struct xe_device_desc {
 	u8 dma_mask_size;
 	u8 max_remote_tiles:2;
 	u8 max_gt_per_tile:2;
+	u8 multi_lrc_mask;
 	u8 va_bits;
 	u8 vm_max_level;
 	u8 vram_flags;

@@ -66,11 +67,14 @@ struct xe_device_desc {
 struct xe_graphics_desc {
 	u64 hw_engine_mask; /* hardware engines provided by graphics IP */
 	u16 multi_queue_engine_class_mask; /* bitmask of engine classes which support multi queue */
+	u8 num_geometry_xecore_fuse_regs;
+	u8 num_compute_xecore_fuse_regs;

 	u8 has_asid:1;
 	u8 has_atomic_enable_pte_bit:1;
 	u8 has_indirect_ring_state:1;
 	u8 has_range_tlb_inval:1;
+	u8 has_ctx_tlb_inval:1;
 	u8 has_usm:1;
 	u8 has_64bit_timestamp:1;
 };
@@ -26,6 +26,7 @@ enum xe_platform {
 	XE_PANTHERLAKE,
 	XE_NOVALAKE_S,
 	XE_CRESCENTISLAND,
+	XE_NOVALAKE_P,
 };

 enum xe_subplatform {
@@ -142,9 +142,6 @@ query_engine_cycles(struct xe_device *xe,
 		return -EINVAL;

 	eci = &resp.eci;
-	if (eci->gt_id >= xe->info.max_gt_per_tile)
-		return -EINVAL;
-
 	gt = xe_device_get_gt(xe, eci->gt_id);
 	if (!gt)
 		return -EINVAL;
@@ -13,6 +13,7 @@
 #include <drm/drm_managed.h>
 #include <drm/drm_print.h>

+#include "xe_assert.h"
 #include "xe_device.h"
 #include "xe_device_types.h"
 #include "xe_force_wake.h"

@@ -20,6 +21,7 @@
 #include "xe_gt_printk.h"
 #include "xe_gt_types.h"
 #include "xe_hw_engine_types.h"
+#include "xe_lrc.h"
 #include "xe_mmio.h"
 #include "xe_rtp_types.h"

@@ -98,10 +100,12 @@ int xe_reg_sr_add(struct xe_reg_sr *sr,
 	*pentry = *e;
 	ret = xa_err(xa_store(&sr->xa, idx, pentry, GFP_KERNEL));
 	if (ret)
-		goto fail;
+		goto fail_free;

 	return 0;

+fail_free:
+	kfree(pentry);
 fail:
 	xe_gt_err(gt,
 		  "discarding save-restore reg %04lx (clear: %08x, set: %08x, masked: %s, mcr: %s): ret=%d\n",

@@ -169,8 +173,11 @@ void xe_reg_sr_apply_mmio(struct xe_reg_sr *sr, struct xe_gt *gt)
 	if (xa_empty(&sr->xa))
 		return;

-	if (IS_SRIOV_VF(gt_to_xe(gt)))
-		return;
+	/*
+	 * We don't process non-LRC reg_sr lists in VF, so they should have
+	 * been empty in the check above.
+	 */
+	xe_gt_assert(gt, !IS_SRIOV_VF(gt_to_xe(gt)));

 	xe_gt_dbg(gt, "Applying %s save-restore MMIOs\n", sr->name);

@@ -204,3 +211,66 @@ void xe_reg_sr_dump(struct xe_reg_sr *sr, struct drm_printer *p)
 			   str_yes_no(entry->reg.masked),
 			   str_yes_no(entry->reg.mcr));
 }
+
+static u32 readback_reg(struct xe_gt *gt, struct xe_reg reg)
+{
+	struct xe_reg_mcr mcr_reg = to_xe_reg_mcr(reg);
+
+	if (reg.mcr)
+		return xe_gt_mcr_unicast_read_any(gt, mcr_reg);
+	else
+		return xe_mmio_read32(&gt->mmio, reg);
+}
+
+/**
+ * xe_reg_sr_readback_check() - Readback registers referenced in save/restore
+ * entries and check whether the programming is in place.
+ * @sr: Save/restore entries
+ * @gt: GT to read register from
+ * @p: DRM printer to report discrepancies on
+ */
+void xe_reg_sr_readback_check(struct xe_reg_sr *sr,
+			      struct xe_gt *gt,
+			      struct drm_printer *p)
+{
+	struct xe_reg_sr_entry *entry;
+	unsigned long offset;
+
+	xa_for_each(&sr->xa, offset, entry) {
+		u32 val = readback_reg(gt, entry->reg);
+		u32 mask = entry->clr_bits | entry->set_bits;
+
+		if ((val & mask) != entry->set_bits)
+			drm_printf(p, "%#8lx & %#10x :: expected %#10x got %#10x\n",
+				   offset, mask, entry->set_bits, val & mask);
+	}
+}
+
+/**
+ * xe_reg_sr_lrc_check() - Check LRC for registers referenced in save/restore
+ * entries and check whether the programming is in place.
+ * @sr: Save/restore entries
+ * @gt: GT to read register from
+ * @hwe: Hardware engine type to check LRC for
+ * @p: DRM printer to report discrepancies on
+ */
+void xe_reg_sr_lrc_check(struct xe_reg_sr *sr,
+			 struct xe_gt *gt,
+			 struct xe_hw_engine *hwe,
+			 struct drm_printer *p)
+{
+	struct xe_reg_sr_entry *entry;
+	unsigned long offset;
+
+	xa_for_each(&sr->xa, offset, entry) {
+		u32 val;
+		int ret = xe_lrc_lookup_default_reg_value(gt, hwe->class, offset, &val);
+		u32 mask = entry->clr_bits | entry->set_bits;
+
+		if (ret == -ENOENT)
+			drm_printf(p, "%#8lx :: not found in LRC for %s\n", offset, hwe->name);
+		else if ((val & mask) != entry->set_bits)
+			drm_printf(p, "%#8lx & %#10x :: expected %#10x got %#10x\n",
+				   offset, mask, entry->set_bits, val & mask);
+	}
+}
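A plausible caller for the two new checkers — e.g. from the RTP debugfs facility this series adds. The per-engine field names (reg_sr, reg_lrc) match the existing struct xe_hw_engine, but this exact wrapper is hypothetical:

	static void example_check_engine(struct xe_gt *gt, struct xe_hw_engine *hwe,
					 struct drm_printer *p)
	{
		/* Compare live MMIO state against the save/restore entries */
		xe_reg_sr_readback_check(&hwe->reg_sr, gt, p);

		/* Compare the default LRC image against the LRC entries */
		xe_reg_sr_lrc_check(&hwe->reg_lrc, gt, hwe, p);
	}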
@@ -19,6 +19,13 @@ struct drm_printer;

 int xe_reg_sr_init(struct xe_reg_sr *sr, const char *name, struct xe_device *xe);
 void xe_reg_sr_dump(struct xe_reg_sr *sr, struct drm_printer *p);
+void xe_reg_sr_readback_check(struct xe_reg_sr *sr,
+			      struct xe_gt *gt,
+			      struct drm_printer *p);
+void xe_reg_sr_lrc_check(struct xe_reg_sr *sr,
+			 struct xe_gt *gt,
+			 struct xe_hw_engine *hwe,
+			 struct drm_printer *p);

 int xe_reg_sr_add(struct xe_reg_sr *sr, const struct xe_reg_sr_entry *e,
 		  struct xe_gt *gt);
@@ -75,7 +75,15 @@ static const struct xe_rtp_entry_sr register_whitelist[] = {
 	  XE_RTP_ACTIONS(WHITELIST(CSBE_DEBUG_STATUS(RENDER_RING_BASE), 0))
 	},
 	{ XE_RTP_NAME("14024997852"),
-	  XE_RTP_RULES(GRAPHICS_VERSION_RANGE(3000, 3005), ENGINE_CLASS(RENDER)),
+	  XE_RTP_RULES(GRAPHICS_VERSION_RANGE(2001, 3005), ENGINE_CLASS(RENDER)),
+	  XE_RTP_ACTIONS(WHITELIST(FF_MODE,
+				   RING_FORCE_TO_NONPRIV_ACCESS_RW),
+			 WHITELIST(VFLSKPD,
+				   RING_FORCE_TO_NONPRIV_ACCESS_RW))
+	},
+	{ XE_RTP_NAME("14024997852"),
+	  XE_RTP_RULES(GRAPHICS_VERSION(3510), GRAPHICS_STEP(A0, B0),
+		       ENGINE_CLASS(RENDER)),
 	  XE_RTP_ACTIONS(WHITELIST(FF_MODE,
 				   RING_FORCE_TO_NONPRIV_ACCESS_RW),
 			 WHITELIST(VFLSKPD,

@@ -181,7 +189,7 @@ void xe_reg_whitelist_process_engine(struct xe_hw_engine *hwe)
 	struct xe_rtp_process_ctx ctx = XE_RTP_PROCESS_CTX_INITIALIZER(hwe);

 	xe_rtp_process_to_sr(&ctx, register_whitelist, ARRAY_SIZE(register_whitelist),
-			     &hwe->reg_whitelist);
+			     &hwe->reg_whitelist, false);
 	whitelist_apply_to_hwe(hwe);
 }
@@ -280,6 +280,9 @@ static void __emit_job_gen12_simple(struct xe_sched_job *job, struct xe_lrc *lrc

 	i = emit_bb_start(batch_addr, ppgtt_flag, dw, i);

+	/* Don't preempt fence signaling */
+	dw[i++] = MI_ARB_ON_OFF | MI_ARB_DISABLE;
+
 	if (job->user_fence.used) {
 		i = emit_flush_dw(dw, i);
 		i = emit_store_imm_ppgtt_posted(job->user_fence.addr,

@@ -345,6 +348,9 @@ static void __emit_job_gen12_video(struct xe_sched_job *job, struct xe_lrc *lrc,

 	i = emit_bb_start(batch_addr, ppgtt_flag, dw, i);

+	/* Don't preempt fence signaling */
+	dw[i++] = MI_ARB_ON_OFF | MI_ARB_DISABLE;
+
 	if (job->user_fence.used) {
 		i = emit_flush_dw(dw, i);
 		i = emit_store_imm_ppgtt_posted(job->user_fence.addr,

@@ -397,6 +403,9 @@ static void __emit_job_gen12_render_compute(struct xe_sched_job *job,

 	i = emit_bb_start(batch_addr, ppgtt_flag, dw, i);

+	/* Don't preempt fence signaling */
+	dw[i++] = MI_ARB_ON_OFF | MI_ARB_DISABLE;
+
 	i = emit_render_cache_flush(job, dw, i);

 	if (job->user_fence.used)
@@ -270,6 +270,8 @@ static void rtp_mark_active(struct xe_device *xe,
  * @sr: Save-restore struct where matching rules execute the action. This can be
  *      viewed as the "coalesced view" of multiple the tables. The bits for each
  *      register set are expected not to collide with previously added entries
+ * @process_in_vf: Whether this RTP table should get processed for SR-IOV VF
+ *                 devices. Should generally only be 'true' for LRC tables.
  *
  * Walk the table pointed by @entries (with an empty sentinel) and add all
  * entries with matching rules to @sr. If @hwe is not NULL, its mmio_base is

@@ -278,7 +280,8 @@ static void rtp_mark_active(struct xe_device *xe,
 void xe_rtp_process_to_sr(struct xe_rtp_process_ctx *ctx,
 			  const struct xe_rtp_entry_sr *entries,
 			  size_t n_entries,
-			  struct xe_reg_sr *sr)
+			  struct xe_reg_sr *sr,
+			  bool process_in_vf)
 {
 	const struct xe_rtp_entry_sr *entry;
 	struct xe_hw_engine *hwe = NULL;

@@ -287,6 +290,9 @@ void xe_rtp_process_to_sr(struct xe_rtp_process_ctx *ctx,

 	rtp_get_context(ctx, &hwe, &gt, &xe);

+	if (!process_in_vf && IS_SRIOV_VF(xe))
+		return;
+
 	xe_assert(xe, entries);

 	for (entry = entries; entry - entries < n_entries; entry++) {
@@ -431,7 +431,8 @@ void xe_rtp_process_ctx_enable_active_tracking(struct xe_rtp_process_ctx *ctx,

 void xe_rtp_process_to_sr(struct xe_rtp_process_ctx *ctx,
 			  const struct xe_rtp_entry_sr *entries,
-			  size_t n_entries, struct xe_reg_sr *sr);
+			  size_t n_entries, struct xe_reg_sr *sr,
+			  bool process_in_vf);

 void xe_rtp_process(struct xe_rtp_process_ctx *ctx,
 		    const struct xe_rtp_entry *entries);
@@ -89,6 +89,12 @@ struct xe_sa_manager *__xe_sa_bo_manager_init(struct xe_tile *tile, u32 size,
 	if (ret)
 		return ERR_PTR(ret);

+	if (IS_ENABLED(CONFIG_PROVE_LOCKING)) {
+		fs_reclaim_acquire(GFP_KERNEL);
+		might_lock(&sa_manager->swap_guard);
+		fs_reclaim_release(GFP_KERNEL);
+	}
+
 	shadow = xe_managed_bo_create_pin_map(xe, tile, size,
 					      XE_BO_FLAG_VRAM_IF_DGFX(tile) |
 					      XE_BO_FLAG_GGTT |

@@ -175,6 +181,36 @@ struct drm_suballoc *__xe_sa_bo_new(struct xe_sa_manager *sa_manager, u32 size,
 	return drm_suballoc_new(&sa_manager->base, size, gfp, true, 0);
 }

+/**
+ * xe_sa_bo_alloc() - Allocate an uninitialized suballoc object.
+ * @gfp: gfp flags used for memory allocation.
+ *
+ * Allocate memory for an uninitialized suballoc object. The intended usage is
+ * to allocate the memory outside of a reclaim-tainted context and initialize
+ * the object at a later time, in a reclaim-tainted context.
+ *
+ * Return: a new uninitialized suballoc object, or an ERR_PTR(-ENOMEM).
+ */
+struct drm_suballoc *xe_sa_bo_alloc(gfp_t gfp)
+{
+	return drm_suballoc_alloc(gfp);
+}
+
+/**
+ * xe_sa_bo_init() - Initialize a suballocation.
+ * @sa_manager: pointer to the sa_manager
+ * @sa: The struct drm_suballoc.
+ * @size: number of bytes we want to suballocate.
+ *
+ * Try to make a suballocation on a pre-allocated suballoc object of size @size.
+ *
+ * Return: zero on success, errno on failure.
+ */
+int xe_sa_bo_init(struct xe_sa_manager *sa_manager, struct drm_suballoc *sa, size_t size)
+{
+	return drm_suballoc_insert(&sa_manager->base, sa, size, true, 0);
+}
+
 /**
  * xe_sa_bo_flush_write() - Copy the data from the sub-allocation to the GPU memory.
  * @sa_bo: the &drm_suballoc to flush
@@ -38,6 +38,8 @@ static inline struct drm_suballoc *xe_sa_bo_new(struct xe_sa_manager *sa_manager
 	return __xe_sa_bo_new(sa_manager, size, GFP_KERNEL);
 }

+struct drm_suballoc *xe_sa_bo_alloc(gfp_t gfp);
+int xe_sa_bo_init(struct xe_sa_manager *sa_manager, struct drm_suballoc *sa, size_t size);
 void xe_sa_bo_flush_write(struct drm_suballoc *sa_bo);
 void xe_sa_bo_sync_read(struct drm_suballoc *sa_bo);
 void xe_sa_bo_free(struct drm_suballoc *sa_bo, struct dma_fence *fence);
drivers/gpu/drm/xe/xe_sleep.h (new file, 57 lines)
@@ -0,0 +1,57 @@
+/* SPDX-License-Identifier: MIT */
+/*
+ * Copyright © 2026 Intel Corporation
+ */
+
+#ifndef _XE_SLEEP_H_
+#define _XE_SLEEP_H_
+
+#include <linux/delay.h>
+#include <linux/math64.h>
+
+/**
+ * xe_sleep_relaxed_ms() - Sleep for an approximate time.
+ * @delay_ms: time in msec to sleep
+ *
+ * For smaller timeouts, sleep with 0.5ms accuracy.
+ */
+static inline void xe_sleep_relaxed_ms(unsigned int delay_ms)
+{
+	unsigned long min_us, max_us;
+
+	if (!delay_ms)
+		return;
+
+	if (delay_ms > 20) {
+		msleep(delay_ms);
+		return;
+	}
+
+	min_us = mul_u32_u32(delay_ms, 1000);
+	max_us = min_us + 500;
+
+	usleep_range(min_us, max_us);
+}
+
+/**
+ * xe_sleep_exponential_ms() - Sleep for an exponentially increased time.
+ * @sleep_period_ms: current time in msec to sleep
+ * @max_sleep_ms: maximum time in msec to sleep
+ *
+ * Sleep for the @sleep_period_ms and exponentially increase this time for the
+ * next loop, unless reaching the @max_sleep_ms limit.
+ *
+ * Return: approximate time in msec the task was delayed.
+ */
+static inline unsigned int xe_sleep_exponential_ms(unsigned int *sleep_period_ms,
+						   unsigned int max_sleep_ms)
+{
+	unsigned int delay_ms = *sleep_period_ms;
+	unsigned int next_delay_ms = 2 * delay_ms;
+
+	xe_sleep_relaxed_ms(delay_ms);
+	*sleep_period_ms = min(next_delay_ms, max_sleep_ms);
+	return delay_ms;
+}
+
+#endif
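A hypothetical polling loop built on the helper above: the wait starts at 1 ms, doubles each iteration, and is capped at 50 ms per sleep; example_hw_is_ready() is an assumed predicate, not part of this patch:

	static int example_wait_ready(struct xe_device *xe)
	{
		unsigned int sleep_ms = 1;
		unsigned int waited_ms = 0;

		while (!example_hw_is_ready(xe)) {
			if (waited_ms > 500)
				return -ETIMEDOUT;

			/* Sleeps sleep_ms, then doubles it up to the 50 ms cap */
			waited_ms += xe_sleep_exponential_ms(&sleep_ms, 50);
		}

		return 0;
	}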
@@ -4,6 +4,7 @@
  */

 #include "regs/xe_soc_remapper_regs.h"
+#include "xe_device.h"
 #include "xe_mmio.h"
 #include "xe_soc_remapper.h"
@@ -120,7 +120,7 @@ int xe_sriov_init(struct xe_device *xe)
 	xe_sriov_vf_init_early(xe);

 	xe_assert(xe, !xe->sriov.wq);
-	xe->sriov.wq = alloc_workqueue("xe-sriov-wq", 0, 0);
+	xe->sriov.wq = alloc_workqueue("xe-sriov-wq", WQ_PERCPU, 0);
 	if (!xe->sriov.wq)
 		return -ENOMEM;