drm fixes for 7.1-rc2

core and helpers:
 - calculate framebuffer geometry with format helpers
 - fix docs
 
 amdgpu:
 - GFX12 fix for CONFIG_DRM_DEBUG_MM configs
 - Fix DC analog support
 - Userq fixes
 - GART placement fix
 - Aldebaran SMU fixes
 - AMDGPU_INFO_READ_MMR_REG fix
 - UVD 3.1 fix
 - GC 6 TCC fix
 - Fix root reservation in amdgpu_vm_handle_fault()
 - RAS fix
 - Module reload fix for APUs
 - Fix build for CONFIG_DRM_FBDEV_EMULATION=n
 - IGT DWB regression fix
 - GC 11.5.4 fix
 - VCN user fence fixes
 - JPEG user fence fixes
 - SMU 13.0.6 fix
 - VCN 3/4 IB parser fixes
 - NV3x+ dGPU vblank fix
 - DCE6/8 fixes for LVDS/eDP panels without an EDID
 
 amdkfd:
 - Fix for when CONFIG_HSA_AMD is not set
 - SVM fixes
 
 xe:
 - uapi: Add missing pad and extensions check
 - uapi: Reject unsafe PAT indices for CPU cached memory
 - Drop registration of guc_submit_wedged_fini from xe_guc_submit_wedge
 - Xe3p tuning and workaround fixes
 - Use drm mm instead of drm SA for CCS read/write
 - Fix leaks and null derefs
 - Fix Wa_18022495364
 
 appletbdrm:
 - allocate protocol buffers with kvzalloc()
 
 dma-buf:
 - fix docs
 
 imagination:
 - avoid segfault in debugfs
 
 ofdrm:
 - put PCI device reference on errors
 
 udl:
 - increase USB timeout
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEEKbZHaGwW9KfbeusDHTzWXnEhr4FAmn1JYQACgkQDHTzWXnE
 hr5rkBAAnAEiMatySCl54Zwt9RlC1S8PDJ+cKW0GbGE6ID3UYMcIgBgXjBfRWPGI
 smhCUq1a/tNjIFCO+JCNe3WqX/vhghtJKfh2FJVWy0tu18S/PvrxB5m3Iasm7JfP
 NxKGyoVCXknDQW4dMATWrDm5JoqAsh5b59Jf8WCcBrMQXeqVSZgHxXjVkwj8e092
 i/FIoS/sV83Lf4xJcm9l25+0OcLhkoLdXz6+r7pwFwsafP07mWbXYXa53efWqy8v
 848AH25FaB+cK16QcrluhIvdVFl3iLbX2b7WpJF3TAbDe3Emr4uggBqiqwcI4p5/
 rQGfVZkng1FBLOcHBZ7p0Wsa+F35C+6H14R8fueMiOmsgX6pXZLnJJ0KpQvSEc+d
 acia8SYp1SGTaxBrdvrhRY6BKtcq/ClOPvbIvV0CPuxFtVNWU940FE+b3V51EpbG
 TGhks4Nuh1C7ihm82Kep34pZjx7ZRnQWPAz7Cm9L9ZfX2DOOi9Uu16u71IwgumfL
 yp/7Jt06Hx/TS0qWV1dnH3ZtluQgBA/EUARmv1MNyIEvSOjpvKiVqlNJmlPKi0+9
 piXl0QUrOQz+Wj9glzcM3ENKh7ZxDFJxcIMkHx7q/wEwSIppnhOPuQAwNXWO4Y4Q
 p4X99W+gHfKwVG8BrY5tbW7lbkt8/4MWSR/9R2Vj8prIJCeeKFI=
 =wagp
 -----END PGP SIGNATURE-----

Merge tag 'drm-fixes-2026-05-02' of https://gitlab.freedesktop.org/drm/kernel

Pull drm fixes from Dave Airlie:
 "Fixes for rc2, the usual amdgpu/xe double header, I think xe had a
  couple of weeks combined due to some maintainer access issues.
  Otherwise there are just a few misc fixes and documentation fixups.

  core and helpers:
   - calculate framebuffer geometry with format helpers
   - fix docs

  amdgpu:
   - GFX12 fix for CONFIG_DRM_DEBUG_MM configs
   - Fix DC analog support
   - Userq fixes
   - GART placement fix
   - Aldebaran SMU fixes
   - AMDGPU_INFO_READ_MMR_REG fix
   - UVD 3.1 fix
   - GC 6 TCC fix
   - Fix root reservation in amdgpu_vm_handle_fault()
   - RAS fix
   - Module reload fix for APUs
   - Fix build for CONFIG_DRM_FBDEV_EMULATION=n
   - IGT DWB regression fix
   - GC 11.5.4 fix
   - VCN user fence fixes
   - JPEG user fence fixes
   - SMU 13.0.6 fix
   - VCN 3/4 IB parser fixes
   - NV3x+ dGPU vblank fix
   - DCE6/8 fixes for LVDS/eDP panels without an EDID

  amdkfd:
   - Fix for when CONFIG_HSA_AMD is not set
   - SVM fixes

  xe:
   - uapi: Add missing pad and extensions check
   - uapi: Reject unsafe PAT indices for CPU cached memory
   - Drop registration of guc_submit_wedged_fini from xe_guc_submit_wedge
   - Xe3p tuning and workaround fixes
   - Use drm mm instead of drm SA for CCS read/write
   - Fix leaks and null derefs
   - Fix Wa_18022495364

  appletbdrm:
   - allocate protocol buffers with kvzalloc()

  dma-buf:
   - fix docs

  imagination:
   - avoid segfault in debugfs

  ofdrm:
   - put PCI device reference on errors

  udl:
   - increase USB timeout"

* tag 'drm-fixes-2026-05-02' of https://gitlab.freedesktop.org/drm/kernel: (77 commits)
  drm/xe/uapi: Reject coh_none PAT index for CPU_ADDR_MIRROR
  drm/xe/uapi: Reject coh_none PAT index for CPU cached memory in madvise
  drm/xe/xelp: Fix Wa_18022495364
  drm/xe/gsc: Fix BO leak on error in query_compatibility_version()
  drm/xe/eustall: Fix drm_dev_put called before stream disable in close
  drm/xe: Fix error cleanup in xe_exec_queue_create_ioctl()
  drm/xe: Fix dma-buf attachment leak in xe_gem_prime_import()
  drm/xe: Fix bo leak in xe_dma_buf_init_obj() on allocation failure
  drm/xe/bo: Fix bo leak on GGTT flag validation in xe_bo_init_locked()
  drm/xe/bo: Fix bo leak on unaligned size validation in xe_bo_init_locked()
  drm/xe: Fix potential NULL deref in xe_exec_queue_tlb_inval_last_fence_put_unlocked
  drm/xe/vf: Use drm mm instead of drm sa for CCS read/write
  drm/xe: Add memory pool with shadow support
  drm/xe/debugfs: Correct printing of register whitelist ranges
  drm/xe: Mark ROW_CHICKEN5 as a masked register
  drm/xe/tuning: Use proper register offset for GAMSTLB_CTRL
  drm/xe/xe3p_lpg: Add missing indirect ring state feature flag
  drm/xe: Drop redundant rtp entries for Wa_14019988906 & Wa_14019877138
  drm/xe/vm: Add missing pad and extensions check
  drm/xe: Drop registration of guc_submit_wedged_fini from xe_guc_submit_wedge()
  ...
Linus Torvalds 2026-05-01 16:56:08 -07:00
commit f1a5e78a55
88 changed files with 1085 additions and 269 deletions


@ -19,6 +19,7 @@ Abhinav Kumar <quic_abhinavk@quicinc.com> <abhinavk@codeaurora.org>
Ahmad Masri <quic_amasri@quicinc.com> <amasri@codeaurora.org>
Adam Oldham <oldhamca@gmail.com>
Adam Radford <aradford@gmail.com>
Aditya Garg <gargaditya08@proton.me> <gargaditya08@live.com>
Adriana Reus <adi.reus@gmail.com> <adriana.reus@intel.com>
Adrian Bunk <bunk@stusta.de>
Ajay Kaher <ajay.kaher@broadcom.com> <akaher@vmware.com>


@ -7873,7 +7873,7 @@ F: drivers/gpu/drm/sun4i/sun8i*
DRM DRIVER FOR APPLE TOUCH BARS
M: Aun-Ali Zaidi <admin@kodeit.net>
M: Aditya Garg <gargaditya08@live.com>
M: Aditya Garg <gargaditya08@proton.me>
L: dri-devel@lists.freedesktop.org
S: Maintained
T: git https://gitlab.freedesktop.org/drm/misc/kernel.git


@ -2839,8 +2839,12 @@ static int amdgpu_device_ip_fini_early(struct amdgpu_device *adev)
* that checks whether the PSP is running. A solution for those issues
* in the APU is to trigger a GPU reset, but this should be done during
* the unload phase to avoid adding boot latency and screen flicker.
* GFX V11 has GC block as default off IP. Every time AMDGPU driver sends
* a request to PMFW to unload MP1, PMFW will put GC in reset and power down
* the voltage. Hence, skipping reset for APUs with GFX V11 or later.
*/
if ((adev->flags & AMD_IS_APU) && !adev->gmc.is_app_apu) {
if ((adev->flags & AMD_IS_APU) && !adev->gmc.is_app_apu &&
amdgpu_ip_version(adev, GC_HWIP, 0) < IP_VERSION(11, 0, 0)) {
r = amdgpu_asic_reset(adev);
if (r)
dev_err(adev->dev, "asic reset on %s failed\n", __func__);


@ -3090,10 +3090,8 @@ int amdgpu_discovery_set_ip_blocks(struct amdgpu_device *adev)
case IP_VERSION(11, 5, 1):
case IP_VERSION(11, 5, 2):
case IP_VERSION(11, 5, 3):
adev->family = AMDGPU_FAMILY_GC_11_5_0;
break;
case IP_VERSION(11, 5, 4):
adev->family = AMDGPU_FAMILY_GC_11_5_4;
adev->family = AMDGPU_FAMILY_GC_11_5_0;
break;
case IP_VERSION(12, 0, 0):
case IP_VERSION(12, 0, 1):


@ -3158,8 +3158,10 @@ static int __init amdgpu_init(void)
amdgpu_register_atpx_handler();
amdgpu_acpi_detect();
/* Ignore KFD init failures. Normal when CONFIG_HSA_AMD is not set. */
amdgpu_amdkfd_init();
/* Ignore KFD init failures when CONFIG_HSA_AMD is not set. */
r = amdgpu_amdkfd_init();
if (r && r != -ENOENT)
goto error_fence;
if (amdgpu_pp_feature_mask & PP_OVERDRIVE_MASK) {
add_taint(TAINT_CPU_OUT_OF_SPEC, LOCKDEP_STILL_OK);


@ -314,7 +314,10 @@ void amdgpu_gmc_gart_location(struct amdgpu_device *adev, struct amdgpu_gmc *mc,
mc->gart_start = max_mc_address - mc->gart_size + 1;
break;
case AMDGPU_GART_PLACEMENT_LOW:
mc->gart_start = 0;
if (size_bf >= mc->gart_size)
mc->gart_start = 0;
else
mc->gart_start = ALIGN(mc->fb_end, four_gb);
break;
case AMDGPU_GART_PLACEMENT_BEST_FIT:
default:

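A toy calculation of the new AMDGPU_GART_PLACEMENT_LOW behaviour above, with made-up sizes (real values come from the board's VRAM and GART configuration): the GART only stays at address 0 if it fits in the hole below the framebuffer, otherwise it is 4 GiB-aligned past fb_end.

/* Standalone sketch of the placement decision; all sizes are hypothetical. */
#include <stdint.h>
#include <stdio.h>

#define ALIGN_UP(x, a) (((x) + (a) - 1) & ~((uint64_t)(a) - 1))

int main(void)
{
        const uint64_t four_gb = 0x100000000ULL;
        uint64_t fb_start  = 2ULL << 30;            /* FB aperture starts at 2 GiB   */
        uint64_t fb_end    = (10ULL << 30) - 1;     /* ...and ends just below 10 GiB */
        uint64_t gart_size = 4ULL << 30;            /* 4 GiB GART                    */
        uint64_t size_bf   = fb_start;              /* free space below the FB       */
        uint64_t gart_start;

        if (size_bf >= gart_size)
                gart_start = 0;
        else
                gart_start = ALIGN_UP(fb_end, four_gb);

        printf("gart_start = 0x%llx\n", (unsigned long long)gart_start);
        return 0;
}

With these numbers the 2 GiB hole below the FB is too small, so gart_start comes out at 0x300000000 (12 GiB) instead of overlapping the framebuffer.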

@ -873,68 +873,59 @@ int amdgpu_info_ioctl(struct drm_device *dev, void *data, struct drm_file *filp)
? -EFAULT : 0;
}
case AMDGPU_INFO_READ_MMR_REG: {
int ret = 0;
unsigned int n, alloc_size;
uint32_t *regs;
unsigned int se_num = (info->read_mmr_reg.instance >>
AMDGPU_INFO_MMR_SE_INDEX_SHIFT) &
AMDGPU_INFO_MMR_SE_INDEX_MASK;
unsigned int sh_num = (info->read_mmr_reg.instance >>
AMDGPU_INFO_MMR_SH_INDEX_SHIFT) &
AMDGPU_INFO_MMR_SH_INDEX_MASK;
if (!down_read_trylock(&adev->reset_domain->sem))
return -ENOENT;
unsigned int alloc_size;
uint32_t *regs;
int ret;
/* set full masks if the userspace set all bits
* in the bitfields
*/
if (se_num == AMDGPU_INFO_MMR_SE_INDEX_MASK) {
if (se_num == AMDGPU_INFO_MMR_SE_INDEX_MASK)
se_num = 0xffffffff;
} else if (se_num >= AMDGPU_GFX_MAX_SE) {
ret = -EINVAL;
goto out;
}
else if (se_num >= AMDGPU_GFX_MAX_SE)
return -EINVAL;
if (sh_num == AMDGPU_INFO_MMR_SH_INDEX_MASK) {
if (sh_num == AMDGPU_INFO_MMR_SH_INDEX_MASK)
sh_num = 0xffffffff;
} else if (sh_num >= AMDGPU_GFX_MAX_SH_PER_SE) {
ret = -EINVAL;
goto out;
}
else if (sh_num >= AMDGPU_GFX_MAX_SH_PER_SE)
return -EINVAL;
if (info->read_mmr_reg.count > 128) {
ret = -EINVAL;
goto out;
}
if (info->read_mmr_reg.count > 128)
return -EINVAL;
regs = kmalloc_array(info->read_mmr_reg.count, sizeof(*regs), GFP_KERNEL);
if (!regs) {
ret = -ENOMEM;
goto out;
}
regs = kmalloc_array(info->read_mmr_reg.count, sizeof(*regs),
GFP_KERNEL);
if (!regs)
return -ENOMEM;
down_read(&adev->reset_domain->sem);
alloc_size = info->read_mmr_reg.count * sizeof(*regs);
amdgpu_gfx_off_ctrl(adev, false);
ret = 0;
for (i = 0; i < info->read_mmr_reg.count; i++) {
if (amdgpu_asic_read_register(adev, se_num, sh_num,
info->read_mmr_reg.dword_offset + i,
&regs[i])) {
DRM_DEBUG_KMS("unallowed offset %#x\n",
info->read_mmr_reg.dword_offset + i);
kfree(regs);
amdgpu_gfx_off_ctrl(adev, true);
ret = -EFAULT;
goto out;
break;
}
}
amdgpu_gfx_off_ctrl(adev, true);
n = copy_to_user(out, regs, min(size, alloc_size));
kfree(regs);
ret = (n ? -EFAULT : 0);
out:
up_read(&adev->reset_domain->sem);
if (!ret) {
ret = copy_to_user(out, regs, min(size, alloc_size))
? -EFAULT : 0;
}
kfree(regs);
return ret;
}
case AMDGPU_INFO_DEV_INFO: {


@ -1950,7 +1950,7 @@ void amdgpu_ras_check_bad_page_status(struct amdgpu_device *adev)
if (!control || amdgpu_bad_page_threshold == 0)
return;
if (control->ras_num_bad_pages >= ras->bad_page_cnt_threshold) {
if (control->ras_num_bad_pages > ras->bad_page_cnt_threshold) {
if (amdgpu_dpm_send_rma_reason(adev))
dev_warn(adev->dev, "Unable to send out-of-band RMA CPER");
else


@ -75,6 +75,9 @@ static int amdgpu_ttm_init_on_chip(struct amdgpu_device *adev,
unsigned int type,
uint64_t size_in_page)
{
if (!size_in_page)
return 0;
return ttm_range_man_init(&adev->mman.bdev, type,
false, size_in_page);
}


@ -205,6 +205,19 @@ void amdgpu_userq_start_hang_detect_work(struct amdgpu_usermode_queue *queue)
msecs_to_jiffies(timeout_ms));
}
void amdgpu_userq_process_fence_irq(struct amdgpu_device *adev, u32 doorbell)
{
struct xarray *xa = &adev->userq_doorbell_xa;
struct amdgpu_usermode_queue *queue;
unsigned long flags;
xa_lock_irqsave(xa, flags);
queue = xa_load(xa, doorbell);
if (queue)
amdgpu_userq_fence_driver_process(queue->fence_drv);
xa_unlock_irqrestore(xa, flags);
}
static void amdgpu_userq_init_hang_detect_work(struct amdgpu_usermode_queue *queue)
{
INIT_DELAYED_WORK(&queue->hang_detect_work, amdgpu_userq_hang_detect_work);
@ -643,12 +656,6 @@ amdgpu_userq_destroy(struct amdgpu_userq_mgr *uq_mgr, struct amdgpu_usermode_que
#endif
amdgpu_userq_detect_and_reset_queues(uq_mgr);
r = amdgpu_userq_unmap_helper(queue);
/*TODO: It requires a reset for userq hw unmap error*/
if (r) {
drm_warn(adev_to_drm(uq_mgr->adev), "trying to destroy a HW mapping userq\n");
queue->state = AMDGPU_USERQ_STATE_HUNG;
}
atomic_dec(&uq_mgr->userq_count[queue->queue_type]);
amdgpu_userq_cleanup(queue);
mutex_unlock(&uq_mgr->userq_mutex);
@ -1187,7 +1194,7 @@ amdgpu_userq_vm_validate(struct amdgpu_userq_mgr *uq_mgr)
bo = range->bo;
ret = amdgpu_ttm_tt_get_user_pages(bo, range);
if (ret)
goto unlock_all;
goto free_ranges;
}
invalidated = true;
@ -1214,6 +1221,7 @@ amdgpu_userq_vm_validate(struct amdgpu_userq_mgr *uq_mgr)
unlock_all:
drm_exec_fini(&exec);
free_ranges:
xa_for_each(&xa, tmp_key, range) {
if (!range)
continue;


@ -156,6 +156,7 @@ void amdgpu_userq_reset_work(struct work_struct *work);
void amdgpu_userq_pre_reset(struct amdgpu_device *adev);
int amdgpu_userq_post_reset(struct amdgpu_device *adev, bool vram_lost);
void amdgpu_userq_start_hang_detect_work(struct amdgpu_usermode_queue *queue);
void amdgpu_userq_process_fence_irq(struct amdgpu_device *adev, u32 doorbell);
int amdgpu_userq_input_va_validate(struct amdgpu_device *adev,
struct amdgpu_usermode_queue *queue,


@ -3023,11 +3023,22 @@ bool amdgpu_vm_handle_fault(struct amdgpu_device *adev, u32 pasid,
is_compute_context = vm->is_compute_context;
if (is_compute_context && !svm_range_restore_pages(adev, pasid, vmid,
node_id, addr >> PAGE_SHIFT, ts, write_fault)) {
if (is_compute_context) {
/* Unreserve root since svm_range_restore_pages might try to reserve it. */
/* TODO: rework svm_range_restore_pages so that this isn't necessary. */
amdgpu_bo_unreserve(root);
if (!svm_range_restore_pages(adev, pasid, vmid,
node_id, addr >> PAGE_SHIFT, ts, write_fault)) {
amdgpu_bo_unref(&root);
return true;
}
amdgpu_bo_unref(&root);
return true;
/* Re-acquire the VM lock, could be that the VM was freed in between. */
vm = amdgpu_vm_lock_by_pasid(adev, &root, pasid);
if (!vm)
return false;
}
addr /= AMDGPU_GPU_PAGE_SIZE;


@ -6523,15 +6523,7 @@ static int gfx_v11_0_eop_irq(struct amdgpu_device *adev,
DRM_DEBUG("IH: CP EOP\n");
if (adev->enable_mes && doorbell_offset) {
struct amdgpu_usermode_queue *queue;
struct xarray *xa = &adev->userq_doorbell_xa;
unsigned long flags;
xa_lock_irqsave(xa, flags);
queue = xa_load(xa, doorbell_offset);
if (queue)
amdgpu_userq_fence_driver_process(queue->fence_drv);
xa_unlock_irqrestore(xa, flags);
amdgpu_userq_process_fence_irq(adev, doorbell_offset);
} else {
me_id = (entry->ring_id & 0x0c) >> 2;
pipe_id = (entry->ring_id & 0x03) >> 0;


@ -4854,15 +4854,7 @@ static int gfx_v12_0_eop_irq(struct amdgpu_device *adev,
DRM_DEBUG("IH: CP EOP\n");
if (adev->enable_mes && doorbell_offset) {
struct xarray *xa = &adev->userq_doorbell_xa;
struct amdgpu_usermode_queue *queue;
unsigned long flags;
xa_lock_irqsave(xa, flags);
queue = xa_load(xa, doorbell_offset);
if (queue)
amdgpu_userq_fence_driver_process(queue->fence_drv);
xa_unlock_irqrestore(xa, flags);
amdgpu_userq_process_fence_irq(adev, doorbell_offset);
} else {
me_id = (entry->ring_id & 0x0c) >> 2;
pipe_id = (entry->ring_id & 0x03) >> 0;


@ -3643,16 +3643,7 @@ static int gfx_v12_1_eop_irq(struct amdgpu_device *adev,
DRM_DEBUG("IH: CP EOP\n");
if (adev->enable_mes && doorbell_offset) {
struct xarray *xa = &adev->userq_doorbell_xa;
struct amdgpu_usermode_queue *queue;
unsigned long flags;
xa_lock_irqsave(xa, flags);
queue = xa_load(xa, doorbell_offset);
if (queue)
amdgpu_userq_fence_driver_process(queue->fence_drv);
xa_unlock_irqrestore(xa, flags);
amdgpu_userq_process_fence_irq(adev, doorbell_offset);
} else {
me_id = (entry->ring_id & 0x0c) >> 2;
pipe_id = (entry->ring_id & 0x03) >> 0;


@ -1571,6 +1571,71 @@ static void gfx_v6_0_setup_spi(struct amdgpu_device *adev)
mutex_unlock(&adev->grbm_idx_mutex);
}
/**
* gfx_v6_0_setup_tcc() - setup which TCCs are used
*
* @adev: amdgpu_device pointer
*
* Verify whether the current GPU has any TCCs disabled,
* which can happen when the GPU is harvested and some
* memory channels are disabled, reducing the memory bus width.
* For example, on the Radeon HD 7870 XT (Tahiti LE).
*
* If some TCCs are disabled, we need to make sure that
* the disabled TCCs are not used, and the remaining TCCs
* are used optimally.
*
* TCP_CHAN_STEER_LO/HI control which TCC is used by TCP channels.
* TCP_ADDR_CONFIG.NUM_TCC_BANKS controls how many channels are used.
*
* For optimal performance:
* - Rely on the CHAN_STEER from the golden registers table,
* only skip disabled TCCs but keep the mapping order.
* - Limit NUM_TCC_BANKS to number of active TCCs to avoid thrashing,
* which performs better than using the same TCC twice.
*/
static void gfx_v6_0_setup_tcc(struct amdgpu_device *adev)
{
u32 i, tcc, tcp_addr_config, num_active_tcc = 0;
u64 chan_steer, patched_chan_steer = 0;
const u32 num_max_tcc = adev->gfx.config.max_texture_channel_caches;
const u32 dis_tcc_mask =
amdgpu_gfx_create_bitmask(num_max_tcc) &
(REG_GET_FIELD(RREG32(mmCGTS_TCC_DISABLE),
CGTS_TCC_DISABLE, TCC_DISABLE) |
REG_GET_FIELD(RREG32(mmCGTS_USER_TCC_DISABLE),
CGTS_USER_TCC_DISABLE, TCC_DISABLE));
/* When no TCC is disabled, the golden registers table already has optimal TCC setup */
if (!dis_tcc_mask)
return;
/* Each 4-bit nibble contains the index of a TCC used by all TCPs */
chan_steer = RREG32(mmTCP_CHAN_STEER_LO) | ((u64)RREG32(mmTCP_CHAN_STEER_HI) << 32ull);
/* Patch the TCP to TCC mapping to skip disabled TCCs */
for (i = 0; i < num_max_tcc; ++i) {
tcc = (chan_steer >> (u64)(4 * i)) & 0xf;
if (!((1 << tcc) & dis_tcc_mask)) {
/* Copy enabled TCC indices to the patched register value. */
patched_chan_steer |= (u64)tcc << (u64)(4 * num_active_tcc);
++num_active_tcc;
}
}
WARN_ON(num_active_tcc != num_max_tcc - hweight32(dis_tcc_mask));
/* Patch number of TCCs used by TCPs */
tcp_addr_config = REG_SET_FIELD(RREG32(mmTCP_ADDR_CONFIG),
TCP_ADDR_CONFIG, NUM_TCC_BANKS,
num_active_tcc - 1);
WREG32(mmTCP_ADDR_CONFIG, tcp_addr_config);
WREG32(mmTCP_CHAN_STEER_HI, upper_32_bits(patched_chan_steer));
WREG32(mmTCP_CHAN_STEER_LO, lower_32_bits(patched_chan_steer));
}
static void gfx_v6_0_config_init(struct amdgpu_device *adev)
{
adev->gfx.config.double_offchip_lds_buf = 0;
@ -1729,6 +1794,7 @@ static void gfx_v6_0_constants_init(struct amdgpu_device *adev)
gfx_v6_0_tiling_mode_table_init(adev);
gfx_v6_0_setup_rb(adev);
gfx_v6_0_setup_tcc(adev);
gfx_v6_0_setup_spi(adev);

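For reference, a standalone sketch of the TCP_CHAN_STEER nibble remapping that gfx_v6_0_setup_tcc() performs above; chan_steer and dis_tcc_mask below are made-up values, not read from any real board.

/* Pack the indices of the still-enabled TCCs into the low nibbles,
 * preserving their original order. Inputs are hypothetical.
 */
#include <stdint.h>
#include <stdio.h>

int main(void)
{
        uint64_t chan_steer = 0x76543210;   /* one 4-bit TCC index per channel   */
        uint32_t dis_tcc_mask = 0x0a;       /* pretend TCCs 1 and 3 are disabled */
        unsigned int num_max_tcc = 8, num_active = 0, i;
        uint64_t patched = 0;

        for (i = 0; i < num_max_tcc; ++i) {
                unsigned int tcc = (chan_steer >> (4 * i)) & 0xf;

                if (!((1u << tcc) & dis_tcc_mask)) {
                        patched |= (uint64_t)tcc << (4 * num_active);
                        ++num_active;
                }
        }

        /* Here: patched = 0x765420, num_active = 6. */
        printf("patched steer 0x%llx, %u active TCCs\n",
               (unsigned long long)patched, num_active);
        return 0;
}

With TCCs 1 and 3 disabled, the six remaining indices are packed into the low nibbles in their original order, and NUM_TCC_BANKS is then programmed to num_active - 1 as in the kernel code above.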

@ -802,6 +802,7 @@ static const struct amd_ip_funcs jpeg_v2_0_ip_funcs = {
static const struct amdgpu_ring_funcs jpeg_v2_0_dec_ring_vm_funcs = {
.type = AMDGPU_RING_TYPE_VCN_JPEG,
.align_mask = 0xf,
.no_user_fence = true,
.get_rptr = jpeg_v2_0_dec_ring_get_rptr,
.get_wptr = jpeg_v2_0_dec_ring_get_wptr,
.set_wptr = jpeg_v2_0_dec_ring_set_wptr,


@ -693,6 +693,7 @@ static const struct amd_ip_funcs jpeg_v2_6_ip_funcs = {
static const struct amdgpu_ring_funcs jpeg_v2_5_dec_ring_vm_funcs = {
.type = AMDGPU_RING_TYPE_VCN_JPEG,
.align_mask = 0xf,
.no_user_fence = true,
.get_rptr = jpeg_v2_5_dec_ring_get_rptr,
.get_wptr = jpeg_v2_5_dec_ring_get_wptr,
.set_wptr = jpeg_v2_5_dec_ring_set_wptr,
@ -724,6 +725,7 @@ static const struct amdgpu_ring_funcs jpeg_v2_5_dec_ring_vm_funcs = {
static const struct amdgpu_ring_funcs jpeg_v2_6_dec_ring_vm_funcs = {
.type = AMDGPU_RING_TYPE_VCN_JPEG,
.align_mask = 0xf,
.no_user_fence = true,
.get_rptr = jpeg_v2_5_dec_ring_get_rptr,
.get_wptr = jpeg_v2_5_dec_ring_get_wptr,
.set_wptr = jpeg_v2_5_dec_ring_set_wptr,


@ -594,6 +594,7 @@ static const struct amd_ip_funcs jpeg_v3_0_ip_funcs = {
static const struct amdgpu_ring_funcs jpeg_v3_0_dec_ring_vm_funcs = {
.type = AMDGPU_RING_TYPE_VCN_JPEG,
.align_mask = 0xf,
.no_user_fence = true,
.get_rptr = jpeg_v3_0_dec_ring_get_rptr,
.get_wptr = jpeg_v3_0_dec_ring_get_wptr,
.set_wptr = jpeg_v3_0_dec_ring_set_wptr,


@ -759,6 +759,7 @@ static const struct amd_ip_funcs jpeg_v4_0_ip_funcs = {
static const struct amdgpu_ring_funcs jpeg_v4_0_dec_ring_vm_funcs = {
.type = AMDGPU_RING_TYPE_VCN_JPEG,
.align_mask = 0xf,
.no_user_fence = true,
.get_rptr = jpeg_v4_0_dec_ring_get_rptr,
.get_wptr = jpeg_v4_0_dec_ring_get_wptr,
.set_wptr = jpeg_v4_0_dec_ring_set_wptr,


@ -1219,6 +1219,7 @@ static const struct amd_ip_funcs jpeg_v4_0_3_ip_funcs = {
static const struct amdgpu_ring_funcs jpeg_v4_0_3_dec_ring_vm_funcs = {
.type = AMDGPU_RING_TYPE_VCN_JPEG,
.align_mask = 0xf,
.no_user_fence = true,
.get_rptr = jpeg_v4_0_3_dec_ring_get_rptr,
.get_wptr = jpeg_v4_0_3_dec_ring_get_wptr,
.set_wptr = jpeg_v4_0_3_dec_ring_set_wptr,


@ -804,6 +804,7 @@ static const struct amd_ip_funcs jpeg_v4_0_5_ip_funcs = {
static const struct amdgpu_ring_funcs jpeg_v4_0_5_dec_ring_vm_funcs = {
.type = AMDGPU_RING_TYPE_VCN_JPEG,
.align_mask = 0xf,
.no_user_fence = true,
.get_rptr = jpeg_v4_0_5_dec_ring_get_rptr,
.get_wptr = jpeg_v4_0_5_dec_ring_get_wptr,
.set_wptr = jpeg_v4_0_5_dec_ring_set_wptr,


@ -680,6 +680,7 @@ static const struct amd_ip_funcs jpeg_v5_0_0_ip_funcs = {
static const struct amdgpu_ring_funcs jpeg_v5_0_0_dec_ring_vm_funcs = {
.type = AMDGPU_RING_TYPE_VCN_JPEG,
.align_mask = 0xf,
.no_user_fence = true,
.get_rptr = jpeg_v5_0_0_dec_ring_get_rptr,
.get_wptr = jpeg_v5_0_0_dec_ring_get_wptr,
.set_wptr = jpeg_v5_0_0_dec_ring_set_wptr,


@ -884,6 +884,7 @@ static const struct amd_ip_funcs jpeg_v5_0_1_ip_funcs = {
static const struct amdgpu_ring_funcs jpeg_v5_0_1_dec_ring_vm_funcs = {
.type = AMDGPU_RING_TYPE_VCN_JPEG,
.align_mask = 0xf,
.no_user_fence = true,
.get_rptr = jpeg_v5_0_1_dec_ring_get_rptr,
.get_wptr = jpeg_v5_0_1_dec_ring_get_wptr,
.set_wptr = jpeg_v5_0_1_dec_ring_set_wptr,


@ -703,6 +703,7 @@ static const struct amd_ip_funcs jpeg_v5_0_2_ip_funcs = {
static const struct amdgpu_ring_funcs jpeg_v5_0_2_dec_ring_vm_funcs = {
.type = AMDGPU_RING_TYPE_VCN_JPEG,
.align_mask = 0xf,
.no_user_fence = true,
.get_rptr = jpeg_v5_0_2_dec_ring_get_rptr,
.get_wptr = jpeg_v5_0_2_dec_ring_get_wptr,
.set_wptr = jpeg_v5_0_2_dec_ring_set_wptr,


@ -661,6 +661,7 @@ static const struct amd_ip_funcs jpeg_v5_3_0_ip_funcs = {
static const struct amdgpu_ring_funcs jpeg_v5_3_0_dec_ring_vm_funcs = {
.type = AMDGPU_RING_TYPE_VCN_JPEG,
.align_mask = 0xf,
.no_user_fence = true,
.get_rptr = jpeg_v5_3_0_dec_ring_get_rptr,
.get_wptr = jpeg_v5_3_0_dec_ring_get_wptr,
.set_wptr = jpeg_v5_3_0_dec_ring_set_wptr,


@ -1662,17 +1662,8 @@ static int sdma_v6_0_process_fence_irq(struct amdgpu_device *adev,
u32 doorbell_offset = entry->src_data[0];
if (adev->enable_mes && doorbell_offset) {
struct amdgpu_usermode_queue *queue;
struct xarray *xa = &adev->userq_doorbell_xa;
unsigned long flags;
doorbell_offset >>= SDMA0_QUEUE0_DOORBELL_OFFSET__OFFSET__SHIFT;
xa_lock_irqsave(xa, flags);
queue = xa_load(xa, doorbell_offset);
if (queue)
amdgpu_userq_fence_driver_process(queue->fence_drv);
xa_unlock_irqrestore(xa, flags);
amdgpu_userq_process_fence_irq(adev, doorbell_offset);
}
return 0;


@ -1594,17 +1594,8 @@ static int sdma_v7_0_process_fence_irq(struct amdgpu_device *adev,
u32 doorbell_offset = entry->src_data[0];
if (adev->enable_mes && doorbell_offset) {
struct xarray *xa = &adev->userq_doorbell_xa;
struct amdgpu_usermode_queue *queue;
unsigned long flags;
doorbell_offset >>= SDMA0_QUEUE0_DOORBELL_OFFSET__OFFSET__SHIFT;
xa_lock_irqsave(xa, flags);
queue = xa_load(xa, doorbell_offset);
if (queue)
amdgpu_userq_fence_driver_process(queue->fence_drv);
xa_unlock_irqrestore(xa, flags);
amdgpu_userq_process_fence_irq(adev, doorbell_offset);
}
return 0;


@ -242,6 +242,10 @@ static void uvd_v3_1_mc_resume(struct amdgpu_device *adev)
uint64_t addr;
uint32_t size;
/* When the keyselect is already set, don't perturb it. */
if (RREG32(mmUVD_FW_START))
return;
/* program the VCPU memory controller bits 0-27 */
addr = (adev->uvd.inst->gpu_addr + AMDGPU_UVD_FIRMWARE_OFFSET) >> 3;
size = AMDGPU_UVD_FIRMWARE_SIZE(adev) >> 3;
@ -284,6 +288,12 @@ static int uvd_v3_1_fw_validate(struct amdgpu_device *adev)
int i;
uint32_t keysel = adev->uvd.keyselect;
if (RREG32(mmUVD_FW_START) & UVD_FW_STATUS__PASS_MASK) {
dev_dbg(adev->dev, "UVD keyselect already set: 0x%x (on CPU: 0x%x)\n",
RREG32(mmUVD_FW_START), adev->uvd.keyselect);
return 0;
}
WREG32(mmUVD_FW_START, keysel);
for (i = 0; i < 10; ++i) {


@ -2113,6 +2113,7 @@ static const struct amd_ip_funcs vcn_v2_0_ip_funcs = {
static const struct amdgpu_ring_funcs vcn_v2_0_dec_ring_vm_funcs = {
.type = AMDGPU_RING_TYPE_VCN_DEC,
.align_mask = 0xf,
.no_user_fence = true,
.secure_submission_supported = true,
.get_rptr = vcn_v2_0_dec_ring_get_rptr,
.get_wptr = vcn_v2_0_dec_ring_get_wptr,
@ -2145,6 +2146,7 @@ static const struct amdgpu_ring_funcs vcn_v2_0_enc_ring_vm_funcs = {
.type = AMDGPU_RING_TYPE_VCN_ENC,
.align_mask = 0x3f,
.nop = VCN_ENC_CMD_NO_OP,
.no_user_fence = true,
.get_rptr = vcn_v2_0_enc_ring_get_rptr,
.get_wptr = vcn_v2_0_enc_ring_get_wptr,
.set_wptr = vcn_v2_0_enc_ring_set_wptr,


@ -1778,6 +1778,7 @@ static void vcn_v2_5_dec_ring_set_wptr(struct amdgpu_ring *ring)
static const struct amdgpu_ring_funcs vcn_v2_5_dec_ring_vm_funcs = {
.type = AMDGPU_RING_TYPE_VCN_DEC,
.align_mask = 0xf,
.no_user_fence = true,
.secure_submission_supported = true,
.get_rptr = vcn_v2_5_dec_ring_get_rptr,
.get_wptr = vcn_v2_5_dec_ring_get_wptr,
@ -1879,6 +1880,7 @@ static const struct amdgpu_ring_funcs vcn_v2_5_enc_ring_vm_funcs = {
.type = AMDGPU_RING_TYPE_VCN_ENC,
.align_mask = 0x3f,
.nop = VCN_ENC_CMD_NO_OP,
.no_user_fence = true,
.get_rptr = vcn_v2_5_enc_ring_get_rptr,
.get_wptr = vcn_v2_5_enc_ring_get_wptr,
.set_wptr = vcn_v2_5_enc_ring_set_wptr,


@ -1856,6 +1856,7 @@ static const struct amdgpu_ring_funcs vcn_v3_0_dec_sw_ring_vm_funcs = {
.type = AMDGPU_RING_TYPE_VCN_DEC,
.align_mask = 0x3f,
.nop = VCN_DEC_SW_CMD_NO_OP,
.no_user_fence = true,
.secure_submission_supported = true,
.get_rptr = vcn_v3_0_dec_ring_get_rptr,
.get_wptr = vcn_v3_0_dec_ring_get_wptr,
@ -1972,6 +1973,7 @@ static int vcn_v3_0_dec_msg(struct amdgpu_cs_parser *p, struct amdgpu_job *job,
for (i = 0, msg = &msg[6]; i < num_buffers; ++i, msg += 4) {
uint32_t offset, size, *create;
uint64_t buf_end;
if (msg[0] != RDECODE_MESSAGE_CREATE)
continue;
@ -1979,7 +1981,8 @@ static int vcn_v3_0_dec_msg(struct amdgpu_cs_parser *p, struct amdgpu_job *job,
offset = msg[1];
size = msg[2];
if (size < 4 || offset + size > end - addr) {
if (size < 4 || check_add_overflow(offset, size, &buf_end) ||
buf_end > end - addr) {
DRM_ERROR("VCN message buffer exceeds BO bounds!\n");
r = -EINVAL;
goto out;
@ -2036,6 +2039,7 @@ static int vcn_v3_0_ring_patch_cs_in_place(struct amdgpu_cs_parser *p,
static const struct amdgpu_ring_funcs vcn_v3_0_dec_ring_vm_funcs = {
.type = AMDGPU_RING_TYPE_VCN_DEC,
.align_mask = 0xf,
.no_user_fence = true,
.secure_submission_supported = true,
.get_rptr = vcn_v3_0_dec_ring_get_rptr,
.get_wptr = vcn_v3_0_dec_ring_get_wptr,
@ -2138,6 +2142,7 @@ static const struct amdgpu_ring_funcs vcn_v3_0_enc_ring_vm_funcs = {
.type = AMDGPU_RING_TYPE_VCN_ENC,
.align_mask = 0x3f,
.nop = VCN_ENC_CMD_NO_OP,
.no_user_fence = true,
.get_rptr = vcn_v3_0_enc_ring_get_rptr,
.get_wptr = vcn_v3_0_enc_ring_get_wptr,
.set_wptr = vcn_v3_0_enc_ring_set_wptr,

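The bounds-check change in vcn_v3_0_dec_msg() above (and the identical VCN 4.0 change below) exists because the old 32-bit sum offset + size could wrap around and slip past the test. A small standalone illustration with hostile made-up values, using the compiler builtin that the kernel's check_add_overflow() wraps:

/* Why the widened, overflow-checked sum matters; all values are made up. */
#include <stdint.h>
#include <stdio.h>

int main(void)
{
        uint32_t offset = 0xfffffff0u, size = 0x20u;  /* attacker-controlled         */
        uint64_t bo_span = 0x10000;                   /* pretend the BO spans 64 KiB */
        uint64_t buf_end;

        /* Old-style test: the 32-bit sum wraps to 0x10 and looks in-bounds. */
        if ((uint32_t)(offset + size) <= bo_span)
                printf("old check: wrongly accepted (sum wrapped to 0x%x)\n",
                       offset + size);

        /* New-style test: the checked add yields the true 64-bit end offset. */
        if (__builtin_add_overflow(offset, size, &buf_end) || buf_end > bo_span)
                printf("new check: rejected, buf_end = 0x%llx\n",
                       (unsigned long long)buf_end);
        return 0;
}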

@ -1889,6 +1889,7 @@ static int vcn_v4_0_dec_msg(struct amdgpu_cs_parser *p, struct amdgpu_job *job,
for (i = 0, msg = &msg[6]; i < num_buffers; ++i, msg += 4) {
uint32_t offset, size, *create;
uint64_t buf_end;
if (msg[0] != RDECODE_MESSAGE_CREATE)
continue;
@ -1896,7 +1897,8 @@ static int vcn_v4_0_dec_msg(struct amdgpu_cs_parser *p, struct amdgpu_job *job,
offset = msg[1];
size = msg[2];
if (size < 4 || offset + size > end - addr) {
if (size < 4 || check_add_overflow(offset, size, &buf_end) ||
buf_end > end - addr) {
DRM_ERROR("VCN message buffer exceeds BO bounds!\n");
r = -EINVAL;
goto out;
@ -1994,6 +1996,7 @@ static struct amdgpu_ring_funcs vcn_v4_0_unified_ring_vm_funcs = {
.type = AMDGPU_RING_TYPE_VCN_ENC,
.align_mask = 0x3f,
.nop = VCN_ENC_CMD_NO_OP,
.no_user_fence = true,
.extra_bytes = sizeof(struct amdgpu_vcn_rb_metadata),
.get_rptr = vcn_v4_0_unified_ring_get_rptr,
.get_wptr = vcn_v4_0_unified_ring_get_wptr,


@ -1775,6 +1775,7 @@ static const struct amdgpu_ring_funcs vcn_v4_0_3_unified_ring_vm_funcs = {
.type = AMDGPU_RING_TYPE_VCN_ENC,
.align_mask = 0x3f,
.nop = VCN_ENC_CMD_NO_OP,
.no_user_fence = true,
.get_rptr = vcn_v4_0_3_unified_ring_get_rptr,
.get_wptr = vcn_v4_0_3_unified_ring_get_wptr,
.set_wptr = vcn_v4_0_3_unified_ring_set_wptr,


@ -1483,6 +1483,7 @@ static struct amdgpu_ring_funcs vcn_v4_0_5_unified_ring_vm_funcs = {
.type = AMDGPU_RING_TYPE_VCN_ENC,
.align_mask = 0x3f,
.nop = VCN_ENC_CMD_NO_OP,
.no_user_fence = true,
.get_rptr = vcn_v4_0_5_unified_ring_get_rptr,
.get_wptr = vcn_v4_0_5_unified_ring_get_wptr,
.set_wptr = vcn_v4_0_5_unified_ring_set_wptr,


@ -1207,6 +1207,7 @@ static const struct amdgpu_ring_funcs vcn_v5_0_0_unified_ring_vm_funcs = {
.type = AMDGPU_RING_TYPE_VCN_ENC,
.align_mask = 0x3f,
.nop = VCN_ENC_CMD_NO_OP,
.no_user_fence = true,
.get_rptr = vcn_v5_0_0_unified_ring_get_rptr,
.get_wptr = vcn_v5_0_0_unified_ring_get_wptr,
.set_wptr = vcn_v5_0_0_unified_ring_set_wptr,


@ -1419,6 +1419,7 @@ static const struct amdgpu_ring_funcs vcn_v5_0_1_unified_ring_vm_funcs = {
.type = AMDGPU_RING_TYPE_VCN_ENC,
.align_mask = 0x3f,
.nop = VCN_ENC_CMD_NO_OP,
.no_user_fence = true,
.get_rptr = vcn_v5_0_1_unified_ring_get_rptr,
.get_wptr = vcn_v5_0_1_unified_ring_get_wptr,
.set_wptr = vcn_v5_0_1_unified_ring_set_wptr,


@ -994,6 +994,7 @@ static const struct amdgpu_ring_funcs vcn_v5_0_2_unified_ring_vm_funcs = {
.type = AMDGPU_RING_TYPE_VCN_ENC,
.align_mask = 0x3f,
.nop = VCN_ENC_CMD_NO_OP,
.no_user_fence = true,
.get_rptr = vcn_v5_0_2_unified_ring_get_rptr,
.get_wptr = vcn_v5_0_2_unified_ring_get_wptr,
.set_wptr = vcn_v5_0_2_unified_ring_set_wptr,


@ -25,6 +25,7 @@
#include <linux/err.h>
#include <linux/fs.h>
#include <linux/file.h>
#include <linux/overflow.h>
#include <linux/sched.h>
#include <linux/slab.h>
#include <linux/uaccess.h>
@ -1695,6 +1696,16 @@ static int kfd_ioctl_smi_events(struct file *filep,
return kfd_smi_event_open(pdd->dev, &args->anon_fd);
}
static int kfd_ioctl_svm_validate(void *kdata, unsigned int usize)
{
struct kfd_ioctl_svm_args *args = kdata;
size_t expected = struct_size(args, attrs, args->nattr);
if (expected == SIZE_MAX || usize < expected)
return -EINVAL;
return 0;
}
#if IS_ENABLED(CONFIG_HSA_AMD_SVM)
static int kfd_ioctl_set_xnack_mode(struct file *filep,
@ -3209,7 +3220,11 @@ static int kfd_ioctl_create_process(struct file *filep, struct kfd_process *p, v
#define AMDKFD_IOCTL_DEF(ioctl, _func, _flags) \
[_IOC_NR(ioctl)] = {.cmd = ioctl, .func = _func, .flags = _flags, \
.cmd_drv = 0, .name = #ioctl}
.validate = NULL, .cmd_drv = 0, .name = #ioctl}
#define AMDKFD_IOCTL_DEF_V(ioctl, _func, _validate, _flags) \
[_IOC_NR(ioctl)] = {.cmd = ioctl, .func = _func, .flags = _flags, \
.validate = _validate, .cmd_drv = 0, .name = #ioctl}
/** Ioctl table */
static const struct amdkfd_ioctl_desc amdkfd_ioctls[] = {
@ -3306,7 +3321,8 @@ static const struct amdkfd_ioctl_desc amdkfd_ioctls[] = {
AMDKFD_IOCTL_DEF(AMDKFD_IOC_SMI_EVENTS,
kfd_ioctl_smi_events, 0),
AMDKFD_IOCTL_DEF(AMDKFD_IOC_SVM, kfd_ioctl_svm, 0),
AMDKFD_IOCTL_DEF_V(AMDKFD_IOC_SVM, kfd_ioctl_svm,
kfd_ioctl_svm_validate, 0),
AMDKFD_IOCTL_DEF(AMDKFD_IOC_SET_XNACK_MODE,
kfd_ioctl_set_xnack_mode, 0),
@ -3431,6 +3447,12 @@ static long kfd_ioctl(struct file *filep, unsigned int cmd, unsigned long arg)
memset(kdata, 0, usize);
}
if (ioctl->validate) {
retcode = ioctl->validate(kdata, usize);
if (retcode)
goto err_i1;
}
retcode = func(filep, process, kdata);
if (cmd & IOC_OUT)

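The new validate hook shown above leans on struct_size() saturating to SIZE_MAX on multiply overflow, so a single comparison rejects both an overflowing nattr and a payload shorter than the flexible attrs array. A rough standalone analogue of that pattern (the struct layout and names here are illustrative, not the real uAPI):

#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

struct svm_args {
        uint64_t start_addr;
        uint64_t size;
        uint32_t op;
        uint32_t nattr;
        uint64_t attrs[];               /* flexible array sized by nattr */
};

/* Rough analogue of struct_size(): saturate to SIZE_MAX on overflow. */
static size_t svm_args_size(uint32_t nattr)
{
        size_t bytes;

        if (__builtin_mul_overflow((size_t)nattr, sizeof(uint64_t), &bytes) ||
            __builtin_add_overflow(bytes, sizeof(struct svm_args), &bytes))
                return SIZE_MAX;
        return bytes;
}

static int svm_validate(const struct svm_args *args, size_t usize)
{
        size_t expected = svm_args_size(args->nattr);

        return (expected == SIZE_MAX || usize < expected) ? -1 : 0;
}

int main(void)
{
        struct svm_args a = { .nattr = 4 };

        printf("full payload: %d\n", svm_validate(&a, sizeof(a) + 4 * sizeof(uint64_t)));
        printf("short payload: %d\n", svm_validate(&a, sizeof(a)));
        return 0;
}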

@ -1047,10 +1047,13 @@ extern struct srcu_struct kfd_processes_srcu;
typedef int amdkfd_ioctl_t(struct file *filep, struct kfd_process *p,
void *data);
typedef int amdkfd_ioctl_validate_t(void *kdata, unsigned int usize);
struct amdkfd_ioctl_desc {
unsigned int cmd;
int flags;
amdkfd_ioctl_t *func;
amdkfd_ioctl_validate_t *validate;
unsigned int cmd_drv;
const char *name;
};


@ -1366,6 +1366,12 @@ svm_range_unmap_from_gpu(struct amdgpu_device *adev, struct amdgpu_vm *vm,
pr_debug("CPU[0x%llx 0x%llx] -> GPU[0x%llx 0x%llx]\n", start, last,
gpu_start, gpu_end);
if (!amdgpu_vm_ready(vm)) {
pr_debug("VM not ready, canceling unmap\n");
return -EINVAL;
}
return amdgpu_vm_update_range(adev, vm, false, true, true, false, NULL, gpu_start,
gpu_end, init_pte_value, 0, 0, NULL, NULL,
fence);
@ -1443,6 +1449,11 @@ svm_range_map_to_gpu(struct kfd_process_device *pdd, struct svm_range *prange,
pr_debug("svms 0x%p [0x%lx 0x%lx] readonly %d\n", prange->svms,
last_start, last_start + npages - 1, readonly);
if (!amdgpu_vm_ready(vm)) {
pr_debug("VM not ready, canceling map\n");
return -EINVAL;
}
for (i = offset; i < offset + npages; i++) {
uint64_t gpu_start;
uint64_t gpu_end;


@ -1903,7 +1903,11 @@ static int amdgpu_dm_init(struct amdgpu_device *adev)
goto error;
}
init_data.asic_id.chip_family = adev->family;
/* special handling for early revisions of GC 11.5.4 */
if (amdgpu_ip_version(adev, GC_HWIP, 0) == IP_VERSION(11, 5, 4))
init_data.asic_id.chip_family = AMDGPU_FAMILY_GC_11_5_4;
else
init_data.asic_id.chip_family = adev->family;
init_data.asic_id.pci_revision_id = adev->pdev->revision;
init_data.asic_id.hw_internal_rev = adev->external_rev_id;
@ -9404,9 +9408,21 @@ static void manage_dm_interrupts(struct amdgpu_device *adev,
if (acrtc_state) {
timing = &acrtc_state->stream->timing;
if (amdgpu_ip_version(adev, DCE_HWIP, 0) <
IP_VERSION(3, 5, 0) ||
!(adev->flags & AMD_IS_APU)) {
if (amdgpu_ip_version(adev, DCE_HWIP, 0) >=
IP_VERSION(3, 2, 0) &&
!(adev->flags & AMD_IS_APU)) {
/*
* DGPUs NV3x and newer that support idle optimizations
* experience intermittent flip-done timeouts on cursor
* updates. Restore 5s offdelay behavior for now.
*
* Discussion on the issue:
* https://lore.kernel.org/amd-gfx/20260217191632.1243826-1-sysdadmin@m1k.cloud/
*/
config.offdelay_ms = 5000;
config.disable_immediate = false;
} else if (amdgpu_ip_version(adev, DCE_HWIP, 0) <
IP_VERSION(3, 5, 0)) {
/*
* Older HW and DGPU have issues with instant off;
* use a 2 frame offdelay.


@ -1032,6 +1032,45 @@ dm_helpers_read_acpi_edid(struct amdgpu_dm_connector *aconnector)
return drm_edid_read_custom(connector, dm_helpers_probe_acpi_edid, connector);
}
static const struct drm_edid *
dm_helpers_read_vbios_hardcoded_edid(struct dc_link *link, struct amdgpu_dm_connector *aconnector)
{
struct dc_bios *bios = link->ctx->dc_bios;
struct embedded_panel_info info;
const struct drm_edid *edid;
enum bp_result r;
if (!dc_is_embedded_signal(link->connector_signal) ||
!bios->funcs->get_embedded_panel_info)
return NULL;
memset(&info, 0, sizeof(info));
r = bios->funcs->get_embedded_panel_info(bios, &info);
if (r != BP_RESULT_OK) {
dm_error("Error when reading embedded panel info: %u\n", r);
return NULL;
}
if (!info.fake_edid || !info.fake_edid_size) {
dm_error("Embedded panel info doesn't contain an EDID\n");
return NULL;
}
edid = drm_edid_alloc(info.fake_edid, info.fake_edid_size);
if (!drm_edid_valid(edid)) {
dm_error("EDID from embedded panel info is invalid\n");
drm_edid_free(edid);
return NULL;
}
aconnector->base.display_info.width_mm = info.panel_width_mm;
aconnector->base.display_info.height_mm = info.panel_height_mm;
return edid;
}
void populate_hdmi_info_from_connector(struct drm_hdmi_info *hdmi, struct dc_edid_caps *edid_caps)
{
edid_caps->scdc_present = hdmi->scdc.supported;
@ -1052,6 +1091,9 @@ enum dc_edid_status dm_helpers_read_local_edid(
if (link->aux_mode)
ddc = &aconnector->dm_dp_aux.aux.ddc;
else if (link->ddc_hw_inst == GPIO_DDC_LINE_UNKNOWN &&
dc_is_embedded_signal(link->connector_signal))
ddc = NULL;
else
ddc = &aconnector->i2c->base;
@ -1065,6 +1107,8 @@ enum dc_edid_status dm_helpers_read_local_edid(
drm_edid = dm_helpers_read_acpi_edid(aconnector);
if (drm_edid)
drm_info(connector->dev, "Using ACPI provided EDID for %s\n", connector->name);
else if (!ddc)
drm_edid = dm_helpers_read_vbios_hardcoded_edid(link, aconnector);
else
drm_edid = drm_edid_read_ddc(connector, ddc);
drm_edid_connector_update(connector, drm_edid);


@ -794,11 +794,13 @@ static enum bp_result bios_parser_external_encoder_control(
static enum bp_result bios_parser_dac_load_detection(
struct dc_bios *dcb,
enum engine_id engine_id)
enum engine_id engine_id,
struct graphics_object_id ext_enc_id)
{
struct bios_parser *bp = BP_FROM_DCB(dcb);
struct dc_context *ctx = dcb->ctx;
struct bp_load_detection_parameters bp_params = {0};
struct bp_external_encoder_control ext_cntl = {0};
enum bp_result bp_result = BP_RESULT_UNSUPPORTED;
uint32_t bios_0_scratch;
uint32_t device_id_mask = 0;
@ -824,6 +826,13 @@ static enum bp_result bios_parser_dac_load_detection(
bp_params.engine_id = engine_id;
bp_result = bp->cmd_tbl.dac_load_detection(bp, &bp_params);
} else if (ext_enc_id.id) {
if (!bp->cmd_tbl.external_encoder_control)
return BP_RESULT_UNSUPPORTED;
ext_cntl.action = EXTERNAL_ENCODER_CONTROL_DAC_LOAD_DETECT;
ext_cntl.encoder_id = ext_enc_id;
bp_result = bp->cmd_tbl.external_encoder_control(bp, &ext_cntl);
}
if (bp_result != BP_RESULT_OK)
@ -1304,6 +1313,60 @@ static enum bp_result bios_parser_get_embedded_panel_info(
return BP_RESULT_FAILURE;
}
static enum bp_result get_embedded_panel_extra_info(
struct bios_parser *bp,
struct embedded_panel_info *info,
const uint32_t table_offset)
{
uint8_t *record = bios_get_image(&bp->base, table_offset, 1);
ATOM_PANEL_RESOLUTION_PATCH_RECORD *panel_res_record;
ATOM_FAKE_EDID_PATCH_RECORD *fake_edid_record;
while (*record != ATOM_RECORD_END_TYPE) {
switch (*record) {
case LCD_MODE_PATCH_RECORD_MODE_TYPE:
record += sizeof(ATOM_PATCH_RECORD_MODE);
break;
case LCD_RTS_RECORD_TYPE:
record += sizeof(ATOM_LCD_RTS_RECORD);
break;
case LCD_CAP_RECORD_TYPE:
record += sizeof(ATOM_LCD_MODE_CONTROL_CAP);
break;
case LCD_FAKE_EDID_PATCH_RECORD_TYPE:
fake_edid_record = (ATOM_FAKE_EDID_PATCH_RECORD *)record;
if (fake_edid_record->ucFakeEDIDLength) {
if (fake_edid_record->ucFakeEDIDLength == 128)
info->fake_edid_size =
fake_edid_record->ucFakeEDIDLength;
else
info->fake_edid_size =
fake_edid_record->ucFakeEDIDLength * 128;
info->fake_edid = fake_edid_record->ucFakeEDIDString;
record += struct_size(fake_edid_record,
ucFakeEDIDString,
info->fake_edid_size);
} else {
/* empty fake edid record must be 3 bytes long */
record += sizeof(ATOM_FAKE_EDID_PATCH_RECORD) + 1;
}
break;
case LCD_PANEL_RESOLUTION_RECORD_TYPE:
panel_res_record = (ATOM_PANEL_RESOLUTION_PATCH_RECORD *)record;
info->panel_width_mm = panel_res_record->usHSize;
info->panel_height_mm = panel_res_record->usVSize;
record += sizeof(ATOM_PANEL_RESOLUTION_PATCH_RECORD);
break;
default:
return BP_RESULT_BADBIOSTABLE;
}
}
return BP_RESULT_OK;
}
static enum bp_result get_embedded_panel_info_v1_2(
struct bios_parser *bp,
struct embedded_panel_info *info)
@ -1420,6 +1483,10 @@ static enum bp_result get_embedded_panel_info_v1_2(
if (ATOM_PANEL_MISC_API_ENABLED & lvds->ucLVDS_Misc)
info->lcd_timing.misc_info.API_ENABLED = true;
if (lvds->usExtInfoTableOffset)
return get_embedded_panel_extra_info(bp, info,
le16_to_cpu(lvds->usExtInfoTableOffset) + DATA_TABLES(LCD_Info));
return BP_RESULT_OK;
}
@ -1545,6 +1612,10 @@ static enum bp_result get_embedded_panel_info_v1_3(
(uint32_t) (ATOM_PANEL_MISC_V13_GREY_LEVEL &
lvds->ucLCD_Misc) >> ATOM_PANEL_MISC_V13_GREY_LEVEL_SHIFT;
if (lvds->usExtInfoTableOffset)
return get_embedded_panel_extra_info(bp, info,
le16_to_cpu(lvds->usExtInfoTableOffset) + DATA_TABLES(LCD_Info));
return BP_RESULT_OK;
}

View File

@ -1682,7 +1682,7 @@ struct dc_scratch_space {
struct dc_link_training_overrides preferred_training_settings;
struct dp_audio_test_data audio_test_data;
uint8_t ddc_hw_inst;
enum gpio_ddc_line ddc_hw_inst;
uint8_t hpd_src;


@ -102,7 +102,8 @@ struct dc_vbios_funcs {
struct bp_external_encoder_control *cntl);
enum bp_result (*dac_load_detection)(
struct dc_bios *bios,
enum engine_id engine_id);
enum engine_id engine_id,
struct graphics_object_id ext_enc_id);
enum bp_result (*transmitter_control)(
struct dc_bios *bios,
struct bp_transmitter_control *cntl);

View File

@ -1102,7 +1102,9 @@ void dce110_link_encoder_hw_init(
ASSERT(result == BP_RESULT_OK);
}
aux_initialize(enc110);
if (enc110->aux_regs)
aux_initialize(enc110);
/* reinitialize HPD.
* hpd_initialize() will pass DIG_FE id to HW context.


@ -40,8 +40,8 @@
#define FN(reg_name, field_name) \
mcif_wb30->mcif_wb_shift->field_name, mcif_wb30->mcif_wb_mask->field_name
#define MCIF_ADDR(addr) (((unsigned long long)addr & 0xffffffffff) + 0xFE) >> 8
#define MCIF_ADDR_HIGH(addr) (unsigned long long)addr >> 40
#define MCIF_ADDR(addr) ((uint32_t)((((unsigned long long)(addr) & 0xffffffffffULL) + 0xFEULL) >> 8))
#define MCIF_ADDR_HIGH(addr) ((uint32_t)(((unsigned long long)(addr)) >> 40))
/* wbif programming guide:
* 1. set up wbif parameter:

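A quick standalone check of what the corrected MCIF address macros produce, using an arbitrary made-up address; MCIF_ADDR yields the adjusted low field (roughly bits 39:8 of the buffer address) and MCIF_ADDR_HIGH the bits above bit 39.

/* Exercise the fixed macros with a made-up 48-bit address. */
#include <stdint.h>
#include <stdio.h>

#define MCIF_ADDR(addr)      ((uint32_t)((((unsigned long long)(addr) & 0xffffffffffULL) + 0xFEULL) >> 8))
#define MCIF_ADDR_HIGH(addr) ((uint32_t)(((unsigned long long)(addr)) >> 40))

int main(void)
{
        uint64_t addr = 0x123456789abcULL;

        /* Prints low = 0x3456789b, high = 0x12 for this address. */
        printf("low = 0x%x, high = 0x%x\n", MCIF_ADDR(addr), MCIF_ADDR_HIGH(addr));
        return 0;
}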

@ -646,6 +646,9 @@ enum gpio_result dal_ddc_change_mode(
enum gpio_ddc_line dal_ddc_get_line(
const struct ddc *ddc)
{
if (!ddc)
return GPIO_DDC_LINE_UNKNOWN;
return (enum gpio_ddc_line)dal_gpio_get_enum(ddc->pin_data);
}


@ -665,16 +665,45 @@ void dce110_update_info_frame(struct pipe_ctx *pipe_ctx)
}
static void
dce110_dac_encoder_control(struct pipe_ctx *pipe_ctx, bool enable)
dce110_external_encoder_control(enum bp_external_encoder_control_action action,
struct dc_link *link,
struct dc_crtc_timing *timing)
{
struct dc_link *link = pipe_ctx->stream->link;
struct dc *dc = link->ctx->dc;
struct dc_bios *bios = link->ctx->dc_bios;
struct bp_encoder_control encoder_control = {0};
const struct dc_link_settings *link_settings = &link->cur_link_settings;
enum bp_result bp_result = BP_RESULT_OK;
struct bp_external_encoder_control ext_cntl = {
.action = action,
.connector_obj_id = link->link_enc->connector,
.encoder_id = link->ext_enc_id,
.lanes_number = link_settings->lane_count,
.link_rate = link_settings->link_rate,
encoder_control.action = enable ? ENCODER_CONTROL_ENABLE : ENCODER_CONTROL_DISABLE;
encoder_control.engine_id = link->link_enc->analog_engine;
encoder_control.pixel_clock = pipe_ctx->stream->timing.pix_clk_100hz / 10;
bios->funcs->encoder_control(bios, &encoder_control);
/* Use signal type of the real link encoder, ie. DP */
.signal = link->connector_signal,
/* We don't know the timing yet when executing the SETUP action,
* so use a reasonably high default value. It seems that ENABLE
* can change the actual pixel clock but doesn't work with higher
* pixel clocks than what SETUP was called with.
*/
.pixel_clock = timing ? timing->pix_clk_100hz / 10 : 300000,
.color_depth = timing ? timing->display_color_depth : COLOR_DEPTH_888,
};
DC_LOGGER_INIT(dc->ctx);
bp_result = bios->funcs->external_encoder_control(bios, &ext_cntl);
if (bp_result != BP_RESULT_OK)
DC_LOG_ERROR("Failed to execute external encoder action: 0x%x\n", action);
}
static void
dce110_prepare_ddc(struct dc_link *link)
{
if (link->ext_enc_id.id)
dce110_external_encoder_control(EXTERNAL_ENCODER_CONTROL_DDC_SETUP, link, NULL);
}
static bool
@ -684,7 +713,8 @@ dce110_dac_load_detect(struct dc_link *link)
struct link_encoder *link_enc = link->link_enc;
enum bp_result bp_result;
bp_result = bios->funcs->dac_load_detection(bios, link_enc->analog_engine);
bp_result = bios->funcs->dac_load_detection(
bios, link_enc->analog_engine, link->ext_enc_id);
return bp_result == BP_RESULT_OK;
}
@ -700,7 +730,6 @@ void dce110_enable_stream(struct pipe_ctx *pipe_ctx)
uint32_t early_control = 0;
struct timing_generator *tg = pipe_ctx->stream_res.tg;
link_hwss->setup_stream_attribute(pipe_ctx);
link_hwss->setup_stream_encoder(pipe_ctx);
dc->hwss.update_info_frame(pipe_ctx);
@ -719,8 +748,8 @@ void dce110_enable_stream(struct pipe_ctx *pipe_ctx)
tg->funcs->set_early_control(tg, early_control);
if (dc_is_rgb_signal(pipe_ctx->stream->signal))
dce110_dac_encoder_control(pipe_ctx, true);
if (link->ext_enc_id.id)
dce110_external_encoder_control(EXTERNAL_ENCODER_CONTROL_ENABLE, link, timing);
}
static enum bp_result link_transmitter_control(
@ -1219,8 +1248,8 @@ void dce110_disable_stream(struct pipe_ctx *pipe_ctx)
link_enc->transmitter - TRANSMITTER_UNIPHY_A);
}
if (dc_is_rgb_signal(pipe_ctx->stream->signal))
dce110_dac_encoder_control(pipe_ctx, false);
if (link->ext_enc_id.id)
dce110_external_encoder_control(EXTERNAL_ENCODER_CONTROL_DISABLE, link, NULL);
}
void dce110_unblank_stream(struct pipe_ctx *pipe_ctx,
@ -1603,22 +1632,6 @@ static enum dc_status dce110_enable_stream_timing(
return DC_OK;
}
static void
dce110_select_crtc_source(struct pipe_ctx *pipe_ctx)
{
struct dc_link *link = pipe_ctx->stream->link;
struct dc_bios *bios = link->ctx->dc_bios;
struct bp_crtc_source_select crtc_source_select = {0};
enum engine_id engine_id = link->link_enc->preferred_engine;
if (dc_is_rgb_signal(pipe_ctx->stream->signal))
engine_id = link->link_enc->analog_engine;
crtc_source_select.controller_id = CONTROLLER_ID_D0 + pipe_ctx->stream_res.tg->inst;
crtc_source_select.color_depth = pipe_ctx->stream->timing.display_color_depth;
crtc_source_select.engine_id = engine_id;
crtc_source_select.sink_signal = pipe_ctx->stream->signal;
bios->funcs->select_crtc_source(bios, &crtc_source_select);
}
enum dc_status dce110_apply_single_controller_ctx_to_hw(
struct pipe_ctx *pipe_ctx,
@ -1639,10 +1652,6 @@ enum dc_status dce110_apply_single_controller_ctx_to_hw(
hws->funcs.disable_stream_gating(dc, pipe_ctx);
}
if (pipe_ctx->stream->signal == SIGNAL_TYPE_RGB) {
dce110_select_crtc_source(pipe_ctx);
}
if (pipe_ctx->stream_res.audio != NULL) {
struct audio_output audio_output = {0};
@ -1722,8 +1731,7 @@ enum dc_status dce110_apply_single_controller_ctx_to_hw(
pipe_ctx->stream_res.tg->funcs->set_static_screen_control(
pipe_ctx->stream_res.tg, event_triggers, 2);
if (!dc_is_virtual_signal(pipe_ctx->stream->signal) &&
!dc_is_rgb_signal(pipe_ctx->stream->signal))
if (!dc_is_virtual_signal(pipe_ctx->stream->signal))
pipe_ctx->stream_res.stream_enc->funcs->dig_connect_to_otg(
pipe_ctx->stream_res.stream_enc,
pipe_ctx->stream_res.tg->inst);
@ -3376,6 +3384,15 @@ void dce110_enable_tmds_link_output(struct dc_link *link,
link->phy_state.symclk_state = SYMCLK_ON_TX_ON;
}
static void dce110_enable_analog_link_output(
struct dc_link *link,
uint32_t pix_clk_100hz)
{
link->link_enc->funcs->enable_analog_output(
link->link_enc,
pix_clk_100hz);
}
void dce110_enable_dp_link_output(
struct dc_link *link,
const struct link_resource *link_res,
@ -3423,6 +3440,11 @@ void dce110_enable_dp_link_output(
}
}
if (link->ext_enc_id.id) {
dce110_external_encoder_control(EXTERNAL_ENCODER_CONTROL_INIT, link, NULL);
dce110_external_encoder_control(EXTERNAL_ENCODER_CONTROL_SETUP, link, NULL);
}
if (dc->link_srv->dp_get_encoding_format(link_settings) == DP_8b_10b_ENCODING) {
if (dc->clk_mgr->funcs->notify_link_rate_change)
dc->clk_mgr->funcs->notify_link_rate_change(dc->clk_mgr, link);
@ -3513,8 +3535,10 @@ static const struct hw_sequencer_funcs dce110_funcs = {
.enable_lvds_link_output = dce110_enable_lvds_link_output,
.enable_tmds_link_output = dce110_enable_tmds_link_output,
.enable_dp_link_output = dce110_enable_dp_link_output,
.enable_analog_link_output = dce110_enable_analog_link_output,
.disable_link_output = dce110_disable_link_output,
.dac_load_detect = dce110_dac_load_detect,
.prepare_ddc = dce110_prepare_ddc,
};
static const struct hwseq_private_funcs dce110_private_funcs = {


@ -568,7 +568,9 @@ static bool construct_phy(struct dc_link *link,
goto ddc_create_fail;
}
if (!link->ddc->ddc_pin) {
/* Embedded display connectors such as LVDS may not have DDC. */
if (!link->ddc->ddc_pin &&
!dc_is_embedded_signal(link->connector_signal)) {
DC_ERROR("Failed to get I2C info for connector!\n");
goto ddc_create_fail;
}

View File

@ -753,7 +753,8 @@ static struct link_encoder *dce60_link_encoder_create(
enc_init_data,
&link_enc_feature,
&link_enc_regs[link_regs_id],
&link_enc_aux_regs[enc_init_data->channel - 1],
enc_init_data->channel == CHANNEL_ID_UNKNOWN ?
NULL : &link_enc_aux_regs[enc_init_data->channel - 1],
enc_init_data->hpd_source >= ARRAY_SIZE(link_enc_hpd_regs) ?
NULL : &link_enc_hpd_regs[enc_init_data->hpd_source]);
return &enc110->base;


@ -760,7 +760,8 @@ static struct link_encoder *dce80_link_encoder_create(
enc_init_data,
&link_enc_feature,
&link_enc_regs[link_regs_id],
&link_enc_aux_regs[enc_init_data->channel - 1],
enc_init_data->channel == CHANNEL_ID_UNKNOWN ?
NULL : &link_enc_aux_regs[enc_init_data->channel - 1],
enc_init_data->hpd_source >= ARRAY_SIZE(link_enc_hpd_regs) ?
NULL : &link_enc_hpd_regs[enc_init_data->hpd_source]);
return &enc110->base;


@ -153,6 +153,10 @@ struct embedded_panel_info {
uint32_t drr_enabled;
uint32_t min_drr_refresh_rate;
bool realtek_eDPToLVDS;
uint16_t panel_width_mm;
uint16_t panel_height_mm;
uint16_t fake_edid_size;
const uint8_t *fake_edid;
};
struct dc_firmware_info {


@ -425,6 +425,7 @@ static int aldebaran_set_default_dpm_table(struct smu_context *smu)
dpm_table->dpm_levels[0].enabled = true;
dpm_table->dpm_levels[1].value = pptable->GfxclkFmax;
dpm_table->dpm_levels[1].enabled = true;
dpm_table->flags |= SMU_DPM_TABLE_FINE_GRAINED;
} else {
dpm_table->count = 1;
dpm_table->dpm_levels[0].value = smu->smu_table.boot_values.gfxclk / 100;


@ -1129,6 +1129,7 @@ static int smu_v13_0_6_set_default_dpm_table(struct smu_context *smu)
/* gfxclk dpm table setup */
dpm_table = &dpm_context->dpm_tables.gfx_table;
dpm_table->clk_type = SMU_GFXCLK;
dpm_table->flags = SMU_DPM_TABLE_FINE_GRAINED;
if (smu_cmn_feature_is_enabled(smu, SMU_FEATURE_DPM_GFXCLK_BIT)) {
/* In the case of gfxclk, only fine-grained dpm is honored.
* Get min/max values from FW.


@ -1370,7 +1370,7 @@ int smu_cmn_print_dpm_clk_levels(struct smu_context *smu,
level_index = 1;
}
if (!is_fine_grained) {
if (!is_fine_grained || count == 1) {
for (i = 0; i < count; i++) {
freq_match = !is_deep_sleep &&
smu_cmn_freqs_match(


@ -831,7 +831,7 @@ static void fill_palette_332(struct drm_crtc *crtc, u16 r, u16 g, u16 b,
}
/**
* drm_crtc_fill_palette_332 - Programs a default palette for R332-like formats
* drm_crtc_fill_palette_332 - Programs a default palette for RGB332-like formats
* @crtc: The displaying CRTC
* @set_palette: Callback for programming the hardware gamma LUT
*


@ -172,8 +172,8 @@ int drm_gem_fb_init_with_funcs(struct drm_device *dev,
}
for (i = 0; i < info->num_planes; i++) {
unsigned int width = mode_cmd->width / (i ? info->hsub : 1);
unsigned int height = mode_cmd->height / (i ? info->vsub : 1);
unsigned int width = drm_format_info_plane_width(info, mode_cmd->width, i);
unsigned int height = drm_format_info_plane_height(info, mode_cmd->height, i);
unsigned int min_size;
objs[i] = drm_gem_object_lookup(file, mode_cmd->handles[i]);


@ -558,6 +558,6 @@ pvr_fw_trace_debugfs_init(struct pvr_device *pvr_dev, struct dentry *dir)
&pvr_fw_trace_fops);
}
debugfs_create_file("trace_mask", 0600, dir, fw_trace,
debugfs_create_file("trace_mask", 0600, dir, pvr_dev,
&pvr_fw_trace_mask_fops);
}


@ -350,6 +350,7 @@ static void ofdrm_pci_release(void *data)
struct pci_dev *pcidev = data;
pci_disable_device(pcidev);
pci_dev_put(pcidev);
}
static int ofdrm_device_init_pci(struct ofdrm_device *odev)
@ -375,6 +376,7 @@ static int ofdrm_device_init_pci(struct ofdrm_device *odev)
if (ret) {
drm_err(dev, "pci_enable_device(%s) failed: %d\n",
dev_name(&pcidev->dev), ret);
pci_dev_put(pcidev);
return ret;
}
ret = devm_add_action_or_reset(&pdev->dev, ofdrm_pci_release, pcidev);


@ -353,7 +353,7 @@ static int appletbdrm_primary_plane_helper_atomic_check(struct drm_plane *plane,
frames_size +
sizeof(struct appletbdrm_fb_request_footer), 16);
appletbdrm_state->request = kzalloc(request_size, GFP_KERNEL);
appletbdrm_state->request = kvzalloc(request_size, GFP_KERNEL);
if (!appletbdrm_state->request)
return -ENOMEM;
@ -543,7 +543,7 @@ static void appletbdrm_primary_plane_destroy_state(struct drm_plane *plane,
{
struct appletbdrm_plane_state *appletbdrm_state = to_appletbdrm_plane_state(state);
kfree(appletbdrm_state->request);
kvfree(appletbdrm_state->request);
kfree(appletbdrm_state->response);
__drm_gem_destroy_shadow_plane_state(&appletbdrm_state->base);


@ -285,13 +285,12 @@ static struct urb *udl_get_urb_locked(struct udl_device *udl, long timeout)
return unode->urb;
}
#define GET_URB_TIMEOUT HZ
struct urb *udl_get_urb(struct udl_device *udl)
{
struct urb *urb;
spin_lock_irq(&udl->urbs.lock);
urb = udl_get_urb_locked(udl, GET_URB_TIMEOUT);
urb = udl_get_urb_locked(udl, HZ * 2);
spin_unlock_irq(&udl->urbs.lock);
return urb;
}


@ -21,6 +21,7 @@
#include <drm/drm_gem_framebuffer_helper.h>
#include <drm/drm_gem_shmem_helper.h>
#include <drm/drm_modeset_helper_vtables.h>
#include <drm/drm_print.h>
#include <drm/drm_probe_helper.h>
#include <drm/drm_vblank.h>
@ -342,8 +343,10 @@ static void udl_crtc_helper_atomic_enable(struct drm_crtc *crtc, struct drm_atom
return;
urb = udl_get_urb(udl);
if (!urb)
if (!urb) {
drm_err_ratelimited(dev, "get urb failed when enabling crtc\n");
goto out;
}
buf = (char *)urb->transfer_buffer;
buf = udl_vidreg_lock(buf);


@ -88,6 +88,7 @@ xe-y += xe_bb.o \
xe_irq.o \
xe_late_bind_fw.o \
xe_lrc.o \
xe_mem_pool.o \
xe_migrate.o \
xe_mmio.o \
xe_mmio_gem.o \


@ -583,7 +583,7 @@
#define DISABLE_128B_EVICTION_COMMAND_UDW REG_BIT(36 - 32)
#define LSCFE_SAME_ADDRESS_ATOMICS_COALESCING_DISABLE REG_BIT(35 - 32)
#define ROW_CHICKEN5 XE_REG_MCR(0xe7f0)
#define ROW_CHICKEN5 XE_REG_MCR(0xe7f0, XE_REG_OPTION_MASKED)
#define CPSS_AWARE_DIS REG_BIT(3)
#define SARB_CHICKEN1 XE_REG_MCR(0xe90c)


@ -2322,8 +2322,10 @@ struct xe_bo *xe_bo_init_locked(struct xe_device *xe, struct xe_bo *bo,
}
/* XE_BO_FLAG_GGTTx requires XE_BO_FLAG_GGTT also be set */
if ((flags & XE_BO_FLAG_GGTT_ALL) && !(flags & XE_BO_FLAG_GGTT))
if ((flags & XE_BO_FLAG_GGTT_ALL) && !(flags & XE_BO_FLAG_GGTT)) {
xe_bo_free(bo);
return ERR_PTR(-EINVAL);
}
if (flags & (XE_BO_FLAG_VRAM_MASK | XE_BO_FLAG_STOLEN) &&
!(flags & XE_BO_FLAG_IGNORE_MIN_PAGE_SIZE) &&
@ -2342,8 +2344,10 @@ struct xe_bo *xe_bo_init_locked(struct xe_device *xe, struct xe_bo *bo,
alignment = SZ_4K >> PAGE_SHIFT;
}
if (type == ttm_bo_type_device && aligned_size != size)
if (type == ttm_bo_type_device && aligned_size != size) {
xe_bo_free(bo);
return ERR_PTR(-EINVAL);
}
if (!bo) {
bo = xe_bo_alloc();


@ -18,6 +18,7 @@
#include "xe_ggtt_types.h"
struct xe_device;
struct xe_mem_pool_node;
struct xe_vm;
#define XE_BO_MAX_PLACEMENTS 3
@ -88,7 +89,7 @@ struct xe_bo {
bool ccs_cleared;
/** @bb_ccs: BB instructions of CCS read/write. Valid only for VF */
struct xe_bb *bb_ccs[XE_SRIOV_VF_CCS_CTX_COUNT];
struct xe_mem_pool_node *bb_ccs[XE_SRIOV_VF_CCS_CTX_COUNT];
/**
* @cpu_caching: CPU caching mode. Currently only used for userspace


@ -258,6 +258,13 @@ struct dma_buf *xe_gem_prime_export(struct drm_gem_object *obj, int flags)
return ERR_PTR(ret);
}
/*
* Takes ownership of @storage: on success it is transferred to the returned
* drm_gem_object; on failure it is freed before returning the error.
* This matches the contract of xe_bo_init_locked() which frees @storage on
* its error paths, so callers need not (and must not) free @storage after
* this call.
*/
static struct drm_gem_object *
xe_dma_buf_init_obj(struct drm_device *dev, struct xe_bo *storage,
struct dma_buf *dma_buf)
@ -271,8 +278,10 @@ xe_dma_buf_init_obj(struct drm_device *dev, struct xe_bo *storage,
int ret = 0;
dummy_obj = drm_gpuvm_resv_object_alloc(&xe->drm);
if (!dummy_obj)
if (!dummy_obj) {
xe_bo_free(storage);
return ERR_PTR(-ENOMEM);
}
dummy_obj->resv = resv;
xe_validation_guard(&ctx, &xe->val, &exec, (struct xe_val_flags) {}, ret) {
@ -281,6 +290,7 @@ xe_dma_buf_init_obj(struct drm_device *dev, struct xe_bo *storage,
if (ret)
break;
/* xe_bo_init_locked() frees storage on error */
bo = xe_bo_init_locked(xe, storage, NULL, resv, NULL, dma_buf->size,
0, /* Will require 1way or 2way for vm_bind */
ttm_bo_type_sg, XE_BO_FLAG_SYSTEM, &exec);
@ -368,12 +378,15 @@ struct drm_gem_object *xe_gem_prime_import(struct drm_device *dev,
goto out_err;
}
/* Errors here will take care of freeing the bo. */
/*
* xe_dma_buf_init_obj() takes ownership of bo on both success
* and failure, so we must not touch bo after this call.
*/
obj = xe_dma_buf_init_obj(dev, bo, dma_buf);
if (IS_ERR(obj))
if (IS_ERR(obj)) {
dma_buf_detach(dma_buf, attach);
return obj;
}
get_dma_buf(dma_buf);
obj->import_attach = attach;
return obj;


@ -869,14 +869,14 @@ static int xe_eu_stall_stream_close(struct inode *inode, struct file *file)
struct xe_eu_stall_data_stream *stream = file->private_data;
struct xe_gt *gt = stream->gt;
drm_dev_put(&gt->tile->xe->drm);
mutex_lock(&gt->eu_stall->stream_lock);
xe_eu_stall_disable_locked(stream);
xe_eu_stall_data_buf_destroy(stream);
xe_eu_stall_stream_free(stream);
mutex_unlock(&gt->eu_stall->stream_lock);
drm_dev_put(&gt->tile->xe->drm);
return 0;
}

@ -1405,7 +1405,7 @@ int xe_exec_queue_create_ioctl(struct drm_device *dev, void *data,
if (q->vm && q->hwe->hw_engine_group) {
err = xe_hw_engine_group_add_exec_queue(q->hwe->hw_engine_group, q);
if (err)
goto put_exec_queue;
goto kill_exec_queue;
}
}
@ -1416,12 +1416,15 @@ int xe_exec_queue_create_ioctl(struct drm_device *dev, void *data,
/* user id alloc must always be last in ioctl to prevent UAF */
err = xa_alloc(&xef->exec_queue.xa, &id, q, xa_limit_32b, GFP_KERNEL);
if (err)
goto kill_exec_queue;
goto del_hw_engine_group;
args->exec_queue_id = id;
return 0;
del_hw_engine_group:
if (q->vm && q->hwe && q->hwe->hw_engine_group)
xe_hw_engine_group_del_exec_queue(q->hwe->hw_engine_group, q);
kill_exec_queue:
xe_exec_queue_kill(q);
delete_queue_group:
@ -1760,7 +1763,7 @@ void xe_exec_queue_tlb_inval_last_fence_put(struct xe_exec_queue *q,
void xe_exec_queue_tlb_inval_last_fence_put_unlocked(struct xe_exec_queue *q,
unsigned int type)
{
xe_assert(q->vm->xe, type == XE_EXEC_QUEUE_TLB_INVAL_MEDIA_GT ||
xe_assert(gt_to_xe(q->gt), type == XE_EXEC_QUEUE_TLB_INVAL_MEDIA_GT ||
type == XE_EXEC_QUEUE_TLB_INVAL_PRIMARY_GT);
dma_fence_put(q->tlb_inval[type].last_fence);
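
Two things are fixed in the error paths above: the exec queue id is published with xa_alloc() only as the very last step, so userspace can never look the queue up (and race against it) before setup is complete, and the new del_hw_engine_group label makes the unwind run in exact reverse order of setup. A generic sketch of that reverse-order goto unwind, with hypothetical do_a()/do_b()/publish()/undo_*() helpers assumed to exist:

static int setup_example(void)
{
	int err;

	err = do_a();
	if (err)
		return err;

	err = do_b();
	if (err)
		goto err_undo_a;

	/* Publishing to other threads/userspace must be the last fallible step. */
	err = publish();
	if (err)
		goto err_undo_b;

	return 0;

err_undo_b:
	undo_b();
err_undo_a:
	undo_a();
	return err;
}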

@ -166,7 +166,7 @@ static int query_compatibility_version(struct xe_gsc *gsc)
&rd_offset);
if (err) {
xe_gt_err(gt, "HuC: invalid GSC reply for version query (err=%d)\n", err);
return err;
goto out_bo;
}
compat->major = version_query_rd(xe, &bo->vmap, rd_offset, proj_major);

@ -259,24 +259,12 @@ static void guc_submit_sw_fini(struct drm_device *drm, void *arg)
}
static void guc_submit_fini(void *arg)
{
struct xe_guc *guc = arg;
/* Forcefully kill any remaining exec queues */
xe_guc_ct_stop(&guc->ct);
guc_submit_reset_prepare(guc);
xe_guc_softreset(guc);
xe_guc_submit_stop(guc);
xe_uc_fw_sanitize(&guc->fw);
xe_guc_submit_pause_abort(guc);
}
static void guc_submit_wedged_fini(void *arg)
{
struct xe_guc *guc = arg;
struct xe_exec_queue *q;
unsigned long index;
/* Drop any wedged queue refs */
mutex_lock(&guc->submission_state.lock);
xa_for_each(&guc->submission_state.exec_queue_lookup, index, q) {
if (exec_queue_wedged(q)) {
@ -286,6 +274,14 @@ static void guc_submit_wedged_fini(void *arg)
}
}
mutex_unlock(&guc->submission_state.lock);
/* Forcefully kill any remaining exec queues */
xe_guc_ct_stop(&guc->ct);
guc_submit_reset_prepare(guc);
xe_guc_softreset(guc);
xe_guc_submit_stop(guc);
xe_uc_fw_sanitize(&guc->fw);
xe_guc_submit_pause_abort(guc);
}
static const struct xe_exec_queue_ops guc_exec_queue_ops;
@ -1320,10 +1316,8 @@ static void disable_scheduling_deregister(struct xe_guc *guc,
void xe_guc_submit_wedge(struct xe_guc *guc)
{
struct xe_device *xe = guc_to_xe(guc);
struct xe_gt *gt = guc_to_gt(guc);
struct xe_exec_queue *q;
unsigned long index;
int err;
xe_gt_assert(guc_to_gt(guc), guc_to_xe(guc)->wedged.mode);
@ -1335,15 +1329,6 @@ void xe_guc_submit_wedge(struct xe_guc *guc)
return;
if (xe->wedged.mode == XE_WEDGED_MODE_UPON_ANY_HANG_NO_RESET) {
err = devm_add_action_or_reset(guc_to_xe(guc)->drm.dev,
guc_submit_wedged_fini, guc);
if (err) {
xe_gt_err(gt, "Failed to register clean-up on wedged.mode=%s; "
"Although device is wedged.\n",
xe_wedged_mode_to_string(XE_WEDGED_MODE_UPON_ANY_HANG_NO_RESET));
return;
}
mutex_lock(&guc->submission_state.lock);
xa_for_each(&guc->submission_state.exec_queue_lookup, index, q)
if (xe_exec_queue_get_unless_zero(q))

@ -1214,7 +1214,7 @@ static ssize_t setup_invalidate_state_cache_wa(struct xe_lrc *lrc,
if (xe_gt_WARN_ON(lrc->gt, max_len < 3))
return -ENOSPC;
*cmd++ = MI_LOAD_REGISTER_IMM | MI_LRI_NUM_REGS(1);
*cmd++ = MI_LOAD_REGISTER_IMM | MI_LRI_LRM_CS_MMIO | MI_LRI_NUM_REGS(1);
*cmd++ = CS_DEBUG_MODE2(0).addr;
*cmd++ = REG_MASKED_FIELD_ENABLE(INSTRUCTION_STATE_CACHE_INVALIDATE);

@ -0,0 +1,403 @@
// SPDX-License-Identifier: MIT
/*
* Copyright © 2026 Intel Corporation
*/
#include <linux/kernel.h>
#include <drm/drm_managed.h>
#include "instructions/xe_mi_commands.h"
#include "xe_bo.h"
#include "xe_device_types.h"
#include "xe_map.h"
#include "xe_mem_pool.h"
#include "xe_mem_pool_types.h"
#include "xe_tile_printk.h"
/**
* struct xe_mem_pool - DRM MM pool for sub-allocating memory from a BO on an
* XE tile.
*
* The XE memory pool is a DRM MM manager that provides sub-allocation of memory
* from a backing buffer object (BO) on a specific XE tile. It is designed to
* manage memory for GPU workloads, allowing for efficient allocation and
* deallocation of memory regions within the BO.
*
* The memory pool maintains a primary BO that is pinned in the GGTT and mapped
* into the CPU address space for direct access. Optionally, it can also maintain
* a shadow BO that can be used for atomic updates to the primary BO's contents.
*
* The API provided by the memory pool allows clients to allocate and free memory
* regions, retrieve GPU and CPU addresses, and synchronize data between the
* primary and shadow BOs as needed.
*/
struct xe_mem_pool {
/** @base: Range allocator over [0, @size) in bytes */
struct drm_mm base;
/** @bo: Active pool BO (GGTT-pinned, CPU-mapped). */
struct xe_bo *bo;
/** @shadow: Shadow BO for atomic command updates. */
struct xe_bo *shadow;
/** @swap_guard: Timeline guard updating @bo and @shadow */
struct mutex swap_guard;
/** @cpu_addr: CPU virtual address of the active BO. */
void *cpu_addr;
/** @is_iomem: Indicates if the BO mapping is I/O memory. */
bool is_iomem;
};
static struct xe_mem_pool *node_to_pool(struct xe_mem_pool_node *node)
{
return container_of(node->sa_node.mm, struct xe_mem_pool, base);
}
static struct xe_tile *pool_to_tile(struct xe_mem_pool *pool)
{
return pool->bo->tile;
}
static void fini_pool_action(struct drm_device *drm, void *arg)
{
struct xe_mem_pool *pool = arg;
if (pool->is_iomem)
kvfree(pool->cpu_addr);
drm_mm_takedown(&pool->base);
}
static int pool_shadow_init(struct xe_mem_pool *pool)
{
struct xe_tile *tile = pool->bo->tile;
struct xe_device *xe = tile_to_xe(tile);
struct xe_bo *shadow;
int ret;
xe_assert(xe, !pool->shadow);
ret = drmm_mutex_init(&xe->drm, &pool->swap_guard);
if (ret)
return ret;
if (IS_ENABLED(CONFIG_PROVE_LOCKING)) {
fs_reclaim_acquire(GFP_KERNEL);
might_lock(&pool->swap_guard);
fs_reclaim_release(GFP_KERNEL);
}
shadow = xe_managed_bo_create_pin_map(xe, tile,
xe_bo_size(pool->bo),
XE_BO_FLAG_VRAM_IF_DGFX(tile) |
XE_BO_FLAG_GGTT |
XE_BO_FLAG_GGTT_INVALIDATE |
XE_BO_FLAG_PINNED_NORESTORE);
if (IS_ERR(shadow))
return PTR_ERR(shadow);
pool->shadow = shadow;
return 0;
}
/**
* xe_mem_pool_init() - Initialize memory pool.
* @tile: the &xe_tile to allocate from.
* @size: number of bytes to allocate.
* @guard: the size of the guard region at the end of the BO that is not
* sub-allocated, in bytes.
* @flags: flags controlling shadow pool creation.
*
* Initializes a memory pool for sub-allocating memory from a backing BO on the
* specified XE tile. The backing BO is pinned in the GGTT and mapped into
* the CPU address space for direct access. Optionally, a shadow BO can also be
* initialized for atomic updates to the primary BO's contents.
*
* Returns: a pointer to the &xe_mem_pool, or an error pointer on failure.
*/
struct xe_mem_pool *xe_mem_pool_init(struct xe_tile *tile, u32 size,
u32 guard, int flags)
{
struct xe_device *xe = tile_to_xe(tile);
struct xe_mem_pool *pool;
struct xe_bo *bo;
u32 managed_size;
int ret;
xe_tile_assert(tile, size > guard);
managed_size = size - guard;
pool = drmm_kzalloc(&xe->drm, sizeof(*pool), GFP_KERNEL);
if (!pool)
return ERR_PTR(-ENOMEM);
bo = xe_managed_bo_create_pin_map(xe, tile, size,
XE_BO_FLAG_VRAM_IF_DGFX(tile) |
XE_BO_FLAG_GGTT |
XE_BO_FLAG_GGTT_INVALIDATE |
XE_BO_FLAG_PINNED_NORESTORE);
if (IS_ERR(bo)) {
xe_tile_err(tile, "Failed to prepare %uKiB BO for mem pool (%pe)\n",
size / SZ_1K, bo);
return ERR_CAST(bo);
}
pool->bo = bo;
pool->is_iomem = bo->vmap.is_iomem;
if (pool->is_iomem) {
pool->cpu_addr = kvzalloc(size, GFP_KERNEL);
if (!pool->cpu_addr)
return ERR_PTR(-ENOMEM);
} else {
pool->cpu_addr = bo->vmap.vaddr;
}
if (flags & XE_MEM_POOL_BO_FLAG_INIT_SHADOW_COPY) {
ret = pool_shadow_init(pool);
if (ret)
goto out_err;
}
drm_mm_init(&pool->base, 0, managed_size);
ret = drmm_add_action_or_reset(&xe->drm, fini_pool_action, pool);
if (ret)
return ERR_PTR(ret);
return pool;
out_err:
if (flags & XE_MEM_POOL_BO_FLAG_INIT_SHADOW_COPY)
xe_tile_err(tile,
"Failed to initialize shadow BO for mem pool (%d)\n", ret);
if (bo->vmap.is_iomem)
kvfree(pool->cpu_addr);
return ERR_PTR(ret);
}
/**
* xe_mem_pool_sync() - Copy the entire contents of the main pool to shadow pool.
* @pool: the memory pool containing the primary and shadow BOs.
*
* Copies the entire contents of the primary pool to the shadow pool. This must
* be done after xe_mem_pool_init() with the XE_MEM_POOL_BO_FLAG_INIT_SHADOW_COPY
* flag to ensure that the shadow pool has the same initial contents as the primary
* pool. After this initial synchronization, clients can choose to synchronize the
* shadow pool with the primary pool on a node basis using
* xe_mem_pool_sync_shadow_locked() as needed.
*
* Return: None.
*/
void xe_mem_pool_sync(struct xe_mem_pool *pool)
{
struct xe_tile *tile = pool_to_tile(pool);
struct xe_device *xe = tile_to_xe(tile);
xe_tile_assert(tile, pool->shadow);
xe_map_memcpy_to(xe, &pool->shadow->vmap, 0,
pool->cpu_addr, xe_bo_size(pool->bo));
}
/**
* xe_mem_pool_swap_shadow_locked() - Swap the primary BO with the shadow BO.
* @pool: the memory pool containing the primary and shadow BOs.
*
* Swaps the primary buffer object with the shadow buffer object in the mem
* pool. This allows for atomic updates to the contents of the primary BO
* by first writing to the shadow BO and then swapping it with the primary BO.
* The @swap_guard mutex must be held to serialize against any concurrent swap
* operations.
*
* Return: None.
*/
void xe_mem_pool_swap_shadow_locked(struct xe_mem_pool *pool)
{
struct xe_tile *tile = pool_to_tile(pool);
xe_tile_assert(tile, pool->shadow);
lockdep_assert_held(&pool->swap_guard);
swap(pool->bo, pool->shadow);
if (!pool->bo->vmap.is_iomem)
pool->cpu_addr = pool->bo->vmap.vaddr;
}
/**
* xe_mem_pool_sync_shadow_locked() - Copy node from primary pool to shadow pool.
* @node: the node allocated in the memory pool.
*
* Copies the contents of the specified node from the primary pool to the shadow
* pool. The @swap_guard mutex must be held to serialize against any concurrent
* swap operations.
*
* Return: None.
*/
void xe_mem_pool_sync_shadow_locked(struct xe_mem_pool_node *node)
{
struct xe_mem_pool *pool = node_to_pool(node);
struct xe_tile *tile = pool_to_tile(pool);
struct xe_device *xe = tile_to_xe(tile);
struct drm_mm_node *sa_node = &node->sa_node;
xe_tile_assert(tile, pool->shadow);
lockdep_assert_held(&pool->swap_guard);
xe_map_memcpy_to(xe, &pool->shadow->vmap,
sa_node->start,
pool->cpu_addr + sa_node->start,
sa_node->size);
}
/**
* xe_mem_pool_gpu_addr() - Retrieve GPU address of memory pool.
* @pool: the memory pool
*
* Returns: GGTT address of the memory pool.
*/
u64 xe_mem_pool_gpu_addr(struct xe_mem_pool *pool)
{
return xe_bo_ggtt_addr(pool->bo);
}
/**
* xe_mem_pool_cpu_addr() - Retrieve CPU address of memory pool.
* @pool: the memory pool
*
* Returns: CPU virtual address of memory pool.
*/
void *xe_mem_pool_cpu_addr(struct xe_mem_pool *pool)
{
return pool->cpu_addr;
}
/**
* xe_mem_pool_bo_swap_guard() - Retrieve the mutex used to guard swap
* operations on a memory pool.
* @pool: the memory pool
*
* Returns: Swap guard mutex or NULL if shadow pool is not created.
*/
struct mutex *xe_mem_pool_bo_swap_guard(struct xe_mem_pool *pool)
{
if (!pool->shadow)
return NULL;
return &pool->swap_guard;
}
/**
* xe_mem_pool_bo_flush_write() - Copy the data from the sub-allocation
* to the GPU memory.
* @node: the node allocated in the memory pool to flush.
*/
void xe_mem_pool_bo_flush_write(struct xe_mem_pool_node *node)
{
struct xe_mem_pool *pool = node_to_pool(node);
struct xe_tile *tile = pool_to_tile(pool);
struct xe_device *xe = tile_to_xe(tile);
struct drm_mm_node *sa_node = &node->sa_node;
if (!pool->bo->vmap.is_iomem)
return;
xe_map_memcpy_to(xe, &pool->bo->vmap, sa_node->start,
pool->cpu_addr + sa_node->start,
sa_node->size);
}
/**
* xe_mem_pool_bo_sync_read() - Copy the data from GPU memory to the
* sub-allocation.
* @node: the node allocated in the memory pool to read back.
*/
void xe_mem_pool_bo_sync_read(struct xe_mem_pool_node *node)
{
struct xe_mem_pool *pool = node_to_pool(node);
struct xe_tile *tile = pool_to_tile(pool);
struct xe_device *xe = tile_to_xe(tile);
struct drm_mm_node *sa_node = &node->sa_node;
if (!pool->bo->vmap.is_iomem)
return;
xe_map_memcpy_from(xe, pool->cpu_addr + sa_node->start,
&pool->bo->vmap, sa_node->start, sa_node->size);
}
/**
* xe_mem_pool_alloc_node() - Allocate a new node for use with xe_mem_pool.
*
* Returns: node structure or an ERR_PTR(-ENOMEM).
*/
struct xe_mem_pool_node *xe_mem_pool_alloc_node(void)
{
struct xe_mem_pool_node *node = kzalloc_obj(*node);
if (!node)
return ERR_PTR(-ENOMEM);
return node;
}
/**
* xe_mem_pool_insert_node() - Insert a node into the memory pool.
* @pool: the memory pool to insert into
* @node: the node to insert
* @size: the size of the node to be allocated in bytes.
*
* Inserts a node into the specified memory pool using drm_mm for
* allocation.
*
* Returns: 0 on success or a negative error code on failure.
*/
int xe_mem_pool_insert_node(struct xe_mem_pool *pool,
struct xe_mem_pool_node *node, u32 size)
{
if (!pool)
return -EINVAL;
return drm_mm_insert_node(&pool->base, &node->sa_node, size);
}
/**
* xe_mem_pool_free_node() - Free a node allocated from the memory pool.
* @node: the node to free
*
* Returns: None.
*/
void xe_mem_pool_free_node(struct xe_mem_pool_node *node)
{
if (!node)
return;
drm_mm_remove_node(&node->sa_node);
kfree(node);
}
/**
* xe_mem_pool_node_cpu_addr() - Retrieve CPU address of the node.
* @node: the node allocated in the memory pool
*
* Returns: CPU virtual address of the node.
*/
void *xe_mem_pool_node_cpu_addr(struct xe_mem_pool_node *node)
{
struct xe_mem_pool *pool = node_to_pool(node);
return xe_mem_pool_cpu_addr(pool) + node->sa_node.start;
}
/**
* xe_mem_pool_dump() - Dump the state of the DRM MM manager for debugging.
* @pool: the memory pool to be dumped.
* @p: The DRM printer to use for output.
*
* Only the drm_mm-managed region is dumped, not the state of the BOs or any
* other pool information.
*
* Returns: None.
*/
void xe_mem_pool_dump(struct xe_mem_pool *pool, struct drm_printer *p)
{
drm_mm_print(&pool->base, p);
}
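
For orientation, here is a hypothetical end-to-end use of the new pool API above (arbitrary sizes, minimal error handling; example_pool_use() is not part of the driver). The pool itself is drm-managed, so there is no explicit pool teardown:

static int example_pool_use(struct xe_tile *tile)
{
	struct xe_mem_pool *pool;
	struct xe_mem_pool_node *node;
	u32 *cs;
	int err;

	/* 1 MiB pool, last dword reserved as a guard, no shadow BO. */
	pool = xe_mem_pool_init(tile, SZ_1M, sizeof(u32), 0);
	if (IS_ERR(pool))
		return PTR_ERR(pool);

	node = xe_mem_pool_alloc_node();
	if (IS_ERR(node))
		return PTR_ERR(node);

	/* Sub-allocate 4 KiB out of the pool. */
	err = xe_mem_pool_insert_node(pool, node, SZ_4K);
	if (err) {
		kfree(node);	/* not inserted, so don't use xe_mem_pool_free_node() */
		return err;
	}

	/* Build contents through the CPU address of the sub-allocation ... */
	cs = xe_mem_pool_node_cpu_addr(node);
	cs[0] = MI_NOOP;

	/* ... and push them to the BO when the mapping is I/O memory. */
	xe_mem_pool_bo_flush_write(node);

	xe_mem_pool_free_node(node);
	return 0;
}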

@ -0,0 +1,35 @@
/* SPDX-License-Identifier: MIT */
/*
* Copyright © 2026 Intel Corporation
*/
#ifndef _XE_MEM_POOL_H_
#define _XE_MEM_POOL_H_
#include <linux/sizes.h>
#include <linux/types.h>
#include <drm/drm_mm.h>
#include "xe_mem_pool_types.h"
struct drm_printer;
struct xe_mem_pool;
struct xe_tile;
struct xe_mem_pool *xe_mem_pool_init(struct xe_tile *tile, u32 size,
u32 guard, int flags);
void xe_mem_pool_sync(struct xe_mem_pool *pool);
void xe_mem_pool_swap_shadow_locked(struct xe_mem_pool *pool);
void xe_mem_pool_sync_shadow_locked(struct xe_mem_pool_node *node);
u64 xe_mem_pool_gpu_addr(struct xe_mem_pool *pool);
void *xe_mem_pool_cpu_addr(struct xe_mem_pool *pool);
struct mutex *xe_mem_pool_bo_swap_guard(struct xe_mem_pool *pool);
void xe_mem_pool_bo_flush_write(struct xe_mem_pool_node *node);
void xe_mem_pool_bo_sync_read(struct xe_mem_pool_node *node);
struct xe_mem_pool_node *xe_mem_pool_alloc_node(void);
int xe_mem_pool_insert_node(struct xe_mem_pool *pool,
struct xe_mem_pool_node *node, u32 size);
void xe_mem_pool_free_node(struct xe_mem_pool_node *node);
void *xe_mem_pool_node_cpu_addr(struct xe_mem_pool_node *node);
void xe_mem_pool_dump(struct xe_mem_pool *pool, struct drm_printer *p);
#endif

@ -0,0 +1,21 @@
/* SPDX-License-Identifier: MIT */
/*
* Copyright © 2026 Intel Corporation
*/
#ifndef _XE_MEM_POOL_TYPES_H_
#define _XE_MEM_POOL_TYPES_H_
#include <drm/drm_mm.h>
#define XE_MEM_POOL_BO_FLAG_INIT_SHADOW_COPY BIT(0)
/**
* struct xe_mem_pool_node - Sub-range allocations from mem pool.
*/
struct xe_mem_pool_node {
/** @sa_node: drm_mm_node for this allocation. */
struct drm_mm_node sa_node;
};
#endif
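
When the pool is created with XE_MEM_POOL_BO_FLAG_INIT_SHADOW_COPY, updates follow the swap-then-sync sequence spelled out in the kernel-doc above and used later by xe_migrate_ccs_rw_copy(): take the swap guard, make the shadow the active BO, rewrite the node there, then mirror it back into the other copy. A hypothetical sketch of one such update (the pool is assumed to have a shadow; example_shadow_update() is illustration only):

static void example_shadow_update(struct xe_mem_pool *pool,
				  struct xe_mem_pool_node *node)
{
	scoped_guard(mutex, xe_mem_pool_bo_swap_guard(pool)) {
		/* The previously shadowed BO becomes the active one. */
		xe_mem_pool_swap_shadow_locked(pool);

		/* Rewrite the node in the active copy ... */
		memset(xe_mem_pool_node_cpu_addr(node), 0, node->sa_node.size);
		xe_mem_pool_bo_flush_write(node);

		/* ... then replicate it into the shadow (the old active BO). */
		xe_mem_pool_sync_shadow_locked(node);
	}
}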

@ -29,6 +29,7 @@
#include "xe_hw_engine.h"
#include "xe_lrc.h"
#include "xe_map.h"
#include "xe_mem_pool.h"
#include "xe_mocs.h"
#include "xe_printk.h"
#include "xe_pt.h"
@ -1166,11 +1167,12 @@ int xe_migrate_ccs_rw_copy(struct xe_tile *tile, struct xe_exec_queue *q,
u32 batch_size, batch_size_allocated;
struct xe_device *xe = gt_to_xe(gt);
struct xe_res_cursor src_it, ccs_it;
struct xe_mem_pool *bb_pool;
struct xe_sriov_vf_ccs_ctx *ctx;
struct xe_sa_manager *bb_pool;
u64 size = xe_bo_size(src_bo);
struct xe_bb *bb = NULL;
struct xe_mem_pool_node *bb;
u64 src_L0, src_L0_ofs;
struct xe_bb xe_bb_tmp;
u32 src_L0_pt;
int err;
@ -1208,18 +1210,18 @@ int xe_migrate_ccs_rw_copy(struct xe_tile *tile, struct xe_exec_queue *q,
size -= src_L0;
}
bb = xe_bb_alloc(gt);
bb = xe_mem_pool_alloc_node();
if (IS_ERR(bb))
return PTR_ERR(bb);
bb_pool = ctx->mem.ccs_bb_pool;
scoped_guard(mutex, xe_sa_bo_swap_guard(bb_pool)) {
xe_sa_bo_swap_shadow(bb_pool);
scoped_guard(mutex, xe_mem_pool_bo_swap_guard(bb_pool)) {
xe_mem_pool_swap_shadow_locked(bb_pool);
err = xe_bb_init(bb, bb_pool, batch_size);
err = xe_mem_pool_insert_node(bb_pool, bb, batch_size * sizeof(u32));
if (err) {
xe_gt_err(gt, "BB allocation failed.\n");
xe_bb_free(bb, NULL);
kfree(bb);
return err;
}
@ -1227,6 +1229,7 @@ int xe_migrate_ccs_rw_copy(struct xe_tile *tile, struct xe_exec_queue *q,
size = xe_bo_size(src_bo);
batch_size = 0;
xe_bb_tmp = (struct xe_bb){ .cs = xe_mem_pool_node_cpu_addr(bb), .len = 0 };
/*
* Emit PTE and copy commands here.
* The CCS copy command can only support limited size. If the size to be
@ -1255,24 +1258,27 @@ int xe_migrate_ccs_rw_copy(struct xe_tile *tile, struct xe_exec_queue *q,
xe_assert(xe, IS_ALIGNED(ccs_it.start, PAGE_SIZE));
batch_size += EMIT_COPY_CCS_DW;
emit_pte(m, bb, src_L0_pt, false, true, &src_it, src_L0, src);
emit_pte(m, &xe_bb_tmp, src_L0_pt, false, true, &src_it, src_L0, src);
emit_pte(m, bb, ccs_pt, false, false, &ccs_it, ccs_size, src);
emit_pte(m, &xe_bb_tmp, ccs_pt, false, false, &ccs_it, ccs_size, src);
bb->len = emit_flush_invalidate(bb->cs, bb->len, flush_flags);
flush_flags = xe_migrate_ccs_copy(m, bb, src_L0_ofs, src_is_pltt,
xe_bb_tmp.len = emit_flush_invalidate(xe_bb_tmp.cs, xe_bb_tmp.len,
flush_flags);
flush_flags = xe_migrate_ccs_copy(m, &xe_bb_tmp, src_L0_ofs, src_is_pltt,
src_L0_ofs, dst_is_pltt,
src_L0, ccs_ofs, true);
bb->len = emit_flush_invalidate(bb->cs, bb->len, flush_flags);
xe_bb_tmp.len = emit_flush_invalidate(xe_bb_tmp.cs, xe_bb_tmp.len,
flush_flags);
size -= src_L0;
}
xe_assert(xe, (batch_size_allocated == bb->len));
xe_assert(xe, (batch_size_allocated == xe_bb_tmp.len));
xe_assert(xe, bb->sa_node.size == xe_bb_tmp.len * sizeof(u32));
src_bo->bb_ccs[read_write] = bb;
xe_sriov_vf_ccs_rw_update_bb_addr(ctx);
xe_sa_bo_sync_shadow(bb->bo);
xe_mem_pool_sync_shadow_locked(bb);
}
return 0;
@ -1297,10 +1303,10 @@ int xe_migrate_ccs_rw_copy(struct xe_tile *tile, struct xe_exec_queue *q,
void xe_migrate_ccs_rw_copy_clear(struct xe_bo *src_bo,
enum xe_sriov_vf_ccs_rw_ctxs read_write)
{
struct xe_bb *bb = src_bo->bb_ccs[read_write];
struct xe_mem_pool_node *bb = src_bo->bb_ccs[read_write];
struct xe_device *xe = xe_bo_device(src_bo);
struct xe_mem_pool *bb_pool;
struct xe_sriov_vf_ccs_ctx *ctx;
struct xe_sa_manager *bb_pool;
u32 *cs;
xe_assert(xe, IS_SRIOV_VF(xe));
@ -1308,17 +1314,17 @@ void xe_migrate_ccs_rw_copy_clear(struct xe_bo *src_bo,
ctx = &xe->sriov.vf.ccs.contexts[read_write];
bb_pool = ctx->mem.ccs_bb_pool;
guard(mutex) (xe_sa_bo_swap_guard(bb_pool));
xe_sa_bo_swap_shadow(bb_pool);
scoped_guard(mutex, xe_mem_pool_bo_swap_guard(bb_pool)) {
xe_mem_pool_swap_shadow_locked(bb_pool);
cs = xe_sa_bo_cpu_addr(bb->bo);
memset(cs, MI_NOOP, bb->len * sizeof(u32));
xe_sriov_vf_ccs_rw_update_bb_addr(ctx);
cs = xe_mem_pool_node_cpu_addr(bb);
memset(cs, MI_NOOP, bb->sa_node.size);
xe_sriov_vf_ccs_rw_update_bb_addr(ctx);
xe_sa_bo_sync_shadow(bb->bo);
xe_bb_free(bb, NULL);
src_bo->bb_ccs[read_write] = NULL;
xe_mem_pool_sync_shadow_locked(bb);
xe_mem_pool_free_node(bb);
src_bo->bb_ccs[read_write] = NULL;
}
}
/**

@ -118,6 +118,7 @@ static const struct xe_graphics_desc graphics_xe2 = {
static const struct xe_graphics_desc graphics_xe3p_lpg = {
XE2_GFX_FEATURES,
.has_indirect_ring_state = 1,
.multi_queue_engine_class_mask = BIT(XE_ENGINE_CLASS_COPY) | BIT(XE_ENGINE_CLASS_COMPUTE),
.num_geometry_xecore_fuse_regs = 3,
.num_compute_xecore_fuse_regs = 3,

@ -226,7 +226,7 @@ void xe_reg_whitelist_print_entry(struct drm_printer *p, unsigned int indent,
}
range_start = reg & REG_GENMASK(25, range_bit);
range_end = range_start | REG_GENMASK(range_bit, 0);
range_end = range_start | REG_GENMASK(range_bit - 1, 0);
switch (val & RING_FORCE_TO_NONPRIV_ACCESS_MASK) {
case RING_FORCE_TO_NONPRIV_ACCESS_RW:
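
For illustration, take range_bit = 4 (a 16-byte whitelist range, chosen arbitrarily for the example): range_start = reg & REG_GENMASK(25, 4) clears the low four address bits, so the last byte covered is range_start | 0xf, which is what REG_GENMASK(range_bit - 1, 0) now yields. The old REG_GENMASK(range_bit, 0) = 0x1f also set bit 4, so the debug output reported a 32-byte span, twice the range actually whitelisted.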

@ -14,9 +14,9 @@
#include "xe_guc.h"
#include "xe_guc_submit.h"
#include "xe_lrc.h"
#include "xe_mem_pool.h"
#include "xe_migrate.h"
#include "xe_pm.h"
#include "xe_sa.h"
#include "xe_sriov_printk.h"
#include "xe_sriov_vf.h"
#include "xe_sriov_vf_ccs.h"
@ -141,43 +141,47 @@ static u64 get_ccs_bb_pool_size(struct xe_device *xe)
static int alloc_bb_pool(struct xe_tile *tile, struct xe_sriov_vf_ccs_ctx *ctx)
{
struct xe_mem_pool *pool;
struct xe_device *xe = tile_to_xe(tile);
struct xe_sa_manager *sa_manager;
u32 *pool_cpu_addr, *last_dw_addr;
u64 bb_pool_size;
int offset, err;
int err;
bb_pool_size = get_ccs_bb_pool_size(xe);
xe_sriov_info(xe, "Allocating %s CCS BB pool size = %lldMB\n",
ctx->ctx_id ? "Restore" : "Save", bb_pool_size / SZ_1M);
sa_manager = __xe_sa_bo_manager_init(tile, bb_pool_size, SZ_4K, SZ_16,
XE_SA_BO_MANAGER_FLAG_SHADOW);
if (IS_ERR(sa_manager)) {
xe_sriov_err(xe, "Suballocator init failed with error: %pe\n",
sa_manager);
err = PTR_ERR(sa_manager);
pool = xe_mem_pool_init(tile, bb_pool_size, sizeof(u32),
XE_MEM_POOL_BO_FLAG_INIT_SHADOW_COPY);
if (IS_ERR(pool)) {
xe_sriov_err(xe, "xe_mem_pool_init failed with error: %pe\n",
pool);
err = PTR_ERR(pool);
return err;
}
offset = 0;
xe_map_memset(xe, &sa_manager->bo->vmap, offset, MI_NOOP,
bb_pool_size);
xe_map_memset(xe, &sa_manager->shadow->vmap, offset, MI_NOOP,
bb_pool_size);
pool_cpu_addr = xe_mem_pool_cpu_addr(pool);
memset(pool_cpu_addr, 0, bb_pool_size);
offset = bb_pool_size - sizeof(u32);
xe_map_wr(xe, &sa_manager->bo->vmap, offset, u32, MI_BATCH_BUFFER_END);
xe_map_wr(xe, &sa_manager->shadow->vmap, offset, u32, MI_BATCH_BUFFER_END);
last_dw_addr = pool_cpu_addr + (bb_pool_size / sizeof(u32)) - 1;
*last_dw_addr = MI_BATCH_BUFFER_END;
ctx->mem.ccs_bb_pool = sa_manager;
/*
* Sync the main copy and the shadow copy so that the shadow copy is a
* replica of the main copy. Only individual BBs are synced after init,
* so the main pool and the shadow copy must match at this point. This
* is needed because the GuC may read the BB commands from the shadow
* copy.
*/
xe_mem_pool_sync(pool);
ctx->mem.ccs_bb_pool = pool;
return 0;
}
static void ccs_rw_update_ring(struct xe_sriov_vf_ccs_ctx *ctx)
{
u64 addr = xe_sa_manager_gpu_addr(ctx->mem.ccs_bb_pool);
u64 addr = xe_mem_pool_gpu_addr(ctx->mem.ccs_bb_pool);
struct xe_lrc *lrc = xe_exec_queue_lrc(ctx->mig_q);
u32 dw[10], i = 0;
@ -388,7 +392,7 @@ int xe_sriov_vf_ccs_init(struct xe_device *xe)
#define XE_SRIOV_VF_CCS_RW_BB_ADDR_OFFSET (2 * sizeof(u32))
void xe_sriov_vf_ccs_rw_update_bb_addr(struct xe_sriov_vf_ccs_ctx *ctx)
{
u64 addr = xe_sa_manager_gpu_addr(ctx->mem.ccs_bb_pool);
u64 addr = xe_mem_pool_gpu_addr(ctx->mem.ccs_bb_pool);
struct xe_lrc *lrc = xe_exec_queue_lrc(ctx->mig_q);
struct xe_device *xe = gt_to_xe(ctx->mig_q->gt);
@ -412,8 +416,8 @@ int xe_sriov_vf_ccs_attach_bo(struct xe_bo *bo)
struct xe_device *xe = xe_bo_device(bo);
enum xe_sriov_vf_ccs_rw_ctxs ctx_id;
struct xe_sriov_vf_ccs_ctx *ctx;
struct xe_mem_pool_node *bb;
struct xe_tile *tile;
struct xe_bb *bb;
int err = 0;
xe_assert(xe, IS_VF_CCS_READY(xe));
@ -445,7 +449,7 @@ int xe_sriov_vf_ccs_detach_bo(struct xe_bo *bo)
{
struct xe_device *xe = xe_bo_device(bo);
enum xe_sriov_vf_ccs_rw_ctxs ctx_id;
struct xe_bb *bb;
struct xe_mem_pool_node *bb;
xe_assert(xe, IS_VF_CCS_READY(xe));
@ -471,8 +475,8 @@ int xe_sriov_vf_ccs_detach_bo(struct xe_bo *bo)
*/
void xe_sriov_vf_ccs_print(struct xe_device *xe, struct drm_printer *p)
{
struct xe_sa_manager *bb_pool;
enum xe_sriov_vf_ccs_rw_ctxs ctx_id;
struct xe_mem_pool *bb_pool;
if (!IS_VF_CCS_READY(xe))
return;
@ -485,7 +489,7 @@ void xe_sriov_vf_ccs_print(struct xe_device *xe, struct drm_printer *p)
drm_printf(p, "ccs %s bb suballoc info\n", ctx_id ? "write" : "read");
drm_printf(p, "-------------------------\n");
drm_suballoc_dump_debug_info(&bb_pool->base, p, xe_sa_manager_gpu_addr(bb_pool));
xe_mem_pool_dump(bb_pool, p);
drm_puts(p, "\n");
}
}

@ -17,9 +17,6 @@ enum xe_sriov_vf_ccs_rw_ctxs {
XE_SRIOV_VF_CCS_CTX_COUNT
};
struct xe_migrate;
struct xe_sa_manager;
/**
* struct xe_sriov_vf_ccs_ctx - VF CCS migration context data.
*/
@ -33,7 +30,7 @@ struct xe_sriov_vf_ccs_ctx {
/** @mem: memory data */
struct {
/** @mem.ccs_bb_pool: Pool from which batch buffers are allocated. */
struct xe_sa_manager *ccs_bb_pool;
struct xe_mem_pool *ccs_bb_pool;
} mem;
};

@ -97,7 +97,7 @@ static const struct xe_rtp_entry_sr gt_tunings[] = {
{ XE_RTP_NAME("Tuning: Set STLB Bank Hash Mode to 4KB"),
XE_RTP_RULES(GRAPHICS_VERSION_RANGE(3510, XE_RTP_END_VERSION_UNDEFINED),
IS_INTEGRATED),
XE_RTP_ACTIONS(FIELD_SET(XEHP_GAMSTLB_CTRL, BANK_HASH_MODE,
XE_RTP_ACTIONS(FIELD_SET(GAMSTLB_CTRL, BANK_HASH_MODE,
BANK_HASH_4KB_MODE))
},
};

@ -3658,6 +3658,8 @@ static int vm_bind_ioctl_check_args(struct xe_device *xe, struct xe_vm *vm,
op == DRM_XE_VM_BIND_OP_MAP_USERPTR) ||
XE_IOCTL_DBG(xe, coh_mode == XE_COH_NONE &&
op == DRM_XE_VM_BIND_OP_MAP_USERPTR) ||
XE_IOCTL_DBG(xe, !IS_DGFX(xe) && coh_mode == XE_COH_NONE &&
is_cpu_addr_mirror) ||
XE_IOCTL_DBG(xe, xe_device_is_l2_flush_optimized(xe) &&
(op == DRM_XE_VM_BIND_OP_MAP_USERPTR ||
is_cpu_addr_mirror) &&
@ -4156,7 +4158,8 @@ int xe_vm_get_property_ioctl(struct drm_device *drm, void *data,
int ret = 0;
if (XE_IOCTL_DBG(xe, (args->reserved[0] || args->reserved[1] ||
args->reserved[2])))
args->reserved[2] || args->extensions ||
args->pad)))
return -EINVAL;
vm = xe_vm_lookup(xef, args->vm_id);

@ -621,6 +621,45 @@ static int xe_madvise_purgeable_retained_to_user(const struct xe_madvise_details
return 0;
}
static bool check_pat_args_are_sane(struct xe_device *xe,
struct xe_vmas_in_madvise_range *madvise_range,
u16 pat_index)
{
u16 coh_mode = xe_pat_index_get_coh_mode(xe, pat_index);
int i;
/*
* Using coh_none with CPU cached buffers is not allowed on iGPU.
* On iGPU the GPU shares the LLC with the CPU, so with coh_none
* the GPU bypasses CPU caches and reads directly from DRAM,
* potentially seeing stale sensitive data from previously freed
* pages. On dGPU this restriction does not apply, because the
* platform does not provide a non-coherent system memory access
* path that would violate the DMA coherency contract.
*/
if (coh_mode != XE_COH_NONE || IS_DGFX(xe))
return true;
for (i = 0; i < madvise_range->num_vmas; i++) {
struct xe_vma *vma = madvise_range->vmas[i];
struct xe_bo *bo = xe_vma_bo(vma);
if (bo) {
/* BO with WB caching + COH_NONE is not allowed */
if (XE_IOCTL_DBG(xe, bo->cpu_caching == DRM_XE_GEM_CPU_CACHING_WB))
return false;
/* Imported dma-buf without caching info, assume cached */
if (XE_IOCTL_DBG(xe, !bo->cpu_caching))
return false;
} else if (XE_IOCTL_DBG(xe, xe_vma_is_cpu_addr_mirror(vma) ||
xe_vma_is_userptr(vma)))
/* System memory (userptr/SVM) is always CPU cached */
return false;
}
return true;
}
static bool check_bo_args_are_sane(struct xe_vm *vm, struct xe_vma **vmas,
int num_vmas, u32 atomic_val)
{
@ -750,6 +789,14 @@ int xe_vm_madvise_ioctl(struct drm_device *dev, void *data, struct drm_file *fil
}
}
if (args->type == DRM_XE_MEM_RANGE_ATTR_PAT) {
if (!check_pat_args_are_sane(xe, &madvise_range,
args->pat_index.val)) {
err = -EINVAL;
goto free_vmas;
}
}
if (madvise_range.has_bo_vmas) {
if (args->type == DRM_XE_MEM_RANGE_ATTR_ATOMIC) {
if (!check_bo_args_are_sane(vm, madvise_range.vmas,

@ -743,14 +743,6 @@ static const struct xe_rtp_entry_sr lrc_was[] = {
XE_RTP_RULES(GRAPHICS_VERSION(2001), ENGINE_CLASS(RENDER)),
XE_RTP_ACTIONS(SET(WM_CHICKEN3, HIZ_PLANE_COMPRESSION_DIS))
},
{ XE_RTP_NAME("14019988906"),
XE_RTP_RULES(GRAPHICS_VERSION_RANGE(2001, 2002), ENGINE_CLASS(RENDER)),
XE_RTP_ACTIONS(SET(XEHP_PSS_CHICKEN, FLSH_IGNORES_PSD))
},
{ XE_RTP_NAME("14019877138"),
XE_RTP_RULES(GRAPHICS_VERSION_RANGE(2001, 2002), ENGINE_CLASS(RENDER)),
XE_RTP_ACTIONS(SET(XEHP_PSS_CHICKEN, FD_END_COLLECT))
},
{ XE_RTP_NAME("14021490052"),
XE_RTP_RULES(GRAPHICS_VERSION(2001), ENGINE_CLASS(RENDER)),
XE_RTP_ACTIONS(SET(FF_MODE,

@ -273,6 +273,12 @@ int drm_fb_helper_hotplug_event(struct drm_fb_helper *fb_helper);
int drm_fb_helper_initial_config(struct drm_fb_helper *fb_helper);
bool drm_fb_helper_gem_is_fb(const struct drm_fb_helper *fb_helper,
const struct drm_gem_object *obj);
#else
static inline bool drm_fb_helper_gem_is_fb(const struct drm_fb_helper *fb_helper,
const struct drm_gem_object *obj)
{
return false;
}
#endif
#endif
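
The new static inline keeps callers of drm_fb_helper_gem_is_fb() building when CONFIG_DRM_FBDEV_EMULATION=n, where it simply reports that no GEM object belongs to a fbdev framebuffer. A hypothetical caller, compiling identically in both configurations:

/* Hypothetical helper; not from this series. */
static bool example_is_fbdev_bo(struct drm_device *dev,
				struct drm_gem_object *obj)
{
	/* Always false when fbdev emulation is compiled out. */
	return drm_fb_helper_gem_is_fb(dev->fb_helper, obj);
}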

@ -322,13 +322,13 @@ struct dma_buf {
* @vmapping_counter:
*
* Used internally to refcnt the vmaps returned by dma_buf_vmap().
* Protected by @lock.
* Protected by @resv.
*/
unsigned vmapping_counter;
/**
* @vmap_ptr:
* The current vmap ptr if @vmapping_counter > 0. Protected by @lock.
* The current vmap ptr if @vmapping_counter > 0. Protected by @resv.
*/
struct iosys_map vmap_ptr;