From 0b13173d27fa15679463b62a10cfa8b3d6c3a71c Mon Sep 17 00:00:00 2001 From: Xiang Gao Date: Wed, 15 Apr 2026 13:41:01 +0800 Subject: [PATCH 01/77] dma-buf: fix stale @lock references in struct dma_buf documentation MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit The kernel-doc comments for vmapping_counter and vmap_ptr in struct dma_buf reference "@lock" as the protecting lock, but struct dma_buf no longer has a "lock" member. The mutex was removed in favor of using the dma_resv lock exclusively. The implementation correctly uses dma_resv_assert_held(dmabuf->resv) in dma_buf_vmap() and dma_buf_vunmap(), so update the documentation to reference @resv instead. Signed-off-by: gaoxiang17 Reviewed-by: Christian König Signed-off-by: Christian König Link: https://lore.kernel.org/r/20260415054101.535520-1-gxxa03070307@gmail.com --- include/linux/dma-buf.h | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/include/linux/dma-buf.h b/include/linux/dma-buf.h index 133b9e637b55..ef6d93fd7a2c 100644 --- a/include/linux/dma-buf.h +++ b/include/linux/dma-buf.h @@ -322,13 +322,13 @@ struct dma_buf { * @vmapping_counter: * * Used internally to refcnt the vmaps returned by dma_buf_vmap(). - * Protected by @lock. + * Protected by @resv. */ unsigned vmapping_counter; /** * @vmap_ptr: - * The current vmap ptr if @vmapping_counter > 0. Protected by @lock. + * The current vmap ptr if @vmapping_counter > 0. Protected by @resv. */ struct iosys_map vmap_ptr; From 095a8b0ad3c3b5cdc3850d961adb8a8f735220bb Mon Sep 17 00:00:00 2001 From: Arjan van de Ven Date: Mon, 20 Apr 2026 14:57:15 -0700 Subject: [PATCH 02/77] drm/amdgpu: fix zero-size GDS range init on RDNA4 MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit RDNA4 (GFX 12) hardware removes the GDS, GWS, and OA on-chip memory resources. The gfx_v12_0 initialisation code correctly leaves adev->gds.gds_size, adev->gds.gws_size, and adev->gds.oa_size at zero to reflect this. amdgpu_ttm_init() unconditionally calls amdgpu_ttm_init_on_chip() for each of these resources regardless of size. When the size is zero, amdgpu_ttm_init_on_chip() forwards the call to ttm_range_man_init(), which calls drm_mm_init(mm, 0, 0). drm_mm_init() immediately fires DRM_MM_BUG_ON(start + size <= start) -- trivially true when size is zero -- crashing the kernel during modprobe of amdgpu on an RX 9070 XT. Guard against this by returning 0 early from amdgpu_ttm_init_on_chip() when size_in_page is zero. This skips TTM resource manager registration for hardware resources that are absent, without affecting any other GPU type. DRM_MM_BUG_ON() only asserts if CONFIG_DRM_DEBUG_MM is enabled in the kernel config. This is apparently rarely enabled as these chips have been in the market for over a year and this issue was only reported now. Link: https://lore.kernel.org/all/bug-221376-2300@https.bugzilla.kernel.org%2F/ Link: https://bugzilla.kernel.org/show_bug.cgi?id=221376 Oops-Analysis: http://oops.fenrus.org/reports/bugzilla.korg/221376/report.html Assisted-by: GitHub Copilot:Claude Sonnet 4.6 linux-kernel-oops-x86. Signed-off-by: Arjan van de Ven Cc: Alex Deucher Cc: "Christian König" Cc: amd-gfx@lists.freedesktop.org Cc: dri-devel@lists.freedesktop.org Cc: linux-kernel@vger.kernel.org Signed-off-by: Alex Deucher (cherry picked from commit 5719ce5865279cad4fd5f01011fe037168503f2d) Cc: stable@vger.kernel.org --- drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c index 0dc68fb9d88e..3d2e00efc741 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c @@ -75,6 +75,9 @@ static int amdgpu_ttm_init_on_chip(struct amdgpu_device *adev, unsigned int type, uint64_t size_in_page) { + if (!size_in_page) + return 0; + return ttm_range_man_init(&adev->mman.bdev, type, false, size_in_page); } From 4867cef03b58ca53651593842efcfd0587a707f2 Mon Sep 17 00:00:00 2001 From: Roman Li Date: Wed, 15 Apr 2026 17:45:10 -0400 Subject: [PATCH 03/77] drm/amd/display: Restore analog connector support MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit [Why] The analog connector support was accidentally removed, causing a crash when connecting an analog monitor. [How] This patch restores the functions and pointers required for proper analog and DP bridge encoder support on legacy GPUs. V2: Restore the external encoder control functions. V3: - Restore BIOS parser external encoder DAC load detection - Restore stream initialization and source selection changes Fixes: e56e3cff2a1b ("drm/amd/display: Sync dcn42 with DC 3.2.373") Cc: Timur Kristóf Signed-off-by: Roman Li Reviewed-by: Alex Hung Reviewed-by: Timur Kristóf Tested-by: Timur Kristóf Signed-off-by: Alex Deucher (cherry picked from commit cea8349e4494d2892ea57eef3fe4a8987464a876) --- .../gpu/drm/amd/display/dc/bios/bios_parser.c | 11 ++- .../gpu/drm/amd/display/dc/dc_bios_types.h | 3 +- .../amd/display/dc/hwss/dce110/dce110_hwseq.c | 94 ++++++++++++------- 3 files changed, 71 insertions(+), 37 deletions(-) diff --git a/drivers/gpu/drm/amd/display/dc/bios/bios_parser.c b/drivers/gpu/drm/amd/display/dc/bios/bios_parser.c index dd362071a6c9..e270b1d2457c 100644 --- a/drivers/gpu/drm/amd/display/dc/bios/bios_parser.c +++ b/drivers/gpu/drm/amd/display/dc/bios/bios_parser.c @@ -794,11 +794,13 @@ static enum bp_result bios_parser_external_encoder_control( static enum bp_result bios_parser_dac_load_detection( struct dc_bios *dcb, - enum engine_id engine_id) + enum engine_id engine_id, + struct graphics_object_id ext_enc_id) { struct bios_parser *bp = BP_FROM_DCB(dcb); struct dc_context *ctx = dcb->ctx; struct bp_load_detection_parameters bp_params = {0}; + struct bp_external_encoder_control ext_cntl = {0}; enum bp_result bp_result = BP_RESULT_UNSUPPORTED; uint32_t bios_0_scratch; uint32_t device_id_mask = 0; @@ -824,6 +826,13 @@ static enum bp_result bios_parser_dac_load_detection( bp_params.engine_id = engine_id; bp_result = bp->cmd_tbl.dac_load_detection(bp, &bp_params); + } else if (ext_enc_id.id) { + if (!bp->cmd_tbl.external_encoder_control) + return BP_RESULT_UNSUPPORTED; + + ext_cntl.action = EXTERNAL_ENCODER_CONTROL_DAC_LOAD_DETECT; + ext_cntl.encoder_id = ext_enc_id; + bp_result = bp->cmd_tbl.external_encoder_control(bp, &ext_cntl); } if (bp_result != BP_RESULT_OK) diff --git a/drivers/gpu/drm/amd/display/dc/dc_bios_types.h b/drivers/gpu/drm/amd/display/dc/dc_bios_types.h index 6f96c5cf39fe..526f71616f94 100644 --- a/drivers/gpu/drm/amd/display/dc/dc_bios_types.h +++ b/drivers/gpu/drm/amd/display/dc/dc_bios_types.h @@ -102,7 +102,8 @@ struct dc_vbios_funcs { struct bp_external_encoder_control *cntl); enum bp_result (*dac_load_detection)( struct dc_bios *bios, - enum engine_id engine_id); + enum engine_id engine_id, + struct graphics_object_id ext_enc_id); enum bp_result (*transmitter_control)( struct dc_bios *bios, struct bp_transmitter_control *cntl); diff --git a/drivers/gpu/drm/amd/display/dc/hwss/dce110/dce110_hwseq.c b/drivers/gpu/drm/amd/display/dc/hwss/dce110/dce110_hwseq.c index 5273ca09fe12..f0abbb7c2cb2 100644 --- a/drivers/gpu/drm/amd/display/dc/hwss/dce110/dce110_hwseq.c +++ b/drivers/gpu/drm/amd/display/dc/hwss/dce110/dce110_hwseq.c @@ -665,16 +665,45 @@ void dce110_update_info_frame(struct pipe_ctx *pipe_ctx) } static void -dce110_dac_encoder_control(struct pipe_ctx *pipe_ctx, bool enable) +dce110_external_encoder_control(enum bp_external_encoder_control_action action, + struct dc_link *link, + struct dc_crtc_timing *timing) { - struct dc_link *link = pipe_ctx->stream->link; + struct dc *dc = link->ctx->dc; struct dc_bios *bios = link->ctx->dc_bios; - struct bp_encoder_control encoder_control = {0}; + const struct dc_link_settings *link_settings = &link->cur_link_settings; + enum bp_result bp_result = BP_RESULT_OK; + struct bp_external_encoder_control ext_cntl = { + .action = action, + .connector_obj_id = link->link_enc->connector, + .encoder_id = link->ext_enc_id, + .lanes_number = link_settings->lane_count, + .link_rate = link_settings->link_rate, - encoder_control.action = enable ? ENCODER_CONTROL_ENABLE : ENCODER_CONTROL_DISABLE; - encoder_control.engine_id = link->link_enc->analog_engine; - encoder_control.pixel_clock = pipe_ctx->stream->timing.pix_clk_100hz / 10; - bios->funcs->encoder_control(bios, &encoder_control); + /* Use signal type of the real link encoder, ie. DP */ + .signal = link->connector_signal, + + /* We don't know the timing yet when executing the SETUP action, + * so use a reasonably high default value. It seems that ENABLE + * can change the actual pixel clock but doesn't work with higher + * pixel clocks than what SETUP was called with. + */ + .pixel_clock = timing ? timing->pix_clk_100hz / 10 : 300000, + .color_depth = timing ? timing->display_color_depth : COLOR_DEPTH_888, + }; + DC_LOGGER_INIT(dc->ctx); + + bp_result = bios->funcs->external_encoder_control(bios, &ext_cntl); + + if (bp_result != BP_RESULT_OK) + DC_LOG_ERROR("Failed to execute external encoder action: 0x%x\n", action); +} + +static void +dce110_prepare_ddc(struct dc_link *link) +{ + if (link->ext_enc_id.id) + dce110_external_encoder_control(EXTERNAL_ENCODER_CONTROL_DDC_SETUP, link, NULL); } static bool @@ -684,7 +713,8 @@ dce110_dac_load_detect(struct dc_link *link) struct link_encoder *link_enc = link->link_enc; enum bp_result bp_result; - bp_result = bios->funcs->dac_load_detection(bios, link_enc->analog_engine); + bp_result = bios->funcs->dac_load_detection( + bios, link_enc->analog_engine, link->ext_enc_id); return bp_result == BP_RESULT_OK; } @@ -700,7 +730,6 @@ void dce110_enable_stream(struct pipe_ctx *pipe_ctx) uint32_t early_control = 0; struct timing_generator *tg = pipe_ctx->stream_res.tg; - link_hwss->setup_stream_attribute(pipe_ctx); link_hwss->setup_stream_encoder(pipe_ctx); dc->hwss.update_info_frame(pipe_ctx); @@ -719,8 +748,8 @@ void dce110_enable_stream(struct pipe_ctx *pipe_ctx) tg->funcs->set_early_control(tg, early_control); - if (dc_is_rgb_signal(pipe_ctx->stream->signal)) - dce110_dac_encoder_control(pipe_ctx, true); + if (link->ext_enc_id.id) + dce110_external_encoder_control(EXTERNAL_ENCODER_CONTROL_ENABLE, link, timing); } static enum bp_result link_transmitter_control( @@ -1219,8 +1248,8 @@ void dce110_disable_stream(struct pipe_ctx *pipe_ctx) link_enc->transmitter - TRANSMITTER_UNIPHY_A); } - if (dc_is_rgb_signal(pipe_ctx->stream->signal)) - dce110_dac_encoder_control(pipe_ctx, false); + if (link->ext_enc_id.id) + dce110_external_encoder_control(EXTERNAL_ENCODER_CONTROL_DISABLE, link, NULL); } void dce110_unblank_stream(struct pipe_ctx *pipe_ctx, @@ -1603,22 +1632,6 @@ static enum dc_status dce110_enable_stream_timing( return DC_OK; } -static void -dce110_select_crtc_source(struct pipe_ctx *pipe_ctx) -{ - struct dc_link *link = pipe_ctx->stream->link; - struct dc_bios *bios = link->ctx->dc_bios; - struct bp_crtc_source_select crtc_source_select = {0}; - enum engine_id engine_id = link->link_enc->preferred_engine; - - if (dc_is_rgb_signal(pipe_ctx->stream->signal)) - engine_id = link->link_enc->analog_engine; - crtc_source_select.controller_id = CONTROLLER_ID_D0 + pipe_ctx->stream_res.tg->inst; - crtc_source_select.color_depth = pipe_ctx->stream->timing.display_color_depth; - crtc_source_select.engine_id = engine_id; - crtc_source_select.sink_signal = pipe_ctx->stream->signal; - bios->funcs->select_crtc_source(bios, &crtc_source_select); -} enum dc_status dce110_apply_single_controller_ctx_to_hw( struct pipe_ctx *pipe_ctx, @@ -1639,10 +1652,6 @@ enum dc_status dce110_apply_single_controller_ctx_to_hw( hws->funcs.disable_stream_gating(dc, pipe_ctx); } - if (pipe_ctx->stream->signal == SIGNAL_TYPE_RGB) { - dce110_select_crtc_source(pipe_ctx); - } - if (pipe_ctx->stream_res.audio != NULL) { struct audio_output audio_output = {0}; @@ -1722,8 +1731,7 @@ enum dc_status dce110_apply_single_controller_ctx_to_hw( pipe_ctx->stream_res.tg->funcs->set_static_screen_control( pipe_ctx->stream_res.tg, event_triggers, 2); - if (!dc_is_virtual_signal(pipe_ctx->stream->signal) && - !dc_is_rgb_signal(pipe_ctx->stream->signal)) + if (!dc_is_virtual_signal(pipe_ctx->stream->signal)) pipe_ctx->stream_res.stream_enc->funcs->dig_connect_to_otg( pipe_ctx->stream_res.stream_enc, pipe_ctx->stream_res.tg->inst); @@ -3376,6 +3384,15 @@ void dce110_enable_tmds_link_output(struct dc_link *link, link->phy_state.symclk_state = SYMCLK_ON_TX_ON; } +static void dce110_enable_analog_link_output( + struct dc_link *link, + uint32_t pix_clk_100hz) +{ + link->link_enc->funcs->enable_analog_output( + link->link_enc, + pix_clk_100hz); +} + void dce110_enable_dp_link_output( struct dc_link *link, const struct link_resource *link_res, @@ -3423,6 +3440,11 @@ void dce110_enable_dp_link_output( } } + if (link->ext_enc_id.id) { + dce110_external_encoder_control(EXTERNAL_ENCODER_CONTROL_INIT, link, NULL); + dce110_external_encoder_control(EXTERNAL_ENCODER_CONTROL_SETUP, link, NULL); + } + if (dc->link_srv->dp_get_encoding_format(link_settings) == DP_8b_10b_ENCODING) { if (dc->clk_mgr->funcs->notify_link_rate_change) dc->clk_mgr->funcs->notify_link_rate_change(dc->clk_mgr, link); @@ -3513,8 +3535,10 @@ static const struct hw_sequencer_funcs dce110_funcs = { .enable_lvds_link_output = dce110_enable_lvds_link_output, .enable_tmds_link_output = dce110_enable_tmds_link_output, .enable_dp_link_output = dce110_enable_dp_link_output, + .enable_analog_link_output = dce110_enable_analog_link_output, .disable_link_output = dce110_disable_link_output, .dac_load_detect = dce110_dac_load_detect, + .prepare_ddc = dce110_prepare_ddc, }; static const struct hwseq_private_funcs dce110_private_funcs = { From 508babf310365f1107a2e8831c267c292a286818 Mon Sep 17 00:00:00 2001 From: Hongyan Xu Date: Wed, 22 Apr 2026 20:38:17 +0800 Subject: [PATCH 04/77] drm/amdgpu: avoid double drm_exec_fini() in userq validate MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit When new_addition is true, amdgpu_userq_vm_validate() calls drm_exec_fini(&exec) before iterating over the collected HMM ranges and calling amdgpu_ttm_tt_get_user_pages(). If amdgpu_ttm_tt_get_user_pages() fails in that path, the code jumps to unlock_all and calls drm_exec_fini(&exec) a second time on the same exec object. drm_exec_fini() is not idempotent: it frees exec->objects and may also drop exec->contended and finalize the ww acquire context. Route that error path directly to the range cleanup once exec has already been finalized. Fixes: 42f148788469 ("drm/amdgpu/userqueue: validate userptrs for userqueues") Issue found using a prototype static analysis tool and confirmed by code review. Reviewed-by: Christian König Signed-off-by: Hongyan Xu Signed-off-by: Slavin Liu <220245772@seu.edu.cn> Signed-off-by: Alex Deucher (cherry picked from commit 2802952e4a07306da6ebe813ff1acacc5691851a) --- drivers/gpu/drm/amd/amdgpu/amdgpu_userq.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_userq.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_userq.c index d5abf785ca17..a15ca3e35344 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_userq.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_userq.c @@ -1187,7 +1187,7 @@ amdgpu_userq_vm_validate(struct amdgpu_userq_mgr *uq_mgr) bo = range->bo; ret = amdgpu_ttm_tt_get_user_pages(bo, range); if (ret) - goto unlock_all; + goto free_ranges; } invalidated = true; @@ -1214,6 +1214,7 @@ amdgpu_userq_vm_validate(struct amdgpu_userq_mgr *uq_mgr) unlock_all: drm_exec_fini(&exec); +free_ranges: xa_for_each(&xa, tmp_key, range) { if (!range) continue; From 87612bab9656a63affa0e2788e0d7a4a1dffa89e Mon Sep 17 00:00:00 2001 From: "Mario Limonciello (AMD)" Date: Wed, 10 Dec 2025 14:15:08 -0600 Subject: [PATCH 05/77] amdkfd: Only ignore -ENOENT for KFD init failuires When compiled without CONFIG_HSA_AMD KFD will return -ENOENT. As other errors will cause KFD functionality issues this is the only error code that should be ignored at init. Reviewed-by: Kent Russell Signed-off-by: Mario Limonciello (AMD) Signed-off-by: Alex Deucher (cherry picked from commit 4259a25341abf77939767215706f4e3cfd4b73b8) --- drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c index e47921e2a9af..46aae3fad4bf 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c @@ -3158,8 +3158,10 @@ static int __init amdgpu_init(void) amdgpu_register_atpx_handler(); amdgpu_acpi_detect(); - /* Ignore KFD init failures. Normal when CONFIG_HSA_AMD is not set. */ - amdgpu_amdkfd_init(); + /* Ignore KFD init failures when CONFIG_HSA_AMD is not set. */ + r = amdgpu_amdkfd_init(); + if (r && r != -ENOENT) + goto error_fence; if (amdgpu_pp_feature_mask & PP_OVERDRIVE_MASK) { add_taint(TAINT_CPU_OUT_OF_SPEC, LOCKDEP_STILL_OK); From 36d65da7570bf72ce28504fa9a81abfc728e6d96 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Timur=20Krist=C3=B3f?= Date: Sat, 18 Apr 2026 23:49:30 +0200 Subject: [PATCH 06/77] drm/amdgpu/gmc: Fix AMDGPU_GART_PLACEMENT_LOW to not overlap with VRAM MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit When the GART placement is set to AMDGPU_GART_PLACEMENT_LOW: Make sure that GART does not overlap with VRAM when VRAM is configured to be in the low address space. Solve this according to the following logic: - When GART fits before VRAM, use zero address for GART - Otherwise, put GART after the end of VRAM, aligned to 4 GiB Previously, I had assumed this was not possible so it was OK to not handle it, but now we got a report from a user who has a board that is configured this way. Fixes: 917f91d8d8e8 ("drm/amdgpu/gmc: add a way to force a particular placement for GART") Signed-off-by: Timur Kristóf Reviewed-by: Christian König Signed-off-by: Alex Deucher (cherry picked from commit 3d9de5d86a1658cadb311461b001eb1df67263ad) --- drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c index 285e217fba04..3d9497d121ca 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c @@ -314,7 +314,10 @@ void amdgpu_gmc_gart_location(struct amdgpu_device *adev, struct amdgpu_gmc *mc, mc->gart_start = max_mc_address - mc->gart_size + 1; break; case AMDGPU_GART_PLACEMENT_LOW: - mc->gart_start = 0; + if (size_bf >= mc->gart_size) + mc->gart_start = 0; + else + mc->gart_start = ALIGN(mc->fb_end, four_gb); break; case AMDGPU_GART_PLACEMENT_BEST_FIT: default: From ccf8932ed8cf4fbfdcd4df2c6b524913691ee700 Mon Sep 17 00:00:00 2001 From: Yang Wang Date: Wed, 22 Apr 2026 18:41:42 +0800 Subject: [PATCH 07/77] drm/amd/pm: fix missing fine-grained dpm table flag on aldebaran Add the missing SMU_DPM_TABLE_FINE_GRAINED flag to aldebaran DPM table. This fixes the pp_dpm_sclk node issue caused by missing flag configuration. Fixes: 7ea1c722fe1d ("drm/amd/pm: Use common helper for aldebaran dpm table") Signed-off-by: Yang Wang Reviewed-by: Hawking Zhang Signed-off-by: Alex Deucher (cherry picked from commit 3427dea3a48ebddb491a26093f3627384b3cb2c2) --- drivers/gpu/drm/amd/pm/swsmu/smu13/aldebaran_ppt.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu13/aldebaran_ppt.c b/drivers/gpu/drm/amd/pm/swsmu/smu13/aldebaran_ppt.c index 7f386ff0c872..9d8b1227388f 100644 --- a/drivers/gpu/drm/amd/pm/swsmu/smu13/aldebaran_ppt.c +++ b/drivers/gpu/drm/amd/pm/swsmu/smu13/aldebaran_ppt.c @@ -425,6 +425,7 @@ static int aldebaran_set_default_dpm_table(struct smu_context *smu) dpm_table->dpm_levels[0].enabled = true; dpm_table->dpm_levels[1].value = pptable->GfxclkFmax; dpm_table->dpm_levels[1].enabled = true; + dpm_table->flags |= SMU_DPM_TABLE_FINE_GRAINED; } else { dpm_table->count = 1; dpm_table->dpm_levels[0].value = smu->smu_table.boot_values.gfxclk / 100; From 0ef196a208385b7d7da79f411c161b04e97283e2 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Christian=20K=C3=B6nig?= Date: Fri, 17 Apr 2026 15:52:45 +0200 Subject: [PATCH 08/77] drm/amdgpu: fix AMDGPU_INFO_READ_MMR_REG MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit There were multiple issues in that code. First of all the order between the reset semaphore and the mm_lock was wrong (e.g. copy_to_user) was called while holding the lock. Then we allocated memory while holding the reset semaphore which is also a pretty big bug and can deadlock. Then we used down_read_trylock() instead of waiting for the reset to finish. Signed-off-by: Christian König Fixes: 9e823f307074 ("drm/amdgpu: Block MMR_READ IOCTL in reset") Reviewed-by: Alex Deucher Signed-off-by: Alex Deucher (cherry picked from commit 361b6e6b303d4b691f6c5974d3eaab67ca6dd90e) --- drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c | 57 +++++++++++-------------- 1 file changed, 24 insertions(+), 33 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c index 06efce38f323..71272f40feef 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c @@ -873,68 +873,59 @@ int amdgpu_info_ioctl(struct drm_device *dev, void *data, struct drm_file *filp) ? -EFAULT : 0; } case AMDGPU_INFO_READ_MMR_REG: { - int ret = 0; - unsigned int n, alloc_size; - uint32_t *regs; unsigned int se_num = (info->read_mmr_reg.instance >> AMDGPU_INFO_MMR_SE_INDEX_SHIFT) & AMDGPU_INFO_MMR_SE_INDEX_MASK; unsigned int sh_num = (info->read_mmr_reg.instance >> AMDGPU_INFO_MMR_SH_INDEX_SHIFT) & AMDGPU_INFO_MMR_SH_INDEX_MASK; - - if (!down_read_trylock(&adev->reset_domain->sem)) - return -ENOENT; + unsigned int alloc_size; + uint32_t *regs; + int ret; /* set full masks if the userspace set all bits * in the bitfields */ - if (se_num == AMDGPU_INFO_MMR_SE_INDEX_MASK) { + if (se_num == AMDGPU_INFO_MMR_SE_INDEX_MASK) se_num = 0xffffffff; - } else if (se_num >= AMDGPU_GFX_MAX_SE) { - ret = -EINVAL; - goto out; - } + else if (se_num >= AMDGPU_GFX_MAX_SE) + return -EINVAL; - if (sh_num == AMDGPU_INFO_MMR_SH_INDEX_MASK) { + if (sh_num == AMDGPU_INFO_MMR_SH_INDEX_MASK) sh_num = 0xffffffff; - } else if (sh_num >= AMDGPU_GFX_MAX_SH_PER_SE) { - ret = -EINVAL; - goto out; - } + else if (sh_num >= AMDGPU_GFX_MAX_SH_PER_SE) + return -EINVAL; - if (info->read_mmr_reg.count > 128) { - ret = -EINVAL; - goto out; - } + if (info->read_mmr_reg.count > 128) + return -EINVAL; - regs = kmalloc_array(info->read_mmr_reg.count, sizeof(*regs), GFP_KERNEL); - if (!regs) { - ret = -ENOMEM; - goto out; - } + regs = kmalloc_array(info->read_mmr_reg.count, sizeof(*regs), + GFP_KERNEL); + if (!regs) + return -ENOMEM; + down_read(&adev->reset_domain->sem); alloc_size = info->read_mmr_reg.count * sizeof(*regs); - amdgpu_gfx_off_ctrl(adev, false); + ret = 0; for (i = 0; i < info->read_mmr_reg.count; i++) { if (amdgpu_asic_read_register(adev, se_num, sh_num, info->read_mmr_reg.dword_offset + i, ®s[i])) { DRM_DEBUG_KMS("unallowed offset %#x\n", info->read_mmr_reg.dword_offset + i); - kfree(regs); - amdgpu_gfx_off_ctrl(adev, true); ret = -EFAULT; - goto out; + break; } } amdgpu_gfx_off_ctrl(adev, true); - n = copy_to_user(out, regs, min(size, alloc_size)); - kfree(regs); - ret = (n ? -EFAULT : 0); -out: up_read(&adev->reset_domain->sem); + + if (!ret) { + ret = copy_to_user(out, regs, min(size, alloc_size)) + ? -EFAULT : 0; + } + kfree(regs); return ret; } case AMDGPU_INFO_DEV_INFO: { From 13e4cf116dbf7a1fb8123a59bea2c098f30d3736 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Timur=20Krist=C3=B3f?= Date: Sat, 18 Apr 2026 23:49:31 +0200 Subject: [PATCH 09/77] drm/amdgpu/uvd3.1: Don't validate the firmware when already validated MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit UVD 3.1 firmware validation seems to always fail after attempting it when it had already been validated. (This works similarly with the VCE 1.0 as well.) Don't attempt repeating the validation when it's already done. This caused issues in situations when the system isn't able to suspend the GPU properly and so the GPU isn't actually powered down. Then amdgpu would fail when calling the IP block resume function. Closes: https://gitlab.freedesktop.org/drm/amd/-/work_items/2887 Fixes: bb7978111dd3 ("drm/amdgpu: fix SI UVD firmware validate resume fail") Signed-off-by: Timur Kristóf Reviewed-by: Christian König Signed-off-by: Alex Deucher (cherry picked from commit 889a2cfd889c4a4dd9d0c89ce9a8e60b78be71dd) --- drivers/gpu/drm/amd/amdgpu/uvd_v3_1.c | 10 ++++++++++ 1 file changed, 10 insertions(+) diff --git a/drivers/gpu/drm/amd/amdgpu/uvd_v3_1.c b/drivers/gpu/drm/amd/amdgpu/uvd_v3_1.c index fea576a7f397..efb3fde919ee 100644 --- a/drivers/gpu/drm/amd/amdgpu/uvd_v3_1.c +++ b/drivers/gpu/drm/amd/amdgpu/uvd_v3_1.c @@ -242,6 +242,10 @@ static void uvd_v3_1_mc_resume(struct amdgpu_device *adev) uint64_t addr; uint32_t size; + /* When the keyselect is already set, don't perturb it. */ + if (RREG32(mmUVD_FW_START)) + return; + /* program the VCPU memory controller bits 0-27 */ addr = (adev->uvd.inst->gpu_addr + AMDGPU_UVD_FIRMWARE_OFFSET) >> 3; size = AMDGPU_UVD_FIRMWARE_SIZE(adev) >> 3; @@ -284,6 +288,12 @@ static int uvd_v3_1_fw_validate(struct amdgpu_device *adev) int i; uint32_t keysel = adev->uvd.keyselect; + if (RREG32(mmUVD_FW_START) & UVD_FW_STATUS__PASS_MASK) { + dev_dbg(adev->dev, "UVD keyselect already set: 0x%x (on CPU: 0x%x)\n", + RREG32(mmUVD_FW_START), adev->uvd.keyselect); + return 0; + } + WREG32(mmUVD_FW_START, keysel); for (i = 0; i < 10; ++i) { From fe2b84f9228e2a0903221a4d0d8c350b018e9c0c Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Timur=20Krist=C3=B3f?= Date: Sat, 18 Apr 2026 23:49:33 +0200 Subject: [PATCH 10/77] drm/amdgpu/gfx6: Support harvested SI chips with disabled TCCs (v2) MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit This commit fixes amdgpu to work on the Radeon HD 7870 XT which has never worked with the Linux open source drivers before. Some boards have "harvested" chips, meaning that some parts of the chip are disabled and fused, and it's sold for cheaper and under a different marketing name. On a harvested chip, any of the following can be disabled: - CUs (Compute Units) - RBs (Render Backend, aka. ROP) - Memory channels (ie. the chip has a lower bandwidth) - TCCs (ie. less L2 cache) Handle chips with harvested TCCs by patching the registers that configure how TCCs are mapped. If some TCCs are disabled, we need to make sure that the disabled TCCs are not used, and the remaining TCCs are used optimally. TCP_CHAN_STEER_LO/HI control which TCC is used by TCP channels. TCP_ADDR_CONFIG.NUM_TCC_BANKS controls how many channels are used. Note that the TCC configuration is highly relevant to performance. Suboptimal configuration (eg. CHAN_STEER=0) can significantly reduce gaming performance. For optimal performance: - Rely on the CHAN_STEER from the golden registers table, only skip disabled TCCs but keep the mapping order. - Limit NUM_TCC_BANKS to number of active TCCs to avoid thrashing, which performs better than using the same TCC twice. v2: - Also consider CGTS_USER_TCC_DISABLE for disabled TCCs. Link: https://bugs.freedesktop.org/show_bug.cgi?id=60879 Closes: https://gitlab.freedesktop.org/drm/amd/-/work_items/2664 Fixes: 2cd46ad22383 ("drm/amdgpu: add graphic pipeline implementation for si v8") Signed-off-by: Timur Kristóf Reviewed-by: Christian König Signed-off-by: Alex Deucher (cherry picked from commit 00218d15528fab9f6b31241fe5904eea4fcaa30d) --- drivers/gpu/drm/amd/amdgpu/gfx_v6_0.c | 66 +++++++++++++++++++++++++++ 1 file changed, 66 insertions(+) diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v6_0.c b/drivers/gpu/drm/amd/amdgpu/gfx_v6_0.c index 73223d97a87f..ac90d8e9d86a 100644 --- a/drivers/gpu/drm/amd/amdgpu/gfx_v6_0.c +++ b/drivers/gpu/drm/amd/amdgpu/gfx_v6_0.c @@ -1571,6 +1571,71 @@ static void gfx_v6_0_setup_spi(struct amdgpu_device *adev) mutex_unlock(&adev->grbm_idx_mutex); } +/** + * gfx_v6_0_setup_tcc() - setup which TCCs are used + * + * @adev: amdgpu_device pointer + * + * Verify whether the current GPU has any TCCs disabled, + * which can happen when the GPU is harvested and some + * memory channels are disabled, reducing the memory bus width. + * For example, on the Radeon HD 7870 XT (Tahiti LE). + * + * If some TCCs are disabled, we need to make sure that + * the disabled TCCs are not used, and the remaining TCCs + * are used optimally. + * + * TCP_CHAN_STEER_LO/HI control which TCC is used by TCP channels. + * TCP_ADDR_CONFIG.NUM_TCC_BANKS controls how many channels are used. + * + * For optimal performance: + * - Rely on the CHAN_STEER from the golden registers table, + * only skip disabled TCCs but keep the mapping order. + * - Limit NUM_TCC_BANKS to number of active TCCs to avoid thrashing, + * which performs better than using the same TCC twice. + */ +static void gfx_v6_0_setup_tcc(struct amdgpu_device *adev) +{ + u32 i, tcc, tcp_addr_config, num_active_tcc = 0; + u64 chan_steer, patched_chan_steer = 0; + const u32 num_max_tcc = adev->gfx.config.max_texture_channel_caches; + const u32 dis_tcc_mask = + amdgpu_gfx_create_bitmask(num_max_tcc) & + (REG_GET_FIELD(RREG32(mmCGTS_TCC_DISABLE), + CGTS_TCC_DISABLE, TCC_DISABLE) | + REG_GET_FIELD(RREG32(mmCGTS_USER_TCC_DISABLE), + CGTS_USER_TCC_DISABLE, TCC_DISABLE)); + + /* When no TCC is disabled, the golden registers table already has optimal TCC setup */ + if (!dis_tcc_mask) + return; + + /* Each 4-bit nibble contains the index of a TCC used by all TCPs */ + chan_steer = RREG32(mmTCP_CHAN_STEER_LO) | ((u64)RREG32(mmTCP_CHAN_STEER_HI) << 32ull); + + /* Patch the TCP to TCC mapping to skip disabled TCCs */ + for (i = 0; i < num_max_tcc; ++i) { + tcc = (chan_steer >> (u64)(4 * i)) & 0xf; + + if (!((1 << tcc) & dis_tcc_mask)) { + /* Copy enabled TCC indices to the patched register value. */ + patched_chan_steer |= (u64)tcc << (u64)(4 * num_active_tcc); + ++num_active_tcc; + } + } + + WARN_ON(num_active_tcc != num_max_tcc - hweight32(dis_tcc_mask)); + + /* Patch number of TCCs used by TCPs */ + tcp_addr_config = REG_SET_FIELD(RREG32(mmTCP_ADDR_CONFIG), + TCP_ADDR_CONFIG, NUM_TCC_BANKS, + num_active_tcc - 1); + + WREG32(mmTCP_ADDR_CONFIG, tcp_addr_config); + WREG32(mmTCP_CHAN_STEER_HI, upper_32_bits(patched_chan_steer)); + WREG32(mmTCP_CHAN_STEER_LO, lower_32_bits(patched_chan_steer)); +} + static void gfx_v6_0_config_init(struct amdgpu_device *adev) { adev->gfx.config.double_offchip_lds_buf = 0; @@ -1729,6 +1794,7 @@ static void gfx_v6_0_constants_init(struct amdgpu_device *adev) gfx_v6_0_tiling_mode_table_init(adev); gfx_v6_0_setup_rb(adev); + gfx_v6_0_setup_tcc(adev); gfx_v6_0_setup_spi(adev); From 686e5985d9f5ba29e2fd43d618548039727adee2 Mon Sep 17 00:00:00 2001 From: Pierre-Eric Pelloux-Prayer Date: Mon, 20 Apr 2026 10:23:39 +0200 Subject: [PATCH 11/77] drm/amdgpu: fix root reservation in amdgpu_vm_handle_fault MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit svm_range_restore_pages might reserve the root bo so it must be called after unreserving it. Fixes: 1b135c6da061 ("drm/amdgpu: extract amdgpu_vm_lock_by_pasid from amdgpu_vm_handle_fault") Signed-off-by: Pierre-Eric Pelloux-Prayer Reviewed-by: Christian König Signed-off-by: Alex Deucher (cherry picked from commit 5cdc219fe86a1720aa4b5b4f42f11913146e6a93) --- drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 17 ++++++++++++++--- 1 file changed, 14 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c index 115a7b269af3..9ba9de16a27a 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c @@ -3023,11 +3023,22 @@ bool amdgpu_vm_handle_fault(struct amdgpu_device *adev, u32 pasid, is_compute_context = vm->is_compute_context; - if (is_compute_context && !svm_range_restore_pages(adev, pasid, vmid, - node_id, addr >> PAGE_SHIFT, ts, write_fault)) { + if (is_compute_context) { + /* Unreserve root since svm_range_restore_pages might try to reserve it. */ + /* TODO: rework svm_range_restore_pages so that this isn't necessary. */ amdgpu_bo_unreserve(root); + + if (!svm_range_restore_pages(adev, pasid, vmid, + node_id, addr >> PAGE_SHIFT, ts, write_fault)) { + amdgpu_bo_unref(&root); + return true; + } amdgpu_bo_unref(&root); - return true; + + /* Re-acquire the VM lock, could be that the VM was freed in between. */ + vm = amdgpu_vm_lock_by_pasid(adev, &root, pasid); + if (!vm) + return false; } addr /= AMDGPU_GPU_PAGE_SIZE; From b56922fc37454633b831a2a04a1537616742977d Mon Sep 17 00:00:00 2001 From: Kent Russell Date: Wed, 22 Apr 2026 09:34:04 -0400 Subject: [PATCH 12/77] drm/amdgpu: Only send RMA CPER when threshold is exceeded According to our documentation, the RMA should only occur when the threshold has been exceeded, not met. Fixes: 5028a24aa89a ("drm/amdgpu: Send applicable RMA CPERs at end of RAS init") Signed-off-by: Kent Russell Reviewed-by: Tao Zhou Signed-off-by: Alex Deucher (cherry picked from commit 8bc09a7d0e90ec45a0b4865661cf45cbbce1c3d7) --- drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c index cdf4909592d2..0c57fe259894 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c @@ -1950,7 +1950,7 @@ void amdgpu_ras_check_bad_page_status(struct amdgpu_device *adev) if (!control || amdgpu_bad_page_threshold == 0) return; - if (control->ras_num_bad_pages >= ras->bad_page_cnt_threshold) { + if (control->ras_num_bad_pages > ras->bad_page_cnt_threshold) { if (amdgpu_dpm_send_rma_reason(adev)) dev_warn(adev->dev, "Unable to send out-of-band RMA CPER"); else From 47776ac1e3f4a2aefcf7fe7c7e4a11151b676222 Mon Sep 17 00:00:00 2001 From: Shubhankar Milind Sardeshpande Date: Tue, 21 Apr 2026 17:01:21 +0530 Subject: [PATCH 13/77] drm/amdgpu: Avoid reset in AMDGPU unload path for APUs with GFX V11 and higher. GFX V11 has GC block as default off IP. Every time AMDGPU driver sends a request to PMFW to unload MP1, PMFW will put GC in reset and power down the voltage.Hence, skipping reset for APUs with GFX V11 or later to avoid reset related failures. Fixes: 34355e61835e ("drm/amdgpu: Fix GFX hang on SteamDeck when amdgpu is reloaded") Reviewed-by: Alex Deucher Signed-off-by: Shubhankar Milind Sardeshpande Signed-off-by: Alex Deucher (cherry picked from commit d0a8cadffc818f51d05bc234d8da1af228bc59a3) Cc: stable@vger.kernel.org --- drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c index 737ef1ef96a5..66ca043658ff 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c @@ -2839,8 +2839,12 @@ static int amdgpu_device_ip_fini_early(struct amdgpu_device *adev) * that checks whether the PSP is running. A solution for those issues * in the APU is to trigger a GPU reset, but this should be done during * the unload phase to avoid adding boot latency and screen flicker. + * GFX V11 has GC block as default off IP. Every time AMDGPU driver sends + * a request to PMFW to unload MP1, PMFW will put GC in reset and power down + * the voltage. Hence, skipping reset for APUs with GFX V11 or later. */ - if ((adev->flags & AMD_IS_APU) && !adev->gmc.is_app_apu) { + if ((adev->flags & AMD_IS_APU) && !adev->gmc.is_app_apu && + amdgpu_ip_version(adev, GC_HWIP, 0) < IP_VERSION(11, 0, 0)) { r = amdgpu_asic_reset(adev); if (r) dev_err(adev->dev, "asic reset on %s failed\n", __func__); From 045e0ff208f0838a246c10204105126611b267a1 Mon Sep 17 00:00:00 2001 From: Alysa Liu Date: Tue, 21 Apr 2026 10:18:28 -0400 Subject: [PATCH 14/77] drm/amdkfd: validate SVM ioctl nattr against buffer size Validate nattr field against the buffer size, preventing out-of-bounds buffer access via user-controlled attribute count. Reviewed-by: Amir Shetaia Signed-off-by: Alysa Liu Signed-off-by: Alex Deucher (cherry picked from commit 5eca8bfdfa456c3304ca77523718fe24254c172f) Cc: stable@vger.kernel.org --- drivers/gpu/drm/amd/amdkfd/kfd_chardev.c | 26 ++++++++++++++++++++++-- drivers/gpu/drm/amd/amdkfd/kfd_priv.h | 3 +++ 2 files changed, 27 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c index 55ea5145a28a..f829d65a79b4 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c +++ b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c @@ -25,6 +25,7 @@ #include #include #include +#include #include #include #include @@ -1695,6 +1696,16 @@ static int kfd_ioctl_smi_events(struct file *filep, return kfd_smi_event_open(pdd->dev, &args->anon_fd); } +static int kfd_ioctl_svm_validate(void *kdata, unsigned int usize) +{ + struct kfd_ioctl_svm_args *args = kdata; + size_t expected = struct_size(args, attrs, args->nattr); + + if (expected == SIZE_MAX || usize < expected) + return -EINVAL; + return 0; +} + #if IS_ENABLED(CONFIG_HSA_AMD_SVM) static int kfd_ioctl_set_xnack_mode(struct file *filep, @@ -3209,7 +3220,11 @@ static int kfd_ioctl_create_process(struct file *filep, struct kfd_process *p, v #define AMDKFD_IOCTL_DEF(ioctl, _func, _flags) \ [_IOC_NR(ioctl)] = {.cmd = ioctl, .func = _func, .flags = _flags, \ - .cmd_drv = 0, .name = #ioctl} + .validate = NULL, .cmd_drv = 0, .name = #ioctl} + +#define AMDKFD_IOCTL_DEF_V(ioctl, _func, _validate, _flags) \ + [_IOC_NR(ioctl)] = {.cmd = ioctl, .func = _func, .flags = _flags, \ + .validate = _validate, .cmd_drv = 0, .name = #ioctl} /** Ioctl table */ static const struct amdkfd_ioctl_desc amdkfd_ioctls[] = { @@ -3306,7 +3321,8 @@ static const struct amdkfd_ioctl_desc amdkfd_ioctls[] = { AMDKFD_IOCTL_DEF(AMDKFD_IOC_SMI_EVENTS, kfd_ioctl_smi_events, 0), - AMDKFD_IOCTL_DEF(AMDKFD_IOC_SVM, kfd_ioctl_svm, 0), + AMDKFD_IOCTL_DEF_V(AMDKFD_IOC_SVM, kfd_ioctl_svm, + kfd_ioctl_svm_validate, 0), AMDKFD_IOCTL_DEF(AMDKFD_IOC_SET_XNACK_MODE, kfd_ioctl_set_xnack_mode, 0), @@ -3431,6 +3447,12 @@ static long kfd_ioctl(struct file *filep, unsigned int cmd, unsigned long arg) memset(kdata, 0, usize); } + if (ioctl->validate) { + retcode = ioctl->validate(kdata, usize); + if (retcode) + goto err_i1; + } + retcode = func(filep, process, kdata); if (cmd & IOC_OUT) diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h index 6e333bfa17d6..163d665a6074 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h +++ b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h @@ -1047,10 +1047,13 @@ extern struct srcu_struct kfd_processes_srcu; typedef int amdkfd_ioctl_t(struct file *filep, struct kfd_process *p, void *data); +typedef int amdkfd_ioctl_validate_t(void *kdata, unsigned int usize); + struct amdkfd_ioctl_desc { unsigned int cmd; int flags; amdkfd_ioctl_t *func; + amdkfd_ioctl_validate_t *validate; unsigned int cmd_drv; const char *name; }; From d0f5711fa14a09c010537375cf34893cd33bc2ee Mon Sep 17 00:00:00 2001 From: YuanShang Date: Thu, 26 Mar 2026 18:27:30 +0800 Subject: [PATCH 15/77] drm/amdkfd: check if vm ready in svm map and unmap to gpu Don't map or unmap svm range to gpu if vm is not ready for updates. Why: DRM entity may already be killed when the svm worker try to update gpu vm. Signed-off-by: YuanShang Reviewed-by: Philip Yang Signed-off-by: Alex Deucher (cherry picked from commit 55f8e366c326980174a4f2b9501b524d8eb25135) --- drivers/gpu/drm/amd/amdkfd/kfd_svm.c | 11 +++++++++++ 1 file changed, 11 insertions(+) diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c index b120fdb0ef77..38085a0a0f58 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c +++ b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c @@ -1366,6 +1366,12 @@ svm_range_unmap_from_gpu(struct amdgpu_device *adev, struct amdgpu_vm *vm, pr_debug("CPU[0x%llx 0x%llx] -> GPU[0x%llx 0x%llx]\n", start, last, gpu_start, gpu_end); + + if (!amdgpu_vm_ready(vm)) { + pr_debug("VM not ready, canceling unmap\n"); + return -EINVAL; + } + return amdgpu_vm_update_range(adev, vm, false, true, true, false, NULL, gpu_start, gpu_end, init_pte_value, 0, 0, NULL, NULL, fence); @@ -1443,6 +1449,11 @@ svm_range_map_to_gpu(struct kfd_process_device *pdd, struct svm_range *prange, pr_debug("svms 0x%p [0x%lx 0x%lx] readonly %d\n", prange->svms, last_start, last_start + npages - 1, readonly); + if (!amdgpu_vm_ready(vm)) { + pr_debug("VM not ready, canceling map\n"); + return -EINVAL; + } + for (i = offset; i < offset + npages; i++) { uint64_t gpu_start; uint64_t gpu_end; From 3d4c2268bd7243c3780fe32bf24ff876da272acf Mon Sep 17 00:00:00 2001 From: Ashutosh Desai Date: Mon, 20 Apr 2026 01:36:37 +0000 Subject: [PATCH 16/77] drm/gem: Fix inconsistent plane dimension calculation in drm_gem_fb_init_with_funcs() drm_gem_fb_init_with_funcs() computes sub-sampled plane dimensions using plain integer division: unsigned int width = mode_cmd->width / (i ? info->hsub : 1); unsigned int height = mode_cmd->height / (i ? info->vsub : 1); However, the ioctl-level framebuffer_check() in drm_framebuffer.c uses drm_format_info_plane_width/height() which round up dimensions via DIV_ROUND_UP(). This inconsistency corrupts the subsequent GEM object size check for certain pixel format and dimension combinations. For example, with NV12 (vsub=2) and a 1-pixel-tall framebuffer the GEM size validation path sees height=0 instead of height=1. The expression (height - 1) then wraps to UINT_MAX as an unsigned int, causing min_size to overflow and wrap back to a small value. A tiny GEM object therefore passes the size guard, yet when the GPU accesses the chroma plane it will read or write memory beyond the object's bounds. Fix by replacing the open-coded divisions with drm_format_info_plane_width() and drm_format_info_plane_height(), which use DIV_ROUND_UP() and match the calculation already used in framebuffer_check(). Fixes: 4c3dbb2c312c ("drm: Add GEM backed framebuffer library") Cc: stable@vger.kernel.org # v4.14+ Reviewed-by: Thomas Zimmermann Signed-off-by: Ashutosh Desai Signed-off-by: Thomas Zimmermann Link: https://patch.msgid.link/20260420013637.457751-1-ashutoshdesai993@gmail.com --- drivers/gpu/drm/drm_gem_framebuffer_helper.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/drm_gem_framebuffer_helper.c b/drivers/gpu/drm/drm_gem_framebuffer_helper.c index 9166c353f131..88808e972cc1 100644 --- a/drivers/gpu/drm/drm_gem_framebuffer_helper.c +++ b/drivers/gpu/drm/drm_gem_framebuffer_helper.c @@ -172,8 +172,8 @@ int drm_gem_fb_init_with_funcs(struct drm_device *dev, } for (i = 0; i < info->num_planes; i++) { - unsigned int width = mode_cmd->width / (i ? info->hsub : 1); - unsigned int height = mode_cmd->height / (i ? info->vsub : 1); + unsigned int width = drm_format_info_plane_width(info, mode_cmd->width, i); + unsigned int height = drm_format_info_plane_height(info, mode_cmd->height, i); unsigned int min_size; objs[i] = drm_gem_object_lookup(file, mode_cmd->handles[i]); From 4aa8110000b0d215deef8eed283565dd0c1def88 Mon Sep 17 00:00:00 2001 From: Yuho Choi Date: Sun, 19 Apr 2026 20:25:13 -0400 Subject: [PATCH 17/77] drm/sysfb: ofdrm: fix PCI device reference leaks display_get_pci_dev_of() gets a referenced PCI device via pci_get_device(). Drop that reference when pci_enable_device() fails and release it during the managed teardown path after pci_disable_device(). Without that, ofdrm leaks the pci_dev reference on both the error path and the normal cleanup path. Fixes: c8a17756c425 ("drm/ofdrm: Add ofdrm for Open Firmware framebuffers") Co-developed-by: Myeonghun Pak Signed-off-by: Myeonghun Pak Co-developed-by: Ijae Kim Signed-off-by: Ijae Kim Co-developed-by: Taegyu Kim Signed-off-by: Taegyu Kim Signed-off-by: Yuho Choi Reviewed-by: Thomas Zimmermann Signed-off-by: Thomas Zimmermann Link: https://patch.msgid.link/20260420002513.216-1-dbgh9129@gmail.com --- drivers/gpu/drm/sysfb/ofdrm.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/drivers/gpu/drm/sysfb/ofdrm.c b/drivers/gpu/drm/sysfb/ofdrm.c index d38ba70f4e0d..247cf13c80a0 100644 --- a/drivers/gpu/drm/sysfb/ofdrm.c +++ b/drivers/gpu/drm/sysfb/ofdrm.c @@ -350,6 +350,7 @@ static void ofdrm_pci_release(void *data) struct pci_dev *pcidev = data; pci_disable_device(pcidev); + pci_dev_put(pcidev); } static int ofdrm_device_init_pci(struct ofdrm_device *odev) @@ -375,6 +376,7 @@ static int ofdrm_device_init_pci(struct ofdrm_device *odev) if (ret) { drm_err(dev, "pci_enable_device(%s) failed: %d\n", dev_name(&pcidev->dev), ret); + pci_dev_put(pcidev); return ret; } ret = devm_add_action_or_reset(&pdev->dev, ofdrm_pci_release, pcidev); From aaaa684bab1f6d9ecfc49db328facb1771fd0eb2 Mon Sep 17 00:00:00 2001 From: Sasha Finkelstein Date: Mon, 20 Apr 2026 14:17:43 +0200 Subject: [PATCH 18/77] drm/appletbdrm: Use kvzalloc for big allocations This driver is attached to a ~2000x80 screen, which is a lot more than a single page. This causes out of memory errors in some rare cases. Reported-by: soopyc Closes: https://github.com/t2linux/fedora/issues/51 Signed-off-by: Sasha Finkelstein Signed-off-by: Thomas Zimmermann Reviewed-by: Aditya Garg Reviewed-by: Thomas Zimmermann Fixes: 0670c2f56e45 ("drm/tiny: add driver for Apple Touch Bars in x86 Macs") Cc: # v6.15+ Link: https://patch.msgid.link/20260420-x86-tb-vmalloc-v1-1-7757ff657223@chaosmail.tech --- drivers/gpu/drm/tiny/appletbdrm.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/tiny/appletbdrm.c b/drivers/gpu/drm/tiny/appletbdrm.c index 3bae91d7eefe..278bb23fe4c8 100644 --- a/drivers/gpu/drm/tiny/appletbdrm.c +++ b/drivers/gpu/drm/tiny/appletbdrm.c @@ -353,7 +353,7 @@ static int appletbdrm_primary_plane_helper_atomic_check(struct drm_plane *plane, frames_size + sizeof(struct appletbdrm_fb_request_footer), 16); - appletbdrm_state->request = kzalloc(request_size, GFP_KERNEL); + appletbdrm_state->request = kvzalloc(request_size, GFP_KERNEL); if (!appletbdrm_state->request) return -ENOMEM; @@ -543,7 +543,7 @@ static void appletbdrm_primary_plane_destroy_state(struct drm_plane *plane, { struct appletbdrm_plane_state *appletbdrm_state = to_appletbdrm_plane_state(state); - kfree(appletbdrm_state->request); + kvfree(appletbdrm_state->request); kfree(appletbdrm_state->response); __drm_gem_destroy_shadow_plane_state(&appletbdrm_state->base); From 9d5a2b8f6281f6090002517fb9272ea07038afe8 Mon Sep 17 00:00:00 2001 From: Geert Uytterhoeven Date: Tue, 21 Apr 2026 09:48:32 +0200 Subject: [PATCH 19/77] drm/color-mgmt: Typo s/R332/RGB332/ Fix a typo of "RGB332" in kerneldoc for the drm_crtc_fill_palette_332() helper. Fixes: 7ff61177b7116825 ("drm/color-mgmt: Prepare for RGB332 palettes") Signed-off-by: Geert Uytterhoeven Reviewed-by: Javier Martinez Canillas Reviewed-by: Thomas Zimmermann Signed-off-by: Thomas Zimmermann Link: https://patch.msgid.link/c413e45c8f752a532a4ff377f7a8b9eaab4a082a.1776757681.git.geert+renesas@glider.be --- drivers/gpu/drm/drm_color_mgmt.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/drm_color_mgmt.c b/drivers/gpu/drm/drm_color_mgmt.c index c598b99673fc..e7db4e4ea700 100644 --- a/drivers/gpu/drm/drm_color_mgmt.c +++ b/drivers/gpu/drm/drm_color_mgmt.c @@ -831,7 +831,7 @@ static void fill_palette_332(struct drm_crtc *crtc, u16 r, u16 g, u16 b, } /** - * drm_crtc_fill_palette_332 - Programs a default palette for R332-like formats + * drm_crtc_fill_palette_332 - Programs a default palette for RGB332-like formats * @crtc: The displaying CRTC * @set_palette: Callback for programming the hardware gamma LUT * From 2454bd74cb989365edda476c4bd1c4e90eacf568 Mon Sep 17 00:00:00 2001 From: Aditya Garg Date: Fri, 24 Apr 2026 17:59:14 +0000 Subject: [PATCH 20/77] MAINTAINERS, mailmap: update Aditya Garg's email address My Outlook email address often sends emails from kernel devs to the junk folder. Also, emails from some addresses (eg suse.de) are not received at all. Update the email to my alternate Proton Mail address. Signed-off-by: Aditya Garg Acked-by: Thomas Zimmermann Signed-off-by: Thomas Zimmermann Link: https://patch.msgid.link/20260424175846.15103-1-gargaditya08@proton.me --- .mailmap | 1 + MAINTAINERS | 2 +- 2 files changed, 2 insertions(+), 1 deletion(-) diff --git a/.mailmap b/.mailmap index 34acd34bbf9b..35be6d46bde3 100644 --- a/.mailmap +++ b/.mailmap @@ -19,6 +19,7 @@ Abhinav Kumar Ahmad Masri Adam Oldham Adam Radford +Aditya Garg Adriana Reus Adrian Bunk Ajay Kaher diff --git a/MAINTAINERS b/MAINTAINERS index 2fb1c75afd16..a86040ecad91 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -7873,7 +7873,7 @@ F: drivers/gpu/drm/sun4i/sun8i* DRM DRIVER FOR APPLE TOUCH BARS M: Aun-Ali Zaidi -M: Aditya Garg +M: Aditya Garg L: dri-devel@lists.freedesktop.org S: Maintained T: git https://gitlab.freedesktop.org/drm/misc/kernel.git From 5dfd429591f8d7185bf63a08b5c30863fb605611 Mon Sep 17 00:00:00 2001 From: Brajesh Gupta Date: Mon, 27 Apr 2026 11:01:37 +0530 Subject: [PATCH 21/77] drm/imagination: Fix segfault when updating ftrace mask Fix invalid data access by passing right data for debugfs entry. [ 171.549793] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000 [ 171.559248] Mem abort info: [ 171.562173] ESR = 0x0000000096000044 [ 171.566227] EC = 0x25: DABT (current EL), IL = 32 bits [ 171.573108] SET = 0, FnV = 0 [ 171.576448] EA = 0, S1PTW = 0 [ 171.579745] FSC = 0x04: level 0 translation fault [ 171.584760] Data abort info: [ 171.588012] ISV = 0, ISS = 0x00000044, ISS2 = 0x00000000 [ 171.593734] CM = 0, WnR = 1, TnD = 0, TagAccess = 0 [ 171.598962] GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0 [ 171.604471] user pgtable: 4k pages, 48-bit VAs, pgdp=0000000083837000 [ 171.611358] [0000000000000000] pgd=0000000000000000, p4d=0000000000000000 [ 171.618500] Internal error: Oops: 0000000096000044 [#1] SMP [ 171.624222] Modules linked in: powervr drm_shmem_helper drm_gpuvm... [ 171.656580] CPU: 0 UID: 0 PID: 549 Comm: bash Not tainted 7.0.0-rc2-g730b257ba723-dirty #13 PREEMPT [ 171.665773] Hardware name: BeagleBoard.org BeaglePlay (DT) [ 171.671296] pstate: 20000005 (nzCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--) [ 171.678306] pc : pvr_fw_trace_mask_set+0x78/0x154 [powervr] [ 171.683959] lr : pvr_fw_trace_mask_set+0x4c/0x154 [powervr] [ 171.689593] sp : ffff8000835ebb90 [ 171.692929] x29: ffff8000835ebc00 x28: ffff000005c60f80 x27: 0000000000000000 [ 171.700130] x26: 0000000000000000 x25: ffff00000504af28 x24: 0000000000000000 [ 171.707324] x23: ffff00000504af50 x22: 0000000000000203 x21: 0000000000000000 [ 171.714518] x20: ffff000005c44a80 x19: ffff000005c457b8 x18: 0000000000000000 [ 171.721715] x17: 0000000000000000 x16: 0000000000000000 x15: 0000aaaae8887580 [ 171.728908] x14: 0000000000000000 x13: 0000000000000000 x12: ffff8000835ebc30 [ 171.736095] x11: ffff00000504af2a x10: ffff00008504af29 x9 : 0fffffffffffffff [ 171.743286] x8 : ffff8000835ebbf8 x7 : 0000000000000000 x6 : 000000000000002a [ 171.750479] x5 : ffff00000504af2e x4 : 0000000000000000 x3 : 0000000000000010 [ 171.757674] x2 : 0000000000000203 x1 : 0000000000000000 x0 : ffff8000835ebba0 [ 171.764871] Call trace: [ 171.767342] pvr_fw_trace_mask_set+0x78/0x154 [powervr] (P) [ 171.772984] simple_attr_write_xsigned.isra.0+0xe0/0x19c [ 171.778341] simple_attr_write+0x18/0x24 [ 171.782296] debugfs_attr_write+0x50/0x98 [ 171.786341] full_proxy_write+0x6c/0xa8 [ 171.790208] vfs_write+0xd4/0x350 [ 171.793561] ksys_write+0x70/0x108 [ 171.796995] __arm64_sys_write+0x1c/0x28 [ 171.800952] invoke_syscall+0x48/0x10c [ 171.804740] el0_svc_common.constprop.0+0x40/0xe0 [ 171.809487] do_el0_svc+0x1c/0x28 [ 171.812834] el0_svc+0x34/0x108 [ 171.816013] el0t_64_sync_handler+0xa0/0xe4 [ 171.820237] el0t_64_sync+0x198/0x19c [ 171.823939] Code: 32000262 b90ac293 1a931056 9134e293 (b9000036) [ 171.830073] ---[ end trace 0000000000000000 ]--- Fixes: a331631496a0 ("drm/imagination: Simplify module parameters") Signed-off-by: Brajesh Gupta Reviewed-by: Alessio Belle Cc: stable@vger.kernel.org Link: https://patch.msgid.link/20260427-ftrace_fix-v3-1-e081530759a8@imgtec.com Signed-off-by: Matt Coster --- drivers/gpu/drm/imagination/pvr_fw_trace.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/imagination/pvr_fw_trace.c b/drivers/gpu/drm/imagination/pvr_fw_trace.c index e154cb35f604..6193811ef7be 100644 --- a/drivers/gpu/drm/imagination/pvr_fw_trace.c +++ b/drivers/gpu/drm/imagination/pvr_fw_trace.c @@ -558,6 +558,6 @@ pvr_fw_trace_debugfs_init(struct pvr_device *pvr_dev, struct dentry *dir) &pvr_fw_trace_fops); } - debugfs_create_file("trace_mask", 0600, dir, fw_trace, + debugfs_create_file("trace_mask", 0600, dir, pvr_dev, &pvr_fw_trace_mask_fops); } From ac2c996675755c725a0065dbe3e2ebffded9080b Mon Sep 17 00:00:00 2001 From: Shixiong Ou Date: Fri, 24 Apr 2026 20:44:27 +0800 Subject: [PATCH 22/77] drm/udl: Increase GET_URB_TIMEOUT [WHY] A situation has occurred where udl_handle_damage() executed successfully and the kernel log appears normal, but the display fails to show any output. This is because the call to udl_get_urb() in udl_crtc_helper_atomic_enable() failed without generating any error message. [HOW] 1. Increase timeout of getting urb. 2. Add error messages when calling udl_get_urb() failed in udl_crtc_helper_atomic_enable(). Signed-off-by: Shixiong Ou Reviewed-by: Thomas Zimmermann Fixes: 5320918b9a87 ("drm/udl: initial UDL driver (v4)") Signed-off-by: Thomas Zimmermann Cc: # v3.4+ Link: https://patch.msgid.link/20260424124427.657-1-oushixiong1025@163.com --- drivers/gpu/drm/udl/udl_main.c | 3 +-- drivers/gpu/drm/udl/udl_modeset.c | 5 ++++- 2 files changed, 5 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/udl/udl_main.c b/drivers/gpu/drm/udl/udl_main.c index 08a0e9480d70..17950fe3a0ec 100644 --- a/drivers/gpu/drm/udl/udl_main.c +++ b/drivers/gpu/drm/udl/udl_main.c @@ -285,13 +285,12 @@ static struct urb *udl_get_urb_locked(struct udl_device *udl, long timeout) return unode->urb; } -#define GET_URB_TIMEOUT HZ struct urb *udl_get_urb(struct udl_device *udl) { struct urb *urb; spin_lock_irq(&udl->urbs.lock); - urb = udl_get_urb_locked(udl, GET_URB_TIMEOUT); + urb = udl_get_urb_locked(udl, HZ * 2); spin_unlock_irq(&udl->urbs.lock); return urb; } diff --git a/drivers/gpu/drm/udl/udl_modeset.c b/drivers/gpu/drm/udl/udl_modeset.c index 231e829bd709..1ca073a4ecb2 100644 --- a/drivers/gpu/drm/udl/udl_modeset.c +++ b/drivers/gpu/drm/udl/udl_modeset.c @@ -21,6 +21,7 @@ #include #include #include +#include #include #include @@ -342,8 +343,10 @@ static void udl_crtc_helper_atomic_enable(struct drm_crtc *crtc, struct drm_atom return; urb = udl_get_urb(udl); - if (!urb) + if (!urb) { + drm_err_ratelimited(dev, "get urb failed when enabling crtc\n"); goto out; + } buf = (char *)urb->transfer_buffer; buf = udl_vidreg_lock(buf); From 927011b65a875302d08709bbe82eaf4d0d96c5d5 Mon Sep 17 00:00:00 2001 From: Yury Norov Date: Mon, 27 Apr 2026 22:49:41 -0400 Subject: [PATCH 23/77] drm/amdgpu: fix build for CONFIG_DRM_FBDEV_EMULATION=n MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit The merge-commit 02e778f12359 ("Merge tag 'amd-drm-next-7.1-2026-03-12' of https://gitlab.freedesktop.org/agd5f/linux into drm-next") removes the stub for drm_fb_helper_gem_is_fb(), so the buld gets broken if DRM_FBDEV_EMULATION is not set. ‘drm_fb_helper_gem_is_fb’; did you mean ‘drm_fb_helper_from_client’? [-Wimplicit-function-declaration] 1777 | if (!drm_fb_helper_gem_is_fb(dev->fb_helper, fb->obj[0])) { | ^~~~~~~~~~~~~~~~~~~~~~~ | drm_fb_helper_from_client Restore it. Fixes: 02e778f12359 ("Merge tag 'amd-drm-next-7.1-2026-03-12' of https://gitlab.freedesktop.org/agd5f/linux into drm-next") Reviewed-by: Thomas Zimmermann Signed-off-by: Yury Norov Signed-off-by: Alex Deucher (cherry picked from commit 7b81bc38e92c2522484c42671401eaa023ae8831) --- include/drm/drm_fb_helper.h | 6 ++++++ 1 file changed, 6 insertions(+) diff --git a/include/drm/drm_fb_helper.h b/include/drm/drm_fb_helper.h index bf391903443d..0c5e5ed7b5e7 100644 --- a/include/drm/drm_fb_helper.h +++ b/include/drm/drm_fb_helper.h @@ -273,6 +273,12 @@ int drm_fb_helper_hotplug_event(struct drm_fb_helper *fb_helper); int drm_fb_helper_initial_config(struct drm_fb_helper *fb_helper); bool drm_fb_helper_gem_is_fb(const struct drm_fb_helper *fb_helper, const struct drm_gem_object *obj); +#else +static inline bool drm_fb_helper_gem_is_fb(const struct drm_fb_helper *fb_helper, + const struct drm_gem_object *obj) +{ + return false; +} #endif #endif From d2f272a36e1b4b857165021cfb2689a92efff2f5 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Christian=20K=C3=B6nig?= Date: Mon, 20 Apr 2026 16:08:35 +0200 Subject: [PATCH 24/77] drm/amdgpu: rework userq fence signal processing MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Move more code into a common userq function. Signed-off-by: Christian König Reviewed-by: Sunil Khatri Signed-off-by: Alex Deucher (cherry picked from commit 12f52fab11500d0dce7d23c71909eaf0cf9aa701) --- drivers/gpu/drm/amd/amdgpu/amdgpu_userq.c | 13 +++++++++++++ drivers/gpu/drm/amd/amdgpu/amdgpu_userq.h | 1 + drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c | 10 +--------- drivers/gpu/drm/amd/amdgpu/gfx_v12_0.c | 10 +--------- drivers/gpu/drm/amd/amdgpu/gfx_v12_1.c | 11 +---------- drivers/gpu/drm/amd/amdgpu/sdma_v6_0.c | 11 +---------- drivers/gpu/drm/amd/amdgpu/sdma_v7_0.c | 11 +---------- 7 files changed, 19 insertions(+), 48 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_userq.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_userq.c index a15ca3e35344..d9b9c03267c0 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_userq.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_userq.c @@ -205,6 +205,19 @@ void amdgpu_userq_start_hang_detect_work(struct amdgpu_usermode_queue *queue) msecs_to_jiffies(timeout_ms)); } +void amdgpu_userq_process_fence_irq(struct amdgpu_device *adev, u32 doorbell) +{ + struct xarray *xa = &adev->userq_doorbell_xa; + struct amdgpu_usermode_queue *queue; + unsigned long flags; + + xa_lock_irqsave(xa, flags); + queue = xa_load(xa, doorbell); + if (queue) + amdgpu_userq_fence_driver_process(queue->fence_drv); + xa_unlock_irqrestore(xa, flags); +} + static void amdgpu_userq_init_hang_detect_work(struct amdgpu_usermode_queue *queue) { INIT_DELAYED_WORK(&queue->hang_detect_work, amdgpu_userq_hang_detect_work); diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_userq.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_userq.h index 675fe6395ac8..8b8f345b60b6 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_userq.h +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_userq.h @@ -156,6 +156,7 @@ void amdgpu_userq_reset_work(struct work_struct *work); void amdgpu_userq_pre_reset(struct amdgpu_device *adev); int amdgpu_userq_post_reset(struct amdgpu_device *adev, bool vram_lost); void amdgpu_userq_start_hang_detect_work(struct amdgpu_usermode_queue *queue); +void amdgpu_userq_process_fence_irq(struct amdgpu_device *adev, u32 doorbell); int amdgpu_userq_input_va_validate(struct amdgpu_device *adev, struct amdgpu_usermode_queue *queue, diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c b/drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c index 8c82e90f871b..d40ab1e95480 100644 --- a/drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c +++ b/drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c @@ -6523,15 +6523,7 @@ static int gfx_v11_0_eop_irq(struct amdgpu_device *adev, DRM_DEBUG("IH: CP EOP\n"); if (adev->enable_mes && doorbell_offset) { - struct amdgpu_usermode_queue *queue; - struct xarray *xa = &adev->userq_doorbell_xa; - unsigned long flags; - - xa_lock_irqsave(xa, flags); - queue = xa_load(xa, doorbell_offset); - if (queue) - amdgpu_userq_fence_driver_process(queue->fence_drv); - xa_unlock_irqrestore(xa, flags); + amdgpu_userq_process_fence_irq(adev, doorbell_offset); } else { me_id = (entry->ring_id & 0x0c) >> 2; pipe_id = (entry->ring_id & 0x03) >> 0; diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v12_0.c b/drivers/gpu/drm/amd/amdgpu/gfx_v12_0.c index 65c33823a688..0e0b1e5b88fc 100644 --- a/drivers/gpu/drm/amd/amdgpu/gfx_v12_0.c +++ b/drivers/gpu/drm/amd/amdgpu/gfx_v12_0.c @@ -4854,15 +4854,7 @@ static int gfx_v12_0_eop_irq(struct amdgpu_device *adev, DRM_DEBUG("IH: CP EOP\n"); if (adev->enable_mes && doorbell_offset) { - struct xarray *xa = &adev->userq_doorbell_xa; - struct amdgpu_usermode_queue *queue; - unsigned long flags; - - xa_lock_irqsave(xa, flags); - queue = xa_load(xa, doorbell_offset); - if (queue) - amdgpu_userq_fence_driver_process(queue->fence_drv); - xa_unlock_irqrestore(xa, flags); + amdgpu_userq_process_fence_irq(adev, doorbell_offset); } else { me_id = (entry->ring_id & 0x0c) >> 2; pipe_id = (entry->ring_id & 0x03) >> 0; diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v12_1.c b/drivers/gpu/drm/amd/amdgpu/gfx_v12_1.c index 68fd3c04134d..68db1bc73bc7 100644 --- a/drivers/gpu/drm/amd/amdgpu/gfx_v12_1.c +++ b/drivers/gpu/drm/amd/amdgpu/gfx_v12_1.c @@ -3643,16 +3643,7 @@ static int gfx_v12_1_eop_irq(struct amdgpu_device *adev, DRM_DEBUG("IH: CP EOP\n"); if (adev->enable_mes && doorbell_offset) { - struct xarray *xa = &adev->userq_doorbell_xa; - struct amdgpu_usermode_queue *queue; - unsigned long flags; - - xa_lock_irqsave(xa, flags); - queue = xa_load(xa, doorbell_offset); - if (queue) - amdgpu_userq_fence_driver_process(queue->fence_drv); - - xa_unlock_irqrestore(xa, flags); + amdgpu_userq_process_fence_irq(adev, doorbell_offset); } else { me_id = (entry->ring_id & 0x0c) >> 2; pipe_id = (entry->ring_id & 0x03) >> 0; diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v6_0.c b/drivers/gpu/drm/amd/amdgpu/sdma_v6_0.c index 0f530bb8a9a3..8ca46e1e474e 100644 --- a/drivers/gpu/drm/amd/amdgpu/sdma_v6_0.c +++ b/drivers/gpu/drm/amd/amdgpu/sdma_v6_0.c @@ -1662,17 +1662,8 @@ static int sdma_v6_0_process_fence_irq(struct amdgpu_device *adev, u32 doorbell_offset = entry->src_data[0]; if (adev->enable_mes && doorbell_offset) { - struct amdgpu_usermode_queue *queue; - struct xarray *xa = &adev->userq_doorbell_xa; - unsigned long flags; - doorbell_offset >>= SDMA0_QUEUE0_DOORBELL_OFFSET__OFFSET__SHIFT; - - xa_lock_irqsave(xa, flags); - queue = xa_load(xa, doorbell_offset); - if (queue) - amdgpu_userq_fence_driver_process(queue->fence_drv); - xa_unlock_irqrestore(xa, flags); + amdgpu_userq_process_fence_irq(adev, doorbell_offset); } return 0; diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v7_0.c b/drivers/gpu/drm/amd/amdgpu/sdma_v7_0.c index 9ed817b69a3b..37191e2918d4 100644 --- a/drivers/gpu/drm/amd/amdgpu/sdma_v7_0.c +++ b/drivers/gpu/drm/amd/amdgpu/sdma_v7_0.c @@ -1594,17 +1594,8 @@ static int sdma_v7_0_process_fence_irq(struct amdgpu_device *adev, u32 doorbell_offset = entry->src_data[0]; if (adev->enable_mes && doorbell_offset) { - struct xarray *xa = &adev->userq_doorbell_xa; - struct amdgpu_usermode_queue *queue; - unsigned long flags; - doorbell_offset >>= SDMA0_QUEUE0_DOORBELL_OFFSET__OFFSET__SHIFT; - - xa_lock_irqsave(xa, flags); - queue = xa_load(xa, doorbell_offset); - if (queue) - amdgpu_userq_fence_driver_process(queue->fence_drv); - xa_unlock_irqrestore(xa, flags); + amdgpu_userq_process_fence_irq(adev, doorbell_offset); } return 0; From ec3e3976f626d9845a228d78d8a371ddc18edec8 Mon Sep 17 00:00:00 2001 From: Gaghik Khachatrian Date: Mon, 13 Apr 2026 11:11:52 -0400 Subject: [PATCH 25/77] drm/amd/display: Update MCIF_ADDR macro to address IGT DWB regression [Why] A previous warning-fix commit updated type casts in the DCN3 mmhubbub code but missed updating the MCIF_ADDR macro to the correct, fully parenthesized and casted version. This caused a regression during DWB tests, where address values could be misinterpreted, potentially leading to incorrect hardware programming. [How] Updated the MCIF_ADDR macro in dcn30_mmhubbub.c to use the proper parenthesization and type casting, ensuring correct address handling. Removed redundant casts from REG_UPDATE calls for improved clarity and consistency with current coding standards. Fixes: f4cdbb5d5405 ("drm/amd/display: Fix implicit narrowing conversion warnings") Reviewed-by: Clayton King Signed-off-by: Gaghik Khachatrian Signed-off-by: Tom Chung Tested-by: Daniel Wheeler Signed-off-by: Alex Deucher (cherry picked from commit 4f251a5e9f2297023b00b7cab606de111931cfa3) --- drivers/gpu/drm/amd/display/dc/dcn30/dcn30_mmhubbub.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/amd/display/dc/dcn30/dcn30_mmhubbub.c b/drivers/gpu/drm/amd/display/dc/dcn30/dcn30_mmhubbub.c index 6f2a0d5d963b..62fe5c3b18dc 100644 --- a/drivers/gpu/drm/amd/display/dc/dcn30/dcn30_mmhubbub.c +++ b/drivers/gpu/drm/amd/display/dc/dcn30/dcn30_mmhubbub.c @@ -40,8 +40,8 @@ #define FN(reg_name, field_name) \ mcif_wb30->mcif_wb_shift->field_name, mcif_wb30->mcif_wb_mask->field_name -#define MCIF_ADDR(addr) (((unsigned long long)addr & 0xffffffffff) + 0xFE) >> 8 -#define MCIF_ADDR_HIGH(addr) (unsigned long long)addr >> 40 +#define MCIF_ADDR(addr) ((uint32_t)((((unsigned long long)(addr) & 0xffffffffffULL) + 0xFEULL) >> 8)) +#define MCIF_ADDR_HIGH(addr) ((uint32_t)(((unsigned long long)(addr)) >> 40)) /* wbif programming guide: * 1. set up wbif parameter: From d6b99885b122528651d554a7bd907211a81579c2 Mon Sep 17 00:00:00 2001 From: Lijo Lazar Date: Mon, 27 Apr 2026 17:17:41 +0530 Subject: [PATCH 26/77] drm/amd/pm: Update emit clock logic If only one level is enabled in clock table, there is no need to follow the fine grained clock logic which expects a minimum of two levels (min/max). Signed-off-by: Lijo Lazar Reviewed-by: Asad Kamal Signed-off-by: Alex Deucher (cherry picked from commit 7f19097af1496dd908a044ca95862f32d05f02df) --- drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c b/drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c index 3d49e58794d2..90c7127beabf 100644 --- a/drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c +++ b/drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c @@ -1370,7 +1370,7 @@ int smu_cmn_print_dpm_clk_levels(struct smu_context *smu, level_index = 1; } - if (!is_fine_grained) { + if (!is_fine_grained || count == 1) { for (i = 0; i < count; i++) { freq_match = !is_deep_sleep && smu_cmn_freqs_match( From 31bc64e87f5f3d9ccbb7e625d570cfd8f52c77fc Mon Sep 17 00:00:00 2001 From: Alex Deucher Date: Thu, 23 Apr 2026 12:29:03 -0400 Subject: [PATCH 27/77] drm/amd/display: properly handle family setting for early GC 11.5.4 Early variants need an override. Fixes: 57d00816c6a9 ("drm/amdgpu: set family for GC 11.5.4") Cc: Pratik Vishwakarma Cc: Roman Li Cc: Mario Limonciello Reviewed-by: Mario Limonciello (AMD) Tested-by: Mario Limonciello (AMD) Signed-off-by: Alex Deucher (cherry picked from commit 922fccc2d3f8186008c19ba08a49ae8a9463cb50) --- drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c | 4 +--- drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 6 +++++- 2 files changed, 6 insertions(+), 4 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c index fcad7daaa41b..8d99bfaa498f 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c @@ -3090,10 +3090,8 @@ int amdgpu_discovery_set_ip_blocks(struct amdgpu_device *adev) case IP_VERSION(11, 5, 1): case IP_VERSION(11, 5, 2): case IP_VERSION(11, 5, 3): - adev->family = AMDGPU_FAMILY_GC_11_5_0; - break; case IP_VERSION(11, 5, 4): - adev->family = AMDGPU_FAMILY_GC_11_5_4; + adev->family = AMDGPU_FAMILY_GC_11_5_0; break; case IP_VERSION(12, 0, 0): case IP_VERSION(12, 0, 1): diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c index e96a12ff2d31..f8f9953565f6 100644 --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c @@ -1903,7 +1903,11 @@ static int amdgpu_dm_init(struct amdgpu_device *adev) goto error; } - init_data.asic_id.chip_family = adev->family; + /* special handling for early revisions of GC 11.5.4 */ + if (amdgpu_ip_version(adev, GC_HWIP, 0) == IP_VERSION(11, 5, 4)) + init_data.asic_id.chip_family = AMDGPU_FAMILY_GC_11_5_4; + else + init_data.asic_id.chip_family = adev->family; init_data.asic_id.pci_revision_id = adev->pdev->revision; init_data.asic_id.hw_internal_rev = adev->external_rev_id; From 8d80b293b41fcb5e9396db93e788b0f4ebcbafb7 Mon Sep 17 00:00:00 2001 From: Yinjie Yao Date: Mon, 27 Apr 2026 11:45:35 -0400 Subject: [PATCH 28/77] drm/amdgpu/vcn: set no_user_fence for VCN v2.0 enc/dec rings MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit VCN encoder and decoder rings do not support 64-bit user fence writes, reject CS submissions with user fences. Fixes: 1b61de45dfaf ("drm/amdgpu: add initial VCN2.0 support (v2)") Reviewed-by: Christian König Reviewed-by: Alex Deucher Signed-off-by: Yinjie Yao Signed-off-by: Alex Deucher (cherry picked from commit e2b5499fca55f1a32960a311bbb62e35891eaf73) --- drivers/gpu/drm/amd/amdgpu/vcn_v2_0.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/drivers/gpu/drm/amd/amdgpu/vcn_v2_0.c b/drivers/gpu/drm/amd/amdgpu/vcn_v2_0.c index e35fae9cdaf6..0442bfcfd384 100644 --- a/drivers/gpu/drm/amd/amdgpu/vcn_v2_0.c +++ b/drivers/gpu/drm/amd/amdgpu/vcn_v2_0.c @@ -2113,6 +2113,7 @@ static const struct amd_ip_funcs vcn_v2_0_ip_funcs = { static const struct amdgpu_ring_funcs vcn_v2_0_dec_ring_vm_funcs = { .type = AMDGPU_RING_TYPE_VCN_DEC, .align_mask = 0xf, + .no_user_fence = true, .secure_submission_supported = true, .get_rptr = vcn_v2_0_dec_ring_get_rptr, .get_wptr = vcn_v2_0_dec_ring_get_wptr, @@ -2145,6 +2146,7 @@ static const struct amdgpu_ring_funcs vcn_v2_0_enc_ring_vm_funcs = { .type = AMDGPU_RING_TYPE_VCN_ENC, .align_mask = 0x3f, .nop = VCN_ENC_CMD_NO_OP, + .no_user_fence = true, .get_rptr = vcn_v2_0_enc_ring_get_rptr, .get_wptr = vcn_v2_0_enc_ring_get_wptr, .set_wptr = vcn_v2_0_enc_ring_set_wptr, From 4f317863a3ab212a027d8c8c3cc3af4e3fb95704 Mon Sep 17 00:00:00 2001 From: Yinjie Yao Date: Mon, 27 Apr 2026 11:45:35 -0400 Subject: [PATCH 29/77] drm/amdgpu/vcn: set no_user_fence for VCN v2.5 enc/dec rings MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit VCN encoder and decoder rings do not support 64-bit user fence writes, reject CS submissions with user fences. Fixes: 28c17d72072b ("drm/amdgpu: add VCN2.5 basic supports") Reviewed-by: Christian König Reviewed-by: Alex Deucher Signed-off-by: Yinjie Yao Signed-off-by: Alex Deucher (cherry picked from commit efc9dd5590894109bce9a0bfe1fa5592dd6b20b1) --- drivers/gpu/drm/amd/amdgpu/vcn_v2_5.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/drivers/gpu/drm/amd/amdgpu/vcn_v2_5.c b/drivers/gpu/drm/amd/amdgpu/vcn_v2_5.c index 006a15451197..8b8184fe6764 100644 --- a/drivers/gpu/drm/amd/amdgpu/vcn_v2_5.c +++ b/drivers/gpu/drm/amd/amdgpu/vcn_v2_5.c @@ -1778,6 +1778,7 @@ static void vcn_v2_5_dec_ring_set_wptr(struct amdgpu_ring *ring) static const struct amdgpu_ring_funcs vcn_v2_5_dec_ring_vm_funcs = { .type = AMDGPU_RING_TYPE_VCN_DEC, .align_mask = 0xf, + .no_user_fence = true, .secure_submission_supported = true, .get_rptr = vcn_v2_5_dec_ring_get_rptr, .get_wptr = vcn_v2_5_dec_ring_get_wptr, @@ -1879,6 +1880,7 @@ static const struct amdgpu_ring_funcs vcn_v2_5_enc_ring_vm_funcs = { .type = AMDGPU_RING_TYPE_VCN_ENC, .align_mask = 0x3f, .nop = VCN_ENC_CMD_NO_OP, + .no_user_fence = true, .get_rptr = vcn_v2_5_enc_ring_get_rptr, .get_wptr = vcn_v2_5_enc_ring_get_wptr, .set_wptr = vcn_v2_5_enc_ring_set_wptr, From f1e5a6660d7cbf006079126d9babbf0ccf538c6b Mon Sep 17 00:00:00 2001 From: Yinjie Yao Date: Mon, 27 Apr 2026 11:45:35 -0400 Subject: [PATCH 30/77] drm/amdgpu/vcn: set no_user_fence for VCN v3.0 enc/dec rings MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit VCN encoder and decoder rings do not support 64-bit user fence writes, reject CS submissions with user fences. Fixes: cf14826cdfb5 ("drm/amdgpu: add VCN3.0 support for Sienna_Cichlid") Reviewed-by: Christian König Reviewed-by: Alex Deucher Signed-off-by: Yinjie Yao Signed-off-by: Alex Deucher (cherry picked from commit 663bed3c7b8b9a7624b0d95d300ddae034ad0614) --- drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c b/drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c index 6fb4fcdbba4f..4924da5af5e7 100644 --- a/drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c +++ b/drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c @@ -1856,6 +1856,7 @@ static const struct amdgpu_ring_funcs vcn_v3_0_dec_sw_ring_vm_funcs = { .type = AMDGPU_RING_TYPE_VCN_DEC, .align_mask = 0x3f, .nop = VCN_DEC_SW_CMD_NO_OP, + .no_user_fence = true, .secure_submission_supported = true, .get_rptr = vcn_v3_0_dec_ring_get_rptr, .get_wptr = vcn_v3_0_dec_ring_get_wptr, @@ -2036,6 +2037,7 @@ static int vcn_v3_0_ring_patch_cs_in_place(struct amdgpu_cs_parser *p, static const struct amdgpu_ring_funcs vcn_v3_0_dec_ring_vm_funcs = { .type = AMDGPU_RING_TYPE_VCN_DEC, .align_mask = 0xf, + .no_user_fence = true, .secure_submission_supported = true, .get_rptr = vcn_v3_0_dec_ring_get_rptr, .get_wptr = vcn_v3_0_dec_ring_get_wptr, @@ -2138,6 +2140,7 @@ static const struct amdgpu_ring_funcs vcn_v3_0_enc_ring_vm_funcs = { .type = AMDGPU_RING_TYPE_VCN_ENC, .align_mask = 0x3f, .nop = VCN_ENC_CMD_NO_OP, + .no_user_fence = true, .get_rptr = vcn_v3_0_enc_ring_get_rptr, .get_wptr = vcn_v3_0_enc_ring_get_wptr, .set_wptr = vcn_v3_0_enc_ring_set_wptr, From 51f694221047c84fa185be98210eb2c354ffb8c6 Mon Sep 17 00:00:00 2001 From: Yinjie Yao Date: Mon, 27 Apr 2026 11:45:36 -0400 Subject: [PATCH 31/77] drm/amdgpu/vcn: set no_user_fence for VCN v4.0 enc ring MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit VCN encoder and decoder rings do not support 64-bit user fence writes, reject CS submissions with user fences. Fixes: 8da1170a16e4 ("drm/amdgpu: add VCN4 ip block support") Reviewed-by: Christian König Reviewed-by: Alex Deucher Signed-off-by: Yinjie Yao Signed-off-by: Alex Deucher (cherry picked from commit fd852c048b46f9825e904a4f3f4538fe9d8827d9) --- drivers/gpu/drm/amd/amdgpu/vcn_v4_0.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/gpu/drm/amd/amdgpu/vcn_v4_0.c b/drivers/gpu/drm/amd/amdgpu/vcn_v4_0.c index 5dec92691f73..bbdd017cbafb 100644 --- a/drivers/gpu/drm/amd/amdgpu/vcn_v4_0.c +++ b/drivers/gpu/drm/amd/amdgpu/vcn_v4_0.c @@ -1994,6 +1994,7 @@ static struct amdgpu_ring_funcs vcn_v4_0_unified_ring_vm_funcs = { .type = AMDGPU_RING_TYPE_VCN_ENC, .align_mask = 0x3f, .nop = VCN_ENC_CMD_NO_OP, + .no_user_fence = true, .extra_bytes = sizeof(struct amdgpu_vcn_rb_metadata), .get_rptr = vcn_v4_0_unified_ring_get_rptr, .get_wptr = vcn_v4_0_unified_ring_get_wptr, From 4532b52b34e4e4310386e6fdf6a643368599f522 Mon Sep 17 00:00:00 2001 From: Yinjie Yao Date: Mon, 27 Apr 2026 11:45:36 -0400 Subject: [PATCH 32/77] drm/amdgpu/vcn: set no_user_fence for VCN v4.0.3 enc ring MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit VCN encoder and decoder rings do not support 64-bit user fence writes, reject CS submissions with user fences. Fixes: b889ef4ac988 ("drm/amdgpu/vcn: add vcn support for VCN4_0_3") Reviewed-by: Christian König Reviewed-by: Alex Deucher Signed-off-by: Yinjie Yao Signed-off-by: Alex Deucher (cherry picked from commit ff1a5a125c5a70c328806b9bc01d7d942cf3f9aa) --- drivers/gpu/drm/amd/amdgpu/vcn_v4_0_3.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/gpu/drm/amd/amdgpu/vcn_v4_0_3.c b/drivers/gpu/drm/amd/amdgpu/vcn_v4_0_3.c index ff3013b97abd..10e8fc2821f3 100644 --- a/drivers/gpu/drm/amd/amdgpu/vcn_v4_0_3.c +++ b/drivers/gpu/drm/amd/amdgpu/vcn_v4_0_3.c @@ -1775,6 +1775,7 @@ static const struct amdgpu_ring_funcs vcn_v4_0_3_unified_ring_vm_funcs = { .type = AMDGPU_RING_TYPE_VCN_ENC, .align_mask = 0x3f, .nop = VCN_ENC_CMD_NO_OP, + .no_user_fence = true, .get_rptr = vcn_v4_0_3_unified_ring_get_rptr, .get_wptr = vcn_v4_0_3_unified_ring_get_wptr, .set_wptr = vcn_v4_0_3_unified_ring_set_wptr, From 589a254bf3e88204c8402b9cbccd5e23a0af990f Mon Sep 17 00:00:00 2001 From: Yinjie Yao Date: Mon, 27 Apr 2026 11:45:36 -0400 Subject: [PATCH 33/77] drm/amdgpu/vcn: set no_user_fence for VCN v4.0.5 enc ring MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit VCN encoder and decoder rings do not support 64-bit user fence writes, reject CS submissions with user fences. Fixes: 547aad32edac ("drm/amdgpu: add VCN4 ip block support") Reviewed-by: Christian König Reviewed-by: Alex Deucher Signed-off-by: Yinjie Yao Signed-off-by: Alex Deucher (cherry picked from commit 084d94ac93707bdda07efb5cee786f632de4219b) --- drivers/gpu/drm/amd/amdgpu/vcn_v4_0_5.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/gpu/drm/amd/amdgpu/vcn_v4_0_5.c b/drivers/gpu/drm/amd/amdgpu/vcn_v4_0_5.c index 1f6a22983c0d..1571cc5a148c 100644 --- a/drivers/gpu/drm/amd/amdgpu/vcn_v4_0_5.c +++ b/drivers/gpu/drm/amd/amdgpu/vcn_v4_0_5.c @@ -1483,6 +1483,7 @@ static struct amdgpu_ring_funcs vcn_v4_0_5_unified_ring_vm_funcs = { .type = AMDGPU_RING_TYPE_VCN_ENC, .align_mask = 0x3f, .nop = VCN_ENC_CMD_NO_OP, + .no_user_fence = true, .get_rptr = vcn_v4_0_5_unified_ring_get_rptr, .get_wptr = vcn_v4_0_5_unified_ring_get_wptr, .set_wptr = vcn_v4_0_5_unified_ring_set_wptr, From 8cae0ce77de492d7c31c1532a2e80c0c6e7e58cb Mon Sep 17 00:00:00 2001 From: Yinjie Yao Date: Mon, 27 Apr 2026 11:45:36 -0400 Subject: [PATCH 34/77] drm/amdgpu/vcn: set no_user_fence for VCN v5.0.0 enc ring MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit VCN encoder and decoder rings do not support 64-bit user fence writes, reject CS submissions with user fences. Fixes: b6d1a0632051 ("drm/amdgpu: add VCN_5_0_0 IP block support") Reviewed-by: Christian König Reviewed-by: Alex Deucher Signed-off-by: Yinjie Yao Signed-off-by: Alex Deucher (cherry picked from commit 49b1fbbb5a071197ee71e2d70959b1cb29bdc317) --- drivers/gpu/drm/amd/amdgpu/vcn_v5_0_0.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/gpu/drm/amd/amdgpu/vcn_v5_0_0.c b/drivers/gpu/drm/amd/amdgpu/vcn_v5_0_0.c index 6109124f852e..d5f49fa33bee 100644 --- a/drivers/gpu/drm/amd/amdgpu/vcn_v5_0_0.c +++ b/drivers/gpu/drm/amd/amdgpu/vcn_v5_0_0.c @@ -1207,6 +1207,7 @@ static const struct amdgpu_ring_funcs vcn_v5_0_0_unified_ring_vm_funcs = { .type = AMDGPU_RING_TYPE_VCN_ENC, .align_mask = 0x3f, .nop = VCN_ENC_CMD_NO_OP, + .no_user_fence = true, .get_rptr = vcn_v5_0_0_unified_ring_get_rptr, .get_wptr = vcn_v5_0_0_unified_ring_get_wptr, .set_wptr = vcn_v5_0_0_unified_ring_set_wptr, From 8f4954722eab88e10c4ea0c0d3b1269c31421d3a Mon Sep 17 00:00:00 2001 From: Yinjie Yao Date: Mon, 27 Apr 2026 11:45:36 -0400 Subject: [PATCH 35/77] drm/amdgpu/vcn: set no_user_fence for VCN v5.0.1 enc ring MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit VCN encoder and decoder rings do not support 64-bit user fence writes, reject CS submissions with user fences. Fixes: 346492f30ce3 ("drm/amdgpu: Add VCN_5_0_1 support") Reviewed-by: Christian König Reviewed-by: Alex Deucher Signed-off-by: Yinjie Yao Signed-off-by: Alex Deucher (cherry picked from commit e16be95a2c3ee712b142cb27d2dca0b461181359) --- drivers/gpu/drm/amd/amdgpu/vcn_v5_0_1.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/gpu/drm/amd/amdgpu/vcn_v5_0_1.c b/drivers/gpu/drm/amd/amdgpu/vcn_v5_0_1.c index c28c6aff17aa..54fbf8d73ca6 100644 --- a/drivers/gpu/drm/amd/amdgpu/vcn_v5_0_1.c +++ b/drivers/gpu/drm/amd/amdgpu/vcn_v5_0_1.c @@ -1419,6 +1419,7 @@ static const struct amdgpu_ring_funcs vcn_v5_0_1_unified_ring_vm_funcs = { .type = AMDGPU_RING_TYPE_VCN_ENC, .align_mask = 0x3f, .nop = VCN_ENC_CMD_NO_OP, + .no_user_fence = true, .get_rptr = vcn_v5_0_1_unified_ring_get_rptr, .get_wptr = vcn_v5_0_1_unified_ring_get_wptr, .set_wptr = vcn_v5_0_1_unified_ring_set_wptr, From ed9d2832b09eecfe6055833c925d586ce0dda70a Mon Sep 17 00:00:00 2001 From: Yinjie Yao Date: Mon, 27 Apr 2026 11:45:36 -0400 Subject: [PATCH 36/77] drm/amdgpu/vcn: set no_user_fence for VCN v5.0.2 enc ring MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit VCN encoder and decoder rings do not support 64-bit user fence writes, reject CS submissions with user fences. Fixes: 8433398c789c ("drm/amdgpu: Add VCN v5_0_2") Reviewed-by: Christian König Reviewed-by: Alex Deucher Signed-off-by: Yinjie Yao Signed-off-by: Alex Deucher (cherry picked from commit 48fc78c31ea7fec63100a772f863cf51b2f8cd0a) --- drivers/gpu/drm/amd/amdgpu/vcn_v5_0_2.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/gpu/drm/amd/amdgpu/vcn_v5_0_2.c b/drivers/gpu/drm/amd/amdgpu/vcn_v5_0_2.c index c3d3cc023058..bbc172db91a1 100644 --- a/drivers/gpu/drm/amd/amdgpu/vcn_v5_0_2.c +++ b/drivers/gpu/drm/amd/amdgpu/vcn_v5_0_2.c @@ -994,6 +994,7 @@ static const struct amdgpu_ring_funcs vcn_v5_0_2_unified_ring_vm_funcs = { .type = AMDGPU_RING_TYPE_VCN_ENC, .align_mask = 0x3f, .nop = VCN_ENC_CMD_NO_OP, + .no_user_fence = true, .get_rptr = vcn_v5_0_2_unified_ring_get_rptr, .get_wptr = vcn_v5_0_2_unified_ring_get_wptr, .set_wptr = vcn_v5_0_2_unified_ring_set_wptr, From e5f612dc91650561fe2b5b76dd6d2898ec9ad480 Mon Sep 17 00:00:00 2001 From: Yinjie Yao Date: Mon, 27 Apr 2026 11:46:10 -0400 Subject: [PATCH 37/77] drm/amdgpu/jpeg: set no_user_fence for JPEG v2.0 ring MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit JPEG rings do not support 64-bit user fence writes, reject CS submissions with user fences. Fixes: 6ac27241106b ("drm/amdgpu: add JPEG v2.0 function supports") Reviewed-by: Christian König Reviewed-by: Alex Deucher Signed-off-by: Yinjie Yao Signed-off-by: Alex Deucher (cherry picked from commit 96179da0c6b059eb31706a0abe8dd6381c533143) --- drivers/gpu/drm/amd/amdgpu/jpeg_v2_0.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/gpu/drm/amd/amdgpu/jpeg_v2_0.c b/drivers/gpu/drm/amd/amdgpu/jpeg_v2_0.c index 9fe8d10ab270..cffb1e6bab35 100644 --- a/drivers/gpu/drm/amd/amdgpu/jpeg_v2_0.c +++ b/drivers/gpu/drm/amd/amdgpu/jpeg_v2_0.c @@ -802,6 +802,7 @@ static const struct amd_ip_funcs jpeg_v2_0_ip_funcs = { static const struct amdgpu_ring_funcs jpeg_v2_0_dec_ring_vm_funcs = { .type = AMDGPU_RING_TYPE_VCN_JPEG, .align_mask = 0xf, + .no_user_fence = true, .get_rptr = jpeg_v2_0_dec_ring_get_rptr, .get_wptr = jpeg_v2_0_dec_ring_get_wptr, .set_wptr = jpeg_v2_0_dec_ring_set_wptr, From 79405e774ede411c6b47ed41c651e40b92de64a2 Mon Sep 17 00:00:00 2001 From: Yinjie Yao Date: Mon, 27 Apr 2026 11:46:10 -0400 Subject: [PATCH 38/77] drm/amdgpu/jpeg: set no_user_fence for JPEG v2.5 ring MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit JPEG rings do not support 64-bit user fence writes, reject CS submissions with user fences. Fixes: 14f43e8f88c5 ("drm/amdgpu: move JPEG2.5 out from VCN2.5") Reviewed-by: Christian König Reviewed-by: Alex Deucher Signed-off-by: Yinjie Yao Signed-off-by: Alex Deucher (cherry picked from commit 3216a7f4e2642bda5fd14f57586e835ae9202587) --- drivers/gpu/drm/amd/amdgpu/jpeg_v2_5.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/drivers/gpu/drm/amd/amdgpu/jpeg_v2_5.c b/drivers/gpu/drm/amd/amdgpu/jpeg_v2_5.c index 20983f126b49..13a6e24c624a 100644 --- a/drivers/gpu/drm/amd/amdgpu/jpeg_v2_5.c +++ b/drivers/gpu/drm/amd/amdgpu/jpeg_v2_5.c @@ -693,6 +693,7 @@ static const struct amd_ip_funcs jpeg_v2_6_ip_funcs = { static const struct amdgpu_ring_funcs jpeg_v2_5_dec_ring_vm_funcs = { .type = AMDGPU_RING_TYPE_VCN_JPEG, .align_mask = 0xf, + .no_user_fence = true, .get_rptr = jpeg_v2_5_dec_ring_get_rptr, .get_wptr = jpeg_v2_5_dec_ring_get_wptr, .set_wptr = jpeg_v2_5_dec_ring_set_wptr, @@ -724,6 +725,7 @@ static const struct amdgpu_ring_funcs jpeg_v2_5_dec_ring_vm_funcs = { static const struct amdgpu_ring_funcs jpeg_v2_6_dec_ring_vm_funcs = { .type = AMDGPU_RING_TYPE_VCN_JPEG, .align_mask = 0xf, + .no_user_fence = true, .get_rptr = jpeg_v2_5_dec_ring_get_rptr, .get_wptr = jpeg_v2_5_dec_ring_get_wptr, .set_wptr = jpeg_v2_5_dec_ring_set_wptr, From a2baf12eec41f246689e6a3f8619af1200031576 Mon Sep 17 00:00:00 2001 From: Yinjie Yao Date: Mon, 27 Apr 2026 11:46:10 -0400 Subject: [PATCH 39/77] drm/amdgpu/jpeg: set no_user_fence for JPEG v3.0 ring MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit JPEG rings do not support 64-bit user fence writes, reject CS submissions with user fences. Fixes: dfd57dbf44dd ("drm/amdgpu: add JPEG3.0 support for Sienna_Cichlid") Reviewed-by: Christian König Reviewed-by: Alex Deucher Signed-off-by: Yinjie Yao Signed-off-by: Alex Deucher (cherry picked from commit 4d7d774f100efb5089c86a1fb8c5bf47c63fc9ef) --- drivers/gpu/drm/amd/amdgpu/jpeg_v3_0.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/gpu/drm/amd/amdgpu/jpeg_v3_0.c b/drivers/gpu/drm/amd/amdgpu/jpeg_v3_0.c index 98f5e0622bc5..d0445df39d2c 100644 --- a/drivers/gpu/drm/amd/amdgpu/jpeg_v3_0.c +++ b/drivers/gpu/drm/amd/amdgpu/jpeg_v3_0.c @@ -594,6 +594,7 @@ static const struct amd_ip_funcs jpeg_v3_0_ip_funcs = { static const struct amdgpu_ring_funcs jpeg_v3_0_dec_ring_vm_funcs = { .type = AMDGPU_RING_TYPE_VCN_JPEG, .align_mask = 0xf, + .no_user_fence = true, .get_rptr = jpeg_v3_0_dec_ring_get_rptr, .get_wptr = jpeg_v3_0_dec_ring_get_wptr, .set_wptr = jpeg_v3_0_dec_ring_set_wptr, From e7e90b5839aeb8805ec83bb4da610b8dab8e184d Mon Sep 17 00:00:00 2001 From: Yinjie Yao Date: Mon, 27 Apr 2026 11:46:11 -0400 Subject: [PATCH 40/77] drm/amdgpu/jpeg: set no_user_fence for JPEG v4.0 ring MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit JPEG rings do not support 64-bit user fence writes, reject CS submissions with user fences. Fixes: b13111de32a9 ("drm/amdgpu/jpeg: add jpeg support for VCN4_0_0") Reviewed-by: Christian König Reviewed-by: Alex Deucher Signed-off-by: Yinjie Yao Signed-off-by: Alex Deucher (cherry picked from commit 8d0cac9478a3f046279c657d6a2545de49ae675a) --- drivers/gpu/drm/amd/amdgpu/jpeg_v4_0.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/gpu/drm/amd/amdgpu/jpeg_v4_0.c b/drivers/gpu/drm/amd/amdgpu/jpeg_v4_0.c index 0bd83820dd20..6fd4238a8471 100644 --- a/drivers/gpu/drm/amd/amdgpu/jpeg_v4_0.c +++ b/drivers/gpu/drm/amd/amdgpu/jpeg_v4_0.c @@ -759,6 +759,7 @@ static const struct amd_ip_funcs jpeg_v4_0_ip_funcs = { static const struct amdgpu_ring_funcs jpeg_v4_0_dec_ring_vm_funcs = { .type = AMDGPU_RING_TYPE_VCN_JPEG, .align_mask = 0xf, + .no_user_fence = true, .get_rptr = jpeg_v4_0_dec_ring_get_rptr, .get_wptr = jpeg_v4_0_dec_ring_get_wptr, .set_wptr = jpeg_v4_0_dec_ring_set_wptr, From 83e37c0987ca92f9e87789b46dd311dcf5a4a6c8 Mon Sep 17 00:00:00 2001 From: Yinjie Yao Date: Mon, 27 Apr 2026 11:46:11 -0400 Subject: [PATCH 41/77] drm/amdgpu/jpeg: set no_user_fence for JPEG v4.0.3 ring MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit JPEG rings do not support 64-bit user fence writes, reject CS submissions with user fences. Fixes: e684e654eba9 ("drm/amdgpu/jpeg: add jpeg support for VCN4_0_3") Reviewed-by: Christian König Reviewed-by: Alex Deucher Signed-off-by: Yinjie Yao Signed-off-by: Alex Deucher (cherry picked from commit 2f6afc97d259d530f4f86c7743efbc573a8da927) --- drivers/gpu/drm/amd/amdgpu/jpeg_v4_0_3.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/gpu/drm/amd/amdgpu/jpeg_v4_0_3.c b/drivers/gpu/drm/amd/amdgpu/jpeg_v4_0_3.c index 82abe181c730..0c746580de11 100644 --- a/drivers/gpu/drm/amd/amdgpu/jpeg_v4_0_3.c +++ b/drivers/gpu/drm/amd/amdgpu/jpeg_v4_0_3.c @@ -1219,6 +1219,7 @@ static const struct amd_ip_funcs jpeg_v4_0_3_ip_funcs = { static const struct amdgpu_ring_funcs jpeg_v4_0_3_dec_ring_vm_funcs = { .type = AMDGPU_RING_TYPE_VCN_JPEG, .align_mask = 0xf, + .no_user_fence = true, .get_rptr = jpeg_v4_0_3_dec_ring_get_rptr, .get_wptr = jpeg_v4_0_3_dec_ring_get_wptr, .set_wptr = jpeg_v4_0_3_dec_ring_set_wptr, From b65b7f3f3c18f797f81a2af7c97e2079900ad6db Mon Sep 17 00:00:00 2001 From: Yinjie Yao Date: Mon, 27 Apr 2026 11:46:11 -0400 Subject: [PATCH 42/77] drm/amdgpu/jpeg: set no_user_fence for JPEG v4.0.5 ring MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit JPEG rings do not support 64-bit user fence writes, reject CS submissions with user fences. Fixes: 8f98a715da8e ("drm/amdgpu/jpeg: add jpeg support for VCN4_0_5") Reviewed-by: Christian König Reviewed-by: Alex Deucher Signed-off-by: Yinjie Yao Signed-off-by: Alex Deucher (cherry picked from commit f05d0a4f21fc720116d6e238f23308b199891058) --- drivers/gpu/drm/amd/amdgpu/jpeg_v4_0_5.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/gpu/drm/amd/amdgpu/jpeg_v4_0_5.c b/drivers/gpu/drm/amd/amdgpu/jpeg_v4_0_5.c index 54fd9c800c40..a43582b9c876 100644 --- a/drivers/gpu/drm/amd/amdgpu/jpeg_v4_0_5.c +++ b/drivers/gpu/drm/amd/amdgpu/jpeg_v4_0_5.c @@ -804,6 +804,7 @@ static const struct amd_ip_funcs jpeg_v4_0_5_ip_funcs = { static const struct amdgpu_ring_funcs jpeg_v4_0_5_dec_ring_vm_funcs = { .type = AMDGPU_RING_TYPE_VCN_JPEG, .align_mask = 0xf, + .no_user_fence = true, .get_rptr = jpeg_v4_0_5_dec_ring_get_rptr, .get_wptr = jpeg_v4_0_5_dec_ring_get_wptr, .set_wptr = jpeg_v4_0_5_dec_ring_set_wptr, From ea7c61c5f895e8f9ea0ffffa180498ef9c740152 Mon Sep 17 00:00:00 2001 From: Yinjie Yao Date: Mon, 27 Apr 2026 11:46:11 -0400 Subject: [PATCH 43/77] drm/amdgpu/jpeg: set no_user_fence for JPEG v5.0.0 ring MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit JPEG rings do not support 64-bit user fence writes, reject CS submissions with user fences. Fixes: dfad65c65728 ("drm/amdgpu: Add JPEG5 support") Reviewed-by: Christian König Reviewed-by: Alex Deucher Signed-off-by: Yinjie Yao Signed-off-by: Alex Deucher (cherry picked from commit 0f43893d3cd478fa57836697525b338817c9c23d) --- drivers/gpu/drm/amd/amdgpu/jpeg_v5_0_0.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/gpu/drm/amd/amdgpu/jpeg_v5_0_0.c b/drivers/gpu/drm/amd/amdgpu/jpeg_v5_0_0.c index 46bf15dce2bd..72a4b2d0676f 100644 --- a/drivers/gpu/drm/amd/amdgpu/jpeg_v5_0_0.c +++ b/drivers/gpu/drm/amd/amdgpu/jpeg_v5_0_0.c @@ -680,6 +680,7 @@ static const struct amd_ip_funcs jpeg_v5_0_0_ip_funcs = { static const struct amdgpu_ring_funcs jpeg_v5_0_0_dec_ring_vm_funcs = { .type = AMDGPU_RING_TYPE_VCN_JPEG, .align_mask = 0xf, + .no_user_fence = true, .get_rptr = jpeg_v5_0_0_dec_ring_get_rptr, .get_wptr = jpeg_v5_0_0_dec_ring_get_wptr, .set_wptr = jpeg_v5_0_0_dec_ring_set_wptr, From 2f8e3da71a1b469b6e157aa3972f1448b3157840 Mon Sep 17 00:00:00 2001 From: Yinjie Yao Date: Mon, 27 Apr 2026 11:46:11 -0400 Subject: [PATCH 44/77] drm/amdgpu/jpeg: set no_user_fence for JPEG v5.0.1 ring MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit JPEG rings do not support 64-bit user fence writes, reject CS submissions with user fences. Fixes: b8f57b69942b ("drm/amdgpu: Add JPEG5_0_1 support") Reviewed-by: Christian König Reviewed-by: Alex Deucher Signed-off-by: Yinjie Yao Signed-off-by: Alex Deucher (cherry picked from commit 742a98e2e81702df8fe1b1eccee5223220a03dc2) --- drivers/gpu/drm/amd/amdgpu/jpeg_v5_0_1.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/gpu/drm/amd/amdgpu/jpeg_v5_0_1.c b/drivers/gpu/drm/amd/amdgpu/jpeg_v5_0_1.c index edecbfe66c79..250316704dfa 100644 --- a/drivers/gpu/drm/amd/amdgpu/jpeg_v5_0_1.c +++ b/drivers/gpu/drm/amd/amdgpu/jpeg_v5_0_1.c @@ -884,6 +884,7 @@ static const struct amd_ip_funcs jpeg_v5_0_1_ip_funcs = { static const struct amdgpu_ring_funcs jpeg_v5_0_1_dec_ring_vm_funcs = { .type = AMDGPU_RING_TYPE_VCN_JPEG, .align_mask = 0xf, + .no_user_fence = true, .get_rptr = jpeg_v5_0_1_dec_ring_get_rptr, .get_wptr = jpeg_v5_0_1_dec_ring_get_wptr, .set_wptr = jpeg_v5_0_1_dec_ring_set_wptr, From 8068519c7e78819f88e1c08fe027efd5e468609d Mon Sep 17 00:00:00 2001 From: Yinjie Yao Date: Mon, 27 Apr 2026 11:46:11 -0400 Subject: [PATCH 45/77] drm/amdgpu/jpeg: set no_user_fence for JPEG v5.0.2 ring MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit JPEG rings do not support 64-bit user fence writes, reject CS submissions with user fences. Fixes: 855e3e19f69c ("drm/amdgpu: Add JPEG_v5_0_2 IP block") Reviewed-by: Christian König Reviewed-by: Alex Deucher Signed-off-by: Yinjie Yao Signed-off-by: Alex Deucher (cherry picked from commit 4ec1c402fb0fb39511136c5fc874788542c476bc) --- drivers/gpu/drm/amd/amdgpu/jpeg_v5_0_2.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/gpu/drm/amd/amdgpu/jpeg_v5_0_2.c b/drivers/gpu/drm/amd/amdgpu/jpeg_v5_0_2.c index 285c459379c4..7a4ecea6b39a 100644 --- a/drivers/gpu/drm/amd/amdgpu/jpeg_v5_0_2.c +++ b/drivers/gpu/drm/amd/amdgpu/jpeg_v5_0_2.c @@ -703,6 +703,7 @@ static const struct amd_ip_funcs jpeg_v5_0_2_ip_funcs = { static const struct amdgpu_ring_funcs jpeg_v5_0_2_dec_ring_vm_funcs = { .type = AMDGPU_RING_TYPE_VCN_JPEG, .align_mask = 0xf, + .no_user_fence = true, .get_rptr = jpeg_v5_0_2_dec_ring_get_rptr, .get_wptr = jpeg_v5_0_2_dec_ring_get_wptr, .set_wptr = jpeg_v5_0_2_dec_ring_set_wptr, From 3b0ea2021351b6b813b34fac940957f1f4fad85b Mon Sep 17 00:00:00 2001 From: Yinjie Yao Date: Mon, 27 Apr 2026 11:46:11 -0400 Subject: [PATCH 46/77] drm/amdgpu/jpeg: set no_user_fence for JPEG v5.3.0 ring MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit JPEG rings do not support 64-bit user fence writes, reject CS submissions with user fences. Fixes: 4aeaf3cbfa9f ("drm/amdgpu/jpeg: Add jpeg 5.3.0 support") Reviewed-by: Christian König Reviewed-by: Alex Deucher Signed-off-by: Yinjie Yao Signed-off-by: Alex Deucher (cherry picked from commit 86ac011ae234c03fb872f4945913391ea1d8862e) --- drivers/gpu/drm/amd/amdgpu/jpeg_v5_3_0.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/gpu/drm/amd/amdgpu/jpeg_v5_3_0.c b/drivers/gpu/drm/amd/amdgpu/jpeg_v5_3_0.c index 1821dced936f..e7546816baba 100644 --- a/drivers/gpu/drm/amd/amdgpu/jpeg_v5_3_0.c +++ b/drivers/gpu/drm/amd/amdgpu/jpeg_v5_3_0.c @@ -661,6 +661,7 @@ static const struct amd_ip_funcs jpeg_v5_3_0_ip_funcs = { static const struct amdgpu_ring_funcs jpeg_v5_3_0_dec_ring_vm_funcs = { .type = AMDGPU_RING_TYPE_VCN_JPEG, .align_mask = 0xf, + .no_user_fence = true, .get_rptr = jpeg_v5_3_0_dec_ring_get_rptr, .get_wptr = jpeg_v5_3_0_dec_ring_get_wptr, .set_wptr = jpeg_v5_3_0_dec_ring_set_wptr, From 8f935acbc18ff7ad09cb812528b28c59c78f10f9 Mon Sep 17 00:00:00 2001 From: Prike Liang Date: Mon, 27 Apr 2026 20:06:57 +0800 Subject: [PATCH 47/77] drm/amdgpu: clean up the userq unmap error handler MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit amdgpu_userq_unmap_helper() already handles the unmap error case. Signed-off-by: Prike Liang Reviewed-by: Christian König Signed-off-by: Alex Deucher (cherry picked from commit 66cb6579990b633ccc7300c27011d837b9a58da0) --- drivers/gpu/drm/amd/amdgpu/amdgpu_userq.c | 6 ------ 1 file changed, 6 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_userq.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_userq.c index d9b9c03267c0..de140a8ed135 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_userq.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_userq.c @@ -656,12 +656,6 @@ amdgpu_userq_destroy(struct amdgpu_userq_mgr *uq_mgr, struct amdgpu_usermode_que #endif amdgpu_userq_detect_and_reset_queues(uq_mgr); r = amdgpu_userq_unmap_helper(queue); - /*TODO: It requires a reset for userq hw unmap error*/ - if (r) { - drm_warn(adev_to_drm(uq_mgr->adev), "trying to destroy a HW mapping userq\n"); - queue->state = AMDGPU_USERQ_STATE_HUNG; - } - atomic_dec(&uq_mgr->userq_count[queue->queue_type]); amdgpu_userq_cleanup(queue); mutex_unlock(&uq_mgr->userq_mutex); From 47a5dfc8add4e60ff1ddc312f79998e70cbb0c09 Mon Sep 17 00:00:00 2001 From: Lijo Lazar Date: Mon, 27 Apr 2026 12:53:30 +0530 Subject: [PATCH 48/77] drm/amd/pm: Add fine grained flag to SMU v13.0.6 Gfx clock is fine grained on SMU v13.0.6/12 SOCs. Add the flag to report clock frequencies correctly. Fixes: 7380228401c4 ("drm/amd/pm: Use generic dpm table for SMUv13 SOCs") Signed-off-by: Lijo Lazar Reviewed-by: Asad Kamal Signed-off-by: Alex Deucher (cherry picked from commit d4871d837bbf70173f63426a84fa80b39e408b9e) --- drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_6_ppt.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_6_ppt.c b/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_6_ppt.c index cd0a23f432ff..0df8c05a7fce 100644 --- a/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_6_ppt.c +++ b/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_6_ppt.c @@ -1129,6 +1129,7 @@ static int smu_v13_0_6_set_default_dpm_table(struct smu_context *smu) /* gfxclk dpm table setup */ dpm_table = &dpm_context->dpm_tables.gfx_table; dpm_table->clk_type = SMU_GFXCLK; + dpm_table->flags = SMU_DPM_TABLE_FINE_GRAINED; if (smu_cmn_feature_is_enabled(smu, SMU_FEATURE_DPM_GFXCLK_BIT)) { /* In the case of gfxclk, only fine-grained dpm is honored. * Get min/max values from FW. From e6e9faba8100628990cccd13f0f044a648c303cf Mon Sep 17 00:00:00 2001 From: Benjamin Cheng Date: Mon, 13 Apr 2026 09:22:15 -0400 Subject: [PATCH 49/77] drm/amdgpu/vcn3: Avoid overflow on msg bound check As pointed out by SDL, the previous condition may be vulnerable to overflow. Fixes: b193019860d6 ("drm/amdgpu/vcn3: Prevent OOB reads when parsing dec msg") Cc: SDL Signed-off-by: Benjamin Cheng Reviewed-by: Ruijing Dong Signed-off-by: Alex Deucher (cherry picked from commit db00257ac9e4a51eb2515aaea161a019f7125e10) --- drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c b/drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c index 4924da5af5e7..81bba3ec2a93 100644 --- a/drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c +++ b/drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c @@ -1973,6 +1973,7 @@ static int vcn_v3_0_dec_msg(struct amdgpu_cs_parser *p, struct amdgpu_job *job, for (i = 0, msg = &msg[6]; i < num_buffers; ++i, msg += 4) { uint32_t offset, size, *create; + uint64_t buf_end; if (msg[0] != RDECODE_MESSAGE_CREATE) continue; @@ -1980,7 +1981,8 @@ static int vcn_v3_0_dec_msg(struct amdgpu_cs_parser *p, struct amdgpu_job *job, offset = msg[1]; size = msg[2]; - if (size < 4 || offset + size > end - addr) { + if (size < 4 || check_add_overflow(offset, size, &buf_end) || + buf_end > end - addr) { DRM_ERROR("VCN message buffer exceeds BO bounds!\n"); r = -EINVAL; goto out; From 65bce27ea6192320448c30267ffc17ffa094e713 Mon Sep 17 00:00:00 2001 From: Benjamin Cheng Date: Mon, 13 Apr 2026 09:22:15 -0400 Subject: [PATCH 50/77] drm/amdgpu/vcn4: Avoid overflow on msg bound check As pointed out by SDL, the previous condition may be vulnerable to overflow. Fixes: 0a78f2bac142 ("drm/amdgpu/vcn4: Prevent OOB reads when parsing dec msg") Cc: SDL Signed-off-by: Benjamin Cheng Reviewed-by: Ruijing Dong Signed-off-by: Alex Deucher (cherry picked from commit 3c5367d950140d4ec7af830b2268a5a6fdaa3885) --- drivers/gpu/drm/amd/amdgpu/vcn_v4_0.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/amdgpu/vcn_v4_0.c b/drivers/gpu/drm/amd/amdgpu/vcn_v4_0.c index bbdd017cbafb..ff7269bafae8 100644 --- a/drivers/gpu/drm/amd/amdgpu/vcn_v4_0.c +++ b/drivers/gpu/drm/amd/amdgpu/vcn_v4_0.c @@ -1889,6 +1889,7 @@ static int vcn_v4_0_dec_msg(struct amdgpu_cs_parser *p, struct amdgpu_job *job, for (i = 0, msg = &msg[6]; i < num_buffers; ++i, msg += 4) { uint32_t offset, size, *create; + uint64_t buf_end; if (msg[0] != RDECODE_MESSAGE_CREATE) continue; @@ -1896,7 +1897,8 @@ static int vcn_v4_0_dec_msg(struct amdgpu_cs_parser *p, struct amdgpu_job *job, offset = msg[1]; size = msg[2]; - if (size < 4 || offset + size > end - addr) { + if (size < 4 || check_add_overflow(offset, size, &buf_end) || + buf_end > end - addr) { DRM_ERROR("VCN message buffer exceeds BO bounds!\n"); r = -EINVAL; goto out; From a1fc7bf6677eb547167cb72b3bcafdc34b976692 Mon Sep 17 00:00:00 2001 From: Leo Li Date: Wed, 22 Apr 2026 12:29:56 -0400 Subject: [PATCH 51/77] drm/amd/display: Restore 5s vbl offdelay for NV3x+ DGPUs [Why] Rapid vblank off is causing flip-done timeouts for NV3x and newer family of GPUs that support more idle optimization features. A proper fix requires further investigation. In lieu of it, let's workaround it for now. [How] For NV3x and newer family of DGPUs, restore the old 5s vblank off timer. Fixes: 9b47278cec98 ("drm/amd/display: temp w/a for dGPU to enter idle optimizations") Link: https://gitlab.freedesktop.org/drm/amd/-/issues/3787 Link: https://lore.kernel.org/amd-gfx/20260217191632.1243826-1-sysdadmin@m1k.cloud/ Tested-by: Michele Palazzi Reviewed-by: Mario Limonciello Signed-off-by: Leo Li Signed-off-by: Alex Deucher (cherry picked from commit df482c2d441b090161633566b7a0755f1bbd55c2) --- .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 18 +++++++++++++++--- 1 file changed, 15 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c index f8f9953565f6..5fc5d5608506 100644 --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c @@ -9408,9 +9408,21 @@ static void manage_dm_interrupts(struct amdgpu_device *adev, if (acrtc_state) { timing = &acrtc_state->stream->timing; - if (amdgpu_ip_version(adev, DCE_HWIP, 0) < - IP_VERSION(3, 5, 0) || - !(adev->flags & AMD_IS_APU)) { + if (amdgpu_ip_version(adev, DCE_HWIP, 0) >= + IP_VERSION(3, 2, 0) && + !(adev->flags & AMD_IS_APU)) { + /* + * DGPUs NV3x and newer that support idle optimizations + * experience intermittent flip-done timeouts on cursor + * updates. Restore 5s offdelay behavior for now. + * + * Discussion on the issue: + * https://lore.kernel.org/amd-gfx/20260217191632.1243826-1-sysdadmin@m1k.cloud/ + */ + config.offdelay_ms = 5000; + config.disable_immediate = false; + } else if (amdgpu_ip_version(adev, DCE_HWIP, 0) < + IP_VERSION(3, 5, 0)) { /* * Older HW and DGPU have issues with instant off; * use a 2 frame offdelay. From 494941aa772dab79251543764db6cd14bd337e43 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Timur=20Krist=C3=B3f?= Date: Tue, 28 Apr 2026 13:40:40 +0200 Subject: [PATCH 52/77] drm/amd/display: Allow embedded connectors without DDC MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit On some laptops, the embedded panel may not have a DDC (display data channel) available. On these, the EDID may be hardcoded in ACPI or the VBIOS. In this case, use GPIO_DDC_LINE_UNKNOWN and don't fail. Fixes: def3488eb0fd ("drm/amd/display: refactor HPD to increase flexibility") Link: https://gitlab.freedesktop.org/drm/amd/-/work_items/5192 Signed-off-by: Timur Kristóf Signed-off-by: Alex Deucher (cherry picked from commit 75b8a6ca0e8bc3ce24572f854e95f8721b321179) --- drivers/gpu/drm/amd/display/dc/dc.h | 2 +- drivers/gpu/drm/amd/display/dc/gpio/gpio_service.c | 3 +++ drivers/gpu/drm/amd/display/dc/link/link_factory.c | 4 +++- 3 files changed, 7 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/amd/display/dc/dc.h b/drivers/gpu/drm/amd/display/dc/dc.h index 7f55ba09b191..37714d4371fb 100644 --- a/drivers/gpu/drm/amd/display/dc/dc.h +++ b/drivers/gpu/drm/amd/display/dc/dc.h @@ -1682,7 +1682,7 @@ struct dc_scratch_space { struct dc_link_training_overrides preferred_training_settings; struct dp_audio_test_data audio_test_data; - uint8_t ddc_hw_inst; + enum gpio_ddc_line ddc_hw_inst; uint8_t hpd_src; diff --git a/drivers/gpu/drm/amd/display/dc/gpio/gpio_service.c b/drivers/gpu/drm/amd/display/dc/gpio/gpio_service.c index a2c46350e44e..95f8b7c7d657 100644 --- a/drivers/gpu/drm/amd/display/dc/gpio/gpio_service.c +++ b/drivers/gpu/drm/amd/display/dc/gpio/gpio_service.c @@ -646,6 +646,9 @@ enum gpio_result dal_ddc_change_mode( enum gpio_ddc_line dal_ddc_get_line( const struct ddc *ddc) { + if (!ddc) + return GPIO_DDC_LINE_UNKNOWN; + return (enum gpio_ddc_line)dal_gpio_get_enum(ddc->pin_data); } diff --git a/drivers/gpu/drm/amd/display/dc/link/link_factory.c b/drivers/gpu/drm/amd/display/dc/link/link_factory.c index 7e7682d7dfc8..ae4c4ad05baa 100644 --- a/drivers/gpu/drm/amd/display/dc/link/link_factory.c +++ b/drivers/gpu/drm/amd/display/dc/link/link_factory.c @@ -568,7 +568,9 @@ static bool construct_phy(struct dc_link *link, goto ddc_create_fail; } - if (!link->ddc->ddc_pin) { + /* Embedded display connectors such as LVDS may not have DDC. */ + if (!link->ddc->ddc_pin && + !dc_is_embedded_signal(link->connector_signal)) { DC_ERROR("Failed to get I2C info for connector!\n"); goto ddc_create_fail; } From ac27e3f99035f132f23bc0409d0e57f11f054c70 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Timur=20Krist=C3=B3f?= Date: Tue, 28 Apr 2026 13:40:41 +0200 Subject: [PATCH 53/77] drm/amd/display: Allow DCE link encoder without AUX registers MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Allow constructing the DCE link encoder without DDC, which means the AUX registers array will be NULL. This is necessary to support embedded connectors without DDC. Fixes: 4562236b3bc0 ("drm/amd/dc: Add dc display driver (v2)") Link: https://gitlab.freedesktop.org/drm/amd/-/work_items/5192 Signed-off-by: Timur Kristóf Signed-off-by: Alex Deucher (cherry picked from commit 87f30b101af62590faf6020d106da07efdda199b) --- drivers/gpu/drm/amd/display/dc/dce/dce_link_encoder.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/display/dc/dce/dce_link_encoder.c b/drivers/gpu/drm/amd/display/dc/dce/dce_link_encoder.c index 5f40ae9e3120..e15fd1454d3b 100644 --- a/drivers/gpu/drm/amd/display/dc/dce/dce_link_encoder.c +++ b/drivers/gpu/drm/amd/display/dc/dce/dce_link_encoder.c @@ -1102,7 +1102,9 @@ void dce110_link_encoder_hw_init( ASSERT(result == BP_RESULT_OK); } - aux_initialize(enc110); + + if (enc110->aux_regs) + aux_initialize(enc110); /* reinitialize HPD. * hpd_initialize() will pass DIG_FE id to HW context. From 880498a1943f865529819f778df3b9945ca57262 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Timur=20Krist=C3=B3f?= Date: Tue, 28 Apr 2026 13:40:42 +0200 Subject: [PATCH 54/77] drm/amd/display: Allow constructing DCE6 link encoder without DDC MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit When the DDC channel ID is set to CHANNEL_ID_UNKNOWN, pass NULL to the AUX regs array. This is necessary to support embedded connectors without DDC. Fixes: 7c15fd86aaec ("drm/amd/display: dc/dce: add initial DCE6 support (v10)") Link: https://gitlab.freedesktop.org/drm/amd/-/work_items/5192 Signed-off-by: Timur Kristóf Signed-off-by: Alex Deucher (cherry picked from commit 38a70e50b22a188ff601740d64dd75f46213121f) --- drivers/gpu/drm/amd/display/dc/resource/dce60/dce60_resource.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/display/dc/resource/dce60/dce60_resource.c b/drivers/gpu/drm/amd/display/dc/resource/dce60/dce60_resource.c index 6a25dcfcdf17..d2d56a1c4b8b 100644 --- a/drivers/gpu/drm/amd/display/dc/resource/dce60/dce60_resource.c +++ b/drivers/gpu/drm/amd/display/dc/resource/dce60/dce60_resource.c @@ -753,7 +753,8 @@ static struct link_encoder *dce60_link_encoder_create( enc_init_data, &link_enc_feature, &link_enc_regs[link_regs_id], - &link_enc_aux_regs[enc_init_data->channel - 1], + enc_init_data->channel == CHANNEL_ID_UNKNOWN ? + NULL : &link_enc_aux_regs[enc_init_data->channel - 1], enc_init_data->hpd_source >= ARRAY_SIZE(link_enc_hpd_regs) ? NULL : &link_enc_hpd_regs[enc_init_data->hpd_source]); return &enc110->base; From 60af4605ef35ecb7ad649a8534b83a2f7c69576d Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Timur=20Krist=C3=B3f?= Date: Tue, 28 Apr 2026 13:40:43 +0200 Subject: [PATCH 55/77] drm/amd/display: Allow constructing DCE8 link encoder without DDC MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit When the DDC channel ID is set to CHANNEL_ID_UNKNOWN, pass NULL to the AUX regs array. This is necessary to support embedded connectors without DDC. Fixes: 4562236b3bc0 ("drm/amd/dc: Add dc display driver (v2)") Link: https://gitlab.freedesktop.org/drm/amd/-/work_items/5192 Signed-off-by: Timur Kristóf Signed-off-by: Alex Deucher (cherry picked from commit 155baf3038c1af50b602723022ed869b38e86a99) --- drivers/gpu/drm/amd/display/dc/resource/dce80/dce80_resource.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/display/dc/resource/dce80/dce80_resource.c b/drivers/gpu/drm/amd/display/dc/resource/dce80/dce80_resource.c index 33be49b3c1b1..6c00497e9a01 100644 --- a/drivers/gpu/drm/amd/display/dc/resource/dce80/dce80_resource.c +++ b/drivers/gpu/drm/amd/display/dc/resource/dce80/dce80_resource.c @@ -760,7 +760,8 @@ static struct link_encoder *dce80_link_encoder_create( enc_init_data, &link_enc_feature, &link_enc_regs[link_regs_id], - &link_enc_aux_regs[enc_init_data->channel - 1], + enc_init_data->channel == CHANNEL_ID_UNKNOWN ? + NULL : &link_enc_aux_regs[enc_init_data->channel - 1], enc_init_data->hpd_source >= ARRAY_SIZE(link_enc_hpd_regs) ? NULL : &link_enc_hpd_regs[enc_init_data->hpd_source]); return &enc110->base; From 9ea16f64189bf7b6ba50fc7f0325b3c1f836d105 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Timur=20Krist=C3=B3f?= Date: Tue, 28 Apr 2026 13:40:44 +0200 Subject: [PATCH 56/77] drm/amd/display: Read EDID from VBIOS embedded panel info MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Some board manufacturers hardcode the EDID for the embedded panel in the VBIOS. This EDID should be used when the panel doesn't have a DDC. For reference, see the legacy non-DC display code: amdgpu_atombios_encoder_get_lcd_info() This is necessary to support embedded connectors without DDC. Fixes: 4562236b3bc0 ("drm/amd/dc: Add dc display driver (v2)") Link: https://gitlab.freedesktop.org/drm/amd/-/work_items/5192 Signed-off-by: Timur Kristóf Signed-off-by: Alex Deucher (cherry picked from commit eb105e63b474c11ef6a84a1c6b18100d851ff364) --- .../gpu/drm/amd/display/dc/bios/bios_parser.c | 62 +++++++++++++++++++ .../display/include/grph_object_ctrl_defs.h | 4 ++ 2 files changed, 66 insertions(+) diff --git a/drivers/gpu/drm/amd/display/dc/bios/bios_parser.c b/drivers/gpu/drm/amd/display/dc/bios/bios_parser.c index e270b1d2457c..c307f42fe0b9 100644 --- a/drivers/gpu/drm/amd/display/dc/bios/bios_parser.c +++ b/drivers/gpu/drm/amd/display/dc/bios/bios_parser.c @@ -1313,6 +1313,60 @@ static enum bp_result bios_parser_get_embedded_panel_info( return BP_RESULT_FAILURE; } +static enum bp_result get_embedded_panel_extra_info( + struct bios_parser *bp, + struct embedded_panel_info *info, + const uint32_t table_offset) +{ + uint8_t *record = bios_get_image(&bp->base, table_offset, 1); + ATOM_PANEL_RESOLUTION_PATCH_RECORD *panel_res_record; + ATOM_FAKE_EDID_PATCH_RECORD *fake_edid_record; + + while (*record != ATOM_RECORD_END_TYPE) { + switch (*record) { + case LCD_MODE_PATCH_RECORD_MODE_TYPE: + record += sizeof(ATOM_PATCH_RECORD_MODE); + break; + case LCD_RTS_RECORD_TYPE: + record += sizeof(ATOM_LCD_RTS_RECORD); + break; + case LCD_CAP_RECORD_TYPE: + record += sizeof(ATOM_LCD_MODE_CONTROL_CAP); + break; + case LCD_FAKE_EDID_PATCH_RECORD_TYPE: + fake_edid_record = (ATOM_FAKE_EDID_PATCH_RECORD *)record; + if (fake_edid_record->ucFakeEDIDLength) { + if (fake_edid_record->ucFakeEDIDLength == 128) + info->fake_edid_size = + fake_edid_record->ucFakeEDIDLength; + else + info->fake_edid_size = + fake_edid_record->ucFakeEDIDLength * 128; + + info->fake_edid = fake_edid_record->ucFakeEDIDString; + + record += struct_size(fake_edid_record, + ucFakeEDIDString, + info->fake_edid_size); + } else { + /* empty fake edid record must be 3 bytes long */ + record += sizeof(ATOM_FAKE_EDID_PATCH_RECORD) + 1; + } + break; + case LCD_PANEL_RESOLUTION_RECORD_TYPE: + panel_res_record = (ATOM_PANEL_RESOLUTION_PATCH_RECORD *)record; + info->panel_width_mm = panel_res_record->usHSize; + info->panel_height_mm = panel_res_record->usVSize; + record += sizeof(ATOM_PANEL_RESOLUTION_PATCH_RECORD); + break; + default: + return BP_RESULT_BADBIOSTABLE; + } + } + + return BP_RESULT_OK; +} + static enum bp_result get_embedded_panel_info_v1_2( struct bios_parser *bp, struct embedded_panel_info *info) @@ -1429,6 +1483,10 @@ static enum bp_result get_embedded_panel_info_v1_2( if (ATOM_PANEL_MISC_API_ENABLED & lvds->ucLVDS_Misc) info->lcd_timing.misc_info.API_ENABLED = true; + if (lvds->usExtInfoTableOffset) + return get_embedded_panel_extra_info(bp, info, + le16_to_cpu(lvds->usExtInfoTableOffset) + DATA_TABLES(LCD_Info)); + return BP_RESULT_OK; } @@ -1554,6 +1612,10 @@ static enum bp_result get_embedded_panel_info_v1_3( (uint32_t) (ATOM_PANEL_MISC_V13_GREY_LEVEL & lvds->ucLCD_Misc) >> ATOM_PANEL_MISC_V13_GREY_LEVEL_SHIFT; + if (lvds->usExtInfoTableOffset) + return get_embedded_panel_extra_info(bp, info, + le16_to_cpu(lvds->usExtInfoTableOffset) + DATA_TABLES(LCD_Info)); + return BP_RESULT_OK; } diff --git a/drivers/gpu/drm/amd/display/include/grph_object_ctrl_defs.h b/drivers/gpu/drm/amd/display/include/grph_object_ctrl_defs.h index 38a77fa9b4af..a0f03fb67605 100644 --- a/drivers/gpu/drm/amd/display/include/grph_object_ctrl_defs.h +++ b/drivers/gpu/drm/amd/display/include/grph_object_ctrl_defs.h @@ -153,6 +153,10 @@ struct embedded_panel_info { uint32_t drr_enabled; uint32_t min_drr_refresh_rate; bool realtek_eDPToLVDS; + uint16_t panel_width_mm; + uint16_t panel_height_mm; + uint16_t fake_edid_size; + const uint8_t *fake_edid; }; struct dc_firmware_info { From 019155e2bd3e2cec425553195e9f9bc76bb0f848 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Timur=20Krist=C3=B3f?= Date: Tue, 28 Apr 2026 13:40:45 +0200 Subject: [PATCH 57/77] drm/amd/display: Use EDID from VBIOS embedded panel info MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit When an embedded panel has no DDC, read the EDID from the VBIOS embedded panel info and use that. Fixes: 7c7f5b15be65 ("drm/amd/display: Refactor edid read.") Link: https://gitlab.freedesktop.org/drm/amd/-/work_items/5192 Signed-off-by: Timur Kristóf Signed-off-by: Alex Deucher (cherry picked from commit 399b9abc353c62f6e37d38325edbdb6c2c00411c) --- .../amd/display/amdgpu_dm/amdgpu_dm_helpers.c | 44 +++++++++++++++++++ 1 file changed, 44 insertions(+) diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_helpers.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_helpers.c index 3b8ae7798a93..a3cb05490dc9 100644 --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_helpers.c +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_helpers.c @@ -1032,6 +1032,45 @@ dm_helpers_read_acpi_edid(struct amdgpu_dm_connector *aconnector) return drm_edid_read_custom(connector, dm_helpers_probe_acpi_edid, connector); } +static const struct drm_edid * +dm_helpers_read_vbios_hardcoded_edid(struct dc_link *link, struct amdgpu_dm_connector *aconnector) +{ + struct dc_bios *bios = link->ctx->dc_bios; + struct embedded_panel_info info; + const struct drm_edid *edid; + enum bp_result r; + + if (!dc_is_embedded_signal(link->connector_signal) || + !bios->funcs->get_embedded_panel_info) + return NULL; + + memset(&info, 0, sizeof(info)); + r = bios->funcs->get_embedded_panel_info(bios, &info); + + if (r != BP_RESULT_OK) { + dm_error("Error when reading embedded panel info: %u\n", r); + return NULL; + } + + if (!info.fake_edid || !info.fake_edid_size) { + dm_error("Embedded panel info doesn't contain an EDID\n"); + return NULL; + } + + edid = drm_edid_alloc(info.fake_edid, info.fake_edid_size); + + if (!drm_edid_valid(edid)) { + dm_error("EDID from embedded panel info is invalid\n"); + drm_edid_free(edid); + return NULL; + } + + aconnector->base.display_info.width_mm = info.panel_width_mm; + aconnector->base.display_info.height_mm = info.panel_height_mm; + + return edid; +} + void populate_hdmi_info_from_connector(struct drm_hdmi_info *hdmi, struct dc_edid_caps *edid_caps) { edid_caps->scdc_present = hdmi->scdc.supported; @@ -1052,6 +1091,9 @@ enum dc_edid_status dm_helpers_read_local_edid( if (link->aux_mode) ddc = &aconnector->dm_dp_aux.aux.ddc; + else if (link->ddc_hw_inst == GPIO_DDC_LINE_UNKNOWN && + dc_is_embedded_signal(link->connector_signal)) + ddc = NULL; else ddc = &aconnector->i2c->base; @@ -1065,6 +1107,8 @@ enum dc_edid_status dm_helpers_read_local_edid( drm_edid = dm_helpers_read_acpi_edid(aconnector); if (drm_edid) drm_info(connector->dev, "Using ACPI provided EDID for %s\n", connector->name); + else if (!ddc) + drm_edid = dm_helpers_read_vbios_hardcoded_edid(link, aconnector); else drm_edid = drm_edid_read_ddc(connector, ddc); drm_edid_connector_update(connector, drm_edid); From a0fc362f095330f7b3f68ac0c55ef8da18290c87 Mon Sep 17 00:00:00 2001 From: Matthew Brost Date: Thu, 26 Mar 2026 14:01:16 -0700 Subject: [PATCH 58/77] drm/xe: Drop registration of guc_submit_wedged_fini from xe_guc_submit_wedge() xe_guc_submit_wedge() runs in the DMA-fence signaling path, where GFP_KERNEL memory allocations are not permitted. However, registering guc_submit_wedged_fini via drmm_add_action_or_reset() triggers such an allocation. Avoid this by moving the logic from guc_submit_wedged_fini() into guc_submit_fini(), where wedged exec queue references are dropped during normal teardown. Fixes: 8ed9aaae39f3 ("drm/xe: Force wedged state and block GT reset upon any GPU hang") Signed-off-by: Matthew Brost Reviewed-by: Rodrigo Vivi Link: https://patch.msgid.link/20260326210116.202585-3-matthew.brost@intel.com (cherry picked from commit 4a706bd93c4fb156a13477e26ffdf2e633edeb10) Signed-off-by: Rodrigo Vivi --- drivers/gpu/drm/xe/xe_guc_submit.c | 33 ++++++++---------------------- 1 file changed, 9 insertions(+), 24 deletions(-) diff --git a/drivers/gpu/drm/xe/xe_guc_submit.c b/drivers/gpu/drm/xe/xe_guc_submit.c index a145234f662b..10556156eaad 100644 --- a/drivers/gpu/drm/xe/xe_guc_submit.c +++ b/drivers/gpu/drm/xe/xe_guc_submit.c @@ -259,24 +259,12 @@ static void guc_submit_sw_fini(struct drm_device *drm, void *arg) } static void guc_submit_fini(void *arg) -{ - struct xe_guc *guc = arg; - - /* Forcefully kill any remaining exec queues */ - xe_guc_ct_stop(&guc->ct); - guc_submit_reset_prepare(guc); - xe_guc_softreset(guc); - xe_guc_submit_stop(guc); - xe_uc_fw_sanitize(&guc->fw); - xe_guc_submit_pause_abort(guc); -} - -static void guc_submit_wedged_fini(void *arg) { struct xe_guc *guc = arg; struct xe_exec_queue *q; unsigned long index; + /* Drop any wedged queue refs */ mutex_lock(&guc->submission_state.lock); xa_for_each(&guc->submission_state.exec_queue_lookup, index, q) { if (exec_queue_wedged(q)) { @@ -286,6 +274,14 @@ static void guc_submit_wedged_fini(void *arg) } } mutex_unlock(&guc->submission_state.lock); + + /* Forcefully kill any remaining exec queues */ + xe_guc_ct_stop(&guc->ct); + guc_submit_reset_prepare(guc); + xe_guc_softreset(guc); + xe_guc_submit_stop(guc); + xe_uc_fw_sanitize(&guc->fw); + xe_guc_submit_pause_abort(guc); } static const struct xe_exec_queue_ops guc_exec_queue_ops; @@ -1320,10 +1316,8 @@ static void disable_scheduling_deregister(struct xe_guc *guc, void xe_guc_submit_wedge(struct xe_guc *guc) { struct xe_device *xe = guc_to_xe(guc); - struct xe_gt *gt = guc_to_gt(guc); struct xe_exec_queue *q; unsigned long index; - int err; xe_gt_assert(guc_to_gt(guc), guc_to_xe(guc)->wedged.mode); @@ -1335,15 +1329,6 @@ void xe_guc_submit_wedge(struct xe_guc *guc) return; if (xe->wedged.mode == XE_WEDGED_MODE_UPON_ANY_HANG_NO_RESET) { - err = devm_add_action_or_reset(guc_to_xe(guc)->drm.dev, - guc_submit_wedged_fini, guc); - if (err) { - xe_gt_err(gt, "Failed to register clean-up on wedged.mode=%s; " - "Although device is wedged.\n", - xe_wedged_mode_to_string(XE_WEDGED_MODE_UPON_ANY_HANG_NO_RESET)); - return; - } - mutex_lock(&guc->submission_state.lock); xa_for_each(&guc->submission_state.exec_queue_lookup, index, q) if (xe_exec_queue_get_unless_zero(q)) From 2bc0cce2724f74dde914d11fabb35b3a912c8329 Mon Sep 17 00:00:00 2001 From: Jonathan Cavitt Date: Tue, 31 Mar 2026 18:12:17 +0000 Subject: [PATCH 59/77] drm/xe/vm: Add missing pad and extensions check Add missing pad and extensions check to xe_vm_get_property_ioctl v2: - Combine with other check (Auld) Fixes: 50c577eab051 ("drm/xe/xe_vm: Implement xe_vm_get_property_ioctl") Suggested-by: Matthew Auld Signed-off-by: Jonathan Cavitt Reviewed-by: Matthew Auld Reviewed-by: Matthew Brost Signed-off-by: Matthew Auld Link: https://patch.msgid.link/20260331181216.37775-2-jonathan.cavitt@intel.com (cherry picked from commit 896070686b16cc45cca7854be2049923b2b303d3) Signed-off-by: Rodrigo Vivi --- drivers/gpu/drm/xe/xe_vm.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/xe/xe_vm.c b/drivers/gpu/drm/xe/xe_vm.c index 56e2db50bb36..1720205c09ca 100644 --- a/drivers/gpu/drm/xe/xe_vm.c +++ b/drivers/gpu/drm/xe/xe_vm.c @@ -4156,7 +4156,8 @@ int xe_vm_get_property_ioctl(struct drm_device *drm, void *data, int ret = 0; if (XE_IOCTL_DBG(xe, (args->reserved[0] || args->reserved[1] || - args->reserved[2]))) + args->reserved[2] || args->extensions || + args->pad))) return -EINVAL; vm = xe_vm_lookup(xef, args->vm_id); From 9d7ca81b3019905c36c8cae9c306827325ba5878 Mon Sep 17 00:00:00 2001 From: Matt Roper Date: Wed, 1 Apr 2026 13:12:44 -0700 Subject: [PATCH 60/77] drm/xe: Drop redundant rtp entries for Wa_14019988906 & Wa_14019877138 There appears to have been a silent merge conflict between some commits updating the workaround tables on Xe's -fixes and -next branches: - Commit bc6387a2e0c1 ("drm/xe/xe2_hpg: Fix handling of Wa_14019988906 & Wa_14019877138") from the fixes branch moved the Xe2_HPG instance of two workarounds touching the PSS_CHICKEN register from the engine_was[] table to the lrc_was[] table; the equivalent implementation for all other platforms/IPs were already properly located on lrc_was[]. This commit on the fixes branch is a cherry-pick of commit e04c609eedf4 ("drm/xe/xe2_hpg: Fix handling of Wa_14019988906 & Wa_14019877138") that already existed on the next branch. - Commit 55b19abb6c44 ("drm/xe: Consolidate workaround entries for Wa_14019877138") and commit c2142a1a8415 ("drm/xe: Consolidate workaround entries for Wa_14019988906") consolidated the individual entries per IP generation for each workaround into single, larger range-based entries. During merge conflict resolution the Xe2_HPG-specific entries (i.e., those with rule "GRAPHICS_VERSION_RANGE(2001, 2002)") were accidentally resurrected, even though the table already contains the consolidated entries that match a superset of thse ranges. These redundant entries don't cause any build failures but do trigger a dmesg error during probe on BMG-G21 devices: xe 0000:03:00.0: [drm] *ERROR* Tile0: GT0: discarding save-restore reg 7044 (clear: 00000400, set: 00000400, masked: yes, mcr: yes): ret=-22 xe 0000:03:00.0: [drm] *ERROR* Tile0: GT0: discarding save-restore reg 7044 (clear: 00000020, set: 00000020, masked: yes, mcr: yes): ret=-22 Re-drop the Xe2_HPG-specific table entries to eliminate the error. Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/work_items/7433 Fixes: 17b95278ae6a ("Merge tag 'drm-xe-next-2026-03-02' of https://gitlab.freedesktop.org/drm/xe/kernel into drm-next") Cc: Dave Airlie Signed-off-by: Matt Roper Reviewed-by: Shuicheng Lin Link: https://patch.msgid.link/20260401-wa_merge_conflict-v1-1-b477ab53fedc@intel.com Signed-off-by: Maarten Lankhorst (cherry picked from commit c79bc999442ff3c0908ab8bce92b2a3cb7d59861) Signed-off-by: Rodrigo Vivi --- drivers/gpu/drm/xe/xe_wa.c | 8 -------- 1 file changed, 8 deletions(-) diff --git a/drivers/gpu/drm/xe/xe_wa.c b/drivers/gpu/drm/xe/xe_wa.c index 546296f0220b..4b1cbced06be 100644 --- a/drivers/gpu/drm/xe/xe_wa.c +++ b/drivers/gpu/drm/xe/xe_wa.c @@ -743,14 +743,6 @@ static const struct xe_rtp_entry_sr lrc_was[] = { XE_RTP_RULES(GRAPHICS_VERSION(2001), ENGINE_CLASS(RENDER)), XE_RTP_ACTIONS(SET(WM_CHICKEN3, HIZ_PLANE_COMPRESSION_DIS)) }, - { XE_RTP_NAME("14019988906"), - XE_RTP_RULES(GRAPHICS_VERSION_RANGE(2001, 2002), ENGINE_CLASS(RENDER)), - XE_RTP_ACTIONS(SET(XEHP_PSS_CHICKEN, FLSH_IGNORES_PSD)) - }, - { XE_RTP_NAME("14019877138"), - XE_RTP_RULES(GRAPHICS_VERSION_RANGE(2001, 2002), ENGINE_CLASS(RENDER)), - XE_RTP_ACTIONS(SET(XEHP_PSS_CHICKEN, FD_END_COLLECT)) - }, { XE_RTP_NAME("14021490052"), XE_RTP_RULES(GRAPHICS_VERSION(2001), ENGINE_CLASS(RENDER)), XE_RTP_ACTIONS(SET(FF_MODE, From 68fdf2c943bbba75d4f3a5c5546bc764f5886c13 Mon Sep 17 00:00:00 2001 From: Gustavo Sousa Date: Wed, 1 Apr 2026 19:10:51 -0300 Subject: [PATCH 61/77] drm/xe/xe3p_lpg: Add missing indirect ring state feature flag Even though commit 8fcb7dfb8bbf ("drm/xe/xe3p_lpg: Add support for graphics IP 35.10") mentions that the support for Indirect Ring State exists for Xe3p_LPG, it missed actually setting the feature flag in graphics_xe3p_lpg. Fix that by adding the missing member. Fixes: 8fcb7dfb8bbf ("drm/xe/xe3p_lpg: Add support for graphics IP 35.10") Reviewed-by: Matt Roper Link: https://patch.msgid.link/20260401-xe3p_lpg-indirect-ring-state-v1-1-0e4b5edf6898@intel.com Signed-off-by: Gustavo Sousa (cherry picked from commit ec4f4970eb744fd7d6d135f40f5c83bd05982e72) Signed-off-by: Rodrigo Vivi --- drivers/gpu/drm/xe/xe_pci.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/gpu/drm/xe/xe_pci.c b/drivers/gpu/drm/xe/xe_pci.c index 01673d2b2464..9f98d0334164 100644 --- a/drivers/gpu/drm/xe/xe_pci.c +++ b/drivers/gpu/drm/xe/xe_pci.c @@ -118,6 +118,7 @@ static const struct xe_graphics_desc graphics_xe2 = { static const struct xe_graphics_desc graphics_xe3p_lpg = { XE2_GFX_FEATURES, + .has_indirect_ring_state = 1, .multi_queue_engine_class_mask = BIT(XE_ENGINE_CLASS_COPY) | BIT(XE_ENGINE_CLASS_COMPUTE), .num_geometry_xecore_fuse_regs = 3, .num_compute_xecore_fuse_regs = 3, From 2299d73562e68e85e358289438924572b01cfe19 Mon Sep 17 00:00:00 2001 From: Matt Roper Date: Fri, 10 Apr 2026 15:50:29 -0700 Subject: [PATCH 62/77] drm/xe/tuning: Use proper register offset for GAMSTLB_CTRL From Xe2 onward (i.e., all platforms officially supported by the Xe driver), the GAMSTLB_CTRL register is located at offset 0x477C and represented by the macro "GAMSTLB_CTRL" in code. However the register formerly resided at offset 0xCF4C on Xe1-era platforms, and we also have macro XEHP_GAMSTLB_CTRL that represents this old offset in the unofficial/developer-only Xe1 code. When tuning for the register was added for Xe3p_LPG, the old Xe1-era macro was accidentally used instead of the proper macro for Xe2 and beyond, causing the tuning to not be applied properly. Use the proper definition so that the correct offset is written to. Bspec: 59298 Fixes: 377c89bfaa5d ("drm/xe/xe3p_lpg: Set STLB bank hash mode to 4KB") Reviewed-by: Gustavo Sousa Link: https://patch.msgid.link/20260410-xe3p_tuning-v1-2-e206a62ee38f@intel.com Signed-off-by: Matt Roper (cherry picked from commit 0b1676eafdd1ba5a5436bdca0d2a25ce56699783) Signed-off-by: Rodrigo Vivi --- drivers/gpu/drm/xe/xe_tuning.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/xe/xe_tuning.c b/drivers/gpu/drm/xe/xe_tuning.c index f8de6a4bf189..0b78ec2bc6a4 100644 --- a/drivers/gpu/drm/xe/xe_tuning.c +++ b/drivers/gpu/drm/xe/xe_tuning.c @@ -97,7 +97,7 @@ static const struct xe_rtp_entry_sr gt_tunings[] = { { XE_RTP_NAME("Tuning: Set STLB Bank Hash Mode to 4KB"), XE_RTP_RULES(GRAPHICS_VERSION_RANGE(3510, XE_RTP_END_VERSION_UNDEFINED), IS_INTEGRATED), - XE_RTP_ACTIONS(FIELD_SET(XEHP_GAMSTLB_CTRL, BANK_HASH_MODE, + XE_RTP_ACTIONS(FIELD_SET(GAMSTLB_CTRL, BANK_HASH_MODE, BANK_HASH_4KB_MODE)) }, }; From 9407936237c98104873550219efedc286f28bbe9 Mon Sep 17 00:00:00 2001 From: Matt Roper Date: Fri, 10 Apr 2026 15:50:30 -0700 Subject: [PATCH 63/77] drm/xe: Mark ROW_CHICKEN5 as a masked register ROW_CHICKEN5 is a masked register (i.e., to adjust the value of any of the lower 16 bits, the corresponding bit in the upper 16 bits must also be set). Add the XE_REG_OPTION_MASKED to its definition; failure to do so will cause workaround updates of this register to not apply properly. Bspec: 56853 Fixes: 835cd6cbb0d0 ("drm/xe/xe3p_lpg: Add initial workarounds for graphics version 35.10") Reviewed-by: Gustavo Sousa Link: https://patch.msgid.link/20260410-xe3p_tuning-v1-3-e206a62ee38f@intel.com Signed-off-by: Matt Roper (cherry picked from commit cd84bfbba7feb4c1e72356f14de026dfda1a9e2a) Signed-off-by: Rodrigo Vivi --- drivers/gpu/drm/xe/regs/xe_gt_regs.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/xe/regs/xe_gt_regs.h b/drivers/gpu/drm/xe/regs/xe_gt_regs.h index 4ebaa0888a43..9c88ca3ce768 100644 --- a/drivers/gpu/drm/xe/regs/xe_gt_regs.h +++ b/drivers/gpu/drm/xe/regs/xe_gt_regs.h @@ -583,7 +583,7 @@ #define DISABLE_128B_EVICTION_COMMAND_UDW REG_BIT(36 - 32) #define LSCFE_SAME_ADDRESS_ATOMICS_COALESCING_DISABLE REG_BIT(35 - 32) -#define ROW_CHICKEN5 XE_REG_MCR(0xe7f0) +#define ROW_CHICKEN5 XE_REG_MCR(0xe7f0, XE_REG_OPTION_MASKED) #define CPSS_AWARE_DIS REG_BIT(3) #define SARB_CHICKEN1 XE_REG_MCR(0xe90c) From 03f2499c51dffce611b065b2894406beb9f2ebe0 Mon Sep 17 00:00:00 2001 From: Matt Roper Date: Wed, 8 Apr 2026 15:27:44 -0700 Subject: [PATCH 64/77] drm/xe/debugfs: Correct printing of register whitelist ranges The register-save-restore debugfs prints whitelist entries as offset ranges. E.g., REG[0x39319c-0x39319f]: allow read access for a single dword-sized register. However the GENMASK value used to set the lower bits to '1' for the upper bound of the whitelist range incorrectly included one more bit than it should have, causing the whitelist ranges to sometimes appear twice as large as they really were. For example, REG[0x6210-0x6217]: allow rw access was also intended to be a single dword-sized register whitelist (with a range 0x6210-0x6213) but was printed incorrectly as a qword-sized range because one too many bits was flipped on. Similar 'off by one' logic was applied when printing 4-dword register ranges and 64-dword register ranges as well. Correct the GENMASK logic to print these ranges in debugfs correctly. No impact outside of correcting the misleading debugfs output. Fixes: d855d2246ea6 ("drm/xe: Print whitelist while applying") Reviewed-by: Stuart Summers Link: https://patch.msgid.link/20260408-regsr_wl_range-v1-1-e9a28c8b4264@intel.com Signed-off-by: Matt Roper (cherry picked from commit 1a2a722ff96749734a5585dfe7f0bea7719caa8b) Signed-off-by: Rodrigo Vivi --- drivers/gpu/drm/xe/xe_reg_whitelist.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/xe/xe_reg_whitelist.c b/drivers/gpu/drm/xe/xe_reg_whitelist.c index 80577e4b7437..8cc313182968 100644 --- a/drivers/gpu/drm/xe/xe_reg_whitelist.c +++ b/drivers/gpu/drm/xe/xe_reg_whitelist.c @@ -226,7 +226,7 @@ void xe_reg_whitelist_print_entry(struct drm_printer *p, unsigned int indent, } range_start = reg & REG_GENMASK(25, range_bit); - range_end = range_start | REG_GENMASK(range_bit, 0); + range_end = range_start | REG_GENMASK(range_bit - 1, 0); switch (val & RING_FORCE_TO_NONPRIV_ACCESS_MASK) { case RING_FORCE_TO_NONPRIV_ACCESS_RW: From 36c6bac158816ede655f298a3f76e5a350eaa90e Mon Sep 17 00:00:00 2001 From: Satyanarayana K V P Date: Wed, 8 Apr 2026 11:01:47 +0000 Subject: [PATCH 65/77] drm/xe: Add memory pool with shadow support MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Add a memory pool to allocate sub-ranges from a BO-backed pool using drm_mm. Signed-off-by: Satyanarayana K V P Cc: Matthew Brost Cc: Thomas Hellström Cc: Maarten Lankhorst Cc: Michal Wajdeczko Reviewed-by: Matthew Brost Signed-off-by: Matthew Brost Link: https://patch.msgid.link/20260408110145.1639937-5-satyanarayana.k.v.p@intel.com (cherry picked from commit 1ce3229f8f269a245ff3b8c65ffae36b4d6afb93) Signed-off-by: Rodrigo Vivi --- drivers/gpu/drm/xe/Makefile | 1 + drivers/gpu/drm/xe/xe_mem_pool.c | 403 +++++++++++++++++++++++++ drivers/gpu/drm/xe/xe_mem_pool.h | 35 +++ drivers/gpu/drm/xe/xe_mem_pool_types.h | 21 ++ 4 files changed, 460 insertions(+) create mode 100644 drivers/gpu/drm/xe/xe_mem_pool.c create mode 100644 drivers/gpu/drm/xe/xe_mem_pool.h create mode 100644 drivers/gpu/drm/xe/xe_mem_pool_types.h diff --git a/drivers/gpu/drm/xe/Makefile b/drivers/gpu/drm/xe/Makefile index 49de1c22a469..03242e8b3d87 100644 --- a/drivers/gpu/drm/xe/Makefile +++ b/drivers/gpu/drm/xe/Makefile @@ -88,6 +88,7 @@ xe-y += xe_bb.o \ xe_irq.o \ xe_late_bind_fw.o \ xe_lrc.o \ + xe_mem_pool.o \ xe_migrate.o \ xe_mmio.o \ xe_mmio_gem.o \ diff --git a/drivers/gpu/drm/xe/xe_mem_pool.c b/drivers/gpu/drm/xe/xe_mem_pool.c new file mode 100644 index 000000000000..d5e24d6aa88d --- /dev/null +++ b/drivers/gpu/drm/xe/xe_mem_pool.c @@ -0,0 +1,403 @@ +// SPDX-License-Identifier: MIT +/* + * Copyright © 2026 Intel Corporation + */ + +#include + +#include + +#include "instructions/xe_mi_commands.h" +#include "xe_bo.h" +#include "xe_device_types.h" +#include "xe_map.h" +#include "xe_mem_pool.h" +#include "xe_mem_pool_types.h" +#include "xe_tile_printk.h" + +/** + * struct xe_mem_pool - DRM MM pool for sub-allocating memory from a BO on an + * XE tile. + * + * The XE memory pool is a DRM MM manager that provides sub-allocation of memory + * from a backing buffer object (BO) on a specific XE tile. It is designed to + * manage memory for GPU workloads, allowing for efficient allocation and + * deallocation of memory regions within the BO. + * + * The memory pool maintains a primary BO that is pinned in the GGTT and mapped + * into the CPU address space for direct access. Optionally, it can also maintain + * a shadow BO that can be used for atomic updates to the primary BO's contents. + * + * The API provided by the memory pool allows clients to allocate and free memory + * regions, retrieve GPU and CPU addresses, and synchronize data between the + * primary and shadow BOs as needed. + */ +struct xe_mem_pool { + /** @base: Range allocator over [0, @size) in bytes */ + struct drm_mm base; + /** @bo: Active pool BO (GGTT-pinned, CPU-mapped). */ + struct xe_bo *bo; + /** @shadow: Shadow BO for atomic command updates. */ + struct xe_bo *shadow; + /** @swap_guard: Timeline guard updating @bo and @shadow */ + struct mutex swap_guard; + /** @cpu_addr: CPU virtual address of the active BO. */ + void *cpu_addr; + /** @is_iomem: Indicates if the BO mapping is I/O memory. */ + bool is_iomem; +}; + +static struct xe_mem_pool *node_to_pool(struct xe_mem_pool_node *node) +{ + return container_of(node->sa_node.mm, struct xe_mem_pool, base); +} + +static struct xe_tile *pool_to_tile(struct xe_mem_pool *pool) +{ + return pool->bo->tile; +} + +static void fini_pool_action(struct drm_device *drm, void *arg) +{ + struct xe_mem_pool *pool = arg; + + if (pool->is_iomem) + kvfree(pool->cpu_addr); + + drm_mm_takedown(&pool->base); +} + +static int pool_shadow_init(struct xe_mem_pool *pool) +{ + struct xe_tile *tile = pool->bo->tile; + struct xe_device *xe = tile_to_xe(tile); + struct xe_bo *shadow; + int ret; + + xe_assert(xe, !pool->shadow); + + ret = drmm_mutex_init(&xe->drm, &pool->swap_guard); + if (ret) + return ret; + + if (IS_ENABLED(CONFIG_PROVE_LOCKING)) { + fs_reclaim_acquire(GFP_KERNEL); + might_lock(&pool->swap_guard); + fs_reclaim_release(GFP_KERNEL); + } + shadow = xe_managed_bo_create_pin_map(xe, tile, + xe_bo_size(pool->bo), + XE_BO_FLAG_VRAM_IF_DGFX(tile) | + XE_BO_FLAG_GGTT | + XE_BO_FLAG_GGTT_INVALIDATE | + XE_BO_FLAG_PINNED_NORESTORE); + if (IS_ERR(shadow)) + return PTR_ERR(shadow); + + pool->shadow = shadow; + + return 0; +} + +/** + * xe_mem_pool_init() - Initialize memory pool. + * @tile: the &xe_tile where allocate. + * @size: number of bytes to allocate. + * @guard: the size of the guard region at the end of the BO that is not + * sub-allocated, in bytes. + * @flags: flags to use to create shadow pool. + * + * Initializes a memory pool for sub-allocating memory from a backing BO on the + * specified XE tile. The backing BO is pinned in the GGTT and mapped into + * the CPU address space for direct access. Optionally, a shadow BO can also be + * initialized for atomic updates to the primary BO's contents. + * + * Returns: a pointer to the &xe_mem_pool, or an error pointer on failure. + */ +struct xe_mem_pool *xe_mem_pool_init(struct xe_tile *tile, u32 size, + u32 guard, int flags) +{ + struct xe_device *xe = tile_to_xe(tile); + struct xe_mem_pool *pool; + struct xe_bo *bo; + u32 managed_size; + int ret; + + xe_tile_assert(tile, size > guard); + managed_size = size - guard; + + pool = drmm_kzalloc(&xe->drm, sizeof(*pool), GFP_KERNEL); + if (!pool) + return ERR_PTR(-ENOMEM); + + bo = xe_managed_bo_create_pin_map(xe, tile, size, + XE_BO_FLAG_VRAM_IF_DGFX(tile) | + XE_BO_FLAG_GGTT | + XE_BO_FLAG_GGTT_INVALIDATE | + XE_BO_FLAG_PINNED_NORESTORE); + if (IS_ERR(bo)) { + xe_tile_err(tile, "Failed to prepare %uKiB BO for mem pool (%pe)\n", + size / SZ_1K, bo); + return ERR_CAST(bo); + } + pool->bo = bo; + pool->is_iomem = bo->vmap.is_iomem; + + if (pool->is_iomem) { + pool->cpu_addr = kvzalloc(size, GFP_KERNEL); + if (!pool->cpu_addr) + return ERR_PTR(-ENOMEM); + } else { + pool->cpu_addr = bo->vmap.vaddr; + } + + if (flags & XE_MEM_POOL_BO_FLAG_INIT_SHADOW_COPY) { + ret = pool_shadow_init(pool); + + if (ret) + goto out_err; + } + + drm_mm_init(&pool->base, 0, managed_size); + ret = drmm_add_action_or_reset(&xe->drm, fini_pool_action, pool); + if (ret) + return ERR_PTR(ret); + + return pool; + +out_err: + if (flags & XE_MEM_POOL_BO_FLAG_INIT_SHADOW_COPY) + xe_tile_err(tile, + "Failed to initialize shadow BO for mem pool (%d)\n", ret); + if (bo->vmap.is_iomem) + kvfree(pool->cpu_addr); + return ERR_PTR(ret); +} + +/** + * xe_mem_pool_sync() - Copy the entire contents of the main pool to shadow pool. + * @pool: the memory pool containing the primary and shadow BOs. + * + * Copies the entire contents of the primary pool to the shadow pool. This must + * be done after xe_mem_pool_init() with the XE_MEM_POOL_BO_FLAG_INIT_SHADOW_COPY + * flag to ensure that the shadow pool has the same initial contents as the primary + * pool. After this initial synchronization, clients can choose to synchronize the + * shadow pool with the primary pool on a node basis using + * xe_mem_pool_sync_shadow_locked() as needed. + * + * Return: None. + */ +void xe_mem_pool_sync(struct xe_mem_pool *pool) +{ + struct xe_tile *tile = pool_to_tile(pool); + struct xe_device *xe = tile_to_xe(tile); + + xe_tile_assert(tile, pool->shadow); + + xe_map_memcpy_to(xe, &pool->shadow->vmap, 0, + pool->cpu_addr, xe_bo_size(pool->bo)); +} + +/** + * xe_mem_pool_swap_shadow_locked() - Swap the primary BO with the shadow BO. + * @pool: the memory pool containing the primary and shadow BOs. + * + * Swaps the primary buffer object with the shadow buffer object in the mem + * pool. This allows for atomic updates to the contents of the primary BO + * by first writing to the shadow BO and then swapping it with the primary BO. + * Swap_guard must be held to ensure synchronization with any concurrent swap + * operations. + * + * Return: None. + */ +void xe_mem_pool_swap_shadow_locked(struct xe_mem_pool *pool) +{ + struct xe_tile *tile = pool_to_tile(pool); + + xe_tile_assert(tile, pool->shadow); + lockdep_assert_held(&pool->swap_guard); + + swap(pool->bo, pool->shadow); + if (!pool->bo->vmap.is_iomem) + pool->cpu_addr = pool->bo->vmap.vaddr; +} + +/** + * xe_mem_pool_sync_shadow_locked() - Copy node from primary pool to shadow pool. + * @node: the node allocated in the memory pool. + * + * Copies the specified batch buffer from the primary pool to the shadow pool. + * Swap_guard must be held to ensure synchronization with any concurrent swap + * operations. + * + * Return: None. + */ +void xe_mem_pool_sync_shadow_locked(struct xe_mem_pool_node *node) +{ + struct xe_mem_pool *pool = node_to_pool(node); + struct xe_tile *tile = pool_to_tile(pool); + struct xe_device *xe = tile_to_xe(tile); + struct drm_mm_node *sa_node = &node->sa_node; + + xe_tile_assert(tile, pool->shadow); + lockdep_assert_held(&pool->swap_guard); + + xe_map_memcpy_to(xe, &pool->shadow->vmap, + sa_node->start, + pool->cpu_addr + sa_node->start, + sa_node->size); +} + +/** + * xe_mem_pool_gpu_addr() - Retrieve GPU address of memory pool. + * @pool: the memory pool + * + * Returns: GGTT address of the memory pool. + */ +u64 xe_mem_pool_gpu_addr(struct xe_mem_pool *pool) +{ + return xe_bo_ggtt_addr(pool->bo); +} + +/** + * xe_mem_pool_cpu_addr() - Retrieve CPU address of manager pool. + * @pool: the memory pool + * + * Returns: CPU virtual address of memory pool. + */ +void *xe_mem_pool_cpu_addr(struct xe_mem_pool *pool) +{ + return pool->cpu_addr; +} + +/** + * xe_mem_pool_bo_swap_guard() - Retrieve the mutex used to guard swap + * operations on a memory pool. + * @pool: the memory pool + * + * Returns: Swap guard mutex or NULL if shadow pool is not created. + */ +struct mutex *xe_mem_pool_bo_swap_guard(struct xe_mem_pool *pool) +{ + if (!pool->shadow) + return NULL; + + return &pool->swap_guard; +} + +/** + * xe_mem_pool_bo_flush_write() - Copy the data from the sub-allocation + * to the GPU memory. + * @node: the node allocated in the memory pool to flush. + */ +void xe_mem_pool_bo_flush_write(struct xe_mem_pool_node *node) +{ + struct xe_mem_pool *pool = node_to_pool(node); + struct xe_tile *tile = pool_to_tile(pool); + struct xe_device *xe = tile_to_xe(tile); + struct drm_mm_node *sa_node = &node->sa_node; + + if (!pool->bo->vmap.is_iomem) + return; + + xe_map_memcpy_to(xe, &pool->bo->vmap, sa_node->start, + pool->cpu_addr + sa_node->start, + sa_node->size); +} + +/** + * xe_mem_pool_bo_sync_read() - Copy the data from GPU memory to the + * sub-allocation. + * @node: the node allocated in the memory pool to read back. + */ +void xe_mem_pool_bo_sync_read(struct xe_mem_pool_node *node) +{ + struct xe_mem_pool *pool = node_to_pool(node); + struct xe_tile *tile = pool_to_tile(pool); + struct xe_device *xe = tile_to_xe(tile); + struct drm_mm_node *sa_node = &node->sa_node; + + if (!pool->bo->vmap.is_iomem) + return; + + xe_map_memcpy_from(xe, pool->cpu_addr + sa_node->start, + &pool->bo->vmap, sa_node->start, sa_node->size); +} + +/** + * xe_mem_pool_alloc_node() - Allocate a new node for use with xe_mem_pool. + * + * Returns: node structure or an ERR_PTR(-ENOMEM). + */ +struct xe_mem_pool_node *xe_mem_pool_alloc_node(void) +{ + struct xe_mem_pool_node *node = kzalloc_obj(*node); + + if (!node) + return ERR_PTR(-ENOMEM); + + return node; +} + +/** + * xe_mem_pool_insert_node() - Insert a node into the memory pool. + * @pool: the memory pool to insert into + * @node: the node to insert + * @size: the size of the node to be allocated in bytes. + * + * Inserts a node into the specified memory pool using drm_mm for + * allocation. + * + * Returns: 0 on success or a negative error code on failure. + */ +int xe_mem_pool_insert_node(struct xe_mem_pool *pool, + struct xe_mem_pool_node *node, u32 size) +{ + if (!pool) + return -EINVAL; + + return drm_mm_insert_node(&pool->base, &node->sa_node, size); +} + +/** + * xe_mem_pool_free_node() - Free a node allocated from the memory pool. + * @node: the node to free + * + * Returns: None. + */ +void xe_mem_pool_free_node(struct xe_mem_pool_node *node) +{ + if (!node) + return; + + drm_mm_remove_node(&node->sa_node); + kfree(node); +} + +/** + * xe_mem_pool_node_cpu_addr() - Retrieve CPU address of the node. + * @node: the node allocated in the memory pool + * + * Returns: CPU virtual address of the node. + */ +void *xe_mem_pool_node_cpu_addr(struct xe_mem_pool_node *node) +{ + struct xe_mem_pool *pool = node_to_pool(node); + + return xe_mem_pool_cpu_addr(pool) + node->sa_node.start; +} + +/** + * xe_mem_pool_dump() - Dump the state of the DRM MM manager for debugging. + * @pool: the memory pool info be dumped. + * @p: The DRM printer to use for output. + * + * Only the drm managed region is dumped, not the state of the BOs or any other + * pool information. + * + * Returns: None. + */ +void xe_mem_pool_dump(struct xe_mem_pool *pool, struct drm_printer *p) +{ + drm_mm_print(&pool->base, p); +} diff --git a/drivers/gpu/drm/xe/xe_mem_pool.h b/drivers/gpu/drm/xe/xe_mem_pool.h new file mode 100644 index 000000000000..89cd2555fe91 --- /dev/null +++ b/drivers/gpu/drm/xe/xe_mem_pool.h @@ -0,0 +1,35 @@ +/* SPDX-License-Identifier: MIT */ +/* + * Copyright © 2026 Intel Corporation + */ +#ifndef _XE_MEM_POOL_H_ +#define _XE_MEM_POOL_H_ + +#include +#include + +#include +#include "xe_mem_pool_types.h" + +struct drm_printer; +struct xe_mem_pool; +struct xe_tile; + +struct xe_mem_pool *xe_mem_pool_init(struct xe_tile *tile, u32 size, + u32 guard, int flags); +void xe_mem_pool_sync(struct xe_mem_pool *pool); +void xe_mem_pool_swap_shadow_locked(struct xe_mem_pool *pool); +void xe_mem_pool_sync_shadow_locked(struct xe_mem_pool_node *node); +u64 xe_mem_pool_gpu_addr(struct xe_mem_pool *pool); +void *xe_mem_pool_cpu_addr(struct xe_mem_pool *pool); +struct mutex *xe_mem_pool_bo_swap_guard(struct xe_mem_pool *pool); +void xe_mem_pool_bo_flush_write(struct xe_mem_pool_node *node); +void xe_mem_pool_bo_sync_read(struct xe_mem_pool_node *node); +struct xe_mem_pool_node *xe_mem_pool_alloc_node(void); +int xe_mem_pool_insert_node(struct xe_mem_pool *pool, + struct xe_mem_pool_node *node, u32 size); +void xe_mem_pool_free_node(struct xe_mem_pool_node *node); +void *xe_mem_pool_node_cpu_addr(struct xe_mem_pool_node *node); +void xe_mem_pool_dump(struct xe_mem_pool *pool, struct drm_printer *p); + +#endif diff --git a/drivers/gpu/drm/xe/xe_mem_pool_types.h b/drivers/gpu/drm/xe/xe_mem_pool_types.h new file mode 100644 index 000000000000..d5e926c93351 --- /dev/null +++ b/drivers/gpu/drm/xe/xe_mem_pool_types.h @@ -0,0 +1,21 @@ +/* SPDX-License-Identifier: MIT */ +/* + * Copyright © 2026 Intel Corporation + */ + +#ifndef _XE_MEM_POOL_TYPES_H_ +#define _XE_MEM_POOL_TYPES_H_ + +#include + +#define XE_MEM_POOL_BO_FLAG_INIT_SHADOW_COPY BIT(0) + +/** + * struct xe_mem_pool_node - Sub-range allocations from mem pool. + */ +struct xe_mem_pool_node { + /** @sa_node: drm_mm_node for this allocation. */ + struct drm_mm_node sa_node; +}; + +#endif From 1460eae74fbbb27d5c5b159dba021e41c6ace4c1 Mon Sep 17 00:00:00 2001 From: Satyanarayana K V P Date: Wed, 8 Apr 2026 11:01:48 +0000 Subject: [PATCH 66/77] drm/xe/vf: Use drm mm instead of drm sa for CCS read/write MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit The suballocator algorithm tracks a hole cursor at the last allocation and tries to allocate after it. This is optimized for fence-ordered progress, where older allocations are expected to become reusable first. In fence-enabled mode, that ordering assumption holds. In fence-disabled mode, allocations may be freed in arbitrary order, so limiting allocation to the current hole window can miss valid free space and fail allocations despite sufficient total space. Use DRM memory manager instead of sub-allocator to get rid of this issue as CCS read/write operations do not use fences. Fixes: 864690cf4dd6 ("drm/xe/vf: Attach and detach CCS copy commands with BO") Signed-off-by: Satyanarayana K V P Cc: Matthew Brost Cc: Thomas Hellström Cc: Maarten Lankhorst Cc: Michal Wajdeczko Reviewed-by: Matthew Brost Signed-off-by: Matthew Brost Link: https://patch.msgid.link/20260408110145.1639937-6-satyanarayana.k.v.p@intel.com (cherry picked from commit 6c84b493012aeb05dec29c709377bf0e17ac6815) Signed-off-by: Rodrigo Vivi --- drivers/gpu/drm/xe/xe_bo_types.h | 3 +- drivers/gpu/drm/xe/xe_migrate.c | 56 ++++++++++++---------- drivers/gpu/drm/xe/xe_sriov_vf_ccs.c | 54 +++++++++++---------- drivers/gpu/drm/xe/xe_sriov_vf_ccs_types.h | 5 +- 4 files changed, 63 insertions(+), 55 deletions(-) diff --git a/drivers/gpu/drm/xe/xe_bo_types.h b/drivers/gpu/drm/xe/xe_bo_types.h index ff8317bfc1ae..9d19940b8fc0 100644 --- a/drivers/gpu/drm/xe/xe_bo_types.h +++ b/drivers/gpu/drm/xe/xe_bo_types.h @@ -18,6 +18,7 @@ #include "xe_ggtt_types.h" struct xe_device; +struct xe_mem_pool_node; struct xe_vm; #define XE_BO_MAX_PLACEMENTS 3 @@ -88,7 +89,7 @@ struct xe_bo { bool ccs_cleared; /** @bb_ccs: BB instructions of CCS read/write. Valid only for VF */ - struct xe_bb *bb_ccs[XE_SRIOV_VF_CCS_CTX_COUNT]; + struct xe_mem_pool_node *bb_ccs[XE_SRIOV_VF_CCS_CTX_COUNT]; /** * @cpu_caching: CPU caching mode. Currently only used for userspace diff --git a/drivers/gpu/drm/xe/xe_migrate.c b/drivers/gpu/drm/xe/xe_migrate.c index fc918b4fba54..5fdc89ed5256 100644 --- a/drivers/gpu/drm/xe/xe_migrate.c +++ b/drivers/gpu/drm/xe/xe_migrate.c @@ -29,6 +29,7 @@ #include "xe_hw_engine.h" #include "xe_lrc.h" #include "xe_map.h" +#include "xe_mem_pool.h" #include "xe_mocs.h" #include "xe_printk.h" #include "xe_pt.h" @@ -1166,11 +1167,12 @@ int xe_migrate_ccs_rw_copy(struct xe_tile *tile, struct xe_exec_queue *q, u32 batch_size, batch_size_allocated; struct xe_device *xe = gt_to_xe(gt); struct xe_res_cursor src_it, ccs_it; + struct xe_mem_pool *bb_pool; struct xe_sriov_vf_ccs_ctx *ctx; - struct xe_sa_manager *bb_pool; u64 size = xe_bo_size(src_bo); - struct xe_bb *bb = NULL; + struct xe_mem_pool_node *bb; u64 src_L0, src_L0_ofs; + struct xe_bb xe_bb_tmp; u32 src_L0_pt; int err; @@ -1208,18 +1210,18 @@ int xe_migrate_ccs_rw_copy(struct xe_tile *tile, struct xe_exec_queue *q, size -= src_L0; } - bb = xe_bb_alloc(gt); + bb = xe_mem_pool_alloc_node(); if (IS_ERR(bb)) return PTR_ERR(bb); bb_pool = ctx->mem.ccs_bb_pool; - scoped_guard(mutex, xe_sa_bo_swap_guard(bb_pool)) { - xe_sa_bo_swap_shadow(bb_pool); + scoped_guard(mutex, xe_mem_pool_bo_swap_guard(bb_pool)) { + xe_mem_pool_swap_shadow_locked(bb_pool); - err = xe_bb_init(bb, bb_pool, batch_size); + err = xe_mem_pool_insert_node(bb_pool, bb, batch_size * sizeof(u32)); if (err) { xe_gt_err(gt, "BB allocation failed.\n"); - xe_bb_free(bb, NULL); + kfree(bb); return err; } @@ -1227,6 +1229,7 @@ int xe_migrate_ccs_rw_copy(struct xe_tile *tile, struct xe_exec_queue *q, size = xe_bo_size(src_bo); batch_size = 0; + xe_bb_tmp = (struct xe_bb){ .cs = xe_mem_pool_node_cpu_addr(bb), .len = 0 }; /* * Emit PTE and copy commands here. * The CCS copy command can only support limited size. If the size to be @@ -1255,24 +1258,27 @@ int xe_migrate_ccs_rw_copy(struct xe_tile *tile, struct xe_exec_queue *q, xe_assert(xe, IS_ALIGNED(ccs_it.start, PAGE_SIZE)); batch_size += EMIT_COPY_CCS_DW; - emit_pte(m, bb, src_L0_pt, false, true, &src_it, src_L0, src); + emit_pte(m, &xe_bb_tmp, src_L0_pt, false, true, &src_it, src_L0, src); - emit_pte(m, bb, ccs_pt, false, false, &ccs_it, ccs_size, src); + emit_pte(m, &xe_bb_tmp, ccs_pt, false, false, &ccs_it, ccs_size, src); - bb->len = emit_flush_invalidate(bb->cs, bb->len, flush_flags); - flush_flags = xe_migrate_ccs_copy(m, bb, src_L0_ofs, src_is_pltt, + xe_bb_tmp.len = emit_flush_invalidate(xe_bb_tmp.cs, xe_bb_tmp.len, + flush_flags); + flush_flags = xe_migrate_ccs_copy(m, &xe_bb_tmp, src_L0_ofs, src_is_pltt, src_L0_ofs, dst_is_pltt, src_L0, ccs_ofs, true); - bb->len = emit_flush_invalidate(bb->cs, bb->len, flush_flags); + xe_bb_tmp.len = emit_flush_invalidate(xe_bb_tmp.cs, xe_bb_tmp.len, + flush_flags); size -= src_L0; } - xe_assert(xe, (batch_size_allocated == bb->len)); + xe_assert(xe, (batch_size_allocated == xe_bb_tmp.len)); + xe_assert(xe, bb->sa_node.size == xe_bb_tmp.len * sizeof(u32)); src_bo->bb_ccs[read_write] = bb; xe_sriov_vf_ccs_rw_update_bb_addr(ctx); - xe_sa_bo_sync_shadow(bb->bo); + xe_mem_pool_sync_shadow_locked(bb); } return 0; @@ -1297,10 +1303,10 @@ int xe_migrate_ccs_rw_copy(struct xe_tile *tile, struct xe_exec_queue *q, void xe_migrate_ccs_rw_copy_clear(struct xe_bo *src_bo, enum xe_sriov_vf_ccs_rw_ctxs read_write) { - struct xe_bb *bb = src_bo->bb_ccs[read_write]; + struct xe_mem_pool_node *bb = src_bo->bb_ccs[read_write]; struct xe_device *xe = xe_bo_device(src_bo); + struct xe_mem_pool *bb_pool; struct xe_sriov_vf_ccs_ctx *ctx; - struct xe_sa_manager *bb_pool; u32 *cs; xe_assert(xe, IS_SRIOV_VF(xe)); @@ -1308,17 +1314,17 @@ void xe_migrate_ccs_rw_copy_clear(struct xe_bo *src_bo, ctx = &xe->sriov.vf.ccs.contexts[read_write]; bb_pool = ctx->mem.ccs_bb_pool; - guard(mutex) (xe_sa_bo_swap_guard(bb_pool)); - xe_sa_bo_swap_shadow(bb_pool); + scoped_guard(mutex, xe_mem_pool_bo_swap_guard(bb_pool)) { + xe_mem_pool_swap_shadow_locked(bb_pool); - cs = xe_sa_bo_cpu_addr(bb->bo); - memset(cs, MI_NOOP, bb->len * sizeof(u32)); - xe_sriov_vf_ccs_rw_update_bb_addr(ctx); + cs = xe_mem_pool_node_cpu_addr(bb); + memset(cs, MI_NOOP, bb->sa_node.size); + xe_sriov_vf_ccs_rw_update_bb_addr(ctx); - xe_sa_bo_sync_shadow(bb->bo); - - xe_bb_free(bb, NULL); - src_bo->bb_ccs[read_write] = NULL; + xe_mem_pool_sync_shadow_locked(bb); + xe_mem_pool_free_node(bb); + src_bo->bb_ccs[read_write] = NULL; + } } /** diff --git a/drivers/gpu/drm/xe/xe_sriov_vf_ccs.c b/drivers/gpu/drm/xe/xe_sriov_vf_ccs.c index db023fb66a27..09b99fb2608b 100644 --- a/drivers/gpu/drm/xe/xe_sriov_vf_ccs.c +++ b/drivers/gpu/drm/xe/xe_sriov_vf_ccs.c @@ -14,9 +14,9 @@ #include "xe_guc.h" #include "xe_guc_submit.h" #include "xe_lrc.h" +#include "xe_mem_pool.h" #include "xe_migrate.h" #include "xe_pm.h" -#include "xe_sa.h" #include "xe_sriov_printk.h" #include "xe_sriov_vf.h" #include "xe_sriov_vf_ccs.h" @@ -141,43 +141,47 @@ static u64 get_ccs_bb_pool_size(struct xe_device *xe) static int alloc_bb_pool(struct xe_tile *tile, struct xe_sriov_vf_ccs_ctx *ctx) { + struct xe_mem_pool *pool; struct xe_device *xe = tile_to_xe(tile); - struct xe_sa_manager *sa_manager; + u32 *pool_cpu_addr, *last_dw_addr; u64 bb_pool_size; - int offset, err; + int err; bb_pool_size = get_ccs_bb_pool_size(xe); xe_sriov_info(xe, "Allocating %s CCS BB pool size = %lldMB\n", ctx->ctx_id ? "Restore" : "Save", bb_pool_size / SZ_1M); - sa_manager = __xe_sa_bo_manager_init(tile, bb_pool_size, SZ_4K, SZ_16, - XE_SA_BO_MANAGER_FLAG_SHADOW); - - if (IS_ERR(sa_manager)) { - xe_sriov_err(xe, "Suballocator init failed with error: %pe\n", - sa_manager); - err = PTR_ERR(sa_manager); + pool = xe_mem_pool_init(tile, bb_pool_size, sizeof(u32), + XE_MEM_POOL_BO_FLAG_INIT_SHADOW_COPY); + if (IS_ERR(pool)) { + xe_sriov_err(xe, "xe_mem_pool_init failed with error: %pe\n", + pool); + err = PTR_ERR(pool); return err; } - offset = 0; - xe_map_memset(xe, &sa_manager->bo->vmap, offset, MI_NOOP, - bb_pool_size); - xe_map_memset(xe, &sa_manager->shadow->vmap, offset, MI_NOOP, - bb_pool_size); + pool_cpu_addr = xe_mem_pool_cpu_addr(pool); + memset(pool_cpu_addr, 0, bb_pool_size); - offset = bb_pool_size - sizeof(u32); - xe_map_wr(xe, &sa_manager->bo->vmap, offset, u32, MI_BATCH_BUFFER_END); - xe_map_wr(xe, &sa_manager->shadow->vmap, offset, u32, MI_BATCH_BUFFER_END); + last_dw_addr = pool_cpu_addr + (bb_pool_size / sizeof(u32)) - 1; + *last_dw_addr = MI_BATCH_BUFFER_END; - ctx->mem.ccs_bb_pool = sa_manager; + /** + * Sync the main copy and shadow copy so that the shadow copy is + * replica of main copy. We sync only BBs after init part. So, we + * need to make sure the main pool and shadow copy are in sync after + * this point. This is needed as GuC may read the BB commands from + * shadow copy. + */ + xe_mem_pool_sync(pool); + ctx->mem.ccs_bb_pool = pool; return 0; } static void ccs_rw_update_ring(struct xe_sriov_vf_ccs_ctx *ctx) { - u64 addr = xe_sa_manager_gpu_addr(ctx->mem.ccs_bb_pool); + u64 addr = xe_mem_pool_gpu_addr(ctx->mem.ccs_bb_pool); struct xe_lrc *lrc = xe_exec_queue_lrc(ctx->mig_q); u32 dw[10], i = 0; @@ -388,7 +392,7 @@ int xe_sriov_vf_ccs_init(struct xe_device *xe) #define XE_SRIOV_VF_CCS_RW_BB_ADDR_OFFSET (2 * sizeof(u32)) void xe_sriov_vf_ccs_rw_update_bb_addr(struct xe_sriov_vf_ccs_ctx *ctx) { - u64 addr = xe_sa_manager_gpu_addr(ctx->mem.ccs_bb_pool); + u64 addr = xe_mem_pool_gpu_addr(ctx->mem.ccs_bb_pool); struct xe_lrc *lrc = xe_exec_queue_lrc(ctx->mig_q); struct xe_device *xe = gt_to_xe(ctx->mig_q->gt); @@ -412,8 +416,8 @@ int xe_sriov_vf_ccs_attach_bo(struct xe_bo *bo) struct xe_device *xe = xe_bo_device(bo); enum xe_sriov_vf_ccs_rw_ctxs ctx_id; struct xe_sriov_vf_ccs_ctx *ctx; + struct xe_mem_pool_node *bb; struct xe_tile *tile; - struct xe_bb *bb; int err = 0; xe_assert(xe, IS_VF_CCS_READY(xe)); @@ -445,7 +449,7 @@ int xe_sriov_vf_ccs_detach_bo(struct xe_bo *bo) { struct xe_device *xe = xe_bo_device(bo); enum xe_sriov_vf_ccs_rw_ctxs ctx_id; - struct xe_bb *bb; + struct xe_mem_pool_node *bb; xe_assert(xe, IS_VF_CCS_READY(xe)); @@ -471,8 +475,8 @@ int xe_sriov_vf_ccs_detach_bo(struct xe_bo *bo) */ void xe_sriov_vf_ccs_print(struct xe_device *xe, struct drm_printer *p) { - struct xe_sa_manager *bb_pool; enum xe_sriov_vf_ccs_rw_ctxs ctx_id; + struct xe_mem_pool *bb_pool; if (!IS_VF_CCS_READY(xe)) return; @@ -485,7 +489,7 @@ void xe_sriov_vf_ccs_print(struct xe_device *xe, struct drm_printer *p) drm_printf(p, "ccs %s bb suballoc info\n", ctx_id ? "write" : "read"); drm_printf(p, "-------------------------\n"); - drm_suballoc_dump_debug_info(&bb_pool->base, p, xe_sa_manager_gpu_addr(bb_pool)); + xe_mem_pool_dump(bb_pool, p); drm_puts(p, "\n"); } } diff --git a/drivers/gpu/drm/xe/xe_sriov_vf_ccs_types.h b/drivers/gpu/drm/xe/xe_sriov_vf_ccs_types.h index 22c499943d2a..6fc8f97ef3f4 100644 --- a/drivers/gpu/drm/xe/xe_sriov_vf_ccs_types.h +++ b/drivers/gpu/drm/xe/xe_sriov_vf_ccs_types.h @@ -17,9 +17,6 @@ enum xe_sriov_vf_ccs_rw_ctxs { XE_SRIOV_VF_CCS_CTX_COUNT }; -struct xe_migrate; -struct xe_sa_manager; - /** * struct xe_sriov_vf_ccs_ctx - VF CCS migration context data. */ @@ -33,7 +30,7 @@ struct xe_sriov_vf_ccs_ctx { /** @mem: memory data */ struct { /** @mem.ccs_bb_pool: Pool from which batch buffers are allocated. */ - struct xe_sa_manager *ccs_bb_pool; + struct xe_mem_pool *ccs_bb_pool; } mem; }; From f8c4151d50b12923b67819ebf03c1c6782c984c1 Mon Sep 17 00:00:00 2001 From: Shuicheng Lin Date: Thu, 9 Apr 2026 00:34:49 +0000 Subject: [PATCH 67/77] drm/xe: Fix potential NULL deref in xe_exec_queue_tlb_inval_last_fence_put_unlocked xe_exec_queue_tlb_inval_last_fence_put_unlocked() uses q->vm->xe as the first argument to xe_assert(). This function is called unconditionally from xe_exec_queue_destroy() for all queues, including kernel queues that have q->vm == NULL (e.g., queues created during GT init in xe_gt_record_default_lrcs() with vm=NULL). While current compilers optimize away the q->vm->xe dereference (even in CONFIG_DRM_XE_DEBUG=y builds, the compiler pushes the dereference into the WARN branch that is only taken when the assert condition is false), the code is semantically incorrect and constitutes undefined behavior in the C abstract machine for the NULL pointer case. Use gt_to_xe(q->gt) instead, which is always valid for any exec queue. This is consistent with how xe_exec_queue_destroy() itself obtains the xe_device pointer in its own xe_assert at the top of the function. Fixes: b2d7ec41f2a3 ("drm/xe: Attach last fence to TLB invalidation job queues") Assisted-by: Claude:claude-opus-4.6 Reviewed-by: Matthew Brost Link: https://patch.msgid.link/20260409003449.3405767-1-shuicheng.lin@intel.com Signed-off-by: Shuicheng Lin (cherry picked from commit 96078a1c68bf97f17fd1d08c3f58f5c5cc9ccd65) Signed-off-by: Rodrigo Vivi --- drivers/gpu/drm/xe/xe_exec_queue.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/xe/xe_exec_queue.c b/drivers/gpu/drm/xe/xe_exec_queue.c index b287d0e0e60a..8de8ec784a03 100644 --- a/drivers/gpu/drm/xe/xe_exec_queue.c +++ b/drivers/gpu/drm/xe/xe_exec_queue.c @@ -1760,7 +1760,7 @@ void xe_exec_queue_tlb_inval_last_fence_put(struct xe_exec_queue *q, void xe_exec_queue_tlb_inval_last_fence_put_unlocked(struct xe_exec_queue *q, unsigned int type) { - xe_assert(q->vm->xe, type == XE_EXEC_QUEUE_TLB_INVAL_MEDIA_GT || + xe_assert(gt_to_xe(q->gt), type == XE_EXEC_QUEUE_TLB_INVAL_MEDIA_GT || type == XE_EXEC_QUEUE_TLB_INVAL_PRIMARY_GT); dma_fence_put(q->tlb_inval[type].last_fence); From 09a8f3c1c11977a6e10c167f26dd298790b31c32 Mon Sep 17 00:00:00 2001 From: Shuicheng Lin Date: Wed, 8 Apr 2026 17:52:52 +0000 Subject: [PATCH 68/77] drm/xe/bo: Fix bo leak on unaligned size validation in xe_bo_init_locked() When type is ttm_bo_type_device and aligned_size != size, the function returns an error without freeing a caller-provided bo, violating the documented contract that bo is freed on failure. Add xe_bo_free(bo) before returning the error. Fixes: 4e03b584143e ("drm/xe/uapi: Reject bo creation of unaligned size") Cc: stable@vger.kernel.org Assisted-by: Claude:claude-opus-4.6 Reviewed-by: Matthew Brost Link: https://patch.msgid.link/20260408175255.3402838-2-shuicheng.lin@intel.com Signed-off-by: Shuicheng Lin (cherry picked from commit 601c2aa087b6f21014300a3f107a08ee4dde7bdf) Signed-off-by: Rodrigo Vivi --- drivers/gpu/drm/xe/xe_bo.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/xe/xe_bo.c b/drivers/gpu/drm/xe/xe_bo.c index a7c2dc7f224c..c5e9befc6ba3 100644 --- a/drivers/gpu/drm/xe/xe_bo.c +++ b/drivers/gpu/drm/xe/xe_bo.c @@ -2342,8 +2342,10 @@ struct xe_bo *xe_bo_init_locked(struct xe_device *xe, struct xe_bo *bo, alignment = SZ_4K >> PAGE_SHIFT; } - if (type == ttm_bo_type_device && aligned_size != size) + if (type == ttm_bo_type_device && aligned_size != size) { + xe_bo_free(bo); return ERR_PTR(-EINVAL); + } if (!bo) { bo = xe_bo_alloc(); From 1d0adf2fd94fb0c0037c643fadd8f2cf3cffc009 Mon Sep 17 00:00:00 2001 From: Shuicheng Lin Date: Wed, 8 Apr 2026 17:52:53 +0000 Subject: [PATCH 69/77] drm/xe/bo: Fix bo leak on GGTT flag validation in xe_bo_init_locked() When XE_BO_FLAG_GGTT_ALL is set without XE_BO_FLAG_GGTT, the function returns an error without freeing a caller-provided bo, violating the documented contract that bo is freed on failure. Add xe_bo_free(bo) before returning the error. Fixes: 5a3b0df25d6a ("drm/xe: Allow bo mapping on multiple ggtts") Cc: stable@vger.kernel.org Assisted-by: Claude:claude-opus-4.6 Reviewed-by: Matthew Brost Link: https://patch.msgid.link/20260408175255.3402838-3-shuicheng.lin@intel.com Signed-off-by: Shuicheng Lin (cherry picked from commit 3fbd6cf43cac7b60757f3ce3d95195d3843a902c) Signed-off-by: Rodrigo Vivi --- drivers/gpu/drm/xe/xe_bo.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/xe/xe_bo.c b/drivers/gpu/drm/xe/xe_bo.c index c5e9befc6ba3..4075edf97421 100644 --- a/drivers/gpu/drm/xe/xe_bo.c +++ b/drivers/gpu/drm/xe/xe_bo.c @@ -2322,8 +2322,10 @@ struct xe_bo *xe_bo_init_locked(struct xe_device *xe, struct xe_bo *bo, } /* XE_BO_FLAG_GGTTx requires XE_BO_FLAG_GGTT also be set */ - if ((flags & XE_BO_FLAG_GGTT_ALL) && !(flags & XE_BO_FLAG_GGTT)) + if ((flags & XE_BO_FLAG_GGTT_ALL) && !(flags & XE_BO_FLAG_GGTT)) { + xe_bo_free(bo); return ERR_PTR(-EINVAL); + } if (flags & (XE_BO_FLAG_VRAM_MASK | XE_BO_FLAG_STOLEN) && !(flags & XE_BO_FLAG_IGNORE_MIN_PAGE_SIZE) && From 93a528f67ce5095bcab46a69839eca97f43dd352 Mon Sep 17 00:00:00 2001 From: Shuicheng Lin Date: Wed, 8 Apr 2026 17:52:54 +0000 Subject: [PATCH 70/77] drm/xe: Fix bo leak in xe_dma_buf_init_obj() on allocation failure When drm_gpuvm_resv_object_alloc() fails, the pre-allocated storage bo is not freed. Add xe_bo_free(storage) before returning the error. xe_dma_buf_init_obj() calls xe_bo_init_locked(), which frees the bo on error. Therefore, xe_dma_buf_init_obj() must also free the bo on its own error paths. Otherwise, since xe_gem_prime_import() cannot distinguish whether the failure originated from xe_dma_buf_init_obj() or from xe_bo_init_locked(), it cannot safely decide whether the bo should be freed. Add comments documenting the ownership semantics: on success, ownership of storage is transferred to the returned drm_gem_object; on failure, storage is freed before returning. v2: Add comments to explain the free logic. Fixes: eb289a5f6cc6 ("drm/xe: Convert xe_dma_buf.c for exhaustive eviction") Cc: stable@vger.kernel.org Assisted-by: Claude:claude-opus-4.6 Reviewed-by: Matthew Brost Link: https://patch.msgid.link/20260408175255.3402838-4-shuicheng.lin@intel.com Signed-off-by: Shuicheng Lin (cherry picked from commit 78a6c5f899f22338bbf48b44fb8950409c5a69b9) Signed-off-by: Rodrigo Vivi --- drivers/gpu/drm/xe/xe_dma_buf.c | 12 +++++++++++- 1 file changed, 11 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/xe/xe_dma_buf.c b/drivers/gpu/drm/xe/xe_dma_buf.c index 7f9602b3363d..c0937c090d33 100644 --- a/drivers/gpu/drm/xe/xe_dma_buf.c +++ b/drivers/gpu/drm/xe/xe_dma_buf.c @@ -258,6 +258,13 @@ struct dma_buf *xe_gem_prime_export(struct drm_gem_object *obj, int flags) return ERR_PTR(ret); } +/* + * Takes ownership of @storage: on success it is transferred to the returned + * drm_gem_object; on failure it is freed before returning the error. + * This matches the contract of xe_bo_init_locked() which frees @storage on + * its error paths, so callers need not (and must not) free @storage after + * this call. + */ static struct drm_gem_object * xe_dma_buf_init_obj(struct drm_device *dev, struct xe_bo *storage, struct dma_buf *dma_buf) @@ -271,8 +278,10 @@ xe_dma_buf_init_obj(struct drm_device *dev, struct xe_bo *storage, int ret = 0; dummy_obj = drm_gpuvm_resv_object_alloc(&xe->drm); - if (!dummy_obj) + if (!dummy_obj) { + xe_bo_free(storage); return ERR_PTR(-ENOMEM); + } dummy_obj->resv = resv; xe_validation_guard(&ctx, &xe->val, &exec, (struct xe_val_flags) {}, ret) { @@ -281,6 +290,7 @@ xe_dma_buf_init_obj(struct drm_device *dev, struct xe_bo *storage, if (ret) break; + /* xe_bo_init_locked() frees storage on error */ bo = xe_bo_init_locked(xe, storage, NULL, resv, NULL, dma_buf->size, 0, /* Will require 1way or 2way for vm_bind */ ttm_bo_type_sg, XE_BO_FLAG_SYSTEM, &exec); From 111ab678471bf1f90d078d5513bb086b70596c3c Mon Sep 17 00:00:00 2001 From: Shuicheng Lin Date: Wed, 8 Apr 2026 17:52:55 +0000 Subject: [PATCH 71/77] drm/xe: Fix dma-buf attachment leak in xe_gem_prime_import() When xe_dma_buf_init_obj() fails, the attachment from dma_buf_dynamic_attach() is not detached. Add dma_buf_detach() before returning the error. Note: we cannot use goto out_err here because xe_dma_buf_init_obj() already frees bo on failure, and out_err would double-free it. Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs") Cc: stable@vger.kernel.org Assisted-by: Claude:claude-opus-4.6 Reviewed-by: Mattheq Brost Link: https://patch.msgid.link/20260408175255.3402838-5-shuicheng.lin@intel.com Signed-off-by: Shuicheng Lin (cherry picked from commit a828eb185aac41800df8eae4b60501ccc0dbbe51) Signed-off-by: Rodrigo Vivi --- drivers/gpu/drm/xe/xe_dma_buf.c | 11 +++++++---- 1 file changed, 7 insertions(+), 4 deletions(-) diff --git a/drivers/gpu/drm/xe/xe_dma_buf.c b/drivers/gpu/drm/xe/xe_dma_buf.c index c0937c090d33..b9828da15897 100644 --- a/drivers/gpu/drm/xe/xe_dma_buf.c +++ b/drivers/gpu/drm/xe/xe_dma_buf.c @@ -378,12 +378,15 @@ struct drm_gem_object *xe_gem_prime_import(struct drm_device *dev, goto out_err; } - /* Errors here will take care of freeing the bo. */ + /* + * xe_dma_buf_init_obj() takes ownership of bo on both success + * and failure, so we must not touch bo after this call. + */ obj = xe_dma_buf_init_obj(dev, bo, dma_buf); - if (IS_ERR(obj)) + if (IS_ERR(obj)) { + dma_buf_detach(dma_buf, attach); return obj; - - + } get_dma_buf(dma_buf); obj->import_attach = attach; return obj; From f3cc22d4df3ed58439ea7e21daa54c3608e03b78 Mon Sep 17 00:00:00 2001 From: Shuicheng Lin Date: Wed, 8 Apr 2026 02:06:47 +0000 Subject: [PATCH 72/77] drm/xe: Fix error cleanup in xe_exec_queue_create_ioctl() Two error handling issues exist in xe_exec_queue_create_ioctl(): 1. When xe_hw_engine_group_add_exec_queue() fails, the error path jumps to put_exec_queue which skips xe_exec_queue_kill(). If the VM is in preempt fence mode, xe_vm_add_compute_exec_queue() has already added the queue to the VM's compute exec queue list. Skipping the kill leaves the queue on that list, leading to a dangling pointer after the queue is freed. 2. When xa_alloc() fails after xe_hw_engine_group_add_exec_queue() has succeeded, the error path does not call xe_hw_engine_group_del_exec_queue() to remove the queue from the hw engine group list. The queue is then freed while still linked into the hw engine group, causing a use-after-free. Fix both by: - Changing the xe_hw_engine_group_add_exec_queue() failure path to jump to kill_exec_queue so that xe_exec_queue_kill() properly removes the queue from the VM's compute list. - Adding a del_hw_engine_group label before kill_exec_queue for the xa_alloc() failure path, which removes the queue from the hw engine group before proceeding with the rest of the cleanup. Fixes: 7970cb36966c ("'drm/xe/hw_engine_group: Register hw engine group's exec queues") Cc: Francois Dugast Cc: Matthew Brost Cc: Niranjana Vishwanathapura Assisted-by: Claude:claude-opus-4.6 Reviewed-by: Matthew Brost Link: https://patch.msgid.link/20260408020647.3397933-1-shuicheng.lin@intel.com Signed-off-by: Shuicheng Lin (cherry picked from commit 37c831f401746a45d510b312b0ed7a77b1e06ec8) Signed-off-by: Rodrigo Vivi --- drivers/gpu/drm/xe/xe_exec_queue.c | 7 +++++-- 1 file changed, 5 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/xe/xe_exec_queue.c b/drivers/gpu/drm/xe/xe_exec_queue.c index 8de8ec784a03..071b8c41df43 100644 --- a/drivers/gpu/drm/xe/xe_exec_queue.c +++ b/drivers/gpu/drm/xe/xe_exec_queue.c @@ -1405,7 +1405,7 @@ int xe_exec_queue_create_ioctl(struct drm_device *dev, void *data, if (q->vm && q->hwe->hw_engine_group) { err = xe_hw_engine_group_add_exec_queue(q->hwe->hw_engine_group, q); if (err) - goto put_exec_queue; + goto kill_exec_queue; } } @@ -1416,12 +1416,15 @@ int xe_exec_queue_create_ioctl(struct drm_device *dev, void *data, /* user id alloc must always be last in ioctl to prevent UAF */ err = xa_alloc(&xef->exec_queue.xa, &id, q, xa_limit_32b, GFP_KERNEL); if (err) - goto kill_exec_queue; + goto del_hw_engine_group; args->exec_queue_id = id; return 0; +del_hw_engine_group: + if (q->vm && q->hwe && q->hwe->hw_engine_group) + xe_hw_engine_group_del_exec_queue(q->hwe->hw_engine_group, q); kill_exec_queue: xe_exec_queue_kill(q); delete_queue_group: From dc2d9842c67d883d3200ae33b9c3859dd9492408 Mon Sep 17 00:00:00 2001 From: Shuicheng Lin Date: Wed, 15 Apr 2026 22:54:28 +0000 Subject: [PATCH 73/77] drm/xe/eustall: Fix drm_dev_put called before stream disable in close In xe_eu_stall_stream_close(), drm_dev_put() is called before the stream is disabled and its resources are freed. If this drops the last reference, the device structures could be freed while the subsequent cleanup code still accesses them, leading to a use-after-free. Fix this by moving drm_dev_put() after all device accesses are complete. This matches the ordering in xe_oa_release(). Fixes: 9a0b11d4cf3b ("drm/xe/eustall: Add support to init, enable and disable EU stall sampling") Cc: Harish Chegondi Assisted-by: Claude:claude-opus-4.6 Signed-off-by: Shuicheng Lin Reviewed-by: Harish Chegondi Link: https://patch.msgid.link/20260415225428.3399934-1-shuicheng.lin@intel.com Signed-off-by: Matt Roper (cherry picked from commit 35aff528f7297e949e5e19c9cd7fd748cf1cf21c) Signed-off-by: Rodrigo Vivi --- drivers/gpu/drm/xe/xe_eu_stall.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/xe/xe_eu_stall.c b/drivers/gpu/drm/xe/xe_eu_stall.c index c34408cfd292..dddcdd0bb7a3 100644 --- a/drivers/gpu/drm/xe/xe_eu_stall.c +++ b/drivers/gpu/drm/xe/xe_eu_stall.c @@ -869,14 +869,14 @@ static int xe_eu_stall_stream_close(struct inode *inode, struct file *file) struct xe_eu_stall_data_stream *stream = file->private_data; struct xe_gt *gt = stream->gt; - drm_dev_put(>->tile->xe->drm); - mutex_lock(>->eu_stall->stream_lock); xe_eu_stall_disable_locked(stream); xe_eu_stall_data_buf_destroy(stream); xe_eu_stall_stream_free(stream); mutex_unlock(>->eu_stall->stream_lock); + drm_dev_put(>->tile->xe->drm); + return 0; } From 3762d6c36549accea7068c4a175483fafdd03657 Mon Sep 17 00:00:00 2001 From: Shuicheng Lin Date: Fri, 17 Apr 2026 16:33:08 +0000 Subject: [PATCH 74/77] drm/xe/gsc: Fix BO leak on error in query_compatibility_version() When xe_gsc_read_out_header() fails, query_compatibility_version() returns directly instead of jumping to the out_bo label. This skips the xe_bo_unpin_map_no_vm() call, leaving the BO pinned and mapped with no remaining reference to free it. Fix by using goto out_bo so the error path properly cleans up the BO, consistent with the other error handling in the same function. Fixes: 0881cbe04077 ("drm/xe/gsc: Query GSC compatibility version") Cc: Daniele Ceraolo Spurio Reviewed-by: Daniele Ceraolo Spurio Link: https://patch.msgid.link/20260417163308.3416147-1-shuicheng.lin@intel.com Signed-off-by: Shuicheng Lin (cherry picked from commit 8de86d0a843c32ca9d36864bdb92f0376a830bce) Signed-off-by: Rodrigo Vivi --- drivers/gpu/drm/xe/xe_gsc.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/xe/xe_gsc.c b/drivers/gpu/drm/xe/xe_gsc.c index e5c234f3d795..0d13e357fb43 100644 --- a/drivers/gpu/drm/xe/xe_gsc.c +++ b/drivers/gpu/drm/xe/xe_gsc.c @@ -166,7 +166,7 @@ static int query_compatibility_version(struct xe_gsc *gsc) &rd_offset); if (err) { xe_gt_err(gt, "HuC: invalid GSC reply for version query (err=%d)\n", err); - return err; + goto out_bo; } compat->major = version_query_rd(xe, &bo->vmap, rd_offset, proj_major); From 0df99689eb790bcad3ad82b38fa4ce1cbf3cffa3 Mon Sep 17 00:00:00 2001 From: Tvrtko Ursulin Date: Mon, 20 Apr 2026 14:16:03 +0100 Subject: [PATCH 75/77] drm/xe/xelp: Fix Wa_18022495364 MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Command parser relative MMIO addressing needs to be enabled when writing to the register. Signed-off-by: Tvrtko Ursulin Fixes: ca33cd271ef9 ("drm/xe/xelp: Add Wa_18022495364") Cc: Matt Roper Cc: Matthew Brost Cc: Thomas Hellström Cc: Rodrigo Vivi Reviewed-by: Matt Roper Link: https://patch.msgid.link/20260420131603.70357-1-tvrtko.ursulin@igalia.com Signed-off-by: Matt Roper (cherry picked from commit 5627392001802a98ed6cf8cf79a303abd00d1c0f) Signed-off-by: Rodrigo Vivi --- drivers/gpu/drm/xe/xe_lrc.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/xe/xe_lrc.c b/drivers/gpu/drm/xe/xe_lrc.c index 9d12a0d2f0b5..c725cde4508d 100644 --- a/drivers/gpu/drm/xe/xe_lrc.c +++ b/drivers/gpu/drm/xe/xe_lrc.c @@ -1214,7 +1214,7 @@ static ssize_t setup_invalidate_state_cache_wa(struct xe_lrc *lrc, if (xe_gt_WARN_ON(lrc->gt, max_len < 3)) return -ENOSPC; - *cmd++ = MI_LOAD_REGISTER_IMM | MI_LRI_NUM_REGS(1); + *cmd++ = MI_LOAD_REGISTER_IMM | MI_LRI_LRM_CS_MMIO | MI_LRI_NUM_REGS(1); *cmd++ = CS_DEBUG_MODE2(0).addr; *cmd++ = REG_MASKED_FIELD_ENABLE(INSTRUCTION_STATE_CACHE_INVALIDATE); From 4e5591c2fc1b30f4ea5e2eab4c3a695acc404e39 Mon Sep 17 00:00:00 2001 From: Jia Yao Date: Fri, 17 Apr 2026 05:59:16 +0000 Subject: [PATCH 76/77] drm/xe/uapi: Reject coh_none PAT index for CPU cached memory in madvise MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Add validation in xe_vm_madvise_ioctl() to reject PAT indices with XE_COH_NONE coherency mode when applied to CPU cached memory. Using coh_none with CPU cached buffers is a security issue. When the kernel clears pages before reallocation, the clear operation stays in CPU cache (dirty). GPU with coh_none can bypass CPU caches and read stale sensitive data directly from DRAM, potentially leaking data from previously freed pages of other processes. This aligns with the existing validation in vm_bind path (xe_vm_bind_ioctl_validate_bo). v2(Matthew brost) - Add fixes - Move one debug print to better place v3(Matthew Auld) - Should be drm/xe/uapi - More Cc v4(Shuicheng Lin) - Fix kmem leak issues by the way v5 - Remove kmem leak because it has been merged by another patch v6 - Remove the fix which is not related to current fix v7 - No change v8 - Rebase v9 - Limit the restrictions to iGPU v10 - No change Fixes: ada7486c5668 ("drm/xe: Implement madvise ioctl for xe") Cc: # v6.18+ Cc: Shuicheng Lin Cc: Mathew Alwin Cc: Michal Mrozek Cc: Matthew Brost Cc: Matthew Auld Signed-off-by: Jia Yao Reviewed-by: Matthew Auld Acked-by: Michal Mrozek Acked-by: José Roberto de Souza Signed-off-by: Matthew Auld Link: https://patch.msgid.link/20260417055917.2027459-2-jia.yao@intel.com (cherry picked from commit 016ccdb674b8c899940b3944952c96a6a490d10a) Signed-off-by: Rodrigo Vivi --- drivers/gpu/drm/xe/xe_vm_madvise.c | 47 ++++++++++++++++++++++++++++++ 1 file changed, 47 insertions(+) diff --git a/drivers/gpu/drm/xe/xe_vm_madvise.c b/drivers/gpu/drm/xe/xe_vm_madvise.c index 66f00d3f5c07..c78906dea82b 100644 --- a/drivers/gpu/drm/xe/xe_vm_madvise.c +++ b/drivers/gpu/drm/xe/xe_vm_madvise.c @@ -621,6 +621,45 @@ static int xe_madvise_purgeable_retained_to_user(const struct xe_madvise_details return 0; } +static bool check_pat_args_are_sane(struct xe_device *xe, + struct xe_vmas_in_madvise_range *madvise_range, + u16 pat_index) +{ + u16 coh_mode = xe_pat_index_get_coh_mode(xe, pat_index); + int i; + + /* + * Using coh_none with CPU cached buffers is not allowed on iGPU. + * On iGPU the GPU shares the LLC with the CPU, so with coh_none + * the GPU bypasses CPU caches and reads directly from DRAM, + * potentially seeing stale sensitive data from previously freed + * pages. On dGPU this restriction does not apply, because the + * platform does not provide a non-coherent system memory access + * path that would violate the DMA coherency contract. + */ + if (coh_mode != XE_COH_NONE || IS_DGFX(xe)) + return true; + + for (i = 0; i < madvise_range->num_vmas; i++) { + struct xe_vma *vma = madvise_range->vmas[i]; + struct xe_bo *bo = xe_vma_bo(vma); + + if (bo) { + /* BO with WB caching + COH_NONE is not allowed */ + if (XE_IOCTL_DBG(xe, bo->cpu_caching == DRM_XE_GEM_CPU_CACHING_WB)) + return false; + /* Imported dma-buf without caching info, assume cached */ + if (XE_IOCTL_DBG(xe, !bo->cpu_caching)) + return false; + } else if (XE_IOCTL_DBG(xe, xe_vma_is_cpu_addr_mirror(vma) || + xe_vma_is_userptr(vma))) + /* System memory (userptr/SVM) is always CPU cached */ + return false; + } + + return true; +} + static bool check_bo_args_are_sane(struct xe_vm *vm, struct xe_vma **vmas, int num_vmas, u32 atomic_val) { @@ -750,6 +789,14 @@ int xe_vm_madvise_ioctl(struct drm_device *dev, void *data, struct drm_file *fil } } + if (args->type == DRM_XE_MEM_RANGE_ATTR_PAT) { + if (!check_pat_args_are_sane(xe, &madvise_range, + args->pat_index.val)) { + err = -EINVAL; + goto free_vmas; + } + } + if (madvise_range.has_bo_vmas) { if (args->type == DRM_XE_MEM_RANGE_ATTR_ATOMIC) { if (!check_bo_args_are_sane(vm, madvise_range.vmas, From 662f9ddc8077792129440d05cbef2f944a07777a Mon Sep 17 00:00:00 2001 From: Jia Yao Date: Fri, 17 Apr 2026 05:59:17 +0000 Subject: [PATCH 77/77] drm/xe/uapi: Reject coh_none PAT index for CPU_ADDR_MIRROR Add validation in xe_vm_bind_ioctl() to reject PAT indices with XE_COH_NONE coherency mode when used with DRM_XE_VM_BIND_FLAG_CPU_ADDR_MIRROR. CPU address mirror mappings use system memory that is CPU cached, which makes them incompatible with COH_NONE PAT indices. Allowing COH_NONE with CPU cached buffers is a security risk, as the GPU may bypass CPU caches and read stale sensitive data from DRAM. Although CPU_ADDR_MIRROR does not create an immediate mapping, the backing system memory is still CPU cached. Apply the same PAT coherency restrictions as DRM_XE_VM_BIND_OP_MAP_USERPTR. v2: - Correct fix tag v6: - No change v7: - Correct fix tag v8: - Rebase v9: - Limit the restrictions to iGPU v10: - Just add the iGPU logic but keep dGPU logic Fixes: b43e864af0d4 ("drm/xe/uapi: Add DRM_XE_VM_BIND_FLAG_CPU_ADDR_MIRROR") Cc: # v6.15+ Cc: Shuicheng Lin Cc: Mathew Alwin Cc: Michal Mrozek Cc: Matthew Brost Cc: Matthew Auld Signed-off-by: Jia Yao Reviewed-by: Matthew Auld Acked-by: Michal Mrozek Signed-off-by: Matthew Auld Link: https://patch.msgid.link/20260417055917.2027459-3-jia.yao@intel.com (cherry picked from commit 4d58d7535e826a3175527b6174502f0db319d7f6) Signed-off-by: Rodrigo Vivi --- drivers/gpu/drm/xe/xe_vm.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/drivers/gpu/drm/xe/xe_vm.c b/drivers/gpu/drm/xe/xe_vm.c index 1720205c09ca..a717a2b8dea3 100644 --- a/drivers/gpu/drm/xe/xe_vm.c +++ b/drivers/gpu/drm/xe/xe_vm.c @@ -3658,6 +3658,8 @@ static int vm_bind_ioctl_check_args(struct xe_device *xe, struct xe_vm *vm, op == DRM_XE_VM_BIND_OP_MAP_USERPTR) || XE_IOCTL_DBG(xe, coh_mode == XE_COH_NONE && op == DRM_XE_VM_BIND_OP_MAP_USERPTR) || + XE_IOCTL_DBG(xe, !IS_DGFX(xe) && coh_mode == XE_COH_NONE && + is_cpu_addr_mirror) || XE_IOCTL_DBG(xe, xe_device_is_l2_flush_optimized(xe) && (op == DRM_XE_VM_BIND_OP_MAP_USERPTR || is_cpu_addr_mirror) &&