drm/etnaviv: Fix armed job not being pushed to the DRM scheduler

When xa_alloc_cyclic() failed in etnaviv_sched_push_job(), the error
path skipped drm_sched_entity_push_job(). This violates the DRM scheduler
contract: once a job has been armed with drm_sched_job_arm(), it must be
pushed with drm_sched_entity_push_job(). From the DRM scheduler
documentation:

"""
drm_sched_job_arm() is a point of no return since it initializes the
fences and their sequence number etc. Once that function has been called,
you *must* submit it with drm_sched_entity_push_job() and cannot simply
abort it by calling drm_sched_job_cleanup().
"""

Fix this by splitting the fence ID allocation into two phases: first,
allocate an xarray slot with a NULL entry before arming the job (this is
the step that can fail), then store the actual fence in that slot with
xa_store() after arming. This way, allocation failures are handled before
the job is armed, and once armed, the job is always pushed to the
scheduler.

This also fixes a double call to drm_sched_job_cleanup(), as both
etnaviv_sched_push_job() and its caller would call it on failure.
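
The reserve-then-fill ordering described above can be sketched in plain
userspace C. This is only an illustration of the ordering, not the driver
code: struct slot_table, struct job, slot_reserve(), slot_fill() and
submit_job() are hypothetical stand-ins for the xarray, the scheduler job
and the kernel calls, and the fixed-size table is an assumption made to
keep the example small.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NSLOTS 4

/* Hypothetical stand-in for the fence xarray: a slot is either free,
 * reserved (allocated with no entry yet), or filled with a pointer. */
struct slot_table {
	bool reserved[NSLOTS];
	const void *entry[NSLOTS];
};

/* Hypothetical stand-in for a scheduler job. */
struct job {
	bool armed;
	bool pushed;
};

/* Phase 1: reserve a slot. This is the only step that can fail. */
static int slot_reserve(struct slot_table *t, unsigned int *id)
{
	for (unsigned int i = 0; i < NSLOTS; i++) {
		if (!t->reserved[i]) {
			t->reserved[i] = true;
			t->entry[i] = NULL;
			*id = i;
			return 0;
		}
	}
	return -1; /* table full */
}

/* Phase 2: fill a previously reserved slot. Nothing here can fail. */
static void slot_fill(struct slot_table *t, unsigned int id, const void *p)
{
	assert(id < NSLOTS && t->reserved[id]);
	t->entry[id] = p;
}

/* All fallible work happens before the job is armed. */
static int submit_job(struct slot_table *t, struct job *j, unsigned int *id)
{
	int ret = slot_reserve(t, id);

	if (ret < 0)
		return ret;	/* job is still unarmed, nothing to undo */

	j->armed = true;	/* point of no return */
	slot_fill(t, *id, j);
	j->pushed = true;	/* always reached once armed */
	return 0;
}
```

With this ordering, a failed submit_job() leaves the job untouched, so the
caller can run its usual cleanup exactly once.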

Fixes: 764be12345 ("drm/etnaviv: convert user fence tracking to XArray")
Signed-off-by: Maíra Canal <mcanal@igalia.com>
Link: https://patch.msgid.link/20260402193424.2023318-1-mcanal@igalia.com
Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com>
Maíra Canal 2026-04-02 16:32:35 -03:00 committed by Christian Gmeiner
parent 84ae184026
commit 3780c41460

@@ -116,16 +116,18 @@ int etnaviv_sched_push_job(struct etnaviv_gem_submit *submit)
 	 */
 	mutex_lock(&gpu->sched_lock);
+	ret = xa_alloc_cyclic(&gpu->user_fences, &submit->out_fence_id,
+			      NULL, xa_limit_32b, &gpu->next_user_fence,
+			      GFP_KERNEL);
+	if (ret < 0)
+		goto out_unlock;
 	drm_sched_job_arm(&submit->sched_job);
 	submit->out_fence = dma_fence_get(&submit->sched_job.s_fence->finished);
-	ret = xa_alloc_cyclic(&gpu->user_fences, &submit->out_fence_id,
-			      submit->out_fence, xa_limit_32b,
-			      &gpu->next_user_fence, GFP_KERNEL);
-	if (ret < 0) {
-		drm_sched_job_cleanup(&submit->sched_job);
-		goto out_unlock;
-	}
+	xa_store(&gpu->user_fences, submit->out_fence_id,
+		 submit->out_fence, GFP_KERNEL);
 	/* the scheduler holds on to the job now */
 	kref_get(&submit->refcount);