drm/amdgpu: Fix error handling in amdgpu_ras_add_bad_pages

It ensures that appropriate error codes are returned when an error
condition is detected

Fixes the below;
drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c:2849 amdgpu_ras_add_bad_pages() warn: missing error code here? 'amdgpu_umc_pages_in_a_row()' failed.
drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c:2884 amdgpu_ras_add_bad_pages() warn: missing error code here? 'amdgpu_ras_mca2pa()' failed.

v2: s/-EIO/-EINVAL, retained the use of -EINVAL from
    amdgpu_umc_pages_in_a_row & and amdgpu_ras_mca2pa_by_idx, when the
    RAS context is not initialized or the convert_ras_err_addr function is
    unavailable. (Thomas)

V3: Returning 0 as the absence of eh_data is acceptable. (Tao)

Fixes: a8d133e625 ("drm/amdgpu: parse legacy RAS bad page mixed with new data in various NPS modes")
Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
Cc: YiPeng Chai <yipeng.chai@amd.com>
Cc: Tao Zhou <tao.zhou1@amd.com>
Cc: Hawking Zhang <Hawking.Zhang@amd.com>
Cc: Christian König <christian.koenig@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
This commit is contained in:
Srinivasan Shanmugam 2024-12-13 16:46:42 +05:30 committed by Alex Deucher
parent 2774ef7625
commit 9095567bc3

View File

@ -2832,8 +2832,10 @@ int amdgpu_ras_add_bad_pages(struct amdgpu_device *adev,
mutex_lock(&con->recovery_lock);
data = con->eh_data;
if (!data)
if (!data) {
/* Returning 0 as the absence of eh_data is acceptable */
goto free;
}
for (i = 0; i < pages; i++) {
if (from_rom &&
@ -2845,26 +2847,34 @@ int amdgpu_ras_add_bad_pages(struct amdgpu_device *adev,
* one row
*/
if (amdgpu_umc_pages_in_a_row(adev, &err_data,
bps[i].retired_page << AMDGPU_GPU_PAGE_SHIFT))
bps[i].retired_page <<
AMDGPU_GPU_PAGE_SHIFT)) {
ret = -EINVAL;
goto free;
else
} else {
find_pages_per_pa = true;
}
} else {
/* unsupported cases */
ret = -EOPNOTSUPP;
goto free;
}
}
} else {
if (amdgpu_umc_pages_in_a_row(adev, &err_data,
bps[i].retired_page << AMDGPU_GPU_PAGE_SHIFT))
bps[i].retired_page << AMDGPU_GPU_PAGE_SHIFT)) {
ret = -EINVAL;
goto free;
}
}
} else {
if (from_rom && !find_pages_per_pa) {
if (bps[i].retired_page & UMC_CHANNEL_IDX_V2) {
/* bad page in any NPS mode in eeprom */
if (amdgpu_ras_mca2pa_by_idx(adev, &bps[i], &err_data))
if (amdgpu_ras_mca2pa_by_idx(adev, &bps[i], &err_data)) {
ret = -EINVAL;
goto free;
}
} else {
/* legacy bad page in eeprom, generated only in
* NPS1 mode
@ -2881,6 +2891,7 @@ int amdgpu_ras_add_bad_pages(struct amdgpu_device *adev,
/* non-nps1 mode, old RAS TA
* can't support it
*/
ret = -EOPNOTSUPP;
goto free;
}
}