Commit Graph

323 Commits

Author SHA1 Message Date
Maíra Canal
497d370a64
drm/v3d: Fix out-of-bounds read in v3d_csd_job_run()
When enabling UBSAN on Raspberry Pi 5, we get the following warning:

[  387.894977] UBSAN: array-index-out-of-bounds in drivers/gpu/drm/v3d/v3d_sched.c:320:3
[  387.903868] index 7 is out of range for type '__u32 [7]'
[  387.909692] CPU: 0 PID: 1207 Comm: kworker/u16:2 Tainted: G        WC         6.10.3-v8-16k-numa #151
[  387.919166] Hardware name: Raspberry Pi 5 Model B Rev 1.0 (DT)
[  387.925961] Workqueue: v3d_csd drm_sched_run_job_work [gpu_sched]
[  387.932525] Call trace:
[  387.935296]  dump_backtrace+0x170/0x1b8
[  387.939403]  show_stack+0x20/0x38
[  387.942907]  dump_stack_lvl+0x90/0xd0
[  387.946785]  dump_stack+0x18/0x28
[  387.950301]  __ubsan_handle_out_of_bounds+0x98/0xd0
[  387.955383]  v3d_csd_job_run+0x3a8/0x438 [v3d]
[  387.960707]  drm_sched_run_job_work+0x520/0x6d0 [gpu_sched]
[  387.966862]  process_one_work+0x62c/0xb48
[  387.971296]  worker_thread+0x468/0x5b0
[  387.975317]  kthread+0x1c4/0x1e0
[  387.978818]  ret_from_fork+0x10/0x20
[  387.983014] ---[ end trace ]---

This happens because the UAPI provides only seven configuration
registers and we are reading the eighth position of this u32 array.

Therefore, fix the out-of-bounds read in `v3d_csd_job_run()` by
accessing only seven positions on the '__u32 [7]' array. The eighth
register exists indeed on V3D 7.1, but it isn't currently used. That
being so, let's guarantee that it remains unused and add a note that it
could be set in a future patch.

Fixes: 0ad5bc1ce4 ("drm/v3d: fix up register addresses for V3D 7.x")
Reported-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
Signed-off-by: Maíra Canal <mcanal@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240809152001.668314-1-mcanal@igalia.com
2024-08-12 11:14:21 -03:00
Daniel Vetter
91dae758bd drm-misc-next for v6.12:
UAPI Changes:
 
 virtio:
 - Define DRM capset
 
 Cross-subsystem Changes:
 
 dma-buf:
 - heaps: Clean up documentation
 
 printk:
 - Pass description to kmsg_dump()
 
 Core Changes:
 
 CI:
 - Update IGT tests
 - Point upstream repo to GitLab instance
 
 modesetting:
 - Introduce Power Saving Policy property for connectors
 - Add might_fault() to drm_modeset_lock priming
 - Add dynamic per-crtc vblank configuration support
 
 panic:
 - Avoid build-time interference with framebuffer console
 
 docs:
 - Document Colorspace property
 
 scheduler:
 - Remove full_recover from drm_sched_start
 
 TTM:
 - Make LRU walk restartable after dropping locks
 - Allow direct reclaim to allocate local memory
 
 Driver Changes:
 
 amdgpu:
 - Support Power Saving Policy connector property
 
 ast:
 - astdp: Support AST2600 with VGA; Clean up HPD
 
 bridge:
 - Silence error message on -EPROBE_DEFER
 - analogix: Clean aup
 - bridge-connector: Fix double free
 - lt6505: Disable interrupt when powered off
 - tc358767: Make default DP port preemphasis configurable
 
 gma500:
 - Update i2c terminology
 
 ivpu:
 - Add MODULE_FIRMWARE()
 
 lcdif:
 - Fix pixel clock
 
 loongson:
 - Use GEM refcount over TTM's
 
 mgag200:
 - Improve BMC handling
 - Support VBLANK intterupts
 
 nouveau:
 - Refactor and clean up internals
 - Use GEM refcount over TTM's
 
 panel:
 - Shutdown fixes plus documentation
 - Refactor several drivers for better code sharing
 - boe-th101mb31ig002: Support for starry-er88577 MIPI-DSI panel plus
   DT; Fix porch parameter
 - edp: Support AOU B116XTN02.3, AUO B116XAN06.1, AOU B116XAT04.1,
   BOE NV140WUM-N41, BOE NV133WUM-N63, BOE NV116WHM-A4D, CMN N116BCA-EA2,
   CMN N116BCP-EA2, CSW MNB601LS1-4
 - himax-hx8394: Support Microchip AC40T08A MIPI Display panel plus DT
 - ilitek-ili9806e: Support Densitron DMT028VGHMCMI-1D TFT plus DT
 - jd9365da: Support Melfas lmfbx101117480 MIPI-DSI panel plus DT; Refactor
   for code sharing
 
 sti:
 - Fix module owner
 
 stm:
 - Avoid UAF wih managed plane and CRTC helpers
 - Fix module owner
 - Fix error handling in probe
 - Depend on COMMON_CLK
 - ltdc: Fix transparency after disabling plane; Remove unused interrupt
 
 tegra:
 - Call drm_atomic_helper_shutdown()
 
 v3d:
 - Clean up perfmon
 
 vkms:
 - Clean up
 -----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCgAdFiEEchf7rIzpz2NEoWjlaA3BHVMLeiMFAmareygACgkQaA3BHVML
 eiO2vwf9FirbMiq4lfHzgcbNIU1dTUtjRAZjrlwGmqk5cb9lUshAMCMBMOEQBDdg
 XMQQj/RMBvRUuxzsPGk78ObSz5FBaBLgKwFprer0V6uslQaJxj4YRsnkp0l2n+0k
 +ebhfo2rUgZOdgNOkXH326w9UhqiydIa7GaA2aq1vUzXKFDfvGXtSN75BMlEWlKP
 rTft56AiwjwcKu7zYFHGlFUMSNpKAQy7lnV3+dBXAfFNHu4zVNoI/yWGEOdR7eVo
 WhiEcpvismsOh+BfUvMNPP3RKwjXHdwMlJYb+v9XGgH27hqc50lSceWydHtoJTto
 DTXF9WQhJ+/GQR9ZGmBjos9GVbECDA==
 =L/1W
 -----END PGP SIGNATURE-----

Merge tag 'drm-misc-next-2024-08-01' of https://gitlab.freedesktop.org/drm/misc/kernel into drm-next

drm-misc-next for v6.12:

UAPI Changes:

virtio:
- Define DRM capset

Cross-subsystem Changes:

dma-buf:
- heaps: Clean up documentation

printk:
- Pass description to kmsg_dump()

Core Changes:

CI:
- Update IGT tests
- Point upstream repo to GitLab instance

modesetting:
- Introduce Power Saving Policy property for connectors
- Add might_fault() to drm_modeset_lock priming
- Add dynamic per-crtc vblank configuration support

panic:
- Avoid build-time interference with framebuffer console

docs:
- Document Colorspace property

scheduler:
- Remove full_recover from drm_sched_start

TTM:
- Make LRU walk restartable after dropping locks
- Allow direct reclaim to allocate local memory

Driver Changes:

amdgpu:
- Support Power Saving Policy connector property

ast:
- astdp: Support AST2600 with VGA; Clean up HPD

bridge:
- Silence error message on -EPROBE_DEFER
- analogix: Clean aup
- bridge-connector: Fix double free
- lt6505: Disable interrupt when powered off
- tc358767: Make default DP port preemphasis configurable

gma500:
- Update i2c terminology

ivpu:
- Add MODULE_FIRMWARE()

lcdif:
- Fix pixel clock

loongson:
- Use GEM refcount over TTM's

mgag200:
- Improve BMC handling
- Support VBLANK intterupts

nouveau:
- Refactor and clean up internals
- Use GEM refcount over TTM's

panel:
- Shutdown fixes plus documentation
- Refactor several drivers for better code sharing
- boe-th101mb31ig002: Support for starry-er88577 MIPI-DSI panel plus
  DT; Fix porch parameter
- edp: Support AOU B116XTN02.3, AUO B116XAN06.1, AOU B116XAT04.1,
  BOE NV140WUM-N41, BOE NV133WUM-N63, BOE NV116WHM-A4D, CMN N116BCA-EA2,
  CMN N116BCP-EA2, CSW MNB601LS1-4
- himax-hx8394: Support Microchip AC40T08A MIPI Display panel plus DT
- ilitek-ili9806e: Support Densitron DMT028VGHMCMI-1D TFT plus DT
- jd9365da: Support Melfas lmfbx101117480 MIPI-DSI panel plus DT; Refactor
  for code sharing

sti:
- Fix module owner

stm:
- Avoid UAF wih managed plane and CRTC helpers
- Fix module owner
- Fix error handling in probe
- Depend on COMMON_CLK
- ltdc: Fix transparency after disabling plane; Remove unused interrupt

tegra:
- Call drm_atomic_helper_shutdown()

v3d:
- Clean up perfmon

vkms:
- Clean up

Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
From: Thomas Zimmermann <tzimmermann@suse.de>
Link: https://patchwork.freedesktop.org/patch/msgid/20240801121406.GA102996@linux.fritz.box
2024-08-08 18:58:46 +02:00
Maxime Ripard
a1ff5a7d78
Merge drm/drm-fixes into drm-misc-fixes
Let's start the new drm-misc-fixes cycle by bringing in 6.11-rc1.

Signed-off-by: Maxime Ripard <mripard@kernel.org>
2024-07-30 09:09:23 +02:00
Thomas Zimmermann
0e8655b4e8 Merge drm/drm-next into drm-misc-next
Backmerging to get a late RC of v6.10 before moving into v6.11.

Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
2024-07-29 09:35:54 +02:00
Christian König
83b501c179 drm/scheduler: remove full_recover from drm_sched_start
This was basically just another one of amdgpus hacks. The parameter
allowed to restart the scheduler without turning fence signaling on
again.

That this is absolutely not a good idea should be obvious by now since
the fences will then just sit there and never signal.

While at it cleanup the code a bit.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240722083816.99685-1-christian.koenig@amd.com
2024-07-25 14:05:12 +02:00
Tvrtko Ursulin
4ecc24a84d drm/v3d: Validate passed in drm syncobj handles in the performance extension
If userspace provides an unknown or invalid handle anywhere in the handle
array the rest of the driver will not handle that well.

Fix it by checking handle was looked up successfully or otherwise fail the
extension by jumping into the existing unwind.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
Fixes: bae7cb5d68 ("drm/v3d: Create a CPU job extension for the reset performance query job")
Cc: Maíra Canal <mcanal@igalia.com>
Cc: Iago Toral Quiroga <itoral@igalia.com>
Cc: stable@vger.kernel.org # v6.8+
Reviewed-by: Maíra Canal <mcanal@igalia.com>
Signed-off-by: Maíra Canal <mcanal@igalia.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240711135340.84617-6-tursulin@igalia.com
(cherry picked from commit a546b7e4d7)
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
2024-07-18 15:50:08 +02:00
Tvrtko Ursulin
023d22e8bb drm/v3d: Validate passed in drm syncobj handles in the timestamp extension
If userspace provides an unknown or invalid handle anywhere in the handle
array the rest of the driver will not handle that well.

Fix it by checking handle was looked up successfully or otherwise fail the
extension by jumping into the existing unwind.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
Fixes: 9ba0ff3e08 ("drm/v3d: Create a CPU job extension for the timestamp query job")
Cc: Maíra Canal <mcanal@igalia.com>
Cc: Iago Toral Quiroga <itoral@igalia.com>
Cc: stable@vger.kernel.org # v6.8+
Reviewed-by: Maíra Canal <mcanal@igalia.com>
Signed-off-by: Maíra Canal <mcanal@igalia.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240711135340.84617-5-tursulin@igalia.com
(cherry picked from commit 8d1276d1b8)
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
2024-07-18 15:49:55 +02:00
Tvrtko Ursulin
32df4abc44 drm/v3d: Fix potential memory leak in the performance extension
If fetching of userspace memory fails during the main loop, all drm sync
objs looked up until that point will be leaked because of the missing
drm_syncobj_put.

Fix it by exporting and using a common cleanup helper.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
Fixes: bae7cb5d68 ("drm/v3d: Create a CPU job extension for the reset performance query job")
Cc: Maíra Canal <mcanal@igalia.com>
Cc: Iago Toral Quiroga <itoral@igalia.com>
Cc: stable@vger.kernel.org # v6.8+
Signed-off-by: Maíra Canal <mcanal@igalia.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240711135340.84617-4-tursulin@igalia.com
(cherry picked from commit 484de39fa5)
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
2024-07-18 15:49:28 +02:00
Tvrtko Ursulin
0e50fcc20b drm/v3d: Fix potential memory leak in the timestamp extension
If fetching of userspace memory fails during the main loop, all drm sync
objs looked up until that point will be leaked because of the missing
drm_syncobj_put.

Fix it by exporting and using a common cleanup helper.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
Fixes: 9ba0ff3e08 ("drm/v3d: Create a CPU job extension for the timestamp query job")
Cc: Maíra Canal <mcanal@igalia.com>
Cc: Iago Toral Quiroga <itoral@igalia.com>
Cc: stable@vger.kernel.org # v6.8+
Reviewed-by: Maíra Canal <mcanal@igalia.com>
Signed-off-by: Maíra Canal <mcanal@igalia.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240711135340.84617-3-tursulin@igalia.com
(cherry picked from commit 753ce4fea6)
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
2024-07-18 15:49:08 +02:00
Tvrtko Ursulin
6ce9efd12a drm/v3d: Prevent out of bounds access in performance query extensions
Check that the number of perfmons userspace is passing in the copy and
reset extensions is not greater than the internal kernel storage where
the ids will be copied into.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
Fixes: bae7cb5d68 ("drm/v3d: Create a CPU job extension for the reset performance query job")
Cc: Maíra Canal <mcanal@igalia.com>
Cc: Iago Toral Quiroga <itoral@igalia.com>
Cc: stable@vger.kernel.org # v6.8+
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Maíra Canal <mcanal@igalia.com>
Signed-off-by: Maíra Canal <mcanal@igalia.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240711135340.84617-2-tursulin@igalia.com
(cherry picked from commit f32b5128d2)
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
2024-07-18 15:48:37 +02:00
Maíra Canal
1fe1c66274
drm/v3d: Fix Indirect Dispatch configuration for V3D 7.1.6 and later
`args->cfg[4]` is configured in Indirect Dispatch using the number of
batches. Currently, for all V3D tech versions, `args->cfg[4]` equals the
number of batches subtracted by 1. But, for V3D 7.1.6 and later, we must not
subtract 1 from the number of batches.

Implement the fix by checking the V3D tech version and revision.

Fixes several `dEQP-VK.synchronization*` CTS tests related to Indirect Dispatch.

Fixes: 18b8413b25 ("drm/v3d: Create a CPU job extension for a indirect CSD job")
Signed-off-by: Maíra Canal <mcanal@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240714145243.1223131-2-mcanal@igalia.com
2024-07-15 12:49:52 -03:00
Maíra Canal
7c78fdbace
drm/v3d: Add V3D tech revision to the device information
The V3D tech revision can be a useful information when configuring
jobs. Therefore, expose it in the `struct v3d_dev` with the V3D tech
version.

Signed-off-by: Maíra Canal <mcanal@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240714145243.1223131-1-mcanal@igalia.com
2024-07-15 12:45:59 -03:00
Maíra Canal
0b91c6daca
drm/v3d: Expose memory stats through fdinfo
Use the common DRM function `drm_show_memory_stats()` to expose standard
fdinfo memory stats.

V3D exposes global GPU memory stats through debugfs. Those stats will be
preserved while the DRM subsystem doesn't have a standard solution to
expose global GPU stats.

Signed-off-by: Maíra Canal <mcanal@igalia.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240711142736.783816-1-mcanal@igalia.com
2024-07-13 11:00:32 -03:00
Tvrtko Ursulin
c48abf099d
drm/v3d: Add some local variables in queries/extensions
Add some local variables to make the code a bit less verbose, with the
main benefit being pulling some lines to under 80 columns wide.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
Reviewed-by: Maíra Canal <mcanal@igalia.com>
Signed-off-by: Maíra Canal <mcanal@igalia.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240711135340.84617-12-tursulin@igalia.com
2024-07-13 11:00:32 -03:00
Tvrtko Ursulin
96bc9049ba
drm/v3d: Prefer get_user for scalar types
It makes it just a tiny bit more obvious what is going on.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
Reviewed-by: Maíra Canal <mcanal@igalia.com>
Signed-off-by: Maíra Canal <mcanal@igalia.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240711135340.84617-11-tursulin@igalia.com
2024-07-13 11:00:32 -03:00
Tvrtko Ursulin
3ef80d4ed6
drm/v3d: Move perfmon init completely into own unit
Now that the build time dependencies on various array sizes have been
removed, we can move the perfmon init completely into its own compilation
unit and remove the hardcoded defines.

This improves on the temporary fix quickly delivered in commit
9c3951ec27 ("drm/v3d: Fix perfmon build error/warning").

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
References: 9c3951ec27 ("drm/v3d: Fix perfmon build error/warning")
Reviewed-by: Maíra Canal <mcanal@igalia.com>
Signed-off-by: Maíra Canal <mcanal@igalia.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240711135340.84617-10-tursulin@igalia.com
2024-07-13 11:00:32 -03:00
Tvrtko Ursulin
1be825c5c0
drm/v3d: Do not use intermediate storage when copying performance query results
Removing the intermediate buffer removes the last use of the
V3D_MAX_COUNTERS define, which will enable further driver cleanup.

While at it pull the 32 vs 64 bit copying decision outside the loop in
order to reduce the number of conditional instructions.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Maíra Canal <mcanal@igalia.com>
Signed-off-by: Maíra Canal <mcanal@igalia.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240711135340.84617-9-tursulin@igalia.com
2024-07-13 11:00:32 -03:00
Tvrtko Ursulin
c9d6630f7c
drm/v3d: Size the kperfmon_ids array at runtime
Instead of statically reserving pessimistic space for the kperfmon_ids
array, make the userspace extension code allocate the exactly required
amount of space.

Apart from saving some memory at runtime, this also removes the need for
the V3D_MAX_PERFMONS macro whose removal will benefit further driver
cleanup.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
Reviewed-by: Maíra Canal <mcanal@igalia.com>
Signed-off-by: Maíra Canal <mcanal@igalia.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240711135340.84617-8-tursulin@igalia.com
2024-07-13 11:00:31 -03:00
Tvrtko Ursulin
4bd75a81e6
drm/v3d: Move part of copying of reset/copy performance extension to a helper
The loop which looks up the syncobj and copies the kperfmon ids is
identical so lets move it to a helper.

The only change is replacing copy_from_user with get_user when copying a
scalar.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
Reviewed-by: Maíra Canal <mcanal@igalia.com>
Signed-off-by: Maíra Canal <mcanal@igalia.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240711135340.84617-7-tursulin@igalia.com
2024-07-13 11:00:31 -03:00
Tvrtko Ursulin
a546b7e4d7
drm/v3d: Validate passed in drm syncobj handles in the performance extension
If userspace provides an unknown or invalid handle anywhere in the handle
array the rest of the driver will not handle that well.

Fix it by checking handle was looked up successfully or otherwise fail the
extension by jumping into the existing unwind.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
Fixes: bae7cb5d68 ("drm/v3d: Create a CPU job extension for the reset performance query job")
Cc: Maíra Canal <mcanal@igalia.com>
Cc: Iago Toral Quiroga <itoral@igalia.com>
Cc: stable@vger.kernel.org # v6.8+
Reviewed-by: Maíra Canal <mcanal@igalia.com>
Signed-off-by: Maíra Canal <mcanal@igalia.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240711135340.84617-6-tursulin@igalia.com
2024-07-13 11:00:31 -03:00
Tvrtko Ursulin
8d1276d1b8
drm/v3d: Validate passed in drm syncobj handles in the timestamp extension
If userspace provides an unknown or invalid handle anywhere in the handle
array the rest of the driver will not handle that well.

Fix it by checking handle was looked up successfully or otherwise fail the
extension by jumping into the existing unwind.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
Fixes: 9ba0ff3e08 ("drm/v3d: Create a CPU job extension for the timestamp query job")
Cc: Maíra Canal <mcanal@igalia.com>
Cc: Iago Toral Quiroga <itoral@igalia.com>
Cc: stable@vger.kernel.org # v6.8+
Reviewed-by: Maíra Canal <mcanal@igalia.com>
Signed-off-by: Maíra Canal <mcanal@igalia.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240711135340.84617-5-tursulin@igalia.com
2024-07-13 11:00:31 -03:00
Tvrtko Ursulin
484de39fa5
drm/v3d: Fix potential memory leak in the performance extension
If fetching of userspace memory fails during the main loop, all drm sync
objs looked up until that point will be leaked because of the missing
drm_syncobj_put.

Fix it by exporting and using a common cleanup helper.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
Fixes: bae7cb5d68 ("drm/v3d: Create a CPU job extension for the reset performance query job")
Cc: Maíra Canal <mcanal@igalia.com>
Cc: Iago Toral Quiroga <itoral@igalia.com>
Cc: stable@vger.kernel.org # v6.8+
Signed-off-by: Maíra Canal <mcanal@igalia.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240711135340.84617-4-tursulin@igalia.com
2024-07-13 11:00:31 -03:00
Tvrtko Ursulin
753ce4fea6
drm/v3d: Fix potential memory leak in the timestamp extension
If fetching of userspace memory fails during the main loop, all drm sync
objs looked up until that point will be leaked because of the missing
drm_syncobj_put.

Fix it by exporting and using a common cleanup helper.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
Fixes: 9ba0ff3e08 ("drm/v3d: Create a CPU job extension for the timestamp query job")
Cc: Maíra Canal <mcanal@igalia.com>
Cc: Iago Toral Quiroga <itoral@igalia.com>
Cc: stable@vger.kernel.org # v6.8+
Reviewed-by: Maíra Canal <mcanal@igalia.com>
Signed-off-by: Maíra Canal <mcanal@igalia.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240711135340.84617-3-tursulin@igalia.com
2024-07-13 11:00:31 -03:00
Tvrtko Ursulin
f32b5128d2
drm/v3d: Prevent out of bounds access in performance query extensions
Check that the number of perfmons userspace is passing in the copy and
reset extensions is not greater than the internal kernel storage where
the ids will be copied into.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
Fixes: bae7cb5d68 ("drm/v3d: Create a CPU job extension for the reset performance query job")
Cc: Maíra Canal <mcanal@igalia.com>
Cc: Iago Toral Quiroga <itoral@igalia.com>
Cc: stable@vger.kernel.org # v6.8+
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Maíra Canal <mcanal@igalia.com>
Signed-off-by: Maíra Canal <mcanal@igalia.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240711135340.84617-2-tursulin@igalia.com
2024-07-13 11:00:30 -03:00
Tvrtko Ursulin
9c3951ec27
drm/v3d: Fix perfmon build error/warning
Move static const array into the source file to fix the "defined but not
used" errors.

The fix is perhaps not the prettiest due hand crafting the array sizes
in v3d_performance_counters.h, but I did add some build time asserts to
validate the counts look sensible, so hopefully it is good enough for a
quick fix.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
Fixes: 3cbcbe016c ("drm/v3d: Add Performance Counters descriptions for V3D 4.2 and 7.1")
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202405211137.hueFkLKG-lkp@intel.com/Cc: Maíra Canal <mcanal@igalia.com>
Cc: Iago Toral Quiroga <itoral@igalia.com>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: Ashutosh Dixit <ashutosh.dixit@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Signed-off-by: Maxime Ripard <mripard@kernel.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20240604160210.24073-1-tursulin@igalia.com
2024-06-05 10:44:51 +02:00
Maxime Ripard
375c4d1583
Merge drm/drm-next into drm-misc-next
Let's start the new release cycle.

Signed-off-by: Maxime Ripard <mripard@kernel.org>
2024-05-27 11:08:31 +02:00
Maíra Canal
f5b798bdc9
drm/v3d: Use V3D_MAX_COUNTERS instead of V3D_PERFCNT_NUM
V3D_PERFCNT_NUM represents the maximum number of performance counters
for V3D 4.2, but not for V3D 7.1. This means that, if we use
V3D_PERFCNT_NUM, we might go out-of-bounds on V3D 7.1.

Therefore, use the number of performance counters on V3D 7.1 as the
maximum number of counters. This will allow us to create arrays on the
stack with reasonable size. Note that userspace must use the value
provided by DRM_V3D_PARAM_MAX_PERF_COUNTERS.

Signed-off-by: Maíra Canal <mcanal@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240512222655.2792754-6-mcanal@igalia.com
2024-05-20 16:38:03 -03:00
Maíra Canal
f33fe58298
drm/v3d: Create new IOCTL to expose performance counters information
Userspace usually needs some information about the performance counters
available. Although we could replicate this information in the kernel
and user-space, let's use the kernel as the "single source of truth" to
avoid issues in the future (e.g. list of performance counters is updated
in user-space, but not in the kernel, generating invalid requests).

Therefore, create a new IOCTL to expose the performance counters
information, that is name, category, and description.

Signed-off-by: Maíra Canal <mcanal@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240512222655.2792754-5-mcanal@igalia.com
2024-05-20 16:38:02 -03:00
Maíra Canal
36b75080e6
drm/v3d: Create a new V3D parameter for the maximum number of perfcnt
The maximum number of performance counters can change from version to
version and it's important for userspace to know this value, as it needs
to use the counters for performance queries. Therefore, expose the
maximum number of performance counters to userspace as a parameter.

Signed-off-by: Maíra Canal <mcanal@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240512222655.2792754-4-mcanal@igalia.com
2024-05-20 16:38:01 -03:00
Maíra Canal
c606043ddb
drm/v3d: Different V3D versions can have different number of perfcnt
Currently, even though V3D 7.1 has 93 performance counters, it is not
possible to create counters bigger than 87, as
`v3d_perfmon_create_ioctl()` understands that counters bigger than 87
are invalid.

Therefore, create a device variable to expose the maximum
number of counters for a given V3D version and make
`v3d_perfmon_create_ioctl()` check this variable.

This commit fixes CTS failures in the performance queries tests
`dEQP-VK.query_pool.performance_query.*` [1]

Link: ea1f09a5f2 [1]
Fixes: 6fd9487147 ("drm/v3d: add brcm,2712-v3d as a compatible V3D device")
Signed-off-by: Maíra Canal <mcanal@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240512222655.2792754-3-mcanal@igalia.com
2024-05-20 16:38:00 -03:00
Maíra Canal
3cbcbe016c
drm/v3d: Add Performance Counters descriptions for V3D 4.2 and 7.1
Add name, category and description for each one of the 93 performance
counters available on V3D.

Note that V3D 4.2 has 87 performance counters, while V3D 7.1 has 93.
Therefore, there are two performance counters arrays. The index of the
performance counter for each V3D version is represented by its position
on the array.

Signed-off-by: Maíra Canal <mcanal@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240512222655.2792754-2-mcanal@igalia.com
2024-05-20 16:37:59 -03:00
Linus Torvalds
61307b7be4 The usual shower of singleton fixes and minor series all over MM,
documented (hopefully adequately) in the respective changelogs.  Notable
 series include:
 
 - Lucas Stach has provided some page-mapping
   cleanup/consolidation/maintainability work in the series "mm/treewide:
   Remove pXd_huge() API".
 
 - In the series "Allow migrate on protnone reference with
   MPOL_PREFERRED_MANY policy", Donet Tom has optimized mempolicy's
   MPOL_PREFERRED_MANY mode, yielding almost doubled performance in one
   test.
 
 - In their series "Memory allocation profiling" Kent Overstreet and
   Suren Baghdasaryan have contributed a means of determining (via
   /proc/allocinfo) whereabouts in the kernel memory is being allocated:
   number of calls and amount of memory.
 
 - Matthew Wilcox has provided the series "Various significant MM
   patches" which does a number of rather unrelated things, but in largely
   similar code sites.
 
 - In his series "mm: page_alloc: freelist migratetype hygiene" Johannes
   Weiner has fixed the page allocator's handling of migratetype requests,
   with resulting improvements in compaction efficiency.
 
 - In the series "make the hugetlb migration strategy consistent" Baolin
   Wang has fixed a hugetlb migration issue, which should improve hugetlb
   allocation reliability.
 
 - Liu Shixin has hit an I/O meltdown caused by readahead in a
   memory-tight memcg.  Addressed in the series "Fix I/O high when memory
   almost met memcg limit".
 
 - In the series "mm/filemap: optimize folio adding and splitting" Kairui
   Song has optimized pagecache insertion, yielding ~10% performance
   improvement in one test.
 
 - Baoquan He has cleaned up and consolidated the early zone
   initialization code in the series "mm/mm_init.c: refactor
   free_area_init_core()".
 
 - Baoquan has also redone some MM initializatio code in the series
   "mm/init: minor clean up and improvement".
 
 - MM helper cleanups from Christoph Hellwig in his series "remove
   follow_pfn".
 
 - More cleanups from Matthew Wilcox in the series "Various page->flags
   cleanups".
 
 - Vlastimil Babka has contributed maintainability improvements in the
   series "memcg_kmem hooks refactoring".
 
 - More folio conversions and cleanups in Matthew Wilcox's series
 
 	"Convert huge_zero_page to huge_zero_folio"
 	"khugepaged folio conversions"
 	"Remove page_idle and page_young wrappers"
 	"Use folio APIs in procfs"
 	"Clean up __folio_put()"
 	"Some cleanups for memory-failure"
 	"Remove page_mapping()"
 	"More folio compat code removal"
 
 - David Hildenbrand chipped in with "fs/proc/task_mmu: convert hugetlb
   functions to work on folis".
 
 - Code consolidation and cleanup work related to GUP's handling of
   hugetlbs in Peter Xu's series "mm/gup: Unify hugetlb, part 2".
 
 - Rick Edgecombe has developed some fixes to stack guard gaps in the
   series "Cover a guard gap corner case".
 
 - Jinjiang Tu has fixed KSM's behaviour after a fork+exec in the series
   "mm/ksm: fix ksm exec support for prctl".
 
 - Baolin Wang has implemented NUMA balancing for multi-size THPs.  This
   is a simple first-cut implementation for now.  The series is "support
   multi-size THP numa balancing".
 
 - Cleanups to vma handling helper functions from Matthew Wilcox in the
   series "Unify vma_address and vma_pgoff_address".
 
 - Some selftests maintenance work from Dev Jain in the series
   "selftests/mm: mremap_test: Optimizations and style fixes".
 
 - Improvements to the swapping of multi-size THPs from Ryan Roberts in
   the series "Swap-out mTHP without splitting".
 
 - Kefeng Wang has significantly optimized the handling of arm64's
   permission page faults in the series
 
 	"arch/mm/fault: accelerate pagefault when badaccess"
 	"mm: remove arch's private VM_FAULT_BADMAP/BADACCESS"
 
 - GUP cleanups from David Hildenbrand in "mm/gup: consistently call it
   GUP-fast".
 
 - hugetlb fault code cleanups from Vishal Moola in "Hugetlb fault path to
   use struct vm_fault".
 
 - selftests build fixes from John Hubbard in the series "Fix
   selftests/mm build without requiring "make headers"".
 
 - Memory tiering fixes/improvements from Ho-Ren (Jack) Chuang in the
   series "Improved Memory Tier Creation for CPUless NUMA Nodes".  Fixes
   the initialization code so that migration between different memory types
   works as intended.
 
 - David Hildenbrand has improved follow_pte() and fixed an errant driver
   in the series "mm: follow_pte() improvements and acrn follow_pte()
   fixes".
 
 - David also did some cleanup work on large folio mapcounts in his
   series "mm: mapcount for large folios + page_mapcount() cleanups".
 
 - Folio conversions in KSM in Alex Shi's series "transfer page to folio
   in KSM".
 
 - Barry Song has added some sysfs stats for monitoring multi-size THP's
   in the series "mm: add per-order mTHP alloc and swpout counters".
 
 - Some zswap cleanups from Yosry Ahmed in the series "zswap same-filled
   and limit checking cleanups".
 
 - Matthew Wilcox has been looking at buffer_head code and found the
   documentation to be lacking.  The series is "Improve buffer head
   documentation".
 
 - Multi-size THPs get more work, this time from Lance Yang.  His series
   "mm/madvise: enhance lazyfreeing with mTHP in madvise_free" optimizes
   the freeing of these things.
 
 - Kemeng Shi has added more userspace-visible writeback instrumentation
   in the series "Improve visibility of writeback".
 
 - Kemeng Shi then sent some maintenance work on top in the series "Fix
   and cleanups to page-writeback".
 
 - Matthew Wilcox reduces mmap_lock traffic in the anon vma code in the
   series "Improve anon_vma scalability for anon VMAs".  Intel's test bot
   reported an improbable 3x improvement in one test.
 
 - SeongJae Park adds some DAMON feature work in the series
 
 	"mm/damon: add a DAMOS filter type for page granularity access recheck"
 	"selftests/damon: add DAMOS quota goal test"
 
 - Also some maintenance work in the series
 
 	"mm/damon/paddr: simplify page level access re-check for pageout"
 	"mm/damon: misc fixes and improvements"
 
 - David Hildenbrand has disabled some known-to-fail selftests ni the
   series "selftests: mm: cow: flag vmsplice() hugetlb tests as XFAIL".
 
 - memcg metadata storage optimizations from Shakeel Butt in "memcg:
   reduce memory consumption by memcg stats".
 
 - DAX fixes and maintenance work from Vishal Verma in the series
   "dax/bus.c: Fixups for dax-bus locking".
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCZkgQYwAKCRDdBJ7gKXxA
 jrdKAP9WVJdpEcXxpoub/vVE0UWGtffr8foifi9bCwrQrGh5mgEAx7Yf0+d/oBZB
 nvA4E0DcPrUAFy144FNM0NTCb7u9vAw=
 =V3R/
 -----END PGP SIGNATURE-----

Merge tag 'mm-stable-2024-05-17-19-19' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull mm updates from Andrew Morton:
 "The usual shower of singleton fixes and minor series all over MM,
  documented (hopefully adequately) in the respective changelogs.
  Notable series include:

   - Lucas Stach has provided some page-mapping cleanup/consolidation/
     maintainability work in the series "mm/treewide: Remove pXd_huge()
     API".

   - In the series "Allow migrate on protnone reference with
     MPOL_PREFERRED_MANY policy", Donet Tom has optimized mempolicy's
     MPOL_PREFERRED_MANY mode, yielding almost doubled performance in
     one test.

   - In their series "Memory allocation profiling" Kent Overstreet and
     Suren Baghdasaryan have contributed a means of determining (via
     /proc/allocinfo) whereabouts in the kernel memory is being
     allocated: number of calls and amount of memory.

   - Matthew Wilcox has provided the series "Various significant MM
     patches" which does a number of rather unrelated things, but in
     largely similar code sites.

   - In his series "mm: page_alloc: freelist migratetype hygiene"
     Johannes Weiner has fixed the page allocator's handling of
     migratetype requests, with resulting improvements in compaction
     efficiency.

   - In the series "make the hugetlb migration strategy consistent"
     Baolin Wang has fixed a hugetlb migration issue, which should
     improve hugetlb allocation reliability.

   - Liu Shixin has hit an I/O meltdown caused by readahead in a
     memory-tight memcg. Addressed in the series "Fix I/O high when
     memory almost met memcg limit".

   - In the series "mm/filemap: optimize folio adding and splitting"
     Kairui Song has optimized pagecache insertion, yielding ~10%
     performance improvement in one test.

   - Baoquan He has cleaned up and consolidated the early zone
     initialization code in the series "mm/mm_init.c: refactor
     free_area_init_core()".

   - Baoquan has also redone some MM initializatio code in the series
     "mm/init: minor clean up and improvement".

   - MM helper cleanups from Christoph Hellwig in his series "remove
     follow_pfn".

   - More cleanups from Matthew Wilcox in the series "Various
     page->flags cleanups".

   - Vlastimil Babka has contributed maintainability improvements in the
     series "memcg_kmem hooks refactoring".

   - More folio conversions and cleanups in Matthew Wilcox's series:
	"Convert huge_zero_page to huge_zero_folio"
	"khugepaged folio conversions"
	"Remove page_idle and page_young wrappers"
	"Use folio APIs in procfs"
	"Clean up __folio_put()"
	"Some cleanups for memory-failure"
	"Remove page_mapping()"
	"More folio compat code removal"

   - David Hildenbrand chipped in with "fs/proc/task_mmu: convert
     hugetlb functions to work on folis".

   - Code consolidation and cleanup work related to GUP's handling of
     hugetlbs in Peter Xu's series "mm/gup: Unify hugetlb, part 2".

   - Rick Edgecombe has developed some fixes to stack guard gaps in the
     series "Cover a guard gap corner case".

   - Jinjiang Tu has fixed KSM's behaviour after a fork+exec in the
     series "mm/ksm: fix ksm exec support for prctl".

   - Baolin Wang has implemented NUMA balancing for multi-size THPs.
     This is a simple first-cut implementation for now. The series is
     "support multi-size THP numa balancing".

   - Cleanups to vma handling helper functions from Matthew Wilcox in
     the series "Unify vma_address and vma_pgoff_address".

   - Some selftests maintenance work from Dev Jain in the series
     "selftests/mm: mremap_test: Optimizations and style fixes".

   - Improvements to the swapping of multi-size THPs from Ryan Roberts
     in the series "Swap-out mTHP without splitting".

   - Kefeng Wang has significantly optimized the handling of arm64's
     permission page faults in the series
	"arch/mm/fault: accelerate pagefault when badaccess"
	"mm: remove arch's private VM_FAULT_BADMAP/BADACCESS"

   - GUP cleanups from David Hildenbrand in "mm/gup: consistently call
     it GUP-fast".

   - hugetlb fault code cleanups from Vishal Moola in "Hugetlb fault
     path to use struct vm_fault".

   - selftests build fixes from John Hubbard in the series "Fix
     selftests/mm build without requiring "make headers"".

   - Memory tiering fixes/improvements from Ho-Ren (Jack) Chuang in the
     series "Improved Memory Tier Creation for CPUless NUMA Nodes".
     Fixes the initialization code so that migration between different
     memory types works as intended.

   - David Hildenbrand has improved follow_pte() and fixed an errant
     driver in the series "mm: follow_pte() improvements and acrn
     follow_pte() fixes".

   - David also did some cleanup work on large folio mapcounts in his
     series "mm: mapcount for large folios + page_mapcount() cleanups".

   - Folio conversions in KSM in Alex Shi's series "transfer page to
     folio in KSM".

   - Barry Song has added some sysfs stats for monitoring multi-size
     THP's in the series "mm: add per-order mTHP alloc and swpout
     counters".

   - Some zswap cleanups from Yosry Ahmed in the series "zswap
     same-filled and limit checking cleanups".

   - Matthew Wilcox has been looking at buffer_head code and found the
     documentation to be lacking. The series is "Improve buffer head
     documentation".

   - Multi-size THPs get more work, this time from Lance Yang. His
     series "mm/madvise: enhance lazyfreeing with mTHP in madvise_free"
     optimizes the freeing of these things.

   - Kemeng Shi has added more userspace-visible writeback
     instrumentation in the series "Improve visibility of writeback".

   - Kemeng Shi then sent some maintenance work on top in the series
     "Fix and cleanups to page-writeback".

   - Matthew Wilcox reduces mmap_lock traffic in the anon vma code in
     the series "Improve anon_vma scalability for anon VMAs". Intel's
     test bot reported an improbable 3x improvement in one test.

   - SeongJae Park adds some DAMON feature work in the series
	"mm/damon: add a DAMOS filter type for page granularity access recheck"
	"selftests/damon: add DAMOS quota goal test"

   - Also some maintenance work in the series
	"mm/damon/paddr: simplify page level access re-check for pageout"
	"mm/damon: misc fixes and improvements"

   - David Hildenbrand has disabled some known-to-fail selftests ni the
     series "selftests: mm: cow: flag vmsplice() hugetlb tests as
     XFAIL".

   - memcg metadata storage optimizations from Shakeel Butt in "memcg:
     reduce memory consumption by memcg stats".

   - DAX fixes and maintenance work from Vishal Verma in the series
     "dax/bus.c: Fixups for dax-bus locking""

* tag 'mm-stable-2024-05-17-19-19' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (426 commits)
  memcg, oom: cleanup unused memcg_oom_gfp_mask and memcg_oom_order
  selftests/mm: hugetlb_madv_vs_map: avoid test skipping by querying hugepage size at runtime
  mm/hugetlb: add missing VM_FAULT_SET_HINDEX in hugetlb_wp
  mm/hugetlb: add missing VM_FAULT_SET_HINDEX in hugetlb_fault
  selftests: cgroup: add tests to verify the zswap writeback path
  mm: memcg: make alloc_mem_cgroup_per_node_info() return bool
  mm/damon/core: fix return value from damos_wmark_metric_value
  mm: do not update memcg stats for NR_{FILE/SHMEM}_PMDMAPPED
  selftests: cgroup: remove redundant enabling of memory controller
  Docs/mm/damon/maintainer-profile: allow posting patches based on damon/next tree
  Docs/mm/damon/maintainer-profile: change the maintainer's timezone from PST to PT
  Docs/mm/damon/design: use a list for supported filters
  Docs/admin-guide/mm/damon/usage: fix wrong schemes effective quota update command
  Docs/admin-guide/mm/damon/usage: fix wrong example of DAMOS filter matching sysfs file
  selftests/damon: classify tests for functionalities and regressions
  selftests/damon/_damon_sysfs: use 'is' instead of '==' for 'None'
  selftests/damon/_damon_sysfs: find sysfs mount point from /proc/mounts
  selftests/damon/_damon_sysfs: check errors from nr_schemes file reads
  mm/damon/core: initialize ->esz_bp from damos_quota_init_priv()
  selftests/damon: add a test for DAMOS quota goal
  ...
2024-05-19 09:21:03 -07:00
Kent Overstreet
0069455bcb fix missing vmalloc.h includes
Patch series "Memory allocation profiling", v6.

Overview:
Low overhead [1] per-callsite memory allocation profiling. Not just for
debug kernels, overhead low enough to be deployed in production.

Example output:
  root@moria-kvm:~# sort -rn /proc/allocinfo
   127664128    31168 mm/page_ext.c:270 func:alloc_page_ext
    56373248     4737 mm/slub.c:2259 func:alloc_slab_page
    14880768     3633 mm/readahead.c:247 func:page_cache_ra_unbounded
    14417920     3520 mm/mm_init.c:2530 func:alloc_large_system_hash
    13377536      234 block/blk-mq.c:3421 func:blk_mq_alloc_rqs
    11718656     2861 mm/filemap.c:1919 func:__filemap_get_folio
     9192960     2800 kernel/fork.c:307 func:alloc_thread_stack_node
     4206592        4 net/netfilter/nf_conntrack_core.c:2567 func:nf_ct_alloc_hashtable
     4136960     1010 drivers/staging/ctagmod/ctagmod.c:20 [ctagmod] func:ctagmod_start
     3940352      962 mm/memory.c:4214 func:alloc_anon_folio
     2894464    22613 fs/kernfs/dir.c:615 func:__kernfs_new_node
     ...

Usage:
kconfig options:
 - CONFIG_MEM_ALLOC_PROFILING
 - CONFIG_MEM_ALLOC_PROFILING_ENABLED_BY_DEFAULT
 - CONFIG_MEM_ALLOC_PROFILING_DEBUG
   adds warnings for allocations that weren't accounted because of a
   missing annotation

sysctl:
  /proc/sys/vm/mem_profiling

Runtime info:
  /proc/allocinfo

Notes:

[1]: Overhead
To measure the overhead we are comparing the following configurations:
(1) Baseline with CONFIG_MEMCG_KMEM=n
(2) Disabled by default (CONFIG_MEM_ALLOC_PROFILING=y &&
    CONFIG_MEM_ALLOC_PROFILING_BY_DEFAULT=n)
(3) Enabled by default (CONFIG_MEM_ALLOC_PROFILING=y &&
    CONFIG_MEM_ALLOC_PROFILING_BY_DEFAULT=y)
(4) Enabled at runtime (CONFIG_MEM_ALLOC_PROFILING=y &&
    CONFIG_MEM_ALLOC_PROFILING_BY_DEFAULT=n && /proc/sys/vm/mem_profiling=1)
(5) Baseline with CONFIG_MEMCG_KMEM=y && allocating with __GFP_ACCOUNT
(6) Disabled by default (CONFIG_MEM_ALLOC_PROFILING=y &&
    CONFIG_MEM_ALLOC_PROFILING_BY_DEFAULT=n)  && CONFIG_MEMCG_KMEM=y
(7) Enabled by default (CONFIG_MEM_ALLOC_PROFILING=y &&
    CONFIG_MEM_ALLOC_PROFILING_BY_DEFAULT=y) && CONFIG_MEMCG_KMEM=y

Performance overhead:
To evaluate performance we implemented an in-kernel test executing
multiple get_free_page/free_page and kmalloc/kfree calls with allocation
sizes growing from 8 to 240 bytes with CPU frequency set to max and CPU
affinity set to a specific CPU to minimize the noise. Below are results
from running the test on Ubuntu 22.04.2 LTS with 6.8.0-rc1 kernel on
56 core Intel Xeon:

                        kmalloc                 pgalloc
(1 baseline)            6.764s                  16.902s
(2 default disabled)    6.793s  (+0.43%)        17.007s (+0.62%)
(3 default enabled)     7.197s  (+6.40%)        23.666s (+40.02%)
(4 runtime enabled)     7.405s  (+9.48%)        23.901s (+41.41%)
(5 memcg)               13.388s (+97.94%)       48.460s (+186.71%)
(6 def disabled+memcg)  13.332s (+97.10%)       48.105s (+184.61%)
(7 def enabled+memcg)   13.446s (+98.78%)       54.963s (+225.18%)

Memory overhead:
Kernel size:

   text           data        bss         dec         diff
(1) 26515311	      18890222    17018880    62424413
(2) 26524728	      19423818    16740352    62688898    264485
(3) 26524724	      19423818    16740352    62688894    264481
(4) 26524728	      19423818    16740352    62688898    264485
(5) 26541782	      18964374    16957440    62463596    39183

Memory consumption on a 56 core Intel CPU with 125GB of memory:
Code tags:           192 kB
PageExts:         262144 kB (256MB)
SlabExts:           9876 kB (9.6MB)
PcpuExts:            512 kB (0.5MB)

Total overhead is 0.2% of total memory.

Benchmarks:

Hackbench tests run 100 times:
hackbench -s 512 -l 200 -g 15 -f 25 -P
      baseline       disabled profiling           enabled profiling
avg   0.3543         0.3559 (+0.0016)             0.3566 (+0.0023)
stdev 0.0137         0.0188                       0.0077


hackbench -l 10000
      baseline       disabled profiling           enabled profiling
avg   6.4218         6.4306 (+0.0088)             6.5077 (+0.0859)
stdev 0.0933         0.0286                       0.0489

stress-ng tests:
stress-ng --class memory --seq 4 -t 60
stress-ng --class cpu --seq 4 -t 60
Results posted at: https://evilpiepirate.org/~kent/memalloc_prof_v4_stress-ng/

[2] https://lore.kernel.org/all/20240306182440.2003814-1-surenb@google.com/


This patch (of 37):

The next patch drops vmalloc.h from a system header in order to fix a
circular dependency; this adds it to all the files that were pulling it in
implicitly.

[kent.overstreet@linux.dev: fix arch/alpha/lib/memcpy.c]
  Link: https://lkml.kernel.org/r/20240327002152.3339937-1-kent.overstreet@linux.dev
[surenb@google.com: fix arch/x86/mm/numa_32.c]
  Link: https://lkml.kernel.org/r/20240402180933.1663992-1-surenb@google.com
[kent.overstreet@linux.dev: a few places were depending on sizes.h]
  Link: https://lkml.kernel.org/r/20240404034744.1664840-1-kent.overstreet@linux.dev
[arnd@arndb.de: fix mm/kasan/hw_tags.c]
  Link: https://lkml.kernel.org/r/20240404124435.3121534-1-arnd@kernel.org
[surenb@google.com: fix arc build]
  Link: https://lkml.kernel.org/r/20240405225115.431056-1-surenb@google.com
Link: https://lkml.kernel.org/r/20240321163705.3067592-1-surenb@google.com
Link: https://lkml.kernel.org/r/20240321163705.3067592-2-surenb@google.com
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Pasha Tatashin <pasha.tatashin@soleen.com>
Tested-by: Kees Cook <keescook@chromium.org>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Alex Gaynor <alex.gaynor@gmail.com>
Cc: Alice Ryhl <aliceryhl@google.com>
Cc: Andreas Hindborg <a.hindborg@samsung.com>
Cc: Benno Lossin <benno.lossin@proton.me>
Cc: "Björn Roy Baron" <bjorn3_gh@protonmail.com>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Dennis Zhou <dennis@kernel.org>
Cc: Gary Guo <gary@garyguo.net>
Cc: Miguel Ojeda <ojeda@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Tejun Heo <tj@kernel.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Wedson Almeida Filho <wedsonaf@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-04-25 20:55:49 -07:00
Maíra Canal
6abe93b621
drm/v3d: Fix race-condition between sysfs/fdinfo and interrupt handler
In V3D, the conclusion of a job is indicated by a IRQ. When a job
finishes, then we update the local and the global GPU stats of that
queue. But, while the GPU stats are being updated, a user might be
reading the stats from sysfs or fdinfo.

For example, on `gpu_stats_show()`, we could think about a scenario where
`v3d->queue[queue].start_ns != 0`, then an interrupt happens, we update
the value of `v3d->queue[queue].start_ns` to 0, we come back to
`gpu_stats_show()` to calculate `active_runtime` and now,
`active_runtime = timestamp`.

In this simple example, the user would see a spike in the queue usage,
that didn't match reality.

In order to address this issue properly, use a seqcount to protect read
and write sections of the code.

Fixes: 09a93cc4f7 ("drm/v3d: Implement show_fdinfo() callback for GPU usage stats")
Reported-by: Tvrtko Ursulin <tursulin@igalia.com>
Signed-off-by: Maíra Canal <mcanal@igalia.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240420213632.339941-7-mcanal@igalia.com
2024-04-23 19:32:49 -03:00
Maíra Canal
12d1624ce3
drm/v3d: Decouple stats calculation from printing
Create a function to decouple the stats calculation from the printing.
This will be useful in the next step when we add a seqcount to protect
the stats.

Signed-off-by: Maíra Canal <mcanal@igalia.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240420213632.339941-6-mcanal@igalia.com
2024-04-23 19:32:48 -03:00
Maíra Canal
da483d079b
drm/v3d: Create function to update a set of GPU stats
Given a set of GPU stats, that is, a `struct v3d_stats` related to a
queue in a given context, create a function that can update this set
of GPU stats.

Signed-off-by: Maíra Canal <mcanal@igalia.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
Reviewed-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240420213632.339941-5-mcanal@igalia.com
2024-04-23 19:32:47 -03:00
Maíra Canal
b136b1953f
drm/v3d: Create a struct to store the GPU stats
This will make it easier to instantiate the GPU stats variables and it
will create a structure where we can store all the variables that refer
to GPU stats.

Note that, when we created the struct `v3d_stats`, we renamed
`jobs_sent` to `jobs_completed`. This better express the semantics of
the variable, as we are only accounting jobs that have been completed.

Signed-off-by: Maíra Canal <mcanal@igalia.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
Reviewed-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240420213632.339941-4-mcanal@igalia.com
2024-04-23 19:32:46 -03:00
Maíra Canal
52ce97765c
drm/v3d: Create two functions to update all GPU stats variables
Currently, we manually perform all operations to update the GPU stats
variables. Apart from the code repetition, this is very prone to errors,
as we can see on commit 35f4f8c9fc ("drm/v3d: Don't increment
`enabled_ns` twice").

Therefore, create two functions to manage updating all GPU stats
variables. Now, the jobs only need to call for `v3d_job_update_stats()`
when the job is done and `v3d_job_start_stats()` when starting the job.

Co-developed-by: Tvrtko Ursulin <tursulin@igalia.com>
Signed-off-by: Tvrtko Ursulin <tursulin@igalia.com>
Signed-off-by: Maíra Canal <mcanal@igalia.com>
Reviewed-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240420213632.339941-3-mcanal@igalia.com
2024-04-23 19:32:45 -03:00
Maíra Canal
35f4f8c9fc
drm/v3d: Don't increment enabled_ns twice
The commit 509433d814 ("drm/v3d: Expose the total GPU usage stats on sysfs")
introduced the calculation of global GPU stats. For the regards, it used
the already existing infrastructure provided by commit 09a93cc4f7 ("drm/v3d:
Implement show_fdinfo() callback for GPU usage stats"). While adding
global GPU stats calculation ability, the author forgot to delete the
existing one.

Currently, the value of `enabled_ns` is incremented twice by the end of
the job, when it should be added just once. Therefore, delete the
leftovers from commit 509433d814 ("drm/v3d: Expose the total GPU usage
stats on sysfs").

Fixes: 509433d814 ("drm/v3d: Expose the total GPU usage stats on sysfs")
Reported-by: Tvrtko Ursulin <tursulin@igalia.com>
Signed-off-by: Maíra Canal <mcanal@igalia.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
Reviewed-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240403203517.731876-2-mcanal@igalia.com
2024-04-15 15:00:19 -03:00
Maíra Canal
51b76c1f30
drm/v3d: Enable V3D to use different PAGE_SIZE
Currently, the V3D driver uses PAGE_SHIFT over the assumption that
PAGE_SHIFT = 12, as the PAGE_SIZE = 4KB. But, the RPi 5 is using
PAGE_SIZE = 16KB, so the MMU PAGE_SHIFT is different than the system's
PAGE_SHIFT.

Enable V3D to be used in system's with any PAGE_SIZE by making sure that
everything MMU-related uses the MMU page shift.

Signed-off-by: Maíra Canal <mcanal@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240214193503.164462-1-mcanal@igalia.com
2024-02-23 16:37:20 -03:00
Dave Airlie
f8e4806e0d drm-misc-next for v6.9:
UAPI Changes:
 
 virtio:
 - add Venus capset defines
 
 Cross-subsystem Changes:
 
 Core Changes:
 
 - fix drm_fixp2int_ceil()
 - documentation fixes
 - clean ups
 - allow DRM_MM_DEBUG with DRM=m
 - build fixes for debugfs support
 - EDID cleanups
 - sched: error-handling fixes
 - ttm: add tests
 
 Driver Changes:
 
 bridge:
 - ite-6505: fix DP link-training bug
 - samsung-dsim: fix error checking in probe
 - tc358767: fix regmap usage
 
 efifb:
 - use copy of global screen_info state
 
 hisilicon:
 - fix EDID includes
 
 mgag200:
 - improve ioremap usage
 - convert to struct drm_edid
 
 nouveau:
 - disp: use kmemdup()
 - fix EDID includes
 - documentation fixes
 
 panel:
 - ltk050h3146w: error-handling fixes
 - panel-edp: support delay between power-on and enable; use put_sync in
   unprepare; support Mediatek MT8173 Chromebooks, BOE NV116WHM-N49 V8.0,
   BOE NV122WUM-N41, CSO MNC207QS1-1 plus DT bindings
 - panel-lvds: support EDT ETML0700Z9NDHA plus DT bindings
 - panel-novatek: FRIDA FRD400B25025-A-CTK plus DT bindings
 
 qaic:
 - fixes to BO handling
 - make use of DRM managed release
 - fix order of remove operations
 
 rockchip:
 - analogix_dp: get encoder port from DT
 - inno_hdmi: support HDMI for RK3128
 - lvds: error-handling fixes
 
 simplefb:
 - fix logging
 
 ssd130x:
 - support SSD133x plus DT bindings
 
 tegra:
 - fix error handling
 
 tilcdc:
 - make use of DRM managed release
 
 v3d:
 - show memory stats in debugfs
 
 vc4:
 - fix error handling in plane prepare_fb
 - fix framebuffer test in plane helpers
 
 vesafb:
 - use copy of global screen_info state
 
 virtio:
 - cleanups
 
 vkms:
 - fix OOB access when programming the LUT
 - Kconfig improvements
 
 vmwgfx:
 - unmap surface before changing plane state
 - fix memory leak in error handling
 - documentation fixes
 -----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCAAdFiEEchf7rIzpz2NEoWjlaA3BHVMLeiMFAmWgDSsACgkQaA3BHVML
 eiPG5AgAjK42pY25OezV/wmUiffhfDluTbjw+Cjwg/IzZ+RyAMcf1q7R8Oid6G6o
 kG64WfUmBURwdOQ2nvtJGum8YoBx70uvmKyFzhu20v+e9HT/EFizRi7Qvg71Il/0
 l8WQ5BJ1W6PvJKgEU4io+OWWhDyZ1B3GpCvkd30wF2Vfi0CPeeVAvwmeKC5Rj4lE
 lbYa93sAS/neNkTBvhEEUGoz/t9+5FqAMarygM6hYzDVgnhs8Mbm/pRKCUgPvYvp
 871oibBPenQGP6jkWU1h1UWIa5CACvHgkebSGUGy6b5e9jPBRrSmlnRhiviJEY1/
 6x7kGnB69r6I40CdY/1cnTqyhJ25cA==
 =GX2h
 -----END PGP SIGNATURE-----

Merge tag 'drm-misc-next-2024-01-11' of git://anongit.freedesktop.org/drm/drm-misc into drm-next

drm-misc-next for v6.9:

UAPI Changes:

virtio:
- add Venus capset defines

Cross-subsystem Changes:

Core Changes:

- fix drm_fixp2int_ceil()
- documentation fixes
- clean ups
- allow DRM_MM_DEBUG with DRM=m
- build fixes for debugfs support
- EDID cleanups
- sched: error-handling fixes
- ttm: add tests

Driver Changes:

bridge:
- ite-6505: fix DP link-training bug
- samsung-dsim: fix error checking in probe
- tc358767: fix regmap usage

efifb:
- use copy of global screen_info state

hisilicon:
- fix EDID includes

mgag200:
- improve ioremap usage
- convert to struct drm_edid

nouveau:
- disp: use kmemdup()
- fix EDID includes
- documentation fixes

panel:
- ltk050h3146w: error-handling fixes
- panel-edp: support delay between power-on and enable; use put_sync in
  unprepare; support Mediatek MT8173 Chromebooks, BOE NV116WHM-N49 V8.0,
  BOE NV122WUM-N41, CSO MNC207QS1-1 plus DT bindings
- panel-lvds: support EDT ETML0700Z9NDHA plus DT bindings
- panel-novatek: FRIDA FRD400B25025-A-CTK plus DT bindings

qaic:
- fixes to BO handling
- make use of DRM managed release
- fix order of remove operations

rockchip:
- analogix_dp: get encoder port from DT
- inno_hdmi: support HDMI for RK3128
- lvds: error-handling fixes

simplefb:
- fix logging

ssd130x:
- support SSD133x plus DT bindings

tegra:
- fix error handling

tilcdc:
- make use of DRM managed release

v3d:
- show memory stats in debugfs

vc4:
- fix error handling in plane prepare_fb
- fix framebuffer test in plane helpers

vesafb:
- use copy of global screen_info state

virtio:
- cleanups

vkms:
- fix OOB access when programming the LUT
- Kconfig improvements

vmwgfx:
- unmap surface before changing plane state
- fix memory leak in error handling
- documentation fixes

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Thomas Zimmermann <tzimmermann@suse.de>
Link: https://patchwork.freedesktop.org/patch/msgid/20240111154902.GA8448@linux-uq9g
2024-02-05 13:50:15 +10:00
Maíra Canal
502756e233
drm/v3d: Show the memory-management stats on debugfs
Dump the contents of the DRM MM allocator of the V3D driver. This will
help us to debug the VA ranges allocated.

Signed-off-by: Maíra Canal <mcanal@igalia.com>
Reviewed-by: Melissa Wen <mwen@igalia.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240105145851.193492-1-mcanal@igalia.com
2024-01-11 11:45:47 -03:00
Maíra Canal
2ad62d16cd
drm/v3d: Free the job and assign it to NULL if initialization fails
Currently, if `v3d_job_init()` fails (e.g. in the IGT test "bad-in-sync",
where we submit an invalid in-sync to the IOCTL), then we end up with
the following NULL pointer dereference:

[   34.146279] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000078
[   34.146301] Mem abort info:
[   34.146306]   ESR = 0x0000000096000005
[   34.146315]   EC = 0x25: DABT (current EL), IL = 32 bits
[   34.146322]   SET = 0, FnV = 0
[   34.146328]   EA = 0, S1PTW = 0
[   34.146334]   FSC = 0x05: level 1 translation fault
[   34.146340] Data abort info:
[   34.146345]   ISV = 0, ISS = 0x00000005, ISS2 = 0x00000000
[   34.146351]   CM = 0, WnR = 0, TnD = 0, TagAccess = 0
[   34.146357]   GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
[   34.146366] user pgtable: 4k pages, 39-bit VAs, pgdp=00000001232e6000
[   34.146375] [0000000000000078] pgd=0000000000000000, p4d=0000000000000000, pud=0000000000000000
[   34.146399] Internal error: Oops: 0000000096000005 [#1] PREEMPT SMP
[   34.146406] Modules linked in: rfcomm snd_seq_dummy snd_hrtimer snd_seq snd_seq_device algif_hash aes_neon_bs aes_neon_blk algif_skcipher af_alg bnep hid_logitech_hidpp brcmfmac_wcc brcmfmac brcmutil hci_uart vc4 btbcm cfg80211 bluetooth bcm2835_v4l2(C) snd_soc_hdmi_codec binfmt_misc cec drm_display_helper hid_logitech_dj bcm2835_mmal_vchiq(C) drm_dma_helper drm_kms_helper videobuf2_v4l2 raspberrypi_hwmon ecdh_generic videobuf2_vmalloc videobuf2_memops ecc videobuf2_common rfkill videodev libaes snd_soc_core dwc2 i2c_brcmstb snd_pcm_dmaengine snd_bcm2835(C) i2c_bcm2835 pwm_bcm2835 snd_pcm mc v3d snd_timer snd gpu_sched drm_shmem_helper nvmem_rmem uio_pdrv_genirq uio i2c_dev drm fuse dm_mod drm_panel_orientation_quirks backlight configfs ip_tables x_tables ipv6
[   34.146556] CPU: 1 PID: 1890 Comm: v3d_submit_csd Tainted: G         C         6.7.0-rc3-g49ddab089611 #68
[   34.146563] Hardware name: Raspberry Pi 4 Model B Rev 1.5 (DT)
[   34.146569] pstate: 60000005 (nZCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[   34.146575] pc : drm_sched_job_cleanup+0x3c/0x190 [gpu_sched]
[   34.146611] lr : v3d_submit_csd_ioctl+0x1b4/0x460 [v3d]
[   34.146653] sp : ffffffc083cbbb80
[   34.146658] x29: ffffffc083cbbb90 x28: ffffff81035afc00 x27: ffffffe77a641168
[   34.146668] x26: ffffff81056a8000 x25: 0000000000000058 x24: 0000000000000000
[   34.146677] x23: ffffff81065e2000 x22: ffffff81035afe00 x21: ffffffc083cbbcf0
[   34.146686] x20: ffffff81035afe00 x19: 00000000ffffffea x18: 0000000000000000
[   34.146694] x17: 0000000000000000 x16: ffffffe7989e34b0 x15: 0000000000000000
[   34.146703] x14: 0000000004000004 x13: ffffff81035afe80 x12: ffffffc083cb8000
[   34.146711] x11: cc57e05dfbe5ef00 x10: cc57e05dfbe5ef00 x9 : ffffffe77a64131c
[   34.146719] x8 : 0000000000000000 x7 : 0000000000000000 x6 : 000000000000003f
[   34.146727] x5 : 0000000000000040 x4 : ffffff81fefb03f0 x3 : ffffffc083cbba40
[   34.146736] x2 : ffffff81056a8000 x1 : ffffffe7989e35e8 x0 : 0000000000000000
[   34.146745] Call trace:
[   34.146748]  drm_sched_job_cleanup+0x3c/0x190 [gpu_sched]
[   34.146768]  v3d_submit_csd_ioctl+0x1b4/0x460 [v3d]
[   34.146791]  drm_ioctl_kernel+0xe0/0x120 [drm]
[   34.147029]  drm_ioctl+0x264/0x408 [drm]
[   34.147135]  __arm64_sys_ioctl+0x9c/0xe0
[   34.147152]  invoke_syscall+0x4c/0x118
[   34.147162]  el0_svc_common+0xb8/0xf0
[   34.147168]  do_el0_svc+0x28/0x40
[   34.147174]  el0_svc+0x38/0x88
[   34.147184]  el0t_64_sync_handler+0x84/0x100
[   34.147191]  el0t_64_sync+0x190/0x198
[   34.147201] Code: aa0003f4 f90007e8 f9401008 aa0803e0 (b8478c09)
[   34.147210] ---[ end trace 0000000000000000 ]---

This happens because we are calling `drm_sched_job_cleanup()` twice:
once at `v3d_job_init()` and again when we call `v3d_job_cleanup()`.

To mitigate this issue, we can return to the same approach that we used
to use before 464c61e76de8: deallocate the job after `v3d_job_init()`
fails and assign it to NULL. Then, when we call `v3d_job_cleanup()`, job
is NULL and the function returns.

Fixes: 464c61e76d ("drm/v3d: Decouple job allocation from job initiation")
Signed-off-by: Maíra Canal <mcanal@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240109142857.1122704-1-mcanal@igalia.com
2024-01-11 11:23:13 -03:00
Maíra Canal
89fe46019a
drm/v3d: Fix support for register debugging on the RPi 4
RPi 4 uses V3D 4.2, which is currently not supported by the register
definition stated at `v3d_core_reg_defs`. We should be able to support
V3D 4.2, therefore, change the maximum version of the register
definition to 42, not 41.

Fixes: 0ad5bc1ce4 ("drm/v3d: fix up register addresses for V3D 7.x")
Signed-off-by: Maíra Canal <mcanal@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240109113126.929446-1-mcanal@igalia.com
2024-01-09 14:21:47 -03:00
Harshit Mogalapalli
dce94061f0 drm/v3d: Fix missing error code in v3d_submit_cpu_ioctl()
Smatch warns:
	drivers/gpu/drm/v3d/v3d_submit.c:1222 v3d_submit_cpu_ioctl()
	warn: missing error code 'ret'

When there is no job type or job is submitted with wrong number of BOs
it is an error path, ret is zero at this point which is incorrect
return.

Fix this by changing it to -EINVAL.

Fixes: aafc1a2bea ("drm/v3d: Add a CPU job submission")
Signed-off-by: Harshit Mogalapalli <harshit.m.mogalapalli@oracle.com>
Reviewed-by: Melissa Wen <mwen@igalia.com>
Signed-off-by: Melissa Wen <melissa.srw@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231204122102.181298-1-harshit.m.mogalapalli@oracle.com
2023-12-04 21:30:33 -01:00
Maíra Canal
209e8d2695
drm/v3d: Create a CPU job extension for the copy performance query job
A CPU job is a type of job that performs operations that requires CPU
intervention. A copy performance query job is a job that copy the complete
or partial result of a query to a buffer. In order to copy the result of
a performance query to a buffer, we need to get the values from the
performance monitors.

So, create a user extension for the CPU job that enables the creation
of a copy performance query job. This user extension will allow the creation
of a CPU job that copy the results of a performance query to a BO with the
possibility to indicate the availability with a availability bit.

Signed-off-by: Maíra Canal <mcanal@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231130164420.932823-19-mcanal@igalia.com
2023-12-01 09:47:36 -03:00
Maíra Canal
bae7cb5d68
drm/v3d: Create a CPU job extension for the reset performance query job
A CPU job is a type of job that performs operations that requires CPU
intervention. A reset performance query job is a job that resets the
performance queries by resetting the values of the perfmons. Moreover,
we also reset the syncobjs related to the availability of the query.

So, create a user extension for the CPU job that enables the creation
of a reset performance job. This user extension will allow the creation of
a CPU job that resets the perfmons values and resets the availability syncobj.

Signed-off-by: Maíra Canal <mcanal@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231130164420.932823-18-mcanal@igalia.com
2023-12-01 09:47:35 -03:00
Maíra Canal
6745f3e44a
drm/v3d: Create a CPU job extension to copy timestamp query to a buffer
A CPU job is a type of job that performs operations that requires CPU
intervention. A copy timestamp query job is a job that copy the complete
or partial result of a query to a buffer. As V3D doesn't provide any
mechanism to obtain a timestamp from the GPU, it is a job that needs
CPU intervention.

So, create a user extension for the CPU job that enables the creation
of a copy timestamp query job. This user extension will allow the creation
of a CPU job that copy the results of a timestamp query to a BO with the
possibility to indicate the timestamp availability with a availability bit.

Signed-off-by: Maíra Canal <mcanal@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231130164420.932823-17-mcanal@igalia.com
2023-12-01 09:47:31 -03:00
Maíra Canal
34a101e642
drm/v3d: Create a CPU job extension for the reset timestamp job
A CPU job is a type of job that performs operations that requires CPU
intervention. A reset timestamp job is a job that resets the timestamp
queries based on the value offset of the first query. As V3D doesn't
provide any mechanism to obtain a timestamp from the GPU, it is a job
that needs CPU intervention.

So, create a user extension for the CPU job that enables the creation
of a reset timestamp job. This user extension will allow the creation of
a CPU job that resets the timestamp value in the timestamp BO and resets
the availability syncobj.

Signed-off-by: Maíra Canal <mcanal@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231130164420.932823-16-mcanal@igalia.com
2023-12-01 09:42:47 -03:00
Maíra Canal
9ba0ff3e08
drm/v3d: Create a CPU job extension for the timestamp query job
A CPU job is a type of job that performs operations that requires CPU
intervention. A timestamp query job is a job that calculates the
query timestamp and updates the query availability by signaling a
syncobj. As V3D doesn't provide any mechanism to obtain a timestamp
from the GPU, it is a job that needs CPU intervention.

So, create a user extension for the CPU job that enables the creation
of a timestamp query job. This user extension will allow the creation of
a CPU job that performs the timestamp query calculation and updates the
timestamp BO with the proper value.

Signed-off-by: Maíra Canal <mcanal@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231130164420.932823-15-mcanal@igalia.com
2023-12-01 09:41:37 -03:00
Maíra Canal
18b8413b25
drm/v3d: Create a CPU job extension for a indirect CSD job
A CPU job is a type of job that performs operations that requires CPU
intervention. An indirect CSD job is a job that, when executed in the
queue, will map the indirect buffer, read the dispatch parameters, and
submit a regular dispatch. Therefore, it is a job that needs CPU
intervention.

So, create a user extension for the CPU job that enables the creation
of an indirect CSD. This user extension will allow the creation of a CSD
job linked to a CPU job. The CPU job will wait for the indirect CSD job
dependencies and, once they are signaled, it will update the CSD job
parameters.

Co-developed-by: Melissa Wen <mwen@igalia.com>
Signed-off-by: Melissa Wen <mwen@igalia.com>
Signed-off-by: Maíra Canal <mcanal@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231130164420.932823-14-mcanal@igalia.com
2023-12-01 09:40:15 -03:00
Maíra Canal
7c13132c40
drm/v3d: Enable BO mapping
For the indirect CSD CPU job, we will need to access the internal
contents of the BO with the dispatch parameters. Therefore, create
methods to allow the mapping and unmapping of the BO.

Signed-off-by: Maíra Canal <mcanal@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231130164420.932823-13-mcanal@igalia.com
2023-12-01 09:37:50 -03:00
Melissa Wen
369b059617
drm/v3d: Detach the CSD job BO setup
Detach CSD job setup from CSD submission ioctl to reuse it in CPU
submission ioctl for indirect CSD job.

Signed-off-by: Melissa Wen <mwen@igalia.com>
Co-developed-by: Maíra Canal <mcanal@igalia.com>
Signed-off-by: Maíra Canal <mcanal@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231130164420.932823-12-mcanal@igalia.com
2023-12-01 09:37:49 -03:00
Maíra Canal
1fe0879efc
drm/v3d: Create tracepoints to track the CPU job
Create tracepoints to track the three major events of a CPU job
lifetime:
	1. Submission of a `v3d_submit_cpu` IOCTL
	2. Beginning of the execution of a CPU job
	3. Ending of the execution of a CPU job

Signed-off-by: Maíra Canal <mcanal@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231130164420.932823-11-mcanal@igalia.com
2023-12-01 09:37:48 -03:00
Maíra Canal
c5195d001f
drm/v3d: Use v3d_get_extensions() to parse CPU job data
Currently, v3d_get_extensions() only parses multisync data and assigns
it to the `struct v3d_submit_ext`. But, to implement the CPU job with
user extensions, we want v3d_get_extensions() to be able to parse CPU
job data and assign it to the `struct v3d_cpu_job`.

Therefore, allow the function v3d_get_extensions() to use `struct v3d_cpu_job *`
as a parameter. If the `struct v3d_cpu_job *` is assigned to NULL, it means
that the job is a GPU job and CPU job extensions should be rejected.

Signed-off-by: Maíra Canal <mcanal@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231130164420.932823-10-mcanal@igalia.com
2023-12-01 09:34:25 -03:00
Melissa Wen
aafc1a2bea
drm/v3d: Add a CPU job submission
Create a new type of job, a CPU job. A CPU job is a type of job that
performs operations that requires CPU intervention. The overall idea is
to use user extensions to enable different types of CPU job, allowing the
CPU job to perform different operations according to the type of user
extension. The user extension ID identify the type of CPU job that must
be dealt.

Having a CPU job is interesting for synchronization purposes as a CPU
job has a queue like any other V3D job and can be synchoronized by the
multisync extension.

Signed-off-by: Melissa Wen <mwen@igalia.com>
Co-developed-by: Maíra Canal <mcanal@igalia.com>
Signed-off-by: Maíra Canal <mcanal@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231130164420.932823-9-mcanal@igalia.com
2023-12-01 09:34:19 -03:00
Maíra Canal
464c61e76d
drm/v3d: Decouple job allocation from job initiation
We want to allow the IOCTLs to allocate the job without initiating it.
This will be useful for the CPU job submission IOCTL, as the CPU job has
the need to use information from the user extensions. Currently, the
user extensions are parsed before the job allocation, making it
impossible to fill the CPU job when parsing the user extensions.
Therefore, decouple the job allocation from the job initiation.

Signed-off-by: Maíra Canal <mcanal@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231130164420.932823-8-mcanal@igalia.com
2023-12-01 09:34:15 -03:00
Maíra Canal
6893deb881
drm/v3d: Don't allow two multisync extensions in the same job
Currently, two multisync extensions can be added to the same job and
only the last multisync extension will be used. To avoid this
vulnerability, don't allow two multisync extensions in the same job.

Signed-off-by: Maíra Canal <mcanal@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231130164420.932823-7-mcanal@igalia.com
2023-12-01 09:34:12 -03:00
Melissa Wen
8288faaa8b
drm/v3d: Simplify job refcount handling
Instead of checking if the job is NULL every time we call the function,
check it inside the function.

Signed-off-by: Melissa Wen <mwen@igalia.com>
Signed-off-by: Maíra Canal <mcanal@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231130164420.932823-6-mcanal@igalia.com
2023-12-01 09:34:08 -03:00
Melissa Wen
9032d5f633
drm/v3d: Detach job submissions IOCTLs to a new specific file
We will include a new job submission type, the CPU job submission. For
readability and maintability, separate the job submission IOCTLs and
related operations from v3d_gem.c.

Minor fix in the CSD submission kernel doc:
CSD (texture formatting) -> CSD (compute shader).

Signed-off-by: Melissa Wen <mwen@igalia.com>
Signed-off-by: Maíra Canal <mcanal@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231130164420.932823-5-mcanal@igalia.com
2023-12-01 09:34:01 -03:00
Melissa Wen
a8ad9d63a1
drm/v3d: Move wait BO ioctl to the v3d_bo file
IOCTLs related to BO operations reside on the file v3d_bo.c. The wait BO
ioctl is the only IOCTL regarding BOs that is placed in a different file.
So, move it to the v3d_bo.c file.

Signed-off-by: Melissa Wen <mwen@igalia.com>
Signed-off-by: Maíra Canal <mcanal@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231130164420.932823-4-mcanal@igalia.com
2023-12-01 09:33:56 -03:00
Melissa Wen
780b9463ce
drm/v3d: Remove unused function header
v3d_mmu_get_offset header was added but the function was never defined.
Just remove it.

Signed-off-by: Melissa Wen <mwen@igalia.com>
Signed-off-by: Maíra Canal <mcanal@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231130164420.932823-3-mcanal@igalia.com
2023-12-01 09:33:52 -03:00
Danilo Krummrich
a78422e9df drm/sched: implement dynamic job-flow control
Currently, job flow control is implemented simply by limiting the number
of jobs in flight. Therefore, a scheduler is initialized with a credit
limit that corresponds to the number of jobs which can be sent to the
hardware.

This implies that for each job, drivers need to account for the maximum
job size possible in order to not overflow the ring buffer.

However, there are drivers, such as Nouveau, where the job size has a
rather large range. For such drivers it can easily happen that job
submissions not even filling the ring by 1% can block subsequent
submissions, which, in the worst case, can lead to the ring run dry.

In order to overcome this issue, allow for tracking the actual job size
instead of the number of jobs. Therefore, add a field to track a job's
credit count, which represents the number of credits a job contributes
to the scheduler's credit limit.

Signed-off-by: Danilo Krummrich <dakr@redhat.com>
Reviewed-by: Luben Tuikov <ltuikov89@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231110001638.71750-1-dakr@redhat.com
2023-11-10 02:54:29 +01:00
Maíra Canal
509433d814
drm/v3d: Expose the total GPU usage stats on sysfs
The previous patch exposed the accumulated amount of active time per
client for each V3D queue. But this doesn't provide a global notion of
the GPU usage.

Therefore, provide the accumulated amount of active time for each V3D
queue (BIN, RENDER, CSD, TFU and CACHE_CLEAN), considering all the jobs
submitted to the queue, independent of the client.

This data is exposed through the sysfs interface, so that if the
interface is queried at two different points of time the usage percentage
of each of the queues can be calculated.

Co-developed-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com>
Signed-off-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com>
Signed-off-by: Maíra Canal <mcanal@igalia.com>
Acked-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com>
Reviewed-by: Melissa Wen <mwen@igalia.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230905213416.1290219-3-mcanal@igalia.com
2023-11-06 10:09:29 -03:00
Maíra Canal
09a93cc4f7
drm/v3d: Implement show_fdinfo() callback for GPU usage stats
This patch exposes the accumulated amount of active time per client
through the fdinfo infrastructure. The amount of active time is exposed
for each V3D queue: BIN, RENDER, CSD, TFU and CACHE_CLEAN.

In order to calculate the amount of active time per client, a CPU clock
is used through the function local_clock(). The point where the jobs has
started is marked and is finally compared with the time that the job had
finished.

Moreover, the number of jobs submitted to each queue is also exposed on
fdinfo through the identifier "v3d-jobs-<queue>".

Co-developed-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com>
Signed-off-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com>
Signed-off-by: Maíra Canal <mcanal@igalia.com>
Acked-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com>
Reviewed-by: Melissa Wen <mwen@igalia.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230905213416.1290219-3-mcanal@igalia.com
2023-11-06 10:09:23 -03:00
Iago Toral Quiroga
6fd9487147
drm/v3d: add brcm,2712-v3d as a compatible V3D device
This is required to get the V3D module to load with Raspberry Pi 5.

Signed-off-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Stefan Wahren <wahrenst@gmx.net>
Reviewed-by: Maíra Canal <mcanal@igalia.com>
Signed-off-by: Maíra Canal <mcanal@igalia.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231031073859.25298-5-itoral@igalia.com
2023-11-02 08:54:39 -03:00
Iago Toral Quiroga
0ad5bc1ce4
drm/v3d: fix up register addresses for V3D 7.x
This patch updates a number of register addresses that have
been changed in Raspberry Pi 5 (V3D 7.1) and updates the
code to use the corresponding registers and addresses based
on the actual V3D version.

Signed-off-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Maíra Canal <mcanal@igalia.com>
Signed-off-by: Maíra Canal <mcanal@igalia.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231031073859.25298-3-itoral@igalia.com
2023-11-02 08:54:39 -03:00
Matthew Brost
a6149f0393 drm/sched: Convert drm scheduler to use a work queue rather than kthread
In Xe, the new Intel GPU driver, a choice has made to have a 1 to 1
mapping between a drm_gpu_scheduler and drm_sched_entity. At first this
seems a bit odd but let us explain the reasoning below.

1. In Xe the submission order from multiple drm_sched_entity is not
guaranteed to be the same completion even if targeting the same hardware
engine. This is because in Xe we have a firmware scheduler, the GuC,
which allowed to reorder, timeslice, and preempt submissions. If a using
shared drm_gpu_scheduler across multiple drm_sched_entity, the TDR falls
apart as the TDR expects submission order == completion order. Using a
dedicated drm_gpu_scheduler per drm_sched_entity solve this problem.

2. In Xe submissions are done via programming a ring buffer (circular
buffer), a drm_gpu_scheduler provides a limit on number of jobs, if the
limit of number jobs is set to RING_SIZE / MAX_SIZE_PER_JOB we get flow
control on the ring for free.

A problem with this design is currently a drm_gpu_scheduler uses a
kthread for submission / job cleanup. This doesn't scale if a large
number of drm_gpu_scheduler are used. To work around the scaling issue,
use a worker rather than kthread for submission / job cleanup.

v2:
  - (Rob Clark) Fix msm build
  - Pass in run work queue
v3:
  - (Boris) don't have loop in worker
v4:
  - (Tvrtko) break out submit ready, stop, start helpers into own patch
v5:
  - (Boris) default to ordered work queue
v6:
  - (Luben / checkpatch) fix alignment in msm_ringbuffer.c
  - (Luben) s/drm_sched_submit_queue/drm_sched_wqueue_enqueue
  - (Luben) Update comment for drm_sched_wqueue_enqueue
  - (Luben) Positive check for submit_wq in drm_sched_init
  - (Luben) s/alloc_submit_wq/own_submit_wq
v7:
  - (Luben) s/drm_sched_wqueue_enqueue/drm_sched_run_job_queue
v8:
  - (Luben) Adjust var names / comments

Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Luben Tuikov <luben.tuikov@amd.com>
Link: https://lore.kernel.org/r/20231031032439.1558703-3-matthew.brost@intel.com
Signed-off-by: Luben Tuikov <ltuikov89@gmail.com>
2023-11-01 17:29:21 -04:00
Maíra Canal
79d94360d5
drm/v3d: wait for all jobs to finish before unregistering
Currently, we are only warning the user if the BIN or RENDER jobs don't
finish before we unregister V3D. We must wait for all jobs to finish
before unregistering. Therefore, warn the user if TFU or CSD jobs
are not done by the time the driver is unregistered.

Signed-off-by: Maíra Canal <mcanal@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Signed-off-by: Maíra Canal <mairacanal@riseup.net>
Link: https://patchwork.freedesktop.org/patch/msgid/20231023105927.101502-1-mcanal@igalia.com
2023-10-30 09:34:09 -03:00
Luben Tuikov
56e449603f drm/sched: Convert the GPU scheduler to variable number of run-queues
The GPU scheduler has now a variable number of run-queues, which are set up at
drm_sched_init() time. This way, each driver announces how many run-queues it
requires (supports) per each GPU scheduler it creates. Note, that run-queues
correspond to scheduler "priorities", thus if the number of run-queues is set
to 1 at drm_sched_init(), then that scheduler supports a single run-queue,
i.e. single "priority". If a driver further sets a single entity per
run-queue, then this creates a 1-to-1 correspondence between a scheduler and
a scheduled entity.

Cc: Lucas Stach <l.stach@pengutronix.de>
Cc: Russell King <linux+etnaviv@armlinux.org.uk>
Cc: Qiang Yu <yuq825@gmail.com>
Cc: Rob Clark <robdclark@gmail.com>
Cc: Abhinav Kumar <quic_abhinavk@quicinc.com>
Cc: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Cc: Danilo Krummrich <dakr@redhat.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Boris Brezillon <boris.brezillon@collabora.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Christian König <christian.koenig@amd.com>
Cc: Emma Anholt <emma@anholt.net>
Cc: etnaviv@lists.freedesktop.org
Cc: lima@lists.freedesktop.org
Cc: linux-arm-msm@vger.kernel.org
Cc: freedreno@lists.freedesktop.org
Cc: nouveau@lists.freedesktop.org
Cc: dri-devel@lists.freedesktop.org
Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Link: https://lore.kernel.org/r/20231023032251.164775-1-luben.tuikov@amd.com
2023-10-26 12:03:47 -04:00
Kees Cook
9586e24017 drm/v3d: Annotate struct v3d_perfmon with __counted_by
Prepare for the coming implementation by GCC and Clang of the __counted_by
attribute. Flexible array members annotated with __counted_by can have
their accesses bounds-checked at run-time checking via CONFIG_UBSAN_BOUNDS
(for array indexing) and CONFIG_FORTIFY_SOURCE (for strcpy/memcpy-family
functions).

As found with Coccinelle[1], add __counted_by for struct v3d_perfmon.

[1] https://github.com/kees/kernel-tools/blob/trunk/coccinelle/examples/counted_by.cocci

Cc: Emma Anholt <emma@anholt.net>
Cc: Melissa Wen <mwen@igalia.com>
Cc: David Airlie <airlied@gmail.com>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: dri-devel@lists.freedesktop.org
Signed-off-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Maíra Canal <mcanal@igalia.com>
Signed-off-by: Christian König <christian.koenig@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230922173216.3823169-9-keescook@chromium.org
2023-10-05 11:31:33 +02:00
Nathan Chancellor
b27211db61
drm/v3d: Avoid -Wconstant-logical-operand in nsecs_to_jiffies_timeout()
A proposed update to clang's -Wconstant-logical-operand to warn when the
left hand side is a constant shows the following instance in
nsecs_to_jiffies_timeout() when NSEC_PER_SEC is not a multiple of HZ,
such as CONFIG_HZ=300:

  In file included from drivers/gpu/drm/v3d/v3d_debugfs.c:12:
  drivers/gpu/drm/v3d/v3d_drv.h:343:24: warning: use of logical '&&' with constant operand [-Wconstant-logical-operand]
    343 |         if (NSEC_PER_SEC % HZ &&
        |             ~~~~~~~~~~~~~~~~~ ^
  drivers/gpu/drm/v3d/v3d_drv.h:343:24: note: use '&' for a bitwise operation
    343 |         if (NSEC_PER_SEC % HZ &&
        |                               ^~
        |                               &
  drivers/gpu/drm/v3d/v3d_drv.h:343:24: note: remove constant to silence this warning
  1 warning generated.

Turn this into an explicit comparison against zero to make the
expression a boolean to make it clear this should be a logical check,
not a bitwise one.

Link: https://reviews.llvm.org/D142609
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Reviewed-by: Maíra Canal <mcanal@igalia.com>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Maíra Canal <mairacanal@riseup.net>
Link: https://patchwork.freedesktop.org/patch/msgid/20230718-nsecs_to_jiffies_timeout-constant-logical-operand-v1-1-36ed8fc8faea@kernel.org
2023-07-27 13:01:27 -03:00
Thomas Zimmermann
71e801b9b4 drm: Clear fd/handle callbacks in struct drm_driver
Clear all assignments of struct drm_driver's fd/handle callbacks to
drm_gem_prime_fd_to_handle() and drm_gem_prime_handle_to_fd(). These
functions are called by default. Add a TODO item to convert vmwgfx
to the defaults as well.

v2:
	* remove TODO item (Zack)
	* also update amdgpu's amdgpu_partition_driver

Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Reviewed-by: Simon Ser <contact@emersion.fr>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Jeffrey Hugo <quic_jhugo@quicinc.com> # qaic
Link: https://patchwork.freedesktop.org/patch/msgid/20230620080252.16368-3-tzimmermann@suse.de
2023-06-26 11:08:41 +02:00
Thomas Zimmermann
0adec22702 drm: Remove struct drm_driver.gem_prime_mmap
All drivers initialize this field with drm_gem_prime_mmap(). Call
the function directly and remove the field. Simplifies the code and
resolves a long-standing TODO item.

Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230613150441.17720-3-tzimmermann@suse.de
2023-06-19 13:56:40 +02:00
Uwe Kleine-König
b957812839 drm/v3d: Convert to platform remove callback returning void
The .remove() callback for a platform driver returns an int which makes
many driver authors wrongly assume it's possible to do error handling by
returning an error code. However the value returned is (mostly) ignored
and this typically results in resource leaks. To improve here there is a
quest to make the remove callback return void. In the first step of this
quest all drivers are converted to .remove_new() which already returns
void.

Trivially convert this driver from always returning zero in the remove
callback to the void returning variant.

Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Signed-off-by: Douglas Anderson <dianders@chromium.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20230507162616.1368908-52-u.kleine-koenig@pengutronix.de
2023-06-08 09:04:13 -07:00
Maíra Canal
25c0e4062d
drm/v3d: Use drm_sched_job_add_syncobj_dependency()
As v3d_job_add_deps() performs the same steps as
drm_sched_job_add_syncobj_dependency(), replace the open-coded
implementation in v3d in order to simply use the DRM function.

Signed-off-by: Maíra Canal <mcanal@igalia.com>
Reviewed-by: Melissa Wen <mwen@igalia.com>
Signed-off-by: Maíra Canal <mairacanal@riseup.net>
Link: https://patchwork.freedesktop.org/patch/msgid/20230209124447.467867-6-mcanal@igalia.com
2023-02-24 17:23:43 -03:00
Maíra Canal
a53be8dae8 drm/v3d: replace open-coded implementation of drm_gem_object_lookup
As v3d_submit_tfu_ioctl() performs the same steps as
drm_gem_object_lookup(), replace the open-code implementation in v3d
with its DRM core equivalent.

Signed-off-by: Maíra Canal <mcanal@igalia.com>
Reviewed-by: Melissa Wen <mwen@igalia.com>
Signed-off-by: Melissa Wen <melissa.srw@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221227200010.191351-1-mcanal@igalia.com
2023-01-03 18:16:01 -01:00
Maíra Canal
c0dda238e2 drm/v3d: use new debugfs device-centered functions
Replace the use of drm_debugfs_create_files() with the new
drm_debugfs_add_files() function, which centers the debugfs files
management on the drm_device instead of drm_minor.

Signed-off-by: Maíra Canal <mcanal@igalia.com>
Reviewed-by: Melissa Wen <mwen@igalia.com>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Maíra Canal <mairacanal@riseup.net>
Link: https://patchwork.freedesktop.org/patch/msgid/20221219120621.15086-6-mcanal@igalia.com
2022-12-22 14:59:23 -03:00
Melissa Wen
5d93060521 drm/v3d: replace obj lookup steps with drm_gem_objects_lookup
As v3d_lookup_bos() performs the same steps as drm_gem_objects_lookup(),
replace the explicit code in v3d to simply use the DRM function.

Signed-off-by: Melissa Wen <mwen@igalia.com>
Reviewed-by: Maíra Canal <mcanal@igalia.com>
Signed-off-by: Melissa Wen <melissa.srw@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221205135538.3545051-3-mwen@igalia.com
2022-12-18 23:08:20 -01:00
Melissa Wen
f98c5ec251 drm/v3d: cleanup BOs properly when lookup_bos fails
When v3d_lookup_bos fails to `allocate validated BO pointers`,
job->bo_count was already set to args->bo_count, but job->bo points to
NULL. In this scenario, we must verify that job->bo is not NULL before
iterating on it to proper clean up a job. Also, drm_gem_object_put
already checks that the object passed is not NULL, doing the job->bo[i]
checker redundant.

Signed-off-by: Melissa Wen <mwen@igalia.com>
Reviewed-by: Maíra Canal <mcanal@igalia.com>
Signed-off-by: Melissa Wen <melissa.srw@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221205135538.3545051-2-mwen@igalia.com
2022-12-18 23:08:19 -01:00
Maíra Canal
d23a6e3f76 drm/v3d: add missing mutex_destroy
v3d_perfmon_open_file() instantiates a mutex for a particular file
instance, but it never destroys it by calling mutex_destroy() in
v3d_perfmon_close_file().

Similarly, v3d_perfmon_create_ioctl() instantiates a mutex for a
particular perfmon, but it never destroys it by calling mutex_destroy()
in v3d_perfmon_destroy_ioctl().

So, add the missing mutex_destroy on both cases.

Signed-off-by: Maíra Canal <mcanal@igalia.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Melissa Wen <melissa.srw@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221108175425.39819-3-mcanal@igalia.com
2022-11-10 11:49:40 -01:00
Maíra Canal
91d502f6eb drm/v3d: switch to drmm_mutex_init
mutex_init is supposed to be balanced by a call to mutex_destroy, but
this is not currently happening on the v3d driver.

Considering the introduction of a DRM-managed mutex_init variant, switch
to the drmm_mutex_init.

Signed-off-by: Maíra Canal <mcanal@igalia.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Melissa Wen <melissa.srw@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221108175425.39819-2-mcanal@igalia.com
2022-11-10 11:49:40 -01:00
Thomas Zimmermann
45b64fd9f7 drm/fb-helper: Remove unnecessary include statements
Remove include statements for <drm/drm_fb_helper.h> where it is not
required (i.e., most of them). In a few places include other header
files that are required by the source code.

v3:
	* fix amdgpu include statements
	* fix rockchip include statements

Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Reviewed-by: Javier Martinez Canillas <javierm@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221103151446.2638-23-tzimmermann@suse.de
2022-11-05 17:12:04 +01:00
Danilo Krummrich
4a83c26a1d drm/gem: rename GEM CMA helpers to GEM DMA helpers
Rename "GEM CMA" helpers to "GEM DMA" helpers - considering the
hierarchy of APIs (mm/cma -> dma -> gem dma) calling them "GEM
DMA" seems to be more applicable.

Besides that, commit e57924d4ae ("drm/doc: Task to rename CMA helpers")
requests to rename the CMA helpers and implies that people seem to be
confused about the naming.

In order to do this renaming the following script was used:

```
	#!/bin/bash

	DIRS="drivers/gpu include/drm Documentation/gpu"

	REGEX_SYM_UPPER="[0-9A-Z_\-]"
	REGEX_SYM_LOWER="[0-9a-z_\-]"

	REGEX_GREP_UPPER="(${REGEX_SYM_UPPER}*)(GEM)_CMA_(${REGEX_SYM_UPPER}*)"
	REGEX_GREP_LOWER="(${REGEX_SYM_LOWER}*)(gem)_cma_(${REGEX_SYM_LOWER}*)"

	REGEX_SED_UPPER="s/${REGEX_GREP_UPPER}/\1\2_DMA_\3/g"
	REGEX_SED_LOWER="s/${REGEX_GREP_LOWER}/\1\2_dma_\3/g"

	# Find all upper case 'CMA' symbols and replace them with 'DMA'.
	for ff in $(grep -REHl "${REGEX_GREP_UPPER}" $DIRS)
	do
	       sed -i -E "$REGEX_SED_UPPER" $ff
	done

	# Find all lower case 'cma' symbols and replace them with 'dma'.
	for ff in $(grep -REHl "${REGEX_GREP_LOWER}" $DIRS)
	do
	       sed -i -E "$REGEX_SED_LOWER" $ff
	done

	# Replace all occurrences of 'CMA' / 'cma' in comments and
	# documentation files with 'DMA' / 'dma'.
	for ff in $(grep -RiHl " cma " $DIRS)
	do
		sed -i -E "s/ cma / dma /g" $ff
		sed -i -E "s/ CMA / DMA /g" $ff
	done

	# Rename all 'cma_obj's to 'dma_obj'.
	for ff in $(grep -RiHl "cma_obj" $DIRS)
	do
		sed -i -E "s/cma_obj/dma_obj/g" $ff
	done
```

Only a few more manual modifications were needed, e.g. reverting the
following modifications in some DRM Kconfig files

    -       select CMA if HAVE_DMA_CONTIGUOUS
    +       select DMA if HAVE_DMA_CONTIGUOUS

as well as manually picking the occurrences of 'CMA'/'cma' in comments and
documentation which relate to "GEM CMA", but not "FB CMA".

Also drivers/gpu/drm/Makefile was fixed up manually after renaming
drm_gem_cma_helper.c to drm_gem_dma_helper.c.

This patch is compile-time tested building a x86_64 kernel with
`make allyesconfig && make drivers/gpu/drm`.

Acked-by: Sam Ravnborg <sam@ravnborg.org>
Acked-by: Thomas Zimmermann <tzimmermann@suse.de>
Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Signed-off-by: Danilo Krummrich <dakr@redhat.com>
Reviewed-by: Liviu Dudau <liviu.dudau@arm.com> #drivers/gpu/drm/arm
Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20220802000405.949236-4-dakr@redhat.com
2022-08-03 18:31:49 +02:00
Danilo Krummrich
5e8bf00ea9 drm/fb: remove unused includes of drm_fb_cma_helper.h
Quite a lot of drivers include the drm_fb_cma_helper.h header file
without actually making use of it's provided API, hence remove those
includes.

Suggested-by: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: Danilo Krummrich <dakr@redhat.com>
Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Acked-by: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20220802000405.949236-2-dakr@redhat.com
2022-08-03 18:30:20 +02:00
Danilo Krummrich
604bda63bf
drm/v3d: use idr_init_base() to initialize v3d_priv->perfmon.idr
idr_init_base(), implemented by commit 6ce711f275 ("idr: Make 1-based
IDRs more efficient"), let us set an arbitrary base other than
idr_init(), which uses base 0.

Since, for this IDR, no ID < 1 is ever requested/allocated, using
idr_init_base(&idr, 1) avoids unnecessary tree walks.

Signed-off-by: Danilo Krummrich <dakr@redhat.com>
Signed-off-by: Maxime Ripard <maxime@cerno.tech>
Link: https://lore.kernel.org/r/20220701190227.284783-1-dakr@redhat.com
2022-07-28 15:35:56 +01:00
Peter Robinson
e5a068983c drm/v3d: Add support for bcm2711
Add compatible string and Kconfig options and help for bcm2711.

Signed-off-by: Nicolas Saenz Julienne <nsaenzjulienne@suse.de>
Signed-off-by: Peter Robinson <pbrobinson@gmail.com>
Reviewed-by: Stefan Wahren <stefan.wahren@i2se.com>
Reviewed-by: Javier Martinez Canillas <javierm@redhat.com>
Signed-off-by: Melissa Wen <melissa.srw@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220603092610.1909675-4-pbrobinson@gmail.com
2022-06-12 22:10:50 -01:00
Peter Robinson
90a64adb08 drm/v3d: Get rid of pm code
Runtime PM doesn't seem to work correctly on this driver. On top of
that, commit 8b6864e3e1 ("drm/v3d/v3d_drv: Remove unused static
variable 'v3d_v3d_pm_ops'") hints that it most likely never did as the
driver's PM ops were not hooked-up.

So, in order to support regular operation with V3D on BCM2711 (Raspberry
Pi 4), get rid of the PM code. PM will be reinstated once we figure out
the underlying issues.

Signed-off-by: Nicolas Saenz Julienne <nsaenzjulienne@suse.de>
Signed-off-by: Peter Robinson <pbrobinson@gmail.com>
Reviewed-by: Javier Martinez Canillas <javierm@redhat.com>
Acked-by: Melissa Wen <mwen@igalia.com>
Signed-off-by: Melissa Wen <melissa.srw@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220603092610.1909675-3-pbrobinson@gmail.com
2022-06-12 22:10:40 -01:00
Colin Ian King
ce7a1ecf3f drm/v3d: Fix null pointer dereference of pointer perfmon
In the unlikely event that pointer perfmon is null the WARN_ON return path
occurs after the pointer has already been deferenced. Fix this by only
dereferencing perfmon after it has been null checked.

Fixes: 26a4dc29b7 ("drm/v3d: Expose performance counters to userspace")
Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
Signed-off-by: Melissa Wen <melissa.srw@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220424183512.1365683-1-colin.i.king@gmail.com
2022-04-26 11:55:28 -01:00
Christian König
73511edf8b dma-buf: specify usage while adding fences to dma_resv obj v7
Instead of distingting between shared and exclusive fences specify
the fence usage while adding fences.

Rework all drivers to use this interface instead and deprecate the old one.

v2: some kerneldoc comments suggested by Daniel
v3: fix a missing case in radeon
v4: rebase on nouveau changes, fix lockdep and temporary disable warning
v5: more documentation updates
v6: separate internal dma_resv changes from this patch, avoids to
    disable warning temporary, rebase on upstream changes
v7: fix missed case in lima driver, minimize changes to i915_gem_busy_ioctl

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20220407085946.744568-3-christian.koenig@amd.com
2022-04-07 12:53:53 +02:00
Christian König
c8d4c18bfb dma-buf/drivers: make reserving a shared slot mandatory v4
Audit all the users of dma_resv_add_excl_fence() and make sure they
reserve a shared slot also when only trying to add an exclusive fence.

This is the next step towards handling the exclusive fence like a
shared one.

v2: fix missed case in amdgpu
v3: and two more radeon, rename function
v4: add one more case to TTM, fix i915 after rebase

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20220406075132.3263-2-christian.koenig@amd.com
2022-04-06 17:38:25 +02:00
Melissa Wen
7e302637ba drm/v3d: centralize error handling when init scheduler fails
Remove redundant error message (since now it is very similar to what
we do in drm_sched_init) and centralize all error handling in a
unique place, as we follow the same steps in any case of failure.

Signed-off-by: Melissa Wen <mwen@igalia.com>
Acked-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Signed-off-by: Melissa Wen <melissa.srw@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220228181647.3794298-1-mwen@igalia.com
2022-03-01 22:37:13 -01:00
Jiawei Gu
8ab62eda17 drm/sched: Add device pointer to drm_gpu_scheduler
Add device pointer so scheduler's printing can use
DRM_DEV_ERROR() instead, which makes life easier under multiple GPU
scenario.

v2: amend all calls of drm_sched_init()
v3: fill dev pointer for all drm_sched_init() calls

Signed-off-by: Jiawei Gu <Jiawei.Gu@amd.com>
Reviewed-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Christian König <christian.koenig@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220221095705.5290-1-Jiawei.Gu@amd.com
2022-02-23 10:04:14 +01:00
Thomas Zimmermann
d315bdbfeb drm/gem-shmem: Set vm_ops in static initializer
Initialize default vm_ops in static initialization of the GEM SHMEM funcs,
instead of the mmap code. It's simply better style. GEM helpers will later
set a VMA's vm_ops from the default automatically.

v2:
	* also update the drivers that build upon GEM SHMEM

Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Reviewed-by: Steven Price <steven.price@arm.com>
Reviewed-by: Chia-I Wu <olvaffe@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220209155634.3994-2-tzimmermann@suse.de
2022-02-14 09:42:01 +01:00
Lucas De Marchi
b8c75bd974 drm: Convert open-coded yes/no strings to yesno()
linux/string_helpers.h provides a helper to return "yes"/"no" strings.
Replace the open coded versions with str_yes_no(). The places were
identified with the following semantic patch:

	@@
	expression b;
	@@

	- b ? "yes" : "no"
	+ str_yes_no(b)

Then the includes were added, so we include-what-we-use, and parenthesis
adjusted in drivers/gpu/drm/v3d/v3d_debugfs.c. After the conversion we
still see the same binary sizes:

   text    data     bss     dec     hex filename
  51149    3295     212   54656    d580 virtio/virtio-gpu.ko.old
  51149    3295     212   54656    d580 virtio/virtio-gpu.ko
1441491   60340     800 1502631  16eda7 radeon/radeon.ko.old
1441491   60340     800 1502631  16eda7 radeon/radeon.ko
6125369  328538   34000 6487907  62ff63 amd/amdgpu/amdgpu.ko.old
6125369  328538   34000 6487907  62ff63 amd/amdgpu/amdgpu.ko
 411986   10490    6176  428652   68a6c drm.ko.old
 411986   10490    6176  428652   68a6c drm.ko
  98129    1636     264  100029   186bd dp/drm_dp_helper.ko.old
  98129    1636     264  100029   186bd dp/drm_dp_helper.ko
1973432  109640    2352 2085424  1fd230 nouveau/nouveau.ko.old
1973432  109640    2352 2085424  1fd230 nouveau/nouveau.ko

Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220126093951.1470898-10-lucas.demarchi@intel.com
2022-02-07 13:04:25 -08:00
Yongzhi Liu
e57c1a3bd5 drm/v3d: fix missing unlock
[why]
Unlock is needed on the error handling path to prevent dead lock.
v3d_submit_cl_ioctl and v3d_submit_csd_ioctl is missing unlock.

[how]
Fix this by changing goto target on the error handling path. So
changing the goto to target an error handling path
that includes drm_gem_unlock reservations.

Signed-off-by: Yongzhi Liu <lyz_cs@pku.edu.cn>
Reviewed-by: Melissa Wen <mwen@igalia.com>
Signed-off-by: Melissa Wen <melissa.srw@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/1643377262-109975-1-git-send-email-lyz_cs@pku.edu.cn
2022-02-01 08:25:51 -01:00
Jiasheng Jiang
4a39156166 drm/v3d/v3d_drv: Check for error num after setting mask
Because of the possible failure of the dma_supported(), the
dma_set_mask_and_coherent() may return error num.
Therefore, it should be better to check it and return the error if
fails.
Also, we can create a variable for the mask to solve the
alignment issue.

Fixes: 334dd38a38 ("drm/v3d: Set dma_mask as well as coherent_dma_mask")
Signed-off-by: Jiasheng Jiang <jiasheng@iscas.ac.cn>
Reviewed-by: Melissa Wen <mwen@igalia.com>
Signed-off-by: Melissa Wen <melissa.srw@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220110013807.4105270-1-jiasheng@iscas.ac.cn
2022-01-11 19:07:42 -01:00
Thomas Zimmermann
4ff22f487f drm: Return error codes from struct drm_driver.gem_create_object
GEM helper libraries use struct drm_driver.gem_create_object to let
drivers override GEM object allocation. On failure, the call returns
NULL.

Change the semantics to make the calls return a pointer-encoded error.
This aligns the callback with its callers. Fixes the ingenic driver,
which already returns an error pointer.

Also update the callers to handle the involved types more strictly.

Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Reviewed-by: Steven Price <steven.price@arm.com>
Acked-by: Maxime Ripard <maxime@cerno.tech>
Link: https://patchwork.freedesktop.org/patch/msgid/20211130095255.26710-1-tzimmermann@suse.de
2021-12-02 11:12:39 +01:00
Thomas Zimmermann
a193f3b4e0 drm/shmem-helper: Pass GEM shmem object in public interfaces
Change all GEM SHMEM object functions that receive a GEM object
of type struct drm_gem_object to expect an object of type
struct drm_gem_shmem_object instead.

This change reduces the number of upcasts from struct drm_gem_object
by moving them into callers. The C compiler can now verify that the
GEM SHMEM functions are called with the correct type.

For consistency, the patch also renames drm_gem_shmem_free_object to
drm_gem_shmem_free. It further updates documentation for a number of
functions.

v3:
	* fix docs for drm_gem_shmem_object_free()
v2:
	* mention _object_ callbacks in docs (Daniel)

Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20211108093149.7226-4-tzimmermann@suse.de
2021-11-15 11:46:13 +01:00
Thomas Zimmermann
c7fbcb7149 drm/shmem-helper: Export dedicated wrappers for GEM object functions
Wrap GEM SHMEM functions for struct drm_gem_object_funcs and update
all callers. This will allow for an update of the public interfaces
of the GEM SHMEM helper library.

v2:
	* fix docs for drm_gem_shmem_object_print_info()

Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20211108093149.7226-3-tzimmermann@suse.de
2021-11-15 11:45:01 +01:00
Yang Li
75ad021f21 drm/v3d: nullify pointer se with a NULL
Currently a plain integer is being used to nullify the pointer
struct v3d_submit_ext *se. Use NULL instead. Cleans up sparse
warnings:
drivers/gpu/drm/v3d/v3d_gem.c:777:53: warning: Using plain integer as
NULL pointer
drivers/gpu/drm/v3d/v3d_gem.c:1010:45: warning: Using plain integer as
NULL pointer

Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Signed-off-by: Yang Li <yang.lee@linux.alibaba.com>
Reviewed-by: Melissa Wen <mwen@igalia.com>
Signed-off-by: Melissa Wen <melissa.srw@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/1634282081-72255-1-git-send-email-yang.lee@linux.alibaba.com
2021-10-17 21:40:27 +01:00
Dan Carpenter
ee30840ba3 drm/v3d: fix copy_from_user() error codes
The copy_to/from_user() function returns the number of bytes remaining
to be copied, but we want to return -EFAULT on error.

Fixes: e4165ae830 ("drm/v3d: add multiple syncobjs support")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Melissa Wen <mwen@igalia.com>
Signed-off-by: Melissa Wen <melissa.srw@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211011123303.GA14314@kili
2021-10-11 21:21:14 +01:00
Melissa Wen
e4165ae830 drm/v3d: add multiple syncobjs support
Using the generic extension from the previous patch, a specific multisync
extension enables more than one in/out binary syncobj per job submission.
Arrays of syncobjs are set in struct drm_v3d_multisync, that also cares
of determining the stage for sync (wait deps) according to the job
queue.

v2:
- subclass the generic extension struct (Daniel)
- simplify adding dependency conditions to make understandable (Iago)

v3:
- fix conditions to consider single or multiples in/out_syncs (Iago)
- remove irrelevant comment (Iago)

Signed-off-by: Melissa Wen <mwen@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Signed-off-by: Melissa Wen <melissa.srw@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/ffd8b2e3dd2e0c686db441a0c0a4a0181ff85328.1633016479.git.mwen@igalia.com
2021-10-04 10:08:46 +01:00
Melissa Wen
bb3425efdc drm/v3d: add generic ioctl extension
Add support to attach generic extensions on job submission. This patch
is third prep work to enable multiple syncobjs on job submission. With
this work, when the job submission interface needs to be extended to
accommodate a new feature, we will use a generic extension struct where
an id determines the data type to be pointed. The first application is
to enable multiples in/out syncobj (next patch), but the base is
already done for future features. Therefore, to attach a new feature,
a specific extension struct should subclass drm_v3d_extension and
update the list of extensions in a job submission.

v2:
- remove redundant elements to subclass struct (Daniel)

v3:
- add comment for v3d_get_extensions

Signed-off-by: Melissa Wen <mwen@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Signed-off-by: Melissa Wen <melissa.srw@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/ed53b1cd7e3125b76f18fe3fb995a04393639bc6.1633016479.git.mwen@igalia.com
2021-10-04 10:08:46 +01:00
Melissa Wen
07c2a41658 drm/v3d: alloc and init job in one shot
Move job memory allocation to v3d_job_init function. This aim to facilitate
error handling in job initialization, since cleanup steps are similar for
all (struct v3d_job)-based types of job involved in a command submission.
To generalize v3d_job_init(), this change takes into account that all job
structs have the first element a struct v3d_job (bin, render, tfu, csd) or
it is a v3d_job itself (clean_job) for pointer casting.

v3:
- explicitly init job as NULL (Iago)
- fix pm failure handling on v3_job_init (Iago)

Suggested-by: Iago Toral Quiroga <itoral@igalia.com>
Signed-off-by: Melissa Wen <mwen@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Signed-off-by: Melissa Wen <melissa.srw@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/4d12e07bd334d2cddb51cabd359e99edde595619.1633016479.git.mwen@igalia.com
2021-10-04 10:08:16 +01:00
Melissa Wen
223583dd00 drm/v3d: decouple adding job dependencies steps from job init
Prep work to enable a job to wait for more than one syncobj before
start. Also get rid of old checkpatch warnings in the v3d_gem file.
No functional changes.

Signed-off-by: Melissa Wen <mwen@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Signed-off-by: Melissa Wen <melissa.srw@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/825f5fdd540b0aa2eb57bd5ff93c0777808b108c.1633016479.git.mwen@igalia.com
2021-10-04 10:00:20 +01:00
Cai Huoqing
c3c7d70b20 drm/v3d: Make use of the helper function devm_platform_ioremap_resource_byname()
Use the devm_platform_ioremap_resource_byname() helper instead of
calling platform_get_resource_byname() and devm_ioremap_resource()
separately

Signed-off-by: Cai Huoqing <caihuoqing@baidu.com>
Reviewed-by: Melissa Wen <mwen@igalia.com>
Signed-off-by: Melissa Wen <melissa.srw@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210901112941.31320-1-caihuoqing@baidu.com
2021-09-20 13:37:34 +01:00
Melissa Wen
9fcb4a8ff2 drm/v3d: fix sched job resources cleanup when a job is aborted
In a cl submission, when bin job initialization fails, sched job resources
were already allocated for the render job. At this point,
drm_sched_job_init(render) was done in v3d_job_init but the render job is
aborted before drm_sched_job_arm (in v3d_job_push) happens; therefore, not
only v3d_job_put but also drm_sched_job_cleanup should be called (by
v3d_job_cleanup). A similar issue is addressed for csd and tfu submissions.

The issue was noticed from a review by Iago Toral in a patch that touches
the same part of the code.

Fixes: 916044fac8 ("drm/v3d: Move drm_sched_job_init to v3d_job_init")
Signed-off-by: Melissa Wen <mwen@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Signed-off-by: Melissa Wen <melissa.srw@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210916212726.2u2psq2egwy2mdva@mail.igalia.com
2021-09-19 20:40:23 +01:00
Iago Toral Quiroga
e4f8681911 drm/v3d: fix wait for TMU write combiner flush
The hardware sets the TMUWCF bit back to 0 when the TMU write
combiner flush completes so we should be checking for that instead
of the L2TFLS bit.

v2 (Melissa Wen):
  - Add Signed-off-by and Fixes tags.
  - Change the error message for the timeout to be more clear.

Fixes spurious Vulkan CTS failures in:
dEQP-VK.binding_model.descriptorset_random.*

Fixes: d223f98f02 ("drm/v3d: Add support for compute shader dispatch.")
Signed-off-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Melissa Wen <mwen@igalia.com>
Signed-off-by: Melissa Wen <melissa.srw@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210915100507.3945-1-itoral@igalia.com
2021-09-15 18:43:37 +01:00
Daniel Vetter
da3208e863 drm/v3d: Use scheduler dependency handling
With the prep work out of the way this isn't tricky anymore.

Aside: The chaining of the various jobs is a bit awkward, with the
possibility of failure in bad places. I think with the
drm_sched_job_init/arm split and maybe preloading the
job->dependencies xarray this should be fixable.

v2: Rebase over renamed function names for adding dependencies.

Reviewed-by: Melissa Wen <mwen@igalia.com> (v1)
Acked-by: Emma Anholt <emma@anholt.net>
Cc: Melissa Wen <melissa.srw@gmail.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Cc: Emma Anholt <emma@anholt.net>
Link: https://patchwork.freedesktop.org/patch/msgid/20210805104705.862416-11-daniel.vetter@ffwll.ch
2021-08-30 10:58:47 +02:00
Daniel Vetter
916044fac8 drm/v3d: Move drm_sched_job_init to v3d_job_init
Prep work for using the scheduler dependency handling. We need to call
drm_sched_job_init earlier so we can use the new drm_sched_job_await*
functions for dependency handling here.

v2: Slightly better commit message and rebase to include the
drm_sched_job_arm() call (Emma).

v3: Cleanup jobs under construction correctly (Emma)

v4: Rebase over perfmon patch

Reviewed-by: Melissa Wen <mwen@igalia.com> (v3)
Acked-by: Emma Anholt <emma@anholt.net>
Cc: Melissa Wen <melissa.srw@gmail.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Cc: Emma Anholt <emma@anholt.net>
Link: https://patchwork.freedesktop.org/patch/msgid/20210805104705.862416-10-daniel.vetter@ffwll.ch
2021-08-30 10:58:40 +02:00
Daniel Vetter
0e10e9a1db drm/sched: drop entity parameter from drm_sched_push_job
Originally a job was only bound to the queue when we pushed this, but
now that's done in drm_sched_job_init, making that parameter entirely
redundant.

Remove it.

The same applies to the context parameter in
lima_sched_context_queue_task, simplify that too.

v2:
Rebase on top of msm adopting drm/sched

Reviewed-by: Christian König <christian.koenig@amd.com>
Acked-by: Emma Anholt <emma@anholt.net>
Acked-by: Melissa Wen <mwen@igalia.com>
Reviewed-by: Steven Price <steven.price@arm.com> (v1)
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> (v1)
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Cc: Lucas Stach <l.stach@pengutronix.de>
Cc: Russell King <linux+etnaviv@armlinux.org.uk>
Cc: Christian Gmeiner <christian.gmeiner@gmail.com>
Cc: Qiang Yu <yuq825@gmail.com>
Cc: Rob Herring <robh@kernel.org>
Cc: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Cc: Steven Price <steven.price@arm.com>
Cc: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Cc: Emma Anholt <emma@anholt.net>
Cc: David Airlie <airlied@linux.ie>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: Sumit Semwal <sumit.semwal@linaro.org>
Cc: "Christian König" <christian.koenig@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Nirmoy Das <nirmoy.das@amd.com>
Cc: Dave Airlie <airlied@redhat.com>
Cc: Chen Li <chenli@uniontech.com>
Cc: Lee Jones <lee.jones@linaro.org>
Cc: Deepak R Varma <mh12gx2825@gmail.com>
Cc: Kevin Wang <kevin1.wang@amd.com>
Cc: Luben Tuikov <luben.tuikov@amd.com>
Cc: "Marek Olšák" <marek.olsak@amd.com>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Cc: Dennis Li <Dennis.Li@amd.com>
Cc: Boris Brezillon <boris.brezillon@collabora.com>
Cc: etnaviv@lists.freedesktop.org
Cc: lima@lists.freedesktop.org
Cc: linux-media@vger.kernel.org
Cc: linaro-mm-sig@lists.linaro.org
Cc: Rob Clark <robdclark@gmail.com>
Cc: Sean Paul <sean@poorly.run>
Cc: Melissa Wen <mwen@igalia.com>
Cc: linux-arm-msm@vger.kernel.org
Cc: freedreno@lists.freedesktop.org
Link: https://patchwork.freedesktop.org/patch/msgid/20210805104705.862416-6-daniel.vetter@ffwll.ch
2021-08-30 10:54:45 +02:00
Daniel Vetter
dbe48d030b drm/sched: Split drm_sched_job_init
This is a very confusingly named function, because not just does it
init an object, it arms it and provides a point of no return for
pushing a job into the scheduler. It would be nice if that's a bit
clearer in the interface.

But the real reason is that I want to push the dependency tracking
helpers into the scheduler code, and that means drm_sched_job_init
must be called a lot earlier, without arming the job.

v2:
- don't change .gitignore (Steven)
- don't forget v3d (Emma)

v3: Emma noticed that I leak the memory allocated in
drm_sched_job_init if we bail out before the point of no return in
subsequent driver patches. To be able to fix this change
drm_sched_job_cleanup() so it can handle being called both before and
after drm_sched_job_arm().

Also improve the kerneldoc for this.

v4:
- Fix the drm_sched_job_cleanup logic, I inverted the booleans, as
  usual (Melissa)

- Christian pointed out that drm_sched_entity_select_rq() also needs
  to be moved into drm_sched_job_arm, which made me realize that the
  job->id definitely needs to be moved too.

  Shuffle things to fit between job_init and job_arm.

v5:
Reshuffle the split between init/arm once more, amdgpu abuses
drm_sched.ready to signal gpu reset failures. Also document this
somewhat. (Christian)

v6:
Rebase on top of the msm drm/sched support. Note that the
drm_sched_job_init() call is completely misplaced, and hence also the
split-out drm_sched_entity_push_job(). I've put in a FIXME which the next
patch will address.

v7: Drop the FIXME in msm, after discussions with Rob I agree it shouldn't
be a problem where it is now.

Acked-by: Christian König <christian.koenig@amd.com>
Acked-by: Melissa Wen <mwen@igalia.com>
Cc: Melissa Wen <melissa.srw@gmail.com>
Acked-by: Emma Anholt <emma@anholt.net>
Acked-by: Steven Price <steven.price@arm.com> (v2)
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> (v5)
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Cc: Lucas Stach <l.stach@pengutronix.de>
Cc: Russell King <linux+etnaviv@armlinux.org.uk>
Cc: Christian Gmeiner <christian.gmeiner@gmail.com>
Cc: Qiang Yu <yuq825@gmail.com>
Cc: Rob Herring <robh@kernel.org>
Cc: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Cc: Steven Price <steven.price@arm.com>
Cc: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Cc: David Airlie <airlied@linux.ie>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: Sumit Semwal <sumit.semwal@linaro.org>
Cc: "Christian König" <christian.koenig@amd.com>
Cc: Masahiro Yamada <masahiroy@kernel.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: Adam Borowski <kilobyte@angband.pl>
Cc: Nick Terrell <terrelln@fb.com>
Cc: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Cc: Paul Menzel <pmenzel@molgen.mpg.de>
Cc: Sami Tolvanen <samitolvanen@google.com>
Cc: Viresh Kumar <viresh.kumar@linaro.org>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Dave Airlie <airlied@redhat.com>
Cc: Nirmoy Das <nirmoy.das@amd.com>
Cc: Deepak R Varma <mh12gx2825@gmail.com>
Cc: Lee Jones <lee.jones@linaro.org>
Cc: Kevin Wang <kevin1.wang@amd.com>
Cc: Chen Li <chenli@uniontech.com>
Cc: Luben Tuikov <luben.tuikov@amd.com>
Cc: "Marek Olšák" <marek.olsak@amd.com>
Cc: Dennis Li <Dennis.Li@amd.com>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Cc: Sonny Jiang <sonny.jiang@amd.com>
Cc: Boris Brezillon <boris.brezillon@collabora.com>
Cc: Tian Tao <tiantao6@hisilicon.com>
Cc: etnaviv@lists.freedesktop.org
Cc: lima@lists.freedesktop.org
Cc: linux-media@vger.kernel.org
Cc: linaro-mm-sig@lists.linaro.org
Cc: Emma Anholt <emma@anholt.net>
Cc: Rob Clark <robdclark@gmail.com>
Cc: Sean Paul <sean@poorly.run>
Cc: linux-arm-msm@vger.kernel.org
Cc: freedreno@lists.freedesktop.org
Link: https://patchwork.freedesktop.org/patch/msgid/20210817084917.3555822-1-daniel.vetter@ffwll.ch
2021-08-30 10:50:44 +02:00
Lukas Bulwahn
5e12f7ea4a drm: v3d: correct reference to config ARCH_BRCMSTB
Commit 57692c94dc ("drm/v3d: Introduce a new DRM driver for Broadcom V3D
V3.x+") adds the config DRM_V3D, which depends on "ARCH_BCMSTB".

Although, a bit confusing: all Broadcom architectures in
./arch/arm/mach-bcm/Kconfig have the prefix "ARCH_BCM", except for
ARCH_BRCMSTB, i.e., the config for Broadcom BCM7XXX based boards.

So, correct the reference ARCH_BCMSTB to the intended ARCH_BRCMSTB.

Signed-off-by: Lukas Bulwahn <lukas.bulwahn@gmail.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20210819112253.16484-4-lukas.bulwahn@gmail.com
2021-08-26 11:19:08 +02:00
Juan A. Suarez Romero
26a4dc29b7 drm/v3d: Expose performance counters to userspace
The V3D engine has several hardware performance counters that can of
interest for userspace performance analysis tools.

This exposes new ioctls to create and destroy performance monitor
objects, as well as to query the counter values.

Each created performance monitor object has an ID that can be attached
to CL/CSD submissions, so the driver enables the requested counters when
the job is submitted, and updates the performance monitor values when
the job is done.

It is up to the user to ensure all the jobs have been finished before
getting the performance monitor values. It is also up to the user to
properly synchronize BCL jobs when submitting jobs with different
performance monitors attached.

Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: David Airlie <airlied@linux.ie>
Cc: Emma Anholt <emma@anholt.net>
To: dri-devel@lists.freedesktop.org
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Acked-by: Melissa Wen <mwen@igalia.com>
Signed-off-by: Melissa Wen <melissa.srw@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210608111541.461991-1-jasuarez@igalia.com
2021-07-21 00:19:59 +01:00
Boris Brezillon
78efe21b6f drm/sched: Allow using a dedicated workqueue for the timeout/fault tdr
Mali Midgard/Bifrost GPUs have 3 hardware queues but only a global GPU
reset. This leads to extra complexity when we need to synchronize timeout
works with the reset work. One solution to address that is to have an
ordered workqueue at the driver level that will be used by the different
schedulers to queue their timeout work. Thanks to the serialization
provided by the ordered workqueue we are guaranteed that timeout
handlers are executed sequentially, and can thus easily reset the GPU
from the timeout handler without extra synchronization.

v5:
* Add a new paragraph to the timedout_job() method

v3:
* New patch

v4:
* Actually use the timeout_wq to queue the timeout work

Suggested-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Steven Price <steven.price@arm.com>
Reviewed-by: Lucas Stach <l.stach@pengutronix.de>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Acked-by: Christian König <christian.koenig@amd.com>
Cc: Qiang Yu <yuq825@gmail.com>
Cc: Emma Anholt <emma@anholt.net>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: "Christian König" <christian.koenig@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210630062751.2832545-3-boris.brezillon@collabora.com
2021-07-01 08:53:25 +02:00
Christian König
f2f12eb9c3 drm/scheduler: provide scheduler score externally
Allow multiple schedulers to share the load balancing score.

This is useful when one engine has different hw rings.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-and-Tested-by: Leo Liu <leo.liu@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210204144405.2737-1-christian.koenig@amd.com
2021-02-05 10:47:11 +01:00
Christian König
576a08e008 drm/v3d/v3d_sched: fix scheduler callbacks return status
Looks like this was not correctly adjusted.

Signed-off-by: Christian König <christian.koenig@amd.com>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Fixes: a6a1f036c7 ("drm/scheduler: Job timeout handler returns status (v3)")
Link: https://patchwork.freedesktop.org/patch/msgid/20210201091159.177853-1-christian.koenig@amd.com
2021-02-02 11:10:14 +01:00
Luben Tuikov
a6a1f036c7 drm/scheduler: Job timeout handler returns status (v3)
This patch does not change current behaviour.

The driver's job timeout handler now returns
status indicating back to the DRM layer whether
the device (GPU) is no longer available, such as
after it's been unplugged, or whether all is
normal, i.e. current behaviour.

All drivers which make use of the
drm_sched_backend_ops' .timedout_job() callback
have been accordingly renamed and return the
would've-been default value of
DRM_GPU_SCHED_STAT_NOMINAL to restart the task's
timeout timer--this is the old behaviour, and is
preserved by this patch.

v2: Use enum as the status of a driver's job
    timeout callback method.

v3: Return scheduler/device information, rather
    than task information.

Cc: Alexander Deucher <Alexander.Deucher@amd.com>
Cc: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com>
Cc: Christian König <christian.koenig@amd.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Lucas Stach <l.stach@pengutronix.de>
Cc: Russell King <linux+etnaviv@armlinux.org.uk>
Cc: Christian Gmeiner <christian.gmeiner@gmail.com>
Cc: Qiang Yu <yuq825@gmail.com>
Cc: Rob Herring <robh@kernel.org>
Cc: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Cc: Steven Price <steven.price@arm.com>
Cc: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Cc: Eric Anholt <eric@anholt.net>
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Acked-by: Christian König <christian.koenig@amd.com>
Acked-by: Steven Price <steven.price@arm.com>
Signed-off-by: Christian König <christian.koenig@amd.com>
Link: https://patchwork.freedesktop.org/patch/415095/
2021-01-29 11:30:22 +01:00
Phil Elwell
545d9d7802
drm/v3d: Don't clear MMU control bits on exception
MMU exception conditions are reported in the V3D_MMU_CTRL register as
write-1-to-clear (W1C) bits. The MMU interrupt handling code clears any
exceptions, but does so by masking out any other bits and writing the
result back. There are some important control bits in that register,
including MMU_ENABLE, so a safer approach is to simply write back the
value just read unaltered.

Signed-off-by: Phil Elwell <phil@raspberrypi.org>
Signed-off-by: Stefan Wahren <stefan.wahren@i2se.com>
Reviewed-by: Nicolas Saenz Julienne <nsaenzjulienne@suse.de>
Signed-off-by: Maxime Ripard <maxime@cerno.tech>
Link: https://patchwork.freedesktop.org/patch/msgid/1608755714-18233-4-git-send-email-stefan.wahren@i2se.com
2021-01-06 18:38:57 +01:00
Phil Elwell
334dd38a38
drm/v3d: Set dma_mask as well as coherent_dma_mask
Both coherent_dma_mask and dma_mask act as constraints on allocations
and bounce buffer usage, so be sure to set dma_mask to the appropriate
value otherwise the effective mask could be incorrect.

Signed-off-by: Phil Elwell <phil@raspberrypi.org>
Signed-off-by: Stefan Wahren <stefan.wahren@i2se.com>
Reviewed-by: Nicolas Saenz Julienne <nsaenzjulienne@suse.de>
Signed-off-by: Maxime Ripard <maxime@cerno.tech>
Link: https://patchwork.freedesktop.org/patch/msgid/1608755714-18233-3-git-send-email-stefan.wahren@i2se.com
2021-01-06 18:38:57 +01:00
Nicolas Saenz Julienne
f4f3beb779
drm/v3d: Use platform_get_irq_optional() to get optional IRQs
Aside from being more correct, the non optional version of the function
prints an error when failing to find the IRQ.

Signed-off-by: Nicolas Saenz Julienne <nsaenzjulienne@suse.de>
Signed-off-by: Stefan Wahren <stefan.wahren@i2se.com>
Signed-off-by: Maxime Ripard <maxime@cerno.tech>
Link: https://patchwork.freedesktop.org/patch/msgid/1608755714-18233-2-git-send-email-stefan.wahren@i2se.com
2021-01-06 18:38:57 +01:00
Thomas Zimmermann
0cf2ef46c6 drm/shmem-helper: Use cached mappings by default
SHMEM-buffer backing storage is allocated from system memory; which is
typically cachable. The default mode for SHMEM objects is writecombine
though.

Unify SHMEM semantics by defaulting to cached mappings. The exception
is pages imported via dma-buf. DMA memory is usually not cached.

DRM drivers that require write-combined mappings set the map_wc flag
in struct drm_gem_shmem_object to true. This currently affects lima,
panfrost and v3d.

The drivers mgag200, udl, virtio and vkms continue to use default
shmem mappings.

The drivers cirrus and gm12u320 change caching flags. Both used
writecombine and now switch over to shmem defaults. Both drivers use
SHMEM objects as shadow buffers for internal video memory, so cached
mappings will not affect them negatively.

v3:
	* set value of shmem pointer before dereferencing it in
	  __drm_gem_shmem_create() (Dan, kernel test robot)
v2:
	* recreate patch on top of latest SHMEM helpers
	* update lima, panfrost, v3d to select writecombine (Daniel, Rob)

Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Acked-by: Maxime Ripard <mripard@kernel.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20201117133156.26822-2-tzimmermann@suse.de
2020-11-24 09:10:21 +01:00
Lee Jones
d49c4b2c07 drm/v3d/v3d_sched: Demote non-conformant kernel-doc header
Fixes the following W=1 kernel build warning(s):

 drivers/gpu/drm/v3d/v3d_sched.c:75: warning: Function parameter or member 'sched_job' not described in 'v3d_job_dependency'
 drivers/gpu/drm/v3d/v3d_sched.c:75: warning: Function parameter or member 's_entity' not described in 'v3d_job_dependency'

Cc: Eric Anholt <eric@anholt.net>
Cc: David Airlie <airlied@linux.ie>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: Sumit Semwal <sumit.semwal@linaro.org>
Cc: "Christian König" <christian.koenig@amd.com>
Cc: dri-devel@lists.freedesktop.org
Cc: linux-media@vger.kernel.org
Cc: linaro-mm-sig@lists.linaro.org
Signed-off-by: Lee Jones <lee.jones@linaro.org>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20201116174112.1833368-37-lee.jones@linaro.org
2020-11-18 11:51:27 +01:00
Lee Jones
e226878e13 drm/v3d/v3d_gem: Provide descriptions for 'v3d_lookup_bos's params
Fixes the following W=1 kernel build warning(s):

 drivers/gpu/drm/v3d/v3d_gem.c:292: warning: Function parameter or member 'bo_handles' not described in 'v3d_lookup_bos'
 drivers/gpu/drm/v3d/v3d_gem.c:292: warning: Function parameter or member 'bo_count' not described in 'v3d_lookup_bos'

Cc: Eric Anholt <eric@anholt.net>
Cc: David Airlie <airlied@linux.ie>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: Philipp Zabel <p.zabel@pengutronix.de>
Cc: Sumit Semwal <sumit.semwal@linaro.org>
Cc: "Christian König" <christian.koenig@amd.com>
Cc: dri-devel@lists.freedesktop.org
Cc: linux-media@vger.kernel.org
Cc: linaro-mm-sig@lists.linaro.org
Signed-off-by: Lee Jones <lee.jones@linaro.org>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20201116174112.1833368-36-lee.jones@linaro.org
2020-11-17 20:26:47 +01:00
Lee Jones
8b6864e3e1 drm/v3d/v3d_drv: Remove unused static variable 'v3d_v3d_pm_ops'
Fixes the following W=1 kernel build warning(s):

 drivers/gpu/drm/v3d/v3d_drv.c:73:32: warning: ‘v3d_v3d_pm_ops’ defined but not used [-Wunused-const-variable=]

Cc: Eric Anholt <eric@anholt.net>
Cc: David Airlie <airlied@linux.ie>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: Philipp Zabel <p.zabel@pengutronix.de>
Cc: dri-devel@lists.freedesktop.org
Signed-off-by: Lee Jones <lee.jones@linaro.org>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20201116174112.1833368-35-lee.jones@linaro.org
2020-11-17 20:22:43 +01:00
Mauro Carvalho Chehab
e9d2871f69 drm: fix some kernel-doc markups
Some identifiers have different names between their prototypes
and the kernel-doc markup.

Others need to be fixed, as kernel-doc markups should use this format:
        identifier - description

Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Acked-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/12d4ca26f6843618200529ce5445063734d38c04.1605521731.git.mchehab+huawei@kernel.org
2020-11-16 20:48:20 +01:00
Daniel Vetter
70a59dd829 drm/<drivers>: Constify struct drm_driver
Only the following drivers aren't converted:
- amdgpu, because of the driver_feature mangling due to virt support.
  Subsequent patch will address this.
- nouveau, because DRIVER_ATOMIC uapi is still not the default on the
  platforms where it's supported (i.e. again driver_feature mangling)
- vc4, again because of driver_feature mangling
- qxl, because the ioctl table is somewhere else and moving that is
  maybe a bit too much, hence the num_ioctls assignment prevents a
  const driver structure.
- arcpgu, because that is stuck behind a pending tiny-fication series
  from me.
- legacy drivers, because legacy requires non-const drm_driver.

Note that for armada I also went ahead and made the ioctl array const.

Only cc'ing the driver people who've not been converted (everyone else
is way too much).

v2: Fix one misplaced const static, should be static const (0day)

v3:
- Improve commit message (Sam)

Acked-by: Sam Ravnborg <sam@ravnborg.org>
Cc: kernel test robot <lkp@intel.com>
Acked-by: Maxime Ripard <mripard@kernel.org>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Sam Ravnborg <sam@ravnborg.org>
Cc: Dave Airlie <airlied@redhat.com>
Cc: Gerd Hoffmann <kraxel@redhat.com>
Cc: virtualization@lists.linux-foundation.org
Cc: Harry Wentland <harry.wentland@amd.com>
Cc: Leo Li <sunpeng.li@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Christian König <christian.koenig@amd.com>
Cc: Eric Anholt <eric@anholt.net>
Cc: Maxime Ripard <mripard@kernel.org>
Cc: Ben Skeggs <bskeggs@redhat.com>
Cc: nouveau@lists.freedesktop.org
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201104100425.1922351-5-daniel.vetter@ffwll.ch
2020-11-06 10:31:26 +01:00
Dan Carpenter
897dbea6b7
drm/v3d: Fix double free in v3d_submit_cl_ioctl()
Originally this error path used to leak "bin" but then we accidentally
applied two separate commits to fix it and ended up with a double free.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Maxime Ripard <maxime@cerno.tech>
Link: https://patchwork.freedesktop.org/patch/msgid/20201026094905.GA1634423@mwanda
2020-10-26 11:43:31 +01:00
Dave Airlie
b40be05ed2 Merge branch 'for-5.10-drm-sg-fix' of https://github.com/mszyprow/linux into drm-next
Please pull a set of fixes for various DRM drivers that finally resolve
incorrect usage of the scatterlists (struct sg_table nents and orig_nents
entries), what causes issues when IOMMU is used.

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Marek Szyprowski <m.szyprowski@samsung.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200910080505.24456-1-m.szyprowski@samsung.com
2020-09-17 16:07:11 +10:00
Marek Szyprowski
e96418da0a drm: v3d: fix common struct sg_table related issues
The Documentation/DMA-API-HOWTO.txt states that the dma_map_sg() function
returns the number of the created entries in the DMA address space.
However the subsequent calls to the dma_sync_sg_for_{device,cpu}() and
dma_unmap_sg must be called with the original number of the entries
passed to the dma_map_sg().

struct sg_table is a common structure used for describing a non-contiguous
memory buffer, used commonly in the DRM and graphics subsystems. It
consists of a scatterlist with memory pages and DMA addresses (sgl entry),
as well as the number of scatterlist entries: CPU pages (orig_nents entry)
and DMA mapped pages (nents entry).

It turned out that it was a common mistake to misuse nents and orig_nents
entries, calling DMA-mapping functions with a wrong number of entries or
ignoring the number of mapped entries returned by the dma_map_sg()
function.

To avoid such issues, lets use a common dma-mapping wrappers operating
directly on the struct sg_table objects and use scatterlist page
iterators where possible. This, almost always, hides references to the
nents and orig_nents entries, making the code robust, easier to follow
and copy/paste safe.

Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2020-09-10 08:18:35 +02:00
Qinglang Miao
ad28cd6942 drm/v3d: convert to use module_platform_driver
Get rid of boilerplate code by using module_platform_driver macro
for v3d_drm.

Signed-off-by: Qinglang Miao <miaoqinglang@huawei.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20200810125931.186456-1-miaoqinglang@huawei.com
2020-08-10 15:32:00 +02:00
Emil Velikov
2b86189e4d drm/v3d: remove _unlocked suffix in drm_gem_object_put_unlocked
Spelling out _unlocked for each and every driver is a annoying.
Especially if we consider how many drivers, do not know (or need to)
about the horror stories involving struct_mutex.

Just drop the suffix. It makes the API cleaner.

Done via the following script:

__from=drm_gem_object_put_unlocked
__to=drm_gem_object_put
for __file in $(git grep --name-only $__from); do
  sed -i  "s/$__from/$__to/g" $__file;
done

Cc: Eric Anholt <eric@anholt.net>
Cc: David Airlie <airlied@linux.ie>
Cc: Daniel Vetter <daniel@ffwll.ch>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Acked-by: Sam Ravnborg <sam@ravnborg.org>
Acked-by: Thomas Zimmermann <tzimmermann@suse.de>
Link: https://patchwork.freedesktop.org/patch/msgid/20200515095118.2743122-33-emil.l.velikov@gmail.com
2020-05-19 22:31:35 +01:00
Daniel Vetter
0df3ac7657 drm/v3d: Delete v3d_dev->pdev
We already have it in v3d_dev->drm.dev with zero additional pointer
chasing. Personally I don't like duplicated pointers like this
because:
- reviewers need to check whether the pointer is for the same or
different objects if there's multiple
- compilers have an easier time too

To avoid having to pull in some big headers I implemented the casting
function as a macro instead of a static inline. Typechecking thanks to
container_of still assured.

But also a bit a bikeshed, so feel free to ignore.

v2: More parens for v3d_to_pdev macro (checkpatch)

Acked-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Cc: Eric Anholt <eric@anholt.net>
Link: https://patchwork.freedesktop.org/patch/msgid/20200415074034.175360-11-daniel.vetter@ffwll.ch
2020-04-28 15:15:59 +02:00
Daniel Vetter
bc662528e2 drm/v3d: Delete v3d_dev->dev
We already have it in v3d_dev->drm.dev with zero additional pointer
chasing. Personally I don't like duplicated pointers like this
because:
- reviewers need to check whether the pointer is for the same or
  different objects if there's multiple
- compilers have an easier time too

But also a bit a bikeshed, so feel free to ignore.

Acked-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Cc: Eric Anholt <eric@anholt.net>
Link: https://patchwork.freedesktop.org/patch/msgid/20200415074034.175360-10-daniel.vetter@ffwll.ch
2020-04-28 15:15:52 +02:00
Daniel Vetter
235b7e7d7e drm/v3d: Use devm_drm_dev_alloc
Also allows us to simplify the unroll code since the drm_dev_put
disappears.

Acked-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Cc: Eric Anholt <eric@anholt.net>
Link: https://patchwork.freedesktop.org/patch/msgid/20200415074034.175360-9-daniel.vetter@ffwll.ch
2020-04-28 15:15:46 +02:00
Daniel Vetter
af25c16bd1 drm/v3d: Don't set drm_device->dev_private
And switch the helper over to container_of, which is a bunch faster
than chasing a pointer. Plus allows gcc to see through this maze.

Acked-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Cc: Eric Anholt <eric@anholt.net>
Link: https://patchwork.freedesktop.org/patch/msgid/20200415074034.175360-8-daniel.vetter@ffwll.ch
2020-04-28 15:15:41 +02:00
Daniel Vetter
ea3aa6203c drm/v3d: Use drmm_add_final_kfree
With this we can drop the final kfree from the release function.

I also noticed that the unwind code is wrong, after drm_dev_init the
drm_device owns the v3d allocation, so the kfree(v3d) is a double-free.
Reorder the setup to fix this issue.

After a bit more prep in drivers and drm core v3d should be able to
switch over to devm_drm_dev_init, which should clean this up further.

Acked-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Cc: Eric Anholt <eric@anholt.net>
Link: https://patchwork.freedesktop.org/patch/msgid/20200323144950.3018436-11-daniel.vetter@ffwll.ch
2020-03-26 15:18:11 +01:00
Wambui Karuga
7ce84471e3 drm: convert .debugfs_init() hook to return void.
As a result of commit 987d65d013 (drm: debugfs: make
drm_debugfs_create_files() never fail) and changes to various debugfs
functions in drm/core and across various drivers, there is no need for
the drm_driver.debugfs_init() hook to have a return value. Therefore,
declare it as void.

This also includes refactoring all users of the .debugfs_init() hook to
return void across the subsystem.

v2: include changes to the hook and drivers that use it in one patch to
prevent driver breakage and enable individual successful compilation of
this change.

References: https://lists.freedesktop.org/archives/dri-devel/2020-February/257183.html
Signed-off-by: Wambui Karuga <wambui.karugax@gmail.com>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20200310133121.27913-18-wambui.karugax@gmail.com
2020-03-18 17:53:28 +01:00
Wambui Karuga
19660e8903 drm/v3d: make v3d_debugfs_init() return 0
Since commit 987d65d013 (drm: debugfs: make
drm_debugfs_create_files() never fail), drm_debugfs_create_files() never
fails and should return void. Therefore, remove its use as the
return value of v3d_debugfs_init() and have the function return 0
directly instead.

v2: remove conversion of v3d_debugfs_init() to void to avoid build
breakage and enable individual compilation.

References: https://lists.freedesktop.org/archives/dri-devel/2020-February/257183.html
Signed-off-by: Wambui Karuga <wambui.karugax@gmail.com>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20200310133121.27913-4-wambui.karugax@gmail.com
2020-03-18 16:22:12 +01:00
James Hughes
9daee6141c drm/v3d: Replace wait_for macros to remove use of msleep
The wait_for macro's for Broadcom V3D driver used msleep, which is
inappropriate due to its inaccuracy at low values (minimum wait time
is about 30ms on the Raspberry Pi).  This sleep was triggering in
v3d_clean_caches(), causing us to only be able to dispatch ~33 compute
jobs per second.

This patch replaces the macro with the one from the Intel i915 version
which uses usleep_range to provide more accurate waits.

v2: Split from the vc4 patch so that we can confidently apply to
    stable (by anholt)

Signed-off-by: James Hughes <james.hughes@raspberrypi.com>
Signed-off-by: Eric Anholt <eric@anholt.net>
Link: https://patchwork.freedesktop.org/patch/msgid/20200217153145.13780-1-james.hughes@raspberrypi.com
Link: https://github.com/raspberrypi/linux/issues/3460
Fixes: 57692c94dc ("drm/v3d: Introduce a new DRM driver for Broadcom V3D V3.x+")
2020-03-04 22:15:34 -08:00
Nirmoy Das
b3ac17667f drm/scheduler: rework entity creation
Entity currently keeps a copy of run_queue list and modify it in
drm_sched_entity_set_priority(). Entities shouldn't modify run_queue
list. Use drm_gpu_scheduler list instead of drm_sched_rq list
in drm_sched_entity struct. In this way we can select a runqueue based
on entity/ctx's priority for a  drm scheduler.

Signed-off-by: Nirmoy Das <nirmoy.das@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-12-18 16:09:12 -05:00
Dave Airlie
77e0723bd2 Linux 5.4-rc7
-----BEGIN PGP SIGNATURE-----
 
 iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAl3IqJQeHHRvcnZhbGRz
 QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGOiUH+gOEDwid5OODaFAd
 CggXugdFIlBZefKqGVNW5sjgX8pxFWHXuEMC8iNb6QXtQZdFrI6LFf9hhUDmzQtm
 6y1LPxxEiTZjObMEsBNylb7tyzgujFHcAlp0Zro3w/HLCqmYTSP3FF46i2u6KZfL
 XhkpM4X7R7qxlfpdhlfESv/ElRGocZe6SwXfC7pcPo5flFcmkdu9ijqhNd/6CZ/h
 Nf9rTsD/wEDVUelFbgVN+LJzlaB0tsyc4Zbof07n8OsFZjhdEOop8gfM/kTBLcyY
 6bh66SfDScdsNnC/l8csbPjSZRx+i+nQs67DyhGNnsSAFgHBZdC4Tb/2mDCwhCLR
 dUvuYZc=
 =1N6F
 -----END PGP SIGNATURE-----

Merge v5.4-rc7 into drm-next

We have the i915 security fixes to backmerge, but first
let's clear the decks for other drivers to avoid a bigger
mess.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2019-11-14 05:53:10 +10:00
Navid Emamdoost
29cd13cfd7 drm/v3d: Fix memory leak in v3d_submit_cl_ioctl
In the impelementation of v3d_submit_cl_ioctl() there are two memory
leaks. One is when allocation for bin fails, and the other is when bin
initialization fails. If kcalloc fails to allocate memory for bin then
render->base should be put. Also, if v3d_job_init() fails to initialize
bin->base then allocated memory for bin should be released.

Fixes: a783a09ee7 ("drm/v3d: Refactor job management.")
Signed-off-by: Navid Emamdoost <navid.emamdoost@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20191021185250.26130-1-navid.emamdoost@gmail.com
2019-10-23 20:57:05 +02:00
Gerd Hoffmann
eee9a2e0ad drm/shmem: drop DEFINE_DRM_GEM_SHMEM_FOPS
DEFINE_DRM_GEM_SHMEM_FOPS is identical
to DEFINE_DRM_GEM_FOPS now, drop it.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Acked-by: Rob Herring <robh@kernel.org>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/20191016115203.20095-6-kraxel@redhat.com
2019-10-17 13:59:16 +02:00
Gerd Hoffmann
0be8958936 drm/shmem: switch shmem helper to &drm_gem_object_funcs.mmap
Switch gem shmem helper to the new mmap() workflow,
from &gem_driver.fops.mmap to &drm_gem_object_funcs.mmap.

v2: Fix vm_flags and vm_page_prot handling.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Steven Price <steven.price@arm.com>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/20191016115203.20095-3-kraxel@redhat.com
2019-10-17 13:59:16 +02:00
Iago Toral Quiroga
455d56ce80 drm/v3d: clean caches at the end of render jobs on request from user space
Extends the user space ioctl for CL submissions so it can include a request
to flush the cache once the CL execution has completed. Fixes memory
write violation messages reported by the kernel in workloads involving
shader memory writes (SSBOs, shader images, scratch, etc) which sometimes
also lead to GPU resets during Piglit and CTS workloads.

v2: if v3d_job_init() fails we need to kfree() the job instead of
    v3d_job_put() it (Eric Anholt).

v3 (Eric Anholt):
  - Drop _FLAG suffix from the new flag name.
  - Add a new param so userspace can tell whether cache flushing is
    implemented in the kernel.

Signed-off-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Eric Anholt <eric@anholt.net>
Link: https://patchwork.freedesktop.org/patch/msgid/20190919071016.4578-1-itoral@igalia.com
2019-09-19 12:57:44 -07:00
Iago Toral Quiroga
0d352a3a8a drm/v3d: don't leak bin job if v3d_job_init fails.
If the initialization of the job fails we need to kfree() it
before returning.

Signed-off-by: Iago Toral Quiroga <itoral@igalia.com>
Signed-off-by: Eric Anholt <eric@anholt.net>
Link: https://patchwork.freedesktop.org/patch/msgid/20190916071125.5255-1-itoral@igalia.com
Fixes: a783a09ee7 ("drm/v3d: Refactor job management.")
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-09-18 10:30:44 -07:00
Christian König
52791eeec1 dma-buf: rename reservation_object to dma_resv
Be more consistent with the naming of the other DMA-buf objects.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/323401/
2019-08-13 09:09:30 +02:00
Sam Ravnborg
220989e709 drm/v3d: drop use of drmP.h
Drop use of the deprecated drmP.h header file.
Made v3d_drv.h self-contained with only sufficient
include files.
Fixed fallout in remaining files.

Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Acked-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Cc: David Airlie <airlied@linux.ie>
Cc: Daniel Vetter <daniel@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20190716064220.18157-3-sam@ravnborg.org
2019-07-17 12:52:20 +02:00