Merge v7.0-rc7 into drm-next
Thomas Zimmermann needs 2f42c1a616 ("drm/ast: dp501: Fix
initialization of SCU2C") for drm-misc-next.
Conflicts:
- drivers/gpu/drm/amd/display/dc/hwss/dcn401/dcn401_hwseq.c
Just between e927b36ae1 ("drm/amd/display: Fix NULL pointer
dereference in dcn401_init_hw()") and its cherry-pick that confused
git.
- drivers/gpu/drm/amd/pm/swsmu/smu11/smu_v11_0.c
Deleted in 6b0a611628 ("drm/amd/pm: Unify version check in SMUv11")
but some cherry-picks confused git. Same for v12/v14.
Signed-off-by: Simona Vetter <simona.vetter@ffwll.ch>
When an SVM is closed, the garbage collector work item must be stopped
synchronously and any future queuing must be prevented. Replace
flush_work() with disable_work_sync() to ensure both conditions are
met.
Fixes: 63f6e480d1 ("drm/xe: Add SVM garbage collector")
Cc: stable@vger.kernel.org
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Link: https://patch.msgid.link/20260227015225.3081787-1-matthew.brost@intel.com
(cherry picked from commit 2247feb9badca5a4774df9a437bfc44fba4f22de)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Track purgeable state per-VMA instead of using a coarse shared
BO check. This prevents purging shared BOs until all VMAs across
all VMs are marked DONTNEED.
Add xe_bo_all_vmas_dontneed() to check all VMAs before marking
a BO purgeable. Add xe_bo_recheck_purgeable_on_vma_unbind() to
handle state transitions when VMAs are destroyed - if all
remaining VMAs are DONTNEED the BO can become purgeable, or if
no VMAs remain it transitions to WILLNEED.
The per-VMA purgeable_state field stores the madvise hint for
each mapping. Shared BOs can only be purged when all VMAs
unanimously indicate DONTNEED.
This prevents the bug where unmapping the last VMA would incorrectly
flip a DONTNEED BO back to WILLNEED. The enum-based state check
preserves BO state when no VMAs remain, only updating when VMAs provide
explicit hints.
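The unanimity rule can be sketched in plain user-space C. The enum,
struct and function names below are hypothetical stand-ins, not the
driver's definitions. The sketch only models "purgeable when every VMA
says DONTNEED" and reports the no-VMA case separately, so that a caller
can leave the BO state untouched in that case.

```c
#include <stddef.h>

/* Hypothetical per-mapping hint; not the driver's actual enum. */
enum demo_hint { DEMO_WILLNEED, DEMO_DONTNEED };

/* Outcome of scanning all mappings of one buffer object. */
enum demo_scan {
	DEMO_SCAN_NO_VMAS,      /* nothing mapped: caller keeps the BO state */
	DEMO_SCAN_KEEP,         /* at least one mapping still needs the BO   */
	DEMO_SCAN_ALL_DONTNEED, /* unanimous: the BO may become purgeable    */
};

static enum demo_scan demo_scan_vmas(const enum demo_hint *hints, size_t n)
{
	size_t i;

	if (n == 0)
		return DEMO_SCAN_NO_VMAS;

	/* A single mapping that is not DONTNEED vetoes purging. */
	for (i = 0; i < n; i++)
		if (hints[i] != DEMO_DONTNEED)
			return DEMO_SCAN_KEEP;

	return DEMO_SCAN_ALL_DONTNEED;
}
```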
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Signed-off-by: Arvind Yadav <arvind.yadav@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patch.msgid.link/20260326130843.3545241-7-arvind.yadav@intel.com
Break down the GT stats for copy to host and copy to device per size (4K,
64K, 2M) to make it easier for user space to track memory migrations.
This is helpful to verify allocation alignment is correct when porting
applications to SVM.
Cc: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Francois Dugast <francois.dugast@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patch.msgid.link/20260325160152.1057556-1-francois.dugast@intel.com
xe_vm_range_tilemask_tlb_inval() submits TLB invalidation requests to
all GTs in a tile mask and then immediately waits for them to complete
before returning. This is fine for the existing callers, but a
subsequent patch will need to defer the wait in order to overlap TLB
invalidations across multiple VMAs.
Introduce xe_tlb_inval_range_tilemask_submit() and
xe_tlb_inval_batch_wait() in xe_tlb_inval.c as the submit and wait
halves respectively. The batch of fences is carried in the new
xe_tlb_inval_batch structure. Remove xe_vm_range_tilemask_tlb_inval()
and convert all three call sites to the new API.
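The submit/wait split can be illustrated with a small user-space model.
Everything named demo_* below is hypothetical and only mimics the shape
of the idea: the submit half queues one request per bit in a mask and
returns without blocking, and the wait half is called later, so several
submissions can be in flight before the first wait. The model also
follows the rule that a batch is not waited on if its submit failed.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#define DEMO_MAX_GTS 4

struct demo_fence { bool signaled; };

/* Hypothetical container carrying the fences of one submission. */
struct demo_batch {
	struct demo_fence fences[DEMO_MAX_GTS];
	size_t num;
};

/* Submit half: queue one request per set bit in mask, do not wait. */
static int demo_submit(struct demo_batch *batch, unsigned int mask)
{
	unsigned int bit;

	batch->num = 0;
	if (mask >> DEMO_MAX_GTS)
		return -EINVAL; /* caller must not wait on this batch */

	for (bit = 0; bit < DEMO_MAX_GTS; bit++) {
		if (!(mask & (1u << bit)))
			continue;
		batch->fences[batch->num++].signaled = false;
	}
	return 0;
}

/* Wait half: returns how many fences were waited for. */
static size_t demo_wait(struct demo_batch *batch)
{
	size_t i, waited = batch->num;

	for (i = 0; i < batch->num; i++)
		batch->fences[i].signaled = true; /* stand-in for a blocking wait */
	batch->num = 0;
	return waited;
}
```

A caller would submit for each VMA first and only then wait on each
batch, skipping the wait for any batch whose submit returned an error.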
v3:
- Don't wait on TLB invalidation batches if the corresponding batch
submit returns an error. (Matt Brost)
- s/_batch/batch/ (Matt Brost)
Assisted-by: GitHub Copilot:claude-sonnet-4.6
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patch.msgid.link/20260305093909.43623-4-thomas.hellstrom@linux.intel.com
This was done entirely with mindless brute force, using
    git grep -l '\<k[vmz]*alloc_objs*(.*, GFP_KERNEL)' |
        xargs sed -i 's/\(alloc_objs*(.*\), GFP_KERNEL)/\1)/'
to convert the new alloc_obj() users that had a simple GFP_KERNEL
argument to just drop that argument.
Note that due to the extreme simplicity of the scripting, any slightly
more complex cases spread over multiple lines would not be triggered:
they definitely exist, but this covers the vast bulk of the cases, and
the resulting diff is also then easier to check automatically.
For the same reason the 'flex' versions will be done as a separate
conversion.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This is the result of running the Coccinelle script from
scripts/coccinelle/api/kmalloc_objs.cocci. The script is designed to
avoid scalar types (which need careful case-by-case checking), and
instead replace kmalloc-family calls that allocate struct or union
object instances:
  Single allocations:      kmalloc(sizeof(TYPE), ...)
  are replaced with:       kmalloc_obj(TYPE, ...)
  Array allocations:       kmalloc_array(COUNT, sizeof(TYPE), ...)
  are replaced with:       kmalloc_objs(TYPE, COUNT, ...)
  Flex array allocations:  kmalloc(struct_size(PTR, FAM, COUNT), ...)
  are replaced with:       kmalloc_flex(*PTR, FAM, COUNT, ...)
  (where TYPE may also be *VAR)
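As a rough user-space illustration of the single and array patterns
listed above, the sketch below builds two hypothetical macros on top of
calloc(). They are not the kernel macros and take no GFP argument; the
demo_* names and the use of the GNU __typeof__ extension are assumptions
made only for this example. The point shown is that the result is a
typed pointer and that TYPE may be written as *VAR.

```c
#include <stdlib.h>

struct demo_item {
	int id;
	int weight;
};

/* Hypothetical stand-ins built on calloc(); not the kernel macros. */
#define demo_alloc_obj(TYPE) \
	((__typeof__(TYPE) *)calloc(1, sizeof(TYPE)))
#define demo_alloc_objs(TYPE, COUNT) \
	((__typeof__(TYPE) *)calloc((COUNT), sizeof(TYPE)))

static struct demo_item *demo_make_items(size_t count)
{
	/* Open-coded form: malloc(count * sizeof(struct demo_item)). */
	struct demo_item *items = demo_alloc_objs(struct demo_item, count);
	size_t i;

	if (!items)
		return NULL;
	for (i = 0; i < count; i++)
		items[i].id = (int)i;
	return items;
}
```

With these stand-ins, `struct demo_item *one = demo_alloc_obj(*one);`
also compiles, matching the "*VAR" form.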
The resulting allocations no longer return "void *", instead returning
"TYPE *".
Signed-off-by: Kees Cook <kees@kernel.org>
Passing a structure by value into a function is sometimes problematic,
for a number of reasons. One of these is a warning from the 32-bit arm
compiler:
drivers/gpu/drm/drm_gpusvm.c: In function '__drm_gpusvm_unmap_pages':
drivers/gpu/drm/drm_gpusvm.c:1152:33: note: parameter passing for argument of type 'struct drm_pagemap_addr' changed in GCC 9.1
1152 | dpagemap->ops->device_unmap(dpagemap,
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1153 | dev, *addr);
| ~~~~~~~~~~~
This particular problem is harmless since we are not mixing compiler versions
inside of the kernel. However, passing this by reference avoids the warning
along with providing slightly better calling conventions as it avoids an
extra copy on the stack.
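The difference between the two calling styles can be shown with a
minimal sketch. The struct and both functions are hypothetical and only
loosely modelled on an address descriptor; they are not the
drm_pagemap definitions.

```c
#include <stdint.h>

/* Hypothetical descriptor used only for this illustration. */
struct demo_addr {
	uint64_t addr;
	unsigned int order;
	unsigned int dir;
};

/* By value: the whole struct is copied for the call. */
static uint64_t demo_end_by_value(struct demo_addr a)
{
	return a.addr + ((uint64_t)4096 << a.order);
}

/* By reference: only a pointer is passed; const documents read-only use. */
static uint64_t demo_end_by_ref(const struct demo_addr *a)
{
	return a->addr + ((uint64_t)4096 << a->order);
}
```

Both variants compute the same result; the by-reference form is the one
that sidesteps the ABI note quoted above.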
Fixes: 75af93b3f5 ("drm/pagemap, drm/xe: Support destination migration over interconnect")
Fixes: 2df55d9e66 ("drm/xe: Support pcie p2p dma as a fast interconnect")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Link: https://patch.msgid.link/20260216134644.1025365-1-arnd@kernel.org
Acked-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
(cherry picked from commit 95162db020)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Move the DRM buddy allocator one level up so that it can be used by GPU
drivers (for example, nova-core) that have use cases other than DRM (such as
VFIO vGPU support). Modify the API, structures and Kconfigs to use
"gpu_buddy" terminology. Adapt the drivers and tests to use the new API.
The commit cannot be split due to bisectability; however, no functional
change is intended. Verified by running KUnit tests and build-testing
various configurations.
Signed-off-by: Joel Fernandes <joelagnelf@nvidia.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
[airlied: I've split this into two so git can find copies easier.
I've also just nuked drm_random library, that stuff needs to be done
elsewhere and only the buddy tests seem to be using it].
Signed-off-by: Dave Airlie <airlied@redhat.com>
Ensure preferred system memory placement is checked in
xe_svm_range_validate when dpagemap is NULL. Without this check, a
prefetch to system memory may become a no-op because device memory is
considered a valid placement.
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Fixes: 238dbc9d9f ("drm/xe: Use the vma attibute drm_pagemap to select where to migrate")
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
Link: https://patch.msgid.link/20260106213443.1866797-1-matthew.brost@intel.com
Introduce an rw-semaphore to serialize migration to device if
it's likely that migration races with another device migration
of the same CPU address space range.
This is a temporary fix to attempt to mitigate a livelock that
might happen if many devices try to migrate a range at the same
time, and it affects only devices using the xe driver.
A longer term fix is probably improvements in the core mm
migration layer.
Suggested-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patch.msgid.link/20251219113320.183860-25-thomas.hellstrom@linux.intel.com
Support destination migration over interconnect when migrating from
device-private pages with the same dev_pagemap owner.
Since we now also collect device-private pages to migrate,
also abort migration if the range to migrate is already
fully populated with pages from the desired pagemap.
Finally return -EBUSY from drm_pagemap_populate_mm()
if the migration can't be completed without first migrating all
pages in the range to system. It is expected that the caller
will perform that before retrying the call to
drm_pagemap_populate_mm().
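The expected caller behaviour on -EBUSY can be sketched as follows.
This is a user-space model with hypothetical demo_* names, not the
drm_pagemap API: the populate stand-in refuses with -EBUSY until the
range has been moved to system memory, and the caller evicts and then
retries once.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical range state; stands in for the pages backing a range. */
struct demo_range {
	bool has_foreign_device_pages;
	bool in_device_memory;
};

/* Stand-in for the populate call: refuses until the range is in system memory. */
static int demo_populate(struct demo_range *r)
{
	if (r->has_foreign_device_pages)
		return -EBUSY;
	r->in_device_memory = true;
	return 0;
}

/* Stand-in for migrating every page of the range to system memory. */
static void demo_evict_to_system(struct demo_range *r)
{
	r->has_foreign_device_pages = false;
	r->in_device_memory = false;
}

/* Caller side: on -EBUSY, migrate to system first, then retry. */
static int demo_migrate(struct demo_range *r)
{
	int err = demo_populate(r);

	if (err == -EBUSY) {
		demo_evict_to_system(r);
		err = demo_populate(r);
	}
	return err;
}
```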
v3:
- Fix a bug where the p2p dma-address was never used.
- Postpone enabling destination interconnect migration,
since xe devices require source interconnect migration to
ensure the source L2 cache is flushed at migration time.
- Update the drm_pagemap_migrate_to_devmem() interface to
pass migration details.
v4:
- Define XE_INTERCONNECT_P2P unconditionally (CI)
- Include a missing header (CI)
v5:
- Use page order increments where possible (Matt Brost).
- Fix a negated value of can_migrate_same_pagemap.
- Move removal of some dead code to a separate patch (Matt Brost).
- Remove an unnecessary zdd get() and put() (Matt Brost).
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Acked-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> # For merging through drm-xe.
Link: https://patch.msgid.link/20251219113320.183860-23-thomas.hellstrom@linux.intel.com
Use drm_gpusvm_scan_mm() to avoid unnecessarily calling into
drm_pagemap_populate_mm().
v3:
- New patch.
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
Link: https://patch.msgid.link/20251219113320.183860-22-thomas.hellstrom@linux.intel.com
Use the dev_pagemap->owner field wherever possible, simplifying
the code slightly.
v3: New patch
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
Acked-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> # For merging through drm-xe.
Link: https://patch.msgid.link/20251219113320.183860-20-thomas.hellstrom@linux.intel.com
As an aid to understanding the lifetime of the drm_pagemaps used
by the xe driver, document how the xe driver keeps the
drm_pagemap references.
v3:
- Fix formatting (Matt Brost)
Suggested-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patch.msgid.link/20251219113320.183860-19-thomas.hellstrom@linux.intel.com
Add debug printouts that are valuable for pagemap prefetch,
migration and page collection.
v2:
- Add additional debug printouts around migration and page collection.
- Require CONFIG_DRM_XE_DEBUG_VM.
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com> #v1
Link: https://patch.msgid.link/20251219113320.183860-18-thomas.hellstrom@linux.intel.com
Mimic the dma-buf method using dma_[map|unmap]_resource to map
for pcie-p2p dma.
There's an ongoing area of work upstream to sort out how this best
should be done. One method proposed is to add an additional
pci_p2p_dma_pagemap aliasing the device_private pagemap and use
the corresponding pci_p2p_dma_pagemap page as input for
dma_map_page(). However, that would incur double the amount of
memory and latency to set up the drm_pagemap and given the huge
amount of memory present on modern GPUs, that would really not work.
Hence the simple approach used in this patch.
v2:
- Simplify xe_page_to_pcie(). (Matt Brost)
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patch.msgid.link/20251219113320.183860-17-thomas.hellstrom@linux.intel.com
Use device file descriptors and regions to represent pagemaps on
foreign or local devices.
The underlying files are type-checked at madvise time, and
references are kept on the drm_pagemap as long as there are
madvises pointing to it.
Extend the madvise preferred_location UAPI to support the region
instance to identify the foreign placement.
v2:
- Improve UAPI documentation. (Matt Brost)
- Sanitize preferred_mem_loc.region_instance madvise. (Matt Brost)
- Clarify madvise drm_pagemap vs xe_pagemap refcounting. (Matt Brost)
- Don't allow a foreign drm_pagemap madvise without a fast
interconnect.
v3:
- Add a comment about reference-counting in xe_devmem_open() and
remove the reference-count get-and-put. (Matt Brost)
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patch.msgid.link/20251219113320.183860-16-thomas.hellstrom@linux.intel.com
Honor the drm_pagemap vma attribute when migrating SVM pages.
Ensure that when the desired placement is validated as device
memory, we also check that the requested drm_pagemap is
consistent with the current one.
v2:
- Initialize a struct drm_pagemap pointer to NULL that could
otherwise be dereferenced uninitialized. (CI)
- Remove a redundant assignment (Matt Brost)
- Slightly improved commit message (Matt Brost)
- Extended drm_pagemap validation.
v3:
- Fix a compilation error if CONFIG_DRM_GPUSVM is not enabled.
(kernel test robot <lkp@intel.com>)
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
Link: https://patch.msgid.link/20251219113320.183860-14-thomas.hellstrom@linux.intel.com
As a consequence, struct xe_vma_mem_attr can't simply be assigned
or freed without taking the reference count of individual members
into account. Also add helpers to do that.
v2:
- Move some calls to xe_vma_mem_attr_fini() to xe_vma_free(). (Matt Brost)
v3:
- Rebase.
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com> #v2
Link: https://patch.msgid.link/20251219113320.183860-13-thomas.hellstrom@linux.intel.com
Register a driver-wide owner list, provide a callback to identify
fast interconnects and use the drm_pagemap_util helper to allocate
or reuse a suitable owner struct. For now we consider pagemaps on
different tiles on the same device as having fast interconnect and
thus the same owner.
v2:
- Fix up the error onion unwind in xe_pagemap_create(). (Matt Brost)
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patch.msgid.link/20251219113320.183860-12-thomas.hellstrom@linux.intel.com
Define a struct xe_pagemap that embeds all pagemap-related
data used by xekmd, and use the drm_pagemap cache and
shrinker to manage lifetime.
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patch.msgid.link/20251219113320.183860-9-thomas.hellstrom@linux.intel.com
If a device holds a reference on a foreign device's drm_pagemap,
and a device unbind is executed on the foreign device,
typically that foreign device would evict its device-private
pages and then continue its device-managed cleanup, eventually
releasing its drm device and possibly allowing for module unload.
However, since we're still holding a reference on a drm_pagemap,
when that reference is released and the provider module is
unloaded we'd execute out of undefined memory.
Therefore keep a reference on the provider device and module until
the last drm_pagemap reference is gone.
Note that in theory, the drm_gpusvm_helper module may be unloaded
as soon as the final module_put() of the provider driver module is
executed, so we need to add a module_exit() function that waits
until the work item executing the module_put() has completed.
v2:
- Better commit message (Matt Brost)
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Acked-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> # For merging through drm-xe.
Link: https://patch.msgid.link/20251219113320.183860-7-thomas.hellstrom@linux.intel.com
With the end goal of being able to free unused pagemaps
and allocate them on demand, add a refcount to struct drm_pagemap,
remove the xe embedded drm_pagemap, allocating and freeing it
explicitly.
v2:
- Make the drm_pagemap pointer in drm_gpusvm_pages reference-counted.
v3:
- Call drm_pagemap_get() before drm_pagemap_put() in drm_gpusvm_pages
(Himal Prasad Ghimiray)
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com> #v1
Reviewed-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
Acked-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> # For merging through drm-xe.
Link: https://patch.msgid.link/20251219113320.183860-5-thomas.hellstrom@linux.intel.com
In situations where no system memory is migrated to devmem, and in
upcoming patches where another GPU is performing the migration to
the newly allocated devmem buffer, there is nothing to ensure any
ongoing clear to the devmem allocation or async eviction from the
devmem allocation is complete.
Address that by passing a struct dma_fence down to the copy
functions, and ensure it is waited for before migration is marked
complete.
v3:
- New patch.
v4:
- Update the logic used for determining when to wait for the
pre_migrate_fence.
- Update the logic used for determining when to warn for the
pre_migrate_fence since the scheduler fences apparently
can signal out-of-order.
v5:
- Fix a UAF (CI)
- Remove references to source P2P migration (Himal)
- Put the pre_migrate_fence after migration.
v6:
- Pipeline the pre_migrate_fence dependency (Matt Brost)
Fixes: c5b3eb5a90 ("drm/xe: Add GPUSVM device memory copy vfunc functions")
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: <stable@vger.kernel.org> # v6.15+
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Acked-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> # For merging through drm-xe.
Link: https://patch.msgid.link/20251219113320.183860-4-thomas.hellstrom@linux.intel.com
Normalize GT stats that record execution periods in code paths by
adding helpers to perform the ktime calculation. Use these helpers in
the SVM code.
Suggested-by: Francois Dugast <francois.dugast@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Link: https://patch.msgid.link/20251212182847.1683222-7-matthew.brost@intel.com
Allow UNMAP of VMAs associated with SVM mappings when the MAP operation
is intended to merge adjacent CPU_ADDR_MIRROR VMAs.
v2
- Remove mapping exist check in garbage collector
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patch.msgid.link/20251125075628.1182481-5-himal.prasad.ghimiray@intel.com
Signed-off-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
While restoring default memory attributes for VMAs during garbage
collection, extend the target range by checking neighboring VMAs. If
adjacent VMAs are CPU-address-mirrored and have default attributes,
include them in the mergeable range to reduce fragmentation and improve
VMA reuse.
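The range extension can be sketched in user-space C. The demo_vma
record, the sorted array and the choice to look only at the immediate
neighbour on each side are assumptions made for this illustration and
are not taken from the driver.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical VMA record; the array is assumed sorted by address. */
struct demo_vma {
	unsigned long start;
	unsigned long end; /* exclusive */
	bool cpu_addr_mirror;
	bool default_attrs;
};

static bool demo_mergeable(const struct demo_vma *vma)
{
	return vma->cpu_addr_mirror && vma->default_attrs;
}

/* Widen [*start, *end) to cover directly adjacent, mergeable neighbours. */
static void demo_extend_range(const struct demo_vma *vmas, size_t n, size_t idx,
			      unsigned long *start, unsigned long *end)
{
	*start = vmas[idx].start;
	*end = vmas[idx].end;

	if (idx > 0 && demo_mergeable(&vmas[idx - 1]) &&
	    vmas[idx - 1].end == *start)
		*start = vmas[idx - 1].start;

	if (idx + 1 < n && demo_mergeable(&vmas[idx + 1]) &&
	    vmas[idx + 1].start == *end)
		*end = vmas[idx + 1].end;
}
```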
v2
- Rebase
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patch.msgid.link/20251125075628.1182481-3-himal.prasad.ghimiray@intel.com
Signed-off-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
Add xe_guc_pagefault layer (producer) which parses G2H fault
messages into struct xe_pagefault, forwards them to the page fault layer
(consumer) for servicing, and provides a vfunc to acknowledge faults to
the GuC upon completion. Replace the old (and incorrect) GT page fault
layer with this new layer throughout the driver.
As part of this change, the ACC handling code has been removed, as it is
dead code that is currently unused.
v2:
- Include engine instance (Stuart)
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Tested-by: Francois Dugast <francois.dugast@intel.com>
Link: https://patch.msgid.link/20251031165416.2871503-7-matthew.brost@intel.com
Corrected various spelling mistakes and typos in multiple
files under the Xe directory. These fixes improve clarity
and maintain consistency in documentation.
v2
- Replaced all instances of "XE" with "Xe" where it referred
to the driver name
- of -> for
- Typical -> Typically
v3
- Revert "Xe" to "XE" for macro prefix reference
Signed-off-by: Sanjay Yadav <sanjay.kumar.yadav@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Stuart Summers <stuart.summers@intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patch.msgid.link/20251023121453.1182035-2-sanjay.kumar.yadav@intel.com
The madvise implementation currently resets the SVM madvise if the
underlying CPU map is unmapped. This is in an attempt to mimic the
CPU madvise behaviour. However, it's not clear that this is a desired
behaviour since if the end app user relies on it for malloc()ed
objects or stack objects, it may not work as intended.
Instead of having the autoreset functionality being a direct
application-facing implicit UAPI, make the UMD explicitly choose
this behaviour if it wants to expose it by introducing
DRM_XE_VM_BIND_FLAG_MADVISE_AUTORESET, and add a semantics
description.
v2:
- Kerneldoc fixes. Fix a commit log message.
Fixes: a2eb8aec3e ("drm/xe: Reset VMA attributes to default in SVM garbage collector")
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
Cc: "Falkowski, John" <john.falkowski@intel.com>
Cc: "Mrozek, Michal" <michal.mrozek@intel.com>
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
Link: https://lore.kernel.org/r/20251015170726.178685-2-thomas.hellstrom@linux.intel.com
If the location madvise() is set to
DRM_XE_PREFERRED_LOC_DEFAULT_SYSTEM, the drm_pagemap in the
SVM gpu fault handler will be set to NULL. However there is nothing
that explicitly migrates the data to system if it is already present
in device memory.
In that case, set the device memory owner to NULL to ensure
data gets properly migrated to system on page-fault.
v2:
- Remove redundant dpagemap assignment (Himal Prasad Ghimiray)
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com> #v1
Reviewed-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
Link: https://lore.kernel.org/r/20251010104149.72783-2-thomas.hellstrom@linux.intel.com
Fixes: 10aa5c8060 ("drm/gpusvm, drm/xe: Fix userptr to not allow device private pages")
Moving to VRAM will fail if mixed mappings are present or if the page is
already located in VRAM. Atomic faults that require a move to VRAM
currently retry without attempting to evict mixed mappings or locate
existing VRAM mappings.
This patch fixes the issue by attempting to evict mixed mappings or find
existing VRAM pages when a move to VRAM fails during atomic fault
handling.
Fixes: a9ac0fa455 ("drm/xe: Strict migration policy for atomic SVM faults")
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
Link: https://lore.kernel.org/r/20251009130629.3531962-1-matthew.brost@intel.com
When userptr is used on SVM-enabled VMs, a non-NULL
hmm_range::dev_private_owner value might mean that
hmm_range_fault() attempts to return device private pages.
Either that will fail, or the userptr code will not know
how to handle those.
Use NULL for hmm_range::dev_private_owner to migrate
such pages to system. In order to do that, move the
struct drm_gpusvm::device_private_page_owner field to
struct drm_gpusvm_ctx::device_private_page_owner so that
it doesn't remain immutable over the drm_gpusvm lifetime.
v2:
- Don't conditionally compile xe_svm_devm_owner().
- Kerneldoc xe_svm_devm_owner().
Fixes: 9e97874148 ("drm/xe/userptr: replace xe_hmm with gpusvm")
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Acked-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: https://lore.kernel.org/r/20250930122752.96034-1-thomas.hellstrom@linux.intel.com
These changes should have no functional impact.
1. Correct typo of "operation" in macro range_debug().
2. Combine 2 spin_lock() calls in xe_svm_garbage_collector() into 1.
3. Drop redundant preferred_region_is_vram check in
xe_svm_range_needs_migrate_to_vram().
4. Combine the devmem_possible check in xe_svm_handle_pagefault().
need_vram includes the IS_DGFX() check, so there is no change for
.devmem_only.
v2: revert !ctx.devmem_only change (Matt)
v3: rebase code and refine commit message.
v4: rebase code and refine commit message.
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Shuicheng Lin <shuicheng.lin@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://lore.kernel.org/r/20250911031405.1371812-2-shuicheng.lin@intel.com
Use ERR_CAST inline function instead of ERR_PTR(PTR_ERR(...)).
Signed-off-by: Fushuai Wang <wangfushuai@baidu.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://lore.kernel.org/r/20250914101630.17719-1-wangfushuai@baidu.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
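For readers unfamiliar with the idiom, the following user-space
imitation shows why the two forms are equivalent. The demo_* helpers
are hypothetical re-creations of the error-pointer pattern written for
this note and are not the kernel's definitions; static storage is used
only to keep the sketch free of allocations.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

struct demo_bo { int id; };
struct demo_vma { int id; };

/* Encode a negative errno value in a pointer, and decode it again. */
static inline void *demo_err_ptr(long error) { return (void *)(intptr_t)error; }
static inline long demo_ptr_err(const void *ptr) { return (long)(intptr_t)ptr; }
static inline bool demo_is_err(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-4095;
}
/* Forward an error pointer unchanged under a different pointer type. */
static inline void *demo_err_cast(const void *ptr) { return (void *)ptr; }

static struct demo_bo *demo_bo_lookup(int id)
{
	static struct demo_bo bo;

	if (id < 0)
		return demo_err_ptr(-EINVAL);
	bo.id = id;
	return &bo;
}

static struct demo_vma *demo_vma_create(int bo_id)
{
	static struct demo_vma vma;
	struct demo_bo *bo = demo_bo_lookup(bo_id);

	if (demo_is_err(bo))
		return demo_err_cast(bo); /* was: demo_err_ptr(demo_ptr_err(bo)) */
	vma.id = bo->id;
	return &vma;
}
```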
Convert SVM validation to support exhaustive eviction,
using xe_validation_guard().
v2:
- Wrap also xe_vm_range_rebind (Matt Brost)
- Adapt to argument changes of xe_validation_guard().
v5:
- Rebase on SVM stats.
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://lore.kernel.org/r/20250908101246.65025-5-thomas.hellstrom@linux.intel.com
We want all validation (potential backing store allocation) to be part
of a drm_exec transaction. Therefore add a drm_exec pointer argument
to xe_bo_validate() and ___xe_bo_create_locked(). Upcoming patches
will deal with making all (or nearly all) calls to these functions
part of a drm_exec transaction. In the meantime, define special values
of the drm_exec pointer:
XE_VALIDATION_UNIMPLEMENTED: Implementation of the drm_exec transaction
    has not been done yet.
XE_VALIDATION_UNSUPPORTED: Some middle layers (dma-buf) don't allow
    the drm_exec context to be passed down to map_attachment where
    validation takes place.
XE_VALIDATION_OPT_OUT: May be used only for kunit tests where exhaustive
    eviction isn't crucial and the ROI of converting those is very
    small.
For XE_VALIDATION_UNIMPLEMENTED and XE_VALIDATION_OPT_OUT there is also
a lockdep check that a drm_exec transaction can indeed start at the
location where the macro is expanded. This is to encourage
developers to take this into consideration early in the code
development process.
v2:
- Fix xe_vm_set_validation_exec() imbalance. Add an assert that
hopefully catches future instances of this (Matt Brost)
v3:
- Extend to psmi_alloc_object
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com> #v3
Link: https://lore.kernel.org/r/20250908101246.65025-2-thomas.hellstrom@linux.intel.com
The goal here is to cut over to gpusvm and remove xe_hmm, relying instead on
common code. The core facilities we need are get_pages(), unmap_pages()
and free_pages() for a given userptr range, plus a vm level notifier
lock, which is now provided by gpusvm.
v2:
- Reuse the same SVM vm struct we use for full SVM, that way we can
use the same lock (Matt B & Himal)
v3:
- Re-use svm_init/fini for userptr.
v4:
- Allow building xe without userptr if we are missing DRM_GPUSVM
config. (Matt B)
- Always make .read_only match xe_vma_read_only() for the ctx. (Dafna)
v5:
- Fix missing conversion with CONFIG_DRM_XE_USERPTR_INVAL_INJECT
v6:
- Convert the new user in xe_vm_madvise.
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Dafna Hirschfeld <dafna.hirschfeld@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://lore.kernel.org/r/20250828142430.615826-17-matthew.auld@intel.com
Pull the pages stuff from the svm range into its own substructure, with
the idea of having the main pages related routines, like get_pages(),
unmap_pages() and free_pages() all operating on some lower level
structures, which can then be re-used for stuff like userptr.
v2:
- Move seq into pages struct (Matt B)
v3:
- Small kernel-doc fixes
Suggested-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://lore.kernel.org/r/20250828142430.615826-13-matthew.auld@intel.com
Add more SVM GT stats which give visibility to where time is spent in
the SVM page fault handler. Stats include number of faults at a given
size, total SVM page fault time, migration time in us, copy time in us,
copy kb, get pages time in us, and bind time in us. These stats will help
in tuning SVM for performance.
v2:
- Include local changes
v3:
- Add tlb invalidation + valid page fault + per size copy size stats
v4:
- Ensure gt not NULL when incrementing SVM copy stats
- Normalize stats names
- Use magic macros to generate increment functions for ranges
v7:
- Use DEF_STAT_STR (Michal)
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Francois Dugast <francois.dugast@intel.com>
Link: https://lore.kernel.org/r/20250829172232.1308004-3-matthew.brost@intel.com
Decouple TLB invalidations from the GT by updating the TLB invalidation
layer to accept a `struct xe_tlb_inval` instead of a `struct xe_gt`.
Also, rename *gt_tlb* to *tlb*. The internals of the TLB invalidation
code still operate on a GT, but this is now hidden from the rest of the
driver.
Signed-off-by: Stuart Summers <stuart.summers@intel.com>
Reviewed-by: Stuart Summers <stuart.summers@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Link: https://lore.kernel.org/r/20250826182911.392550-7-stuart.summers@intel.com
tlb_invalidation is a bit verbose leading to ugly wraps in the code,
shorten to tlb_inval.
Signed-off-by: Stuart Summers <stuart.summers@intel.com>
Reviewed-by: Stuart Summers <stuart.summers@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Link: https://lore.kernel.org/r/20250826182911.392550-4-stuart.summers@intel.com
Restore default memory attributes for VMAs during garbage collection
if they were modified by madvise. Reuse existing VMA if fully overlapping;
otherwise, allocate a new mirror VMA.
v2 (Matthew Brost)
- Add helper for vma split
- Add retry to get updated vma
v3
- Rebase on gpuvm layer
Suggested-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://lore.kernel.org/r/20250821173104.3030148-19-himal.prasad.ghimiray@intel.com
Signed-off-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
When the user sets a valid devmem_fd as the preferred location, a GPU fault
will trigger migration to the tile of the device associated with devmem_fd.
If the user sets an invalid devmem_fd, the preferred location is the current
placement (smem) only.
v2 (Matthew Brost)
- Default should be faulting tile
- remove devmem_fd used as region
v3 (Matthew Brost)
- Add migration_policy
- Fix return condition
- fix migrate condition
v4
- Rebase
v5
- Add check for userptr and bo based vmas
Cc: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://lore.kernel.org/r/20250821173104.3030148-11-himal.prasad.ghimiray@intel.com
Signed-off-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
If the platform does not support atomic access on system memory, and the
ranges are in system memory, but the user requires atomic accesses on
the VMA, then migrate the ranges to VRAM. Apply this policy for prefetch
operations as well.
v2
- Drop unnecessary vm_dbg
v3 (Matthew Brost)
- fix atomic policy
- prefetch shouldn't have any impact of atomic
- bo can be accessed from vma, avoid duplicate parameter
v4 (Matthew Brost)
- Remove TODO comment
- Fix comment
- Don't allow gpu atomic ops when user is setting atomic attr as CPU
v5 (Matthew Brost)
- Fix atomic checks
- Add userptr checks
Cc: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://lore.kernel.org/r/20250821173104.3030148-10-himal.prasad.ghimiray@intel.com
Signed-off-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>