The register COMMON_SLICE_CHICKEN4 is a MCR register on both Xe2 and
Xe3. Let's make sure to define a MCR version of it and use it for the
relevant IP versions.
Use XEHP_ as prefix for the register name, since it is MCR as of Xe_HP.
v2:
- Also change for one entry in lrc_tunnings, which was caught by
manual testing and add corresponging Fixes tag in commit message.
(Gustavo)
Fixes: 8d6f16f1f0 ("drm/xe: Extend Wa_22021007897 to Xe3 platforms")
Fixes: e5c13e2c50 ("drm/xe/xe2hpg: Add Wa_22021007897")
Fixes: 8ccf5f6b22 ("drm/xe/tuning: Apply windower hardware filtering setting on Xe3 and Xe3p")
Bspec: 66534, 71185, 74417
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patch.msgid.link/20260514-rtp-mcr-check-v3-3-30dd47855fee@intel.com
Signed-off-by: Gustavo Sousa <gustavo.sousa@intel.com>
(cherry picked from commit 75f65f1a4c06da1d87f28570a9d4cdad28f13360)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
The register COMMON_SLICE_CHICKEN1 is a MCR register on Xe2.
Let's make sure to define a MCR version of it and use it for the
relevant IP versions.
Use XEHP_ as prefix for the register name, since it is MCR as of Xe_HP.
Fixes: a5d221924e ("drm/xe/xe2_hpg: Add set of workarounds")
Fixes: 9f18b55b6d ("drm/xe/xe2: Add workaround 18033852989")
Bspec: 66534, 71185
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patch.msgid.link/20260514-rtp-mcr-check-v3-2-30dd47855fee@intel.com
Signed-off-by: Gustavo Sousa <gustavo.sousa@intel.com>
(cherry picked from commit a672725fdbfc3ea430130039d677c7dc98d59df8)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
CACHE_MODE_1 is a MCR register for all platforms that currently use it
in the Xe driver. Use XE_REG_MCR() when defining it.
Fixes: 8cd7e97597 ("drm/xe: Add missing DG2 lrc workarounds")
Fixes: ff063430ca ("drm/xe/mtl: Add some initial MTL workarounds")
Bspec: 66534, 67788
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patch.msgid.link/20260514-rtp-mcr-check-v3-1-30dd47855fee@intel.com
Signed-off-by: Gustavo Sousa <gustavo.sousa@intel.com>
(cherry picked from commit 8f765f0c054e0fb39980a76b4c899b027395929d)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
ROW_CHICKEN5 is a masked register (i.e., to adjust the value of any of
the lower 16 bits, the corresponding bit in the upper 16 bits must also
be set). Add the XE_REG_OPTION_MASKED to its definition; failure to do
so will cause workaround updates of this register to not apply properly.
Bspec: 56853
Fixes: 835cd6cbb0 ("drm/xe/xe3p_lpg: Add initial workarounds for graphics version 35.10")
Reviewed-by: Gustavo Sousa <gustavo.sousa@intel.com>
Link: https://patch.msgid.link/20260410-xe3p_tuning-v1-3-e206a62ee38f@intel.com
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
(cherry picked from commit cd84bfbba7feb4c1e72356f14de026dfda1a9e2a)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
The hardware teams noticed that the originally documented workaround
steps for Wa_16025250150 may not be sufficient to fully avoid a hardware
issue. The workaround documentation has been augmented to suggest
programming one additional register; make the corresponding change in
the driver.
Fixes: 7654d51f1f ("drm/xe/xe2hpg: Add Wa_16025250150")
Reviewed-by: Matt Atwood <matthew.s.atwood@intel.com>
Link: https://patch.msgid.link/20260319-wa_16025250150_part2-v1-1-46b1de1a31b2@intel.com
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
The Watchdog Counter Control and Watchdog Counter Threshold
registers are needed for watchdog programming. This watchdog
will generate the "Media Hang Notify" interrupt.
Bspec: 45999, 46000
Bspec: 60373, 60374
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Reviewed-by: Michał Winiarski <michal.winiarski@intel.com>
Link: https://patch.msgid.link/20260303201354.17948-2-michal.wajdeczko@intel.com
There are higher level sleep states that will cause RC6 state readout to
come back with an "in-reset" value. That is the case with NVL-P. As
those states are only possible if the GT is already in C6, let's just
translate the "reset value" into C6 when doing the readout.
Bspec: 67651
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patch.msgid.link/20260309-extra-nvl-p-enabling-patches-v5-7-be9c902ee34e@intel.com
Signed-off-by: Gustavo Sousa <gustavo.sousa@intel.com>
Wa_16028780921 involves writing to a register that is locked by firmware
prior to driver loading and doesn't have any effect if implemented by
the KMD. Since the implementation of the workaround actually belongs
the firmware, just drop the ineffective implementation by the KMD.
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patch.msgid.link/20260309-extra-nvl-p-enabling-patches-v5-6-be9c902ee34e@intel.com
Signed-off-by: Gustavo Sousa <gustavo.sousa@intel.com>
Implement the KMD part of Wa_14026539277, which applies to NVL-P A0.
The KMD implementation is just one component of the workaround, which
also depends on Pcode to implement its part in order to be complete.
v2:
- Add FUNC(xe_rtp_match_not_sriov_vf) to skip applying the workaround
to SRIOV VFs. (Matt)
v3:
- Make Wa_14026539277 a device workaround instead of a GT workaround.
(Matt)
v4:
- Drop FUNC(xe_rtp_match_not_sriov_vf) and use a direct check with
IS_SRIOV_VF() in the workaround implementation. (Matt)
Reviewed-by: Matt Roper <matthew.d.roper@intel.com> # v3
Link: https://patch.msgid.link/20260309-extra-nvl-p-enabling-patches-v5-5-be9c902ee34e@intel.com
Signed-off-by: Gustavo Sousa <gustavo.sousa@intel.com>
Similar to i915's commit cebc13de7e
("drm/i915: Whitelist COMMON_SLICE_CHICKEN3 for UMD access"), except
that instead of putting the register on the allowlist for UMD to
program, the KMD is doing the programming at context initialization
based on a queue creation flag.
This is a recommended tuning setting for both gen12 and Xe_HP
platforms.
If a render queue is created with
DRM_XE_EXEC_QUEUE_SET_STATE_CACHE_PERF_FIX, COMMON_SLICE_CHICKEN3 will
be programmed at initialization to enable the render color cache to
key with BTP+BTI (binding table pool + binding table entry) instead of
just BTI (binding table entry). This enables the UMD to avoid emitting
render-target-cache-flush + stall-at-pixel-scoreboard every time a
binding table entry pointing to a render target is changed.
v2: Use xe_lrc_write_ring()
v3: Update xe_query.c to report availability
v4: Rename defines to add DISABLE_
v5: update commit message
v6: rebase
Mesa MR: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39982
Bspec: 73993, 73994, 72161, 31870, 68331
Acked-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Link: https://patch.msgid.link/20260306075504.1288676-1-lionel.g.landwerlin@intel.com
PVC supports GT error reporting via vector registers along with
error status register. Add support to report these errors and
update respective counters. Incase of Subslice error reported
by vector register, process the error status register
for applicable bits.
The counter is embedded in the xe drm ras structure and is
exposed to the userspace using the drm_ras generic netlink
interface.
$ sudo ynl --family drm_ras --do get-error-counter \
--json '{"node-id":0, "error-id":1}'
{'error-id': 1, 'error-name': 'core-compute', 'error-value': 0}
Co-developed-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
Signed-off-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
Signed-off-by: Riana Tauro <riana.tauro@intel.com>
Reviewed-by: Raag Jadav <raag.jadav@intel.com>
Link: https://patch.msgid.link/20260304074412.464435-11-riana.tauro@intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Add a shared header that's used by i915, xe, and i915 display.
This allows us to drop the compat-i915-headers/i915_reg_defs.h include
from xe_reg_defs.h. All the register macro helpers were subtly pulled in
from i915 to all of xe through this.
Reviewed-by: Michał Grzelak <michal.grzelak@intel.com>
Acked-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patch.msgid.link/fcd70f3317755bf98a6e7ae88974aa8ba06efd1e.1772042022.git.jani.nikula@intel.com
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
A recent bspec tuning guide update asks us to program
COMMON_SLICE_CHICKEN4[5] on Xe3 and Xe3p platforms. Add this setting to
our LRC tuning RTP table so that the setting will become part of each
context's LRC.
Bspec: 72161, 55902
Reviewed-by: Shuicheng Lin <shuicheng.lin@intel.com>
Link: https://patch.msgid.link/20260224235055.3038710-2-matthew.d.roper@intel.com
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
When generating the default LRC, if a register is not masked, we apply
any save-restore programming necessary via a read-modify-write sequence
that will ensure we only update the relevant bits/fields without
clobbering the rest of the register. However some of the registers that
need to be updated might be MCR registers which require steering to a
non-terminated instance to ensure we can read back a valid, non-zero
value. The steering of reads originating from a command streamer is
controlled by register CS_MMIO_GROUP_INSTANCE_SELECT. Emit additional
MI_LRI commands to update the steering before any RMW of an MCR register
to ensure the reads are performed properly.
Note that needing to perform a RMW of an MCR register while building the
default LRC is pretty rare. Most of the MCR registers that are part of
an engine's LRCs are also masked registers, so no MCR is necessary.
Fixes: f2f90989cc ("drm/xe: Avoid reading RMW registers in emit_wa_job")
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Reviewed-by: Balasubramani Vivekanandan <balasubramani.vivekanandan@intel.com>
Link: https://patch.msgid.link/20260206223058.387014-2-matthew.d.roper@intel.com
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Since the dominant size of the pages referred in an i-gpu, such as
Xe3p_LPG, will be 4KB, the HW default of mix of 64K and 2M for STLB bank
hash mode does not make sense.
Allow the SW to change it to 4KB Mode, for Xe3p_LPG.
v2:
- Add Bspec reference. (Matt)
Bspec: 78248
Signed-off-by: Aradhya Bhatia <aradhya.bhatia@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patch.msgid.link/20260206-nvl-p-upstreaming-v3-11-636e1ad32688@intel.com
Signed-off-by: Gustavo Sousa <gustavo.sousa@intel.com>
Xe3p_LPG extends the 'group ID' register mask by one bit. Since the new
upper bit (12) was unused on previous platforms, we can safely extend
the existing mask size without worrying about adding conditional version
checks to the register programming.
Bspec: 67175
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Dnyaneshwar Bhadane <dnyaneshwar.bhadane@intel.com>
Link: https://patch.msgid.link/20260206-nvl-p-upstreaming-v3-9-636e1ad32688@intel.com
Signed-off-by: Gustavo Sousa <gustavo.sousa@intel.com>
Prevent GuC firmware DMA failures during GuC-only reset by disabling
idle flow and verifying SRAM handling completion. Without this, reset
can be issued while SRAM handler is copying WOPCM to SRAM,
causing GuC HW to get stuck.
v2: Modify error message (Badal)
Rename reg bit name (Daniele)
Update WA skip condition (Daniele)
Update SRAM handling logic (Daniele)
v3: Reorder WA call (Badal)
Wait for GuC ready status (Daniele)
v4: Update reg name (Badal)
Add comment (Daniele)
Add extended graphics version (Daniele)
Modify rules
Signed-off-by: Sk Anirban <sk.anirban@intel.com>
Reviewed-by: Badal Nilawar <badal.nilawar@intel.com>
Acked-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Link: https://patch.msgid.link/20260202105313.3338094-4-sk.anirban@intel.com
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
On Xe3p_XPC, there are now four registers reserved to express the XeCore
mask rather than just three. Define the new registers and update the IP
descriptor accordingly.
Note that this only applies to Xe3p_XPC for now; Xe3p_LPG still only
uses three registers to express the mask.
Reviewed-by: Gustavo Sousa <gustavo.sousa@intel.com>
Link: https://patch.msgid.link/20260205214139.48515-4-matthew.d.roper@intel.com
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
All MERT catastrophic errors but VF's LMTT fault are serious, so
we shouldn't limit our handling only to print debug messages.
Change CATERR message to error level and then declare the device
as wedged to match expectation from the design document. For the
LMTT faults, add a note about adding tracking of this unexpected
VF activity.
While at it, rename register fields defnitions to match the BSpec.
Also drop trailing include guard name from the regs.h file.
BSpec: 74625
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Lukasz Laguna <lukasz.laguna@intel.com>
Reviewed-by: Lukasz Laguna <lukasz.laguna@intel.com>
Link: https://patch.msgid.link/20260112183716.28700-1-michal.wajdeczko@intel.com
Expose individual VRAM temperature attributes.
Update Xe hwmon documentation for this entry.
v2:
- Avoid using default switch case for VRAM individual temperatures.
- Append labels with VRAM channel number.
- Update kernel version in Xe hwmon documentation.
v3:
- Add missing brackets in Xe hwmon documentation from VRAM channel sysfs.
- Reorder BMG_VRAM_TEMPERATURE_N macro in xe_pcode_regs.h.
- Add api to check if VRAM is available on the channel.
v4:
- Improve VRAM label handling to eliminate temp variable by
introducing a dedicated array vram_label in xe_hwmon_thermal_info.
- Remove a magic number.
- Change the label from vram_X to vram_ch_X.
v5:
- Address review comments from Raag.
- Change vram to VRAM in commit title and subject.
- Refactor BMG_VRAM_TEMPERATURE_N macro.
- Refactor is_vram_ch_available().
- Rephrase a comment.
- Check individual VRAM temperature limits in addition to VRAM
availability in xe_hwmon_temp_is_visible. (Raag)
- Move VRAM label change out of this patch.
v6:
- Use in_range() for VRAM_N index check instead of if check. (Raag)
- Minor aesthetic changes.
Signed-off-by: Karthik Poosa <karthik.poosa@intel.com>
Reviewed-by: Raag Jadav <raag.jadav@intel.com>
Link: https://patch.msgid.link/20260112203521.1014388-5-karthik.poosa@intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Previously, compressible surfaces were required to be non-coherent
(allocated as WC) because compression and coherency were mutually
exclusive. Starting with Xe3, hardware supports combining compression
with 1-way coherency, allowing compressible surfaces to be allocated as
WB memory. This provides applications with more efficient memory
allocation by avoiding WC allocation overhead that can cause system
stuttering and memory management challenges.
The implementation adds support for compressed+coherent PAT entry for
the xe3_lpg devices and updates the driver logic to handle the new
compression capabilities.
v2: (Matthew Auld)
- Improved error handling with XE_IOCTL_DBG()
- Enhanced documentation and comments
- Fixed xe_bo_needs_ccs_pages() outdated compression assumptions
v3:
- Improve WB compression support detection by checking PAT table
instead of version check
v4:
- Add XE_CACHE_WB_COMPRESSION, which simplifies the logic.
v5:
- Use U16_MAX for the invalid PAT index. (Matthew Auld)
Bspec: 71582, 59361, 59399
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Xin Wang <x.wang@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patch.msgid.link/20260109093007.546784-1-x.wang@intel.com
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Since different drivers can use SoC remapper, modify VSEC code to
access SoC remapper via a helper that would synchronize such accesses.
Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Reviewed-by: Badal Nilawar <badal.nilawar@intel.com>
Link: https://patch.msgid.link/20251223183943.3175941-7-umesh.nerlige.ramappa@intel.com
A MERT OA unit is available in the SoC on some platforms. Add support
for this OA unit and expose it to userspace. The MERT OA unit does not
have any HW engines attached, but is otherwise similar to an OAM unit.
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Link: https://patch.msgid.link/20251205212613.826224-2-ashutosh.dixit@intel.com
When we're initially enabling driver support for a new platform/IP, we
usually implement all workarounds documented in the WA database in the
driver. Many of those workarounds are restricted to early steppings
that only showed up in pre-production hardware (i.e., internal test
chips that are not available to the general public). Since the
workarounds for early, pre-production steppings tend to be some of the
ugliest and most complicated workarounds, we generally want to eliminate
them and simplify the code once the platform has launched and our
internal usage of those pre-production parts have been phased out.
Let's add a flag to the device info that tracks which platforms still
have support for pre-production workarounds for so that we can print a
warning and taint if someone tries to load the driver on a
pre-production part for a platform without pre-production workarounds.
This will help our internal users understand the likely problems they'll
encounter if they try to load the driver on an old pre-production
device.
The Xe behavior here is similar to what we've done for many years on
i915 (see intel_detect_preproduction_hw()), except that instead of
manually coding up ranges of device steppings that we believe to be
pre-production hardware, Xe will use the hardware's own production vs
pre-production fusing status, which we can read from the FUSE2 register.
This fuse didn't exist on older Intel hardware, but should be present on
all platforms supported by the Xe driver.
Going forward, let's set the expectation that we'll start looking into
removing pre-production workarounds for a platform around the time that
platforms of the next major IP stepping are having their force_probe
requirement lifted. This timing is just a rough guideline; there may be
cases where some instances of pre-production parts are still being
actively used in CI farms, internal device pools, etc. and we'll need to
wait a bit longer for those to be swapped out.
v2:
- Fix inverted forcewake check
v3:
- Invert flag and add it to the platforms on which we still have
pre-prod workarounds. (Jani, Lucas)
v4:
- Avoid checking pre-production on VF since they don't have access to
the FUSE2 register.
Bspec: 78271, 52544
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patch.msgid.link/20251212181411.294854-3-matthew.d.roper@intel.com
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Page reclaim list (PRL) is preparation work for the page reclaim feature.
The PRL is firstly owned by pt_update_ops and all other page reclaim
operations will point back to this PRL. PRL generates its entries during
the unbind page walker, updating the PRL.
This PRL is restricted to a 4K page, so 512 page entries at most.
v2:
- Removed unused function. (Shuicheng)
- Compacted warning checking, update commit message,
spelling, etc. (Shuicheng, Matthew B)
- Fix kernel docs
- Moved PRL max entries overflow handling out from
generate_reclaim_entry to caller (Shuicheng)
- Add xe_page_reclaim_list_init for clarity. (Matthew B)
- Modify xe_guc_page_reclaim_entry to use macros
for greater flexbility. (Matthew B)
- Add fallback for PTE outside of page reclaim supported
4K, 64K, 2M pages (Matthew B)
- Invalidate PRL for early abort page walk.
- Removed page reclaim related variables from tlb fence
(Matthew Brost)
- Remove error handling in *alloc_entries failure. (Matthew B)
v3:
- Fix NULL pointer dereference check.
- Modify reclaim_entry to QW and bitfields accordingly. (Matthew B)
- Add vm_dbg prints for PRL generation and invalidation. (Matthew B)
v4:
- s/GENMASK/GENMASK_ULL && s/BIT/BIT_ULL (CI)
v5:
- Addition of xe_page_reclaim_list_is_new() to avoid continuous
allocation of PRL if consecutive VMAs cause a PRL invalidation.
- Add xe_page_reclaim_list_valid() helpers for clarity. (Matthew B)
- Move xe_page_reclaim_list_entries_put in
xe_page_reclaim_list_invalidate.
Signed-off-by: Brian Nguyen <brian3.nguyen@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Cc: Shuicheng Lin <shuicheng.lin@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patch.msgid.link/20251212213225.3564537-17-brian3.nguyen@intel.com
Applied Wa_16028005424 to Graphics version from 30.00 to 30.05
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Balasubramani Vivekanandan <balasubramani.vivekanandan@intel.com>
Link: https://patch.msgid.link/20251121100822.20076-2-balasubramani.vivekanandan@intel.com
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
The MERT block triggers an interrupt when a catastrophic error occurs.
Update the interrupt handler to read the MERT catastrophic error type
and log appropriate debug message.
Signed-off-by: Lukasz Laguna <lukasz.laguna@intel.com>
Reviewed-by: Piotr Piórkowski <piotr.piorkowski@intel.com>
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Link: https://patch.msgid.link/20251124190237.20503-5-lukasz.laguna@intel.com
Add support for triggering and handling MERT TLB invalidation. After
LMTT updates, the MERT TLB invalidation is initiated to ensure memory
translations remain coherent.
Completion of the invalidation is signaled via MERT interrupt (bit 13 in
the GFX master interrupt register). Detect and handle this interrupt to
properly synchronize the invalidation flow.
Signed-off-by: Lukasz Laguna <lukasz.laguna@intel.com>
Reviewed-by: Piotr Piórkowski <piotr.piorkowski@intel.com>
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Link: https://patch.msgid.link/20251124190237.20503-4-lukasz.laguna@intel.com
On platforms with standalone MERT, the PF driver needs to program LMTT
in MERT's LMEM_CFG register.
Signed-off-by: Lukasz Laguna <lukasz.laguna@intel.com>
Reviewed-by: Piotr Piórkowski <piotr.piorkowski@intel.com>
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Link: https://patch.msgid.link/20251124190237.20503-3-lukasz.laguna@intel.com
The TILE_ADDR_RANGE register is not available on all platforms going
forward as it was deprecated and is being replaced by equivalent
registers within SoC MMIO space. While that doesn't happen, the
SG_TILE_ADDR_RANGE (base 0x1083a0) is still valid for all platforms
supported by xe. Use that instead.
BSpec: 59353, 54991
Signed-off-by: Fei Yang <fei.yang@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patch.msgid.link/20251107-tile-addr-v1-1-a3014aadc2e7@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Crescent Island has some additional and different bits for performance
limit reasons. Add the new definitions and use them for CRI.
Signed-off-by: Sk Anirban <sk.anirban@intel.com>
Reviewed-by: Raag Jadav <raag.jadav@intel.com>
Link: https://patch.msgid.link/20251029-gt-throttle-cri-v3-1-d1f5abbb8114@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Add G7 package state residency counter in debugfs alongside existing
G2,G6,G8,G10 states for complete power state visibility.
Signed-off-by: Mohammed Thasleem <mohammed.thasleem@intel.com>
Reviewed-by: Karthik Poosa <karthik.poosa@intel.com>
Link: https://patch.msgid.link/20251016001219.37684-1-mohammed.thasleem@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Apply WA_14024681466 to Xe3_LPG graphics IP versions from 30.00 to 30.05.
v2: (Matthew Roper)
- Remove stepping filter as workaround applies to all steppings.
- Add an engine class filter so it only applies to the RENDER engine.
Signed-off-by: Nitin Gote <nitin.r.gote@intel.com>
Link: https://patch.msgid.link/20251027092643.335904-1-nitin.r.gote@intel.com
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
For Xe3p arch some subunits of an IP may be different. The GMD_ID
register returns the Xe3p arch and dedicates the reserved field to mark
possible subunit differences. Generally this is an under-the-hood
implementation detail that drivers don't need to worry about, but the
new Main_GAMCTRL may be enabled or not depending on those.
Those reserved bits are described for Xe3p as: "If Zero, No special case
to be handled. If Non-Zero, special case to be handled by Software
agent.". That special case is defined per Arch. So if media version is
35, also check the additional reserved bits. To avoid confusion with the
usual meaning of "reserved", define them as GMD_ID_SUBIP_FLAG_MASK.
Bspec: 74201
Signed-off-by: Balasubramani Vivekanandan <balasubramani.vivekanandan@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://lore.kernel.org/r/20251019-xe3p-gamctrl-v1-2-ad66d3c1908f@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Starting from Xe3p, there are two different copies of some of the GAM
registers: the traditional MCR variant at their old locations, and a
new unicast copy known as "main_gamctrl." The Xe driver doesn't use
these registers directly, but we need to instruct the GuC on which set
it should use. Since the new, unicast registers are preferred (since
they avoid the need for unnecessary MCR synchronization), set a new GuC
feature flag, GUC_CTL_MAIN_GAMCTRL_QUEUES to convey this decision. A
new helper function, xe_guc_using_main_gamctrl_queues(), is added for
use in the 3 independent places that need to handle configuration of the
new reporting queues.
The mmio write to enable the main gamctl is only done during the general
GuC upload. The gamctrl registers are not accessed by the GuC during
hwconfig load.
Last, the ADS blob for communicating the queue addresses contains both a
DPA and GGTT offset. The GuC documentation states that DPA is now MBZ
when using the MAIN_GAMCTRL queues.
Bspec: 76445, 73540
Signed-off-by: Brian Welty <brian.welty@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://lore.kernel.org/r/20251019-xe3p-gamctrl-v1-1-ad66d3c1908f@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Current implementation of compute walker has dependency on GPU/SW Stack
which requires SW/UMD to wait for event from KMD to indicate
PIPE_CONTROL interrupt was done. This created latency on SW stack.
This feature adds support to generate completion interrupt from GPGPU
walker which does not support MSIx and avoid software using Pipe control
drain/idle latency.
The only thing needed for the kernel driver to do here is to wakeup the
thread waiting on the ufence, which is already handled by the irq
handler. Before waiting on this event, the userspace side can opt-in to
this interrupt being generated by the HW by selecting the flag in the
POST_SYNC_DATA_2 substructure's dw0[3] of COMPUTE_WALKER_2 instruction.
Bspec: 62346, 74334
Suggested-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
Signed-off-by: S A Muqthyar Ahmed <syed.abdul.muqthyar.ahmed@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://lore.kernel.org/r/20251016-xe3p-v3-21-3dd173a3097a@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Just like the other engines, check xe_hw_engine_mask_per_class() for VCS
and VECS to account for architectural availability of those registers.
With that, all the possibly available media engines can have their
interrupts enabled.
Bspec: 54030
Suggested-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://lore.kernel.org/r/20251016-xe3p-v3-20-3dd173a3097a@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Two bit fields have similar functionality across the interrupt vectors
but are named "RENDER". Rename them to follow the bspec more closely and
clear any confusion when using them for other engines.
Bspec: 62353, 62354, 62355, 62346, 62345, 63341
Suggested-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://lore.kernel.org/r/20251016-xe3p-v3-19-3dd173a3097a@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
The CSMQDEBUG is useful for the development of MQ feature. Start dumping
the debug register.
Cc: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Wang Xin <x.wang@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://lore.kernel.org/r/20251016-xe3p-v3-10-3dd173a3097a@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Add CURRENT_LRCA to register dump to help debugging.
Cc: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Signed-off-by: Wang Xin <x.wang@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://lore.kernel.org/r/20251016-xe3p-v3-9-3dd173a3097a@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Xe3p introduces a dedicated SERVICE_COPY_ENABLE fuse register to reflect
the availability of the service copy engines (BCS1-BCS8).
Bspec: 74624
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Gustavo Sousa <gustavo.sousa@intel.com>
Link: https://lore.kernel.org/r/20251016-xe3p-v3-8-3dd173a3097a@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
The warning was added for a condition that never triggered even for
platforms prior to Xe2. It's not supported in Xe2 and in Xe3p the
register is removed from the main GT. Just drop the entire function as
it doesn't bring any benefit.
Bspec: 62395
Signed-off-by: Balasubramani Vivekanandan <balasubramani.vivekanandan@intel.com>
[ Drop the entire check for CTC_MODE ]
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://lore.kernel.org/r/20251016-xe3p-v3-3-3dd173a3097a@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>