Linux kernel source tree
Go to file
Rodrigo Vivi e799485044 drm/xe: Introduce the dev_coredump infrastructure.
The goal is to use devcoredump infrastructure to report error states
captured at the crash time.

The error state will contain useful information for GPU hang debug, such
as INSTDONE registers and the current buffers getting executed, as well
as any other information that helps user space and allow later replays of
the error.

The proposal here is to avoid a Xe only error_state like i915 and use
a standard dev_coredump infrastructure to expose the error state.

For our own case, the data is only useful if it is a snapshot of the
time when the GPU crash has happened, since we reset the GPU immediately
after and the registers might have changed. So the proposal here is to
have an internal snapshot to be printed out later.

Also, usually a subsequent GPU hang can be only a cause of the initial
one. So we only save the 'first' hang. The dev_coredump has a delayed
work queue where it remove the coredump and free all the data within a
few moments of the error. When that happens we also reset our capture
state and allow further snapshots.

Right now this infra only print out the time of the hang. More information
will be migrated here on subsequent work. Also, in order to organize the
dump better, the goal is to propose dev_coredump changes itself to allow
multiple files and different controls. But for now we start Xe usage of
it without any dependency on dev_coredump core changes.

v2: Add dma_fence annotation for capture that might happen during long
    running. (Thomas and Matt)
    Use xe->drm.primary->index on drm_info msg. (Jani)
v3: checkpatch fixes
v4: Fix building and locking issues found by Francois.
    Actually let's kill all of the locking in here. gt_reset serialization
    already guarantee that there will be only one capture at the same time.
    Also, the devcoredump has its own locking to protect the free and reads
    and drivers don't need to duplicate it.
    Besides this, the dma_fence locking was pushed to a following patch
    since it is not needed in this one.
    Fix a use after free identified by KASAN: Do not stash the faulty_engine
    since that will be freed somewhere else.
v5: Fix Uptime - ktime_get_boottime actually returns the Uptime. (Francois)

Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Francois Dugast <francois.dugast@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
2023-12-19 18:33:51 -05:00
arch parisc architecture fixes for kernel v6.7-rc3: 2023-11-26 09:59:39 -08:00
block vfs-6.7-rc3.fixes 2023-11-24 09:45:40 -08:00
certs This update includes the following changes: 2023-11-02 16:15:30 -10:00
crypto This push fixes a regression in ahash and hides the Kconfig sub-options for the jitter RNG. 2023-11-09 17:04:58 -08:00
Documentation drm/xe: Introduce a new DRM driver for Intel GPUs 2023-12-12 14:05:48 -05:00
drivers drm/xe: Introduce the dev_coredump infrastructure. 2023-12-19 18:33:51 -05:00
fs eventfs fixes: 2023-11-26 19:48:20 -08:00
include drm/xe: Add max engine priority to xe query 2023-12-19 18:30:21 -05:00
init As usual, lots of singleton and doubleton patches all over the tree and 2023-11-02 20:53:31 -10:00
io_uring io_uring: fix off-by one bvec index 2023-11-20 15:21:38 -07:00
ipc Many singleton patches against the MM code. The patch series which are 2023-11-02 19:38:47 -10:00
kernel Fix lockdep block chain corruption resulting in KASAN warnings. 2023-11-26 08:30:11 -08:00
lib parisc architecture fixes for kernel v6.7-rc3: 2023-11-26 09:59:39 -08:00
LICENSES LICENSES: Add the copyleft-next-0.3.1 license 2022-11-08 15:44:01 +01:00
mm vfs-6.7-rc3.fixes 2023-11-24 09:45:40 -08:00
net tls: fix NULL deref on tls_sw_splice_eof() with empty record 2023-11-23 08:51:45 -08:00
rust Kbuild updates for v6.7 2023-11-04 08:07:19 -10:00
samples Landlock updates for v6.7-rc1 2023-11-03 09:28:53 -10:00
scripts scripts/checkstack.pl: match all stack sizes for s390 2023-11-22 15:06:23 +01:00
security + Features 2023-11-03 09:48:17 -10:00
sound drm-misc-next for 6.8: 2023-11-20 09:50:09 +01:00
tools parisc architecture fixes for kernel v6.7-rc3: 2023-11-26 09:59:39 -08:00
usr arch: Remove Itanium (IA-64) architecture 2023-09-11 08:13:17 +00:00
virt ARM: 2023-09-07 13:52:20 -07:00
.clang-format iommu: Add for_each_group_device() 2023-05-23 08:15:51 +02:00
.cocciconfig
.get_maintainer.ignore get_maintainer: add Alan to .get_maintainer.ignore 2022-08-20 15:17:44 -07:00
.gitattributes .gitattributes: set diff driver for Rust source code files 2023-05-31 17:48:25 +02:00
.gitignore kbuild: rpm-pkg: generate kernel.spec in rpmbuild/SPECS/ 2023-10-03 20:49:09 +09:00
.mailmap As usual, lots of singleton and doubleton patches all over the tree and 2023-11-02 20:53:31 -10:00
.rustfmt.toml rust: add .rustfmt.toml 2022-09-28 09:02:20 +02:00
COPYING COPYING: state that all contributions really are covered by this file 2020-02-10 13:32:20 -08:00
CREDITS USB: Remove Wireless USB and UWB documentation 2023-08-09 14:17:32 +02:00
Kbuild Kbuild updates for v6.1 2022-10-10 12:00:45 -07:00
Kconfig kbuild: ensure full rebuild when the compiler is updated 2020-05-12 13:28:33 +09:00
MAINTAINERS MAINTAINERS: Document Imagination PowerVR driver patches go via drm-misc 2023-12-04 14:47:59 +01:00
Makefile Linux 6.7-rc3 2023-11-26 19:59:33 -08:00
README Drop all 00-INDEX files from Documentation/ 2018-09-09 15:08:58 -06:00

Linux kernel
============

There are several guides for kernel developers and users. These guides can
be rendered in a number of formats, like HTML and PDF. Please read
Documentation/admin-guide/README.rst first.

In order to build the documentation, use ``make htmldocs`` or
``make pdfdocs``.  The formatted documentation can also be read online at:

    https://www.kernel.org/doc/html/latest/

There are various text files in the Documentation/ subdirectory,
several of them using the Restructured Text markup notation.

Please read the Documentation/process/changes.rst file, as it contains the
requirements for building and running the kernel, and information about
the problems which may result by upgrading your kernel.