Commit Graph

664 Commits

Author SHA1 Message Date
Niklas Neronin
dad6711b9e usb: xhci: remove duplicate '0x' prefix
Prefix "0x" is automatically added by '%pad'.

Signed-off-by: Niklas Neronin <niklas.neronin@linux.intel.com>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Link: https://patch.msgid.link/20260402131342.2628648-25-mathias.nyman@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-04-02 15:55:38 +02:00
Michal Pecio
88a985cf95 usb: xhci: Simplify clearing the Event Interrupt bit
USBSTS is mostly RW1C, so to clear EINT we should write just this
one bit. Remove pointless code which ORs the bit with current value
of the register, even though the bit is already known to be set,
and writes the result back, which clears all active RW1C flags.

We used to inadvertently clear PCD and SRE in this way. PCD isn't
used by the driver and SRE is only used at resume, so clearing them
should make no difference. Don't clear them anymore.

Tested by connecting and mounting a storage device on a few HCs.

Before: xhci_irq USBSTS 0x00000018 EINT PCD -> 0x00000000
        xhci_irq USBSTS 0x00000008 EINT -> 0x00000000
After:  xhci_irq USBSTS 0x00000018 EINT PCD -> 0x00000010 PCD
        xhci_irq USBSTS 0x00000018 EINT PCD -> 0x00000010 PCD

Some flags are RsvdZ - should be written as zero regardless of the
value read, so technically it was a bug. But no problems are known.

Signed-off-by: Michal Pecio <michal.pecio@gmail.com>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Link: https://patch.msgid.link/20260402131342.2628648-3-mathias.nyman@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-04-02 15:55:36 +02:00
Dayu Jiang
d6d5febd12 usb: xhci: Prevent interrupt storm on host controller error (HCE)
The xHCI controller reports a Host Controller Error (HCE) in UAS Storage
Device plug/unplug scenarios on Android devices. HCE is checked in
xhci_irq() function and causes an interrupt storm (since the interrupt
isn’t cleared), leading to severe system-level faults.

When the xHC controller reports HCE in the interrupt handler, the driver
only logs a warning and assumes xHC activity will stop as stated in xHCI
specification. An interrupt storm does however continue on some hosts
even after HCE, and only ceases after manually disabling xHC interrupt
and stopping the controller by calling xhci_halt().

Add xhci_halt() to xhci_irq() function where STS_HCE status is checked,
mirroring the existing error handling pattern used for STS_FATAL errors.

This only fixes the interrupt storm. Proper HCE recovery requires resetting
and re-initializing the xHC.

CC: stable@vger.kernel.org
Signed-off-by: Dayu Jiang <jiangdayu@xiaomi.com>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Link: https://patch.msgid.link/20260304223639.3882398-3-mathias.nyman@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-03-11 16:18:48 +01:00
Linus Torvalds
f5e9d31e79 USB/Thunderbolt changes for 6.19-rc1
Here is the big set of USB and Thunderbolt driver updates for 6.19-rc1.
 Nothing major here, just lots of tiny updates for most of the common USB
 drivers.  Included in here are:
   - more xhci driver updates and fixes
   - Thunderbolt driver cleanups
   - usb serial driver updates
   - typec driver updates
   - USB tracepoint additions
   - dwc3 driver updates, including support for Apple hardware
   - lots of other smaller driver updates and cleanups
 
 All of these have been in linux-next for a while with no reported
 issues.
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 -----BEGIN PGP SIGNATURE-----
 
 iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCaTTLNQ8cZ3JlZ0Brcm9h
 aC5jb20ACgkQMUfUDdst+yn29wCcD23xCSWuRIKHfSWlefCwptHKT0AAoM0NJdeO
 C4/4pui5ifTu1g5Vud1r
 =Sm/3
 -----END PGP SIGNATURE-----

Merge tag 'usb-6.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb

Pull USB/Thunderbolt updates from Greg KH:
 "Here is the big set of USB and Thunderbolt driver updates for
  6.19-rc1. Nothing major here, just lots of tiny updates for most of
  the common USB drivers. Included in here are:

   - more xhci driver updates and fixes

   - Thunderbolt driver cleanups

   - usb serial driver updates

   - typec driver updates

   - USB tracepoint additions

   - dwc3 driver updates, including support for Apple hardware

   - lots of other smaller driver updates and cleanups

  All of these have been in linux-next for a while with no reported
  issues"

* tag 'usb-6.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: (161 commits)
  usb: gadget: tegra-xudc: Always reinitialize data toggle when clear halt
  USB: serial: option: move Telit 0x10c7 composition in the right place
  USB: serial: option: add Telit Cinterion FE910C04 new compositions
  usb: typec: ucsi: fix use-after-free caused by uec->work
  usb: typec: ucsi: fix probe failure in gaokun_ucsi_probe()
  usb: dwc3: core: Remove redundant comment in core init
  usb: phy: Initialize struct usb_phy list_head
  USB: serial: option: add Foxconn T99W760
  usb: usb-storage: No additional quirks need to be added to the EL-R12 optical drive.
  usb: typec: hd3ss3220: Enable VBUS based on ID pin state
  dt-bindings: usb: ti,hd3ss3220: Add support for VBUS based on ID state
  usb: typec: anx7411: add WQ_PERCPU to alloc_workqueue users
  USB: add WQ_PERCPU to alloc_workqueue users
  dt-bindings: usb: dwc3-xilinx: Describe the reset constraint for the versal platform
  drivers/usb/storage: use min() instead of min_t()
  usb: raw-gadget: cap raw_io transfer length to KMALLOC_MAX_SIZE
  usb: ohci-da8xx: remove unused platform data
  usb: gadget: functionfs: use dma_buf_unmap_attachment_unlocked() helper
  usb: uas: reduce time under spinlock
  usb: dwc3: eic7700: Add EIC7700 USB driver
  ...
2025-12-06 18:42:12 -08:00
Niklas Neronin
757508d6d7 usb: xhci: standardize single bit-field macros
Convert single bit-field macros to simple masks. The change makes the
masks more universal. Multi bit-field macros are changed in the next
commit. After both changes, all masks in xhci-caps.h will follow the
same format. I plan to introduce this change to all xhci macros.

Bit shift operations on a 32-bit signed can be problematic on some
architectures. Instead use BIT() macro, which returns a 64-bit unsigned
value. This ensures that the shift operation is performed on an unsigned
type, which is safer and more portable across different architectures.
Using unsigned integers for bit shifts avoids issues related to sign bits
and ensures consistent behavior.

Switch from 32-bit to 64-bit?
As far as I am aware, this does not cause any issues.
Performing bitwise operations between 32 and 64 bit values, the smaller
operand is promoted to match the size of the larger one, resulting in a
64-bit operation. This promotion extends the 32-bit value to 64 bits,
by zero-padding (for unsigned).

Will the change to 64-bit slow down the xhci driver?
On a 64-bit architecture - No. On a 32-bit architecture, yes? but in my
opinion the performance decrease does not outweigh the readability and
other benefits of using BIT() macro.

Why not use FIELD_GET() and FIELD_PREP()?
While they can be used for single bit macros, I prefer to use simple
bitwise operation directly. Because, it takes less space, is less overhead
and is as clear as if using FIELD_GET() and FIELD_PREP().

Why not use test_bit() macro?
Same reason as with FIELD_GET() and FIELD_PREP().

Suggested-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Reviewed-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Signed-off-by: Niklas Neronin <niklas.neronin@linux.intel.com>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Link: https://patch.msgid.link/20251119142417.2820519-23-mathias.nyman@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-11-21 14:53:01 +01:00
Niklas Neronin
f724e34719 usb: xhci: simplify Isochronous Scheduling Threshold handling
The IST is represented by bits 2:0, with bit 3 indicating the unit of
measurement, Frames or Microframes. Introduce xhci_ist_microframes(),
which returns the IST value in Microframes, simplifying the code and
reducing duplication.

Improve documentation in xhci-caps.h to clarify the IST register specifics,
including the unit conversion details. These change removes the need to
explain it each time the IST values is retrieved.

Signed-off-by: Niklas Neronin <niklas.neronin@linux.intel.com>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Link: https://patch.msgid.link/20251119142417.2820519-20-mathias.nyman@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-11-21 14:53:01 +01:00
Niklas Neronin
df08973556 usb: xhci: simplify handling of Structural Parameters 1 values
The 32-bit read-only HCSPARAMS1 register contains the following fields:
 Bits  7:0   - Number of Device Slots (MaxSlots)
 Bits 18:8   - Number of Interrupters (MaxIntrs)
 Bits 23:19  - Reserved
 Bits 31:24  - Number of Ports (MaxPorts)

Since the register value is constant for the lifetime of the controller,
it is cached in 'xhci->hcs_params1'. However, platform drivers may
override the number of interrupters through a separate variable,
'xhci->max_interrupters', leaving only the maximum slots and ports values
still derived from the cached register.

To simplify the code and improve readability, replace 'xhci->hcs_params1'
with two dedicated 'u8' fields: 'xhci->max_slots' and 'xhci->max_ports'.
These values are initialized once and used directly instead of calling
'HCS_MAX_SLOTS()' and 'HCS_MAX_PORTS()' macros.

This change reduces code clutter without increasing memory usage.

Signed-off-by: Niklas Neronin <niklas.neronin@linux.intel.com>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Link: https://patch.msgid.link/20251119142417.2820519-16-mathias.nyman@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-11-21 14:53:00 +01:00
Marco Crivellari
1ebf363fcd usb: xhci: replace use of system_wq with system_percpu_wq
Currently if a user enqueues a work item using schedule_delayed_work() the
used wq is "system_wq" (per-cpu wq) while queue_delayed_work() use
WORK_CPU_UNBOUND (used when a cpu is not specified). The same applies to
schedule_work() that is using system_wq and queue_work(), that makes use
again of WORK_CPU_UNBOUND.

This lack of consistency cannot be addressed without refactoring the API.

This continues the effort to refactor workqueue APIs, which began with
the introduction of new workqueues and a new alloc_workqueue flag in:

commit 128ea9f6cc ("workqueue: Add system_percpu_wq and system_dfl_wq")
commit 930c2ea566 ("workqueue: Add new WQ_PERCPU flag")

Switch to using system_percpu_wq because system_wq is going away as part of
a workqueue restructuring.

Suggested-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Marco Crivellari <marco.crivellari@suse.com>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Link: https://patch.msgid.link/20251119142417.2820519-12-mathias.nyman@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-11-21 14:53:00 +01:00
Michal Pecio
e6aec6d9f5 usb: xhci: Don't unchain link TRBs on quirky HCs
Some old HCs ignore transfer ring link TRBs whose chain bit is unset.
This breaks endpoint operation and sometimes makes it execute other
ring's TDs, which may corrupt their buffers or cause unwanted device
action. We avoid this by chaining all link TRBs on affected rings.

Fix an omission which allows them to be unchained by cancelling TDs.

The patch was tested by reproducing this condition on an isochronous
endpoint (non-power-of-two TDs are sometimes split not to cross 64K)
and printing link TRBs in trb_to_noop() on good and buggy HCs.

Actual hardware malfunction is rare since it requires Missed Service
Error shortly before the unchained link TRB, at least on NEC and AMD.
I have never seen it after commit bb0ba4cb10 ("usb: xhci: Apply the
link chain quirk on NEC isoc endpoints"), but it's Russian roulette
and I can't test all affected hosts and workloads. Fairly often MSEs
happen after cancellation because the endpoint was stopped.

Signed-off-by: Michal Pecio <michal.pecio@gmail.com>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Link: https://patch.msgid.link/20251119142417.2820519-11-mathias.nyman@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-11-21 14:53:00 +01:00
Michal Pecio
f781297745 usb: xhci: Assume that endpoints halt as specified
xHCI 4.8.3 recommends that software should simply assume endpoints to
halt after certain events, without looking at the Endpoint Context for
confirmation, because HCs may be slow to update that.

While no cases of such "slowness" appear to be known, different problem
exists on AMD Promontory chipsets: they may halt and generate a transfer
event, but fail to ever update the Endpoint Context at all, at least not
until some command is queued and fails with Context State Error. This is
easily triggered by disconnecting D- of a full speed serial device.

Possibly similar bug in non-AMD hardware has been reported to linux-usb.

In such case, failed TD is given back without erasing from the ring and
endpoint isn't reset. If some URB is unlinked later, Stop Endpoint fails
and its handler resets the endpoint. On next submission it will restart
on the stale TD. Outcome is UAF on success, or another halt on error and
then Dequeue doesn't move and URBs are stuck. Unlinking and resubmitting
the URBs causes unlimited ring expansion if the situation repeats.

This can be solved by ignoring Endpoint Context State and trusting that
endpoints halt when required, except one known case in ancient hardware.
The check for "Already resolving halted ep" becomes redundant, because
for these completion codes we now jump to xhci_handle_halted_endpoint()
which deals with pending EP_HALTED internally.

Link: https://lore.kernel.org/linux-usb/20250311234139.0e73e138@foxbook/
Link: https://lore.kernel.org/linux-usb/20250918055527.4157212-1-zhangjinpeng@kylinos.cn/
Signed-off-by: Michal Pecio <michal.pecio@gmail.com>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Link: https://patch.msgid.link/20251119142417.2820519-10-mathias.nyman@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-11-21 14:53:00 +01:00
Niklas Neronin
511afe80b8 usb: xhci: add helper to read PORTSC register
Add a dedicated helper function to read the USB Port Status and Control
(PORTSC) register. This complements xhci_portsc_writel() and improves code
clarity by providing a clear counterpart for reading the register.

Suggested-by: Peter Chen <peter.chen@kernel.org>
Reviewed-by: Peter Chen <peter.chen@kerne.org>
Signed-off-by: Niklas Neronin <niklas.neronin@linux.intel.com>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Link: https://patch.msgid.link/20251119142417.2820519-7-mathias.nyman@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-11-21 14:52:59 +01:00
Mathias Nyman
86dcf43be8 xhci: simplify and rework trb_in_td()
The trb_in_td() checking is quite complex, especially when checking for
TRBs in ranges that can span several segments.

Simplify the search by creating a position index for each TRB on the
ring, and just compare the position indexes.
Add a more generic dma_in_range() helper that checks if a trb dma
address is in the range between a start and end trb and call it from
trb_in_td()

Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Link: https://patch.msgid.link/20251119142417.2820519-4-mathias.nyman@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-11-21 14:52:59 +01:00
Mathias Nyman
fad902d670 xhci: Add helper to find trb from its dma address
Add a xhci_dma_to_trb() helper, and use it to find the transfer TRB
early in handle_tx_event() based on the dma address found in the
event TRB.

With this helper we can avoid using 'ep_seg' transfer TRB segment
variable as both a a boolean to indicate if the transfer TRB is part
of the next queued TD, and to actually find the transfer TRB based
on ep_seg and ep_trb_dma.

This is a first step in reworking and cleaning up trb_in_td() and
handle_tx_event()

Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Link: https://patch.msgid.link/20251119142417.2820519-3-mathias.nyman@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-11-21 14:52:59 +01:00
Mathias Nyman
b69dfcab68 xhci: fix stale flag preventig URBs after link state error is cleared
A usb device caught behind a link in ss.Inactive error state needs to
be reset to recover. A VDEV_PORT_ERROR flag is used to track this state,
preventing new transfers from being queued until error is cleared.

This flag may be left uncleared if link goes to error state between two
resets, and print the following message:

"xhci_hcd 0000:00:14.0: Can't queue urb, port error, link inactive"

Fix setting and clearing the flag.

The flag is cleared after hub driver has successfully reset the device
when hcd->reset_device is called. xhci-hcd issues an internal "reset
device" command in this callback, and clear all flags once the command
completes successfully.

This command may complete with a context state error if slot was recently
reset and is already in the defauilt state. This is treated as a success
but flag was left uncleared.

The link state field is also unreliable if port is currently in reset,
so don't set the flag in active reset cases.
Also clear the flag immediately when link is no longer in ss.Inactive
state and port event handler detects a completed reset.

This issue was discovered while debugging kernel bugzilla issue 220491.
It is likely one small part of the problem, causing some of the failures,
but root cause remains unknown

Link: https://bugzilla.kernel.org/show_bug.cgi?id=220491
Fixes: b8c3b71808 ("usb: xhci: Don't try to recover an endpoint if port is in error state.")
Cc: stable@vger.kernel.org
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Link: https://patch.msgid.link/20251107162819.1362579-2-mathias.nyman@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-11-09 10:54:44 +09:00
Niklas Neronin
931e468764 usb: xhci: improve TR Dequeue Pointer mask
Address the naming and usage of the TR Dequeue Pointer mask in the xhci
driver. The Endpoint Context Field at offset 0x08 is defined as follows:
 Bit 0		Dequeue Cycle State (DCS)
 Bits 3:1	RsvdZ (Reserved and Zero)
 Bits 63:4	TR Dequeue Pointer

When extracting the TR Dequeue Pointer for an Endpoint without Streams,
in xhci_handle_cmd_set_deq(), the inverted Dequeue Cycle State mask
(~EP_CTX_CYCLE_MASK) is used, inadvertently including the Reserved bits.
Although bits 3:1 are typically zero, using the incorrect mask could cause
issues.

The existing mask, named "SCTX_DEQ_MASK," is misleading because "SCTX"
implies exclusivity to Stream Contexts, whereas the TR Dequeue Pointer is
applicable to both Stream and non-Stream Contexts.

Rename the mask to "TR_DEQ_PTR_MASK", utilize GENMASK_ULL() macro and use
the mask when handling the TR Dequeue Pointer field.

Function xhci_get_hw_deq() returns the Endpoint Context Field 0x08, either
directly from the Endpoint context or a Stream.

Signed-off-by: Niklas Neronin <niklas.neronin@linux.intel.com>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Link: https://lore.kernel.org/r/20250917210726.97100-5-mathias.nyman@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-09-18 09:53:11 +02:00
Michal Pecio
0ed023a883 usb: xhci: Update a comment about Stop Endpoint retries
Retries are no longer gated by a quirk, so remove that part.
Add a brief explanation of the timeout.

Signed-off-by: Michal Pecio <michal.pecio@gmail.com>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Link: https://lore.kernel.org/r/20250917210726.97100-3-mathias.nyman@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-09-18 09:53:11 +02:00
Michal Pecio
08fa726e66 Revert "usb: xhci: Avoid Stop Endpoint retry loop if the endpoint seems Running"
This reverts commit 28a76fcc4c.

No actual HW bugs are known where Endpoint Context shows Running state
but Stop Endpoint fails repeatedly with Context State Error and leaves
the endpoint state unchanged. Stop Endpoint retries on Running EPs have
been performed since early 2021 with no such issues reported so far.

Trying to handle this hypothetical case brings a more realistic danger:
if Stop Endpoint fails on an endpoint which hasn't yet started after a
doorbell ring and enough latency occurs before this completion event is
handled, the driver may time out and begin removing cancelled TDs from
a running endpoint, even though one more retry would stop it reliably.

Such high latency is rare but not impossible, and removing TDs from a
running endpoint can cause more damage than not giving back a cancelled
URB (which wasn't happening anyway). So err on the side of caution and
revert to the old policy of always retrying if the EP appears running.

[Remove stable tag as we are dealing with theoretical cases -Mathias]

Fixes: 28a76fcc4c ("usb: xhci: Avoid Stop Endpoint retry loop if the endpoint seems Running")
Signed-off-by: Michal Pecio <michal.pecio@gmail.com>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Link: https://lore.kernel.org/r/20250917210726.97100-2-mathias.nyman@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-09-18 09:53:11 +02:00
Greg Kroah-Hartman
0f577e88d9 Merge patch series "eUSB2 Double Isochronous IN Bandwidth support"
Sakari Ailus <sakari.ailus@linux.intel.com> says:

This series enables support for eUSB2 Double Isochronous IN Bandwidth UVC
devices specified in 'USB 2.0 Double Isochronous IN Bandwidth' ECN. In
short, it adds support for new integrated USB2 webcams that can send twice
the data compared to conventional USB2 webcams.

These devices are identified by the device descriptor bcdUSB 0x0220 value.
They have an additional eUSB2 Isochronous Endpoint Companion Descriptor,
and a zero max packet size in regular isoc endpoint descriptor. Support
for parsing that new descriptor was added in commit

c749f058b4 ("USB: core: Add eUSB2 descriptor and parsing in USB core")

This series adds support to UVC, USB core, and xHCI to identify eUSB2
double isoc devices, and allow and set proper max packet, iso frame desc
sizes, bytes per interval, and other values in URBs and xHCI endpoint
contexts needed to support the double data rates for eUSB2 double isoc
devices.

since v4:
  https://lore.kernel.org/linux-usb/20250812132445.3185026-1-sakari.ailus@linux.intel.com
- New patch: use le16_to_cpu() to access endpoint descriptor's
  wMaxPacketSize field, which is an __le16. This isn't a bugfix as the
  value was compared to 0.
- New patch: add USB device speed check for eUSB2 isochronous endpoint
  companion parsing. The check is then removed from sites checking the
  existence of the companion (through companion's bDescriptorType field,
  which is non-zero for valid descriptors).
- New patch: do not parse eUSB2 isoc double BW companion descriptor on
  interrupt or OUT endpoints. It is not supposed to be found there,
  according to the ECN.
- Rename usb_endpoint_max_isoc_bpi() as
  usb_endpoint_max_periodic_payload() and move it right after
  usb_maxpacket().
- Fixed @ep reference in kernel-doc documentation for
  usb_endpoint_max_periodic_payload().
- In usb_endpoint_max_periodic_payload(), call struct usb_device pointer
  argument "udev" instead of "dev", to align with naming elsewhere.
- Add support for interrupt endpoints in
  usb_endpoint_max_periodic_payload(); eUSB2 double isoc BW is still
  limited to isochronous endpoints though.
- In usb_endpoint_max_periodic_payload(), remove the separate case for
  USB_SPEED_HIGH as the check is already done in parsing the eUSB isoc
  double BW companion, which is checked for.
- New patch: use usb_endpoint_max_periodic_payload() in xHCI driver,
  replacing xhci_get_max_esit_payload().
- Check non-zero bDescriptorType field of ep->eusb2_isoc_ep_comp instead
  of dwBytesPerInterval value exceeding 3072, where
  xhci_eusb2_is_isoc_bw_double() was used. This aligns the checks of eUSB2
  isochronous double bandwidth support for an endpoint.
- New patch: introduce usb_endpoint_is_hs_isoc_double() to figure out
  whether an endpoint uses isochronous double bandwidth and use the
  function in the xHCI driver and the usb core.
  xhci_eusb2_is_isoc_bw_double() is dropped, as well as the
  MAX_ISOC_XFER_SIZE_HS macro. usb_endpoint_is_hs_isoc_double() also
  includes check for bcdUSB == 0x220, to anticipate adding support for
  eUSB2V2.
- Merge condition for checking eUSB2 isoc double bw support for
  xHCI/endpoint in xhci_get_endpoint_mult().
- Improve comment regarding maximum packet size bits 12:11 in
  xhci_get_endpoint_max_burst().
- Aligned subject prefixes with the recent patches to the same files.

since v3:
  https://lore.kernel.org/linux-usb/20250807055355.1257029-1-sakari.ailus@linux.intel.com/
- Use spaces in aligning macro body for HCC2_EUSB2_DIC() (1st patch).
- Move usb_endpoint_max_isoc_bpi() to drivers/usb/core/usb.c (3rd patch).

since v2:
  https://lore.kernel.org/linux-usb/20250711083413.1552423-1-sakari.ailus@linux.intel.com
- Use ep->eusb2_isoc_ep_comp.bDescriptorType to determined whether the
  eUSB2 isochronous endpoint companion descriptor exists.
- Clean up eUSB2 double isoc bw maxp calculation.
- Drop le16_to_cpu(udev->descriptor.bcdUSB) == 0x220 check from
  xhci_eusb2_is_isoc_bw_double() -- it's redundant as
  ep->eusb2_isoc_ep_comp.dwBytesPerInterval will be zero otherwise.
- Add kernel-doc documentation for usb_endpoint_max_isoc_bpi().
- Check the endpoint has IN direction in usb_endpoint_max_isoc_bpi() and
  usb_submit_urb() as a condition for eUSB2 isoc double bw.

since v1:
  https://lore.kernel.org/linux-usb/20250616093730.2569328-2-mathias.nyman@linux.intel.com
- Introduce uvc_endpoint_max_isoc_bpi() to obtain maximum bytes per
  interval value for an endpoint, in a new patch (3rd). This code has been
  slightly reworked from the instance in the UVC driver, including support
  for SuperSpeedPlus Isochronous Endpoint Companion.
- Use usb_endpoint_max_isoc_bpi() in the UVC driver instead of open-coding
  eUSB2 support there, also drop now-redundant uvc_endpoint_max_bpi().
- Use u32 for maximum bpi and related information in the UVC driver -- the
  value could be larger than a 16-bit type can hold.
- Assume max in usb_submit_urb() is a natural number as
  usb_endpoint_maxp() returns only natural numbers (2nd patch).

Link: https://lore.kernel.org/r/20250820143824.551777-1-sakari.ailus@linux.intel.com
Cc: Sakari Ailus <sakari.ailus@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-09-06 15:25:08 +02:00
Rai, Amardeep
0c670dc882 usb: xhci: Add host support for eUSB2 double isochronous bandwidth devices
Detect eUSB2 double isoc bw capable hosts and devices, and set the proper
xhci endpoint context values such as 'Mult', 'Max Burst Size', and 'Max
ESIT Payload' to enable the double isochronous bandwidth endpoints.

Intel xHC uses the endpoint context 'Mult' field for eUSB2 isoc
endpoints even if hosts supporting Large ESIT Payload Capability should
normally ignore the mult field.

Signed-off-by: Rai, Amardeep <amardeep.rai@intel.com>
Co-developed-by: Kannappan R <r.kannappan@intel.com>
Signed-off-by: Kannappan R <r.kannappan@intel.com>
Reviewed-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Co-developed-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Co-developed-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Acked-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Link: https://lore.kernel.org/r/20250820143824.551777-8-sakari.ailus@linux.intel.com
2025-09-06 15:25:05 +02:00
Weitao Wang
2eb0337615 usb: xhci: Fix slot_id resource race conflict
xHC controller may immediately reuse a slot_id after it's disabled,
giving it to a new enumerating device before the xhci driver freed
all resources related to the disabled device.

In such a scenario, device-A with slot_id equal to 1 is disconnecting
while device-B is enumerating, device-B will fail to enumerate in the
follow sequence.

1.[device-A] send disable slot command
2.[device-B] send enable slot command
3.[device-A] disable slot command completed and wakeup waiting thread
4.[device-B] enable slot command completed with slot_id equal to 1 and
	     wakeup waiting thread
5.[device-B] driver checks that slot_id is still in use (by device-A) in
	     xhci_alloc_virt_device, and fail to enumerate due to this
	     conflict
6.[device-A] xhci->devs[slot_id] set to NULL in xhci_free_virt_device

To fix driver's slot_id resources conflict, clear xhci->devs[slot_id] and
xhci->dcbba->dev_context_ptrs[slot_id] pointers in the interrupt context
when disable slot command completes successfully. Simultaneously, adjust
function xhci_free_virt_device to accurately handle device release.

[minor smatch warning and commit message fix -Mathias]

Cc: stable@vger.kernel.org
Fixes: 7faac1953e ("xhci: avoid race between disable slot command and host runtime suspend")
Signed-off-by: Weitao Wang <WeitaoWang-oc@zhaoxin.com>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Link: https://lore.kernel.org/r/20250819125844.2042452-2-mathias.nyman@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-08-19 16:12:13 +02:00
Su Hui
7919407eca usb: xhci: print xhci->xhc_state when queue_command failed
When encounters some errors like these:
xhci_hcd 0000:4a:00.2: xHCI dying or halted, can't queue_command
xhci_hcd 0000:4a:00.2: FIXME: allocate a command ring segment
usb usb5-port6: couldn't allocate usb_device

It's hard to know whether xhc_state is dying or halted. So it's better
to print xhc_state's value which can help locate the resaon of the bug.

Signed-off-by: Su Hui <suhui@nfschina.com>
Link: https://lore.kernel.org/r/20250725060117.1773770-1-suhui@nfschina.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-07-25 10:50:24 +02:00
Mario Limonciello
4b9c60e440 usb: xhci: Avoid showing errors during surprise removal
When a USB4 dock is unplugged from a system it won't respond to ring
events. The PCI core handles the surprise removal event and notifies
all PCI drivers. The XHCI PCI driver sets a flag that the device is
being removed as well.

When that flag is set don't show messages in the cleanup path for
marking the controller dead.

Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Acked-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Link: https://lore.kernel.org/r/20250717073107.488599-2-mathias.nyman@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-07-17 10:53:04 +02:00
Roy Luo
7aed15379d Revert "usb: xhci: Implement xhci_handshake_check_state() helper"
This reverts commit 6ccb83d6c4.

Commit 6ccb83d6c4 ("usb: xhci: Implement xhci_handshake_check_state()
helper") was introduced to workaround watchdog timeout issues on some
platforms, allowing xhci_reset() to bail out early without waiting
for the reset to complete.

Skipping the xhci handshake during a reset is a dangerous move. The
xhci specification explicitly states that certain registers cannot
be accessed during reset in section 5.4.1 USB Command Register (USBCMD),
Host Controller Reset (HCRST) field:
"This bit is cleared to '0' by the Host Controller when the reset
process is complete. Software cannot terminate the reset process
early by writinga '0' to this bit and shall not write any xHC
Operational or Runtime registers until while HCRST is '1'."

This behavior causes a regression on SNPS DWC3 USB controller with
dual-role capability. When the DWC3 controller exits host mode and
removes xhci while a reset is still in progress, and then tries to
configure its hardware for device mode, the ongoing reset leads to
register access issues; specifically, all register reads returns 0.
These issues extend beyond the xhci register space (which is expected
during a reset) and affect the entire DWC3 IP block, causing the DWC3
device mode to malfunction.

Cc: stable <stable@kernel.org>
Fixes: 6ccb83d6c4 ("usb: xhci: Implement xhci_handshake_check_state() helper")
Signed-off-by: Roy Luo <royluo@google.com>
Link: https://lore.kernel.org/r/20250522190912.457583-3-royluo@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-06-19 12:41:35 +02:00
Niklas Neronin
bf9cce90da usb: xhci: rename 'irq_pending' to 'iman'
The Interrupt Register Set contains Interrupt Management register (IMAN).
The IMAN register contains the following fields:
 - Bit 0:	Interrupt Pending (IP)
 - Bit 1:	Interrupt Enable (IE)
 - Bits 31:2:	RsvdP (Reserved and Preserved)

Tn the xhci driver, the pointer currently named 'irq_pending' refers to the
IMAN register. However, the name "irq_pending" only describes one of the
fields within the IMAN register, rather than the entire register itself.
To improve clarity and better align with the xHCI specification,
the pointer is renamed to 'iman'.

Signed-off-by: Niklas Neronin <niklas.neronin@linux.intel.com>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Link: https://lore.kernel.org/r/20250515135621.335595-22-mathias.nyman@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-05-21 12:35:33 +02:00
Niklas Neronin
e1db856bd2 usb: xhci: remove '0' write to write-1-to-clear register
xHCI specification 1.2, section 5.5.2.1.
Interrupt Pending bit is RW1C (Write-1-to-clear), which means that
writing '0' to is has no effect and is removed.

The Interrupt Pending (IP) bit is cleared at the start of interrupt
handling; xhci_clear_interrupt_pending(). This could theoretically
cause a new interrupt to be issued before the xhci driver reaches
the interrupter disable functions.
To address this, the IP bit is read after Interrupt Enable is
disabled, and a debug message is issued if the IP bit is still set.

Signed-off-by: Niklas Neronin <niklas.neronin@linux.intel.com>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Link: https://lore.kernel.org/r/20250515135621.335595-18-mathias.nyman@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-05-21 12:35:33 +02:00
Niklas Neronin
f5bce30ad2 usb: xhci: guarantee that IMAN register is flushed
Add read call to guarantee that the write to the IMAN register has
been flushed.

xHCI specification 1.2, section 5.5.2.1, Note:
"Most systems have write buffers that minimize overhead, but this may
 require a read operation to guarantee that the write has been flushed
 from the posted buffer."

Signed-off-by: Niklas Neronin <niklas.neronin@linux.intel.com>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Link: https://lore.kernel.org/r/20250515135621.335595-17-mathias.nyman@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-05-21 12:35:33 +02:00
Xu Rao
59d50e53e0 usb: xhci: Add debugfs support for xHCI port bandwidth
In many projects, you need to obtain the available bandwidth of the
xhci roothub port. Refer to xhci rev1_2 and use the TRB_GET_BW
command to obtain it.

hardware tested:
03:00.3 USB controller: Advanced Micro Devices, Inc. [AMD] Raven USB 3.1
(prog-if 30 [XHCI])
Subsystem: Huawei Technologies Co., Ltd. Raven USB 3.1
Flags: bus master, fast devsel, latency 0, IRQ 30
Memory at c0300000 (64-bit, non-prefetchable) [size=1M]
Capabilities: [48] Vendor Specific Information: Len=08 <?>
Capabilities: [50] Power Management version 3
Capabilities: [64] Express Endpoint, MSI 00
Capabilities: [a0] MSI: Enable- Count=1/8 Maskable- 64bit+
Capabilities: [c0] MSI-X: Enable+ Count=8 Masked-
Kernel driver in use: xhci_hcd

test progress:
1. cd /sys/kernel/debug/usb/xhci/0000:03:00.3/port_bandwidth# ls
FS_BW  HS_BW  SS_BW
2. test fs speed  device
cat FS_BW
port[1] available bw: 90%.
port[2] available bw: 90%.
port[3] available bw: 90%.
port[4] available bw: 90%.
port[5] available bw: 0%.
port[6] available bw: 0%.
port[7] available bw: 0%.
port[8] available bw: 0%.
plug in fs usb audio ID 0d8c:013c
cat FS_BW
port[1] available bw: 76%.
port[2] available bw: 76%.
port[3] available bw: 76%.
port[4] available bw: 76%.
port[5] available bw: 0%.
port[6] available bw: 0%.
port[7] available bw: 0%.
port[8] available bw: 0%.
3. test hs speed device
cat HS_BW
port[1] available bw: 79%.
port[2] available bw: 79%.
port[3] available bw: 79%.
port[4] available bw: 79%.
port[5] available bw: 0%.
port[6] available bw: 0%.
port[7] available bw: 0%.
port[8] available bw: 0%.
plug in hs usb video ID 0408:1040
cat HS_BW
port[1] available bw: 39%.
port[2] available bw: 39%.
port[3] available bw: 39%.
port[4] available bw: 39%.
port[5] available bw: 0%.
port[6] available bw: 0%.
port[7] available bw: 0%.
port[8] available bw: 0%.
4.cat SS_BW
port[1] available bw: 0%.
port[2] available bw: 0%.
port[3] available bw: 0%.
port[4] available bw: 0%.
port[5] available bw: 90%.
port[6] available bw: 90%.
port[7] available bw: 90%.
port[8] available bw: 90%.

Signed-off-by: Xu Rao <raoxu@uniontech.com>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Link: https://lore.kernel.org/r/20250515135621.335595-3-mathias.nyman@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-05-21 12:35:31 +02:00
Michal Pecio
597f5c2f41 usb: xhci: Don't log transfer ring segment list on errors
The error message above used to span two lines, rarely more. A recent
cleanup concentrated useful information from it in one line, but then
it added printing the list of all ring segments, which is even longer
than before. It provides no new information in usual cases and little
in unusual ones, but adds noise to the log. Drop it.

Signed-off-by: Michal Pecio <michal.pecio@gmail.com>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Link: https://lore.kernel.org/r/20250515135621.335595-2-mathias.nyman@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-05-21 12:35:31 +02:00
Greg Kroah-Hartman
ab6dc9a6c7 Merge 6.15-rc6 into usb-next
We need the USB fixes in here as well.

Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-05-13 08:26:58 +02:00
Michal Pecio
6328bdc988 usb: xhci: Don't trust the EP Context cycle bit when moving HW dequeue
VIA VL805 doesn't bother updating the EP Context cycle bit when the
endpoint halts. This is seen by patching xhci_move_dequeue_past_td()
to print the cycle bits of the EP Context and the TRB at hw_dequeue
and then disconnecting a flash drive while reading it. Actual cycle
state is random as expected, but the EP Context bit is always 1.

This means that the cycle state produced by this function is wrong
half the time, and then the endpoint stops working.

Work around it by looking at the cycle bit of TD's end_trb instead
of believing the Endpoint or Stream Context. Specifically:

- rename cycle_found to hw_dequeue_found to avoid confusion
- initialize new_cycle from td->end_trb instead of hw_dequeue
- switch new_cycle toggling to happen after end_trb is found

Now a workload which regularly stalls the device works normally for
a few hours and clearly demonstrates the HW bug - the EP Context bit
is not updated in a new cycle until Set TR Dequeue overwrites it:

[  +0,000298] sd 10:0:0:0: [sdc] Attached SCSI disk
[  +0,011758] cycle bits: TRB 1 EP Ctx 1
[  +5,947138] cycle bits: TRB 1 EP Ctx 1
[  +0,065731] cycle bits: TRB 0 EP Ctx 1
[  +0,064022] cycle bits: TRB 0 EP Ctx 0
[  +0,063297] cycle bits: TRB 0 EP Ctx 0
[  +0,069823] cycle bits: TRB 0 EP Ctx 0
[  +0,063390] cycle bits: TRB 1 EP Ctx 0
[  +0,063064] cycle bits: TRB 1 EP Ctx 1
[  +0,062293] cycle bits: TRB 1 EP Ctx 1
[  +0,066087] cycle bits: TRB 0 EP Ctx 1
[  +0,063636] cycle bits: TRB 0 EP Ctx 0
[  +0,066360] cycle bits: TRB 0 EP Ctx 0

Also tested on the buggy ASM1042 which moves EP Context dequeue to
the next TRB after errors, one problem case addressed by the rework
that implemented this loop. In this case hw_dequeue can be enqueue,
so simply picking the cycle bit of TRB at hw_dequeue wouldn't work.

Commit 5255660b20 ("xhci: add quirk for host controllers that
don't update endpoint DCS") tried to solve the stale cycle problem,
but it was more complex and got reverted due to a reported issue.

Cc: Jonathan Bell <jonathan@raspberrypi.org>
Cc: Oliver Neukum <oneukum@suse.com>
Signed-off-by: Michal Pecio <michal.pecio@gmail.com>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Link: https://lore.kernel.org/r/20250505125630.561699-2-mathias.nyman@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-05-05 16:30:45 +02:00
Greg Kroah-Hartman
615dca38c2 Linux 6.15-rc4
-----BEGIN PGP SIGNATURE-----
 
 iQFSBAABCgA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAmgOrWseHHRvcnZhbGRz
 QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGFyIH/AhXcuA8y8rk43mo
 t+0GO7JR4dnr4DIl74GgDjCXlXiKCT7EXMfD/ABdofTxV4Pbyv+pUODlg1E6eO9U
 C1WWM5PPNBGDDEVSQ3Yu756nr0UoiFhvW0R6pVdou5cezCWAtIF9LTN8DEUgis0u
 EUJD9+/cHAMzfkZwabjm/HNsa1SXv2X47MzYv/PdHKr0htEPcNHF4gqBrBRdACGy
 FJtaCKhuPf6TcDNXOFi5IEWMXrugReRQmOvrXqVYGa7rfUFkZgsAzRY6n/rUN5Z9
 FAgle4Vlv9ohVYj9bXX8b6wWgqiKRpoN+t0PpRd6G6ict1AFBobNGo8LH3tYIKqZ
 b/dCGNg=
 =xDGd
 -----END PGP SIGNATURE-----

Merge 6.15-rc4 into usb-next

We need the USB fixes in here as well, and this resolves the following
merge conflicts that were reported in linux-next:

	drivers/usb/chipidea/ci_hdrc_imx.c
	drivers/usb/host/xhci.h

Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-04-28 10:32:58 +02:00
Michal Pecio
1ea050da55 usb: xhci: Fix invalid pointer dereference in Etron workaround
This check is performed before prepare_transfer() and prepare_ring(), so
enqueue can already point at the final link TRB of a segment. And indeed
it will, some 0.4% of times this code is called.

Then enqueue + 1 is an invalid pointer. It will crash the kernel right
away or load some junk which may look like a link TRB and cause the real
link TRB to be replaced with a NOOP. This wouldn't end well.

Use a functionally equivalent test which doesn't dereference the pointer
and always gives correct result.

Something has crashed my machine twice in recent days while playing with
an Etron HC, and a control transfer stress test ran for confirmation has
just crashed it again. The same test passes with this patch applied.

Fixes: 5e1c67abc9 ("xhci: Fix control transfer error on Etron xHCI host")
Cc: stable@vger.kernel.org
Signed-off-by: Michal Pecio <michal.pecio@gmail.com>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Reviewed-by: Kuangyi Chiang <ki.chiang65@gmail.com>
Link: https://lore.kernel.org/r/20250410151828.2868740-5-mathias.nyman@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-04-11 14:36:55 +02:00
Michal Pecio
9e3a28793d usb: xhci: Fix Short Packet handling rework ignoring errors
A Short Packet event before the last TRB of a TD is followed by another
event on the final TRB on spec-compliant HCs, which is most of them.

A 'last_td_was_short' flag was added to know if a TD has just completed
as Short Packet and another event is to come. The flag was cleared after
seeing the event (unless no TDs are pending, but that's a separate bug)
or seeing a new TD complete as something other than Short Packet.

A rework replaced the flag with an 'old_trb_comp_code' variable. When
an event doesn't match the pending TD and the previous event was Short
Packet, the new event is silently ignored.

To preserve old behavior, 'old_trb_comp_code' should be cleared at this
point, but instead it is being set to current comp code, which is often
Short Packet again. This can cause more events to be silently ignored,
even though they are no longer connected with the old TD that completed
short and indicate a serious problem with the driver or the xHC.

Common device classes like UAC in async mode, UVC, serial or the UAS
status pipe complete as Short Packet routinely and could be affected.

Clear 'old_trb_comp_code' to zero, which is an invalid completion code
and the same value the variable starts with. This restores original
behavior on Short Packet and also works for illegal Etron events, which
the code has been extended to cover too.

Fixes: b331a3d809 ("xhci: Handle spurious events on Etron host isoc enpoints")
Signed-off-by: Michal Pecio <michal.pecio@gmail.com>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Link: https://lore.kernel.org/r/20250410151828.2868740-4-mathias.nyman@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-04-11 14:36:55 +02:00
Mathias Nyman
b513cc1905 Revert "xhci: Prevent early endpoint restart when handling STALL errors."
This reverts commit 860f5d0d35.

Paul Menzel reported that the two EP_STALLED patches in 6.15-rc1 cause
regression. Turns out that the new flag may never get cleared after
reset-resume, preventing xhci from restarting the endpoint.

Revert this to take a proper look at it.

Link: https://lore.kernel.org/linux-usb/84b400f8-2943-44e0-8803-f3aac3b670af@molgen.mpg.de
cc: Paul Menzel <pmenzel@molgen.mpg.de>
cc: Michal Pecio <michal.pecio@gmail.com>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Link: https://lore.kernel.org/r/20250410151828.2868740-3-mathias.nyman@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-04-11 14:36:55 +02:00
Wesley Cheng
5beb4a53a1 usb: host: xhci-mem: Cleanup pending secondary event ring events
As part of xHCI bus suspend, the xHCI is halted.  However, if there are
pending events in the secondary event ring, it is observed that the xHCI
controller stops responding to further commands upon host or device
initiated bus resume.  Iterate through all pending events and update the
dequeue pointer to the beginning of the event ring.

Tested-by: Puma Hsu <pumahsu@google.com>
Tested-by: Daehwan Jung <dh10.jung@samsung.com>
Signed-off-by: Wesley Cheng <quic_wcheng@quicinc.com>
Acked-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20250409194804.3773260-3-quic_wcheng@quicinc.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-04-11 13:02:29 +02:00
Michal Pecio
28a76fcc4c usb: xhci: Avoid Stop Endpoint retry loop if the endpoint seems Running
Nothing prevents a broken HC from claiming that an endpoint is Running
and repeatedly rejecting Stop Endpoint with Context State Error.

Avoid infinite retries and give back cancelled TDs.

No such cases known so far, but HCs have bugs.

Signed-off-by: Michal Pecio <michal.pecio@gmail.com>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Link: https://lore.kernel.org/r/20250311154551.4035726-4-mathias.nyman@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-03-11 17:58:43 +01:00
Michal Pecio
dfc88357b6 usb: xhci: Don't change the status of stalled TDs on failed Stop EP
When the device stalls an endpoint, current TD is assigned -EPIPE
status and Reset Endpoint is queued. If a Stop Endpoint is pending
at the time, it will run before Reset Endpoint and fail due to the
stall. Its handler will change TD's status to -EPROTO before Reset
Endpoint handler runs and initiates giveback.

Check if the stall has already been handled and don't try to do it
again. Since xhci_handle_halted_endpoint() performs this check too,
not overwriting td->status is the only difference.

I haven't seen this case yet, but I have seen a related one where
the xHC has already executed Reset Endpoint, EP Context state is
now Stopped and EP_HALTED is set. If the xHC took a bit longer to
execute Reset Endpoint, said case would become this one.

Signed-off-by: Michal Pecio <michal.pecio@gmail.com>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Link: https://lore.kernel.org/r/20250311154551.4035726-3-mathias.nyman@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-03-11 17:58:43 +01:00
Mathias Nyman
b331a3d809 xhci: Handle spurious events on Etron host isoc enpoints
Unplugging a USB3.0 webcam from Etron hosts while streaming results
in errors like this:

[ 2.646387] xhci_hcd 0000:03:00.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 18 comp_code 13
[ 2.646446] xhci_hcd 0000:03:00.0: Looking for event-dma 000000002fdf8630 trb-start 000000002fdf8640 trb-end 000000002fdf8650
[ 2.646560] xhci_hcd 0000:03:00.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 18 comp_code 13
[ 2.646568] xhci_hcd 0000:03:00.0: Looking for event-dma 000000002fdf8660 trb-start 000000002fdf8670 trb-end 000000002fdf8670

Etron xHC generates two transfer events for the TRB if an error is
detected while processing the last TRB of an isoc TD.

The first event can be any sort of error (like USB Transaction or
Babble Detected, etc), and the final event is Success.

The xHCI driver will handle the TD after the first event and remove it
from its internal list, and then print an "Transfer event TRB DMA ptr
not part of current TD" error message after the final event.

Commit 5372c65e13 ("xhci: process isoc TD properly when there was a
transaction error mid TD.") is designed to address isoc transaction
errors, but unfortunately it doesn't account for this scenario.

This issue is similar to the XHCI_SPURIOUS_SUCCESS case where a success
event follows a 'short transfer' event, but the TD the event points to
is already given back.

Expand the spurious success 'short transfer' event handling to cover
the spurious success after error on Etron hosts.

Kuangyi Chiang reported this issue and submitted a different solution
based on using error_mid_td. This commit message is mostly taken
from that patch.

Reported-by: Kuangyi Chiang <ki.chiang65@gmail.com>
Closes: https://lore.kernel.org/linux-usb/20241028025337.6372-6-ki.chiang65@gmail.com/
Tested-by: Kuangyi Chiang <ki.chiang65@gmail.com>
Tested-by: Michal Pecio <michal.pecio@gmail.com>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Link: https://lore.kernel.org/r/20250306144954.3507700-16-mathias.nyman@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-03-06 16:46:16 +01:00
Michal Pecio
118abe036c usb: xhci: Unify duplicate inc_enq() code
Extract a block of code copied from inc_enq() into a separate function
and call it from inc_enq() and the other function which used this code.
Remove the pointless 'next' variable which only aliases ring->enqueue.

Note: I don't know if any 0.95 xHC ever reached series production, but
"AMD 0.96 host" appears to be the "Llano" family APU. Example dmesg at
https://linux-hardware.org/?probe=79d5cfd4fd&log=dmesg

pci 0000:00:10.0: [1022:7812] type 00 class 0x0c0330
hcc params 0x014042c3 hci version 0x96 quirks 0x0000000000000608

Signed-off-by: Michal Pecio <michal.pecio@gmail.com>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Link: https://lore.kernel.org/r/20250306144954.3507700-15-mathias.nyman@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-03-06 16:46:16 +01:00
Mathias Nyman
860f5d0d35 xhci: Prevent early endpoint restart when handling STALL errors.
Ensure that an endpoint halted due to device STALL is not
restarted before a Clear_Feature(ENDPOINT_HALT) request is sent to
the device.

The host side of the endpoint may otherwise be started early by the
'Set TR Deq' command completion handler which is called if dequeue
is moved past a cancelled or halted TD.

Prevent this with a new flag set for bulk and interrupt endpoints
when a Stall Error is received. Clear it in hcd->endpoint_reset()
which is called after Clear_Feature(ENDPOINT_HALT) is sent.

Also add a debug message if a class driver queues a new URB after the
STALL. Note that class driver might not be aware of the STALL
yet when it submits the URB as URBs are given back in BH.

Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Link: https://lore.kernel.org/r/20250306144954.3507700-13-mathias.nyman@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-03-06 16:46:16 +01:00
Niklas Neronin
9a7f4bc4c8 usb: xhci: move debug capabilities from trb_in_td() to handle_tx_event()
Function trb_in_td() currently includes debug capabilities that are
triggered when its debug argument is set to true. The only consumer of
these debug capabilities is handle_tx_event(), which calls trb_in_td()
twice, once for its primary functionality and a second time solely for
debugging purposes if the first call returns 'NULL'.

This approach is inefficient and can lead to confusion, as trb_in_td()
executes the same code with identical arguments twice, differing only in
the debug output during the second execution.

To enhance clarity and efficiency, move the debug capabilities out of
trb_in_td() and integrates them directly into handle_tx_event().
This change reduces the argument count of trb_in_td() and ensures that
debug steps are executed only when necessary, streamlining the function's
operation.

Signed-off-by: Niklas Neronin <niklas.neronin@linux.intel.com>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Link: https://lore.kernel.org/r/20250306144954.3507700-12-mathias.nyman@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-03-06 16:46:16 +01:00
Niklas Neronin
d71cb7d6e1 usb: xhci: refactor trb_in_td() to be static
Relocate trb_in_td() and marks it as static, as it's exclusively utilized
in xhci-ring.c. This adjustment lays the groundwork for future rework of
the function.

The function's logic remains unchanged; only its access specifier is
altered to static and a redundant "else" is removed on line 325
(due to checkpatch.pl complaining).

Signed-off-by: Niklas Neronin <niklas.neronin@linux.intel.com>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Link: https://lore.kernel.org/r/20250306144954.3507700-11-mathias.nyman@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-03-06 16:46:16 +01:00
Michal Pecio
fe1ccba52a usb: xhci: Skip only one TD on Ring Underrun/Overrun
If skipping is deferred to events other than Missed Service Error itsef,
it means we are running on an xHCI 1.0 host and don't know how many TDs
were missed until we reach some ordinary transfer completion event.

And in case of ring xrun, we can't know where the xrun happened either.

If we skip all pending TDs, we may prematurely give back TDs added after
the xrun had occurred, risking data loss or buffer UAF by the xHC.

If we skip none, a driver may become confused and stop working when all
its URBs are missed and appear to be "in flight" forever.

Skip exactly one TD on each xrun event - the first one that was missed,
as we can now be sure that the HC has finished processing it. Provided
that one more TD is queued before any subsequent doorbell ring, it will
become safe to skip another TD by the time we get an xrun again.

Signed-off-by: Michal Pecio <michal.pecio@gmail.com>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Link: https://lore.kernel.org/r/20250306144954.3507700-8-mathias.nyman@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-03-06 16:46:16 +01:00
Michal Pecio
d0b619599e usb: xhci: Expedite skipping missed isoch TDs on modern HCs
xHCI spec rev. 1.0 allowed the TRB pointer of Missed Service events
to be NULL. Having no idea which of the queued TDs were missed and
which are waiting, we can only set a flag to skip missed TDs later.

But HCs are also allowed to give us pointer to the last missed TRB,
and this became mandatory in spec rev. 1.1 and later.

Use this pointer, if available, to immediately skip all missed TDs.
This reduces latency and risk of skipping-related bugs, because we
can now leave the skip flag cleared for future events.

Handle Missed Service Error events as 'error mid TD', if applicable,
because rev. 1.0 spec excplicitly says so in notes to 4.10.3.2 and
later revs in 4.10.3.2 and 4.11.2.5.2. Notes to 4.9.1 seem to apply.

Tested on ASM1142 and ASM3142 v1.1 xHCs which provide TRB pointers.
Tested on AMD, Etron, Renesas v1.0 xHCs which provide TRB pointers.
Tested on NEC v0.96 and VIA v1.0 xHCs which send a NULL pointer.

Change inspired by a discussion about realtime USB audio.

Link: https://lore.kernel.org/linux-usb/76e1a191-020d-4a76-97f6-237f9bd0ede0@gmx.net/T/
Signed-off-by: Michal Pecio <michal.pecio@gmail.com>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Link: https://lore.kernel.org/r/20250306144954.3507700-7-mathias.nyman@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-03-06 16:46:15 +01:00
Michal Pecio
906dec15b9 usb: xhci: Fix isochronous Ring Underrun/Overrun event handling
The TRB pointer of these events points at enqueue at the time of error
occurrence on xHCI 1.1+ HCs or it's NULL on older ones. By the time we
are handling the event, a new TD may be queued at this ring position.

I can trigger this race by rising interrupt moderation to increase IRQ
handling delay. Similar delay may occur naturally due to system load.

If this ever happens after a Missed Service Error, missed TDs will be
skipped and the new TD processed as if it matched the event. It could
be given back prematurely, risking data loss or buffer UAF by the xHC.

Don't complete TDs on xrun events and don't warn if queued TDs don't
match the event's TRB pointer, which can be NULL or a link/no-op TRB.
Don't warn if there are no queued TDs at all.

Now that it's safe, also handle xrun events if the skip flag is clear.
This ensures completion of any TD stuck in 'error mid TD' state right
before the xrun event, which could happen if a driver submits a finite
number of URBs to a buggy HC and then an error occurs on the last TD.

Signed-off-by: Michal Pecio <michal.pecio@gmail.com>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Link: https://lore.kernel.org/r/20250306144954.3507700-6-mathias.nyman@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-03-06 16:46:15 +01:00
Michal Pecio
bfa8459942 usb: xhci: Complete 'error mid TD' transfers when handling Missed Service
Missed Service Error after an error mid TD means that the failed TD has
already been passed by the xHC without acknowledgment of the final TRB,
a known hardware bug. So don't wait any more and give back the TD.

Reproduced on NEC uPD720200 under conditions of ludicrously bad USB link
quality, confirmed to behave as expected using dynamic debug.

Signed-off-by: Michal Pecio <michal.pecio@gmail.com>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Link: https://lore.kernel.org/r/20250306144954.3507700-5-mathias.nyman@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-03-06 16:46:15 +01:00
Michal Pecio
58d0a3fab5 usb: xhci: Don't skip on Stopped - Length Invalid
Up until commit d56b0b2ab1 ("usb: xhci: ensure skipped isoc TDs are
returned when isoc ring is stopped") in v6.11, the driver didn't skip
missed isochronous TDs when handling Stoppend and Stopped - Length
Invalid events. Instead, it erroneously cleared the skip flag, which
would cause the ring to get stuck, as future events won't match the
missed TD which is never removed from the queue until it's cancelled.

This buggy logic seems to have been in place substantially unchanged
since the 3.x series over 10 years ago, which probably speaks first
and foremost about relative rarity of this case in normal usage, but
by the spec I see no reason why it shouldn't be possible.

After d56b0b2ab1, TDs are immediately skipped when handling those
Stopped events. This poses a potential problem in case of Stopped -
Length Invalid, which occurs either on completed TDs (likely already
given back) or Link and No-Op TRBs. Such event won't be recognized
as matching any TD (unless it's the rare Link TRB inside a TD) and
will result in skipping all pending TDs, giving them back possibly
before they are done, risking isoc data loss and maybe UAF by HW.

As a compromise, don't skip and don't clear the skip flag on this
kind of event. Then the next event will skip missed TDs. A downside
of not handling Stopped - Length Invalid on a Link inside a TD is
that if the TD is cancelled, its actual length will not be updated
to account for TRBs (silently) completed before the TD was stopped.

I had no luck producing this sequence of completion events so there
is no compelling demonstration of any resulting disaster. It may be
a very rare, obscure condition. The sole motivation for this patch
is that if such unlikely event does occur, I'd rather risk reporting
a cancelled partially done isoc frame as empty than gamble with UAF.

This will be fixed more properly by looking at Stopped event's TRB
pointer when making skipping decisions, but such rework is unlikely
to be backported to v6.12, which will stay around for a few years.

Fixes: d56b0b2ab1 ("usb: xhci: ensure skipped isoc TDs are returned when isoc ring is stopped")
Cc: stable@vger.kernel.org
Signed-off-by: Michal Pecio <michal.pecio@gmail.com>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Link: https://lore.kernel.org/r/20250306144954.3507700-4-mathias.nyman@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-03-06 16:46:15 +01:00
Niklas Neronin
856563be98 usb: xhci: remove redundant update_ring_for_set_deq_completion() function
The function is a remnant from a previous implementation and is now
redundant. There is no longer a need to search for the dequeue pointer,
as both the TRB and segment dequeue pointers are saved within
'queued_deq_seg' and 'queued_deq_ptr'.

Signed-off-by: Niklas Neronin <niklas.neronin@linux.intel.com>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Link: https://lore.kernel.org/r/20250306144954.3507700-3-mathias.nyman@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-03-06 16:46:15 +01:00
Krzysztof Kozlowski
789a171429 USB: host: Use str_enable_disable-like helpers
Replace ternary (condition ? "enable" : "disable") syntax with helpers
from string_choices.h because:
1. Simple function call with one argument is easier to read.  Ternary
   operator has three arguments and with wrapping might lead to quite
   long code.
2. Is slightly shorter thus also easier to read.
3. It brings uniformity in the text - same string.
4. Allows deduping by the linker, which results in a smaller binary
   file.

Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Link: https://lore.kernel.org/r/20250114-str-enable-disable-usb-v1-2-c8405df47c19@linaro.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-01-15 18:28:12 +01:00
Mathias Nyman
3ac820f9d4 xhci: Add command completion parameter support
xHC hosts can pass 24 bits of data with a command completion event TRB
as the completion code only uses 8 bits of the 32 bit status field.

Only Configure Endpoint, and Get Extended Property commands utilize
this "command completion parameter" 24 bit field.
For other command completion events the xHC should keep it RsvdZ.

Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Link: https://lore.kernel.org/r/20241227120142.1035206-5-mathias.nyman@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-12-27 13:06:10 +01:00