Commit Graph

332 Commits

Author SHA1 Message Date
Ilpo Järvinen
8cb0816673 PCI: Fix alignment calculation for resource size larger than align
The commit bc75c8e507 ("PCI: Rewrite bridge window head alignment
function") did not use if (r_size <= align) check from pbus_size_mem() for
the new head alignment bookkeeping structure (aligns2[]). In some
configurations, this can result in producing a gap into the bridge window
which the resource larger than its alignment cannot fill.

The old alignment calculation algorithm was removed by the subsequent
commit 3958bf16e2 ("PCI: Stop over-estimating bridge window size") which
renamed the aligns2[] array leaving only aligns[] array.

Add the if (r_size <= align) check back to avoid this problem.

Fixes: bc75c8e507 ("PCI: Rewrite bridge window head alignment function")
Reported-by: Guenter Roeck <linux@roeck-us.net>
Closes: https://lore.kernel.org/all/b05a6f14-979d-42c9-924c-d8408cb12ae7@roeck-us.net/
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Xifer <xiferdev@gmail.com>
Link: https://patch.msgid.link/20260324165633.4583-11-ilpo.jarvinen@linux.intel.com
2026-03-27 10:19:08 -05:00
Ilpo Järvinen
8bbe8cec89 PCI: Rename window_alignment() to pci_min_window_alignment()
window_alignment() lacks prefix. Rename it to pci_min_window_alignment() in
order to include the prefix and also add min to indicate the returned
window alignment is the minimum PCI spec and arch allows.

Also make it available in drivers/pci/pci.h as upcoming changes will need
to call it from outside of setup-bus.c.

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Xifer <xiferdev@gmail.com>
Link: https://patch.msgid.link/20260324165633.4583-9-ilpo.jarvinen@linux.intel.com
2026-03-27 10:19:08 -05:00
Ilpo Järvinen
1ee4716a5a PCI: Fix premature removal from realloc_head list during resource assignment
reassign_resources_sorted() checks for two things:

a) Resource assignment failures for mandatory resources by checking if the
   resource remains unassigned, which are known to always repeat, and does
   not attempt to assign them again.

b) That resource is not among the ones being processed/assigned at this
   stage, leading to skip processing such resources in
   reassign_resources_sorted() as well (resource assignment progresses
   one PCI hierarchy level at a time).

The problem here is that a) is checked before b), but b) also implies the
resource is not being assigned yet, making also a) true. As a) only skips
resource assignment but still removes the resource from realloc_head, the
later stages that would need to process the information in realloc_head
cannot obtain the optional size information anymore. This leads to
considering only non-optional part for bridge windows deeper in the PCI
hierarchy.

This problem has been observed during rescan (add_size is not considered
while attempting assignment for 0000:e2:00.0 indicating the corresponding
entry was removed from realloc_head while processing resource assignments
for 0000:e1):

  pci_bus 0000:e1: scanning bus
  ...
  pci 0000:e3:01.0: bridge window [mem 0x800000000-0x1000ffffff 64bit pref] to [bus e4] add_size 60c000000 add_align 800000000
  pci 0000:e3:01.0: bridge window [mem 0x00100000-0x000fffff] to [bus e4] add_size 200000 add_align 200000
  pci 0000:e3:02.0: disabling bridge window [mem 0x00000000-0x000fffff 64bit pref] to [bus e5] (unused)
  pci 0000:e2:00.0: bridge window [mem 0x800000000-0x1000ffffff 64bit pref] to [bus e3-e5] add_size 60c000000 add_align 800000000
  pci 0000:e2:00.0: bridge window [mem 0x00100000-0x001fffff] to [bus e3-e5] add_size 200000 add_align 200000
  pcieport 0000:e1:02.0: bridge window [io  size 0x2000]: can't assign; no space
  pcieport 0000:e1:02.0: bridge window [io  size 0x2000]: failed to assign
  pcieport 0000:e1:02.0: bridge window [io  0x1000-0x2fff]: resource restored
  pcieport 0000:e1:02.0: bridge window [io  0x1000-0x2fff]: resource restored
  pcieport 0000:e1:02.0: bridge window [io  size 0x2000]: can't assign; no space
  pcieport 0000:e1:02.0: bridge window [io  size 0x2000]: failed to assign
  pci 0000:e2:00.0: bridge window [mem 0x28f000000000-0x28f800ffffff 64bit pref]: assigned

Fixes: 96336ec702 ("PCI: Perform reset_resource() and build fail list in sync")
Reported-by: Peter Nisbet <peter.nisbet@intel.com>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Peter Nisbet <peter.nisbet@intel.com>
Link: https://patch.msgid.link/20260313084551.1934-1-ilpo.jarvinen@linux.intel.com
2026-03-26 15:00:39 -05:00
Ilpo Järvinen
dc4b4d04e1 PCI: Prevent shrinking bridge window from its required size
Steve reported an eGPU (either Radeon Instinct MI50 32GB or NVIDIA 3080
10GB) connected via Thunderbolt was not assigned sufficient BAR space in
v6.11, so the amdgpu and nvidia drivers were unable to initialize the
device.

pci_bridge_distribute_available_resources() -> ... ->
adjust_bridge_window() is called between __pci_bus_size_bridges()
and assigning the resources. Since the commit 948675736a ("PCI: Allow
adjust_bridge_window() to shrink resource if necessary")
adjust_bridge_window() can also shrink the bridge window.  The shrunken
size, however, conflicts with what __pci_bus_size_bridges() ->
pbus_size_mem() calculated as the required bridge window size. By shrinking
the size, adjust_bridge_window() prevents the rest of the resource fitting
algorithm from working as intended.  Resource fitting logic is expecting
assignment failures when bridge windows need resizing, but there are cases
where failures are no longer happening after the commit 948675736a ("PCI:
Allow adjust_bridge_window() to shrink resource if necessary").

The commit 948675736a ("PCI: Allow adjust_bridge_window() to shrink
resource if necessary") justifies the change by the extra reservation
made due to hpmemsize parameter, however, the kernel code contradicts
that statement. (For simplicity, finer-grained hpmmiosize and hpmmiopref
parameters that can be used to the same effect as hpmemsize are ignored in
this description.)

pbus_size_mem() calls calculate_memsize() twice. First with add_size=0
to find out the minimal required resource size. The second call occurs
with add_size=hpmemsize (effectively) but the result does not directly
affect the resource size only resulting in an entry on the realloc_head
list (a.k.a. add_list). Yet, adjust_bridge_window() directly changes
the resource size which does not include what is reserved due to
hpmemsize. Also, if the required size for the bridge window exceeds
hpmemsize, the parameter does not have any effect even on the second
size calculation made by pbus_size_mem(); from calculate_memsize():

  size = max(size, add_size) + children_add_size;

The commit ae4611f1d7 ("PCI: Set resource size directly in
adjust_bridge_window()") that precedes the commit 948675736a ("PCI:
Allow adjust_bridge_window() to shrink resource if necessary") is also
related to causing this problem. Its changelog explicitly states
adjust_bridge_window() wants to "guarantee" allocation success.
Guaranteed allocations, however, are incompatible with how the other
parts of the resource fitting algorithm work. The given justification
fails to explain why guaranteed allocations at this stage are required
nor why forcing window to a smaller value than what was calculated by
pbus_size_mem() is correct. While the change might have worked by chance
in some test scenario, too small bridge window does not "guarantee"
success from the point of view of the endpoint device resource
assignments. No issue is mentioned within the changelog so it's unclear
if the change was made to fix some observed issue nor and what that
issue was.

The unwanted shrinking of a bridge window occurs, e.g., when a device with
large BARs such as eGPU is attached using Thunderbolt and the Root Port
holds less than enough resource space for the eGPU. The GPU resources are
in order of GBs and the default hotplug allocation is a mere 2MB
(DEFAULT_HOTPLUG_MMIO_PREF_SIZE). The problem is illustrated by this log
(filtered to the relevant content only):

  pci 0000:00:07.0: PCI bridge to [bus 03-2c]
  pci 0000:00:07.0:   bridge window [mem 0x6000000000-0x601bffffff 64bit pref]
  pci 0000:03:00.0: PCI bridge to [bus 00]
  pci 0000:03:00.0:   bridge window [mem 0x00000000-0x000fffff 64bit pref]
  pci 0000:03:00.0: bridge configuration invalid ([bus 00-00]), reconfiguring
  pci 0000:03:00.0: PCI bridge to [bus 04-2c]
  pcieport 0000:00:07.0: Assigned bridge window [mem 0x6000000000-0x601bffffff 64bit pref] to [bus 03-2c] cannot fit 0xc00000000 required for 0000:03:00.0 bridging to [bus 04-2c]
  pci 0000:03:00.0: bridge window [mem 0x800000000-0x10003fffff 64bit pref] to [bus 04-2c] add_size 100000 add_align 100000
  pcieport 0000:00:07.0: distributing available resources
  pci 0000:03:00.0: bridge window [mem 0x800000000-0x10003fffff 64bit pref] shrunken by 0x00000007e4400000
  pci 0000:03:00.0: bridge window [mem 0x6000000000-0x601bffffff 64bit pref]: assigned

The initial size of the Root Port's window is 448MB (0x601bffffff -
0x6000000000). __pci_bus_size_bridges() -> pbus_size_mem() calculates the
required size to be 32772 MB (0x10003fffff - 0x800000000) which would fit
the eGPU resources. adjust_bridge_window() then shrinks the bridge window
down to what is guaranteed to fit into the Root Port's bridge window. The
bridge window for 03:00.0 is also eliminated from the add_list (a.k.a.
realloc_head) list by adjust_bridge_window().

After adjustment, the resources are assigned and as the bridge window for
03:00.0 is assigned successfully, no failure is recorded. Without a
failure, no attempt to resize the window of the Root Port is required.  The
end result is eGPU not having large enough resources to work.

The commit 948675736a ("PCI: Allow adjust_bridge_window() to shrink
resource if necessary") also claims nested bridge windows are sized the
same, which is false. pbus_size_mem() calculates the size for the parent
bridge window by summing all the downstream resources so the resource
fitting calculates larger bridge window for the parent to accommodate the
childen. That is, hpmemsize does not result the same size for the case
where there are nested bridge windows.

In order to fix the most immediate problem, don't shrink the resource size
in adjust_bridge_window() as hpmemsize had nothing to do with it.  When
considering add_size, only reduce it up to what is added due to hpmemsize
(if required size is larger than hpmemsize, the parameter has no impact,
see calculate_memsize()). Unfortunately, if the tail of the bridge window
was aligned in calculate_memsize() from below hpmemsize to above it, the
size check will falsely match but the check at least errs to the side of
caution. There's not enough information available in adjust_bridge_window()
to know the calculated size precisely.

This is not exactly a revert of the commits e4611f1d7e9 ("PCI: Set resource
size directly in adjust_bridge_window()") and 948675736a ("PCI: Allow
adjust_bridge_window() to shrink resource if necessary") as shrinking still
remains in place but is implemented differently, and the end result behaves
very differently.

It is possible that those two commits fixed some other issue that is not
described with enough detail in the changelog and undoing parts of them
results in another regression due to behavioral change.  Nonetheless, as
described above, the solution by those two commits was flawed and the
issue, if one exists, should be solved in a way that is compatible with the
rest of the resource fitting algorithm instead of working against it.

Besides shrinking, the case where adjust_bridge_window() expands the bridge
window is likely somewhat wrong as well because it removes the entry from
add_list (a.k.a. realloc_head), but it is less damaging as that only
impacts optional resources and may have no impact if expanding by hpmemsize
is larger than what add_size was. Fixing it is left as further work.

Fixes: 948675736a ("PCI: Allow adjust_bridge_window() to shrink resource if necessary")
Fixes: ae4611f1d7 ("PCI: Set resource size directly in adjust_bridge_window()")
Reported-by: Steve Oswald <stevepeter.oswald@gmail.com>
Closes: https://lore.kernel.org/linux-pci/CAN95MYEaO8QYYL=5cN19nv_qDGuuP5QOD17pD_ed6a7UqFVZ-g@mail.gmail.com/
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Link: https://patch.msgid.link/20260219153951.68869-1-ilpo.jarvinen@linux.intel.com
2026-03-26 14:30:21 -05:00
Linus Torvalds
bf4afc53b7 Convert 'alloc_obj' family to use the new default GFP_KERNEL argument
This was done entirely with mindless brute force, using

    git grep -l '\<k[vmz]*alloc_objs*(.*, GFP_KERNEL)' |
        xargs sed -i 's/\(alloc_objs*(.*\), GFP_KERNEL)/\1)/'

to convert the new alloc_obj() users that had a simple GFP_KERNEL
argument to just drop that argument.

Note that due to the extreme simplicity of the scripting, any slightly
more complex cases spread over multiple lines would not be triggered:
they definitely exist, but this covers the vast bulk of the cases, and
the resulting diff is also then easier to check automatically.

For the same reason the 'flex' versions will be done as a separate
conversion.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2026-02-21 17:09:51 -08:00
Kees Cook
69050f8d6d treewide: Replace kmalloc with kmalloc_obj for non-scalar types
This is the result of running the Coccinelle script from
scripts/coccinelle/api/kmalloc_objs.cocci. The script is designed to
avoid scalar types (which need careful case-by-case checking), and
instead replace kmalloc-family calls that allocate struct or union
object instances:

Single allocations:	kmalloc(sizeof(TYPE), ...)
are replaced with:	kmalloc_obj(TYPE, ...)

Array allocations:	kmalloc_array(COUNT, sizeof(TYPE), ...)
are replaced with:	kmalloc_objs(TYPE, COUNT, ...)

Flex array allocations:	kmalloc(struct_size(PTR, FAM, COUNT), ...)
are replaced with:	kmalloc_flex(*PTR, FAM, COUNT, ...)

(where TYPE may also be *VAR)

The resulting allocations no longer return "void *", instead returning
"TYPE *".

Signed-off-by: Kees Cook <kees@kernel.org>
2026-02-21 01:02:28 -08:00
Ilpo Järvinen
a3b93b4223 PCI: Account fully optional bridge windows correctly
pbus_size_mem_optional() adds dev_res->add_size of a bridge window into
children_add_size when the window has a non-optional part. However, if the
bridge window is fully optional, only r_size is added (which is zero for
such a window).

Also, a second dev_res entry will be added by pci_dev_res_add_to_list()
into realloc_head for the bridge window (resulting in triggering the
realloc_head-must-be-fully-consumed sanity check after a single pass of the
resource assignment algorithm):

  WARNING: drivers/pci/setup-bus.c:2153 at pci_assign_unassigned_root_bus_resources+0xa5/0x260

Correct these problems by always adding dev_res->add_size for bridge
windows and not calling pci_dev_res_add_to_list() if the dev_res entry
exists.

Fixes: 6a5e64c75e ("PCI: Add pbus_mem_size_optional() to handle optional sizes")
Reported-by: RavitejaX Veesam <ravitejax.veesam@intel.com>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: RavitejaX Veesam <ravitejax.veesam@intel.com>
Link: https://patch.msgid.link/20260218223419.22366-1-ilpo.jarvinen@linux.intel.com
2026-02-19 15:33:40 -06:00
Kai-Heng Feng
e5f72cb9ce PCI: Validate window resource type in pbus_select_window_for_type()
After ebe091ad81 ("PCI: Use pbus_select_window_for_type() during IO
window sizing") and ae88d0b9c5 ("PCI: Use pbus_select_window_for_type()
during mem window sizing"), many bridge windows can't get resources
assigned:

  pci 0006:05:00.0: bridge window [??? 0x00001000-0x00001fff flags 0x20080000]: can't assign; no space
  pci 0006:05:00.0: bridge window [??? 0x00001000-0x00001fff flags 0x20080000]: failed to assign

Those commits replace find_bus_resource_of_type() with
pbus_select_window_for_type(), and the latter lacks resource type
validation.

Add the resource type validation back to pbus_select_window_for_type() to
match the original behavior.

Fixes: 74afce3dfc ("PCI: Add bridge window selection functions")
Link: https://bugzilla.kernel.org/show_bug.cgi?id=221072
Signed-off-by: Kai-Heng Feng <kaihengf@nvidia.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Link: https://patch.msgid.link/20260210142058.82701-1-kaihengf@nvidia.com
2026-02-12 11:08:34 -06:00
Linus Torvalds
1c2b4a4c2b pci-v7.0-changes
-----BEGIN PGP SIGNATURE-----
 
 iQJIBAABCgAyFiEEgMe7l+5h9hnxdsnuWYigwDrT+vwFAmmKJO4UHGJoZWxnYWFz
 QGdvb2dsZS5jb20ACgkQWYigwDrT+vzotA/+OGSOPOs9hWd+OwNF5Dm2WA81yG/3
 K3Jx5uMuPoSjduMbPhVcib02Mr6YDJTa6WlYNVa76ADs2G6HxcVMFHutlYudSVcl
 umSF48FnyeH1LTba88dRoVj4DB47Cue+BfhYY2L0ZtxmjQq/NRuDFAaGBh54uNeF
 Gcdgr52QlM01n1X6yKvl7vE9gPdcPH80L256ssHAm6oSOHI1SPc6gqEKUUD02f8G
 FtzfTUAq/cWYjlY3VoS5GKtdHxFYuXqC5WfbURhJ11o/nVJY9k1Zx8n4eI1tmAtN
 7q692xjWSQJZlzepOBBEyjFUpIiy80tZ43z2ptRRBeI/n/qMmGPAov/g4MzegBWG
 IAEHTAp/xx1Wra1ynr7RNvYVcPpXm2TEim8gIGah9DkHbNgbu7ing+OO7DnQuyfD
 2h4hGD2622o6uikqkwzVd4mYuIcFu7SA6yROZhFn83BRnz0QOQienDrDlvOB8XCV
 EodLAOMc2KClvOmmriFMy11PH7MFFoXexV6KS83VfDJHi4+XzBsy0w6TXTohcA9s
 JTPIkSWqf/u6SrdLjXlFGyyJ2/KCgRiXFIBhhtYBMhDuuO7nG+mcSVzMa1PT0s6C
 PF+QoT7sJof/5VMJ4o3BgPrPkD3CQICrlt8XIt5I8ngsy6RZRQ5rt+pUix7Shcn8
 DgcunuINYfQtkfw=
 =LIjp
 -----END PGP SIGNATURE-----

Merge tag 'pci-v7.0-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci

Pull PCI updates from Bjorn Helgaas:
 "Enumeration:

   - Don't try to enable Extended Tags on VFs since that bit is Reserved
     and causes misleading log messages (Håkon Bugge)

   - Initialize Endpoint Read Completion Boundary to match Root Port,
     regardless of ACPI _HPX (Håkon Bugge)

   - Apply _HPX PCIe Setting Record only to AER configuration, and only
     when OS owns PCIe hotplug but not AER, to avoid clobbering Extended
     Tag and Relaxed Ordering settings (Håkon Bugge)

  Resource management:

   - Move CardBus code to setup-cardbus.c and only build it when
     CONFIG_CARDBUS is set (Ilpo Järvinen)

   - Fix bridge window alignment with optional resources, where
     additional alignment requirement was previously lost (Ilpo
     Järvinen)

   - Stop over-estimating bridge window size since they are now assigned
     without any gaps between them (Ilpo Järvinen)

   - Increase resource MAX_IORES_LEVEL to avoid /proc/iomem flattening
     for nested bridges and endpoints (Ilpo Järvinen)

   - Add pbus_mem_size_optional() to handle sizes of optional resources
     (SR-IOV VF BARs, expansion ROMs, bridge windows) (Ilpo Järvinen)

   - Don't claim disabled bridge windows to avoid spurious claim
     failures (Ilpo Järvinen)

  Driver binding:

   - Fix device reference leak in pcie_port_remove_service() (Uwe
     Kleine-König)

   - Move pcie_port_bus_match() and pcie_port_bus_type to PCIe-specific
     portdrv.c (Uwe Kleine-König)

   - Convert portdrv to use pcie_port_bus_type.probe() and .remove()
     callbacks so .probe() and .remove() can eventually be removed from
     struct device_driver (Uwe Kleine-König)

  Error handling:

   - Clear stale errors on reporting agents upon probe so they don't
     look like recent errors (Lukas Wunner)

   - Add generic RAS tracepoint for hotplug events (Shuai Xue)

   - Add RAS tracepoint for link speed changes (Shuai Xue)

  Power management:

   - Avoid redundant delay on transition from D3hot to D3cold if the
     device was already in D3hot (Brian Norris)

   - Prevent runtime suspend until devices are fully initialized to
     avoid saving incompletely configured device state (Brian Norris)

  Power control:

   - Add power_on/off callbacks with generic signature to pwrseq,
     tc9563, and slot drivers so they can be used by pwrctrl core
     (Manivannan Sadhasivam)

   - Add PCIe M.2 connector support to the slot pwrctrl driver
     (Manivannan Sadhasivam)

   - Switch to pwrctrl interfaces to create, destroy, and power on/off
     devices, calling them from host controller drivers instead of the
     PCI core (Manivannan Sadhasivam)

   - Drop qcom .assert_perst() callbacks since this is now done by the
     controller driver instead of the pwrctrl driver (Manivannan
     Sadhasivam)

  Virtualization:

   - Remove an incorrect unlock in pci_slot_trylock() error handling
     (Jinhui Guo)

   - Lock the bridge device for slot reset (Keith Busch)

   - Enable ACS after IOMMU configuration on OF platforms so ACS is
     enabled an all devices; previously the first device enumerated
     (typically a Root Port) didn't have ACS enabled (Manivannan
     Sadhasivam)

   - Disable ACS Source Validation for IDT 0x80b5 and 0x8090 switches to
     work around hardware erratum; previously ACS SV was only
     temporarily disabled, which worked for enumeration but not after
     reset (Manivannan Sadhasivam)

  Peer-to-peer DMA:

   - Release per-CPU pgmap ref when vm_insert_page() fails to avoid hang
     when removing the PCI device (Hou Tao)

   - Remove incorrect p2pmem_alloc_mmap() warning about page refcount
     (Hou Tao)

  Endpoint framework:

   - Add configfs sub-groups synchronously to avoid NULL pointer
     dereference when racing with removal (Liu Song)

   - Fix swapped parameters in pci_{primary/secondary}_epc_epf_unlink()
     functions (Manikanta Maddireddy)

  ASPEED PCIe controller driver:

   - Add ASPEED Root Complex DT binding and driver (Jacky Chou)

  Freescale i.MX6 PCIe controller driver:

   - Add DT binding and driver support for an optional external refclock
     in addition to the refclock from the internal PLL (Richard Zhu)

   - Fix CLKREQ# control so host asserts it during enumeration and
     Endpoints can use it afterwards to exit the L1.2 link state
     (Richard Zhu)

  NVIDIA Tegra PCIe controller driver:

   - Export irq_domain_free_irqs() to allow PCI/MSI drivers that tear
     down MSI domains to be built as modules (Aaron Kling)

   - Allow pci-tegra to be built as a module (Aaron Kling)

  NVIDIA Tegra194 PCIe controller driver:

   - Relax Kconfig so tegra194 can be built for platforms beyond
     Tegra194 (Vidya Sagar)

  Qualcomm PCIe controller driver:

   - Merge SC8180x DT binding into SM8150 (Krzysztof Kozlowski)

   - Move SDX55, SDM845, QCS404, IPQ5018, IPQ6018, IPQ8074 Gen3,
     IPQ8074, IPQ4019, IPQ9574, APQ8064, MSM8996, APQ8084 to dedicated
     schema (Krzysztof Kozlowski)

   - Add DT binding and driver support for SA8255p Endpoint being
     configured by firmware (Mrinmay Sarkar)

   - Parse PERST# from all PCIe bridge nodes for future platforms that
     will have PERST# in Switch Downstream Ports as well as in Root
     Ports (Manivannan Sadhasivam)

  Renesas RZ/G3S PCIe controller driver:

   - Use pci_generic_config_write() since the writability provided by
     the custom wrapper is unnecessary (Claudiu Beznea)

  SOPHGO PCIe controller driver:

   - Disable ASPM L0s and L1 on Sophgo 2044 PCIe Root Ports (Inochi
     Amaoto)

  Synopsys DesignWare PCIe controller driver:

   - Extend PCI_FIND_NEXT_CAP() and PCI_FIND_NEXT_EXT_CAP() to return a
     pointer to the preceding Capability, to allow removal of
     Capabilities that are advertised but not fully implemented (Qiang
     Yu)

   - Remove MSI and MSI-X Capabilities in platforms that can't support
     them, so the PCI core automatically falls back to INTx (Qiang Yu)

   - Add ASPM L1.1 and L1.2 Substates context to debugfs ltssm_status
     for drivers that support this (Shawn Lin)

   - Skip PME_Turn_Off broadcast and L2/L3 transition during suspend if
     link is not up to avoid an unnecessary timeout (Manivannan
     Sadhasivam)

   - Revert dw-rockchip, qcom, and DWC core changes that used link-up
     IRQs to trigger enumeration instead of waiting for link to be up
     because the PCI core doesn't allocate bus number space for
     hierarchies that might be attached (Niklas Cassel)

   - Make endpoint iATU entry for MSI permanent instead of programming
     it dynamically, which is slow and racy with respect to other
     concurrent traffic, e.g., eDMA (Koichiro Den)

   - Use iMSI-RX MSI target address when possible to fix endpoints using
     32-bit MSI (Shawn Lin)

   - Allow DWC host controller driver probe to continue if device is not
     found or found but inactive; only fail when there's an error with
     the link (Manivannan Sadhasivam)

   - For controllers like NXP i.MX6QP and i.MX7D, where LTSSM registers
     are not accessible after PME_Turn_Off, simply wait 10ms instead of
     polling for L2/L3 Ready (Richard Zhu)

   - Use multiple iATU entries to map large bridge windows and DMA
     ranges when necessary instead of failing (Samuel Holland)

   - Add EPC dynamic_inbound_mapping feature bit for Endpoint
     Controllers that can update BAR inbound address translation without
     requiring EPF driver to clear/reset the BAR first, and advertise it
     for DWC-based Endpoints (Koichiro Den)

   - Add EPC subrange_mapping feature bit for Endpoint Controllers that
     can map multiple independent inbound regions in a single BAR,
     implement subrange mapping, advertise it for DWC-based Endpoints,
     and add Endpoint selftests for it (Koichiro Den)

   - Make resizable BARs work for Endpoint multi-PF configurations;
     previously it only worked for PF 0 (Aksh Garg)

   - Fix Endpoint non-PF 0 support for BAR configuration, ATU mappings,
     and Address Match Mode (Aksh Garg)

   - Set up iATU when ECAM is enabled; previously IO and MEM outbound
     windows weren't programmed, and ECAM-related iATU entries weren't
     restored after suspend/resume, so config accesses failed (Krishna
     Chaitanya Chundru)

  Miscellaneous:

   - Use system_percpu_wq and WQ_PERCPU to explicitly request per-CPU
     work so WQ_UNBOUND can eventually be removed (Marco Crivellari)"

* tag 'pci-v7.0-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci: (176 commits)
  PCI/bwctrl: Disable BW controller on Intel P45 using a quirk
  PCI: Disable ACS SV for IDT 0x8090 switch
  PCI: Disable ACS SV for IDT 0x80b5 switch
  PCI: Cache ACS Capabilities register
  PCI: Enable ACS after configuring IOMMU for OF platforms
  PCI: Add ACS quirk for Pericom PI7C9X2G404 switches [12d8:b404]
  PCI: Add ACS quirk for Qualcomm Hamoa & Glymur
  PCI: Use device_lock_assert() to verify device lock is held
  PCI: Use lockdep_assert_held(pci_bus_sem) to verify lock is held
  PCI: Fix pci_slot_lock () device locking
  PCI: Fix pci_slot_trylock() error handling
  PCI: Mark Nvidia GB10 to avoid bus reset
  PCI: Mark ASM1164 SATA controller to avoid bus reset
  PCI: host-generic: Avoid reporting incorrect 'missing reg property' error
  PCI/PME: Replace RMW of Root Status register with direct write
  PCI/AER: Clear stale errors on reporting agents upon probe
  PCI: Don't claim disabled bridge windows
  PCI: rzg3s-host: Fix device node reference leak in rzg3s_pcie_host_parse_port()
  PCI: dwc: Fix missing iATU setup when ECAM is enabled
  PCI: dwc: Clean up iATU index usage in dw_pcie_iatu_setup()
  ...
2026-02-11 17:20:38 -08:00
Ilpo Järvinen
2ecc1bf14e PCI: Don't claim disabled bridge windows
The commit 8278c69143 ("PCI: Preserve bridge window resource type flags")
changed bridge window resource behavior such that flags are no longer zero
if the bridge window is not valid or is disabled (mainly to preserve the
type flags for later use). If a bridge window has its limit smaller than
base address, pci_read_bridge_*() sets both IORESOURCE_UNSET and
IORESOURCE_DISABLED to indicate the bridge window exists but is not valid
with the current base and limit configuration.

The code in pci_claim_bridge_resources() still depends on the old behavior
of checking validity of the bridge window solely based on !r->flags,
whereas after 8278c69143, also IORESOURCE_DISABLED may indicate bridge
window addresses are not valid.

While pci_claim_resource() does check IORESOURCE_UNSET,
pci_claim_bridge_resource() attempts to clip the resource if
pci_claim_resource() fails, which is not correct for bridge window
resources that are not valid. As pci_bus_clip_resource() performs clipping
regardless of flags and then clears IORESOURCE_UNSET, it should not be
called unless the resource is valid.

The problem is visible in this log:

  pci 0000:20:00.0: PCI bridge to [bus 21]
  pci 0000:20:00.0: bridge window [io  size 0x0000 disabled]: can't claim; no address assigned
  pci 0000:20:00.0: [io  0x0000-0xffffffffffffffff disabled] clipped to [io 0x0000-0xffff disabled]

Add IORESOURCE_DISABLED check in pci_claim_bridge_resources() to only
claim bridge windows that appear to have a valid configuration.

Fixes: 8278c69143 ("PCI: Preserve bridge window resource type flags")
Reported-by: Sizhe Liu <liusizhe5@huawei.com>
Link: https://lore.kernel.org/all/20260203023545.2753811-1-liusizhe5@huawei.com
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: stable@vger.kernel.org
Link: https://patch.msgid.link/4d9228d6-a230-6ddf-e300-fbf42d523863@linux.intel.com
2026-02-06 16:12:42 -06:00
Ilpo Järvinen
fd29d4ea09 PCI: Separate CardBus setup & build it only with CONFIG_CARDBUS
PCI bridge window setup code includes special code to handle CardBus
bridges. CardBus has long since fallen out of favor and modern systems have
no use for it.

Move CardBus setup code to its own file and use existing CONFIG_CARDBUS to
decide whether it should be built or not.

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Link: https://patch.msgid.link/20251219174036.16738-18-ilpo.jarvinen@linux.intel.com
2026-01-27 16:36:52 -06:00
Ilpo Järvinen
b398665a5b PCI: Add 'pci' prefix to struct pci_dev_resource handling functions
setup-bus.c has static functions for handling struct pci_dev_resource
related operation which have no prefixes. Add 'pci' prefixes to those
function names as add_to_list() will be needed in another file by an
upcoming change.

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Link: https://patch.msgid.link/20251219174036.16738-17-ilpo.jarvinen@linux.intel.com
2026-01-27 16:36:52 -06:00
Ilpo Järvinen
1a5de84c3a PCI: Use resource_assigned() in setup-bus.c algorithm
Many places in the resource fitting and assignment algorithm want to
know if the resource is assigned into the resource tree or not. Convert
open-coded ->parent checks to use resource_assigned().

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Link: https://patch.msgid.link/20251219174036.16738-16-ilpo.jarvinen@linux.intel.com
2026-01-27 16:36:52 -06:00
Ilpo Järvinen
6a5e64c75e PCI: Add pbus_mem_size_optional() to handle optional sizes
The resource loop in pbus_size_mem() handles optional resources that are
either fully optional (SR-IOV and disabled Expansion ROMs) or bridge
windows that may be optional only for a part. The logic is a little
inconsistent when it comes to a bridge window that has only optional
children resources as it would be more natural to treat it similar to any
fully optional resource. As resource size should be zero in that case, it
shouldn't cause any bugs but it still seems useful to address the
inconsistency.

Place the optional size related code of pbus_size_mem() into
pbus_mem_size_optional() and add a check in pci_resource_is_optional()
for entirely optional bridge windows. Reorder the logic inside
pbus_mem_size_optional() such that fully optional resources are handled
the same irrespective of whether the resource is a bridge window or
not.

Additional motivation for this are the upcoming changes that add complexity
to the optional sizing logic due to Resizable BAR awareness.  The extra
logic would exceed any reasonable indentation level if the optional sizing
code is kept within the loop body.

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Link: https://patch.msgid.link/20251219174036.16738-14-ilpo.jarvinen@linux.intel.com
2026-01-27 16:36:52 -06:00
Ilpo Järvinen
c10fe0c0e6 PCI: Check invalid align earlier in pbus_size_mem()
Check for invalid align before any bridge window sizing actions in
pbus_size_mem() to avoid need to roll back any sizing calculations.

Placing the check earlier will make it easier to add more optional size
related calculations at where the SR-IOV logic currently is in
pbus_size_mem().

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Link: https://patch.msgid.link/20251219174036.16738-13-ilpo.jarvinen@linux.intel.com
2026-01-27 16:36:52 -06:00
Ilpo Järvinen
9629f71722 PCI: Log reset and restore of resources
PCI resource fitting and assignment is complicated to track because it
performs many actions without any logging. One of these is resource reset
(zeroing the resource) and the restore during the multi-pass resource
fitting algorithm.

Resource reset does not play well with the other PCI code if the code later
wants to reattempt assignment of that resource. Knowing that a resource was
left in the reset state without a pairing restore is useful for
understanding issues that show up as resource assignment failures.

Add pci_dbg() to both reset and restore to be better able to track what's
going on within the resource fitting algorithm.

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Link: https://patch.msgid.link/20251219174036.16738-12-ilpo.jarvinen@linux.intel.com
2026-01-27 16:36:52 -06:00
Ilpo Järvinen
5fa2f9fb34 PCI: Add pci_resource_is_bridge_win()
Add pci_resource_is_bridge_win() helper to simplify checking if the
resource is a bridge window.

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Link: https://patch.msgid.link/20251219174036.16738-11-ilpo.jarvinen@linux.intel.com
2026-01-27 16:36:52 -06:00
Ilpo Järvinen
e112fbb26b PCI: Fetch dev_res to local var in __assign_resources_sorted()
__assign_resources_sorted() calls get_res_add_size() and
get_res_add_align(), each walking through the realloc_head list to
relocate the corresponding pci_dev_resource entry.

Fetch the pci_dev_resource entry into a local variable to avoid double
walk.

In addition, reverse logic to reduce indentation level.

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Link: https://patch.msgid.link/20251219174036.16738-10-ilpo.jarvinen@linux.intel.com
2026-01-27 16:36:51 -06:00
Ilpo Järvinen
4bee4fc0f4 PCI: Use res_to_dev_res() in reassign_resources_sorted()
reassign_resources_sorted() contains a search loop for a particular
resource in the head list. res_to_dev_res() already implements the same
search so use it instead.

Drop unused found_match and dev_res variables.

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Link: https://patch.msgid.link/20251219174036.16738-9-ilpo.jarvinen@linux.intel.com
2026-01-27 16:36:51 -06:00
Ilpo Järvinen
5819403a0e PCI: Pass bridge window resource to pbus_size_mem()
pbus_size_mem() inputs type and calculates bridge window resource within.
Its caller (__pci_bus_size_bridges()) also has to look up the prefetchable
window to determine if it exists or not to decide whether to call
pbus_size_mem() twice or once.

Change the interface such that the caller is responsible in providing the
bridge window resource. Passing the resource directly avoids another lookup
for the prefetchable window inside pbus_size_mem().

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Link: https://patch.msgid.link/20251219174036.16738-8-ilpo.jarvinen@linux.intel.com
2026-01-27 16:36:51 -06:00
Ilpo Järvinen
d0c72d6e39 PCI: Push realloc check into pbus_size_mem()
pbus_size_mem() and calculate_memsize() input both min_size and add_size.
They are given the same value if realloc_head is NULL and min_size is 0
otherwise. Both are used in calculate_memsize() to enforce a lower bound to
the size.

The interface between __pci_bus_size_bridges() and the forementioned
functions can be simplied by pushing the realloc check into
pbus_size_mem().

There are only two possible cases:

  1) when calculating size0, add_size parameter given to
     calculate_memsize() is always 0 which implies only min_size
     matters.

  2) when calculating size1, realloc_head is not NULL which implies
     min_size=0 so only add_size matters.

Drop min_size parameter from pbus_size_mem() and check realloc_head when
calling calculate_memsize(). Drop add_size from calculate_memsize() and use
only min_size within max() to enforce the lower bound.

calculate_iosize() is a bit more complicated than calculate_memsize() and
is therefore left as is, but pbus_size_io() can still input only min_size
similar to pbus_size_mem().

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Link: https://patch.msgid.link/20251219174036.16738-7-ilpo.jarvinen@linux.intel.com
2026-01-27 16:36:51 -06:00
Ilpo Järvinen
f909e3ee3e PCI: Remove old_size limit from bridge window sizing
calculate_memsize() applies lower bound to the resource size before
aligning the resource size making it impossible to shrink bridge window
resources. I've not found any justification for this lower bound and
nothing indicated it was to work around some HW issue.

Prior to the commit 3baeae3603 ("PCI: Use pci_release_resource() instead
of release_resource()"), releasing a bridge window during BAR resize
resulted in clearing start and end address of the resource.  Clearing
addresses destroys the resource size as a side-effect, therefore nullifying
the effect of the old size lower bound.

After the commit 3baeae3603 ("PCI: Use pci_release_resource() instead of
release_resource()"), BAR resize uses the aligned old size, which results
in exceeding what fits into the parent window in some cases:

  xe 0030:03:00.0: [drm] Attempting to resize bar from 256MiB -> 16384MiB
  xe 0030:03:00.0: BAR 0 [mem 0x620c000000000-0x620c000ffffff 64bit]: releasing
  xe 0030:03:00.0: BAR 2 [mem 0x6200000000000-0x620000fffffff 64bit pref]: releasing
  pci 0030:02:01.0: bridge window [mem 0x6200000000000-0x620001fffffff 64bit pref]: releasing
  pci 0030:01:00.0: bridge window [mem 0x6200000000000-0x6203fbff0ffff 64bit pref]: releasing
  pci 0030:00:00.0: bridge window [mem 0x6200000000000-0x6203fbff0ffff 64bit pref]: was not released (still contains assigned resources)
  pci 0030:00:00.0: Assigned bridge window [mem 0x6200000000000-0x6203fbff0ffff 64bit pref] to [bus 01-04] free space at [mem 0x6200400000000-0x62007ffffffff 64bit pref]
  pci 0030:00:00.0: Assigned bridge window [mem 0x6200000000000-0x6203fbff0ffff 64bit pref] to [bus 01-04] cannot fit 0x4000000000 required for 0030:01:00.0 bridging to [bus 02-04]

The old size of 0x6200000000000-0x6203fbff0ffff resource was used as the
lower bound which results in 0x4000000000 size request due to alignment.
That exceeds what can fit into the parent window.

Since the lower bound never even was enforced fully because the resource
addresses were cleared when the bridge window is released, remove the
old_size lower bound entirely and trust the calculated bridge window size
is enough.

This same problem may occur on io window side but seems less likely to
cause issues due to general difference in alignment. Removing the lower
bound may have other unforeseen consequences in case of io window so it's
better to leave it as -next material if no problem is reported related to
io window sizing (BAR resize shouldn't touch io windows anyway).

Fixes: 3baeae3603 ("PCI: Use pci_release_resource() instead of release_resource()")
Reported-by: Simon Richter <Simon.Richter@hogyros.de>
Link: https://lore.kernel.org/r/f9a8c975-f5d3-4dd2-988e-4371a1433a60@hogyros.de/
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Link: https://patch.msgid.link/20251219174036.16738-6-ilpo.jarvinen@linux.intel.com
2026-01-27 16:36:51 -06:00
Ilpo Järvinen
3958bf16e2 PCI: Stop over-estimating bridge window size
New way to calculate the bridge window head alignment produces tight-fit,
that is, it does not leave any gaps between the resources.  Similarly,
relaxed tail alignment does not leave extra tail room.

Start to use bridge window calculation that does not over-estimate the size
of the required window.

pbus_upstream_space_available() can be removed.

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Malte Schröder <malte+lkml@tnxip.de>
Link: https://patch.msgid.link/20251219174036.16738-4-ilpo.jarvinen@linux.intel.com
2026-01-27 16:36:51 -06:00
Ilpo Järvinen
bc75c8e507 PCI: Rewrite bridge window head alignment function
The calculation of bridge window head alignment is done by
calculate_mem_align() [*]. With the default bridge window alignment, it
is used for both head and tail alignment.

The selected head alignment does not always result in tight-fitting
resources (gap at d4f00000-d4ffffff):

  d4800000-dbffffff : PCI Bus 0000:06
    d4800000-d48fffff : PCI Bus 0000:07
      d4800000-d4803fff : 0000:07:00.0
        d4800000-d4803fff : nvme
    d4900000-d49fffff : PCI Bus 0000:0a
      d4900000-d490ffff : 0000:0a:00.0
        d4900000-d490ffff : r8169
      d4910000-d4913fff : 0000:0a:00.0
    d4a00000-d4cfffff : PCI Bus 0000:0b
      d4a00000-d4bfffff : 0000:0b:00.0
        d4a00000-d4bfffff : 0000:0b:00.0
      d4c00000-d4c07fff : 0000:0b:00.0
    d4d00000-d4dfffff : PCI Bus 0000:15
      d4d00000-d4d07fff : 0000:15:00.0
        d4d00000-d4d07fff : xhci-hcd
    d4e00000-d4efffff : PCI Bus 0000:16
      d4e00000-d4e7ffff : 0000:16:00.0
      d4e80000-d4e803ff : 0000:16:00.0
        d4e80000-d4e803ff : ahci
    d5000000-dbffffff : PCI Bus 0000:0c

This has not caused problems (for years) with the default bridge window
tail alignment that grossly over-estimates the required tail alignment
leaving more tail room than necessary. With the introduction of relaxed
tail alignment that leaves no extra tail room whatsoever, any gaps will
immediately turn into assignment failures.

Introduce head alignment calculation that ensures no gaps are left and
apply the new approach when using relaxed alignment. We may want to
consider using it for the normal alignment eventually, but as the first
step, solve only the problem with the relaxed tail alignment.

([*] I don't understand the algorithm in calculate_mem_align().)

Link: https://git.kernel.org/history/history/c/5d0a8965aea9 ("[PATCH] 2.5.14: New PCI allocation code (alpha, arm, parisc) [2/2]")
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=220775
Reported-by: Malte Schröder <malte+lkml@tnxip.de>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Malte Schröder <malte+lkml@tnxip.de>
Cc: stable@vger.kernel.org
Link: https://patch.msgid.link/20251219174036.16738-3-ilpo.jarvinen@linux.intel.com
2026-01-27 16:26:59 -06:00
Ilpo Järvinen
7e90360e6d PCI: Fix bridge window alignment with optional resources
pbus_size_mem() has two alignments, one for required resources in min_align
and another in add_align that takes account optional resources.

The add_align is applied to the bridge window through the realloc_head
list. It can happen, however, that add_align is larger than min_align but
calculated size1 and size0 are equal due to extra tailroom (e.g., hotplug
reservation, tail alignment), and therefore no entry is created to the
realloc_head list. Without the bridge appearing in the realloc head,
add_align is lost when pbus_size_mem() returns.

The problem is visible in this log for 0000:05:00.0 which lacks
add_size ... add_align ... line that would indicate it was added into
the realloc_head list:

  pci 0000:05:00.0: PCI bridge to [bus 06-16]
  ...
  pci 0000:06:00.0: bridge window [mem 0x00100000-0x001fffff] to [bus 07] requires relaxed alignment rules
  pci 0000:06:06.0: bridge window [mem 0x00100000-0x001fffff] to [bus 0a] requires relaxed alignment rules
  pci 0000:06:07.0: bridge window [mem 0x00100000-0x003fffff] to [bus 0b] requires relaxed alignment rules
  pci 0000:06:08.0: bridge window [mem 0x00800000-0x00ffffff 64bit pref] to [bus 0c-14] requires relaxed alignment rules
  pci 0000:06:08.0: bridge window [mem 0x01000000-0x057fffff] to [bus 0c-14] requires relaxed alignment rules
  pci 0000:06:08.0: bridge window [mem 0x01000000-0x057fffff] to [bus 0c-14] requires relaxed alignment rules
  pci 0000:06:08.0: bridge window [mem 0x01000000-0x057fffff] to [bus 0c-14] add_size 100000 add_align 1000000
  pci 0000:06:0c.0: bridge window [mem 0x00100000-0x001fffff] to [bus 15] requires relaxed alignment rules
  pci 0000:06:0d.0: bridge window [mem 0x00100000-0x001fffff] to [bus 16] requires relaxed alignment rules
  pci 0000:06:0d.0: bridge window [mem 0x00100000-0x001fffff] to [bus 16] requires relaxed alignment rules
  pci 0000:05:00.0: bridge window [mem 0xd4800000-0xd97fffff]: assigned
  pci 0000:05:00.0: bridge window [mem 0x1060000000-0x10607fffff 64bit pref]: assigned
  pci 0000:06:08.0: bridge window [mem size 0x04900000]: can't assign; no space
  pci 0000:06:08.0: bridge window [mem size 0x04900000]: failed to assign

While this bug itself seems old, it has likely become more visible after
the relaxed tail alignment that does not grossly overestimate the size
needed for the bridge window.

Make sure add_align > min_align too results in adding an entry into the
realloc head list. In addition, add handling to the cases where add_size is
zero while only alignment differs.

Fixes: d74b9027a4 ("PCI: Consider additional PF's IOV BAR alignment in sizing and assigning")
Reported-by: Malte Schröder <malte+lkml@tnxip.de>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Malte Schröder <malte+lkml@tnxip.de>
Cc: stable@vger.kernel.org
Link: https://patch.msgid.link/20251219174036.16738-2-ilpo.jarvinen@linux.intel.com
2026-01-26 11:00:28 -06:00
Ilpo Järvinen
5528fd38f2 PCI: Fix Resizable BAR restore order
The commit 337b1b566d ("PCI: Fix restoring BARs on BAR resize rollback
path") changed BAR resize to layer rebar code and resource setup/restore
code cleanly. Unfortunately, it did not consider how the value of the BAR
Size field impacts the read-only bits in the Base Address Register (PCIe7
spec, sec. 7.8.6.3). That is, it very much matters in which order the BAR
Size and Base Address Register are restored.

Post-337b1b566db0 ("PCI: Fix restoring BARs on BAR resize rollback path")
during BAR resize rollback, pci_do_resource_release_and_resize() attempts
to restore the old address to the BAR that was resized, but it can fail to
setup the address correctly if the address has low bits set that collide
with the bits that are still read-only. As a result, kernel's resource and
BAR will be out-of-sync.

Fix this by restoring BAR Size before rolling back the resource changes and
restoring the BAR.

Fixes: 337b1b566d ("PCI: Fix restoring BARs on BAR resize rollback path")
Reported-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://lore.kernel.org/linux-pci/aW_w1oFQCzUxGYtu@intel.com/
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: stable@vger.kernel.org
Link: https://patch.msgid.link/20260121131417.9582-3-ilpo.jarvinen@linux.intel.com
2026-01-22 10:29:55 -06:00
Ilpo Järvinen
08d9eae76b PCI: Fix BAR resize rollback path overwriting ret
The commit 337b1b566d ("PCI: Fix restoring BARs on BAR resize rollback
path") added BAR rollback to pci_do_resource_release_and_resize() in case
of resize failure.

On the rollback, pci_claim_resource() is called, which can fail and the
code is prepared for that possibility. pci_claim_resource()'s return value,
however, overwrites the original value of ret so
pci_do_resource_release_and_resize() will return an incorrect value in the
end (as pci_claim_resource() normally succeeds, in practice ret will be 0).

Fix the issue by directly calling pci_claim_resource() inside the if ().

Fixes: 337b1b566d ("PCI: Fix restoring BARs on BAR resize rollback path")
Reported-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://lore.kernel.org/linux-pci/aW_w1oFQCzUxGYtu@intel.com/
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: stable@vger.kernel.org
Link: https://patch.msgid.link/20260121131417.9582-2-ilpo.jarvinen@linux.intel.com
2026-01-22 10:28:59 -06:00
Linus Torvalds
43dfc13ca9 pci-v6.19-changes
-----BEGIN PGP SIGNATURE-----
 
 iQJIBAABCgAyFiEEgMe7l+5h9hnxdsnuWYigwDrT+vwFAmkwoyoUHGJoZWxnYWFz
 QGdvb2dsZS5jb20ACgkQWYigwDrT+vztdBAAjvZO0NOafCbhn6lUAz/T4VxPY0R7
 L5RqeLci33rQzbQ0yhJYXsd8VemMo6Zk0qmlSwjddlOMPboHoC1jK3i4C16QoP+3
 R5ecab0VoImLl2Ffig0BZoHQpVq01q2kTGQ2YyrryzDCgBCsBG3U10ZD380pGsTW
 ypqEgOCxaQCq2mtqr5CavaCcquq2krrnHkkVQOP1ryWzRq1C3wDXcQXFYNdzXpDP
 Lq8pBIh8WN5pYwrqjrFMrtNhj7BHPmowLEaAbNIWmH8WjGav624XcKq2O+arx3Hl
 BDHFKVjWtiYikrWmAODZhlY3HgEj546h2HtQYwWPOKuSKkgzNVn28LFIDgL3oXDP
 sJ6gWIDWMgRpEI6VzmqzRXJWbTAkIrRfHv3QFzvATSZV7eHk3eUpMZtG/qOi5z3P
 rPwW7NSXKbPg5qi+zKpqC20Im1Wm6fF4qQtim3uNlz2KXlVGvQl5/Ww27nIZMn6B
 Kkv5RGzdodePbKGqd+CADAQoJOpR6kOng5ZaRdZx+6aTTooyM7KSk8bFq7j/CoYM
 PzVtO6IHIvV42di7H/NP8/qtQA3xwSLue3lWpTh+tzYmZvCdldZiIvBo9YXnlsEM
 kCetDIGBxUWBj7OdYK4hC1BPBVw3XtCWM/51ElslDZTPvT543lApBfMnVTv3JmFp
 gzvZ7XfZx443JOg=
 =qSzr
 -----END PGP SIGNATURE-----

Merge tag 'pci-v6.19-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci

Pull PCI updates from Bjorn Helgaas:
 "Enumeration:

   - Enable host bridge emulation for PCI_DOMAINS_GENERIC platforms (Dan
     Williams)

   - Switch vmd from custom domain number allocator to the common
     allocator to prevent a potential race with new non-VMD buses (Dan
     Williams)

   - Enable Precision Time Measurement (PTM) only if device advertises
     support for a relevant role, to prevent invalid PTM Requests that
     cause ACS violations that are reported as AER Uncorrectable
     Non-Fatal errors (Mika Westerberg)

  Resource management:

   - Prevent resource tree corruption when BAR resize fails (Ilpo
     Järvinen)

   - Restore BARs to the original size if a BAR resize fails (Ilpo
     Järvinen)

   - Remove BAR release from BAR resize attempts by the xe, i915, and
     amdgpu drivers so the PCI core can restore BARs if the resize fails
     (Ilpo Järvinen)

   - Move Resizable BAR code to rebar.c (Ilpo Järvinen)

   - Add pci_rebar_size_supported() and use it in i915 and xe (Ilpo
     Järvinen)

   - Add pci_rebar_get_max_size() and use it in xe and amdgpu (Ilpo
     Järvinen)

  Power management and error handling:

   - For drivers using PCI legacy suspend, save config state at suspend
     so that state (not any earlier state from enumeration, probe, or
     error recovery) will be restored when resuming (Lukas Wunner)

   - For devices with no driver or a driver that lacks power management,
     save config state at hibernate so that state (not any earlier state
     from enumeration, probe, or error recovery) will be restored when
     resuming (Lukas Wunner)

   - Save device config space on device addition, before driver binding,
     so error recovery works more reliably (Lukas Wunner)

   - Drop pci_save_state() from several drivers that no longer need it
     since the PCI core always does it and pci_restore_state() no longer
     invalidates the saved state (Lukas Wunner)

   - Document use of pci_save_state() by drivers to capture the state
     they want restored during error recovery (Lukas Wunner)

  Power control:

   - Add a struct pci_ops.assert_perst() function pointer to
     assert/deassert PCIe PERST# and implement it for the qcom driver
     (Krishna Chaitanya Chundru)

   - Add DT binding and pwrctrl driver for the Toshiba TC9563 PCIe
     switch, which must be held in reset after poweron so the pwrctrl
     driver can configure the switch via I2C before bringing up the
     links (Krishna Chaitanya Chundru)

  Endpoint framework:

   - Convert the endpoint doorbell test to use a threaded IRQ to fix a
     'sleeping while atomic' issue (Bhanu Seshu Kumar Valluri)

   - Add endpoint VNTB MSI doorbell support to reduce latency between
     host and endpoint (Frank Li)

  New native PCIe controller drivers:

   - Add CIX Sky1 host controller DT binding and driver (Hans Zhang)

   - Add NXP S32G host controller DT binding and driver (Vincent
     Guittot)

   - Add Renesas RZ/G3S host controller DT binding and driver (Claudiu
     Beznea)

   - Add SpacemiT K1 host controller DT binding and driver (Alex Elder)

  Amlogic Meson PCIe controller driver:

   - Update DT binding to name DBI region 'dbi', not 'elbi', and update
     driver to support both (Manivannan Sadhasivam)

  Apple PCIe controller driver:

   - Move struct pci_host_bridge allocation from pci_host_common_init()
     to callers, which significantly simplifies pcie-apple (Marc
     Zyngier)

  Broadcom STB PCIe controller driver:

   - Disable advertising ASPM L0s support correctly (Jim Quinlan)

   - Add a panic/die handler to print diagnostic info in case PCIe
     caused an unrecoverable abort (Jim Quinlan)

  Cadence PCIe controller driver:

   - Add module support for Cadence platform host and endpoint
     controller driver (Manikandan K Pillai)

   - Split headers into 'legacy' (LGA) and 'high perf' (HPA) to prepare
     for new CIX Sky1 driver (Manikandan K Pillai)

  MediaTek PCIe controller driver:

   - Convert DT binding to YAML schema (Christian Marangi)

   - Add Airoha AN7583 DT compatible and driver support (Christian
     Marangi)

  Qualcomm PCIe controller driver:

   - Add Qualcomm Kaanapali to SM8550 DT binding (Qiang Yu)

   - Add required 'power-domains' and 'resets' to qcom sa8775p, sc7280,
     sc8280xp, sm8150, sm8250, sm8350, sm8450, sm8550, x1e80100 DT
     schemas (Krzysztof Kozlowski)

   - Look up OPP using both frequency and data rate (not just frequency)
     so RPMh votes can account for both (Krishna Chaitanya Chundru)

  Rockchip DesignWare PCIe controller driver:

   - Add Rockchip RK3528 compatible strings in DT binding (Yao Zi)

  STMicroelectronics STM32MP25 PCIe controller driver:

   - Fix a race between link training and endpoint register
     initialization (Christian Bruel)

   - Align endpoint allocations to match the ATU requirements (Christian
     Bruel)

  Synopsys DesignWare PCIe controller driver:

   - Clear L1 PM Substate Capability 'Supported' bits unless glue driver
     says it's supported, which prevents users from enabling non-working
     L1SS. Currently only qcom and tegra194 support L1SS (Bjorn Helgaas)

   - Remove now-superfluous L1SS disable code from tegra194 (Bjorn
     Helgaas)

   - Configure L1SS support in dw-rockchip when DT says
     'supports-clkreq' (Shawn Lin)

  TI Keystone PCIe controller driver:

   - Fail the probe instead of silently succeeding if ks_pcie_of_data
     didn't specify Root Complex or Endpoint mode (Siddharth Vadapalli)

   - Make keystone buildable as a loadable module, except on ARM32 where
     hook_fault_code() is __init (Siddharth Vadapalli)"

* tag 'pci-v6.19-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci: (100 commits)
  MAINTAINERS: Add Manivannan Sadhasivam as PCI/pwrctrl maintainer
  MAINTAINERS: Add CIX Sky1 PCIe controller driver maintainer
  PCI: sky1: Add PCIe host support for CIX Sky1
  dt-bindings: PCI: Add CIX Sky1 PCIe Root Complex bindings
  PCI: cadence: Add support for High Perf Architecture (HPA) controller
  MAINTAINERS: Add NXP S32G PCIe controller driver maintainer
  PCI: s32g: Add NXP S32G PCIe controller driver (RC)
  PCI: dwc: Add register and bitfield definitions
  dt-bindings: PCI: s32g: Add NXP S32G PCIe controller
  PCI: Add Renesas RZ/G3S host controller driver
  PCI: host-generic: Move bridge allocation outside of pci_host_common_init()
  dt-bindings: PCI: Add Renesas RZ/G3S PCIe controller binding
  PCI: Validate pci_rebar_size_supported() input
  Documentation: PCI: Amend error recovery doc with pci_save_state() rules
  treewide: Drop pci_save_state() after pci_restore_state()
  PCI/ERR: Ensure error recoverability at all times
  PCI/PM: Stop needlessly clearing state_saved on enumeration and thaw
  PCI/PM: Reinstate clearing state_saved in legacy and !PM codepaths
  PCI: dw-rockchip: Configure L1SS support
  PCI: tegra194: Remove unnecessary L1SS disable code
  ...
2025-12-04 17:29:41 -08:00
Ilpo Järvinen
7409c1b12c PCI: Prevent restoring assigned resources
restore_dev_resource() copies saved addresses and flags from the struct
pci_dev_resource back to the struct resource, typically, during rollback
from a failure or in preparation for a retry attempt.

If the resource is within resource tree, the resource must not be
modified as the resource tree could be corrupted. Thus, it's a bug to
call restore_dev_resource() for assigned resources (which did happen
due to logic flaws in the BAR resize rollback).

Add WARN_ON_ONCE() into restore_dev_resource() to detect such bugs easily
and return without altering the resource to prevent corruption.

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Alex Bennée <alex.bennee@linaro.org> # AVA, AMD GPU
Link: https://patch.msgid.link/20251113162628.5946-12-ilpo.jarvinen@linux.intel.com
2025-11-14 12:34:20 -06:00
Ilpo Järvinen
337b1b566d PCI: Fix restoring BARs on BAR resize rollback path
BAR resize operation is implemented in the pci_resize_resource() and
pbus_reassign_bridge_resources() functions. pci_resize_resource() can be
called either from __resource_resize_store() from sysfs or directly by the
driver for the Endpoint Device.

The pci_resize_resource() requires that caller has released the device
resources that share the bridge window with the BAR to be resized as
otherwise the bridge window is pinned in place and cannot be changed.

pbus_reassign_bridge_resources() rolls back resources if the resize
operation fails, but rollback is performed only for the bridge windows.
Because releasing the device resources are done by the caller of the BAR
resize interface, these functions performing the BAR resize do not have
access to the device resources as they were before the resize.

pbus_reassign_bridge_resources() could try __pci_bridge_assign_resources()
after rolling back the bridge windows as they were, however, it will not
guarantee the resource are assigned due to differences in how FW and the
kernel assign the resources (alignment of the start address and tail).

To perform rollback robustly, the BAR resize interface has to be altered to
also release the device resources that share the bridge window with the BAR
to be resized.

Also, remove restoring from the entries failed list as saved list should
now contain both the bridge windows and device resources so the extra
restore is duplicated work.

Some drivers (currently only amdgpu) want to prevent releasing some
resources. Add exclude_bars param to pci_resize_resource() and make amdgpu
pass its register BAR (BAR 2 or 5), which should never be released during
resize operation. Normally 64-bit prefetchable resources do not share a
bridge window with the 32-bit only register BAR, but there are various
fallbacks in the resource assignment logic which may make the resources
share the bridge window in rare cases.

This change (together with the driver side changes) is to counter the
resource releases that had to be done to prevent resource tree corruption
in the ("PCI: Release assigned resource before restoring them") change. As
such, it likely restores functionality in cases where device resources were
released to avoid resource tree conflicts which appeared to be "working"
when such conflicts were not correctly detected by the kernel.

Reported-by: Simon Richter <Simon.Richter@hogyros.de>
Link: https://lore.kernel.org/linux-pci/f9a8c975-f5d3-4dd2-988e-4371a1433a60@hogyros.de/
Reported-by: Alex Bennée <alex.bennee@linaro.org>
Link: https://lore.kernel.org/linux-pci/874irqop6b.fsf@draig.linaro.org/
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
[bhelgaas: squash amdgpu BAR selection from
https://lore.kernel.org/r/20251114103053.13778-1-ilpo.jarvinen@linux.intel.com]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Alex Bennée <alex.bennee@linaro.org> # AVA, AMD GPU
Reviewed-by: Christian König <christian.koenig@amd.com>
Link: https://patch.msgid.link/20251113162628.5946-7-ilpo.jarvinen@linux.intel.com
2025-11-14 12:33:14 -06:00
Ilpo Järvinen
1d8a0506f6 PCI: Free saved list without holding pci_bus_sem
Freeing the saved list does not require holding pci_bus_sem, so the
critical section can be made shorter.

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Alex Bennée <alex.bennee@linaro.org> # AVA, AMD GPU
Link: https://patch.msgid.link/20251113162628.5946-6-ilpo.jarvinen@linux.intel.com
2025-11-14 12:33:10 -06:00
Ilpo Järvinen
121d3e9e4b PCI: Try BAR resize even when no window was released
Usually, resizing BARs requires releasing bridge windows in order to
resize it to fit a larger BAR into the window. This is not always the
case, however, FW could have made the window large enough to accommodate
larger BAR as is, or the user might prefer to shrink a BAR to make more
space for another Resizable BAR.

Thus, replace the check that requires that at least one bridge window
was released with a check that simply ensures bridge is not NULL.

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Alex Bennée <alex.bennee@linaro.org> # AVA, AMD GPU
Link: https://patch.msgid.link/20251113162628.5946-5-ilpo.jarvinen@linux.intel.com
2025-11-14 12:33:05 -06:00
Ilpo Järvinen
34c702ea04 PCI: Change pci_dev variable from 'bridge' to 'dev'
Upcoming fix to BAR resize will store also device BAR resource in the
saved list. Change the pci_dev variable in the loop from 'bridge' to
'dev' as the former would be misleading with non-bridges in the list.

This is in a separate change to reduce churn in the upcoming BAR resize
fix.

While it appears that the logic in the loop doing pci_setup_bridge() is
altered as 'bridge' variable is no longer updated, a bridge should never
appear more than once in the saved list so the check can only match to the
first entry. As such, the code with two distinct pci_dev variables better
represents the intention of the check compared with the old code where
bridge variable was reused for a different purpose.

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Alex Bennée <alex.bennee@linaro.org> # AVA, AMD GPU
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Link: https://patch.msgid.link/20251113162628.5946-4-ilpo.jarvinen@linux.intel.com
2025-11-14 12:32:55 -06:00
Ilpo Järvinen
91c4c89db4 PCI: Prevent resource tree corruption when BAR resize fails
pbus_reassign_bridge_resources() saves bridge windows into the saved
list before attempting to adjust resource assignments to perform a BAR
resize operation. If resource adjustments cannot be completed fully,
rollback is attempted by restoring the resource from the saved list.

The rollback, however, does not check whether the resources it restores were
assigned by the partial resize attempt. If restore changes addresses of the
resource, it can result in corrupting the resource tree.

An example of a corrupted resource tree with overlapping addresses:

  6200000000000-6203fbfffffff : pciex@620c3c0000000
    6200000000000-6203fbff0ffff : PCI Bus 0030:01
      6200020000000-62000207fffff : 0030:01:00.0
      6200000000000-6203fbff0ffff : PCI Bus 0030:02

A resource that are assigned into the resource tree must remain
unchanged. Thus, release such a resource before attempting to restore
and claim it back.

For simplicity, always do the release and claim back for the resource
even in the cases where it is restored to the same address range.

Note: this fix may "break" some cases where devices "worked" because
the resource tree corruption allowed address space double counting to
fit more resource than what can now be assigned without double
counting. The upcoming changes to BAR resizing should address those
scenarios (to the extent possible).

Fixes: 8bb705e3e7 ("PCI: Add pci_resize_resource() for resizing BARs")
Reported-by: Simon Richter <Simon.Richter@hogyros.de>
Link: https://lore.kernel.org/linux-pci/67840a16-99b4-4d8c-9b5c-4721ab0970a2@hogyros.de/
Reported-by: Alex Bennée <alex.bennee@linaro.org>
Link: https://lore.kernel.org/linux-pci/874irqop6b.fsf@draig.linaro.org/
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Alex Bennée <alex.bennee@linaro.org> # AVA, AMD GPU
Link: https://patch.msgid.link/20251113162628.5946-2-ilpo.jarvinen@linux.intel.com
2025-11-14 12:31:55 -06:00
Ilpo Järvinen
437aa64c8e PCI: Do not size non-existing prefetchable window
pbus_size_mem() should only be called for bridge windows that exist but
__pci_bus_size_bridges() may point 'pref' to a resource that does not exist
(has zero flags) in case of non-root buses.

When prefetchable bridge window does not exist, the same non-prefetchable
bridge window is sized more than once which may result in duplicating
entries into the realloc_head list. Duplicated entries are shown in this
log and trigger a WARN_ON() because realloc_head had residual entries after
the resource assignment algorithm:

  pci 0000:00:03.0: [11ab:6820] type 01 class 0x060400 PCIe Root Port
  pci 0000:00:03.0: PCI bridge to [bus 00]
  pci 0000:00:03.0:   bridge window [io  0x0000-0x0fff]
  pci 0000:00:03.0:   bridge window [mem 0x00000000-0x000fffff]
  pci 0000:00:03.0: bridge window [mem 0x00200000-0x003fffff] to [bus 02] add_size 200000 add_align 200000
  pci 0000:00:03.0: bridge window [mem 0x00200000-0x003fffff] to [bus 02] add_size 200000 add_align 200000
  pci 0000:00:03.0: bridge window [mem 0xe0000000-0xe03fffff]: assigned
  pci 0000:00:03.0: PCI bridge to [bus 02]
  pci 0000:00:03.0:   bridge window [mem 0xe0000000-0xe03fffff]
  ------------[ cut here ]------------
  WARNING: CPU: 0 PID: 1 at drivers/pci/setup-bus.c:2373 pci_assign_unassigned_root_bus_resources+0x1bc/0x234

Check resource flags of 'pref' and only size the prefetchable window if the
resource has the IORESOURCE_PREFETCH flag.

Fixes: ae88d0b9c5 ("PCI: Use pbus_select_window_for_type() during mem window sizing")
Reported-by: Klaus Kudielka <klaus.kudielka@gmail.com>
Closes: https://lore.kernel.org/r/51e8cf1c62b8318882257d6b5a9de7fdaaecc343.camel@gmail.com/
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Klaus Kudielka <klaus.kudielka@gmail.com>
Link: https://patch.msgid.link/20251027132423.8841-1-ilpo.jarvinen@linux.intel.com
2025-10-31 15:07:21 -05:00
Yangyu Chen
a154f14160 PCI: Fix regression in pci_bus_distribute_available_resources()
The refactoring in 4292a1e45f ("PCI: Refactor distributing available
memory to use loops") switched pci_bus_distribute_available_resources() to
operate on an array of bridge windows. That accidentally looked up bus
resources via pci_bus_resource_n() and then passed those pointers to helper
routines that expect the resource to belong to the device. As soon as we
execute that code, pci_resource_num() warned because the resource wasn't in
the bridge's resource array.

This happens on my AMD Strix Halo machine with Thunderbolt device; the
error message is shown below:

  WARNING: CPU: 6 PID: 272 at drivers/pci/pci.h:471 pci_bus_distribute_available_resources+0x6ad/0x6d0
  CPU: 6 UID: 0 PID: 272 Comm: irq/33-pciehp Not tainted 6.17.0+ #1 PREEMPT(voluntary)
  Hardware name: PELADN YO Series/YO1, BIOS 1.04 05/15/2025
  RIP: 0010:pci_bus_distribute_available_resources+0x6ad/0x6d0
  Call Trace:
   pci_bus_distribute_available_resources+0x590/0x6d0
   pci_bridge_distribute_available_resources+0x62/0xb0
   pci_assign_unassigned_bridge_resources+0x65/0x1b0
   pciehp_configure_device+0x92/0x160
   pciehp_handle_presence_or_link_change+0x1b5/0x350
   pciehp_ist+0x147/0x1c0

Fix the regression by always fetching the resource directly from the bridge
with pci_resource_n(bridge, PCI_BRIDGE_RESOURCES + i). This restores the
original behaviour while keeping the refactored structure.  Then we can
successfully assign resources to the Thunderbolt device.

Fixes: 4292a1e45f ("PCI: Refactor distributing available memory to use loops")
Reported-by: Kenneth R. Crudup <kenny@panix.com>
Closes: https://lore.kernel.org/r/dd551b81-9e81-480b-aab3-7cf8b8bbc1d0@panix.com
Signed-off-by: Yangyu Chen <cyy@cyyself.name>
[bhelgaas: trim timestamps, etc from commit log]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-By: Kenneth R. Crudup <kenny@panix.com>
Link: https://lore.kernel.org/r/F833CC81-7C60-48FC-A31C-B9999DCC6FA2@icloud.com
Link: https://patch.msgid.link/tencent_8C54420E1B0FF8D804C1B4651DF970716309@qq.com
2025-10-08 16:36:31 -05:00
Ilpo Järvinen
15c5867b0a PCI: Don't print stale information about resource
pbus_size_mem() logs the bridge window resource using pci_info() before the
start and end fields of the resource have been updated which then prints
stale information.

Set resource addresses earlier to make understanding logs easier.
Regrettably, this results in setting the addresses multiple times but that
seems unavoidable.

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Link: https://patch.msgid.link/20250924135641.3399-1-ilpo.jarvinen@linux.intel.com
2025-09-24 18:44:04 -05:00
Ilpo Järvinen
43b4f7cd06 PCI: Alter misleading recursion to pci_bus_release_bridge_resources()
Recursing into pci_bus_release_bridge_resources() should not alter rel_type
because it makes no sense to change the release type within the recursion
call chain. A literal "whole_subtree" is passed into the recursion instead
of "rel_type" parameter which is misleading as the release type should
remain the same throughout the entire operation.

This is not a correctness issue because of the preceding if () that only
allows the recursion to happen if rel_type is "whole_subtree". Still,
replace the non-intuitive parameter with direct passing of "rel_type".

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Link: https://patch.msgid.link/20250829131113.36754-25-ilpo.jarvinen@linux.intel.com
2025-09-16 11:20:09 -05:00
Ilpo Järvinen
159fbfd041 PCI: Pass bridge window to pci_bus_release_bridge_resources()
pci_bus_release_bridge_resources() takes type, which is converted into a
bridge window resource in pci_bridge_release_resources().

Find out the correct bridge window for resource whose assignment failed.
Pass that bridge window to pci_bus_release_bridge_resources() instead of
passing the type. When recursing to subordinate, check which bridge windows
have to be released and recurse for each.

For now, use pbus_select_window_for_type() instead of pbus_select_window()
because non-bridge window resources still have their flags reset which
destroys the type information from the struct resource. The struct
pci_dev_resource holds a copy of the flags which are used instead.

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Link: https://patch.msgid.link/20250829131113.36754-24-ilpo.jarvinen@linux.intel.com
2025-09-16 11:20:06 -05:00
Ilpo Järvinen
ebbebd8873 PCI: Add pci_setup_one_bridge_window()
pci_bridge_release_resources() contains a resource type hack to work
around the unsuitable __pci_setup_bridge() interface. Extract the
switch statement that picks the correct bridge window setup function
from pci_claim_bridge_resource() into pci_setup_one_bridge_window() and
use it also in pci_bridge_release_resources().

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Link: https://patch.msgid.link/20250829131113.36754-23-ilpo.jarvinen@linux.intel.com
2025-09-16 11:20:02 -05:00
Ilpo Järvinen
aaae2863e7 PCI: Refactor remove_dev_resources() to use pbus_select_window()
Convert remove_dev_resources() to use pbus_select_window(). As 'available'
is not the real resources, the index has to be adjusted as only bridge
resource counterparts are present in the 'available' array.

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Link: https://patch.msgid.link/20250829131113.36754-22-ilpo.jarvinen@linux.intel.com
2025-09-16 11:19:59 -05:00
Ilpo Järvinen
4292a1e45f PCI: Refactor distributing available memory to use loops
pci_bus_distribute_available_resources() and
pci_bridge_distribute_available_resources() retain bridge window resources
and related data needed for distributing the available window in
independent variables for io, memory, and prefetchable memory windows. The
code is essentially the same for all of them and therefore repeated three
times with different variable names.

Refactor pci_bus_distribute_available_resources() to take an array. This
is complicated slightly by the function taking advantage of passing the
struct as value, which cannot be done for arrays in C. Therefore, copy the
data into a local array in the stack in the first loop.

Variable names are (hopefully) improved slightly as well.

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Link: https://patch.msgid.link/20250829131113.36754-21-ilpo.jarvinen@linux.intel.com
2025-09-16 11:19:55 -05:00
Ilpo Järvinen
ae88d0b9c5 PCI: Use pbus_select_window_for_type() during mem window sizing
__pci_bus_size_bridges() goes to great lengths of helping pbus_size_mem()
in which types it should put into a particular bridge window, requiring
passing up to three resource type into pbus_size_mem().

Instead of having complex logic in __pci_bus_size_bridges() and a
non-straightforward interface between those functions, use
pbus_select_window_for_type() and pbus_select_window() to find the correct
bridge window and compare if the resources belong to that window.

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Link: https://patch.msgid.link/20250829131113.36754-20-ilpo.jarvinen@linux.intel.com
2025-09-16 11:19:52 -05:00
Ilpo Järvinen
13016e15d5 PCI: Use pbus_select_window() in space available checker
pbus_upstream_space_available() figures out the upstream bridge window
resources on its own. Migrate it to use pbus_select_window().

Note: pbus_select_window() -> pbus_select_window_for_type() calls
find_bus_resource_of_type() for root bus, which does not do parent check
similar to what pbus_upstream_space_available() did earlier, but the
difference does not matter because pbus_upstream_space_available() itself
stops when it encounters the root bus.

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Link: https://patch.msgid.link/20250829131113.36754-19-ilpo.jarvinen@linux.intel.com
2025-09-16 11:19:47 -05:00
Ilpo Järvinen
da07881005 PCI: Rename resource variable from r to res
Resource is going to be passed in as argument aften an upcoming change.
Rename the struct resource variable from "r" to "res" to avoid using one
letter variable name in a function argument.

This rename is made separately to reduce churn in the upcoming change.

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Link: https://patch.msgid.link/20250829131113.36754-18-ilpo.jarvinen@linux.intel.com
2025-09-16 11:19:44 -05:00
Ilpo Järvinen
ebe091ad81 PCI: Use pbus_select_window_for_type() during IO window sizing
Convert pbus_size_io() to use pbus_select_window_for_type().

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Link: https://patch.msgid.link/20250829131113.36754-17-ilpo.jarvinen@linux.intel.com
2025-09-16 11:19:41 -05:00
Ilpo Järvinen
85796d20a6 PCI: Warn if bridge window cannot be released when resizing BAR
BAR resizing calls to pci_reassign_bridge_resources(), which attempts to
release any upstream bridge window to allow them to accommodate the new BAR
size. The release can only be performed if there are no other child
resources for the bridge window. Previously the code continued silently
when other child resources were detected.

Add pci_warn() to inform user that a bridge window could not be released
because of child resources. As a small bridge window is often the reason
why BAR resize fails, this warning will help to pinpoint to the cause.

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Link: https://patch.msgid.link/20250829131113.36754-15-ilpo.jarvinen@linux.intel.com
2025-09-16 11:19:34 -05:00
Ilpo Järvinen
3ab10f83e2 PCI: Fix finding bridge window in pci_reassign_bridge_resources()
pci_reassign_bridge_resources() walks upwards in the PCI bus hierarchy,
locates the relevant bridge window on each level using flags check, and
attempts to release the bridge window. The flags-based check is fragile due
to various fallbacks in the bridge window selection logic. As such, the
algorithm might not locate the correct bridge window.

Refactor pci_reassign_bridge_resources() to determine the correct bridge
window using pbus_select_window(), which contains logic to handle all
fallback cases correctly. Change function prefix to pbus as it now inputs
struct bus and resource for which to locate the bridge window.

The main purpose is to make bridge window selection logic consistent across
the entire PCI core (one step at a time). While this technically also fixes
the commit 8bb705e3e7 ("PCI: Add pci_resize_resource() for resizing
BARs") making the bridge window walk algorithm more robust, the normal
setup having a 64-bit resizable BAR underneath bridge(s) with 64-bit
prefetchable windows does not need to use any fallbacks. As such, the
practical impact is low (requiring BAR resize use case and a non-typical
bridge device).

The way to detect if unrelated resource failed again is left to use the
type based approximation which should not behave worse than before.

Fixes: 8bb705e3e7 ("PCI: Add pci_resize_resource() for resizing BARs")
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Link: https://patch.msgid.link/20250829131113.36754-14-ilpo.jarvinen@linux.intel.com
2025-09-16 11:19:31 -05:00
Ilpo Järvinen
74afce3dfc PCI: Add bridge window selection functions
Various places in the PCI core code independently decide into which bridge
window a child resource should be placed. It is hard to see whether these
decisions always end up in agreement, especially in the corner cases, and
in some places it requires complex logic to pass multiple resource types
and/or bridge windows around.

Add pbus_select_window() and pbus_select_window_for_type() for cases where
the former cannot be used so that eventually the same helper can be used to
select the bridge window everywhere. Using the same function ensures the
selected bridge window remains always the same and it can be easily
recalculated in-situ allowing simplifying the interfaces between internal
functions in upcoming changes.

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Link: https://patch.msgid.link/20250829131113.36754-13-ilpo.jarvinen@linux.intel.com
2025-09-16 11:19:28 -05:00
Ilpo Järvinen
8278c69143 PCI: Preserve bridge window resource type flags
When a bridge window is found unused or fails to assign, the flags of the
associated resource are cleared. Clearing flags is problematic as it also
removes the type information of the resource which is needed later.

Thus, always preserve the bridge window type flags and use IORESOURCE_UNSET
and IORESOURCE_DISABLED to indicate the status of the bridge window. Also,
when initializing resources, make sure all valid bridge windows do get
their type flags set.

Change various places that relied on resource flags being cleared to check
for IORESOURCE_UNSET and IORESOURCE_DISABLED to allow bridge window
resource to retain their type flags. Add pdev_resource_assignable() and
pdev_resource_should_fit() helpers to filter out disabled bridge windows
during resource fitting; the latter combines more common checks into the
helper.

When reading the bridge windows from the registers, instead of leaving the
resource flags cleared for bridge windows that are not enabled, always
set up the flags and set IORESOURCE_UNSET | IORESOURCE_DISABLED as needed.

When resource fitting or assignment fails for a bridge window resource, or
the bridge window is not needed, mark the resource with IORESOURCE_UNSET or
IORESOURCE_DISABLED, respectively.

Use dummy zero resource in resource_show() for backwards compatibility as
lspci will otherwise misrepresent disabled bridge windows.

This change fixes an issue which highlights the importance of keeping the
resource type flags intact:

  At the end of __assign_resources_sorted(), reset_resource() is called,
  previously clearing the flags. Later, pci_prepare_next_assign_round()
  attempted to release bridge resources using
  pci_bus_release_bridge_resources() that calls into
  pci_bridge_release_resources() that assumes type flags are still present.
  As type flags were cleared, IORESOURCE_MEM_64 was not set leading to
  resources under an incorrect bridge window to be released (idx = 1
  instead of idx = 2). While the assignments performed later covered this
  problem so that the wrongly released resources got assigned in the end,
  it was still causing extra release+assign pairs.

There are other reasons why the resource flags should be retained in
upcoming changes too.

Removing the flag reset for non-bridge window resource is left as future
work, in part because it has a much higher regression potential due to
pci_enable_resources() that will start to work also for those resources
then and due to what endpoint drivers might assume about resources.

Despite the Fixes tag, backporting this (at least any time soon) is highly
discouraged. The issue fixed is borderline cosmetic as the later
assignments normally cover the problem entirely. Also there might be
non-obvious dependencies.

Fixes: 5b28541552 ("PCI: Restrict 64-bit prefetchable bridge windows to 64-bit resources")
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Link: https://patch.msgid.link/20250829131113.36754-11-ilpo.jarvinen@linux.intel.com
2025-09-16 11:19:18 -05:00