From 73711730a1128d91ebca1a6994ceeb18f36cb0cd Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?H=C3=A5kon=20Bugge?=
Date: Wed, 12 Nov 2025 10:54:40 +0100
Subject: [PATCH 1/6] PCI: Do not attempt to set ExtTag for VFs
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

The bit for enabling extended tags is Reserved and Preserved (RsvdP) for
VFs, according to PCIe r7.0 section 7.5.3.4 table 7.21.  Hence, bail out
early from pci_configure_extended_tags() if the device is a VF.

Otherwise, we may see incorrect log messages such as:

  kernel: pci 0000:af:00.2: enabling Extended Tags

(af:00.2 is a VF)

Fixes: 60db3a4d8cc9 ("PCI: Enable PCIe Extended Tags if supported")
Signed-off-by: Håkon Bugge
Signed-off-by: Bjorn Helgaas
Reviewed-by: Zhu Yanjun
Link: https://patch.msgid.link/20251112095442.1913258-1-haakon.bugge@oracle.com
---
 drivers/pci/probe.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index 41183aed8f5d..86665658d704 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -2270,7 +2270,8 @@ int pci_configure_extended_tags(struct pci_dev *dev, void *ign)
 	u16 ctl;
 	int ret;
 
-	if (!pci_is_pcie(dev))
+	/* PCI_EXP_DEVCTL_EXT_TAG is RsvdP in VFs */
+	if (!pci_is_pcie(dev) || dev->is_virtfn)
 		return 0;
 
 	ret = pcie_capability_read_dword(dev, PCI_EXP_DEVCAP, &cap);

From 959ac08a2c2811305be8c2779779e8b0932e5a99 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?J=C3=B6rg=20Wedekind?=
Date: Mon, 19 Jan 2026 15:31:10 +0100
Subject: [PATCH 2/6] PCI: Mark 3ware-9650SA Root Port Extended Tags as broken
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Per PCIe r7.0, sec 2.2.6.2.1 and 7.5.3.4, a Requester may not use 8-bit
Tags unless its Extended Tag Field Enable is set, but all
Receivers/Completers must handle 8-bit Tags correctly regardless of their
Extended Tag Field Enable.

Some devices do not handle 8-bit Tags as Completers, so add a quirk for them.
If we find such a device, we disable Extended Tags for the entire
hierarchy to make peer-to-peer DMA possible.

The 3ware 9650SA seems to have issues with handling 8-bit tags. Mark it as
broken.

This fixes PCI Parity Errors like:

  3w-9xxx: scsi0: ERROR: (0x06:0x000C): PCI Parity Error: clearing.
  3w-9xxx: scsi0: ERROR: (0x06:0x000D): PCI Abort: clearing.
  3w-9xxx: scsi0: ERROR: (0x06:0x000E): Controller Queue Error: clearing.
  3w-9xxx: scsi0: ERROR: (0x06:0x0010): Microcontroller Error: clearing.

Fixes: 60db3a4d8cc9 ("PCI: Enable PCIe Extended Tags if supported")
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=202425
Signed-off-by: Jörg Wedekind
Signed-off-by: Bjorn Helgaas
Link: https://patch.msgid.link/20260119143114.21948-1-joerg@wedekind.de
---
 drivers/pci/quirks.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
index b9c252aa6fe0..c7e733beaab0 100644
--- a/drivers/pci/quirks.c
+++ b/drivers/pci/quirks.c
@@ -5581,6 +5581,7 @@ static void quirk_no_ext_tags(struct pci_dev *pdev)
 	pci_walk_bus(bridge->bus, pci_configure_extended_tags, NULL);
 }
 DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_3WARE, 0x1004, quirk_no_ext_tags);
+DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_3WARE, 0x1005, quirk_no_ext_tags);
 DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_SERVERWORKS, 0x0132, quirk_no_ext_tags);
 DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_SERVERWORKS, 0x0140, quirk_no_ext_tags);
 DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_SERVERWORKS, 0x0141, quirk_no_ext_tags);

From f7245901de8978d829f80b3d8e36ed9a8fd18049 Mon Sep 17 00:00:00 2001
From: Sergey Shtylyov
Date: Tue, 27 Jan 2026 23:39:42 +0300
Subject: [PATCH 3/6] PCI: Check parent for NULL in of_pci_bus_release_domain_nr()

of_pci_bus_find_domain_nr() allows its parent parameter to be NULL but
of_pci_bus_release_domain_nr() (that undoes its effect) doesn't -- that
means it's going to blow up while calling of_get_pci_domain_nr() if the
parent parameter indeed happens to be NULL.  Add the missing NULL check.

Found by Linux Verification Center (linuxtesting.org) with the Svace static
analysis tool.

Fixes: c14f7ccc9f5d ("PCI: Assign PCI domain IDs by ida_alloc()")
Signed-off-by: Sergey Shtylyov
Signed-off-by: Bjorn Helgaas
Link: https://patch.msgid.link/20260127203944.28588-1-s.shtylyov@auroraos.dev
---
 drivers/pci/pci.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index 13dbb405dc31..9fc4c2226b03 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -6591,7 +6591,7 @@ static void of_pci_bus_release_domain_nr(struct device *parent, int domain_nr)
 		return;
 
 	/* Release domain from IDA where it was allocated. */
-	if (of_get_pci_domain_nr(parent->of_node) == domain_nr)
+	if (parent && of_get_pci_domain_nr(parent->of_node) == domain_nr)
 		ida_free(&pci_domain_nr_static_ida, domain_nr);
 	else
 		ida_free(&pci_domain_nr_dynamic_ida, domain_nr);

From 1a6845aaa6de81f95959b380b45de8f10d6a8502 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?H=C3=A5kon=20Bugge?=
Date: Thu, 29 Jan 2026 18:52:32 +0100
Subject: [PATCH 4/6] PCI: Initialize RCB from pci_configure_device()
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Commit e42010d8207f ("PCI: Set Read Completion Boundary to 128 iff Root
Port supports it (_HPX)") worked around a bogus _HPX type 2 record, which
caused program_hpx_type2() to set the RCB in an endpoint even though the
Root Port did not have the RCB bit set.

e42010d8207f fixed that by setting the RCB in the endpoint only when it was
set in the Root Port.

In retrospect, program_hpx_type2() is intended for AER-related settings,
and the RCB should be configured elsewhere so it doesn't depend on the
presence or contents of an _HPX record.

Explicitly program the RCB from pci_configure_device() so it matches the
Root Port's RCB.  The Root Port may not be visible to virtualized guests;
in that case, leave RCB alone.

Fixes: e42010d8207f ("PCI: Set Read Completion Boundary to 128 iff Root Port supports it (_HPX)")
Signed-off-by: Håkon Bugge
Signed-off-by: Bjorn Helgaas
Link: https://patch.msgid.link/20260129175237.727059-2-haakon.bugge@oracle.com
---
 drivers/pci/probe.c | 32 ++++++++++++++++++++++++++++++++
 1 file changed, 32 insertions(+)

diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index 86665658d704..c791bca2891f 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -2411,6 +2411,37 @@ static void pci_configure_serr(struct pci_dev *dev)
 	}
 }
 
+static void pci_configure_rcb(struct pci_dev *dev)
+{
+	struct pci_dev *rp;
+	u16 rp_lnkctl;
+
+	/*
+	 * Per PCIe r7.0, sec 7.5.3.7, RCB is only meaningful in Root Ports
+	 * (where it is read-only), Endpoints, and Bridges.  It may only be
+	 * set for Endpoints and Bridges if it is set in the Root Port.  For
+	 * Endpoints, it is 'RsvdP' for Virtual Functions.
+	 */
+	if (!pci_is_pcie(dev) ||
+	    pci_pcie_type(dev) == PCI_EXP_TYPE_ROOT_PORT ||
+	    pci_pcie_type(dev) == PCI_EXP_TYPE_UPSTREAM ||
+	    pci_pcie_type(dev) == PCI_EXP_TYPE_DOWNSTREAM ||
+	    pci_pcie_type(dev) == PCI_EXP_TYPE_RC_EC ||
+	    dev->is_virtfn)
+		return;
+
+	/* Root Port often not visible to virtualized guests */
+	rp = pcie_find_root_port(dev);
+	if (!rp)
+		return;
+
+	pcie_capability_read_word(rp, PCI_EXP_LNKCTL, &rp_lnkctl);
+	pcie_capability_clear_and_set_word(dev, PCI_EXP_LNKCTL,
+					   PCI_EXP_LNKCTL_RCB,
+					   (rp_lnkctl & PCI_EXP_LNKCTL_RCB) ?
+					   PCI_EXP_LNKCTL_RCB : 0);
+}
+
 static void pci_configure_device(struct pci_dev *dev)
 {
 	pci_configure_mps(dev);
@@ -2420,6 +2451,7 @@ static void pci_configure_device(struct pci_dev *dev)
 	pci_configure_aspm_l1ss(dev);
 	pci_configure_eetlp_prefix(dev);
 	pci_configure_serr(dev);
+	pci_configure_rcb(dev);
 
 	pci_acpi_program_hp_params(dev);
 }

From 9abf79c8d7b40db0e5a34aa8c744ea60ff9a3fcf Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?H=C3=A5kon=20Bugge?=
Date: Thu, 29 Jan 2026 18:52:33 +0100
Subject: [PATCH 5/6] PCI/ACPI: Restrict program_hpx_type2() to AER bits
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Previously program_hpx_type2() applied PCIe settings unconditionally, which
could incorrectly change bits like Extended Tag Field Enable and Enable
Relaxed Ordering.

When _HPX was added to ACPI r3.0, the intent of the PCIe Setting Record
(Type 2) in sec 6.2.7.3 was to configure AER registers when the OS does not
own the AER Capability:

  The PCI Express setting record contains ... [the AER] Uncorrectable Error
  Mask, Uncorrectable Error Severity, Correctable Error Mask ... to be used
  when configuring registers in the Advanced Error Reporting Extended
  Capability Structure ...

  OSPM [1] will only evaluate _HPX with Setting Record – Type 2 if OSPM is
  not controlling the PCI Express Advanced Error Reporting capability.

ACPI r3.0b, sec 6.2.7.3, added more AER registers, including registers in
the PCIe Capability with AER-related bits, and the restriction that the OS
use this only when it owns PCIe native hotplug:

  ... when configuring PCI Express registers in the Advanced Error
  Reporting Extended Capability Structure *or PCI Express Capability
  Structure* ...

  An OS that has assumed ownership of native hot plug but does not ... have
  ownership of the AER register set must use ... the Type 2 record to
  program the AER registers ...

  However, since the Type 2 record also includes register bits that have
  functions other than AER, the OS must ignore values ... that are not
  applicable.

Restrict program_hpx_type2() to only the intended purpose:

  - Apply settings only when OS owns PCIe native hotplug but not AER,

  - Only touch the AER-related bits (Error Reporting Enables) in Device
    Control

  - Don't touch Link Control at all, since nothing there seems AER-related,
    but log _HPX settings for debugging purposes

Note that Read Completion Boundary is now configured elsewhere, since it is
unrelated to _HPX.

[1] Operating System-directed configuration and Power Management

Fixes: 40abb96c51bb ("[PATCH] pciehp: Fix programming hotplug parameters")
Signed-off-by: Håkon Bugge
Signed-off-by: Bjorn Helgaas
Link: https://patch.msgid.link/20260129175237.727059-3-haakon.bugge@oracle.com
---
 drivers/pci/pci-acpi.c | 59 +++++++++++++++++-------------------------
 drivers/pci/pci.h      |  3 +++
 drivers/pci/pcie/aer.c |  3 ---
 3 files changed, 27 insertions(+), 38 deletions(-)

diff --git a/drivers/pci/pci-acpi.c b/drivers/pci/pci-acpi.c
index 9369377725fa..0162acfb5789 100644
--- a/drivers/pci/pci-acpi.c
+++ b/drivers/pci/pci-acpi.c
@@ -271,21 +271,6 @@ static acpi_status decode_type1_hpx_record(union acpi_object *record,
 	return AE_OK;
 }
 
-static bool pcie_root_rcb_set(struct pci_dev *dev)
-{
-	struct pci_dev *rp = pcie_find_root_port(dev);
-	u16 lnkctl;
-
-	if (!rp)
-		return false;
-
-	pcie_capability_read_word(rp, PCI_EXP_LNKCTL, &lnkctl);
-	if (lnkctl & PCI_EXP_LNKCTL_RCB)
-		return true;
-
-	return false;
-}
-
 /* _HPX PCI Express Setting Record (Type 2) */
 struct hpx_type2 {
 	u32 revision;
@@ -311,6 +296,7 @@ static void program_hpx_type2(struct pci_dev *dev, struct hpx_type2 *hpx)
 {
 	int pos;
 	u32 reg32;
+	const struct pci_host_bridge *host;
 
 	if (!hpx)
 		return;
@@ -318,6 +304,15 @@ static void program_hpx_type2(struct pci_dev *dev, struct hpx_type2 *hpx)
 	if (!pci_is_pcie(dev))
 		return;
 
+	host = pci_find_host_bridge(dev->bus);
+
+	/*
+	 * Only do the _HPX Type 2 programming if OS owns PCIe native
+	 * hotplug but not AER.
+	 */
+	if (!host->native_pcie_hotplug || host->native_aer)
+		return;
+
 	if (hpx->revision > 1) {
 		pci_warn(dev, "PCIe settings rev %d not supported\n",
 			 hpx->revision);
@@ -325,33 +320,27 @@ static void program_hpx_type2(struct pci_dev *dev, struct hpx_type2 *hpx)
 	}
 
 	/*
-	 * Don't allow _HPX to change MPS or MRRS settings.  We manage
-	 * those to make sure they're consistent with the rest of the
-	 * platform.
+	 * We only allow _HPX to program DEVCTL bits related to AER, namely
+	 * PCI_EXP_DEVCTL_CERE, PCI_EXP_DEVCTL_NFERE, PCI_EXP_DEVCTL_FERE,
+	 * and PCI_EXP_DEVCTL_URRE.
+	 *
+	 * The rest of DEVCTL is managed by the OS to make sure it's
+	 * consistent with the rest of the platform.
 	 */
-	hpx->pci_exp_devctl_and |= PCI_EXP_DEVCTL_PAYLOAD |
-				    PCI_EXP_DEVCTL_READRQ;
-	hpx->pci_exp_devctl_or &= ~(PCI_EXP_DEVCTL_PAYLOAD |
-				    PCI_EXP_DEVCTL_READRQ);
+	hpx->pci_exp_devctl_and |= ~PCI_EXP_AER_FLAGS;
+	hpx->pci_exp_devctl_or &= PCI_EXP_AER_FLAGS;
 
 	/* Initialize Device Control Register */
 	pcie_capability_clear_and_set_word(dev, PCI_EXP_DEVCTL,
 			~hpx->pci_exp_devctl_and, hpx->pci_exp_devctl_or);
 
-	/* Initialize Link Control Register */
+	/* Log if _HPX attempts to modify Link Control Register */
 	if (pcie_cap_has_lnkctl(dev)) {
-
-		/*
-		 * If the Root Port supports Read Completion Boundary of
-		 * 128, set RCB to 128.  Otherwise, clear it.
-		 */
-		hpx->pci_exp_lnkctl_and |= PCI_EXP_LNKCTL_RCB;
-		hpx->pci_exp_lnkctl_or &= ~PCI_EXP_LNKCTL_RCB;
-		if (pcie_root_rcb_set(dev))
-			hpx->pci_exp_lnkctl_or |= PCI_EXP_LNKCTL_RCB;
-
-		pcie_capability_clear_and_set_word(dev, PCI_EXP_LNKCTL,
-			~hpx->pci_exp_lnkctl_and, hpx->pci_exp_lnkctl_or);
+		if (hpx->pci_exp_lnkctl_and != 0xffff ||
+		    hpx->pci_exp_lnkctl_or != 0)
+			pci_info(dev, "_HPX attempts Link Control setting (AND %#06x OR %#06x)\n",
+				 hpx->pci_exp_lnkctl_and,
+				 hpx->pci_exp_lnkctl_or);
 	}
 
 	/* Find Advanced Error Reporting Enhanced Capability */
diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
index 0e67014aa001..e3c2852c80fb 100644
--- a/drivers/pci/pci.h
+++ b/drivers/pci/pci.h
@@ -88,6 +88,9 @@ struct pcie_tlp_log;
 #define PCI_BUS_BRIDGE_MEM_WINDOW	1
 #define PCI_BUS_BRIDGE_PREF_MEM_WINDOW	2
 
+#define PCI_EXP_AER_FLAGS	(PCI_EXP_DEVCTL_CERE | PCI_EXP_DEVCTL_NFERE | \
+				 PCI_EXP_DEVCTL_FERE | PCI_EXP_DEVCTL_URRE)
+
 extern const unsigned char pcie_link_speed[];
 extern bool pci_early_dump;
 
diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c
index e0bcaa896803..9472d86cef55 100644
--- a/drivers/pci/pcie/aer.c
+++ b/drivers/pci/pcie/aer.c
@@ -239,9 +239,6 @@ void pcie_ecrc_get_policy(char *str)
 }
 #endif	/* CONFIG_PCIE_ECRC */
 
-#define PCI_EXP_AER_FLAGS	(PCI_EXP_DEVCTL_CERE | PCI_EXP_DEVCTL_NFERE | \
-				 PCI_EXP_DEVCTL_FERE | PCI_EXP_DEVCTL_URRE)
-
 int pcie_aer_is_native(struct pci_dev *dev)
 {
 	struct pci_host_bridge *host = pci_find_host_bridge(dev->bus);

From 699722468a0fca8b1b9ce1ffe2532171ddcaff95 Mon Sep 17 00:00:00 2001
From: Lukas Wunner
Date: Sun, 26 Oct 2025 17:57:57 +0100
Subject: [PATCH 6/6] PCI/PME: Replace RMW of Root Status register with direct write
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

As of PCIe r7.0, the Root Status register contains a single writeable bit
(PME Status, type RW1C) and otherwise just read-only bits and RsvdZ bits
(which software must write as zero, PCIe r7.0 sec 7.4).

Thus, when clearing the PME Status bit, there's no need to perform a
read-modify-write of the register.  Instead, the bit can be written
directly.

Signed-off-by: Lukas Wunner
Signed-off-by: Bjorn Helgaas
Reviewed-by: Ilpo Järvinen
Link: https://patch.msgid.link/39f87c99f6c44be3c0371c79e454e6fde7be0d4d.1761497583.git.lukas@wunner.de
---
 drivers/pci/pci.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index 9fc4c2226b03..10ea5e7f4b34 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -2256,7 +2256,7 @@ void pcie_clear_device_status(struct pci_dev *dev)
  */
 void pcie_clear_root_pme_status(struct pci_dev *dev)
 {
-	pcie_capability_set_dword(dev, PCI_EXP_RTSTA, PCI_EXP_RTSTA_PME);
+	pcie_capability_write_dword(dev, PCI_EXP_RTSTA, PCI_EXP_RTSTA_PME);
 }
 
 /**
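
The reasoning behind patch 6 can be sketched in standalone user-space C. This is only an illustrative model, not kernel code: struct fake_rtsta, its access counters, and the helper names are hypothetical, and the two bit values are assumed to mirror PCI_EXP_RTSTA_PME and PCI_EXP_RTSTA_PENDING. The model treats every bit other than PME Status as read-only, so both ways of clearing the bit leave the register in the same state and the direct write simply skips the read.

```c
#include <stdint.h>

/* Assumed to mirror PCI_EXP_RTSTA_PME / PCI_EXP_RTSTA_PENDING. */
#define RTSTA_PME	0x00010000u	/* PME Status, RW1C */
#define RTSTA_PENDING	0x00020000u	/* PME Pending, read-only */

/* Hypothetical model of the Root Status register plus access counters. */
struct fake_rtsta {
	uint32_t value;		/* current register contents */
	unsigned int reads;	/* number of simulated config reads */
	unsigned int writes;	/* number of simulated config writes */
};

static uint32_t rtsta_read(struct fake_rtsta *r)
{
	r->reads++;
	return r->value;
}

/*
 * Simulated write: a 1 written to the RW1C bit clears it; every other
 * bit is modeled as read-only and ignores the written value.
 */
static void rtsta_write(struct fake_rtsta *r, uint32_t val)
{
	r->writes++;
	r->value &= ~(val & RTSTA_PME);
}

/* Old approach: read, OR in the bit, write back (one read, one write). */
static void clear_pme_rmw(struct fake_rtsta *r)
{
	uint32_t val = rtsta_read(r);

	rtsta_write(r, val | RTSTA_PME);
}

/* New approach: write the RW1C bit directly (one write, no read). */
static void clear_pme_direct(struct fake_rtsta *r)
{
	rtsta_write(r, RTSTA_PME);
}
```

Starting from the same register contents, clear_pme_rmw() and clear_pme_direct() end with identical values in this model; only the read count differs. The direct form also writes zeros to all other bits, which is what the commit message says RsvdZ bits require.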