linux/drivers
Pali Rohár 1f520a0d78 PCI: aardvark: Fix link training
commit f76b36d40b upstream.

Fix multiple link training issues in aardvark driver. The main reason of
these issues was misunderstanding of what certain registers do, since their
names and comments were misleading: before commit 96be36dbff ("PCI:
aardvark: Replace custom macros by standard linux/pci_regs.h macros"), the
pci-aardvark.c driver used custom macros for accessing standard PCIe Root
Bridge registers, and misleading comments did not help to understand what
the code was really doing.

After doing more tests and experiments I've come to the conclusion that the
SPEED_GEN register in aardvark sets the PCIe revision / generation
compliance and forces maximal link speed. Both GEN3 and GEN2 values set the
read-only PCI_EXP_FLAGS_VERS bits (PCIe capabilities version of Root
Bridge) to value 2, while GEN1 value sets PCI_EXP_FLAGS_VERS to 1, which
matches with PCI Express specifications revisions 3, 2 and 1 respectively.
Changing SPEED_GEN also sets the read-only bits PCI_EXP_LNKCAP_SLS and
PCI_EXP_LNKCAP2_SLS to corresponding speed.

(Note that PCI Express rev 1 specification does not define PCI_EXP_LNKCAP2
 and PCI_EXP_LNKCTL2 registers and when SPEED_GEN is set to GEN1 (which
 also sets PCI_EXP_FLAGS_VERS set to 1), lspci cannot access
 PCI_EXP_LNKCAP2 and PCI_EXP_LNKCTL2 registers.)

Changing PCIe link speed can be done via PCI_EXP_LNKCTL2_TLS bits of
PCI_EXP_LNKCTL2 register. Armada 3700 Functional Specifications says that
the default value of PCI_EXP_LNKCTL2_TLS is based on SPEED_GEN value, but
tests showed that the default value is always 8.0 GT/s, independently of
speed set by SPEED_GEN. So after setting SPEED_GEN, we must also set value
in PCI_EXP_LNKCTL2 register via PCI_EXP_LNKCTL2_TLS bits.

Triggering PCI_EXP_LNKCTL_RL bit immediately after setting LINK_TRAINING_EN
bit actually doesn't do anything. Tests have shown that a delay is needed
after enabling LINK_TRAINING_EN bit. As triggering PCI_EXP_LNKCTL_RL
currently does nothing, remove it.

Commit 43fc679ced ("PCI: aardvark: Improve link training") introduced
code which sets SPEED_GEN register based on negotiated link speed from
PCI_EXP_LNKSTA_CLS bits of PCI_EXP_LNKSTA register. This code was added to
fix detection of Compex WLE900VX (Atheros QCA9880) WiFi GEN1 PCIe cards, as
otherwise these cards were "invisible" on PCIe bus (probably because they
crashed). But apparently more people reported the same issues with these
cards also with other PCIe controllers [1] and I was able to reproduce this
issue also with other "noname" WiFi cards based on Atheros QCA9890 chip
(with the same PCI vendor/device ids as Atheros QCA9880). So this is not an
issue in aardvark but rather an issue in Atheros QCA98xx chips. Also, this
issue only exists if the kernel is compiled with PCIe ASPM support, and a
generic workaround for this is to change PCIe Bridge to 2.5 GT/s link speed
via PCI_EXP_LNKCTL2_TLS_2_5GT bits in PCI_EXP_LNKCTL2 register [2], before
triggering PCI_EXP_LNKCTL_RL bit. This workaround also works when SPEED_GEN
is set to value GEN2 (5 GT/s). So remove this hack completely in the
aardvark driver and always set SPEED_GEN to value from 'max-link-speed' DT
property. Fix for Atheros QCA98xx chips is handled separately by patch [2].

These two things (code for triggering PCI_EXP_LNKCTL_RL bit and changing
SPEED_GEN value) also explain why commit 6964494582 ("PCI: aardvark:
Train link immediately after enabling training") somehow fixed detection of
those problematic Compex cards with Atheros chips: if triggering link
retraining (via PCI_EXP_LNKCTL_RL bit) was done immediately after enabling
link training (via LINK_TRAINING_EN), it did nothing. If there was a
specific delay, aardvark HW already initialized PCIe link and therefore
triggering link retraining caused the above issue. Compex cards triggered
link down event and disappeared from the PCIe bus.

Commit f4c7d053d7 ("PCI: aardvark: Wait for endpoint to be ready before
training link") added 100ms sleep before calling 'Start link training'
command and explained that it is a requirement of PCI Express
specification. But the code after this 100ms sleep was not doing 'Start
link training', rather it triggered PCI_EXP_LNKCTL_RL bit via PCIe Root
Bridge to put link into Recovery state.

The required delay after fundamental reset is already done in function
advk_pcie_wait_for_link() which also checks whether PCIe link is up.
So after removing the code which triggers PCI_EXP_LNKCTL_RL bit on PCIe
Root Bridge, there is no need to wait 100ms again. Remove the extra
msleep() call and update comment about the delay required by the PCI
Express specification.

According to Marvell Armada 3700 Functional Specifications, Link training
should be enabled via aardvark register LINK_TRAINING_EN after selecting
PCIe generation and x1 lane. There is no need to disable it prior resetting
card via PERST# signal. This disabling code was introduced in commit
5169a9851d ("PCI: aardvark: Issue PERST via GPIO") as a workaround for
some Atheros cards. It turns out that this also is Atheros specific issue
and affects any PCIe controller, not only aardvark. Moreover this Atheros
issue was triggered by juggling with PCI_EXP_LNKCTL_RL, LINK_TRAINING_EN
and SPEED_GEN bits interleaved with sleeps. Now, after removing triggering
PCI_EXP_LNKCTL_RL, there is no need to explicitly disable LINK_TRAINING_EN
bit. So remove this code too. The problematic Compex cards described in
previous git commits are correctly detected in advk_pcie_train_link()
function even after applying all these changes.

Note that with this patch, and also prior this patch, some NVMe disks which
support PCIe GEN3 with 8 GT/s speed are negotiated only at the lowest link
speed 2.5 GT/s, independently of SPEED_GEN value. After manually triggering
PCI_EXP_LNKCTL_RL bit (e.g. from userspace via setpci), these NVMe disks
change link speed to 5 GT/s when SPEED_GEN was configured to GEN2. This
issue first needs to be properly investigated. I will send a fix in the
future.

On the other hand, some other GEN2 PCIe cards with 5 GT/s speed are
autonomously by HW autonegotiated at full 5 GT/s speed without need of any
software interaction.

Armada 3700 Functional Specifications describes the following steps for
link training: set SPEED_GEN to GEN2, enable LINK_TRAINING_EN, poll until
link training is complete, trigger PCI_EXP_LNKCTL_RL, poll until signal
rate is 5 GT/s, poll until link training is complete, enable ASPM L0s.

The requirement for triggering PCI_EXP_LNKCTL_RL can be explained by the
need to achieve 5 GT/s speed (as changing link speed is done by throw to
recovery state entered by PCI_EXP_LNKCTL_RL) or maybe as a part of enabling
ASPM L0s (but in this case ASPM L0s should have been enabled prior
PCI_EXP_LNKCTL_RL).

It is unknown why the original pci-aardvark.c driver was triggering
PCI_EXP_LNKCTL_RL bit before waiting for the link to be up. This does not
align with neither PCIe base specifications nor with Armada 3700 Functional
Specification. (Note that in older versions of aardvark, this bit was
called incorrectly PCIE_CORE_LINK_TRAINING, so this may be the reason.)

It is also unknown why Armada 3700 Functional Specification says that it is
needed to trigger PCI_EXP_LNKCTL_RL for GEN2 mode, as according to PCIe
base specification 5 GT/s speed negotiation is supposed to be entirely
autonomous, even if initial speed is 2.5 GT/s.

[1] - https://lore.kernel.org/linux-pci/87h7l8axqp.fsf@toke.dk/
[2] - https://lore.kernel.org/linux-pci/20210326124326.21163-1-pali@kernel.org/

Link: https://lore.kernel.org/r/20211005180952.6812-12-kabel@kernel.org
Signed-off-by: Pali Rohár <pali@kernel.org>
Signed-off-by: Marek Behún <kabel@kernel.org>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Reviewed-by: Marek Behún <kabel@kernel.org>
Signed-off-by: Marek Behún <kabel@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-12-01 09:19:02 +01:00
..
accessibility
acpi ACPI: Get acpi_device's parent from the parent field 2021-12-01 09:18:58 +01:00
amba ARM: 9120/1: Revert "amba: make use of -1 IRQs warn" 2021-11-06 14:10:09 +01:00
android binder: fix test regression due to sender_euid change 2021-12-01 09:18:59 +01:00
ata libata: fix checking of DMA state 2021-11-18 14:03:46 +01:00
atm atm: nicstar: register the interrupt handler in the right place 2021-07-19 09:44:52 +02:00
auxdisplay auxdisplay: ht16k33: Fix frame buffer device blanking 2021-11-18 14:04:24 +01:00
base firmware_loader: fix pre-allocated buf built-in firmware use 2021-11-26 10:39:10 +01:00
bcma bcma: Fix memory leak for internally-handled cores 2021-09-15 09:50:45 +02:00
block loop: Use blk_validate_block_size() to validate block size 2021-11-21 13:46:35 +01:00
bluetooth Bluetooth: btmtkuart: fix a memleak in mtk_hci_wmt_sync 2021-11-18 14:04:03 +01:00
bus bus: ti-sysc: Use context lost quirk for otg 2021-11-26 10:39:08 +01:00
cdrom
char tpm_tis_spi: Add missing SPI ID 2021-11-18 14:04:11 +01:00
clk clk: qcom: gcc-msm8996: Drop (again) gcc_aggre1_pnoc_ahb_clk 2021-11-26 10:39:13 +01:00
clocksource clocksource/drivers/timer-ti-dm: Select TIMER_OF 2021-11-18 14:04:09 +01:00
connector
counter counter: 104-quad-8: Return error when invalid mode during ceiling_write 2021-09-15 09:50:38 +02:00
cpufreq cpufreq: schedutil: Destroy mutex before kobject_put() frees the memory 2021-10-06 15:55:46 +02:00
cpuidle cpuidle: Fix kobject memory leaks in error paths 2021-11-18 14:04:05 +01:00
crypto crypto: qat - disregard spurious PFVF interrupts 2021-11-18 14:04:06 +01:00
dax
dca
devfreq PM / devfreq: Add missing error code in devfreq_add_device() 2021-07-14 16:56:11 +02:00
dio
dma dmaengine: dmaengine_desc_callback_valid(): Check for callback_result 2021-11-18 14:04:24 +01:00
dma-buf dma-buf: WARN on dmabuf release with pending attachments 2021-11-18 14:03:52 +01:00
edac EDAC/amd64: Handle three rank interleaving mode 2021-11-18 14:04:06 +01:00
eisa
extcon extcon: intel-mrfld: Sync hardware and software state on init 2021-07-19 09:45:00 +02:00
firewire
firmware firmware: qcom_scm: Fix error retval in __qcom_scm_is_call_available() 2021-11-18 14:04:20 +01:00
fpga fpga: machxo2-spi: Fix missing error code in machxo2_write_complete() 2021-09-30 10:11:04 +02:00
fsi fsi: Add missing MODULE_DEVICE_TABLE 2021-07-20 16:05:42 +02:00
gnss
gpio gpio: mlxbf2.c: Add check for bgpio_init failure 2021-11-18 14:03:42 +01:00
gpu drm/amdgpu: fix set scaling mode Full/Full aspect/Center not works on vga and dvi connectors 2021-11-26 10:39:21 +01:00
greybus
hid HID: wacom: Use "Confidence" flag to prevent reporting invalid contacts 2021-12-01 09:19:00 +01:00
hsi
hv hyperv/vmbus: include linux/bitops.h 2021-11-18 14:03:42 +01:00
hwmon hwmon: (pmbus/lm25066) Let compiler determine outer dimension of lm25066_coeff 2021-11-18 14:04:07 +01:00
hwspinlock
hwtracing coresight: cti: Correct the parameter for pm_runtime_put 2021-11-18 14:03:51 +01:00
i2c i2c: xlr: Fix a resource leak in the error handling path of 'xlr_i2c_probe()' 2021-11-18 14:04:25 +01:00
i3c
ide
idle
iio iio: imu: st_lsm6dsx: Avoid potential array overflow in st_lsm6dsx_set_odr() 2021-11-26 10:39:11 +01:00
infiniband RDMA/bnxt_re: Check if the vlan is valid before reporting 2021-11-26 10:39:08 +01:00
input Input: i8042 - Add quirk for Fujitsu Lifebook T725 2021-11-18 14:03:36 +01:00
interconnect treewide: Change list_sort to use const pointers 2021-09-30 10:11:04 +02:00
iommu iommu/amd: Relocate GAMSup check to early_enable_iommus 2021-09-26 14:08:59 +02:00
ipack ipack: ipoctal: fix module reference leak 2021-10-06 15:56:01 +02:00
irqchip irqchip/sifive-plic: Fixup EOI failed when masked 2021-11-18 14:04:29 +01:00
isdn mISDN: Fix return values of the probe function 2021-11-18 14:03:41 +01:00
leds leds: trigger: audio: Add an activate callback to ensure the initial brightness is set 2021-09-15 09:50:36 +02:00
lightnvm
macintosh
mailbox soc: mediatek: cmdq: add address shift in jump 2021-09-18 13:40:16 +02:00
mcb mcb: fix error handling in mcb_alloc_bus() 2021-09-30 10:11:00 +02:00
md md: update superblock after changing rdev flags in state_store 2021-11-18 14:03:57 +01:00
media media: cec: copy sequence field for the reply 2021-12-01 09:19:00 +01:00
memory memory: fsl_ifc: fix leak of irq and nand_irq in fsl_ifc_ctrl_probe 2021-11-18 14:04:16 +01:00
memstick memstick: jmb38x_ms: use appropriate free function in jmb38x_ms_alloc_host() 2021-11-18 14:04:07 +01:00
message
mfd mfd: dln2: Add cell for initializing DLN2 ADC 2021-11-18 14:04:30 +01:00
misc misc: fastrpc: Add missing lock before accessing find_vma() 2021-10-20 11:45:01 +02:00
mmc mmc: sdhci: Fix ADMA for PAGE_SIZE >= 64KiB 2021-12-01 09:19:01 +01:00
most most: fix control-message timeouts 2021-11-18 14:03:51 +01:00
mtd mtd: rawnand: au1550nd: Keep the driver compatible with on-die ECC engines 2021-11-18 14:04:31 +01:00
mux
net mdio: aspeed: Fix "Link is Down" issue 2021-12-01 09:19:01 +01:00
nfc nfc: pn533: Fix double free when pn533_fill_fragment_skbs() fails 2021-11-18 14:04:27 +01:00
ntb NTB: perf: Fix an error code in perf_setup_inbuf() 2021-09-22 12:28:02 +02:00
nubus
nvdimm libnvdimm/pmem: Fix crash triggered when I/O in-flight during unbind 2021-09-18 13:40:36 +02:00
nvme nvme-rdma: fix error code in nvme_rdma_setup_ctrl 2021-11-18 14:04:09 +01:00
nvmem nvmem: Fix shift-out-of-bound (UBSAN) with byte size cells 2021-10-20 11:45:01 +02:00
of of: unittest: fix EXPECT text for gpio hog errors 2021-11-18 14:04:13 +01:00
opp opp: Fix return in _opp_add_static_v2() 2021-11-18 14:04:22 +01:00
oprofile
parisc parisc: Move pci_dev_is_behind_card_dino to where it is used 2021-09-26 14:08:59 +02:00
parport parport: remove non-zero check on count 2021-09-18 13:40:34 +02:00
pci PCI: aardvark: Fix link training 2021-12-01 09:19:02 +01:00
pcmcia pcmcia: i82092: fix a null pointer dereference bug 2021-08-12 13:22:16 +02:00
perf perf/arm-cmn: Fix invalid pointer when access dtc object sharing the same IRQ number 2021-07-14 16:56:08 +02:00
phy phy: qcom-snps: Correct the FSEL_MASK 2021-11-18 14:04:20 +01:00
pinctrl pinctrl: qcom: sdm845: Enable dual edge errata 2021-11-26 10:39:18 +01:00
platform platform/x86: hp_accel: Fix an error handling path in 'lis3lv02d_probe()' 2021-11-26 10:39:16 +01:00
pnp
power power: supply: bq27xxx: Fix kernel crash on IRQ handler register error 2021-11-18 14:04:21 +01:00
powercap
pps
ps3
ptp ptp_pch: Load module automatically if ID matches 2021-10-13 10:04:27 +02:00
pwm pwm: stm32-lp: Don't modify HW state in .remove() callback 2021-09-26 14:09:01 +02:00
rapidio
ras
regulator regulator: s5m8767: do not use reset value as DVS voltage if GPIO DVS is disabled 2021-11-18 14:03:45 +01:00
remoteproc remoteproc: Fix a memory leak in an error handling path in 'rproc_handle_vdev()' 2021-11-18 14:04:23 +01:00
reset reset: socfpga: add empty driver allowing consumers to probe 2021-11-18 14:03:42 +01:00
rpmsg
rtc rtc: rv3032: fix error handling in rv3032_clkout_set_rate() 2021-11-18 14:04:23 +01:00
s390 s390/cio: make ccw_device_dma_* more robust 2021-11-18 14:04:30 +01:00
sbus
scsi scsi: ufs: core: Fix task management completion timeout race 2021-11-26 10:39:21 +01:00
sfi
sh maple: fix wrong return value of maple_bus_init(). 2021-11-26 10:39:12 +01:00
siox
slimbus slimbus: ngd: reset dma setup during runtime pm 2021-08-26 08:35:55 -04:00
soc soc/tegra: pmc: Fix imbalanced clock disabling in error code path 2021-11-18 14:04:33 +01:00
soundwire soundwire: debugfs: use controller id and link_id for debugfs 2021-11-18 14:04:16 +01:00
spi spi: spi-rpc-if: Check return value of rpcif_sw_init() 2021-11-18 14:04:11 +01:00
spmi
ssb ssb: Fix error return code in ssb_bus_scan() 2021-07-14 16:56:21 +02:00
staging staging: rtl8192e: Fix use after free in _rtl92e_pci_disconnect() 2021-12-01 09:19:00 +01:00
target scsi: target: Fix alua_tg_pt_gps_count tracking 2021-11-26 10:39:11 +01:00
tc
tee tee: optee: Fix missing devices unregister during optee_remove 2021-10-20 11:45:02 +02:00
thermal thermal: Fix NULL pointer dereferences in of_thermal_ functions 2021-11-21 13:46:37 +01:00
thunderbolt thunderbolt: Fix port linking by checking all adapters 2021-09-18 13:40:27 +02:00
tty tty: tty_buffer: Fix the softlockup issue in flush_to_ldisc 2021-11-26 10:39:10 +01:00
uio
usb usb: hub: Fix locking issues with address0_mutex 2021-12-01 09:18:59 +01:00
vdpa vdpa/mlx5: Avoid destroying MR on empty iotlb 2021-08-26 08:35:42 -04:00
vfio vfio: Use config not menuconfig for VFIO_NOIOMMU 2021-09-18 13:40:12 +02:00
vhost vhost-vdpa: Fix the wrong input in config_cb 2021-10-20 11:45:04 +02:00
video parisc/sticon: fix reverse colors 2021-11-26 10:39:20 +01:00
virt
virtio virtio_ring: check desc == NULL when using indirect with packed 2021-11-18 14:04:21 +01:00
visorbus visorbus: fix error return code in visorchipset_init() 2021-07-14 16:56:41 +02:00
vlynq
vme
w1 w1: ds2438: fixing bug that would always get page0 2021-07-20 16:05:39 +02:00
watchdog ar7: fix kernel builds for compiler test 2021-11-18 14:04:24 +01:00
xen xen: detect uninitialized xenbus in xenbus_init 2021-12-01 09:19:01 +01:00
zorro
Kconfig
Makefile