Linux kernel source tree
Go to file
Jason Gunthorpe ead79118da arm64/io: Provide a WC friendly __iowriteXX_copy()
The kernel provides driver support for using write combining IO memory
through the __iowriteXX_copy() API which is commonly used as an optional
optimization to generate 16/32/64 byte MemWr TLPs in a PCIe environment.

iomap_copy.c provides a generic implementation as a simple 4/8 byte at a
time copy loop that has worked well with past ARM64 CPUs, giving a high
frequency of large TLPs being successfully formed.

However modern ARM64 CPUs are quite sensitive to how the write combining
CPU HW is operated and a compiler generated loop with intermixed
load/store is not sufficient to frequently generate a large TLP. The CPUs
would like to see the entire TLP generated by consecutive store
instructions from registers. Compilers like gcc tend to intermix loads and
stores and have poor code generation, in part, due to the ARM64 situation
that writeq() does not codegen anything other than "[xN]". However even
with that resolved compilers like clang still do not have good code
generation.

This means on modern ARM64 CPUs the rate at which __iowriteXX_copy()
successfully generates large TLPs is very small (less than 1 in 10,000)
tries), to the point that the use of WC is pointless.

Implement __iowrite32/64_copy() specifically for ARM64 and use inline
assembly to build consecutive blocks of STR instructions. Provide direct
support for 64/32/16 large TLP generation in this manner. Optimize for
common constant lengths so that the compiler can directly inline the store
blocks.

This brings the frequency of large TLP generation up to a high level that
is comparable with older CPU generations.

As the __iowriteXX_copy() family of APIs is intended for use with WC
incorporate the DGH hint directly into the function.

Link: https://lore.kernel.org/r/4-v3-1893cd8b9369+1925-mlx5_arm_wc_jgg@nvidia.com
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: linux-arch@vger.kernel.org
Cc: linux-arm-kernel@lists.infradead.org
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2024-04-22 17:11:20 -03:00
arch arm64/io: Provide a WC friendly __iowriteXX_copy() 2024-04-22 17:11:20 -03:00
block block-6.9-20240329 2024-03-29 09:40:22 -07:00
certs This update includes the following changes: 2023-11-02 16:15:30 -10:00
crypto This push fixes a regression that broke iwd as well as a divide by 2024-03-25 10:48:23 -07:00
Documentation Kbuild fixes for v6.9 2024-03-31 11:23:51 -07:00
drivers RDMA/rxe: Let destroy qp succeed with stuck packet 2024-04-22 16:55:57 -03:00
fs Kbuild fixes for v6.9 2024-03-31 11:23:51 -07:00
include s390: Stop using weak symbols for __iowrite64_copy() 2024-04-22 17:11:20 -03:00
init init: open /initrd.image with O_LARGEFILE 2024-03-26 11:07:19 -07:00
io_uring io_uring/sqpoll: early exit thread if task_context wasn't allocated 2024-03-18 20:22:42 -06:00
ipc sysctl changes for v6.9-rc1 2024-03-18 14:59:13 -07:00
kernel Kbuild fixes for v6.9 2024-03-31 11:23:51 -07:00
lib s390: Stop using weak symbols for __iowrite64_copy() 2024-04-22 17:11:20 -03:00
LICENSES LICENSES: Add the copyleft-next-0.3.1 license 2022-11-08 15:44:01 +01:00
mm Kbuild fixes for v6.9 2024-03-31 11:23:51 -07:00
net nfsd-6.9 fixes: 2024-03-28 14:35:32 -07:00
rust Kbuild updates for v6.9 2024-03-21 14:41:00 -07:00
samples Tracing updates for 6.9: 2024-03-18 15:11:44 -07:00
scripts Kbuild fixes for v6.9 2024-03-31 11:23:51 -07:00
security - Kuan-Wei Chiu has developed the well-named series "lib min_heap: Min 2024-03-14 18:03:09 -07:00
sound sound fixes for 6.9-rc2 2024-03-28 14:54:49 -07:00
tools objtool: Fix compile failure when using the x32 compiler 2024-03-30 22:12:37 +01:00
usr Kbuild updates for v6.8 2024-01-18 17:57:07 -08:00
virt KVM Xen and pfncache changes for 6.9: 2024-03-11 10:42:55 -04:00
.clang-format clang-format: Update with v6.7-rc4's for_each macro list 2023-12-08 23:54:38 +01:00
.cocciconfig
.editorconfig Add .editorconfig file for basic formatting 2023-12-28 16:22:47 +09:00
.get_maintainer.ignore Add Jeff Kirsher to .get_maintainer.ignore 2024-03-08 11:36:54 +00:00
.gitattributes .gitattributes: set diff driver for Rust source code files 2023-05-31 17:48:25 +02:00
.gitignore kbuild: create a list of all built DTB files 2024-02-19 18:20:39 +09:00
.mailmap Including fixes from bpf, WiFi and netfilter. 2024-03-28 13:09:37 -07:00
.rustfmt.toml rust: add .rustfmt.toml 2022-09-28 09:02:20 +02:00
COPYING COPYING: state that all contributions really are covered by this file 2020-02-10 13:32:20 -08:00
CREDITS Not a ton of stuff happening in the clk framework in this pull request. We got 2024-03-15 11:48:01 -07:00
Kbuild Kbuild updates for v6.1 2022-10-10 12:00:45 -07:00
Kconfig kbuild: ensure full rebuild when the compiler is updated 2020-05-12 13:28:33 +09:00
MAINTAINERS - Volunteer in Anna-Maria and Frederic as timers co-maintainers 2024-03-31 10:34:49 -07:00
Makefile Linux 6.9-rc2 2024-03-31 14:32:39 -07:00
README README: Fix spelling 2024-03-18 03:36:32 -06:00

Linux kernel
============

There are several guides for kernel developers and users. These guides can
be rendered in a number of formats, like HTML and PDF. Please read
Documentation/admin-guide/README.rst first.

In order to build the documentation, use ``make htmldocs`` or
``make pdfdocs``.  The formatted documentation can also be read online at:

    https://www.kernel.org/doc/html/latest/

There are various text files in the Documentation/ subdirectory,
several of them using the reStructuredText markup notation.

Please read the Documentation/process/changes.rst file, as it contains the
requirements for building and running the kernel, and information about
the problems which may result by upgrading your kernel.