Linux kernel source tree
Go to file
Chris Leech ffe0c49167 nvme-tcp: lockdep: annotate in-kernel sockets
[ Upstream commit 841aee4d75 ]

Put NVMe/TCP sockets in their own class to avoid some lockdep warnings.
Sockets created by nvme-tcp are not exposed to user-space, and will not
trigger certain code paths that the general socket API exposes.

Lockdep complains about a circular dependency between the socket and
filesystem locks, because setsockopt can trigger a page fault with a
socket lock held, but nvme-tcp sends requests on the socket while file
system locks are held.

  ======================================================
  WARNING: possible circular locking dependency detected
  5.15.0-rc3 #1 Not tainted
  ------------------------------------------------------
  fio/1496 is trying to acquire lock:
  (sk_lock-AF_INET){+.+.}-{0:0}, at: tcp_sendpage+0x23/0x80

  but task is already holding lock:
  (&xfs_dir_ilock_class/5){+.+.}-{3:3}, at: xfs_ilock+0xcf/0x290 [xfs]

  which lock already depends on the new lock.

  other info that might help us debug this:

  chain exists of:
   sk_lock-AF_INET --> sb_internal --> &xfs_dir_ilock_class/5

  Possible unsafe locking scenario:

        CPU0                    CPU1
        ----                    ----
   lock(&xfs_dir_ilock_class/5);
                                lock(sb_internal);
                                lock(&xfs_dir_ilock_class/5);
   lock(sk_lock-AF_INET);

  *** DEADLOCK ***

  6 locks held by fio/1496:
   #0: (sb_writers#13){.+.+}-{0:0}, at: path_openat+0x9fc/0xa20
   #1: (&inode->i_sb->s_type->i_mutex_dir_key){++++}-{3:3}, at: path_openat+0x296/0xa20
   #2: (sb_internal){.+.+}-{0:0}, at: xfs_trans_alloc_icreate+0x41/0xd0 [xfs]
   #3: (&xfs_dir_ilock_class/5){+.+.}-{3:3}, at: xfs_ilock+0xcf/0x290 [xfs]
   #4: (hctx->srcu){....}-{0:0}, at: hctx_lock+0x51/0xd0
   #5: (&queue->send_mutex){+.+.}-{3:3}, at: nvme_tcp_queue_rq+0x33e/0x380 [nvme_tcp]

This annotation lets lockdep analyze nvme-tcp controlled sockets
independently of what the user-space sockets API does.

Link: https://lore.kernel.org/linux-nvme/CAHj4cs9MDYLJ+q+2_GXUK9HxFizv2pxUryUR0toX974M040z7g@mail.gmail.com/

Signed-off-by: Chris Leech <cleech@redhat.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-04-08 14:23:56 +02:00
arch parisc: Fix handling off probe non-access faults 2022-04-08 14:23:56 +02:00
block Revert "Revert "block, bfq: honor already-setup queue merges"" 2022-04-08 14:23:56 +02:00
certs certs: Add support for using elliptic curve keys for signing modules 2021-08-23 19:55:42 +03:00
crypto crypto: xts - Add softdep on ecb 2022-04-08 14:23:55 +02:00
Documentation vsprintf: Fix %pK with kptr_restrict == 0 2022-04-08 14:23:18 +02:00
drivers nvme-tcp: lockdep: annotate in-kernel sockets 2022-04-08 14:23:56 +02:00
fs fs/binfmt_elf: Fix AT_PHDR for unusual ELF files 2022-04-08 14:23:56 +02:00
include serial: 8250: fix XOFF/XON sending when DMA is used 2022-04-08 14:23:50 +02:00
init init: make unknown command line param message clearer 2021-11-18 19:17:11 +01:00
ipc ipc/sem: do not sleep with a spin lock held 2022-02-08 18:34:03 +01:00
kernel rcu: Mark writes to the rcu_segcblist structure's ->flags field 2022-04-08 14:23:55 +02:00
lib lib/raid6/test/Makefile: Use $(pound) instead of \# for Make 4.3 2022-04-08 14:23:56 +02:00
LICENSES LICENSES/dual/CC-BY-4.0: Git rid of "smart quotes" 2021-07-15 06:31:24 -06:00
mm mm/kmemleak: reset tag when compare object pointer 2022-04-08 14:22:56 +02:00
net net/sched: act_ct: fix ref leak when switching zones 2022-04-08 14:23:53 +02:00
samples samples/bpf, xdpsock: Fix race when running for fix duration of time 2022-04-08 14:23:40 +02:00
scripts gcc-plugins/stackleak: Exactly match strings instead of prefixes 2022-04-08 14:23:54 +02:00
security Fix incorrect type in assignment of ipv6 port for audit 2022-04-08 14:23:55 +02:00
sound ASoC: amd: Fix reference to PCM buffer address 2022-04-08 14:23:22 +02:00
tools selftests: test_vxlan_under_vrf: Fix broken test case 2022-04-08 14:23:52 +02:00
usr usr/include/Makefile: add linux/nfc.h to the compile-test coverage 2022-02-01 17:27:15 +01:00
virt KVM: Fix lockdep false negative during host resume 2022-03-16 14:23:40 +01:00
.clang-format clang-format: Update with the latest for_each macro list 2021-05-12 23:32:39 +02:00
.cocciconfig
.get_maintainer.ignore Opt out of scripts/get_maintainer.pl 2019-05-16 10:53:40 -07:00
.gitattributes .gitattributes: use 'dts' diff driver for dts files 2019-12-04 19:44:11 -08:00
.gitignore .gitignore: ignore only top-level modules.builtin 2021-05-02 00:43:35 +09:00
.mailmap mailmap: add Andrej Shadura 2021-10-18 20:22:03 -10:00
COPYING COPYING: state that all contributions really are covered by this file 2020-02-10 13:32:20 -08:00
CREDITS MAINTAINERS: Move Daniel Drake to credits 2021-09-21 08:34:58 +03:00
Kbuild kbuild: rename hostprogs-y/always to hostprogs/always-y 2020-02-04 01:53:07 +09:00
Kconfig kbuild: ensure full rebuild when the compiler is updated 2020-05-12 13:28:33 +09:00
MAINTAINERS MAINTAINERS: adjust file entry for of_net.c after movement 2022-03-08 19:12:53 +01:00
Makefile Linux 5.15.32 2022-03-28 09:58:46 +02:00
README Drop all 00-INDEX files from Documentation/ 2018-09-09 15:08:58 -06:00

Linux kernel
============

There are several guides for kernel developers and users. These guides can
be rendered in a number of formats, like HTML and PDF. Please read
Documentation/admin-guide/README.rst first.

In order to build the documentation, use ``make htmldocs`` or
``make pdfdocs``.  The formatted documentation can also be read online at:

    https://www.kernel.org/doc/html/latest/

There are various text files in the Documentation/ subdirectory,
several of them using the Restructured Text markup notation.

Please read the Documentation/process/changes.rst file, as it contains the
requirements for building and running the kernel, and information about
the problems which may result by upgrading your kernel.