linux/drivers/nvme/host
Chris Leech 1c200c8bce nvme-tcp: lockdep: annotate in-kernel sockets
[ Upstream commit 841aee4d75 ]

Put NVMe/TCP sockets in their own class to avoid some lockdep warnings.
Sockets created by nvme-tcp are not exposed to user-space, and will not
trigger certain code paths that the general socket API exposes.

Lockdep complains about a circular dependency between the socket and
filesystem locks, because setsockopt can trigger a page fault with a
socket lock held, but nvme-tcp sends requests on the socket while
filesystem locks are held.

  ======================================================
  WARNING: possible circular locking dependency detected
  5.15.0-rc3 #1 Not tainted
  ------------------------------------------------------
  fio/1496 is trying to acquire lock:
  (sk_lock-AF_INET){+.+.}-{0:0}, at: tcp_sendpage+0x23/0x80

  but task is already holding lock:
  (&xfs_dir_ilock_class/5){+.+.}-{3:3}, at: xfs_ilock+0xcf/0x290 [xfs]

  which lock already depends on the new lock.

  other info that might help us debug this:

  chain exists of:
   sk_lock-AF_INET --> sb_internal --> &xfs_dir_ilock_class/5

  Possible unsafe locking scenario:

        CPU0                    CPU1
        ----                    ----
   lock(&xfs_dir_ilock_class/5);
                                lock(sb_internal);
                                lock(&xfs_dir_ilock_class/5);
   lock(sk_lock-AF_INET);

  *** DEADLOCK ***

  6 locks held by fio/1496:
   #0: (sb_writers#13){.+.+}-{0:0}, at: path_openat+0x9fc/0xa20
   #1: (&inode->i_sb->s_type->i_mutex_dir_key){++++}-{3:3}, at: path_openat+0x296/0xa20
   #2: (sb_internal){.+.+}-{0:0}, at: xfs_trans_alloc_icreate+0x41/0xd0 [xfs]
   #3: (&xfs_dir_ilock_class/5){+.+.}-{3:3}, at: xfs_ilock+0xcf/0x290 [xfs]
   #4: (hctx->srcu){....}-{0:0}, at: hctx_lock+0x51/0xd0
   #5: (&queue->send_mutex){+.+.}-{3:3}, at: nvme_tcp_queue_rq+0x33e/0x380 [nvme_tcp]

This annotation lets lockdep analyze nvme-tcp controlled sockets
independently of what the user-space sockets API does.
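
The effect of a separate lock class can be illustrated with a toy user-space model of a lock-class dependency graph. This is only a sketch and not the kernel's lockdep implementation: `struct lock_graph`, `add_dependency()` and `nvme_send_is_clean()` are hypothetical names invented for the illustration, and the class name `sk_lock-AF_INET-NVME` is an assumed label for the reclassified socket lock. The model records "B was acquired while A was held" edges and rejects any new edge that would close a cycle, which mirrors the chain shown in the report above.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define MAX_CLASSES 8

/* Toy lock-class graph: dep[a][b] means "b was acquired while a was held". */
struct lock_graph {
	const char *name[MAX_CLASSES];
	bool dep[MAX_CLASSES][MAX_CLASSES];
	int nr;
};

/* Look up a class by name, registering it on first use. */
static int class_id(struct lock_graph *g, const char *name)
{
	for (int i = 0; i < g->nr; i++)
		if (strcmp(g->name[i], name) == 0)
			return i;
	assert(g->nr < MAX_CLASSES);
	g->name[g->nr] = name;
	return g->nr++;
}

/* Depth-first search: is 'to' reachable from 'from' along recorded edges? */
static bool reachable(const struct lock_graph *g, int from, int to, bool *seen)
{
	if (from == to)
		return true;
	seen[from] = true;
	for (int next = 0; next < g->nr; next++)
		if (g->dep[from][next] && !seen[next] &&
		    reachable(g, next, to, seen))
			return true;
	return false;
}

/*
 * Record "acquired 'taken' while holding 'held'".
 * Returns false, and records nothing, if the new edge would close a cycle.
 */
static bool add_dependency(struct lock_graph *g, const char *held,
			   const char *taken)
{
	bool seen[MAX_CLASSES] = { false };
	int h = class_id(g, held);
	int t = class_id(g, taken);

	if (reachable(g, t, h, seen))
		return false;
	g->dep[h][t] = true;
	return true;
}

/*
 * Replay the chain from the report, then the nvme-tcp send path, with the
 * nvme-tcp socket lock placed in the class named by 'nvme_sock_class'.
 */
static bool nvme_send_is_clean(const char *nvme_sock_class)
{
	struct lock_graph g = { 0 };

	/* User-space socket API path: filesystem locks taken under sk_lock. */
	add_dependency(&g, "sk_lock-AF_INET", "sb_internal");
	add_dependency(&g, "sb_internal", "xfs_dir_ilock");

	/* nvme-tcp path: socket lock taken while a filesystem lock is held. */
	return add_dependency(&g, "xfs_dir_ilock", nvme_sock_class);
}
```

In this model, `nvme_send_is_clean("sk_lock-AF_INET")` reports a cycle because the nvme-tcp socket shares a class with user-space sockets, while `nvme_send_is_clean("sk_lock-AF_INET-NVME")` does not, since nothing is ever acquired under the separate class.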

Link: https://lore.kernel.org/linux-nvme/CAHj4cs9MDYLJ+q+2_GXUK9HxFizv2pxUryUR0toX974M040z7g@mail.gmail.com/

Signed-off-by: Chris Leech <cleech@redhat.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-04-08 14:40:32 +02:00
core.c nvme: cleanup __nvme_check_ids 2022-04-08 14:40:00 +02:00
fabrics.c nvme-fabrics: decode host pathing error for connect 2021-06-16 12:01:37 +02:00
fabrics.h nvme-fabrics: fix state check in nvmf_ctlr_matches_baseopts() 2022-02-08 18:30:35 +01:00
fault_inject.c nvme: enable to inject errors into admin commands 2019-06-21 11:15:50 +02:00
fc.c nvme-fc: avoid race between time out and tear down 2021-10-09 14:40:57 +02:00
fc.h nvme-fc: Update header and host for common definitions for LS handling 2020-05-09 16:18:33 -06:00
hwmon.c nvme: return errors for hwmon init 2020-09-22 17:49:55 +02:00
Kconfig nvme-tcp: remove incorrect Kconfig dep in BLK_DEV_NVME 2021-06-16 12:01:38 +02:00
lightnvm.c nvme: support for multiple Command Sets Supported and Effects log pages 2020-07-08 16:16:20 +02:00
Makefile nvme: support for zoned namespaces 2020-07-08 16:16:20 +02:00
multipath.c nvme: drop scan_lock and always kick requeue list when removing namespaces 2021-11-18 14:03:59 +01:00
nvme.h nvme: add command id quirk for apple controllers 2021-10-06 15:55:59 +02:00
pci.c nvme-pci: add the IGNORE_DEV_SUBNQN quirk for Intel P4500/P4600 SSDs 2022-02-16 12:54:20 +01:00
rdma.c nvme-rdma: fix possible use-after-free in transport error_recovery work 2022-02-23 12:01:00 +01:00
tcp.c nvme-tcp: lockdep: annotate in-kernel sockets 2022-04-08 14:40:32 +02:00
trace.c nvme: trace: parse Get LBA Status command in detail 2019-08-29 12:55:01 -07:00
trace.h nvme: fix nvme_setup_command metadata trace event 2021-08-08 09:05:23 +02:00
zns.c nvme: remove the disk argument to nvme_update_zone_info 2020-10-07 07:56:17 +02:00