linux/net
Ying Xue 0aa1fedab0 tipc: fix lockdep warning during bearer initialization
[ Upstream commit 4225a398c1 ]

When the lockdep validator is enabled, it will report the below
warning when we enable a TIPC bearer:

[ INFO: possible irq lock inversion dependency detected ]
---------------------------------------------------------
Possible interrupt unsafe locking scenario:

        CPU0                    CPU1
        ----                    ----
   lock(ptype_lock);
                                local_irq_disable();
                                lock(tipc_net_lock);
                                lock(ptype_lock);
   <Interrupt>
   lock(tipc_net_lock);

  *** DEADLOCK ***

the shortest dependencies between 2nd lock and 1st lock:
  -> (ptype_lock){+.+...} ops: 10 {
[...]
SOFTIRQ-ON-W at:
                      [<c1089418>] __lock_acquire+0x528/0x13e0
                      [<c108a360>] lock_acquire+0x90/0x100
                      [<c1553c38>] _raw_spin_lock+0x38/0x50
                      [<c14651ca>] dev_add_pack+0x3a/0x60
                      [<c182da75>] arp_init+0x1a/0x48
                      [<c182dce5>] inet_init+0x181/0x27e
                      [<c1001114>] do_one_initcall+0x34/0x170
                      [<c17f7329>] kernel_init+0x110/0x1b2
                      [<c155b6a2>] kernel_thread_helper+0x6/0x10
[...]
   ... key      at: [<c17e4b10>] ptype_lock+0x10/0x20
   ... acquired at:
    [<c108a360>] lock_acquire+0x90/0x100
    [<c1553c38>] _raw_spin_lock+0x38/0x50
    [<c14651ca>] dev_add_pack+0x3a/0x60
    [<c8bc18d2>] enable_bearer+0xf2/0x140 [tipc]
    [<c8bb283a>] tipc_enable_bearer+0x1ba/0x450 [tipc]
    [<c8bb3a04>] tipc_cfg_do_cmd+0x5c4/0x830 [tipc]
    [<c8bbc032>] handle_cmd+0x42/0xd0 [tipc]
    [<c148e802>] genl_rcv_msg+0x232/0x280
    [<c148d3f6>] netlink_rcv_skb+0x86/0xb0
    [<c148e5bc>] genl_rcv+0x1c/0x30
    [<c148d144>] netlink_unicast+0x174/0x1f0
    [<c148ddab>] netlink_sendmsg+0x1eb/0x2d0
    [<c1456bc1>] sock_aio_write+0x161/0x170
    [<c1135a7c>] do_sync_write+0xac/0xf0
    [<c11360f6>] vfs_write+0x156/0x170
    [<c11361e2>] sys_write+0x42/0x70
    [<c155b0df>] sysenter_do_call+0x12/0x38
[...]
}
  -> (tipc_net_lock){+..-..} ops: 4 {
[...]
    IN-SOFTIRQ-R at:
                     [<c108953a>] __lock_acquire+0x64a/0x13e0
                     [<c108a360>] lock_acquire+0x90/0x100
                     [<c15541cd>] _raw_read_lock_bh+0x3d/0x50
                     [<c8bb874d>] tipc_recv_msg+0x1d/0x830 [tipc]
                     [<c8bc195f>] recv_msg+0x3f/0x50 [tipc]
                     [<c146a5fa>] __netif_receive_skb+0x22a/0x590
                     [<c146ab0b>] netif_receive_skb+0x2b/0xf0
                     [<c13c43d2>] pcnet32_poll+0x292/0x780
                     [<c146b00a>] net_rx_action+0xfa/0x1e0
                     [<c103a4be>] __do_softirq+0xae/0x1e0
[...]
}

>From the log, we can see three different call chains between
CPU0 and CPU1:

Time 0 on CPU0:

  kernel_init()->inet_init()->dev_add_pack()

At time 0, the ptype_lock is held by CPU0 in dev_add_pack();

Time 1 on CPU1:

  tipc_enable_bearer()->enable_bearer()->dev_add_pack()

At time 1, tipc_enable_bearer() first holds tipc_net_lock, and then
wants to take ptype_lock to register TIPC protocol handler into the
networking stack.  But the ptype_lock has been taken by dev_add_pack()
on CPU0, so at this time the dev_add_pack() running on CPU1 has to be
busy looping.

Time 2 on CPU0:

  netif_receive_skb()->recv_msg()->tipc_recv_msg()

At time 2, an incoming TIPC packet arrives at CPU0, hence
tipc_recv_msg() will be invoked. In tipc_recv_msg(), it first wants
to hold tipc_net_lock.  At the moment, below scenario happens:

On CPU0, below is our sequence of taking locks:

  lock(ptype_lock)->lock(tipc_net_lock)

On CPU1, our sequence of taking locks looks like:

  lock(tipc_net_lock)->lock(ptype_lock)

Obviously deadlock may happen in this case.

But please note the deadlock possibly doesn't occur at all when the
first TIPC bearer is enabled.  Before enable_bearer() -- running on
CPU1 does not hold ptype_lock, so the TIPC receive handler (i.e.
recv_msg()) is not registered successfully via dev_add_pack(), so
the tipc_recv_msg() cannot be called by recv_msg() even if a TIPC
message comes to CPU0. But when the second TIPC bearer is
registered, the deadlock can perhaps really happen.

To fix it, we will push the work of registering TIPC protocol
handler into workqueue context. After the change, both paths taking
ptype_lock are always in process contexts, thus, the deadlock should
never occur.

Signed-off-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-09-14 06:02:11 -07:00
..
9p 9p: fix off by one causing access violations and memory corruption 2013-07-28 16:26:05 -07:00
802 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2012-04-02 17:53:39 -07:00
8021q vlan: fix a race in egress prio management 2013-07-28 16:26:08 -07:00
appletalk net: Add export.h for EXPORT_SYMBOL/THIS_MODULE to non-modules 2011-10-31 19:30:30 -04:00
atm atm: update msg_namelen in vcc_recvmsg() 2013-05-01 09:41:04 -07:00
ax25 ax25: fix info leak via msg_name in ax25_recvmsg() 2013-05-01 09:41:04 -07:00
batman-adv batman-adv: fix random jitter calculation 2013-01-11 09:07:03 -08:00
bluetooth Bluetooth: Fix crash in l2cap_build_cmd() with small MTU 2013-07-03 10:59:00 -07:00
bridge net: bridge: convert MLDv2 Query MRC into msecs_to_jiffies for max_delay 2013-09-14 06:02:10 -07:00
caif caif: Fix missing msg_namelen update in caif_seqpkt_recvmsg() 2013-05-01 09:41:04 -07:00
can can: gw: use kmem_cache_free() instead of kfree() 2013-04-12 09:38:47 -07:00
ceph libceph: Fix NULL pointer dereference in auth client code 2013-07-13 11:03:40 -07:00
core neighbour: populate neigh_parms on alloc before calling ndo_neigh_setup 2013-09-14 06:02:08 -07:00
dcb dcbnl: fix various netlink info leaks 2013-03-20 13:05:02 -07:00
dccp inet: Fix kmemleak in tcp_v4/6_syn_recv_sock and dccp_v4/6_request_recv_sock 2013-01-11 09:07:14 -08:00
decnet Remove all #inclusions of asm/system.h 2012-03-28 18:30:03 +01:00
dns_resolver KEYS: Allow special keyrings to be cleared 2012-01-19 14:38:51 +11:00
dsa dsa: Move switch drivers to new directory drivers/net/dsa 2011-11-29 00:21:36 -05:00
econet Remove all #inclusions of asm/system.h 2012-03-28 18:30:03 +01:00
ethernet Remove all #inclusions of asm/system.h 2012-03-28 18:30:03 +01:00
ieee802154 6lowpan: Fix endianness issue in is_addr_link_local(). 2013-03-20 13:05:02 -07:00
ipv4 tcp: cubic: fix bug in bictcp_acked() 2013-09-14 06:02:09 -07:00
ipv6 net: ipv6: tcp: fix potential use after free in tcp_v6_do_rcv 2013-09-14 06:02:10 -07:00
ipx net: Add export.h for EXPORT_SYMBOL/THIS_MODULE to non-modules 2011-10-31 19:30:30 -04:00
irda irda: Fix missing msg_namelen update in irda_recvmsg_dgram() 2013-05-01 09:41:05 -07:00
iucv iucv: Fix missing msg_namelen update in iucv_sock_recvmsg() 2013-05-01 09:41:05 -07:00
key af_key: initialize satype in key_notify_policy_flush() 2013-08-20 08:26:28 -07:00
l2tp l2tp: add missing .owner to struct pppox_proto 2013-07-28 16:26:02 -07:00
lapb Remove all #inclusions of asm/system.h 2012-03-28 18:30:03 +01:00
llc llc: Fix missing msg_namelen update in llc_ui_recvmsg() 2013-05-01 09:41:05 -07:00
mac80211 mac80211: fix duplicate retransmission detection 2013-08-11 15:38:42 -07:00
netfilter ipvs: ip_vs_sip_fill_param() BUG: bad check of return value 2013-05-11 13:48:08 -07:00
netlabel netlabel: improve domain mapping validation 2013-06-27 11:27:31 -07:00
netlink thermal: shorten too long mcast group name 2013-04-05 10:04:38 -07:00
netrom netrom: fix invalid use of sizeof in nr_recvmsg() 2013-05-01 09:41:06 -07:00
nfc NFC: llcp: fix info leaks via msg_name in llcp_sock_recvmsg() 2013-05-01 09:41:05 -07:00
openvswitch openvswitch: Reset upper layer protocol info on internal devices. 2012-10-02 10:29:50 -07:00
packet packet: packet_getname_spkt: make sure string is always 0-terminated 2013-06-27 11:27:33 -07:00
phonet phonet: Sort out initiailziation and cleanup code. 2012-04-13 11:01:43 -04:00
rds rds: limit the size allocated by rds_message_alloc() 2013-03-20 13:05:01 -07:00
rfkill device.h: cleanup users outside of linux/include (C files) 2012-03-11 14:27:37 -04:00
rose rose: fix info leak via msg_name in rose_recvmsg() 2013-05-01 09:41:05 -07:00
rxrpc RxRPC: Fix kcalloc parameters swapped 2012-02-14 14:41:55 -05:00
sched htb: fix sign extension bug 2013-09-14 06:02:08 -07:00
sctp sctp: fully initialize sctp_outq in sctp_outq_init 2013-08-11 15:38:44 -07:00
sunrpc SUNRPC: Fix memory corruption issue on 32-bit highmem systems 2013-09-07 21:58:15 -07:00
tipc tipc: fix lockdep warning during bearer initialization 2013-09-14 06:02:11 -07:00
unix af_unix: If we don't care about credentials coallesce all messages 2013-05-01 09:41:07 -07:00
wanrouter wanmain: comparing array with NULL 2012-08-09 08:31:51 -07:00
wimax net: Add export.h for EXPORT_SYMBOL/THIS_MODULE to non-modules 2011-10-31 19:30:30 -04:00
wireless nl80211: fix mgmt tx status and testmode reporting for netns 2013-08-11 15:38:41 -07:00
x25 x25: Fix broken locking in ioctl error paths. 2013-07-28 16:25:58 -07:00
xfrm xfrm_user: ensure user supplied esn replay window is valid 2012-10-13 05:38:41 +09:00
compat.c net: Block MSG_CMSG_COMPAT in send(m)msg and recv(m)msg 2013-06-27 11:27:32 -07:00
Kconfig net: Add Open vSwitch kernel components. 2011-12-03 09:35:17 -08:00
Makefile net: Add Open vSwitch kernel components. 2011-12-03 09:35:17 -08:00
nonet.c llseek: automatically add .llseek fop 2010-10-15 15:53:27 +02:00
socket.c net: Block MSG_CMSG_COMPAT in send(m)msg and recv(m)msg 2013-06-27 11:27:32 -07:00
sysctl_net.c sysctl: Modify __register_sysctl_paths to take a set instead of a root and an nsproxy 2012-01-24 16:40:30 -08:00