linux/net/core
Eric W. Biederman 5854ca90c6 net: Fix double free and memory corruption in get_net_ns_by_id()
[ Upstream commit 21b5944350 ]

(I can trivially verify that that idr_remove in cleanup_net happens
 after the network namespace count has dropped to zero --EWB)

Function get_net_ns_by_id() does not check for net::count
after it has found a peer in netns_ids idr.

It may dereference a peer, after its count has already been
finaly decremented. This leads to double free and memory
corruption:

put_net(peer)                                   rtnl_lock()
atomic_dec_and_test(&peer->count) [count=0]     ...
__put_net(peer)                                 get_net_ns_by_id(net, id)
  spin_lock(&cleanup_list_lock)
  list_add(&net->cleanup_list, &cleanup_list)
  spin_unlock(&cleanup_list_lock)
queue_work()                                      peer = idr_find(&net->netns_ids, id)
  |                                               get_net(peer) [count=1]
  |                                               ...
  |                                               (use after final put)
  v                                               ...
  cleanup_net()                                   ...
    spin_lock(&cleanup_list_lock)                 ...
    list_replace_init(&cleanup_list, ..)          ...
    spin_unlock(&cleanup_list_lock)               ...
    ...                                           ...
    ...                                           put_net(peer)
    ...                                             atomic_dec_and_test(&peer->count) [count=0]
    ...                                               spin_lock(&cleanup_list_lock)
    ...                                               list_add(&net->cleanup_list, &cleanup_list)
    ...                                               spin_unlock(&cleanup_list_lock)
    ...                                             queue_work()
    ...                                           rtnl_unlock()
    rtnl_lock()                                   ...
    for_each_net(tmp) {                           ...
      id = __peernet2id(tmp, peer)                ...
      spin_lock_irq(&tmp->nsid_lock)              ...
      idr_remove(&tmp->netns_ids, id)             ...
      ...                                         ...
      net_drop_ns()                               ...
	net_free(peer)                            ...
    }                                             ...
  |
  v
  cleanup_net()
    ...
    (Second free of peer)

Also, put_net() on the right cpu may reorder with left's cpu
list_replace_init(&cleanup_list, ..), and then cleanup_list
will be corrupted.

Since cleanup_net() is executed in worker thread, while
put_net(peer) can happen everywhere, there should be
enough time for concurrent get_net_ns_by_id() to pick
the peer up, and the race does not seem to be unlikely.
The patch fixes the problem in standard way.

(Also, there is possible problem in peernet2id_alloc(), which requires
check for net::count under nsid_lock and maybe_get_net(peer), but
in current stable kernel it's used under rtnl_lock() and it has to be
safe. Openswitch begun to use peernet2id_alloc(), and possibly it should
be fixed too. While this is not in stable kernel yet, so I'll send
a separate message to netdev@ later).

Cc: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Fixes: 0c7aecd4bd "netns: add rtnl cmd to add and get peer netns ids"
Reviewed-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Reviewed-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Acked-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-01-02 20:33:26 +01:00
..
datagram.c net: rename SOCK_ASYNC_NOSPACE and SOCK_ASYNC_WAITDATA 2015-12-01 15:45:05 -05:00
dev_addr_lists.c net: fix spelling for synchronized 2014-11-18 15:26:32 -05:00
dev_ioctl.c net: Zero terminate ifr_name in dev_ifname(). 2017-08-11 09:08:52 -07:00
dev.c net: Resend IGMP memberships upon peer notification. 2017-12-20 10:04:54 +01:00
drop_monitor.c drop_monitor: consider inserted data in genlmsg_end 2017-01-15 13:41:35 +01:00
dst.c Fix an intermittent pr_emerg warning about lo becoming free. 2017-07-05 14:37:13 +02:00
ethtool.c ethtool: do not vzalloc(0) on registers dump 2017-06-17 06:39:36 +02:00
fib_rules.c fib_rules: fix fib rule dumps across multiple skbs 2015-09-24 15:21:54 -07:00
filter.c tcp: take care of truncations done by sk_filter() 2016-11-21 10:06:40 +01:00
flow_dissector.c flow_dissect: call init_default_flow_dissectors() earlier 2016-12-02 09:09:02 +01:00
flow.c flow: Move __get_hash_from_flowi{4,6} into flow_dissector.c 2015-09-01 17:00:24 -07:00
gen_estimator.c net_sched: gen_estimator: extend pps limit 2015-07-08 13:59:20 -07:00
gen_stats.c gen_stats.c: Duplicate xstats buffer for later use 2015-02-19 15:45:53 -05:00
link_watch.c dev: introduce dev_get_iflink() 2015-04-02 14:04:59 -04:00
lwtunnel.c dst: Pass net into dst->output 2015-10-08 04:27:03 -07:00
Makefile lwtunnel: infrastructure for handling light weight tunnels like mpls 2015-07-21 10:39:03 -07:00
neighbour.c net: neigh: guard against NULL solicit() method 2017-05-02 21:19:51 -07:00
net_namespace.c net: Fix double free and memory corruption in get_net_ns_by_id() 2018-01-02 20:33:26 +01:00
net-procfs.c
net-sysfs.c switchdev: rename SWITCHDEV_ATTR_* enum values to SWITCHDEV_ATTR_ID_* 2015-10-03 04:49:37 -07:00
net-sysfs.h
net-traces.c net: FIB tracepoints 2015-08-29 13:05:16 -07:00
netclassid_cgroup.c Merge branch 'master' into for-4.4-fixes 2015-12-07 10:09:03 -05:00
netevent.c netevent: remove automatic variable in register_netevent_notifier() 2015-05-31 00:03:21 -07:00
netpoll.c netpoll: Check for skb->queue_mapping 2017-05-02 21:19:53 -07:00
netprio_cgroup.c cgroup: fix handling of multi-destination migration from subtree_control enabling 2015-12-03 10:18:21 -05:00
pktgen.c net: pktgen: remove rcu locking in pktgen_change_name() 2016-11-15 07:46:38 +01:00
ptp_classifier.c ptp: Change ptp_class to a proper bitmask 2015-11-03 11:08:22 -05:00
request_sock.c tcp: restore fastopen operations 2015-10-05 03:19:06 -07:00
rtnetlink.c rtnetlink: allocate more memory for dev_set_mac_address() 2017-08-11 09:08:52 -07:00
scm.c unix: correctly track in-flight fds in sending process user_struct 2016-03-03 15:07:05 -08:00
secure_seq.c net: remove a sparse error in secure_dccpv6_sequence_number() 2015-05-25 22:55:37 -04:00
skbuff.c netfilter/ipvs: clear ipvs_property flag when SKB net namespace changed 2017-11-24 08:32:24 +01:00
sock_diag.c net/core: make sock_diag.c explicitly non-modular 2015-10-09 07:52:27 -07:00
sock.c net: Set sk_prot_creator when cloning sockets to the right proto 2017-10-21 17:09:03 +02:00
stream.c net: fix sock_wake_async() rcu protection 2015-12-01 15:45:05 -05:00
sysctl_net_core.c net: Do not allow negative values for busy_read and busy_poll sysctl interfaces 2017-12-25 14:22:12 +01:00
timestamping.c net: skb_defer_rx_timestamp should check for phydev before setting up classify 2015-07-09 14:17:15 -07:00
tso.c net: tso: add support for IPv6 2015-10-26 22:24:22 -07:00
utils.c net: move net_get_random_once to lib 2015-10-08 05:26:35 -07:00