linux/net
Jakub Kicinski dd891b5b10 net: do not send a MOVE event when netdev changes netns
Networking supports changing a netdevice's netns and name
at the same time. This avoids name conflicts and the need
to rename the interface in multiple steps.
E.g. netns1={eth0, eth1}, netns2={eth1} - we want
to move netns1:eth1 to netns2 and call it eth0 there.
If we can't rename "in flight" we'd need to (1) rename
eth1 -> $tmp, (2) change netns, (3) rename $tmp -> eth0.

To rename the underlying struct device we have to call
device_rename(). The MOVE event it generates, however, doesn't
"belong" to either the old or the new namespace.
If there are conflicts on both sides it's actually impossible
to issue a real MOVE (old name -> new name) without confusing
user space. And Daniel reports that such confusions do in fact
happen for systemd, in real life.

Since we already issue explicit REMOVE and ADD events
manually - suppress the MOVE event completely. Move
the ADD after the rename, so that the REMOVE uses
the old name, and the ADD the new one.
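
The ordering described above can be sketched as a small user-space model. This is not kernel code: `struct model_dev`, `struct event_log`, `emit()` and `model_change_netns()` are hypothetical names made up for illustration, and the "event log" is only an array of strings standing in for what a uevent monitor would print.

```c
#include <stdio.h>
#include <string.h>

#define MAX_EVENTS 4
#define EVENT_LEN  64

/* Hypothetical stand-in for a netdevice; only the name matters here. */
struct model_dev {
	char name[16];
};

/* Hypothetical event log, standing in for a uevent monitor's output. */
struct event_log {
	char lines[MAX_EVENTS][EVENT_LEN];
	int count;
};

/* Record one event using whatever name the device has right now. */
static void emit(struct event_log *log, const char *ns, const char *action,
		 const struct model_dev *dev)
{
	if (log->count >= MAX_EVENTS)
		return;
	snprintf(log->lines[log->count++], EVENT_LEN, "%s | %s %s",
		 ns, action, dev->name);
}

/* Model of the ordering after the patch: REMOVE, silent rename, ADD. */
static void model_change_netns(struct model_dev *dev, const char *new_name,
			       struct event_log *log)
{
	/* REMOVE goes out first, so it still carries the old name. */
	emit(log, "old ns", "remove", dev);

	/* Rename without emitting a MOVE event. */
	snprintf(dev->name, sizeof(dev->name), "%s", new_name);

	/* ADD goes out after the rename, so it carries the new name. */
	emit(log, "new ns", "add", dev);
}
```

Calling `model_change_netns()` on a device named eth1 with the new name eth0 leaves two entries in the log, a remove for eth1 and an add for eth0, and no move entry.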

If there is no rename this changes the picture as follows:

Before:

old ns | KERNEL[213.399289] remove   /devices/virtual/net/eth0 (net)
new ns | KERNEL[213.401302] add      /devices/virtual/net/eth0 (net)
new ns | KERNEL[213.401397] move     /devices/virtual/net/eth0 (net)

After:

old ns | KERNEL[266.774257] remove   /devices/virtual/net/eth0 (net)
new ns | KERNEL[266.774509] add      /devices/virtual/net/eth0 (net)

If there is a rename and a conflict (using the exact eth0/eth1
example explained above) we get this:

Before:

old ns | KERNEL[224.316833] remove   /devices/virtual/net/eth1 (net)
new ns | KERNEL[224.318551] add      /devices/virtual/net/eth1 (net)
new ns | KERNEL[224.319662] move     /devices/virtual/net/eth0 (net)

After:

old ns | KERNEL[333.033166] remove   /devices/virtual/net/eth1 (net)
new ns | KERNEL[333.035098] add      /devices/virtual/net/eth0 (net)

Note that "in flight" rename is only performed when needed.
If there is no conflict for the old name in the target netns,
the rename is performed separately by dev_change_name(),
as if the rename were a different command, and there will still
be a MOVE event for the rename:

Before:

old ns | KERNEL[194.416429] remove   /devices/virtual/net/eth0 (net)
new ns | KERNEL[194.418809] add      /devices/virtual/net/eth0 (net)
new ns | KERNEL[194.418869] move     /devices/virtual/net/eth0 (net)
new ns | KERNEL[194.420866] move     /devices/virtual/net/eth1 (net)

After:

old ns | KERNEL[71.917520] remove   /devices/virtual/net/eth0 (net)
new ns | KERNEL[71.919155] add      /devices/virtual/net/eth0 (net)
new ns | KERNEL[71.920729] move     /devices/virtual/net/eth1 (net)
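
The three "After" cases in this message can be summarized as one decision, sketched below as a user-space model. It is an illustration only: `plan_events()` and its parameters are hypothetical and do not exist in the kernel, and error paths (for example a name conflict with no rename requested) are not modeled.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical model, not kernel code: write the post-patch uevent
 * sequence for moving device `old` into a target netns, optionally
 * renaming it to `new_name` (NULL means no rename). `old_taken` says
 * whether the old name is already in use in the target netns.
 */
static void plan_events(const char *old, const char *new_name,
			bool old_taken, char *out, size_t len)
{
	bool rename = new_name && strcmp(old, new_name) != 0;

	if (!rename)
		/* Plain netns change: REMOVE and ADD with the same name. */
		snprintf(out, len, "remove %s; add %s", old, old);
	else if (old_taken)
		/* In-flight rename: no MOVE, ADD carries the new name. */
		snprintf(out, len, "remove %s; add %s", old, new_name);
	else
		/* Rename done as a separate step, which still emits MOVE. */
		snprintf(out, len, "remove %s; add %s; move %s",
			 old, old, new_name);
}
```

For instance, `plan_events("eth0", "eth1", false, buf, sizeof(buf))` yields "remove eth0; add eth0; move eth1", which matches the last "After" log above.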

If deleting the MOVE event breaks some user space we should insert
an explicit kobject_uevent(MOVE) after the ADD, like this:

@@ -11192,6 +11192,12 @@ int __dev_change_net_namespace(struct net_device *dev, struct net *net,
 	kobject_uevent(&dev->dev.kobj, KOBJ_ADD);
 	netdev_adjacent_add_links(dev);

+	/* User space wants an explicit MOVE event, issue one unless
+	 * dev_change_name() will get called later and issue one.
+	 */
+	if (!pat || new_name[0])
+		kobject_uevent(&dev->dev.kobj, KOBJ_MOVE);
+
 	/* Adapt owner in case owning user namespace of target network
 	 * namespace is different from the original one.
 	 */

Reported-by: Daniel Gröber <dxld@darkboxed.org>
Link: https://lore.kernel.org/all/20231010121003.x3yi6fihecewjy4e@House.clients.dxld.at/
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Link: https://lore.kernel.org/all/20231120184140.578375-1-kuba@kernel.org/
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-11-21 14:39:03 -08:00
..
6lowpan
9p 9p/net: fix possible memory leak in p9_check_errors() 2023-10-27 12:44:13 +09:00
802 net: fill in MODULE_DESCRIPTION()s under net/802* 2023-10-28 11:29:28 +01:00
8021q net: ethtool: Refactor identical get_ts_info implementations. 2023-11-18 14:52:57 +00:00
appletalk appletalk: remove special handling code for ipddp 2023-10-13 17:59:32 -07:00
atm net: atm: Remove redundant check. 2023-10-23 08:45:25 +01:00
ax25 net: implement lockless SO_PRIORITY 2023-10-01 19:09:54 +01:00
batman-adv batman-adv: Switch to linux/array_size.h 2023-11-14 08:16:34 +01:00
bluetooth This update includes the following changes: 2023-11-02 16:15:30 -10:00
bpf bpf: Add __bpf_kfunc_{start,end}_defs macros 2023-11-01 22:33:53 -07:00
bpfilter
bridge netfilter: nf_conntrack_bridge: initialize err to 0 2023-11-14 16:16:21 +01:00
caif
can Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2023-10-12 17:07:34 -07:00
ceph This update includes the following changes: 2023-11-02 16:15:30 -10:00
core net: do not send a MOVE event when netdev changes netns 2023-11-21 14:39:03 -08:00
dcb
dccp dccp/tcp: Call security_inet_conn_request() after setting IPv6 addresses. 2023-11-02 12:56:03 +01:00
devlink devlink: Add device lock assert in reload operation 2023-11-18 17:38:50 +00:00
dns_resolver
dsa net: dsa: tag_rtl4_a: Use existing ETH_P_REALTEK constant 2023-11-14 19:45:35 -08:00
ethernet
ethtool net: partial revert of the "Make timestamping selectable: series 2023-11-18 18:42:37 -08:00
handshake Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2023-10-26 13:46:28 -07:00
hsr hsr: Prevent use after free in prp_create_tagged_frame() 2023-11-01 22:26:04 -07:00
ieee802154
ife
ipv4 tcp: no longer abort SYN_SENT when receiving some ICMP 2023-11-16 23:35:12 +00:00
ipv6 tcp: no longer abort SYN_SENT when receiving some ICMP 2023-11-16 23:35:12 +00:00
iucv s390: use control register bit defines 2023-09-19 13:26:57 +02:00
kcm net: kcm: fill in MODULE_DESCRIPTION() 2023-11-08 18:17:44 -08:00
key
l2tp Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2023-10-05 13:16:47 -07:00
l3mdev
lapb
llc llc: verify mac len before reading mac header 2023-11-01 22:21:32 -07:00
mac80211 wireless-next patches for v6.7 2023-10-26 20:27:58 -07:00
mac802154
mctp mctp: perform route lookups under a RCU read-side lock 2023-10-10 19:43:22 -07:00
mpls
mptcp mptcp: fix setsockopt(IP_TOS) subflow locking 2023-11-14 20:10:20 -08:00
ncsi net/ncsi: Add NC-SI 1.2 Get MC MAC Address command 2023-11-18 15:00:51 +00:00
netfilter netfilter: nf_tables: split async and sync catchall in two functions 2023-11-14 16:16:21 +01:00
netlabel
netlink rtnetlink: introduce nlmsg_new_large and use it in rtnl_getlink 2023-11-18 20:18:25 -08:00
netrom net: implement lockless SO_PRIORITY 2023-10-01 19:09:54 +01:00
nfc nfc: nci: fix possible NULL pointer dereference in send_acknowledge() 2023-10-16 17:34:53 -07:00
nsh
openvswitch net/sched: act_ct: Always fill offloading tuple iifidx 2023-11-08 17:47:08 -08:00
packet Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2023-10-12 17:07:34 -07:00
phonet
psample
qrtr
rds net: prevent address rewrite in kernel_bind() 2023-10-01 19:31:29 +01:00
rfkill net: rfkill: reduce data->mtx scope in rfkill_fop_open 2023-10-11 16:55:10 +02:00
rose net: implement lockless SO_PRIORITY 2023-10-01 19:09:54 +01:00
rxrpc rxrpc: Fix two connection reaping bugs 2023-11-01 22:28:55 -07:00
sched net/sched: cls_u32: replace int refcounts with proper refcounts 2023-11-18 19:38:23 -08:00
sctp Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2023-10-05 13:16:47 -07:00
smc net/smc: put sk reference if close work was canceled 2023-11-06 10:01:07 +00:00
strparser
sunrpc NFS client updates for Linux 6.7 2023-11-08 13:39:16 -08:00
switchdev
tipc tipc: Remove redundant call to TLV_SPACE() 2023-11-17 02:27:27 +00:00
tls tls: don't reset prot->aad_size and prot->tail_size for TLS_HW 2023-10-23 10:15:09 -07:00
unix af_unix: fix use-after-free in unix_stream_read_actor() 2023-11-14 10:51:13 +01:00
vmw_vsock virtio/vsock: Fix uninit-value in virtio_transport_recv_pkt() 2023-11-07 18:56:06 -08:00
wireless wireless-next patches for v6.7 2023-10-26 20:27:58 -07:00
x25 net: implement lockless SO_PRIORITY 2023-10-01 19:09:54 +01:00
xdp xsk: Avoid starving the xsk further down the list 2023-10-24 11:55:36 +02:00
xfrm Including fixes from netfilter and bpf. 2023-11-09 17:09:35 -08:00
compat.c
devres.c
Kconfig net: add skb_segment kunit test 2023-10-11 10:39:01 +01:00
Kconfig.debug
Makefile
socket.c bpf: Add __bpf_hook_{start,end} macros 2023-11-01 22:33:53 -07:00
sysctl_net.c