linux/include/net
Lin Ming 0b0ea6a363 ipvs: fix oops on NAT reply in br_nf context
commit 9e33ce453f upstream.

IPVS should not reset skb->nf_bridge in FORWARD hook
by calling nf_reset for NAT replies. It triggers oops in
br_nf_forward_finish.

[  579.781508] BUG: unable to handle kernel NULL pointer dereference at 0000000000000004
[  579.781669] IP: [<ffffffff817b1ca5>] br_nf_forward_finish+0x58/0x112
[  579.781792] PGD 218f9067 PUD 0
[  579.781865] Oops: 0000 [#1] SMP
[  579.781945] CPU 0
[  579.781983] Modules linked in:
[  579.782047]
[  579.782080]
[  579.782114] Pid: 4644, comm: qemu Tainted: G        W    3.5.0-rc5-00006-g95e69f9 #282 Hewlett-Packard  /30E8
[  579.782300] RIP: 0010:[<ffffffff817b1ca5>]  [<ffffffff817b1ca5>] br_nf_forward_finish+0x58/0x112
[  579.782455] RSP: 0018:ffff88007b003a98  EFLAGS: 00010287
[  579.782541] RAX: 0000000000000008 RBX: ffff8800762ead00 RCX: 000000000001670a
[  579.782653] RDX: 0000000000000000 RSI: 000000000000000a RDI: ffff8800762ead00
[  579.782845] RBP: ffff88007b003ac8 R08: 0000000000016630 R09: ffff88007b003a90
[  579.782957] R10: ffff88007b0038e8 R11: ffff88002da37540 R12: ffff88002da01a02
[  579.783066] R13: ffff88002da01a80 R14: ffff88002d83c000 R15: ffff88002d82a000
[  579.783177] FS:  0000000000000000(0000) GS:ffff88007b000000(0063) knlGS:00000000f62d1b70
[  579.783306] CS:  0010 DS: 002b ES: 002b CR0: 000000008005003b
[  579.783395] CR2: 0000000000000004 CR3: 00000000218fe000 CR4: 00000000000027f0
[  579.783505] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  579.783684] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[  579.783795] Process qemu (pid: 4644, threadinfo ffff880021b20000, task ffff880021aba760)
[  579.783919] Stack:
[  579.783959]  ffff88007693cedc ffff8800762ead00 ffff88002da01a02 ffff8800762ead00
[  579.784110]  ffff88002da01a02 ffff88002da01a80 ffff88007b003b18 ffffffff817b26c7
[  579.784260]  ffff880080000000 ffffffff81ef59f0 ffff8800762ead00 ffffffff81ef58b0
[  579.784477] Call Trace:
[  579.784523]  <IRQ>
[  579.784562]
[  579.784603]  [<ffffffff817b26c7>] br_nf_forward_ip+0x275/0x2c8
[  579.784707]  [<ffffffff81704b58>] nf_iterate+0x47/0x7d
[  579.784797]  [<ffffffff817ac32e>] ? br_dev_queue_push_xmit+0xae/0xae
[  579.784906]  [<ffffffff81704bfb>] nf_hook_slow+0x6d/0x102
[  579.784995]  [<ffffffff817ac32e>] ? br_dev_queue_push_xmit+0xae/0xae
[  579.785175]  [<ffffffff8187fa95>] ? _raw_write_unlock_bh+0x19/0x1b
[  579.785179]  [<ffffffff817ac417>] __br_forward+0x97/0xa2
[  579.785179]  [<ffffffff817ad366>] br_handle_frame_finish+0x1a6/0x257
[  579.785179]  [<ffffffff817b2386>] br_nf_pre_routing_finish+0x26d/0x2cb
[  579.785179]  [<ffffffff817b2cf0>] br_nf_pre_routing+0x55d/0x5c1
[  579.785179]  [<ffffffff81704b58>] nf_iterate+0x47/0x7d
[  579.785179]  [<ffffffff817ad1c0>] ? br_handle_local_finish+0x44/0x44
[  579.785179]  [<ffffffff81704bfb>] nf_hook_slow+0x6d/0x102
[  579.785179]  [<ffffffff817ad1c0>] ? br_handle_local_finish+0x44/0x44
[  579.785179]  [<ffffffff81551525>] ? sky2_poll+0xb35/0xb54
[  579.785179]  [<ffffffff817ad62a>] br_handle_frame+0x213/0x229
[  579.785179]  [<ffffffff817ad417>] ? br_handle_frame_finish+0x257/0x257
[  579.785179]  [<ffffffff816e3b47>] __netif_receive_skb+0x2b4/0x3f1
[  579.785179]  [<ffffffff816e69fc>] process_backlog+0x99/0x1e2
[  579.785179]  [<ffffffff816e6800>] net_rx_action+0xdf/0x242
[  579.785179]  [<ffffffff8107e8a8>] __do_softirq+0xc1/0x1e0
[  579.785179]  [<ffffffff8135a5ba>] ? trace_hardirqs_off_thunk+0x3a/0x6c
[  579.785179]  [<ffffffff8188812c>] call_softirq+0x1c/0x30

The steps to reproduce as follow,

1. On Host1, setup brige br0(192.168.1.106)
2. Boot a kvm guest(192.168.1.105) on Host1 and start httpd
3. Start IPVS service on Host1
   ipvsadm -A -t 192.168.1.106:80 -s rr
   ipvsadm -a -t 192.168.1.106:80 -r 192.168.1.105:80 -m
4. Run apache benchmark on Host2(192.168.1.101)
   ab -n 1000 http://192.168.1.106/

ip_vs_reply4
  ip_vs_out
    handle_response
      ip_vs_notrack
        nf_reset()
        {
          skb->nf_bridge = NULL;
        }

Actually, IPVS wants in this case just to replace nfct
with untracked version. So replace the nf_reset(skb) call
in ip_vs_notrack() with a nf_conntrack_put(skb->nfct) call.

Signed-off-by: Lin Ming <mlin@ss.pku.edu.cn>
Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Simon Horman <horms@verge.net.au>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Acked-by: David Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-10-21 09:28:00 -07:00
..
9p
bluetooth Bluetooth: Change signature of smp_conn_security() 2012-10-02 10:30:34 -07:00
caif
irda
iucv af_iucv: add shutdown for HS transport 2012-03-07 22:52:24 -08:00
netfilter netfilter: nf_conntrack: fix racy timer handling with reliable events 2012-10-21 09:28:00 -07:00
netns BUG: headers with BUG/BUG_ON etc. need linux/bug.h 2012-03-04 17:54:34 -05:00
nfc NFC: NCI code identation fixes 2012-03-06 15:16:25 -05:00
phonet
sctp sctp: check cached dst before using it 2012-05-10 23:15:47 -04:00
tc_act
act_api.h
addrconf.h
af_ieee802154.h
af_rxrpc.h
af_unix.h switch unix_sock to struct path 2012-03-20 21:29:41 -04:00
ah.h
arp.h
atmclip.h
ax25.h
ax88796.h
cfg80211-wext.h
cfg80211.h Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless 2012-03-27 22:15:01 -04:00
checksum.h
cipso_ipv4.h cipso: handle CIPSO options correctly when NetLabel is disabled 2012-07-16 09:03:44 -07:00
cls_cgroup.h
compat.h net: get rid of some pointless casts to sockaddr 2012-03-11 19:11:22 -07:00
datalink.h
dcbevent.h
dcbnl.h net: dcb: getnumtcs()/setnumtcs() should return an int 2012-03-02 18:16:49 -08:00
dn_dev.h
dn_fib.h
dn_neigh.h
dn_nsp.h
dn_route.h
dn.h
dsa.h
dsfield.h
dst_ops.h
dst.h ipv6: fix incorrect ipsec fragment 2012-06-10 00:36:15 +09:00
esp.h
ethoc.h
fib_rules.h
flow_keys.h
flow.h
garp.h
gen_stats.h
genetlink.h
gre.h
icmp.h
ieee80211_radiotap.h
ieee802154_netdev.h
ieee802154.h
if_inet6.h
inet_common.h
inet_connection_sock.h
inet_ecn.h
inet_frag.h
inet_hashtables.h
inet_sock.h
inet_timewait_sock.h
inet6_connection_sock.h
inet6_hashtables.h
inetpeer.h inetpeer: fix a race in inetpeer_gc_worker() 2012-07-16 09:03:45 -07:00
ip_fib.h
ip_vs.h ipvs: fix oops on NAT reply in br_nf context 2012-10-21 09:28:00 -07:00
ip.h ipv4: Make ip_call_ra_chain() return bool. 2012-03-09 14:34:50 -08:00
ip6_checksum.h
ip6_fib.h ipv6: clean up rt6_clean_expires 2012-04-17 22:31:59 -04:00
ip6_route.h
ip6_tunnel.h
ipcomp.h
ipconfig.h
ipip.h
ipv6.h
ipx.h
iw_handler.h
lapb.h
lib80211.h
llc_c_ac.h
llc_c_ev.h
llc_c_st.h
llc_conn.h
llc_if.h
llc_pdu.h
llc_s_ac.h
llc_s_ev.h
llc_s_st.h
llc_sap.h
llc.h
mac80211.h mac80211: Convert WARN_ON to WARN_ON_ONCE 2012-04-09 15:54:48 -04:00
mip6.h
mld.h
ndisc.h
neighbour.h
net_namespace.h
net_ratelimit.h
netdma.h
netevent.h
netlabel.h
netlink.h
netprio_cgroup.h
netrom.h
nexthop.h
nl802154.h
p8022.h
ping.h
pkt_cls.h
pkt_sched.h
protocol.h
psnap.h
raw.h
rawv6.h
red.h net_sched: red: Make minor corrections to comments 2012-04-16 23:53:11 -04:00
regulatory.h
request_sock.h
rose.h
route.h
rtnetlink.h rtnetlink: Fix problem with buffer allocation 2012-02-21 16:56:45 -05:00
sch_generic.h bonding: Fix corrupted queue_mapping 2012-07-16 09:03:47 -07:00
scm.h af_netlink: force credentials passing [CVE-2012-3520] 2012-10-02 10:29:37 -07:00
secure_seq.h
slhc_vj.h
snmp.h
sock.h tcp: Apply device TSO segment limit earlier 2012-10-02 10:29:34 -07:00
stp.h
tcp_memcontrol.h
tcp_states.h
tcp.h The following text was taken from the original review request: 2012-03-24 10:08:39 -07:00
timewait_sock.h BUG: headers with BUG/BUG_ON etc. need linux/bug.h 2012-03-04 17:54:34 -05:00
transp_v6.h
udp.h BUG: headers with BUG/BUG_ON etc. need linux/bug.h 2012-03-04 17:54:34 -05:00
udplite.h net: ipv4: Standardize prefixes for message logging 2012-03-12 17:05:21 -07:00
wext.h
wimax.h
wpan-phy.h BUG: headers with BUG/BUG_ON etc. need linux/bug.h 2012-03-04 17:54:34 -05:00
x25.h
x25device.h
xfrm.h xfrm: Workaround incompatibility of ESN and async crypto 2012-10-13 05:38:40 +09:00