mirror of
https://github.com/torvalds/linux.git
synced 2026-05-25 15:41:52 +02:00
v7.0-rc5
2676 Commits
| Author | SHA1 | Message | Date | |
|---|---|---|---|---|
|
|
3715a00855 |
bridge: cfm: Fix race condition in peer_mep deletion
When a peer MEP is being deleted, cancel_delayed_work_sync() is called
on ccm_rx_dwork before freeing. However, br_cfm_frame_rx() runs in
softirq context under rcu_read_lock (without RTNL) and can re-schedule
ccm_rx_dwork via ccm_rx_timer_start() between cancel_delayed_work_sync()
returning and kfree_rcu() being called.
The following is a simple race scenario:
cpu0 cpu1
mep_delete_implementation()
cancel_delayed_work_sync(ccm_rx_dwork);
br_cfm_frame_rx()
// peer_mep still in hlist
if (peer_mep->ccm_defect)
ccm_rx_timer_start()
queue_delayed_work(ccm_rx_dwork)
hlist_del_rcu(&peer_mep->head);
kfree_rcu(peer_mep, rcu);
ccm_rx_work_expired()
// on freed peer_mep
To prevent this, cancel_delayed_work_sync() is replaced with
disable_delayed_work_sync() in both peer MEP deletion paths, so
that subsequent queue_delayed_work() calls from br_cfm_frame_rx()
are silently rejected.
The cc_peer_disable() helper retains cancel_delayed_work_sync()
because it is also used for the CC enable/disable toggle path where
the work must remain re-schedulable.
Fixes:
|
||
|
|
e5e8906305 |
net: bridge: fix nd_tbl NULL dereference when IPv6 is disabled
When booting with the 'ipv6.disable=1' parameter, the nd_tbl is never
initialized because inet6_init() exits before ndisc_init() is called
which initializes it. Then, if neigh_suppress is enabled and an ICMPv6
Neighbor Discovery packet reaches the bridge, br_do_suppress_nd() will
dereference ipv6_stub->nd_tbl which is NULL, passing it to
neigh_lookup(). This causes a kernel NULL pointer dereference.
BUG: kernel NULL pointer dereference, address: 0000000000000268
Oops: 0000 [#1] PREEMPT SMP NOPTI
[...]
RIP: 0010:neigh_lookup+0x16/0xe0
[...]
Call Trace:
<IRQ>
? neigh_lookup+0x16/0xe0
br_do_suppress_nd+0x160/0x290 [bridge]
br_handle_frame_finish+0x500/0x620 [bridge]
br_handle_frame+0x353/0x440 [bridge]
__netif_receive_skb_core.constprop.0+0x298/0x1110
__netif_receive_skb_one_core+0x3d/0xa0
process_backlog+0xa0/0x140
__napi_poll+0x2c/0x170
net_rx_action+0x2c4/0x3a0
handle_softirqs+0xd0/0x270
do_softirq+0x3f/0x60
Fix this by replacing IS_ENABLED(IPV6) call with ipv6_mod_enabled() in
the callers. This is in essence disabling NS/NA suppression when IPv6 is
disabled.
Fixes:
|
||
|
|
93c9475c04 |
bridge: Check relevant per-VLAN options in VLAN range grouping
The br_vlan_opts_eq_range() function determines if consecutive VLANs can
be grouped together in a range for compact netlink notifications. It
currently checks state, tunnel info, and multicast router configuration,
but misses two categories of per-VLAN options that affect the output:
1. User-visible priv_flags (neigh_suppress, mcast_enabled)
2. Port multicast context (mcast_max_groups, mcast_n_groups)
When VLANs have different settings for these options, they are incorrectly
grouped into ranges, causing netlink notifications to report only one
VLAN's settings for the entire range.
Fix by checking priv_flags equality, but only for flags that affect netlink
output (BR_VLFLAG_NEIGH_SUPPRESS_ENABLED and BR_VLFLAG_MCAST_ENABLED),
and comparing multicast context (mcast_max_groups and mcast_n_groups).
Example showing the bugs before the fix:
$ bridge vlan set vid 10 dev dummy1 neigh_suppress on
$ bridge vlan set vid 11 dev dummy1 neigh_suppress off
$ bridge -d vlan show dev dummy1
port vlan-id
dummy1 10-11
... neigh_suppress on
$ bridge vlan set vid 10 dev dummy1 mcast_max_groups 100
$ bridge vlan set vid 11 dev dummy1 mcast_max_groups 200
$ bridge -d vlan show dev dummy1
port vlan-id
dummy1 10-11
... mcast_max_groups 100
After the fix, VLANs 10 and 11 are shown as separate entries with their
correct individual settings.
Fixes:
|
||
|
|
189f164e57 |
Convert remaining multi-line kmalloc_obj/flex GFP_KERNEL uses
Conversion performed via this Coccinelle script:
// SPDX-License-Identifier: GPL-2.0-only
// Options: --include-headers-for-types --all-includes --include-headers --keep-comments
virtual patch
@gfp depends on patch && !(file in "tools") && !(file in "samples")@
identifier ALLOC = {kmalloc_obj,kmalloc_objs,kmalloc_flex,
kzalloc_obj,kzalloc_objs,kzalloc_flex,
kvmalloc_obj,kvmalloc_objs,kvmalloc_flex,
kvzalloc_obj,kvzalloc_objs,kvzalloc_flex};
@@
ALLOC(...
- , GFP_KERNEL
)
$ make coccicheck MODE=patch COCCI=gfp.cocci
Build and boot tested x86_64 with Fedora 42's GCC and Clang:
Linux version 6.19.0+ (user@host) (gcc (GCC) 15.2.1 20260123 (Red Hat 15.2.1-7), GNU ld version 2.44-12.fc42) #1 SMP PREEMPT_DYNAMIC 1970-01-01
Linux version 6.19.0+ (user@host) (clang version 20.1.8 (Fedora 20.1.8-4.fc42), LLD 20.1.8) #1 SMP PREEMPT_DYNAMIC 1970-01-01
Signed-off-by: Kees Cook <kees@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
||
|
|
bf4afc53b7 |
Convert 'alloc_obj' family to use the new default GFP_KERNEL argument
This was done entirely with mindless brute force, using
git grep -l '\<k[vmz]*alloc_objs*(.*, GFP_KERNEL)' |
xargs sed -i 's/\(alloc_objs*(.*\), GFP_KERNEL)/\1)/'
to convert the new alloc_obj() users that had a simple GFP_KERNEL
argument to just drop that argument.
Note that due to the extreme simplicity of the scripting, any slightly
more complex cases spread over multiple lines would not be triggered:
they definitely exist, but this covers the vast bulk of the cases, and
the resulting diff is also then easier to check automatically.
For the same reason the 'flex' versions will be done as a separate
conversion.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
||
|
|
69050f8d6d |
treewide: Replace kmalloc with kmalloc_obj for non-scalar types
This is the result of running the Coccinelle script from scripts/coccinelle/api/kmalloc_objs.cocci. The script is designed to avoid scalar types (which need careful case-by-case checking), and instead replace kmalloc-family calls that allocate struct or union object instances: Single allocations: kmalloc(sizeof(TYPE), ...) are replaced with: kmalloc_obj(TYPE, ...) Array allocations: kmalloc_array(COUNT, sizeof(TYPE), ...) are replaced with: kmalloc_objs(TYPE, COUNT, ...) Flex array allocations: kmalloc(struct_size(PTR, FAM, COUNT), ...) are replaced with: kmalloc_flex(*PTR, FAM, COUNT, ...) (where TYPE may also be *VAR) The resulting allocations no longer return "void *", instead returning "TYPE *". Signed-off-by: Kees Cook <kees@kernel.org> |
||
|
|
8bf22c33e7 |
Including fixes from Netfilter.
Current release - new code bugs:
- net: fix backlog_unlock_irq_restore() vs CONFIG_PREEMPT_RT
- eth: mlx5e: XSK, Fix unintended ICOSQ change
- phy_port: correctly recompute the port's linkmodes
- vsock: prevent child netns mode switch from local to global
- couple of kconfig fixes for new symbols
Previous releases - regressions:
- nfc: nci: fix false-positive parameter validation for packet data
- net: do not delay zero-copy skbs in skb_attempt_defer_free()
Previous releases - always broken:
- mctp: ensure our nlmsg responses to user space are zero-initialised
- ipv6: ioam: fix heap buffer overflow in __ioam6_fill_trace_data()
- fixes for ICMP rate limiting
Misc:
- intel: fix PCI device ID conflict between i40e and ipw2200
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEE6jPA+I1ugmIBA4hXMUZtbf5SIrsFAmmXUh8ACgkQMUZtbf5S
IrufYA//ZVj+4gvegqKwKZYXNBndVW00GGTYqaILbaenK1olUVUelVB91eV2Klc/
dXCeKG/MgEPuT89IjkPzVr2Yg4x6uhjcQL1rsahORn+GuQfSI/P8y7ysDOPnHVeM
Rtsg1m8z3EizJcHPeAJe7nEqFzfvZ2m+FCEGe++z8BYaUZUVApytgpIWOHO/aB+p
t13bCNzd05XxPphMl610T00Fncj2jCVDHILMgTB5rmFmkeJuQwNrRGXQSoQame46
+g+yCZjT0eVTrBaH1EUssWfrOT3VJj3BEee6gSp7k9mxMkbW18i8shBgmxS+EHjk
u19wwBzSrHK+JY1UExim+1E/rZisQVmEE1Gs0ALedxAu9zC/Julzfa2/+BFsc0j7
QTXd4jukG3aTPIX8v3TV2Igu0j+bAT4WdpzvnsXXBMVKy7wFYMd1+aSOLyFH2W9L
qRbg50oUATcsz77bZt6YUTJEgua4HXNYGtn15FMZOR7HJVR2L44Q5TK5mQxGp5iM
GabeKMzg6bsjE98STM3nbWks3pIb9ptIk++i0913eSqKgn84bDPtp3Gabfgle2SJ
8gjKS61K8rDt5x8StXVod7oGQ4asL8RJyOtE/avgbWUu9BNH8/oKqsE6TQrpXauv
1ndiyim/mPe4fBCxkVAi2+uq5/ph9z8XyleESz9VYwyL3Rl4nsg=
=qSCj
-----END PGP SIGNATURE-----
Merge tag 'net-7.0-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Jakub Kicinski:
"Including fixes from Netfilter.
Current release - new code bugs:
- net: fix backlog_unlock_irq_restore() vs CONFIG_PREEMPT_RT
- eth: mlx5e: XSK, Fix unintended ICOSQ change
- phy_port: correctly recompute the port's linkmodes
- vsock: prevent child netns mode switch from local to global
- couple of kconfig fixes for new symbols
Previous releases - regressions:
- nfc: nci: fix false-positive parameter validation for packet data
- net: do not delay zero-copy skbs in skb_attempt_defer_free()
Previous releases - always broken:
- mctp: ensure our nlmsg responses to user space are zero-initialised
- ipv6: ioam: fix heap buffer overflow in __ioam6_fill_trace_data()
- fixes for ICMP rate limiting
Misc:
- intel: fix PCI device ID conflict between i40e and ipw2200"
* tag 'net-7.0-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (85 commits)
net: nfc: nci: Fix parameter validation for packet data
net/mlx5e: Use unsigned for mlx5e_get_max_num_channels
net/mlx5e: Fix deadlocks between devlink and netdev instance locks
net/mlx5e: MACsec, add ASO poll loop in macsec_aso_set_arm_event
net/mlx5: Fix misidentification of write combining CQE during poll loop
net/mlx5e: Fix misidentification of ASO CQE during poll loop
net/mlx5: Fix multiport device check over light SFs
bonding: alb: fix UAF in rlb_arp_recv during bond up/down
bnge: fix reserving resources from FW
eth: fbnic: Advertise supported XDP features.
rds: tcp: fix uninit-value in __inet_bind
net/rds: Fix NULL pointer dereference in rds_tcp_accept_one
octeontx2-af: Fix default entries mcam entry action
net/mlx5e: XSK, Fix unintended ICOSQ change
ipv6: icmp: icmpv6_xrlim_allow() optimization if net.ipv6.icmp.ratelimit is zero
ipv4: icmp: icmpv4_xrlim_allow() optimization if net.ipv4.icmp_ratelimit is zero
ipv6: icmp: remove obsolete code in icmpv6_xrlim_allow()
inet: move icmp_global_{credit,stamp} to a separate cache line
icmp: prevent possible overflow in icmp_global_allow()
selftests/net: packetdrill: add ipv4-mapped-ipv6 tests
...
|
||
|
|
8b769e311a |
net: bridge: mcast: always update mdb_n_entries for vlan contexts
syzbot triggered a warning[1] about the number of mdb entries in a context.
It turned out that there are multiple ways to trigger that warning today
(some got added during the years), the root cause of the problem is that
the increase is done conditionally, and over the years these different
conditions increased so there were new ways to trigger the warning, that is
to do a decrease which wasn't paired with a previous increase.
For example one way to trigger it is with flush:
$ ip l add br0 up type bridge vlan_filtering 1 mcast_snooping 1
$ ip l add dumdum up master br0 type dummy
$ bridge mdb add dev br0 port dumdum grp 239.0.0.1 permanent vid 1
$ ip link set dev br0 down
$ ip link set dev br0 type bridge mcast_vlan_snooping 1
^^^^ this will enable snooping, but will not update mdb_n_entries
because in __br_multicast_enable_port_ctx() we check !netif_running
$ bridge mdb flush dev br0
^^^ this will trigger the warning because it will delete the pg which
we added above, which will try to decrease mdb_n_entries
Fix the problem by removing the conditional increase and always keep the
count up-to-date while the vlan exists. In order to do that we have to
first initialize it on port-vlan context creation, and then always increase
or decrease the value regardless of mcast options. To keep the current
behaviour we have to enforce the mdb limit only if the context is port's or
if the port-vlan's mcast snooping is enabled.
[1]
------------[ cut here ]------------
n == 0
WARNING: net/bridge/br_multicast.c:718 at br_multicast_port_ngroups_dec_one net/bridge/br_multicast.c:718 [inline], CPU#0: syz.4.4607/22043
WARNING: net/bridge/br_multicast.c:718 at br_multicast_port_ngroups_dec net/bridge/br_multicast.c:771 [inline], CPU#0: syz.4.4607/22043
WARNING: net/bridge/br_multicast.c:718 at br_multicast_del_pg+0x1bbe/0x1e20 net/bridge/br_multicast.c:825, CPU#0: syz.4.4607/22043
Modules linked in:
CPU: 0 UID: 0 PID: 22043 Comm: syz.4.4607 Not tainted syzkaller #0 PREEMPT(full)
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/24/2026
RIP: 0010:br_multicast_port_ngroups_dec_one net/bridge/br_multicast.c:718 [inline]
RIP: 0010:br_multicast_port_ngroups_dec net/bridge/br_multicast.c:771 [inline]
RIP: 0010:br_multicast_del_pg+0x1bbe/0x1e20 net/bridge/br_multicast.c:825
Code: 41 5f 5d e9 04 7a 48 f7 e8 3f 73 5c f7 90 0f 0b 90 e9 cf fd ff ff e8 31 73 5c f7 90 0f 0b 90 e9 16 fd ff ff e8 23 73 5c f7 90 <0f> 0b 90 e9 60 fd ff ff e8 15 73 5c f7 eb 05 e8 0e 73 5c f7 48 8b
RSP: 0018:ffffc9000c207220 EFLAGS: 00010293
RAX: ffffffff8a68042d RBX: ffff88807c6f1800 RCX: ffff888066e90000
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
RBP: 0000000000000000 R08: ffff888066e90000 R09: 000000000000000c
R10: 000000000000000c R11: 0000000000000000 R12: ffff8880303ef800
R13: dffffc0000000000 R14: ffff888050eb11c4 R15: 1ffff1100a1d6238
FS: 00007fa45921b6c0(0000) GS:ffff8881256f5000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007fa4591f9ff8 CR3: 0000000081df2000 CR4: 00000000003526f0
Call Trace:
<TASK>
br_mdb_flush_pgs net/bridge/br_mdb.c:1525 [inline]
br_mdb_flush net/bridge/br_mdb.c:1544 [inline]
br_mdb_del_bulk+0x5e2/0xb20 net/bridge/br_mdb.c:1561
rtnl_mdb_del+0x48a/0x640 net/core/rtnetlink.c:-1
rtnetlink_rcv_msg+0x77e/0xbe0 net/core/rtnetlink.c:6967
netlink_rcv_skb+0x232/0x4b0 net/netlink/af_netlink.c:2550
netlink_unicast_kernel net/netlink/af_netlink.c:1318 [inline]
netlink_unicast+0x80f/0x9b0 net/netlink/af_netlink.c:1344
netlink_sendmsg+0x813/0xb40 net/netlink/af_netlink.c:1894
sock_sendmsg_nosec net/socket.c:727 [inline]
__sock_sendmsg net/socket.c:742 [inline]
____sys_sendmsg+0xa68/0xad0 net/socket.c:2592
___sys_sendmsg+0x2a5/0x360 net/socket.c:2646
__sys_sendmsg net/socket.c:2678 [inline]
__do_sys_sendmsg net/socket.c:2683 [inline]
__se_sys_sendmsg net/socket.c:2681 [inline]
__x64_sys_sendmsg+0x1bd/0x2a0 net/socket.c:2681
do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
do_syscall_64+0xe2/0xf80 arch/x86/entry/syscall_64.c:94
entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7fa45839aeb9
Code: ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 e8 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007fa45921b028 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
RAX: ffffffffffffffda RBX: 00007fa458615fa0 RCX: 00007fa45839aeb9
RDX: 0000000000000000 RSI: 00002000000000c0 RDI: 0000000000000004
RBP: 00007fa458408c1f R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
R13: 00007fa458616038 R14: 00007fa458615fa0 R15: 00007fff0b59fae8
</TASK>
Fixes:
|
||
|
|
136114e0ab |
mm.git review status for linus..mm-nonmm-stable
Total patches: 107 Reviews/patch: 1.07 Reviewed rate: 67% - The 2 patch series "ocfs2: give ocfs2 the ability to reclaim suballocator free bg" from Heming Zhao saves disk space by teaching ocfs2 to reclaim suballocator block group space. - The 4 patch series "Add ARRAY_END(), and use it to fix off-by-one bugs" from Alejandro Colomar adds the ARRAY_END() macro and uses it in various places. - The 2 patch series "vmcoreinfo: support VMCOREINFO_BYTES larger than PAGE_SIZE" from Pnina Feder makes the vmcore code future-safe, if VMCOREINFO_BYTES ever exceeds the page size. - The 7 patch series "kallsyms: Prevent invalid access when showing module buildid" from Petr Mladek cleans up kallsyms code related to module buildid and fixes an invalid access crash when printing backtraces. - The 3 patch series "Address page fault in ima_restore_measurement_list()" from Harshit Mogalapalli fixes a kexec-related crash that can occur when booting the second-stage kernel on x86. - The 6 patch series "kho: ABI headers and Documentation updates" from Mike Rapoport updates the kexec handover ABI documentation. - The 4 patch series "Align atomic storage" from Finn Thain adds the __aligned attribute to atomic_t and atomic64_t definitions to get natural alignment of both types on csky, m68k, microblaze, nios2, openrisc and sh. - The 2 patch series "kho: clean up page initialization logic" from Pratyush Yadav simplifies the page initialization logic in kho_restore_page(). - The 6 patch series "Unload linux/kernel.h" from Yury Norov moves several things out of kernel.h and into more appropriate places. - The 7 patch series "don't abuse task_struct.group_leader" from Oleg Nesterov removes the usage of ->group_leader when it is "obviously unnecessary". - The 5 patch series "list private v2 & luo flb" from Pasha Tatashin adds some infrastructure improvements to the live update orchestrator. -----BEGIN PGP SIGNATURE----- iHUEABYKAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCaY4giAAKCRDdBJ7gKXxA jgusAQDnKkP8UWTqXPC1jI+OrDJGU5ciAx8lzLeBVqMKzoYk9AD/TlhT2Nlx+Ef6 0HCUHUD0FMvAw/7/Dfc6ZKxwBEIxyww= =mmsH -----END PGP SIGNATURE----- Merge tag 'mm-nonmm-stable-2026-02-12-10-48' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull non-MM updates from Andrew Morton: - "ocfs2: give ocfs2 the ability to reclaim suballocator free bg" saves disk space by teaching ocfs2 to reclaim suballocator block group space (Heming Zhao) - "Add ARRAY_END(), and use it to fix off-by-one bugs" adds the ARRAY_END() macro and uses it in various places (Alejandro Colomar) - "vmcoreinfo: support VMCOREINFO_BYTES larger than PAGE_SIZE" makes the vmcore code future-safe, if VMCOREINFO_BYTES ever exceeds the page size (Pnina Feder) - "kallsyms: Prevent invalid access when showing module buildid" cleans up kallsyms code related to module buildid and fixes an invalid access crash when printing backtraces (Petr Mladek) - "Address page fault in ima_restore_measurement_list()" fixes a kexec-related crash that can occur when booting the second-stage kernel on x86 (Harshit Mogalapalli) - "kho: ABI headers and Documentation updates" updates the kexec handover ABI documentation (Mike Rapoport) - "Align atomic storage" adds the __aligned attribute to atomic_t and atomic64_t definitions to get natural alignment of both types on csky, m68k, microblaze, nios2, openrisc and sh (Finn Thain) - "kho: clean up page initialization logic" simplifies the page initialization logic in kho_restore_page() (Pratyush Yadav) - "Unload linux/kernel.h" moves several things out of kernel.h and into more appropriate places (Yury Norov) - "don't abuse task_struct.group_leader" removes the usage of ->group_leader when it is "obviously unnecessary" (Oleg Nesterov) - "list private v2 & luo flb" adds some infrastructure improvements to the live update orchestrator (Pasha Tatashin) * tag 'mm-nonmm-stable-2026-02-12-10-48' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (107 commits) watchdog/hardlockup: simplify perf event probe and remove per-cpu dependency procfs: fix missing RCU protection when reading real_parent in do_task_stat() watchdog/softlockup: fix sample ring index wrap in need_counting_irqs() kcsan, compiler_types: avoid duplicate type issues in BPF Type Format kho: fix doc for kho_restore_pages() tests/liveupdate: add in-kernel liveupdate test liveupdate: luo_flb: introduce File-Lifecycle-Bound global state liveupdate: luo_file: Use private list list: add kunit test for private list primitives list: add primitives for private list manipulations delayacct: fix uapi timespec64 definition panic: add panic_force_cpu= parameter to redirect panic to a specific CPU netclassid: use thread_group_leader(p) in update_classid_task() RDMA/umem: don't abuse current->group_leader drm/pan*: don't abuse current->group_leader drm/amd: kill the outdated "Only the pthreads threading model is supported" checks drm/amdgpu: don't abuse current->group_leader android/binder: use same_thread_group(proc->tsk, current) in binder_mmap() android/binder: don't abuse current->group_leader kho: skip memoryless NUMA nodes when reserving scratch areas ... |
||
|
|
b2936b4fd5 |
net/ipv6: Introduce payload_len helpers
The next commits will transition away from using the hop-by-hop extension header to encode packet length for BIG TCP. Add wrappers around ip6->payload_len that return the actual value if it's non-zero, and calculate it from skb->len if payload_len is set to zero (and a symmetrical setter). The new helpers are used wherever the surrounding code supports the hop-by-hop jumbo header for BIG TCP IPv6, or the corresponding IPv4 code uses skb_ip_totlen (e.g., in include/net/netfilter/nf_tables_ipv6.h). No behavioral change in this commit. Signed-off-by: Alice Mikityanska <alice@isovalent.com> Acked-by: Paolo Abeni <pabeni@redhat.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20260205133925.526371-2-alice.kernel@fastmail.im Signed-off-by: Jakub Kicinski <kuba@kernel.org> |
||
|
|
6dfa3df797 |
net: bridge: use sysfs_emit instead of sprintf
Replace sprintf with sysfs_emit in sysfs show() methods as outlined in Documentation/filesystems/sysfs.rst. sysfs_emit is preferred to sprintf in sysfs show() methods as it is safer with buffer handling. Signed-off-by: David Corvaglia <david@corvaglia.dev> Acked-by: Nikolay Aleksandrov <razor@blackwall.org> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Link: https://patch.msgid.link/0100019c1fc2bcc3-bc9ca2f1-22d7-4250-8441-91e4af57117b-000000@email.amazonses.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> |
||
|
|
a010fe8d86 |
Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Cross-merge networking fixes after downstream PR (net-6.19-rc8). No adjacent changes, conflicts: drivers/net/ethernet/spacemit/k1_emac.c |
||
|
|
cc0cf10fda |
net: bridge: fix static key check
Fix the check if netfilter's static keys are available. netfilter defines
and exports static keys if CONFIG_JUMP_LABEL is enabled. (HAVE_JUMP_LABEL
is never defined.)
Fixes:
|
||
|
|
dd7fca9f49 |
net: bridge: mcast: fix memcpy with u64_stats
On 64bit arches, struct u64_stats_sync is empty and provides no help against load/store tearing. memcpy() should not be considered atomic against u64 values. Use u64_stats_copy() instead. Signed-off-by: David Yang <mmyangfl@gmail.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Link: https://patch.msgid.link/20260120092137.2161162-3-mmyangfl@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> |
||
|
|
24c776355f |
kernel.h: drop hex.h and update all hex.h users
Remove <linux/hex.h> from <linux/kernel.h> and update all users/callers of hex.h interfaces to directly #include <linux/hex.h> as part of the process of putting kernel.h on a diet. Removing hex.h from kernel.h means that 36K C source files don't have to pay the price of parsing hex.h for the roughly 120 C source files that need it. This change has been build-tested with allmodconfig on most ARCHes. Also, all users/callers of <linux/hex.h> in the entire source tree have been updated if needed (if not already #included). Link: https://lkml.kernel.org/r/20251215005206.2362276-1-rdunlap@infradead.org Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Reviewed-by: Andy Shevchenko <andriy.shevchenko@intel.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Yury Norov (NVIDIA) <yury.norov@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> |
||
|
|
910d271227 |
netfilter: don't include xt and nftables.h in unrelated subsystems
conntrack, xtables and nftables are distinct subsystems, don't use them in other subystems. Signed-off-by: Florian Westphal <fw@strlen.de> |
||
|
|
b25a0b4a21 |
net: bridge: annotate data-races around fdb->{updated,used}
fdb->updated and fdb->used are read and written locklessly.
Add READ_ONCE()/WRITE_ONCE() annotations.
Fixes:
|
||
|
|
d6f6c6d909 |
netfilter pull request nf-26-01-02
-----BEGIN PGP SIGNATURE----- iQJdBAABCABHFiEEgKkgxbID4Gn1hq6fcJGo2a1f9gAFAmlXqEwbFIAAAAAABAAO bWFudTIsMi41KzEuMTEsMiwyDRxmd0BzdHJsZW4uZGUACgkQcJGo2a1f9gBFeg// awoFVzkRjgk5RgGnZstWU5QIDRwdl6G+Q0OkuQSh3KNQak50CxbrY6X6zSdVSJjG y///iM1zONM46k0TIvYovHklYp4adTTEPEXISuUwzARfbD6X40qgsVCkBNrmr5fO l1cu2RXAcYzNOm9DrC+744z8KVeduoL6LFon0Xf4ah/eqxM7o92Tcj8dtPV1TsA8 8C+wVxZIw11OaM7H1fKhUMhQ6CnVc5OZOveO/lJxonaTIuCVULxPezEZQjpXNcnc hp5uMrip2BlecyeNFiPqDhqnVeU34xN3Zxns6Nq9zOcz5yfQg/LB/XGTx39HVljn 4v7ziGS+7qUQ9zc9LNID1jgJY6RNlj2+SkbavTfKpPQagKOR+BEIw7b4KBAsOL9l b8uDXLlUaWKTjb/DIVSWks5viwg1tbtsdyoeplcHWfP7miATp59es2FvD+DxT8HV stXhTWf0CDCLxiUHW9E8+QoQcnktjw9khoCZY/YZwfbIYf60uvlLEzO2G1Dt9SO6 cYfdlHLPpFcmQKhEOAFuOSRURqpyMHpwRgbul1yvR/ItW0hNA8j4oepo6g6crU+l WQ+l18ZtV0Aa/CxZ4eGcUBvonjf/n+XR8Hshukh9yRblF5BFDuSFIKWY7pMbkRO2 Es+C4Q6WMufRmovYhWqUaR4pHJFFSVe7JxBZ85VBVjI= =g2yW -----END PGP SIGNATURE----- Merge tag 'nf-26-01-02' of https://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf Florian Westphal says: ==================== netfilter: updates for net The following patchset contains Netfilter fixes for *net*: 1) Fix overlap detection for nf_tables with concatenated ranges. There are cases where element could not be added due to a conflict with existing range, while kernel reports success to userspace. 2) update selftest to cover this bug. 3) synproxy update path should use READ/WRITE once as we replace config struct while packet path might read it in parallel. This relies on said config struct to fit sizeof(long). From Fernando Fernandez Mancera. 4) Don't return -EEXIST from xtables in module load path, a pending patch to module infra will spot a warning if this happens. From Daniel Gomez. 5) Fix a memory leak in nf_tables when chain hits 2**32 users and rule is to be hw-offloaded, from Zilin Guan. 6) Avoid infinite list growth when insert rate is high in nf_conncount, also from Fernando. * tag 'nf-26-01-02' of https://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf: netfilter: nf_conncount: update last_gc only when GC has been performed netfilter: nf_tables: fix memory leak in nf_tables_newrule() netfilter: replace -EEXIST with -EBUSY netfilter: nft_synproxy: avoid possible data-race on update operation selftests: netfilter: nft_concat_range.sh: add check for overlap detection bug netfilter: nft_set_pipapo: fix range overlap detection ==================== Link: https://patch.msgid.link/20260102114128.7007-1-fw@strlen.de Signed-off-by: Jakub Kicinski <kuba@kernel.org> |
||
|
|
3128df6be1 |
bridge: fix C-VLAN preservation in 802.1ad vlan_tunnel egress
When using an 802.1ad bridge with vlan_tunnel, the C-VLAN tag is
incorrectly stripped from frames during egress processing.
br_handle_egress_vlan_tunnel() uses skb_vlan_pop() to remove the S-VLAN
from hwaccel before VXLAN encapsulation. However, skb_vlan_pop() also
moves any "next" VLAN from the payload into hwaccel:
/* move next vlan tag to hw accel tag */
__skb_vlan_pop(skb, &vlan_tci);
__vlan_hwaccel_put_tag(skb, vlan_proto, vlan_tci);
For QinQ frames where the C-VLAN sits in the payload, this moves it to
hwaccel where it gets lost during VXLAN encapsulation.
Fix by calling __vlan_hwaccel_clear_tag() directly, which clears only
the hwaccel S-VLAN and leaves the payload untouched.
This path is only taken when vlan_tunnel is enabled and tunnel_info
is configured, so 802.1Q bridges are unaffected.
Tested with 802.1ad bridge + VXLAN vlan_tunnel, verified C-VLAN
preserved in VXLAN payload via tcpdump.
Fixes:
|
||
|
|
2bafeb8d2f |
netfilter: replace -EEXIST with -EBUSY
The -EEXIST error code is reserved by the module loading infrastructure to indicate that a module is already loaded. When a module's init function returns -EEXIST, userspace tools like kmod interpret this as "module already loaded" and treat the operation as successful, returning 0 to the user even though the module initialization actually failed. Replace -EEXIST with -EBUSY to ensure correct error reporting in the module initialization path. Affected modules: * ebtable_broute ebtable_filter ebtable_nat arptable_filter * ip6table_filter ip6table_mangle ip6table_nat ip6table_raw * ip6table_security iptable_filter iptable_mangle iptable_nat * iptable_raw iptable_security Signed-off-by: Daniel Gomez <da.gomez@samsung.com> Signed-off-by: Florian Westphal <fw@strlen.de> |
||
|
|
f79f9b7ace |
net: bridge: Describe @tunnel_hash member in net_bridge_vlan_group struct
Sphinx reports kernel-doc warning:
WARNING: ./net/bridge/br_private.h:267 struct member 'tunnel_hash' not described in 'net_bridge_vlan_group'
Fix it by describing @tunnel_hash member.
Fixes:
|
||
|
|
1ec9871fbb |
Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Cross-merge networking fixes after downstream PR (net-6.18-rc5). Conflicts: drivers/net/wireless/ath/ath12k/mac.c |
||
|
|
ee87c63f9b |
net: bridge: fix MST static key usage
As Ido pointed out, the static key usage in MST is buggy and should use
inc/dec instead of enable/disable because we can have multiple bridges
with MST enabled which means a single bridge can disable MST for all.
Use static_branch_inc/dec to avoid that. When destroying a bridge decrement
the key if MST was enabled.
Fixes:
|
||
|
|
8dca36978a |
net: bridge: fix use-after-free due to MST port state bypass
syzbot reported[1] a use-after-free when deleting an expired fdb. It is
due to a race condition between learning still happening and a port being
deleted, after all its fdbs have been flushed. The port's state has been
toggled to disabled so no learning should happen at that time, but if we
have MST enabled, it will bypass the port's state, that together with VLAN
filtering disabled can lead to fdb learning at a time when it shouldn't
happen while the port is being deleted. VLAN filtering must be disabled
because we flush the port VLANs when it's being deleted which will stop
learning. This fix adds a check for the port's vlan group which is
initialized to NULL when the port is getting deleted, that avoids the port
state bypass. When MST is enabled there would be a minimal new overhead
in the fast-path because the port's vlan group pointer is cache-hot.
[1] https://syzkaller.appspot.com/bug?extid=dd280197f0f7ab3917be
Fixes:
|
||
|
|
68800bbf58 |
net: bridge: Flush multicast groups when snooping is disabled
When forwarding multicast packets, the bridge takes MDB into account when IGMP / MLD snooping is enabled. Currently, when snooping is disabled, the MDB is retained, even though it is not used anymore. At the same time, during the time that snooping is disabled, the IGMP / MLD control packets are obviously ignored, and after the snooping is reenabled, the administrator has to assume it is out of sync. In particular, missed join and leave messages would lead to traffic being forwarded to wrong interfaces. Keeping the MDB entries around thus serves no purpose, and just takes memory. Note also that disabling per-VLAN snooping does actually flush the relevant MDB entries. This patch flushes non-permanent MDB entries as global snooping is disabled. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Acked-by: Nikolay Aleksandrov <razor@blackwall.org> Link: https://patch.msgid.link/5e992df1bb93b88e19c0ea5819e23b669e3dde5d.1761228273.git.petrm@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> |
||
|
|
0152747a52 |
net: bridge: use common function to compute the features
Previously, bridge ignored all features propagation and DST retention, only handling explicitly the GSO limits. By switching to the new helper netdev_compute_master_upper_features(), the bridge now expose additional features, depending on the lowers capabilities. Since br_set_gso_limits() is already covered by the helper, it can be removed safely. Bridge has it's own way to update needed_headroom. So we don't need to update it in the helper. Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Reviewed-by: Sabrina Dubroca <sd@queasysnail.net> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Link: https://patch.msgid.link/20251017034155.61990-5-liuhangbin@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> |
||
|
|
0513a3f97b |
net: bridge: correct debug message function name in br_fill_ifinfo
The debug message in br_fill_ifinfo() incorrectly refers to br_fill_info instead of the actual function name. Update it for clarity in debugging output. Signed-off-by: Alok Tiwari <alok.a.tiwari@oracle.com> Reviewed-by: Simon Horman <horms@kernel.org> Acked-by: Nikolay Aleksandrov <razor@blackwall.org> Link: https://patch.msgid.link/20251013100121.755899-1-alok.a.tiwari@oracle.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> |
||
|
|
bbf0c98b3a |
bridge: br_vlan_fill_forward_path_pvid: use br_vlan_group_rcu()
net/bridge/br_private.h:1627 suspicious rcu_dereference_protected() usage!
other info that might help us debug this:
rcu_scheduler_active = 2, debug_locks = 1
7 locks held by socat/410:
#0: ffff88800d7a9c90 (sk_lock-AF_INET){+.+.}-{0:0}, at: inet_stream_connect+0x43/0xa0
#1: ffffffff9a779900 (rcu_read_lock){....}-{1:3}, at: __ip_queue_xmit+0x62/0x1830
[..]
#6: ffffffff9a779900 (rcu_read_lock){....}-{1:3}, at: nf_hook.constprop.0+0x8a/0x440
Call Trace:
lockdep_rcu_suspicious.cold+0x4f/0xb1
br_vlan_fill_forward_path_pvid+0x32c/0x410 [bridge]
br_fill_forward_path+0x7a/0x4d0 [bridge]
Use to correct helper, non _rcu variant requires RTNL mutex.
Fixes:
|
||
|
|
cd9a9562b2 |
net: bridge: Install FDB for bridge MAC on VLAN 0
Currently, after the bridge is created, the FDB does not hold an FDB entry for the bridge MAC on VLAN 0: # ip link add name br up type bridge # ip -br link show dev br br UNKNOWN 92:19:8c:4e:01:ed <BROADCAST,MULTICAST,UP,LOWER_UP> # bridge fdb show | grep 92:19:8c:4e:01:ed 92:19:8c:4e:01:ed dev br vlan 1 master br permanent Later when the bridge MAC is changed, or in fact when the address is given during netdevice creation, the entry appears: # ip link add name br up address 00:11:22:33:44:55 type bridge # bridge fdb show | grep 00:11:22:33:44:55 00:11:22:33:44:55 dev br vlan 1 master br permanent 00:11:22:33:44:55 dev br master br permanent However when the bridge address is set by the user to the current bridge address before the first port is enslaved, none of the address handlers gets invoked, because the address is not actually changed. The address is however marked as NET_ADDR_SET. Then when a port is enslaved, the address is not changed, because it is NET_ADDR_SET. Thus the VLAN 0 entry is not added, and it has not been added previously either: # ip link add name br up type bridge # ip -br link show dev br br UNKNOWN 7e:f0:a8:1a:be:c2 <BROADCAST,MULTICAST,UP,LOWER_UP> # ip link set dev br addr 7e:f0:a8:1a:be:c2 # ip link add name v up type veth # ip link set dev v master br # ip -br link show dev br br UNKNOWN 7e:f0:a8:1a:be:c2 <BROADCAST,MULTICAST,UP,LOWER_UP> # bridge fdb | grep 7e:f0:a8:1a:be:c2 7e:f0:a8:1a:be:c2 dev br vlan 1 master br permanent Then when the bridge MAC is used as DMAC, and br_handle_frame_finish() looks up an FDB entry with VLAN=0, it doesn't find any, and floods the traffic instead of passing it up. Fix this by simply adding the VLAN 0 FDB entry for the bridge itself always on netdevice creation. This also makes the behavior consistent with how ports are treated: ports always have an FDB entry for each member VLAN as well as VLAN 0. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Acked-by: Nikolay Aleksandrov <razor@blackwall.org> Link: https://patch.msgid.link/415202b2d1b9b0899479a502bbe2ba188678f192.1758550408.git.petrm@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> |
||
|
|
5fd8bb982e |
net: replace use of system_wq with system_percpu_wq
Currently if a user enqueue a work item using schedule_delayed_work() the used wq is "system_wq" (per-cpu wq) while queue_delayed_work() use WORK_CPU_UNBOUND (used when a cpu is not specified). The same applies to schedule_work() that is using system_wq and queue_work(), that makes use again of WORK_CPU_UNBOUND. This lack of consistentcy cannot be addressed without refactoring the API. system_unbound_wq should be the default workqueue so as not to enforce locality constraints for random work whenever it's not required. Adding system_dfl_wq to encourage its use when unbound work should be used. The old system_unbound_wq will be kept for a few release cycles. Suggested-by: Tejun Heo <tj@kernel.org> Signed-off-by: Marco Crivellari <marco.crivellari@suse.com> Link: https://patch.msgid.link/20250918142427.309519-3-marco.crivellari@suse.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> |
||
|
|
bd569dd935 |
netfilter pull request nf-next-25-09-11
-----BEGIN PGP SIGNATURE----- iQJBBAABCAArFiEEgKkgxbID4Gn1hq6fcJGo2a1f9gAFAmjC0WsNHGZ3QHN0cmxl bi5kZQAKCRBwkajZrV/2AOC2D/982GOJGI79ArhsM9FzEiJDp2QLxMnz7cw0t1Vg pU469JYA/He6VHMbhiqCBopSAskl36kMMGLbflPe+Zc1DNjconFD8wrDRBlUI+EI lR1dERMUyafuRMuuvKaE7A0ePyR7AD1JFf1znpo6BrRDMX8mbumcIA/L7/T1Rz2F 0b9Phbj77EdKruRgD4eXCuDJibOr5j6FymNvDGR/bm3SovOsOM4gi2dQ3Y3B85ZK aUDDrss5acypBfu36w+340eMtr1GJKoaCKoK9ScnO58peCSo9C0b+o6xwUBPnHdw sQOhl6fErOfN+agGir357jLsBUb1H7Tg9decwv9qHp3jESei3kgAyoBF4SqH+48S Bz1E2M0A0Hxve3/I/Pp1Hz6X5r+V+gDtx71z4+acoZg5YjDMAKFJAfgiK6sFKqZP 73QsvnU2omu2C5r8TAbryv01C5xmkbe3YyU8j4taZOBGk4YkPd5WjJELbTcVM4If dxDE6vQ9+GrrlBl9aidrMNxIhJuS/qjVIb0tv7oZPVql995AgPJG3F3gSO78OPBj V4dN9p/wPN9lRl5tj/odWiOfsdQiY7fEkqdXlO79sACQCzXC5O6zAwcnK37s80mQ QFSt3oiPFcacJJyHHYeYISy13HXlxwUgdxWC6T/sXMU5Jpic+3bWFxC2YgRcIsqZ ZE1JbA== =FXmW -----END PGP SIGNATURE----- Merge tag 'nf-next-25-09-11' of https://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf-next Florian Westphal says: ==================== netfilter: updates for net-next 1) Don't respond to ICMP_UNREACH errors with another ICMP_UNREACH error. 2) Support fetching the current bridge ethernet address. This allows a more flexible approach to packet redirection on bridges without need to use hardcoded addresses. From Fernando Fernandez Mancera. 3) Zap a few no-longer needed conditionals from ipvs packet path and convert to READ/WRITE_ONCE to avoid KCSAN warnings. From Zhang Tengfei. 4) Remove a no-longer-used macro argument in ipset, from Zhen Ni. * tag 'nf-next-25-09-11' of https://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf-next: netfilter: nf_reject: don't reply to icmp error messages ipvs: Use READ_ONCE/WRITE_ONCE for ipvs->enable netfilter: nft_meta_bridge: introduce NFT_META_BRI_IIFHWADDR support netfilter: ipset: Remove unused htable_bits in macro ahash_region selftest:net: fixed spelling mistakes ==================== Link: https://patch.msgid.link/20250911143819.14753-1-fw@strlen.de Signed-off-by: Jakub Kicinski <kuba@kernel.org> |
||
|
|
21446c06b4 |
net: bridge: Introduce UAPI for BR_BOOLOPT_FDB_LOCAL_VLAN_0
The previous patches introduced a new option, BR_BOOLOPT_FDB_LOCAL_VLAN_0. When enabled, it has local FDB entries installed only on VLAN 0, instead of duplicating them across all VLANs. In this patch, add the corresponding UAPI toggle, and the code for turning the feature on and off. Reviewed-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: Petr Machata <petrm@nvidia.com> Acked-by: Nikolay Aleksandrov <razor@blackwall.org> Link: https://patch.msgid.link/ea99bfb10f687fa58091e6e1c2f8acc33f47ca45.1757004393.git.petrm@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> |
||
|
|
a29aba64e0 |
net: bridge: BROPT_FDB_LOCAL_VLAN_0: Skip local FDBs on VLAN creation
When BROPT_FDB_LOCAL_VLAN_0 is enabled, the local FDB entries for the member ports as well as the bridge itself should not be created per-VLAN, but instead only on VLAN 0. Thus when a VLAN is added for a port or the bridge itself, a local FDB entry with the corresponding address should not be added when in the VLAN-0 mode. Reviewed-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: Petr Machata <petrm@nvidia.com> Acked-by: Nikolay Aleksandrov <razor@blackwall.org> Link: https://patch.msgid.link/bb13ba01d58ed6d5d700e012c519d38ee6806d22.1757004393.git.petrm@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> |
||
|
|
40df3b8e90 |
net: bridge: BROPT_FDB_LOCAL_VLAN_0: On bridge changeaddr, skip per-VLAN FDBs
When BROPT_FDB_LOCAL_VLAN_0 is enabled, the local FDB entries for the bridge itself should not be created per-VLAN, but instead only on VLAN 0. When the bridge address changes, the local FDB entries need to be updated, which is done in br_fdb_change_mac_address(). Bail out early when in VLAN-0 mode, so that the per-VLAN FDB entries are not created. The per-VLAN walk is only done afterwards. Reviewed-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: Petr Machata <petrm@nvidia.com> Acked-by: Nikolay Aleksandrov <razor@blackwall.org> Link: https://patch.msgid.link/0bd432cf91921ef7c4ed0e129de1d1cd358c716b.1757004393.git.petrm@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> |
||
|
|
4cf5fd8497 |
net: bridge: BROPT_FDB_LOCAL_VLAN_0: On port changeaddr, skip per-VLAN FDBs
When BROPT_FDB_LOCAL_VLAN_0 is enabled, the local FDB entries for member ports should not be created per-VLAN, but instead only on VLAN 0. When the member port address changes, the local FDB entries need to be updated, which is done in br_fdb_changeaddr(). Under the VLAN-0 mode, only one local FDB entry will ever be added for a port's address, and that on VLAN 0. Thus bail out of the delete loop early. For the same reason, also skip adding the per-VLAN entries. Reviewed-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: Petr Machata <petrm@nvidia.com> Acked-by: Nikolay Aleksandrov <razor@blackwall.org> Link: https://patch.msgid.link/0cf9d41836d2a245b0ce07e1a16ee05ca506cbe9.1757004393.git.petrm@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> |
||
|
|
60d6be0931 |
net: bridge: BROPT_FDB_LOCAL_VLAN_0: Look up FDB on VLAN 0 on miss
When BROPT_FDB_LOCAL_VLAN_0 is enabled, the local FDB entries for the member ports as well as the bridge itself should not be created per-VLAN, but instead only on VLAN 0. That means that br_handle_frame_finish() needs to make two lookups: the primary lookup on an appropriate VLAN, and when that misses, a lookup on VLAN 0. Have the second lookup only accept local MAC addresses. Turning this into a generic second-lookup feature is not the goal. Reviewed-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: Petr Machata <petrm@nvidia.com> Acked-by: Nikolay Aleksandrov <razor@blackwall.org> Link: https://patch.msgid.link/8087475009dce360fb68d873b1ed9c80827da302.1757004393.git.petrm@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> |
||
|
|
c1164178e9 |
net: bridge: Introduce BROPT_FDB_LOCAL_VLAN_0
The following patches will gradually introduce the ability of the bridge to look up local FDB entries on VLAN 0 instead of using the VLAN indicated by a packet. In this patch, just introduce the option itself, with which the feature will be linked. Reviewed-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: Petr Machata <petrm@nvidia.com> Acked-by: Nikolay Aleksandrov <razor@blackwall.org> Link: https://patch.msgid.link/ab85e33ef41ed19a3deaef0ff7da26830da30642.1757004393.git.petrm@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> |
||
|
|
fc3a281041 |
Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Cross-merge networking fixes after downstream PR (net-6.17-rc6). Conflicts: net/netfilter/nft_set_pipapo.c net/netfilter/nft_set_pipapo_avx2.c |
||
|
|
cbd2257dc9 |
netfilter: nft_meta_bridge: introduce NFT_META_BRI_IIFHWADDR support
Expose the input bridge interface ethernet address so it can be used to
redirect the packet to the receiving physical device for processing.
Tested with nft command line tool.
table bridge nat {
chain PREROUTING {
type filter hook prerouting priority 0; policy accept;
ether daddr de:ad:00:00:be:ef meta pkttype set host ether daddr set meta ibrhwdr accept
}
}
Joint work with Pablo Neira.
Signed-off-by: Fernando Fernandez Mancera <fmancera@suse.de>
Signed-off-by: Florian Westphal <fw@strlen.de>
|
||
|
|
8625f5748f |
net: bridge: Bounce invalid boolopts
The bridge driver currently tolerates options that it does not recognize.
Instead, it should bounce them.
Fixes:
|
||
|
|
5ef04a7b06 |
Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Cross-merge networking fixes after downstream PR (net-6.17-rc5). No conflicts. Adjacent changes: include/net/sock.h |
||
|
|
46015e6b3e |
netfilter: ebtables: Use vmalloc_array() to improve code
Remove array_size() calls and replace vmalloc() with vmalloc_array() to simplify the code. vmalloc_array() is also optimized better, uses fewer instructions, and handles overflow more concisely[1]. [1]: https://lore.kernel.org/lkml/abc66ec5-85a4-47e1-9759-2f60ab111971@vivo.com/ Signed-off-by: Qianfeng Rong <rongqianfeng@vivo.com> Signed-off-by: Florian Westphal <fw@strlen.de> |
||
|
|
479a54ab92 |
netfilter: br_netfilter: do not check confirmed bit in br_nf_local_in() after confirm
When send a broadcast packet to a tap device, which was added to a bridge,
br_nf_local_in() is called to confirm the conntrack. If another conntrack
with the same hash value is added to the hash table, which can be
triggered by a normal packet to a non-bridge device, the below warning
may happen.
------------[ cut here ]------------
WARNING: CPU: 1 PID: 96 at net/bridge/br_netfilter_hooks.c:632 br_nf_local_in+0x168/0x200
CPU: 1 UID: 0 PID: 96 Comm: tap_send Not tainted 6.17.0-rc2-dirty #44 PREEMPT(voluntary)
RIP: 0010:br_nf_local_in+0x168/0x200
Call Trace:
<TASK>
nf_hook_slow+0x3e/0xf0
br_pass_frame_up+0x103/0x180
br_handle_frame_finish+0x2de/0x5b0
br_nf_hook_thresh+0xc0/0x120
br_nf_pre_routing_finish+0x168/0x3a0
br_nf_pre_routing+0x237/0x5e0
br_handle_frame+0x1ec/0x3c0
__netif_receive_skb_core+0x225/0x1210
__netif_receive_skb_one_core+0x37/0xa0
netif_receive_skb+0x36/0x160
tun_get_user+0xa54/0x10c0
tun_chr_write_iter+0x65/0xb0
vfs_write+0x305/0x410
ksys_write+0x60/0xd0
do_syscall_64+0xa4/0x260
entry_SYSCALL_64_after_hwframe+0x77/0x7f
</TASK>
---[ end trace 0000000000000000 ]---
To solve the hash conflict, nf_ct_resolve_clash() try to merge the
conntracks, and update skb->_nfct. However, br_nf_local_in() still use the
old ct from local variable 'nfct' after confirm(), which leads to this
warning.
If confirm() does not insert the conntrack entry and return NF_DROP, the
warning may also occur. There is no need to reserve the WARN_ON_ONCE, just
remove it.
Link: https://lore.kernel.org/netdev/20250820043329.2902014-1-wangliang74@huawei.com/
Fixes:
|
||
|
|
a9af709fda |
Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Cross-merge networking fixes after downstream PR (net-6.17-rc3). No conflicts or adjacent changes. Signed-off-by: Jakub Kicinski <kuba@kernel.org> |
||
|
|
7de0eebbb4 |
net: bridge: remove unused argument of br_multicast_query_expired()
Since commit
|
||
|
|
d1547bf460 |
net: bridge: fix soft lockup in br_multicast_query_expired()
When set multicast_query_interval to a large value, the local variable 'time' in br_multicast_send_query() may overflow. If the time is smaller than jiffies, the timer will expire immediately, and then call mod_timer() again, which creates a loop and may trigger the following soft lockup issue. watchdog: BUG: soft lockup - CPU#1 stuck for 221s! [rb_consumer:66] CPU: 1 UID: 0 PID: 66 Comm: rb_consumer Not tainted 6.16.0+ #259 PREEMPT(none) Call Trace: <IRQ> __netdev_alloc_skb+0x2e/0x3a0 br_ip6_multicast_alloc_query+0x212/0x1b70 __br_multicast_send_query+0x376/0xac0 br_multicast_send_query+0x299/0x510 br_multicast_query_expired.constprop.0+0x16d/0x1b0 call_timer_fn+0x3b/0x2a0 __run_timers+0x619/0x950 run_timer_softirq+0x11c/0x220 handle_softirqs+0x18e/0x560 __irq_exit_rcu+0x158/0x1a0 sysvec_apic_timer_interrupt+0x76/0x90 </IRQ> This issue can be reproduced with: ip link add br0 type bridge echo 1 > /sys/class/net/br0/bridge/multicast_querier echo 0xffffffffffffffff > /sys/class/net/br0/bridge/multicast_query_interval ip link set dev br0 up The multicast_startup_query_interval can also cause this issue. Similar to the commit |
||
|
|
3d05b24429 |
bridge: Redirect to backup port when port is administratively down
If a backup port is configured for a bridge port, the bridge will redirect known unicast traffic towards the backup port when the primary port is administratively up but without a carrier. This is useful, for example, in MLAG configurations where a system is connected to two switches and there is a peer link between both switches. The peer link serves as the backup port in case one of the switches loses its connection to the multi-homed system. In order to avoid flooding when the primary port loses its carrier, the bridge does not flush dynamic FDB entries pointing to the port upon STP disablement, if the port has a backup port. The above means that known unicast traffic destined to the primary port will be blackholed when the port is put administratively down, until the FDB entries pointing to it are aged-out. Given that the current behavior is quite weird and unlikely to be depended on by anyone, amend the bridge to redirect to the backup port also when the primary port is administratively down and not only when it does not have a carrier. The change is motivated by a report from a user who expected traffic to be redirected to the backup port when the primary port was put administratively down while debugging a network issue. Reviewed-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Acked-by: Nikolay Aleksandrov <razor@blackwall.org> Link: https://patch.msgid.link/20250812080213.325298-2-idosch@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> |
||
|
|
25a8b88f00 |
netfilter: add back NETFILTER_XTABLES dependencies
Some Kconfig symbols were changed to depend on the 'bool' symbol
NETFILTER_XTABLES_LEGACY, which means they can now be set to built-in
when the xtables code itself is in a loadable module:
x86_64-linux-ld: vmlinux.o: in function `arpt_unregister_table_pre_exit':
(.text+0x1831987): undefined reference to `xt_find_table'
x86_64-linux-ld: vmlinux.o: in function `get_info.constprop.0':
arp_tables.c:(.text+0x1831aab): undefined reference to `xt_request_find_table_lock'
x86_64-linux-ld: arp_tables.c:(.text+0x1831bea): undefined reference to `xt_table_unlock'
x86_64-linux-ld: vmlinux.o: in function `do_arpt_get_ctl':
arp_tables.c:(.text+0x183205d): undefined reference to `xt_find_table_lock'
x86_64-linux-ld: arp_tables.c:(.text+0x18320c1): undefined reference to `xt_table_unlock'
x86_64-linux-ld: arp_tables.c:(.text+0x183219a): undefined reference to `xt_recseq'
Change these to depend on both NETFILTER_XTABLES and
NETFILTER_XTABLES_LEGACY.
Fixes:
|
||
|
|
8be4d31cb8 |
Networking changes for 6.17.
Core & protocols
----------------
- Wrap datapath globals into net_aligned_data, to avoid false sharing.
- Preserve MSG_ZEROCOPY in forwarding (e.g. out of a container).
- Add SO_INQ and SCM_INQ support to AF_UNIX.
- Add SIOCINQ support to AF_VSOCK.
- Add TCP_MAXSEG sockopt to MPTCP.
- Add IPv6 force_forwarding sysctl to enable forwarding per interface.
- Make TCP validation of whether packet fully fits in the receive
window and the rcv_buf more strict. With increased use of HW
aggregation a single "packet" can be multiple 100s of kB.
- Add MSG_MORE flag to optimize large TCP transmissions via sockmap,
improves latency up to 33% for sockmap users.
- Convert TCP send queue handling from tasklet to BH workque.
- Improve BPF iteration over TCP sockets to see each socket exactly once.
- Remove obsolete and unused TCP RFC3517/RFC6675 loss recovery code.
- Support enabling kernel threads for NAPI processing on per-NAPI
instance basis rather than a whole device. Fully stop the kernel NAPI
thread when threaded NAPI gets disabled. Previously thread would stick
around until ifdown due to tricky synchronization.
- Allow multicast routing to take effect on locally-generated packets.
- Add output interface argument for End.X in segment routing.
- MCTP: add support for gateway routing, improve bind() handling.
- Don't require rtnl_lock when fetching an IPv6 neighbor over Netlink.
- Add a new neighbor flag ("extern_valid"), which cedes refresh
responsibilities to userspace. This is needed for EVPN multi-homing
where a neighbor entry for a multi-homed host needs to be synced
across all the VTEPs among which the host is multi-homed.
- Support NUD_PERMANENT for proxy neighbor entries.
- Add a new queuing discipline for IETF RFC9332 DualQ Coupled AQM.
- Add sequence numbers to netconsole messages. Unregister netconsole's
console when all net targets are removed. Code refactoring.
Add a number of selftests.
- Align IPSec inbound SA lookup to RFC 4301. Only SPI and protocol
should be used for an inbound SA lookup.
- Support inspecting ref_tracker state via DebugFS.
- Don't force bonding advertisement frames tx to ~333 ms boundaries.
Add broadcast_neighbor option to send ARP/ND on all bonded links.
- Allow providing upcall pid for the 'execute' command in openvswitch.
- Remove DCCP support from Netfilter's conntrack.
- Disallow multiple packet duplications in the queuing layer.
- Prevent use of deprecated iptables code on PREEMPT_RT.
Driver API
----------
- Support RSS and hashing configuration over ethtool Netlink.
- Add dedicated ethtool callbacks for getting and setting hashing fields.
- Add support for power budget evaluation strategy in PSE /
Power-over-Ethernet. Generate Netlink events for overcurrent etc.
- Support DPLL phase offset monitoring across all device inputs.
Support providing clock reference and SYNC over separate DPLL
inputs.
- Support traffic classes in devlink rate API for bandwidth management.
- Remove rtnl_lock dependency from UDP tunnel port configuration.
Device drivers
--------------
- Add a new Broadcom driver for 800G Ethernet (bnge).
- Add a standalone driver for Microchip ZL3073x DPLL.
- Remove IBM's NETIUCV device driver.
- Ethernet high-speed NICs:
- Broadcom (bnxt):
- support zero-copy Tx of DMABUF memory
- take page size into account for page pool recycling rings
- Intel (100G, ice, idpf):
- idpf: XDP and AF_XDP support preparations
- idpf: add flow steering
- add link_down_events statistic
- clean up the TSPLL code
- preparations for live VM migration
- nVidia/Mellanox:
- support zero-copy Rx/Tx interfaces (DMABUF and io_uring)
- optimize context memory usage for matchers
- expose serial numbers in devlink info
- support PCIe congestion metrics
- Meta (fbnic):
- add 25G, 50G, and 100G link modes to phylink
- support dumping FW logs
- Marvell/Cavium:
- support for CN20K generation of the Octeon chips
- Amazon:
- add HW clock (without timestamping, just hypervisor time access)
- Ethernet virtual:
- VirtIO net:
- support segmentation of UDP-tunnel-encapsulated packets
- Google (gve):
- support packet timestamping and clock synchronization
- Microsoft vNIC:
- add handler for device-originated servicing events
- allow dynamic MSI-X vector allocation
- support Tx bandwidth clamping
- Ethernet NICs consumer, and embedded:
- AMD:
- amd-xgbe: hardware timestamping and PTP clock support
- Broadcom integrated MACs (bcmgenet, bcmasp):
- use napi_complete_done() return value to support NAPI polling
- add support for re-starting auto-negotiation
- Broadcom switches (b53):
- support BCM5325 switches
- add bcm63xx EPHY power control
- Synopsys (stmmac):
- lots of code refactoring and cleanups
- TI:
- icssg-prueth: read firmware-names from device tree
- icssg: PRP offload support
- Microchip:
- lan78xx: convert to PHYLINK for improved PHY and MAC management
- ksz: add KSZ8463 switch support
- Intel:
- support similar queue priority scheme in multi-queue and
time-sensitive networking (taprio)
- support packet pre-emption in both
- RealTek (r8169):
- enable EEE at 5Gbps on RTL8126
- Airoha:
- add PPPoE offload support
- MDIO bus controller for Airoha AN7583
- Ethernet PHYs:
- support for the IPQ5018 internal GE PHY
- micrel KSZ9477 switch-integrated PHYs:
- add MDI/MDI-X control support
- add RX error counters
- add cable test support
- add Signal Quality Indicator (SQI) reporting
- dp83tg720: improve reset handling and reduce link recovery time
- support bcm54811 (and its MII-Lite interface type)
- air_en8811h: support resume/suspend
- support PHY counters for QCA807x and QCA808x
- support WoL for QCA807x
- CAN drivers:
- rcar_canfd: support for Transceiver Delay Compensation
- kvaser: report FW versions via devlink dev info
- WiFi:
- extended regulatory info support (6 GHz)
- add statistics and beacon monitor for Multi-Link Operation (MLO)
- support S1G aggregation, improve S1G support
- add Radio Measurement action fields
- support per-radio RTS threshold
- some work around how FIPS affects wifi, which was wrong (RC4 is used
by TKIP, not only WEP)
- improvements for unsolicited probe response handling
- WiFi drivers:
- RealTek (rtw88):
- IBSS mode for SDIO devices
- RealTek (rtw89):
- BT coexistence for MLO/WiFi7
- concurrent station + P2P support
- support for USB devices RTL8851BU/RTL8852BU
- Intel (iwlwifi):
- use embedded PNVM in (to be released) FW images to fix
compatibility issues
- many cleanups (unused FW APIs, PCIe code, WoWLAN)
- some FIPS interoperability
- MediaTek (mt76):
- firmware recovery improvements
- more MLO work
- Qualcomm/Atheros (ath12k):
- fix scan on multi-radio devices
- more EHT/Wi-Fi 7 features
- encapsulation/decapsulation offload
- Broadcom (brcm80211):
- support SDIO 43751 device
- Bluetooth:
- hci_event: add support for handling LE BIG Sync Lost event
- ISO: add socket option to report packet seqnum via CMSG
- ISO: support SCM_TIMESTAMPING for ISO TS
- Bluetooth drivers:
- intel_pcie: support Function Level Reset
- nxpuart: add support for 4M baudrate
- nxpuart: implement powerup sequence, reset, FW dump, and FW loading
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEE6jPA+I1ugmIBA4hXMUZtbf5SIrsFAmiFgLgACgkQMUZtbf5S
IrvafxAAnQRwYBoIG+piCILx6z5pRvBGHkmEQ4AQgSCFuq2eO3ubwMFIqEybfma1
5+QFjUZAV3OgGgKRBS2KGWxtSzdiF+/JGV1VOIN67sX3Mm0a2QgjA4n5CgKL0FPr
o6BEzjX5XwG1zvGcBNQ5BZ19xUUKjoZQgTtnea8sZ57Fsp5RtRgmYRqoewNvNk/n
uImh0NFsDVb0UeOpSzC34VD9l1dJvLGdui4zJAjno/vpvmT1DkXjoK419J/r52SS
X+5WgsfJ6DkjHqVN1tIhhK34yWqBOcwGFZJgEnWHMkFIl2FqRfFKMHyqtfLlVnLA
mnIpSyz8Sq2AHtx0TlgZ3At/Ri8p5+yYJgHOXcDKyABa8y8Zf4wrycmr6cV9JLuL
z54nLEVnJuvfDVDVJjsLYdJXyhMpZFq6+uAItdxKaw8Ugp/QqG4QtoRj+XIHz4ZW
z6OohkCiCzTwEISFK+pSTxPS30eOxq43kCspcvuLiwCCStJBRkRb5GdZA4dm7LA+
1Od4ADAkHjyrFtBqTyyC2scX8UJ33DlAIpAYyIeS6w9Cj9EXxtp1z33IAAAZ03MW
jJwIaJuc8bK2fWKMmiG7ucIXjPo4t//KiWlpkwwqLhPbjZgfDAcxq1AC2TLoqHBL
y4EOgKpHDCMAghSyiFIAn2JprGcEt8dp+11B0JRXIn4Pm/eYDH8=
=lqbe
-----END PGP SIGNATURE-----
Merge tag 'net-next-6.17' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next
Pull networking updates from Jakub Kicinski:
"Core & protocols:
- Wrap datapath globals into net_aligned_data, to avoid false sharing
- Preserve MSG_ZEROCOPY in forwarding (e.g. out of a container)
- Add SO_INQ and SCM_INQ support to AF_UNIX
- Add SIOCINQ support to AF_VSOCK
- Add TCP_MAXSEG sockopt to MPTCP
- Add IPv6 force_forwarding sysctl to enable forwarding per interface
- Make TCP validation of whether packet fully fits in the receive
window and the rcv_buf more strict. With increased use of HW
aggregation a single "packet" can be multiple 100s of kB
- Add MSG_MORE flag to optimize large TCP transmissions via sockmap,
improves latency up to 33% for sockmap users
- Convert TCP send queue handling from tasklet to BH workque
- Improve BPF iteration over TCP sockets to see each socket exactly
once
- Remove obsolete and unused TCP RFC3517/RFC6675 loss recovery code
- Support enabling kernel threads for NAPI processing on per-NAPI
instance basis rather than a whole device. Fully stop the kernel
NAPI thread when threaded NAPI gets disabled. Previously thread
would stick around until ifdown due to tricky synchronization
- Allow multicast routing to take effect on locally-generated packets
- Add output interface argument for End.X in segment routing
- MCTP: add support for gateway routing, improve bind() handling
- Don't require rtnl_lock when fetching an IPv6 neighbor over Netlink
- Add a new neighbor flag ("extern_valid"), which cedes refresh
responsibilities to userspace. This is needed for EVPN multi-homing
where a neighbor entry for a multi-homed host needs to be synced
across all the VTEPs among which the host is multi-homed
- Support NUD_PERMANENT for proxy neighbor entries
- Add a new queuing discipline for IETF RFC9332 DualQ Coupled AQM
- Add sequence numbers to netconsole messages. Unregister
netconsole's console when all net targets are removed. Code
refactoring. Add a number of selftests
- Align IPSec inbound SA lookup to RFC 4301. Only SPI and protocol
should be used for an inbound SA lookup
- Support inspecting ref_tracker state via DebugFS
- Don't force bonding advertisement frames tx to ~333 ms boundaries.
Add broadcast_neighbor option to send ARP/ND on all bonded links
- Allow providing upcall pid for the 'execute' command in openvswitch
- Remove DCCP support from Netfilter's conntrack
- Disallow multiple packet duplications in the queuing layer
- Prevent use of deprecated iptables code on PREEMPT_RT
Driver API:
- Support RSS and hashing configuration over ethtool Netlink
- Add dedicated ethtool callbacks for getting and setting hashing
fields
- Add support for power budget evaluation strategy in PSE /
Power-over-Ethernet. Generate Netlink events for overcurrent etc
- Support DPLL phase offset monitoring across all device inputs.
Support providing clock reference and SYNC over separate DPLL
inputs
- Support traffic classes in devlink rate API for bandwidth
management
- Remove rtnl_lock dependency from UDP tunnel port configuration
Device drivers:
- Add a new Broadcom driver for 800G Ethernet (bnge)
- Add a standalone driver for Microchip ZL3073x DPLL
- Remove IBM's NETIUCV device driver
- Ethernet high-speed NICs:
- Broadcom (bnxt):
- support zero-copy Tx of DMABUF memory
- take page size into account for page pool recycling rings
- Intel (100G, ice, idpf):
- idpf: XDP and AF_XDP support preparations
- idpf: add flow steering
- add link_down_events statistic
- clean up the TSPLL code
- preparations for live VM migration
- nVidia/Mellanox:
- support zero-copy Rx/Tx interfaces (DMABUF and io_uring)
- optimize context memory usage for matchers
- expose serial numbers in devlink info
- support PCIe congestion metrics
- Meta (fbnic):
- add 25G, 50G, and 100G link modes to phylink
- support dumping FW logs
- Marvell/Cavium:
- support for CN20K generation of the Octeon chips
- Amazon:
- add HW clock (without timestamping, just hypervisor time access)
- Ethernet virtual:
- VirtIO net:
- support segmentation of UDP-tunnel-encapsulated packets
- Google (gve):
- support packet timestamping and clock synchronization
- Microsoft vNIC:
- add handler for device-originated servicing events
- allow dynamic MSI-X vector allocation
- support Tx bandwidth clamping
- Ethernet NICs consumer, and embedded:
- AMD:
- amd-xgbe: hardware timestamping and PTP clock support
- Broadcom integrated MACs (bcmgenet, bcmasp):
- use napi_complete_done() return value to support NAPI polling
- add support for re-starting auto-negotiation
- Broadcom switches (b53):
- support BCM5325 switches
- add bcm63xx EPHY power control
- Synopsys (stmmac):
- lots of code refactoring and cleanups
- TI:
- icssg-prueth: read firmware-names from device tree
- icssg: PRP offload support
- Microchip:
- lan78xx: convert to PHYLINK for improved PHY and MAC management
- ksz: add KSZ8463 switch support
- Intel:
- support similar queue priority scheme in multi-queue and
time-sensitive networking (taprio)
- support packet pre-emption in both
- RealTek (r8169):
- enable EEE at 5Gbps on RTL8126
- Airoha:
- add PPPoE offload support
- MDIO bus controller for Airoha AN7583
- Ethernet PHYs:
- support for the IPQ5018 internal GE PHY
- micrel KSZ9477 switch-integrated PHYs:
- add MDI/MDI-X control support
- add RX error counters
- add cable test support
- add Signal Quality Indicator (SQI) reporting
- dp83tg720: improve reset handling and reduce link recovery time
- support bcm54811 (and its MII-Lite interface type)
- air_en8811h: support resume/suspend
- support PHY counters for QCA807x and QCA808x
- support WoL for QCA807x
- CAN drivers:
- rcar_canfd: support for Transceiver Delay Compensation
- kvaser: report FW versions via devlink dev info
- WiFi:
- extended regulatory info support (6 GHz)
- add statistics and beacon monitor for Multi-Link Operation (MLO)
- support S1G aggregation, improve S1G support
- add Radio Measurement action fields
- support per-radio RTS threshold
- some work around how FIPS affects wifi, which was wrong (RC4 is
used by TKIP, not only WEP)
- improvements for unsolicited probe response handling
- WiFi drivers:
- RealTek (rtw88):
- IBSS mode for SDIO devices
- RealTek (rtw89):
- BT coexistence for MLO/WiFi7
- concurrent station + P2P support
- support for USB devices RTL8851BU/RTL8852BU
- Intel (iwlwifi):
- use embedded PNVM in (to be released) FW images to fix
compatibility issues
- many cleanups (unused FW APIs, PCIe code, WoWLAN)
- some FIPS interoperability
- MediaTek (mt76):
- firmware recovery improvements
- more MLO work
- Qualcomm/Atheros (ath12k):
- fix scan on multi-radio devices
- more EHT/Wi-Fi 7 features
- encapsulation/decapsulation offload
- Broadcom (brcm80211):
- support SDIO 43751 device
- Bluetooth:
- hci_event: add support for handling LE BIG Sync Lost event
- ISO: add socket option to report packet seqnum via CMSG
- ISO: support SCM_TIMESTAMPING for ISO TS
- Bluetooth drivers:
- intel_pcie: support Function Level Reset
- nxpuart: add support for 4M baudrate
- nxpuart: implement powerup sequence, reset, FW dump, and FW loading"
* tag 'net-next-6.17' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1742 commits)
dpll: zl3073x: Fix build failure
selftests: bpf: fix legacy netfilter options
ipv6: annotate data-races around rt->fib6_nsiblings
ipv6: fix possible infinite loop in fib6_info_uses_dev()
ipv6: prevent infinite loop in rt6_nlmsg_size()
ipv6: add a retry logic in net6_rt_notify()
vrf: Drop existing dst reference in vrf_ip6_input_dst
net/sched: taprio: align entry index attr validation with mqprio
net: fsl_pq_mdio: use dev_err_probe
selftests: rtnetlink.sh: remove esp4_offload after test
vsock: remove unnecessary null check in vsock_getname()
igb: xsk: solve negative overflow of nb_pkts in zerocopy mode
stmmac: xsk: fix negative overflow of budget in zerocopy mode
dt-bindings: ieee802154: Convert at86rf230.txt yaml format
net: dsa: microchip: Disable PTP function of KSZ8463
net: dsa: microchip: Setup fiber ports for KSZ8463
net: dsa: microchip: Write switch MAC address differently for KSZ8463
net: dsa: microchip: Use different registers for KSZ8463
net: dsa: microchip: Add KSZ8463 switch support to KSZ DSA driver
dt-bindings: net: dsa: microchip: Add KSZ8463 switch support
...
|
||
|
|
22c5696e3f |
Driver core changes for 6.17-rc1
- DEBUGFS
- Remove unneeded debugfs_file_{get,put}() instances
- Remove last remnants of debugfs_real_fops()
- Allow storing non-const void * in struct debugfs_inode_info::aux
- SYSFS
- Switch back to attribute_group::bin_attrs (treewide)
- Switch back to bin_attribute::read()/write() (treewide)
- Constify internal references to 'struct bin_attribute'
- Support cache-ids for device-tree systems
- Add arch hook arch_compact_of_hwid()
- Use arch_compact_of_hwid() to compact MPIDR values on arm64
- Rust
- Device
- Introduce CoreInternal device context (for bus internal methods)
- Provide generic drvdata accessors for bus devices
- Provide Driver::unbind() callbacks
- Use the infrastructure above for auxiliary, PCI and platform
- Implement Device::as_bound()
- Rename Device::as_ref() to Device::from_raw() (treewide)
- Implement fwnode and device property abstractions
- Implement example usage in the Rust platform sample driver
- Devres
- Remove the inner reference count (Arc) and use pin-init instead
- Replace Devres::new_foreign_owned() with devres::register()
- Require T to be Send in Devres<T>
- Initialize the data kept inside a Devres last
- Provide an accessor for the Devres associated Device
- Device ID
- Add support for ACPI device IDs and driver match tables
- Split up generic device ID infrastructure
- Use generic device ID infrastructure in net::phy
- DMA
- Implement the dma::Device trait
- Add DMA mask accessors to dma::Device
- Implement dma::Device for PCI and platform devices
- Use DMA masks from the DMA sample module
- I/O
- Implement abstraction for resource regions (struct resource)
- Implement resource-based ioremap() abstractions
- Provide platform device accessors for I/O (remap) requests
- Misc
- Support fallible PinInit types in Revocable
- Implement Wrapper<T> for Opaque<T>
- Merge pin-init blanket dependencies (for Devres)
- Misc
- Fix OF node leak in auxiliary_device_create()
- Use util macros in device property iterators
- Improve kobject sample code
- Add device_link_test() for testing device link flags
- Fix typo in Documentation/ABI/testing/sysfs-kernel-address_bits
- Hint to prefer container_of_const() over container_of()
-----BEGIN PGP SIGNATURE-----
iHQEABYKAB0WIQS2q/xV6QjXAdC7k+1FlHeO1qrKLgUCaIjkhwAKCRBFlHeO1qrK
LpXuAP9RWwfD9ZGgQZ9OsMk/0pZ2mDclaK97jcmI9TAeSxeZMgD1FHnOMTY7oSIi
iG7Muq0yLD+A5gk9HUnMUnFNrngWCg==
=jgRj
-----END PGP SIGNATURE-----
Merge tag 'driver-core-6.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/driver-core/driver-core
Pull driver core updates from Danilo Krummrich:
"debugfs:
- Remove unneeded debugfs_file_{get,put}() instances
- Remove last remnants of debugfs_real_fops()
- Allow storing non-const void * in struct debugfs_inode_info::aux
sysfs:
- Switch back to attribute_group::bin_attrs (treewide)
- Switch back to bin_attribute::read()/write() (treewide)
- Constify internal references to 'struct bin_attribute'
Support cache-ids for device-tree systems:
- Add arch hook arch_compact_of_hwid()
- Use arch_compact_of_hwid() to compact MPIDR values on arm64
Rust:
- Device:
- Introduce CoreInternal device context (for bus internal methods)
- Provide generic drvdata accessors for bus devices
- Provide Driver::unbind() callbacks
- Use the infrastructure above for auxiliary, PCI and platform
- Implement Device::as_bound()
- Rename Device::as_ref() to Device::from_raw() (treewide)
- Implement fwnode and device property abstractions
- Implement example usage in the Rust platform sample driver
- Devres:
- Remove the inner reference count (Arc) and use pin-init instead
- Replace Devres::new_foreign_owned() with devres::register()
- Require T to be Send in Devres<T>
- Initialize the data kept inside a Devres last
- Provide an accessor for the Devres associated Device
- Device ID:
- Add support for ACPI device IDs and driver match tables
- Split up generic device ID infrastructure
- Use generic device ID infrastructure in net::phy
- DMA:
- Implement the dma::Device trait
- Add DMA mask accessors to dma::Device
- Implement dma::Device for PCI and platform devices
- Use DMA masks from the DMA sample module
- I/O:
- Implement abstraction for resource regions (struct resource)
- Implement resource-based ioremap() abstractions
- Provide platform device accessors for I/O (remap) requests
- Misc:
- Support fallible PinInit types in Revocable
- Implement Wrapper<T> for Opaque<T>
- Merge pin-init blanket dependencies (for Devres)
Misc:
- Fix OF node leak in auxiliary_device_create()
- Use util macros in device property iterators
- Improve kobject sample code
- Add device_link_test() for testing device link flags
- Fix typo in Documentation/ABI/testing/sysfs-kernel-address_bits
- Hint to prefer container_of_const() over container_of()"
* tag 'driver-core-6.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/driver-core/driver-core: (84 commits)
rust: io: fix broken intra-doc links to `platform::Device`
rust: io: fix broken intra-doc link to missing `flags` module
rust: io: mem: enable IoRequest doc-tests
rust: platform: add resource accessors
rust: io: mem: add a generic iomem abstraction
rust: io: add resource abstraction
rust: samples: dma: set DMA mask
rust: platform: implement the `dma::Device` trait
rust: pci: implement the `dma::Device` trait
rust: dma: add DMA addressing capabilities
rust: dma: implement `dma::Device` trait
rust: net::phy Change module_phy_driver macro to use module_device_table macro
rust: net::phy represent DeviceId as transparent wrapper over mdio_device_id
rust: device_id: split out index support into a separate trait
device: rust: rename Device::as_ref() to Device::from_raw()
arm64: cacheinfo: Provide helper to compress MPIDR value into u32
cacheinfo: Add arch hook to compress CPU h/w id into 32 bits for cache-id
cacheinfo: Set cache 'id' based on DT data
container_of: Document container_of() is not to be used in new code
driver core: auxiliary bus: fix OF node leak
...
|