mirror of
https://github.com/torvalds/linux.git
synced 2026-06-05 04:56:13 +02:00
The stats.py test may fail on a network device whose byte/packet
counters are large, because the rtnl counters it compares against can
overflow. Comparing against stats64 instead fixes this problem.
I tripped on this while iterating on a qstats patch for mlx5. See below
for confirmation without my added code that this is a bug.
Before this patch (with added debugging output):
$ NETIF=eth0 tools/testing/selftests/drivers/net/stats.py
KTAP version 1
1..4
ok 1 stats.check_pause
ok 2 stats.check_fec
rstat: 481708634 qstat: 666201639514 key: tx-bytes
not ok 3 stats.pkt_byte_sum
ok 4 stats.qstat_by_ifindex
Note the huge delta above ^^^ in the rtnl vs qstats.
After this patch:
$ NETIF=eth0 tools/testing/selftests/drivers/net/stats.py
KTAP version 1
1..4
ok 1 stats.check_pause
ok 2 stats.check_fec
ok 3 stats.pkt_byte_sum
ok 4 stats.qstat_by_ifindex
It looks like rtnl_fill_stats() in net/core/rtnetlink.c copies the
64-bit stats into a 32-bit structure, which is probably why this
behavior occurs.
To show this is happening, you can get the underlying stats that the
stats.py test uses like this:
$ ./cli.py --spec ../../../Documentation/netlink/specs/rt_link.yaml \
--do getlink --json '{"ifi-index": 7}'
And examine the output (heavily snipped to show relevant fields):
'stats': {
'multicast': 3739197,
'rx-bytes': 1201525399,
'rx-packets': 56807158,
'tx-bytes': 492404458,
'tx-packets': 1200285371,
'stats64': {
'multicast': 3739197,
'rx-bytes': 35561263767,
'rx-packets': 56807158,
'tx-bytes': 666212335338,
'tx-packets': 1200285371,
The stats.py test prior to this patch was using the 'stats' structure
above, which matches the failure output on my system.
Comparing rx-bytes and tx-bytes side by side, together with the
ethtool -S output:
rx-bytes stats: 1201525399
rx-bytes stats64: 35561263767
rx-bytes ethtool: 36203402638
tx-bytes stats: 492404458
tx-bytes stats64: 666212335338
tx-bytes ethtool: 666215360113
Note that the above was taken from a system with an mlx5 NIC, which only
exposes ndo_get_stats64.
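The numbers above are consistent with a plain 32-bit truncation. A
minimal sketch to check this, assuming the 'stats' value is simply the
low 32 bits of the 'stats64' value (an assumption that fits the data
above, not something verified against the kernel source here); the
variable names are only illustrative:

```python
# Values copied from the getlink output above (mlx5 system).
U32_MODULUS = 1 << 32  # a 32-bit counter wraps at 2**32

samples = {
    "rx-bytes": {"stats": 1201525399, "stats64": 35561263767},
    "tx-bytes": {"stats": 492404458, "stats64": 666212335338},
}

# True when the 32-bit value equals the low 32 bits of the 64-bit one.
# Assumption: the 32-bit field is a straight truncation of the 64-bit one.
truncated = {
    key: val["stats64"] % U32_MODULUS == val["stats"]
    for key, val in samples.items()
}

for key, matches in truncated.items():
    print(f"{key}: stats == stats64 mod 2^32 -> {matches}")
```

Both keys report True for the values shown above.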
Based on the ethtool output and qstat output, it appears that stats.py
should be updated to use the 'stats64' structure for accurate
comparisons when packet/byte counters get very large.
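To illustrate why the choice of structure matters, here is a rough
sketch that compares both tx-bytes values against the qstat value
printed by the failing run. It is not the logic stats.py uses: rel_err
is a hypothetical helper, and the qstat and rtnl samples were taken at
different times, so a small nonzero difference is expected even for
stats64.

```python
# Excerpt of the getlink reply shown above, tx-bytes only.
link = {
    "stats": {"tx-bytes": 492404458},
    "stats64": {"tx-bytes": 666212335338},
}
# qstat value from the "rstat: ... qstat: ..." debug line of the failing run.
qstat_tx_bytes = 666201639514


def rel_err(counter, reference):
    """Relative difference between a counter and a reference value."""
    return abs(counter - reference) / reference


err32 = rel_err(link["stats"]["tx-bytes"], qstat_tx_bytes)
err64 = rel_err(link["stats64"]["tx-bytes"], qstat_tx_bytes)

print(f"stats   vs qstat: {err32:.6f}")
print(f"stats64 vs qstat: {err64:.6f}")
```

The 32-bit value is off by nearly the whole counter, while the 64-bit
value differs only by the traffic that passed between the two samples.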
To confirm that this was not related to the qstats code I was iterating
on, I booted a kernel without my driver changes and re-ran the test
which shows the qstats are skipped (as they don't exist for mlx5):
$ NETIF=eth0 tools/testing/selftests/drivers/net/stats.py
KTAP version 1
1..4
ok 1 stats.check_pause
ok 2 stats.check_fec
ok 3 stats.pkt_byte_sum # SKIP qstats not supported by the device
ok 4 stats.qstat_by_ifindex # SKIP No ifindex supports qstats
But, fetching the stats using the CLI
$ ./cli.py --spec ../../../Documentation/netlink/specs/rt_link.yaml \
--do getlink --json '{"ifi-index": 7}'
Shows the same issue (heavily snipped for relevant fields only):
'stats': {
'multicast': 105489,
'rx-bytes': 530879526,
'rx-packets': 751415,
'tx-bytes': 2510191396,
'tx-packets': 27700323,
'stats64': {
'multicast': 105489,
'rx-bytes': 530879526,
'rx-packets': 751415,
'tx-bytes': 15395093284,
'tx-packets': 27700323,
Comparing side by side with ethtool -S on the unmodified mlx5 driver:
tx-bytes stats: 2510191396
tx-bytes stats64: 15395093284
tx-bytes ethtool: 17718435810
Fixes: