Commit Graph

10818 Commits

Author SHA1 Message Date
Eric Biggers
0961c3bcef lib/crc-t10dif: add support for arch overrides
Following what was done for CRC32, add support for architecture-specific
override of the CRC-T10DIF library.  This will allow the CRC-T10DIF
library functions to access architecture-optimized code directly.

Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Link: https://lore.kernel.org/r/20241202012056.209768-3-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@google.com>
2024-12-01 17:23:13 -08:00
Eric Biggers
be3c45b070 lib/crc-t10dif: stop wrapping the crypto API
In preparation for making the CRC-T10DIF library directly optimized for
each architecture, like what has been done for CRC32, get rid of the
weird layering where crc_t10dif_update() calls into the crypto API.
Instead, move crc_t10dif_generic() into the crc-t10dif library module,
and make crc_t10dif_update() just call crc_t10dif_generic().
Acceleration will be reintroduced via crc_t10dif_arch() in the following
patches.

Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Link: https://lore.kernel.org/r/20241202012056.209768-2-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@google.com>
2024-12-01 17:23:13 -08:00
Eric Biggers
38a9a5121c lib/crc32: make crc32c() go directly to lib
Now that the lower level __crc32c_le() library function is optimized for
each architecture, make crc32c() just call that instead of taking an
inefficient and error-prone detour through the shash API.

Note: a future cleanup should make crc32c_le() be the actual library
function instead of __crc32c_le().  That will require updating callers
of __crc32c_le() to use crc32c_le() instead, and updating callers of
crc32c_le() that expect a 'const void *' arg to expect 'const u8 *'
instead.  Similarly, a future cleanup should remove LIBCRC32C by making
everyone who is selecting it just select CRC32 directly instead.

Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20241202010844.144356-16-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@google.com>
2024-12-01 17:23:02 -08:00
Eric Biggers
d36cebe03c lib/crc32: improve support for arch-specific overrides
Currently the CRC32 library functions are defined as weak symbols, and
the arm64 and riscv architectures override them.

This method of arch-specific overrides has the limitation that it only
works when both the base and arch code is built-in.  Also, it makes the
arch-specific code be silently not used if it is accidentally built with
lib-y instead of obj-y; unfortunately the RISC-V code does this.

This commit reorganizes the code to have explicit *_arch() functions
that are called when they are enabled, similar to how some of the crypto
library code works (e.g. chacha_crypt() calls chacha_crypt_arch()).

Make the existing kconfig choice for the CRC32 implementation also
control whether the arch-optimized implementation (if one is available)
is enabled or not.  Make it enabled by default if CRC32 is also enabled.

The result is that arch-optimized CRC32 library functions will be
included automatically when appropriate, but it is now possible to
disable them.  They can also now be built as a loadable module if the
CRC32 library functions happen to be used only by loadable modules, in
which case the arch and base CRC32 modules will be automatically loaded
via direct symbol dependency when appropriate.

Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20241202010844.144356-3-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@google.com>
2024-12-01 17:23:01 -08:00
Eric Biggers
0a499a7e98 lib/crc32: drop leading underscores from __crc32c_le_base
Remove the leading underscores from __crc32c_le_base().

This is in preparation for adding crc32c_le_arch() and eventually
renaming __crc32c_le() to crc32c_le().

Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20241202010844.144356-2-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@google.com>
2024-12-01 17:23:01 -08:00
Linus Torvalds
88862eeb47 vsnprintf: Removal of bprintf()
- Remove unused bprintf() function
 
   bprintf() was added with the rest of the "bin-printf" functions.
   These are functions that are used by trace_printk() that allows to
   quickly save the format and arguments into the ring buffer without
   the expensive processing of converting numbers to ASCII. Then on
   output, at a much later time, the ring buffer is read and the string
   processing occurs then. The bprintf() was added for consistency but
   was never used. It can be safely removed.
 -----BEGIN PGP SIGNATURE-----
 
 iIoEABYIADIWIQRRSw7ePDh/lE+zeZMp5XQQmuv6qgUCZ0yNShQccm9zdGVkdEBn
 b29kbWlzLm9yZwAKCRAp5XQQmuv6qmJ6AP9i8pFOjeMfb2hOBpJTzORkIXEbz5nG
 OCK/5aeSdjxy8QEAqafBSr5IQOxaTCFve1p7WSwdgmi2ZLmqEasaud0LmAk=
 =5bp1
 -----END PGP SIGNATURE-----

Merge tag 'trace-printf-v6.13' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace

Pull bprintf() removal from Steven Rostedt:

 - Remove unused bprintf() function, that was added with the rest of the
   "bin-printf" functions.

   These are functions that are used by trace_printk() that allows to
   quickly save the format and arguments into the ring buffer without
   the expensive processing of converting numbers to ASCII. Then on
   output, at a much later time, the ring buffer is read and the string
   processing occurs then. The bprintf() was added for consistency but
   was never used. It can be safely removed.

* tag 'trace-printf-v6.13' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
  printf: Remove unused 'bprintf'
2024-12-01 13:10:51 -08:00
Linus Torvalds
9022ed0e7e strscpy: write destination buffer only once
The point behind strscpy() was to once and for all avoid all the
problems with 'strncpy()' and later broken "fixed" versions like
strlcpy() that just made things worse.

So strscpy not only guarantees NUL-termination (unlike strncpy), it also
doesn't do unnecessary padding at the destination.  But at the same time
also avoids byte-at-a-time reads and writes by _allowing_ some extra NUL
writes - within the size, of course - so that the whole copy can be done
with word operations.

It is also stable in the face of a mutable source string: it explicitly
does not read the source buffer multiple times (so an implementation
using "strnlen()+memcpy()" would be wrong), and does not read the source
buffer past the size (like the mis-design that is strlcpy does).

Finally, the return value is designed to be simple and unambiguous: if
the string cannot be copied fully, it returns an actual negative error,
making error handling clearer and simpler (and the caller already knows
the size of the buffer).  Otherwise it returns the string length of the
result.

However, there was one final stability issue that can be important to
callers: the stability of the destination buffer.

In particular, the same way we shouldn't read the source buffer more
than once, we should avoid doing multiple writes to the destination
buffer: first writing a potentially non-terminated string, and then
terminating it with NUL at the end does not result in a stable result
buffer.

Yes, it gives the right result in the end, but if the rule for the
destination buffer was that it is _always_ NUL-terminated even when
accessed concurrently with updates, the final byte of the buffer needs
to always _stay_ as a NUL byte.

[ Note that "final byte is NUL" here is literally about the final byte
  in the destination array, not the terminating NUL at the end of the
  string itself. There is no attempt to try to make concurrent reads and
  writes give any kind of consistent string length or contents, but we
  do want to guarantee that there is always at least that final
  terminating NUL character at the end of the destination array if it
  existed before ]

This is relevant in the kernel for the tsk->comm[] array, for example.
Even without locking (for either readers or writers), we want to know
that while the buffer contents may be garbled, it is always a valid C
string and always has a NUL character at 'comm[TASK_COMM_LEN-1]' (and
never has any "out of thin air" data).

So avoid any "copy possibly non-terminated string, and terminate later"
behavior, and write the destination buffer only once.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2024-12-01 12:17:16 -08:00
Dr. David Alan Gilbert
f69e63756f printf: Remove unused 'bprintf'
bprintf() is unused. Remove it. It was added in the commit 4370aa4aa7
("vsprintf: add binary printf") but as far as I can see was never used,
unlike the other two functions in that patch.

Link: https://lore.kernel.org/20241002173147.210107-1-linux@treblig.org
Reviewed-by: Andy Shevchenko <andy@kernel.org>
Acked-by: Petr Mladek <pmladek@suse.com>
Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2024-11-30 22:41:35 -05:00
Linus Torvalds
55cb93fd24 Driver core changes for 6.13-rc1
Here is a small set of driver core changes for 6.13-rc1.
 
 Nothing major for this merge cycle, except for the 2 simple merge
 conflicts are here just to make life interesting.
 
 Included in here are:
   - sysfs core changes and preparations for more sysfs api cleanups that
     can come through all driver trees after -rc1 is out
   - fw_devlink fixes based on many reports and debugging sessions
   - list_for_each_reverse() removal, no one was using it!
   - last-minute seq_printf() format string bug found and fixed in many
     drivers all at once.
   - minor bugfixes and changes full details in the shortlog
 
 As mentioned above, there is 2 merge conflicts with your tree, one is
 where the file is removed (easy enough to resolve), the second is a
 build time error, that has been found in linux-next and the fix can be
 seen here:
 	https://lore.kernel.org/r/20241107212645.41252436@canb.auug.org.au
 
 Other than that, the changes here have been in linux-next with no other
 reported issues.
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 -----BEGIN PGP SIGNATURE-----
 
 iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCZ0lEog8cZ3JlZ0Brcm9h
 aC5jb20ACgkQMUfUDdst+ym+0ACgw6wN+LkLVIHWhxTq5DYHQ0QCxY8AoJrRIcKe
 78h0+OU3OXhOy8JGz62W
 =oI5S
 -----END PGP SIGNATURE-----

Merge tag 'driver-core-6.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core

Pull driver core updates from Greg KH:
 "Here is a small set of driver core changes for 6.13-rc1.

  Nothing major for this merge cycle, except for the two simple merge
  conflicts are here just to make life interesting.

  Included in here are:

   - sysfs core changes and preparations for more sysfs api cleanups
     that can come through all driver trees after -rc1 is out

   - fw_devlink fixes based on many reports and debugging sessions

   - list_for_each_reverse() removal, no one was using it!

   - last-minute seq_printf() format string bug found and fixed in many
     drivers all at once.

   - minor bugfixes and changes full details in the shortlog"

* tag 'driver-core-6.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (35 commits)
  Fix a potential abuse of seq_printf() format string in drivers
  cpu: Remove spurious NULL in attribute_group definition
  s390/con3215: Remove spurious NULL in attribute_group definition
  perf: arm-ni: Remove spurious NULL in attribute_group definition
  driver core: Constify bin_attribute definitions
  sysfs: attribute_group: allow registration of const bin_attribute
  firmware_loader: Fix possible resource leak in fw_log_firmware_info()
  drivers: core: fw_devlink: Fix excess parameter description in docstring
  driver core: class: Correct WARN() message in APIs class_(for_each|find)_device()
  cacheinfo: Use of_property_present() for non-boolean properties
  cdx: Fix cdx_mmap_resource() after constifying attr in ->mmap()
  drivers: core: fw_devlink: Make the error message a bit more useful
  phy: tegra: xusb: Set fwnode for xusb port devices
  drm: display: Set fwnode for aux bus devices
  driver core: fw_devlink: Stop trying to optimize cycle detection logic
  driver core: Constify attribute arguments of binary attributes
  sysfs: bin_attribute: add const read/write callback variants
  sysfs: implement all BIN_ATTR_* macros in terms of __BIN_ATTR()
  sysfs: treewide: constify attribute callback of bin_attribute::llseek()
  sysfs: treewide: constify attribute callback of bin_attribute::mmap()
  ...
2024-11-29 11:43:29 -08:00
Luis Chamberlain
3e1d95b63c selftests: kallsyms: fix and clarify current test boundaries
Provide and clarify the existing ranges and what you should expect.
Fix the gen_test_kallsyms.sh script to accept different ranges.

Fixes: 84b4a51fce ("selftests: add new kallsyms selftests")
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
2024-11-28 11:17:30 -08:00
Luis Chamberlain
7ea13556f7 selftests: kallsyms: fix double build stupidity
The current arrangement will have the test modules rebuilt on
any make without having the script or code actually change.
Take Masahiro Yamada's suggested fix and cleanups on the Makefile
to fix this.

Suggested-by: Masahiro Yamada <masahiroy@kernel.org>
Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Fixes: 84b4a51fce ("selftests: add new kallsyms selftests")
Closes: https://lore.kernel.org/all/CAK7LNATRDODmfz1tE=inV-DQqPA4G9vKH+38zMbaGdpTuFWZFw@mail.gmail.com/T/#me6c8f98e82acbee6e75a31b34bbb543eb4940b15
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
2024-11-28 11:17:07 -08:00
Linus Torvalds
b5361254c9 Modules changes for v6.13-rc1
Highlights for this merge window:
 
   * The whole caching of module code into huge pages by Mike Rapoport is going
     in through Andrew Morton's tree due to some other code dependencies. That's
     really the biggest highlight for Linux kernel modules in this release. With
     it we share huge pages for modules, starting off with x86. Expect to see that
     soon through Andrew!
 
   * Helge Deller addressed some lingering low hanging fruit alignment
     enhancements by. It is worth pointing out that from his old patch series
     I dropped his vmlinux.lds.h change at Masahiro's request as he would
     prefer this to be specified in asm code [0].
 
     [0] https://lore.kernel.org/all/20240129192644.3359978-5-mcgrof@kernel.org/T/#m9efef5e700fbecd28b7afb462c15eed8ba78ef5a
 
   * Matthew Maurer and Sami Tolvanen have been tag teaming to help
     get us closer to a modversions for Rust. In this cycle we take in
     quite a lot of the refactoring for ELF validation. I expect modversions
     for Rust will be merged by v6.14 as that code is mostly ready now.
 
   * Adds a new modules selftests: kallsyms which helps us tests find_symbol()
     and the limits of kallsyms on Linux today.
 
   * We have a realtime mailing list to kernel-ci testing for modules now
     which relies and combines patchwork, kpd and kdevops:
 
     - https://patchwork.kernel.org/project/linux-modules/list/
     - https://github.com/linux-kdevops/kdevops/blob/main/docs/kernel-ci/README.md
     - https://github.com/linux-kdevops/kdevops/blob/main/docs/kernel-ci/kernel-ci-kpd.md
     - https://github.com/linux-kdevops/kdevops/blob/main/docs/kernel-ci/linux-modules-kdevops-ci.md
 
     If you want to help avoid Linux kernel modules regressions, now its simple,
     just add a new Linux modules sefltests under tools/testing/selftests/module/
     That is it. All new selftests will be used and leveraged automatically by
     the CI.
 -----BEGIN PGP SIGNATURE-----
 
 iQJGBAABCgAwFiEENnNq2KuOejlQLZofziMdCjCSiKcFAmdGbrcSHG1jZ3JvZkBr
 ZXJuZWwub3JnAAoJEM4jHQowkoinIDEQAMa1H7hsneNT0Z/YewzOfdSKZIkTzpk3
 /fLl7PfWyFvk7yHT1JiUXidS/80SEMnWb+u8Sn00/uvcJomnPcK9oTwTzBQ0vefl
 FWIUM0DmBzBOi5xdjrPLjg5o6TFt7hVae3hoRJzIlLD02vGfrPYpyHo7XmRrLM4C
 8p+3geziwZMpjcGM254eSiTGxNL8z1iZVRsz8QrrBruRfBDnHNgwtmK097v13Xdb
 qmLX6CN2irmNPZSZwDqP8QL2sJk9qQpNdPmpjMvaY3VfaMVkM46FLy0k9yeXXNqw
 E1p/GuylCZq4NG1hic9zB1I1CE910ugCztJnPcGw4C7CSm54YoLiUJrIeRyTZhk6
 et9N25AlJHxyq72GIRTMQCA9Njxaavx5KilvuWYZmaILfeI0k/3gvcxUqp/EJQ9Q
 axPu69HJFRSKMVh1o+QrSaPmEtSydpYwuuNJ6ONRpq5I3bzOVDSCroceAdXEMO9K
 yoSfm4KwN/BSnmX6KVLonrSM91nv2/v9UokuaZMV/CsDpXIZs996PvAoopCm1Twb
 K3fv0uD+2q2FTOOBInkuRJo2zBUvNnDRPAS2pE3DMXy8xhsQXdovEpjijuCGb8eC
 y0R+I4RIugIB2n6YBUFfyma1veGlT3PtrWQnO6E3YJpv8bqIJoYVT5IGo9M9YRO9
 lzjtR9NzGtmh
 =Ny84
 -----END PGP SIGNATURE-----

Merge tag 'modules-6.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/modules/linux

Pull modules updates from Luis Chamberlain:

 - The whole caching of module code into huge pages by Mike Rapoport is
   going in through Andrew Morton's tree due to some other code
   dependencies. That's really the biggest highlight for Linux kernel
   modules in this release. With it we share huge pages for modules,
   starting off with x86. Expect to see that soon through Andrew!

 - Helge Deller addressed some lingering low hanging fruit alignment
   enhancements by. It is worth pointing out that from his old patch
   series I dropped his vmlinux.lds.h change at Masahiro's request as he
   would prefer this to be specified in asm code [0].

    [0] https://lore.kernel.org/all/20240129192644.3359978-5-mcgrof@kernel.org/T/#m9efef5e700fbecd28b7afb462c15eed8ba78ef5a

 - Matthew Maurer and Sami Tolvanen have been tag teaming to help get us
   closer to a modversions for Rust. In this cycle we take in quite a
   lot of the refactoring for ELF validation. I expect modversions for
   Rust will be merged by v6.14 as that code is mostly ready now.

 - Adds a new modules selftests: kallsyms which helps us tests
   find_symbol() and the limits of kallsyms on Linux today.

 - We have a realtime mailing list to kernel-ci testing for modules now
   which relies and combines patchwork, kpd and kdevops:

     https://patchwork.kernel.org/project/linux-modules/list/
     https://github.com/linux-kdevops/kdevops/blob/main/docs/kernel-ci/README.md
     https://github.com/linux-kdevops/kdevops/blob/main/docs/kernel-ci/kernel-ci-kpd.md
     https://github.com/linux-kdevops/kdevops/blob/main/docs/kernel-ci/linux-modules-kdevops-ci.md

   If you want to help avoid Linux kernel modules regressions, now its
   simple, just add a new Linux modules sefltests under
   tools/testing/selftests/module/ That is it. All new selftests will be
   used and leveraged automatically by the CI.

* tag 'modules-6.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/modules/linux:
  tests/module/gen_test_kallsyms.sh: use 0 value for variables
  scripts: Remove export_report.pl
  selftests: kallsyms: add MODULE_DESCRIPTION
  selftests: add new kallsyms selftests
  module: Reformat struct for code style
  module: Additional validation in elf_validity_cache_strtab
  module: Factor out elf_validity_cache_strtab
  module: Group section index calculations together
  module: Factor out elf_validity_cache_index_str
  module: Factor out elf_validity_cache_index_sym
  module: Factor out elf_validity_cache_index_mod
  module: Factor out elf_validity_cache_index_info
  module: Factor out elf_validity_cache_secstrings
  module: Factor out elf_validity_cache_sechdrs
  module: Factor out elf_validity_ehdr
  module: Take const arg in validate_section_offset
  modules: Add missing entry for __ex_table
  modules: Ensure 64-bit alignment on __ksymtab_* sections
2024-11-27 10:20:50 -08:00
Linus Torvalds
e06635e26c slab updates for 6.13
-----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCAAdFiEEe7vIQRWZI0iWSE3xu+CwddJFiJoFAmdERvEACgkQu+CwddJF
 iJre6Af9EBMVQiWJrmoMOjbGLqLgmZzSXRNxR862WGn4D/wesA1HmSlWgEn54hgc
 GIYIeD++v4JaIRNH0yZqb2UBSKjF/rYPDkKstnqgFaVakLoDrwkkwV2n3Gk5BEgR
 m/SzLGgoDWKR65I/oMpL6e2KrMOfMfjpB31qiVvdlaQd2Nv/5rw+gUVylxhNIZEH
 W11N3IC+e9hmgT3ZBpTmHeqNrlXE1+USWPrp/HV05Ndz6yf97JnP4Wr9f9pcyN3R
 aflLHR38+Q9cCfO7y8wNqtYvIV/kbqgdaqD76frSgalC4Lmz9+L+TZ2NuENCPoGj
 Xdbip2z+iffWhvqM+qooOLVxR0XqTA==
 =Sepb
 -----END PGP SIGNATURE-----

Merge tag 'slab-for-6.13-v2' of git://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab

Pull slab updates from Vlastimil Babka:

 - Add new slab_strict_numa boot parameter to enforce per-object memory
   policies on top of slab folio policies, for systems where saving cost
   of remote accesses is more important than minimizing slab allocation
   overhead (Christoph Lameter)

 - Fix for freeptr_offset alignment check being too strict for m68k
   (Geert Uytterhoeven)

 - krealloc() fixes for not violating __GFP_ZERO guarantees on
   krealloc() when slub_debug (redzone and object tracking) is enabled
   (Feng Tang)

 - Fix a memory leak in case sysfs registration fails for a slab cache,
   and also no longer fail to create the cache in that case (Hyeonggon
   Yoo)

 - Fix handling of detected consistency problems (due to buggy slab
   user) with slub_debug enabled, so that it does not cause further list
   corruption bugs (yuan.gao)

 - Code cleanup and kerneldocs polishing (Zhen Lei, Vlastimil Babka)

* tag 'slab-for-6.13-v2' of git://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab:
  slab: Fix too strict alignment check in create_cache()
  mm/slab: Allow cache creation to proceed even if sysfs registration fails
  mm/slub: Avoid list corruption when removing a slab from the full list
  mm/slub, kunit: Add testcase for krealloc redzone and zeroing
  mm/slub: Improve redzone check and zeroing for krealloc()
  mm/slub: Consider kfence case for get_orig_size()
  SLUB: Add support for per object memory policies
  mm, slab: add kerneldocs for common SLAB_ flags
  mm/slab: remove duplicate check in create_cache()
  mm/slub: Move krealloc() and related code to slub.c
  mm/kasan: Don't store metadata inside kmalloc object when slub_debug_orig_size is on
2024-11-25 16:51:24 -08:00
Linus Torvalds
f5f4745a7f - The series "resource: A couple of cleanups" from Andy Shevchenko
performs some cleanups in the resource management code.
 
 - The series "Improve the copy of task comm" from Yafang Shao addresses
   possible race-induced overflows in the management of task_struct.comm[].
 
 - The series "Remove unnecessary header includes from
   {tools/}lib/list_sort.c" from Kuan-Wei Chiu adds some cleanups and a
   small fix to the list_sort library code and to its selftest.
 
 - The series "Enhance min heap API with non-inline functions and
   optimizations" also from Kuan-Wei Chiu optimizes and cleans up the
   min_heap library code.
 
 - The series "nilfs2: Finish folio conversion" from Ryusuke Konishi
   finishes off nilfs2's folioification.
 
 - The series "add detect count for hung tasks" from Lance Yang adds more
   userspace visibility into the hung-task detector's activity.
 
 - Apart from that, singelton patches in many places - please see the
   individual changelogs for details.
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCZ0L6lQAKCRDdBJ7gKXxA
 jmEIAPwMSglNPKRIOgzOvHh8MUJW1Dy8iKJ2kWCO3f6QTUIM2AEA+PazZbUd/g2m
 Ii8igH0UBibIgva7MrCyJedDI1O23AA=
 =8BIU
 -----END PGP SIGNATURE-----

Merge tag 'mm-nonmm-stable-2024-11-24-02-05' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull non-MM updates from Andrew Morton:

 - The series "resource: A couple of cleanups" from Andy Shevchenko
   performs some cleanups in the resource management code

 - The series "Improve the copy of task comm" from Yafang Shao addresses
   possible race-induced overflows in the management of
   task_struct.comm[]

 - The series "Remove unnecessary header includes from
   {tools/}lib/list_sort.c" from Kuan-Wei Chiu adds some cleanups and a
   small fix to the list_sort library code and to its selftest

 - The series "Enhance min heap API with non-inline functions and
   optimizations" also from Kuan-Wei Chiu optimizes and cleans up the
   min_heap library code

 - The series "nilfs2: Finish folio conversion" from Ryusuke Konishi
   finishes off nilfs2's folioification

 - The series "add detect count for hung tasks" from Lance Yang adds
   more userspace visibility into the hung-task detector's activity

 - Apart from that, singelton patches in many places - please see the
   individual changelogs for details

* tag 'mm-nonmm-stable-2024-11-24-02-05' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (71 commits)
  gdb: lx-symbols: do not error out on monolithic build
  kernel/reboot: replace sprintf() with sysfs_emit()
  lib: util_macros_kunit: add kunit test for util_macros.h
  util_macros.h: fix/rework find_closest() macros
  Improve consistency of '#error' directive messages
  ocfs2: fix uninitialized value in ocfs2_file_read_iter()
  hung_task: add docs for hung_task_detect_count
  hung_task: add detect count for hung tasks
  dma-buf: use atomic64_inc_return() in dma_buf_getfile()
  fs/proc/kcore.c: fix coccinelle reported ERROR instances
  resource: avoid unnecessary resource tree walking in __region_intersects()
  ocfs2: remove unused errmsg function and table
  ocfs2: cluster: fix a typo
  lib/scatterlist: use sg_phys() helper
  checkpatch: always parse orig_commit in fixes tag
  nilfs2: convert metadata aops from writepage to writepages
  nilfs2: convert nilfs_recovery_copy_block() to take a folio
  nilfs2: convert nilfs_page_count_clean_buffers() to take a folio
  nilfs2: remove nilfs_writepage
  nilfs2: convert checkpoint file to be folio-based
  ...
2024-11-25 16:09:48 -08:00
Linus Torvalds
36843bfbf7 hardening updates for v6.13-rc1
- Disable __counted_by in Clang < 19.1.3 (Jan Hendrik Farr)
 
 - string_helpers: Silence output truncation warning (Bartosz Golaszewski)
 
 - compiler.h: Avoid needing BUILD_BUG_ON_ZERO() (Philipp Reisner)
 
 - MAINTAINERS: Add kernel hardening keywords __counted_by{_le|_be}
   (Thorsten Blum)
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQRSPkdeREjth1dHnSE2KwveOeQkuwUCZz9GaQAKCRA2KwveOeQk
 uwIhAP0dbxSOT3T7Xz7ZKqNKWvuyy8nkY5SqizXeThFXKQZGMgEApHJ2DVENHA+R
 mvFTq1t8JcFMUlBgBO1a6ow8/CCRmAI=
 =l0lE
 -----END PGP SIGNATURE-----

Merge tag 'hardening-v6.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux

Pull hardening updates from Kees Cook:

 - Disable __counted_by in Clang < 19.1.3 (Jan Hendrik Farr)

 - string_helpers: Silence output truncation warning (Bartosz
   Golaszewski)

 - compiler.h: Avoid needing BUILD_BUG_ON_ZERO() (Philipp Reisner)

 - MAINTAINERS: Add kernel hardening keywords __counted_by{_le|_be}
   (Thorsten Blum)

* tag 'hardening-v6.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
  Compiler Attributes: disable __counted_by for clang < 19.1.3
  compiler.h: Fix undefined BUILD_BUG_ON_ZERO()
  lib: string_helpers: silence snprintf() output truncation warning
  MAINTAINERS: Add kernel hardening keywords __counted_by{_le|_be}
2024-11-25 15:22:35 -08:00
Linus Torvalds
5c00ff742b - The series "zram: optimal post-processing target selection" from
Sergey Senozhatsky improves zram's post-processing selection algorithm.
   This leads to improved memory savings.
 
 - Wei Yang has gone to town on the mapletree code, contributing several
   series which clean up the implementation:
 
 	- "refine mas_mab_cp()"
 	- "Reduce the space to be cleared for maple_big_node"
 	- "maple_tree: simplify mas_push_node()"
 	- "Following cleanup after introduce mas_wr_store_type()"
 	- "refine storing null"
 
 - The series "selftests/mm: hugetlb_fault_after_madv improvements" from
   David Hildenbrand fixes this selftest for s390.
 
 - The series "introduce pte_offset_map_{ro|rw}_nolock()" from Qi Zheng
   implements some rationaizations and cleanups in the page mapping code.
 
 - The series "mm: optimize shadow entries removal" from Shakeel Butt
   optimizes the file truncation code by speeding up the handling of shadow
   entries.
 
 - The series "Remove PageKsm()" from Matthew Wilcox completes the
   migration of this flag over to being a folio-based flag.
 
 - The series "Unify hugetlb into arch_get_unmapped_area functions" from
   Oscar Salvador implements a bunch of consolidations and cleanups in the
   hugetlb code.
 
 - The series "Do not shatter hugezeropage on wp-fault" from Dev Jain
   takes away the wp-fault time practice of turning a huge zero page into
   small pages.  Instead we replace the whole thing with a THP.  More
   consistent cleaner and potentiall saves a large number of pagefaults.
 
 - The series "percpu: Add a test case and fix for clang" from Andy
   Shevchenko enhances and fixes the kernel's built in percpu test code.
 
 - The series "mm/mremap: Remove extra vma tree walk" from Liam Howlett
   optimizes mremap() by avoiding doing things which we didn't need to do.
 
 - The series "Improve the tmpfs large folio read performance" from
   Baolin Wang teaches tmpfs to copy data into userspace at the folio size
   rather than as individual pages.  A 20% speedup was observed.
 
 - The series "mm/damon/vaddr: Fix issue in
   damon_va_evenly_split_region()" fro Zheng Yejian fixes DAMON splitting.
 
 - The series "memcg-v1: fully deprecate charge moving" from Shakeel Butt
   removes the long-deprecated memcgv2 charge moving feature.
 
 - The series "fix error handling in mmap_region() and refactor" from
   Lorenzo Stoakes cleanup up some of the mmap() error handling and
   addresses some potential performance issues.
 
 - The series "x86/module: use large ROX pages for text allocations" from
   Mike Rapoport teaches x86 to use large pages for read-only-execute
   module text.
 
 - The series "page allocation tag compression" from Suren Baghdasaryan
   is followon maintenance work for the new page allocation profiling
   feature.
 
 - The series "page->index removals in mm" from Matthew Wilcox remove
   most references to page->index in mm/.  A slow march towards shrinking
   struct page.
 
 - The series "damon/{self,kunit}tests: minor fixups for DAMON debugfs
   interface tests" from Andrew Paniakin performs maintenance work for
   DAMON's self testing code.
 
 - The series "mm: zswap swap-out of large folios" from Kanchana Sridhar
   improves zswap's batching of compression and decompression.  It is a
   step along the way towards using Intel IAA hardware acceleration for
   this zswap operation.
 
 - The series "kasan: migrate the last module test to kunit" from
   Sabyrzhan Tasbolatov completes the migration of the KASAN built-in tests
   over to the KUnit framework.
 
 - The series "implement lightweight guard pages" from Lorenzo Stoakes
   permits userapace to place fault-generating guard pages within a single
   VMA, rather than requiring that multiple VMAs be created for this.
   Improved efficiencies for userspace memory allocators are expected.
 
 - The series "memcg: tracepoint for flushing stats" from JP Kobryn uses
   tracepoints to provide increased visibility into memcg stats flushing
   activity.
 
 - The series "zram: IDLE flag handling fixes" from Sergey Senozhatsky
   fixes a zram buglet which potentially affected performance.
 
 - The series "mm: add more kernel parameters to control mTHP" from
   Maíra Canal enhances our ability to control/configuremultisize THP from
   the kernel boot command line.
 
 - The series "kasan: few improvements on kunit tests" from Sabyrzhan
   Tasbolatov has a couple of fixups for the KASAN KUnit tests.
 
 - The series "mm/list_lru: Split list_lru lock into per-cgroup scope"
   from Kairui Song optimizes list_lru memory utilization when lockdep is
   enabled.
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCZzwFqgAKCRDdBJ7gKXxA
 jkeuAQCkl+BmeYHE6uG0hi3pRxkupseR6DEOAYIiTv0/l8/GggD/Z3jmEeqnZaNq
 xyyenpibWgUoShU2wZ/Ha8FE5WDINwg=
 =JfWR
 -----END PGP SIGNATURE-----

Merge tag 'mm-stable-2024-11-18-19-27' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull MM updates from Andrew Morton:

 - The series "zram: optimal post-processing target selection" from
   Sergey Senozhatsky improves zram's post-processing selection
   algorithm. This leads to improved memory savings.

 - Wei Yang has gone to town on the mapletree code, contributing several
   series which clean up the implementation:
	- "refine mas_mab_cp()"
	- "Reduce the space to be cleared for maple_big_node"
	- "maple_tree: simplify mas_push_node()"
	- "Following cleanup after introduce mas_wr_store_type()"
	- "refine storing null"

 - The series "selftests/mm: hugetlb_fault_after_madv improvements" from
   David Hildenbrand fixes this selftest for s390.

 - The series "introduce pte_offset_map_{ro|rw}_nolock()" from Qi Zheng
   implements some rationaizations and cleanups in the page mapping
   code.

 - The series "mm: optimize shadow entries removal" from Shakeel Butt
   optimizes the file truncation code by speeding up the handling of
   shadow entries.

 - The series "Remove PageKsm()" from Matthew Wilcox completes the
   migration of this flag over to being a folio-based flag.

 - The series "Unify hugetlb into arch_get_unmapped_area functions" from
   Oscar Salvador implements a bunch of consolidations and cleanups in
   the hugetlb code.

 - The series "Do not shatter hugezeropage on wp-fault" from Dev Jain
   takes away the wp-fault time practice of turning a huge zero page
   into small pages. Instead we replace the whole thing with a THP. More
   consistent cleaner and potentiall saves a large number of pagefaults.

 - The series "percpu: Add a test case and fix for clang" from Andy
   Shevchenko enhances and fixes the kernel's built in percpu test code.

 - The series "mm/mremap: Remove extra vma tree walk" from Liam Howlett
   optimizes mremap() by avoiding doing things which we didn't need to
   do.

 - The series "Improve the tmpfs large folio read performance" from
   Baolin Wang teaches tmpfs to copy data into userspace at the folio
   size rather than as individual pages. A 20% speedup was observed.

 - The series "mm/damon/vaddr: Fix issue in
   damon_va_evenly_split_region()" fro Zheng Yejian fixes DAMON
   splitting.

 - The series "memcg-v1: fully deprecate charge moving" from Shakeel
   Butt removes the long-deprecated memcgv2 charge moving feature.

 - The series "fix error handling in mmap_region() and refactor" from
   Lorenzo Stoakes cleanup up some of the mmap() error handling and
   addresses some potential performance issues.

 - The series "x86/module: use large ROX pages for text allocations"
   from Mike Rapoport teaches x86 to use large pages for
   read-only-execute module text.

 - The series "page allocation tag compression" from Suren Baghdasaryan
   is followon maintenance work for the new page allocation profiling
   feature.

 - The series "page->index removals in mm" from Matthew Wilcox remove
   most references to page->index in mm/. A slow march towards shrinking
   struct page.

 - The series "damon/{self,kunit}tests: minor fixups for DAMON debugfs
   interface tests" from Andrew Paniakin performs maintenance work for
   DAMON's self testing code.

 - The series "mm: zswap swap-out of large folios" from Kanchana Sridhar
   improves zswap's batching of compression and decompression. It is a
   step along the way towards using Intel IAA hardware acceleration for
   this zswap operation.

 - The series "kasan: migrate the last module test to kunit" from
   Sabyrzhan Tasbolatov completes the migration of the KASAN built-in
   tests over to the KUnit framework.

 - The series "implement lightweight guard pages" from Lorenzo Stoakes
   permits userapace to place fault-generating guard pages within a
   single VMA, rather than requiring that multiple VMAs be created for
   this. Improved efficiencies for userspace memory allocators are
   expected.

 - The series "memcg: tracepoint for flushing stats" from JP Kobryn uses
   tracepoints to provide increased visibility into memcg stats flushing
   activity.

 - The series "zram: IDLE flag handling fixes" from Sergey Senozhatsky
   fixes a zram buglet which potentially affected performance.

 - The series "mm: add more kernel parameters to control mTHP" from
   Maíra Canal enhances our ability to control/configuremultisize THP
   from the kernel boot command line.

 - The series "kasan: few improvements on kunit tests" from Sabyrzhan
   Tasbolatov has a couple of fixups for the KASAN KUnit tests.

 - The series "mm/list_lru: Split list_lru lock into per-cgroup scope"
   from Kairui Song optimizes list_lru memory utilization when lockdep
   is enabled.

* tag 'mm-stable-2024-11-18-19-27' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (215 commits)
  cma: enforce non-zero pageblock_order during cma_init_reserved_mem()
  mm/kfence: add a new kunit test test_use_after_free_read_nofault()
  zram: fix NULL pointer in comp_algorithm_show()
  memcg/hugetlb: add hugeTLB counters to memcg
  vmstat: call fold_vm_zone_numa_events() before show per zone NUMA event
  mm: mmap_lock: check trace_mmap_lock_$type_enabled() instead of regcount
  zram: ZRAM_DEF_COMP should depend on ZRAM
  MAINTAINERS/MEMORY MANAGEMENT: add document files for mm
  Docs/mm/damon: recommend academic papers to read and/or cite
  mm: define general function pXd_init()
  kmemleak: iommu/iova: fix transient kmemleak false positive
  mm/list_lru: simplify the list_lru walk callback function
  mm/list_lru: split the lock to per-cgroup scope
  mm/list_lru: simplify reparenting and initial allocation
  mm/list_lru: code clean up for reparenting
  mm/list_lru: don't export list_lru_add
  mm/list_lru: don't pass unnecessary key parameters
  kasan: add kunit tests for kmalloc_track_caller, kmalloc_node_track_caller
  kasan: change kasan_atomics kunit test as KUNIT_CASE_SLOW
  kasan: use EXPORT_SYMBOL_IF_KUNIT to export symbols
  ...
2024-11-23 09:58:07 -08:00
Linus Torvalds
e288c352a4 linux_kselftest-kunit-6.13-rc1-fixed
kunit update for Linux 6.13-rc1
 
 -- fixes user-after-free (UAF) bug in kunit_init_suite()
 
 -- adds option to kunit tool to print just the summary of test results
 
 -- adds option to kunit tool to print just the failed test results
 
 -- fixes kunit_zalloc_skb() to use user passed in gfp value instead of
    hardcoding GFP_KERNEL
 
 -- fixes kunit_zalloc_skb() kernel doc to include allocation flags variable
 
 -- updates KUnit email address for Brendan Higgins
 
 -- adds LoongArch config to qemu_configs
 
 -- changes tool to allow overriding the shutdown mode from qemu config
 
 -- enables shutdown in loongarch qemu_config
 
 -- fixes potential null dereference in kunit_device_driver_test()
 
 -- fixes debugfs to use IS_ERR() for alloc_string_stream() error check
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEPZKym/RZuOCGeA/kCwJExA0NQxwFAmc/zUAACgkQCwJExA0N
 QxyHtRAAo439aoXwyLs/tTezI3K8O3gT0QJTqOCuOdE3OtuqfHOn5LpAXEeEfAHo
 Mb0G0jXuMJSHrt3psUQFBEJ3+wKax8FtNoUJN7Tc8gbQJ1Ue8UBAVKdTUmca6lqB
 SwIaZA4hZBujj9z1YwnMYkTnuGAG03rzOjE1biZq+GNuo7cw5OQ/49yQA2BBR9fR
 zVxMcc2C0JfMkGezR74/gV7uWHyVmG/SOofoxLs2JsgGPN4hMAqtjLgVjqZ9CuQ+
 YJWKxrRNrhMJrfYxGHPI9Yvab919Vftkf+oUlJArjwCBCZQihkScuuwDvBdZx+u3
 Npt81j2lpglxJNiXIm2K5lsKs6djhT6omueOWVIlibF6Ee3Hmd3b7Xc+AJbJMUMe
 xQyro059spIP7ezgrGeC2hETkt6CAGD7K6R9iedh2oOozC/Tp03a+8jE6mux2EQm
 nhrQIqazbNRdnv3Pe3+G09peqgQ9FlH2hHoMEoSPM4JQNJWGYhDG19YwLO8fnzsO
 HHb4xS8DLxk0uu87HWI89GEptVi/s0LrqQeUJb0ZlAIdU842TsTSOWmxu6M6FBsx
 9hFkBVsFkAKt5912gnm4qQSgkwkNw4Ou7MnMbHPm/gFpmIkGPhqpCmQSScct3lZx
 YWNxQ5CcQeNk4ngWJz3d1Voq27tL1yd5Zvbg97v4wHnSyo4W+JY=
 =gRq+
 -----END PGP SIGNATURE-----

Merge tag 'linux_kselftest-kunit-6.13-rc1-fixed' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest

Pull kunit updates from Shuah Khan:

 - fix user-after-free (UAF) bug in kunit_init_suite()

 - add option to kunit tool to print just the summary of test results

 - add option to kunit tool to print just the failed test results

 - fix kunit_zalloc_skb() to use user passed in gfp value instead of
   hardcoding GFP_KERNEL

 - fixe kunit_zalloc_skb() kernel doc to include allocation flags
   variable

 - update KUnit email address for Brendan Higgins

 - add LoongArch config to qemu_configs

 - allow overriding the shutdown mode from qemu config

 - enable shutdown in loongarch qemu_config

 - fix potential null dereference in kunit_device_driver_test()

 - fix debugfs to use IS_ERR() for alloc_string_stream() error check

* tag 'linux_kselftest-kunit-6.13-rc1-fixed' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
  kunit: qemu_configs: loongarch: Enable shutdown
  kunit: tool: Allow overriding the shutdown mode from qemu config
  kunit: qemu_configs: Add LoongArch config
  kunit: debugfs: Use IS_ERR() for alloc_string_stream() error check
  kunit: Fix potential null dereference in kunit_device_driver_test()
  MAINTAINERS: Update KUnit email address for Brendan Higgins
  kunit: string-stream: Fix a UAF bug in kunit_init_suite()
  kunit: tool: print failed tests only
  kunit: tool: Only print the summary
  kunit: skb: add gfp to kernel doc for kunit_zalloc_skb()
  kunit: skb: use "gfp" variable instead of hardcoding GFP_KERNEL
2024-11-22 16:11:56 -08:00
Linus Torvalds
563cb0b1e7 cxl changes for v6.13
- Constify range_contains() input parameters to prevent changes.
 - Add support for displaying RCD capabilities in sysfs to support lspci for CXL device.
 - Downgrade warning message to debug in cxl_probe_component_regs().
 - Add support for adding a printf specifier '$pra' to emit 'struct range' content.
   - Add sanity tests for 'struct resource'.
   - Add documentation for special case.
   - Add %pra for 'struct range'.
   - Add %pra usage in CXL code.
 - Add preparation code for DCD support
   - Add range_overlaps().
   - Add CDAT DSMAS table shared and read only flag in ACPICA.
   - Add documentation to 'struct dev_dax_range'.
   - Delay event buffer allocation in CXL PCI code until needed.
   - Use guard() in cxl_dpa_set_mode().
   - Refactor create region code to consolidate common code.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEE5DAy15EJMCV1R6v9YGjFFmlTOEoFAmc84dMACgkQYGjFFmlT
 OEoGTg//cSJlQ9X7+xZDbngnzpJwcLzQkR/FXDfe3obtmgs7woDJgNNcYnKSlgyf
 wal47Q0UM/1Hv8Dtfrt62Ay1fmOvDL2GSpey35NVJGCEpIsfOqqk1zTCgfgwRHTO
 MZJLnOSFUIlDYlVz8ljLNHnNqPjr7dCoUh9tdBefvkw59FqbkHNcWI8hG1lh1SR4
 2frtJcqVg54S6vJa2eeWmNVpxz7RZvPFrb8TJzhdrGM8PkTMNFA2oJINAf0j00Ev
 8/T6HXTxXvFtNhBH0dtMO1MFh1d6Qr/zFnX/gmrnPWl1l/12HFDMBIZIzq/Whjpo
 +7hQ5xK3cwkMevFgFrAhwdZMj8maR84x1dbFItoThaoeDIQ4sGfyQEMPsbkZP/Sc
 67i5hQFIBZc+ORLB0W+z9Da52ZFGyVw/xsCmDRzXCw4s7N2twpydIoA7Pvu9NN1X
 3JVF35NrsRZ+PyuGWEitNjo0Rj6swNpBC5Xv/T1mgFtSgvVuk1T2QtSHJcPoQyzQ
 zbijsCKmvJYbdJBnPiotdrBs1BUxBsP9dBT9IxWzMy6lcEpTJrYpUheRCk2tSHFa
 Kk8O8IYNiBKZaSpN9UHKaGzr43H8gNbLf4svSIiu1lZJTSSdtWqfZZYjXFBgB1Vb
 l2gBCDmPJ0y7WKZSCa53UmQiOusr+l3Pi+OflZEfCy6JxbSqTTM=
 =GNlu
 -----END PGP SIGNATURE-----

Merge tag 'cxl-for-6.13' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl

Pull cxl updates from Dave Jiang:

 - Constify range_contains() input parameters to prevent changes

 - Add support for displaying RCD capabilities in sysfs to support lspci
   for CXL device

 - Downgrade warning message to debug in cxl_probe_component_regs()

 - Add support for adding a printf specifier '%pra' to emit 'struct
   range' content:
     - Add sanity tests for 'struct resource'
     - Add documentation for special case
     - Add %pra for 'struct range'
     - Add %pra usage in CXL code

 - Add preparation code for DCD support:
     - Add range_overlaps()
     - Add CDAT DSMAS table shared and read only flag in ACPICA
     - Add documentation to 'struct dev_dax_range'
     - Delay event buffer allocation in CXL PCI code until needed
     - Use guard() in cxl_dpa_set_mode()
     - Refactor create region code to consolidate common code

* tag 'cxl-for-6.13' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl:
  cxl/region: Refactor common create region code
  cxl/hdm: Use guard() in cxl_dpa_set_mode()
  cxl/pci: Delay event buffer allocation
  dax: Document struct dev_dax_range
  ACPI/CDAT: Add CDAT/DSMAS shared and read only flag values
  range: Add range_overlaps()
  cxl/cdat: Use %pra for dpa range outputs
  printf: Add print format (%pra) for struct range
  Documentation/printf: struct resource add start == end special case
  test printf: Add very basic struct resource tests
  cxl: downgrade a warning message to debug level in cxl_probe_component_regs()
  cxl/pci: Add sysfs attribute for CXL 1.1 device link status
  cxl/core/regs: Add rcd_pcie_cap initialization
  kernel/range: Const-ify range_contains parameters
2024-11-22 12:33:52 -08:00
Linus Torvalds
fcc79e1714 Networking changes for 6.13.
The most significant set of changes is the per netns RTNL. The new
 behavior is disabled by default, regression risk should be contained.
 
 Notably the new config knob PTP_1588_CLOCK_VMCLOCK will inherit its
 default value from PTP_1588_CLOCK_KVM, as the first is intended to be
 a more reliable replacement for the latter.
 
 Core
 ----
 
  - Started a very large, in-progress, effort to make the RTNL lock
    scope per network-namespace, thus reducing the lock contention
    significantly in the containerized use-case, comprising:
    - RCU-ified some relevant slices of the FIB control path
    - introduce basic per netns locking helpers
    - namespacified the IPv4 address hash table
    - remove rtnl_register{,_module}() in favour of rtnl_register_many()
    - refactor rtnl_{new,del,set}link() moving as much validation as
      possible out of RTNL lock
    - convert all phonet doit() and dumpit() handlers to RCU
    - convert IPv4 addresses manipulation to per-netns RTNL
    - convert virtual interface creation to per-netns RTNL
    the per-netns lock infra is guarded by the CONFIG_DEBUG_NET_SMALL_RTNL
    knob, disabled by default ad interim.
 
  - Introduce NAPI suspension, to efficiently switching between busy
    polling (NAPI processing suspended) and normal processing.
 
  - Migrate the IPv4 routing input, output and control path from direct
    ToS usage to DSCP macros. This is a work in progress to make ECN
    handling consistent and reliable.
 
  - Add drop reasons support to the IPv4 rotue input path, allowing
    better introspection in case of packets drop.
 
  - Make FIB seqnum lockless, dropping RTNL protection for read
    access.
 
  - Make inet{,v6} addresses hashing less predicable.
 
  - Allow providing timestamp OPT_ID via cmsg, to correlate TX packets
    and timestamps
 
 Things we sprinkled into general kernel code
 --------------------------------------------
 
  - Add small file operations for debugfs, to reduce the struct ops size.
 
  - Refactoring and optimization for the implementation of page_frag API,
    This is a preparatory work to consolidate the page_frag
    implementation.
 
 Netfilter
 ---------
 
  - Optimize set element transactions to reduce memory consumption
 
  - Extended netlink error reporting for attribute parser failure.
 
  - Make legacy xtables configs user selectable, giving users
    the option to configure iptables without enabling any other config.
 
  - Address a lot of false-positive RCU issues, pointed by recent
    CI improvements.
 
 BPF
 ---
 
  - Put xsk sockets on a struct diet and add various cleanups. Overall,
    this helps to bump performance by 12% for some workloads.
 
  - Extend BPF selftests to increase coverage of XDP features in
    combination with BPF cpumap.
 
  - Optimize and homogenize bpf_csum_diff helper for all archs and also
    add a batch of new BPF selftests for it.
 
  - Extend netkit with an option to delegate skb->{mark,priority}
    scrubbing to its BPF program.
 
  - Make the bpf_get_netns_cookie() helper available also to tc(x) BPF
    programs.
 
 Protocols
 ---------
 
  - Introduces 4-tuple hash for connected udp sockets, speeding-up
    significantly connected sockets lookup.
 
  - Add a fastpath for some TCP timers that usually expires after close,
    the socket lock contention.
 
  - Add inbound and outbound xfrm state caches to speed up state lookups.
 
  - Avoid sending MPTCP advertisements on stale subflows, reducing
    risks on loosing them.
 
  - Make neighbours table flushing more scalable, maintaining per device
    neigh lists.
 
 Driver API
 ----------
 
  - Introduce a unified interface to configure transmission H/W shaping,
    and expose it to user-space via generic-netlink.
 
  - Add support for per-NAPI config via netlink. This makes napi
    configuration persistent across queues removal and re-creation.
    Requires driver updates, currently supported drivers are:
    nVidia/Mellanox mlx4 and mlx5, Broadcom brcm and Intel ice.
 
  - Add ethtool support for writing SFP / PHY firmware blocks.
 
  - Track RSS context allocation from ethtool core.
 
  - Implement support for mirroring to DSA CPU port, via TC mirror
    offload.
 
  - Consolidate FDB updates notification, to avoid duplicates on
    device-specific entries.
 
  - Expose DPLL clock quality level to the user-space.
 
  - Support master-slave PHY config via device tree.
 
 Tests and tooling
 -----------------
 
  - forwarding: introduce deferred commands, to simplify
    the cleanup phase
 
 Drivers
 -------
 
  - Updated several drivers - Amazon vNic, Google vNic, Microsoft vNic,
    Intel e1000e and Broadcom Tigon3 - to use netdev-genl to link the
    IRQs and queues to NAPI IDs, allowing busy polling and better
    introspection.
 
  - Ethernet high-speed NICs:
    - nVidia/Mellanox:
      - mlx5:
        - a large refactor to implement support for cross E-Switch
          scheduling
        - refactor H/W conter management to let it scale better
        - H/W GRO cleanups
    - Intel (100G, ice)::
      - adds support for ethtool reset
      - implement support for per TX queue H/W shaping
    - AMD/Solarflare:
      - implement per device queue stats support
    - Broadcom (bnxt):
      - improve wildcard l4proto on IPv4/IPv6 ntuple rules
    - Marvell Octeon:
      - Adds representor support for each Resource Virtualization Unit
        (RVU) device.
    - Hisilicon:
      - adds support for the BMC Gigabit Ethernet
    - IBM (EMAC):
      - driver cleanup and modernization
    - Cisco (VIC):
      - raise the queues number limit to 256
 
  - Ethernet virtual:
    - Google vNIC:
      - implements page pool support
    - macsec:
      - inherit lower device's features and TSO limits when offloading
    - virtio_net:
      - enable premapped mode by default
      - support for XDP socket(AF_XDP) zerocopy TX
    - wireguard:
      - set the TSO max size to be GSO_MAX_SIZE, to aggregate larger
        packets.
 
  - Ethernet NICs embedded and virtual:
    - Broadcom ASP:
      - enable software timestamping
    - Freescale:
      - add enetc4 PF driver
    - MediaTek: Airoha SoC:
      - implement BQL support
    - RealTek r8169:
      - enable TSO by default on r8168/r8125
      - implement extended ethtool stats
    - Renesas AVB:
      - enable TX checksum offload
    - Synopsys (stmmac):
      - support header splitting for vlan tagged packets
      - move common code for DWMAC4 and DWXGMAC into a separate FPE
        module.
      - Add the dwmac driver support for T-HEAD TH1520 SoC
    - Synopsys (xpcs):
      - driver refactor and cleanup
    - TI:
      - icssg_prueth: add VLAN offload support
    - Xilinx emaclite:
      - adds clock support
 
  - Ethernet switches:
    - Microchip:
      - implement support for the lan969x Ethernet switch family
      - add LAN9646 switch support to KSZ DSA driver
 
  - Ethernet PHYs:
    - Marvel: 88q2x: enable auto negotiation
    - Microchip: add support for LAN865X Rev B1 and LAN867X Rev C1/C2
 
  - PTP:
    - Add support for the Amazon virtual clock device
    - Add PtP driver for s390 clocks
 
  - WiFi:
    - mac80211
      - EHT 1024 aggregation size for transmissions
      - new operation to indicate that a new interface is to be added
      - support radio separation of multi-band devices
      - move wireless extension spy implementation to libiw
    - Broadcom:
      - brcmfmac: optional LPO clock support
    - Microchip:
      - add support for Atmel WILC3000
    - Qualcomm (ath12k):
      - firmware coredump collection support
      - add debugfs support for a multitude of statistics
    - Qualcomm (ath5k):
      -  Arcadyan ARV45XX AR2417 & Gigaset SX76[23] AR241[34]A support
    - Realtek:
      - rtw88: 8821au and 8812au USB adapters support
      - rtw89: add thermal protection
      - rtw89: fine tune BT-coexsitence to improve user experience
      - rtw89: firmware secure boot for WiFi 6 chip
 
  - Bluetooth
      - add Qualcomm WCN785x support for ids Foxconn 0xe0fc/0xe0f3 and
        0x13d3:0x3623
      - add Realtek RTL8852BE support for id Foxconn 0xe123
      - add MediaTek MT7920 support for wireless module ids
      - btintel_pcie: add handshake between driver and firmware
      - btintel_pcie: add recovery mechanism
      - btnxpuart: add GPIO support to power save feature
 
 Signed-off-by: Paolo Abeni <pabeni@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 
 iQJGBAABCAAwFiEEg1AjqC77wbdLX2LbKSR5jcyPE6QFAmc8sukSHHBhYmVuaUBy
 ZWRoYXQuY29tAAoJECkkeY3MjxOkLEYQAIMM6Qjh0bh3Byr3gOS1xZzXG+APLjP4
 9Jr0p3i+X53i90jvVqzeVO5FTc95MVHSKZ3kvPkDMXSLUaEJxocNHCI5Dzl/2/qL
 wWdpUB6/ou+jKB4Bn6Z8OvVODT7qrr0tVa9M2/fuKWrIsOU/ntIhG8EhnGddk5U/
 vKPSf5PUIb81uNRnF58VusY3wrT1dEoh9VfJYxL+ST+inPxjEAMy6Y+lmlsjGaSX
 jrS+Pp9KYiUwl3Qt0AQs+cG4OHkJdjbnChrfosWwpkiyddO8klVq06+wX/TiSzfF
 b9VZtBfy/GZs3lkE1mQkcILdtX5pP3YHQdpsuxFfVI0JHVszx2ck7WdoRux/8F0v
 kKZsYcO7bH9I1wMFP66Ff9hIbdEQaeucK+KdDkXyPNMfP91Vzmfjii8IBxOC36Ie
 BbOeFUrXyTxxJ2u0vf/X9JtIq8bcrkNrSd1n1jlGPMqG3FVzsY95+Oi4qfsyeUbl
 lS1PlVTqPMPFdX54HnxM3y2rJjhd7iXhkvmtuXNjRFThXlOiK3maAPWlM1aZ3b8u
 Vjs4JFUsW0tleZG+RzANjsGjXbf7AiPUGLZt+acem0K+fcjG4i5aGIAJrxwa/ORx
 eG74IZRt5cOI371W7gNLGHjwnuge8tFPgOWcRP2eozNm7jvMYALBejYS7eWUTvaf
 THcvVM+bupEZ
 =GzPr
 -----END PGP SIGNATURE-----

Merge tag 'net-next-6.13' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next

Pull networking updates from Paolo Abeni:
 "The most significant set of changes is the per netns RTNL. The new
  behavior is disabled by default, regression risk should be contained.

  Notably the new config knob PTP_1588_CLOCK_VMCLOCK will inherit its
  default value from PTP_1588_CLOCK_KVM, as the first is intended to be
  a more reliable replacement for the latter.

  Core:

   - Started a very large, in-progress, effort to make the RTNL lock
     scope per network-namespace, thus reducing the lock contention
     significantly in the containerized use-case, comprising:
       - RCU-ified some relevant slices of the FIB control path
       - introduce basic per netns locking helpers
       - namespacified the IPv4 address hash table
       - remove rtnl_register{,_module}() in favour of
         rtnl_register_many()
       - refactor rtnl_{new,del,set}link() moving as much validation as
         possible out of RTNL lock
       - convert all phonet doit() and dumpit() handlers to RCU
       - convert IPv4 addresses manipulation to per-netns RTNL
       - convert virtual interface creation to per-netns RTNL
     the per-netns lock infrastructure is guarded by the
     CONFIG_DEBUG_NET_SMALL_RTNL knob, disabled by default ad interim.

   - Introduce NAPI suspension, to efficiently switching between busy
     polling (NAPI processing suspended) and normal processing.

   - Migrate the IPv4 routing input, output and control path from direct
     ToS usage to DSCP macros. This is a work in progress to make ECN
     handling consistent and reliable.

   - Add drop reasons support to the IPv4 rotue input path, allowing
     better introspection in case of packets drop.

   - Make FIB seqnum lockless, dropping RTNL protection for read access.

   - Make inet{,v6} addresses hashing less predicable.

   - Allow providing timestamp OPT_ID via cmsg, to correlate TX packets
     and timestamps

  Things we sprinkled into general kernel code:

   - Add small file operations for debugfs, to reduce the struct ops
     size.

   - Refactoring and optimization for the implementation of page_frag
     API, This is a preparatory work to consolidate the page_frag
     implementation.

  Netfilter:

   - Optimize set element transactions to reduce memory consumption

   - Extended netlink error reporting for attribute parser failure.

   - Make legacy xtables configs user selectable, giving users the
     option to configure iptables without enabling any other config.

   - Address a lot of false-positive RCU issues, pointed by recent CI
     improvements.

  BPF:

   - Put xsk sockets on a struct diet and add various cleanups. Overall,
     this helps to bump performance by 12% for some workloads.

   - Extend BPF selftests to increase coverage of XDP features in
     combination with BPF cpumap.

   - Optimize and homogenize bpf_csum_diff helper for all archs and also
     add a batch of new BPF selftests for it.

   - Extend netkit with an option to delegate skb->{mark,priority}
     scrubbing to its BPF program.

   - Make the bpf_get_netns_cookie() helper available also to tc(x) BPF
     programs.

  Protocols:

   - Introduces 4-tuple hash for connected udp sockets, speeding-up
     significantly connected sockets lookup.

   - Add a fastpath for some TCP timers that usually expires after
     close, the socket lock contention.

   - Add inbound and outbound xfrm state caches to speed up state
     lookups.

   - Avoid sending MPTCP advertisements on stale subflows, reducing
     risks on loosing them.

   - Make neighbours table flushing more scalable, maintaining per
     device neigh lists.

  Driver API:

   - Introduce a unified interface to configure transmission H/W
     shaping, and expose it to user-space via generic-netlink.

   - Add support for per-NAPI config via netlink. This makes napi
     configuration persistent across queues removal and re-creation.
     Requires driver updates, currently supported drivers are:
     nVidia/Mellanox mlx4 and mlx5, Broadcom brcm and Intel ice.

   - Add ethtool support for writing SFP / PHY firmware blocks.

   - Track RSS context allocation from ethtool core.

   - Implement support for mirroring to DSA CPU port, via TC mirror
     offload.

   - Consolidate FDB updates notification, to avoid duplicates on
     device-specific entries.

   - Expose DPLL clock quality level to the user-space.

   - Support master-slave PHY config via device tree.

  Tests and tooling:

   - forwarding: introduce deferred commands, to simplify the cleanup
     phase

  Drivers:

   - Updated several drivers - Amazon vNic, Google vNic, Microsoft vNic,
     Intel e1000e and Broadcom Tigon3 - to use netdev-genl to link the
     IRQs and queues to NAPI IDs, allowing busy polling and better
     introspection.

   - Ethernet high-speed NICs:
      - nVidia/Mellanox:
         - mlx5:
           - a large refactor to implement support for cross E-Switch
             scheduling
           - refactor H/W conter management to let it scale better
           - H/W GRO cleanups
      - Intel (100G, ice)::
         - add support for ethtool reset
         - implement support for per TX queue H/W shaping
      - AMD/Solarflare:
         - implement per device queue stats support
      - Broadcom (bnxt):
         - improve wildcard l4proto on IPv4/IPv6 ntuple rules
      - Marvell Octeon:
         - Add representor support for each Resource Virtualization Unit
           (RVU) device.
      - Hisilicon:
         - add support for the BMC Gigabit Ethernet
      - IBM (EMAC):
         - driver cleanup and modernization
      - Cisco (VIC):
         - raise the queues number limit to 256

   - Ethernet virtual:
      - Google vNIC:
         - implement page pool support
      - macsec:
         - inherit lower device's features and TSO limits when
           offloading
      - virtio_net:
         - enable premapped mode by default
         - support for XDP socket(AF_XDP) zerocopy TX
      - wireguard:
         - set the TSO max size to be GSO_MAX_SIZE, to aggregate larger
           packets.

   - Ethernet NICs embedded and virtual:
      - Broadcom ASP:
         - enable software timestamping
      - Freescale:
         - add enetc4 PF driver
      - MediaTek: Airoha SoC:
         - implement BQL support
      - RealTek r8169:
         - enable TSO by default on r8168/r8125
         - implement extended ethtool stats
      - Renesas AVB:
         - enable TX checksum offload
      - Synopsys (stmmac):
         - support header splitting for vlan tagged packets
         - move common code for DWMAC4 and DWXGMAC into a separate FPE
           module.
         - add dwmac driver support for T-HEAD TH1520 SoC
      - Synopsys (xpcs):
         - driver refactor and cleanup
      - TI:
         - icssg_prueth: add VLAN offload support
      - Xilinx emaclite:
         - add clock support

   - Ethernet switches:
      - Microchip:
         - implement support for the lan969x Ethernet switch family
         - add LAN9646 switch support to KSZ DSA driver

   - Ethernet PHYs:
      - Marvel: 88q2x: enable auto negotiation
      - Microchip: add support for LAN865X Rev B1 and LAN867X Rev C1/C2

   - PTP:
      - Add support for the Amazon virtual clock device
      - Add PtP driver for s390 clocks

   - WiFi:
      - mac80211
         - EHT 1024 aggregation size for transmissions
         - new operation to indicate that a new interface is to be added
         - support radio separation of multi-band devices
         - move wireless extension spy implementation to libiw
      - Broadcom:
         - brcmfmac: optional LPO clock support
      - Microchip:
         - add support for Atmel WILC3000
      - Qualcomm (ath12k):
         - firmware coredump collection support
         - add debugfs support for a multitude of statistics
      - Qualcomm (ath5k):
         -  Arcadyan ARV45XX AR2417 & Gigaset SX76[23] AR241[34]A support
      - Realtek:
         - rtw88: 8821au and 8812au USB adapters support
         - rtw89: add thermal protection
         - rtw89: fine tune BT-coexsitence to improve user experience
         - rtw89: firmware secure boot for WiFi 6 chip

   - Bluetooth
      - add Qualcomm WCN785x support for ids Foxconn 0xe0fc/0xe0f3 and
        0x13d3:0x3623
      - add Realtek RTL8852BE support for id Foxconn 0xe123
      - add MediaTek MT7920 support for wireless module ids
      - btintel_pcie: add handshake between driver and firmware
      - btintel_pcie: add recovery mechanism
      - btnxpuart: add GPIO support to power save feature"

* tag 'net-next-6.13' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1475 commits)
  mm: page_frag: fix a compile error when kernel is not compiled
  Documentation: tipc: fix formatting issue in tipc.rst
  selftests: nic_performance: Add selftest for performance of NIC driver
  selftests: nic_link_layer: Add selftest case for speed and duplex states
  selftests: nic_link_layer: Add link layer selftest for NIC driver
  bnxt_en: Add FW trace coredump segments to the coredump
  bnxt_en: Add a new ethtool -W dump flag
  bnxt_en: Add 2 parameters to bnxt_fill_coredump_seg_hdr()
  bnxt_en: Add functions to copy host context memory
  bnxt_en: Do not free FW log context memory
  bnxt_en: Manage the FW trace context memory
  bnxt_en: Allocate backing store memory for FW trace logs
  bnxt_en: Add a 'force' parameter to bnxt_free_ctx_mem()
  bnxt_en: Refactor bnxt_free_ctx_mem()
  bnxt_en: Add mem_valid bit to struct bnxt_ctx_mem_type
  bnxt_en: Update firmware interface spec to 1.10.3.85
  selftests/bpf: Add some tests with sockmap SK_PASS
  bpf: fix recursive lock when verdict program return SK_PASS
  wireguard: device: support big tcp GSO
  wireguard: selftests: load nf_conntrack if not present
  ...
2024-11-21 08:28:08 -08:00
Linus Torvalds
79caa6c88a asm-generic updates for 6.13
These are a number of unrelated cleanups, generally simplifying the
 architecture specific header files:
 
  - A series from Al Viro simplifies asm/vga.h, after it turns out that
    most of it can be generalized.
 
  - A series from Julian Vetter adds a common version of
    memcpy_{to,from}io() and memset_io() and changes most architectures
    to use that instead of their own implementation
 
  - A series from Niklas Schnelle concludes his work to make PC
    style inb()/outb() optional
 
  - Nicolas Pitre contributes improvements for the generic do_div()
    helper
 
  - Christoph Hellwig adds a generic version of page_to_phys()
    and phys_to_page(), replacing the slightly different architecture
    specific definitions.
 
  - Uwe Kleine-Koenig has a minor cleanup for ioctl definitions
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEiK/NIGsWEZVxh/FrYKtH/8kJUicFAmc+Z0gACgkQYKtH/8kJ
 UicqzA/8CcqVdcWKlFAyiFI62DCkd3iYm/joNK3/JhvUIvVFvY+HI0+XpTeOEN1r
 dfYBNg/KTVSbia5MEEy28Lk5WdoA3X7p9E8NuYC1ik/qvH3Y0kXDU2NiRcJDwalq
 u56tGUwDITFUzRo47a4Z53JpV60FlGaUVjuKp1jJiOQkcs/iussVYuti8mNVb1ud
 1tf21TEAIywq43IC8CxevIRsBkJBqMhalaGWYgKw3ZTwXdiKaXed6RH7IjPodanN
 6b7R6aFEqlT7usFX9vLOYNRGzd3HIueXOT1iqiiGI1lm5u/iutxKH+8eS4q381oN
 WJL0jQdo4sv2MxtSHYrjpzPRQpSp/qrin29h3PVjwBjZF3i5WvFeTYgfjQEEkqe0
 fpTXjUsr5n1F1pGV90DtJHwaD5TxKD4VYFLDRCDGUiAnWPkZ7EYUBL3SA6GqEkXB
 1lVRPsEBo0y867/WQcoCZA/x7ANZDI6bDZ6fjumwx8OCZOHZeN6FGtqQJHcVZR5O
 +nu/j3I8YH1tZGKbA+wliyQwt/T60Oxs62HHcFzFLGakARwUEDYO53IGCJUByFwk
 kCrgNVvzFklwWpqqyTADqb5lkQKpZr5gIdpst185qttCQkb+EFWiCi9w2inXTjHl
 2oCc7Uf0cvoxnhVlJAw73eGTtpqS37KCWK+iNyrQbOfy+hgIv+w=
 =zEHk
 -----END PGP SIGNATURE-----

Merge tag 'asm-generic-3.13' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic

Pull asm-generic updates from Arnd Bergmann:
 "These are a number of unrelated cleanups, generally simplifying the
  architecture specific header files:

   - A series from Al Viro simplifies asm/vga.h, after it turns out that
     most of it can be generalized.

   - A series from Julian Vetter adds a common version of
     memcpy_{to,from}io() and memset_io() and changes most architectures
     to use that instead of their own implementation

   - A series from Niklas Schnelle concludes his work to make PC style
     inb()/outb() optional

   - Nicolas Pitre contributes improvements for the generic do_div()
     helper

   - Christoph Hellwig adds a generic version of page_to_phys() and
     phys_to_page(), replacing the slightly different architecture
     specific definitions.

   - Uwe Kleine-Koenig has a minor cleanup for ioctl definitions"

* tag 'asm-generic-3.13' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic: (24 commits)
  empty include/asm-generic/vga.h
  sparc: get rid of asm/vga.h
  asm/vga.h: don't bother with scr_mem{cpy,move}v() unless we need to
  vt_buffer.h: get rid of dead code in default scr_...() instances
  tty: serial: export serial_8250_warn_need_ioport
  lib/iomem_copy: fix kerneldoc format style
  hexagon: simplify asm/io.h for !HAS_IOPORT
  loongarch: Use new fallback IO memcpy/memset
  csky: Use new fallback IO memcpy/memset
  arm64: Use new fallback IO memcpy/memset
  New implementation for IO memcpy and IO memset
  watchdog: Add HAS_IOPORT dependency for SBC8360 and SBC7240
  __arch_xprod64(): make __always_inline when optimizing for performance
  ARM: div64: improve __arch_xprod_64()
  asm-generic/div64: optimize/simplify __div64_const32()
  lib/math/test_div64: add some edge cases relevant to __div64_const32()
  asm-generic: add an optional pfn_valid check to page_to_phys
  asm-generic: provide generic page_to_phys and phys_to_page implementations
  asm-generic/io.h: Remove I/O port accessors for HAS_IOPORT=n
  tty: serial: handle HAS_IOPORT dependencies
  ...
2024-11-20 15:13:02 -08:00
Linus Torvalds
e6de688e93 Devicetree updates for v6.13:
Bindings:
 
 - Enable dtc "interrupt_provider" warnings for binding examples.
   Fix the warnings in fsl,mu-msi and ti,sci-inta due to this.
 
 - Convert zii,rave-sp-wdt, zii,rave-sp-pwrbutton,  and
   altr,fpga-passive-serial to DT schema format
 
 - Add some documentation on the different forms of YAML text blocks
   which are a constant source of review comments
 
 - Fix some schema errors in constraints for arrays
 
 - Add compatibles for qcom,sar2130p-pdc and onnn,adt7462
 
 DT core:
 
 - Allow overlay kunit tests to run CONFIG_OF_OVERLAY=n
 
 - Add some warnings on deprecated address handling
 
 - Rework early_init_dt_scan() so the arch can pass in the phys address
   of the DTB as __pa() is not always valid to use. This fixes a warning
   for arm64 with kexec.
 
 - Add and use some new DT graph iterators for iterating over ports and
   endpoints
 
 - Rework reserved-memory handling to be sized dynamically for fixed
   regions
 
 - Optimize of_modalias() to avoid a strlen() call
 
 - Constify struct device_node and property pointers where ever possible
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEktVUI4SxYhzZyEuo+vtdtY28YcMFAmc7qaoACgkQ+vtdtY28
 YcN54g/+Ifz4hQTSWV+VBhihovMMPiQUdxZ+MfJfPnPcZ7NJzaTf+zqhZyS4wQou
 v0pdtyR0B1fCM/EvKaYD+1aTTAQFEIT5Dqac+9ePwqaYqSk+yCTxyzW9m+P3rTPV
 THo8SGRss7T+Rs+2WaUGxphTJItMGIRdbBvoqK+82EdKFXXKw2BSD8tlJTWwbTam
 9xkrpUzw7f4FvVY8vVhRyOd5i8/v+FH8D65DMIT6ME9zRn4MzKVzCg6udgYeCBld
 C2XbV+wnyewtjrN2IX+2uQ2mheb7yJu3AEI3iFR5x/sRrsSLpisxrUl38xOOpxrM
 XxYtHgE3omjagQ+y+L2PMthlKvhFrXVXIvhUH8xxje5z1Vyq3VMfiABkHlMpAnys
 5LY4xEhvqDkPNo65UmjMiHxGW/xtcKsmAZBOp+HLerZfCJIFvl380fi8mNg/Sjvz
 7ExCSpzCPsHASZg7QCTplU3BUtg+067Ch/k8Hsn/Og73Pqm3xH4IezQZKwweN9ZT
 LC6OQBI7C3Yt1hom9qgUcA4H4/aaPxTVV7i0DGuAKh8Lon6SaoX2yFpweUBgbsL/
 c9DIW4vbYBIGASxxUbHlNMKvPCKACKmpFXhsnH5Waj+VWSOwsJ8bjGpH8PfMKdFW
 dyJB/r94GqCGpCW7+FC1qGmXiQJGkCo89pKBVjSf4Kj45ht/76o=
 =NCYS
 -----END PGP SIGNATURE-----

Merge tag 'devicetree-for-6.13' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux

Pull devicetree updates from Rob Herring:
 "Bindings:

   - Enable dtc "interrupt_provider" warnings for binding examples. Fix
     the warnings in fsl,mu-msi and ti,sci-inta due to this.

   - Convert zii,rave-sp-wdt, zii,rave-sp-pwrbutton, and
     altr,fpga-passive-serial to DT schema format

   - Add some documentation on the different forms of YAML text blocks
     which are a constant source of review comments

   - Fix some schema errors in constraints for arrays

   - Add compatibles for qcom,sar2130p-pdc and onnn,adt7462

  DT core:

   - Allow overlay kunit tests to run CONFIG_OF_OVERLAY=n

   - Add some warnings on deprecated address handling

   - Rework early_init_dt_scan() so the arch can pass in the phys
     address of the DTB as __pa() is not always valid to use. This fixes
     a warning for arm64 with kexec.

   - Add and use some new DT graph iterators for iterating over ports
     and endpoints

   - Rework reserved-memory handling to be sized dynamically for fixed
     regions

   - Optimize of_modalias() to avoid a strlen() call

   - Constify struct device_node and property pointers where ever
     possible"

* tag 'devicetree-for-6.13' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux: (36 commits)
  of: Allow overlay kunit tests to run CONFIG_OF_OVERLAY=n
  dt-bindings: interrupt-controller: qcom,pdc: Add SAR2130P compatible
  of/address: Rework bus matching to avoid warnings
  of: WARN on deprecated #address-cells/#size-cells handling
  of/fdt: Don't use default address cell sizes for address translation
  dt-bindings: Enable dtc "interrupt_provider" warnings
  of/fdt: add dt_phys arg to early_init_dt_scan and early_init_dt_verify
  dt-bindings: cache: qcom,llcc: Fix X1E80100 reg entries
  dt-bindings: watchdog: convert zii,rave-sp-wdt.txt to yaml format
  dt-bindings: input: convert zii,rave-sp-pwrbutton.txt to yaml
  media: xilinx-tpg: use new of_graph functions
  fbdev: omapfb: use new of_graph functions
  gpu: drm: omapdrm: use new of_graph functions
  ASoC: audio-graph-card2: use new of_graph functions
  ASoC: audio-graph-card: use new of_graph functions
  ASoC: test-component: use new of_graph functions
  of: property: use new of_graph functions
  of: property: add of_graph_get_next_port_endpoint()
  of: property: add of_graph_get_next_port()
  of: module: remove strlen() call in of_modalias()
  ...
2024-11-20 13:19:25 -08:00
Linus Torvalds
bf9aa14fc5 A rather large update for timekeeping and timers:
- The final step to get rid of auto-rearming posix-timers
 
     posix-timers are currently auto-rearmed by the kernel when the signal
     of the timer is ignored so that the timer signal can be delivered once
     the corresponding signal is unignored.
 
     This requires to throttle the timer to prevent a DoS by small intervals
     and keeps the system pointlessly out of low power states for no value.
     This is a long standing non-trivial problem due to the lock order of
     posix-timer lock and the sighand lock along with life time issues as
     the timer and the sigqueue have different life time rules.
 
     Cure this by:
 
      * Embedding the sigqueue into the timer struct to have the same life
        time rules. Aside of that this also avoids the lookup of the timer
        in the signal delivery and rearm path as it's just a always valid
        container_of() now.
 
      * Queuing ignored timer signals onto a seperate ignored list.
 
      * Moving queued timer signals onto the ignored list when the signal is
        switched to SIG_IGN before it could be delivered.
 
      * Walking the ignored list when SIG_IGN is lifted and requeue the
        signals to the actual signal lists. This allows the signal delivery
        code to rearm the timer.
 
     This also required to consolidate the signal delivery rules so they are
     consistent across all situations. With that all self test scenarios
     finally succeed.
 
   - Core infrastructure for VFS multigrain timestamping
 
     This is required to allow the kernel to use coarse grained time stamps
     by default and switch to fine grained time stamps when inode attributes
     are actively observed via getattr().
 
     These changes have been provided to the VFS tree as well, so that the
     VFS specific infrastructure could be built on top.
 
   - Cleanup and consolidation of the sleep() infrastructure
 
     * Move all sleep and timeout functions into one file
 
     * Rework udelay() and ndelay() into proper documented inline functions
       and replace the hardcoded magic numbers by proper defines.
 
     * Rework the fsleep() implementation to take the reality of the timer
       wheel granularity on different HZ values into account. Right now the
       boundaries are hard coded time ranges which fail to provide the
       requested accuracy on different HZ settings.
 
     * Update documentation for all sleep/timeout related functions and fix
       up stale documentation links all over the place
 
     * Fixup a few usage sites
 
   - Rework of timekeeping and adjtimex(2) to prepare for multiple PTP clocks
 
     A system can have multiple PTP clocks which are participating in
     seperate and independent PTP clock domains. So far the kernel only
     considers the PTP clock which is based on CLOCK TAI relevant as that's
     the clock which drives the timekeeping adjustments via the various user
     space daemons through adjtimex(2).
 
     The non TAI based clock domains are accessible via the file descriptor
     based posix clocks, but their usability is very limited. They can't be
     accessed fast as they always go all the way out to the hardware and
     they cannot be utilized in the kernel itself.
 
     As Time Sensitive Networking (TSN) gains traction it is required to
     provide fast user and kernel space access to these clocks.
 
     The approach taken is to utilize the timekeeping and adjtimex(2)
     infrastructure to provide this access in a similar way how the kernel
     provides access to clock MONOTONIC, REALTIME etc.
 
     Instead of creating a duplicated infrastructure this rework converts
     timekeeping and adjtimex(2) into generic functionality which operates
     on pointers to data structures instead of using static variables.
 
     This allows to provide time accessors and adjtimex(2) functionality for
     the independent PTP clocks in a subsequent step.
 
   - Consolidate hrtimer initialization
 
     hrtimers are set up by initializing the data structure and then
     seperately setting the callback function for historical reasons.
 
     That's an extra unnecessary step and makes Rust support less straight
     forward than it should be.
 
     Provide a new set of hrtimer_setup*() functions and convert the core
     code and a few usage sites of the less frequently used interfaces over.
 
     The bulk of the htimer_init() to hrtimer_setup() conversion is already
     prepared and scheduled for the next merge window.
 
   - Drivers:
 
     * Ensure that the global timekeeping clocksource is utilizing the
       cluster 0 timer on MIPS multi-cluster systems.
 
       Otherwise CPUs on different clusters use their cluster specific
       clocksource which is not guaranteed to be synchronized with other
       clusters.
 
     * Mostly boring cleanups, fixes, improvements and code movement
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAmc7kPITHHRnbHhAbGlu
 dXRyb25peC5kZQAKCRCmGPVMDXSYoZKkD/9OUL6fOJrDUmOYBa4QVeMyfTef4EaL
 tvwIMM/29XQFeiq3xxCIn+EMnHjXn2lvIhYGQ7GKsbKYwvJ7ZBDpQb+UMhZ2nKI9
 6D6BP6WomZohKeH2fZbJQAdqOi3KRYdvQdIsVZUexkqiaVPphRvOH9wOr45gHtZM
 EyMRSotPlQTDqcrbUejDMEO94GyjDCYXRsyATLxjmTzL/N4xD4NRIiotjM2vL/a9
 8MuCgIhrKUEyYlFoOxxeokBsF3kk3/ez2jlG9b/N8VLH3SYIc2zgL58FBgWxlmgG
 bY71nVG3nUgEjxBd2dcXAVVqvb+5widk8p6O7xxOAQKTLMcJ4H0tQDkMnzBtUzvB
 DGAJDHAmAr0g+ja9O35Pkhunkh4HYFIbq0Il4d1HMKObhJV0JumcKuQVxrXycdm3
 UZfq3seqHsZJQbPgCAhlFU0/2WWScocbee9bNebGT33KVwSp5FoVv89C/6Vjb+vV
 Gusc3thqrQuMAZW5zV8g4UcBAA/xH4PB0I+vHib+9XPZ4UQ7/6xKl2jE0kd5hX7n
 AAUeZvFNFqIsY+B6vz+Jx/yzyM7u5cuXq87pof5EHVFzv56lyTp4ToGcOGYRgKH5
 JXeYV1OxGziSDrd5vbf9CzdWMzqMvTefXrHbWrjkjhNOe8E1A8O88RZ5uRKZhmSw
 hZZ4hdM9+3T7cg==
 =2VC6
 -----END PGP SIGNATURE-----

Merge tag 'timers-core-2024-11-18' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull timer updates from Thomas Gleixner:
 "A rather large update for timekeeping and timers:

   - The final step to get rid of auto-rearming posix-timers

     posix-timers are currently auto-rearmed by the kernel when the
     signal of the timer is ignored so that the timer signal can be
     delivered once the corresponding signal is unignored.

     This requires to throttle the timer to prevent a DoS by small
     intervals and keeps the system pointlessly out of low power states
     for no value. This is a long standing non-trivial problem due to
     the lock order of posix-timer lock and the sighand lock along with
     life time issues as the timer and the sigqueue have different life
     time rules.

     Cure this by:

       - Embedding the sigqueue into the timer struct to have the same
         life time rules. Aside of that this also avoids the lookup of
         the timer in the signal delivery and rearm path as it's just a
         always valid container_of() now.

       - Queuing ignored timer signals onto a seperate ignored list.

       - Moving queued timer signals onto the ignored list when the
         signal is switched to SIG_IGN before it could be delivered.

       - Walking the ignored list when SIG_IGN is lifted and requeue the
         signals to the actual signal lists. This allows the signal
         delivery code to rearm the timer.

     This also required to consolidate the signal delivery rules so they
     are consistent across all situations. With that all self test
     scenarios finally succeed.

   - Core infrastructure for VFS multigrain timestamping

     This is required to allow the kernel to use coarse grained time
     stamps by default and switch to fine grained time stamps when inode
     attributes are actively observed via getattr().

     These changes have been provided to the VFS tree as well, so that
     the VFS specific infrastructure could be built on top.

   - Cleanup and consolidation of the sleep() infrastructure

       - Move all sleep and timeout functions into one file

       - Rework udelay() and ndelay() into proper documented inline
         functions and replace the hardcoded magic numbers by proper
         defines.

       - Rework the fsleep() implementation to take the reality of the
         timer wheel granularity on different HZ values into account.
         Right now the boundaries are hard coded time ranges which fail
         to provide the requested accuracy on different HZ settings.

       - Update documentation for all sleep/timeout related functions
         and fix up stale documentation links all over the place

       - Fixup a few usage sites

   - Rework of timekeeping and adjtimex(2) to prepare for multiple PTP
     clocks

     A system can have multiple PTP clocks which are participating in
     seperate and independent PTP clock domains. So far the kernel only
     considers the PTP clock which is based on CLOCK TAI relevant as
     that's the clock which drives the timekeeping adjustments via the
     various user space daemons through adjtimex(2).

     The non TAI based clock domains are accessible via the file
     descriptor based posix clocks, but their usability is very limited.
     They can't be accessed fast as they always go all the way out to
     the hardware and they cannot be utilized in the kernel itself.

     As Time Sensitive Networking (TSN) gains traction it is required to
     provide fast user and kernel space access to these clocks.

     The approach taken is to utilize the timekeeping and adjtimex(2)
     infrastructure to provide this access in a similar way how the
     kernel provides access to clock MONOTONIC, REALTIME etc.

     Instead of creating a duplicated infrastructure this rework
     converts timekeeping and adjtimex(2) into generic functionality
     which operates on pointers to data structures instead of using
     static variables.

     This allows to provide time accessors and adjtimex(2) functionality
     for the independent PTP clocks in a subsequent step.

   - Consolidate hrtimer initialization

     hrtimers are set up by initializing the data structure and then
     seperately setting the callback function for historical reasons.

     That's an extra unnecessary step and makes Rust support less
     straight forward than it should be.

     Provide a new set of hrtimer_setup*() functions and convert the
     core code and a few usage sites of the less frequently used
     interfaces over.

     The bulk of the htimer_init() to hrtimer_setup() conversion is
     already prepared and scheduled for the next merge window.

   - Drivers:

       - Ensure that the global timekeeping clocksource is utilizing the
         cluster 0 timer on MIPS multi-cluster systems.

         Otherwise CPUs on different clusters use their cluster specific
         clocksource which is not guaranteed to be synchronized with
         other clusters.

       - Mostly boring cleanups, fixes, improvements and code movement"

* tag 'timers-core-2024-11-18' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (140 commits)
  posix-timers: Fix spurious warning on double enqueue versus do_exit()
  clocksource/drivers/arm_arch_timer: Use of_property_present() for non-boolean properties
  clocksource/drivers/gpx: Remove redundant casts
  clocksource/drivers/timer-ti-dm: Fix child node refcount handling
  dt-bindings: timer: actions,owl-timer: convert to YAML
  clocksource/drivers/ralink: Add Ralink System Tick Counter driver
  clocksource/drivers/mips-gic-timer: Always use cluster 0 counter as clocksource
  clocksource/drivers/timer-ti-dm: Don't fail probe if int not found
  clocksource/drivers:sp804: Make user selectable
  clocksource/drivers/dw_apb: Remove unused dw_apb_clockevent functions
  hrtimers: Delete hrtimer_init_on_stack()
  alarmtimer: Switch to use hrtimer_setup() and hrtimer_setup_on_stack()
  io_uring: Switch to use hrtimer_setup_on_stack()
  sched/idle: Switch to use hrtimer_setup_on_stack()
  hrtimers: Delete hrtimer_init_sleeper_on_stack()
  wait: Switch to use hrtimer_setup_sleeper_on_stack()
  timers: Switch to use hrtimer_setup_sleeper_on_stack()
  net: pktgen: Switch to use hrtimer_setup_sleeper_on_stack()
  futex: Switch to use hrtimer_setup_sleeper_on_stack()
  fs/aio: Switch to use hrtimer_setup_sleeper_on_stack()
  ...
2024-11-19 16:35:06 -08:00
Linus Torvalds
fb1dd1403c A set of changes for debugobjects:
- Prevent destroying the kmem_cache on early failure.
 
     Destroying a kmem_cache requires work queues to be set up, but in the
     early failure case they are not yet initializated. So rather leak the
     cache instead of triggering a BUG.
 
   - Reduce parallel pool fill attempts.
 
     Refilling the object pool requires to take the global pool lock, which
     causes a massive performance issue when a large number of CPUs attempt
     to refill concurrently. It turns out that it's sufficient to let one
     CPU handle the refill from the to free list and in case there are not
     enough objects on it to allocate new objects from the kmem cache.
 
     This also splits the free list handling from the actual allocation path
     as that yields better results on RT where allocation is restricted to
     preemptible code paths. The refill from free list has no such
     restrictions.
 
   - Consolidate the global and the per CPU pools to use the same data
     structure, so all helper functions can be shared.
 
   - Simplify the object allocation/free logic.
 
     The allocation/free logic is an incomprehensible maze, which tries to
     utilize the to free list and the global pool in the best way. This all
     can be simplified into a straight forward comprehensible code flow.
 
   - Convert the allocation/free mechanism to batch mode.
 
     Transferring objects from the global pool to the per CPU pools or vice
     versa is done by walking the hlist and moving object by object. That
     not only increases the pool lock held time, it also dirties up to 17
     cache lines.
 
     This can be avoided by storing the pointer to the first object in a
     batch of 16 objects in the objects themself and propagate it through
     the batch when an object is enqueued into a pool or to a temporary
     hlist head on allocation.
 
     This allows to move batches of objects with at max four cache lines
     dirtied and reduces the pool lock held time and therefore contention
     significantly.
 
   - Improve the object reusage
 
     The current implementation is too agressively freeing unused objects,
     which is counterproductive on bursty workloads like a kernel compile.
 
     Address this by:
 
     	* increasing the per CPU pool size
 
 	* refilling the per CPU pool from the to be freed pool when the per
           CPU pool emptied a batch
 
 	* keeping track of object usage with a exponentially wheighted
           moving average which prevents the work queue callback to free
           objects prematuraly.
 
     This combined reduces the allocation/free rate for a full kernel
     compile significantly:
 
                 kmem_cache_alloc()  kmem_cache_free()
     Baseline:   380k                330k
     Improved:   170k                117k
 
   - A few cleanups and a more cache line friendly layout of debug
     information on top.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAmc7ezETHHRnbHhAbGlu
 dXRyb25peC5kZQAKCRCmGPVMDXSYoYqOD/42X0//BzqdCs0W3jAuaSxbcncp14en
 kxuKJVcIOwTwiry5xnSD647YYBdXGZyEa1FR84eFpI6cM6O68mCm+Q4Ab+O02MwC
 1tAAQ7fS3fhPBHip6RQtBygexH8WXH3I9BeeXkzQgMCyyObkjRSL3oLIGA4Azfuo
 q79LNZ5ctp9zd2DMWD/h+DEzYKr7LZfCMeoxXKLv6BdpZSS35cZhX4u7uu7DPryE
 AWPCFCE/bEv/QQZ9bUz9Zc8KXsclcgrPXn/ubP8NVK6IHJ2RjIXqBDzQo0C2+QVi
 yb/XdjmQJXNxb3RZxOpwwrefy/jhd8h41rY3prnfnHBU8XU7IFUgN6MfAC46peZR
 dXOLGxsLhJk2xaGcddqD7rSDA1hm7Dpn6ZtTbgiaxWd+ksUCxQckkzWCYlGXl3Az
 4M0LeexWEBKQYBAb1XjAOmfWmndVZWJ6QDFNMN67o0YZt4Uh2APSV/0fevUBGjzT
 nVWxDzN0a/0kMuvmFtwnReVnnGKixC4X3AV4/mvNYQOoRhSrTxjwkBn2TxvZ+3Sh
 v5uNGkUGe3dXS4XBWbytm/HeDdzKZ/C3KATm+bHSqQ+/ktxuCp13EhiursYf5Yc/
 44T8sPEcSTj+xWHLZpsJfz0lpQM4q3KJj0HPQkSIHUD5KWTMkBSFonuBF6jHkf9H
 R4OsmrvXTdFG5g==
 =zxbA
 -----END PGP SIGNATURE-----

Merge tag 'core-debugobjects-2024-11-18' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull debugobjects updates from Thomas Gleixner:

 - Prevent destroying the kmem_cache on early failure.

   Destroying a kmem_cache requires work queues to be set up, but in the
   early failure case they are not yet initializated. So rather leak the
   cache instead of triggering a BUG.

 - Reduce parallel pool fill attempts.

   Refilling the object pool requires to take the global pool lock,
   which causes a massive performance issue when a large number of CPUs
   attempt to refill concurrently. It turns out that it's sufficient to
   let one CPU handle the refill from the to free list and in case there
   are not enough objects on it to allocate new objects from the kmem
   cache.

   This also splits the free list handling from the actual allocation
   path as that yields better results on RT where allocation is
   restricted to preemptible code paths. The refill from free list has
   no such restrictions.

 - Consolidate the global and the per CPU pools to use the same data
   structure, so all helper functions can be shared.

 - Simplify the object allocation/free logic.

   The allocation/free logic is an incomprehensible maze, which tries to
   utilize the to free list and the global pool in the best way. This
   all can be simplified into a straight forward comprehensible code
   flow.

 - Convert the allocation/free mechanism to batch mode.

   Transferring objects from the global pool to the per CPU pools or
   vice versa is done by walking the hlist and moving object by object.
   That not only increases the pool lock held time, it also dirties up
   to 17 cache lines.

   This can be avoided by storing the pointer to the first object in a
   batch of 16 objects in the objects themself and propagate it through
   the batch when an object is enqueued into a pool or to a temporary
   hlist head on allocation.

   This allows to move batches of objects with at max four cache lines
   dirtied and reduces the pool lock held time and therefore contention
   significantly.

 - Improve the object reusage

   The current implementation is too agressively freeing unused objects,
   which is counterproductive on bursty workloads like a kernel compile.

   Address this by:

      * increasing the per CPU pool size

      * refilling the per CPU pool from the to be freed pool when the
        per CPU pool emptied a batch

      * keeping track of object usage with a exponentially wheighted
        moving average which prevents the work queue callback to free
        objects prematuraly.

   This combined reduces the allocation/free rate for a full kernel
   compile significantly:

                  kmem_cache_alloc()  kmem_cache_free()
      Baseline:   380k                330k
      Improved:   170k                117k

 - A few cleanups and a more cache line friendly layout of debug
   information on top.

* tag 'core-debugobjects-2024-11-18' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (25 commits)
  debugobjects: Track object usage to avoid premature freeing of objects
  debugobjects: Refill per CPU pool more agressively
  debugobjects: Double the per CPU slots
  debugobjects: Move pool statistics into global_pool struct
  debugobjects: Implement batch processing
  debugobjects: Prepare kmem_cache allocations for batching
  debugobjects: Prepare for batching
  debugobjects: Use static key for boot pool selection
  debugobjects: Rework free_object_work()
  debugobjects: Rework object freeing
  debugobjects: Rework object allocation
  debugobjects: Move min/max count into pool struct
  debugobjects: Rename and tidy up per CPU pools
  debugobjects: Use separate list head for boot pool
  debugobjects: Move pools into a datastructure
  debugobjects: Reduce parallel pool fill attempts
  debugobjects: Make debug_objects_enabled bool
  debugobjects: Provide and use free_object_list()
  debugobjects: Remove pointless debug printk
  debugobjects: Reuse put_objects() on OOM
  ...
2024-11-19 15:20:04 -08:00
Kuan-Wei Chiu
95b6d723a0 kunit: debugfs: Use IS_ERR() for alloc_string_stream() error check
The alloc_string_stream() function only returns ERR_PTR(-ENOMEM) on
failure and never returns NULL. Therefore, switching the error check in
the caller from IS_ERR_OR_NULL to IS_ERR improves clarity, indicating
that this function will return an error pointer (not NULL) when an
error occurs. This change avoids any ambiguity regarding the function's
return behavior.

Link: https://lore.kernel.org/lkml/Zy9deU5VK3YR+r9N@visitorckw-System-Product-Name
Signed-off-by: Kuan-Wei Chiu <visitorckw@gmail.com>
Reviewed-by: David Gow <davidgow@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2024-11-19 15:18:13 -07:00
Zichen Xie
435c20eed5 kunit: Fix potential null dereference in kunit_device_driver_test()
kunit_kzalloc() may return a NULL pointer, dereferencing it without
NULL check may lead to NULL dereference.
Add a NULL check for test_state.

Link: https://lore.kernel.org/r/20241115054335.21673-1-zichenxie0106@gmail.com
Fixes: d03c720e03 ("kunit: Add APIs for managing devices")
Signed-off-by: Zichen Xie <zichenxie0106@gmail.com>
Cc: stable@vger.kernel.org
Reviewed-by: David Gow <davidgow@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2024-11-19 15:17:51 -07:00
Jinjie Ruan
39e21403c9 kunit: string-stream: Fix a UAF bug in kunit_init_suite()
In kunit_debugfs_create_suite(), if alloc_string_stream() fails in the
kunit_suite_for_each_test_case() loop, the "suite->log = stream"
has assigned before, and the error path only free the suite->log's stream
memory but not set it to NULL, so the later string_stream_clear() of
suite->log in kunit_init_suite() will cause below UAF bug.

Set stream pointer to NULL after free to fix it.

	Unable to handle kernel paging request at virtual address 006440150000030d
	Mem abort info:
	  ESR = 0x0000000096000004
	  EC = 0x25: DABT (current EL), IL = 32 bits
	  SET = 0, FnV = 0
	  EA = 0, S1PTW = 0
	  FSC = 0x04: level 0 translation fault
	Data abort info:
	  ISV = 0, ISS = 0x00000004, ISS2 = 0x00000000
	  CM = 0, WnR = 0, TnD = 0, TagAccess = 0
	  GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
	[006440150000030d] address between user and kernel address ranges
	Internal error: Oops: 0000000096000004 [#1] PREEMPT SMP
	Dumping ftrace buffer:
	   (ftrace buffer empty)
	Modules linked in: iio_test_gts industrialio_gts_helper cfg80211 rfkill ipv6 [last unloaded: iio_test_gts]
	CPU: 5 UID: 0 PID: 6253 Comm: modprobe Tainted: G    B   W        N 6.12.0-rc4+ #458
	Tainted: [B]=BAD_PAGE, [W]=WARN, [N]=TEST
	Hardware name: linux,dummy-virt (DT)
	pstate: 40000005 (nZcv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
	pc : string_stream_clear+0x54/0x1ac
	lr : string_stream_clear+0x1a8/0x1ac
	sp : ffffffc080b47410
	x29: ffffffc080b47410 x28: 006440550000030d x27: ffffff80c96b5e98
	x26: ffffff80c96b5e80 x25: ffffffe461b3f6c0 x24: 0000000000000003
	x23: ffffff80c96b5e88 x22: 1ffffff019cdf4fc x21: dfffffc000000000
	x20: ffffff80ce6fa7e0 x19: 032202a80000186d x18: 0000000000001840
	x17: 0000000000000000 x16: 0000000000000000 x15: ffffffe45c355cb4
	x14: ffffffe45c35589c x13: ffffffe45c03da78 x12: ffffffb810168e75
	x11: 1ffffff810168e74 x10: ffffffb810168e74 x9 : dfffffc000000000
	x8 : 0000000000000004 x7 : 0000000000000003 x6 : 0000000000000001
	x5 : ffffffc080b473a0 x4 : 0000000000000000 x3 : 0000000000000000
	x2 : 0000000000000001 x1 : ffffffe462fbf620 x0 : dfffffc000000000
	Call trace:
	 string_stream_clear+0x54/0x1ac
	 __kunit_test_suites_init+0x108/0x1d8
	 kunit_exec_run_tests+0xb8/0x100
	 kunit_module_notify+0x400/0x55c
	 notifier_call_chain+0xfc/0x3b4
	 blocking_notifier_call_chain+0x68/0x9c
	 do_init_module+0x24c/0x5c8
	 load_module+0x4acc/0x4e90
	 init_module_from_file+0xd4/0x128
	 idempotent_init_module+0x2d4/0x57c
	 __arm64_sys_finit_module+0xac/0x100
	 invoke_syscall+0x6c/0x258
	 el0_svc_common.constprop.0+0x160/0x22c
	 do_el0_svc+0x44/0x5c
	 el0_svc+0x48/0xb8
	 el0t_64_sync_handler+0x13c/0x158
	 el0t_64_sync+0x190/0x194
	Code: f9400753 d2dff800 f2fbffe0 d343fe7c (38e06b80)
	---[ end trace 0000000000000000 ]---
	Kernel panic - not syncing: Oops: Fatal exception

Link: https://lore.kernel.org/r/20241112080314.407966-1-ruanjinjie@huawei.com
Cc: stable@vger.kernel.org
Fixes: a3fdf78478 ("kunit: string-stream: Decouple string_stream from kunit")
Suggested-by: Kuan-Wei Chiu <visitorckw@gmail.com>
Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
Reviewed-by: Kuan-Wei Chiu <visitorckw@gmail.com>
Reviewed-by: David Gow <davidgow@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2024-11-19 15:16:47 -07:00
Linus Torvalds
364eeb79a2 Locking changes for v6.13 are:
- lockdep:
     - Enable PROVE_RAW_LOCK_NESTING with PROVE_LOCKING (Sebastian Andrzej Siewior)
     - Add lockdep_cleanup_dead_cpu() (David Woodhouse)
 
  - futexes:
     - Use atomic64_inc_return() in get_inode_sequence_number() (Uros Bizjak)
     - Use atomic64_try_cmpxchg_relaxed() in get_inode_sequence_number() (Uros Bizjak)
 
  - RT locking:
     - Add sparse annotation PREEMPT_RT's locking (Sebastian Andrzej Siewior)
 
  - spinlocks:
     - Use atomic_try_cmpxchg_release() in osq_unlock() (Uros Bizjak)
 
  - atomics:
     - x86: Use ALT_OUTPUT_SP() for __alternative_atomic64() (Uros Bizjak)
     - x86: Use ALT_OUTPUT_SP() for __arch_{,try_}cmpxchg64_emu() (Uros Bizjak)
 
  - KCSAN, seqlocks:
     - Support seqcount_latch_t (Marco Elver)
 
  - <linux/cleanup.h>:
     - Add if_not_cond_guard() conditional guard helper (David Lechner)
     - Adjust scoped_guard() macros to avoid potential warning (Przemek Kitszel)
     - Remove address space of returned pointer (Uros Bizjak)
 
  - WW mutexes:
     - locking/ww_mutex: Adjust to lockdep nest_lock requirements (Thomas Hellström)
 
  - Rust integration:
     - Fix raw_spin_lock initialization on PREEMPT_RT (Eder Zulian)
 
  - miscellaneous cleanups & fixes:
     - lockdep: Fix wait-type check related warnings (Ahmed Ehab)
     - lockdep: Use info level for initial info messages (Jiri Slaby)
     - spinlocks: Make __raw_* lock ops static (Geert Uytterhoeven)
     - pvqspinlock: Convert fields of 'enum vcpu_state' to uppercase (Qiuxu Zhuo)
     - iio: magnetometer: Fix if () scoped_guard() formatting (Stephen Rothwell)
     - rtmutex: Fix misleading comment (Peter Zijlstra)
     - percpu-rw-semaphores: Fix grammar in percpu-rw-semaphore.rst (Xiu Jianfeng)
 
 Signed-off-by: Ingo Molnar <mingo@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAmc7AkQRHG1pbmdvQGtl
 cm5lbC5vcmcACgkQEnMQ0APhK1hGqQ/+KWR5arkoJjH/Nf5IyezYitOwqK7YAdJk
 mrWoZcez0DRopNTf8yZMv1m8jyx7W9KUQumEO/ghqJRlBW+AbxZ1t99kmqWI5Aw0
 +zmhpyo06JHeMYQAfKJXX3iRt2Rt59BPHtGzoop6b0e2i55+uPE+DZTNm2+FwCV9
 4vxmfpYyg5/sJB9/v5b0N9TTDe9a8caOHXU5F+HA1yWuxMmqFuDFIcpKrgS/sUeP
 NelOLbh2L3UOPWP6tRRfpajxCQTmRoeZOQQv0L9dd3jYpyQOCesgKqOhqNTCU8KK
 qamTPig2N00smSLp6I/OVyJ96vFYZrbhyq0kwMayaafAU7mB8lzcfUj+8qP0c90k
 1PROtD1XpF3Nobp1F+YUp3sQxEGdCgs+9VeLWWObv2b/Vt3MDZijdEiC/3OkRAUh
 LPCfl/ky41BmT8AlaxRDjkyrN7hH4oUOkGUdVx6yR389J0OR9MSwEX9qNaMw8bBg
 1ALvv9+OR3QhTWyG30PGqUf3Um230oIdWuWxwFrhaoMmDVEVMRZQMtvQahi5hDYq
 zyX79DKWtExEe/f2hY1m/6eNm6st5HE7X7scOba3TamQzvOzJkjzo7XoS2yeUAjb
 eByO2G0PvTrA0TFls6Hyrl6db5OW5KjQnVWr6W3fiWL5YIdh0SQMkWeaGVvGyfy8
 Q3vhk7POaZo=
 =BvPn
 -----END PGP SIGNATURE-----

Merge tag 'locking-core-2024-11-18' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull locking updates from Ingo Molnar:
 "Lockdep:
   - Enable PROVE_RAW_LOCK_NESTING with PROVE_LOCKING (Sebastian Andrzej
     Siewior)
   - Add lockdep_cleanup_dead_cpu() (David Woodhouse)

  futexes:
   - Use atomic64_inc_return() in get_inode_sequence_number() (Uros
     Bizjak)
   - Use atomic64_try_cmpxchg_relaxed() in get_inode_sequence_number()
     (Uros Bizjak)

  RT locking:
   - Add sparse annotation PREEMPT_RT's locking (Sebastian Andrzej
     Siewior)

  spinlocks:
   - Use atomic_try_cmpxchg_release() in osq_unlock() (Uros Bizjak)

  atomics:
   - x86: Use ALT_OUTPUT_SP() for __alternative_atomic64() (Uros Bizjak)
   - x86: Use ALT_OUTPUT_SP() for __arch_{,try_}cmpxchg64_emu() (Uros
     Bizjak)

  KCSAN, seqlocks:
   - Support seqcount_latch_t (Marco Elver)

  <linux/cleanup.h>:
   - Add if_not_guard() conditional guard helper (David Lechner)
   - Adjust scoped_guard() macros to avoid potential warning (Przemek
     Kitszel)
   - Remove address space of returned pointer (Uros Bizjak)

  WW mutexes:
   - locking/ww_mutex: Adjust to lockdep nest_lock requirements (Thomas
     Hellström)

  Rust integration:
   - Fix raw_spin_lock initialization on PREEMPT_RT (Eder Zulian)

  Misc cleanups & fixes:
   - lockdep: Fix wait-type check related warnings (Ahmed Ehab)
   - lockdep: Use info level for initial info messages (Jiri Slaby)
   - spinlocks: Make __raw_* lock ops static (Geert Uytterhoeven)
   - pvqspinlock: Convert fields of 'enum vcpu_state' to uppercase
     (Qiuxu Zhuo)
   - iio: magnetometer: Fix if () scoped_guard() formatting (Stephen
     Rothwell)
   - rtmutex: Fix misleading comment (Peter Zijlstra)
   - percpu-rw-semaphores: Fix grammar in percpu-rw-semaphore.rst (Xiu
     Jianfeng)"

* tag 'locking-core-2024-11-18' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (29 commits)
  locking/Documentation: Fix grammar in percpu-rw-semaphore.rst
  iio: magnetometer: fix if () scoped_guard() formatting
  rust: helpers: Avoid raw_spin_lock initialization for PREEMPT_RT
  kcsan, seqlock: Fix incorrect assumption in read_seqbegin()
  seqlock, treewide: Switch to non-raw seqcount_latch interface
  kcsan, seqlock: Support seqcount_latch_t
  time/sched_clock: Broaden sched_clock()'s instrumentation coverage
  time/sched_clock: Swap update_clock_read_data() latch writes
  locking/atomic/x86: Use ALT_OUTPUT_SP() for __arch_{,try_}cmpxchg64_emu()
  locking/atomic/x86: Use ALT_OUTPUT_SP() for __alternative_atomic64()
  cleanup: Add conditional guard helper
  cleanup: Adjust scoped_guard() macros to avoid potential warning
  locking/osq_lock: Use atomic_try_cmpxchg_release() in osq_unlock()
  cleanup: Remove address space of returned pointer
  locking/rtmutex: Fix misleading comment
  locking/rt: Annotate unlock followed by lock for sparse.
  locking/rt: Add sparse annotation for RCU.
  locking/rt: Remove one __cond_lock() in RT's spin_trylock_irqsave()
  locking/rt: Add sparse annotation PREEMPT_RT's sleeping locks.
  locking/pvqspinlock: Convert fields of 'enum vcpu_state' to uppercase
  ...
2024-11-19 12:43:11 -08:00
Linus Torvalds
8a7fa81137 Random number generator updates for Linux 6.13-rc1.
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEq5lC5tSkz8NBJiCnSfxwEqXeA64FAmc6oE0ACgkQSfxwEqXe
 A65n5BAAtNmfBJhYRiC6Svsg7+ktHmhCAHoHwnP7sv+bjs81FRAEv21CsfI+02Nb
 zUvaPuyiLtYzlWxzE5Yg44v1cADHAq+QZE1Fg5yl7ge6zPZ3+S1pv/8suNSyyI2M
 PKvh1sb4OkUtqplveYSuP1J87u55zAtV9mP9qC3hSlY3XkeQUObt9Awss8peOMdv
 sH2AxwBlRkqFXpY2worxlfg3p5iLemb3AUZ3f0Jc6fRmOagSJCt7i4mDrWo3EXke
 90Ao8ypY0x3YVGRFACHnxCS53X20HGwLxm7jdicfriMCzAJ6JQR6asO+NYnXR+Ev
 9Za3UquVHP6HbQGWj6d1k5k2nF+IbkTHTgFBPRK/CY9ZpVbP04B2K7tE1gmT81wj
 AscRGi9RBVBPKAUguyi99MXYlprFG/ZTLOux3hvdarv5u0bP94eXmy1FrRM+IO0r
 u4BiQ39FlkDdtRxjzKfCiKkMrf3NmFEciZJhxCnflzmOBaj64r1hRt/ea8Bjxvp3
 a4k0MfULmcEn2JwPiT1/Swz45ypZQc4OgbP87SCU8P0a23r21r2oK+9v3No/rCzB
 TI0fP6ykDTFQoiKUOSg1mJmkipdjeDyQ9E+0XIDsKd+T8Yv9rFoaV6RWoMrkt4AJ
 Yea9+V+XEI8F3SjhdD4OL/s3/+bjTjnRHDaXnJf2XzGmXcuvnbs=
 =o4ww
 -----END PGP SIGNATURE-----

Merge tag 'random-6.13-rc1-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/crng/random

Pull random number generator updates from Jason Donenfeld:
 "This contains a single series from Uros to replace uses of
  <linux/random.h> with prandom.h or other more specific headers
  as needed, in order to avoid a circular header issue.

  Uros' goal is to be able to use percpu.h from prandom.h, which
  will then allow him to define __percpu in percpu.h rather than
  in compiler_types.h"

* tag 'random-6.13-rc1-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/crng/random:
  prandom: Include <linux/percpu.h> in <linux/prandom.h>
  random: Do not include <linux/prandom.h> in <linux/random.h>
  netem: Include <linux/prandom.h> in sch_netem.c
  lib/test_scanf: Include <linux/prandom.h> instead of <linux/random.h>
  lib/test_parman: Include <linux/prandom.h> instead of <linux/random.h>
  bpf/tests: Include <linux/prandom.h> instead of <linux/random.h>
  lib/rbtree-test: Include <linux/prandom.h> instead of <linux/random.h>
  random32: Include <linux/prandom.h> instead of <linux/random.h>
  kunit: string-stream-test: Include <linux/prandom.h>
  lib/interval_tree_test.c: Include <linux/prandom.h> instead of <linux/random.h>
  bpf: Include <linux/prandom.h> instead of <linux/random.h>
  scsi: libfcoe: Include <linux/prandom.h> instead of <linux/random.h>
  fscrypt: Include <linux/once.h> in fs/crypto/keyring.c
  mtd: tests: Include <linux/prandom.h> instead of <linux/random.h>
  media: vivid: Include <linux/prandom.h> in vivid-vid-cap.c
  drm/lib: Include <linux/prandom.h> instead of <linux/random.h>
  drm/i915/selftests: Include <linux/prandom.h> instead of <linux/random.h>
  crypto: testmgr: Include <linux/prandom.h> instead of <linux/random.h>
  x86/kaslr: Include <linux/prandom.h> instead of <linux/random.h>
2024-11-19 10:43:44 -08:00
Linus Torvalds
02b2f1a7b8 This update includes the following changes:
API:
 
 - Add sig driver API.
 - Remove signing/verification from akcipher API.
 - Move crypto_simd_disabled_for_test to lib/crypto.
 - Add WARN_ON for return values from driver that indicates memory corruption.
 
 Algorithms:
 
 - Provide crc32-arch and crc32c-arch through Crypto API.
 - Optimise crc32c code size on x86.
 - Optimise crct10dif on arm/arm64.
 - Optimise p10-aes-gcm on powerpc.
 - Optimise aegis128 on x86.
 - Output full sample from test interface in jitter RNG.
 - Retry without padata when it fails in pcrypt.
 
 Drivers:
 
 - Add support for Airoha EN7581 TRNG.
 - Add support for STM32MP25x platforms in stm32.
 - Enable iproc-r200 RNG driver on BCMBCA.
 - Add Broadcom BCM74110 RNG driver.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEn51F/lCuNhUwmDeSxycdCkmxi6cFAmc6sQsACgkQxycdCkmx
 i6dfHxAAnkI65TE6agZq9DlkEU4ZqOsxxdk0MsGIhbCUTxW3KENzu9vtKjnvg9T/
 Ou0d2J49ny87Y4zaA59Wf/Q1+gg5YSQR5kelonpfrPLkCkJjr72HZpyCHv8TTzEC
 uHHoVj9cnPIF5/yfiqQsrWT1ACip9vn+slyVPaMJV1qR6gnvnSALtsg4e/vKHkn7
 ZMaf2pZ2ROYXdB02nMK5KQcCrxD64MQle/yQepY44eYjnT+XclkqPdi6o1nUSpj/
 RFAeY0jFSTu0pj3DqT48TnU/LiiNLlFOZrGjCdEySoac63vmTtKqfYDmrRaFz4hB
 sucxbgJ3xnnYseRijtfXnxaD/IkDJln+ipGNQKAZLfOVMDCTxPdYGmOpobMTXMS+
 0sY0eAHgqr23P9pOp+sOzcAEFIqg6llAYQVWx3Zl4vpXBUuxzg6AqmHnPicnck7y
 Lw1cJhQxij2De3dG2ZL/0dgQxMjGN/YfCM8SSg6l+Xn3j4j47rqJNH2ZsmXtbJ2n
 kTkmemmWdgRR1IvgQQGsvyKs9ThkcEDW+IzW26SUv3Clvru2NSkX4ZPHbezZQf+D
 R0wMZsW3Fw7Zymerz1GIBSqdLnsyFWtIAjukDpOR6ordPgOBeDt76v6tw5vL2/II
 KYoeN1pdEEecwuhAsEvCryT5ZG4noBeNirf/ElWAfEybgcXiTks=
 =T8pa
 -----END PGP SIGNATURE-----

Merge tag 'v6.13-p1' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6

Pull crypto updates from Herbert Xu:
 "API:
   - Add sig driver API
   - Remove signing/verification from akcipher API
   - Move crypto_simd_disabled_for_test to lib/crypto
   - Add WARN_ON for return values from driver that indicates memory
     corruption

  Algorithms:
   - Provide crc32-arch and crc32c-arch through Crypto API
   - Optimise crc32c code size on x86
   - Optimise crct10dif on arm/arm64
   - Optimise p10-aes-gcm on powerpc
   - Optimise aegis128 on x86
   - Output full sample from test interface in jitter RNG
   - Retry without padata when it fails in pcrypt

  Drivers:
   - Add support for Airoha EN7581 TRNG
   - Add support for STM32MP25x platforms in stm32
   - Enable iproc-r200 RNG driver on BCMBCA
   - Add Broadcom BCM74110 RNG driver"

* tag 'v6.13-p1' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (112 commits)
  crypto: marvell/cesa - fix uninit value for struct mv_cesa_op_ctx
  crypto: cavium - Fix an error handling path in cpt_ucode_load_fw()
  crypto: aesni - Move back to module_init
  crypto: lib/mpi - Export mpi_set_bit
  crypto: aes-gcm-p10 - Use the correct bit to test for P10
  hwrng: amd - remove reference to removed PPC_MAPLE config
  crypto: arm/crct10dif - Implement plain NEON variant
  crypto: arm/crct10dif - Macroify PMULL asm code
  crypto: arm/crct10dif - Use existing mov_l macro instead of __adrl
  crypto: arm64/crct10dif - Remove remaining 64x64 PMULL fallback code
  crypto: arm64/crct10dif - Use faster 16x64 bit polynomial multiply
  crypto: arm64/crct10dif - Remove obsolete chunking logic
  crypto: bcm - add error check in the ahash_hmac_init function
  crypto: caam - add error check to caam_rsa_set_priv_key_form
  hwrng: bcm74110 - Add Broadcom BCM74110 RNG driver
  dt-bindings: rng: add binding for BCM74110 RNG
  padata: Clean up in padata_do_multithreaded()
  crypto: inside-secure - Fix the return value of safexcel_xcbcmac_cra_init()
  crypto: qat - Fix missing destroy_workqueue in adf_init_aer()
  crypto: rsassa-pkcs1 - Reinstate support for legacy protocols
  ...
2024-11-19 10:28:41 -08:00
Jan Hendrik Farr
f06e108a3d Compiler Attributes: disable __counted_by for clang < 19.1.3
This patch disables __counted_by for clang versions < 19.1.3 because
of the two issues listed below. It does this by introducing
CONFIG_CC_HAS_COUNTED_BY.

1. clang < 19.1.2 has a bug that can lead to __bdos returning 0:
https://github.com/llvm/llvm-project/pull/110497

2. clang < 19.1.3 has a bug that can lead to __bdos being off by 4:
https://github.com/llvm/llvm-project/pull/112636

Fixes: c8248faf3c ("Compiler Attributes: counted_by: Adjust name and identifier expansion")
Cc: stable@vger.kernel.org # 6.6.x: 16c31dd7fdf6: Compiler Attributes: counted_by: bump min gcc version
Cc: stable@vger.kernel.org # 6.6.x: 2993eb7a8d34: Compiler Attributes: counted_by: fixup clang URL
Cc: stable@vger.kernel.org # 6.6.x: 231dc3f0c936: lkdtm/bugs: Improve warning message for compilers without counted_by support
Cc: stable@vger.kernel.org # 6.6.x
Reported-by: Nathan Chancellor <nathan@kernel.org>
Closes: https://lore.kernel.org/all/20240913164630.GA4091534@thelio-3990X/
Reported-by: kernel test robot <oliver.sang@intel.com>
Closes: https://lore.kernel.org/oe-lkp/202409260949.a1254989-oliver.sang@intel.com
Link: https://lore.kernel.org/all/Zw8iawAF5W2uzGuh@archlinux/T/#m204c09f63c076586a02d194b87dffc7e81b8de7b
Suggested-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Jan Hendrik Farr <kernel@jfarr.cc>
Reviewed-by: Nathan Chancellor <nathan@kernel.org>
Tested-by: Nathan Chancellor <nathan@kernel.org>
Reviewed-by: Miguel Ojeda <ojeda@kernel.org>
Reviewed-by: Thorsten Blum <thorsten.blum@linux.dev>
Link: https://lore.kernel.org/r/20241029140036.577804-2-kernel@jfarr.cc
Signed-off-by: Kees Cook <kees@kernel.org>
2024-11-19 08:48:27 -08:00
Linus Torvalds
0338cd9c22 s390 updates for 6.13 merge window
- Add firmware sysfs interface which allows user space to retrieve the dump
   area size of the machine
 
 - Add 'measurement_chars_full' CHPID sysfs attribute to make the complete
   associated Channel-Measurements Characteristics Block available
 
 - Add virtio-mem support
 
 - Move gmap aka KVM page fault handling from the main fault handler to KVM
   code. This is the first step to make s390 KVM page fault handling similar
   to other architectures. With this first step the main fault handler does
   not have any special handling anymore, and therefore convert it to
   support LOCK_MM_AND_FIND_VMA
 
 - With gcc 14 s390 support for flag output operand support for inline
   assemblies was added. This allows for several optimizations
 
   - Provide a cmpxchg inline assembly which makes use of this, and provide
     all variants of arch_try_cmpxchg() so that the compiler can generate
     slightly better code
 
   - Convert a few cmpxchg() loops to try_cmpxchg() loops
 
   - Similar to x86 add a CC_OUT() helper macro (and other macros), and
     convert all inline assemblies to make use of them, so that depending on
     compiler version better code can be generated
 
 - List installed host-key hashes in sysfs if the machine supports the Query
   Ultravisor Keys UVC
 
 - Add 'Retrieve Secret' ioctl which allows user space in protected
   execution guests to retrieve previously stored secrets from the
   Ultravisor
 
 - Add pkey-uv module which supports the conversion of Ultravisor
   retrievable secrets to protected keys
 
 - Extend the existing paes cipher to exploit the full AES-XTS hardware
   acceleration introduced with message-security assist extension 10
 
 - Convert hopefully all sysfs show functions to use sysfs_emit() so that
   the constant flow of such patches stop
 
 - For PCI devices make use of the newly added Topology ID attribute to
   enable whole card multi-function support despite the change to PCHID per
   port. Additionally improve the overall robustness and usability of
   the multifunction support
 
 - Various other small improvements, fixes, and cleanups
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEECMNfWEw3SLnmiLkZIg7DeRspbsIFAmc3Y9oACgkQIg7DeRsp
 bsJigQ//fcZ3NqA6rARWYoVNEEzUfvDha1LchhAV4aBUu5cIZFc/SQKxMuACVELh
 wW7RKCWhGLML5c/cPjke4ECBJiFYI/MQNB3xkDl1i2FDyUNs1Fdq9Be3Y0uXXO+U
 TxvSYiPm3p/Gik8G2KhDPivqPQmrF7o2KNyRWqPBdqRl5U4NLnwJpCMbddP/PTdI
 2ytJ2OGuXo3djzibXldUbik4UG6hXUqGzeIMbrOG8ZiFCeznVck/OHydoLR4MKBy
 MyrmqCxTu/p7gpTanccpTQR+uC5lodxad4kMh86CV3w41HhrWV1z912eNdsz6MMR
 B8kGPx5D0juXtUbB0Mn0kdM6Kak5/BaSA58HRNJz9AMa5MVOj+YTAmlTN5E7uGzg
 graPE3ilwEgj0pArdhwyhIEnVGP381NyhTbMDhTUhRB6lMJVyN5202YZCieezr/u
 dIyurno1T0T8if1B6n7tQQprIVSQDthzE8lCAtYrll86vLIbiXGxCg2yaVLEz1aL
 ptUZ84/bT29G8XivZAeDLjzRSwde+l5pkZWd3rBmdHC8FCH8Epiy/ZB5ozpJ1u02
 fViqheeTsTC/nR6DlwylF4YET6QVPYgLOUZCnBQJnTsVRFtBpAXIaHyvOJYNuxUN
 ybtsgzJ59bMES8DpBCIibBoJOD1vyoWoeXu06bhGuMT+wahCwgE=
 =v+um
 -----END PGP SIGNATURE-----

Merge tag 's390-6.13-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux

Pull s390 updates from Heiko Carstens:

 - Add firmware sysfs interface which allows user space to retrieve the
   dump area size of the machine

 - Add 'measurement_chars_full' CHPID sysfs attribute to make the
   complete associated Channel-Measurements Characteristics Block
   available

 - Add virtio-mem support

 - Move gmap aka KVM page fault handling from the main fault handler to
   KVM code. This is the first step to make s390 KVM page fault handling
   similar to other architectures. With this first step the main fault
   handler does not have any special handling anymore, and therefore
   convert it to support LOCK_MM_AND_FIND_VMA

 - With gcc 14 s390 support for flag output operand support for inline
   assemblies was added. This allows for several optimizations:

     - Provide a cmpxchg inline assembly which makes use of this, and
       provide all variants of arch_try_cmpxchg() so that the compiler
       can generate slightly better code

     - Convert a few cmpxchg() loops to try_cmpxchg() loops

     - Similar to x86 add a CC_OUT() helper macro (and other macros),
       and convert all inline assemblies to make use of them, so that
       depending on compiler version better code can be generated

 - List installed host-key hashes in sysfs if the machine supports the
   Query Ultravisor Keys UVC

 - Add 'Retrieve Secret' ioctl which allows user space in protected
   execution guests to retrieve previously stored secrets from the
   Ultravisor

 - Add pkey-uv module which supports the conversion of Ultravisor
   retrievable secrets to protected keys

 - Extend the existing paes cipher to exploit the full AES-XTS hardware
   acceleration introduced with message-security assist extension 10

 - Convert hopefully all sysfs show functions to use sysfs_emit() so
   that the constant flow of such patches stop

 - For PCI devices make use of the newly added Topology ID attribute to
   enable whole card multi-function support despite the change to PCHID
   per port. Additionally improve the overall robustness and usability
   of the multifunction support

 - Various other small improvements, fixes, and cleanups

* tag 's390-6.13-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (133 commits)
  s390/cio/ioasm: Convert to use flag output macros
  s390/cio/qdio: Convert to use flag output macros
  s390/sclp: Convert to use flag output macros
  s390/dasd: Convert to use flag output macros
  s390/boot/physmem: Convert to use flag output macros
  s390/pci: Convert to use flag output macros
  s390/kvm: Convert to use flag output macros
  s390/extmem: Convert to use flag output macros
  s390/string: Convert to use flag output macros
  s390/diag: Convert to use flag output macros
  s390/irq: Convert to use flag output macros
  s390/smp: Convert to use flag output macros
  s390/uv: Convert to use flag output macros
  s390/pai: Convert to use flag output macros
  s390/mm: Convert to use flag output macros
  s390/cpu_mf: Convert to use flag output macros
  s390/cpcmd: Convert to use flag output macros
  s390/topology: Convert to use flag output macros
  s390/time: Convert to use flag output macros
  s390/pageattr: Convert to use flag output macros
  ...
2024-11-18 17:45:41 -08:00
Linus Torvalds
77a0cfafa9 for-6.13/block-20241118
-----BEGIN PGP SIGNATURE-----
 
 iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAmc7S40QHGF4Ym9lQGtl
 cm5lbC5kawAKCRD301j7KXHgpjHVD/43rDZ8ehs+IAAr6S0RemNX1SRG0mK2UOEb
 kMoNogS7StO/c4JYW3JuzCyLRn5ZsgeWV/muqxwDEWQrmTGrvi+V45KikrZPwm3k
 p0ump33qV9EU2jiR1MKZjtwK2P0CI7/DD3W8ww6IOvKbTT7RcqQcdHznvXArFBtc
 xCuQPpayFG7ZasC+N9VaBwtiUEVgU3Ek9AFT7UVZRWajjHPNalQwaooJWayO0rEG
 KdoW5yG0ryLrgCY2ACSvRLS+2s14EJtb8hgT08WKHTNgd5LxhSKxfsTapamua+7U
 FdVS6Ij0tEkgu2jpvgj7QKO0Uw10Cnep2gj7RHts/LVewvkliS6XcheOzqRS1jWU
 I2EI+UaGOZ11OUiw52VIveEVS5zV/NWhgy5BSP9LYEvXw0BUAHRDYGMem8o5G1V1
 SWqjIM1UWvcQDlAnMF9FDVzojvjVUmYWvcAlFFztO8J0B7SavHR3NcfHwEf57reH
 rNoUbi/9c4/wjJJF33gejiR5pU+ewy/Mk75GrtX3xpEqlztfRbf9/FbPCMEAO1KR
 DF/b3lkUV9i2/BRW6a0SpZ5RDSmSYMnateel6TrPyVSRnpiSSFO8FrbynwUOa17b
 6i49YDFWzzXOrR1YWDg6IEtTrcmBEmvi7F6aoDs020qUnL0hwLn1ZuoIxuiFEpor
 Z0iFF1B/nw==
 =PWTH
 -----END PGP SIGNATURE-----

Merge tag 'for-6.13/block-20241118' of git://git.kernel.dk/linux

Pull block updates from Jens Axboe:

 - NVMe updates via Keith:
      - Use uring_cmd helper (Pavel)
      - Host Memory Buffer allocation enhancements (Christoph)
      - Target persistent reservation support (Guixin)
      - Persistent reservation tracing (Guixen)
      - NVMe 2.1 specification support (Keith)
      - Rotational Meta Support (Matias, Wang, Keith)
      - Volatile cache detection enhancment (Guixen)

 - MD updates via Song:
      - Maintainers update
      - raid5 sync IO fix
      - Enhance handling of faulty and blocked devices
      - raid5-ppl atomic improvement
      - md-bitmap fix

 - Support for manually defining embedded partition tables

 - Zone append fixes and cleanups

 - Stop sending the queued requests in the plug list to the driver
   ->queue_rqs() handle in reverse order.

 - Zoned write plug cleanups

 - Cleanups disk stats tracking and add support for disk stats for
   passthrough IO

 - Add preparatory support for file system atomic writes

 - Add lockdep support for queue freezing. Already found a bunch of
   issues, and some fixes for that are in here. More will be coming.

 - Fix race between queue stopping/quiescing and IO queueing

 - ublk recovery improvements

 - Fix ublk mmap for 64k pages

 - Various fixes and cleanups

* tag 'for-6.13/block-20241118' of git://git.kernel.dk/linux: (118 commits)
  MAINTAINERS: Update git tree for mdraid subsystem
  block: make struct rq_list available for !CONFIG_BLOCK
  block/genhd: use seq_put_decimal_ull for diskstats decimal values
  block: don't reorder requests in blk_mq_add_to_batch
  block: don't reorder requests in blk_add_rq_to_plug
  block: add a rq_list type
  block: remove rq_list_move
  virtio_blk: reverse request order in virtio_queue_rqs
  nvme-pci: reverse request order in nvme_queue_rqs
  btrfs: validate queue limits
  block: export blk_validate_limits
  nvmet: add tracing of reservation commands
  nvme: parse reservation commands's action and rtype to string
  nvmet: report ns's vwc not present
  md/raid5: Increase r5conf.cache_name size
  block: remove the ioprio field from struct request
  block: remove the write_hint field from struct request
  nvme: check ns's volatile write cache not present
  nvme: add rotational support
  nvme: use command set independent id ns if available
  ...
2024-11-18 16:50:08 -08:00
Feng Tang
080c8579c3 mm/slub, kunit: Add testcase for krealloc redzone and zeroing
Danilo Krummrich raised issue about krealloc+GFP_ZERO [1], and Vlastimil
suggested to add some test case which can sanity test the kmalloc-redzone
and zeroing by utilizing the kmalloc's 'orig_size' debug feature.

It covers the grow and shrink case of krealloc() re-using current kmalloc
object, and the case of re-allocating a new bigger object.

[1]. https://lore.kernel.org/lkml/20240812223707.32049-1-dakr@kernel.org/

Suggested-by: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Feng Tang <feng.tang@intel.com>
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
2024-11-16 21:19:39 +01:00
Herbert Xu
0594ad6184 crypto: lib/mpi - Export mpi_set_bit
This function is part of the exposed API and should be exported.
Otherwise a modular user would fail to build, e.g., crypto/rsa.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2024-11-15 19:52:51 +08:00
Jakub Kicinski
a79993b5fc Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Cross-merge networking fixes after downstream PR (net-6.12-rc8).

Conflicts:

tools/testing/selftests/net/.gitignore
  252e01e682 ("selftests: net: add netlink-dumps to .gitignore")
  be43a6b238 ("selftests: ncdevmem: Move ncdevmem under drivers/net/hw")
https://lore.kernel.org/all/20241113122359.1b95180a@canb.auug.org.au/

drivers/net/phy/phylink.c
  671154f174 ("net: phylink: ensure PHY momentary link-fails are handled")
  7530ea26c8 ("net: phylink: remove "using_mac_select_pcs"")

Adjacent changes:

drivers/net/ethernet/stmicro/stmmac/dwmac-intel-plat.c
  5b366eae71 ("stmmac: dwmac-intel-plat: fix call balance of tx_clk handling routines")
  e96321fad3 ("net: ethernet: Switch back to struct platform_driver::remove()")

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-14 11:29:15 -08:00
Breno Leitao
12079a59ce net: Implement fault injection forcing skb reallocation
Introduce a fault injection mechanism to force skb reallocation. The
primary goal is to catch bugs related to pointer invalidation after
potential skb reallocation.

The fault injection mechanism aims to identify scenarios where callers
retain pointers to various headers in the skb but fail to reload these
pointers after calling a function that may reallocate the data. This
type of bug can lead to memory corruption or crashes if the old,
now-invalid pointers are used.

By forcing reallocation through fault injection, we can stress-test code
paths and ensure proper pointer management after potential skb
reallocations.

Add a hook for fault injection in the following functions:

 * pskb_trim_rcsum()
 * pskb_may_pull_reason()
 * pskb_trim()

As the other fault injection mechanism, protect it under a debug Kconfig
called CONFIG_FAIL_SKB_REALLOC.

This patch was *heavily* inspired by Jakub's proposal from:
https://lore.kernel.org/all/20240719174140.47a868e6@kernel.org/

CC: Akinobu Mita <akinobu.mita@gmail.com>
Suggested-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Breno Leitao <leitao@debian.org>
Reviewed-by: Akinobu Mita <akinobu.mita@gmail.com>
Acked-by: Paolo Abeni <pabeni@redhat.com>
Acked-by: Guillaume Nault <gnault@redhat.com>
Link: https://patch.msgid.link/20241107-fault_v6-v6-1-1b82cb6ecacd@debian.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-11-12 12:05:33 +01:00
Alexandru Ardelean
111314157f lib: util_macros_kunit: add kunit test for util_macros.h
A bug was found in the find_closest() (find_closest_descending() is also
affected after some testing), where for certain values with small
progressions of 1, 2 & 3, the rounding (done by averaging 2 values) causes
an incorrect index to be returned.

The bug is described in more detail in the commit which fixes the bug. 
This commit adds a kunit test to validate that the fix works correctly.

This kunit test adds some of the arrays (from the driver-sphere) that seem
to produce issues with the 'find_closest()' macro.  Specifically the one
from ad7606 driver (with which the bug was found) and from the ina2xx
drivers, which shows the quirk with 'find_closest()' with elements in a
array that have an interval of 3.

For the find_closest_descending() tests, the same arrays are used as for
the find_closest(), but in reverse; the idea is that
'find_closest_descending()' should return the sames indices as
'find_closest()' but in reverse.

For testing both macros, there are 4 special arrays created, one for
testing find_closest{_descending}() for arrays of progressions 1, 2, 3 and
4.  The idea is to show that (for progressions of 1, 2 & 3) the fix works
as expected.  When removing the fix, the issues should start to show up.

Then an extra array of negative and positive values is added.  There are
currently no such arrays within drivers, but one could expect that these
macros behave correctly even for such arrays.

To run this kunit:
  ./tools/testing/kunit/kunit.py run "*util_macros*"

Link: https://lkml.kernel.org/r/20241105145406.554365-2-aardelean@baylibre.com
Signed-off-by: Alexandru Ardelean <aardelean@baylibre.com>
Cc: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-11-11 17:17:05 -08:00
Wei Yang
431e106019 maple_tree: add a test checking storing null
Add a test to assert that, when storing null to am empty tree or a
single entry tree it will not result into:

  * a root node with range [0, ULONG_MAX] set to NULL
  * a root node with consecutive slot set to NULL

[akpm@linux-foundation.org: work around build error (mas_root)]
Link: https://lkml.kernel.org/r/20241031231627.14316-6-richard.weiyang@gmail.com
Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
Reviewed-by: Liam R. Howlett <Liam.Howlett@Oracle.com>
Cc: Liam R. Howlett <Liam.Howlett@Oracle.com>
Cc: Sidhartha Kumar <sidhartha.kumar@oracle.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-11-11 13:09:42 -08:00
Wei Yang
0ea120b278 maple_tree: refine mas_store_root() on storing NULL
Currently, when storing NULL on mas_store_root(), the behavior could be
improved.

Storing NULLs over the entire tree may result in a node being used to
store a single range.  Further stores of NULL may cause the node and
tree to be corrupt and cause incorrect behaviour.  Fixing the store to
the root null fixes the issue by ensuring that a range of 0 - ULONG_MAX
results in an empty tree.

Users of the tree may experience incorrect values returned if the tree
was expanded to store values, then overwritten by all NULLS, then
continued to store NULLs over the empty area.

For example possible cases are:

  * store NULL at any range result a new node
  * store NULL at range [m, n] where m > 0 to a single entry tree result
    a new node with range [m, n] set to NULL
  * store NULL at range [m, n] where m > 0 to an empty tree result
    consecutive NULL slot
  * it allows for multiple NULL entries by expanding root
    to store NULLs to an empty tree

This patch tries to improve in:

  * memory efficient by setting to empty tree instead of using a node
  * remove the possibility of consecutive NULL slot which will prohibit
    extended null in later operation

Link: https://lkml.kernel.org/r/20241031231627.14316-5-richard.weiyang@gmail.com
Fixes: 54a611b605 ("Maple Tree: add new data structure")
Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
Reviewed-by: Liam R. Howlett <Liam.Howlett@Oracle.com>
Cc: Liam R. Howlett <Liam.Howlett@Oracle.com>
Cc: Sidhartha Kumar <sidhartha.kumar@oracle.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-11-11 13:09:42 -08:00
Wei Yang
8c836f1712 maple_tree: not necessary to check index/last again
Before calling mas_new_root(), the range has been checked.

Link: https://lkml.kernel.org/r/20241031231627.14316-4-richard.weiyang@gmail.com
Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
Reviewed-by: Liam R. Howlett <Liam.Howlett@Oracle.com>
Cc: Liam R. Howlett <Liam.Howlett@Oracle.com>
Cc: Sidhartha Kumar <sidhartha.kumar@oracle.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-11-11 13:09:42 -08:00
Wei Yang
cefbcf206f maple_tree: the return value of mas_root_expand() is not used
No user of the return value now, just remove it.

Link: https://lkml.kernel.org/r/20241031231627.14316-3-richard.weiyang@gmail.com
Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
Reviewed-by: Liam R. Howlett <Liam.Howlett@Oracle.com>
Cc: Liam R. Howlett <Liam.Howlett@Oracle.com>
Cc: Sidhartha Kumar <sidhartha.kumar@oracle.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-11-11 13:09:42 -08:00
Wei Yang
04dafdd208 maple_tree: print empty for an empty tree on mt_dump()
Patch series "refine storing null", v5.

When overwriting the whole range with NULL, current behavior is not
correct.

An empty tree is represented by having the tree point to NULL directly. 
An empty tree indicates the entire range (0-ULONG_MAX) is NULL.

A store operation into an existing node that causes 0 - ULONG_MAX to be
equal to NULL may not be restored to an empty state - a node is used to
store the single range instead.  This is wasteful and different from the
initial setup of the tree.

Once the tree is using a single node to store 0 - ULONG_MAX, problems may
arise when storing more values into a tree with the unexpected state of 0
- ULONG being a single range in a node.

User visible issues may mean a corrupt tree and incorrect storage of
information within the tree.  This would be limited to users who create
and then empty a tree by overwriting all values, then try to store more
NULLs into the empty tree.

I cannot come up with an example of any user doing this (users usually
destroy the tree and generally don't keep trying to store NULLs over
NULLs), but patch 4/5 "maple_tree: refine mas_store_root() on storing
NULL" should be backported just in case.


This patch (of 5):

Currently for an empty tree, it would print:

  maple_tree(0x7ffcd02c6ee0) flags 1, height 0 root (nil)
  0: (nil)

This is a little misleading.

Let's print (empty) for an empty tree.

Link: https://lkml.kernel.org/r/20241031231627.14316-1-richard.weiyang@gmail.com
Link: https://lkml.kernel.org/r/20241031231627.14316-2-richard.weiyang@gmail.com
Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
Reviewed-by: Liam R. Howlett <Liam.Howlett@Oracle.com>
Cc: Liam R. Howlett <Liam.Howlett@Oracle.com>
Cc: Sidhartha Kumar <sidhartha.kumar@oracle.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-11-11 13:09:42 -08:00
Sabyrzhan Tasbolatov
4e4d9c72c9 kasan: delete CONFIG_KASAN_MODULE_TEST
Since we've migrated all tests to the KUnit framework, we can delete
CONFIG_KASAN_MODULE_TEST and mentioning of it in the documentation as
well.

I've used the online translator to modify the non-English documentation.

[snovitoll@gmail.com: fix indentation in translation]
  Link: https://lkml.kernel.org/r/20241020042813.3223449-1-snovitoll@gmail.com
Link: https://lkml.kernel.org/r/20241016131802.3115788-4-snovitoll@gmail.com
Signed-off-by: Sabyrzhan Tasbolatov <snovitoll@gmail.com>
Reviewed-by: Andrey Konovalov <andreyknvl@gmail.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Alex Shi <alexs@kernel.org>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Hu Haowen <2023002089@link.tyut.edu.cn>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Marco Elver <elver@google.com>
Cc: Vincenzo Frascino <vincenzo.frascino@arm.com>
Cc: Yanteng Si <siyanteng@loongson.cn>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-11-11 00:26:44 -08:00
Sabyrzhan Tasbolatov
ae193dd793 kasan: move checks to do_strncpy_from_user
Patch series "kasan: migrate the last module test to kunit", v4.

copy_user_test() is the last KUnit-incompatible test with
CONFIG_KASAN_MODULE_TEST requirement, which we are going to migrate to
KUnit framework and delete the former test and Kconfig as well.

In this patch series:

	- [1/3] move kasan_check_write() and check_object_size() to
		do_strncpy_from_user() to cover with KASAN checks with
		multiple conditions	in strncpy_from_user().

	- [2/3] migrated copy_user_test() to KUnit, where we can also test
		strncpy_from_user() due to [1/4].

		KUnits have been tested on:
		- x86_64 with CONFIG_KASAN_GENERIC. Passed
		- arm64 with CONFIG_KASAN_SW_TAGS. 1 fail. See [1]
		- arm64 with CONFIG_KASAN_HW_TAGS. 1 fail. See [1]
		[1] https://lore.kernel.org/linux-mm/CACzwLxj21h7nCcS2-KA_q7ybe+5pxH0uCDwu64q_9pPsydneWQ@mail.gmail.com/

	- [3/3] delete CONFIG_KASAN_MODULE_TEST and documentation occurrences.


This patch (of 3):

Since in the commit 2865baf54077("x86: support user address masking
instead of non-speculative conditional") do_strncpy_from_user() is called
from multiple places, we should sanitize the kernel *dst memory and size
which were done in strncpy_from_user() previously.

Link: https://lkml.kernel.org/r/20241016131802.3115788-1-snovitoll@gmail.com
Link: https://lkml.kernel.org/r/20241016131802.3115788-2-snovitoll@gmail.com
Fixes: 2865baf540 ("x86: support user address masking instead of non-speculative conditional")
Signed-off-by: Sabyrzhan Tasbolatov <snovitoll@gmail.com>
Reviewed-by: Andrey Konovalov <andreyknvl@gmail.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Alex Shi <alexs@kernel.org>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Hu Haowen <2023002089@link.tyut.edu.cn>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Marco Elver <elver@google.com>
Cc: Vincenzo Frascino <vincenzo.frascino@arm.com>
Cc: Yanteng Si <siyanteng@loongson.cn>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-11-11 00:26:43 -08:00
Andrew Morton
2ec0859039 Merge branch 'mm-hotfixes-stable' into mm-stable
Pick up e7ac4daeed ("mm: count zeromap read and set for swapout and
swapin") in order to move

mm: define obj_cgroup_get() if CONFIG_MEMCG is not defined
mm: zswap: modify zswap_compress() to accept a page instead of a folio
mm: zswap: rename zswap_pool_get() to zswap_pool_tryget()
mm: zswap: modify zswap_stored_pages to be atomic_long_t
mm: zswap: support large folios in zswap_store()
mm: swap: count successful large folio zswap stores in hugepage zswpout stats
mm: zswap: zswap_store_page() will initialize entry after adding to xarray.
mm: add per-order mTHP swpin counters

from mm-unstable into mm-stable.
2024-11-11 00:04:10 -08:00
Luis Chamberlain
2466b31201 tests/module/gen_test_kallsyms.sh: use 0 value for variables
Use 0 for the values as we use them for the return value on init
to keep the test modules simple. This fixes a splat reported

do_init_module: 'test_kallsyms_b'->init suspiciously returned 255, it should follow 0/-E convention
do_init_module: loading module anyway...
CPU: 5 UID: 0 PID: 1873 Comm: modprobe Not tainted 6.12.0-rc3+ #4
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 2024.08-1 09/18/2024
Call Trace:
 <TASK>
 dump_stack_lvl+0x56/0x80
 do_init_module.cold+0x21/0x26
 init_module_from_file+0x88/0xf0
 idempotent_init_module+0x108/0x300
 __x64_sys_finit_module+0x5a/0xb0
 do_syscall_64+0x4b/0x110
 entry_SYSCALL_64_after_hwframe+0x76/0x7e
RIP: 0033:0x7f4f3a718839
Code: ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff>
RSP: 002b:00007fff97d1a9e8 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
RAX: ffffffffffffffda RBX: 000055b94001ab90 RCX: 00007f4f3a718839
RDX: 0000000000000000 RSI: 000055b910e68a10 RDI: 0000000000000004
RBP: 0000000000000000 R08: 00007f4f3a7f1b20 R09: 000055b94001c5b0
R10: 0000000000000040 R11: 0000000000000246 R12: 000055b910e68a10
R13: 0000000000040000 R14: 000055b94001ad60 R15: 0000000000000000
 </TASK>
do_init_module: 'test_kallsyms_b'->init suspiciously returned 255, it should follow 0/-E convention
do_init_module: loading module anyway...
CPU: 1 UID: 0 PID: 1884 Comm: modprobe Not tainted 6.12.0-rc3+ #4
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 2024.08-1 09/18/2024
Call Trace:
 <TASK>
 dump_stack_lvl+0x56/0x80
 do_init_module.cold+0x21/0x26
 init_module_from_file+0x88/0xf0
 idempotent_init_module+0x108/0x300
 __x64_sys_finit_module+0x5a/0xb0
 do_syscall_64+0x4b/0x110
 entry_SYSCALL_64_after_hwframe+0x76/0x7e
RIP: 0033:0x7ffaa5d18839

Reported-by: Sami Tolvanen <samitolvanen@google.com>
Reviewed-by: Sami Tolvanen <samitolvanen@google.com>
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
2024-11-07 14:32:55 -08:00
Suren Baghdasaryan
b7fc16a16b mm/codetag: uninline and move pgalloc_tag_copy and pgalloc_tag_split
pgalloc_tag_copy() and pgalloc_tag_split() are sizable and outside of any
performance-critical paths, so it should be fine to uninline them.  Also
move their declarations into pgalloc_tag.h which seems like a more
appropriate place for them.  No functional changes other than uninlining.

Link: https://lkml.kernel.org/r/20241024162318.1640781-1-surenb@google.com
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Suggested-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Yu Zhao <yuzhao@google.com>
Cc: Kent Overstreet <kent.overstreet@linux.dev>
Cc: Pasha Tatashin <pasha.tatashin@soleen.com>
Cc: Sourav Panda <souravpanda@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-11-07 14:25:17 -08:00
Suren Baghdasaryan
4835f747d3 alloc_tag: support for page allocation tag compression
Implement support for storing page allocation tag references directly in
the page flags instead of page extensions.  sysctl.vm.mem_profiling boot
parameter it extended to provide a way for a user to request this mode. 
Enabling compression eliminates memory overhead caused by page_ext and
results in better performance for page allocations.  However this mode
will not work if the number of available page flag bits is insufficient to
address all kernel allocations.  Such condition can happen during boot or
when loading a module.  If this condition is detected, memory allocation
profiling gets disabled with an appropriate warning.  By default
compression mode is disabled.

Link: https://lkml.kernel.org/r/20241023170759.999909-7-surenb@google.com
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Reviewed-by: Pasha Tatashin <pasha.tatashin@soleen.com>
Cc: Ard Biesheuvel <ardb@kernel.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Borislav Petkov (AMD) <bp@alien8.de>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Daniel Gomez <da.gomez@samsung.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: David Rientjes <rientjes@google.com>
Cc: Dennis Zhou <dennis@kernel.org>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Kalesh Singh <kaleshsingh@google.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Kent Overstreet <kent.overstreet@linux.dev>
Cc: Liam R. Howlett <Liam.Howlett@Oracle.com>
Cc: Luis Chamberlain <mcgrof@kernel.org>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport (Microsoft) <rppt@kernel.org>
Cc: Minchan Kim <minchan@google.com>
Cc: Paul E. McKenney <paulmck@kernel.org>
Cc: Petr Pavlu <petr.pavlu@suse.com>
Cc: Roman Gushchin <roman.gushchin@linux.dev>
Cc: Sami Tolvanen <samitolvanen@google.com>
Cc: Sourav Panda <souravpanda@google.com>
Cc: Steven Rostedt (Google) <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Thomas Huth <thuth@redhat.com>
Cc: Uladzislau Rezki (Sony) <urezki@gmail.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Xiongwei Song <xiongwei.song@windriver.com>
Cc: Yu Zhao <yuzhao@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-11-07 14:25:16 -08:00
Suren Baghdasaryan
0f9b685626 alloc_tag: populate memory for module tags as needed
The memory reserved for module tags does not need to be backed by physical
pages until there are tags to store there.  Change the way we reserve this
memory to allocate only virtual area for the tags and populate it with
physical pages as needed when we load a module.

[surenb@google.com: avoid execmem_vmap() when !MMU]
  Link: https://lkml.kernel.org/r/20241031233611.3833002-1-surenb@google.com
Link: https://lkml.kernel.org/r/20241023170759.999909-5-surenb@google.com
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Reviewed-by: Pasha Tatashin <pasha.tatashin@soleen.com>
Cc: Ard Biesheuvel <ardb@kernel.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Borislav Petkov (AMD) <bp@alien8.de>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Daniel Gomez <da.gomez@samsung.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: David Rientjes <rientjes@google.com>
Cc: Dennis Zhou <dennis@kernel.org>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Kalesh Singh <kaleshsingh@google.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Kent Overstreet <kent.overstreet@linux.dev>
Cc: Liam R. Howlett <Liam.Howlett@Oracle.com>
Cc: Luis Chamberlain <mcgrof@kernel.org>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport (Microsoft) <rppt@kernel.org>
Cc: Minchan Kim <minchan@google.com>
Cc: Paul E. McKenney <paulmck@kernel.org>
Cc: Petr Pavlu <petr.pavlu@suse.com>
Cc: Roman Gushchin <roman.gushchin@linux.dev>
Cc: Sami Tolvanen <samitolvanen@google.com>
Cc: Sourav Panda <souravpanda@google.com>
Cc: Steven Rostedt (Google) <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Thomas Huth <thuth@redhat.com>
Cc: Uladzislau Rezki (Sony) <urezki@gmail.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Xiongwei Song <xiongwei.song@windriver.com>
Cc: Yu Zhao <yuzhao@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-11-07 14:25:16 -08:00
Suren Baghdasaryan
0db6f8d782 alloc_tag: load module tags into separate contiguous memory
When a module gets unloaded there is a possibility that some of the
allocations it made are still used and therefore the allocation tags
corresponding to these allocations are still referenced.  As such, the
memory for these tags can't be freed.  This is currently handled as an
abnormal situation and module's data section is not being unloaded.  To
handle this situation without keeping module's data in memory, allow
codetags with longer lifespan than the module to be loaded into their own
separate memory.  The in-use memory areas and gaps after module unloading
in this separate memory are tracked using maple trees.  Allocation tags
arrange their separate memory so that it is virtually contiguous and that
will allow simple allocation tag indexing later on in this patchset.  The
size of this virtually contiguous memory is set to store up to 100000
allocation tags.

[surenb@google.com: fix empty codetag module section handling]
  Link: https://lkml.kernel.org/r/20241101000017.3856204-1-surenb@google.com
[akpm@linux-foundation.org: update comment, per Dan]
Link: https://lkml.kernel.org/r/20241023170759.999909-4-surenb@google.com
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Reviewed-by: Pasha Tatashin <pasha.tatashin@soleen.com>
Cc: Ard Biesheuvel <ardb@kernel.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Borislav Petkov (AMD) <bp@alien8.de>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Daniel Gomez <da.gomez@samsung.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: David Rientjes <rientjes@google.com>
Cc: Dennis Zhou <dennis@kernel.org>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Kalesh Singh <kaleshsingh@google.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Kent Overstreet <kent.overstreet@linux.dev>
Cc: Liam R. Howlett <Liam.Howlett@Oracle.com>
Cc: Luis Chamberlain <mcgrof@kernel.org>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport (Microsoft) <rppt@kernel.org>
Cc: Minchan Kim <minchan@google.com>
Cc: Paul E. McKenney <paulmck@kernel.org>
Cc: Petr Pavlu <petr.pavlu@suse.com>
Cc: Roman Gushchin <roman.gushchin@linux.dev>
Cc: Sami Tolvanen <samitolvanen@google.com>
Cc: Sourav Panda <souravpanda@google.com>
Cc: Steven Rostedt (Google) <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Thomas Huth <thuth@redhat.com>
Cc: Uladzislau Rezki (Sony) <urezki@gmail.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Xiongwei Song <xiongwei.song@windriver.com>
Cc: Yu Zhao <yuzhao@google.com>
Cc: Dan Carpenter <dan.carpenter@linaro.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-11-07 14:25:16 -08:00
Suren Baghdasaryan
3e09c500bb alloc_tag: introduce shutdown_mem_profiling helper function
Implement a helper function to disable memory allocation profiling and use
it when creation of /proc/allocinfo fails.  Ensure /proc/allocinfo does
not get created when memory allocation profiling is disabled.

Link: https://lkml.kernel.org/r/20241023170759.999909-3-surenb@google.com
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Reviewed-by: Pasha Tatashin <pasha.tatashin@soleen.com>
Cc: Ard Biesheuvel <ardb@kernel.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Borislav Petkov (AMD) <bp@alien8.de>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Daniel Gomez <da.gomez@samsung.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: David Rientjes <rientjes@google.com>
Cc: Dennis Zhou <dennis@kernel.org>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Kalesh Singh <kaleshsingh@google.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Kent Overstreet <kent.overstreet@linux.dev>
Cc: Liam R. Howlett <Liam.Howlett@Oracle.com>
Cc: Luis Chamberlain <mcgrof@kernel.org>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport (Microsoft) <rppt@kernel.org>
Cc: Minchan Kim <minchan@google.com>
Cc: Paul E. McKenney <paulmck@kernel.org>
Cc: Petr Pavlu <petr.pavlu@suse.com>
Cc: Roman Gushchin <roman.gushchin@linux.dev>
Cc: Sami Tolvanen <samitolvanen@google.com>
Cc: Sourav Panda <souravpanda@google.com>
Cc: Steven Rostedt (Google) <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Thomas Huth <thuth@redhat.com>
Cc: Uladzislau Rezki (Sony) <urezki@gmail.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Xiongwei Song <xiongwei.song@windriver.com>
Cc: Yu Zhao <yuzhao@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-11-07 14:25:16 -08:00
Masami Hiramatsu (Google)
cb6fcef8b4 objpool: fix to make percpu slot allocation more robust
Since gfp & GFP_ATOMIC == GFP_ATOMIC is true for GFP_KERNEL | GFP_HIGH, it
will use kmalloc if user specifies that combination.  Here the reason why
combining the __vmalloc_node() and kmalloc_node() is that the vmalloc does
not support all GFP flag, especially GFP_ATOMIC.  So we should check if
gfp & (GFP_ATOMIC | GFP_KERNEL) != GFP_ATOMIC for vmalloc first.  This
ensures caller can sleep.  And for the robustness, even if vmalloc fails,
it should retry with kmalloc to allocate it.

Link: https://lkml.kernel.org/r/173008598713.1262174.2959179484209897252.stgit@mhiramat.roam.corp.google.com
Fixes: aff1871bfc ("objpool: fix choosing allocation for percpu slots")
Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Closes: https://lore.kernel.org/all/CAHk-=whO+vSH+XVRio8byJU8idAWES0SPGVZ7KAVdc4qrV0VUA@mail.gmail.com/
Cc: Leo Yan <leo.yan@arm.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Matt Wu <wuqiang.matt@bytedance.com>
Cc: Mikel Rychliski <mikel@mikelr.com>
Cc: Steven Rostedt (Google) <rostedt@goodmis.org>
Cc: Viktor Malik <vmalik@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-11-07 14:14:58 -08:00
Jakub Kicinski
2696e451df Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Cross-merge networking fixes after downstream PR (net-6.12-rc7).

Conflicts:

drivers/net/ethernet/freescale/enetc/enetc_pf.c
  e15c5506dd ("net: enetc: allocate vf_state during PF probes")
  3774409fd4 ("net: enetc: build enetc_pf_common.c as a separate module")
https://lore.kernel.org/20241105114100.118bd35e@canb.auug.org.au

Adjacent changes:

drivers/net/ethernet/ti/am65-cpsw-nuss.c
  de794169cf ("net: ethernet: ti: am65-cpsw: Fix multi queue Rx on J7")
  4a7b2ba94a ("net: ethernet: ti: am65-cpsw: Use tstats instead of open coded version")

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-07 13:44:16 -08:00
David Hildenbrand
2b37c814aa lib/Kconfig.debug: Default STRICT_DEVMEM to "y" on s390
virtio-mem currently depends on !DEVMEM | STRICT_DEVMEM. Let's default
STRICT_DEVMEM to "y" just like we do for arm64 and x86.

There could be ways in the future to filter access to virtio-mem device
memory even without STRICT_DEVMEM, but for now let's just keep it
simple.

Tested-by: Mario Casquero <mcasquer@redhat.com>
Acked-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Tested-by: Sumanth Korikkar <sumanthk@linux.ibm.com>
Acked-by: Christian Borntraeger <borntraeger@linux.ibm.com>
Link: https://lore.kernel.org/r/20241025141453.1210600-6-david@redhat.com
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2024-11-07 10:26:24 +01:00
Wei Yang
38dc8f4952 maple_tree: remove sanity check from mas_wr_slot_store()
After commit 5d659bbb52 ("maple_tree: introduce mas_wr_store_type()"),
the check here is redundant.

Let's remove it.

Link: https://lkml.kernel.org/r/20241017015809.23392-3-richard.weiyang@gmail.com
Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
Reviewed-by: Sidhartha Kumar <sidhartha.kumar@oracle.com>
Reviewed-by: Liam R. Howlett <Liam.Howlett@Oracle.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-11-06 20:11:16 -08:00
Wei Yang
61e9df7085 maple_tree: calculate new_end when needed
Patch series "Following cleanup after introduce mas_wr_store_type()", v2.

Patch 1 postpone new_end calculation when needed.
Patch 2 removes a unnecessary sanity check in mas_wr_slot_store().


This patch (of 2):

For wr_exact_fit/wr_new_root, we don't need to calculate new_end.

Let's postpone it until necessary.

Link: https://lkml.kernel.org/r/20241017015809.23392-1-richard.weiyang@gmail.com
Link: https://lkml.kernel.org/r/20241017015809.23392-2-richard.weiyang@gmail.com
Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
Reviewed-by: Sidhartha Kumar <sidhartha.kumar@oracle.com>
Reviewed-by: Liam R. Howlett <Liam.Howlett@Oracle.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-11-06 20:11:15 -08:00
Andy Shevchenko
4a7bba1df0 percpu: add a test case for the specific 64-bit value addition
It might be a corner case when we add UINT_MAX as 64-bit unsigned value to
the percpu variable as it's not the same as -1 (ULONG_LONG_MAX).  Add a
test case for that.

Link: https://lkml.kernel.org/r/20241016182635.1156168-3-andriy.shevchenko@linux.intel.com
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Borislav Petkov (AMD) <bp@alien8.de>
Cc: Christoph Lameter <cl@linux.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Dennis Zhou <dennis@kernel.org>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Uros Bizjak <ubizjak@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-11-06 20:11:14 -08:00
Wei Yang
908378a30b maple_tree: simplify mas_push_node()
When count is not 0, we know head is valid.  So we can put the assignment
in if (count) instead of checking the head pointer again.

Also count represents current total, we can assign the new total by
increasing the count by one.

Link: https://lkml.kernel.org/r/20241015120746.15850-4-richard.weiyang@gmail.com
Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
Reviewed-by: Liam R. Howlett <Liam.Howlett@Oracle.com>
Cc: Sidhartha Kumar <sidhartha.kumar@oracle.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-11-06 20:11:14 -08:00
Wei Yang
4223dd93bf maple_tree: total is not changed for nomem_one case
If it jumps to nomem_one, the total allocated number is not changed.  So
we don't need to adjust it.

For the nomem_bulk case, we know there is a valid mas->alloc.  So we don't
need to do the check.

Link: https://lkml.kernel.org/r/20241015120746.15850-3-richard.weiyang@gmail.com
Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
Reviewed-by: Liam R. Howlett <Liam.Howlett@Oracle.com>
Cc: Sidhartha Kumar <sidhartha.kumar@oracle.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-11-06 20:11:14 -08:00
Wei Yang
e852cb1d00 maple_tree: clear request_count for new allocated one
Patch series "maple_tree: simplify mas_push_node()", v2.

When count is not 0, we know head is valid.  So we can put the assignment
in if (count) instead of checking the head pointer again.

Also count represents current total, we can assign the new total by
increasing the count by one.


This patch (of 3):

If this is not a new allocated one, the request_count has already been
cleared in mas_set_alloc_req().

Link: https://lkml.kernel.org/r/20241015120746.15850-1-richard.weiyang@gmail.com
Link: https://lkml.kernel.org/r/20241015120746.15850-2-richard.weiyang@gmail.com
Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
Reviewed-by: Liam R. Howlett <Liam.Howlett@Oracle.com>
Cc: Sidhartha Kumar <sidhartha.kumar@oracle.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-11-06 20:11:14 -08:00
Wei Yang
0cc8d68abe maple_tree: root node could be handled by !p_slot too
For a root node, mte_parent_slot() return 0, this exactly fits the
following !p_slot check.

So we can remove the special handling for root node.

Link: https://lkml.kernel.org/r/20240913063128.27391-1-richard.weiyang@gmail.com
Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
Reviewed-by: Liam R. Howlett <Liam.Howlett@Oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-11-06 20:11:13 -08:00
Jiazi Li
5b2100f723 maple_tree: fix alloc node fail issue
In the following code, the second call to the mas_node_count will return
-ENOMEM:

	mas_node_count(mas, MAPLE_ALLOC_SLOTS + 1);
	mas_node_count(mas, MAPLE_ALLOC_SLOTS * 2 + 2);

This is because there may be some full maple_alloc node in current maple
state.  Use full maple_alloc node will make max_req equal to 0.  And it
leads to mt_alloc_bulk return 0.  As a result, mas_node_count set mas.node
to MA_ERROR(-ENOMEM).

Find a non-full maple_alloc node, and if necessary, use this non-full node
in the next while loop.

Link: https://lkml.kernel.org/r/20240626160631.3636515-1-Liam.Howlett@oracle.com
Fixes: 54a611b605 ("Maple Tree: add new data structure")
Signed-off-by: Jiazi Li <jqqlijiazi@gmail.com>
Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Suggested-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Reviewed-by: Wei Yang <richard.weiyang@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-11-06 20:11:13 -08:00
Sidhartha Kumar
f0c99037a0 maple_tree: refactor mas_wr_store_type()
In mas_wr_store_type(), we check if new_end < mt_slots[wr_mas->type].  If
this check fails, we know that ,after this, new_end is >= mt_min_slots. 
Checking this again when we detect a wr_node_store later in the function
is reduntant.  Because this check is part of an OR statement, the
statement will always evaluate to true, therefore we can just get rid of
it.

We also refactor mas_wr_store_type() to return the store type rather than
set it directly as it greatly cleans up the function.

Link: https://lkml.kernel.org/r/20241011214451.7286-2-sidhartha.kumar@oracle.com
Signed-off-by: Sidhartha <sidhartha.kumar@oracle.com>
Suggested-by: Liam Howlett <liam.howlett@oracle.com>
Suggested-by: Wei Yang <richard.weiyang@gmail.com>
Reviewed-by: Wei Yang <richard.weiyang@gmail.com>
Reviewed-by: Liam Howlett <liam.howlett@oracle.com>
Cc: Matthew Wilcox <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-11-06 20:11:12 -08:00
Lorenzo Stoakes
b314e21596 maple_tree: do not hash pointers on dump in debug mode
Many maple tree values output when an mt_validate() or equivalent hits an
issue utilise tagged pointers, most notably parent nodes. Also some
pivots/slots contain meaningful values, output as pointers, such as the
index of the last entry with data for example.

All pointer values such as this are destroyed by kernel pointer hashing
rendering the debug output obtained from CONFIG_DEBUG_VM_MAPLE_TREE
considerably less usable.

Update this code to output the raw pointers using %px rather than %p when
CONFIG_DEBUG_VM_MAPLE_TREE is defined. This is justified, as the use of
this configuration flag indicates that this is a test environment.

Userland does not understand %px, so use %p there.

In an abundance of caution, if CONFIG_DEBUG_VM_MAPLE_TREE is not set, also
use %p to avoid exposing raw kernel pointers except when we are positive a
testing mode is enabled.

This was inspired by the investigation performed in recent debugging
efforts around a maple tree regression [0] where kernel pointer tagging had
to be disabled in order to obtain truly meaningful and useful data.

[0]:https://lore.kernel.org/all/20241001023402.3374-1-spasswolf@web.de/

Link: https://lkml.kernel.org/r/20241007115335.90104-1-lorenzo.stoakes@oracle.com
Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Reviewed-by: Liam R. Howlett <Liam.Howlett@Oracle.com>
Cc: Sidhartha Kumar <sidhartha.kumar@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-11-06 20:11:09 -08:00
Sui Jingfeng
e01caa2b63 lib/scatterlist: use sg_phys() helper
This shorten the length of code in horizential direction, therefore is
easier to read.

Link: https://lkml.kernel.org/r/20241028182920.1025819-1-sui.jingfeng@linux.dev
Signed-off-by: Sui Jingfeng <sui.jingfeng@linux.dev>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-11-05 17:12:40 -08:00
Kuan-Wei Chiu
d559bb2c6d lib/test_min_heap: update min_heap_callbacks to use default builtin swap
Replace the swp function pointer in the min_heap_callbacks of
test_min_heap with NULL, allowing direct usage of the default builtin swap
implementation.  This modification simplifies the code and improves
performance by removing unnecessary function indirection.

Link: https://lkml.kernel.org/r/20241020040200.939973-5-visitorckw@gmail.com
Signed-off-by: Kuan-Wei Chiu <visitorckw@gmail.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Ching-Chun (Jim) Huang <jserv@ccns.ncku.edu.tw>
Cc: Coly Li <colyli@suse.de>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Kent Overstreet <kent.overstreet@linux.dev>
Cc: "Liang, Kan" <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Matthew Sakai <msakai@redhat.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-11-05 17:12:35 -08:00
Kuan-Wei Chiu
92a8b224b8 lib/min_heap: introduce non-inline versions of min heap API functions
Patch series "Enhance min heap API with non-inline functions and
optimizations", v2.

Add non-inline versions of the min heap API functions in lib/min_heap.c
and updates all users outside of kernel/events/core.c to use these
non-inline versions.  To mitigate the performance impact of indirect
function calls caused by the non-inline versions of the swap and compare
functions, a builtin swap has been introduced that swaps elements based on
their size.  Additionally, it micro-optimizes the efficiency of the min
heap by pre-scaling the counter, following the same approach as in
lib/sort.c.  Documentation for the min heap API has also been added to the
core-api section.


This patch (of 10):

All current min heap API functions are marked with '__always_inline'. 
However, as the number of users increases, inlining these functions
everywhere leads to a increase in kernel size.

In performance-critical paths, such as when perf events are enabled and
min heap functions are called on every context switch, it is important to
retain the inline versions for optimal performance.  To balance this, the
original inline functions are kept, and additional non-inline versions of
the functions have been added in lib/min_heap.c.

Link: https://lkml.kernel.org/r/20241020040200.939973-1-visitorckw@gmail.com
Link: https://lore.kernel.org/20240522161048.8d8bbc7b153b4ecd92c50666@linux-foundation.org
Link: https://lkml.kernel.org/r/20241020040200.939973-2-visitorckw@gmail.com
Signed-off-by: Kuan-Wei Chiu <visitorckw@gmail.com>
Suggested-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Ching-Chun (Jim) Huang <jserv@ccns.ncku.edu.tw>
Cc: Coly Li <colyli@suse.de>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Kent Overstreet <kent.overstreet@linux.dev>
Cc: Kuan-Wei Chiu <visitorckw@gmail.com>
Cc: "Liang, Kan" <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Matthew Sakai <msakai@redhat.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-11-05 17:12:34 -08:00
Kuan-Wei Chiu
908ef9bb4b lib/list_sort: remove unnecessary header includes
Patch series "Remove unnecessary header includes from
{tools/}lib/list_sort.c".

Remove outdated and unnecessary header includes from lib/list_sort.c and
tools/lib/list_sort.c.  Additionally, update the hunk exceptions checked
by check_headers.sh to reflect these changes.


This patch (of 3):

After commit 043b3f7b63 ("lib/list_sort: simplify and remove
MAX_LIST_LENGTH_BITS"), list_sort.c no longer uses ARRAY_SIZE() (which
required kernel.h and bug.h for BUILD_BUG_ON_ZERO via __must_be_array) or
memset() (which required string.h).  As these headers are no longer
needed, removes them.

There are no changes to the generated code, as confirmed by 'objdump -d'. 
Additionally, 'wc -l' shows that the size of lib/.list_sort.o.cmd is
reduced from 259 lines to 101 lines.

Link: https://lkml.kernel.org/r/20241012042828.471614-1-visitorckw@gmail.com
Link: https://lkml.kernel.org/r/20241012042828.471614-2-visitorckw@gmail.com
Signed-off-by: Kuan-Wei Chiu <visitorckw@gmail.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Ching-Chun (Jim) Huang <jserv@ccns.ncku.edu.tw>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: "Liang, Kan" <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-11-05 17:12:33 -08:00
Kuan-Wei Chiu
bf9850f6ea lib/Makefile: make union-find compilation conditional on CONFIG_CPUSETS
Currently, cpuset is the only user of the union-find implementation. 
Compiling union-find in all configurations unnecessarily increases the
code size when building the kernel without cgroup support.  Modify the
build system to compile union-find only when CONFIG_CPUSETS is enabled.

Link: https://lore.kernel.org/lkml/1ccd6411-5002-4574-bb8e-3e64bba6a757@redhat.com/
Link: https://lkml.kernel.org/r/20241011141214.87096-1-visitorckw@gmail.com
Signed-off-by: Kuan-Wei Chiu <visitorckw@gmail.com>
Suggested-by: Waiman Long <llong@redhat.com>
Acked-by: Waiman Long <longman@redhat.com>
Acked-by: Tejun Heo <tj@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Cc: Ching-Chun (Jim) Huang <jserv@ccns.ncku.edu.tw>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Koutný <mkoutny@suse.com>
Cc: Xavier <xavier_qy@163.com>
Cc: Zefan Li <lizefan.x@bytedance.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-11-05 17:12:32 -08:00
Vinicius Peixoto
5d04270708 lib/crc16_kunit.c: add KUnit tests for crc16
Add Kunit tests for the kernel's implementation of the standard CRC-16
algorithm (<linux/crc16.h>).  The test data consists of 100
randomly-generated test cases, validated against a naive CRC-16
implementation.

This test follows roughly the same logic as lib/crc32test.c, but without
the performance measurements.

Link: https://lkml.kernel.org/r/20241012-crc16-kunit-v3-1-0ca75cb58ca9@lkcamp.dev
Signed-off-by: Vinicius Peixoto <vpeixoto@lkcamp.dev>
Co-developed-by: Enzo Bertoloti <ebertoloti@lkcamp.dev>
Signed-off-by: Enzo Bertoloti <ebertoloti@lkcamp.dev>
Co-developed-by: Fabricio Gasperin <fgasperin@lkcamp.dev>
Signed-off-by: Fabricio Gasperin <fgasperin@lkcamp.dev>
Suggested-by: David Laight <David.Laight@ACULAB.COM>
Cc: Brendan Higgins <brendan.higgins@linux.dev>
Cc: David Gow <davidgow@google.com>
Cc: Rae Moar <rmoar@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-11-05 17:12:31 -08:00
I Hsin Cheng
5a3c9366cb list: test: check the size of every lists for list_cut_position*()
Check the total number of elements in both resultant lists are correct
within list_cut_position*().  Previously, only the first list's size was
checked.  so additional elements in the second list would not have been
caught.

Link: https://lkml.kernel.org/r/20241008065253.26673-1-richard120310@gmail.com
Signed-off-by: I Hsin Cheng <richard120310@gmail.com>
Cc: David Gow <davidgow@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-11-05 17:12:30 -08:00
Kuan-Wei Chiu
b42166427b lib/Kconfig.debug: move int_pow test option to runtime testing section
When executing 'make menuconfig' with KUNIT enabled, the int_pow test
option appears on the first page of the main menu instead of under the
runtime testing section.  Relocate the int_pow test configuration to the
appropriate runtime testing submenu, ensuring a more organized and logical
structure in the menu configuration.

Link: https://lkml.kernel.org/r/20241005222221.2154393-1-visitorckw@gmail.com
Fixes: 7fcc9b5321 ("lib/math: Add int_pow test suite")
Signed-off-by: Kuan-Wei Chiu <visitorckw@gmail.com>
Cc: Ching-Chun (Jim) Huang <jserv@ccns.ncku.edu.tw>
Cc: David Gow <davidgow@google.com>
Cc: Luis Felipe Hernandez <luis.hernandez093@gmail.com>
Cc: Shuah Khan <skhan@linuxfoundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-11-05 17:12:30 -08:00
Wei Yang
5059aa6334 maple_tree: memset maple_big_node as a whole
In mast_fill_bnode(), we first clear some fields of maple_big_node and set
the 'type' unconditionally before return.  This means we won't leverage
any information in maple_big_node and it is safe to clear the whole
structure.

In maple_big_node, we define slot and padding/gap in a union.  And based
on current definition of MAPLE_BIG_NODE_SLOTS/GAPS, padding is always less
than slot and part of the gap is overlapped by slot.

For example on 64bit system:

  MAPLE_BIG_NODE_SLOT is 34
  MAPLE_BIG_NODE_GAP  is 21

With this knowledge, current code may clear some space by twice. And
this could be avoid by clearing the structure as a whole.

Link: https://lkml.kernel.org/r/20240908140554.20378-3-richard.weiyang@gmail.com
Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
Reviewed-by: Liam R. Howlett <Liam.Howlett@Oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-11-05 16:56:24 -08:00
Wei Yang
f36ba81081 maple_tree: remove maple_big_node.parent
Patch series "Reduce the space to be cleared for maple_big_node", v2.

Found current code may clear maple_big_node redundantly.

First we define a field parent, which is never used.  After removing this,
we reduce the size of memory to be cleared by memset.

Then mast_fill_bnode() clears part of the structure twice, since slot and
gap share some space.  By clearing the whole structure, we can avoid this.


This patch (of 2):

The member parent of maple_big_node is never used.

Let's remove it which could reduce the number of space to be cleared on
memset.

Link: https://lkml.kernel.org/r/20240908140554.20378-1-richard.weiyang@gmail.com
Link: https://lkml.kernel.org/r/20240908140554.20378-2-richard.weiyang@gmail.com
Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
Reviewed-by: Liam R. Howlett <Liam.Howlett@Oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-11-05 16:56:24 -08:00
Wei Yang
1c148069b2 maple_tree: goto complete directly on a pivot of 0
When we break the loop after assigning a pivot, the index i/j is not
changed.  Then the following code assign pivot, which means we do the
assignment with same i/j by mas_safe_pivot.

Since the loop condition is (i < piv_end), from which we can get i is less
than mt_pivots[mt].  It implies mas_safe_pivot() return pivot[i] which is
the same value we get in loop.

Now we can conclude it does a redundant assignment on a pivot of 0.  Let's
just go to complete to avoid it.

Link: https://lkml.kernel.org/r/20240911142759.20989-3-richard.weiyang@gmail.com
Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
Reviewed-by: Liam R. Howlett <Liam.Howlett@Oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-11-05 16:56:24 -08:00
Wei Yang
8c7904a8cd maple_tree: i is always less than or equal to mas_end
Patch series "refine mas_mab_cp()".

By analysis of the code, one condition check can be removed and one case
would hit a redundant assignment.


This patch (of 2):

mas_mab_cp() copy range [mas_start, mas_end] inclusively from a
maple_node to maple_big_node. This implies mas_start <= mas_end.

Based on the relationship of mas_start and mas_end, we can have the
following four cases:

                 | mas_start == mas_end |  mas_start < mas_end
  ---------------+----------------------+----------------------
  mas_start == 0 |         1            |          2
  ---------------+----------------------+----------------------
  mas_start != 0 |         3            |          4

We can see in all these four cases, i is always less than or equal to
mas_end after finish the loop:

  Case 1: After assign pivot 0, i is set to 1, which is bigger than
          mas_end 0. So it jumps to complete and skip the check.
  Case 2: After assign pivot 0, i is set to 1.
          ∵ (mas_start < mas_end) && (mas_start == 0)
             ==>  (1 <= mas_end)
          ∵ (i == 1) && (1 <= mas_end)
             ==>  (i <= mas_end)
          ∴ Before loop, we have (i <= mas_end). And we still hold this
             if it skips the loop. For example, (i == mas_end).

          Now let's see what happens in the loop:
          ∵ piv_end = min(mas_end, mt_pivots[mt])
             ==>  (piv_end <= mas_end)
	  ∵ loop condition is (i < piv_end)
	     ==>  (i <= piv_end) on finish the loop both normally or break
          ∵ (i <= piv_end) && (piv_end <= mas_end)
             ==>  (i <= mas_end)
          ∴ After loop, we still get (i <= mas_end) in this case
  Case 3: This case would skip both if clause and loop. So when it comes
          to the check, i is still mas_start which equals to mas_end.
  Case 4: This case would skip the if clause.
          ∵ (mas_start < mas_end) && (i == mas_start)
             ==>  (i < mas_end)
          ∴ Before loop, we have (i < mas_end).
          The loop process is similar with Case 2, so we get the same
	  result.

Now we can conclude in all cases, we get (i <= mas_end) when doing
check. Then it is not necessary to do the check.

Link: https://lkml.kernel.org/r/20240911142759.20989-1-richard.weiyang@gmail.com
Link: https://lkml.kernel.org/r/20240911142759.20989-2-richard.weiyang@gmail.com
Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
Reviewed-by: Liam R. Howlett <Liam.Howlett@Oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-11-05 16:56:23 -08:00
Greg Kroah-Hartman
09fbb82f94 Merge 6.12-rc6 into driver-core-next
We need the driver-core fix/revert in here as well to build on top of.

Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-11-05 10:11:53 +01:00
Jakub Kicinski
cbf49bed6a bpf-next-for-netdev
-----BEGIN PGP SIGNATURE-----
 
 iIsEABYIADMWIQTFp0I1jqZrAX+hPRXbK58LschIgwUCZyP6TxUcZGFuaWVsQGlv
 Z2VhcmJveC5uZXQACgkQ2yufC7HISINz7QD/RTuJAzPJXPQmjdzMj7pepjnSQH4K
 DnOc1soDqjJPSFkBAMlklDCZqSsFoNtNxagbyILrYQBC/MsV9jngimK46DEN
 =pDzC
 -----END PGP SIGNATURE-----

Merge tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next

Daniel Borkmann says:

====================
pull-request: bpf-next 2024-10-31

We've added 13 non-merge commits during the last 16 day(s) which contain
a total of 16 files changed, 710 insertions(+), 668 deletions(-).

The main changes are:

1) Optimize and homogenize bpf_csum_diff helper for all archs and also
   add a batch of new BPF selftests for it, from Puranjay Mohan.

2) Rewrite and migrate the test_tcp_check_syncookie.sh BPF selftest
   into test_progs so that it can be run in BPF CI, from Alexis Lothoré.

3) Two BPF sockmap selftest fixes, from Zijian Zhang.

4) Small XDP synproxy BPF selftest cleanup to remove IP_DF check,
   from Vincent Li.

* tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next:
  selftests/bpf: Add a selftest for bpf_csum_diff()
  selftests/bpf: Don't mask result of bpf_csum_diff() in test_verifier
  bpf: bpf_csum_diff: Optimize and homogenize for all archs
  net: checksum: Move from32to16() to generic header
  selftests/bpf: remove xdp_synproxy IP_DF check
  selftests/bpf: remove test_tcp_check_syncookie
  selftests/bpf: test MSS value returned with bpf_tcp_gen_syncookie
  selftests/bpf: add ipv4 and dual ipv4/ipv6 support in btf_skc_cls_ingress
  selftests/bpf: get rid of global vars in btf_skc_cls_ingress
  selftests/bpf: add missing ns cleanups in btf_skc_cls_ingress
  selftests/bpf: factorize conn and syncookies tests in a single runner
  selftests/bpf: Fix txmsg_redir of test_txmsg_pull in test_sockmap
  selftests/bpf: Fix msg_verify_data in test_sockmap

====================

Link: https://patch.msgid.link/20241031221543.108853-1-daniel@iogearbox.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-03 14:44:51 -08:00
Caleb Sander Mateos
61bf0009a7 dim: pass dim_sample to net_dim() by reference
net_dim() is currently passed a struct dim_sample argument by value.
struct dim_sample is 24 bytes. Since this is greater 16 bytes, x86-64
passes it on the stack. All callers have already initialized dim_sample
on the stack, so passing it by value requires pushing a duplicated copy
to the stack. Either witing to the stack and immediately reading it, or
perhaps dereferencing addresses relative to the stack pointer in a chain
of push instructions, seems to perform quite poorly.

In a heavy TCP workload, mlx5e_handle_rx_dim() consumes 3% of CPU time,
94% of which is attributed to the first push instruction to copy
dim_sample on the stack for the call to net_dim():
// Call ktime_get()
  0.26 |4ead2:   call   4ead7 <mlx5e_handle_rx_dim+0x47>
// Pass the address of struct dim in %rdi
       |4ead7:   lea    0x3d0(%rbx),%rdi
// Set dim_sample.pkt_ctr
       |4eade:   mov    %r13d,0x8(%rsp)
// Set dim_sample.byte_ctr
       |4eae3:   mov    %r12d,0xc(%rsp)
// Set dim_sample.event_ctr
  0.15 |4eae8:   mov    %bp,0x10(%rsp)
// Duplicate dim_sample on the stack
 94.16 |4eaed:   push   0x10(%rsp)
  2.79 |4eaf1:   push   0x10(%rsp)
  0.07 |4eaf5:   push   %rax
// Call net_dim()
  0.21 |4eaf6:   call   4eafb <mlx5e_handle_rx_dim+0x6b>

To allow the caller to reuse the struct dim_sample already on the stack,
pass the struct dim_sample by reference to net_dim().

Signed-off-by: Caleb Sander Mateos <csander@purestorage.com>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Shannon Nelson <shannon.nelson@amd.com>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Reviewed-by: Arthur Kiyanovski <akiyano@amazon.com>
Reviewed-by: Louis Peens <louis.peens@corigine.com>
Link: https://patch.msgid.link/20241031002326.3426181-2-csander@purestorage.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-03 12:36:54 -08:00
Caleb Sander Mateos
a865276872 dim: make dim_calc_stats() inputs const pointers
Make the start and end arguments to dim_calc_stats() const pointers
to clarify that the function does not modify their values.

Signed-off-by: Caleb Sander Mateos <csander@purestorage.com>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Reviewed-by: Arthur Kiyanovski <akiyano@amazon.com>
Link: https://patch.msgid.link/20241031002326.3426181-1-csander@purestorage.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-03 12:35:57 -08:00
Bartosz Golaszewski
a508ef4b1d lib: string_helpers: silence snprintf() output truncation warning
The output of ".%03u" with the unsigned int in range [0, 4294966295] may
get truncated if the target buffer is not 12 bytes. This can't really
happen here as the 'remainder' variable cannot exceed 999 but the
compiler doesn't know it. To make it happy just increase the buffer to
where the warning goes away.

Fixes: 3c9f3681d0 ("[SCSI] lib: add generic helper to print sizes rounded to the correct SI range")
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
Reviewed-by: Andy Shevchenko <andy@kernel.org>
Cc: James E.J. Bottomley <James.Bottomley@HansenPartnership.com>
Cc: Kees Cook <kees@kernel.org>
Cc: stable@vger.kernel.org
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Link: https://lore.kernel.org/r/20241101205453.9353-1-brgl@bgdev.pl
Signed-off-by: Kees Cook <kees@kernel.org>
2024-11-02 13:08:55 -07:00
Thomas Gleixner
d44d26987b timekeeping: Remove CONFIG_DEBUG_TIMEKEEPING
Since 135225a363 timekeeping_cycles_to_ns() handles large offsets which
would lead to 64bit multiplication overflows correctly. It's also protected
against negative motion of the clocksource unconditionally, which was
exclusive to x86 before.

timekeeping_advance() handles large offsets already correctly.

That means the value of CONFIG_DEBUG_TIMEKEEPING which analyzed these cases
is very close to zero. Remove all of it.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: John Stultz <jstultz@google.com>
Link: https://lore.kernel.org/all/20241031120328.536010148@linutronix.de
2024-11-02 10:14:31 +01:00
Ming Lei
341468e0ab lib/iov_iter: fix bvec iterator setup
.bi_size of bvec iterator should be initialized as real max size for
walking, and .bi_bvec_done just counts how many bytes need to be
skipped in the 1st bvec, so .bi_size isn't related with .bi_bvec_done.

This patch fixes bvec iterator initialization, and the inner `size`
check isn't needed any more, so revert Eric Dumazet's commit
7bc802acf193 ("iov-iter: do not return more bytes than requested in
iov_iter_extract_bvec_pages()").

Cc: Eric Dumazet <edumazet@google.com>
Fixes: e4e535bff2 ("iov_iter: don't require contiguous pages in iov_iter_extract_bvec_pages")
Reported-by: syzbot+71abe7ab2b70bca770fd@syzkaller.appspotmail.com
Tested-by: syzbot+71abe7ab2b70bca770fd@syzkaller.appspotmail.com
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2024-11-01 20:18:21 -06:00
Linus Torvalds
3dfffd506e arm64 fixes for -rc6
- Fix handling of POR_EL0 during signal delivery so that pushing the
   signal context doesn't fail based on the pkey configuration of the
   interrupted context and align our user-visible behaviour with that of
   x86.
 
 - Fix a bogus pointer being passed to the CPU hotplug code from the
   Arm SDEI driver.
 
 - Re-enable software tag-based KASAN with GCC by using an alternative
   implementation of '__no_sanitize_address'.
 -----BEGIN PGP SIGNATURE-----
 
 iQFEBAABCgAuFiEEPxTL6PPUbjXGY88ct6xw3ITBYzQFAmcjr8wQHHdpbGxAa2Vy
 bmVsLm9yZwAKCRC3rHDchMFjNL2DB/4tNl7feCA2V4fW/Eu3RzXrHTdJbZvTjLDl
 JjeXPZr4WdGQQMgQ0DPZtpnmeBzd5nswx9WHG9VSsUxc5g+rzWxwvMnUeplDvEXo
 Y/QMUq4JZN3eqDZWPs0mEN4fMI+QOihInErVHvFXaJLcbxYrU5BvfwExgfY53AjT
 ZJEPmF291OL6V4UCWVWggk44BQaTBeWmc4itJcYm6z6mIgAgh84MZGK5M0e582ip
 CRAImDiAPqLxRO9kzKcYthI3FDyyVi1HtiSL1CiNktOXMNz19qPelq1XAnDEyvBt
 TEUitTLTwbUJ0nqi4u7ve09aebneAq8nsGucteYTrBU4U/PRjvQO
 =LTB9
 -----END PGP SIGNATURE-----

Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull arm64 fixes from Will Deacon:
 "The important one is a change to the way in which we handle protection
  keys around signal delivery so that we're more closely aligned with
  the x86 behaviour, however there is also a revert of the previous fix
  to disable software tag-based KASAN with GCC, since a workaround
  materialised shortly afterwards.

  I'd love to say we're done with 6.12, but we're aware of some
  longstanding fpsimd register corruption issues that we're almost at
  the bottom of resolving.

  Summary:

   - Fix handling of POR_EL0 during signal delivery so that pushing the
     signal context doesn't fail based on the pkey configuration of the
     interrupted context and align our user-visible behaviour with that
     of x86.

   - Fix a bogus pointer being passed to the CPU hotplug code from the
     Arm SDEI driver.

   - Re-enable software tag-based KASAN with GCC by using an alternative
     implementation of '__no_sanitize_address'"

* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
  arm64: signal: Improve POR_EL0 handling to avoid uaccess failures
  firmware: arm_sdei: Fix the input parameter of cpuhp_remove_state()
  Revert "kasan: Disable Software Tag-Based KASAN with GCC"
  kasan: Fix Software Tag-Based KASAN with GCC
2024-11-01 07:54:11 -10:00
Linus Torvalds
d56239a82e vfs-6.12-rc6.fixes
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCZyTGAQAKCRCRxhvAZXjc
 opd6AQCal4omyfS8FYe4VRRZ/0XHouagq99I0U0TAmKkvoKAsgD/XrdE+pSTEkPX
 Pv4T9phh1cZRxcyKVu77UoYkuHJEDAg=
 =Lu9R
 -----END PGP SIGNATURE-----

Merge tag 'vfs-6.12-rc6.fixes' of gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs

Pull filesystem fixes from Christian Brauner:
 "VFS:

   - Fix copy_page_from_iter_atomic() if KMAP_LOCAL_FORCE_MAP=y is set

   - Add a get_tree_bdev_flags() helper that allows to modify e.g.,
     whether errors are logged into the filesystem context during
     superblock creation. This is used by erofs to fix a userspace
     regression where an error is currently logged when its used on a
     regular file which is an new allowed mode in erofs.

  netfs:

   - Fix the sysfs debug path in the documentation.

   - Fix iov_iter_get_pages*() for folio queues by skipping the page
     extracation if we're at the end of a folio.

  afs:

   - Fix moving subdirectories to different parent directory.

  autofs:

   - Fix handling of AUTOFS_DEV_IOCTL_TIMEOUT_CMD ioctl in
     validate_dev_ioctl(). The actual ioctl number, not the ioctl
     command needs to be checked for autofs"

* tag 'vfs-6.12-rc6.fixes' of gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs:
  iov_iter: fix copy_page_from_iter_atomic() if KMAP_LOCAL_FORCE_MAP
  autofs: fix thinko in validate_dev_ioctl()
  iov_iter: Fix iov_iter_get_pages*() for folio_queue
  afs: Fix missing subdir edit when renamed between parent dirs
  doc: correcting the debug path for cachefiles
  erofs: use get_tree_bdev_flags() to avoid misleading messages
  fs/super.c: introduce get_tree_bdev_flags()
2024-11-01 07:37:10 -10:00
Eric Dumazet
a911bad094 dql: annotate data-races around dql->last_obj_cnt
dql->last_obj_cnt is read/written from different contexts,
without any lock synchronization.

Use READ_ONCE()/WRITE_ONCE() to avoid load/store tearing.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Joe Damato <jdamato@fastly.com>
Link: https://patch.msgid.link/20241029191425.2519085-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-31 19:19:36 -07:00
Jakub Kicinski
5b1c965956 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Cross-merge networking fixes after downstream PR (net-6.12-rc6).

Conflicts:

drivers/net/wireless/intel/iwlwifi/mvm/mld-mac80211.c
  cbe84e9ad5 ("wifi: iwlwifi: mvm: really send iwl_txpower_constraints_cmd")
  188a1bf894 ("wifi: mac80211: re-order assigning channel in activate links")
https://lore.kernel.org/all/20241028123621.7bbb131b@canb.auug.org.au/

net/mac80211/cfg.c
  c4382d5ca1 ("wifi: mac80211: update the right link for tx power")
  8dd0498983 ("wifi: mac80211: Fix setting txpower with emulate_chanctx")

drivers/net/ethernet/intel/ice/ice_ptp_hw.h
  6e58c33106 ("ice: fix crash on probe for DPLL enabled E810 LOM")
  e4291b64e1 ("ice: Align E810T GPIO to other products")
  ebb2693f8f ("ice: Read SDP section from NVM for pin definitions")
  ac532f4f42 ("ice: Cleanup unused declarations")
https://lore.kernel.org/all/20241030120524.1ee1af18@canb.auug.org.au/

No adjacent changes.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-31 18:10:07 -07:00
Ming Lei
496a51b371 lib/iov_iter.c: initialize bi.bi_idx before iterating over bvec
Initialize bi.bi_idx as 0 before iterating over bvec, otherwise
garbage data can be used as ->bi_idx.

Cc: Christoph Hellwig <hch@lst.de>
Reported-and-tested-by: Klara Modin <klarasmodin@gmail.com>
Fixes: e4e535bff2 ("iov_iter: don't require contiguous pages in iov_iter_extract_bvec_pages")
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2024-10-31 07:40:34 -06:00
Puranjay Mohan
db71aae70e net: checksum: Move from32to16() to generic header
from32to16() is used by lib/checksum.c and also by
arch/parisc/lib/checksum.c. The next patch will use it in the
bpf_csum_diff helper.

Move from32to16() to the include/net/checksum.h as csum_from32to16() and
remove other implementations.

Signed-off-by: Puranjay Mohan <puranjay@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20241026125339.26459-2-puranjay@kernel.org
2024-10-30 15:29:59 +01:00
Linus Torvalds
7fbaacafbc slab fixes for 6.12-rc6
-----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCAAdFiEEe7vIQRWZI0iWSE3xu+CwddJFiJoFAmcgrxcACgkQu+CwddJF
 iJrq9ggAiZ/2c7p23s52LdVhT9GTyV5omVOh2kDztVx4w6RM3RbkhkLWdqt0XUag
 uf1TJe6kOvnCeHEFEEo3sqPj820XebxKDf0GGCdI6a9f4n30ipKH+vWSQ0iutKO/
 dOBdArxr0FGOV5VZR9i3xQ6sUqZXXUbJdte0c0ovp6Q6HDHTeQeKNhOQ2fv33TG/
 7jBh5HVyhI6JE/+TOxrMaklH0IqYBb6z49wdbaN7XBvXVXlb5MtOZy109gfUHDwe
 tfktifyE45VtmF0WdHfxDbCnqyDSG1Jm3wsLDbMq+voJ1BQlUvIZ5Dv4kucYqffm
 VN5HkH6uQ09aoounBoU4g50UYeNpiQ==
 =xAw8
 -----END PGP SIGNATURE-----

Merge tag 'slab-for-6.12-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab

Pull slab fixes from Vlastimil Babka:

 - Fix for a slub_kunit test warning with MEM_ALLOC_PROFILING_DEBUG (Pei
   Xiao)

 - Fix for a MTE-based KASAN BUG in krealloc() (Qun-Wei Lin)

* tag 'slab-for-6.12-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab:
  mm: krealloc: Fix MTE false alarm in __do_krealloc
  slub/kunit: fix a WARNING due to unwrapped __kmalloc_cache_noprof
2024-10-29 16:24:02 -10:00
Ming Lei
e4e535bff2 iov_iter: don't require contiguous pages in iov_iter_extract_bvec_pages
The iov_iter_extract_pages interface allows to return physically
discontiguous pages, as long as all but the first and last page
in the array are page aligned and page size.  Rewrite
iov_iter_extract_bvec_pages to take advantage of that instead of only
returning ranges of physically contiguous pages.

Signed-off-by: Ming Lei <ming.lei@redhat.com>
[hch: minor cleanups, new commit log]
Signed-off-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20241024050021.627350-1-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2024-10-29 09:14:28 -06:00
Arnd Bergmann
5a8b4b4001 lib/iomem_copy: fix kerneldoc format style
The newly added file did not quite get the punctuation right:

lib/iomem_copy.c:14: warning: This comment starts with '/**', but isn't a kernel-doc comment. Refer Documentation/doc-guide/kernel-doc.rst

Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202410290907.0mDZVYPK-lkp@intel.com/
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2024-10-29 07:14:29 +00:00
Arnd Bergmann
af08475370 selftests: kallsyms: add MODULE_DESCRIPTION
The newly added test script creates modules that are lacking
a description line in order to build cleanly:

WARNING: modpost: missing MODULE_DESCRIPTION() in lib/tests/module/test_kallsyms_a.o
WARNING: modpost: missing MODULE_DESCRIPTION() in lib/tests/module/test_kallsyms_b.o
WARNING: modpost: missing MODULE_DESCRIPTION() in lib/tests/module/test_kallsyms_c.o
WARNING: modpost: missing MODULE_DESCRIPTION() in lib/tests/module/test_kallsyms_d.o

Fixes: 84b4a51fce ("selftests: add new kallsyms selftests")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
2024-10-28 15:27:08 -07:00
Julian Vetter
b660d0a2ac
New implementation for IO memcpy and IO memset
The IO memcpy and IO memset functions in asm-generic/io.h simply call
memcpy and memset. This can lead to alignment problems or faults on
architectures that do not define their own version and fall back to
these defaults.
This patch introduces new implementations for IO memcpy and IO memset,
that use read{l,q} accessor functions, align accesses to machine word
size, and resort to byte accesses when the target memory is not aligned.
For new architectures and existing ones that were using the old
fallbacks these functions are save to use, because IO memory constraints
are taken into account. Moreover, architectures with similar
implementations can now use these new versions, not needing to implement
their own.

Reviewed-by: Yann Sionneau <ysionneau@kalrayinc.com>
Signed-off-by: Julian Vetter <jvetter@kalrayinc.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2024-10-28 21:44:29 +00:00
Nicolas Pitre
1dc82675cb
lib/math/test_div64: add some edge cases relevant to __div64_const32()
Be sure to test the extreme cases with and without bias.

Signed-off-by: Nicolas Pitre <npitre@baylibre.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2024-10-28 21:44:28 +00:00
Ira Weiny
4261974701 printf: Add print format (%pra) for struct range
The use of struct range in the CXL subsystem is growing.  In particular,
the addition of Dynamic Capacity devices uses struct range in a number
of places which are reported in debug and error messages.

To wit requiring the printing of the start/end fields in each print
became cumbersome.  Dan Williams mentions in [1] that it might be time
to have a print specifier for struct range similar to struct resource.

A few alternatives were considered including '%par', '%r', and '%pn'.
%pra follows that struct range is similar to struct resource (%p[rR])
but needs to be different.  Based on discussions with Petr and Andy
'%pra' was chosen.[2]

Andy also suggested to keep the range prints similar to struct resource
though combined code.  Add hex_range() to handle printing for both
pointer types.

Finally introduce DEFINE_RANGE() as a parallel to DEFINE_RES_*() and use
it in the tests.

Cc: Jonathan Corbet <corbet@lwn.net>
Cc: linux-doc@vger.kernel.org
Cc: open list <linux-kernel@vger.kernel.org>
Link: https://lore.kernel.org/all/663922b475e50_d54d72945b@dwillia2-xfh.jf.intel.com.notmuch/ [1]
Link: https://lore.kernel.org/all/66cea3bf3332f_f937b29424@iweiny-mobl.notmuch/ [2]
Suggested-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Link: https://patch.msgid.link/20241025-cxl-pra-v2-3-123a825daba2@intel.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2024-10-28 14:32:43 -07:00
Ira Weiny
8e7f07e608 test printf: Add very basic struct resource tests
The printf tests for struct resource were stubbed out.  struct range
printing will leverage the struct resource implementation.

To prevent regression add some basic sanity tests for struct resource.

Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Fan Ni <fan.ni@samsung.com>
Tested-by: Fan Ni <fan.ni@samsung.com>
Acked-by: Petr Mladek <pmladek@suse.com>
Link: https://patch.msgid.link/20241007-dcd-type2-upstream-v4-1-c261ee6eeded@intel.com
Tested-by: Petr Mladek <pmladek@suse.com>
Signed-off-by: Ira Weiny <ira.weiny@intel.com>
Link: https://patch.msgid.link/20241025-cxl-pra-v2-1-123a825daba2@intel.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2024-10-28 14:30:52 -07:00
Hugh Dickins
c749d9b7eb
iov_iter: fix copy_page_from_iter_atomic() if KMAP_LOCAL_FORCE_MAP
generic/077 on x86_32 CONFIG_DEBUG_KMAP_LOCAL_FORCE_MAP=y with highmem,
on huge=always tmpfs, issues a warning and then hangs (interruptibly):

WARNING: CPU: 5 PID: 3517 at mm/highmem.c:622 kunmap_local_indexed+0x62/0xc9
CPU: 5 UID: 0 PID: 3517 Comm: cp Not tainted 6.12.0-rc4 #2
...
copy_page_from_iter_atomic+0xa6/0x5ec
generic_perform_write+0xf6/0x1b4
shmem_file_write_iter+0x54/0x67

Fix copy_page_from_iter_atomic() by limiting it in that case
(include/linux/skbuff.h skb_frag_must_loop() does similar).

But going forward, perhaps CONFIG_DEBUG_KMAP_LOCAL_FORCE_MAP is too
surprising, has outlived its usefulness, and should just be removed?

Fixes: 908a1ad894 ("iov_iter: Handle compound highmem pages in copy_page_from_iter_atomic()")
Signed-off-by: Hugh Dickins <hughd@google.com>
Link: https://lore.kernel.org/r/dd5f0c89-186e-18e1-4f43-19a60f5a9774@google.com
Reviewed-by: Christoph Hellwig <hch@lst.de>
Cc: stable@vger.kernel.org
Signed-off-by: Christian Brauner <brauner@kernel.org>
2024-10-28 13:39:35 +01:00
Eric Biggers
4964a1d91c crypto: api - move crypto_simd_disabled_for_test to lib
Move crypto_simd_disabled_for_test to lib/ so that crypto_simd_usable()
can be used by library code.

This was discussed previously
(https://lore.kernel.org/linux-crypto/20220716062920.210381-4-ebiggers@kernel.org/)
but was not done because there was no use case yet.  However, this is
now needed for the arm64 CRC32 library code.

Tested with:
    export ARCH=arm64 CROSS_COMPILE=aarch64-linux-gnu-
    echo CONFIG_CRC32=y > .config
    echo CONFIG_MODULES=y >> .config
    echo CONFIG_CRYPTO=m >> .config
    echo CONFIG_DEBUG_KERNEL=y >> .config
    echo CONFIG_CRYPTO_MANAGER_DISABLE_TESTS=n >> .config
    echo CONFIG_CRYPTO_MANAGER_EXTRA_TESTS=y >> .config
    make olddefconfig
    make -j$(nproc)

Signed-off-by: Eric Biggers <ebiggers@google.com>
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2024-10-28 18:33:11 +08:00
Ard Biesheuvel
16739efac6 crypto: crc32c - Provide crc32c-arch driver for accelerated library code
crc32c-generic is currently backed by the architecture's CRC-32c library
code, which may offer a variety of implementations depending on the
capabilities of the platform. These are not covered by the crypto
subsystem's fuzz testing capabilities because crc32c-generic is the
reference driver that the fuzzing logic uses as a source of truth.

Fix this by providing a crc32c-arch implementation which is based on the
arch library code if available, and modify crc32c-generic so it is
always based on the generic C implementation. If the arch has no CRC-32c
library code, this change does nothing.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2024-10-28 18:33:10 +08:00
Ard Biesheuvel
a37e55791f crypto: crc32 - Provide crc32-arch driver for accelerated library code
crc32-generic is currently backed by the architecture's CRC-32 library
code, which may offer a variety of implementations depending on the
capabilities of the platform. These are not covered by the crypto
subsystem's fuzz testing capabilities because crc32-generic is the
reference driver that the fuzzing logic uses as a source of truth.

Fix this by providing a crc32-arch implementation which is based on the
arch library code if available, and modify crc32-generic so it is
always based on the generic C implementation. If the arch has no CRC-32
library code, this change does nothing.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2024-10-28 18:33:10 +08:00
Paolo Abeni
03fc07a247 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Cross-merge networking fixes after downstream PR.

No conflicts and no adjacent changes.

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-10-25 09:08:22 +02:00
Linus Torvalds
c2cd8e4592 Probes fixes for v6.12-rc4(2):
- objpool: Fix choosing allocation for percpu slots
   Fixes to allocate objpool's percpu slots correctly according to the
   GFP flag. It checks whether "any bit" in GFP_ATOMIC is set to choose
   the vmalloc source, but it should check "all bits" in GFP_ATOMIC flag
   is set, because GFP_ATOMIC is a combined flag.
 
 - tracing/probes: Fix MAX_TRACE_ARGS limit handling
   If more than MAX_TRACE_ARGS are passed for creating a probe event, the
   entries over MAX_TRACE_ARG in trace_arg array are not initialized.
   Thus if the kernel accesses those entries, it crashes. This rejects
   creating event if the number of arguments is over MAX_TRACE_ARGS.
 
 - tracing: Consider the NULL character when validating the event length
   A strlen() is used when parsing the event name, and the original code
   does not consider the terminal null byte. Thus it can pass the name
   1 byte longer than the buffer. This fixes to check it correctly.
 -----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCgAdFiEEh7BulGwFlgAOi5DV2/sHvwUrPxsFAmcZBJ0ACgkQ2/sHvwUr
 Pxu4qAgAm+mIiCaBGyolsT1oB5EF+9gztbwRtcAOY1811RJZ0XiQPuOwtZfijpBr
 1Pl+SjubRKhLg+lLHEuCQHxkqlTSp+zrjkF+A0hFlB38nJ5P3pIw+b5pM5FCvhY+
 w0tBTwkjiRBS9h1z88c74ciKYA/XR4apcMMUrPQZUCHq8P73Wu/Fo2lhnCVGBs6q
 nYESyrTcOCDR0c6HP9D2GWxQFtbbCyAfotUjX37EIooTcl7ufAr8IPm8jBx7EzCa
 WM841FwbuIgGbFCGYlG1/lOR+Qf7FszKAY5SBJMV/BiyFbxJqZfA5DWfJcrZ9YpW
 pl86oKWyEkidwx8OIiB3Y1enPzUUJQ==
 =8oUB
 -----END PGP SIGNATURE-----

Merge tag 'probes-fixes-v6.12-rc4.2' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace

Pull probes fixes from Masami Hiramatsu:

 - objpool: Fix choosing allocation for percpu slots

   Fixes to allocate objpool's percpu slots correctly according to the
   GFP flag. It checks whether "any bit" in GFP_ATOMIC is set to choose
   the vmalloc source, but it should check "all bits" in GFP_ATOMIC flag
   is set, because GFP_ATOMIC is a combined flag.

 - tracing/probes: Fix MAX_TRACE_ARGS limit handling

   If more than MAX_TRACE_ARGS are passed for creating a probe event,
   the entries over MAX_TRACE_ARG in trace_arg array are not
   initialized. Thus if the kernel accesses those entries, it crashes.
   This rejects creating event if the number of arguments is over
   MAX_TRACE_ARGS.

 - tracing: Consider the NUL character when validating the event length

   A strlen() is used when parsing the event name, and the original code
   does not consider the terminal null byte. Thus it can pass the name
   one byte longer than the buffer. This fixes to check it correctly.

* tag 'probes-fixes-v6.12-rc4.2' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
  tracing: Consider the NULL character when validating the event length
  tracing/probes: Fix MAX_TRACE_ARGS limit handling
  objpool: fix choosing allocation for percpu slots
2024-10-24 13:51:58 -07:00
Luis Chamberlain
84b4a51fce selftests: add new kallsyms selftests
We lack find_symbol() selftests, so add one. This let's us stress test
improvements easily on find_symbol() or optimizations. It also inherently
allows us to test the limits of kallsyms on Linux today.

We test a pathalogical use case for kallsyms by introducing modules
which are automatically written for us with a larger number of symbols.
We have 4 kallsyms test modules:

A: has KALLSYSMS_NUMSYMS exported symbols
B: uses one of A's symbols
C: adds KALLSYMS_SCALE_FACTOR * KALLSYSMS_NUMSYMS exported
D: adds 2 * the symbols than C

By using anything much larger than KALLSYSMS_NUMSYMS as 10,000 and
KALLSYMS_SCALE_FACTOR of 8 we segfault today. So we're capped at
around 160000 symbols somehow today. We can inpsect that issue at
our leasure later, but for now the real value to this test is that
this will easily allow us to test improvements on find_symbol().

We want to enable this test on allyesmodconfig builds so we can't
use this combination, so instead just use a safe value for now and
be informative on the Kconfig symbol documentation about where our
thresholds are for testers. We default then to KALLSYSMS_NUMSYMS of
just 100 and KALLSYMS_SCALE_FACTOR of 8.

On x86_64 we can use perf, for other architectures we just use 'time'
and allow for customizations. For example a future enhancements could
be done for parisc to check for unaligned accesses which triggers a
special special exception handler assembler code inside the kernel.
The negative impact on performance is so large on parisc that it
keeps track of its accesses on /proc/cpuinfo as UAH:

IRQ:       CPU0       CPU1
3:       1332          0         SuperIO  ttyS0
7:    1270013          0         SuperIO  pata_ns87415
64:  320023012  320021431             CPU  timer
65:   17080507   20624423             CPU  IPI
UAH:   10948640      58104   Unaligned access handler traps

While at it, this tidies up lib/ test modules to allow us to have
a new directory for them. The amount of test modules under lib/
is insane.

This should also hopefully showcase how to start doing basic
self module writing code, which may be more useful for more complex
cases later in the future.

Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
2024-10-24 10:14:12 -07:00
David Howells
e65a0dc1ca
iov_iter: Fix iov_iter_get_pages*() for folio_queue
p9_get_mapped_pages() uses iov_iter_get_pages_alloc2() to extract pages
from an iterator when performing a zero-copy request and under some
circumstances, this crashes with odd page errors[1], for example, I see:

    page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0xbcf0
    flags: 0x2000000000000000(zone=1)
    ...
    page dumped because: VM_BUG_ON_FOLIO(((unsigned int) folio_ref_count(folio) + 127u <= 127u))
    ------------[ cut here ]------------
    kernel BUG at include/linux/mm.h:1444!

This is because, unlike in iov_iter_extract_folioq_pages(), the
iter_folioq_get_pages() helper function doesn't skip the current folio
when iov_offset points to the end of it, but rather extracts the next
page beyond the end of the folio and adds it to the list.  Reading will
then clobber the contents of this page, leading to system corruption,
and if the page is not in use, put_page() may try to clean up the unused
page.

This can be worked around by copying the iterator before each
extraction[2] and using iov_iter_advance() on the original as the
advance function steps over the page we're at the end of.

Fix this by skipping the page extraction if we're at the end of the
folio.

This was reproduced in the ktest environment[3] by forcing 9p to use the
fscache caching mode and then reading a file through 9p.

Fixes: db0aa2e956 ("mm: Define struct folio_queue and ITER_FOLIOQ to handle a sequence of folios")
Reported-by: Antony Antony <antony@phenome.org>
Closes: https://lore.kernel.org/r/ZxFQw4OI9rrc7UYc@Antony2201.local/
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Eric Van Hensbergen <ericvh@kernel.org>
cc: Latchesar Ionkov <lucho@ionkov.net>
cc: Dominique Martinet <asmadeus@codewreck.org>
cc: Christian Schoenebeck <linux_oss@crudebyte.com>
cc: v9fs@lists.linux.dev
cc: netfs@lists.linux.dev
cc: linux-fsdevel@vger.kernel.org
Link: https://lore.kernel.org/r/ZxFEi1Tod43pD6JC@moon.secunet.de/ [1]
Link: https://lore.kernel.org/r/2299159.1729543103@warthog.procyon.org.uk/ [2]
Link: https://github.com/koverstreet/ktest.git [3]
Tested-by: Antony Antony <antony.antony@secunet.com>
Link: https://lore.kernel.org/r/3327438.1729678025@warthog.procyon.org.uk
Signed-off-by: Christian Brauner <brauner@kernel.org>
2024-10-24 13:50:27 +02:00
Marco Elver
237ab03e30 Revert "kasan: Disable Software Tag-Based KASAN with GCC"
This reverts commit 7aed6a2c51.

Now that __no_sanitize_address attribute is fixed for KASAN_SW_TAGS with
GCC, allow re-enabling KASAN_SW_TAGS with GCC.

Cc: Andrey Konovalov <andreyknvl@gmail.com>
Cc: Andrew Pinski <pinskia@gmail.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Marco Elver <elver@google.com>
Reviewed-by: Andrey Konovalov <andreyknvl@gmail.com>
Link: https://lore.kernel.org/r/20241021120013.3209481-2-elver@google.com
Signed-off-by: Will Deacon <will@kernel.org>
2024-10-23 16:04:30 +01:00
Pei Xiao
2b059d0d1e slub/kunit: fix a WARNING due to unwrapped __kmalloc_cache_noprof
'modprobe slub_kunit' will have a warning as shown below. The root cause
is that __kmalloc_cache_noprof was directly used, which resulted in no
alloc_tag being allocated. This caused current->alloc_tag to be null,
leading to a warning in alloc_tag_add_check.

Let's add an alloc_hook layer to __kmalloc_cache_noprof specifically
within lib/slub_kunit.c, which is the only user of this internal slub
function outside kmalloc implementation itself.

[58162.947016] WARNING: CPU: 2 PID: 6210 at
./include/linux/alloc_tag.h:125 alloc_tagging_slab_alloc_hook+0x268/0x27c
[58162.957721] Call trace:
[58162.957919]  alloc_tagging_slab_alloc_hook+0x268/0x27c
[58162.958286]  __kmalloc_cache_noprof+0x14c/0x344
[58162.958615]  test_kmalloc_redzone_access+0x50/0x10c [slub_kunit]
[58162.959045]  kunit_try_run_case+0x74/0x184 [kunit]
[58162.959401]  kunit_generic_run_threadfn_adapter+0x2c/0x4c [kunit]
[58162.959841]  kthread+0x10c/0x118
[58162.960093]  ret_from_fork+0x10/0x20
[58162.960363] ---[ end trace 0000000000000000 ]---

Signed-off-by: Pei Xiao <xiaopei01@kylinos.cn>
Fixes: a0a44d9175 ("mm, slab: don't wrap internal functions with alloc_hooks()")
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
2024-10-23 09:50:58 +02:00
Viktor Malik
aff1871bfc objpool: fix choosing allocation for percpu slots
objpool intends to use vmalloc for default (non-atomic) allocations of
percpu slots and objects. However, the condition checking if GFP flags
set any bit of GFP_ATOMIC is wrong b/c GFP_ATOMIC is a combination of bits
(__GFP_HIGH|__GFP_KSWAPD_RECLAIM) and so `pool->gfp & GFP_ATOMIC` will
be true if either bit is set. Since GFP_ATOMIC and GFP_KERNEL share the
___GFP_KSWAPD_RECLAIM bit, kmalloc will be used in cases when GFP_KERNEL
is specified, i.e. in all current usages of objpool.

This may lead to unexpected OOM errors since kmalloc cannot allocate
large amounts of memory.

For instance, objpool is used by fprobe rethook which in turn is used by
BPF kretprobe.multi and kprobe.session probe types. Trying to attach
these to all kernel functions with libbpf using

    SEC("kprobe.session/*")
    int kprobe(struct pt_regs *ctx)
    {
        [...]
    }

fails on objpool slot allocation with ENOMEM.

Fix the condition to truly use vmalloc by default.

Link: https://lore.kernel.org/all/20240826060718.267261-1-vmalik@redhat.com/

Fixes: b4edb8d2d4 ("lib: objpool added: ring-array based lockless MPMC")
Signed-off-by: Viktor Malik <vmalik@redhat.com>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Reviewed-by: Matt Wu <wuqiang.matt@bytedance.com>
Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
2024-10-22 14:22:42 +09:00
Linus Torvalds
a777c32ca4 This push fixes a regression in mpi that broke RSA.
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEn51F/lCuNhUwmDeSxycdCkmxi6cFAmcSJQYACgkQxycdCkmx
 i6ejoBAAhK/3bk9jmxMOnVvednjrjVMqg+17daXHKbHT6eMcOwXsgr4ZWrkc5syV
 tQBRipdSfLhwf4aTNOzgyg3GIVVQkLZuRKDanntVdyYs65YKKUP/BiUshMAJ4DbW
 nkPe+LBdl0EvIWexrSKy5cyB2Yt+5MknK+mUMHyAeRjgVHNCEBMbMo/4KHGDW6fL
 Cn8rBATD1LCBODkxFC83pHe5M/TsxM08hL8xQxPJZm9SvNiBa7+xaS/oSApyIs8x
 L0RmYdlXlRGQcok5/ZCFc66QEOw2lIOwIc6sTmbT+eKFtvztkZ+ErhAuubgk5UKa
 TaB0qrBIpsQs2O7gFq4OU7BkG4QAlFt37MqBuf21b5Zh605s/ORDWEQobcokXpBY
 SmxOBxBhhLcRgb1cjUQn44/M8vrRXL0+IZiuOWkb+vcNln32bCH+BeiW6traNdL3
 s3uVRF28Pd76xB4eAuT4eqiSOuCI/FyB7+hJmkOcpKC1eQUq2whrFLfru3iGItn8
 bJWJQjPaysI8QXoky6miMjaeBWWOHuBWgYb2BzzHRsAdxK2oXUN/Q3BOJq1wONtP
 YaRzqu5vBvPk+0F/SOIl1MBp1nt62T8WRcDyIAhDsgmnuWASAKzo9Smzzo0gJr8q
 bB9iHTHN6yR9J3+zPyOqPY99zkaABSrQU9StFqEjN8icndG5Tfo=
 =MHMX
 -----END PGP SIGNATURE-----

Merge tag 'v6.12-p4' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6

Pull crypto fix from Herbert Xu:
 "Fix a regression in mpi that broke RSA"

* tag 'v6.12-p4' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
  crypto: lib/mpi - Fix an "Uninitialized scalar variable" issue
2024-10-21 09:59:43 -07:00
Linus Torvalds
4e6bd4a33a Rust fixes for v6.12 (2nd)
Toolchain and infrastructure:
 
  - Fix several issues with the 'rustc-option' macro. It includes a
    refactor from Masahiro of three '{cc,rust}-*' macros, which is not
    a fix but avoids repeating the same commands (which would be several
    lines in the case of 'rustc-option').
 
  - Fix conditions for 'CONFIG_HAVE_CFI_ICALL_NORMALIZE_INTEGERS'. It
    includes the addition of 'CONFIG_RUSTC_LLVM_VERSION', which is not a
    fix but is needed for the actual fix.
 
 And a trivial grammar fix.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEPjU5OPd5QIZ9jqqOGXyLc2htIW0FAmcS5LkACgkQGXyLc2ht
 IW07ghAAxP94zqWzf8bQ4IIgTYrV9WSqR9vMpd31VAPknRJjGUq5dehFxiQxDJ5X
 ibMcpyja8V1CGeOh4qthLJAD/OGw+ANafjLfHM/l9cQRx1uwLEac3h4/YR1x52Ep
 al3ISewhbs3cjko2aa6Gnym3hdYizqkKY9Bca6kvo7k4ZRRmWT3sKAsle6rV93Hw
 q9AjC40XC8iy2VYv/JPvP1zcr3T7ZzCrs3ELG8sLSeR0gZZEmI3e3FOWWHcRlVRa
 uig4SSPvhHVssG8k64CHmzUtVQCApuJuzQGG72Ozs4V5Xxk86ZRE0XzyMXaw15nu
 Mm8s+hDxsFXfESQg0GMCVQ7wnGFSuvRwK3sWALltXmqtGQxkYgcJ3mYtu0sP8p51
 VIzDIomdUfGLxk+sDn7Lnl5PrSLaetUd94nr5qCMmfb2/7/kSaB4aHmML+8ZHCn5
 I4TQONL/pVmmRm97HFaAFOzCaGRWfVoIzQ/cRaQhqK+qrTfRjyFcsMzN+Flp5A58
 c3AgnTVlm4pPqtlLQ1z9BiGYT50dI0fHBOQiisogGsZwwMUqzEMOnbZjbhS/HKSp
 FG8hu/OyzIsNnNqOfQZN4DSTyf4qfIuyTmFM1OAel8zllCwlxy5F2hVp/opwH3/y
 On6CW0lunUBzCXZZ+byWudo7Vg8YpMVHATLqp9FHZpJb8JK688w=
 =Y7fL
 -----END PGP SIGNATURE-----

Merge tag 'rust-fixes-6.12-2' of https://github.com/Rust-for-Linux/linux

Pull rust fixes from Miguel Ojeda:
 "Toolchain and infrastructure:

   - Fix several issues with the 'rustc-option' macro. It includes a
     refactor from Masahiro of three '{cc,rust}-*' macros, which is not
     a fix but avoids repeating the same commands (which would be
     several lines in the case of 'rustc-option').

   - Fix conditions for 'CONFIG_HAVE_CFI_ICALL_NORMALIZE_INTEGERS'. It
     includes the addition of 'CONFIG_RUSTC_LLVM_VERSION', which is not
     a fix but is needed for the actual fix.

  And a trivial grammar fix"

* tag 'rust-fixes-6.12-2' of https://github.com/Rust-for-Linux/linux:
  cfi: fix conditions for HAVE_CFI_ICALL_NORMALIZE_INTEGERS
  kbuild: rust: add `CONFIG_RUSTC_LLVM_VERSION`
  kbuild: fix issues with rustc-option
  kbuild: refactor cc-option-yn, cc-disable-warning, rust-option-yn macros
  lib/Kconfig.debug: fix grammar in RUST_BUILD_ASSERT_ALLOW
2024-10-19 08:32:47 -07:00
Linus Torvalds
3d5ad2d4ec BPF fixes:
- Fix BPF verifier to not affect subreg_def marks in its range
   propagation, from Eduard Zingerman.
 
 - Fix a truncation bug in the BPF verifier's handling of
   coerce_reg_to_size_sx, from Dimitar Kanaliev.
 
 - Fix the BPF verifier's delta propagation between linked
   registers under 32-bit addition, from Daniel Borkmann.
 
 - Fix a NULL pointer dereference in BPF devmap due to missing
   rxq information, from Florian Kauer.
 
 - Fix a memory leak in bpf_core_apply, from Jiri Olsa.
 
 - Fix an UBSAN-reported array-index-out-of-bounds in BTF
   parsing for arrays of nested structs, from Hou Tao.
 
 - Fix build ID fetching where memory areas backing the file
   were created with memfd_secret, from Andrii Nakryiko.
 
 - Fix BPF task iterator tid filtering which was incorrectly
   using pid instead of tid, from Jordan Rome.
 
 - Several fixes for BPF sockmap and BPF sockhash redirection
   in combination with vsocks, from Michal Luczaj.
 
 - Fix riscv BPF JIT and make BPF_CMPXCHG fully ordered,
   from Andrea Parri.
 
 - Fix riscv BPF JIT under CONFIG_CFI_CLANG to prevent the
   possibility of an infinite BPF tailcall, from Pu Lehui.
 
 - Fix a build warning from resolve_btfids that bpf_lsm_key_free
   cannot be resolved, from Thomas Weißschuh.
 
 - Fix a bug in kfunc BTF caching for modules where the wrong
   BTF object was returned, from Toke Høiland-Jørgensen.
 
 - Fix a BPF selftest compilation error in cgroup-related tests
   with musl libc, from Tony Ambardar.
 
 - Several fixes to BPF link info dumps to fill missing fields,
   from Tyrone Wu.
 
 - Add BPF selftests for kfuncs from multiple modules, checking
   that the correct kfuncs are called, from Simon Sundberg.
 
 - Ensure that internal and user-facing bpf_redirect flags
   don't overlap, also from Toke Høiland-Jørgensen.
 
 - Switch to use kvzmalloc to allocate BPF verifier environment,
   from Rik van Riel.
 
 - Use raw_spinlock_t in BPF ringbuf to fix a sleep in atomic
   splat under RT, from Wander Lairson Costa.
 
 Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
 -----BEGIN PGP SIGNATURE-----
 
 iIsEABYIADMWIQTFp0I1jqZrAX+hPRXbK58LschIgwUCZxK4OhUcZGFuaWVsQGlv
 Z2VhcmJveC5uZXQACgkQ2yufC7HISIOCrwEAib2kC5EEQn5+wKVE/bnZryVX2leT
 YXdfItDCBU6zCYUA+wTU5hGGn9lcDUcZx72l/KZPDyPw7HdzNJ+6iR1zQqoM
 =f9kv
 -----END PGP SIGNATURE-----

Merge tag 'bpf-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf

Pull bpf fixes from Daniel Borkmann:

 - Fix BPF verifier to not affect subreg_def marks in its range
   propagation (Eduard Zingerman)

 - Fix a truncation bug in the BPF verifier's handling of
   coerce_reg_to_size_sx (Dimitar Kanaliev)

 - Fix the BPF verifier's delta propagation between linked registers
   under 32-bit addition (Daniel Borkmann)

 - Fix a NULL pointer dereference in BPF devmap due to missing rxq
   information (Florian Kauer)

 - Fix a memory leak in bpf_core_apply (Jiri Olsa)

 - Fix an UBSAN-reported array-index-out-of-bounds in BTF parsing for
   arrays of nested structs (Hou Tao)

 - Fix build ID fetching where memory areas backing the file were
   created with memfd_secret (Andrii Nakryiko)

 - Fix BPF task iterator tid filtering which was incorrectly using pid
   instead of tid (Jordan Rome)

 - Several fixes for BPF sockmap and BPF sockhash redirection in
   combination with vsocks (Michal Luczaj)

 - Fix riscv BPF JIT and make BPF_CMPXCHG fully ordered (Andrea Parri)

 - Fix riscv BPF JIT under CONFIG_CFI_CLANG to prevent the possibility
   of an infinite BPF tailcall (Pu Lehui)

 - Fix a build warning from resolve_btfids that bpf_lsm_key_free cannot
   be resolved (Thomas Weißschuh)

 - Fix a bug in kfunc BTF caching for modules where the wrong BTF object
   was returned (Toke Høiland-Jørgensen)

 - Fix a BPF selftest compilation error in cgroup-related tests with
   musl libc (Tony Ambardar)

 - Several fixes to BPF link info dumps to fill missing fields (Tyrone
   Wu)

 - Add BPF selftests for kfuncs from multiple modules, checking that the
   correct kfuncs are called (Simon Sundberg)

 - Ensure that internal and user-facing bpf_redirect flags don't overlap
   (Toke Høiland-Jørgensen)

 - Switch to use kvzmalloc to allocate BPF verifier environment (Rik van
   Riel)

 - Use raw_spinlock_t in BPF ringbuf to fix a sleep in atomic splat
   under RT (Wander Lairson Costa)

* tag 'bpf-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf: (38 commits)
  lib/buildid: Handle memfd_secret() files in build_id_parse()
  selftests/bpf: Add test case for delta propagation
  bpf: Fix print_reg_state's constant scalar dump
  bpf: Fix incorrect delta propagation between linked registers
  bpf: Properly test iter/task tid filtering
  bpf: Fix iter/task tid filtering
  riscv, bpf: Make BPF_CMPXCHG fully ordered
  bpf, vsock: Drop static vsock_bpf_prot initialization
  vsock: Update msg_count on read_skb()
  vsock: Update rx_bytes on read_skb()
  bpf, sockmap: SK_DROP on attempted redirects of unsupported af_vsock
  selftests/bpf: Add asserts for netfilter link info
  bpf: Fix link info netfilter flags to populate defrag flag
  selftests/bpf: Add test for sign extension in coerce_subreg_to_size_sx()
  selftests/bpf: Add test for truncation after sign extension in coerce_reg_to_size_sx()
  bpf: Fix truncation bug in coerce_reg_to_size_sx()
  selftests/bpf: Assert link info uprobe_multi count & path_size if unset
  bpf: Fix unpopulated path_size when uprobe_multi fields unset
  selftests/bpf: Fix cross-compiling urandom_read
  selftests/bpf: Add test for kfunc module order
  ...
2024-10-18 16:27:14 -07:00
Sebastian Andrzej Siewior
560af5dc83 lockdep: Enable PROVE_RAW_LOCK_NESTING with PROVE_LOCKING.
With the printk issues solved, the last known splat created by
PROVE_RAW_LOCK_NESTING is gone.

Enable PROVE_RAW_LOCK_NESTING by default as part of PROVE_LOCKING. Keep
the defines around in case something serious pops up and it needs to be
disabled.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Acked-by: Waiman Long <longman@redhat.com>
Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
Link: https://lore.kernel.org/r/20241009161041.1018375-2-bigeasy@linutronix.de
2024-10-17 21:21:16 -07:00
Ahmed Ehab
5eadeb7b3b locking/lockdep: Add a test for lockdep_set_subclass()
Add a test case to ensure that no new name string literal will be
created in lockdep_set_subclass(), otherwise a warning will be triggered
in look_up_lock_class(). Add this to catch the problem in the future.

[boqun: Reword the title, replace #if with #ifdef and rename functions
and variables]

Signed-off-by: Ahmed Ehab <bottaawesome633@gmail.com>
Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
Link: https://lore.kernel.org/lkml/20240905011220.356973-1-bottaawesome633@gmail.com/
2024-10-17 21:21:16 -07:00
Linus Torvalds
4d939780b7 28 hotfixes. 13 are cc:stable. 23 are MM.
It is the usual shower of unrelated singletons - please see the individual
 changelogs for details.
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCZxGY5wAKCRDdBJ7gKXxA
 js6RAQC16zQ7WRV091i79cEi1C5648NbZjMCU626hZjuyfbzKgEA2v8PYtjj9w2e
 UGLxMY+PYZki2XNEh75Sikdkiyl9Vgg=
 =xcWT
 -----END PGP SIGNATURE-----

Merge tag 'mm-hotfixes-stable-2024-10-17-16-08' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull misc fixes from Andrew Morton:
 "28 hotfixes. 13 are cc:stable. 23 are MM.

  It is the usual shower of unrelated singletons - please see the
  individual changelogs for details"

* tag 'mm-hotfixes-stable-2024-10-17-16-08' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (28 commits)
  maple_tree: add regression test for spanning store bug
  maple_tree: correct tree corruption on spanning store
  mm/mglru: only clear kswapd_failures if reclaimable
  mm/swapfile: skip HugeTLB pages for unuse_vma
  selftests: mm: fix the incorrect usage() info of khugepaged
  MAINTAINERS: add Jann as memory mapping/VMA reviewer
  mm: swap: prevent possible data-race in __try_to_reclaim_swap
  mm: khugepaged: fix the incorrect statistics when collapsing large file folios
  MAINTAINERS: kasan, kcov: add bugzilla links
  mm: don't install PMD mappings when THPs are disabled by the hw/process/vma
  mm: huge_memory: add vma_thp_disabled() and thp_disabled_by_hw()
  Docs/damon/maintainer-profile: update deprecated awslabs GitHub URLs
  Docs/damon/maintainer-profile: add missing '_' suffixes for external web links
  maple_tree: check for MA_STATE_BULK on setting wr_rebalance
  mm: khugepaged: fix the arguments order in khugepaged_collapse_file trace point
  mm/damon/tests/sysfs-kunit.h: fix memory leak in damon_sysfs_test_add_targets()
  mm: remove unused stub for can_swapin_thp()
  mailmap: add an entry for Andy Chiu
  MAINTAINERS: add memory mapping/VMA co-maintainers
  fs/proc: fix build with GCC 15 due to -Werror=unterminated-string-initialization
  ...
2024-10-17 16:33:06 -07:00
Andrii Nakryiko
5ac9b4e935 lib/buildid: Handle memfd_secret() files in build_id_parse()
>From memfd_secret(2) manpage:

  The memory areas backing the file created with memfd_secret(2) are
  visible only to the processes that have access to the file descriptor.
  The memory region is removed from the kernel page tables and only the
  page tables of the processes holding the file descriptor map the
  corresponding physical memory. (Thus, the pages in the region can't be
  accessed by the kernel itself, so that, for example, pointers to the
  region can't be passed to system calls.)

We need to handle this special case gracefully in build ID fetching
code. Return -EFAULT whenever secretmem file is passed to build_id_parse()
family of APIs. Original report and repro can be found in [0].

  [0] https://lore.kernel.org/bpf/ZwyG8Uro%2FSyTXAni@ly-workstation/

Fixes: de3ec364c3 ("lib/buildid: add single folio-based file reader abstraction")
Reported-by: Yi Lai <yi1.lai@intel.com>
Suggested-by: Shakeel Butt <shakeel.butt@linux.dev>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Shakeel Butt <shakeel.butt@linux.dev>
Link: https://lore.kernel.org/bpf/20241017175431.6183-A-hca@linux.ibm.com
Link: https://lore.kernel.org/bpf/20241017174713.2157873-1-andrii@kernel.org
2024-10-17 21:30:32 +02:00
Linus Torvalds
6efbea77b3 arm64 fixes for -rc4
- Disable software tag-based KASAN when compiling with GCC, as functions
   are incorrectly instrumented leading to a crash early during boot.
 
 - Fix pkey configuration for kernel threads when POE is enabled.
 
 - Fix invalid memory accesses in uprobes when targetting load-literal
   instructions.
 -----BEGIN PGP SIGNATURE-----
 
 iQFEBAABCgAuFiEEPxTL6PPUbjXGY88ct6xw3ITBYzQFAmcPrzQQHHdpbGxAa2Vy
 bmVsLm9yZwAKCRC3rHDchMFjNIr6B/wN+o1xI7Fv/QdlaTuKYLvOOg/XTl6sbUDj
 YssxtjhpKuaFVG4zJHNsWvgUqO+YCM7m3F1L8LVPMF7l2xoKtRTIB1Ye315hTjYm
 dW5Te6xBMVKF8SVxE8sBbZobdokIW1JNPBrvGvHO3d5ujmofzwHU8RNMXuTUItRw
 z85Qy75FkEDTEbsWhS3VL5HOgEr+k0TYDRa8SXwKWVj7/rYna3tO39kIdS5dt9VX
 wDJbnxtWJMhiHmDnevFFhBkSZrips12P1Rb6HUSmhpUJh0Rk4TAZntSl2f/lr+jA
 PuboBbSG68UOCwAHoNmTcLdFhkiNaiyw4w2F7hk2A6aNRtme+bT0
 =M/ug
 -----END PGP SIGNATURE-----

Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull arm64 fixes from Will Deacon:

 - Disable software tag-based KASAN when compiling with GCC, as
   functions are incorrectly instrumented leading to a crash early
   during boot

 - Fix pkey configuration for kernel threads when POE is enabled

 - Fix invalid memory accesses in uprobes when targetting load-literal
   instructions

* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
  kasan: Disable Software Tag-Based KASAN with GCC
  Documentation/protection-keys: add AArch64 to documentation
  arm64: set POR_EL0 for kernel threads
  arm64: probes: Fix uprobes for big-endian kernels
  arm64: probes: Fix simulate_ldr*_literal()
  arm64: probes: Remove broken LDR (literal) uprobe support
2024-10-17 09:51:03 -07:00
Lorenzo Stoakes
bea07fd631 maple_tree: correct tree corruption on spanning store
Patch series "maple_tree: correct tree corruption on spanning store", v3.

There has been a nasty yet subtle maple tree corruption bug that appears
to have been in existence since the inception of the algorithm.

This bug seems far more likely to happen since commit f8d112a4e6
("mm/mmap: avoid zeroing vma tree in mmap_region()"), which is the point
at which reports started to be submitted concerning this bug.

We were made definitely aware of the bug thanks to the kind efforts of
Bert Karwatzki who helped enormously in my being able to track this down
and identify the cause of it.

The bug arises when an attempt is made to perform a spanning store across
two leaf nodes, where the right leaf node is the rightmost child of the
shared parent, AND the store completely consumes the right-mode node.

This results in mas_wr_spanning_store() mitakenly duplicating the new and
existing entries at the maximum pivot within the range, and thus maple
tree corruption.

The fix patch corrects this by detecting this scenario and disallowing the
mistaken duplicate copy.

The fix patch commit message goes into great detail as to how this occurs.

This series also includes a test which reliably reproduces the issue, and
asserts that the fix works correctly.

Bert has kindly tested the fix and confirmed it resolved his issues.  Also
Mikhail Gavrilov kindly reported what appears to be precisely the same
bug, which this fix should also resolve.


This patch (of 2):

There has been a subtle bug present in the maple tree implementation from
its inception.

This arises from how stores are performed - when a store occurs, it will
overwrite overlapping ranges and adjust the tree as necessary to
accommodate this.

A range may always ultimately span two leaf nodes.  In this instance we
walk the two leaf nodes, determine which elements are not overwritten to
the left and to the right of the start and end of the ranges respectively
and then rebalance the tree to contain these entries and the newly
inserted one.

This kind of store is dubbed a 'spanning store' and is implemented by
mas_wr_spanning_store().

In order to reach this stage, mas_store_gfp() invokes
mas_wr_preallocate(), mas_wr_store_type() and mas_wr_walk() in turn to
walk the tree and update the object (mas) to traverse to the location
where the write should be performed, determining its store type.

When a spanning store is required, this function returns false stopping at
the parent node which contains the target range, and mas_wr_store_type()
marks the mas->store_type as wr_spanning_store to denote this fact.

When we go to perform the store in mas_wr_spanning_store(), we first
determine the elements AFTER the END of the range we wish to store (that
is, to the right of the entry to be inserted) - we do this by walking to
the NEXT pivot in the tree (i.e.  r_mas.last + 1), starting at the node we
have just determined contains the range over which we intend to write.

We then turn our attention to the entries to the left of the entry we are
inserting, whose state is represented by l_mas, and copy these into a 'big
node', which is a special node which contains enough slots to contain two
leaf node's worth of data.

We then copy the entry we wish to store immediately after this - the copy
and the insertion of the new entry is performed by mas_store_b_node().

After this we copy the elements to the right of the end of the range which
we are inserting, if we have not exceeded the length of the node (i.e. 
r_mas.offset <= r_mas.end).

Herein lies the bug - under very specific circumstances, this logic can
break and corrupt the maple tree.

Consider the following tree:

Height
  0                             Root Node
                                 /      \
                 pivot = 0xffff /        \ pivot = ULONG_MAX
                               /          \
  1                       A [-----]       ...
                             /   \
             pivot = 0x4fff /     \ pivot = 0xffff
                           /       \
  2 (LEAVES)          B [-----]  [-----] C
                                      ^--- Last pivot 0xffff.

Now imagine we wish to store an entry in the range [0x4000, 0xffff] (note
that all ranges expressed in maple tree code are inclusive):

1. mas_store_gfp() descends the tree, finds node A at <=0xffff, then
   determines that this is a spanning store across nodes B and C. The mas
   state is set such that the current node from which we traverse further
   is node A.

2. In mas_wr_spanning_store() we try to find elements to the right of pivot
   0xffff by searching for an index of 0x10000:

    - mas_wr_walk_index() invokes mas_wr_walk_descend() and
      mas_wr_node_walk() in turn.

        - mas_wr_node_walk() loops over entries in node A until EITHER it
          finds an entry whose pivot equals or exceeds 0x10000 OR it
          reaches the final entry.

        - Since no entry has a pivot equal to or exceeding 0x10000, pivot
          0xffff is selected, leading to node C.

    - mas_wr_walk_traverse() resets the mas state to traverse node C. We
      loop around and invoke mas_wr_walk_descend() and mas_wr_node_walk()
      in turn once again.

         - Again, we reach the last entry in node C, which has a pivot of
           0xffff.

3. We then copy the elements to the left of 0x4000 in node B to the big
   node via mas_store_b_node(), and insert the new [0x4000, 0xffff] entry
   too.

4. We determine whether we have any entries to copy from the right of the
   end of the range via - and with r_mas set up at the entry at pivot
   0xffff, r_mas.offset <= r_mas.end, and then we DUPLICATE the entry at
   pivot 0xffff.

5. BUG! The maple tree is corrupted with a duplicate entry.

This requires a very specific set of circumstances - we must be spanning
the last element in a leaf node, which is the last element in the parent
node.

spanning store across two leaf nodes with a range that ends at that shared
pivot.

A potential solution to this problem would simply be to reset the walk
each time we traverse r_mas, however given the rarity of this situation it
seems that would be rather inefficient.

Instead, this patch detects if the right hand node is populated, i.e.  has
anything we need to copy.

We do so by only copying elements from the right of the entry being
inserted when the maximum value present exceeds the last, rather than
basing this on offset position.

The patch also updates some comments and eliminates the unused bool return
value in mas_wr_walk_index().

The work performed in commit f8d112a4e6 ("mm/mmap: avoid zeroing vma
tree in mmap_region()") seems to have made the probability of this event
much more likely, which is the point at which reports started to be
submitted concerning this bug.

The motivation for this change arose from Bert Karwatzki's report of
encountering mm instability after the release of kernel v6.12-rc1 which,
after the use of CONFIG_DEBUG_VM_MAPLE_TREE and similar configuration
options, was identified as maple tree corruption.

After Bert very generously provided his time and ability to reproduce this
event consistently, I was able to finally identify that the issue
discussed in this commit message was occurring for him.

Link: https://lkml.kernel.org/r/cover.1728314402.git.lorenzo.stoakes@oracle.com
Link: https://lkml.kernel.org/r/48b349a2a0f7c76e18772712d0997a5e12ab0a3b.1728314403.git.lorenzo.stoakes@oracle.com
Fixes: 54a611b605 ("Maple Tree: add new data structure")
Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Reported-by: Bert Karwatzki <spasswolf@web.de>
Closes: https://lore.kernel.org/all/20241001023402.3374-1-spasswolf@web.de/
Tested-by: Bert Karwatzki <spasswolf@web.de>
Reported-by: Mikhail Gavrilov <mikhail.v.gavrilov@gmail.com>
Closes: https://lore.kernel.org/all/CABXGCsOPwuoNOqSMmAvWO2Fz4TEmPnjFj-b7iF+XFRu1h7-+Dg@mail.gmail.com/
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Reviewed-by: Liam R. Howlett <Liam.Howlett@Oracle.com>
Tested-by: Mikhail Gavrilov <mikhail.v.gavrilov@gmail.com>
Reviewed-by: Wei Yang <richard.weiyang@gmail.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Sidhartha Kumar <sidhartha.kumar@oracle.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-10-17 08:35:10 -07:00
Sidhartha Kumar
a6e0ceb7bf maple_tree: check for MA_STATE_BULK on setting wr_rebalance
It is possible for a bulk operation (MA_STATE_BULK is set) to enter the
new_end < mt_min_slots[type] case and set wr_rebalance as a store type. 
This is incorrect as bulk stores do not rebalance per write, but rather
after the all of the writes are done through the mas_bulk_rebalance()
path.  Therefore, add a check to make sure MA_STATE_BULK is not set before
we return wr_rebalance as the store type.

Also add a test to make sure wr_rebalance is never the store type when
doing bulk operations via mas_expected_entries()

This is a hotfix for this rc however it has no userspace effects as there
are no users of the bulk insertion mode.

Link: https://lkml.kernel.org/r/20241011214451.7286-1-sidhartha.kumar@oracle.com
Fixes: 5d659bbb52 ("maple_tree: introduce mas_wr_store_type()")
Suggested-by: Liam Howlett <liam.howlett@oracle.com>
Signed-off-by: Sidhartha <sidhartha.kumar@oracle.com>
Reviewed-by: Wei Yang <richard.weiyang@gmail.com>
Reviewed-by: Liam Howlett <liam.howlett@oracle.com>
Cc: Matthew Wilcox <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-10-17 00:28:09 -07:00
Florian Westphal
dc783ba4b9 lib: alloc_tag_module_unload must wait for pending kfree_rcu calls
Ben Greear reports following splat:
 ------------[ cut here ]------------
 net/netfilter/nf_nat_core.c:1114 module nf_nat func:nf_nat_register_fn has 256 allocated at module unload
 WARNING: CPU: 1 PID: 10421 at lib/alloc_tag.c:168 alloc_tag_module_unload+0x22b/0x3f0
 Modules linked in: nf_nat(-) btrfs ufs qnx4 hfsplus hfs minix vfat msdos fat
...
 Hardware name: Default string Default string/SKYBAY, BIOS 5.12 08/04/2020
 RIP: 0010:alloc_tag_module_unload+0x22b/0x3f0
  codetag_unload_module+0x19b/0x2a0
  ? codetag_load_module+0x80/0x80

nf_nat module exit calls kfree_rcu on those addresses, but the free
operation is likely still pending by the time alloc_tag checks for leaks.

Wait for outstanding kfree_rcu operations to complete before checking
resolves this warning.

Reproducer:
unshare -n iptables-nft -t nat -A PREROUTING -p tcp
grep nf_nat /proc/allocinfo # will list 4 allocations
rmmod nft_chain_nat
rmmod nf_nat                # will WARN.

[akpm@linux-foundation.org: add comment]
Link: https://lkml.kernel.org/r/20241007205236.11847-1-fw@strlen.de
Fixes: a473573964 ("lib: code tagging module support")
Signed-off-by: Florian Westphal <fw@strlen.de>
Reported-by: Ben Greear <greearb@candelatech.com>
Closes: https://lore.kernel.org/netdev/bdaaef9d-4364-4171-b82b-bcfc12e207eb@candelatech.com/
Cc: Uladzislau Rezki <urezki@gmail.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Kent Overstreet <kent.overstreet@linux.dev>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-10-17 00:28:07 -07:00
Qianqiang Liu
cd843399d7 crypto: lib/mpi - Fix an "Uninitialized scalar variable" issue
The "err" variable may be returned without an initialized value.

Fixes: 8e3a67f2de ("crypto: lib/mpi - Add error checks to extension")
Signed-off-by: Qianqiang Liu <qianqiang.liu@163.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2024-10-16 13:38:16 +08:00
Thomas Gleixner
ff8d523cc4 debugobjects: Track object usage to avoid premature freeing of objects
The freelist is freed at a constant rate independent of the actual usage
requirements. That's bad in scenarios where usage comes in bursts. The end
of a burst puts the objects on the free list and freeing proceeds even when
the next burst which requires objects started again.

Keep track of the usage with a exponentially wheighted moving average and
take that into account in the worker function which frees objects from the
free list.

This further reduces the kmem_cache allocation/free rate for a full kernel
compile:

   	    kmem_cache_alloc()	kmem_cache_free()
Baseline:   225k		173k
Usage:	    170k		117k

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Zhen Lei <thunder.leizhen@huawei.com>
Link: https://lore.kernel.org/all/87bjznhme2.ffs@tglx
2024-10-15 17:30:33 +02:00
Thomas Gleixner
13f9ca7239 debugobjects: Refill per CPU pool more agressively
Right now the per CPU pools are only refilled when they become
empty. That's suboptimal especially when there are still non-freed objects
in the to free list.

Check whether an allocation from the per CPU pool emptied a batch and try
to allocate from the free pool if that still has objects available.

   	    kmem_cache_alloc()	kmem_cache_free()
Baseline:   295k		245k
Refill:	    225k		173k

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Zhen Lei <thunder.leizhen@huawei.com>
Link: https://lore.kernel.org/all/20241007164914.439053085@linutronix.de
2024-10-15 17:30:33 +02:00
Thomas Gleixner
a201a96b96 debugobjects: Double the per CPU slots
In situations where objects are rapidly allocated from the pool and handed
back, the size of the per CPU pool turns out to be too small.

Double the size of the per CPU pool.

This reduces the kmem cache allocation and free operations during a kernel compile:

     	     alloc    	    free
Baseline:    380k           330k
Double size: 295k	    245k

Especially the reduction of allocations is important because that happens
in the hot path when objects are initialized.

The maximum increase in per CPU pool memory consumption is about 2.5K per
online CPU, which is acceptable.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Zhen Lei <thunder.leizhen@huawei.com>
Link: https://lore.kernel.org/all/20241007164914.378676302@linutronix.de
2024-10-15 17:30:33 +02:00
Thomas Gleixner
2638345d22 debugobjects: Move pool statistics into global_pool struct
Keep it along with the pool as that's a hot cache line anyway and it makes
the code more comprehensible.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Zhen Lei <thunder.leizhen@huawei.com>
Link: https://lore.kernel.org/all/20241007164914.318776207@linutronix.de
2024-10-15 17:30:33 +02:00
Thomas Gleixner
f57ebb92ba debugobjects: Implement batch processing
Adding and removing single objects in a loop is bad in terms of lock
contention and cache line accesses.

To implement batching, record the last object in a batch in the object
itself. This is trivialy possible as hlists are strictly stacks. At a batch
boundary, when the first object is added to the list the object stores a
pointer to itself in debug_obj::batch_last. When the next object is added
to the list then the batch_last pointer is retrieved from the first object
in the list and stored in the to be added one.

That means for batch processing the first object always has a pointer to
the last object in a batch, which allows to move batches in a cache line
efficient way and reduces the lock held time.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Zhen Lei <thunder.leizhen@huawei.com>
Link: https://lore.kernel.org/all/20241007164914.258995000@linutronix.de
2024-10-15 17:30:33 +02:00
Thomas Gleixner
aebbfe0779 debugobjects: Prepare kmem_cache allocations for batching
Allocate a batch and then push it into the pool. Utilize the
debug_obj::last_node pointer for keeping track of the batch boundary.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/all/20241007164914.198647184@linutronix.de
2024-10-15 17:30:32 +02:00
Thomas Gleixner
74fe1ad413 debugobjects: Prepare for batching
Move the debug_obj::object pointer into a union and add a pointer to the
last node in a batch. That allows to implement batch processing efficiently
by utilizing the stack property of hlist:

When the first object of a batch is added to the list, then the batch
pointer is set to the hlist node of the object itself. Any subsequent add
retrieves the pointer to the last node from the first object in the list
and uses that for storing the last node pointer in the newly added object.

Add the pointer to the data structure and ensure that all relevant pool
sizes are strictly batch sized. The actual batching implementation follows
in subsequent changes.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Zhen Lei <thunder.leizhen@huawei.com>
Link: https://lore.kernel.org/all/20241007164914.139204961@linutronix.de
2024-10-15 17:30:32 +02:00
Thomas Gleixner
14077b9e58 debugobjects: Use static key for boot pool selection
Get rid of the conditional in the hot path.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Zhen Lei <thunder.leizhen@huawei.com>
Link: https://lore.kernel.org/all/20241007164914.077247071@linutronix.de
2024-10-15 17:30:32 +02:00
Thomas Gleixner
9ce99c6d7b debugobjects: Rework free_object_work()
Convert it to batch processing with intermediate helper functions. This
reduces the final changes for batch processing.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Zhen Lei <thunder.leizhen@huawei.com>
Link: https://lore.kernel.org/all/20241007164914.015906394@linutronix.de
2024-10-15 17:30:32 +02:00
Thomas Gleixner
a3b9e191f5 debugobjects: Rework object freeing
__free_object() is uncomprehensibly complex. The same can be achieved by:

   1) Adding the object to the per CPU pool

   2) If that pool is full, move a batch of objects into the global pool
      or if the global pool is full into the to free pool

This also prepares for batch processing.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Zhen Lei <thunder.leizhen@huawei.com>
Link: https://lore.kernel.org/all/20241007164913.955542307@linutronix.de
2024-10-15 17:30:32 +02:00
Thomas Gleixner
fb60c004f3 debugobjects: Rework object allocation
The current allocation scheme tries to allocate from the per CPU pool
first. If that fails it allocates one object from the global pool and then
refills the per CPU pool from the global pool.

That is in the way of switching the pool management to batch mode as the
global pool needs to be a strict stack of batches, which does not allow
to allocate single objects.

Rework the code to refill the per CPU pool first and then allocate the
object from the refilled batch. Also try to allocate from the to free pool
first to avoid freeing and reallocating objects.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Zhen Lei <thunder.leizhen@huawei.com>
Link: https://lore.kernel.org/all/20241007164913.893554162@linutronix.de
2024-10-15 17:30:32 +02:00
Thomas Gleixner
96a9a0421c debugobjects: Move min/max count into pool struct
Having the accounting in the datastructure is better in terms of cache
lines and allows more optimizations later on.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Zhen Lei <thunder.leizhen@huawei.com>
Link: https://lore.kernel.org/all/20241007164913.831908427@linutronix.de
2024-10-15 17:30:32 +02:00
Thomas Gleixner
18b8afcb37 debugobjects: Rename and tidy up per CPU pools
No point in having a separate data structure. Reuse struct obj_pool and
tidy up the code.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Zhen Lei <thunder.leizhen@huawei.com>
Link: https://lore.kernel.org/all/20241007164913.770595795@linutronix.de
2024-10-15 17:30:31 +02:00
Thomas Gleixner
cb58d19084 debugobjects: Use separate list head for boot pool
There is no point to handle the statically allocated objects during early
boot in the actual pool list. This phase does not require accounting, so
all of the related complexity can be avoided.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Zhen Lei <thunder.leizhen@huawei.com>
Link: https://lore.kernel.org/all/20241007164913.708939081@linutronix.de
2024-10-15 17:30:31 +02:00
Thomas Gleixner
e18328ff70 debugobjects: Move pools into a datastructure
The contention on the global pool lock can be reduced by strict batch
processing where batches of objects are moved from one list head to another
instead of moving them object by object. This also reduces the cache
footprint because it avoids the list walk and dirties at maximum three
cache lines instead of potentially up to eighteen.

To prepare for that, move the hlist head and related counters into a
struct.

No functional change.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Zhen Lei <thunder.leizhen@huawei.com>
Link: https://lore.kernel.org/all/20241007164913.646171170@linutronix.de
2024-10-15 17:30:31 +02:00
Zhen Lei
d8c6cd3a5c debugobjects: Reduce parallel pool fill attempts
The contention on the global pool_lock can be massive when the global pool
needs to be refilled and many CPUs try to handle this.

Address this by:

  - splitting the refill from free list and allocation.

    Refill from free list has no constraints vs. the context on RT, so
    it can be tried outside of the RT specific preemptible() guard

  - Let only one CPU handle the free list

  - Let only one CPU do allocations unless the pool level is below
    half of the minimum fill level.

Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/all/20240911083521.2257-4-thunder.leizhen@huawei.com-
Link: https://lore.kernel.org/all/20241007164913.582118421@linutronix.de

--
 lib/debugobjects.c |   84 +++++++++++++++++++++++++++++++++++++----------------
 1 file changed, 59 insertions(+), 25 deletions(-)
2024-10-15 17:30:31 +02:00
Thomas Gleixner
661cc28b52 debugobjects: Make debug_objects_enabled bool
Make it what it is.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Zhen Lei <thunder.leizhen@huawei.com>
Link: https://lore.kernel.org/all/20241007164913.518175013@linutronix.de
2024-10-15 17:30:31 +02:00
Thomas Gleixner
49a5cb827d debugobjects: Provide and use free_object_list()
Move the loop to free a list of objects into a helper function so it can be
reused later.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/all/20241007164913.453912357@linutronix.de
2024-10-15 17:30:31 +02:00
Thomas Gleixner
241463f4fd debugobjects: Remove pointless debug printk
It has zero value.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Zhen Lei <thunder.leizhen@huawei.com>
Link: https://lore.kernel.org/all/20241007164913.390511021@linutronix.de
2024-10-15 17:30:31 +02:00
Thomas Gleixner
49968cf181 debugobjects: Reuse put_objects() on OOM
Reuse the helper function instead of having a open coded copy.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Zhen Lei <thunder.leizhen@huawei.com>
Link: https://lore.kernel.org/all/20241007164913.326834268@linutronix.de
2024-10-15 17:30:31 +02:00
Thomas Gleixner
a2a702383e debugobjects: Dont free objects directly on CPU hotplug
Freeing the per CPU pool of the unplugged CPU directly is suboptimal as the
objects can be reused in the real pool if there is room. Aside of that this
gets the accounting wrong.

Use the regular free path, which allows reuse and has the accounting correct.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Zhen Lei <thunder.leizhen@huawei.com>
Link: https://lore.kernel.org/all/20241007164913.263960570@linutronix.de
2024-10-15 17:30:30 +02:00
Thomas Gleixner
3f397bf955 debugobjects: Remove pointless hlist initialization
It's BSS zero initialized.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Zhen Lei <thunder.leizhen@huawei.com>
Link: https://lore.kernel.org/all/20241007164913.200379308@linutronix.de
2024-10-15 17:30:30 +02:00
Thomas Gleixner
55fb412ef7 debugobjects: Dont destroy kmem cache in init()
debug_objects_mem_init() is invoked from mm_core_init() before work queues
are available. If debug_objects_mem_init() destroys the kmem cache in the
error path it causes an Oops in __queue_work():

 Oops: Oops: 0000 [#1] PREEMPT SMP PTI
 RIP: 0010:__queue_work+0x35/0x6a0
  queue_work_on+0x66/0x70
  flush_all_cpus_locked+0xdf/0x1a0
  __kmem_cache_shutdown+0x2f/0x340
  kmem_cache_destroy+0x4e/0x150
  mm_core_init+0x9e/0x120
  start_kernel+0x298/0x800
  x86_64_start_reservations+0x18/0x30
  x86_64_start_kernel+0xc5/0xe0
  common_startup_64+0x12c/0x138

Further the object cache pointer is used in various places to check for
early boot operation. It is exposed before the replacments for the static
boot time objects are allocated and the self test operates on it.

This can be avoided by:

     1) Running the self test with the static boot objects

     2) Exposing it only after the replacement objects have been added to
     	the pool.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/all/20241007164913.137021337@linutronix.de
2024-10-15 17:30:30 +02:00
Zhen Lei
813fd07858 debugobjects: Collect newly allocated objects in a list to reduce lock contention
Collect the newly allocated debug objects in a list outside the lock, so
that the lock held time and the potential lock contention is reduced.

Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/all/20240911083521.2257-3-thunder.leizhen@huawei.com
Link: https://lore.kernel.org/all/20241007164913.073653668@linutronix.de
2024-10-15 17:30:30 +02:00
Zhen Lei
a0ae950408 debugobjects: Delete a piece of redundant code
The statically allocated objects are all located in obj_static_pool[],
the whole memory of obj_static_pool[] will be reclaimed later. Therefore,
there is no need to split the remaining statically nodes in list obj_pool
into isolated ones, no one will use them anymore. Just write
INIT_HLIST_HEAD(&obj_pool) is enough. Since hlist_move_list() directly
discards the old list, even this can be omitted.

Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/all/20240911083521.2257-2-thunder.leizhen@huawei.com
Link: https://lore.kernel.org/all/20241007164913.009849239@linutronix.de
2024-10-15 17:30:30 +02:00
Will Deacon
7aed6a2c51 kasan: Disable Software Tag-Based KASAN with GCC
Syzbot reports a KASAN failure early during boot on arm64 when building
with GCC 12.2.0 and using the Software Tag-Based KASAN mode:

  | BUG: KASAN: invalid-access in smp_build_mpidr_hash arch/arm64/kernel/setup.c:133 [inline]
  | BUG: KASAN: invalid-access in setup_arch+0x984/0xd60 arch/arm64/kernel/setup.c:356
  | Write of size 4 at addr 03ff800086867e00 by task swapper/0
  | Pointer tag: [03], memory tag: [fe]

Initial triage indicates that the report is a false positive and a
thorough investigation of the crash by Mark Rutland revealed the root
cause to be a bug in GCC:

  > When GCC is passed `-fsanitize=hwaddress` or
  > `-fsanitize=kernel-hwaddress` it ignores
  > `__attribute__((no_sanitize_address))`, and instruments functions
  > we require are not instrumented.
  >
  > [...]
  >
  > All versions [of GCC] I tried were broken, from 11.3.0 to 14.2.0
  > inclusive.
  >
  > I think we have to disable KASAN_SW_TAGS with GCC until this is
  > fixed

Disable Software Tag-Based KASAN when building with GCC by making
CC_HAS_KASAN_SW_TAGS depend on !CC_IS_GCC.

Cc: Andrey Konovalov <andreyknvl@gmail.com>
Suggested-by: Mark Rutland <mark.rutland@arm.com>
Reported-by: syzbot+908886656a02769af987@syzkaller.appspotmail.com
Link: https://lore.kernel.org/r/000000000000f362e80620e27859@google.com
Link: https://lore.kernel.org/r/ZvFGwKfoC4yVjN_X@J2N7QTR9R3
Link: https://bugzilla.kernel.org/show_bug.cgi?id=218854
Reviewed-by: Andrey Konovalov <andreyknvl@gmail.com>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Link: https://lore.kernel.org/r/20241014161100.18034-1-will@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
2024-10-15 11:38:10 +01:00
Rob Herring (Arm)
6ba55951e7 logic_pio: Constify fwnode_handle
The fwnode_handle passed into find_io_range_by_fwnode() and
logic_pio_trans_hwaddr() are not modified, so make them const.

Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Link: https://lore.kernel.org/r/20241010-dt-const-v1-2-87a51f558425@kernel.org
Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
2024-10-14 16:33:24 -05:00
Zijun Hu
9bd133f05b lib: devres: Simplify API devm_ioport_unmap() implementation
Simplify devm_ioport_unmap() implementation by dedicated API
devres_release(), compared with current solution, namely
ioport_unmap() + devres_destroy(), devres_release() has below advantages:

- it is simpler if devm_ioport_unmap()'s parameter @addr was ever
  returned by devm_ioport_map().

- it can avoid unnecessary ioport_unmap(@addr) if @addr was not
  ever returned by devm_ioport_map().

Signed-off-by: Zijun Hu <quic_zijuhu@quicinc.com>
Link: https://lore.kernel.org/r/20240918-fix_lib_devres-v1-2-e696ab5486e6@quicinc.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-10-14 08:21:09 +02:00
Zijun Hu
0ee4dcafda lib: devres: Simplify API devm_iounmap() implementation
Simplify devm_iounmap() implementation by dedicated API devres_release()
compared with current solution, namely, devres_destroy() + iounmap()
devres_release() has the following advantages:

- it is simpler if devm_iounmap()'s parameter @addr is valid, namely
  @addr was ever returned by one of devm_ioremap() variants.

- it can avoid unnecessary iounmap(@addr) if @addr is not valid.

Signed-off-by: Zijun Hu <quic_zijuhu@quicinc.com>
Link: https://lore.kernel.org/r/20240918-fix_lib_devres-v1-1-e696ab5486e6@quicinc.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-10-14 08:21:09 +02:00
Jakub Kicinski
9c0fc36ec4 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Cross-merge networking fixes after downstream PR (net-6.12-rc3).

No conflicts and no adjacent changes.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-10 13:13:33 -07:00
Vladimir Oltean
1405981bbb lib: packing: catch kunit_kzalloc() failure in the pack() test
kunit_kzalloc() may fail. Other call sites verify that this is the case,
either using a direct comparison with the NULL pointer, or the
KUNIT_ASSERT_NOT_NULL() or KUNIT_ASSERT_NOT_ERR_OR_NULL().

Pick KUNIT_ASSERT_NOT_NULL() as the error handling method that made most
sense to me. It's an unlikely thing to happen, but at least we call
__kunit_abort() instead of dereferencing this NULL pointer.

Fixes: e9502ea6db ("lib: packing: add KUnit tests adapted from selftests")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Link: https://patch.msgid.link/20241004110012.1323427-1-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-07 16:36:25 -07:00
Timo Grautstueck
ab8851431b lib/Kconfig.debug: fix grammar in RUST_BUILD_ASSERT_ALLOW
Just a grammar fix in lib/Kconfig.debug, under the config option
RUST_BUILD_ASSERT_ALLOW.

Reported-by: Miguel Ojeda <ojeda@kernel.org>
Closes: https://github.com/Rust-for-Linux/linux/issues/1006
Fixes: ecaa6ddff2 ("rust: add `build_error` crate")
Signed-off-by: Timo Grautstueck <timo.grautstueck@web.de>
Link: https://lore.kernel.org/r/20241006140244.5509-1-timo.grautstueck@web.de
Signed-off-by: Miguel Ojeda <ojeda@kernel.org>
2024-10-07 19:13:03 +02:00
Qianqiang Liu
6100da511b crypto: lib/mpi - Fix an "Uninitialized scalar variable" issue
The "err" variable may be returned without an initialized value.

Fixes: 8e3a67f2de ("crypto: lib/mpi - Add error checks to extension")
Signed-off-by: Qianqiang Liu <qianqiang.liu@163.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2024-10-05 13:22:05 +08:00
Linus Torvalds
f6785e0ccf slab fixes for 6.12-rc1
-----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCAAdFiEEe7vIQRWZI0iWSE3xu+CwddJFiJoFAmb/8bcACgkQu+CwddJF
 iJoApwf5AWWhKFbbYwFUCXDi7+/Xr7T7c9H9q+GAEOQiDLsDxihEAo1KYQ+DLl+h
 Vp1ddRYIKMIUfllW3bcD4O6C8L46OX3XPHhTHnksEfvtn3fQGjcU3jKH8n0eL01J
 s9eUdvduNSJorAWqjFPPRrGuLJTXmervrDYYPJLaXGITHHMOxMjKfLAxtXehvARv
 mVQV1F0NTvvNqieuibUCM5XqJs37lrmqB39pLun7bQDU48z4OR1L3nkJxTFF1bGm
 EcvAPayTiNybMt08QSVHIwqfSs+e0HmyKqjvSLpJPImDrfSrWOJvBCJxI4DU+1aw
 UiHyWYLaxWZ7DoJgtZuHV2//8wOWww==
 =EXEA
 -----END PGP SIGNATURE-----

Merge tag 'slab-for-6.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab

Pull slab fixes from Vlastimil Babka:
 "Fixes for issues introduced in this merge window: kobject memory leak,
  unsupressed warning and possible lockup in new slub_kunit tests,
  misleading code in kvfree_rcu_queue_batch()"

* tag 'slab-for-6.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab:
  slub/kunit: skip test_kfree_rcu when the slub kunit test is built-in
  mm, slab: suppress warnings in test_leak_destroy kunit test
  rcu/kvfree: Refactor kvfree_rcu_queue_batch()
  mm, slab: fix use of SLAB_SUPPORTS_SYSFS in kmem_cache_release()
2024-10-04 12:05:39 -07:00
Vladimir Oltean
46e784e94b lib: packing: use GENMASK() for box_mask
This is an u8, so using GENMASK_ULL() for unsigned long long is
unnecessary.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://patch.msgid.link/20241002-packing-kunit-tests-and-split-pack-unpack-v2-10-8373e551eae3@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-03 15:32:04 -07:00
Vladimir Oltean
fb02c7c8a5 lib: packing: use BITS_PER_BYTE instead of 8
This helps clarify what the 8 is for.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Link: https://patch.msgid.link/20241002-packing-kunit-tests-and-split-pack-unpack-v2-9-8373e551eae3@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-03 15:32:04 -07:00
Jacob Keller
e7fdf5dddc lib: packing: fix QUIRK_MSB_ON_THE_RIGHT behavior
The QUIRK_MSB_ON_THE_RIGHT quirk is intended to modify pack() and unpack()
so that the most significant bit of each byte in the packed layout is on
the right.

The way the quirk is currently implemented is broken whenever the packing
code packs or unpacks any value that is not exactly a full byte.

The broken behavior can occur when packing any values smaller than one
byte, when packing any value that is not exactly a whole number of bytes,
or when the packing is not aligned to a byte boundary.

This quirk is documented in the following way:

  1. Normally (no quirks), we would do it like this:

  ::

    63 62 61 60 59 58 57 56 55 54 53 52 51 50 49 48 47 46 45 44 43 42 41 40 39 38 37 36 35 34 33 32
    7                       6                       5                        4
    31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10  9  8  7  6  5  4  3  2  1  0
    3                       2                       1                        0

  <snip>

  2. If QUIRK_MSB_ON_THE_RIGHT is set, we do it like this:

  ::

    56 57 58 59 60 61 62 63 48 49 50 51 52 53 54 55 40 41 42 43 44 45 46 47 32 33 34 35 36 37 38 39
    7                       6                        5                       4
    24 25 26 27 28 29 30 31 16 17 18 19 20 21 22 23  8  9 10 11 12 13 14 15  0  1  2  3  4  5  6  7
    3                       2                        1                       0

  That is, QUIRK_MSB_ON_THE_RIGHT does not affect byte positioning, but
  inverts bit offsets inside a byte.

Essentially, the mapping for physical bit offsets should be reserved for a
given byte within the payload. This reversal should be fixed to the bytes
in the packing layout.

The logic to implement this quirk is handled within the
adjust_for_msb_right_quirk() function. This function does not work properly
when dealing with the bytes that contain only a partial amount of data.

In particular, consider trying to pack or unpack the range 53-44. We should
always be mapping the bits from the logical ordering to their physical
ordering in the same way, regardless of what sequence of bits we are
unpacking.

This, we should grab the following logical bits:

  Logical: 55 54 53 52 51 50 49 48 47 45 44 43 42 41 40 39
                  ^  ^  ^  ^  ^  ^  ^  ^  ^

And pack them into the physical bits:

   Physical: 48 49 50 51 52 53 54 55 40 41 42 43 44 45 46 47
    Logical: 48 49 50 51 52 53                   44 45 46 47
              ^  ^  ^  ^  ^  ^                    ^  ^  ^  ^

The current logic in adjust_for_msb_right_quirk is broken. I believe it is
intending to map according to the following:

  Physical: 48 49 50 51 52 53 54 55 40 41 42 43 44 45 46 47
   Logical:       48 49 50 51 52 53 44 45 46 47
                   ^  ^  ^  ^  ^  ^  ^  ^  ^  ^

That is, it tries to keep the bits at the start and end of a packing
together. This is wrong, as it makes the packing change what bit is being
mapped to what based on which bits you're currently packing or unpacking.

Worse, the actual calculations within adjust_for_msb_right_quirk don't make
sense.

Consider the case when packing the last byte of an unaligned packing. It
might have a start bit of 7 and an end bit of 5. This would have a width of
3 bits. The new_start_bit will be calculated as the width - the box_end_bit
- 1. This will underflow and produce a negative value, which will
ultimate result in generating a new box_mask of all 0s.

For any other values, the result of the calculations of the
new_box_end_bit, new_box_start_bit, and the new box_mask will result in the
exact same values for the box_end_bit, box_start_bit, and box_mask. This
makes the calculations completely irrelevant.

If box_end_bit is 0, and box_start_bit is 7, then the entire function of
adjust_for_msb_right_quirk will boil down to just:

    *to_write = bitrev8(*to_write)

The other adjustments are attempting (incorrectly) to keep the bits in the
same place but just reversed. This is not the right behavior even if
implemented correctly, as it leaves the mapping dependent on the bit values
being packed or unpacked.

Remove adjust_for_msb_right_quirk() and just use bitrev8 to reverse the
byte order when interacting with the packed data.

In particular, for packing, we need to reverse both the box_mask and the
physical value being packed. This is done after shifting the value by
box_end_bit so that the reversed mapping is always aligned to the physical
buffer byte boundary. The box_mask is reversed as we're about to use it to
clear any stale bits in the physical buffer at this block.

For unpacking, we need to reverse the contents of the physical buffer
*before* masking with the box_mask. This is critical, as the box_mask is a
logical mask of the bit layout before handling the QUIRK_MSB_ON_THE_RIGHT.

Add several new tests which cover this behavior. These tests will fail
without the fix and pass afterwards. Note that no current drivers make use
of QUIRK_MSB_ON_THE_RIGHT. I suspect this is why there have been no reports
of this inconsistency before.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Link: https://patch.msgid.link/20241002-packing-kunit-tests-and-split-pack-unpack-v2-8-8373e551eae3@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-03 15:32:04 -07:00
Jacob Keller
fcd6dd91d0 lib: packing: add additional KUnit tests
While reviewing the initial KUnit tests for lib/packing, Przemek pointed
out that the test values have duplicate bytes in the input sequence.

In addition, I noticed that the unit tests pack and unpack on a byte
boundary, instead of crossing bytes. Thus, we lack good coverage of the
corner cases of the API.

Add additional unit tests to cover packing and unpacking byte buffers which
do not have duplicate bytes in the unpacked value, and which pack and
unpack to an unaligned offset.

A careful reviewer may note the lack tests for QUIRK_MSB_ON_THE_RIGHT. This
is because I found issues with that quirk during test implementation. This
quirk will be fixed and the tests will be included in a future change.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Link: https://patch.msgid.link/20241002-packing-kunit-tests-and-split-pack-unpack-v2-7-8373e551eae3@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-03 15:32:04 -07:00
Jacob Keller
e9502ea6db lib: packing: add KUnit tests adapted from selftests
Add 24 simple KUnit tests for the lib/packing.c pack() and unpack() APIs.

The first 16 tests exercise all combinations of quirks with a simple magic
number value on a 16-byte buffer. The remaining 8 tests cover
non-multiple-of-4 buffer sizes.

These tests were originally written by Vladimir as simple selftest
functions. I adapted them to KUnit, refactoring them into a table driven
approach. This will aid in adding additional tests in the future.

Co-developed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Tested-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Link: https://patch.msgid.link/20241002-packing-kunit-tests-and-split-pack-unpack-v2-6-8373e551eae3@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-03 15:32:04 -07:00
Vladimir Oltean
28aec9ca29 lib: packing: duplicate pack() and unpack() implementations
packing() is now used in some hot paths, and it would be good to get rid
of some ifs and buts that depend on "op", to speed things up a little bit.

With the main implementations now taking size_t endbit, we no longer
have to check for negative values. Update the local integer variables to
also be size_t to match.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Link: https://patch.msgid.link/20241002-packing-kunit-tests-and-split-pack-unpack-v2-5-8373e551eae3@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-03 15:32:04 -07:00
Vladimir Oltean
7263f64e16 lib: packing: add pack() and unpack() wrappers over packing()
Geert Uytterhoeven described packing() as "really bad API" because of
not being able to enforce const correctness. The same function is used
both when "pbuf" is input and "uval" is output, as in the other way
around.

Create 2 wrapper functions where const correctness can be ensured.
Do ugly type casts inside, to be able to reuse packing() as currently
implemented - which will _not_ modify the input argument.

Also, take the opportunity to change the type of startbit and endbit to
size_t - an unsigned type - in these new function prototypes. When int,
an extra check for negative values is necessary. Hopefully, when
packing() goes away completely, that check can be dropped.

My concern is that code which does rely on the conditional directionality
of packing() is harder to refactor without blowing up in size. So it may
take a while to completely eliminate packing(). But let's make alternatives
available for those who do not need that.

Link: https://lore.kernel.org/netdev/20210223112003.2223332-1-geert+renesas@glider.be/
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Link: https://patch.msgid.link/20241002-packing-kunit-tests-and-split-pack-unpack-v2-4-8373e551eae3@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-03 15:32:04 -07:00
Vladimir Oltean
a636ba5e86 lib: packing: adjust definitions and implementation for arbitrary buffer lengths
Jacob Keller has a use case for packing() in the intel/ice networking
driver, but it cannot be used as-is.

Simply put, the API quirks for LSW32_IS_FIRST and LITTLE_ENDIAN are
naively implemented with the undocumented assumption that the buffer
length must be a multiple of 4. All calculations of group offsets and
offsets of bytes within groups assume that this is the case. But in the
ice case, this does not hold true. For example, packing into a buffer
of 22 bytes would yield wrong results, but pretending it was a 24 byte
buffer would work.

Rather than requiring such hacks, and leaving a big question mark when
it comes to discontinuities in the accessible bit fields of such buffer,
we should extend the packing API to support this use case.

It turns out that we can keep the design in terms of groups of 4 bytes,
but also make it work if the total length is not a multiple of 4.
Just like before, imagine the buffer as a big number, and its most
significant bytes (the ones that would make up to a multiple of 4) are
missing. Thus, with a big endian (no quirks) interpretation of the
buffer, those most significant bytes would be absent from the beginning
of the buffer, and with a LSW32_IS_FIRST interpretation, they would be
absent from the end of the buffer. The LITTLE_ENDIAN quirk, in the
packing() API world, only affects byte ordering within groups of 4.
Thus, it does not change which bytes are missing. Only the significance
of the remaining bytes within the (smaller) group.

No change intended for buffer sizes which are multiples of 4. Tested
with the sja1105 driver and with downstream unit tests.

Link: https://lore.kernel.org/netdev/a0338310-e66c-497c-bc1f-a597e50aa3ff@intel.com/
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Tested-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Link: https://patch.msgid.link/20241002-packing-kunit-tests-and-split-pack-unpack-v2-2-8373e551eae3@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-03 15:32:03 -07:00
Vladimir Oltean
8b3e26677b lib: packing: refuse operating on bit indices which exceed size of buffer
While reworking the implementation, it became apparent that this check
does not exist.

There is no functional issue yet, because at call sites, "startbit" and
"endbit" are always hardcoded to correct values, and never come from the
user.

Even with the upcoming support of arbitrary buffer lengths, the
"startbit >= 8 * pbuflen" check will remain correct. This is because
we intend to always interpret the packed buffer in a way that avoids
discontinuities in the available bit indices.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Tested-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Link: https://patch.msgid.link/20241002-packing-kunit-tests-and-split-pack-unpack-v2-1-8373e551eae3@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-03 15:32:03 -07:00
Linus Torvalds
20c2474fa5 vfs-6.12-rc2.fixes.2
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCZv5Y3gAKCRCRxhvAZXjc
 ojFPAP45kz5JgVKFn8iZmwfjPa7qbCa11gEzmx0SbUt3zZ3mJAD/fL9k9KaNU+qA
 LIcZW5BJn/p5fumUAw8/fKoz4ajCWQk=
 =LIz1
 -----END PGP SIGNATURE-----

Merge tag 'vfs-6.12-rc2.fixes.2' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs

Pull vfs fixes from Christian Brauner:
 "vfs:

   - Ensure that iter_folioq_get_pages() advances to the next slot
     otherwise it will end up using the same folio with an out-of-bound
     offset.

  iomap:

   - Dont unshare delalloc extents which can't be reflinked, and thus
     can't be shared.

   - Constrain the file range passed to iomap_file_unshare() directly in
     iomap instead of requiring the callers to do it.

  netfs:

   - Use folioq_count instead of folioq_nr_slot to prevent an
     unitialized value warning in netfs_clear_buffer().

   - Fix missing wakeup after issuing writes by scheduling the write
     collector only if all the subrequest queues are empty and thus no
     writes are pending.

   - Fix two minor documentation bugs"

* tag 'vfs-6.12-rc2.fixes.2' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
  iomap: constrain the file range passed to iomap_file_unshare
  iomap: don't bother unsharing delalloc extents
  netfs: Fix missing wakeup after issuing writes
  Documentation: add missing folio_queue entry
  folio_queue: fix documentation
  netfs: Fix a KMSAN uninit-value error in netfs_clear_buffer
  iov_iter: fix advancing slot in iter_folioq_get_pages()
2024-10-03 09:22:50 -07:00
Uros Bizjak
0402779aae lib/test_scanf: Include <linux/prandom.h> instead of <linux/random.h>
Substitute the inclusion of <linux/random.h> header with
<linux/prandom.h> to allow the removal of legacy inclusion
of <linux/prandom.h> from <linux/random.h>.

Signed-off-by: Uros Bizjak <ubizjak@gmail.com>
Acked-by: Petr Mladek <pmladek@suse.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Cc: Sergey Senozhatsky <senozhatsky@chromium.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2024-10-03 18:20:29 +02:00
Uros Bizjak
1da74f9050 lib/test_parman: Include <linux/prandom.h> instead of <linux/random.h>
Substitute the inclusion of <linux/random.h> header with
<linux/prandom.h> to allow the removal of legacy inclusion
of <linux/prandom.h> from <linux/random.h>.

Signed-off-by: Uros Bizjak <ubizjak@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2024-10-03 18:20:27 +02:00
Uros Bizjak
2e2fe47182 bpf/tests: Include <linux/prandom.h> instead of <linux/random.h>
Substitute the inclusion of <linux/random.h> header with
<linux/prandom.h> to allow the removal of legacy inclusion
of <linux/prandom.h> from <linux/random.h>.

Signed-off-by: Uros Bizjak <ubizjak@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Andrii Nakryiko <andrii@kernel.org>
Cc: Martin KaFai Lau <martin.lau@linux.dev>
Cc: Eduard Zingerman <eddyz87@gmail.com>
Cc: Song Liu <song@kernel.org>
Cc: Yonghong Song <yonghong.song@linux.dev>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: KP Singh <kpsingh@kernel.org>
Cc: Stanislav Fomichev <sdf@fomichev.me>
Cc: Hao Luo <haoluo@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2024-10-03 18:20:23 +02:00
Uros Bizjak
a7e74510e0 lib/rbtree-test: Include <linux/prandom.h> instead of <linux/random.h>
Substitute the inclusion of <linux/random.h> header with
<linux/prandom.h> to allow the removal of legacy inclusion
of <linux/prandom.h> from <linux/random.h>.

Signed-off-by: Uros Bizjak <ubizjak@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2024-10-03 18:20:19 +02:00
Uros Bizjak
baacb8b413 random32: Include <linux/prandom.h> instead of <linux/random.h>
Substitute the inclusion of <linux/random.h> header with
<linux/prandom.h> to allow the removal of legacy inclusion
of <linux/prandom.h> from <linux/random.h>.

Signed-off-by: Uros Bizjak <ubizjak@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: David S. Miller <davem@davemloft.net>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2024-10-03 18:20:17 +02:00
Uros Bizjak
9127ad4242 kunit: string-stream-test: Include <linux/prandom.h>
Include <linux/random.h> header to allow the removal of legacy
inclusion of <linux/prandom.h> from <linux/random.h>.

Signed-off-by: Uros Bizjak <ubizjak@gmail.com>
Cc: Brendan Higgins <brendan.higgins@linux.dev>
Cc: David Gow <davidgow@google.com>
Cc: Rae Moar <rmoar@google.com>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2024-10-03 18:20:14 +02:00
Uros Bizjak
d46150d6fd lib/interval_tree_test.c: Include <linux/prandom.h> instead of <linux/random.h>
Substitute the inclusion of <linux/random.h> header with
<linux/prandom.h> to allow the removal of legacy inclusion
of <linux/prandom.h> from <linux/random.h>.

Signed-off-by: Uros Bizjak <ubizjak@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2024-10-03 18:20:12 +02:00
Al Viro
5f60d5f6bb move asm/unaligned.h to linux/unaligned.h
asm/unaligned.h is always an include of asm-generic/unaligned.h;
might as well move that thing to linux/unaligned.h and include
that - there's nothing arch-specific in that header.

auto-generated by the following:

for i in `git grep -l -w asm/unaligned.h`; do
	sed -i -e "s/asm\/unaligned.h/linux\/unaligned.h/" $i
done
for i in `git grep -l -w asm-generic/unaligned.h`; do
	sed -i -e "s/asm-generic\/unaligned.h/linux\/unaligned.h/" $i
done
git mv include/asm-generic/unaligned.h include/linux/unaligned.h
git mv tools/include/asm-generic/unaligned.h tools/include/linux/unaligned.h
sed -i -e "/unaligned.h/d" include/asm-generic/Kbuild
sed -i -e "s/__ASM_GENERIC/__LINUX/" include/linux/unaligned.h tools/include/linux/unaligned.h
2024-10-02 17:23:23 -04:00
Vlastimil Babka
cac39b0706 slub/kunit: skip test_kfree_rcu when the slub kunit test is built-in
Guenter Roeck reports that the new slub kunit tests added by commit
4e1c44b3db ("kunit, slub: add test_kfree_rcu() and
test_leak_destroy()") cause a lockup on boot on several architectures
when the kunit tests are configured to be built-in and not modules.

The test_kfree_rcu test invokes kfree_rcu() and boot sequence inspection
showed the runner for built-in kunit tests kunit_run_all_tests() is
called before setting system_state to SYSTEM_RUNNING and calling
rcu_end_inkernel_boot(), so this seems like a likely cause. So while I
was unable to reproduce the problem myself, skipping the test when the
slub_kunit module is built-in should avoid the issue.

An alternative fix that was moving the call to kunit_run_all_tests() a
bit later in the boot was tried, but has broken tests with functions
marked as __init due to free_initmem() already being done.

Fixes: 4e1c44b3db ("kunit, slub: add test_kfree_rcu() and test_leak_destroy()")
Reported-by: Guenter Roeck <linux@roeck-us.net>
Closes: https://lore.kernel.org/all/6fcb1252-7990-4f0d-8027-5e83f0fb9409@roeck-us.net/
Cc: Paul E. McKenney <paulmck@kernel.org>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Uladzislau Rezki <urezki@gmail.com>
Cc: rcu@vger.kernel.org
Cc: Brendan Higgins <brendanhiggins@google.com>
Cc: David Gow <davidgow@google.com>
Cc: Rae Moar <rmoar@google.com>
Cc: linux-kselftest@vger.kernel.org
Cc: kunit-dev@googlegroups.com
Tested-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
2024-10-02 16:28:46 +02:00
Vlastimil Babka
3f1dd33f99 mm, slab: suppress warnings in test_leak_destroy kunit test
The test_leak_destroy kunit test intends to test the detection of stray
objects in kmem_cache_destroy(), which normally produces a warning. The
other slab kunit tests suppress the warnings in the kunit test context,
so suppress warnings and related printk output in this test as well.
Automated test running environments then don't need to learn to filter
the warnings.

Also rename the test's kmem_cache, the name was wrongly copy-pasted from
test_kfree_rcu.

Fixes: 4e1c44b3db ("kunit, slub: add test_kfree_rcu() and test_leak_destroy()")
Reported-by: kernel test robot <oliver.sang@intel.com>
Closes: https://lore.kernel.org/oe-lkp/202408251723.42f3d902-oliver.sang@intel.com
Reported-by: Hyeonggon Yoo <42.hyeyoo@gmail.com>
Closes: https://lore.kernel.org/all/CAB=+i9RHHbfSkmUuLshXGY_ifEZg9vCZi3fqr99+kmmnpDus7Q@mail.gmail.com/
Reported-by: Guenter Roeck <linux@roeck-us.net>
Closes: https://lore.kernel.org/all/6fcb1252-7990-4f0d-8027-5e83f0fb9409@roeck-us.net/
Tested-by: Guenter Roeck <linux@roeck-us.net>
Reviewed-by: Hyeonggon Yoo <42.hyeyoo@gmail.com>
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
2024-10-02 16:28:46 +02:00
Omar Sandoval
0d24852bd7
iov_iter: fix advancing slot in iter_folioq_get_pages()
iter_folioq_get_pages() decides to advance to the next folioq slot when
it has reached the end of the current folio. However, it is checking
offset, which is the beginning of the current part, instead of
iov_offset, which is adjusted to the end of the current part, so it
doesn't advance the slot when it's supposed to. As a result, on the next
iteration, we'll use the same folio with an out-of-bounds offset and
return an unrelated page.

This manifested as various crashes and other failures in 9pfs in drgn's
VM testing setup and BPF CI.

Fixes: db0aa2e956 ("mm: Define struct folio_queue and ITER_FOLIOQ to handle a sequence of folios")
Link: https://lore.kernel.org/linux-fsdevel/20240923183432.1876750-1-chantr4@gmail.com/
Tested-by: Manu Bretelle <chantr4@gmail.com>
Signed-off-by: Omar Sandoval <osandov@fb.com>
Link: https://lore.kernel.org/r/cbaf141ba6c0e2e209717d02746584072844841a.1727722269.git.osandov@fb.com
Tested-by: Eduard Zingerman <eddyz87@gmail.com>
Tested-by: Leon Romanovsky <leon@kernel.org>
Tested-by: Joey Gouly <joey.gouly@arm.com>
Acked-by: David Howells <dhowells@redhat.com>
Signed-off-by: Christian Brauner <brauner@kernel.org>
2024-10-01 11:49:57 +02:00
Linus Torvalds
9c44575c78 bitmap-for-6.12
- switch all bitmamp APIs from inline to __always_inline from Brian Norris;
  - introduce GENMASK_U128() macro from Anshuman Khandual;
 -----BEGIN PGP SIGNATURE-----
 
 iQGzBAABCgAdFiEEi8GdvG6xMhdgpu/4sUSA/TofvsgFAmb22isACgkQsUSA/Tof
 vsie2gwAl3l5vye90xnD6N8wFmKBKAWXMn8Iby7JyM9gAn6j1QuE5AppS+3JtIpZ
 rPRSgFZIVPOgBtiKjb6zAWj7KbtCmaSW+L5ZVaLQ+vtwBVNpWIWHsHKu0uIpuugT
 3wp/IeaE92bc/mioqb27pj2Gnv+lzYBmbK7Mu08a3q1Adwv0I7BJ4GvqxN1lLAEW
 xrFB86xztqdV7QC45J7Q5nIyUw7UBYK078elQ8iKSj5BR8MeaEJiavETwx9DHgAO
 Z8cG94ek3IpvLpiexNcgG+FTezZj9PnTVHxry9o7CIctafiqjYqXAJ9gks1Q4QUu
 q1IjPAdueLTAMPkpK67sI3fwC6zPyX5d8DVDUTuA6qhCsMyHW687gTRy4LPR14LL
 gd1Tzg+J9DQ5KBoG4TYN/g5VoP1hkKQqpetaJhdPqmYocfmqZuzyItb+gBjhyvSp
 3YOgLg/4lULy3sZ6Qd/q8CWglWlaNYXXzf13H8f2qUpVx4NLTDOwjj/CVjZR/D0C
 wje/8XU3
 =8jNc
 -----END PGP SIGNATURE-----

Merge tag 'bitmap-for-6.12' of https://github.com/norov/linux

Pull bitmap updates from Yury Norov:

 - switch all bitmamp APIs from inline to __always_inline (Brian Norris)

   The __always_inline series improves on code generation, and now with
   the latest compiler versions is required to avoid compilation
   warnings. It spent enough in my backlog, and I'm thankful to Brian
   Norris for taking over and moving it forward.

 - introduce GENMASK_U128() macro (Anshuman Khandual)

   GENMASK_U128() is a prerequisite needed for arm64 development

* tag 'bitmap-for-6.12' of https://github.com/norov/linux:
  lib/test_bits.c: Add tests for GENMASK_U128()
  uapi: Define GENMASK_U128
  nodemask: Switch from inline to __always_inline
  cpumask: Switch from inline to __always_inline
  bitmap: Switch from inline to __always_inline
  find: Switch from inline to __always_inline
2024-09-27 12:10:45 -07:00
Linus Torvalds
eee280841e 19 hotfixes. 13 are cc:stable.
There's a focus on fixes for the memfd_pin_folios() work which was added
 into 6.11.  Apart from that, the usual shower of singleton fixes.
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCZvbhSAAKCRDdBJ7gKXxA
 jp8CAP47txk2c+tBLggog2MkQamADY5l5MT6E3fYq3ghSiKtVQEAnqX3LiQJ02tB
 o9LcPcVrM90QntpKrLP1CpWCVdR+zA8=
 =e0QC
 -----END PGP SIGNATURE-----

Merge tag 'mm-hotfixes-stable-2024-09-27-09-45' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull  misc fixes from Andrew Morton:
 "19 hotfixes.  13 are cc:stable.

  There's a focus on fixes for the memfd_pin_folios() work which was
  added into 6.11. Apart from that, the usual shower of singleton fixes"

* tag 'mm-hotfixes-stable-2024-09-27-09-45' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
  ocfs2: fix uninit-value in ocfs2_get_block()
  zram: don't free statically defined names
  memory tiers: use default_dram_perf_ref_source in log message
  Revert "list: test: fix tests for list_cut_position()"
  kselftests: mm: fix wrong __NR_userfaultfd value
  compiler.h: specify correct attribute for .rodata..c_jump_table
  mm/damon/Kconfig: update DAMON doc URL
  mm: kfence: fix elapsed time for allocated/freed track
  ocfs2: fix deadlock in ocfs2_get_system_file_inode
  ocfs2: reserve space for inline xattr before attaching reflink tree
  mm: migrate: annotate data-race in migrate_folio_unmap()
  mm/hugetlb: simplify refs in memfd_alloc_folio
  mm/gup: fix memfd_pin_folios alloc race panic
  mm/gup: fix memfd_pin_folios hugetlb page allocation
  mm/hugetlb: fix memfd_pin_folios resv_huge_pages leak
  mm/hugetlb: fix memfd_pin_folios free_huge_pages leak
  mm/filemap: fix filemap_get_folios_contig THP panic
  mm: make SPLIT_PTE_PTLOCKS depend on SMP
  tools: fix shared radix-tree build
2024-09-27 10:27:22 -07:00
Guenter Roeck
c509f67df3 Revert "list: test: fix tests for list_cut_position()"
This reverts commit e620799c41.

The commit introduces unit test failures.

     Expected cur == &entries[i], but
         cur == 0000037fffadfd80
         &entries[i] == 0000037fffadfd60
     # list_test_list_cut_position: pass:0 fail:1 skip:0 total:1
     not ok 21 list_test_list_cut_position
     # list_test_list_cut_before: EXPECTATION FAILED at lib/list-test.c:444
     Expected cur == &entries[i], but
         cur == 0000037fffa9fd70
         &entries[i] == 0000037fffa9fd60
     # list_test_list_cut_before: EXPECTATION FAILED at lib/list-test.c:444
     Expected cur == &entries[i], but
         cur == 0000037fffa9fd80
         &entries[i] == 0000037fffa9fd70

Revert it.

Link: https://lkml.kernel.org/r/20240922150507.553814-1-linux@roeck-us.net
Fixes: e620799c41 ("list: test: fix tests for list_cut_position()")
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Cc: I Hsin Cheng <richard120310@gmail.com>
Cc: David Gow <davidgow@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-09-26 14:01:44 -07:00
Linus Torvalds
11a299a793 for-6.12/block-20240925
-----BEGIN PGP SIGNATURE-----
 
 iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAmb0T5AQHGF4Ym9lQGtl
 cm5lbC5kawAKCRD301j7KXHgpnfHEADCXqmqZC+xr3sHZH9T1lz9KaFp1FjuBhCw
 bGpUgXQ9aLcqQUWJxmYVer8N2x2+Ds+xq4fm/rP1BfvNgRupqheHBwuLxSrz14EX
 lYmKZ+krMIPTDaLFewmEWflDwmZX0WFgV6nKTMLiO5BMeI4zXCkFGtwYFys2+Cdd
 9zYCFPgGDZUR77Ws5PpyqPVz2MoiNtsjrGmHpEmNZ+rIDzlpVOYgYk27X9ZbvNxC
 /l0KTc9+ayAeG0Kx5jO+m6Hrj3I6ehvM9JZMgpS/tF/jtccD2oVkJFJDlU+Jciv6
 BwVzgyDPGV7sXFT1fnSqDBYYwr/73nzNH0Gk8wn4Jg2LhjmVANVo9eQSOXDTYZI+
 O4HfIHGTIrk75TQd4bhq3dqaylS78pKBI/eQJUli2UNoyLWMrMyE88yh2YJam2Fs
 vJ/MHGxvFRurYbAlqLr33nb3ajvpg+D7XuAYfqHPMc2ZUe28Kza50Dj+luNjfVCu
 3qfR6qBlsdWuABtUS3vneB9jZp5jDnOpVfuBgtcAqIboUjehTXsI7If09Ex/mxLq
 O0KqNwBMfunPOKd5kGXlAgY8LRMfOhNaAAFBlXYUZB2eAadQnqVselTFvHMZkXo7
 wH/l6trd+/Tf+7Rav0YduNIlpVr7IctC+A7ph4zPdIjQxFEySCrC7cvAjel29LyV
 zgWW0Mw/sA==
 =yiWu
 -----END PGP SIGNATURE-----

Merge tag 'for-6.12/block-20240925' of git://git.kernel.dk/linux

Pull more block updates from Jens Axboe:

 - Improve blk-integrity segment counting and merging (Keith)

 - NVMe pull request via Keith:
      - Multipath fixes (Hannes)
      - Sysfs attribute list NULL terminate fix (Shin'ichiro)
      - Remove problematic read-back (Keith)

 - Fix for a regression with the IO scheduler switching freezing from
   6.11 (Damien)

 - Use a raw spinlock for sbitmap, as it may get called from preempt
   disabled context (Ming)

 - Cleanup for bd_claiming waiting, using var_waitqueue() rather than
   the bit waitqueues, as that more accurately describes that it does
   (Neil)

 - Various cleanups (Kanchan, Qiu-ji, David)

* tag 'for-6.12/block-20240925' of git://git.kernel.dk/linux:
  nvme: remove CC register read-back during enabling
  nvme: null terminate nvme_tls_attrs
  nvme-multipath: avoid hang on inaccessible namespaces
  nvme-multipath: system fails to create generic nvme device
  lib/sbitmap: define swap_lock as raw_spinlock_t
  block: Remove unused blk_limits_io_{min,opt}
  drbd: Fix atomicity violation in drbd_uuid_set_bm()
  block: Fix elv_iosched_local_module handling of "none" scheduler
  block: remove bogus union
  block: change wait on bd_claiming to use a var_waitqueue
  blk-integrity: improved sg segment mapping
  block: unexport blk_rq_count_integrity_sg
  nvme-rdma: use request to get integrity segments
  scsi: use request to get integrity segments
  block: provide a request helper for user integrity segments
  blk-integrity: consider entire bio list for merging
  blk-integrity: properly account for segments
  blk-mq: set the nr_integrity_segments from bio
  blk-mq: unconditional nr_integrity_segments
2024-09-25 14:56:40 -07:00
Linus Torvalds
68e5c7d4ce Kbuild updates for v6.12
- Support cross-compiling linux-headers Debian package and kernel-devel
    RPM package
 
  - Add support for the linux-debug Pacman package
 
  - Improve module rebuilding speed by factoring out the common code to
    scripts/module-common.c
 
  - Separate device tree build rules into scripts/Makefile.dtbs
 
  - Add a new script to generate modules.builtin.ranges, which is useful
    for tracing tools to find symbols in built-in modules
 
  - Refactor Kconfig and misc tools
 
  - Update Kbuild and Kconfig documentation
 -----BEGIN PGP SIGNATURE-----
 
 iQJJBAABCgAzFiEEbmPs18K1szRHjPqEPYsBB53g2wYFAmby2+QVHG1hc2FoaXJv
 eUBrZXJuZWwub3JnAAoJED2LAQed4NsGpQ0QALWMgox3OdceNiBT8QieqRFfwKFv
 5jxtsZt+MbTdWNMEfgc4Cq2i5ZAqpYGZh32RwTiZJogBvYEIoO7M4Md9VwoEe/BC
 q8VZ6FhUy7358IX/FCukfB0dYvkziRalBRDrE4iFmMMdhBvZ9nrvMxllqFCMllLj
 DTrBTTiMus3qiiczr4tb5QwaIR6C+yqiEBF++ftLmWvo9dn8YNNUnI65fGjyQM/w
 0wMPwsB3Y2HdnRpLUS6T18gZbjoXsAk4+WX0TpdBfTs3d7AdbzlSMtc0BslEm6Tb
 JjIK6SbJCM3kNC7O0/gsUenOaSBxSbKjjg33gQxn/eNoi0nRt+qnBMMreYiTd95G
 Hq86QcNfKQtWAagKRTppMkYEDqMU2RKH7BmJOsfQyeG9cGpAAu+0HsQv3f/h5QP1
 MlA8o+NP5oQn6RbrhZz1Pqm24+OMxiXaBhmo8XbZ+MXzi/CBR54Eo4ip/FSHzXII
 EGEAQL7t7YU7xu8qMIE6ZQMH7BJsjJNee0vrNiYZa4xHLYyHi6mJl8K6LlHQ3nEx
 WOsPX9MLITtSJwcvIio/0sEnuR7pjcShGfqhbHO5tiOYznsbcSvu3+18HPGCpFRt
 vYFkNIRc298k7++A+Zp2wwdD2TS+SSilrAImmJXMhf0M+Nyg2vnlfAo8t0QSkFlh
 1g9dJuy+8jYRjHXP
 =g4t/
 -----END PGP SIGNATURE-----

Merge tag 'kbuild-v6.12' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild

Pull Kbuild updates from Masahiro Yamada:

 - Support cross-compiling linux-headers Debian package and kernel-devel
   RPM package

 - Add support for the linux-debug Pacman package

 - Improve module rebuilding speed by factoring out the common code to
   scripts/module-common.c

 - Separate device tree build rules into scripts/Makefile.dtbs

 - Add a new script to generate modules.builtin.ranges, which is useful
   for tracing tools to find symbols in built-in modules

 - Refactor Kconfig and misc tools

 - Update Kbuild and Kconfig documentation

* tag 'kbuild-v6.12' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: (51 commits)
  kbuild: doc: replace "gcc" in external module description
  kbuild: doc: describe the -C option precisely for external module builds
  kbuild: doc: remove the description about shipped files
  kbuild: doc: drop section numbering, use references in modules.rst
  kbuild: doc: throw out the local table of contents in modules.rst
  kbuild: doc: remove outdated description of the limitation on -I usage
  kbuild: doc: remove description about grepping CONFIG options
  kbuild: doc: update the description about Kbuild/Makefile split
  kbuild: remove unnecessary export of RUST_LIB_SRC
  kbuild: remove append operation on cmd_ld_ko_o
  kconfig: cache expression values
  kconfig: use hash table to reuse expressions
  kconfig: refactor expr_eliminate_dups()
  kconfig: add comments to expression transformations
  kconfig: change some expr_*() functions to bool
  scripts: move hash function from scripts/kconfig/ to scripts/include/
  kallsyms: change overflow variable to bool type
  kallsyms: squash output_address()
  kbuild: add install target for modules.builtin.ranges
  scripts: add verifier script for builtin module range data
  ...
2024-09-24 13:02:06 -07:00
Linus Torvalds
9ab27b0186 The core clk framework is left largely untouched this time around except for
support for the newly ratified DT property 'assigned-clock-rates-u64'. I'm much
 more excited about the support for loading DT overlays from KUnit tests so that
 we can test how the clk framework parses DT nodes during clk registration. The
 clk framework has some places that are highly DeviceTree dependent so this
 charts the path to extend the KUnit tests to cover even more framework code in
 the future. I've got some more tests on the list that use the DT overlay
 support, but they uncovered issues with clk unregistration that I'm still
 working on fixing.
 
 Outside the core, the clk driver update pile is dominated by Qualcomm and
 Renesas SoCs, making it fairly usual. Looking closer, there are fixes for
 things all over the place, like adding missing clk frequencies or moving
 defines for the number of clks out of DT binding headers into the drivers.
 There are even conversions of DT bindings to YAML and migration away from
 strings to describe clk topology. Overall it doesn't look unusual so I expect
 the new drivers to be where we'll have fixes in the coming weeks.
 
 Core:
  - KUnit tests for clk registration and fixed rate basic clk type
  - A couple more devm helpers, one consumer and one provider
  - Support for assigned-clock-rates-u64
 
 New Drivers:
  - Camera, display and GPU clocks on Qualcomm SM4450
  - Camera clocks on Qualcomm SM8150
  - Rockchip rk3576 clks
  - Microchip SAM9X7 clks
  - Renesas RZ/V2H(P) (R9A09G057) clks
 
 Updates:
  - Mark a bunch of struct freq_tbl const to reduce .data usage
  - Add Qualcomm MSM8226 A7PLL and Regera PLL support
  - Fix the Qualcomm Lucid 5LPE PLL configuration sequence to not reuse
    Trion, as they do differ
  - A number of fixes to the Qualcomm SM8550 display clock driver
  - Fold Qualcomm SM8650 display clock driver into SM8550 one
  - Add missing clocks and GDSCs needed for audio on Qualcomm MSM8998
  - Add missing USB MP resets, GPLL9, and QUPv3 DFS to Qualcomm SC8180X
  - Fix sdcc clk frequency tables on Qualcomm SC8180X
  - Drop the Qualcomm SM8150 gcc_cpuss_ahb_clk_src
  - Mark Qualcomm PCIe GDSCs as RET_ON on sm8250 and sm8540 to avoid them
    turning off during suspend
  - Use the HW_CTRL mechanism on Qualcomm SM8550 video clock controller
    GDSCs
  - Get rid of CLK_NR_CLKS defines in Rockchip DT binding headers
  - Some fixes for Rockchip rk3228 and rk3588
  - Exynos850: Add clock for Thermal Management Unit
  - Exynos7885: Fix duplicated ID in the header, add missing TOP PLLs and
    add clocks for USB block in the FSYS clock controller
  - ExynosAutov9: Add DPUM clock controller
  - ExynosAutov920: Add new (first) clock controllers: TOP and PERIC0
    (and a bit more complete bindings)
  - Use clk_hw pointer instead of fw_name for acm_aud_clk[0-1]_sel clocks
    on i.MX8Q as parents in ACM provider
  - Add i.MX95 NETCMIX support to the block control provider
  - Fix parents for ENETx_REF_SEL clocks on i.MX6UL
  - Add USB clocks, resets and power domains on Renesas RZ/G3S
  - Add Generic Timer (GTM), I2C Bus Interface (RIIC), SD/MMC Host
    Interface (SDHI) and Watchdog Timer (WDT) clocks and resets on
    Renesas RZ/V2H
  - Add PCIe, PWM, and CAN-FD clocks on Renesas R-Car V4M
  - Add LCD controller clocks and resets on Renesas RZ/G2UL
  - Add DMA clocks and resets on Renesas RZ/G3S
  - Add fractional multiplication PLL support on Renesas R-Car Gen4
  - Document support for the Renesas RZ/G2M v3.0 (r8a774a3) SoC
  - Support for the Microchip SAM9X7 SoC as follows:
  - Updates for the Microchip PLL drivers
  - DT binding documentation updates (for the new clock driver and for
    the slow clock controller that SAM9X7 is using)
  - A fix for the Microchip SAMA7G5 clock driver to avoid allocating more
    memory than necessary
  - Constify some Amlogic structs
  - Add SM1 eARC clocks for Amlogic
  - Introduce a symbol namespace for Amlogic clock specific symbols
  - Add reset controller support to audiomix block control on i.MX
  - Add CLK_SET_RATE_PARENT flag to all audiomix clocks and to
    i.MX7D lcdif_pixel_src clock
  - Fix parent clocks for earc_phy and audpll on i.MX8MP
  - Fix default parents for enet[12]_ref_sel on i.MX6UL
  - Add ops in composite 8M and 93 that allow no-op on disable
  - Add check for PCC present bit on composite 7ULP register
  - Fix fractional part for fracn-gppll on prepare in i.MX
  - Fix clock tree update for TF-A managed clocks on i.MX8M
  - Drop CLK_SET_PARENT_GATE for DRAM mux on i.MX7D
  - Add the SAI7 IPG clock for i.MX8MN
  - Mark the 'nand_usdhc_bus' clock as non-critical on i.MX8MM
  - Add LVDS bypass clocks on i.MX8QXP
  - Add muxes for MIPI and PHY ref clocks on i.MX
  - Reorder dc0_bypass0_clk, lcd_pxl and dc1_disp clocks on i.MX8QXP
  - Add 1039.5MHz and 800MHz rates to fracn-gppll table on i.MX
  - Add CLK_SET_RATE_PARENT for media_disp pixel clocks on i.MX8QXP
  - Add some module descriptions to the i.MX generic and the
    i.MXRT1050 driver
  - Fix return value for bypass for composite i.MX7ULP
  - Move Mediatek clk bindings to clock/
  - Convert some more clk bindings to dt schema
 -----BEGIN PGP SIGNATURE-----
 
 iQJFBAABCAAvFiEE9L57QeeUxqYDyoaDrQKIl8bklSUFAmbxswcRHHNib3lkQGtl
 cm5lbC5vcmcACgkQrQKIl8bklSXjoQ/9GRwTJsRBHhFKZscwklDGHJiFOowsLnzC
 q+fk0J2in+7rLezNv/5nkANOtm7eicYv5kkiY/OQArHB704neHkdVfXvSuaGMMM5
 SXPLq7YtH/4haOWhs/HYfx551+cWGHv9orTVDJpF8GHQ5t37C1BX4KphLlUcgxFe
 X0ZvbLdecp/VS4BiU+HM2zPM/SLU8V4xNmARUMZhur9QQ1P2n4YY8zGU87bWLaTB
 u1wrwm9LMtq+A+LR6ViMRwLZKYXaR9o+rndbhCVURvYZEmrIB+x5iYS8RPJa2kvy
 utsPOghOP0VRqZLT2VvLmKud7lk2Th1Uzng4xwcPxdDtpo6D5y+18VoA8tSHD2Zr
 uwirN8pGbJm+7Ak9K9I4KcA9/9JgGRMsPBgCqdnvJxFgD1c7kT2/aJ5AEWmG8GBD
 zUtqLzmSSnNfYBxXeWAqdrGNFzYZju53tl0ACI01W3lwUffPoJwnvHAdI4aiWMv1
 WdzABSnieX7YcGJrnGzV7ZaIdGwUUyR9OQ5JEi+ajD+qCbnI+oXJgEa+tHI5/XLY
 3As5WJlktmRkWzyacAPiGKsyYJYLNTy0TGwBw1CKQIrtIwjR/HF5THEr2qcy6cze
 YiT7xAzhHcjUlMjjcDEe6Qg5R9ykvYSrFixRscWXbdehP1GpWJkqdgzc1+aBJWGW
 QLLHSYHPkXo=
 =XmiQ
 -----END PGP SIGNATURE-----

Merge tag 'clk-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux

Pull clk updates from Stephen Boyd:
 "The core clk framework is left largely untouched this time around
  except for support for the newly ratified DT property
  'assigned-clock-rates-u64'.

  I'm much more excited about the support for loading DT overlays from
  KUnit tests so that we can test how the clk framework parses DT nodes
  during clk registration. The clk framework has some places that are
  highly DeviceTree dependent so this charts the path to extend the
  KUnit tests to cover even more framework code in the future. I've got
  some more tests on the list that use the DT overlay support, but they
  uncovered issues with clk unregistration that I'm still working on
  fixing.

  Outside the core, the clk driver update pile is dominated by Qualcomm
  and Renesas SoCs, making it fairly usual. Looking closer, there are
  fixes for things all over the place, like adding missing clk
  frequencies or moving defines for the number of clks out of DT binding
  headers into the drivers. There are even conversions of DT bindings to
  YAML and migration away from strings to describe clk topology. Overall
  it doesn't look unusual so I expect the new drivers to be where we'll
  have fixes in the coming weeks.

  Core:
   - KUnit tests for clk registration and fixed rate basic clk type
   - A couple more devm helpers, one consumer and one provider
   - Support for assigned-clock-rates-u64

  New Drivers:
   - Camera, display and GPU clocks on Qualcomm SM4450
   - Camera clocks on Qualcomm SM8150
   - Rockchip rk3576 clks
   - Microchip SAM9X7 clks
   - Renesas RZ/V2H(P) (R9A09G057) clks

  Updates:
   - Mark a bunch of struct freq_tbl const to reduce .data usage
   - Add Qualcomm MSM8226 A7PLL and Regera PLL support
   - Fix the Qualcomm Lucid 5LPE PLL configuration sequence to not reuse
     Trion, as they do differ
   - A number of fixes to the Qualcomm SM8550 display clock driver
   - Fold Qualcomm SM8650 display clock driver into SM8550 one
   - Add missing clocks and GDSCs needed for audio on Qualcomm MSM8998
   - Add missing USB MP resets, GPLL9, and QUPv3 DFS to Qualcomm SC8180X
   - Fix sdcc clk frequency tables on Qualcomm SC8180X
   - Drop the Qualcomm SM8150 gcc_cpuss_ahb_clk_src
   - Mark Qualcomm PCIe GDSCs as RET_ON on sm8250 and sm8540 to avoid
     them turning off during suspend
   - Use the HW_CTRL mechanism on Qualcomm SM8550 video clock controller
     GDSCs
   - Get rid of CLK_NR_CLKS defines in Rockchip DT binding headers
   - Some fixes for Rockchip rk3228 and rk3588
   - Exynos850: Add clock for Thermal Management Unit
   - Exynos7885: Fix duplicated ID in the header, add missing TOP PLLs
     and add clocks for USB block in the FSYS clock controller
   - ExynosAutov9: Add DPUM clock controller
   - ExynosAutov920: Add new (first) clock controllers: TOP and PERIC0
     (and a bit more complete bindings)
   - Use clk_hw pointer instead of fw_name for acm_aud_clk[0-1]_sel
     clocks on i.MX8Q as parents in ACM provider
   - Add i.MX95 NETCMIX support to the block control provider
   - Fix parents for ENETx_REF_SEL clocks on i.MX6UL
   - Add USB clocks, resets and power domains on Renesas RZ/G3S
   - Add Generic Timer (GTM), I2C Bus Interface (RIIC), SD/MMC Host
     Interface (SDHI) and Watchdog Timer (WDT) clocks and resets on
     Renesas RZ/V2H
   - Add PCIe, PWM, and CAN-FD clocks on Renesas R-Car V4M
   - Add LCD controller clocks and resets on Renesas RZ/G2UL
   - Add DMA clocks and resets on Renesas RZ/G3S
   - Add fractional multiplication PLL support on Renesas R-Car Gen4
   - Document support for the Renesas RZ/G2M v3.0 (r8a774a3) SoC
   - Support for the Microchip SAM9X7 SoC as follows:
   - Updates for the Microchip PLL drivers
   - DT binding documentation updates (for the new clock driver and for
     the slow clock controller that SAM9X7 is using)
   - A fix for the Microchip SAMA7G5 clock driver to avoid allocating
     more memory than necessary
   - Constify some Amlogic structs
   - Add SM1 eARC clocks for Amlogic
   - Introduce a symbol namespace for Amlogic clock specific symbols
   - Add reset controller support to audiomix block control on i.MX
   - Add CLK_SET_RATE_PARENT flag to all audiomix clocks and to i.MX7D
     lcdif_pixel_src clock
   - Fix parent clocks for earc_phy and audpll on i.MX8MP
   - Fix default parents for enet[12]_ref_sel on i.MX6UL
   - Add ops in composite 8M and 93 that allow no-op on disable
   - Add check for PCC present bit on composite 7ULP register
   - Fix fractional part for fracn-gppll on prepare in i.MX
   - Fix clock tree update for TF-A managed clocks on i.MX8M
   - Drop CLK_SET_PARENT_GATE for DRAM mux on i.MX7D
   - Add the SAI7 IPG clock for i.MX8MN
   - Mark the 'nand_usdhc_bus' clock as non-critical on i.MX8MM
   - Add LVDS bypass clocks on i.MX8QXP
   - Add muxes for MIPI and PHY ref clocks on i.MX
   - Reorder dc0_bypass0_clk, lcd_pxl and dc1_disp clocks on i.MX8QXP
   - Add 1039.5MHz and 800MHz rates to fracn-gppll table on i.MX
   - Add CLK_SET_RATE_PARENT for media_disp pixel clocks on i.MX8QXP
   - Add some module descriptions to the i.MX generic and the i.MXRT1050
     driver
   - Fix return value for bypass for composite i.MX7ULP
   - Move Mediatek clk bindings to clock/
   - Convert some more clk bindings to dt schema"

* tag 'clk-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux: (180 commits)
  clk: Switch back to struct platform_driver::remove()
  dt-bindings: clock, reset: fix top-comment indentation rk3576 headers
  clk: rockchip: remove unused mclk_pdm0_p/pdm0_p definitions
  clk: provide devm_clk_get_optional_enabled_with_rate()
  clk: fixed-rate: add devm_clk_hw_register_fixed_rate_parent_data()
  clk: imx6ul: fix clock parent for IMX6UL_CLK_ENETx_REF_SEL
  clk: renesas: r9a09g057: Add clock and reset entries for GTM/RIIC/SDHI/WDT
  clk: renesas: rzv2h: Add support for dynamic switching divider clocks
  clk: renesas: r9a08g045: Add clocks, resets and power domains for USB
  clk: rockchip: fix error for unknown clocks
  clk: rockchip: rk3588: drop unused code
  clk: rockchip: Add clock controller for the RK3576
  clk: rockchip: Add new pll type pll_rk3588_ddr
  dt-bindings: clock, reset: Add support for rk3576
  dt-bindings: clock: rockchip,rk3588-cru: drop unneeded assigned-clocks
  clk: rockchip: rk3588: Fix 32k clock name for pmu_24m_32k_100m_src_p
  clk: imx95: enable the clock of NETCMIX block control
  dt-bindings: clock: add RMII clock selection
  dt-bindings: clock: add i.MX95 NETCMIX block control
  clk: imx: imx8: Use clk_hw pointer for self registered clock in clk_parent_data
  ...
2024-09-23 15:01:48 -07:00
Linus Torvalds
b3f391fddf bcachefs changes for 6.12-rc1
rcu_pending, btree key cache rework: this solves lock contenting in the
 key cache, eliminating the biggest source of the srcu lock hold time
 warnings, and drastically improving performance on some metadata heavy
 workloads - on multithreaded creates we're now 3-4x faster than xfs.
 
 We're now using an rhashtable instead of the system inode hash table;
 this is another significant performance improvement on multithreaded
 metadata workloads, eliminating more lock contention.
 
 for_each_btree_key_in_subvolume_upto(): new helper for iterating over
 keys within a specific subvolume, eliminating a lot of open coded
 "subvolume_get_snapshot()" and also fixing another source of srcu lock
 time warnings, by running each loop iteration in its own transaction (as
 the existing for_each_btree_key() does).
 
 More work on btree_trans locking asserts; we now assert that we don't
 hold btree node locks when trans->locked is false, which is important
 because we don't use lockdep for tracking individual btree node locks.
 
 Some cleanups and improvements in the bset.c btree node lookup code,
 from Alan.
 
 Rework of btree node pinning, which we use in backpointers fsck. The old
 hacky implementation, where the shrinker just skipped over nodes in the
 pinned range, was causing OOMs; instead we now use another shrinker with
 a much higher seeks number for pinned nodes.
 
 Rebalance now uses BCH_WRITE_ONLY_SPECIFIED_DEVS; this fixes an issue
 where rebalance would sometimes fall back to allocating from the full
 filesystem, which is not what we want when it's trying to move data to a
 specific target.
 
 Use __GFP_ACCOUNT, GFP_RECLAIMABLE for btree node, key cache
 allocations.
 
 Idmap mounts are now supported - Hongbo.
 
 Rename whiteouts are now supported - Hongbo.
 
 Erasure coding can now handle devices being marked as failed, or
 forcibly removed. We still need the evacuate path for erasure coding,
 but it's getting very close to ready for people to start using.
 
 Status, and when will we be taking off experimental:
 ----------------------------------------------------
 
 Going by critical, user facing bugs getting found and fixed, we're
 nearly there. There are a couple key items that need to be finished
 before we can take off the experimental label:
 
 - The end-user experience is still pretty painful when the root
   filesystem needs a fsck; we need some form of limited self healing so
   that necessary repair gets run automatically. Errors (by type) are
   recorded in the superblock, so what we need to do next is convert
   remaining inconsistent() errors to fsck() errors (so that all runtime
   inconsistencies are logged in the superblock), and we need to go
   through the list of fsck errors and classify them by which fsck passes
   are needed to repair them.
 
 - We need comprehensive torture testing for all our repair paths, to
   shake out remaining bugs there. Thomas has been working on the tooling
   for this, so this is coming soonish.
 
 Slightly less critical items:
 
 - We need to improve the end-user experience for degraded mounts: right
   now, a degraded root filesystem means dropping to an initramfs shell
   or somehow inputting mount options manually (we don't want to allow
   degraded mounts without some form of user input, except on unattended
   servers) - we need the mount helper to prompt the user to allow
   mounting degraded, and make sure this works with systemd.
 
 - Scalabiity: we have users running 100TB+ filesystems, and that's
   effectively the limit right now due to fsck times. We have some
   reworks in the pipeline to address this, we're aiming to make petabyte
   sized filesystems practical.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEKnAFLkS8Qha+jvQrE6szbY3KbnYFAmbvHQoACgkQE6szbY3K
 bnYfAw/+IXQ43/O+Jzs0MLD7pKZnrlbHiX9FqYLazD40vWvkyRTQOwgTn8pVNhq3
 4YWmtuZyqh036YC+bGqYFOhz20YetS5UdgbClpwmc99JJ6xsY+Z1mdpYfz5oq1Dw
 /pBX5iYb3rAt8UbQoZ8lcWM+GpT3GKJVgJuiLB2gRp9gATFesuh+0qU42oIVVVU5
 4y3VhDBUmRk4XqEnk8hr7EIDMW0wWP3aptxYMZzeUPW0x1cEQ+FWrJo5D6lXv2KK
 dKv3MogvA0FFNi/eNexclPiu2pXtI7vrxT7umsxAICHLt41rWpV5ttE6io3bC4ZN
 qvwF9w2CpmKPKchFru9PO+QrWHVR7e6bphwf3TzyoKZ7tTn42f1RQlub7gBzI3bz
 ai5ZwGRIvpUoPVBj+CO+Ipog81uUb23Ma+gXg1akEFBOAb+o7I3KOOSBh5l+0cHj
 3Ov1n0TLcsoO2cqoqfsV2QubW9YcWEZ76g5mKwQnUn8Cs6Fp0wWaIyK9aNkIAxcr
 tNDPGtH1gKitxUvju5i/LyI7y1UoeFvqJFee0VsU6QnixHn1ySzhePsJt6UEnIJT
 Ia3C96Igqu2mV9FxhfGHj/qi7TGjqqkZHa8+B610cDpgf15cx7Ps2DYjkuQMFCqZ
 Q3Q1o5De9roRq5xF2hLiYJCbzJKqd5ichFsBtLQuX572ICxbICg=
 =oVCy
 -----END PGP SIGNATURE-----

Merge tag 'bcachefs-2024-09-21' of git://evilpiepirate.org/bcachefs

Pull bcachefs updates from Kent Overstreet:

 - rcu_pending, btree key cache rework: this solves lock contenting in
   the key cache, eliminating the biggest source of the srcu lock hold
   time warnings, and drastically improving performance on some metadata
   heavy workloads - on multithreaded creates we're now 3-4x faster than
   xfs.

 - We're now using an rhashtable instead of the system inode hash table;
   this is another significant performance improvement on multithreaded
   metadata workloads, eliminating more lock contention.

 - for_each_btree_key_in_subvolume_upto(): new helper for iterating over
   keys within a specific subvolume, eliminating a lot of open coded
   "subvolume_get_snapshot()" and also fixing another source of srcu
   lock time warnings, by running each loop iteration in its own
   transaction (as the existing for_each_btree_key() does).

 - More work on btree_trans locking asserts; we now assert that we don't
   hold btree node locks when trans->locked is false, which is important
   because we don't use lockdep for tracking individual btree node
   locks.

 - Some cleanups and improvements in the bset.c btree node lookup code,
   from Alan.

 - Rework of btree node pinning, which we use in backpointers fsck. The
   old hacky implementation, where the shrinker just skipped over nodes
   in the pinned range, was causing OOMs; instead we now use another
   shrinker with a much higher seeks number for pinned nodes.

 - Rebalance now uses BCH_WRITE_ONLY_SPECIFIED_DEVS; this fixes an issue
   where rebalance would sometimes fall back to allocating from the full
   filesystem, which is not what we want when it's trying to move data
   to a specific target.

 - Use __GFP_ACCOUNT, GFP_RECLAIMABLE for btree node, key cache
   allocations.

 - Idmap mounts are now supported (Hongbo Li)

 - Rename whiteouts are now supported (Hongbo Li)

 - Erasure coding can now handle devices being marked as failed, or
   forcibly removed. We still need the evacuate path for erasure coding,
   but it's getting very close to ready for people to start using.

* tag 'bcachefs-2024-09-21' of git://evilpiepirate.org/bcachefs: (99 commits)
  bcachefs: return err ptr instead of null in read sb clean
  bcachefs: Remove duplicated include in backpointers.c
  bcachefs: Don't drop devices with stripe pointers
  bcachefs: bch2_ec_stripe_head_get() now checks for change in rw devices
  bcachefs: bch_fs.rw_devs_change_count
  bcachefs: bch2_dev_remove_stripes()
  bcachefs: bch2_trigger_ptr() calculates sectors even when no device
  bcachefs: improve error messages in bch2_ec_read_extent()
  bcachefs: improve error message on too few devices for ec
  bcachefs: improve bch2_new_stripe_to_text()
  bcachefs: ec_stripe_head.nr_created
  bcachefs: bch_stripe.disk_label
  bcachefs: stripe_to_mem()
  bcachefs: EIO errcode cleanup
  bcachefs: Rework btree node pinning
  bcachefs: split up btree cache counters for live, freeable
  bcachefs: btree cache counters should be size_t
  bcachefs: Don't count "skipped access bit" as touched in btree cache scan
  bcachefs: Failed devices no longer require mounting in degraded mode
  bcachefs: bch2_dev_rcu_noerror()
  ...
2024-09-23 10:05:41 -07:00
Linus Torvalds
de5cb0dcb7 Merge branch 'address-masking'
Merge user access fast validation using address masking.

This allows architectures to optionally use a data dependent address
masking model instead of a conditional branch for validating user
accesses.  That avoids the Spectre-v1 speculation barriers.

Right now only x86-64 takes advantage of this, and not all architectures
will be able to do it.  It requires a guard region between the user and
kernel address spaces (so that you can't overflow from one to the
other), and an easy way to generate a guaranteed-to-fault address for
invalid user pointers.

Also note that this currently assumes that there is no difference
between user read and write accesses.  If extended to architectures like
powerpc, we'll also need to separate out the user read-vs-write cases.

* address-masking:
  x86: make the masked_user_access_begin() macro use its argument only once
  x86: do the user address masking outside the user access area
  x86: support user address masking instead of non-speculative conditional
2024-09-22 11:19:35 -07:00
Linus Torvalds
88264981f2 sched_ext: Initial pull request for v6.12
This is the initial pull request of sched_ext. The v7 patchset
 (https://lkml.kernel.org/r/20240618212056.2833381-1-tj@kernel.org) is
 applied on top of tip/sched/core + bpf/master as of Jun 18th.
 
   tip/sched/core 793a62823d1c ("sched/core: Drop spinlocks on contention iff kernel is preempti
 ble")
   bpf/master f6afdaf72a ("Merge branch 'bpf-support-resilient-split-btf'")
 
 Since then, the following pulls were made:
 
 - v6.11-rc1 is pulled to keep up with the mainline.
 
 - tip/sched/core was pulled several times:
 
   - 7b9f6c864a, 0df340ceae, 5ac998574f, 0b1777f0fa04: To resolve
     conflicts. See each commit for details on conflicts and their
     resolutions.
 
   - d7b01aef9dbd: To receive fd03c5b858 ("sched: Rework pick_next_task()")
     and related commits. @prev in added to sched_class->put_prev_task() and
     put_prev_task() is reordered after ->pick_task(), which makes
     sched_class->switch_class() unnecessary. The follow-up commits update
     sched_ext accordingly and drop sched_class->switch_class().
 
 - bpf/master was pulled to receive baebe9aaba ("bpf: allow passing struct
   bpf_iter_<type> as kfunc arguments") and related changes in preparation
   for the DSQ iterator patchset
 
 To obtain the net sched_ext changes, diff against:
 
   git://git.kernel.org/pub/scm/linux/kernel/git/tj/sched_ext.git for-6.12-base
 
 which is the merge of:
 
   tip/sched/core bc9057da1a ("sched/cpufreq: Use NSEC_PER_MSEC for deadline task")
   bpf/master 2ad6d23f46 ("selftests/bpf: Do not update vmlinux.h unnecessarily")
 
 Since the v7 patchset, the following changes were made:
 
 - cpuperf support which was a part of the v6 patchset was posted separately
   and then applied after reviews.
 
 - cgroup support which was a part of the v6 patchset was posted seprately,
   iterated and then applied.
 
 - Improve integration with sched core.
 
 - Double locking usage in migration paths dropped. Depend on
   TASK_ON_RQ_MIGRATING synchronization instead.
 
 - The BPF scheduler couldn't directly dispatch to the local DSQ of another
   CPU using a SCX_DSQ_LOCAL_ON verdict. This caused difficulties around
   handling non-wakeup enqueues. Updated so that SCX_DSQ_LOCAL_ON can be used
   in the enqueue path too.
 
 - DSQ iterator which was a part of the v6 patchset was posted separately.
   The iterator itself was applied after a couple revisions. The associated
   selective consumption kfunc can use further improvements and is still
   being worked on.
 
 - scx_bpf_dispatch[_vtime]_from_dsq() added to increase flexibility. A task
   can now be transferred between two DSQs from almost any context. This
   involved significant refactoring of migration code.
 
 - Various fixes and improvements.
 
 As the branch is based on top of tip/sched/core + bpf/master, please merge
 after both are applied.
 -----BEGIN PGP SIGNATURE-----
 
 iIQEABYKACwWIQTfIjM1kS57o3GsC/uxYfJx3gVYGQUCZuOSuA4cdGpAa2VybmVs
 Lm9yZwAKCRCxYfJx3gVYGVZyAQDBU3WPkYKB8gl6a6YQ+/PzBXorOK7mioS9A2iJ
 vBR3FgEAg1vtcss1S+2juWmVq7ItiFNWCqtXzUr/bVmL9CqqDwA=
 =bOOC
 -----END PGP SIGNATURE-----

Merge tag 'sched_ext-for-6.12' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/sched_ext

Pull sched_ext support from Tejun Heo:
 "This implements a new scheduler class called ‘ext_sched_class’, or
  sched_ext, which allows scheduling policies to be implemented as BPF
  programs.

  The goals of this are:

   - Ease of experimentation and exploration: Enabling rapid iteration
     of new scheduling policies.

   - Customization: Building application-specific schedulers which
     implement policies that are not applicable to general-purpose
     schedulers.

   - Rapid scheduler deployments: Non-disruptive swap outs of scheduling
     policies in production environments"

See individual commits for more documentation, but also the cover letter
for the latest series:

Link: https://lore.kernel.org/all/20240618212056.2833381-1-tj@kernel.org/

* tag 'sched_ext-for-6.12' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/sched_ext: (110 commits)
  sched: Move update_other_load_avgs() to kernel/sched/pelt.c
  sched_ext: Don't trigger ops.quiescent/runnable() on migrations
  sched_ext: Synchronize bypass state changes with rq lock
  scx_qmap: Implement highpri boosting
  sched_ext: Implement scx_bpf_dispatch[_vtime]_from_dsq()
  sched_ext: Compact struct bpf_iter_scx_dsq_kern
  sched_ext: Replace consume_local_task() with move_local_task_to_local_dsq()
  sched_ext: Move consume_local_task() upward
  sched_ext: Move sanity check and dsq_mod_nr() into task_unlink_from_dsq()
  sched_ext: Reorder args for consume_local/remote_task()
  sched_ext: Restructure dispatch_to_local_dsq()
  sched_ext: Fix processs_ddsp_deferred_locals() by unifying DTL_INVALID handling
  sched_ext: Make find_dsq_for_dispatch() handle SCX_DSQ_LOCAL_ON
  sched_ext: Refactor consume_remote_task()
  sched_ext: Rename scx_kfunc_set_sleepable to unlocked and relocate
  sched_ext: Add missing static to scx_dump_data
  sched_ext: Add missing static to scx_has_op[]
  sched_ext: Temporarily work around pick_task_scx() being called without balance_scx()
  sched_ext: Add a cgroup scheduler which uses flattened hierarchy
  sched_ext: Add cgroup support
  ...
2024-09-21 09:44:57 -07:00
Linus Torvalds
440b652328 bpf-next-6.12
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEE+soXsSLHKoYyzcli6rmadz2vbToFAmbk/nIACgkQ6rmadz2v
 bTqxuBAAnqW81Rr0nORIxeJMbyo4EiFuYHGk6u5BYP9NPzqHroUPCLVmSP7Hp/Ta
 CJjsiZeivZsGa6Qlc3BCa4hHNpqP5WE1C/73svSDn7/99EfxdSBtirpMVFUPsUtn
 DDb5chNpvnxKNS8Mw5Ty8wBrdbXHMlSx+IfaFHpv0Yn6EAcuF4UdoEUq2l3PqhfD
 Il9Zm127eViPGAP+o+TBZFfW+rRw8d0ngqeRq2GvJ8ibNEDWss+GmBI1Dod7d+fC
 dUDg96Ipdm1a5Xz7dnH80eXz9JHdpu6qhQrQMKKArnlpJElrKiOf9b17ZcJoPQOR
 ZnstEnUyVnrWROZxUuKY72+2tx3TuSf+L9uZqFHNx3Ix5FIoS+tFbHf4b8SxtsOb
 hb2X7SigdGqhQDxUT+IPeO5hsJlIvG1/VYxMXxgc++rh9DjL06hDLUSH1WBSU0fC
 kFQ7HrcpAlVHtWmGbwwUyVjD+KC/qmZBTAnkcYT4C62WZVytSCnihIuSFAvV1tpZ
 SSIhVPyQ599UoZIiQYihp0S4qP74FotCtErWSrThneh2Cl8kDsRq//lV1nj/PTV8
 CpTvz4VCFDFTgthCfd62fP95EwW5K+aE3NjGTPW/9Hx/0+J/1tT+yqWsrToGaruf
 TbrqtzQhpclz9UEqA+696cVAXNj9uRU4AoD3YIg72kVnRlkgYd0=
 =MDwh
 -----END PGP SIGNATURE-----

Merge tag 'bpf-next-6.12' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next

Pull bpf updates from Alexei Starovoitov:

 - Introduce '__attribute__((bpf_fastcall))' for helpers and kfuncs with
   corresponding support in LLVM.

   It is similar to existing 'no_caller_saved_registers' attribute in
   GCC/LLVM with a provision for backward compatibility. It allows
   compilers generate more efficient BPF code assuming the verifier or
   JITs will inline or partially inline a helper/kfunc with such
   attribute. bpf_cast_to_kern_ctx, bpf_rdonly_cast,
   bpf_get_smp_processor_id are the first set of such helpers.

 - Harden and extend ELF build ID parsing logic.

   When called from sleepable context the relevants parts of ELF file
   will be read to find and fetch .note.gnu.build-id information. Also
   harden the logic to avoid TOCTOU, overflow, out-of-bounds problems.

 - Improvements and fixes for sched-ext:
    - Allow passing BPF iterators as kfunc arguments
    - Make the pointer returned from iter_next method trusted
    - Fix x86 JIT convergence issue due to growing/shrinking conditional
      jumps in variable length encoding

 - BPF_LSM related:
    - Introduce few VFS kfuncs and consolidate them in
      fs/bpf_fs_kfuncs.c
    - Enforce correct range of return values from certain LSM hooks
    - Disallow attaching to other LSM hooks

 - Prerequisite work for upcoming Qdisc in BPF:
    - Allow kptrs in program provided structs
    - Support for gen_epilogue in verifier_ops

 - Important fixes:
    - Fix uprobe multi pid filter check
    - Fix bpf_strtol and bpf_strtoul helpers
    - Track equal scalars history on per-instruction level
    - Fix tailcall hierarchy on x86 and arm64
    - Fix signed division overflow to prevent INT_MIN/-1 trap on x86
    - Fix get kernel stack in BPF progs attached to tracepoint:syscall

 - Selftests:
    - Add uprobe bench/stress tool
    - Generate file dependencies to drastically improve re-build time
    - Match JIT-ed and BPF asm with __xlated/__jited keywords
    - Convert older tests to test_progs framework
    - Add support for RISC-V
    - Few fixes when BPF programs are compiled with GCC-BPF backend
      (support for GCC-BPF in BPF CI is ongoing in parallel)
    - Add traffic monitor
    - Enable cross compile and musl libc

* tag 'bpf-next-6.12' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (260 commits)
  btf: require pahole 1.21+ for DEBUG_INFO_BTF with default DWARF version
  btf: move pahole check in scripts/link-vmlinux.sh to lib/Kconfig.debug
  btf: remove redundant CONFIG_BPF test in scripts/link-vmlinux.sh
  bpf: Call the missed kfree() when there is no special field in btf
  bpf: Call the missed btf_record_free() when map creation fails
  selftests/bpf: Add a test case to write mtu result into .rodata
  selftests/bpf: Add a test case to write strtol result into .rodata
  selftests/bpf: Rename ARG_PTR_TO_LONG test description
  selftests/bpf: Fix ARG_PTR_TO_LONG {half-,}uninitialized test
  bpf: Zero former ARG_PTR_TO_{LONG,INT} args in case of error
  bpf: Improve check_raw_mode_ok test for MEM_UNINIT-tagged types
  bpf: Fix helper writes to read-only maps
  bpf: Remove truncation test in bpf_strtol and bpf_strtoul helpers
  bpf: Fix bpf_strtol and bpf_strtoul helpers for 32bit
  selftests/bpf: Add tests for sdiv/smod overflow cases
  bpf: Fix a sdiv overflow issue
  libbpf: Add bpf_object__token_fd accessor
  docs/bpf: Add missing BPF program types to docs
  docs/bpf: Add constant values for linkages
  bpf: Use fake pt_regs when doing bpf syscall tracepoint tracing
  ...
2024-09-21 09:27:50 -07:00
Linus Torvalds
7856a56541 Many singleton patches - please see the various changelogs for details.
Quite a lot of nilfs2 work this time around.
 
 Notable patch series in this pull request are:
 
 "mul_u64_u64_div_u64: new implementation" by Nicolas Pitre, with
 assistance from Uwe Kleine-König.  Reimplement mul_u64_u64_div_u64() to
 provide (much) more accurate results.  The current implementation was
 causing Uwe some issues in the PWM drivers.
 
 "xz: Updates to license, filters, and compression options" from Lasse
 Collin.  Miscellaneous maintenance and kinor feature work to the xz
 decompressor.
 
 "Fix some GDB command error and add some GDB commands" from Kuan-Ying Lee.
 Fixes and enhancements to the gdb scripts.
 
 "treewide: add missing MODULE_DESCRIPTION() macros" from Jeff Johnson.
 Adds lots of MODULE_DESCRIPTIONs, thus fixing lots of warnings about this.
 
 "nilfs2: add support for some common ioctls" from Ryusuke Konishi.  Adds
 various commonly-available ioctls to nilfs2.
 
 "This series fixes a number of formatting issues in kernel doc comments"
 from Ryusuke Konishi does that.
 
 "nilfs2: prevent unexpected ENOENT propagation" from Ryusuke Konishi.  Fix
 issues where -ENOENT was being unintentionally and inappropriately
 returned to userspace.
 
 "nilfs2: assorted cleanups" from Huang Xiaojia.
 
 "nilfs2: fix potential issues with empty b-tree nodes" from Ryusuke
 Konishi fixes some issues which can occur on corrupted nilfs2 filesystems.
 
 "scripts/decode_stacktrace.sh: improve error reporting and usability" from
 Luca Ceresoli does those things.
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCZu7dpAAKCRDdBJ7gKXxA
 jsPqAPwMDEZyKlfSw7QioEHNHDkmkbP7VYCYR0CbUnppbztwpAD8D37aVbWQ+UzM
 3nnOq3W2Pc2o/20zqi8Upf1mnvUrygQ=
 =/NWE
 -----END PGP SIGNATURE-----

Merge tag 'mm-nonmm-stable-2024-09-21-07-52' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull non-MM updates from Andrew Morton:
 "Many singleton patches - please see the various changelogs for
  details.

  Quite a lot of nilfs2 work this time around.

  Notable patch series in this pull request are:

   - "mul_u64_u64_div_u64: new implementation" by Nicolas Pitre, with
     assistance from Uwe Kleine-König. Reimplement mul_u64_u64_div_u64()
     to provide (much) more accurate results. The current implementation
     was causing Uwe some issues in the PWM drivers.

   - "xz: Updates to license, filters, and compression options" from
     Lasse Collin. Miscellaneous maintenance and kinor feature work to
     the xz decompressor.

   - "Fix some GDB command error and add some GDB commands" from
     Kuan-Ying Lee. Fixes and enhancements to the gdb scripts.

   - "treewide: add missing MODULE_DESCRIPTION() macros" from Jeff
     Johnson. Adds lots of MODULE_DESCRIPTIONs, thus fixing lots of
     warnings about this.

   - "nilfs2: add support for some common ioctls" from Ryusuke Konishi.
     Adds various commonly-available ioctls to nilfs2.

   - "This series fixes a number of formatting issues in kernel doc
     comments" from Ryusuke Konishi does that.

   - "nilfs2: prevent unexpected ENOENT propagation" from Ryusuke
     Konishi. Fix issues where -ENOENT was being unintentionally and
     inappropriately returned to userspace.

   - "nilfs2: assorted cleanups" from Huang Xiaojia.

   - "nilfs2: fix potential issues with empty b-tree nodes" from Ryusuke
     Konishi fixes some issues which can occur on corrupted nilfs2
     filesystems.

   - "scripts/decode_stacktrace.sh: improve error reporting and
     usability" from Luca Ceresoli does those things"

* tag 'mm-nonmm-stable-2024-09-21-07-52' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (103 commits)
  list: test: increase coverage of list_test_list_replace*()
  list: test: fix tests for list_cut_position()
  proc: use __auto_type more
  treewide: correct the typo 'retun'
  ocfs2: cleanup return value and mlog in ocfs2_global_read_info()
  nilfs2: remove duplicate 'unlikely()' usage
  nilfs2: fix potential oob read in nilfs_btree_check_delete()
  nilfs2: determine empty node blocks as corrupted
  nilfs2: fix potential null-ptr-deref in nilfs_btree_insert()
  user_namespace: use kmemdup_array() instead of kmemdup() for multiple allocation
  tools/mm: rm thp_swap_allocator_test when make clean
  squashfs: fix percpu address space issues in decompressor_multi_percpu.c
  lib: glob.c: added null check for character class
  nilfs2: refactor nilfs_segctor_thread()
  nilfs2: use kthread_create and kthread_stop for the log writer thread
  nilfs2: remove sc_timer_task
  nilfs2: do not repair reserved inode bitmap in nilfs_new_inode()
  nilfs2: eliminate the shared counter and spinlock for i_generation
  nilfs2: separate inode type information from i_state field
  nilfs2: use the BITS_PER_LONG macro
  ...
2024-09-21 08:20:50 -07:00
Linus Torvalds
617a814f14 ALong with the usual shower of singleton patches, notable patch series in
this pull request are:
 
 "Align kvrealloc() with krealloc()" from Danilo Krummrich.  Adds
 consistency to the APIs and behaviour of these two core allocation
 functions.  This also simplifies/enables Rustification.
 
 "Some cleanups for shmem" from Baolin Wang.  No functional changes - mode
 code reuse, better function naming, logic simplifications.
 
 "mm: some small page fault cleanups" from Josef Bacik.  No functional
 changes - code cleanups only.
 
 "Various memory tiering fixes" from Zi Yan.  A small fix and a little
 cleanup.
 
 "mm/swap: remove boilerplate" from Yu Zhao.  Code cleanups and
 simplifications and .text shrinkage.
 
 "Kernel stack usage histogram" from Pasha Tatashin and Shakeel Butt.  This
 is a feature, it adds new feilds to /proc/vmstat such as
 
     $ grep kstack /proc/vmstat
     kstack_1k 3
     kstack_2k 188
     kstack_4k 11391
     kstack_8k 243
     kstack_16k 0
 
 which tells us that 11391 processes used 4k of stack while none at all
 used 16k.  Useful for some system tuning things, but partivularly useful
 for "the dynamic kernel stack project".
 
 "kmemleak: support for percpu memory leak detect" from Pavel Tikhomirov.
 Teaches kmemleak to detect leaksage of percpu memory.
 
 "mm: memcg: page counters optimizations" from Roman Gushchin.  "3
 independent small optimizations of page counters".
 
 "mm: split PTE/PMD PT table Kconfig cleanups+clarifications" from David
 Hildenbrand.  Improves PTE/PMD splitlock detection, makes powerpc/8xx work
 correctly by design rather than by accident.
 
 "mm: remove arch_make_page_accessible()" from David Hildenbrand.  Some
 folio conversions which make arch_make_page_accessible() unneeded.
 
 "mm, memcg: cg2 memory{.swap,}.peak write handlers" fro David Finkel.
 Cleans up and fixes our handling of the resetting of the cgroup/process
 peak-memory-use detector.
 
 "Make core VMA operations internal and testable" from Lorenzo Stoakes.
 Rationalizaion and encapsulation of the VMA manipulation APIs.  With a
 view to better enable testing of the VMA functions, even from a
 userspace-only harness.
 
 "mm: zswap: fixes for global shrinker" from Takero Funaki.  Fix issues in
 the zswap global shrinker, resulting in improved performance.
 
 "mm: print the promo watermark in zoneinfo" from Kaiyang Zhao.  Fill in
 some missing info in /proc/zoneinfo.
 
 "mm: replace follow_page() by folio_walk" from David Hildenbrand.  Code
 cleanups and rationalizations (conversion to folio_walk()) resulting in
 the removal of follow_page().
 
 "improving dynamic zswap shrinker protection scheme" from Nhat Pham.  Some
 tuning to improve zswap's dynamic shrinker.  Significant reductions in
 swapin and improvements in performance are shown.
 
 "mm: Fix several issues with unaccepted memory" from Kirill Shutemov.
 Improvements to the new unaccepted memory feature,
 
 "mm/mprotect: Fix dax puds" from Peter Xu.  Implements mprotect on DAX
 PUDs.  This was missing, although nobody seems to have notied yet.
 
 "Introduce a store type enum for the Maple tree" from Sidhartha Kumar.
 Cleanups and modest performance improvements for the maple tree library
 code.
 
 "memcg: further decouple v1 code from v2" from Shakeel Butt.  Move more
 cgroup v1 remnants away from the v2 memcg code.
 
 "memcg: initiate deprecation of v1 features" from Shakeel Butt.  Adds
 various warnings telling users that memcg v1 features are deprecated.
 
 "mm: swap: mTHP swap allocator base on swap cluster order" from Chris Li.
 Greatly improves the success rate of the mTHP swap allocation.
 
 "mm: introduce numa_memblks" from Mike Rapoport.  Moves various disparate
 per-arch implementations of numa_memblk code into generic code.
 
 "mm: batch free swaps for zap_pte_range()" from Barry Song.  Greatly
 improves the performance of munmap() of swap-filled ptes.
 
 "support large folio swap-out and swap-in for shmem" from Baolin Wang.
 With this series we no longer split shmem large folios into simgle-page
 folios when swapping out shmem.
 
 "mm/hugetlb: alloc/free gigantic folios" from Yu Zhao.  Nice performance
 improvements and code reductions for gigantic folios.
 
 "support shmem mTHP collapse" from Baolin Wang.  Adds support for
 khugepaged's collapsing of shmem mTHP folios.
 
 "mm: Optimize mseal checks" from Pedro Falcato.  Fixes an mprotect()
 performance regression due to the addition of mseal().
 
 "Increase the number of bits available in page_type" from Matthew Wilcox.
 Increases the number of bits available in page_type!
 
 "Simplify the page flags a little" from Matthew Wilcox.  Many legacy page
 flags are now folio flags, so the page-based flags and their
 accessors/mutators can be removed.
 
 "mm: store zero pages to be swapped out in a bitmap" from Usama Arif.  An
 optimization which permits us to avoid writing/reading zero-filled zswap
 pages to backing store.
 
 "Avoid MAP_FIXED gap exposure" from Liam Howlett.  Fixes a race window
 which occurs when a MAP_FIXED operqtion is occurring during an unrelated
 vma tree walk.
 
 "mm: remove vma_merge()" from Lorenzo Stoakes.  Major rotorooting of the
 vma_merge() functionality, making ot cleaner, more testable and better
 tested.
 
 "misc fixups for DAMON {self,kunit} tests" from SeongJae Park.  Minor
 fixups of DAMON selftests and kunit tests.
 
 "mm: memory_hotplug: improve do_migrate_range()" from Kefeng Wang.  Code
 cleanups and folio conversions.
 
 "Shmem mTHP controls and stats improvements" from Ryan Roberts.  Cleanups
 for shmem controls and stats.
 
 "mm: count the number of anonymous THPs per size" from Barry Song.  Expose
 additional anon THP stats to userspace for improved tuning.
 
 "mm: finish isolate/putback_lru_page()" from Kefeng Wang: more folio
 conversions and removal of now-unused page-based APIs.
 
 "replace per-quota region priorities histogram buffer with per-context
 one" from SeongJae Park.  DAMON histogram rationalization.
 
 "Docs/damon: update GitHub repo URLs and maintainer-profile" from SeongJae
 Park.  DAMON documentation updates.
 
 "mm/vdpa: correct misuse of non-direct-reclaim __GFP_NOFAIL and improve
 related doc and warn" from Jason Wang: fixes usage of page allocator
 __GFP_NOFAIL and GFP_ATOMIC flags.
 
 "mm: split underused THPs" from Yu Zhao.  Improve THP=always policy - this
 was overprovisioning THPs in sparsely accessed memory areas.
 
 "zram: introduce custom comp backends API" frm Sergey Senozhatsky.  Add
 support for zram run-time compression algorithm tuning.
 
 "mm: Care about shadow stack guard gap when getting an unmapped area" from
 Mark Brown.  Fix up the various arch_get_unmapped_area() implementations
 to better respect guard areas.
 
 "Improve mem_cgroup_iter()" from Kinsey Ho.  Improve the reliability of
 mem_cgroup_iter() and various code cleanups.
 
 "mm: Support huge pfnmaps" from Peter Xu.  Extends the usage of huge
 pfnmap support.
 
 "resource: Fix region_intersects() vs add_memory_driver_managed()" from
 Huang Ying.  Fix a bug in region_intersects() for systems with CXL memory.
 
 "mm: hwpoison: two more poison recovery" from Kefeng Wang.  Teaches a
 couple more code paths to correctly recover from the encountering of
 poisoned memry.
 
 "mm: enable large folios swap-in support" from Barry Song.  Support the
 swapin of mTHP memory into appropriately-sized folios, rather than into
 single-page folios.
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCZu1BBwAKCRDdBJ7gKXxA
 jlWNAQDYlqQLun7bgsAN4sSvi27VUuWv1q70jlMXTfmjJAvQqwD/fBFVR6IOOiw7
 AkDbKWP2k0hWPiNJBGwoqxdHHx09Xgo=
 =s0T+
 -----END PGP SIGNATURE-----

Merge tag 'mm-stable-2024-09-20-02-31' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull MM updates from Andrew Morton:
 "Along with the usual shower of singleton patches, notable patch series
  in this pull request are:

   - "Align kvrealloc() with krealloc()" from Danilo Krummrich. Adds
     consistency to the APIs and behaviour of these two core allocation
     functions. This also simplifies/enables Rustification.

   - "Some cleanups for shmem" from Baolin Wang. No functional changes -
     mode code reuse, better function naming, logic simplifications.

   - "mm: some small page fault cleanups" from Josef Bacik. No
     functional changes - code cleanups only.

   - "Various memory tiering fixes" from Zi Yan. A small fix and a
     little cleanup.

   - "mm/swap: remove boilerplate" from Yu Zhao. Code cleanups and
     simplifications and .text shrinkage.

   - "Kernel stack usage histogram" from Pasha Tatashin and Shakeel
     Butt. This is a feature, it adds new feilds to /proc/vmstat such as

       $ grep kstack /proc/vmstat
       kstack_1k 3
       kstack_2k 188
       kstack_4k 11391
       kstack_8k 243
       kstack_16k 0

     which tells us that 11391 processes used 4k of stack while none at
     all used 16k. Useful for some system tuning things, but
     partivularly useful for "the dynamic kernel stack project".

   - "kmemleak: support for percpu memory leak detect" from Pavel
     Tikhomirov. Teaches kmemleak to detect leaksage of percpu memory.

   - "mm: memcg: page counters optimizations" from Roman Gushchin. "3
     independent small optimizations of page counters".

   - "mm: split PTE/PMD PT table Kconfig cleanups+clarifications" from
     David Hildenbrand. Improves PTE/PMD splitlock detection, makes
     powerpc/8xx work correctly by design rather than by accident.

   - "mm: remove arch_make_page_accessible()" from David Hildenbrand.
     Some folio conversions which make arch_make_page_accessible()
     unneeded.

   - "mm, memcg: cg2 memory{.swap,}.peak write handlers" fro David
     Finkel. Cleans up and fixes our handling of the resetting of the
     cgroup/process peak-memory-use detector.

   - "Make core VMA operations internal and testable" from Lorenzo
     Stoakes. Rationalizaion and encapsulation of the VMA manipulation
     APIs. With a view to better enable testing of the VMA functions,
     even from a userspace-only harness.

   - "mm: zswap: fixes for global shrinker" from Takero Funaki. Fix
     issues in the zswap global shrinker, resulting in improved
     performance.

   - "mm: print the promo watermark in zoneinfo" from Kaiyang Zhao. Fill
     in some missing info in /proc/zoneinfo.

   - "mm: replace follow_page() by folio_walk" from David Hildenbrand.
     Code cleanups and rationalizations (conversion to folio_walk())
     resulting in the removal of follow_page().

   - "improving dynamic zswap shrinker protection scheme" from Nhat
     Pham. Some tuning to improve zswap's dynamic shrinker. Significant
     reductions in swapin and improvements in performance are shown.

   - "mm: Fix several issues with unaccepted memory" from Kirill
     Shutemov. Improvements to the new unaccepted memory feature,

   - "mm/mprotect: Fix dax puds" from Peter Xu. Implements mprotect on
     DAX PUDs. This was missing, although nobody seems to have notied
     yet.

   - "Introduce a store type enum for the Maple tree" from Sidhartha
     Kumar. Cleanups and modest performance improvements for the maple
     tree library code.

   - "memcg: further decouple v1 code from v2" from Shakeel Butt. Move
     more cgroup v1 remnants away from the v2 memcg code.

   - "memcg: initiate deprecation of v1 features" from Shakeel Butt.
     Adds various warnings telling users that memcg v1 features are
     deprecated.

   - "mm: swap: mTHP swap allocator base on swap cluster order" from
     Chris Li. Greatly improves the success rate of the mTHP swap
     allocation.

   - "mm: introduce numa_memblks" from Mike Rapoport. Moves various
     disparate per-arch implementations of numa_memblk code into generic
     code.

   - "mm: batch free swaps for zap_pte_range()" from Barry Song. Greatly
     improves the performance of munmap() of swap-filled ptes.

   - "support large folio swap-out and swap-in for shmem" from Baolin
     Wang. With this series we no longer split shmem large folios into
     simgle-page folios when swapping out shmem.

   - "mm/hugetlb: alloc/free gigantic folios" from Yu Zhao. Nice
     performance improvements and code reductions for gigantic folios.

   - "support shmem mTHP collapse" from Baolin Wang. Adds support for
     khugepaged's collapsing of shmem mTHP folios.

   - "mm: Optimize mseal checks" from Pedro Falcato. Fixes an mprotect()
     performance regression due to the addition of mseal().

   - "Increase the number of bits available in page_type" from Matthew
     Wilcox. Increases the number of bits available in page_type!

   - "Simplify the page flags a little" from Matthew Wilcox. Many legacy
     page flags are now folio flags, so the page-based flags and their
     accessors/mutators can be removed.

   - "mm: store zero pages to be swapped out in a bitmap" from Usama
     Arif. An optimization which permits us to avoid writing/reading
     zero-filled zswap pages to backing store.

   - "Avoid MAP_FIXED gap exposure" from Liam Howlett. Fixes a race
     window which occurs when a MAP_FIXED operqtion is occurring during
     an unrelated vma tree walk.

   - "mm: remove vma_merge()" from Lorenzo Stoakes. Major rotorooting of
     the vma_merge() functionality, making ot cleaner, more testable and
     better tested.

   - "misc fixups for DAMON {self,kunit} tests" from SeongJae Park.
     Minor fixups of DAMON selftests and kunit tests.

   - "mm: memory_hotplug: improve do_migrate_range()" from Kefeng Wang.
     Code cleanups and folio conversions.

   - "Shmem mTHP controls and stats improvements" from Ryan Roberts.
     Cleanups for shmem controls and stats.

   - "mm: count the number of anonymous THPs per size" from Barry Song.
     Expose additional anon THP stats to userspace for improved tuning.

   - "mm: finish isolate/putback_lru_page()" from Kefeng Wang: more
     folio conversions and removal of now-unused page-based APIs.

   - "replace per-quota region priorities histogram buffer with
     per-context one" from SeongJae Park. DAMON histogram
     rationalization.

   - "Docs/damon: update GitHub repo URLs and maintainer-profile" from
     SeongJae Park. DAMON documentation updates.

   - "mm/vdpa: correct misuse of non-direct-reclaim __GFP_NOFAIL and
     improve related doc and warn" from Jason Wang: fixes usage of page
     allocator __GFP_NOFAIL and GFP_ATOMIC flags.

   - "mm: split underused THPs" from Yu Zhao. Improve THP=always policy.
     This was overprovisioning THPs in sparsely accessed memory areas.

   - "zram: introduce custom comp backends API" frm Sergey Senozhatsky.
     Add support for zram run-time compression algorithm tuning.

   - "mm: Care about shadow stack guard gap when getting an unmapped
     area" from Mark Brown. Fix up the various arch_get_unmapped_area()
     implementations to better respect guard areas.

   - "Improve mem_cgroup_iter()" from Kinsey Ho. Improve the reliability
     of mem_cgroup_iter() and various code cleanups.

   - "mm: Support huge pfnmaps" from Peter Xu. Extends the usage of huge
     pfnmap support.

   - "resource: Fix region_intersects() vs add_memory_driver_managed()"
     from Huang Ying. Fix a bug in region_intersects() for systems with
     CXL memory.

   - "mm: hwpoison: two more poison recovery" from Kefeng Wang. Teaches
     a couple more code paths to correctly recover from the encountering
     of poisoned memry.

   - "mm: enable large folios swap-in support" from Barry Song. Support
     the swapin of mTHP memory into appropriately-sized folios, rather
     than into single-page folios"

* tag 'mm-stable-2024-09-20-02-31' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (416 commits)
  zram: free secondary algorithms names
  uprobes: turn xol_area->pages[2] into xol_area->page
  uprobes: introduce the global struct vm_special_mapping xol_mapping
  Revert "uprobes: use vm_special_mapping close() functionality"
  mm: support large folios swap-in for sync io devices
  mm: add nr argument in mem_cgroup_swapin_uncharge_swap() helper to support large folios
  mm: fix swap_read_folio_zeromap() for large folios with partial zeromap
  mm/debug_vm_pgtable: Use pxdp_get() for accessing page table entries
  set_memory: add __must_check to generic stubs
  mm/vma: return the exact errno in vms_gather_munmap_vmas()
  memcg: cleanup with !CONFIG_MEMCG_V1
  mm/show_mem.c: report alloc tags in human readable units
  mm: support poison recovery from copy_present_page()
  mm: support poison recovery from do_cow_fault()
  resource, kunit: add test case for region_intersects()
  resource: make alloc_free_mem_region() works for iomem_resource
  mm: z3fold: deprecate CONFIG_Z3FOLD
  vfio/pci: implement huge_fault support
  mm/arm64: support large pfn mappings
  mm/x86: support large pfn mappings
  ...
2024-09-21 07:29:05 -07:00
Ming Lei
65f666c620 lib/sbitmap: define swap_lock as raw_spinlock_t
When called from sbitmap_queue_get(), sbitmap_deferred_clear() may be run
with preempt disabled. In RT kernel, spin_lock() can sleep, then warning
of "BUG: sleeping function called from invalid context" can be triggered.

Fix it by replacing it with raw_spin_lock.

Cc: Yang Yang <yang.yang@vivo.com>
Fixes: 72d04bdcf3 ("sbitmap: fix io hung due to race on sbitmap_word::cleared")
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Reviewed-by: Yang Yang <yang.yang@vivo.com>
Link: https://lore.kernel.org/r/20240919021709.511329-1-ming.lei@redhat.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2024-09-20 00:20:06 -06:00
Kris Van Hees
5f5e734432 kbuild: generate offset range data for builtin modules
Create file module.builtin.ranges that can be used to find where
built-in modules are located by their addresses. This will be useful for
tracing tools to find what functions are for various built-in modules.

The offset range data for builtin modules is generated using:
 - modules.builtin: associates object files with module names
 - vmlinux.map: provides load order of sections and offset of first member
    per section
 - vmlinux.o.map: provides offset of object file content per section
 - .*.cmd: build cmd file with KBUILD_MODFILE

The generated data will look like:

.text 00000000-00000000 = _text
.text 0000baf0-0000cb10 amd_uncore
.text 0009bd10-0009c8e0 iosf_mbi
...
.text 00b9f080-00ba011a intel_skl_int3472_discrete
.text 00ba0120-00ba03c0 intel_skl_int3472_discrete intel_skl_int3472_tps68470
.text 00ba03c0-00ba08d6 intel_skl_int3472_tps68470
...
.data 00000000-00000000 = _sdata
.data 0000f020-0000f680 amd_uncore

For each ELF section, it lists the offset of the first symbol.  This can
be used to determine the base address of the section at runtime.

Next, it lists (in strict ascending order) offset ranges in that section
that cover the symbols of one or more builtin modules.  Multiple ranges
can apply to a single module, and ranges can be shared between modules.

The CONFIG_BUILTIN_MODULE_RANGES option controls whether offset range data
is generated for kernel modules that are built into the kernel image.

How it works:

 1. The modules.builtin file is parsed to obtain a list of built-in
    module names and their associated object names (the .ko file that
    the module would be in if it were a loadable module, hereafter
    referred to as <kmodfile>).  This object name can be used to
    identify objects in the kernel compile because any C or assembler
    code that ends up into a built-in module will have the option
    -DKBUILD_MODFILE=<kmodfile> present in its build command, and those
    can be found in the .<obj>.cmd file in the kernel build tree.

    If an object is part of multiple modules, they will all be listed
    in the KBUILD_MODFILE option argument.

    This allows us to conclusively determine whether an object in the
    kernel build belong to any modules, and which.

 2. The vmlinux.map is parsed next to determine the base address of each
    top level section so that all addresses into the section can be
    turned into offsets.  This makes it possible to handle sections
    getting loaded at different addresses at system boot.

    We also determine an 'anchor' symbol at the beginning of each
    section to make it possible to calculate the true base address of
    a section at runtime (i.e. symbol address - symbol offset).

    We collect start addresses of sections that are included in the top
    level section.  This is used when vmlinux is linked using vmlinux.o,
    because in that case, we need to look at the vmlinux.o linker map to
    know what object a symbol is found in.

    And finally, we process each symbol that is listed in vmlinux.map
    (or vmlinux.o.map) based on the following structure:

    vmlinux linked from vmlinux.a:

      vmlinux.map:
        <top level section>
          <included section>  -- might be same as top level section)
            <object>          -- built-in association known
              <symbol>        -- belongs to module(s) object belongs to
              ...

    vmlinux linked from vmlinux.o:

      vmlinux.map:
        <top level section>
          <included section>  -- might be same as top level section)
            vmlinux.o         -- need to use vmlinux.o.map
              <symbol>        -- ignored
              ...

      vmlinux.o.map:
        <section>
            <object>          -- built-in association known
              <symbol>        -- belongs to module(s) object belongs to
              ...

 3. As sections, objects, and symbols are processed, offset ranges are
    constructed in a straight-forward way:

      - If the symbol belongs to one or more built-in modules:
          - If we were working on the same module(s), extend the range
            to include this object
          - If we were working on another module(s), close that range,
            and start the new one
      - If the symbol does not belong to any built-in modules:
          - If we were working on a module(s) range, close that range

Signed-off-by: Kris Van Hees <kris.van.hees@oracle.com>
Reviewed-by: Nick Alcock <nick.alcock@oracle.com>
Reviewed-by: Alan Maguire <alan.maguire@oracle.com>
Reviewed-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Tested-by: Sam James <sam@gentoo.org>
Reviewed-by: Sami Tolvanen <samitolvanen@google.com>
Tested-by: Sami Tolvanen <samitolvanen@google.com>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2024-09-20 09:21:43 +09:00
Linus Torvalds
4a39ac5b7d Random number generator updates for Linux 6.12-rc1.
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEq5lC5tSkz8NBJiCnSfxwEqXeA64FAmboHyUACgkQSfxwEqXe
 A66wGQ/8DRIjBllwf1YuTWi4T6OcfoYxK6C9bXO6QPP5gzdTyFE9pvDuuPyad6+F
 FR086ydTHeodemz1dFiQCL9etcUaxo4+6FRKyXKF9/1ezGbTA5nJd0/fKJGlqbI2
 EoA4LNYHOsvCZk1BTpxRNWKeKphU9zQgQdSigy6Rx8p269UkGmIZjD1PtUc+vqfR
 Ox0dK/Cswyo236fRi5HzaoMntWI4vXgLfxty0e1R7tfbstkCxSKWAON1lo3uHgkA
 0HpJXWgWXAPt9gp++Fs/jGNpOqbt6IaKeV5f7CjYfvWhlFjNMhQxF+PbxknaZn/k
 K0gQsItOIoFTfbQdLDIdfnj9awMdLW8FB2A1WXHpNr9pVC4ickPb1bMTF/XRd0tm
 wBNu4BL0gklx6017KZg5uINMIduzMLGkBLRFiBW0en/sZMLTJTMg58BJn0CL1Pmh
 1ll/Q3ToSMHalvxU2OnJagTwh4fzzCEpK/hW9WiDO4jSCsMXyX0clinrCjNo1JfA
 tqgTWEy3uGtg+dg0Du9VD5JASbNQSJ0ZRnas5+qz10IRWWfTolrsk61dliXLQ4Sv
 tSryDtsE2znwJF1Krh4aHNSSVhD5/l/8QaXkf9aZc/kkaHxwsx83FuWnqw6nMz8c
 l4B2MbH0jUgsEqEyx+0iwk+FXE9kZKWumTVLjFZ6bRnq3q+uq0U=
 =mWCw
 -----END PGP SIGNATURE-----

Merge tag 'random-6.12-rc1-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/crng/random

Pull random number generator updates from Jason Donenfeld:
 "Originally I'd planned on sending each of the vDSO getrandom()
  architecture ports to their respective arch trees. But as we started
  to work on this, we found lots of interesting issues in the shared
  code and infrastructure, the fixes for which the various archs needed
  to base their work.

  So in the end, this turned into a nice collaborative effort fixing up
  issues and porting to 5 new architectures -- arm64, powerpc64,
  powerpc32, s390x, and loongarch64 -- with everybody pitching in and
  commenting on each other's code. It was a fun development cycle.

  This contains:

   - Numerous fixups to the vDSO selftest infrastructure, getting it
     running successfully on more platforms, and fixing bugs in it.

   - Additions to the vDSO getrandom & chacha selftests. Basically every
     time manual review unearthed a bug in a revision of an arch patch,
     or an ambiguity, the tests were augmented.

     By the time the last arch was submitted for review, s390x, v1 of
     the series was essentially fine right out of the gate.

   - Fixes to the the generic C implementation of vDSO getrandom, to
     build and run successfully on all archs, decoupling it from
     assumptions we had (unintentionally) made on x86_64 that didn't
     carry through to the other architectures.

   - Port of vDSO getrandom to LoongArch64, from Xi Ruoyao and acked by
     Huacai Chen.

   - Port of vDSO getrandom to ARM64, from Adhemerval Zanella and acked
     by Will Deacon.

   - Port of vDSO getrandom to PowerPC, in both 32-bit and 64-bit
     varieties, from Christophe Leroy and acked by Michael Ellerman.

   - Port of vDSO getrandom to S390X from Heiko Carstens, the arch
     maintainer.

  While it'd be natural for there to be things to fix up over the course
  of the development cycle, these patches got a decent amount of review
  from a fairly diverse crew of folks on the mailing lists, and, for the
  most part, they've been cooking in linux-next, which has been helpful
  for ironing out build issues.

  In terms of architectures, I think that mostly takes care of the
  important 64-bit archs with hardware still being produced and running
  production loads in settings where vDSO getrandom is likely to help.

  Arguably there's still RISC-V left, and we'll see for 6.13 whether
  they find it useful and submit a port"

* tag 'random-6.12-rc1-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/crng/random: (47 commits)
  selftests: vDSO: check cpu caps before running chacha test
  s390/vdso: Wire up getrandom() vdso implementation
  s390/vdso: Move vdso symbol handling to separate header file
  s390/vdso: Allow alternatives in vdso code
  s390/module: Provide find_section() helper
  s390/facility: Let test_facility() generate static branch if possible
  s390/alternatives: Remove ALT_FACILITY_EARLY
  s390/facility: Disable compile time optimization for decompressor code
  selftests: vDSO: fix vdso_config for s390
  selftests: vDSO: fix ELF hash table entry size for s390x
  powerpc/vdso: Wire up getrandom() vDSO implementation on VDSO64
  powerpc/vdso: Wire up getrandom() vDSO implementation on VDSO32
  powerpc/vdso: Refactor CFLAGS for CVDSO build
  powerpc/vdso32: Add crtsavres
  mm: Define VM_DROPPABLE for powerpc/32
  powerpc/vdso: Fix VDSO data access when running in a non-root time namespace
  selftests: vDSO: don't include generated headers for chacha test
  arm64: vDSO: Wire up getrandom() vDSO implementation
  arm64: alternative: make alternative_has_cap_likely() VDSO compatible
  selftests: vDSO: also test counter in vdso_test_chacha
  ...
2024-09-18 15:26:31 +02:00
Linus Torvalds
39b3f4e0db hardening updates for v6.12-rc1
- lib/string_choices: Add str_up_down() helper (Michal Wajdeczko)
 
 - lib/string_choices: Add str_true_false()/str_false_true() helper
   (Hongbo Li)
 
 - lib/string_choices: Introduce several opposite string choice helpers
   (Hongbo Li)
 
 - lib/string_helpers: rework overflow-dependent code (Justin Stitt)
 
 - fortify: refactor test_fortify Makefile to fix some build problems
   (Masahiro Yamada)
 
 - string: Check for "nonstring" attribute on strscpy() arguments
 
 - virt: vbox: Replace 1-element arrays with flexible arrays
 
 - media: venus: hfi_cmds: Replace 1-element arrays with flexible arrays
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQRSPkdeREjth1dHnSE2KwveOeQkuwUCZufwawAKCRA2KwveOeQk
 u3n9AQCI8G1FSMFSa8MKSSwTo600dHbZGavJd33fl2VrV7KCvQD8CMPRC/itOIVI
 PXcGo9tekW+zAOOw+v47QorpxHGd1w4=
 =jSSr
 -----END PGP SIGNATURE-----

Merge tag 'hardening-v6.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux

Pull hardening updates from Kees Cook:

 - lib/string_choices:
    - Add str_up_down() helper (Michal Wajdeczko)
    - Add str_true_false()/str_false_true() helper  (Hongbo Li)
    - Introduce several opposite string choice helpers  (Hongbo Li)

 - lib/string_helpers:
    - rework overflow-dependent code (Justin Stitt)

 - fortify: refactor test_fortify Makefile to fix some build problems
   (Masahiro Yamada)

 - string: Check for "nonstring" attribute on strscpy() arguments

 - virt: vbox: Replace 1-element arrays with flexible arrays

 - media: venus: hfi_cmds: Replace 1-element arrays with flexible arrays

* tag 'hardening-v6.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
  lib/string_choices: Add some comments to make more clear for string choices helpers.
  lib/string_choices: Introduce several opposite string choice helpers
  lib/string_choices: Add str_true_false()/str_false_true() helper
  string: Check for "nonstring" attribute on strscpy() arguments
  media: venus: hfi_cmds: struct hfi_session_release_buffer_pkt: Add __counted_by annotation
  media: venus: hfi_cmds: struct hfi_session_release_buffer_pkt: Replace 1-element array with flexible array
  virt: vbox: struct vmmdev_hgcm_pagelist: Replace 1-element array with flexible array
  lib/string_helpers: rework overflow-dependent code
  coccinelle: Add rules to find str_down_up() replacements
  string_choices: Add wrapper for str_down_up()
  coccinelle: Add rules to find str_up_down() replacements
  lib/string_choices: Add str_up_down() helper
  fortify: use if_changed_dep to record header dependency in *.cmd files
  fortify: move test_fortify.sh to lib/test_fortify/
  fortify: refactor test_fortify Makefile to fix some build problems
2024-09-18 12:12:41 +02:00
Linus Torvalds
bdf56c7580 slab updates for 6.12
-----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCAAdFiEEe7vIQRWZI0iWSE3xu+CwddJFiJoFAmbn5g0ACgkQu+CwddJF
 iJq+Uwf/aqnLNEpjUBzwUUhSojCpPnTtiyjv+AILTxoSTHmbu8OvN0W79+Rpbdmk
 O4QapAK+BCs+VL2VATwCCufcJ75Z78txO+buQE0DgwluFTIYZ+IwpUMPsK04ln6A
 FD1/uvP1QFx60heqcp2c4zWFBUpg4DE6ufx2A5kieO268lFcWLxyVlcdgRU79ZCt
 uAcV2yDLk3GvPGfxZwPKEmZUo/FmuSoBv0XgT+eWxmTu/R7hcpFse49OyjBH8Tvb
 8d/RCIFgXOr8dTIjtds7eenwB/is4TkRlctezEQ0jO9/JwL/BVOgXZjD1qCtNWqz
 is4TWK7VV+vdq1RD+0xC2hV/+uGEwQ==
 =+WAm
 -----END PGP SIGNATURE-----

Merge tag 'slab-for-6.12' of git://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab

Pull slab updates from Vlastimil Babka:
 "This time it's mostly refactoring and improving APIs for slab users in
  the kernel, along with some debugging improvements.

   - kmem_cache_create() refactoring (Christian Brauner)

     Over the years have been growing new parameters to
     kmem_cache_create() where most of them are needed only for a small
     number of caches - most recently the rcu_freeptr_offset parameter.

     To avoid adding new parameters to kmem_cache_create() and adjusting
     all its callers, or creating new wrappers such as
     kmem_cache_create_rcu(), we can now pass extra parameters using the
     new struct kmem_cache_args. Not explicitly initialized fields
     default to values interpreted as unused.

     kmem_cache_create() is for now a wrapper that works both with the
     new form: kmem_cache_create(name, object_size, args, flags) and the
     legacy form: kmem_cache_create(name, object_size, align, flags,
     ctor)

   - kmem_cache_destroy() waits for kfree_rcu()'s in flight (Vlastimil
     Babka, Uladislau Rezki)

     Since SLOB removal, kfree() is allowed for freeing objects
     allocated by kmem_cache_create(). By extension kfree_rcu() as
     allowed as well, which can allow converting simple call_rcu()
     callbacks that only do kmem_cache_free(), as there was never a
     kmem_cache_free_rcu() variant. However, for caches that can be
     destroyed e.g. on module removal, the cache owners knew to issue
     rcu_barrier() first to wait for the pending call_rcu()'s, and this
     is not sufficient for pending kfree_rcu()'s due to its internal
     batching optimizations. Ulad has provided a new
     kvfree_rcu_barrier() and to make the usage less error-prone,
     kmem_cache_destroy() calls it. Additionally, destroying
     SLAB_TYPESAFE_BY_RCU caches now again issues rcu_barrier()
     synchronously instead of using an async work, because the past
     motivation for async work no longer applies. Users of custom
     call_rcu() callbacks should however keep calling rcu_barrier()
     before cache destruction.

   - Debugging use-after-free in SLAB_TYPESAFE_BY_RCU caches (Jann Horn)

     Currently, KASAN cannot catch UAFs in such caches as it is legal to
     access them within a grace period, and we only track the grace
     period when trying to free the underlying slab page. The new
     CONFIG_SLUB_RCU_DEBUG option changes the freeing of individual
     object to be RCU-delayed, after which KASAN can poison them.

   - Delayed memcg charging (Shakeel Butt)

     In some cases, the memcg is uknown at allocation time, such as
     receiving network packets in softirq context. With
     kmem_cache_charge() these may be now charged later when the user
     and its memcg is known.

   - Misc fixes and improvements (Pedro Falcato, Axel Rasmussen,
     Christoph Lameter, Yan Zhen, Peng Fan, Xavier)"

* tag 'slab-for-6.12' of git://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab: (34 commits)
  mm, slab: restore kerneldoc for kmem_cache_create()
  io_uring: port to struct kmem_cache_args
  slab: make __kmem_cache_create() static inline
  slab: make kmem_cache_create_usercopy() static inline
  slab: remove kmem_cache_create_rcu()
  file: port to struct kmem_cache_args
  slab: create kmem_cache_create() compatibility layer
  slab: port KMEM_CACHE_USERCOPY() to struct kmem_cache_args
  slab: port KMEM_CACHE() to struct kmem_cache_args
  slab: remove rcu_freeptr_offset from struct kmem_cache
  slab: pass struct kmem_cache_args to do_kmem_cache_create()
  slab: pull kmem_cache_open() into do_kmem_cache_create()
  slab: pass struct kmem_cache_args to create_cache()
  slab: port kmem_cache_create_usercopy() to struct kmem_cache_args
  slab: port kmem_cache_create_rcu() to struct kmem_cache_args
  slab: port kmem_cache_create() to struct kmem_cache_args
  slab: add struct kmem_cache_args
  slab: s/__kmem_cache_create/do_kmem_cache_create/g
  memcg: add charging of already allocated slab objects
  mm/slab: Optimize the code logic in find_mergeable()
  ...
2024-09-18 08:53:53 +02:00
Linus Torvalds
067610ebaa RCU pull request for v6.12
This pull request contains the following branches:
 
 context_tracking.15.08.24a: Rename context tracking state related
         symbols and remove references to "dynticks" in various context
         tracking state variables and related helpers; force
         context_tracking_enabled_this_cpu() to be inlined to avoid
         leaving a noinstr section.
 
 csd.lock.15.08.24a: Enhance CSD-lock diagnostic reports; add an API
         to provide an indication of ongoing CSD-lock stall.
 
 nocb.09.09.24a: Update and simplify RCU nocb code to handle
         (de-)offloading of callbacks only for offline CPUs; fix RT
         throttling hrtimer being armed from offline CPU.
 
 rcutorture.14.08.24a: Remove redundant rcu_torture_ops get_gp_completed
         fields; add SRCU ->same_gp_state and ->get_comp_state
         functions; add generic test for NUM_ACTIVE_*RCU_POLL* for
         testing RCU and SRCU polled grace periods; add CFcommon.arch
         for arch-specific Kconfig options; print number of update types
         in rcu_torture_write_types();
         add rcutree.nohz_full_patience_delay testing to the TREE07
         scenario; add a stall_cpu_repeat module parameter to test
         repeated CPU stalls; add argument to limit number of CPUs a
         guest OS can use in torture.sh;
 
 rcustall.09.09.24a: Abbreviate RCU CPU stall warnings during CSD-lock
         stalls; Allow dump_cpu_task() to be called without disabling
         preemption; defer printing stall-warning backtrace when holding
         rcu_node lock.
 
 srcu.12.08.24a: Make SRCU gp seq wrap-around faster; add KCSAN checks
         for concurrent updates to ->srcu_n_exp_nodelay and
         ->reschedule_count which are used in heuristics governing
         auto-expediting of normal SRCU grace periods and
         grace-period-state-machine delays; mark idle SRCU-barrier
         callbacks to help identify stuck SRCU-barrier callback.
 
 rcu.tasks.14.08.24a: Remove RCU Tasks Rude asynchronous APIs as they
         are no longer used; stop testing RCU Tasks Rude asynchronous
         APIs; fix access to non-existent percpu regions; check
         processor-ID assumptions during chosen CPU calculation for
         callback enqueuing; update description of rtp->tasks_gp_seq
         grace-period sequence number; add rcu_barrier_cb_is_done()
         to identify whether a given rcu_barrier callback is stuck;
         mark idle Tasks-RCU-barrier callbacks; add
         *torture_stats_print() functions to print detailed
         diagnostics for Tasks-RCU variants; capture start time of
         rcu_barrier_tasks*() operation to help distinguish a hung
         barrier operation from a long series of barrier operations.
 
 rcu_scaling_tests.15.08.24a:
         refscale: Add a TINY scenario to support tests of Tiny RCU
         and Tiny SRCU; Optimize process_durations() operation;
 
         rcuscale: Dump stacks of stalled rcu_scale_writer() instances;
         dump grace-period statistics when rcu_scale_writer() stalls;
         mark idle RCU-barrier callbacks to identify stuck RCU-barrier
         callbacks; print detailed grace-period and barrier diagnostics
         on rcu_scale_writer() hangs for Tasks-RCU variants; warn if
         async module parameter is specified for RCU implementations
         that do not have async primitives such as RCU Tasks Rude;
         make all writer tasks report upon hang; tolerate repeated
         GFP_KERNEL failure in rcu_scale_writer(); use special allocator
         for rcu_scale_writer(); NULL out top-level pointers to heap
         memory to avoid double-free bugs on modprobe failures; maintain
         per-task instead of per-CPU callbacks count to avoid any issues
         with migration of either tasks or callbacks; constify struct
         ref_scale_ops.
 
 fixes.12.08.24a: Use system_unbound_wq for kfree_rcu work to avoid
         disturbing isolated CPUs.
 
 misc.11.08.24a: Warn on unexpected rcu_state.srs_done_tail state;
         Better define "atomic" for list_replace_rcu() and
         hlist_replace_rcu() routines; annotate struct
         kvfree_rcu_bulk_data with __counted_by().
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQSi2tPIQIc2VEtjarIAHS7/6Z0wpQUCZt8+8wAKCRAAHS7/6Z0w
 pTqoAPwPN//tlEoJx2PRs6t0q+nD1YNvnZawPaRmdzgdM8zJogD+PiSN+XhqRr80
 jzyvMDU4Aa0wjUNP3XsCoaCxo7L/lQk=
 =bZ9z
 -----END PGP SIGNATURE-----

Merge tag 'rcu.release.v6.12' of git://git.kernel.org/pub/scm/linux/kernel/git/rcu/linux

Pull RCU updates from Neeraj Upadhyay:
 "Context tracking:
   - rename context tracking state related symbols and remove references
     to "dynticks" in various context tracking state variables and
     related helpers
   - force context_tracking_enabled_this_cpu() to be inlined to avoid
     leaving a noinstr section

  CSD lock:
   - enhance CSD-lock diagnostic reports
   - add an API to provide an indication of ongoing CSD-lock stall

  nocb:
   - update and simplify RCU nocb code to handle (de-)offloading of
     callbacks only for offline CPUs
   - fix RT throttling hrtimer being armed from offline CPU

  rcutorture:
   - remove redundant rcu_torture_ops get_gp_completed fields
   - add SRCU ->same_gp_state and ->get_comp_state functions
   - add generic test for NUM_ACTIVE_*RCU_POLL* for testing RCU and SRCU
     polled grace periods
   - add CFcommon.arch for arch-specific Kconfig options
   - print number of update types in rcu_torture_write_types()
   - add rcutree.nohz_full_patience_delay testing to the TREE07 scenario
   - add a stall_cpu_repeat module parameter to test repeated CPU stalls
   - add argument to limit number of CPUs a guest OS can use in
     torture.sh

  rcustall:
   - abbreviate RCU CPU stall warnings during CSD-lock stalls
   - Allow dump_cpu_task() to be called without disabling preemption
   - defer printing stall-warning backtrace when holding rcu_node lock

  srcu:
   - make SRCU gp seq wrap-around faster
   - add KCSAN checks for concurrent updates to ->srcu_n_exp_nodelay and
     ->reschedule_count which are used in heuristics governing
     auto-expediting of normal SRCU grace periods and
     grace-period-state-machine delays
   - mark idle SRCU-barrier callbacks to help identify stuck
     SRCU-barrier callback

  rcu tasks:
   - remove RCU Tasks Rude asynchronous APIs as they are no longer used
   - stop testing RCU Tasks Rude asynchronous APIs
   - fix access to non-existent percpu regions
   - check processor-ID assumptions during chosen CPU calculation for
     callback enqueuing
   - update description of rtp->tasks_gp_seq grace-period sequence
     number
   - add rcu_barrier_cb_is_done() to identify whether a given
     rcu_barrier callback is stuck
   - mark idle Tasks-RCU-barrier callbacks
   - add *torture_stats_print() functions to print detailed diagnostics
     for Tasks-RCU variants
   - capture start time of rcu_barrier_tasks*() operation to help
     distinguish a hung barrier operation from a long series of barrier
     operations

  refscale:
   - add a TINY scenario to support tests of Tiny RCU and Tiny
     SRCU
   - optimize process_durations() operation

  rcuscale:
   - dump stacks of stalled rcu_scale_writer() instances and
     grace-period statistics when rcu_scale_writer() stalls
   - mark idle RCU-barrier callbacks to identify stuck RCU-barrier
     callbacks
   - print detailed grace-period and barrier diagnostics on
     rcu_scale_writer() hangs for Tasks-RCU variants
   - warn if async module parameter is specified for RCU implementations
     that do not have async primitives such as RCU Tasks Rude
   - make all writer tasks report upon hang
   - tolerate repeated GFP_KERNEL failure in rcu_scale_writer()
   - use special allocator for rcu_scale_writer()
   - NULL out top-level pointers to heap memory to avoid double-free
     bugs on modprobe failures
   - maintain per-task instead of per-CPU callbacks count to avoid any
     issues with migration of either tasks or callbacks
   - constify struct ref_scale_ops

  Fixes:
   - use system_unbound_wq for kfree_rcu work to avoid disturbing
     isolated CPUs

  Misc:
   - warn on unexpected rcu_state.srs_done_tail state
   - better define "atomic" for list_replace_rcu() and
     hlist_replace_rcu() routines
   - annotate struct kvfree_rcu_bulk_data with __counted_by()"

* tag 'rcu.release.v6.12' of git://git.kernel.org/pub/scm/linux/kernel/git/rcu/linux: (90 commits)
  rcu: Defer printing stall-warning backtrace when holding rcu_node lock
  rcu/nocb: Remove superfluous memory barrier after bypass enqueue
  rcu/nocb: Conditionally wake up rcuo if not already waiting on GP
  rcu/nocb: Fix RT throttling hrtimer armed from offline CPU
  rcu/nocb: Simplify (de-)offloading state machine
  context_tracking: Tag context_tracking_enabled_this_cpu() __always_inline
  context_tracking, rcu: Rename rcu_dyntick trace event into rcu_watching
  rcu: Update stray documentation references to rcu_dynticks_eqs_{enter, exit}()
  rcu: Rename rcu_momentary_dyntick_idle() into rcu_momentary_eqs()
  rcu: Rename rcu_implicit_dynticks_qs() into rcu_watching_snap_recheck()
  rcu: Rename dyntick_save_progress_counter() into rcu_watching_snap_save()
  rcu: Rename struct rcu_data .exp_dynticks_snap into .exp_watching_snap
  rcu: Rename struct rcu_data .dynticks_snap into .watching_snap
  rcu: Rename rcu_dynticks_zero_in_eqs() into rcu_watching_zero_in_eqs()
  rcu: Rename rcu_dynticks_in_eqs_since() into rcu_watching_snap_stopped_since()
  rcu: Rename rcu_dynticks_in_eqs() into rcu_watching_snap_in_eqs()
  rcu: Rename rcu_dynticks_eqs_online() into rcu_watching_online()
  context_tracking, rcu: Rename rcu_dynticks_curr_cpu_in_eqs() into rcu_is_watching_curr_cpu()
  context_tracking, rcu: Rename rcu_dynticks_task*() into rcu_task*()
  refscale: Constify struct ref_scale_ops
  ...
2024-09-18 07:52:24 +02:00
Linus Torvalds
78567e2bc7 cgroup: Changes for v6.12
- cpuset isolation improvements.
 
 - cpuset cgroup1 support is split into its own file behind the new config
   option CONFIG_CPUSET_V1. This makes it the second controller which makes
   cgroup1 support optional after memcg.
 
 - Handling of unavailable v1 controller handling improved during cgroup1
   mount operations.
 
 - union_find applied to cpuset. It makes code simpler and more efficient.
 
 - Reduce spurious events in pids.events.
 
 - Cleanups and other misc changes.
 
 - Contains a merge of cgroup/for-6.11-fixes to receive cpuset fixes that
   further changes build upon.
 -----BEGIN PGP SIGNATURE-----
 
 iIQEABYKACwWIQTfIjM1kS57o3GsC/uxYfJx3gVYGQUCZuNU3Q4cdGpAa2VybmVs
 Lm9yZwAKCRCxYfJx3gVYGdMsAP9yqPxu//LiJ3lPWhKcVVKtdwrA3AYDLE81VSJO
 5VZJhAD+Ic+Ly/jZjDtjjQpZ1U3JsBpBRcVBqzeH0gD7eXaJgwk=
 =h/+c
 -----END PGP SIGNATURE-----

Merge tag 'cgroup-for-6.12' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup

Pull cgroup updates from Tejun Heo:

 - cpuset isolation improvements

 - cpuset cgroup1 support is split into its own file behind the new
   config option CONFIG_CPUSET_V1. This makes it the second controller
   which makes cgroup1 support optional after memcg

 - Handling of unavailable v1 controller handling improved during
   cgroup1 mount operations

 - union_find applied to cpuset. It makes code simpler and more
   efficient

 - Reduce spurious events in pids.events

 - Cleanups and other misc changes

 - Contains a merge of cgroup/for-6.11-fixes to receive cpuset fixes
   that further changes build upon

* tag 'cgroup-for-6.12' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup: (34 commits)
  cgroup: Do not report unavailable v1 controllers in /proc/cgroups
  cgroup: Disallow mounting v1 hierarchies without controller implementation
  cgroup/cpuset: Expose cpuset filesystem with cpuset v1 only
  cgroup/cpuset: Move cpu.h include to cpuset-internal.h
  cgroup/cpuset: add sefltest for cpuset v1
  cgroup/cpuset: guard cpuset-v1 code under CONFIG_CPUSETS_V1
  cgroup/cpuset: rename functions shared between v1 and v2
  cgroup/cpuset: move v1 interfaces to cpuset-v1.c
  cgroup/cpuset: move validate_change_legacy to cpuset-v1.c
  cgroup/cpuset: move legacy hotplug update to cpuset-v1.c
  cgroup/cpuset: add callback_lock helper
  cgroup/cpuset: move memory_spread to cpuset-v1.c
  cgroup/cpuset: move relax_domain_level to cpuset-v1.c
  cgroup/cpuset: move memory_pressure to cpuset-v1.c
  cgroup/cpuset: move common code to cpuset-internal.h
  cgroup/cpuset: introduce cpuset-v1.c
  selftest/cgroup: Make test_cpuset_prs.sh deal with pre-isolated CPUs
  cgroup/cpuset: Account for boot time isolated CPUs
  cgroup/cpuset: remove use_parent_ecpus of cpuset
  cgroup/cpuset: remove fetch_xcpus
  ...
2024-09-18 06:39:03 +02:00
Linus Torvalds
194fcd20eb linux_kselftest-kunit-6.12-rc1
This kunit update for Linux 6.12-rc1 consists of:
 
 -- a new int_pow test suite
 -- documentation update to clarify filename best practices
 -- kernel-doc fix for EXPORT_SYMBOL_IF_KUNIT
 -- change to build compile_commands.json automatically instead
    of requiring a manual build.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEPZKym/RZuOCGeA/kCwJExA0NQxwFAmbo3WEACgkQCwJExA0N
 Qxz1WxAAj+772NHxsJ4JnPqr/74doKnzKc1jM2V4g/F9Y+BT0tSKs1Cu5CyN9VsT
 wvxVPWqYltyhumVm/H6SaUGb0yZ7CzJi/5FuT3p3QFUDidMSu1h9KnlLi79q3cDI
 VuFKE8K4DDP0GfyFMpbSPZOGfYQp24FybhxRxreY+7q6uRVAnPh33Q1/Bonv6K6q
 5329a0z9wWySgisa93ABmQNpF4UJSYunR2bsdUzZqHgyrTXSyK66fcmVKwbBUaIT
 o16P1LBjDcIbfwswFb+xUmWD1IPGk7ulirEq8n69tErI6zKbkv1rojXHsoXuvOEN
 a4i+sNyR+a7NVI1h/T8F25pSbegkL0XQs7cmehATqpInmEZNDeGR8PkaGZNXXrFy
 kG/z7LlWh8zQUBrTsqOLU/iz4sRVrsPCuLIUzo8MiKpAskmj/7fqw5Cab9jmL5V3
 6OLAfCQDrfcH7fM9V5U6Ury2dkcovFuw+ZhFcBuLnspB5z0Cj7Yqz6aDZdJ97qyR
 PfZuyBU2ouykhpJ4P/sRJC3Gq1t0b+PoDq3qNdCqz4ETld1jaiDz0e75ypquJWyB
 QdVMNJF6W7Nwnmpzp4GY9QZ6dtwOKGZyuvW5J0eleWKiD4gjHZaoupIzqT24fgYi
 vdscbcOxMMU3/b9F4qDlgsLSPCLVF4HIXTAK2UdiznLdaxYVHQ0=
 =rmqh
 -----END PGP SIGNATURE-----

Merge tag 'linux_kselftest-kunit-6.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest

Pull kunit updates from Shuah Khan:

 - a new int_pow test suite

 - documentation update to clarify filename best practices

 - kernel-doc fix for EXPORT_SYMBOL_IF_KUNIT

 - change to build compile_commands.json automatically instead of
   requiring a manual build

* tag 'linux_kselftest-kunit-6.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
  lib/math: Add int_pow test suite
  kunit: tool: Build compile_commands.json
  kunit: Fix kernel-doc for EXPORT_SYMBOL_IF_KUNIT
  Documentation: KUnit: Update filename best practices
2024-09-17 16:52:24 +02:00
Linus Torvalds
dea435d397 Enable UBSAN traps for x86, which provides better reporting through
metadata encodeded into UD1.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAmbpM6ITHHRnbHhAbGlu
 dXRyb25peC5kZQAKCRCmGPVMDXSYoU/kEACWS7Z9mQrWB3r22ufTTPoN+hNudth+
 CP8wluXZGvLPh1Pq9dpB9ZniBUN8levYoGyj3NTdr6VtoMJ6NYcZVuH98lCCEMXO
 1UmDpydSGZ3BqVgmf4h0eYAJgEiA5qTflXMsh6SfsaPQR7jniJTE451hgJdRIogG
 DvgWeVTYn5vt0+oRHJp6ogRLR9oOUgdp94fIwaW34OpesbVJeWUW9zAvBcqdNrDT
 KJIM7ta6eivEakFRxriQZTKRc+3ElvZ2fdWNdo9qrRd64MTIOTXAj3G0lXt3YtpZ
 06pfJ1CfQ+nwHKfxmmy4gz4eJG7KcpMM+KFZTR3NoSAz4oMTzAvVTxAuEt+pahx6
 bmLzaY/I/gRB/Rt+e5oEZSEIq+Sh/Lm3IZoQUhK0+HeJBjwPghBZw3BjkFJvEsMw
 S0arvklH2x37gP9rnzOODf2QG7aIAqLTrvRJS610fctwadR4k+2UIE8ZGHOTt55J
 UdiK/QhU4gMVaRTebTcPquu3IMmnJjla/bEWdIrBtOSiGtVd1BnAp/kvmkdQH3eI
 ZUqJbnfofN4rzSufFqSVY88ORVIcQMnNDLM0qyJofIC79u7OiU40icoDxWS6mDHQ
 wQSEszInhwNzyAxoHnNkXDunjDVKhATQPOde0F4TxLcrYD9KRpvJag/1j5fCQi+0
 ftODZflfGS2UjQ==
 =Z5Hg
 -----END PGP SIGNATURE-----

Merge tag 'x86-core-2024-09-17' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 core update from Thomas Gleixner:
 "Enable UBSAN traps for x86, which provides better reporting through
  metadata encodeded into UD1"

* tag 'x86-core-2024-09-17' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/traps: Enable UBSAN traps on x86
2024-09-17 13:17:27 +02:00
Linus Torvalds
5ba202a7c9 Updates for KCOV instrumentation on x86:
- Prevent spurious KCOV coverage in common_interrupt()
 
   - Fixup the KCOV Makefile directive which got stale due to a source file
     rename
 
   - Exclude stack unwinding from KCOV as it creates large amounts of
     uninteresting coverage
 
   - Provide a self test to validate that KCOV coverage of the interrupt
     handling code starts not before preempt count got updated.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAmbpMeITHHRnbHhAbGlu
 dXRyb25peC5kZQAKCRCmGPVMDXSYoaOeD/4oO3g0soK0LIcDIwzaG0ap0hx0nucw
 aVSAESuY+ZaSbRbV0fNoYdHORvLdErs67SeyeJRSxTzSNqGH2dGoFrfbkRSXq951
 RdCSPP60T7xgqAme1YLDiChfXt/gkbWk/8V5Q7sG3oq3GaVcPUyZgPo4M4HQMdfg
 Mla3VPikW5Np3fvs0IZYWQ5VdY0fFOHY5JGMhKJznJxf+Ud+VAtxsbJUcO4MEYWW
 A9CVJNHGEXssGA6vm5kgtLu6n2QFuoSj6En/WqLEaJb8f/V332e04Xj2ZHUaOOjV
 2abVeDovv+dwUYb4SgrGVg9gfEwwcLPDnmOuuQJmQBB5kU4mJsCqI5TTS6c1fgU4
 x8tQsGSOKHFQAI14ZWtitrL4rS2uFcBkAFXo0dF8J5o4989RA8cpfeWVSVUb/UXd
 u38BWpc9iHiihHKMmMQgsa1bUMwdSUTvN5XFHkeP4oqUdMiEiWn8iM5+zXd/lfTs
 9mrTv+kcLA7mjFOmn4JyE2b+NuiPdgS2FCBGLycHvGwvJoJlO2UmSpF89AJ5vdKs
 F8vWLkV+gno/HtwS5o949cAwjYiCodfc7u1W0xj2VDAbx0RbaBw1SDhXMQcLxLgn
 BTt4yHKKIeLX++WH3fpeyL91+UJWubUzNzY4rAmLkz5DedWAkpES+45fatp1buIz
 Lp/hGiIsG9p5xw==
 =tiXT
 -----END PGP SIGNATURE-----

Merge tag 'x86-build-2024-09-17' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 build updates from Thomas Gleixner:
 "Updates for KCOV instrumentation on x86:

   - Prevent spurious KCOV coverage in common_interrupt()

   - Fixup the KCOV Makefile directive which got stale due to a source
     file rename

   - Exclude stack unwinding from KCOV as it creates large amounts of
     uninteresting coverage

   - Provide a self test to validate that KCOV coverage of the interrupt
     handling code starts not before preempt count got updated"

* tag 'x86-build-2024-09-17' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86: Ignore stack unwinding in KCOV
  module: Fix KCOV-ignored file name
  kcov: Add interrupt handling self test
  x86/entry: Remove unwanted instrumentation in common_interrupt()
2024-09-17 12:40:34 +02:00
I Hsin Cheng
5e06e08939 list: test: increase coverage of list_test_list_replace*()
Increase the test coverage of list_test_list_replace*() by adding the
checks to compare the pointer of "a_new.next" and "a_new.prev" to make
sure a perfect circular doubly linked list is formed after the
replacement.

Link: https://lkml.kernel.org/r/20240910040818.65723-1-richard120310@gmail.com
Signed-off-by: I Hsin Cheng <richard120310@gmail.com>
Cc: David Gow <davidgow@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-09-17 01:11:20 -07:00
I Hsin Cheng
e620799c41 list: test: fix tests for list_cut_position()
Fix test for list_cut_position*() for the missing check of integer "i"
after the second loop.  The variable should be checked for second time to
make sure both lists after the cut operation are formed as expected.

Link: https://lkml.kernel.org/r/20240910043531.71343-1-richard120310@gmail.com
Signed-off-by: I Hsin Cheng <richard120310@gmail.com>
Cc: David Gow <davidgow@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-09-17 01:11:20 -07:00
Huang Ying
99185c10d5 resource, kunit: add test case for region_intersects()
Patch series "resource: Fix region_intersects() vs
add_memory_driver_managed()", v3.

The patchset fixes a bug of region_intersects() for systems with CXL
memory.  The details of the bug can be found in [1/3].  To avoid similar
bugs in the future.  A kunit test case for region_intersects() is added in
[3/3].  [2/3] is a preparation patch for [3/3].


This patch (of 3):

region_intersects() is important because it's used for /dev/mem permission
checking.  To avoid possible bug of region_intersects() in the future, a
kunit test case for region_intersects() is added.

Link: https://lkml.kernel.org/r/20240906030713.204292-1-ying.huang@intel.com
Link: https://lkml.kernel.org/r/20240906030713.204292-4-ying.huang@intel.com
Signed-off-by: "Huang, Ying" <ying.huang@intel.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Jonathan Cameron <jonathan.cameron@huawei.com>
Cc: Dave Jiang <dave.jiang@intel.com>
Cc: Alison Schofield <alison.schofield@intel.com>
Cc: Vishal Verma <vishal.l.verma@intel.com>
Cc: Ira Weiny <ira.weiny@intel.com>
Cc: Alistair Popple <apopple@nvidia.com>
Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Baoquan He <bhe@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-09-17 01:07:00 -07:00
Linus Torvalds
daa394f0f9 A set of updates for debugobjects:
- Use the threshold to check for the pool refill condition and not the
     run time recorded all time low fill value, which is lower than the
     threshold and therefore causes refills to be delayed.
 
   - KCSAN annotation updates and simplification of the fill_pool() code.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAmbn480THHRnbHhAbGlu
 dXRyb25peC5kZQAKCRCmGPVMDXSYoVB1D/0UE1n86SLFrR7plXudttXJnbyJ/OjK
 uOjLSHx66TyMkN1z6xF6K4bZTyQRpIUifPLz4evyd9CdDvITvnrvkboby/15rsGW
 8sEBqAFVMkENkPzDA1Qmn3fxJs9XvHoER7WcMjaEl9yQbSi4gjO5Y+B0BNp4XKHZ
 P1YSmRJqUBX5F0BvmeeDlHCCpyUxeRGiyzxZ/WSl70e6RSGis10R+B/aqsMxf3Zz
 6WboQJqMxnDT3ICtDxTicH9VJ6Lh9iJxppeLVxAtZ+acfhcRmpwKFmsfJJOVy1eg
 zkJuDh3ieb8hH7vr6bqzMEoP8qclUY7JgcJCK0dIwcASIvr7ZFVLCDLDx6Ta9UrG
 D+L7sjGs+h/wz7NOoKTaGJS0XHwijVtLhc5/O64p1POUiQVTfjCVW6E3RAs3IGBI
 uXTxuVzpK7XXvbg7+iEwYVcE5fp5vctnlLyepkbXvei9r/ccgIndj3rVGZz1qyOc
 41LVhTx1Uu9MSqnsWTGbr+kzIze/g1rj8OlSH+692nbLL0mxWsOuojljvDgILC1Q
 rcvZLJrf8e4FDFyGZiX8kG3eHbyYQPdf3fqUCI7B05n0o7utXLf4Mgw+/LdIvpKY
 JTx4/lhwZ4TXFMvf+LiW/rhRlP72QsVkljjsyJTHI6a5LukdNL9dNXKTqSCypcjm
 hAsMzee52FiZoQ==
 =B0II
 -----END PGP SIGNATURE-----

Merge tag 'core-debugobjects-2024-09-16' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull debugobjects updates from Thomas Gleixner:

 - Use the threshold to check for the pool refill condition and not the
   run time recorded all time low fill value, which is lower than the
   threshold and therefore causes refills to be delayed.

 - KCSAN annotation updates and simplification of the fill_pool() code.

* tag 'core-debugobjects-2024-09-16' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  debugobjects: Remove redundant checks in fill_pool()
  debugobjects: Fix conditions in fill_pool()
  debugobjects: Fix the compilation attributes of some global variables
2024-09-17 08:14:00 +02:00
Linus Torvalds
9ea925c806 Updates for timers and timekeeping:
- Core:
 
 	- Overhaul of posix-timers in preparation of removing the
 	  workaround for periodic timers which have signal delivery
 	  ignored.
 
         - Remove the historical extra jiffie in msleep()
 
 	  msleep() adds an extra jiffie to the timeout value to ensure
 	  minimal sleep time. The timer wheel ensures minimal sleep
 	  time since the large rewrite to a non-cascading wheel, but the
 	  extra jiffie in msleep() remained unnoticed. Remove it.
 
         - Make the timer slack handling correct for realtime tasks.
 
 	  The procfs interface is inconsistent and does neither reflect
 	  reality nor conforms to the man page. Show the correct 0 slack
 	  for real time tasks and enforce it at the core level instead of
 	  having inconsistent individual checks in various timer setup
 	  functions.
 
         - The usual set of updates and enhancements all over the place.
 
   - Drivers:
 
         - Allow the ACPI PM timer to be turned off during suspend
 
 	- No new drivers
 
 	- The usual updates and enhancements in various drivers
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAmbn7jQTHHRnbHhAbGlu
 dXRyb25peC5kZQAKCRCmGPVMDXSYobqnD/9COlU0nwsulABI/aNIrsh6iYvnCC9v
 14CcNta7Qn+157Wfw9BWOyHdNhR1/fPCXE8jJ71zTyIOeW27HV2JyTtxTwe9ZcdK
 ViHAaj7YcIjcVUEC3StCoRCPnvLslEw4qJA5AOQuDyMivdQn+YVa2c0baJxKaXZt
 xk4HZdMj4NAS0jRKnoZSwtKW/+Oz6rR4GAWrZo+Zs1/8ur3HfqnQfi8lJ1hJtLLW
 V7XDCVRvamVi6Ah3ocYPPp/1P6yeQDA1ge9aMddqaza5STWISXRtSnFMUmYP3rbS
 FaL8TyL+ilfny8pkGB2WlG6nLuSbtvogtdEh1gG1k1RmZt44kAtk8ba/KiWFPBSb
 zK9cjojRMBS71f9G4kmb5F4rnXoLsg1YbD1Nzhz3wq2Cs1Z90dc2QwMren0zoQ1x
 Fn56ueRyAiagBlnrSaKyso/2RvqJTNoSdi3RkpjYeAph0UoDCqvTvKjGAf1mWiw1
 T/1lUWSVqWHnzZbM7XXzzajIN9bl6A7bbqlcAJ2O9vZIDt7273DG+bQym9Vh6Why
 0LTGGERHxzKBsG7WRg+2Gmvv6S18UPKRo8tLtlA758rHlFuPTZCShWrIriwSNl1K
 Hxon+d4BparSnm1h9W/NHPKJA574UbWRCBjdk58IkAj8DxZZY4ORD9SMP+ggkV7G
 F6p9cgoDNP9KFg==
 =jE0N
 -----END PGP SIGNATURE-----

Merge tag 'timers-core-2024-09-16' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull timer updates from Thomas Gleixner:
 "Core:

   - Overhaul of posix-timers in preparation of removing the workaround
     for periodic timers which have signal delivery ignored.

   - Remove the historical extra jiffie in msleep()

     msleep() adds an extra jiffie to the timeout value to ensure
     minimal sleep time. The timer wheel ensures minimal sleep time
     since the large rewrite to a non-cascading wheel, but the extra
     jiffie in msleep() remained unnoticed. Remove it.

   - Make the timer slack handling correct for realtime tasks.

     The procfs interface is inconsistent and does neither reflect
     reality nor conforms to the man page. Show the correct 0 slack for
     real time tasks and enforce it at the core level instead of having
     inconsistent individual checks in various timer setup functions.

   - The usual set of updates and enhancements all over the place.

  Drivers:

   - Allow the ACPI PM timer to be turned off during suspend

   - No new drivers

   - The usual updates and enhancements in various drivers"

* tag 'timers-core-2024-09-16' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (43 commits)
  ntp: Make sure RTC is synchronized when time goes backwards
  treewide: Fix wrong singular form of jiffies in comments
  cpu: Use already existing usleep_range()
  timers: Rename next_expiry_recalc() to be unique
  platform/x86:intel/pmc: Fix comment for the pmc_core_acpi_pm_timer_suspend_resume function
  clocksource/drivers/jcore: Use request_percpu_irq()
  clocksource/drivers/cadence-ttc: Add missing clk_disable_unprepare in ttc_setup_clockevent
  clocksource/drivers/asm9260: Add missing clk_disable_unprepare in asm9260_timer_init
  clocksource/drivers/qcom: Add missing iounmap() on errors in msm_dt_timer_init()
  clocksource/drivers/ingenic: Use devm_clk_get_enabled() helpers
  platform/x86:intel/pmc: Enable the ACPI PM Timer to be turned off when suspended
  clocksource: acpi_pm: Add external callback for suspend/resume
  clocksource/drivers/arm_arch_timer: Using for_each_available_child_of_node_scoped()
  dt-bindings: timer: rockchip: Add rk3576 compatible
  timers: Annotate possible non critical data race of next_expiry
  timers: Remove historical extra jiffie for timeout in msleep()
  hrtimer: Use and report correct timerslack values for realtime tasks
  hrtimer: Annotate hrtimer_cpu_base_.*_expiry() for sparse.
  timers: Add sparse annotation for timer_sync_wait_running().
  signal: Replace BUG_ON()s
  ...
2024-09-17 07:25:37 +02:00
Linus Torvalds
cb69d86550 Updates for the interrupt subsystem:
- Core:
 	- Remove a global lock in the affinity setting code
 
 	  The lock protects a cpumask for intermediate results and the lock
 	  causes a bottleneck on simultaneous start of multiple virtual
 	  machines. Replace the lock and the static cpumask with a per CPU
 	  cpumask which is nicely serialized by raw spinlock held when
 	  executing this code.
 
 	- Provide support for giving a suffix to interrupt domain names.
 
 	  That's required to support devices with subfunctions so that the
 	  domain names are distinct even if they originate from the same
 	  device node.
 
 	- The usual set of cleanups and enhancements all over the place
 
   - Drivers:
 
 	- Support for longarch AVEC interrupt chip
 
 	- Refurbishment of the Armada driver so it can be extended for new
           variants.
 
 	- The usual set of cleanups and enhancements all over the place
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAmbn5p8THHRnbHhAbGlu
 dXRyb25peC5kZQAKCRCmGPVMDXSYoRFtD/43eB3h5usY2OPW0JmDqrE6qnzsvjPZ
 1H52BcmMcOuI6yCfTnbi/fBB52mwSEGq9Dmt1GXradyq9/CJDIqZ1ajI1rA2jzW2
 YdbeTDpKm1rS2ddzfp2LT2BryrNt+7etrRO7qHn4EKSuOcNuV2f58WPbIIqasvaK
 uPbUDVDPrvXxLNcjoab6SqaKrEoAaHSyKpd0MvDd80wHrtcSC/QouW7JDSUXv699
 RwvLebN1OF6mQ2J8Z3DLeCQpcbAs+UT8UvID7kYUJi1g71J/ZY+xpMLoX/gHiDNr
 isBtsuEAiZeNaFpksc7A6Jgu5ljZf2/aLCqbPLlHaduHFNmo94x9KUbIF2cpEMN+
 rsf5Ff7AVh1otz3cUwLLsm+cFLWRRoZdLuncn7rrgB4Yg0gll7qzyLO6YGvQHr8U
 Ocj1RXtvvWsMk4XzhgCt1AH/42cO6go+bhA4HspeYykNpsIldIUl1MeFbO8sWiDJ
 kybuwiwHp3oaMLjEK4Lpq65u7Ll8Lju2zRde65YUJN2nbNmJFORrOLmeC1qsr6ri
 dpend6n2qD9UD1oAt32ej/uXnG160nm7UKescyxiZNeTm1+ez8GW31hY128ifTY3
 4R3urGS38p3gazXBsfw6eqkeKx0kEoDNoQqrO5gBvb8kowYTvoZtkwMGAN9OADwj
 w6vvU0i+NIyVMA==
 =JlJ2
 -----END PGP SIGNATURE-----

Merge tag 'irq-core-2024-09-16' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull irq updates from Thomas Gleixner:
 "Core:

   - Remove a global lock in the affinity setting code

     The lock protects a cpumask for intermediate results and the lock
     causes a bottleneck on simultaneous start of multiple virtual
     machines. Replace the lock and the static cpumask with a per CPU
     cpumask which is nicely serialized by raw spinlock held when
     executing this code.

   - Provide support for giving a suffix to interrupt domain names.

     That's required to support devices with subfunctions so that the
     domain names are distinct even if they originate from the same
     device node.

   - The usual set of cleanups and enhancements all over the place

  Drivers:

   - Support for longarch AVEC interrupt chip

   - Refurbishment of the Armada driver so it can be extended for new
     variants.

   - The usual set of cleanups and enhancements all over the place"

* tag 'irq-core-2024-09-16' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (73 commits)
  genirq: Use cpumask_intersects()
  genirq/cpuhotplug: Use cpumask_intersects()
  irqchip/apple-aic: Only access system registers on SoCs which provide them
  irqchip/apple-aic: Add a new "Global fast IPIs only" feature level
  irqchip/apple-aic: Skip unnecessary enabling of use_fast_ipi
  dt-bindings: apple,aic: Document A7-A11 compatibles
  irqdomain: Use IS_ERR_OR_NULL() in irq_domain_trim_hierarchy()
  genirq/msi: Use kmemdup_array() instead of kmemdup()
  genirq/proc: Change the return value for set affinity permission error
  genirq/proc: Use irq_move_pending() in show_irq_affinity()
  genirq/proc: Correctly set file permissions for affinity control files
  genirq: Get rid of global lock in irq_do_set_affinity()
  genirq: Fix typo in struct comment
  irqchip/loongarch-avec: Add AVEC irqchip support
  irqchip/loongson-pch-msi: Prepare get_pch_msi_handle() for AVECINTC
  irqchip/loongson-eiointc: Rename CPUHP_AP_IRQ_LOONGARCH_STARTING
  LoongArch: Architectural preparation for AVEC irqchip
  LoongArch: Move irqchip function prototypes to irq-loongson.h
  irqchip/loongson-pch-msi: Switch to MSI parent domains
  softirq: Remove unused 'action' parameter from action callback
  ...
2024-09-17 07:09:17 +02:00
Linus Torvalds
35219bc5c7 vfs-6.12.netfs
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCZuQEvgAKCRCRxhvAZXjc
 onQWAQD6IxAKPU0zom2FoWNilvSzPs7WglTtvddX9pu/lT1RNAD/YC/wOLW8mvAv
 9oTAmigQDQQhEWdJA9RgLZBiw7k+DAw=
 =zWFb
 -----END PGP SIGNATURE-----

Merge tag 'vfs-6.12.netfs' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs

Pull netfs updates from Christian Brauner:
 "This contains the work to improve read/write performance for the new
  netfs library.

  The main performance enhancing changes are:

   - Define a structure, struct folio_queue, and a new iterator type,
     ITER_FOLIOQ, to hold a buffer as a replacement for ITER_XARRAY. See
     that patch for questions about naming and form.

     ITER_FOLIOQ is provided as a replacement for ITER_XARRAY. The
     problem with an xarray is that accessing it requires the use of a
     lock (typically the RCU read lock) - and this means that we can't
     supply iterate_and_advance() with a step function that might sleep
     (crypto for example) without having to drop the lock between pages.
     ITER_FOLIOQ is the iterator for a chain of folio_queue structs,
     where each folio_queue holds a small list of folios. A folio_queue
     struct is a simpler structure than xarray and is not subject to
     concurrent manipulation by the VM. folio_queue is used rather than
     a bvec[] as it can form lists of indefinite size, adding to one end
     and removing from the other on the fly.

   - Provide a copy_folio_from_iter() wrapper.

   - Make cifs RDMA support ITER_FOLIOQ.

   - Use folio queues in the write-side helpers instead of xarrays.

   - Add a function to reset the iterator in a subrequest.

   - Simplify the write-side helpers to use sheaves to skip gaps rather
     than trying to work out where gaps are.

   - In afs, make the read subrequests asynchronous, putting them into
     work items to allow the next patch to do progressive
     unlocking/reading.

   - Overhaul the read-side helpers to improve performance.

   - Fix the caching of a partial block at the end of a file.

   - Allow a store to be cancelled.

  Then some changes for cifs to make it use folio queues instead of
  xarrays for crypto bufferage:

   - Use raw iteration functions rather than manually coding iteration
     when hashing data.

   - Switch to using folio_queue for crypto buffers.

   - Remove the xarray bits.

  Make some adjustments to the /proc/fs/netfs/stats file such that:

   - All the netfs stats lines begin 'Netfs:' but change this to
     something a bit more useful.

   - Add a couple of stats counters to track the numbers of skips and
     waits on the per-inode writeback serialisation lock to make it
     easier to check for this as a source of performance loss.

  Miscellaneous work:

   - Ensure that the sb_writers lock is taken around
     vfs_{set,remove}xattr() in the cachefiles code.

   - Reduce the number of conditional branches in netfs_perform_write().

   - Move the CIFS_INO_MODIFIED_ATTR flag to the netfs_inode struct and
     remove cifs_post_modify().

   - Move the max_len/max_nr_segs members from netfs_io_subrequest to
     netfs_io_request as they're only needed for one subreq at a time.

   - Add an 'unknown' source value for tracing purposes.

   - Remove NETFS_COPY_TO_CACHE as it's no longer used.

   - Set the request work function up front at allocation time.

   - Use bh-disabling spinlocks for rreq->lock as cachefiles completion
     may be run from block-filesystem DIO completion in softirq context.

   - Remove fs/netfs/io.c"

* tag 'vfs-6.12.netfs' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: (25 commits)
  docs: filesystems: corrected grammar of netfs page
  cifs: Don't support ITER_XARRAY
  cifs: Switch crypto buffer to use a folio_queue rather than an xarray
  cifs: Use iterate_and_advance*() routines directly for hashing
  netfs: Cancel dirty folios that have no storage destination
  cachefiles, netfs: Fix write to partial block at EOF
  netfs: Remove fs/netfs/io.c
  netfs: Speed up buffered reading
  afs: Make read subreqs async
  netfs: Simplify the writeback code
  netfs: Provide an iterator-reset function
  netfs: Use new folio_queue data type and iterator instead of xarray iter
  cifs: Provide the capability to extract from ITER_FOLIOQ to RDMA SGEs
  iov_iter: Provide copy_folio_from_iter()
  mm: Define struct folio_queue and ITER_FOLIOQ to handle a sequence of folios
  netfs: Use bh-disabling spinlocks for rreq->lock
  netfs: Set the request work function upon allocation
  netfs: Remove NETFS_COPY_TO_CACHE
  netfs: Reserve netfs_sreq_source 0 as unset/unknown
  netfs: Move max_len/max_nr_segs from netfs_io_subrequest to netfs_io_stream
  ...
2024-09-16 12:13:31 +02:00
Linus Torvalds
85ffc6e4ed This update includes the following changes:
API:
 
 - Make self-test asynchronous.
 
 Algorithms:
 
 - Remove MPI functions added for SM3.
 - Add allocation error checks to remaining MPI functions (introduced for SM3).
 - Set default Jitter RNG OSR to 3.
 
 Drivers:
 
 - Add hwrng driver for Rockchip RK3568 SoC.
 - Allow disabling SR-IOV VFs through sysfs in qat.
 - Fix device reset bugs in hisilicon.
 - Fix authenc key parsing by using generic helper in octeontx*.
 
 Others:
 
 - Fix xor benchmarking on parisc.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEn51F/lCuNhUwmDeSxycdCkmxi6cFAmbnq/wACgkQxycdCkmx
 i6cyXw//cBgngKOuCv7tLqMPeSLC39jDJEKkP9tS9ZilYyxzg1b9cbnDLlKNk4Yq
 4A6rRqqh8PD3/yJT58pGXaU5Is5sVMQRqqgwFutXrkD+hsMLk2nlgzsWYhg6aUsY
 /THTfmKTwEgfc3qDLZq6xGOShmMdO6NiOGsH3MeEWhbgfrDuJlOkHXd7QncNa7q8
 NEP7kI3vBc0xFcOxzbjy8tSGYEmPft1LECXAKsgOycWj9Q0SkzPocmo59iSbM21b
 HfV0p3hgAEa5VgKv0Rc5/6PevAqJqOSjGNfRBSPZ97o7dop8sl/z/cOWiy8dM7wO
 xhd9g7XXtmML6UO2MpJPMJzsLgMsjmUTWO2UyEpIyst6RVfJlniOL/jGzWmZ/P2+
 vw/F/mX8k60Zv1du46PC3p6eBeH4Yx/2fEPvPTJus+DQHS9GchXtAKtMToyyUHc2
 6TAy0nOihVQK2Q3QuQ1B/ghQS4tkdOenOIYHSCf9a9nJamub+PqP8jWDw0Y2RcY6
 jSs+tk6hwHJaKnj/T/Mr0gVPX9L8KHCYBtZD7Qbr0NhoXOT6w47m6bbew/dzTN+0
 pmFsgz32fNm8vb8R8D0kZDF63s6uz6CN+P9Dx6Tab4X+87HxNdeaBPS/Le9tYgOC
 0MmE5oIquooqedpM5tW55yuyOHhLPGLQS2SDiA+Ke+WYbAC8SQc=
 =rG1X
 -----END PGP SIGNATURE-----

Merge tag 'v6.12-p1' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6

Pull crypto update from Herbert Xu"
 "API:
   - Make self-test asynchronous

  Algorithms:
   - Remove MPI functions added for SM3
   - Add allocation error checks to remaining MPI functions (introduced
     for SM3)
   - Set default Jitter RNG OSR to 3

  Drivers:
   - Add hwrng driver for Rockchip RK3568 SoC
   - Allow disabling SR-IOV VFs through sysfs in qat
   - Fix device reset bugs in hisilicon
   - Fix authenc key parsing by using generic helper in octeontx*

  Others:
   - Fix xor benchmarking on parisc"

* tag 'v6.12-p1' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (96 commits)
  crypto: n2 - Set err to EINVAL if snprintf fails for hmac
  crypto: camm/qi - Use ERR_CAST() to return error-valued pointer
  crypto: mips/crc32 - Clean up useless assignment operations
  crypto: qcom-rng - rename *_of_data to *_match_data
  crypto: qcom-rng - fix support for ACPI-based systems
  dt-bindings: crypto: qcom,prng: document support for SA8255p
  crypto: aegis128 - Fix indentation issue in crypto_aegis128_process_crypt()
  crypto: octeontx* - Select CRYPTO_AUTHENC
  crypto: testmgr - Hide ENOENT errors
  crypto: qat - Remove trailing space after \n newline
  crypto: hisilicon/sec - Remove trailing space after \n newline
  crypto: algboss - Pass instance creation error up
  crypto: api - Fix generic algorithm self-test races
  crypto: hisilicon/qm - inject error before stopping queue
  crypto: hisilicon/hpre - mask cluster timeout error
  crypto: hisilicon/qm - reset device before enabling it
  crypto: hisilicon/trng - modifying the order of header files
  crypto: hisilicon - add a lock for the qp send operation
  crypto: hisilicon - fix missed error branch
  crypto: ccp - do not request interrupt on cmd completion when irqs disabled
  ...
2024-09-16 06:28:28 +02:00
Masahiro Yamada
5277d13094 btf: require pahole 1.21+ for DEBUG_INFO_BTF with default DWARF version
As described in commit 42d9b379e3 ("lib/Kconfig.debug: Allow BTF +
DWARF5 with pahole 1.21+"), the combination of CONFIG_DEBUG_INFO_BTF
and CONFIG_DEBUG_INFO_DWARF5 requires pahole 1.21+.

GCC 11+ and Clang 14+ default to DWARF 5 when the -g flag is passed.
For the same reason, the combination of CONFIG_DEBUG_INFO_BTF and
CONFIG_DEBUG_INFO_DWARF_TOOLCHAIN_DEFAULT is also likely to require
pahole 1.21+ these days. (At least, it is uncertain whether the actual
requirement is pahole 1.16+ or 1.21+.)

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Reviewed-by: Alan Maguire <alan.maguire@oracle.com>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Reviewed-by: Nathan Chancellor <nathan@kernel.org>
Link: https://lore.kernel.org/r/20240913173759.1316390-3-masahiroy@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-09-13 20:03:29 -07:00
Masahiro Yamada
42450f7a90 btf: move pahole check in scripts/link-vmlinux.sh to lib/Kconfig.debug
When DEBUG_INFO_DWARF5 is selected, pahole 1.21+ is required to enable
DEBUG_INFO_BTF.

When DEBUG_INFO_DWARF4 or DEBUG_INFO_DWARF_TOOLCHAIN_DEFAULT is selected,
DEBUG_INFO_BTF can be enabled without pahole installed, but a build error
will occur in scripts/link-vmlinux.sh:

    LD      .tmp_vmlinux1
  BTF: .tmp_vmlinux1: pahole (pahole) is not available
  Failed to generate BTF for vmlinux
  Try to disable CONFIG_DEBUG_INFO_BTF

We did not guard DEBUG_INFO_BTF by PAHOLE_VERSION when previously
discussed [1].

However, commit 613fe16923 ("kbuild: Add CONFIG_PAHOLE_VERSION")
added CONFIG_PAHOLE_VERSION after all. Now several CONFIG options, as
well as the combination of DEBUG_INFO_BTF and DEBUG_INFO_DWARF5, are
guarded by PAHOLE_VERSION.

The remaining compile-time check in scripts/link-vmlinux.sh now appears
to be an awkward inconsistency.

This commit adopts Nathan's original work.

[1]: https://lore.kernel.org/lkml/20210111180609.713998-1-natechancellor@gmail.com/

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Reviewed-by: Alan Maguire <alan.maguire@oracle.com>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Reviewed-by: Nathan Chancellor <nathan@kernel.org>
Link: https://lore.kernel.org/r/20240913173759.1316390-2-masahiroy@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-09-13 20:03:29 -07:00
Christophe Leroy
7f053812da random: vDSO: minimize and simplify header includes
Depending on the architecture, building a 32-bit vDSO on a 64-bit kernel
is problematic when some system headers are included.

Minimise the amount of headers by moving needed items, such as
__{get,put}_unaligned_t, into dedicated common headers and in general
use more specific headers, similar to what was done in commit
8165b57bca ("linux/const.h: Extract common header for vDSO") and
commit 8c59ab839f ("lib/vdso: Enable common headers").

On some architectures this results in missing PAGE_SIZE, as was
described by commit 8b3843ae36 ("vdso/datapage: Quick fix - use
asm/page-def.h for ARM64"), so define this if necessary, in the same way
as done prior by commit cffaefd15a ("vdso: Use CONFIG_PAGE_SHIFT in
vdso/datapage.h").

Removing linux/time64.h leads to missing 'struct timespec64' in
x86's asm/pvclock.h. Add a forward declaration of that struct in
that file.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2024-09-13 17:28:35 +02:00
Christophe Leroy
b7bad082e1 random: vDSO: avoid call to out of line memset()
With the current implementation, __cvdso_getrandom_data() calls
memset() on certain architectures, which is unexpected in the VDSO.

Rather than providing a memset(), simply rewrite opaque data
initialization to avoid memset().

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2024-09-13 17:28:35 +02:00
Christophe Leroy
81723e3ac3 random: vDSO: add missing c-getrandom-y in Makefile
Same as for the gettimeofday CVDSO implementation, add c-getrandom-y to
ease the inclusion of lib/vdso/getrandom.c in architectures' VDSO
builds.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2024-09-13 17:28:35 +02:00
Christophe Leroy
81c6896049 random: vDSO: don't use 64-bit atomics on 32-bit architectures
Performing SMP atomic operations on u64 fails on powerpc32:

    CC      drivers/char/random.o
  In file included from <command-line>:
  drivers/char/random.c: In function 'crng_reseed':
  ././include/linux/compiler_types.h:510:45: error: call to '__compiletime_assert_391' declared with attribute error: Need native word sized stores/loads for atomicity.
    510 |         _compiletime_assert(condition, msg, __compiletime_assert_, __COUNTER__)
        |                                             ^
  ././include/linux/compiler_types.h:491:25: note: in definition of macro '__compiletime_assert'
    491 |                         prefix ## suffix();                             \
        |                         ^~~~~~
  ././include/linux/compiler_types.h:510:9: note: in expansion of macro '_compiletime_assert'
    510 |         _compiletime_assert(condition, msg, __compiletime_assert_, __COUNTER__)
        |         ^~~~~~~~~~~~~~~~~~~
  ././include/linux/compiler_types.h:513:9: note: in expansion of macro 'compiletime_assert'
    513 |         compiletime_assert(__native_word(t),                            \
        |         ^~~~~~~~~~~~~~~~~~
  ./arch/powerpc/include/asm/barrier.h:74:9: note: in expansion of macro 'compiletime_assert_atomic_type'
     74 |         compiletime_assert_atomic_type(*p);                             \
        |         ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  ./include/asm-generic/barrier.h:172:55: note: in expansion of macro '__smp_store_release'
    172 | #define smp_store_release(p, v) do { kcsan_release(); __smp_store_release(p, v); } while (0)
        |                                                       ^~~~~~~~~~~~~~~~~~~
  drivers/char/random.c:286:9: note: in expansion of macro 'smp_store_release'
    286 |         smp_store_release(&__arch_get_k_vdso_rng_data()->generation, next_gen + 1);
        |         ^~~~~~~~~~~~~~~~~

The kernel-side generation counter in the random driver is handled as an
unsigned long, not as a u64, in base_crng and struct crng.

But on the vDSO side, it needs to be an u64, not just an unsigned long,
in order to support a 32-bit vDSO atop a 64-bit kernel.

On kernel side, however, it is an unsigned long, hence a 32-bit value on
32-bit architectures, so just cast it to unsigned long for the
smp_store_release(). A side effect is that on big endian architectures
the store will be performed in the upper 32 bits. It is not an issue on
its own because the vDSO site doesn't mind the value, as it only checks
differences. Just make sure that the vDSO side checks the full 64 bits.
For that, the local current_generation has to be u64 as well.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2024-09-13 17:28:35 +02:00
Luis Felipe Hernandez
7fcc9b5321 lib/math: Add int_pow test suite
Adds test suite for integer based power function which performs integer
exponentiation.

The test suite is designed to verify that the implementation of int_pow
correctly computes the power of a given base raised to a given exponent.

The tests check various scenarios and edge cases to ensure the accuracy
and reliability of the exponentiation function.

Updated commit with test information at commit time: Shuah Khan

Signed-off-by: Luis Felipe Hernandez <luis.hernandez093@gmail.com>
Reviewed-by: David Gow <davidgow@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2024-09-12 10:03:00 -06:00
David Howells
db0aa2e956
mm: Define struct folio_queue and ITER_FOLIOQ to handle a sequence of folios
Define a data structure, struct folio_queue, to represent a sequence of
folios and a kernel-internal I/O iterator type, ITER_FOLIOQ, to allow a
list of folio_queue structures to be used to provide a buffer to
iov_iter-taking functions, such as sendmsg and recvmsg.

The folio_queue structure looks like:

	struct folio_queue {
		struct folio_batch	vec;
		u8			orders[PAGEVEC_SIZE];
		struct folio_queue	*next;
		struct folio_queue	*prev;
		unsigned long		marks;
		unsigned long		marks2;
	};

It does not use a list_head so that next and/or prev can be set to NULL at
the ends of the list, allowing iov_iter-handling routines to determine that
they *are* the ends without needing to store a head pointer in the iov_iter
struct.

A folio_batch struct is used to hold the folio pointers which allows the
batch to be passed to batch handling functions.  Two mark bits are
available per slot.  The intention is to use at least one of them to mark
folios that need putting, but that might not be ultimately necessary.
Accessor functions are used to access the slots to do the masking and an
additional accessor function is used to indicate the size of the array.

The order of each folio is also stored in the structure to avoid the need
for iov_iter_advance() and iov_iter_revert() to have to query each folio to
find its size.

With careful barriering, this can be used as an extending buffer with new
folios inserted and new folio_queue structs added without the need for a
lock.  Further, provided we always keep at least one struct in the buffer,
we can also remove consumed folios and consumed structs from the head end
as we without the need for locks.

[Questions/thoughts]

 (1) To manage this, I need a head pointer, a tail pointer, a tail slot
     number (assuming insertion happens at the tail end and the next
     pointers point from head to tail).  Should I put these into a struct
     of their own, say "folio_queue_head" or "rolling_buffer"?

     I will end up with two of these in netfs_io_request eventually, one
     keeping track of the pagecache I'm dealing with for buffered I/O and
     the other to hold a bounce buffer when we need one.

 (2) Should I make the slots {folio,off,len} or bio_vec?

 (3) This is intended to replace ITER_XARRAY eventually.  Using an xarray
     in I/O iteration requires the taking of the RCU read lock, doing
     copying under the RCU read lock, walking the xarray (which may change
     under us), handling retries and dealing with special values.

     The advantage of ITER_XARRAY is that when we're dealing with the
     pagecache directly, we don't need any allocation - but if we're doing
     encrypted comms, there's a good chance we'd be using a bounce buffer
     anyway.

     This will require afs, erofs, cifs, orangefs and fscache to be
     converted to not use this.  afs still uses it for dirs and symlinks;
     some of erofs usages should be easy to change, but there's one which
     won't be so easy; ceph's use via fscache can be fixed by porting ceph
     to netfslib; cifs is using xarray as a bounce buffer - that can be
     moved to use sheaves instead; and orangefs has a similar problem to
     erofs - maybe orangefs could use netfslib?

Signed-off-by: David Howells <dhowells@redhat.com>
cc: Matthew Wilcox <willy@infradead.org>
cc: Jeff Layton <jlayton@kernel.org>
cc: Steve French <sfrench@samba.org>
cc: Ilya Dryomov <idryomov@gmail.com>
cc: Gao Xiang <xiang@kernel.org>
cc: Mike Marshall <hubcap@omnibond.com>
cc: netfs@lists.linux.dev
cc: linux-fsdevel@vger.kernel.org
cc: linux-mm@kvack.org
cc: linux-afs@lists.infradead.org
cc: linux-cifs@vger.kernel.org
cc: ceph-devel@vger.kernel.org
cc: linux-erofs@lists.ozlabs.org
cc: devel@lists.orangefs.org
Link: https://lore.kernel.org/r/20240814203850.2240469-13-dhowells@redhat.com/ # v2
Signed-off-by: Christian Brauner <brauner@kernel.org>
2024-09-12 12:20:21 +02:00
Andrii Nakryiko
cdbb44f9a7 lib/buildid: don't limit .note.gnu.build-id to the first page in ELF
With freader we don't need to restrict ourselves to a single page, so
let's allow ELF notes to be at any valid position with the file.

We also merge parse_build_id() and parse_build_id_buf() as now the only
difference between them is note offset overflow, which makes sense to
check in all situations.

Reviewed-by: Eduard Zingerman <eddyz87@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20240829174232.3133883-8-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-09-11 09:58:31 -07:00
Andrii Nakryiko
ad41251c29 lib/buildid: implement sleepable build_id_parse() API
Extend freader with a flag specifying whether it's OK to cause page
fault to fetch file data that is not already physically present in
memory. With this, it's now easy to wait for data if the caller is
running in sleepable (faultable) context.

We utilize read_cache_folio() to bring the desired folio into page
cache, after which the rest of the logic works just the same at folio level.

Suggested-by: Omar Sandoval <osandov@fb.com>
Cc: Shakeel Butt <shakeel.butt@linux.dev>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Reviewed-by: Eduard Zingerman <eddyz87@gmail.com>
Reviewed-by: Shakeel Butt <shakeel.butt@linux.dev>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20240829174232.3133883-7-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-09-11 09:58:31 -07:00
Andrii Nakryiko
45b8fc3096 lib/buildid: rename build_id_parse() into build_id_parse_nofault()
Make it clear that build_id_parse() assumes that it can take no page
fault by renaming it and current few users to build_id_parse_nofault().

Also add build_id_parse() stub which for now falls back to non-sleepable
implementation, but will be changed in subsequent patches to take
advantage of sleepable context. PROCMAP_QUERY ioctl() on
/proc/<pid>/maps file is using build_id_parse() and will automatically
take advantage of more reliable sleepable context implementation.

Reviewed-by: Eduard Zingerman <eddyz87@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20240829174232.3133883-6-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-09-11 09:58:30 -07:00
Andrii Nakryiko
4e9d360c4c lib/buildid: remove single-page limit for PHDR search
Now that freader allows to access multiple pages transparently, there is
no need to limit program headers to the very first ELF file page. Remove
this limitation, but still put some sane limit on amount of program
headers that we are willing to iterate over (set arbitrarily to 256).

Reviewed-by: Eduard Zingerman <eddyz87@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20240829174232.3133883-5-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-09-11 09:58:30 -07:00
Andrii Nakryiko
d4deb82423 lib/buildid: take into account e_phoff when fetching program headers
Current code assumption is that program (segment) headers are following
ELF header immediately. This is a common case, but is not guaranteed. So
take into account e_phoff field of the ELF header when accessing program
headers.

Reviewed-by: Eduard Zingerman <eddyz87@gmail.com>
Reported-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20240829174232.3133883-4-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-09-11 09:58:30 -07:00
Andrii Nakryiko
de3ec364c3 lib/buildid: add single folio-based file reader abstraction
Add freader abstraction that transparently manages fetching and local
mapping of the underlying file page(s) and provides a simple direct data
access interface.

freader_fetch() is the only and single interface necessary. It accepts
file offset and desired number of bytes that should be accessed, and
will return a kernel mapped pointer that caller can use to dereference
data up to requested size. Requested size can't be bigger than the size
of the extra buffer provided during initialization (because, worst case,
all requested data has to be copied into it, so it's better to flag
wrongly sized buffer unconditionally, regardless if requested data range
is crossing page boundaries or not).

If folio is not paged in, or some of the conditions are not satisfied,
NULL is returned and more detailed error code can be accessed through
freader->err field. This approach makes the usage of freader_fetch()
cleaner.

To accommodate accessing file data that crosses folio boundaries, user
has to provide an extra buffer that will be used to make a local copy,
if necessary. This is done to maintain a simple linear pointer data
access interface.

We switch existing build ID parsing logic to it, without changing or
lifting any of the existing constraints, yet. This will be done
separately.

Given existing code was written with the assumption that it's always
working with a single (first) page of the underlying ELF file, logic
passes direct pointers around, which doesn't really work well with
freader approach and would be limiting when removing the single page (folio)
limitation. So we adjust all the logic to work in terms of file offsets.

There is also a memory buffer-based version (freader_init_from_mem())
for cases when desired data is already available in kernel memory. This
is used for parsing vmlinux's own build ID note. In this mode assumption
is that provided data starts at "file offset" zero, which works great
when parsing ELF notes sections, as all the parsing logic is relative to
note section's start.

Reviewed-by: Eduard Zingerman <eddyz87@gmail.com>
Reviewed-by: Shakeel Butt <shakeel.butt@linux.dev>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20240829174232.3133883-3-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-09-11 09:58:30 -07:00
Andrii Nakryiko
905415ff3f lib/buildid: harden build ID parsing logic
Harden build ID parsing logic, adding explicit READ_ONCE() where it's
important to have a consistent value read and validated just once.

Also, as pointed out by Andi Kleen, we need to make sure that entire ELF
note is within a page bounds, so move the overflow check up and add an
extra note_size boundaries validation.

Fixes tag below points to the code that moved this code into
lib/buildid.c, and then subsequently was used in perf subsystem, making
this code exposed to perf_event_open() users in v5.12+.

Cc: stable@vger.kernel.org
Reviewed-by: Eduard Zingerman <eddyz87@gmail.com>
Reviewed-by: Jann Horn <jannh@google.com>
Suggested-by: Andi Kleen <ak@linux.intel.com>
Fixes: bd7525dacd ("bpf: Move stack_map_get_build_id into lib")
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20240829174232.3133883-2-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-09-11 09:58:30 -07:00
Thomas Gleixner
2f7eedca6c Merge branch 'linus' into timers/core
To update with the latest fixes.
2024-09-10 13:49:53 +02:00
Alok Swaminathan
7b0a5b6669 lib: glob.c: added null check for character class
Add null check for character class.  Previously, an inverted character
class could result in a nul byte being matched and lead to the function
reading past the end of the inputted string.

Link: https://lkml.kernel.org/r/20240826155709.12383-1-swaminathanalok@gmail.com
Signed-off-by: Alok Swaminathan <swaminathanalok@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-09-09 16:47:41 -07:00
Liam R. Howlett
1930c6ad93 maple_tree: mark three functions as __maybe_unused
People keep trying to remove three functions that are going to be used in
a feature that is being developed.  Dropping the functions entirely may
end up with people trying to use the bit for other uses, as people have
tried in the past.

Adding __maybe_unused stops compilers complaining about the unused
functions so they can be silently optimised out of the compiled code and
people won't try to claim the bit for another use.

Link: https://lore.kernel.org/all/20230726080916.17454-2-zhangpeng.00@bytedance.com/
Link: https://lore.kernel.org/all/202408310728.S7EE59BN-lkp@intel.com/
Link: https://lkml.kernel.org/r/20240907021506.4018676-1-Liam.Howlett@oracle.com
Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Reviewed-by: Kuan-Wei Chiu <visitorckw@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-09-09 16:39:17 -07:00
Sergey Senozhatsky
f3c11cf5ca lib: zstd: fix null-deref in ZSTD_createCDict_advanced2()
ZSTD_createCDict_advanced2() must ensure that
ZSTD_createCDict_advanced_internal() has successfully allocated cdict. 
customMalloc() may be called under low memory condition and may be unable
to allocate workspace for cdict.

Link: https://lkml.kernel.org/r/20240902105656.1383858-4-senozhatsky@chromium.org
Signed-off-by: Sergey Senozhatsky <senozhatsky@chromium.org>
Cc: Nick Terrell <terrelln@fb.com>
Cc: Minchan Kim <minchan@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-09-09 16:39:06 -07:00
Sergey Senozhatsky
7518847430 lib: lz4hc: export LZ4_resetStreamHC symbol
This symbol is needed to enable lz4hc dictionary support.

Link: https://lkml.kernel.org/r/20240902105656.1383858-3-senozhatsky@chromium.org
Signed-off-by: Sergey Senozhatsky <senozhatsky@chromium.org>
Cc: Nick Terrell <terrelln@fb.com>
Cc: Minchan Kim <minchan@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-09-09 16:39:06 -07:00
Sergey Senozhatsky
4fc4187984 lib: zstd: export API needed for dictionary support
Patch series "zram: introduce custom comp backends API", v7.

This series introduces support for run-time compression algorithms tuning,
so users, for instance, can adjust compression/acceleration levels and
provide pre-trained compression/decompression dictionaries which certain
algorithms support.

At this point we stop supporting (old/deprecated) comp API.  We may add
new acomp API support in the future, but before that zram needs to undergo
some major rework (we are not ready for async compression).

Some benchmarks for reference (look at column #2)

*** init zstd
/sys/block/zram0/mm_stat
1750659072 504622188 514355200        0 514355200        1        0    34204    34204

*** init zstd dict=/home/ss/zstd-dict-amd64
/sys/block/zram0/mm_stat
1750650880 465908890 475398144        0 475398144        1        0    34185    34185

*** init zstd level=8 dict=/home/ss/zstd-dict-amd64
/sys/block/zram0/mm_stat
1750654976 430803319 439873536        0 439873536        1        0    34185    34185

*** init lz4
/sys/block/zram0/mm_stat
1750646784 664266564 677060608        0 677060608        1        0    34288    34288

*** init lz4 dict=/home/ss/lz4-dict-amd64
/sys/block/zram0/mm_stat
1750650880 619990300 632102912        0 632102912        1        0    34278    34278

*** init lz4hc
/sys/block/zram0/mm_stat
1750630400 609023822 621232128        0 621232128        1        0    34288    34288

*** init lz4hc dict=/home/ss/lz4-dict-amd64
/sys/block/zram0/mm_stat
1750659072 505133172 515231744        0 515231744        1        0    34278    34278


Recompress
init zram zstd (prio=0), zstd level=5 (prio 1), zstd with dict (prio 2)

*** zstd
/sys/block/zram0/mm_stat
1750982656 504630584 514269184        0 514269184        1        0    34204    34204

*** idle recompress priority=1 (zstd level=5)
/sys/block/zram0/mm_stat
1750982656 488645601 525438976        0 514269184        1        0    34204    34204

*** idle recompress priority=2 (zstd dict)
/sys/block/zram0/mm_stat
1750982656 460869640 517914624        0 514269184        1        0    34185    34204


This patch (of 24):

We need to export a number of API functions that enable advanced zstd
usage - C/D dictionaries, dictionaries sharing between contexts, etc.

Link: https://lkml.kernel.org/r/20240902105656.1383858-1-senozhatsky@chromium.org
Link: https://lkml.kernel.org/r/20240902105656.1383858-2-senozhatsky@chromium.org
Signed-off-by: Sergey Senozhatsky <senozhatsky@chromium.org>
Cc: Nick Terrell <terrelln@fb.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Sergey Senozhatsky <senozhatsky@chromium.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-09-09 16:39:06 -07:00
Wei Yang
96ae4c9019 maple_tree: cleanup function descriptions
This patch tries to cleanup some function description:

  * function name mismatch
  * parameter name mismatch
  * parameter all end up with ':'
  * not prefix '*' if parameter is a pointer

There is still some missing description of parameters, I didn't add them
since I am not sure the exact meaning.

Link: https://lkml.kernel.org/r/20240830220400.2007-1-richard.weiyang@gmail.com
Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
Reviewed-by: Liam R. Howlett <Liam.Howlett@Oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-09-09 16:39:05 -07:00
Wei Yang
21a449bedf maple_tree: dump error message based on format
Just do what mt_dump_range64() does.

Dump the error message based on format.

Link: https://lkml.kernel.org/r/20240826012422.29935-2-richard.weiyang@gmail.com
Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
Cc: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Matthew Wilcox <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-09-09 16:39:01 -07:00
Wei Yang
8152831069 maple_tree: arange64 node is not a leaf node
mt_dump_arange64() only applies to an entry whose type is maple_arange_64,
in which mte_is_leaf() must return false.

Since mte_is_leaf() here is always false, we can remove this condition
check.

Link: https://lkml.kernel.org/r/20240826012422.29935-1-richard.weiyang@gmail.com
Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
Reviewed-by: Liam R. Howlett <Liam.Howlett@Oracle.com>
Cc: Matthew Wilcox <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-09-09 16:39:01 -07:00
Zhen Lei
63a4a9b52c debugobjects: Remove redundant checks in fill_pool()
fill_pool() checks locklessly at the beginning whether the pool has to be
refilled. After that it checks locklessly in a loop whether the free list
contains objects and repeats the refill check.

If both conditions are true, it acquires the pool lock and tries to move
objects from the free list to the pool repeating the same checks again.

There are two redundant issues with that:

      1) The repeated check for the fill condition
      2) The loop processing

The repeated check is pointless as it was just established that fill is
required. The condition has to be re-evaluated under the lock anyway.

The loop processing is not required either because there is practically
zero chance that a repeated attempt will succeed if the checks under the
lock terminate the moving of objects.

Remove the redundant check and replace the loop with a simple if condition.

[ tglx: Massaged change log ]

Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/all/20240904133944.2124-4-thunder.leizhen@huawei.com
2024-09-09 16:40:26 +02:00
Zhen Lei
684d28feb8 debugobjects: Fix conditions in fill_pool()
fill_pool() uses 'obj_pool_min_free' to decide whether objects should be
handed back to the kmem cache. But 'obj_pool_min_free' records the lowest
historical value of the number of objects in the object pool and not the
minimum number of objects which should be kept in the pool.

Use 'debug_objects_pool_min_level' instead, which holds the minimum number
which was scaled to the number of CPUs at boot time.

[ tglx: Massage change log ]

Fixes: d26bf5056f ("debugobjects: Reduce number of pool_lock acquisitions in fill_pool()")
Fixes: 36c4ead6f6 ("debugobjects: Add global free list and the counter")
Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/all/20240904133944.2124-3-thunder.leizhen@huawei.com
2024-09-09 16:40:25 +02:00
Zhen Lei
e4757c710b debugobjects: Fix the compilation attributes of some global variables
1. Both debug_objects_pool_min_level and debug_objects_pool_size are
   read-only after initialization, change attribute '__read_mostly' to
   '__ro_after_init', and remove '__data_racy'.

2. Many global variables are read in the debug_stats_show() function, but
   didn't mask KCSAN's detection. Add '__data_racy' for them.

Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/all/20240904133944.2124-2-thunder.leizhen@huawei.com
2024-09-09 16:40:25 +02:00
Kent Overstreet
b3f9da79e7 lib/generic-radix-tree.c: add preallocation
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-09-09 09:41:47 -04:00
Kent Overstreet
f659463381 lib/generic-radix-tree.c: genradix_ptr_inlined()
Provide an inlined fast path

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-09-09 09:41:47 -04:00
Anna-Maria Behnsen
bd7c8ff9fe treewide: Fix wrong singular form of jiffies in comments
There are several comments all over the place, which uses a wrong singular
form of jiffies.

Replace 'jiffie' by 'jiffy'. No functional change.

Signed-off-by: Anna-Maria Behnsen <anna-maria@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> # m68k
Link: https://lore.kernel.org/all/20240904-devel-anna-maria-b4-timers-flseep-v1-3-e98760256370@linutronix.de
2024-09-08 20:47:40 +02:00
Jakub Kicinski
502cc061de Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Cross-merge networking fixes after downstream PR.

Conflicts:

drivers/net/phy/phy_device.c
  2560db6ede ("net: phy: Fix missing of_node_put() for leds")
  1dce520abd ("net: phy: Use for_each_available_child_of_node_scoped()")
https://lore.kernel.org/20240904115823.74333648@canb.auug.org.au

Adjacent changes:

drivers/net/ethernet/xilinx/xilinx_axienet.h
drivers/net/ethernet/xilinx/xilinx_axienet_main.c
  858430db28 ("net: xilinx: axienet: Fix race in axienet_stop")
  76abb5d675 ("net: xilinx: axienet: Add statistics support")

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-09-05 20:37:20 -07:00
Linus Torvalds
120434e5b3 linux_kselftest-kunit-fixes-6.11-rc7
This kunit update for Linux 6.11-rc7 consist of one single fix to
 a use-after-free bug resulting from kunit_driver_create() failing
 to copy the driver name leaving it on the stack or freeing it.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEPZKym/RZuOCGeA/kCwJExA0NQxwFAmbY0WMACgkQCwJExA0N
 QxzCgBAA7Cb6tyvGcXsQTXC50S90CR+3bGmHzTL8jl/ElHvTz521UzPTn01QB51t
 JcGNhKz3RByvRBuukhg7abpCnCYWZoa9pmxojVD5D1TO2AXvypWEv0ao/UwSAYyi
 2b7BTkcc7ciRske51/yFfipjwI/NLLIlu4HVcZ0OisOt+tvHzoz50KiyYV+Qan8r
 e8NkqVI587KLfDAZRC+cLXyJCIRwlCK+jNMrjoiOanv1Ybe65eAGNQmAIyuGX1Fo
 Ku8ZgoCgpc+Vjc1bMWgwgHWCdFOvINdd7ibfCp59JBBAkqYFpHYS5Lk9kHWH6lYF
 X9THLaCSh5cq+u0qksW8p4ml1fYnWZbm92qkdPj0wG36v9la769HSXijtVhL2lxD
 b1ca/NpfNfbbr5mxoVRq4ulO1JvyC6jmRKSJWt1p1SFfHf+Oaowh2Sr2ZjFfOozj
 +/Joh3n2dxlnH/in8BvXGwQIo7xbyTatm/4IVCccJAolR+hPv7izBeWfYn3xgtu5
 5WZVcxPMxNwgNHWnxm2nbxTtBTvTsOSC8/nbxm8g3jM9cHCP7Mz3/zSV6p2vcRxm
 HPx/Qj2LmNcPKGXs4jh7WLErgkunxlvsqCJChwGjZoYR0fgRmzCgrwbkDE6/26UW
 Teo51bWwD/CxTy7OtXi8D2pPzVqt8u5cFPaNgHaRzxLDuVTouhU=
 =JRC5
 -----END PGP SIGNATURE-----

Merge tag 'linux_kselftest-kunit-fixes-6.11-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest

Pull kunit fix fromShuah Khan:
 "One single fix to a use-after-free bug resulting from
  kunit_driver_create() failing to copy the driver name leaving it on
  the stack or freeing it"

* tag 'linux_kselftest-kunit-fixes-6.11-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
  kunit: Device wrappers should also manage driver name
2024-09-05 09:43:38 -07:00
Tejun Heo
649e980dad Merge branch 'bpf/master' into for-6.12
Pull bpf/master to receive baebe9aaba ("bpf: allow passing struct
bpf_iter_<type> as kfunc arguments") and related changes in preparation for
the DSQ iterator patchset.

Signed-off-by: Tejun Heo <tj@kernel.org>
2024-09-04 11:41:32 -10:00
Matthew Wilcox (Oracle)
e27ad6560e printf: remove %pGt support
Patch series "Increase the number of bits available in page_type".

Kent wants more than 16 bits in page_type, so I resurrected this old patch
and expanded it a bit.  It's a bit more efficient than our current scheme
(1 4-byte insn vs 3 insns of 13 bytes total) to test a single page type.


This patch (of 4):

An upcoming patch will convert page type from being a bitfield to a
single byte, so we will not be able to use %pG to print the page type
any more.  The printing of the symbolic name will be restored in that
patch.

Link: https://lkml.kernel.org/r/20240821173914.2270383-1-willy@infradead.org
Link: https://lkml.kernel.org/r/20240821173914.2270383-2-willy@infradead.org
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Acked-by: David Hildenbrand <david@redhat.com>
Cc: Hyeonggon Yoo <42.hyeyoo@gmail.com>
Cc: Kent Overstreet <kent.overstreet@linux.dev>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-09-03 21:15:42 -07:00
Alexander Lobakin
00d066a4d4 netdev_features: convert NETIF_F_LLTX to dev->lltx
NETIF_F_LLTX can't be changed via Ethtool and is not a feature,
rather an attribute, very similar to IFF_NO_QUEUE (and hot).
Free one netdev_features_t bit and make it a "hot" private flag.

Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-09-03 11:36:43 +02:00
Yang Ruibin
38676d9e33 lib: fix the NULL vs IS_ERR() bug for debugfs_create_dir()
debugfs_create_dir() returns error pointers.  It never returns NULL.  So
use IS_ERR() to check it.

Link: https://lkml.kernel.org/r/20240821073441.9701-1-11162571@vivo.com
Signed-off-by: Yang Ruibin <11162571@vivo.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-09-01 20:43:40 -07:00
Andy Shevchenko
fb54ea1ee8 dimlib: use *-y instead of *-objs in Makefile
*-objs suffix is reserved rather for (user-space) host programs while
usually *-y suffix is used for kernel drivers (although *-objs works for
that purpose for now).

Let's correct the old usages of *-objs in Makefiles.

Link: https://lkml.kernel.org/r/20240821155140.611514-1-andriy.shevchenko@linux.intel.com
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Acked-by: Florian Fainelli <florian.fainelli@broadcom.com>
Reviewed-by: Alexander Lobakin <aleksander.lobakin@intel.com>
Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Cc: Tal Gilboa <talgi@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-09-01 20:43:40 -07:00
Uros Bizjak
16d9691ad4 lib/percpu_counter: add missing __percpu qualifier to a cast
Add missing __percpu qualifier to a (void *) cast to fix

percpu_counter.c:212:36: warning: cast removes address space '__percpu' of expression
percpu_counter.c:212:33: warning: incorrect type in assignment (different address spaces)
percpu_counter.c:212:33:    expected signed int [noderef] [usertype] __percpu *counters
percpu_counter.c:212:33:    got void *

sparse warnings.

Found by GCC's named address space checks.

There were no changes in the resulting object file.

Link: https://lkml.kernel.org/r/20240814064437.940162-1-ubizjak@gmail.com
Signed-off-by: Uros Bizjak <ubizjak@gmail.com>
Cc: Dennis Zhou <dennis@kernel.org>
Cc: Tejun Heo <tj@kernel.org>
Cc: Christoph Lameter <cl@linux.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-09-01 20:43:34 -07:00
Kuan-Wei Chiu
cbf164cd44 lib/bcd: optimize _bin2bcd() for improved performance
The original _bin2bcd() function used / 10 and % 10 operations for
conversion.  Although GCC optimizes these operations and does not generate
division or modulus instructions, the new implementation reduces the
number of mov instructions in the generated code for both x86-64 and ARM
architectures.

This optimization calculates the tens digit using (val * 103) >> 10, which
is accurate for values of 'val' in the range [0, 178].  Given that the
valid input range is [0, 99], this method ensures correctness while
simplifying the generated code.

Link: https://lkml.kernel.org/r/20240812170229.229380-1-visitorckw@gmail.com
Signed-off-by: Kuan-Wei Chiu <visitorckw@gmail.com>
Cc: Ching-Chun (Jim) Huang <jserv@ccns.ncku.edu.tw>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-09-01 20:43:33 -07:00
Jani Nikula
6ce2082fd3 fault-inject: improve build for CONFIG_FAULT_INJECTION=n
The fault-inject.h users across the kernel need to add a lot of #ifdef
CONFIG_FAULT_INJECTION to cater for shortcomings in the header.  Make
fault-inject.h self-contained for CONFIG_FAULT_INJECTION=n, and add stubs
for DECLARE_FAULT_ATTR(), setup_fault_attr(), should_fail_ex(), and
should_fail() to allow removal of conditional compilation.

[akpm@linux-foundation.org: repair fallout from no longer including debugfs.h into fault-inject.h]
[akpm@linux-foundation.org: fix drivers/misc/xilinx_tmr_inject.c]
[akpm@linux-foundation.org: Add debugfs.h inclusion to more files, per Stephen]
Link: https://lkml.kernel.org/r/20240813121237.2382534-1-jani.nikula@intel.com
Fixes: 6ff1cb355e ("[PATCH] fault-injection capabilities infrastructure")
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Cc: Akinobu Mita <akinobu.mita@gmail.com>
Cc: Abhinav Kumar <quic_abhinavk@quicinc.com>
Cc: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Cc: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: Rob Clark <robdclark@gmail.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-09-01 20:43:33 -07:00
Davidlohr Bueso
a15bec6a8f lib/rhashtable: cleanup fallback check in bucket_table_alloc()
Upon allocation failure, the current check with the nofail bits is
unnecessary, and further stands in the way of discouraging direct use of
__GFP_NOFAIL.  Remove this and replace with the proper way of determining
if doing a non-blocking allocation for the nested table case.

Link: https://lkml.kernel.org/r/20240806153927.184515-1-dave@stgolabs.net
Signed-off-by: Davidlohr Bueso <dave@stgolabs.net>
Suggested-by: Michal Hocko <mhocko@suse.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-09-01 20:43:32 -07:00
J. R. Okajima
e0ba72e3a4 lockdep: upper limit LOCKDEP_CHAINS_BITS
CONFIG_LOCKDEP_CHAINS_BITS value decides the size of chain_hlocks[] in
kernel/locking/lockdep.c, and it is checked by add_chain_cache() with
    BUILD_BUG_ON((1UL << 24) <= ARRAY_SIZE(chain_hlocks));
This patch is just to silence BUILD_BUG_ON().

See also https://lore.kernel.org/all/30795.1620913191@jrobl/

[cmllamas@google.com: fix minor checkpatch issues in commit log]
Link: https://lkml.kernel.org/r/20240723164018.2489615-1-cmllamas@google.com
Signed-off-by: J. R. Okajima <hooanon05g@gmail.com>
Signed-off-by: Carlos Llamas <cmllamas@google.com>
Acked-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Waiman Long <longman@redhat.com>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-09-01 20:43:32 -07:00
Thorsten Blum
b6e21b7120 lib: checksum: use ARRAY_SIZE() to improve assert_setup_correct()
Use ARRAY_SIZE() to simplify the assert_setup_correct() function and
improve its readability.

Link: https://lkml.kernel.org/r/20240726154946.230928-1-thorsten.blum@toblux.com
Signed-off-by: Thorsten Blum <thorsten.blum@toblux.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-09-01 20:43:30 -07:00
Deshan Zhang
9a42bfd255 lib/lru_cache: fix spelling mistake "colision"->"collision"
There is a spelling mistake in a literal string and in cariable names. 
Fix these.

Link: https://lkml.kernel.org/r/20240725093044.1742842-1-deshan@nfschina.com
Signed-off-by: Deshan Zhang <deshan@nfschina.com>
Cc: Christoph Böhmwalder <christoph.boehmwalder@linbit.com>
Cc: Lars Ellenberg <lars.ellenberg@linbit.com>
Cc: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-09-01 20:43:29 -07:00
Markus Elfring
fbe617af69 closures: use seq_putc() in debug_show()
A single line break should be put into a sequence.  Thus use the
corresponding function "seq_putc".

This issue was transformed by using the Coccinelle software.

Link: https://lkml.kernel.org/r/e7faa2c4-9590-44b4-8669-69ef810277b1@web.de
Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Cc: Kent Overstreet <kent.overstreet@linux.dev>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-09-01 20:43:29 -07:00
Markus Elfring
7b76689a02 dyndbg: use seq_putc() in ddebug_proc_show()
Single characters should be put into a sequence.  Thus use the
corresponding function "seq_putc".

This issue was transformed by using the Coccinelle software.

Link: https://lkml.kernel.org/r/375b5b4b-6295-419e-bae9-da724a7a682d@web.de
Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Cc: Jason Baron <jbaron@akamai.com>
Cc: Jim Cromie <jim.cromie@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-09-01 20:43:29 -07:00
Lasse Collin
c6f371bab2 xz: remove XZ_EXTERN and extern from functions
XZ_EXTERN was used to make internal functions static in the preboot code. 
However, in other decompressors this hasn't been done.  On x86-64, this
makes no difference to the kernel image size.

Omit XZ_EXTERN and let some of the internal functions be extern in the
preboot code.  Omitting XZ_EXTERN from include/linux/xz.h fixes warnings
in "make htmldocs" and makes the intradocument links to xz_dec functions
work in Documentation/staging/xz.rst.  The alternative would have been to
add "XZ_EXTERN" to c_id_attributes in Documentation/conf.py but omitting
XZ_EXTERN seemed cleaner.

Link: https://lore.kernel.org/lkml/20240723205437.3c0664b0@kaneli/
Link: https://lkml.kernel.org/r/20240724110544.16430-1-lasse.collin@tukaani.org
Signed-off-by: Lasse Collin <lasse.collin@tukaani.org>
Tested-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc)
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Sam James <sam@gentoo.org>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Emil Renner Berthing <emil.renner.berthing@canonical.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Joel Stanley <joel@jms.id.au>
Cc: Jubin Zhong <zhongjubin@huawei.com>
Cc: Jules Maselbas <jmaselbas@zdiv.net>
Cc: Krzysztof Kozlowski <krzk@kernel.org>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Rui Li <me@lirui.org>
Cc: Simon Glass <sjg@chromium.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-09-01 20:43:27 -07:00
Lasse Collin
7472ff8ada xz: adjust arch-specific options for better kernel compression
Use LZMA2 options that match the arch-specific alignment of instructions. 
This change reduces compressed kernel size 0-2 % depending on the arch. 
On 1-byte-aligned x86 it makes no difference and on 4-byte-aligned archs
it helps the most.

Use the ARM-Thumb filter for ARM-Thumb2 kernels.  This reduces compressed
kernel size about 5 %.[1] Previously such kernels were compressed using
the ARM filter which didn't do anything useful with ARM-Thumb2 code.

Add BCJ filter support for ARM64 and RISC-V.  Compared to unfiltered XZ or
plain LZMA, the compressed kernel size is reduced about 5 % on ARM64 and 7
% on RISC-V.  A new enough version of the xz tool is required: 5.4.0 for
ARM64 and 5.6.0 for RISC-V.  With an old xz version, a message is printed
to standard error and the kernel is compressed without the filter.

Update lib/decompress_unxz.c to match the changes to xz_wrap.sh.

Update the CONFIG_KERNEL_XZ help text in init/Kconfig:
  - Add the RISC-V and ARM64 filters.
  - Clarify that the PowerPC filter is for big endian only.
  - Omit IA-64.

Link: https://lore.kernel.org/lkml/1637379771-39449-1-git-send-email-zhongjubin@huawei.com/ [1]
Link: https://lkml.kernel.org/r/20240721133633.47721-15-lasse.collin@tukaani.org
Signed-off-by: Lasse Collin <lasse.collin@tukaani.org>
Reviewed-by: Sam James <sam@gentoo.org>
Cc: Simon Glass <sjg@chromium.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Jubin Zhong <zhongjubin@huawei.com>
Cc: Jules Maselbas <jmaselbas@zdiv.net>
Cc: Emil Renner Berthing <emil.renner.berthing@canonical.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Joel Stanley <joel@jms.id.au>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Krzysztof Kozlowski <krzk@kernel.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Rui Li <me@lirui.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-09-01 20:43:27 -07:00
Lasse Collin
93d09773d1 xz: add RISC-V BCJ filter
A later commit updates lib/decompress_unxz.c to enable this filter for
kernel decompression.  lib/decompress_unxz.c is already used if
CONFIG_EFI_ZBOOT=y && CONFIG_KERNEL_XZ=y.

This filter can be used by Squashfs without modifications to the Squashfs
kernel code (only needs support in userspace Squashfs-tools).

Link: https://lkml.kernel.org/r/20240721133633.47721-13-lasse.collin@tukaani.org
Signed-off-by: Lasse Collin <lasse.collin@tukaani.org>
Reviewed-by: Sam James <sam@gentoo.org>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Emil Renner Berthing <emil.renner.berthing@canonical.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Joel Stanley <joel@jms.id.au>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Jubin Zhong <zhongjubin@huawei.com>
Cc: Jules Maselbas <jmaselbas@zdiv.net>
Cc: Krzysztof Kozlowski <krzk@kernel.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Rui Li <me@lirui.org>
Cc: Simon Glass <sjg@chromium.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-09-01 20:43:26 -07:00
Lasse Collin
4b62813f5e xz: Add ARM64 BCJ filter
Also omit a duplicated check for XZ_DEC_ARM in xz_private.h.

A later commit updates lib/decompress_unxz.c to enable this filter for
kernel decompression.  lib/decompress_unxz.c is already used if
CONFIG_EFI_ZBOOT=y && CONFIG_KERNEL_XZ=y.

This filter can be used by Squashfs without modifications to the Squashfs
kernel code (only needs support in userspace Squashfs-tools).

Link: https://lkml.kernel.org/r/20240721133633.47721-12-lasse.collin@tukaani.org
Signed-off-by: Lasse Collin <lasse.collin@tukaani.org>
Reviewed-by: Sam James <sam@gentoo.org>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Emil Renner Berthing <emil.renner.berthing@canonical.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Joel Stanley <joel@jms.id.au>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Jubin Zhong <zhongjubin@huawei.com>
Cc: Jules Maselbas <jmaselbas@zdiv.net>
Cc: Krzysztof Kozlowski <krzk@kernel.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Rui Li <me@lirui.org>
Cc: Simon Glass <sjg@chromium.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-09-01 20:43:26 -07:00
Lasse Collin
bdfc041171 xz: optimize for-loop conditions in the BCJ decoders
Compilers cannot optimize the addition "i + 4" away since theoretically it
could overflow.

Link: https://lkml.kernel.org/r/20240721133633.47721-11-lasse.collin@tukaani.org
Signed-off-by: Lasse Collin <lasse.collin@tukaani.org>
Reviewed-by: Sam James <sam@gentoo.org>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Emil Renner Berthing <emil.renner.berthing@canonical.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Joel Stanley <joel@jms.id.au>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Jubin Zhong <zhongjubin@huawei.com>
Cc: Jules Maselbas <jmaselbas@zdiv.net>
Cc: Krzysztof Kozlowski <krzk@kernel.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Rui Li <me@lirui.org>
Cc: Simon Glass <sjg@chromium.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-09-01 20:43:26 -07:00
Lasse Collin
2ee96abef2 xz: cleanup CRC32 edits from 2018
In 2018, a dependency on <linux/crc32poly.h> was added to avoid
duplicating the same constant in multiple files.  Two months later it was
found to be a bad idea and the definition of CRC32_POLY_LE macro was moved
into xz_private.h to avoid including <linux/crc32poly.h>.

xz_private.h is a wrong place for it too.  Revert back to the upstream
version which has the poly in xz_crc32_init() in xz_crc32.c.

Link: https://lkml.kernel.org/r/20240721133633.47721-10-lasse.collin@tukaani.org
Fixes: faa16bc404 ("lib: Use existing define with polynomial")
Fixes: 242cdad873 ("lib/xz: Put CRC32_POLY_LE in xz_private.h")
Signed-off-by: Lasse Collin <lasse.collin@tukaani.org>
Reviewed-by: Sam James <sam@gentoo.org>
Tested-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc)
Cc: Krzysztof Kozlowski <krzk@kernel.org>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Joel Stanley <joel@jms.id.au>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Emil Renner Berthing <emil.renner.berthing@canonical.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Jubin Zhong <zhongjubin@huawei.com>
Cc: Jules Maselbas <jmaselbas@zdiv.net>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Rui Li <me@lirui.org>
Cc: Simon Glass <sjg@chromium.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-09-01 20:43:25 -07:00
Lasse Collin
ff221153aa xz: fix comments and coding style
- Fix comments that were no longer in sync with the code below them.
- Fix language errors.
- Fix coding style.

Link: https://lkml.kernel.org/r/20240721133633.47721-5-lasse.collin@tukaani.org
Signed-off-by: Lasse Collin <lasse.collin@tukaani.org>
Reviewed-by: Sam James <sam@gentoo.org>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Emil Renner Berthing <emil.renner.berthing@canonical.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Joel Stanley <joel@jms.id.au>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Jubin Zhong <zhongjubin@huawei.com>
Cc: Jules Maselbas <jmaselbas@zdiv.net>
Cc: Krzysztof Kozlowski <krzk@kernel.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Rui Li <me@lirui.org>
Cc: Simon Glass <sjg@chromium.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-09-01 20:43:24 -07:00
Lasse Collin
836d13a6ef xz: switch from public domain to BSD Zero Clause License (0BSD)
Remove the public domain notices and add SPDX license identifiers.

Change MODULE_LICENSE from "GPL" to "Dual BSD/GPL" because 0BSD should
count as a BSD license variant here.

The switch to 0BSD was done in the upstream XZ Embedded project because
public domain has (real or perceived) legal issues in some jurisdictions.

Link: https://lkml.kernel.org/r/20240721133633.47721-4-lasse.collin@tukaani.org
Signed-off-by: Lasse Collin <lasse.collin@tukaani.org>
Reviewed-by: Sam James <sam@gentoo.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Emil Renner Berthing <emil.renner.berthing@canonical.com>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Joel Stanley <joel@jms.id.au>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Jubin Zhong <zhongjubin@huawei.com>
Cc: Jules Maselbas <jmaselbas@zdiv.net>
Cc: Krzysztof Kozlowski <krzk@kernel.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Rui Li <me@lirui.org>
Cc: Simon Glass <sjg@chromium.org>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-09-01 20:43:24 -07:00
Andrey Konovalov
e24f4de8a7 kcov: don't instrument lib/find_bit.c
This file produces large amounts of flaky coverage not useful for the
KCOV's intended use case (guiding the fuzzing process).

Link: https://lkml.kernel.org/r/20240722223726.194658-1-andrey.konovalov@linux.dev
Signed-off-by: Andrey Konovalov <andreyknvl@gmail.com>
Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
Cc: Aleksandr Nogikh <nogikh@google.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Marco Elver <elver@google.com>
Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Cc: Yury Norov <yury.norov@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-09-01 20:43:23 -07:00
Jeff Johnson
053a5e4cbb lib: test_objpool: add missing MODULE_DESCRIPTION() macro
make allmodconfig && make W=1 C=1 reports:
WARNING: modpost: missing MODULE_DESCRIPTION() in lib/test_objpool.o

Add the missing invocation of the MODULE_DESCRIPTION() macro.

Link: https://lkml.kernel.org/r/20240715-md-lib-test_objpool-v2-1-5a2b9369c37e@quicinc.com
Signed-off-by: Jeff Johnson <quic_jjohnson@quicinc.com>
Reviewed-by: Matt Wu <wuqiang.matt@bytedance.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-09-01 20:43:23 -07:00
Nicolas Pitre
1635e62e75 mul_u64_u64_div_u64: basic sanity test
Verify that edge cases produce proper results, and some more.

[npitre@baylibre.com: avoid undefined shift value]
  Link: https://lkml.kernel.org/r/7rrs9pn1-n266-3013-9q6n-1osp8r8s0rrn@syhkavp.arg
Link: https://lkml.kernel.org/r/20240707190648.1982714-3-nico@fluxnic.net
Signed-off-by: Nicolas Pitre <npitre@baylibre.com>
Reviewed-by: Uwe Kleine-König <u.kleine-koenig@baylibre.com>
Cc: Biju Das <biju.das.jz@bp.renesas.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-09-01 20:43:22 -07:00
Nicolas Pitre
b29a62d87c mul_u64_u64_div_u64: make it precise always
Patch series "mul_u64_u64_div_u64: new implementation", v3.

This provides an implementation for mul_u64_u64_div_u64() that always
produces exact results.


This patch (of 2):

Library facilities must always return exact results.  If the caller may be
contented with approximations then it should do the approximation on its
own.

In this particular case the comment in the code says "the algorithm
... below might lose some precision". Well, if you try it with e.g.:

	a = 18446462598732840960
	b = 18446462598732840960
	c = 18446462598732840961

then the produced answer is 0 whereas the exact answer should be
18446462598732840959.  This is _some_ precision lost indeed!

Let's reimplement this function so it always produces the exact result
regardless of its inputs while preserving existing fast paths when
possible.

Uwe said:

: My personal interest is to get the calculations in pwm drivers right. 
: This function is used in several drivers below drivers/pwm/ .  With the
: errors in mul_u64_u64_div_u64(), pwm consumers might not get the
: settings they request.  Although I have to admit that I'm not aware it
: breaks real use cases (because typically the periods used are too short
: to make the involved multiplications overflow), but I pretty sure am
: not aware of all usages and it breaks testing.
: 
: Another justification is commits like
: https://git.kernel.org/tip/77baa5bafcbe1b2a15ef9c37232c21279c95481c,
: where people start to work around the precision shortcomings of
: mul_u64_u64_div_u64().

Link: https://lkml.kernel.org/r/20240707190648.1982714-1-nico@fluxnic.net
Link: https://lkml.kernel.org/r/20240707190648.1982714-2-nico@fluxnic.net
Signed-off-by: Nicolas Pitre <npitre@baylibre.com>
Tested-by: Uwe Kleine-König <u.kleine-koenig@baylibre.com>
Reviewed-by: Uwe Kleine-König <u.kleine-koenig@baylibre.com>
Tested-by: Biju Das <biju.das.jz@bp.renesas.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-09-01 20:43:22 -07:00
Sidhartha Kumar
ed4dfd9aa1 maple_tree: make write helper functions void
The return value of various write helper functions are not checked. We
can safely change the return type of these functions to be void.

Link: https://lkml.kernel.org/r/20240814161944.55347-18-sidhartha.kumar@oracle.com
Signed-off-by: Sidhartha Kumar <sidhartha.kumar@oracle.com>
Cc: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-09-01 20:26:18 -07:00
Sidhartha Kumar
c27e6183c6 maple_tree: remove unneeded mas_wr_walk() in mas_store_prealloc()
Users of mas_store_prealloc() enter this function with nodes already
preallocated. This means the store type must be already set. We can then
remove the call to mas_wr_store_type() and initialize the write state to
continue the partial walk that was done when determining the store type.

Link: https://lkml.kernel.org/r/20240814161944.55347-17-sidhartha.kumar@oracle.com
Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Signed-off-by: Sidhartha Kumar <sidhartha.kumar@oracle.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-09-01 20:26:18 -07:00
Sidhartha Kumar
add60ea5f6 maple_tree: remove repeated sanity checks from write helper functions
These sanity checks are now redundant as they are already checked in
mas_wr_store_type(). We can remove them from mas_wr_append() and
mas_wr_node_store().

Link: https://lkml.kernel.org/r/20240814161944.55347-16-sidhartha.kumar@oracle.com
Signed-off-by: Sidhartha Kumar <sidhartha.kumar@oracle.com>
Cc: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-09-01 20:26:17 -07:00
Sidhartha Kumar
9155e84334 maple_tree: remove node allocations from various write helper functions
These write helper functions are all called from store paths which
preallocate enough nodes that will be needed for the write. There is no
more need to allocate within the functions themselves.

Link: https://lkml.kernel.org/r/20240814161944.55347-15-sidhartha.kumar@oracle.com
Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Signed-off-by: Sidhartha Kumar <sidhartha.kumar@oracle.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-09-01 20:26:17 -07:00
Sidhartha Kumar
4037d44f54 maple_tree: have mas_store() allocate nodes if needed
Not all users of mas_store() enter with nodes already preallocated.
Check for the MA_STATE_PREALLOC flag to decide whether to preallocate nodes
within mas_store() rather than relying on future write helper functions
to perform the allocations. This allows the write helper functions to be
simplified as they do not have to do checks to make sure there are
enough allocated nodes to perform the write.

Link: https://lkml.kernel.org/r/20240814161944.55347-14-sidhartha.kumar@oracle.com
Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Signed-off-by: Sidhartha Kumar <sidhartha.kumar@oracle.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-09-01 20:26:17 -07:00
Sidhartha Kumar
7987d02779 maple_tree: remove mas_wr_modify()
There are no more users of the function, safely remove it.

Link: https://lkml.kernel.org/r/20240814161944.55347-13-sidhartha.kumar@oracle.com
Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Signed-off-by: Sidhartha Kumar <sidhartha.kumar@oracle.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-09-01 20:26:17 -07:00
Sidhartha Kumar
62c7b2b984 maple_tree: simplify mas_commit_b_node()
The only callers of mas_commit_b_node() are those with store type of
wr_rebalance and wr_split_store. Use mas->store_type to dispatch to the
correct helper function. This allows the removal of mas_reuse_node() as
it is no longer used.

Link: https://lkml.kernel.org/r/20240814161944.55347-12-sidhartha.kumar@oracle.com
Signed-off-by: Sidhartha Kumar <sidhartha.kumar@oracle.com>
Cc: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-09-01 20:26:16 -07:00
Sidhartha Kumar
1fd7c4f322 maple_tree: convert mas_insert() to preallocate nodes
By setting the store type in mas_insert(), we no longer need to use
mas_wr_modify() to determine the correct store function to use. Instead,
set the store type and call mas_wr_store_entry(). Also, pass in the
requested gfp flags to mas_insert() so they can be passed to the call to
mas_wr_preallocate().

Link: https://lkml.kernel.org/r/20240814161944.55347-11-sidhartha.kumar@oracle.com
Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Signed-off-by: Sidhartha Kumar <sidhartha.kumar@oracle.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-09-01 20:26:16 -07:00
Sidhartha Kumar
580fcbd67c maple_tree: use store type in mas_wr_store_entry()
When storing an entry, we can read the store type that was set from a
previous partial walk of the tree. Now that the type of store is known,
select the correct write helper function to use to complete the store.

Also noinline mas_wr_spanning_store() to limit stack frame usage in
mas_wr_store_entry() as it allocates a maple_big_node on the stack.

Link: https://lkml.kernel.org/r/20240814161944.55347-10-sidhartha.kumar@oracle.com
Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Signed-off-by: Sidhartha Kumar <sidhartha.kumar@oracle.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-09-01 20:26:16 -07:00
Sidhartha Kumar
23e217a848 maple_tree: print store type in mas_dump()
Knowing the store type of the maple state could be helpful for debugging.
Have mas_dump() print mas->store_type.

Link: https://lkml.kernel.org/r/20240814161944.55347-9-sidhartha.kumar@oracle.com
Signed-off-by: Sidhartha Kumar <sidhartha.kumar@oracle.com>
Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-09-01 20:26:16 -07:00
Sidhartha Kumar
85db8f2417 maple_tree: use mas_store_gfp() in mtree_store_range()
Refactor mtree_store_range() to use mas_store_gfp() which will abstract
the store, memory allocation, and error handling.

Link: https://lkml.kernel.org/r/20240814161944.55347-8-sidhartha.kumar@oracle.com
Signed-off-by: Sidhartha Kumar <sidhartha.kumar@oracle.com>
Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-09-01 20:26:15 -07:00
Sidhartha Kumar
7e093834ed maple_tree: preallocate nodes in mas_erase()
Use mas_wr_preallocate() in mas_erase() to preallocate enough nodes to
complete the erase.  Add error handling by skipping the store if the
preallocation lead to some error besides no memory.

Link: https://lkml.kernel.org/r/20240814161944.55347-7-sidhartha.kumar@oracle.com
Signed-off-by: Sidhartha Kumar <sidhartha.kumar@oracle.com>
Cc: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-09-01 20:26:15 -07:00
Sidhartha Kumar
3cd9e92e00 maple_tree: remove mas_destroy() from mas_nomem()
Separate call to mas_destroy() from mas_nomem() so we can check for no
memory errors without destroying the current maple state in
mas_store_gfp().  We then add calls to mas_destroy() to callers of
mas_nomem().

Link: https://lkml.kernel.org/r/20240814161944.55347-6-sidhartha.kumar@oracle.com
Signed-off-by: Sidhartha Kumar <sidhartha.kumar@oracle.com>
Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-09-01 20:26:15 -07:00
Sidhartha Kumar
5d659bbb52 maple_tree: introduce mas_wr_store_type()
Introduce mas_wr_store_type() which will set the correct store type based
on a walk of the tree.  In mas_wr_node_store() the <= min_slots condition
is changed to < as if new_end is = to mt_min_slots then there is not
enough room.

mas_prealloc_calc() is also introduced to abstract the calculation used to
determine the number of nodes needed for a store operation.

In this change a call to mas_reset() is removed in the error case of
mas_prealloc().  This is only needed in the MA_STATE_REBALANCE case of
mas_destroy().  We can move the call to mas_reset() directly to
mas_destroy().

Also, add a test case to validate the order that we check the store type
in is correct.  This test models a vma expanding and then shrinking which
is part of the boot process.

Link: https://lkml.kernel.org/r/20240814161944.55347-5-sidhartha.kumar@oracle.com
Signed-off-by: Sidhartha Kumar <sidhartha.kumar@oracle.com>
Cc: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-09-01 20:26:15 -07:00
Sidhartha Kumar
3cc6f42a53 maple_tree: move up mas_wr_store_setup() and mas_wr_prealloc_setup()
Subsequent patches require these definitions to be higher, no functional
changes intended.

Link: https://lkml.kernel.org/r/20240814161944.55347-4-sidhartha.kumar@oracle.com
Signed-off-by: Sidhartha Kumar <sidhartha.kumar@oracle.com>
Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-09-01 20:26:14 -07:00
Sidhartha Kumar
19138a2cc1 maple_tree: introduce mas_wr_prealloc_setup()
Introduce a helper function, mas_wr_prealoc_setup(), that will set up a
maple write state in order to start a walk of a maple tree.

Link: https://lkml.kernel.org/r/20240814161944.55347-3-sidhartha.kumar@oracle.com
Signed-off-by: Sidhartha Kumar <sidhartha.kumar@oracle.com>
Cc: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-09-01 20:26:14 -07:00
Wei Yang
c64d66153b maple_tree: fix comment typo with corresponding maple_status
In comment of function mas_start(), we list the return value of different
cases.  According to the comment context, tell the maple_status here is
more consistent with others.

Let's correct it with ma_active in the case it's a tree.

Link: https://lkml.kernel.org/r/20240812150925.31551-2-richard.weiyang@gmail.com
Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
Reviewed-by: Liam R. Howlett <Liam.Howlett@Oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-09-01 20:26:11 -07:00
Wei Yang
7a0529d0c2 maple_tree: fix comment typo of ma_root
In comment of mas_start(), we lists the return value for different cases. 
In case of a single entry, we set mas->status to ma_root, while the
comment uses mas_root, which is not a maple_status.

Fix the typo according to the code.

Link: https://lkml.kernel.org/r/20240812150925.31551-1-richard.weiyang@gmail.com
Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-09-01 20:26:11 -07:00
Sidhartha Kumar
617f8e4d76 maple_tree: add test to replicate low memory race conditions
Add new callback fields to the userspace implementation of struct
kmem_cache.  This allows for executing callback functions in order to
further test low memory scenarios where node allocation is retried.

This callback can help test race conditions by calling a function when a
low memory event is tested.

Link: https://lkml.kernel.org/r/20240812190543.71967-2-sidhartha.kumar@oracle.com
Signed-off-by: Sidhartha Kumar <sidhartha.kumar@oracle.com>
Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Matthew Wilcox <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-09-01 20:26:11 -07:00
Sidhartha Kumar
e1b8b883bb maple_tree: reset mas->index and mas->last on write retries
The following scenario can result in a race condition:

Consider a node with the following indices and values

	a<------->b<----------->c<--------->d
	    0xA        NULL          0xB

	CPU 1			  CPU 2
      ---------        		---------
	mas_set_range(a,b)
	mas_erase()
		-> range is expanded (a,c) because of null expansion

	mas_nomem()
	mas_unlock()
				mas_store_range(b,c,0xC)

The node now looks like:

	a<------->b<----------->c<--------->d
	    0xA        0xC          0xB

	mas_lock()
	mas_erase() <------ range of erase is still (a,c)

The node is now NULL from (a,c) but the write from CPU 2 should have been
retained and range (b,c) should still have 0xC as its value.  We can fix
this by re-intializing to the original index and last.  This does not need
a cc: Stable as there are no users of the maple tree which use internal
locking and this condition is only possible with internal locking.

Link: https://lkml.kernel.org/r/20240812190543.71967-1-sidhartha.kumar@oracle.com
Signed-off-by: Sidhartha Kumar <sidhartha.kumar@oracle.com>
Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Matthew Wilcox <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-09-01 20:26:11 -07:00
Thorsten Blum
592c9330e3 lib: test_hmm: use min() to improve dmirror_exclusive()
Use min() to simplify the dmirror_exclusive() function and improve its
readability.

Link: https://lkml.kernel.org/r/20240726131245.161695-1-thorsten.blum@toblux.com
Signed-off-by: Thorsten Blum <thorsten.blum@toblux.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Cc: Jérôme Glisse <jglisse@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-09-01 20:25:52 -07:00
Danilo Krummrich
590b9d576c mm: kvmalloc: align kvrealloc() with krealloc()
Besides the obvious (and desired) difference between krealloc() and
kvrealloc(), there is some inconsistency in their function signatures and
behavior:

 - krealloc() frees the memory when the requested size is zero, whereas
   kvrealloc() simply returns a pointer to the existing allocation.

 - krealloc() behaves like kmalloc() if a NULL pointer is passed, whereas
   kvrealloc() does not accept a NULL pointer at all and, if passed,
   would fault instead.

 - krealloc() is self-contained, whereas kvrealloc() relies on the caller
   to provide the size of the previous allocation.

Inconsistent behavior throughout allocation APIs is error prone, hence
make kvrealloc() behave like krealloc(), which seems superior in all
mentioned aspects.

Besides that, implementing kvrealloc() by making use of krealloc() and
vrealloc() provides oppertunities to grow (and shrink) allocations more
efficiently.  For instance, vrealloc() can be optimized to allocate and
map additional pages to grow the allocation or unmap and free unused pages
to shrink the allocation.

[dakr@kernel.org: document concurrency restrictions]
  Link: https://lkml.kernel.org/r/20240725125442.4957-1-dakr@kernel.org
[dakr@kernel.org: disable KASAN when switching to vmalloc]
  Link: https://lkml.kernel.org/r/20240730185049.6244-2-dakr@kernel.org
[dakr@kernel.org: properly document __GFP_ZERO behavior]
  Link: https://lkml.kernel.org/r/20240730185049.6244-5-dakr@kernel.org
Link: https://lkml.kernel.org/r/20240722163111.4766-3-dakr@kernel.org
Signed-off-by: Danilo Krummrich <dakr@kernel.org>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Chandan Babu R <chandan.babu@oracle.com>
Cc: Christian König <christian.koenig@amd.com>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Christoph Lameter <cl@linux.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Hyeonggon Yoo <42.hyeyoo@gmail.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Kees Cook <kees@kernel.org>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Miguel Ojeda <ojeda@kernel.org>
Cc: Oliver Upton <oliver.upton@linux.dev>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Roman Gushchin <roman.gushchin@linux.dev>
Cc: Uladzislau Rezki <urezki@gmail.com>
Cc: Wedson Almeida Filho <wedsonaf@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-09-01 20:25:44 -07:00
Suren Baghdasaryan
052a45c1cb alloc_tag: fix allocation tag reporting when CONFIG_MODULES=n
codetag_module_init() is used to initialize sections containing allocation
tags.  This function is used to initialize module sections as well as core
kernel sections, in which case the module parameter is set to NULL.  This
function has to be called even when CONFIG_MODULES=n to initialize core
kernel allocation tag sections.  When CONFIG_MODULES=n, this function is a
NOP, which is wrong.  This leads to /proc/allocinfo reported as empty. 
Fix this by making it independent of CONFIG_MODULES.

Link: https://lkml.kernel.org/r/20240828231536.1770519-1-surenb@google.com
Fixes: 916cc5167c ("lib: code tagging framework")
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Kent Overstreet <kent.overstreet@linux.dev>
Cc: Pasha Tatashin <pasha.tatashin@soleen.com>
Cc: Sourav Panda <souravpanda@google.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: <stable@vger.kernel.org>	[6.10+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-09-01 17:59:03 -07:00
Liam R. Howlett
f806de88d8 maple_tree: remove rcu_read_lock() from mt_validate()
The write lock should be held when validating the tree to avoid updates
racing with checks.  Holding the rcu read lock during a large tree
validation may also cause a prolonged rcu read window and "rcu_preempt
detected stalls" warnings.

Link: https://lore.kernel.org/all/0000000000001d12d4062005aea1@google.com/
Link: https://lkml.kernel.org/r/20240820175417.2782532-1-Liam.Howlett@oracle.com
Fixes: 54a611b605 ("Maple Tree: add new data structure")
Signed-off-by: Liam R. Howlett <Liam.Howlett@Oracle.com>
Reported-by: syzbot+036af2f0c7338a33b0cd@syzkaller.appspotmail.com
Cc: Hillf Danton <hdanton@sina.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: "Paul E. McKenney" <paulmck@kernel.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-09-01 17:59:01 -07:00
Linus Torvalds
d5d547aa7b Random number generator fixes for Linux 6.11-rc6.
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEq5lC5tSkz8NBJiCnSfxwEqXeA64FAmbPwucACgkQSfxwEqXe
 A653nRAA0pk0iDH9iz/DLXVy5e4WWE1WQyCdT4jB5H2SItG3fz4kcKz0x1qcPEtA
 RUhO4bZLTeFE/QkAQROA41x0ysAbg2dnIefO6CzFhndKGDyOEfUKYAsb65HiYj8Z
 HI9XGRYWc8kD35BGDtqGrgbgDgSVS3JPASC8mPJKv608h9f1M1ABqtyuft8bxz57
 2OxuXoxVVN4ZI0VyQqqhT1roEiCIuuDaSZlPUws2PjnLxcqIQXXXPMLgN2vi9QzG
 cCslhtJMxBAhQ/skAVbxQlI6S2OB0zGROE78k2PK7eqGZuBAex9G0kuWH9Rl3RQL
 NmYjITWPZts7LRxCcvUQzxcKYsGb08mvCMCu+AAS9QfI1rOQu/TS7+4IfRHnHyg0
 J7OBN0aPqKfciAch5NCfxN5EMUAlwXdro2/salONdGNF7do9mdjt/LqUzhbSKBPi
 kpVWBkLHzl0obPR1F/BBfC2oRW7Us5ShjaLod9J1DcJps/GTr7MXir8lEnPxwypJ
 5t4F8Y4M34MpxmVZ/k2oNsEGhugpicaTAqa5KO4vqtWDPk1TNHi2POxU1Fjnth5K
 ds/NxoRvXV/2K5V+XiJQnngt5pgRjqU5DgCh19Bq1W7PqqbGkVWmzIa+zfYm9sCH
 +RuZiyjM16RyN/tDAxhfKowBqsagW6/DM7LJe3fWJO7yCem/S5g=
 =a3c1
 -----END PGP SIGNATURE-----

Merge tag 'random-6.11-rc6-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/crng/random

Pull random number generator fix from Jason Donenfeld:
 "Reject invalid flags passed to vgetrandom() in the same way that
  getrandom() does, so that the behavior is the same, from Yann.

  The flags argument to getrandom() only has a behavioral effect on the
  function if the RNG isn't initialized yet, so vgetrandom() falls back
  to the syscall in that case. But if the RNG is initialized, all of the
  flags behave the same way, so vgetrandom() didn't bother checking
  them, and just ignored them entirely.

  But that doesn't account for invalid flags passed in, which need to be
  rejected so we can use them later"

* tag 'random-6.11-rc6-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/crng/random:
  random: vDSO: reject unknown getrandom() flags
2024-08-29 13:59:18 +12:00
Anshuman Khandual
d7bcc37436 lib/test_bits.c: Add tests for GENMASK_U128()
This adds GENMASK_U128() tests although currently only 64 bit wide masks
are being tested.

Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: linux-kernel@vger.kernel.org
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Signed-off-by: Yury Norov <yury.norov@gmail.com>
2024-08-28 06:54:39 -07:00
Vlastimil Babka
4e1c44b3db kunit, slub: add test_kfree_rcu() and test_leak_destroy()
Add a test that will create cache, allocate one object, kfree_rcu() it
and attempt to destroy it. As long as the usage of kvfree_rcu_barrier()
in kmem_cache_destroy() works correctly, there should be no warnings in
dmesg and the test should pass.

Additionally add a test_leak_destroy() test that leaks an object on
purpose and verifies that kmem_cache_destroy() catches it.

Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
2024-08-27 14:12:51 +02:00
David Gow
f2c6dbd220 kunit: Device wrappers should also manage driver name
kunit_driver_create() accepts a name for the driver, but does not copy
it, so if that name is either on the stack, or otherwise freed, we end
up with a use-after-free when the driver is cleaned up.

Instead, strdup() the name, and manage it as another KUnit allocation.
As there was no existing kunit_kstrdup(), we add one. Further, add a
kunit_ variant of strdup_const() and kfree_const(), so we don't need to
allocate and manage the string in the majority of cases where it's a
constant.

However, these are inline functions, and is_kernel_rodata() only works
for built-in code. This causes problems in two cases:
- If kunit is built as a module, __{start,end}_rodata is not defined.
- If a kunit test using these functions is built as a module, it will
  suffer the same fate.

This fixes a KASAN splat with overflow.overflow_allocation_test, when
built as a module.

Restrict the is_kernel_rodata() case to when KUnit is built as a module,
which fixes the first case, at the cost of losing the optimisation.

Also, make kunit_{kstrdup,kfree}_const non-inline, so that other modules
using them will not accidentally depend on is_kernel_rodata(). If KUnit
is built-in, they'll benefit from the optimisation, if KUnit is not,
they won't, but the string will be properly duplicated.

Fixes: d03c720e03 ("kunit: Add APIs for managing devices")
Reported-by: Nico Pache <npache@redhat.com>
Closes: https://groups.google.com/g/kunit-dev/c/81V9b9QYON0
Reviewed-by: Kees Cook <kees@kernel.org>
Reviewed-by: Maxime Ripard <mripard@kernel.org>
Reviewed-by: Rae Moar <rmoar@google.com>
Signed-off-by: David Gow <davidgow@google.com>
Tested-by: Rae Moar <rmoar@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2024-08-26 07:03:46 -06:00
Yann Droneaud
28f5df210d random: vDSO: reject unknown getrandom() flags
Like the getrandom() syscall, vDSO getrandom() must also reject unknown
flags. [1]

It would be possible to return -EINVAL from vDSO itself, but in the
possible case that a new flag is added to getrandom() syscall in the
future, it would be easier to get the behavior from the syscall, instead
of erroring until the vDSO is extended to support the new flag or
explicitly falling back.

[1] Designing the API: Planning for Extension
    https://docs.kernel.org/process/adding-syscalls.html#designing-the-api-planning-for-extension

Signed-off-by: Yann Droneaud <yann@droneaud.fr>
[Jason: reworded commit message]
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2024-08-26 09:58:52 +02:00
Caleb Sander Mateos
e68ac2b488 softirq: Remove unused 'action' parameter from action callback
When soft interrupt actions are called, they are passed a pointer to the
struct softirq action which contains the action's function pointer.

This pointer isn't useful, as the action callback already knows what
function it is. And since each callback handles a specific soft interrupt,
the callback also knows which soft interrupt number is running.

No soft interrupt action callback actually uses this parameter, so remove
it from the function pointer signature. This clarifies that soft interrupt
actions are global routines and makes it slightly cheaper to call them.

Signed-off-by: Caleb Sander Mateos <csander@purestorage.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Jens Axboe <axboe@kernel.dk>
Link: https://lore.kernel.org/all/20240815171549.3260003-1-csander@purestorage.com
2024-08-20 17:13:40 +02:00
Linus Torvalds
2865baf540 x86: support user address masking instead of non-speculative conditional
The Spectre-v1 mitigations made "access_ok()" much more expensive, since
it has to serialize execution with the test for a valid user address.

All the normal user copy routines avoid this by just masking the user
address with a data-dependent mask instead, but the fast
"unsafe_user_read()" kind of patterms that were supposed to be a fast
case got slowed down.

This introduces a notion of using

	src = masked_user_access_begin(src);

to do the user address sanity using a data-dependent mask instead of the
more traditional conditional

	if (user_read_access_begin(src, len)) {

model.

This model only works for dense accesses that start at 'src' and on
architectures that have a guard region that is guaranteed to fault in
between the user space and the kernel space area.

With this, the user access doesn't need to be manually checked, because
a bad address is guaranteed to fault (by some architecture masking
trick: on x86-64 this involves just turning an invalid user address into
all ones, since we don't map the top of address space).

This only converts a couple of examples for now.  Example x86-64 code
generation for loading two words from user space:

        stac
        mov    %rax,%rcx
        sar    $0x3f,%rcx
        or     %rax,%rcx
        mov    (%rcx),%r13
        mov    0x8(%rcx),%r14
        clac

where all the error handling and -EFAULT is now purely handled out of
line by the exception path.

Of course, if the micro-architecture does badly at 'clac' and 'stac',
the above is still pitifully slow.  But at least we did as well as we
could.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2024-08-19 11:31:18 -07:00
Linus Torvalds
b718175853 bcachefs fixes for 6.11-rc4
- New on disk format version, bcachefs_metadata_version_disk_accounting_inum
 
 This adds one more disk accounting counter, which counts disk usage and
 number of extents per inode number. This lets us track fragmentation,
 for implementing defragmentation later, and it also counts disk usage
 per inode in all snapshots, which will be a useful thing to expose to
 users.
 
 - One performance issue we've observed is threads spinning when they
 should be waiting for dirty keys in the key cache to be flushed by
 journal reclaim, so we now have hysteresis for the waiting thread, as
 well as improving the tracepoint and a new time_stat, for tracking time
 blocked waiting on key cache flushing.
 
 And, various assorted smaller fixes.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEKnAFLkS8Qha+jvQrE6szbY3KbnYFAma/9QkACgkQE6szbY3K
 bnYcBw/+LBSZ415gWSjPktdecf5rc6K4KxETxAxV0f0KesYzxqAtQzN0SCDvKt65
 3aALU03wM8vWITiLS38/ckT+j6S2BpXcOxdu/OC0nRYQEUg9ZLvqEG5lQ3a/LliV
 Q64N33qsSr6QaKszFllLYcN4tGduKg8HoMlHn6+vJ7HNPjdfv0HHERSUsc7K84/w
 jkRtDE2NxsRJZKMEvIFp8hd5KXUR5zyBz/kc4P0WliLXpSyJLITzhKw1JV7ikKVD
 0mO2bJ/0i7wPIabAD2HJahvbC7fl+2fkYFxUJ2XnvMTgU/+QyeGHEufbcbVrVSp0
 BpzBTmSMFbGXBkbQBruFX5rJetzXeBqdYf0Yfavd4KDhGvYlSfDZQUapXT1QKC2q
 aHSB/s+2r7Crr/MBJyjbeFgXFTNGvI5yerlbdp2yj1kxjYJHHaKrp6h7n6XXk21W
 /mGF5tkIMkFTv98rQnIaky4neJzOPsLTTgxeR8zEudCgMaVUqEcaMdIFvARDjY/3
 n52VR0zl3olV3vu7LgHaHfgH6lfaMV0sHPaGNYGL0YL+bCJD+lYM8a6l9aaks8vk
 md7+mFcOS4FUdDdS8MEKIN/k/gkEOC/EpmI864i9rIl0SiNXNy7FPTDKON8b+Ury
 5omBMUQMEe9Q/pgKGXfpJWFynhSPEVf4y1DIOsrXk/jeBqenFyo=
 =BPGT
 -----END PGP SIGNATURE-----

Merge tag 'bcachefs-2024-08-16' of git://evilpiepirate.org/bcachefs

Pull bcachefs fixes from Kent OverstreetL

 - New on disk format version, bcachefs_metadata_version_disk_accounting_inum

   This adds one more disk accounting counter, which counts disk usage
   and number of extents per inode number. This lets us track
   fragmentation, for implementing defragmentation later, and it also
   counts disk usage per inode in all snapshots, which will be a useful
   thing to expose to users.

 - One performance issue we've observed is threads spinning when they
   should be waiting for dirty keys in the key cache to be flushed by
   journal reclaim, so we now have hysteresis for the waiting thread, as
   well as improving the tracepoint and a new time_stat, for tracking
   time blocked waiting on key cache flushing.

... and various assorted smaller fixes.

* tag 'bcachefs-2024-08-16' of git://evilpiepirate.org/bcachefs:
  bcachefs: Fix locking in __bch2_trans_mark_dev_sb()
  bcachefs: fix incorrect i_state usage
  bcachefs: avoid overflowing LRU_TIME_BITS for cached data lru
  bcachefs: Fix forgetting to pass trans to fsck_err()
  bcachefs: Increase size of cuckoo hash table on too many rehashes
  bcachefs: bcachefs_metadata_version_disk_accounting_inum
  bcachefs: Kill __bch2_accounting_mem_mod()
  bcachefs: Make bkey_fsck_err() a wrapper around fsck_err()
  bcachefs: Fix warning in __bch2_fsck_err() for trans not passed in
  bcachefs: Add a time_stat for blocked on key cache flush
  bcachefs: Improve trans_blocked_journal_reclaim tracepoint
  bcachefs: Add hysteresis to waiting on btree key cache flush
  lib/generic-radix-tree.c: Fix rare race in __genradix_ptr_alloc()
  bcachefs: Convert for_each_btree_node() to lockrestart_do()
  bcachefs: Add missing downgrade table entry
  bcachefs: disk accounting: ignore unknown types
  bcachefs: bch2_accounting_invalid() fixup
  bcachefs: Fix bch2_trigger_alloc when upgrading from old versions
  bcachefs: delete faulty fastpath in bch2_btree_path_traverse_cached()
2024-08-17 09:46:10 -07:00
Herbert Xu
8e3a67f2de crypto: lib/mpi - Add error checks to extension
The remaining functions added by commit
a8ea8bdd9d did not check for memory
allocation errors.  Add the checks and change the API to allow errors
to be returned.

Fixes: a8ea8bdd9d ("lib/mpi: Extend the MPI library")
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2024-08-17 13:55:50 +08:00
Herbert Xu
fca5cb4dd2 Revert "lib/mpi: Extend the MPI library"
This partially reverts commit a8ea8bdd9d.

Most of it is no longer needed since sm2 has been removed.  However,
the following functions have been kept as they have developed other
uses:

mpi_copy

mpi_mod

mpi_test_bit
mpi_set_bit
mpi_rshift

mpi_add
mpi_sub
mpi_addm
mpi_subm

mpi_mul
mpi_mulm

mpi_tdiv_r
mpi_fdiv_r

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2024-08-17 13:55:50 +08:00
Justin Stitt
bbf3c7ff9d lib/string_helpers: rework overflow-dependent code
When @size is 0, the desired behavior is to allow unlimited bytes to be
parsed. Currently, this relies on some intentional arithmetic overflow
where --size gives us SIZE_MAX when size is 0.

Explicitly spell out the desired behavior without relying on intentional
overflow/underflow.

Signed-off-by: Justin Stitt <justinstitt@google.com>
Link: https://lore.kernel.org/r/20240808-b4-string_helpers_caa133-v1-1-686a455167c4@google.com
Signed-off-by: Kees Cook <kees@kernel.org>
2024-08-15 09:26:02 -07:00
Masahiro Yamada
9c6b7fbbd7 fortify: use if_changed_dep to record header dependency in *.cmd files
After building with CONFIG_FORTIFY_SOURCE=y, many .*.d files are left
in lib/test_fortify/ because the compiler outputs header dependencies
into *.d without fixdep being invoked.

When compiling C files, if_changed_dep should be used so that the
auto-generated header dependencies are recorded in .*.cmd files.

Currently, if_changed is incorrectly used, and only two headers are
hard-coded in lib/Makefile.

In the previous patch version, the kbuild test robot detected new errors
on GCC 7.

GCC 7 or older does not produce test.d with the following test code:

 $ echo 'void b(void) __attribute__((__error__(""))); void a(void) { b(); }' |
   gcc -Wp,-MMD,test.d -c -o /dev/null -x c -

Perhaps, this was a bug that existed in older GCC versions.

Skip the tests for GCC<=7 for now, as this will be eventually solved
when we bump the minimal supported GCC version.

Link: https://lore.kernel.org/oe-kbuild-all/CAK7LNARmJcyyzL-jVJfBPi3W684LTDmuhMf1koF0TXoCpKTmcw@mail.gmail.com/T/#m13771bf78ae21adff22efc4d310c973fb4bcaf67
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Link: https://lore.kernel.org/r/20240727150302.1823750-4-masahiroy@kernel.org
Signed-off-by: Kees Cook <kees@kernel.org>
2024-08-15 09:26:02 -07:00
Masahiro Yamada
5a8d0c46c9 fortify: move test_fortify.sh to lib/test_fortify/
This script is only used in lib/test_fortify/.

There is no reason to keep it in scripts/.

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Link: https://lore.kernel.org/r/20240727150302.1823750-3-masahiroy@kernel.org
Signed-off-by: Kees Cook <kees@kernel.org>
2024-08-15 09:26:02 -07:00
Masahiro Yamada
4e9903b086 fortify: refactor test_fortify Makefile to fix some build problems
There are some issues in the test_fortify Makefile code.

Problem 1: cc-disable-warning invokes compiler dozens of times

To see how many times the cc-disable-warning is evaluated, change
this code:

  $(call cc-disable-warning,fortify-source)

to:

  $(call cc-disable-warning,$(shell touch /tmp/fortify-$$$$)fortify-source)

Then, build the kernel with CONFIG_FORTIFY_SOURCE=y. You will see a
large number of '/tmp/fortify-<PID>' files created:

  $ ls -1 /tmp/fortify-* | wc
       80      80    1600

This means the compiler was invoked 80 times just for checking the
-Wno-fortify-source flag support.

$(call cc-disable-warning,fortify-source) should be added to a simple
variable instead of a recursive variable.

Problem 2: do not recompile string.o when the test code is updated

The test cases are independent of the kernel. However, when the test
code is updated, $(obj)/string.o is rebuilt and vmlinux is relinked
due to this dependency:

  $(obj)/string.o: $(obj)/$(TEST_FORTIFY_LOG)

always-y is suitable for building the log files.

Problem 3: redundant code

  clean-files += $(addsuffix .o, $(TEST_FORTIFY_LOGS))

... is unneeded because the top Makefile globally cleans *.o files.

This commit fixes these issues and makes the code readable.

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Link: https://lore.kernel.org/r/20240727150302.1823750-2-masahiroy@kernel.org
Signed-off-by: Kees Cook <kees@kernel.org>
2024-08-15 09:26:02 -07:00
Ivan Orlov
92e9bac181 kunit/overflow: Fix UB in overflow_allocation_test
The 'device_name' array doesn't exist out of the
'overflow_allocation_test' function scope. However, it is being used as
a driver name when calling 'kunit_driver_create' from
'kunit_device_register'. It produces the kernel panic with KASAN
enabled.

Since this variable is used in one place only, remove it and pass the
device name into kunit_device_register directly as an ascii string.

Signed-off-by: Ivan Orlov <ivan.orlov0322@gmail.com>
Reviewed-by: David Gow <davidgow@google.com>
Link: https://lore.kernel.org/r/20240815000431.401869-1-ivan.orlov0322@gmail.com
Signed-off-by: Kees Cook <kees@kernel.org>
2024-08-15 09:24:55 -07:00
Paul E. McKenney
ac9d45544c locking/csd_lock: Provide an indication of ongoing CSD-lock stall
If a CSD-lock stall goes on long enough, it will cause an RCU CPU
stall warning.  This additional warning provides much additional
console-log traffic and little additional information.  Therefore,
provide a new csd_lock_is_stuck() function that returns true if there
is an ongoing CSD-lock stall.  This function will be used by the RCU
CPU stall warnings to provide a one-line indication of the stall when
this function returns true.

[ neeraj.upadhyay: Apply Rik van Riel feedback. ]
[ neeraj.upadhyay: Apply kernel test robot feedback. ]

Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Cc: Imran Khan <imran.f.khan@oracle.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Leonardo Bras <leobras@redhat.com>
Cc: "Peter Zijlstra (Intel)" <peterz@infradead.org>
Cc: Rik van Riel <riel@surriel.com>
Signed-off-by: Neeraj Upadhyay <neeraj.upadhyay@kernel.org>
2024-08-15 00:05:39 +05:30
Kent Overstreet
b2f11c6f3e lib/generic-radix-tree.c: Fix rare race in __genradix_ptr_alloc()
If we need to increase the tree depth, allocate a new node, and then
race with another thread that increased the tree depth before us, we'll
still have a preallocated node that might be used later.

If we then use that node for a new non-root node, it'll still have a
pointer to the old root instead of being zeroed - fix this by zeroing it
in the cmpxchg failure path.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-08-13 22:56:50 -04:00
Herbert Xu
da4fe6815a Revert "lib/mpi: Introduce ec implementation to MPI library"
This reverts commit d58bb7e55a.

It's no longer needed since sm2 has been removed.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2024-08-10 12:25:34 +08:00
Dmitry Vyukov
6cd0dd934b kcov: Add interrupt handling self test
Add a boot self test that can catch sprious coverage from interrupts.
The coverage callback filters out interrupt code, but only after the
handler updates preempt count. Some code periodically leaks out
of that section and leads to spurious coverage.
Add a best-effort (but simple) test that is likely to catch such bugs.
If the test is enabled on CI systems that use KCOV, they should catch
any issues fast.

Signed-off-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Alexander Potapenko <glider@google.com>
Reviewed-by: Marco Elver <elver@google.com>
Reviewed-by: Andrey Konovalov <andreyknvl@gmail.com>
Link: https://lore.kernel.org/all/7662127c97e29da1a748ad1c1539dd7b65b737b2.1718092070.git.dvyukov@google.com
2024-08-08 17:36:35 +02:00
Gatlin Newhouse
7424fc6b86 x86/traps: Enable UBSAN traps on x86
Currently ARM64 extracts which specific sanitizer has caused a trap via
encoded data in the trap instruction. Clang on x86 currently encodes the
same data in the UD1 instruction but x86 handle_bug() and
is_valid_bugaddr() currently only look at UD2.

Bring x86 to parity with ARM64, similar to commit 25b84002af ("arm64:
Support Clang UBSAN trap codes for better reporting"). See the llvm
links for information about the code generation.

Enable the reporting of UBSAN sanitizer details on x86 compiled with clang
when CONFIG_UBSAN_TRAP=y by analysing UD1 and retrieving the type immediate
which is encoded by the compiler after the UD1.

[ tglx: Simplified it by moving the printk() into handle_bug() ]

Signed-off-by: Gatlin Newhouse <gatlin.newhouse@gmail.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/all/20240724000206.451425-1-gatlin.newhouse@gmail.com
Link: https://github.com/llvm/llvm-project/commit/c5978f42ec8e9#diff-bb68d7cd885f41cfc35843998b0f9f534adb60b415f647109e597ce448e92d9f
Link: https://github.com/llvm/llvm-project/blob/main/llvm/lib/Target/X86/X86InstrSystem.td#L27
2024-08-06 13:42:40 +02:00
Xavier
93c8332c83 Union-Find: add a new module in kernel library
This patch implements a union-find data structure in the kernel library,
which includes operations for allocating nodes, freeing nodes,
finding the root of a node, and merging two nodes.

Signed-off-by: Xavier <xavier_qy@163.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2024-07-30 13:04:36 -10:00
Tejun Heo
c8faf11cd1 Linux 6.11-rc1
-----BEGIN PGP SIGNATURE-----
 
 iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAmamtfseHHRvcnZhbGRz
 QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGC20H/j6G3+7gYGDtSsl9
 5eH7UFzk18JeIG4c9Z5q9p2YVqdTggHOyWUA0qYBJWLyjpQa0q5SO+Qf2VwH8bH7
 NpHZQYIdRB6dy/MySZII/6KdOJobz779P8EOPVdPs6PaAmiwOwzdK4aHxhi3iQJv
 8QHmswjnT6t44p7WX1gZCUL2R3TL5hyA505BfPBz5OPBLkuuTArCBO8mZfTvk3R6
 fskKrVBC3oEb9Vgx/bycah9wTJn4ptPUGggaTnbu44RkhZcHfMiciqOrtMtYtqKx
 fmGQllbVQ8CHp4IBZ5nYfUB4E04Zg+XqNeYHa0T9R97e7crZ5iMKutujydmnhqA0
 r3Ca53w=
 =R3sl
 -----END PGP SIGNATURE-----

Merge tag 'v6.11-rc1' into for-6.12

Linux 6.11-rc1
2024-07-30 09:30:11 -10:00
Stephen Boyd
5ac7973032 platform: Add test managed platform_device/driver APIs
Introduce KUnit resource wrappers around platform_driver_register(),
platform_device_alloc(), and platform_device_add() so that test authors
can register platform drivers/devices from their tests and have the
drivers/devices automatically be unregistered when the test is done.

This makes test setup code simpler when a platform driver or platform
device is needed. Add a few test cases at the same time to make sure the
APIs work as intended.

Cc: Brendan Higgins <brendan.higgins@linux.dev>
Reviewed-by: David Gow <davidgow@google.com>
Cc: Rae Moar <rmoar@google.com>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
Link: https://lore.kernel.org/r/20240718210513.3801024-6-sboyd@kernel.org
2024-07-29 15:33:12 -07:00
Linus Torvalds
cb04e8b1d2 minmax: don't use max() in situations that want a C constant expression
We only had a couple of array[] declarations, and changing them to just
use 'MAX()' instead of 'max()' fixes the issue.

This will allow us to simplify our min/max macros enormously, since they
can now unconditionally use temporary variables to avoid using the
argument values multiple times.

Cc: David Laight <David.Laight@aculab.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2024-07-28 20:23:27 -07:00
Linus Torvalds
1a251f52cf minmax: make generic MIN() and MAX() macros available everywhere
This just standardizes the use of MIN() and MAX() macros, with the very
traditional semantics.  The goal is to use these for C constant
expressions and for top-level / static initializers, and so be able to
simplify the min()/max() macros.

These macro names were used by various kernel code - they are very
traditional, after all - and all such users have been fixed up, with a
few different approaches:

 - trivial duplicated macro definitions have been removed

   Note that 'trivial' here means that it's obviously kernel code that
   already included all the major kernel headers, and thus gets the new
   generic MIN/MAX macros automatically.

 - non-trivial duplicated macro definitions are guarded with #ifndef

   This is the "yes, they define their own versions, but no, the include
   situation is not entirely obvious, and maybe they don't get the
   generic version automatically" case.

 - strange use case #1

   A couple of drivers decided that the way they want to describe their
   versioning is with

	#define MAJ 1
	#define MIN 2
	#define DRV_VERSION __stringify(MAJ) "." __stringify(MIN)

   which adds zero value and I just did my Alexander the Great
   impersonation, and rewrote that pointless Gordian knot as

	#define DRV_VERSION "1.2"

   instead.

 - strange use case #2

   A couple of drivers thought that it's a good idea to have a random
   'MIN' or 'MAX' define for a value or index into a table, rather than
   the traditional macro that takes arguments.

   These values were re-written as C enum's instead. The new
   function-line macros only expand when followed by an open
   parenthesis, and thus don't clash with enum use.

Happily, there weren't really all that many of these cases, and a lot of
users already had the pattern of using '#ifndef' guarding (or in one
case just using '#undef MIN') before defining their own private version
that does the same thing. I left such cases alone.

Cc: David Laight <David.Laight@aculab.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2024-07-28 15:49:18 -07:00
Linus Torvalds
910bfc26d1 Rust changes for v6.11
The highlight is the establishment of a minimum version for the Rust
 toolchain, including 'rustc' (and bundled tools) and 'bindgen'.
 
 The initial minimum will be the pinned version we currently have, i.e.
 we are just widening the allowed versions. That covers 3 stable Rust
 releases: 1.78.0, 1.79.0, 1.80.0 (getting released tomorrow), plus beta,
 plus nightly.
 
 This should already be enough for kernel developers in distributions
 that provide recent Rust compiler versions routinely, such as Arch
 Linux, Debian Unstable (outside the freeze period), Fedora Linux,
 Gentoo Linux (especially the testing channel), Nix (unstable) and
 openSUSE Slowroll and Tumbleweed.
 
 In addition, the kernel is now being built-tested by Rust's pre-merge
 CI. That is, every change that is attempting to land into the Rust
 compiler is tested against the kernel, and it is merged only if it
 passes. Similarly, the bindgen tool has agreed to build the kernel in
 their CI too.
 
 Thus, with the pre-merge CI in place, both projects hope to avoid
 unintentional changes to Rust that break the kernel. This means that,
 in general, apart from intentional changes on their side (that we
 will need to workaround conditionally on our side), the upcoming Rust
 compiler versions should generally work.
 
 In addition, the Rust project has proposed getting the kernel into
 stable Rust (at least solving the main blockers) as one of its three
 flagship goals for 2024H2 [1].
 
 I would like to thank Niko, Sid, Emilio et al. for their help promoting
 the collaboration between Rust and the kernel.
 
 [1] https://rust-lang.github.io/rust-project-goals/2024h2/index.html#flagship-goals
 
 Toolchain and infrastructure:
 
  - Support several Rust toolchain versions.
 
  - Support several bindgen versions.
 
  - Remove 'cargo' requirement and simplify 'rusttest', thanks to 'alloc'
    having been dropped last cycle.
 
  - Provide proper error reporting for the 'rust-analyzer' target.
 
 'kernel' crate:
 
  - Add 'uaccess' module with a safe userspace pointers abstraction.
 
  - Add 'page' module with a 'struct page' abstraction.
 
  - Support more complex generics in workqueue's 'impl_has_work!' macro.
 
 'macros' crate:
 
  - Add 'firmware' field support to the 'module!' macro.
 
  - Improve 'module!' macro documentation.
 
 Documentation:
 
  - Provide instructions on what packages should be installed to build
    the kernel in some popular Linux distributions.
 
  - Introduce the new kernel.org LLVM+Rust toolchains.
 
  - Explain '#[no_std]'.
 
 And a few other small bits.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEPjU5OPd5QIZ9jqqOGXyLc2htIW0FAmahqRUACgkQGXyLc2ht
 IW0xbA/6A26b14LjvmFBJU6LZb0ey1BCbK9cOWtd6K6f/uWp108WAIdA/+gHgOGU
 I6rW8nXk3af078lHRqv0ihMDUks/1mz5wyxEXoZ/mVvRJbzH9TsHN7cSP2fr4H14
 8rES4esr2XBlu9OdgDFb/o7jequ7PE0+WQDapV6eAhWQlBC6AI+ShyX26pWcB5gv
 8O4mE59Up51d21L8apVh+pnEgBsCsu7c68pUMbrk2k4sHVvnRti4iLoVlemf4X80
 Di9hyi8iN/MvWMdfq+hCIufUIbcWde07HcCbLjQlkJv0sc20V+UIGUx4EOUasOTY
 ugUyzhlFNGPxJYayAZAb8KJtQZhSbGZ+R244Z/CoV2RMlEw9LxSCpyzHr1nalOLT
 01gqZh6+gIFyPm6F0ORsetcV6yzdvUcGTjx1vuEJ9qqeKG/gc/VqFOcmCPaT7y8K
 nTOMg6zY3mzaqTn1iBebid7INzXJN7ha9dk1TkDv47BNZAic51d3L0hQFXuDrEuu
 MxVIPTAPKJSaQTCh0jrLxLJ649v/98OP0urYqlVeKuTeovupETxCsBTVtjjjsv+w
 ZomqEO+JWuf7hjG0RLuCwi/IvWpUFpEdOal4qfHbKLOAOn7zxV/WrG675HcRKbw5
 Zkr/0Q44fwbZWd2b/svTO1qOKaYV7oL0utVOdUb2KX05K71NNVo=
 =8PYF
 -----END PGP SIGNATURE-----

Merge tag 'rust-6.11' of https://github.com/Rust-for-Linux/linux

Pull Rust updates from Miguel Ojeda:
 "The highlight is the establishment of a minimum version for the Rust
  toolchain, including 'rustc' (and bundled tools) and 'bindgen'.

  The initial minimum will be the pinned version we currently have, i.e.
  we are just widening the allowed versions. That covers three stable
  Rust releases: 1.78.0, 1.79.0, 1.80.0 (getting released tomorrow),
  plus beta, plus nightly.

  This should already be enough for kernel developers in distributions
  that provide recent Rust compiler versions routinely, such as Arch
  Linux, Debian Unstable (outside the freeze period), Fedora Linux,
  Gentoo Linux (especially the testing channel), Nix (unstable) and
  openSUSE Slowroll and Tumbleweed.

  In addition, the kernel is now being built-tested by Rust's pre-merge
  CI. That is, every change that is attempting to land into the Rust
  compiler is tested against the kernel, and it is merged only if it
  passes. Similarly, the bindgen tool has agreed to build the kernel in
  their CI too.

  Thus, with the pre-merge CI in place, both projects hope to avoid
  unintentional changes to Rust that break the kernel. This means that,
  in general, apart from intentional changes on their side (that we will
  need to workaround conditionally on our side), the upcoming Rust
  compiler versions should generally work.

  In addition, the Rust project has proposed getting the kernel into
  stable Rust (at least solving the main blockers) as one of its three
  flagship goals for 2024H2 [1].

  I would like to thank Niko, Sid, Emilio et al. for their help
  promoting the collaboration between Rust and the kernel.

  Toolchain and infrastructure:

   - Support several Rust toolchain versions.

   - Support several bindgen versions.

   - Remove 'cargo' requirement and simplify 'rusttest', thanks to
     'alloc' having been dropped last cycle.

   - Provide proper error reporting for the 'rust-analyzer' target.

  'kernel' crate:

   - Add 'uaccess' module with a safe userspace pointers abstraction.

   - Add 'page' module with a 'struct page' abstraction.

   - Support more complex generics in workqueue's 'impl_has_work!'
     macro.

  'macros' crate:

   - Add 'firmware' field support to the 'module!' macro.

   - Improve 'module!' macro documentation.

  Documentation:

   - Provide instructions on what packages should be installed to build
     the kernel in some popular Linux distributions.

   - Introduce the new kernel.org LLVM+Rust toolchains.

   - Explain '#[no_std]'.

  And a few other small bits"

Link: https://rust-lang.github.io/rust-project-goals/2024h2/index.html#flagship-goals [1]

* tag 'rust-6.11' of https://github.com/Rust-for-Linux/linux: (26 commits)
  docs: rust: quick-start: add section on Linux distributions
  rust: warn about `bindgen` versions 0.66.0 and 0.66.1
  rust: start supporting several `bindgen` versions
  rust: work around `bindgen` 0.69.0 issue
  rust: avoid assuming a particular `bindgen` build
  rust: start supporting several compiler versions
  rust: simplify Clippy warning flags set
  rust: relax most deny-level lints to warnings
  rust: allow `dead_code` for never constructed bindings
  rust: init: simplify from `map_err` to `inspect_err`
  rust: macros: indent list item in `paste!`'s docs
  rust: add abstraction for `struct page`
  rust: uaccess: add typed accessors for userspace pointers
  uaccess: always export _copy_[from|to]_user with CONFIG_RUST
  rust: uaccess: add userspace pointers
  kbuild: rust-analyzer: improve comment documentation
  kbuild: rust-analyzer: better error handling
  docs: rust: no_std is used
  rust: alloc: add __GFP_HIGHMEM flag
  rust: alloc: fix typo in docs for GFP_NOWAIT
  ...
2024-07-27 13:44:54 -07:00
Linus Torvalds
7b0acd911c 11 hotfixes, 7 of which are cc:stable. 7 are MM, 4 are other.
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCZqQWWQAKCRDdBJ7gKXxA
 jqJVAP9vU9HNzIyKDOOqoNHKMI+VzGn39w1FihWjG6AU5a+9NQD+MZJwr7bBwkpH
 ii43HLUGvNRQtsldBZSRypsaitCSwAI=
 =HGce
 -----END PGP SIGNATURE-----

Merge tag 'mm-hotfixes-stable-2024-07-26-14-33' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull misc hotfixes from Andrew Morton:
 "11 hotfixes, 7 of which are cc:stable.  7 are MM, 4 are other"

* tag 'mm-hotfixes-stable-2024-07-26-14-33' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
  nilfs2: handle inconsistent state in nilfs_btnode_create_block()
  selftests/mm: skip test for non-LPA2 and non-LVA systems
  mm/page_alloc: fix pcp->count race between drain_pages_zone() vs __rmqueue_pcplist()
  mm: memcg: add cacheline padding after lruvec in mem_cgroup_per_node
  alloc_tag: outline and export free_reserved_page()
  decompress_bunzip2: fix rare decompression failure
  mm/huge_memory: avoid PMD-size page cache if needed
  mm: huge_memory: use !CONFIG_64BIT to relax huge page alignment on 32 bit machines
  mm: fix old/young bit handling in the faulting path
  dt-bindings: arm: update James Clark's email address
  MAINTAINERS: mailmap: update James Clark's email address
2024-07-27 10:26:41 -07:00
Ross Lagerwall
bf6acd5d16 decompress_bunzip2: fix rare decompression failure
The decompression code parses a huffman tree and counts the number of
symbols for a given bit length.  In rare cases, there may be >= 256
symbols with a given bit length, causing the unsigned char to overflow. 
This causes a decompression failure later when the code tries and fails to
find the bit length for a given symbol.

Since the maximum number of symbols is 258, use unsigned short instead.

Link: https://lkml.kernel.org/r/20240717162016.1514077-1-ross.lagerwall@citrix.com
Fixes: bc22c17e12 ("bzip2/lzma: library support for gzip, bzip2 and lzma decompression")
Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com>
Cc: Alain Knaff <alain@knaff.lu>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-07-26 14:33:09 -07:00
Linus Torvalds
51c4767503 bitmap-6.11-rc1
Random fixes for v6.11.
 -----BEGIN PGP SIGNATURE-----
 
 iQGzBAABCgAdFiEEi8GdvG6xMhdgpu/4sUSA/TofvsgFAmahKbIACgkQsUSA/Tof
 vsh8zQwAvguyeNubDFqdMe3E/Vp1J3WqXsBFzbE1rGLCyI2S0cgJFL5BlW51zY47
 70wLt9EmroEobwj1qHSQlzejNp31kSBQ1Sqq25oivfJqEF1elDT5PQxYqBbU1C9Y
 kVWnxtb+oKaoFd5jiBK8+iTl8dXjT6H2RoV0zpPab/JPcqsjwFfkUvtENt/Kpo5c
 aRrGTFwshdp5eT4sEZQv57VKroBcwZOvv2//qrklFHrJHl4pjMT8eaX3twcQysoy
 umTVt+TK6NErLnht+VRQJ2/L02FKi7b+bHePVgNzaT+1FSDMT4FltmZd96Xwbzah
 hSkwWtqy0N2gaTcqie9nwdEiCJGjF39M7k2wangUS91CeDsbIUSsJgDCESUCm+zK
 hRqleGOnoeg4+jZBci7M53lKa/pADlmLhnU8iAc3BSKozsaioltkT+hHn8vAkstk
 h/kHlbfkzasufUWAhduBpIn384gWWEY6RACffgCsOuvbT+ZyDKUJpKYaEwVx+Pri
 l72j0hs9
 =RbET
 -----END PGP SIGNATURE-----

Merge tag 'bitmap-6.11-rc1' of https://github.com:/norov/linux

Pull bitmap updates from Yury Norov:
 "Random fixes"

* tag 'bitmap-6.11-rc1' of https://github.com:/norov/linux:
  riscv: Remove unnecessary int cast in variable_fls()
  radix tree test suite: put definition of bitmap_clear() into lib/bitmap.c
  bitops: Add a comment explaining the double underscore macros
  lib: bitmap: add missing MODULE_DESCRIPTION() macros
  cpumask: introduce assign_cpu() macro
2024-07-26 09:50:36 -07:00
Linus Torvalds
8bf100092d trivial printk changes for 6.11
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEESH4wyp42V4tXvYsjUqAMR0iAlPIFAmaiUQcACgkQUqAMR0iA
 lPITNQ/+KDdmQljKwpuXlqe01F1mG/LFn5i1Y/a8fZVep/OSmihsgnEqnYzBomTq
 CF22tlrH7r6EZ2D5by1fjno/AG6/BAXvwH8jGDr9jhVNNBsneeVYrtMB1UUslR/e
 OEFoFKyzpq6VJNmHl5aAM95CEFEkE5uBba4DkJ/hCh3oErc2zP5DRdD9COCkdlIp
 +LzQa6XsMjwzrWAMAm2vWdBgePCHVKsAVVFUdfmN28FQw3BcFZEgIvN7vPT7Ee3I
 ESKx/Asb3myb1J7bFvDKnpT9O+7EkU/cpQn+HxjiIFVPqFLX3mfSXzgvfNocuPB0
 hkIUzA9Sbu2wa+SE7qU1IwHVZj2N612OPso8lG8cbcic/KaYLpd6Kt7bnJe8kBe7
 gFGBVmDjvapilQwWteWJcMs2hBxXuq0Xd+CMcXTMKYcLS8Z+4TVAYA8onssrUq+0
 Jye8hW/CvST/P7wazVtuQu1fsKKz3SG+dAaXw9/7fYTGQ2LdRoLUkDhLkwqIURjD
 j6+pMMuYpxrpeA7yaOD1xLLOKC33OINClgfodjGVHvDlwxsQBhZFchCKsbufYEa1
 CxFi8lDcfNVSGuw5x3a6iMqwnqxoeVKpi6eKKgpU81fXnGd4G2NA5jGRMycWmqzX
 +uUq6Ot1NO4QVK+pT70GhXMt2rQ6UsC1SyWggeFYBhaaqfIGKRk=
 =89Nl
 -----END PGP SIGNATURE-----

Merge tag 'printk-for-6.11-trivial' of git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux

Pull printk updates from Petr Mladek:

 - trivial printk changes

The bigger "real" printk work is still being discussed.

* tag 'printk-for-6.11-trivial' of git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux:
  vsprintf: add missing MODULE_DESCRIPTION() macro
  printk: Rename console_replay_all() and update context
2024-07-25 13:18:41 -07:00
Linus Torvalds
c2a96b7f18 Driver core changes for 6.11-rc1
Here is the big set of driver core changes for 6.11-rc1.
 
 Lots of stuff in here, with not a huge diffstat, but apis are evolving
 which required lots of files to be touched.  Highlights of the changes
 in here are:
   - platform remove callback api final fixups (Uwe took many releases to
     get here, finally!)
   - Rust bindings for basic firmware apis and initial driver-core
     interactions.  It's not all that useful for a "write a whole driver
     in rust" type of thing, but the firmware bindings do help out the
     phy rust drivers, and the driver core bindings give a solid base on
     which others can start their work.  There is still a long way to go
     here before we have a multitude of rust drivers being added, but
     it's a great first step.
   - driver core const api changes.  This reached across all bus types,
     and there are some fix-ups for some not-common bus types that
     linux-next and 0-day testing shook out.  This work is being done to
     help make the rust bindings more safe, as well as the C code, moving
     toward the end-goal of allowing us to put driver structures into
     read-only memory.  We aren't there yet, but are getting closer.
   - minor devres cleanups and fixes found by code inspection
   - arch_topology minor changes
   - other minor driver core cleanups
 
 All of these have been in linux-next for a very long time with no
 reported problems.
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 -----BEGIN PGP SIGNATURE-----
 
 iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCZqH+aQ8cZ3JlZ0Brcm9h
 aC5jb20ACgkQMUfUDdst+ymoOQCfVBdLcBjEDAGh3L8qHRGMPy4rV2EAoL/r+zKm
 cJEYtJpGtWX6aAtugm9E
 =ZyJV
 -----END PGP SIGNATURE-----

Merge tag 'driver-core-6.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core

Pull driver core updates from Greg KH:
 "Here is the big set of driver core changes for 6.11-rc1.

  Lots of stuff in here, with not a huge diffstat, but apis are evolving
  which required lots of files to be touched. Highlights of the changes
  in here are:

   - platform remove callback api final fixups (Uwe took many releases
     to get here, finally!)

   - Rust bindings for basic firmware apis and initial driver-core
     interactions.

     It's not all that useful for a "write a whole driver in rust" type
     of thing, but the firmware bindings do help out the phy rust
     drivers, and the driver core bindings give a solid base on which
     others can start their work.

     There is still a long way to go here before we have a multitude of
     rust drivers being added, but it's a great first step.

   - driver core const api changes.

     This reached across all bus types, and there are some fix-ups for
     some not-common bus types that linux-next and 0-day testing shook
     out.

     This work is being done to help make the rust bindings more safe,
     as well as the C code, moving toward the end-goal of allowing us to
     put driver structures into read-only memory. We aren't there yet,
     but are getting closer.

   - minor devres cleanups and fixes found by code inspection

   - arch_topology minor changes

   - other minor driver core cleanups

  All of these have been in linux-next for a very long time with no
  reported problems"

* tag 'driver-core-6.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (55 commits)
  ARM: sa1100: make match function take a const pointer
  sysfs/cpu: Make crash_hotplug attribute world-readable
  dio: Have dio_bus_match() callback take a const *
  zorro: make match function take a const pointer
  driver core: module: make module_[add|remove]_driver take a const *
  driver core: make driver_find_device() take a const *
  driver core: make driver_[create|remove]_file take a const *
  firmware_loader: fix soundness issue in `request_internal`
  firmware_loader: annotate doctests as `no_run`
  devres: Correct code style for functions that return a pointer type
  devres: Initialize an uninitialized struct member
  devres: Fix memory leakage caused by driver API devm_free_percpu()
  devres: Fix devm_krealloc() wasting memory
  driver core: platform: Switch to use kmemdup_array()
  driver core: have match() callback in struct bus_type take a const *
  MAINTAINERS: add Rust device abstractions to DRIVER CORE
  device: rust: improve safety comments
  MAINTAINERS: add Danilo as FIRMWARE LOADER maintainer
  MAINTAINERS: add Rust FW abstractions to FIRMWARE LOADER
  firmware: rust: improve safety comments
  ...
2024-07-25 10:42:22 -07:00
Linus Torvalds
7a3fad30fd Random number generator updates for Linux 6.11-rc1.
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEq5lC5tSkz8NBJiCnSfxwEqXeA64FAmaarzgACgkQSfxwEqXe
 A66ZWBAAlhXx8bve0uKlDRK8fffWHgruho/fOY4lZJ137AKwA9JCtmOyqdfL4Dmk
 VxFe7pEQJlQhcA/6kH54uO7SBXwfKlKZJth6SYnaCRMUIbFifHjjIQ0QqldjEKi0
 rP90Hu4FVsbwQC7u9i9lQj9n2P36zb6pn83BzpZQ/2PtoVCSCrdSJUe0Rxa3H3GN
 0+nNkDSXQt5otCByLaeE3x7KJgXLWL9+G2eFSFLTZ8rSVfMx1CdOIAG37WlLGdWm
 BaFYPDKMyBTVvVJBNgAe9YSqtrsZ5nlmLz+Z9wAe/hTL7RlL03kWUu34/Udcpull
 zzMDH0WMntiGK3eFQ2gOYSWqypvAjwHgn3BzqNmjUb69+89mZsdU1slcvnxWsUwU
 D3vphrscaqarF629tfsXti3jc5PoXwUTjROZVcCyeFPBhyAZgzK8xUvPpJO+RT+K
 EuUABob9cpA6FCpW/QeolDmMDhXlNT8QgsZu1juokZac2xP3Ly3REyEvT7HLbU2W
 ZJjbEqm1ppp3RmGELUOJbyhwsLrnbt+OMDO7iEWoG8aSFK4diBK/ZM6WvLMkr8Oi
 7ioXGIsYkCy3c47wpZKTrAapOPJp5keqNAiHSEbXw8mozp6429QAEZxNOcczgHKC
 Ea2JzRkctqutcIT+Slw/uUe//i1iSsIHXbE81fp5udcQTJcUByo=
 =P8aI
 -----END PGP SIGNATURE-----

Merge tag 'random-6.11-rc1-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/crng/random

Pull random number generator updates from Jason Donenfeld:
 "This adds getrandom() support to the vDSO.

  First, it adds a new kind of mapping to mmap(2), MAP_DROPPABLE, which
  lets the kernel zero out pages anytime under memory pressure, which
  enables allocating memory that never gets swapped to disk but also
  doesn't count as being mlocked.

  Then, the vDSO implementation of getrandom() is introduced in a
  generic manner and hooked into random.c.

  Next, this is implemented on x86. (Also, though it's not ready for
  this pull, somebody has begun an arm64 implementation already)

  Finally, two vDSO selftests are added.

  There are also two housekeeping cleanup commits"

* tag 'random-6.11-rc1-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/crng/random:
  MAINTAINERS: add random.h headers to RNG subsection
  random: note that RNDGETPOOL was removed in 2.6.9-rc2
  selftests/vDSO: add tests for vgetrandom
  x86: vdso: Wire up getrandom() vDSO implementation
  random: introduce generic vDSO getrandom() implementation
  mm: add MAP_DROPPABLE for designating always lazily freeable mappings
2024-07-24 10:29:50 -07:00
Linus Torvalds
7d080fa867 for-6.11/block-20240722
-----BEGIN PGP SIGNATURE-----
 
 iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAmaeZBIQHGF4Ym9lQGtl
 cm5lbC5kawAKCRD301j7KXHgpqI7D/9XPinZuuwiZ/670P8yjk1SHFzqzdwtuFuP
 +Dq2lcoYRkuwm5PvCvhs3QH2mnjS1vo1SIoAijGEy3V1bs41mw87T2knKMIn4g5v
 I5A4gC6i0IqxIkFm17Zx9yG+MivoOmPtqM4RMxze2xS/uJwWcvg4tjBHZfylY3d9
 oaIXyZj+0dTRf955K2x/5dpfE6qjtDG0bqrrJXnzaIKHBJk2HKezYFbTstAA4OY+
 MvMqRL7uJmJBd7384/WColIO0b8/UEchPl7qG+zy9pg+wzQGLFyF/Z/KdjrWdDMD
 IFs92uNDFQmiGoyujJmXdDV9xpKi94nqDAtUR+Qct0Mui5zz0w2RNcGvyTDjBMpv
 CAzTkTW48moYkwLPhPmy8Ge69elT82AC/9ZQAHbA7g3TYgJML5IT/7TtiaVe6Rc1
 podnTR3/e9XmZnc25aUZeAr6CG7b+0NBvB+XPO9lNyMEE38sfwShoPdAGdKX25oA
 mjnLHBc9grVOQzRGEx22E11k+1ChXf/o9H546PB2Pr9yvf/DQ3868a+QhHssxufL
 Xul1K5a+pUmOnaTLD3ESftYlFmcDOHQ6gDK697so7mU7lrD3ctN4HYZ2vwNk35YY
 2b4xrABrOEbAXlUo3Ht8F/ecg6qw4xTr9vAW5q4+L2H5+28RaZKYclHhLmR23yfP
 xJ/d5FfVFQ==
 =fqoV
 -----END PGP SIGNATURE-----

Merge tag 'for-6.11/block-20240722' of git://git.kernel.dk/linux

Pull more block updates from Jens Axboe:

 - MD fixes via Song:
     - md-cluster fixes (Heming Zhao)
     - raid1 fix (Mateusz Jończyk)

 - s390/dasd module description (Jeff)

 - Series cleaning up and hardening the blk-mq debugfs flag handling
   (John, Christoph)

 - blk-cgroup cleanup (Xiu)

 - Error polled IO attempts if backend doesn't support it (hexue)

 - Fix for an sbitmap hang (Yang)

* tag 'for-6.11/block-20240722' of git://git.kernel.dk/linux: (23 commits)
  blk-cgroup: move congestion_count to struct blkcg
  sbitmap: fix io hung due to race on sbitmap_word::cleared
  block: avoid polling configuration errors
  block: Catch possible entries missing from rqf_name[]
  block: Simplify definition of RQF_NAME()
  block: Use enum to define RQF_x bit indexes
  block: Catch possible entries missing from cmd_flag_name[]
  block: Catch possible entries missing from alloc_policy_name[]
  block: Catch possible entries missing from hctx_flag_name[]
  block: Catch possible entries missing from hctx_state_name[]
  block: Catch possible entries missing from blk_queue_flag_name[]
  block: Make QUEUE_FLAG_x as an enum
  block: Relocate BLK_MQ_MAX_DEPTH
  block: Relocate BLK_MQ_CPU_WORK_BATCH
  block: remove QUEUE_FLAG_STOPPED
  block: Add missing entry to hctx_flag_name[]
  block: Add zone write plugging entry to rqf_name[]
  block: Add missing entries from cmd_flag_name[]
  s390/dasd: fix error checks in dasd_copy_pair_store()
  s390/dasd: add missing MODULE_DESCRIPTION() macros
  ...
2024-07-22 11:32:05 -07:00
Linus Torvalds
527eff227d - In the series "treewide: Refactor heap related implementation",
Kuan-Wei Chiu has significantly reworked the min_heap library code and
   has taught bcachefs to use the new more generic implementation.
 
 - Yury Norov's series "Cleanup cpumask.h inclusion in core headers"
   reworks the cpumask and nodemask headers to make things generally more
   rational.
 
 - Kuan-Wei Chiu has sent along some maintenance work against our sorting
   library code in the series "lib/sort: Optimizations and cleanups".
 
 - More library maintainance work from Christophe Jaillet in the series
   "Remove usage of the deprecated ida_simple_xx() API".
 
 - Ryusuke Konishi continues with the nilfs2 fixes and clanups in the
   series "nilfs2: eliminate the call to inode_attach_wb()".
 
 - Kuan-Ying Lee has some fixes to the gdb scripts in the series "Fix GDB
   command error".
 
 - Plus the usual shower of singleton patches all over the place.  Please
   see the relevant changelogs for details.
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCZp2GvwAKCRDdBJ7gKXxA
 jlf/AP48xP5ilIHbtpAKm2z+MvGuTxJQ5VSC0UXFacuCbc93lAEA+Yo+vOVRmh6j
 fQF2nVKyKLYfSz7yqmCyAaHWohIYLgg=
 =Stxz
 -----END PGP SIGNATURE-----

Merge tag 'mm-nonmm-stable-2024-07-21-15-07' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull non-MM updates from Andrew Morton:

 - In the series "treewide: Refactor heap related implementation",
   Kuan-Wei Chiu has significantly reworked the min_heap library code
   and has taught bcachefs to use the new more generic implementation.

 - Yury Norov's series "Cleanup cpumask.h inclusion in core headers"
   reworks the cpumask and nodemask headers to make things generally
   more rational.

 - Kuan-Wei Chiu has sent along some maintenance work against our
   sorting library code in the series "lib/sort: Optimizations and
   cleanups".

 - More library maintainance work from Christophe Jaillet in the series
   "Remove usage of the deprecated ida_simple_xx() API".

 - Ryusuke Konishi continues with the nilfs2 fixes and clanups in the
   series "nilfs2: eliminate the call to inode_attach_wb()".

 - Kuan-Ying Lee has some fixes to the gdb scripts in the series "Fix
   GDB command error".

 - Plus the usual shower of singleton patches all over the place. Please
   see the relevant changelogs for details.

* tag 'mm-nonmm-stable-2024-07-21-15-07' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (98 commits)
  ia64: scrub ia64 from poison.h
  watchdog/perf: properly initialize the turbo mode timestamp and rearm counter
  tsacct: replace strncpy() with strscpy()
  lib/bch.c: use swap() to improve code
  test_bpf: convert comma to semicolon
  init/modpost: conditionally check section mismatch to __meminit*
  init: remove unused __MEMINIT* macros
  nilfs2: Constify struct kobj_type
  nilfs2: avoid undefined behavior in nilfs_cnt32_ge macro
  math: rational: add missing MODULE_DESCRIPTION() macro
  lib/zlib: add missing MODULE_DESCRIPTION() macro
  fs: ufs: add MODULE_DESCRIPTION()
  lib/rbtree.c: fix the example typo
  ocfs2: add bounds checking to ocfs2_check_dir_entry()
  fs: add kernel-doc comments to ocfs2_prepare_orphan_dir()
  coredump: simplify zap_process()
  selftests/fpu: add missing MODULE_DESCRIPTION() macro
  compiler.h: simplify data_race() macro
  build-id: require program headers to be right after ELF header
  resource: add missing MODULE_DESCRIPTION()
  ...
2024-07-21 17:56:22 -07:00
Linus Torvalds
fbc90c042c - 875fa64577da ("mm/hugetlb_vmemmap: fix race with speculative PFN
walkers") is known to cause a performance regression
   (https://lore.kernel.org/all/3acefad9-96e5-4681-8014-827d6be71c7a@linux.ibm.com/T/#mfa809800a7862fb5bdf834c6f71a3a5113eb83ff).
   Yu has a fix which I'll send along later via the hotfixes branch.
 
 - In the series "mm: Avoid possible overflows in dirty throttling" Jan
   Kara addresses a couple of issues in the writeback throttling code.
   These fixes are also targetted at -stable kernels.
 
 - Ryusuke Konishi's series "nilfs2: fix potential issues related to
   reserved inodes" does that.  This should actually be in the
   mm-nonmm-stable tree, along with the many other nilfs2 patches.  My bad.
 
 - More folio conversions from Kefeng Wang in the series "mm: convert to
   folio_alloc_mpol()"
 
 - Kemeng Shi has sent some cleanups to the writeback code in the series
   "Add helper functions to remove repeated code and improve readability of
   cgroup writeback"
 
 - Kairui Song has made the swap code a little smaller and a little
   faster in the series "mm/swap: clean up and optimize swap cache index".
 
 - In the series "mm/memory: cleanly support zeropage in
   vm_insert_page*(), vm_map_pages*() and vmf_insert_mixed()" David
   Hildenbrand has reworked the rather sketchy handling of the use of the
   zeropage in MAP_SHARED mappings.  I don't see any runtime effects here -
   more a cleanup/understandability/maintainablity thing.
 
 - Dev Jain has improved selftests/mm/va_high_addr_switch.c's handling of
   higher addresses, for aarch64.  The (poorly named) series is
   "Restructure va_high_addr_switch".
 
 - The core TLB handling code gets some cleanups and possible slight
   optimizations in Bang Li's series "Add update_mmu_tlb_range() to
   simplify code".
 
 - Jane Chu has improved the handling of our
   fake-an-unrecoverable-memory-error testing feature MADV_HWPOISON in the
   series "Enhance soft hwpoison handling and injection".
 
 - Jeff Johnson has sent a billion patches everywhere to add
   MODULE_DESCRIPTION() to everything.  Some landed in this pull.
 
 - In the series "mm: cleanup MIGRATE_SYNC_NO_COPY mode", Kefeng Wang has
   simplified migration's use of hardware-offload memory copying.
 
 - Yosry Ahmed performs more folio API conversions in his series "mm:
   zswap: trivial folio conversions".
 
 - In the series "large folios swap-in: handle refault cases first",
   Chuanhua Han inches us forward in the handling of large pages in the
   swap code.  This is a cleanup and optimization, working toward the end
   objective of full support of large folio swapin/out.
 
 - In the series "mm,swap: cleanup VMA based swap readahead window
   calculation", Huang Ying has contributed some cleanups and a possible
   fixlet to his VMA based swap readahead code.
 
 - In the series "add mTHP support for anonymous shmem" Baolin Wang has
   taught anonymous shmem mappings to use multisize THP.  By default this
   is a no-op - users must opt in vis sysfs controls.  Dramatic
   improvements in pagefault latency are realized.
 
 - David Hildenbrand has some cleanups to our remaining use of
   page_mapcount() in the series "fs/proc: move page_mapcount() to
   fs/proc/internal.h".
 
 - David also has some highmem accounting cleanups in the series
   "mm/highmem: don't track highmem pages manually".
 
 - Build-time fixes and cleanups from John Hubbard in the series
   "cleanups, fixes, and progress towards avoiding "make headers"".
 
 - Cleanups and consolidation of the core pagemap handling from Barry
   Song in the series "mm: introduce pmd|pte_needs_soft_dirty_wp helpers
   and utilize them".
 
 - Lance Yang's series "Reclaim lazyfree THP without splitting" has
   reduced the latency of the reclaim of pmd-mapped THPs under fairly
   common circumstances.  A 10x speedup is seen in a microbenchmark.
 
   It does this by punting to aother CPU but I guess that's a win unless
   all CPUs are pegged.
 
 - hugetlb_cgroup cleanups from Xiu Jianfeng in the series
   "mm/hugetlb_cgroup: rework on cftypes".
 
 - Miaohe Lin's series "Some cleanups for memory-failure" does just that
   thing.
 
 - Is anyone reading this stuff?  If so, email me!
 
 - Someone other than SeongJae has developed a DAMON feature in Honggyu
   Kim's series "DAMON based tiered memory management for CXL memory".
   This adds DAMON features which may be used to help determine the
   efficiency of our placement of CXL/PCIe attached DRAM.
 
 - DAMON user API centralization and simplificatio work in SeongJae
   Park's series "mm/damon: introduce DAMON parameters online commit
   function".
 
 - In the series "mm: page_type, zsmalloc and page_mapcount_reset()"
   David Hildenbrand does some maintenance work on zsmalloc - partially
   modernizing its use of pageframe fields.
 
 - Kefeng Wang provides more folio conversions in the series "mm: remove
   page_maybe_dma_pinned() and page_mkclean()".
 
 - More cleanup from David Hildenbrand, this time in the series
   "mm/memory_hotplug: use PageOffline() instead of PageReserved() for
   !ZONE_DEVICE".  It "enlightens memory hotplug more about PageOffline()
   pages" and permits the removal of some virtio-mem hacks.
 
 - Barry Song's series "mm: clarify folio_add_new_anon_rmap() and
   __folio_add_anon_rmap()" is a cleanup to the anon folio handling in
   preparation for mTHP (multisize THP) swapin.
 
 - Kefeng Wang's series "mm: improve clear and copy user folio"
   implements more folio conversions, this time in the area of large folio
   userspace copying.
 
 - The series "Docs/mm/damon/maintaier-profile: document a mailing tool
   and community meetup series" tells people how to get better involved
   with other DAMON developers.  From SeongJae Park.
 
 - A large series ("kmsan: Enable on s390") from Ilya Leoshkevich does
   that.
 
 - David Hildenbrand sends along more cleanups, this time against the
   migration code.  The series is "mm/migrate: move NUMA hinting fault
   folio isolation + checks under PTL".
 
 - Jan Kara has found quite a lot of strangenesses and minor errors in
   the readahead code.  He addresses this in the series "mm: Fix various
   readahead quirks".
 
 - SeongJae Park's series "selftests/damon: test DAMOS tried regions and
   {min,max}_nr_regions" adds features and addresses errors in DAMON's self
   testing code.
 
 - Gavin Shan has found a userspace-triggerable WARN in the pagecache
   code.  The series "mm/filemap: Limit page cache size to that supported
   by xarray" addresses this.  The series is marked cc:stable.
 
 - Chengming Zhou's series "mm/ksm: cmp_and_merge_page() optimizations
   and cleanup" cleans up and slightly optimizes KSM.
 
 - Roman Gushchin has separated the memcg-v1 and memcg-v2 code - lots of
   code motion.  The series (which also makes the memcg-v1 code
   Kconfigurable) are
 
   "mm: memcg: separate legacy cgroup v1 code and put under config
   option" and
   "mm: memcg: put cgroup v1-specific memcg data under CONFIG_MEMCG_V1"
 
 - Dan Schatzberg's series "Add swappiness argument to memory.reclaim"
   adds an additional feature to this cgroup-v2 control file.
 
 - The series "Userspace controls soft-offline pages" from Jiaqi Yan
   permits userspace to stop the kernel's automatic treatment of excessive
   correctable memory errors.  In order to permit userspace to monitor and
   handle this situation.
 
 - Kefeng Wang's series "mm: migrate: support poison recover from migrate
   folio" teaches the kernel to appropriately handle migration from
   poisoned source folios rather than simply panicing.
 
 - SeongJae Park's series "Docs/damon: minor fixups and improvements"
   does those things.
 
 - In the series "mm/zsmalloc: change back to per-size_class lock"
   Chengming Zhou improves zsmalloc's scalability and memory utilization.
 
 - Vivek Kasireddy's series "mm/gup: Introduce memfd_pin_folios() for
   pinning memfd folios" makes the GUP code use FOLL_PIN rather than bare
   refcount increments.  So these paes can first be moved aside if they
   reside in the movable zone or a CMA block.
 
 - Andrii Nakryiko has added a binary ioctl()-based API to /proc/pid/maps
   for much faster reading of vma information.  The series is "query VMAs
   from /proc/<pid>/maps".
 
 - In the series "mm: introduce per-order mTHP split counters" Lance Yang
   improves the kernel's presentation of developer information related to
   multisize THP splitting.
 
 - Michael Ellerman has developed the series "Reimplement huge pages
   without hugepd on powerpc (8xx, e500, book3s/64)".  This permits
   userspace to use all available huge page sizes.
 
 - In the series "revert unconditional slab and page allocator fault
   injection calls" Vlastimil Babka removes a performance-affecting and not
   very useful feature from slab fault injection.
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCZp2C+QAKCRDdBJ7gKXxA
 joTkAQDvjqOoFStqk4GU3OXMYB7WCU/ZQMFG0iuu1EEwTVDZ4QEA8CnG7seek1R3
 xEoo+vw0sWWeLV3qzsxnCA1BJ8cTJA8=
 =z0Lf
 -----END PGP SIGNATURE-----

Merge tag 'mm-stable-2024-07-21-14-50' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull MM updates from Andrew Morton:

 - In the series "mm: Avoid possible overflows in dirty throttling" Jan
   Kara addresses a couple of issues in the writeback throttling code.
   These fixes are also targetted at -stable kernels.

 - Ryusuke Konishi's series "nilfs2: fix potential issues related to
   reserved inodes" does that. This should actually be in the
   mm-nonmm-stable tree, along with the many other nilfs2 patches. My
   bad.

 - More folio conversions from Kefeng Wang in the series "mm: convert to
   folio_alloc_mpol()"

 - Kemeng Shi has sent some cleanups to the writeback code in the series
   "Add helper functions to remove repeated code and improve readability
   of cgroup writeback"

 - Kairui Song has made the swap code a little smaller and a little
   faster in the series "mm/swap: clean up and optimize swap cache
   index".

 - In the series "mm/memory: cleanly support zeropage in
   vm_insert_page*(), vm_map_pages*() and vmf_insert_mixed()" David
   Hildenbrand has reworked the rather sketchy handling of the use of
   the zeropage in MAP_SHARED mappings. I don't see any runtime effects
   here - more a cleanup/understandability/maintainablity thing.

 - Dev Jain has improved selftests/mm/va_high_addr_switch.c's handling
   of higher addresses, for aarch64. The (poorly named) series is
   "Restructure va_high_addr_switch".

 - The core TLB handling code gets some cleanups and possible slight
   optimizations in Bang Li's series "Add update_mmu_tlb_range() to
   simplify code".

 - Jane Chu has improved the handling of our
   fake-an-unrecoverable-memory-error testing feature MADV_HWPOISON in
   the series "Enhance soft hwpoison handling and injection".

 - Jeff Johnson has sent a billion patches everywhere to add
   MODULE_DESCRIPTION() to everything. Some landed in this pull.

 - In the series "mm: cleanup MIGRATE_SYNC_NO_COPY mode", Kefeng Wang
   has simplified migration's use of hardware-offload memory copying.

 - Yosry Ahmed performs more folio API conversions in his series "mm:
   zswap: trivial folio conversions".

 - In the series "large folios swap-in: handle refault cases first",
   Chuanhua Han inches us forward in the handling of large pages in the
   swap code. This is a cleanup and optimization, working toward the end
   objective of full support of large folio swapin/out.

 - In the series "mm,swap: cleanup VMA based swap readahead window
   calculation", Huang Ying has contributed some cleanups and a possible
   fixlet to his VMA based swap readahead code.

 - In the series "add mTHP support for anonymous shmem" Baolin Wang has
   taught anonymous shmem mappings to use multisize THP. By default this
   is a no-op - users must opt in vis sysfs controls. Dramatic
   improvements in pagefault latency are realized.

 - David Hildenbrand has some cleanups to our remaining use of
   page_mapcount() in the series "fs/proc: move page_mapcount() to
   fs/proc/internal.h".

 - David also has some highmem accounting cleanups in the series
   "mm/highmem: don't track highmem pages manually".

 - Build-time fixes and cleanups from John Hubbard in the series
   "cleanups, fixes, and progress towards avoiding "make headers"".

 - Cleanups and consolidation of the core pagemap handling from Barry
   Song in the series "mm: introduce pmd|pte_needs_soft_dirty_wp helpers
   and utilize them".

 - Lance Yang's series "Reclaim lazyfree THP without splitting" has
   reduced the latency of the reclaim of pmd-mapped THPs under fairly
   common circumstances. A 10x speedup is seen in a microbenchmark.

   It does this by punting to aother CPU but I guess that's a win unless
   all CPUs are pegged.

 - hugetlb_cgroup cleanups from Xiu Jianfeng in the series
   "mm/hugetlb_cgroup: rework on cftypes".

 - Miaohe Lin's series "Some cleanups for memory-failure" does just that
   thing.

 - Someone other than SeongJae has developed a DAMON feature in Honggyu
   Kim's series "DAMON based tiered memory management for CXL memory".
   This adds DAMON features which may be used to help determine the
   efficiency of our placement of CXL/PCIe attached DRAM.

 - DAMON user API centralization and simplificatio work in SeongJae
   Park's series "mm/damon: introduce DAMON parameters online commit
   function".

 - In the series "mm: page_type, zsmalloc and page_mapcount_reset()"
   David Hildenbrand does some maintenance work on zsmalloc - partially
   modernizing its use of pageframe fields.

 - Kefeng Wang provides more folio conversions in the series "mm: remove
   page_maybe_dma_pinned() and page_mkclean()".

 - More cleanup from David Hildenbrand, this time in the series
   "mm/memory_hotplug: use PageOffline() instead of PageReserved() for
   !ZONE_DEVICE". It "enlightens memory hotplug more about PageOffline()
   pages" and permits the removal of some virtio-mem hacks.

 - Barry Song's series "mm: clarify folio_add_new_anon_rmap() and
   __folio_add_anon_rmap()" is a cleanup to the anon folio handling in
   preparation for mTHP (multisize THP) swapin.

 - Kefeng Wang's series "mm: improve clear and copy user folio"
   implements more folio conversions, this time in the area of large
   folio userspace copying.

 - The series "Docs/mm/damon/maintaier-profile: document a mailing tool
   and community meetup series" tells people how to get better involved
   with other DAMON developers. From SeongJae Park.

 - A large series ("kmsan: Enable on s390") from Ilya Leoshkevich does
   that.

 - David Hildenbrand sends along more cleanups, this time against the
   migration code. The series is "mm/migrate: move NUMA hinting fault
   folio isolation + checks under PTL".

 - Jan Kara has found quite a lot of strangenesses and minor errors in
   the readahead code. He addresses this in the series "mm: Fix various
   readahead quirks".

 - SeongJae Park's series "selftests/damon: test DAMOS tried regions and
   {min,max}_nr_regions" adds features and addresses errors in DAMON's
   self testing code.

 - Gavin Shan has found a userspace-triggerable WARN in the pagecache
   code. The series "mm/filemap: Limit page cache size to that supported
   by xarray" addresses this. The series is marked cc:stable.

 - Chengming Zhou's series "mm/ksm: cmp_and_merge_page() optimizations
   and cleanup" cleans up and slightly optimizes KSM.

 - Roman Gushchin has separated the memcg-v1 and memcg-v2 code - lots of
   code motion. The series (which also makes the memcg-v1 code
   Kconfigurable) are "mm: memcg: separate legacy cgroup v1 code and put
   under config option" and "mm: memcg: put cgroup v1-specific memcg
   data under CONFIG_MEMCG_V1"

 - Dan Schatzberg's series "Add swappiness argument to memory.reclaim"
   adds an additional feature to this cgroup-v2 control file.

 - The series "Userspace controls soft-offline pages" from Jiaqi Yan
   permits userspace to stop the kernel's automatic treatment of
   excessive correctable memory errors. In order to permit userspace to
   monitor and handle this situation.

 - Kefeng Wang's series "mm: migrate: support poison recover from
   migrate folio" teaches the kernel to appropriately handle migration
   from poisoned source folios rather than simply panicing.

 - SeongJae Park's series "Docs/damon: minor fixups and improvements"
   does those things.

 - In the series "mm/zsmalloc: change back to per-size_class lock"
   Chengming Zhou improves zsmalloc's scalability and memory
   utilization.

 - Vivek Kasireddy's series "mm/gup: Introduce memfd_pin_folios() for
   pinning memfd folios" makes the GUP code use FOLL_PIN rather than
   bare refcount increments. So these paes can first be moved aside if
   they reside in the movable zone or a CMA block.

 - Andrii Nakryiko has added a binary ioctl()-based API to
   /proc/pid/maps for much faster reading of vma information. The series
   is "query VMAs from /proc/<pid>/maps".

 - In the series "mm: introduce per-order mTHP split counters" Lance
   Yang improves the kernel's presentation of developer information
   related to multisize THP splitting.

 - Michael Ellerman has developed the series "Reimplement huge pages
   without hugepd on powerpc (8xx, e500, book3s/64)". This permits
   userspace to use all available huge page sizes.

 - In the series "revert unconditional slab and page allocator fault
   injection calls" Vlastimil Babka removes a performance-affecting and
   not very useful feature from slab fault injection.

* tag 'mm-stable-2024-07-21-14-50' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (411 commits)
  mm/mglru: fix ineffective protection calculation
  mm/zswap: fix a white space issue
  mm/hugetlb: fix kernel NULL pointer dereference when migrating hugetlb folio
  mm/hugetlb: fix possible recursive locking detected warning
  mm/gup: clear the LRU flag of a page before adding to LRU batch
  mm/numa_balancing: teach mpol_to_str about the balancing mode
  mm: memcg1: convert charge move flags to unsigned long long
  alloc_tag: fix page_ext_get/page_ext_put sequence during page splitting
  lib: reuse page_ext_data() to obtain codetag_ref
  lib: add missing newline character in the warning message
  mm/mglru: fix overshooting shrinker memory
  mm/mglru: fix div-by-zero in vmpressure_calc_level()
  mm/kmemleak: replace strncpy() with strscpy()
  mm, page_alloc: put should_fail_alloc_page() back behing CONFIG_FAIL_PAGE_ALLOC
  mm, slab: put should_failslab() back behind CONFIG_SHOULD_FAILSLAB
  mm: ignore data-race in __swap_writepage
  hugetlbfs: ensure generic_hugetlb_get_unmapped_area() returns higher address than mmap_min_addr
  mm: shmem: rename mTHP shmem counters
  mm: swap_state: use folio_alloc_mpol() in __read_swap_cache_async()
  mm/migrate: putback split folios when numa hint migration fails
  ...
2024-07-21 17:15:46 -07:00
Linus Torvalds
acc5965b9f Char/Misc and other driver changes for 6.11-rc1
Here is the "big" set of char/misc and other driver subsystem changes
 for 6.11-rc1.  Nothing major in here, just loads of new drivers and
 updates.  Included in here are:
   - IIO api updates and new drivers added
   - wait_interruptable_timeout() api cleanups for some drivers
   - MODULE_DESCRIPTION() additions for loads of drivers
   - parport out-of-bounds fix
   - interconnect driver updates and additions
   - mhi driver updates and additions
   - w1 driver fixes
   - binder speedups and fixes
   - eeprom driver updates
   - coresight driver updates
   - counter driver update
   - new misc driver additions
   - other minor api updates
 
 All of these, EXCEPT for the final Kconfig build fix for 32bit systems,
 have been in linux-next for a while with no reported issues.  The
 Kconfig fixup went in 29 hours ago, so might have missed the latest
 linux-next, but was acked by everyone involved.
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 -----BEGIN PGP SIGNATURE-----
 
 iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCZppR4w8cZ3JlZ0Brcm9h
 aC5jb20ACgkQMUfUDdst+ykwoQCeIaW3nbOiNTmOupvEnZwrN3yVNs8An3Q5L+Br
 1LpTASaU6A8pN81Z1m5g
 =6U1z
 -----END PGP SIGNATURE-----

Merge tag 'char-misc-6.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc

Pull char / misc and other driver updates from Greg KH:
 "Here is the "big" set of char/misc and other driver subsystem changes
  for 6.11-rc1. Nothing major in here, just loads of new drivers and
  updates. Included in here are:

   - IIO api updates and new drivers added

   - wait_interruptable_timeout() api cleanups for some drivers

   - MODULE_DESCRIPTION() additions for loads of drivers

   - parport out-of-bounds fix

   - interconnect driver updates and additions

   - mhi driver updates and additions

   - w1 driver fixes

   - binder speedups and fixes

   - eeprom driver updates

   - coresight driver updates

   - counter driver update

   - new misc driver additions

   - other minor api updates

  All of these, EXCEPT for the final Kconfig build fix for 32bit
  systems, have been in linux-next for a while with no reported issues.
  The Kconfig fixup went in 29 hours ago, so might have missed the
  latest linux-next, but was acked by everyone involved"

* tag 'char-misc-6.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (330 commits)
  misc: Kconfig: exclude mrvl-cn10k-dpi compilation for 32-bit systems
  misc: delete Makefile.rej
  binder: fix hang of unregistered readers
  misc: Kconfig: add a new dependency for MARVELL_CN10K_DPI
  virtio: add missing MODULE_DESCRIPTION() macro
  agp: uninorth: add missing MODULE_DESCRIPTION() macro
  spmi: add missing MODULE_DESCRIPTION() macros
  dev/parport: fix the array out-of-bounds risk
  samples: configfs: add missing MODULE_DESCRIPTION() macro
  misc: mrvl-cn10k-dpi: add Octeon CN10K DPI administrative driver
  misc: keba: Fix missing AUXILIARY_BUS dependency
  slimbus: Fix struct and documentation alignment in stream.c
  MAINTAINERS: CC dri-devel list on Qualcomm FastRPC patches
  misc: fastrpc: use coherent pool for untranslated Compute Banks
  misc: fastrpc: support complete DMA pool access to the DSP
  misc: fastrpc: add missing MODULE_DESCRIPTION() macro
  misc: fastrpc: Add missing dev_err newlines
  misc: fastrpc: Use memdup_user()
  nvmem: core: Implement force_ro sysfs attribute
  nvmem: Use sysfs_emit() for type attribute
  ...
2024-07-19 15:55:08 -07:00
Jason A. Donenfeld
4ad10a5f5f random: introduce generic vDSO getrandom() implementation
Provide a generic C vDSO getrandom() implementation, which operates on
an opaque state returned by vgetrandom_alloc() and produces random bytes
the same way as getrandom(). This has the following API signature:

  ssize_t vgetrandom(void *buffer, size_t len, unsigned int flags,
                     void *opaque_state, size_t opaque_len);

The return value and the first three arguments are the same as ordinary
getrandom(), while the last two arguments are a pointer to the opaque
allocated state and its size. Were all five arguments passed to the
getrandom() syscall, nothing different would happen, and the functions
would have the exact same behavior.

The actual vDSO RNG algorithm implemented is the same one implemented by
drivers/char/random.c, using the same fast-erasure techniques as that.
Should the in-kernel implementation change, so too will the vDSO one.

It requires an implementation of ChaCha20 that does not use any stack,
in order to maintain forward secrecy if a multi-threaded program forks
(though this does not account for a similar issue with SA_SIGINFO
copying registers to the stack), so this is left as an
architecture-specific fill-in. Stack-less ChaCha20 is an easy algorithm
to implement on a variety of architectures, so this shouldn't be too
onerous.

Initially, the state is keyless, and so the first call makes a
getrandom() syscall to generate that key, and then uses it for
subsequent calls. By keeping track of a generation counter, it knows
when its key is invalidated and it should fetch a new one using the
syscall. Later, more than just a generation counter might be used.

Since MADV_WIPEONFORK is set on the opaque state, the key and related
state is wiped during a fork(), so secrets don't roll over into new
processes, and the same state doesn't accidentally generate the same
random stream. The generation counter, as well, is always >0, so that
the 0 counter is a useful indication of a fork() or otherwise
uninitialized state.

If the kernel RNG is not yet initialized, then the vDSO always calls the
syscall, because that behavior cannot be emulated in userspace, but
fortunately that state is short lived and only during early boot. If it
has been initialized, then there is no need to inspect the `flags`
argument, because the behavior does not change post-initialization
regardless of the `flags` value.

Since the opaque state passed to it is mutated, vDSO getrandom() is not
reentrant, when used with the same opaque state, which libc should be
mindful of.

The function works over an opaque per-thread state of a particular size,
which must be marked VM_WIPEONFORK, VM_DONTDUMP, VM_NORESERVE, and
VM_DROPPABLE for proper operation. Over time, the nuances of these
allocations may change or grow or even differ based on architectural
features.

The opaque state passed to vDSO getrandom() must be allocated using the
mmap_flags and mmap_prot parameters provided by the vgetrandom_opaque_params
struct, which also contains the size of each state. That struct can be
obtained with a call to vgetrandom(NULL, 0, 0, &params, ~0UL). Then,
libc can call mmap(2) and slice up the returned array into a state per
each thread, while ensuring that no single state straddles a page
boundary. Libc is expected to allocate a chunk of these on first use,
and then dole them out to threads as they're created, allocating more
when needed.

vDSO getrandom() provides the ability for userspace to generate random
bytes quickly and safely, and is intended to be integrated into libc's
thread management. As an illustrative example, the introduced code in
the vdso_test_getrandom self test later in this series might be used to
do the same outside of libc. In a libc the various pthread-isms are
expected to be elided into libc internals.

Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2024-07-19 20:22:12 +02:00
Linus Torvalds
c434e25b62 This update includes the following changes:
API:
 
 - Test setkey in no-SIMD context.
 - Add skcipher speed test for user-specified algorithm.
 
 Algorithms:
 
 - Add x25519 support on ppc64le.
 - Add VAES and AVX512 / AVX10 optimized AES-GCM on x86.
 - Remove sm2 algorithm.
 
 Drivers:
 
 - Add Allwinner H616 support to sun8i-ce.
 - Use DMA in stm32.
 - Add Exynos850 hwrng support to exynos.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEn51F/lCuNhUwmDeSxycdCkmxi6cFAmaZFsgACgkQxycdCkmx
 i6f76Q//ej7akY9fo6/qsn8UFK16O0SCEMkx7TrkxqHV8R6uwy4ret3+b5dbckY6
 hBjDabiL/BAdNzo8hvta+BOtN6ToEqquSVwNCpX0U3YMLf9dIzcMA4Uri3LbxUHi
 x9Qa8klI5x62Kg+RW+ovaJC4C11oKTpjVeDn4S57MudlBnhEa3DYcEADKiUowkEz
 aigtLx8HrZYjwkQxwgWeS0xzeojhW1P20yaghOd6hTCD7vKw18JaKdD8r4YFGOBu
 39eDaM/0vR+wWokk3NNl6NmXieBT8qLFt+OIbQs6b3gX9K37daahRs1VoShcL+ix
 l8GaqLpo1n1llVrV1OWzyVLVLtYK849QEo6OmlusnbK7e5pQKEOXoACQ0VB8ElNE
 1u7KNW6CBWGzr33dWPgl9yYBrT3BmMXABIK4dNmTicJsK2zk2FPKbLDZNi8fWah/
 D46mv7Rb8EtTdhN56EzceUJpd1ZfmP9S4vY1Hu8YdmI1pxex11US/XppKLoyymqp
 vNOzf85VuZ/GkUPfHdyWAFBnTaCjXtSBrlXD6+0nxavU9KGli0PLLX5tKNNWGw0l
 51Z0tbNsDbo3Z+sMmtfvBXR2V8NwiAT5f775W0lLvpq/44mbDpdN3jGvfy9y9C7u
 1DUC6F0XtUhZjR7e6/EhvHh3lB/a3w/m3+XC+XzDeox/VYTrC3Q=
 =x80X
 -----END PGP SIGNATURE-----

Merge tag 'v6.11-p1' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6

Pull crypto update from Herbert Xu:
 "API:
   - Test setkey in no-SIMD context
   - Add skcipher speed test for user-specified algorithm

  Algorithms:
   - Add x25519 support on ppc64le
   - Add VAES and AVX512 / AVX10 optimized AES-GCM on x86
   - Remove sm2 algorithm

  Drivers:
   - Add Allwinner H616 support to sun8i-ce
   - Use DMA in stm32
   - Add Exynos850 hwrng support to exynos"

* tag 'v6.11-p1' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (81 commits)
  hwrng: core - remove (un)register_miscdev()
  crypto: lib/mpi - delete unnecessary condition
  crypto: testmgr - generate power-of-2 lengths more often
  crypto: mxs-dcp - Ensure payload is zero when using key slot
  hwrng: Kconfig - Do not enable by default CN10K driver
  crypto: starfive - Fix nent assignment in rsa dec
  crypto: starfive - Align rsa input data to 32-bit
  crypto: qat - fix unintentional re-enabling of error interrupts
  crypto: qat - extend scope of lock in adf_cfg_add_key_value_param()
  Documentation: qat: fix auto_reset attribute details
  crypto: sun8i-ce - add Allwinner H616 support
  crypto: sun8i-ce - wrap accesses to descriptor address fields
  dt-bindings: crypto: sun8i-ce: Add compatible for H616
  hwrng: core - Fix wrong quality calculation at hw rng registration
  hwrng: exynos - Enable Exynos850 support
  hwrng: exynos - Add SMC based TRNG operation
  hwrng: exynos - Implement bus clock control
  hwrng: exynos - Use devm_clk_get_enabled() to get the clock
  hwrng: exynos - Improve coding style
  dt-bindings: rng: Add Exynos850 support to exynos-trng
  ...
2024-07-19 08:52:58 -07:00
Yang Yang
72d04bdcf3 sbitmap: fix io hung due to race on sbitmap_word::cleared
Configuration for sbq:
  depth=64, wake_batch=6, shift=6, map_nr=1

1. There are 64 requests in progress:
  map->word = 0xFFFFFFFFFFFFFFFF
2. After all the 64 requests complete, and no more requests come:
  map->word = 0xFFFFFFFFFFFFFFFF, map->cleared = 0xFFFFFFFFFFFFFFFF
3. Now two tasks try to allocate requests:
  T1:                                       T2:
  __blk_mq_get_tag                          .
  __sbitmap_queue_get                       .
  sbitmap_get                               .
  sbitmap_find_bit                          .
  sbitmap_find_bit_in_word                  .
  __sbitmap_get_word  -> nr=-1              __blk_mq_get_tag
  sbitmap_deferred_clear                    __sbitmap_queue_get
  /* map->cleared=0xFFFFFFFFFFFFFFFF */     sbitmap_find_bit
    if (!READ_ONCE(map->cleared))           sbitmap_find_bit_in_word
      return false;                         __sbitmap_get_word -> nr=-1
    mask = xchg(&map->cleared, 0)           sbitmap_deferred_clear
    atomic_long_andnot()                    /* map->cleared=0 */
                                              if (!(map->cleared))
                                                return false;
                                     /*
                                      * map->cleared is cleared by T1
                                      * T2 fail to acquire the tag
                                      */

4. T2 is the sole tag waiter. When T1 puts the tag, T2 cannot be woken
up due to the wake_batch being set at 6. If no more requests come, T1
will wait here indefinitely.

This patch achieves two purposes:
1. Check on ->cleared and update on both ->cleared and ->word need to
be done atomically, and using spinlock could be the simplest solution.
2. Add extra check in sbitmap_deferred_clear(), to identify whether
->word has free bits.

Fixes: ea86ea2cdc ("sbitmap: ammortize cost of clearing bits")
Signed-off-by: Yang Yang <yang.yang@vivo.com>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Link: https://lore.kernel.org/r/20240716082644.659566-1-yang.yang@vivo.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2024-07-19 09:39:32 -06:00
Linus Torvalds
76d9b92e68 slab updates for 6.11
-----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCAAdFiEEe7vIQRWZI0iWSE3xu+CwddJFiJoFAmaXl0kACgkQu+CwddJF
 iJrOlgf+N/G7BmgoW2CBF7mKsvCYs+pX3xeBuxPtsuq4FD386nsPFMN8gWAYLG3q
 ZU1z1S+0M8LhTg6/G9jMYLHt2Y7WhYbhFTjTHmULJkuhMDTUP9CRYy4XZ+hdPtHF
 30ezSdJQF9x/XxCSaaRVK1s+SMVHFg5xAOHKpfkNSamcMz9g+ZkYyPBr10/VoKd0
 JqwhW7r6hrlvWAiqY3QKCOvohIWglgvBUnNjUGMh1cUkOE2aYLYHklhRwICKgA6z
 p/2BUXiAEWUtgBkUrizwm/pdhJXLs0pOeYarVZP1v83tQMxyrc6XLNnqhvxP3DPW
 31thF5Rf9I8WaWTczXhxsAwFjqO3KQ==
 =4uf9
 -----END PGP SIGNATURE-----

Merge tag 'slab-for-6.11' of git://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab

Pull slab updates from Vlastimil Babka:
 "The most prominent change this time is the kmem_buckets based
  hardening of kmalloc() allocations from Kees Cook.

  We have also extended the kmalloc() alignment guarantees for
  non-power-of-two sizes in a way that benefits rust.

  The rest are various cleanups and non-critical fixups.

   - Dedicated bucket allocator (Kees Cook)

     This series [1] enhances the probabilistic defense against heap
     spraying/grooming of CONFIG_RANDOM_KMALLOC_CACHES from last year.

     kmalloc() users that are known to be useful for exploits can get
     completely separate set of kmalloc caches that can't be shared with
     other users. The first converted users are alloc_msg() and
     memdup_user().

     The hardening is enabled by CONFIG_SLAB_BUCKETS.

   - Extended kmalloc() alignment guarantees (Vlastimil Babka)

     For years now we have guaranteed natural alignment for power-of-two
     allocations, but nothing was defined for other sizes (in practice,
     we have two such buckets, kmalloc-96 and kmalloc-192).

     To avoid unnecessary padding in the rust layer due to its alignment
     rules, extend the guarantee so that the alignment is at least the
     largest power-of-two divisor of the requested size.

     This fits what rust needs, is a superset of the existing
     power-of-two guarantee, and does not in practice change the layout
     (and thus does not add overhead due to padding) of the kmalloc-96
     and kmalloc-192 caches, unless slab debugging is enabled for them.

   - Cleanups and non-critical fixups (Chengming Zhou, Suren
     Baghdasaryan, Matthew Willcox, Alex Shi, and Vlastimil Babka)

     Various tweaks related to the new alloc profiling code, folio
     conversion, debugging and more leftovers after SLAB"

Link: https://lore.kernel.org/all/20240701190152.it.631-kees@kernel.org/ [1]

* tag 'slab-for-6.11' of git://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab:
  mm/memcg: alignment memcg_data define condition
  mm, slab: move prepare_slab_obj_exts_hook under CONFIG_MEM_ALLOC_PROFILING
  mm, slab: move allocation tagging code in the alloc path into a hook
  mm/util: Use dedicated slab buckets for memdup_user()
  ipc, msg: Use dedicated slab buckets for alloc_msg()
  mm/slab: Introduce kmem_buckets_create() and family
  mm/slab: Introduce kvmalloc_buckets_node() that can take kmem_buckets argument
  mm/slab: Plumb kmem_buckets into __do_kmalloc_node()
  mm/slab: Introduce kmem_buckets typedef
  slab, rust: extend kmalloc() alignment guarantees to remove Rust padding
  slab: delete useless RED_INACTIVE and RED_ACTIVE
  slab: don't put freepointer outside of object if only orig_size
  slab: make check_object() more consistent
  mm: Reduce the number of slab->folio casts
  mm, slab: don't wrap internal functions with alloc_hooks()
2024-07-18 15:08:12 -07:00
Linus Torvalds
db2451e78d Bootconfig updates for v6.11:
- Remove duplicate included header file linux/bootconfig.h from
   lib/bootconfig.c. This is a cleanup, no behavior change.
 -----BEGIN PGP SIGNATURE-----
 
 iQFPBAABCgA5FiEEh7BulGwFlgAOi5DV2/sHvwUrPxsFAmaWhj4bHG1hc2FtaS5o
 aXJhbWF0c3VAZ21haWwuY29tAAoJENv7B78FKz8bdd0H/iraZ7ZOFWxCapOZI4dL
 7f870j0PQG/KU7lB4jAo+3u7YyQWQTTLdhDPEOci4axsDG+56C/SVpHV0Z26SGHX
 ZqcKlA/H0HT4BA3zG1leRzXC/qPYiAEdIw38NngYPYBUWhqM3qmYlrRIBeg89VrM
 B4yaIJA/Uae7KAlB2dcmhmrIg86QK1iPKU6G+U5mIFecxDQmowE7z5f5pI/K/M5j
 2HT2Kg1XPTtxOb15mKtA19TXbbA1IqYUvwW5jOffppKMwtiggEaOj4mLQ1MhlrP0
 pEb1OJMx21MvEJYtjOXi8qsSGOhdWH8sBpxdUv21GzwRvOuG/AoaN1YKMIZCQp1K
 Jjo=
 =Bjzb
 -----END PGP SIGNATURE-----

Merge tag 'bootconfig-v6.11' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace

Pull bootconfig update from Masami Hiramatsu:

 - Remove duplicate included header file linux/bootconfig.h from
   lib/bootconfig.c. This is a cleanup, no behavior change.

* tag 'bootconfig-v6.11' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
  bootconfig: Remove duplicate included header file linux/bootconfig.h
2024-07-18 12:39:40 -07:00
Linus Torvalds
b3ce7a3084 drm next for 6.11-rc1:
core:
 - deprecate DRM data and return 0 date
 - connector: Create a set of helpers to help with HDMI support
 - Remove driver owner assignments
 - Allow more drivers to compile with COMPILE_TEST
 - Conversions to drm_edid
 - Sprinkle MODULE_DESCRIPTIONS everywhere they are missing
 - Remove drm_mm_replace_node
 - print: Add a drm prefix to warn level messages too, remove
          ___drm_dbg, consolidate prefix handling
 - New monochrome TV mode variant
 
 ttm:
 - improve number of page faults on some platforms
 - fix test builds under PREEMPT_RT
 - more test coverage
 
 ci:
 - Require a more recent version of mesa,
 - improve farm setup and test generation
 
 dma-buf:
 - warn if reserving 0 fence slots
 - internal API heap enhancements
 
 fbdev:
 - Create memory manager optimized fbdev emulation
 
 panic:
 - Allow to select fonts,
 - improve drm_fb_dma_get_scanout_buffer
 - Allow to dump kmsg to the screen
 
 bridge:
 - Remove redundant checks on bridge->encoder
 - Remove drm_bridge_chain_mode_fixup
 - bridge-connector: Plumb in the new HDMI helper
 - analogix_dp: Various improvements, handle AUX transfers timeout
 - samsung-dsim: Fix timings calculation
 - tc358767: Plenty of small fixes, fix no connector attach, fix clocks
 - sii902x: state validation improvements
 
 panels:
 - Switch panels from register table initialization to proper code
 - Now that the panel code tracks the panel state, remove every
   ad-hoc implementation in the panel drivers
 - More cleanup of prepare / enable state tracking in drivers
 - edp: Drop legacy panel compatibles
 - simple-bridge: Switch to devm_drm_bridge_add
 - New panels: Lincoln Tech Sol LCD185-101CT, Microtips Technology
   13-101HIEBCAF0-C, Microtips Technology MF-103HIEB0GA0, BOE
   nv110wum-l60, IVO t109nw41, WL-355608-A8, PrimeView PM070WL4,
   Lincoln Technologies LCD197, Ortustech COM35H3P70ULC,
   AUO G104STN01, K&d kd101ne3-40ti
 
 amdgpu:
 - DCN 4.0.x support
 - GC 12.0 support
 - GMC 12.0 support
 - SDMA 7.0 support
 - MES12 support
 - MMHUB 4.1 support
 - GFX12 modifier and DCC support
 - lots of IP fixes/updates
 
 amdkfd:
 - Contiguous VRAM allocations
 - GC 12.0 support
 - SDMA 7.0 support
 - SR-IOV fixes
 - KFD GFX ALU exceptions
 
 i915:
 - Battlemage Xe2 HPD display enablement
 - Panel Replay enabling
 - DP AUX-less ALPM/LOBF
 - Enable link training failure fallback for DP MST links
 - CMRR (Content Match Refresh Rate) enabling
 - Increase ADL-S/ADL-P/DG2+ max TMDS bitrate to 6 Gbps
 - Enable eDP AUX based HDR backlight
 - Support replaying GPU hangs with captured context image
 - Automate CCS Mode setting during engine resets
 - lots of refactoring
 - Support replaying GPU hangs with captured context image
 - Increase FLR timeout from 3s to 9s
 - Enable w/a 16021333562 for DG2, MTL and ARL [guc]
 
 xe:
 - update MAINATINERS
 - New uapi adding OA functionality to Xe
 - expose l3 bank mask
 - fix display detect on ADL-N
 - runtime PM Fixes
 - Fix silent backmerge issues
 - More prep for SR-IOV
 - HWmon additions
 - per client usage info
 - Rework GPU page fault handling
 - Drop EXEC_QUEUE_FLAG_BANNED
 - Add BMG PCI IDs
 - Scheduler fixes and improvements
 - Rename xe_exec_queue::compute to xe_exec_queue::lr
 - Use ttm_uncached for BO with NEEDS_UC flag
 - Rename xe perf layer as xe observation layer
 - lots of refactoring
 
 radeon:
 - Backlight workaround for iMac
 - Silence UBSAN flex array warnings
 
 msm:
 - Validate registers XML description against schema in CI
 - core/dpu: SM7150 support
 - mdp5: Add support for MSM8937
 - gpu: Add param for userspace to know if raytracing is supported
 - gpu: X185 support (aka gpu in X1 laptop chips)
 - gpu: a505 support
 
 ivpu:
 - hardware scheduler support
 - profiling support
 - improvements to the platform support layer
 - firmware handling improvements
 - clocks/power mgmt improvements
 - scheduler/logging improvements
 
 habanalabs:
 - Gradual sleep in polling memory macro.
 - Reduce Gaudi2 MSI-X interrupt count to 128.
 - Add Gaudi2-D revision support.
 - Add timestamp to CPLD info.
 - Gaudi2: Assume hard-reset by firmware upon MC SEI severe error.
 - Align Gaudi2 interrupt names.
 - Check for errors after preboot is ready.
 - Change habanalabs maintainer and git repo path.
 
 mgag200:
 - refactoring and improvements
 - Add BMC output
 - enable polling
 
 nouveau:
 - add registry command line
 
 v3d:
 - perf counters improvements
 
 zynqmp:
 - irq and debugfs improvements
 
 atmel-hlcdc:
 - Support XLCDC in sam9x7
 
 mipi-dbi:
 - Remove mipi_dbi_machine_little_endian
 - make SPI bits per word configurable
 - support RGB888
 - allow pixel formats to be specified in the DT
 
 sun4i:
 - Rework the blender setup for DE2
 
 panfrost:
 - Enable MT8188 support
 
 vc4:
 - Monochrome TV support
 
 exynos:
 - fix fallback mode regression
 - fix memory leak
 - Use drm_edid_duplicate() instead of kmemdup()
 
 etnaviv:
 - fix i.MX8MP NPU clock gating
 - workaround FE register cdc issues on some cores
 - fix DMA sync handling for cached buffers
 - fix job timeout handling
 - keep TS enabled on MMUv2 cores for improved performance
 
 mediatek:
 - Convert to platform remove callback returning void-
 - Drop chain_mode_fixup call in mode_valid()
 - Fixes the errors of MediaTek display driver found by IGT.
 - Add display support for the MT8365-EVK board
 - Fix bit depth overwritten for mtk_ovl_set bit_depth()
 - Fix possible_crtcs calculation
 - Fix spurious kfree()
 
 ast:
 - refactor mode setting code
 
 stm:
 - Add LVDS support
 - DSI PHY updates
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEEKbZHaGwW9KfbeusDHTzWXnEhr4FAmaYqVEACgkQDHTzWXnE
 hr5p3Q/+OOxTHKJ/8WMwfV1Tuep5otkCZdBgNdcuu9zqzpEMEDUDwmV1iboIvT9x
 qJsDwSAJomwbZAnVjDKsbZuycSHUBV6HQdf+5+rtq6be1EfFRwJVzOq0u5+D3KGt
 7f2vy6sM9tw4tR6EikiuP7vCvnSz4iGrWERvEJDEtXECbALhju8sulht8ZMnr6GW
 /MfUetULLSDjq0L1x3TWAq2MPGnJ5UxIkIeOBUP6n4etAUX1BPTNA6N76eN/xMvn
 a40JhtM+pCjjkHxvloIZ+KTYN3S+hskIRksczPHh9HtNX7y/A437wyhOHJZ1NvZb
 yc5ke9GjXxGcxyZH+PY5aCS7O/XElzSSkR1jFZ2s3/MX7PVKgCahGK7+yWjPsiK2
 R5oXebdObshUa8LHDE/3WgBUmTchkvKRTXV9cvGqzxEPhC2zrxArvwP5v6B4mhCn
 Vqo3Pv0Cyr+n65Z5Dzqz/9+m999LJjFTsTrug0p5b/qBJQKu2rQONe4lpZ0NFwwY
 ExyjdxILj7mqrQpKcA6V5Bel5ZCnlVsGfTshFL6Iux54VFlJyRMzKWZ+Gdv4av5k
 dbjz+re+CojKabn3ML/7pAQujK6Rqe58vPuHV78zkvAGJnQgJOOTrmYNYtn3oBqe
 ogdCN+/PREb/9U7i6mQv5hhdHs4tT9ROXaT9jyb8XSHXW+t9lBM=
 =g+Ad
 -----END PGP SIGNATURE-----

Merge tag 'drm-next-2024-07-18' of https://gitlab.freedesktop.org/drm/kernel

Pull drm updates from Dave Airlie:
 "There's a lot of stuff in here, amd, i915 and xe have new platform
  work, lots of core rework around EDID handling, some new COMPILE_TEST
  options, maintainer changes and a lots of other stuff. Summary:

  core:
   - deprecate DRM data and return 0 date
   - connector: Create a set of helpers to help with HDMI support
   - Remove driver owner assignments
   - Allow more drivers to compile with COMPILE_TEST
   - Conversions to drm_edid
   - Sprinkle MODULE_DESCRIPTIONS everywhere they are missing
   - Remove drm_mm_replace_node
   - print: Add a drm prefix to warn level messages too, remove
            ___drm_dbg, consolidate prefix handling
   - New monochrome TV mode variant

  ttm:
   - improve number of page faults on some platforms
   - fix test builds under PREEMPT_RT
   - more test coverage

  ci:
   - Require a more recent version of mesa
   - improve farm setup and test generation

  dma-buf:
   - warn if reserving 0 fence slots
   - internal API heap enhancements

  fbdev:
   - Create memory manager optimized fbdev emulation

  panic:
   - Allow to select fonts
   - improve drm_fb_dma_get_scanout_buffer
   - Allow to dump kmsg to the screen

  bridge:
   - Remove redundant checks on bridge->encoder
   - Remove drm_bridge_chain_mode_fixup
   - bridge-connector: Plumb in the new HDMI helper
   - analogix_dp: Various improvements, handle AUX transfers timeout
   - samsung-dsim: Fix timings calculation
   - tc358767: Plenty of small fixes, fix no connector attach, fix
               clocks
   - sii902x: state validation improvements

  panels:
   - Switch panels from register table initialization to proper code
   - Now that the panel code tracks the panel state, remove every ad-hoc
     implementation in the panel drivers
   - More cleanup of prepare / enable state tracking in drivers
   - edp: Drop legacy panel compatibles
   - simple-bridge: Switch to devm_drm_bridge_add
   - New panels: Lincoln Tech Sol LCD185-101CT, Microtips Technology
                 13-101HIEBCAF0-C, Microtips Technology MF-103HIEB0GA0,
                 BOE nv110wum-l60, IVO t109nw41, WL-355608-A8, PrimeView
                 PM070WL4, Lincoln Technologies LCD197, Ortustech
                 COM35H3P70ULC, AUO G104STN01, K&d kd101ne3-40ti

  amdgpu:
   - DCN 4.0.x support
   - GC 12.0 support
   - GMC 12.0 support
   - SDMA 7.0 support
   - MES12 support
   - MMHUB 4.1 support
   - GFX12 modifier and DCC support
   - lots of IP fixes/updates

  amdkfd:
   - Contiguous VRAM allocations
   - GC 12.0 support
   - SDMA 7.0 support
   - SR-IOV fixes
   - KFD GFX ALU exceptions

  i915:
   - Battlemage Xe2 HPD display enablement
   - Panel Replay enabling
   - DP AUX-less ALPM/LOBF
   - Enable link training failure fallback for DP MST links
   - CMRR (Content Match Refresh Rate) enabling
   - Increase ADL-S/ADL-P/DG2+ max TMDS bitrate to 6 Gbps
   - Enable eDP AUX based HDR backlight
   - Support replaying GPU hangs with captured context image
   - Automate CCS Mode setting during engine resets
   - lots of refactoring
   - Support replaying GPU hangs with captured context image
   - Increase FLR timeout from 3s to 9s
   - Enable w/a 16021333562 for DG2, MTL and ARL [guc]

  xe:
   - update MAINATINERS
   - New uapi adding OA functionality to Xe
   - expose l3 bank mask
   - fix display detect on ADL-N
   - runtime PM Fixes
   - Fix silent backmerge issues
   - More prep for SR-IOV
   - HWmon additions
   - per client usage info
   - Rework GPU page fault handling
   - Drop EXEC_QUEUE_FLAG_BANNED
   - Add BMG PCI IDs
   - Scheduler fixes and improvements
   - Rename xe_exec_queue::compute to xe_exec_queue::lr
   - Use ttm_uncached for BO with NEEDS_UC flag
   - Rename xe perf layer as xe observation layer
   - lots of refactoring

  radeon:
   - Backlight workaround for iMac
   - Silence UBSAN flex array warnings

  msm:
   - Validate registers XML description against schema in CI
   - core/dpu: SM7150 support
   - mdp5: Add support for MSM8937
   - gpu: Add param for userspace to know if raytracing is supported
   - gpu: X185 support (aka gpu in X1 laptop chips)
   - gpu: a505 support

  ivpu:
   - hardware scheduler support
   - profiling support
   - improvements to the platform support layer
   - firmware handling improvements
   - clocks/power mgmt improvements
   - scheduler/logging improvements

  habanalabs:
   - Gradual sleep in polling memory macro
   - Reduce Gaudi2 MSI-X interrupt count to 128
   - Add Gaudi2-D revision support
   - Add timestamp to CPLD info
   - Gaudi2: Assume hard-reset by firmware upon MC SEI severe error
   - Align Gaudi2 interrupt names
   - Check for errors after preboot is ready
   - Change habanalabs maintainer and git repo path

  mgag200:
   - refactoring and improvements
   - Add BMC output
   - enable polling

  nouveau:
   - add registry command line

  v3d:
   - perf counters improvements

  zynqmp:
   - irq and debugfs improvements

  atmel-hlcdc:
   - Support XLCDC in sam9x7

  mipi-dbi:
   - Remove mipi_dbi_machine_little_endian
   - make SPI bits per word configurable
   - support RGB888
   - allow pixel formats to be specified in the DT

  sun4i:
   - Rework the blender setup for DE2

  panfrost:
   - Enable MT8188 support

  vc4:
   - Monochrome TV support

  exynos:
   - fix fallback mode regression
   - fix memory leak
   - Use drm_edid_duplicate() instead of kmemdup()

  etnaviv:
   - fix i.MX8MP NPU clock gating
   - workaround FE register cdc issues on some cores
   - fix DMA sync handling for cached buffers
   - fix job timeout handling
   - keep TS enabled on MMUv2 cores for improved performance

  mediatek:
   - Convert to platform remove callback returning void-
   - Drop chain_mode_fixup call in mode_valid()
   - Fixes the errors of MediaTek display driver found by IGT
   - Add display support for the MT8365-EVK board
   - Fix bit depth overwritten for mtk_ovl_set bit_depth()
   - Fix possible_crtcs calculation
   - Fix spurious kfree()

  ast:
   - refactor mode setting code

  stm:
   - Add LVDS support
   - DSI PHY updates"

* tag 'drm-next-2024-07-18' of https://gitlab.freedesktop.org/drm/kernel: (2501 commits)
  drm/amdgpu/mes12: add missing opcode string
  drm/amdgpu/mes11: update opcode strings
  Revert "drm/amd/display: Reset freesync config before update new state"
  drm/omap: Restrict compile testing to PAGE_SIZE less than 64KB
  drm/xe: Drop trace_xe_hw_fence_free
  drm/xe/uapi: Rename xe perf layer as xe observation layer
  drm/amdgpu: remove exp hw support check for gfx12
  drm/amdgpu: timely save bad pages to eeprom after gpu ras reset is completed
  drm/amdgpu: flush all cached ras bad pages to eeprom
  drm/amdgpu: select compute ME engines dynamically
  drm/amd/display: Allow display DCC for DCN401
  drm/amdgpu: select compute ME engines dynamically
  drm/amdgpu/job: Replace DRM_INFO/ERROR logging
  drm/amdgpu: select compute ME engines dynamically
  drm/amd/pm: Ignore initial value in smu response register
  drm/amdgpu: Initialize VF partition mode
  drm/amd/amdgpu: fix SDMA IRQ client ID <-> req mapping
  MAINTAINERS: fix Xinhui's name
  MAINTAINERS: update powerplay and swsmu
  drm/qxl: Pin buffer objects for internal mappings
  ...
2024-07-18 09:34:02 -07:00
Linus Torvalds
51835949dd Networking changes for 6.11. Not much excitement - a handful of large
patchsets (devmem among them) did not make it in time.
 
 Core & protocols
 ----------------
 
  - Use local_lock in addition to local_bh_disable() to protect per-CPU
    resources in networking, a step closer for local_bh_disable() not
    to act as a big lock on PREEMPT_RT.
 
  - Use flex array for netdevice priv area, ensure its cache alignment.
 
  - Add a sysctl knob to allow user to specify a default rto_min at socket
    init time. Bit of a big hammer but multiple companies were
    independently carrying such patch downstream so clearly it's useful.
 
  - Support scheduling transmission of packets based on CLOCK_TAI.
 
  - Un-pin TCP TIMEWAIT timer to avoid it firing on CPUs later cordoned off
    using cpusets.
 
  - Support multiple L2TPv3 UDP tunnels using the same 5-tuple address.
 
  - Allow configuration of multipath hash seed, to both allow synchronizing
    hashing of two routers, and preventing partial accidental sync.
 
  - Improve TCP compliance with RFC 9293 for simultaneous connect().
 
  - Support sending NAT keepalives in IPsec ESP in UDP states. Userspace
    IKE daemon had to do this before, but the kernel can better keep
    track of it.
 
  - Support sending supervision HSR frames with MAC addresses stored in
    ProxyNodeTable when RedBox (i.e. HSR-SAN) is enabled.
 
  - Introduce IPPROTO_SMC for selecting SMC when socket is created.
 
  - Allow UDP GSO transmit from devices with no checksum offload.
 
  - openvswitch: add packet sampling via psample, separating the sampled
    traffic from "upcall" packets sent to user space for forwarding.
 
  - nf_tables: shrink memory consumption for transaction objects.
 
 Things we sprinkled into general kernel code
 --------------------------------------------
 
  - Power Sequencing subsystem (used by Qualcomm Bluetooth driver
    for QCA6390).
 
  - Add IRQ information in sysfs for auxiliary bus.
 
  - Introduce guard definition for local_lock.
 
  - Add aligned flavor of __cacheline_group_{begin, end}() markings for
    grouping fields in structures.
 
 BPF
 ---
 
  - Notify user space (via epoll) when a struct_ops object is getting
    detached/unregistered.
 
  - Add new kfuncs for a generic, open-coded bits iterator.
 
  - Enable BPF programs to declare arrays of kptr, bpf_rb_root, and
    bpf_list_head.
 
  - Support resilient split BTF which cuts down on duplication and makes
    BTF as compact as possible WRT BTF from modules.
 
  - Add support for dumping kfunc prototypes from BTF which enables both
    detecting as well as dumping compilable prototypes for kfuncs.
 
  - riscv64 BPF JIT improvements in particular to add 12-argument support
    for BPF trampolines and to utilize bpf_prog_pack for the latter.
 
  - Add the capability to offload the netfilter flowtable in XDP layer
    through kfuncs.
 
 Driver API
 ----------
 
  - Allow users to configure IRQ tresholds between which automatic IRQ
    moderation can choose.
 
  - Expand Power Sourcing (PoE) status with power, class and failure
    reason. Support setting power limits.
 
  - Track additional RSS contexts in the core, make sure configuration
    changes don't break them.
 
  - Support IPsec crypto offload for IPv6 ESP and IPv4 UDP-encapsulated ESP
    data paths.
 
  - Support updating firmware on SFP modules.
 
 Tests and tooling
 -----------------
 
  - mptcp: use net/lib.sh to manage netns.
 
  - TCP-AO and TCP-MD5: replace debug prints used by tests with
    tracepoints.
 
  - openvswitch: make test self-contained (don't depend on OvS CLI tools).
 
 Drivers
 -------
 
  - Ethernet high-speed NICs:
    - Broadcom (bnxt):
      - increase the max total outstanding PTP TX packets to 4
      - add timestamping statistics support
      - implement netdev_queue_mgmt_ops
      - support new RSS context API
    - Intel (100G, ice, idpf):
      - implement FEC statistics and dumping signal quality indicators
      - support E825C products (with 56Gbps PHYs)
    - nVidia/Mellanox:
      - support HW-GRO
      - mlx4/mlx5: support per-queue statistics via netlink
      - obey the max number of EQs setting in sub-functions
    - AMD/Solarflare:
      - support new RSS context API
    - AMD/Pensando:
      - ionic: rework fix for doorbell miss to lower overhead
        and skip it on new HW
    - Wangxun:
      - txgbe: support Flow Director perfect filters
 
  - Ethernet NICs consumer, embedded and virtual:
    - Add driver for Tehuti Networks TN40xx chips
    - Add driver for Meta's internal NIC chips
    - Add driver for Ethernet MAC on Airoha EN7581 SoCs
    - Add driver for Renesas Ethernet-TSN devices
    - Google cloud vNIC:
      - flow steering support
    - Microsoft vNIC:
      - support page sizes other than 4KB on ARM64
    - vmware vNIC:
      - support latency measurement (update to version 9)
    - VirtIO net:
      - support for Byte Queue Limits
      - support configuring thresholds for automatic IRQ moderation
      - support for AF_XDP Rx zero-copy
    - Synopsys (stmmac):
      - support for STM32MP13 SoC
      - let platforms select the right PCS implementation
    - TI:
      - icssg-prueth: add multicast filtering support
      - icssg-prueth: enable PTP timestamping and PPS
    - Renesas:
      - ravb: improve Rx performance 30-400% by using page pool,
        theaded NAPI and timer-based IRQ coalescing
      - ravb: add MII support for R-Car V4M
    - Cadence (macb):
      - macb: add ARP support to Wake-On-LAN
    - Cortina:
      - use phylib for RX and TX pause configuration
 
  - Ethernet switches:
    - nVidia/Mellanox:
      - support configuration of multipath hash seed
      - report more accurate max MTU
      - use page_pool to improve Rx performance
    - MediaTek:
      - mt7530: add support for bridge port isolation
    - Qualcomm:
      - qca8k: add support for bridge port isolation
    - Microchip:
      - lan9371/2: add 100BaseTX PHY support
    - NXP:
      - vsc73xx: implement VLAN operations
 
  - Ethernet PHYs:
    - aquantia: enable support for aqr115c
    - aquantia: add support for PHY LEDs
    - realtek: add support for rtl8224 2.5Gbps PHY
    - xpcs: add memory-mapped device support
    - add BroadR-Reach link mode and support in Broadcom's PHY driver
 
  - CAN:
    - add document for ISO 15765-2 protocol support
    - mcp251xfd: workaround for erratum DS80000789E, use timestamps
      to catch when device returns incorrect FIFO status
 
  - WiFi:
    - mac80211/cfg80211:
      - parse Transmit Power Envelope (TPE) data in mac80211 instead of
        in drivers
      - improvements for 6 GHz regulatory flexibility
      - multi-link improvements
      - support multiple radios per wiphy
      - remove DEAUTH_NEED_MGD_TX_PREP flag
    - Intel (iwlwifi):
      - bump FW API to 91 for BZ/SC devices
      - report 64-bit radiotap timestamp
      - enable P2P low latency by default
      - handle Transmit Power Envelope (TPE) advertised by AP
      - remove support for older FW for new devices
      - fast resume (keeping the device configured)
      - mvm: re-enable Multi-Link Operation (MLO)
      - aggregation (A-MSDU) optimizations
    - MediaTek (mt76):
      - mt7925 Multi-Link Operation (MLO) support
    - Qualcomm (ath10k):
      - LED support for various chipsets
    - Qualcomm (ath12k):
      - remove unsupported Tx monitor handling
      - support channel 2 in 6 GHz band
      - support Spatial Multiplexing Power Save (SMPS) in 6 GHz band
      - supprt multiple BSSID (MBSSID) and Enhanced Multi-BSSID
        Advertisements (EMA)
      - support dynamic VLAN
      - add panic handler for resetting the firmware state
      - DebugFS support for datapath statistics
      - WCN7850: support for Wake on WLAN
    - Microchip (wilc1000):
      - read MAC address during probe to make it visible to user space
      - suspend/resume improvements
    - TI (wl18xx):
      - support newer firmware versions
    - RealTek (rtw89):
      - preparation for RTL8852BE-VT support
      - Wake on WLAN support for WiFi 6 chips
      - 36-bit PCI DMA support
    - RealTek (rtlwifi):
      - RTL8192DU support
    - Broadcom (brcmfmac):
      - Management Frame Protection support (to enable WPA3)
 
  - Bluetooth:
    - qualcomm: use the power sequencer for QCA6390
    - btusb: mediatek: add ISO data transmission functions
    - hci_bcm4377: add BCM4388 support
    - btintel: add support for BlazarU core
    - btintel: add support for Whale Peak2
    - btnxpuart: add support for AW693 A1 chipset
    - btnxpuart: add support for IW615 chipset
    - btusb: add Realtek RTL8852BE support ID 0x13d3:0x3591
 
 Signed-off-by: Jakub Kicinski <kuba@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEE6jPA+I1ugmIBA4hXMUZtbf5SIrsFAmaWjBwACgkQMUZtbf5S
 IrvuSRAAkJuEzTRqgURBCe4eNEQde6mJJig7l2CKHwCbFiHZpRkFHf8qKbcGWbL6
 uLW33SWnKtJVDhxVKWHLq635XW7BAa80YhqGw21GDi+mIEhWXZglHj3xbXNxsMfE
 4eg/kG4BkfYWFmHaXOwVWV/mr7nXf6j7WmXNeXEi32ufE1j0OL+YlQenKnMj8yP2
 j9JmYa2Chwppng1SblHmcjmGkdNVwFhStKeCG+2K7v06wdDH/QYBlbgUv9gw/cxp
 NlW//wgiaeX40U4O3kDwt9C+LDoh+0VrDDeVdQ+IsScLtY3PhAzEoKolFYTq2HSr
 I1JpoaHNnyNsJq3DZrACQ5WlH4yDn6C2EUB6dxNnFaI9F1ZPsi+7MTl6Sei1AklD
 TuQTj/lxOACBwW2Q77NU72uoxiIUauesGPHcnrAFuoCIEhZF0mso7k59BvrXhsOP
 QwcLbQdc1YHNkqv/Vc7NBY+ruMsYB+5Ubbhhj2p27dp/CWFIwxI29fze4dn2uhO6
 ejHN3mbqwPdSzg12YJtM6Iq61Cnwo2eVSvhTxl+ZVSZtI4nu2arzR+y7QTYmNrXP
 6tkgVN9UsWeLl2xJ8wyyqL5mcvNHP2rPXWZ2X56iTaa26m+UlleeQ7YRaYtQAAr0
 Ec/vlDMX64SwHhd+qwE99DXGQf2g+KklHKSLsnajJUVrWFTlRI0=
 =opz8
 -----END PGP SIGNATURE-----

Merge tag 'net-next-6.11' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next

Pull networking updates from Jakub Kicinski:
 "Not much excitement - a handful of large patchsets (devmem among them)
  did not make it in time.

  Core & protocols:

   - Use local_lock in addition to local_bh_disable() to protect per-CPU
     resources in networking, a step closer for local_bh_disable() not
     to act as a big lock on PREEMPT_RT

   - Use flex array for netdevice priv area, ensure its cache alignment

   - Add a sysctl knob to allow user to specify a default rto_min at
     socket init time. Bit of a big hammer but multiple companies were
     independently carrying such patch downstream so clearly it's useful

   - Support scheduling transmission of packets based on CLOCK_TAI

   - Un-pin TCP TIMEWAIT timer to avoid it firing on CPUs later cordoned
     off using cpusets

   - Support multiple L2TPv3 UDP tunnels using the same 5-tuple address

   - Allow configuration of multipath hash seed, to both allow
     synchronizing hashing of two routers, and preventing partial
     accidental sync

   - Improve TCP compliance with RFC 9293 for simultaneous connect()

   - Support sending NAT keepalives in IPsec ESP in UDP states.
     Userspace IKE daemon had to do this before, but the kernel can
     better keep track of it

   - Support sending supervision HSR frames with MAC addresses stored in
     ProxyNodeTable when RedBox (i.e. HSR-SAN) is enabled

   - Introduce IPPROTO_SMC for selecting SMC when socket is created

   - Allow UDP GSO transmit from devices with no checksum offload

   - openvswitch: add packet sampling via psample, separating the
     sampled traffic from "upcall" packets sent to user space for
     forwarding

   - nf_tables: shrink memory consumption for transaction objects

  Things we sprinkled into general kernel code:

   - Power Sequencing subsystem (used by Qualcomm Bluetooth driver for
     QCA6390)           [ Already merged separately - Linus ]

   - Add IRQ information in sysfs for auxiliary bus

   - Introduce guard definition for local_lock

   - Add aligned flavor of __cacheline_group_{begin, end}() markings for
     grouping fields in structures

  BPF:

   - Notify user space (via epoll) when a struct_ops object is getting
     detached/unregistered

   - Add new kfuncs for a generic, open-coded bits iterator

   - Enable BPF programs to declare arrays of kptr, bpf_rb_root, and
     bpf_list_head

   - Support resilient split BTF which cuts down on duplication and
     makes BTF as compact as possible WRT BTF from modules

   - Add support for dumping kfunc prototypes from BTF which enables
     both detecting as well as dumping compilable prototypes for kfuncs

   - riscv64 BPF JIT improvements in particular to add 12-argument
     support for BPF trampolines and to utilize bpf_prog_pack for the
     latter

   - Add the capability to offload the netfilter flowtable in XDP layer
     through kfuncs

  Driver API:

   - Allow users to configure IRQ tresholds between which automatic IRQ
     moderation can choose

   - Expand Power Sourcing (PoE) status with power, class and failure
     reason. Support setting power limits

   - Track additional RSS contexts in the core, make sure configuration
     changes don't break them

   - Support IPsec crypto offload for IPv6 ESP and IPv4 UDP-encapsulated
     ESP data paths

   - Support updating firmware on SFP modules

  Tests and tooling:

   - mptcp: use net/lib.sh to manage netns

   - TCP-AO and TCP-MD5: replace debug prints used by tests with
     tracepoints

   - openvswitch: make test self-contained (don't depend on OvS CLI
     tools)

  Drivers:

   - Ethernet high-speed NICs:
      - Broadcom (bnxt):
         - increase the max total outstanding PTP TX packets to 4
         - add timestamping statistics support
         - implement netdev_queue_mgmt_ops
         - support new RSS context API
      - Intel (100G, ice, idpf):
         - implement FEC statistics and dumping signal quality indicators
         - support E825C products (with 56Gbps PHYs)
      - nVidia/Mellanox:
         - support HW-GRO
         - mlx4/mlx5: support per-queue statistics via netlink
         - obey the max number of EQs setting in sub-functions
      - AMD/Solarflare:
         - support new RSS context API
      - AMD/Pensando:
         - ionic: rework fix for doorbell miss to lower overhead and
           skip it on new HW
      - Wangxun:
         - txgbe: support Flow Director perfect filters

   - Ethernet NICs consumer, embedded and virtual:
      - Add driver for Tehuti Networks TN40xx chips
      - Add driver for Meta's internal NIC chips
      - Add driver for Ethernet MAC on Airoha EN7581 SoCs
      - Add driver for Renesas Ethernet-TSN devices
      - Google cloud vNIC:
         - flow steering support
      - Microsoft vNIC:
         - support page sizes other than 4KB on ARM64
      - vmware vNIC:
         - support latency measurement (update to version 9)
      - VirtIO net:
         - support for Byte Queue Limits
         - support configuring thresholds for automatic IRQ moderation
         - support for AF_XDP Rx zero-copy
      - Synopsys (stmmac):
         - support for STM32MP13 SoC
         - let platforms select the right PCS implementation
      - TI:
         - icssg-prueth: add multicast filtering support
         - icssg-prueth: enable PTP timestamping and PPS
      - Renesas:
         - ravb: improve Rx performance 30-400% by using page pool,
           theaded NAPI and timer-based IRQ coalescing
         - ravb: add MII support for R-Car V4M
      - Cadence (macb):
         - macb: add ARP support to Wake-On-LAN
      - Cortina:
         - use phylib for RX and TX pause configuration

   - Ethernet switches:
      - nVidia/Mellanox:
         - support configuration of multipath hash seed
         - report more accurate max MTU
         - use page_pool to improve Rx performance
      - MediaTek:
         - mt7530: add support for bridge port isolation
      - Qualcomm:
         - qca8k: add support for bridge port isolation
      - Microchip:
         - lan9371/2: add 100BaseTX PHY support
      - NXP:
         - vsc73xx: implement VLAN operations

   - Ethernet PHYs:
      - aquantia: enable support for aqr115c
      - aquantia: add support for PHY LEDs
      - realtek: add support for rtl8224 2.5Gbps PHY
      - xpcs: add memory-mapped device support
      - add BroadR-Reach link mode and support in Broadcom's PHY driver

   - CAN:
      - add document for ISO 15765-2 protocol support
      - mcp251xfd: workaround for erratum DS80000789E, use timestamps to
        catch when device returns incorrect FIFO status

   - WiFi:
      - mac80211/cfg80211:
         - parse Transmit Power Envelope (TPE) data in mac80211 instead
           of in drivers
         - improvements for 6 GHz regulatory flexibility
         - multi-link improvements
         - support multiple radios per wiphy
         - remove DEAUTH_NEED_MGD_TX_PREP flag
      - Intel (iwlwifi):
         - bump FW API to 91 for BZ/SC devices
         - report 64-bit radiotap timestamp
         - enable P2P low latency by default
         - handle Transmit Power Envelope (TPE) advertised by AP
         - remove support for older FW for new devices
         - fast resume (keeping the device configured)
         - mvm: re-enable Multi-Link Operation (MLO)
         - aggregation (A-MSDU) optimizations
      - MediaTek (mt76):
         - mt7925 Multi-Link Operation (MLO) support
      - Qualcomm (ath10k):
         - LED support for various chipsets
      - Qualcomm (ath12k):
         - remove unsupported Tx monitor handling
         - support channel 2 in 6 GHz band
         - support Spatial Multiplexing Power Save (SMPS) in 6 GHz band
         - supprt multiple BSSID (MBSSID) and Enhanced Multi-BSSID
           Advertisements (EMA)
         - support dynamic VLAN
         - add panic handler for resetting the firmware state
         - DebugFS support for datapath statistics
         - WCN7850: support for Wake on WLAN
      - Microchip (wilc1000):
         - read MAC address during probe to make it visible to user space
         - suspend/resume improvements
      - TI (wl18xx):
         - support newer firmware versions
      - RealTek (rtw89):
         - preparation for RTL8852BE-VT support
         - Wake on WLAN support for WiFi 6 chips
         - 36-bit PCI DMA support
      - RealTek (rtlwifi):
         - RTL8192DU support
      - Broadcom (brcmfmac):
         - Management Frame Protection support (to enable WPA3)

   - Bluetooth:
      - qualcomm: use the power sequencer for QCA6390
      - btusb: mediatek: add ISO data transmission functions
      - hci_bcm4377: add BCM4388 support
      - btintel: add support for BlazarU core
      - btintel: add support for Whale Peak2
      - btnxpuart: add support for AW693 A1 chipset
      - btnxpuart: add support for IW615 chipset
      - btusb: add Realtek RTL8852BE support ID 0x13d3:0x3591"

* tag 'net-next-6.11' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1589 commits)
  eth: fbnic: Fix spelling mistake "tiggerring" -> "triggering"
  tcp: Replace strncpy() with strscpy()
  wifi: ath12k: fix build vs old compiler
  tcp: Don't access uninit tcp_rsk(req)->ao_keyid in tcp_create_openreq_child().
  eth: fbnic: Write the TCAM tables used for RSS control and Rx to host
  eth: fbnic: Add L2 address programming
  eth: fbnic: Add basic Rx handling
  eth: fbnic: Add basic Tx handling
  eth: fbnic: Add link detection
  eth: fbnic: Add initial messaging to notify FW of our presence
  eth: fbnic: Implement Rx queue alloc/start/stop/free
  eth: fbnic: Implement Tx queue alloc/start/stop/free
  eth: fbnic: Allocate a netdevice and napi vectors with queues
  eth: fbnic: Add FW communication mechanism
  eth: fbnic: Add message parsing for FW messages
  eth: fbnic: Add register init to set PCIe/Ethernet device config
  eth: fbnic: Allocate core device specific structures and devlink interface
  eth: fbnic: Add scaffolding for Meta's NIC driver
  PCI: Add Meta Platforms vendor ID
  net/sched: cls_flower: propagate tca[TCA_OPTIONS] to NL_REQ_ATTR_CHECK
  ...
2024-07-16 19:28:34 -07:00
Linus Torvalds
f8d22a3195 linux_kselftest-kunit-6.11-rc1
This KUnit next update for Linux 6.11-rc1 consists of:
 
 -- adds vm_mmap() allocation resource manager
 -- converts usercopy kselftest to KUnit
 -- disables usercopy testing on !CONFIG_MMU
 -- adds MODULE_DESCRIPTION() to core, list, and usercopy tests
 -- adds tests for assertion formatting functions - assert.c
 -- introduces KUNIT_ASSERT_MEMEQ and KUNIT_ASSERT_MEMNEQ macros
 -- fixes KUNIT_ASSERT_STRNEQ comments to make it clear that it is
    an assertion
 -- renames KUNIT_ASSERT_FAILURE to KUNIT_FAIL_AND_ABORT
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEPZKym/RZuOCGeA/kCwJExA0NQxwFAmaWpCYACgkQCwJExA0N
 QxwdPQ/9G26Q+xhbieosvXHu/04ZWTcuUP/cFRv56jLH9bKm25YbW8WZzKM/imE5
 So35IT6SIYlwxn9fYyriPz372h3ZC522cu8tIVrUh5Uo3O5LbzQqdrxos9a+RuCg
 u6lenSksAjJRZ3S3IKDJ1ErxLnPYKyjjZFwDmV1+0Xxy30SwzFEbQqj9lY2Q4iGs
 KWBm0lrFPipbHdBqZcPB/mxIDyF6rhe+oeuOPU8uag6ncNN31xMpDanU8O6XEAz9
 QoAiDICANbVKTRKG5xXgmsJtyLF8GON4e49kEYtCLdnESPc39hQtf3cTHeYI22HC
 7OWhhOySifNIukFj1hVtxnN3ZfjtBGmbCwe5rXZFvMovE3YwAplKK61GoOaI9UV0
 qPk5GGrAb/xEh2HZ9tgf8+CsqmnPQLGnVt2h3u3c28u4YzbkinqVj20KYsye39zz
 KzJsO2yDJH4LlIJjc8XWof1cyyo0TIJQVOwJqAieOPePnfs4zabmVOus8y1Cj07V
 iAvQTPPoZ165zA1cl0iSMolKkXeAgf2FjlEGbODrktKKX6Ag/PKVp3e6PW28zJbp
 0p1V1IDQQAlEhbcRAZb+5y1voh+hcy++KyPwpj7lAVkmHd7RoK/mDL3W+oLdOTrB
 aXWs4JOlkmtUaz3EpAQZuvhYWVW7DexR9rU1SF44UAVzSdZSndw=
 =nnFR
 -----END PGP SIGNATURE-----

Merge tag 'linux_kselftest-kunit-6.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest

Pull KUnit updates from Shuah Khan:

 - add vm_mmap() allocation resource manager

 - convert usercopy kselftest to KUnit

 - disable usercopy testing on !CONFIG_MMU

 - add MODULE_DESCRIPTION() to core, list, and usercopy tests

 - add tests for assertion formatting functions - assert.c

 - introduce KUNIT_ASSERT_MEMEQ and KUNIT_ASSERT_MEMNEQ macros

 - fix KUNIT_ASSERT_STRNEQ comments to make it clear that it is an
   assertion

 - rename KUNIT_ASSERT_FAILURE to KUNIT_FAIL_AND_ABORT

* tag 'linux_kselftest-kunit-6.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
  kunit: Introduce KUNIT_ASSERT_MEMEQ and KUNIT_ASSERT_MEMNEQ macros
  kunit: Rename KUNIT_ASSERT_FAILURE to KUNIT_FAIL_AND_ABORT for readability
  kunit: Fix the comment of KUNIT_ASSERT_STRNEQ as assertion
  kunit: executor: Simplify string allocation handling
  kunit/usercopy: Add missing MODULE_DESCRIPTION()
  kunit/usercopy: Disable testing on !CONFIG_MMU
  usercopy: Convert test_user_copy to KUnit test
  kunit: test: Add vm_mmap() allocation resource manager
  list: test: add the missing MODULE_DESCRIPTION() macro
  kunit: add missing MODULE_DESCRIPTION() macros to core modules
  list: test: remove unused struct 'klist_test_struct'
  kunit: Cover 'assert.c' with tests
2024-07-16 17:42:14 -07:00
Linus Torvalds
f8a8b94d06 sysctl changes for 6.11-rc1
Summary
 
 * Remove "->procname == NULL" check when iterating through sysctl table arrays
 
     Removing sentinels in ctl_table arrays reduces the build time size and
     runtime memory consumed by ~64 bytes per array. With all ctl_table
     sentinels gone, the additional check for ->procname == NULL that worked in
     tandem with the ARRAY_SIZE to calculate the size of the ctl_table arrays is
     no longer needed and has been removed. The sysctl register functions now
     returns an error if a sentinel is used.
 
 * Preparation patches for sysctl constification
 
     Constifying ctl_table structs prevents the modification of proc_handler
     function pointers as they would reside in .rodata. The ctl_table arguments
     in sysctl utility functions are const qualified in preparation for a future
     treewide proc_handler argument constification commit.
 
 * Misc fixes
 
     Increase robustness of set_ownership by providing sane default ownership
     values in case the callee doesn't set them. Bound check proc_dou8vec_minmax
     to avoid loading buggy modules and give sysctl testing module a name to
     avoid compiler complaints.
 
 Testing
 
   * This got push to linux-next in v6.10-rc2, so it has had more than a month
     of testing
 -----BEGIN PGP SIGNATURE-----
 
 iQGzBAABCgAdFiEErkcJVyXmMSXOyyeQupfNUreWQU8FAmaWdz4ACgkQupfNUreW
 QU/WKQwAkSuUz42yCQye77BK+Z8ANcTF1f3aI/wfv2nahq1GaSrNBpqUiXvEe9Tt
 KD2lM1PWiQfizVLIDPh96yxa5q69GQrPPOA/V1jwIXmk/HRpjjoONCFNNXVRCTls
 VCqDz/RatuXvzO35Yn87MnWnxv6PiX7X/zq/3WikVsUI381kvTgC6OwZxdFM52w4
 ESwOa3LeOovtRnqV5dpHr6DCQKyd0N52nPxgXvaerjlsJsv7PlezN7z9YyLOOfmW
 xUD7X6LQcJq7HcEukaB6I9o2GQOi4yYXL2YOzed7qu9Thu+lasEoN3Bd7P+ilXkc
 JY6EXJ5o+d69PewKRuJ1QvD7wrHIkhNMNbMtvehNay124wAHDy3KtonFzyvlX4wE
 qCHBYc6rySJNhSqwVp9MoksOZfDM99pVIOs9YVIjc90Zzu5J7tORgYWRVOHTcAtj
 fd8nMdkK3+ZANapygFCyew6GueIzaqlQwveVgLGw4vc5L3ClknmURit3y487Pzdg
 B+BEVlsp
 =bs2G
 -----END PGP SIGNATURE-----

Merge tag 'sysctl-6.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/sysctl/sysctl

Pull sysctl updates from Joel Granados:

 - Remove "->procname == NULL" check when iterating through sysctl table
   arrays

   Removing sentinels in ctl_table arrays reduces the build time size
   and runtime memory consumed by ~64 bytes per array. With all
   ctl_table sentinels gone, the additional check for ->procname == NULL
   that worked in tandem with the ARRAY_SIZE to calculate the size of
   the ctl_table arrays is no longer needed and has been removed. The
   sysctl register functions now returns an error if a sentinel is used.

 - Preparation patches for sysctl constification

   Constifying ctl_table structs prevents the modification of
   proc_handler function pointers as they would reside in .rodata. The
   ctl_table arguments in sysctl utility functions are const qualified
   in preparation for a future treewide proc_handler argument
   constification commit.

 - Misc fixes

   Increase robustness of set_ownership by providing sane default
   ownership values in case the callee doesn't set them. Bound check
   proc_dou8vec_minmax to avoid loading buggy modules and give sysctl
   testing module a name to avoid compiler complaints.

* tag 'sysctl-6.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/sysctl/sysctl:
  sysctl: Warn on an empty procname element
  sysctl: Remove ctl_table sentinel code comments
  sysctl: Remove "child" sysctl code comments
  sysctl: Remove superfluous empty allocations from sysctl internals
  sysctl: Replace nr_entries with ctl_table_size in new_links
  sysctl: Remove check for sentinel element in ctl_table arrays
  mm profiling: Remove superfluous sentinel element from ctl_table
  locking: Remove superfluous sentinel element from kern_lockdep_table
  sysctl: Add module description to sysctl-testing
  sysctl: constify ctl_table arguments of utility function
  utsname: constify ctl_table arguments of utility function
  sysctl: move the extra1/2 boundary check of u8 to sysctl_check_table_array
  sysctl: always initialize i_uid/i_gid
2024-07-16 14:24:29 -07:00
Linus Torvalds
ce5a51bfac hardening updates for v6.11-rc1
- lkdtm/bugs: add test for hung smp_call_function_single() (Mark Rutland)
 
 - gcc-plugins: Remove duplicate included header file stringpool.h
   (Thorsten Blum)
 
 - ARM: Remove address checking for MMUless devices (Yanjun Yang)
 
 - randomize_kstack: Clean up per-arch entropy and codegen
 
 - KCFI: Make FineIBT mode Kconfig selectable
 
 - fortify: Do not special-case 0-sized destinations
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEpcP2jyKd1g9yPm4TiXL039xtwCYFAmaVT2IACgkQiXL039xt
 wCbq8A//RhxTdr+l/h2gyMy/Lcy/NMR9KEWklnxdftuM1V1Kzr53yeH/g6Ehw69g
 e8Ag3Sp7Fn4rNBVa+tY6RqzKwfrUHIbeewGI4LkRe19NDWFWc/Od+4tamfRSPf9c
 GL9ZnJZviRm3zByetwr4CbS69HocXFFSSgcpIv/7xOd+haSWWdvEc3KcSnavY/aq
 8wQPkZxzy8ESkOajZj2k0E2l9JP42Ex20qy0KcjweSSYVafKmbTxhKZgriwAKMCD
 Yj2m55fbD6D08vd0Y6S7H4TPilYtRbulXR9FNMtw59UpKeoUceEmyn4B43psDvau
 9XuJF/oFKrXBEJG+OUZogNu5L6uYUaNdYdtb43upu9lCsjrAjmMYfmXDHO2E40V8
 76MikxHtyFAPEzUwg/BH2CGUu9hil+FADd28s8zLuUBpRDitgYudQD+Cqrc34b6s
 QlAX19bX7KFgXqlsdwy6zJNSd3dpoMBVsP58/EhQQfiqv/ZU2TOryZenz0URlH+k
 ZCAbpXYRAzTyGz23qkutRO+6MiKXoheE7gmd9jESiaqyXe2Q6mIMPyoFU50458TH
 xXhXbZc7War8vbJLyWF7fvK/GlooTHu4xOxfNTsxKWiYShI01iiwG1hH+j4ZDVOG
 NBBK2AfX9GM8AOHJolp5EaGmon0AoVsxbRANSs1K4qZ93WTNGLk=
 =LoG2
 -----END PGP SIGNATURE-----

Merge tag 'hardening-v6.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux

Pull hardening updates from Kees Cook:

 - lkdtm/bugs: add test for hung smp_call_function_single() (Mark
   Rutland)

 - gcc-plugins: Remove duplicate included header file stringpool.h
   (Thorsten Blum)

 - ARM: Remove address checking for MMUless devices (Yanjun Yang)

 - randomize_kstack: Clean up per-arch entropy and codegen

 - KCFI: Make FineIBT mode Kconfig selectable

 - fortify: Do not special-case 0-sized destinations

* tag 'hardening-v6.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
  randomize_kstack: Improve stack alignment codegen
  ARM: Remove address checking for MMUless devices
  gcc-plugins: Remove duplicate included header file stringpool.h
  randomize_kstack: Remove non-functional per-arch entropy filtering
  fortify: Do not special-case 0-sized destinations
  x86/alternatives: Make FineIBT mode Kconfig selectable
  lkdtm/bugs: add test for hung smp_call_function_single()
2024-07-16 13:45:43 -07:00
Linus Torvalds
4fd9435641 Updates for timers, timekeeping and related functionality:
- Core:
 
     - Make the takeover of a hrtimer based broadcast timer reliable during
       CPU hot-unplug. The current implementation suffers from a race which
       can lead to broadcast timer starvation in the worst case.
 
     - VDSO related cleanups and simplifications
 
     - Small cleanups and enhancements all over the place
 
   - PTP:
 
     - Replace the architecture specific base clock to clocksource, e.g. ART
       to TSC, conversion function with generic functionality to avoid
       exposing such internals to drivers and convert all existing drivers
       over. This also allows to provide functionality which converts the
       other way round in the core code based on the same parameter set.
 
     - Provide a function to convert CLOCK_REALTIME to the base clock to
       support the upcoming PPS output driver on Intel platforms.
 
   - Drivers:
 
     - A set of Device Tree bindings for new hardware
 
     - Cleanups and enhancements all over the place
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAmaUOM0THHRnbHhAbGlu
 dXRyb25peC5kZQAKCRCmGPVMDXSYofolD/9kK+aYdDj1gCFuZXZ2wTgMMxFmf/91
 0UcsGRuBJiIXs3H3iizQ0Mb0cdTW6qZJoBp0jPlvUSm0BEKdEgE1uRX2RuAPZ/Gq
 4/54ZJVopKSgAqeJFmqQubRVSv2XdMRAAJT0o1oUG3jZ0c6u8vqArIh5ZCnu13l/
 tsNOeYLYzQFyA30eHSJ/KjQ2zHwAhJnl5a/b7pdAvxmlN37bGgKEpglv+9zwFiDB
 K/kWbpb/oED9WOmoQy5QYi8iSvLQHEhFGrqzXV3fegu/B/mBBf/bpsisVx7Z1m2R
 nzxNqg86RdMjNR6giwBETZjm7YxM+gKb9nCBNILjbjWZFC4tyrBkLGJ+KniTRNyZ
 M5R4X1oP/14h00qXmCgIEFWysXaJRewYI+TIm8R2rLXrR6Tf3c4oL6fHQJxy3X52
 7A+4Z/vOk/KX6PxYmLC+xQDukhFh2nirVYsP1oNM9yC9zR/wkBBXTTmUSAI+8m8l
 KphniSPS2HMSBI6TtgOT8SKY7lRUZTnafBZq7wRXCv0Zz8AXoofgQDmBkXC99BkB
 MjLvRotJVJvY9a8LtA7htjDg/jiEMa0wHRNAGNSbflKoAKrJzoE5WbFxFZKbq3vZ
 o8cEYRMAIP+X+qn+oymT45XXXQlifZiccJdAi9FqDTvplEib2jmTmH6Ae5Khkr4l
 Lbzh/nSKVN7lOg==
 =8GjP
 -----END PGP SIGNATURE-----

Merge tag 'timers-core-2024-07-14' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull timer updates from Thomas Gleixner:
 "Updates for timers, timekeeping and related functionality:

  Core:

   - Make the takeover of a hrtimer based broadcast timer reliable
     during CPU hot-unplug. The current implementation suffers from a
     race which can lead to broadcast timer starvation in the worst
     case.

   - VDSO related cleanups and simplifications

   - Small cleanups and enhancements all over the place

  PTP:

   - Replace the architecture specific base clock to clocksource, e.g.
     ART to TSC, conversion function with generic functionality to avoid
     exposing such internals to drivers and convert all existing drivers
     over. This also allows to provide functionality which converts the
     other way round in the core code based on the same parameter set.

   - Provide a function to convert CLOCK_REALTIME to the base clock to
     support the upcoming PPS output driver on Intel platforms.

  Drivers:

   - A set of Device Tree bindings for new hardware

   - Cleanups and enhancements all over the place"

* tag 'timers-core-2024-07-14' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (30 commits)
  clocksource/drivers/realtek: Add timer driver for rtl-otto platforms
  dt-bindings: timer: Add schema for realtek,otto-timer
  dt-bindings: timer: Add SOPHGO SG2002 clint
  dt-bindings: timer: renesas,tmu: Add R-Car Gen2 support
  dt-bindings: timer: renesas,tmu: Add RZ/G1 support
  dt-bindings: timer: renesas,tmu: Add R-Mobile APE6 support
  clocksource/drivers/mips-gic-timer: Correct sched_clock width
  clocksource/drivers/mips-gic-timer: Refine rating computation
  clocksource/drivers/sh_cmt: Address race condition for clock events
  clocksource/driver/arm_global_timer: Remove unnecessary ‘0’ values from err
  clocksource/drivers/arm_arch_timer: Remove unnecessary ‘0’ values from irq
  tick/broadcast: Make takeover of broadcast hrtimer reliable
  tick/sched: Combine WARN_ON_ONCE and print_once
  x86/vdso: Remove unused include
  x86/vgtod: Remove unused typedef gtod_long_t
  x86/vdso: Fix function reference in comment
  vdso: Add comment about reason for vdso struct ordering
  vdso/gettimeofday: Clarify comment about open coded function
  timekeeping: Add missing kernel-doc function comments
  tick: Remove unnused tick_nohz_get_idle_calls()
  ...
2024-07-15 15:03:09 -07:00
Linus Torvalds
0e4b77d4ea A single update for debugobjects to annotate all intentionally racy global
debug variables so that KCSAN ignores them.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAmaULP4THHRnbHhAbGlu
 dXRyb25peC5kZQAKCRCmGPVMDXSYoT6pEACXmc34OzO3jbOGEmgt5ch0cYSNvlY0
 AL0iAV5JakC8AGDWeDNAUhR5r7tuNqjMmiy/XH+uR/4+xCZLZvQp7flyhrm/W7vd
 rB3slu4xqqHizoQe81ZdH3ffg7Cj/Q/zqcJTv44UYkWLlAKA92S79bsn903UHpnL
 ENH0IMulpP0b3GedV3GySz476kyAJX4ZJHXfsG71oyWz8gJahXfaDzSMqnMW0bLG
 z0u51D9Q2R60zYpEsSPfBCKERKZ+Dzbn/YOYF85kytpXkVQd183JY05IkZmDgxyB
 O973GgxvPGXZMXrUfhd+h7Kr17TiG+OKFpxhxgGCQoJNebFUt4A+QFWwQ7/FE/TN
 FmjvwTBHllrLpucskivvI6zEETnJB/13XBB/T3k0BMB3cFfUiXdQS0N+xOBVoAhD
 CLo21kG+xNPbzuKwzKx1+Vb/FH8/aoKp6py5kQlKAtQ6ddfqyvyGN3TZKYQGl3Hk
 9o1ZuwlfkpG0a/0GKvyPcUeLUP0IagGe1wrOard+uL2VRlPRTnr4GH7ItTEedmAY
 JRlCD0A1GQzwVtOy+D54W0G0ueW/tX76QzxuIJj5wwmZQpcV37eTOfIbZXnk4RzS
 TZJ6gjxSLGbjYMbTiIcTFBU6UXhKjkE30bb5gPdzpXh8QtI1SSqpftZszqTAXWA3
 qbMwI0/csYVXsg==
 =PuR2
 -----END PGP SIGNATURE-----

Merge tag 'core-debugobjects-2024-07-14' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull debugobjects update from Thomas Gleixner:
 "A single update for debugobjects to annotate all intentionally racy
  global debug variables so that KCSAN ignores them"

* tag 'core-debugobjects-2024-07-14' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  debugobjects: Annotate racy debug variables
2024-07-15 14:52:21 -07:00
Vlastimil Babka
436381eaf2 Merge branch 'slab/for-6.11/buckets' into slab/for-next
Merge all the slab patches previously collected on top of v6.10-rc1,
over cleanups/fixes that had to be based on rc6.
2024-07-15 10:44:16 +02:00
Linus Torvalds
882ddcd1bf Kbuild fixes for v6.10 (fourth)
- Make scripts/ld-version.sh robust against the latest LLD
 
  - Fix warnings in rpm-pkg with device tree support
 
  - Fix warnings in fortify tests with KASAN
 -----BEGIN PGP SIGNATURE-----
 
 iQJJBAABCgAzFiEEbmPs18K1szRHjPqEPYsBB53g2wYFAmaUM9kVHG1hc2FoaXJv
 eUBrZXJuZWwub3JnAAoJED2LAQed4NsGHEYQAKvrP1IzwlkANEEj/2qW1iGHlUod
 im3cFCKyxrlBar71n15fYtclhK0N4GTVAUMHAk0d1GZo8UjuCeUzurHM4o53Hu1P
 D0pXbkmiA0YgndpJYQpSV0CrLxCOCVAEFRf7AgdolVjpNLuba4z0bXSTQfEwfHKC
 W1igTL2vGG8citbfHhEGZfB7AIEQBB0LtNkarpsDVD39rG+blZAABBLEtCueSVtw
 rVX/Yuny9nDET6tlaCNgr2esNfkrHPIxOSsufeLWdhVVIZprPSGERflhkL3yE299
 v6R45ANn72iVNKnnmmjxTNeezIpr74w1NSzBJ0jRM1KRqzbsEuFAf0ZNamtoUJ4r
 m4tSu5l7lDj86APvehoO2o07A3omd8vcgLPt+lZlFsBIjorVIKovsjix6pVUgHlS
 BTvxbSojbSMUa/NrkbosJkLo/6TzZxYxHKr17nxk+HsXu0i9A9IiHPBK5dTcbtua
 olp1MKolQG78FYMwl7v4yQithawRG0mNDLJ2J8oTEIATXQtXV0WAaje73qQFIs6I
 cMBEfeaDAMH4z0/VvKZsdksXFPDrrjoW0/x1tPqcAgOSyacPGbki4asn52rwDHT5
 mfAzlnUc8ts56sBasArmMpk0z+PKC4MZeFUXNGJf7bZ3NZqoDRDHpb69Aqs11vSw
 AJa9Kj07o5YD8N+E
 =MMw8
 -----END PGP SIGNATURE-----

Merge tag 'kbuild-fixes-v6.10-4' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild

Pull Kbuild fixes from Masahiro Yamada:

 - Make scripts/ld-version.sh robust against the latest LLD

 - Fix warnings in rpm-pkg with device tree support

 - Fix warnings in fortify tests with KASAN

* tag 'kbuild-fixes-v6.10-4' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild:
  fortify: fix warnings in fortify tests with KASAN
  kbuild: rpm-pkg: avoid the warnings with dtb's listed twice
  kbuild: Make ld-version.sh more robust against version string changes
2024-07-14 15:29:35 -07:00
Masahiro Yamada
84679f04ce fortify: fix warnings in fortify tests with KASAN
When a software KASAN mode is enabled, the fortify tests emit warnings
on some architectures.

For example, for ARCH=arm, the combination of CONFIG_FORTIFY_SOURCE=y
and CONFIG_KASAN=y produces the following warnings:

    TEST    lib/test_fortify/read_overflow-memchr.log
  warning: unsafe memchr() usage lacked '__read_overflow' warning in lib/test_fortify/read_overflow-memchr.c
    TEST    lib/test_fortify/read_overflow-memchr_inv.log
  warning: unsafe memchr_inv() usage lacked '__read_overflow' symbol in lib/test_fortify/read_overflow-memchr_inv.c
    TEST    lib/test_fortify/read_overflow-memcmp.log
  warning: unsafe memcmp() usage lacked '__read_overflow' warning in lib/test_fortify/read_overflow-memcmp.c
    TEST    lib/test_fortify/read_overflow-memscan.log
  warning: unsafe memscan() usage lacked '__read_overflow' symbol in lib/test_fortify/read_overflow-memscan.c
    TEST    lib/test_fortify/read_overflow2-memcmp.log
  warning: unsafe memcmp() usage lacked '__read_overflow2' warning in lib/test_fortify/read_overflow2-memcmp.c
     [ more and more similar warnings... ]

Commit 9c2d1328f8 ("kbuild: provide reasonable defaults for tool
coverage") removed KASAN flags from non-kernel objects by default.
It was an intended behavior because lib/test_fortify/*.c are unit
tests that are not linked to the kernel.

As it turns out, some architectures require -fsanitize=kernel-(hw)address
to define __SANITIZE_ADDRESS__ for the fortify tests.

Without __SANITIZE_ADDRESS__ defined, arch/arm/include/asm/string.h
defines __NO_FORTIFY, thus excluding <linux/fortify-string.h>.

This issue does not occur on x86 thanks to commit 4ec4190be4
("kasan, x86: don't rename memintrinsics in uninstrumented files"),
but there are still some architectures that define __NO_FORTIFY
in such a situation.

Set KASAN_SANITIZE=y explicitly to the fortify tests.

Fixes: 9c2d1328f8 ("kbuild: provide reasonable defaults for tool coverage")
Reported-by: Arnd Bergmann <arnd@arndb.de>
Closes: https://lore.kernel.org/all/0e8dee26-41cc-41ae-9493-10cd1a8e3268@app.fastmail.com/
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2024-07-15 04:53:49 +09:00
Thomas Gleixner
b7625d67eb - Remove unnecessary local variables initialization as they will be
initialized in the code path anyway right after on the ARM arch
   timer and the ARM global timer (Li kunyu)
 
 - Fix a race condition in the interrupt leading to a deadlock on the
   SH CMT driver. Note that this fix was not tested on the platform
   using this timer but the fix seems reasonable enough to be picked
   confidently (Niklas Söderlund)
 
 - Increase the rating of the gic-timer and use the configured width
   clocksource register on the MIPS architecture (Jiaxun Yang)
 
 - Add the DT bindings for the TMU on the Renesas platforms (Geert
   Uytterhoeven)
 
 - Add the DT bindings for the SOPHGO SG2002 clint on RiscV (Thomas
   Bonnefille)
 
 - Add the rtl-otto timer driver along with the DT bindings for the
   Realtek platform (Chris Packham)
 -----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCAAdFiEEGn3N4YVz0WNVyHskqDIjiipP6E8FAmaRQh0ACgkQqDIjiipP
 6E+rfQgAqkAWZ9BjswxV8Fg+Hj+a1cSohKjDczqitQF5rJm25X5VvMwlXVa3XQGm
 yemh4tKPpll02LOiYCTyqOWzNrkVS9VsoBd5rrYjRX5aSv7UD35EXklLj4P/INwX
 O9CRGD6aK4Xbw66xxheYHSSh+2iRs2x2mq61+/VdcIBlAwpQo+vx7McRoJZZI+2t
 NFIXw8RF5dDlmmAaqiB0WnPAtcOK3SDo9fu1LEAX1ZAzvbZriLo7XLnL7ibySWVe
 BW1n7Ore6PN5Dvz7jMfTsOQsgAlVv6MPfp/s4EDqMfBLVqXNirzXrdhiee/ahnYP
 vyzQyU5HPCMiIYS45mhJF0OyDd3wyw==
 =wuYA
 -----END PGP SIGNATURE-----

Merge tag 'timers-v6.11-rc1' of https://git.linaro.org/people/daniel.lezcano/linux into timers/core

Pull clocksource/event driver updates from Daniel Lezcano:

  - Remove unnecessary local variables initialization as they will be
    initialized in the code path anyway right after on the ARM arch
    timer and the ARM global timer (Li kunyu)

  - Fix a race condition in the interrupt leading to a deadlock on the
    SH CMT driver. Note that this fix was not tested on the platform
    using this timer but the fix seems reasonable enough to be picked
    confidently (Niklas Söderlund)

  - Increase the rating of the gic-timer and use the configured width
    clocksource register on the MIPS architecture (Jiaxun Yang)

  - Add the DT bindings for the TMU on the Renesas platforms (Geert
    Uytterhoeven)

  - Add the DT bindings for the SOPHGO SG2002 clint on RiscV (Thomas
    Bonnefille)

  - Add the rtl-otto timer driver along with the DT bindings for the
    Realtek platform (Chris Packham)

Link: https://lore.kernel.org/all/91cd05de-4c5d-4242-a381-3b8a4fe6a2a2@linaro.org
2024-07-13 12:07:10 +02:00
Dan Carpenter
fe69b772e3 crypto: lib/mpi - delete unnecessary condition
We checked that "nlimbs" is non-zero in the outside if statement so delete
the duplicate check here.

Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
Reviewed-by: Tianjia Zhang <tianjia.zhang@linux.alibaba.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2024-07-13 11:50:28 +12:00
Thorsten Blum
e1fb7430fc lib/bch.c: use swap() to improve code
Use the swap() macro to simplify the functions solve_linear_system() and
gf_poly_gcd() and improve their readability.  Remove the local variable
tmp.

Fixes the following three Coccinelle/coccicheck warnings reported by
swap.cocci:

  WARNING opportunity for swap()
  WARNING opportunity for swap()
  WARNING opportunity for swap()

Link: https://lkml.kernel.org/r/20240708224023.9312-2-thorsten.blum@toblux.com
Signed-off-by: Thorsten Blum <thorsten.blum@toblux.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-07-12 16:39:53 -07:00
Chen Ni
4f5d4a1ba7 test_bpf: convert comma to semicolon
Replace commas between expression statements with semicolons.

Link: https://lkml.kernel.org/r/20240709034323.586185-1-nichen@iscas.ac.cn
Signed-off-by: Chen Ni <nichen@iscas.ac.cn>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Andrii Nakryiko <andrii@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Eduard Zingerman <eddyz87@gmail.com>
Cc: Hao Luo <haoluo@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: KP Singh <kpsingh@kernel.org>
Cc: Martin KaFai Lau <martin.lau@linux.dev>
Cc: Song Liu <song@kernel.org>
Cc: Stanislav Fomichev <sdf@fomichev.me>
Cc: Yonghong Song <yonghong.song@linux.dev>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-07-12 16:39:52 -07:00
Kees Cook
7554a7b96d kunit: executor: Simplify string allocation handling
The alloc/copy code pattern is better consolidated to single kstrdup (and
kstrndup) calls instead. This gets rid of deprecated[1] strncpy() uses as
well. Replace one other strncpy() use with the more idiomatic strscpy().

Link: https://github.com/KSPP/linux/issues/90 [1]
Reviewed-by: David Gow <davidgow@google.com>
Signed-off-by: Kees Cook <kees@kernel.org>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2024-07-12 10:11:48 -06:00
Thorsten Blum
0d9c0a67b1 bootconfig: Remove duplicate included header file linux/bootconfig.h
The header file linux/bootconfig.h is included whether __KERNEL__ is
defined or not.

Include it only once before the #ifdef/#else/#endif preprocessor
directives and remove the following make includecheck warning:

  linux/bootconfig.h is included more than once

Move the comment to the top and delete the now empty #else block.

Link: https://lore.kernel.org/all/20240711084315.1507-1-thorsten.blum@toblux.com/

Signed-off-by: Thorsten Blum <thorsten.blum@toblux.com>
Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
2024-07-12 08:55:02 +09:00
Jakub Kicinski
7c8267275d Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Cross-merge networking fixes after downstream PR.

Conflicts:

net/sched/act_ct.c
  26488172b0 ("net/sched: Fix UAF when resolving a clash")
  3abbd7ed8b ("act_ct: prepare for stolen verdict coming from conntrack and nat engine")

No adjacent changes.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-07-11 12:58:13 -07:00
Linus Torvalds
9d9a2f29ae 21 hotfixes, 15 of which are cc:stable.
No identifiable theme here - all are singleton patches, 19 are for MM.
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCZo7tTQAKCRDdBJ7gKXxA
 jvhZAP977PnAwQH5khIS3xJxZrqx/+Tho7UPZzQPvHJPRpHorAD/TZfDazGtlPMD
 uLPEVslh18rks/w+kddLrnlBnkpUMwY=
 =vhts
 -----END PGP SIGNATURE-----

Merge tag 'mm-hotfixes-stable-2024-07-10-13-19' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull misc fixes from Andrew Morton:
 "21 hotfixes, 15 of which are cc:stable.

  No identifiable theme here - all are singleton patches, 19 are for MM"

* tag 'mm-hotfixes-stable-2024-07-10-13-19' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (21 commits)
  mm/hugetlb: fix kernel NULL pointer dereference when migrating hugetlb folio
  mm/hugetlb: fix potential race in __update_and_free_hugetlb_folio()
  filemap: replace pte_offset_map() with pte_offset_map_nolock()
  arch/xtensa: always_inline get_current() and current_thread_info()
  sched.h: always_inline alloc_tag_{save|restore} to fix modpost warnings
  MAINTAINERS: mailmap: update Lorenzo Stoakes's email address
  mm: fix crashes from deferred split racing folio migration
  lib/build_OID_registry: avoid non-destructive substitution for Perl < 5.13.2 compat
  mm: gup: stop abusing try_grab_folio
  nilfs2: fix kernel bug on rename operation of broken directory
  mm/hugetlb_vmemmap: fix race with speculative PFN walkers
  cachestat: do not flush stats in recency check
  mm/shmem: disable PMD-sized page cache if needed
  mm/filemap: skip to create PMD-sized page cache if needed
  mm/readahead: limit page cache size in page_cache_ra_order()
  mm/filemap: make MAX_PAGECACHE_ORDER acceptable to xarray
  mm/damon/core: merge regions aggressively when max_nr_regions is unmet
  Fix userfaultfd_api to return EINVAL as expected
  mm: vmalloc: check if a hash-index is in cpu_possible_mask
  mm: prevent derefencing NULL ptr in pfn_section_valid()
  ...
2024-07-10 14:59:41 -07:00
Linus Torvalds
f6963ab4b0 bcachefs fixes for 6.10-rc8
- Switch some asserts to WARN()
 - Fix a few "transaction not locked" asserts in the data read retry
   paths and backpointers gc
 - Fix a race that would cause the journal to get stuck on a flush commit
 - Add missing fsck checks for the fragmentation LRU
 - The usual assorted ssorted syzbot fixes
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEKnAFLkS8Qha+jvQrE6szbY3KbnYFAmaOuRwACgkQE6szbY3K
 bnaCHhAAi9VRqws+zx3fSpe2OMwWqAEWA84QgIFJccy+I86d7dXkqG389gFqJwMG
 9S3BUHP1WooJmpsTRhK5cNtxZuKKOajXlxUYz3onsF7O/U3dHFY5GU7yIIjXS/0o
 q7+iryWAJ4MmlOrAJhgPMH/WlhbSVsjANUN0n/NhlOWHccFGHmpdMTb6aYzb+lfL
 iZOONKmEOR65gLzZYlO323OB2Tv00iEbOZAtxk68BLZYX+WON/j1T1A8gK4G0XSX
 8wcYpXNxGGkCufjBfAbXf4mcp/WygQq0Wj3bdVMFkZ+AwSJDcfGeK1H7f6tJ9e4n
 lqfWL4tgWIckS+41sA96B5cYry9TMDdhu3IeFaAm0ZrF55JT1JySGE1GNA+mo6xA
 mkMAqhG7rwYh6nSJfWX0Ie+zJ9TFbmi05ZbI7jaTuQjnJ5uvPpTuRfBDi+qSWmoi
 +IBDAi9hZgCUNEsLRGDm7RDQo0dpbFo6jpArn1RHK4MO/HkTrqcKpTqiGnfwFAU4
 PFxwq5G9+d38+M6YMX0tXdfQ+fdxroA6aIBJSsIpF18tPRBOBlQsM2GFP34uHbyk
 L6HOzed2QpM5ExBmViX79F+obuDQ/gzXQszYvDKL4QTFNbx43gPWRDrGm8EQen6y
 12EScamXbUWBSWnOqxscmeUsTdTKxLfw/F43JbE2fE7jSxc5tss=
 =VGT8
 -----END PGP SIGNATURE-----

Merge tag 'bcachefs-2024-07-10' of https://evilpiepirate.org/git/bcachefs

Pull bcachefs fixes from Kent Overstreet:

 - Switch some asserts to WARN()

 - Fix a few "transaction not locked" asserts in the data read retry
   paths and backpointers gc

 - Fix a race that would cause the journal to get stuck on a flush
   commit

 - Add missing fsck checks for the fragmentation LRU

 - The usual assorted ssorted syzbot fixes

* tag 'bcachefs-2024-07-10' of https://evilpiepirate.org/git/bcachefs: (22 commits)
  bcachefs: Add missing bch2_trans_begin()
  bcachefs: Fix missing error check in journal_entry_btree_keys_validate()
  bcachefs: Warn on attempting a move with no replicas
  bcachefs: bch2_data_update_to_text()
  bcachefs: Log mount failure error code
  bcachefs: Fix undefined behaviour in eytzinger1_first()
  bcachefs: Mark bch_inode_info as SLAB_ACCOUNT
  bcachefs: Fix bch2_inode_insert() race path for tmpfiles
  closures: fix closure_sync + closure debugging
  bcachefs: Fix journal getting stuck on a flush commit
  bcachefs: io clock: run timer fns under clock lock
  bcachefs: Repair fragmentation_lru in alloc_write_key()
  bcachefs: add check for missing fragmentation in check_alloc_to_lru_ref()
  bcachefs: bch2_btree_write_buffer_maybe_flush()
  bcachefs: Add missing printbuf_tabstops_reset() calls
  bcachefs: Fix loop restart in bch2_btree_transactions_read()
  bcachefs: Fix bch2_read_retry_nodecode()
  bcachefs: Don't use the new_fs() bucket alloc path on an initialized fs
  bcachefs: Fix shift greater than integer size
  bcachefs: Change bch2_fs_journal_stop() BUG_ON() to warning
  ...
2024-07-10 11:50:16 -07:00
Kent Overstreet
29f1c1ae6d closures: fix closure_sync + closure debugging
originally, stack closures were only used synchronously, and with the
original implementation of closure_sync() the ref never hit 0; thus,
closure_put_after_sub() assumes that if the ref hits 0 it's on the debug
list, in debug mode.

that's no longer true with the current implementation of closure_sync,
so we need a new magic so closure_debug_destroy() doesn't pop an assert.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-07-10 09:53:39 -04:00
Paolo Abeni
7b769adc26 bpf-next-for-netdev
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQTFp0I1jqZrAX+hPRXbK58LschIgwUCZoxN0AAKCRDbK58LschI
 g0c5AQDa3ZV9gfbN42y1zSDoM1uOgO60fb+ydxyOYh8l3+OiQQD/fLfpTY3gBFSY
 9yi/pZhw/QdNzQskHNIBrHFGtJbMxgs=
 =p1Zz
 -----END PGP SIGNATURE-----

Merge tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next

Daniel Borkmann says:

====================
pull-request: bpf-next 2024-07-08

The following pull-request contains BPF updates for your *net-next* tree.

We've added 102 non-merge commits during the last 28 day(s) which contain
a total of 127 files changed, 4606 insertions(+), 980 deletions(-).

The main changes are:

1) Support resilient split BTF which cuts down on duplication and makes BTF
   as compact as possible wrt BTF from modules, from Alan Maguire & Eduard Zingerman.

2) Add support for dumping kfunc prototypes from BTF which enables both detecting
   as well as dumping compilable prototypes for kfuncs, from Daniel Xu.

3) Batch of s390x BPF JIT improvements to add support for BPF arena and to implement
   support for BPF exceptions, from Ilya Leoshkevich.

4) Batch of riscv64 BPF JIT improvements in particular to add 12-argument support
   for BPF trampolines and to utilize bpf_prog_pack for the latter, from Pu Lehui.

5) Extend BPF test infrastructure to add a CHECKSUM_COMPLETE validation option
   for skbs and add coverage along with it, from Vadim Fedorenko.

6) Inline bpf_get_current_task/_btf() helpers in the arm64 BPF JIT which gives
   a small 1% performance improvement in micro-benchmarks, from Puranjay Mohan.

7) Extend the BPF verifier to track the delta between linked registers in order
   to better deal with recent LLVM code optimizations, from Alexei Starovoitov.

8) Fix bpf_wq_set_callback_impl() kfunc signature where the third argument should
   have been a pointer to the map value, from Benjamin Tissoires.

9) Extend BPF selftests to add regular expression support for test output matching
   and adjust some of the selftest when compiled under gcc, from Cupertino Miranda.

10) Simplify task_file_seq_get_next() and remove an unnecessary loop which always
    iterates exactly once anyway, from Dan Carpenter.

11) Add the capability to offload the netfilter flowtable in XDP layer through
    kfuncs, from Florian Westphal & Lorenzo Bianconi.

12) Various cleanups in networking helpers in BPF selftests to shave off a few
    lines of open-coded functions on client/server handling, from Geliang Tang.

13) Properly propagate prog->aux->tail_call_reachable out of BPF verifier, so
    that x86 JIT does not need to implement detection, from Leon Hwang.

14) Fix BPF verifier to add a missing check_func_arg_reg_off() to prevent an
    out-of-bounds memory access for dynpointers, from Matt Bobrowski.

15) Fix bpf_session_cookie() kfunc to return __u64 instead of long pointer as
    it might lead to problems on 32-bit archs, from Jiri Olsa.

16) Enhance traffic validation and dynamic batch size support in xsk selftests,
    from Tushar Vyavahare.

bpf-next-for-netdev

* tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (102 commits)
  selftests/bpf: DENYLIST.aarch64: Remove fexit_sleep
  selftests/bpf: amend for wrong bpf_wq_set_callback_impl signature
  bpf: helpers: fix bpf_wq_set_callback_impl signature
  libbpf: Add NULL checks to bpf_object__{prev_map,next_map}
  selftests/bpf: Remove exceptions tests from DENYLIST.s390x
  s390/bpf: Implement exceptions
  s390/bpf: Change seen_reg to a mask
  bpf: Remove unnecessary loop in task_file_seq_get_next()
  riscv, bpf: Optimize stack usage of trampoline
  bpf, devmap: Add .map_alloc_check
  selftests/bpf: Remove arena tests from DENYLIST.s390x
  selftests/bpf: Add UAF tests for arena atomics
  selftests/bpf: Introduce __arena_global
  s390/bpf: Support arena atomics
  s390/bpf: Enable arena
  s390/bpf: Support address space cast instruction
  s390/bpf: Support BPF_PROBE_MEM32
  s390/bpf: Land on the next JITed instruction after exception
  s390/bpf: Introduce pre- and post- probe functions
  s390/bpf: Get rid of get_probe_mem_regno()
  ...
====================

Link: https://patch.msgid.link/20240708221438.10974-1-daniel@iogearbox.net
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-07-09 17:01:46 +02:00
Arnd Bergmann
1f9a8286bc uaccess: always export _copy_[from|to]_user with CONFIG_RUST
Rust code needs to be able to access _copy_from_user and _copy_to_user
so that it can skip the check_copy_size check in cases where the length
is known at compile-time, mirroring the logic for when C code will skip
check_copy_size. To do this, we ensure that exported versions of these
methods are available when CONFIG_RUST is enabled.

Alice has verified that this patch passes the CONFIG_TEST_USER_COPY test
on x86 using the Android cuttlefish emulator.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Tested-by: Alice Ryhl <aliceryhl@google.com>
Reviewed-by: Boqun Feng <boqun.feng@gmail.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Alice Ryhl <aliceryhl@google.com>
Acked-by: Andrew Morton <akpm@linux-foundation.org>
Link: https://lore.kernel.org/r/20240528-alice-mm-v7-2-78222c31b8f4@google.com
Signed-off-by: Miguel Ojeda <ojeda@kernel.org>
2024-07-08 23:44:01 +02:00
Andrew Morton
8ef6fd0e9e Merge branch 'mm-hotfixes-stable' into mm-stable to pick up "mm: fix
crashes from deferred split racing folio migration", needed by "mm:
migrate: split folio_migrate_mapping()".
2024-07-06 11:44:41 -07:00
Paul Menzel
2fe29fe945 lib/build_OID_registry: avoid non-destructive substitution for Perl < 5.13.2 compat
On a system with Perl 5.12.1, commit 5ef6dc08cf
("lib/build_OID_registry: don't mention the full path of the script in
output") causes the build to fail with the error below.

     Bareword found where operator expected at ./lib/build_OID_registry line 41, near "s#^\Q$abs_srctree/\E##r"
     syntax error at ./lib/build_OID_registry line 41, near "s#^\Q$abs_srctree/\E##r"
     Execution of ./lib/build_OID_registry aborted due to compilation errors.
     make[3]: *** [lib/Makefile:352: lib/oid_registry_data.c] Error 255

Ahmad Fatoum analyzed that non-destructive substitution is only supported since
Perl 5.13.2. Instead of dropping `r` and having the side effect of modifying
`$0`, introduce a dedicated variable to support older Perl versions.

Link: https://lkml.kernel.org/r/20240702223512.8329-2-pmenzel@molgen.mpg.de
Link: https://lkml.kernel.org/r/20240701155802.75152-1-pmenzel@molgen.mpg.de
Fixes: 5ef6dc08cf ("lib/build_OID_registry: don't mention the full path of the script in output")
Link: https://lore.kernel.org/all/259f7a87-2692-480e-9073-1c1c35b52f67@molgen.mpg.de/
Signed-off-by: Paul Menzel <pmenzel@molgen.mpg.de>
Suggested-by: Ahmad Fatoum <a.fatoum@pengutronix.de>
Cc: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Cc: Nicolas Schier <nicolas@fjasle.eu>
Cc: Masahiro Yamada <masahiroy@kernel.org>
Cc: Ahmad Fatoum <a.fatoum@pengutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-07-06 11:39:51 -07:00
Daniel Vetter
86634fa4e6 Linux 6.10-rc6
-----BEGIN PGP SIGNATURE-----
 
 iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAmaB0NweHHRvcnZhbGRz
 QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGkvwH/36UJRk/o6wvXnyH
 E6QjCSWo2226APyWks22NjtC3I/8Iqdvkneuh6wG0qL2sXAB078EMjUq5R81bF8H
 wWFBJwetjYTp8GEyLioMEb2wCH/J3R29dLFC4UYTplafXRGP6//xcpJaKmTxcgdR
 31IzvTPXbApZ7L3k1U6rA2bK9PNKcFCOvZlrNMUCuwMrabymHsDfOUt1DqXyg2xp
 zjqiWYBwlklozmgawSWt/mdEgkWuTcAbg+KyqDVQF59s9aj/OOwZ0j+HACq5V8CM
 quTPIAYL6CC9p7uxa69lGr/sgC0Is/BZLPX7RTZAwCgarGvnX+1HUsjDcaFCtrVg
 O6fPUV8=
 =pgUx
 -----END PGP SIGNATURE-----

Merge v6.10-rc6 into drm-next

The exynos-next pull is based on a newer -rc than drm-next. hence
backmerge first to make sure the unrelated conflicts we accumulated
don't end up randomly in the exynos merge pull, but are separated out.

Conflicts are all benign: Adjacent changes in amdgpu and fbdev-dma
code, and cherry-pick conflict in xe.

Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2024-07-05 10:47:28 +02:00
Jeff Johnson
8547d1150f math: rational: add missing MODULE_DESCRIPTION() macro
With ARCH=sh, make allmodconfig && make W=1 C=1 reports: WARNING: modpost:
missing MODULE_DESCRIPTION() in lib/math/rational.o

Add the missing invocation of the MODULE_DESCRIPTION() macro.

Link: https://lkml.kernel.org/r/20240702-md-sh-lib-math-v1-1-93f4ac4fa8fd@quicinc.com
Signed-off-by: Jeff Johnson <quic_jjohnson@quicinc.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-07-04 23:43:11 -07:00
Jeff Johnson
bee6c683de lib/zlib: add missing MODULE_DESCRIPTION() macro
With ARCH=csky, make allmodconfig && make W=1 C=1 reports:
WARNING: modpost: missing MODULE_DESCRIPTION() in lib/zlib_deflate/zlib_deflate.o

Add the missing invocation of the MODULE_DESCRIPTION() macro.

Link: https://lkml.kernel.org/r/20240613-md-csky-lib-zlib_deflate-v1-1-83504d9a27d6@quicinc.com
Signed-off-by: Jeff Johnson <quic_jjohnson@quicinc.com>
Cc: Guo Ren <guoren@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-07-04 23:43:11 -07:00
Hsin Chang Yu
cedb08caac lib/rbtree.c: fix the example typo
Replace the "Sr" with "sr", the example is wrong if sl and N don't have
child nodes, so sr should be red node.

Link: https://lkml.kernel.org/r/20240628142229.69419-1-zxcvb600870024@gmail.com
Signed-off-by: Hsin Chang Yu <zxcvb600870024@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-07-04 23:43:10 -07:00
Jakub Kicinski
76ed626479 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Cross-merge networking fixes after downstream PR.

Conflicts:

drivers/net/phy/aquantia/aquantia.h
  219343755e ("net: phy: aquantia: add missing include guards")
  61578f6793 ("net: phy: aquantia: add support for PHY LEDs")

drivers/net/ethernet/wangxun/libwx/wx_hw.c
  bd07a98178 ("net: txgbe: remove separate irq request for MSI and INTx")
  b501d261a5 ("net: txgbe: add FDIR ATR support")
https://lore.kernel.org/all/20240703112936.483c1975@canb.auug.org.au/

include/linux/mlx5/mlx5_ifc.h
  048a403648 ("net/mlx5: IFC updates for changing max EQs")
  99be56171f ("net/mlx5e: SHAMPO, Re-enable HW-GRO")
https://lore.kernel.org/all/20240701133951.6926b2e3@canb.auug.org.au/

Adjacent changes:

drivers/net/wireless/intel/iwlwifi/mvm/mac80211.c
  4130c67cd1 ("wifi: iwlwifi: mvm: check vif for NULL/ERR_PTR before dereference")
  3f3126515f ("wifi: iwlwifi: mvm: add mvm-specific guard")

include/net/mac80211.h
  816c6bec09 ("wifi: mac80211: fix BSS_CHANGED_UNSOL_BCAST_PROBE_RESP")
  5a009b42e0 ("wifi: mac80211: track changes in AP's TPE")

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-07-04 14:16:11 -07:00
Ilya Leoshkevich
89f42df66c lib/zlib: unpoison DFLTCC output buffers
The constraints of the DFLTCC inline assembly are not precise: they do not
communicate the size of the output buffers to the compiler, so it cannot
automatically instrument it.

Add the manual kmsan_unpoison_memory() calls for the output buffers.  The
logic is the same as in [1].

[1] 1f5ddcc009

Link: https://lkml.kernel.org/r/20240621113706.315500-21-iii@linux.ibm.com
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Reported-by: Alexander Gordeev <agordeev@linux.ibm.com>
Reviewed-by: Alexander Potapenko <glider@google.com>
Cc: Christian Borntraeger <borntraeger@linux.ibm.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Hyeonggon Yoo <42.hyeyoo@gmail.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: <kasan-dev@googlegroups.com>
Cc: Marco Elver <elver@google.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Roman Gushchin <roman.gushchin@linux.dev>
Cc: Steven Rostedt (Google) <rostedt@goodmis.org>
Cc: Sven Schnelle <svens@linux.ibm.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-07-03 19:30:23 -07:00
JaeJoon Jung
739820a617 maple_tree: modified return type of mas_wr_store_entry()
Since the return value of mas_wr_store_entry() is not used,
the return type can be changed to void.

Link: https://lkml.kernel.org/r/20240614092428.29491-1-rgbi3307@gmail.com
Signed-off-by: JaeJoon Jung <rgbi3307@gmail.com>
Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Sidhartha Kumar <sidhartha.kumar@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-07-03 19:30:19 -07:00
Jeff Johnson
3c666d0a32 lib: test_hmm: add missing MODULE_DESCRIPTION() macro
make allmodconfig && make W=1 C=1 reports:
WARNING: modpost: missing MODULE_DESCRIPTION() in lib/test_hmm.o

Add the missing invocation of the MODULE_DESCRIPTION() macro.

Link: https://lkml.kernel.org/r/20240531-lib-md-test_hmm-v1-1-e4aa17daa57b@quicinc.com
Signed-off-by: Jeff Johnson <quic_jjohnson@quicinc.com>
Cc: Jérôme Glisse <jglisse@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-07-03 19:30:05 -07:00
Jeff Johnson
a619dd3948 test_maple_tree: add the missing MODULE_DESCRIPTION() macro
make allmodconfig && make W=1 C=1 reports: WARNING: modpost: missing
MODULE_DESCRIPTION() in lib/test_maple_tree.o

Add the missing invocation of the MODULE_DESCRIPTION() macro.

Link: https://lkml.kernel.org/r/20240531-md-lib-test_maple_tree-v1-1-7b1b485aeec3@quicinc.com
Signed-off-by: Jeff Johnson <quic_jjohnson@quicinc.com>
Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-07-03 19:30:05 -07:00
Jeff Johnson
2ec83987a5 ubsan: add missing MODULE_DESCRIPTION() macro
make allmodconfig && make W=1 C=1 reports:
WARNING: modpost: missing MODULE_DESCRIPTION() in lib/test_ubsan.o

Add the missing invocation of the MODULE_DESCRIPTION() macro.

Link: https://lkml.kernel.org/r/20240531-md-lib-test_ubsan-v1-1-c2a80d258842@quicinc.com
Signed-off-by: Jeff Johnson <quic_jjohnson@quicinc.com>
Cc: Andrey Konovalov <andreyknvl@gmail.com>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Marco Elver <elver@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-07-03 19:30:05 -07:00
Jeff Johnson
757234f1ad test_xarray: add missing MODULE_DESCRIPTION() macro
make allmodconfig && make W=1 C=1 reports:
WARNING: modpost: missing MODULE_DESCRIPTION() in lib/test_xarray.o

Add the missing invocation of the MODULE_DESCRIPTION() macro.

Link: https://lkml.kernel.org/r/20240531-md-lib-test_xarray-v1-1-42fd6833bdd4@quicinc.com
Signed-off-by: Jeff Johnson <quic_jjohnson@quicinc.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-07-03 19:30:05 -07:00
Anna-Maria Behnsen
f48955e038 vdso/gettimeofday: Clarify comment about open coded function
The two comments state, that the following code open codes something but
they lack to specify what exactly is open coded.

Expand comments by mentioning the reference to the open coded function.

Signed-off-by: Anna-Maria Behnsen <anna-maria@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Vincenzo Frascino <vincenzo.frascino@arm.com>
Link: https://lore.kernel.org/r/20240701-vdso-cleanup-v1-1-36eb64e7ece2@linutronix.de
2024-07-03 21:27:03 +02:00
Jeff Johnson
67c9971cd6 kunit/usercopy: Add missing MODULE_DESCRIPTION()
Fix warning seen with:

$ make allmodconfig && make W=1 C=1 lib/usercopy_kunit.ko
WARNING: modpost: missing MODULE_DESCRIPTION() in lib/usercopy_kunit.o

Signed-off-by: Jeff Johnson <quic_jjohnson@quicinc.com>
Signed-off-by: Kees Cook <kees@kernel.org>
Reviewed-by: Rae Moar <rmoar@google.com>
Reviewed-by: David Gow <davidgow@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2024-07-02 10:11:45 -06:00
Kees Cook
4d6cf24832 kunit/usercopy: Disable testing on !CONFIG_MMU
Since arch_pick_mmap_layout() is an inline for non-MMU systems, disable
this test there.

Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202406160505.uBge6TMY-lkp@intel.com/
Signed-off-by: Kees Cook <kees@kernel.org>
Reviewed-by: Rae Moar <rmoar@google.com>
Reviewed-by: David Gow <davidgow@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2024-07-02 10:11:40 -06:00
Greg Kroah-Hartman
19ed3bb558 Merge 6.10-rc6 into char-misc-next
We need the char/misc/iio fixes in here as well to build on top of.

Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-07-01 13:55:39 +02:00
Jeff Johnson
2a49c8b6b6 selftests/fpu: add missing MODULE_DESCRIPTION() macro
make allmodconfig && make W=1 C=1 now reports:
WARNING: modpost: missing MODULE_DESCRIPTION() in lib/test_fpu.o

Add the missing invocation of the MODULE_DESCRIPTION() macro.

Link: https://lkml.kernel.org/r/20240622-md-i386-lib-test_fpu_glue-v1-1-a4e40b7b1264@quicinc.com
Fixes: 9613736d85 ("selftests/fpu: move FP code to a separate translation unit")
Signed-off-by: Jeff Johnson <quic_jjohnson@quicinc.com>
Reviewed-by: Samuel Holland <samuel.holland@sifive.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-06-28 19:36:30 -07:00
Alexey Dobriyan
961a285132 build-id: require program headers to be right after ELF header
Neither ELF spec not ELF loader require program header to be placed right
after ELF header, but build-id code very much assumes such placement:

See

	find_get_page(vma->vm_file->f_mapping, 0);

line and checks against PAGE_SIZE.

Returns errors for now until someone rewrites build-id parser
to be more inline with load_elf_binary().

Link: https://lkml.kernel.org/r/d58bc281-6ca7-467a-9a64-40fa214bd63e@p183
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Reviewed-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-06-28 19:36:30 -07:00
Linus Torvalds
b75f947270 hardening fixes for v6.10-rc6
- Remove invalid tty __counted_by annotation (Nathan Chancellor)
 
 - Add missing MODULE_DESCRIPTION()s for KUnit string tests (Jeff Johnson)
 
 - Remove non-functional per-arch kstack entropy filtering
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEpcP2jyKd1g9yPm4TiXL039xtwCYFAmZ+4Z4ACgkQiXL039xt
 wCYUPQ/9Ghbg4CfOIyjl5G7fAYuG+/zLDCkY+kh7XcO2kAn3213KiyRKm0GUAhXY
 p3N7rDH9NsXedfO2bnQ0YTDR3TU8AWIegKgEyGBsyqvdtjSe0ParwWOoGGpavJZ2
 6Op39e6LL2fKGyL4N72lkhRpGPJgGQOqckTljaDl5yQfIHryMpQl0fXzMMjh1HUt
 TKc39kSRbQxguDdIqU1zHgs+Lu9Kph6A3q9PjVap9qzCcPZ4RjIRms4gDrghP7GK
 M0POyZbuXUWxaJ8VwRHbqAtEyEGjXdfBW9DgKQM1fg9XWGZbCkucu3PZbPHv+c6e
 eBGG6O5l6UylmXpmkqLMfIudUekfo8cAEXqcLCBYis8uIuasUWiLMhoTDjdfcvhn
 HHr6iu25IKR698PZzTHQ5yUiuBP38qjXfXr9DDzXrI2+SUbxjurTfbHxFBWK/FYX
 YSdrZR4DbeaU/HI1I+I5YghgeRfR6TQ5NGrmj61wW1QnwvEF6Gdlh+MZgUS59SP5
 S+T50ggGKEYARZcZj1N6Nz39Co9syn/xlhyPKFPkgsRTXw1QE0z6e841V1jxhr49
 cStKFcKAovDeG2UN4bAju49/MWUFlcpkIxn9Y0ZHiu6R6SC9zasXhKi7+xDFolmP
 B6PmON2ZSSoFNwMr7Fr1SC0gWg7V3TYLmpHITDWz5KL00ReEdJY=
 =dItV
 -----END PGP SIGNATURE-----

Merge tag 'hardening-v6.10-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux

Pull hardening fixes from Kees Cook:

 - Remove invalid tty __counted_by annotation (Nathan Chancellor)

 - Add missing MODULE_DESCRIPTION()s for KUnit string tests (Jeff
   Johnson)

 - Remove non-functional per-arch kstack entropy filtering

* tag 'hardening-v6.10-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
  tty: mxser: Remove __counted_by from mxser_board.ports[]
  randomize_kstack: Remove non-functional per-arch entropy filtering
  string: kunit: add missing MODULE_DESCRIPTION() macros
2024-06-28 16:11:02 -07:00
Linus Torvalds
cd63a278ac bcachefs fixes for 6.10-rc6
simple stuff:
 - null ptr/err ptr deref fixes
 - fix for getting wedged on shutdown after journal error
 
 - fix missing recalc_capacity() call, capacity now changes correctly
   after a device goes read only
 
   however: our capacity calculation still doesn't take into account when
   we have mixed ro/rw devices and the ro devices have data on them,
   that's going to be a more involved fix to separate accounting for
   "capacity used on ro devices" and "capacity used on rw devices"
 
 - boring syzbot stuff
 
 slightly more involved:
 - discard, invalidate workers are now per device
   this has the effect of simplifying how we take device refs in these
   paths, and the device ref cleanup fixes a longstanding race between
   the device removal path and the discard path
 
 - fixes for how the debugfs code takes refs on btree_trans objects
   we have debugfs code that prints in use btree_trans objects. It uses
   closure_get() on trans->ref, which is mainly for the cycle detector,
   but the debugfs code was using it on a closure that may have hit 0,
   which is not allowed; for performance reasons we cannot avoid having
   not-in-use transactions on the global list.
 
   introduce some new primitives to fix this and make the synchronization
   here a whole lot saner
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEKnAFLkS8Qha+jvQrE6szbY3KbnYFAmZ+ye4ACgkQE6szbY3K
 bnb4shAAqkKdgB2abIaD1t8+KjUiXwt7seRY4EmzwrEaWniW5bDUYMBvV+tew93j
 uvGmSKMs4ML/r24hcg0zGPJ9GoWrFb3MWhPYizzRS8QspsUjsECJuehNPCe3RPaf
 QBgQtKahTge1e41y1frzkiGKqaOGOTtUVLOfPIebe+oJAhRCYRnrGY2dkZTms7Ue
 aXNtBmnlX3Fkmlm0GiKYrTHpAZz3d0kzdX11Pc2vTXvqo/znuJTTVGnjJkdrHzyv
 6cz6YnMKFdxLVbYO1KlB/3Hu9y9qt815g1rjvaqym8pDk9ltsGHNM3LcCCCyp7Of
 btnbLQ6TdfggK5Kf2hNYuJRY2pnjNyfcNxupQF3RNaw/D/4G5EU16zfFElORC6Mw
 eGwXLvDIGqOSSIvevoRZrgJKAvVptXNg9EtCI5Z5ujQ4ExW8ti1lPHp/r5SVOhyz
 x0Am14H2ERuz7Vt5jUas3k74+tAck6JWc5OemMQawA5waeH1inMT7QZuBt+Bmrhx
 Av0zbhaq4aTsHXmm+Xi6ofj3UBaOQ2rNzT7Au0kxdvJgDPe/USjw4tejV5DmjmHA
 SyRsTG7Zn5xJBi7jc47fcwUgUzlxlffVQGFCVjRUU1vF6u/Ldn7K0zfYbkwSCiKp
 iWSEyg3j5z5N69Vrgdadma4xTDjL/C5+XsMWh8G8ohf+crhUeSo=
 =svIi
 -----END PGP SIGNATURE-----

Merge tag 'bcachefs-2024-06-28' of https://evilpiepirate.org/git/bcachefs

Pull bcachefs fixes from Kent Overstreet:
 "Simple stuff:

   - NULL ptr/err ptr deref fixes

   - fix for getting wedged on shutdown after journal error

   - fix missing recalc_capacity() call, capacity now changes correctly
     after a device goes read only

     however: our capacity calculation still doesn't take into account
     when we have mixed ro/rw devices and the ro devices have data on
     them, that's going to be a more involved fix to separate accounting
     for "capacity used on ro devices" and "capacity used on rw devices"

   - boring syzbot stuff

  Slightly more involved:

   - discard, invalidate workers are now per device

     this has the effect of simplifying how we take device refs in these
     paths, and the device ref cleanup fixes a longstanding race between
     the device removal path and the discard path

   - fixes for how the debugfs code takes refs on btree_trans objects we
     have debugfs code that prints in use btree_trans objects.

     It uses closure_get() on trans->ref, which is mainly for the cycle
     detector, but the debugfs code was using it on a closure that may
     have hit 0, which is not allowed; for performance reasons we cannot
     avoid having not-in-use transactions on the global list.

     Introduce some new primitives to fix this and make the
     synchronization here a whole lot saner"

* tag 'bcachefs-2024-06-28' of https://evilpiepirate.org/git/bcachefs:
  bcachefs: Fix kmalloc bug in __snapshot_t_mut
  bcachefs: Discard, invalidate workers are now per device
  bcachefs: Fix shift-out-of-bounds in bch2_blacklist_entries_gc
  bcachefs: slab-use-after-free Read in bch2_sb_errors_from_cpu
  bcachefs: Add missing bch2_journal_do_writes() call
  bcachefs: Fix null ptr deref in journal_pins_to_text()
  bcachefs: Add missing recalc_capacity() call
  bcachefs: Fix btree_trans list ordering
  bcachefs: Fix race between trans_put() and btree_transactions_read()
  closures: closure_get_not_zero(), closure_return_sync()
  bcachefs: Make btree_deadlock_to_text() clearer
  bcachefs: fix seqmutex_relock()
  bcachefs: Fix freeing of error pointers
2024-06-28 09:25:21 -07:00
Jeff Johnson
6a4805b2f5 string: kunit: add missing MODULE_DESCRIPTION() macros
make allmodconfig && make W=1 C=1 reports:
WARNING: modpost: missing MODULE_DESCRIPTION() in lib/string_kunit.o
WARNING: modpost: missing MODULE_DESCRIPTION() in lib/string_helpers_kunit.o

Add the missing invocation of the MODULE_DESCRIPTION() macro.

Signed-off-by: Jeff Johnson <quic_jjohnson@quicinc.com>
Link: https://lore.kernel.org/r/20240531-md-lib-string-v1-1-2738cf057d94@quicinc.com
Signed-off-by: Kees Cook <kees@kernel.org>
2024-06-28 08:54:55 -07:00
Dave Airlie
91fdc5e765 drm-misc-next for $kernel-version:
UAPI Changes:
 
 Cross-subsystem Changes:
 
 Core Changes:
   - panic: Monochrome logo support, Various fixes
   - ttm: Improve the number of page faults on some platforms, Fix test
     build breakage with PREEMPT_RT, more test coverage and various test
     improvements
 
 Driver Changes:
   - Add missing MODULE_DESCRIPTION where needed
   - ipu-v3: Various fixes
   - vc4: Monochrome TV support
   - bridge:
     - analogix_dp: Various improvements and reworks, handle AUX
       transfers timeout
     - tc358767: Fix DRM_BRIDGE_ATTACH_NO_CONNECTOR, Fix clock
       calculations
   - panels:
     - More transitions to mipi_dsi wrapped functions
     - New panels: Lincoln Technologies LCD197, Ortustech COM35H3P70ULC,
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQRcEzekXsqa64kGDp7j7w1vZxhRxQUCZn1DmQAKCRDj7w1vZxhR
 xYj3AP9ThM8q3HoCqXKerpEfnb5LYDB4NocLjn/Bamtm134oNQD+M4Gu2zLSVymV
 74PwtPYuQGKWrmXdw0tD70/MtTAihQc=
 =fSI4
 -----END PGP SIGNATURE-----

Merge tag 'drm-misc-next-2024-06-27' of https://gitlab.freedesktop.org/drm/misc/kernel into drm-next

drm-misc-next for 6.11:

UAPI Changes:

Cross-subsystem Changes:

Core Changes:
  - panic: Monochrome logo support, Various fixes
  - ttm: Improve the number of page faults on some platforms, Fix test
    build breakage with PREEMPT_RT, more test coverage and various test
    improvements

Driver Changes:
  - Add missing MODULE_DESCRIPTION where needed
  - ipu-v3: Various fixes
  - vc4: Monochrome TV support
  - bridge:
    - analogix_dp: Various improvements and reworks, handle AUX
      transfers timeout
    - tc358767: Fix DRM_BRIDGE_ATTACH_NO_CONNECTOR, Fix clock
      calculations
  - panels:
    - More transitions to mipi_dsi wrapped functions
    - New panels: Lincoln Technologies LCD197, Ortustech COM35H3P70ULC,

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Maxime Ripard <mripard@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240627-congenial-pistachio-nyala-848cf4@houat
2024-06-28 08:20:37 +10:00
Jakub Kicinski
193b9b2002 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Cross-merge networking fixes after downstream PR.

No conflicts.

Adjacent changes:
  e3f02f32a0 ("ionic: fix kernel panic due to multi-buffer handling")
  d9c0420999 ("ionic: Mark error paths in the data path as unlikely")

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-06-27 12:14:11 -07:00
Leon Hwang
d65f3767de bpf: Fix tailcall cases in test_bpf
Since f663a03c8e ("bpf, x64: Remove tail call detection"),
tail_call_reachable won't be detected in x86 JIT. And, tail_call_reachable
is provided by verifier.

Therefore, in test_bpf, the tail_call_reachable must be provided in test
cases before running.

Fix and test:

[  174.828662] test_bpf: #0 Tail call leaf jited:1 170 PASS
[  174.829574] test_bpf: #1 Tail call 2 jited:1 244 PASS
[  174.830363] test_bpf: #2 Tail call 3 jited:1 296 PASS
[  174.830924] test_bpf: #3 Tail call 4 jited:1 719 PASS
[  174.831863] test_bpf: #4 Tail call load/store leaf jited:1 197 PASS
[  174.832240] test_bpf: #5 Tail call load/store jited:1 326 PASS
[  174.832240] test_bpf: #6 Tail call error path, max count reached jited:1 2214 PASS
[  174.835713] test_bpf: #7 Tail call count preserved across function calls jited:1 609751 PASS
[  175.446098] test_bpf: #8 Tail call error path, NULL target jited:1 472 PASS
[  175.447597] test_bpf: #9 Tail call error path, index out of range jited:1 206 PASS
[  175.448833] test_bpf: test_tail_calls: Summary: 10 PASSED, 0 FAILED, [10/10 JIT'ed]

Reported-by: kernel test robot <oliver.sang@intel.com>
Closes: https://lore.kernel.org/oe-lkp/202406251415.c51865bc-oliver.sang@intel.com
Fixes: f663a03c8e ("bpf, x64: Remove tail call detection")
Signed-off-by: Leon Hwang <hffilwlqm@gmail.com>
Link: https://lore.kernel.org/r/20240625145351.40072-1-hffilwlqm@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-06-25 18:12:44 -07:00
Heng Qi
13ba28c5cd dim: add new interfaces for initialization and getting results
DIM-related mode and work have been collected in one same place,
so new interfaces are added to provide convenience.

Signed-off-by: Heng Qi <hengqi@linux.alibaba.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20240621101353.107425-5-hengqi@linux.alibaba.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-06-25 17:15:06 -07:00
Heng Qi
f750dfe825 ethtool: provide customized dim profile management
The NetDIM library, currently leveraged by an array of NICs, delivers
excellent acceleration benefits. Nevertheless, NICs vary significantly
in their dim profile list prerequisites.

Specifically, virtio-net backends may present diverse sw or hw device
implementation, making a one-size-fits-all parameter list impractical.
On Alibaba Cloud, the virtio DPU's performance under the default DIM
profile falls short of expectations, partly due to a mismatch in
parameter configuration.

I also noticed that ice/idpf/ena and other NICs have customized
profilelist or placed some restrictions on dim capabilities.

Motivated by this, I tried adding new params for "ethtool -C" that provides
a per-device control to modify and access a device's interrupt parameters.

Usage
========
The target NIC is named ethx.

Assume that ethx only declares support for rx profile setting
(with DIM_PROFILE_RX flag set in profile_flags) and supports modification
of usec and pkt fields.

1. Query the currently customized list of the device

$ ethtool -c ethx
...
rx-profile:
{.usec =   1, .pkts = 256, .comps = n/a,},
{.usec =   8, .pkts = 256, .comps = n/a,},
{.usec =  64, .pkts = 256, .comps = n/a,},
{.usec = 128, .pkts = 256, .comps = n/a,},
{.usec = 256, .pkts = 256, .comps = n/a,}
tx-profile:   n/a

2. Tune
$ ethtool -C ethx rx-profile 1,1,n_2,n,n_3,3,n_4,4,n_n,5,n
"n" means do not modify this field.
$ ethtool -c ethx
...
rx-profile:
{.usec =   1, .pkts =   1, .comps = n/a,},
{.usec =   2, .pkts = 256, .comps = n/a,},
{.usec =   3, .pkts =   3, .comps = n/a,},
{.usec =   4, .pkts =   4, .comps = n/a,},
{.usec = 256, .pkts =   5, .comps = n/a,}
tx-profile:   n/a

3. Hint
If the device does not support some type of customized dim profiles,
the corresponding "n/a" will display.

If the "n/a" field is being modified, -EOPNOTSUPP will be reported.

Signed-off-by: Heng Qi <hengqi@linux.alibaba.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20240621101353.107425-4-hengqi@linux.alibaba.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-06-25 17:15:06 -07:00
Heng Qi
b65e697a7c dim: make DIMLIB dependent on NET
DIMLIB's capabilities are supplied by the dim, net_dim, and
rdma_dim objects, and dim's interfaces solely act as a base for
net_dim and rdma_dim and are not explicitly used anywhere else.
rdma_dim is utilized by the infiniband driver, while net_dim
is for network devices, excluding the soc/fsl driver.

In this patch, net_dim relies on some NET's interfaces, thus
DIMLIB needs to explicitly depend on the NET Kconfig.

The soc/fsl driver uses the functions provided by net_dim, so
it also needs to depend on NET.

Signed-off-by: Heng Qi <hengqi@linux.alibaba.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20240621101353.107425-3-hengqi@linux.alibaba.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-06-25 17:15:06 -07:00
Heng Qi
0e942053e4 linux/dim: move useful macros to .h file
Useful macros will be used effectively elsewhere.
These will be utilized in subsequent patches.

Signed-off-by: Heng Qi <hengqi@linux.alibaba.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20240621101353.107425-2-hengqi@linux.alibaba.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-06-25 17:15:06 -07:00
Jeff Johnson
3034749132 KUnit: add missing MODULE_DESCRIPTION() macros for lib/test_*.ko
make allmodconfig && make W=1 C=1 reports for lib/test_*.ko:
WARNING: modpost: missing MODULE_DESCRIPTION() in lib/test_hexdump.o
WARNING: modpost: missing MODULE_DESCRIPTION() in lib/test_dhry.o
WARNING: modpost: missing MODULE_DESCRIPTION() in lib/test_firmware.o
WARNING: modpost: missing MODULE_DESCRIPTION() in lib/test_sysctl.o
WARNING: modpost: missing MODULE_DESCRIPTION() in lib/test_hash.o
WARNING: modpost: missing MODULE_DESCRIPTION() in lib/test_ida.o
WARNING: modpost: missing MODULE_DESCRIPTION() in lib/test_list_sort.o
WARNING: modpost: missing MODULE_DESCRIPTION() in lib/test_min_heap.o
WARNING: modpost: missing MODULE_DESCRIPTION() in lib/test_module.o
WARNING: modpost: missing MODULE_DESCRIPTION() in lib/test_sort.o
WARNING: modpost: missing MODULE_DESCRIPTION() in lib/test_static_keys.o
WARNING: modpost: missing MODULE_DESCRIPTION() in lib/test_static_key_base.o
WARNING: modpost: missing MODULE_DESCRIPTION() in lib/test_memcat_p.o
WARNING: modpost: missing MODULE_DESCRIPTION() in lib/test_blackhole_dev.o
WARNING: modpost: missing MODULE_DESCRIPTION() in lib/test_meminit.o
WARNING: modpost: missing MODULE_DESCRIPTION() in lib/test_free_pages.o
WARNING: modpost: missing MODULE_DESCRIPTION() in lib/test_kprobes.o
WARNING: modpost: missing MODULE_DESCRIPTION() in lib/test_ref_tracker.o
WARNING: modpost: missing MODULE_DESCRIPTION() in lib/test_bits.o

Add the missing invocations of the MODULE_DESCRIPTION() macro.

Link: https://lkml.kernel.org/r/20240619-md-lib-test-v2-1-301e30eeba1e@quicinc.com
Signed-off-by: Jeff Johnson <quic_jjohnson@quicinc.com>
Reviewed-by: Kees Cook <kees@kernel.org>
Reviewed-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Cc: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Cc: "Naveen N. Rao" <naveen.n.rao@linux.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-06-24 22:25:11 -07:00
Suren Baghdasaryan
d2917ff199 lib/dump_stack: report process UID in dump_stack_print_info()
To make it easier to identify the crashing process, report effective UID
when dumping the stack.

Link: https://lkml.kernel.org/r/20240615041358.103791-1-surenb@google.com
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-06-24 22:25:11 -07:00
I Hsin Cheng
c8dab79f9e lib/plist.c: avoid worst case scenario in plist_add
Worst case scenario of plist_add() happens when the priority of the
inserted plist_node is going to be the largest after the insertion is
done.  The cost is going to be more significant when the original plist is
longer, because the iterator is going to traverse the whole plist to find
the correct position to insert the new node.

The situation can be avoided by using a reverse iterator at the same time,
doing so the maximum possible number of iteration is going to shrink from
N to N/2.

The proposed change of plist_add pasts the test in lib/plist.c to validate
its correctness, also add the worst case scenario test for plist_add() in
plist_test().

The worst case test are tested with the size of test_data and test_node
growing from 200 to 1000.  The result are showned in the following table,
in which we can observed that the proposed change of plist_add performs
better than the original version, and the difference between these two
implementations are more significant with the size of N growing.

The random case test [1], and best case test [2] are also provided, with
result showing the proposed change performs slightly better in random case
test while the original implementation performs slightly better in best
case test, while the difference in both test are minor, we can see them as
even in those two situations.

 -----------------------------------------------------------
| Test size      |   200 |   400 |    600 |    800 |   1000 |
 -----------------------------------------------------------
| new_plist_add  | 140911| 548681| 1220512| 2048493| 3763755|
 -----------------------------------------------------------
| old_plist_add  | 188198| 774222| 1643547| 3008929| 4947435|
 -----------------------------------------------------------

Link: https://lkml.kernel.org/r/20240614154603.65203-1-richard120310@gmail.com
Signed-off-by: I Hsin Cheng <richard120310@gmail.com>
Signed-off-by: Ching-Chun (Jim) Huang <jserv@ccns.ncku.edu.tw>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-06-24 22:25:10 -07:00
Brian Masney
d0bff05405 lib/Kconfig.debug: document panic= command line option and procfs entry for PANIC_TIMEOUT
PANIC_TIMEOUT can also be controlled with the panic= kernel command line
option and the file /proc/sys/kernel/panic.  Let's document both of these
in the Kconfig help text.

Link: https://lkml.kernel.org/r/20240607152443.925168-1-bmasney@redhat.com
Signed-off-by: Brian Masney <bmasney@redhat.com>
Reviewed-by: Javier Martinez Canillas <javierm@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-06-24 22:25:07 -07:00
Jeff Johnson
09aaf15a78 lib/test_linear_ranges: add missing MODULE_DESCRIPTION() macro
make allmodconfig && make W=1 C=1 reports:
WARNING: modpost: missing MODULE_DESCRIPTION() in lib/test_linear_ranges.o

Add the missing invocation of the MODULE_DESCRIPTION() macro.

Link: https://lkml.kernel.org/r/20240531-md-lib-test_linear_ranges-v1-1-053a1aad37c6@quicinc.com
Signed-off-by: Jeff Johnson <quic_jjohnson@quicinc.com>
Reviewed-by: Matti Vaittinen <mazziesaccount@gmail.com>
Cc: Mark Brown <broonie@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-06-24 22:25:07 -07:00
Jeff Johnson
7ef148daa5 lib/test_kmod: add missing MODULE_DESCRIPTION() macro
make allmodconfig && make W=1 C=1 reports:
WARNING: modpost: missing MODULE_DESCRIPTION() in lib/test_kmod.o

Add the missing invocation of the MODULE_DESCRIPTION() macro.

Link: https://lkml.kernel.org/r/20240531-md-lib-test_kmod-v1-1-fdf11bc6095e@quicinc.com
Signed-off-by: Jeff Johnson <quic_jjohnson@quicinc.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: Luis Chamberlain <mcgrof@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-06-24 22:25:07 -07:00
Jeff Johnson
d46a555d3c siphash: add missing MODULE_DESCRIPTION() macro
make allmodconfig && make W=1 C=1 reports:
WARNING: modpost: missing MODULE_DESCRIPTION() in lib/siphash_kunit.o

Add the missing invocation of the MODULE_DESCRIPTION() macro.

Link: https://lkml.kernel.org/r/20240531-md-lib-siphash_kunit-v1-1-38688065b796@quicinc.com
Signed-off-by: Jeff Johnson <quic_jjohnson@quicinc.com>
Cc: Jason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-06-24 22:25:07 -07:00
Jeff Johnson
683da20738 uuid: add missing MODULE_DESCRIPTION() macro
make allmodconfig && make W=1 C=1 reports:
WARNING: modpost: missing MODULE_DESCRIPTION() in lib/test_uuid.o

Add the missing invocation of the MODULE_DESCRIPTION() macro.

Link: https://lkml.kernel.org/r/20240531-md-lib-test_uuid-v1-1-67fa498104c0@quicinc.com
Signed-off-by: Jeff Johnson <quic_jjohnson@quicinc.com>
Acked-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-06-24 22:25:06 -07:00
Jeff Johnson
1c5a13b39d kunit: add missing MODULE_DESCRIPTION() macros to lib/*.c
make allmodconfig && make W=1 C=1 reports for lib/*kunit:
WARNING: modpost: missing MODULE_DESCRIPTION() in lib/bitfield_kunit.o
WARNING: modpost: missing MODULE_DESCRIPTION() in lib/checksum_kunit.o
WARNING: modpost: missing MODULE_DESCRIPTION() in lib/cmdline_kunit.o
WARNING: modpost: missing MODULE_DESCRIPTION() in lib/is_signed_type_kunit.o
WARNING: modpost: missing MODULE_DESCRIPTION() in lib/overflow_kunit.o
WARNING: modpost: missing MODULE_DESCRIPTION() in lib/stackinit_kunit.o

Add the missing invocations of the MODULE_DESCRIPTION() macro.

Link: https://lkml.kernel.org/r/20240601-md-lib-kunit-tests-v1-1-4493fe0032b9@quicinc.com
Signed-off-by: Jeff Johnson <quic_jjohnson@quicinc.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-06-24 22:25:06 -07:00
Jeff Johnson
2e29fcb774 lib/asn1_encoder: add missing MODULE_DESCRIPTION() macro
make allmodconfig && make W=1 C=1 reports:
WARNING: modpost: missing MODULE_DESCRIPTION() in lib/asn1_encoder.o

Add the missing invocation of the MODULE_DESCRIPTION() macro.

Link: https://lkml.kernel.org/r/20240601-md-lib-asn1_encoder-v1-1-8c634ed2d2e8@quicinc.com
Signed-off-by: Jeff Johnson <quic_jjohnson@quicinc.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-06-24 22:25:06 -07:00
Jeff Johnson
f069e33daf KUnit: add missing MODULE_DESCRIPTION() macros for lib/*_test.ko
make allmodconfig && make W=1 C=1 reports for lib/*_test.ko:
WARNING: modpost: missing MODULE_DESCRIPTION() in lib/atomic64_test.o
WARNING: modpost: missing MODULE_DESCRIPTION() in lib/hashtable_test.o

Add the missing invocations of the MODULE_DESCRIPTION() macro.

Link: https://lkml.kernel.org/r/20240601-md-lib-test2-v1-1-be764b785f17@quicinc.com
Signed-off-by: Jeff Johnson <quic_jjohnson@quicinc.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-06-24 22:25:06 -07:00
Jeff Johnson
e471831be2 kunit/fortify: add missing MODULE_DESCRIPTION() macros
make allmodconfig && make W=1 C=1 reports:
WARNING: modpost: missing MODULE_DESCRIPTION() in lib/memcpy_kunit.o
WARNING: modpost: missing MODULE_DESCRIPTION() in lib/fortify_kunit.o

Add the missing invocations of the MODULE_DESCRIPTION() macro.

Link: https://lkml.kernel.org/r/20240531-md-lib-fortify_source-v1-1-2c37f7fbaafc@quicinc.com
Signed-off-by: Jeff Johnson <quic_jjohnson@quicinc.com>
Cc: Kees Cook <keescook@chromium.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-06-24 22:25:05 -07:00
Jani Nikula
2f183c6834 kernel/panic: add verbose logging of kernel taints in backtraces
With nearly 20 taint flags and respective characters, it's getting a bit
difficult to remember what each taint flag character means.  Add verbose
logging of the set taints in the format:

Tainted: [P]=PROPRIETARY_MODULE, [W]=WARN

in dump_stack_print_info() when there are taints.

Note that the "negative flag" G is not included.

Link: https://lkml.kernel.org/r/7321e306166cb2ca2807ab8639e665baa2462e9c.1717146197.git.jani.nikula@intel.com
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-06-24 22:25:05 -07:00
Jeff Johnson
21516c56ff lib/ts: add missing MODULE_DESCRIPTION() macros
make allmodconfig && make W=1 C=1 reports:
WARNING: modpost: missing MODULE_DESCRIPTION() in lib/ts_kmp.o
WARNING: modpost: missing MODULE_DESCRIPTION() in lib/ts_bm.o
WARNING: modpost: missing MODULE_DESCRIPTION() in lib/ts_fsm.o

Add the missing invocations of the MODULE_DESCRIPTION() macro.

Link: https://lkml.kernel.org/r/20240531-lib-ts-v1-1-03d7f3546c49@quicinc.com
Signed-off-by: Jeff Johnson <quic_jjohnson@quicinc.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-06-24 22:25:04 -07:00
I Hsin Cheng
7abcb84f95 lib/plist.c: enforce memory ordering in plist_check_list
There exists an iteration over a plist in plist_check_list(), and memory
dependency exists between variables "prev", "next" and "prev->next".  As
plist is used in the scheduling subsystem, we should guarantee the memory
ordering between multiple processors.

Using macro "WRITE_ONCE()" can help us to ensure the memory ordering as
it was stated in "Documentation/memory-barriers.txt".

Link: https://lkml.kernel.org/r/20240526140139.17220-1-richard120310@gmail.com
Signed-off-by: I Hsin Cheng <richard120310@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-06-24 22:25:04 -07:00
Mateusz Guzik
51d821654b percpu_counter: add a cmpxchg-based _add_batch variant
Interrupt disable/enable trips are quite expensive on x86-64 compared to a
mere cmpxchg (note: no lock prefix!) and percpu counters are used quite
often.

With this change I get a bump of 1% ops/s for negative path lookups,
plugged into will-it-scale:

void testcase(unsigned long long *iterations, unsigned long nr)
{
        while (1) {
                int fd = open("/tmp/nonexistent", O_RDONLY);
                assert(fd == -1);

                (*iterations)++;
        }
}

The win would be higher if it was not for other slowdowns, but one has
to start somewhere.

Link: https://lkml.kernel.org/r/20240528204257.434817-1-mjguzik@gmail.com
Signed-off-by: Mateusz Guzik <mjguzik@gmail.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Dennis Zhou <dennis@kernel.org>
Cc: Hugh Dickins <hughd@google.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-06-24 22:25:03 -07:00
Kuan-Wei Chiu
54ce43da25 lib/test_sort: add a testcase to ensure code coverage
The addition of an if statement in lib/sort to handle the final unsorted 2
or 3 elements is not covered by existing test cases, leading to incomplete
test coverage.  To ensure comprehensive testing and maintain 100% code
coverage, add a new testcase for scenarios where the if statement is
triggered.

Since the if statement is only triggered when the array length is odd and
the first element is greater than the second element, a testcase is
created using an array length of TEST_LEN - 1 and a suitable random seed
to maintain full code coverage.

Link: https://lkml.kernel.org/r/20240527203011.1644280-5-visitorckw@gmail.com
Signed-off-by: Kuan-Wei Chiu <visitorckw@gmail.com>
Cc: Ching-Chun (Jim) Huang <jserv@ccns.ncku.edu.tw>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-06-24 22:25:03 -07:00
Kuan-Wei Chiu
41ed780435 lib/sort: optimize heapsort for handling final 2 or 3 elements
After building the heap, the code continuously pops two elements from the
heap until only 2 or 3 elements remain, at which point it switches back to
a regular heapsort with one element popped at a time.  However, to handle
the final 2 or 3 elements, an additional else-if statement in the while
loop was introduced, potentially increasing branch misses.  Moreover, when
there are only 2 or 3 elements left, continuing with regular heapify
operations is unnecessary as these cases are simple enough to be handled
with a single comparison and 1 or 2 swaps outside the while loop.

Eliminating the additional else-if statement and directly managing cases
involving 2 or 3 elements outside the loop reduces unnecessary conditional
branches resulting from the numerous loops and conditionals in heapify.

This optimization maintains consistent numbers of comparisons and swaps
for arrays with even lengths while reducing swaps and comparisons for
arrays with odd lengths from 2.5 swaps and 1 comparison to 1.5 swaps and 1
comparison.

Link: https://lkml.kernel.org/r/20240527203011.1644280-4-visitorckw@gmail.com
Signed-off-by: Kuan-Wei Chiu <visitorckw@gmail.com>
Cc: Ching-Chun (Jim) Huang <jserv@ccns.ncku.edu.tw>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-06-24 22:25:03 -07:00
Kuan-Wei Chiu
f49ac9571b lib/sort: fix outdated comment regarding glibc qsort()
The existing comment in lib/sort refers to glibc qsort() using quicksort. 
However, glibc qsort() no longer uses quicksort; it now uses mergesort and
falls back to heapsort if memory allocation for mergesort fails.  This
makes the comment outdated and incorrect.

Update the comment to refer to quicksort in general rather than glibc's
implementation to provide accurate information about the comparisons and
trade-offs without implying an outdated implementation.

Link: https://lkml.kernel.org/r/20240527203011.1644280-3-visitorckw@gmail.com
Signed-off-by: Kuan-Wei Chiu <visitorckw@gmail.com>
Cc: Ching-Chun (Jim) Huang <jserv@ccns.ncku.edu.tw>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-06-24 22:25:02 -07:00
Kuan-Wei Chiu
85fb11a879 lib/sort: remove unused pr_fmt macro
Patch series "lib/sort: Optimizations and cleanups".

This patch series optimizes the handling of the last 2 or 3 elements in
lib/sort and adds a testcase in lib/test_sort to maintain 100% code
coverage reflecting this change.  Additionally, it corrects outdated
descriptions regarding glibc qsort() and removes the unused pr_fmt macro.


This patch (of 4):

The pr_fmt macro is defined but not used in lib/sort.c.  Since there are
no pr_* functions printing any messages, the pr_fmt macro is redundant and
can be safely removed.

Link: https://lkml.kernel.org/r/20240527203011.1644280-1-visitorckw@gmail.com
Link: https://lkml.kernel.org/r/20240527203011.1644280-2-visitorckw@gmail.com
Signed-off-by: Kuan-Wei Chiu <visitorckw@gmail.com>
Cc: Ching-Chun (Jim) Huang <jserv@ccns.ncku.edu.tw>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-06-24 22:25:02 -07:00
Kuan-Wei Chiu
7099f74dc3 lib/test_min_heap: add test for heap_del()
Add test cases for the min_heap_del() to ensure its functionality is
thoroughly tested.

Link: https://lkml.kernel.org/r/20240524152958.919343-15-visitorckw@gmail.com
Signed-off-by: Kuan-Wei Chiu <visitorckw@gmail.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Bagas Sanjaya <bagasdotme@gmail.com>
Cc: Brian Foster <bfoster@redhat.com>
Cc: Ching-Chun (Jim) Huang <jserv@ccns.ncku.edu.tw>
Cc: Coly Li <colyli@suse.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kent Overstreet <kent.overstreet@linux.dev>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Matthew Sakai <msakai@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-06-24 22:24:59 -07:00
Kuan-Wei Chiu
267607e875 lib min_heap: add args for min_heap_callbacks
Add a third parameter 'args' for the 'less' and 'swp' functions in the
'struct min_heap_callbacks'.  This additional parameter allows these
comparison and swap functions to handle extra arguments when necessary.

Link: https://lkml.kernel.org/r/20240524152958.919343-9-visitorckw@gmail.com
Signed-off-by: Kuan-Wei Chiu <visitorckw@gmail.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Bagas Sanjaya <bagasdotme@gmail.com>
Cc: Brian Foster <bfoster@redhat.com>
Cc: Ching-Chun (Jim) Huang <jserv@ccns.ncku.edu.tw>
Cc: Coly Li <colyli@suse.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kent Overstreet <kent.overstreet@linux.dev>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Matthew Sakai <msakai@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-06-24 22:24:58 -07:00
Kuan-Wei Chiu
873ce25766 lib min_heap: add type safe interface
Implement a type-safe interface for min_heap using strong type pointers
instead of void * in the data field.  This change includes adding small
macro wrappers around functions, enabling the use of __minheap_cast and
__minheap_obj_size macros for type casting and obtaining element size. 
This implementation removes the necessity of passing element size in
min_heap_callbacks.  Additionally, introduce the MIN_HEAP_PREALLOCATED
macro for preallocating some elements.

Link: https://lkml.kernel.org/ioyfizrzq7w7mjrqcadtzsfgpuntowtjdw5pgn4qhvsdp4mqqg@nrlek5vmisbu
Link: https://lkml.kernel.org/r/20240524152958.919343-5-visitorckw@gmail.com
Signed-off-by: Kuan-Wei Chiu <visitorckw@gmail.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Bagas Sanjaya <bagasdotme@gmail.com>
Cc: Brian Foster <bfoster@redhat.com>
Cc: Ching-Chun (Jim) Huang <jserv@ccns.ncku.edu.tw>
Cc: Coly Li <colyli@suse.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kent Overstreet <kent.overstreet@linux.dev>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Matthew Sakai <msakai@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-06-24 22:24:57 -07:00
Breno Leitao
5b5baba622 debugobjects: Annotate racy debug variables
KCSAN has identified a potential data race in debugobjects, where the
global variable debug_objects_maxchain is accessed for both reading and
writing simultaneously in separate and parallel data paths. This results in
the following splat printed by KCSAN:

  BUG: KCSAN: data-race in debug_check_no_obj_freed / debug_object_activate

  write to 0xffffffff847ccfc8 of 4 bytes by task 734 on cpu 41:
  debug_object_activate (lib/debugobjects.c:199 lib/debugobjects.c:564 lib/debugobjects.c:710)
  call_rcu (kernel/rcu/rcu.h:227 kernel/rcu/tree.c:2719 kernel/rcu/tree.c:2838)
  security_inode_free (security/security.c:1626)
  __destroy_inode (./include/linux/fsnotify.h:222 fs/inode.c:287)
  ...
  read to 0xffffffff847ccfc8 of 4 bytes by task 384 on cpu 31:
  debug_check_no_obj_freed (lib/debugobjects.c:1000 lib/debugobjects.c:1019)
  kfree (mm/slub.c:2081 mm/slub.c:4280 mm/slub.c:4390)
  percpu_ref_exit (lib/percpu-refcount.c:147)
  css_free_rwork_fn (kernel/cgroup/cgroup.c:5357)
  ...
  value changed: 0x00000070 -> 0x00000071

The data race is actually harmless as this is just used for debugfs
statistics, as all other debug variables.

Annotate all debug variables as racy explicitly, since these variables
are known to be racy and harmless.

Signed-off-by: Breno Leitao <leitao@debian.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20240611091813.1189860-1-leitao@debian.org
2024-06-24 16:46:43 +02:00
Geert Uytterhoeven
a03a84bee3 lib/fonts: Fix visiblity of SUN12x22 and TER16x32 if DRM_PANIC
When CONFIG_FONTS ("Select compiled-in fonts") is not enabled, the user
should not be asked about any fonts.  However, when CONFIG_DRM_PANIC is
enabled, the user is still asked about the Sparc console 12x22 and
Terminus 16x32 fonts.

Fix this by moving the "|| DRM_PANIC" to where it belongs.
Split the dependency in two rules to improve readability.

Fixes: b94605a388 ("lib/fonts: Allow to select fonts for drm_panic")
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Jocelyn Falempe <jfalempe@redhat.com>
Signed-off-by: Jocelyn Falempe <jfalempe@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/ac474c6755800e61e18bd5af407c6acb449c5149.1718305355.git.geert+renesas@glider.be
2024-06-24 13:18:02 +02:00
Kent Overstreet
06efa5f30c closures: closure_get_not_zero(), closure_return_sync()
Provide new primitives for solving a lifetime issue with bcachefs
btree_trans objects.

closure_sync_return(): like closure_sync(), wait synchronously for any
outstanding gets. like closure_return, the closure is considered
"finished" and the ref left at 0.

closure_get_not_zero(): get a ref on a closure if it's alive, i.e. the
ref is not zero.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-06-23 00:57:21 -04:00
Linus Torvalds
c3de9b572f bcachefs fixes for 6.10-rc5
Lots of (mostly boring) fixes for syzbot bugs and rare(r) CI bugs.
 
 The LRU_TIME_BITS fix was slightly more involved; we only have 48 bits
 for the LRU position (we would prefer 64), so wraparound is possible for
 the cached data LRUs on a filesystem that has done sufficient
 (petabytes) reads; this is now handled.
 
 One notable user reported bugfix, where we were forgetting to correctly
 set the bucket data type, which should have been BCH_DATA_need_gc_gens
 instead of BCH_DATA_free; this was causing us to go emergency read-only
 on a filesystem that had seen heavy enough use to see bucket gen
 wraparoud.
 
 We're now starting to fix simple (safe) errors without requiring user
 intervention - i.e. a small incremental step towards full self healing.
 This is currently limited to just certain allocation information
 counters, and the error is still logged in the superblock; see that
 patch for more information. ("bcachefs: Fix safe errors by default").
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEKnAFLkS8Qha+jvQrE6szbY3KbnYFAmZ22GAACgkQE6szbY3K
 bnYJaQ/+Pzep1M9JU9bQRCjmbR1pDkHswqeiVR0DSPTqDSCR0KmoypA+iwAbAmzC
 X0Z3bHgh9X36QdnF5s+JSLSzeAdzD74btLeyCI58iH//QaIg7da6tE2FgJCstMt8
 11i9a172fhiYLH7YjigZczV10nrWApClS/9qHgY+paVEgMeJgx/3zJwysC1UhuT9
 6bsSZKeGMhkGDca3k5hd7mZZKUvFXpE//xe6axK05aTHvd2wDQbDOaMdn07XC+hF
 KIWloxYVu9utqprjIq2XHWLJaxRhguHwlI4xq+n8eljLw8Kt6S9lZp7CA85Hq4RA
 hLmv1qoqJvh8+YZ7twwYAhflm9mcz58GGKrIqPCG/YaftIktJx3DCOkZzn2b7TmD
 iVXBVYkcmlZqLpZzPisKO8omqVkH4YIN/WPIGa1JU/+jkw5Qzpw62K+9AjYowUCp
 q47TVWRNtAuL5sct2KVUdTkC5Dkhx7lu3NDvx4jVXfbPsv0ssNYiTKMNnAUqefz/
 eM37MCVzmy7OwAymdb5d83CzMIHm0JKetc7CgLBAjOcMLMoLDjRdGEcFGxq/iBMB
 2Ty4rUWGFbXlwV1umcYd2cODqIt+iLwmHWCAIXjtTlOw1h5YuwX67wb9zw/tzB1W
 JUEetJQWzQ7P/Q1huntNUbiIHw2GbWzeB2u0wBPaVVEgyHWftyk=
 =ktka
 -----END PGP SIGNATURE-----

Merge tag 'bcachefs-2024-06-22' of https://evilpiepirate.org/git/bcachefs

Pull bcachefs fixes from Kent Overstreet:
 "Lots of (mostly boring) fixes for syzbot bugs and rare(r) CI bugs.

  The LRU_TIME_BITS fix was slightly more involved; we only have 48 bits
  for the LRU position (we would prefer 64), so wraparound is possible
  for the cached data LRUs on a filesystem that has done sufficient
  (petabytes) reads; this is now handled.

  One notable user reported bugfix, where we were forgetting to
  correctly set the bucket data type, which should have been
  BCH_DATA_need_gc_gens instead of BCH_DATA_free; this was causing us to
  go emergency read-only on a filesystem that had seen heavy enough use
  to see bucket gen wraparoud.

  We're now starting to fix simple (safe) errors without requiring user
  intervention - i.e. a small incremental step towards full self
  healing.

  This is currently limited to just certain allocation information
  counters, and the error is still logged in the superblock; see that
  patch for more information. ("bcachefs: Fix safe errors by default")"

* tag 'bcachefs-2024-06-22' of https://evilpiepirate.org/git/bcachefs: (22 commits)
  bcachefs: Move the ei_flags setting to after initialization
  bcachefs: Fix a UAF after write_super()
  bcachefs: Use bch2_print_string_as_lines for long err
  bcachefs: Fix I_NEW warning in race path in bch2_inode_insert()
  bcachefs: Replace bare EEXIST with private error codes
  bcachefs: Fix missing alloc_data_type_set()
  closures: Change BUG_ON() to WARN_ON()
  bcachefs: fix alignment of VMA for memory mapped files on THP
  bcachefs: Fix safe errors by default
  bcachefs: Fix bch2_trans_put()
  bcachefs: set_worker_desc() for delete_dead_snapshots
  bcachefs: Fix bch2_sb_downgrade_update()
  bcachefs: Handle cached data LRU wraparound
  bcachefs: Guard against overflowing LRU_TIME_BITS
  bcachefs: delete_dead_snapshots() doesn't need to go RW
  bcachefs: Fix early init error path in journal code
  bcachefs: Check for invalid btree IDs
  bcachefs: Fix btree ID bitmasks
  bcachefs: Fix shift overflow in read_one_super()
  bcachefs: Fix a locking bug in the do_discard_fast() path
  ...
2024-06-22 09:02:39 -07:00
Kent Overstreet
339b84ab6b closures: Change BUG_ON() to WARN_ON()
If a BUG_ON() can be hit in the wild, it shouldn't be a BUG_ON()

For reference, this has popped up once in the CI, and we'll need more
info to debug it:

03240 ------------[ cut here ]------------
03240 kernel BUG at lib/closure.c:21!
03240 kernel BUG at lib/closure.c:21!
03240 Internal error: Oops - BUG: 00000000f2000800 [#1] SMP
03240 Modules linked in:
03240 CPU: 15 PID: 40534 Comm: kworker/u80:1 Not tainted 6.10.0-rc4-ktest-ga56da69799bd #25570
03240 Hardware name: linux,dummy-virt (DT)
03240 Workqueue: btree_update btree_interior_update_work
03240 pstate: 00001005 (nzcv daif -PAN -UAO -TCO -DIT +SSBS BTYPE=--)
03240 pc : closure_put+0x224/0x2a0
03240 lr : closure_put+0x24/0x2a0
03240 sp : ffff0000d12071c0
03240 x29: ffff0000d12071c0 x28: dfff800000000000 x27: ffff0000d1207360
03240 x26: 0000000000000040 x25: 0000000000000040 x24: 0000000000000040
03240 x23: ffff0000c1f20180 x22: 0000000000000000 x21: ffff0000c1f20168
03240 x20: 0000000040000000 x19: ffff0000c1f20140 x18: 0000000000000001
03240 x17: 0000000000003aa0 x16: 0000000000003ad0 x15: 1fffe0001c326974
03240 x14: 0000000000000a1e x13: 0000000000000000 x12: 1fffe000183e402d
03240 x11: ffff6000183e402d x10: dfff800000000000 x9 : ffff6000183e402e
03240 x8 : 0000000000000001 x7 : 00009fffe7c1bfd3 x6 : ffff0000c1f2016b
03240 x5 : ffff0000c1f20168 x4 : ffff6000183e402e x3 : ffff800081391954
03240 x2 : 0000000000000001 x1 : 0000000000000000 x0 : 00000000a8000000
03240 Call trace:
03240  closure_put+0x224/0x2a0
03240  bch2_check_for_deadlock+0x910/0x1028
03240  bch2_six_check_for_deadlock+0x1c/0x30
03240  six_lock_slowpath.isra.0+0x29c/0xed0
03240  six_lock_ip_waiter+0xa8/0xf8
03240  __bch2_btree_node_lock_write+0x14c/0x298
03240  bch2_trans_lock_write+0x6d4/0xb10
03240  __bch2_trans_commit+0x135c/0x5520
03240  btree_interior_update_work+0x1248/0x1c10
03240  process_scheduled_works+0x53c/0xd90
03240  worker_thread+0x370/0x8c8
03240  kthread+0x258/0x2e8
03240  ret_from_fork+0x10/0x20
03240 Code: aa1303e0 d63f0020 a94363f7 17ffff8c (d4210000)
03240 ---[ end trace 0000000000000000 ]---
03240 Kernel panic - not syncing: Oops - BUG: Fatal exception
03240 SMP: stopping secondary CPUs
03241 SMP: failed to stop secondary CPUs 13,15
03241 Kernel Offset: disabled
03241 CPU features: 0x00,00000003,80000008,4240500b
03241 Memory Limit: none
03241 ---[ end Kernel panic - not syncing: Oops - BUG: Fatal exception ]---
03246 ========= FAILED TIMEOUT copygc_torture_no_checksum in 7200s

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-06-21 10:17:04 -04:00
Jeff Johnson
3cbe18b0bc crypto: lib - add missing MODULE_DESCRIPTION() macros
With ARCH=arm, make allmodconfig && make W=1 C=1 reports:
WARNING: modpost: missing MODULE_DESCRIPTION() in lib/crypto/libsha256.o

Add the missing invocation of the MODULE_DESCRIPTION() macro to all
files which have a MODULE_LICENSE().

This includes sha1.c and utils.c which, although they did not produce
a warning with the arm allmodconfig configuration, may cause this
warning with other configurations.

Signed-off-by: Jeff Johnson <quic_jjohnson@quicinc.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2024-06-21 22:04:19 +10:00
Jiapeng Chong
b44327ebc1 crypto: lib/mpi - Use swap() in mpi_powm()
Use existing swap() function rather than duplicating its implementation.

./lib/crypto/mpi/mpi-pow.c:211:11-12: WARNING opportunity for swap().
./lib/crypto/mpi/mpi-pow.c:239:12-13: WARNING opportunity for swap().

Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Closes: https://bugzilla.openanolis.cn/show_bug.cgi?id=9327
Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2024-06-21 22:04:18 +10:00
Jiapeng Chong
f0da7a231c crypto: lib/mpi - Use swap() in mpi_ec_mul_point()
Use existing swap() function rather than duplicating its implementation.

./lib/crypto/mpi/ec.c:1291:20-21: WARNING opportunity for swap().
./lib/crypto/mpi/ec.c:1292:20-21: WARNING opportunity for swap().

Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Closes: https://bugzilla.openanolis.cn/show_bug.cgi?id=9328
Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2024-06-21 22:04:17 +10:00
Dave Airlie
f680df51ca drm-misc-next for 6.11:
UAPI Changes:
   - Deprecate DRM date and return a 0 date in DRM_IOCTL_VERSION
 
 Core Changes:
   - connector: Create a set of helpers to help with HDMI support
   - fbdev: Create memory manager optimized fbdev emulation
   - panic: Allow to select fonts, improve drm_fb_dma_get_scanout_buffer
 
 Driver Changes:
   - Remove driver owner assignments
   - Allow more drivers to compile with COMPILE_TEST
   - Conversions to drm_edid
   - ivpu: hardware scheduler support, profiling support, improvements
     to the platform support layer
   - mgag200: general reworks and improvements
   - nouveau: Add NVreg_RegistryDwords command line option
   - rockchip: Conversion to the hdmi helpers
   - sun4i: Conversion to the hdmi helpers
   - vc4: Conversion to the hdmi helpers
   - v3d: Perf counters improvements
   - zynqmp: IRQ and debugfs improvements
   - bridge:
     - Remove redundant checks on bridge->encoder
   - panels:
     - Switch panels from register table initialization to proper code
     - Now that the panel code tracks the panel state, remove every
       ad-hoc implementation in the panel drivers
     - New panels: Lincoln Tech Sol LCD185-101CT, Microtips Technology
       13-101HIEBCAF0-C, Microtips Technology MF-103HIEB0GA0, BOE
       nv110wum-l60, IVO t109nw41
 -----BEGIN PGP SIGNATURE-----
 
 iJUEABMJAB0WIQTkHFbLp4ejekA/qfgnX84Zoj2+dgUCZlhUKAAKCRAnX84Zoj2+
 dgHoAYDTpShgXFXnlnMtqZr+ZuShcjcwiqzwM4qNWdtyji9MONtJJU3ZQnGlnXbI
 ZU+oZP0Bf0PyT0/8bf+rmZBJ1UdAxt2IQaLkP1tTHOad4E+KlcL5n1opzMi160mB
 EZSm9f7aNw==
 =bZPt
 -----END PGP SIGNATURE-----

Merge tag 'drm-misc-next-2024-05-30' of https://gitlab.freedesktop.org/drm/misc/kernel into drm-next

drm-misc-next for 6.11:

UAPI Changes:
  - Deprecate DRM date and return a 0 date in DRM_IOCTL_VERSION

Core Changes:
  - connector: Create a set of helpers to help with HDMI support
  - fbdev: Create memory manager optimized fbdev emulation
  - panic: Allow to select fonts, improve drm_fb_dma_get_scanout_buffer

Driver Changes:
  - Remove driver owner assignments
  - Allow more drivers to compile with COMPILE_TEST
  - Conversions to drm_edid
  - ivpu: hardware scheduler support, profiling support, improvements
    to the platform support layer
  - mgag200: general reworks and improvements
  - nouveau: Add NVreg_RegistryDwords command line option
  - rockchip: Conversion to the hdmi helpers
  - sun4i: Conversion to the hdmi helpers
  - vc4: Conversion to the hdmi helpers
  - v3d: Perf counters improvements
  - zynqmp: IRQ and debugfs improvements
  - bridge:
    - Remove redundant checks on bridge->encoder
  - panels:
    - Switch panels from register table initialization to proper code
    - Now that the panel code tracks the panel state, remove every
      ad-hoc implementation in the panel drivers
    - New panels: Lincoln Tech Sol LCD185-101CT, Microtips Technology
      13-101HIEBCAF0-C, Microtips Technology MF-103HIEB0GA0, BOE
      nv110wum-l60, IVO t109nw41

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Maxime Ripard <mripard@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240530-hilarious-flat-magpie-5fa186@houat
2024-06-21 10:30:31 +10:00
Jakub Kicinski
a6ec08beec Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Cross-merge networking fixes after downstream PR.

Conflicts:

drivers/net/ethernet/broadcom/bnxt/bnxt.c
  1e7962114c ("bnxt_en: Restore PTP tx_avail count in case of skb_pad() error")
  165f87691a ("bnxt_en: add timestamping statistics support")

No adjacent changes.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-06-20 13:49:59 -07:00
Kees Cook
2003e483a8 fortify: Do not special-case 0-sized destinations
All fake flexible arrays should have been removed now, so remove the
special casing that was avoiding checking them. If a destination claims
to be 0 sized, believe it. This is especially important for cases where
__counted_by is in use and may have a 0 element count.

Link: https://lore.kernel.org/r/20240619203105.work.747-kees@kernel.org
Signed-off-by: Kees Cook <kees@kernel.org>
2024-06-19 13:32:04 -07:00
David Vernet
1538e33995 sched_ext: Print sched_ext info when dumping stack
It would be useful to see what the sched_ext scheduler state is, and what
scheduler is running, when we're dumping a task's stack. This patch
therefore adds a new print_scx_info() function that's called in the same
context as print_worker_info() and print_stop_info(). An example dump
follows.

  BUG: kernel NULL pointer dereference, address: 0000000000000999
  #PF: supervisor write access in kernel mode
  #PF: error_code(0x0002) - not-present page
  PGD 0 P4D 0
  Oops: 0002 [#1] PREEMPT SMP
  CPU: 13 PID: 2047 Comm: insmod Tainted: G           O       6.6.0-work-10323-gb58d4cae8e99-dirty #34
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS unknown 2/2/2022
  Sched_ext: qmap (enabled+all), task: runnable_at=-17ms
  RIP: 0010:init_module+0x9/0x1000 [test_module]
  ...

v3: - scx_ops_enable_state_str[] definition moved to an earlier patch as
      it's now used by core implementation.

    - Convert jiffy delta to msecs using jiffies_to_msecs() instead of
      multiplying by (HZ / MSEC_PER_SEC). The conversion is implemented in
      jiffies_delta_msecs().

v2: - We are now using scx_ops_enable_state_str[] outside
      CONFIG_SCHED_DEBUG. Move it outside of CONFIG_SCHED_DEBUG and to the
      top. This was reported by Changwoo and Andrea.

Signed-off-by: David Vernet <void@manifault.com>
Reported-by: Changwoo Min <changwoo@igalia.com>
Reported-by: Andrea Righi <andrea.righi@canonical.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2024-06-18 10:09:18 -10:00
Jeff Johnson
e334771d83 lib: bitmap: add missing MODULE_DESCRIPTION() macros
make allmodconfig && make W=1 C=1 reports:
WARNING: modpost: missing MODULE_DESCRIPTION() in lib/find_bit_benchmark.o
WARNING: modpost: missing MODULE_DESCRIPTION() in lib/cpumask_kunit.o
WARNING: modpost: missing MODULE_DESCRIPTION() in lib/test_bitmap.o

Add the missing invocations of the MODULE_DESCRIPTION() macro.

Signed-off-by: Jeff Johnson <quic_jjohnson@quicinc.com>
Signed-off-by: Yury Norov <yury.norov@gmail.com>
2024-06-18 10:40:52 -07:00
Linus Torvalds
5d272dd1b3 cpumask: limit FORCE_NR_CPUS to just the UP case
Hardcoding the number of CPUs at compile time does improve code
generation, but if you get it wrong the result will be confusion.

We already limited this earlier to only "experts" (see commit
fe5759d5bf "cpumask: limit visibility of FORCE_NR_CPUS"), but with
distro kernel configs often having EXPERT enabled, that turns out to not
be much of a limit.

To quote the philosophers at Disney: "Everyone can be an expert. And
when everyone's an expert, no one will be".

There's a runtime warning if you then set nr_cpus to anything but the
forced number, but apparently that can be ignored too [1] and by then
it's pretty much too late anyway.

If we had some real way to limit this to "embedded only", maybe it would
be worth it, but let's see if anybody even notices that the option is
gone.  We need to simplify kernel configuration anyway.

Link: https://lore.kernel.org/all/20240618105036.208a8860@rorschach.local.home/ [1]
Reported-by: Steven Rostedt <rostedt@goodmis.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Paul McKenney <paulmck@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Yury Norov <yury.norov@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2024-06-18 09:00:04 -07:00
Linus Torvalds
e6b324fbf2 19 hotfixes, 8 of which are cc:stable.
Mainly MM singleton fixes.  And a couple of ocfs2 regression fixes.
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCZnCEQAAKCRDdBJ7gKXxA
 jmgSAQDk3BYs1n67cnwx/Zi04yMYDyfYTCYg2udPfT2a+GpmbwD+N5dJd/vCztXH
 5eLpP11xd/yr2+I9FefyZeUuA80KtgQ=
 =2agY
 -----END PGP SIGNATURE-----

Merge tag 'mm-hotfixes-stable-2024-06-17-11-43' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull misc fixes from Andrew Morton:
 "Mainly MM singleton fixes. And a couple of ocfs2 regression fixes"

* tag 'mm-hotfixes-stable-2024-06-17-11-43' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
  kcov: don't lose track of remote references during softirqs
  mm: shmem: fix getting incorrect lruvec when replacing a shmem folio
  mm/debug_vm_pgtable: drop RANDOM_ORVALUE trick
  mm: fix possible OOB in numa_rebuild_large_mapping()
  mm/migrate: fix kernel BUG at mm/compaction.c:2761!
  selftests: mm: make map_fixed_noreplace test names stable
  mm/memfd: add documentation for MFD_NOEXEC_SEAL MFD_EXEC
  mm: mmap: allow for the maximum number of bits for randomizing mmap_base by default
  gcov: add support for GCC 14
  zap_pid_ns_processes: clear TIF_NOTIFY_SIGNAL along with TIF_SIGPENDING
  mm: huge_memory: fix misused mapping_large_folio_support() for anon folios
  lib/alloc_tag: fix RCU imbalance in pgalloc_tag_get()
  lib/alloc_tag: do not register sysctl interface when CONFIG_SYSCTL=n
  MAINTAINERS: remove Lorenzo as vmalloc reviewer
  Revert "mm: init_mlocked_on_free_v3"
  mm/page_table_check: fix crash on ZONE_DEVICE
  gcc: disable '-Warray-bounds' for gcc-9
  ocfs2: fix NULL pointer dereference in ocfs2_abort_trigger()
  ocfs2: fix NULL pointer dereference in ocfs2_journal_dirty()
2024-06-17 12:30:07 -07:00
Linus Torvalds
5cf81d7b0d hardening fixes for v6.10-rc5
- yama: document function parameter (Christian Göttsche_
 
 - mm/util: Swap kmemdup_array() arguments (Jean-Philippe Brucker)
 
 - kunit/overflow: Adjust for __counted_by with DEFINE_RAW_FLEX()
 
 - MAINTAINERS: Update entries for Kees Cook
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEpcP2jyKd1g9yPm4TiXL039xtwCYFAmZwfVsACgkQiXL039xt
 wCYfuQ/+KidYsVlf9xhc9eU6XQQZmPXhQT7QCWZEX2xj6xdob5Pv+YBHrL2dGCvn
 4b7xqWFqrkjDGVEQW5zF7mmn9T7a3c6+czKUR6rSueB6aO+NFns961rCBViYWxLN
 /xgee/1iCRg5iwg6SfP5CR9NIr9h6jU9d4Mv7cT2rwy913bCeQa89gkqCD2LJXmr
 m9HZgT0vsgfUO3+XsA42LKpP+dP+8UHtTumNOZrqnzZr9k69io9ncRjzmS/LjQPL
 ILo3QQ6QIV8bkSlOogMLZNHRc84Sc8x91KUM42ZUhV2tNxpNG6lt6UZXPATbvq/g
 TLHxvayjYOTWwF2DmlXncF/rtDLugsg/lyGS4tPjRX00Iq+jaTm1HOVJQ0rDUeLI
 lmMlGyDzAPK7UXU3hmx+i3sOuyt6HbfJYwF/7ErR0plDaWIbUrqy7uVxarag3qnc
 i4Lrr/5OdThUKl1jTBIBmfrOELI+m5opMvF2zUpS1BgHUw1U33rHWxQRoW1iTUnH
 Df11bl0NycmxyY0Vv4M1dnm8uP7XpjfFbdi87xj4+lGGKTM+wM9iQhrHVLBeIdPa
 dntZfsFB2ZF8LYlNXVnOcWLJjQP8SC99VCMsp/Un6AVmu/HMBP/+cZ6LHGWcUoWz
 qVrxqu9OjnK7jqsaDbDm3TLroCzL/8/oLRbqXuGJNamLOxz9oW0=
 =RFT7
 -----END PGP SIGNATURE-----

Merge tag 'hardening-v6.10-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux

Pull hardening fixes from Kees Cook:

 - yama: document function parameter (Christian Göttsche)

 - mm/util: Swap kmemdup_array() arguments (Jean-Philippe Brucker)

 - kunit/overflow: Adjust for __counted_by with DEFINE_RAW_FLEX()

 - MAINTAINERS: Update entries for Kees Cook

* tag 'hardening-v6.10-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
  MAINTAINERS: Update entries for Kees Cook
  kunit/overflow: Adjust for __counted_by with DEFINE_RAW_FLEX()
  yama: document function parameter
  mm/util: Swap kmemdup_array() arguments
2024-06-17 12:00:22 -07:00
Greg Kroah-Hartman
b5dd424181 Linux 6.10-rc4
-----BEGIN PGP SIGNATURE-----
 
 iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAmZvTbAeHHRvcnZhbGRz
 QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGVksIAJEn4a9IVM8FNCJy
 Dxo0BItD1/qJ5mLDptqUFRKlxInjbojofz5CyoeIeXb0DwRfB16ALXqNXAkd3APi
 saoOpfjFsg2H2OqL9CHdkzWcJEAq2lDnL0zaOjumeDVu/EyeT+tC4e4hq1e6Bm0E
 fPC5ms2b+07DF9Rg6/DW8yPbdM5n6Mz1bRd3fQOIgvpM3yGOyGztEBgTRub/ZUgH
 5pNJauknFAZgdiWhgNpc+lPWYZbgHKULQPhUBPdVhDIXPtQNUlKgNTQc6+L0Nmbb
 K1sG1q7FLeMJOTFGQfD4r26X5DNQUi894q/9SX8X7rcrECdJKcw2WjVyB4myADpf
 ae2gP+A=
 =XjWP
 -----END PGP SIGNATURE-----

Merge tag 'v6.10-rc4' into driver-core-next

We need the driver core and sysfs fixes in here to build on top of.

Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-06-17 08:33:41 +02:00
Greg Kroah-Hartman
2046047295 Linux 6.10-rc4
-----BEGIN PGP SIGNATURE-----
 
 iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAmZvTbAeHHRvcnZhbGRz
 QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGVksIAJEn4a9IVM8FNCJy
 Dxo0BItD1/qJ5mLDptqUFRKlxInjbojofz5CyoeIeXb0DwRfB16ALXqNXAkd3APi
 saoOpfjFsg2H2OqL9CHdkzWcJEAq2lDnL0zaOjumeDVu/EyeT+tC4e4hq1e6Bm0E
 fPC5ms2b+07DF9Rg6/DW8yPbdM5n6Mz1bRd3fQOIgvpM3yGOyGztEBgTRub/ZUgH
 5pNJauknFAZgdiWhgNpc+lPWYZbgHKULQPhUBPdVhDIXPtQNUlKgNTQc6+L0Nmbb
 K1sG1q7FLeMJOTFGQfD4r26X5DNQUi894q/9SX8X7rcrECdJKcw2WjVyB4myADpf
 ae2gP+A=
 =XjWP
 -----END PGP SIGNATURE-----

Merge tag 'v6.10-rc4' into char-misc-next

We need the char-misc and iio fixes in here as well to build on top of.

Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-06-17 08:31:12 +02:00
Suren Baghdasaryan
c944bf60c1 lib/alloc_tag: do not register sysctl interface when CONFIG_SYSCTL=n
Memory allocation profiling is trying to register sysctl interface even
when CONFIG_SYSCTL=n, resulting in proc_do_static_key() being undefined. 
Prevent that by skipping sysctl registration for such configurations.

Link: https://lkml.kernel.org/r/20240601233831.617124-1-surenb@google.com
Fixes: 22d407b164 ("lib: add allocation tagging support for memory allocation profiling")
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202405280616.wcOGWJEj-lkp@intel.com/
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Kent Overstreet <kent.overstreet@linux.dev>
Cc: Kees Cook <keescook@chromium.org>
Cc: Pasha Tatashin <pasha.tatashin@soleen.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-06-15 10:43:05 -07:00
Kees Cook
cf6219ee88 usercopy: Convert test_user_copy to KUnit test
Convert the runtime tests of hardened usercopy to standard KUnit tests.

Additionally disable usercopy_test_invalid() for systems with separate
address spaces (or no MMU) since it's not sensible to test for address
confusion there (e.g. m68k).

Co-developed-by: Vitor Massaru Iha <vitor@massaru.org>
Signed-off-by: Vitor Massaru Iha <vitor@massaru.org>
Link: https://lore.kernel.org/r/20200721174654.72132-1-vitor@massaru.org
Tested-by: Ivan Orlov <ivan.orlov0322@gmail.com>
Reviewed-by: David Gow <davidgow@google.com>
Signed-off-by: Kees Cook <kees@kernel.org>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2024-06-14 19:31:39 -06:00
Kees Cook
51104c19d8 kunit: test: Add vm_mmap() allocation resource manager
For tests that need to allocate using vm_mmap() (e.g. usercopy and
execve), provide the interface to have the allocation tracked by KUnit
itself. This requires bringing up a placeholder userspace mm.

This combines my earlier attempt at this with Mark Rutland's version[1].

Normally alloc_mm() and arch_pick_mmap_layout() aren't exported for
modules, so export these only for KUnit testing.

Link: https://lore.kernel.org/lkml/20230321122514.1743889-2-mark.rutland@arm.com/ [1]
Co-developed-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: David Gow <davidgow@google.com>
Signed-off-by: Kees Cook <kees@kernel.org>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2024-06-14 19:31:33 -06:00
Joel Granados
e2a6c472de mm profiling: Remove superfluous sentinel element from ctl_table
This commit is part of a greater effort to remove all empty elements at
the end of the ctl_table arrays (sentinels) which will reduce the
overall build time size of the kernel and run time memory bloat by ~64
bytes per sentinel (further information Link :
https://lore.kernel.org/all/ZO5Yx5JFogGi%2FcBo@bombadil.infradead.org/)

Removed sentinel from memory_allocation_profiling_sysctls

Signed-off-by: Joel Granados <j.granados@samsung.com>
2024-06-13 10:50:52 +02:00
Jeff Johnson
a930fde94a vsprintf: add missing MODULE_DESCRIPTION() macro
make allmodconfig && make W=1 C=1 reports:
WARNING: modpost: missing MODULE_DESCRIPTION() in lib/test_printf.o
WARNING: modpost: missing MODULE_DESCRIPTION() in lib/test_scanf.o

Add the missing invocations of the MODULE_DESCRIPTION() macro.

Signed-off-by: Jeff Johnson <quic_jjohnson@quicinc.com>
Link: https://lore.kernel.org/r/20240531-md-vsprintf-v1-1-d8bc7e21539a@quicinc.com
Reviewed-by: Petr Mladek <pmladek@suse.com>
Signed-off-by: Petr Mladek <pmladek@suse.com>
2024-06-12 13:26:28 +02:00
Zijun Hu
dd6e9894b4 kobject_uevent: Fix OOB access within zap_modalias_env()
zap_modalias_env() wrongly calculates size of memory block to move, so
will cause OOB memory access issue if variable MODALIAS is not the last
one within its @env parameter, fixed by correcting size to memmove.

Fixes: 9b3fa47d4a ("kobject: fix suppressing modalias in uevents delivered over netlink")
Cc: stable@vger.kernel.org
Signed-off-by: Zijun Hu <quic_zijuhu@quicinc.com>
Reviewed-by: Lk Sii <lk_sii@163.com>
Link: https://lore.kernel.org/r/1717074877-11352-1-git-send-email-quic_zijuhu@quicinc.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-06-12 13:24:05 +02:00
Jakub Kicinski
b1156532bc bpf-next-for-netdev
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQTFp0I1jqZrAX+hPRXbK58LschIgwUCZmIsRAAKCRDbK58LschI
 g4SSAP0bkl6rPMn7zp1h+/l7hlvpp2aVOmasBTe8hIhAGUbluwD/TGq4sNsGgXFI
 i4tUtFRhw8pOjy2guy6526qyJvBs8wY=
 =WMhY
 -----END PGP SIGNATURE-----

Merge tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next

Daniel Borkmann says:

====================
pull-request: bpf-next 2024-06-06

We've added 54 non-merge commits during the last 10 day(s) which contain
a total of 50 files changed, 1887 insertions(+), 527 deletions(-).

The main changes are:

1) Add a user space notification mechanism via epoll when a struct_ops
   object is getting detached/unregistered, from Kui-Feng Lee.

2) Big batch of BPF selftest refactoring for sockmap and BPF congctl
   tests, from Geliang Tang.

3) Add BTF field (type and string fields, right now) iterator support
   to libbpf instead of using existing callback-based approaches,
   from Andrii Nakryiko.

4) Extend BPF selftests for the latter with a new btf_field_iter
   selftest, from Alan Maguire.

5) Add new kfuncs for a generic, open-coded bits iterator,
   from Yafang Shao.

6) Fix BPF selftests' kallsyms_find() helper under kernels configured
   with CONFIG_LTO_CLANG_THIN, from Yonghong Song.

7) Remove a bunch of unused structs in BPF selftests,
   from David Alan Gilbert.

8) Convert test_sockmap section names into names understood by libbpf
   so it can deduce program type and attach type, from Jakub Sitnicki.

9) Extend libbpf with the ability to configure log verbosity
   via LIBBPF_LOG_LEVEL environment variable, from Mykyta Yatsenko.

10) Fix BPF selftests with regards to bpf_cookie and find_vma flakiness
    in nested VMs, from Song Liu.

11) Extend riscv32/64 JITs to introduce shift/add helpers to generate Zba
    optimization, from Xiao Wang.

12) Enable BPF programs to declare arrays and struct fields with kptr,
    bpf_rb_root, and bpf_list_head, from Kui-Feng Lee.

* tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (54 commits)
  selftests/bpf: Drop useless arguments of do_test in bpf_tcp_ca
  selftests/bpf: Use start_test in test_dctcp in bpf_tcp_ca
  selftests/bpf: Use start_test in test_dctcp_fallback in bpf_tcp_ca
  selftests/bpf: Add start_test helper in bpf_tcp_ca
  selftests/bpf: Use connect_to_fd_opts in do_test in bpf_tcp_ca
  libbpf: Auto-attach struct_ops BPF maps in BPF skeleton
  selftests/bpf: Add btf_field_iter selftests
  selftests/bpf: Fix send_signal test with nested CONFIG_PARAVIRT
  libbpf: Remove callback-based type/string BTF field visitor helpers
  bpftool: Use BTF field iterator in btfgen
  libbpf: Make use of BTF field iterator in BTF handling code
  libbpf: Make use of BTF field iterator in BPF linker code
  libbpf: Add BTF field iterator
  selftests/bpf: Ignore .llvm.<hash> suffix in kallsyms_find()
  selftests/bpf: Fix bpf_cookie and find_vma in nested VM
  selftests/bpf: Test global bpf_list_head arrays.
  selftests/bpf: Test global bpf_rb_root arrays and fields in nested struct types.
  selftests/bpf: Test kptr arrays and kptrs in nested struct fields.
  bpf: limit the number of levels of a nested struct type.
  bpf: look into the types of the fields of a struct type recursively.
  ...
====================

Link: https://lore.kernel.org/r/20240606223146.23020-1-daniel@iogearbox.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-06-10 18:02:14 -07:00
Kees Cook
9dd5134c61 kunit/overflow: Adjust for __counted_by with DEFINE_RAW_FLEX()
When a flexible array structure has a __counted_by annotation, its use
with DEFINE_RAW_FLEX() will result in the count being zero-initialized.
This is expected since one doesn't want to use RAW with a counted_by
struct. Adjust the tests to check for the condition and for compiler
support.

Reported-by: Christian Schrefl <chrisi.schrefl@gmail.com>
Closes: https://lore.kernel.org/all/0bfc6b38-8bc5-4971-b6fb-dc642a73fbfe@gmail.com/
Suggested-by: Nathan Chancellor <nathan@kernel.org>
Reviewed-by: Nathan Chancellor <nathan@kernel.org>
Link: https://lore.kernel.org/r/20240610182301.work.272-kees@kernel.org
Tested-by: Christian Schrefl <chrisi.schrefl@gmail.com>
Reviewed-by: Christian Schrefl <chrisi.schrefl@gmail.com>
Signed-off-by: Kees Cook <kees@kernel.org>
2024-06-10 12:00:04 -07:00
Ido Schimmel
97d833ceb2 mlxsw: spectrum_acl_erp: Fix object nesting warning
ACLs in Spectrum-2 and newer ASICs can reside in the algorithmic TCAM
(A-TCAM) or in the ordinary circuit TCAM (C-TCAM). The former can
contain more ACLs (i.e., tc filters), but the number of masks in each
region (i.e., tc chain) is limited.

In order to mitigate the effects of the above limitation, the device
allows filters to share a single mask if their masks only differ in up
to 8 consecutive bits. For example, dst_ip/25 can be represented using
dst_ip/24 with a delta of 1 bit. The C-TCAM does not have a limit on the
number of masks being used (and therefore does not support mask
aggregation), but can contain a limited number of filters.

The driver uses the "objagg" library to perform the mask aggregation by
passing it objects that consist of the filter's mask and whether the
filter is to be inserted into the A-TCAM or the C-TCAM since filters in
different TCAMs cannot share a mask.

The set of created objects is dependent on the insertion order of the
filters and is not necessarily optimal. Therefore, the driver will
periodically ask the library to compute a more optimal set ("hints") by
looking at all the existing objects.

When the library asks the driver whether two objects can be aggregated
the driver only compares the provided masks and ignores the A-TCAM /
C-TCAM indication. This is the right thing to do since the goal is to
move as many filters as possible to the A-TCAM. The driver also forbids
two identical masks from being aggregated since this can only happen if
one was intentionally put in the C-TCAM to avoid a conflict in the
A-TCAM.

The above can result in the following set of hints:

H1: {mask X, A-TCAM} -> H2: {mask Y, A-TCAM} // X is Y + delta
H3: {mask Y, C-TCAM} -> H4: {mask Z, A-TCAM} // Y is Z + delta

After getting the hints from the library the driver will start migrating
filters from one region to another while consulting the computed hints
and instructing the device to perform a lookup in both regions during
the transition.

Assuming a filter with mask X is being migrated into the A-TCAM in the
new region, the hints lookup will return H1. Since H2 is the parent of
H1, the library will try to find the object associated with it and
create it if necessary in which case another hints lookup (recursive)
will be performed. This hints lookup for {mask Y, A-TCAM} will either
return H2 or H3 since the driver passes the library an object comparison
function that ignores the A-TCAM / C-TCAM indication.

This can eventually lead to nested objects which are not supported by
the library [1].

Fix by removing the object comparison function from both the driver and
the library as the driver was the only user. That way the lookup will
only return exact matches.

I do not have a reliable reproducer that can reproduce the issue in a
timely manner, but before the fix the issue would reproduce in several
minutes and with the fix it does not reproduce in over an hour.

Note that the current usefulness of the hints is limited because they
include the C-TCAM indication and represent aggregation that cannot
actually happen. This will be addressed in net-next.

[1]
WARNING: CPU: 0 PID: 153 at lib/objagg.c:170 objagg_obj_parent_assign+0xb5/0xd0
Modules linked in:
CPU: 0 PID: 153 Comm: kworker/0:18 Not tainted 6.9.0-rc6-custom-g70fbc2c1c38b #42
Hardware name: Mellanox Technologies Ltd. MSN3700C/VMOD0008, BIOS 5.11 10/10/2018
Workqueue: mlxsw_core mlxsw_sp_acl_tcam_vregion_rehash_work
RIP: 0010:objagg_obj_parent_assign+0xb5/0xd0
[...]
Call Trace:
 <TASK>
 __objagg_obj_get+0x2bb/0x580
 objagg_obj_get+0xe/0x80
 mlxsw_sp_acl_erp_mask_get+0xb5/0xf0
 mlxsw_sp_acl_atcam_entry_add+0xe8/0x3c0
 mlxsw_sp_acl_tcam_entry_create+0x5e/0xa0
 mlxsw_sp_acl_tcam_vchunk_migrate_one+0x16b/0x270
 mlxsw_sp_acl_tcam_vregion_rehash_work+0xbe/0x510
 process_one_work+0x151/0x370

Fixes: 9069a3817d ("lib: objagg: implement optimization hints assembly and use hints for object creation")
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Amit Cohen <amcohen@nvidia.com>
Tested-by: Alexander Zubkov <green@qrator.net>
Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-06-10 11:14:52 +01:00
Ido Schimmel
b4a3a89fff lib: objagg: Fix general protection fault
The library supports aggregation of objects into other objects only if
the parent object does not have a parent itself. That is, nesting is not
supported.

Aggregation happens in two cases: Without and with hints, where hints
are a pre-computed recommendation on how to aggregate the provided
objects.

Nesting is not possible in the first case due to a check that prevents
it, but in the second case there is no check because the assumption is
that nesting cannot happen when creating objects based on hints. The
violation of this assumption leads to various warnings and eventually to
a general protection fault [1].

Before fixing the root cause, error out when nesting happens and warn.

[1]
general protection fault, probably for non-canonical address 0xdead000000000d90: 0000 [#1] PREEMPT SMP PTI
CPU: 1 PID: 1083 Comm: kworker/1:9 Tainted: G        W          6.9.0-rc6-custom-gd9b4f1cca7fb #7
Hardware name: Mellanox Technologies Ltd. MSN3700/VMOD0005, BIOS 5.11 01/06/2019
Workqueue: mlxsw_core mlxsw_sp_acl_tcam_vregion_rehash_work
RIP: 0010:mlxsw_sp_acl_erp_bf_insert+0x25/0x80
[...]
Call Trace:
 <TASK>
 mlxsw_sp_acl_atcam_entry_add+0x256/0x3c0
 mlxsw_sp_acl_tcam_entry_create+0x5e/0xa0
 mlxsw_sp_acl_tcam_vchunk_migrate_one+0x16b/0x270
 mlxsw_sp_acl_tcam_vregion_rehash_work+0xbe/0x510
 process_one_work+0x151/0x370
 worker_thread+0x2cb/0x3e0
 kthread+0xd0/0x100
 ret_from_fork+0x34/0x50
 ret_from_fork_asm+0x1a/0x30
 </TASK>

Fixes: 9069a3817d ("lib: objagg: implement optimization hints assembly and use hints for object creation")
Reported-by: Alexander Zubkov <green@qrator.net>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Amit Cohen <amcohen@nvidia.com>
Tested-by: Alexander Zubkov <green@qrator.net>
Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-06-10 11:14:52 +01:00
Ido Schimmel
2aad28ec45 lib: test_objagg: Fix spelling
Fixes: 0a020d416d ("lib: introduce initial implementation of object aggregation manager")
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Amit Cohen <amcohen@nvidia.com>
Tested-by: Alexander Zubkov <green@qrator.net>
Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-06-10 11:14:52 +01:00
Ido Schimmel
c1e156ae50 lib: objagg: Fix spelling
Fixes: 0a020d416d ("lib: introduce initial implementation of object aggregation manager")
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Amit Cohen <amcohen@nvidia.com>
Tested-by: Alexander Zubkov <green@qrator.net>
Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-06-10 11:14:52 +01:00
Jeff Johnson
425ae3ab5a list: test: add the missing MODULE_DESCRIPTION() macro
make allmodconfig && make W=1 C=1 reports:
WARNING: modpost: missing MODULE_DESCRIPTION() in lib/list-test.o

Add the missing invocation of the MODULE_DESCRIPTION() macro.

Signed-off-by: Jeff Johnson <quic_jjohnson@quicinc.com>
Reviewed-by: David Gow <davidgow@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2024-06-07 15:59:15 -06:00
Jeff Johnson
a521746821 kunit: add missing MODULE_DESCRIPTION() macros to core modules
make allmodconfig && make W=1 C=1 reports in lib/kunit:
WARNING: modpost: missing MODULE_DESCRIPTION() in lib/kunit/kunit.o
WARNING: modpost: missing MODULE_DESCRIPTION() in lib/kunit/kunit-test.o
WARNING: modpost: missing MODULE_DESCRIPTION() in lib/kunit/kunit-example-test.o

Add the missing invocations of the MODULE_DESCRIPTION() macro.

Signed-off-by: Jeff Johnson <quic_jjohnson@quicinc.com>
Reviewed-by: Rae Moar <rmoar@google.com>
Reviewed-by: David Gow <davidgow@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2024-06-07 15:59:05 -06:00
Jeff Johnson
645211db13 crypto: lib - add missing MODULE_DESCRIPTION() macros
Fix the allmodconfig 'make W=1' warnings:
WARNING: modpost: missing MODULE_DESCRIPTION() in lib/crypto/libchacha.o
WARNING: modpost: missing MODULE_DESCRIPTION() in lib/crypto/libarc4.o
WARNING: modpost: missing MODULE_DESCRIPTION() in lib/crypto/libdes.o
WARNING: modpost: missing MODULE_DESCRIPTION() in lib/crypto/libpoly1305.o

Signed-off-by: Jeff Johnson <quic_jjohnson@quicinc.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2024-06-07 19:46:39 +08:00
Linus Torvalds
d30d0e49da Including fixes from BPF and big collection of fixes for WiFi core
and drivers.
 
 Current release - regressions:
 
  - vxlan: fix regression when dropping packets due to invalid src addresses
 
  - bpf: fix a potential use-after-free in bpf_link_free()
 
  - xdp: revert support for redirect to any xsk socket bound to the same
    UMEM as it can result in a corruption
 
  - virtio_net:
    - add missing lock protection when reading return code from control_buf
    - fix false-positive lockdep splat in DIM
    - Revert "wifi: wilc1000: convert list management to RCU"
 
  - wifi: ath11k: fix error path in ath11k_pcic_ext_irq_config
 
 Previous releases - regressions:
 
  - rtnetlink: make the "split" NLM_DONE handling generic, restore the old
    behavior for two cases where we started coalescing those messages with
    normal messages, breaking sloppily-coded userspace
 
  - wifi:
    - cfg80211: validate HE operation element parsing
    - cfg80211: fix 6 GHz scan request building
    - mt76: mt7615: add missing chanctx ops
    - ath11k: move power type check to ASSOC stage, fix connecting
      to 6 GHz AP
    - ath11k: fix WCN6750 firmware crash caused by 17 num_vdevs
    - rtlwifi: ignore IEEE80211_CONF_CHANGE_RETRY_LIMITS
    - iwlwifi: mvm: fix a crash on 7265
 
 Previous releases - always broken:
 
  - ncsi: prevent multi-threaded channel probing, a spec violation
 
  - vmxnet3: disable rx data ring on dma allocation failure
 
  - ethtool: init tsinfo stats if requested, prevent unintentionally
    reporting all-zero stats on devices which don't implement any
 
  - dst_cache: fix possible races in less common IPv6 features
 
  - tcp: auth: don't consider TCP_CLOSE to be in TCP_AO_ESTABLISHED
 
  - ax25: fix two refcounting bugs
 
  - eth: ionic: fix kernel panic in XDP_TX action
 
 Misc:
 
  - tcp: count CLOSE-WAIT sockets for TCP_MIB_CURRESTAB
 
 Signed-off-by: Jakub Kicinski <kuba@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEE6jPA+I1ugmIBA4hXMUZtbf5SIrsFAmZh3mUACgkQMUZtbf5S
 IrvPwRAApv8X0ZIbPD5PuVEkiYuSkSE6QVou5GaVO7DzF4gj07zPNtCe6B/ZZdBu
 RLdlppxjAmVwdCRmUo0plxSydYZcqFpQqV6lRH/rbWmktWIp0pGIOAcOG7ISRPCC
 FAYJ4udSt4+wrq0hXTsE1KO1JZ0p7zE2bXxNC8uR8wgM9yonUjqhYdAUZhrl3yCY
 zOCD/+kvWFLYtehDcmyNK0ANS3yNveTNkRhXDc1UrpOGMtza60lf5u3bWK+sU5VS
 NGPe9cU60WKMQi6QnWFBZKIcp4Vgy2MukOLdNn9e8BRjFLh2dbY86LAmE4HWPA7I
 ONZagOfEjeOcRSCMdFHxui/PUDZLBZNhrnqQ6x8uC2yKwwIMr+CgEt5sCmVFwH6n
 3HTlWSjL38yuiVuYuhxGchmVnZfC4bLi2qAFF1oxhlDGViBDhAwi36MSCnjDpN8k
 Jo0x6crQLS/uvwVXPKWAUcQhy7OE69A3FwwA1PtkxRX5EQPn1if2Z7yq7YfYb9aD
 bChvCarlfuVDm+CBItphXg0ajVZc+im7+JK62Zn50A1cTbEK0lnYCOcmqzqiqrXI
 Vr3XXt6gVVnvwY374JDO1vmB5ft2IYBn7sWnLcIvR2UlggqEfqMdKSSwm7pOprG9
 YJ/LDAXVmG0kLN7rZUYUBLItnpuHAhYDrBOsV5HaFeksWauc1oY=
 =mwEJ
 -----END PGP SIGNATURE-----

Merge tag 'net-6.10-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net

Pull networking fixes from Jakub Kicinski:
 "Including fixes from BPF and big collection of fixes for WiFi core and
  drivers.

  Current release - regressions:

   - vxlan: fix regression when dropping packets due to invalid src
     addresses

   - bpf: fix a potential use-after-free in bpf_link_free()

   - xdp: revert support for redirect to any xsk socket bound to the
     same UMEM as it can result in a corruption

   - virtio_net:
      - add missing lock protection when reading return code from
        control_buf
      - fix false-positive lockdep splat in DIM
      - Revert "wifi: wilc1000: convert list management to RCU"

   - wifi: ath11k: fix error path in ath11k_pcic_ext_irq_config

  Previous releases - regressions:

   - rtnetlink: make the "split" NLM_DONE handling generic, restore the
     old behavior for two cases where we started coalescing those
     messages with normal messages, breaking sloppily-coded userspace

   - wifi:
      - cfg80211: validate HE operation element parsing
      - cfg80211: fix 6 GHz scan request building
      - mt76: mt7615: add missing chanctx ops
      - ath11k: move power type check to ASSOC stage, fix connecting to
        6 GHz AP
      - ath11k: fix WCN6750 firmware crash caused by 17 num_vdevs
      - rtlwifi: ignore IEEE80211_CONF_CHANGE_RETRY_LIMITS
      - iwlwifi: mvm: fix a crash on 7265

  Previous releases - always broken:

   - ncsi: prevent multi-threaded channel probing, a spec violation

   - vmxnet3: disable rx data ring on dma allocation failure

   - ethtool: init tsinfo stats if requested, prevent unintentionally
     reporting all-zero stats on devices which don't implement any

   - dst_cache: fix possible races in less common IPv6 features

   - tcp: auth: don't consider TCP_CLOSE to be in TCP_AO_ESTABLISHED

   - ax25: fix two refcounting bugs

   - eth: ionic: fix kernel panic in XDP_TX action

  Misc:

   - tcp: count CLOSE-WAIT sockets for TCP_MIB_CURRESTAB"

* tag 'net-6.10-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (107 commits)
  selftests: net: lib: set 'i' as local
  selftests: net: lib: avoid error removing empty netns name
  selftests: net: lib: support errexit with busywait
  net: ethtool: fix the error condition in ethtool_get_phy_stats_ethtool()
  ipv6: fix possible race in __fib6_drop_pcpu_from()
  af_unix: Annotate data-race of sk->sk_shutdown in sk_diag_fill().
  af_unix: Use skb_queue_len_lockless() in sk_diag_show_rqlen().
  af_unix: Use skb_queue_empty_lockless() in unix_release_sock().
  af_unix: Use unix_recvq_full_lockless() in unix_stream_connect().
  af_unix: Annotate data-race of net->unx.sysctl_max_dgram_qlen.
  af_unix: Annotate data-races around sk->sk_sndbuf.
  af_unix: Annotate data-races around sk->sk_state in UNIX_DIAG.
  af_unix: Annotate data-race of sk->sk_state in unix_stream_read_skb().
  af_unix: Annotate data-races around sk->sk_state in sendmsg() and recvmsg().
  af_unix: Annotate data-race of sk->sk_state in unix_accept().
  af_unix: Annotate data-race of sk->sk_state in unix_stream_connect().
  af_unix: Annotate data-races around sk->sk_state in unix_write_space() and poll().
  af_unix: Annotate data-race of sk->sk_state in unix_inq_len().
  af_unix: Annodate data-races around sk->sk_state for writers.
  af_unix: Set sk->sk_state under unix_state_lock() for truly disconencted peer.
  ...
2024-06-06 09:55:27 -07:00
Jean-Philippe Brucker
0ee1472547 mm/util: Swap kmemdup_array() arguments
GCC 14.1 complains about the argument usage of kmemdup_array():

  drivers/soc/tegra/fuse/fuse-tegra.c:130:65: error: 'kmemdup_array' sizes specified with 'sizeof' in the earlier argument and not in the later argument [-Werror=calloc-transposed-args]
    130 |         fuse->lookups = kmemdup_array(fuse->soc->lookups, sizeof(*fuse->lookups),
        |                                                                 ^
  drivers/soc/tegra/fuse/fuse-tegra.c:130:65: note: earlier argument should specify number of elements, later size of each element

The annotation introduced by commit 7d78a77733 ("string: Add
additional __realloc_size() annotations for "dup" helpers") lets the
compiler think that kmemdup_array() follows the same format as calloc(),
with the number of elements preceding the size of one element. So we
could simply swap the arguments to __realloc_size() to get rid of that
warning, but it seems cleaner to instead have kmemdup_array() follow the
same format as krealloc_array(), memdup_array_user(), calloc() etc.

Fixes: 7d78a77733 ("string: Add additional __realloc_size() annotations for "dup" helpers")
Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Link: https://lore.kernel.org/r/20240606144608.97817-2-jean-philippe@linaro.org
Signed-off-by: Kees Cook <kees@kernel.org>
2024-06-06 08:55:20 -07:00
Jeff Johnson
0d618e3976 lib/math: add missing MODULE_DESCRIPTION() macros
make allmodconfig && make W=1 C=1 reports:
WARNING: modpost: missing MODULE_DESCRIPTION() in lib/math/prime_numbers.o
WARNING: modpost: missing MODULE_DESCRIPTION() in lib/math/rational-test.o

Add the missing invocations of the MODULE_DESCRIPTION() macro.

Signed-off-by: Jeff Johnson <quic_jjohnson@quicinc.com>
Link: https://lore.kernel.org/r/20240531-md-lib-math-v1-1-11a3bec51ebb@quicinc.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-06-04 17:40:07 +02:00
Jeff Johnson
5a71c0d118 dyndbg: add missing MODULE_DESCRIPTION() macro
make allmodconfig && make W=1 C=1 reports:
WARNING: modpost: missing MODULE_DESCRIPTION() in lib/test_dynamic_debug.o

Add the missing invocation of the MODULE_DESCRIPTION() macro.

Signed-off-by: Jeff Johnson <quic_jjohnson@quicinc.com>
Link: https://lore.kernel.org/r/20240531-md-test_dynamic_debug-v1-1-2194b477f55e@quicinc.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-06-04 17:40:02 +02:00
Jeff Johnson
c6cab01d7e lib/test_rhashtable: add missing MODULE_DESCRIPTION() macro
make allmodconfig && make W=1 C=1 reports:
WARNING: modpost: missing MODULE_DESCRIPTION() in lib/test_rhashtable.o

Add the missing invocation of the MODULE_DESCRIPTION() macro.

Signed-off-by: Jeff Johnson <quic_jjohnson@quicinc.com>
Link: https://lore.kernel.org/r/20240531-md-lib-test_rhashtable-v1-1-cd6d4138f1b6@quicinc.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-06-03 18:51:18 -07:00
Dr. David Alan Gilbert
8031042cc5 list: test: remove unused struct 'klist_test_struct'
'klist_test_struct' has been unused since the original
commit 57b4f760f9 ("list: test: Test the klist structure").

Remove it.

Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org>
Reviewed-by: David Gow <davidgow@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2024-06-03 11:16:03 -06:00
Jeff Johnson
ec1249d327 test_bpf: Add missing MODULE_DESCRIPTION()
make allmodconfig && make W=1 C=1 reports:
WARNING: modpost: missing MODULE_DESCRIPTION() in lib/test_bpf.o

Add the missing invocation of the MODULE_DESCRIPTION() macro.

Signed-off-by: Jeff Johnson <quic_jjohnson@quicinc.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20240531-md-lib-test_bpf-v1-1-868e4bd2f9ed@quicinc.com
2024-06-03 17:03:41 +02:00
Kees Cook
99a6087dfd kunit/fortify: Remove __kmalloc_node() test
__kmalloc_node() is considered an "internal" function to the Slab, so
drop it from explicit testing.

Link: https://lore.kernel.org/r/20240531185703.work.588-kees@kernel.org
Signed-off-by: Kees Cook <kees@kernel.org>
2024-05-31 13:47:41 -07:00
Ivan Orlov
2c7afc2a88 kunit: Cover 'assert.c' with tests
There are multiple assertion formatting functions in the `assert.c`
file, which are not covered with tests yet. Implement the KUnit test
for these functions.

The test consists of 11 test cases for the following functions:

1) 'is_literal'
2) 'is_str_literal'
3) 'kunit_assert_prologue', test case for multiple assert types
4) 'kunit_assert_print_msg'
5) 'kunit_unary_assert_format'
6) 'kunit_ptr_not_err_assert_format'
7) 'kunit_binary_assert_format'
8) 'kunit_binary_ptr_assert_format'
9) 'kunit_binary_str_assert_format'
10) 'kunit_assert_hexdump'
11) 'kunit_mem_assert_format'

The test aims at maximizing the branch coverage for the assertion
formatting functions.

As you can see, it covers some of the static helper functions as
well, so mark the static functions in `assert.c` as 'VISIBLE_IF_KUNIT'
and conditionally export them with EXPORT_SYMBOL_IF_KUNIT. Add the
corresponding definitions to `assert.h`.

Build the assert test when CONFIG_KUNIT_TEST is enabled, similar to
how it is done for the string stream test.

Signed-off-by: Ivan Orlov <ivan.orlov0322@gmail.com>
Reviewed-by: Rae Moar <rmoar@google.com>
Acked-by: David Gow <davidgow@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2024-05-30 12:53:47 -06:00
Vlastimil Babka
a0a44d9175 mm, slab: don't wrap internal functions with alloc_hooks()
The functions __kmalloc_noprof(), kmalloc_large_noprof(),
kmalloc_trace_noprof() and their _node variants are all internal to the
implementations of kmalloc_noprof() and kmalloc_node_noprof() and are
only declared in the "public" slab.h and exported so that those
implementations can be static inline and distinguish the build-time
constant size variants. The only other users for some of the internal
functions are slub_kunit and fortify_kunit tests which make very
short-lived allocations.

Therefore we can stop wrapping them with the alloc_hooks() macro.
Instead add a __ prefix to all of them and a comment documenting these
as internal. Also rename __kmalloc_trace() to __kmalloc_cache() which is
more descriptive - it is a variant of __kmalloc() where the exact
kmalloc cache has been already determined.

The usage in fortify_kunit can be removed completely, as the internal
functions should be tested already through kmalloc() tests in the
test variant that passes non-constant allocation size.

Reported-by: Kent Overstreet <kent.overstreet@linux.dev>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Kees Cook <keescook@chromium.org>
Reviewed-by: Kent Overstreet <kent.overstreet@linux.dev>
Acked-by: David Rientjes <rientjes@google.com>
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
2024-05-28 09:27:50 +02:00
Maxime Ripard
375c4d1583
Merge drm/drm-next into drm-misc-next
Let's start the new release cycle.

Signed-off-by: Maxime Ripard <mripard@kernel.org>
2024-05-27 11:08:31 +02:00
Linus Torvalds
9b62e02e63 16 hotfixes, 11 of which are cc:stable.
A few nilfs2 fixes, the remainder are for MM: a couple of selftests fixes,
 various singletons fixing various issues in various parts.
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCZlIOUgAKCRDdBJ7gKXxA
 jrYnAP9UeOw8YchTIsjEllmAbTMAqWGI+54CU/qD78jdIHoVWAEAmp0QqgFW3r2p
 jze4jBkh3lGQjykTjkUskaR71h9AZww=
 =AHeV
 -----END PGP SIGNATURE-----

Merge tag 'mm-hotfixes-stable-2024-05-25-09-13' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull misc fixes from Andrew Morton:
 "16 hotfixes, 11 of which are cc:stable.

  A few nilfs2 fixes, the remainder are for MM: a couple of selftests
  fixes, various singletons fixing various issues in various parts"

* tag 'mm-hotfixes-stable-2024-05-25-09-13' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
  mm/ksm: fix possible UAF of stable_node
  mm/memory-failure: fix handling of dissolved but not taken off from buddy pages
  mm: /proc/pid/smaps_rollup: avoid skipping vma after getting mmap_lock again
  nilfs2: fix potential hang in nilfs_detach_log_writer()
  nilfs2: fix unexpected freezing of nilfs_segctor_sync()
  nilfs2: fix use-after-free of timer for log writer thread
  selftests/mm: fix build warnings on ppc64
  arm64: patching: fix handling of execmem addresses
  selftests/mm: compaction_test: fix bogus test success and reduce probability of OOM-killer invocation
  selftests/mm: compaction_test: fix incorrect write of zero to nr_hugepages
  selftests/mm: compaction_test: fix bogus test success on Aarch64
  mailmap: update email address for Satya Priya
  mm/huge_memory: don't unpoison huge_zero_folio
  kasan, fortify: properly rename memintrinsics
  lib: add version into /proc/allocinfo output
  mm/vmalloc: fix vmalloc which may return null if called with __GFP_NOFAIL
2024-05-25 15:10:33 -07:00
Suren Baghdasaryan
a38568a0b4 lib: add version into /proc/allocinfo output
Add version string and a header at the beginning of /proc/allocinfo to
allow later format changes.  Example output:

> head /proc/allocinfo
allocinfo - version: 1.0
#     <size>  <calls> <tag info>
           0        0 init/main.c:1314 func:do_initcalls
           0        0 init/do_mounts.c:353 func:mount_nodev_root
           0        0 init/do_mounts.c:187 func:mount_root_generic
           0        0 init/do_mounts.c:158 func:do_mount_root
           0        0 init/initramfs.c:493 func:unpack_to_rootfs
           0        0 init/initramfs.c:492 func:unpack_to_rootfs
           0        0 init/initramfs.c:491 func:unpack_to_rootfs
         512        1 arch/x86/events/rapl.c:681 func:init_rapl_pmus
         128        1 arch/x86/events/rapl.c:571 func:rapl_cpu_online

[akpm@linux-foundation.org: remove stray newline from struct allocinfo_private]
Link: https://lkml.kernel.org/r/20240514163128.3662251-1-surenb@google.com
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Reviewed-by: Pasha Tatashin <pasha.tatashin@soleen.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Cc: Kent Overstreet <kent.overstreet@linux.dev>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-05-24 11:55:05 -07:00
Linus Torvalds
b0a9ba13ff hardening fixes for v6.10-rc1
- loadpin: Prevent SECURITY_LOADPIN_ENFORCE=y without module decompression
   (Stephen Boyd)
 
 - ubsan: Restore dependency on ARCH_HAS_UBSAN
 
 - kunit/fortify: Fix memcmp() test to be amplitude agnostic
 -----BEGIN PGP SIGNATURE-----
 
 iQJKBAABCgA0FiEEpcP2jyKd1g9yPm4TiXL039xtwCYFAmZP0w0WHGtlZXNjb29r
 QGNocm9taXVtLm9yZwAKCRCJcvTf3G3AJqYDEACWaY0Xjig6Izo+B+85IozTLf2R
 Wv3zlOjUhjbRn7enzhVBRRfU216nl/wp8s7pKhNYCEZ7gJ+04hYtZoLY6YV7jtZ0
 RAvpwc1dmUm7RZIBxjnzqiNTdttNBniPDE47goV0Yi9JVSDFY1Y/P5GwiAr0PO6W
 kt1+WBr2zADNpTZziH8MZou7jfK+y1bOZw8rUUFMODrMc0buuLGO2h+lZqASJXNs
 5NHPUOoJsZHvQxN/YSyE555VycpoyWiwMvA1XOz1NVKdr1eFP1heu88AnIRKOD7o
 cMz6W/yUZ+4dYr2yydDGNX+QvFmZuvPz0oXAlI7BAblpT0UU7xv0jaioAhIam87U
 WxVQSOgkLQBw6Ym79W66HplizCVfEl9aUAYDSK5UJlwdpNE/j16XLYDLKxDi0wUZ
 pjUy5CF0X7FFNyY7Kp5flqzKrQG31vfqZf/yWhtWu258x604LR6CTkO06IJDINx0
 UUrbehie3bGnbu5FS0oVKGH37Mq0aRn4Xk2aUZaFf1Vz/YtU4Wo3FbtyOyFZsdpl
 aCNyYzmNmfVijDQlLshy6HBACeLPV2DjIJ8pcC74abUV1FX6VOvIDsTy4ELkm9BF
 WZ8LNryo79lFsFMThhwfCDHubhXoaLjkl4rpOB5x+Ld0q+GgfIb5jMfF507YxrRj
 3KxJJKXzUKNf+JFnjg==
 =VTTF
 -----END PGP SIGNATURE-----

Merge tag 'hardening-v6.10-rc1-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux

Pull hardening fixes from Kees Cook:

 - loadpin: Prevent SECURITY_LOADPIN_ENFORCE=y without module
   decompression (Stephen Boyd)

 - ubsan: Restore dependency on ARCH_HAS_UBSAN

 - kunit/fortify: Fix memcmp() test to be amplitude agnostic

* tag 'hardening-v6.10-rc1-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
  kunit/fortify: Fix memcmp() test to be amplitude agnostic
  ubsan: Restore dependency on ARCH_HAS_UBSAN
  loadpin: Prevent SECURITY_LOADPIN_ENFORCE=y without module decompression
2024-05-24 08:33:44 -07:00
Linus Torvalds
c760b3725e - A series ("kbuild: enable more warnings by default") from Arnd
Bergmann which enables a number of additional build-time warnings.  We
   fixed all the fallout which we could find, there may still be a few
   stragglers.
 
 - Samuel Holland has developed the series "Unified cross-architecture
   kernel-mode FPU API".  This does a lot of consolidation of
   per-architecture kernel-mode FPU usage and enables the use of newer AMD
   GPUs on RISC-V.
 
 - Tao Su has fixed some selftests build warnings in the series
   "Selftests: Fix compilation warnings due to missing _GNU_SOURCE
   definition".
 
 - This pull also includes a nilfs2 fixup from Ryusuke Konishi.
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCZk6OSAAKCRDdBJ7gKXxA
 jpTGAP9hQaZ+g7CO38hKQAtEI8rwcZJtvUAP84pZEGMjYMGLxQD/S8z1o7UHx61j
 DUbnunbOkU/UcPx3Fs/gp4KcJARMEgs=
 =EPi9
 -----END PGP SIGNATURE-----

Merge tag 'mm-nonmm-stable-2024-05-22-17-30' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull more non-mm updates from Andrew Morton:

 - A series ("kbuild: enable more warnings by default") from Arnd
   Bergmann which enables a number of additional build-time warnings. We
   fixed all the fallout which we could find, there may still be a few
   stragglers.

 - Samuel Holland has developed the series "Unified cross-architecture
   kernel-mode FPU API". This does a lot of consolidation of
   per-architecture kernel-mode FPU usage and enables the use of newer
   AMD GPUs on RISC-V.

 - Tao Su has fixed some selftests build warnings in the series
   "Selftests: Fix compilation warnings due to missing _GNU_SOURCE
   definition".

 - This pull also includes a nilfs2 fixup from Ryusuke Konishi.

* tag 'mm-nonmm-stable-2024-05-22-17-30' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (23 commits)
  nilfs2: make block erasure safe in nilfs_finish_roll_forward()
  selftests/harness: use 1024 in place of LINE_MAX
  Revert "selftests/harness: remove use of LINE_MAX"
  selftests/fpu: allow building on other architectures
  selftests/fpu: move FP code to a separate translation unit
  drm/amd/display: use ARCH_HAS_KERNEL_FPU_SUPPORT
  drm/amd/display: only use hard-float, not altivec on powerpc
  riscv: add support for kernel-mode FPU
  x86: implement ARCH_HAS_KERNEL_FPU_SUPPORT
  powerpc: implement ARCH_HAS_KERNEL_FPU_SUPPORT
  LoongArch: implement ARCH_HAS_KERNEL_FPU_SUPPORT
  lib/raid6: use CC_FLAGS_FPU for NEON CFLAGS
  arm64: crypto: use CC_FLAGS_FPU for NEON CFLAGS
  arm64: implement ARCH_HAS_KERNEL_FPU_SUPPORT
  ARM: crypto: use CC_FLAGS_FPU for NEON CFLAGS
  ARM: implement ARCH_HAS_KERNEL_FPU_SUPPORT
  arch: add ARCH_HAS_KERNEL_FPU_SUPPORT
  x86/fpu: fix asm/fpu/types.h include guard
  kbuild: enable -Wcast-function-type-strict unconditionally
  kbuild: enable -Wformat-truncation on clang
  ...
2024-05-22 18:59:29 -07:00
Linus Torvalds
5c6f4d68e2 A series from Dave Chinner which cleans up and fixes the handling of
nested allocations within stackdepot and page-owner.
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCZk6MRwAKCRDdBJ7gKXxA
 jnzeAP9WHW425N7pWmE7rK7n8oXZK9f356dKJMtz2A35Bx6XJgEAuK86kDRA4Kv3
 kg8mtwzOIQYKZWzn5VlcvBbtlhjKGwM=
 =9/Ou
 -----END PGP SIGNATURE-----

Merge tag 'mm-stable-2024-05-22-17-22' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull more mm updates from Andrew Morton:
 "A series from Dave Chinner which cleans up and fixes the handling of
  nested allocations within stackdepot and page-owner"

* tag 'mm-stable-2024-05-22-17-22' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
  mm/page-owner: use gfp_nested_mask() instead of open coded masking
  stackdepot: use gfp_nested_mask() instead of open coded masking
  mm: lift gfp_kmemleak_mask() to gfp.h
2024-05-22 17:32:04 -07:00
Linus Torvalds
f6b8e86b7a TTY/Serial changes for 6.10-rc1
Here is the big set of tty/serial driver changes for 6.10-rc1.  Included
 in here are:
   - Usual good set of api cleanups and evolution by Jiri Slaby to make
     the serial interfaces move out of the 1990's by using kfifos instead
     of hand-rolling their own logic.
   - 8250_exar driver updates
   - max3100 driver updates
   - sc16is7xx driver updates
   - exar driver updates
   - sh-sci driver updates
   - tty ldisc api addition to help refuse bindings
   - other smaller serial driver updates
 
 All of these have been in linux-next for a while with no reported
 issues.
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 -----BEGIN PGP SIGNATURE-----
 
 iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCZk4Cvg8cZ3JlZ0Brcm9h
 aC5jb20ACgkQMUfUDdst+ymqpwCgnHU1NeBBUsvoSDOLk5oApIQ4jVgAn102jWlw
 3dNDhA4i3Ay/mZdv8/Kj
 =TI+P
 -----END PGP SIGNATURE-----

Merge tag 'tty-6.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty

Pull tty / serial updates from Greg KH:
 "Here is the big set of tty/serial driver changes for 6.10-rc1.
  Included in here are:

   - Usual good set of api cleanups and evolution by Jiri Slaby to make
     the serial interfaces move out of the 1990's by using kfifos
     instead of hand-rolling their own logic.

   - 8250_exar driver updates

   - max3100 driver updates

   - sc16is7xx driver updates

   - exar driver updates

   - sh-sci driver updates

   - tty ldisc api addition to help refuse bindings

   - other smaller serial driver updates

  All of these have been in linux-next for a while with no reported
  issues"

* tag 'tty-6.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty: (113 commits)
  serial: Clear UPF_DEAD before calling tty_port_register_device_attr_serdev()
  serial: imx: Raise TX trigger level to 8
  serial: 8250_pnp: Simplify "line" related code
  serial: sh-sci: simplify locking when re-issuing RXDMA fails
  serial: sh-sci: let timeout timer only run when DMA is scheduled
  serial: sh-sci: describe locking requirements for invalidating RXDMA
  serial: sh-sci: protect invalidating RXDMA on shutdown
  tty: add the option to have a tty reject a new ldisc
  serial: core: Call device_set_awake_path() for console port
  dt-bindings: serial: brcm,bcm2835-aux-uart: convert to dtschema
  tty: serial: uartps: Add support for uartps controller reset
  arm64: zynqmp: Add resets property for UART nodes
  dt-bindings: serial: cdns,uart: Add optional reset property
  serial: 8250_pnp: Switch to DEFINE_SIMPLE_DEV_PM_OPS()
  serial: 8250_exar: Keep the includes sorted
  serial: 8250_exar: Make type of bit the same in exar_ee_*_bit()
  serial: 8250_exar: Use BIT() in exar_ee_read()
  serial: 8250_exar: Switch to use dev_err_probe()
  serial: 8250_exar: Return directly from switch-cases
  serial: 8250_exar: Decrease indentation level
  ...
2024-05-22 11:53:02 -07:00
Linus Torvalds
4865a27c66 bitmap patches for 6.10
Hi Linus,
 
 Please pull patches for 6.10. This includes:
  - topology_span_sane() optimization from Kyle Meyer;
  - fns() rework from Kuan-Wei Chiu (used in
    cpumask_local_spread() and other places); and
  - headers cleanup from Andy.
 
 This also adds a MAINTAINERS record for bitops API as it's unattended,
 and I'd like to follow it closer.
 
 Thanks,
 Yury
 -----BEGIN PGP SIGNATURE-----
 
 iQGzBAABCgAdFiEEi8GdvG6xMhdgpu/4sUSA/TofvsgFAmZKh/kACgkQsUSA/Tof
 vshtSQv/eT5+KyXg5qCY3fLaIjWYD0uch5jxkdqtib5BncfIrUMsFpZBon+E2x9C
 fWu7K/nfxUjKZF0Sfgl9gVns6K0rC4F24WzHjzWRVVV7+g4idXwMC1kxSX733KQC
 o+D2065Dx9EmhnzypBbmNsGQsQ09WXP1GsJLf8qSGCw0lT1zNtgqsAD5sSogFGGn
 ca9ZsndThuzTst5lXPXipt1W/c26frchh6SgjVTPjzALCDAf5r9Ls5np3AL1AW8X
 yR8cuV9UphT1ysBplzPbBET/Fy/AGbZl1g4u72M6NvGy/nVkQ5Ic4HZj0zIem0Ic
 C60PokY8lg6hQ7tWN8da12/g6WZINgZcfUfuodKiQAzryBGUJlW0aDzDUZPcCqB/
 gmV/Op4RPJeQr9sibQ6nIFx73ydKVQEmZRliahzXR0p33HJCOLTATOeYqLTXQMdi
 ZwhYCqG5fNEUK0VMBy8S4+tEsUAoykU21hFD04b/Ur8A49bxxJ9RDlAUC0IEc1Pj
 fiU0VPFx
 =H6BQ
 -----END PGP SIGNATURE-----

Merge tag 'bitmap-for-6.10v2' of https://github.com/norov/linux

Pull bitmap updates from Yury Norov:

 - topology_span_sane() optimization from Kyle Meyer

 - fns() rework from Kuan-Wei Chiu (used in cpumask_local_spread() and
   other places)

 - headers cleanup from Andy

 - add a MAINTAINERS record for bitops API

* tag 'bitmap-for-6.10v2' of https://github.com/norov/linux:
  usercopy: Don't use "proxy" headers
  bitops: Move aligned_byte_mask() to wordpart.h
  MAINTAINERS: add BITOPS API record
  bitmap: relax find_nth_bit() limitation on return value
  lib: make test_bitops compilable into the kernel image
  bitops: Optimize fns() for improved performance
  lib/test_bitops: Add benchmark test for fns()
  Compiler Attributes: Add __always_used macro
  sched/topology: Optimize topology_span_sane()
  cpumask: Add for_each_cpu_from()
2024-05-21 15:29:01 -07:00
Linus Torvalds
3413efa888 Compactifying bdev flags
We can easily have up to 24 flags with sane
 atomicity, _without_ pushing anything out
 of the first cacheline of struct block_device.
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQQqUNBr3gm4hGXdBJlZ7Krx/gZQ6wUCZkznRwAKCRBZ7Krx/gZQ
 69XpAQDOZCyvYOZ/dlMOKKLf2vAojC/h++E/NjvGt3erbvVN2wEArXMi13ECsoCw
 JYJA3MsmvjuY6VNcm24icf2/p4TMIgo=
 =JyYi
 -----END PGP SIGNATURE-----

Merge tag 'pull-bd_flags-2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs

Pull bdev flags update from Al Viro:
 "Compactifying bdev flags.

  We can easily have up to 24 flags with sane atomicity, _without_
  pushing anything out of the first cacheline of struct block_device"

* tag 'pull-bd_flags-2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  bdev: move ->bd_make_it_fail to ->__bd_flags
  bdev: move ->bd_ro_warned to ->__bd_flags
  bdev: move ->bd_has_subit_bio to ->__bd_flags
  bdev: move ->bd_write_holder into ->__bd_flags
  bdev: move ->bd_read_only to ->__bd_flags
  bdev: infrastructure for flags
  wrapper for access to ->bd_partno
  Use bdev_is_paritition() instead of open-coding it
2024-05-21 13:02:56 -07:00
Andy Shevchenko
5671dca241 usercopy: Don't use "proxy" headers
Update header inclusions to follow IWYU (Include What You Use)
principle.

Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Yury Norov <yury.norov@gmail.com>
2024-05-19 16:12:38 -07:00
Andy Shevchenko
9f2c2d6ba1 bitops: Move aligned_byte_mask() to wordpart.h
The bitops.h is for bit related operations. The aligned_byte_mask()
is about byte (or part of the machine word) operations, for which
we have a separate header, move the mentioned macro to wordpart.h
to consolidate similar operations.

Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Yury Norov <yury.norov@gmail.com>
2024-05-19 16:12:38 -07:00
Dave Chinner
70c435ca8d stackdepot: use gfp_nested_mask() instead of open coded masking
The stackdepot code is used by KASAN and lockdep for recoding stack
traces.  Both of these track allocation context information, and so their
internal allocations must obey the caller allocation contexts to avoid
generating their own false positive warnings that have nothing to do with
the code they are instrumenting/tracking.

We also don't want recording stack traces to deplete emergency memory
reserves - debug code is useless if it creates new issues that can't be
replicated when the debug code is disabled.

Switch the stackdepot allocation masking to use gfp_nested_mask() to
address these issues.  gfp_nested_mask() also strips GFP_ZONEMASK
naturally, so that greatly simplifies this code.

Link: https://lkml.kernel.org/r/20240430054604.4169568-3-david@fromorbit.com
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Marco Elver <elver@google.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
Reviewed-by: Oscar Salvador <osalvador@suse.de>
Cc: Andrey Konovalov <andreyknvl@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-05-19 14:40:44 -07:00
Samuel Holland
790a4a3dd1 selftests/fpu: allow building on other architectures
Now that ARCH_HAS_KERNEL_FPU_SUPPORT provides a common way to compile and
run floating-point code, this test is no longer x86-specific.

Link: https://lkml.kernel.org/r/20240329072441.591471-16-samuel.holland@sifive.com
Signed-off-by: Samuel Holland <samuel.holland@sifive.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Acked-by: Christian König <christian.koenig@amd.com> 
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Borislav Petkov (AMD) <bp@alien8.de>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Masahiro Yamada <masahiroy@kernel.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Nicolas Schier <nicolas@fjasle.eu>
Cc: Palmer Dabbelt <palmer@rivosinc.com>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: WANG Xuerui <git@xen0n.name>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-05-19 14:36:20 -07:00
Samuel Holland
9613736d85 selftests/fpu: move FP code to a separate translation unit
This ensures no compiler-generated floating-point code can appear outside
kernel_fpu_{begin,end}() sections, and some architectures enforce this
separation.

Link: https://lkml.kernel.org/r/20240329072441.591471-15-samuel.holland@sifive.com
Signed-off-by: Samuel Holland <samuel.holland@sifive.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Acked-by: Christian König <christian.koenig@amd.com> 
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Borislav Petkov (AMD) <bp@alien8.de>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Masahiro Yamada <masahiroy@kernel.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Nicolas Schier <nicolas@fjasle.eu>
Cc: Palmer Dabbelt <palmer@rivosinc.com>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: WANG Xuerui <git@xen0n.name>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-05-19 14:36:20 -07:00
Samuel Holland
4be073931c lib/raid6: use CC_FLAGS_FPU for NEON CFLAGS
Now that CC_FLAGS_FPU is exported and can be used anywhere in the source
tree, use it instead of duplicating the flags here.

Link: https://lkml.kernel.org/r/20240329072441.591471-7-samuel.holland@sifive.com
Signed-off-by: Samuel Holland <samuel.holland@sifive.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Acked-by: Christian König <christian.koenig@amd.com> 
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Borislav Petkov (AMD) <bp@alien8.de>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Masahiro Yamada <masahiroy@kernel.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Nicolas Schier <nicolas@fjasle.eu>
Cc: Palmer Dabbelt <palmer@rivosinc.com>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: WANG Xuerui <git@xen0n.name>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-05-19 14:36:18 -07:00
Linus Torvalds
eb6a9339ef Mainly singleton patches, documented in their respective changelogs.
Notable series include:
 
 - Some maintenance and performance work for ocfs2 in Heming Zhao's
   series "improve write IO performance when fragmentation is high".
 
 - Some ocfs2 bugfixes from Su Yue in the series "ocfs2 bugs fixes
   exposed by fstests".
 
 - kfifo header rework from Andy Shevchenko in the series "kfifo: Clean
   up kfifo.h".
 
 - GDB script fixes from Florian Rommel in the series "scripts/gdb: Fixes
   for $lx_current and $lx_per_cpu".
 
 - After much discussion, a coding-style update from Barry Song
   explaining one reason why inline functions are preferred over macros.
   The series is "codingstyle: avoid unused parameters for a function-like
   macro".
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCZkpLYQAKCRDdBJ7gKXxA
 jo9NAQDctSD3TMXqxqCHLaEpCaYTYzi6TGAVHjgkqGzOt7tYjAD/ZIzgcmRwthjP
 R7SSiSgZ7UnP9JRn16DQILmFeaoG1gs=
 =lYhr
 -----END PGP SIGNATURE-----

Merge tag 'mm-nonmm-stable-2024-05-19-11-56' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull non-mm updates from Andrew Morton:
 "Mainly singleton patches, documented in their respective changelogs.
  Notable series include:

   - Some maintenance and performance work for ocfs2 in Heming Zhao's
     series "improve write IO performance when fragmentation is high".

   - Some ocfs2 bugfixes from Su Yue in the series "ocfs2 bugs fixes
     exposed by fstests".

   - kfifo header rework from Andy Shevchenko in the series "kfifo:
     Clean up kfifo.h".

   - GDB script fixes from Florian Rommel in the series "scripts/gdb:
     Fixes for $lx_current and $lx_per_cpu".

   - After much discussion, a coding-style update from Barry Song
     explaining one reason why inline functions are preferred over
     macros. The series is "codingstyle: avoid unused parameters for a
     function-like macro""

* tag 'mm-nonmm-stable-2024-05-19-11-56' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (62 commits)
  fs/proc: fix softlockup in __read_vmcore
  nilfs2: convert BUG_ON() in nilfs_finish_roll_forward() to WARN_ON()
  scripts: checkpatch: check unused parameters for function-like macro
  Documentation: coding-style: ask function-like macros to evaluate parameters
  nilfs2: use __field_struct() for a bitwise field
  selftests/kcmp: remove unused open mode
  nilfs2: remove calls to folio_set_error() and folio_clear_error()
  kernel/watchdog_perf.c: tidy up kerneldoc
  watchdog: allow nmi watchdog to use raw perf event
  watchdog: handle comma separated nmi_watchdog command line
  nilfs2: make superblock data array index computation sparse friendly
  squashfs: remove calls to set the folio error flag
  squashfs: convert squashfs_symlink_read_folio to use folio APIs
  scripts/gdb: fix detection of current CPU in KGDB
  scripts/gdb: make get_thread_info accept pointers
  scripts/gdb: fix parameter handling in $lx_per_cpu
  scripts/gdb: fix failing KGDB detection during probe
  kfifo: don't use "proxy" headers
  media: stih-cec: add missing io.h
  media: rc: add missing io.h
  ...
2024-05-19 14:02:03 -07:00
Linus Torvalds
16dbfae867 bcachefs changes for 6.10-rc1
- More safety fixes, primarily found by syzbot
 
 - Run the upgrade/downgrade paths in nochnages mode. Nochanges mode is
   primarily for testing fsck/recovery in dry run mode, so it shouldn't
   change anything besides disabling writes and holding dirty metadata in
   memory.
 
   The idea here was to reduce the amount of activity if we can't write
   anything out, so that bringing up a filesystem in "super ro" mode
   would be more lilkely to work for data recovery - but norecovery is
   the correct option for this.
 
 - btree_trans->locked; we now track whether a btree_trans has any btree
   nodes locked, and this is used for improved assertions related to
   trans_unlock() and trans_relock(). We'll also be using it for
   improving how we work with lockdep in the future: we don't want
   lockdep to be tracking individual btree node locks because we take too
   many for lockdep to track, and it's not necessary since we have a
   cycle detector.
 
 - Trigger improvements that are prep work for online fsck
 
 - BTREE_TRIGGER_check_repair; this regularizes how we do some repair
   work for extents that goes with running triggers in fsck, and fixes
   some subtle issues with transaction restarts there.
 
 - bch2_snapshot_equiv() has now been ripped out of fsck.c; snapshot
   equivalence classes are for when snapshot deletion leaves behind
   redundant snapshot nodes, but snapshot deletion now cleans this up
   right away, so the abstraction doesn't need to leak.
 
 - Improvements to how we resume writing to the journal in recovery. The
   code for picking the new place to write when reading the journal is
   greatly simplified and we also store the position in the superblock
   for when we don't read the journal; this means that we preserve more
   of the journal for list_journal debugging.
 
 - Improvements to sysfs btree_cache and btree_node_cache, for debugging
   memory reclaim.
 
 - We now detect when we've blocked for 10 seconds on the allocator in
   the write path and dump some useful info.
 
 - Safety fixes for devices references: this is a big series that changes
   almost all device lookups to properly check if the device exists and
   take a reference to it.
 
   Previously we assumed that if a bkey exists that references a device
   then the device must exist, and this was enforced in .invalid methods,
   but this was incorrect because it meant device removal relied on
   accounting being correct to not leave keys pointing to invalid
   devices, and that's not something we can assume.
 
   Getting the "pointer to invalid device" checks out of our .invalid()
   methods fixes some long standing device removal bugs; the only
   outstanding bug with device removal now is a race between the discard
   path and deleting alloc info, which should be easily fixed.
 
 - The allocator now prefers not to expand the new
   member_info.btree_allocated bitmap, meaning if repair ever requires
   scanning for btree nodes (because of a corrupt interior nodes) we
   won't have to scan the whole device(s).
 
 - New coding style document, which among other things talks about the
   correct usage of assertions
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEKnAFLkS8Qha+jvQrE6szbY3KbnYFAmZKJQgACgkQE6szbY3K
 bnZETg//SU9H0OHnBSMB/cteF6PKo9QR+dhT+3n+gWTxl0o/egbGTqwbzVqGtd2f
 J6II1BsDk8VoTOb/gFfLRShlmJfnj2jpRThU265faR/7LQYeSaqndDPkjOpTayAD
 Nj/DJyiSUTL753rZh3yUhOpOIHf7iapH6wuaZCPfhdfk+yvZNW8iz07JHjHLKRp8
 I2cFH0r6kN916NdRkt9oDCz68WouT8eWTqwcKra04XsLEZjNJHxLpKMq4M8UdPc7
 YynJPVt+aP8+VduGIq6pV8Co3afCP2oUywo11JpRmvLsw4tex/59wxOYtpMfgn6k
 4H+9WqiBwkbmnLDrfFHWRameS6F/7+GRAOVuz9nkmfk61UPU15gLjSRffqZ6u2YC
 7vbrXgebId/sZXtBpQd83RMMX52BnEJah0upNJ54IsSqfDYkU9lwl6CEyYpcX1hf
 YNBGBTbspZztc3AB13b3ow421FMhaySUg0FDmntMR9O8Z6/BXk7Ykc7b8DPEfrFs
 W6JY7q+ARBxr+EgFcV74fvMCf7NJTAhyv80AKryo7NFU2JZOyyaTxcTGSnolX4Mi
 lyHiOgicmOX+vy3vbC1dZoDcmIDJ4Uc0vixYcpKiZqxlR8XJ+wpevC50TEhxrcW+
 ZO4SloQvgyjI34xu/gZgjRYb3BhXK3x+ougVFpRG8V8zQ/+ccWg=
 =MKrF
 -----END PGP SIGNATURE-----

Merge tag 'bcachefs-2024-05-19' of https://evilpiepirate.org/git/bcachefs

Pull bcachefs updates from Kent Overstreet:

 - More safety fixes, primarily found by syzbot

 - Run the upgrade/downgrade paths in nochnages mode. Nochanges mode is
   primarily for testing fsck/recovery in dry run mode, so it shouldn't
   change anything besides disabling writes and holding dirty metadata
   in memory.

   The idea here was to reduce the amount of activity if we can't write
   anything out, so that bringing up a filesystem in "super ro" mode
   would be more lilkely to work for data recovery - but norecovery is
   the correct option for this.

 - btree_trans->locked; we now track whether a btree_trans has any btree
   nodes locked, and this is used for improved assertions related to
   trans_unlock() and trans_relock(). We'll also be using it for
   improving how we work with lockdep in the future: we don't want
   lockdep to be tracking individual btree node locks because we take
   too many for lockdep to track, and it's not necessary since we have a
   cycle detector.

 - Trigger improvements that are prep work for online fsck

 - BTREE_TRIGGER_check_repair; this regularizes how we do some repair
   work for extents that goes with running triggers in fsck, and fixes
   some subtle issues with transaction restarts there.

 - bch2_snapshot_equiv() has now been ripped out of fsck.c; snapshot
   equivalence classes are for when snapshot deletion leaves behind
   redundant snapshot nodes, but snapshot deletion now cleans this up
   right away, so the abstraction doesn't need to leak.

 - Improvements to how we resume writing to the journal in recovery. The
   code for picking the new place to write when reading the journal is
   greatly simplified and we also store the position in the superblock
   for when we don't read the journal; this means that we preserve more
   of the journal for list_journal debugging.

 - Improvements to sysfs btree_cache and btree_node_cache, for debugging
   memory reclaim.

 - We now detect when we've blocked for 10 seconds on the allocator in
   the write path and dump some useful info.

 - Safety fixes for devices references: this is a big series that
   changes almost all device lookups to properly check if the device
   exists and take a reference to it.

   Previously we assumed that if a bkey exists that references a device
   then the device must exist, and this was enforced in .invalid
   methods, but this was incorrect because it meant device removal
   relied on accounting being correct to not leave keys pointing to
   invalid devices, and that's not something we can assume.

   Getting the "pointer to invalid device" checks out of our .invalid()
   methods fixes some long standing device removal bugs; the only
   outstanding bug with device removal now is a race between the discard
   path and deleting alloc info, which should be easily fixed.

 - The allocator now prefers not to expand the new
   member_info.btree_allocated bitmap, meaning if repair ever requires
   scanning for btree nodes (because of a corrupt interior nodes) we
   won't have to scan the whole device(s).

 - New coding style document, which among other things talks about the
   correct usage of assertions

* tag 'bcachefs-2024-05-19' of https://evilpiepirate.org/git/bcachefs: (155 commits)
  bcachefs: add no_invalid_checks flag
  bcachefs: add counters for failed shrinker reclaim
  bcachefs: Fix sb_field_downgrade validation
  bcachefs: Plumb bch_validate_flags to sb_field_ops.validate()
  bcachefs: s/bkey_invalid_flags/bch_validate_flags
  bcachefs: fsync() should not return -EROFS
  bcachefs: Invalid devices are now checked for by fsck, not .invalid methods
  bcachefs: kill bch2_dev_bkey_exists() in bch2_check_fix_ptrs()
  bcachefs: kill bch2_dev_bkey_exists() in bch2_read_endio()
  bcachefs: bch2_dev_get_ioref() checks for device not present
  bcachefs: bch2_dev_get_ioref2(); io_read.c
  bcachefs: bch2_dev_get_ioref2(); debug.c
  bcachefs: bch2_dev_get_ioref2(); journal_io.c
  bcachefs: bch2_dev_get_ioref2(); io_write.c
  bcachefs: bch2_dev_get_ioref2(); btree_io.c
  bcachefs: bch2_dev_get_ioref2(); backpointers.c
  bcachefs: bch2_dev_get_ioref2(); alloc_background.c
  bcachefs: for_each_bset() declares loop iter
  bcachefs: Move BCACHEFS_STATFS_MAGIC value to UAPI magic.h
  bcachefs: Improve sysfs internal/btree_cache
  ...
2024-05-19 13:45:48 -07:00
Linus Torvalds
61307b7be4 The usual shower of singleton fixes and minor series all over MM,
documented (hopefully adequately) in the respective changelogs.  Notable
 series include:
 
 - Lucas Stach has provided some page-mapping
   cleanup/consolidation/maintainability work in the series "mm/treewide:
   Remove pXd_huge() API".
 
 - In the series "Allow migrate on protnone reference with
   MPOL_PREFERRED_MANY policy", Donet Tom has optimized mempolicy's
   MPOL_PREFERRED_MANY mode, yielding almost doubled performance in one
   test.
 
 - In their series "Memory allocation profiling" Kent Overstreet and
   Suren Baghdasaryan have contributed a means of determining (via
   /proc/allocinfo) whereabouts in the kernel memory is being allocated:
   number of calls and amount of memory.
 
 - Matthew Wilcox has provided the series "Various significant MM
   patches" which does a number of rather unrelated things, but in largely
   similar code sites.
 
 - In his series "mm: page_alloc: freelist migratetype hygiene" Johannes
   Weiner has fixed the page allocator's handling of migratetype requests,
   with resulting improvements in compaction efficiency.
 
 - In the series "make the hugetlb migration strategy consistent" Baolin
   Wang has fixed a hugetlb migration issue, which should improve hugetlb
   allocation reliability.
 
 - Liu Shixin has hit an I/O meltdown caused by readahead in a
   memory-tight memcg.  Addressed in the series "Fix I/O high when memory
   almost met memcg limit".
 
 - In the series "mm/filemap: optimize folio adding and splitting" Kairui
   Song has optimized pagecache insertion, yielding ~10% performance
   improvement in one test.
 
 - Baoquan He has cleaned up and consolidated the early zone
   initialization code in the series "mm/mm_init.c: refactor
   free_area_init_core()".
 
 - Baoquan has also redone some MM initializatio code in the series
   "mm/init: minor clean up and improvement".
 
 - MM helper cleanups from Christoph Hellwig in his series "remove
   follow_pfn".
 
 - More cleanups from Matthew Wilcox in the series "Various page->flags
   cleanups".
 
 - Vlastimil Babka has contributed maintainability improvements in the
   series "memcg_kmem hooks refactoring".
 
 - More folio conversions and cleanups in Matthew Wilcox's series
 
 	"Convert huge_zero_page to huge_zero_folio"
 	"khugepaged folio conversions"
 	"Remove page_idle and page_young wrappers"
 	"Use folio APIs in procfs"
 	"Clean up __folio_put()"
 	"Some cleanups for memory-failure"
 	"Remove page_mapping()"
 	"More folio compat code removal"
 
 - David Hildenbrand chipped in with "fs/proc/task_mmu: convert hugetlb
   functions to work on folis".
 
 - Code consolidation and cleanup work related to GUP's handling of
   hugetlbs in Peter Xu's series "mm/gup: Unify hugetlb, part 2".
 
 - Rick Edgecombe has developed some fixes to stack guard gaps in the
   series "Cover a guard gap corner case".
 
 - Jinjiang Tu has fixed KSM's behaviour after a fork+exec in the series
   "mm/ksm: fix ksm exec support for prctl".
 
 - Baolin Wang has implemented NUMA balancing for multi-size THPs.  This
   is a simple first-cut implementation for now.  The series is "support
   multi-size THP numa balancing".
 
 - Cleanups to vma handling helper functions from Matthew Wilcox in the
   series "Unify vma_address and vma_pgoff_address".
 
 - Some selftests maintenance work from Dev Jain in the series
   "selftests/mm: mremap_test: Optimizations and style fixes".
 
 - Improvements to the swapping of multi-size THPs from Ryan Roberts in
   the series "Swap-out mTHP without splitting".
 
 - Kefeng Wang has significantly optimized the handling of arm64's
   permission page faults in the series
 
 	"arch/mm/fault: accelerate pagefault when badaccess"
 	"mm: remove arch's private VM_FAULT_BADMAP/BADACCESS"
 
 - GUP cleanups from David Hildenbrand in "mm/gup: consistently call it
   GUP-fast".
 
 - hugetlb fault code cleanups from Vishal Moola in "Hugetlb fault path to
   use struct vm_fault".
 
 - selftests build fixes from John Hubbard in the series "Fix
   selftests/mm build without requiring "make headers"".
 
 - Memory tiering fixes/improvements from Ho-Ren (Jack) Chuang in the
   series "Improved Memory Tier Creation for CPUless NUMA Nodes".  Fixes
   the initialization code so that migration between different memory types
   works as intended.
 
 - David Hildenbrand has improved follow_pte() and fixed an errant driver
   in the series "mm: follow_pte() improvements and acrn follow_pte()
   fixes".
 
 - David also did some cleanup work on large folio mapcounts in his
   series "mm: mapcount for large folios + page_mapcount() cleanups".
 
 - Folio conversions in KSM in Alex Shi's series "transfer page to folio
   in KSM".
 
 - Barry Song has added some sysfs stats for monitoring multi-size THP's
   in the series "mm: add per-order mTHP alloc and swpout counters".
 
 - Some zswap cleanups from Yosry Ahmed in the series "zswap same-filled
   and limit checking cleanups".
 
 - Matthew Wilcox has been looking at buffer_head code and found the
   documentation to be lacking.  The series is "Improve buffer head
   documentation".
 
 - Multi-size THPs get more work, this time from Lance Yang.  His series
   "mm/madvise: enhance lazyfreeing with mTHP in madvise_free" optimizes
   the freeing of these things.
 
 - Kemeng Shi has added more userspace-visible writeback instrumentation
   in the series "Improve visibility of writeback".
 
 - Kemeng Shi then sent some maintenance work on top in the series "Fix
   and cleanups to page-writeback".
 
 - Matthew Wilcox reduces mmap_lock traffic in the anon vma code in the
   series "Improve anon_vma scalability for anon VMAs".  Intel's test bot
   reported an improbable 3x improvement in one test.
 
 - SeongJae Park adds some DAMON feature work in the series
 
 	"mm/damon: add a DAMOS filter type for page granularity access recheck"
 	"selftests/damon: add DAMOS quota goal test"
 
 - Also some maintenance work in the series
 
 	"mm/damon/paddr: simplify page level access re-check for pageout"
 	"mm/damon: misc fixes and improvements"
 
 - David Hildenbrand has disabled some known-to-fail selftests ni the
   series "selftests: mm: cow: flag vmsplice() hugetlb tests as XFAIL".
 
 - memcg metadata storage optimizations from Shakeel Butt in "memcg:
   reduce memory consumption by memcg stats".
 
 - DAX fixes and maintenance work from Vishal Verma in the series
   "dax/bus.c: Fixups for dax-bus locking".
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCZkgQYwAKCRDdBJ7gKXxA
 jrdKAP9WVJdpEcXxpoub/vVE0UWGtffr8foifi9bCwrQrGh5mgEAx7Yf0+d/oBZB
 nvA4E0DcPrUAFy144FNM0NTCb7u9vAw=
 =V3R/
 -----END PGP SIGNATURE-----

Merge tag 'mm-stable-2024-05-17-19-19' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull mm updates from Andrew Morton:
 "The usual shower of singleton fixes and minor series all over MM,
  documented (hopefully adequately) in the respective changelogs.
  Notable series include:

   - Lucas Stach has provided some page-mapping cleanup/consolidation/
     maintainability work in the series "mm/treewide: Remove pXd_huge()
     API".

   - In the series "Allow migrate on protnone reference with
     MPOL_PREFERRED_MANY policy", Donet Tom has optimized mempolicy's
     MPOL_PREFERRED_MANY mode, yielding almost doubled performance in
     one test.

   - In their series "Memory allocation profiling" Kent Overstreet and
     Suren Baghdasaryan have contributed a means of determining (via
     /proc/allocinfo) whereabouts in the kernel memory is being
     allocated: number of calls and amount of memory.

   - Matthew Wilcox has provided the series "Various significant MM
     patches" which does a number of rather unrelated things, but in
     largely similar code sites.

   - In his series "mm: page_alloc: freelist migratetype hygiene"
     Johannes Weiner has fixed the page allocator's handling of
     migratetype requests, with resulting improvements in compaction
     efficiency.

   - In the series "make the hugetlb migration strategy consistent"
     Baolin Wang has fixed a hugetlb migration issue, which should
     improve hugetlb allocation reliability.

   - Liu Shixin has hit an I/O meltdown caused by readahead in a
     memory-tight memcg. Addressed in the series "Fix I/O high when
     memory almost met memcg limit".

   - In the series "mm/filemap: optimize folio adding and splitting"
     Kairui Song has optimized pagecache insertion, yielding ~10%
     performance improvement in one test.

   - Baoquan He has cleaned up and consolidated the early zone
     initialization code in the series "mm/mm_init.c: refactor
     free_area_init_core()".

   - Baoquan has also redone some MM initializatio code in the series
     "mm/init: minor clean up and improvement".

   - MM helper cleanups from Christoph Hellwig in his series "remove
     follow_pfn".

   - More cleanups from Matthew Wilcox in the series "Various
     page->flags cleanups".

   - Vlastimil Babka has contributed maintainability improvements in the
     series "memcg_kmem hooks refactoring".

   - More folio conversions and cleanups in Matthew Wilcox's series:
	"Convert huge_zero_page to huge_zero_folio"
	"khugepaged folio conversions"
	"Remove page_idle and page_young wrappers"
	"Use folio APIs in procfs"
	"Clean up __folio_put()"
	"Some cleanups for memory-failure"
	"Remove page_mapping()"
	"More folio compat code removal"

   - David Hildenbrand chipped in with "fs/proc/task_mmu: convert
     hugetlb functions to work on folis".

   - Code consolidation and cleanup work related to GUP's handling of
     hugetlbs in Peter Xu's series "mm/gup: Unify hugetlb, part 2".

   - Rick Edgecombe has developed some fixes to stack guard gaps in the
     series "Cover a guard gap corner case".

   - Jinjiang Tu has fixed KSM's behaviour after a fork+exec in the
     series "mm/ksm: fix ksm exec support for prctl".

   - Baolin Wang has implemented NUMA balancing for multi-size THPs.
     This is a simple first-cut implementation for now. The series is
     "support multi-size THP numa balancing".

   - Cleanups to vma handling helper functions from Matthew Wilcox in
     the series "Unify vma_address and vma_pgoff_address".

   - Some selftests maintenance work from Dev Jain in the series
     "selftests/mm: mremap_test: Optimizations and style fixes".

   - Improvements to the swapping of multi-size THPs from Ryan Roberts
     in the series "Swap-out mTHP without splitting".

   - Kefeng Wang has significantly optimized the handling of arm64's
     permission page faults in the series
	"arch/mm/fault: accelerate pagefault when badaccess"
	"mm: remove arch's private VM_FAULT_BADMAP/BADACCESS"

   - GUP cleanups from David Hildenbrand in "mm/gup: consistently call
     it GUP-fast".

   - hugetlb fault code cleanups from Vishal Moola in "Hugetlb fault
     path to use struct vm_fault".

   - selftests build fixes from John Hubbard in the series "Fix
     selftests/mm build without requiring "make headers"".

   - Memory tiering fixes/improvements from Ho-Ren (Jack) Chuang in the
     series "Improved Memory Tier Creation for CPUless NUMA Nodes".
     Fixes the initialization code so that migration between different
     memory types works as intended.

   - David Hildenbrand has improved follow_pte() and fixed an errant
     driver in the series "mm: follow_pte() improvements and acrn
     follow_pte() fixes".

   - David also did some cleanup work on large folio mapcounts in his
     series "mm: mapcount for large folios + page_mapcount() cleanups".

   - Folio conversions in KSM in Alex Shi's series "transfer page to
     folio in KSM".

   - Barry Song has added some sysfs stats for monitoring multi-size
     THP's in the series "mm: add per-order mTHP alloc and swpout
     counters".

   - Some zswap cleanups from Yosry Ahmed in the series "zswap
     same-filled and limit checking cleanups".

   - Matthew Wilcox has been looking at buffer_head code and found the
     documentation to be lacking. The series is "Improve buffer head
     documentation".

   - Multi-size THPs get more work, this time from Lance Yang. His
     series "mm/madvise: enhance lazyfreeing with mTHP in madvise_free"
     optimizes the freeing of these things.

   - Kemeng Shi has added more userspace-visible writeback
     instrumentation in the series "Improve visibility of writeback".

   - Kemeng Shi then sent some maintenance work on top in the series
     "Fix and cleanups to page-writeback".

   - Matthew Wilcox reduces mmap_lock traffic in the anon vma code in
     the series "Improve anon_vma scalability for anon VMAs". Intel's
     test bot reported an improbable 3x improvement in one test.

   - SeongJae Park adds some DAMON feature work in the series
	"mm/damon: add a DAMOS filter type for page granularity access recheck"
	"selftests/damon: add DAMOS quota goal test"

   - Also some maintenance work in the series
	"mm/damon/paddr: simplify page level access re-check for pageout"
	"mm/damon: misc fixes and improvements"

   - David Hildenbrand has disabled some known-to-fail selftests ni the
     series "selftests: mm: cow: flag vmsplice() hugetlb tests as
     XFAIL".

   - memcg metadata storage optimizations from Shakeel Butt in "memcg:
     reduce memory consumption by memcg stats".

   - DAX fixes and maintenance work from Vishal Verma in the series
     "dax/bus.c: Fixups for dax-bus locking""

* tag 'mm-stable-2024-05-17-19-19' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (426 commits)
  memcg, oom: cleanup unused memcg_oom_gfp_mask and memcg_oom_order
  selftests/mm: hugetlb_madv_vs_map: avoid test skipping by querying hugepage size at runtime
  mm/hugetlb: add missing VM_FAULT_SET_HINDEX in hugetlb_wp
  mm/hugetlb: add missing VM_FAULT_SET_HINDEX in hugetlb_fault
  selftests: cgroup: add tests to verify the zswap writeback path
  mm: memcg: make alloc_mem_cgroup_per_node_info() return bool
  mm/damon/core: fix return value from damos_wmark_metric_value
  mm: do not update memcg stats for NR_{FILE/SHMEM}_PMDMAPPED
  selftests: cgroup: remove redundant enabling of memory controller
  Docs/mm/damon/maintainer-profile: allow posting patches based on damon/next tree
  Docs/mm/damon/maintainer-profile: change the maintainer's timezone from PST to PT
  Docs/mm/damon/design: use a list for supported filters
  Docs/admin-guide/mm/damon/usage: fix wrong schemes effective quota update command
  Docs/admin-guide/mm/damon/usage: fix wrong example of DAMOS filter matching sysfs file
  selftests/damon: classify tests for functionalities and regressions
  selftests/damon/_damon_sysfs: use 'is' instead of '==' for 'None'
  selftests/damon/_damon_sysfs: find sysfs mount point from /proc/mounts
  selftests/damon/_damon_sysfs: check errors from nr_schemes file reads
  mm/damon/core: initialize ->esz_bp from damos_quota_init_priv()
  selftests/damon: add a test for DAMOS quota goal
  ...
2024-05-19 09:21:03 -07:00
Kees Cook
ae1a863bcd kunit/fortify: Fix memcmp() test to be amplitude agnostic
When memcmp() returns a non-zero value, only the signed bit has any
meaning. The actual value may differ between implementations.

Reported-by: Nathan Chancellor <nathan@kernel.org>
Closes: https://github.com/ClangBuiltLinux/linux/issues/2025
Tested-by: Nathan Chancellor <nathan@kernel.org>
Link: https://lore.kernel.org/r/20240518184020.work.604-kees@kernel.org
Signed-off-by: Kees Cook <keescook@chromium.org>
2024-05-18 13:46:10 -07:00
Kees Cook
890a64810d ubsan: Restore dependency on ARCH_HAS_UBSAN
While removing CONFIG_UBSAN_SANITIZE_ALL, ARCH_HAS_UBSAN wasn't correctly
depended on. Restore this, as we do not want to attempt UBSAN builds
unless it's actually been tested on a given architecture.

Reported-by: Masahiro Yamada <masahiroy@kernel.org>
Closes: https://lore.kernel.org/all/20240514095427.541201-1-masahiroy@kernel.org
Fixes: 918327e9b7 ("ubsan: Remove CONFIG_UBSAN_SANITIZE_ALL")
Link: https://lore.kernel.org/r/20240514233747.work.441-kees@kernel.org
Signed-off-by: Kees Cook <keescook@chromium.org>
2024-05-18 13:46:10 -07:00
Linus Torvalds
25f4874662 RDMA v6.10 merge window
Normal set of driver updates and small fixes:
 
 - Small improvements and fixes for erdma, efa, hfi1, bnxt_re
 
 - Fix a UAF crash after module unload on leaking restrack entry
 
 - Continue adding full RDMA support in mana with support for EQs, GID's
   and CQs
 
 - Improvements to the mkey cache in mlx5
 
 - DSCP traffic class support in hns and several bug fixes
 
 - Cap the maximum number of MADs in the receive queue to avoid OOM
 
 - Another batch of rxe bug fixes from large scale testing
 
 - __iowrite64_copy() optimizations for write combining MMIO memory
 
 - Remove NULL checks before dev_put/hold()
 
 - EFA support for receive with immediate
 
 - Fix a recent memleaking regression in a cma error path
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQRRRCHOFoQz/8F5bUaFwuHvBreFYQUCZkeo2gAKCRCFwuHvBreF
 YbuNAQChzGmS4F0JAn5Wj0CDvkZghELqtvzEb92SzqcgdyQafAD/fC7f23LJ4OsO
 1ZIaQEZu7j9DVg5PKFZ7WfdXjGTKqwA=
 =QRXg
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma

Pull rdma updates from Jason Gunthorpe:
 "Aside from the usual things this has an arch update for
  __iowrite64_copy() used by the RDMA drivers.

  This API was intended to generate large 64 byte MemWr TLPs on PCI.
  These days most processors had done this by just repeating writel() in
  a loop. S390 and some new ARM64 designs require a special helper to
  get this to generate.

   - Small improvements and fixes for erdma, efa, hfi1, bnxt_re

   - Fix a UAF crash after module unload on leaking restrack entry

   - Continue adding full RDMA support in mana with support for EQs,
     GID's and CQs

   - Improvements to the mkey cache in mlx5

   - DSCP traffic class support in hns and several bug fixes

   - Cap the maximum number of MADs in the receive queue to avoid OOM

   - Another batch of rxe bug fixes from large scale testing

   - __iowrite64_copy() optimizations for write combining MMIO memory

   - Remove NULL checks before dev_put/hold()

   - EFA support for receive with immediate

   - Fix a recent memleaking regression in a cma error path"

* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: (70 commits)
  RDMA/cma: Fix kmemleak in rdma_core observed during blktests nvme/rdma use siw
  RDMA/IPoIB: Fix format truncation compilation errors
  bnxt_re: avoid shift undefined behavior in bnxt_qplib_alloc_init_hwq
  RDMA/efa: Support QP with unsolicited write w/ imm. receive
  IB/hfi1: Remove generic .ndo_get_stats64
  IB/hfi1: Do not use custom stat allocator
  RDMA/hfi1: Use RMW accessors for changing LNKCTL2
  RDMA/mana_ib: implement uapi for creation of rnic cq
  RDMA/mana_ib: boundary check before installing cq callbacks
  RDMA/mana_ib: introduce a helper to remove cq callbacks
  RDMA/mana_ib: create and destroy RNIC cqs
  RDMA/mana_ib: create EQs for RNIC CQs
  RDMA/core: Remove NULL check before dev_{put, hold}
  RDMA/ipoib: Remove NULL check before dev_{put, hold}
  RDMA/mlx5: Remove NULL check before dev_{put, hold}
  RDMA/mlx5: Track DCT, DCI and REG_UMR QPs as diver_detail resources.
  RDMA/core: Add an option to display driver-specific QPs in the rdmatool
  RDMA/efa: Add shutdown notifier
  RDMA/mana_ib: Fix missing ret value
  IB/mlx5: Use __iowrite64_copy() for write combining stores
  ...
2024-05-18 13:04:15 -07:00
Linus Torvalds
ff9a79307f Kbuild updates for v6.10
- Avoid 'constexpr', which is a keyword in C23
 
  - Allow 'dtbs_check' and 'dt_compatible_check' run independently of
    'dt_binding_check'
 
  - Fix weak references to avoid GOT entries in position-independent
    code generation
 
  - Convert the last use of 'optional' property in arch/sh/Kconfig
 
  - Remove support for the 'optional' property in Kconfig
 
  - Remove support for Clang's ThinLTO caching, which does not work with
    the .incbin directive
 
  - Change the semantics of $(src) so it always points to the source
    directory, which fixes Makefile inconsistencies between upstream and
    downstream
 
  - Fix 'make tar-pkg' for RISC-V to produce a consistent package
 
  - Provide reasonable default coverage for objtool, sanitizers, and
    profilers
 
  - Remove redundant OBJECT_FILES_NON_STANDARD, KASAN_SANITIZE, etc.
 
  - Remove the last use of tristate choice in drivers/rapidio/Kconfig
 
  - Various cleanups and fixes in Kconfig
 -----BEGIN PGP SIGNATURE-----
 
 iQJJBAABCgAzFiEEbmPs18K1szRHjPqEPYsBB53g2wYFAmZFlGcVHG1hc2FoaXJv
 eUBrZXJuZWwub3JnAAoJED2LAQed4NsG8voQALC8NtFpduWVfLRj2Qg6Ll/xf1vX
 2igcTJEOFHkeqXLGoT8dTDKLEipUBUvKyguPq66CGwVTe2g6zy/nUSXeVtFrUsIa
 msLTi8FqhqUo5lodNvGMRf8qqmuqcvnXoiQwIocF92jtsFy14bhiFY+n4HfcFNjj
 GOKwqBZYQUwY/VVb090efc7RfS9c7uwABJSBelSoxg3AGZriwjGy7Pw5aSKGgVYi
 inqL1eR6qwPP6z7CgQWM99soP+zwybFZmnQrsD9SniRBI4rtAat8Ih5jQFaSUFUQ
 lk2w0NQBRFN88/uR2IJ2GWuIlQ74WeJ+QnCqVuQ59tV5zw90wqSmLzngfPD057Dv
 JjNuhk0UyXVtpIg3lRtd4810ppNSTe33b9OM4O2H846W/crju5oDRNDHcflUXcwm
 Rmn5ho1rb5QVzDVejJbgwidnUInSgJ9PZcvXQ/RJVZPhpgsBzAY9pQexG1G3hviw
 y9UDrt6KP6bF9tHjmolmtdIes9Pj0c4dN6/Rdj4HS4hIQ/GDar0tnwvOvtfUctNL
 orJlBsA6GeMmDVXKkR0ytOCWRYqWWbyt8g70RVKQJfuHX7/hGyAQPaQ2/u4mQhC2
 aevYfbNJMj0VDfGz81HDBKFtkc5n+Ite8l157dHEl2LEabkOkRdNVcn7SNbOvZmd
 ZCSnZ31h7woGfNho
 =D5B/
 -----END PGP SIGNATURE-----

Merge tag 'kbuild-v6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild

Pull Kbuild updates from Masahiro Yamada:

 - Avoid 'constexpr', which is a keyword in C23

 - Allow 'dtbs_check' and 'dt_compatible_check' run independently of
   'dt_binding_check'

 - Fix weak references to avoid GOT entries in position-independent code
   generation

 - Convert the last use of 'optional' property in arch/sh/Kconfig

 - Remove support for the 'optional' property in Kconfig

 - Remove support for Clang's ThinLTO caching, which does not work with
   the .incbin directive

 - Change the semantics of $(src) so it always points to the source
   directory, which fixes Makefile inconsistencies between upstream and
   downstream

 - Fix 'make tar-pkg' for RISC-V to produce a consistent package

 - Provide reasonable default coverage for objtool, sanitizers, and
   profilers

 - Remove redundant OBJECT_FILES_NON_STANDARD, KASAN_SANITIZE, etc.

 - Remove the last use of tristate choice in drivers/rapidio/Kconfig

 - Various cleanups and fixes in Kconfig

* tag 'kbuild-v6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: (46 commits)
  kconfig: use sym_get_choice_menu() in sym_check_prop()
  rapidio: remove choice for enumeration
  kconfig: lxdialog: remove initialization with A_NORMAL
  kconfig: m/nconf: merge two item_add_str() calls
  kconfig: m/nconf: remove dead code to display value of bool choice
  kconfig: m/nconf: remove dead code to display children of choice members
  kconfig: gconf: show checkbox for choice correctly
  kbuild: use GCOV_PROFILE and KCSAN_SANITIZE in scripts/Makefile.modfinal
  Makefile: remove redundant tool coverage variables
  kbuild: provide reasonable defaults for tool coverage
  modules: Drop the .export_symbol section from the final modules
  kconfig: use menu_list_for_each_sym() in sym_check_choice_deps()
  kconfig: use sym_get_choice_menu() in conf_write_defconfig()
  kconfig: add sym_get_choice_menu() helper
  kconfig: turn defaults and additional prompt for choice members into error
  kconfig: turn missing prompt for choice members into error
  kconfig: turn conf_choice() into void function
  kconfig: use linked list in sym_set_changed()
  kconfig: gconf: use MENU_CHANGED instead of SYMBOL_CHANGED
  kconfig: gconf: remove debug code
  ...
2024-05-18 12:39:20 -07:00
Linus Torvalds
70a663205d Probes updates for v6.10:
- tracing/probes: Adding new pseudo-types %pd and %pD support for dumping
   dentry name from 'struct dentry *' and file name from 'struct file *'.
 
 - uprobes: Some performance optimizations have been done.
  . Speed up the BPF uprobe event by delaying the fetching of the uprobe
    event arguments that are not used in BPF.
  . Avoid locking by speculatively checking whether uprobe event is valid.
  . Reduce lock contention by using read/write_lock instead of spinlock for
    uprobe list operation. This improved BPF uprobe benchmark result 43% on
    average.
 
 - rethook: Removes non-fatal warning messages when tracing stack from BPF
   and skip rcu_is_watching() validation in rethook if possible.
 
 - objpool: Optimizing objpool (which is used by kretprobes and fprobe as
   rethook backend storage) by inlining functions and avoid caching nr_cpu_ids
   because it is a const value.
 
 - fprobe: Add entry/exit callbacks types (code cleanup)
 - kprobes: Check ftrace was killed in kprobes if it uses ftrace.
 -----BEGIN PGP SIGNATURE-----
 
 iQFPBAABCgA5FiEEh7BulGwFlgAOi5DV2/sHvwUrPxsFAmZFUxsbHG1hc2FtaS5o
 aXJhbWF0c3VAZ21haWwuY29tAAoJENv7B78FKz8b+fIH/A96/SeC5WRLhXmHfTCM
 IvKUea2n0b0oV/2pVfHqfkCBTICuUZ97Opd9VH9jLtjBOTh0fUOGZ2DNVGdSYfWm
 IIkS5dhuZxHXrSHEVYykwLHI3AOL7Q6Ny9EmOg1CNMidUkPMNtBvppsBYPlFU/B/
 qQJAvOdkVOnNITCaas0+MNgepoVVKdJzdNQ1I4WrGyG8isCZBaCYKo2QcGyheCNN
 y8NXvnVHgmgHQ8nTaeE5AawclFzFnhwHfPQPe1kiyGrx15b8K+VYmaZxPKv33A1a
 KT3TKJ1Ep7s7iWFh2iPVJzIwOXCmSnvNTKfNx/MDuKtO7UVfFwytoMEaekbmv3bG
 VqM=
 =n/mW
 -----END PGP SIGNATURE-----

Merge tag 'probes-v6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace

Pull probes updates from Masami Hiramatsu:

 - tracing/probes: Add new pseudo-types %pd and %pD support for dumping
   dentry name from 'struct dentry *' and file name from 'struct file *'

 - uprobes performance optimizations:
    - Speed up the BPF uprobe event by delaying the fetching of the
      uprobe event arguments that are not used in BPF
    - Avoid locking by speculatively checking whether uprobe event is
      valid
    - Reduce lock contention by using read/write_lock instead of
      spinlock for uprobe list operation. This improved BPF uprobe
      benchmark result 43% on average

 - rethook: Remove non-fatal warning messages when tracing stack from
   BPF and skip rcu_is_watching() validation in rethook if possible

 - objpool: Optimize objpool (which is used by kretprobes and fprobe as
   rethook backend storage) by inlining functions and avoid caching
   nr_cpu_ids because it is a const value

 - fprobe: Add entry/exit callbacks types (code cleanup)

 - kprobes: Check ftrace was killed in kprobes if it uses ftrace

* tag 'probes-v6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
  kprobe/ftrace: bail out if ftrace was killed
  selftests/ftrace: Fix required features for VFS type test case
  objpool: cache nr_possible_cpus() and avoid caching nr_cpu_ids
  objpool: enable inlining objpool_push() and objpool_pop() operations
  rethook: honor CONFIG_FTRACE_VALIDATE_RCU_IS_WATCHING in rethook_try_get()
  ftrace: make extra rcu_is_watching() validation check optional
  uprobes: reduce contention on uprobes_tree access
  rethook: Remove warning messages printed for finding return address of a frame.
  fprobe: Add entry/exit callbacks types
  selftests/ftrace: add fprobe test cases for VFS type "%pd" and "%pD"
  selftests/ftrace: add kprobe test cases for VFS type "%pd" and "%pD"
  Documentation: tracing: add new type '%pd' and '%pD' for kprobe
  tracing/probes: support '%pD' type for print struct file's name
  tracing/probes: support '%pd' type for print struct dentry's name
  uprobes: add speculative lockless system-wide uprobe filter check
  uprobes: prepare uprobe args buffer lazily
  uprobes: encapsulate preparation of uprobe args buffer
2024-05-17 18:29:30 -07:00
Linus Torvalds
1b294a1f35 Networking changes for 6.10.
Core & protocols
 ----------------
 
  - Complete rework of garbage collection of AF_UNIX sockets.
    AF_UNIX is prone to forming reference count cycles due to fd passing
    functionality. New method based on Tarjan's Strongly Connected Components
    algorithm should be both faster and remove a lot of workarounds
    we accumulated over the years.
 
  - Add TCP fraglist GRO support, allowing chaining multiple TCP packets
    and forwarding them together. Useful for small switches / routers which
    lack basic checksum offload in some scenarios (e.g. PPPoE).
 
  - Support using SMP threads for handling packet backlog i.e. packet
    processing from software interfaces and old drivers which don't
    use NAPI. This helps move the processing out of the softirq jumble.
 
  - Continue work of converting from rtnl lock to RCU protection.
    Don't require rtnl lock when reading: IPv6 routing FIB, IPv6 address
    labels, netdev threaded NAPI sysfs files, bonding driver's sysfs files,
    MPLS devconf, IPv4 FIB rules, netns IDs, tcp metrics, TC Qdiscs,
    neighbor entries, ARP entries via ioctl(SIOCGARP), a lot of the link
    information available via rtnetlink.
 
  - Small optimizations from Eric to UDP wake up handling, memory accounting,
    RPS/RFS implementation, TCP packet sizing etc.
 
  - Allow direct page recycling in the bulk API used by XDP, for +2% PPS.
 
  - Support peek with an offset on TCP sockets.
 
  - Add MPTCP APIs for querying last time packets were received/sent/acked,
    and whether MPTCP "upgrade" succeeded on a TCP socket.
 
  - Add intra-node communication shortcut to improve SMC performance.
 
  - Add IPv6 (and IPv{4,6}-over-IPv{4,6}) support to the GTP protocol driver.
 
  - Add HSR-SAN (RedBOX) mode of operation to the HSR protocol driver.
 
  - Add reset reasons for tracing what caused a TCP reset to be sent.
 
  - Introduce direction attribute for xfrm (IPSec) states.
    State can be used either for input or output packet processing.
 
 Things we sprinkled into general kernel code
 --------------------------------------------
 
  - Add bitmap_{read,write}(), bitmap_size(), expose BYTES_TO_BITS().
    This required touch-ups and renaming of a few existing users.
 
  - Add Endian-dependent __counted_by_{le,be} annotations.
 
  - Make building selftests "quieter" by printing summaries like
    "CC object.o" rather than full commands with all the arguments.
 
 Netfilter
 ---------
 
  - Use GFP_KERNEL to clone elements, to deal better with OOM situations
    and avoid failures in the .commit step.
 
 BPF
 ---
 
  - Add eBPF JIT for ARCv2 CPUs.
 
  - Support attaching kprobe BPF programs through kprobe_multi link in
    a session mode, meaning, a BPF program is attached to both function entry
    and return, the entry program can decide if the return program gets
    executed and the entry program can share u64 cookie value with return
    program. "Session mode" is a common use-case for tetragon and bpftrace.
 
  - Add the ability to specify and retrieve BPF cookie for raw tracepoint
    programs in order to ease migration from classic to raw tracepoints.
 
  - Add an internal-only BPF per-CPU instruction for resolving per-CPU
    memory addresses and implement support in x86, ARM64 and RISC-V JITs.
    This allows inlining functions which need to access per-CPU state.
 
  - Optimize x86 BPF JIT's emit_mov_imm64, and add support for various
    atomics in bpf_arena which can be JITed as a single x86 instruction.
    Support BPF arena on ARM64.
 
  - Add a new bpf_wq API for deferring events and refactor process-context
    bpf_timer code to keep common code where possible.
 
  - Harden the BPF verifier's and/or/xor value tracking.
 
  - Introduce crypto kfuncs to let BPF programs call kernel crypto APIs.
 
  - Support bpf_tail_call_static() helper for BPF programs with GCC 13.
 
  - Add bpf_preempt_{disable,enable}() kfuncs in order to allow a BPF
    program to have code sections where preemption is disabled.
 
 Driver API
 ----------
 
  - Skip software TC processing completely if all installed rules are
    marked as HW-only, instead of checking the HW-only flag rule by rule.
 
  - Add support for configuring PoE (Power over Ethernet), similar to
    the already existing support for PoDL (Power over Data Line) config.
 
  - Initial bits of a queue control API, for now allowing a single queue
    to be reset without disturbing packet flow to other queues.
 
  - Common (ethtool) statistics for hardware timestamping.
 
 Tests and tooling
 -----------------
 
  - Remove the need to create a config file to run the net forwarding tests
    so that a naive "make run_tests" can exercise them.
 
  - Define a method of writing tests which require an external endpoint
    to communicate with (to send/receive data towards the test machine).
    Add a few such tests.
 
  - Create a shared code library for writing Python tests. Expose the YAML
    Netlink library from tools/ to the tests for easy Netlink access.
 
  - Move netfilter tests under net/, extend them, separate performance tests
    from correctness tests, and iron out issues found by running them
    "on every commit".
 
  - Refactor BPF selftests to use common network helpers.
 
  - Further work filling in YAML definitions of Netlink messages for:
    nftables, team driver, bonding interfaces, vlan interfaces, VF info,
    TC u32 mark, TC police action.
 
  - Teach Python YAML Netlink to decode attribute policies.
 
  - Extend the definition of the "indexed array" construct in the specs
    to cover arrays of scalars rather than just nests.
 
  - Add hyperlinks between definitions in generated Netlink docs.
 
 Drivers
 -------
 
  - Make sure unsupported flower control flags are rejected by drivers,
    and make more drivers report errors directly to the application rather
    than dmesg (large number of driver changes from Asbjørn Sloth Tønnesen).
 
  - Ethernet high-speed NICs:
    - Broadcom (bnxt):
      - support multiple RSS contexts and steering traffic to them
      - support XDP metadata
      - make page pool allocations more NUMA aware
    - Intel (100G, ice, idpf):
      - extract datapath code common among Intel drivers into a library
      - use fewer resources in switchdev by sharing queues with the PF
      - add PFCP filter support
      - add Ethernet filter support
      - use a spinlock instead of HW lock in PTP clock ops
      - support 5 layer Tx scheduler topology
    - nVidia/Mellanox:
      - 800G link modes and 100G SerDes speeds
      - per-queue IRQ coalescing configuration
    - Marvell Octeon:
      - support offloading TC packet mark action
 
  - Ethernet NICs consumer, embedded and virtual:
    - stop lying about skb->truesize in USB Ethernet drivers, it messes up
      TCP memory calculations
    - Google cloud vNIC:
      - support changing ring size via ethtool
      - support ring reset using the queue control API
    - VirtIO net:
      - expose flow hash from RSS to XDP
      - per-queue statistics
      - add selftests
    - Synopsys (stmmac):
      - support controllers which require an RX clock signal from the MII
        bus to perform their hardware initialization
    - TI:
      - icssg_prueth: support ICSSG-based Ethernet on AM65x SR1.0 devices
      - icssg_prueth: add SW TX / RX Coalescing based on hrtimers
      - cpsw: minimal XDP support
    - Renesas (ravb):
      - support describing the MDIO bus
    - Realtek (r8169):
      - add support for RTL8168M
    - Microchip Sparx5:
      - matchall and flower actions mirred and redirect
 
  - Ethernet switches:
    - nVidia/Mellanox:
      - improve events processing performance
    - Marvell:
      - add support for MV88E6250 family internal PHYs
    - Microchip:
      - add DCB and DSCP mapping support for KSZ switches
      - vsc73xx: convert to PHYLINK
    - Realtek:
      - rtl8226b/rtl8221b: add C45 instances and SerDes switching
 
  - Many driver changes related to PHYLIB and PHYLINK deprecated API cleanup.
 
  - Ethernet PHYs:
    - Add a new driver for Airoha EN8811H 2.5 Gigabit PHY.
    - micrel: lan8814: add support for PPS out and external timestamp trigger
 
  - WiFi:
    - Disable Wireless Extensions (WEXT) in all Wi-Fi 7 devices drivers.
      Modern devices can only be configured using nl80211.
    - mac80211/cfg80211
      - handle color change per link for WiFi 7 Multi-Link Operation
    - Intel (iwlwifi):
      - don't support puncturing in 5 GHz
      - support monitor mode on passive channels
      - BZ-W device support
      - P2P with HE/EHT support
      - re-add support for firmware API 90
      - provide channel survey information for Automatic Channel Selection
    - MediaTek (mt76):
      - mt7921 LED control
      - mt7925 EHT radiotap support
      - mt7920e PCI support
    - Qualcomm (ath11k):
      - P2P support for QCA6390, WCN6855 and QCA2066
      - support hibernation
      - ieee80211-freq-limit Device Tree property support
    - Qualcomm (ath12k):
      - refactoring in preparation of multi-link support
      - suspend and hibernation support
      - ACPI support
      - debugfs support, including dfs_simulate_radar support
    - RealTek:
      - rtw88: RTL8723CS SDIO device support
      - rtw89: RTL8922AE Wi-Fi 7 PCI device support
      - rtw89: complete features of new WiFi 7 chip 8922AE including
        BT-coexistence and Wake-on-WLAN
      - rtw89: use BIOS ACPI settings to set TX power and channels
      - rtl8xxxu: enable Management Frame Protection (MFP) support
 
  - Bluetooth:
    - support for Intel BlazarI and Filmore Peak2 (BE201)
    - support for MediaTek MT7921S SDIO
    - initial support for Intel PCIe BT driver
    - remove HCI_AMP support
 
 Signed-off-by: Jakub Kicinski <kuba@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEE6jPA+I1ugmIBA4hXMUZtbf5SIrsFAmZD6sQACgkQMUZtbf5S
 IrtLYw/+I73ePGIye37o2jpbodcLAUZVfF3r6uYUzK8hokEcKD0QVJa9w7PizLZ3
 UO45ClOXFLJCkfP4reFenLfxGCel2AJI+F7VFl2xaO2XgrcH/lnVrHqKZEAEXjls
 KoYMnShIolv7h2MKP6hHtyTi2j1wvQUKsZC71o9/fuW+4fUT8gECx1YtYcL73wrw
 gEMdlUgBYC3jiiCUHJIFX6iPJ2t/TC+q1eIIF2K/Osrk2kIqQhzoozcL4vpuAZQT
 99ljx/qRelXa8oppDb7nM5eulg7WY8ZqxEfFZphTMC5nLEGzClxuOTTl2kDYI/D/
 UZmTWZDY+F5F0xvNk2gH84qVJXBOVDoobpT7hVA/tDuybobc/kvGDzRayEVqVzKj
 Q0tPlJs+xBZpkK5TVnxaFLJVOM+p1Xosxy3kNVXmuYNBvT/R89UbJiCrUKqKZF+L
 z/1mOYUv8UklHqYAeuJSptHvqJjTGa/fsEYP7dAUBbc1N2eVB8mzZ4mgU5rYXbtC
 E6UXXiWnoSRm8bmco9QmcWWoXt5UGEizHSJLz6t1R5Df/YmXhWlytll5aCwY1ksf
 FNoL7S4u7AZThL1Nwi7yUs4CAjhk/N4aOsk+41S0sALCx30BJuI6UdesAxJ0lu+Z
 fwCQYbs27y4p7mBLbkYwcQNxAxGm7PSK4yeyRIy2njiyV4qnLf8=
 =EsC2
 -----END PGP SIGNATURE-----

Merge tag 'net-next-6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next

Pull networking updates from Jakub Kicinski:
 "Core & protocols:

   - Complete rework of garbage collection of AF_UNIX sockets.

     AF_UNIX is prone to forming reference count cycles due to fd
     passing functionality. New method based on Tarjan's Strongly
     Connected Components algorithm should be both faster and remove a
     lot of workarounds we accumulated over the years.

   - Add TCP fraglist GRO support, allowing chaining multiple TCP
     packets and forwarding them together. Useful for small switches /
     routers which lack basic checksum offload in some scenarios (e.g.
     PPPoE).

   - Support using SMP threads for handling packet backlog i.e. packet
     processing from software interfaces and old drivers which don't use
     NAPI. This helps move the processing out of the softirq jumble.

   - Continue work of converting from rtnl lock to RCU protection.

     Don't require rtnl lock when reading: IPv6 routing FIB, IPv6
     address labels, netdev threaded NAPI sysfs files, bonding driver's
     sysfs files, MPLS devconf, IPv4 FIB rules, netns IDs, tcp metrics,
     TC Qdiscs, neighbor entries, ARP entries via ioctl(SIOCGARP), a lot
     of the link information available via rtnetlink.

   - Small optimizations from Eric to UDP wake up handling, memory
     accounting, RPS/RFS implementation, TCP packet sizing etc.

   - Allow direct page recycling in the bulk API used by XDP, for +2%
     PPS.

   - Support peek with an offset on TCP sockets.

   - Add MPTCP APIs for querying last time packets were received/sent/acked
     and whether MPTCP "upgrade" succeeded on a TCP socket.

   - Add intra-node communication shortcut to improve SMC performance.

   - Add IPv6 (and IPv{4,6}-over-IPv{4,6}) support to the GTP protocol
     driver.

   - Add HSR-SAN (RedBOX) mode of operation to the HSR protocol driver.

   - Add reset reasons for tracing what caused a TCP reset to be sent.

   - Introduce direction attribute for xfrm (IPSec) states. State can be
     used either for input or output packet processing.

  Things we sprinkled into general kernel code:

   - Add bitmap_{read,write}(), bitmap_size(), expose BYTES_TO_BITS().

     This required touch-ups and renaming of a few existing users.

   - Add Endian-dependent __counted_by_{le,be} annotations.

   - Make building selftests "quieter" by printing summaries like
     "CC object.o" rather than full commands with all the arguments.

  Netfilter:

   - Use GFP_KERNEL to clone elements, to deal better with OOM
     situations and avoid failures in the .commit step.

  BPF:

   - Add eBPF JIT for ARCv2 CPUs.

   - Support attaching kprobe BPF programs through kprobe_multi link in
     a session mode, meaning, a BPF program is attached to both function
     entry and return, the entry program can decide if the return
     program gets executed and the entry program can share u64 cookie
     value with return program. "Session mode" is a common use-case for
     tetragon and bpftrace.

   - Add the ability to specify and retrieve BPF cookie for raw
     tracepoint programs in order to ease migration from classic to raw
     tracepoints.

   - Add an internal-only BPF per-CPU instruction for resolving per-CPU
     memory addresses and implement support in x86, ARM64 and RISC-V
     JITs. This allows inlining functions which need to access per-CPU
     state.

   - Optimize x86 BPF JIT's emit_mov_imm64, and add support for various
     atomics in bpf_arena which can be JITed as a single x86
     instruction. Support BPF arena on ARM64.

   - Add a new bpf_wq API for deferring events and refactor
     process-context bpf_timer code to keep common code where possible.

   - Harden the BPF verifier's and/or/xor value tracking.

   - Introduce crypto kfuncs to let BPF programs call kernel crypto
     APIs.

   - Support bpf_tail_call_static() helper for BPF programs with GCC 13.

   - Add bpf_preempt_{disable,enable}() kfuncs in order to allow a BPF
     program to have code sections where preemption is disabled.

  Driver API:

   - Skip software TC processing completely if all installed rules are
     marked as HW-only, instead of checking the HW-only flag rule by
     rule.

   - Add support for configuring PoE (Power over Ethernet), similar to
     the already existing support for PoDL (Power over Data Line)
     config.

   - Initial bits of a queue control API, for now allowing a single
     queue to be reset without disturbing packet flow to other queues.

   - Common (ethtool) statistics for hardware timestamping.

  Tests and tooling:

   - Remove the need to create a config file to run the net forwarding
     tests so that a naive "make run_tests" can exercise them.

   - Define a method of writing tests which require an external endpoint
     to communicate with (to send/receive data towards the test
     machine). Add a few such tests.

   - Create a shared code library for writing Python tests. Expose the
     YAML Netlink library from tools/ to the tests for easy Netlink
     access.

   - Move netfilter tests under net/, extend them, separate performance
     tests from correctness tests, and iron out issues found by running
     them "on every commit".

   - Refactor BPF selftests to use common network helpers.

   - Further work filling in YAML definitions of Netlink messages for:
     nftables, team driver, bonding interfaces, vlan interfaces, VF
     info, TC u32 mark, TC police action.

   - Teach Python YAML Netlink to decode attribute policies.

   - Extend the definition of the "indexed array" construct in the specs
     to cover arrays of scalars rather than just nests.

   - Add hyperlinks between definitions in generated Netlink docs.

  Drivers:

   - Make sure unsupported flower control flags are rejected by drivers,
     and make more drivers report errors directly to the application
     rather than dmesg (large number of driver changes from Asbjørn
     Sloth Tønnesen).

   - Ethernet high-speed NICs:
      - Broadcom (bnxt):
         - support multiple RSS contexts and steering traffic to them
         - support XDP metadata
         - make page pool allocations more NUMA aware
      - Intel (100G, ice, idpf):
         - extract datapath code common among Intel drivers into a library
         - use fewer resources in switchdev by sharing queues with the PF
         - add PFCP filter support
         - add Ethernet filter support
         - use a spinlock instead of HW lock in PTP clock ops
         - support 5 layer Tx scheduler topology
      - nVidia/Mellanox:
         - 800G link modes and 100G SerDes speeds
         - per-queue IRQ coalescing configuration
      - Marvell Octeon:
         - support offloading TC packet mark action

   - Ethernet NICs consumer, embedded and virtual:
      - stop lying about skb->truesize in USB Ethernet drivers, it
        messes up TCP memory calculations
      - Google cloud vNIC:
         - support changing ring size via ethtool
         - support ring reset using the queue control API
      - VirtIO net:
         - expose flow hash from RSS to XDP
         - per-queue statistics
         - add selftests
      - Synopsys (stmmac):
         - support controllers which require an RX clock signal from the
           MII bus to perform their hardware initialization
      - TI:
         - icssg_prueth: support ICSSG-based Ethernet on AM65x SR1.0 devices
         - icssg_prueth: add SW TX / RX Coalescing based on hrtimers
         - cpsw: minimal XDP support
      - Renesas (ravb):
         - support describing the MDIO bus
      - Realtek (r8169):
         - add support for RTL8168M
      - Microchip Sparx5:
         - matchall and flower actions mirred and redirect

   - Ethernet switches:
      - nVidia/Mellanox:
         - improve events processing performance
      - Marvell:
         - add support for MV88E6250 family internal PHYs
      - Microchip:
         - add DCB and DSCP mapping support for KSZ switches
         - vsc73xx: convert to PHYLINK
      - Realtek:
         - rtl8226b/rtl8221b: add C45 instances and SerDes switching

   - Many driver changes related to PHYLIB and PHYLINK deprecated API
     cleanup

   - Ethernet PHYs:
      - Add a new driver for Airoha EN8811H 2.5 Gigabit PHY.
      - micrel: lan8814: add support for PPS out and external timestamp trigger

   - WiFi:
      - Disable Wireless Extensions (WEXT) in all Wi-Fi 7 devices
        drivers. Modern devices can only be configured using nl80211.
      - mac80211/cfg80211
         - handle color change per link for WiFi 7 Multi-Link Operation
      - Intel (iwlwifi):
         - don't support puncturing in 5 GHz
         - support monitor mode on passive channels
         - BZ-W device support
         - P2P with HE/EHT support
         - re-add support for firmware API 90
         - provide channel survey information for Automatic Channel Selection
      - MediaTek (mt76):
         - mt7921 LED control
         - mt7925 EHT radiotap support
         - mt7920e PCI support
      - Qualcomm (ath11k):
         - P2P support for QCA6390, WCN6855 and QCA2066
         - support hibernation
         - ieee80211-freq-limit Device Tree property support
      - Qualcomm (ath12k):
         - refactoring in preparation of multi-link support
         - suspend and hibernation support
         - ACPI support
         - debugfs support, including dfs_simulate_radar support
      - RealTek:
         - rtw88: RTL8723CS SDIO device support
         - rtw89: RTL8922AE Wi-Fi 7 PCI device support
         - rtw89: complete features of new WiFi 7 chip 8922AE including
           BT-coexistence and Wake-on-WLAN
         - rtw89: use BIOS ACPI settings to set TX power and channels
         - rtl8xxxu: enable Management Frame Protection (MFP) support

   - Bluetooth:
      - support for Intel BlazarI and Filmore Peak2 (BE201)
      - support for MediaTek MT7921S SDIO
      - initial support for Intel PCIe BT driver
      - remove HCI_AMP support"

* tag 'net-next-6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1827 commits)
  selftests: netfilter: fix packetdrill conntrack testcase
  net: gro: fix napi_gro_cb zeroed alignment
  Bluetooth: btintel_pcie: Refactor and code cleanup
  Bluetooth: btintel_pcie: Fix warning reported by sparse
  Bluetooth: hci_core: Fix not handling hdev->le_num_of_adv_sets=1
  Bluetooth: btintel: Fix compiler warning for multi_v7_defconfig config
  Bluetooth: btintel_pcie: Fix compiler warnings
  Bluetooth: btintel_pcie: Add *setup* function to download firmware
  Bluetooth: btintel_pcie: Add support for PCIe transport
  Bluetooth: btintel: Export few static functions
  Bluetooth: HCI: Remove HCI_AMP support
  Bluetooth: L2CAP: Fix div-by-zero in l2cap_le_flowctl_init()
  Bluetooth: qca: Fix error code in qca_read_fw_build_info()
  Bluetooth: hci_conn: Use __counted_by() and avoid -Wfamnae warning
  Bluetooth: btintel: Add support for Filmore Peak2 (BE201)
  Bluetooth: btintel: Add support for BlazarI
  LE Create Connection command timeout increased to 20 secs
  dt-bindings: net: bluetooth: Add MediaTek MT7921S SDIO Bluetooth
  Bluetooth: compute LE flow credits based on recvbuf space
  Bluetooth: hci_sync: Use cmd->num_cis instead of magic number
  ...
2024-05-14 19:42:24 -07:00
Linus Torvalds
896d3fce84 linux_kselftest-kunit-6.10-rc1
This kunit update for Linux 6.10-rc1 consists of:
 
 - fix to race condition in try-catch completion
 - change to __kunit_test_suites_init() to exit early if there is
   nothing to test
 - change to string-stream-test to use KUNIT_DEFINE_ACTION_WRAPPER
 - moving fault tests behind KUNIT_FAULT_TEST Kconfig option
 - kthread test fixes and improvements
 - iov_iter test fixes
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEPZKym/RZuOCGeA/kCwJExA0NQxwFAmZCNsAACgkQCwJExA0N
 Qxx03w/9EmjF3T16LPaeuerdoypWDcroDT6gpoFXGrvf3lDrna8uDNija5Pb1yMn
 l97wla3IJ1EZRMTy1jgWGQiiGIdkV8hcze65HZMi19qx/49TUbhA/pTmpYC56cp9
 sk2fBjOHz8iI4kdL4eCMr9MpSiwOIDcfWOr1Lh/AP2LHOU1pRdFZbwO6iZ3wyGlJ
 JH4D1CwmfgMGEau4qUo0jvuRbFAf33S+yEI9gr8CskPItljFVO4jVz4lprnTbU9i
 qAOivHzwcHyYc0upb6q2vIlp8vhmDygG/m07lnwfF7ZHsYo+3zV4FkxHspN2+jGA
 frH7Y0X9zt6YjRRMb9NcNnI67VTiSNzdCvB7urUhKlbXoZ2gjtgB7zHeQtAhlXRo
 XVa4QgWBI5ExKBuLI+0yKo4wEO8M0quXxhbX+2Q+tsRnoYmhwb0G8AUyl/26bt2g
 RelGrArDS5eMrlxl97rjMGFrB5Uan2MR751tl+aZPgyNRW3tRKJnQLZmM1z8aFQp
 vGReT6POzCnQ1wLUkcj6mnObbv9XuuYY1BQgKCtmJflvRToEuwpLOKK8Uca7ou3p
 TbVarGIn0jdHv4zGkXrAkt/mhcxanBXhVKLfh/MqQ7fCZBULkSrjJFLhCpvvHwIV
 nckaP2sZWls6FTDuawFOUxrr/+LjJchMmHhFy9MiDaVoieiTg6U=
 =3QIa
 -----END PGP SIGNATURE-----

Merge tag 'linux_kselftest-kunit-6.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest

Pull kunit updates from Shuah Khan:

 - fix race condition in try-catch completion

 - change __kunit_test_suites_init() to exit early if there is
   nothing to test

 - change string-stream-test to use KUNIT_DEFINE_ACTION_WRAPPER

 - move fault tests behind KUNIT_FAULT_TEST Kconfig option

 - kthread test fixes and improvements

 - iov_iter test fixes

* tag 'linux_kselftest-kunit-6.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
  kunit: bail out early in __kunit_test_suites_init() if there are no suites to test
  kunit: string-stream-test: use KUNIT_DEFINE_ACTION_WRAPPER
  kunit: test: Move fault tests behind KUNIT_FAULT_TEST Kconfig option
  kunit: unregister the device on error
  kunit: Fix race condition in try-catch completion
  kunit: Add tests for fault
  kunit: Print last test location on fault
  kunit: Fix KUNIT_SUCCESS() calls in iov_iter tests
  kunit: Handle test faults
  kunit: Fix timeout message
  kunit: Fix kthread reference
  kunit: Handle thread creation error
2024-05-14 11:32:52 -07:00
Linus Torvalds
6bfd2d442a Updates for the interrupt subsystem:
- Core code:
 
    - Interrupt storm detection for the lockup watchdog:
 
      Lockups which are caused by interrupt storms are not easy to debug
      because there is no information about the events which make the lockup
      detector trigger.
 
      To make this more user friendly, provide an extenstion to interrupt
      statistics which allows to take snapshots and an interface to retrieve
      the delta to the snapshot. Use this new mechanism in the watchdog code
      to do a two stage lockup analysis by taking the snapshot and printing
      the deltas for the topmost active interrupts on the second trigger.
 
      Note: This contains both the interrupt and the watchdog changes as
      the latter depend on the former obviously.
 
   - Avoid summation loops in the /proc/interrupts output and use the global
     counter when possible
 
   - Skip suspended interrupts on CPU hotplug operations to ensure that they
     are not delivered before the system resumes the device drivers when
     coming out of suspend.
 
   - On CPU hot-unplug interrupts which are affine to the outgoing CPU are
     migrated to a different CPU in the affinity mask. This can fail when
     the CPUs have no vectors left. Instead of giving up try to migrate it
     to any online CPU and thereby breaking the affinity setting in order to
     prevent a stale device interrupt which targets an offline CPU
 
   - The usual small cleanups
 
  - Driver code:
 
   - Support for the RISCV AIA MSI controller
 
   - Make the interrupt allocation for the Loongson PCH controller more
     flexible to prevent vector exhaustion
 
   - The usual set of cleanups and fixes all over the place
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAmZBCM0THHRnbHhAbGlu
 dXRyb25peC5kZQAKCRCmGPVMDXSYoeZHEACqMLN3K+1HyWflYtcTHJeYCjZLHS77
 2tQeKaaskOA4W6dcGXPxMw5CHqAobHVQQMqgcJxhUdqQiOJnFFnrtCD7JtqM0hWK
 UORNbyeovuhAo+iJ0fTuS8p63H7vm2GIWwBLWJnOuChYv/6Yyx5Cald1skvyvbzL
 zePhiiAf5mkdmJMeT5wJSCqEWSRYOXsVAJ/0YAwFG3bKkJH3bmDo6SDJY02sXT5P
 pjbtD/0hum9wIVT4fNdYleHHQMdBdj9dLlcxXBikHq50mDMw7GxvjKiLcXmoerw3
 rEBfVVJp3qpSofpNJZ3HH0ywcF3yUzq04/LPE9Tk2MoQ8NF0GzP8r9Ahke4B7cUj
 FysWNiAlC2IisEi6th313FZkTLx0zgewdsdEBTLt8eAE9TU0wamRbo99LZ8i/Qr3
 hk7jV8DzL+EDQJLgl4p1iPJgA708eW17tbCxLEa15VKVV6P58miohmhx/IfPO2Gx
 FV1PPehtItsmiK/UoRtUCoFdFsqNQtOE+h8DWLyy8RDmhBqGbn9Ut4euXiQIF+rX
 WJKPFfslCTR39BrBcZnZeNsgOCN7tEfFRstzjzkey1DaeTGWtxmA5UGhpC2vT74y
 YyXluvZlgKr4S64ABmcqQj++hQLho0OQAih3uW5YVxt4VxEUcXYMJOsV1AQGpMjF
 UnewWH5opBQdfw==
 =jFLf
 -----END PGP SIGNATURE-----

Merge tag 'irq-core-2024-05-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull interrupt subsystem updates from Thomas Gleixner:
 "Core code:

   - Interrupt storm detection for the lockup watchdog:

     Lockups which are caused by interrupt storms are not easy to debug
     because there is no information about the events which make the
     lockup detector trigger.

     To make this more user friendly, provide an extenstion to interrupt
     statistics which allows to take snapshots and an interface to
     retrieve the delta to the snapshot. Use this new mechanism in the
     watchdog code to do a two stage lockup analysis by taking the
     snapshot and printing the deltas for the topmost active interrupts
     on the second trigger.

     Note: This contains both the interrupt and the watchdog changes as
     the latter depend on the former obviously.

   - Avoid summation loops in the /proc/interrupts output and use the
     global counter when possible

   - Skip suspended interrupts on CPU hotplug operations to ensure that
     they are not delivered before the system resumes the device drivers
     when coming out of suspend.

   - On CPU hot-unplug interrupts which are affine to the outgoing CPU
     are migrated to a different CPU in the affinity mask. This can fail
     when the CPUs have no vectors left. Instead of giving up try to
     migrate it to any online CPU and thereby breaking the affinity
     setting in order to prevent a stale device interrupt which targets
     an offline CPU

   - The usual small cleanups

  Driver code:

   - Support for the RISCV AIA MSI controller

   - Make the interrupt allocation for the Loongson PCH controller more
     flexible to prevent vector exhaustion

   - The usual set of cleanups and fixes all over the place"

* tag 'irq-core-2024-05-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (51 commits)
  irqchip/gic-v3-its: Remove BUG_ON in its_vpe_irq_domain_alloc
  cpuidle: Avoid explicit cpumask allocation on stack
  irqchip/sifive-plic: Avoid explicit cpumask allocation on stack
  irqchip/riscv-aplic-direct: Avoid explicit cpumask allocation on stack
  irqchip/loongson-eiointc: Avoid explicit cpumask allocation on stack
  irqchip/gic-v3-its: Avoid explicit cpumask allocation on stack
  irqchip/irq-bcm6345-l1: Avoid explicit cpumask allocation on stack
  cpumask: Introduce cpumask_first_and_and()
  irqchip/irq-brcmstb-l2: Avoid saving mask on shutdown
  genirq: Reuse irq_is_nmi()
  genirq/cpuhotplug: Retry with cpu_online_mask when migration fails
  genirq/cpuhotplug: Skip suspended interrupts when restoring affinity
  arm64: dts: st: Add interrupt parent to pinctrl on stm32mp251
  arm64: dts: st: Add exti1 and exti2 nodes on stm32mp251
  ARM: dts: stm32: List exti parent interrupts on stm32mp131
  ARM: dts: stm32: List exti parent interrupts on stm32mp151
  arm64: Kconfig.platforms: Enable STM32_EXTI for ARCH_STM32
  irqchip/stm32-exti: Mark events reserved with RIF configuration check
  irqchip/stm32-exti: Skip secure events
  irqchip/stm32-exti: Convert driver to standard PM
  ...
2024-05-14 09:47:14 -07:00
Linus Torvalds
2d9db778dd Timers and timekeeping updates:
- Core code:
 
    - Make timekeeping and VDSO time readouts resilent against math overflow:
 
      In guest context the kernel is prone to math overflow when the host
      defers the timer interrupt due to overload, malfunction or malice.
 
      This can be mitigated by checking the clocksource delta for the
      maximum deferrement which is readily available. If that value is
      exceeded then the code uses a slowpath function which can handle the
      multiplication overflow.
 
      This functionality is enabled unconditionally in the kernel, but made
      conditional in the VDSO code. The latter is conditional because it
      allows architectures to optimize the check so it is not causing
      performance regressions.
 
      On X86 this is achieved by reworking the existing check for negative
      TSC deltas as a negative delta obviously exceeds the maximum
      deferrement when it is evaluated as an unsigned value. That avoids two
      conditionals in the hotpath and allows to hide both the negative delta
      and the large delta handling in the same slow path.
 
    - Add an initial minimal ktime_t abstraction for Rust
 
    - The usual boring cleanups and enhancements
 
  - Drivers:
 
    - Boring updates to device trees and trivial enhancements in various
      drivers.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAmZBErUTHHRnbHhAbGlu
 dXRyb25peC5kZQAKCRCmGPVMDXSYoZVhD/9iUPzcGNgqGqcO1bXy6dH4xLpeec6o
 2En1vg45DOaygN7DFxkoei20KJtfdFeaaEDH8UqmOfPcpLIuVAd0yqhgDQtx6ZcO
 XNd09SFDInzUt1Ot/WcoXp5N6Wt3vyEgUAlIN1fQdbaZ3fh6OhGhXXCRfiRCGXU1
 ea2pSunLuRf1pKU0AYhGIexnZMOHC4NmVXw/m+WNw5DJrmWB+OaNFKfMoQjtQ1HD
 Vgyr2RALHnIeXm60y2j3dD7TWGXICE/edzOd7pEyg5LFXsmcp388eu/DEdOq3OTV
 tsHLgIi05GJym3dykPBVwZk09M5oVNNfkg9zDxHWhSLkEJmc4QUaH3dgM8uBoaRW
 pS3LaO3ePxWmtAOdSNKFY6xnl6df+PYJoZcIF/GuXgty7im+VLK9C4M05mSjey00
 omcEywvmGdFezY6D9MmjjhFa+q2v9zpRjFpCWaIv3DQdAaDPrOzBk4SSqHZOV4lq
 +hp7ar1mTn1FPrXBouwyOgSOUANISV5cy/QuwOtrVIuVR4rWFVgfWo/7J32/q5Ik
 XBR0lTdQy1Biogf6xy0HCY+4wItOLTqEXXqeknHSMJpDzj5uZglZemgKbix1wVJ9
 8YlD85Q7sktlPmiLMKV9ra0MKVyXDoIrgt4hX98A8M12q9bNdw23x0p0jkJHwGha
 ZYUyX+XxKgOJug==
 =pL+S
 -----END PGP SIGNATURE-----

Merge tag 'timers-core-2024-05-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull timers and timekeeping updates from Thomas Gleixner:
 "Core code:

   - Make timekeeping and VDSO time readouts resilent against math
     overflow:

     In guest context the kernel is prone to math overflow when the host
     defers the timer interrupt due to overload, malfunction or malice.

     This can be mitigated by checking the clocksource delta for the
     maximum deferrement which is readily available. If that value is
     exceeded then the code uses a slowpath function which can handle
     the multiplication overflow.

     This functionality is enabled unconditionally in the kernel, but
     made conditional in the VDSO code. The latter is conditional
     because it allows architectures to optimize the check so it is not
     causing performance regressions.

     On X86 this is achieved by reworking the existing check for
     negative TSC deltas as a negative delta obviously exceeds the
     maximum deferrement when it is evaluated as an unsigned value. That
     avoids two conditionals in the hotpath and allows to hide both the
     negative delta and the large delta handling in the same slow path.

   - Add an initial minimal ktime_t abstraction for Rust

   - The usual boring cleanups and enhancements

  Drivers:

   - Boring updates to device trees and trivial enhancements in various
     drivers"

* tag 'timers-core-2024-05-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (33 commits)
  clocksource/drivers/arm_arch_timer: Mark hisi_161010101_oem_info const
  clocksource/drivers/timer-ti-dm: Remove an unused field in struct dmtimer
  clocksource/drivers/renesas-ostm: Avoid reprobe after successful early probe
  clocksource/drivers/renesas-ostm: Allow OSTM driver to reprobe for RZ/V2H(P) SoC
  dt-bindings: timer: renesas: ostm: Document Renesas RZ/V2H(P) SoC
  rust: time: doc: Add missing C header links
  clocksource: Make the int help prompt unit readable in ncurses
  hrtimer: Rename __hrtimer_hres_active() to hrtimer_hres_active()
  timerqueue: Remove never used function timerqueue_node_expires()
  rust: time: Add Ktime
  vdso: Fix powerpc build U64_MAX undeclared error
  clockevents: Convert s[n]printf() to sysfs_emit()
  clocksource: Convert s[n]printf() to sysfs_emit()
  clocksource: Make watchdog and suspend-timing multiplication overflow safe
  timekeeping: Let timekeeping_cycles_to_ns() handle both under and overflow
  timekeeping: Make delta calculation overflow safe
  timekeeping: Prepare timekeeping_cycles_to_ns() for overflow safety
  timekeeping: Fold in timekeeping_delta_to_ns()
  timekeeping: Consolidate timekeeping helpers
  timekeeping: Refactor timekeeping helpers
  ...
2024-05-14 09:27:40 -07:00
Linus Torvalds
6e5a0c30b6 Scheduler changes for v6.10:
- Add cpufreq pressure feedback for the scheduler
 
  - Rework misfit load-balancing wrt. affinity restrictions
 
  - Clean up and simplify the code around ::overutilized and
    ::overload access.
 
  - Simplify sched_balance_newidle()
 
  - Bump SCHEDSTAT_VERSION to 16 due to a cleanup of CPU_MAX_IDLE_TYPES
    handling that changed the output.
 
  - Rework & clean up <asm/vtime.h> interactions wrt. arch_vtime_task_switch()
 
  - Reorganize, clean up and unify most of the higher level
    scheduler balancing function names around the sched_balance_*()
    prefix.
 
  - Simplify the balancing flag code (sched_balance_running)
 
  - Miscellaneous cleanups & fixes
 
 Signed-off-by: Ingo Molnar <mingo@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAmZBtA0RHG1pbmdvQGtl
 cm5lbC5vcmcACgkQEnMQ0APhK1gQEw//WiCiV7zTlWShSiG/g8GTfoAvl53QTWXF
 0jQ8TUcoIhxB5VeGgxVG1srYt8f505UXjH7L0MJLrbC3nOgRCg4NK57WiQEachKK
 HORIJHT0tMMsKIwX9D5Ovo4xYJn+j7mv7j/caB+hIlzZAbWk+zZPNWcS84p0ZS/4
 appY6RIcp7+cI7bisNMGUuNZS14+WMdWoX3TgoI6ekgDZ7Ky+kQvkwGEMBXsNElO
 qZOj6yS/QUE4Htwz0tVfd6h5svoPM/VJMIvl0yfddPGurfNw6jEh/fjcXnLdAzZ6
 9mgcosETncQbm0vfSac116lrrZIR9ygXW/yXP5S7I5dt+r+5pCrBZR2E5g7U4Ezp
 GjX1+6J9U6r6y12AMLRjadFOcDvxdwtszhZq4/wAcmS3B9dvupnH/w7zqY9ho3wr
 hTdtDHoAIzxJh7RNEHgeUC0/yQX3wJ9THzfYltDRIIjHTuvl4d5lHgsug+4Y9ClE
 pUIQm/XKouweQN9TZz2ULle4ZhRrR9sM9QfZYfirJ/RppmuKool4riWyQFQNHLCy
 mBRMjFFsTpFIOoZXU6pD4EabOpWdNrRRuND/0yg3WbDat2gBWq6jvSFv2UN1/v7i
 Un5jijTuN7t8yP5lY5Tyf47kQfLlA9bUx1v56KnF9mrpI87FyiDD3MiQVhDsvpGX
 rP96BIOrkSo=
 =obph
 -----END PGP SIGNATURE-----

Merge tag 'sched-core-2024-05-13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull scheduler updates from Ingo Molnar:

 - Add cpufreq pressure feedback for the scheduler

 - Rework misfit load-balancing wrt affinity restrictions

 - Clean up and simplify the code around ::overutilized and
   ::overload access.

 - Simplify sched_balance_newidle()

 - Bump SCHEDSTAT_VERSION to 16 due to a cleanup of CPU_MAX_IDLE_TYPES
   handling that changed the output.

 - Rework & clean up <asm/vtime.h> interactions wrt arch_vtime_task_switch()

 - Reorganize, clean up and unify most of the higher level
   scheduler balancing function names around the sched_balance_*()
   prefix

 - Simplify the balancing flag code (sched_balance_running)

 - Miscellaneous cleanups & fixes

* tag 'sched-core-2024-05-13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (50 commits)
  sched/pelt: Remove shift of thermal clock
  sched/cpufreq: Rename arch_update_thermal_pressure() => arch_update_hw_pressure()
  thermal/cpufreq: Remove arch_update_thermal_pressure()
  sched/cpufreq: Take cpufreq feedback into account
  cpufreq: Add a cpufreq pressure feedback for the scheduler
  sched/fair: Fix update of rd->sg_overutilized
  sched/vtime: Do not include <asm/vtime.h> header
  s390/irq,nmi: Include <asm/vtime.h> header directly
  s390/vtime: Remove unused __ARCH_HAS_VTIME_TASK_SWITCH leftover
  sched/vtime: Get rid of generic vtime_task_switch() implementation
  sched/vtime: Remove confusing arch_vtime_task_switch() declaration
  sched/balancing: Simplify the sg_status bitmask and use separate ->overloaded and ->overutilized flags
  sched/fair: Rename set_rd_overutilized_status() to set_rd_overutilized()
  sched/fair: Rename SG_OVERLOAD to SG_OVERLOADED
  sched/fair: Rename {set|get}_rd_overload() to {set|get}_rd_overloaded()
  sched/fair: Rename root_domain::overload to ::overloaded
  sched/fair: Use helper functions to access root_domain::overload
  sched/fair: Check root_domain::overload value before update
  sched/fair: Combine EAS check with root_domain::overutilized access
  sched/fair: Simplify the continue_balancing logic in sched_balance_newidle()
  ...
2024-05-13 17:18:51 -07:00
Linus Torvalds
87caef4220 hardening updates for 6.10-rc1
- selftests: Add str*cmp tests (Ivan Orlov)
 
 - __counted_by: provide UAPI for _le/_be variants (Erick Archer)
 
 - Various strncpy deprecation refactors (Justin Stitt)
 
 - stackleak: Use a copy of soon-to-be-const sysctl table (Thomas Weißschuh)
 
 - UBSAN: Work around i386 -regparm=3 bug with Clang prior to version 19
 
 - Provide helper to deal with non-NUL-terminated string copying
 
 - SCSI: Fix older string copying bugs (with new helper)
 
 - selftests: Consolidate string helper behavioral tests
 
 - selftests: add memcpy() fortify tests
 
 - string: Add additional __realloc_size() annotations for "dup" helpers
 
 - LKDTM: Fix KCFI+rodata+objtool confusion
 
 - hardening.config: Enable KCFI
 -----BEGIN PGP SIGNATURE-----
 
 iQJKBAABCgA0FiEEpcP2jyKd1g9yPm4TiXL039xtwCYFAmY/yCUWHGtlZXNjb29r
 QGNocm9taXVtLm9yZwAKCRCJcvTf3G3AJuf2D/9xlQA7UxUDlm1Z6DPYzTZfNm4M
 D+RJ1QoLNbZEYSzULWvfRSWI+c82qINoSgvtv2DdhWqSKivcMoeNDN846gewfwMY
 0q3iChbhPaNBAHaXat1pf0iA6q2n/wpg1jv1C1PmPVSaEpl0CeQ2MLXSOMz9Gb7G
 FkkaN/v+YlShUzkw61KwKPg959/bh5vCBbeLjSd1XAhLGKU7nWw4yj0J3usTnRbV
 icCnW4mk9SD+pIli/+n7t/QIvPMf6TrJZoSgH9P7YNm+wNme4UEAm1PJz8F+KVAH
 D3CJhlH36l8TrndsHMsHgDjKtUUchh+ExOlWGw3ObUnbU7ST2JP6crAdjtnyT2eN
 uF+ELBT97SskFBAlzOzBSIs8lEwBZzTdJCmWqEBr3ZxxR7lcClmqbJY+X/FhvXko
 o7PvtCbHCatpDPJPZ0e25nVsfEJS29RUED5Gen6vWcUtuvdFEgws70s5BDAbSZTo
 RoJsuDqlRAFLdNDYmEN3UTGcm+PBjPgKsBrXiiNr4Y0BilU67Bzdmd8jiZC9ARe6
 +3cfQRs0uWdemANzvrN5FnrIUhjRHWTvfVTXcC9Jt53HntIuMhhRajJuMcTAX5uQ
 iWACUR14RL8lfInS8phWB5T4AvNexTFc6kVRqNzsGB0ZutsnAsqELttCk57tYQVr
 Hlv/MbePyyLSKF/nYA==
 =CgsW
 -----END PGP SIGNATURE-----

Merge tag 'hardening-6.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux

Pull hardening updates from Kees Cook:
 "The bulk of the changes here are related to refactoring and expanding
  the KUnit tests for string helper and fortify behavior.

  Some trivial strncpy replacements in fs/ were carried in my tree. Also
  some fixes to SCSI string handling were carried in my tree since the
  helper for those was introduce here. Beyond that, just little fixes
  all around: objtool getting confused about LKDTM+KCFI, preparing for
  future refactors (constification of sysctl tables, additional
  __counted_by annotations), a Clang UBSAN+i386 crash fix, and adding
  more options in the hardening.config Kconfig fragment.

  Summary:

   - selftests: Add str*cmp tests (Ivan Orlov)

   - __counted_by: provide UAPI for _le/_be variants (Erick Archer)

   - Various strncpy deprecation refactors (Justin Stitt)

   - stackleak: Use a copy of soon-to-be-const sysctl table (Thomas
     Weißschuh)

   - UBSAN: Work around i386 -regparm=3 bug with Clang prior to
     version 19

   - Provide helper to deal with non-NUL-terminated string copying

   - SCSI: Fix older string copying bugs (with new helper)

   - selftests: Consolidate string helper behavioral tests

   - selftests: add memcpy() fortify tests

   - string: Add additional __realloc_size() annotations for "dup"
     helpers

   - LKDTM: Fix KCFI+rodata+objtool confusion

   - hardening.config: Enable KCFI"

* tag 'hardening-6.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: (29 commits)
  uapi: stddef.h: Provide UAPI macros for __counted_by_{le, be}
  stackleak: Use a copy of the ctl_table argument
  string: Add additional __realloc_size() annotations for "dup" helpers
  kunit/fortify: Fix replaced failure path to unbreak __alloc_size
  hardening: Enable KCFI and some other options
  lkdtm: Disable CFI checking for perms functions
  kunit/fortify: Add memcpy() tests
  kunit/fortify: Do not spam logs with fortify WARNs
  kunit/fortify: Rename tests to use recommended conventions
  init: replace deprecated strncpy with strscpy_pad
  kunit/fortify: Fix mismatched kvalloc()/vfree() usage
  scsi: qla2xxx: Avoid possible run-time warning with long model_num
  scsi: mpi3mr: Avoid possible run-time warning with long manufacturer strings
  scsi: mptfusion: Avoid possible run-time warning with long manufacturer strings
  fs: ecryptfs: replace deprecated strncpy with strscpy
  hfsplus: refactor copy_name to not use strncpy
  reiserfs: replace deprecated strncpy with scnprintf
  virt: acrn: replace deprecated strncpy with strscpy
  ubsan: Avoid i386 UBSAN handler crashes with Clang
  ubsan: Remove 1-element array usage in debug reporting
  ...
2024-05-13 14:14:05 -07:00
Linus Torvalds
0c9f4ac808 for-6.10/block-20240511
-----BEGIN PGP SIGNATURE-----
 
 iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAmY/YgsQHGF4Ym9lQGtl
 cm5lbC5kawAKCRD301j7KXHgpvi0EACwnFRtYioizBH0x7QUHTBcIr0IhACd5gfz
 bm+uwlDUtf6G6lupHdJT9gOVB2z2z1m2Pz//8RuUVWw3Eqw2+rfgG8iJd+yo7IaV
 DpX3WaM4NnBvB7FKOKHlMPvGuf7KgbZ3uPm3x8cbrn/axMmkZ6ljxTixJ3p5t4+s
 xRsef/lVdG71DkXIFgTKATB86yNRJNlRQTbL+sZW22vdXdtfyBbOgR1sBuFfp7Hd
 g/uocZM/z0ahM6JH/5R2IX2ttKXMIBZLA8HRkJdvYqg022cj4js2YyRCPU3N6jQN
 MtN4TpJV5I++8l6SPQOOhaDNrK/6zFtDQpwG0YBiKKj3nQDgVbWWb8ejYTIUv4MP
 SrEto4MVBEqg5N65VwYYhIf45rmueFyJp6z0Vqv6Owur5nuww/YIFknmoMa/WDMd
 V8dIU3zL72FZDbPjIBjxHeqAGz9OgzEVafled7pi0Xbw6wqiB4kZihlMGXlD+WBy
 Yd6xo8PX4i5+d2LLKKPxpW1X0eJlKYJ/4dnYCoFN8LmXSiPJnMx2pYrV+NqMxy4X
 Thr8lxswLQC7j9YBBuIeDl8NB9N5FZZLvaC6I25QKq045M2ckJ+VrounsQb3vGwJ
 72nlxxBZL8wz3sasgX9Pc1Cez9AqYbM+UZahq8ezPY5y3Jh0QfRw/MOk1ZaDNC8V
 CNOHBH0E+Q==
 =HnjE
 -----END PGP SIGNATURE-----

Merge tag 'for-6.10/block-20240511' of git://git.kernel.dk/linux

Pull block updates from Jens Axboe:

 - Add a partscan attribute in sysfs, fixing an issue with systemd
   relying on an internal interface that went away.

 - Attempt #2 at making long running discards interruptible. The
   previous attempt went into 6.9, but we ended up mostly reverting it
   as it had issues.

 - Remove old ida_simple API in bcache

 - Support for zoned write plugging, greatly improving the performance
   on zoned devices.

 - Remove the old throttle low interface, which has been experimental
   since 2017 and never made it beyond that and isn't being used.

 - Remove page->index debugging checks in brd, as it hasn't caught
   anything and prepares us for removing in struct page.

 - MD pull request from Song

 - Don't schedule block workers on isolated CPUs

* tag 'for-6.10/block-20240511' of git://git.kernel.dk/linux: (84 commits)
  blk-throttle: delay initialization until configuration
  blk-throttle: remove CONFIG_BLK_DEV_THROTTLING_LOW
  block: fix that util can be greater than 100%
  block: support to account io_ticks precisely
  block: add plug while submitting IO
  bcache: fix variable length array abuse in btree_iter
  bcache: Remove usage of the deprecated ida_simple_xx() API
  md: Revert "md: Fix overflow in is_mddev_idle"
  blk-lib: check for kill signal in ioctl BLKDISCARD
  block: add a bio_await_chain helper
  block: add a blk_alloc_discard_bio helper
  block: add a bio_chain_and_submit helper
  block: move discard checks into the ioctl handler
  block: remove the discard_granularity check in __blkdev_issue_discard
  block/ioctl: prefer different overflow check
  null_blk: Fix the WARNING: modpost: missing MODULE_DESCRIPTION()
  block: fix and simplify blkdevparts= cmdline parsing
  block: refine the EOF check in blkdev_iomap_begin
  block: add a partscan sysfs attribute for disks
  block: add a disk_has_partscan helper
  ...
2024-05-13 13:03:54 -07:00
Linus Torvalds
b19239143e Hi,
These are the changes for the TPM driver with a single major new
 feature: TPM bus encryption and integrity protection. The key pair
 on TPM side is generated from so called null random seed per power
 on of the machine [1]. This supports the TPM encryption of the hard
 drive by adding layer of protection against bus interposer attacks.
 
 Other than the pull request a few minor fixes and documentation for
 tpm_tis to clarify basics of TPM localities for future patch review
 discussions (will be extended and refined over times, just a seed).
 
 [1] https://lore.kernel.org/linux-integrity/20240429202811.13643-1-James.Bottomley@HansenPartnership.com/
 
 BR, Jarkko
 -----BEGIN PGP SIGNATURE-----
 
 iJYEABYKAD4WIQRE6pSOnaBC00OEHEIaerohdGur0gUCZj0l2iAcamFya2tvLnNh
 a2tpbmVuQGxpbnV4LmludGVsLmNvbQAKCRAaerohdGur0m8yAP4hBjMtpgAJZ4eZ
 5o9tEQJrh/1JFZJ+8HU5IKPc4RU8BAEAyyYOCtxtS/C5B95iP+LvNla0KWi0pprU
 HsCLULnV2Aw=
 =RTXJ
 -----END PGP SIGNATURE-----

Merge tag 'tpmdd-next-6.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jarkko/linux-tpmdd

Pull TPM updates from Jarkko Sakkinen:
 "These are the changes for the TPM driver with a single major new
  feature: TPM bus encryption and integrity protection. The key pair on
  TPM side is generated from so called null random seed per power on of
  the machine [1]. This supports the TPM encryption of the hard drive by
  adding layer of protection against bus interposer attacks.

  Other than that, a few minor fixes and documentation for tpm_tis to
  clarify basics of TPM localities for future patch review discussions
  (will be extended and refined over times, just a seed)"

Link: https://lore.kernel.org/linux-integrity/20240429202811.13643-1-James.Bottomley@HansenPartnership.com/ [1]

* tag 'tpmdd-next-6.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jarkko/linux-tpmdd: (28 commits)
  Documentation: tpm: Add TPM security docs toctree entry
  tpm: disable the TPM if NULL name changes
  Documentation: add tpm-security.rst
  tpm: add the null key name as a sysfs export
  KEYS: trusted: Add session encryption protection to the seal/unseal path
  tpm: add session encryption protection to tpm2_get_random()
  tpm: add hmac checks to tpm2_pcr_extend()
  tpm: Add the rest of the session HMAC API
  tpm: Add HMAC session name/handle append
  tpm: Add HMAC session start and end functions
  tpm: Add TCG mandated Key Derivation Functions (KDFs)
  tpm: Add NULL primary creation
  tpm: export the context save and load commands
  tpm: add buffer function to point to returned parameters
  crypto: lib - implement library version of AES in CFB mode
  KEYS: trusted: tpm2: Use struct tpm_buf for sized buffers
  tpm: Add tpm_buf_read_{u8,u16,u32}
  tpm: TPM2B formatted buffers
  tpm: Store the length of the tpm_buf data separately.
  tpm: Update struct tpm_buf documentation comments
  ...
2024-05-13 10:40:15 -07:00
Linus Torvalds
cd97950cbc slab updates for 6.10
-----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCAAdFiEEe7vIQRWZI0iWSE3xu+CwddJFiJoFAmY8mxAACgkQu+CwddJF
 iJru7AgAmBfolYwYjm9fCkH+px40smQQF08W+ygJaKF4+6e+b5ijfI8H3AG7QtuE
 5FmdCjSvu56lr15sjeUy7giYWRfeEwxC/ztJ0FJ+RCzSEQVKCo2wWGYxDneelwdH
 /v0Of5ENbIiH/svK4TArY9AemZw+nowNrwa4TI1QAEcp47T7x52r0GFOs1pnduep
 eV6uSwHSx00myiF3fuMGQ7P4aUDLNTGn5LSHNI4sykObesGPx4Kvr0zZvhQT41me
 c6Sc0GwV5M9sqBFwjujIeD7CB98wVPju4SDqNiEL+R1u+pnIA0kkefO4D4VyKvpr
 7R/WXmqZI4Ae/HEtcRd8+5Z4FvapPw==
 =7ez3
 -----END PGP SIGNATURE-----

Merge tag 'slab-for-6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab

Pull slab updates from Vlastimil Babka:
 "This time it's mostly random cleanups and fixes, with two performance
  fixes that might have significant impact, but limited to systems
  experiencing particular bad corner case scenarios rather than general
  performance improvements.

  The memcg hook changes are going through the mm tree due to
  dependencies.

   - Prevent stalls when reading /proc/slabinfo (Jianfeng Wang)

     This fixes the long-standing problem that can happen with workloads
     that have alloc/free patterns resulting in many partially used
     slabs (in e.g. dentry cache). Reading /proc/slabinfo will traverse
     the long partial slab list under spinlock with disabled irqs and
     thus can stall other processes or even trigger the lockup
     detection. The traversal is only done to count free objects so that
     <active_objs> column can be reported along with <num_objs>.

     To avoid affecting fast paths with another shared counter
     (attempted in the past) or complex partial list traversal schemes
     that allow rescheduling, the chosen solution resorts to
     approximation - when the partial list is over 10000 slabs long, we
     will only traverse first 5000 slabs from head and tail each and use
     the average of those to estimate the whole list. Both head and tail
     are used as the slabs near head to tend to have more free objects
     than the slabs towards the tail.

     It is expected the approximation should not break existing
     /proc/slabinfo consumers. The <num_objs> field is still accurate
     and reflects the overall kmem_cache footprint. The <active_objs>
     was already imprecise due to cpu and percpu-partial slabs, so can't
     be relied upon to determine exact cache usage. The difference
     between <active_objs> and <num_objs> is mainly useful to determine
     the slab fragmentation, and that will be possible even with the
     approximation in place.

   - Prevent allocating many slabs when a NUMA node is full (Chen Jun)

     Currently, on NUMA systems with a node under significantly bigger
     pressure than other nodes, the fallback strategy may result in each
     kmalloc_node() that can't be safisfied from the preferred node, to
     allocate a new slab on a fallback node, and not reuse the slabs
     already on that node's partial list.

     This is now fixed and partial lists of fallback nodes are checked
     even for kmalloc_node() allocations. It's still preferred to
     allocate a new slab on the requested node before a fallback, but
     only with a GFP_NOWAIT attempt, which will fail quickly when the
     node is under a significant memory pressure.

   - More SLAB removal related cleanups (Xiu Jianfeng, Hyunmin Lee)

   - Fix slub_kunit self-test with hardened freelists (Guenter Roeck)

   - Mark racy accesses for KCSAN (linke li)

   - Misc cleanups (Xiongwei Song, Haifeng Xu, Sangyun Kim)"

* tag 'slab-for-6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab:
  mm/slub: remove the check for NULL kmalloc_caches
  mm/slub: create kmalloc 96 and 192 caches regardless cache size order
  mm/slub: mark racy access on slab->freelist
  slub: use count_partial_free_approx() in slab_out_of_memory()
  slub: introduce count_partial_free_approx()
  slub: Set __GFP_COMP in kmem_cache by default
  mm/slub: remove duplicate initialization for early_kmem_cache_node_alloc()
  mm/slub: correct comment in do_slab_free()
  mm/slub, kunit: Use inverted data to corrupt kmem cache
  mm/slub: simplify get_partial_node()
  mm/slub: add slub_get_cpu_partial() helper
  mm/slub: remove the check of !kmem_cache_has_cpu_partial()
  mm/slub: Reduce memory consumption in extreme scenarios
  mm/slub: mark racy accesses on slab->slabs
  mm/slub: remove dummy slabinfo functions
2024-05-13 10:28:34 -07:00
Linus Torvalds
2e57d1d606 sparc32,parisc,csky: Provide one-byte and two-byte cmpxchg() support
This series provides native one-byte and two-byte cmpxchg() support
 for sparc32 and parisc, courtesy of Al Viro.  This support is provided
 by the same hashed-array-of-locks technique used for the other atomic
 operations provided for these two platforms.
 
 This series also provides emulated one-byte cmpxchg() support for csky
 using a new cmpxchg_emu_u8() function that uses a four-byte cmpxchg()
 to emulate the one-byte variant.
 
 Similar patches for emulation of one-byte cmpxchg() for arc, sh, and
 xtensa have not yet received maintainer acks, so they are slated for
 the v6.11 merge window.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCgAxFiEEbK7UrM+RBIrCoViJnr8S83LZ+4wFAmY/gZ8THHBhdWxtY2tA
 a2VybmVsLm9yZwAKCRCevxLzctn7jFIjD/0Uu4VZZN96jYbSaDbC5aAkEHg/swBK
 6OVn+yspLOvkebVZlSfus+7rc5VUrxT3GA/gvAWEQsUlPqpYg6Qja/efFpPPRjIq
 lwkFE5HFgE0J4lBo9p78ggm6Hx60WUPlNg9uS23qURZbFTx5TYQyAdzXw9HlYzr8
 jg5IuTtO5L5AZzR2ocDRh4A5sqfcBJCVdVsKO+XzdFLLtgum+kJY7StYLPdY8VtL
 pIV3+ZQENoiwzE+wccnCb2R/4kt6jsEDShlpV4VEfv76HwbjBdvSq4jEg4jS2N3/
 AIyThclD97AEdbbM1oJ3oZdjD3GLGVPhVFfiMSGD5HGA+JVJPjJe2it4o+xY7CIR
 sSdI/E3Rs67qgaga6t2vHygDZABOwgNLAsc4VwM7X6I20fRixkYVc7aVOTnAPzmr
 15iaFd/T7fLKJcC3m/IXb9iNdlfe0Op4+YVD0lOTWmzIk80Xgf45a39u1VFlqQvh
 CLIZG3IdmuxXSWjOmk70iokzJgoSmBriGLbAT3K++pzGYUN/BNQs6XRR77BczFsX
 CbZTZKnEWZMR1U0UWa/TbvUKcsVBZTYebJSvJOG2/+oVqayzvwYfBsE/vWZcI72K
 XEEpKY9ZPDf/gCs/G4OFWt2QPJ0PL+Nt4UZDr5Khrqgo1PwN0uIXstA4mnJ0WjqQ
 sGiACjdTXk4h0w==
 =AEPy
 -----END PGP SIGNATURE-----

Merge tag 'cmpxchg.2024.05.11a' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu

Pull cmpxchg updates from Paul McKenney:
 "Provide one-byte and two-byte cmpxchg() support on sparc32, parisc,
  and csky

  This provides native one-byte and two-byte cmpxchg() support for
  sparc32 and parisc, courtesy of Al Viro. This support is provided by
  the same hashed-array-of-locks technique used for the other atomic
  operations provided for these two platforms.

  There is also emulated one-byte cmpxchg() support for csky using a new
  cmpxchg_emu_u8() function that uses a four-byte cmpxchg() to emulate
  the one-byte variant.

  Similar patches for emulation of one-byte cmpxchg() for arc, sh, and
  xtensa have not yet received maintainer acks, so they are slated for
  the v6.11 merge window"

* tag 'cmpxchg.2024.05.11a' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu:
  csky: Emulate one-byte cmpxchg
  lib: Add one-byte emulation function
  parisc: add u16 support to cmpxchg()
  parisc: add missing export of __cmpxchg_u8()
  parisc: unify implementations of __cmpxchg_u{8,32,64}
  parisc: __cmpxchg_u32(): lift conversion into the callers
  sparc32: add __cmpxchg_u{8,16}() and teach __cmpxchg() to handle those sizes
  sparc32: unify __cmpxchg_u{32,64}
  sparc32: make the first argument of __cmpxchg_u64() volatile u64 *
  sparc32: make __cmpxchg_u32() return u32
2024-05-13 10:05:39 -07:00
Linus Torvalds
c22c3e0753 18 hotfixes, 7 of which are cc:stable.
More fixups for this cycle's page_owner updates.  And a few userfaultfd
 fixes.  Otherwise, random singletons - see the individual changelogs for
 details.
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCZj6AhAAKCRDdBJ7gKXxA
 jsvHAQCoSRI4qM0a6j5Fs2Q+B1in+kGWTe50q5Rd755VgolEsgD8CUASDgZ2Qv7g
 yDAlluXMv4uvA4RqkZvDiezsENzYQw0=
 =MApd
 -----END PGP SIGNATURE-----

Merge tag 'mm-hotfixes-stable-2024-05-10-13-14' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull MM fixes from Andrew Morton:
 "18 hotfixes, 7 of which are cc:stable.

  More fixups for this cycle's page_owner updates. And a few userfaultfd
  fixes. Otherwise, random singletons - see the individual changelogs
  for details"

* tag 'mm-hotfixes-stable-2024-05-10-13-14' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
  mailmap: add entry for Barry Song
  selftests/mm: fix powerpc ARCH check
  mailmap: add entry for John Garry
  XArray: set the marks correctly when splitting an entry
  selftests/vDSO: fix runtime errors on LoongArch
  selftests/vDSO: fix building errors on LoongArch
  mm,page_owner: don't remove __GFP_NOLOCKDEP in add_stack_record_to_list
  fs/proc/task_mmu: fix uffd-wp confusion in pagemap_scan_pmd_entry()
  fs/proc/task_mmu: fix loss of young/dirty bits during pagemap scan
  mm/vmalloc: fix return value of vb_alloc if size is 0
  mm: use memalloc_nofs_save() in page_cache_ra_order()
  kmsan: compiler_types: declare __no_sanitize_or_inline
  lib/test_xarray.c: fix error assumptions on check_xa_multi_store_adv_add()
  tools: fix userspace compilation with new test_xarray changes
  MAINTAINERS: update URL's for KEYS/KEYRINGS_INTEGRITY and TPM DEVICE DRIVER
  mm: page_owner: fix wrong information in dump_page_owner
  maple_tree: fix mas_empty_area_rev() null pointer dereference
  mm/userfaultfd: reset ptes when close() for wr-protected ones
2024-05-10 14:16:03 -07:00
Masahiro Yamada
b1992c3772 kbuild: use $(src) instead of $(srctree)/$(src) for source directory
Kbuild conventionally uses $(obj)/ for generated files, and $(src)/ for
checked-in source files. It is merely a convention without any functional
difference. In fact, $(obj) and $(src) are exactly the same, as defined
in scripts/Makefile.build:

    src := $(obj)

When the kernel is built in a separate output directory, $(src) does
not accurately reflect the source directory location. While Kbuild
resolves this discrepancy by specifying VPATH=$(srctree) to search for
source files, it does not cover all cases. For example, when adding a
header search path for local headers, -I$(srctree)/$(src) is typically
passed to the compiler.

This introduces inconsistency between upstream and downstream Makefiles
because $(src) is used instead of $(srctree)/$(src) for the latter.

To address this inconsistency, this commit changes the semantics of
$(src) so that it always points to the directory in the source tree.

Going forward, the variables used in Makefiles will have the following
meanings:

  $(obj)     - directory in the object tree
  $(src)     - directory in the source tree  (changed by this commit)
  $(objtree) - the top of the kernel object tree
  $(srctree) - the top of the kernel source tree

Consequently, $(srctree)/$(src) in upstream Makefiles need to be replaced
with $(src).

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Reviewed-by: Nicolas Schier <nicolas@fjasle.eu>
2024-05-10 04:34:52 +09:00
Ard Biesheuvel
f135440447 crypto: lib - implement library version of AES in CFB mode
Implement AES in CFB mode using the existing, mostly constant-time
generic AES library implementation. This will be used by the TPM code
to encrypt communications with TPM hardware, which is often a discrete
component connected using sniffable wires or traces.

While a CFB template does exist, using a skcipher is a major pain for
non-performance critical synchronous crypto where the algorithm is known
at compile time and the data is in contiguous buffers with valid kernel
virtual addresses.

Tested-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Reviewed-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org>
Link: https://lore.kernel.org/all/20230216201410.15010-1-James.Bottomley@HansenPartnership.com/
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Tested-by: Jarkko Sakkinen <jarkko@kernel.org>
Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
2024-05-09 22:30:51 +03:00
Jakub Kicinski
e7073830cc Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Cross-merge networking fixes after downstream PR.

No conflicts.

Adjacent changes:

drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c
  35d92abfba ("net: hns3: fix kernel crash when devlink reload during initialization")
  2a1a1a7b5f ("net: hns3: add command queue trace for hns3")

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-05-09 10:01:01 -07:00
Yury Norov
0b2811ba11 bitmap: relax find_nth_bit() limitation on return value
The function claims to return the bitmap size, if Nth bit doesn't exist.
This rule is violated in inline case because the fns() that is used
there doesn't know anything about size of the bitmap.

So, relax this requirement to '>= size', and make the outline
implementation a bit cheaper.

All in-tree kernel users of find_nth_bit() are safe against that.

Reported-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Closes: https://lore.kernel.org/all/Zi50cAgR8nZvgLa3@yury-ThinkPad/T/#m6da806a0525e74dcc91f35e5f20766ed4e853e8a
Signed-off-by: Yury Norov <yury.norov@gmail.com>
2024-05-09 09:25:08 -07:00
Yury Norov
77db1920a8 lib: make test_bitops compilable into the kernel image
The test now is limited to be compiled as a module. There's no technical
reason for it. Now that the test bears some performance benchmarks, it
would be reasonable to run it at kernel load time, before userspace
starts, to reduce possible jitter.

Reviewed-by: Kuan-Wei Chiu <visitorckw@gmail.com>
Signed-off-by: Yury Norov <yury.norov@gmail.com>
2024-05-09 09:25:08 -07:00
Kuan-Wei Chiu
0a2c6664e5 lib/test_bitops: Add benchmark test for fns()
Introduce a benchmark test for the fns(). It measures the total time
taken by fns() to process 10,000 test data generated using
get_random_bytes() for each n in the range [0, BITS_PER_LONG).

example:
test_bitops: fns:             7637268 ns

CC: Andrew Morton <akpm@linux-foundation.org>
CC: Rasmus Villemoes <linux@rasmusvillemoes.dk>
CC: David Laight <David.Laight@aculab.com>
Signed-off-by: Kuan-Wei Chiu <visitorckw@gmail.com>
Suggested-by: Yury Norov <yury.norov@gmail.com>
Signed-off-by: Yury Norov <yury.norov@gmail.com>
2024-05-09 09:25:08 -07:00
Kent Overstreet
4c5b7294de closures: closure_sync_timeout()
Add a new variant of closure_sync_timeout() that takes a timeout.

Note that when this returns -ETIME the closure will still be waiting on
something, i.e. it's not safe to return if you've got a stack allocated
closure.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-05-08 17:29:22 -04:00
Andy Shevchenko
22bcc915ae kfifo: don't use "proxy" headers
Update header inclusions to follow IWYU (Include What You Use) principle.

Link: https://lkml.kernel.org/r/20240423192529.3249134-4-andriy.shevchenko@linux.intel.com
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Alain Volmat <alain.volmat@foss.st.com>
Cc: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
Cc: Chen-Yu Tsai <wens@csie.org>
Cc: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Cc: Jernej Skrabec <jernej.skrabec@gmail.com>
Cc: Matthias Brugger <matthias.bgg@gmail.com>
Cc: Mauro Carvalho Chehab <mchehab@kernel.org>
Cc: Patrice Chotard <patrice.chotard@foss.st.com>
Cc: Rob Herring <robh@kernel.org>
Cc: Samuel Holland <samuel@sholland.org>
Cc: Sean Wang <sean.wang@mediatek.com>
Cc: Sean Young <sean@mess.org>
Cc: Stefani Seibold <stefani@seibold.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-05-08 08:41:27 -07:00
Florian Fainelli
0d5044b4e7 lib: Allow for the DIM library to be modular
Allow the Dynamic Interrupt Moderation (DIM) library to be built as a
module. This is particularly useful in an Android GKI (Google Kernel
Image) configuration where everything is built as a module, including
Ethernet controller drivers. Having to build DIMLIB into the kernel
image with potentially no user is wasteful.

Signed-off-by: Florian Fainelli <florian.fainelli@broadcom.com>
Reviewed-by: Alexander Lobakin <aleksander.lobakin@intel.com>
Link: https://lore.kernel.org/r/20240506175040.410446-1-florian.fainelli@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-05-07 16:42:45 -07:00
Scott Mayhew
5496b9b77d kunit: bail out early in __kunit_test_suites_init() if there are no suites to test
Commit c72a870926 added a mutex to prevent kunit tests from running
concurrently.  Unfortunately that mutex gets locked during module load
regardless of whether the module actually has any kunit tests.  This
causes a problem for kunit tests that might need to load other kernel
modules (e.g. gss_krb5_test loading the camellia module).

So check to see if there are actually any tests to run before locking
the kunit_run_lock mutex.

Fixes: c72a870926 ("kunit: add ability to run tests after boot using debugfs")
Reported-by: Nico Pache <npache@redhat.com>
Signed-off-by: Scott Mayhew <smayhew@redhat.com>
Reviewed-by: Rae Moar <rmoar@google.com>
Reviewed-by: David Gow <davidgow@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2024-05-06 14:22:02 -06:00
Ivan Orlov
a96a394577 kunit: string-stream-test: use KUNIT_DEFINE_ACTION_WRAPPER
Use KUNIT_DEFINE_ACTION_WRAPPER macro to define the 'kfree' and
'string_stream_destroy' wrappers for kunit_add_action.

Signed-off-by: Ivan Orlov <ivan.orlov0322@gmail.com>
Reviewed-by: Rae Moar <rmoar@google.com>
Acked-by: David Gow <davidgow@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2024-05-06 14:22:02 -06:00
David Gow
4b513a02fd kunit: test: Move fault tests behind KUNIT_FAULT_TEST Kconfig option
The NULL dereference tests in kunit_fault deliberately trigger a kernel
BUG(), and therefore print the associated stack trace, even when the
test passes. This is both annoying (as it bloats the test output), and
can confuse some test harnesses, which assume any BUG() is a failure.

Allow these tests to be specifically disabled (without disabling all
of KUnit's other tests), by placing them behind the
CONFIG_KUNIT_FAULT_TEST Kconfig option. This is enabled by default, but
can be set to 'n' to disable the test. An empty 'kunit_fault' suite is
left behind, which will automatically be marked 'skipped'.

As the fault tests already were disabled under UML (as they weren't
compatible with its fault handling), we can simply adapt those
conditions, and add a dependency on !UML for our new option.

Suggested-by: Guenter Roeck <linux@roeck-us.net>
Link: https://lore.kernel.org/all/928249cc-e027-4f7f-b43f-502f99a1ea63@roeck-us.net/
Fixes: 82b0beff3497 ("kunit: Add tests for fault")
Signed-off-by: David Gow <davidgow@google.com>
Reviewed-by: Mickaël Salaün <mic@digikod.net>
Reviewed-by: Rae Moar <rmoar@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2024-05-06 14:22:02 -06:00
Wander Lairson Costa
fabd480b72 kunit: unregister the device on error
kunit_init_device() should unregister the device on bus register error,
but mistakenly it tries to unregister the bus.

Unregister the device instead of the bus.

Signed-off-by: Wander Lairson Costa <wander@redhat.com>
Fixes: d03c720e03 ("kunit: Add APIs for managing devices")
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2024-05-06 14:22:02 -06:00
David Gow
1eb69ded80 kunit: Fix race condition in try-catch completion
KUnit's try-catch infrastructure now uses vfork_done, which is always
set to a valid completion when a kthread is created, but which is set to
NULL once the thread terminates. This creates a race condition, where
the kthread exits before we can wait on it.

Keep a copy of vfork_done, which is taken before we wake_up_process()
and so valid, and wait on that instead.

Fixes: 93533996100c ("kunit: Handle test faults")
Reported-by: Linux Kernel Functional Testing <lkft@linaro.org>
Closes: https://lore.kernel.org/lkml/20240410102710.35911-1-naresh.kamboju@linaro.org/
Tested-by: Linux Kernel Functional Testing <lkft@linaro.org>
Acked-by: Mickaël Salaün <mic@digikod.net>
Signed-off-by: David Gow <davidgow@google.com>
Reviewed-by: Rae Moar <rmoar@google.com>
Tested-by: Miguel Ojeda <ojeda@kernel.org>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2024-05-06 14:22:02 -06:00
Mickaël Salaün
170c31737c kunit: Add tests for fault
Add a test case to check NULL pointer dereference and make sure it would
result as a failed test.

The full kunit_fault test suite is marked as skipped when run on UML
because it would result to a kernel panic.

Tested with:
./tools/testing/kunit/kunit.py run --arch x86_64 kunit_fault
./tools/testing/kunit/kunit.py run --arch arm64 \
  --cross_compile=aarch64-linux-gnu- kunit_fault

Cc: Brendan Higgins <brendanhiggins@google.com>
Cc: Rae Moar <rmoar@google.com>
Cc: Shuah Khan <skhan@linuxfoundation.org>
Reviewed-by: David Gow <davidgow@google.com>
Signed-off-by: Mickaël Salaün <mic@digikod.net>
Link: https://lore.kernel.org/r/20240408074625.65017-8-mic@digikod.net
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2024-05-06 14:22:02 -06:00
Mickaël Salaün
8bd5d74bab kunit: Print last test location on fault
This helps identify the location of test faults with opportunistic calls
to _KUNIT_SAVE_LOC().  This can be useful while writing tests or
debugging them.  It is possible to call KUNIT_SUCCESS() to explicit save
last location.

Cc: Brendan Higgins <brendanhiggins@google.com>
Cc: David Gow <davidgow@google.com>
Cc: Rae Moar <rmoar@google.com>
Cc: Shuah Khan <skhan@linuxfoundation.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Mickaël Salaün <mic@digikod.net>
Link: https://lore.kernel.org/r/20240408074625.65017-7-mic@digikod.net
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2024-05-06 14:22:02 -06:00
Mickaël Salaün
70585f05fb kunit: Fix KUNIT_SUCCESS() calls in iov_iter tests
Fix KUNIT_SUCCESS() calls to pass a test argument.

This is a no-op for now because this macro does nothing, but it will be
required for the next commit.

Cc: Brendan Higgins <brendanhiggins@google.com>
Cc: Rae Moar <rmoar@google.com>
Cc: Shuah Khan <skhan@linuxfoundation.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
Reviewed-by: David Gow <davidgow@google.com>
Signed-off-by: Mickaël Salaün <mic@digikod.net>
Link: https://lore.kernel.org/r/20240408074625.65017-6-mic@digikod.net
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2024-05-06 14:22:02 -06:00
Mickaël Salaün
3a35c13007 kunit: Handle test faults
Previously, when a kernel test thread crashed (e.g. NULL pointer
dereference, general protection fault), the KUnit test hanged for 30
seconds and exited with a timeout error.

Fix this issue by waiting on task_struct->vfork_done instead of the
custom kunit_try_catch.try_completion, and track the execution state by
initially setting try_result with -EINTR and only setting it to 0 if
the test passed.

Fix kunit_generic_run_threadfn_adapter() signature by returning 0
instead of calling kthread_complete_and_exit().  Because thread's exit
code is never checked, always set it to 0 to make it clear.  To make
this explicit, export kthread_exit() for KUnit tests built as module.

Fix the -EINTR error message, which couldn't be reached until now.

This is tested with a following patch.

Cc: Brendan Higgins <brendanhiggins@google.com>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Shuah Khan <skhan@linuxfoundation.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
Reviewed-by: David Gow <davidgow@google.com>
Tested-by: Rae Moar <rmoar@google.com>
Signed-off-by: Mickaël Salaün <mic@digikod.net>
Link: https://lore.kernel.org/r/20240408074625.65017-5-mic@digikod.net
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2024-05-06 14:22:02 -06:00
Mickaël Salaün
53026ff63b kunit: Fix timeout message
The exit code is always checked, so let's properly handle the -ETIMEDOUT
error code.

Cc: Brendan Higgins <brendanhiggins@google.com>
Cc: Shuah Khan <skhan@linuxfoundation.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
Reviewed-by: David Gow <davidgow@google.com>
Reviewed-by: Rae Moar <rmoar@google.com>
Signed-off-by: Mickaël Salaün <mic@digikod.net>
Link: https://lore.kernel.org/r/20240408074625.65017-4-mic@digikod.net
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2024-05-06 14:22:02 -06:00
Mickaël Salaün
f8aa1b98ce kunit: Fix kthread reference
There is a race condition when a kthread finishes after the deadline and
before the call to kthread_stop(), which may lead to use after free.

Cc: Brendan Higgins <brendanhiggins@google.com>
Cc: Shuah Khan <skhan@linuxfoundation.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
Fixes: adf5054570 ("kunit: fix UAF when run kfence test case test_gfpzero")
Reviewed-by: David Gow <davidgow@google.com>
Reviewed-by: Rae Moar <rmoar@google.com>
Signed-off-by: Mickaël Salaün <mic@digikod.net>
Link: https://lore.kernel.org/r/20240408074625.65017-3-mic@digikod.net
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2024-05-06 14:22:02 -06:00
Mickaël Salaün
cde5e1b4a9 kunit: Handle thread creation error
Previously, if a thread creation failed (e.g. -ENOMEM), the function was
called (kunit_catch_run_case or kunit_catch_run_case_cleanup) without
marking the test as failed.  Instead, fill try_result with the error
code returned by kthread_run(), which will mark the test as failed and
print "internal error occurred...".

Cc: Brendan Higgins <brendanhiggins@google.com>
Cc: Shuah Khan <skhan@linuxfoundation.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Rae Moar <rmoar@google.com>
Reviewed-by: David Gow <davidgow@google.com>
Signed-off-by: Mickaël Salaün <mic@digikod.net>
Link: https://lore.kernel.org/r/20240408074625.65017-2-mic@digikod.net
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2024-05-06 14:22:02 -06:00
Long Li
ba591801a3 xarray: inline xas_descend to improve performance
The commit 63b1898fff ("XArray: Disallow sibling entries of nodes")
modified the xas_descend function in such a way that it was no longer
being compiled as an inline function, because it increased the size of
xas_descend(), and the compiler no longer optimizes it as inline.  This
had a negative impact on performance, xas_descend is called frequently to
traverse downwards in the xarray tree, making it a hot function.

Inlining xas_descend has been shown to significantly improve performance
by approximately 4.95% in the iozone write test.

  Machine: Intel(R) Xeon(R) Gold 6240 CPU @ 2.60GHz
  #iozone i 0 -i 1 -s 64g -r 16m -f /test/tmptest

Before this patch:

       kB    reclen    write   rewrite     read    reread
 67108864     16384  2230080   3637689  6315197   5496027

After this patch:

       kB    reclen    write   rewrite     read    reread
 67108864     16384  2340360   3666175  6272401   5460782

Percentage change:
                       4.95%     0.78%   -0.68%    -0.64%

This patch introduces inlining to the xas_descend function. While this
change increases the size of lib/xarray.o, the performance gains in
critical workloads make this an acceptable trade-off.

Size comparison before and after patch:
.text		.data		.bss		file
0x3502		    0		   0		lib/xarray.o.before
0x3602		    0		   0		lib/xarray.o.after

Link: https://lkml.kernel.org/r/20240416061628.3768901-1-leo.lilong@huawei.com
Signed-off-by: Long Li <leo.lilong@huawei.com>
Cc: Hou Tao <houtao1@huawei.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: yangerkun <yangerkun@huawei.com>
Cc: Zhang Yi <yi.zhang@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-05-05 17:53:38 -07:00
Matthew Wilcox (Oracle)
2a0774c288 XArray: set the marks correctly when splitting an entry
If we created a new node to replace an entry which had search marks set,
we were setting the search mark on every entry in that node.  That works
fine when we're splitting to order 0, but when splitting to a larger
order, we must not set the search marks on the sibling entries.

Link: https://lkml.kernel.org/r/20240501153120.4094530-1-willy@infradead.org
Fixes: c010d47f10 ("mm: thp: split huge page to any lower order pages")
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reported-by: Luis Chamberlain <mcgrof@kernel.org>
Link: https://lore.kernel.org/r/ZjFGCOYk3FK_zVy3@bombadil.infradead.org
Tested-by: Luis Chamberlain <mcgrof@kernel.org>
Cc: Zi Yan <ziy@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-05-05 17:28:08 -07:00
Luis Chamberlain
2aaba39e78 lib/test_xarray.c: fix error assumptions on check_xa_multi_store_adv_add()
While testing lib/test_xarray in userspace I've noticed we can fail with:

make -C tools/testing/radix-tree
./tools/testing/radix-tree/xarray

BUG at check_xa_multi_store_adv_add:749
xarray: 0x55905fb21a00x head 0x55905fa1d8e0x flags 0 marks 0 0 0
0: 0x55905fa1d8e0x
xarray: ../../../lib/test_xarray.c:749: check_xa_multi_store_adv_add: Assertion `0' failed.
Aborted

We get a failure with a BUG_ON(), and that is because we actually can
fail due to -ENOMEM, the check in xas_nomem() will fix this for us so
it makes no sense to expect no failure inside the loop. So modify the
check and since this is also useful for instructional purposes clarify
the situation.

The check for XA_BUG_ON(xa, xa_load(xa, index) != p) is already done
at the end of the loop so just remove the bogus on inside the loop.

With this we now pass the test in both kernel and userspace:

In userspace:

./tools/testing/radix-tree/xarray
XArray: 149092856 of 149092856 tests passed

In kernel space:

XArray: 148257077 of 148257077 tests passed

Link: https://lkml.kernel.org/r/20240423192221.301095-3-mcgrof@kernel.org
Fixes: a60cc288a1 ("test_xarray: add tests for advanced multi-index use")
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
Cc: Daniel Gomez <da.gomez@samsung.com>
Cc: Darrick J. Wong <djwong@kernel.org>
Cc: Dave Chinner <david@fromorbit.com>
Cc: "Liam R. Howlett" <Liam.Howlett@oracle.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Pankaj Raghav <p.raghav@samsung.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-05-05 17:28:05 -07:00
Liam R. Howlett
955a923d28 maple_tree: fix mas_empty_area_rev() null pointer dereference
Currently the code calls mas_start() followed by mas_data_end() if the
maple state is MA_START, but mas_start() may return with the maple state
node == NULL.  This will lead to a null pointer dereference when checking
information in the NULL node, which is done in mas_data_end().

Avoid setting the offset if there is no node by waiting until after the
maple state is checked for an empty or single entry state.

A user could trigger the events to cause a kernel oops by unmapping all
vmas to produce an empty maple tree, then mapping a vma that would cause
the scenario described above.

Link: https://lkml.kernel.org/r/20240422203349.2418465-1-Liam.Howlett@oracle.com
Fixes: 54a611b605 ("Maple Tree: add new data structure")
Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Reported-by: Marius Fleischer <fleischermarius@gmail.com>
Closes: https://lore.kernel.org/lkml/CAJg=8jyuSxDL6XvqEXY_66M20psRK2J53oBTP+fjV5xpW2-R6w@mail.gmail.com/
Link: https://lore.kernel.org/lkml/CAJg=8jyuSxDL6XvqEXY_66M20psRK2J53oBTP+fjV5xpW2-R6w@mail.gmail.com/
Tested-by: Marius Fleischer <fleischermarius@gmail.com>
Tested-by: Sidhartha Kumar <sidhartha.kumar@oracle.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-05-05 17:28:05 -07:00
Linus Torvalds
b9158815de Char/Misc driver fixes for 6.9-rc7
Here are some small char/misc/other driver fixes and new device ids for
 6.9-rc7 that resolve some reported problems.
 
 Included in here are:
   - iio driver fixes
   - mei driver fix and new device ids
   - dyndbg bugfix
   - pvpanic-pci driver bugfix
   - slimbus driver bugfix
   - fpga new device id
 
 All have been in linux-next with no reported problems.
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 -----BEGIN PGP SIGNATURE-----
 
 iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCZjdD2Q8cZ3JlZ0Brcm9h
 aC5jb20ACgkQMUfUDdst+yk38wCeJeUXW4/yQ4BTj7cHir0aOowVs+UAnAxCUwzt
 NpooaVg3v9tzLtvAOp1O
 =YfmA
 -----END PGP SIGNATURE-----

Merge tag 'char-misc-6.9-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc

Pull char/misc driver fixes from Greg KH:
 "Here are some small char/misc/other driver fixes and new device ids
  for 6.9-rc7 that resolve some reported problems.

  Included in here are:

   - iio driver fixes

   - mei driver fix and new device ids

   - dyndbg bugfix

   - pvpanic-pci driver bugfix

   - slimbus driver bugfix

   - fpga new device id

  All have been in linux-next with no reported problems"

* tag 'char-misc-6.9-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc:
  slimbus: qcom-ngd-ctrl: Add timeout for wait operation
  dyndbg: fix old BUG_ON in >control parser
  misc/pvpanic-pci: register attributes via pci_driver
  fpga: dfl-pci: add PCI subdevice ID for Intel D5005 card
  mei: me: add lunar lake point M DID
  mei: pxp: match against PCI_CLASS_DISPLAY_OTHER
  iio:imu: adis16475: Fix sync mode setting
  iio: accel: mxc4005: Reset chip on probe() and resume()
  iio: accel: mxc4005: Interrupt handling fixes
  dt-bindings: iio: health: maxim,max30102: fix compatible check
  iio: pressure: Fixes SPI support for BMP3xx devices
  iio: pressure: Fixes BME280 SPI driver data
2024-05-05 10:08:52 -07:00
Al Viro
b8c873edbf wrapper for access to ->bd_partno
On the next step it's going to get folded into a field where flags will go.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2024-05-02 17:48:09 -04:00
Al Viro
3f9b8fb46e Use bdev_is_paritition() instead of open-coding it
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2024-05-02 17:48:09 -04:00
Jakub Kicinski
e958da0ddb Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Cross-merge networking fixes after downstream PR.

Conflicts:

include/linux/filter.h
kernel/bpf/core.c
  66e13b615a ("bpf: verifier: prevent userspace memory access")
  d503a04f8b ("bpf: Add support for certain atomics in bpf_arena to x86 JIT")
https://lore.kernel.org/all/20240429114939.210328b0@canb.auug.org.au/

No adjacent changes.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-05-02 12:06:25 -07:00
Linus Torvalds
545c494465 Including fixes from bpf.
Relatively calm week, likely due to public holiday in most places.
 No known outstanding regressions.
 
 Current release - regressions:
 
   - rxrpc: fix wrong alignmask in __page_frag_alloc_align()
 
   - eth: e1000e: change usleep_range to udelay in PHY mdic access
 
 Previous releases - regressions:
 
   - gro: fix udp bad offset in socket lookup
 
   - bpf: fix incorrect runtime stat for arm64
 
   - tipc: fix UAF in error path
 
   - netfs: fix a potential infinite loop in extract_user_to_sg()
 
   - eth: ice: ensure the copied buf is NUL terminated
 
   - eth: qeth: fix kernel panic after setting hsuid
 
 Previous releases - always broken:
 
   - bpf:
     - verifier: prevent userspace memory access
     - xdp: use flags field to disambiguate broadcast redirect
 
   - bridge: fix multicast-to-unicast with fraglist GSO
 
   - mptcp: ensure snd_nxt is properly initialized on connect
 
   - nsh: fix outer header access in nsh_gso_segment().
 
   - eth: bcmgenet: fix racing registers access
 
   - eth: vxlan: fix stats counters.
 
 Misc:
 
   - a bunch of MAINTAINERS file updates
 
 Signed-off-by: Paolo Abeni <pabeni@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 
 iQJGBAABCAAwFiEEg1AjqC77wbdLX2LbKSR5jcyPE6QFAmYzaRsSHHBhYmVuaUBy
 ZWRoYXQuY29tAAoJECkkeY3MjxOkh70P/jzsTsvzHspu3RUwcsyvWpSoJPcxP2tF
 5SKR66o8sbSjB5I26zUi/LtRZgbPO32GmLN2Y8GvP74h9lwKdDo4AY4volZKCT6f
 lRG6GohvMa0lSPSn1fti7CKVzDOsaTHvLz3uBBr+Xb9ITCKh+I+zGEEDGj/47SQN
 tmDWHPF8OMs2ezmYS5NqRIQ3CeRz6uyLmEoZhVm4SolypZ18oEg7GCtL3u6U48n+
 e3XB3WwKl0ZxK8ipvPgUDwGIDuM5hEyAaeNon3zpYGoqitRsRITUjULpb9dT4DtJ
 Jma3OkarFJNXgm4N/p/nAtQ9AdiAloF9ivZXs2t0XCdrrUZJUh05yuikoX+mLfpw
 GedG2AbaVl6mdqNkrHeyf5SXKuiPgeCLVfF2xMjS0l1kFbY+Bt8BqnRSdOrcoUG0
 zlSzBeBtajttMdnalWv2ZshjP8uo/NjXydUjoVNwuq8xGO5wP+zhNnwhOvecNyUg
 t7q2PLokahlz4oyDqyY/7SQ0hSEndqxOlt43I6CthoWH0XkS83nTPdQXcTKQParD
 ntJUk5QYwefUT1gimbn/N8GoP7a1+ysWiqcf/7+SNm932gJGiDt36+HOEmyhIfIG
 IDWTWJJW64SnPBIUw59MrG7hMtbfaiZiFQqeUJQpFVrRr+tg5z5NUZ5thA+EJVd8
 qiVDvmngZFiv
 =f6KY
 -----END PGP SIGNATURE-----

Merge tag 'net-6.9-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net

Pull networking fixes from Paolo Abeni:
 "Including fixes from bpf.

  Relatively calm week, likely due to public holiday in most places. No
  known outstanding regressions.

  Current release - regressions:

   - rxrpc: fix wrong alignmask in __page_frag_alloc_align()

   - eth: e1000e: change usleep_range to udelay in PHY mdic access

  Previous releases - regressions:

   - gro: fix udp bad offset in socket lookup

   - bpf: fix incorrect runtime stat for arm64

   - tipc: fix UAF in error path

   - netfs: fix a potential infinite loop in extract_user_to_sg()

   - eth: ice: ensure the copied buf is NUL terminated

   - eth: qeth: fix kernel panic after setting hsuid

  Previous releases - always broken:

   - bpf:
       - verifier: prevent userspace memory access
       - xdp: use flags field to disambiguate broadcast redirect

   - bridge: fix multicast-to-unicast with fraglist GSO

   - mptcp: ensure snd_nxt is properly initialized on connect

   - nsh: fix outer header access in nsh_gso_segment().

   - eth: bcmgenet: fix racing registers access

   - eth: vxlan: fix stats counters.

  Misc:

   - a bunch of MAINTAINERS file updates"

* tag 'net-6.9-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (45 commits)
  MAINTAINERS: mark MYRICOM MYRI-10G as Orphan
  MAINTAINERS: remove Ariel Elior
  net: gro: add flush check in udp_gro_receive_segment
  net: gro: fix udp bad offset in socket lookup by adding {inner_}network_offset to napi_gro_cb
  ipv4: Fix uninit-value access in __ip_make_skb()
  s390/qeth: Fix kernel panic after setting hsuid
  vxlan: Pull inner IP header in vxlan_rcv().
  tipc: fix a possible memleak in tipc_buf_append
  tipc: fix UAF in error path
  rxrpc: Clients must accept conn from any address
  net: core: reject skb_copy(_expand) for fraglist GSO skbs
  net: bridge: fix multicast-to-unicast with fraglist GSO
  mptcp: ensure snd_nxt is properly initialized on connect
  e1000e: change usleep_range to udelay in PHY mdic access
  net: dsa: mv88e6xxx: Fix number of databases for 88E6141 / 88E6341
  cxgb4: Properly lock TX queue for the selftest.
  rxrpc: Fix using alignmask being zero for __page_frag_alloc_align()
  vxlan: Add missing VNI filter counter update in arp_reduce().
  vxlan: Fix racy device stats updates.
  net: qede: use return from qede_parse_actions()
  ...
2024-05-02 08:51:47 -07:00
Kees Cook
7d78a77733 string: Add additional __realloc_size() annotations for "dup" helpers
Several other "dup"-style interfaces could use the __realloc_size()
attribute. (As a reminder to myself and others: "realloc" is used here
instead of "alloc" because the "alloc_size" attribute implies that the
memory contents are uninitialized. Since we're copying contents into the
resulting allocation, it must use "realloc_size" to avoid confusing the
compiler's optimization passes.)

Add KUnit test coverage where possible. (KUnit still does not have the
ability to manipulate userspace memory.)

Reviewed-by: Andy Shevchenko <andy@kernel.org>
Link: https://lore.kernel.org/r/20240502145218.it.729-kees@kernel.org
Signed-off-by: Kees Cook <keescook@chromium.org>
2024-05-02 07:52:41 -07:00
Ard Biesheuvel
377d909511 vmlinux: Avoid weak reference to notes section
Weak references are references that are permitted to remain unsatisfied
in the final link. This means they cannot be implemented using place
relative relocations, resulting in GOT entries when using position
independent code generation.

The notes section should always exist, so the weak annotations can be
omitted.

Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2024-05-02 19:48:26 +09:00
Thomas Zimmermann
b0a835db17 Merge drm/drm-next into drm-misc-next
Backmerging to get DRM fixes from v6.9-rc6.

Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
2024-05-02 11:25:27 +02:00
Jocelyn Falempe
b94605a388 lib/fonts: Allow to select fonts for drm_panic
drm_panic has been introduced recently, and uses the same fonts as
FRAMEBUFFER_CONSOLE.

Signed-off-by: Jocelyn Falempe <jfalempe@redhat.com>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20240419132243.154466-1-jfalempe@redhat.com
2024-05-02 09:40:17 +02:00
Kees Cook
74df22453c kunit/fortify: Fix replaced failure path to unbreak __alloc_size
The __alloc_size annotation for kmemdup() was getting disabled under
KUnit testing because the replaced fortify_panic macro implementation
was using "return NULL" as a way to survive the sanity checking. But
having the chance to return NULL invalidated __alloc_size, so kmemdup
was not passing the __builtin_dynamic_object_size() tests any more:

[23:26:18] [PASSED] fortify_test_alloc_size_kmalloc_const
[23:26:19]     # fortify_test_alloc_size_kmalloc_dynamic: EXPECTATION FAILED at lib/fortify_kunit.c:265
[23:26:19]     Expected __builtin_dynamic_object_size(p, 1) == expected, but
[23:26:19]         __builtin_dynamic_object_size(p, 1) == -1 (0xffffffffffffffff)
[23:26:19]         expected == 11 (0xb)
[23:26:19] __alloc_size() not working with __bdos on kmemdup("hello there", len, gfp)
[23:26:19] [FAILED] fortify_test_alloc_size_kmalloc_dynamic

Normal builds were not affected: __alloc_size continued to work there.

Use a zero-sized allocation instead, which allows __alloc_size to
behave.

Fixes: 4ce615e798 ("fortify: Provide KUnit counters for failure testing")
Fixes: fa4a3f86d4 ("fortify: Add KUnit tests for runtime overflows")
Link: https://lore.kernel.org/r/20240501232937.work.532-kees@kernel.org
Signed-off-by: Kees Cook <keescook@chromium.org>
2024-05-01 16:35:06 -07:00
Andrii Nakryiko
78d0b16127 objpool: cache nr_possible_cpus() and avoid caching nr_cpu_ids
Profiling shows that calling nr_possible_cpus() in objpool_pop() takes
a noticeable amount of CPU (when profiled on 80-core machine), as we
need to recalculate number of set bits in a CPU bit mask. This number
can't change, so there is no point in paying the price for recalculating
it. As such, cache this value in struct objpool_head and use it in
objpool_pop().

On the other hand, cached pool->nr_cpus isn't necessary, as it's not
used in hot path and is also a pretty trivial value to retrieve. So drop
pool->nr_cpus in favor of using nr_cpu_ids everywhere. This way the size
of struct objpool_head remains the same, which is a nice bonus.

Same BPF selftests benchmarks were used to evaluate the effect. Using
changes in previous patch (inlining of objpool_pop/objpool_push) as
baseline, here are the differences:

BASELINE
========
kretprobe      :    9.937 ± 0.174M/s
kretprobe-multi:   10.440 ± 0.108M/s

AFTER
=====
kretprobe      :   10.106 ± 0.120M/s (+1.7%)
kretprobe-multi:   10.515 ± 0.180M/s (+0.7%)

Link: https://lore.kernel.org/all/20240424215214.3956041-3-andrii@kernel.org/

Cc: Matt (Qiang) Wu <wuqiang.matt@bytedance.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
2024-05-01 23:18:48 +09:00
Andrii Nakryiko
a3b00f10da objpool: enable inlining objpool_push() and objpool_pop() operations
objpool_push() and objpool_pop() are very performance-critical functions
and can be called very frequently in kretprobe triggering path.

As such, it makes sense to allow compiler to inline them completely to
eliminate function calls overhead. Luckily, their logic is quite well
isolated and doesn't have any sprawling dependencies.

This patch moves both objpool_push() and objpool_pop() into
include/linux/objpool.h and marks them as static inline functions,
enabling inlining. To avoid anyone using internal helpers
(objpool_try_get_slot, objpool_try_add_slot), rename them to use leading
underscores.

We used kretprobe microbenchmark from BPF selftests (bench trig-kprobe
and trig-kprobe-multi benchmarks) running no-op BPF kretprobe/kretprobe.multi
programs in a tight loop to evaluate the effect. BPF own overhead in
this case is minimal and it mostly stresses the rest of in-kernel
kretprobe infrastructure overhead. Results are in millions of calls per
second. This is not super scientific, but shows the trend nevertheless.

BEFORE
======
kretprobe      :    9.794 ± 0.086M/s
kretprobe-multi:   10.219 ± 0.032M/s

AFTER
=====
kretprobe      :    9.937 ± 0.174M/s (+1.5%)
kretprobe-multi:   10.440 ± 0.108M/s (+2.2%)

Link: https://lore.kernel.org/all/20240424215214.3956041-2-andrii@kernel.org/

Cc: Matt (Qiang) Wu <wuqiang.matt@bytedance.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
2024-05-01 23:18:48 +09:00
Kees Cook
26f812ba75 kunit/fortify: Add memcpy() tests
Add fortify tests for memcpy() and memmove(). This can use a similar
method to the fortify_panic() replacement, only we can do it for what
was the WARN_ONCE(), which can be redefined.

Since this is primarily testing the fortify behaviors of the memcpy()
and memmove() defenses, the tests for memcpy() and memmove() are
identical.

Link: https://lore.kernel.org/r/20240429194342.2421639-3-keescook@chromium.org
Signed-off-by: Kees Cook <keescook@chromium.org>
2024-04-30 10:34:30 -07:00
Kees Cook
091f79e8de kunit/fortify: Do not spam logs with fortify WARNs
When running KUnit fortify tests, we're already doing precise tracking
of which warnings are getting hit. Don't fill the logs with WARNs unless
we've been explicitly built with DEBUG enabled.

Link: https://lore.kernel.org/r/20240429194342.2421639-2-keescook@chromium.org
Signed-off-by: Kees Cook <keescook@chromium.org>
2024-04-30 10:34:29 -07:00
Kees Cook
a0d6677ec3 kunit/fortify: Rename tests to use recommended conventions
The recommended conventions for KUnit tests is ${module}_test_${what}.
Adjust the fortify tests to match.

Link: https://lore.kernel.org/r/20240429194342.2421639-1-keescook@chromium.org
Signed-off-by: Kees Cook <keescook@chromium.org>
2024-04-30 10:34:29 -07:00
Jim Cromie
00e7d3bea2 dyndbg: fix old BUG_ON in >control parser
Fix a BUG_ON from 2009.  Even if it looks "unreachable" (I didn't
really look), lets make sure by removing it, doing pr_err and return
-EINVAL instead.

Cc: stable <stable@kernel.org>
Signed-off-by: Jim Cromie <jim.cromie@gmail.com>
Link: https://lore.kernel.org/r/20240429193145.66543-2-jim.cromie@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-04-30 09:20:48 +02:00
Jakub Kicinski
89de2db193 bpf-next-for-netdev
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQTFp0I1jqZrAX+hPRXbK58LschIgwUCZi9+AAAKCRDbK58LschI
 g0nEAP487m7L0nLVriC2oIOWsi29tklW3etm6DO7gmGRGIHgrgEAnMyV1xBj3bGj
 v6jJwDcybCym1hLx+1x1JCZ4eoAFswE=
 =xbna
 -----END PGP SIGNATURE-----

Merge tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next

Daniel Borkmann says:

====================
pull-request: bpf-next 2024-04-29

We've added 147 non-merge commits during the last 32 day(s) which contain
a total of 158 files changed, 9400 insertions(+), 2213 deletions(-).

The main changes are:

1) Add an internal-only BPF per-CPU instruction for resolving per-CPU
   memory addresses and implement support in x86 BPF JIT. This allows
   inlining per-CPU array and hashmap lookups
   and the bpf_get_smp_processor_id() helper, from Andrii Nakryiko.

2) Add BPF link support for sk_msg and sk_skb programs, from Yonghong Song.

3) Optimize x86 BPF JIT's emit_mov_imm64, and add support for various
   atomics in bpf_arena which can be JITed as a single x86 instruction,
   from Alexei Starovoitov.

4) Add support for passing mark with bpf_fib_lookup helper,
   from Anton Protopopov.

5) Add a new bpf_wq API for deferring events and refactor sleepable
   bpf_timer code to keep common code where possible,
   from Benjamin Tissoires.

6) Fix BPF_PROG_TEST_RUN infra with regards to bpf_dummy_struct_ops programs
   to check when NULL is passed for non-NULLable parameters,
   from Eduard Zingerman.

7) Harden the BPF verifier's and/or/xor value tracking,
   from Harishankar Vishwanathan.

8) Introduce crypto kfuncs to make BPF programs able to utilize the kernel
   crypto subsystem, from Vadim Fedorenko.

9) Various improvements to the BPF instruction set standardization doc,
   from Dave Thaler.

10) Extend libbpf APIs to partially consume items from the BPF ringbuffer,
    from Andrea Righi.

11) Bigger batch of BPF selftests refactoring to use common network helpers
    and to drop duplicate code, from Geliang Tang.

12) Support bpf_tail_call_static() helper for BPF programs with GCC 13,
    from Jose E. Marchesi.

13) Add bpf_preempt_{disable,enable}() kfuncs in order to allow a BPF
    program to have code sections where preemption is disabled,
    from Kumar Kartikeya Dwivedi.

14) Allow invoking BPF kfuncs from BPF_PROG_TYPE_SYSCALL programs,
    from David Vernet.

15) Extend the BPF verifier to allow different input maps for a given
    bpf_for_each_map_elem() helper call in a BPF program, from Philo Lu.

16) Add support for PROBE_MEM32 and bpf_addr_space_cast instructions
    for riscv64 and arm64 JITs to enable BPF Arena, from Puranjay Mohan.

17) Shut up a false-positive KMSAN splat in interpreter mode by unpoison
    the stack memory, from Martin KaFai Lau.

18) Improve xsk selftest coverage with new tests on maximum and minimum
    hardware ring size configurations, from Tushar Vyavahare.

19) Various ReST man pages fixes as well as documentation and bash completion
    improvements for bpftool, from Rameez Rehman & Quentin Monnet.

20) Fix libbpf with regards to dumping subsequent char arrays,
    from Quentin Deslandes.

* tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (147 commits)
  bpf, docs: Clarify PC use in instruction-set.rst
  bpf_helpers.h: Define bpf_tail_call_static when building with GCC
  bpf, docs: Add introduction for use in the ISA Internet Draft
  selftests/bpf: extend BPF_SOCK_OPS_RTT_CB test for srtt and mrtt_us
  bpf: add mrtt and srtt as BPF_SOCK_OPS_RTT_CB args
  selftests/bpf: dummy_st_ops should reject 0 for non-nullable params
  bpf: check bpf_dummy_struct_ops program params for test runs
  selftests/bpf: do not pass NULL for non-nullable params in dummy_st_ops
  selftests/bpf: adjust dummy_st_ops_success to detect additional error
  bpf: mark bpf_dummy_struct_ops.test_1 parameter as nullable
  selftests/bpf: Add ring_buffer__consume_n test.
  bpf: Add bpf_guard_preempt() convenience macro
  selftests: bpf: crypto: add benchmark for crypto functions
  selftests: bpf: crypto skcipher algo selftests
  bpf: crypto: add skcipher to bpf crypto
  bpf: make common crypto API for TC/XDP programs
  bpf: update the comment for BTF_FIELDS_MAX
  selftests/bpf: Fix wq test.
  selftests/bpf: Use make_sockaddr in test_sock_addr
  selftests/bpf: Use connect_to_addr in test_sock_addr
  ...
====================

Link: https://lore.kernel.org/r/20240429131657.19423-1-daniel@iogearbox.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-04-29 13:12:19 -07:00
Jakub Kicinski
b2ff42c6d3 bpf-for-netdev
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQTFp0I1jqZrAX+hPRXbK58LschIgwUCZiwdfQAKCRDbK58LschI
 g1oqAP9mjayeIHCfYMQZa2eevy1PmVlgdNdFdMDWZFS/pHv9cgD/ZdmGzbUDKCAQ
 Y/KiTajitZw3kxtHX45v8/Ugtlsh9Qg=
 =Ewiw
 -----END PGP SIGNATURE-----

Merge tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf

Daniel Borkmann says:

====================
pull-request: bpf 2024-04-26

We've added 12 non-merge commits during the last 22 day(s) which contain
a total of 14 files changed, 168 insertions(+), 72 deletions(-).

The main changes are:

1) Fix BPF_PROBE_MEM in verifier and JIT to skip loads from vsyscall page,
   from Puranjay Mohan.

2) Fix a crash in XDP with devmap broadcast redirect when the latter map
   is in process of being torn down, from Toke Høiland-Jørgensen.

3) Fix arm64 and riscv64 BPF JITs to properly clear start time for BPF
   program runtime stats, from Xu Kuohai.

4) Fix a sockmap KCSAN-reported data race in sk_psock_skb_ingress_enqueue,
    from Jason Xing.

5) Fix BPF verifier error message in resolve_pseudo_ldimm64,
   from Anton Protopopov.

6) Fix missing DEBUG_INFO_BTF_MODULES Kconfig menu item,
   from Andrii Nakryiko.

* tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf:
  selftests/bpf: Test PROBE_MEM of VSYSCALL_ADDR on x86-64
  bpf, x86: Fix PROBE_MEM runtime load check
  bpf: verifier: prevent userspace memory access
  xdp: use flags field to disambiguate broadcast redirect
  arm32, bpf: Reimplement sign-extension mov instruction
  riscv, bpf: Fix incorrect runtime stats
  bpf, arm64: Fix incorrect runtime stats
  bpf: Fix a verifier verbose message
  bpf, skmsg: Fix NULL pointer dereference in sk_psock_skb_ingress_enqueue
  MAINTAINERS: bpf: Add Lehui and Puranjay as riscv64 reviewers
  MAINTAINERS: Update email address for Puranjay Mohan
  bpf, kconfig: Fix DEBUG_INFO_BTF_MODULES Kconfig definition
====================

Link: https://lore.kernel.org/r/20240426224248.26197-1-daniel@iogearbox.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-04-26 17:36:53 -07:00
Kees Cook
998b18072c kunit/fortify: Fix mismatched kvalloc()/vfree() usage
The kv*() family of tests were accidentally freeing with vfree() instead
of kvfree(). Use kvfree() instead.

Fixes: 9124a26401 ("kunit/fortify: Validate __alloc_size attribute results")
Link: https://lore.kernel.org/r/20240425230619.work.299-kees@kernel.org
Signed-off-by: Kees Cook <keescook@chromium.org>
2024-04-26 15:31:39 -07:00
Linus Torvalds
e6ebf01172 11 hotfixes. 8 are cc:stable and the remaining 3 (nice ratio!) address
post-6.8 issues or aren't considered suitable for backporting.
 
 All except one of these are for MM.  I see no particular theme - it's
 singletons all over.
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCZiwPZwAKCRDdBJ7gKXxA
 jmcQAPkB6UT/rBUMvFZb1dom9R6SDYl5ZBr20Vj1HvfakCLxmQEAqEd0N7QoWvKS
 hKNCMDujiEKqDUWeUaJen4cqXFFE2Qg=
 =1wP7
 -----END PGP SIGNATURE-----

Merge tag 'mm-hotfixes-stable-2024-04-26-13-30' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull misc fixes from Andrew Morton:
 "11 hotfixes. 8 are cc:stable and the remaining 3 (nice ratio!) address
  post-6.8 issues or aren't considered suitable for backporting.

  All except one of these are for MM. I see no particular theme - it's
  singletons all over"

* tag 'mm-hotfixes-stable-2024-04-26-13-30' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
  mm/hugetlb: fix DEBUG_LOCKS_WARN_ON(1) when dissolve_free_hugetlb_folio()
  selftests: mm: protection_keys: save/restore nr_hugepages value from launch script
  stackdepot: respect __GFP_NOLOCKDEP allocation flag
  hugetlb: check for anon_vma prior to folio allocation
  mm: zswap: fix shrinker NULL crash with cgroup_disable=memory
  mm: turn folio_test_hugetlb into a PageType
  mm: support page_mapcount() on page_has_type() pages
  mm: create FOLIO_FLAG_FALSE and FOLIO_TYPE_OPS macros
  mm/hugetlb: fix missing hugetlb_lock for resv uncharge
  selftests: mm: fix unused and uninitialized variable warning
  selftests/harness: remove use of LINE_MAX
2024-04-26 13:48:03 -07:00
David Howells
6a30653b60 Fix a potential infinite loop in extract_user_to_sg()
Fix extract_user_to_sg() so that it will break out of the loop if
iov_iter_extract_pages() returns 0 rather than looping around forever.

[Note that I've included two fixes lines as the function got moved to a
different file and renamed]

Fixes: 85dd2c8ff3 ("netfs: Add a function to extract a UBUF or IOVEC into a BVEC iterator")
Fixes: f5f82cd187 ("Move netfs_extract_iter_to_sg() to lib/scatterlist.c")
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Jeff Layton <jlayton@kernel.org>
cc: Steve French <sfrench@samba.org>
cc: Herbert Xu <herbert@gondor.apana.org.au>
cc: netfs@lists.linux.dev
Link: https://lore.kernel.org/r/1967121.1714034372@warthog.procyon.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-04-26 12:35:57 -07:00
linke li
6ad0d7e0f4 sbitmap: use READ_ONCE to access map->word
In __sbitmap_queue_get_batch(), map->word is read several times, and
update atomically using atomic_long_try_cmpxchg(). But the first two read
of map->word is not protected.

This patch moves the statement val = READ_ONCE(map->word) forward,
eliminating unprotected accesses to map->word within the function.
It is aimed at reducing the number of benign races reported by KCSAN in
order to focus future debugging effort on harmful races.

Signed-off-by: linke li <lilinke99@qq.com>
Link: https://lore.kernel.org/r/tencent_0B517C25E519D3D002194E8445E86C04AD0A@qq.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2024-04-26 07:40:28 -06:00
Arnd Bergmann
3ef3a05ba6 test_hexdump: avoid string truncation warning
gcc can warn when a string is too long to fit into the strncpy()
destination buffer, as it is here depending on the function arguments:

    inlined from 'test_hexdump_prepare_test.constprop' at /home/arnd/arm-soc/lib/test_hexdump.c:116:3:
include/linux/fortify-string.h:108:33: error: '__builtin_strncpy' output truncated copying between 0 and 32 bytes from a string of length 32 [-Werror=stringop-truncation]
  108 | #define __underlying_strncpy    __builtin_strncpy
      |                                 ^
include/linux/fortify-string.h:187:16: note: in expansion of macro '__underlying_strncpy'
  187 |         return __underlying_strncpy(p, q, size);
      |                ^~~~~~~~~~~~~~~~~~~~

The intention here is to copy exactly 'l' bytes without any padding or
NUL-termination, so the most logical change is to use memcpy(), just as
a previous change adapted the other output from strncpy() to memcpy().

Link: https://lkml.kernel.org/r/20240409140059.3806717-2-arnd@kernel.org
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Justin Stitt <justinstitt@google.com>
Cc: Alexey Starikovskiy <astarikovskiy@suse.de>
Cc: Bob Moore <robert.moore@intel.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Len Brown <lenb@kernel.org>
Cc: Lin Ming <ming.m.lin@intel.com>
Cc: Masahiro Yamada <masahiroy@kernel.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Nicolas Schier <nicolas@fjasle.eu>
Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: "Richard Russon (FlatCap)" <ldm@flatcap.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-04-25 21:07:07 -07:00
Andy Shevchenko
3cc98aa11e devres: don't use "proxy" headers
Update header inclusions to follow IWYU (Include What You Use) principle.

Link: https://lkml.kernel.org/r/20240403104820.557487-3-andriy.shevchenko@linux.intel.com
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-by: Philipp Stanner <pstanner@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-04-25 21:07:06 -07:00
Andy Shevchenko
f36c54f3ce devres: switch to use dev_err_probe() for unification
Patch series "devres: A couple of cleanups".

A couple of ad-hoc cleanups. No functional changes intended. 


This patch (of 2):

The devm_*() APIs are supposed to be called during the ->probe() stage. 
Many drivers (especially new ones) have switched to use dev_err_probe()
for error messaging for the sake of unification.  Let's do the same in the
devres APIs.

Link: https://lkml.kernel.org/r/20240403104820.557487-1-andriy.shevchenko@linux.intel.com
Link: https://lkml.kernel.org/r/20240403104820.557487-2-andriy.shevchenko@linux.intel.com
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-by: Philipp Stanner <pstanner@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-04-25 21:07:06 -07:00
Niklas Schnelle
b157f0e97e kgdb: add HAS_IOPORT dependency
In a future patch HAS_IOPORT=n will disable inb()/outb() and friends at
compile time.  We thus need to add HAS_IOPORT as dependency for those
drivers using them.

Link: https://lkml.kernel.org/r/20240403132547.762429-2-schnelle@linux.ibm.com
Co-developed-by: Arnd Bergmann <arnd@kernel.org>
Signed-off-by: Arnd Bergmann <arnd@kernel.org>
Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-04-25 21:07:05 -07:00
Uwe Kleine-König
5ef6dc08cf lib/build_OID_registry: don't mention the full path of the script in output
This change strips the full path of the script generating
lib/oid_registry_data.c to just lib/build_OID_registry.  The motivation
for this change is Yocto emitting a build warning

	File /usr/src/debug/linux-lxatac/6.7-r0/lib/oid_registry_data.c in package linux-lxatac-src contains reference to TMPDIR [buildpaths]

So this change brings us one step closer to make the build result
reproducible independent of the build path.

Link: https://lkml.kernel.org/r/20240313211957.884561-2-u.kleine-koenig@pengutronix.de
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Cc: Masahiro Yamada <masahiroy@kernel.org>
Reviewed-by: Nicolas Schier <nicolas@fjasle.eu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-04-25 21:07:01 -07:00
Kairui Song
6758c1128c mm/filemap: optimize filemap folio adding
Instead of doing multiple tree walks, do one optimism range check with
lock hold, and exit if raced with another insertion.  If a shadow exists,
check it with a new xas_get_order helper before releasing the lock to
avoid redundant tree walks for getting its order.

Drop the lock and do the allocation only if a split is needed.

In the best case, it only need to walk the tree once.  If it needs to
alloc and split, 3 walks are issued (One for first ranged conflict check
and order retrieving, one for the second check after allocation, one for
the insert after split).

Testing with 4K pages, in an 8G cgroup, with 16G brd as block device:

  echo 3 > /proc/sys/vm/drop_caches

  fio -name=cached --numjobs=16 --filename=/mnt/test.img \
    --buffered=1 --ioengine=mmap --rw=randread --time_based \
    --ramp_time=30s --runtime=5m --group_reporting

Before:
bw (  MiB/s): min= 1027, max= 3520, per=100.00%, avg=2445.02, stdev=18.90, samples=8691
iops        : min=263001, max=901288, avg=625924.36, stdev=4837.28, samples=8691

After (+7.3%):
bw (  MiB/s): min=  493, max= 3947, per=100.00%, avg=2625.56, stdev=25.74, samples=8651
iops        : min=126454, max=1010681, avg=672142.61, stdev=6590.48, samples=8651

Test result with THP (do a THP randread then switch to 4K page in hope it
issues a lot of splitting):

  echo 3 > /proc/sys/vm/drop_caches

  fio -name=cached --numjobs=16 --filename=/mnt/test.img \
      --buffered=1 --ioengine=mmap -thp=1 --readonly \
      --rw=randread --time_based --ramp_time=30s --runtime=10m \
      --group_reporting

  fio -name=cached --numjobs=16 --filename=/mnt/test.img \
      --buffered=1 --ioengine=mmap \
      --rw=randread --time_based --runtime=5s --group_reporting

Before:
bw (  KiB/s): min= 4141, max=14202, per=100.00%, avg=7935.51, stdev=96.85, samples=18976
iops        : min= 1029, max= 3548, avg=1979.52, stdev=24.23, samples=18976·

READ: bw=4545B/s (4545B/s), 4545B/s-4545B/s (4545B/s-4545B/s), io=64.0KiB (65.5kB), run=14419-14419msec

After (+12.5%):
bw (  KiB/s): min= 4611, max=15370, per=100.00%, avg=8928.74, stdev=105.17, samples=19146
iops        : min= 1151, max= 3842, avg=2231.27, stdev=26.29, samples=19146

READ: bw=4635B/s (4635B/s), 4635B/s-4635B/s (4635B/s-4635B/s), io=64.0KiB (65.5kB), run=14137-14137msec

The performance is better for both 4K (+7.5%) and THP (+12.5%) cached read.

Link: https://lkml.kernel.org/r/20240415171857.19244-5-ryncsn@gmail.com
Signed-off-by: Kairui Song <kasong@tencent.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-04-25 20:56:09 -07:00
Kairui Song
a4864671ca lib/xarray: introduce a new helper xas_get_order
It can be used after xas_load to check the order of loaded entries. 
Compared to xa_get_order, it saves an XA_STATE and avoid a rewalk.

Added new test for xas_get_order, to make the test work, we have to export
xas_get_order with EXPORT_SYMBOL_GPL.

Also fix a sparse warning by checking the slot value with xa_entry instead
of accessing it directly, as suggested by Matthew Wilcox.

[kasong@tencent.com: simplify comment, sparse warning fix, per Matthew Wilcox]
  Link: https://lkml.kernel.org/r/20240416071722.45997-4-ryncsn@gmail.com
Link: https://lkml.kernel.org/r/20240415171857.19244-4-ryncsn@gmail.com
Signed-off-by: Kairui Song <kasong@tencent.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-04-25 20:56:09 -07:00
Kees Cook
e13106952f alloc_tag: Tighten file permissions on /proc/allocinfo
The /proc/allocinfo file exposes a tremendous about of information about
kernel build details, memory allocations (obviously), and potentially even
image layout (due to ordering).  As this is intended to be consumed by
system owners (like /proc/slabinfo), use the same file permissions as
there: 0400.

Link: https://lkml.kernel.org/r/20240425200844.work.184-kees@kernel.org
Signed-off-by: Kees Cook <keescook@chromium.org>
Cc: Kent Overstreet <kent.overstreet@linux.dev>
Cc: Suren Baghdasaryan <surenb@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-04-25 20:55:59 -07:00
Suren Baghdasaryan
1438d349d1 lib: add memory allocations report in show_mem()
Include allocations in show_mem reports.

Link: https://lkml.kernel.org/r/20240321163705.3067592-33-surenb@google.com
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
Tested-by: Kees Cook <keescook@chromium.org>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Alex Gaynor <alex.gaynor@gmail.com>
Cc: Alice Ryhl <aliceryhl@google.com>
Cc: Andreas Hindborg <a.hindborg@samsung.com>
Cc: Benno Lossin <benno.lossin@proton.me>
Cc: "Björn Roy Baron" <bjorn3_gh@protonmail.com>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Dennis Zhou <dennis@kernel.org>
Cc: Gary Guo <gary@garyguo.net>
Cc: Miguel Ojeda <ojeda@kernel.org>
Cc: Pasha Tatashin <pasha.tatashin@soleen.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Tejun Heo <tj@kernel.org>
Cc: Wedson Almeida Filho <wedsonaf@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-04-25 20:55:57 -07:00
Kent Overstreet
9e54dd8b64 rhashtable: plumb through alloc tag
This gives better memory allocation profiling results; rhashtable
allocations will be accounted to the code that initialized the rhashtable.

[surenb@google.com: undo _noprof additions in the documentation]
  Link: https://lkml.kernel.org/r/20240326231453.1206227-1-surenb@google.com
Link: https://lkml.kernel.org/r/20240321163705.3067592-32-surenb@google.com
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Tested-by: Kees Cook <keescook@chromium.org>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Alex Gaynor <alex.gaynor@gmail.com>
Cc: Alice Ryhl <aliceryhl@google.com>
Cc: Andreas Hindborg <a.hindborg@samsung.com>
Cc: Benno Lossin <benno.lossin@proton.me>
Cc: "Björn Roy Baron" <bjorn3_gh@protonmail.com>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Dennis Zhou <dennis@kernel.org>
Cc: Gary Guo <gary@garyguo.net>
Cc: Miguel Ojeda <ojeda@kernel.org>
Cc: Pasha Tatashin <pasha.tatashin@soleen.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Tejun Heo <tj@kernel.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Wedson Almeida Filho <wedsonaf@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-04-25 20:55:57 -07:00
Suren Baghdasaryan
c789b5fe38 lib: add codetag reference into slabobj_ext
To store code tag for every slab object, a codetag reference is embedded
into slabobj_ext when CONFIG_MEM_ALLOC_PROFILING=y.

Link: https://lkml.kernel.org/r/20240321163705.3067592-23-surenb@google.com
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Co-developed-by: Kent Overstreet <kent.overstreet@linux.dev>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
Tested-by: Kees Cook <keescook@chromium.org>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Alex Gaynor <alex.gaynor@gmail.com>
Cc: Alice Ryhl <aliceryhl@google.com>
Cc: Andreas Hindborg <a.hindborg@samsung.com>
Cc: Benno Lossin <benno.lossin@proton.me>
Cc: "Björn Roy Baron" <bjorn3_gh@protonmail.com>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Dennis Zhou <dennis@kernel.org>
Cc: Gary Guo <gary@garyguo.net>
Cc: Miguel Ojeda <ojeda@kernel.org>
Cc: Pasha Tatashin <pasha.tatashin@soleen.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Tejun Heo <tj@kernel.org>
Cc: Wedson Almeida Filho <wedsonaf@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-04-25 20:55:55 -07:00
Suren Baghdasaryan
8d469d0bee lib: introduce early boot parameter to avoid page_ext memory overhead
The highest memory overhead from memory allocation profiling comes from
page_ext objects.  This overhead exists even if the feature is disabled
but compiled-in.  To avoid it, introduce an early boot parameter that
prevents page_ext object creation.  The new boot parameter is a tri-state
with possible values of 0|1|never.  When it is set to "never" the memory
allocation profiling support is disabled, and overhead is minimized
(currently no page_ext objects are allocated, in the future more overhead
might be eliminated).  As a result we also lose ability to enable memory
allocation profiling at runtime (because there is no space to store
alloctag references).  Runtime sysctrl becomes read-only if the early boot
parameter was set to "never".  Note that the default value of this boot
parameter depends on the CONFIG_MEM_ALLOC_PROFILING_ENABLED_BY_DEFAULT
configuration.  When CONFIG_MEM_ALLOC_PROFILING_ENABLED_BY_DEFAULT=n the
boot parameter is set to "never", therefore eliminating any overhead. 
CONFIG_MEM_ALLOC_PROFILING_ENABLED_BY_DEFAULT=y results in boot parameter
being set to 1 (enabled).  This allows distributions to avoid any overhead
by setting CONFIG_MEM_ALLOC_PROFILING_ENABLED_BY_DEFAULT=n config and with
no changes to the kernel command line.

We reuse sysctl.vm.mem_profiling boot parameter name in order to avoid
introducing yet another control.  This change turns it into a tri-state
early boot parameter.

Link: https://lkml.kernel.org/r/20240321163705.3067592-16-surenb@google.com
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
Tested-by: Kees Cook <keescook@chromium.org>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Alex Gaynor <alex.gaynor@gmail.com>
Cc: Alice Ryhl <aliceryhl@google.com>
Cc: Andreas Hindborg <a.hindborg@samsung.com>
Cc: Benno Lossin <benno.lossin@proton.me>
Cc: "Björn Roy Baron" <bjorn3_gh@protonmail.com>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Dennis Zhou <dennis@kernel.org>
Cc: Gary Guo <gary@garyguo.net>
Cc: Kent Overstreet <kent.overstreet@linux.dev>
Cc: Miguel Ojeda <ojeda@kernel.org>
Cc: Pasha Tatashin <pasha.tatashin@soleen.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Tejun Heo <tj@kernel.org>
Cc: Wedson Almeida Filho <wedsonaf@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-04-25 20:55:53 -07:00
Suren Baghdasaryan
dcfe378c81 lib: introduce support for page allocation tagging
Introduce helper functions to easily instrument page allocators by storing
a pointer to the allocation tag associated with the code that allocated
the page in a page_ext field.

Link: https://lkml.kernel.org/r/20240321163705.3067592-15-surenb@google.com
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Co-developed-by: Kent Overstreet <kent.overstreet@linux.dev>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
Tested-by: Kees Cook <keescook@chromium.org>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Alex Gaynor <alex.gaynor@gmail.com>
Cc: Alice Ryhl <aliceryhl@google.com>
Cc: Andreas Hindborg <a.hindborg@samsung.com>
Cc: Benno Lossin <benno.lossin@proton.me>
Cc: "Björn Roy Baron" <bjorn3_gh@protonmail.com>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Dennis Zhou <dennis@kernel.org>
Cc: Gary Guo <gary@garyguo.net>
Cc: Miguel Ojeda <ojeda@kernel.org>
Cc: Pasha Tatashin <pasha.tatashin@soleen.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Tejun Heo <tj@kernel.org>
Cc: Wedson Almeida Filho <wedsonaf@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-04-25 20:55:53 -07:00
Suren Baghdasaryan
22d407b164 lib: add allocation tagging support for memory allocation profiling
Introduce CONFIG_MEM_ALLOC_PROFILING which provides definitions to easily
instrument memory allocators.  It registers an "alloc_tags" codetag type
with /proc/allocinfo interface to output allocation tag information when
the feature is enabled.

CONFIG_MEM_ALLOC_PROFILING_DEBUG is provided for debugging the memory
allocation profiling instrumentation.

Memory allocation profiling can be enabled or disabled at runtime using
/proc/sys/vm/mem_profiling sysctl when CONFIG_MEM_ALLOC_PROFILING_DEBUG=n.
CONFIG_MEM_ALLOC_PROFILING_ENABLED_BY_DEFAULT enables memory allocation
profiling by default.

[surenb@google.com: Documentation/filesystems/proc.rst: fix allocinfo title]
  Link: https://lkml.kernel.org/r/20240326073813.727090-1-surenb@google.com
[surenb@google.com: do limited memory accounting for modules with ARCH_NEEDS_WEAK_PER_CPU]
  Link: https://lkml.kernel.org/r/20240402180933.1663992-2-surenb@google.com
[klarasmodin@gmail.com: explicitly include irqflags.h in alloc_tag.h]
  Link: https://lkml.kernel.org/r/20240407133252.173636-1-klarasmodin@gmail.com
[surenb@google.com: fix alloc_tag_init() to prevent passing NULL to PTR_ERR()]
  Link: https://lkml.kernel.org/r/20240417003349.2520094-1-surenb@google.com
Link: https://lkml.kernel.org/r/20240321163705.3067592-14-surenb@google.com
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Co-developed-by: Kent Overstreet <kent.overstreet@linux.dev>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Signed-off-by: Klara Modin <klarasmodin@gmail.com>
Tested-by: Kees Cook <keescook@chromium.org>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Alex Gaynor <alex.gaynor@gmail.com>
Cc: Alice Ryhl <aliceryhl@google.com>
Cc: Andreas Hindborg <a.hindborg@samsung.com>
Cc: Benno Lossin <benno.lossin@proton.me>
Cc: "Björn Roy Baron" <bjorn3_gh@protonmail.com>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Dennis Zhou <dennis@kernel.org>
Cc: Gary Guo <gary@garyguo.net>
Cc: Miguel Ojeda <ojeda@kernel.org>
Cc: Pasha Tatashin <pasha.tatashin@soleen.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Tejun Heo <tj@kernel.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Wedson Almeida Filho <wedsonaf@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-04-25 20:55:52 -07:00
Suren Baghdasaryan
47a92dfbe0 lib: prevent module unloading if memory is not freed
Skip freeing module's data section if there are non-zero allocation tags
because otherwise, once these allocations are freed, the access to their
code tag would cause UAF.

Link: https://lkml.kernel.org/r/20240321163705.3067592-13-surenb@google.com
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Tested-by: Kees Cook <keescook@chromium.org>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Alex Gaynor <alex.gaynor@gmail.com>
Cc: Alice Ryhl <aliceryhl@google.com>
Cc: Andreas Hindborg <a.hindborg@samsung.com>
Cc: Benno Lossin <benno.lossin@proton.me>
Cc: "Björn Roy Baron" <bjorn3_gh@protonmail.com>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Dennis Zhou <dennis@kernel.org>
Cc: Gary Guo <gary@garyguo.net>
Cc: Kent Overstreet <kent.overstreet@linux.dev>
Cc: Miguel Ojeda <ojeda@kernel.org>
Cc: Pasha Tatashin <pasha.tatashin@soleen.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Tejun Heo <tj@kernel.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Wedson Almeida Filho <wedsonaf@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-04-25 20:55:52 -07:00
Suren Baghdasaryan
a473573964 lib: code tagging module support
Add support for code tagging from dynamically loaded modules.

Link: https://lkml.kernel.org/r/20240321163705.3067592-12-surenb@google.com
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Co-developed-by: Kent Overstreet <kent.overstreet@linux.dev>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Tested-by: Kees Cook <keescook@chromium.org>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Alex Gaynor <alex.gaynor@gmail.com>
Cc: Alice Ryhl <aliceryhl@google.com>
Cc: Andreas Hindborg <a.hindborg@samsung.com>
Cc: Benno Lossin <benno.lossin@proton.me>
Cc: "Björn Roy Baron" <bjorn3_gh@protonmail.com>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Dennis Zhou <dennis@kernel.org>
Cc: Gary Guo <gary@garyguo.net>
Cc: Miguel Ojeda <ojeda@kernel.org>
Cc: Pasha Tatashin <pasha.tatashin@soleen.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Tejun Heo <tj@kernel.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Wedson Almeida Filho <wedsonaf@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-04-25 20:55:52 -07:00
Suren Baghdasaryan
916cc5167c lib: code tagging framework
Add basic infrastructure to support code tagging which stores tag common
information consisting of the module name, function, file name and line
number.  Provide functions to register a new code tag type and navigate
between code tags.

Link: https://lkml.kernel.org/r/20240321163705.3067592-11-surenb@google.com
Co-developed-by: Kent Overstreet <kent.overstreet@linux.dev>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Tested-by: Kees Cook <keescook@chromium.org>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Alex Gaynor <alex.gaynor@gmail.com>
Cc: Alice Ryhl <aliceryhl@google.com>
Cc: Andreas Hindborg <a.hindborg@samsung.com>
Cc: Benno Lossin <benno.lossin@proton.me>
Cc: "Björn Roy Baron" <bjorn3_gh@protonmail.com>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Dennis Zhou <dennis@kernel.org>
Cc: Gary Guo <gary@garyguo.net>
Cc: Miguel Ojeda <ojeda@kernel.org>
Cc: Pasha Tatashin <pasha.tatashin@soleen.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Tejun Heo <tj@kernel.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Wedson Almeida Filho <wedsonaf@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-04-25 20:55:52 -07:00
Duoming Zhou
c2af060d1c lib/test_hmm.c: handle src_pfns and dst_pfns allocation failure
The kcalloc() in dmirror_device_evict_chunk() will return null if the
physical memory has run out.  As a result, if src_pfns or dst_pfns is
dereferenced, the null pointer dereference bug will happen.

Moreover, the device is going away.  If the kcalloc() fails, the pages
mapping a chunk could not be evicted.  So add a __GFP_NOFAIL flag in
kcalloc().

Finally, as there is no need to have physically contiguous memory, Switch
kcalloc() to kvcalloc() in order to avoid failing allocations.

Link: https://lkml.kernel.org/r/20240312005905.9939-1-duoming@zju.edu.cn
Fixes: b2ef9f5a5c ("mm/hmm/test: add selftest driver for HMM")
Signed-off-by: Duoming Zhou <duoming@zju.edu.cn>
Cc: Jérôme Glisse <jglisse@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-04-25 20:55:48 -07:00
Jakub Kicinski
2bd87951de Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Cross-merge networking fixes after downstream PR.

Conflicts:

drivers/net/ethernet/ti/icssg/icssg_prueth.c

net/mac80211/chan.c
  89884459a0 ("wifi: mac80211: fix idle calculation with multi-link")
  87f5500285 ("wifi: mac80211: simplify ieee80211_assign_link_chanctx()")
https://lore.kernel.org/all/20240422105623.7b1fbda2@canb.auug.org.au/

net/unix/garbage.c
  1971d13ffa ("af_unix: Suppress false-positive lockdep splat for spin_lock() in __unix_gc().")
  4090fa373f ("af_unix: Replace garbage collection algorithm.")

drivers/net/ethernet/ti/icssg/icssg_prueth.c
drivers/net/ethernet/ti/icssg/icssg_common.c
  4dcd0e83ea ("net: ti: icssg-prueth: Fix signedness bug in prueth_init_rx_chns()")
  e2dc7bfd67 ("net: ti: icssg-prueth: Move common functions into a separate file")

No adjacent changes.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-04-25 12:41:37 -07:00
Andrey Ryabinin
6fe60465e1 stackdepot: respect __GFP_NOLOCKDEP allocation flag
If stack_depot_save_flags() allocates memory it always drops
__GFP_NOLOCKDEP flag.  So when KASAN tries to track __GFP_NOLOCKDEP
allocation we may end up with lockdep splat like bellow:

======================================================
 WARNING: possible circular locking dependency detected
 6.9.0-rc3+ #49 Not tainted
 ------------------------------------------------------
 kswapd0/149 is trying to acquire lock:
 ffff88811346a920
(&xfs_nondir_ilock_class){++++}-{4:4}, at: xfs_reclaim_inode+0x3ac/0x590
[xfs]

 but task is already holding lock:
 ffffffff8bb33100 (fs_reclaim){+.+.}-{0:0}, at:
balance_pgdat+0x5d9/0xad0

 which lock already depends on the new lock.

 the existing dependency chain (in reverse order) is:
 -> #1 (fs_reclaim){+.+.}-{0:0}:
        __lock_acquire+0x7da/0x1030
        lock_acquire+0x15d/0x400
        fs_reclaim_acquire+0xb5/0x100
 prepare_alloc_pages.constprop.0+0xc5/0x230
        __alloc_pages+0x12a/0x3f0
        alloc_pages_mpol+0x175/0x340
        stack_depot_save_flags+0x4c5/0x510
        kasan_save_stack+0x30/0x40
        kasan_save_track+0x10/0x30
        __kasan_slab_alloc+0x83/0x90
        kmem_cache_alloc+0x15e/0x4a0
        __alloc_object+0x35/0x370
        __create_object+0x22/0x90
 __kmalloc_node_track_caller+0x477/0x5b0
        krealloc+0x5f/0x110
        xfs_iext_insert_raw+0x4b2/0x6e0 [xfs]
        xfs_iext_insert+0x2e/0x130 [xfs]
        xfs_iread_bmbt_block+0x1a9/0x4d0 [xfs]
        xfs_btree_visit_block+0xfb/0x290 [xfs]
        xfs_btree_visit_blocks+0x215/0x2c0 [xfs]
        xfs_iread_extents+0x1a2/0x2e0 [xfs]
 xfs_buffered_write_iomap_begin+0x376/0x10a0 [xfs]
        iomap_iter+0x1d1/0x2d0
 iomap_file_buffered_write+0x120/0x1a0
        xfs_file_buffered_write+0x128/0x4b0 [xfs]
        vfs_write+0x675/0x890
        ksys_write+0xc3/0x160
        do_syscall_64+0x94/0x170
 entry_SYSCALL_64_after_hwframe+0x71/0x79

Always preserve __GFP_NOLOCKDEP to fix this.

Link: https://lkml.kernel.org/r/20240418141133.22950-1-ryabinin.a.a@gmail.com
Fixes: cd11016e5f ("mm, kasan: stackdepot implementation. Enable stackdepot for SLAB")
Signed-off-by: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Reported-by: Xiubo Li <xiubli@redhat.com>
Closes: https://lore.kernel.org/all/a0caa289-ca02-48eb-9bf2-d86fd47b71f4@redhat.com/
Reported-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Closes: https://lore.kernel.org/all/f9ff999a-e170-b66b-7caf-293f2b147ac2@opensource.wdc.com/
Suggested-by: Dave Chinner <david@fromorbit.com>
Tested-by: Xiubo Li <xiubli@redhat.com>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Alexander Potapenko <glider@google.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-04-24 19:34:26 -07:00
Kees Cook
2e431b23a1 ubsan: Avoid i386 UBSAN handler crashes with Clang
When generating Runtime Calls, Clang doesn't respect the -mregparm=3
option used on i386. Hopefully this will be fixed correctly in Clang 19:
https://github.com/llvm/llvm-project/pull/89707
but we need to fix this for earlier Clang versions today. Force the
calling convention to use non-register arguments.

Reported-by: Erhard Furtner <erhard_f@mailbox.org>
Closes: https://github.com/KSPP/linux/issues/350
Link: https://lore.kernel.org/r/20240424224026.it.216-kees@kernel.org
Acked-by: Nathan Chancellor <nathan@kernel.org>
Acked-by: Justin Stitt <justinstitt@google.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
2024-04-24 15:45:38 -07:00
Dawei Li
cdc66553c4 cpumask: Introduce cpumask_first_and_and()
Introduce cpumask_first_and_and() to get intersection between 3 cpumasks,
free of any intermediate cpumask variable. Instead, cpumask_first_and_and()
works in-place with all inputs and produces desired output directly.

Signed-off-by: Dawei Li <dawei.li@shingroup.cn>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Yury Norov <yury.norov@gmail.com>
Link: https://lore.kernel.org/r/20240416085454.3547175-2-dawei.li@shingroup.cn
2024-04-24 21:23:49 +02:00
Kees Cook
c209826737 ubsan: Remove 1-element array usage in debug reporting
The "type_name" character array was still marked as a 1-element array.
While we don't validate strings used in format arguments yet, let's fix
this before it causes trouble some future day.

Link: https://lore.kernel.org/r/20240424162739.work.492-kees@kernel.org
Reviewed-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Signed-off-by: Kees Cook <keescook@chromium.org>
2024-04-24 11:59:19 -07:00
Kees Cook
c01c41e500 string_kunit: Move strtomem KUnit test to string_kunit.c
It is more logical to have the strtomem() test in string_kunit.c instead
of the memcpy() suite. Move it to live with memtostr().

Signed-off-by: Kees Cook <keescook@chromium.org>
2024-04-24 09:02:34 -07:00
Kees Cook
0efc5990bc string.h: Introduce memtostr() and memtostr_pad()
Another ambiguous use of strncpy() is to copy from strings that may not
be NUL-terminated. These cases depend on having the destination buffer
be explicitly larger than the source buffer's maximum size, having
the size of the copy exactly match the source buffer's maximum size,
and for the destination buffer to get explicitly NUL terminated.

This usually happens when parsing protocols or hardware character arrays
that are not guaranteed to be NUL-terminated. The code pattern is
effectively this:

	char dest[sizeof(src) + 1];

	strncpy(dest, src, sizeof(src));
	dest[sizeof(dest) - 1] = '\0';

In practice it usually looks like:

struct from_hardware {
	...
	char name[HW_NAME_SIZE] __nonstring;
	...
};

	struct from_hardware *p = ...;
	char name[HW_NAME_SIZE + 1];

	strncpy(name, p->name, HW_NAME_SIZE);
	name[NW_NAME_SIZE] = '\0';

This cannot be replaced with:

	strscpy(name, p->name, sizeof(name));

because p->name is smaller and not NUL-terminated, so FORTIFY will
trigger when strnlen(p->name, sizeof(name)) is used. And it cannot be
replaced with:

	strscpy(name, p->name, sizeof(p->name));

because then "name" may contain a 1 character early truncation of
p->name.

Provide an unambiguous interface for converting a maybe not-NUL-terminated
string to a NUL-terminated string, with compile-time buffer size checking
so that it can never fail at runtime: memtostr() and memtostr_pad(). Also
add KUnit tests for both.

Link: https://lore.kernel.org/r/20240410023155.2100422-1-keescook@chromium.org
Signed-off-by: Kees Cook <keescook@chromium.org>
2024-04-24 08:57:09 -07:00
Greg Kroah-Hartman
660a708098 Merge 6.9-rc5 into tty-next
We want the tty fixes in here as well, and it resolves a merge conflict
in:
	drivers/tty/serial/serial_core.c
as well.

Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-04-23 13:24:45 +02:00
Jason Gunthorpe
e7bc47b166 s390: Stop using weak symbols for __iowrite64_copy()
Complete switching the __iowriteXX_copy() routines over to use #define and
arch provided inline/macro functions instead of weak symbols.

S390 has an implementation that simply calls another memcpy
function. Inline this so the callers don't have to do two jumps.

Link: https://lore.kernel.org/r/3-v3-1893cd8b9369+1925-mlx5_arm_wc_jgg@nvidia.com
Acked-by: Niklas Schnelle <schnelle@linux.ibm.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2024-04-22 17:11:20 -03:00
Jason Gunthorpe
20516d6e51 x86: Stop using weak symbols for __iowrite32_copy()
Start switching iomap_copy routines over to use #define and arch provided
inline/macro functions instead of weak symbols.

Inline functions allow more compiler optimization and this is often a
driver hot path.

x86 has the only weak implementation for __iowrite32_copy(), so replace it
with a static inline containing the same single instruction inline
assembly. The compiler will generate the "mov edx,ecx" in a more optimal
way.

Remove iomap_copy_64.S

Link: https://lore.kernel.org/r/1-v3-1893cd8b9369+1925-mlx5_arm_wc_jgg@nvidia.com
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2024-04-22 17:11:19 -03:00
Linus Torvalds
2d412262cc hardening fixes for v6.9-rc5
- Correctly disable UBSAN configs in configs/hardening (Nathan Chancellor)
 
 - Add missing signed integer overflow trap types to arm64 handler
 -----BEGIN PGP SIGNATURE-----
 
 iQJKBAABCgA0FiEEpcP2jyKd1g9yPm4TiXL039xtwCYFAmYi0M8WHGtlZXNjb29r
 QGNocm9taXVtLm9yZwAKCRCJcvTf3G3AJjPrEACrLlPZUPLiJPlPdYC5bW4lLSgZ
 v6z5XjeEVWVvIlzW3DKPKvzMmIl6D3CTN6KbgdjHR+s5VYGYVlkoQJw09SSBu1OX
 yFC2i0lyqUKuAmh6jK1T46kXvbrgK/3ClO3nQk7KotTfvuRcorAcGEmayTYnaWqd
 JX3qyry2oEiQG9pWDHl9bRQ1ZgbNdNkxR2YYIhy88lrMWORdVNG7PFkgNnVsEbnb
 UAbXl817//TSTuXUwklTllz0UNInmDmQTjrmMRUhiwTKEs8aRS5VX6biSyc1Fucz
 KYXNeK9ciV80mQYnj7jDxgXC5jNThtrjEokzht8vvZGHcBp3WMr6CJLmwj9aaSXE
 edib7mJf/YveJTCPN17xAvIMHZAFZyoyeiVIOE1Ys2lWSj8rXH5TvWnn/E4QPxHK
 77lOKGZNwNMYmIa+L6gb3OOWpiZpOMTLCGMuJh6VSDf5BcA0i45yTxAlAe5JYpgw
 txxDscFu5MtrabR4Z28J+VY/wnWqQAC89D6qYsJOPH8kL0o3XhELCDKPNUZoY094
 LV7XuhAB+xDqNdvZi7SHTmTtZSLPqRBlNrOUqQSmXrwjp11naya26l7fn1Y0cpQM
 K8o3ioUkSg0PJNox/kGxryouHXXMqtN/k52JPotkfa6XEQDpN82uo0xJD9r21Viu
 qhA7A8vcQ3KIb0cUbw==
 =Q6Hg
 -----END PGP SIGNATURE-----

Merge tag 'hardening-v6.9-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux

Pull hardening fixes from Kees Cook:

 - Correctly disable UBSAN configs in configs/hardening (Nathan
   Chancellor)

 - Add missing signed integer overflow trap types to arm64 handler

* tag 'hardening-v6.9-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
  ubsan: Add awareness of signed integer overflow traps
  configs/hardening: Disable CONFIG_UBSAN_SIGNED_WRAP
  configs/hardening: Fix disabling UBSAN configurations
2024-04-19 14:10:11 -07:00
Kees Cook
dde915c5cb string: Convert KUnit test names to standard convention
The KUnit convention for test names is AREA_test_WHAT. Adjust the string
test names to follow this pattern.

Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Tested-by: Ivan Orlov <ivan.orlov0322@gmail.com>
Link: https://lore.kernel.org/r/20240419140155.3028912-5-keescook@chromium.org
Signed-off-by: Kees Cook <keescook@chromium.org>
2024-04-19 13:12:02 -07:00
Kees Cook
bd678f7d9b string: Merge strcat KUnit tests into string_kunit.c
Move the strcat() tests into string_kunit.c. Remove the separate
Kconfig and Makefile rule.

Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Tested-by: Ivan Orlov <ivan.orlov0322@gmail.com>
Link: https://lore.kernel.org/r/20240419140155.3028912-4-keescook@chromium.org
Signed-off-by: Kees Cook <keescook@chromium.org>
2024-04-19 13:12:01 -07:00
Kees Cook
6e4ef1429f string: Prepare to merge strcat KUnit tests into string_kunit.c
The test naming convention differs between string_kunit.c and
strcat_kunit.c. Move "test" to the beginning of the function name.

Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Tested-by: Ivan Orlov <ivan.orlov0322@gmail.com>
Link: https://lore.kernel.org/r/20240419140155.3028912-3-keescook@chromium.org
Signed-off-by: Kees Cook <keescook@chromium.org>
2024-04-19 13:11:58 -07:00
Kees Cook
bb8d9b742a string: Merge strscpy KUnit tests into string_kunit.c
Move the strscpy() tests into string_kunit.c. Remove the separate
Kconfig and Makefile rule.

Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Tested-by: Ivan Orlov <ivan.orlov0322@gmail.com>
Link: https://lore.kernel.org/r/20240419140155.3028912-2-keescook@chromium.org
Signed-off-by: Kees Cook <keescook@chromium.org>
2024-04-19 13:11:57 -07:00
Kees Cook
b03442f761 string: Prepare to merge strscpy_kunit.c into string_kunit.c
In preparation for moving the strscpy_kunit.c tests into string_kunit.c,
rename "tc" to "strscpy_check" for better readability.

Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Tested-by: Ivan Orlov <ivan.orlov0322@gmail.com>
Link: https://lore.kernel.org/r/20240419140155.3028912-1-keescook@chromium.org
Signed-off-by: Kees Cook <keescook@chromium.org>
2024-04-19 13:11:45 -07:00
Linus Torvalds
9c6e84e4ba Bootconfig fixes for v6.9-rc4:
- Fix potential static_command_line buffer overrun. Currently we allocate
   the memory for static_command_line based on "boot_command_line", but it
   will copy "command_line" into it. So we use the length of "command_line"
   instead of "boot_command_line" (as previously we did).
 - Use memblock_free_late() in xbc_exit() instead of memblock_free() after
   the buddy system is initialized.
 - Fix a kerneldoc warning.
 -----BEGIN PGP SIGNATURE-----
 
 iQFPBAABCgA5FiEEh7BulGwFlgAOi5DV2/sHvwUrPxsFAmYgN1kbHG1hc2FtaS5o
 aXJhbWF0c3VAZ21haWwuY29tAAoJENv7B78FKz8b/yEH/1FFgb7UJDtQLbtHl5/b
 bcxLbSzfb/N37Bc+sE/AKZYrt5QAMjaOmdtzQz9kdLtycxWcQinne4jqGxd6zfTU
 UIisfDjEZr46/Rs5sJg+5i8wWrud1TJOmlMsqiSVcorl0f/wE4S7PqgYXRNWZ0p+
 KipjuCCV43ITmVjsiq2NxfZGDaWzow/EJXwZzpQkJE1zaU13w2nzgzg64JW3f/lf
 Dx/o9jlYEoLkCjiQJ6XaRuTpHbPP1grozSMbvE3z1WnxCaiFHlzXGi6WUhto+pTu
 vt/pUrIFYE7k0IFHAVEgBjOkfCm5y9FwOdPLqwy3harQ5ek9D6h6bFnDhbZw7I27
 6V8=
 =e2c5
 -----END PGP SIGNATURE-----

Merge tag 'bootconfig-fixes-v6.9-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace

Pull bootconfig fixes from Masami Hiramatsu:

 - Fix potential static_command_line buffer overrun.

   Currently we allocate the memory for static_command_line based on
   "boot_command_line", but it will copy "command_line" into it. So we
   use the length of "command_line" instead of "boot_command_line" (as
   we previously did)

 - Use memblock_free_late() in xbc_exit() instead of memblock_free()
   after the buddy system is initialized

 - Fix a kerneldoc warning

* tag 'bootconfig-fixes-v6.9-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
  bootconfig: Fix the kerneldoc of _xbc_exit()
  bootconfig: use memblock_free_late to free xbc memory to buddy
  init/main.c: Fix potential static_command_line memory overflow
2024-04-19 09:52:09 -07:00
Ivan Orlov
9259a47216 string_kunit: Add test cases for str*cmp functions
Currently, str*cmp functions (strcmp, strncmp, strcasecmp and
strncasecmp) are not covered with tests. Extend the `string_kunit.c`
test by adding the test cases for them.

This patch adds 8 more test cases:

1) strcmp test
2) strcmp test on long strings (2048 chars)
3) strncmp test
4) strncmp test on long strings (2048 chars)
5) strcasecmp test
6) strcasecmp test on long strings
7) strncasecmp test
8) strncasecmp test on long strings

These test cases aim at covering as many edge cases as possible,
including the tests on empty strings, situations when the different
symbol is placed at the end of one of the strings, etc.

Signed-off-by: Ivan Orlov <ivan.orlov0322@gmail.com>
Reviewed-by: Andy Shevchenko <andy@kernel.org>
Link: https://lore.kernel.org/r/20240417233033.717596-1-ivan.orlov0322@gmail.com
Signed-off-by: Kees Cook <keescook@chromium.org>
2024-04-18 10:14:19 -07:00
Masami Hiramatsu (Google)
298b871cd5 bootconfig: Fix the kerneldoc of _xbc_exit()
Fix the kerneldoc of _xbc_exit() which is updated to have an @early
argument and the function name is changed.

Link: https://lore.kernel.org/all/171321744474.599864.13532445969528690358.stgit@devnote2/

Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202404150036.kPJ3HEFA-lkp@intel.com/
Fixes: 89f9a1e876 ("bootconfig: use memblock_free_late to free xbc memory to buddy")
Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
2024-04-18 05:01:32 +09:00
Chen Pei
dac045fc9f bpf, tests: Fix typos in comments
Currently, there are two comments with same name "64-bit ATOMIC magnitudes",
the second one should be "32-bit ATOMIC magnitudes" based on the context.

Signed-off-by: Chen Pei <cp0613@linux.alibaba.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Yonghong Song <yonghong.song@linux.dev>
Link: https://lore.kernel.org/bpf/20240415081928.17440-1-cp0613@linux.alibaba.com
2024-04-16 17:22:18 +02:00
Kees Cook
f4626c12e4 ubsan: Add awareness of signed integer overflow traps
On arm64, UBSAN traps can be decoded from the trap instruction. Add the
add, sub, and mul overflow trap codes now that CONFIG_UBSAN_SIGNED_WRAP
exists. Seen under clang 19:

  Internal error: UBSAN: unrecognized failure code: 00000000f2005515 [#1] PREEMPT SMP

Reported-by: Nathan Chancellor <nathan@kernel.org>
Closes: https://lore.kernel.org/lkml/20240411-fix-ubsan-in-hardening-config-v1-0-e0177c80ffaa@kernel.org
Fixes: 557f8c582a ("ubsan: Reintroduce signed overflow sanitizer")
Tested-by: Nathan Chancellor <nathan@kernel.org>
Link: https://lore.kernel.org/r/20240415182832.work.932-kees@kernel.org
Signed-off-by: Kees Cook <keescook@chromium.org>
2024-04-15 17:42:43 -07:00
Breno Leitao
4ba67ef3a1 net: dqs: make struct dql more cache efficient
With the previous change, struct dqs->stall_thrs will be in the hot path
(at queue side), even if DQS is disabled.

The other fields accessed in this function (last_obj_cnt and num_queued)
are in the first cache line, let's move this field  (stall_thrs) to the
very first cache line, since there is a hole there.

This does not change the structure size, since it moves an short (2
bytes) to 4-bytes whole in the first cache line.

This is the new structure format now:

struct dql {
	unsigned int    num_queued;
	unsigned int    last_obj_cnt;
...
	short unsigned int    stall_thrs;
	/* XXX 2 bytes hole, try to pack */
...
	/* --- cacheline 1 boundary (64 bytes) --- */
...
 	/* Longest stall detected, reported to user */
	short unsigned int         stall_max;
	/* XXX 2 bytes hole, try to pack */
};

Also, read the stall_thrs (now in the very first cache line) earlier,
together with dql->num_queued (also in the first cache line).

Suggested-by: Jakub Kicinski <kuba@kernel.org>
Suggested-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Breno Leitao <leitao@debian.org>
Link: https://lore.kernel.org/r/20240411192241.2498631-5-leitao@debian.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-04-15 11:19:53 -07:00
Qiang Zhang
89f9a1e876 bootconfig: use memblock_free_late to free xbc memory to buddy
On the time to free xbc memory in xbc_exit(), memblock may has handed
over memory to buddy allocator. So it doesn't make sense to free memory
back to memblock. memblock_free() called by xbc_exit() even causes UAF bugs
on architectures with CONFIG_ARCH_KEEP_MEMBLOCK disabled like x86.
Following KASAN logs shows this case.

This patch fixes the xbc memory free problem by calling memblock_free()
in early xbc init error rewind path and calling memblock_free_late() in
xbc exit path to free memory to buddy allocator.

[    9.410890] ==================================================================
[    9.418962] BUG: KASAN: use-after-free in memblock_isolate_range+0x12d/0x260
[    9.426850] Read of size 8 at addr ffff88845dd30000 by task swapper/0/1

[    9.435901] CPU: 9 PID: 1 Comm: swapper/0 Tainted: G     U             6.9.0-rc3-00208-g586b5dfb51b9 #5
[    9.446403] Hardware name: Intel Corporation RPLP LP5 (CPU:RaptorLake)/RPLP LP5 (ID:13), BIOS IRPPN02.01.01.00.00.19.015.D-00000000 Dec 28 2023
[    9.460789] Call Trace:
[    9.463518]  <TASK>
[    9.465859]  dump_stack_lvl+0x53/0x70
[    9.469949]  print_report+0xce/0x610
[    9.473944]  ? __virt_addr_valid+0xf5/0x1b0
[    9.478619]  ? memblock_isolate_range+0x12d/0x260
[    9.483877]  kasan_report+0xc6/0x100
[    9.487870]  ? memblock_isolate_range+0x12d/0x260
[    9.493125]  memblock_isolate_range+0x12d/0x260
[    9.498187]  memblock_phys_free+0xb4/0x160
[    9.502762]  ? __pfx_memblock_phys_free+0x10/0x10
[    9.508021]  ? mutex_unlock+0x7e/0xd0
[    9.512111]  ? __pfx_mutex_unlock+0x10/0x10
[    9.516786]  ? kernel_init_freeable+0x2d4/0x430
[    9.521850]  ? __pfx_kernel_init+0x10/0x10
[    9.526426]  xbc_exit+0x17/0x70
[    9.529935]  kernel_init+0x38/0x1e0
[    9.533829]  ? _raw_spin_unlock_irq+0xd/0x30
[    9.538601]  ret_from_fork+0x2c/0x50
[    9.542596]  ? __pfx_kernel_init+0x10/0x10
[    9.547170]  ret_from_fork_asm+0x1a/0x30
[    9.551552]  </TASK>

[    9.555649] The buggy address belongs to the physical page:
[    9.561875] page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x1 pfn:0x45dd30
[    9.570821] flags: 0x200000000000000(node=0|zone=2)
[    9.576271] page_type: 0xffffffff()
[    9.580167] raw: 0200000000000000 ffffea0011774c48 ffffea0012ba1848 0000000000000000
[    9.588823] raw: 0000000000000001 0000000000000000 00000000ffffffff 0000000000000000
[    9.597476] page dumped because: kasan: bad access detected

[    9.605362] Memory state around the buggy address:
[    9.610714]  ffff88845dd2ff00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[    9.618786]  ffff88845dd2ff80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[    9.626857] >ffff88845dd30000: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
[    9.634930]                    ^
[    9.638534]  ffff88845dd30080: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
[    9.646605]  ffff88845dd30100: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
[    9.654675] ==================================================================

Link: https://lore.kernel.org/all/20240414114944.1012359-1-qiang4.zhang@linux.intel.com/

Fixes: 40caa127f3 ("init: bootconfig: Remove all bootconfig data when the init memory is removed")
Cc: Stable@vger.kernel.org
Signed-off-by: Qiang Zhang <qiang4.zhang@intel.com>
Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
2024-04-14 22:00:43 +09:00
Bitao Hu
d7037381d0 watchdog/softlockup: Low-overhead detection of interrupt storm
The following softlockup is caused by interrupt storm, but it cannot be
identified from the call tree. Because the call tree is just a snapshot
and doesn't fully capture the behavior of the CPU during the soft lockup.
  watchdog: BUG: soft lockup - CPU#28 stuck for 23s! [fio:83921]
  ...
  Call trace:
    __do_softirq+0xa0/0x37c
    __irq_exit_rcu+0x108/0x140
    irq_exit+0x14/0x20
    __handle_domain_irq+0x84/0xe0
    gic_handle_irq+0x80/0x108
    el0_irq_naked+0x50/0x58

Therefore, it is necessary to report CPU utilization during the
softlockup_threshold period (report once every sample_period, for a total
of 5 reportings), like this:
  watchdog: BUG: soft lockup - CPU#28 stuck for 23s! [fio:83921]
  CPU#28 Utilization every 4s during lockup:
    #1: 0% system, 0% softirq, 100% hardirq, 0% idle
    #2: 0% system, 0% softirq, 100% hardirq, 0% idle
    #3: 0% system, 0% softirq, 100% hardirq, 0% idle
    #4: 0% system, 0% softirq, 100% hardirq, 0% idle
    #5: 0% system, 0% softirq, 100% hardirq, 0% idle
  ...

This is helpful in determining whether an interrupt storm has occurred or
in identifying the cause of the softlockup. The criteria for determination
are as follows:

  a. If the hardirq utilization is high, then interrupt storm should be
     considered and the root cause cannot be determined from the call tree.
  b. If the softirq utilization is high, then the call might not necessarily
     point at the root cause.
  c. If the system utilization is high, then analyzing the root
     cause from the call tree is possible in most cases.

The mechanism requires a considerable amount of global storage space
when configured for the maximum number of CPUs. Therefore, adding a
SOFTLOCKUP_DETECTOR_INTR_STORM Kconfig knob that defaults to "yes"
if the max number of CPUs is <= 128.

Signed-off-by: Bitao Hu <yaoma@linux.alibaba.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Douglas Anderson <dianders@chromium.org>
Reviewed-by: Liu Song <liusong@linux.alibaba.com>
Link: https://lore.kernel.org/r/20240411074134.30922-5-yaoma@linux.alibaba.com
2024-04-12 17:08:05 +02:00
Jakub Kicinski
94426ed213 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Cross-merge networking fixes after downstream PR.

Conflicts:

net/unix/garbage.c
  47d8ac011f ("af_unix: Fix garbage collector racing against connect()")
  4090fa373f ("af_unix: Replace garbage collection algorithm.")

Adjacent changes:

drivers/net/ethernet/broadcom/bnxt/bnxt.c
  faa12ca245 ("bnxt_en: Reset PTP tx_avail after possible firmware reset")
  b3d0083caf ("bnxt_en: Support RSS contexts in ethtool .{get|set}_rxfh()")

drivers/net/ethernet/broadcom/bnxt/bnxt_ulp.c
  7ac10c7d72 ("bnxt_en: Fix possible memory leak in bnxt_rdma_aux_device_init()")
  194fad5b27 ("bnxt_en: Refactor bnxt_rdma_aux_device_init/uninit functions")

drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c
  958f56e483 ("net/mlx5e: Un-expose functions in en.h")
  49e6c93870 ("net/mlx5e: RSS, Block XOR hash with over 128 channels")

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-04-11 14:23:47 -07:00
Linus Torvalds
2ae9a8972c Including fixes from bluetooth.
Current release - new code bugs:
 
   - netfilter: complete validation of user input
 
   - mlx5: disallow SRIOV switchdev mode when in multi-PF netdev
 
 Previous releases - regressions:
 
   - core: fix u64_stats_init() for lockdep when used repeatedly in one file
 
   - ipv6: fix race condition between ipv6_get_ifaddr and ipv6_del_addr
 
   - bluetooth: fix memory leak in hci_req_sync_complete()
 
   - batman-adv: avoid infinite loop trying to resize local TT
 
   - drv: geneve: fix header validation in geneve[6]_xmit_skb
 
   - drv: bnxt_en: fix possible memory leak in bnxt_rdma_aux_device_init()
 
   - drv: mlx5: offset comp irq index in name by one
 
   - drv: ena: avoid double-free clearing stale tx_info->xdpf value
 
   - drv: pds_core: fix pdsc_check_pci_health deadlock
 
 Previous releases - always broken:
 
   - xsk: validate user input for XDP_{UMEM|COMPLETION}_FILL_RING
 
   - bluetooth: fix setsockopt not validating user input
 
   - af_unix: clear stale u->oob_skb.
 
   - nfc: llcp: fix nfc_llcp_setsockopt() unsafe copies
 
   - drv: virtio_net: fix guest hangup on invalid RSS update
 
   - drv: mlx5e: Fix mlx5e_priv_init() cleanup flow
 
   - dsa: mt7530: trap link-local frames regardless of ST Port State
 
 Signed-off-by: Paolo Abeni <pabeni@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 
 iQJGBAABCAAwFiEEg1AjqC77wbdLX2LbKSR5jcyPE6QFAmYXyoQSHHBhYmVuaUBy
 ZWRoYXQuY29tAAoJECkkeY3MjxOk72wQAJJ9DQra9b/8S3Zla1dutBcznSCxruas
 vWrpgIZiT3Aw5zUmZUZn+rNP8xeWLBK78Yv4m236B8/D3Kji2uMbVrjUAApBBcHr
 /lLmctZIhDHCoJCYvRTC/VOVPuqbbbmxOmx6rNvry93iNAiHAnBdCUlOYzMNPzJz
 6XIGtztFP0ICtt9owFtQRnsPeKhZJ5DoxqgE9KS2Pmb9PU99i1bEShpwLwB5I83S
 yTHKUY5W0rknkQZTW1gbv+o3dR0iFy7LZ+1FItJ/UzH0bG6JmcqzSlH5mZZJCc/L
 5LdUwtwMmKG2Kez/vKr1DAwTeAyhwVU+d+Hb28QXiO0kAYbjbOgNXse1st3RwDt5
 YKMKlsmR+kgPYLcvs9df2aubNSRvi2utwIA2kuH33HxBYF5PfQR5PTGeR21A+cKo
 wvSit8aMaGFTPJ7rRIzkNaPdIHSvPMKYcXV/T8EPvlOHzi5GBX0qHWj99JO9Eri+
 VFci+FG3HCPHK8v683g/WWiiVNx/IHMfNbcukes1oDFsCeNo7KZcnPY+zVhtdvvt
 QBnvbAZGKeDXMbnHZLB3DCR3ENHWTrJzC3alLDp3/uFC79VKtfIRO2wEX3gkrN8S
 JHsdYU13Yp1ERaNjUeq7Sqk2OGLfsBt4HSOhcK8OPPgE5rDRON5UPjkuNvbaEiZY
 Morzaqzerg1B
 =a9bB
 -----END PGP SIGNATURE-----

Merge tag 'net-6.9-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net

Pull networking fixes from Paolo Abeni:
 "Including fixes from bluetooth.

  Current release - new code bugs:

   - netfilter: complete validation of user input

   - mlx5: disallow SRIOV switchdev mode when in multi-PF netdev

  Previous releases - regressions:

   - core: fix u64_stats_init() for lockdep when used repeatedly in one
     file

   - ipv6: fix race condition between ipv6_get_ifaddr and ipv6_del_addr

   - bluetooth: fix memory leak in hci_req_sync_complete()

   - batman-adv: avoid infinite loop trying to resize local TT

   - drv: geneve: fix header validation in geneve[6]_xmit_skb

   - drv: bnxt_en: fix possible memory leak in
     bnxt_rdma_aux_device_init()

   - drv: mlx5: offset comp irq index in name by one

   - drv: ena: avoid double-free clearing stale tx_info->xdpf value

   - drv: pds_core: fix pdsc_check_pci_health deadlock

  Previous releases - always broken:

   - xsk: validate user input for XDP_{UMEM|COMPLETION}_FILL_RING

   - bluetooth: fix setsockopt not validating user input

   - af_unix: clear stale u->oob_skb.

   - nfc: llcp: fix nfc_llcp_setsockopt() unsafe copies

   - drv: virtio_net: fix guest hangup on invalid RSS update

   - drv: mlx5e: Fix mlx5e_priv_init() cleanup flow

   - dsa: mt7530: trap link-local frames regardless of ST Port State"

* tag 'net-6.9-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (59 commits)
  net: ena: Set tx_info->xdpf value to NULL
  net: ena: Fix incorrect descriptor free behavior
  net: ena: Wrong missing IO completions check order
  net: ena: Fix potential sign extension issue
  af_unix: Fix garbage collector racing against connect()
  net: dsa: mt7530: trap link-local frames regardless of ST Port State
  Revert "s390/ism: fix receive message buffer allocation"
  net: sparx5: fix wrong config being used when reconfiguring PCS
  net/mlx5: fix possible stack overflows
  net/mlx5: Disallow SRIOV switchdev mode when in multi-PF netdev
  net/mlx5e: RSS, Block XOR hash with over 128 channels
  net/mlx5e: Do not produce metadata freelist entries in Tx port ts WQE xmit
  net/mlx5e: HTB, Fix inconsistencies with QoS SQs number
  net/mlx5e: Fix mlx5e_priv_init() cleanup flow
  net/mlx5e: RSS, Block changing channels number when RXFH is configured
  net/mlx5: Correctly compare pkt reformat ids
  net/mlx5: Properly link new fs rules into the tree
  net/mlx5: offset comp irq index in name by one
  net/mlx5: Register devlink first under devlink lock
  net/mlx5: E-switch, store eswitch pointer before registering devlink_param
  ...
2024-04-11 11:46:31 -07:00
Linus Torvalds
fe5b5ef836 hardening fixes for v6.9-rc4
- gcc-plugins/stackleak: Avoid .head.text section (Ard Biesheuvel)
 
 - ubsan: fix unused variable warning in test module (Arnd Bergmann)
 
 - Improve entropy diffusion in randomize_kstack
 -----BEGIN PGP SIGNATURE-----
 
 iQJKBAABCgA0FiEEpcP2jyKd1g9yPm4TiXL039xtwCYFAmYWv7sWHGtlZXNjb29r
 QGNocm9taXVtLm9yZwAKCRCJcvTf3G3AJv0TD/9z0EFo2K6brr6rZ9LUK8JgdcFd
 MXfee95uw3DvqTphx6H+559AGl0nCZd0S8huw8TAPsRg2hNeuFsQTcQ4TRnyCE+w
 qMCjflZI0LHqyTfTOuFeSJERQ8kiHntaci33xXR7crehHNGvJN6jRB5ouUOxdFlB
 L4mx7P4OFjS/+6qRM957yqWrCPM26Nl7qe0TcnOZ5eh+HvIRNF9hSg8DAMMIb0j7
 lZZojpOdiYYccfBUhM2PCpDfSNtG06/QpKGCeVK/3gpRzrVWl99iABTpJStf3rbH
 PWd3kwz/FINYq8sOjm9Y960Xgkz+IZkhr5YXVs6ezMoV5J9vGBG7zjzlYbPwbfoy
 UTaHUEGmYqCnnshrC5PU05G/CGlYP5iz6pzfxofXLca5+WQY0HxERLAkVdVhn39q
 oRLIOleAguocK3eBQ03tmJFF28yFJyOWK3C0gp8g8vExBAg2EQGrHOAfR28crB5p
 yKvLQHL04urSQA9gs4PiflBGyHiLSuWBGyO1K3NvW5cNrRUIGXlwfy8i7cHSzBFo
 /rQ49E7xxtitAlk+Jol55OewEpw/nq1wqOv0eFW4pje+WSOVRpbQBtDD03gik74v
 bW9ownxrypqUm5OR6Uuf6OwVWi/zX03inVF0d55dcLv+kjDy0Xu8ViVOhXhs2vT8
 PayuOAhAvHgcfGJwFw==
 =MHgA
 -----END PGP SIGNATURE-----

Merge tag 'hardening-v6.9-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux

Pull hardening fixes from Kees Cook:

 - gcc-plugins/stackleak: Avoid .head.text section (Ard Biesheuvel)

 - ubsan: fix unused variable warning in test module (Arnd Bergmann)

 - Improve entropy diffusion in randomize_kstack

* tag 'hardening-v6.9-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
  randomize_kstack: Improve entropy diffusion
  ubsan: fix unused variable warning in test module
  gcc-plugins/stackleak: Avoid .head.text section
2024-04-10 13:31:34 -07:00
Paul E. McKenney
a88d970c8b lib: Add one-byte emulation function
Architectures are required to provide four-byte cmpxchg() and 64-bit
architectures are additionally required to provide eight-byte cmpxchg().
However, there are cases where one-byte cmpxchg() would be extremely
useful.  Therefore, provide cmpxchg_emu_u8() that emulates one-byte
cmpxchg() in terms of four-byte cmpxchg().

Note that this emulations is fully ordered, and can (for example) cause
one-byte cmpxchg_relaxed() to incur the overhead of full ordering.
If this causes problems for a given architecture, that architecture is
free to provide its own lighter-weight primitives.

[ paulmck: Apply Marco Elver feedback. ]
[ paulmck: Apply kernel test robot feedback. ]
[ paulmck: Drop two-byte support per Arnd Bergmann feedback. ]

Link: https://lore.kernel.org/all/0733eb10-5e7a-4450-9b8a-527b97c842ff@paulmck-laptop/

Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Acked-by: Marco Elver <elver@google.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "Peter Zijlstra (Intel)" <peterz@infradead.org>
Cc: Douglas Anderson <dianders@chromium.org>
Cc: Petr Mladek <pmladek@suse.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: <linux-arch@vger.kernel.org>
2024-04-09 22:06:00 -07:00
Jiri Slaby (SUSE)
d52b761e4b kfifo: add kfifo_dma_out_prepare_mapped()
When the kfifo buffer is already dma-mapped, one cannot use the kfifo
API to fill in an SG list.

Add kfifo_dma_in_prepare_mapped() which allows exactly this. A mapped
dma_addr_t is passed and it is filled into provided sgl too. Including
the dma_len.

Signed-off-by: Jiri Slaby (SUSE) <jirislaby@kernel.org>
Cc: Stefani Seibold <stefani@seibold.net>
Cc: Andrew Morton <akpm@linux-foundation.org>
Link: https://lore.kernel.org/r/20240405060826.2521-8-jirislaby@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-04-09 15:28:03 +02:00
Jiri Slaby (SUSE)
fea0dde081 kfifo: pass offset to setup_sgl_buf() instead of a pointer
As a preparatory for dma addresses filling, we need the data offset
instead of virtual pointer in setup_sgl_buf(). So pass the former
instead the latter.

And pointer to fifo is needed in setup_sgl_buf() now too.

Signed-off-by: Jiri Slaby (SUSE) <jirislaby@kernel.org>
Cc: Stefani Seibold <stefani@seibold.net>
Cc: Andrew Morton <akpm@linux-foundation.org>
Link: https://lore.kernel.org/r/20240405060826.2521-7-jirislaby@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-04-09 15:28:03 +02:00
Jiri Slaby (SUSE)
ed6d22f5d8 kfifo: rename l to len_to_end in setup_sgl()
So that one can make any sense of the name.

Signed-off-by: Jiri Slaby (SUSE) <jirislaby@kernel.org>
Cc: Stefani Seibold <stefani@seibold.net>
Cc: Andrew Morton <akpm@linux-foundation.org>
Link: https://lore.kernel.org/r/20240405060826.2521-6-jirislaby@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-04-09 15:28:03 +02:00
Jiri Slaby (SUSE)
e9d9576de0 kfifo: remove support for physically non-contiguous memory
First, there is no such user. The only user of this interface is
caam_rng_fill_async() and that uses kfifo_alloc() -> kmalloc().

Second, the implementation does not allow anything else than direct
mapping and kmalloc() (due to virt_to_phys()), anyway.

Therefore, there is no point in having this dead (and complex) code in
the kernel.

Note the setup_sgl_buf() function now boils down to simple sg_set_buf().
That is called twice from setup_sgl() to take care of kfifo buffer
wrap-around.

setup_sgl_buf() will be extended shortly, so keeping it in place.

Signed-off-by: Jiri Slaby (SUSE) <jirislaby@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Stefani Seibold <stefani@seibold.net>
Link: https://lore.kernel.org/r/20240405060826.2521-5-jirislaby@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-04-09 15:28:03 +02:00
Jiri Slaby (SUSE)
4edd7e96a1 kfifo: add kfifo_out_linear{,_ptr}()
These are helpers which are going to be used in the serial layer. We
need a wrapper around kfifo which provides us with a tail (sometimes
"tail" offset, sometimes a pointer) to the kfifo data. And which returns
count of available data -- but not larger than to the end of the buffer
(hence _linear in the names). I.e. something like CIRC_CNT_TO_END() in
the legacy circ_buf.

This patch adds such two helpers.

Signed-off-by: Jiri Slaby (SUSE) <jirislaby@kernel.org>
Cc: Stefani Seibold <stefani@seibold.net>
Cc: Andrew Morton <akpm@linux-foundation.org>
Link: https://lore.kernel.org/r/20240405060826.2521-4-jirislaby@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-04-09 15:28:03 +02:00
Jiri Slaby (SUSE)
779087fe2f kfifo: drop __kfifo_dma_out_finish_r()
It is the same as __kfifo_skip_r(), so:
* drop __kfifo_dma_out_finish_r() completely, and
* replace its (only) use by __kfifo_skip_r().

Signed-off-by: Jiri Slaby (SUSE) <jirislaby@kernel.org>
Cc: Stefani Seibold <stefani@seibold.net>
Cc: Andrew Morton <akpm@linux-foundation.org>
Link: https://lore.kernel.org/r/20240405060826.2521-2-jirislaby@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-04-09 15:28:02 +02:00
Adrian Hunter
8ff1e6c5ac vdso: Fix powerpc build U64_MAX undeclared error
U64_MAX is not in include/vdso/limits.h, although that isn't noticed on x86
because x86 includes include/linux/limits.h indirectly. However powerpc is
more selective, resulting in the following build error:

  In file included from <command-line>:
  lib/vdso/gettimeofday.c: In function 'vdso_calc_ns':
  lib/vdso/gettimeofday.c:11:33: error: 'U64_MAX' undeclared
     11 | # define VDSO_DELTA_MASK(vd)    U64_MAX
        |                                 ^~~~~~~

Use ULLONG_MAX instead which will work just as well and is in
include/vdso/limits.h.

Fixes: c8e3a8b6f2 ("vdso: Consolidate vdso_calc_delta()")
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20240409062639.3393-1-adrian.hunter@intel.com
Closes: https://lore.kernel.org/all/20240409124905.6816db37@canb.auug.org.au/
2024-04-09 12:35:19 +02:00
Adrian Hunter
456e3788bc vdso: Make delta calculation overflow safe
Kernel timekeeping is designed to keep the change in cycles (since the last
timer interrupt) below max_cycles, which prevents multiplication overflow
when converting cycles to nanoseconds. However, if timer interrupts stop,
the calculation will eventually overflow.

Add protection against that, enabled by config option
CONFIG_GENERIC_VDSO_OVERFLOW_PROTECT. Check against max_cycles, falling
back to a slower higher precision calculation.

Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20240325064023.2997-8-adrian.hunter@intel.com
2024-04-08 15:03:07 +02:00
Adrian Hunter
0c68458b0a vdso: Add CONFIG_GENERIC_VDSO_OVERFLOW_PROTECT
Add CONFIG_GENERIC_VDSO_OVERFLOW_PROTECT in preparation to add
multiplication overflow protection to the VDSO time getter functions.

Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20240325064023.2997-4-adrian.hunter@intel.com
2024-04-08 15:03:06 +02:00
Adrian Hunter
5b26ef660a vdso: Consolidate nanoseconds calculation
Consolidate nanoseconds calculation to simplify and reduce code
duplication.

Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20240325064023.2997-3-adrian.hunter@intel.com
2024-04-08 15:03:06 +02:00
Adrian Hunter
c8e3a8b6f2 vdso: Consolidate vdso_calc_delta()
Consolidate vdso_calc_delta(), in preparation for further simplification.

Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20240325064023.2997-2-adrian.hunter@intel.com
2024-04-08 15:03:06 +02:00
Arnd Bergmann
e9d47b7b31 lib: checksum: hide unused expected_csum_ipv6_magic[]
When CONFIG_NET is disabled, an extra warning shows up for this
unused variable:

lib/checksum_kunit.c:218:18: error: 'expected_csum_ipv6_magic' defined but not used [-Werror=unused-const-variable=]

Replace the #ifdef with an IS_ENABLED() check that makes the compiler's
dead-code-elimination take care of the link failure.

Fixes: f24a70106d ("lib: checksum: Fix build with CONFIG_NET=n")
Suggested-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Acked-by: Palmer Dabbelt <palmer@rivosinc.com>
Acked-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Simon Horman <horms@kernel.org>
Tested-by: Simon Horman <horms@kernel.org> # build-tested
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-04-08 11:03:05 +01:00
Peter Collingbourne
a6c1d9cb9a stackdepot: rename pool_index to pool_index_plus_1
Commit 3ee34eabac ("lib/stackdepot: fix first entry having a 0-handle")
changed the meaning of the pool_index field to mean "the pool index plus
1".  This made the code accessing this field less self-documenting, as
well as causing debuggers such as drgn to not be able to easily remain
compatible with both old and new kernels, because they typically do that
by testing for presence of the new field.  Because stackdepot is a
debugging tool, we should make sure that it is debugger friendly. 
Therefore, give the field a different name to improve readability as well
as enabling debugger backwards compatibility.

This is needed in 6.9, which would otherwise become an odd release with
the new semantics and old name so debuggers wouldn't recognize the new
semantics there.

Fixes: 3ee34eabac ("lib/stackdepot: fix first entry having a 0-handle")
Link: https://lkml.kernel.org/r/20240402001500.53533-1-pcc@google.com
Link: https://linux-review.googlesource.com/id/Ib3e70c36c1d230dd0a118dc22649b33e768b9f88
Signed-off-by: Peter Collingbourne <pcc@google.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Reviewed-by: Alexander Potapenko <glider@google.com>
Acked-by: Marco Elver <elver@google.com>
Acked-by: Oscar Salvador <osalvador@suse.de>
Cc: Andrey Konovalov <andreyknvl@gmail.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Omar Sandoval <osandov@fb.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-04-05 11:21:31 -07:00
Andrii Nakryiko
229087f6f1 bpf, kconfig: Fix DEBUG_INFO_BTF_MODULES Kconfig definition
Turns out that due to CONFIG_DEBUG_INFO_BTF_MODULES not having an
explicitly specified "menu item name" in Kconfig, it's basically
impossible to turn it off (see [0]).

This patch fixes the issue by defining menu name for
CONFIG_DEBUG_INFO_BTF_MODULES, which makes it actually adjustable
and independent of CONFIG_DEBUG_INFO_BTF, in the sense that one can
have DEBUG_INFO_BTF=y and DEBUG_INFO_BTF_MODULES=n.

We still keep it as defaulting to Y, of course.

Fixes: 5f9ae91f7c ("kbuild: Build kernel module BTFs if BTF is enabled and pahole supports it")
Reported-by: Vincent Li <vincent.mc.li@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/CAK3+h2xiFfzQ9UXf56nrRRP=p1+iUxGoEP5B+aq9MDT5jLXDSg@mail.gmail.com [0]
Link: https://lore.kernel.org/bpf/20240404220344.3879270-1-andrii@kernel.org
2024-04-05 15:12:47 +02:00
Guenter Roeck
b1080c667b mm/slub, kunit: Use inverted data to corrupt kmem cache
Two failure patterns are seen randomly when running slub_kunit tests with
CONFIG_SLAB_FREELIST_RANDOM and CONFIG_SLAB_FREELIST_HARDENED enabled.

Pattern 1:
     # test_clobber_zone: pass:1 fail:0 skip:0 total:1
     ok 1 test_clobber_zone
     # test_next_pointer: EXPECTATION FAILED at lib/slub_kunit.c:72
     Expected 3 == slab_errors, but
         slab_errors == 0 (0x0)
     # test_next_pointer: EXPECTATION FAILED at lib/slub_kunit.c:84
     Expected 2 == slab_errors, but
         slab_errors == 0 (0x0)
     # test_next_pointer: pass:0 fail:1 skip:0 total:1
     not ok 2 test_next_pointer

In this case, test_next_pointer() overwrites p[s->offset], but the data
at p[s->offset] is already 0x12.

Pattern 2:
     ok 1 test_clobber_zone
     # test_next_pointer: EXPECTATION FAILED at lib/slub_kunit.c:72
     Expected 3 == slab_errors, but
         slab_errors == 2 (0x2)
     # test_next_pointer: pass:0 fail:1 skip:0 total:1
     not ok 2 test_next_pointer

In this case, p[s->offset] has a value other than 0x12, but one of the
expected failures is nevertheless missing.

Invert data instead of writing a fixed value to corrupt the cache data
structures to fix the problem.

Fixes: 1f9f78b1b3 ("mm/slub, kunit: add a KUnit test for SLUB debugging functionality")
Cc: Oliver Glitta <glittao@gmail.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
CC: Daniel Latypov <dlatypov@google.com>
Cc: Marco Elver <elver@google.com>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
2024-04-04 16:08:10 +02:00
Arnd Bergmann
bbda3ba626 ubsan: fix unused variable warning in test module
This is one of the drivers with an unused variable that is marked 'const'.
Adding a __used annotation here avoids the warning and lets us enable
the option by default:

lib/test_ubsan.c:137:28: error: unused variable 'skip_ubsan_array' [-Werror,-Wunused-const-variable]

Fixes: 4a26f49b7b ("ubsan: expand tests and reporting")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Link: https://lore.kernel.org/r/20240403080702.3509288-3-arnd@kernel.org
Signed-off-by: Kees Cook <keescook@chromium.org>
2024-04-03 14:35:57 -07:00
Alexander Lobakin
7adaf37f7f lib/bitmap: add compile-time test for __assign_bit() optimization
Commit dc34d50366 ("lib: test_bitmap: add compile-time
optimization/evaluations assertions") initially missed __assign_bit(),
which led to that quite a time passed before I realized it doesn't get
optimized at compilation time. Now that it does, add test for that just
to make sure nothing will break one day.
To make things more interesting, use bitmap_complement() and
bitmap_full(), thus checking their compile-time evaluation as well. And
remove the misleading comment mentioning the workaround removed recently
in favor of adding the whole file to GCov exceptions.

Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Signed-off-by: Yury Norov <yury.norov@gmail.com>
Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-04-01 10:49:28 +01:00
Alexander Lobakin
a37fbe666c bitmap: introduce generic optimized bitmap_size()
The number of times yet another open coded
`BITS_TO_LONGS(nbits) * sizeof(long)` can be spotted is huge.
Some generic helper is long overdue.

Add one, bitmap_size(), but with one detail.
BITS_TO_LONGS() uses DIV_ROUND_UP(). The latter works well when both
divident and divisor are compile-time constants or when the divisor
is not a pow-of-2. When it is however, the compilers sometimes tend
to generate suboptimal code (GCC 13):

48 83 c0 3f          	add    $0x3f,%rax
48 c1 e8 06          	shr    $0x6,%rax
48 8d 14 c5 00 00 00 00	lea    0x0(,%rax,8),%rdx

%BITS_PER_LONG is always a pow-2 (either 32 or 64), but GCC still does
full division of `nbits + 63` by it and then multiplication by 8.
Instead of BITS_TO_LONGS(), use ALIGN() and then divide by 8. GCC:

8d 50 3f             	lea    0x3f(%rax),%edx
c1 ea 03             	shr    $0x3,%edx
81 e2 f8 ff ff 1f    	and    $0x1ffffff8,%edx

Now it shifts `nbits + 63` by 3 positions (IOW performs fast division
by 8) and then masks bits[2:0]. bloat-o-meter:

add/remove: 0/0 grow/shrink: 20/133 up/down: 156/-773 (-617)

Clang does it better and generates the same code before/after starting
from -O1, except that with the ALIGN() approach it uses %edx and thus
still saves some bytes:

add/remove: 0/0 grow/shrink: 9/133 up/down: 18/-538 (-520)

Note that we can't expand DIV_ROUND_UP() by adding a check and using
this approach there, as it's used in array declarations where
expressions are not allowed.
Add this helper to tools/ as well.

Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Acked-by: Yury Norov <yury.norov@gmail.com>
Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-04-01 10:49:28 +01:00
Alexander Potapenko
f3e28876b6 lib/test_bitmap: use pr_info() for non-error messages
pr_err() messages may be treated as errors by some log readers, so let
us only use them for test failures. For non-error messages, replace them
with pr_info().

Suggested-by: Alexander Lobakin <aleksander.lobakin@intel.com>
Signed-off-by: Alexander Potapenko <glider@google.com>
Acked-by: Yury Norov <yury.norov@gmail.com>
Signed-off-by: Yury Norov <yury.norov@gmail.com>
Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-04-01 10:49:27 +01:00
Alexander Potapenko
991e558364 lib/test_bitmap: add tests for bitmap_{read,write}()
Add basic tests ensuring that values can be added at arbitrary positions
of the bitmap, including those spanning into the adjacent unsigned
longs.

Two new performance tests, test_bitmap_read_perf() and
test_bitmap_write_perf(), can be used to assess future performance
improvements of bitmap_read() and bitmap_write():

[    0.431119][    T1] test_bitmap: Time spent in test_bitmap_read_perf:	615253
[    0.433197][    T1] test_bitmap: Time spent in test_bitmap_write_perf:	916313

(numbers from a Intel(R) Xeon(R) Gold 6154 CPU @ 3.00GHz machine running
QEMU).

Signed-off-by: Alexander Potapenko <glider@google.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Acked-by: Yury Norov <yury.norov@gmail.com>
Signed-off-by: Yury Norov <yury.norov@gmail.com>
Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-04-01 10:49:27 +01:00
Ingo Molnar
f4566a1e73 Linux 6.9-rc1
-----BEGIN PGP SIGNATURE-----
 
 iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAmYAlq0eHHRvcnZhbGRz
 QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGYqwH/0fb4pRbVtULpiIK
 Cs7/e/IWzRRWLBq+Jj2KVVTxwjyiKFNOq6K/CHHnljIWo1yN2CIWeOgbHfTI0WfN
 xmBdJP7OtK8MCN9PwwoWhZxMLcyv4pFCERrrkGa7AD+cdN4j/ytQ3mH5V8f/21fd
 rnpQSdpgGXB2SSMHd520Y+e56+gxrrTmsDXjZWM08Wt0bbqAWJrjNe58BMz5hI1t
 yQtcgYRTdUuZBn5TMkT99lK9EFQslV38YCo7RUP5D0DWXS1jSfWlgnCD1Nc1ziF4
 ps/xPdUMDJAc5Tslg/hgJOciSuLqgMzIUsVgZrKysuu3NhwDY1LDWGORmH1t8E8W
 RC25950=
 =F+01
 -----END PGP SIGNATURE-----

Merge tag 'v6.9-rc1' into sched/core, to pick up fixes and to refresh the branch

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2024-03-25 11:32:29 +01:00
Linus Torvalds
b71871395c hardening fixes for v6.9-rc1
- CONFIG_MEMCPY_SLOW_KUNIT_TEST is no longer needed (Guenter Roeck)
 
 - Fix needless UTF-8 character in arch/Kconfig (Liu Song)
 
 - Improve __counted_by warning message in LKDTM (Nathan Chancellor)
 
 - Refactor DEFINE_FLEX() for default use of __counted_by
 
 - Disable signed integer overflow sanitizer on GCC < 8
 -----BEGIN PGP SIGNATURE-----
 
 iQJKBAABCgA0FiEEpcP2jyKd1g9yPm4TiXL039xtwCYFAmX+Gj8WHGtlZXNjb29r
 QGNocm9taXVtLm9yZwAKCRCJcvTf3G3AJg1fD/4zrQ3MmMpK/GaNCziayLuCJiMu
 urS85FkRaGkmf3d3PDVVOBw5wFae408O+i4a1b10k2L/At79Z5oi/RqKYtPXthGr
 davVIJt+ZBuTC5l7jUHeK6XAJ/8/28Xsq4dMQjy7pUstYU14sl2a1z+IuV7I+jxm
 hlVqvj7Q0dsPijyF9YAuoWhvXIGs3/p6oaq/p4+p0sGpbbjNEpNka9qgR9vl1NFe
 WrByHrFMxl6gmLPTilNvgdD50id4x7OYEtROdDP8tXK+dx8YGaEXAuPzYkpzmuOb
 9UjhGitxRX6VfWBy7N/TWtDryV493Jq2VFVsrEc2zNr2HR34/vTYCjMyfLlJNXVu
 hD3vZ/tO2tDnDhBeMKigIsbTtev4qn60IElmc4BXudgVbWehjVsJk+kGbKEIkIy6
 ww7vF4OPWEKNTBVsDtFJwKNZixb/NMX3nS1xCALffzNgh3o8qP28gESOKImHeKhI
 U7ZhoWLob3ayEcjCVhCaBEC+fK9a5Eva6L5lTUoObPqzXwoysi2yGEQTqHlYwdfg
 unQuZUaiWEE9w1GG7CEx5xB9VmaY2U9bF7thUw6qMNmvCzqJr2goUcJ8E6y90s4w
 fR71YYkfYsXMuL1MnxCp/xOS5EbodYWskyLicNee7SlBAqInraaQG8syBGUFb+4S
 zvILMp7Zc1sBIxhATA==
 =1LW2
 -----END PGP SIGNATURE-----

Merge tag 'hardening-v6.9-rc1-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux

Pull more hardening updates from Kees Cook:

 - CONFIG_MEMCPY_SLOW_KUNIT_TEST is no longer needed (Guenter Roeck)

 - Fix needless UTF-8 character in arch/Kconfig (Liu Song)

 - Improve __counted_by warning message in LKDTM (Nathan Chancellor)

 - Refactor DEFINE_FLEX() for default use of __counted_by

 - Disable signed integer overflow sanitizer on GCC < 8

* tag 'hardening-v6.9-rc1-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
  lkdtm/bugs: Improve warning message for compilers without counted_by support
  overflow: Change DEFINE_FLEX to take __counted_by member
  Revert "kunit: memcpy: Split slow memcpy tests into MEMCPY_SLOW_KUNIT_TEST"
  arch/Kconfig: eliminate needless UTF-8 character in Kconfig help
  ubsan: Disable signed integer overflow sanitizer on GCC < 8
2024-03-23 08:43:21 -07:00
Kees Cook
d8e45f2929 overflow: Change DEFINE_FLEX to take __counted_by member
The norm should be flexible array structures with __counted_by
annotations, so DEFINE_FLEX() is updated to expect that. Rename
the non-annotated version to DEFINE_RAW_FLEX(), and update the
few existing users. Additionally add selftests for the macros.

Reviewed-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Link: https://lore.kernel.org/r/20240306235128.it.933-kees@kernel.org
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
2024-03-22 16:25:31 -07:00
Linus Torvalds
4f55aa85a8 fbdev fixes and cleanups for 6.9-rc1:
- Allow console fonts up to 64x128 pixels (Samuel Thibault)
 - Prevent division-by-zero in fb monitor code (Roman Smirnov)
 - Drop Renesas ARM platforms from Mobile LCDC framebuffer driver
   (Geert Uytterhoeven)
 - Various code cleanups in viafb, uveafb and mb862xxfb drivers by
   Aleksandr Burakov, Li Zhijian and Michael Ellerman
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQS86RI+GtKfB8BJu973ErUQojoPXwUCZfx/jAAKCRD3ErUQojoP
 XxYoAP9EYvCtd+nxlUvnhS4fXUeTCXZJQA+HiXCORm6DGv7lUgD9EatR9b69CLA3
 e4LgzGS/9cybv3uUd3S5lFxsatGZnA8=
 =DQZN
 -----END PGP SIGNATURE-----

Merge tag 'fbdev-for-6.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/linux-fbdev

Pull fbdev updates from Helge Deller:

 - Allow console fonts up to 64x128 pixels (Samuel Thibault)

 - Prevent division-by-zero in fb monitor code (Roman Smirnov)

 - Drop Renesas ARM platforms from Mobile LCDC framebuffer driver (Geert
   Uytterhoeven)

 - Various code cleanups in viafb, uveafb and mb862xxfb drivers by
   Aleksandr Burakov, Li Zhijian and Michael Ellerman

* tag 'fbdev-for-6.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/linux-fbdev:
  fbdev: panel-tpo-td043mtea1: Convert sprintf() to sysfs_emit()
  fbmon: prevent division by zero in fb_videomode_from_videomode()
  fbcon: Increase maximum font width x height to 64 x 128
  fbdev: viafb: fix typo in hw_bitblt_1 and hw_bitblt_2
  fbdev: mb862xxfb: Fix defined but not used error
  fbdev: uvesafb: Convert sprintf/snprintf to sysfs_emit
  fbdev: Restrict FB_SH_MOBILE_LCDC to SuperH
2024-03-22 10:09:08 -07:00
Linus Torvalds
1d35aae78f Kbuild updates for v6.9
- Generate a list of built DTB files (arch/*/boot/dts/dtbs-list)
 
  - Use more threads when building Debian packages in parallel
 
  - Fix warnings shown during the RPM kernel package uninstallation
 
  - Change OBJECT_FILES_NON_STANDARD_*.o etc. to take a relative path to
    Makefile
 
  - Support GCC's -fmin-function-alignment flag
 
  - Fix a null pointer dereference bug in modpost
 
  - Add the DTB support to the RPM package
 
  - Various fixes and cleanups in Kconfig
 -----BEGIN PGP SIGNATURE-----
 
 iQJJBAABCgAzFiEEbmPs18K1szRHjPqEPYsBB53g2wYFAmX8HGIVHG1hc2FoaXJv
 eUBrZXJuZWwub3JnAAoJED2LAQed4NsGYfIQAIl/zEFoNVSHGR4TIvO7SIwkT4MM
 VAm0W6XRFaXfIGw8HL/MXe+U9jAyeQ9yL9uUVv8PqFTO+LzBbW1X1X97tlmrlQsC
 7mdxbA1KJXwkwt4wH/8/EZQMwHr327vtVH4AilSm+gAaWMXaSKAye3ulKQQ2gevz
 vP6aOcfbHIWOPdxA53cLdSl9LOGrYNczKySHXKV9O39T81F+ko7wPpdkiMWw5LWG
 ISRCV8bdXli8j10Pmg8jlbevSKl4Z5FG2BVw/Cl8rQ5tBBoCzFsUPnnp9A29G8QP
 OqRhbwxtkSm67BMJAYdHnhjp/l0AOEbmetTGpna+R06hirOuXhR3vc6YXZxhQjff
 LmKaqfG5YchRALS1fNDsRUNIkQxVJade+tOUG+V4WbxHQKWX7Ghu5EDlt2/x7P0p
 +XLPE48HoNQLQOJ+pgIOkaEDl7WLfGhoEtEgprZBuEP2h39xcdbYJyF10ZAAR4UZ
 FF6J9lDHbf7v1uqD2YnAQJQ6jJ06CvN6/s6SdiJnCWSs5cYRW0fnYigSIuwAgGHZ
 c/QFECoGEflXGGuqZDl5iXiIjhWKzH2nADSVEs7maP47vapcMWb9gA7VBNoOr5M0
 IXuFo1khChF4V2pxqlDj3H5TkDlFENYT/Wjh+vvjx8XplKCRKaSh+LaZ39hja61V
 dWH7BPecS44h4KXx
 =tFdl
 -----END PGP SIGNATURE-----

Merge tag 'kbuild-v6.9' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild

Pull Kbuild updates from Masahiro Yamada:

 - Generate a list of built DTB files (arch/*/boot/dts/dtbs-list)

 - Use more threads when building Debian packages in parallel

 - Fix warnings shown during the RPM kernel package uninstallation

 - Change OBJECT_FILES_NON_STANDARD_*.o etc. to take a relative path to
   Makefile

 - Support GCC's -fmin-function-alignment flag

 - Fix a null pointer dereference bug in modpost

 - Add the DTB support to the RPM package

 - Various fixes and cleanups in Kconfig

* tag 'kbuild-v6.9' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: (67 commits)
  kconfig: tests: test dependency after shuffling choices
  kconfig: tests: add a test for randconfig with dependent choices
  kconfig: tests: support KCONFIG_SEED for the randconfig runner
  kbuild: rpm-pkg: add dtb files in kernel rpm
  kconfig: remove unneeded menu_is_visible() call in conf_write_defconfig()
  kconfig: check prompt for choice while parsing
  kconfig: lxdialog: remove unused dialog colors
  kconfig: lxdialog: fix button color for blackbg theme
  modpost: fix null pointer dereference
  kbuild: remove GCC's default -Wpacked-bitfield-compat flag
  kbuild: unexport abs_srctree and abs_objtree
  kbuild: Move -Wenum-{compare-conditional,enum-conversion} into W=1
  kconfig: remove named choice support
  kconfig: use linked list in get_symbol_str() to iterate over menus
  kconfig: link menus to a symbol
  kbuild: fix inconsistent indentation in top Makefile
  kbuild: Use -fmin-function-alignment when available
  alpha: merge two entries for CONFIG_ALPHA_GAMMA
  alpha: merge two entries for CONFIG_ALPHA_EV4
  kbuild: change DTC_FLAGS_<basetarget>.o to take the path relative to $(obj)
  ...
2024-03-21 14:41:00 -07:00
Linus Torvalds
241590e5a1 Driver core changes for 6.9-rc1
Here is the "big" set of driver core and kernfs changes for 6.9-rc1.
 
 Nothing all that crazy here, just some good updates that include:
   - automatic attribute group hiding from Dan Williams (he fixed up my
     horrible attempt at doing this.)
   - kobject lock contention fixes from Eric Dumazet
   - driver core cleanups from Andy
   - kernfs rcu work from Tejun
   - fw_devlink changes to resolve some reported issues
   - other minor changes, all details in the shortlog
 
 All of these have been in linux-next for a long time with no reported
 issues.
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 -----BEGIN PGP SIGNATURE-----
 
 iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCZfwsHg8cZ3JlZ0Brcm9h
 aC5jb20ACgkQMUfUDdst+ynT4ACePcNRAsYrINlOPPKPHimJtyP01yEAn0pZYnj2
 0/UpqIqf3HVPu7zsLKTa
 =vR9S
 -----END PGP SIGNATURE-----

Merge tag 'driver-core-6.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core

Pull driver core updates from Greg KH:
 "Here is the "big" set of driver core and kernfs changes for 6.9-rc1.

  Nothing all that crazy here, just some good updates that include:

   - automatic attribute group hiding from Dan Williams (he fixed up my
     horrible attempt at doing this.)

   - kobject lock contention fixes from Eric Dumazet

   - driver core cleanups from Andy

   - kernfs rcu work from Tejun

   - fw_devlink changes to resolve some reported issues

   - other minor changes, all details in the shortlog

  All of these have been in linux-next for a long time with no reported
  issues"

* tag 'driver-core-6.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (28 commits)
  device: core: Log warning for devices pending deferred probe on timeout
  driver: core: Use dev_* instead of pr_* so device metadata is added
  driver: core: Log probe failure as error and with device metadata
  of: property: fw_devlink: Add support for "post-init-providers" property
  driver core: Add FWLINK_FLAG_IGNORE to completely ignore a fwnode link
  driver core: Adds flags param to fwnode_link_add()
  debugfs: fix wait/cancellation handling during remove
  device property: Don't use "proxy" headers
  device property: Move enum dev_dma_attr to fwnode.h
  driver core: Move fw_devlink stuff to where it belongs
  driver core: Drop unneeded 'extern' keyword in fwnode.h
  firmware_loader: Suppress warning on FW_OPT_NO_WARN flag
  sysfs:Addresses documentation in sysfs_merge_group and sysfs_unmerge_group.
  firmware_loader: introduce __free() cleanup hanler
  platform-msi: Remove usage of the deprecated ida_simple_xx() API
  sysfs: Introduce DEFINE_SIMPLE_SYSFS_GROUP_VISIBLE()
  sysfs: Document new "group visible" helpers
  sysfs: Fix crash on empty group attributes array
  sysfs: Introduce a mechanism to hide static attribute_groups
  sysfs: Introduce a mechanism to hide static attribute_groups
  ...
2024-03-21 13:34:15 -07:00
Linus Torvalds
3bcb0bf65c TTY/Serial driver update for 6.9-rc1
Here is the big set of TTY/Serial driver updates and cleanups for
 6.9-rc1.  Included in here are:
   - more tty cleanups from Jiri
   - loads of 8250 driver cleanups from Andy
   - max310x driver updates
   - samsung serial driver updates
   - uart_prepare_sysrq_char() updates for many drivers
   - platform driver remove callback void cleanups
   - stm32 driver updates
   - other small tty/serial driver updates
 
 All of these have been in linux-next for a long time with no reported
 issues.
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 -----BEGIN PGP SIGNATURE-----
 
 iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCZfwqow8cZ3JlZ0Brcm9h
 aC5jb20ACgkQMUfUDdst+ynNegCffxTbsnbMGjWhVrQ326IJx/DFvNMAoI9csigv
 m+G3RzefzZLRx8nAma0c
 =GMfc
 -----END PGP SIGNATURE-----

Merge tag 'tty-6.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty

Pull tty / serial driver updates from Greg KH:
 "Here is the big set of TTY/Serial driver updates and cleanups for
  6.9-rc1. Included in here are:

   - more tty cleanups from Jiri

   - loads of 8250 driver cleanups from Andy

   - max310x driver updates

   - samsung serial driver updates

   - uart_prepare_sysrq_char() updates for many drivers

   - platform driver remove callback void cleanups

   - stm32 driver updates

   - other small tty/serial driver updates

  All of these have been in linux-next for a long time with no reported
  issues"

* tag 'tty-6.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty: (199 commits)
  dt-bindings: serial: stm32: add power-domains property
  serial: 8250_dw: Replace ACPI device check by a quirk
  serial: Lock console when calling into driver before registration
  serial: 8250_uniphier: Switch to use uart_read_port_properties()
  serial: 8250_tegra: Switch to use uart_read_port_properties()
  serial: 8250_pxa: Switch to use uart_read_port_properties()
  serial: 8250_omap: Switch to use uart_read_port_properties()
  serial: 8250_of: Switch to use uart_read_port_properties()
  serial: 8250_lpc18xx: Switch to use uart_read_port_properties()
  serial: 8250_ingenic: Switch to use uart_read_port_properties()
  serial: 8250_dw: Switch to use uart_read_port_properties()
  serial: 8250_bcm7271: Switch to use uart_read_port_properties()
  serial: 8250_bcm2835aux: Switch to use uart_read_port_properties()
  serial: 8250_aspeed_vuart: Switch to use uart_read_port_properties()
  serial: port: Introduce a common helper to read properties
  serial: core: Add UPIO_UNKNOWN constant for unknown port type
  serial: core: Move struct uart_port::quirks closer to possible values
  serial: sh-sci: Call sci_serial_{in,out}() directly
  serial: core: only stop transmit when HW fifo is empty
  serial: pch: Use uart_prepare_sysrq_char().
  ...
2024-03-21 12:44:10 -07:00
Guenter Roeck
acd80cdcee Revert "kunit: memcpy: Split slow memcpy tests into MEMCPY_SLOW_KUNIT_TEST"
This reverts commit 4acf1de35f.

Commit d055c6a2cc ("kunit: memcpy: Mark tests as slow using test
attributes") marks slow memcpy unit tests as slow. Since this commit,
the tests can be disabled with a module parameter, and the configuration
option to skip the slow tests is no longer needed. Revert the patch
introducing it.

Cc: David Gow <davidgow@google.com>
Cc: Kees Cook <keescook@chromium.org>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Link: https://lore.kernel.org/r/20240314151200.2285314-1-linux@roeck-us.net
Signed-off-by: Kees Cook <keescook@chromium.org>
2024-03-18 11:24:15 -07:00
Kees Cook
77fcc34769 ubsan: Disable signed integer overflow sanitizer on GCC < 8
For opting functions out of sanitizer coverage, the "no_sanitize"
attribute is used, but in GCC this wasn't introduced until GCC 8.
Disable the sanitizer unless we're not using GCC, or it is GCC
version 8 or higher.

Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202403110643.27JXEVCI-lkp@intel.com/
Reviewed-by: Marco Elver <elver@google.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
2024-03-18 11:24:14 -07:00
Linus Torvalds
02c163e959 cxl for v6.9
- Supplement ACPI HMAT reported memory performance with native CXL
   memory performance enumeration
 
 - Add support for CXL error injection via the ACPI EINJ mechanism
 
 - Cleanup CXL DOE and CDAT integration
 
 - Miscellaneous cleanups and fixes
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQSbo+XnGs+rwLz9XGXfioYZHlFsZwUCZfSddgAKCRDfioYZHlFs
 Z+QJAQC0DaU/QbziEUFgd6r92nLA9PHLWi2zhjsSjyuJ3kh3IQD+KN0h8IZ+Av05
 EOjLw21+ejwJ2dtCDcy2dlSpS6653wc=
 =czLR
 -----END PGP SIGNATURE-----

Merge tag 'cxl-for-6.9' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl

Pull CXL updates from Dan Williams:
 "CXL has mechanisms to enumerate the performance characteristics of
  memory devices. Those mechanisms allow Linux to build the equivalent
  of ACPI SRAT, SLIT, and HMAT tables dynamically at runtime. That
  capability is necessary because static ACPI can not represent dynamic
  CXL configurations (and reconfigurations).

  So, building on the v6.8 work to add "Quality of Service" enumeration,
  this update plumbs CXL "access coordinates" (read/write access latency
  and bandwidth) in all the same places that ACPI HMAT feeds similar
  data. Follow-on patches from the -mm side can then use that data to
  feed mechanisms like mm/memory-tiers.c. Greg has acked the touch to
  drivers/base/.

  The other feature update this cycle is support for CXL error injection
  via the ACPI EINJ module. That facility enables injection of bus
  protocol errors provided the user knows the magic address values to
  insert in the interface. To hide that magic, and make this easier to
  use, new error injection attributes were added to CXL debugfs. That
  interface injects the errors relative to a CXL object rather than
  require user tooling to know how to lookup and inject RCRB (Root
  Complex Register Block) addresses into the raw EINJ debugfs interface.
  It received some helpful review comments from Tony, but no explicit
  acks from the ACPI side. The primary user visible change for existing
  EINJ users is that they may find that einj.ko was already loaded by
  cxl_core.ko. Previously, einj.ko was only loaded on demand.

  The usual collection of miscellaneous cleanups are also present this
  cycle.

  Summary:

   - Supplement ACPI HMAT reported memory performance with native CXL
     memory performance enumeration

   - Add support for CXL error injection via the ACPI EINJ mechanism

   - Cleanup CXL DOE and CDAT integration

   - Miscellaneous cleanups and fixes"

* tag 'cxl-for-6.9' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl: (21 commits)
  Documentation/ABI/testing/debugfs-cxl: Fix "Unexpected indentation"
  lib/firmware_table: Provide buffer length argument to cdat_table_parse()
  cxl/pci: Get rid of pointer arithmetic reading CDAT table
  cxl/pci: Rename DOE mailbox handle to doe_mb
  cxl: Fix the incorrect assignment of SSLBIS entry pointer initial location
  cxl/core: Add CXL EINJ debugfs files
  EINJ, Documentation: Update EINJ kernel doc
  EINJ: Add CXL error type support
  EINJ: Migrate to a platform driver
  cxl/region: Deal with numa nodes not enumerated by SRAT
  cxl/region: Add memory hotplug notifier for cxl region
  cxl/region: Add sysfs attribute for locality attributes of CXL regions
  cxl/region: Calculate performance data for a region
  cxl: Set cxlmd->endpoint before adding port device
  cxl: Move QoS class to be calculated from the nearest CPU
  cxl: Split out host bridge access coordinates
  cxl: Split out combine_coordinates() for common shared usage
  ACPI: HMAT / cxl: Add retrieval of generic port coordinates for both access classes
  ACPI: HMAT: Introduce 2 levels of generic port access class
  base/node / ACPI: Enumerate node access class for 'struct access_coordinate'
  ...
2024-03-16 10:04:12 -07:00
Samuel Thibault
152609795d fbcon: Increase maximum font width x height to 64 x 128
By using bitmaps we actually support whatever size we would want, but
the console currently limits fonts to 64x128 (which gives 60x16 text on
4k screens), so we don't need more for now, and we can easily increase
later.

Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
Signed-off-by: Helge Deller <deller@gmx.de>
2024-03-16 08:29:48 +01:00
Linus Torvalds
8a2fbffcbf This includes the following changes related to sparc for v6.9:
- Fix missing prototype warnings in various places, including switching
   to using generic cmpdi2/ucmpdi2 and parport.h and stop selecting
   unneeded GENERIC_ISA_DMA.
 - Reduce duplicate code by using shared font data, with dependency fixup
   in separate commit touching lib/fonts.
 - Convert sbus drives to use remove callbacks returning void
 - Fix return values of __setup handlers
 - Section mismatch fix for grpci pci drivers
 - Make the vio bus type constant
 - Kconfig cleanups and fixes
 - Typo fixes
 -----BEGIN PGP SIGNATURE-----
 
 iIoEABYIADIWIQQfqfbgobF48oKMeq81AykqDLayywUCZfQtuhQcYW5kcmVhc0Bn
 YWlzbGVyLmNvbQAKCRA1AykqDLayy85XAQChTv3muYGiLSyX2r5leQrxZwWNjdiw
 W6uc+pM6nCVryAEA+KtjosgriNo7wZViEWqXyYjs/csdEekLpj4U2qBphAg=
 =19B/
 -----END PGP SIGNATURE-----

Merge tag 'sparc-for-6.9-tag1' of git://git.kernel.org/pub/scm/linux/kernel/git/alarsson/linux-sparc

Pull sparc updates from Andreas Larsson:

 - Fix missing prototype warnings in various places, including switching
   to using generic cmpdi2/ucmpdi2 and parport.h and stop selecting
   unneeded GENERIC_ISA_DMA.

 - Reduce duplicate code by using shared font data, with dependency
   fixup in separate commit touching lib/fonts.

 - Convert sbus drives to use remove callbacks returning void

 - Fix return values of __setup handlers

 - Section mismatch fix for grpci pci drivers

 - Make the vio bus type constant

 - Kconfig cleanups and fixes

 - Typo fixes

* tag 'sparc-for-6.9-tag1' of git://git.kernel.org/pub/scm/linux/kernel/git/alarsson/linux-sparc:
  lib/fonts: Allow Sparc console 8x16 font for sparc64 early boot text console
  sbus: uctrl: Convert to platform remove callback returning void
  sbus: flash: Convert to platform remove callback returning void
  sbus: envctrl: Convert to platform remove callback returning void
  sbus: display7seg: Convert to platform remove callback returning void
  sbus: bbc_i2c: Convert to platform remove callback returning void
  sbus: Add prototype for bbc_envctrl_init and bbc_envctrl_cleanup to header
  sparc32: Fix section mismatch in leon_pci_grpci
  sparc32: Fix parport build with sparc32
  sparc32: Do not select GENERIC_ISA_DMA
  mtd: maps: sun_uflash: Declare uflash_devinit static
  sparc32: Fix build with trapbase
  sparc32: Use generic cmpdi2/ucmpdi2 variants
  sparc: select FRAME_POINTER instead of redefining it
  sparc: vDSO: fix return value of __setup handler
  sparc64: NMI watchdog: fix return value of __setup handler
  sparc: vio: make vio_bus_type const
  sparc: Fix typos
  sparc: Use shared font data
  sparc: remove obsolete config ARCH_ATU
2024-03-15 12:47:21 -07:00
Linus Torvalds
32a50540c3 bcachefs updates for 6.9
- Subvolume children btree; this is needed for providing a userspace
    interface for walking subvolumes, which will come later
  - Lots of improvements to directory structure checking
  - Improved journal pipelining, significantly improving performance on
    high iodepth write workloads
  - Discard path improvements: the discard path is more efficient, and no
    longer flushes the journal unnecessarily
  - Buffered write path can now avoid taking the inode lock
  - new mm helper: memalloc_flags_{save|restore}
  - mempool now does kvmalloc mempools
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEKnAFLkS8Qha+jvQrE6szbY3KbnYFAmXycEcACgkQE6szbY3K
 bnYUTg/+K4Nv2EdAqOCyHRTKaF2OgJDUb25ZDmbGpfT1XyPrNB7/+CxHqSdEP7/e
 FVuhtP61vnQAImDv82u9iZiab/TnuCZPUrjSobFEvrWYoGRtP9Bm9MyYB28NzmMa
 AXGmS4yJGVwtxxrFNxZP98IbiHYiHSoYbkqxX2E5VgLag8Ru8peb7oD0Ro3zw0rb
 z+6UM/seJ7on5i/9IJEMKKXFVEoZC2J5DAVoe1TghG2kgOw3cKu5OUdltLPOY5jL
 jkm5J5wa6Ep46nufHat92yiMxXIQrf4U9LkXxzTi5ThoSmt+Af2qXcBjqTTVqd2D
 1dGxj+UG8iu4DCCbQC6EA7J5EMvxfJM0+9lk1ULUgxUs3X69co6nlI6XH1fwEMqk
 KpIqd35+Y/IYgogt9ioXI0dtXyL7dbaTVt6NZhc9SaPGPX+C2V0+l4bqToFdNaPH
 0KATjjyQaJRE4ZFIjr6GliYOtKWDLi/HPEyoBivniUn7cF5vjSvti+cSQwNDSPpa
 6jOd5Y923Iq9ZqDAPM3+mvTH8nNaaf2T2fmbPNrc5pdWbha9bGwOU71zvKHNFGm/
 66ZsnwhKSk+uwglTMZHPKSkJJXUYAHESw3slQtEWHZVlliArc55+pBHwE00bvRt7
 KHUUqkqXBUPzbp/kdZGylMAdH9+8j9TE5QJ2RaoryFm/eCfexmI=
 =6xnj
 -----END PGP SIGNATURE-----

Merge tag 'bcachefs-2024-03-13' of https://evilpiepirate.org/git/bcachefs

Pull bcachefs updates from Kent Overstreet:

 - Subvolume children btree; this is needed for providing a userspace
   interface for walking subvolumes, which will come later

 - Lots of improvements to directory structure checking

 - Improved journal pipelining, significantly improving performance on
   high iodepth write workloads

 - Discard path improvements: the discard path is more efficient, and no
   longer flushes the journal unnecessarily

 - Buffered write path can now avoid taking the inode lock

 - new mm helper: memalloc_flags_{save|restore}

 - mempool now does kvmalloc mempools

* tag 'bcachefs-2024-03-13' of https://evilpiepirate.org/git/bcachefs: (128 commits)
  bcachefs: time_stats: shrink time_stat_buffer for better alignment
  bcachefs: time_stats: split stats-with-quantiles into a separate structure
  bcachefs: mean_and_variance: put struct mean_and_variance_weighted on a diet
  bcachefs: time_stats: add larger units
  bcachefs: pull out time_stats.[ch]
  bcachefs: reconstruct_alloc cleanup
  bcachefs: fix bch_folio_sector padding
  bcachefs: Fix btree key cache coherency during replay
  bcachefs: Always flush write buffer in delete_dead_inodes()
  bcachefs: Fix order of gc_done passes
  bcachefs: fix deletion of indirect extents in btree_gc
  bcachefs: Prefer struct_size over open coded arithmetic
  bcachefs: Kill unused flags argument to btree_split()
  bcachefs: Check for writing superblocks with nonsense member seq fields
  bcachefs: fix bch2_journal_buf_to_text()
  lib/generic-radix-tree.c: Make nodes more reasonably sized
  bcachefs: copy_(to|from)_user_errcode()
  bcachefs: Split out bkey_types.h
  bcachefs: fix lost journal buf wakeup due to improved pipelining
  bcachefs: intercept mountoption value for bool type
  ...
2024-03-15 09:00:09 -07:00
Linus Torvalds
e5eb28f6d1 - Kuan-Wei Chiu has developed the well-named series "lib min_heap: Min
heap optimizations".
 
 - Kuan-Wei Chiu has also sped up the library sorting code in the series
   "lib/sort: Optimize the number of swaps and comparisons".
 
 - Alexey Gladkov has added the ability for code running within an IPC
   namespace to alter its IPC and MQ limits.  The series is "Allow to
   change ipc/mq sysctls inside ipc namespace".
 
 - Geert Uytterhoeven has contributed some dhrystone maintenance work in
   the series "lib: dhry: miscellaneous cleanups".
 
 - Ryusuke Konishi continues nilfs2 maintenance work in the series
 
 	"nilfs2: eliminate kmap and kmap_atomic calls"
 	"nilfs2: fix kernel bug at submit_bh_wbc()"
 
 - Nathan Chancellor has updated our build tools requirements in the
   series "Bump the minimum supported version of LLVM to 13.0.1".
 
 - Muhammad Usama Anjum continues with the selftests maintenance work in
   the series "selftests/mm: Improve run_vmtests.sh".
 
 - Oleg Nesterov has done some maintenance work against the signal code
   in the series "get_signal: minor cleanups and fix".
 
 Plus the usual shower of singleton patches in various parts of the tree.
 Please see the individual changelogs for details.
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCZfMnvgAKCRDdBJ7gKXxA
 jjKMAP4/Upq07D4wjkMVPb+QrkipbbLpdcgJ++q3z6rba4zhPQD+M3SFriIJk/Xh
 tKVmvihFxfAhdDthseXcIf1nBjMALwY=
 =8rVc
 -----END PGP SIGNATURE-----

Merge tag 'mm-nonmm-stable-2024-03-14-09-36' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull non-MM updates from Andrew Morton:

 - Kuan-Wei Chiu has developed the well-named series "lib min_heap: Min
   heap optimizations".

 - Kuan-Wei Chiu has also sped up the library sorting code in the series
   "lib/sort: Optimize the number of swaps and comparisons".

 - Alexey Gladkov has added the ability for code running within an IPC
   namespace to alter its IPC and MQ limits. The series is "Allow to
   change ipc/mq sysctls inside ipc namespace".

 - Geert Uytterhoeven has contributed some dhrystone maintenance work in
   the series "lib: dhry: miscellaneous cleanups".

 - Ryusuke Konishi continues nilfs2 maintenance work in the series

	"nilfs2: eliminate kmap and kmap_atomic calls"
	"nilfs2: fix kernel bug at submit_bh_wbc()"

 - Nathan Chancellor has updated our build tools requirements in the
   series "Bump the minimum supported version of LLVM to 13.0.1".

 - Muhammad Usama Anjum continues with the selftests maintenance work in
   the series "selftests/mm: Improve run_vmtests.sh".

 - Oleg Nesterov has done some maintenance work against the signal code
   in the series "get_signal: minor cleanups and fix".

Plus the usual shower of singleton patches in various parts of the tree.
Please see the individual changelogs for details.

* tag 'mm-nonmm-stable-2024-03-14-09-36' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (77 commits)
  nilfs2: prevent kernel bug at submit_bh_wbc()
  nilfs2: fix failure to detect DAT corruption in btree and direct mappings
  ocfs2: enable ocfs2_listxattr for special files
  ocfs2: remove SLAB_MEM_SPREAD flag usage
  assoc_array: fix the return value in assoc_array_insert_mid_shortcut()
  buildid: use kmap_local_page()
  watchdog/core: remove sysctl handlers from public header
  nilfs2: use div64_ul() instead of do_div()
  mul_u64_u64_div_u64: increase precision by conditionally swapping a and b
  kexec: copy only happens before uchunk goes to zero
  get_signal: don't initialize ksig->info if SIGNAL_GROUP_EXIT/group_exec_task
  get_signal: hide_si_addr_tag_bits: fix the usage of uninitialized ksig
  get_signal: don't abuse ksig->info.si_signo and ksig->sig
  const_structs.checkpatch: add device_type
  Normalise "name (ad@dr)" MODULE_AUTHORs to "name <ad@dr>"
  dyndbg: replace kstrdup() + strchr() with kstrdup_and_replace()
  list: leverage list_is_head() for list_entry_is_head()
  nilfs2: MAINTAINERS: drop unreachable project mirror site
  smp: make __smp_processor_id() 0-argument macro
  fat: fix uninitialized field in nostale filehandles
  ...
2024-03-14 18:03:09 -07:00
Linus Torvalds
902861e34c - Sumanth Korikkar has taught s390 to allocate hotplug-time page frames
from hotplugged memory rather than only from main memory.  Series
   "implement "memmap on memory" feature on s390".
 
 - More folio conversions from Matthew Wilcox in the series
 
 	"Convert memcontrol charge moving to use folios"
 	"mm: convert mm counter to take a folio"
 
 - Chengming Zhou has optimized zswap's rbtree locking, providing
   significant reductions in system time and modest but measurable
   reductions in overall runtimes.  The series is "mm/zswap: optimize the
   scalability of zswap rb-tree".
 
 - Chengming Zhou has also provided the series "mm/zswap: optimize zswap
   lru list" which provides measurable runtime benefits in some
   swap-intensive situations.
 
 - And Chengming Zhou further optimizes zswap in the series "mm/zswap:
   optimize for dynamic zswap_pools".  Measured improvements are modest.
 
 - zswap cleanups and simplifications from Yosry Ahmed in the series "mm:
   zswap: simplify zswap_swapoff()".
 
 - In the series "Add DAX ABI for memmap_on_memory", Vishal Verma has
   contributed several DAX cleanups as well as adding a sysfs tunable to
   control the memmap_on_memory setting when the dax device is hotplugged
   as system memory.
 
 - Johannes Weiner has added the large series "mm: zswap: cleanups",
   which does that.
 
 - More DAMON work from SeongJae Park in the series
 
 	"mm/damon: make DAMON debugfs interface deprecation unignorable"
 	"selftests/damon: add more tests for core functionalities and corner cases"
 	"Docs/mm/damon: misc readability improvements"
 	"mm/damon: let DAMOS feeds and tame/auto-tune itself"
 
 - In the series "mm/mempolicy: weighted interleave mempolicy and sysfs
   extension" Rakie Kim has developed a new mempolicy interleaving policy
   wherein we allocate memory across nodes in a weighted fashion rather
   than uniformly.  This is beneficial in heterogeneous memory environments
   appearing with CXL.
 
 - Christophe Leroy has contributed some cleanup and consolidation work
   against the ARM pagetable dumping code in the series "mm: ptdump:
   Refactor CONFIG_DEBUG_WX and check_wx_pages debugfs attribute".
 
 - Luis Chamberlain has added some additional xarray selftesting in the
   series "test_xarray: advanced API multi-index tests".
 
 - Muhammad Usama Anjum has reworked the selftest code to make its
   human-readable output conform to the TAP ("Test Anything Protocol")
   format.  Amongst other things, this opens up the use of third-party
   tools to parse and process out selftesting results.
 
 - Ryan Roberts has added fork()-time PTE batching of THP ptes in the
   series "mm/memory: optimize fork() with PTE-mapped THP".  Mainly
   targeted at arm64, this significantly speeds up fork() when the process
   has a large number of pte-mapped folios.
 
 - David Hildenbrand also gets in on the THP pte batching game in his
   series "mm/memory: optimize unmap/zap with PTE-mapped THP".  It
   implements batching during munmap() and other pte teardown situations.
   The microbenchmark improvements are nice.
 
 - And in the series "Transparent Contiguous PTEs for User Mappings" Ryan
   Roberts further utilizes arm's pte's contiguous bit ("contpte
   mappings").  Kernel build times on arm64 improved nicely.  Ryan's series
   "Address some contpte nits" provides some followup work.
 
 - In the series "mm/hugetlb: Restore the reservation" Breno Leitao has
   fixed an obscure hugetlb race which was causing unnecessary page faults.
   He has also added a reproducer under the selftest code.
 
 - In the series "selftests/mm: Output cleanups for the compaction test",
   Mark Brown did what the title claims.
 
 - Kinsey Ho has added the series "mm/mglru: code cleanup and refactoring".
 
 - Even more zswap material from Nhat Pham.  The series "fix and extend
   zswap kselftests" does as claimed.
 
 - In the series "Introduce cpu_dcache_is_aliasing() to fix DAX
   regression" Mathieu Desnoyers has cleaned up and fixed rather a mess in
   our handling of DAX on archiecctures which have virtually aliasing data
   caches.  The arm architecture is the main beneficiary.
 
 - Lokesh Gidra's series "per-vma locks in userfaultfd" provides dramatic
   improvements in worst-case mmap_lock hold times during certain
   userfaultfd operations.
 
 - Some page_owner enhancements and maintenance work from Oscar Salvador
   in his series
 
 	"page_owner: print stacks and their outstanding allocations"
 	"page_owner: Fixup and cleanup"
 
 - Uladzislau Rezki has contributed some vmalloc scalability improvements
   in his series "Mitigate a vmap lock contention".  It realizes a 12x
   improvement for a certain microbenchmark.
 
 - Some kexec/crash cleanup work from Baoquan He in the series "Split
   crash out from kexec and clean up related config items".
 
 - Some zsmalloc maintenance work from Chengming Zhou in the series
 
 	"mm/zsmalloc: fix and optimize objects/page migration"
 	"mm/zsmalloc: some cleanup for get/set_zspage_mapping()"
 
 - Zi Yan has taught the MM to perform compaction on folios larger than
   order=0.  This a step along the path to implementaton of the merging of
   large anonymous folios.  The series is named "Enable >0 order folio
   memory compaction".
 
 - Christoph Hellwig has done quite a lot of cleanup work in the
   pagecache writeback code in his series "convert write_cache_pages() to
   an iterator".
 
 - Some modest hugetlb cleanups and speedups in Vishal Moola's series
   "Handle hugetlb faults under the VMA lock".
 
 - Zi Yan has changed the page splitting code so we can split huge pages
   into sizes other than order-0 to better utilize large folios.  The
   series is named "Split a folio to any lower order folios".
 
 - David Hildenbrand has contributed the series "mm: remove
   total_mapcount()", a cleanup.
 
 - Matthew Wilcox has sought to improve the performance of bulk memory
   freeing in his series "Rearrange batched folio freeing".
 
 - Gang Li's series "hugetlb: parallelize hugetlb page init on boot"
   provides large improvements in bootup times on large machines which are
   configured to use large numbers of hugetlb pages.
 
 - Matthew Wilcox's series "PageFlags cleanups" does that.
 
 - Qi Zheng's series "minor fixes and supplement for ptdesc" does that
   also.  S390 is affected.
 
 - Cleanups to our pagemap utility functions from Peter Xu in his series
   "mm/treewide: Replace pXd_large() with pXd_leaf()".
 
 - Nico Pache has fixed a few things with our hugepage selftests in his
   series "selftests/mm: Improve Hugepage Test Handling in MM Selftests".
 
 - Also, of course, many singleton patches to many things.  Please see
   the individual changelogs for details.
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCZfJpPQAKCRDdBJ7gKXxA
 joxeAP9TrcMEuHnLmBlhIXkWbIR4+ki+pA3v+gNTlJiBhnfVSgD9G55t1aBaRplx
 TMNhHfyiHYDTx/GAV9NXW84tasJSDgA=
 =TG55
 -----END PGP SIGNATURE-----

Merge tag 'mm-stable-2024-03-13-20-04' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull MM updates from Andrew Morton:

 - Sumanth Korikkar has taught s390 to allocate hotplug-time page frames
   from hotplugged memory rather than only from main memory. Series
   "implement "memmap on memory" feature on s390".

 - More folio conversions from Matthew Wilcox in the series

	"Convert memcontrol charge moving to use folios"
	"mm: convert mm counter to take a folio"

 - Chengming Zhou has optimized zswap's rbtree locking, providing
   significant reductions in system time and modest but measurable
   reductions in overall runtimes. The series is "mm/zswap: optimize the
   scalability of zswap rb-tree".

 - Chengming Zhou has also provided the series "mm/zswap: optimize zswap
   lru list" which provides measurable runtime benefits in some
   swap-intensive situations.

 - And Chengming Zhou further optimizes zswap in the series "mm/zswap:
   optimize for dynamic zswap_pools". Measured improvements are modest.

 - zswap cleanups and simplifications from Yosry Ahmed in the series
   "mm: zswap: simplify zswap_swapoff()".

 - In the series "Add DAX ABI for memmap_on_memory", Vishal Verma has
   contributed several DAX cleanups as well as adding a sysfs tunable to
   control the memmap_on_memory setting when the dax device is
   hotplugged as system memory.

 - Johannes Weiner has added the large series "mm: zswap: cleanups",
   which does that.

 - More DAMON work from SeongJae Park in the series

	"mm/damon: make DAMON debugfs interface deprecation unignorable"
	"selftests/damon: add more tests for core functionalities and corner cases"
	"Docs/mm/damon: misc readability improvements"
	"mm/damon: let DAMOS feeds and tame/auto-tune itself"

 - In the series "mm/mempolicy: weighted interleave mempolicy and sysfs
   extension" Rakie Kim has developed a new mempolicy interleaving
   policy wherein we allocate memory across nodes in a weighted fashion
   rather than uniformly. This is beneficial in heterogeneous memory
   environments appearing with CXL.

 - Christophe Leroy has contributed some cleanup and consolidation work
   against the ARM pagetable dumping code in the series "mm: ptdump:
   Refactor CONFIG_DEBUG_WX and check_wx_pages debugfs attribute".

 - Luis Chamberlain has added some additional xarray selftesting in the
   series "test_xarray: advanced API multi-index tests".

 - Muhammad Usama Anjum has reworked the selftest code to make its
   human-readable output conform to the TAP ("Test Anything Protocol")
   format. Amongst other things, this opens up the use of third-party
   tools to parse and process out selftesting results.

 - Ryan Roberts has added fork()-time PTE batching of THP ptes in the
   series "mm/memory: optimize fork() with PTE-mapped THP". Mainly
   targeted at arm64, this significantly speeds up fork() when the
   process has a large number of pte-mapped folios.

 - David Hildenbrand also gets in on the THP pte batching game in his
   series "mm/memory: optimize unmap/zap with PTE-mapped THP". It
   implements batching during munmap() and other pte teardown
   situations. The microbenchmark improvements are nice.

 - And in the series "Transparent Contiguous PTEs for User Mappings"
   Ryan Roberts further utilizes arm's pte's contiguous bit ("contpte
   mappings"). Kernel build times on arm64 improved nicely. Ryan's
   series "Address some contpte nits" provides some followup work.

 - In the series "mm/hugetlb: Restore the reservation" Breno Leitao has
   fixed an obscure hugetlb race which was causing unnecessary page
   faults. He has also added a reproducer under the selftest code.

 - In the series "selftests/mm: Output cleanups for the compaction
   test", Mark Brown did what the title claims.

 - Kinsey Ho has added the series "mm/mglru: code cleanup and
   refactoring".

 - Even more zswap material from Nhat Pham. The series "fix and extend
   zswap kselftests" does as claimed.

 - In the series "Introduce cpu_dcache_is_aliasing() to fix DAX
   regression" Mathieu Desnoyers has cleaned up and fixed rather a mess
   in our handling of DAX on archiecctures which have virtually aliasing
   data caches. The arm architecture is the main beneficiary.

 - Lokesh Gidra's series "per-vma locks in userfaultfd" provides
   dramatic improvements in worst-case mmap_lock hold times during
   certain userfaultfd operations.

 - Some page_owner enhancements and maintenance work from Oscar Salvador
   in his series

	"page_owner: print stacks and their outstanding allocations"
	"page_owner: Fixup and cleanup"

 - Uladzislau Rezki has contributed some vmalloc scalability
   improvements in his series "Mitigate a vmap lock contention". It
   realizes a 12x improvement for a certain microbenchmark.

 - Some kexec/crash cleanup work from Baoquan He in the series "Split
   crash out from kexec and clean up related config items".

 - Some zsmalloc maintenance work from Chengming Zhou in the series

	"mm/zsmalloc: fix and optimize objects/page migration"
	"mm/zsmalloc: some cleanup for get/set_zspage_mapping()"

 - Zi Yan has taught the MM to perform compaction on folios larger than
   order=0. This a step along the path to implementaton of the merging
   of large anonymous folios. The series is named "Enable >0 order folio
   memory compaction".

 - Christoph Hellwig has done quite a lot of cleanup work in the
   pagecache writeback code in his series "convert write_cache_pages()
   to an iterator".

 - Some modest hugetlb cleanups and speedups in Vishal Moola's series
   "Handle hugetlb faults under the VMA lock".

 - Zi Yan has changed the page splitting code so we can split huge pages
   into sizes other than order-0 to better utilize large folios. The
   series is named "Split a folio to any lower order folios".

 - David Hildenbrand has contributed the series "mm: remove
   total_mapcount()", a cleanup.

 - Matthew Wilcox has sought to improve the performance of bulk memory
   freeing in his series "Rearrange batched folio freeing".

 - Gang Li's series "hugetlb: parallelize hugetlb page init on boot"
   provides large improvements in bootup times on large machines which
   are configured to use large numbers of hugetlb pages.

 - Matthew Wilcox's series "PageFlags cleanups" does that.

 - Qi Zheng's series "minor fixes and supplement for ptdesc" does that
   also. S390 is affected.

 - Cleanups to our pagemap utility functions from Peter Xu in his series
   "mm/treewide: Replace pXd_large() with pXd_leaf()".

 - Nico Pache has fixed a few things with our hugepage selftests in his
   series "selftests/mm: Improve Hugepage Test Handling in MM
   Selftests".

 - Also, of course, many singleton patches to many things. Please see
   the individual changelogs for details.

* tag 'mm-stable-2024-03-13-20-04' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (435 commits)
  mm/zswap: remove the memcpy if acomp is not sleepable
  crypto: introduce: acomp_is_async to expose if comp drivers might sleep
  memtest: use {READ,WRITE}_ONCE in memory scanning
  mm: prohibit the last subpage from reusing the entire large folio
  mm: recover pud_leaf() definitions in nopmd case
  selftests/mm: skip the hugetlb-madvise tests on unmet hugepage requirements
  selftests/mm: skip uffd hugetlb tests with insufficient hugepages
  selftests/mm: dont fail testsuite due to a lack of hugepages
  mm/huge_memory: skip invalid debugfs new_order input for folio split
  mm/huge_memory: check new folio order when split a folio
  mm, vmscan: retry kswapd's priority loop with cache_trim_mode off on failure
  mm: add an explicit smp_wmb() to UFFDIO_CONTINUE
  mm: fix list corruption in put_pages_list
  mm: remove folio from deferred split list before uncharging it
  filemap: avoid unnecessary major faults in filemap_fault()
  mm,page_owner: drop unnecessary check
  mm,page_owner: check for null stack_record before bumping its refcount
  mm: swap: fix race between free_swap_and_cache() and swapoff()
  mm/treewide: align up pXd_leaf() retval across archs
  mm/treewide: drop pXd_large()
  ...
2024-03-14 17:43:30 -07:00
Linus Torvalds
705c1da8fa pci-v6.9-changes
-----BEGIN PGP SIGNATURE-----
 
 iQJIBAABCgAyFiEEgMe7l+5h9hnxdsnuWYigwDrT+vwFAmXw04sUHGJoZWxnYWFz
 QGdvb2dsZS5jb20ACgkQWYigwDrT+vyT3xAAsp5+c2IcbrXpZZM7figwx4y9PPRp
 jcQ4AYSGP41xqTUGXUTcVZYvRorSIAFEOz33U0SL1UNxoOZz8j/M6SD58k8a6XRr
 9SSPuKja1OKJjONhS1bzrcbVtuzr71ISrECXfLkvW5hY5hvq+3+anMtu3/UIEHu6
 M1vVc+basRjjPJNTixMWvVqS3R+4gDAFeBtdZl/D+U6v0v2xOK+81YZqjfZZCw9v
 xmdHHK2dKNEdysNoRJ5cafY3b1NnSsrxlHbIhBnKt+7uRSWKD1dHcBQj7wDc/HrX
 yBGca+BZBKitXEJM3p5KcWWs4ijaywGw0GSffUIKrN9i6RIfwnxBH9jUbwDngifU
 2IR/kLEqdjYi/WnENxIHpQATLyXhXZ8uEnLS0xMlRIA96u3M5B0mrYOZxaN3bo12
 A3aE+aPOTw0u1wf7G8dBX6IdYnjZ/ZuR9Q+fVoKpZBvsYUVaKyiKCtKMCNaVirn5
 z7nxR1W71ee+35+37KthPXhiw+YtURGz1wBWt+wWUMjBcpIj2bpzU9wQDE9ZMdYt
 XJoJcatrRhJzefO3uzd0egft+vwk0xrj5LQEDhMQyDrnBLC4EgI5niKPWqbay5Nx
 Cnll01CI82xAnIF6eu7OOuI1nYGtoFcY8rP3hTC85cWN7Xi8SAOLTZZcVTpfBMUr
 l2uEll8p+8dZ6IY=
 =AP3I
 -----END PGP SIGNATURE-----

Merge tag 'pci-v6.9-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci

Pull PCI updates from Bjorn Helgaas:
 "Enumeration:

   - Consolidate interrupt related code in irq.c (Ilpo Järvinen)

   - Reduce kernel size by replacing sysfs resource macros with
     functions (Ilpo Järvinen)

   - Reduce kernel size by compiling sysfs support only when
     CONFIG_SYSFS=y (Lukas Wunner)

   - Avoid using Extended Tags on 3ware-9650SE Root Port to work around
     an apparent hardware defect (Jörg Wedekind)

  Resource management:

   - Fix an MMIO mapping leak in pci_iounmap() (Philipp Stanner)

   - Move pci_iomap.c and other PCI-specific devres code to drivers/pci
     (Philipp Stanner)

   - Consolidate PCI devres code in devres.c (Philipp Stanner)

  Power management:

   - Avoid D3cold on Asus B1400 PCI-NVMe bridge, where firmware doesn't
     know how to return correctly to D0, and remove previous quirk that
     wasn't as specific (Daniel Drake)

   - Allow runtime PM when the driver enables it but doesn't need any
     runtime PM callbacks (Raag Jadav)

   - Drain runtime-idle callbacks before driver removal to avoid races
     between .remove() and .runtime_idle(), which caused intermittent
     page faults when the rtsx .runtime_idle() accessed registers that
     its .remove() had already unmapped (Rafael J. Wysocki)

  Virtualization:

   - Avoid Secondary Bus Reset on LSI FW643 so it can be assigned to VMs
     with VFIO, e.g., for professional audio software on many Apple
     machines, at the cost of leaking state between VMs (Edmund Raile)

  Error handling:

   - Print all logged TLP Prefixes, not just the first, after AER or DPC
     errors (Ilpo Järvinen)

   - Quirk the DPC PIO log size for Intel Raptor Lake Root Ports, which
     still don't advertise a legal size (Paul Menzel)

   - Ignore expected DPC Surprise Down errors on hot removal (Smita
     Koralahalli)

   - Block runtime suspend while handling AER errors to avoid races that
     prevent the device form being resumed from D3hot (Stanislaw
     Gruszka)

  Peer-to-peer DMA:

   - Use atomic XA allocation in RCU read section (Christophe JAILLET)

  ASPM:

   - Collect bits of ASPM-related code that we need even without
     CONFIG_PCIEASPM into aspm.c (David E. Box)

   - Save/restore L1 PM Substates config for suspend/resume (David E.
     Box)

   - Update save_save when ASPM config is changed, so a .slot_reset()
     during error recovery restores the changed config, not the
     .probe()-time config (Vidya Sagar)

  Endpoint framework:

   - Refactor and improve pci_epf_alloc_space() API (Niklas Cassel)

   - Clean up endpoint BAR descriptions (Niklas Cassel)

   - Fix ntb_register_device() name leak in error path (Yang Yingliang)

   - Return actual error code for pci_vntb_probe() failure (Yang
     Yingliang)

  Broadcom STB PCIe controller driver:

   - Fix MDIO write polling, which previously never waited for
     completion (Jonathan Bell)

  Cadence PCIe endpoint driver:

   - Clear the ARI "Next Function Number" of last function (Jasko-EXT
     Wojciech)

  Freescale i.MX6 PCIe controller driver:

   - Simplify by replacing switch statements with function pointers for
     different hardware variants (Frank Li)

   - Simplify by using clk_bulk*() API (Frank Li)

   - Remove redundant DT clock and reg/reg-name details (Frank Li)

   - Add i.MX95 DT and driver support for both Root Complex and Endpoint
     mode (Frank Li)

  Microsoft Hyper-V host bridge driver:

   - Reduce memory usage by limiting ring buffer size to 16KB instead of
     4 pages (Michael Kelley)

  Qualcomm PCIe controller driver:

   - Add X1E80100 DT and driver support (Abel Vesa)

   - Add DT 'required-opps' for SoCs that require a minimum performance
     level (Johan Hovold)

   - Make DT 'msi-map-mask' optional, depending on how MSI interrupts
     are mapped (Johan Hovold)

   - Disable ASPM L0s for sc8280xp, sa8540p and sa8295p because the PHY
     configuration isn't tuned correctly for L0s (Johan Hovold)

   - Split dt-binding qcom,pcie.yaml into qcom,pcie-common.yaml and
     separate files for SA8775p, SC7280, SC8180X, SC8280XP, SM8150,
     SM8250, SM8350, SM8450, SM8550 for easier reviewing (Krzysztof
     Kozlowski)

   - Enable BDF to SID translation by disabling bypass mode (Manivannan
     Sadhasivam)

   - Add endpoint MHI support for Snapdragon SA8775P SoC (Mrinmay
     Sarkar)

  Synopsys DesignWare PCIe controller driver:

   - Allocate 64-bit MSI address if no 32-bit address is available (Ajay
     Agarwal)

   - Fix endpoint Resizable BAR to actually advertise the required 1MB
     size (Niklas Cassel)

  MicroSemi Switchtec management driver:

   - Release resources if the .probe() fails (Christophe JAILLET)

  Miscellaneous:

   - Make pcie_port_bus_type const (Ricardo B. Marliere)"

* tag 'pci-v6.9-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci: (77 commits)
  PCI/ASPM: Update save_state when configuration changes
  PCI/ASPM: Disable L1 before configuring L1 Substates
  PCI/ASPM: Call pci_save_ltr_state() from pci_save_pcie_state()
  PCI/ASPM: Save L1 PM Substates Capability for suspend/resume
  PCI: hv: Fix ring buffer size calculation
  PCI: dwc: endpoint: Fix advertised resizable BAR size
  PCI: cadence: Clear the ARI Capability Next Function Number of the last function
  PCI: dwc: Strengthen the MSI address allocation logic
  PCI: brcmstb: Fix broken brcm_pcie_mdio_write() polling
  PCI: qcom: Add X1E80100 PCIe support
  dt-bindings: PCI: qcom: Document the X1E80100 PCIe Controller
  PCI: qcom: Enable BDF to SID translation properly
  PCI/AER: Generalize TLP Header Log reading
  PCI/AER: Use explicit register size for PCI_ERR_CAP
  PCI: qcom: Disable ASPM L0s for sc8280xp, sa8540p and sa8295p
  dt-bindings: PCI: qcom: Do not require 'msi-map-mask'
  dt-bindings: PCI: qcom: Allow 'required-opps'
  PCI/AER: Block runtime suspend when handling errors
  PCI/ASPM: Move pci_save_ltr_state() to aspm.c
  PCI/ASPM: Always build aspm.c
  ...
2024-03-14 10:58:27 -07:00
Kent Overstreet
3a319a2476 lib/generic-radix-tree.c: Make nodes more reasonably sized
this code originally used the page allocator directly, but most code
shouldn't do that - PAGE_SIZE varies with architecture, and slab is
faster.

4k is also on the large side for typical usage, 512 bytes is a better
choice for typical usage that might be somewhat sparse.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-03-13 21:22:26 -04:00
Linus Torvalds
ce0c1c9265 Modules changes for v6.9-rc1
Christophe Leroy did most of the work on this release, first with a few
 cleanups on CONFIG_STRICT_KERNEL_RWX and ending with error handling for
 when set_memory_XX() can fail. This is part of a larger effort to clean
 up all these callers which can fail, modules is just part of it.
 
 This has been sitting on linux-next for about a month without issues.
 -----BEGIN PGP SIGNATURE-----
 
 iQJGBAABCgAwFiEENnNq2KuOejlQLZofziMdCjCSiKcFAmXxArkSHG1jZ3JvZkBr
 ZXJuZWwub3JnAAoJEM4jHQowkoin5VoQALp/Fbv3uDbp/vF+9c+qBNaJCwdiDXto
 R7ns4qjjqs11OpZF3to4WzwScKUESbwStXY1hkjQqRED3vN49SmDVdS7P+Aa5ixu
 SLEMIgD0qJp8aM+SWyejLEY2vCf+tvK81Cb7/HdjAsH0UWblb/mPzbULUCbKi/P5
 qKU+UO0Ojx3Zl9RXUo81dDhbJzhmjBbsxYRLiOaMbWemEh0DO0bqI+8LLq4rdQX1
 dnRCTeHZOZNCwTauqV0NY5ZGNQayJguc/+sK127JSLlxvllGC9n8CVQVOCxUK5oM
 SGv3lPK8uwanYuX+PLJGMcbdk8uzD4WGwQVI6A71S4Uv4Y5TO6Tph/ZsViQc+0hE
 fdoGmoLV/SaFVzSm5u3E4j6i4nRb8uRGWD2dzCgG7POyjxSu7LLBSVG9eeJNjuOJ
 Dkdvi2hBlyGQiYtKeS29EXfIU4YF7eQs14Js7dXkIiEuiz94gpzHfJD07e+hg7o+
 51f6sB5DQ6hewkmnCqayuXxPsW2fF7a8x5Ce+iTrde9n5lF7ks5wl8JCaliWtYax
 GLwlwLie65yJz9qoirU0VmkFtYd7gJIhHsYdGYK8VHtHb0fdWy9XNO0rc/71QWWb
 3QW2i9PaVB4MoeegOks7pMX/m8YqqqLQ91Es9/5o0GqACt8Nr4/mRkpfH2+kxWQh
 kkwS4W4fdXwK
 =Z9SO
 -----END PGP SIGNATURE-----

Merge tag 'modules-6.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux

Pull modules updates from Luis Chamberlain:
 "Christophe Leroy did most of the work on this release, first with a
  few cleanups on CONFIG_STRICT_KERNEL_RWX and ending with error
  handling for when set_memory_XX() can fail.

  This is part of a larger effort to clean up all these callers which
  can fail, modules is just part of it"

* tag 'modules-6.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux:
  module: Don't ignore errors from set_memory_XX()
  lib/test_kmod: fix kernel-doc warnings
  powerpc: Simplify strict_kernel_rwx_enabled()
  modules: Remove #ifdef CONFIG_STRICT_MODULE_RWX around rodata_enabled
  init: Declare rodata_enabled and mark_rodata_ro() at all time
  module: Change module_enable_{nx/x/ro}() to more explicit names
  module: Use set_memory_rox()
2024-03-13 12:40:58 -07:00
Robert Richter
c6c3187d66 lib/firmware_table: Provide buffer length argument to cdat_table_parse()
There exist card implementations with a CDAT table using a fixed size
buffer, but with entries filled in that do not fill the whole table
length size. Then, the last entry in the CDAT table may not mark the
end of the CDAT table buffer specified by the length field in the CDAT
header. It can be shorter with trailing unused (zero'ed) data. The
actual table length is determined while reading all CDAT entries of
the table with DOE.

If the table is greater than expected (containing zero'ed trailing
data), the CDAT parser fails with:

 [   48.691717] Malformed DSMAS table length: (24:0)
 [   48.702084] [CDAT:0x00] Invalid zero length
 [   48.711460] cxl_port endpoint1: Failed to parse CDAT: -22

In addition, a check of the table buffer length is missing to prevent
an out-of-bound access then parsing the CDAT table.

Hardening code against device returning borked table. Fix that by
providing an optional buffer length argument to
acpi_parse_entries_array() that can be used by cdat_table_parse() to
propagate the buffer size down to its users to check the buffer
length. This also prevents a possible out-of-bound access mentioned.

Add a check to warn about a malformed CDAT table length.

Cc: Rafael J. Wysocki <rafael@kernel.org>
Cc: Len Brown <lenb@kernel.org>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Robert Richter <rrichter@amd.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/ZdEnopFO0Tl3t2O1@rric.localdomain
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2024-03-13 00:03:21 -07:00
Linus Torvalds
b0546776ad printk changes for 6.9
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEESH4wyp42V4tXvYsjUqAMR0iAlPIFAmXwUvcACgkQUqAMR0iA
 lPKSEw/+NZ7MY1NlKs3+j61L3Bh6zIXJcuHJc1Zt4EPkf5N/OyJZxSYXpaFFxDEB
 at+1vcyU9Xq096rsgfxkU8tkIGOpOgLICgyzyF8ZvBNMsVdd+V0gb6PoIuVx2aWj
 wiYOZ6L8XuIGZ4BN3svk10a6Z/Fvk+HNjyuy+X+kqss5NX4hMNKK01FJyMllidC1
 4aryDPo3fKRpHBSi2YMO4NXLZGkAL8+1UoJ5p8ZWsufJKAMlQy4Vd3bIc/f6ccyy
 IuGK1+lOr6c8k/F7JO8EE2GAd8c/KvjwQR+L55YkHVoyUp2f9M19obxwqkIVeD6I
 X0y8OFNY+3hfzub6VRXob8EBmEUIdOCh6GWtT7mwzTyIouscfHltQHQxR2fMaMFF
 G066lkuVvpxhYjKtGePfQi9GtOQvdAHe7l/RS0AK53+FNzAP2I6guHRD6TSsVfNe
 erqY+4//256s/97GeqQ8ON/2gz6u/0rH6e+GiEvoMaTaw0+1YKA9xHi2Qx4AvUHk
 8TNNNZbL2PoDj744Tj1xF/zenlm1BxNeK1Q0l89ZNqPNPEokF4Hq11dW56beWpv7
 gCina3gveAnmZvdJYbn0UJ92eXjff4ZdLmiZVlVyrX2k9PVu2NYOQz4E0cPbL0Gt
 SNYgBW78e1VOcNpUokfq3OiTOQo1VDaW1SypcCbYkuc7tROq0xU=
 =Z413
 -----END PGP SIGNATURE-----

Merge tag 'printk-for-6.9' of git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux

Pull printk updates from Petr Mladek:
 "Improve the behavior during panic. The issues were found when testing
  the ongoing changes introducing atomic consoles and printk kthreads:

   - pr_flush() has to wait for the last reserved record instead of the
     last finalized one. Note that records are finalized in random order
     when generated by more CPUs in parallel.

   - Ignore non-finalized records during panic(). Messages printed on
     panic-CPU are always finalized. Messages printed by other CPUs
     might never be finalized when the CPUs get stopped.

   - Block new printk() calls on non-panic CPUs completely. Backtraces
     are printed before entering the panic mode. Later messages would
     just mess information printed by the panic CPU.

   - Do not take console_lock in console_flush_on_panic() at all. The
     original code did try_lock()/console_unlock(). The unlock part
     might cause a deadlock when panic() happened in a scheduler code.

   - Fix conversion of 64-bit sequence number for 32-bit atomic
     operations"

* tag 'printk-for-6.9' of git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux:
  dump_stack: Do not get cpu_sync for panic CPU
  panic: Flush kernel log buffer at the end
  printk: Avoid non-panic CPUs writing to ringbuffer
  printk: Disable passing console lock owner completely during panic()
  printk: ringbuffer: Skip non-finalized records in panic
  printk: Wait for all reserved records with pr_flush()
  printk: ringbuffer: Cleanup reader terminology
  printk: Add this_cpu_in_panic()
  printk: For @suppress_panic_printk check for other CPU in panic
  printk: ringbuffer: Clarify special lpos values
  printk: ringbuffer: Do not skip non-finalized records with prb_next_seq()
  printk: Use prb_first_seq() as base for 32bit seq macros
  printk: Adjust mapping for 32bit seq macros
  printk: nbcon: Relocate 32bit seq macros
2024-03-12 20:54:50 -07:00
Linus Torvalds
9187210eee Networking changes for 6.9.
Core & protocols
 ----------------
 
  - Large effort by Eric to lower rtnl_lock pressure and remove locks:
 
    - Make commonly used parts of rtnetlink (address, route dumps etc.)
      lockless, protected by RCU instead of rtnl_lock.
 
    - Add a netns exit callback which already holds rtnl_lock,
      allowing netns exit to take rtnl_lock once in the core
      instead of once for each driver / callback.
 
    - Remove locks / serialization in the socket diag interface.
 
    - Remove 6 calls to synchronize_rcu() while holding rtnl_lock.
 
    - Remove the dev_base_lock, depend on RCU where necessary.
 
  - Support busy polling on a per-epoll context basis. Poll length
    and budget parameters can be set independently of system defaults.
 
  - Introduce struct net_hotdata, to make sure read-mostly global config
    variables fit in as few cache lines as possible.
 
  - Add optional per-nexthop statistics to ease monitoring / debug
    of ECMP imbalance problems.
 
  - Support TCP_NOTSENT_LOWAT in MPTCP.
 
  - Ensure that IPv6 temporary addresses' preferred lifetimes are long
    enough, compared to other configured lifetimes, and at least 2 sec.
 
  - Support forwarding of ICMP Error messages in IPSec, per RFC 4301.
 
  - Add support for the independent control state machine for bonding
    per IEEE 802.1AX-2008 5.4.15 in addition to the existing coupled
    control state machine.
 
  - Add "network ID" to MCTP socket APIs to support hosts with multiple
    disjoint MCTP networks.
 
  - Re-use the mono_delivery_time skbuff bit for packets which user
    space wants to be sent at a specified time. Maintain the timing
    information while traversing veth links, bridge etc.
 
  - Take advantage of MSG_SPLICE_PAGES for RxRPC DATA and ACK packets.
 
  - Simplify many places iterating over netdevs by using an xarray
    instead of a hash table walk (hash table remains in place, for
    use on fastpaths).
 
  - Speed up scanning for expired routes by keeping a dedicated list.
 
  - Speed up "generic" XDP by trying harder to avoid large allocations.
 
  - Support attaching arbitrary metadata to netconsole messages.
 
 Things we sprinkled into general kernel code
 --------------------------------------------
 
  - Enforce VM_IOREMAP flag and range in ioremap_page_range and introduce
    VM_SPARSE kind and vm_area_[un]map_pages (used by bpf_arena).
 
  - Rework selftest harness to enable the use of the full range of
    ksft exit code (pass, fail, skip, xfail, xpass).
 
 Netfilter
 ---------
 
  - Allow userspace to define a table that is exclusively owned by a daemon
    (via netlink socket aliveness) without auto-removing this table when
    the userspace program exits. Such table gets marked as orphaned and
    a restarting management daemon can re-attach/regain ownership.
 
  - Speed up element insertions to nftables' concatenated-ranges set type.
    Compact a few related data structures.
 
 BPF
 ---
 
  - Add BPF token support for delegating a subset of BPF subsystem
    functionality from privileged system-wide daemons such as systemd
    through special mount options for userns-bound BPF fs to a trusted
    & unprivileged application.
 
  - Introduce bpf_arena which is sparse shared memory region between BPF
    program and user space where structures inside the arena can have
    pointers to other areas of the arena, and pointers work seamlessly
    for both user-space programs and BPF programs.
 
  - Introduce may_goto instruction that is a contract between the verifier
    and the program. The verifier allows the program to loop assuming it's
    behaving well, but reserves the right to terminate it.
 
  - Extend the BPF verifier to enable static subprog calls in spin lock
    critical sections.
 
  - Support registration of struct_ops types from modules which helps
    projects like fuse-bpf that seeks to implement a new struct_ops type.
 
  - Add support for retrieval of cookies for perf/kprobe multi links.
 
  - Support arbitrary TCP SYN cookie generation / validation in the TC
    layer with BPF to allow creating SYN flood handling in BPF firewalls.
 
  - Add code generation to inline the bpf_kptr_xchg() helper which
    improves performance when stashing/popping the allocated BPF objects.
 
 Wireless
 --------
 
  - Add SPP (signaling and payload protected) AMSDU support.
 
  - Support wider bandwidth OFDMA, as required for EHT operation.
 
 Driver API
 ----------
 
  - Major overhaul of the Energy Efficient Ethernet internals to support
    new link modes (2.5GE, 5GE), share more code between drivers
    (especially those using phylib), and encourage more uniform behavior.
    Convert and clean up drivers.
 
  - Define an API for querying per netdev queue statistics from drivers.
 
  - IPSec: account in global stats for fully offloaded sessions.
 
  - Create a concept of Ethernet PHY Packages at the Device Tree level,
    to allow parameterizing the existing PHY package code.
 
  - Enable Rx hashing (RSS) on GTP protocol fields.
 
 Misc
 ----
 
  - Improvements and refactoring all over networking selftests.
 
  - Create uniform module aliases for TC classifiers, actions,
    and packet schedulers to simplify creating modprobe policies.
 
  - Address all missing MODULE_DESCRIPTION() warnings in networking.
 
  - Extend the Netlink descriptions in YAML to cover message encapsulation
    or "Netlink polymorphism", where interpretation of nested attributes
    depends on link type, classifier type or some other "class type".
 
 Drivers
 -------
 
  - Ethernet high-speed NICs:
    - Add a new driver for Marvell's Octeon PCI Endpoint NIC VF.
    - Intel (100G, ice, idpf):
      - support E825-C devices
    - nVidia/Mellanox:
      - support devices with one port and multiple PCIe links
    - Broadcom (bnxt):
      - support n-tuple filters
      - support configuring the RSS key
    - Wangxun (ngbe/txgbe):
      - implement irq_domain for TXGBE's sub-interrupts
    - Pensando/AMD:
      - support XDP
      - optimize queue submission and wakeup handling (+17% bps)
      - optimize struct layout, saving 28% of memory on queues
 
  - Ethernet NICs embedded and virtual:
    - Google cloud vNIC:
      - refactor driver to perform memory allocations for new queue
        config before stopping and freeing the old queue memory
    - Synopsys (stmmac):
      - obey queueMaxSDU and implement counters required by 802.1Qbv
    - Renesas (ravb):
      - support packet checksum offload
      - suspend to RAM and runtime PM support
 
  - Ethernet switches:
    - nVidia/Mellanox:
      - support for nexthop group statistics
    - Microchip:
      - ksz8: implement PHY loopback
      - add support for KSZ8567, a 7-port 10/100Mbps switch
 
  - PTP:
    - New driver for RENESAS FemtoClock3 Wireless clock generator.
    - Support OCP PTP cards designed and built by Adva.
 
  - CAN:
    - Support recvmsg() flags for own, local and remote traffic
      on CAN BCM sockets.
    - Support for esd GmbH PCIe/402 CAN device family.
    - m_can:
      - Rx/Tx submission coalescing
      - wake on frame Rx
 
  - WiFi:
    - Intel (iwlwifi):
      - enable signaling and payload protected A-MSDUs
      - support wider-bandwidth OFDMA
      - support for new devices
      - bump FW API to 89 for AX devices; 90 for BZ/SC devices
    - MediaTek (mt76):
      - mt7915: newer ADIE version support
      - mt7925: radio temperature sensor support
    - Qualcomm (ath11k):
      - support 6 GHz station power modes: Low Power Indoor (LPI),
        Standard Power) SP and Very Low Power (VLP)
      - QCA6390 & WCN6855: support 2 concurrent station interfaces
      - QCA2066 support
    - Qualcomm (ath12k):
      - refactoring in preparation for Multi-Link Operation (MLO) support
      - 1024 Block Ack window size support
      - firmware-2.bin support
      - support having multiple identical PCI devices (firmware needs to
        have ATH12K_FW_FEATURE_MULTI_QRTR_ID)
      - QCN9274: support split-PHY devices
      - WCN7850: enable Power Save Mode in station mode
      - WCN7850: P2P support
    - RealTek:
      - rtw88: support for more rtw8811cu and rtw8821cu devices
      - rtw89: support SCAN_RANDOM_SN and SET_SCAN_DWELL
      - rtlwifi: speed up USB firmware initialization
      - rtwl8xxxu:
        - RTL8188F: concurrent interface support
        - Channel Switch Announcement (CSA) support in AP mode
    - Broadcom (brcmfmac):
      - per-vendor feature support
      - per-vendor SAE password setup
      - DMI nvram filename quirk for ACEPC W5 Pro
 
 Signed-off-by: Jakub Kicinski <kuba@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEE6jPA+I1ugmIBA4hXMUZtbf5SIrsFAmXv0mgACgkQMUZtbf5S
 IrtgMxAAuRd+WJW++SENr4KxIWhYO1q6Xcxnai43wrNkan9swD24icG8TYALt4f3
 yoT6idQvWReAb5JNlh9rUQz8R7E0nJXlvEFn5MtJwcthx2C6wFo/XkJlddlRrT+j
 c2xGILwLjRhW65LaC0MZ2ECbEERkFz8xcGfK2SWzUgh6KYvPjcRfKFxugpM7xOQK
 P/Wnqhs4fVRS/Mj/bCcXcO+yhwC121Q3qVeQVjGS0AzEC65hAW87a/kc2BfgcegD
 EyI9R7mf6criQwX+0awubjfoIdr4oW/8oDVNvUDczkJkbaEVaLMQk9P5x/0XnnVS
 UHUchWXyI80Q8Rj12uN1/I0h3WtwNQnCRBuLSmtm6GLfCAwbLvp2nGWDnaXiqryW
 DVKUIHGvqPKjkOOMOVfSvfB3LvkS3xsFVVYiQBQCn0YSs/gtu4CoF2Nty9CiLPbK
 tTuxUnLdPDZDxU//l0VArZmP8p2JM7XQGJ+JH8GFH4SBTyBR23e0iyPSoyaxjnYn
 RReDnHMVsrS1i7GPhbqDJWn+uqMSs7N149i0XmmyeqwQHUVSJN3J2BApP2nCaDfy
 H2lTuYly5FfEezt61NvCE4qr/VsWeEjm1fYlFQ9dFn4pGn+HghyCpw+xD1ZN56DN
 lujemau5B3kk1UTtAT4ypPqvuqjkRFqpNV2LzsJSk/Js+hApw8Y=
 =oY52
 -----END PGP SIGNATURE-----

Merge tag 'net-next-6.9' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next

Pull networking updates from Jakub Kicinski:
 "Core & protocols:

   - Large effort by Eric to lower rtnl_lock pressure and remove locks:

      - Make commonly used parts of rtnetlink (address, route dumps
        etc) lockless, protected by RCU instead of rtnl_lock.

      - Add a netns exit callback which already holds rtnl_lock,
        allowing netns exit to take rtnl_lock once in the core instead
        of once for each driver / callback.

      - Remove locks / serialization in the socket diag interface.

      - Remove 6 calls to synchronize_rcu() while holding rtnl_lock.

      - Remove the dev_base_lock, depend on RCU where necessary.

   - Support busy polling on a per-epoll context basis. Poll length and
     budget parameters can be set independently of system defaults.

   - Introduce struct net_hotdata, to make sure read-mostly global
     config variables fit in as few cache lines as possible.

   - Add optional per-nexthop statistics to ease monitoring / debug of
     ECMP imbalance problems.

   - Support TCP_NOTSENT_LOWAT in MPTCP.

   - Ensure that IPv6 temporary addresses' preferred lifetimes are long
     enough, compared to other configured lifetimes, and at least 2 sec.

   - Support forwarding of ICMP Error messages in IPSec, per RFC 4301.

   - Add support for the independent control state machine for bonding
     per IEEE 802.1AX-2008 5.4.15 in addition to the existing coupled
     control state machine.

   - Add "network ID" to MCTP socket APIs to support hosts with multiple
     disjoint MCTP networks.

   - Re-use the mono_delivery_time skbuff bit for packets which user
     space wants to be sent at a specified time. Maintain the timing
     information while traversing veth links, bridge etc.

   - Take advantage of MSG_SPLICE_PAGES for RxRPC DATA and ACK packets.

   - Simplify many places iterating over netdevs by using an xarray
     instead of a hash table walk (hash table remains in place, for use
     on fastpaths).

   - Speed up scanning for expired routes by keeping a dedicated list.

   - Speed up "generic" XDP by trying harder to avoid large allocations.

   - Support attaching arbitrary metadata to netconsole messages.

  Things we sprinkled into general kernel code:

   - Enforce VM_IOREMAP flag and range in ioremap_page_range and
     introduce VM_SPARSE kind and vm_area_[un]map_pages (used by
     bpf_arena).

   - Rework selftest harness to enable the use of the full range of ksft
     exit code (pass, fail, skip, xfail, xpass).

  Netfilter:

   - Allow userspace to define a table that is exclusively owned by a
     daemon (via netlink socket aliveness) without auto-removing this
     table when the userspace program exits. Such table gets marked as
     orphaned and a restarting management daemon can re-attach/regain
     ownership.

   - Speed up element insertions to nftables' concatenated-ranges set
     type. Compact a few related data structures.

  BPF:

   - Add BPF token support for delegating a subset of BPF subsystem
     functionality from privileged system-wide daemons such as systemd
     through special mount options for userns-bound BPF fs to a trusted
     & unprivileged application.

   - Introduce bpf_arena which is sparse shared memory region between
     BPF program and user space where structures inside the arena can
     have pointers to other areas of the arena, and pointers work
     seamlessly for both user-space programs and BPF programs.

   - Introduce may_goto instruction that is a contract between the
     verifier and the program. The verifier allows the program to loop
     assuming it's behaving well, but reserves the right to terminate
     it.

   - Extend the BPF verifier to enable static subprog calls in spin lock
     critical sections.

   - Support registration of struct_ops types from modules which helps
     projects like fuse-bpf that seeks to implement a new struct_ops
     type.

   - Add support for retrieval of cookies for perf/kprobe multi links.

   - Support arbitrary TCP SYN cookie generation / validation in the TC
     layer with BPF to allow creating SYN flood handling in BPF
     firewalls.

   - Add code generation to inline the bpf_kptr_xchg() helper which
     improves performance when stashing/popping the allocated BPF
     objects.

  Wireless:

   - Add SPP (signaling and payload protected) AMSDU support.

   - Support wider bandwidth OFDMA, as required for EHT operation.

  Driver API:

   - Major overhaul of the Energy Efficient Ethernet internals to
     support new link modes (2.5GE, 5GE), share more code between
     drivers (especially those using phylib), and encourage more
     uniform behavior. Convert and clean up drivers.

   - Define an API for querying per netdev queue statistics from
     drivers.

   - IPSec: account in global stats for fully offloaded sessions.

   - Create a concept of Ethernet PHY Packages at the Device Tree level,
     to allow parameterizing the existing PHY package code.

   - Enable Rx hashing (RSS) on GTP protocol fields.

  Misc:

   - Improvements and refactoring all over networking selftests.

   - Create uniform module aliases for TC classifiers, actions, and
     packet schedulers to simplify creating modprobe policies.

   - Address all missing MODULE_DESCRIPTION() warnings in networking.

   - Extend the Netlink descriptions in YAML to cover message
     encapsulation or "Netlink polymorphism", where interpretation of
     nested attributes depends on link type, classifier type or some
     other "class type".

  Drivers:

   - Ethernet high-speed NICs:
      - Add a new driver for Marvell's Octeon PCI Endpoint NIC VF.
      - Intel (100G, ice, idpf):
         - support E825-C devices
      - nVidia/Mellanox:
         - support devices with one port and multiple PCIe links
      - Broadcom (bnxt):
         - support n-tuple filters
         - support configuring the RSS key
      - Wangxun (ngbe/txgbe):
         - implement irq_domain for TXGBE's sub-interrupts
      - Pensando/AMD:
         - support XDP
         - optimize queue submission and wakeup handling (+17% bps)
         - optimize struct layout, saving 28% of memory on queues

   - Ethernet NICs embedded and virtual:
      - Google cloud vNIC:
         - refactor driver to perform memory allocations for new queue
           config before stopping and freeing the old queue memory
      - Synopsys (stmmac):
         - obey queueMaxSDU and implement counters required by 802.1Qbv
      - Renesas (ravb):
         - support packet checksum offload
         - suspend to RAM and runtime PM support

   - Ethernet switches:
      - nVidia/Mellanox:
         - support for nexthop group statistics
      - Microchip:
         - ksz8: implement PHY loopback
         - add support for KSZ8567, a 7-port 10/100Mbps switch

   - PTP:
      - New driver for RENESAS FemtoClock3 Wireless clock generator.
      - Support OCP PTP cards designed and built by Adva.

   - CAN:
      - Support recvmsg() flags for own, local and remote traffic on CAN
        BCM sockets.
      - Support for esd GmbH PCIe/402 CAN device family.
      - m_can:
         - Rx/Tx submission coalescing
         - wake on frame Rx

   - WiFi:
      - Intel (iwlwifi):
         - enable signaling and payload protected A-MSDUs
         - support wider-bandwidth OFDMA
         - support for new devices
         - bump FW API to 89 for AX devices; 90 for BZ/SC devices
      - MediaTek (mt76):
         - mt7915: newer ADIE version support
         - mt7925: radio temperature sensor support
      - Qualcomm (ath11k):
         - support 6 GHz station power modes: Low Power Indoor (LPI),
           Standard Power) SP and Very Low Power (VLP)
         - QCA6390 & WCN6855: support 2 concurrent station interfaces
         - QCA2066 support
      - Qualcomm (ath12k):
         - refactoring in preparation for Multi-Link Operation (MLO)
           support
         - 1024 Block Ack window size support
         - firmware-2.bin support
         - support having multiple identical PCI devices (firmware needs
           to have ATH12K_FW_FEATURE_MULTI_QRTR_ID)
         - QCN9274: support split-PHY devices
         - WCN7850: enable Power Save Mode in station mode
         - WCN7850: P2P support
      - RealTek:
         - rtw88: support for more rtw8811cu and rtw8821cu devices
         - rtw89: support SCAN_RANDOM_SN and SET_SCAN_DWELL
         - rtlwifi: speed up USB firmware initialization
         - rtwl8xxxu:
             - RTL8188F: concurrent interface support
             - Channel Switch Announcement (CSA) support in AP mode
      - Broadcom (brcmfmac):
         - per-vendor feature support
         - per-vendor SAE password setup
         - DMI nvram filename quirk for ACEPC W5 Pro"

* tag 'net-next-6.9' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (2255 commits)
  nexthop: Fix splat with CONFIG_DEBUG_PREEMPT=y
  nexthop: Fix out-of-bounds access during attribute validation
  nexthop: Only parse NHA_OP_FLAGS for dump messages that require it
  nexthop: Only parse NHA_OP_FLAGS for get messages that require it
  bpf: move sleepable flag from bpf_prog_aux to bpf_prog
  bpf: hardcode BPF_PROG_PACK_SIZE to 2MB * num_possible_nodes()
  selftests/bpf: Add kprobe multi triggering benchmarks
  ptp: Move from simple ida to xarray
  vxlan: Remove generic .ndo_get_stats64
  vxlan: Do not alloc tstats manually
  devlink: Add comments to use netlink gen tool
  nfp: flower: handle acti_netdevs allocation failure
  net/packet: Add getsockopt support for PACKET_COPY_THRESH
  net/netlink: Add getsockopt support for NETLINK_LISTEN_ALL_NSID
  selftests/bpf: Add bpf_arena_htab test.
  selftests/bpf: Add bpf_arena_list test.
  selftests/bpf: Add unit tests for bpf_arena_alloc/free_pages
  bpf: Add helper macro bpf_addr_space_cast()
  libbpf: Recognize __arena global variables.
  bpftool: Recognize arena map type
  ...
2024-03-12 17:44:08 -07:00
Linus Torvalds
216532e147 hardening updates for v6.9-rc1
- string.h and related header cleanups (Tanzir Hasan, Andy Shevchenko)
 
 - VMCI memcpy() usage and struct_size() cleanups (Vasiliy Kovalev, Harshit
   Mogalapalli)
 
 - selftests/powerpc: Fix load_unaligned_zeropad build failure (Michael
   Ellerman)
 
 - hardened Kconfig fragment updates (Marco Elver, Lukas Bulwahn)
 
 - Handle tail call optimization better in LKDTM (Douglas Anderson)
 
 - Use long form types in overflow.h (Andy Shevchenko)
 
 - Add flags param to string_get_size() (Andy Shevchenko)
 
 - Add Coccinelle script for potential struct_size() use (Jacob Keller)
 
 - Fix objtool corner case under KCFI (Josh Poimboeuf)
 
 - Drop 13 year old backward compat CAP_SYS_ADMIN check (Jingzi Meng)
 
 - Add str_plural() helper (Michal Wajdeczko, Kees Cook)
 
 - Ignore relocations in .notes section
 
 - Add comments to explain how __is_constexpr() works
 
 - Fix m68k stack alignment expectations in stackinit Kunit test
 
 - Convert string selftests to KUnit
 
 - Add KUnit tests for fortified string functions
 
 - Improve reporting during fortified string warnings
 
 - Allow non-type arg to type_max() and type_min()
 
 - Allow strscpy() to be called with only 2 arguments
 
 - Add binary mode to leaking_addresses scanner
 
 - Various small cleanups to leaking_addresses scanner
 
 - Adding wrapping_*() arithmetic helper
 
 - Annotate initial signed integer wrap-around in refcount_t
 
 - Add explicit UBSAN section to MAINTAINERS
 
 - Fix UBSAN self-test warnings
 
 - Simplify UBSAN build via removal of CONFIG_UBSAN_SANITIZE_ALL
 
 - Reintroduce UBSAN's signed overflow sanitizer
 -----BEGIN PGP SIGNATURE-----
 
 iQJKBAABCgA0FiEEpcP2jyKd1g9yPm4TiXL039xtwCYFAmXvm5kWHGtlZXNjb29r
 QGNocm9taXVtLm9yZwAKCRCJcvTf3G3AJiQqD/4mM6SWZpYHKlR1nEiqIyz7Hqr9
 g4oguuw6HIVNJXLyeBI5Hd43CTeHPA0e++EETqhUAt7HhErxfYJY+JB221nRYmu+
 zhhQ7N/xbTMV/Je7AR03kQjhiMm8LyEcM2X4BNrsAcoCieQzmO3g0zSp8ISzLUE0
 PEEmf1lOzMe3gK2KOFCPt5Hiz9sGWyN6at+BQubY18tQGtjEXYAQNXkpD5qhGn4a
 EF693r/17wmc8hvSsjf4AGaWy1k8crG0WfpMCZsaqftjj0BbvOC60IDyx4eFjpcy
 tGyAJKETq161AkCdNweIh2Q107fG3tm0fcvw2dv8Wt1eQCko6M8dUGCBinQs/thh
 TexjJFS/XbSz+IvxLqgU+C5qkOP23E0M9m1dbIbOFxJAya/5n16WOBlGr3ae2Wdq
 /+t8wVSJw3vZiku5emWdFYP1VsdIHUjVa5QizFaaRhzLGRwhxVV49SP4IQC/5oM5
 3MAgNOFTP6yRQn9Y9wP+SZs+SsfaIE7yfKa9zOi4S+Ve+LI2v4YFhh8NCRiLkeWZ
 R1dhp8Pgtuq76f/v0qUaWcuuVeGfJ37M31KOGIhi1sI/3sr7UMrngL8D1+F8UZMi
 zcLu+x4GtfUZCHl6znx1rNUBqE5S/5ndVhLpOqfCXKaQ+RAm7lkOJ3jXE2VhNkhp
 yVEmeSOLnlCaQjZvXQ==
 =OP+o
 -----END PGP SIGNATURE-----

Merge tag 'hardening-v6.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux

Pull hardening updates from Kees Cook:
 "As is pretty normal for this tree, there are changes all over the
  place, especially for small fixes, selftest improvements, and improved
  macro usability.

  Some header changes ended up landing via this tree as they depended on
  the string header cleanups. Also, a notable set of changes is the work
  for the reintroduction of the UBSAN signed integer overflow sanitizer
  so that we can continue to make improvements on the compiler side to
  make this sanitizer a more viable future security hardening option.

  Summary:

   - string.h and related header cleanups (Tanzir Hasan, Andy
     Shevchenko)

   - VMCI memcpy() usage and struct_size() cleanups (Vasiliy Kovalev,
     Harshit Mogalapalli)

   - selftests/powerpc: Fix load_unaligned_zeropad build failure
     (Michael Ellerman)

   - hardened Kconfig fragment updates (Marco Elver, Lukas Bulwahn)

   - Handle tail call optimization better in LKDTM (Douglas Anderson)

   - Use long form types in overflow.h (Andy Shevchenko)

   - Add flags param to string_get_size() (Andy Shevchenko)

   - Add Coccinelle script for potential struct_size() use (Jacob
     Keller)

   - Fix objtool corner case under KCFI (Josh Poimboeuf)

   - Drop 13 year old backward compat CAP_SYS_ADMIN check (Jingzi Meng)

   - Add str_plural() helper (Michal Wajdeczko, Kees Cook)

   - Ignore relocations in .notes section

   - Add comments to explain how __is_constexpr() works

   - Fix m68k stack alignment expectations in stackinit Kunit test

   - Convert string selftests to KUnit

   - Add KUnit tests for fortified string functions

   - Improve reporting during fortified string warnings

   - Allow non-type arg to type_max() and type_min()

   - Allow strscpy() to be called with only 2 arguments

   - Add binary mode to leaking_addresses scanner

   - Various small cleanups to leaking_addresses scanner

   - Adding wrapping_*() arithmetic helper

   - Annotate initial signed integer wrap-around in refcount_t

   - Add explicit UBSAN section to MAINTAINERS

   - Fix UBSAN self-test warnings

   - Simplify UBSAN build via removal of CONFIG_UBSAN_SANITIZE_ALL

   - Reintroduce UBSAN's signed overflow sanitizer"

* tag 'hardening-v6.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: (51 commits)
  selftests/powerpc: Fix load_unaligned_zeropad build failure
  string: Convert helpers selftest to KUnit
  string: Convert selftest to KUnit
  sh: Fix build with CONFIG_UBSAN=y
  compiler.h: Explain how __is_constexpr() works
  overflow: Allow non-type arg to type_max() and type_min()
  VMCI: Fix possible memcpy() run-time warning in vmci_datagram_invoke_guest_handler()
  lib/string_helpers: Add flags param to string_get_size()
  x86, relocs: Ignore relocations in .notes section
  objtool: Fix UNWIND_HINT_{SAVE,RESTORE} across basic blocks
  overflow: Use POD in check_shl_overflow()
  lib: stackinit: Adjust target string to 8 bytes for m68k
  sparc: vdso: Disable UBSAN instrumentation
  kernel.h: Move lib/cmdline.c prototypes to string.h
  leaking_addresses: Provide mechanism to scan binary files
  leaking_addresses: Ignore input device status lines
  leaking_addresses: Use File::Temp for /tmp files
  MAINTAINERS: Update LEAKING_ADDRESSES details
  fortify: Improve buffer overflow reporting
  fortify: Add KUnit tests for runtime overflows
  ...
2024-03-12 14:49:30 -07:00
Roman Smirnov
bea0a58695 assoc_array: fix the return value in assoc_array_insert_mid_shortcut()
Returning the edit variable is redundant because it is dereferenced right
before it is returned.  It would be better to return true.

Found by Linux Verification Center (linuxtesting.org) with Svace.

Link: https://lkml.kernel.org/r/20240307071717.5318-1-r.smirnov@omp.ru
Signed-off-by: Roman Smirnov <r.smirnov@omp.ru>
Reviewed-by: Sergey Shtylyov <s.shtylyov@omp.ru>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-03-12 13:09:23 -07:00
Peng Hao
c44f063e74 buildid: use kmap_local_page()
Use kmap_local_page() instead of kmap_atomic() which has been deprecated.

Link: https://lkml.kernel.org/r/20240306034804.62087-1-flyingpeng@tencent.com
Signed-off-by: Peng Hao <flyingpeng@tencent.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-03-12 13:09:23 -07:00
Uwe Kleine-König
8c86fb68ff mul_u64_u64_div_u64: increase precision by conditionally swapping a and b
As indicated in the added comment, the algorithm works better if b is big.
As multiplication is commutative, a and b can be swapped.  Do this if a
is bigger than b.

Link: https://lkml.kernel.org/r/20240303092408.662449-2-u.kleine-koenig@pengutronix.de
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Tested-by: Biju Das <biju.das.jz@bp.renesas.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-03-12 13:09:22 -07:00
Linus Torvalds
691632f0e8 s390 updates for 6.9 merge window
- Various virtual vs physical address usage fixes
 
 - Fix error handling in Processor Activity Instrumentation device driver, and
   export number of counters with a sysfs file
 
 - Allow for multiple events when Processor Activity Instrumentation counters
   are monitored in system wide sampling
 
 - Change multiplier and shift values of the Time-of-Day clock source to improve
   steering precision
 
 - Remove a couple of unneeded GFP_DMA flags from allocations
 
 - Disable mmap alignment if randomize_va_space is also disabled, to avoid a too
   small heap
 
 - Various changes to allow s390 to be compiled with LLVM=1, since ld.lld and
   llvm-objcopy will have proper s390 support witch clang 19
 
 - Add __uninitialized macro to Compiler Attributes. This is helpful with s390's
   FPU code where some users have up to 520 byte stack frames. Clearing such
   stack frames (if INIT_STACK_ALL_PATTERN or INIT_STACK_ALL_ZERO is enabled)
   before they are used contradicts the intention (performance improvement) of
   such code sections.
 
 - Convert switch_to() to an out-of-line function, and use the generic switch_to
   header file
 
 - Replace the usage of s390's debug feature with pr_debug() calls within the
   zcrypt device driver
 
 - Improve hotplug support of the Adjunct Processor device driver
 
 - Improve retry handling in the zcrypt device driver
 
 - Various changes to the in-kernel FPU code:
 
   - Make in-kernel FPU sections preemptible
 
   - Convert various larger inline assemblies and assembler files to C, mainly
     by using singe instruction inline assemblies. This increases readability,
     but also allows makes it easier to add proper instrumentation hooks
 
   - Cleanup of the header files
 
 - Provide fast variants of csum_partial() and csum_partial_copy_nocheck() based
   on vector instructions
 
 - Introduce and use a lock to synchronize accesses to zpci device data
   structures to avoid inconsistent states caused by concurrent accesses
 
 - Compile the kernel without -fPIE. This addresses the following problems if
   the kernel is compiled with -fPIE:
 
   - It uses dynamic symbols (.dynsym), for which the linker refuses to allow
     more than 64k sections. This can break features which use
     '-ffunction-sections' and '-fdata-sections', including kpatch-build and
     function granular KASLR
 
   - It unnecessarily uses GOT relocations, adding an extra layer of indirection
     for many memory accesses
 
 - Fix shared_cpu_list for CPU private L2 caches, which incorrectly were
   reported as globally shared
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEECMNfWEw3SLnmiLkZIg7DeRspbsIFAmXu3jEACgkQIg7DeRsp
 bsJC8A/9Gi9JSMKWpIDR4WE2MQGwP/PnYdEamtK6c9ewOjIR/UzRIyIM3J1pyV0L
 RwL8k7EBuv3f7shTcwfPzZWlnAwNwqr1UdcafjFNtHTig50YtdP5fBL33frKHBrm
 ATedlCjagojOuVbh1gB45WUgzjSSkPyn0vqwjjo4h6uEAQ35zMEWwCs5Hpajlkhi
 GCdJaiBLJcnhT96QGurQdke+MsrpGCzeBVBnA0qopQEWaQo8OdiAJ1uMD2WKbgPR
 817kNzvmE6nXnfd5JevYbaiLjK/HQUSw2dZUS6/fjuIrzTsZEUhSg4ECaprKXDg7
 5qiVVPNg4WbJAp0SsB+w7c4U99VxhbS7IVHXju18GrXw6SSAupdxIo7R7YiaT8vC
 YIXZ1uIQ4Vbts3w/UqWUczIl/ooQt2DdrWT5NDNA+84OlOM42rthzA3vznTWuPTb
 U21R7cZmN++hAUjR6s4aO2LfS7HQdnKL8nvJW2y99qSfrOXm+M973W2pDhYEVXQh
 ixQ/lxfQpbBT1yUGlquIErokCPB85VY6ZTdGu6Erziywf4CWGsT5CspyaQnX2KTJ
 s4CpFPnilrW3OnxmIkrM+pNJDun1nnkGA388Xq1NEKX8Oe65OMXEFNCb0kAHQ1ua
 vb6534Ib/iuPnxsGpz1sX9iRqtUd06aBovPcbwIvatHCSfkWws8=
 =KZ31
 -----END PGP SIGNATURE-----

Merge tag 's390-6.9-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux

Pull s390 updates from Heiko Carstens:

 - Various virtual vs physical address usage fixes

 - Fix error handling in Processor Activity Instrumentation device
   driver, and export number of counters with a sysfs file

 - Allow for multiple events when Processor Activity Instrumentation
   counters are monitored in system wide sampling

 - Change multiplier and shift values of the Time-of-Day clock source to
   improve steering precision

 - Remove a couple of unneeded GFP_DMA flags from allocations

 - Disable mmap alignment if randomize_va_space is also disabled, to
   avoid a too small heap

 - Various changes to allow s390 to be compiled with LLVM=1, since
   ld.lld and llvm-objcopy will have proper s390 support witch clang 19

 - Add __uninitialized macro to Compiler Attributes. This is helpful
   with s390's FPU code where some users have up to 520 byte stack
   frames. Clearing such stack frames (if INIT_STACK_ALL_PATTERN or
   INIT_STACK_ALL_ZERO is enabled) before they are used contradicts the
   intention (performance improvement) of such code sections.

 - Convert switch_to() to an out-of-line function, and use the generic
   switch_to header file

 - Replace the usage of s390's debug feature with pr_debug() calls
   within the zcrypt device driver

 - Improve hotplug support of the Adjunct Processor device driver

 - Improve retry handling in the zcrypt device driver

 - Various changes to the in-kernel FPU code:

     - Make in-kernel FPU sections preemptible

     - Convert various larger inline assemblies and assembler files to
       C, mainly by using singe instruction inline assemblies. This
       increases readability, but also allows makes it easier to add
       proper instrumentation hooks

     - Cleanup of the header files

 - Provide fast variants of csum_partial() and
   csum_partial_copy_nocheck() based on vector instructions

 - Introduce and use a lock to synchronize accesses to zpci device data
   structures to avoid inconsistent states caused by concurrent accesses

 - Compile the kernel without -fPIE. This addresses the following
   problems if the kernel is compiled with -fPIE:

     - It uses dynamic symbols (.dynsym), for which the linker refuses
       to allow more than 64k sections. This can break features which
       use '-ffunction-sections' and '-fdata-sections', including
       kpatch-build and function granular KASLR

     - It unnecessarily uses GOT relocations, adding an extra layer of
       indirection for many memory accesses

 - Fix shared_cpu_list for CPU private L2 caches, which incorrectly were
   reported as globally shared

* tag 's390-6.9-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (117 commits)
  s390/tools: handle rela R_390_GOTPCDBL/R_390_GOTOFF64
  s390/cache: prevent rebuild of shared_cpu_list
  s390/crypto: remove retry loop with sleep from PAES pkey invocation
  s390/pkey: improve pkey retry behavior
  s390/zcrypt: improve zcrypt retry behavior
  s390/zcrypt: introduce retries on in-kernel send CPRB functions
  s390/ap: introduce mutex to lock the AP bus scan
  s390/ap: rework ap_scan_bus() to return true on config change
  s390/ap: clarify AP scan bus related functions and variables
  s390/ap: rearm APQNs bindings complete completion
  s390/configs: increase number of LOCKDEP_BITS
  s390/vfio-ap: handle hardware checkstop state on queue reset operation
  s390/pai: change sampling event assignment for PMU device driver
  s390/boot: fix minor comment style damages
  s390/boot: do not check for zero-termination relocation entry
  s390/boot: make type of __vmlinux_relocs_64_start|end consistent
  s390/boot: sanitize kaslr_adjust_relocs() function prototype
  s390/boot: simplify GOT handling
  s390: vmlinux.lds.S: fix .got.plt assertion
  s390/boot: workaround current 'llvm-objdump -t -j ...' behavior
  ...
2024-03-12 10:14:22 -07:00
Ingo Molnar
9ab121d65e sched/debug: Allow CONFIG_SCHEDSTATS even on !KERNEL_DEBUG kernels
All major Linux distributions enable CONFIG_SCHEDSTATS,
so make it more widely available.

Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20240308105901.1096078-6-mingo@kernel.org
2024-03-12 11:03:41 +01:00
Linus Torvalds
a5b1a017cb Locking changes for v6.9:
- Micro-optimize local_xchg() and the rtmutex code on x86
 
 - Fix percpu-rwsem contention tracepoints
 
 - Simplify debugging Kconfig dependencies
 
 - Update/clarify the documentation of atomic primitives
 
 - Misc cleanups
 
 Signed-off-by: Ingo Molnar <mingo@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAmXu6EARHG1pbmdvQGtl
 cm5lbC5vcmcACgkQEnMQ0APhK1i7og/8DY/pEGqa/9xYZNE+3NZypuri93XjzFKu
 i2yN1ymjSmjgQY83ImmP67gBBf7xd3kS0oiHM+lWnPE10pkzIPhleru4iExoyOB6
 oMcQSyZALK3uRzxG/EwhuZhE0z9SadB/vkFUDJh677beMRsqfm2QXb4urEcTLUye
 z4+Tg5zjJvNpKpGoTO7sWj0AfvpEa40RFaGAZEBdmU5CrykLE9tIL6wBEP5RAUcI
 b8M+tr7D0JD0VGp4zhayEvq2TiwZhhxQ9C5HpVqck7LsfQvoXgBhGtxl/EkXVJ59
 PiaLDJAY/D0ocyz1WNB7pFfOdZP6RV0a/5gEzp1uvmRdRV+gEhX88aBmtcc2072p
 Do5fQwoqNecpHdY1+QY4n5Bq5KYQz9JZl3U1M5g/5dAjDiCo1W+eKk4AlkdymLQQ
 4jhCsBFnrQdcrxHIfyHi1ocggs0cUXTCDIRPZSsA1ij51UxcLK2kz/6Ba1jSnFGk
 iAfcF+Dj68/48zrz9yr+DS1od+oIsj4E+lr0btbj7xf2yiFXKbjPNE5Z8dk3JLay
 /Eyb5NSZzfT4cpjpwYAoQ/JJySm3i0Uu/llOOIlTTi94waFomFBaCAo7/ujoGOOJ
 /VHqouGVWaWtv6JhkjikgqzVj34Yr3rqZq9O3SbrZcu4YafKbaLNbzlt5z4zMkQv
 wSMAPFgvcZQ=
 =t84c
 -----END PGP SIGNATURE-----

Merge tag 'locking-core-2024-03-11' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull locking updates from Ingo Molnar:

 - Micro-optimize local_xchg() and the rtmutex code on x86

 - Fix percpu-rwsem contention tracepoints

 - Simplify debugging Kconfig dependencies

 - Update/clarify the documentation of atomic primitives

 - Misc cleanups

* tag 'locking-core-2024-03-11' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  locking/rtmutex: Use try_cmpxchg_relaxed() in mark_rt_mutex_waiters()
  locking/x86: Implement local_xchg() using CMPXCHG without the LOCK prefix
  locking/percpu-rwsem: Trigger contention tracepoints only if contended
  locking/rwsem: Make DEBUG_RWSEMS and PREEMPT_RT mutually exclusive
  locking/rwsem: Clarify that RWSEM_READER_OWNED is just a hint
  locking/mutex: Simplify <linux/mutex.h>
  locking/qspinlock: Fix 'wait_early' set but not used warning
  locking/atomic: scripts: Clarify ordering of conditional atomics
2024-03-11 18:33:03 -07:00
Linus Torvalds
7ea65c89d8 vfs-6.9.misc
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCZem3wQAKCRCRxhvAZXjc
 otRMAQDeo8qsuuIAcS2KUicKqZR5yMVvrY9r4sQzf7YRcJo5HQD+NQXkKwQuv1VO
 OUeScsic/+I+136AgdjWnlEYO5dp0go=
 =4WKU
 -----END PGP SIGNATURE-----

Merge tag 'vfs-6.9.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs

Pull misc vfs updates from Christian Brauner:
 "Misc features, cleanups, and fixes for vfs and individual filesystems.

  Features:

   - Support idmapped mounts for hugetlbfs.

   - Add RWF_NOAPPEND flag for pwritev2(). This allows us to fix a bug
     where the passed offset is ignored if the file is O_APPEND. The new
     flag allows a caller to enforce that the offset is honored to
     conform to posix even if the file was opened in append mode.

   - Move i_mmap_rwsem in struct address_space to avoid false sharing
     between i_mmap and i_mmap_rwsem.

   - Convert efs, qnx4, and coda to use the new mount api.

   - Add a generic is_dot_dotdot() helper that's used by various
     filesystems and the VFS code instead of open-coding it multiple
     times.

   - Recently we've added stable offsets which allows stable ordering
     when iterating directories exported through NFS on e.g., tmpfs
     filesystems. Originally an xarray was used for the offset map but
     that caused slab fragmentation issues over time. This switches the
     offset map to the maple tree which has a dense mode that handles
     this scenario a lot better. Includes tests.

   - Finally merge the case-insensitive improvement series Gabriel has
     been working on for a long time. This cleanly propagates case
     insensitive operations through ->s_d_op which in turn allows us to
     remove the quite ugly generic_set_encrypted_ci_d_ops() operations.
     It also improves performance by trying a case-sensitive comparison
     first and then fallback to case-insensitive lookup if that fails.
     This also fixes a bug where overlayfs would be able to be mounted
     over a case insensitive directory which would lead to all sort of
     odd behaviors.

  Cleanups:

   - Make file_dentry() a simple accessor now that ->d_real() is
     simplified because of the backing file work we did the last two
     cycles.

   - Use the dedicated file_mnt_idmap helper in ntfs3.

   - Use smp_load_acquire/store_release() in the i_size_read/write
     helpers and thus remove the hack to handle i_size reads in the
     filemap code.

   - The SLAB_MEM_SPREAD is a nop now. Remove it from various places in
     fs/

   - It's no longer necessary to perform a second built-in initramfs
     unpack call because we retain the contents of the previous
     extraction. Remove it.

   - Now that we have removed various allocators kfree_rcu() always
     works with kmem caches and kmalloc(). So simplify various places
     that only use an rcu callback in order to handle the kmem cache
     case.

   - Convert the pipe code to use a lockdep comparison function instead
     of open-coding the nesting making lockdep validation easier.

   - Move code into fs-writeback.c that was located in a header but can
     be made static as it's only used in that one file.

   - Rewrite the alignment checking iterators for iovec and bvec to be
     easier to read, and also significantly more compact in terms of
     generated code. This saves 270 bytes of text on x86-64 (with
     clang-18) and 224 bytes on arm64 (with gcc-13). In profiles it also
     saves a bit of time for the same workload.

   - Switch various places to use KMEM_CACHE instead of
     kmem_cache_create().

   - Use inode_set_ctime_to_ts() in inode_set_ctime_current()

   - Use kzalloc() in name_to_handle_at() to avoid kernel infoleak.

   - Various smaller cleanups for eventfds.

  Fixes:

   - Fix various comments and typos, and unneeded initializations.

   - Fix stack allocation hack for clang in the select code.

   - Improve dump_mapping() debug code on a best-effort basis.

   - Fix build errors in various selftests.

   - Avoid wrap-around instrumentation in various places.

   - Don't allow user namespaces without an idmapping to be used for
     idmapped mounts.

   - Fix sysv sb_read() call.

   - Fix fallback implementation of the get_name() export operation"

* tag 'vfs-6.9.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: (70 commits)
  hugetlbfs: support idmapped mounts
  qnx4: convert qnx4 to use the new mount api
  fs: use inode_set_ctime_to_ts to set inode ctime to current time
  libfs: Drop generic_set_encrypted_ci_d_ops
  ubifs: Configure dentry operations at dentry-creation time
  f2fs: Configure dentry operations at dentry-creation time
  ext4: Configure dentry operations at dentry-creation time
  libfs: Add helper to choose dentry operations at mount-time
  libfs: Merge encrypted_ci_dentry_ops and ci_dentry_ops
  fscrypt: Drop d_revalidate once the key is added
  fscrypt: Drop d_revalidate for valid dentries during lookup
  fscrypt: Factor out a helper to configure the lookup dentry
  ovl: Always reject mounting over case-insensitive directories
  libfs: Attempt exact-match comparison first during casefolded lookup
  efs: remove SLAB_MEM_SPREAD flag usage
  jfs: remove SLAB_MEM_SPREAD flag usage
  minix: remove SLAB_MEM_SPREAD flag usage
  openpromfs: remove SLAB_MEM_SPREAD flag usage
  proc: remove SLAB_MEM_SPREAD flag usage
  qnx6: remove SLAB_MEM_SPREAD flag usage
  ...
2024-03-11 09:38:17 -07:00
Linus Torvalds
97ec9715a8 linux_kselftest-kunit-6.9-rc1
This KUnit next update for Linux 6.9-rc1 consists of:
 
 -- fix to make kunit_bus_type const
 
 -- kunit tool change to Print UML command
 
 -- DRM device creation helpers are now using the new kunit device
    creation helpers. This change resulted in DRM helpers switching
    from using a platform_device, to a dedicated bus and device type
    used by kunit. kunit devices don't set DMA mask and this caused
    regression on some drm tests as they can't allocate DMA buffers.
    Fix this problem by setting DMA masks on the kunit device during
    initialization.
 
 -- KUnit has several macros which accept a log message, which can
    contain printf format specifiers. Some of these (the explicit
    log macros) already use the __printf() gcc attribute to ensure
    the format specifiers are valid, but those which could fail the
    test, and hence used __kunit_do_failed_assertion() behind the scenes,
    did not.
 
    These include: KUNIT_EXPECT_*_MSG(), KUNIT_ASSERT_*_MSG(), and
    KUNIT_FAIL()
 
    A 9 patch series adds the __printf() attribute, and fixes all of
    the issues uncovered.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEPZKym/RZuOCGeA/kCwJExA0NQxwFAmXpHUsACgkQCwJExA0N
 QxxucA//VQt+qPeYHysK75Vu9icGGD/apxwMQiKIygVf8Mxg9IN3L7mJDDRIJj3h
 kAY2yJG9MxiezvTK58pqV38A4l1pB2L/qqyDhdFbgD6XSgJS5beNh4Sne5gL2Okw
 lHJkkZGj+g65UKTIbzhMFVzPsHPvJLdJAG2GSJcS6m2GfaSCBoOmRvugZ1OM40d0
 ruxU6/ucR6u8jtN7fUTEuOSpfngJrUpBGer4i4+qC4mlI26XZ96oh35gFJTsE/CK
 2WAXqv+Y9WhdFTihMHUcy11CWEM7XrkSXdp5GsdEOvg2SpqyP6C7kVCZ9aYV0FFe
 LOo3D3rCA+WggMI5wJ51P0F3KkHu+mr+XBcl3TQ1de6mnX4+qZKJSyCt+69PzeIi
 3TGWGO9lKkFrZ4StYdfCy8M/ABIpWq/UqIQAPOYtpQAEkGSj7H6J4OK9SG3oH1Oa
 Jnn+yeTDr6B7v0gzkS57wBZg10uL+FG1MoOYqi7p1ZbyHc1lOPbx5AboPAh20cqW
 h4UEIg50aGvHT6VjAidzI/CfZDmgkusCEnip0c2HaCg+AMi03JD1lQZTOuM9S6os
 dkFrPoDGXyBQytyJmdWi6GKULn3DG8llFKDEGZnyYgszQS8hw21iqmK5/Kuit+sN
 oJpjdSmXwoG5h6R9hUWnz+NcjNe44F4P88agVyrBYk2nZu97IiY=
 =ilEb
 -----END PGP SIGNATURE-----

Merge tag 'linux_kselftest-kunit-6.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest

Pull KUnit updates from Shuah Khan:

 - fix to make kunit_bus_type const

 - kunit tool change to Print UML command

 - DRM device creation helpers are now using the new kunit device
   creation helpers. This change resulted in DRM helpers switching from
   using a platform_device, to a dedicated bus and device type used by
   kunit. kunit devices don't set DMA mask and this caused regression on
   some drm tests as they can't allocate DMA buffers. Fix this problem
   by setting DMA masks on the kunit device during initialization.

 - KUnit has several macros which accept a log message, which can
   contain printf format specifiers. Some of these (the explicit log
   macros) already use the __printf() gcc attribute to ensure the format
   specifiers are valid, but those which could fail the test, and hence
   used __kunit_do_failed_assertion() behind the scenes, did not.

   These include: KUNIT_EXPECT_*_MSG(), KUNIT_ASSERT_*_MSG(), and
   KUNIT_FAIL()

   A nine-patch series adds the __printf() attribute, and fixes all of
   the issues uncovered.

* tag 'linux_kselftest-kunit-6.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
  kunit: Annotate _MSG assertion variants with gnu printf specifiers
  drm: tests: Fix invalid printf format specifiers in KUnit tests
  drm/xe/tests: Fix printf format specifiers in xe_migrate test
  net: test: Fix printf format specifier in skb_segment kunit test
  rtc: test: Fix invalid format specifier.
  time: test: Fix incorrect format specifier
  lib: memcpy_kunit: Fix an invalid format specifier in an assertion msg
  lib/cmdline: Fix an invalid format specifier in an assertion msg
  kunit: test: Log the correct filter string in executor_test
  kunit: Setup DMA masks on the kunit device
  kunit: make kunit_bus_type const
  kunit: Mark filter* params as rw
  kunit: tool: Print UML command
2024-03-11 09:32:28 -07:00
Linus Torvalds
d451b075f7 linux_kselftest-next-6.9-rc1
This kselftest next update for Linux 6.9-rc1 consists of:
 
 -- livepatch restructuring to move the module out of lib to be
    built as a out-of-tree modules during kselftest build. This
    change makes it easier change, debug and rebuild the tests by
    running make on the selftests/livepatch directory, which is not
    currently possible since the modules on lib/livepatch are build
    and installed using the main makefile modules target.
 
 -- livepatch restructuring fixes for problems found by kernel test
    robot. The change skips the test if kernel-devel isn't installed
    (default value of KDIR), or if KDIR variable passed doesn't exists.
 
 -- resctrl test restructuring and new non-contiguous CBMs CAT test
 
 -- new ktap_helpers to print diagnostic messages, pass/fail tests
    based on exit code, abort test, and finish the test.
 
 -- a new test verify power supply properties.
 
 -- a new ftrace to exercise function tracer across cpu hotplug.
 
 -- timeout increase for mqueue test to allow the test to run on
    i3.metal AWS instances.
 
 -- minor spelling corrections in several tests.
 
 -- missing gitignore files and changes to existing gitignore files.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEPZKym/RZuOCGeA/kCwJExA0NQxwFAmXo/kUACgkQCwJExA0N
 Qxy0aBAAk0SLA1ZIAdlNjo5B13C7GC7rFRrtaai9ReXSvU/X5TX9sD5T9DIULKdj
 Mcqi+oaP88GPSUZS+bn7DyVxKyuvHg/f4jWQwqZ34WxK4K1K+yt+3YhTnHZx7ezU
 6WIbUsD1Zs7tXXI2v76riHFbD3pfxZ+AXQaf/1cXDi4SpIpLkiqyeYWoWN5Z2rtJ
 BwMzrI2RBiLMox4g8F3Ey4BX+bOIYiiJq5bdl7gJVKcp74VdU3S7IyOuXFbSdcFR
 xxmFMxWGFOgRzexW0fmDWLudD2dII0XQAExSsl5xMnR/lmSh+lHWheoNgphQl050
 VcLmrPugWVJSioe0fHEgmDQXe3lPqDtepUg921tIlWvCmtR3Ur6+GpILTbSvQ4qp
 SK+2pt7nGSAT2UkRO/6/TYFG3mELADvj6tglj0b1SkIXmNiF+7OZ+hJ2XqyM7peo
 Z7gtmSmpbAotxp64Jj8HsNZLpCX0xdaxoTMEWPoG09fwTXY7Hy03yoWDKBKB4MZ9
 jBtNXDolhpEQ/ppSGFnRPzXuNVapYX28UY0cwBBVgke5jwB8SUnBEr2dbNnVU1q0
 y5uxtj/EFQzxSynB3eM1us2OuXvr5TfAWmKVpyE/cNC3WreHeA+Y2kN1dzv8hgpw
 o4NbltdF8F+a9qQF9B1XvjVhqa5By1esS1jOg96cJgGseAVWiQs=
 =G+DO
 -----END PGP SIGNATURE-----

Merge tag 'linux_kselftest-next-6.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest

Pull kselftest update from Shuah Khan:

 - livepatch restructuring to move the module out of lib to be built as
   a out-of-tree modules during kselftest build. This makes it easier
   change, debug and rebuild the tests by running make on the
   selftests/livepatch directory, which is not currently possible since
   the modules on lib/livepatch are build and installed using the main
   makefile modules target.

 - livepatch restructuring fixes for problems found by kernel test
   robot. The change skips the test if kernel-devel isn't installed
   (default value of KDIR), or if KDIR variable passed doesn't exists.

 - resctrl test restructuring and new non-contiguous CBMs CAT test

 - new ktap_helpers to print diagnostic messages, pass/fail tests based
   on exit code, abort test, and finish the test.

 - a new test verify power supply properties.

 - a new ftrace to exercise function tracer across cpu hotplug.

 - timeout increase for mqueue test to allow the test to run on i3.metal
   AWS instances.

 - minor spelling corrections in several tests.

 - missing gitignore files and changes to existing gitignore files.

* tag 'linux_kselftest-next-6.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest: (57 commits)
  kselftest: Add basic test for probing the rust sample modules
  selftests: lib.mk: Do not process TEST_GEN_MODS_DIR
  selftests: livepatch: Avoid running the tests if kernel-devel is missing
  selftests: livepatch: Add initial .gitignore
  selftests/resctrl: Add non-contiguous CBMs CAT test
  selftests/resctrl: Add resource_info_file_exists()
  selftests/resctrl: Split validate_resctrl_feature_request()
  selftests/resctrl: Add a helper for the non-contiguous test
  selftests/resctrl: Add test groups and name L3 CAT test L3_CAT
  selftests: sched: Fix spelling mistake "hiearchy" -> "hierarchy"
  selftests/mqueue: Set timeout to 180 seconds
  selftests/ftrace: Add test to exercize function tracer across cpu hotplug
  selftest: ftrace: fix minor typo in log
  selftests: thermal: intel: workload_hint: add missing gitignore
  selftests: thermal: intel: power_floor: add missing gitignore
  selftests: uevent: add missing gitignore
  selftests: Add test to verify power supply properties
  selftests: ktap_helpers: Add a helper to finish the test
  selftests: ktap_helpers: Add a helper to abort the test
  selftests: ktap_helpers: Add helper to pass/fail test based on exit code
  ...
2024-03-11 09:25:33 -07:00
Andy Shevchenko
de5f843389 lib/bitmap: Introduce bitmap_scatter() and bitmap_gather() helpers
These helpers scatters or gathers a bitmap with the help of the mask
position bits parameter.

bitmap_scatter() does the following:
  src:  0000000001011010
                  ||||||
           +------+|||||
           |  +----+||||
           |  |+----+|||
           |  ||   +-+||
           |  ||   |  ||
  mask: ...v..vv...v..vv
        ...0..11...0..10
  dst:  0000001100000010

and bitmap_gather() performs this one:
   mask: ...v..vv...v..vv
   src:  0000001100000010
            ^  ^^   ^   0
            |  ||   |  10
            |  ||   > 010
            |  |+--> 1010
            |  +--> 11010
            +----> 011010
   dst:  0000000000011010

bitmap_gather() can the seen as the reverse bitmap_scatter() operation.

Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Link: https://lore.kernel.org/lkml/20230926052007.3917389-3-andriy.shevchenko@linux.intel.com/
Co-developed-by: Herve Codina <herve.codina@bootlin.com>
Signed-off-by: Herve Codina <herve.codina@bootlin.com>
Acked-by: Yury Norov <yury.norov@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-03-11 09:36:11 +00:00
Andreas Larsson
84b76d0582 lib/fonts: Allow Sparc console 8x16 font for sparc64 early boot text console
Allow FONT_SUN8x16 when EARLYFB is enabled for sparc64, even when
FRAMEBUFFER_CONSOLE is not to avoid the following warning for this case

   WARNING: unmet direct dependencies detected for FONT_SUN8x16
     Depends on [n]: FONT_SUPPORT [=y] && (FRAMEBUFFER_CONSOLE [=n] && (FONTS [=n] || SPARC [=y]) || BOOTX_TEXT)
     Selected by [y]:
     - EARLYFB [=y] && SPARC64 [=y]

by allowing it in the same manner as is done for powerpc in commit
0ebc7feae7 ("powerpc: Use shared font data").

Signed-off-by: Andreas Larsson <andreas@gaisler.com>
Fixes: 0f1991949d ("sparc: Use shared font data")
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202402241539.epQT43nI-lkp@intel.com/
Cc: "Dr. David Alan Gilbert" <linux@treblig.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Sam Ravnborg <sam@ravnborg.org>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: "David S. Miller" <davem@davemloft.net>
Reviewed-by: Dr. David Alan Gilbert <linux@treblig.org>
Link: https://lore.kernel.org/r/20240307180742.900068-1-andreas@gaisler.com
2024-03-08 21:29:16 +01:00
Jakub Kicinski
6025b9135f net: dqs: add NIC stall detector based on BQL
softnet_data->time_squeeze is sometimes used as a proxy for
host overload or indication of scheduling problems. In practice
this statistic is very noisy and has hard to grasp units -
e.g. is 10 squeezes a second to be expected, or high?

Delaying network (NAPI) processing leads to drops on NIC queues
but also RTT bloat, impacting pacing and CA decisions.
Stalls are a little hard to detect on the Rx side, because
there may simply have not been any packets received in given
period of time. Packet timestamps help a little bit, but
again we don't know if packets are stale because we're
not keeping up or because someone (*cough* cgroups)
disabled IRQs for a long time.

We can, however, use Tx as a proxy for Rx stalls. Most drivers
use combined Rx+Tx NAPIs so if Tx gets starved so will Rx.
On the Tx side we know exactly when packets get queued,
and completed, so there is no uncertainty.

This patch adds stall checks to BQL. Why BQL? Because
it's a convenient place to add such checks, already
called by most drivers, and it has copious free space
in its structures (this patch adds no extra cache
references or dirtying to the fast path).

The algorithm takes one parameter - max delay AKA stall
threshold and increments a counter whenever NAPI got delayed
for at least that amount of time. It also records the length
of the longest stall.

To be precise every time NAPI has not polled for at least
stall thrs we check if there were any Tx packets queued
between last NAPI run and now - stall_thrs/2.

Unlike the classic Tx watchdog this mechanism does not
ignore stalls caused by Tx being disabled, or loss of link.
I don't think the check is worth the complexity, and
stall is a stall, whether due to host overload, flow
control, link down... doesn't matter much to the application.

We have been running this detector in production at Meta
for 2 years, with the threshold of 8ms. It's the lowest
value where false positives become rare. There's still
a constant stream of reported stalls (especially without
the ksoftirqd deferral patches reverted), those who like
their stall metrics to be 0 may prefer higher value.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Breno Leitao <leitao@debian.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-03-08 10:23:26 +00:00
Jakub Kicinski
e3afe5dd3a Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Cross-merge networking fixes after downstream PR.

No conflicts.

Adjacent changes:

net/core/page_pool_user.c
  0b11b1c5c3 ("netdev: let netlink core handle -EMSGSIZE errors")
  429679dcf7 ("page_pool: fix netlink dump stop/resume")

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-03-07 10:29:36 -08:00
Andy Shevchenko
9bea6216f9 dyndbg: replace kstrdup() + strchr() with kstrdup_and_replace()
Replace open coded functionalify of kstrdup_and_replace() with a call.

Link: https://lkml.kernel.org/r/20240213162741.3102810-1-andriy.shevchenko@linux.intel.com
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-by: Luis Chamberlain <mcgrof@kernel.org>
Cc: Jason Baron <jbaron@akamai.com>
Cc: Jim Cromie <jim.cromie@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-03-06 13:07:39 -08:00
Linus Torvalds
a50026bdb8
iov_iter: get rid of 'copy_mc' flag
This flag is only set by one single user: the magical core dumping code
that looks up user pages one by one, and then writes them out using
their kernel addresses (by using a BVEC_ITER).

That actually ends up being a huge problem, because while we do use
copy_mc_to_kernel() for this case and it is able to handle the possible
machine checks involved, nothing else is really ready to handle the
failures caused by the machine check.

In particular, as reported by Tong Tiangen, we don't actually support
fault_in_iov_iter_readable() on a machine check area.

As a result, the usual logic for writing things to a file under a
filesystem lock, which involves doing a copy with page faults disabled
and then if that fails trying to fault pages in without holding the
locks with fault_in_iov_iter_readable() does not work at all.

We could decide to always just make the MC copy "succeed" (and filling
the destination with zeroes), and that would then create a core dump
file that just ignores any machine checks.

But honestly, this single special case has been problematic before, and
means that all the normal iov_iter code ends up slightly more complex
and slower.

See for example commit c9eec08bac ("iov_iter: Don't deal with
iter->copy_mc in memcpy_from_iter_mc()") where David Howells
re-organized the code just to avoid having to check the 'copy_mc' flags
inside the inner iov_iter loops.

So considering that we have exactly one user, and that one user is a
non-critical special case that doesn't actually ever trigger in real
life (Tong found this with manual error injection), the sane solution is
to just decide that the onus on handling the machine check lines on that
user instead.

Ergo, do the copy_mc_to_kernel() in the core dump logic itself, copying
the user data to a stable kernel page before writing it out.

Fixes: f1982740f5 ("iov_iter: Convert iterate*() to inline funcs")
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Tong Tiangen <tongtiangen@huawei.com>
Link: https://lore.kernel.org/r/20240305133336.3804360-1-tongtiangen@huawei.com
Link: https://lore.kernel.org/all/4e80924d-9c85-f13a-722a-6a5d2b1c225a@huawei.com/
Tested-by: David Howells <dhowells@redhat.com>
Reviewed-by: David Howells <dhowells@redhat.com>
Reviewed-by: Jens Axboe <axboe@kernel.dk>
Reported-by: Tong Tiangen <tongtiangen@huawei.com>
Signed-off-by: Christian Brauner <brauner@kernel.org>
2024-03-06 10:52:12 +01:00
Kees Cook
fb57550fcb string: Convert helpers selftest to KUnit
Convert test-string_helpers.c to KUnit so it can be easily run with
everything else.

Failure reporting doesn't need to be open-coded in most places, for
example, forcing a failure in the expected output for upper/lower
testing looks like this:

[12:18:43] # test_upper_lower: EXPECTATION FAILED at lib/string_helpers_kunit.c:579
[12:18:43] Expected dst == strings_upper[i].out, but
[12:18:43]     dst == "ABCDEFGH1234567890TEST"
[12:18:43]     strings_upper[i].out == "ABCDEFGH1234567890TeST"
[12:18:43] [FAILED] test_upper_lower

Currently passes without problems:

$ ./tools/testing/kunit/kunit.py run string_helpers
...
[12:23:55] Starting KUnit Kernel (1/1)...
[12:23:55] ============================================================
[12:23:55] =============== string_helpers (3 subtests) ================
[12:23:55] [PASSED] test_get_size
[12:23:55] [PASSED] test_upper_lower
[12:23:55] [PASSED] test_unescape
[12:23:55] ================= [PASSED] string_helpers ==================
[12:23:55] ============================================================
[12:23:55] Testing complete. Ran 3 tests: passed: 3
[12:23:55] Elapsed time: 6.709s total, 0.001s configuring, 6.591s building, 0.066s running

Link: https://lore.kernel.org/r/20240301202732.2688342-2-keescook@chromium.org
Signed-off-by: Kees Cook <keescook@chromium.org>
2024-03-05 01:55:28 -08:00
Kees Cook
29d8568849 string: Convert selftest to KUnit
Convert test_string.c to KUnit so it can be easily run with everything
else.

Additional text context is retained for failure reporting. For example,
when forcing a bad match, we can see the loop counters reported for the
memset() tests:

[09:21:52]     # test_memset64: ASSERTION FAILED at lib/string_kunit.c:93
[09:21:52]     Expected v == 0xa2a1a1a1a1a1a1a1ULL, but
[09:21:52]         v == -6799976246779207263 (0xa1a1a1a1a1a1a1a1)
[09:21:52]         0xa2a1a1a1a1a1a1a1ULL == -6727918652741279327 (0xa2a1a1a1a1a1a1a1)
[09:21:52] i:0 j:0 k:0
[09:21:52] [FAILED] test_memset64

Currently passes without problems:

$ ./tools/testing/kunit/kunit.py run string
...
[09:37:40] Starting KUnit Kernel (1/1)...
[09:37:40] ============================================================
[09:37:40] =================== string (6 subtests) ====================
[09:37:40] [PASSED] test_memset16
[09:37:40] [PASSED] test_memset32
[09:37:40] [PASSED] test_memset64
[09:37:40] [PASSED] test_strchr
[09:37:40] [PASSED] test_strnchr
[09:37:40] [PASSED] test_strspn
[09:37:40] ===================== [PASSED] string ======================
[09:37:40] ============================================================
[09:37:40] Testing complete. Ran 6 tests: passed: 6
[09:37:40] Elapsed time: 6.730s total, 0.001s configuring, 6.562s building, 0.131s running

Link: https://lore.kernel.org/r/20240301202732.2688342-1-keescook@chromium.org
Signed-off-by: Kees Cook <keescook@chromium.org>
2024-03-05 01:55:28 -08:00
Martin Kaiser
4c4a52544a lib/test_vmalloc.c: use unsigned long constant
Use an unsigned long constant instead of an int constant and a cast.  This
fixes the checkpatch warning

WARNING: Unnecessary typecast of c90 int constant - '(unsigned long) 1' could be '1UL'
+     align = ((unsigned long) 1) << i;

Link: https://lkml.kernel.org/r/20240226191159.39509-4-martin@kaiser.cx
Signed-off-by: Martin Kaiser <martin@kaiser.cx>
Reviewed-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-03-04 17:01:22 -08:00
Martin Kaiser
e2c5bfebab lib/test_vmalloc.c: drop empty exit function
The module is never loaded successfully.  Therefore, it'll never be
unloaded and we can remove the exit function.

Link: https://lkml.kernel.org/r/20240226191159.39509-3-martin@kaiser.cx
Signed-off-by: Martin Kaiser <martin@kaiser.cx>
Reviewed-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-03-04 17:01:21 -08:00
Martin Kaiser
44503b97ad lib/test_vmalloc.c: fix typo in function name
Fix a typo and change the function name to init_test_configuration.  Both
caller and definition have the same typo, so the current code already
works.

Link: https://lkml.kernel.org/r/20240226191159.39509-2-martin@kaiser.cx
Signed-off-by: Martin Kaiser <martin@kaiser.cx>
Reviewed-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-03-04 17:01:21 -08:00
Dan Carpenter
dc24559472 lib/stackdepot: off by one in depot_fetch_stack()
The stack_pools[] array has DEPOT_MAX_POOLS.  The "pools_num" tracks the
number of pools which are initialized.  See depot_init_pool() for more
details.

If pool_index == pools_num_cached, this will read one element beyond what
we want.  If not all the pools are initialized, then the pool will be
NULL, triggering a WARN(), and if they are all initialized it will read
one element beyond the end of the array.

Link: https://lkml.kernel.org/r/361ac881-60b7-471f-91e5-5bf8fe8042b2@moroto.mountain
Fixes: b29d318858 ("lib/stackdepot: store free stack records in a freelist")
Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
Cc: Alexander Potapenko <glider@google.com>
Cc: Andrey Konovalov <andreyknvl@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-03-04 17:01:17 -08:00
Andy Shevchenko
f0b7f8ade9 lib/string_helpers: Add flags param to string_get_size()
The new flags parameter allows controlling
 - Whether or not the units suffix is separated by a space, for
   compatibility with sort -h
 - Whether or not to append a B suffix - we're not always printing
   bytes.

Co-developed-by: Kent Overstreet <kent.overstreet@linux.dev>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-by: Kent Overstreet <kent.overstreet@linux.dev>
Link: https://lore.kernel.org/r/20240229205345.93902-1-andriy.shevchenko@linux.intel.com
Signed-off-by: Kees Cook <keescook@chromium.org>
2024-02-29 22:34:42 -08:00
Jakub Kicinski
65f5dd4f02 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Cross-merge networking fixes after downstream PR.

Conflicts:

net/mptcp/protocol.c
  adf1bb78da ("mptcp: fix snd_wnd initialization for passive socket")
  9426ce476a ("mptcp: annotate lockless access for RX path fields")
https://lore.kernel.org/all/20240228103048.19255709@canb.auug.org.au/

Adjacent changes:

drivers/dpll/dpll_core.c
  0d60d8df6f ("dpll: rely on rcu for netdev_dpll_pin()")
  e7f8df0e81 ("dpll: move xa_erase() call in to match dpll_pin_alloc() error path order")

drivers/net/veth.c
  1ce7d306ea ("veth: try harder when allocating queue memory")
  0bef512012 ("net: add netdev_lockdep_set_classes() to virtual drivers")

drivers/net/wireless/intel/iwlwifi/mvm/d3.c
  8c9bef26e9 ("wifi: iwlwifi: mvm: d3: implement suspend with MLO")
  78f65fbf42 ("wifi: iwlwifi: mvm: ensure offloading TID queue exists")

net/wireless/nl80211.c
  f78c137533 ("wifi: nl80211: reject iftype change with mesh ID change")
  414532d8aa ("wifi: cfg80211: use IEEE80211_MAX_MESH_ID_LEN appropriately")

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-02-29 14:24:56 -08:00
Kees Cook
c2efa5387c lib: stackinit: Adjust target string to 8 bytes for m68k
For reasons I cannot understand, m68k moves the start of the stack frame
for consecutive calls to the same function if the function's test
variable is larger than 8 bytes. This was only happening for the char
array test (obviously), so adjust the length of the string for m68k
only. I want the array size to be longer than "unsigned long" for every
given architecture, so the other remain unchanged.

Additionally adjust the error message to be a bit more clear about
what's happened, and move the KUNIT check outside of the consecutive
calls to minimize what happens between them.

Reported-by: Guenter Roeck <linux@roeck-us.net>
Closes: https://lore.kernel.org/lkml/a0d10d50-2720-4ecd-a2c6-c2c5e5aeee65@roeck-us.net/
Tested-by: Guenter Roeck <linux@roeck-us.net>
Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
Closes: https://lore.kernel.org/r/CAMuHMdX_g1tbiUL9PUQdqaegrEzCNN3GtbSvSBFYAL4TzvstFg@mail.gmail.com
Closes: https://lore.kernel.org/r/CAMuHMdW6N40+0gGQ+LSrN64Mo4A0-ELAm0pR3gWQ0mNanyBuUQ@mail.gmail.com
Tested-by: Geert Uytterhoeven <geert@linux-m68k.org>
Link: https://lore.kernel.org/all/a4bf4063-194f-4740-9c1d-88f9ab38b778@roeck-us.net
Signed-off-by: Kees Cook <keescook@chromium.org>
2024-02-29 13:38:03 -08:00
Kees Cook
3d965b33e4 fortify: Improve buffer overflow reporting
Improve the reporting of buffer overflows under CONFIG_FORTIFY_SOURCE to
help accelerate debugging efforts. The calculations are all just sitting
in registers anyway, so pass them along to the function to be reported.

For example, before:

  detected buffer overflow in memcpy

and after:

  memcpy: detected buffer overflow: 4096 byte read of buffer size 1

Link: https://lore.kernel.org/r/20230407192717.636137-10-keescook@chromium.org
Signed-off-by: Kees Cook <keescook@chromium.org>
2024-02-29 13:38:02 -08:00
Kees Cook
fa4a3f86d4 fortify: Add KUnit tests for runtime overflows
With fortify overflows able to be redirected, we can use KUnit to
exercise the overflow conditions. Add tests for every API covered by
CONFIG_FORTIFY_SOURCE, except for memset() and memcpy(), which are
special-cased for now.

Disable warnings in the Makefile since we're explicitly testing
known-bad string handling code patterns.

Note that this makes the LKDTM FORTIFY_STR* tests obsolete, but those
can be removed separately.

Signed-off-by: Kees Cook <keescook@chromium.org>
2024-02-29 13:38:02 -08:00
Kees Cook
4ce615e798 fortify: Provide KUnit counters for failure testing
The standard C string APIs were not designed to have a failure mode;
they were expected to always succeed without memory safety issues.
Normally, CONFIG_FORTIFY_SOURCE will use fortify_panic() to stop
processing, as truncating a read or write may provide an even worse
system state. However, this creates a problem for testing under things
like KUnit, which needs a way to survive failures.

When building with CONFIG_KUNIT, provide a failure path for all users
of fortify_panic, and track whether the failure was a read overflow or
a write overflow, for KUnit tests to examine. Inspired by similar logic
in the slab tests.

Signed-off-by: Kees Cook <keescook@chromium.org>
2024-02-29 13:38:02 -08:00
Kees Cook
1a78f8cb5d fortify: Allow KUnit test to build without FORTIFY
In order for CI systems to notice all the skipped tests related to
CONFIG_FORTIFY_SOURCE, allow the FORTIFY_SOURCE KUnit tests to build
with or without CONFIG_FORTIFY_SOURCE.

Signed-off-by: Kees Cook <keescook@chromium.org>
2024-02-29 13:38:02 -08:00
Kees Cook
475ddf1fce fortify: Split reporting and avoid passing string pointer
In preparation for KUnit testing and further improvements in fortify
failure reporting, split out the report and encode the function and access
failure (read or write overflow) into a single u8 argument. This mainly
ends up saving a tiny bit of space in the data segment. For a defconfig
with FORTIFY_SOURCE enabled:

$ size gcc/vmlinux.before gcc/vmlinux.after
   text  	  data     bss     dec    	    hex filename
26132309        9760658 2195460 38088427        2452eeb gcc/vmlinux.before
26132386        9748382 2195460 38076228        244ff44 gcc/vmlinux.after

Reviewed-by: Alexander Lobakin <aleksander.lobakin@intel.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
2024-02-29 13:38:02 -08:00
Kees Cook
08d45ee84b overflow: Introduce wrapping_assign_add() and wrapping_assign_sub()
This allows replacements of the idioms "var += offset" and "var -=
offset" with the wrapping_assign_add() and wrapping_assign_sub() helpers
respectively. They will avoid wrap-around sanitizer instrumentation.

Add to the selftests to validate behavior and lack of side-effects.

Reviewed-by: Marco Elver <elver@google.com>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
2024-02-29 13:38:01 -08:00
Kees Cook
d70de8054c overflow: Introduce wrapping_add(), wrapping_sub(), and wrapping_mul()
Provide helpers that will perform wrapping addition, subtraction, or
multiplication without tripping the arithmetic wrap-around sanitizers. The
first argument is the type under which the wrap-around should happen
with. In other words, these two calls will get very different results:

	wrapping_mul(int, 50, 50) == 2500
	wrapping_mul(u8,  50, 50) ==  196

Add to the selftests to validate behavior and lack of side-effects.

Reviewed-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Reviewed-by: Marco Elver <elver@google.com>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
2024-02-29 13:38:01 -08:00
Linus Torvalds
87adedeba5 Including fixes from bluetooth, WiFi and netfilter.
We have one outstanding issue with the stmmac driver, which may
 be a LOCKDEP false positive, not a blocker.
 
 Current release - regressions:
 
  - netfilter: nf_tables: re-allow NFPROTO_INET in
    nft_(match/target)_validate()
 
  - eth: ionic: fix error handling in PCI reset code
 
 Current release - new code bugs:
 
  - eth: stmmac: complete meta data only when enabled, fix null-deref
 
  - kunit: fix again checksum tests on big endian CPUs
 
 Previous releases - regressions:
 
  - veth: try harder when allocating queue memory
 
  - Bluetooth:
    - hci_bcm4377: do not mark valid bd_addr as invalid
    - hci_event: fix handling of HCI_EV_IO_CAPA_REQUEST
 
 Previous releases - always broken:
 
  - info leak in __skb_datagram_iter() on netlink socket
 
  - mptcp:
    - map v4 address to v6 when destroying subflow
    - fix potential wake-up event loss due to sndbuf auto-tuning
    - fix double-free on socket dismantle
 
  - wifi: nl80211: reject iftype change with mesh ID change
 
  - fix small out-of-bound read when validating netlink be16/32 types
 
  - rtnetlink: fix error logic of IFLA_BRIDGE_FLAGS writing back
 
  - ipv6: fix potential "struct net" ref-leak in inet6_rtm_getaddr()
 
  - ip_tunnel: prevent perpetual headroom growth with huge number of
    tunnels on top of each other
 
  - mctp: fix skb leaks on error paths of mctp_local_output()
 
  - eth: ice: fixes for DPLL state reporting
 
  - dpll: rely on rcu for netdev_dpll_pin() to prevent UaF
 
  - eth: dpaa: accept phy-interface-type = "10gbase-r" in the device tree
 
 Signed-off-by: Jakub Kicinski <kuba@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEE6jPA+I1ugmIBA4hXMUZtbf5SIrsFAmXg6ioACgkQMUZtbf5S
 IrupHQ/+Jt9OK8AYiUUBpeE0E0pb4yHS4KuiGWChx2YECJCeeeU6Ko4gaPI6+Nyv
 mMh/3sVsLnX7w4OXp2HddMMiFGbd1ufIptS0T/EMhHbbg1h7Qr1jhpu8aM8pb9jM
 5DwjfTZijDaW84Oe+Kk9BOonxR6A+Df27O3PSEUtLk4JCy5nwEwUr9iCxgCla499
 3aLu5eWRw8PTSsJec4BK6hfCKWiA/6oBHS1pQPwYvWuBWFZe8neYHtvt3LUwo1HR
 DwN9gtMiGBzYSSQmk8V1diGIokn80G5Krdq4gXbhsLxIU0oEJA7ltGpqasxy/OCs
 KGLHcU5wCd3j42gZOzvBzzzj8RQyd2ZekyvCu7B5Rgy3fx6JWI1jLalsQ/tT9yQg
 VJgFM2AZBb1EEAw/P2DkVQ8Km8ZuVlGtzUoldvIY1deP1/LZFWc0PftA6ndT7Ldl
 wQwKPQtJ5DMzqEe3mwSjFkL+AiSmcCHCkpnGBIi4c7Ek2/GgT1HeUMwJPh0mBftz
 smlLch3jMH2YKk7AmH7l9o/Q9ypgvl+8FA+icLaX0IjtSbzz5Q7gNyhgE0w1Hdb2
 79q6SE3ETLG/dn75XMA1C0Wowrr60WKHwagMPUl57u9bchfUT8Ler/4Sd9DWn8Vl
 55YnGPWMLCkxgpk+DHXYOWjOBRszCkXrAA71NclMnbZ5cQ86JYY=
 =T2ty
 -----END PGP SIGNATURE-----

Merge tag 'net-6.8-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net

Pull networking fixes from Jakub Kicinski:
 "Including fixes from bluetooth, WiFi and netfilter.

  We have one outstanding issue with the stmmac driver, which may be a
  LOCKDEP false positive, not a blocker.

  Current release - regressions:

   - netfilter: nf_tables: re-allow NFPROTO_INET in
     nft_(match/target)_validate()

   - eth: ionic: fix error handling in PCI reset code

  Current release - new code bugs:

   - eth: stmmac: complete meta data only when enabled, fix null-deref

   - kunit: fix again checksum tests on big endian CPUs

  Previous releases - regressions:

   - veth: try harder when allocating queue memory

   - Bluetooth:
      - hci_bcm4377: do not mark valid bd_addr as invalid
      - hci_event: fix handling of HCI_EV_IO_CAPA_REQUEST

  Previous releases - always broken:

   - info leak in __skb_datagram_iter() on netlink socket

   - mptcp:
      - map v4 address to v6 when destroying subflow
      - fix potential wake-up event loss due to sndbuf auto-tuning
      - fix double-free on socket dismantle

   - wifi: nl80211: reject iftype change with mesh ID change

   - fix small out-of-bound read when validating netlink be16/32 types

   - rtnetlink: fix error logic of IFLA_BRIDGE_FLAGS writing back

   - ipv6: fix potential "struct net" ref-leak in inet6_rtm_getaddr()

   - ip_tunnel: prevent perpetual headroom growth with huge number of
     tunnels on top of each other

   - mctp: fix skb leaks on error paths of mctp_local_output()

   - eth: ice: fixes for DPLL state reporting

   - dpll: rely on rcu for netdev_dpll_pin() to prevent UaF

   - eth: dpaa: accept phy-interface-type = '10gbase-r' in the device
     tree"

* tag 'net-6.8-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (73 commits)
  dpll: fix build failure due to rcu_dereference_check() on unknown type
  kunit: Fix again checksum tests on big endian CPUs
  tls: fix use-after-free on failed backlog decryption
  tls: separate no-async decryption request handling from async
  tls: fix peeking with sync+async decryption
  tls: decrement decrypt_pending if no async completion will be called
  gtp: fix use-after-free and null-ptr-deref in gtp_newlink()
  net: hsr: Use correct offset for HSR TLV values in supervisory HSR frames
  igb: extend PTP timestamp adjustments to i211
  rtnetlink: fix error logic of IFLA_BRIDGE_FLAGS writing back
  tools: ynl: fix handling of multiple mcast groups
  selftests: netfilter: add bridge conntrack + multicast test case
  netfilter: bridge: confirm multicast packets before passing them up the stack
  netfilter: nf_tables: allow NFPROTO_INET in nft_(match/target)_validate()
  Bluetooth: qca: Fix triggering coredump implementation
  Bluetooth: hci_qca: Set BDA quirk bit if fwnode exists in DT
  Bluetooth: qca: Fix wrong event type for patch config command
  Bluetooth: Enforce validation on max value of connection interval
  Bluetooth: hci_event: Fix handling of HCI_EV_IO_CAPA_REQUEST
  Bluetooth: mgmt: Fix limited discoverable off timeout
  ...
2024-02-29 12:40:20 -08:00
Christophe Leroy
3d6423ef8d kunit: Fix again checksum tests on big endian CPUs
Commit b38460bc46 ("kunit: Fix checksum tests on big endian CPUs")
fixed endianness issues with kunit checksum tests, but then
commit 6f4c45cbcb ("kunit: Add tests for csum_ipv6_magic and
ip_fast_csum") introduced new issues on big endian CPUs. Those issues
are once again reflected by the warnings reported by sparse.

So, fix them with the same approach, perform proper conversion in
order to support both little and big endian CPUs. Once the conversions
are properly done and the right types used, the sparse warnings are
cleared as well.

Reported-by: Erhard Furtner <erhard_f@mailbox.org>
Fixes: 6f4c45cbcb ("kunit: Add tests for csum_ipv6_magic and ip_fast_csum")
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Tested-by: Charlie Jenkins <charlie@rivosinc.com>
Tested-by: Guenter Roeck <linux@roeck-us.net>
Acked-by: Paolo Abeni <pabeni@redhat.com>
Acked-by: Palmer Dabbelt <palmer@rivosinc.com>
Link: https://lore.kernel.org/r/73df3a9e95c2179119398ad1b4c84cdacbd8dfb6.1708684443.git.christophe.leroy@csgroup.eu
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-02-29 09:16:02 -08:00
Waiman Long
f22f71322a locking/rwsem: Make DEBUG_RWSEMS and PREEMPT_RT mutually exclusive
The debugging code enabled by CONFIG_DEBUG_RWSEMS=y will only be
compiled in when CONFIG_PREEMPT_RT isn't set. There is no point to
allow CONFIG_DEBUG_RWSEMS to be set in a kernel configuration where
CONFIG_PREEMPT_RT is also set. Make them mutually exclusive.

Signed-off-by: Waiman Long <longman@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Boqun Feng <boqun.feng@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20240222150540.79981-5-longman@redhat.com
2024-02-28 13:08:38 +01:00
David Gow
0a549ed22c lib: memcpy_kunit: Fix an invalid format specifier in an assertion msg
The 'i' passed as an assertion message is a size_t, so should use '%zu',
not '%d'.

This was found by annotating the _MSG() variants of KUnit's assertions
to let gcc validate the format strings.

Fixes: bb95ebbe89 ("lib: Introduce CONFIG_MEMCPY_KUNIT_TEST")
Signed-off-by: David Gow <davidgow@google.com>
Tested-by: Guenter Roeck <linux@roeck-us.net>
Reviewed-by: Justin Stitt <justinstitt@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2024-02-27 15:26:02 -07:00
David Gow
d2733a026f lib/cmdline: Fix an invalid format specifier in an assertion msg
The correct format specifier for p - n (both p and n are pointers) is
%td, as the type should be ptrdiff_t.

This was discovered by annotating KUnit assertion macros with gcc's
printf specifier, but note that gcc incorrectly suggested a %d or %ld
specifier (depending on the pointer size of the architecture being
built).

Fixes: 0ea0908311 ("lib/cmdline: Allow get_options() to take 0 to validate the input")
Signed-off-by: David Gow <davidgow@google.com>
Tested-by: Guenter Roeck <linux@roeck-us.net>
Reviewed-by: Daniel Latypov <dlatypov@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2024-02-27 15:25:56 -07:00
David Gow
6f2f793fba kunit: test: Log the correct filter string in executor_test
KUnit's executor_test logs the filter string in KUNIT_ASSERT_EQ_MSG(),
but passed a random character from the filter, rather than the whole
string.

This was found by annotating KUNIT_ASSERT_EQ_MSG() to let gcc validate
the format string.

Fixes: 76066f93f1 ("kunit: add tests for filtering attributes")
Signed-off-by: David Gow <davidgow@google.com>
Tested-by: Guenter Roeck <linux@roeck-us.net>
Reviewed-by: Justin Stitt <justinstitt@google.com>
Reviewed-by: Daniel Latypov <dlatypov@google.com>
Reviewed-by: Rae Moar <rmoar@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2024-02-27 15:25:50 -07:00
Maxime Ripard
c5215d54dc kunit: Setup DMA masks on the kunit device
Commit d393acce7b ("drm/tests: Switch to kunit devices") switched the
DRM device creation helpers from an ad-hoc implementation to the new
kunit device creation helpers introduced in commit d03c720e03 ("kunit:
Add APIs for managing devices").

However, while the DRM helpers were using a platform_device, the kunit
helpers are using a dedicated bus and device type.

That situation creates small differences in the initialisation, and one
of them is that the kunit devices do not have the DMA masks setup. In
turn, this means that we can't do any kind of DMA buffer allocation
anymore, which creates a regression on some (downstream for now) tests.

Let's set up a default DMA mask that should work on any platform to fix
it.

Fixes: d03c720e03 ("kunit: Add APIs for managing devices")
Signed-off-by: Maxime Ripard <mripard@kernel.org>
Tested-by: Guenter Roeck <linux@roeck-us.net>
Reviewed-by: David Gow <davidgow@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2024-02-27 14:46:35 -07:00
Ricardo B. Marliere
2fadeb950f kunit: make kunit_bus_type const
Since commit d492cc2573 ("driver core: device.h: make struct
bus_type a const *"), the driver core can properly handle constant
struct bus_type, move the kunit_bus_type variable to be a constant
structure as well, placing it into read-only memory which can not be
modified at runtime.

Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Suggested-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Ricardo B. Marliere <ricardo@marliere.net>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: David Gow <davidgow@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2024-02-27 14:46:35 -07:00
Lucas De Marchi
a0dd82d6d8 kunit: Mark filter* params as rw
By allowing the filter_glob parameter to be written to, it's possible to
tweak the testsuites that will be executed on new module loads. This
makes it easier to run specific tests without having to reload kunit and
provides a way to filter tests on real HW even if kunit is builtin.
Example for xe driver:

1) Run just 1 test
	# echo -n xe_bo > /sys/module/kunit/parameters/filter_glob
	# modprobe -r xe_live_test
	# modprobe xe_live_test
	# ls /sys/kernel/debug/kunit/
	xe_bo

2) Run all tests
	# echo \* > /sys/module/kunit/parameters/filter_glob
	# modprobe -r xe_live_test
	# modprobe xe_live_test
	# ls /sys/kernel/debug/kunit/
	xe_bo  xe_dma_buf  xe_migrate  xe_mocs

For completeness and to cover other use cases, also change filter and
filter_action to rw.

Link: https://lore.kernel.org/intel-xe/dzacvbdditbneiu3e3fmstjmttcbne44yspumpkd6sjn56jqpk@vxu7sksbqrp6/
Reviewed-by: Rae Moar <rmoar@google.com>
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: David Gow <davidgow@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2024-02-27 14:46:34 -07:00
Greg Kroah-Hartman
13a44ba0dc Merge 6.8-rc6 into tty-next
We need the tty/serial fixes in here as well.

Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-02-27 06:22:13 +01:00
Baoquan He
443cbaf9e2 crash: split vmcoreinfo exporting code out from crash_core.c
Now move the relevant codes into separate files:
kernel/crash_reserve.c, include/linux/crash_reserve.h.

And add config item CRASH_RESERVE to control its enabling.

And also update the old ifdeffery of CONFIG_CRASH_CORE, including of
<linux/crash_core.h> and config item dependency on CRASH_CORE
accordingly.

And also do renaming as follows:
 - arch/xxx/kernel/{crash_core.c => vmcore_info.c}
because they are only related to vmcoreinfo exporting on x86, arm64,
riscv.

And also Remove config item CRASH_CORE, and rely on CONFIG_KEXEC_CORE to
decide if build in crash_core.c.

[yang.lee@linux.alibaba.com: remove duplicated include in vmcore_info.c]
  Link: https://lkml.kernel.org/r/20240126005744.16561-1-yang.lee@linux.alibaba.com
Link: https://lkml.kernel.org/r/20240124051254.67105-3-bhe@redhat.com
Signed-off-by: Baoquan He <bhe@redhat.com>
Signed-off-by: Yang Li <yang.lee@linux.alibaba.com>
Acked-by: Hari Bathini <hbathini@linux.ibm.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Pingfan Liu <piliu@redhat.com>
Cc: Klara Modin <klarasmodin@gmail.com>
Cc: Michael Kelley <mhklinux@outlook.com>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Yang Li <yang.lee@linux.alibaba.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-02-23 17:48:22 -08:00
Oscar Salvador
4bedfb314b mm,page_owner: maintain own list of stack_records structs
page_owner needs to increment a stack_record refcount when a new
allocation occurs, and decrement it on a free operation.  In order to do
that, we need to have a way to get a stack_record from a handle. 
Implement __stack_depot_get_stack_record() which just does that, and make
it public so page_owner can use it.

Also, traversing all stackdepot buckets comes with its own complexity,
plus we would have to implement a way to mark only those stack_records
that were originated from page_owner, as those are the ones we are
interested in.  For that reason, page_owner maintains its own list of
stack_records, because traversing that list is faster than traversing all
buckets while keeping at the same time a low complexity.

For now, add to stack_list only the stack_records of dummy_handle and
failure_handle, and set their refcount of 1.

Further patches will add code to increment or decrement stack_records
count on allocation and free operation.

Link: https://lkml.kernel.org/r/20240215215907.20121-4-osalvador@suse.de
Signed-off-by: Oscar Salvador <osalvador@suse.de>
Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
Reviewed-by: Marco Elver <elver@google.com>
Acked-by: Andrey Konovalov <andreyknvl@gmail.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Michal Hocko <mhocko@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-02-23 17:48:17 -08:00
Oscar Salvador
8151c7a35d lib/stackdepot: move stack_record struct definition into the header
In order to move the heavy lifting into page_owner code, this one needs to
have access to the stack_record structure, which right now sits in
lib/stackdepot.c.  Move it to the stackdepot.h header so page_owner can
access stack_record's struct fields.

Link: https://lkml.kernel.org/r/20240215215907.20121-3-osalvador@suse.de
Signed-off-by: Oscar Salvador <osalvador@suse.de>
Reviewed-by: Marco Elver <elver@google.com>
Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Andrey Konovalov <andreyknvl@gmail.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Michal Hocko <mhocko@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-02-23 17:48:16 -08:00
Oscar Salvador
3ee34eabac lib/stackdepot: fix first entry having a 0-handle
Patch series "page_owner: print stacks and their outstanding allocations",
v10.

page_owner is a great debug functionality tool that lets us know about all
pages that have been allocated/freed and their specific stacktrace.  This
comes very handy when debugging memory leaks, since with some scripting we
can see the outstanding allocations, which might point to a memory leak.

In my experience, that is one of the most useful cases, but it can get
really tedious to screen through all pages and try to reconstruct the
stack <-> allocated/freed relationship, becoming most of the time a
daunting and slow process when we have tons of allocation/free operations.
 

This patchset aims to ease that by adding a new functionality into
page_owner.  This functionality creates a new directory called
'page_owner_stacks' under 'sys/kernel//debug' with a read-only file called
'show_stacks', which prints out all the stacks followed by their
outstanding number of allocations (being that the times the stacktrace has
allocated but not freed yet).  This gives us a clear and a quick overview
of stacks <-> allocated/free.

We take advantage of the new refcount_f field that stack_record struct
gained, and increment/decrement the stack refcount on every
__set_page_owner() (alloc operation) and __reset_page_owner (free
operation) call.

Unfortunately, we cannot use the new stackdepot api STACK_DEPOT_FLAG_GET
because it does not fulfill page_owner needs, meaning we would have to
special case things, at which point makes more sense for page_owner to do
its own {dec,inc}rementing of the stacks.  E.g: Using
STACK_DEPOT_FLAG_PUT, once the refcount reaches 0, such stack gets
evicted, so page_owner would lose information.

This patchset also creates a new file called 'set_threshold' within
'page_owner_stacks' directory, and by writing a value to it, the stacks
which refcount is below such value will be filtered out.

A PoC can be found below:

 # cat /sys/kernel/debug/page_owner_stacks/show_stacks > page_owner_full_stacks.txt
 # head -40 page_owner_full_stacks.txt 
  prep_new_page+0xa9/0x120
  get_page_from_freelist+0x801/0x2210
  __alloc_pages+0x18b/0x350
  alloc_pages_mpol+0x91/0x1f0
  folio_alloc+0x14/0x50
  filemap_alloc_folio+0xb2/0x100
  page_cache_ra_unbounded+0x96/0x180
  filemap_get_pages+0xfd/0x590
  filemap_read+0xcc/0x330
  blkdev_read_iter+0xb8/0x150
  vfs_read+0x285/0x320
  ksys_read+0xa5/0xe0
  do_syscall_64+0x80/0x160
  entry_SYSCALL_64_after_hwframe+0x6e/0x76
 stack_count: 521



  prep_new_page+0xa9/0x120
  get_page_from_freelist+0x801/0x2210
  __alloc_pages+0x18b/0x350
  alloc_pages_mpol+0x91/0x1f0
  folio_alloc+0x14/0x50
  filemap_alloc_folio+0xb2/0x100
  __filemap_get_folio+0x14a/0x490
  ext4_write_begin+0xbd/0x4b0 [ext4]
  generic_perform_write+0xc1/0x1e0
  ext4_buffered_write_iter+0x68/0xe0 [ext4]
  ext4_file_write_iter+0x70/0x740 [ext4]
  vfs_write+0x33d/0x420
  ksys_write+0xa5/0xe0
  do_syscall_64+0x80/0x160
  entry_SYSCALL_64_after_hwframe+0x6e/0x76
 stack_count: 4609
...
...

 # echo 5000 > /sys/kernel/debug/page_owner_stacks/set_threshold 
 # cat /sys/kernel/debug/page_owner_stacks/show_stacks > page_owner_full_stacks_5000.txt
 # head -40 page_owner_full_stacks_5000.txt 
  prep_new_page+0xa9/0x120
  get_page_from_freelist+0x801/0x2210
  __alloc_pages+0x18b/0x350
  alloc_pages_mpol+0x91/0x1f0
  folio_alloc+0x14/0x50
  filemap_alloc_folio+0xb2/0x100
  __filemap_get_folio+0x14a/0x490
  ext4_write_begin+0xbd/0x4b0 [ext4]
  generic_perform_write+0xc1/0x1e0
  ext4_buffered_write_iter+0x68/0xe0 [ext4]
  ext4_file_write_iter+0x70/0x740 [ext4]
  vfs_write+0x33d/0x420
  ksys_pwrite64+0x75/0x90
  do_syscall_64+0x80/0x160
  entry_SYSCALL_64_after_hwframe+0x6e/0x76
 stack_count: 6781



  prep_new_page+0xa9/0x120
  get_page_from_freelist+0x801/0x2210
  __alloc_pages+0x18b/0x350
  pcpu_populate_chunk+0xec/0x350
  pcpu_balance_workfn+0x2d1/0x4a0
  process_scheduled_works+0x84/0x380
  worker_thread+0x12a/0x2a0
  kthread+0xe3/0x110
  ret_from_fork+0x30/0x50
  ret_from_fork_asm+0x1b/0x30
 stack_count: 8641


This patch (of 7):

The very first entry of stack_record gets a handle of 0, but this is wrong
because stackdepot treats a 0-handle as a non-valid one.  E.g: See the
check in stack_depot_fetch()

Fix this by adding and offset of 1.

This bug has been lurking since the very beginning of stackdepot, but no
one really cared as it seems.  Because of that I am not adding a Fixes
tag.

Link: https://lkml.kernel.org/r/20240215215907.20121-1-osalvador@suse.de
Link: https://lkml.kernel.org/r/20240215215907.20121-2-osalvador@suse.de
Co-developed-by: Marco Elver <elver@google.com>
Signed-off-by: Marco Elver <elver@google.com>
Signed-off-by: Oscar Salvador <osalvador@suse.de>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Andrey Konovalov <andreyknvl@gmail.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Michal Hocko <mhocko@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-02-23 17:48:16 -08:00
Andrew Morton
1f1183c4c0 merge mm-hotfixes-stable into mm-nonmm-stable to pick up stackdepot changes 2024-02-23 17:28:43 -08:00
Marco Elver
31639fd6ce stackdepot: use variable size records for non-evictable entries
With the introduction of stack depot evictions, each stack record is now
fixed size, so that future reuse after an eviction can safely store
differently sized stack traces.  In all cases that do not make use of
evictions, this wastes lots of space.

Fix it by re-introducing variable size stack records (up to the max
allowed size) for entries that will never be evicted.  We know if an entry
will never be evicted if the flag STACK_DEPOT_FLAG_GET is not provided,
since a later stack_depot_put() attempt is undefined behavior.

With my current kernel config that enables KASAN and also SLUB owner
tracking, I observe (after a kernel boot) a whopping reduction of 296
stack depot pools, which translates into 4736 KiB saved.  The savings here
are from SLUB owner tracking only, because KASAN generic mode still uses
refcounting.

Before:

  pools: 893
  allocations: 29841
  frees: 6524
  in_use: 23317
  freelist_size: 3454

After:

  pools: 597
  refcounted_allocations: 17547
  refcounted_frees: 6477
  refcounted_in_use: 11070
  freelist_size: 3497
  persistent_count: 12163
  persistent_bytes: 1717008

[elver@google.com: fix -Wstringop-overflow warning]
  Link: https://lore.kernel.org/all/20240201135747.18eca98e@canb.auug.org.au/
  Link: https://lkml.kernel.org/r/20240201090434.1762340-1-elver@google.com
  Link: https://lore.kernel.org/all/CABXGCsOzpRPZGg23QqJAzKnqkZPKzvieeg=W7sgjgi3q0pBo0g@mail.gmail.com/
Link: https://lkml.kernel.org/r/20240129100708.39460-1-elver@google.com
Link: https://lore.kernel.org/all/CABXGCsOzpRPZGg23QqJAzKnqkZPKzvieeg=W7sgjgi3q0pBo0g@mail.gmail.com/
Fixes: 108be8def4 ("lib/stackdepot: allow users to evict stack traces")
Signed-off-by: Marco Elver <elver@google.com>
Reviewed-by: Andrey Konovalov <andreyknvl@gmail.com>
Tested-by: Mikhail Gavrilov <mikhail.v.gavrilov@gmail.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Vincenzo Frascino <vincenzo.frascino@arm.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-02-23 17:27:12 -08:00
Florian Westphal
9a0d18853c netlink: add nla be16/32 types to minlen array
BUG: KMSAN: uninit-value in nla_validate_range_unsigned lib/nlattr.c:222 [inline]
BUG: KMSAN: uninit-value in nla_validate_int_range lib/nlattr.c:336 [inline]
BUG: KMSAN: uninit-value in validate_nla lib/nlattr.c:575 [inline]
BUG: KMSAN: uninit-value in __nla_validate_parse+0x2e20/0x45c0 lib/nlattr.c:631
 nla_validate_range_unsigned lib/nlattr.c:222 [inline]
 nla_validate_int_range lib/nlattr.c:336 [inline]
 validate_nla lib/nlattr.c:575 [inline]
...

The message in question matches this policy:

 [NFTA_TARGET_REV]       = NLA_POLICY_MAX(NLA_BE32, 255),

but because NLA_BE32 size in minlen array is 0, the validation
code will read past the malformed (too small) attribute.

Note: Other attributes, e.g. BITFIELD32, SINT, UINT.. are also missing:
those likely should be added too.

Reported-by: syzbot+3f497b07aa3baf2fb4d0@syzkaller.appspotmail.com
Reported-by: xingwei lee <xrivendell7@gmail.com>
Closes: https://lore.kernel.org/all/CABOYnLzFYHSnvTyS6zGa-udNX55+izqkOt2sB9WDqUcEGW6n8w@mail.gmail.com/raw
Fixes: ecaf75ffd5 ("netlink: introduce bigendian integer types")
Signed-off-by: Florian Westphal <fw@strlen.de>
Link: https://lore.kernel.org/r/20240221172740.5092-1-fw@strlen.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-02-22 19:01:55 -08:00
Nathan Chancellor
9feceff1d2 lib/Kconfig.debug: update Clang version check in CONFIG_KCOV
Now that the minimum supported version of LLVM for building the kernel has
been bumped to 13.0.1, this condition can be changed to just
CONFIG_CC_IS_CLANG, as the build will fail during the configuration stage
for older LLVM versions.

Link: https://lkml.kernel.org/r/20240125-bump-min-llvm-ver-to-13-0-1-v1-10-f5ff9bda41c5@kernel.org
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: "Aneesh Kumar K.V (IBM)" <aneesh.kumar@kernel.org>
Cc: Ard Biesheuvel <ardb@kernel.org>
Cc: Borislav Petkov (AMD) <bp@alien8.de>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Conor Dooley <conor@kernel.org>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Masahiro Yamada <masahiroy@kernel.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: "Naveen N. Rao" <naveen.n.rao@linux.ibm.com>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Nicolas Schier <nicolas@fjasle.eu>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-02-22 15:38:54 -08:00
Geert Uytterhoeven
f785785c0a lib: dhry: add missing closing parenthesis
The help text for the Dhrystone benchmark test lacks a matching closing
parenthesis.

Link: https://lkml.kernel.org/r/772b43271bcb3dd17a6aae671b2084f08c05b079.1705934853.git.geert+renesas@glider.be
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-02-22 15:38:52 -08:00
Geert Uytterhoeven
b8d1b82837 lib: dhry: use ktime_ms_delta() helper
Use the existing ktime_ms_delta() helper instead of open-coding the same
operation.

Link: https://lkml.kernel.org/r/bb43c67a7580de6152f5e6eb225071166d33b6e4.1705934853.git.geert+renesas@glider.be
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-02-22 15:38:52 -08:00
Geert Uytterhoeven
c3c6c20482 lib: dhry: remove unneeded <linux/mutex.h>
Patch series "lib: dhry: miscellaneous cleanups".

This patch series contains a few miscellaneous cleanups for the
Dhrystone benchmark test.


This patch (of 3):

The Dhrystone benchmark test does not use mutexes.

Link: https://lkml.kernel.org/r/cover.1705934853.git.geert+renesas@glider.be
Link: https://lkml.kernel.org/r/cf8fafaedccf96143f1513745c43a457480bfc24.1705934853.git.geert+renesas@glider.be
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-02-22 15:38:52 -08:00
Kemeng Shi
d6bbab8f35 flex_proportions: remove unused fprop_local_single
The single variant of flex_proportions is not used.  Simply remove it.

Link: https://lkml.kernel.org/r/20240118201321.759174-1-shikemeng@huaweicloud.com
Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-02-22 15:38:52 -08:00
Kuan-Wei Chiu
0e02ca29a5 lib/sort: optimize heapsort with double-pop variation
Instead of popping only the maximum element from the heap during each
iteration, we now pop the two largest elements at once.  Although this
introduces an additional comparison to determine the second largest
element, it enables a reduction in the height of the tree by one during
the heapify operations starting from root's left/right child.  This
reduction in tree height by one leads to a decrease of one comparison and
one swap.

This optimization results in saving approximately 0.5 * n swaps without
increasing the number of comparisons.  Additionally, the heap size during
heapify is now one less than the original size, offering a chance for
further reduction in comparisons and swaps.

The following experimental data is based on the array generated using
get_random_u32().

| N     | swaps (old) | swaps (new) | comparisons (old) | comparisons (new) |
|-------|-------------|-------------|-------------------|-------------------|
| 1000  | 9054        | 8569        | 10328             | 10320             |
| 2000  | 20137       | 19182       | 22634             | 22587             |
| 3000  | 32062       | 30623       | 35833             | 35752             |
| 4000  | 44274       | 42282       | 49332             | 49306             |
| 5000  | 57195       | 54676       | 63300             | 63294             |
| 6000  | 70205       | 67202       | 77599             | 77557             |
| 7000  | 83276       | 79831       | 92113             | 92032             |
| 8000  | 96630       | 92678       | 106635            | 106617            |
| 9000  | 110349      | 105883      | 121505            | 121404            |
| 10000 | 124165      | 119202      | 136628            | 136617            |


Link: https://lkml.kernel.org/r/20240113031352.2395118-3-visitorckw@gmail.com
Signed-off-by: Kuan-Wei Chiu <visitorckw@gmail.com>
Cc: Ching-Chun (Jim) Huang <jserv@ccns.ncku.edu.tw>
Cc: George Spelvin <lkml@sdf.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-02-22 15:38:52 -08:00
Kuan-Wei Chiu
db946a4222 lib/sort: optimize heapsort for equal elements in sift-down path
Patch series "lib/sort: Optimize the number of swaps and comparisons".

This patch series aims to optimize the heapsort algorithm, specifically
targeting a reduction in the number of swaps and comparisons required.


This patch (of 2):

Currently, when searching for the sift-down path and encountering equal
elements, the algorithm chooses the left child.  However, considering that
the height of the right subtree may be one less than that of the left
subtree, selecting the right child in such cases can potentially reduce
the number of comparisons and swaps.

For instance, when sorting an array of 10,000 identical elements, the
current implementation requires 247,209 comparisons.  With this patch, the
number of comparisons can be reduced to 227,241.

Link: https://lkml.kernel.org/r/20240113031352.2395118-1-visitorckw@gmail.com
Link: https://lkml.kernel.org/r/20240113031352.2395118-2-visitorckw@gmail.com
Signed-off-by: Kuan-Wei Chiu <visitorckw@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-02-22 15:38:52 -08:00
Nathan Chancellor
2947a4567f treewide: update LLVM Bugzilla links
LLVM moved their issue tracker from their own Bugzilla instance to GitHub
issues.  While all of the links are still valid, they may not necessarily
show the most up to date information around the issues, as all updates
will occur on GitHub, not Bugzilla.

Another complication is that the Bugzilla issue number is not always the
same as the GitHub issue number.  Thankfully, LLVM maintains this mapping
through two shortlinks:

  https://llvm.org/bz<num> -> https://bugs.llvm.org/show_bug.cgi?id=<num>
  https://llvm.org/pr<num> -> https://github.com/llvm/llvm-project/issues/<mapped_num>

Switch all "https://bugs.llvm.org/show_bug.cgi?id=<num>" links to the
"https://llvm.org/pr<num>" shortlink so that the links show the most up to
date information.  Each migrated issue links back to the Bugzilla entry,
so there should be no loss of fidelity of information here.

Link: https://lkml.kernel.org/r/20240109-update-llvm-links-v1-3-eb09b59db071@kernel.org
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
Acked-by: Fangrui Song <maskray@google.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Andrii Nakryiko <andrii@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Mykola Lysenko <mykolal@fb.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-02-22 15:38:51 -08:00
Jakub Kicinski
fecc51559a Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Cross-merge networking fixes after downstream PR.

Conflicts:

net/ipv4/udp.c
  f796feabb9 ("udp: add local "peek offset enabled" flag")
  56667da739 ("net: implement lockless setsockopt(SO_PEEK_OFF)")

Adjacent changes:

net/unix/garbage.c
  aa82ac51d6 ("af_unix: Drop oob_skb ref before purging queue in GC.")
  11498715f2 ("af_unix: Remove io_uring code for GC.")

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-02-22 15:29:26 -08:00
Daniel Gomez
e777ae44e3 XArray: add cmpxchg order test
XArray multi-index entries do not keep track of the order stored once the
entry is being marked as used with cmpxchg (conditionally replaced with
NULL).  Add a test to check the order is actually lost.  The test also
verifies the order and entries for all the tied indexes before and after
the NULL replacement with xa_cmpxchg.

Add another entry at 1 << order that keeps the node around and the order
information for the NULL-entry after xa_cmpxchg.

Link: https://lkml.kernel.org/r/20240131225125.1370598-3-mcgrof@kernel.org
Signed-off-by: Daniel Gomez <da.gomez@samsung.com>
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
Cc: Darrick J. Wong <djwong@kernel.org>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Hannes Reinecke <hare@suse.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Pankaj Raghav <p.raghav@samsung.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-02-22 10:24:48 -08:00
Luis Chamberlain
a60cc288a1 test_xarray: add tests for advanced multi-index use
Patch series "test_xarray: advanced API multi-index tests", v2.

This is a respin of the test_xarray multi-index tests [0] which use and
demonstrate the advanced API which is used by the page cache.  This should
let folks more easily follow how we use multi-index to support for example
a min order later in the page cache.  It also lets us grow the selftests
to mimic more of what we do in the page cache.


This patch (of 2):

The multi index selftests are great but they don't replicate how we deal
with the page cache exactly, which makes it a bit hard to follow as the
page cache uses the advanced API.

Add tests which use the advanced API, mimicking what we do in the page
cache, while at it, extend the example to do what is needed for min order
support.

[mcgrof@kernel.org: fix soft lockup for advanced-api tests]
  Link: https://lkml.kernel.org/r/20240216194329.840555-1-mcgrof@kernel.org
[akpm@linux-foundation.org: s/i/loops/, make non-static]
[akpm@linux-foundation.org: restore static storage for loop counter]
Link: https://lkml.kernel.org/r/20240131225125.1370598-1-mcgrof@kernel.org
Link: https://lkml.kernel.org/r/20240131225125.1370598-2-mcgrof@kernel.org
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
Tested-by: Daniel Gomez <da.gomez@samsung.com>
Cc: Darrick J. Wong <djwong@kernel.org>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Hannes Reinecke <hare@suse.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Pankaj Raghav <p.raghav@samsung.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-02-22 10:24:48 -08:00
Lukas Bulwahn
8689d75000 maple_tree: avoid duplicate variable init in mast_spanning_rebalance()
The local variables r_tmp and l_tmp in mast_spanning_rebalance() are
already initialized at its declaration; there is no need to assign the
value again.

Remove the duplicate initialization of {r,l}_tmp.  No functional change. 
Due to common compiler optimizations, also no change to object code.

This issue was identified with clang-analyzer's dead stores analysis.

Link: https://lkml.kernel.org/r/20240122102000.29558-1-lukas.bulwahn@gmail.com
Signed-off-by: Lukas Bulwahn <lukas.bulwahn@gmail.com>
Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-02-22 10:24:39 -08:00
Christian Brauner
4af6ccb469
Merge series 'Use Maple Trees for simple_offset utilities' of https://lore.kernel.org/r/170820083431.6328.16233178852085891453.stgit@91.116.238.104.host.secureserver.net
Pull simple offset series from Chuck Lever

In an effort to address slab fragmentation issues reported a few
months ago, I've replaced the use of xarrays for the directory
offset map in "simple" file systems (including tmpfs).

Thanks to Liam Howlett for helping me get this working with Maple
Trees.

* series 'Use Maple Trees for simple_offset utilities' of https://lore.kernel.org/r/170820083431.6328.16233178852085891453.stgit@91.116.238.104.host.secureserver.net: (6 commits)
  libfs: Convert simple directory offsets to use a Maple Tree
  test_maple_tree: testing the cyclic allocation
  maple_tree: Add mtree_alloc_cyclic()
  libfs: Add simple_offset_empty()
  libfs: Define a minimum directory offset
  libfs: Re-arrange locking in offset_iterate_dir()

Signed-off-by: Christian Brauner <brauner@kernel.org>
2024-02-22 10:03:26 +01:00
Sidhartha Kumar
e755c43eb4 maple_tree: fix comment describing mas_node_count_gfp()
The function description comment for mas_node_count_gfp() mistakingly
refers to the function as mas_node_count().  Change it to refer to the
correct function.

Link: https://lkml.kernel.org/r/20240109223119.162357-1-sidhartha.kumar@oracle.com
Signed-off-by: Sidhartha Kumar <sidhartha.kumar@oracle.com>
Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Peng Zhang <zhangpeng.00@bytedance.com>
Cc: Sidhartha Kumar <sidhartha.kumar@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-02-21 16:00:01 -08:00
Liam R. Howlett
f92e1a829d test_maple_tree: testing the cyclic allocation
This tests the interactions of the cyclic allocations, the maple state
index and last, and overflow.

Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Link: https://lore.kernel.org/r/170820144894.6328.13052830860966450674.stgit@91.116.238.104.host.secureserver.net
Signed-off-by: Christian Brauner <brauner@kernel.org>
2024-02-21 09:34:26 +01:00
Chuck Lever
9b6713cc75 maple_tree: Add mtree_alloc_cyclic()
I need a cyclic allocator for the simple_offset implementation in
fs/libfs.c.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Link: https://lore.kernel.org/r/170820144179.6328.12838600511394432325.stgit@91.116.238.104.host.secureserver.net
Signed-off-by: Christian Brauner <brauner@kernel.org>
2024-02-21 09:34:26 +01:00
Kees Cook
e6584c3964 string: Allow 2-argument strscpy()
Using sizeof(dst) for the "size" argument in strscpy() is the
overwhelmingly common case. Instead of requiring this everywhere, allow a
2-argument version to be used that will use the sizeof() internally. There
are other functions in the kernel with optional arguments[1], so this
isn't unprecedented, and improves readability. Update and relocate the
kern-doc for strscpy() too, and drop __HAVE_ARCH_STRSCPY as it is unused.

Adjust ARCH=um build to notice the changed export name, as it doesn't
do full header includes for the string helpers.

This could additionally let us save a few hundred lines of code:
 1177 files changed, 2455 insertions(+), 3026 deletions(-)
with a treewide cleanup using Coccinelle:

@needless_arg@
expression DST, SRC;
@@

        strscpy(DST, SRC
-, sizeof(DST)
        )

Link: https://elixir.bootlin.com/linux/v6.7/source/include/linux/pci.h#L1517 [1]
Reviewed-by: Justin Stitt <justinstitt@google.com>
Cc: Andy Shevchenko <andy@kernel.org>
Cc: linux-hardening@vger.kernel.org
Signed-off-by: Kees Cook <keescook@chromium.org>
2024-02-20 20:47:32 -08:00
Kees Cook
f478898e0a string: Redefine strscpy_pad() as a macro
In preparation for making strscpy_pad()'s 3rd argument optional, redefine
it as a macro. This also has the benefit of allowing greater FORITFY
introspection, as it couldn't see into the strscpy() nor the memset()
within strscpy_pad().

Cc: Andy Shevchenko <andy@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc:  <linux-hardening@vger.kernel.org>
Reviewed-by: Justin Stitt <justinstitt@google.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
2024-02-20 20:47:31 -08:00
Kees Cook
557f8c582a ubsan: Reintroduce signed overflow sanitizer
In order to mitigate unexpected signed wrap-around[1], bring back the
signed integer overflow sanitizer. It was removed in commit 6aaa31aeb9
("ubsan: remove overflow checks") because it was effectively a no-op
when combined with -fno-strict-overflow (which correctly changes signed
overflow from being "undefined" to being explicitly "wrap around").

Compilers are adjusting their sanitizers to trap wrap-around and to
detecting common code patterns that should not be instrumented
(e.g. "var + offset < var"). Prepare for this and explicitly rename
the option from "OVERFLOW" to "WRAP" to more accurately describe the
behavior.

To annotate intentional wrap-around arithmetic, the helpers
wrapping_add/sub/mul_wrap() can be used for individual statements. At
the function level, the __signed_wrap attribute can be used to mark an
entire function as expecting its signed arithmetic to wrap around. For a
single object file the Makefile can use "UBSAN_SIGNED_WRAP_target.o := n"
to mark it as wrapping, and for an entire directory, "UBSAN_SIGNED_WRAP :=
n" can be used.

Additionally keep these disabled under CONFIG_COMPILE_TEST for now.

Link: https://github.com/KSPP/linux/issues/26 [1]
Cc: Miguel Ojeda <ojeda@kernel.org>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Hao Luo <haoluo@google.com>
Reviewed-by: Marco Elver <elver@google.com>
Reviewed-by: Justin Stitt <justinstitt@google.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
2024-02-20 20:44:49 -08:00
Guenter Roeck
1eb1e98437 lib/Kconfig.debug: TEST_IOV_ITER depends on MMU
Trying to run the iov_iter unit test on a nommu system such as the qemu
kc705-nommu emulation results in a crash.

    KTAP version 1
    # Subtest: iov_iter
    # module: kunit_iov_iter
    1..9
BUG: failure at mm/nommu.c:318/vmap()!
Kernel panic - not syncing: BUG!

The test calls vmap() directly, but vmap() is not supported on nommu
systems, causing the crash.  TEST_IOV_ITER therefore needs to depend on
MMU.

Link: https://lkml.kernel.org/r/20240208153010.1439753-1-linux@roeck-us.net
Fixes: 2d71340ff1 ("iov_iter: Kunit tests for copying to/from an iterator")
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Cc: David Howells <dhowells@redhat.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-02-20 14:20:48 -08:00
Masahiro Yamada
cd14b01846 treewide: replace or remove redundant def_bool in Kconfig files
'def_bool X' is a shorthand for 'bool' plus 'default X'.

'def_bool' is redundant where 'bool' is already present, so 'def_bool X'
can be replaced with 'default X', or removed if X is 'n'.

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2024-02-20 20:47:45 +09:00
Greg Kroah-Hartman
36d97cdaf4 Merge 6.8-rc5 into tty-next
We need the serial/tty fixes in here as well.

Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-02-19 09:06:37 +01:00
Greg Kroah-Hartman
07749061b8 Linux 6.8-rc5
-----BEGIN PGP SIGNATURE-----
 
 iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAmXSbvkeHHRvcnZhbGRz
 QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGi3kH/iM5AcKeQfee07to
 TJP3WORLBt5UrUMgAQbJdNAruCj6brHA6YnZ6AjdhdNZ41RO06h69kTpb+y/CYJz
 +ogGK5qhWsC8ETM6rJCioo8uI4GU70p84fGiMfLw8o6vUvmtyI5qH3531Y30Edxt
 qqMVtQrVe9J5JcHQjJ/mQo8nAuvZdQQGM90zLGqFP+JheP3CzsCokraIITqoebDC
 PCVAsnSbXT/uMjwSr03kdnMnnY++fxz/TyHb/QRMPq/4xU8nz0Z2hlOOYrVkOINu
 lkIYtCDGTk0QR3hdDiO9yt+k1OsWi+oBkrhjAQmAk9x3riiXq1NKvZZLDArt7PH5
 gSyo9sE=
 =h5Ei
 -----END PGP SIGNATURE-----

Merge 6.8-rc5 into driver-core-next

We need the driver core changes in here as well.

Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-02-19 07:51:35 +01:00
Linus Torvalds
ced5905231 Driver core fixes for 6.8-rc5
Here are some driver core fixes, a kobject fix, and a documentation
 update for 6.8-rc5.  In detail these changes are:
   - devlink fixes for reported issues with 6.8-rc1
   - topology scheduling regression fix that has been reported by many
   - kobject loosening of checks change in -rc1 is now reverted as some
     codepaths seemed to need the checks
   - documentation update for the CVE process.  Has been reviewed by
     many, the last minute change to the document was to bring the .rst
     format back into the the new style rules, the contents did not
     change.
 
 All of these, except for the documentation update, have been in
 linux-next for over a week.  The documentation update has been reviewed
 for weeks by a group of developers, and in public for a week and the
 wording has stabilized for now.  If future changes are needed, we can do
 so before 6.8-final is out (or anytime after that.)
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 -----BEGIN PGP SIGNATURE-----
 
 iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCZdC7Eg8cZ3JlZ0Brcm9h
 aC5jb20ACgkQMUfUDdst+ykMaQCgnFRIta+T0yxCftMfSxqEcMeDLcsAoIM7v7WK
 krcgNVRuERcuJfHIoS6u
 =jshL
 -----END PGP SIGNATURE-----

Merge tag 'driver-core-6.8-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core

Pull driver core fixes from Greg KH:
 "Here are some driver core fixes, a kobject fix, and a documentation
  update for 6.8-rc5. In detail these changes are:

   - devlink fixes for reported issues with 6.8-rc1

   - topology scheduling regression fix that has been reported by many

   - kobject loosening of checks change in -rc1 is now reverted as some
     codepaths seemed to need the checks

   - documentation update for the CVE process. Has been reviewed by
     many, the last minute change to the document was to bring the .rst
     format back into the the new style rules, the contents did not
     change.

  All of these, except for the documentation update, have been in
  linux-next for over a week. The documentation update has been reviewed
  for weeks by a group of developers, and in public for a week and the
  wording has stabilized for now. If future changes are needed, we can
  do so before 6.8-final is out (or anytime after that)"

* tag 'driver-core-6.8-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core:
  Documentation: Document the Linux Kernel CVE process
  Revert "kobject: Remove redundant checks for whether ktype is NULL"
  driver core: fw_devlink: Improve logs for cycle detection
  driver core: fw_devlink: Improve detection of overlapping cycles
  driver core: Fix device_link_flag_is_sync_state_only()
  topology: Set capacity_freq_ref in all cases
2024-02-17 08:56:41 -08:00
Eric Dumazet
5c0941c55e kobject: reduce uevent_sock_mutex scope
This is a followup of commit a3498436b3 ("netns: restrict uevents")

- uevent_sock_mutex no longer protects uevent_seqnum thanks
  to prior patch in the series.

- uevent_net_broadcast() can run without holding uevent_sock_mutex.

- Instead of grabbing uevent_sock_mutex before calling
  kobject_uevent_net_broadcast(), we can move the
  mutex_lock(&uevent_sock_mutex) to the place we iterate over
  uevent_sock_list : uevent_net_broadcast_untagged().

After this patch, typical netdevice creations and destructions
calling uevent_net_broadcast_tagged() no longer need to acquire
uevent_sock_mutex.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Christian Brauner <brauner@kernel.org>
Reviewed-by: Christian Brauner <brauner@kernel.org>
Link: https://lore.kernel.org/r/20240214084829.684541-3-edumazet@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-02-17 16:20:41 +01:00
Eric Dumazet
2444a80c1c kobject: make uevent_seqnum atomic
We will soon no longer acquire uevent_sock_mutex
for most kobject_uevent_net_broadcast() calls,
and also while calling uevent_net_broadcast().

Make uevent_seqnum an atomic64_t to get its own protection.

This fixes a race while reading /sys/kernel/uevent_seqnum.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Christian Brauner <brauner@kernel.org>
Reviewed-by: Christian Brauner <brauner@kernel.org>
Link: https://lore.kernel.org/r/20240214084829.684541-2-edumazet@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-02-17 16:20:41 +01:00
Linus Torvalds
4b6f7c624e Tracing fixes for v6.8:
- Fix the #ifndef that didn't have CONFIG_ on HAVE_DYNAMIC_FTRACE_WITH_REGS
   The fix to have dynamic trampolines work with x86 broke arm64 as
   the config used in the #ifdef was HAVE_DYNAMIC_FTRACE_WITH_REGS and not
   CONFIG_HAVE_DYNAMIC_FTRACE_WITH_REGS which removed the fix that the
   previous fix was to fix.
 
 - Fix tracing_on state
   The code to test if "tracing_on" is set used ring_buffer_record_is_on()
   which returns false if the ring buffer isn't able to be written to.
   But the ring buffer disable has several bits that disable it.
   One is internal disabling which is used for resizing and other
   modifications of the ring buffer. But the "tracing_on" user space
   visible flag should only report if tracing is actually on and not
   internally disabled, as this can cause confusion as writing "1"
   when it is disabled will not enable it.
 
   Instead use ring_buffer_record_is_set_on() which shows the user space
   visible settings.
 
 - Fix a false positive kmemleak on saved cmdlines
   Now that the saved_cmdlines structure is allocated via alloc_page()
   and not via kmalloc() it has become invisible to kmemleak.
   The allocation done to one of its pointers was flagged as a
   dangling allocation leak. Make kmemleak aware of this allocation
   and free.
 
 - Fix synthetic event dynamic strings.
   A update that cleaned up the synthetic event code removed the
   return value of trace_string(), and had it return zero instead
   of the length, causing dynamic strings in the synthetic event
   to always have zero size.
 
 - Clean up documentation and header files for seq_buf
 -----BEGIN PGP SIGNATURE-----
 
 iIoEABYIADIWIQRRSw7ePDh/lE+zeZMp5XQQmuv6qgUCZc92CxQccm9zdGVkdEBn
 b29kbWlzLm9yZwAKCRAp5XQQmuv6qiD6AQCCtF9hBWq7slLlBQ+k07hWXOi1h4nR
 Ae6UmoGlu3u4ugEA6/g8mO2vjABagnK7RSk/R+s1SvSGqWkmAsWeEKirEAY=
 =Eqjs
 -----END PGP SIGNATURE-----

Merge tag 'trace-v6.8-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace

Pull tracing fixes from Steven Rostedt:

 - Fix the #ifndef that didn't have the 'CONFIG_' prefix on
   HAVE_DYNAMIC_FTRACE_WITH_REGS

   The fix to have dynamic trampolines work with x86 broke arm64 as the
   config used in the #ifdef was HAVE_DYNAMIC_FTRACE_WITH_REGS and not
   CONFIG_HAVE_DYNAMIC_FTRACE_WITH_REGS which removed the fix that the
   previous fix was to fix.

 - Fix tracing_on state

   The code to test if "tracing_on" is set incorrectly used
   ring_buffer_record_is_on() which returns false if the ring buffer
   isn't able to be written to.

   But the ring buffer disable has several bits that disable it. One is
   internal disabling which is used for resizing and other modifications
   of the ring buffer. But the "tracing_on" user space visible flag
   should only report if tracing is actually on and not internally
   disabled, as this can cause confusion as writing "1" when it is
   disabled will not enable it.

   Instead use ring_buffer_record_is_set_on() which shows the user space
   visible settings.

 - Fix a false positive kmemleak on saved cmdlines

   Now that the saved_cmdlines structure is allocated via alloc_page()
   and not via kmalloc() it has become invisible to kmemleak. The
   allocation done to one of its pointers was flagged as a dangling
   allocation leak. Make kmemleak aware of this allocation and free.

 - Fix synthetic event dynamic strings

   An update that cleaned up the synthetic event code removed the return
   value of trace_string(), and had it return zero instead of the
   length, causing dynamic strings in the synthetic event to always have
   zero size.

 - Clean up documentation and header files for seq_buf

* tag 'trace-v6.8-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
  seq_buf: Fix kernel documentation
  seq_buf: Don't use "proxy" headers
  tracing/synthetic: Fix trace_string() return value
  tracing: Inform kmemleak of saved_cmdlines allocation
  tracing: Use ring_buffer_record_is_set_on() in tracer_tracing_is_on()
  tracing: Fix HAVE_DYNAMIC_FTRACE_WITH_REGS ifdef
2024-02-16 10:33:51 -08:00
Heiko Carstens
c8dde11df1 s390/raid6: convert to use standard fpu_*() inline assemblies
Move the s390 specific raid6 inline assemblies, make them generic, and
reuse them to implement the raid6 gen/xor implementation.

Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2024-02-16 14:30:17 +01:00
Heiko Carstens
066c40918b s390/fpu: decrease stack usage for some cases
The kernel_fpu structure has a quite large size of 520 bytes. In order to
reduce stack footprint introduce several kernel fpu structures with
different and also smaller sizes. This way every kernel fpu user must use
the correct variant. A compile time check verifies that the correct variant
is used.

There are several users which use only 16 instead of all 32 vector
registers. For those users the new kernel_fpu_16 structure with a size of
only 266 bytes can be used.

Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2024-02-16 14:30:16 +01:00
Heiko Carstens
fd2527f209 s390/fpu: move, rename, and merge header files
Move, rename, and merge the fpu and vx header files. This way fpu header
files have a consistent naming scheme (fpu*.h).

Also get rid of the fpu subdirectory and move header files to asm
directory, so that all fpu and vx header files can be found at the same
location.

Merge internal.h header file into other header files, since the internal
helpers are used at many locations. so those helper functions are really
not internal.

Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2024-02-16 14:30:14 +01:00
Jakub Kicinski
73be9a3aab Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Cross-merge networking fixes after downstream PR.

No conflicts.

Adjacent changes:

net/core/dev.c
  9f30831390 ("net: add rcu safety to rtnl_prop_list_size()")
  723de3ebef ("net: free altname using an RCU callback")

net/unix/garbage.c
  11498715f2 ("af_unix: Remove io_uring code for GC.")
  25236c91b5 ("af_unix: Fix task hung while purging oob_skb in GC.")

drivers/net/ethernet/renesas/ravb_main.c
  ed4adc0720 ("net: ravb: Count packets instead of descriptors in GbEth RX path"
)
  c2da940857 ("ravb: Add Rx checksum offload support for GbEth")

net/mptcp/protocol.c
  bdd70eb689 ("mptcp: drop the push_pending field")
  28e5c13805 ("mptcp: annotate lockless accesses around read-mostly fields")

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-02-15 16:20:04 -08:00
Andy Shevchenko
6efe4d1879 seq_buf: Fix kernel documentation
There are plenty of issues with the kernel documentation here:
  - misspelled word "sequence"
  - different style of returned value descriptions
  - missed Return sections
  - unaligned style of ASCII / NUL-terminated / etc
  - wrong function references

Fix all these.

Link: https://lkml.kernel.org/r/20240215152506.598340-1-andriy.shevchenko@linux.intel.com

Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2024-02-15 12:17:28 -05:00
Andy Shevchenko
8a566f9410 seq_buf: Don't use "proxy" headers
Update header inclusions to follow IWYU (Include What You Use)
principle.

Link: https://lkml.kernel.org/r/20240215142255.400264-1-andriy.shevchenko@linux.intel.com

Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2024-02-15 12:16:46 -05:00
Linus Torvalds
91f842ffe6 linux_kselftest-kunit-fixes-6.8-rc5
This KUnit update for Linux 6.8-rc5 consists of one important fix
 to unregister kunit_bus when KUnit module is unloaded. Not doing
 so causes an error when KUnit module tries to re-register the bus
 when it gets reloaded.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEPZKym/RZuOCGeA/kCwJExA0NQxwFAmXMAnoACgkQCwJExA0N
 Qxy7UA//bP1Igj6osQfBjpR+RRyI3x069Z6zFRKmMglsyXnG2OTmTECGFTKbXPWf
 TX6UVc6iwcYTZzu2n/Xn7+smS4x3kUzYYUUhwtQzgm8Cape/XpQV3s32rYFO7XVs
 KH1QpB38wHibW+8YiBuluAfNTsjEYqlhVGIBPfmsG9jP+sm7y+yFiIu4Eo/JwCTa
 0KM4s+OFMcvC13RegOvK/mvBqqhcM7U3lMWQhRjLEXi0OjO65S4prTpM0NMO56Ar
 d8KNX718BvDY9MyihwioFE4VEIMIBNeqbzx1nbCFu7cUSS0n+VWK+41CeJBuYitm
 ub/meRILtAHbV9+9SY1REqIIrsWSC7v/+fbG05YOnTIMfVV1Ye1XvBZoJLAmiAGz
 VR1JbDbuk9xfwKU48NIS8CqH7VJjM74Rl3GJh0Meyn833BYHIfVHkRlLjBbiNDG5
 qac0XyH3vRHvp4Ud3PAmLa8e3QDo5HIHDkvBag4XOrzKdHpcBAGghrNWbGbipaKI
 7BTyvWu5c5riVo1GN81JqT1jsZF8Dld/QaS0mcvFHy5ORfCrLi2RTpYPJIRzv++a
 gUjAllyH/pqwHhB/Jj9Khi8OSv8/3jMIpMS3QE/ADwFfNslGWW63kycKeDuE9Jps
 gCVu9DHmm18OtLiYM+nSNjyWN1pvRvCV7uo8Atucbw4bBDFwZY8=
 =bDqz
 -----END PGP SIGNATURE-----

Merge tag 'linux_kselftest-kunit-fixes-6.8-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest

Pull KUnit fix from Shuah Khan:
 "One important fix to unregister kunit_bus when KUnit module is
  unloaded.

  Not doing so causes an error when KUnit module tries to re-register
  the bus when it gets reloaded"

* tag 'linux_kselftest-kunit-fixes-6.8-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
  kunit: device: Unregister the kunit_bus on shutdown
2024-02-14 15:34:03 -08:00
Philipp Stanner
acc2364fe6 PCI: Move PCI-specific devres code to drivers/pci/
The pcim_*() functions in lib/devres.c are guarded by an #ifdef CONFIG_PCI
and, thus, don't belong to this file. They are only ever used for PCI and
are not generic infrastructure.

Move all pcim_*() functions in lib/devres.c to drivers/pci/devres.c.
Adjust the Makefile.

Add drivers/pci/devres.c to Documentation.

Link: https://lore.kernel.org/r/20240131090023.12331-4-pstanner@redhat.com
Suggested-by: Danilo Krummrich <dakr@redhat.com>
Signed-off-by: Philipp Stanner <pstanner@redhat.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2024-02-12 10:36:17 -06:00
Philipp Stanner
ae87402752 PCI: Move pci_iomap.c to drivers/pci/
The entirety of pci_iomap.c is guarded by an #ifdef CONFIG_PCI. It,
consequently, does not belong to lib/ because it is not generic
infrastructure.

Move pci_iomap.c to drivers/pci/ and implement the necessary changes to
Makefiles and Kconfigs.

Update MAINTAINERS file.

Update Documentation.

Link: https://lore.kernel.org/r/20240131090023.12331-3-pstanner@redhat.com
[bhelgaas: squash in https://lore.kernel.org/r/20240212150934.24559-1-pstanner@redhat.com]
Suggested-by: Danilo Krummrich <dakr@redhat.com>
Signed-off-by: Philipp Stanner <pstanner@redhat.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
2024-02-12 10:35:40 -06:00
Heiko Carstens
8d16ce1488 s390/fpu: make use of __uninitialized macro
Code sections in s390 specific kernel code which use floating point or
vector registers all come with a 520 byte stack variable to save already in
use registers, if required.

With INIT_STACK_ALL_PATTERN or INIT_STACK_ALL_ZERO enabled this variable
will always be initialized on function entry in addition to saving register
contents, which contradicts the intention (performance improvement) of such
code sections.

Therefore provide a DECLARE_KERNEL_FPU_ONSTACK() macro which provides
struct kernel_fpu variables with an __uninitialized attribute, and convert
all existing code to use this.

This way only this specific type of stack variable will not be initialized,
regardless of config options.

Reviewed-by: Nathan Chancellor <nathan@kernel.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20240205154844.3757121-3-hca@linux.ibm.com
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2024-02-09 13:58:16 +01:00
Greg Kroah-Hartman
3ca8fbabcc Revert "kobject: Remove redundant checks for whether ktype is NULL"
This reverts commit 1b28cb81da.

It is reported to cause problems, so revert it for now until the root
cause can be found.

Reported-by: kernel test robot <oliver.sang@intel.com>
Fixes: 1b28cb81da ("kobject: Remove redundant checks for whether ktype is NULL")
Cc: Zhen Lei <thunder.leizhen@huawei.com>
Closes: https://lore.kernel.org/oe-lkp/202402071403.e302e33a-oliver.sang@intel.com
Link: https://lore.kernel.org/r/2024020849-consensus-length-6264@gregkh
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-02-08 16:39:25 +00:00
John Ogness
7412dc6d55 dump_stack: Do not get cpu_sync for panic CPU
dump_stack() is called in panic(). If for some reason another CPU
is holding the printk_cpu_sync and is unable to release it, the
panic CPU will be unable to continue and print the stacktrace.

Since non-panic CPUs are not allowed to store new printk messages
anyway, there is no need to synchronize the stacktrace output in
a panic situation.

For the panic CPU, do not get the printk_cpu_sync because it is
not needed and avoids a potential deadlock scenario in panic().

Link: https://lore.kernel.org/lkml/ZcIGKU8sxti38Kok@alley
Signed-off-by: John Ogness <john.ogness@linutronix.de>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Link: https://lore.kernel.org/r/20240207134103.1357162-15-john.ogness@linutronix.de
Signed-off-by: Petr Mladek <pmladek@suse.com>
2024-02-07 17:23:19 +01:00
David Gow
829388b725 kunit: device: Unregister the kunit_bus on shutdown
If KUnit is built as a module, and it's unloaded, the kunit_bus is not
unregistered. This causes an error if it's then re-loaded later, as we
try to re-register the bus.

Unregister the bus and root_device on shutdown, if it looks valid.

In addition, be more specific about the value of kunit_bus_device. It
is:
- a valid struct device* if the kunit_bus initialised correctly.
- an ERR_PTR if it failed to initialise.
- NULL before initialisation and after shutdown.

Fixes: d03c720e03 ("kunit: Add APIs for managing devices")
Signed-off-by: David Gow <davidgow@google.com>
Reviewed-by: Rae Moar <rmoar@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2024-02-06 17:07:37 -07:00
Kees Cook
918327e9b7 ubsan: Remove CONFIG_UBSAN_SANITIZE_ALL
For simplicity in splitting out UBSan options into separate rules,
remove CONFIG_UBSAN_SANITIZE_ALL, effectively defaulting to "y", which
is how it is generally used anyway. (There are no ":= y" cases beyond
where a specific file is enabled when a top-level ":= n" is in effect.)

Cc: Andrey Konovalov <andreyknvl@gmail.com>
Cc: Marco Elver <elver@google.com>
Cc: linux-doc@vger.kernel.org
Cc: linux-kbuild@vger.kernel.org
Signed-off-by: Kees Cook <keescook@chromium.org>
2024-02-06 02:21:38 -08:00
Kees Cook
30edbdf9b9 ubsan: Silence W=1 warnings in self-test
Silence a handful of W=1 warnings in the UBSan selftest, which set
variables without using them. For example:

   lib/test_ubsan.c:101:6: warning: variable 'val1' set but not used [-Wunused-but-set-variable]
     101 |         int val1 = 10;
         |             ^

Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202401310423.XpCIk6KO-lkp@intel.com/
Reviewed-by: Marco Elver <elver@google.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
2024-02-06 02:21:38 -08:00
Breno Leitao
843a8851e8 net: blackhole_dev: fix build warning for ethh set but not used
lib/test_blackhole_dev.c sets a variable that is never read, causing
this following building warning:

	lib/test_blackhole_dev.c:32:17: warning: variable 'ethh' set but not used [-Wunused-but-set-variable]

Remove the variable struct ethhdr *ethh, which is unused.

Fixes: 509e56b37c ("blackhole_dev: add a selftest")
Signed-off-by: Breno Leitao <leitao@debian.org>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-02-05 12:30:54 +00:00
Greg Kroah-Hartman
a802f50d6e Merge 6.8-rc3 into tty-next
We need the tty/serial fixes in here as well.

Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-02-04 06:21:02 -08:00
Randy Dunlap
157285397f lib/test_kmod: fix kernel-doc warnings
Fix all kernel-doc warnings in test_kmod.c:
- Mark some enum values as private so that kernel-doc is not needed
  for them
- s/thread_mutex/thread_lock/ in a struct's kernel-doc comments
- add kernel-doc info for @task_sync

test_kmod.c:67: warning: Enum value '__TEST_KMOD_INVALID' not described in enum 'kmod_test_case'
test_kmod.c:67: warning: Enum value '__TEST_KMOD_MAX' not described in enum 'kmod_test_case'
test_kmod.c💯 warning: Function parameter or member 'task_sync' not described in 'kmod_test_device_info'
test_kmod.c:134: warning: Function parameter or member 'thread_mutex' not described in 'kmod_test_device'

Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Luis Chamberlain <mcgrof@kernel.org>
Cc:  <linux-modules@vger.kernel.org>
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
2024-02-02 10:21:26 -08:00
Kees Cook
bd8c239c05 iov_iter: Avoid wrap-around instrumentation in copy_compat_iovec_from_user()
The loop counter "i" in copy_compat_iovec_from_user() is an int, but
because the nr_segs argument is unsigned long, the signed overflow
sanitizer got worried "i" could wrap around. Instead of making "i" an
unsigned long (which may enlarge the type size), switch both nr_segs
and i to u32. There is no truncation with nr_segs since it is never
larger than UIO_MAXIOV anyway. This keeps sanitizer instrumentation[1]
out of a UACCESS path:

vmlinux.o: warning: objtool: copy_compat_iovec_from_user+0xa9: call to __ubsan_handle_add_overflow() with UACCESS enabled

Link: https://github.com/KSPP/linux/issues/26 [1]
Cc: Christian Brauner <brauner@kernel.org>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20240129183729.work.991-kees@kernel.org
Signed-off-by: Christian Brauner <brauner@kernel.org>
2024-02-02 13:11:49 +01:00
Jakub Kicinski
cf244463a2 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Cross-merge networking fixes after downstream PR.

No conflicts or adjacent changes.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-02-01 15:12:37 -08:00
Tanzir Hasan
38b9baf194 lib/string: shrink lib/string.i via IWYU
This diff uses an open source tool include-what-you-use (IWYU) to modify
the include list, changing indirect includes to direct includes. IWYU is
implemented using the IWYUScripts github repository which is a tool that
is currently undergoing development. These changes seek to improve build
times.

This change to lib/string.c resulted in a preprocessed size of
lib/string.i from 26371 lines to 5321 lines (-80%) for the x86
defconfig.

Link: https://github.com/ClangBuiltLinux/IWYUScripts
Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Tanzir Hasan <tanzirh@google.com>
Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Link: https://lore.kernel.org/r/20231226-libstringheader-v6-2-80aa08c7652c@google.com
Signed-off-by: Kees Cook <keescook@chromium.org>
2024-02-01 09:47:59 -08:00
Yury Norov
c1f5204efc cpumask: add cpumask_weight_andnot()
Similarly to cpumask_weight_and(), cpumask_weight_andnot() is a handy
helper that may help to avoid creating an intermediate mask just to
calculate number of bits that set in a 1st given mask, and clear in 2nd
one.

Signed-off-by: Yury Norov <yury.norov@gmail.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-02-01 13:06:40 +01:00
Philipp Stanner
7626913652 pci_iounmap(): Fix MMIO mapping leak
The #ifdef ARCH_HAS_GENERIC_IOPORT_MAP accidentally also guards iounmap(),
which means MMIO mappings are leaked.

Move the guard so we call iounmap() for MMIO mappings.

Fixes: 316e8d79a0 ("pci_iounmap'2: Electric Boogaloo: try to make sense of it all")
Link: https://lore.kernel.org/r/20240131090023.12331-2-pstanner@redhat.com
Reported-by: Danilo Krummrich <dakr@redhat.com>
Suggested-by: Arnd Bergmann <arnd@kernel.org>
Signed-off-by: Philipp Stanner <pstanner@redhat.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Cc: <stable@vger.kernel.org> # v5.15+
2024-01-31 14:40:40 -06:00
Linus Torvalds
2a6526c4f3 linux_kselftest-kunit-fixes-6.8-rc3
This kunit fixes update for Linux 6.8-rc3 consists of NULL vs IS_ERR()
 bug fixes, documentation update, MAINTAINERS file update to add
 Rae Moar as a reviewer, and a fix to run test suites only after module
 initialization completes.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEPZKym/RZuOCGeA/kCwJExA0NQxwFAmW5dOAACgkQCwJExA0N
 QxwvoQ/+LoNEP+yu0C/KSWdNsDCMJizbwoCtoImETVdxxxSXOLKlh6cVtFfZTenN
 lLMX7zfrdIZEKIqIPVa4URq2+M9cHLuFkA8z2h5a8wOLSYxR2EyFq63o0K0NJwuC
 eSA7zgf/YkfEp7AKdkmefCqyn3bB3910KBlComqdtFtc/d/doZkeo2zCpJ5WXPzr
 x0aUGPceUa59u44JCgb1JWjQnA50ST+DYFHbfUEQJs37XfkswaseyCBKzWFAZnzR
 cVFHBVTZRceMrVDaXfSlXLr7FFgh/bQKF9BDTNBOEg9b1TVZmVH5sY5qnXsBTJ86
 nCf4J8bIzBOvLkDFGQVRs+6edvSli5wXWfM9O8hzGEbR+xs8GOrlHitM74hyhX7z
 iq6QZl0FjrGfc98DKTftwO7vDKZcFd2bhpw9ZV1cO4YzYnZVRcWwWYxPO+wS7r3q
 b+vcypxBmBhXDxxNJaJttxTcblNrxuLY1r7oJZaroE2ioKMY3uEqIA2hLagzPp9E
 /q6968b2tvFQrveFT3MJwRAAIq175xBRlNcpF8UkhnGXgZMF4er/teMN8coPeK4q
 N//sxYobNVIJYnz6npIQB6wSobWrc0Qq4AD6YTEYkhbaXiGo3VfdrjwrxTNxryBg
 wWAXvrSGUqWJXZqnSdFpmpdcOAo9r0Pf0NsByW5jpLiObIGQN5U=
 =BWfz
 -----END PGP SIGNATURE-----

Merge tag 'linux_kselftest-kunit-fixes-6.8-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest

Pull kunit fixes from Shuah Khan:
 "NULL vs IS_ERR() bug fixes, documentation update, MAINTAINERS file
  update to add Rae Moar as a reviewer, and a fix to run test suites
  only after module initialization completes"

* tag 'linux_kselftest-kunit-fixes-6.8-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
  Documentation: KUnit: Update the instructions on how to test static functions
  kunit: run test suites only after module initialization completes
  MAINTAINERS: kunit: Add Rae Moar as a reviewer
  kunit: device: Fix a NULL vs IS_ERR() check in init()
  kunit: Fix a NULL vs IS_ERR() bug
2024-01-30 15:12:58 -08:00
Lukas Bulwahn
5c49b6a4a4 vt: remove superfluous CONFIG_HW_CONSOLE
The config HW_CONSOLE is always identical to the config VT and is not
visible in the kernel's build menuconfig. So, CONFIG_HW_CONSOLE is
redundant.

Replace all references to CONFIG_HW_CONSOLE with CONFIG_VT and remove
CONFIG_HW_CONSOLE.

Signed-off-by: Lukas Bulwahn <lukas.bulwahn@gmail.com>
Reviewed-by: Javier Martinez Canillas <javierm@redhat.com>
Acked-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Reviewed-by: Geert Uytterhoeven <geert@linux-m68k.org>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
Link: https://lore.kernel.org/r/20240108134102.601-1-lukas.bulwahn@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-01-27 19:03:51 -08:00
Marco Elver
4434a56ec2 stackdepot: make fast paths lock-less again
With the introduction of the pool_rwlock (reader-writer lock), several
fast paths end up taking the pool_rwlock as readers.  Furthermore,
stack_depot_put() unconditionally takes the pool_rwlock as a writer.

Despite allowing readers to make forward-progress concurrently,
reader-writer locks have inherent cache contention issues, which does not
scale well on systems with large CPU counts.

Rework the synchronization story of stack depot to again avoid taking any
locks in the fast paths.  This is done by relying on RCU-protected list
traversal, and the NMI-safe subset of RCU to delay reuse of freed stack
records.  See code comments for more details.

Along with the performance issues, this also fixes incorrect nesting of
rwlock within a raw_spinlock, given that stack depot should still be
usable from anywhere:

 | [ BUG: Invalid wait context ]
 | -----------------------------
 | swapper/0/1 is trying to lock:
 | ffffffff89869be8 (pool_rwlock){..--}-{3:3}, at: stack_depot_save_flags
 | other info that might help us debug this:
 | context-{5:5}
 | 2 locks held by swapper/0/1:
 |  #0: ffffffff89632440 (rcu_read_lock){....}-{1:3}, at: __queue_work
 |  #1: ffff888100092018 (&pool->lock){-.-.}-{2:2}, at: __queue_work  <-- raw_spin_lock

Stack depot usage stats are similar to the previous version after a KASAN
kernel boot:

 $ cat /sys/kernel/debug/stackdepot/stats
 pools: 838
 allocations: 29865
 frees: 6604
 in_use: 23261
 freelist_size: 1879

The number of pools is the same as previously.  The freelist size is
minimally larger, but this may also be due to variance across system
boots.  This shows that even though we do not eagerly wait for the next
RCU grace period (such as with synchronize_rcu() or call_rcu()) after
freeing a stack record - requiring depot_pop_free() to "poll" if an entry
may be used - new allocations are very likely to happen in later RCU grace
periods.

Link: https://lkml.kernel.org/r/20240118110216.2539519-2-elver@google.com
Fixes: 108be8def4 ("lib/stackdepot: allow users to evict stack traces")
Reported-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Marco Elver <elver@google.com>
Reviewed-by: Andrey Konovalov <andreyknvl@gmail.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Andrey Konovalov <andreyknvl@gmail.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-01-25 23:52:21 -08:00
Marco Elver
c2a292545c stackdepot: add stats counters exported via debugfs
Add a few basic stats counters for stack depot that can be used to derive
if stack depot is working as intended.  This is a snapshot of the new
stats after booting a system with a KASAN-enabled kernel:

 $ cat /sys/kernel/debug/stackdepot/stats
 pools: 838
 allocations: 29861
 frees: 6561
 in_use: 23300
 freelist_size: 1840

Generally, "pools" should be well below the max; once the system is
booted, "in_use" should remain relatively steady.

Link: https://lkml.kernel.org/r/20240118110216.2539519-1-elver@google.com
Signed-off-by: Marco Elver <elver@google.com>
Reviewed-by: Andrey Konovalov <andreyknvl@gmail.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Alexander Potapenko <glider@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-01-25 23:52:21 -08:00
Jens Axboe
2263639f96 iov_iter: streamline iovec/bvec alignment iteration
Rewrite the alignment checking iterators for iovec and bvec to be easier
to read, and also significantly more compact in terms of generated code.
This saves 270 bytes of text on x86-64 for me (with clang-18) and 224
bytes on arm64 (with gcc-13).

In profiles, also saves a bit of time as well for the same workload:

     0.81%     -0.18%  [kernel.vmlinux]  [k] iov_iter_aligned_bvec
     0.48%     -0.09%  [kernel.vmlinux]  [k] iov_iter_is_aligned

which is a nice side benefit as well.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
Link: https://lore.kernel.org/r/544b31f7-6d4b-42f5-a544-1420501f081f@kernel.dk
Reviewed-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Christian Brauner <brauner@kernel.org>

v2: do the other half of the iterators too, as suggested by Keith.
    This further saves some text.
2024-01-25 17:14:44 +01:00
Marcos Paulo de Souza
c4bbe83d27 livepatch: Move tests from lib/livepatch to selftests/livepatch
The modules are being moved from lib/livepatch to
tools/testing/selftests/livepatch/test_modules.

This code moving will allow writing more complex tests, like for example an
userspace C code that will call a livepatched kernel function.

The modules are now built as out-of-tree
modules, but being part of the kernel source means they will be maintained.

Another advantage of the code moving is to be able to easily change,
debug and rebuild the tests by running make on the selftests/livepatch
directory, which is not currently possible since the modules on
lib/livepatch are build and installed using the "modules" target.

The current approach also keeps the ability to execute the tests manually
by executing the scripts inside selftests/livepatch directory, as it's
currently supported. If the modules are modified, they needed to be
rebuilt before running the scripts though.

The modules are built before running the selftests when using the
kselftest invocations:

	make kselftest TARGETS=livepatch
or
	make -C tools/testing/selftests/livepatch run_tests

Having the modules being built as out-of-modules requires changing the
currently used 'modprobe' by 'insmod' and adapt the test scripts that
check for the kernel message buffer.

Now it is possible to only compile the modules by running:

	make -C tools/testing/selftests/livepatch/

This way the test modules and other test program can be built in order
to be packaged if so desired.

As there aren't any modules being built on lib/livepatch, remove the
TEST_LIVEPATCH Kconfig and it's references.

Note: "make gen_tar" packages the pre-built binaries into the tarball.
       It means that it will store the test modules pre-built for
       the kernel running on the build host.

       Note that these modules need not binary compatible with
       the kernel built from the same sources. But the same
       is true for other packaged selftest binaries.

       The entire kernel sources are needed for rebuilding
       the selftests on another system.

Reviewed-by: Joe Lawrence <joe.lawrence@redhat.com>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Signed-off-by: Marcos Paulo de Souza <mpdesouza@suse.com>
Acked-by: Alexander Gordeev <agordeev@linux.ibm.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2024-01-22 10:29:47 -07:00
Marco Pagani
a1af6a2bfa kunit: run test suites only after module initialization completes
Commit 2810c1e998 ("kunit: Fix wild-memory-access bug in
kunit_free_suite_set()") fixed a wild-memory-access bug that could have
happened during the loading phase of test suites built and executed as
loadable modules. However, it also introduced a problematic side effect
that causes test suites modules to crash when they attempt to register
fake devices.

When a module is loaded, it traverses the MODULE_STATE_UNFORMED and
MODULE_STATE_COMING states before reaching the normal operating state
MODULE_STATE_LIVE. Finally, when the module is removed, it moves to
MODULE_STATE_GOING before being released. However, if the loading
function load_module() fails between complete_formation() and
do_init_module(), the module goes directly from MODULE_STATE_COMING to
MODULE_STATE_GOING without passing through MODULE_STATE_LIVE.

This behavior was causing kunit_module_exit() to be called without
having first executed kunit_module_init(). Since kunit_module_exit() is
responsible for freeing the memory allocated by kunit_module_init()
through kunit_filter_suites(), this behavior was resulting in a
wild-memory-access bug.

Commit 2810c1e998 ("kunit: Fix wild-memory-access bug in
kunit_free_suite_set()") fixed this issue by running the tests when the
module is still in MODULE_STATE_COMING. However, modules in that state
are not fully initialized, lacking sysfs kobjects. Therefore, if a test
module attempts to register a fake device, it will inevitably crash.

This patch proposes a different approach to fix the original
wild-memory-access bug while restoring the normal module execution flow
by making kunit_module_exit() able to detect if kunit_module_init() has
previously initialized the tests suite set. In this way, test modules
can once again register fake devices without crashing.

This behavior is achieved by checking whether mod->kunit_suites is a
virtual or direct mapping address. If it is a virtual address, then
kunit_module_init() has allocated the suite_set in kunit_filter_suites()
using kmalloc_array(). On the contrary, if mod->kunit_suites is still
pointing to the original address that was set when looking up the
.kunit_test_suites section of the module, then the loading phase has
failed and there's no memory to be freed.

v4:
- rebased on 6.8
- noted that kunit_filter_suites() must return a virtual address
v3:
- add a comment to clarify why the start address is checked
v2:
- add include <linux/mm.h>

Fixes: 2810c1e998 ("kunit: Fix wild-memory-access bug in kunit_free_suite_set()")
Reviewed-by: David Gow <davidgow@google.com>
Tested-by: Rae Moar <rmoar@google.com>
Tested-by: Richard Fitzgerald <rf@opensource.cirrus.com>
Reviewed-by: Javier Martinez Canillas <javierm@redhat.com>
Signed-off-by: Marco Pagani <marpagan@redhat.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2024-01-22 07:58:54 -07:00
Dan Carpenter
083974ebb8 kunit: device: Fix a NULL vs IS_ERR() check in init()
The root_device_register() function does not return NULL, it returns
error pointers.  Fix the check to match.

Fixes: d03c720e03 ("kunit: Add APIs for managing devices")
Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
Reviewed-by: Rae Moar <rmoar@google.com>
Reviewed-by: David Gow <davidgow@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2024-01-22 07:58:22 -07:00
Dan Carpenter
ed1a72fb0d kunit: Fix a NULL vs IS_ERR() bug
The kunit_device_register() function doesn't return NULL, it returns
error pointers.  Change the KUNIT_ASSERT_NOT_NULL() to check for
ERR_OR_NULL().

Fixes: d03c720e03 ("kunit: Add APIs for managing devices")
Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
Reviewed-by: Rae Moar <rmoar@google.com>
Reviewed-by: David Gow <davidgow@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2024-01-22 07:58:12 -07:00
Linus Torvalds
e5075d8ec5 RISC-V Patches for the 6.8 Merge Window, Part 4
This includes everything from part 2:
 
 * Support for tuning for systems with fast misaligned accesses.
 * Support for SBI-based suspend.
 * Support for the new SBI debug console extension.
 * The T-Head CMOs now use PA-based flushes.
 * Support for enabling the V extension in kernel code.
 * Optimized IP checksum routines.
 * Various ftrace improvements.
 * Support for archrandom, which depends on the Zkr extension.
 
 and then also a fix for those:
 
 * The build is no longer broken under NET=n, KUNIT=y for ports that
   don't define their own ipv6 checksum.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCAAxFiEEKzw3R0RoQ7JKlDp6LhMZ81+7GIkFAmWsCOMTHHBhbG1lckBk
 YWJiZWx0LmNvbQAKCRAuExnzX7sYieQND/0f+1gizTM0OzuqZG9+DOdWTtqmILyr
 sZaYXWBw6SPzbUSlwjoW4Qp/S3Ur7IhrbfttM2aMoS4GHZvSESAXOMXC4c7AnCaQ
 HOXBC2OuXvq6jA0ZjK5XPviR70A/7uD2iu5SNO1hyfJK08LSEu+AulxtkW50+wMc
 bHXSpZxEf8AtwOJK1cRtwhH4qy+Qcs3Nla3jG7OnDsPbhJVcydHx95eCtfwn2cQA
 KwJPN1fjRtm4ALZb91QcMDO8VAoanfPEkSR3DoNVE/UfdTItYk35VHmf4RWh7IWA
 qDnV5Mp/XMX2RmJqwi1ZmSHHX0rfVLL5UqgBhGHC8PuMpLJn5p9U6DZ0qD7YWxcB
 NDlrHsaXt112RHEEM/7CcLkqEexua/ezcC45E5tSQ4sRDZE3fvgbALao67xSQ22D
 lCpVAY0Z3o5oWaM/jISiQHjSNn5RrAwEYSvvv2pkW4QAMShA2eggmQaCF+Jl4EMp
 u6yqJpXxDI99C088uvM6Bi2gcX8fnBSmOzCB/sSU4a1I72UpWrGngqUpTYKHG8Jz
 cTZhbIKmQirBP0vC/UgMOS0sNuw/NykItfRXZ2g0qGKvw1TjJ6djdeZBKcAj3h0E
 fJpMxuhmeOFYE7DavnhSt3CResFTXZzXLChjxGbT+g10YzVEf9g7vBVnjxAwad9f
 tryMVpL/ipGpQQ==
 =Sjhj
 -----END PGP SIGNATURE-----

Merge tag 'riscv-for-linus-6.8-mw4' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux

Pull more RISC-V updates from Palmer Dabbelt:

 - Support for tuning for systems with fast misaligned accesses.

 - Support for SBI-based suspend.

 - Support for the new SBI debug console extension.

 - The T-Head CMOs now use PA-based flushes.

 - Support for enabling the V extension in kernel code.

 - Optimized IP checksum routines.

 - Various ftrace improvements.

 - Support for archrandom, which depends on the Zkr extension.

 - The build is no longer broken under NET=n, KUNIT=y for ports that
   don't define their own ipv6 checksum.

* tag 'riscv-for-linus-6.8-mw4' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: (56 commits)
  lib: checksum: Fix build with CONFIG_NET=n
  riscv: lib: Check if output in asm goto supported
  riscv: Fix build error on rv32 + XIP
  riscv: optimize ELF relocation function in riscv
  RISC-V: Implement archrandom when Zkr is available
  riscv: Optimize hweight API with Zbb extension
  riscv: add dependency among Image(.gz), loader(.bin), and vmlinuz.efi
  samples: ftrace: Add RISC-V support for SAMPLE_FTRACE_DIRECT[_MULTI]
  riscv: ftrace: Add DYNAMIC_FTRACE_WITH_DIRECT_CALLS support
  riscv: ftrace: Make function graph use ftrace directly
  riscv: select FTRACE_MCOUNT_USE_PATCHABLE_FUNCTION_ENTRY
  lib/Kconfig.debug: Update AS_HAS_NON_CONST_LEB128 comment and name
  riscv: Restrict DWARF5 when building with LLVM to known working versions
  riscv: Hoist linker relaxation disabling logic into Kconfig
  kunit: Add tests for csum_ipv6_magic and ip_fast_csum
  riscv: Add checksum library
  riscv: Add checksum header
  riscv: Add static key for misaligned accesses
  asm-generic: Improve csum_fold
  RISC-V: selftests: cbo: Ensure asm operands match constraints
  ...
2024-01-20 11:06:04 -08:00
Linus Torvalds
57f22c8dab strlcpy removal for v6.8-rc1
- Remove of the final (very recent) user of strlcpy() (in bcachefs).
 
 - Remove the strlcpy() API. Long live strscpy().
 -----BEGIN PGP SIGNATURE-----
 
 iQJKBAABCgA0FiEEpcP2jyKd1g9yPm4TiXL039xtwCYFAmWq5VgWHGtlZXNjb29r
 QGNocm9taXVtLm9yZwAKCRCJcvTf3G3AJhsHD/9xfEA1YFC4WzTuX1RcsSwZQTGL
 L8ej9NuRiQ57vJA37PEV3wyTIVHLOJDjNr+8cmL1Pu0GR9K4R2s4YzdQtaK6pFeE
 BYuXUOUK9rkQsVLL0DrTv/YryjMah0DDb/M7kKZDRfTgii0yWZ1WqEmO2+9wbdKS
 n7O9oYZreiNkFg3/6yHPYlBve9QXt+VHN/NIxSQqps3BVXRPKcIwCCJq7IiazBpR
 xo7FkhTftmL1ZqGOGRcoY7YKWt7WFg9HPBB30WXkqCIqmaFWm4sBancArVgTgQ+r
 vES/QF4SsFXkprf4fPWuQZlcChc2hibREI9o3t3Qck4FG7W+alXSpj3IxFiZqNFu
 BvNZwKW5/MB2r+CugM12JUszxAVlcwqskoilGOVD65AJ26xUYh2oAr3kpU5L6Nur
 c3zcLSlpec9sYQMdXSGQWOF2juhEp2ikceP5dw5ONcj4P7UXadPnB4hsW8ulG844
 Rh552sR0je5UCxzXNozec9X1JFZf7Z8lOjdRv1Xs549+F2rmzaZAt2eOnageCCO7
 XKoqZ/auIwzj/3WqDxivjs3xT+1PpxJd3bALDXb/iIu10DMbNq7CRwHO+1OZo1e1
 4OLE1gbM3Ldv2WgUe2o1dDURnmKq1aiYN8ThoOIVy9VTC0FOxujVXKsd//f6qpMu
 EGOypgqRBFpVd53DvQ==
 =DuKT
 -----END PGP SIGNATURE-----

Merge tag 'strlcpy-removal-v6.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux

Pull strlcpy removal from Kees Cook:
 "As promised, this is 'part 2' of the hardening tree, late in -rc1 now
  that all the other trees with strlcpy() removals have landed. One new
  user appeared (in bcachefs) but was a trivial refactor. The kernel is
  now free of the strlcpy() API!

   - Remove of the final (very recent) user of strlcpy() (in bcachefs)

   - Remove the strlcpy() API. Long live strscpy()"

* tag 'strlcpy-removal-v6.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
  string: Remove strlcpy()
  bcachefs: Replace strlcpy() with strscpy()
2024-01-19 13:49:16 -08:00
Kees Cook
d26270061a string: Remove strlcpy()
With all the users of strlcpy() removed[1] from the kernel, remove the
API, self-tests, and other references. Leave mentions in Documentation
(about its deprecation), and in checkpatch.pl (to help migrate host-only
tools/ usage). Long live strscpy().

Link: https://github.com/KSPP/linux/issues/89 [1]
Cc: Azeem Shaikh <azeemshaikh38@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Whitcroft <apw@canonical.com>
Cc: Joe Perches <joe@perches.com>
Cc: Dwaipayan Ray <dwaipayanray1@gmail.com>
Cc: Lukas Bulwahn <lukas.bulwahn@gmail.com>
Cc: linux-hardening@vger.kernel.org
Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
2024-01-19 11:59:11 -08:00
Palmer Dabbelt
f24a70106d
lib: checksum: Fix build with CONFIG_NET=n
The generic ipv6 checksums are only defined with CONFIG_NET=y, so gate
the test as well.

Fixes: 6f4c45cbcb ("kunit: Add tests for csum_ipv6_magic and ip_fast_csum")
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202401192143.jLdjbIy3-lkp@intel.com/
Closes: https://lore.kernel.org/oe-kbuild-all/202401192357.WU4nPRdN-lkp@intel.com/
Reviewed-By: Charlie Jenkins <charlie@rivosinc.com>
Link: https://lore.kernel.org/r/20240119145600.3093-2-palmer@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-01-19 08:12:38 -08:00
Linus Torvalds
9d1694dc91 for-6.8/block-2024-01-18
-----BEGIN PGP SIGNATURE-----
 
 iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAmWpoCgQHGF4Ym9lQGtl
 cm5lbC5kawAKCRD301j7KXHgpqIUEADFvJdC2izkPzYzsOrMK5Rt1H7vaHGKhbA+
 zWCuQaa1xQd8bazq+NVnQpbzgclkE/WodTCNfNXcTTjzeQEmcZC888llP3Y9vwyP
 XfEKH7fSaeKvGigJLro1oPe3YV7/t89F5ol3BoZayfzJF8GEU9BXRWzgOkZzijnk
 xdm5wUyn/GknksMuQQraZ+U6bQRFLBOulzoaQeMD6Dosx+uRlM4WvAJawC+uOV6R
 qPT2BVSfYGzmgEKvoaphw0FMkUhFBMDHfXTpQBi5tIzTKOaof8tynYEGz0FHZWeh
 V0JEEp+3jLWFxFXeEcXgBVPJPE8J0DzGm9g17/uwC2Yhmlbw4FKZVRvGG+PpeUso
 D5aqhqm3w0x7HgZ7JKwy/aUctADYvjVcSVzPHTaFK0aCSYCIAXxqv4p7fOoxPqyx
 T32IUHTzGtkCdqzv/xFdtTYhTNM2vyzzbbWj5lXgCBqHsXOVbCh8UM2p+9ec2Umq
 Fo1XF9eoCDe6Sn4s15hJ5G4DEhKGOKkHluvRUdM+0selA5b0sNOeUqlAf2v+0ve3
 Pv3e3X4NPssNIEcsDHf5pc3zGC+LXRS0oFvfIvDESBjwXc3iHIMl+SkjyS57P4Fd
 RKrHEUUiACuCKO/IWqFYLiNBNHnP3RmV5gSxIZr9QJhFSwOzP+/+4++TCdF5vdAV
 amhv+0PdCw==
 =DLW9
 -----END PGP SIGNATURE-----

Merge tag 'for-6.8/block-2024-01-18' of git://git.kernel.dk/linux

Pull block fixes from Jens Axboe:

 - NVMe pull request via Keith:
      - tcp, fc, and rdma target fixes (Maurizio, Daniel, Hannes,
        Christoph)
      - discard fixes and improvements (Christoph)
      - timeout debug improvements (Keith, Max)
      - various cleanups (Daniel, Max, Giuxen)
      - trace event string fixes (Arnd)
      - shadow doorbell setup on reset fix (William)
      - a write zeroes quirk for SK Hynix (Jim)

 - MD pull request via Song:
      - Sparse warning since v6.0 (Bart)
      - /proc/mdstat regression since v6.7 (Yu Kuai)

 - Use symbolic error value (Christian)

 - IO Priority documentation update (Christian)

 - Fix for accessing queue limits without having entered the queue
   (Christoph, me)

 - Fix for loop dio support (Christoph)

 - Move null_blk off deprecated ida interface (Christophe)

 - Ensure nbd initializes full msghdr (Eric)

 - Fix for a regression with the folio conversion, which is now easier
   to hit because of an unrelated change (Matthew)

 - Remove redundant check in virtio-blk (Li)

 - Fix for a potential hang in sbitmap (Ming)

 - Fix for partial zone appending (Damien)

 - Misc changes and fixes (Bart, me, Kemeng, Dmitry)

* tag 'for-6.8/block-2024-01-18' of git://git.kernel.dk/linux: (45 commits)
  Documentation: block: ioprio: Update schedulers
  loop: fix the the direct I/O support check when used on top of block devices
  blk-mq: Remove the hctx 'run' debugfs attribute
  nbd: always initialize struct msghdr completely
  block: Fix iterating over an empty bio with bio_for_each_folio_all
  block: bio-integrity: fix kcalloc() arguments order
  virtio_blk: remove duplicate check if queue is broken in virtblk_done
  sbitmap: remove stale comment in sbq_calc_wake_batch
  block: Correct a documentation comment in blk-cgroup.c
  null_blk: Remove usage of the deprecated ida_simple_xx() API
  block: ensure we hold a queue reference when using queue limits
  blk-mq: rename blk_mq_can_use_cached_rq
  block: print symbolic error name instead of error code
  blk-mq: fix IO hang from sbitmap wakeup race
  nvmet-rdma: avoid circular locking dependency on install_queue()
  nvmet-tcp: avoid circular locking dependency on install_queue()
  nvme-pci: set doorbell config before unquiescing
  block: fix partial zone append completion handling in req_bio_endio()
  block/iocost: silence warning on 'last_period' potentially being unused
  md/raid1: Use blk_opf_t for read and write operations
  ...
2024-01-18 18:22:40 -08:00
Linus Torvalds
db5ccb9eb2 cxl for v6.8
- Add support for parsing the Coherent Device Attribute Table (CDAT)
 
 - Add support for calculating a platform CXL QoS class from CDAT data
 
 - Unify the tracing of EFI CXL Events with native CXL Events.
 
 - Add Get Timestamp support
 
 - Miscellaneous cleanups and fixups
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQSbo+XnGs+rwLz9XGXfioYZHlFsZwUCZaHVvAAKCRDfioYZHlFs
 Z3sCAQDPHSsHmj845k4lvKbWjys3eh78MKKEFyTXLQgYhOlsGAEAigQY2ZiSum52
 nwdIgpOOADNt0Iq6yXuLsmn9xvY9bAU=
 =HjCl
 -----END PGP SIGNATURE-----

Merge tag 'cxl-for-6.8' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl

Pull CXL (Compute Express Link) updates from Dan Williams:
 "The bulk of this update is support for enumerating the performance
  capabilities of CXL memory targets and connecting that to a platform
  CXL memory QoS class. Some follow-on work remains to hook up this data
  into core-mm policy, but that is saved for v6.9.

  The next significant update is unifying how CXL event records (things
  like background scrub errors) are processed between so called
  "firmware first" and native error record retrieval. The CXL driver
  handler that processes the record retrieved from the device mailbox is
  now the handler for that same record format coming from an EFI/ACPI
  notification source.

  This also contains miscellaneous feature updates, like Get Timestamp,
  and other fixups.

  Summary:

   - Add support for parsing the Coherent Device Attribute Table (CDAT)

   - Add support for calculating a platform CXL QoS class from CDAT data

   - Unify the tracing of EFI CXL Events with native CXL Events.

   - Add Get Timestamp support

   - Miscellaneous cleanups and fixups"

* tag 'cxl-for-6.8' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl: (41 commits)
  cxl/core: use sysfs_emit() for attr's _show()
  cxl/pci: Register for and process CPER events
  PCI: Introduce cleanup helpers for device reference counts and locks
  acpi/ghes: Process CXL Component Events
  cxl/events: Create a CXL event union
  cxl/events: Separate UUID from event structures
  cxl/events: Remove passing a UUID to known event traces
  cxl/events: Create common event UUID defines
  cxl/events: Promote CXL event structures to a core header
  cxl: Refactor to use __free() for cxl_root allocation in cxl_endpoint_port_probe()
  cxl: Refactor to use __free() for cxl_root allocation in cxl_find_nvdimm_bridge()
  cxl: Fix device reference leak in cxl_port_perf_data_calculate()
  cxl: Convert find_cxl_root() to return a 'struct cxl_root *'
  cxl: Introduce put_cxl_root() helper
  cxl/port: Fix missing target list lock
  cxl/port: Fix decoder initialization when nr_targets > interleave_ways
  cxl/region: fix x9 interleave typo
  cxl/trace: Pass UUID explicitly to event traces
  cxl/region: use %pap format to print resource_size_t
  cxl/region: Add dev_dbg() detail on failure to allocate HPA space
  ...
2024-01-18 16:22:43 -08:00
Palmer Dabbelt
448857ec53
Merge patch series "RISC-V: Disable DWARF5 with known broken LLVM versions"
Nathan Chancellor <nathan@kernel.org> says:

This series disables DWARF5 for LLVM versions where it is known to be
broken due to linker relaxation.

* b4-shazam-merge:
  lib/Kconfig.debug: Update AS_HAS_NON_CONST_LEB128 comment and name
  riscv: Restrict DWARF5 when building with LLVM to known working versions
  riscv: Hoist linker relaxation disabling logic into Kconfig

Link: bbc0f99f3b
Link: https://lore.kernel.org/r/20231205-riscv-restrict-dwarf5-llvm-v2-0-aedf00a382ac@kernel.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-01-17 18:08:30 -08:00
Nathan Chancellor
a4426641f0
lib/Kconfig.debug: Update AS_HAS_NON_CONST_LEB128 comment and name
Fangrui noted that the comment around CONFIG_AS_HAS_NON_CONST_LEB128
could be made more accurate because explicit .sleb128 directives are not
emitted, only .uleb128 directives are. Rename the symbol to
CONFIG_AS_HAS_NON_CONST_ULEB128 as a result.

Further clarifications include replacing "symbol deltas" with the more
accurate "label differences", noting that this issue has been resolved
in newer binutils (2.41+), and it only occurs when a port uses RISC-V
style linker relaxation.

Suggested-by: Fangrui Song <maskray@google.com>
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Reviewed-by: Charlie Jenkins <charlie@rivosinc.com>
Link: https://lore.kernel.org/r/20231205-riscv-restrict-dwarf5-llvm-v2-3-aedf00a382ac@kernel.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-01-17 18:08:28 -08:00
Nathan Chancellor
ae84ff9a14
riscv: Restrict DWARF5 when building with LLVM to known working versions
LLVM prior to 18.0.0 would generate incorrect debug info for DWARF5 due
to linker relaxation, which was worked around in clang by defaulting
RISC-V to DWARF4 [1]. Unfortunately, this workaround does not work for
the kernel because the DWARF version can be independently changed from
the default in Kconfig.

Do not allow DWARF5 to be selected for RISC-V when using linker
relaxation (ld.lld >= 15.0.0) and a version of LLVM that does not have
the fixes (the integrated assembler [2] and ld.lld [3] < 18.0.0)
necessary to generate the correct debug info.

Link: bbc0f99f3b [1]
Link: 1df5ea29b4 [2]
Link: 7ffabb61a5 [3]
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Reviewed-by: Fangrui Song <maskray@google.com>
Link: https://lore.kernel.org/r/20231205-riscv-restrict-dwarf5-llvm-v2-2-aedf00a382ac@kernel.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-01-17 18:08:27 -08:00
Charlie Jenkins
6f4c45cbcb
kunit: Add tests for csum_ipv6_magic and ip_fast_csum
Supplement existing checksum tests with tests for csum_ipv6_magic and
ip_fast_csum.

Signed-off-by: Charlie Jenkins <charlie@rivosinc.com>
Link: https://lore.kernel.org/r/20240108-optimize_checksum-v15-5-1c50de5f2167@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-01-17 17:52:33 -08:00
Linus Torvalds
7f5e47f785 17 hotfixes. 10 address post-6.7 issues and the other 7 are cc:stable.
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCZaHe5gAKCRDdBJ7gKXxA
 jrAiAQCYZQuwsNVyGJUuPD/GGQzqVUZNpWcuYwMXXAi6dO5rSAD+LDeFviun2K52
 uHCz4iRq5EwNLA+MbdHtAnQzr+e5CQ8=
 =Jjkw
 -----END PGP SIGNATURE-----

Merge tag 'mm-hotfixes-stable-2024-01-12-16-52' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull misc hotfixes from Andrew Morton:
 "For once not mostly MM-related.

  17 hotfixes. 10 address post-6.7 issues and the other 7 are cc:stable"

* tag 'mm-hotfixes-stable-2024-01-12-16-52' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
  userfaultfd: avoid huge_zero_page in UFFDIO_MOVE
  MAINTAINERS: add entry for shrinker
  selftests: mm: hugepage-vmemmap fails on 64K page size systems
  mm/memory_hotplug: fix memmap_on_memory sysfs value retrieval
  mailmap: switch email for Tanzir Hasan
  mailmap: add old address mappings for Randy
  kernel/crash_core.c: make __crash_hotplug_lock static
  efi: disable mirror feature during crashkernel
  kexec: do syscore_shutdown() in kernel_kexec
  mailmap: update entry for Manivannan Sadhasivam
  fs/proc/task_mmu: move mmu notification mechanism inside mm lock
  mm: zswap: switch maintainers to recently active developers and reviewers
  scripts/decode_stacktrace.sh: optionally use LLVM utilities
  kasan: avoid resetting aux_lock
  lib/Kconfig.debug: disable CONFIG_DEBUG_INFO_BTF for Hexagon
  MAINTAINERS: update LTP maintainers
  kdump: defer the insertion of crashkernel resources
2024-01-17 09:31:36 -08:00
Kemeng Shi
5c7fa5c8ad sbitmap: remove stale comment in sbq_calc_wake_batch
After commit 106397376c ("sbitmap: fix batching wakeup"), we may wake
up more than one queue for each batch. Just remove stale comment that
we wake up only one queue for each batch.

Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com>
Link: https://lore.kernel.org/r/20240115145626.665562-1-shikemeng@huaweicloud.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2024-01-15 07:23:50 -07:00
Nathan Chancellor
aaa2c9a97c lib/Kconfig.debug: disable CONFIG_DEBUG_INFO_BTF for Hexagon
pahole, which generates BTF, relies on elfutils to process DWARF debug
info.  Because kernel modules are relocatable files, elfutils needs to
resolve relocations when processing the DWARF .debug sections.

Hexagon is not supported in binutils or elfutils, so elfutils is unable to
process relocations in kernel modules, causing pahole to crash during BTF
generation.

Do not allow CONFIG_DEBUG_INFO_BTF to be selected for Hexagon until it is
supported in elfutils, so that there are no more cryptic build failures
during BTF generation.

Link: https://lkml.kernel.org/r/20240105-hexagon-disable-btf-v1-1-ddab073e7f74@kernel.org
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202312192107.wMIKiZWw-lkp@intel.com/
Suggested-by: Nick Desaulniers <ndesaulniers@google.com>
Acked-by: Brian Cain <bcain@quicinc.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-01-12 15:20:45 -08:00
Linus Torvalds
3e7aeb78ab Networking changes for 6.8.
Core & protocols
 ----------------
 
  - Analyze and reorganize core networking structs (socks, netdev,
    netns, mibs) to optimize cacheline consumption and set up
    build time warnings to safeguard against future header changes.
    This improves TCP performances with many concurrent connections
    up to 40%.
 
  - Add page-pool netlink-based introspection, exposing the
    memory usage and recycling stats. This helps indentify
    bad PP users and possible leaks.
 
  - Refine TCP/DCCP source port selection to no longer favor even
    source port at connect() time when IP_LOCAL_PORT_RANGE is set.
    This lowers the time taken by connect() for hosts having
    many active connections to the same destination.
 
  - Refactor the TCP bind conflict code, shrinking related socket
    structs.
 
  - Refactor TCP SYN-Cookie handling, as a preparation step to
    allow arbitrary SYN-Cookie processing via eBPF.
 
  - Tune optmem_max for 0-copy usage, increasing the default value
    to 128KB and namespecifying it.
 
  - Allow coalescing for cloned skbs coming from page pools, improving
    RX performances with some common configurations.
 
  - Reduce extension header parsing overhead at GRO time.
 
  - Add bridge MDB bulk deletion support, allowing user-space to
    request the deletion of matching entries.
 
  - Reorder nftables struct members, to keep data accessed by the
    datapath first.
 
  - Introduce TC block ports tracking and use. This allows supporting
    multicast-like behavior at the TC layer.
 
  - Remove UAPI support for retired TC qdiscs (dsmark, CBQ and ATM) and
    classifiers (RSVP and tcindex).
 
  - More data-race annotations.
 
  - Extend the diag interface to dump TCP bound-only sockets.
 
  - Conditional notification of events for TC qdisc class and actions.
 
  - Support for WPAN dynamic associations with nearby devices, to form
    a sub-network using a specific PAN ID.
 
  - Implement SMCv2.1 virtual ISM device support.
 
  - Add support for Batman-avd mulicast packet type.
 
 BPF
 ---
 
  - Tons of verifier improvements:
    - BPF register bounds logic and range support along with a large
      test suite
    - log improvements
    - complete precision tracking support for register spills
    - track aligned STACK_ZERO cases as imprecise spilled registers. It
      improves the verifier "instructions processed" metric from single
      digit to 50-60% for some programs
    - support for user's global BPF subprogram arguments with few
      commonly requested annotations for a better developer experience
    - support tracking of BPF_JNE which helps cases when the compiler
      transforms (unsigned) "a > 0" into "if a == 0 goto xxx" and the
      like
    - several fixes
 
  - Add initial TX metadata implementation for AF_XDP with support in
    mlx5 and stmmac drivers. Two types of offloads are supported right
    now, that is, TX timestamp and TX checksum offload.
 
  - Fix kCFI bugs in BPF all forms of indirect calls from BPF into
    kernel and from kernel into BPF work with CFI enabled. This allows
    BPF to work with CONFIG_FINEIBT=y.
 
  - Change BPF verifier logic to validate global subprograms lazily
    instead of unconditionally before the main program, so they can be
    guarded using BPF CO-RE techniques.
 
  - Support uid/gid options when mounting bpffs.
 
  - Add a new kfunc which acquires the associated cgroup of a task
    within a specific cgroup v1 hierarchy where the latter is identified
    by its id.
 
  - Extend verifier to allow bpf_refcount_acquire() of a map value field
    obtained via direct load which is a use-case needed in sched_ext.
 
  - Add BPF link_info support for uprobe multi link along with bpftool
    integration for the latter.
 
  - Support for VLAN tag in XDP hints.
 
  - Remove deprecated bpfilter kernel leftovers given the project
    is developed in user-space (https://github.com/facebook/bpfilter).
 
 Misc
 ----
 
  - Support for parellel TC self-tests execution.
 
  - Increase MPTCP self-tests coverage.
 
  - Updated the bridge documentation, including several so-far
    undocumented features.
 
  - Convert all the net self-tests to run in unique netns, to
    avoid random failures due to conflict and allow concurrent
    runs.
 
  - Add TCP-AO self-tests.
 
  - Add kunit tests for both cfg80211 and mac80211.
 
  - Autogenerate Netlink families documentation from YAML spec.
 
  - Add yml-gen support for fixed headers and recursive nests, the
    tool can now generate user-space code for all genetlink families
    for which we have specs.
 
  - A bunch of additional module descriptions fixes.
 
  - Catch incorrect freeing of pages belonging to a page pool.
 
 Driver API
 ----------
 
  - Rust abstractions for network PHY drivers; do not cover yet the
    full C API, but already allow implementing functional PHY drivers
    in rust.
 
  - Introduce queue and NAPI support in the netdev Netlink interface,
    allowing complete access to the device <> NAPIs <> queues
    relationship.
 
  - Introduce notifications filtering for devlink to allow control
    application scale to thousands of instances.
 
  - Improve PHY validation, requesting rate matching information for
    each ethtool link mode supported by both the PHY and host.
 
  - Add support for ethtool symmetric-xor RSS hash.
 
  - ACPI based Wifi band RFI (WBRF) mitigation feature for the AMD
    platform.
 
  - Expose pin fractional frequency offset value over new DPLL generic
    netlink attribute.
 
  - Convert older drivers to platform remove callback returning void.
 
  - Add support for PHY package MMD read/write.
 
 New hardware / drivers
 ----------------------
 
  - Ethernet:
    - Octeon CN10K devices
    - Broadcom 5760X P7
    - Qualcomm SM8550 SoC
    - Texas Instrument DP83TG720S PHY
 
  - Bluetooth:
    - IMC Networks Bluetooth radio
 
 Removed
 -------
 
  - WiFi:
    - libertas 16-bit PCMCIA support
    - Atmel at76c50x drivers
    - HostAP ISA/PCMCIA style 802.11b driver
    - zd1201 802.11b USB dongles
    - Orinoco ISA/PCMCIA 802.11b driver
    - Aviator/Raytheon driver
    - Planet WL3501 driver
    - RNDIS USB 802.11b driver
 
 Drivers
 -------
 
  - Ethernet high-speed NICs:
    - Intel (100G, ice, idpf):
      - allow one by one port representors creation and removal
      - add temperature and clock information reporting
      - add get/set for ethtool's header split ringparam
      - add again FW logging
      - adds support switchdev hardware packet mirroring
      - iavf: implement symmetric-xor RSS hash
      - igc: add support for concurrent physical and free-running timers
      - i40e: increase the allowable descriptors
    - nVidia/Mellanox:
      - Preparation for Socket-Direct multi-dev netdev. That will allow
        in future releases combining multiple PFs devices attached to
        different NUMA nodes under the same netdev
    - Broadcom (bnxt):
      - TX completion handling improvements
      - add basic ntuple filter support
      - reduce MSIX vectors usage for MQPRIO offload
      - add VXLAN support, USO offload and TX coalesce completion for P7
    - Marvell Octeon EP:
      - xmit-more support
      - add PF-VF mailbox support and use it for FW notifications for VFs
    - Wangxun (ngbe/txgbe):
      - implement ethtool functions to operate pause param, ring param,
        coalesce channel number and msglevel
    - Netronome/Corigine (nfp):
      - add flow-steering support
      - support UDP segmentation offload
 
  - Ethernet NICs embedded, slower, virtual:
    - Xilinx AXI: remove duplicate DMA code adopting the dma engine driver
    - stmmac: add support for HW-accelerated VLAN stripping
    - TI AM654x sw: add mqprio, frame preemption & coalescing
    - gve: add support for non-4k page sizes.
    - virtio-net: support dynamic coalescing moderation
 
  - nVidia/Mellanox Ethernet datacenter switches:
    - allow firmware upgrade without a reboot
    - more flexible support for bridge flooding via the compressed
      FID flooding mode
 
  - Ethernet embedded switches:
    - Microchip:
      - fine-tune flow control and speed configurations in KSZ8xxx
      - KSZ88X3: enable setting rmii reference
    - Renesas:
      - add jumbo frames support
    - Marvell:
      - 88E6xxx: add "eth-mac" and "rmon" stats support
 
  - Ethernet PHYs:
    - aquantia: add firmware load support
    - at803x: refactor the driver to simplify adding support for more
      chip variants
    - NXP C45 TJA11xx: Add MACsec offload support
 
  - Wifi:
    - MediaTek (mt76):
      - NVMEM EEPROM improvements
      - mt7996 Extremely High Throughput (EHT) improvements
      - mt7996 Wireless Ethernet Dispatcher (WED) support
      - mt7996 36-bit DMA support
    - Qualcomm (ath12k):
      - support for a single MSI vector
      - WCN7850: support AP mode
    - Intel (iwlwifi):
      - new debugfs file fw_dbg_clear
      - allow concurrent P2P operation on DFS channels
 
  - Bluetooth:
    - QCA2066: support HFP offload
    - ISO: more broadcast-related improvements
    - NXP: better recovery in case receiver/transmitter get out of sync
 
 Signed-off-by: Paolo Abeni <pabeni@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 
 iQJGBAABCAAwFiEEg1AjqC77wbdLX2LbKSR5jcyPE6QFAmWdamsSHHBhYmVuaUBy
 ZWRoYXQuY29tAAoJECkkeY3MjxOkGC4P/2xjLzdw22ckSssuE9ORbGko9SNjnqHk
 PQh1E+26BHiCg5KB8VvzMsL78E79MRNXEattSW+1g7dhCvln3oi+Vd0WkdRkgt35
 98Iv18zLbbwFAJeyKvmLAPAkQkMLtVj19QILBBRrugF+egEZgVSE3JBcTAiKv2ZQ
 HzkabA171Ri6LpCcEEtY5XuaKvimGnGzF8YMFf8rX0wtqd2p5kbY9aMe47WAGxvU
 Vf9548XvH+A5yVH2/4/gujtUOpA/RHuhuCMb+oo0cZ+VCC1x9MGzoXzj6r87OTkf
 k2W1whNzcGoin92f+9Lk1JYMuiGKBH4QVaDdNXJnYFSJWPTE7RvRsPzYTSD4/GzK
 yEZbzSJXpy/2vDQm16NoAxl7evRs8Sorzkw4LQRviZHI/5SAkK2ZQiCK5CO8QSYy
 C1LELcV5kn6Foe24xWnrWLjAGug9oJnYoGPMU5gvPmFJMvUMXqm5rmbBgUWL5Rxw
 q1M6gVzabCyWUy6z2G2vaqW2ZntNVvCkdsLtIX0XZkcTzNoP0MA+TuhyGz4wbiuo
 PeyQp/mbGnDgCYggqKIA0YWrTVxkhFrKN520cbO8qXBQytV9oFbM/0/+C0/r/5WX
 pL1JVzLrh6l5ME7EIQfha8UOF9j8q4ueSwb40P3AR2NaZiDABM0zfUZ6+sx+91WF
 ucqPEcZB5cRE
 =1bW6
 -----END PGP SIGNATURE-----

Merge tag 'net-next-6.8' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next

Pull networking updates from Paolo Abeni:
 "The most interesting thing is probably the networking structs
  reorganization and a significant amount of changes is around
  self-tests.

  Core & protocols:

   - Analyze and reorganize core networking structs (socks, netdev,
     netns, mibs) to optimize cacheline consumption and set up build
     time warnings to safeguard against future header changes

     This improves TCP performances with many concurrent connections up
     to 40%

   - Add page-pool netlink-based introspection, exposing the memory
     usage and recycling stats. This helps indentify bad PP users and
     possible leaks

   - Refine TCP/DCCP source port selection to no longer favor even
     source port at connect() time when IP_LOCAL_PORT_RANGE is set. This
     lowers the time taken by connect() for hosts having many active
     connections to the same destination

   - Refactor the TCP bind conflict code, shrinking related socket
     structs

   - Refactor TCP SYN-Cookie handling, as a preparation step to allow
     arbitrary SYN-Cookie processing via eBPF

   - Tune optmem_max for 0-copy usage, increasing the default value to
     128KB and namespecifying it

   - Allow coalescing for cloned skbs coming from page pools, improving
     RX performances with some common configurations

   - Reduce extension header parsing overhead at GRO time

   - Add bridge MDB bulk deletion support, allowing user-space to
     request the deletion of matching entries

   - Reorder nftables struct members, to keep data accessed by the
     datapath first

   - Introduce TC block ports tracking and use. This allows supporting
     multicast-like behavior at the TC layer

   - Remove UAPI support for retired TC qdiscs (dsmark, CBQ and ATM) and
     classifiers (RSVP and tcindex)

   - More data-race annotations

   - Extend the diag interface to dump TCP bound-only sockets

   - Conditional notification of events for TC qdisc class and actions

   - Support for WPAN dynamic associations with nearby devices, to form
     a sub-network using a specific PAN ID

   - Implement SMCv2.1 virtual ISM device support

   - Add support for Batman-avd mulicast packet type

  BPF:

   - Tons of verifier improvements:
       - BPF register bounds logic and range support along with a large
         test suite
       - log improvements
       - complete precision tracking support for register spills
       - track aligned STACK_ZERO cases as imprecise spilled registers.
         This improves the verifier "instructions processed" metric from
         single digit to 50-60% for some programs
       - support for user's global BPF subprogram arguments with few
         commonly requested annotations for a better developer
         experience
       - support tracking of BPF_JNE which helps cases when the compiler
         transforms (unsigned) "a > 0" into "if a == 0 goto xxx" and the
         like
       - several fixes

   - Add initial TX metadata implementation for AF_XDP with support in
     mlx5 and stmmac drivers. Two types of offloads are supported right
     now, that is, TX timestamp and TX checksum offload

   - Fix kCFI bugs in BPF all forms of indirect calls from BPF into
     kernel and from kernel into BPF work with CFI enabled. This allows
     BPF to work with CONFIG_FINEIBT=y

   - Change BPF verifier logic to validate global subprograms lazily
     instead of unconditionally before the main program, so they can be
     guarded using BPF CO-RE techniques

   - Support uid/gid options when mounting bpffs

   - Add a new kfunc which acquires the associated cgroup of a task
     within a specific cgroup v1 hierarchy where the latter is
     identified by its id

   - Extend verifier to allow bpf_refcount_acquire() of a map value
     field obtained via direct load which is a use-case needed in
     sched_ext

   - Add BPF link_info support for uprobe multi link along with bpftool
     integration for the latter

   - Support for VLAN tag in XDP hints

   - Remove deprecated bpfilter kernel leftovers given the project is
     developed in user-space (https://github.com/facebook/bpfilter)

  Misc:

   - Support for parellel TC self-tests execution

   - Increase MPTCP self-tests coverage

   - Updated the bridge documentation, including several so-far
     undocumented features

   - Convert all the net self-tests to run in unique netns, to avoid
     random failures due to conflict and allow concurrent runs

   - Add TCP-AO self-tests

   - Add kunit tests for both cfg80211 and mac80211

   - Autogenerate Netlink families documentation from YAML spec

   - Add yml-gen support for fixed headers and recursive nests, the tool
     can now generate user-space code for all genetlink families for
     which we have specs

   - A bunch of additional module descriptions fixes

   - Catch incorrect freeing of pages belonging to a page pool

  Driver API:

   - Rust abstractions for network PHY drivers; do not cover yet the
     full C API, but already allow implementing functional PHY drivers
     in rust

   - Introduce queue and NAPI support in the netdev Netlink interface,
     allowing complete access to the device <> NAPIs <> queues
     relationship

   - Introduce notifications filtering for devlink to allow control
     application scale to thousands of instances

   - Improve PHY validation, requesting rate matching information for
     each ethtool link mode supported by both the PHY and host

   - Add support for ethtool symmetric-xor RSS hash

   - ACPI based Wifi band RFI (WBRF) mitigation feature for the AMD
     platform

   - Expose pin fractional frequency offset value over new DPLL generic
     netlink attribute

   - Convert older drivers to platform remove callback returning void

   - Add support for PHY package MMD read/write

  New hardware / drivers:

   - Ethernet:
       - Octeon CN10K devices
       - Broadcom 5760X P7
       - Qualcomm SM8550 SoC
       - Texas Instrument DP83TG720S PHY

   - Bluetooth:
       - IMC Networks Bluetooth radio

  Removed:

   - WiFi:
       - libertas 16-bit PCMCIA support
       - Atmel at76c50x drivers
       - HostAP ISA/PCMCIA style 802.11b driver
       - zd1201 802.11b USB dongles
       - Orinoco ISA/PCMCIA 802.11b driver
       - Aviator/Raytheon driver
       - Planet WL3501 driver
       - RNDIS USB 802.11b driver

  Driver updates:

   - Ethernet high-speed NICs:
       - Intel (100G, ice, idpf):
          - allow one by one port representors creation and removal
          - add temperature and clock information reporting
          - add get/set for ethtool's header split ringparam
          - add again FW logging
          - adds support switchdev hardware packet mirroring
          - iavf: implement symmetric-xor RSS hash
          - igc: add support for concurrent physical and free-running
            timers
          - i40e: increase the allowable descriptors
       - nVidia/Mellanox:
          - Preparation for Socket-Direct multi-dev netdev. That will
            allow in future releases combining multiple PFs devices
            attached to different NUMA nodes under the same netdev
       - Broadcom (bnxt):
          - TX completion handling improvements
          - add basic ntuple filter support
          - reduce MSIX vectors usage for MQPRIO offload
          - add VXLAN support, USO offload and TX coalesce completion
            for P7
       - Marvell Octeon EP:
          - xmit-more support
          - add PF-VF mailbox support and use it for FW notifications
            for VFs
       - Wangxun (ngbe/txgbe):
          - implement ethtool functions to operate pause param, ring
            param, coalesce channel number and msglevel
       - Netronome/Corigine (nfp):
          - add flow-steering support
          - support UDP segmentation offload

   - Ethernet NICs embedded, slower, virtual:
       - Xilinx AXI: remove duplicate DMA code adopting the dma engine
         driver
       - stmmac: add support for HW-accelerated VLAN stripping
       - TI AM654x sw: add mqprio, frame preemption & coalescing
       - gve: add support for non-4k page sizes.
       - virtio-net: support dynamic coalescing moderation

   - nVidia/Mellanox Ethernet datacenter switches:
       - allow firmware upgrade without a reboot
       - more flexible support for bridge flooding via the compressed
         FID flooding mode

   - Ethernet embedded switches:
       - Microchip:
          - fine-tune flow control and speed configurations in KSZ8xxx
          - KSZ88X3: enable setting rmii reference
       - Renesas:
          - add jumbo frames support
       - Marvell:
          - 88E6xxx: add "eth-mac" and "rmon" stats support

   - Ethernet PHYs:
       - aquantia: add firmware load support
       - at803x: refactor the driver to simplify adding support for more
         chip variants
       - NXP C45 TJA11xx: Add MACsec offload support

   - Wifi:
       - MediaTek (mt76):
          - NVMEM EEPROM improvements
          - mt7996 Extremely High Throughput (EHT) improvements
          - mt7996 Wireless Ethernet Dispatcher (WED) support
          - mt7996 36-bit DMA support
       - Qualcomm (ath12k):
          - support for a single MSI vector
          - WCN7850: support AP mode
       - Intel (iwlwifi):
          - new debugfs file fw_dbg_clear
          - allow concurrent P2P operation on DFS channels

   - Bluetooth:
       - QCA2066: support HFP offload
       - ISO: more broadcast-related improvements
       - NXP: better recovery in case receiver/transmitter get out of sync"

* tag 'net-next-6.8' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1714 commits)
  lan78xx: remove redundant statement in lan78xx_get_eee
  lan743x: remove redundant statement in lan743x_ethtool_get_eee
  bnxt_en: Fix RCU locking for ntuple filters in bnxt_rx_flow_steer()
  bnxt_en: Fix RCU locking for ntuple filters in bnxt_srxclsrldel()
  bnxt_en: Remove unneeded variable in bnxt_hwrm_clear_vnic_filter()
  tcp: Revert no longer abort SYN_SENT when receiving some ICMP
  Revert "mlx5 updates 2023-12-20"
  Revert "net: stmmac: Enable Per DMA Channel interrupt"
  ipvlan: Remove usage of the deprecated ida_simple_xx() API
  ipvlan: Fix a typo in a comment
  net/sched: Remove ipt action tests
  net: stmmac: Use interrupt mode INTM=1 for per channel irq
  net: stmmac: Add support for TX/RX channel interrupt
  net: stmmac: Make MSI interrupt routine generic
  dt-bindings: net: snps,dwmac: per channel irq
  net: phy: at803x: make read_status more generic
  net: phy: at803x: add support for cdt cross short test for qca808x
  net: phy: at803x: refactor qca808x cable test get status function
  net: phy: at803x: generalize cdt fault length function
  net: ethernet: cortina: Drop TSO support
  ...
2024-01-11 10:07:29 -08:00
Linus Torvalds
de927f6c0b s390 updates for 6.8 merge window
- Add machine variable capacity information to /proc/sysinfo.
 
 - Limit the waste of page tables and always align vmalloc area size
   and base address on segment boundary.
 
 - Fix a memory leak when an attempt to register interruption sub class
   (ISC) for the adjunct-processor (AP) guest failed.
 
 - Reset response code AP_RESPONSE_INVALID_GISA to understandable
   by guest AP_RESPONSE_INVALID_ADDRESS in response to a failed
   interruption sub class (ISC) registration attempt.
 
 - Improve reaction to adjunct-processor (AP) AP_RESPONSE_OTHERWISE_CHANGED
   response code when enabling interrupts on behalf of a guest.
 
 - Fix incorrect sysfs 'status' attribute of adjunct-processor (AP) queue
   device bound to the vfio_ap device driver when the mediated device is
   attached to a guest, but the queue device is not passed through.
 
 - Rework struct ap_card to hold the whole adjunct-processor (AP) card
   hardware information. As result, all the ugly bit checks are replaced
   by simple evaluations of the required bit fields.
 
 - Improve handling of some weird scenarios between service element (SE)
   host and SE guest with adjunct-processor (AP) pass-through support.
 
 - Change local_ctl_set_bit() and local_ctl_clear_bit() so they return the
   previous value of the to be changed control register. This is useful if
   a bit is only changed temporarily and the previous content needs to be
   restored.
 
 - The kernel starts with machine checks disabled and is expected to enable
   it once trap_init() is called. However the implementation allows machine
   checks early. Consistently enable it in trap_init() only.
 
 - local_mcck_disable() and local_mcck_enable() assume that machine checks
   are always enabled. Instead implement and use local_mcck_save() and
   local_mcck_restore() to disable machine checks and restore the previous
   state.
 
 - Modification of floating point control (FPC) register of a traced
   process using ptrace interface may lead to corruption of the FPC
   register of the tracing process. Fix this.
 
 - kvm_arch_vcpu_ioctl_set_fpu() allows to set the floating point control
   (FPC) register in vCPU, but may lead to corruption of the FPC register
   of the host process. Fix this.
 
 - Use READ_ONCE() to read a vCPU floating point register value from the
   memory mapped area. This avoids that, depending on code generation,
   a different value is tested for validity than the one that is used.
 
 - Get rid of test_fp_ctl(), since it is quite subtle to use it correctly.
   Instead copy a new floating point control register value into its save
   area and test the validity of the new value when loading it.
 
 - Remove superfluous save_fpu_regs() call.
 
 - Remove s390 support for ARCH_WANTS_DYNAMIC_TASK_STRUCT. All machines
   provide the vector facility since many years and the need to make the
   task structure size dependent on the vector facility does not exist.
 
 - Remove the "novx" kernel command line option, as the vector code runs
   without any problems since many years.
 
 - Add the vector facility to the z13 architecture level set (ALS).
   All hypervisors support the vector facility since many years.
   This allows compile time optimizations of the kernel.
 
 - Get rid of MACHINE_HAS_VX and replace it with cpu_has_vx(). As result,
   the compiled code will have less runtime checks and less code.
 
 - Convert pgste_get_lock() and pgste_set_unlock() ASM inlines to C.
 
 - Convert the struct subchannel spinlock from pointer to member.
 -----BEGIN PGP SIGNATURE-----
 
 iI0EABYIADUWIQQrtrZiYVkVzKQcYivNdxKlNrRb8AUCZZxnchccYWdvcmRlZXZA
 bGludXguaWJtLmNvbQAKCRDNdxKlNrRb8PyGAP9Z0Z1Pm3hPf24M4FgY5uuRqBHo
 VoiuohOZRGONwifnsgEAmN/3ba/7d/j0iMwpUdUFouiqtwTUihh15wYx1sH2IA8=
 =F6YD
 -----END PGP SIGNATURE-----

Merge tag 's390-6.8-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux

Pull s390 updates from Alexander Gordeev:

 - Add machine variable capacity information to /proc/sysinfo.

 - Limit the waste of page tables and always align vmalloc area size and
   base address on segment boundary.

 - Fix a memory leak when an attempt to register interruption sub class
   (ISC) for the adjunct-processor (AP) guest failed.

 - Reset response code AP_RESPONSE_INVALID_GISA to understandable by
   guest AP_RESPONSE_INVALID_ADDRESS in response to a failed
   interruption sub class (ISC) registration attempt.

 - Improve reaction to adjunct-processor (AP)
   AP_RESPONSE_OTHERWISE_CHANGED response code when enabling interrupts
   on behalf of a guest.

 - Fix incorrect sysfs 'status' attribute of adjunct-processor (AP)
   queue device bound to the vfio_ap device driver when the mediated
   device is attached to a guest, but the queue device is not passed
   through.

 - Rework struct ap_card to hold the whole adjunct-processor (AP) card
   hardware information. As result, all the ugly bit checks are replaced
   by simple evaluations of the required bit fields.

 - Improve handling of some weird scenarios between service element (SE)
   host and SE guest with adjunct-processor (AP) pass-through support.

 - Change local_ctl_set_bit() and local_ctl_clear_bit() so they return
   the previous value of the to be changed control register. This is
   useful if a bit is only changed temporarily and the previous content
   needs to be restored.

 - The kernel starts with machine checks disabled and is expected to
   enable it once trap_init() is called. However the implementation
   allows machine checks early. Consistently enable it in trap_init()
   only.

 - local_mcck_disable() and local_mcck_enable() assume that machine
   checks are always enabled. Instead implement and use
   local_mcck_save() and local_mcck_restore() to disable machine checks
   and restore the previous state.

 - Modification of floating point control (FPC) register of a traced
   process using ptrace interface may lead to corruption of the FPC
   register of the tracing process. Fix this.

 - kvm_arch_vcpu_ioctl_set_fpu() allows to set the floating point
   control (FPC) register in vCPU, but may lead to corruption of the FPC
   register of the host process. Fix this.

 - Use READ_ONCE() to read a vCPU floating point register value from the
   memory mapped area. This avoids that, depending on code generation, a
   different value is tested for validity than the one that is used.

 - Get rid of test_fp_ctl(), since it is quite subtle to use it
   correctly. Instead copy a new floating point control register value
   into its save area and test the validity of the new value when
   loading it.

 - Remove superfluous save_fpu_regs() call.

 - Remove s390 support for ARCH_WANTS_DYNAMIC_TASK_STRUCT. All machines
   provide the vector facility since many years and the need to make the
   task structure size dependent on the vector facility does not exist.

 - Remove the "novx" kernel command line option, as the vector code runs
   without any problems since many years.

 - Add the vector facility to the z13 architecture level set (ALS). All
   hypervisors support the vector facility since many years. This allows
   compile time optimizations of the kernel.

 - Get rid of MACHINE_HAS_VX and replace it with cpu_has_vx(). As
   result, the compiled code will have less runtime checks and less
   code.

 - Convert pgste_get_lock() and pgste_set_unlock() ASM inlines to C.

 - Convert the struct subchannel spinlock from pointer to member.

* tag 's390-6.8-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (24 commits)
  Revert "s390: update defconfigs"
  s390/cio: make sch->lock spinlock pointer a member
  s390: update defconfigs
  s390/mm: convert pgste locking functions to C
  s390/fpu: get rid of MACHINE_HAS_VX
  s390/als: add vector facility to z13 architecture level set
  s390/fpu: remove "novx" option
  s390/fpu: remove ARCH_WANTS_DYNAMIC_TASK_STRUCT support
  KVM: s390: remove superfluous save_fpu_regs() call
  s390/fpu: get rid of test_fp_ctl()
  KVM: s390: use READ_ONCE() to read fpc register value
  KVM: s390: fix setting of fpc register
  s390/ptrace: handle setting of fpc register correctly
  s390/nmi: implement and use local_mcck_save() / local_mcck_restore()
  s390/nmi: consistently enable machine checks in trap_init()
  s390/ctlreg: return old register contents when changing bits
  s390/ap: handle outband SE bind state change
  s390/ap: store TAPQ hwinfo in struct ap_card
  s390/vfio-ap: fix sysfs status attribute for AP queue devices
  s390/vfio-ap: improve reaction to response code 07 from PQAP(AQIC) command
  ...
2024-01-10 18:18:20 -08:00
Linus Torvalds
a05aea98d4 sysctl-6.8-rc1
To help make the move of sysctls out of kernel/sysctl.c not incur a size
 penalty sysctl has been changed to allow us to not require the sentinel, the
 final empty element on the sysctl array. Joel Granados has been doing all this
 work. On the v6.6 kernel we got the major infrastructure changes required to
 support this. For v6.7 we had all arch/ and drivers/ modified to remove
 the sentinel. For v6.8-rc1 we get a few more updates for fs/ directory only.
 The kernel/ directory is left but we'll save that for v6.9-rc1 as those patches
 are still being reviewed. After that we then can expect also the removal of the
 no longer needed check for procname == NULL.
 
 Let us recap the purpose of this work:
 
   - this helps reduce the overall build time size of the kernel and run time
     memory consumed by the kernel by about ~64 bytes per array
   - the extra 64-byte penalty is no longer inncurred now when we move sysctls
     out from kernel/sysctl.c to their own files
 
 Thomas Weißschuh also sent a few cleanups, for v6.9-rc1 we expect to see further
 work by Thomas Weißschuh with the constificatin of the struct ctl_table.
 
 Due to Joel Granados's work, and to help bring in new blood, I have suggested
 for him to become a maintainer and he's accepted. So for v6.9-rc1 I look forward
 to seeing him sent you a pull request for further sysctl changes. This also
 removes Iurii Zaikin as a maintainer as he has moved on to other projects and
 has had no time to help at all.
 -----BEGIN PGP SIGNATURE-----
 
 iQJGBAABCgAwFiEENnNq2KuOejlQLZofziMdCjCSiKcFAmWdWDESHG1jZ3JvZkBr
 ZXJuZWwub3JnAAoJEM4jHQowkoinjJAP/jTNNoyzWisvrrvmXqR5txFGLOE+wW6x
 Xv9avuiM+DTHsH/wK8CkXEivwDqYNAZEHU7NEcolS5bJX/ddSRwN9b5aSVlCrUdX
 Ab4rXmpeSCNFp9zNszWJsDuBKIqjvsKw7qGleGtgZ2qAUHbbH30VROLWCggaee50
 wU3icDLdwkasxrcMXy4Sq5dT5wYC4j/QelqBGIkYPT14Arl1im5zqPZ95gmO/s/6
 mdicTAmq+hhAUfUBJBXRKtsvxY6CItxe55Q4fjpncLUJLHUw+VPVNoBKFWJlBwlh
 LO3liKFfakPSkil4/en+/+zuMByd0JBkIzIJa+Kk5kjpbHRhK0RkmU4+Y5G5spWN
 jjLfiv6RxInNaZ8EWQBMfjE95A7PmYDQ4TOH08+OvzdDIi6B0BB5tBGQpG9BnyXk
 YsLg1Uo4CwE/vn1/a9w0rhadjUInvmAryhb/uSJYFz/lmApLm2JUpY3/KstwGetb
 z+HmLstJb24Djkr6pH8DcjhzRBHeWQ5p0b4/6B+v1HqAUuEhdbyw1F2GrDywyF3R
 h/UOAaKLm1+ffdA246o9TejKiDU96qEzzXMaCzPKyestaRZuiyuYEMDhYbvtsMV5
 zIdMJj5HQ+U1KHDv4IN99DEj7+/vjE3f4Sjo+POFpQeQ8/d+fxpFNqXVv449dgnb
 6xEkkxsR0ElM
 =2qBt
 -----END PGP SIGNATURE-----

Merge tag 'sysctl-6.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux

Pull sysctl updates from Luis Chamberlain:
 "To help make the move of sysctls out of kernel/sysctl.c not incur a
  size penalty sysctl has been changed to allow us to not require the
  sentinel, the final empty element on the sysctl array. Joel Granados
  has been doing all this work.

  In the v6.6 kernel we got the major infrastructure changes required to
  support this. For v6.7 we had all arch/ and drivers/ modified to
  remove the sentinel. For v6.8-rc1 we get a few more updates for fs/
  directory only.

  The kernel/ directory is left but we'll save that for v6.9-rc1 as
  those patches are still being reviewed. After that we then can expect
  also the removal of the no longer needed check for procname == NULL.

  Let us recap the purpose of this work:

   - this helps reduce the overall build time size of the kernel and run
     time memory consumed by the kernel by about ~64 bytes per array

   - the extra 64-byte penalty is no longer inncurred now when we move
     sysctls out from kernel/sysctl.c to their own files

  Thomas Weißschuh also sent a few cleanups, for v6.9-rc1 we expect to
  see further work by Thomas Weißschuh with the constificatin of the
  struct ctl_table.

  Due to Joel Granados's work, and to help bring in new blood, I have
  suggested for him to become a maintainer and he's accepted. So for
  v6.9-rc1 I look forward to seeing him sent you a pull request for
  further sysctl changes. This also removes Iurii Zaikin as a maintainer
  as he has moved on to other projects and has had no time to help at
  all"

* tag 'sysctl-6.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux:
  sysctl: remove struct ctl_path
  sysctl: delete unused define SYSCTL_PERM_EMPTY_DIR
  coda: Remove the now superfluous sentinel elements from ctl_table array
  sysctl: Remove the now superfluous sentinel elements from ctl_table array
  fs: Remove the now superfluous sentinel elements from ctl_table array
  cachefiles: Remove the now superfluous sentinel element from ctl_table array
  sysclt: Clarify the results of selftest run
  sysctl: Add a selftest for handling empty dirs
  sysctl: Fix out of bounds access for empty sysctl registers
  MAINTAINERS: Add Joel Granados as co-maintainer for proc sysctl
  MAINTAINERS: remove Iurii Zaikin from proc sysctl
2024-01-10 17:44:36 -08:00
Linus Torvalds
78273df7f6 header cleanups for 6.8
The goal is to get sched.h down to a type only header, so the main thing
 happening in this patchset is splitting out various _types.h headers and
 dependency fixups, as well as moving some things out of sched.h to
 better locations.
 
 This is prep work for the memory allocation profiling patchset which
 adds new sched.h interdepencencies.
 
 Testing - it's been in -next, and fixes from pretty much all
 architectures have percolated in - nothing major.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEKnAFLkS8Qha+jvQrE6szbY3KbnYFAmWfBwwACgkQE6szbY3K
 bnZPwBAAmuRojXaeWxi01IPIOehSGDe68vw44PR9glEMZvxdnZuPOdvE4/+245/L
 bRKU2WBCjBUokUbV9msIShwRkFTZAmEMPNfPAAsFMA+VXeDYHKB+ZRdwTggNAQ+I
 SG6fZgh5m0HsewCDxU8oqVHkjVq4fXn0cy+aL6xLEd9gu67GoBzX2pDieS2Kvy6j
 jnyoKTxFwb+LTQgph0P4EIpq5I2umAsdLwdSR8EJ+8e9NiNvMo1pI00Lx/ntAnFZ
 JftWUJcMy3TQ5u1GkyfQN9y/yThX1bZK5GvmHS9SJ2Dkacaus5d+xaKCHtRuFS1I
 7C6b8PsNgRczUMumBXus44HdlNfNs1yU3lvVxFvBIPE1qC9pYRHrkWIXXIocXLLC
 oxTEJ6B2G3BQZVQgLIA4fOaxMVhmvKffi/aEZLi9vN9VVosd1a6XNKI6KbyRnXFp
 GSs9qDqszhn5I3GYNlDNQTc/8UsRlhPFgS6nS0By6QnvxtGi9QkU2tBRBsXvqwCy
 cLoCYIhc2tvugHvld70dz26umiJ4rnmxGlobStNoigDvIKAIUt1UmIdr1so8P8eH
 xehnL9ZcOX6xnANDL0AqMFFHV6I58CJynhFdUoXfVQf/DWLGX48mpi9LVNsYBzsI
 CAwVOAQ0UjGrpdWmJ9ueY/ABYqg9vRjzaDEXQ+MhAYO55CLaVsg=
 =3tyT
 -----END PGP SIGNATURE-----

Merge tag 'header_cleanup-2024-01-10' of https://evilpiepirate.org/git/bcachefs

Pull header cleanups from Kent Overstreet:
 "The goal is to get sched.h down to a type only header, so the main
  thing happening in this patchset is splitting out various _types.h
  headers and dependency fixups, as well as moving some things out of
  sched.h to better locations.

  This is prep work for the memory allocation profiling patchset which
  adds new sched.h interdepencencies"

* tag 'header_cleanup-2024-01-10' of https://evilpiepirate.org/git/bcachefs: (51 commits)
  Kill sched.h dependency on rcupdate.h
  kill unnecessary thread_info.h include
  Kill unnecessary kernel.h include
  preempt.h: Kill dependency on list.h
  rseq: Split out rseq.h from sched.h
  LoongArch: signal.c: add header file to fix build error
  restart_block: Trim includes
  lockdep: move held_lock to lockdep_types.h
  sem: Split out sem_types.h
  uidgid: Split out uidgid_types.h
  seccomp: Split out seccomp_types.h
  refcount: Split out refcount_types.h
  uapi/linux/resource.h: fix include
  x86/signal: kill dependency on time.h
  syscall_user_dispatch.h: split out *_types.h
  mm_types_task.h: Trim dependencies
  Split out irqflags_types.h
  ipc: Kill bogus dependency on spinlock.h
  shm: Slim down dependencies
  workqueue: Split out workqueue_types.h
  ...
2024-01-10 16:43:55 -08:00
Linus Torvalds
0cb552aa97 This update includes the following changes:
API:
 
 - Add incremental lskcipher/skcipher processing.
 
 Algorithms:
 
 - Remove SHA1 from drbg.
 - Remove CFB and OFB.
 
 Drivers:
 
 - Add comp high perf mode configuration in hisilicon/zip.
 - Add support for 420xx devices in qat.
 - Add IAA Compression Accelerator driver.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEn51F/lCuNhUwmDeSxycdCkmxi6cFAmWdxR4ACgkQxycdCkmx
 i6fAjg//SqOwxeUYWpT4KdMCxGMn7U9iE3wJeX8nqfma3a62Wt2soey7H3GB9G7v
 gEh0OraOKIGeBtS8giIX83SZJOirMlgeE2tngxMmR9O95EUNR0XGnywF/emyt96z
 WcSN1IrRZ8qQzTASBF0KpV2Ir5mNzBiOwU9tVHIztROufA4C1fwKl7yhPM67C3MU
 88vf1R+ZeWUbNbzQNC8oYIqU11dcNaMNhOVPiZCECKbIR6LqwUf3Swexz+HuPR/D
 WTSrb4J3Eeg77SMhI959/Hi53WeEyVW1vWYAVMgfTEFw6PESiOXyPeImfzUMFos6
 fFYIAoQzoG5GlQeYwLLSoZAwtfY+f7gTNoaE+bnPk5317EFzFDijaXrkjjVKqkS2
 OOBfxrMMIGNmxp7pPkt6HPnIvGNTo+SnbAdVIm6M3EN1K+BTGrj7/CTJkcT6XSyK
 nCBL6nbP7zMB1GJfCFGPvlIdW4oYnAfB1Q5YJ9tzYbEZ0t5NWxDKZ45RnM9xQp4Y
 2V1zdfALdqmGRKBWgyUcqp1T4/AYRU0+WaQxz7gHw3BPR4QmfVLPRqiiR7OT0Z+P
 XFotOYD3epVXS1OUyZdLBn5+FXLnRd1uylQ+j8FNfnddr4Nr+tH1J6edK71NMvXG
 Tj7p5rP5bbgvVkD43ywsVnCI0w+9NS55mH5UP2Y4fSLS6p2tJAw=
 =yMmO
 -----END PGP SIGNATURE-----

Merge tag 'v6.8-p1' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6

Pull crypto updates from Herbert Xu:
 "API:
   - Add incremental lskcipher/skcipher processing

  Algorithms:
   - Remove SHA1 from drbg
   - Remove CFB and OFB

  Drivers:
   - Add comp high perf mode configuration in hisilicon/zip
   - Add support for 420xx devices in qat
   - Add IAA Compression Accelerator driver"

* tag 'v6.8-p1' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (172 commits)
  crypto: iaa - Account for cpu-less numa nodes
  crypto: scomp - fix req->dst buffer overflow
  crypto: sahara - add support for crypto_engine
  crypto: sahara - remove error message for bad aes request size
  crypto: sahara - remove unnecessary NULL assignments
  crypto: sahara - remove 'active' flag from sahara_aes_reqctx struct
  crypto: sahara - use dev_err_probe()
  crypto: sahara - use devm_clk_get_enabled()
  crypto: sahara - use BIT() macro
  crypto: sahara - clean up macro indentation
  crypto: sahara - do not resize req->src when doing hash operations
  crypto: sahara - fix processing hash requests with req->nbytes < sg->length
  crypto: sahara - improve error handling in sahara_sha_process()
  crypto: sahara - fix wait_for_completion_timeout() error handling
  crypto: sahara - fix ahash reqsize
  crypto: sahara - handle zero-length aes requests
  crypto: skcipher - remove excess kerneldoc members
  crypto: shash - remove excess kerneldoc members
  crypto: qat - generate dynamically arbiter mappings
  crypto: qat - add support for ring pair level telemetry
  ...
2024-01-10 12:23:43 -08:00
Linus Torvalds
41daf06ea1 linux_kselftest-kunit-6.8-rc1
This KUnit update for Linux 6.8-rc1 consists of:
 
 - a new feature that adds APIs for managing devices introducing
   a set of helper functions which allow devices (internally a
   struct kunit_device) to be created and managed by KUnit.
   These devices will be automatically unregistered on
   test exit. These helpers can either use a user-provided
   struct device_driver, or have one automatically created and
   managed by KUnit. In both cases, the device lives on a new
   kunit_bus.
 
 - changes to switch drm/tests to use kunit devices
 
 - several fixes and enhancements to attribute feature
 
 - changes to reorganize deferred action function introducing
   KUNIT_DEFINE_ACTION_WRAPPER
 
 - new feature adds ability to run tests after boot using debugfs
 
 - fixes and enhancements to string-stream-test:
   - parse ERR_PTR in string_stream_destroy()
   - unchecked dereference in bug fix in debugfs_print_results()
   - handling errors from alloc_string_stream()
   - NULL-dereference bug fix in kunit_init_suite()
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEPZKym/RZuOCGeA/kCwJExA0NQxwFAmWdiHIACgkQCwJExA0N
 QxxzCxAAmhn+rkKV4DfuGXxUAJbO109H7LSP1Y7FKMYCVp83msWKASziujb2IQR9
 87jnmgeJMbmQaPcc9m//NHuFhZmJwQZwAdGZryoDiz7XK+1MwLxYeUj92HI7FPaD
 o5Jz6tlqFdehx5jCOymgwbvhI5kJMkQCTTtnEaiHCByfaA02UqmTtt3bXK5OeJkZ
 UG0HqdvI/6Xo01i+BnerRBZFcQV49GMhl4acw1k+dJnPLkqusL6txftRBoKtxuVd
 mXQHKS1SmNgiNA+nqs4d/8qERoMJWuwj6wV4pldVBXhgZwOHXbBxBf67i7hTakE/
 TkEURCkOb5X0QrT6akJj6phJ2xqXsF7xwzBJh9G4jF2Pdwwo8GGuAXW+ol0TRrm8
 ZEQ4eMBGIK07Lb9FeBMLO2bZ0Ox+oiN+YNGY/bs1d6Ibf4PnBUfy7IlmMjKL9h/V
 A/EpYdaq5q72IZZQ2pu1rYkBRPbnP7vHmjLXVYIq7Pq8bLA9/ycKO/0jnGHdo1oz
 rBK/6t7yB+ATi4KeKQpjpmUTX/vdEenUQI/QDn9ngXIEwYQfNrEUZitEvBXR1Kw+
 T8iKDIPFkvb/yEZgjWgNpxETooDx3yBkeeC29gKMj4QoN38wEdfy0Xltp8eqq9cS
 6lijRoipUypHRAuXeSJMW2dflLnFIt4mtC25hBNF+DmyNVT+IF4=
 =79+u
 -----END PGP SIGNATURE-----

Merge tag 'linux_kselftest-kunit-6.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest

Pull KUnit updates from Shuah Khan:

 - a new feature that adds APIs for managing devices introducing a set
   of helper functions which allow devices (internally a struct
   kunit_device) to be created and managed by KUnit.

   These devices will be automatically unregistered on test exit. These
   helpers can either use a user-provided struct device_driver, or have
   one automatically created and managed by KUnit. In both cases, the
   device lives on a new kunit_bus.

 - changes to switch drm/tests to use kunit devices

 - several fixes and enhancements to attribute feature

 - changes to reorganize deferred action function introducing
   KUNIT_DEFINE_ACTION_WRAPPER

 - new feature adds ability to run tests after boot using debugfs

 - fixes and enhancements to string-stream-test:
     - parse ERR_PTR in string_stream_destroy()
     - unchecked dereference in bug fix in debugfs_print_results()
     - handling errors from alloc_string_stream()
     - NULL-dereference bug fix in kunit_init_suite()

* tag 'linux_kselftest-kunit-6.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest: (27 commits)
  kunit: Fix some comments which were mistakenly kerneldoc
  kunit: Protect string comparisons against NULL
  kunit: Add example of kunit_activate_static_stub() with pointer-to-function
  kunit: Allow passing function pointer to kunit_activate_static_stub()
  kunit: Fix NULL-dereference in kunit_init_suite() if suite->log is NULL
  kunit: Reset test->priv after each param iteration
  kunit: Add example for using test->priv
  drm/tests: Switch to kunit devices
  ASoC: topology: Replace fake root_device with kunit_device in tests
  overflow: Replace fake root_device with kunit_device
  fortify: test: Use kunit_device
  kunit: Add APIs for managing devices
  Documentation: Add debugfs docs with run after boot
  kunit: add ability to run tests after boot using debugfs
  kunit: add is_init test attribute
  kunit: add example suite to test init suites
  kunit: add KUNIT_INIT_TABLE to init linker section
  kunit: move KUNIT_TABLE out of INIT_DATA
  kunit: tool: add test for parsing attributes
  kunit: tool: fix parsing of test attributes
  ...
2024-01-09 17:16:58 -08:00
Linus Torvalds
bd012f3a5b ACPI updates for 6.8-rc1
- Add CSI-2 and DisCo for Imaging support to the ACPI device
    enumeration code (Sakari Ailus, Rafael J. Wysocki).
 
  - Adjust the cpufreq thermal reduction algorithm in the ACPI processor
    driver for Tegra241 (Srikar Srimath Tirumala, Arnd Bergmann).
 
  - Make acpi_proc_quirk_mwait_check() x86-specific (Rafael J. Wysocki).
 
  - Switch over ACPI to using a threaded interrupt handler for the
    SCI (Rafael J. Wysocki).
 
  - Allow ACPI Notify () handlers to run on all CPUs and clean up the
    ACPI interface for deferred events processing (Rafael J. Wysocki).
 
  - Switch over the ACPI EC driver to using a threaded handler for the
    dedicated IRQ on systems without the EC GPE (Rafael J. Wysocki).
 
  - Adjust code using ACPICA spinlocks and the ACPI EC driver spinlock to
    keep local interrupts on (Rafael J. Wysocki).
 
  - Adjust the USB4 _OSC handshake to correctly handle cases in which
    certain types of OS control are denied by the platform (Mika
    Westerberg).
 
  - Correct and clean up the generic function for parsing ACPI data-only
    tables with array structure (Yuntao Wang).
 
  - Modify acpi_dev_uid_match() to support different types of its second
    argument and adjust its users accordingly (Raag Jadav).
 
  - Clean up code related to acpi_evaluate_reference() and ACPI device
    lists (Rafael J. Wysocki).
 
  - Use generic ACPI helpers for evaluating trip point temperature
    objects in the ACPI thermal zone driver (Rafael J. Wysockii, Arnd
    Bergmann).
 
  - Add Thermal fast Sampling Period (_TFP) support to the ACPI thermal
    zone driver (Jeff Brasen).
 
  - Modify the ACPI LPIT table handling code to avoid u32 multiplication
    overflows in state residency computations (Nikita Kiryushin).
 
  - Drop an unused helper function from the ACPI backlight (video) driver
    and add a clarifying comment to it (Hans de Goede).
 
  - Update the ACPI backlight driver to avoid using uninitialized memory
    in some cases (Nikita Kiryushin).
 
  - Add ACPI backlight quirk for the Colorful X15 AT 23 laptop (Yuluo
    Qiu).
 
  - Add support for vendor-defined error types to the ACPI APEI error
    injection code (Avadhut Naik).
 
  - Adjust APEI to properly set MF_ACTION_REQUIRED on synchronous memory
    failure events, so they are handled differently from the asynchronous
    ones (Shuai Xue).
 
  - Fix NULL pointer dereference check in the ACPI extlog driver (Prarit
    Bhargava).
 
  - Adjust the ACPI extlog driver to clear the Extended Error Log status
    when RAS_CEC handled the error (Tony Luck).
 
  - Add IRQ override quirks for some Infinity laptops and for TongFang
    GMxXGxx (David McFarland, Hans de Goede).
 
  - Clean up the ACPI NUMA code and fix it to ensure that fake_pxm is not
    the same as one of the real pxm values (Yuntao Wang).
 
  - Fix the fractional clock divider flags in the ACPI LPSS (Intel SoC)
    driver so as to prevent miscalculation of the values in the clock
    divider (Andy Shevchenko).
 
  - Adjust comments in the ACPI watchdog driver to prevent kernel-doc
    from complaining during documentation builds (Randy Dunlap).
 
  - Make the ACPI button driver send wakeup key events to user space in
    addition to power button events on systems that can be woken up by
    the power button (Ken Xue).
 
  - Adjust pnpacpi_parse_allocated_vendor() to use memcpy() on a full
    structure field (Dmitry Antipov).
 -----BEGIN PGP SIGNATURE-----
 
 iQJGBAABCAAwFiEE4fcc61cGeeHD/fCwgsRv/nhiVHEFAmWb8asSHHJqd0Byand5
 c29ja2kubmV0AAoJEILEb/54YlRxhsQP/jfRiEP7L9WUl66PdzxSWi1u7bVUZIbs
 z07ujAFdAbvpdM1WgWVq6mSzYewAqIm0A9Koabj7zKuG4VPh0Gjvq26jrK/et65m
 RJhC/qcnZ4h/2bELf9/JE7FIQMDWBGK8gNHBBXVQOZrQYIiBzJ2xyHJ4F0AvLVW6
 GGuX/4mb00jlWGr6uot6qjBgLLxY0EowneLUuH4onEWrThoNWy7zbD34LSsKuljA
 a69UkQPetXbkX4XQYnt4K4BAnwjRQNU2DlUE9lpMtheTS70wilxrC+P0XaETeO7c
 NCm38X2aUv/hSwJ0BekBRdNEvG/WQsfRdOt9jWAkoCL3oDCZdOgfM6Eas7ZDLF2n
 RoxLk2O9UXFwaSSGBVgkRLPCVyWBNI6C8GXnVDN8f9hqIk+jmlsXaXghpzVlGS54
 +ox6fjO81zJjEBxSP5ACCTNZq3BwwHhPhygtIkTO5JQ9SPn+WYCPM0C5Lcvzoj7A
 x7cdOguddhAi4ZWcoRo2cg7qN6vVaDgDgV+ylzh7q5N4cBY4edCJLzcFFuasriN4
 j9/Uj/EgCafrnOhlTJz0iZkAbPZ6T/qa3qBfF948dtFRkztTsddmGA4xof90jfG9
 /FLXL4wSiXK7jbFeUb1OCLOVANWpjHP3pM3gmnggiI3ApcweEGilhhbgVr7FuCG8
 7qj78EUqNVbW
 =Ntzm
 -----END PGP SIGNATURE-----

Merge tag 'acpi-6.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull ACPI updates from Rafael Wysocki:
 "From the new features standpoint, the most significant change here is
  the addition of CSI-2 and MIPI DisCo for Imaging support to the ACPI
  device enumeration code that will allow MIPI cameras to be enumerated
  through the platform firmware on systems using ACPI.

  Also significant is the switch-over to threaded interrupt handlers for
  the ACPI SCI and the dedicated EC interrupt (on systems where the
  former is not used) which essentially allows all ACPI code to run with
  local interrupts enabled. That should improve responsiveness
  significantly on systems where multiple GPEs are enabled and the
  handling of one SCI involves many I/O address space accesses which
  previously had to be carried out in one go with disabled interrupts on
  the local CPU.

  Apart from the above, the ACPI thermal zone driver will use the
  Thermal fast Sampling Period (_TFP) object if available, which should
  allow temperature changes to be followed more accurately on some
  systems, the ACPI Notify () handlers can run on all CPUs (not just on
  CPU0), which should generally speed up the processing of events
  signaled through the ACPI SCI, and the ACPI power button driver will
  trigger wakeup key events via the input subsystem (on systems where it
  is a system wakeup device)

  In addition to that, there are the usual bunch of fixes and cleanups.

  Specifics:

   - Add CSI-2 and DisCo for Imaging support to the ACPI device
     enumeration code (Sakari Ailus, Rafael J. Wysocki)

   - Adjust the cpufreq thermal reduction algorithm in the ACPI
     processor driver for Tegra241 (Srikar Srimath Tirumala, Arnd
     Bergmann)

   - Make acpi_proc_quirk_mwait_check() x86-specific (Rafael J. Wysocki)

   - Switch over ACPI to using a threaded interrupt handler for the SCI
     (Rafael J. Wysocki)

   - Allow ACPI Notify () handlers to run on all CPUs and clean up the
     ACPI interface for deferred events processing (Rafael J. Wysocki)

   - Switch over the ACPI EC driver to using a threaded handler for the
     dedicated IRQ on systems without the EC GPE (Rafael J. Wysocki)

   - Adjust code using ACPICA spinlocks and the ACPI EC driver spinlock
     to keep local interrupts on (Rafael J. Wysocki)

   - Adjust the USB4 _OSC handshake to correctly handle cases in which
     certain types of OS control are denied by the platform (Mika
     Westerberg)

   - Correct and clean up the generic function for parsing ACPI
     data-only tables with array structure (Yuntao Wang)

   - Modify acpi_dev_uid_match() to support different types of its
     second argument and adjust its users accordingly (Raag Jadav)

   - Clean up code related to acpi_evaluate_reference() and ACPI device
     lists (Rafael J. Wysocki)

   - Use generic ACPI helpers for evaluating trip point temperature
     objects in the ACPI thermal zone driver (Rafael J. Wysockii, Arnd
     Bergmann)

   - Add Thermal fast Sampling Period (_TFP) support to the ACPI thermal
     zone driver (Jeff Brasen)

   - Modify the ACPI LPIT table handling code to avoid u32
     multiplication overflows in state residency computations (Nikita
     Kiryushin)

   - Drop an unused helper function from the ACPI backlight (video)
     driver and add a clarifying comment to it (Hans de Goede)

   - Update the ACPI backlight driver to avoid using uninitialized
     memory in some cases (Nikita Kiryushin)

   - Add ACPI backlight quirk for the Colorful X15 AT 23 laptop (Yuluo
     Qiu)

   - Add support for vendor-defined error types to the ACPI APEI error
     injection code (Avadhut Naik)

   - Adjust APEI to properly set MF_ACTION_REQUIRED on synchronous
     memory failure events, so they are handled differently from the
     asynchronous ones (Shuai Xue)

   - Fix NULL pointer dereference check in the ACPI extlog driver
     (Prarit Bhargava)

   - Adjust the ACPI extlog driver to clear the Extended Error Log
     status when RAS_CEC handled the error (Tony Luck)

   - Add IRQ override quirks for some Infinity laptops and for TongFang
     GMxXGxx (David McFarland, Hans de Goede)

   - Clean up the ACPI NUMA code and fix it to ensure that fake_pxm is
     not the same as one of the real pxm values (Yuntao Wang)

   - Fix the fractional clock divider flags in the ACPI LPSS (Intel SoC)
     driver so as to prevent miscalculation of the values in the clock
     divider (Andy Shevchenko)

   - Adjust comments in the ACPI watchdog driver to prevent kernel-doc
     from complaining during documentation builds (Randy Dunlap)

   - Make the ACPI button driver send wakeup key events to user space in
     addition to power button events on systems that can be woken up by
     the power button (Ken Xue)

   - Adjust pnpacpi_parse_allocated_vendor() to use memcpy() on a full
     structure field (Dmitry Antipov)"

* tag 'acpi-6.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (56 commits)
  ACPI: resource: Add Infinity laptops to irq1_edge_low_force_override
  ACPI: button: trigger wakeup key events
  ACPI: resource: Add another DMI match for the TongFang GMxXGxx
  ACPI: EC: Use a spin lock without disabing interrupts
  ACPI: EC: Use a threaded handler for dedicated IRQ
  ACPI: OSL: Use spin locks without disabling interrupts
  ACPI: APEI: set memory failure flags as MF_ACTION_REQUIRED on synchronous events
  ACPI: utils: Introduce helper for _DEP list lookup
  ACPI: utils: Fix white space in struct acpi_handle_list definition
  ACPI: utils: Refine acpi_handle_list_equal() slightly
  ACPI: utils: Return bool from acpi_evaluate_reference()
  ACPI: utils: Rearrange in acpi_evaluate_reference()
  ACPI: arm64: export acpi_arch_thermal_cpufreq_pctg()
  ACPI: extlog: Clear Extended Error Log status when RAS_CEC handled the error
  ACPI: LPSS: Fix the fractional clock divider flags
  ACPI: NUMA: Fix the logic of getting the fake_pxm value
  ACPI: NUMA: Optimize the check for the availability of node values
  ACPI: NUMA: Remove unnecessary check in acpi_parse_gi_affinity()
  ACPI: watchdog: fix kernel-doc warnings
  ACPI: extlog: fix NULL pointer dereference check
  ...
2024-01-09 16:12:44 -08:00
Linus Torvalds
9f2a635235 Quite a lot of kexec work this time around. Many singleton patches in
many places.  The notable patch series are:
 
 - nilfs2 folio conversion from Matthew Wilcox in "nilfs2: Folio
   conversions for file paths".
 
 - Additional nilfs2 folio conversion from Ryusuke Konishi in "nilfs2:
   Folio conversions for directory paths".
 
 - IA64 remnant removal in Heiko Carstens's "Remove unused code after
   IA-64 removal".
 
 - Arnd Bergmann has enabled the -Wmissing-prototypes warning everywhere
   in "Treewide: enable -Wmissing-prototypes".  This had some followup
   fixes:
 
   - Nathan Chancellor has cleaned up the hexagon build in the series
     "hexagon: Fix up instances of -Wmissing-prototypes".
 
   - Nathan also addressed some s390 warnings in "s390: A couple of
     fixes for -Wmissing-prototypes".
 
   - Arnd Bergmann addresses the same warnings for MIPS in his series
     "mips: address -Wmissing-prototypes warnings".
 
 - Baoquan He has made kexec_file operate in a top-down-fitting manner
   similar to kexec_load in the series "kexec_file: Load kernel at top of
   system RAM if required"
 
 - Baoquan He has also added the self-explanatory "kexec_file: print out
   debugging message if required".
 
 - Some checkstack maintenance work from Tiezhu Yang in the series
   "Modify some code about checkstack".
 
 - Douglas Anderson has disentangled the watchdog code's logging when
   multiple reports are occurring simultaneously.  The series is "watchdog:
   Better handling of concurrent lockups".
 
 - Yuntao Wang has contributed some maintenance work on the crash code in
   "crash: Some cleanups and fixes".
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCZZ2R6AAKCRDdBJ7gKXxA
 juCVAP4t76qUISDOSKugB/Dn5E4Nt9wvPY9PcufnmD+xoPsgkQD+JVl4+jd9+gAV
 vl6wkJDiJO5JZ3FVtBtC3DFA/xHtVgk=
 =kQw+
 -----END PGP SIGNATURE-----

Merge tag 'mm-nonmm-stable-2024-01-09-10-33' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull non-MM updates from Andrew Morton:
 "Quite a lot of kexec work this time around. Many singleton patches in
  many places. The notable patch series are:

   - nilfs2 folio conversion from Matthew Wilcox in 'nilfs2: Folio
     conversions for file paths'.

   - Additional nilfs2 folio conversion from Ryusuke Konishi in 'nilfs2:
     Folio conversions for directory paths'.

   - IA64 remnant removal in Heiko Carstens's 'Remove unused code after
     IA-64 removal'.

   - Arnd Bergmann has enabled the -Wmissing-prototypes warning
     everywhere in 'Treewide: enable -Wmissing-prototypes'. This had
     some followup fixes:

      - Nathan Chancellor has cleaned up the hexagon build in the series
        'hexagon: Fix up instances of -Wmissing-prototypes'.

      - Nathan also addressed some s390 warnings in 's390: A couple of
        fixes for -Wmissing-prototypes'.

      - Arnd Bergmann addresses the same warnings for MIPS in his series
        'mips: address -Wmissing-prototypes warnings'.

   - Baoquan He has made kexec_file operate in a top-down-fitting manner
     similar to kexec_load in the series 'kexec_file: Load kernel at top
     of system RAM if required'

   - Baoquan He has also added the self-explanatory 'kexec_file: print
     out debugging message if required'.

   - Some checkstack maintenance work from Tiezhu Yang in the series
     'Modify some code about checkstack'.

   - Douglas Anderson has disentangled the watchdog code's logging when
     multiple reports are occurring simultaneously. The series is
     'watchdog: Better handling of concurrent lockups'.

   - Yuntao Wang has contributed some maintenance work on the crash code
     in 'crash: Some cleanups and fixes'"

* tag 'mm-nonmm-stable-2024-01-09-10-33' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (157 commits)
  crash_core: fix and simplify the logic of crash_exclude_mem_range()
  x86/crash: use SZ_1M macro instead of hardcoded value
  x86/crash: remove the unused image parameter from prepare_elf_headers()
  kdump: remove redundant DEFAULT_CRASH_KERNEL_LOW_SIZE
  scripts/decode_stacktrace.sh: strip unexpected CR from lines
  watchdog: if panicking and we dumped everything, don't re-enable dumping
  watchdog/hardlockup: use printk_cpu_sync_get_irqsave() to serialize reporting
  watchdog/softlockup: use printk_cpu_sync_get_irqsave() to serialize reporting
  watchdog/hardlockup: adopt softlockup logic avoiding double-dumps
  kexec_core: fix the assignment to kimage->control_page
  x86/kexec: fix incorrect end address passed to kernel_ident_mapping_init()
  lib/trace_readwrite.c:: replace asm-generic/io with linux/io
  nilfs2: cpfile: fix some kernel-doc warnings
  stacktrace: fix kernel-doc typo
  scripts/checkstack.pl: fix no space expression between sp and offset
  x86/kexec: fix incorrect argument passed to kexec_dprintk()
  x86/kexec: use pr_err() instead of kexec_dprintk() when an error occurs
  nilfs2: add missing set_freezable() for freezable kthread
  kernel: relay: remove relay_file_splice_read dead code, doesn't work
  docs: submit-checklist: remove all of "make namespacecheck"
  ...
2024-01-09 11:46:20 -08:00
Linus Torvalds
fb46e22a9e Many singleton patches against the MM code. The patch series which
are included in this merge do the following:
 
 - Peng Zhang has done some mapletree maintainance work in the
   series
 
 	"maple_tree: add mt_free_one() and mt_attr() helpers"
 	"Some cleanups of maple tree"
 
 - In the series "mm: use memmap_on_memory semantics for dax/kmem"
   Vishal Verma has altered the interworking between memory-hotplug
   and dax/kmem so that newly added 'device memory' can more easily
   have its memmap placed within that newly added memory.
 
 - Matthew Wilcox continues folio-related work (including a few
   fixes) in the patch series
 
 	"Add folio_zero_tail() and folio_fill_tail()"
 	"Make folio_start_writeback return void"
 	"Fix fault handler's handling of poisoned tail pages"
 	"Convert aops->error_remove_page to ->error_remove_folio"
 	"Finish two folio conversions"
 	"More swap folio conversions"
 
 - Kefeng Wang has also contributed folio-related work in the series
 
 	"mm: cleanup and use more folio in page fault"
 
 - Jim Cromie has improved the kmemleak reporting output in the
   series "tweak kmemleak report format".
 
 - In the series "stackdepot: allow evicting stack traces" Andrey
   Konovalov to permits clients (in this case KASAN) to cause
   eviction of no longer needed stack traces.
 
 - Charan Teja Kalla has fixed some accounting issues in the page
   allocator's atomic reserve calculations in the series "mm:
   page_alloc: fixes for high atomic reserve caluculations".
 
 - Dmitry Rokosov has added to the samples/ dorectory some sample
   code for a userspace memcg event listener application.  See the
   series "samples: introduce cgroup events listeners".
 
 - Some mapletree maintanance work from Liam Howlett in the series
   "maple_tree: iterator state changes".
 
 - Nhat Pham has improved zswap's approach to writeback in the
   series "workload-specific and memory pressure-driven zswap
   writeback".
 
 - DAMON/DAMOS feature and maintenance work from SeongJae Park in
   the series
 
 	"mm/damon: let users feed and tame/auto-tune DAMOS"
 	"selftests/damon: add Python-written DAMON functionality tests"
 	"mm/damon: misc updates for 6.8"
 
 - Yosry Ahmed has improved memcg's stats flushing in the series
   "mm: memcg: subtree stats flushing and thresholds".
 
 - In the series "Multi-size THP for anonymous memory" Ryan Roberts
   has added a runtime opt-in feature to transparent hugepages which
   improves performance by allocating larger chunks of memory during
   anonymous page faults.
 
 - Matthew Wilcox has also contributed some cleanup and maintenance
   work against eh buffer_head code int he series "More buffer_head
   cleanups".
 
 - Suren Baghdasaryan has done work on Andrea Arcangeli's series
   "userfaultfd move option".  UFFDIO_MOVE permits userspace heap
   compaction algorithms to move userspace's pages around rather than
   UFFDIO_COPY'a alloc/copy/free.
 
 - Stefan Roesch has developed a "KSM Advisor", in the series
   "mm/ksm: Add ksm advisor".  This is a governor which tunes KSM's
   scanning aggressiveness in response to userspace's current needs.
 
 - Chengming Zhou has optimized zswap's temporary working memory
   use in the series "mm/zswap: dstmem reuse optimizations and
   cleanups".
 
 - Matthew Wilcox has performed some maintenance work on the
   writeback code, both code and within filesystems.  The series is
   "Clean up the writeback paths".
 
 - Andrey Konovalov has optimized KASAN's handling of alloc and
   free stack traces for secondary-level allocators, in the series
   "kasan: save mempool stack traces".
 
 - Andrey also performed some KASAN maintenance work in the series
   "kasan: assorted clean-ups".
 
 - David Hildenbrand has gone to town on the rmap code.  Cleanups,
   more pte batching, folio conversions and more.  See the series
   "mm/rmap: interface overhaul".
 
 - Kinsey Ho has contributed some maintenance work on the MGLRU
   code in the series "mm/mglru: Kconfig cleanup".
 
 - Matthew Wilcox has contributed lruvec page accounting code
   cleanups in the series "Remove some lruvec page accounting
   functions".
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCZZyF2wAKCRDdBJ7gKXxA
 jjWjAP42LHvGSjp5M+Rs2rKFL0daBQsrlvy6/jCHUequSdWjSgEAmOx7bc5fbF27
 Oa8+DxGM9C+fwqZ/7YxU2w/WuUmLPgU=
 =0NHs
 -----END PGP SIGNATURE-----

Merge tag 'mm-stable-2024-01-08-15-31' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull MM updates from Andrew Morton:
 "Many singleton patches against the MM code. The patch series which are
  included in this merge do the following:

   - Peng Zhang has done some mapletree maintainance work in the series

	'maple_tree: add mt_free_one() and mt_attr() helpers'
	'Some cleanups of maple tree'

   - In the series 'mm: use memmap_on_memory semantics for dax/kmem'
     Vishal Verma has altered the interworking between memory-hotplug
     and dax/kmem so that newly added 'device memory' can more easily
     have its memmap placed within that newly added memory.

   - Matthew Wilcox continues folio-related work (including a few fixes)
     in the patch series

	'Add folio_zero_tail() and folio_fill_tail()'
	'Make folio_start_writeback return void'
	'Fix fault handler's handling of poisoned tail pages'
	'Convert aops->error_remove_page to ->error_remove_folio'
	'Finish two folio conversions'
	'More swap folio conversions'

   - Kefeng Wang has also contributed folio-related work in the series

	'mm: cleanup and use more folio in page fault'

   - Jim Cromie has improved the kmemleak reporting output in the series
     'tweak kmemleak report format'.

   - In the series 'stackdepot: allow evicting stack traces' Andrey
     Konovalov to permits clients (in this case KASAN) to cause eviction
     of no longer needed stack traces.

   - Charan Teja Kalla has fixed some accounting issues in the page
     allocator's atomic reserve calculations in the series 'mm:
     page_alloc: fixes for high atomic reserve caluculations'.

   - Dmitry Rokosov has added to the samples/ dorectory some sample code
     for a userspace memcg event listener application. See the series
     'samples: introduce cgroup events listeners'.

   - Some mapletree maintanance work from Liam Howlett in the series
     'maple_tree: iterator state changes'.

   - Nhat Pham has improved zswap's approach to writeback in the series
     'workload-specific and memory pressure-driven zswap writeback'.

   - DAMON/DAMOS feature and maintenance work from SeongJae Park in the
     series

	'mm/damon: let users feed and tame/auto-tune DAMOS'
	'selftests/damon: add Python-written DAMON functionality tests'
	'mm/damon: misc updates for 6.8'

   - Yosry Ahmed has improved memcg's stats flushing in the series 'mm:
     memcg: subtree stats flushing and thresholds'.

   - In the series 'Multi-size THP for anonymous memory' Ryan Roberts
     has added a runtime opt-in feature to transparent hugepages which
     improves performance by allocating larger chunks of memory during
     anonymous page faults.

   - Matthew Wilcox has also contributed some cleanup and maintenance
     work against eh buffer_head code int he series 'More buffer_head
     cleanups'.

   - Suren Baghdasaryan has done work on Andrea Arcangeli's series
     'userfaultfd move option'. UFFDIO_MOVE permits userspace heap
     compaction algorithms to move userspace's pages around rather than
     UFFDIO_COPY'a alloc/copy/free.

   - Stefan Roesch has developed a 'KSM Advisor', in the series 'mm/ksm:
     Add ksm advisor'. This is a governor which tunes KSM's scanning
     aggressiveness in response to userspace's current needs.

   - Chengming Zhou has optimized zswap's temporary working memory use
     in the series 'mm/zswap: dstmem reuse optimizations and cleanups'.

   - Matthew Wilcox has performed some maintenance work on the writeback
     code, both code and within filesystems. The series is 'Clean up the
     writeback paths'.

   - Andrey Konovalov has optimized KASAN's handling of alloc and free
     stack traces for secondary-level allocators, in the series 'kasan:
     save mempool stack traces'.

   - Andrey also performed some KASAN maintenance work in the series
     'kasan: assorted clean-ups'.

   - David Hildenbrand has gone to town on the rmap code. Cleanups, more
     pte batching, folio conversions and more. See the series 'mm/rmap:
     interface overhaul'.

   - Kinsey Ho has contributed some maintenance work on the MGLRU code
     in the series 'mm/mglru: Kconfig cleanup'.

   - Matthew Wilcox has contributed lruvec page accounting code cleanups
     in the series 'Remove some lruvec page accounting functions'"

* tag 'mm-stable-2024-01-08-15-31' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (361 commits)
  mm, treewide: rename MAX_ORDER to MAX_PAGE_ORDER
  mm, treewide: introduce NR_PAGE_ORDERS
  selftests/mm: add separate UFFDIO_MOVE test for PMD splitting
  selftests/mm: skip test if application doesn't has root privileges
  selftests/mm: conform test to TAP format output
  selftests: mm: hugepage-mmap: conform to TAP format output
  selftests/mm: gup_test: conform test to TAP format output
  mm/selftests: hugepage-mremap: conform test to TAP format output
  mm/vmstat: move pgdemote_* out of CONFIG_NUMA_BALANCING
  mm: zsmalloc: return -ENOSPC rather than -EINVAL in zs_malloc while size is too large
  mm/memcontrol: remove __mod_lruvec_page_state()
  mm/khugepaged: use a folio more in collapse_file()
  slub: use a folio in __kmalloc_large_node
  slub: use folio APIs in free_large_kmalloc()
  slub: use alloc_pages_node() in alloc_slab_page()
  mm: remove inc/dec lruvec page state functions
  mm: ratelimit stat flush from workingset shrinker
  kasan: stop leaking stack trace handles
  mm/mglru: remove CONFIG_TRANSPARENT_HUGEPAGE
  mm/mglru: add dummy pmd_dirty()
  ...
2024-01-09 11:18:47 -08:00
Linus Torvalds
d30e51aa7b slab updates for 6.8
-----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCAAdFiEEe7vIQRWZI0iWSE3xu+CwddJFiJoFAmWWu9EACgkQu+CwddJF
 iJpXvQf/aGL7uEY57VpTm0t4gPwoZ9r2P89HxI/nQs9XgVzDcBmVp/cC0LDvSdcm
 t91kJO538KeGjMgvlhLMTEuoShH5FlPs6cOwrGAYUoAGa4NwiOpGvliGky+nNHqY
 w887ZgSzVLq0UOuSvn86N6enumMvewt4V+872+OWo6O1HWOJhC0SgHTIa8QPQtwb
 yZ9BghO5IqMRXiZEsSIwyO+tQHcaU6l2G5huFXzgMFUhkQqAB9KTFc3h6rYI+i80
 L4ppNXo2KNPGTDRb9dA8LNMWgvmfjhCb7chs8o1zSY2PwZlkzOix7EUBLCAIbc/2
 EIaFC8AsZjfT47D1t72r8QpHB+C14Q==
 =J+E7
 -----END PGP SIGNATURE-----

Merge tag 'slab-for-6.8' of git://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab

Pull slab updates from Vlastimil Babka:

 - SLUB: delayed freezing of CPU partial slabs (Chengming Zhou)

   Freezing is an operation involving double_cmpxchg() that makes a slab
   exclusive for a particular CPU. Chengming noticed that we use it also
   in situations where we are not yet installing the slab as the CPU
   slab, because freezing also indicates that the slab is not on the
   shared list. This results in redundant freeze/unfreeze operation and
   can be avoided by marking separately the shared list presence by
   reusing the PG_workingset flag.

   This approach neatly avoids the issues described in 9b1ea29bc0
   ("Revert "mm, slub: consider rest of partial list if acquire_slab()
   fails"") as we can now grab a slab from the shared list in a quick
   and guaranteed way without the cmpxchg_double() operation that
   amplifies the lock contention and can fail.

   As a result, lkp has reported 34.2% improvement of
   stress-ng.rawudp.ops_per_sec

 - SLAB removal and SLUB cleanups (Vlastimil Babka)

   The SLAB allocator has been deprecated since 6.5 and nobody has
   objected so far. We agreed at LSF/MM to wait until the next LTS,
   which is 6.6, so we should be good to go now.

   This doesn't yet erase all traces of SLAB outside of mm/ so some dead
   code, comments or documentation remain, and will be cleaned up
   gradually (some series are already in the works).

   Removing the choice of allocators has already allowed to simplify and
   optimize the code wiring up the kmalloc APIs to the SLUB
   implementation.

* tag 'slab-for-6.8' of git://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab: (34 commits)
  mm/slub: free KFENCE objects in slab_free_hook()
  mm/slub: handle bulk and single object freeing separately
  mm/slub: introduce __kmem_cache_free_bulk() without free hooks
  mm/slub: fix bulk alloc and free stats
  mm/slub: optimize free fast path code layout
  mm/slub: optimize alloc fastpath code layout
  mm/slub: remove slab_alloc() and __kmem_cache_alloc_lru() wrappers
  mm/slab: move kmalloc() functions from slab_common.c to slub.c
  mm/slab: move kmalloc_slab() to mm/slab.h
  mm/slab: move kfree() from slab_common.c to slub.c
  mm/slab: move struct kmem_cache_node from slab.h to slub.c
  mm/slab: move memcg related functions from slab.h to slub.c
  mm/slab: move pre/post-alloc hooks from slab.h to slub.c
  mm/slab: consolidate includes in the internal mm/slab.h
  mm/slab: move the rest of slub_def.h to mm/slab.h
  mm/slab: move struct kmem_cache_cpu declaration to slub.c
  mm/slab: remove mm/slab.c and slab_def.h
  mm/mempool/dmapool: remove CONFIG_DEBUG_SLAB ifdefs
  mm/slab: remove CONFIG_SLAB code from slab common code
  cpu/hotplug: remove CPUHP_SLAB_PREPARE hooks
  ...
2024-01-09 10:36:07 -08:00
Linus Torvalds
ab9517fa9a Debugobject changes for v6.8:
- Make tracking object use more robust: it's not safe to access a
    tracking object after releasing the hashbucket lock. Create a
    persistent copy for debug printouts instead.
 
 Signed-off-by: Ingo Molnar <mingo@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAmWbwj4RHG1pbmdvQGtl
 cm5lbC5vcmcACgkQEnMQ0APhK1iQORAAkO9jZmvIgh3e0boJyvVkYh+m+O0sMSek
 /HAnoCLk5ipA8I8uhK7r5aqLHPPCoxw2MLihnKj4rTK1nPULZU/VMmge+4hUComv
 vy1m7tHeUShdAlxSvjZ/807oh546nA5IySv3qG6KCaemHeELllp3gRoIkkTjmnz6
 GcPUPQ9WIk8ptyplvyxOR4mw6QVynkIofk7p6C0ePwGNn7yFrYczc3VsnFL+Kmat
 fl5ZsyN9bRfG4q4bCqj4nKXuUyiJZdbtO2VkyPZtYFmlpERcvGiDHCtzQoKPHS4k
 CvAz4Eq11DdHjPr5VdRIuW7M91CyGE/2QMOfCOlxPqD+Jh0zTbO00U5BuRhicPXz
 rn0Yg5J54gDqv/2jtq6eSNsM28OrhGEu+nep9165e4BBhI6oYd0Wbq0TH+rlzohb
 048zMePY77nLffwNEpa4dEK2+UKLwufeQiP+d2BGfe96mQQ2i4LIRDzdu/BQtR4W
 CU7gbcy2e4q+hjI0gB7smFhQi5zR7tIt3WSfLY6hRpzDcpRqaTvfILyzWFgYN2RN
 I3B9+n3aTlCyCdeFseFajcVvBg5KexkAaPphhfBrkZ9viB+iKUjmcN8mMrvRfX6I
 vjCeIaNCL7arCAkce0hhHZ0WUf/f+DPoEJCb+Z9EuBmOosO/3tlNH6YGyeJNN+vx
 r7mt+MSC4CY=
 =WFe/
 -----END PGP SIGNATURE-----

Merge tag 'core-debugobjects-2024-01-08' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull debugobject update from Ingo Molnar:

 - Make tracking object use more robust: it's not safe to access a
   tracking object after releasing the hashbucket lock. Create a
   persistent copy for debug printouts instead.

* tag 'core-debugobjects-2024-01-08' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  debugobjects: Stop accessing objects after releasing hash bucket lock
2024-01-08 18:35:33 -08:00
Kirill A. Shutemov
fd37721803 mm, treewide: introduce NR_PAGE_ORDERS
NR_PAGE_ORDERS defines the number of page orders supported by the page
allocator, ranging from 0 to MAX_ORDER, MAX_ORDER + 1 in total.

NR_PAGE_ORDERS assists in defining arrays of page orders and allows for
more natural iteration over them.

[kirill.shutemov@linux.intel.com: fixup for kerneldoc warning]
  Link: https://lkml.kernel.org/r/20240101111512.7empzyifq7kxtzk3@box
Link: https://lkml.kernel.org/r/20231228144704.14033-1-kirill.shutemov@linux.intel.com
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Reviewed-by: Zi Yan <ziy@nvidia.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-01-08 15:27:15 -08:00
Linus Torvalds
5db8752c3b vfs-6.8.iov_iter
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCZZUzBQAKCRCRxhvAZXjc
 ot+3AQCZw1PBD4azVxFMWH76qwlAGoVIFug4+ogKU/iUa4VLygEA2FJh1vLJw5iI
 LpgBEIUTPVkwtzinAW94iJJo1Vr7NAI=
 =p6PB
 -----END PGP SIGNATURE-----

Merge tag 'vfs-6.8.iov_iter' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs

Pull vfs iov_iter cleanups from Christian Brauner:
 "This contains a minor cleanup. The patches drop an unused argument
  from import_single_range() allowing to replace import_single_range()
  with import_ubuf() and dropping import_single_range() completely"

* tag 'vfs-6.8.iov_iter' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
  iov_iter: replace import_single_range() with import_ubuf()
  iov_iter: remove unused 'iov' argument from import_single_range()
2024-01-08 11:43:04 -08:00
Dan Williams
80dda9a69a Merge branch 'for-6.8/cxl-misc' into for-6.8/cxl
Pick up some miscellaneous fixups for v6.8.
2024-01-05 19:03:06 -08:00
Jakub Kicinski
e63c1822ac Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Cross-merge networking fixes after downstream PR.

Conflicts:

drivers/net/ethernet/broadcom/bnxt/bnxt.c
  e009b2efb7 ("bnxt_en: Remove mis-applied code from bnxt_cfg_ntp_filters()")
  0f2b214779 ("bnxt_en: Fix compile error without CONFIG_RFS_ACCEL")
https://lore.kernel.org/all/20240105115509.225aa8a2@canb.auug.org.au/

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-01-04 18:06:46 -08:00
Rafael J. Wysocki
8be056a2c0 Merge branches 'acpi-osl', 'acpi-bus' and 'acpi-tables'
Merge low-level ACPICA interface changes, an _SB-scope _OSC handshake
update and a data-only ACPI tables parsing code update for 6.8-rc1:

 - Switch over ACPI to using a threaded interrupt handler for the
   SCI (Rafael J. Wysocki).

 - Allow ACPI Notify () handlers to run on all CPUs and clean up the
   ACPI interface for deferred events processing (Rafael J. Wysocki).

 - Switch over the ACPI EC driver to using a threaded handler for the
   dedicated IRQ on systems without the EC GPE (Rafael J. Wysocki).

 - Adjust code using ACPICA spinlocks and the ACPI EC driver spinlock to
   keep local interrupts on (Rafael J. Wysocki).

 - Adjust the USB4 _OSC handshake to correctly handle cases in which
   certain types of OS control are denied by the platform (Mika
   Westerberg).

 - Correct and clean up the generic function for parsing ACPI data-only
   tables with array structure (Yuntao Wang).

* acpi-osl:
  ACPI: EC: Use a spin lock without disabing interrupts
  ACPI: EC: Use a threaded handler for dedicated IRQ
  ACPI: OSL: Use spin locks without disabling interrupts
  ACPI: OSL: Allow Notify () handlers to run on all CPUs
  ACPI: OSL: Rearrange workqueue selection in acpi_os_execute()
  ACPI: OSL: Rework error handling in acpi_os_execute()
  ACPI: OSL: Use a threaded interrupt handler for SCI

* acpi-bus:
  ACPI: Run USB4 _OSC() first with query bit set

* acpi-tables:
  ACPI: tables: Correct and clean up the logic of acpi_parse_entries_array()
2024-01-04 12:35:42 +01:00
David Gow
539e582a37 kunit: Fix some comments which were mistakenly kerneldoc
The KUnit device helpers are documented with kerneldoc in their header
file, but also have short comments over their implementation. These were
mistakenly formatted as kerneldoc comments, even though they're not
valid kerneldoc. It shouldn't cause any serious problems -- this file
isn't included in the docs -- but it could be confusing, and causes
warnings.

Remove the extra '*' so that these aren't treated as kerneldoc.

Fixes: d03c720e03 ("kunit: Add APIs for managing devices")
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202312181920.H4EPAH20-lkp@intel.com/
Signed-off-by: David Gow <davidgow@google.com>
Reviewed-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2024-01-03 09:10:37 -07:00
Richard Fitzgerald
5fb1a8c671 kunit: Add example of kunit_activate_static_stub() with pointer-to-function
Adds a variant of example_static_stub_test() that shows use of a
pointer-to-function with kunit_activate_static_stub().

A const pointer to the add_one() function is declared. This
pointer-to-function is passed to kunit_activate_static_stub() and
kunit_deactivate_static_stub() instead of passing add_one directly.

Signed-off-by: Richard Fitzgerald <rf@opensource.cirrus.com>
Reviewed-by: David Gow <davidgow@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2024-01-03 09:07:23 -07:00
Richard Fitzgerald
a0b84213f9 kunit: Fix NULL-dereference in kunit_init_suite() if suite->log is NULL
suite->log must be checked for NULL before passing it to
string_stream_clear(). This was done in kunit_init_test() but was missing
from kunit_init_suite().

Signed-off-by: Richard Fitzgerald <rf@opensource.cirrus.com>
Fixes: 6d696c4695c5 ("kunit: add ability to run tests after boot using debugfs")
Reviewed-by: Rae Moar <rmoar@google.com>
Acked-by: David Gow <davidgow@google.com>
Reviewed-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2024-01-03 09:06:19 -07:00
Tanzir Hasan
037d88f0dd lib/trace_readwrite.c:: replace asm-generic/io with linux/io
asm-generic/io.h can be replaced with linux/io.h and the file will still
build correctly.  It is an asm-generic file which should be avoided if
possible.

Link: https://lkml.kernel.org/r/20231221-tracereadwrite-v1-1-a434f25180c7@google.com
Signed-off-by: Tanzir Hasan <tanzirh@google.com>
Suggested-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-12-29 12:22:29 -08:00
Mathis Marion
90ca22513e lib: crc_ccitt_false() is identical to crc_itu_t()
crc_ccitt_false() was introduced in commit 0d85adb5fb ("lib/crc-ccitt:
Add CCITT-FALSE CRC16 variant"), but it is redundant with crc_itu_t(). 
Since the latter is more used, it is the one being kept.

Link: https://lkml.kernel.org/r/20231219131154.748577-1-Mathis.Marion@silabs.com
Signed-off-by: Mathis Marion <mathis.marion@silabs.com>
Cc: Andrey Smirnov <andrew.smirnov@gmail.com>
Cc: Andrey Vostrikov <andrey.vostrikov@cogentembedded.com>
Cc: Jérôme Pouiller <jerome.pouiller@silabs.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-12-29 12:22:26 -08:00
Uwe Kleine-König
bc09d1dea8 lib: add note about process exit message for DEBUG_STACK_USAGE
DEBUG_STACK_USAGE doesn't only have an influence on the output of sysrq-T
and sysrq-P, it also enables a message at process exit.  See
check_stack_usage() in kernel/exit.c where this is implemented.

Link: https://lkml.kernel.org/r/20231219182808.210284-2-u.kleine-koenig@pengutronix.de
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Cc: Douglas Anderson <dianders@chromium.org>
Cc: Geert Uytterhoeven <geert+renesas@glider.be>
Cc: Kees Cook <keescook@chromium.org>
Cc: Marco Elver <elver@google.com>
Cc: "Paul E. McKenney" <paulmck@kernel.org>
Cc: Pengutronix Kernel Team <kernel@pengutronix.de>
Cc: Petr Mladek <pmladek@suse.com>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Stephen Boyd <sboyd@kernel.org>
Cc: Zhaoyang Huang <zhaoyang.huang@unisoc.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-12-29 12:22:26 -08:00
Andrey Konovalov
a914d8d6cf lib/stackdepot: add printk_deferred_enter/exit guards
Patch series "lib/stackdepot, kasan: fixes for stack eviction series", v3.

A few fixes for the stack depot eviction series ("stackdepot: allow
evicting stack traces").

This patch (of 5):

Stack depot functions can be called from various contexts that do
allocations, including with console locks taken.  At the same time, stack
depot functions might print WARNING's or refcount-related failures.

This can cause a deadlock on console locks.

Add printk_deferred_enter/exit guards to stack depot to avoid this.

Link: https://lkml.kernel.org/r/cover.1703020707.git.andreyknvl@google.com
Link: https://lkml.kernel.org/r/82092f9040d075a161d1264377d51e0bac847e8a.1703020707.git.andreyknvl@google.com
Fixes: 108be8def4 ("lib/stackdepot: allow users to evict stack traces")
Fixes: cd11016e5f ("mm, kasan: stackdepot implementation. Enable stackdepot for SLAB")
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Reported-by: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>
Closes: https://lore.kernel.org/all/000000000000f56750060b9ad216@google.com/
Reviewed-by: Marco Elver <elver@google.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Evgenii Stepanov <eugenis@google.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-12-29 11:58:41 -08:00
Joel Granados
c8a65501d3 sysctl: Remove the now superfluous sentinel elements from ctl_table array
This commit comes at the tail end of a greater effort to remove the
empty elements at the end of the ctl_table arrays (sentinels) which
will reduce the overall build time size of the kernel and run time
memory bloat by ~64 bytes per sentinel (further information Link :
https://lore.kernel.org/all/ZO5Yx5JFogGi%2FcBo@bombadil.infradead.org/)

Remove empty sentinel element from test_table and test_table_unregister.

Signed-off-by: Joel Granados <j.granados@samsung.com>
Acked-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
2023-12-28 04:57:57 -08:00
Joel Granados
777740779e sysctl: Add a selftest for handling empty dirs
Basic test to ensure that empty directories can be registered and that
they in turn can serve as a base dir for other registrations.

Add one test to the sysctl selftest module. It first registers an empty
directory under "empty_add" and then uses that as a base to register
another empty dir.
The sysctl bash script then checks that "empty_add" is present and that
there an empty directory within it.

Signed-off-by: Joel Granados <j.granados@samsung.com>
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
2023-12-28 04:57:57 -08:00
Linus Torvalds
f5837722ff 11 hotfixes. 7 are cc:stable and the other 4 address post-6.6 issues or
are not considered backporting material.
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCZYys4AAKCRDdBJ7gKXxA
 jtmaAQC+o04Ia7IfB8MIqp1p7dNZQo64x/EnGA8YjUnQ8N6IwQD+ImU7dHl9g9Oo
 ROiiAbtMRBUfeJRsExX/Yzc1DV9E9QM=
 =ZGcs
 -----END PGP SIGNATURE-----

Merge tag 'mm-hotfixes-stable-2023-12-27-15-00' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull misc fixes from Andrew Morton:
 "11 hotfixes. 7 are cc:stable and the other 4 address post-6.6 issues
  or are not considered backporting material"

* tag 'mm-hotfixes-stable-2023-12-27-15-00' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
  mailmap: add an old address for Naoya Horiguchi
  mm/memory-failure: cast index to loff_t before shifting it
  mm/memory-failure: check the mapcount of the precise page
  mm/memory-failure: pass the folio and the page to collect_procs()
  selftests: secretmem: floor the memory size to the multiple of page_size
  mm: migrate high-order folios in swap cache correctly
  maple_tree: do not preallocate nodes for slot stores
  mm/filemap: avoid buffered read/write race to read inconsistent data
  kunit: kasan_test: disable fortify string checker on kmalloc_oob_memset
  kexec: select CRYPTO from KEXEC_FILE instead of depending on it
  kexec: fix KEXEC_FILE dependencies
2023-12-27 16:14:41 -08:00
Kent Overstreet
1e2f2d3199 Kill sched.h dependency on rcupdate.h
by moving cond_resched_rcu() to rcupdate_wait.h, we can kill another big
sched.h dependency.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-12-27 11:50:20 -05:00
Dave Jiang
60e43fe528 lib/firmware_table: tables: Add CDAT table parsing support
The CDAT table is very similar to ACPI tables when it comes to sub-table
and entry structures. The helper functions can be also used to parse the
CDAT table. Add support to the helper functions to deal with an external
CDAT table, and also handle the endieness since CDAT can be processed by a
BE host. Export a function cdat_table_parse() for CXL driver to parse
a CDAT table.

In order to minimize ACPICA code changes, __force is being utilized to deal
with the case of a big endian (BE) host parsing a CDAT. All CDAT data
structure variables are being force casted to __leX as appropriate.

Cc: Rafael J. Wysocki <rafael@kernel.org>
Cc: Len Brown <lenb@kernel.org>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Link: https://lore.kernel.org/r/170319615131.2212653.10932785667981494238.stgit@djiang5-mobl3
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2023-12-22 14:23:13 -08:00
Linus Torvalds
c0f65a7c11 printk changes for 6.8
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEESH4wyp42V4tXvYsjUqAMR0iAlPIFAmWFxecACgkQUqAMR0iA
 lPLq0RAAg0YQxIltUuR1/IcsVB6rMz8eekmR62sQlmsM2mYRHiEX4ffvM2aSuoMW
 1ikAqtfDCFpjIg2dxlirIIA1bkCj7P9wUDShwZispPmjiAag1Woj3E02Q7enxljF
 LwkdBP0rm/59jGj3u/HKqQ/vwbusg6cezoCXXUqf9BMc7AaD9w4wyF4LbdyqVdcS
 B1ngNvqyONH7LS2rp3sixowbO0KDgtaPKFg3kC9r5Aqhz0pvJtmrdQhssx5jQzJ3
 emnKdSPvrtbeLMa+/mYO9OsYa2PnqkotLRq5ulW4b/9QNluPmVqQbkyqiFJGldFo
 /jHLIIYM9CwaOhvqLY5o1uBq8jXQH6dyfE8Yv2aOcVGN8loTR3eG77h+ZgDAsIXL
 6p5Twx4zCYlI80R8H5nVdpPa0MUjLBK7bG/oT7pmGnxK9pqtKNdejXqhrPzUcH0i
 sVbR9BfOTcWXQlL1CcQRc5sfoikv0Qp6TQbpaE+qyHogWZI3oOaJBo2ietvYx0Lp
 E73amdvkJ2q55rEspTdLxLDQCd7hov2zkQMtIHzLBybk7i9QaMLnzzjmUe7yy9yc
 il/8Ma+8PtYsPfC77PjMp7shI94Ldps0383I/NAon8DrUG7bV6CXd3QAPB1c7P/b
 AN7Qjig1udk7yZDHCm5TbX/niB3/sfiBWJINHG764+iUa3lkJ8E=
 =m5F8
 -----END PGP SIGNATURE-----

Merge tag 'printk-for-6.8' of git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux

Pull printk fix from Petr Mladek:

 - Prevent refcount warning from code releasing a fwnode

* tag 'printk-for-6.8' of git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux:
  lib/vsprintf: Fix %pfwf when current node refcount == 0
2023-12-22 13:41:29 -08:00
Tianjia Zhang
ba3c557420 crypto: lib/mpi - Fix unexpected pointer access in mpi_ec_init
When the mpi_ec_ctx structure is initialized, some fields are not
cleared, causing a crash when referencing the field when the
structure was released. Initially, this issue was ignored because
memory for mpi_ec_ctx is allocated with the __GFP_ZERO flag.
For example, this error will be triggered when calculating the
Za value for SM2 separately.

Fixes: d58bb7e55a ("lib/mpi: Introduce ec implementation to MPI library")
Cc: stable@vger.kernel.org # v6.5
Signed-off-by: Tianjia Zhang <tianjia.zhang@linux.alibaba.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2023-12-22 12:30:19 +08:00
Paolo Abeni
56794e5358 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Cross-merge networking fixes after downstream PR.

Adjacent changes:

drivers/net/ethernet/broadcom/bnxt/bnxt_xdp.c
  23c93c3b62 ("bnxt_en: do not map packet buffers twice")
  6d1add9553 ("bnxt_en: Modify TX ring indexing logic.")

tools/testing/selftests/net/Makefile
  2258b66648 ("selftests: add vlan hw filter tests")
  a0bc96c0cd ("selftests: net: verify fq per-band packet limit")

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-12-21 22:17:23 +01:00
Matthew Wilcox (Oracle)
af73483f4e ida: Fix crash in ida_free when the bitmap is empty
The IDA usually detects double-frees, but that detection failed to
consider the case when there are no nearby IDs allocated and so we have a
NULL bitmap rather than simply having a clear bit.  Add some tests to the
test-suite to be sure we don't inadvertently reintroduce this problem.
Unfortunately they're quite noisy so include a message to disregard
the warnings.

Reported-by: Zhenghan Wang <wzhmmmmm@gmail.com>
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2023-12-21 10:02:28 -08:00
Mark Rutland
0df52582e0 kcov: remove stale RANDOMIZE_BASE text
The Kconfig help text for CONFIG_KCOV describes that recorded PC values
will not be stable across machines or reboots when RANDOMIZE_BASE is
selected.  This was the case when KCOV was introduced in commit:

  5c9a8750a6 ("kernel: add kcov code coverage")

However, this changed in commit:

  4983f0ab7f ("kcov: make kcov work properly with KASLR enabled")

Since that commit KCOV always subtracts the KASLR offset from PC values,
which ensures that these are stable across machines and across reboots
even when RANDOMIZE_BASE is selected.

Unfortunately, that commit failed to update the Kconfig help text, which
still suggests disabling RANDOMIZE_BASE even though this is no longer
necessary.

Remove the stale Kconfig text.

Link: https://lkml.kernel.org/r/20231204171807.3313022-1-mark.rutland@arm.com
Reported-by: Borislav Petkov <bp@alien8.de>
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
Cc: Alexander Popov <alex.popov@linux.com>
Cc: Andrey Konovalov <andreyknvl@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-12-20 15:02:57 -08:00
Borislav Petkov (AMD)
ffda655682 UBSAN: use the kernel panic message markers
Use the same splat markers as panic does for easier matching by external
tools scanning kernel dmesg for splats.

Link: https://lkml.kernel.org/r/20231218135339.23209-1-bp@alien8.de
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-12-20 14:48:14 -08:00
Peng Zhang
7e552dcd80 maple_tree: avoid checking other gaps after getting the largest gap
The last range stored in maple tree is typically quite large.  By checking
if it exceeds the sum of the remaining ranges in that node, it is possible
to avoid checking all other gaps.

Running the maple tree test suite in user mode almost always results in a
near 100% hit rate for this optimization.

Link: https://lkml.kernel.org/r/20231215074632.82045-1-zhangpeng.00@bytedance.com
Signed-off-by: Peng Zhang <zhangpeng.00@bytedance.com>
Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-12-20 14:48:14 -08:00
Randy Dunlap
d5f6057cf0 maple_tree: fix typos/spellos etc
Fix typos/grammar and spellos in documentation.

Link: https://lkml.kernel.org/r/20231210063839.29967-1-rdunlap@infradead.org
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Liam R. Howlett <Liam.Howlett@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-12-20 14:48:12 -08:00
Andrew Morton
5143eecd2a lib/maple_tree.c: fix build error due to hotfix alteration
Commit 0de56e38b3 ("maple_tree: use maple state end for write
operations") was broken by a later patch "maple_tree: do not preallocate
nodes for slot stores".  But the later patch was scheduled ahead of
0de56e38b3, for 6.7-rc.

This fixlet undoes the damage.

Fixes: 0de56e38b3 ("maple_tree: use maple state end for write operations")
Cc: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Sidhartha Kumar <sidhartha.kumar@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-12-20 14:48:11 -08:00
Andrew Morton
a721aeac8b sync mm-stable with mm-hotfixes-stable to pick up depended-upon changes 2023-12-20 14:47:18 -08:00
Sidhartha Kumar
4249f13c11 maple_tree: do not preallocate nodes for slot stores
mas_preallocate() defaults to requesting 1 node for preallocation and then
,depending on the type of store, will update the request variable.  There
isn't a check for a slot store type, so slot stores are preallocating the
default 1 node.  Slot stores do not require any additional nodes, so add a
check for the slot store case that will bypass node_count_gfp().  Update
the tests to reflect that slot stores do not require allocations.

User visible effects of this bug include increased memory usage from the
unneeded node that was allocated.

Link: https://lkml.kernel.org/r/20231213205058.386589-1-sidhartha.kumar@oracle.com
Fixes: 0b8bb544b1 ("maple_tree: update mas_preallocate() testing")
Signed-off-by: Sidhartha Kumar <sidhartha.kumar@oracle.com>
Cc: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Peng Zhang <zhangpeng.00@bytedance.com>
Cc: <stable@vger.kernel.org>	[6.6+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-12-20 13:46:19 -08:00
Jakub Kicinski
c49b292d03 netdev
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEE+soXsSLHKoYyzcli6rmadz2vbToFAmWAz2EACgkQ6rmadz2v
 bToqrw/9EwroZCc8GEHOKAlb/fzrMvn92rLo0ZW/cGN84QJPnx4zM6Zo0+fgLaaN
 oqqztwMUwdzGC3uX3FfVXaaLKbJ/MeHeL9BXFZNW8zkRHciw4R7kIBhOdPnHyET7
 uT+rQ4xPe1Mt7e9PjepKlSL5mEsxWfBkdUgsdn19Z2Vjdfr9mZMhYWYMJGcfTCD1
 TwxHKBPhq5fN3IsshmMBB8IrRp1HStUKb65MgZ4dI22LJXxTsFkx5XMFXcmuqvkH
 NhKj8jDcPEEh31bYcb6aG2Z4onw5F2lquygjk1Qyy5cyw45m/ipJKAXKdAyvJG+R
 VZCWOET/9wbRwFSK5wxwihCuKghFiofK52i2PcGtXZh0PCouyZZneSJOKM0yVWKO
 BvuJBxK4ETRnQyN6ZxhuJiEXG3/YMBBhyR2TX1LntVK9ct/k7qFVzATG49J39/sR
 SYMbptBRj4a5oMJ1qn0nFVEDFkg0jTnTDNnsEpcz60Ayt6EsJ1XosO5yz2huf861
 xgRMTKMseyG1/uV45tQ8ZPzbSPpBxjUi9Dl3coYsIm1a+y6clWUXcarONY5KVrpS
 CR98DuFgl+E7dXuisd/Kz2p2KxxSPq8nytsmLlgOvrUqhwiXqB+TKN8EHgIapVOt
 l1A5LrzXFTcGlT9MlaWBqEIy83Bu1nqQqbxrAFOE0k8A5jomXaw=
 =stU2
 -----END PGP SIGNATURE-----

Merge tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next

Alexei Starovoitov says:

====================
pull-request: bpf-next 2023-12-18

This PR is larger than usual and contains changes in various parts
of the kernel.

The main changes are:

1) Fix kCFI bugs in BPF, from Peter Zijlstra.

End result: all forms of indirect calls from BPF into kernel
and from kernel into BPF work with CFI enabled. This allows BPF
to work with CONFIG_FINEIBT=y.

2) Introduce BPF token object, from Andrii Nakryiko.

It adds an ability to delegate a subset of BPF features from privileged
daemon (e.g., systemd) through special mount options for userns-bound
BPF FS to a trusted unprivileged application. The design accommodates
suggestions from Christian Brauner and Paul Moore.

Example:
$ sudo mkdir -p /sys/fs/bpf/token
$ sudo mount -t bpf bpffs /sys/fs/bpf/token \
             -o delegate_cmds=prog_load:MAP_CREATE \
             -o delegate_progs=kprobe \
             -o delegate_attachs=xdp

3) Various verifier improvements and fixes, from Andrii Nakryiko, Andrei Matei.

 - Complete precision tracking support for register spills
 - Fix verification of possibly-zero-sized stack accesses
 - Fix access to uninit stack slots
 - Track aligned STACK_ZERO cases as imprecise spilled registers.
   It improves the verifier "instructions processed" metric from single
   digit to 50-60% for some programs.
 - Fix verifier retval logic

4) Support for VLAN tag in XDP hints, from Larysa Zaremba.

5) Allocate BPF trampoline via bpf_prog_pack mechanism, from Song Liu.

End result: better memory utilization and lower I$ miss for calls to BPF
via BPF trampoline.

6) Fix race between BPF prog accessing inner map and parallel delete,
from Hou Tao.

7) Add bpf_xdp_get_xfrm_state() kfunc, from Daniel Xu.

It allows BPF interact with IPSEC infra. The intent is to support
software RSS (via XDP) for the upcoming ipsec pcpu work.
Experiments on AWS demonstrate single tunnel pcpu ipsec reaching
line rate on 100G ENA nics.

8) Expand bpf_cgrp_storage to support cgroup1 non-attach, from Yafang Shao.

9) BPF file verification via fsverity, from Song Liu.

It allows BPF progs get fsverity digest.

* tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (164 commits)
  bpf: Ensure precise is reset to false in __mark_reg_const_zero()
  selftests/bpf: Add more uprobe multi fail tests
  bpf: Fail uprobe multi link with negative offset
  selftests/bpf: Test the release of map btf
  s390/bpf: Fix indirect trampoline generation
  selftests/bpf: Temporarily disable dummy_struct_ops test on s390
  x86/cfi,bpf: Fix bpf_exception_cb() signature
  bpf: Fix dtor CFI
  cfi: Add CFI_NOSEAL()
  x86/cfi,bpf: Fix bpf_struct_ops CFI
  x86/cfi,bpf: Fix bpf_callback_t CFI
  x86/cfi,bpf: Fix BPF JIT call
  cfi: Flip headers
  selftests/bpf: Add test for abnormal cnt during multi-kprobe attachment
  selftests/bpf: Don't use libbpf_get_error() in kprobe_multi_test
  selftests/bpf: Add test for abnormal cnt during multi-uprobe attachment
  bpf: Limit the number of kprobes when attaching program to multiple kprobes
  bpf: Limit the number of uprobes when attaching program to multiple uprobes
  bpf: xdp: Register generic_kfunc_set with XDP programs
  selftests/bpf: utilize string values for delegate_xxx mount options
  ...
====================

Link: https://lore.kernel.org/r/20231219000520.34178-1-alexei.starovoitov@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-12-18 16:46:08 -08:00
Michal Wajdeczko
342fb97892 kunit: Reset test->priv after each param iteration
If we run parameterized test that uses test->priv to prepare some
custom data, then value of test->priv will leak to the next param
iteration and may be unexpected.  This could be easily seen if
we promote example_priv_test to parameterized test as then only
first test iteration will be successful:

$ ./tools/testing/kunit/kunit.py run \
	--kunitconfig ./lib/kunit/.kunitconfig *.example_priv*

[ ] Starting KUnit Kernel (1/1)...
[ ] ============================================================
[ ] =================== example (1 subtest) ====================
[ ] ==================== example_priv_test  ====================
[ ] [PASSED] example value 3
[ ] # example_priv_test: initializing
[ ] # example_priv_test: ASSERTION FAILED at lib/kunit/kunit-example-test.c:230
[ ] Expected test->priv == ((void *)0), but
[ ]     test->priv == 0000000060dfe290
[ ]     ((void *)0) == 0000000000000000
[ ] # example_priv_test: cleaning up
[ ] [FAILED] example value 2
[ ] # example_priv_test: initializing
[ ] # example_priv_test: ASSERTION FAILED at lib/kunit/kunit-example-test.c:230
[ ] Expected test->priv == ((void *)0), but
[ ]     test->priv == 0000000060dfe290
[ ]     ((void *)0) == 0000000000000000
[ ] # example_priv_test: cleaning up
[ ] [FAILED] example value 1
[ ] # example_priv_test: initializing
[ ] # example_priv_test: ASSERTION FAILED at lib/kunit/kunit-example-test.c:230
[ ] Expected test->priv == ((void *)0), but
[ ]     test->priv == 0000000060dfe290
[ ]     ((void *)0) == 0000000000000000
[ ] # example_priv_test: cleaning up
[ ] [FAILED] example value 0
[ ] # example_priv_test: initializing
[ ] # example_priv_test: cleaning up
[ ] # example_priv_test: pass:1 fail:3 skip:0 total:4
[ ] ================ [FAILED] example_priv_test ================
[ ]     # example: initializing suite
[ ]     # module: kunit_example_test
[ ]     # example: exiting suite
[ ] # Totals: pass:1 fail:3 skip:0 total:4
[ ] ===================== [FAILED] example =====================

Fix that by resetting test->priv after each param iteration, in
similar way what we did for the test->status.

Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: David Gow <davidgow@google.com>
Cc: Rae Moar <rmoar@google.com>
Reviewed-by: David Gow <davidgow@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2023-12-18 13:39:05 -07:00
Michal Wajdeczko
2b61582acd kunit: Add example for using test->priv
In a test->priv field the user can store arbitrary data.
Add example how to use this feature in the test code.

Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: David Gow <davidgow@google.com>
Cc: Rae Moar <rmoar@google.com>
Reviewed-by: David Gow <davidgow@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2023-12-18 13:39:00 -07:00
davidgow@google.com
837018388e overflow: Replace fake root_device with kunit_device
Using struct root_device to create fake devices for tests is something
of a hack. The new struct kunit_device is meant for this purpose, so use
it instead.

Reviewed-by: Matti Vaittinen <mazziesaccount@gmail.com>
Acked-by: Kees Cook <keescook@chromium.org>
Signed-off-by: David Gow <davidgow@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2023-12-18 13:28:08 -07:00
davidgow@google.com
46ee8f688e fortify: test: Use kunit_device
Using struct root_device to create fake devices for tests is something
of a hack. The new struct kunit_device is meant for this purpose, so use
it instead.

Reviewed-by: Matti Vaittinen <mazziesaccount@gmail.com>
Acked-by: Kees Cook <keescook@chromium.org>
Signed-off-by: David Gow <davidgow@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2023-12-18 13:28:08 -07:00
davidgow@google.com
d03c720e03 kunit: Add APIs for managing devices
Tests for drivers often require a struct device to pass to other
functions. While it's possible to create these with
root_device_register(), or to use something like a platform device, this
is both a misuse of those APIs, and can be difficult to clean up after,
for example, a failed assertion.

Add some KUnit-specific functions for registering and unregistering a
struct device:
- kunit_device_register()
- kunit_device_register_with_driver()
- kunit_device_unregister()

These helpers allocate a on a 'kunit' bus which will either probe the
driver passed in (kunit_device_register_with_driver), or will create a
stub driver (kunit_device_register) which is cleaned up on test shutdown.

Devices are automatically unregistered on test shutdown, but can be
manually unregistered earlier with kunit_device_unregister() in order
to, for example, test device release code.

Reviewed-by: Matti Vaittinen <mazziesaccount@gmail.com>
Reviewed-by: Maxime Ripard <mripard@kernel.org>
Signed-off-by: David Gow <davidgow@google.com>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2023-12-18 13:28:08 -07:00
Rae Moar
c72a870926 kunit: add ability to run tests after boot using debugfs
Add functionality to run built-in tests after boot by writing to a
debugfs file.

Add a new debugfs file labeled "run" for each test suite to use for
this purpose.

As an example, write to the file using the following:

echo "any string" > /sys/kernel/debugfs/kunit/<testsuite>/run

This will trigger the test suite to run and will print results to the
kernel log.

To guard against running tests concurrently with this feature, add a
mutex lock around running kunit. This supports the current practice of
not allowing tests to be run concurrently on the same kernel.

This new functionality could be used to design a parameter
injection feature in the future.

Fixed up merge conflict duing rebase to Linux 6.7-rc6
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>

Reviewed-by: David Gow <davidgow@google.com>
Signed-off-by: Rae Moar <rmoar@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2023-12-18 13:25:49 -07:00
Rae Moar
6c4ea2f48d kunit: add is_init test attribute
Add is_init test attribute of type bool. Add to_string, get, and filter
methods to lib/kunit/attributes.c.

Mark each of the tests in the init section with the is_init=true attribute.

Add is_init to the attributes documentation.

Reviewed-by: David Gow <davidgow@google.com>
Signed-off-by: Rae Moar <rmoar@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2023-12-18 13:21:15 -07:00
Rae Moar
2cf4528157 kunit: add example suite to test init suites
Add example_init_test_suite to allow for testing the feature of running
test suites marked as init to indicate they use init data and/or
functions.

This suite should always pass and uses a simple init function.

This suite can also be used to test the is_init attribute introduced in
the next patch.

Signed-off-by: Rae Moar <rmoar@google.com>
Reviewed-by: David Gow <davidgow@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2023-12-18 13:21:15 -07:00
Rae Moar
d81f0d7b8b kunit: add KUNIT_INIT_TABLE to init linker section
Add KUNIT_INIT_TABLE to the INIT_DATA linker section.

Alter the KUnit macros to create init tests:
kunit_test_init_section_suites

Update lib/kunit/executor.c to run both the suites in KUNIT_TABLE and
KUNIT_INIT_TABLE.

Reviewed-by: David Gow <davidgow@google.com>
Signed-off-by: Rae Moar <rmoar@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2023-12-18 13:21:15 -07:00
Richard Fitzgerald
1557e89d3a kunit: debugfs: Handle errors from alloc_string_stream()
In kunit_debugfs_create_suite() give up and skip creating the debugfs
file if any of the alloc_string_stream() calls return an error or NULL.
Only put a value in the log pointer of kunit_suite and kunit_test if it
is a valid pointer to a log.

This prevents the potential invalid dereference reported by smatch:

 lib/kunit/debugfs.c:115 kunit_debugfs_create_suite() error: 'suite->log'
	dereferencing possible ERR_PTR()
 lib/kunit/debugfs.c:119 kunit_debugfs_create_suite() error: 'test_case->log'
	dereferencing possible ERR_PTR()

Signed-off-by: Richard Fitzgerald <rf@opensource.cirrus.com>
Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
Fixes: 05e2006ce4 ("kunit: Use string_stream for test log")
Reviewed-by: Rae Moar <rmoar@google.com>
Reviewed-by: David Gow <davidgow@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2023-12-18 13:21:15 -07:00
Richard Fitzgerald
34dfd5bb2e kunit: debugfs: Fix unchecked dereference in debugfs_print_results()
Move the call to kunit_suite_has_succeeded() after the check that
the kunit_suite pointer is valid.

This was found by smatch:

 lib/kunit/debugfs.c:66 debugfs_print_results() warn: variable
 dereferenced before check 'suite' (see line 63)

Signed-off-by: Richard Fitzgerald <rf@opensource.cirrus.com>
Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
Fixes: 38289a26e1 ("kunit: fix debugfs code to use enum kunit_status, not bool")
Reviewed-by: Rae Moar <rmoar@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2023-12-18 13:21:14 -07:00
Richard Fitzgerald
15bf000014 kunit: string-stream: Allow ERR_PTR to be passed to string_stream_destroy()
Check the stream pointer passed to string_stream_destroy() for
IS_ERR_OR_NULL() instead of only NULL.

Whatever alloc_string_stream() returns should be safe to pass
to string_stream_destroy(), and that will be an ERR_PTR.

It's obviously good practise and generally helpful to also check
for NULL pointers so that client cleanup code can call
string_stream_destroy() unconditionally - which could include
pointers that have never been set to anything and so are NULL.

Signed-off-by: Richard Fitzgerald <rf@opensource.cirrus.com>
Reviewed-by: Rae Moar <rmoar@google.com>
Reviewed-by: David Gow <davidgow@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2023-12-18 13:21:14 -07:00
Richard Fitzgerald
37f0d37ffc kunit: string-stream-test: Avoid cast warning when testing gfp_t flags
Passing a gfp_t to KUNIT_EXPECT_EQ() causes a cast warning:

  lib/kunit/string-stream-test.c:73:9: sparse: sparse: incorrect type in
  initializer (different base types) expected long long right_value
  got restricted gfp_t const __right

Avoid this by testing stream->gfp for the expected value and passing the
boolean result of this comparison to KUNIT_EXPECT_TRUE(), as was already
done a few lines above in string_stream_managed_init_test().

Signed-off-by: Richard Fitzgerald <rf@opensource.cirrus.com>
Fixes: d1a0d699bf ("kunit: string-stream: Add tests for freeing resource-managed string_stream")
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202311181918.0mpCu2Xh-lkp@intel.com/
Reviewed-by: Rae Moar <rmoar@google.com>
Reviewed-by: David Gow <davidgow@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2023-12-18 13:21:14 -07:00
David Gow
56778b49c9 kunit: Add a macro to wrap a deferred action function
KUnit's deferred action API accepts a void(*)(void *) function pointer
which is called when the test is exited. However, we very frequently
want to use existing functions which accept a single pointer, but which
may not be of type void*. While this is probably dodgy enough to be on
the wrong side of the C standard, it's been often used for similar
callbacks, and gcc's -Wcast-function-type seems to ignore cases where
the only difference is the type of the argument, assuming it's
compatible (i.e., they're both pointers to data).

However, clang 16 has introduced -Wcast-function-type-strict, which no
longer permits any deviation in function pointer type. This seems to be
because it'd break CFI, which validates the type of function calls.

This rather ruins our attempts to cast functions to defer them, and
leaves us with a few options. The one we've chosen is to implement a
macro which will generate a wrapper function which accepts a void*, and
casts the argument to the appropriate type.

For example, if you were trying to wrap:
void foo_close(struct foo *handle);
you could use:
KUNIT_DEFINE_ACTION_WRAPPER(kunit_action_foo_close,
			    foo_close,
			    struct foo *);

This would create a new kunit_action_foo_close() function, of type
kunit_action_t, which could be passed into kunit_add_action() and
similar functions.

In addition to defining this macro, update KUnit and its tests to use
it.

Link: https://github.com/ClangBuiltLinux/linux/issues/1750
Reviewed-by: Nathan Chancellor <nathan@kernel.org>
Tested-by: Nathan Chancellor <nathan@kernel.org>
Acked-by: Daniel Vetter <daniel@ffwll.ch>
Reviewed-by: Maxime Ripard <mripard@kernel.org>
Signed-off-by: David Gow <davidgow@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2023-12-18 13:21:14 -07:00
Jens Axboe
ae1914174a cred: get rid of CONFIG_DEBUG_CREDENTIALS
This code is rarely (never?) enabled by distros, and it hasn't caught
anything in decades. Let's kill off this legacy debug code.

Suggested-by: Linus Torvalds <torvalds@linuxfoundation.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2023-12-15 14:19:48 -08:00
Jakub Kicinski
8f674972d6 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Cross-merge networking fixes after downstream PR.

Conflicts:

drivers/net/ethernet/intel/iavf/iavf_ethtool.c
  3a0b5a2929 ("iavf: Introduce new state machines for flow director")
  95260816b4 ("iavf: use iavf_schedule_aq_request() helper")
https://lore.kernel.org/all/84e12519-04dc-bd80-bc34-8cf50d7898ce@intel.com/

drivers/net/ethernet/broadcom/bnxt/bnxt.c
  c13e268c07 ("bnxt_en: Fix HWTSTAMP_FILTER_ALL packet timestamp logic")
  c2f8063309 ("bnxt_en: Refactor RX VLAN acceleration logic.")
  a7445d6980 ("bnxt_en: Add support for new RX and TPA_START completion types for P7")
  1c7fd6ee2f ("bnxt_en: Rename some macros for the P5 chips")
https://lore.kernel.org/all/20231211110022.27926ad9@canb.auug.org.au/

drivers/net/ethernet/broadcom/bnxt/bnxt_ptp.c
  bd6781c18c ("bnxt_en: Fix wrong return value check in bnxt_close_nic()")
  84793a4995 ("bnxt_en: Skip nic close/open when configuring tstamp filters")
https://lore.kernel.org/all/20231214113041.3a0c003c@canb.auug.org.au/

drivers/net/ethernet/mellanox/mlx5/core/fw_reset.c
  3d7a3f2612 ("net/mlx5: Nack sync reset request when HotPlug is enabled")
  cecf44ea1a ("net/mlx5: Allow sync reset flow when BF MGT interface device is present")
https://lore.kernel.org/all/20231211110328.76c925af@canb.auug.org.au/

No adjacent changes.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-12-14 17:14:41 -08:00
Levi Yun
d9d9bd979c maple_tree: change return type of mas_split_final_node as void.
mas_split_final_node() always returns true and its return value is never
checked.

Change return type to void.

Link: https://lkml.kernel.org/r/20231109160821.16248-2-ppbuk5246@gmail.com
Signed-off-by: Levi Yun <ppbuk5246@gmail.com>
Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-12-12 10:57:09 -08:00
Peng Zhang
330018fe69 maple_tree: simplify mas_leaf_set_meta()
Now it seems that the incoming 'end' is already pointing to the last item,
so we can simplify this function, considering only whether the last slot
is being used.  This has passed the maple tree test suite.

Link: https://lkml.kernel.org/r/20231120070937.35481-6-zhangpeng.00@bytedance.com
Signed-off-by: Peng Zhang <zhangpeng.00@bytedance.com>
Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Dan Carpenter <dan.carpenter@linaro.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-12-12 10:57:01 -08:00
Peng Zhang
026b935cd9 maple_tree: delete one of the two identical checks
There are two identical checks, delete one of them.

Link: https://lkml.kernel.org/r/20231120070937.35481-5-zhangpeng.00@bytedance.com
Signed-off-by: Peng Zhang <zhangpeng.00@bytedance.com>
Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Dan Carpenter <dan.carpenter@linaro.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-12-12 10:57:00 -08:00
Peng Zhang
c5e9412138 maple_tree: remove an unused parameter for ma_meta_end()
The parameter maple_type is not used, so remove it.

Link: https://lkml.kernel.org/r/20231120070937.35481-4-zhangpeng.00@bytedance.com
Signed-off-by: Peng Zhang <zhangpeng.00@bytedance.com>
Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Dan Carpenter <dan.carpenter@linaro.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-12-12 10:57:00 -08:00
Peng Zhang
3f05fcdebf maple_tree: avoid ascending when mas->min is also the parent's minimum
When the child node is the first child of its parent node, mas->min does
not need to be updated. This can reduce the number of ascending times
in some cases.

Link: https://lkml.kernel.org/r/20231120070937.35481-3-zhangpeng.00@bytedance.com
Signed-off-by: Peng Zhang <zhangpeng.00@bytedance.com>
Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Dan Carpenter <dan.carpenter@linaro.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-12-12 10:57:00 -08:00
Peng Zhang
2e783f0c1a maple_tree: move the check forward to avoid static check warning
Patch series "Some cleanups of maple tree", v2.

These are some small cleanups of maple tree.


This patch (of 5):

Put the check for gap before its reference to avoid Smatch static check
warnings.  This is not a bug, it's just a validation program.  Even with
this change, Smatch may still generate warnings because MT_BUG_ON()
doesn't necessarily stop the program.  It may require fixing Smatch itself
to avoid these warnings.

Link: https://lkml.kernel.org/r/20231120070937.35481-1-zhangpeng.00@bytedance.com
Link: https://lkml.kernel.org/r/20231120070937.35481-2-zhangpeng.00@bytedance.com
Signed-off-by: Peng Zhang <zhangpeng.00@bytedance.com>
Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
Closes: http://lists.infradead.org/pipermail/maple-tree/2023-November/003046.html
Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-12-12 10:57:00 -08:00
Jiapeng Chong
d1fefa3d22 maple_tree: remove unused function
The function are defined in the maple_tree.c file, but not called
elsewhere, so delete the unused function.

lib/maple_tree.c:689:29: warning: unused function 'mas_pivot'.

Link: https://lkml.kernel.org/r/20231027084944.24888-1-jiapeng.chong@linux.alibaba.com
Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>
Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Closes: https://bugzilla.openanolis.cn/show_bug.cgi?id=7064
Acked-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-12-12 10:56:59 -08:00
Liam R. Howlett
a3c63c8c5d maple_tree: mtree_range_walk() clean up
mtree_range_walk() needed to be updated to avoid checking if there was a
pivot value.  On closer examination, the code could avoid setting min or
max in certain scenarios.  The commit removes the extra check for
pivot[offset] before setting max and only sets max when necessary.  It
also only sets min if it is necessary by checking offset 0 prior to the
loop (as it has always done).

The commit also drops a dead node check since the end of the node will
return the array size when the last slot is occupied (by a potential reuse
in a dead node).  The data will be discarded later if the node is marked
dead.

Benchmarking these changes results in an increase in performance of 5.45%
using the BENCH_WALK in the maple tree test code.

Link: https://lkml.kernel.org/r/20231101171629.3612299-13-Liam.Howlett@oracle.com
Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Peng Zhang <zhangpeng.00@bytedance.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-12-12 10:56:59 -08:00
Liam R. Howlett
24662decdd maple_tree: don't find node end in mtree_lookup_walk()
Since the pivot being set is now reliable, the optimized loop no longer
needs to find the node end.  The redundant check for a dead node can also
be avoided as there is no danger of using the wrong pivot since the
results will be thrown out in the case of a dead node by the later check.

This patch also adds a benchmark test for the function to the maple tree
test framework.  The benchmark shows an average increase performance of
5.98% over 3 runs with this commit.

Link: https://lkml.kernel.org/r/20231101171629.3612299-12-Liam.Howlett@oracle.com
Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Peng Zhang <zhangpeng.00@bytedance.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-12-12 10:56:59 -08:00
Liam R. Howlett
0de56e38b3 maple_tree: use maple state end for write operations
ma_wr_state was previously tracking the end of the node for writing. 
Since the implementation of the ma_state end tracking, this is duplicated
work.  This patch removes the maple write state tracking of the end of the
node and uses the maple state end instead.

Link: https://lkml.kernel.org/r/20231101171629.3612299-11-Liam.Howlett@oracle.com
Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Peng Zhang <zhangpeng.00@bytedance.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-12-12 10:56:59 -08:00
Liam R. Howlett
9a40d45c1f maple_tree: remove mas_searchable()
Now that the status of the maple state is outside of the node, the
mas_searchable() function can be dropped for easier open-coding of what is
going on.

Link: https://lkml.kernel.org/r/20231101171629.3612299-10-Liam.Howlett@oracle.com
Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Peng Zhang <zhangpeng.00@bytedance.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-12-12 10:56:58 -08:00
Liam R. Howlett
067311d33e maple_tree: separate ma_state node from status
The maple tree node is overloaded to keep status as well as the active
node.  This, unfortunately, results in a re-walk on underflow or overflow.
Since the maple state has room, the status can be placed in its own enum
in the structure.  Once an underflow/overflow is detected, certain modes
can restore the status to active and others may need to re-walk just that
one node to see the entry.

The status being an enum has the benefit of detecting unhandled status in
switch statements.

[Liam.Howlett@oracle.com: fix comments about MAS_*]
  Link: https://lkml.kernel.org/r/20231106154124.614247-1-Liam.Howlett@oracle.com
[Liam.Howlett@oracle.com: update forking to separate maple state and node]
  Link: https://lkml.kernel.org/r/20231106154551.615042-1-Liam.Howlett@oracle.com
[Liam.Howlett@oracle.com: fix mas_prev() state separation code]
  Link: https://lkml.kernel.org/r/20231207193319.4025462-1-Liam.Howlett@oracle.com
Link: https://lkml.kernel.org/r/20231101171629.3612299-9-Liam.Howlett@oracle.com
Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Peng Zhang <zhangpeng.00@bytedance.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-12-12 10:56:58 -08:00
Liam R. Howlett
271f61a8b4 maple_tree: clean up inlines for some functions
There are a few functions which were inlined but are somewhat too large to
inline, so remove the inline key word.

There are also several very small functions which are used in critical
code sections which gcc was not inlining, so make this more strict and use
__always_line for these functions.

Link: https://lkml.kernel.org/r/20231101171629.3612299-8-Liam.Howlett@oracle.com
Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Peng Zhang <zhangpeng.00@bytedance.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-12-12 10:56:58 -08:00
Liam R. Howlett
1f41ef12ab maple_tree: use cached node end in mas_destroy()
The node end is set during the walk, so use the resulting end instead of
re-fetching it.

Link: https://lkml.kernel.org/r/20231101171629.3612299-7-Liam.Howlett@oracle.com
Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Peng Zhang <zhangpeng.00@bytedance.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-12-12 10:56:58 -08:00
Liam R. Howlett
e9c52d8940 maple_tree: use cached node end in mas_next()
When looking for the next entry, don't recalculate the node end as it is
now tracked in the maple state.

Link: https://lkml.kernel.org/r/20231101171629.3612299-6-Liam.Howlett@oracle.com
Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Peng Zhang <zhangpeng.00@bytedance.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-12-12 10:56:57 -08:00
Liam R. Howlett
31c532a8af maple_tree: add end of node tracking to the maple state
Analysis of the mas_for_each() iteration showed that there is a
significant time spent finding the end of a node.  This time can be
greatly reduced if the end of the node is cached in the maple state.  Care
must be taken to update & invalidate as necessary.

Link: https://lkml.kernel.org/r/20231101171629.3612299-5-Liam.Howlett@oracle.com
Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Peng Zhang <zhangpeng.00@bytedance.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-12-12 10:56:57 -08:00
Liam R. Howlett
f7a5901895 maple_tree: make mas_erase() more robust
mas_erase() may not deal correctly with all maple states.  Make the
function more robust by ensuring the state is in one of the two acceptable
states.

Link: https://lkml.kernel.org/r/20231101171629.3612299-3-Liam.Howlett@oracle.com
Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Peng Zhang <zhangpeng.00@bytedance.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-12-12 10:56:57 -08:00
Liam R. Howlett
37a8ab24d3 maple_tree: remove unnecessary default labels from switch statements
Patch series "maple_tree: iterator state changes".

These patches have some general cleanup and a change to separate the maple
state status tracking from the maple state node.

The maple state status change allows for walks to continue from previous
places when the status needs to be recorded to make logical sense for the
next call to the maple state.  For instance, it allows for prev/next to
function in a way that better resembles the linked list.  It also allows
switch statements to be used to detect missed states during compile, and
the addition of fast-path "active" state is cleaner as an enum.

While making the status change, perf showed some very small (one line)
functions that were not inlined even with the inline key word.  Making
these small functions __always_inline is less expensive according to perf.
As part of that change, some inlines have been dropped from larger
functions.

Perf also showed that the commonly used mas_for_each() iterator was
spending a lot of time finding the end of the node.  This series
introduces caching of the end of the node in the maple state (and updating
it during writes).  This caching along with the inline changes yielded at
23.25% improvement on the BENCH_MAS_FOR_EACH maple tree test framework
benchmark.

I've also included a change to mtree_range_walk and mtree_lookup_walk to
take advantage of Peng's change [1] to the initial pivot setup.

mmtests did not produce any significant gains.

[1] https://lore.kernel.org/all/20230711035444.526-1-zhangpeng.00@bytedance.com/T/#u


This patch (of 12):

Removing the default types from the switch statements will cause compile
warnings on missing cases.

Link: https://lkml.kernel.org/r/20231101171629.3612299-2-Liam.Howlett@oracle.com
Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Suggested-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-12-12 10:56:56 -08:00
Heiko Carstens
18564756ab s390/fpu: get rid of MACHINE_HAS_VX
Get rid of MACHINE_HAS_VX and replace it with cpu_has_vx() which is a
short readable wrapper for "test_facility(129)".

Facility bit 129 is set if the vector facility is present. test_facility()
returns also true for all bits which are set in the architecture level set
of the cpu that the kernel is compiled for. This means that
test_facility(129) is a compile time constant which returns true for z13
and later, since the vector facility bit is part of the z13 kernel ALS.

In result the compiled code will have less runtime checks, and less code.

Reviewed-by: Hendrik Brueckner <brueckner@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2023-12-11 14:33:07 +01:00
Arnd Bergmann
fd6f52e3fa ida: make 'ida_dump' static
Patch series "Treewide: enable -Wmissing-prototypes", v3.

At this point, there are five architectures with a number of known
regressions: alpha, nios2, mips, sh and sparc.  In the previous version of
this patch, I had turned off the missing prototype warnings for the 15
architectures that still had issues, but since there are only five left, I
think we can leave the rest to the maintainers (Cc'd here) as well.

The series is also likely to cause occasional build regressions on
linux-next as developers add new code that misses prototypes.  Hopefully
this should be resolved by the time the patches make it into a release and
everyone gets the warnings right away.  


This patch (of 6):

There is no global declaration for ida_dump() and no other callers, so
make it static to avoid this warning:

lib/test_ida.c:16:6: error: no previous prototype for 'ida_dump'

Link: https://lkml.kernel.org/r/20231123110506.707903-1-arnd@kernel.org
Link: https://lkml.kernel.org/r/20231123110506.707903-2-arnd@kernel.org
Fixes: 8ab8ba38d4 ("ida: Start new test_ida module")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Dinh Nguyen <dinguyen@kernel.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
Cc: Masahiro Yamada <masahiroy@kernel.org>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Nicolas Schier <nicolas@fjasle.eu>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Richard Henderson <richard.henderson@linaro.org>
Cc: Richard Weinberger <richard@nod.at>
Cc: Rich Felker <dalias@libc.org>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Tudor Ambarus <tudor.ambarus@linaro.org>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: Kees Cook <keescook@chromium.org>
Cc: Palmer Dabbelt <palmer@rivosinc.com>
Cc: Zhihao Cheng <chengzhihao1@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-12-10 17:21:42 -08:00
Juntong Deng
5d4c6ac946 kasan: record and report more information
Record and report more information to help us find the cause of the bug
and to help us correlate the error with other system events.

This patch adds recording and showing CPU number and timestamp at
allocation and free (controlled by CONFIG_KASAN_EXTRA_INFO).  The
timestamps in the report use the same format and source as printk.

Error occurrence timestamp is already implicit in the printk log, and CPU
number is already shown by dump_stack_lvl, so there is no need to add it.

In order to record CPU number and timestamp at allocation and free,
corresponding members need to be added to the relevant data structures,
which will lead to increased memory consumption.

In Generic KASAN, members are added to struct kasan_track.  Since in most
cases, alloc meta is stored in the redzone and free meta is stored in the
object or the redzone, memory consumption will not increase much.

In SW_TAGS KASAN and HW_TAGS KASAN, members are added to struct
kasan_stack_ring_entry.  Memory consumption increases as the size of
struct kasan_stack_ring_entry increases (this part of the memory is
allocated by memblock), but since this is configurable, it is up to the
user to choose.

Link: https://lkml.kernel.org/r/VI1P193MB0752BD991325D10E4AB1913599BDA@VI1P193MB0752.EURP193.PROD.OUTLOOK.COM
Signed-off-by: Juntong Deng <juntong.deng@outlook.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Andrey Konovalov <andreyknvl@gmail.com>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Vincenzo Frascino <vincenzo.frascino@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-12-10 16:51:55 -08:00
Andrey Konovalov
bd9d9624b7 lib/stackdepot: adjust DEPOT_POOLS_CAP for KMSAN
KMSAN is frequently used in fuzzing scenarios and thus saves a lot of
stack traces.  As KMSAN does not support evicting stack traces from the
stack depot, the stack depot capacity might be reached quickly with large
stack records.

Adjust the maximum number of stack depot pools for this case.

The average size of a stack trace saved into the stack depot is ~16
frames.  Thus, adjust the maximum pools number accordingly to keep the
maximum number of stack traces that can be saved into the stack depot
similar to the one that was allowed before the stack trace eviction
changes.

Link: https://lkml.kernel.org/r/301a115cf7ce8ddb42ef6de9151c2bb76ba728fc.1700502145.git.andreyknvl@google.com
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Evgenii Stepanov <eugenis@google.com>
Cc: Marco Elver <elver@google.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-12-10 16:51:48 -08:00
Andrey Konovalov
108be8def4 lib/stackdepot: allow users to evict stack traces
Add stack_depot_put, a function that decrements the reference counter on a
stack record and removes it from the stack depot once the counter reaches
0.

Internally, when removing a stack record, the function unlinks it from the
hash table bucket and returns to the freelist.

With this change, the users of stack depot can call stack_depot_put when
keeping a stack trace in the stack depot is not needed anymore.  This
allows avoiding polluting the stack depot with irrelevant stack traces and
thus have more space to store the relevant ones before the stack depot
reaches its capacity.

Link: https://lkml.kernel.org/r/1d1ad5692ee43d4fc2b3fd9d221331d30b36123f.1700502145.git.andreyknvl@google.com
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Evgenii Stepanov <eugenis@google.com>
Cc: Marco Elver <elver@google.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-12-10 16:51:47 -08:00
Andrey Konovalov
410b764f89 lib/stackdepot: add refcount for records
Add a reference counter for how many times a stack records has been
  added to stack depot.

Add a new STACK_DEPOT_FLAG_GET flag to stack_depot_save_flags that
  instructs the stack depot to increment the refcount.

Do not yet decrement the refcount; this is implemented in one of the
  following patches.

Do not yet enable any users to use the flag to avoid overflowing the
  refcount.

This is preparatory patch for implementing the eviction of stack records
  from the stack depot.

Link: https://lkml.kernel.org/r/a3fc14a2359d019d2a008d4ff8b46a665371ffee.1700502145.git.andreyknvl@google.com
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Reviewed-by: Alexander Potapenko <glider@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Evgenii Stepanov <eugenis@google.com>
Cc: Marco Elver <elver@google.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-12-10 16:51:46 -08:00
Andrey Konovalov
022012dcf4 lib/stackdepot, kasan: add flags to __stack_depot_save and rename
Change the bool can_alloc argument of __stack_depot_save to a u32
  argument that accepts a set of flags.

The following patch will add another flag to stack_depot_save_flags
  besides the existing STACK_DEPOT_FLAG_CAN_ALLOC.

Also rename the function to stack_depot_save_flags, as
  __stack_depot_save is a cryptic name,

Link: https://lkml.kernel.org/r/645fa15239621eebbd3a10331e5864b718839512.1700502145.git.andreyknvl@google.com
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Reviewed-by: Alexander Potapenko <glider@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Evgenii Stepanov <eugenis@google.com>
Cc: Marco Elver <elver@google.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-12-10 16:51:46 -08:00
Andrey Konovalov
4805180bc1 lib/stackdepot: use list_head for stack record links
Switch stack_record to use list_head for links in the hash table and in
  the freelist.

This will allow removing entries from the hash table buckets.

This is preparatory patch for implementing the eviction of stack records
  from the stack depot.

Link: https://lkml.kernel.org/r/4787d9a584cd33433d9ee1846b17fa3d3e1987ad.1700502145.git.andreyknvl@google.com
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Evgenii Stepanov <eugenis@google.com>
Cc: Marco Elver <elver@google.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-12-10 16:51:46 -08:00
Andrey Konovalov
a6cd957021 lib/stackdepot: use read/write lock
Currently, stack depot uses the following locking scheme:

1. Lock-free accesses when looking up a stack record, which allows to
   have multiple users to look up records in parallel;
2. Spinlock for protecting the stack depot pools and the hash table
   when adding a new record.

For implementing the eviction of stack traces from stack depot, the
  lock-free approach is not going to work anymore, as we will need to be
  able to also remove records from the hash table.

Convert the spinlock into a read/write lock, and drop the atomic
  accesses, as they are no longer required.

Looking up stack traces is now protected by the read lock and adding new
  records - by the write lock.  One of the following patches will add a
  new function for evicting stack records, which will be protected by the
  write lock as well.

With this change, multiple users can still look up records in parallel.

This is preparatory patch for implementing the eviction of stack records
  from the stack depot.

Link: https://lkml.kernel.org/r/9f81ffcc4bb422ebb6326a65a770bf1918634cbb.1700502145.git.andreyknvl@google.com
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Reviewed-by: Alexander Potapenko <glider@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Evgenii Stepanov <eugenis@google.com>
Cc: Marco Elver <elver@google.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-12-10 16:51:45 -08:00
Andrey Konovalov
b29d318858 lib/stackdepot: store free stack records in a freelist
Instead of using the global pool_offset variable to find a free slot when
storing a new stack record, mainlain a freelist of free slots within the
allocated stack pools.

A global next_stack variable is used as the head of the freelist, and the
next field in the stack_record struct is reused as freelist link (when the
record is not in the freelist, this field is used as a link in the hash
table).

This is preparatory patch for implementing the eviction of stack records
from the stack depot.

Link: https://lkml.kernel.org/r/b9e4c79955c2121b69301778643b203d3fb09ccc.1700502145.git.andreyknvl@google.com
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Reviewed-by: Alexander Potapenko <glider@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Evgenii Stepanov <eugenis@google.com>
Cc: Marco Elver <elver@google.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-12-10 16:51:45 -08:00
Andrey Konovalov
a5d21f7171 lib/stackdepot: store next pool pointer in new_pool
Instead of using the last pointer in stack_pools for storing the pointer
to a new pool (which does not yet store any stack records), use a new
new_pool variable.

This a purely code readability change: it seems more logical to store the
pointer to a pool with a special meaning in a dedicated variable.

Link: https://lkml.kernel.org/r/448bc18296c16bef95cb3167697be6583dcc8ce3.1700502145.git.andreyknvl@google.com
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Reviewed-by: Alexander Potapenko <glider@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Evgenii Stepanov <eugenis@google.com>
Cc: Marco Elver <elver@google.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-12-10 16:51:45 -08:00
Andrey Konovalov
b6a353d3eb lib/stackdepot: rename next_pool_required to new_pool_required
Rename next_pool_required to new_pool_required.

This a purely code readability change: the following patch will change
stack depot to store the pointer to the new pool in a separate variable,
and "new" seems like a more logical name.

Link: https://lkml.kernel.org/r/fd7cd6c6eb250c13ec5d2009d75bb4ddd1470db9.1700502145.git.andreyknvl@google.com
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Reviewed-by: Alexander Potapenko <glider@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Evgenii Stepanov <eugenis@google.com>
Cc: Marco Elver <elver@google.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-12-10 16:51:45 -08:00
Andrey Konovalov
94b7d32870 lib/stackdepot: rework helpers for depot_alloc_stack
Split code in depot_alloc_stack and depot_init_pool into 3 functions:

1. depot_keep_next_pool that keeps preallocated memory for the next pool
   if required.

2. depot_update_pools that moves on to the next pool if there's no space
   left in the current pool, uses preallocated memory for the new current
   pool if required, and calls depot_keep_next_pool otherwise.

3. depot_alloc_stack that calls depot_update_pools and then allocates
   a stack record as before.

This makes it somewhat easier to follow the logic of depot_alloc_stack and
also serves as a preparation for implementing the eviction of stack
records from the stack depot.

Link: https://lkml.kernel.org/r/71fb144d42b701fcb46708d7f4be6801a4a8270e.1700502145.git.andreyknvl@google.com
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Reviewed-by: Alexander Potapenko <glider@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Evgenii Stepanov <eugenis@google.com>
Cc: Marco Elver <elver@google.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-12-10 16:51:44 -08:00
Andrey Konovalov
fcccc41ecb lib/stackdepot: fix and clean-up atomic annotations
Drop smp_load_acquire from next_pool_required in depot_init_pool, as both
depot_init_pool and the all smp_store_release's to this variable are
executed under the stack depot lock.

Also simplify and clean up comments accompanying the use of atomic
accesses in the stack depot code.

Link: https://lkml.kernel.org/r/c118ef044d8db80248d9e1f14592c72e8429e9d9.1700502145.git.andreyknvl@google.com
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Reviewed-by: Alexander Potapenko <glider@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Evgenii Stepanov <eugenis@google.com>
Cc: Marco Elver <elver@google.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-12-10 16:51:44 -08:00
Andrey Konovalov
fc60e0caa9 lib/stackdepot: use fixed-sized slots for stack records
Instead of storing stack records in stack depot pools one right after
another, use fixed-sized slots.

Add a new Kconfig option STACKDEPOT_MAX_FRAMES that allows to select the
size of the slot in frames.  Use 64 as the default value, which is the
maximum stack trace size both KASAN and KMSAN use right now.

Also add descriptions for other stack depot Kconfig options.

This is preparatory patch for implementing the eviction of stack records
from the stack depot.

Link: https://lkml.kernel.org/r/dce7d030a99ff61022509665187fac45b0827298.1700502145.git.andreyknvl@google.com
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Evgenii Stepanov <eugenis@google.com>
Cc: Marco Elver <elver@google.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-12-10 16:51:44 -08:00
Andrey Konovalov
83130ab2d8 lib/stackdepot: add depot_fetch_stack helper
Add a helper depot_fetch_stack function that fetches the pointer to a
stack record.

With this change, all static depot_* functions now operate on stack pools
and the exported stack_depot_* functions operate on the hash table.

Link: https://lkml.kernel.org/r/170d8c202f29dc8e3d5491ee074d1e9e029a46db.1700502145.git.andreyknvl@google.com
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Reviewed-by: Alexander Potapenko <glider@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Evgenii Stepanov <eugenis@google.com>
Cc: Marco Elver <elver@google.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-12-10 16:51:44 -08:00
Andrey Konovalov
5f9ce55e02 lib/stackdepot: drop valid bit from handles
Stack depot doesn't use the valid bit in handles in any way, so drop it.

Link: https://lkml.kernel.org/r/34969bba2ca6e012c6ad071767197dee64dc5723.1700502145.git.andreyknvl@google.com
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Reviewed-by: Alexander Potapenko <glider@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Evgenii Stepanov <eugenis@google.com>
Cc: Marco Elver <elver@google.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-12-10 16:51:44 -08:00
Andrey Konovalov
603c000c11 lib/stackdepot: simplify __stack_depot_save
The retval local variable in __stack_depot_save has the union type
handle_parts, but the function never uses anything but the union's handle
field.

Define retval simply as depot_stack_handle_t to simplify the code.

Link: https://lkml.kernel.org/r/3b0763c8057a1cf2f200ff250a5f9580ee36a28c.1700502145.git.andreyknvl@google.com
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Reviewed-by: Alexander Potapenko <glider@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Evgenii Stepanov <eugenis@google.com>
Cc: Marco Elver <elver@google.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-12-10 16:51:43 -08:00
Andrey Konovalov
0c5d44a814 lib/stackdepot: check disabled flag when fetching
Do not try fetching a stack trace from the stack depot if the
stack_depot_disabled flag is enabled.

Link: https://lkml.kernel.org/r/c3bfa3b7ab00b2e48ab75a3fbb9c67555777cb08.1700502145.git.andreyknvl@google.com
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Reviewed-by: Alexander Potapenko <glider@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Evgenii Stepanov <eugenis@google.com>
Cc: Marco Elver <elver@google.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-12-10 16:51:43 -08:00
Andrey Konovalov
4d07a03723 lib/stackdepot: print disabled message only if truly disabled
Patch series "stackdepot: allow evicting stack traces", v4.

Currently, the stack depot grows indefinitely until it reaches its
capacity.  Once that happens, the stack depot stops saving new stack
traces.

This creates a problem for using the stack depot for in-field testing and
in production.

For such uses, an ideal stack trace storage should:

1. Allow saving fresh stack traces on systems with a large uptime while
   limiting the amount of memory used to store the traces;
2. Have a low performance impact.

Implementing #1 in the stack depot is impossible with the current
keep-forever approach.  This series targets to address that.  Issue #2 is
left to be addressed in a future series.

This series changes the stack depot implementation to allow evicting
unneeded stack traces from the stack depot.  The users of the stack depot
can do that via new stack_depot_save_flags(STACK_DEPOT_FLAG_GET) and
stack_depot_put APIs.

Internal changes to the stack depot code include:

1. Storing stack traces in fixed-frame-sized slots (vs precisely-sized
   slots in the current implementation); the slot size is controlled via
   CONFIG_STACKDEPOT_MAX_FRAMES (default: 64 frames);
2. Keeping available slots in a freelist (vs keeping an offset to the next
   free slot);
3. Using a read/write lock for synchronization (vs a lock-free approach
   combined with a spinlock).

This series also integrates the eviction functionality into KASAN: the
tag-based modes evict stack traces when the corresponding entry leaves the
stack ring, and Generic KASAN evicts stack traces for objects once those
leave the quarantine.

With KASAN, despite wasting some space on rounding up the size of each
stack record, the total memory consumed by stack depot gets saturated due
to the eviction of irrelevant stack traces from the stack depot.

With the tag-based KASAN modes, the average total amount of memory used
for stack traces becomes ~0.5 MB (with the current default stack ring size
of 32k entries and the default CONFIG_STACKDEPOT_MAX_FRAMES of 64).  With
Generic KASAN, the stack traces take up ~1 MB per 1 GB of RAM (as the
quarantine's size depends on the amount of RAM).

However, with KMSAN, the stack depot ends up using ~4x more memory per a
stack trace than before.  Thus, for KMSAN, the stack depot capacity is
increased accordingly.  KMSAN uses a lot of RAM for shadow memory anyway,
so the increased stack depot memory usage will not make a significant
difference.

Other users of the stack depot do not save stack traces as often as KASAN
and KMSAN.  Thus, the increased memory usage is taken as an acceptable
trade-off.  In the future, these other users can take advantage of the
eviction API to limit the memory waste.

There is no measurable boot time performance impact of these changes for
KASAN on x86-64.  I haven't done any tests for arm64 modes (the stack
depot without performance optimizations is not suitable for intended use
of those anyway), but I expect a similar result.  Obtaining and copying
stack trace frames when saving them into stack depot is what takes the
most time.

This series does not yet provide a way to configure the maximum size of
the stack depot externally (e.g.  via a command-line parameter).  This
will be added in a separate series, possibly together with the performance
improvement changes.


This patch (of 22):

Currently, if stack_depot_disable=off is passed to the kernel command-line
after stack_depot_disable=on, stack depot prints a message that it is
disabled, while it is actually enabled.

Fix this by moving printing the disabled message to
stack_depot_early_init.  Place it before the
__stack_depot_early_init_requested check, so that the message is printed
even if early stack depot init has not been requested.

Also drop the stack_table = NULL assignment from disable_stack_depot, as
stack_table is NULL by default.

Link: https://lkml.kernel.org/r/cover.1700502145.git.andreyknvl@google.com
Link: https://lkml.kernel.org/r/73a25c5fff29f3357cd7a9330e85e09bc8da2cbe.1700502145.git.andreyknvl@google.com
Fixes: e1fdc40334 ("lib: stackdepot: add support to disable stack depot")
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Reviewed-by: Marco Elver <elver@google.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Evgenii Stepanov <eugenis@google.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-12-10 16:51:43 -08:00
Paul Heidekrüger
83a6fdd6c2 kasan: default to inline instrumentation
KASan inline instrumentation can yield up to a 2x performance gain at the
cost of a larger binary.

Make inline instrumentation the default, as suggested in the bug report
below.

When an architecture does not support inline instrumentation, it should
set ARCH_DISABLE_KASAN_INLINE, as done by PowerPC, for instance.

Link: https://lkml.kernel.org/r/20231109155101.186028-1-paul.heidekrueger@tum.de
Signed-off-by: Paul Heidekrüger <paul.heidekrueger@tum.de>
Reported-by: Andrey Konovalov <andreyknvl@gmail.com>
Reviewed-by: Marco Elver <elver@google.com>
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=203495
Acked-by: Andrey Konovalov <andreyknvl@gmail.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Vincenzo Frascino <vincenzo.frascino@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-12-10 16:51:40 -08:00
Peng Zhang
8e50d32c7a maple_tree: preserve the tree attributes when destroying maple tree
When destroying maple tree, preserve its attributes and then turn it into
an empty tree.  This allows it to be reused without needing to be
reinitialized.

Link: https://lkml.kernel.org/r/20231027033845.90608-10-zhangpeng.00@bytedance.com
Signed-off-by: Peng Zhang <zhangpeng.00@bytedance.com>
Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Mateusz Guzik <mjguzik@gmail.com>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Mike Christie <michael.christie@oracle.com>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-12-10 16:51:33 -08:00
Peng Zhang
446e1867e6 maple_tree: update check_forking() and bench_forking()
Updated check_forking() and bench_forking() to use __mt_dup() to duplicate
maple tree.

Link: https://lkml.kernel.org/r/20231027033845.90608-9-zhangpeng.00@bytedance.com
Signed-off-by: Peng Zhang <zhangpeng.00@bytedance.com>
Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Mateusz Guzik <mjguzik@gmail.com>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Mike Christie <michael.christie@oracle.com>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-12-10 16:51:33 -08:00
Peng Zhang
f670fa1caa maple_tree: skip other tests when BENCH is enabled
Skip other tests when BENCH is enabled so that performance can be measured
in user space.

Link: https://lkml.kernel.org/r/20231027033845.90608-8-zhangpeng.00@bytedance.com
Signed-off-by: Peng Zhang <zhangpeng.00@bytedance.com>
Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Mateusz Guzik <mjguzik@gmail.com>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Mike Christie <michael.christie@oracle.com>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-12-10 16:51:33 -08:00
Peng Zhang
fd32e4e9b7 maple_tree: introduce interfaces __mt_dup() and mtree_dup()
Introduce interfaces __mt_dup() and mtree_dup(), which are used to
duplicate a maple tree.  They duplicate a maple tree in Depth-First Search
(DFS) pre-order traversal.  It uses memcopy() to copy nodes in the source
tree and allocate new child nodes in non-leaf nodes.  The new node is
exactly the same as the source node except for all the addresses stored in
it.  It will be faster than traversing all elements in the source tree and
inserting them one by one into the new tree.  The time complexity of these
two functions is O(n).

The difference between __mt_dup() and mtree_dup() is that mtree_dup()
handles locks internally.

Analysis of the average time complexity of this algorithm:

For simplicity, let's assume that the maximum branching factor of all
non-leaf nodes is 16 (in allocation mode, it is 10), and the tree is a
full tree.

Under the given conditions, if there is a maple tree with n elements, the
number of its leaves is n/16.  From bottom to top, the number of nodes in
each level is 1/16 of the number of nodes in the level below.  So the
total number of nodes in the entire tree is given by the sum of n/16 +
n/16^2 + n/16^3 + ...  + 1.  This is a geometric series, and it has log(n)
terms with base 16.  According to the formula for the sum of a geometric
series, the sum of this series can be calculated as (n-1)/15.  Each node
has only one parent node pointer, which can be considered as an edge.  In
total, there are (n-1)/15-1 edges.

This algorithm consists of two operations:

1. Traversing all nodes in DFS order.
2. For each node, making a copy and performing necessary modifications
   to create a new node.

For the first part, DFS traversal will visit each edge twice.  Let
T(ascend) represent the cost of taking one step downwards, and T(descend)
represent the cost of taking one step upwards.  And both of them are
constants (although mas_ascend() may not be, as it contains a loop, but
here we ignore it and treat it as a constant).  So the time spent on the
first part can be represented as ((n-1)/15-1) * (T(ascend) + T(descend)).

For the second part, each node will be copied, and the cost of copying a
node is denoted as T(copy_node).  For each non-leaf node, it is necessary
to reallocate all child nodes, and the cost of this operation is denoted
as T(dup_alloc).  The behavior behind memory allocation is complex and not
specific to the maple tree operation.  Here, we assume that the time
required for a single allocation is constant.  Since the size of a node is
fixed, both of these symbols are also constants.  We can calculate that
the time spent on the second part is ((n-1)/15) * T(copy_node) + ((n-1)/15
- n/16) * T(dup_alloc).

Adding both parts together, the total time spent by the algorithm can be
represented as:

((n-1)/15) * (T(ascend) + T(descend) + T(copy_node) + T(dup_alloc)) -
n/16 * T(dup_alloc) - (T(ascend) + T(descend))

Let C1 = T(ascend) + T(descend) + T(copy_node) + T(dup_alloc)
Let C2 = T(dup_alloc)
Let C3 = T(ascend) + T(descend)

Finally, the expression can be simplified as:
((16 * C1 - 15 * C2) / (15 * 16)) * n - (C1 / 15 + C3).

This is a linear function, so the average time complexity is O(n).

Link: https://lkml.kernel.org/r/20231027033845.90608-4-zhangpeng.00@bytedance.com
Signed-off-by: Peng Zhang <zhangpeng.00@bytedance.com>
Suggested-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Mateusz Guzik <mjguzik@gmail.com>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Mike Christie <michael.christie@oracle.com>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-12-10 16:51:32 -08:00
Peng Zhang
4f2267b58a maple_tree: add mt_free_one() and mt_attr() helpers
Patch series "Introduce __mt_dup() to improve the performance of fork()", v7.

This series introduces __mt_dup() to improve the performance of fork(). 
During the duplication process of mmap, all VMAs are traversed and
inserted one by one into the new maple tree, causing the maple tree to be
rebalanced multiple times.  Balancing the maple tree is a costly
operation.  To duplicate VMAs more efficiently, mtree_dup() and __mt_dup()
are introduced for the maple tree.  They can efficiently duplicate a maple
tree.

Here are some algorithmic details about {mtree,__mt}_dup().  We perform a
DFS pre-order traversal of all nodes in the source maple tree.  During
this process, we fully copy the nodes from the source tree to the new
tree.  This involves memory allocation, and when encountering a new node,
if it is a non-leaf node, all its child nodes are allocated at once.

This idea was originally from Liam R.  Howlett's Maple Tree Work email,
and I added some of my own ideas to implement it.  Some previous
discussions can be found in [1].  For a more detailed analysis of the
algorithm, please refer to the logs for patch [3/10] and patch [10/10].

There is a "spawn" in byte-unixbench[2], which can be used to test the
performance of fork().  I modified it slightly to make it work with
different number of VMAs.

Below are the test results.  The first row shows the number of VMAs.  The
second and third rows show the number of fork() calls per ten seconds,
corresponding to next-20231006 and the this patchset, respectively.  The
test results were obtained with CPU binding to avoid scheduler load
balancing that could cause unstable results.  There are still some
fluctuations in the test results, but at least they are better than the
original performance.

21     121   221    421    821    1621   3221   6421   12821  25621  51221
112100 76261 54227  34035  20195  11112  6017   3161   1606   802    393
114558 83067 65008  45824  28751  16072  8922   4747   2436   1233   599
2.19%  8.92% 19.88% 34.64% 42.37% 44.64% 48.28% 50.17% 51.68% 53.74% 52.42%

Thanks to Liam and Matthew for the review.


This patch (of 10):

Add two helpers:
1. mt_free_one(), used to free a maple node.
2. mt_attr(), used to obtain the attributes of maple tree.

Link: https://lkml.kernel.org/r/20231027033845.90608-1-zhangpeng.00@bytedance.com
Link: https://lkml.kernel.org/r/20231027033845.90608-2-zhangpeng.00@bytedance.com
Signed-off-by: Peng Zhang <zhangpeng.00@bytedance.com>
Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Mateusz Guzik <mjguzik@gmail.com>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Mike Christie <michael.christie@oracle.com>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-12-10 16:51:31 -08:00
Tiezhu Yang
5181dc08f7 test_bpf: Rename second ALU64_SMOD_X to ALU64_SMOD_K
Currently, there are two test cases with same name
"ALU64_SMOD_X: -7 % 2 = -1", the first one is right,
the second one should be ALU64_SMOD_K because its
code is BPF_ALU64 | BPF_MOD | BPF_K.

Before:
test_bpf: #170 ALU64_SMOD_X: -7 % 2 = -1 jited:1 4 PASS
test_bpf: #171 ALU64_SMOD_X: -7 % 2 = -1 jited:1 4 PASS

After:
test_bpf: #170 ALU64_SMOD_X: -7 % 2 = -1 jited:1 4 PASS
test_bpf: #171 ALU64_SMOD_K: -7 % 2 = -1 jited:1 4 PASS

Fixes: daabb2b098 ("bpf/tests: add tests for cpuv4 instructions")
Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Acked-by: Yonghong Song <yonghong.song@linux.dev>
Link: https://lore.kernel.org/r/20231207040851.19730-1-yangtiezhu@loongson.cn
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-12-09 21:27:54 -08:00
Linus Torvalds
8e819a7623 31 hotfixes. 10 of these address pre-6.6 issues and are marked cc:stable.
The remainder address post-6.6 issues or aren't considered serious enough
 to justify backporting.
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCZXKEfwAKCRDdBJ7gKXxA
 jlRpAQCiAp1nSqIz/fOKTzoQRaTDXU/m+C+6ZAXdKLDfvQBhpwEAnxxjZ8IgF+8Z
 Klz/GirHX5w5o7jE2wb8iObo1nR75Qo=
 =omRq
 -----END PGP SIGNATURE-----

Merge tag 'mm-hotfixes-stable-2023-12-07-18-47' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull misc fixes from Andrew Morton:
 "31 hotfixes. Ten of these address pre-6.6 issues and are marked
  cc:stable. The remainder address post-6.6 issues or aren't considered
  serious enough to justify backporting"

* tag 'mm-hotfixes-stable-2023-12-07-18-47' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (31 commits)
  mm/madvise: add cond_resched() in madvise_cold_or_pageout_pte_range()
  nilfs2: prevent WARNING in nilfs_sufile_set_segment_usage()
  mm/hugetlb: have CONFIG_HUGETLB_PAGE select CONFIG_XARRAY_MULTI
  scripts/gdb: fix lx-device-list-bus and lx-device-list-class
  MAINTAINERS: drop Antti Palosaari
  highmem: fix a memory copy problem in memcpy_from_folio
  nilfs2: fix missing error check for sb_set_blocksize call
  kernel/Kconfig.kexec: drop select of KEXEC for CRASH_DUMP
  units: add missing header
  drivers/base/cpu: crash data showing should depends on KEXEC_CORE
  mm/damon/sysfs-schemes: add timeout for update_schemes_tried_regions
  scripts/gdb/tasks: fix lx-ps command error
  mm/Kconfig: make userfaultfd a menuconfig
  selftests/mm: prevent duplicate runs caused by TEST_GEN_PROGS
  mm/damon/core: copy nr_accesses when splitting region
  lib/group_cpus.c: avoid acquiring cpu hotplug lock in group_cpus_evenly
  checkstack: fix printed address
  mm/memory_hotplug: fix error handling in add_memory_resource()
  mm/memory_hotplug: add missing mem_hotplug_lock
  .mailmap: add a new address mapping for Chester Lin
  ...
2023-12-08 08:36:23 -08:00
Jakub Kicinski
2483e7f04c Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Cross-merge networking fixes after downstream PR.

Conflicts:

drivers/net/ethernet/stmicro/stmmac/dwmac5.c
drivers/net/ethernet/stmicro/stmmac/dwmac5.h
drivers/net/ethernet/stmicro/stmmac/dwxgmac2_core.c
drivers/net/ethernet/stmicro/stmmac/hwif.h
  37e4b8df27 ("net: stmmac: fix FPE events losing")
  c3f3b97238 ("net: stmmac: Refactor EST implementation")
https://lore.kernel.org/all/20231206110306.01e91114@canb.auug.org.au/

Adjacent changes:

net/ipv4/tcp_ao.c
  9396c4ee93 ("net/tcp: Don't store TCP-AO maclen on reqsk")
  7b0f570f87 ("tcp: Move TCP-AO bits from cookie_v[46]_check() to tcp_ao_syncookie().")

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-12-07 17:53:17 -08:00
Andrew Morton
0c92218f4e Merge branch 'master' into mm-hotfixes-stable 2023-12-06 17:03:50 -08:00
Ming Lei
0263f92fad lib/group_cpus.c: avoid acquiring cpu hotplug lock in group_cpus_evenly
group_cpus_evenly() could be part of storage driver's error handler, such
as nvme driver, when may happen during CPU hotplug, in which storage queue
has to drain its pending IOs because all CPUs associated with the queue
are offline and the queue is becoming inactive.  And handling IO needs
error handler to provide forward progress.

Then deadlock is caused:

1) inside CPU hotplug handler, CPU hotplug lock is held, and blk-mq's
   handler is waiting for inflight IO

2) error handler is waiting for CPU hotplug lock

3) inflight IO can't be completed in blk-mq's CPU hotplug handler
   because error handling can't provide forward progress.

Solve the deadlock by not holding CPU hotplug lock in group_cpus_evenly(),
in which two stage spreads are taken: 1) the 1st stage is over all present
CPUs; 2) the end stage is over all other CPUs.

Turns out the two stage spread just needs consistent 'cpu_present_mask',
and remove the CPU hotplug lock by storing it into one local cache.  This
way doesn't change correctness, because all CPUs are still covered.

Link: https://lkml.kernel.org/r/20231120083559.285174-1-ming.lei@redhat.com
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Reported-by: Yi Zhang <yi.zhang@redhat.com>
Reported-by: Guangwu Zhang <guazhang@redhat.com>
Tested-by: Guangwu Zhang <guazhang@redhat.com>
Reviewed-by: Chengming Zhou <zhouchengming@bytedance.com>
Reviewed-by: Jens Axboe <axboe@kernel.dk>
Cc: Keith Busch <kbusch@kernel.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-12-06 16:12:46 -08:00
Yuntao Wang
4b3805daaa ACPI: tables: Correct and clean up the logic of acpi_parse_entries_array()
The original intention of acpi_parse_entries_array() is to return the
number of all matching entries on success. This number may be greater than
the value of the max_entries parameter. When this happens, the function
will output a warning message, indicating that `count - max_entries`
matching entries remain unprocessed and have been ignored.

However, commit 4ceacd02f5 ("ACPI / table: Always count matched and
successfully parsed entries") changed this logic to return the number of
entries successfully processed by the handler. In this case, when the
max_entries parameter is not zero, the number of entries successfully
processed can never be greater than the value of max_entries. In other
words, the expression `count > max_entries` will always evaluate to false.
This means that the logic in the final if statement will never be executed.

Commit 99b0efd7c8 ("ACPI / tables: do not report the number of entries
ignored by acpi_parse_entries()") mentioned this issue, but it tried to fix
it by removing part of the warning message. This is meaningless because the
pr_warn statement will never be executed in the first place.

Commit 8726d4f441 ("ACPI / tables: fix acpi_parse_entries_array() so it
traverses all subtables") introduced an errs variable, which is intended to
make acpi_parse_entries_array() always traverse all of the subtables,
calling as many of the callbacks as possible. However, it seems that the
commit does not achieve this goal. For example, when a handler returns an
error, none of the handlers will be called again in the subsequent
iterations. This result appears to be no different from before the change.

This patch corrects and cleans up the logic of acpi_parse_entries_array(),
making it return the number of all matching entries, rather than the number
of entries successfully processed by handlers. Additionally, if an error
occurs when executing a handler, the function will return -EINVAL immediately.

This patch should not affect existing users of acpi_parse_entries_array().

Signed-off-by: Yuntao Wang <ytcoode@gmail.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-12-06 20:46:14 +01:00
Herve Codina
5c47251e8c lib/vsprintf: Fix %pfwf when current node refcount == 0
A refcount issue can appeared in __fwnode_link_del() due to the
pr_debug() call:
  WARNING: CPU: 0 PID: 901 at lib/refcount.c:25 refcount_warn_saturate+0xe5/0x110
  Call Trace:
  <TASK>
  ...
  of_node_get+0x1e/0x30
  of_fwnode_get+0x28/0x40
  fwnode_full_name_string+0x34/0x90
  fwnode_string+0xdb/0x140
  ...
  vsnprintf+0x17b/0x630
  ...
  __fwnode_link_del+0x25/0xa0
  fwnode_links_purge+0x39/0xb0
  of_node_release+0xd9/0x180
  ...

Indeed, an fwnode (of_node) is being destroyed and so, of_node_release()
is called because the of_node refcount reached 0.
From of_node_release() several function calls are done and lead to
a pr_debug() calls with %pfwf to print the fwnode full name.
The issue is not present if we change %pfwf to %pfwP.

To print the full name, %pfwf iterates over the current node and its
parents and obtain/drop a reference to all nodes involved.

In order to allow to print the full name (%pfwf) of a node while it is
being destroyed, do not obtain/drop a reference to this current node.

Fixes: a92eb7621b ("lib/vsprintf: Make use of fwnode API to obtain node names and separators")
Cc: stable@vger.kernel.org
Signed-off-by: Herve Codina <herve.codina@bootlin.com>
Reviewed-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Petr Mladek <pmladek@suse.com>
Link: https://lore.kernel.org/r/20231114152655.409331-1-herve.codina@bootlin.com
2023-12-06 11:06:59 +01:00
Jens Axboe
9fd7874c0e
iov_iter: replace import_single_range() with import_ubuf()
With the removal of the 'iov' argument to import_single_range(), the two
functions are now fully identical. Convert the import_single_range()
callers to import_ubuf(), and remove the former fully.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
Link: https://lore.kernel.org/r/20231204174827.1258875-3-axboe@kernel.dk
Signed-off-by: Christian Brauner <brauner@kernel.org>
2023-12-05 11:57:37 +01:00
Jens Axboe
6ac805d138
iov_iter: remove unused 'iov' argument from import_single_range()
It is entirely unused, just get rid of it.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
Link: https://lore.kernel.org/r/20231204174827.1258875-2-axboe@kernel.dk
Signed-off-by: Christian Brauner <brauner@kernel.org>
2023-12-05 11:57:34 +01:00
Vlastimil Babka
2a19be61a6 mm/slab: remove CONFIG_SLAB from all Kconfig and Makefile
Remove CONFIG_SLAB, CONFIG_DEBUG_SLAB, CONFIG_SLAB_DEPRECATED and
everything in Kconfig files and mm/Makefile that depends on those. Since
SLUB is the only remaining allocator, remove the allocator choice, make
CONFIG_SLUB a "def_bool y" for now and remove all explicit dependencies
on SLUB or SLAB as it's now always enabled. Make every option's verbose
name and description refer to "the slab allocator" without refering to
the specific implementation. Do not rename the CONFIG_ option names yet.

Everything under #ifdef CONFIG_SLAB, and mm/slab.c is now dead code, all
code under #ifdef CONFIG_SLUB is now always compiled.

Reviewed-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Christoph Lameter <cl@linux.com>
Acked-by: David Rientjes <rientjes@google.com>
Tested-by: David Rientjes <rientjes@google.com>
Reviewed-by: Hyeonggon Yoo <42.hyeyoo@gmail.com>
Tested-by: Hyeonggon Yoo <42.hyeyoo@gmail.com>
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
2023-12-05 11:14:40 +01:00
Linus Torvalds
669fc83452 Probes fixes for v6.7-r3:
- objpool: Fix objpool overrun case on memory/cache access delay especially
   on the big.LITTLE SoC. The objpool uses a copy of object slot index
   internal loop, but the slot index can be changed on another processor
   in parallel. In that case, the difference of 'head' local copy and the
   'slot->last' index will be bigger than local slot size. In that case,
   we need to re-read the slot::head to update it.
 
 - kretprobe: Fix to use appropriate rcu API for kretprobe holder. Since
   kretprobe_holder::rp is RCU managed, it should use rcu_assign_pointer()
   and rcu_dereference_check() correctly. Also adding __rcu tag for
   finding wrong usage by sparse.
 
 - rethook: Fix to use appropriate rcu API for rethook::handler. The same
   as kretprobe, rethook::handler is RCU managed and it should use
   rcu_assign_pointer() and rcu_dereference_check(). This also adds __rcu
   tag for finding wrong usage by sparse.
 -----BEGIN PGP SIGNATURE-----
 
 iQFPBAABCgA5FiEEh7BulGwFlgAOi5DV2/sHvwUrPxsFAmVpfBobHG1hc2FtaS5o
 aXJhbWF0c3VAZ21haWwuY29tAAoJENv7B78FKz8bNyMIAJSLICKQNuFiBJEn/rty
 ACWJ9QMOnwi0DoVaepG/m9QJh6AIUUFW4//9helmSm0GIVzxQ2+f8UeKU+sYiVtH
 ro9atea4W4+FMTvtEB1cU8oG5CDVT4WQdUXbjMktqYe3+WB8Zt8+fIP0mnbTFAVr
 yStpliGPecmlupJVRYqrJGyDdbkUxXxVlPsP/eDrHFgbBWv8Incw0f+MLGSi6LSE
 sZ1MaKCdi2tlHbtD/fiowfLoBMZwQAKY4hq/XguVsWh+BGaGUgwtif+8ESwPeu22
 KEZLyWDQ1N8XBHyOBotV7vsBEwh6LKtLGVXIBsO4KxVyGw6msxWBis0dt/tkn+kk
 LEg=
 =B9WK
 -----END PGP SIGNATURE-----

Merge tag 'probes-fixes-v6.7-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace

Pull probes fixes from Masami Hiramatsu:

 - objpool: Fix objpool overrun case on memory/cache access delay
   especially on the big.LITTLE SoC. The objpool uses a copy of object
   slot index internal loop, but the slot index can be changed on
   another processor in parallel. In that case, the difference of 'head'
   local copy and the 'slot->last' index will be bigger than local slot
   size. In that case, we need to re-read the slot::head to update it.

 - kretprobe: Fix to use appropriate rcu API for kretprobe holder. Since
   kretprobe_holder::rp is RCU managed, it should use
   rcu_assign_pointer() and rcu_dereference_check() correctly. Also
   adding __rcu tag for finding wrong usage by sparse.

 - rethook: Fix to use appropriate rcu API for rethook::handler. The
   same as kretprobe, rethook::handler is RCU managed and it should use
   rcu_assign_pointer() and rcu_dereference_check(). This also adds
   __rcu tag for finding wrong usage by sparse.

* tag 'probes-fixes-v6.7-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
  rethook: Use __rcu pointer for rethook::handler
  kprobes: consistent rcu api usage for kretprobe holder
  lib: objpool: fix head overrun on RK3588 SBC
2023-12-03 08:02:49 +09:00
Linus Torvalds
ce474ae7d0 ACPI fixes for 6.7-rc4
- Fix a recently introduced build issue on ARM32 platforms caused by an
    inadvertent header file breakage (Dave Jiang).
 
  - Eliminate questionable usage of acpi_driver_data() in the ACPI
    backlight cooling device code that leads to NULL pointer dereferences
    after recent ACPI core changes (Hans de Goede).
 -----BEGIN PGP SIGNATURE-----
 
 iQJGBAABCAAwFiEE4fcc61cGeeHD/fCwgsRv/nhiVHEFAmVqSGMSHHJqd0Byand5
 c29ja2kubmV0AAoJEILEb/54YlRxKBsP/RqwUb8oNYxreUmKgIVZ0H8SnPRfluVP
 4St+HeTxsmdN+obglVUWnhZMViewGsionLuq0y/FrYoWLI2F08UQAU8i248h20aZ
 nGXalM+n5H517dOidTJzvGxKHMOA2TrzyVna/IcQAYLnbXmp25j8EdHmuhrI3KfK
 3yxXLe+6J22776U/MMyutR+rwTVE0tgfQOWM4YuxT67uUPQVkKX+3/uYt/EfkKsX
 Dz2ce/5vF28JDjv/yTxaoctMpmjjem97av0J7Y1EM/5K9kTZ070U8OZhYEgBHw5o
 pRalhQlNz5VI/KQy3Mc8n4DmrwrPoktCBI0pQPr8FV56dmCFmTaFY1C4/SxebTr4
 O2U3r5GkmcxLCKZXAUUAc8J+M6BBRHNQtlpBN3iNRG4ID2x72idPn5fr02UeGk4g
 lxysNzcIwxcOuogeKTD2IERzLZA7Ub3qm8gguFOMqnxV1nmPx6f1nl1MuxLsJY4e
 ZQwrwZCDJr5CZUrvJz6Mo9HUibLQOELgtsWinKV9OYUgTOfQ69t+XuwsPYIYtQiH
 UV7qfE+ZLiG/jZPgP6tfN6InbtB9I4xXNMhKyoZrGTNtbKUVLdDHJH62/vIyHBXQ
 VuLr9jH+CuyRjJSRCnbWA1BWkLhWfHQnFaq9UVWAWkVd8MisGVolnCSPGjUyQur7
 ZT7i3c13nxJ2
 =ANS/
 -----END PGP SIGNATURE-----

Merge tag 'acpi-6.7-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull ACPI fixes from Rafael Wysocki:
 "This fixes a recently introduced build issue on ARM32 and a NULL
  pointer dereference in the ACPI backlight driver due to a design issue
  exposed by a recent change in the ACPI bus type code.

  Specifics:

   - Fix a recently introduced build issue on ARM32 platforms caused by
     an inadvertent header file breakage (Dave Jiang)

   - Eliminate questionable usage of acpi_driver_data() in the ACPI
     backlight cooling device code that leads to NULL pointer
     dereferences after recent ACPI core changes (Hans de Goede)"

* tag 'acpi-6.7-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  ACPI: video: Use acpi_video_device for cooling-dev driver data
  ACPI: Fix ARM32 platforms compile issue introduced by fw_table changes
2023-12-02 08:52:20 +09:00
Linus Torvalds
e6861be452 More bcachefs bugfixes for 6.7
Bigger/user visible fixes:
 
  - bcache & bcachefs were broken with CFI enabled; patch for closures to
    fix type punning
 
  - mark erasure coding as extra-experimental; there are incompatible
    disk space accounting changes coming for erasure coding, and I'm
    still seeing checksum errors in some tests
 
  - several fixes for durability-related issues (durability is a device
    specific setting where we can tell bcachefs that data on a given
    device should be counted as replicated x times )
 
  - a fix for a rare livelock when a btree node merge then updates a
    parent node that is almost full
 
  - fix a race in the device removal path, where dropping a pointer in a
    btree node to a device would be clobbered by an in flight btree write
    updating the btree node key on completion
 
  - fix one SRCU lock hold time warning in the btree gc code - ther's
    still a bunch more of these to fix
 
  - fix a rare race where we'd start copygc before initializing the "are
    we rw" percpu refcount; copygc would think we were already ro and die
    immediately
 
 https://evilpiepirate.org/~testdashboard/ci?branch=bcachefs-for-upstream
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEKnAFLkS8Qha+jvQrE6szbY3KbnYFAmVnoHoACgkQE6szbY3K
 bnbzLBAApVEg3kB3XDCHYw+8AxLbzkuKbuV8FR/w+ULYAmRKbnM5e4pM4UJzwVJ9
 vzBS9KUT4mVNpA5zl7FWmqh5AiJkhbPgb/BijtQiS+gz1ofZ8uCW/DjzWZpaTaT9
 0zz9auiKwzJbBmLXC2lWC28MUPjFNXxlP2pfQPqhpKqlGKBC893hKeJ0Veb6dM1R
 DqkctoWtSQzsNpEaXiQpKBNoNUIlYcFX1XXHn+XpPpWNe80SpMfVNCs2qPkMByu/
 V/QULE9cHI7RTu7oyFY80+9xQDeXDDYZgvtpD7hqNPcyyoix+r/DVz1mZe41XF2B
 bvaJhfcdWePctmiuEXJVXT4HSkwwzC6EKHwi7fejGY56hOvsrEAxNzTEIPRNw5st
 ZkZlxASwFqkiJ3ehy+KRngLX2GZSbJsU4aM5ViQJKtz4rBzGyyf0LmMucdxAoDH5
 zLzsAYaA6FkIZ5e5ZNdTDj7/TMnKWXlU9vTttqIpb8s7qSy+3ejk5NuGitJihZ4R
 LAaCTs1JIsItLP47Ko0ZvmKV6CHlmt+Ht8OBqu73BWJ8vsBTQ8JMK4mGt60bwHvm
 LdEMtp3C3FmXFc06zhKoGgjrletZYO6G4mFBPnQqh1brfFXM1prVg3ftDTqBWkMI
 iAz2chiVc8k0qxoSAqylCYFaGzgiBKzw6YMtqPRmZgfLcq/sJ34=
 =vN+y
 -----END PGP SIGNATURE-----

Merge tag 'bcachefs-2023-11-29' of https://evilpiepirate.org/git/bcachefs

Pull more bcachefs bugfixes from Kent Overstreet:

 - bcache & bcachefs were broken with CFI enabled; patch for closures to
   fix type punning

 - mark erasure coding as extra-experimental; there are incompatible
   disk space accounting changes coming for erasure coding, and I'm
   still seeing checksum errors in some tests

 - several fixes for durability-related issues (durability is a device
   specific setting where we can tell bcachefs that data on a given
   device should be counted as replicated x times)

 - a fix for a rare livelock when a btree node merge then updates a
   parent node that is almost full

 - fix a race in the device removal path, where dropping a pointer in a
   btree node to a device would be clobbered by an in flight btree write
   updating the btree node key on completion

 - fix one SRCU lock hold time warning in the btree gc code - ther's
   still a bunch more of these to fix

 - fix a rare race where we'd start copygc before initializing the "are
   we rw" percpu refcount; copygc would think we were already ro and die
   immediately

* tag 'bcachefs-2023-11-29' of https://evilpiepirate.org/git/bcachefs: (23 commits)
  bcachefs: Extra kthread_should_stop() calls for copygc
  bcachefs: Convert gc_alloc_start() to for_each_btree_key2()
  bcachefs: Fix race between btree writes and metadata drop
  bcachefs: move journal seq assertion
  bcachefs: -EROFS doesn't count as move_extent_start_fail
  bcachefs: trace_move_extent_start_fail() now includes errcode
  bcachefs: Fix split_race livelock
  bcachefs: Fix bucket data type for stripe buckets
  bcachefs: Add missing validation for jset_entry_data_usage
  bcachefs: Fix zstd compress workspace size
  bcachefs: bpos is misaligned on big endian
  bcachefs: Fix ec + durability calculation
  bcachefs: Data update path won't accidentaly grow replicas
  bcachefs: deallocate_extra_replicas()
  bcachefs: Proper refcounting for journal_keys
  bcachefs: preserve device path as device name
  bcachefs: Fix an endianness conversion
  bcachefs: Start gc, copygc, rebalance threads after initing writes ref
  bcachefs: Don't stop copygc thread on device resize
  bcachefs: Make sure bch2_move_ratelimit() also waits for move_ops
  ...
2023-12-02 06:02:16 +09:00
Rafael J. Wysocki
7d4c44a53d Merge branch 'acpi-tables'
Merge a fix for a recently introduced build issue on ARM32 platforms
caused by an inadvertent header file breakage (Dave Jiang).

* acpi-tables:
  ACPI: Fix ARM32 platforms compile issue introduced by fw_table changes
2023-12-01 21:32:19 +01:00
wuqiang.matt
d67f39d2b8 lib: objpool: fix head overrun on RK3588 SBC
objpool overrun stress with test_objpool on OrangePi5+ SBC triggered the
following kernel warnings:

    WARNING: CPU: 6 PID: 3115 at lib/objpool.c:168 objpool_push+0xc0/0x100

This message is from objpool.c:168:

    WARN_ON_ONCE(tail - head > pool->nr_objs);

The overrun test case is to validate the case that pre-allocated objects
are insufficient: 8 objects are pre-allocated for each node and consumer
thread per node tries to grab 16 objects in a row. The testing system is
OrangePI 5+, with RK3588, a big.LITTLE SOC with 4x A76 and 4x A55. When
disabling either all 4 big or 4 little cores, the overrun tests run well,
and once with big and little cores mixed together, the overrun test would
always cause an overrun loop. It's likely the memory timing differences
of big and little cores cause this trouble. Here are the debugging data
of objpool_try_get_slot after try_cmpxchg_release:

    objpool_pop: cpu: 4/0 0:0 head: 278/279 tail:278 last:276/278

The local copies of 'head' and 'last' were 278 and 276, and reloading of
'slot->head' and 'slot->last' got 279 and 278. After try_cmpxchg_release
'slot->head' became 'head + 1', which is correct. But what's wrong here
is the stale value of 'last', and that stale value of 'last' finally led
the overrun of 'head'.

Memory updating of 'last' and 'head' are performed in push() and pop()
independently, which could be the culprit leading this out of order
visibility of 'last' and 'head'. So for objpool_try_get_slot(), it's
not enough only checking the condition of 'head != slot', the implicit
condition 'last - head <= nr_objs' must also be explicitly asserted to
guarantee 'last' is always behind 'head' before the object retrieving.

This patch will check and try reloading of 'head' and 'last' to ensure
'last' is behind 'head' at the time of object retrieving. Performance
testings show the average impact is about 0.1% for X86_64 and 1.12% for
ARM64. Here are the results:

    OS: Debian 10 X86_64, Linux 6.6rc
    HW: XEON 8336C x 2, 64 cores/128 threads, DDR4 3200MT/s
                      1T         2T         4T         8T        16T
    native:     49543304   99277826  199017659  399070324  795185848
    objpool:    29909085   59865637  119692073  239750369  478005250
    objpool+:   29879313   59230743  119609856  239067773  478509029
                     32T        48T        64T        96T       128T
    native:   1596927073 2390099988 2929397330 3183875848 3257546602
    objpool:   957553042 1435814086 1680872925 2043126796 2165424198
    objpool+:  956476281 1434491297 1666055740 2041556569 2157415622

    OS: Debian 11 AARCH64, Linux 6.6rc
    HW: Kunpeng-920 96 cores/2 sockets/4 NUMA nodes, DDR4 2933 MT/s
                      1T         2T         4T         8T        16T
    native:     30890508   60399915  123111980  242257008  494002946
    objpool:    14742531   28883047   57739948  115886644  232455421
    objpool+:   14107220   29032998   57286084  113730493  232232850
                     24T        32T        48T        64T        96T
    native:    746406039 1000174750 1493236240 1998318364 2942911180
    objpool:   349164852  467284332  702296756  934459713 1387898285
    objpool+:  348388180  462750976  696606096  927865887 1368402195

Link: https://lore.kernel.org/all/20231114115148.298821-1-wuqiang.matt@bytedance.com/

Fixes: b4edb8d2d4 ("lib: objpool added: ring-array based lockless MPMC")
Signed-off-by: wuqiang.matt <wuqiang.matt@bytedance.com>
Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
2023-12-01 14:53:55 +09:00
Linus Torvalds
47669f40b1 linux_kselftest-kunit-fixes-6.7-rc4
This KUnit fixes update for Linux 6.7-rc4 consists of three fixes to
 warnings and run-time test behavior. With these fixes, test suite
 counter will be reset correctly before running tests, kunit will warn
 if tests are too slow, and eliminate warning when kfree() as an action.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEPZKym/RZuOCGeA/kCwJExA0NQxwFAmVosvIACgkQCwJExA0N
 QxyzqxAAtXffplLUm7saZW7bXW3IB5Bugzu5qjp8Zw0YR7WNhgXMimCVHotWhuHX
 CrQiRIf1ZyXt6+qnUYXMIsEpMri03qhun2E8lmsA6Ws7xAHgQh5nORPEZOgBEV+3
 HnyPJTT2RBPBoOZuRTipYeu6MZE1YSnmpGdftYWHWENThqG+qb5WJ/Rs33JSZAnx
 O/XnCHZigqc7aCxZHnm8HcFWfikn4pFg2f76XU7j6uvH+ZIuQJEZvCdukzjWvGeC
 AP47AaZ04rqbkRZAA5nOew7SIgb0fK3dYdxsWfv3ai1gJMReUwqwQQ/44tnxtPs2
 s0d2FzH8+OL3MIQ0PFMH/E16bmAwcGXFlRmJvJFdbMoWa+zji8mhUSw7qIqJaFTI
 6oMHd4XdHGwHYHrLgQgLdIzuggTRSusnaMBubGJe0xo2xCgFI6CjNUzLNk1zsY2L
 OeKNlTUH2l3oOnlia4CsOPzx5HOe204ZCZwQR1fkSQVmfkl5cfz17Mr/lKshxH+Q
 X+t56bLy5rhyUI6LxrAj7AC6E8MVQ/fiial3v1lDmJKD6Bh5YL0n6wAl5eWv3Hcq
 Com0maVlGCaPIErzjaSM9RqBCYT7Le89J/p3lJ7rXF3zUY8B1Miv7IVCeHDOxzpl
 AwMI9PavdHOvh2NeaDku/fQwzID7ytdT7dAktQIviUReqWnZRGU=
 =g2ql
 -----END PGP SIGNATURE-----

Merge tag 'linux_kselftest-kunit-fixes-6.7-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest

Pull KUnit fixes from Shuah Khan:
 "Three fixes to warnings and run-time test behavior. With these fixes,
  test suite counter will be reset correctly before running tests, kunit
  will warn if tests are too slow, and eliminate warning when kfree() as
  an action"

* tag 'linux_kselftest-kunit-fixes-6.7-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
  kunit: test: Avoid cast warning when adding kfree() as an action
  kunit: Reset suite counter right before running tests
  kunit: Warn if tests are slow
2023-12-01 14:03:05 +09:00
Jakub Kicinski
753c8608f3 bpf-next-for-netdev
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQTFp0I1jqZrAX+hPRXbK58LschIgwUCZWiCPAAKCRDbK58LschI
 g4djAQC1FdqCRIFkhbiIRNHTgHjnfQShELQbd9ofJqzylLqmmgD+JI1E7D9SXagm
 pIXQ26EGmq8/VcCT3VLncA8EsC76Gg4=
 =Xowm
 -----END PGP SIGNATURE-----

Merge tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next

Daniel Borkmann says:

====================
pull-request: bpf-next 2023-11-30

We've added 30 non-merge commits during the last 7 day(s) which contain
a total of 58 files changed, 1598 insertions(+), 154 deletions(-).

The main changes are:

1) Add initial TX metadata implementation for AF_XDP with support in mlx5
   and stmmac drivers. Two types of offloads are supported right now, that
   is, TX timestamp and TX checksum offload, from Stanislav Fomichev with
   stmmac implementation from Song Yoong Siang.

2) Change BPF verifier logic to validate global subprograms lazily instead
   of unconditionally before the main program, so they can be guarded using
   BPF CO-RE techniques, from Andrii Nakryiko.

3) Add BPF link_info support for uprobe multi link along with bpftool
   integration for the latter, from Jiri Olsa.

4) Use pkg-config in BPF selftests to determine ld flags which is
   in particular needed for linking statically, from Akihiko Odaki.

5) Fix a few BPF selftest failures to adapt to the upcoming LLVM18,
   from Yonghong Song.

* tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (30 commits)
  bpf/tests: Remove duplicate JSGT tests
  selftests/bpf: Add TX side to xdp_hw_metadata
  selftests/bpf: Convert xdp_hw_metadata to XDP_USE_NEED_WAKEUP
  selftests/bpf: Add TX side to xdp_metadata
  selftests/bpf: Add csum helpers
  selftests/xsk: Support tx_metadata_len
  xsk: Add option to calculate TX checksum in SW
  xsk: Validate xsk_tx_metadata flags
  xsk: Document tx_metadata_len layout
  net: stmmac: Add Tx HWTS support to XDP ZC
  net/mlx5e: Implement AF_XDP TX timestamp and checksum offload
  tools: ynl: Print xsk-features from the sample
  xsk: Add TX timestamp and TX checksum offload support
  xsk: Support tx_metadata_len
  selftests/bpf: Use pkg-config for libelf
  selftests/bpf: Override PKG_CONFIG for static builds
  selftests/bpf: Choose pkg-config for the target
  bpftool: Add support to display uprobe_multi links
  selftests/bpf: Add link_info test for uprobe_multi link
  selftests/bpf: Use bpf_link__destroy in fill_link_info tests
  ...
====================

Conflicts:

Documentation/netlink/specs/netdev.yaml:
  839ff60df3 ("net: page_pool: add nlspec for basic access to page pools")
  48eb03dd26 ("xsk: Add TX timestamp and TX checksum offload support")
https://lore.kernel.org/all/20231201094705.1ee3cab8@canb.auug.org.au/

While at it also regen, tree is dirty after:
  48eb03dd26 ("xsk: Add TX timestamp and TX checksum offload support")
looks like code wasn't re-rendered after "render-max" was removed.

Link: https://lore.kernel.org/r/20231130145708.32573-1-daniel@iogearbox.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-11-30 16:58:42 -08:00
Jakub Kicinski
975f2d73a9 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Cross-merge networking fixes after downstream PR.

No conflicts.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-11-30 16:11:19 -08:00
Yujie Liu
f690ff9122 bpf/tests: Remove duplicate JSGT tests
It seems unnecessary that JSGT is tested twice (one before JSGE and one
after JSGE) since others are tested only once. Remove the duplicate JSGT
tests.

Fixes: 0bbaa02b48 ("bpf/tests: Add tests to check source register zero-extension")
Signed-off-by: Yujie Liu <yujie.liu@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Johan Almbladh <johan.almbladh@anyfinetworks.com>
Link: https://lore.kernel.org/bpf/20231130034018.2144963-1-yujie.liu@intel.com
2023-11-30 12:17:33 +01:00
Linus Torvalds
d2da77f431 parisc architecture fixes for kernel v6.7-rc3:
- Drop HP-UX ENOSYM and EREMOTERELEASE return codes to avoid glibc
   build issues
 - Fix section alignments for ex_table, altinstructions, parisc unwind
   table, jump_table and bug_table
 - Reduce size of bug_table on 64-bit kernel by using relative
   pointers
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQS86RI+GtKfB8BJu973ErUQojoPXwUCZWNy+gAKCRD3ErUQojoP
 XwmuAQDw3a424GmG9c4lxQgPCdtFMUeOCOpMpPNNWdT8UhAcngD9FL5++lt1mwhK
 mq+w+ftx9sATtUIbvX5exkD1dChRaQw=
 =+Q66
 -----END PGP SIGNATURE-----

Merge tag 'parisc-for-6.7-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux

Pull parisc architecture fixes from Helge Deller:
 "This patchset fixes and enforces correct section alignments for the
  ex_table, altinstructions, parisc_unwind, jump_table and bug_table
  which are created by inline assembly.

  Due to not being correctly aligned at link & load time they can
  trigger unnecessarily the kernel unaligned exception handler at
  runtime. While at it, I switched the bug table to use relative
  addresses which reduces the size of the table by half on 64-bit.

  We still had the ENOSYM and EREMOTERELEASE errno symbols as left-overs
  from HP-UX, which now trigger build-issues with glibc. We can simply
  remove them.

  Most of the patches are tagged for stable kernel series.

  Summary:

   - Drop HP-UX ENOSYM and EREMOTERELEASE return codes to avoid glibc
     build issues

   - Fix section alignments for ex_table, altinstructions, parisc unwind
     table, jump_table and bug_table

   - Reduce size of bug_table on 64-bit kernel by using relative
     pointers"

* tag 'parisc-for-6.7-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux:
  parisc: Reduce size of the bug_table on 64-bit kernel by half
  parisc: Drop the HP-UX ENOSYM and EREMOTERELEASE error codes
  parisc: Use natural CPU alignment for bug_table
  parisc: Ensure 32-bit alignment on parisc unwind section
  parisc: Mark lock_aligned variables 16-byte aligned on SMP
  parisc: Mark jump_table naturally aligned
  parisc: Mark altinstructions read-only and 32-bit aligned
  parisc: Mark ex_table entries 32-bit aligned in uaccess.h
  parisc: Mark ex_table entries 32-bit aligned in assembly.h
2023-11-26 09:59:39 -08:00
Helge Deller
e5f3e299a2 parisc: Drop the HP-UX ENOSYM and EREMOTERELEASE error codes
Those return codes are only defined for the parisc architecture and
are leftovers from when we wanted to be HP-UX compatible.

They are not returned by any Linux kernel syscall but do trigger
problems with the glibc strerrorname_np() and strerror() functions as
reported in glibc issue #31080.

There is no need to keep them, so simply remove them.

Signed-off-by: Helge Deller <deller@gmx.de>
Reported-by: Bruno Haible <bruno@clisp.org>
Closes: https://sourceware.org/bugzilla/show_bug.cgi?id=31080
Cc: stable@vger.kernel.org
2023-11-25 09:43:18 +01:00
Jakub Kicinski
53775da0b4 Merge branch 'firmware_loader'
Kory says:

====================
This patch was initially submitted as part of a net patch series.
Conor expressed interest in using it in a different subsystem.
https://lore.kernel.org/netdev/20231116-feature_poe-v1-7-be48044bf249@bootlin.com/

Consequently, I extracted it from the series and submitted it separately.
I first tried to send it to driver-core but it seems also not the best
choice:
https://lore.kernel.org/lkml/2023111720-slicer-exes-7d9f@gregkh/
====================

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-11-24 18:11:55 -08:00
Kory Maincent
a066f906ba firmware_loader: Expand Firmware upload error codes with firmware invalid error
No error code are available to signal an invalid firmware content.
Drivers that can check the firmware content validity can not return this
specific failure to the user-space

Expand the firmware error code with an additional code:
- "firmware invalid" code which can be used when the provided firmware
  is invalid

Sync lib/test_firmware.c file accordingly.

Acked-by: Luis Chamberlain <mcgrof@kernel.org>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Kory Maincent <kory.maincent@bootlin.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/20231122-feature_firmware_error_code-v3-1-04ec753afb71@bootlin.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-11-24 18:09:19 -08:00
Linus Torvalds
fa2b906f51 vfs-6.7-rc3.fixes
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCZWBq0gAKCRCRxhvAZXjc
 ot4EAP48O5ExMtQ3/AIkNDo+/9/Iz4g7bE1HYmdyiMPO3Ou/uwEAySwBXRJrFAsS
 9omvkEdqrfyguW0xgoYwcxBdATVHnAE=
 =ScR3
 -----END PGP SIGNATURE-----

Merge tag 'vfs-6.7-rc3.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs

Pull vfs fixes from Christian Brauner:

 - Avoid calling back into LSMs from vfs_getattr_nosec() calls.

   IMA used to query inode properties accessing raw inode fields without
   dedicated helpers. That was finally fixed a few releases ago by
   forcing IMA to use vfs_getattr_nosec() helpers.

   The goal of the vfs_getattr_nosec() helper is to query for attributes
   without calling into the LSM layer which would be quite problematic
   because incredibly IMA is called from __fput()...

     __fput()
       -> ima_file_free()

   What it does is to call back into the filesystem to update the file's
   IMA xattr. Querying the inode without using vfs_getattr_nosec() meant
   that IMA didn't handle stacking filesystems such as overlayfs
   correctly. So the switch to vfs_getattr_nosec() is quite correct. But
   the switch to vfs_getattr_nosec() revealed another bug when used on
   stacking filesystems:

     __fput()
       -> ima_file_free()
          -> vfs_getattr_nosec()
             -> i_op->getattr::ovl_getattr()
                -> vfs_getattr()
                   -> i_op->getattr::$WHATEVER_UNDERLYING_FS_getattr()
                      -> security_inode_getattr() # calls back into LSMs

   Now, if that __fput() happens from task_work_run() of an exiting task
   current->fs and various other pointer could already be NULL. So
   anything in the LSM layer relying on that not being NULL would be
   quite surprised.

   Fix that by passing the information that this is a security request
   through to the stacking filesystem by adding a new internal
   ATT_GETATTR_NOSEC flag. Now the callchain becomes:

     __fput()
       -> ima_file_free()
          -> vfs_getattr_nosec()
             -> i_op->getattr::ovl_getattr()
                -> if (AT_GETATTR_NOSEC)
                          vfs_getattr_nosec()
                   else
                          vfs_getattr()
                   -> i_op->getattr::$WHATEVER_UNDERLYING_FS_getattr()

 - Fix a bug introduced with the iov_iter rework from last cycle.

   This broke /proc/kcore by copying too much and without the correct
   offset.

 - Add a missing NULL check when allocating the root inode in
   autofs_fill_super().

 - Fix stable writes for multi-device filesystems (xfs, btrfs etc) and
   the block device pseudo filesystem.

   Stable writes used to be a superblock flag only, making it a per
   filesystem property. Add an additional AS_STABLE_WRITES mapping flag
   to allow for fine-grained control.

 - Ensure that offset_iterate_dir() returns 0 after reaching the end of
   a directory so it adheres to getdents() convention.

* tag 'vfs-6.7-rc3.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
  libfs: getdents() should return 0 after reaching EOD
  xfs: respect the stable writes flag on the RT device
  xfs: clean up FS_XFLAG_REALTIME handling in xfs_ioctl_setattr_xflags
  block: update the stable_writes flag in bdev_add
  filemap: add a per-mapping stable writes flag
  autofs: add: new_inode check in autofs_fill_super()
  iov_iter: fix copy_page_to_iter_nofault()
  fs: Pass AT_GETATTR_NOSEC flag to getattr interface function
2023-11-24 09:45:40 -08:00
Kent Overstreet
d4e3b928ab closures: CLOSURE_CALLBACK() to fix type punning
Control flow integrity is now checking that type signatures match on
indirect function calls. That breaks closures, which embed a work_struct
in a closure in such a way that a closure_fn may also be used as a
workqueue fn by the underlying closure code.

So we have to change closure fns to take a work_struct as their
argument - but that results in a loss of clarity, as closure fns have
different semantics from normal workqueue functions (they run owning a
ref on the closure, which must be released with continue_at() or
closure_return()).

Thus, this patc introduces CLOSURE_CALLBACK() and closure_type() macros
as suggested by Kees, to smooth things over a bit.

Suggested-by: Kees Cook <keescook@chromium.org>
Cc: Coly Li <colyli@suse.de>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-11-24 00:29:58 -05:00
Dave Jiang
35732699f5 ACPI: Fix ARM32 platforms compile issue introduced by fw_table changes
Linus reported that:
After commit a103f46633 the kernel stopped compiling for
several ARM32 platforms that I am building with a bare metal
compiler. Bare metal compilers (arm-none-eabi-) don't
define __linux__.

This is because the header <acpi/platform/acenv.h> is now
in the include path for <linux/irq.h>:

  CC      arch/arm/kernel/irq.o
  CC      kernel/sysctl.o
  CC      crypto/api.o
In file included from ../include/acpi/acpi.h:22,
                 from ../include/linux/fw_table.h:29,
                 from ../include/linux/acpi.h:18,
                 from ../include/linux/irqchip.h:14,
                 from ../arch/arm/kernel/irq.c:25:
../include/acpi/platform/acenv.h:218:2: error: #error Unknown target environment
  218 | #error Unknown target environment
      |  ^~~~~

The issue is caused by the introducing of splitting out the ACPI code to
support the new generic fw_table code.

Rafael suggested [1] moving the fw_table.h include in linux/acpi.h to below
the linux/mutex.h. Remove the two includes in fw_table.h. Replace
linux/fw_table.h include in fw_table.c with linux/acpi.h.

Link: https://lore.kernel.org/linux-acpi/CAJZ5v0idWdJq3JSqQWLG5q+b+b=zkEdWR55rGYEoxh7R6N8kFQ@mail.gmail.com/
Fixes: a103f46633 ("acpi: Move common tables helper functions to common lib")
Closes: https://lore.kernel.org/linux-acpi/20231114-arm-build-bug-v1-1-458745fe32a4@linaro.org/
Reported-by: Linus Walleij <linus.walleij@linaro.org>
Suggested-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Tested-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-11-22 20:41:34 +01:00
Andrzej Hajda
9bb6362652 debugobjects: Stop accessing objects after releasing hash bucket lock
After release of the hashbucket lock the tracking object can be modified or
freed by a concurrent thread.  Using it in such a case is error prone, even
for printing the object state:

    1. T1 tries to deactivate destroyed object, debugobjects detects it,
       hash bucket lock is released.

    2. T2 preempts T1 and frees the tracking object.

    3. The freed tracking object is allocated and initialized for a
       different to be tracked kernel object.

    4. T1 resumes and reports error for wrong kernel object.

Create a local copy of the tracking object before releasing the hash bucket
lock and use the local copy for reporting and fixups to prevent this.

Signed-off-by: Andrzej Hajda <andrzej.hajda@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20231025-debugobjects_fix-v3-1-2bc3bf7084c2@intel.com
2023-11-22 10:41:46 +01:00
Jakub Kicinski
53475287da bpf-next-for-netdev
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQTFp0I1jqZrAX+hPRXbK58LschIgwUCZV0kjgAKCRDbK58LschI
 gy0EAP9XwncW2OhO72DpITluFzvWPgB0N97OANKBXjzKJrRAlQD/aUe9nlvBQuad
 WsbMKLeC4wvI2X/4PEIR4ukbuZ3ypAA=
 =LMVg
 -----END PGP SIGNATURE-----

Merge tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next

Daniel Borkmann says:

====================
pull-request: bpf-next 2023-11-21

We've added 85 non-merge commits during the last 12 day(s) which contain
a total of 63 files changed, 4464 insertions(+), 1484 deletions(-).

The main changes are:

1) Huge batch of verifier changes to improve BPF register bounds logic
   and range support along with a large test suite, and verifier log
   improvements, all from Andrii Nakryiko.

2) Add a new kfunc which acquires the associated cgroup of a task within
   a specific cgroup v1 hierarchy where the latter is identified by its id,
   from Yafang Shao.

3) Extend verifier to allow bpf_refcount_acquire() of a map value field
   obtained via direct load which is a use-case needed in sched_ext,
   from Dave Marchevsky.

4) Fix bpf_get_task_stack() helper to add the correct crosstask check
   for the get_perf_callchain(), from Jordan Rome.

5) Fix BPF task_iter internals where lockless usage of next_thread()
   was wrong. The rework also simplifies the code, from Oleg Nesterov.

6) Fix uninitialized tail padding via LIBBPF_OPTS_RESET, and another
   fix for certain BPF UAPI structs to fix verifier failures seen
   in bpf_dynptr usage, from Yonghong Song.

7) Add BPF selftest fixes for map_percpu_stats flakes due to per-CPU BPF
   memory allocator not being able to allocate per-CPU pointer successfully,
   from Hou Tao.

8) Add prep work around dynptr and string handling for kfuncs which
   is later going to be used by file verification via BPF LSM and fsverity,
   from Song Liu.

9) Improve BPF selftests to update multiple prog_tests to use ASSERT_*
   macros, from Yuran Pereira.

10) Optimize LPM trie lookup to check prefixlen before walking the trie,
    from Florian Lehner.

11) Consolidate virtio/9p configs from BPF selftests in config.vm file
    given they are needed consistently across archs, from Manu Bretelle.

12) Small BPF verifier refactor to remove register_is_const(),
    from Shung-Hsi Yu.

* tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (85 commits)
  selftests/bpf: Replaces the usage of CHECK calls for ASSERTs in vmlinux
  selftests/bpf: Replaces the usage of CHECK calls for ASSERTs in bpf_obj_id
  selftests/bpf: Replaces the usage of CHECK calls for ASSERTs in bind_perm
  selftests/bpf: Replaces the usage of CHECK calls for ASSERTs in bpf_tcp_ca
  selftests/bpf: reduce verboseness of reg_bounds selftest logs
  bpf: bpf_iter_task_next: use next_task(kit->task) rather than next_task(kit->pos)
  bpf: bpf_iter_task_next: use __next_thread() rather than next_thread()
  bpf: task_group_seq_get_next: use __next_thread() rather than next_thread()
  bpf: emit frameno for PTR_TO_STACK regs if it differs from current one
  bpf: smarter verifier log number printing logic
  bpf: omit default off=0 and imm=0 in register state log
  bpf: emit map name in register state if applicable and available
  bpf: print spilled register state in stack slot
  bpf: extract register state printing
  bpf: move verifier state printing code to kernel/bpf/log.c
  bpf: move verbose_linfo() into kernel/bpf/log.c
  bpf: rename BPF_F_TEST_SANITY_STRICT to BPF_F_TEST_REG_INVARIANTS
  bpf: Remove test for MOVSX32 with offset=32
  selftests/bpf: add iter test requiring range x range logic
  veristat: add ability to set BPF_F_TEST_SANITY_STRICT flag with -r flag
  ...
====================

Link: https://lore.kernel.org/r/20231122000500.28126-1-daniel@iogearbox.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-11-21 17:53:20 -08:00
Omar Sandoval
fe2c34bab6 iov_iter: fix copy_page_to_iter_nofault()
The recent conversion to inline functions made two mistakes:

1. It tries to copy the full amount requested (bytes), not just what's
   available in the kmap'd page (n).
2. It's not applying the offset in the first page.

Note that copy_page_to_iter_nofault() is only used by /proc/kcore. This
was detected by drgn's test suite.

Fixes: f1982740f5 ("iov_iter: Convert iterate*() to inline funcs")
Signed-off-by: Omar Sandoval <osandov@fb.com>
Link: https://lore.kernel.org/r/c1616e06b5248013cbbb1881bb4fef85a7a69ccb.1700257019.git.osandov@fb.com
Acked-by: David Howells <dhowells@redhat.com>
Signed-off-by: Christian Brauner <brauner@kernel.org>
2023-11-18 16:42:07 +01:00
Sagar Vashnav
239e27a983 crypto: lib/aesgcm - Add kernel docs for aesgcm_mac
Add kernel documentation for the aesgcm_mac.
This function generates the authentication tag using the AES-GCM algorithm.

Signed-off-by: Sagar Vashnav <sagarvashnav72427@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2023-11-17 19:16:28 +08:00
Puranjay Mohan
5fa201f37c bpf: Remove test for MOVSX32 with offset=32
MOVSX32 only supports sign extending 8-bit and 16-bit operands into 32
bit operands. The "ALU_MOVSX | BPF_W" test tries to sign extend a 32 bit
operand into a 32 bit operand which is equivalent to a normal BPF_MOV.

Remove this test as it tries to run an invalid instruction.

Fixes: daabb2b098 ("bpf/tests: add tests for cpuv4 instructions")
Signed-off-by: Puranjay Mohan <puranjay12@gmail.com>
Reported-by: kernel test robot <oliver.sang@intel.com>
Closes: https://lore.kernel.org/oe-lkp/202310111838.46ff5b6a-oliver.sang@intel.com
Acked-by: Stanislav Fomichev <sdf@google.com>
Acked-by: Yonghong Song <yonghong.song@linux.dev>
Link: https://lore.kernel.org/r/20231110175150.87803-1-puranjay12@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-11-15 12:46:27 -08:00
Linus Torvalds
86d11b0e20 Zstd fixes for v6.7
Only a single line change to fix a benign UBSAN warning that has been
 baking in linux-next for a month. I just missed the merge window, but I
 think it is worthwhile to include this fix in the v6.7 kernel. If you
 would like me to wait for v6.8 please let me know.
 
 Signed-off-by: Nick Terrell <terrelln@fb.com>
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEmIwAqlFIzbQodPwyuzRpqaNEqPUFAmVUG20ACgkQuzRpqaNE
 qPXx2Q/9GSGokNZm3+ZgINijpxsSz/+WT34ZJ4n14ns97PiiiSLQ10Nzpy3xGfHC
 gQTXYUoRJhuTUsid8QDsbgoLzybOxgZsxQWECrWEioXS+AiGCqGLnC85JdZNpJO3
 hWrxO6j0XfLb6Sgl5gm0CBWVuGkdKSUCCd91iHIZ9vKwFWSsQizGsdyYVXB4gZn4
 qYYGhka5TunDXgaTeqBGlNe3x6b/RvsPTBsX3SnAI3aBd88I6xrpcURrYru3NerZ
 16V2t5+GpiDz8cm7IHwEekh4JNaMwRLO8jhkThdErbLjFs/TU0s2hXkvG0m/yujs
 4Yklv+j0wIJv7f0IMSEZi+msnun+ajx8Mg4P1hi6WEiyylHC7aGAT55ShKhO8Qzx
 FD1kNiNR9svfHL83fJAZ3iEuhOQokRnt2qADutYmp0kq6BCcEnNuMBOpSyw+hzDZ
 MHDCkTujDTobE8gzZBEwUYPiw9kX6c3vsvb/mmNv8+y2LUxyb0GShinw3od0RlKQ
 T5xF6yq9neAxbARFJ9e3Zs51e5OXnCbb+K+pMNqR+nw89hUXiscynN93o4jMCzpn
 VBmhjzaSUTiTABG2TrqWNa5Vr6J3l77sr6T5bvC5aNpQQZ9KZFFPmpup8Z4BtBrG
 89BZqDsrUDRlMpS4Z2ECbl5GUELP3EBSV1NHRgVM4BB7t/UzvzU=
 =w0Q0
 -----END PGP SIGNATURE-----

Merge tag 'zstd-linus-v6.7-rc2' of https://github.com/terrelln/linux

Pull Zstd fix from Nick Terrell:
 "Only a single line change to fix a benign UBSAN warning"

* tag 'zstd-linus-v6.7-rc2' of https://github.com/terrelln/linux:
  zstd: Fix array-index-out-of-bounds UBSAN warning
2023-11-14 23:35:31 -05:00
Nick Terrell
77618db346 zstd: Fix array-index-out-of-bounds UBSAN warning
Zstd used an array of length 1 to mean a flexible array for C89
compatibility. Switch to a C99 flexible array to fix the UBSAN warning.

Tested locally by booting the kernel and writing to and reading from a
BtrFS filesystem with zstd compression enabled. I was unable to reproduce
the issue before the fix, however it is a trivial change.

Link: https://lkml.kernel.org/r/20231012213428.1390905-1-nickrterrell@gmail.com
Reported-by: syzbot+1f2eb3e8cd123ffce499@syzkaller.appspotmail.com
Reported-by: Eric Biggers <ebiggers@kernel.org>
Reported-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Nick Terrell <terrelln@fb.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
2023-11-14 17:12:52 -08:00
Richard Fitzgerald
1bddcf77ce kunit: test: Avoid cast warning when adding kfree() as an action
In kunit_log_test() pass the kfree_wrapper() function to kunit_add_action()
instead of directly passing kfree().

This prevents a cast warning:

lib/kunit/kunit-test.c:565:25: warning: cast from 'void (*)(const void *)'
to 'kunit_action_t *' (aka 'void (*)(void *)') converts to incompatible
function type [-Wcast-function-type-strict]

   564		full_log = string_stream_get_string(test->log);
 > 565		kunit_add_action(test, (kunit_action_t *)kfree, full_log);

Signed-off-by: Richard Fitzgerald <rf@opensource.cirrus.com>
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202311070041.kWVYx7YP-lkp@intel.com/
Fixes: 05e2006ce4 ("kunit: Use string_stream for test log")
Reviewed-by: David Gow <davidgow@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2023-11-14 13:01:57 -07:00
Michal Wajdeczko
2e3c94aed5 kunit: Reset suite counter right before running tests
Today we reset the suite counter as part of the suite cleanup,
called from the module exit callback, but it might not work that
well as one can try to collect results without unloading a previous
test (either unintentionally or due to dependencies).

For easy reproduction try to load the kunit-test.ko and then
collect and parse results from the kunit-example-test.ko load.
Parser will complain about mismatch of expected test number:

[ ] KTAP version 1
[ ] 1..1
[ ]     # example: initializing suite
[ ]     KTAP version 1
[ ]     # Subtest: example
..
[ ] # example: pass:5 fail:0 skip:4 total:9
[ ] # Totals: pass:6 fail:0 skip:6 total:12
[ ] ok 7 example

[ ] [ERROR] Test: example: Expected test number 1 but found 7
[ ] ===================== [PASSED] example =====================
[ ] ============================================================
[ ] Testing complete. Ran 12 tests: passed: 6, skipped: 6, errors: 1

Since we are now printing suite test plan on every module load,
right before running suite tests, we should make sure that suite
counter will also start from 1. Easiest solution seems to be move
counter reset to the __kunit_test_suites_init() function.

Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: David Gow <davidgow@google.com>
Cc: Rae Moar <rmoar@google.com>
Reviewed-by: David Gow <davidgow@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2023-11-14 13:01:47 -07:00
Maxime Ripard
f8f2847f73 kunit: Warn if tests are slow
Kunit recently gained support to setup attributes, the first one being
the speed of a given test, then allowing to filter out slow tests.

A slow test is defined in the documentation as taking more than one
second. There's an another speed attribute called "super slow" but whose
definition is less clear.

Add support to the test runner to check the test execution time, and
report tests that should be marked as slow but aren't.

Signed-off-by: Maxime Ripard <mripard@kernel.org>
Reviewed-by: David Gow <davidgow@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2023-11-14 13:01:37 -07:00
wuqiang.matt
3afe733729 lib: test_objpool: make global variables static
Kernel test robot reported build warnings that structures g_ot_sync_ops,
g_ot_async_ops and g_testcases should be static. These definitions are
only used in test_objpool.c, so make them static

Link: https://lore.kernel.org/all/20231108012248.313574-1-wuqiang.matt@bytedance.com/

Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202311071229.WGrWUjM1-lkp@intel.com/

Signed-off-by: wuqiang.matt <wuqiang.matt@bytedance.com>
Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
2023-11-10 19:59:04 +09:00
Linus Torvalds
c9d01179e1 Second bcachefs pull request for 6.7-rc1
Here's the second big bcachefs pull request. This brings your tree up to
 date with my master branch, which is what existing bcachefs users are
 currently running.
 
 All but the last few patches have been in linux-next, those being small
 fixes. Test results from my dashboard:
   https://evilpiepirate.org/~testdashboard/ci?commit=c7046ed0cf9bb33599aa7e72e7b67bba4be42d64
 
 New features:
  - rebalance_work btree (and metadata version 1.3): the rebalance thread
    no longer has to scan to find extents that need processing - big
    scalability improvement.
  - sb_errors superblock section: this adds counters for each fsck error
    type, since filesystem creation, along with the date of the most
    recent error. It'll get us better bug reports (since users do not
    typically report errors that fsck was able to fix), and I might add
    telemetry for this in the future.
 
 Fixes include:
  - multiple snapshot deletion fixes
  - members_v2 fixups
  - deleted_inodes btree fixes
  - copygc thread no longer spins when a device is full but has no
    fragmented buckets (i.e. rebalance needs to move data around instead)
  - a fix for a memory reclaim issue with the btree key cache: we're now
    careful not to hold the srcu read lock that blocks key cache reclaim
    for too long
  - an early allocator locking fix, from Brian
  - endianness fixes, from Brian
  - CONFIG_BCACHEFS_DEBUG_TRANSACTIONS no longer defaults to y, a big
    performance improvement on multithreaded workloads
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEKnAFLkS8Qha+jvQrE6szbY3KbnYFAmVH9xYACgkQE6szbY3K
 bnahLRAAiNRZL73SQ+MW79o4yPqGwt0Eyy/mvoiGpZf1B8uXp0oZ55j2w3l887Uf
 LeM03mInAYCPdyp/d4vxqIr96j9BODmRRl8sEkkGdJDzokLG+22F0ovOe45KWTxL
 kBoNdng/O/oeOe/1K7taP3KzBvMx2nOF6oA+xfgyCjECMArAIXek0iocyEUR4Ywd
 vGKhLNn1k2c+94wacnDYwjjdcLBxoqxsFXlpu6V0BcaY+DX4J3aBaGmj75KEoCI0
 VbBOzxrOO4QzJrzW2+hxZZWgGyvReCkBJvqfORfuPxiSbFobTim10MdfZOAMQA1U
 Xr1FTEpK1wMX0/pPVgZRqaOsttC+yc/SsfPNgSxybgHPbDlMLaakDHjvYssbKOYG
 urDWSMG5yCsktSLj95SXsvUFKZaZFD72SKBNdgdt/nZjwTHuNQ7IkdrMwIrCQ/PT
 Ifn50UrR/Ahd8RAd5tyNCPw6U9VfwnxACSNl2KA7ONKpvHb+gSt1JsJTDyz1+gN9
 nFVrw1SHKQ6EIV6XhVon/5DEuRTzqoYGWoN08FHEUq9fBlvnVpmbJErCQMplOjz9
 OQnAfpJH4YqkpXyjFAjP1V0An+RUn8QvDgXNqC9TyvCYuOliVFuil4y7/c+7oIQU
 NEoz+jVLenqsGOGAbduI4/Q567COojRgwEvbebSIxSImXuhCNj4=
 =Lo4N
 -----END PGP SIGNATURE-----

Merge tag 'bcachefs-2023-11-5' of https://evilpiepirate.org/git/bcachefs

Pull more bcachefs updates from Kent Overstreet:
 "Here's the second big bcachefs pull request. This brings your tree up
  to date with my master branch, which is what existing bcachefs users
  are currently running.

  New features:
   - rebalance_work btree (and metadata version 1.3): the rebalance
     thread no longer has to scan to find extents that need processing -
     big scalability improvement.
   - sb_errors superblock section: this adds counters for each fsck
     error type, since filesystem creation, along with the date of the
     most recent error. It'll get us better bug reports (since users do
     not typically report errors that fsck was able to fix), and I might
     add telemetry for this in the future.

  Fixes include:
   - multiple snapshot deletion fixes
   - members_v2 fixups
   - deleted_inodes btree fixes
   - copygc thread no longer spins when a device is full but has no
     fragmented buckets (i.e. rebalance needs to move data around
     instead)
   - a fix for a memory reclaim issue with the btree key cache: we're
     now careful not to hold the srcu read lock that blocks key cache
     reclaim for too long
   - an early allocator locking fix, from Brian
   - endianness fixes, from Brian
   - CONFIG_BCACHEFS_DEBUG_TRANSACTIONS no longer defaults to y, a big
     performance improvement on multithreaded workloads"

* tag 'bcachefs-2023-11-5' of https://evilpiepirate.org/git/bcachefs: (70 commits)
  bcachefs: Improve stripe checksum error message
  bcachefs: Simplify, fix bch2_backpointer_get_key()
  bcachefs: kill thing_it_points_to arg to backpointer_not_found()
  bcachefs: bch2_ec_read_extent() now takes btree_trans
  bcachefs: bch2_stripe_to_text() now prints ptr gens
  bcachefs: Don't iterate over journal entries just for btree roots
  bcachefs: Break up bch2_journal_write()
  bcachefs: Replace ERANGE with private error codes
  bcachefs: bkey_copy() is no longer a macro
  bcachefs: x-macro-ify inode flags enum
  bcachefs: Convert bch2_fs_open() to darray
  bcachefs: Move __bch2_members_v2_get_mut to sb-members.h
  bcachefs: bch2_prt_datetime()
  bcachefs: CONFIG_BCACHEFS_DEBUG_TRANSACTIONS no longer defaults to y
  bcachefs: Add a comment for BTREE_INSERT_NOJOURNAL usage
  bcachefs: rebalance_work btree is not a snapshots btree
  bcachefs: Add missing printk newlines
  bcachefs: Fix recovery when forced to use JSET_NO_FLUSH journal entry
  bcachefs: .get_parent() should return an error pointer
  bcachefs: Fix bch2_delete_dead_inodes()
  ...
2023-11-07 11:38:38 -08:00
Linus Torvalds
b8cc56d041 cxl for v6.7
- Add support for RCH (Restricted CXL Host) Error recovery
 
 - Fix several region assembly bugs
 
 - Fix mem-device lifetime issues relative to the sanitize command and
   RCH topology.
 
 - Refactor ACPI table parsing for CDAT parsing re-use in preparation for
   CXL QOS support.
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQSbo+XnGs+rwLz9XGXfioYZHlFsZwUCZUaowQAKCRDfioYZHlFs
 Z75rAP44azzLPwJtva7Ur60KpNsGuoZKhvWWdeI1/zo9k4pHbwEA/Vaf/GGo0U5k
 bMkoTmwPTd7YY79B5HNUQSZsqF9wlAc=
 =TEQ0
 -----END PGP SIGNATURE-----

Merge tag 'cxl-for-6.7' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl

Pull CXL (Compute Express Link) updates from Dan Williams:
 "The main new functionality this time is work to allow Linux to
  natively handle CXL link protocol errors signalled via PCIe AER for
  current generation CXL platforms. This required some enlightenment of
  the PCIe AER core to workaround the fact that current generation RCH
  (Restricted CXL Host) platforms physically hide topology details and
  registers via a mechanism called RCRB (Root Complex Register Block).

  The next major highlight is reworks to address bugs in parsing region
  configurations for next generation VH (Virtual Host) topologies. The
  old broken algorithm is replaced with a simpler one that significantly
  increases the number of region configurations supported by Linux. This
  is again relevant for error handling so that forward and reverse
  address translation of memory errors can be carried out by Linux for
  memory regions instantiated by platform firmware.

  As for other cross-tree work, the ACPI table parsing code has been
  refactored for reuse parsing the "CDAT" structure which is an
  ACPI-like data structure that is reported by CXL devices. That work is
  in preparation for v6.8 support for CXL QoS. Think of this as dynamic
  generation of NUMA node topology information generated by Linux rather
  than platform firmware.

  Lastly, a number of internal object lifetime issues have been resolved
  along with misc. fixes and feature updates (decoders_committed sysfs
  ABI).

  Summary:

   - Add support for RCH (Restricted CXL Host) Error recovery

   - Fix several region assembly bugs

   - Fix mem-device lifetime issues relative to the sanitize command and
     RCH topology.

   - Refactor ACPI table parsing for CDAT parsing re-use in preparation
     for CXL QOS support"

* tag 'cxl-for-6.7' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl: (50 commits)
  lib/fw_table: Remove acpi_parse_entries_array() export
  cxl/pci: Change CXL AER support check to use native AER
  cxl/hdm: Remove broken error path
  cxl/hdm: Fix && vs || bug
  acpi: Move common tables helper functions to common lib
  cxl: Add support for reading CXL switch CDAT table
  cxl: Add checksum verification to CDAT from CXL
  cxl: Export QTG ids from CFMWS to sysfs as qos_class attribute
  cxl: Add decoders_committed sysfs attribute to cxl_port
  cxl: Add cxl_decoders_committed() helper
  cxl/core/regs: Rework cxl_map_pmu_regs() to use map->dev for devm
  cxl/core/regs: Rename phys_addr in cxl_map_component_regs()
  PCI/AER: Unmask RCEC internal errors to enable RCH downstream port error handling
  PCI/AER: Forward RCH downstream port-detected errors to the CXL.mem dev handler
  cxl/pci: Disable root port interrupts in RCH mode
  cxl/pci: Add RCH downstream port error logging
  cxl/pci: Map RCH downstream AER registers for logging protocol errors
  cxl/pci: Update CXL error logging to use RAS register address
  PCI/AER: Refactor cper_print_aer() for use by CXL driver module
  cxl/pci: Add RCH downstream port AER register discovery
  ...
2023-11-04 16:20:36 -10:00
Linus Torvalds
707df298cb powerpc updates for 6.7
- Add support for KVM running as a nested hypervisor under development versions
    of PowerVM, using the new PAPR nested virtualisation API.
 
  - Add support for the BPF prog pack allocator.
 
  - A rework of the non-server MMU handling to support execute-only on all platforms.
 
  - Some optimisations & cleanups for the powerpc qspinlock code.
 
  - Various other small features and fixes.
 
 Thanks to: Aboorva Devarajan, Aditya Gupta, Amit Machhiwal, Benjamin Gray,
 Christophe Leroy, Dr. David Alan Gilbert, Gaurav Batra, Gautam Menghani, Geert
 Uytterhoeven, Haren Myneni, Hari Bathini, Joel Stanley, Jordan Niethe, Julia
 Lawall, Kautuk Consul, Kuan-Wei Chiu, Michael Neuling, Minjie Du, Muhammad
 Muzammil, Naveen N Rao, Nicholas Piggin, Nick Child, Nysal Jan K.A, Peter
 Lafreniere, Rob Herring, Sachin Sant, Sebastian Andrzej Siewior, Shrikanth
 Hegde, Srikar Dronamraju, Stanislav Kinsburskii, Vaibhav Jain, Wang Yufen, Yang
 Yingliang, Yuan Tan.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCAAxFiEEJFGtCPCthwEv2Y/bUevqPMjhpYAFAmVEf38THG1wZUBlbGxl
 cm1hbi5pZC5hdQAKCRBR6+o8yOGlgMKgD/4vmPVcBE31xCAuuksrVvmMDRsCoC8N
 IJe4A5dHda1tYgdN2YdeK4LBszv5pWICjf2xZHlNh+L0s3Vxpngd4ycAWGPfDAyk
 SOlM24NCKl5j3327QZEt+iZVmJeTSnrmjxO0A1y04yvzLrfvFT7mbP4EXoidjShd
 GNb/EoH9kkCFn65zulc+lN2itQEX6Ht2GQTAz5z5GKtF6d1zZGM8ftOW+SQ5LeU3
 5JOkQtMtwAKhzBiglA4BB3pQyjaOOkPaTaj/WLoxx5tbVaCkV4wrFq48Bmtbm7E3
 kYkMNoI3IsC615GqY1CaRs/RSpMt74tIVh3tstSecHWRIwNGnfF6zeZpKLvJSs8k
 Qa5greGWMUDuJdDg9oDwAX2AKtO+3byI2v1hKE+sMhMh0eeMtDP9WIrIRg4BDjKL
 mq8RffXLTCtepehgfwBpoZbcvFSwFUMwuihBD7+bDMZQeDbtuFdZ2ouMFXBP9M1n
 cuv4KySouvKv9Xp5EeCkHlpL7QmSqrtSHOPYjoPeLueJYlmjheWdreLM9p7Nl2ma
 5wBxLpdLCGCpDJOyGgWNoQRHXucBNlU97DLx2V70nXG4wvvRyXh9EZ6I2niPSdPx
 N3LJnINz4MJ52Gd1KWJvufOyJlLwXxuI07rzCq67ZegpEPh+baWqVcPscuKU8+q0
 dSh2DPCht8gw1A==
 =ddT4
 -----END PGP SIGNATURE-----

Merge tag 'powerpc-6.7-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux

Pull powerpc updates from Michael Ellerman:

 - Add support for KVM running as a nested hypervisor under development
   versions of PowerVM, using the new PAPR nested virtualisation API

 - Add support for the BPF prog pack allocator

 - A rework of the non-server MMU handling to support execute-only on
   all platforms

 - Some optimisations & cleanups for the powerpc qspinlock code

 - Various other small features and fixes

Thanks to Aboorva Devarajan, Aditya Gupta, Amit Machhiwal, Benjamin
Gray, Christophe Leroy, Dr. David Alan Gilbert, Gaurav Batra, Gautam
Menghani, Geert Uytterhoeven, Haren Myneni, Hari Bathini, Joel Stanley,
Jordan Niethe, Julia Lawall, Kautuk Consul, Kuan-Wei Chiu, Michael
Neuling, Minjie Du, Muhammad Muzammil, Naveen N Rao, Nicholas Piggin,
Nick Child, Nysal Jan K.A, Peter Lafreniere, Rob Herring, Sachin Sant,
Sebastian Andrzej Siewior, Shrikanth Hegde, Srikar Dronamraju, Stanislav
Kinsburskii, Vaibhav Jain, Wang Yufen, Yang Yingliang, and Yuan Tan.

* tag 'powerpc-6.7-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (100 commits)
  powerpc/vmcore: Add MMU information to vmcoreinfo
  Revert "powerpc: add `cur_cpu_spec` symbol to vmcoreinfo"
  powerpc/bpf: use bpf_jit_binary_pack_[alloc|finalize|free]
  powerpc/bpf: rename powerpc64_jit_data to powerpc_jit_data
  powerpc/bpf: implement bpf_arch_text_invalidate for bpf_prog_pack
  powerpc/bpf: implement bpf_arch_text_copy
  powerpc/code-patching: introduce patch_instructions()
  powerpc/32s: Implement local_flush_tlb_page_psize()
  powerpc/pseries: use kfree_sensitive() in plpks_gen_password()
  powerpc/code-patching: Perform hwsync in __patch_instruction() in case of failure
  powerpc/fsl_msi: Use device_get_match_data()
  powerpc: Remove cpm_dp...() macros
  powerpc/qspinlock: Rename yield_propagate_owner tunable
  powerpc/qspinlock: Propagate sleepy if previous waiter is preempted
  powerpc/qspinlock: don't propagate the not-sleepy state
  powerpc/qspinlock: propagate owner preemptedness rather than CPU number
  powerpc/qspinlock: stop queued waiters trying to set lock sleepy
  powerpc/perf: Fix disabling BHRB and instruction sampling
  powerpc/trace: Add support for HAVE_FUNCTION_ARG_ACCESS_API
  powerpc/tools: Pass -mabi=elfv2 to gcc-check-mprofile-kernel.sh
  ...
2023-11-03 10:07:39 -10:00
Linus Torvalds
31e5f934ff Tracing updates for v6.7:
- Remove eventfs_file descriptor
 
   This is the biggest change, and the second part of making eventfs
   create its files dynamically.
 
   In 6.6 the first part was added, and that maintained a one to one
   mapping between eventfs meta descriptors and the directories and
   file inodes and dentries that were dynamically created. The
   directories were represented by a eventfs_inode and the files
   were represented by a eventfs_file.
 
   In v6.7 the eventfs_file is removed. As all events have the same
   directory make up (sched_switch has an "enable", "id", "format",
   etc files), the handing of what files are underneath each leaf
   eventfs directory is moved back to the tracing subsystem via a
   callback. When a event is added to the eventfs, it registers
   an array of evenfs_entry's. These hold the names of the files and
   the callbacks to call when the file is referenced. The callback gets
   the name so that the same callback may be used by multiple files.
   The callback then supplies the filesystem_operations structure needed
   to create this file.
 
   This has brought the memory footprint of creating multiple eventfs
   instances down by 2 megs each!
 
 - User events now has persistent events that are not associated
   to a single processes. These are privileged events that hang around
   even if no process is attached to them.
 
 - Clean up of seq_buf.
   There's talk about using seq_buf more to replace strscpy() and friends.
   But this also requires some minor modifications of seq_buf to be
   able to do this.
 
 - Expand instance ring buffers individually
   Currently if boot up creates an instance, and a trace event is
   enabled on that instance, the ring buffer for that instance and the
   top level ring buffer are expanded (1.4 MB per CPU). This wastes
   memory as this happens when nothing is using the top level instance.
 
 - Other minor clean ups and fixes.
 -----BEGIN PGP SIGNATURE-----
 
 iIoEABYIADIWIQRRSw7ePDh/lE+zeZMp5XQQmuv6qgUCZUMrBBQccm9zdGVkdEBn
 b29kbWlzLm9yZwAKCRAp5XQQmuv6quzVAQCed/kPM7X9j2QZamJVDruMf2CmVxpu
 /TOvKvSKV584GgEAxLntf5VKx1Q98bc68y3Zkg+OCi8jSgORos1ROmURhws=
 =iIgb
 -----END PGP SIGNATURE-----

Merge tag 'trace-v6.7' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace

Pull tracing updates from Steven Rostedt:

 - Remove eventfs_file descriptor

   This is the biggest change, and the second part of making eventfs
   create its files dynamically.

   In 6.6 the first part was added, and that maintained a one to one
   mapping between eventfs meta descriptors and the directories and file
   inodes and dentries that were dynamically created. The directories
   were represented by a eventfs_inode and the files were represented by
   a eventfs_file.

   In v6.7 the eventfs_file is removed. As all events have the same
   directory make up (sched_switch has an "enable", "id", "format", etc
   files), the handing of what files are underneath each leaf eventfs
   directory is moved back to the tracing subsystem via a callback.

   When an event is added to the eventfs, it registers an array of
   evenfs_entry's. These hold the names of the files and the callbacks
   to call when the file is referenced. The callback gets the name so
   that the same callback may be used by multiple files. The callback
   then supplies the filesystem_operations structure needed to create
   this file.

   This has brought the memory footprint of creating multiple eventfs
   instances down by 2 megs each!

 - User events now has persistent events that are not associated to a
   single processes. These are privileged events that hang around even
   if no process is attached to them

 - Clean up of seq_buf

   There's talk about using seq_buf more to replace strscpy() and
   friends. But this also requires some minor modifications of seq_buf
   to be able to do this

 - Expand instance ring buffers individually

   Currently if boot up creates an instance, and a trace event is
   enabled on that instance, the ring buffer for that instance and the
   top level ring buffer are expanded (1.4 MB per CPU). This wastes
   memory as this happens when nothing is using the top level instance

 - Other minor clean ups and fixes

* tag 'trace-v6.7' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace: (34 commits)
  seq_buf: Export seq_buf_puts()
  seq_buf: Export seq_buf_putc()
  eventfs: Use simple_recursive_removal() to clean up dentries
  eventfs: Remove special processing of dput() of events directory
  eventfs: Delete eventfs_inode when the last dentry is freed
  eventfs: Hold eventfs_mutex when calling callback functions
  eventfs: Save ownership and mode
  eventfs: Test for ei->is_freed when accessing ei->dentry
  eventfs: Have a free_ei() that just frees the eventfs_inode
  eventfs: Remove "is_freed" union with rcu head
  eventfs: Fix kerneldoc of eventfs_remove_rec()
  tracing: Have the user copy of synthetic event address use correct context
  eventfs: Remove extra dget() in eventfs_create_events_dir()
  tracing: Have trace_event_file have ref counters
  seq_buf: Introduce DECLARE_SEQ_BUF and seq_buf_str()
  eventfs: Fix typo in eventfs_inode union comment
  eventfs: Fix WARN_ON() in create_file_dentry()
  powerpc: Remove initialisation of readpos
  tracing/histograms: Simplify last_cmd_set()
  seq_buf: fix a misleading comment
  ...
2023-11-03 07:41:18 -10:00
Linus Torvalds
2a80532c07 printk changes for 6.7
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEESH4wyp42V4tXvYsjUqAMR0iAlPIFAmVDk4QACgkQUqAMR0iA
 lPLKtg/9FXMuIRtyiZqHARIjRprxSWyiwVu+ZsDfp8JznP1+WnoWp8E+xf824dbW
 xNnQS1qKecvH+1Nw9jMhXNAlViI5re7ft01yDraA0POjP3P8ar+4CXiB3e8CXTLZ
 VUwMegBlztkcskV0L5EpRRyOa728UF63V3FN41SuR81GaAGOVyZNE4RmlwlsS9xg
 uCj7XEMgVuZpVjnM6Cx/YzUfuWGwFp0eWn31Vc7Dp5Bab/yDvSRdD8H1foo23/3k
 nLh6wDV9UbYbAsgHwLWck18kmn0xsnzK8G08bzFwP8FASHjVuEaxSgJqFBuZxY1j
 O4XFR1zedVOB2Q08fzwi/rVB7jibw9mZZNWceOnrDM8PUx7m/m/YVOO0aNVmBh+f
 uztptyoMw8o93dXwR05Dc5JGyYh4k4eMN9eS0MbALNziFwvc80ln4g1goNYpzjyb
 vfytDoacwuRYD3BkEL5t5BsAK4ULqpTyOZiv0sXUC+cERH1HkUGm5KCPbQawnvaK
 fEkGJTHMtE+I+Cui83jYCdxJVfE1LFcM9Yl7ZDfrVisRQ8/KM9+L68XceRe4E30U
 1YiSJIopeYi5+ABysfE4RPumHWsG6d8FQPtXZyS+/K0cl5j49t36r3IOy0QUh6bn
 G9iZSvO4oxUHz1sXja+X+EKzN1r8fRf/j+ovTrBUpCcZZRqKk9s=
 =Cmex
 -----END PGP SIGNATURE-----

Merge tag 'printk-for-6.7' of git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux

Pull printk updates from Petr Mladek:

 - Another preparation step for introducing printk kthreads. The main
   piece is a per-console lock with several features:

    - Support three priorities: normal, emergency, and panic. They will
      be defined by a context where the lock is taken. A context with a
      higher priority is allowed to take over the lock from a context
      with a lower one.

      The plan is to use the emergency context for Oops and WARN()
      messages, and also by watchdogs.

      The panic() context will be used on panic CPU.

    - The owner might enter/exit regions where it is not safe to take
      over the lock. It allows the take over the lock a safe way in the
      middle of a message.

      For example, serial drivers emit characters one by one. And the
      serial port is in a safe state in between.

      Only the final console_flush_in_panic() will be allowed to take
      over the lock even in the unsafe state (last chance, pray, and
      hope).

    - A higher priority context might busy wait with a timeout. The
      current owner is informed about the waiter and releases the lock
      on exit from the unsafe state.

    - The new lock is safe even in atomic contexts, including NMI.

   Another change is a safe manipulation of per-console sequence number
   counter under the new lock.

 - simple_strntoull() micro-optimization

 - Reduce pr_flush() pooling time.

 - Calm down false warning about possible buffer invalid access to
   console buffers when CONFIG_PRINTK is disabled.

[ .. and Thomas Gleixner wants to point out that while several of the
  commits are attributed to him, he only authored the early versions of
  said commits, and that John Ogness and Petr Mladek have been the ones
  who sorted out the details and really should be those who get the
  credit   - Linus ]

* tag 'printk-for-6.7' of git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux:
  vsprintf: uninline simple_strntoull(), reorder arguments
  printk: printk: Remove unnecessary statements'len = 0;'
  printk: Reduce pr_flush() pooling time
  printk: fix illegal pbufs access for !CONFIG_PRINTK
  printk: nbcon: Allow drivers to mark unsafe regions and check state
  printk: nbcon: Add emit function and callback function for atomic printing
  printk: nbcon: Add sequence handling
  printk: nbcon: Add ownership state functions
  printk: nbcon: Add buffer management
  printk: Make static printk buffers available to nbcon
  printk: nbcon: Add acquire/release logic
  printk: Add non-BKL (nbcon) console basic infrastructure
2023-11-03 07:24:22 -10:00
Linus Torvalds
9a719c2145 bitmap patches for v6.7
Hi Linus,
 
 Please pull patches for v6.7.
 
 This request includes "bitmap: cleanup bitmap_*_region() implementation"
 series, and scattered cleanup patches.
 
 Thanks,
         Yury
 -----BEGIN PGP SIGNATURE-----
 
 iQGzBAABCgAdFiEEi8GdvG6xMhdgpu/4sUSA/TofvsgFAmVCgS4ACgkQsUSA/Tof
 vsjIdQv+PnSQ5Lq6ISWYqhV0I60LPLWjf4jm5bgHUT/gKWjUIqYJmYfHD1M1MTkJ
 +qsLdywshSdE62TG/Y0r/i9el8IedJOP1T0Oi9RpVPjV/vZd7BgGYSLfOsZnvV4e
 wmIVKKE5A+uAcKHw2+9MWoK+4LxG6YRWb6AKGroghz3GU70hFz9xY+kwsfP1NxLd
 pqalPYGyyfkte+7uSchwMKfJVkXA5TwxbasB8Qd8s0fM0DNOLcoZbjFxt2ufZzBY
 a57I12nheYagBmLfMPjOT3TR/g9XXQnn8pxxhNM0XJeu73WDno+ZMTmH80SzDuv7
 P6+6KglUHY1IHyeQ0chgwZDusxkCKfR9W6fQ5IhGYJuZkKtzbdsjVf38jJbWwp8n
 ZIFu8n1kkYN7Ap4veOJ32N/cDRN0yR5f2pWxTw2hPifn5Rftl26PhidH0Bjz/F+p
 q4/dIxsGPA6bsQCfZ7XNfGf9pARwLjcHgZt8MMwj2RA2hv+1qyefRav94jUrkyPT
 9gaBkZHi
 =L4AW
 -----END PGP SIGNATURE-----

Merge tag 'bitmap-for-6.7' of https://github.com/norov/linux

Pull bitmap updates from Yury Norov:
 "This includes the 'bitmap: cleanup bitmap_*_region() implementation'
  series, and scattered cleanup patches"

* tag 'bitmap-for-6.7' of https://github.com/norov/linux:
  buildid: reduce header file dependencies for module
  bitmap: move bitmap_*_region() functions to bitmap.h
  bitmap: drop _reg_op() function
  bitmap: replace _reg_op(REG_OP_ISFREE) with find_next_bit()
  bitmap: replace _reg_op(REG_OP_RELEASE) with bitmap_clear()
  bitmap: replace _reg_op(REG_OP_ALLOC) with bitmap_set()
  bitmap: fix opencoded bitmap_allocate_region()
  bitmap: add test for bitmap_*_region() functions
  bitmap: align __reg_op() wrappers with modern coding style
  lib/bitmap: split-out string-related operations to a separate files
  bitmap: Remove dead code, i.e. bitmap_copy_le()
  bitmap: Fix a typo ("identify map")
  cpumask: kernel-doc cleanups and additions
2023-11-03 07:08:36 -10:00
Linus Torvalds
8f6f76a6a2 As usual, lots of singleton and doubleton patches all over the tree and
there's little I can say which isn't in the individual changelogs.
 
 The lengthier patch series are
 
 - "kdump: use generic functions to simplify crashkernel reservation in
   arch", from Baoquan He.  This is mainly cleanups and consolidation of
   the "crashkernel=" kernel parameter handling.
 
 - After much discussion, David Laight's "minmax: Relax type checks in
   min() and max()" is here.  Hopefully reduces some typecasting and the
   use of min_t() and max_t().
 
 - A group of patches from Oleg Nesterov which clean up and slightly fix
   our handling of reads from /proc/PID/task/...  and which remove
   task_struct.therad_group.
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCZUQP9wAKCRDdBJ7gKXxA
 jmOAAQDh8sxagQYocoVsSm28ICqXFeaY9Co1jzBIDdNesAvYVwD/c2DHRqJHEiS4
 63BNcG3+hM9nwGJHb5lyh5m79nBMRg0=
 =On4u
 -----END PGP SIGNATURE-----

Merge tag 'mm-nonmm-stable-2023-11-02-14-08' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull non-MM updates from Andrew Morton:
 "As usual, lots of singleton and doubleton patches all over the tree
  and there's little I can say which isn't in the individual changelogs.

  The lengthier patch series are

   - 'kdump: use generic functions to simplify crashkernel reservation
     in arch', from Baoquan He. This is mainly cleanups and
     consolidation of the 'crashkernel=' kernel parameter handling

   - After much discussion, David Laight's 'minmax: Relax type checks in
     min() and max()' is here. Hopefully reduces some typecasting and
     the use of min_t() and max_t()

   - A group of patches from Oleg Nesterov which clean up and slightly
     fix our handling of reads from /proc/PID/task/... and which remove
     task_struct.thread_group"

* tag 'mm-nonmm-stable-2023-11-02-14-08' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (64 commits)
  scripts/gdb/vmalloc: disable on no-MMU
  scripts/gdb: fix usage of MOD_TEXT not defined when CONFIG_MODULES=n
  .mailmap: add address mapping for Tomeu Vizoso
  mailmap: update email address for Claudiu Beznea
  tools/testing/selftests/mm/run_vmtests.sh: lower the ptrace permissions
  .mailmap: map Benjamin Poirier's address
  scripts/gdb: add lx_current support for riscv
  ocfs2: fix a spelling typo in comment
  proc: test ProtectionKey in proc-empty-vm test
  proc: fix proc-empty-vm test with vsyscall
  fs/proc/base.c: remove unneeded semicolon
  do_io_accounting: use sig->stats_lock
  do_io_accounting: use __for_each_thread()
  ocfs2: replace BUG_ON() at ocfs2_num_free_extents() with ocfs2_error()
  ocfs2: fix a typo in a comment
  scripts/show_delta: add __main__ judgement before main code
  treewide: mark stuff as __ro_after_init
  fs: ocfs2: check status values
  proc: test /proc/${pid}/statm
  compiler.h: move __is_constexpr() to compiler.h
  ...
2023-11-02 20:53:31 -10:00
Linus Torvalds
ecae0bd517 Many singleton patches against the MM code. The patch series which are
included in this merge do the following:
 
 - Kemeng Shi has contributed some compation maintenance work in the
   series "Fixes and cleanups to compaction".
 
 - Joel Fernandes has a patchset ("Optimize mremap during mutual
   alignment within PMD") which fixes an obscure issue with mremap()'s
   pagetable handling during a subsequent exec(), based upon an
   implementation which Linus suggested.
 
 - More DAMON/DAMOS maintenance and feature work from SeongJae Park i the
   following patch series:
 
 	mm/damon: misc fixups for documents, comments and its tracepoint
 	mm/damon: add a tracepoint for damos apply target regions
 	mm/damon: provide pseudo-moving sum based access rate
 	mm/damon: implement DAMOS apply intervals
 	mm/damon/core-test: Fix memory leaks in core-test
 	mm/damon/sysfs-schemes: Do DAMOS tried regions update for only one apply interval
 
 - In the series "Do not try to access unaccepted memory" Adrian Hunter
   provides some fixups for the recently-added "unaccepted memory' feature.
   To increase the feature's checking coverage.  "Plug a few gaps where
   RAM is exposed without checking if it is unaccepted memory".
 
 - In the series "cleanups for lockless slab shrink" Qi Zheng has done
   some maintenance work which is preparation for the lockless slab
   shrinking code.
 
 - Qi Zheng has redone the earlier (and reverted) attempt to make slab
   shrinking lockless in the series "use refcount+RCU method to implement
   lockless slab shrink".
 
 - David Hildenbrand contributes some maintenance work for the rmap code
   in the series "Anon rmap cleanups".
 
 - Kefeng Wang does more folio conversions and some maintenance work in
   the migration code.  Series "mm: migrate: more folio conversion and
   unification".
 
 - Matthew Wilcox has fixed an issue in the buffer_head code which was
   causing long stalls under some heavy memory/IO loads.  Some cleanups
   were added on the way.  Series "Add and use bdev_getblk()".
 
 - In the series "Use nth_page() in place of direct struct page
   manipulation" Zi Yan has fixed a potential issue with the direct
   manipulation of hugetlb page frames.
 
 - In the series "mm: hugetlb: Skip initialization of gigantic tail
   struct pages if freed by HVO" has improved our handling of gigantic
   pages in the hugetlb vmmemmep optimizaton code.  This provides
   significant boot time improvements when significant amounts of gigantic
   pages are in use.
 
 - Matthew Wilcox has sent the series "Small hugetlb cleanups" - code
   rationalization and folio conversions in the hugetlb code.
 
 - Yin Fengwei has improved mlock()'s handling of large folios in the
   series "support large folio for mlock"
 
 - In the series "Expose swapcache stat for memcg v1" Liu Shixin has
   added statistics for memcg v1 users which are available (and useful)
   under memcg v2.
 
 - Florent Revest has enhanced the MDWE (Memory-Deny-Write-Executable)
   prctl so that userspace may direct the kernel to not automatically
   propagate the denial to child processes.  The series is named "MDWE
   without inheritance".
 
 - Kefeng Wang has provided the series "mm: convert numa balancing
   functions to use a folio" which does what it says.
 
 - In the series "mm/ksm: add fork-exec support for prctl" Stefan Roesch
   makes is possible for a process to propagate KSM treatment across
   exec().
 
 - Huang Ying has enhanced memory tiering's calculation of memory
   distances.  This is used to permit the dax/kmem driver to use "high
   bandwidth memory" in addition to Optane Data Center Persistent Memory
   Modules (DCPMM).  The series is named "memory tiering: calculate
   abstract distance based on ACPI HMAT"
 
 - In the series "Smart scanning mode for KSM" Stefan Roesch has
   optimized KSM by teaching it to retain and use some historical
   information from previous scans.
 
 - Yosry Ahmed has fixed some inconsistencies in memcg statistics in the
   series "mm: memcg: fix tracking of pending stats updates values".
 
 - In the series "Implement IOCTL to get and optionally clear info about
   PTEs" Peter Xu has added an ioctl to /proc/<pid>/pagemap which permits
   us to atomically read-then-clear page softdirty state.  This is mainly
   used by CRIU.
 
 - Hugh Dickins contributed the series "shmem,tmpfs: general maintenance"
   - a bunch of relatively minor maintenance tweaks to this code.
 
 - Matthew Wilcox has increased the use of the VMA lock over file-backed
   page faults in the series "Handle more faults under the VMA lock".  Some
   rationalizations of the fault path became possible as a result.
 
 - In the series "mm/rmap: convert page_move_anon_rmap() to
   folio_move_anon_rmap()" David Hildenbrand has implemented some cleanups
   and folio conversions.
 
 - In the series "various improvements to the GUP interface" Lorenzo
   Stoakes has simplified and improved the GUP interface with an eye to
   providing groundwork for future improvements.
 
 - Andrey Konovalov has sent along the series "kasan: assorted fixes and
   improvements" which does those things.
 
 - Some page allocator maintenance work from Kemeng Shi in the series
   "Two minor cleanups to break_down_buddy_pages".
 
 - In thes series "New selftest for mm" Breno Leitao has developed
   another MM self test which tickles a race we had between madvise() and
   page faults.
 
 - In the series "Add folio_end_read" Matthew Wilcox provides cleanups
   and an optimization to the core pagecache code.
 
 - Nhat Pham has added memcg accounting for hugetlb memory in the series
   "hugetlb memcg accounting".
 
 - Cleanups and rationalizations to the pagemap code from Lorenzo
   Stoakes, in the series "Abstract vma_merge() and split_vma()".
 
 - Audra Mitchell has fixed issues in the procfs page_owner code's new
   timestamping feature which was causing some misbehaviours.  In the
   series "Fix page_owner's use of free timestamps".
 
 - Lorenzo Stoakes has fixed the handling of new mappings of sealed files
   in the series "permit write-sealed memfd read-only shared mappings".
 
 - Mike Kravetz has optimized the hugetlb vmemmap optimization in the
   series "Batch hugetlb vmemmap modification operations".
 
 - Some buffer_head folio conversions and cleanups from Matthew Wilcox in
   the series "Finish the create_empty_buffers() transition".
 
 - As a page allocator performance optimization Huang Ying has added
   automatic tuning to the allocator's per-cpu-pages feature, in the series
   "mm: PCP high auto-tuning".
 
 - Roman Gushchin has contributed the patchset "mm: improve performance
   of accounted kernel memory allocations" which improves their performance
   by ~30% as measured by a micro-benchmark.
 
 - folio conversions from Kefeng Wang in the series "mm: convert page
   cpupid functions to folios".
 
 - Some kmemleak fixups in Liu Shixin's series "Some bugfix about
   kmemleak".
 
 - Qi Zheng has improved our handling of memoryless nodes by keeping them
   off the allocation fallback list.  This is done in the series "handle
   memoryless nodes more appropriately".
 
 - khugepaged conversions from Vishal Moola in the series "Some
   khugepaged folio conversions".
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCZULEMwAKCRDdBJ7gKXxA
 jhQHAQCYpD3g849x69DmHnHWHm/EHQLvQmRMDeYZI+nx/sCJOwEAw4AKg0Oemv9y
 FgeUPAD1oasg6CP+INZvCj34waNxwAc=
 =E+Y4
 -----END PGP SIGNATURE-----

Merge tag 'mm-stable-2023-11-01-14-33' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull MM updates from Andrew Morton:
 "Many singleton patches against the MM code. The patch series which are
  included in this merge do the following:

   - Kemeng Shi has contributed some compation maintenance work in the
     series 'Fixes and cleanups to compaction'

   - Joel Fernandes has a patchset ('Optimize mremap during mutual
     alignment within PMD') which fixes an obscure issue with mremap()'s
     pagetable handling during a subsequent exec(), based upon an
     implementation which Linus suggested

   - More DAMON/DAMOS maintenance and feature work from SeongJae Park i
     the following patch series:

	mm/damon: misc fixups for documents, comments and its tracepoint
	mm/damon: add a tracepoint for damos apply target regions
	mm/damon: provide pseudo-moving sum based access rate
	mm/damon: implement DAMOS apply intervals
	mm/damon/core-test: Fix memory leaks in core-test
	mm/damon/sysfs-schemes: Do DAMOS tried regions update for only one apply interval

   - In the series 'Do not try to access unaccepted memory' Adrian
     Hunter provides some fixups for the recently-added 'unaccepted
     memory' feature. To increase the feature's checking coverage. 'Plug
     a few gaps where RAM is exposed without checking if it is
     unaccepted memory'

   - In the series 'cleanups for lockless slab shrink' Qi Zheng has done
     some maintenance work which is preparation for the lockless slab
     shrinking code

   - Qi Zheng has redone the earlier (and reverted) attempt to make slab
     shrinking lockless in the series 'use refcount+RCU method to
     implement lockless slab shrink'

   - David Hildenbrand contributes some maintenance work for the rmap
     code in the series 'Anon rmap cleanups'

   - Kefeng Wang does more folio conversions and some maintenance work
     in the migration code. Series 'mm: migrate: more folio conversion
     and unification'

   - Matthew Wilcox has fixed an issue in the buffer_head code which was
     causing long stalls under some heavy memory/IO loads. Some cleanups
     were added on the way. Series 'Add and use bdev_getblk()'

   - In the series 'Use nth_page() in place of direct struct page
     manipulation' Zi Yan has fixed a potential issue with the direct
     manipulation of hugetlb page frames

   - In the series 'mm: hugetlb: Skip initialization of gigantic tail
     struct pages if freed by HVO' has improved our handling of gigantic
     pages in the hugetlb vmmemmep optimizaton code. This provides
     significant boot time improvements when significant amounts of
     gigantic pages are in use

   - Matthew Wilcox has sent the series 'Small hugetlb cleanups' - code
     rationalization and folio conversions in the hugetlb code

   - Yin Fengwei has improved mlock()'s handling of large folios in the
     series 'support large folio for mlock'

   - In the series 'Expose swapcache stat for memcg v1' Liu Shixin has
     added statistics for memcg v1 users which are available (and
     useful) under memcg v2

   - Florent Revest has enhanced the MDWE (Memory-Deny-Write-Executable)
     prctl so that userspace may direct the kernel to not automatically
     propagate the denial to child processes. The series is named 'MDWE
     without inheritance'

   - Kefeng Wang has provided the series 'mm: convert numa balancing
     functions to use a folio' which does what it says

   - In the series 'mm/ksm: add fork-exec support for prctl' Stefan
     Roesch makes is possible for a process to propagate KSM treatment
     across exec()

   - Huang Ying has enhanced memory tiering's calculation of memory
     distances. This is used to permit the dax/kmem driver to use 'high
     bandwidth memory' in addition to Optane Data Center Persistent
     Memory Modules (DCPMM). The series is named 'memory tiering:
     calculate abstract distance based on ACPI HMAT'

   - In the series 'Smart scanning mode for KSM' Stefan Roesch has
     optimized KSM by teaching it to retain and use some historical
     information from previous scans

   - Yosry Ahmed has fixed some inconsistencies in memcg statistics in
     the series 'mm: memcg: fix tracking of pending stats updates
     values'

   - In the series 'Implement IOCTL to get and optionally clear info
     about PTEs' Peter Xu has added an ioctl to /proc/<pid>/pagemap
     which permits us to atomically read-then-clear page softdirty
     state. This is mainly used by CRIU

   - Hugh Dickins contributed the series 'shmem,tmpfs: general
     maintenance', a bunch of relatively minor maintenance tweaks to
     this code

   - Matthew Wilcox has increased the use of the VMA lock over
     file-backed page faults in the series 'Handle more faults under the
     VMA lock'. Some rationalizations of the fault path became possible
     as a result

   - In the series 'mm/rmap: convert page_move_anon_rmap() to
     folio_move_anon_rmap()' David Hildenbrand has implemented some
     cleanups and folio conversions

   - In the series 'various improvements to the GUP interface' Lorenzo
     Stoakes has simplified and improved the GUP interface with an eye
     to providing groundwork for future improvements

   - Andrey Konovalov has sent along the series 'kasan: assorted fixes
     and improvements' which does those things

   - Some page allocator maintenance work from Kemeng Shi in the series
     'Two minor cleanups to break_down_buddy_pages'

   - In thes series 'New selftest for mm' Breno Leitao has developed
     another MM self test which tickles a race we had between madvise()
     and page faults

   - In the series 'Add folio_end_read' Matthew Wilcox provides cleanups
     and an optimization to the core pagecache code

   - Nhat Pham has added memcg accounting for hugetlb memory in the
     series 'hugetlb memcg accounting'

   - Cleanups and rationalizations to the pagemap code from Lorenzo
     Stoakes, in the series 'Abstract vma_merge() and split_vma()'

   - Audra Mitchell has fixed issues in the procfs page_owner code's new
     timestamping feature which was causing some misbehaviours. In the
     series 'Fix page_owner's use of free timestamps'

   - Lorenzo Stoakes has fixed the handling of new mappings of sealed
     files in the series 'permit write-sealed memfd read-only shared
     mappings'

   - Mike Kravetz has optimized the hugetlb vmemmap optimization in the
     series 'Batch hugetlb vmemmap modification operations'

   - Some buffer_head folio conversions and cleanups from Matthew Wilcox
     in the series 'Finish the create_empty_buffers() transition'

   - As a page allocator performance optimization Huang Ying has added
     automatic tuning to the allocator's per-cpu-pages feature, in the
     series 'mm: PCP high auto-tuning'

   - Roman Gushchin has contributed the patchset 'mm: improve
     performance of accounted kernel memory allocations' which improves
     their performance by ~30% as measured by a micro-benchmark

   - folio conversions from Kefeng Wang in the series 'mm: convert page
     cpupid functions to folios'

   - Some kmemleak fixups in Liu Shixin's series 'Some bugfix about
     kmemleak'

   - Qi Zheng has improved our handling of memoryless nodes by keeping
     them off the allocation fallback list. This is done in the series
     'handle memoryless nodes more appropriately'

   - khugepaged conversions from Vishal Moola in the series 'Some
     khugepaged folio conversions'"

[ bcachefs conflicts with the dynamically allocated shrinkers have been
  resolved as per Stephen Rothwell in

     https://lore.kernel.org/all/20230913093553.4290421e@canb.auug.org.au/

  with help from Qi Zheng.

  The clone3 test filtering conflict was half-arsed by yours truly ]

* tag 'mm-stable-2023-11-01-14-33' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (406 commits)
  mm/damon/sysfs: update monitoring target regions for online input commit
  mm/damon/sysfs: remove requested targets when online-commit inputs
  selftests: add a sanity check for zswap
  Documentation: maple_tree: fix word spelling error
  mm/vmalloc: fix the unchecked dereference warning in vread_iter()
  zswap: export compression failure stats
  Documentation: ubsan: drop "the" from article title
  mempolicy: migration attempt to match interleave nodes
  mempolicy: mmap_lock is not needed while migrating folios
  mempolicy: alloc_pages_mpol() for NUMA policy without vma
  mm: add page_rmappable_folio() wrapper
  mempolicy: remove confusing MPOL_MF_LAZY dead code
  mempolicy: mpol_shared_policy_init() without pseudo-vma
  mempolicy trivia: use pgoff_t in shared mempolicy tree
  mempolicy trivia: slightly more consistent naming
  mempolicy trivia: delete those ancient pr_debug()s
  mempolicy: fix migrate_pages(2) syscall return nr_failed
  kernfs: drop shared NUMA mempolicy hooks
  hugetlbfs: drop shared NUMA mempolicy pretence
  mm/damon/sysfs-test: add a unit test for damon_sysfs_set_targets()
  ...
2023-11-02 19:38:47 -10:00
Dan Williams
4b92894064 lib/fw_table: Remove acpi_parse_entries_array() export
Stephen reports that the ACPI helper library rework,
CONFIG_FIRMWARE_TABLE, introduces a new compiler warning:

    WARNING: modpost: vmlinux: acpi_parse_entries_array: EXPORT_SYMBOL used
    for init symbol. Remove __init or EXPORT_SYMBOL.

Delete this export as it turns out it is unneeded, and future work wraps
this in another exported helper. Note that in general
EXPORT_SYMBOL_ACPI_LIB() is needed for exporting symbols that are marked
__init_or_acpilib, but in this case no export is required.

Fixes: a103f46633 ("acpi: Move common tables helper functions to common lib")
Cc: Dave Jiang <dave.jiang@intel.com>
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Closes: http://lore.kernel.org/r/20231030160523.670a7569@canb.auug.org.au
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/169896282222.70775.940454758280866379.stgit@dwillia2-xfh.jf.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2023-11-02 15:17:21 -07:00
Christophe JAILLET
70a9affa93 seq_buf: Export seq_buf_puts()
Mark seq_buf_puts() which is part of the seq_buf API to be exported to
kernel loadable GPL modules.

Link: https://lkml.kernel.org/r/b9e3737f66ec2450221b492048ce0d9c65c84953.1698861216.git.christophe.jaillet@wanadoo.fr

Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2023-11-02 00:19:44 -04:00
Christophe JAILLET
685b38c765 seq_buf: Export seq_buf_putc()
Mark seq_buf_putc() which is part of the seq_buf API to be exported to
kernel loadable GPL modules.

Link: https://lkml.kernel.org/r/5c9a5ed97ac37dbdcd9c1e7bcbdec9ac166e79be.1698861216.git.christophe.jaillet@wanadoo.fr

Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2023-11-02 00:18:52 -04:00
Linus Torvalds
5eda8f2537 linux_kselftest-kunit-6.7-rc1
This kunit update for Linux 6.7-rc1 consists of:
 
 -- string-stream testing enhancements
 -- several fixes memory leaks
 -- fix to reset status during parameter handling
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEPZKym/RZuOCGeA/kCwJExA0NQxwFAmVClgAACgkQCwJExA0N
 QxwXhA//Yn2nL5an6A9ZQire8S0HmxqdAwgROBVb7AaCkJ9L4cY+BYyt+VzTkaGO
 DJ3x5TwBi6BUkDSmAjppu+KhSQqbRvtmC5xIwH5zbSqKesftklJDhNC6SBd8ZFSC
 W5w+wwK8xxqFni2NAQu/ZMHrxgllUlR7ZbCE31U/IUYvW4NP4izz9oX3rqkvgjEU
 aZOQpe3GBVn3jkX+gYopoCqegbSiobjeE83cq8NwFX22rxA8FRDKeJF6/5fv3N1h
 vJF0KR14QIxc+Y50KHLNMpCXLIM8/IXEiPXd0bqzFVEB+/uyikM8BRPQRHaq9PfA
 VhaRSDAe9rYgAQYZcayAPLHlrsrFXS/gFs2x15QxzeLfrke7uJg1jEa8CoxV50f3
 ImAOxbtxXcW0Oz0lU4J5iX6Kq1gwhv/GP/Rgr8Cf2xMo4dWy3k6/dt8Ep7FwuwWk
 +ReDNPBw+FMrRLN3hWTXr2Y7k4avOmCDmjj5L1YVVG9yug0WI4ZD3OelbcXtzAA6
 XUAuB/EjCQYNB8XrOGd7+xX8eZraA/68xgf9gtNM3ycoIQ2UAfGnQnaZm3BXjvGk
 zjnvG8JpaCBObvonK22y1t60Ive1PbZgpdtU2Za1QTUjZWUPPudLHr7oU30crOaW
 +aqtCam6UGU6GlsiMWzbN8cv/q5fo/gCWkfKLGEry2NCz/71mwc=
 =Nn2t
 -----END PGP SIGNATURE-----

Merge tag 'linux_kselftest-kunit-6.7-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest

Pull kunit updates from Shuah Khan:

 - string-stream testing enhancements

 - several fixes memory leaks

 - fix to reset status during parameter handling

* tag 'linux_kselftest-kunit-6.7-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
  kunit: test: Fix the possible memory leak in executor_test
  kunit: Fix possible memory leak in kunit_filter_suites()
  kunit: Fix the wrong kfree of copy for kunit_filter_suites()
  kunit: Fix missed memory release in kunit_free_suite_set()
  kunit: Reset test status on each param iteration
  kunit: string-stream: Test performance of string_stream
  kunit: Use string_stream for test log
  kunit: string-stream: Add tests for freeing resource-managed string_stream
  kunit: string-stream: Decouple string_stream from kunit
  kunit: string-stream: Add kunit_alloc_string_stream()
  kunit: Don't use a managed alloc in is_literal()
  kunit: string-stream-test: Add cases for string_stream newline appending
  kunit: string-stream: Add option to make all lines end with newline
  kunit: string-stream: Improve testing of string_stream
  kunit: string-stream: Don't create a fragment for empty strings
2023-11-01 17:02:29 -10:00
Linus Torvalds
05bf73aa27 Probes updates for v6.7:
- cleanups:
   . kprobes: Fixes typo in kprobes samples.
 
   . tracing/eprobes: Remove 'break' after return.
 
 - kretprobe/fprobe performance improvements:
   . lib: Introduce new `objpool`, which is a high performance lockless
     object queue. This uses per-cpu ring array to allocate/release
     objects from the pre-allocated object pool. Since the index of ring
     array is a 32bit sequential counter, we can retry to push/pop the
     object pointer from the ring without lock (as seq-lock does).
 
   . lib: Add an objpool test module to test the functionality and
     evaluate the performance under some circumstances.
 
   . kprobes/fprobe: Improve kretprobe and rethook scalability
     performance with objpool.
     This improves both legacy kretprobe and fprobe exit handler (which
     is based on rethook) to be scalable on SMP systems. Even with
     8-threads parallel test, it shows a great scalability improvement.
 
   . Remove unneeded freelist.h which is replaced by objpool.
 
   . objpool: Add maintainers entry for the objpool.
 
   . objpool: Fix to remove unused include header lines.
 -----BEGIN PGP SIGNATURE-----
 
 iQFPBAABCgA5FiEEh7BulGwFlgAOi5DV2/sHvwUrPxsFAmVA54obHG1hc2FtaS5o
 aXJhbWF0c3VAZ21haWwuY29tAAoJENv7B78FKz8busoH/3mG/rJwVVJw70zTLlfs
 ko4U1wn16aImYQYYLXkZLlYsKr6Y2dzNkb5C4CEI2r47EZjTamHatGZ6MSwvAtPb
 u9oloHEbRbE6yM+EjrE1JAKT9FwC+21/yZCN2zACZKJRwCwQRzxGIXUwGTWtDNdE
 NySLBDyMoR6zZJsFy8YueFBAJxcZdWIPK6mQH2Y5awVQA4tV7tQEe92KFqUYWTd5
 exbfBbcVG8MBWmrPqRI46Hxh0NWOnPCqFwGqX8Q7hE/yrQnTPzJ+2ZsbYFkGRk6A
 pM5wRCdwO5+OlcHEcEHBMQSGCmFgk6m1UMG8RvbCKyF3cwHbxzlelbjzHosKQvSh
 EKQ=
 =/vZK
 -----END PGP SIGNATURE-----

Merge tag 'probes-v6.7' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace

Pull probes updates from Masami Hiramatsu:
 "Cleanups:

   - kprobes: Fixes typo in kprobes samples

   - tracing/eprobes: Remove 'break' after return

  kretprobe/fprobe performance improvements:

   - lib: Introduce new `objpool`, which is a high performance lockless
     object queue. This uses per-cpu ring array to allocate/release
     objects from the pre-allocated object pool.

     Since the index of ring array is a 32bit sequential counter, we can
     retry to push/pop the object pointer from the ring without lock (as
     seq-lock does)

   - lib: Add an objpool test module to test the functionality and
     evaluate the performance under some circumstances

   - kprobes/fprobe: Improve kretprobe and rethook scalability
     performance with objpool.

     This improves both legacy kretprobe and fprobe exit handler (which
     is based on rethook) to be scalable on SMP systems. Even with
     8-threads parallel test, it shows a great scalability improvement

   - Remove unneeded freelist.h which is replaced by objpool

   - objpool: Add maintainers entry for the objpool

   - objpool: Fix to remove unused include header lines"

* tag 'probes-v6.7' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
  kprobes: unused header files removed
  MAINTAINERS: objpool added
  kprobes: freelist.h removed
  kprobes: kretprobe scalability improvement
  lib: objpool test module added
  lib: objpool added: ring-array based lockless MPMC
  tracing/eprobe: drop unneeded breaks
  samples: kprobes: Fixes a typo
2023-11-01 16:15:42 -10:00
Linus Torvalds
1e0c505e13 asm-generic updates for v6.7
The ia64 architecture gets its well-earned retirement as planned,
 now that there is one last (mostly) working release that will
 be maintained as an LTS kernel.
 
 The architecture specific system call tables are updated for
 the added map_shadow_stack() syscall and to remove references
 to the long-gone sys_lookup_dcookie() syscall.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEiK/NIGsWEZVxh/FrYKtH/8kJUicFAmVC40IACgkQYKtH/8kJ
 Uidhmw/9EX+aWSXGoObJ3fngaNSMw+PmrEuP8qEKBHxfKHcCdX3hc451Oh4GlhaQ
 tru91pPwgNvN2/rfoKusxT+V4PemGIzfNni/04rp+P0kvmdw5otQ2yNhsQNsfVmq
 XGWvkxF4P2GO6bkjjfR/1dDq7GtlyXtwwPDKeLbYb6TnJOZjtx+EAN27kkfSn1Ms
 R4Sa3zJ+DfHUmHL5S9g+7UD/CZ5GfKNmIskI4Mz5GsfoUz/0iiU+Bge/9sdcdSJQ
 kmbLy5YnVzfooLZ3TQmBFsO3iAMWb0s/mDdtyhqhTVmTUshLolkPYyKnPFvdupyv
 shXcpEST2XJNeaDRnL2K4zSCdxdbnCZHDpjfl9wfioBg7I8NfhXKpf1jYZHH1de4
 LXq8ndEFEOVQw/zSpYWfQq1sux8Jiqr+UK/ukbVeFWiGGIUs91gEWtPAf8T0AZo9
 ujkJvaWGl98O1g5wmBu0/dAR6QcFJMDfVwbmlIFpU8O+MEaz6X8mM+O5/T0IyTcD
 eMbAUjj4uYcU7ihKzHEv/0SS9Of38kzff67CLN5k8wOP/9NlaGZ78o1bVle9b52A
 BdhrsAefFiWHp1jT6Y9Rg4HOO/TguQ9e6EWSKOYFulsiLH9LEFaB9RwZLeLytV0W
 vlAgY9rUW77g1OJcb7DoNv33nRFuxsKqsnz3DEIXtgozo9CzbYI=
 =H1vH
 -----END PGP SIGNATURE-----

Merge tag 'asm-generic-6.7' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic

Pull ia64 removal and asm-generic updates from Arnd Bergmann:

 - The ia64 architecture gets its well-earned retirement as planned,
   now that there is one last (mostly) working release that will be
   maintained as an LTS kernel.

 - The architecture specific system call tables are updated for the
   added map_shadow_stack() syscall and to remove references to the
   long-gone sys_lookup_dcookie() syscall.

* tag 'asm-generic-6.7' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic:
  hexagon: Remove unusable symbols from the ptrace.h uapi
  asm-generic: Fix spelling of architecture
  arch: Reserve map_shadow_stack() syscall number for all architectures
  syscalls: Cleanup references to sys_lookup_dcookie()
  Documentation: Drop or replace remaining mentions of IA64
  lib/raid6: Drop IA64 support
  Documentation: Drop IA64 from feature descriptions
  kernel: Drop IA64 support from sig_fault handlers
  arch: Remove Itanium (IA-64) architecture
2023-11-01 15:28:33 -10:00
Linus Torvalds
385903a7ec SoC driver updates for 6.7
The highlights for the driver support this time are
 
  - Qualcomm platforms gain support for the Qualcomm Secure Execution
    Environment firmware interface to access EFI variables on certain
    devices, and new features for multiple platform and firmware drivers.
 
  - Arm FF-A firmware support gains support for v1.1 specification features,
    in particular notification and memory transaction descriptor changes.
 
  - SCMI firmware support now support v3.2 features for clock and DVFS
    configuration and a new transport for Qualcomm platforms.
 
  - Minor cleanups and bugfixes are added to pretty much all the active
    platforms: qualcomm, broadcom, dove, ti-k3, rockchip, sifive, amlogic,
    atmel, tegra, aspeed, vexpress, mediatek, samsung and more.
    In particular, this contains portions of the treewide conversion to
    use __counted_by annotations and the device_get_match_data helper.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEiK/NIGsWEZVxh/FrYKtH/8kJUicFAmVC10IACgkQYKtH/8kJ
 UifFoQ//Tw7aux88EA2UkyL2Wulv80NwRQn3tQlxI/6ltjBX64yeQ6Y8OzmYdSYK
 20NEpbU7VWOFftN+D6Jp1HLrvfi0OV9uJn3WiTX3ChgDXixpOXo4TYgNNTlb9uZ4
 MrSTG3NkS27m/oTaCmYprOObgSNLq1FRCGIP7w4U9gyMk9N9FSKMpSJjlH06qPz6
 WBLTaIwPgBsyrLfCdxfA1y7AFCAHVxQJO4bp0VWSIalTrneGTeQrd2FgYMUesQ2e
 fIUNCaU4mpmj8XnQ/W19Wsek8FRB+fOh0hn/Gl+iHYibpxusIsn7bkdZ5BOJn2J0
 OY3C1biopaaxXcZ+wmnX9X0ieZ3TDsHzYOEf0zmNGzMZaZkV8kQt4/Ykv77xz6Gc
 4Bl6JI5QZ4rTZvlHYGMYxhy3hKuB31mO2rHbei7eR7J7UmjzWcl5P6HYfCgj7wzH
 crIWj1IR1Nx6Dt/wXf3HlRcEiAEJ2D0M3KIFjAVT239TsxacBfDrRk+YedF2bKbn
 WMYfVM6jJnPOykGg/gMRlttS/o/7TqHBl3y/900Idiijcm3cRPbQ+uKfkpHXftN/
 2vOtsw7pzEg7QQI9GVrb4drTrLvYJ7GQOi4o0twXTCshlXUk2V684jvHt0emFkdX
 ew9Zft4YLAYSmuJ3XqGhhMP63FsHKMlB1aSTKKPeswdIJmrdO80=
 =QIut
 -----END PGP SIGNATURE-----

Merge tag 'soc-drivers-6.7' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc

Pull SoC driver updates from Arnd Bergmann:
 "The highlights for the driver support this time are

   - Qualcomm platforms gain support for the Qualcomm Secure Execution
     Environment firmware interface to access EFI variables on certain
     devices, and new features for multiple platform and firmware
     drivers.

   - Arm FF-A firmware support gains support for v1.1 specification
     features, in particular notification and memory transaction
     descriptor changes.

   - SCMI firmware support now support v3.2 features for clock and DVFS
     configuration and a new transport for Qualcomm platforms.

   - Minor cleanups and bugfixes are added to pretty much all the active
     platforms: qualcomm, broadcom, dove, ti-k3, rockchip, sifive,
     amlogic, atmel, tegra, aspeed, vexpress, mediatek, samsung and
     more.

     In particular, this contains portions of the treewide conversion to
     use __counted_by annotations and the device_get_match_data helper"

* tag 'soc-drivers-6.7' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (156 commits)
  soc: qcom: pmic_glink_altmode: Print return value on error
  firmware: qcom: scm: remove unneeded 'extern' specifiers
  firmware: qcom: scm: add a missing forward declaration for struct device
  firmware: qcom: move Qualcomm code into its own directory
  soc: samsung: exynos-chipid: Convert to platform remove callback returning void
  soc: qcom: apr: Add __counted_by for struct apr_rx_buf and use struct_size()
  soc: qcom: pmic_glink: fix connector type to be DisplayPort
  soc: ti: k3-socinfo: Avoid overriding return value
  soc: ti: k3-socinfo: Fix typo in bitfield documentation
  soc: ti: knav_qmss_queue: Use device_get_match_data()
  firmware: ti_sci: Use device_get_match_data()
  firmware: qcom: qseecom: add missing include guards
  soc/pxa: ssp: Convert to platform remove callback returning void
  soc/mediatek: mtk-mmsys: Convert to platform remove callback returning void
  soc/mediatek: mtk-devapc: Convert to platform remove callback returning void
  soc/loongson: loongson2_guts: Convert to platform remove callback returning void
  soc/litex: litex_soc_ctrl: Convert to platform remove callback returning void
  soc/ixp4xx: ixp4xx-qmgr: Convert to platform remove callback returning void
  soc/ixp4xx: ixp4xx-npe: Convert to platform remove callback returning void
  soc/hisilicon: kunpeng_hccs: Convert to platform remove callback returning void
  ...
2023-11-01 14:46:51 -10:00
Alexey Dobriyan
72fcce70fa vsprintf: uninline simple_strntoull(), reorder arguments
* uninline simple_strntoull(),
  gcc overinlines and this function is not performance critical

* reorder arguments, so that appending INT_MAX as 4th argument
  generates very efficient tail call

Space savings:

	add/remove: 1/0 grow/shrink: 0/3 up/down: 27/-179 (-152)
	Function                            old     new   delta
	simple_strntoll                       -      27     +27
	simple_strtoull                      15      10      -5
	simple_strtoll                       41       7     -34
	vsscanf                            1930    1790    -140

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Signed-off-by: Petr Mladek <pmladek@suse.com>
Link: https://lore.kernel.org/all/82a2af6e-9b6c-4a09-89d7-ca90cc1cdad1@p183/
2023-11-01 15:55:39 +01:00
Linus Torvalds
89ed67ef12 Networking changes for 6.7.
Core & protocols
 ----------------
 
  - Support usec resolution of TCP timestamps, enabled selectively by
    a route attribute.
 
  - Defer regular TCP ACK while processing socket backlog, try to send
    a cumulative ACK at the end. Increase single TCP flow performance
    on a 200Gbit NIC by 20% (100Gbit -> 120Gbit).
 
  - The Fair Queuing (FQ) packet scheduler:
    - add built-in 3 band prio / WRR scheduling
    - support bypass if the qdisc is mostly idle (5% speed up for TCP RR)
    - improve inactive flow reporting
    - optimize the layout of structures for better cache locality
 
  - Support TCP Authentication Option (RFC 5925, TCP-AO), a more modern
    replacement for the old MD5 option.
 
  - Add more retransmission timeout (RTO) related statistics to TCP_INFO.
 
  - Support sending fragmented skbs over vsock sockets.
 
  - Make sure we send SIGPIPE for vsock sockets if socket was shutdown().
 
  - Add sysctl for ignoring lower limit on lifetime in Router
    Advertisement PIO, based on an in-progress IETF draft.
 
  - Add sysctl to control activation of TCP ping-pong mode.
 
  - Add sysctl to make connection timeout in MPTCP configurable.
 
  - Support rcvlowat and notsent_lowat on MPTCP sockets, to help apps
    limit the number of wakeups.
 
  - Support netlink GET for MDB (multicast forwarding), allowing user
    space to request a single MDB entry instead of dumping the entire
    table.
 
  - Support selective FDB flushing in the VXLAN tunnel driver.
 
  - Allow limiting learned FDB entries in bridges, prevent OOM attacks.
 
  - Allow controlling via configfs netconsole targets which were created
    via the kernel cmdline at boot, rather than via configfs at runtime.
 
  - Support multiple PTP timestamp event queue readers with different
    filters.
 
  - MCTP over I3C.
 
 BPF
 ---
 
  - Add new veth-like netdevice where BPF program defines the logic
    of the xmit routine. It can operate in L3 and L2 mode.
 
  - Support exceptions - allow asserting conditions which should
    never be true but are hard for the verifier to infer.
    With some extra flexibility around handling of the exit / failure.
    https://lwn.net/Articles/938435/
 
  - Add support for local per-cpu kptr, allow allocating and storing
    per-cpu objects in maps. Access to those objects operates on
    the value for the current CPU. This allows to deprecate local
    one-off implementations of per-CPU storage like
    BPF_MAP_TYPE_PERCPU_CGROUP_STORAGE maps.
 
  - Extend cgroup BPF sockaddr hooks for UNIX sockets. The use case is
    for systemd to re-implement the LogNamespace feature which allows
    running multiple instances of systemd-journald to process the logs
    of different services.
 
  - Enable open-coded task_vma iteration, after maple tree conversion
    made it hard to directly walk VMAs in tracing programs.
 
  - Add open-coded task, css_task and css iterator support.
    One of the use cases is customizable OOM victim selection via BPF.
 
  - Allow source address selection with bpf_*_fib_lookup().
 
  - Add ability to pin BPF timer to the current CPU.
 
  - Prevent creation of infinite loops by combining tail calls and
    fentry/fexit programs.
 
  - Add missed stats for kprobes to retrieve the number of missed kprobe
    executions and subsequent executions of BPF programs.
 
  - Inherit system settings for CPU security mitigations.
 
  - Add BPF v4 CPU instruction support for arm32 and s390x.
 
 Changes to common code
 ----------------------
 
  - overflow: add DEFINE_FLEX() for on-stack definition of structs
    with flexible array members.
 
  - Process doc update with more guidance for reviewers.
 
 Driver API
 ----------
 
  - Simplify locking in WiFi (cfg80211 and mac80211 layers), use wiphy
    mutex in most places and remove a lot of smaller locks.
 
  - Create a common DPLL configuration API. Allow configuring
    and querying state of PLL circuits used for clock syntonization,
    in network time distribution.
 
  - Unify fragmented and full page allocation APIs in page pool code.
    Let drivers be ignorant of PAGE_SIZE.
 
  - Rework PHY state machine to avoid races with calls to phy_stop().
 
  - Notify DSA drivers of MAC address changes on user ports, improve
    correctness of offloads which depend on matching port MAC addresses.
 
  - Allow antenna control on injected WiFi frames.
 
  - Reduce the number of variants of napi_schedule().
 
  - Simplify error handling when composing devlink health messages.
 
 Misc
 ----
 
  - A lot of KCSAN data race "fixes", from Eric.
 
  - A lot of __counted_by() annotations, from Kees.
 
  - A lot of strncpy -> strscpy and printf format fixes.
 
  - Replace master/slave terminology with conduit/user in DSA drivers.
 
  - Handful of KUnit tests for netdev and WiFi core.
 
 Removed
 -------
 
  - AppleTalk COPS.
 
  - AppleTalk ipddp.
 
  - TI AR7 CPMAC Ethernet driver.
 
 Drivers
 -------
 
  - Ethernet high-speed NICs:
    - Intel (100G, ice, idpf):
      - add a driver for the Intel E2000 IPUs
      - make CRC/FCS stripping configurable
      - cross-timestamping for E823 devices
      - basic support for E830 devices
      - use aux-bus for managing client drivers
      - i40e: report firmware versions via devlink
    - nVidia/Mellanox:
      - support 4-port NICs
      - increase max number of channels to 256
      - optimize / parallelize SF creation flow
    - Broadcom (bnxt):
      - enhance NIC temperature reporting
      - support PAM4 speeds and lane configuration
    - Marvell OcteonTX2:
      - PTP pulse-per-second output support
      - enable hardware timestamping for VFs
    - Solarflare/AMD:
      - conntrack NAT offload and offload for tunnels
    - Wangxun (ngbe/txgbe):
      - expose HW statistics
    - Pensando/AMD:
      - support PCI level reset
      - narrow down the condition under which skbs are linearized
    - Netronome/Corigine (nfp):
      - support CHACHA20-POLY1305 crypto in IPsec offload
 
  - Ethernet NICs embedded, slower, virtual:
    - Synopsys (stmmac):
      - add Loongson-1 SoC support
      - enable use of HW queues with no offload capabilities
      - enable PPS input support on all 5 channels
      - increase TX coalesce timer to 5ms
    - RealTek USB (r8152): improve efficiency of Rx by using GRO frags
    - xen: support SW packet timestamping
    - add drivers for implementations based on TI's PRUSS (AM64x EVM)
 
  - nVidia/Mellanox Ethernet datacenter switches:
    - avoid poor HW resource use on Spectrum-4 by better block selection
      for IPv6 multicast forwarding and ordering of blocks in ACL region
 
  - Ethernet embedded switches:
    - Microchip:
      - support configuring the drive strength for EMI compliance
      - ksz9477: partial ACL support
      - ksz9477: HSR offload
      - ksz9477: Wake on LAN
    - Realtek:
      - rtl8366rb: respect device tree config of the CPU port
 
  - Ethernet PHYs:
    - support Broadcom BCM5221 PHYs
    - TI dp83867: support hardware LED blinking
 
  - CAN:
    - add support for Linux-PHY based CAN transceivers
    - at91_can: clean up and use rx-offload helpers
 
  - WiFi:
    - MediaTek (mt76):
      - new sub-driver for mt7925 USB/PCIe devices
      - HW wireless <> Ethernet bridging in MT7988 chips
      - mt7603/mt7628 stability improvements
    - Qualcomm (ath12k):
      - WCN7850:
        - enable 320 MHz channels in 6 GHz band
        - hardware rfkill support
        - enable IEEE80211_HW_SINGLE_SCAN_ON_ALL_BANDS
          to make scan faster
        - read board data variant name from SMBIOS
      - QCN9274: mesh support
    - RealTek (rtw89):
      - TDMA-based multi-channel concurrency (MCC)
    - Silicon Labs (wfx):
      - Remain-On-Channel (ROC) support
 
  - Bluetooth:
    - ISO: many improvements for broadcast support
    - mark BCM4378/BCM4387 as BROKEN_LE_CODED
    - add support for QCA2066
    - btmtksdio: enable Bluetooth wakeup from suspend
 
 Signed-off-by: Jakub Kicinski <kuba@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEE6jPA+I1ugmIBA4hXMUZtbf5SIrsFAmU8XsYACgkQMUZtbf5S
 Irv19RAAnud/24OOF5XMEJkIcYlnfqximh4XO6PujRSYkSkOUJdZTF6iJPgf3pSP
 YpwoHYbYKHYfeOf8+3bTNESiQNSnoVmvmvwiS6/7lZ3behHUrGLQzW9Htc3EZyWH
 2h6QkDZ5OOjfg0bwYSfp3vXkmMH2k8WE9Y0NvCkhcohqZi13Rmp14RnyPmNb2d1V
 yZRYDMSM133KqE6gnBr1Ct65IEvnKeGlCUN2mTGqOJgdn6DZMsyxvtt0y4rmN7Ab
 41+CgPU5SfxfbYpW+Dl2HJpgfte3WrC57KC6AM0PAPJzPmQWgeB/m9mjz/apj6Bg
 bhsEIo7FdvbCnQm3yWPhK2OgCAcSwLr8jfGMU+Q+W4VnL5SRRR3Rm0zjsze+kHNP
 OfqJgxzl3DpvoJqVBy1h5FGcZt0XHwhksm4cTxWqIahsF+veY0ECBXbuBBQx9XTF
 Y7INfI8ulg7wISJs+CJfIClYkgOibTw2u8taBS5ikbtgxNqp5D4QqODn7UefQap1
 PR/IDYODF+zRgmMJLeBqSa6fij6BkfOEDiOWak5kggBoZdtbtmeKI6tzze06CNdW
 lWv1WEhRufxnwK+IuWsEkjhiMbs2WGLvkJ5JbgQV9BfqHfIfiqBCrcWtT/WbQnGt
 lmU46CXh1t/FZEqbmK9h+8vsIIfrcDl6jb5npEiKPRG00vDKRTM=
 =46nS
 -----END PGP SIGNATURE-----

Merge tag 'net-next-6.7' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next

Pull networking updates from Jakub Kicinski:
 "Core & protocols:

   - Support usec resolution of TCP timestamps, enabled selectively by a
     route attribute.

   - Defer regular TCP ACK while processing socket backlog, try to send
     a cumulative ACK at the end. Increase single TCP flow performance
     on a 200Gbit NIC by 20% (100Gbit -> 120Gbit).

   - The Fair Queuing (FQ) packet scheduler:
       - add built-in 3 band prio / WRR scheduling
       - support bypass if the qdisc is mostly idle (5% speed up for TCP RR)
       - improve inactive flow reporting
       - optimize the layout of structures for better cache locality

   - Support TCP Authentication Option (RFC 5925, TCP-AO), a more modern
     replacement for the old MD5 option.

   - Add more retransmission timeout (RTO) related statistics to
     TCP_INFO.

   - Support sending fragmented skbs over vsock sockets.

   - Make sure we send SIGPIPE for vsock sockets if socket was
     shutdown().

   - Add sysctl for ignoring lower limit on lifetime in Router
     Advertisement PIO, based on an in-progress IETF draft.

   - Add sysctl to control activation of TCP ping-pong mode.

   - Add sysctl to make connection timeout in MPTCP configurable.

   - Support rcvlowat and notsent_lowat on MPTCP sockets, to help apps
     limit the number of wakeups.

   - Support netlink GET for MDB (multicast forwarding), allowing user
     space to request a single MDB entry instead of dumping the entire
     table.

   - Support selective FDB flushing in the VXLAN tunnel driver.

   - Allow limiting learned FDB entries in bridges, prevent OOM attacks.

   - Allow controlling via configfs netconsole targets which were
     created via the kernel cmdline at boot, rather than via configfs at
     runtime.

   - Support multiple PTP timestamp event queue readers with different
     filters.

   - MCTP over I3C.

  BPF:

   - Add new veth-like netdevice where BPF program defines the logic of
     the xmit routine. It can operate in L3 and L2 mode.

   - Support exceptions - allow asserting conditions which should never
     be true but are hard for the verifier to infer. With some extra
     flexibility around handling of the exit / failure:

          https://lwn.net/Articles/938435/

   - Add support for local per-cpu kptr, allow allocating and storing
     per-cpu objects in maps. Access to those objects operates on the
     value for the current CPU.

     This allows to deprecate local one-off implementations of per-CPU
     storage like BPF_MAP_TYPE_PERCPU_CGROUP_STORAGE maps.

   - Extend cgroup BPF sockaddr hooks for UNIX sockets. The use case is
     for systemd to re-implement the LogNamespace feature which allows
     running multiple instances of systemd-journald to process the logs
     of different services.

   - Enable open-coded task_vma iteration, after maple tree conversion
     made it hard to directly walk VMAs in tracing programs.

   - Add open-coded task, css_task and css iterator support. One of the
     use cases is customizable OOM victim selection via BPF.

   - Allow source address selection with bpf_*_fib_lookup().

   - Add ability to pin BPF timer to the current CPU.

   - Prevent creation of infinite loops by combining tail calls and
     fentry/fexit programs.

   - Add missed stats for kprobes to retrieve the number of missed
     kprobe executions and subsequent executions of BPF programs.

   - Inherit system settings for CPU security mitigations.

   - Add BPF v4 CPU instruction support for arm32 and s390x.

  Changes to common code:

   - overflow: add DEFINE_FLEX() for on-stack definition of structs with
     flexible array members.

   - Process doc update with more guidance for reviewers.

  Driver API:

   - Simplify locking in WiFi (cfg80211 and mac80211 layers), use wiphy
     mutex in most places and remove a lot of smaller locks.

   - Create a common DPLL configuration API. Allow configuring and
     querying state of PLL circuits used for clock syntonization, in
     network time distribution.

   - Unify fragmented and full page allocation APIs in page pool code.
     Let drivers be ignorant of PAGE_SIZE.

   - Rework PHY state machine to avoid races with calls to phy_stop().

   - Notify DSA drivers of MAC address changes on user ports, improve
     correctness of offloads which depend on matching port MAC
     addresses.

   - Allow antenna control on injected WiFi frames.

   - Reduce the number of variants of napi_schedule().

   - Simplify error handling when composing devlink health messages.

  Misc:

   - A lot of KCSAN data race "fixes", from Eric.

   - A lot of __counted_by() annotations, from Kees.

   - A lot of strncpy -> strscpy and printf format fixes.

   - Replace master/slave terminology with conduit/user in DSA drivers.

   - Handful of KUnit tests for netdev and WiFi core.

  Removed:

   - AppleTalk COPS.

   - AppleTalk ipddp.

   - TI AR7 CPMAC Ethernet driver.

  Drivers:

   - Ethernet high-speed NICs:
      - Intel (100G, ice, idpf):
         - add a driver for the Intel E2000 IPUs
         - make CRC/FCS stripping configurable
         - cross-timestamping for E823 devices
         - basic support for E830 devices
         - use aux-bus for managing client drivers
         - i40e: report firmware versions via devlink
      - nVidia/Mellanox:
         - support 4-port NICs
         - increase max number of channels to 256
         - optimize / parallelize SF creation flow
      - Broadcom (bnxt):
         - enhance NIC temperature reporting
         - support PAM4 speeds and lane configuration
      - Marvell OcteonTX2:
         - PTP pulse-per-second output support
         - enable hardware timestamping for VFs
      - Solarflare/AMD:
         - conntrack NAT offload and offload for tunnels
      - Wangxun (ngbe/txgbe):
         - expose HW statistics
      - Pensando/AMD:
         - support PCI level reset
         - narrow down the condition under which skbs are linearized
      - Netronome/Corigine (nfp):
         - support CHACHA20-POLY1305 crypto in IPsec offload

   - Ethernet NICs embedded, slower, virtual:
      - Synopsys (stmmac):
         - add Loongson-1 SoC support
         - enable use of HW queues with no offload capabilities
         - enable PPS input support on all 5 channels
         - increase TX coalesce timer to 5ms
      - RealTek USB (r8152): improve efficiency of Rx by using GRO frags
      - xen: support SW packet timestamping
      - add drivers for implementations based on TI's PRUSS (AM64x EVM)

   - nVidia/Mellanox Ethernet datacenter switches:
      - avoid poor HW resource use on Spectrum-4 by better block
        selection for IPv6 multicast forwarding and ordering of blocks
        in ACL region

   - Ethernet embedded switches:
      - Microchip:
         - support configuring the drive strength for EMI compliance
         - ksz9477: partial ACL support
         - ksz9477: HSR offload
         - ksz9477: Wake on LAN
      - Realtek:
         - rtl8366rb: respect device tree config of the CPU port

   - Ethernet PHYs:
      - support Broadcom BCM5221 PHYs
      - TI dp83867: support hardware LED blinking

   - CAN:
      - add support for Linux-PHY based CAN transceivers
      - at91_can: clean up and use rx-offload helpers

   - WiFi:
      - MediaTek (mt76):
         - new sub-driver for mt7925 USB/PCIe devices
         - HW wireless <> Ethernet bridging in MT7988 chips
         - mt7603/mt7628 stability improvements
      - Qualcomm (ath12k):
         - WCN7850:
            - enable 320 MHz channels in 6 GHz band
            - hardware rfkill support
            - enable IEEE80211_HW_SINGLE_SCAN_ON_ALL_BANDS to
              make scan faster
            - read board data variant name from SMBIOS
        - QCN9274: mesh support
      - RealTek (rtw89):
         - TDMA-based multi-channel concurrency (MCC)
      - Silicon Labs (wfx):
         - Remain-On-Channel (ROC) support

   - Bluetooth:
      - ISO: many improvements for broadcast support
      - mark BCM4378/BCM4387 as BROKEN_LE_CODED
      - add support for QCA2066
      - btmtksdio: enable Bluetooth wakeup from suspend"

* tag 'net-next-6.7' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1816 commits)
  net: pcs: xpcs: Add 2500BASE-X case in get state for XPCS drivers
  net: bpf: Use sockopt_lock_sock() in ip_sock_set_tos()
  net: mana: Use xdp_set_features_flag instead of direct assignment
  vxlan: Cleanup IFLA_VXLAN_PORT_RANGE entry in vxlan_get_size()
  iavf: delete the iavf client interface
  iavf: add a common function for undoing the interrupt scheme
  iavf: use unregister_netdev
  iavf: rely on netdev's own registered state
  iavf: fix the waiting time for initial reset
  iavf: in iavf_down, don't queue watchdog_task if comms failed
  iavf: simplify mutex_trylock+sleep loops
  iavf: fix comments about old bit locks
  doc/netlink: Update schema to support cmd-cnt-name and cmd-max-name
  tools: ynl: introduce option to process unknown attributes or types
  ipvlan: properly track tx_errors
  netdevsim: Block until all devices are released
  nfp: using napi_build_skb() to replace build_skb()
  net: dsa: microchip: ksz9477: Fix spelling mistake "Enery" -> "Energy"
  net: dsa: microchip: Ensure Stable PME Pin State for Wake-on-LAN
  net: dsa: microchip: Refactor switch shutdown routine for WoL preparation
  ...
2023-10-31 05:10:11 -10:00
Linus Torvalds
befaa609f4 hardening updates for v6.7-rc1
- Add LKDTM test for stuck CPUs (Mark Rutland)
 
 - Improve LKDTM selftest behavior under UBSan (Ricardo Cañuelo)
 
 - Refactor more 1-element arrays into flexible arrays (Gustavo A. R. Silva)
 
 - Analyze and replace strlcpy and strncpy uses (Justin Stitt, Azeem Shaikh)
 
 - Convert group_info.usage to refcount_t (Elena Reshetova)
 
 - Add __counted_by annotations (Kees Cook, Gustavo A. R. Silva)
 
 - Add Kconfig fragment for basic hardening options (Kees Cook, Lukas Bulwahn)
 
 - Fix randstruct GCC plugin performance mode to stay in groups (Kees Cook)
 
 - Fix strtomem() compile-time check for small sources (Kees Cook)
 -----BEGIN PGP SIGNATURE-----
 
 iQJKBAABCgA0FiEEpcP2jyKd1g9yPm4TiXL039xtwCYFAmU/3cUWHGtlZXNjb29r
 QGNocm9taXVtLm9yZwAKCRCJcvTf3G3AJsEoEACBGPSiOmfSWdH3TOnIG270PD24
 jGjg8KFv7RC/JTOdYmpLl0okdlGT9LvjN/ToSSDEw3PIayxoXUdhkbYy0MYtiV3m
 yz2ozDTzJuplQX/W2fPE+nXSzIwHao2zjPPFjHnT7lt8IIjhgjiOtLfZ2gGUkW99
 Mdu2aWh3u0r4tC8OS23++yN5ibRc5l72efsjDWjZ0aPXnxE1bjmLMiIPiizpndIf
 beasPuDBs98sJVYouemCwnsPXuXOPz3Q1Cpo/fTd+TMTJCLSemCQZCTuOBU0acI/
 ZjLCgCaJU1yIYKBMtrIN4G9kITZniXX3/Nm4o6NQMVlcCqMeNaHuflomqWoqWfhE
 UPbRo2eghZOaMNiCKLLvZDIqPrh1IcsiEl6Ef3W4hICc42GTK96IuGisIvDXwQ4N
 /SzTOupJuN42noh3z1M3XuZy5RoXJ99IYDNY5CTKf9IdqvA0bbGkU3nb1gZH/xw9
 BjTqKzR/7K1kTXuSgagDZ1Wceej9pZxhX7E3IHYsP8ZOvKug3EeL4yybVwQ3HRfq
 Qnzcp/qPB9cOkLSQXveRTFTsj2mX28Gixct/iDuc1jIYwGQlY1gI6dcUcqby6ptM
 BrQti7eR2NH2+T3aE2UVCIWsZVhx7NaSF+z8JxfAuu56jicc4xJVsi8zrNveWX5M
 m2VXyBl3121BVtKi4w==
 =0iVF
 -----END PGP SIGNATURE-----

Merge tag 'hardening-v6.7-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux

Pull hardening updates from Kees Cook:
 "One of the more voluminous set of changes is for adding the new
  __counted_by annotation[1] to gain run-time bounds checking of
  dynamically sized arrays with UBSan.

   - Add LKDTM test for stuck CPUs (Mark Rutland)

   - Improve LKDTM selftest behavior under UBSan (Ricardo Cañuelo)

   - Refactor more 1-element arrays into flexible arrays (Gustavo A. R.
     Silva)

   - Analyze and replace strlcpy and strncpy uses (Justin Stitt, Azeem
     Shaikh)

   - Convert group_info.usage to refcount_t (Elena Reshetova)

   - Add __counted_by annotations (Kees Cook, Gustavo A. R. Silva)

   - Add Kconfig fragment for basic hardening options (Kees Cook, Lukas
     Bulwahn)

   - Fix randstruct GCC plugin performance mode to stay in groups (Kees
     Cook)

   - Fix strtomem() compile-time check for small sources (Kees Cook)"

* tag 'hardening-v6.7-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: (56 commits)
  hwmon: (acpi_power_meter) replace open-coded kmemdup_nul
  reset: Annotate struct reset_control_array with __counted_by
  kexec: Annotate struct crash_mem with __counted_by
  virtio_console: Annotate struct port_buffer with __counted_by
  ima: Add __counted_by for struct modsig and use struct_size()
  MAINTAINERS: Include stackleak paths in hardening entry
  string: Adjust strtomem() logic to allow for smaller sources
  hardening: x86: drop reference to removed config AMD_IOMMU_V2
  randstruct: Fix gcc-plugin performance mode to stay in group
  mailbox: zynqmp: Annotate struct zynqmp_ipi_pdata with __counted_by
  drivers: thermal: tsens: Annotate struct tsens_priv with __counted_by
  irqchip/imx-intmux: Annotate struct intmux_data with __counted_by
  KVM: Annotate struct kvm_irq_routing_table with __counted_by
  virt: acrn: Annotate struct vm_memory_region_batch with __counted_by
  hwmon: Annotate struct gsc_hwmon_platform_data with __counted_by
  sparc: Annotate struct cpuinfo_tree with __counted_by
  isdn: kcapi: replace deprecated strncpy with strscpy_pad
  isdn: replace deprecated strncpy with strscpy
  NFS/flexfiles: Annotate struct nfs4_ff_layout_segment with __counted_by
  nfs41: Annotate struct nfs4_file_layout_dsaddr with __counted_by
  ...
2023-10-30 19:09:55 -10:00
Kent Overstreet
ee526b88ca closures: Fix race in closure_sync()
As pointed out by Linus, closure_sync() was racy; we could skip blocking
immediately after a get() and a put(), but then that would skip any
barrier corresponding to the other thread's put() barrier.

To fix this, always do the full __closure_sync() sequence whenever any
get() has happened and the closure might have been used by other
threads.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-30 21:48:22 -04:00
Kent Overstreet
2bce6368c4 closures: Better memory barriers
atomic_(dec|sub)_return_release() are a thing now - use them.

Also, delete the useless barrier in set_closure_fn(): it's redundant
with the memory barrier in closure_put(0.

Since closure_put() would now otherwise just have a release barrier, we
also need a new barrier when the ref hits 0 -
smp_acquire__after_ctrl_dep().

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-30 21:48:22 -04:00
Linus Torvalds
63ce50fff9 Scheduler changes for v6.7 are:
- Fair scheduler (SCHED_OTHER) improvements:
 
     - Remove the old and now unused SIS_PROP code & option
     - Scan cluster before LLC in the wake-up path
     - Use candidate prev/recent_used CPU if scanning failed for cluster wakeup
 
  - NUMA scheduling improvements:
 
     - Improve the VMA access-PID code to better skip/scan VMAs
     - Extend tracing to cover VMA-skipping decisions
     - Improve/fix the recently introduced sched_numa_find_nth_cpu() code
     - Generalize numa_map_to_online_node()
 
  - Energy scheduling improvements:
 
     - Remove the EM_MAX_COMPLEXITY limit
     - Add tracepoints to track energy computation
     - Make the behavior of the 'sched_energy_aware' sysctl more consistent
     - Consolidate and clean up access to a CPU's max compute capacity
     - Fix uclamp code corner cases
 
  - RT scheduling improvements:
 
     - Drive dl_rq->overloaded with dl_rq->pushable_dl_tasks updates
     - Drive the ->rto_mask with rt_rq->pushable_tasks updates
 
  - Scheduler scalability improvements:
 
     - Rate-limit updates to tg->load_avg
     - On x86 disable IBRS when CPU is offline to improve single-threaded performance
     - Micro-optimize in_task() and in_interrupt()
     - Micro-optimize the PSI code
     - Avoid updating PSI triggers and ->rtpoll_total when there are no state changes
 
  - Core scheduler infrastructure improvements:
 
     - Use saved_state to reduce some spurious freezer wakeups
     - Bring in a handful of fast-headers improvements to scheduler headers
     - Make the scheduler UAPI headers more widely usable by user-space
     - Simplify the control flow of scheduler syscalls by using lock guards
     - Fix sched_setaffinity() vs. CPU hotplug race
 
  - Scheduler debuggability improvements:
     - Disallow writing invalid values to sched_rt_period_us
     - Fix a race in the rq-clock debugging code triggering warnings
     - Fix a warning in the bandwidth distribution code
     - Micro-optimize in_atomic_preempt_off() checks
     - Enforce that the tasklist_lock is held in for_each_thread()
     - Print the TGID in sched_show_task()
     - Remove the /proc/sys/kernel/sched_child_runs_first sysctl
 
  - Misc cleanups & fixes
 
 Signed-off-by: Ingo Molnar <mingo@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAmU8/NoRHG1pbmdvQGtl
 cm5lbC5vcmcACgkQEnMQ0APhK1gN+xAAvKGYNZBCBG4jowxccgqAbCx81KOhhsy/
 KUaOmdLPg9WaXuqjZ5sggXQCMT0wUqBYAmqV7ts53VhWcma2I1ap4dCM6Jj+RLrc
 vNwkeNetsikiZtarMoCJs5NahL8ULh3liBaoAkkToPjQ5r43aZ/eKwDovEdIKc+g
 +Vgn7jUY8ssIrAOKT1midSwY1y8kAU2AzWOSFDTgedkJP4PgOu9/lBl9jSJ2sYaX
 N4XqONYPXTwOHUtvmzkYILxLz0k0GgJ7hmt78E8Xy2rC4taGCRwCfCMBYxREuwiP
 huo3O1P/iIe5svm4/EBUvcpvf44eAWTV+CD0dnJPwOc9IvFhpSzqSZZAsyy/JQKt
 Lnzmc/xmyc1PnXCYJfHuXrw2/m+MyUHaegPzh5iLJFrlqa79GavOElj0jNTAMzbZ
 39fybzPtuFP+64faRfu0BBlQZfORPBNc/oWMpPKqgP58YGuveKTWaUF5rl5lM7Ne
 nm07uOmq02JVR8YzPl/FcfhU2dPMawWuMwUjEr2eU+lAunY3PF88vu0FALj7iOBd
 66F8qrtpDHJanOxrdEUwSJ7hgw79qY1iw66Db7cQYjMazFKZONxArQPqFUZ0ngLI
 n9hVa7brg1bAQKrQflqjcIAIbpVu3SjPEl15cKpAJTB/gn5H66TQgw8uQ6HfG+h2
 GtOsn1nlvuk=
 =GDqb
 -----END PGP SIGNATURE-----

Merge tag 'sched-core-2023-10-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull scheduler updates from Ingo Molnar:
 "Fair scheduler (SCHED_OTHER) improvements:
   - Remove the old and now unused SIS_PROP code & option
   - Scan cluster before LLC in the wake-up path
   - Use candidate prev/recent_used CPU if scanning failed for cluster
     wakeup

  NUMA scheduling improvements:
   - Improve the VMA access-PID code to better skip/scan VMAs
   - Extend tracing to cover VMA-skipping decisions
   - Improve/fix the recently introduced sched_numa_find_nth_cpu() code
   - Generalize numa_map_to_online_node()

  Energy scheduling improvements:
   - Remove the EM_MAX_COMPLEXITY limit
   - Add tracepoints to track energy computation
   - Make the behavior of the 'sched_energy_aware' sysctl more
     consistent
   - Consolidate and clean up access to a CPU's max compute capacity
   - Fix uclamp code corner cases

  RT scheduling improvements:
   - Drive dl_rq->overloaded with dl_rq->pushable_dl_tasks updates
   - Drive the ->rto_mask with rt_rq->pushable_tasks updates

  Scheduler scalability improvements:
   - Rate-limit updates to tg->load_avg
   - On x86 disable IBRS when CPU is offline to improve single-threaded
     performance
   - Micro-optimize in_task() and in_interrupt()
   - Micro-optimize the PSI code
   - Avoid updating PSI triggers and ->rtpoll_total when there are no
     state changes

  Core scheduler infrastructure improvements:
   - Use saved_state to reduce some spurious freezer wakeups
   - Bring in a handful of fast-headers improvements to scheduler
     headers
   - Make the scheduler UAPI headers more widely usable by user-space
   - Simplify the control flow of scheduler syscalls by using lock
     guards
   - Fix sched_setaffinity() vs. CPU hotplug race

  Scheduler debuggability improvements:
   - Disallow writing invalid values to sched_rt_period_us
   - Fix a race in the rq-clock debugging code triggering warnings
   - Fix a warning in the bandwidth distribution code
   - Micro-optimize in_atomic_preempt_off() checks
   - Enforce that the tasklist_lock is held in for_each_thread()
   - Print the TGID in sched_show_task()
   - Remove the /proc/sys/kernel/sched_child_runs_first sysctl

  ... and misc cleanups & fixes"

* tag 'sched-core-2023-10-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (82 commits)
  sched/fair: Remove SIS_PROP
  sched/fair: Use candidate prev/recent_used CPU if scanning failed for cluster wakeup
  sched/fair: Scan cluster before scanning LLC in wake-up path
  sched: Add cpus_share_resources API
  sched/core: Fix RQCF_ACT_SKIP leak
  sched/fair: Remove unused 'curr' argument from pick_next_entity()
  sched/nohz: Update comments about NEWILB_KICK
  sched/fair: Remove duplicate #include
  sched/psi: Update poll => rtpoll in relevant comments
  sched: Make PELT acronym definition searchable
  sched: Fix stop_one_cpu_nowait() vs hotplug
  sched/psi: Bail out early from irq time accounting
  sched/topology: Rename 'DIE' domain to 'PKG'
  sched/psi: Delete the 'update_total' function parameter from update_triggers()
  sched/psi: Avoid updating PSI triggers and ->rtpoll_total when there are no state changes
  sched/headers: Remove comment referring to rq::cpu_load, since this has been removed
  sched/numa: Complete scanning of inactive VMAs when there is no alternative
  sched/numa: Complete scanning of partial VMAs regardless of PID activity
  sched/numa: Move up the access pid reset logic
  sched/numa: Trace decisions related to skipping VMAs
  ...
2023-10-30 13:12:15 -10:00
Linus Torvalds
3cf3fabccb Locking changes in this cycle are:
- Futex improvements:
 
     - Add the 'futex2' syscall ABI, which is an attempt to get away from the
       multiplex syscall and adds a little room for extentions, while lifting
       some limitations.
 
     - Fix futex PI recursive rt_mutex waiter state bug
 
     - Fix inter-process shared futexes on no-MMU systems
 
     - Use folios instead of pages
 
  - Micro-optimizations of locking primitives:
 
     - Improve arch_spin_value_unlocked() on asm-generic ticket spinlock
       architectures, to improve lockref code generation.
 
     - Improve the x86-32 lockref_get_not_zero() main loop by adding
       build-time CMPXCHG8B support detection for the relevant lockref code,
       and by better interfacing the CMPXCHG8B assembly code with the compiler.
 
     - Introduce arch_sync_try_cmpxchg() on x86 to improve sync_try_cmpxchg()
       code generation. Convert some sync_cmpxchg() users to sync_try_cmpxchg().
 
     - Micro-optimize rcuref_put_slowpath()
 
  - Locking debuggability improvements:
 
     - Improve CONFIG_DEBUG_RT_MUTEXES=y to have a fast-path as well
 
     - Enforce atomicity of sched_submit_work(), which is de-facto atomic but
       was un-enforced previously.
 
     - Extend <linux/cleanup.h>'s no_free_ptr() with __must_check semantics
 
     - Fix ww_mutex self-tests
 
     - Clean up const-propagation in <linux/seqlock.h> and simplify
       the API-instantiation macros a bit.
 
  - RT locking improvements:
 
     - Provide the rt_mutex_*_schedule() primitives/helpers and use them
       in the rtmutex code to avoid recursion vs. rtlock on the PI state.
 
     - Add nested blocking lockdep asserts to rt_mutex_lock(), rtlock_lock()
       and rwbase_read_lock().
 
  - Plus misc fixes & cleanups
 
 Signed-off-by: Ingo Molnar <mingo@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAmU877IRHG1pbmdvQGtl
 cm5lbC5vcmcACgkQEnMQ0APhK1g9jw/+N7rxQ78dmFCYh4UWnLCYvuKP0/ivHErG
 493JcB8MupuA2tfJHIkDdr4aM2mNq2E61w69/WlZAQWWD6pdOhwgF5Xf5eoEcJm0
 vsAhWBGLxihXdtevPuMAx0dEpg3AMp2wc6i5PkN831KdPUgCNsrKq9Bfnfef7/G8
 MQTSHjmtba6jxleyxfEa4tE2xe5PJX825nRfkX2e1cf+stkYua+uJFxVxUfxFWGE
 4pBy70D9OC7MsJ44WWOA1gwkVtMMiBTmRPNjlP8Gz2GQ0f3ERHRwYk3jDHOPHZI6
 0GNt7pE3IMXQn2UuDtfkvv9IFTd+U5qD+APnWIn2ntWXqzGLFqOlmovMrobVn7El
 olYDCyweWPG71m1Qblsb1VK2QjRPQVJ9NAEg8RlDHIu2ThxHbMysDVGPVOYnPFq4
 S8QFpmldzbNoPU4rDJyT1fAmoUIrusBHkl+Us3yGfC74iM+fHnDEvaSoMZbzEdY1
 x/Nocj9XgKEgfXdYzrCWFmZ9xXqHkO25/wDL6yKqBdQtvaEalXuHTT6mQcYxrUPm
 Xx1BPan2Jg7p4u2oOFcVtKewUtRH9KBx8qytr5S+JK4PJbrBsixMnr84HLd/3X2V
 ykYkO+367T5MTYv4TnJDE5vdurzUqekKSCFPY3skPujPJfdLj1vsPzYf9iMkCLdo
 hU2f/R+Wpdk=
 =36Ff
 -----END PGP SIGNATURE-----

Merge tag 'locking-core-2023-10-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull locking updates from Info Molnar:
 "Futex improvements:

   - Add the 'futex2' syscall ABI, which is an attempt to get away from
     the multiplex syscall and adds a little room for extentions, while
     lifting some limitations.

   - Fix futex PI recursive rt_mutex waiter state bug

   - Fix inter-process shared futexes on no-MMU systems

   - Use folios instead of pages

  Micro-optimizations of locking primitives:

   - Improve arch_spin_value_unlocked() on asm-generic ticket spinlock
     architectures, to improve lockref code generation

   - Improve the x86-32 lockref_get_not_zero() main loop by adding
     build-time CMPXCHG8B support detection for the relevant lockref
     code, and by better interfacing the CMPXCHG8B assembly code with
     the compiler

   - Introduce arch_sync_try_cmpxchg() on x86 to improve
     sync_try_cmpxchg() code generation. Convert some sync_cmpxchg()
     users to sync_try_cmpxchg().

   - Micro-optimize rcuref_put_slowpath()

  Locking debuggability improvements:

   - Improve CONFIG_DEBUG_RT_MUTEXES=y to have a fast-path as well

   - Enforce atomicity of sched_submit_work(), which is de-facto atomic
     but was un-enforced previously.

   - Extend <linux/cleanup.h>'s no_free_ptr() with __must_check
     semantics

   - Fix ww_mutex self-tests

   - Clean up const-propagation in <linux/seqlock.h> and simplify the
     API-instantiation macros a bit

  RT locking improvements:

   - Provide the rt_mutex_*_schedule() primitives/helpers and use them
     in the rtmutex code to avoid recursion vs. rtlock on the PI state.

   - Add nested blocking lockdep asserts to rt_mutex_lock(),
     rtlock_lock() and rwbase_read_lock()

  .. plus misc fixes & cleanups"

* tag 'locking-core-2023-10-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (39 commits)
  futex: Don't include process MM in futex key on no-MMU
  locking/seqlock: Fix grammar in comment
  alpha: Fix up new futex syscall numbers
  locking/seqlock: Propagate 'const' pointers within read-only methods, remove forced type casts
  locking/lockdep: Fix string sizing bug that triggers a format-truncation compiler-warning
  locking/seqlock: Change __seqprop() to return the function pointer
  locking/seqlock: Simplify SEQCOUNT_LOCKNAME()
  locking/atomics: Use atomic_try_cmpxchg_release() to micro-optimize rcuref_put_slowpath()
  locking/atomic, xen: Use sync_try_cmpxchg() instead of sync_cmpxchg()
  locking/atomic/x86: Introduce arch_sync_try_cmpxchg()
  locking/atomic: Add generic support for sync_try_cmpxchg() and its fallback
  locking/seqlock: Fix typo in comment
  futex/requeue: Remove unnecessary ‘NULL’ initialization from futex_proxy_trylock_atomic()
  locking/local, arch: Rewrite local_add_unless() as a static inline function
  locking/debug: Fix debugfs API return value checks to use IS_ERR()
  locking/ww_mutex/test: Make sure we bail out instead of livelock
  locking/ww_mutex/test: Fix potential workqueue corruption
  locking/ww_mutex/test: Use prng instead of rng to avoid hangs at bootup
  futex: Add sys_futex_requeue()
  futex: Add flags2 argument to futex_requeue()
  ...
2023-10-30 12:38:48 -10:00
Linus Torvalds
9e87705289 Initial bcachefs pull request for 6.7-rc1
Here's the bcachefs filesystem pull request.
 
 One new patch since last week: the exportfs constants ended up
 conflicting with other filesystems that are also getting added to the
 global enum, so switched to new constants picked by Amir.
 
 I'll also be sending another pull request later on in the cycle bringing
 things up to date my master branch that people are currently running;
 that will be restricted to fs/bcachefs/, naturally.
 
 Testing - fstests as well as the bcachefs specific tests in ktest:
   https://evilpiepirate.org/~testdashboard/ci?branch=bcachefs-for-upstream
 
 It's also been soaking in linux-next, which resulted in a whole bunch of
 smatch complaints and fixes and a patch or two from Kees.
 
 The only new non fs/bcachefs/ patch is the objtool patch that adds
 bcachefs functions to the list of noreturns. The patch that exports
 osq_lock() has been dropped for now, per Ingo.
 
 Prereq patch list:
 
 faf1dce852 objtool: Add bcachefs noreturns
 73badee428 lib/generic-radix-tree.c: Add peek_prev()
 9492261ff2 lib/generic-radix-tree.c: Don't overflow in peek()
 0fb5d567f5 MAINTAINERS: Add entry for generic-radix-tree
 b414e8ecd4 closures: Add a missing include
 48b7935722 closures: closure_nr_remaining()
 ced58fc7ab closures: closure_wait_event()
 bd0d22e41e MAINTAINERS: Add entry for closures
 8c8d2d9670 bcache: move closures to lib/
 957e48087d locking: export contention tracepoints for bcachefs six locks
 21db931445 lib: Export errname
 83feeb1955 lib/string_helpers: string_get_size() now returns characters wrote
 7d672f4094 stacktrace: Export stack_trace_save_tsk
 771eb4fe8b fs: factor out d_mark_tmpfile()
 2b69987be5 sched: Add task_struct->faults_disabled_mapping
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEKnAFLkS8Qha+jvQrE6szbY3KbnYFAmU/wyIACgkQE6szbY3K
 bnZc1xAAqjQBGXdtgtKQvk0/ru0WaMZguMsOHd3BUXIbm30F6eJqnoXQ/ahALofc
 Ju6NrOgcy9wmdPKWpbeF+aK3WnkAW9jShDd0QieVH6PkhyYyh5r11iR/EVtjjLu5
 6Teodn8fyTqn9WSDtKG15QreTCJrEasAoGFQKQDA8oiXC7zc+RSpLUkkTWD/pxyW
 zVqkGGiAUG4x6FON+X2a3QBa9WCahIgV6XzHstGLsmOECxKO/LopGR5jThuIhv9t
 Yo0wodQTKAgb9QviG6V3f2dJLQKKUVDmVEGTXv+8Hl3d8CiYBJeIh+icp+VESBo1
 m8ev0y2xbTPLwgm5v0Uj4o/G8ISZ+qmcexV2zQ9xUWUAd2AjEBzhCh9BrNXM5qSg
 o7mphH+Pt6bJXgzxb2RkYJixU11yG3yuHPOCrRGGFpVHiNYhdHuJeDZOqChWZB8x
 6kY0uvU0X0tqVfWKxMwTwuqG8mJ5BkJNvnEvYi05QEZG0dDcUhgOqYlNNaL8vGkl
 qVixOwE4aH4kscdmW2gXY1c76VSebheyN8n6Wj1zrmTw4hTJH7ZWXPtmbRqQzpB6
 U6w3NjVyopbIjuF+syWeGqitTT/8fpvgZU4E9MpKGmHX4ADgecp6YSZQzzxTJn7D
 cbVX7YQxhmsM50C1PW7A8yLCspD/uRNiKLvzb/g9gFSInk4rV+U=
 =g+ia
 -----END PGP SIGNATURE-----

Merge tag 'bcachefs-2023-10-30' of https://evilpiepirate.org/git/bcachefs

Pull initial bcachefs updates from Kent Overstreet:
 "Here's the bcachefs filesystem pull request.

  One new patch since last week: the exportfs constants ended up
  conflicting with other filesystems that are also getting added to the
  global enum, so switched to new constants picked by Amir.

  The only new non fs/bcachefs/ patch is the objtool patch that adds
  bcachefs functions to the list of noreturns. The patch that exports
  osq_lock() has been dropped for now, per Ingo"

* tag 'bcachefs-2023-10-30' of https://evilpiepirate.org/git/bcachefs: (2781 commits)
  exportfs: Change bcachefs fid_type enum to avoid conflicts
  bcachefs: Refactor memcpy into direct assignment
  bcachefs: Fix drop_alloc_keys()
  bcachefs: snapshot_create_lock
  bcachefs: Fix snapshot skiplists during snapshot deletion
  bcachefs: bch2_sb_field_get() refactoring
  bcachefs: KEY_TYPE_error now counts towards i_sectors
  bcachefs: Fix handling of unknown bkey types
  bcachefs: Switch to unsafe_memcpy() in a few places
  bcachefs: Use struct_size()
  bcachefs: Correctly initialize new buckets on device resize
  bcachefs: Fix another smatch complaint
  bcachefs: Use strsep() in split_devs()
  bcachefs: Add iops fields to bch_member
  bcachefs: Rename bch_sb_field_members -> bch_sb_field_members_v1
  bcachefs: New superblock section members_v2
  bcachefs: Add new helper to retrieve bch_member from sb
  bcachefs: bucket_lock() is now a sleepable lock
  bcachefs: fix crc32c checksum merge byte order problem
  bcachefs: Fix bch2_inode_delete_keys()
  ...
2023-10-30 11:09:38 -10:00
Linus Torvalds
8b16da681e NFSD 6.7 Release Notes
This release completes the SunRPC thread scheduler work that was
 begun in v6.6. The scheduler can now find an svc thread to wake in
 constant time and without a list walk. Thanks again to Neil Brown
 for this overhaul.
 
 Lorenzo Bianconi contributed infrastructure for a netlink-based
 NFSD control plane. The long-term plan is to provide the same
 functionality as found in /proc/fs/nfsd, plus some interesting
 additions, and then migrate the NFSD user space utilities to
 netlink.
 
 A long series to overhaul NFSD's NFSv4 operation encoding was
 applied in this release. The goals are to bring this family of
 encoding functions in line with the matching NFSv4 decoding
 functions and with the NFSv2 and NFSv3 XDR functions, preparing
 the way for better memory safety and maintainability.
 
 A further improvement to NFSD's write delegation support was
 contributed by Dai Ngo. This adds a CB_GETATTR callback,
 enabling the server to retrieve cached size and mtime data from
 clients holding write delegations. If the server can retrieve
 this information, it does not have to recall the delegation in
 some cases.
 
 The usual panoply of bug fixes and minor improvements round out
 this release. As always I am grateful to all contributors,
 reviewers, and testers.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEKLLlsBKG3yQ88j7+M2qzM29mf5cFAmU5IuoACgkQM2qzM29m
 f5eVsg//bVp8S93ci/oDlKfzOwH2fO5e5rna91wrDpJxkd51h6KTx55dSRG5sjAZ
 EywIVOann6xCtsixAPyff5Cweg2dWvzQRsy1ZnvWQ1qZBzD5KAJY5LPkeSFUCKBo
 Zani/qTOYbxzgFMjZx+yDSXDPKG68WYZBQK59SI7mURu4SYdk8aRyNY8mjHfr0Vh
 Aqrcny4oVtXV4sL5P5G/2FUW7WKT3olA3jSYlRRNMhbs2qpEemRCCrspOEMMad+b
 t1+ZCg+U27PMranvOJnof4RU7peZbaxDWA0gyiUbivVXVtZn9uOs0ffhktkvechL
 ePc33dqdp2ITdKIPA6JlaRv5WflKXQw0YYM9Kv5mcR4A2el7owL4f/pMlPhtbYwJ
 IOJv15KdKVN979G2e6WMYiKK+iHfaUUguhMEXnfnGoAajHOZNQiUEo3iFQAD7LDc
 DvMF8d9QqYmB9IW8FOYaRRfZGJOQHf3TL79Nd08z/bn5swvlvfj77leux9Sb+0/m
 Luk2Xvz2AJVSXE31wzabaGHkizN+BtH+e4MMbXUHBPW5jE9v7XOnEUFr4UdZyr9P
 Gl87A7NcrzNjJWT5TrnzM4sOslNsx46Aeg+VuNt2fSRn2dm6iBu2B8s0N4imx6dV
 PX1y9VSLq5WRhjrFZ1qeiZdsuTaQtrEiNDoRIQR6nCJPAV80iFk=
 =B4wJ
 -----END PGP SIGNATURE-----

Merge tag 'nfsd-6.7' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux

Pull nfsd updates from Chuck Lever:
 "This release completes the SunRPC thread scheduler work that was begun
  in v6.6. The scheduler can now find an svc thread to wake in constant
  time and without a list walk. Thanks again to Neil Brown for this
  overhaul.

  Lorenzo Bianconi contributed infrastructure for a netlink-based NFSD
  control plane. The long-term plan is to provide the same functionality
  as found in /proc/fs/nfsd, plus some interesting additions, and then
  migrate the NFSD user space utilities to netlink.

  A long series to overhaul NFSD's NFSv4 operation encoding was applied
  in this release. The goals are to bring this family of encoding
  functions in line with the matching NFSv4 decoding functions and with
  the NFSv2 and NFSv3 XDR functions, preparing the way for better memory
  safety and maintainability.

  A further improvement to NFSD's write delegation support was
  contributed by Dai Ngo. This adds a CB_GETATTR callback, enabling the
  server to retrieve cached size and mtime data from clients holding
  write delegations. If the server can retrieve this information, it
  does not have to recall the delegation in some cases.

  The usual panoply of bug fixes and minor improvements round out this
  release. As always I am grateful to all contributors, reviewers, and
  testers"

* tag 'nfsd-6.7' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux: (127 commits)
  svcrdma: Fix tracepoint printk format
  svcrdma: Drop connection after an RDMA Read error
  NFSD: clean up alloc_init_deleg()
  NFSD: Fix frame size warning in svc_export_parse()
  NFSD: Rewrite synopsis of nfsd_percpu_counters_init()
  nfsd: Clean up errors in nfs3proc.c
  nfsd: Clean up errors in nfs4state.c
  NFSD: Clean up errors in stats.c
  NFSD: simplify error paths in nfsd_svc()
  NFSD: Clean up nfsd4_encode_seek()
  NFSD: Clean up nfsd4_encode_offset_status()
  NFSD: Clean up nfsd4_encode_copy_notify()
  NFSD: Clean up nfsd4_encode_copy()
  NFSD: Clean up nfsd4_encode_test_stateid()
  NFSD: Clean up nfsd4_encode_exchange_id()
  NFSD: Clean up nfsd4_do_encode_secinfo()
  NFSD: Clean up nfsd4_encode_access()
  NFSD: Clean up nfsd4_encode_readdir()
  NFSD: Clean up nfsd4_encode_entry4()
  NFSD: Add an nfsd4_encode_nfs_cookie4() helper
  ...
2023-10-30 10:12:29 -10:00
Linus Torvalds
df9c65b5fc vfs-6.7.iov_iter
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCZTppQwAKCRCRxhvAZXjc
 om2kAP4u+eLsrhJHfUPUttGUEkSkZE+5/s/f1A/1GcV51usLSgEAu8urxAnP49GW
 INaDABXaFfKx8/KI/H2YFZPKGwlNEwY=
 =PDDE
 -----END PGP SIGNATURE-----

Merge tag 'vfs-6.7.iov_iter' of gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs

Pull iov_iter updates from Christian Brauner:
 "This contain's David's iov_iter cleanup work to convert the iov_iter
  iteration macros to inline functions:

   - Remove last_offset from iov_iter as it was only used by ITER_PIPE

   - Add a __user tag on copy_mc_to_user()'s dst argument on x86 to
     match that on powerpc and get rid of a sparse warning

   - Convert iter->user_backed to user_backed_iter() in the sound PCM
     driver

   - Convert iter->user_backed to user_backed_iter() in a couple of
     infiniband drivers

   - Renumber the type enum so that the ITER_* constants match the order
     in iterate_and_advance*()

   - Since the preceding patch puts UBUF and IOVEC at 0 and 1, change
     user_backed_iter() to just use the type value and get rid of the
     extra flag

   - Convert the iov_iter iteration macros to always-inline functions to
     make the code easier to follow. It uses function pointers, but they
     get optimised away

   - Move the check for ->copy_mc to _copy_from_iter() and
     copy_page_from_iter_atomic() rather than in memcpy_from_iter_mc()
     where it gets repeated for every segment. Instead, we check once
     and invoke a side function that can use iterate_bvec() rather than
     iterate_and_advance() and supply a different step function

   - Move the copy-and-csum code to net/ where it can be in proximity
     with the code that uses it

   - Fold memcpy_and_csum() in to its two users

   - Move csum_and_copy_from_iter_full() out of line and merge in
     csum_and_copy_from_iter() since the former is the only caller of
     the latter

   - Move hash_and_copy_to_iter() to net/ where it can be with its only
     caller"

* tag 'vfs-6.7.iov_iter' of gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs:
  iov_iter, net: Move hash_and_copy_to_iter() to net/
  iov_iter, net: Merge csum_and_copy_from_iter{,_full}() together
  iov_iter, net: Fold in csum_and_memcpy()
  iov_iter, net: Move csum_and_copy_to/from_iter() to net/
  iov_iter: Don't deal with iter->copy_mc in memcpy_from_iter_mc()
  iov_iter: Convert iterate*() to inline funcs
  iov_iter: Derive user-backedness from the iterator type
  iov_iter: Renumber ITER_* constants
  infiniband: Use user_backed_iter() to see if iterator is UBUF/IOVEC
  sound: Fix snd_pcm_readv()/writev() to use iov access functions
  iov_iter, x86: Be consistent about the __user tag on copy_mc_to_user()
  iov_iter: Remove last_offset from iov_iter as it was for ITER_PIPE
2023-10-30 09:24:21 -10:00
Kees Cook
dcc4e5728e seq_buf: Introduce DECLARE_SEQ_BUF and seq_buf_str()
Solve two ergonomic issues with struct seq_buf;

1) Too much boilerplate is required to initialize:

	struct seq_buf s;
	char buf[32];

	seq_buf_init(s, buf, sizeof(buf));

Instead, we can build this directly on the stack. Provide
DECLARE_SEQ_BUF() macro to do this:

	DECLARE_SEQ_BUF(s, 32);

2) %NUL termination is fragile and requires 2 steps to get a valid
   C String (and is a layering violation exposing the "internals" of
   seq_buf):

	seq_buf_terminate(s);
	do_something(s->buffer);

Instead, we can just return s->buffer directly after terminating it in
the refactored seq_buf_terminate(), now known as seq_buf_str():

	do_something(seq_buf_str(s));

Link: https://lore.kernel.org/linux-trace-kernel/20231027155634.make.260-kees@kernel.org
Link: https://lore.kernel.org/linux-trace-kernel/20231026194033.it.702-kees@kernel.org/

Cc: Yosry Ahmed <yosryahmed@google.com>
Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Justin Stitt <justinstitt@google.com>
Cc: Kent Overstreet <kent.overstreet@linux.dev>
Cc: Petr Mladek <pmladek@suse.com>
Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Cc: Sergey Senozhatsky <senozhatsky@chromium.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Yun Zhou <yun.zhou@windriver.com>
Cc: Jacob Keller <jacob.e.keller@intel.com>
Cc: Zhen Lei <thunder.leizhen@huawei.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2023-10-28 16:52:43 -04:00
Dave Jiang
a103f46633 acpi: Move common tables helper functions to common lib
Some of the routines in ACPI driver/acpi/tables.c can be shared with
parsing CDAT. CDAT is a device-provided data structure that is formatted
similar to a platform provided ACPI table. CDAT is used by CXL and can
exist on platforms that do not use ACPI. Split out the common routine
from ACPI to accommodate platforms that do not support ACPI and move that
to /lib. The common routines can be built outside of ACPI if
FIRMWARE_TABLES is selected.

Link: https://lore.kernel.org/linux-cxl/CAJZ5v0jipbtTNnsA0-o5ozOk8ZgWnOg34m34a9pPenTyRLj=6A@mail.gmail.com/
Suggested-by: "Rafael J. Wysocki" <rafael@kernel.org>
Reviewed-by: Hanjun Guo <guohanjun@huawei.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Acked-by: "Rafael J. Wysocki" <rafael.j.wysocki@intel.com>
Link: https://lore.kernel.org/r/169713683430.2205276.17899451119920103445.stgit@djiang5-mobl3
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2023-10-27 20:48:03 -07:00
Jakub Kicinski
ec4c20ca09 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Cross-merge networking fixes after downstream PR.

Conflicts:

net/mac80211/rx.c
  91535613b6 ("wifi: mac80211: don't drop all unprotected public action frames")
  6c02fab724 ("wifi: mac80211: split ieee80211_drop_unencrypted_mgmt() return value")

Adjacent changes:

drivers/net/ethernet/apm/xgene/xgene_enet_main.c
  61471264c0 ("net: ethernet: apm: Convert to platform remove callback returning void")
  d2ca43f306 ("net: xgene: Fix unused xgene_enet_of_match warning for !CONFIG_OF")

net/vmw_vsock/virtio_transport.c
  64c99d2d6a ("vsock/virtio: support to send non-linear skb")
  53b08c4985 ("vsock/virtio: initialize the_virtio_vsock before using VQs")

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-26 13:46:28 -07:00
wuqiang.matt
4758560fa2 kprobes: unused header files removed
As kernel test robot reported, lib/test_objpool.c (trace:probes/for-next)
has linux/version.h included, but version.h is not used at all. Then more
unused headers are found in test_objpool.c and rethook.c, and all of them
should be removed.

Link: https://lore.kernel.org/all/20231023112245.6112-1-wuqiang.matt@bytedance.com/

Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202310191512.vvypKU5Z-lkp@intel.com/
Signed-off-by: wuqiang.matt <wuqiang.matt@bytedance.com>
Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
2023-10-24 10:04:59 +09:00
Matthew Wilcox (Oracle)
d0ed46b603 tracing: Move readpos from seq_buf to trace_seq
To make seq_buf more lightweight as a string buf, move the readpos member
from seq_buf to its container, trace_seq.  That puts the responsibility
of maintaining the readpos entirely in the tracing code.  If some future
users want to package up the readpos with a seq_buf, we can define a
new struct then.

Link: https://lore.kernel.org/linux-trace-kernel/20231020033545.2587554-2-willy@infradead.org

Cc: Kees Cook <keescook@chromium.org>
Cc: Justin Stitt <justinstitt@google.com>
Cc: Kent Overstreet <kent.overstreet@linux.dev>
Cc: Petr Mladek <pmladek@suse.com>
Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Cc: Sergey Senozhatsky <senozhatsky@chromium.org>
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2023-10-20 12:16:10 -04:00
Jakub Kicinski
374d345d9b netlink: add variable-length / auto integers
We currently push everyone to use padding to align 64b values
in netlink. Un-padded nla_put_u64() doesn't even exist any more.

The story behind this possibly start with this thread:
https://lore.kernel.org/netdev/20121204.130914.1457976839967676240.davem@davemloft.net/
where DaveM was concerned about the alignment of a structure
containing 64b stats. If user space tries to access such struct
directly:

	struct some_stats *stats = nla_data(attr);
	printf("A: %llu", stats->a);

lack of alignment may become problematic for some architectures.
These days we most often put every single member in a separate
attribute, meaning that the code above would use a helper like
nla_get_u64(), which can deal with alignment internally.
Even for arches which don't have good unaligned access - access
aligned to 4B should be pretty efficient.
Kernel and well known libraries deal with unaligned input already.

Padded 64b is quite space-inefficient (64b + pad means at worst 16B
per attr vs 32b which takes 8B). It is also more typing:

    if (nla_put_u64_pad(rsp, NETDEV_A_SOMETHING_SOMETHING,
                        value, NETDEV_A_SOMETHING_PAD))

Create a new attribute type which will use 32 bits at netlink
level if value is small enough (probably most of the time?),
and (4B-aligned) 64 bits otherwise. Kernel API is just:

    if (nla_put_uint(rsp, NETDEV_A_SOMETHING_SOMETHING, value))

Calling this new type "just" sint / uint with no specific size
will hopefully also make people more comfortable with using it.
Currently telling people "don't use u8, you may need the bits,
and netlink will round up to 4B, anyway" is the #1 comment
we give to newcomers.

In terms of netlink layout it looks like this:

         0       4       8       12      16
32b:     [nlattr][ u32  ]
64b:     [  pad ][nlattr][     u64      ]
uint(32) [nlattr][ u32  ]
uint(64) [nlattr][     u64      ]

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Acked-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-20 11:43:35 +01:00
Kent Overstreet
73badee428 lib/generic-radix-tree.c: Add peek_prev()
This patch adds genradix_peek_prev(), genradix_iter_rewind(), and
genradix_for_each_reverse(), for iterating backwards over a generic
radix tree.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-19 14:47:33 -04:00
Kent Overstreet
9492261ff2 lib/generic-radix-tree.c: Don't overflow in peek()
When we started spreading new inode numbers throughout most of the 64
bit inode space, that triggered some corner case bugs, in particular
some integer overflows related to the radix tree code. Oops.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-19 14:47:33 -04:00
Kent Overstreet
b414e8ecd4 closures: Add a missing include
Fixes building in userspace.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-19 14:47:33 -04:00
Kent Overstreet
8c8d2d9670 bcache: move closures to lib/
Prep work for bcachefs - being a fork of bcache it also uses closures

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Acked-by: Coly Li <colyli@suse.de>
Reviewed-by: Randy Dunlap <rdunlap@infradead.org>
2023-10-19 14:47:33 -04:00
Alexey Dobriyan
68279f9c9f treewide: mark stuff as __ro_after_init
__read_mostly predates __ro_after_init. Many variables which are marked
__read_mostly should have been __ro_after_init from day 1.

Also, mark some stuff as "const" and "__init" while I'm at it.

[akpm@linux-foundation.org: revert sysctl_nr_open_min, sysctl_nr_open_max changes due to arm warning]
[akpm@linux-foundation.org: coding-style cleanups]
Link: https://lkml.kernel.org/r/4f6bb9c0-abba-4ee4-a7aa-89265e886817@p183
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-10-18 14:43:23 -07:00
Hugh Dickins
1431996bf9 percpu_counter: extend _limited_add() to negative amounts
Though tmpfs does not need it, percpu_counter_limited_add() can be twice
as useful if it works sensibly with negative amounts (subs) - typically
decrements towards a limit of 0 or nearby: as suggested by Dave Chinner.

And in the course of that reworking, skip the percpu counter sum if it is
already obvious that the limit would be passed: as suggested by Tim Chen.

Extend the comment above __percpu_counter_limited_add(), defining the
behaviour with positive and negative amounts, allowing negative limits,
but not bothering about overflow beyond S64_MAX.

Link: https://lkml.kernel.org/r/8f86083b-c452-95d4-365b-f16a2e4ebcd4@google.com
Signed-off-by: Hugh Dickins <hughd@google.com>
Cc: Axel Rasmussen <axelrasmussen@google.com>
Cc: Carlos Maiolino <cem@kernel.org>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Chuck Lever <chuck.lever@oracle.com>
Cc: Darrick J. Wong <djwong@kernel.org>
Cc: Dave Chinner <dchinner@redhat.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Tim Chen <tim.c.chen@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-10-18 14:34:14 -07:00
Hugh Dickins
beb9868628 shmem,percpu_counter: add _limited_add(fbc, limit, amount)
Percpu counter's compare and add are separate functions: without locking
around them (which would defeat their purpose), it has been possible to
overflow the intended limit.  Imagine all the other CPUs fallocating tmpfs
huge pages to the limit, in between this CPU's compare and its add.

I have not seen reports of that happening; but tmpfs's recent addition of
dquot_alloc_block_nodirty() in between the compare and the add makes it
even more likely, and I'd be uncomfortable to leave it unfixed.

Introduce percpu_counter_limited_add(fbc, limit, amount) to prevent it.

I believe this implementation is correct, and slightly more efficient than
the combination of compare and add (taking the lock once rather than twice
when nearing full - the last 128MiB of a tmpfs volume on a machine with
128 CPUs and 4KiB pages); but it does beg for a better design - when
nearing full, there is no new batching, but the costly percpu counter sum
across CPUs still has to be done, while locked.

Follow __percpu_counter_sum()'s example, including cpu_dying_mask as well
as cpu_online_mask: but shouldn't __percpu_counter_compare() and
__percpu_counter_limited_add() then be adding a num_dying_cpus() to
num_online_cpus(), when they calculate the maximum which could be held
across CPUs?  But the times when it matters would be vanishingly rare.

Link: https://lkml.kernel.org/r/bb817848-2d19-bcc8-39ca-ea179af0f0b4@google.com
Signed-off-by: Hugh Dickins <hughd@google.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Cc: Tim Chen <tim.c.chen@intel.com>
Cc: Dave Chinner <dchinner@redhat.com>
Cc: Darrick J. Wong <djwong@kernel.org>
Cc: Axel Rasmussen <axelrasmussen@google.com>
Cc: Carlos Maiolino <cem@kernel.org>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Chuck Lever <chuck.lever@oracle.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-10-18 14:34:14 -07:00
Liam R. Howlett
099d7439ce maple_tree: add GFP_KERNEL to allocations in mas_expected_entries()
Users complained about OOM errors during fork without triggering
compaction.  This can be fixed by modifying the flags used in
mas_expected_entries() so that the compaction will be triggered in low
memory situations.  Since mas_expected_entries() is only used during fork,
the extra argument does not need to be passed through.

Additionally, the two test_maple_tree test cases and one benchmark test
were altered to use the correct locking type so that allocations would not
trigger sleeping and thus fail.  Testing was completed with lockdep atomic
sleep detection.

The additional locking change requires rwsem support additions to the
tools/ directory through the use of pthreads pthread_rwlock_t.  With this
change test_maple_tree works in userspace, as a module, and in-kernel.

Users may notice that the system gave up early on attempting to start new
processes instead of attempting to reclaim memory.

Link: https://lkml.kernel.org/r/20230915093243epcms1p46fa00bbac1ab7b7dca94acb66c44c456@epcms1p4
Link: https://lkml.kernel.org/r/20231012155233.2272446-1-Liam.Howlett@oracle.com
Fixes: 54a611b605 ("Maple Tree: add new data structure")
Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Reviewed-by: Peng Zhang <zhangpeng.00@bytedance.com>
Cc: <jason.sim@samsung.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-10-18 12:12:41 -07:00
Arnd Bergmann
57e06f8c1f Qualcomm driver updates for v6.7
This introduces partial support for the Qualcomm Secure Execution
 Environment SCM interface, and uses this to implement EFI variable
 access on the Windows On Snapdragon devices (for now).
 
 The 32/64-bit calling convention detector of the SCM interface is
 updated to not choose 64-bit convention when Linux is 32-bit. The
 "extern" specifier is dropped from the interface include file.
 
 The LLCC driver gains support for carrying configuration for multiple
 different system/DDR configurations for a given platform, and selecting
 between them. Support for Q[DR]U1000 is added to the driver.
 
 All exported symbols are transitioned to EXPORT_SYMBOL_GPL().
 
 The platform_drivers in the Qualcomm SoC are transitioned to the
 void-returning remove_new implementation.
 
 The rmtfs memory driver gains support for leaving guard pages around the
 used area, to avoid issues if the allocation happens to be placed
 adjacent to another protected memory region.
 
 The socinfo driver gains knowledge about IPQ8174, QCM6490, SM7150P and
 various PMICs used together with SM8550.
 -----BEGIN PGP SIGNATURE-----
 
 iQJJBAABCAAzFiEEBd4DzF816k8JZtUlCx85Pw2ZrcUFAmUsTdwVHGFuZGVyc3Nv
 bkBrZXJuZWwub3JnAAoJEAsfOT8Nma3FffsQAMcY5NKBZqbQu5wNr2zPnKW5m/30
 lYi4bW42mlY6Mo3h97LLxjGM9bqsmPasfAxv/84viN0JYxrSVye/zNtD35qTQBtu
 t8O8vdOaOYXIc8dwqC16Fl0kI5pm9tl7p5SJmGAUZBvIGUIGngy3gZOYM+HzyMQr
 UXFO6dO0tQvjowd1t0xE51UNm4J79Vm8HjtuR1pc4i+fZhsy5XhZYq9e4531Yc7o
 lNacBugjXhurw/rz0odNuRzKn+He+7KQR/hSNv8B2MbH0ZUP+k4VUaRSD9TZyBgC
 cOahbCfVLkGwepPFGuaaJUUgo8/aa4HiAbdyOrNL2ozYkoLv/PbRrLF0R9KgXK5v
 LPXpqXt97+rWPnq2iPrfZlo4LAj6JVSCUCxIRZFcWyDFv+wHsYuRbO64s1+MC7+M
 CPpvup31PLGCeR9Joo/WTt9iAUPJ4BY9v4mkjIdmJF371WRryP4um5Ln6xMC+fIe
 U/Ss8mPNpSM+guDo8LbdCGAcxWTDk8TOXxa/888y/4Q5Mg97nn0hY2KK4JBmJ7I3
 Thd0n9vXjM3oOwPGLC7Y2zj9ZQI5k7JKn331yPS4Hb20SKGEztlaJp4B1DdA5/ul
 4BUoh3HWV7dB4l+felrAmPAzAUXvyfGrDBJhXe2dabRJjMhouQ6Wudn3OlC4VP/l
 fjszjVvPFt7CE30C
 =B1Kn
 -----END PGP SIGNATURE-----
gpgsig -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEiK/NIGsWEZVxh/FrYKtH/8kJUicFAmUv9yoACgkQYKtH/8kJ
 UifTOQ/+NUj5Y/sAZfMLY3Q3e8s2nsKZLHHUCH7O/DOR6ILc5DZcTI5a23kfIA9V
 VVluTEnRUzbSHCb3WPW3Yfgy5HpZoaLgXWXGgCsRVKRIODN745TOm2Hn/dFZF0Z7
 Fso0oPplGKrvuxSlaxg+1fbJ7c5r8kE/TDu+ok0eiZGvzwfnVas4bB+27thdH/le
 tY/wS9BhHmBrs6k7gqCfEFWF7M6kS29FOPILIlAfxgUevgSIvAUMBH/KUAGw1Qo/
 UNd372TwXkdw0pG2KOx8xR4iKFBfWX4PbHBtBbUZzAnqerDnNrh7p0qTlhfKXOpb
 jQWx3eqQIwFbbWZ39DxiLE8/h5ig3jF29u1s/vAG2Zb5o5cYwiRM7OjWW8Q8Ha1E
 ha2NSWMEaNniegnMeoy1VTN8uSYY+dUseKmLmP5vCobmYVKuX96iQx5pNBYhsrwN
 YOlkOkF1fXINFqSD5v2N4s4AdGx634GgP7azkjm6fGS+hlwKgWNzwH4HHgXNx/u/
 IPRPw7U/hji2R++oNPtEOlNHoHWM6Zx0ema38qGTC69uAFHETxbFUmOBaX6q/7Ot
 AwpP25YozehwDoUVZRwR6CpHcVbKDl4oJhWaruflkxC+6YFTGYS2hPFiF6zbnQw9
 u36QDkgNzhBG2LINlmqCAAoGWmC+mqwsqYN/DyKsUUSVEODyUng=
 =7XXT
 -----END PGP SIGNATURE-----

Merge tag 'qcom-drivers-for-6.7' of https://git.kernel.org/pub/scm/linux/kernel/git/qcom/linux into soc/drivers

Qualcomm driver updates for v6.7

This introduces partial support for the Qualcomm Secure Execution
Environment SCM interface, and uses this to implement EFI variable
access on the Windows On Snapdragon devices (for now).

The 32/64-bit calling convention detector of the SCM interface is
updated to not choose 64-bit convention when Linux is 32-bit. The
"extern" specifier is dropped from the interface include file.

The LLCC driver gains support for carrying configuration for multiple
different system/DDR configurations for a given platform, and selecting
between them. Support for Q[DR]U1000 is added to the driver.

All exported symbols are transitioned to EXPORT_SYMBOL_GPL().

The platform_drivers in the Qualcomm SoC are transitioned to the
void-returning remove_new implementation.

The rmtfs memory driver gains support for leaving guard pages around the
used area, to avoid issues if the allocation happens to be placed
adjacent to another protected memory region.

The socinfo driver gains knowledge about IPQ8174, QCM6490, SM7150P and
various PMICs used together with SM8550.

* tag 'qcom-drivers-for-6.7' of https://git.kernel.org/pub/scm/linux/kernel/git/qcom/linux: (44 commits)
  soc: qcom: socinfo: Convert to platform remove callback returning void
  soc: qcom: smsm: Convert to platform remove callback returning void
  soc: qcom: smp2p: Convert to platform remove callback returning void
  soc: qcom: smem: Convert to platform remove callback returning void
  soc: qcom: rmtfs_mem: Convert to platform remove callback returning void
  soc: qcom: qcom_stats: Convert to platform remove callback returning void
  soc: qcom: qcom_gsbi: Convert to platform remove callback returning void
  soc: qcom: qcom_aoss: Convert to platform remove callback returning void
  soc: qcom: pmic_glink: Convert to platform remove callback returning void
  soc: qcom: ocmem: Convert to platform remove callback returning void
  soc: qcom: llcc-qcom: Convert to platform remove callback returning void
  soc: qcom: icc-bwmon: Convert to platform remove callback returning void
  firmware: qcom_scm: use 64-bit calling convention only when client is 64-bit
  soc: qcom: llcc: Handle a second device without data corruption
  soc: qcom: Switch to EXPORT_SYMBOL_GPL()
  soc: qcom: smem: Annotate struct qcom_smem with __counted_by
  soc: qcom: rmtfs: Support discarding guard pages
  dt-bindings: reserved-memory: rmtfs: Allow guard pages
  dt-bindings: firmware: qcom,scm: document IPQ5018 compatible
  firmware: qcom_scm: disable SDI if required
  ...

Link: https://lore.kernel.org/r/20231015204014.855672-1-andersson@kernel.org
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2023-10-18 17:18:02 +02:00
wuqiang.matt
92f90d3b0d lib: objpool test module added
The test_objpool module (test_objpool) will run several testcases
for objpool stress and performance evaluation. Each testcase will
have all available cpu cores involved to create a situation of high
parallel and high contention.

As of now there are 5 groups and 5 * 2 testcases in total:

1) group 1: synchronous mode
   objpool is managed synchronously, that is, all objects are to be
   reclaimed before objpool finalization and the objpool owner makes
   sure of it. All threads on different cores run in the same pace
2) group 2: synchronous mode + hrtimer
   this case have 2 customers: normal threads and hrtimer softirqs
3) group 3: synchronous + overrun mode
   This test group is mainly for performance evaluation of missing
   cases when pre-allocated objects are less than the requested
4) group 4: asynchronous mode
   This case is just an emulation of kretprobe, with refcount used
   to control the objpool lifecycle
5) group 5: asynchronous mode with hrtimer
   hrtimer softirq is introduced to stress async objpool operations

Link: https://lore.kernel.org/all/20231017135654.82270-3-wuqiang.matt@bytedance.com/

Signed-off-by: wuqiang.matt <wuqiang.matt@bytedance.com>
Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
2023-10-18 22:36:03 +09:00
wuqiang.matt
b4edb8d2d4 lib: objpool added: ring-array based lockless MPMC
objpool is a scalable implementation of high performance queue for
object allocation and reclamation, such as kretprobe instances.

With leveraging percpu ring-array to mitigate hot spots of memory
contention, it delivers near-linear scalability for high parallel
scenarios. The objpool is best suited for the following cases:
1) Memory allocation or reclamation are prohibited or too expensive
2) Consumers are of different priorities, such as irqs and threads

Limitations:
1) Maximum objects (capacity) is fixed after objpool creation
2) All pre-allocated objects are managed in percpu ring array,
   which consumes more memory than linked lists

Link: https://lore.kernel.org/all/20231017135654.82270-2-wuqiang.matt@bytedance.com/

Signed-off-by: wuqiang.matt <wuqiang.matt@bytedance.com>
Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
2023-10-18 22:35:36 +09:00
Yury Norov
6cb42f91aa bitmap: move bitmap_*_region() functions to bitmap.h
Now that bitmap_*_region() functions are implemented as thin wrappers
around others, it's worth to move them to the header, as it opens room
for compile-time optimizations.

CC: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
CC: Rasmus Villemoes <linux@rasmusvillemoes.dk>
CC: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Yury Norov <yury.norov@gmail.com>
2023-10-16 16:14:45 -07:00
NeilBrown
de9e82c355 lib: add light-weight queuing mechanism.
lwq is a FIFO single-linked queue that only requires a spinlock
for dequeueing, which happens in process context.  Enqueueing is atomic
with no spinlock and can happen in any context.

This is particularly useful when work items are queued from BH or IRQ
context, and when they are handled one at a time by dedicated threads.

Avoiding any locking when enqueueing means there is no need to disable
BH or interrupts, which is generally best avoided (particularly when
there are any RT tasks on the machine).

This solution is superior to using "list_head" links because we need
half as many pointers in the data structures, and because list_head
lists would need locking to add items to the queue.

This solution is superior to a bespoke solution as all locking and
container_of casting is integrated, so the interface is simple.

Despite the similar name, this solution meets a distinctly different
need to kfifo.  kfifo provides a fixed sized circular buffer to which
data can be added at one end and removed at the other, and does not
provide any locking.  lwq does not have any size limit and works with
data structures (objects?) rather than data (bytes).

A unit test for basic functionality, which runs at boot time, is included.

Signed-off-by: NeilBrown <neilb@suse.de>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: "Liam R. Howlett" <Liam.Howlett@oracle.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: David Gow <davidgow@google.com>
Cc: linux-kernel@vger.kernel.org
Message-Id: <20230911111333.4d1a872330e924a00acb905b@linux-foundation.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2023-10-16 12:44:06 -04:00
NeilBrown
8a3e5975ed llist: add llist_del_first_this()
llist_del_first_this() deletes a specific entry from an llist, providing
it is at the head of the list.  Multiple threads can call this
concurrently providing they each offer a different entry.

This can be uses for a set of worker threads which are on the llist when
they are idle.  The head can always be woken, and when it is woken it
can remove itself, and possibly wake the next if there is an excess of
work to do.

Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2023-10-16 12:44:06 -04:00
Yury Norov
1d4836527d bitmap: drop _reg_op() function
Now that all _reg_op() users are switched to alternative functions,
_reg_op() machinery is not needed anymore.

CC: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
CC: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Signed-off-by: Yury Norov <yury.norov@gmail.com>
2023-10-14 20:25:23 -07:00
Yury Norov
9276819a68 bitmap: replace _reg_op(REG_OP_ISFREE) with find_next_bit()
_reg_op(REG_OP_ISFREE) can be trivially replaced with find_next_bit().
Doing that opens room for potential small_const_nbits() optimization.

CC: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
CC: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Signed-off-by: Yury Norov <yury.norov@gmail.com>
2023-10-14 20:25:22 -07:00
Yury Norov
add00c76ee bitmap: replace _reg_op(REG_OP_RELEASE) with bitmap_clear()
_reg_op(REG_OP_RELEASE) duplicates bitmap_clear().

CC: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
CC: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Signed-off-by: Yury Norov <yury.norov@gmail.com>
2023-10-14 20:25:22 -07:00
Yury Norov
eae5acbd75 bitmap: replace _reg_op(REG_OP_ALLOC) with bitmap_set()
_reg_op(REG_OP_ALLOC) duplicates bitmap_set().

CC: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
CC: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Signed-off-by: Yury Norov <yury.norov@gmail.com>
2023-10-14 20:25:22 -07:00
Yury Norov
b085f969ed bitmap: fix opencoded bitmap_allocate_region()
bitmap_find_region() opencodes bitmap_allocate_region().

CC: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
CC: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Signed-off-by: Yury Norov <yury.norov@gmail.com>
2023-10-14 20:25:22 -07:00
Yury Norov
6d5d3a0c33 bitmap: add test for bitmap_*_region() functions
Test basic functionality of bitmap_{allocate,release,find_free}_region()
functions.

CC: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Yury Norov <yury.norov@gmail.com>
2023-10-14 20:25:22 -07:00
Yury Norov
82bf9bdfbc bitmap: align __reg_op() wrappers with modern coding style
Fix comments so that scripts/kernel-doc doesn't warn, and fix for-loop
stype in bitmap_find_free_region().

CC: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Suggested-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Yury Norov <yury.norov@gmail.com>
2023-10-14 20:25:22 -07:00
Yury Norov
aae06fc1b5 lib/bitmap: split-out string-related operations to a separate files
lib/bitmap.c and corresponding include/linux/bitmap.h are intended to
hold functions related to operations on bitmaps, like bitmap_shift or
bitmap_set. Historically, some string-related operations like
bitmap_parse are also reside in lib/bitmap.c.

Now that the subsystem evolves, string-related bitmap operations became a
significant part of the file. Because they are quite different from the
other bitmap functions by nature, it's worth to split them to a separate
source/header files.

CC: Andrew Morton <akpm@linux-foundation.org>
CC: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
CC: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Signed-off-by: Yury Norov <yury.norov@gmail.com>
2023-10-14 20:25:22 -07:00
Andy Shevchenko
7733aa8938 bitmap: Remove dead code, i.e. bitmap_copy_le()
Besides the fact it's not used anywhere it should be implemented
differently, i.e. via helpers from linux/byteorder/generic.h.
Yet the helpers themselves need to be introduced first.

Also note, the function lacks of the test cases, they must be provided.

Hence, drop the current dead code for good.

Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Yury Norov <yury.norov@gmail.com>
2023-10-14 20:25:22 -07:00
Jonathan Neuschäfer
8ed13a762c bitmap: Fix a typo ("identify map")
A map in which each element is mapped to itself is called an "identity
map".

Signed-off-by: Jonathan Neuschäfer <j.neuschaefer@gmx.net>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Yury Norov <yury.norov@gmail.com>
2023-10-14 20:25:22 -07:00
Randy Dunlap
57f728d59f cpumask: kernel-doc cleanups and additions
Clean up some punctutation and abbreviations.
Add kernel-doc notation for one function and function return value
for 39 functions.

cpumask.h:
Fix some punctuation (plural vs. possessive).
Fix some abbreviations (ie. -> i.e., id -> ID).

Fix 35 warnings like this:
include/linux/cpumask.h:161: warning: No description found for return value of 'cpumask_first'

cpumask.c:
Add Return: value for 4 functions.
Add kernel-doc for cpumask_any_distribute().

Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Yury Norov <yury.norov@gmail.com>
2023-10-14 20:25:21 -07:00
Uros Bizjak
4fbf8b136d locking/atomics: Use atomic_try_cmpxchg_release() to micro-optimize rcuref_put_slowpath()
Use atomic_try_cmpxchg() instead of atomic_cmpxchg(*ptr, old, new) == old
in rcuref_put_slowpath(). On x86 the CMPXCHG instruction returns success in the
ZF flag, so this change saves a compare after CMPXCHG.  Additionaly,
the compiler reorders some code blocks to follow likely/unlikely
annotations in the atomic_try_cmpxchg() macro, improving the code from:

  9a:	f0 0f b1 0b          	lock cmpxchg %ecx,(%rbx)
  9e:	83 f8 ff             	cmp    $0xffffffff,%eax
  a1:	74 04                	je     a7 <rcuref_put_slowpath+0x27>
  a3:	31 c0                	xor    %eax,%eax

to:

  9a:	f0 0f b1 0b          	lock cmpxchg %ecx,(%rbx)
  9e:	75 4c                	jne    ec <rcuref_put_slowpath+0x6c>
  a0:	b0 01                	mov    $0x1,%al

No functional change intended.

Signed-off-by: Uros Bizjak <ubizjak@gmail.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Paul E. McKenney <paulmck@kernel.org>
Link: https://lore.kernel.org/r/20230509150255.3691-1-ubizjak@gmail.com
2023-10-10 10:14:27 +02:00
David Howells
b5f0e20f44
iov_iter, net: Move hash_and_copy_to_iter() to net/
Move hash_and_copy_to_iter() to be with its only caller in networking code.

Signed-off-by: David Howells <dhowells@redhat.com>
Link: https://lore.kernel.org/r/20230925120309.1731676-13-dhowells@redhat.com
cc: Alexander Viro <viro@zeniv.linux.org.uk>
cc: Jens Axboe <axboe@kernel.dk>
cc: Christoph Hellwig <hch@lst.de>
cc: Christian Brauner <christian@brauner.io>
cc: Matthew Wilcox <willy@infradead.org>
cc: Linus Torvalds <torvalds@linux-foundation.org>
cc: David Laight <David.Laight@ACULAB.COM>
cc: "David S. Miller" <davem@davemloft.net>
cc: Eric Dumazet <edumazet@google.com>
cc: Jakub Kicinski <kuba@kernel.org>
cc: Paolo Abeni <pabeni@redhat.com>
cc: linux-block@vger.kernel.org
cc: linux-fsdevel@vger.kernel.org
cc: linux-mm@kvack.org
cc: netdev@vger.kernel.org
Signed-off-by: Christian Brauner <brauner@kernel.org>
2023-10-09 09:35:14 +02:00
David Howells
6d0d419914
iov_iter, net: Move csum_and_copy_to/from_iter() to net/
Move csum_and_copy_to/from_iter() to net code now that the iteration
framework can be #included.

Signed-off-by: David Howells <dhowells@redhat.com>
Link: https://lore.kernel.org/r/20230925120309.1731676-10-dhowells@redhat.com
cc: Alexander Viro <viro@zeniv.linux.org.uk>
cc: Jens Axboe <axboe@kernel.dk>
cc: Christoph Hellwig <hch@lst.de>
cc: Christian Brauner <christian@brauner.io>
cc: Matthew Wilcox <willy@infradead.org>
cc: Linus Torvalds <torvalds@linux-foundation.org>
cc: David Laight <David.Laight@ACULAB.COM>
cc: "David S. Miller" <davem@davemloft.net>
cc: Eric Dumazet <edumazet@google.com>
cc: Jakub Kicinski <kuba@kernel.org>
cc: Paolo Abeni <pabeni@redhat.com>
cc: linux-block@vger.kernel.org
cc: linux-fsdevel@vger.kernel.org
cc: linux-mm@kvack.org
cc: netdev@vger.kernel.org
Signed-off-by: Christian Brauner <brauner@kernel.org>
2023-10-09 09:35:14 +02:00
David Howells
c9eec08bac
iov_iter: Don't deal with iter->copy_mc in memcpy_from_iter_mc()
iter->copy_mc is only used with a bvec iterator and only by
dump_emit_page() in fs/coredump.c so rather than handle this in
memcpy_from_iter_mc() where it is checked repeatedly by _copy_from_iter()
and copy_page_from_iter_atomic(),

Signed-off-by: David Howells <dhowells@redhat.com>
Link: https://lore.kernel.org/r/20230925120309.1731676-9-dhowells@redhat.com
cc: Alexander Viro <viro@zeniv.linux.org.uk>
cc: Jens Axboe <axboe@kernel.dk>
cc: Christoph Hellwig <hch@lst.de>
cc: Christian Brauner <christian@brauner.io>
cc: Matthew Wilcox <willy@infradead.org>
cc: Linus Torvalds <torvalds@linux-foundation.org>
cc: David Laight <David.Laight@ACULAB.COM>
cc: linux-block@vger.kernel.org
cc: linux-fsdevel@vger.kernel.org
cc: linux-mm@kvack.org
Signed-off-by: Christian Brauner <brauner@kernel.org>
2023-10-09 09:35:14 +02:00
Ingo Molnar
8db30574db Merge branch 'sched/urgent' into sched/core, to pick up fixes and refresh the branch
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2023-10-07 11:32:24 +02:00
Jakub Kicinski
2606cf059c Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Cross-merge networking fixes after downstream PR.

No conflicts (or adjacent changes of note).

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-05 13:16:47 -07:00
Dr. David Alan Gilbert
0ebc7feae7 powerpc: Use shared font data
PowerPC has a 'btext' font used for the console which is almost identical
to the shared font_sun8x16, so use it rather than duplicating the data.

They were actually identical until about a decade ago when
   commit bcfbeecea1 ("drivers: console: font_: Change a glyph from
                        "broken bar" to "vertical line"")

which changed the | in the shared font to be a solid
bar rather than a broken bar.  That's the only difference.

This was originally spotted by the PMF source code analyser, which
noticed that sparc does the same thing with the same data, and they
also share a bunch of functions to manipulate the data.  I've previously
posted a near identical patch for sparc.

Tested very lightly with a boot without FS in qemu.

Signed-off-by: "Dr. David Alan Gilbert" <linux@treblig.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20230825142754.1487900-1-linux@treblig.org
2023-10-01 23:09:02 +11:00
Liam R. Howlett
a8091f039c maple_tree: add MAS_UNDERFLOW and MAS_OVERFLOW states
When updating the maple tree iterator to avoid rewalks, an issue was
introduced when shifting beyond the limits.  This can be seen by trying to
go to the previous address of 0, which would set the maple node to
MAS_NONE and keep the range as the last entry.

Subsequent calls to mas_find() would then search upwards from mas->last
and skip the value at mas->index/mas->last.  This showed up as a bug in
mprotect which skips the actual VMA at the current range after attempting
to go to the previous VMA from 0.

Since MAS_NONE may already be set when searching for a value that isn't
contained within a node, changing the handling of MAS_NONE in mas_find()
would make the code more complicated and error prone.  Furthermore, there
was no way to tell which limit was hit, and thus which action to take
(next or the entry at the current range).

This solution is to add two states to track what happened with the
previous iterator action.  This allows for the expected behaviour of the
next command to return the correct item (either the item at the range
requested, or the next/previous).

Tests are also added and updated accordingly.

Link: https://lkml.kernel.org/r/20230921181236.509072-3-Liam.Howlett@oracle.com
Link: https://gist.github.com/heatd/85d2971fae1501b55b6ea401fbbe485b
Link: https://lore.kernel.org/linux-mm/20230921181236.509072-1-Liam.Howlett@oracle.com/
Fixes: 39193685d5 ("maple_tree: try harder to keep active node with mas_prev()")
Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Reported-by: Pedro Falcato <pedro.falcato@gmail.com>
Closes: https://gist.github.com/heatd/85d2971fae1501b55b6ea401fbbe485b
Closes: https://bugs.archlinux.org/task/79656
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-09-29 17:20:46 -07:00
Jinjie Ruan
8040345fda kunit: test: Fix the possible memory leak in executor_test
When CONFIG_KUNIT_ALL_TESTS=y, making CONFIG_DEBUG_KMEMLEAK=y and
CONFIG_DEBUG_KMEMLEAK_AUTO_SCAN=y, the below memory leak is detected.

If kunit_filter_suites() succeeds, not only copy but also filtered_suite
and filtered_suite->test_cases should be freed.

So as Rae suggested, to avoid the suite set never be freed when
KUNIT_ASSERT_EQ() fails and exits after kunit_filter_suites() succeeds,
update kfree_at_end() func to free_suite_set_at_end() to use
kunit_free_suite_set() to free them as kunit_module_exit() and
kunit_run_all_tests() do it. As the second arg got of
free_suite_set_at_end() is a local variable, copy it for free to avoid
wild-memory-access. After applying this patch, the following memory leak
is never detected.

unreferenced object 0xffff8881001de400 (size 1024):
  comm "kunit_try_catch", pid 1396, jiffies 4294720452 (age 932.801s)
  hex dump (first 32 bytes):
    73 75 69 74 65 32 00 00 00 00 00 00 00 00 00 00  suite2..........
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
  backtrace:
    [<ffffffff817db753>] __kmalloc_node_track_caller+0x53/0x150
    [<ffffffff817bd242>] kmemdup+0x22/0x50
    [<ffffffff829e961d>] kunit_filter_suites+0x44d/0xcc0
    [<ffffffff829eb69f>] filter_suites_test+0x12f/0x360
    [<ffffffff829e802a>] kunit_generic_run_threadfn_adapter+0x4a/0x90
    [<ffffffff81236fc6>] kthread+0x2b6/0x380
    [<ffffffff81096afd>] ret_from_fork+0x2d/0x70
    [<ffffffff81003511>] ret_from_fork_asm+0x11/0x20
unreferenced object 0xffff8881052cd388 (size 192):
  comm "kunit_try_catch", pid 1396, jiffies 4294720452 (age 932.801s)
  hex dump (first 32 bytes):
    a0 85 9e 82 ff ff ff ff 80 cd 7c 84 ff ff ff ff  ..........|.....
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
  backtrace:
    [<ffffffff817dbad2>] __kmalloc+0x52/0x150
    [<ffffffff829e9651>] kunit_filter_suites+0x481/0xcc0
    [<ffffffff829eb69f>] filter_suites_test+0x12f/0x360
    [<ffffffff829e802a>] kunit_generic_run_threadfn_adapter+0x4a/0x90
    [<ffffffff81236fc6>] kthread+0x2b6/0x380
    [<ffffffff81096afd>] ret_from_fork+0x2d/0x70
    [<ffffffff81003511>] ret_from_fork_asm+0x11/0x20

unreferenced object 0xffff888100da8400 (size 1024):
  comm "kunit_try_catch", pid 1398, jiffies 4294720454 (age 781.945s)
  hex dump (first 32 bytes):
    73 75 69 74 65 32 00 00 00 00 00 00 00 00 00 00  suite2..........
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
  backtrace:
    [<ffffffff817db753>] __kmalloc_node_track_caller+0x53/0x150
    [<ffffffff817bd242>] kmemdup+0x22/0x50
    [<ffffffff829e961d>] kunit_filter_suites+0x44d/0xcc0
    [<ffffffff829eb13f>] filter_suites_test_glob_test+0x12f/0x560
    [<ffffffff829e802a>] kunit_generic_run_threadfn_adapter+0x4a/0x90
    [<ffffffff81236fc6>] kthread+0x2b6/0x380
    [<ffffffff81096afd>] ret_from_fork+0x2d/0x70
    [<ffffffff81003511>] ret_from_fork_asm+0x11/0x20
unreferenced object 0xffff888105117878 (size 96):
  comm "kunit_try_catch", pid 1398, jiffies 4294720454 (age 781.945s)
  hex dump (first 32 bytes):
    a0 85 9e 82 ff ff ff ff a0 ac 7c 84 ff ff ff ff  ..........|.....
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
  backtrace:
    [<ffffffff817dbad2>] __kmalloc+0x52/0x150
    [<ffffffff829e9651>] kunit_filter_suites+0x481/0xcc0
    [<ffffffff829eb13f>] filter_suites_test_glob_test+0x12f/0x560
    [<ffffffff829e802a>] kunit_generic_run_threadfn_adapter+0x4a/0x90
    [<ffffffff81236fc6>] kthread+0x2b6/0x380
    [<ffffffff81096afd>] ret_from_fork+0x2d/0x70
    [<ffffffff81003511>] ret_from_fork_asm+0x11/0x20
unreferenced object 0xffff888102c31c00 (size 1024):
  comm "kunit_try_catch", pid 1404, jiffies 4294720460 (age 781.948s)
  hex dump (first 32 bytes):
    6e 6f 72 6d 61 6c 5f 73 75 69 74 65 00 00 00 00  normal_suite....
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
  backtrace:
    [<ffffffff817db753>] __kmalloc_node_track_caller+0x53/0x150
    [<ffffffff817bd242>] kmemdup+0x22/0x50
    [<ffffffff829ecf17>] kunit_filter_attr_tests+0xf7/0x860
    [<ffffffff829e99ff>] kunit_filter_suites+0x82f/0xcc0
    [<ffffffff829ea975>] filter_attr_test+0x195/0x5f0
    [<ffffffff829e802a>] kunit_generic_run_threadfn_adapter+0x4a/0x90
    [<ffffffff81236fc6>] kthread+0x2b6/0x380
    [<ffffffff81096afd>] ret_from_fork+0x2d/0x70
    [<ffffffff81003511>] ret_from_fork_asm+0x11/0x20
unreferenced object 0xffff8881052cd250 (size 192):
  comm "kunit_try_catch", pid 1404, jiffies 4294720460 (age 781.948s)
  hex dump (first 32 bytes):
    a0 85 9e 82 ff ff ff ff 00 a9 7c 84 ff ff ff ff  ..........|.....
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
  backtrace:
    [<ffffffff817dbad2>] __kmalloc+0x52/0x150
    [<ffffffff829ecfc1>] kunit_filter_attr_tests+0x1a1/0x860
    [<ffffffff829e99ff>] kunit_filter_suites+0x82f/0xcc0
    [<ffffffff829ea975>] filter_attr_test+0x195/0x5f0
    [<ffffffff829e802a>] kunit_generic_run_threadfn_adapter+0x4a/0x90
    [<ffffffff81236fc6>] kthread+0x2b6/0x380
    [<ffffffff81096afd>] ret_from_fork+0x2d/0x70
    [<ffffffff81003511>] ret_from_fork_asm+0x11/0x20
unreferenced object 0xffff888104f4e400 (size 1024):
  comm "kunit_try_catch", pid 1408, jiffies 4294720464 (age 781.944s)
  hex dump (first 32 bytes):
    73 75 69 74 65 00 00 00 00 00 00 00 00 00 00 00  suite...........
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
  backtrace:
    [<ffffffff817db753>] __kmalloc_node_track_caller+0x53/0x150
    [<ffffffff817bd242>] kmemdup+0x22/0x50
    [<ffffffff829ecf17>] kunit_filter_attr_tests+0xf7/0x860
    [<ffffffff829e99ff>] kunit_filter_suites+0x82f/0xcc0
    [<ffffffff829e9fc3>] filter_attr_skip_test+0x133/0x6e0
    [<ffffffff829e802a>] kunit_generic_run_threadfn_adapter+0x4a/0x90
    [<ffffffff81236fc6>] kthread+0x2b6/0x380
    [<ffffffff81096afd>] ret_from_fork+0x2d/0x70
    [<ffffffff81003511>] ret_from_fork_asm+0x11/0x20
unreferenced object 0xffff8881052cc620 (size 192):
  comm "kunit_try_catch", pid 1408, jiffies 4294720464 (age 781.944s)
  hex dump (first 32 bytes):
    a0 85 9e 82 ff ff ff ff c0 a8 7c 84 ff ff ff ff  ..........|.....
    00 00 00 00 00 00 00 00 02 00 00 00 02 00 00 00  ................
  backtrace:
    [<ffffffff817dbad2>] __kmalloc+0x52/0x150
    [<ffffffff829ecfc1>] kunit_filter_attr_tests+0x1a1/0x860
    [<ffffffff829e99ff>] kunit_filter_suites+0x82f/0xcc0
    [<ffffffff829e9fc3>] filter_attr_skip_test+0x133/0x6e0
    [<ffffffff829e802a>] kunit_generic_run_threadfn_adapter+0x4a/0x90
    [<ffffffff81236fc6>] kthread+0x2b6/0x380
    [<ffffffff81096afd>] ret_from_fork+0x2d/0x70
    [<ffffffff81003511>] ret_from_fork_asm+0x11/0x20

Fixes: e5857d396f ("kunit: flatten kunit_suite*** to kunit_suite** in .kunit_test_suites")
Fixes: 76066f93f1 ("kunit: add tests for filtering attributes")
Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
Suggested-by: Rae Moar <rmoar@google.com>
Reviewed-by: David Gow <davidgow@google.com>
Suggested-by: David Gow <davidgow@google.com>
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202309142251.uJ8saAZv-lkp@intel.com/
Closes: https://lore.kernel.org/oe-kbuild-all/202309270433.wGmFRGjd-lkp@intel.com/
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2023-09-28 08:51:07 -06:00
Jinjie Ruan
24de14c98b kunit: Fix possible memory leak in kunit_filter_suites()
If the outer layer for loop is iterated more than once and it fails not
in the first iteration, the filtered_suite and filtered_suite->test_cases
allocated in the last kunit_filter_attr_tests() in last inner for loop
is leaked.

So add a new free_filtered_suite err label and free the filtered_suite
and filtered_suite->test_cases so far. And change kmalloc_array of copy
to kcalloc to Clear the copy to make the kfree safe.

Fixes: 529534e8cb ("kunit: Add ability to filter attributes")
Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
Reviewed-by: Rae Moar <rmoar@google.com>
Reviewed-by: David Gow <davidgow@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2023-09-28 08:51:02 -06:00
Jinjie Ruan
e44679515a kunit: Fix the wrong kfree of copy for kunit_filter_suites()
If the outer layer for loop is iterated more than once and it fails not
in the first iteration, the copy pointer has been moved. So it should free
the original copy's backup copy_start.

Fixes: abbf73816b ("kunit: fix possible memory leak in kunit_filter_suites()")
Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
Reviewed-by: Rae Moar <rmoar@google.com>
Reviewed-by: David Gow <davidgow@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2023-09-28 08:50:57 -06:00
Jinjie Ruan
a6074cf012 kunit: Fix missed memory release in kunit_free_suite_set()
modprobe cpumask_kunit and rmmod cpumask_kunit, kmemleak detect
a suspected memory leak as below.

If kunit_filter_suites() in kunit_module_init() succeeds, the
suite_set.start will not be NULL and the kunit_free_suite_set() in
kunit_module_exit() should free all the memory which has not
been freed. However the test_cases in suites is left out.

unreferenced object 0xffff54ac47e83200 (size 512):
  comm "modprobe", pid 592, jiffies 4294913238 (age 1367.612s)
  hex dump (first 32 bytes):
    84 13 1a f0 d3 b6 ff ff 30 68 1a f0 d3 b6 ff ff  ........0h......
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
  backtrace:
    [<000000008dec63a2>] slab_post_alloc_hook+0xb8/0x368
    [<00000000ec280d8e>] __kmem_cache_alloc_node+0x174/0x290
    [<00000000896c7740>] __kmalloc+0x60/0x2c0
    [<000000007a50fa06>] kunit_filter_suites+0x254/0x5b8
    [<0000000078cc98e2>] kunit_module_notify+0xf4/0x240
    [<0000000033cea952>] notifier_call_chain+0x98/0x17c
    [<00000000973d05cc>] notifier_call_chain_robust+0x4c/0xa4
    [<000000005f95895f>] blocking_notifier_call_chain_robust+0x4c/0x74
    [<0000000048e36fa7>] load_module+0x1a2c/0x1c40
    [<0000000004eb8a91>] init_module_from_file+0x94/0xcc
    [<0000000037dbba28>] idempotent_init_module+0x184/0x278
    [<00000000161b75cb>] __arm64_sys_finit_module+0x68/0xa8
    [<000000006dc1669b>] invoke_syscall+0x44/0x100
    [<00000000fa87e304>] el0_svc_common.constprop.1+0x68/0xe0
    [<000000009d8ad866>] do_el0_svc+0x1c/0x28
    [<000000005b83c607>] el0_svc+0x3c/0xc4

Fixes: a127b154a8 ("kunit: tool: allow filtering test cases via glob")
Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
Reviewed-by: Rae Moar <rmoar@google.com>
Reviewed-by: David Gow <davidgow@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2023-09-28 08:50:51 -06:00
David Howells
f1982740f5
iov_iter: Convert iterate*() to inline funcs
Convert the iov_iter iteration macros to inline functions to make the code
easier to follow.

The functions are marked __always_inline as we don't want to end up with
indirect calls in the code.  This, however, leaves dealing with ->copy_mc
in an awkard situation since the step function (memcpy_from_iter_mc())
needs to test the flag in the iterator, but isn't passed the iterator.
This will be dealt with in a follow-up patch.

The variable names in the per-type iterator functions have been harmonised
as much as possible and made clearer as to the variable purpose.

The iterator functions are also moved to a header file so that other
operations that need to scan over an iterator can be added.  For instance,
the rbd driver could use this to scan a buffer to see if it is all zeros
and libceph could use this to generate a crc.

Signed-off-by: David Howells <dhowells@redhat.com>
Link: https://lore.kernel.org/r/3710261.1691764329@warthog.procyon.org.uk/ # v1
Link: https://lore.kernel.org/r/855.1692047347@warthog.procyon.org.uk/ # v2
Link: https://lore.kernel.org/r/20230816120741.534415-1-dhowells@redhat.com/ # v3
Link: https://lore.kernel.org/r/20230925120309.1731676-8-dhowells@redhat.com
cc: Alexander Viro <viro@zeniv.linux.org.uk>
cc: Jens Axboe <axboe@kernel.dk>
cc: Christoph Hellwig <hch@lst.de>
cc: Christian Brauner <christian@brauner.io>
cc: Matthew Wilcox <willy@infradead.org>
cc: Linus Torvalds <torvalds@linux-foundation.org>
cc: David Laight <David.Laight@ACULAB.COM>
cc: linux-block@vger.kernel.org
cc: linux-fsdevel@vger.kernel.org
cc: linux-mm@kvack.org
Signed-off-by: Christian Brauner <brauner@kernel.org>
2023-09-25 14:30:28 +02:00
David Howells
f1b4cb650b
iov_iter: Derive user-backedness from the iterator type
Use the iterator type to determine whether an iterator is user-backed or
not rather than using a special flag for it.  Now that ITER_UBUF and
ITER_IOVEC are 0 and 1, they can be checked with a single comparison.

Signed-off-by: David Howells <dhowells@redhat.com>
Link: https://lore.kernel.org/r/20230925120309.1731676-7-dhowells@redhat.com
cc: Alexander Viro <viro@zeniv.linux.org.uk>
cc: Jens Axboe <axboe@kernel.dk>
cc: Christoph Hellwig <hch@lst.de>
cc: Christian Brauner <christian@brauner.io>
cc: Matthew Wilcox <willy@infradead.org>
cc: Linus Torvalds <torvalds@linux-foundation.org>
cc: David Laight <David.Laight@ACULAB.COM>
cc: linux-block@vger.kernel.org
cc: linux-fsdevel@vger.kernel.org
cc: linux-mm@kvack.org
Signed-off-by: Christian Brauner <brauner@kernel.org>
2023-09-25 14:30:28 +02:00
Azeem Shaikh
6cd59324c6 kobject: Replace strlcpy with strscpy
strlcpy() reads the entire source buffer first.
This read may exceed the destination size limit.
This is both inefficient and can lead to linear read
overflows if a source string is not NUL-terminated [1].
In an effort to remove strlcpy() completely [2], replace
strlcpy() here with strscpy().

Direct replacement is safe here since return value of -errno
is used to check for truncation instead of sizeof(dest).

[1] https://www.kernel.org/doc/html/latest/process/deprecated.html#strlcpy
[2] https://github.com/KSPP/linux/issues/89

Signed-off-by: Azeem Shaikh <azeemshaikh38@gmail.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20230831140104.207019-1-azeemshaikh38@gmail.com
Signed-off-by: Kees Cook <keescook@chromium.org>
2023-09-22 09:50:56 -07:00
Randy Dunlap
36ee98b555 argv_split: fix kernel-doc warnings
Use proper kernel-doc notation to prevent build warnings:

lib/argv_split.c:36: warning: Function parameter or member 'argv' not described in 'argv_free'
lib/argv_split.c:61: warning: No description found for return value of 'argv_split'

Link: https://lkml.kernel.org/r/20230912060838.3794-1-rdunlap@infradead.org
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-09-19 13:21:33 -07:00
Randy Dunlap
c80da1fb85 scatterlist: add missing function params to kernel-doc
Describe missing function parameters to prevent kernel-doc warnings:

lib/scatterlist.c:288: warning: Function parameter or member 'first_chunk' not described in '__sg_alloc_table'
lib/scatterlist.c:800: warning: Function parameter or member 'flags' not described in 'sg_miter_start'

Link: https://lkml.kernel.org/r/20230912060848.4673-1-rdunlap@infradead.org
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-09-19 13:21:33 -07:00
Michal Wajdeczko
ee5f8cc277 kunit: Reset test status on each param iteration
If we skip one parametrized test case then test status remains
SKIP for all subsequent test params leading to wrong reports:

$ ./tools/testing/kunit/kunit.py run \
	--kunitconfig ./lib/kunit/.kunitconfig *.example_params*
	--raw_output \

[ ] Starting KUnit Kernel (1/1)...
KTAP version 1
1..1
    # example: initializing suite
    KTAP version 1
    # Subtest: example
    # module: kunit_example_test
    1..1
        KTAP version 1
        # Subtest: example_params_test
    # example_params_test: initializing
    # example_params_test: cleaning up
        ok 1 example value 3 # SKIP unsupported param value 3
    # example_params_test: initializing
    # example_params_test: cleaning up
        ok 2 example value 2 # SKIP unsupported param value 3
    # example_params_test: initializing
    # example_params_test: cleaning up
        ok 3 example value 1 # SKIP unsupported param value 3
    # example_params_test: initializing
    # example_params_test: cleaning up
        ok 4 example value 0 # SKIP unsupported param value 0
    # example_params_test: pass:0 fail:0 skip:4 total:4
    ok 1 example_params_test # SKIP unsupported param value 0
    # example: exiting suite
ok 1 example # SKIP

Reset test status and status comment after each param iteration
to avoid using stale results.

Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: David Gow <davidgow@google.com>
Cc: Rae Moar <rmoar@google.com>
Reviewed-by: David Gow <davidgow@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2023-09-18 10:46:56 -06:00
Richard Fitzgerald
53568b720c kunit: string-stream: Test performance of string_stream
Add a test of the speed and memory use of string_stream.

string_stream_performance_test() doesn't actually "test" anything (it
cannot fail unless the system has run out of allocatable memory) but it
measures the speed and memory consumption of the string_stream and reports
the result.

This allows changes in the string_stream implementation to be compared.

Signed-off-by: Richard Fitzgerald <rf@opensource.cirrus.com>
Reviewed-by: David Gow <davidgow@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2023-09-18 10:45:59 -06:00
Richard Fitzgerald
05e2006ce4 kunit: Use string_stream for test log
Replace the fixed-size log buffer with a string_stream so that the
log can grow as lines are added.

The existing kunit log tests have been updated for using a
string_stream as the log. No new test have been added because there
are already tests for the underlying string_stream.

As the log tests now depend on string_stream functions they cannot
build when kunit-test is a module. They have been surrounded by
a #if to replace them with skipping version when the test is
build as a module. Though this isn't pretty, it avoids moving
code to another file while that code is also being changed.

Signed-off-by: Richard Fitzgerald <rf@opensource.cirrus.com>
Reviewed-by: David Gow <davidgow@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2023-09-18 10:45:53 -06:00
Richard Fitzgerald
d1a0d699bf kunit: string-stream: Add tests for freeing resource-managed string_stream
string_stream_managed_free_test() allocates a resource-managed
string_stream and tests that kunit_free_string_stream() calls
string_stream_destroy().

string_stream_resource_free_test() allocates a resource-managed
string_stream and tests that string_stream_destroy() is called
when the test resources are cleaned up.

The old string_stream_init_test() has been split into two tests,
one for kunit_alloc_string_stream() and the other for
alloc_string_stream().

Signed-off-by: Richard Fitzgerald <rf@opensource.cirrus.com>
Reviewed-by: David Gow <davidgow@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2023-09-18 10:45:46 -06:00
Richard Fitzgerald
a3fdf78478 kunit: string-stream: Decouple string_stream from kunit
Re-work string_stream so that it is not tied to a struct kunit. This is
to allow using it for the log of struct kunit_suite.

Instead of resource-managing individual allocations the whole string_stream
can be resource-managed, if required.

    alloc_string_stream() now allocates a string stream that is
    not resource-managed.

    string_stream_destroy() now works on an unmanaged string_stream
    allocated by alloc_string_stream() and frees the entire
    string_stream (previously it only freed the fragments).

    string_stream_clear() has been made public for callers that
    want to free the fragments without destroying the string_stream.

For resource-managed allocations use kunit_alloc_string_stream()
and kunit_free_string_stream().

In addition to this, string_stream_get_string() now returns an
unmanaged buffer that the caller must kfree().

Signed-off-by: Richard Fitzgerald <rf@opensource.cirrus.com>
Reviewed-by: David Gow <davidgow@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2023-09-18 10:45:40 -06:00
Richard Fitzgerald
20631e154c kunit: string-stream: Add kunit_alloc_string_stream()
Add function kunit_alloc_string_stream() to do a resource-managed
allocation of a string stream, and corresponding
kunit_free_string_stream() to free the resource-managed stream.

This is preparing for decoupling the string_stream
implementation from struct kunit, to reduce the amount of code
churn when that happens. Currently:
 - kunit_alloc_string_stream() only calls alloc_string_stream().
 - kunit_free_string_stream() takes a struct kunit* which
   isn't used yet.

Callers of the old alloc_string_stream() and
string_stream_destroy() are all requesting a managed allocation
so have been changed to use the new functions.

alloc_string_stream() has been temporarily made static because
its current behavior has been replaced with
kunit_alloc_string_stream().

Signed-off-by: Richard Fitzgerald <rf@opensource.cirrus.com>
Reviewed-by: David Gow <davidgow@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2023-09-18 10:45:35 -06:00
Richard Fitzgerald
7b4481cbe7 kunit: Don't use a managed alloc in is_literal()
There is no need to use a test-managed alloc in is_literal().
The function frees the temporary buffer before returning.

This removes the only use of the test and gfp members of
struct string_stream outside of the string_stream implementation.

Signed-off-by: Richard Fitzgerald <rf@opensource.cirrus.com>
Reviewed-by: David Gow <davidgow@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2023-09-18 10:45:30 -06:00
Richard Fitzgerald
1f58cdb173 kunit: string-stream-test: Add cases for string_stream newline appending
Add test cases for testing the string_stream feature that appends a
newline to strings that do not already end with a newline.

string_stream_no_auto_newline_test() tests with this feature disabled.
Newlines should not be added or dropped.

string_stream_auto_newline_test() tests with this feature enabled.
Newlines should be added to lines that do not end with a newline.

string_stream_append_auto_newline_test() tests appending the
content of one stream to another stream when the target stream
has newline appending enabled.

Signed-off-by: Richard Fitzgerald <rf@opensource.cirrus.com>
Reviewed-by: Rae Moar <rmoar@google.com>
Reviewed-by: David Gow <davidgow@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2023-09-18 10:45:23 -06:00
Richard Fitzgerald
a5abe7b201 kunit: string-stream: Add option to make all lines end with newline
Add an optional feature to string_stream that will append a newline to
any added string that does not already end with a newline. The purpose
of this is so that string_stream can be used to collect log lines.

This is enabled/disabled by calling string_stream_set_append_newlines().

Signed-off-by: Richard Fitzgerald <rf@opensource.cirrus.com>
Reviewed-by: Rae Moar <rmoar@google.com>
Reviewed-by: David Gow <davidgow@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2023-09-18 10:45:16 -06:00
Richard Fitzgerald
4551caca6a kunit: string-stream: Improve testing of string_stream
Replace the minimal tests with more-thorough testing.

string_stream_init_test() tests that struct string_stream is
initialized correctly.

string_stream_line_add_test() adds a series of numbered lines and
checks that the resulting string contains all the lines.

string_stream_variable_length_line_test() adds a large number of
lines of varying length to create many fragments, then tests that all
lines are present.

string_stream_append_test() tests various cases of using
string_stream_append() to append the content of one stream to another.

Adds string_stream_append_empty_string_test() to test that adding an
empty string to a string_stream doesn't create a new empty fragment.

Signed-off-by: Richard Fitzgerald <rf@opensource.cirrus.com>
Reviewed-by: Rae Moar <rmoar@google.com>
Reviewed-by: David Gow <davidgow@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2023-09-18 10:45:11 -06:00
Richard Fitzgerald
5c54c9ebb1 kunit: string-stream: Don't create a fragment for empty strings
If the result of the formatted string is an empty string just return
instead of creating an empty fragment.

Signed-off-by: Richard Fitzgerald <rf@opensource.cirrus.com>
Reviewed-by: Rae Moar <rmoar@google.com>
Reviewed-by: David Gow <davidgow@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2023-09-18 10:45:05 -06:00
David S. Miller
685c6d5b2c Merge https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Alexei Starovoitov says:

====================
The following pull-request contains BPF updates for your *net-next* tree.

We've added 73 non-merge commits during the last 9 day(s) which contain
a total of 79 files changed, 5275 insertions(+), 600 deletions(-).

The main changes are:

1) Basic BTF validation in libbpf, from Andrii Nakryiko.

2) bpf_assert(), bpf_throw(), exceptions in bpf progs, from Kumar Kartikeya Dwivedi.

3) next_thread cleanups, from Oleg Nesterov.

4) Add mcpu=v4 support to arm32, from Puranjay Mohan.

5) Add support for __percpu pointers in bpf progs, from Yonghong Song.

6) Fix bpf tailcall interaction with bpf trampoline, from Leon Hwang.

7) Raise irq_work in bpf_mem_alloc while irqs are disabled to improve refill probabablity, from Hou Tao.

Please consider pulling these changes from:

  git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git

Thanks a lot!

Also thanks to reporters, reviewers and testers of commits in this pull-request:

Alan Maguire, Andrey Konovalov, Dave Marchevsky, "Eric W. Biederman",
Jiri Olsa, Maciej Fijalkowski, Quentin Monnet, Russell King (Oracle),
Song Liu, Stanislav Fomichev, Yonghong Song
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2023-09-17 15:12:06 +01:00
Puranjay Mohan
daabb2b098 bpf/tests: add tests for cpuv4 instructions
The BPF JITs now support cpuv4 instructions. Add tests for these new
instructions to the test suite:

1. Sign extended Load
2. Sign extended Mov
3. Unconditional byte swap
4. Unconditional jump with 32-bit offset
5. Signed division and modulo

Signed-off-by: Puranjay Mohan <puranjay12@gmail.com>
Link: https://lore.kernel.org/r/20230907230550.1417590-9-puranjay12@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-09-15 17:16:57 -07:00
Yury Norov
9ecea9ae4d sched/topology: Handle NUMA_NO_NODE in sched_numa_find_nth_cpu()
sched_numa_find_nth_cpu() doesn't handle NUMA_NO_NODE properly, and
may crash kernel if passed with it. On the other hand, the only user
of sched_numa_find_nth_cpu() has to check NUMA_NO_NODE case explicitly.

It would be easier for users if this logic will get moved into
sched_numa_find_nth_cpu().

Signed-off-by: Yury Norov <yury.norov@gmail.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Mel Gorman <mgorman@suse.de>
Link: https://lore.kernel.org/r/20230819141239.287290-6-yury.norov@gmail.com
2023-09-15 13:48:11 +02:00
Maximilian Luz
e4c89f9380 lib/ucs2_string: Add UCS-2 strscpy function
Add a ucs2_strscpy() function for UCS-2 strings. The behavior is
equivalent to the standard strscpy() function, just for 16-bit character
UCS-2 strings.

Signed-off-by: Maximilian Luz <luzmaximilian@gmail.com>
Reviewed-by: Bjorn Andersson <andersson@kernel.org>
Link: https://lore.kernel.org/r/20230827211408.689076-2-luzmaximilian@gmail.com
Signed-off-by: Bjorn Andersson <andersson@kernel.org>
2023-09-13 10:18:42 -07:00
Linus Torvalds
fb52c87a06 linux-kselftest-kunit-6.6-rc2
This kunit update for Linux 6.6-rc2 consists of important fixes to
 possible memory leak, null-ptr-deref, wild-memory-access, and error
 path bugs.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEPZKym/RZuOCGeA/kCwJExA0NQxwFAmT/PB8ACgkQCwJExA0N
 QxwWCg//fPTPHu59xQ/SOeVFEeaKGB77FwnmG+JIdryd+4oVEnBehk/4TZCZTvMe
 v7wLwaB9WOxkFWztWIlEEsYApZoFbLSWYx8iEua5QPaPBI81XAZGYv1j9u6z/vCn
 62boIq+TVusokEuNzS6zb8ZKd8A44zUTswdfVmiZ7Itf0HSe4xYNFdhDU93AtZOo
 oHtR3ok6dTj/lYp4Sp5BC4JRog3OhSdme2+whLfgH9YY1HRTHBxScSHYDCTn9OPT
 ZbaZJW4PJsw1mx7fF/ZS5Zuo1zey27LoPuEJu9C46kTLSXEGWAO8L5rV9G2gJLHg
 LAMjvcgfkEDgWhrWRaFcP2tsYqn+Dxsb/Yvmrtq3tFJDYbatUTABxhuO/6T8wbfY
 YS8ksIdXz8koDTI2ShpSd+1Gv7KEjXM7XhoO8GVcVg3pVL/niKgPLdxEsE8dgaoE
 XRip1IJBs8A+9c0xE7325wR6op6IKD0xM0xGTTjYEbGHqKjZJgg72zbCBPMGcWDZ
 mGlv53sPlDTRh0B8B/M+LnUpdJ5gRMf800CgDl3mKUN1b8fvqmsW3HjO5wWNo/cw
 I9JSJUrSxkYHUbJVtMasrhseRHY+V+ysvvzl/Z39RoVMPMonh+EIY6dG07aAU6lA
 jIIL6k1KwJnox4+dFgkShrnNqcZF/6INok0wzzSnJBwBKKD8Aac=
 =wiS6
 -----END PGP SIGNATURE-----

Merge tag 'linux-kselftest-kunit-6.6-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest

Pull kunit fixes from Shuah Khan:
 "Fixes to possible memory leak, null-ptr-deref, wild-memory-access, and
  error path bugs"

* tag 'linux-kselftest-kunit-6.6-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
  kunit: Fix possible memory leak in kunit_filter_suites()
  kunit: Fix possible null-ptr-deref in kunit_parse_glob_filter()
  kunit: Fix the wrong err path and add goto labels in kunit_filter_suites()
  kunit: Fix wild-memory-access bug in kunit_free_suite_set()
  kunit: test: Make filter strings in executor_test writable
2023-09-12 09:05:49 -07:00
Kent Overstreet
21db931445 lib: Export errname
errname() returns the name of an errcode; this functionality is
otherwise only available for error pointers via %pE - bcachefs uses this
for better error messages.

Signed-off-by: Christopher James Halse Rogers <raof@ubuntu.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-09-11 23:59:47 -04:00
Kent Overstreet
83feeb1955 lib/string_helpers: string_get_size() now returns characters wrote
printbuf now needs to know the number of characters that would have been
written if the buffer was too small, like snprintf(); this changes
string_get_size() to return the the return value of snprintf().

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-09-11 23:59:47 -04:00
Ard Biesheuvel
b089ea3cc3 lib/raid6: Drop IA64 support
Drop Itanium support from the RAID6 code, and along with it, the 16x and
32x unrolled versions, which were only used by IA64.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2023-09-11 08:13:18 +00:00
Ard Biesheuvel
cf8e865810 arch: Remove Itanium (IA-64) architecture
The Itanium architecture is obsolete, and an informal survey [0] reveals
that any residual use of Itanium hardware in production is mostly HP-UX
or OpenVMS based. The use of Linux on Itanium appears to be limited to
enthusiasts that occasionally boot a fresh Linux kernel to see whether
things are still working as intended, and perhaps to churn out some
distro packages that are rarely used in practice.

None of the original companies behind Itanium still produce or support
any hardware or software for the architecture, and it is listed as
'Orphaned' in the MAINTAINERS file, as apparently, none of the engineers
that contributed on behalf of those companies (nor anyone else, for that
matter) have been willing to support or maintain the architecture
upstream or even be responsible for applying the odd fix. The Intel
firmware team removed all IA-64 support from the Tianocore/EDK2
reference implementation of EFI in 2018. (Itanium is the original
architecture for which EFI was developed, and the way Linux supports it
deviates significantly from other architectures.) Some distros, such as
Debian and Gentoo, still maintain [unofficial] ia64 ports, but many have
dropped support years ago.

While the argument is being made [1] that there is a 'for the common
good' angle to being able to build and run existing projects such as the
Grid Community Toolkit [2] on Itanium for interoperability testing, the
fact remains that none of those projects are known to be deployed on
Linux/ia64, and very few people actually have access to such a system in
the first place. Even if there were ways imaginable in which Linux/ia64
could be put to good use today, what matters is whether anyone is
actually doing that, and this does not appear to be the case.

There are no emulators widely available, and so boot testing Itanium is
generally infeasible for ordinary contributors. GCC still supports IA-64
but its compile farm [3] no longer has any IA-64 machines. GLIBC would
like to get rid of IA-64 [4] too because it would permit some overdue
code cleanups. In summary, the benefits to the ecosystem of having IA-64
be part of it are mostly theoretical, whereas the maintenance overhead
of keeping it supported is real.

So let's rip off the band aid, and remove the IA-64 arch code entirely.
This follows the timeline proposed by the Debian/ia64 maintainer [5],
which removes support in a controlled manner, leaving IA-64 in a known
good state in the most recent LTS release. Other projects will follow
once the kernel support is removed.

[0] https://lore.kernel.org/all/CAMj1kXFCMh_578jniKpUtx_j8ByHnt=s7S+yQ+vGbKt9ud7+kQ@mail.gmail.com/
[1] https://lore.kernel.org/all/0075883c-7c51-00f5-2c2d-5119c1820410@web.de/
[2] https://gridcf.org/gct-docs/latest/index.html
[3] https://cfarm.tetaneutral.net/machines/list/
[4] https://lore.kernel.org/all/87bkiilpc4.fsf@mid.deneb.enyo.de/
[5] https://lore.kernel.org/all/ff58a3e76e5102c94bb5946d99187b358def688a.camel@physik.fu-berlin.de/

Acked-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2023-09-11 08:13:17 +00:00
David Howells
a3c57ab79a iov_iter: Kunit tests for page extraction
Add some kunit tests for page extraction for ITER_BVEC, ITER_KVEC and
ITER_XARRAY type iterators.  ITER_UBUF and ITER_IOVEC aren't dealt with
as they require userspace VM interaction.  ITER_DISCARD isn't dealt with
either as that can't be extracted.

Signed-off-by: David Howells <dhowells@redhat.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: David Hildenbrand <david@redhat.com>
Cc: John Hubbard <jhubbard@nvidia.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2023-09-09 15:11:49 -07:00
David Howells
2d71340ff1 iov_iter: Kunit tests for copying to/from an iterator
Add some kunit tests for page extraction for ITER_BVEC, ITER_KVEC and
ITER_XARRAY type iterators.  ITER_UBUF and ITER_IOVEC aren't dealt with
as they require userspace VM interaction.  ITER_DISCARD isn't dealt with
either as that does nothing.

Signed-off-by: David Howells <dhowells@redhat.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: David Hildenbrand <david@redhat.com>
Cc: John Hubbard <jhubbard@nvidia.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2023-09-09 15:11:49 -07:00
David Howells
f741bd7178 iov_iter: Fix iov_iter_extract_pages() with zero-sized entries
iov_iter_extract_pages() doesn't correctly handle skipping over initial
zero-length entries in ITER_KVEC and ITER_BVEC-type iterators.

The problem is that it accidentally reduces maxsize to 0 when it
skipping and thus runs to the end of the array and returns 0.

Fix this by sticking the calculated size-to-copy in a new variable
rather than back in maxsize.

Fixes: 7d58fe7310 ("iov_iter: Add a function to extract a page list from an iterator")
Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: David Hildenbrand <david@redhat.com>
Cc: John Hubbard <jhubbard@nvidia.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2023-09-09 15:11:49 -07:00
Linus Torvalds
3095dd99dd XArray/IDA updates for 6.6
- Fix a bug encountered by people using bittorrent where they'd get
    NULL pointer dereferences on page cache lookups when using XFS
 
  - Two documentation fixes
 -----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCgAdFiEEejHryeLBw/spnjHrDpNsjXcpgj4FAmT7XLYACgkQDpNsjXcp
 gj6qrAf+LiAs3dUELOjrqaQQbbNGp4na+YwJCiezuvwZn8P+ieJpt6QCEDHEb1jH
 LCOjr0GFMhHnAWp9Q0Qay4IXoKk8DPkA/avSaZgsl5blmMyNqFMgHklU7mjRvhCG
 ayb/NeZYwrJhA9NyueXYuH3h7QDryxyIN3TZS1/7z13YrohMIQeu3q7X/ZBMh7NS
 uPd7vmDj8TnZ/agQzplQ4XDov9lrzkUXDJqpMvn/Gbr4K7y66UZa3SLxi1JPrnah
 ffDvBlK2OImNBoaADfiRImWc7QlXVkF/B08xUcJ6tXAeO6xJDykkie+gjsF2S040
 YP2YIG+IWi47zqa25EuxFRtavwUh6w==
 =4GKy
 -----END PGP SIGNATURE-----

Merge tag 'xarray-6.6' of git://git.infradead.org/users/willy/xarray

Pull xarray fixes from Matthew Wilcox:

 - Fix a bug encountered by people using bittorrent where they'd get
   NULL pointer dereferences on page cache lookups when using XFS

 - Two documentation fixes

* tag 'xarray-6.6' of git://git.infradead.org/users/willy/xarray:
  idr: fix param name in idr_alloc_cyclic() doc
  xarray: Document necessary flag in alloc functions
  XArray: Do not return sibling entries from xa_load()
2023-09-08 21:46:26 -07:00
Linus Torvalds
12952b6bbd LoongArch changes for v6.6
1, Allow usage of LSX/LASX in the kernel;
 2, Add SIMD-optimized RAID5/RAID6 routines;
 3, Add Loongson Binary Translation (LBT) extension support;
 4, Add basic KGDB & KDB support;
 5, Add building with kcov coverage;
 6, Add KFENCE (Kernel Electric-Fence) support;
 7, Add KASAN (Kernel Address Sanitizer) support;
 8, Some bug fixes and other small changes;
 9, Update the default config file.
 -----BEGIN PGP SIGNATURE-----
 
 iQJKBAABCAA0FiEEzOlt8mkP+tbeiYy5AoYrw/LiJnoFAmT5TfMWHGNoZW5odWFj
 YWlAa2VybmVsLm9yZwAKCRAChivD8uImeqd3EACjqCaHNlp33kwufSPpGuQw9a8I
 F7JW1KzBOoWELch5nFRjfQClROBWRmM4jN5YnxENBQ5K2F1K6gfxdkfjew+KV2mn
 ki9ByamCfFVJDZXo9wavUD2LBrVakEFmLT+SyXBxdWwJ3fDivHjF6A0qs9ltp7dq
 Bttq4bkw1mZsU6MnViRwPKVROtNUVrd9mwYSTq0iXviVEbWhPHQQTxRizNra9Z6X
 7XWxO0ODHl0WVvdOJU+F16mBRS3Bs1g/HHAIDc41yrYEHFFOeFCEUAQSF/4Nj5wj
 BAfAB8WOa9+vPH8fTnrpCt2RtGJmkz71TM49DdXB7jpGaWIyc4WDi9MXeeBiJ0wE
 vQg8IECc9POC1sH4/6BMwq2qkrWRj2PYFYof0fP66iWNjmodtNUf7GOVHy8MTQan
 xHWizJFAdY/u/bwbF9tRQ+EVeot/844CkjtZxkgTfV8shN6kCMEVAamwBItZ7TXN
 g/oc1ORM6nsKHBDQF3r2LSY0Gbf3OSfMJVL8SLEQ9hAhgGhotmJ36B4bdvyO7T0Q
 gNn//U+p4IIMFRKRxreEz9P0KjTOJrHAAxNzu1oZebhGZd5WI+i0PHYkkBDKZTXc
 7qaEdM2cX8Wd0ePIXOHQnSItwYO7ilrviHyeCM8wd/g2/W/00jvnpF3J+2rk7eJO
 rcfAr8+V5ylYBQzp6Q==
 =NXy2
 -----END PGP SIGNATURE-----

Merge tag 'loongarch-6.6' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson

Pull LoongArch updates from Huacai Chen:

 - Allow usage of LSX/LASX in the kernel, and use them for
   SIMD-optimized RAID5/RAID6 routines

 - Add Loongson Binary Translation (LBT) extension support

 - Add basic KGDB & KDB support

 - Add building with kcov coverage

 - Add KFENCE (Kernel Electric-Fence) support

 - Add KASAN (Kernel Address Sanitizer) support

 - Some bug fixes and other small changes

 - Update the default config file

* tag 'loongarch-6.6' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson: (25 commits)
  LoongArch: Update Loongson-3 default config file
  LoongArch: Add KASAN (Kernel Address Sanitizer) support
  LoongArch: Simplify the processing of jumping new kernel for KASLR
  kasan: Add (pmd|pud)_init for LoongArch zero_(pud|p4d)_populate process
  kasan: Add __HAVE_ARCH_SHADOW_MAP to support arch specific mapping
  LoongArch: Add KFENCE (Kernel Electric-Fence) support
  LoongArch: Get partial stack information when providing regs parameter
  LoongArch: mm: Add page table mapped mode support for virt_to_page()
  kfence: Defer the assignment of the local variable addr
  LoongArch: Allow building with kcov coverage
  LoongArch: Provide kaslr_offset() to get kernel offset
  LoongArch: Add basic KGDB & KDB support
  LoongArch: Add Loongson Binary Translation (LBT) extension support
  raid6: Add LoongArch SIMD recovery implementation
  raid6: Add LoongArch SIMD syndrome calculation
  LoongArch: Add SIMD-optimized XOR routines
  LoongArch: Allow usage of LSX/LASX in the kernel
  LoongArch: Define symbol 'fault' as a local label in fpu.S
  LoongArch: Adjust {copy, clear}_user exception handler behavior
  LoongArch: Use static defined zero page rather than allocated
  ...
2023-09-08 12:16:52 -07:00
WANG Xuerui
f209132104 raid6: Add LoongArch SIMD recovery implementation
Similar to the syndrome calculation, the recovery algorithms also work
on 64 bytes at a time to align with the L1 cache line size of current
and future LoongArch cores (that we care about). Which means
unrolled-by-4 LSX and unrolled-by-2 LASX code.

The assembly is originally based on the x86 SSSE3/AVX2 ports, but
register allocation has been redone to take advantage of LSX/LASX's 32
vector registers, and instruction sequence has been optimized to suit
(e.g. LoongArch can perform per-byte srl and andi on vectors, but x86
cannot).

Performance numbers measured by instrumenting the raid6test code, on a
3A5000 system clocked at 2.5GHz:

> lasx  2data: 354.987 MiB/s
> lasx  datap: 350.430 MiB/s
> lsx   2data: 340.026 MiB/s
> lsx   datap: 337.318 MiB/s
> intx1 2data: 164.280 MiB/s
> intx1 datap: 187.966 MiB/s

Because recovery algorithms are chosen solely based on priority and
availability, lasx is marked as priority 2 and lsx priority 1. At least
for the current generation of LoongArch micro-architectures, LASX should
always be faster than LSX whenever supported, and have similar power
consumption characteristics (because the only known LASX-capable uarch,
the LA464, always compute the full 256-bit result for vector ops).

Acked-by: Song Liu <song@kernel.org>
Signed-off-by: WANG Xuerui <git@xen0n.name>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2023-09-06 22:53:55 +08:00
WANG Xuerui
8f3f06dfd6 raid6: Add LoongArch SIMD syndrome calculation
The algorithms work on 64 bytes at a time, which is the L1 cache line
size of all current and future LoongArch cores (that we care about), as
confirmed by Huacai. The code is based on the generic int.uc algorithm,
unrolled 4 times for LSX and 2 times for LASX. Further unrolling does
not meaningfully improve the performance according to experiments.

Performance numbers measured during system boot on a 3A5000 @ 2.5GHz:

> raid6: lasx     gen() 12726 MB/s
> raid6: lsx      gen() 10001 MB/s
> raid6: int64x8  gen()  2876 MB/s
> raid6: int64x4  gen()  3867 MB/s
> raid6: int64x2  gen()  2531 MB/s
> raid6: int64x1  gen()  1945 MB/s

Comparison of xor() speeds (from different boots but meaningful anyway):

> lasx:    11226 MB/s
> lsx:     6395 MB/s
> int64x4: 2147 MB/s

Performance as measured by raid6test:

> raid6: lasx     gen() 25109 MB/s
> raid6: lsx      gen() 13233 MB/s
> raid6: int64x8  gen()  4164 MB/s
> raid6: int64x4  gen()  6005 MB/s
> raid6: int64x2  gen()  5781 MB/s
> raid6: int64x1  gen()  4119 MB/s
> raid6: using algorithm lasx gen() 25109 MB/s
> raid6: .... xor() 14439 MB/s, rmw enabled

Acked-by: Song Liu <song@kernel.org>
Signed-off-by: WANG Xuerui <git@xen0n.name>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2023-09-06 22:53:55 +08:00
Ariel Marcovitch
2a15de80dd idr: fix param name in idr_alloc_cyclic() doc
The relevant parameter is 'start' and not 'nextid'

Fixes: 460488c58c ("idr: Remove idr_alloc_ext")
Signed-off-by: Ariel Marcovitch <arielmarcovitch@gmail.com>
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
2023-09-05 19:01:38 -04:00
Philipp Stanner
e7716c74e3 xarray: Document necessary flag in alloc functions
Adds a new line to the docstrings of functions wrapping __xa_alloc() and
__xa_alloc_cyclic(), informing about the necessity of flag XA_FLAGS_ALLOC
being set previously.

The documentation so far says that functions wrapping __xa_alloc() and
__xa_alloc_cyclic() are supposed to return either -ENOMEM or -EBUSY in
case of an error. If the xarray has been initialized without the flag
XA_FLAGS_ALLOC, however, they fail with a different, undocumented error
code.

As hinted at in Documentation/core-api/xarray.rst, wrappers around these
functions should only be invoked when the flag has been set. The
functions' documentation should reflect that as well.

Signed-off-by: Philipp Stanner <pstanner@redhat.com>
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
2023-09-05 19:01:38 -04:00
Jinjie Ruan
9076bc476d kunit: Fix possible memory leak in kunit_filter_suites()
If both filter_glob and filters are not NULL, and kunit_parse_glob_filter()
succeed, but kcalloc parsed_filters fails, the suite_glob and test_glob of
parsed kzalloc in kunit_parse_glob_filter() will be leaked.

As Rae suggested, assign -ENOMEM to *err to correctly free copy and goto
free_parsed_glob to free the suite/test_glob of parsed.

Fixes: 1c9fd080df ("kunit: fix uninitialized variables bug in attributes filtering")
Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
Suggested-by: Rae Moar <rmoar@google.com>
Reviewed-by: David Gow <davidgow@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2023-09-05 12:30:06 -06:00
Jinjie Ruan
2b56a4b79b kunit: Fix possible null-ptr-deref in kunit_parse_glob_filter()
Inject fault while probing kunit-example-test.ko, if kzalloc fails
in kunit_parse_glob_filter(), strcpy() or strncpy() to NULL will
cause below null-ptr-deref bug. So check NULL for kzalloc() and
return int instead of void for kunit_parse_glob_filter().

 Unable to handle kernel paging request at virtual address dfff800000000000
 KASAN: null-ptr-deref in range [0x0000000000000000-0x0000000000000007]
 Mem abort info:
   ESR = 0x0000000096000005
   EC = 0x25: DABT (current EL), IL = 32 bits
   SET = 0, FnV = 0
   EA = 0, S1PTW = 0
   FSC = 0x05: level 1 translation fault
 Data abort info:
   ISV = 0, ISS = 0x00000005, ISS2 = 0x00000000
   CM = 0, WnR = 0, TnD = 0, TagAccess = 0
   GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
 [dfff800000000000] address between user and kernel address ranges
 Internal error: Oops: 0000000096000005 [#1] PREEMPT SMP
 Modules linked in: kunit_example_test cfg80211 rfkill 8021q garp mrp stp llc ipv6 [last unloaded: kunit_example_test]
 CPU: 4 PID: 6047 Comm: modprobe Tainted: G        W        N 6.5.0-next-20230829+ #141
 Hardware name: linux,dummy-virt (DT)
 pstate: 80000005 (Nzcv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
 pc : strncpy+0x58/0xc0
 lr : kunit_filter_suites+0x15c/0xa84
 sp : ffff800082a17420
 x29: ffff800082a17420 x28: 0000000000000000 x27: 0000000000000004
 x26: 0000000000000000 x25: ffffa847e40a5320 x24: 0000000000000001
 x23: 0000000000000000 x22: 0000000000000001 x21: dfff800000000000
 x20: 000000000000002a x19: 0000000000000000 x18: 00000000750b3b54
 x17: 0000000000000000 x16: 0000000000000000 x15: 0000000000000000
 x14: 0000000000000000 x13: 34393178302f3039 x12: ffff7508fcea4ec1
 x11: 1ffff508fcea4ec0 x10: ffff7508fcea4ec0 x9 : dfff800000000000
 x8 : ffff6051b1a7f86a x7 : ffff800082a17270 x6 : 0000000000000002
 x5 : 0000000000000098 x4 : ffff028d9817b250 x3 : 0000000000000000
 x2 : 0000000000000000 x1 : ffffa847e40a5320 x0 : 0000000000000000
 Call trace:
  strncpy+0x58/0xc0
  kunit_filter_suites+0x15c/0xa84
  kunit_module_notify+0x1b0/0x3ac
  blocking_notifier_call_chain+0xc4/0x128
  do_init_module+0x250/0x594
  load_module+0x37b0/0x44b4
  init_module_from_file+0xd4/0x128
  idempotent_init_module+0x2c8/0x524
  __arm64_sys_finit_module+0xac/0x100
  invoke_syscall+0x6c/0x258
  el0_svc_common.constprop.0+0x160/0x22c
  do_el0_svc+0x44/0x5c
  el0_svc+0x38/0x78
  el0t_64_sync_handler+0x13c/0x158
  el0t_64_sync+0x190/0x194
 Code: 5400028a d343fe63 12000a62 39400034 (38f56863)
 ---[ end trace 0000000000000000 ]---
 Kernel panic - not syncing: Oops: Fatal exception
 SMP: stopping secondary CPUs
 Kernel Offset: 0x284761400000 from 0xffff800080000000
 PHYS_OFFSET: 0xfffffd7380000000
 CPU features: 0x88000203,3c020000,1000421b
 Memory Limit: none
 Rebooting in 1 seconds..

Fixes: a127b154a8 ("kunit: tool: allow filtering test cases via glob")
Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
Reviewed-by: Rae Moar <rmoar@google.com>
Reviewed-by: David Gow <davidgow@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2023-09-05 12:30:01 -06:00
Jinjie Ruan
4b00920da1 kunit: Fix the wrong err path and add goto labels in kunit_filter_suites()
Take the last kfree(parsed_filters) and add it to be the first. Take
the first kfree(copy) and add it to be the last. The Best practice is to
return these errors reversely.

And as David suggested, add several labels which target only the things
which actually have been allocated so far.

Fixes: 529534e8cb ("kunit: Add ability to filter attributes")
Fixes: abbf73816b ("kunit: fix possible memory leak in kunit_filter_suites()")
Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
Reviewed-by: Rae Moar <rmoar@google.com>
Suggested-by: David Gow <davidgow@google.com>
Reviewed-by: David Gow <davidgow@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2023-09-05 12:29:55 -06:00
Jinjie Ruan
2810c1e998 kunit: Fix wild-memory-access bug in kunit_free_suite_set()
Inject fault while probing kunit-example-test.ko, if kstrdup()
fails in mod_sysfs_setup() in load_module(), the mod->state will
switch from MODULE_STATE_COMING to MODULE_STATE_GOING instead of
from MODULE_STATE_LIVE to MODULE_STATE_GOING, so only
kunit_module_exit() will be called without kunit_module_init(), and
the mod->kunit_suites is no set correctly and the free in
kunit_free_suite_set() will cause below wild-memory-access bug.

The mod->state state machine when load_module() succeeds:

MODULE_STATE_UNFORMED ---> MODULE_STATE_COMING ---> MODULE_STATE_LIVE
	 ^						|
	 |						| delete_module
	 +---------------- MODULE_STATE_GOING <---------+

The mod->state state machine when load_module() fails at
mod_sysfs_setup():

MODULE_STATE_UNFORMED ---> MODULE_STATE_COMING ---> MODULE_STATE_GOING
	^						|
	|						|
	+-----------------------------------------------+

Call kunit_module_init() at MODULE_STATE_COMING state to fix the issue
because MODULE_STATE_LIVE is transformed from it.

 Unable to handle kernel paging request at virtual address ffffff341e942a88
 KASAN: maybe wild-memory-access in range [0x0003f9a0f4a15440-0x0003f9a0f4a15447]
 Mem abort info:
   ESR = 0x0000000096000004
   EC = 0x25: DABT (current EL), IL = 32 bits
   SET = 0, FnV = 0
   EA = 0, S1PTW = 0
   FSC = 0x04: level 0 translation fault
 Data abort info:
   ISV = 0, ISS = 0x00000004, ISS2 = 0x00000000
   CM = 0, WnR = 0, TnD = 0, TagAccess = 0
   GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
 swapper pgtable: 4k pages, 48-bit VAs, pgdp=00000000441ea000
 [ffffff341e942a88] pgd=0000000000000000, p4d=0000000000000000
 Internal error: Oops: 0000000096000004 [#1] PREEMPT SMP
 Modules linked in: kunit_example_test(-) cfg80211 rfkill 8021q garp mrp stp llc ipv6 [last unloaded: kunit_example_test]
 CPU: 3 PID: 2035 Comm: modprobe Tainted: G        W        N 6.5.0-next-20230828+ #136
 Hardware name: linux,dummy-virt (DT)
 pstate: a0000005 (NzCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
 pc : kfree+0x2c/0x70
 lr : kunit_free_suite_set+0xcc/0x13c
 sp : ffff8000829b75b0
 x29: ffff8000829b75b0 x28: ffff8000829b7b90 x27: 0000000000000000
 x26: dfff800000000000 x25: ffffcd07c82a7280 x24: ffffcd07a50ab300
 x23: ffffcd07a50ab2e8 x22: 1ffff00010536ec0 x21: dfff800000000000
 x20: ffffcd07a50ab2f0 x19: ffffcd07a50ab2f0 x18: 0000000000000000
 x17: 0000000000000000 x16: 0000000000000000 x15: ffffcd07c24b6764
 x14: ffffcd07c24b63c0 x13: ffffcd07c4cebb94 x12: ffff700010536ec7
 x11: 1ffff00010536ec6 x10: ffff700010536ec6 x9 : dfff800000000000
 x8 : 00008fffefac913a x7 : 0000000041b58ab3 x6 : 0000000000000000
 x5 : 1ffff00010536ec5 x4 : ffff8000829b7628 x3 : dfff800000000000
 x2 : ffffff341e942a80 x1 : ffffcd07a50aa000 x0 : fffffc0000000000
 Call trace:
  kfree+0x2c/0x70
  kunit_free_suite_set+0xcc/0x13c
  kunit_module_notify+0xd8/0x360
  blocking_notifier_call_chain+0xc4/0x128
  load_module+0x382c/0x44a4
  init_module_from_file+0xd4/0x128
  idempotent_init_module+0x2c8/0x524
  __arm64_sys_finit_module+0xac/0x100
  invoke_syscall+0x6c/0x258
  el0_svc_common.constprop.0+0x160/0x22c
  do_el0_svc+0x44/0x5c
  el0_svc+0x38/0x78
  el0t_64_sync_handler+0x13c/0x158
  el0t_64_sync+0x190/0x194
 Code: aa0003e1 b25657e0 d34cfc42 8b021802 (f9400440)
 ---[ end trace 0000000000000000 ]---
 Kernel panic - not syncing: Oops: Fatal exception
 SMP: stopping secondary CPUs
 Kernel Offset: 0x4d0742200000 from 0xffff800080000000
 PHYS_OFFSET: 0xffffee43c0000000
 CPU features: 0x88000203,3c020000,1000421b
 Memory Limit: none
 Rebooting in 1 seconds..

Fixes: 3d6e446238 ("kunit: unify module and builtin suite definitions")
Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
Reviewed-by: Rae Moar <rmoar@google.com>
Reviewed-by: David Gow <davidgow@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2023-09-05 12:29:50 -06:00
Linus Torvalds
3c31041e37 printk changes for 6.6
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEESH4wyp42V4tXvYsjUqAMR0iAlPIFAmT1pbAACgkQUqAMR0iA
 lPLoxBAAl18gKo6C8zIBNBoYNl7FxvChrJORjK7RQONs5RYKt8drHjSrJGazhjiV
 TIdbZt9juqs+UT/f6DnkJznrqQ0W70fQsgUpw+q7n7+cnkIoXAiAs+plexdQXigB
 6nx67b2oub41jTwzn/uV7R/eTwq2VnoZqudS/o9iAI/Ia9JzkqmGx08hQedvOoqX
 2SWs140iY/zXsFUyEfe8UTXwJUnC/n9pwtpe5sLpmtyupGc/KumUimTQ+LFJbV9o
 n8QhcQn10CE93M5fH/R2JXjZO7wuSmCHt/V8oSnoOwdCBBe7Tc6aBx5wUwc4XCuC
 450h5hlYBKq97lA1PnWGC01uAkeDTRw8963LVRRqWvohoFuHXR0cisF9FTM9LXfs
 bg90XjzYAK7Ns9fJ0dZHOSbUtRaa06hiExUnO3ctyv2G6h8qKfv86LCuP0CMFmQU
 rflfk1dPiMW20HT3ZJNtMCy3Vu6kVQSdSaGKTnwzTcUWop5tCQxhmWYBXH6q/1LH
 aD7xT1xJnBGqLUqq5C8twoOea+w47x/vtjTLi7mJarP5Wfh8xv6axdkwE8N4NrYp
 cc2RR83a1BZ7At3YkAjfjHmhaZ97gSSe6+Yqk9UzvUEQY/WILEGnb0DKO1jCSB34
 D2NPh1MHF5MFQjazdt/dSyMJVxDlTeY/S5wqfLLhNZts48LJ8n0=
 =D7ZU
 -----END PGP SIGNATURE-----

Merge tag 'printk-for-6.6' of git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux

Pull printk updates from Petr Mladek:

 - Do not try to get the console lock when it is not need or useful in
   panic()

 - Replace the global console_suspended state by a per-console flag

 - Export symbols needed for dumping the raw printk buffer in panic()

 - Fix documentation of printf formats for integer types

 - Moved Sergey Senozhatsky to the reviewer role

 - Misc cleanups

* tag 'printk-for-6.6' of git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux:
  printk: export symbols for debug modules
  lib: test_scanf: Add explicit type cast to result initialization in test_number_prefix()
  printk: ringbuffer: Fix truncating buffer size min_t cast
  printk: Rename abandon_console_lock_in_panic() to other_cpu_in_panic()
  printk: Add per-console suspended state
  printk: Consolidate console deferred printing
  printk: Do not take console lock for console_flush_on_panic()
  printk: Keep non-panic-CPUs out of console lock
  printk: Reduce console_unblank() usage in unsafe scenarios
  kdb: Do not assume write() callback available
  docs: printk-formats: Treat char as always unsigned
  docs: printk-formats: Fix hex printing of signed values
  MAINTAINERS: adjust printk/vsprintf entries
2023-09-04 13:20:19 -07:00
Linus Torvalds
e987af4546 percpu: changes for v6.6
percpu
 * A couple cleanups by Baoquan He and Bibo Mao. The only behavior change
   is to start printing messages if we're under the warn limit for failed
   atomic allocations.
 
 percpu_counter
 * Shakeel introduced percpu counters into mm_struct which caused percpu
   allocations be on the hot path [1]. Originally I spent some time
   trying to improve the percpu allocator, but instead preferred what
   Mateusz Guzik proposed grouping at the allocation site,
   percpu_counter_init_many(). This allows a single percpu allocation to
   be shared by the counters. I like this approach because it creates a
   shared lifetime by the allocations. Additionally, I believe many inits
   have higher level synchronization requirements, like percpu_counter
   does against HOTPLUG_CPU. Therefore we can group these optimizations
   together.
 
 [1] https://lore.kernel.org/linux-mm/20221024052841.3291983-1-shakeelb@google.com/
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEE3hZPHJdcVwe+yTTtiDc0yuoFPR0FAmTv2IUACgkQiDc0yuoF
 PR0+gg//U430Y9jRSKQtbh3dEPaAeWGcTfSTnVHbQGfBj3A4ePJyWl/Tgzri31AC
 rzr8SRs0yX8b82TbECWsV67i/GrntLJyz4yQ52S/RRqVwnQqSn/wicEdCY00lJBt
 Tye8zApOnYBouaYqIOxm/M7ofvKzJ3gWOVeF/zBwM6hwvNaXXtY5r86fSDxoEbhY
 HOFnCDmg5Spf0U50j1G7nV5KfAb7BNA3/HFyzfzH+w+OWi4IGbThsfrg1qvjyFot
 KlEK/kF8Af2xj2A2se4XFsLc2D/Tj+29juYVQqIPBJzVPrZ2uerKSszK5Zcr+Use
 kMiG7tRWKE+2vkOM1RQ5Y5NCVEBhlXlienz1gf/C7247SEGs6OIyqvyDAgPTRx6p
 oR2/vx9hMtaSMf4aHWd+fYS5gNZ05iMvOIbRZnI1wZkQglQVkJvXhzuLaJ+dIGSP
 ypv6XOepik7vDjZ3p3xJXd0TAn4NSkn3jWRetrymdtMFanF99qw1VqjmkLecSil0
 Gr0UhRL1oiMde6niVJrOpdOGLwt/M4N99Y5rksw6NCnktRJ99coFGj7LglZGMsu+
 YkOyjD8MVJXTkBtBNGeqHTKe6nyVkHFq9ad5EmWjPkefP5JziH8i18k7JlF1dLA5
 c8peq3ES659D5f0mU2jilD9PsCsBfSn6Of4ruMZa2Zr1XDD8snI=
 =vcA1
 -----END PGP SIGNATURE-----

Merge tag 'percpu-for-6.6' of git://git.kernel.org/pub/scm/linux/kernel/git/dennis/percpu

Pull percpu updates from Dennis Zhou:
 "One bigger change to percpu_counter's api allowing for init and
  destroy of multiple counters via percpu_counter_init_many() and
  percpu_counter_destroy_many(). This is used to help begin remediating
  a performance regression with percpu rss stats.

  Additionally, it seems larger core count machines are feeling the
  burden of the single threaded allocation of percpu. Mateusz is
  thinking about it and I will spend some time on it too.

  percpu:

   - A couple cleanups by Baoquan He and Bibo Mao. The only behavior
     change is to start printing messages if we're under the warn limit
     for failed atomic allocations.

  percpu_counter:

   - Shakeel introduced percpu counters into mm_struct which caused
     percpu allocations be on the hot path [1]. Originally I spent some
     time trying to improve the percpu allocator, but instead preferred
     what Mateusz Guzik proposed grouping at the allocation site,
     percpu_counter_init_many(). This allows a single percpu allocation
     to be shared by the counters. I like this approach because it
     creates a shared lifetime by the allocations. Additionally, I
     believe many inits have higher level synchronization requirements,
     like percpu_counter does against HOTPLUG_CPU. Therefore we can
     group these optimizations together"

Link: https://lore.kernel.org/linux-mm/20221024052841.3291983-1-shakeelb@google.com/ [1]

* tag 'percpu-for-6.6' of git://git.kernel.org/pub/scm/linux/kernel/git/dennis/percpu:
  kernel/fork: group allocation/free of per-cpu counters for mm struct
  pcpcntr: add group allocation/free
  mm/percpu.c: print error message too if atomic alloc failed
  mm/percpu.c: optimize the code in pcpu_setup_first_chunk() a little bit
  mm/percpu.c: remove redundant check
  mm/percpu: Remove some local variables in pcpu_populate_pte
2023-09-01 15:44:45 -07:00
Linus Torvalds
1c9f8dff62 Char/Misc driver changes for 6.6-rc1
Here is the big set of char/misc and other small driver subsystem
 changes for 6.6-rc1.
 
 Stuff all over the place here, lots of driver updates and changes and
 new additions.  Short summary is:
   - new IIO drivers and updates
   - Interconnect driver updates
   - fpga driver updates and additions
   - fsi driver updates
   - mei driver updates
   - coresight driver updates
   - nvmem driver updates
   - counter driver updates
   - lots of smaller misc and char driver updates and additions
 
 All of these have been in linux-next for a long time with no reported
 problems.
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 -----BEGIN PGP SIGNATURE-----
 
 iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCZPH64g8cZ3JlZ0Brcm9h
 aC5jb20ACgkQMUfUDdst+ynr2QCfd3RKeR+WnGzyEOFhksl30UJJhiIAoNZtYT5+
 t9KG0iMDXRuTsOqeEQbd
 =tVnk
 -----END PGP SIGNATURE-----

Merge tag 'char-misc-6.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc

Pull char/misc driver updates from Greg KH:
 "Here is the big set of char/misc and other small driver subsystem
  changes for 6.6-rc1.

  Stuff all over the place here, lots of driver updates and changes and
  new additions. Short summary is:

   - new IIO drivers and updates

   - Interconnect driver updates

   - fpga driver updates and additions

   - fsi driver updates

   - mei driver updates

   - coresight driver updates

   - nvmem driver updates

   - counter driver updates

   - lots of smaller misc and char driver updates and additions

  All of these have been in linux-next for a long time with no reported
  problems"

* tag 'char-misc-6.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (267 commits)
  nvmem: core: Notify when a new layout is registered
  nvmem: core: Do not open-code existing functions
  nvmem: core: Return NULL when no nvmem layout is found
  nvmem: core: Create all cells before adding the nvmem device
  nvmem: u-boot-env:: Replace zero-length array with DECLARE_FLEX_ARRAY() helper
  nvmem: sec-qfprom: Add Qualcomm secure QFPROM support
  dt-bindings: nvmem: sec-qfprom: Add bindings for secure qfprom
  dt-bindings: nvmem: Add compatible for QCM2290
  nvmem: Kconfig: Fix typo "drive" -> "driver"
  nvmem: Explicitly include correct DT includes
  nvmem: add new NXP QorIQ eFuse driver
  dt-bindings: nvmem: Add t1023-sfp efuse support
  dt-bindings: nvmem: qfprom: Add compatible for MSM8226
  nvmem: uniphier: Use devm_platform_get_and_ioremap_resource()
  nvmem: qfprom: do some cleanup
  nvmem: stm32-romem: Use devm_platform_get_and_ioremap_resource()
  nvmem: rockchip-efuse: Use devm_platform_get_and_ioremap_resource()
  nvmem: meson-mx-efuse: Convert to devm_platform_ioremap_resource()
  nvmem: lpc18xx_otp: Convert to devm_platform_ioremap_resource()
  nvmem: brcm_nvram: Use devm_platform_get_and_ioremap_resource()
  ...
2023-09-01 09:53:54 -07:00
Linus Torvalds
28a4f91f5f Driver core changes for 6.6-rc1
Here is a small set of driver core updates and additions for 6.6-rc1.
 
 Included in here are:
   - stable kernel documentation updates
   - class structure const work from Ivan on various subsystems
   - kernfs tweaks
   - driver core tests!
   - kobject sanity cleanups
   - kobject structure reordering to save space
   - driver core error code handling fixups
   - other minor driver core cleanups
 
 All of these have been in linux-next for a while with no reported
 problems.
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 -----BEGIN PGP SIGNATURE-----
 
 iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCZPH77Q8cZ3JlZ0Brcm9h
 aC5jb20ACgkQMUfUDdst+ylZMACePk8SitfaJc6FfFf5I7YK7Nq0V8MAn0nUjgsR
 i8NcNpu/Yv4HGrDgTdh/
 =PJbk
 -----END PGP SIGNATURE-----

Merge tag 'driver-core-6.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core

Pull driver core updates from Greg KH:
 "Here is a small set of driver core updates and additions for 6.6-rc1.

  Included in here are:

   - stable kernel documentation updates

   - class structure const work from Ivan on various subsystems

   - kernfs tweaks

   - driver core tests!

   - kobject sanity cleanups

   - kobject structure reordering to save space

   - driver core error code handling fixups

   - other minor driver core cleanups

  All of these have been in linux-next for a while with no reported
  problems"

* tag 'driver-core-6.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (32 commits)
  driver core: Call in reversed order in device_platform_notify_remove()
  driver core: Return proper error code when dev_set_name() fails
  kobject: Remove redundant checks for whether ktype is NULL
  kobject: Add sanity check for kset->kobj.ktype in kset_register()
  drivers: base: test: Add missing MODULE_* macros to root device tests
  drivers: base: test: Add missing MODULE_* macros for platform devices tests
  drivers: base: Free devm resources when unregistering a device
  drivers: base: Add basic devm tests for platform devices
  drivers: base: Add basic devm tests for root devices
  kernfs: fix missing kernfs_iattr_rwsem locking
  docs: stable-kernel-rules: mention that regressions must be prevented
  docs: stable-kernel-rules: fine-tune various details
  docs: stable-kernel-rules: make the examples for option 1 a proper list
  docs: stable-kernel-rules: move text around to improve flow
  docs: stable-kernel-rules: improve structure by changing headlines
  base/node: Remove duplicated include
  kernfs: attach uuid for every kernfs and report it in fsid
  kernfs: add stub helper for kernfs_generic_poll()
  x86/resctrl: make pseudo_lock_class a static const structure
  x86/MSR: make msr_class a static const structure
  ...
2023-09-01 09:43:18 -07:00
Linus Torvalds
e0152e7481 RISC-V Patches for the 6.6 Merge Window, Part 1
* Support for the new "riscv,isa-extensions" and "riscv,isa-base" device
   tree interfaces for probing extensions.
 * Support for userspace access to the performance counters.
 * Support for more instructions in kprobes.
 * Crash kernels can be allocated above 4GiB.
 * Support for KCFI.
 * Support for ELFs in !MMU configurations.
 * ARCH_KMALLOC_MINALIGN has been reduced to 8.
 * mmap() defaults to sv48-sized addresses, with longer addresses hidden
   behind a hint (similar to Arm and Intel).
 * Also various fixes and cleanups.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCAAxFiEEKzw3R0RoQ7JKlDp6LhMZ81+7GIkFAmTx96kTHHBhbG1lckBk
 YWJiZWx0LmNvbQAKCRAuExnzX7sYiVjRD/9DYVLlkQ/OEDJjPaEcYCP49xgIVUUU
 lhs3XbSs2VNHBaiG114f6Q0AaT/uNi+uqSej3CeTmEot2kZkBk/f2yu+UNIriPZ9
 GQiZsdyXhu921C+5VFtiI47KDWOVZ+Jpy3M1ll61IWt3yPSQHr1xOP0AOiyHHqe3
 cmqpNnzjajlfVDoXPc2mGGzUJt/7ar4thcwnMNi98raXR5Qh7SP6rrHjoQhE1oFk
 LMP3CHqEAcHE2tE4CxZVpc6HOQ5m0LpQIOK7ypufGMyoIYESm5dt/JOT4MlhTtDw
 6JzyVKtiM7lartUnUaW3ZoX4trQYT5gbXxWrJ2gCnUGy3VulikoXr1Rpz0qfdeOR
 XN8OLkVAqHfTGFI7oKk24f9Adw96R5NPZcdCay90h4J/kMfCiC7ZyUUI1XIa5iy1
 np5pZCkf8HNcdywML7qcFd5n2O0wchyFnRLFZo6kJP9Ls5cEi6kBx/1jSdTcNgx/
 fUKXyoEcriGoQiiwn29+4RZnU69gJV3zqQNLPpuwDQ5F/Q1zHTlrr+dqzezKkzcO
 dRTV2d2Q4A5vIDXPptzNNLlRQdrc8qxPJ1lxQVkPIU4/mtqczmZBwlyY2u9zwPyS
 sehJgJZnoAf+jm71NgQAKLck4MUBsMnMogOWunhXkVRCoZlbbkUWX4ECZYwPKsVk
 W7zVPmLvSM0l5g==
 =/tXb
 -----END PGP SIGNATURE-----

Merge tag 'riscv-for-linus-6.6-mw1' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux

Pull RISC-V updates from Palmer Dabbelt:

 - Support for the new "riscv,isa-extensions" and "riscv,isa-base"
   device tree interfaces for probing extensions

 - Support for userspace access to the performance counters

 - Support for more instructions in kprobes

 - Crash kernels can be allocated above 4GiB

 - Support for KCFI

 - Support for ELFs in !MMU configurations

 - ARCH_KMALLOC_MINALIGN has been reduced to 8

 - mmap() defaults to sv48-sized addresses, with longer addresses hidden
   behind a hint (similar to Arm and Intel)

 - Also various fixes and cleanups

* tag 'riscv-for-linus-6.6-mw1' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: (51 commits)
  lib/Kconfig.debug: Restrict DEBUG_INFO_SPLIT for RISC-V
  riscv: support PREEMPT_DYNAMIC with static keys
  riscv: Move create_tmp_mapping() to init sections
  riscv: Mark KASAN tmp* page tables variables as static
  riscv: mm: use bitmap_zero() API
  riscv: enable DEBUG_FORCE_FUNCTION_ALIGN_64B
  riscv: remove redundant mv instructions
  RISC-V: mm: Document mmap changes
  RISC-V: mm: Update pgtable comment documentation
  RISC-V: mm: Add tests for RISC-V mm
  RISC-V: mm: Restrict address space for sv39,sv48,sv57
  riscv: enable DMA_BOUNCE_UNALIGNED_KMALLOC for !dma_coherent
  riscv: allow kmalloc() caches aligned to the smallest value
  riscv: support the elf-fdpic binfmt loader
  binfmt_elf_fdpic: support 64-bit systems
  riscv: Allow CONFIG_CFI_CLANG to be selected
  riscv/purgatory: Disable CFI
  riscv: Add CFI error handling
  riscv: Add ftrace_stub_graph
  riscv: Add types to indirectly called assembly functions
  ...
2023-09-01 08:09:48 -07:00
David Gow
dce19a3fed kunit: test: Make filter strings in executor_test writable
KUnit's attribute filtering feature needs the filter strings passed in
to be writable, as it modifies them in-place during parsing. This works
for the filters passed on the kernel command line, but the string
literals used in the executor tests are at least theoretically read-only
(though they work on x86_64 for some reason). s390 wasn't fooled, and
crashed when these tests were run.

Use a 'char[]' instead, (and make an explicit variable for the current
filter in parse_filter_attr_test), which will store the string in a
writable segment.

Fixes: 76066f93f1 ("kunit: add tests for filtering attributes")
Closes: https://lore.kernel.org/linux-kselftest/55950256-c00a-4d21-a2c0-cf9f0e5b8a9a@roeck-us.net/
Signed-off-by: David Gow <davidgow@google.com>
Tested-by: Guenter Roeck <linux@roeck-us.net>
Reviewed-by: Rae Moar <rmoar@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2023-09-01 09:09:25 -06:00
Nathan Chancellor
89775a27ff
lib/Kconfig.debug: Restrict DEBUG_INFO_SPLIT for RISC-V
When building for ARCH=riscv using LLVM < 14, there is an error with
CONFIG_DEBUG_INFO_SPLIT=y:

  error: A dwo section may not contain relocations

This was worked around in LLVM 15 by disallowing '-gsplit-dwarf' with
'-mrelax' (the default), so CONFIG_DEBUG_INFO_SPLIT is not selectable
with newer versions of LLVM:

  $ clang --target=riscv64-linux-gnu -gsplit-dwarf -c -o /dev/null -x c /dev/null
  clang: error: -gsplit-dwarf is unsupported with RISC-V linker relaxation (-mrelax)

GCC silently had a similar issue that was resolved with GCC 12.x.
Restrict CONFIG_DEBUG_INFO_SPLIT for RISC-V when using LLVM or GCC <
12.x to avoid these known issues.

Link: https://github.com/ClangBuiltLinux/linux/issues/1914
Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99090
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/all/202308090204.9yZffBWo-lkp@intel.com/
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Reviewed-by: Fangrui Song <maskray@google.com>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Link: https://lore.kernel.org/r/20230816-riscv-debug_info_split-v1-1-d1019d6ccc11@kernel.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2023-08-31 00:18:37 -07:00
Jisheng Zhang
3ed8513cae
riscv: enable DEBUG_FORCE_FUNCTION_ALIGN_64B
Allow to force all function address 64B aligned as it is possible for
other architectures. This may be useful when verify if performance
bump is caused by function alignment changes.

Before commit 1bf18da621 ("lib/Kconfig.debug: add ARCH dependency
for FUNCTION_ALIGN option"), riscv supports enabling the
DEBUG_FORCE_FUNCTION_ALIGN_64B option, but after that commit, each
arch needs to claim the support explicitly.

Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
Link: https://lore.kernel.org/r/20230727160356.3874-1-jszhang@kernel.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2023-08-31 00:18:26 -07:00
Linus Torvalds
f8fd5c2483 This pull request is full of clk driver changes. In fact, there aren't any
changes to the clk framework this time around. That's probably because everyone
 was on vacation (yours truly included). We did lose a couple clk drivers this
 time around because nobody was using those devices. That skews the diffstat a
 bit, but either way, nothing looks out of the ordinary here. The usual suspects
 are chugging along adding support for more SoCs and fixing bugs.
 
 If I had to choose, I'd say the theme for the past few months has been
 "polish". There's quite a few patches that migrate to
 devm_platform_ioremap_resource() in here. And there's more than a handful of
 patches that move the NR_CLKS define from the DT binding header to the driver.
 There's even patches that migrate drivers to use clk_parent_data and clk_hw to
 describe clk tree topology. It seems that the spring (summer?) cleaning bug got
 some folks, or the semiconductor shortage finally hit the software side.
 
 New Drivers:
  - StarFive JH7110 SoC clock drivers
  - Qualcomm IPQ5018 Global Clock Controller driver
  - Versa3 clk generator to support 48KHz playback/record with audio codec on
    RZ/G2L SMARC EVK
 
 Removed Drivers:
  - Remove non-OF mmp clk drivers
  - Remove OXNAS clk driver
 
 Updates:
  - Add __counted_by to struct clk_hw_onecell_data and struct spmi_pmic_div_clk_cc
  - Move defines for numbers of clks (NR_CLKS) from DT headers to drivers
  - Introduce kstrdup_and_replace() and use it
  - Add PLL rates for Rockchip rk3568
  - Add the display clock tree for Rockchip rv1126
  - Add Audio Clock Generator (ADG) clocks on Renesas R-Car Gen3 and RZ/G2 SoCs
  - Convert sun9i-mmc clock to use devm_platform_get_and_ioremap_resource()
  - Fix function name in a comment in ccu_mmc_timing.c
  - Parameter name correction for ccu_nkm_round_rate()
  - Implement CLK_SET_RATE_PARENT for Allwinner NKM clocks, i.e. consider alternative
    parent rates when determining clock rates
  - Set CLK_SET_RATE_PARENT for Allwinner A64 pll-mipi
  - Support finding closest (as opposed to closest but not higher) clock rate
    for NM, NKM, mux and div type clocks, as use it for Allwinner A64 pll-video0
  - Prefer current parent rate if able to generate ideal clock rate for Allwinner NKM clocks
  - Clean up Qualcomm SMD RPM driver, with interconnect bus clocks moved out to
    the interconnect drivers
  - Fix various PM runtime bugs across many Qualcomm clk drivers
  - Migrate Qualcomm MDM9615 is to parent_hw and parent_data
  - Add network related resets on Qualcomm IPQ4019
  - Add a couple missing USB related clocks to Qualcomm IPQ9574
  - Add missing gpll0_sleep_clk_src to Qualcomm MSM8917 global clock controller
  - In the Qualcomm QDU1000 global clock controller, GDSCs, clkrefs, and GPLL1 are
    added, while PCIe pipe clock, SDCC rcg ops are corrected
  - Add missing GDSCs to and correct GDSCs for the SC8280XP global clock controller driver
  - Support retention for the Qualcomm SC8280XP display clock controller GDSCs.
  - Qualcommm's SDCC apps_clk_src is marked with CLK_OPS_PARENT_ENABLE to fix
    issues with missing parent clocks across sc7180, sm7150, sm6350 and sm8250,
    while sm8450 is corrected to use floor ops
  - Correct Qualcomm SM6350 GPU clock controller's clock supplies
  - Drop unwanted clocks from the Qualcomm IPQ5332 GCC driver
  - Add missing OXILICX GDSC to Qualcomm MSM8226 GCC
  - Change the delay in the Qualcomm reset controller to fsleep() for correctness
  - Extend the Qualcomm SM83550 Video clock controller to support SC8280XP
  - Add graphics clock support on Renesas RZ/G2M, RZ/G2N, RZ/G2E, and R-Car H3,
    M3-W, and M3-N SoCs
  - Add Clocked Serial Interface (CSI) clocks on Renesas RZ/V2M
  - Add PWM (MTU3) clock and reset on Renesas RZ/G2UL and RZ/Five
  - Add the PDM IPC clock for i.MX93
  - Add 519.75MHz frequency support for i.MX9 PLL
  - Simplify the .determine_rate() implementation for i.MX GPR mux
  - Make the i.MX8QXP LPCG clock use devm_platform_ioremap_resource()
  - Add the audio mux clock to i.MX8
  - Fix the SPLL2 MULT range for PLLv4
  - Update the SPLL2 type in i.MX8ULP
  - Fix the SAI4 clock on i.MX8MP
  - Add silicon revision print for i.MX25 on clocks init
  - Drop the return value from __mx25_clocks_init()
  - Fix the clock pauses on no-op set_rate for i.MX8M composite clock
  - Drop restrictions for i.MX PLL14xx and fix its max prediv value
  - Drop the 393216000 and 361267200 from i.MX PLL14xx rate table to allow
    glitch free switching
 -----BEGIN PGP SIGNATURE-----
 
 iQJFBAABCAAvFiEE9L57QeeUxqYDyoaDrQKIl8bklSUFAmTv2wkRHHNib3lkQGtl
 cm5lbC5vcmcACgkQrQKIl8bklSW1LRAAuHR2HoyB4bRHmCa1bfOfYYDfSWsBWEav
 tWIfBl86Nl/Je50Gk2NJ9vqU5OPqRZ57TIniijHHoX5n7/kYcr8KVmlomY07hUeg
 CzWyothkxg4k7+rQwVAWvmlR2YAVwzHDKcwq7gkMZOnW/y26LXip99cjopu2CJLx
 zVwTgvWollmd4KVlicnAlx4zUjgNkWR24iA4Lcf5ir+Dr6FYNjxLI+akBA8EPxxi
 wLixZbScgBSgpGn6KVgoFhclCToPS0gt5m6HfQxJ/svOCU54l+jRKpzkNZGWvyu4
 A8t3CRrwL2iS/mfCGk2yRlaKySoLLpjlpW1AI7fHTWbG2P6p8ZphtN7jOeeAEsbq
 TNpzWEjtY6B/lfRzxxINXkrtLaqmlnFY/P5np5fDrf/61gRFxLFQemyRdY/xCSJf
 Kwq8ja1mrSGWoDGG9XhDqTf9Yek9LRObNzlDrEmn/i/qLTcxhOIz58pzHg4iAlx5
 9HDtnJ8hKg4uE1TtT12Bmasb1+WzG7GYYESNfKWZhCvbRqEUzcDOHk7xpwYa1ffx
 yZIgMs7Sb/exNW8LMPYmgnyj/f9eo5IdjiQvune+Zy5NrdzfyN6Sf/LSibrqCF2z
 X5aFHqQrR8+PifD+se+g5HPa0ezSmBIhXzYUTOC6f+nywlrJjhwDXPDYI6Lcd//p
 r4mpOmJS+G4=
 =h2Jz
 -----END PGP SIGNATURE-----

Merge tag 'clk-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux

Pull clk subsystem updates from Stephen Boyd:
 "This pull request is full of clk driver changes. In fact, there aren't
  any changes to the clk framework this time around. That's probably
  because everyone was on vacation (yours truly included). We did lose a
  couple clk drivers this time around because nobody was using those
  devices. That skews the diffstat a bit, but either way, nothing looks
  out of the ordinary here. The usual suspects are chugging along adding
  support for more SoCs and fixing bugs.

  If I had to choose, I'd say the theme for the past few months has been
  "polish". There's quite a few patches that migrate to
  devm_platform_ioremap_resource() in here. And there's more than a
  handful of patches that move the NR_CLKS define from the DT binding
  header to the driver. There's even patches that migrate drivers to use
  clk_parent_data and clk_hw to describe clk tree topology. It seems
  that the spring (summer?) cleaning bug got some folks, or the
  semiconductor shortage finally hit the software side.

  New Drivers:
   - StarFive JH7110 SoC clock drivers
   - Qualcomm IPQ5018 Global Clock Controller driver
   - Versa3 clk generator to support 48KHz playback/record with audio
     codec on RZ/G2L SMARC EVK

  Removed Drivers:
   - Remove non-OF mmp clk drivers
   - Remove OXNAS clk driver

  Updates:
   - Add __counted_by to struct clk_hw_onecell_data and struct
     spmi_pmic_div_clk_cc
   - Move defines for numbers of clks (NR_CLKS) from DT headers to
     drivers
   - Introduce kstrdup_and_replace() and use it
   - Add PLL rates for Rockchip rk3568
   - Add the display clock tree for Rockchip rv1126
   - Add Audio Clock Generator (ADG) clocks on Renesas R-Car Gen3 and
     RZ/G2 SoCs
   - Convert sun9i-mmc clock to use
     devm_platform_get_and_ioremap_resource()
   - Fix function name in a comment in ccu_mmc_timing.c
   - Parameter name correction for ccu_nkm_round_rate()
   - Implement CLK_SET_RATE_PARENT for Allwinner NKM clocks, i.e.
     consider alternative parent rates when determining clock rates
   - Set CLK_SET_RATE_PARENT for Allwinner A64 pll-mipi
   - Support finding closest (as opposed to closest but not higher)
     clock rate for NM, NKM, mux and div type clocks, as use it for
     Allwinner A64 pll-video0
   - Prefer current parent rate if able to generate ideal clock rate for
     Allwinner NKM clocks
   - Clean up Qualcomm SMD RPM driver, with interconnect bus clocks
     moved out to the interconnect drivers
   - Fix various PM runtime bugs across many Qualcomm clk drivers
   - Migrate Qualcomm MDM9615 is to parent_hw and parent_data
   - Add network related resets on Qualcomm IPQ4019
   - Add a couple missing USB related clocks to Qualcomm IPQ9574
   - Add missing gpll0_sleep_clk_src to Qualcomm MSM8917 global clock
     controller
   - In the Qualcomm QDU1000 global clock controller, GDSCs, clkrefs,
     and GPLL1 are added, while PCIe pipe clock, SDCC rcg ops are
     corrected
   - Add missing GDSCs to and correct GDSCs for the SC8280XP global
     clock controller driver
   - Support retention for the Qualcomm SC8280XP display clock
     controller GDSCs.
   - Qualcommm's SDCC apps_clk_src is marked with CLK_OPS_PARENT_ENABLE
     to fix issues with missing parent clocks across sc7180, sm7150,
     sm6350 and sm8250, while sm8450 is corrected to use floor ops
   - Correct Qualcomm SM6350 GPU clock controller's clock supplies
   - Drop unwanted clocks from the Qualcomm IPQ5332 GCC driver
   - Add missing OXILICX GDSC to Qualcomm MSM8226 GCC
   - Change the delay in the Qualcomm reset controller to fsleep() for
     correctness
   - Extend the Qualcomm SM83550 Video clock controller to support
     SC8280XP
   - Add graphics clock support on Renesas RZ/G2M, RZ/G2N, RZ/G2E, and
     R-Car H3, M3-W, and M3-N SoCs
   - Add Clocked Serial Interface (CSI) clocks on Renesas RZ/V2M
   - Add PWM (MTU3) clock and reset on Renesas RZ/G2UL and RZ/Five
   - Add the PDM IPC clock for i.MX93
   - Add 519.75MHz frequency support for i.MX9 PLL
   - Simplify the .determine_rate() implementation for i.MX GPR mux
   - Make the i.MX8QXP LPCG clock use devm_platform_ioremap_resource()
   - Add the audio mux clock to i.MX8
   - Fix the SPLL2 MULT range for PLLv4
   - Update the SPLL2 type in i.MX8ULP
   - Fix the SAI4 clock on i.MX8MP
   - Add silicon revision print for i.MX25 on clocks init
   - Drop the return value from __mx25_clocks_init()
   - Fix the clock pauses on no-op set_rate for i.MX8M composite clock
   - Drop restrictions for i.MX PLL14xx and fix its max prediv value
   - Drop the 393216000 and 361267200 from i.MX PLL14xx rate table to
     allow glitch free switching"

* tag 'clk-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux: (207 commits)
  clk: qcom: Fix SM_GPUCC_8450 dependencies
  clk: lmk04832: Support using PLL1_LD as SPI readback pin
  clk: lmk04832: Don't disable vco clock on probe fail
  clk: lmk04832: Set missing parent_names for output clocks
  clk: mvebu: Convert to devm_platform_ioremap_resource()
  clk: nuvoton: Convert to devm_platform_ioremap_resource()
  clk: socfpga: agilex: Convert to devm_platform_ioremap_resource()
  clk: ti: Use devm_platform_get_and_ioremap_resource()
  clk: mediatek: Convert to devm_platform_ioremap_resource()
  clk: hsdk-pll: Convert to devm_platform_ioremap_resource()
  clk: gemini: Convert to devm_platform_ioremap_resource()
  clk: fsl-sai: Convert to devm_platform_ioremap_resource()
  clk: bm1880: Convert to devm_platform_ioremap_resource()
  clk: axm5516: Convert to devm_platform_ioremap_resource()
  clk: actions: Convert to devm_platform_ioremap_resource()
  clk: cdce925: Remove redundant of_match_ptr()
  clk: pxa910: Move number of clocks to driver source
  clk: pxa1928: Move number of clocks to driver source
  clk: pxa168: Move number of clocks to driver source
  clk: mmp2: Move number of clocks to driver source
  ...
2023-08-30 19:53:39 -07:00
Linus Torvalds
ef2a0b7cdb Devicetree include cleanups for v6.6:
These are the remaining few clean-ups of DT related includes which
 didn't get applied to subsystem trees.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEktVUI4SxYhzZyEuo+vtdtY28YcMFAmTucUoACgkQ+vtdtY28
 YcOYoQ//RwIPeWc74PHQbOb6eQR95eTHDcDE1MR9Fw8amqxFaomGlSMpbyVyP4ag
 8p82c6qfJIZautyEikbKFO+iYjFMua0KuOTMVuDxHErQOl6ym4P4Uk3+1h5stVSj
 IdfK4CACtMKxKBOPAcyxJU6HKoWcUtMKsKV6OLdDh7M2Fy/G4RCjv4w1Xf3VAn59
 VOa0KF7FhHU3dhIB/tGsj0t13+3e3kF5+l4+pdoMoZWhR4gac5FJRxiR5dMZG6jr
 VY8i9FZb7DW2VtY78FVVOaYDDVf4vNrc+0kqnCbWUaKACHPgNXC375LvS7jFGXvc
 HYVN3teqhFxNOyoSehn2bdBVwJxjQFgy2gTt2vRWTa/CaUDES90cue2R9GT2Sz0b
 eBc3DQtNeT5m8mrLkuEfZrJjKjaEy2Pr6FjNDhNcmkJak7dkMMgkG/Y/SpNmpZOe
 2C3T6i4i6FUxni/2/rWHSVLnYBGfhPNdwWAZcQOi8rqtzp3tF46wVa345+Ev3VDG
 ECDndH8Qk3gtOmGyeTIvPc51yDP6Hpuh7+0jydtehkXHB+cUJtR+g0efIGf7BDgo
 sQpa1vRxkOolrCxyzKwcogEY7jjeccv/FM7BwaZQKXEibiKGkxeDuahdwbfvDuVq
 br16Uj9VzG8Jl6KK0gexV7kzZAAdw1y3JqPGUZf7hn4zmk099ow=
 =eLMf
 -----END PGP SIGNATURE-----

Merge tag 'devicetree-header-cleanups-for-6.6' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux

Pull devicetree include cleanups from Rob Herring:
 "These are the remaining few clean-ups of DT related includes which
  didn't get applied to subsystem trees"

* tag 'devicetree-header-cleanups-for-6.6' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux:
  ipmi: Explicitly include correct DT includes
  tpm: Explicitly include correct DT includes
  lib/genalloc: Explicitly include correct DT includes
  parport: Explicitly include correct DT includes
  sbus: Explicitly include correct DT includes
  mux: Explicitly include correct DT includes
  macintosh: Explicitly include correct DT includes
  hte: Explicitly include correct DT includes
  EDAC: Explicitly include correct DT includes
  clocksource: Explicitly include correct DT includes
  sparc: Explicitly include correct DT includes
  riscv: Explicitly include correct DT includes
2023-08-30 17:04:28 -07:00
Linus Torvalds
4fb0dacb78 sound updates for 6.6-rc1
We've received a fairly wide range of changes at this time, including
 for ALSA and ASoC core, but all of them are rather small changes.
 
 Here are some highlights:
 
 ALSA / ASoC Core:
 - Fixes of inconsistent locking around control API helpers
 - A few new control API functions and cleanups
 - Workarounds for potential UAFs by delayed kobj releases
 - Unified PCM copy ops with iov_iter
 - Continued efforts for ASoC API cleanups
 
 ASoC:
 - An adaptor to allow use of IIO DACs and ADCs in ASoC which pulls in
   some IIO changes
 - Create a library function for intlog10() and use it in the NAU8825
   driver
 - Convert drivers to use the more modern maple tree register cache
 - Lots of work on the SOF framework, AMD and Intel drivers, including a
   lot of cleanup and new device support
 - Standardization of the presentation of jacks from drivers
 - Provision of some generic sound card DT properties
 - Support for AMD Van Gogh, AMD machines with MAX98388 and NAU8821,
   AWInic AW88261, Cirrus Logic CS35L36 and CS42L43, various Intel
   platforms including AVS machines with ES8336 and RT5663, Mediatek
   MT7986, NXP i.MX93, RealTek RT1017 and StarFive JH7110
 
 Others:
 - New test coverage including ASoC and topology tests in KUnit;
   this also involves enabling UML builds of ALSA since that's the
   default KUnit test environment which pulls in the addition of some
   stubs to the driver
 - More enhancement of pcmtest driver
 - A few fixes / enhancements of MIDI 2.0 UMP core
 - Using PCI definitions in allover HD-audio code
 - Support for Cirrus CS35L56 and TI TAS2781 HD-audio sub-codecs
 - CS35L41 HD-audio sub-codec improvements
 - Continued emu10k1 improvements
 -----BEGIN PGP SIGNATURE-----
 
 iQJCBAABCAAsFiEEIXTw5fNLNI7mMiVaLtJE4w1nLE8FAmTvJ1oOHHRpd2FpQHN1
 c2UuZGUACgkQLtJE4w1nLE8/3w//XgX9aEmlr7ZD2s6sOolfYMFajq/mOBaFkw53
 iDZOJs+jQNmx/BfVlaio8/hinkV8lUrjjPiBVwF6AWy3a2V9RmgqYtYvhlWIj2jW
 95eqUFtpGq4pR2KM9kEfHsZEZO+LynpF3nQ0Zy1ShNZQv5H5SQ1Hi2N1btTkRq2y
 HcHGc7bosoBYPCiF5gm/u93h/u1oW7E1IEoxEjYDySvmvapQ6SiYQYX6jLRRda9T
 PxCz1sMerkglqFif2OVWB7MJQ4C1xQlVElVItKIxHwjvbwP0bmg32qY5+qI9M8vw
 2VpDk1oXKBqFrdy5zDXL+zIj5WQ9BD2HFvfhiodfNNiI/eyTg/cVn1HysZ3CD0lh
 JU1j0pL7lwJkcgexEZqXqmshTGz0QrsJZQqa2WIHyl74xmwydxytzSdM/cEtPwt5
 fo1/H6gfDHBZj4JzkZZs8/aGj0rnzlasHds6kROzN73D7dMx3SNTP9sotEksyAJ/
 8EY2JFrD1rYSOuArFLYdLK8FDlbpICAGRMjnuosglGJxOzyh5faCtijTu3LhhBfh
 QuIus+Q+mc454LZUPaoRPBiUqAp296YqsJGNv1v02s/BLNy4HGDWgQ5j6GUvi6Ew
 0lTaOtjXscC6e091OG9LFShUFd4YAEetWdlAHwcMJbcoeubteZqz75YHvq7erzA8
 rVOxUmM=
 =Rr0i
 -----END PGP SIGNATURE-----

Merge tag 'sound-6.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound

Pull sound updates from Takashi Iwai:
 "We've received a fairly wide range of changes at this time, including
  for ALSA and ASoC core, but all of them are rather small changes.

  Here are some highlights:

  ALSA / ASoC Core:
   - Fixes of inconsistent locking around control API helpers
   - A few new control API functions and cleanups
   - Workarounds for potential UAFs by delayed kobj releases
   - Unified PCM copy ops with iov_iter
   - Continued efforts for ASoC API cleanups

  ASoC:
   - An adaptor to allow use of IIO DACs and ADCs in ASoC which pulls in
     some IIO changes
   - Create a library function for intlog10() and use it in the NAU8825
     driver
   - Convert drivers to use the more modern maple tree register cache
   - Lots of work on the SOF framework, AMD and Intel drivers, including
     a lot of cleanup and new device support
   - Standardization of the presentation of jacks from drivers
   - Provision of some generic sound card DT properties
   - Support for AMD Van Gogh, AMD machines with MAX98388 and NAU8821,
     AWInic AW88261, Cirrus Logic CS35L36 and CS42L43, various Intel
     platforms including AVS machines with ES8336 and RT5663, Mediatek
     MT7986, NXP i.MX93, RealTek RT1017 and StarFive JH7110

  Others:
   - New test coverage including ASoC and topology tests in KUnit; this
     also involves enabling UML builds of ALSA since that's the default
     KUnit test environment which pulls in the addition of some stubs to
     the driver
   - More enhancement of pcmtest driver
   - A few fixes / enhancements of MIDI 2.0 UMP core
   - Using PCI definitions in allover HD-audio code
   - Support for Cirrus CS35L56 and TI TAS2781 HD-audio sub-codecs
   - CS35L41 HD-audio sub-codec improvements
   - Continued emu10k1 improvements"

* tag 'sound-6.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: (693 commits)
  ALSA: pcm: Fix missing fixup call in compat hw_refine ioctl
  ASoC: dwc: i2s: Fix unused functions
  ALSA: usb-audio: Don't try to submit URBs after disconnection
  ALSA: emu10k1: add separate documentation for E-MU cards
  ALSA: emu10k1: more documentation updates
  ALSA: emu10k1: de-duplicate audigy-mixer.rst vs. sb-live-mixer.rst
  ALSA: ump: Fix -Wformat-truncation warnings
  ALSA: hda: Add missing dependency on CONFIG_EFI for Cirrus/TI sub-codecs
  ALSA: doc: Fix missing backquote in midi-2.0.rst
  ALSA: hda/realtek: Add quirk for mute LEDs on HP ENVY x360 15-eu0xxx
  ALSA: hda/tas2781: Switch back to use struct i2c_driver's .probe()
  ASoC: soc-core.c: Do not error if a DAI link component is not found
  ASoC: codecs: Fix error code in aw88261_i2c_probe()
  ASoC: audio-graph-card.c: move audio_graph_parse_of()
  ASoC: cs42l43: Use new-style PM runtime macros
  ALSA: documentation: Add description for USB MIDI 2.0 gadget driver
  ALSA: ump: Don't create unused substreams for static blocks
  ALSA: ump: Fill group names for legacy rawmidi substreams
  ALSA: usb-audio: Attach legacy rawmidi after probing all UMP EPs
  ALSA: ac97: Fix possible error value of *rac97
  ...
2023-08-30 13:45:05 -07:00
Huacai Chen
9d1785590b Merge tag 'md-next-20230814-resend' into loongarch-next
LoongArch architecture changes for 6.5 (raid5/6 optimization) depend on
the md changes to fix build and work, so merge them to create a base.
2023-08-30 17:35:54 +08:00
Linus Torvalds
3d3dfeb3ae for-6.6/block-2023-08-28
-----BEGIN PGP SIGNATURE-----
 
 iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAmTs08EQHGF4Ym9lQGtl
 cm5lbC5kawAKCRD301j7KXHgpqa4EACu/zKE+omGXBV0Q7kEpVsChjp0ElGtSDIJ
 tJfTuvnWqQjrqRv4ksmZvGdx8SkqFuXri4/7oBXlsaqeUVbIQdWJUpLErBye6nxa
 lUb6nXOFWwyG94cMRYs71lN0loosjb7aiVw7oVLAIhntq3p3doFl/cyy3ndMZrUE
 pZbsrWSt4QiOKhcO0TtIjfAwsr31AN51qFiNNITEiZl3UjXfkGRCK81X0yM2N8zZ
 7Y0h1ldPBsZ/olNWeRyaW1uB64nKM0buR7/nDxCV/NI05nndJ34bIgo/JIj4xy0v
 SiBj2+y86+oMJZt17yYENwOQdtX3hbyESGuVm9dCrO0t9/byVQxkUk0OMm65BM/l
 l2d+gmMQZTbHziqfLlgq9i3i9+B4C2hsb7iBpuo7SW/FPbM45POgi3lpiZycaZyu
 krQo1qwL4KSGXzGN9CabEuKDcJcXqLxqMDOyEDA3R5Kz06V9tNuM+Di/mr4vuZHK
 sVHUfHuWBO9ionLlGPdc3fH/CuMqic8SHjumiAm2menBZV6cSzRDxpm6H4CyLt7y
 tWmw7BNU7dfHFGd+Jw0Ld49sAuEybszEXq6qYv5uYBVfJNqDvOvEeVoQp0RN2jJA
 AG30hymcZgxn9n7gkIgkPQDgIGUjnzUR8B2mE2UFU1CYVHXYXAXU55CCI5oeTkbs
 d0Y/zCZf1A==
 =p1bd
 -----END PGP SIGNATURE-----

Merge tag 'for-6.6/block-2023-08-28' of git://git.kernel.dk/linux

Pull block updates from Jens Axboe:
 "Pretty quiet round for this release. This contains:

   - Add support for zoned storage to ublk (Andreas, Ming)

   - Series improving performance for drivers that mark themselves as
     needing a blocking context for issue (Bart)

   - Cleanup the flush logic (Chengming)

   - sed opal keyring support (Greg)

   - Fixes and improvements to the integrity support (Jinyoung)

   - Add some exports for bcachefs that we can hopefully delete again in
     the future (Kent)

   - deadline throttling fix (Zhiguo)

   - Series allowing building the kernel without buffer_head support
     (Christoph)

   - Sanitize the bio page adding flow (Christoph)

   - Write back cache fixes (Christoph)

   - MD updates via Song:
      - Fix perf regression for raid0 large sequential writes (Jan)
      - Fix split bio iostat for raid0 (David)
      - Various raid1 fixes (Heinz, Xueshi)
      - raid6test build fixes (WANG)
      - Deprecate bitmap file support (Christoph)
      - Fix deadlock with md sync thread (Yu)
      - Refactor md io accounting (Yu)
      - Various non-urgent fixes (Li, Yu, Jack)

   - Various fixes and cleanups (Arnd, Azeem, Chengming, Damien, Li,
     Ming, Nitesh, Ruan, Tejun, Thomas, Xu)"

* tag 'for-6.6/block-2023-08-28' of git://git.kernel.dk/linux: (113 commits)
  block: use strscpy() to instead of strncpy()
  block: sed-opal: keyring support for SED keys
  block: sed-opal: Implement IOC_OPAL_REVERT_LSP
  block: sed-opal: Implement IOC_OPAL_DISCOVERY
  blk-mq: prealloc tags when increase tagset nr_hw_queues
  blk-mq: delete redundant tagset map update when fallback
  blk-mq: fix tags leak when shrink nr_hw_queues
  ublk: zoned: support REQ_OP_ZONE_RESET_ALL
  md: raid0: account for split bio in iostat accounting
  md/raid0: Fix performance regression for large sequential writes
  md/raid0: Factor out helper for mapping and submitting a bio
  md raid1: allow writebehind to work on any leg device set WriteMostly
  md/raid1: hold the barrier until handle_read_error() finishes
  md/raid1: free the r1bio before waiting for blocked rdev
  md/raid1: call free_r1bio() before allow_barrier() in raid_end_bio_io()
  blk-cgroup: Fix NULL deref caused by blkg_policy_data being installed before init
  drivers/rnbd: restore sysfs interface to rnbd-client
  md/raid5-cache: fix null-ptr-deref for r5l_flush_stripe_to_raid()
  raid6: test: only check for Altivec if building on powerpc hosts
  raid6: test: make sure all intermediate and artifact files are .gitignored
  ...
2023-08-29 20:21:42 -07:00
Linus Torvalds
d68b4b6f30 - An extensive rework of kexec and crash Kconfig from Eric DeVolder
("refactor Kconfig to consolidate KEXEC and CRASH options").
 
 - kernel.h slimming work from Andy Shevchenko ("kernel.h: Split out a
   couple of macros to args.h").
 
 - gdb feature work from Kuan-Ying Lee ("Add GDB memory helper
   commands").
 
 - vsprintf inclusion rationalization from Andy Shevchenko
   ("lib/vsprintf: Rework header inclusions").
 
 - Switch the handling of kdump from a udev scheme to in-kernel handling,
   by Eric DeVolder ("crash: Kernel handling of CPU and memory hot
   un/plug").
 
 - Many singleton patches to various parts of the tree
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCZO2GpAAKCRDdBJ7gKXxA
 juW3AQD1moHzlSN6x9I3tjm5TWWNYFoFL8af7wXDJspp/DWH/AD/TO0XlWWhhbYy
 QHy7lL0Syha38kKLMXTM+bN6YQHi9AU=
 =WJQa
 -----END PGP SIGNATURE-----

Merge tag 'mm-nonmm-stable-2023-08-28-22-48' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull non-MM updates from Andrew Morton:

 - An extensive rework of kexec and crash Kconfig from Eric DeVolder
   ("refactor Kconfig to consolidate KEXEC and CRASH options")

 - kernel.h slimming work from Andy Shevchenko ("kernel.h: Split out a
   couple of macros to args.h")

 - gdb feature work from Kuan-Ying Lee ("Add GDB memory helper
   commands")

 - vsprintf inclusion rationalization from Andy Shevchenko
   ("lib/vsprintf: Rework header inclusions")

 - Switch the handling of kdump from a udev scheme to in-kernel
   handling, by Eric DeVolder ("crash: Kernel handling of CPU and memory
   hot un/plug")

 - Many singleton patches to various parts of the tree

* tag 'mm-nonmm-stable-2023-08-28-22-48' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (81 commits)
  document while_each_thread(), change first_tid() to use for_each_thread()
  drivers/char/mem.c: shrink character device's devlist[] array
  x86/crash: optimize CPU changes
  crash: change crash_prepare_elf64_headers() to for_each_possible_cpu()
  crash: hotplug support for kexec_load()
  x86/crash: add x86 crash hotplug support
  crash: memory and CPU hotplug sysfs attributes
  kexec: exclude elfcorehdr from the segment digest
  crash: add generic infrastructure for crash hotplug support
  crash: move a few code bits to setup support of crash hotplug
  kstrtox: consistently use _tolower()
  kill do_each_thread()
  nilfs2: fix WARNING in mark_buffer_dirty due to discarded buffer reuse
  scripts/bloat-o-meter: count weak symbol sizes
  treewide: drop CONFIG_EMBEDDED
  lockdep: fix static memory detection even more
  lib/vsprintf: declare no_hash_pointers in sprintf.h
  lib/vsprintf: split out sprintf() and friends
  kernel/fork: stop playing lockless games for exe_file replacement
  adfs: delete unused "union adfs_dirtail" definition
  ...
2023-08-29 14:53:51 -07:00
Linus Torvalds
b96a3e9142 - Some swap cleanups from Ma Wupeng ("fix WARN_ON in add_to_avail_list")
- Peter Xu has a series (mm/gup: Unify hugetlb, speed up thp") which
   reduces the special-case code for handling hugetlb pages in GUP.  It
   also speeds up GUP handling of transparent hugepages.
 
 - Peng Zhang provides some maple tree speedups ("Optimize the fast path
   of mas_store()").
 
 - Sergey Senozhatsky has improved te performance of zsmalloc during
   compaction (zsmalloc: small compaction improvements").
 
 - Domenico Cerasuolo has developed additional selftest code for zswap
   ("selftests: cgroup: add zswap test program").
 
 - xu xin has doe some work on KSM's handling of zero pages.  These
   changes are mainly to enable the user to better understand the
   effectiveness of KSM's treatment of zero pages ("ksm: support tracking
   KSM-placed zero-pages").
 
 - Jeff Xu has fixes the behaviour of memfd's
   MEMFD_NOEXEC_SCOPE_NOEXEC_ENFORCED sysctl ("mm/memfd: fix sysctl
   MEMFD_NOEXEC_SCOPE_NOEXEC_ENFORCED").
 
 - David Howells has fixed an fscache optimization ("mm, netfs, fscache:
   Stop read optimisation when folio removed from pagecache").
 
 - Axel Rasmussen has given userfaultfd the ability to simulate memory
   poisoning ("add UFFDIO_POISON to simulate memory poisoning with UFFD").
 
 - Miaohe Lin has contributed some routine maintenance work on the
   memory-failure code ("mm: memory-failure: remove unneeded PageHuge()
   check").
 
 - Peng Zhang has contributed some maintenance work on the maple tree
   code ("Improve the validation for maple tree and some cleanup").
 
 - Hugh Dickins has optimized the collapsing of shmem or file pages into
   THPs ("mm: free retracted page table by RCU").
 
 - Jiaqi Yan has a patch series which permits us to use the healthy
   subpages within a hardware poisoned huge page for general purposes
   ("Improve hugetlbfs read on HWPOISON hugepages").
 
 - Kemeng Shi has done some maintenance work on the pagetable-check code
   ("Remove unused parameters in page_table_check").
 
 - More folioification work from Matthew Wilcox ("More filesystem folio
   conversions for 6.6"), ("Followup folio conversions for zswap").  And
   from ZhangPeng ("Convert several functions in page_io.c to use a
   folio").
 
 - page_ext cleanups from Kemeng Shi ("minor cleanups for page_ext").
 
 - Baoquan He has converted some architectures to use the GENERIC_IOREMAP
   ioremap()/iounmap() code ("mm: ioremap: Convert architectures to take
   GENERIC_IOREMAP way").
 
 - Anshuman Khandual has optimized arm64 tlb shootdown ("arm64: support
   batched/deferred tlb shootdown during page reclamation/migration").
 
 - Better maple tree lockdep checking from Liam Howlett ("More strict
   maple tree lockdep").  Liam also developed some efficiency improvements
   ("Reduce preallocations for maple tree").
 
 - Cleanup and optimization to the secondary IOMMU TLB invalidation, from
   Alistair Popple ("Invalidate secondary IOMMU TLB on permission
   upgrade").
 
 - Ryan Roberts fixes some arm64 MM selftest issues ("selftests/mm fixes
   for arm64").
 
 - Kemeng Shi provides some maintenance work on the compaction code ("Two
   minor cleanups for compaction").
 
 - Some reduction in mmap_lock pressure from Matthew Wilcox ("Handle most
   file-backed faults under the VMA lock").
 
 - Aneesh Kumar contributes code to use the vmemmap optimization for DAX
   on ppc64, under some circumstances ("Add support for DAX vmemmap
   optimization for ppc64").
 
 - page-ext cleanups from Kemeng Shi ("add page_ext_data to get client
   data in page_ext"), ("minor cleanups to page_ext header").
 
 - Some zswap cleanups from Johannes Weiner ("mm: zswap: three
   cleanups").
 
 - kmsan cleanups from ZhangPeng ("minor cleanups for kmsan").
 
 - VMA handling cleanups from Kefeng Wang ("mm: convert to
   vma_is_initial_heap/stack()").
 
 - DAMON feature work from SeongJae Park ("mm/damon/sysfs-schemes:
   implement DAMOS tried total bytes file"), ("Extend DAMOS filters for
   address ranges and DAMON monitoring targets").
 
 - Compaction work from Kemeng Shi ("Fixes and cleanups to compaction").
 
 - Liam Howlett has improved the maple tree node replacement code
   ("maple_tree: Change replacement strategy").
 
 - ZhangPeng has a general code cleanup - use the K() macro more widely
   ("cleanup with helper macro K()").
 
 - Aneesh Kumar brings memmap-on-memory to ppc64 ("Add support for memmap
   on memory feature on ppc64").
 
 - pagealloc cleanups from Kemeng Shi ("Two minor cleanups for pcp list
   in page_alloc"), ("Two minor cleanups for get pageblock migratetype").
 
 - Vishal Moola introduces a memory descriptor for page table tracking,
   "struct ptdesc" ("Split ptdesc from struct page").
 
 - memfd selftest maintenance work from Aleksa Sarai ("memfd: cleanups
   for vm.memfd_noexec").
 
 - MM include file rationalization from Hugh Dickins ("arch: include
   asm/cacheflush.h in asm/hugetlb.h").
 
 - THP debug output fixes from Hugh Dickins ("mm,thp: fix sloppy text
   output").
 
 - kmemleak improvements from Xiaolei Wang ("mm/kmemleak: use
   object_cache instead of kmemleak_initialized").
 
 - More folio-related cleanups from Matthew Wilcox ("Remove _folio_dtor
   and _folio_order").
 
 - A VMA locking scalability improvement from Suren Baghdasaryan
   ("Per-VMA lock support for swap and userfaults").
 
 - pagetable handling cleanups from Matthew Wilcox ("New page table range
   API").
 
 - A batch of swap/thp cleanups from David Hildenbrand ("mm/swap: stop
   using page->private on tail pages for THP_SWAP + cleanups").
 
 - Cleanups and speedups to the hugetlb fault handling from Matthew
   Wilcox ("Change calling convention for ->huge_fault").
 
 - Matthew Wilcox has also done some maintenance work on the MM subsystem
   documentation ("Improve mm documentation").
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCZO1JUQAKCRDdBJ7gKXxA
 jrMwAP47r/fS8vAVT3zp/7fXmxaJYTK27CTAM881Gw1SDhFM/wEAv8o84mDenCg6
 Nfio7afS1ncD+hPYT8947UnLxTgn+ww=
 =Afws
 -----END PGP SIGNATURE-----

Merge tag 'mm-stable-2023-08-28-18-26' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull MM updates from Andrew Morton:

 - Some swap cleanups from Ma Wupeng ("fix WARN_ON in
   add_to_avail_list")

 - Peter Xu has a series (mm/gup: Unify hugetlb, speed up thp") which
   reduces the special-case code for handling hugetlb pages in GUP. It
   also speeds up GUP handling of transparent hugepages.

 - Peng Zhang provides some maple tree speedups ("Optimize the fast path
   of mas_store()").

 - Sergey Senozhatsky has improved te performance of zsmalloc during
   compaction (zsmalloc: small compaction improvements").

 - Domenico Cerasuolo has developed additional selftest code for zswap
   ("selftests: cgroup: add zswap test program").

 - xu xin has doe some work on KSM's handling of zero pages. These
   changes are mainly to enable the user to better understand the
   effectiveness of KSM's treatment of zero pages ("ksm: support
   tracking KSM-placed zero-pages").

 - Jeff Xu has fixes the behaviour of memfd's
   MEMFD_NOEXEC_SCOPE_NOEXEC_ENFORCED sysctl ("mm/memfd: fix sysctl
   MEMFD_NOEXEC_SCOPE_NOEXEC_ENFORCED").

 - David Howells has fixed an fscache optimization ("mm, netfs, fscache:
   Stop read optimisation when folio removed from pagecache").

 - Axel Rasmussen has given userfaultfd the ability to simulate memory
   poisoning ("add UFFDIO_POISON to simulate memory poisoning with
   UFFD").

 - Miaohe Lin has contributed some routine maintenance work on the
   memory-failure code ("mm: memory-failure: remove unneeded PageHuge()
   check").

 - Peng Zhang has contributed some maintenance work on the maple tree
   code ("Improve the validation for maple tree and some cleanup").

 - Hugh Dickins has optimized the collapsing of shmem or file pages into
   THPs ("mm: free retracted page table by RCU").

 - Jiaqi Yan has a patch series which permits us to use the healthy
   subpages within a hardware poisoned huge page for general purposes
   ("Improve hugetlbfs read on HWPOISON hugepages").

 - Kemeng Shi has done some maintenance work on the pagetable-check code
   ("Remove unused parameters in page_table_check").

 - More folioification work from Matthew Wilcox ("More filesystem folio
   conversions for 6.6"), ("Followup folio conversions for zswap"). And
   from ZhangPeng ("Convert several functions in page_io.c to use a
   folio").

 - page_ext cleanups from Kemeng Shi ("minor cleanups for page_ext").

 - Baoquan He has converted some architectures to use the
   GENERIC_IOREMAP ioremap()/iounmap() code ("mm: ioremap: Convert
   architectures to take GENERIC_IOREMAP way").

 - Anshuman Khandual has optimized arm64 tlb shootdown ("arm64: support
   batched/deferred tlb shootdown during page reclamation/migration").

 - Better maple tree lockdep checking from Liam Howlett ("More strict
   maple tree lockdep"). Liam also developed some efficiency
   improvements ("Reduce preallocations for maple tree").

 - Cleanup and optimization to the secondary IOMMU TLB invalidation,
   from Alistair Popple ("Invalidate secondary IOMMU TLB on permission
   upgrade").

 - Ryan Roberts fixes some arm64 MM selftest issues ("selftests/mm fixes
   for arm64").

 - Kemeng Shi provides some maintenance work on the compaction code
   ("Two minor cleanups for compaction").

 - Some reduction in mmap_lock pressure from Matthew Wilcox ("Handle
   most file-backed faults under the VMA lock").

 - Aneesh Kumar contributes code to use the vmemmap optimization for DAX
   on ppc64, under some circumstances ("Add support for DAX vmemmap
   optimization for ppc64").

 - page-ext cleanups from Kemeng Shi ("add page_ext_data to get client
   data in page_ext"), ("minor cleanups to page_ext header").

 - Some zswap cleanups from Johannes Weiner ("mm: zswap: three
   cleanups").

 - kmsan cleanups from ZhangPeng ("minor cleanups for kmsan").

 - VMA handling cleanups from Kefeng Wang ("mm: convert to
   vma_is_initial_heap/stack()").

 - DAMON feature work from SeongJae Park ("mm/damon/sysfs-schemes:
   implement DAMOS tried total bytes file"), ("Extend DAMOS filters for
   address ranges and DAMON monitoring targets").

 - Compaction work from Kemeng Shi ("Fixes and cleanups to compaction").

 - Liam Howlett has improved the maple tree node replacement code
   ("maple_tree: Change replacement strategy").

 - ZhangPeng has a general code cleanup - use the K() macro more widely
   ("cleanup with helper macro K()").

 - Aneesh Kumar brings memmap-on-memory to ppc64 ("Add support for
   memmap on memory feature on ppc64").

 - pagealloc cleanups from Kemeng Shi ("Two minor cleanups for pcp list
   in page_alloc"), ("Two minor cleanups for get pageblock
   migratetype").

 - Vishal Moola introduces a memory descriptor for page table tracking,
   "struct ptdesc" ("Split ptdesc from struct page").

 - memfd selftest maintenance work from Aleksa Sarai ("memfd: cleanups
   for vm.memfd_noexec").

 - MM include file rationalization from Hugh Dickins ("arch: include
   asm/cacheflush.h in asm/hugetlb.h").

 - THP debug output fixes from Hugh Dickins ("mm,thp: fix sloppy text
   output").

 - kmemleak improvements from Xiaolei Wang ("mm/kmemleak: use
   object_cache instead of kmemleak_initialized").

 - More folio-related cleanups from Matthew Wilcox ("Remove _folio_dtor
   and _folio_order").

 - A VMA locking scalability improvement from Suren Baghdasaryan
   ("Per-VMA lock support for swap and userfaults").

 - pagetable handling cleanups from Matthew Wilcox ("New page table
   range API").

 - A batch of swap/thp cleanups from David Hildenbrand ("mm/swap: stop
   using page->private on tail pages for THP_SWAP + cleanups").

 - Cleanups and speedups to the hugetlb fault handling from Matthew
   Wilcox ("Change calling convention for ->huge_fault").

 - Matthew Wilcox has also done some maintenance work on the MM
   subsystem documentation ("Improve mm documentation").

* tag 'mm-stable-2023-08-28-18-26' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (489 commits)
  maple_tree: shrink struct maple_tree
  maple_tree: clean up mas_wr_append()
  secretmem: convert page_is_secretmem() to folio_is_secretmem()
  nios2: fix flush_dcache_page() for usage from irq context
  hugetlb: add documentation for vma_kernel_pagesize()
  mm: add orphaned kernel-doc to the rst files.
  mm: fix clean_record_shared_mapping_range kernel-doc
  mm: fix get_mctgt_type() kernel-doc
  mm: fix kernel-doc warning from tlb_flush_rmaps()
  mm: remove enum page_entry_size
  mm: allow ->huge_fault() to be called without the mmap_lock held
  mm: move PMD_ORDER to pgtable.h
  mm: remove checks for pte_index
  memcg: remove duplication detection for mem_cgroup_uncharge_swap
  mm/huge_memory: work on folio->swap instead of page->private when splitting folio
  mm/swap: inline folio_set_swap_entry() and folio_swap_entry()
  mm/swap: use dedicated entry for swap in folio
  mm/swap: stop using page->private on tail pages for THP_SWAP
  selftests/mm: fix WARNING comparing pointer to 0
  selftests: cgroup: fix test_kmem_memcg_deletion kernel mem check
  ...
2023-08-29 14:25:26 -07:00
Linus Torvalds
bd6c11bc43 Networking changes for 6.6.
Core
 ----
 
  - Increase size limits for to-be-sent skb frag allocations. This
    allows tun, tap devices and packet sockets to better cope with large
    writes operations.
 
  - Store netdevs in an xarray, to simplify iterating over netdevs.
 
  - Refactor nexthop selection for multipath routes.
 
  - Improve sched class lifetime handling.
 
  - Add backup nexthop ID support for bridge.
 
  - Implement drop reasons support in openvswitch.
 
  - Several data races annotations and fixes.
 
  - Constify the sk parameter of routing functions.
 
  - Prepend kernel version to netconsole message.
 
 Protocols
 ---------
 
  - Implement support for TCP probing the peer being under memory
    pressure.
 
  - Remove hard coded limitation on IPv6 specific info placement
    inside the socket struct.
 
  - Get rid of sysctl_tcp_adv_win_scale and use an auto-estimated
    per socket scaling factor.
 
  - Scaling-up the IPv6 expired route GC via a separated list of
    expiring routes.
 
  - In-kernel support for the TLS alert protocol.
 
  - Better support for UDP reuseport with connected sockets.
 
  - Add NEXT-C-SID support for SRv6 End.X behavior, reducing the SR
    header size.
 
  - Get rid of additional ancillary per MPTCP connection struct socket.
 
  - Implement support for BPF-based MPTCP packet schedulers.
 
  - Format MPTCP subtests selftests results in TAP.
 
  - Several new SMC 2.1 features including unique experimental options,
    max connections per lgr negotiation, max links per lgr negotiation.
 
 BPF
 ---
 
  - Multi-buffer support in AF_XDP.
 
  - Add multi uprobe BPF links for attaching multiple uprobes
    and usdt probes, which is significantly faster and saves extra fds.
 
  - Implement an fd-based tc BPF attach API (TCX) and BPF link support on
    top of it.
 
  - Add SO_REUSEPORT support for TC bpf_sk_assign.
 
  - Support new instructions from cpu v4 to simplify the generated code and
    feature completeness, for x86, arm64, riscv64.
 
  - Support defragmenting IPv(4|6) packets in BPF.
 
  - Teach verifier actual bounds of bpf_get_smp_processor_id()
    and fix perf+libbpf issue related to custom section handling.
 
  - Introduce bpf map element count and enable it for all program types.
 
  - Add a BPF hook in sys_socket() to change the protocol ID
    from IPPROTO_TCP to IPPROTO_MPTCP to cover migration for legacy.
 
  - Introduce bpf_me_mcache_free_rcu() and fix OOM under stress.
 
  - Add uprobe support for the bpf_get_func_ip helper.
 
  - Check skb ownership against full socket.
 
  - Support for up to 12 arguments in BPF trampoline.
 
  - Extend link_info for kprobe_multi and perf_event links.
 
 Netfilter
 ---------
 
  - Speed-up process exit by aborting ruleset validation if a
    fatal signal is pending.
 
  - Allow NLA_POLICY_MASK to be used with BE16/BE32 types.
 
 Driver API
 ----------
 
  - Page pool optimizations, to improve data locality and cache usage.
 
  - Introduce ndo_hwtstamp_get() and ndo_hwtstamp_set() to avoid the need
    for raw ioctl() handling in drivers.
 
  - Simplify genetlink dump operations (doit/dumpit) providing them
    the common information already populated in struct genl_info.
 
  - Extend and use the yaml devlink specs to [re]generate the split ops.
 
  - Introduce devlink selective dumps, to allow SF filtering SF based on
    handle and other attributes.
 
  - Add yaml netlink spec for netlink-raw families, allow route, link and
    address related queries via the ynl tool.
 
  - Remove phylink legacy mode support.
 
  - Support offload LED blinking to phy.
 
  - Add devlink port function attributes for IPsec.
 
 New hardware / drivers
 ----------------------
 
  - Ethernet:
    - Broadcom ASP 2.0 (72165) ethernet controller
    - MediaTek MT7988 SoC
    - Texas Instruments AM654 SoC
    - Texas Instruments IEP driver
    - Atheros qca8081 phy
    - Marvell 88Q2110 phy
    - NXP TJA1120 phy
 
  - WiFi:
    - MediaTek mt7981 support
 
  - Can:
    - Kvaser SmartFusion2 PCI Express devices
    - Allwinner T113 controllers
    - Texas Instruments tcan4552/4553 chips
 
  - Bluetooth:
    - Intel Gale Peak
    - Qualcomm WCN3988 and WCN7850
    - NXP AW693 and IW624
    - Mediatek MT2925
 
 Drivers
 -------
 
  - Ethernet NICs:
    - nVidia/Mellanox:
      - mlx5:
        - support UDP encapsulation in packet offload mode
        - IPsec packet offload support in eswitch mode
        - improve aRFS observability by adding new set of counters
        - extends MACsec offload support to cover RoCE traffic
        - dynamic completion EQs
      - mlx4:
        - convert to use auxiliary bus instead of custom interface logic
    - Intel
      - ice:
        - implement switchdev bridge offload, even for LAG interfaces
        - implement SRIOV support for LAG interfaces
      - igc:
        - add support for multiple in-flight TX timestamps
    - Broadcom:
      - bnxt:
        - use the unified RX page pool buffers for XDP and non-XDP
        - use the NAPI skb allocation cache
    - OcteonTX2:
      - support Round Robin scheduling HTB offload
      - TC flower offload support for SPI field
    - Freescale:
      -  add XDP_TX feature support
    - AMD:
      - ionic: add support for PCI FLR event
      - sfc:
        - basic conntrack offload
        - introduce eth, ipv4 and ipv6 pedit offloads
    - ST Microelectronics:
      - stmmac: maximze PTP timestamping resolution
 
  - Virtual NICs:
    - Microsoft vNIC:
      - batch ringing RX queue doorbell on receiving packets
      - add page pool for RX buffers
    - Virtio vNIC:
      - add per queue interrupt coalescing support
    - Google vNIC:
      - add queue-page-list mode support
 
  - Ethernet high-speed switches:
    - nVidia/Mellanox (mlxsw):
      - add port range matching tc-flower offload
      - permit enslavement to netdevices with uppers
 
  - Ethernet embedded switches:
    - Marvell (mv88e6xxx):
      - convert to phylink_pcs
    - Renesas:
      - r8A779fx: add speed change support
      - rzn1: enables vlan support
 
  - Ethernet PHYs:
    - convert mv88e6xxx to phylink_pcs
 
  - WiFi:
    - Qualcomm Wi-Fi 7 (ath12k):
      - extremely High Throughput (EHT) PHY support
    - RealTek (rtl8xxxu):
      - enable AP mode for: RTL8192FU, RTL8710BU (RTL8188GU),
        RTL8192EU and RTL8723BU
    - RealTek (rtw89):
      - Introduce Time Averaged SAR (TAS) support
 
  - Connector:
    - support for event filtering
 
 Signed-off-by: Paolo Abeni <pabeni@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 
 iQJGBAABCAAwFiEEg1AjqC77wbdLX2LbKSR5jcyPE6QFAmTt1ZoSHHBhYmVuaUBy
 ZWRoYXQuY29tAAoJECkkeY3MjxOkgFUP/REFaYWdWUvAzmWeezyx9dqgZMfSOjWq
 9QvySiA94OAOcjIYkb7wfzQ5BBAZqaBQ/f8XqWwS1EDDDEBs8sP1cxmABKwW7Hsr
 qFRu2sOqLzKBk223d0jIgEocfQaFpGbF71gXoTlDivBjBi5UxWm9bF0XnbYWcKgO
 /QEvzNosi9uNdi85Fzmv62J6YzAdidEpwGsM7X2CfejwNRmStxAEg/NwvRR0Hyiq
 OJCo97omEgTRaUle8nc64PDx33u4h5kQ1BkaeHEv0rbE3hftFC2YPKn/InmqSFGz
 6ew2xnrGPR37LCuAiCcIIv6yR7K0eu0iYJ7jXwZxBDqxGavEPuwWGBoCP6qFiitH
 ZLWhIrAUrdmSbySkTOCONhJ475qFAuQoYHYpZnX/bJZUHlSsb/9lwDJYJQGpVfd1
 /daqJVSb7lhaifmNO1iNd/ibCIXq9zapwtkRwA897M8GkZBTsnVvazFld1Em+Se3
 Bx6DSDUVBqVQ9fpZG2IAGD6odDwOzC1lF2IoceFvK9Ff6oE0psI+A0qNLMkHxZbW
 Qlo7LsNe53hpoCC+yHTfXX7e/X8eNt0EnCGOQJDusZ0Nr3K7H4LKFA0i8UBUK05n
 4lKnnaSQW7GQgdofLWt103OMDR9GoDxpFsm7b1X9+AEk6Fz6tq50wWYeMZETUKYP
 DCW8VGFOZjZM
 =9CsR
 -----END PGP SIGNATURE-----

Merge tag 'net-next-6.6' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next

Pull networking updates from Paolo Abeni:
 "Core:

   - Increase size limits for to-be-sent skb frag allocations. This
     allows tun, tap devices and packet sockets to better cope with
     large writes operations

   - Store netdevs in an xarray, to simplify iterating over netdevs

   - Refactor nexthop selection for multipath routes

   - Improve sched class lifetime handling

   - Add backup nexthop ID support for bridge

   - Implement drop reasons support in openvswitch

   - Several data races annotations and fixes

   - Constify the sk parameter of routing functions

   - Prepend kernel version to netconsole message

  Protocols:

   - Implement support for TCP probing the peer being under memory
     pressure

   - Remove hard coded limitation on IPv6 specific info placement inside
     the socket struct

   - Get rid of sysctl_tcp_adv_win_scale and use an auto-estimated per
     socket scaling factor

   - Scaling-up the IPv6 expired route GC via a separated list of
     expiring routes

   - In-kernel support for the TLS alert protocol

   - Better support for UDP reuseport with connected sockets

   - Add NEXT-C-SID support for SRv6 End.X behavior, reducing the SR
     header size

   - Get rid of additional ancillary per MPTCP connection struct socket

   - Implement support for BPF-based MPTCP packet schedulers

   - Format MPTCP subtests selftests results in TAP

   - Several new SMC 2.1 features including unique experimental options,
     max connections per lgr negotiation, max links per lgr negotiation

  BPF:

   - Multi-buffer support in AF_XDP

   - Add multi uprobe BPF links for attaching multiple uprobes and usdt
     probes, which is significantly faster and saves extra fds

   - Implement an fd-based tc BPF attach API (TCX) and BPF link support
     on top of it

   - Add SO_REUSEPORT support for TC bpf_sk_assign

   - Support new instructions from cpu v4 to simplify the generated code
     and feature completeness, for x86, arm64, riscv64

   - Support defragmenting IPv(4|6) packets in BPF

   - Teach verifier actual bounds of bpf_get_smp_processor_id() and fix
     perf+libbpf issue related to custom section handling

   - Introduce bpf map element count and enable it for all program types

   - Add a BPF hook in sys_socket() to change the protocol ID from
     IPPROTO_TCP to IPPROTO_MPTCP to cover migration for legacy

   - Introduce bpf_me_mcache_free_rcu() and fix OOM under stress

   - Add uprobe support for the bpf_get_func_ip helper

   - Check skb ownership against full socket

   - Support for up to 12 arguments in BPF trampoline

   - Extend link_info for kprobe_multi and perf_event links

  Netfilter:

   - Speed-up process exit by aborting ruleset validation if a fatal
     signal is pending

   - Allow NLA_POLICY_MASK to be used with BE16/BE32 types

  Driver API:

   - Page pool optimizations, to improve data locality and cache usage

   - Introduce ndo_hwtstamp_get() and ndo_hwtstamp_set() to avoid the
     need for raw ioctl() handling in drivers

   - Simplify genetlink dump operations (doit/dumpit) providing them the
     common information already populated in struct genl_info

   - Extend and use the yaml devlink specs to [re]generate the split ops

   - Introduce devlink selective dumps, to allow SF filtering SF based
     on handle and other attributes

   - Add yaml netlink spec for netlink-raw families, allow route, link
     and address related queries via the ynl tool

   - Remove phylink legacy mode support

   - Support offload LED blinking to phy

   - Add devlink port function attributes for IPsec

  New hardware / drivers:

   - Ethernet:
      - Broadcom ASP 2.0 (72165) ethernet controller
      - MediaTek MT7988 SoC
      - Texas Instruments AM654 SoC
      - Texas Instruments IEP driver
      - Atheros qca8081 phy
      - Marvell 88Q2110 phy
      - NXP TJA1120 phy

   - WiFi:
      - MediaTek mt7981 support

   - Can:
      - Kvaser SmartFusion2 PCI Express devices
      - Allwinner T113 controllers
      - Texas Instruments tcan4552/4553 chips

   - Bluetooth:
      - Intel Gale Peak
      - Qualcomm WCN3988 and WCN7850
      - NXP AW693 and IW624
      - Mediatek MT2925

  Drivers:

   - Ethernet NICs:
      - nVidia/Mellanox:
         - mlx5:
            - support UDP encapsulation in packet offload mode
            - IPsec packet offload support in eswitch mode
            - improve aRFS observability by adding new set of counters
            - extends MACsec offload support to cover RoCE traffic
            - dynamic completion EQs
         - mlx4:
            - convert to use auxiliary bus instead of custom interface
              logic
      - Intel
         - ice:
            - implement switchdev bridge offload, even for LAG
              interfaces
            - implement SRIOV support for LAG interfaces
         - igc:
            - add support for multiple in-flight TX timestamps
      - Broadcom:
         - bnxt:
            - use the unified RX page pool buffers for XDP and non-XDP
            - use the NAPI skb allocation cache
      - OcteonTX2:
         - support Round Robin scheduling HTB offload
         - TC flower offload support for SPI field
      - Freescale:
         - add XDP_TX feature support
      - AMD:
         - ionic: add support for PCI FLR event
         - sfc:
            - basic conntrack offload
            - introduce eth, ipv4 and ipv6 pedit offloads
      - ST Microelectronics:
         - stmmac: maximze PTP timestamping resolution

   - Virtual NICs:
      - Microsoft vNIC:
         - batch ringing RX queue doorbell on receiving packets
         - add page pool for RX buffers
      - Virtio vNIC:
         - add per queue interrupt coalescing support
      - Google vNIC:
         - add queue-page-list mode support

   - Ethernet high-speed switches:
      - nVidia/Mellanox (mlxsw):
         - add port range matching tc-flower offload
         - permit enslavement to netdevices with uppers

   - Ethernet embedded switches:
      - Marvell (mv88e6xxx):
         - convert to phylink_pcs
      - Renesas:
         - r8A779fx: add speed change support
         - rzn1: enables vlan support

   - Ethernet PHYs:
      - convert mv88e6xxx to phylink_pcs

   - WiFi:
      - Qualcomm Wi-Fi 7 (ath12k):
         - extremely High Throughput (EHT) PHY support
      - RealTek (rtl8xxxu):
         - enable AP mode for: RTL8192FU, RTL8710BU (RTL8188GU),
           RTL8192EU and RTL8723BU
      - RealTek (rtw89):
         - Introduce Time Averaged SAR (TAS) support

   - Connector:
      - support for event filtering"

* tag 'net-next-6.6' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1806 commits)
  net: ethernet: mtk_wed: minor change in wed_{tx,rx}info_show
  net: ethernet: mtk_wed: add some more info in wed_txinfo_show handler
  net: stmmac: clarify difference between "interface" and "phy_interface"
  r8152: add vendor/device ID pair for D-Link DUB-E250
  devlink: move devlink_notify_register/unregister() to dev.c
  devlink: move small_ops definition into netlink.c
  devlink: move tracepoint definitions into core.c
  devlink: push linecard related code into separate file
  devlink: push rate related code into separate file
  devlink: push trap related code into separate file
  devlink: use tracepoint_enabled() helper
  devlink: push region related code into separate file
  devlink: push param related code into separate file
  devlink: push resource related code into separate file
  devlink: push dpipe related code into separate file
  devlink: move and rename devlink_dpipe_send_and_alloc_skb() helper
  devlink: push shared buffer related code into separate file
  devlink: push port related code into separate file
  devlink: push object register/unregister notifications into separate helpers
  inet: fix IP_TRANSPARENT error handling
  ...
2023-08-29 11:33:01 -07:00
Linus Torvalds
68cf01760b This update includes the following changes:
API:
 
 - Move crypto engine callback from tfm ctx into algorithm object.
 - Fix atomic sleep bug in crypto_destroy_instance.
 - Move lib/mpi into lib/crypto.
 
 Algorithms:
 
 - Add chacha20 and poly1305 implementation for powerpc p10.
 
 Drivers:
 
 - Add AES skcipher and aead support to starfive.
 - Add Dynamic Boost Control support to ccp.
 - Add support for STM32P13 platform to stm32.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEn51F/lCuNhUwmDeSxycdCkmxi6cFAmTsZkMACgkQxycdCkmx
 i6furw//e6kYK1CTOqidPM6nI0KK1Ok204VXu56H0wM4THZ09ZwcbDNKpvI6vjMi
 XZkKthiayl/1okmpRVP0rPqMWDtxajeu6IUAQqqFGUFU8R7AqCDrOd+te+zlSFWG
 16ySNQO47RND0OzNqZ4ojgCC0n9RpP+zOfndmderZ4EnfXSbodwGUwkcuE7Z96cP
 jNoainO2iwlyMZPlVynrw61O3RxGu/s/ch+uY1mV+TyvAAWoOlzt57gYUs3eGduz
 4Ky+0Ubctg3sfBaqA2Hg6GjtAqG/QUssRyj8YgsFMrgXPHDTbLh6abej39wWo4gz
 ZdC7Bm47hV/yfVdWe2iq3/5iqdILEdPBh3fDh6NNsZ1Jlm3aEZpH9rEXm0k4X2MJ
 A9NDAFVj8dAYVZza7+Y8jPc8FNe+HqN9HYip/2K7g68WAJGWnMc9lq9qGwGmg1Gl
 dn6yM27AgH8B+UljWYM9FS1ZFsc8KCudJavRZqA2d0W3rbXVWAoBBp83ii0yX1Nm
 ZPAblAYMZCDeCtrVrDYKLtGn566rfpCrv3R5cppwHLksGJsDxgWrjG47l9uy5HXI
 u05jiXT11R+pjIU2Wv5qsiUIhyvli6AaiFYHIdZ8fWaovPAOdhrCrN3IryvUVHj/
 LqMcnmW1rWGNYN9pqHn0sQZ730ZJIma0klhTZOn8HPJNbiK68X0=
 =LbcA
 -----END PGP SIGNATURE-----

Merge tag 'v6.6-p1' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6

Pull crypto updates from Herbert Xu:
 "API:
   - Move crypto engine callback from tfm ctx into algorithm object
   - Fix atomic sleep bug in crypto_destroy_instance
   - Move lib/mpi into lib/crypto

  Algorithms:
   - Add chacha20 and poly1305 implementation for powerpc p10

  Drivers:
   - Add AES skcipher and aead support to starfive
   - Add Dynamic Boost Control support to ccp
   - Add support for STM32P13 platform to stm32"

* tag 'v6.6-p1' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (149 commits)
  Revert "dt-bindings: crypto: qcom,prng: Add SM8450"
  crypto: chelsio - Remove unused declarations
  X.509: if signature is unsupported skip validation
  crypto: qat - fix crypto capability detection for 4xxx
  crypto: drivers - Explicitly include correct DT includes
  crypto: engine - Remove crypto_engine_ctx
  crypto: zynqmp - Use new crypto_engine_op interface
  crypto: virtio - Use new crypto_engine_op interface
  crypto: stm32 - Use new crypto_engine_op interface
  crypto: jh7110 - Use new crypto_engine_op interface
  crypto: rk3288 - Use new crypto_engine_op interface
  crypto: omap - Use new crypto_engine_op interface
  crypto: keembay - Use new crypto_engine_op interface
  crypto: sl3516 - Use new crypto_engine_op interface
  crypto: caam - Use new crypto_engine_op interface
  crypto: aspeed - Remove non-standard sha512 algorithms
  crypto: aspeed - Use new crypto_engine_op interface
  crypto: amlogic - Use new crypto_engine_op interface
  crypto: sun8i-ss - Use new crypto_engine_op interface
  crypto: sun8i-ce - Use new crypto_engine_op interface
  ...
2023-08-29 11:23:29 -07:00
Linus Torvalds
815c24a085 linux-kselftest-kunit-6.6-rc1
This kunit update for Linux 6.6.rc1 consists of:
 
 -- Adds support for running Rust documentation tests as KUnit tests
 -- Makes init, str, sync, types doctests compilable/testable
 -- Adds support for attributes API which include speed, modules
    attributes, ability to filter and report attributes.
 -- Adds support for marking tests slow using attributes API.
 -- Adds attributes API documentation
 -- Fixes to wild-memory-access bug in kunit_filter_suites() and
    a possible memory leak in kunit_filter_suites()
 -- Adds support for counting number of test suites in a module, list
    action to kunit test modules, and test filtering on module tests.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEPZKym/RZuOCGeA/kCwJExA0NQxwFAmTsxL8ACgkQCwJExA0N
 Qxwt6BAA5FgF7nUeGRZCnot4MQCNGRThxsns2k3CKjM1Iokp8tstTDoNHXzk2veS
 WlRYOHFQqQOVTVRP+laXyjjMMHnlnhFxqbv93UKsen4JIUJDLFLq9x+0i+0bZh97
 N1rE5cKUnqjAOL6MIJuomW9IzEIrbMcqdljm6SOCZp90NLvq1+I4pDGLgx2bxcow
 Y/7dkx+dnlEsoACZ19CL1L2TaR21GpKdpOudpHNCShsbE0aOAlyHAVcmH64FTqCy
 Z1LtrA0odS71q0yxDVCk5X3cIkeVfGBMz6aMZBRzS9k5jU4H1EN1eG1rGdGErIe5
 YduwX3KMikYJB2stT64T1vgldIpT/emxqkBigmxQ37g3Flgopz4bI1snMBry+nKb
 ViD/WQNjsf2iL8MooCgYBzH7yjmX6lXXQTZXROogBj4lP2/0gHiQVZyXZEAjtoO3
 uNzUbfHQGnvtTphBHV4nNGaO+7kU9Y/oX8TYFcSYJQzcH5UVx16uBwevZjT1bii/
 q89bRAQLnJpzkR93SGpnmsRgoDcYJSYsEA1o/f9Eqq8j3guOS2idpJvkheXq8+A2
 MqTSOCJHENKZ3v0UGKlvZUPStaMaqN58z/VjlWug5EaB83LLfPcXJrGjz/EHk967
 hYDHcwPoamTegr1zlg3ckOLiWEhga2tv6aHPkshkcFphpnhRU/c=
 =Nsb8
 -----END PGP SIGNATURE-----

Merge tag 'linux-kselftest-kunit-6.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest

Pull kunit updates from Shuah Khan:

 - add support for running Rust documentation tests as KUnit tests

 - make init, str, sync, types doctests compilable/testable

 - add support for attributes API which include speed, modules
   attributes, ability to filter and report attributes

 - add support for marking tests slow using attributes API

 - add attributes API documentation

 - fix a wild-memory-access bug in kunit_filter_suites() and a possible
   memory leak in kunit_filter_suites()

 - add support for counting number of test suites in a module, list
   action to kunit test modules, and test filtering on module tests

* tag 'linux-kselftest-kunit-6.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest: (25 commits)
  kunit: fix struct kunit_attr header
  kunit: replace KUNIT_TRIGGER_STATIC_STUB maro with KUNIT_STATIC_STUB_REDIRECT
  kunit: Allow kunit test modules to use test filtering
  kunit: Make 'list' action available to kunit test modules
  kunit: Report the count of test suites in a module
  kunit: fix uninitialized variables bug in attributes filtering
  kunit: fix possible memory leak in kunit_filter_suites()
  kunit: fix wild-memory-access bug in kunit_filter_suites()
  kunit: Add documentation of KUnit test attributes
  kunit: add tests for filtering attributes
  kunit: time: Mark test as slow using test attributes
  kunit: memcpy: Mark tests as slow using test attributes
  kunit: tool: Add command line interface to filter and report attributes
  kunit: Add ability to filter attributes
  kunit: Add module attribute
  kunit: Add speed attribute
  kunit: Add test attributes API structure
  MAINTAINERS: add Rust KUnit files to the KUnit entry
  rust: support running Rust documentation tests as KUnit ones
  rust: types: make doctests compilable/testable
  ...
2023-08-28 18:56:38 -07:00
Linus Torvalds
d637fce034 Simplify the locking self-tests via using the new <linux/cleanup.h>
facilities for lock guards.
 
 Signed-off-by: Ingo Molnar <mingo@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAmTtBG4RHG1pbmdvQGtl
 cm5lbC5vcmcACgkQEnMQ0APhK1j8HBAAjCRultcDtiX9GF1KvaHWBVxpSn1Esbl0
 NM2hySE9DbZZXG5xDWDFW/levAMq8CzrdQ0uSEgSJU0tneiFtp3li5B5BuSBFUJ6
 f837sT8mKEieydahZKL9i8mbiYLuXMwpLPiW+RVS8nXwZokhljqPw8zJ7TKCmMpG
 naQCJdtYDulUhdnOLKvYB+CXTcOyhGQn92zYUaO71xmjhqNR1+JzMWcheuOsWXK1
 UlfW7q+JeFmrbPu90u7WRw3xkGo9BvRv7acatDJ7vInZDbhE0aNrOeq0f9S/zPb0
 4TkdjV+EjmHUputkbYahzYFUUmU8v474lSbZYNuH2YUpqplHKwciNec9iDhiXZe0
 0JrggpZqhNtNkNarzpsEV2hBBxT5DjlOdBXHhLk8sDIWS0TeyN+ryxlfwICMW0sC
 UO83ncYfXajJ+jXjzk2y3pf/YG+AKOZT/L4+0hkuXrXf0Ag9Ig3sqUR4VEfB3W5+
 uzBolKceoXpxEXP5gLsozsAF6zLNC7x8zYnz/WHn8RJDK50oVEHbO6f+3UapSear
 e29D7+GG+evTUPxZoQl5EkUfTRBz+xf1ewMTd3OcIgoJpnjGc8qJlpErtkwUJJzn
 jsus7xGMBRmZwWYK0tdRbVJyG3NmpKTxpqw/kwSnrQTFuOtpoP6ykY7S4d1rBNwR
 wJmwIfKi1/k=
 =V/7H
 -----END PGP SIGNATURE-----

Merge tag 'locking-core-2023-08-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull locking update from Ingo Molnar:
 "Simplify the locking self-tests via using the new <linux/cleanup.h>
  facilities for lock guards"

* tag 'locking-core-2023-08-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  lockdep/selftests: Use SBRM APIs for wait context tests
2023-08-28 16:32:09 -07:00
Linus Torvalds
727dbda16b hardening updates for v6.6-rc1
- Carve out the new CONFIG_LIST_HARDENED as a more focused subset of
   CONFIG_DEBUG_LIST (Marco Elver).
 
 - Fix kallsyms lookup failure under Clang LTO (Yonghong Song).
 
 - Clarify documentation for CONFIG_UBSAN_TRAP (Jann Horn).
 
 - Flexible array member conversion not carried in other tree (Gustavo
   A. R. Silva).
 
 - Various strlcpy() and strncpy() removals not carried in other trees
   (Azeem Shaikh, Justin Stitt).
 
 - Convert nsproxy.count to refcount_t (Elena Reshetova).
 
 - Add handful of __counted_by annotations not carried in other trees,
   as well as an LKDTM test.
 
 - Fix build failure with gcc-plugins on GCC 14+.
 
 - Fix selftests to respect SKIP for signal-delivery tests.
 
 - Fix CFI warning for paravirt callback prototype.
 
 - Clarify documentation for seq_show_option_n() usage.
 -----BEGIN PGP SIGNATURE-----
 
 iQJKBAABCgA0FiEEpcP2jyKd1g9yPm4TiXL039xtwCYFAmTs6ZAWHGtlZXNjb29r
 QGNocm9taXVtLm9yZwAKCRCJcvTf3G3AJkpjD/9AeST5Imc2t0t71Qd+wPxW3jT3
 kDZPlHH8wHmuxSpRscX82m21SozvEMvybo6Cp7FSH4qr863FnBWMlo8acr7rKxUf
 0f7Y9qgY/hKADiVx5p0pbnCgcy+l4pwsxIqVCGuhjvNCbWHrdGqLM4UjIfaVz5Ws
 +55a/C3S1KVwB1s1+6to43jtKqQAx6yrqYWOaT3wEfCzHC87f9PUHhIGnFQVwPGP
 WpjQI/BQKpH7+MDCoJOPrZqXaE/4lWALxR6+5BBheGbvLoWifpJEYHX6bDUzkgBz
 liQDkgr4eAw5EXSOS7mX3EApfeMKakznJt9Mcmn0h3pPRlM3ZSVD64Xrou2Brpje
 exS2JRuh6HwIiXY9nTHc6YMGcAWG1syAR/hM2fQdujM0CWtBUk9+kkuYWsqF6nIK
 3tOxYLB/Ph4p+tShd+v5R3mEmp/6snYRKJoUk+9Fk67i54NnK4huyxaCO4zui+ML
 3vHuGp8KgFHUjJaYmYXHs3TRZnKSFUkPGc4MbpiGtmJ9zhfSwlhhF+yfBJCsvmTf
 ZajA+sPupT4OjLxU6vUD/ZNkXAEjWzktyX2v9YBA7FHh7SqPtX9ARRIxh417AjEJ
 tBPHhW/iRw9ftBIAKDmI7gPLynngd/zvjhvk6O5egHYjjgRM1/WAJZ4V26XR6+hf
 TWfQb7VRzdZIqwOEUA==
 =9ZWP
 -----END PGP SIGNATURE-----

Merge tag 'hardening-v6.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux

Pull hardening updates from Kees Cook:
 "As has become normal, changes are scattered around the tree (either
  explicitly maintainer Acked or for trivial stuff that went ignored):

   - Carve out the new CONFIG_LIST_HARDENED as a more focused subset of
     CONFIG_DEBUG_LIST (Marco Elver)

   - Fix kallsyms lookup failure under Clang LTO (Yonghong Song)

   - Clarify documentation for CONFIG_UBSAN_TRAP (Jann Horn)

   - Flexible array member conversion not carried in other tree (Gustavo
     A. R. Silva)

   - Various strlcpy() and strncpy() removals not carried in other trees
     (Azeem Shaikh, Justin Stitt)

   - Convert nsproxy.count to refcount_t (Elena Reshetova)

   - Add handful of __counted_by annotations not carried in other trees,
     as well as an LKDTM test

   - Fix build failure with gcc-plugins on GCC 14+

   - Fix selftests to respect SKIP for signal-delivery tests

   - Fix CFI warning for paravirt callback prototype

   - Clarify documentation for seq_show_option_n() usage"

* tag 'hardening-v6.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: (23 commits)
  LoadPin: Annotate struct dm_verity_loadpin_trusted_root_digest with __counted_by
  kallsyms: Change func signature for cleanup_symbol_name()
  kallsyms: Fix kallsyms_selftest failure
  nsproxy: Convert nsproxy.count to refcount_t
  integrity: Annotate struct ima_rule_opt_list with __counted_by
  lkdtm: Add FAM_BOUNDS test for __counted_by
  Compiler Attributes: counted_by: Adjust name and identifier expansion
  um: refactor deprecated strncpy to memcpy
  um: vector: refactor deprecated strncpy
  alpha: Replace one-element array with flexible-array member
  hardening: Move BUG_ON_DATA_CORRUPTION to hardening options
  list: Introduce CONFIG_LIST_HARDENED
  list_debug: Introduce inline wrappers for debug checks
  compiler_types: Introduce the Clang __preserve_most function attribute
  gcc-plugins: Rename last_stmt() for GCC 14+
  selftests/harness: Actually report SKIP for signal tests
  x86/paravirt: Fix tlb_remove_table function callback prototype warning
  EISA: Replace all non-returning strlcpy with strscpy
  perf: Replace strlcpy with strscpy
  um: Remove strlcpy declaration
  ...
2023-08-28 12:59:45 -07:00
Linus Torvalds
6016fc9162 New code for 6.6:
* Make large writes to the page cache fill sparse parts of the cache
    with large folios, then use large memcpy calls for the large folio.
  * Track the per-block dirty state of each large folio so that a
    buffered write to a single byte on a large folio does not result in a
    (potentially) multi-megabyte writeback IO.
  * Allow some directio completions to be performed in the initiating
    task's context instead of punting through a workqueue.  This will
    reduce latency for some io_uring requests.
 
 Signed-off-by: Darrick J. Wong <djwong@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQQ2qTKExjcn+O1o2YRKO3ySh0YRpgUCZM0Z1AAKCRBKO3ySh0YR
 pp7BAQCzkKejCM0185tNIH/faHjzidSisNQkJ5HoB4Opq9U66AEA6IPuAdlPlM/J
 FPW1oPq33Yn7AV4wXjUNFfDLzVb/Fgg=
 =dFBU
 -----END PGP SIGNATURE-----

Merge tag 'iomap-6.6-merge-3' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux

Pull iomap updates from Darrick Wong:
 "We've got some big changes for this release -- I'm very happy to be
  landing willy's work to enable large folios for the page cache for
  general read and write IOs when the fs can make contiguous space
  allocations, and Ritesh's work to track sub-folio dirty state to
  eliminate the write amplification problems inherent in using large
  folios.

  As a bonus, io_uring can now process write completions in the caller's
  context instead of bouncing through a workqueue, which should reduce
  io latency dramatically. IOWs, XFS should see a nice performance bump
  for both IO paths.

  Summary:

   - Make large writes to the page cache fill sparse parts of the cache
     with large folios, then use large memcpy calls for the large folio.

   - Track the per-block dirty state of each large folio so that a
     buffered write to a single byte on a large folio does not result in
     a (potentially) multi-megabyte writeback IO.

   - Allow some directio completions to be performed in the initiating
     task's context instead of punting through a workqueue. This will
     reduce latency for some io_uring requests"

* tag 'iomap-6.6-merge-3' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux: (26 commits)
  iomap: support IOCB_DIO_CALLER_COMP
  io_uring/rw: add write support for IOCB_DIO_CALLER_COMP
  fs: add IOCB flags related to passing back dio completions
  iomap: add IOMAP_DIO_INLINE_COMP
  iomap: only set iocb->private for polled bio
  iomap: treat a write through cache the same as FUA
  iomap: use an unsigned type for IOMAP_DIO_* defines
  iomap: cleanup up iomap_dio_bio_end_io()
  iomap: Add per-block dirty state tracking to improve performance
  iomap: Allocate ifs in ->write_begin() early
  iomap: Refactor iomap_write_delalloc_punch() function out
  iomap: Use iomap_punch_t typedef
  iomap: Fix possible overflow condition in iomap_write_delalloc_scan
  iomap: Add some uptodate state handling helpers for ifs state bitmap
  iomap: Drop ifs argument from iomap_set_range_uptodate()
  iomap: Rename iomap_page to iomap_folio_state and others
  iomap: Copy larger chunks from userspace
  iomap: Create large folios in the buffered write path
  filemap: Allow __filemap_get_folio to allocate large folios
  filemap: Add fgf_t typedef
  ...
2023-08-28 11:59:52 -07:00
Rob Herring
077ca0408c lib/genalloc: Explicitly include correct DT includes
The DT of_device.h and of_platform.h date back to the separate
of_platform_bus_type before it was merged into the regular platform bus.
As part of that merge prepping Arm DT support 13 years ago, they
"temporarily" include each other. They also include platform_device.h
and of.h. As a result, there's a pretty much random mix of those include
files used throughout the tree. In order to detangle these headers and
replace the implicit includes with struct declarations, users need to
explicitly include the correct includes.

Link: https://lore.kernel.org/r/20230714175056.4066297-1-robh@kernel.org
Signed-off-by: Rob Herring <robh@kernel.org>
2023-08-28 13:36:24 -05:00
Takashi Iwai
692f551015 ASoC: Updates for v6.6
The rest of the updates for v6.6, some of the highlights include:
 
  - A big API cleanup from Morimoto-san, rationalising the places we put
    functions.
  - Lots of work on the SOF framework, AMD and Intel drivers, including a
    lot of cleanup and new device support.
  - Standardisation of the presentation of jacks from drivers.
  - Provision of some generic sound card DT properties.
  - Conversion oof more drivers to the maple tree register cache.
  - New drivers for AMD Van Gogh, AWInic AW88261, Cirrus Logic cs42l43,
    various Intel platforms, Mediatek MT7986, RealTek RT1017 and StarFive
    JH7110.
 -----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCgAdFiEEreZoqmdXGLWf4p/qJNaLcl1Uh9AFAmTo6Y0ACgkQJNaLcl1U
 h9AnaAf/XiBSnZl2i9wFckPy7bLcR74YrP1sFet5ZAqtpIt+/DvzQlgFAraHJ4tR
 ScM2ZyyMwREaFhrHIXKLm8kbaOKeIjIMSxiHREVG9Wibq8d1TwaOHWAcXc9jMsQb
 3G23Aizy2h5yD+/VTh8q6aV+fmYZJDfr1tIu8zWva90HcL2fMGvLjVdt24RNejTL
 bgCC2GaaGP4pnC3xoBo1hGayvp0PES1BHVeyAXqMVscH+GCplPNJEdSHvU14OBck
 1Nfjf5NVkh5G0pvrbG/yblsn1Zm5HRAzCE7gF1OHLFH27ygvp7fGk6TIXXpvw23c
 OSvveYee2YLrf4kyndmv88Aq8JVTeA==
 =9F/T
 -----END PGP SIGNATURE-----

Merge tag 'asoc-v6.6' of https://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound into for-linus

ASoC: Updates for v6.6

The rest of the updates for v6.6, some of the highlights include:

 - A big API cleanup from Morimoto-san, rationalising the places we put
   functions.
 - Lots of work on the SOF framework, AMD and Intel drivers, including a
   lot of cleanup and new device support.
 - Standardisation of the presentation of jacks from drivers.
 - Provision of some generic sound card DT properties.
 - Conversion oof more drivers to the maple tree register cache.
 - New drivers for AMD Van Gogh, AWInic AW88261, Cirrus Logic cs42l43,
   various Intel platforms, Mediatek MT7986, RealTek RT1017 and StarFive
   JH7110.
2023-08-28 16:13:03 +02:00
Jakub Kicinski
bebfbf07c7 bpf-next-for-netdev
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQTFp0I1jqZrAX+hPRXbK58LschIgwUCZOjkTAAKCRDbK58LschI
 gx32AP9gaaHFBtOYBfoenKTJfMgv1WhtQHIBas+WN9ItmBx9MAEA4gm/VyQ6oD7O
 EBjJKJQ2CZ/QKw7cNacXw+l5jF7/+Q0=
 =8P7g
 -----END PGP SIGNATURE-----

Merge tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next

Daniel Borkmann says:

====================
pull-request: bpf-next 2023-08-25

We've added 87 non-merge commits during the last 8 day(s) which contain
a total of 104 files changed, 3719 insertions(+), 4212 deletions(-).

The main changes are:

1) Add multi uprobe BPF links for attaching multiple uprobes
   and usdt probes, which is significantly faster and saves extra fds,
   from Jiri Olsa.

2) Add support BPF cpu v4 instructions for arm64 JIT compiler,
   from Xu Kuohai.

3) Add support BPF cpu v4 instructions for riscv64 JIT compiler,
   from Pu Lehui.

4) Fix LWT BPF xmit hooks wrt their return values where propagating
   the result from skb_do_redirect() would trigger a use-after-free,
   from Yan Zhai.

5) Fix a BPF verifier issue related to bpf_kptr_xchg() with local kptr
   where the map's value kptr type and locally allocated obj type
   mismatch, from Yonghong Song.

6) Fix BPF verifier's check_func_arg_reg_off() function wrt graph
   root/node which bypassed reg->off == 0 enforcement,
   from Kumar Kartikeya Dwivedi.

7) Lift BPF verifier restriction in networking BPF programs to treat
   comparison of packet pointers not as a pointer leak,
   from Yafang Shao.

8) Remove unmaintained XDP BPF samples as they are maintained
   in xdp-tools repository out of tree, from Toke Høiland-Jørgensen.

9) Batch of fixes for the tracing programs from BPF samples in order
   to make them more libbpf-aware, from Daniel T. Lee.

10) Fix a libbpf signedness determination bug in the CO-RE relocation
    handling logic, from Andrii Nakryiko.

11) Extend libbpf to support CO-RE kfunc relocations. Also follow-up
    fixes for bpf_refcount shared ownership implementation,
    both from Dave Marchevsky.

12) Add a new bpf_object__unpin() API function to libbpf,
    from Daniel Xu.

13) Fix a memory leak in libbpf to also free btf_vmlinux
    when the bpf_object gets closed, from Hao Luo.

14) Small error output improvements to test_bpf module, from Helge Deller.

* tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (87 commits)
  selftests/bpf: Add tests for rbtree API interaction in sleepable progs
  bpf: Allow bpf_spin_{lock,unlock} in sleepable progs
  bpf: Consider non-owning refs to refcounted nodes RCU protected
  bpf: Reenable bpf_refcount_acquire
  bpf: Use bpf_mem_free_rcu when bpf_obj_dropping refcounted nodes
  bpf: Consider non-owning refs trusted
  bpf: Ensure kptr_struct_meta is non-NULL for collection insert and refcount_acquire
  selftests/bpf: Enable cpu v4 tests for RV64
  riscv, bpf: Support unconditional bswap insn
  riscv, bpf: Support signed div/mod insns
  riscv, bpf: Support 32-bit offset jmp insn
  riscv, bpf: Support sign-extension mov insns
  riscv, bpf: Support sign-extension load insns
  riscv, bpf: Fix missing exception handling and redundant zext for LDX_B/H/W
  samples/bpf: Add note to README about the XDP utilities moved to xdp-tools
  samples/bpf: Cleanup .gitignore
  samples/bpf: Remove the xdp_sample_pkts utility
  samples/bpf: Remove the xdp1 and xdp2 utilities
  samples/bpf: Remove the xdp_rxq_info utility
  samples/bpf: Remove the xdp_redirect* utilities
  ...
====================

Link: https://lore.kernel.org/r/20230825194319.12727-1-daniel@iogearbox.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-08-25 18:40:15 -07:00
Helge Deller
382d4cd184 lib/clz_ctz.c: Fix __clzdi2() and __ctzdi2() for 32-bit kernels
The gcc compiler translates on some architectures the 64-bit
__builtin_clzll() function to a call to the libgcc function __clzdi2(),
which should take a 64-bit parameter on 32- and 64-bit platforms.

But in the current kernel code, the built-in __clzdi2() function is
defined to operate (wrongly) on 32-bit parameters if BITS_PER_LONG ==
32, thus the return values on 32-bit kernels are in the range from
[0..31] instead of the expected [0..63] range.

This patch fixes the in-kernel functions __clzdi2() and __ctzdi2() to
take a 64-bit parameter on 32-bit kernels as well, thus it makes the
functions identical for 32- and 64-bit kernels.

This bug went unnoticed since kernel 3.11 for over 10 years, and here
are some possible reasons for that:

 a) Some architectures have assembly instructions to count the bits and
    which are used instead of calling __clzdi2(), e.g. on x86 the bsr
    instruction and on ppc cntlz is used. On such architectures the
    wrong __clzdi2() implementation isn't used and as such the bug has
    no effect and won't be noticed.

 b) Some architectures link to libgcc.a, and the in-kernel weak
    functions get replaced by the correct 64-bit variants from libgcc.a.

 c) __builtin_clzll() and __clzdi2() doesn't seem to be used in many
    places in the kernel, and most likely only in uncritical functions,
    e.g. when printing hex values via seq_put_hex_ll(). The wrong return
    value will still print the correct number, but just in a wrong
    formatting (e.g. with too many leading zeroes).

 d) 32-bit kernels aren't used that much any longer, so they are less
    tested.

A trivial testcase to verify if the currently running 32-bit kernel is
affected by the bug is to look at the output of /proc/self/maps:

Here the kernel uses a correct implementation of __clzdi2():

  root@debian:~# cat /proc/self/maps
  00010000-00019000 r-xp 00000000 08:05 787324     /usr/bin/cat
  00019000-0001a000 rwxp 00009000 08:05 787324     /usr/bin/cat
  0001a000-0003b000 rwxp 00000000 00:00 0          [heap]
  f7551000-f770d000 r-xp 00000000 08:05 794765     /usr/lib/hppa-linux-gnu/libc.so.6
  ...

and this kernel uses the broken implementation of __clzdi2():

  root@debian:~# cat /proc/self/maps
  0000000010000-0000000019000 r-xp 00000000 000000008:000000005 787324  /usr/bin/cat
  0000000019000-000000001a000 rwxp 000000009000 000000008:000000005 787324  /usr/bin/cat
  000000001a000-000000003b000 rwxp 00000000 00:00 0  [heap]
  00000000f73d1000-00000000f758d000 r-xp 00000000 000000008:000000005 794765  /usr/lib/hppa-linux-gnu/libc.so.6
  ...

Signed-off-by: Helge Deller <deller@gmx.de>
Fixes: 4df87bb7b6 ("lib: add weak clz/ctz functions")
Cc: Chanho Min <chanho.min@lge.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: stable@vger.kernel.org # v3.11+
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2023-08-25 13:22:10 -07:00
Linus Torvalds
6f0edbb833 18 hotfixes. 13 are cc:stable and the remainder pertain to post-6.4 issues
or aren't considered suitable for a -stable backport.
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCZOjuGgAKCRDdBJ7gKXxA
 jkLlAQDY9sYxhQZp1PFLirUIPeOBjEyifVy6L6gCfk9j0snLggEA2iK+EtuJt2Dc
 SlMfoTq29zyU/YgfKKwZEVKtPJZOHQU=
 =oTcj
 -----END PGP SIGNATURE-----

Merge tag 'mm-hotfixes-stable-2023-08-25-11-07' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull misc fixes from Andrew Morton:
 "18 hotfixes. 13 are cc:stable and the remainder pertain to post-6.4
  issues or aren't considered suitable for a -stable backport"

* tag 'mm-hotfixes-stable-2023-08-25-11-07' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
  shmem: fix smaps BUG sleeping while atomic
  selftests: cachestat: catch failing fsync test on tmpfs
  selftests: cachestat: test for cachestat availability
  maple_tree: disable mas_wr_append() when other readers are possible
  madvise:madvise_free_pte_range(): don't use mapcount() against large folio for sharing check
  madvise:madvise_free_huge_pmd(): don't use mapcount() against large folio for sharing check
  madvise:madvise_cold_or_pageout_pte_range(): don't use mapcount() against large folio for sharing check
  mm: multi-gen LRU: don't spin during memcg release
  mm: memory-failure: fix unexpected return value in soft_offline_page()
  radix tree: remove unused variable
  mm: add a call to flush_cache_vmap() in vmap_pfn()
  selftests/mm: FOLL_LONGTERM need to be updated to 0x100
  nilfs2: fix general protection fault in nilfs_lookup_dirty_data_buffers()
  mm/gup: handle cont-PTE hugetlb pages correctly in gup_must_unshare() via GUP-fast
  selftests: cgroup: fix test_kmem_basic less than error
  mm: enable page walking API to lock vmas during the walk
  smaps: use vm_normal_page_pmd() instead of follow_trans_huge_pmd()
  mm/gup: reintroduce FOLL_NUMA as FOLL_HONOR_NUMA_FAULT
2023-08-25 11:44:43 -07:00
Mateusz Guzik
c439d5e8a0 pcpcntr: add group allocation/free
Allocations and frees are globally serialized on the pcpu lock (and the
CPU hotplug lock if enabled, which is the case on Debian).

At least one frequent consumer allocates 4 back-to-back counters (and
frees them in the same manner), exacerbating the problem.

While this does not fully remedy scalability issues, it is a step
towards that goal and provides immediate relief.

Signed-off-by: Mateusz Guzik <mjguzik@gmail.com>
Reviewed-by: Dennis Zhou <dennis@kernel.org>
Reviewed-by: Vegard Nossum <vegard.nossum@oracle.com>
Link: https://lore.kernel.org/r/20230823050609.2228718-2-mjguzik@gmail.com
[Dennis: reflowed a few lines]
Signed-off-by: Dennis Zhou <dennis@kernel.org>
2023-08-25 08:06:53 -07:00
Christophe Leroy
b38460bc46 kunit: Fix checksum tests on big endian CPUs
On powerpc64le checksum kunit tests work:

[    2.011457][    T1]     KTAP version 1
[    2.011662][    T1]     # Subtest: checksum
[    2.011848][    T1]     1..3
[    2.034710][    T1]     ok 1 test_csum_fixed_random_inputs
[    2.079325][    T1]     ok 2 test_csum_all_carry_inputs
[    2.127102][    T1]     ok 3 test_csum_no_carry_inputs
[    2.127202][    T1] # checksum: pass:3 fail:0 skip:0 total:3
[    2.127533][    T1] # Totals: pass:3 fail:0 skip:0 total:3
[    2.127956][    T1] ok 1 checksum

But on powerpc64 and powerpc32 they fail:

[    1.859890][    T1]     KTAP version 1
[    1.860041][    T1]     # Subtest: checksum
[    1.860201][    T1]     1..3
[    1.861927][   T58]     # test_csum_fixed_random_inputs: ASSERTION FAILED at lib/checksum_kunit.c:243
[    1.861927][   T58]     Expected result == expec, but
[    1.861927][   T58]         result == 54991 (0xd6cf)
[    1.861927][   T58]         expec == 33316 (0x8224)
[    1.863742][    T1]     not ok 1 test_csum_fixed_random_inputs
[    1.864520][   T60]     # test_csum_all_carry_inputs: ASSERTION FAILED at lib/checksum_kunit.c:267
[    1.864520][   T60]     Expected result == expec, but
[    1.864520][   T60]         result == 255 (0xff)
[    1.864520][   T60]         expec == 65280 (0xff00)
[    1.868820][    T1]     not ok 2 test_csum_all_carry_inputs
[    1.869977][   T62]     # test_csum_no_carry_inputs: ASSERTION FAILED at lib/checksum_kunit.c:306
[    1.869977][   T62]     Expected result == expec, but
[    1.869977][   T62]         result == 64515 (0xfc03)
[    1.869977][   T62]         expec == 0 (0x0)
[    1.872060][    T1]     not ok 3 test_csum_no_carry_inputs
[    1.872102][    T1] # checksum: pass:0 fail:3 skip:0 total:3
[    1.872458][    T1] # Totals: pass:0 fail:3 skip:0 total:3
[    1.872791][    T1] not ok 3 checksum

This is because all expected values were calculated for X86 which
is little endian. On big endian systems all precalculated 16 bits
halves must be byte swapped.

And this is confirmed by a huge amount of sparse errors when building
with C=2

So fix all sparse errors and it will naturally work on all endianness.

Fixes: 688eb8191b ("x86/csum: Improve performance of `csum_partial`")
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-08-25 10:14:34 +01:00
Liam R. Howlett
432af5c966 maple_tree: clean up mas_wr_append()
Avoid setting the variables until necessary, and actually use the
variables where applicable.  Introducing a variable for the slots array
avoids spanning multiple lines.

Add the missing argument to the documentation.

Use the node type when setting the metadata instead of blindly assuming
the type.

Finally, add a trace point to the function for successful store.

Link: https://lkml.kernel.org/r/20230819004356.1454718-3-Liam.Howlett@oracle.com
Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-08-24 16:20:32 -07:00
Matthew Wilcox (Oracle)
f9bff0e318 minmax: add in_range() macro
Patch series "New page table range API", v6.

This patchset changes the API used by the MM to set up page table entries.
The four APIs are:

    set_ptes(mm, addr, ptep, pte, nr)
    update_mmu_cache_range(vma, addr, ptep, nr)
    flush_dcache_folio(folio) 
    flush_icache_pages(vma, page, nr)

flush_dcache_folio() isn't technically new, but no architecture
implemented it, so I've done that for them.  The old APIs remain around
but are mostly implemented by calling the new interfaces.

The new APIs are based around setting up N page table entries at once. 
The N entries belong to the same PMD, the same folio and the same VMA, so
ptep++ is a legitimate operation, and locking is taken care of for you. 
Some architectures can do a better job of it than just a loop, but I have
hesitated to make too deep a change to architectures I don't understand
well.

One thing I have changed in every architecture is that PG_arch_1 is now a
per-folio bit instead of a per-page bit when used for dcache clean/dirty
tracking.  This was something that would have to happen eventually, and it
makes sense to do it now rather than iterate over every page involved in a
cache flush and figure out if it needs to happen.

The point of all this is better performance, and Fengwei Yin has measured
improvement on x86.  I suspect you'll see improvement on your architecture
too.  Try the new will-it-scale test mentioned here:
https://lore.kernel.org/linux-mm/20230206140639.538867-5-fengwei.yin@intel.com/
You'll need to run it on an XFS filesystem and have
CONFIG_TRANSPARENT_HUGEPAGE set.

This patchset is the basis for much of the anonymous large folio work
being done by Ryan, so it's received quite a lot of testing over the last
few months.


This patch (of 38):

Determine if a value lies within a range more efficiently (subtraction +
comparison vs two comparisons and an AND).  It also has useful (under some
circumstances) behaviour if the range exceeds the maximum value of the
type.  Convert all the conflicting definitions of in_range() within the
kernel; some can use the generic definition while others need their own
definition.

Link: https://lkml.kernel.org/r/20230802151406.3735276-1-willy@infradead.org
Link: https://lkml.kernel.org/r/20230802151406.3735276-2-willy@infradead.org
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-08-24 16:20:18 -07:00
Andrew Morton
fcbc329fa3 merge mm-hotfixes-stable into mm-stable to pick up depended-upon changes 2023-08-24 15:25:56 -07:00
Liam R. Howlett
cfeb6ae8bc maple_tree: disable mas_wr_append() when other readers are possible
The current implementation of append may cause duplicate data and/or
incorrect ranges to be returned to a reader during an update.  Although
this has not been reported or seen, disable the append write operation
while the tree is in rcu mode out of an abundance of caution.

During the analysis of the mas_next_slot() the following was
artificially created by separating the writer and reader code:

Writer:                                 reader:
mas_wr_append
    set end pivot
    updates end metata
    Detects write to last slot
    last slot write is to start of slot
    store current contents in slot
    overwrite old end pivot
                                        mas_next_slot():
                                                read end metadata
                                                read old end pivot
                                                return with incorrect range
    store new value

Alternatively:

Writer:                                 reader:
mas_wr_append
    set end pivot
    updates end metata
    Detects write to last slot
    last lost write to end of slot
    store value
                                        mas_next_slot():
                                                read end metadata
                                                read old end pivot
                                                read new end pivot
                                                return with incorrect range
    set old end pivot

There may be other accesses that are not safe since we are now updating
both metadata and pointers, so disabling append if there could be rcu
readers is the safest action.

Link: https://lkml.kernel.org/r/20230819004356.1454718-2-Liam.Howlett@oracle.com
Fixes: 54a611b605 ("Maple Tree: add new data structure")
Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-08-24 14:59:47 -07:00
Takashi Iwai
a057efde80 Merge branch 'for-linus' into for-next
Back-merge the 6.5-devel branch for the clean patch application for
6.6 and resolving merge conflicts.

Signed-off-by: Takashi Iwai <tiwai@suse.de>
2023-08-24 09:27:21 +02:00
Mark Brown
0bbe06493b
Add cs42l43 PC focused SoundWire CODEC
Merge series from Charles Keepax <ckeepax@opensource.cirrus.com>:

This patch chain adds support for the Cirrus Logic cs42l43 PC focused
SoundWire CODEC. The chain is currently based of Lee's for-mfd-next
branch.

This series is mostly just a resend keeping pace with the kernel under
it, except for a minor fixup in the ASoC stuff.

Thanks,
Charles

Charles Keepax (4):
  dt-bindings: mfd: cirrus,cs42l43: Add initial DT binding
  mfd: cs42l43: Add support for cs42l43 core driver
  pinctrl: cs42l43: Add support for the cs42l43
  ASoC: cs42l43: Add support for the cs42l43

Lucas Tanure (2):
  soundwire: bus: Allow SoundWire peripherals to register IRQ handlers
  spi: cs42l43: Add SPI controller support

 .../bindings/sound/cirrus,cs42l43.yaml        |  313 +++
 MAINTAINERS                                   |    4 +
 drivers/mfd/Kconfig                           |   23 +
 drivers/mfd/Makefile                          |    3 +
 drivers/mfd/cs42l43-i2c.c                     |   98 +
 drivers/mfd/cs42l43-sdw.c                     |  239 ++
 drivers/mfd/cs42l43.c                         | 1188 +++++++++
 drivers/mfd/cs42l43.h                         |   28 +
 drivers/pinctrl/cirrus/Kconfig                |   11 +
 drivers/pinctrl/cirrus/Makefile               |    2 +
 drivers/pinctrl/cirrus/pinctrl-cs42l43.c      |  609 +++++
 drivers/soundwire/bus.c                       |   32 +
 drivers/soundwire/bus_type.c                  |   12 +
 drivers/spi/Kconfig                           |    7 +
 drivers/spi/Makefile                          |    1 +
 drivers/spi/spi-cs42l43.c                     |  284 ++
 include/linux/mfd/cs42l43-regs.h              | 1184 +++++++++
 include/linux/mfd/cs42l43.h                   |  102 +
 include/linux/soundwire/sdw.h                 |    9 +
 include/sound/cs42l43.h                       |   17 +
 sound/soc/codecs/Kconfig                      |   16 +
 sound/soc/codecs/Makefile                     |    4 +
 sound/soc/codecs/cs42l43-jack.c               |  946 +++++++
 sound/soc/codecs/cs42l43-sdw.c                |   74 +
 sound/soc/codecs/cs42l43.c                    | 2278 +++++++++++++++++
 sound/soc/codecs/cs42l43.h                    |  131 +
 26 files changed, 7615 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/sound/cirrus,cs42l43.yaml
 create mode 100644 drivers/mfd/cs42l43-i2c.c
 create mode 100644 drivers/mfd/cs42l43-sdw.c
 create mode 100644 drivers/mfd/cs42l43.c
 create mode 100644 drivers/mfd/cs42l43.h
 create mode 100644 drivers/pinctrl/cirrus/pinctrl-cs42l43.c
 create mode 100644 drivers/spi/spi-cs42l43.c
 create mode 100644 include/linux/mfd/cs42l43-regs.h
 create mode 100644 include/linux/mfd/cs42l43.h
 create mode 100644 include/sound/cs42l43.h
 create mode 100644 sound/soc/codecs/cs42l43-jack.c
 create mode 100644 sound/soc/codecs/cs42l43-sdw.c
 create mode 100644 sound/soc/codecs/cs42l43.c
 create mode 100644 sound/soc/codecs/cs42l43.h

--
2.30.2
2023-08-22 12:48:04 +01:00
Andrew Morton
5994eabf3b merge mm-hotfixes-stable into mm-stable to pick up depended-upon changes 2023-08-21 14:26:20 -07:00
Andy Shevchenko
3d0b713984 kstrtox: consistently use _tolower()
We already use _tolower() in other places, so convert the one which open
codes it.

Link: https://lkml.kernel.org/r/20230817145919.543251-1-andriy.shevchenko@linux.intel.com
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-08-21 13:46:25 -07:00
Andy Shevchenko
6655360923 lib/vsprintf: declare no_hash_pointers in sprintf.h
Sparse is not happy to see non-static variable without declaration:
lib/vsprintf.c:61:6: warning: symbol 'no_hash_pointers' was not declared. 
Should it be static?

Declare respective variable in the sprintf.h.  With this, add a comment to
discourage its use if no real need.

Link: https://lkml.kernel.org/r/20230814163344.17429-3-andriy.shevchenko@linux.intel.com
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Acked-by: Marco Elver <elver@google.com>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Cc: Sergey Senozhatsky <senozhatsky@chromium.org>
Cc: Steven Rostedt (Google) <rostedt@goodmis.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-08-21 13:46:24 -07:00
Andy Shevchenko
39ced19b9e lib/vsprintf: split out sprintf() and friends
Patch series "lib/vsprintf: Rework header inclusions", v3.

Some patches that reduce the mess with the header inclusions related to
vsprintf.c module.  Each patch has its own description, and has no
dependencies to each other, except the collisions over modifications of
the same places.  Hence the series.


This patch (of 2):

kernel.h is being used as a dump for all kinds of stuff for a long time. 
sprintf() and friends are used in many drivers without need of the full
kernel.h dependency train with it.

Here is the attempt on cleaning it up by splitting out sprintf() and
friends.

Link: https://lkml.kernel.org/r/20230814163344.17429-1-andriy.shevchenko@linux.intel.com
Link: https://lkml.kernel.org/r/20230814163344.17429-2-andriy.shevchenko@linux.intel.com
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Marco Elver <elver@google.com>
Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Cc: Sergey Senozhatsky <senozhatsky@chromium.org>
Cc: Steven Rostedt (Google) <rostedt@goodmis.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-08-21 13:46:24 -07:00
Liam R. Howlett
530f745c76 maple_tree: replace data before marking dead in split and spanning store
Reorder the operations for split and spanning stores so that new data is
placed in the tree prior to marking the old data as dead.  This will limit
re-walks on dead data to just once instead of a retry loop.

The order of operations is as follows: Create the new data, put the new
data in place, mark the top node of the old data as dead.

Then repair parent links in the reused nodes through all levels of the
tree, following the new nodes downwards.  Finally walk the top dead node
looking for nodes that are no longer used, or subtrees that should be
destroyed (marked dead throughout then freed), follow the partially used
nodes downwards to discover other dead nodes and subtrees.

Link: https://lkml.kernel.org/r/20230804165951.2661157-7-Liam.Howlett@oracle.com
Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Paul E. McKenney <paulmck@kernel.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-08-21 13:37:41 -07:00
Liam R. Howlett
068bafcac0 maple_tree: change mas_adopt_children() parent usage
All calls to mas_adopt_children() currently pass the parent as the node in
the maple state.  Allow for the parent pointer that is passed in to be
used instead.

Link: https://lkml.kernel.org/r/20230804165951.2661157-6-Liam.Howlett@oracle.com
Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Paul E. McKenney <paulmck@kernel.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-08-21 13:37:41 -07:00
Liam R. Howlett
4ffc2ee2cf maple_tree: introduce mas_tree_parent() definition
Add a definition to shorten long code lines and clarify what the code is
doing.  Use the new definition to get the maple tree parent pointer from
the maple state where possible.

Link: https://lkml.kernel.org/r/20230804165951.2661157-5-Liam.Howlett@oracle.com
Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Paul E. McKenney <paulmck@kernel.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-08-21 13:37:41 -07:00
Liam R. Howlett
1238f6a226 maple_tree: introduce mas_put_in_tree()
mas_replace() has a single user that takes a flag which is now always
true.  Replace this function with mas_put_in_tree() to better align with
mas_replace_node().  Inline the remaining logic into the only caller;
mas_wmb_replace().

Link: https://lkml.kernel.org/r/20230804165951.2661157-4-Liam.Howlett@oracle.com
Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Paul E. McKenney <paulmck@kernel.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-08-21 13:37:41 -07:00
Liam R. Howlett
72bcf4aa86 maple_tree: reorder replacement of nodes to avoid live lock
Replacing nodes may cause a live lock-up if CPU resources are saturated by
write operations on the tree by continuously retrying on dead nodes.  To
avoid the continuous retry scenario, ensure the new node is inserted into
the tree prior to marking the old data as dead.  This will define a window
where old and new data is swapped.

When reusing lower level nodes, ensure the parent pointer is updated after
the parent is marked dead.  This ensures that the child is still reachable
from the top of the tree, but walking up to a dead node will result in a
single retry that will start a fresh walk from the top down through the
new node.

Link: https://lkml.kernel.org/r/20230804165951.2661157-3-Liam.Howlett@oracle.com
Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Paul E. McKenney <paulmck@kernel.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-08-21 13:37:41 -07:00
Liam R. Howlett
83d97f620f maple_tree: add hex output to maple_arange64 dump
Patch series "maple_tree: Change replacement strategy".

The maple tree marks nodes dead as soon as they are going to be replaced. 
This could be problematic when used in the RCU context since the writer
may be starved of CPU time by the readers.  This patch set addresses the
issue by switching the data replacement strategy to one that will only
mark data as dead once the new data is available.

This series changes the ordering of the node replacement so that the new
data is live before the old data is marked 'dead'.  When readers hit
'dead' nodes, they will restart from the top of the tree and end up in the
new data.

In more complex scenarios, the replacement strategy means a subtree is
built and graphed into the tree leaving some nodes to point to the old
parent.  The view of tasks into the old data will either remain with the
old data, or see the new data once the old data is marked 'dead'.

Iterators will see the 'dead' node and restart on their own and switch to
the new data.  There is no risk of the reader seeing old data in these
cases.

The 'dead' subtree of data is then fully marked dead, but reused nodes
will still point to the dead nodes until the parent pointer is updated. 
Walking up to a 'dead' node will cause a re-walk from the top of the tree
and enter the new data area where old data is not reachable.

Once the parent pointers are fully up to date in the active data, the
'dead' subtree is iterated to collect entirely 'dead' subtrees, and dead
nodes (nodes that partially contained reused data).


This patch (of 6):

When dumping the tree, honour formatting request to output hex for the
maple node type arange64.

Link: https://lkml.kernel.org/r/20230804165951.2661157-1-Liam.Howlett@oracle.com
Link: https://lkml.kernel.org/r/20230804165951.2661157-2-Liam.Howlett@oracle.com
Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Paul E. McKenney <paulmck@kernel.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-08-21 13:37:40 -07:00
Arnd Bergmann
d59070d107 radix tree: remove unused variable
Recent versions of clang warn about an unused variable, though older
versions saw the 'slot++' as a use and did not warn:

radix-tree.c:1136:50: error: parameter 'slot' set but not used [-Werror,-Wunused-but-set-parameter]

It's clearly not needed any more, so just remove it.

Link: https://lkml.kernel.org/r/20230811131023.2226509-1-arnd@kernel.org
Fixes: 3a08cd52c3 ("radix tree: Remove multiorder support")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Peng Zhang <zhangpeng.00@bytedance.com>
Cc: Rong Tao <rongtao@cestc.cn>
Cc: Tom Rix <trix@redhat.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-08-21 13:07:22 -07:00
Rae Moar
25e324bc9c kunit: fix struct kunit_attr header
Add parameter descriptions to struct kunit_attr header for the
parameters attr_default and print.

Fixes: 39e92cb1e4 ("kunit: Add test attributes API structure")

Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202308180127.VD7YRPGa-lkp@intel.com/

Signed-off-by: Rae Moar <rmoar@google.com>
Reviewed-by: David Gow <davidgow@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2023-08-21 08:07:56 -06:00
Zhen Lei
1b28cb81da kobject: Remove redundant checks for whether ktype is NULL
When adding koject or kset, we have made sure that ktype cannot be NULL.
Therefore, after adding koject or kset, there is no need to worry about
ktype being NULL. Clear all ktype-related redundancy checks.

Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
Link: https://lore.kernel.org/r/20230805084114.1298-3-thunder.leizhen@huaweicloud.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-08-19 19:37:53 +02:00
Zhen Lei
4d0fe8c52b kobject: Add sanity check for kset->kobj.ktype in kset_register()
When I register a kset in the following way:
	static struct kset my_kset;
	kobject_set_name(&my_kset.kobj, "my_kset");
        ret = kset_register(&my_kset);

A null pointer dereference exception is occurred:
[ 4453.568337] Unable to handle kernel NULL pointer dereference at \
virtual address 0000000000000028
... ...
[ 4453.810361] Call trace:
[ 4453.813062]  kobject_get_ownership+0xc/0x34
[ 4453.817493]  kobject_add_internal+0x98/0x274
[ 4453.822005]  kset_register+0x5c/0xb4
[ 4453.825820]  my_kobj_init+0x44/0x1000 [my_kset]
... ...

Because I didn't initialize my_kset.kobj.ktype.

According to the description in Documentation/core-api/kobject.rst:
 - A ktype is the type of object that embeds a kobject.  Every structure
   that embeds a kobject needs a corresponding ktype.

So add sanity check to make sure kset->kobj.ktype is not NULL.

Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
Link: https://lore.kernel.org/r/20230805084114.1298-2-thunder.leizhen@huaweicloud.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-08-19 19:37:53 +02:00
Jakub Kicinski
7ff57803d2 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Cross-merge networking fixes after downstream PR.

Conflicts:

drivers/net/ethernet/sfc/tc.c
  fa165e1949 ("sfc: don't unregister flow_indr if it was never registered")
  3bf969e88a ("sfc: add MAE table machinery for conntrack table")
https://lore.kernel.org/all/20230818112159.7430e9b4@canb.auug.org.au/

No adjacent changes.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-08-18 12:44:56 -07:00
Douglas Anderson
8d539b84f1 nmi_backtrace: allow excluding an arbitrary CPU
The APIs that allow backtracing across CPUs have always had a way to
exclude the current CPU.  This convenience means callers didn't need to
find a place to allocate a CPU mask just to handle the common case.

Let's extend the API to take a CPU ID to exclude instead of just a
boolean.  This isn't any more complex for the API to handle and allows the
hardlockup detector to exclude a different CPU (the one it already did a
trace for) without needing to find space for a CPU mask.

Arguably, this new API also encourages safer behavior.  Specifically if
the caller wants to avoid tracing the current CPU (maybe because they
already traced the current CPU) this makes it more obvious to the caller
that they need to make sure that the current CPU ID can't change.

[akpm@linux-foundation.org: fix trigger_allbutcpu_cpu_backtrace() stub]
Link: https://lkml.kernel.org/r/20230804065935.v4.1.Ia35521b91fc781368945161d7b28538f9996c182@changeid
Signed-off-by: Douglas Anderson <dianders@chromium.org>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: kernel test robot <lkp@intel.com>
Cc: Lecopzer Chen <lecopzer.chen@mediatek.com>
Cc: Petr Mladek <pmladek@suse.com>
Cc: Pingfan Liu <kernelfans@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-08-18 10:19:00 -07:00
John Sanpe
02d7f74a04 lib/bch.c: use bitrev instead of internal logic
Replace internal logic with separate bitrev library.

Link: https://lkml.kernel.org/r/20230730081717.1498217-1-sanpeqf@gmail.com
Signed-off-by: John Sanpe <sanpeqf@gmail.com>
Cc: Bhaskar Chowdhury <unixbhaskar@gmail.com>
Cc: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-08-18 10:18:58 -07:00
Wang Ming
a7284b0e75 lib: error-inject: remove error checking for debugfs_create_dir()
It is expected that most callers should _ignore_ the errors return by
debugfs_create_dir() in ei_debugfs_init().

Link: https://lkml.kernel.org/r/20230719144355.6720-1-machel@vivo.com
Signed-off-by: Wang Ming <machel@vivo.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-08-18 10:18:55 -07:00
Wang Ming
c3d2d45b06 lib: remove error checking for debugfs_create_dir()
It is expected that most callers should _ignore_ the errors return by
debugfs_create_dir() in err_inject_init().

Link: https://lkml.kernel.org/r/20230713082455.2415-1-machel@vivo.com
Signed-off-by: Wang Ming <machel@vivo.com>
Cc: Akinobu Mita <akinobu.mita@gmail.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-08-18 10:18:55 -07:00
Sumitra Sharma
ae96e0cdc7 lib: replace kmap() with kmap_local_page()
kmap() has been deprecated in favor of the kmap_local_page() due to high
cost, restricted mapping space, the overhead of a global lock for
synchronization, and making the process sleep in the absence of free
slots.

kmap_local_page() is faster than kmap() and offers thread-local and
CPU-local mappings, take pagefaults in a local kmap region and preserves
preemption by saving the mappings of outgoing tasks and restoring those of
the incoming one during a context switch.

The mappings are kept thread local in the functions “dmirror_do_read”
and “dmirror_do_write” in test_hmm.c

Therefore, replace kmap() with kmap_local_page() and use
mempcy_from/to_page() to avoid open coding kmap_local_page() + memcpy() +
kunmap_local().

Remove the unused variable “tmp”.

Link: https://lkml.kernel.org/r/20230610175712.GA348514@sumitra.com
Signed-off-by: Sumitra Sharma <sumitraartsy@gmail.com>
Suggested-by: Fabio M. De Francesco <fmdefrancesco@gmail.com>
Reviewed-by: Fabio M. De Francesco <fmdefrancesco@gmail.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Cc: Deepak R Varma <drv@mailo.com>
Cc: Jérôme Glisse <jglisse@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-08-18 10:18:50 -07:00
Liam R. Howlett
fec2936434 maple_tree: reduce resets during store setup
mas_prealloc() may walk partially down the tree before finding that a
split or spanning store is needed.  When the write occurs, relax the
logic on resetting the walk so that partial walks will not restart, but
walks that have gone too far (a store that affects beyond the current
node) should be restarted.

Link: https://lkml.kernel.org/r/20230724183157.3939892-15-Liam.Howlett@oracle.com
Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Peng Zhang <zhangpeng.00@bytedance.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-08-18 10:12:50 -07:00
Liam R. Howlett
17983dc617 maple_tree: refine mas_preallocate() node calculations
Calculate the number of nodes based on the pending write action instead
of assuming the worst case.

This addresses a performance regression introduced in platforms that
have longer allocation timing.

Link: https://lkml.kernel.org/r/20230724183157.3939892-14-Liam.Howlett@oracle.com
Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Peng Zhang <zhangpeng.00@bytedance.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-08-18 10:12:50 -07:00
Liam R. Howlett
a7496ad529 maple_tree: move mas_wr_end_piv() below mas_wr_extend_null()
Relocate it and call mas_wr_extend_null() from within mas_wr_end_piv().
Extending the NULL may affect the end pivot value so call
mas_wr_endtend_null() from within mas_wr_end_piv() to keep it all
together.

Link: https://lkml.kernel.org/r/20230724183157.3939892-12-Liam.Howlett@oracle.com
Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Peng Zhang <zhangpeng.00@bytedance.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-08-18 10:12:49 -07:00
Liam R. Howlett
c108df767f maple_tree: adjust node allocation on mas_rebalance()
mas_rebalance() is called to rebalance an insufficient node into a
single node or two sufficient nodes.  The preallocation estimate is
always too many in this case as the height of the tree will never grow
and there is no possibility to have a three way split in this case, so
revise the node allocation count.

Link: https://lkml.kernel.org/r/20230724183157.3939892-9-Liam.Howlett@oracle.com
Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Peng Zhang <zhangpeng.00@bytedance.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-08-18 10:12:48 -07:00
Liam R. Howlett
da0892547b maple_tree: re-introduce entry to mas_preallocate() arguments
The current preallocation strategy is to preallocate the absolute
worst-case allocation for a tree modification.  The entry (or NULL) is
needed to know how many nodes are needed to write to the tree.  Start by
adding the argument to the mas_preallocate() definition.

Link: https://lkml.kernel.org/r/20230724183157.3939892-8-Liam.Howlett@oracle.com
Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Peng Zhang <zhangpeng.00@bytedance.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-08-18 10:12:48 -07:00
Liam R. Howlett
8c314f3b55 maple_tree: add benchmarking for mas_prev()
Add some benchmarking functions in testing for mas_prev().  This is
useful to ensure there are no regressions added during modifications.

Link: https://lkml.kernel.org/r/20230724183157.3939892-3-Liam.Howlett@oracle.com
Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Peng Zhang <zhangpeng.00@bytedance.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-08-18 10:12:47 -07:00
Liam R. Howlett
361c678be7 maple_tree: add benchmarking for mas_for_each
Patch series "Reduce preallocations for maple tree", v3.

Initial work on preallocations showed no regression in performance during
testing, but recently some users (both on [1] and off [android] list) have
reported that preallocating the worst-case number of nodes has caused some
slow down.  This patch set addresses the number of allocations in a few
ways.

During munmap() most munmap() operations will remove a single VMA, so
leverage the fact that the maple tree can place a single pointer at range
0 - 0 without allocating.  This is done by changing the index of the VMAs
to be indexed by the count, starting at 0.

Re-introduce the entry argument to mas_preallocate() so that a more
intelligent guess of the node count can be made.

Implement the more intelligent guess of the node count, although there is
more work to be done.

During development of v2 of this patch set, I also noticed that the number
of nodes being allocated for a rebalance was beyond what could possibly be
needed.  This is addressed in patch 0008.


This patch (of 15):

Add a way to test the speed of mas_for_each() to the testing code.

Link: https://lkml.kernel.org/r/20230724183157.3939892-1-Liam.Howlett@oracle.com
Link: https://lkml.kernel.org/r/20230724183157.3939892-2-Liam.Howlett@oracle.com
Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Peng Zhang <zhangpeng.00@bytedance.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-08-18 10:12:47 -07:00
Liam R. Howlett
19a462f06e maple_tree: Be more strict about locking
Use lockdep to check the write path in the maple tree holds the lock in
write mode.

Introduce mt_write_lock_is_held() to check if the lock is held for
writing.  Update the necessary checks for rcu_dereference_protected() to
use the new write lock check.

Link: https://lkml.kernel.org/r/20230714195551.894800-5-Liam.Howlett@oracle.com
Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oliver Sang <oliver.sang@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-08-18 10:12:40 -07:00
Mike Rapoport (IBM)
4ae6944d15 maple_tree: mtree_insert: fix typo in kernel-doc description of GFP flags
Replace FGP_FLAGS with GFP_FLAGS

Link: https://lkml.kernel.org/r/20230715084038.987955-1-rppt@kernel.org
Signed-off-by: Mike Rapoport (IBM) <rppt@kernel.org>
Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-08-18 10:12:38 -07:00
Mike Rapoport (IBM)
4445e58264 maple_tree: mtree_insert*: fix typo in kernel-doc description
Replace "Insert and entry at a give index" with "Insert an entry at a
given index"

Link: https://lkml.kernel.org/r/20230715143920.994812-1-rppt@kernel.org
Signed-off-by: Mike Rapoport (IBM) <rppt@kernel.org>
Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-08-18 10:12:38 -07:00
Andrew Donnellan
efb78fa86e lib/test_meminit: allocate pages up to order MAX_ORDER
test_pages() tests the page allocator by calling alloc_pages() with
different orders up to order 10.

However, different architectures and platforms support different maximum
contiguous allocation sizes.  The default maximum allocation order
(MAX_ORDER) is 10, but architectures can use CONFIG_ARCH_FORCE_MAX_ORDER
to override this.  On platforms where this is less than 10, test_meminit()
will blow up with a WARN().  This is expected, so let's not do that.

Replace the hardcoded "10" with the MAX_ORDER macro so that we test
allocations up to the expected platform limit.

Link: https://lkml.kernel.org/r/20230714015238.47931-1-ajd@linux.ibm.com
Fixes: 5015a300a5 ("lib: introduce test_meminit module")
Signed-off-by: Andrew Donnellan <ajd@linux.ibm.com>
Reviewed-by: Alexander Potapenko <glider@google.com>
Cc: Xiaoke Wang <xkernel.wang@foxmail.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-08-18 10:12:32 -07:00
Peng Zhang
6783bd4b5f maple_tree: drop mas_first_entry()
The internal function mas_first_entry() is no longer used, so drop it.

Link: https://lkml.kernel.org/r/20230711035444.526-9-zhangpeng.00@bytedance.com
Signed-off-by: Peng Zhang <zhangpeng.00@bytedance.com>
Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Tested-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-08-18 10:12:22 -07:00
Peng Zhang
29b2681f1a maple_tree: replace mas_logical_pivot() with mas_safe_pivot()
Replace mas_logical_pivot() with mas_safe_pivot() and drop
mas_logical_pivot() since it won't be used anymore.  We can do this since
now all nodes will have node limit pivot (if it is not full node).

Link: https://lkml.kernel.org/r/20230711035444.526-8-zhangpeng.00@bytedance.com
Signed-off-by: Peng Zhang <zhangpeng.00@bytedance.com>
Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Tested-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-08-18 10:12:22 -07:00
Peng Zhang
a489539e33 maple_tree: update mt_validate()
Instead of using mas_first_entry() to find the leftmost leaf, use a simple
loop instead.  Remove an unneeded check for root node.  To make the error
message more accurate, check pivots first and then slots, because checking
slots depend on the node limit pivot to break the loop.

Link: https://lkml.kernel.org/r/20230711035444.526-7-zhangpeng.00@bytedance.com
Signed-off-by: Peng Zhang <zhangpeng.00@bytedance.com>
Tested-by: Geert Uytterhoeven <geert@linux-m68k.org>
Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-08-18 10:12:22 -07:00
Peng Zhang
33af39d024 maple_tree: make mas_validate_limits() check root node and node limit
Update mas_validate_limits() to check root node, check node limit pivot if
there is enough room for it to exist and check data_end.  Remove the check
for child existence as it is done in mas_validate_child_slot().

Link: https://lkml.kernel.org/r/20230711035444.526-6-zhangpeng.00@bytedance.com
Signed-off-by: Peng Zhang <zhangpeng.00@bytedance.com>
Tested-by: Geert Uytterhoeven <geert@linux-m68k.org>
Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-08-18 10:12:22 -07:00
Peng Zhang
e93fda5a1a maple_tree: fix mas_validate_child_slot() to check last missed slot
Don't break the loop before checking the last slot.  Also here check if
non-leaf nodes are missing children.

Link: https://lkml.kernel.org/r/20230711035444.526-5-zhangpeng.00@bytedance.com
Signed-off-by: Peng Zhang <zhangpeng.00@bytedance.com>
Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Tested-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-08-18 10:12:21 -07:00
Peng Zhang
f8e5eac8ab maple_tree: make mas_validate_gaps() to check metadata
Make mas_validate_gaps() check whether the offset in the metadata points
to the largest gap.  By the way, simplify this function.

Add the verification that gaps beyond the node limit are zero.

Link: https://lkml.kernel.org/r/20230711035444.526-4-zhangpeng.00@bytedance.com
Signed-off-by: Peng Zhang <zhangpeng.00@bytedance.com>
Tested-by: Geert Uytterhoeven <geert@linux-m68k.org>
Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-08-18 10:12:21 -07:00
Peng Zhang
d695c30a8c maple_tree: don't use MAPLE_ARANGE64_META_MAX to indicate no gap
Patch series "Improve the validation for maple tree and some cleanup", v2.


This patch (of 7):

Do not use a special offset to indicate that there is no gap.  When there
is no gap, offset can point to any valid slots because its gap is 0.

Link: https://lkml.kernel.org/r/20230711035444.526-1-zhangpeng.00@bytedance.com
Link: https://lkml.kernel.org/r/20230711035444.526-3-zhangpeng.00@bytedance.com
Signed-off-by: Peng Zhang <zhangpeng.00@bytedance.com>
Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Tested-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-08-18 10:12:21 -07:00
Peng Zhang
64891ba3e5 maple_tree: add a fast path case in mas_wr_slot_store()
When expanding a range in two directions, only partially overwriting the
previous and next ranges, the number of entries will not be increased, so
we can just update the pivots as a fast path. However, it may introduce
potential risks in RCU mode, because it updates two pivots. We only
enable it in non-RCU mode.

Link: https://lkml.kernel.org/r/20230628073657.75314-5-zhangpeng.00@bytedance.com
Signed-off-by: Peng Zhang <zhangpeng.00@bytedance.com>
Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-08-18 10:12:05 -07:00
Peng Zhang
23e9dde0b2 maple_tree: optimize mas_wr_append(), also improve duplicating VMAs
When the new range can be completely covered by the original last range
without touching the boundaries on both sides, two new entries can be
appended to the end as a fast path. We update the original last pivot at
the end, and the newly appended two entries will not be accessed before
this, so it is also safe in RCU mode.

This is useful for sequential insertion, which is what we do in
dup_mmap(). Enabling BENCH_FORK in test_maple_tree and just running
bench_forking() gives the following time-consuming numbers:

before:               after:
17,874.83 msec        15,738.38 msec

It shows about a 12% performance improvement for duplicating VMAs.

Link: https://lkml.kernel.org/r/20230628073657.75314-4-zhangpeng.00@bytedance.com
Signed-off-by: Peng Zhang <zhangpeng.00@bytedance.com>
Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-08-18 10:12:05 -07:00
Peng Zhang
d6e8d0dc19 maple_tree: add test for mas_wr_modify() fast path
Patch series "Optimize the fast path of mas_store()", v4.

Add fast paths for mas_wr_append() and mas_wr_slot_store() respectively.
The newly added fast path of mas_wr_append() is used in fork() and how
much it benefits fork() depends on how many VMAs are duplicated.

Thanks Liam for the review.


This patch (of 4):

Add tests for all cases of mas_wr_append() and mas_wr_slot_store().

Link: https://lkml.kernel.org/r/20230628073657.75314-1-zhangpeng.00@bytedance.com
Link: https://lkml.kernel.org/r/20230628073657.75314-2-zhangpeng.00@bytedance.com
Signed-off-by: Peng Zhang <zhangpeng.00@bytedance.com>
Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-08-18 10:12:05 -07:00
Thomas Gleixner
fad9c80e63 maple_tree: fix a few documentation issues
The documentation of mt_next() claims that it starts the search at the
provided index.  That's incorrect as it starts the search after the
provided index.

The documentation of mt_find() is slightly confusing.  "Handles locking"
is not really helpful as it does not explain how the "locking" works. 
Also the documentation of index talks about a range, while in reality the
index is updated on a succesful search to the index of the found entry
plus one.

Fix similar issues for mt_find_after() and mt_prev().

Reword the confusing "Note: Will not return the zero entry." comment on
mt_for_each() and document @__index correctly.

Link: https://lkml.kernel.org/r/87ttw2n556.ffs@tglx
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Shanker Donthineni <sdonthineni@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-08-18 10:12:00 -07:00
Helge Deller
b6594a17ec bpf/tests: Enhance output on error and fix typos
If a testcase returns a wrong (unexpected) value, print the expected and
returned value in hex notation in addition to the decimal notation.

This is very useful in tests which bit-shift hex values left or right and
helped me a lot while developing the JIT compiler for the hppa architecture.

Additionally fix two typos: dowrd -> dword, tall calls -> tail calls.

Signed-off-by: Helge Deller <deller@gmx.de>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/ZN6ZAAVoWZpsD1Jf@p100
2023-08-18 17:08:42 +02:00
Takashi Iwai
70e969eb23 iov_iter: Export import_ubuf()
Export import_ubuf() to be used in sound subsystem for generic memory
handling as Linus suggested.  It's used for constructing an iov_iter
of a single segment user-space copy for PCM data.

Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Link: https://lore.kernel.org/r/CAHk-=wh-mUL6mp4chAc6E_UjwpPLyCPRCJK+iB4ZMD2BqjwGHA@mail.gmail.com
Link: https://lore.kernel.org/r/20230815190136.8987-2-tiwai@suse.de
Signed-off-by: Takashi Iwai <tiwai@suse.de>
2023-08-18 12:18:15 +02:00
Nathan Chancellor
92382d7441 lib: test_scanf: Add explicit type cast to result initialization in test_number_prefix()
A recent change in clang allows it to consider more expressions as
compile time constants, which causes it to point out an implicit
conversion in the scanf tests:

  lib/test_scanf.c:661:2: warning: implicit conversion from 'int' to 'unsigned char' changes value from -168 to 88 [-Wconstant-conversion]
    661 |         test_number_prefix(unsigned char,       "0xA7", "%2hhx%hhx", 0, 0xa7, 2, check_uchar);
        |         ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  lib/test_scanf.c:609:29: note: expanded from macro 'test_number_prefix'
    609 |         T result[2] = {~expect[0], ~expect[1]};                                 \
        |                       ~            ^~~~~~~~~~
  1 warning generated.

The result of the bitwise negation is the type of the operand after
going through the integer promotion rules, so this truncation is
expected but harmless, as the initial values in the result array get
overwritten by _test() anyways. Add an explicit cast to the expected
type in test_number_prefix() to silence the warning. There is no
functional change, as all the tests still pass with GCC 13.1.0 and clang
18.0.0.

Cc: stable@vger.kernel.org
Link: https://github.com/ClangBuiltLinux/linuxq/issues/1899
Link: 610ec954e1
Suggested-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Signed-off-by: Petr Mladek <pmladek@suse.com>
Link: https://lore.kernel.org/r/20230807-test_scanf-wconstant-conversion-v2-1-839ca39083e1@kernel.org
2023-08-16 11:47:29 +02:00
Marco Elver
aa9f10d570 hardening: Move BUG_ON_DATA_CORRUPTION to hardening options
BUG_ON_DATA_CORRUPTION is turning detected corruptions of list data
structures from WARNings into BUGs. This can be useful to stop further
corruptions or even exploitation attempts.

However, the option has less to do with debugging than with hardening.
With the introduction of LIST_HARDENED, it makes more sense to move it
to the hardening options, where it selects LIST_HARDENED instead.

Without this change, combining BUG_ON_DATA_CORRUPTION with LIST_HARDENED
alone wouldn't be possible, because DEBUG_LIST would always be selected
by BUG_ON_DATA_CORRUPTION.

Signed-off-by: Marco Elver <elver@google.com>
Link: https://lore.kernel.org/r/20230811151847.1594958-4-elver@google.com
Signed-off-by: Kees Cook <keescook@chromium.org>
2023-08-15 14:57:25 -07:00
Marco Elver
aebc7b0d8d list: Introduce CONFIG_LIST_HARDENED
Numerous production kernel configs (see [1, 2]) are choosing to enable
CONFIG_DEBUG_LIST, which is also being recommended by KSPP for hardened
configs [3]. The motivation behind this is that the option can be used
as a security hardening feature (e.g. CVE-2019-2215 and CVE-2019-2025
are mitigated by the option [4]).

The feature has never been designed with performance in mind, yet common
list manipulation is happening across hot paths all over the kernel.

Introduce CONFIG_LIST_HARDENED, which performs list pointer checking
inline, and only upon list corruption calls the reporting slow path.

To generate optimal machine code with CONFIG_LIST_HARDENED:

  1. Elide checking for pointer values which upon dereference would
     result in an immediate access fault (i.e. minimal hardening
     checks).  The trade-off is lower-quality error reports.

  2. Use the __preserve_most function attribute (available with Clang,
     but not yet with GCC) to minimize the code footprint for calling
     the reporting slow path. As a result, function size of callers is
     reduced by avoiding saving registers before calling the rarely
     called reporting slow path.

     Note that all TUs in lib/Makefile already disable function tracing,
     including list_debug.c, and __preserve_most's implied notrace has
     no effect in this case.

  3. Because the inline checks are a subset of the full set of checks in
     __list_*_valid_or_report(), always return false if the inline
     checks failed.  This avoids redundant compare and conditional
     branch right after return from the slow path.

As a side-effect of the checks being inline, if the compiler can prove
some condition to always be true, it can completely elide some checks.

Since DEBUG_LIST is functionally a superset of LIST_HARDENED, the
Kconfig variables are changed to reflect that: DEBUG_LIST selects
LIST_HARDENED, whereas LIST_HARDENED itself has no dependency on
DEBUG_LIST.

Running netperf with CONFIG_LIST_HARDENED (using a Clang compiler with
"preserve_most") shows throughput improvements, in my case of ~7% on
average (up to 20-30% on some test cases).

Link: https://r.android.com/1266735 [1]
Link: https://gitlab.archlinux.org/archlinux/packaging/packages/linux/-/blob/main/config [2]
Link: https://kernsec.org/wiki/index.php/Kernel_Self_Protection_Project/Recommended_Settings [3]
Link: https://googleprojectzero.blogspot.com/2019/11/bad-binder-android-in-wild-exploit.html [4]
Signed-off-by: Marco Elver <elver@google.com>
Link: https://lore.kernel.org/r/20230811151847.1594958-3-elver@google.com
Signed-off-by: Kees Cook <keescook@chromium.org>
2023-08-15 14:57:25 -07:00
Marco Elver
b16c42c8fd list_debug: Introduce inline wrappers for debug checks
Turn the list debug checking functions __list_*_valid() into inline
functions that wrap the out-of-line functions. Care is taken to ensure
the inline wrappers are always inlined, so that additional compiler
instrumentation (such as sanitizers) does not result in redundant
outlining.

This change is preparation for performing checks in the inline wrappers.

No functional change intended.

Signed-off-by: Marco Elver <elver@google.com>
Link: https://lore.kernel.org/r/20230811151847.1594958-2-elver@google.com
Signed-off-by: Kees Cook <keescook@chromium.org>
2023-08-15 14:57:24 -07:00
WANG Xuerui
7b3c70c43c raid6: test: only check for Altivec if building on powerpc hosts
Altivec is only available for powerpc hosts, so only check for its
availability when the host is powerpc, to avoid error messages being
shown on architectures other than x86, arm or powerpc.

Signed-off-by: WANG Xuerui <git@xen0n.name>
Link: https://lore.kernel.org/r/20230731104911.411964-6-kernel@xen0n.name
Signed-off-by: Song Liu <song@kernel.org>
2023-08-15 09:40:27 -07:00
WANG Xuerui
6601f5e122 raid6: test: make sure all intermediate and artifact files are .gitignored
Currently when the raid6test utility is built, the resulting binary and
an int.uc file are not being ignored, which can get inadvertently
committed as a result when one works on the raid6 code. Ignore them to
make `git status` clean at all times.

Signed-off-by: WANG Xuerui <git@xen0n.name>
Link: https://lore.kernel.org/r/20230731104911.411964-5-kernel@xen0n.name
Signed-off-by: Song Liu <song@kernel.org>
2023-08-15 09:40:27 -07:00
WANG Xuerui
2008d89fb6 raid6: test: cosmetic cleanups for the test Makefile
Use tabs/spaces consistently: hard tabs for marking recipe lines only,
spaces for everything else.

Also, the OPTFLAGS declaration actually included the tabs preceding the
line comment, making compiler invocation lines unnecessarily long. As
the entire block of declarations are meant for ad-hoc customization
(otherwise they would probably make use of `?=` instead of `=`), move
the "Adjust as desired" comment above the block too to fix the long
invocation lines.

Signed-off-by: WANG Xuerui <git@xen0n.name>
Link: https://lore.kernel.org/r/20230731104911.411964-4-kernel@xen0n.name
Signed-off-by: Song Liu <song@kernel.org>
2023-08-15 09:40:27 -07:00
WANG Xuerui
9dd6e1da81 raid6: guard the tables.c include of <linux/export.h> with __KERNEL__
The export directives for the tables are already emitted with __KERNEL__
guards, but the <linux/export.h> include is not, causing errors when
building the raid6test program. Guard this include too to fix the
raid6test build.

Signed-off-by: WANG Xuerui <git@xen0n.name>
Link: https://lore.kernel.org/r/20230731104911.411964-3-kernel@xen0n.name
Signed-off-by: Song Liu <song@kernel.org>
2023-08-15 09:40:27 -07:00
WANG Xuerui
5afcf28d07 raid6: remove the <linux/export.h> include from recov.c
There is no exported symbol left in recov.c, so the include is now
unnecessary, and breaks the raid6test build. Remove it.

Signed-off-by: WANG Xuerui <git@xen0n.name>
Link: https://lore.kernel.org/r/20230731104911.411964-2-kernel@xen0n.name
Signed-off-by: Song Liu <song@kernel.org>
2023-08-15 09:40:27 -07:00
Greg Kroah-Hartman
e75850b457 Merge 6.5-rc6 into char-misc-next
We need the char/misc fixes in here as well to build on top of.

Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-08-13 22:14:51 +02:00
Linus Torvalds
190bf7b14b 14 hotfixes. 11 of these are cc:stable and the remainder address post-6.4
issues, or are not considered suitable for -stable backporting.
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCZNad/gAKCRDdBJ7gKXxA
 jmw6AP9u6k8XcS8ec3/u0IUEuh7ckHx5Vvjfmo5YgWlIJDeWegD9G2fh3ZJgcjMO
 jMssklfXmP+QSijCIxUva1TlzwtPDAQ=
 =MqiN
 -----END PGP SIGNATURE-----

Merge tag 'mm-hotfixes-stable-2023-08-11-13-44' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull misc fixes from Andrew Morton:
 "14 hotfixes. 11 of these are cc:stable and the remainder address
  post-6.4 issues, or are not considered suitable for -stable
  backporting"

* tag 'mm-hotfixes-stable-2023-08-11-13-44' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
  mm/damon/core: initialize damo_filter->list from damos_new_filter()
  nilfs2: fix use-after-free of nilfs_root in dirtying inodes via iput
  selftests: cgroup: fix test_kmem_basic false positives
  fs/proc/kcore: reinstate bounce buffer for KCORE_TEXT regions
  MAINTAINERS: add maple tree mailing list
  mm: compaction: fix endless looping over same migrate block
  selftests: mm: ksm: fix incorrect evaluation of parameter
  hugetlb: do not clear hugetlb dtor until allocating vmemmap
  mm: memory-failure: avoid false hwpoison page mapped error info
  mm: memory-failure: fix potential unexpected return value from unpoison_memory()
  mm/swapfile: fix wrong swap entry type for hwpoisoned swapcache page
  radix tree test suite: fix incorrect allocation size for pthreads
  crypto, cifs: fix error handling in extract_iter_to_sg()
  zsmalloc: fix races between modifications of fullness and isolated
2023-08-11 14:19:20 -07:00
Mark O'Donovan
9e47a758b7 crypto: lib/mpi - avoid null pointer deref in mpi_cmp_ui()
During NVMeTCP Authentication a controller can trigger a kernel
oops by specifying the 8192 bit Diffie Hellman group and passing
a correctly sized, but zeroed Diffie Hellamn value.
mpi_cmp_ui() was detecting this if the second parameter was 0,
but 1 is passed from dh_is_pubkey_valid(). This causes the null
pointer u->d to be dereferenced towards the end of mpi_cmp_ui()

Signed-off-by: Mark O'Donovan <shiftee@posteo.net>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2023-08-11 19:19:52 +08:00
Herbert Xu
2a598d0b28 crypto: lib - Move mpi into lib/crypto
As lib/mpi is mostly used by crypto code, move it under lib/crypto
so that patches touching it get directed to the right mailing list.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Reviewed-by: Mimi Zohar <zohar@linux.ibm.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2023-08-11 19:19:27 +08:00
Jakub Kicinski
4d016ae42e Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Cross-merge networking fixes after downstream PR.

No conflicts.

Adjacent changes:

drivers/net/ethernet/intel/igc/igc_main.c
  06b412589e ("igc: Add lock to safeguard global Qbv variables")
  d3750076d4 ("igc: Add TransmissionOverrun counter")

drivers/net/ethernet/microsoft/mana/mana_en.c
  a7dfeda6fd ("net: mana: Fix MANA VF unload when hardware is unresponsive")
  a9ca9f9cef ("page_pool: split types and declarations from page_pool.h")
  92272ec410 ("eth: add missing xdp.h includes in drivers")

net/mptcp/protocol.h
  511b90e392 ("mptcp: fix disconnect vs accept race")
  b8dc6d6ce9 ("mptcp: fix rcv buffer auto-tuning")

tools/testing/selftests/net/mptcp/mptcp_join.sh
  c8c101ae39 ("selftests: mptcp: join: fix 'implicit EP' test")
  03668c65d1 ("selftests: mptcp: join: rework detailed report")

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-08-10 14:10:53 -07:00
Janusz Krzysztofik
b67abaad4d kunit: Allow kunit test modules to use test filtering
External tools, e.g., Intel GPU tools (IGT), support execution of
individual selftests provided by kernel modules.  That could be also
applicable to kunit test modules if they provided test filtering.  But
test filtering is now possible only when kunit code is built into the
kernel.  Moreover, a filter can be specified only at boot time, then
reboot is required each time a different filter is needed.

Build the test filtering code also when kunit is configured as a module,
expose test filtering functions to other kunit source files, and use them
in kunit module notifier callback functions.  Userspace can then reload
the kunit module with a value of the filter_glob parameter tuned to a
specific kunit test module every time it wants to limit the scope of tests
executed on that module load.  Make the kunit.filter* parameters visible
in sysfs for user convenience.

v5: Refresh on tpp of attributes filtering fix
v4: Refresh on top of newly applied attributes patches and changes
    introdced by new versions of other patches submitted in series with
    this one.
v3: Fix CONFIG_GLOB, required by filtering functions, not selected when
    building as a module (lkp@intel.com).
v2: Fix new name of a structure moved to kunit namespace not updated
    across all uses (lkp@intel.com).

Signed-off-by: Janusz Krzysztofik <janusz.krzysztofik@linux.intel.com>
Reviewed-by: David Gow <davidgow@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2023-08-08 13:46:18 -06:00
Janusz Krzysztofik
18258c60f8 kunit: Make 'list' action available to kunit test modules
Results from kunit tests reported via dmesg may be interleaved with other
kernel messages.  When parsing dmesg for modular kunit results in real
time, external tools, e.g., Intel GPU tools (IGT), may want to insert
their own test name markers into dmesg at the start of each test, before
any kernel message related to that test appears there, so existing upper
level test result parsers have no doubt which test to blame for a specific
kernel message.  Unfortunately, kunit reports names of tests only at their
completion (with the exeption of a not standarized "# Subtest: <name>"
header above a test plan of each test suite or parametrized test).

External tools could be able to insert their own "start of the test"
markers with test names included if they new those names in advance.
Test names could be learned from a list if provided by a kunit test
module.

There exists a feature of listing kunit tests without actually executing
them, but it is now limited to configurations with the kunit module built
in and covers only built-in tests, already available at boot time.
Moreover, switching from list to normal mode requires reboot.  If that
feature was also available when kunit is built as a module, userspace
could load the module with action=list parameter, load some kunit test
modules they are interested in and learn about the list of tests provided
by those modules, then unload them, reload the kunit module in normal mode
and execute the tests with their lists already known.

Extend kunit module notifier initialization callback with a processing
path for only listing the tests provided by a module if the kunit action
parameter is set to "list" or "list_attr".  For user convenience, make the
kunit.action parameter visible in sysfs.

v2: Don't use a different format, use kunit_exec_list_tests() (Rae),
  - refresh on top of new attributes patches, handle newly introduced
    kunit.action=list_attr case (Rae).

Signed-off-by: Janusz Krzysztofik <janusz.krzysztofik@linux.intel.com>
Cc: Rae Moar <rmoar@google.com>
Reviewed-by: David Gow <davidgow@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2023-08-08 13:46:13 -06:00
Janusz Krzysztofik
c95e7c05c1 kunit: Report the count of test suites in a module
According to KTAP specification[1], results should always start from a
header that provides a TAP protocol version, followed by a test plan with
a count of items to be executed.  That pattern should be followed at each
nesting level.  In the current implementation of the top-most, i.e., test
suite level, those rules apply only for test suites built into the kernel,
executed and reported on boot.  Results submitted to dmesg from kunit test
modules loaded later are missing those top-level headers.

As a consequence, if a kunit test module provides more than one test suite
then, without the top level test plan, external tools that are parsing
dmesg for kunit test output are not able to tell how many test suites
should be expected and whether to continue parsing after complete output
from the first test suite is collected.

Submit the top-level headers also from the kunit test module notifier
initialization callback.

v3: Fix new name of a structure moved to kunit namespace not updated in
    executor_test functions (lkp@intel.com).
v2: Use kunit_exec_run_tests() (Mauro, Rae), but prevent it from
    emitting the headers when called on load of non-test modules.

[1] https://docs.kernel.org/dev-tools/ktap.html#

Signed-off-by: Janusz Krzysztofik <janusz.krzysztofik@linux.intel.com>
Cc: Mauro Carvalho Chehab <mchehab@kernel.org>
Cc: Rae Moar <rmoar@google.com>
Reviewed-by: Rae Moar <rmoar@google.com>
Reviewed-by: David Gow <davidgow@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2023-08-08 13:46:05 -06:00
Linus Torvalds
14f9643dc9 workqueue: Fixes for v6.5-rc5
Two commits:
 
 * The recently added cpu_intensive auto detection and warning mechanism was
   spuriously triggered on slow CPUs. While not causing serious issues, it's
   still a nuisance and can cause unintended concurrency management
   behaviors. Relax the threshold on machines with lower BogoMIPS. While
   BogoMIPS is not an accurate measure of performance by most measures, we
   don't have to be accurate and it has rough but strong enough correlation.
 
 * A correction in Kconfig help text.
 -----BEGIN PGP SIGNATURE-----
 
 iIQEABYIACwWIQTfIjM1kS57o3GsC/uxYfJx3gVYGQUCZNFMTQ4cdGpAa2VybmVs
 Lm9yZwAKCRCxYfJx3gVYGb+4AQCniWx3rwWWmLgviPR0AfYWbcQ8/P/qGh++fmsR
 tEF3sQD/bLdeWcVa1pSzXjhGtRVGsTis6oOhk81A0zIZlx0v2Qg=
 =sThu
 -----END PGP SIGNATURE-----

Merge tag 'wq-for-6.5-rc5-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq

Pull workqueue fixes from Tejun Heo:

 - The recently added cpu_intensive auto detection and warning mechanism
   was spuriously triggered on slow CPUs.

   While not causing serious issues, it's still a nuisance and can cause
   unintended concurrency management behaviors.

   Relax the threshold on machines with lower BogoMIPS. While BogoMIPS
   is not an accurate measure of performance by most measures, we don't
   have to be accurate and it has rough but strong enough correlation.

 - A correction in Kconfig help text

* tag 'wq-for-6.5-rc5-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq:
  workqueue: Scale up wq_cpu_intensive_thresh_us if BogoMIPS is below 4000
  workqueue: Fix cpu_intensive_thresh_us name in help text
2023-08-07 13:07:12 -07:00
Zhen Lei
e2dfa1d522 kobject: Add helper kobj_ns_type_is_valid()
There are too many "(type > KOBJ_NS_TYPE_NONE) && (type < KOBJ_NS_TYPES)"
and "(type <= KOBJ_NS_TYPE_NONE) || (type >= KOBJ_NS_TYPES)", add helper
kobj_ns_type_is_valid() to eliminate duplicate code and improve
readability.

Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
Link: https://lore.kernel.org/r/20230726062508.950-1-thunder.leizhen@huaweicloud.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-08-05 08:31:41 +02:00
Andy Shevchenko
045ad46441 lib/string_helpers: Add kstrdup_and_replace() helper
Duplicate a NULL-terminated string and replace all occurrences of
the old character with a new one. In other words, provide functionality
of kstrdup() + strreplace().

Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Link: https://lore.kernel.org/r/20230804143910.15504-2-andriy.shevchenko@linux.intel.com
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
2023-08-04 18:21:50 -07:00
David Howells
f443fd5af5 crypto, cifs: fix error handling in extract_iter_to_sg()
Fix error handling in extract_iter_to_sg().  Pages need to be unpinned, not
put in extract_user_to_sg() when handling IOVEC/UBUF sources.

The bug may result in a warning like the following:

  WARNING: CPU: 1 PID: 20384 at mm/gup.c:229 __lse_atomic_add arch/arm64/include/asm/atomic_lse.h:27 [inline]
  WARNING: CPU: 1 PID: 20384 at mm/gup.c:229 arch_atomic_add arch/arm64/include/asm/atomic.h:28 [inline]
  WARNING: CPU: 1 PID: 20384 at mm/gup.c:229 raw_atomic_add include/linux/atomic/atomic-arch-fallback.h:537 [inline]
  WARNING: CPU: 1 PID: 20384 at mm/gup.c:229 atomic_add include/linux/atomic/atomic-instrumented.h:105 [inline]
  WARNING: CPU: 1 PID: 20384 at mm/gup.c:229 try_grab_page+0x108/0x160 mm/gup.c:252
  ...
  pc : try_grab_page+0x108/0x160 mm/gup.c:229
  lr : follow_page_pte+0x174/0x3e4 mm/gup.c:651
  ...
  Call trace:
   __lse_atomic_add arch/arm64/include/asm/atomic_lse.h:27 [inline]
   arch_atomic_add arch/arm64/include/asm/atomic.h:28 [inline]
   raw_atomic_add include/linux/atomic/atomic-arch-fallback.h:537 [inline]
   atomic_add include/linux/atomic/atomic-instrumented.h:105 [inline]
   try_grab_page+0x108/0x160 mm/gup.c:252
   follow_pmd_mask mm/gup.c:734 [inline]
   follow_pud_mask mm/gup.c:765 [inline]
   follow_p4d_mask mm/gup.c:782 [inline]
   follow_page_mask+0x12c/0x2e4 mm/gup.c:839
   __get_user_pages+0x174/0x30c mm/gup.c:1217
   __get_user_pages_locked mm/gup.c:1448 [inline]
   __gup_longterm_locked+0x94/0x8f4 mm/gup.c:2142
   internal_get_user_pages_fast+0x970/0xb60 mm/gup.c:3140
   pin_user_pages_fast+0x4c/0x60 mm/gup.c:3246
   iov_iter_extract_user_pages lib/iov_iter.c:1768 [inline]
   iov_iter_extract_pages+0xc8/0x54c lib/iov_iter.c:1831
   extract_user_to_sg lib/scatterlist.c:1123 [inline]
   extract_iter_to_sg lib/scatterlist.c:1349 [inline]
   extract_iter_to_sg+0x26c/0x6fc lib/scatterlist.c:1339
   hash_sendmsg+0xc0/0x43c crypto/algif_hash.c:117
   sock_sendmsg_nosec net/socket.c:725 [inline]
   sock_sendmsg+0x54/0x60 net/socket.c:748
   ____sys_sendmsg+0x270/0x2ac net/socket.c:2494
   ___sys_sendmsg+0x80/0xdc net/socket.c:2548
   __sys_sendmsg+0x68/0xc4 net/socket.c:2577
   __do_sys_sendmsg net/socket.c:2586 [inline]
   __se_sys_sendmsg net/socket.c:2584 [inline]
   __arm64_sys_sendmsg+0x24/0x30 net/socket.c:2584
   __invoke_syscall arch/arm64/kernel/syscall.c:38 [inline]
   invoke_syscall+0x48/0x114 arch/arm64/kernel/syscall.c:52
   el0_svc_common.constprop.0+0x44/0xe4 arch/arm64/kernel/syscall.c:142
   do_el0_svc+0x38/0xa4 arch/arm64/kernel/syscall.c:191
   el0_svc+0x2c/0xb0 arch/arm64/kernel/entry-common.c:647
   el0t_64_sync_handler+0xc0/0xc4 arch/arm64/kernel/entry-common.c:665
   el0t_64_sync+0x19c/0x1a0 arch/arm64/kernel/entry.S:591

Link: https://lkml.kernel.org/r/20571.1690369076@warthog.procyon.org.uk
Fixes: 0185846975 ("netfs: Add a function to extract an iterator into a scatterlist")
Reported-by: syzbot+9b82859567f2e50c123e@syzkaller.appspotmail.com
Link: https://lore.kernel.org/linux-mm/000000000000273d0105ff97bf56@google.com/
Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Acked-by: Steve French <stfrench@microsoft.com>
Cc: Sven Schnelle <svens@linux.ibm.com>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Jeff Layton <jlayton@kernel.org>
Cc: Shyam Prasad N <nspmangalore@gmail.com>
Cc: Rohith Surabattula <rohiths.msft@gmail.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Paolo Abeni <pabeni@redhat.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-08-04 13:03:40 -07:00
Rae Moar
1c9fd080df kunit: fix uninitialized variables bug in attributes filtering
Fix smatch warnings regarding uninitialized variables in the filtering
patch of the new KUnit Attributes feature.

Fixes: 529534e8cb ("kunit: Add ability to filter attributes")

Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
Closes: https://lore.kernel.org/r/202307270610.s0w4NKEn-lkp@intel.com/

Signed-off-by: Rae Moar <rmoar@google.com>
Reviewed-by: David Gow <davidgow@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2023-08-04 13:41:55 -06:00
Ruan Jinjie
abbf73816b kunit: fix possible memory leak in kunit_filter_suites()
Inject fault while probing drm_kunit_helpers.ko, if one of
kunit_next_attr_filter(), kunit_filter_glob_tests() and
kunit_filter_attr_tests() fails, parsed_filters,
parsed_glob.suite_glob/test_glob alloced in
kunit_parse_glob_filter() is leaked.
And the filtered_suite->test_cases alloced in kunit_filter_glob_tests()
or kunit_filter_attr_tests() may also be leaked.

unreferenced object 0xff110001067e4800 (size 1024):
  comm "kunit_try_catch", pid 96, jiffies 4294671796 (age 763.547s)
  hex dump (first 32 bytes):
    73 75 69 74 65 32 00 00 00 00 00 00 00 00 00 00  suite2..........
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
  backtrace:
    [<00000000116e8eba>] __kmalloc_node_track_caller+0x4e/0x140
    [<00000000e2f9cce9>] kmemdup+0x2c/0x60
    [<000000002a36710b>] kunit_filter_suites+0x3e4/0xa50
    [<0000000045779fb9>] filter_suites_test+0x1b7/0x440
    [<00000000cd1104a7>] kunit_try_run_case+0x119/0x270
    [<00000000c654c917>] kunit_generic_run_threadfn_adapter+0x4e/0xa0
    [<00000000d195ac13>] kthread+0x2c7/0x3c0
    [<00000000b79c1ee9>] ret_from_fork+0x2c/0x70
    [<000000001167f7e6>] ret_from_fork_asm+0x1b/0x30
unreferenced object 0xff11000105d79b00 (size 192):
  comm "kunit_try_catch", pid 96, jiffies 4294671796 (age 763.547s)
  hex dump (first 32 bytes):
    f0 e1 5a 88 ff ff ff ff 60 59 bb 8a ff ff ff ff  ..Z.....`Y......
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
  backtrace:
    [<000000000d6e4891>] __kmalloc+0x4d/0x140
    [<000000006afe50bd>] kunit_filter_suites+0x424/0xa50
    [<0000000045779fb9>] filter_suites_test+0x1b7/0x440
    [<00000000cd1104a7>] kunit_try_run_case+0x119/0x270
    [<00000000c654c917>] kunit_generic_run_threadfn_adapter+0x4e/0xa0
    [<00000000d195ac13>] kthread+0x2c7/0x3c0
    [<00000000b79c1ee9>] ret_from_fork+0x2c/0x70
    [<000000001167f7e6>] ret_from_fork_asm+0x1b/0x30
unreferenced object 0xff110001067e6000 (size 1024):
  comm "kunit_try_catch", pid 98, jiffies 4294671798 (age 763.545s)
  hex dump (first 32 bytes):
    73 75 69 74 65 32 00 00 00 00 00 00 00 00 00 00  suite2..........
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
  backtrace:
    [<00000000116e8eba>] __kmalloc_node_track_caller+0x4e/0x140
    [<00000000e2f9cce9>] kmemdup+0x2c/0x60
    [<000000002a36710b>] kunit_filter_suites+0x3e4/0xa50
    [<00000000f452f130>] filter_suites_test_glob_test+0x1b7/0x660
    [<00000000cd1104a7>] kunit_try_run_case+0x119/0x270
    [<00000000c654c917>] kunit_generic_run_threadfn_adapter+0x4e/0xa0
    [<00000000d195ac13>] kthread+0x2c7/0x3c0
    [<00000000b79c1ee9>] ret_from_fork+0x2c/0x70
    [<000000001167f7e6>] ret_from_fork_asm+0x1b/0x30
unreferenced object 0xff11000103f3a800 (size 96):
  comm "kunit_try_catch", pid 98, jiffies 4294671798 (age 763.545s)
  hex dump (first 32 bytes):
    f0 e1 5a 88 ff ff ff ff 40 39 bb 8a ff ff ff ff  ..Z.....@9......
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
  backtrace:
    [<000000000d6e4891>] __kmalloc+0x4d/0x140
    [<000000006afe50bd>] kunit_filter_suites+0x424/0xa50
    [<00000000f452f130>] filter_suites_test_glob_test+0x1b7/0x660
    [<00000000cd1104a7>] kunit_try_run_case+0x119/0x270
    [<00000000c654c917>] kunit_generic_run_threadfn_adapter+0x4e/0xa0
    [<00000000d195ac13>] kthread+0x2c7/0x3c0
    [<00000000b79c1ee9>] ret_from_fork+0x2c/0x70
    [<000000001167f7e6>] ret_from_fork_asm+0x1b/0x30
unreferenced object 0xff11000101a72ac0 (size 16):
  comm "kunit_try_catch", pid 104, jiffies 4294671814 (age 763.529s)
  hex dump (first 16 bytes):
    00 00 00 00 00 00 00 00 e0 2a a7 01 01 00 11 ff  .........*......
  backtrace:
    [<000000000d6e4891>] __kmalloc+0x4d/0x140
    [<00000000c7b724e7>] kunit_filter_suites+0x108/0xa50
    [<00000000bad5427d>] filter_attr_test+0x1e9/0x6a0
    [<00000000cd1104a7>] kunit_try_run_case+0x119/0x270
    [<00000000c654c917>] kunit_generic_run_threadfn_adapter+0x4e/0xa0
    [<00000000d195ac13>] kthread+0x2c7/0x3c0
    [<00000000b79c1ee9>] ret_from_fork+0x2c/0x70
    [<000000001167f7e6>] ret_from_fork_asm+0x1b/0x30
unreferenced object 0xff11000103caf880 (size 32):
  comm "kunit_try_catch", pid 104, jiffies 4294671814 (age 763.547s)
  hex dump (first 32 bytes):
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
  backtrace:
    [<000000000d6e4891>] __kmalloc+0x4d/0x140
    [<00000000c47b0f75>] kunit_filter_suites+0x189/0xa50
    [<00000000bad5427d>] filter_attr_test+0x1e9/0x6a0
    [<00000000cd1104a7>] kunit_try_run_case+0x119/0x270
    [<00000000c654c917>] kunit_generic_run_threadfn_adapter+0x4e/0xa0
    [<00000000d195ac13>] kthread+0x2c7/0x3c0
    [<00000000b79c1ee9>] ret_from_fork+0x2c/0x70
    [<000000001167f7e6>] ret_from_fork_asm+0x1b/0x30
unreferenced object 0xff11000101a72ae0 (size 16):
  comm "kunit_try_catch", pid 106, jiffies 4294671823 (age 763.538s)
  hex dump (first 16 bytes):
    00 00 00 00 00 00 00 00 00 2b a7 01 01 00 11 ff  .........+......
  backtrace:
    [<000000000d6e4891>] __kmalloc+0x4d/0x140
    [<00000000c7b724e7>] kunit_filter_suites+0x108/0xa50
    [<0000000096255c51>] filter_attr_empty_test+0x1b0/0x310
    [<00000000cd1104a7>] kunit_try_run_case+0x119/0x270
    [<00000000c654c917>] kunit_generic_run_threadfn_adapter+0x4e/0xa0
    [<00000000d195ac13>] kthread+0x2c7/0x3c0
    [<00000000b79c1ee9>] ret_from_fork+0x2c/0x70
    [<000000001167f7e6>] ret_from_fork_asm+0x1b/0x30
unreferenced object 0xff11000103caf9c0 (size 32):
  comm "kunit_try_catch", pid 106, jiffies 4294671823 (age 763.538s)
  hex dump (first 32 bytes):
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
  backtrace:
    [<000000000d6e4891>] __kmalloc+0x4d/0x140
    [<00000000c47b0f75>] kunit_filter_suites+0x189/0xa50
    [<0000000096255c51>] filter_attr_empty_test+0x1b0/0x310
    [<00000000cd1104a7>] kunit_try_run_case+0x119/0x270
    [<00000000c654c917>] kunit_generic_run_threadfn_adapter+0x4e/0xa0
    [<00000000d195ac13>] kthread+0x2c7/0x3c0
    [<00000000b79c1ee9>] ret_from_fork+0x2c/0x70
    [<000000001167f7e6>] ret_from_fork_asm+0x1b/0x30
unreferenced object 0xff11000101a72b00 (size 16):
  comm "kunit_try_catch", pid 108, jiffies 4294671832 (age 763.529s)
  hex dump (first 16 bytes):
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
  backtrace:
    [<000000000d6e4891>] __kmalloc+0x4d/0x140
    [<00000000c47b0f75>] kunit_filter_suites+0x189/0xa50
    [<00000000881258cc>] filter_attr_skip_test+0x148/0x770
    [<00000000cd1104a7>] kunit_try_run_case+0x119/0x270
    [<00000000c654c917>] kunit_generic_run_threadfn_adapter+0x4e/0xa0
    [<00000000d195ac13>] kthread+0x2c7/0x3c0
    [<00000000b79c1ee9>] ret_from_fork+0x2c/0x70
    [<000000001167f7e6>] ret_from_fork_asm+0x1b/0x30

Fixes: 5d31f71efc ("kunit: add kunit.filter_glob cmdline option to filter suites")
Fixes: 529534e8cb ("kunit: Add ability to filter attributes")
Signed-off-by: Ruan Jinjie <ruanjinjie@huawei.com>
Reviewed-by: Rae Moar <rmoar@google.com>
Reviewed-by: David Gow <davidgow@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2023-08-04 13:41:47 -06:00
Thomas Weißschuh
31ed379b7c dyndbg: add source filename to prefix
Printing the line number without the file is of limited usefulness.

Knowing the filename also makes it also easier to relate the logged
information to the controlfile.

Example:

    # modprobe test_dynamic_debug
    # echo 'file test_dynamic_debug.c =pfsl' > /proc/dynamic_debug/control
    # echo 1 > /sys/module/test_dynamic_debug/parameters/do_prints
    # dmesg | tail -2
    [   71.802212] do_cats:lib/test_dynamic_debug.c:103: test_dd: doing categories
    [   71.802227] do_levels:lib/test_dynamic_debug.c:123: test_dd: doing levels

Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
Acked-by: Jim Cromie <jim.cromie@gmail.com>
Acked-by: Jason Baron <jbaron@akamai.com>
Reviewed-by: Luis Chamberlain <mcgrof@kernel.org>
Link: https://lore.kernel.org/r/20230709-dyndbg-filename-v2-3-fd83beef0925@weissschuh.net
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-08-04 15:28:41 +02:00
Thomas Weißschuh
3bdaf73905 dyndbg: increase PREFIX_SIZE to 128
A follow-up patch will add the possibility to print the filename as part
of the prefix.
Increase the maximum prefix size to accommodate this.

Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
Reviewed-by: Luis Chamberlain <mcgrof@kernel.org>
Link: https://lore.kernel.org/r/20230709-dyndbg-filename-v2-2-fd83beef0925@weissschuh.net
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-08-04 15:28:41 +02:00
Thomas Weißschuh
882f7a64ed dyndbg: constify opt_array
It is never modified, so mark it const.

Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
Reviewed-by: Luis Chamberlain <mcgrof@kernel.org>
Link: https://lore.kernel.org/r/20230709-dyndbg-filename-v2-1-fd83beef0925@weissschuh.net
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-08-04 15:28:40 +02:00
Jakub Kicinski
35b1b1fd96 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Cross-merge networking fixes after downstream PR.

Conflicts:

net/dsa/port.c
  9945c1fb03 ("net: dsa: fix older DSA drivers using phylink")
  a88dd75384 ("net: dsa: remove legacy_pre_march2020 detection")
https://lore.kernel.org/all/20230731102254.2c9868ca@canb.auug.org.au/

net/xdp/xsk.c
  3c5b4d69c3 ("net: annotate data-races around sk->sk_mark")
  b7f72a30e9 ("xsk: introduce wrappers and helpers for supporting multi-buffer in Tx path")
https://lore.kernel.org/all/20230731102631.39988412@canb.auug.org.au/

drivers/net/ethernet/broadcom/bnxt/bnxt.c
  37b61cda9c ("bnxt: don't handle XDP in netpoll")
  2b56b3d992 ("eth: bnxt: handle invalid Tx completions more gracefully")
https://lore.kernel.org/all/20230801101708.1dc7faac@canb.auug.org.au/

Adjacent changes:

drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec_fs.c
  62da08331f ("net/mlx5e: Set proper IPsec source port in L4 selector")
  fbd517549c ("net/mlx5e: Add function to get IPsec offload namespace")

drivers/net/ethernet/sfc/selftest.c
  55c1528f9b ("sfc: fix field-spanning memcpy in selftest")
  ae9d445cd4 ("sfc: Miscellaneous comment removals")

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-08-03 14:34:37 -07:00
Linus Torvalds
a4e98a30bc bitmap fixes for v6.5
- Fix for bitmap documentation;
  - Fix for kernel build under certain configuration.
 -----BEGIN PGP SIGNATURE-----
 
 iQGzBAABCgAdFiEEi8GdvG6xMhdgpu/4sUSA/TofvsgFAmTIHIcACgkQsUSA/Tof
 vsjjWQv/cRLlsolIBc3gmV6YGYZuXc99SGALLp+2BjGz63GQ1YNaIPPHZWFNeH7f
 fATEZCXUssgbRRSOQWAqt+9Zbzkz85nU/L/WDC63/eMaBNL5bueYKbRnivixb6CK
 0N7ruQUxW9D+n/ioXuvNecRTjOI8zPKDrcXYTVbcWcTd2cUd+VsrXnhBibcsnkiF
 /d/svVVO7S/wNjHbOTm9Miru34CP5KxBJMrgCALJy9wS4NY9NohnoACxli3Igp8/
 JGYBg5JuWIk+Adw7rGRPCsJUuAgyNltb5BlP/JrjDW0Ra6SntLafE+kcwQu2lIwi
 WPoKqZz+CdHGVP8hkbsDxg+UCR+gkUm/RoImcYLhl0RvHF6eaDckUBWvU9DUi41N
 VRvB+yjVTvubM4rbrbsSJp3vIAjLqjLlCyv6Z3XGrwl/B3TXfwpEEHfSTq0lSnnv
 HRNOcjZHedTT2xTljHsW7yc/xv3877h+smzXl07qMXR3Tj6kUMxGcLS9VuZwoBA4
 b8nLoKm1
 =5IbU
 -----END PGP SIGNATURE-----

Merge tag 'bitmap-6.5-rc5' of https://github.com:/norov/linux

Pull bitmap fixes from Yury Norov:

 - Fix for bitmap documentation

 - Fix for kernel build under certain configurations

* tag 'bitmap-6.5-rc5' of https://github.com:/norov/linux:
  lib/bitmap: workaround const_eval test build failure
  cpumask: eliminate kernel-doc warnings
2023-08-02 18:10:26 -07:00
Ruan Jinjie
5a175d369c kunit: fix wild-memory-access bug in kunit_filter_suites()
As for kunit_filter_suites(), When the filters arg = NULL, such as
the call of kunit_filter_suites(&suite_set, "suite2", NULL, NULL, &err)
in filter_suites_test() tese case in kunit, both filter_count and
parsed_filters will not be initialized.

So it's possible to enter kunit_filter_attr_tests(), and the use of
uninitialized parsed_filters will cause below wild-memory-access.

 RIP: 0010:kunit_filter_suites+0x780/0xa40
 Code: fe ff ff e8 42 87 4d ff 41 83 c6 01 49 83 c5 10 49 89 dc 44 39 74 24 50 0f 8e 81 fe ff ff e8 27 87 4d ff 4c 89 e8 48 c1 e8 03 <66> 42 83 3c 38 00 0f 85 af 01 00 00 49 8b 75 00 49 8b 55 08 4c 89
 RSP: 0000:ff1100010743fc38 EFLAGS: 00010203
 RAX: 03fc4400041d0ff1 RBX: ff1100010389a900 RCX: ffffffff9f940ad9
 RDX: ff11000107429740 RSI: 0000000000000000 RDI: ff110001037ec920
 RBP: ff1100010743fd50 R08: 0000000000000000 R09: ffe21c0020e87f1e
 R10: 0000000000000003 R11: 0000000000032001 R12: ff110001037ec800
 R13: 1fe2200020e87f8c R14: 0000000000000000 R15: dffffc0000000000
 FS:  0000000000000000(0000) GS:ff1100011b000000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 CR2: ff11000115201000 CR3: 0000000113066001 CR4: 0000000000771ef0
 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
 PKRU: 55555554
 Call Trace:
  <TASK>
  ? die_addr+0x3c/0xa0
  ? exc_general_protection+0x148/0x220
  ? asm_exc_general_protection+0x26/0x30
  ? kunit_filter_suites+0x779/0xa40
  ? kunit_filter_suites+0x780/0xa40
  ? kunit_filter_suites+0x779/0xa40
  ? __pfx_kunit_filter_suites+0x10/0x10
  ? __pfx_kfree+0x10/0x10
  ? kunit_add_action_or_reset+0x3d/0x50
  filter_suites_test+0x1b7/0x440
  ? __pfx_filter_suites_test+0x10/0x10
  ? __pfx___schedule+0x10/0x10
  ? try_to_wake_up+0xa8e/0x1210
  ? _raw_spin_lock_irqsave+0x86/0xe0
  ? __pfx__raw_spin_lock_irqsave+0x10/0x10
  ? set_cpus_allowed_ptr+0x7c/0xb0
  kunit_try_run_case+0x119/0x270
  ? __kthread_parkme+0xdc/0x160
  ? __pfx_kunit_try_run_case+0x10/0x10
  kunit_generic_run_threadfn_adapter+0x4e/0xa0
  ? __pfx_kunit_generic_run_threadfn_adapter+0x10/0x10
  kthread+0x2c7/0x3c0
  ? __pfx_kthread+0x10/0x10
  ret_from_fork+0x2c/0x70
  ? __pfx_kthread+0x10/0x10
  ret_from_fork_asm+0x1b/0x30
  </TASK>
 Modules linked in:
 Dumping ftrace buffer:
    (ftrace buffer empty)
 ---[ end trace 0000000000000000 ]---
 RIP: 0010:kunit_filter_suites+0x780/0xa40
 Code: fe ff ff e8 42 87 4d ff 41 83 c6 01 49 83 c5 10 49 89 dc 44 39 74 24 50 0f 8e 81 fe ff ff e8 27 87 4d ff 4c 89 e8 48 c1 e8 03 <66> 42 83 3c 38 00 0f 85 af 01 00 00 49 8b 75 00 49 8b 55 08 4c 89
 RSP: 0000:ff1100010743fc38 EFLAGS: 00010203
 RAX: 03fc4400041d0ff1 RBX: ff1100010389a900 RCX: ffffffff9f940ad9
 RDX: ff11000107429740 RSI: 0000000000000000 RDI: ff110001037ec920
 RBP: ff1100010743fd50 R08: 0000000000000000 R09: ffe21c0020e87f1e
 R10: 0000000000000003 R11: 0000000000032001 R12: ff110001037ec800
 R13: 1fe2200020e87f8c R14: 0000000000000000 R15: dffffc0000000000
 FS:  0000000000000000(0000) GS:ff1100011b000000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 CR2: ff11000115201000 CR3: 0000000113066001 CR4: 0000000000771ef0
 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
 PKRU: 55555554
 Kernel panic - not syncing: Fatal exception
 Dumping ftrace buffer:
    (ftrace buffer empty)
 Kernel Offset: 0x1da00000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
 Rebooting in 1 seconds..

Fixes: 529534e8cb ("kunit: Add ability to filter attributes")
Signed-off-by: Ruan Jinjie <ruanjinjie@huawei.com>
Reviewed-by: David Gow <davidgow@google.com>
Tested-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2023-07-31 08:27:05 -06:00
Mark Brown
2cddb06cb0 Linux 6.5-rc4
-----BEGIN PGP SIGNATURE-----
 
 iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAmTGxtMeHHRvcnZhbGRz
 QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGy2kH/RUepcJdCEQT2XGz
 NnHQ7NMKNnkcGLZq0yIgxHnGtu0KWSjb+LxBtd1un0nuVuSnSgGesJFH/B4uY2hq
 veXHMlyfCPCKLvYziOegMoUBiLR3d7K6urP9XAhAKX5gz9zn3ciZ13W7N9Qdf3Qx
 2t0kdT7guv5Ki5u7o+Pylzsbz9wKgIngY1ncfPqWQOaS/McZ58keAjU0z/mVaFQ4
 Wanc18dzawpceybqXb6qgCg+khZl6we2Mkl872ANYwAzvOzlVmrenWzM+7jBQvHw
 DD82KXTwq4lcf1xp7fTKrWdXOTcfyREmXu+Nuazqu5KkcQ9aY7GMi9O5JtjR1PaQ
 EXCNR3w=
 =0UH2
 -----END PGP SIGNATURE-----
gpgsig -----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCgAdFiEEreZoqmdXGLWf4p/qJNaLcl1Uh9AFAmTG5lQACgkQJNaLcl1U
 h9CZowf/RGKBZeMY4iKZt72Aq0tFvu0k0P2psD7IxWtf/HCjR/6AxkKRE2DP4bIt
 7KguxzqwkYgHHmHhJ4o7YfuJiVEXIwn8P5ma//8b3t0XQAoWko8JWV3v+NvLcQ48
 8Nta6n6w+j9+QgZqdwvYhpS3s027K5RYgf4k3b1BgnK9WPGN9lOvwA+kI4Oe3EhB
 WxRSsImPHZ7qnbOC/nKH+mD5k5VfQdAsBRJzcbFYs7+kismXqUGFGs0RV+3al1vl
 p5mZQExWBYmzU7peJbcLoMttzVwbXH/PB3QOhV4pcXdMSLNPGAe33a1ixcM88qS3
 tJyZ47Frdnjx2STXmOYh1jMeX0EbDw==
 =AOEr
 -----END PGP SIGNATURE-----

ASoC: Merge up fixes from Linus' tree

Gets us pine64plus back if nothing else.
2023-07-30 23:38:02 +01:00
Linus Torvalds
cf270e7b75 Char driver and Documentation fixes for 6.5-rc4
Here is a char driver fix and some documentation updates for 6.5-rc4
 that contain the following changes:
   - sram/genalloc bugfix for reported problem
   - security-bugs.rst update based on recent discussions
   - embargoed-hardware-issues minor cleanups and then partial revert for
     the project/company lists
 
 All of these have been in linux-next for a while with no reported
 problems, and the documentation updates have all been reviewed by the
 relevant developers.
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 -----BEGIN PGP SIGNATURE-----
 
 iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCZMZD6A8cZ3JlZ0Brcm9h
 aC5jb20ACgkQMUfUDdst+ynxZACgksV7C7yJWTm9UfZNZ2ABUhj69aEAnR/X9tLr
 Sjtjo0iaoAZpE+2tjHt1
 =J/gW
 -----END PGP SIGNATURE-----

Merge tag 'char-misc-6.5-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc

Pull char driver and Documentation fixes from Greg KH:
 "Here is a char driver fix and some documentation updates for 6.5-rc4
  that contain the following changes:

   - sram/genalloc bugfix for reported problem

   - security-bugs.rst update based on recent discussions

   - embargoed-hardware-issues minor cleanups and then partial revert
     for the project/company lists

  All of these have been in linux-next for a while with no reported
  problems, and the documentation updates have all been reviewed by the
  relevant developers"

* tag 'char-misc-6.5-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc:
  misc/genalloc: Name subpools by of_node_full_name()
  Documentation: embargoed-hardware-issues.rst: add AMD to the list
  Documentation: embargoed-hardware-issues.rst: clean out empty and unused entries
  Documentation: security-bugs.rst: clarify CVE handling
  Documentation: security-bugs.rst: update preferences when dealing with the linux-distros group
2023-07-30 11:44:00 -07:00
Matthew Wilcox (Oracle)
cbc0285433 XArray: Do not return sibling entries from xa_load()
It is possible for xa_load() to observe a sibling entry pointing to
another sibling entry.  An example:

Thread A:		Thread B:
			xa_store_range(xa, entry, 188, 191, gfp);
xa_load(xa, 191);
entry = xa_entry(xa, node, 63);
[entry is a sibling of 188]
			xa_store_range(xa, entry, 184, 191, gfp);
if (xa_is_sibling(entry))
offset = xa_to_sibling(entry);
entry = xa_entry(xas->xa, node, offset);
[entry is now a sibling of 184]

It is sufficient to go around this loop until we hit a non-sibling entry.
Sibling entries always point earlier in the node, so we are guaranteed
to terminate this search.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Fixes: 6b24ca4a1a ("mm: Use multi-index entries in the page cache")
Cc: stable@vger.kernel.org
2023-07-28 15:37:45 -04:00
Jakub Kicinski
5908a4c47c netfilter net-next pull request 2023-07-27
-----BEGIN PGP SIGNATURE-----
 
 iQJBBAABCAArFiEEgKkgxbID4Gn1hq6fcJGo2a1f9gAFAmTCcgkNHGZ3QHN0cmxl
 bi5kZQAKCRBwkajZrV/2AJmMD/9IPWnzSNLUgoAhSo0h2OkCKl2iIdRnkrPrruhE
 Su8bD8ohmU100iN1DMXT2a7C9o0BTog4EB7WtF21z+06dUhROiZizrSt8bTk/rRi
 0+Sm9xlDAdl3CZcU8fnVjwf6PLYgUv5zVjcQc4Ggf15MwEIdpviKCps2bbBtrozF
 PJEK6+UwTU6+z4GSTc957nhFHstEcwktyxoaAote98CD78G2YCQT5yVbfctHgRm0
 9qovT8S/zZmqHvqvUfrqJd+N5V/+40O7ZuFls93kYxK9Bttx9wRwEqALPldxXudU
 o0kG4QZ8NAwiIVsGqPwKu/cKi9PF0z/PUXYgVdnkKK+XofBDHbHyfR+BJO1ejOdX
 +ea9AoQ6lD6NVmvX01+lF9OI4D1zgc6pLGyjSsyVgv3x0iKJeZ8QOgb0DTGFiG1U
 MnFIeckedrh/dt3NXLG/blZvuAzhofHqEhH/DlvbI/QBtN2zEgIMJKxRfBAMs3OO
 WAIlaHASQFVbyrHOr/X3FoNDTsvZyrTppo9WwJVTj9F41lYXzWoiBY+nVj2brGDR
 SMW1M13sufRBQlk0aTpPYPvcS5FhsMf6ggxygi2rNxX5/AdFE02nnEU9ybpHAqcy
 NiZ8kCxJ2J9+aCj7yvJ7QQcAD7l2tAIeAZCKSlKteigqTI0PWoTUc0IYPT85URLm
 cy/l4A==
 =fgLz
 -----END PGP SIGNATURE-----

Merge tag 'nf-next-23-07-27' of https://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf-next

Florian Westphal says:

====================
netfilter updates for net-next

1.  silence a harmless warning for CONFIG_NF_CONNTRACK_PROCFS=n builds,
 from Zhu Wang.

2, 3:
Allow NLA_POLICY_MASK to be used with BE16/BE32 types, and replace a few
manual checks with nla_policy based one in nf_tables, from myself.

4: cleanup in ctnetlink to validate while parsing rather than
   using two steps, from Lin Ma.

5: refactor boyer-moore textsearch by moving a small chunk to
   a helper function, rom Jeremy Sowden.

* tag 'nf-next-23-07-27' of https://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf-next:
  lib/ts_bm: add helper to reduce indentation and improve readability
  netfilter: conntrack: validate cta_ip via parsing
  netfilter: nf_tables: use NLA_POLICY_MASK to test for valid flag options
  netlink: allow be16 and be32 types in all uint policy checks
  nf_conntrack: fix -Wunused-const-variable=
====================

Link: https://lore.kernel.org/r/20230727133604.8275-1-fw@strlen.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-07-27 20:25:43 -07:00
Jakub Kicinski
014acf2668 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Cross-merge networking fixes after downstream PR.

No conflicts or adjacent changes.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-07-27 15:22:46 -07:00
Jeremy Sowden
86e9c9aa23 lib/ts_bm: add helper to reduce indentation and improve readability
The flow-control of `bm_find` is very deeply nested with a conditional
comparing a ternary expression against the pattern inside a for-loop
inside a while-loop inside a for-loop.

Move the inner for-loop into a helper function to reduce the amount of
indentation and make the code easier to read.

Fix indentation and trailing white-space in preceding debug logging
statement.

Signed-off-by: Jeremy Sowden <jeremy@azazel.net>
Signed-off-by: Florian Westphal <fw@strlen.de>
2023-07-27 13:45:51 +02:00
Florian Westphal
5fac9b7c16 netlink: allow be16 and be32 types in all uint policy checks
__NLA_IS_BEINT_TYPE(tp) isn't useful.  NLA_BE16/32 are identical to
NLA_U16/32, the only difference is that it tells the netlink validation
functions that byteorder conversion might be needed before comparing
the value to the policy min/max ones.

After this change all policy macros that can be used with UINT types,
such as NLA_POLICY_MASK() can also be used with NLA_BE16/32.

This will be used to validate nf_tables flag attributes which
are in bigendian byte order.

Signed-off-by: Florian Westphal <fw@strlen.de>
2023-07-27 13:45:51 +02:00
Rae Moar
76066f93f1 kunit: add tests for filtering attributes
Add four tests to executor_test.c to test behavior of filtering attributes.

- parse_filter_attr_test - to test the parsing of inputted filters

- filter_attr_test - to test the filtering procedure on attributes

- filter_attr_empty_test - to test the behavior when all tests are filtered
  out

- filter_attr_skip_test - to test the configurable filter_action=skip
  option

Reviewed-by: David Gow <davidgow@google.com>
Signed-off-by: Rae Moar <rmoar@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2023-07-26 13:29:41 -06:00
Rae Moar
d055c6a2cc kunit: memcpy: Mark tests as slow using test attributes
Mark slow memcpy KUnit tests using test attributes.

Tests marked as slow are as follows: memcpy_large_test, memmove_test,
memmove_large_test, and memmove_overlap_test. These tests were the slowest
of the memcpy tests and relatively slower to most other KUnit tests. Most
of these tests are already skipped when CONFIG_MEMCPY_SLOW_KUNIT_TEST is
not enabled.

These tests can now be filtered using the KUnit test attribute filtering
feature. Example: --filter "speed>slow". This will run only the tests that
have speeds faster than slow. The slow attribute will also be outputted in
KTAP.

Note: This patch is intended to replace the use of
CONFIG_MEMCPY_SLOW_KUNIT_TEST and to potentially deprecate this feature.
This patch does not remove the config option but does add a note to the
config definition commenting on this future shift.

Reviewed-by: David Gow <davidgow@google.com>
Acked-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Rae Moar <rmoar@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2023-07-26 13:29:28 -06:00
Rae Moar
529534e8cb kunit: Add ability to filter attributes
Add filtering of test attributes. Users can filter tests using the
module_param called "filter".

Filters are imputed in the format: <attribute_name><operation><value>

Example: kunit.filter="speed>slow"

Operations include: >, <, >=, <=, !=, and =. These operations will act the
same for attributes of the same type but may not between types.

Note multiple filters can be inputted by separating them with a comma.
Example: kunit.filter="speed=slow, module!=example"

Since both suites and test cases can have attributes, there may be
conflicts. The process of filtering follows these rules:
- Filtering always operates at a per-test level.
- If a test has an attribute set, then the test's value is filtered on.
- Otherwise, the value falls back to the suite's value.
- If neither are set, the attribute has a global "default" value, which
  is used.

Filtered tests will not be run or show in output. The tests can instead be
skipped using the configurable option "kunit.filter_action=skip".

Note the default settings for running tests remains unfiltered.

Finally, add "filter" methods for the speed and module attributes to parse
and compare attribute values.

Note this filtering functionality will be added to kunit.py in the next
patch.

Reviewed-by: David Gow <davidgow@google.com>
Signed-off-by: Rae Moar <rmoar@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2023-07-26 13:29:15 -06:00
Rae Moar
a00a727091 kunit: Add module attribute
Add module attribute to the test attribute API. This attribute stores the
module name associated with the test using KBUILD_MODNAME.

The name of a test suite and the module name often do not match. A
reference to the module name associated with the suite could be extremely
helpful in running tests as modules without needing to check the codebase.

This attribute will be printed for each suite.

Reviewed-by: David Gow <davidgow@google.com>
Signed-off-by: Rae Moar <rmoar@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2023-07-26 13:29:09 -06:00
Rae Moar
02c2d0c2a8 kunit: Add speed attribute
Add speed attribute to the test attribute API. This attribute will allow
users to mark tests with a category of speed.

Currently the categories of speed proposed are: normal, slow, and very_slow
(outlined in enum kunit_speed). These are outlined in the enum kunit_speed.

The assumed default speed for tests is "normal". This indicates that the
test takes a relatively trivial amount of time (less than 1 second),
regardless of the machine it is running on. Any test slower than this could
be marked as "slow" or "very_slow".

Add the macro KUNIT_CASE_SLOW to set a test as slow, as this is likely a
common use of the attributes API.

Add an example of marking a slow test to kunit-example-test.c.

Reviewed-by: David Gow <davidgow@google.com>
Signed-off-by: Rae Moar <rmoar@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2023-07-26 13:29:04 -06:00
Rae Moar
39e92cb1e4 kunit: Add test attributes API structure
Add the basic structure of the test attribute API to KUnit, which can be
used to save and access test associated data.

Add attributes.c and attributes.h to hold associated structs and functions
for the API.

Create a struct that holds a variety of associated helper functions for
each test attribute. These helper functions will be used to get the
attribute value, convert the value to a string, and filter based on the
value. This struct is flexible by design to allow for attributes of
numerous types and contexts.

Add a method to print test attributes in the format of "# [<test_name if
not suite>.]<attribute_name>: <attribute_value>".

Example for a suite: "# speed: slow"

Example for a test case: "# test_case.speed: very_slow"

Use this method to report attributes in the KTAP output (KTAP spec:
https://docs.kernel.org/dev-tools/ktap.html) and _list_tests output when
kernel's new kunit.action=list_attr option is used. Note this is derivative
of the kunit.action=list option.

In test.h, add fields and associated helper functions to test cases and
suites to hold user-inputted test attributes.

Reviewed-by: David Gow <davidgow@google.com>
Signed-off-by: Rae Moar <rmoar@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2023-07-26 13:28:57 -06:00
Boqun Feng
f66c538098 lockdep/selftests: Use SBRM APIs for wait context tests
The "__cleanup__" attribute is already used for wait context tests, so
using it for locking tests has already been proven working. Now since
SBRM APIs are merged, let's use these APIs instead of a local guard
framework. This also helps testing SBRM APIs.

Note that originally the tests don't rely on the cleanup ordering of
two variables in the same scope, but since now it's something we'd like
to assume and rely on[1], drop the extra scope in inner_in_outer()
function. Again this gives us another opportunity to test the compiler
behavior.

[1]: https://lore.kernel.org/lkml/CAHk-=whEsr6fuVSdsoNPokLR2fZiGuo_hCLyrS-LCw7hT_N7cQ@mail.gmail.com/
Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20230715235257.110325-1-boqun.feng@gmail.com
2023-07-26 12:29:13 +02:00
Linus Walleij
f8ea950210 misc/genalloc: Name subpools by of_node_full_name()
A previous commit tried to come up with more generic subpool
names, but this isn't quite working: the node name was used
elsewhere to match pools to consumers which regressed the
nVidia Tegra 2/3 video decoder.

Revert back to an earlier approach using of_node_full_name()
instead of just the name to make sure the pool name is more
unique, and change both sites using this in the kernel.

It is not perfect since two SRAM nodes could have the same
subpool name but it makes the situation better than before.

Reported-by: Dmitry Osipenko <digetx@gmail.com>
Fixes: 21e5a2d10c ("misc: sram: Generate unique names for subpools")
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Tested-by: Dmitry Osipenko <digetx@gmail.com>
Link: https://lore.kernel.org/r/20230622074520.3058027-1-linus.walleij@linaro.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-07-26 09:45:01 +02:00
Matthew Wilcox (Oracle)
1b0306981e iov_iter: Add copy_folio_from_iter_atomic()
Add a folio wrapper around copy_page_from_iter_atomic().

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
2023-07-24 18:03:58 -04:00
Matthew Wilcox (Oracle)
908a1ad894 iov_iter: Handle compound highmem pages in copy_page_from_iter_atomic()
copy_page_from_iter_atomic() already handles !highmem compound
pages correctly, but if we are passed a highmem compound page,
each base page needs to be mapped & unmapped individually.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Kent Overstreet <kent.overstreet@linux.dev>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Tested-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-07-24 18:03:43 -04:00
Matthew Wilcox (Oracle)
f7f9a0c873 iov_iter: Map the page later in copy_page_from_iter_atomic()
Remove a couple of calls to kunmap_atomic() in the rare error cases.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Kent Overstreet <kent.overstreet@linux.dev>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
2023-07-24 18:03:43 -04:00
Mark Brown
de1b43a57a Linux 6.5-rc3
-----BEGIN PGP SIGNATURE-----
 
 iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAmS9qIoeHHRvcnZhbGRz
 QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGH6EH/2EnB8lLGOl8QINL
 E8eTWj6e7hdXXEX42j5h+TeGZZgBbTwogzE08uHBOP7lO0h31GVa97D5xkjS8UIa
 CzjYcnAuvf36nexakdC/0T8WgGzWwzKo0MIVraPBbq/pPRyrJ0CXPzB0Sl4Z2XlL
 W3N12a1N655FRx/tjaXgUB+aMPGrdBA2t0k6eXwFWyBdQhmt7O8Y3xy0rTVA+qHZ
 F6D4fZI2Ej9WbxX+tBs+DLEk+ZUz+0fABUqvgJRNofjgm71CpGhbv4ZGUFQaJT+I
 5S7cu3R8pS2YLP8TA3kJSj5GUEwPEDEZpxMIJAqkr5uvkNysGi55lYRxxULUw/sO
 EYHRBJE=
 =c8SQ
 -----END PGP SIGNATURE-----
gpgsig -----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCgAdFiEEreZoqmdXGLWf4p/qJNaLcl1Uh9AFAmS9qq8ACgkQJNaLcl1U
 h9CCAAf9FKLPWszdL/Do/OSnxh1qlgqg63f0BMUQCmPZRT2cWFDguN4Jnhr0hDtC
 Ul/SjzQzZaXalHG90yGmXZ9kBdBvnxX8XqyfVNg2sXM1TsqHrxItq7tlNd1UJlls
 zRAJo03xc9BeC9kdmBTt+Dqt41OqimQuccsHk1U/O9l7nhYwKB2xlpiu5XMZxvvZ
 IP1sb9MDONfz0K52Lz4U5QOBChA0VGhlOoduY/yTLQlzQBKNdVaLToNhy6RLQtZp
 xXhvQELB0NB2BMg9wbFQicmTb1kkMQF6HhqDfuvTeItoNYrM/APpGiPbRByotB+Z
 LanAEcsu4R70iSFvymaHI0oYeXcNHg==
 =Ye2o
 -----END PGP SIGNATURE-----

ASoC: Merge up fixes from mainline

There's several things here that will really help my CI.
2023-07-23 23:33:05 +01:00
Linus Torvalds
f036d67c02 block-6.5-2023-07-21
-----BEGIN PGP SIGNATURE-----
 
 iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAmS629wQHGF4Ym9lQGtl
 cm5lbC5kawAKCRD301j7KXHgpv/7D/99ysE5ZszmjxNOmyy1lGfqtQnaTLuToRsl
 wB16umIPAFfye5r4TV8l9GZuUyI7FU8LySglu0Y0qMKmCp+kJKLh90kB281Co4Dn
 yp1AbqlTorAlG4ElQJBRaQr4kaqqvI2tzeVmFdUhIE1oX2e9OX/O+YKa8k1JfsKI
 oecChQgodlPxX3wusItgiyvZKl2q2+mivg5E6cqiGIgP3uF8fmOQCbio4Vm8ZSxb
 TO8JEfBTiXslR+CvJD3Gi96pzexN1qCUed8/7FDiIUufhETmwqSIOo89GxzGAQ6O
 7o/83IkqgXPHjKLYs3R4/jhHPXZmXmvDZHWIiSg+KLOFqxxWmRPNJ6V6igIBP8SG
 eu5PTA7SDGtvIXePpu38FTPmSiUW7MbGhnjqY8u64Je6MaQ8l28KN7xkFtmxV+n4
 hgB0gr6uKBnXMKZHobk0yJeUUI/L/0ESzbVPDHY8JM/rQCsp1eSNQDpZoVjPWZmg
 lMGYmOq57oPA20LVch7U3gUFhD4CJ7c3e2/EzJdJVjsTveTYieBCEESQErFbMcEr
 VuRZSAGnPyXQ4yF4wG93x4sDye28ZFS/Q9c6Q3DCUxctDkCz4eY1+vmdX+NJXwDA
 aYXCyyKzk18udbKvV0QvTuDTb6PrJDPxbFagCveibPTtP4XDMv1LvpdZPUPJ/HGX
 4xA1mrsGJA==
 =e2OR
 -----END PGP SIGNATURE-----

Merge tag 'block-6.5-2023-07-21' of git://git.kernel.dk/linux

Pull block fixes from Jens Axboe:

 - Fix for loop regressions (Mauricio)

 - Fix a potential stall with batched wakeups in sbitmap (David)

 - Fix for stall with recursive plug flushes (Ross)

 - Skip accounting of empty requests for blk-iocost (Chengming)

 - Remove a dead field in struct blk_mq_hw_ctx (Chengming)

* tag 'block-6.5-2023-07-21' of git://git.kernel.dk/linux:
  loop: do not enforce max_loop hard limit by (new) default
  loop: deprecate autoloading callback loop_probe()
  sbitmap: fix batching wakeup
  blk-iocost: skip empty flush bio in iocost
  blk-mq: delete dead struct blk_mq_hw_ctx->queued field
  blk-mq: Fix stall due to recursive flush plug
2023-07-22 11:05:15 -07:00
David Jeffery
106397376c sbitmap: fix batching wakeup
Current code supposes that it is enough to provide forward progress by
just waking up one wait queue after one completion batch is done.

Unfortunately this way isn't enough, cause waiter can be added to wait
queue just after it is woken up.

Follows one example(64 depth, wake_batch is 8)

1) all 64 tags are active

2) in each wait queue, there is only one single waiter

3) each time one completion batch(8 completions) wakes up just one
   waiter in each wait queue, then immediately one new sleeper is added
   to this wait queue

4) after 64 completions, 8 waiters are wakeup, and there are still 8
   waiters in each wait queue

5) after another 8 active tags are completed, only one waiter can be
   wakeup, and the other 7 can't be waken up anymore.

Turns out it isn't easy to fix this problem, so simply wakeup enough
waiters for single batch.

Cc: Kemeng Shi <shikemeng@huaweicloud.com>
Cc: Chengming Zhou <zhouchengming@bytedance.com>
Cc: Jan Kara <jack@suse.cz>
Signed-off-by: David Jeffery <djeffery@redhat.com>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Reviewed-by: Gabriel Krisman Bertazi <krisman@suse.de>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Link: https://lore.kernel.org/r/20230721095715.232728-1-ming.lei@redhat.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-07-21 11:40:20 -06:00
Jakub Kicinski
59be3baa8d Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Cross-merge networking fixes after downstream PR.

No conflicts or adjacent changes.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-07-20 15:52:55 -07:00
Miguel Ojeda
a66d733da8 rust: support running Rust documentation tests as KUnit ones
Rust has documentation tests: these are typically examples of
usage of any item (e.g. function, struct, module...).

They are very convenient because they are just written
alongside the documentation. For instance:

    /// Sums two numbers.
    ///
    /// ```
    /// assert_eq!(mymod::f(10, 20), 30);
    /// ```
    pub fn f(a: i32, b: i32) -> i32 {
        a + b
    }

In userspace, the tests are collected and run via `rustdoc`.
Using the tool as-is would be useful already, since it allows
to compile-test most tests (thus enforcing they are kept
in sync with the code they document) and run those that do not
depend on in-kernel APIs.

However, by transforming the tests into a KUnit test suite,
they can also be run inside the kernel. Moreover, the tests
get to be compiled as other Rust kernel objects instead of
targeting userspace.

On top of that, the integration with KUnit means the Rust
support gets to reuse the existing testing facilities. For
instance, the kernel log would look like:

    KTAP version 1
    1..1
        KTAP version 1
        # Subtest: rust_doctests_kernel
        1..59
        # rust_doctest_kernel_build_assert_rs_0.location: rust/kernel/build_assert.rs:13
        ok 1 rust_doctest_kernel_build_assert_rs_0
        # rust_doctest_kernel_build_assert_rs_1.location: rust/kernel/build_assert.rs:56
        ok 2 rust_doctest_kernel_build_assert_rs_1
        # rust_doctest_kernel_init_rs_0.location: rust/kernel/init.rs:122
        ok 3 rust_doctest_kernel_init_rs_0
        ...
        # rust_doctest_kernel_types_rs_2.location: rust/kernel/types.rs:150
        ok 59 rust_doctest_kernel_types_rs_2
    # rust_doctests_kernel: pass:59 fail:0 skip:0 total:59
    # Totals: pass:59 fail:0 skip:0 total:59
    ok 1 rust_doctests_kernel

Therefore, add support for running Rust documentation tests
in KUnit. Some other notes about the current implementation
and support follow.

The transformation is performed by a couple scripts written
as Rust hostprogs.

Tests using the `?` operator are also supported as usual, e.g.:

    /// ```
    /// # use kernel::{spawn_work_item, workqueue};
    /// spawn_work_item!(workqueue::system(), || pr_info!("x"))?;
    /// # Ok::<(), Error>(())
    /// ```

The tests are also compiled with Clippy under `CLIPPY=1`, just
like normal code, thus also benefitting from extra linting.

The names of the tests are currently automatically generated.
This allows to reduce the burden for documentation writers,
while keeping them fairly stable for bisection. This is an
improvement over the `rustdoc`-generated names, which include
the line number; but ideally we would like to get `rustdoc` to
provide the Rust item path and a number (for multiple examples
in a single documented Rust item).

In order for developers to easily see from which original line
a failed doctests came from, a KTAP diagnostic line is printed
to the log, containing the location (file and line) of the
original test (i.e. instead of the location in the generated
Rust file):

    # rust_doctest_kernel_types_rs_2.location: rust/kernel/types.rs:150

This line follows the syntax for declaring test metadata in the
proposed KTAP v2 spec [1], which may be used for the proposed
KUnit test attributes API [2]. Thus hopefully this will make
migration easier later on (suggested by David [3]).

The original line in that test attribute is figured out by
providing an anchor (suggested by Boqun [4]). The original file
is found by walking the filesystem, checking directory prefixes
to reduce the amount of combinations to check, and it is only
done once per file. Ambiguities are detected and reported.

A notable difference from KUnit C tests is that the Rust tests
appear to assert using the usual `assert!` and `assert_eq!`
macros from the Rust standard library (`core`). We provide
a custom version that forwards the call to KUnit instead.
Importantly, these macros do not require passing context,
unlike the KUnit C ones (i.e. `struct kunit *`). This makes
them easier to use, and readers of the documentation do not need
to care about which testing framework is used. In addition, it
may allow us to test third-party code more easily in the future.

However, a current limitation is that KUnit does not support
assertions in other tasks. Thus we presently simply print an
error to the kernel log if an assertion actually failed. This
should be revisited to properly fail the test, perhaps saving
the context somewhere else, or letting KUnit handle it.

Link: https://lore.kernel.org/lkml/20230420205734.1288498-1-rmoar@google.com/ [1]
Link: https://lore.kernel.org/linux-kselftest/20230707210947.1208717-1-rmoar@google.com/ [2]
Link: https://lore.kernel.org/rust-for-linux/CABVgOSkOLO-8v6kdAGpmYnZUb+LKOX0CtYCo-Bge7r_2YTuXDQ@mail.gmail.com/ [3]
Link: https://lore.kernel.org/rust-for-linux/ZIps86MbJF%2FiGIzd@boqun-archlinux/ [4]
Signed-off-by: Miguel Ojeda <ojeda@kernel.org>
Reviewed-by: David Gow <davidgow@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2023-07-19 09:32:53 -06:00
Mark Brown
4619dd77e6
ASoC: Improve coverage in default KUnit runs
Merge series from Mark Brown <broonie@kernel.org>:

We have some KUnit tests for ASoC but they're not being run as much as
they should be since ASoC isn't enabled in the configs used by default
with KUnit and in the case of the topology tests there is no way to
enable them without enabling drivers that use them.  This series
provides a Kconfig option which KUnit can use directly rather than worry
about drivers.

Further, since KUnit is typically run in UML but ALSA prevents build
with UML we need to remove that Kconfig conflict.  As far as I can tell
the motiviation for this is that many ALSA drivers use iomem APIs which
are not available under UML and it's more trouble than it's worth to go
through and add per driver dependencies.  In order to avoid these issues
we also provide stubs for these APIs so there are no build time issues
if a driver relies on iomem but does not depend on it.  With these stubs
I am able to build all the sound drivers available in a UML defconfig
(UML allmodconfig appears to have substantial other issues in a quick
test).

With this series I am able to run the topology KUnit tests as part of a
kunit --alltests run.
2023-07-19 01:06:03 +01:00
Linus Torvalds
4806364acf Seven hotfixes, six of which are cc:stable and one of which addresses a
post-6.5 issue.
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCZLboHQAKCRDdBJ7gKXxA
 jqwtAP4m3MQNcYzQk8qbV+EQat/csTnrefytyD0ogFRoxcMAFAD/XT784sZzn4SU
 s/mL1HLk1BsubT/yQmY3lISXHDPuPAo=
 =5W3V
 -----END PGP SIGNATURE-----

Merge tag 'mm-hotfixes-stable-2023-07-18-12-28' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull hotfixes from Andrew Morton:
 "Seven hotfixes, six of which are cc:stable and one of which addresses
  a post-6.5 issue"

* tag 'mm-hotfixes-stable-2023-07-18-12-28' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
  maple_tree: fix node allocation testing on 32 bit
  maple_tree: fix 32 bit mas_next testing
  selftests/mm: mkdirty: fix incorrect position of #endif
  maple_tree: set the node limit when creating a new root node
  mm/mlock: fix vma iterator conversion of apply_vma_lock_flags()
  prctl: move PR_GET_AUXV out of PR_MCE_KILL
  selftests/mm: give scripts execute permission
2023-07-18 14:19:42 -07:00
Yury Norov
2356d198d2 lib/bitmap: workaround const_eval test build failure
When building with Clang, and when KASAN and GCOV_PROFILE_ALL are both
enabled, the test fails to build [1]:

>> lib/test_bitmap.c:920:2: error: call to '__compiletime_assert_239' declared with 'error' attribute: BUILD_BUG_ON failed: !__builtin_constant_p(res)
           BUILD_BUG_ON(!__builtin_constant_p(res));
           ^
   include/linux/build_bug.h:50:2: note: expanded from macro 'BUILD_BUG_ON'
           BUILD_BUG_ON_MSG(condition, "BUILD_BUG_ON failed: " #condition)
           ^
   include/linux/build_bug.h:39:37: note: expanded from macro 'BUILD_BUG_ON_MSG'
   #define BUILD_BUG_ON_MSG(cond, msg) compiletime_assert(!(cond), msg)
                                       ^
   include/linux/compiler_types.h:352:2: note: expanded from macro 'compiletime_assert'
           _compiletime_assert(condition, msg, __compiletime_assert_, __COUNTER__)
           ^
   include/linux/compiler_types.h:340:2: note: expanded from macro '_compiletime_assert'
           __compiletime_assert(condition, msg, prefix, suffix)
           ^
   include/linux/compiler_types.h:333:4: note: expanded from macro '__compiletime_assert'
                           prefix ## suffix();                             \
                           ^
   <scratch space>:185:1: note: expanded from here
   __compiletime_assert_239

Originally it was attributed to s390, which now looks seemingly wrong. The
issue is not related to bitmap code itself, but it breaks build for a given
configuration.

Disabling the const_eval test under that config may potentially hide other
bugs. Instead, workaround it by disabling GCOV for the test_bitmap unless
the compiler will get fixed.

[1] https://github.com/ClangBuiltLinux/linux/issues/1874

Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202307171254.yFcH97ej-lkp@intel.com/
Fixes: dc34d50366 ("lib: test_bitmap: add compile-time optimization/evaluations assertions")
Co-developed-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Yury Norov <yury.norov@gmail.com>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Reviewed-by: Alexander Lobakin <aleksander.lobakin@intel.com>
2023-07-18 13:25:37 -07:00
Jann Horn
ce6616724f ubsan: Clarify Kconfig text for CONFIG_UBSAN_TRAP
Make it clearer in the one-line description and the verbose description
text that CONFIG_UBSAN_TRAP as currently implemented involves a tradeoff of
much less helpful oops messages in exchange for a smaller kernel image.
(With the additional effect of turning UBSAN warnings into crashes, which
may or may not be desired.)

Signed-off-by: Jann Horn <jannh@google.com>
Link: https://lore.kernel.org/r/20230705215128.486054-1-jannh@google.com
Signed-off-by: Kees Cook <keescook@chromium.org>
2023-07-17 16:05:19 -07:00
Randy Dunlap
dcb60f9c40 cpumask: eliminate kernel-doc warnings
Update lib/cpumask.c and <linux/cpumask.h> to fix all kernel-doc
warnings:

include/linux/cpumask.h:185: warning: Function parameter or member 'srcp1' not described in 'cpumask_first_and'
include/linux/cpumask.h:185: warning: Function parameter or member 'srcp2' not described in 'cpumask_first_and'
include/linux/cpumask.h:185: warning: Excess function parameter 'src1p' description in 'cpumask_first_and'
include/linux/cpumask.h:185: warning: Excess function parameter 'src2p' description in 'cpumask_first_and'

lib/cpumask.c:59: warning: Function parameter or member 'node' not described in 'alloc_cpumask_var_node'
lib/cpumask.c:169: warning: Function parameter or member 'src1p' not described in 'cpumask_any_and_distribute'
lib/cpumask.c:169: warning: Function parameter or member 'src2p' not described in 'cpumask_any_and_distribute'

Fixes: 7b4967c532 ("cpumask: Add alloc_cpumask_var_node()")
Fixes: 839cad5fa5 ("cpumask: fix function description kernel-doc notation")
Fixes: 93ba139ba8 ("cpumask: use find_first_and_bit()")
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Yury Norov <yury.norov@gmail.com>
2023-07-17 15:47:21 -07:00
Liam R. Howlett
7a93c71a67 maple_tree: fix 32 bit mas_next testing
The test setup of mas_next is dependent on node entry size to create a 2
level tree, but the tests did not account for this in the expected value
when shifting beyond the scope of the tree.

Fix this by setting up the test to succeed depending on the node entries
which is dependent on the 32/64 bit setup.

Link: https://lkml.kernel.org/r/20230712173916.168805-1-Liam.Howlett@oracle.com
Fixes: 120b116208 ("maple_tree: reorganize testing to restore module testing")
Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
  Closes: https://lore.kernel.org/linux-mm/CAMuHMdV4T53fOw7VPoBgPR7fP6RYqf=CBhD_y_vOg53zZX_DnA@mail.gmail.com/
Tested-by: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-07-17 12:53:22 -07:00
Peng Zhang
3c769fd88b maple_tree: set the node limit when creating a new root node
Set the node limit of the root node so that the last pivot of all nodes is
the node limit (if the node is not full).

This patch also fixes a bug in mas_rev_awalk().  Effectively, always
setting a maximum makes mas_logical_pivot() behave as mas_safe_pivot(). 
Without this fix, it is possible that very small tasks would fail to find
the correct gap.  Although this has not been observed with real tasks, it
has been reported to happen in m68k nommu running the maple tree tests.

Link: https://lkml.kernel.org/r/20230711035444.526-1-zhangpeng.00@bytedance.com
Link: https://lore.kernel.org/linux-mm/CAMuHMdV4T53fOw7VPoBgPR7fP6RYqf=CBhD_y_vOg53zZX_DnA@mail.gmail.com/
Link: https://lkml.kernel.org/r/20230711035444.526-2-zhangpeng.00@bytedance.com
Fixes: 54a611b605 ("Maple Tree: add new data structure")
Signed-off-by: Peng Zhang <zhangpeng.00@bytedance.com>
Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Tested-by: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-07-17 12:53:22 -07:00
Jakub Kicinski
d2afa89f66 for-netdev
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEE+soXsSLHKoYyzcli6rmadz2vbToFAmSwqwoACgkQ6rmadz2v
 bTqOHRAAn+fzTLqUqsveFQcxOkie5MPHxKoOTjG4+yFR7rzPkU6Mn5RX3w5yFzSn
 RqutwykF9OgipAzC3QXv4pRJuq6Gia5nvwUSDP4CX273ljyeF54DK7HfopE1+YrK
 HXyBWZvVvMZP6q7qQyQ3qtbHZSjs5XP/M6YBlJ5zo/BTLFCyvbSDP14YKEqcBkWG
 ld72ElXFxlnr/zEfRjzBCfMlbmgeHLO0SiHS/9827zEmNP1AAH5/ETA7/rJ7yCJs
 QNQUIoJWob8xm5FMJ6CU/+sOqXR1CY053meGJFFBX5pvVD/CLRhrwHn0IMCyQqmh
 wKR5waeXhpl/CKNeFuxXVMNFiXbqBb/0LYJaJtrMysjMLTsQ9X7NkrDBa/+kYGyZ
 +ghGlaMQvPqUGg0rLH2nl9JNB8Ne/8prLMsAKUWnPuOo+Q03j054gnqhGeNtDd5b
 gpSk+7x93PlhGcegBV1Wk8dkiGC5V9nTVNxg40XQUCs4k9L/8Vjc35Tjqx7nBTNH
 DiFD24DDKUZacw9L6nEqvLF/N2fiRjtUZnVPC0yn/annyBcfX1s+ZH2Tu1F6Qk38
 QMfBCnt12exmsiDoxdzzGJtjHnS/k5fsaKjlR21mOyMrIH7ipltr5UHHrdr1hBP6
 24uSeTImvQQKDi+9IuXN127jZDOupKqVS6csrA0ZXrlKWh2HR+U=
 =GVUB
 -----END PGP SIGNATURE-----

Merge tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next

Alexei Starovoitov says:

====================
pull-request: bpf-next 2023-07-13

We've added 67 non-merge commits during the last 15 day(s) which contain
a total of 106 files changed, 4444 insertions(+), 619 deletions(-).

The main changes are:

1) Fix bpftool build in presence of stale vmlinux.h,
   from Alexander Lobakin.

2) Introduce bpf_me_mcache_free_rcu() and fix OOM under stress,
   from Alexei Starovoitov.

3) Teach verifier actual bounds of bpf_get_smp_processor_id()
   and fix perf+libbpf issue related to custom section handling,
   from Andrii Nakryiko.

4) Introduce bpf map element count, from Anton Protopopov.

5) Check skb ownership against full socket, from Kui-Feng Lee.

6) Support for up to 12 arguments in BPF trampoline, from Menglong Dong.

7) Export rcu_request_urgent_qs_task, from Paul E. McKenney.

8) Fix BTF walking of unions, from Yafang Shao.

9) Extend link_info for kprobe_multi and perf_event links,
   from Yafang Shao.

* tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (67 commits)
  selftests/bpf: Add selftest for PTR_UNTRUSTED
  bpf: Fix an error in verifying a field in a union
  selftests/bpf: Add selftests for nested_trust
  bpf: Fix an error around PTR_UNTRUSTED
  selftests/bpf: add testcase for TRACING with 6+ arguments
  bpf, x86: allow function arguments up to 12 for TRACING
  bpf, x86: save/restore regs with BPF_DW size
  bpftool: Use "fallthrough;" keyword instead of comments
  bpf: Add object leak check.
  bpf: Convert bpf_cpumask to bpf_mem_cache_free_rcu.
  bpf: Introduce bpf_mem_free_rcu() similar to kfree_rcu().
  selftests/bpf: Improve test coverage of bpf_mem_alloc.
  rcu: Export rcu_request_urgent_qs_task()
  bpf: Allow reuse from waiting_for_gp_ttrace list.
  bpf: Add a hint to allocated objects.
  bpf: Change bpf_mem_cache draining process.
  bpf: Further refactor alloc_bulk().
  bpf: Factor out inc/dec of active flag into helpers.
  bpf: Refactor alloc_bulk().
  bpf: Let free_all() return the number of freed elements.
  ...
====================

Link: https://lore.kernel.org/r/20230714020910.80794-1-alexei.starovoitov@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-07-13 19:13:24 -07:00
Geert Uytterhoeven
b2ec116aad workqueue: Fix cpu_intensive_thresh_us name in help text
There exists no parameter called "cpu_intensive_threshold_us".
The actual parameter name is "cpu_intensive_thresh_us".

Fixes: 6363845005 ("workqueue: Report work funcs that trigger automatic CPU_INTENSIVE mechanism")
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Tejun Heo <tj@kernel.org>
2023-07-11 11:33:25 -10:00
Peter Zijlstra
719a937b70 iov_iter: Mark copy_iovec_from_user() noclone
Extend commit 50f9a76ef1 ("iov_iter: Mark
copy_compat_iovec_from_user() noinline") to also cover
copy_iovec_from_user(). Different compiler versions cause the same
problem on different functions.

lib/iov_iter.o: warning: objtool: .altinstr_replacement+0x1f: redundant UACCESS disable
lib/iov_iter.o: warning: objtool: iovec_from_user+0x84: call to copy_iovec_from_user.part.0() with UACCESS enabled
lib/iov_iter.o: warning: objtool: __import_iovec+0x143: call to copy_iovec_from_user.part.0() with UACCESS enabled

Fixes: 50f9a76ef1 ("iov_iter: Mark copy_compat_iovec_from_user() noinline")
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lkml.kernel.org/r/20230616124354.GD4253@hirez.programming.kicks-ass.net
2023-07-10 09:52:28 +02:00
Andy Shevchenko
9ab04d7ed8
lib/math/int_log: Replace LGPL-2.1-or-later boilerplate with SPDX identifier
Replace license boilerplate in udftime.c with SPDX identifier for
LGPL-2.1-or-later.

Acked-by: Mauro Carvalho Chehab <mchehab@kernel.org>
Link: https://lore.kernel.org/r/20230619172019.21457-4-andriy.shevchenko@linux.intel.com
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Link: https://lore.kernel.org/r/20230703135211.87416-4-andriy.shevchenko@linux.intel.com
Signed-off-by: Mark Brown <broonie@kernel.org>
2023-07-09 22:47:50 +01:00
Andy Shevchenko
08f6a14b2d
lib/math/int_log: Use ARRAY_SIZE(logtable) where makes sense
Use ARRAY_SIZE(logtable) where makes sense.

Reviewed-by: Mauro Carvalho Chehab <mchehab@kernel.org>
Acked-by: Mauro Carvalho Chehab <mchehab@kernel.org>
Link: https://lore.kernel.org/r/20230619172019.21457-3-andriy.shevchenko@linux.intel.com
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Link: https://lore.kernel.org/r/20230703135211.87416-3-andriy.shevchenko@linux.intel.com
Signed-off-by: Mark Brown <broonie@kernel.org>
2023-07-09 22:47:49 +01:00
Andy Shevchenko
f97fa3dcb2
lib/math: Move dvb_math.c into lib/math/int_log.c
Some existing and new users may benefit from the intlog2() and
intlog10() APIs, make them wide available.

Reviewed-by: Mauro Carvalho Chehab <mchehab@kernel.org>
Acked-by: Mauro Carvalho Chehab <mchehab@kernel.org>
Link: https://lore.kernel.org/r/20230619172019.21457-2-andriy.shevchenko@linux.intel.com
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-by: Randy Dunlap <rdunlap@infradead.org>
Link: https://lore.kernel.org/r/20230703135211.87416-2-andriy.shevchenko@linux.intel.com
Signed-off-by: Mark Brown <broonie@kernel.org>
2023-07-09 22:47:48 +01:00
Linus Torvalds
946c6b59c5 16 hotfixes. Six are cc:stable and the remainder address post-6.4 issues.
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCZKmgXAAKCRDdBJ7gKXxA
 joqDAP0V520Jy0cyJrRMvaQRFMqtVeDOdTpAue7ZOQHSi/LZnAD9EEAxDpYF/V4x
 PO27ixXQ4Glm2iYgH7bDX7J73WiA3wg=
 =JsYW
 -----END PGP SIGNATURE-----

Merge tag 'mm-hotfixes-stable-2023-07-08-10-43' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull hotfixes from Andrew Morton:
 "16 hotfixes. Six are cc:stable and the remainder address post-6.4
  issues"

The merge undoes the disabling of the CONFIG_PER_VMA_LOCK feature, since
it was all hopefully fixed in mainline.

* tag 'mm-hotfixes-stable-2023-07-08-10-43' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
  lib: dhry: fix sleeping allocations inside non-preemptable section
  kasan, slub: fix HW_TAGS zeroing with slub_debug
  kasan: fix type cast in memory_is_poisoned_n
  mailmap: add entries for Heiko Stuebner
  mailmap: update manpage link
  bootmem: remove the vmemmap pages from kmemleak in free_bootmem_page
  MAINTAINERS: add linux-next info
  mailmap: add Markus Schneider-Pargmann
  writeback: account the number of pages written back
  mm: call arch_swap_restore() from do_swap_page()
  squashfs: fix cache race with migration
  mm/hugetlb.c: fix a bug within a BUG(): inconsistent pte comparison
  docs: update ocfs2-devel mailing list address
  MAINTAINERS: update ocfs2-devel mailing list address
  mm: disable CONFIG_PER_VMA_LOCK until its fixed
  fork: lock VMAs of the parent process when forking
2023-07-08 14:30:25 -07:00
Linus Torvalds
8fc3b8f082 hardening fixes for v6.5-rc1
- Check for NULL bdev in LoadPin (Matthias Kaehlcke)
 
 - Revert unwanted KUnit FORTIFY build default
 
 - Fix 1-element array causing boot warnings with xhci-hub
 -----BEGIN PGP SIGNATURE-----
 
 iQJKBAABCgA0FiEEpcP2jyKd1g9yPm4TiXL039xtwCYFAmSoVSsWHGtlZXNjb29r
 QGNocm9taXVtLm9yZwAKCRCJcvTf3G3AJjyuD/9Sgr+T3VJyROJdKouYO8tLUqaO
 g0A6+WE0L7XyO4ZYk4FOadeihsVEPuhB0fpDTwriKCKdPB35+Nhq8YfWPPQcGdjQ
 0IAT5AjsjYDDFGABRtsNRcL+KXyR+QRVUnSllEsZuwb3lyq6HRbdTF2QBjToAbyO
 QOgEnFJNqPp2w9y2KSzpMuYL4I9o1WbyM+huVSfoKe/3d2WnVKiARMpV+0EJgUAy
 BvORp55+c1w77IRbQduACWszdCLXfkQyI+p5ii3M7cZmePDe4q8LHN01WtIMEnHy
 cln7AnwU4daxzfdeAWIQMLFjOXTLHlkRhC18KSobeBc5Zkudtcg5LxtFGiDsDgOU
 mUWB/Ow8rgr6KlYkMFmFrW/GAVX12KbPXDATECa/4Yhl55Ydl/1bChJWWnX2pppU
 mRRnwIcY7MfhRLeB284Gst81wOHy408arJsm/vck5kdya0Ys1y38rgNQm7iKfXVu
 FYMrDU9qqGmeIVk2namjQYoWH5ei670PXndtrcvSffeZOhpzk2FnFphtraPe0mrl
 l1lcUonZwEoTJ4wDiOR9cjSphoDVom9LgwygQVb4KGHBjuCfRABDV2DGy9duBMtv
 Akcet48VkCX6wF91+30fFmTs5haRiF/5kkx5fGuxhFlQO8QHYVjIOH55VqhAt3mw
 d0OWiZaNRvbNfjPSkQ==
 =R3uK
 -----END PGP SIGNATURE-----

Merge tag 'hardening-v6.5-rc1-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux

Pull hardening fixes from Kees Cook:

 - Check for NULL bdev in LoadPin (Matthias Kaehlcke)

 - Revert unwanted KUnit FORTIFY build default

 - Fix 1-element array causing boot warnings with xhci-hub

* tag 'hardening-v6.5-rc1-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
  usb: ch9: Replace bmSublinkSpeedAttr 1-element array with flexible array
  Revert "fortify: Allow KUnit test to build without FORTIFY"
  dm: verity-loadpin: Add NULL pointer check for 'bdev' parameter
2023-07-08 12:08:39 -07:00
Linus Torvalds
ad8258e877 bitmap patches for v6.5
Fixes for different bitmap pieces:
 
  - lib/test_bitmap: increment failure counter properly
 
    The tests that don't use expect_eq() macro to determine that a test is
    failured must increment failed_tests explicitly.
 
  - lib/bitmap: drop optimization of bitmap_{from,to}_arr64
 
    bitmap_{from,to}_arr64() optimization is overly optimistic on 32-bit LE
    architectures when it's wired to bitmap_copy_clear_tail().
 
  - nodemask: Drop duplicate check in for_each_node_mask()
 
    As the return value type of first_node() became unsigned, the node >= 0
    became unnecessary.
 
  - cpumask: fix function description kernel-doc notation
 
  - MAINTAINERS: Add bits.h to the BITMAP API record
  - MAINTAINERS: Add bitfield.h to the BITMAP API record
 
    Add linux/bits.h and linux/bitfield.h for visibility
 -----BEGIN PGP SIGNATURE-----
 
 iQGzBAABCgAdFiEEi8GdvG6xMhdgpu/4sUSA/TofvsgFAmShrd0ACgkQsUSA/Tof
 vsg5RAv/YyedOccCkLtd4D+l+jF/a1qd64u6HfONkYzdoSeKIBiAlKukpHOjJjky
 ii5vSwEMzW34MA6wBylmY4078OXEwEu9SUsF/+5jGvqJ1ABCZkwFoMuAavRHkHhc
 xE/es6XvpSbZ4U4GHnyaUbQCuEkpW21U644QdSWqjVPDHmo32EgtNG7hTaZfCRNh
 VNa8H+tbgi5rQUMBeyxppJjyynGKwut6pX4OcOi2lq1onz8zzHx8H4AJC2v8Yg0r
 2ZRS1FfAuk2Pvp7JFQOExcxf0x/Ph3zp8dqIgfo2pSMAi/0ZjnTkAPxCYIGsWG5u
 TbfO3rHviHtKAN0q5/6xp939G69flW5xSica+IvtuXSIS/JbllwDN4mrexYxppWH
 Euj/ixJe79Dn+jtOgG17p4cp70yAcM1IQfyWH25Q8Jej+C/gN8WB0UXD4K7jTXrn
 9rGID28WbuclAWzJDmFQTpZutG1GHjM7rytit+if08gz065vUhTllhJC+vCHQojY
 clS4E2DW
 =nZF4
 -----END PGP SIGNATURE-----

Merge tag 'bitmap-6.5-rc1' of https://github.com/norov/linux

Pull bitmap updates from Yury Norov:
 "Fixes for different bitmap pieces:

   - lib/test_bitmap: increment failure counter properly

     The tests that don't use expect_eq() macro to determine that a test
     is failured must increment failed_tests explicitly.

   - lib/bitmap: drop optimization of bitmap_{from,to}_arr64

     bitmap_{from,to}_arr64() optimization is overly optimistic
     on 32-bit LE architectures when it's wired to
     bitmap_copy_clear_tail().

   - nodemask: Drop duplicate check in for_each_node_mask()

     As the return value type of first_node() became unsigned, the node
     >= 0 became unnecessary.

   - cpumask: fix function description kernel-doc notation

   - MAINTAINERS: Add bits.h and bitfield.h to the BITMAP API record

     Add linux/bits.h and linux/bitfield.h for visibility"

* tag 'bitmap-6.5-rc1' of https://github.com/norov/linux:
  MAINTAINERS: Add bitfield.h to the BITMAP API record
  MAINTAINERS: Add bits.h to the BITMAP API record
  cpumask: fix function description kernel-doc notation
  nodemask: Drop duplicate check in for_each_node_mask()
  lib/bitmap: drop optimization of bitmap_{from,to}_arr64
  lib/test_bitmap: increment failure counter properly
2023-07-08 10:02:24 -07:00
Geert Uytterhoeven
8ba388c06b lib: dhry: fix sleeping allocations inside non-preemptable section
The Smatch static checker reports the following warnings:

    lib/dhry_run.c:38 dhry_benchmark() warn: sleeping in atomic context
    lib/dhry_run.c:43 dhry_benchmark() warn: sleeping in atomic context

Indeed, dhry() does sleeping allocations inside the non-preemptable
section delimited by get_cpu()/put_cpu().

Fix this by using atomic allocations instead.
Add error handling, as atomic these allocations may fail.

Link: https://lkml.kernel.org/r/bac6d517818a7cd8efe217c1ad649fffab9cc371.1688568764.git.geert+renesas@glider.be
Fixes: 13684e966d ("lib: dhry: fix unstable smp_processor_id(_) usage")
Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
Closes: https://lore.kernel.org/r/0469eb3a-02eb-4b41-b189-de20b931fa56@moroto.mountain
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-07-08 09:29:32 -07:00
Linus Torvalds
7b82e90411 asm-generic updates for 6.5
These are cleanups for architecture specific header files:
 
  - the comments in include/linux/syscalls.h have gone out of sync
    and are really pointless, so these get removed
 
  - The asm/bitsperlong.h header no longer needs to be architecture
    specific on modern compilers, so use a generic version for newer
    architectures that use new enough userspace compilers
 
  - A cleanup for virt_to_pfn/virt_to_bus to have proper type
    checking, forcing the use of pointers
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEiK/NIGsWEZVxh/FrYKtH/8kJUicFAmSl138ACgkQYKtH/8kJ
 UieqWxAA2WjNVfyuieYckglOVE0PZPs2fzCwyzTY5iUTH3gE5cBFWJDWcg2EnouG
 v3X3htEQcowYWaCF9+rypQXaGiSx4WXi2Bjxnz3D/BcreqWPI4eSQ0fpGG5SURTY
 2zYF72GTt4JGR++l+7/R9MZwPbwYDT9BsD5tkel8PxnyVLM6/c5xFvbjzRSKFE8x
 SMN1jGZ62ITLNf/8coAOEPNxBYtDT6yQyu7P2sx5cd65LAQq9yLKjFklnBBovgWT
 OoCIZAdGkhcNwOh1LjyHcdNdpfNJGceKyqKPqty07IhCQuF2jxiyFYFzuBbeyQfE
 S0itN8o/MIfUmxaQl3e8dPAVb1RlNVr1zfQ6y4tUtWNdkNL2WwSnSQSRHrBfHxCQ
 QCF++PMeFcLhGwMYtqdNJ7XGLQ0PsjD74pRf0vo+vjmqDk2BJsJBP57VU+8MJn5r
 SoxqnJ0WxLvm1TfrNKusV7zMNWquc2duJDW40zsOssP4itjYELSI6qa56qmzlqmX
 zKmRx6mxAlx9RRK8FHXFYHbz3p93vv8z9vTOZV3AjIjjED960CLknUAwCC8FoJyz
 9b5wyMXsLQHQjGt8luAvPc6OiU0EiU9a4SPK+feWcv27serFvnjJlRTS/yG2Z3zd
 BYsUgsXHypsdoud+aE7MeCy7fE8n3mhoyMQQRBkOMFJ7RsG6wAE=
 =S/he
 -----END PGP SIGNATURE-----

Merge tag 'asm-generic-6.5' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic

Pull asm-generic updates from Arnd Bergmann:
 "These are cleanups for architecture specific header files:

   - the comments in include/linux/syscalls.h have gone out of sync and
     are really pointless, so these get removed

   - The asm/bitsperlong.h header no longer needs to be architecture
     specific on modern compilers, so use a generic version for newer
     architectures that use new enough userspace compilers

   - A cleanup for virt_to_pfn/virt_to_bus to have proper type checking,
     forcing the use of pointers"

* tag 'asm-generic-6.5' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic:
  syscalls: Remove file path comments from headers
  tools arch: Remove uapi bitsperlong.h of hexagon and microblaze
  asm-generic: Unify uapi bitsperlong.h for arm64, riscv and loongarch
  m68k/mm: Make pfn accessors static inlines
  arm64: memory: Make virt_to_pfn() a static inline
  ARM: mm: Make virt_to_pfn() a static inline
  asm-generic/page.h: Make pfn accessors static inlines
  xen/netback: Pass (void *) to virt_to_page()
  netfs: Pass a pointer to virt_to_page()
  cifs: Pass a pointer to virt_to_page() in cifsglob
  cifs: Pass a pointer to virt_to_page()
  riscv: mm: init: Pass a pointer to virt_to_page()
  ARC: init: Pass a pointer to virt_to_pfn() in init
  m68k: Pass a pointer to virt_to_pfn() virt_to_page()
  fs/proc/kcore.c: Pass a pointer to virt_addr_valid()
2023-07-06 10:06:04 -07:00
Kees Cook
5e2956ee46 Revert "fortify: Allow KUnit test to build without FORTIFY"
This reverts commit a9dc8d0442.

The standard for KUnit is to not build tests at all when required
functionality is missing, rather than doing test "skip". Restore this
for the fortify tests, so that architectures without
CONFIG_ARCH_HAS_FORTIFY_SOURCE do not emit unsolvable warnings.

Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
Closes: https://lore.kernel.org/all/CAMuHMdUrxOEroHVUt7-mAnKSBjY=a-D3jr+XiAifuwv06Ob9Pw@mail.gmail.com
Signed-off-by: Kees Cook <keescook@chromium.org>
2023-07-03 12:50:26 -07:00
Linus Torvalds
44aeec836d Char/Misc and other driver subsystem updates for 6.5-rc1
Here is the big set of char/misc and other driver subsystem updates for
 6.5-rc1.
 
 Lots of different, tiny, stuff in here, from a range of smaller driver
 subsystems, including pulls from some substems directly:
   - IIO driver updates and additions
   - W1 driver updates and fixes (and a new maintainer!)
   - FPGA driver updates and fixes
   - Counter driver updates
   - Extcon driver updates
   - Interconnect driver updates
   - Coresight driver updates
   - mfd tree tag merge needed for other updates
 on top of that, lots of small driver updates as patches, including:
   - static const updates for class structures
   - nvmem driver updates
   - pcmcia driver fix
   - lots of other small driver updates and fixes
 
 All of these have been in linux-next for a while with no reported
 problems.
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 -----BEGIN PGP SIGNATURE-----
 
 iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCZKKNMw8cZ3JlZ0Brcm9h
 aC5jb20ACgkQMUfUDdst+ylhlQCfZrtz8RIbau8zbzh/CKpKBOmvHp4An3V64hbz
 recBPLH0ZACKl0wPl4iZ
 =A83A
 -----END PGP SIGNATURE-----

Merge tag 'char-misc-6.5-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc

Pull Char/Misc updates from Greg KH:
 "Here is the big set of char/misc and other driver subsystem updates
  for 6.5-rc1.

  Lots of different, tiny, stuff in here, from a range of smaller driver
  subsystems, including pulls from some substems directly:

   - IIO driver updates and additions

   - W1 driver updates and fixes (and a new maintainer!)

   - FPGA driver updates and fixes

   - Counter driver updates

   - Extcon driver updates

   - Interconnect driver updates

   - Coresight driver updates

   - mfd tree tag merge needed for other updates on top of that, lots of
     small driver updates as patches, including:

   - static const updates for class structures

   - nvmem driver updates

   - pcmcia driver fix

   - lots of other small driver updates and fixes

  All of these have been in linux-next for a while with no reported
  problems"

* tag 'char-misc-6.5-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (243 commits)
  bsr: fix build problem with bsr_class static cleanup
  comedi: make all 'class' structures const
  char: xillybus: make xillybus_class a static const structure
  xilinx_hwicap: make icap_class a static const structure
  virtio_console: make port class a static const structure
  ppdev: make ppdev_class a static const structure
  char: misc: make misc_class a static const structure
  /dev/mem: make mem_class a static const structure
  char: lp: make lp_class a static const structure
  dsp56k: make dsp56k_class a static const structure
  bsr: make bsr_class a static const structure
  oradax: make 'cl' a static const structure
  hwtracing: hisi_ptt: Fix potential sleep in atomic context
  hwtracing: hisi_ptt: Advertise PERF_PMU_CAP_NO_EXCLUDE for PTT PMU
  hwtracing: hisi_ptt: Export available filters through sysfs
  hwtracing: hisi_ptt: Add support for dynamically updating the filter list
  hwtracing: hisi_ptt: Factor out filter allocation and release operation
  samples: pfsm: add CC_CAN_LINK dependency
  misc: fastrpc: check return value of devm_kasprintf()
  coresight: dummy: Update type of mode parameter in dummy_{sink,source}_enable()
  ...
2023-07-03 12:46:47 -07:00
Linus Torvalds
5d95ff84e6 This update includes the following changes:
API:
 
 - Add linear akcipher/sig API.
 - Add tfm cloning (hmac, cmac).
 - Add statesize to crypto_ahash.
 
 Algorithms:
 
 - Allow only odd e and restrict value in FIPS mode for RSA.
 - Replace LFSR with SHA3-256 in jitter.
 - Add interface for gathering of raw entropy in jitter.
 
 Drivers:
 
 - Fix race on data_avail and actual data in hwrng/virtio.
 - Add hash and HMAC support in starfive.
 - Add RSA algo support in starfive.
 - Add support for PCI device 0x156E in ccp.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEn51F/lCuNhUwmDeSxycdCkmxi6cFAmSdECcACgkQxycdCkmx
 i6dW3g//a4DR6aaqYF8pU4svAzO56a0Plx3DVHUiJ4ygRB7xOzrQqXjCren6wY2a
 LFuetwxebAhIAPsC79vI+3j8VAIlU9cNVqOxBIJHGY7wFO4m1AjqBjlealzqLrth
 +nEIeUibqLeRw7imOO4adzSsKuSQgyU5rPtKWfrGqqI3RhuMgfWroCtmJ82jmq5l
 uMZgB+aGGkzyXztxubHRPeJ3nOFEzo95SscpJ43lOjMcURRBhEa+20jXDhUGwpI7
 9ycFV31AW+tfkIprAcliiIzZuwIbzlCkte6AxjAVsN100T/wh9JS1Y+uf1P0oZ9y
 AUQQKyc8/QpSkzHZPTncat5P6zta28r8Q5neCvEEEGGuOE8Oc6kb0Os+RE5ANMU4
 2A/zrKGOMIWeEWwXGc51xT3gxyl/Rn5wLw1pW7Lm4d5osGT9jiVXx/g66hKLpagJ
 jegI6CqgvUajkRNi7JPVnSAauu0Ay8O6pU37/8gLOXNGVZBqONpRimk9qB05LNSF
 QYzM2sgYv1tQEmjnG8jLhF5Z8brnqYTv2TZwBX43W10EDQNqUYUDff9Flean5xCb
 +2mxJc81rgtUffnMXyYvQwKLhVKoLpeLR6Ts455S5aP06WAfoyEJyYTA/LHG24GX
 H2HdS9g5y/K15k9yygMWaXgAx7O7MjM9gEa2VQakhnByj/eQM0s=
 =rOLu
 -----END PGP SIGNATURE-----

Merge tag 'v6.5-p1' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6

Pull crypto updates from Herbert Xu:
 "API:
   - Add linear akcipher/sig API
   - Add tfm cloning (hmac, cmac)
   - Add statesize to crypto_ahash

  Algorithms:
   - Allow only odd e and restrict value in FIPS mode for RSA
   - Replace LFSR with SHA3-256 in jitter
   - Add interface for gathering of raw entropy in jitter

  Drivers:
   - Fix race on data_avail and actual data in hwrng/virtio
   - Add hash and HMAC support in starfive
   - Add RSA algo support in starfive
   - Add support for PCI device 0x156E in ccp"

* tag 'v6.5-p1' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (85 commits)
  crypto: akcipher - Do not copy dst if it is NULL
  crypto: sig - Fix verify call
  crypto: akcipher - Set request tfm on sync path
  crypto: sm2 - Provide sm2_compute_z_digest when sm2 is disabled
  hwrng: imx-rngc - switch to DEFINE_SIMPLE_DEV_PM_OPS
  hwrng: st - keep clock enabled while hwrng is registered
  hwrng: st - support compile-testing
  hwrng: imx-rngc - fix the timeout for init and self check
  KEYS: asymmetric: Use new crypto interface without scatterlists
  KEYS: asymmetric: Move sm2 code into x509_public_key
  KEYS: Add forward declaration in asymmetric-parser.h
  crypto: sig - Add interface for sign/verify
  crypto: akcipher - Add sync interface without SG lists
  crypto: cipher - On clone do crypto_mod_get()
  crypto: api - Add __crypto_alloc_tfmgfp
  crypto: api - Remove crypto_init_ops()
  crypto: rsa - allow only odd e and restrict value in FIPS mode
  crypto: geniv - Split geniv out of AEAD Kconfig option
  crypto: algboss - Add missing dependency on RNG2
  crypto: starfive - Add RSA algo support
  ...
2023-06-30 21:27:13 -07:00
Linus Torvalds
d2a6fd45c5 Probes updates for v6.5:
- fprobe: Pass return address to the fprobe entry/exit callbacks so that
   the callbacks don't need to analyze pt_regs/stack to find the function
   return address.
 
 - kprobe events: cleanup usage of TPARG_FL_FENTRY and TPARG_FL_RETURN
   flags so that those are not set at once.
 
 - fprobe events:
  . Add a new fprobe events for tracing arbitrary function entry and
    exit as a trace event.
  . Add a new tracepoint events for tracing raw tracepoint as a trace
    event. This allows user to trace non user-exposed tracepoints.
  . Move eprobe's event parser code into probe event common file.
  . Introduce BTF (BPF type format) support to kernel probe (kprobe,
    fprobe and tracepoint probe) events so that user can specify traced
    function arguments by name. This also applies the type of argument
    when fetching the argument.
  . Introduce '$arg*' wildcard support if BTF is available. This expands
    the '$arg*' meta argument to all function argument automatically.
  . Check the return value types by BTF. If the function returns 'void',
    '$retval' is rejected.
  . Add some selftest script for fprobe events, tracepoint events and
    BTF support.
  . Update documentation about the fprobe events.
  . Some fixes for above features, document and selftests.
 
 - selftests for ftrace (except for new fprobe events):
  . Add a test case for multiple consecutive probes in a function which
    checks if ftrace based kprobe, optimized kprobe and normal kprobe
    can be defined in the same target function.
  . Add a test case for optimized probe, which checks whether kprobe
    can be optimized or not.
 -----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCgAdFiEEh7BulGwFlgAOi5DV2/sHvwUrPxsFAmSa+9MACgkQ2/sHvwUr
 PxsmOAgAmUOIWtvH5py7AZpIRhCj8B18F6KnT7w2hByCsRxf7SaCqMhpBCk9VnYv
 9fJFBHpvYRJEmpHoH3o2ET5AGfKVNac9z96AGI2qJ4ECWITd6I5+WfTdZ5ueVn2d
 f6DQ10mHXDHSMFbuqfYWSHtkeivJpWpUNHhwzPb4doNOe06bZNfVuSgnksFg1at5
 kq16HbvGnhPzdO4YHmvqwjmRHr5/nCI1KDE9xIBcqNtWFbiRigC11zaZEUkLX+vT
 F63ShyfCK718AiwDfnjXpGkXAiVOZuAIR8RELaSqQ92YHCFKq5k9K4++WllPR5f9
 AxjVultFDiCd4oSPgYpQkjuZdFq9NA==
 =IhmY
 -----END PGP SIGNATURE-----

Merge tag 'probes-v6.5' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace

Pull probes updates from Masami Hiramatsu:

 - fprobe: Pass return address to the fprobe entry/exit callbacks so
   that the callbacks don't need to analyze pt_regs/stack to find the
   function return address.

 - kprobe events: cleanup usage of TPARG_FL_FENTRY and TPARG_FL_RETURN
   flags so that those are not set at once.

 - fprobe events:
      - Add a new fprobe events for tracing arbitrary function entry and
        exit as a trace event.
      - Add a new tracepoint events for tracing raw tracepoint as a
        trace event. This allows user to trace non user-exposed
        tracepoints.
      - Move eprobe's event parser code into probe event common file.
      - Introduce BTF (BPF type format) support to kernel probe (kprobe,
        fprobe and tracepoint probe) events so that user can specify
        traced function arguments by name. This also applies the type of
        argument when fetching the argument.
      - Introduce '$arg*' wildcard support if BTF is available. This
        expands the '$arg*' meta argument to all function argument
        automatically.
      - Check the return value types by BTF. If the function returns
        'void', '$retval' is rejected.
      - Add some selftest script for fprobe events, tracepoint events
        and BTF support.
      - Update documentation about the fprobe events.
      - Some fixes for above features, document and selftests.

 - selftests for ftrace (in addition to the new fprobe events):
      - Add a test case for multiple consecutive probes in a function
        which checks if ftrace based kprobe, optimized kprobe and normal
        kprobe can be defined in the same target function.
      - Add a test case for optimized probe, which checks whether kprobe
        can be optimized or not.

* tag 'probes-v6.5' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
  tracing/probes: Fix tracepoint event with $arg* to fetch correct argument
  Documentation: Fix typo of reference file name
  tracing/probes: Fix to return NULL and keep using current argc
  selftests/ftrace: Add new test case which checks for optimized probes
  selftests/ftrace: Add new test case which adds multiple consecutive probes in a function
  Documentation: tracing/probes: Add fprobe event tracing document
  selftests/ftrace: Add BTF arguments test cases
  selftests/ftrace: Add tracepoint probe test case
  tracing/probes: Add BTF retval type support
  tracing/probes: Add $arg* meta argument for all function args
  tracing/probes: Support function parameters if BTF is available
  tracing/probes: Move event parameter fetching code to common parser
  tracing/probes: Add tracepoint support on fprobe_events
  selftests/ftrace: Add fprobe related testcases
  tracing/probes: Add fprobe events for tracing function entry and exit.
  tracing/probes: Avoid setting TPARG_FL_FENTRY and TPARG_FL_RETURN
  fprobe: Pass return address to the handlers
2023-06-30 10:44:53 -07:00
Sumitra Sharma
da1a055d01 lib/test_bpf: Call page_address() on page acquired with GFP_KERNEL flag
generate_test_data() acquires a page with alloc_page(GFP_KERNEL).
The GFP_KERNEL is typical for kernel-internal allocations. The
caller requires ZONE_NORMAL or a lower zone for direct access.

Therefore the page cannot come from ZONE_HIGHMEM. Thus there's no
need to map it with kmap().

Also, the kmap() is being deprecated in favor of kmap_local_page() [1].

Hence, use a plain page_address() directly.

Since the page passed to the page_address() is not from the highmem
zone, the page_address() function will always return a valid kernel
virtual address and will not return NULL. Hence, remove the check
'if (!ptr)'.

Remove the unused variable 'ptr' and label 'err_free_page'.

  [1] https://lore.kernel.org/all/20220813220034.806698-1-ira.weiny@intel.com/

Reported-by: kernel test robot <lkp@intel.com>
Suggested-by: Fabio M. De Francesco <fmdefrancesco@gmail.com>
Signed-off-by: Sumitra Sharma <sumitraartsy@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Fabio M. De Francesco <fmdefrancesco@gmail.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Link: https://lore.kernel.org/bpf/20230623151644.GA434468@sumitra.com
2023-06-29 15:32:25 +02:00
Linus Torvalds
3a8a670eee Networking changes for 6.5.
Core
 ----
 
  - Rework the sendpage & splice implementations. Instead of feeding
    data into sockets page by page extend sendmsg handlers to support
    taking a reference on the data, controlled by a new flag called
    MSG_SPLICE_PAGES. Rework the handling of unexpected-end-of-file
    to invoke an additional callback instead of trying to predict what
    the right combination of MORE/NOTLAST flags is.
    Remove the MSG_SENDPAGE_NOTLAST flag completely.
 
  - Implement SCM_PIDFD, a new type of CMSG type analogous to
    SCM_CREDENTIALS, but it contains pidfd instead of plain pid.
 
  - Enable socket busy polling with CONFIG_RT.
 
  - Improve reliability and efficiency of reporting for ref_tracker.
 
  - Auto-generate a user space C library for various Netlink families.
 
 Protocols
 ---------
 
  - Allow TCP to shrink the advertised window when necessary, prevent
    sk_rcvbuf auto-tuning from growing the window all the way up to
    tcp_rmem[2].
 
  - Use per-VMA locking for "page-flipping" TCP receive zerocopy.
 
  - Prepare TCP for device-to-device data transfers, by making sure
    that payloads are always attached to skbs as page frags.
 
  - Make the backoff time for the first N TCP SYN retransmissions
    linear. Exponential backoff is unnecessarily conservative.
 
  - Create a new MPTCP getsockopt to retrieve all info (MPTCP_FULL_INFO).
 
  - Avoid waking up applications using TLS sockets until we have
    a full record.
 
  - Allow using kernel memory for protocol ioctl callbacks, paving
    the way to issuing ioctls over io_uring.
 
  - Add nolocalbypass option to VxLAN, forcing packets to be fully
    encapsulated even if they are destined for a local IP address.
 
  - Make TCPv4 use consistent hash in TIME_WAIT and SYN_RECV. Ensure
    in-kernel ECMP implementation (e.g. Open vSwitch) select the same
    link for all packets. Support L4 symmetric hashing in Open vSwitch.
 
  - PPPoE: make number of hash bits configurable.
 
  - Allow DNS to be overwritten by DHCPACK in the in-kernel DHCP client
    (ipconfig).
 
  - Add layer 2 miss indication and filtering, allowing higher layers
    (e.g. ACL filters) to make forwarding decisions based on whether
    packet matched forwarding state in lower devices (bridge).
 
  - Support matching on Connectivity Fault Management (CFM) packets.
 
  - Hide the "link becomes ready" IPv6 messages by demoting their
    printk level to debug.
 
  - HSR: don't enable promiscuous mode if device offloads the proto.
 
  - Support active scanning in IEEE 802.15.4.
 
  - Continue work on Multi-Link Operation for WiFi 7.
 
 BPF
 ---
 
  - Add precision propagation for subprogs and callbacks. This allows
    maintaining verification efficiency when subprograms are used,
    or in fact passing the verifier at all for complex programs,
    especially those using open-coded iterators.
 
  - Improve BPF's {g,s}setsockopt() length handling. Previously BPF
    assumed the length is always equal to the amount of written data.
    But some protos allow passing a NULL buffer to discover what
    the output buffer *should* be, without writing anything.
 
  - Accept dynptr memory as memory arguments passed to helpers.
 
  - Add routing table ID to bpf_fib_lookup BPF helper.
 
  - Support O_PATH FDs in BPF_OBJ_PIN and BPF_OBJ_GET commands.
 
  - Drop bpf_capable() check in BPF_MAP_FREEZE command (used to mark
    maps as read-only).
 
  - Show target_{obj,btf}_id in tracing link fdinfo.
 
  - Addition of several new kfuncs (most of the names are self-explanatory):
    - Add a set of new dynptr kfuncs: bpf_dynptr_adjust(),
      bpf_dynptr_is_null(), bpf_dynptr_is_rdonly(), bpf_dynptr_size()
      and bpf_dynptr_clone().
    - bpf_task_under_cgroup()
    - bpf_sock_destroy() - force closing sockets
    - bpf_cpumask_first_and(), rework bpf_cpumask_any*() kfuncs
 
 Netfilter
 ---------
 
  - Relax set/map validation checks in nf_tables. Allow checking
    presence of an entry in a map without using the value.
 
  - Increase ip_vs_conn_tab_bits range for 64BIT builds.
 
  - Allow updating size of a set.
 
  - Improve NAT tuple selection when connection is closing.
 
 Driver API
 ----------
 
  - Integrate netdev with LED subsystem, to allow configuring HW
    "offloaded" blinking of LEDs based on link state and activity
    (i.e. packets coming in and out).
 
  - Support configuring rate selection pins of SFP modules.
 
  - Factor Clause 73 auto-negotiation code out of the drivers, provide
    common helper routines.
 
  - Add more fool-proof helpers for managing lifetime of MDIO devices
    associated with the PCS layer.
 
  - Allow drivers to report advanced statistics related to Time Aware
    scheduler offload (taprio).
 
  - Allow opting out of VF statistics in link dump, to allow more VFs
    to fit into the message.
 
  - Split devlink instance and devlink port operations.
 
 New hardware / drivers
 ----------------------
 
  - Ethernet:
    - Synopsys EMAC4 IP support (stmmac)
    - Marvell 88E6361 8 port (5x1GE + 3x2.5GE) switches
    - Marvell 88E6250 7 port switches
    - Microchip LAN8650/1 Rev.B0 PHYs
    - MediaTek MT7981/MT7988 built-in 1GE PHY driver
 
  - WiFi:
    - Realtek RTL8192FU, 2.4 GHz, b/g/n mode, 2T2R, 300 Mbps
    - Realtek RTL8723DS (SDIO variant)
    - Realtek RTL8851BE
 
  - CAN:
    - Fintek F81604
 
 Drivers
 -------
 
  - Ethernet NICs:
    - Intel (100G, ice):
      - support dynamic interrupt allocation
      - use meta data match instead of VF MAC addr on slow-path
    - nVidia/Mellanox:
      - extend link aggregation to handle 4, rather than just 2 ports
      - spawn sub-functions without any features by default
    - OcteonTX2:
      - support HTB (Tx scheduling/QoS) offload
      - make RSS hash generation configurable
      - support selecting Rx queue using TC filters
    - Wangxun (ngbe/txgbe):
      - add basic Tx/Rx packet offloads
      - add phylink support (SFP/PCS control)
    - Freescale/NXP (enetc):
      - report TAPRIO packet statistics
    - Solarflare/AMD:
      - support matching on IP ToS and UDP source port of outer header
      - VxLAN and GENEVE tunnel encapsulation over IPv4 or IPv6
      - add devlink dev info support for EF10
 
  - Virtual NICs:
    - Microsoft vNIC:
      - size the Rx indirection table based on requested configuration
      - support VLAN tagging
    - Amazon vNIC:
      - try to reuse Rx buffers if not fully consumed, useful for ARM
        servers running with 16kB pages
    - Google vNIC:
      - support TCP segmentation of >64kB frames
 
  - Ethernet embedded switches:
    - Marvell (mv88e6xxx):
      - enable USXGMII (88E6191X)
    - Microchip:
     - lan966x: add support for Egress Stage 0 ACL engine
     - lan966x: support mapping packet priority to internal switch
       priority (based on PCP or DSCP)
 
  - Ethernet PHYs:
    - Broadcom PHYs:
      - support for Wake-on-LAN for BCM54210E/B50212E
      - report LPI counter
    - Microsemi PHYs: support RGMII delay configuration (VSC85xx)
    - Micrel PHYs: receive timestamp in the frame (LAN8841)
    - Realtek PHYs: support optional external PHY clock
    - Altera TSE PCS: merge the driver into Lynx PCS which it is
      a variant of
 
  - CAN: Kvaser PCIEcan:
    - support packet timestamping
 
  - WiFi:
    - Intel (iwlwifi):
      - major update for new firmware and Multi-Link Operation (MLO)
      - configuration rework to drop test devices and split
        the different families
      - support for segmented PNVM images and power tables
      - new vendor entries for PPAG (platform antenna gain) feature
    - Qualcomm 802.11ax (ath11k):
      - Multiple Basic Service Set Identifier (MBSSID) and
        Enhanced MBSSID Advertisement (EMA) support in AP mode
      - support factory test mode
    - RealTek (rtw89):
      - add RSSI based antenna diversity
      - support U-NII-4 channels on 5 GHz band
    - RealTek (rtl8xxxu):
      - AP mode support for 8188f
      - support USB RX aggregation for the newer chips
 
 Signed-off-by: Jakub Kicinski <kuba@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEE6jPA+I1ugmIBA4hXMUZtbf5SIrsFAmSbJM4ACgkQMUZtbf5S
 IrtoDhAAhEim1+LBIKf4lhPcVdZ2p/TkpnwTz5jsTwSeRBAxTwuNJ2fQhFXg13E3
 MnRq6QaEp8G4/tA/gynLvQop+FEZEnv+horP0zf/XLcC8euU7UrKdrpt/4xxdP07
 IL/fFWsoUGNO+L9LNaHwBo8g7nHvOkPscHEBHc2Xrvzab56TJk6vPySfLqcpKlNZ
 CHWDwTpgRqNZzSKiSpoMVd9OVMKUXcPYHpDmfEJ5l+e8vTXmZzOLHrSELHU5nP5f
 mHV7gxkDCTshoGcaed7UTiOvgu1p6E5EchDJxiLaSUbgsd8SZ3u4oXwRxgj33RK/
 fB2+UaLrRt/DdlHvT/Ph8e8Ygu77yIXMjT49jsfur/zVA0HEA2dFb7V6QlsYRmQp
 J25pnrdXmE15llgqsC0/UOW5J1laTjII+T2T70UOAqQl4LWYAQDG4WwsAqTzU0KY
 dueydDouTp9XC2WYrRUEQxJUzxaOaazskDUHc5c8oHp/zVBT+djdgtvVR9+gi6+7
 yy4elI77FlEEqL0ItdU/lSWINayAlPLsIHkMyhSGKX0XDpKjeycPqkNx4UterXB/
 JKIR5RBWllRft+igIngIkKX0tJGMU0whngiw7d1WLw25wgu4sB53hiWWoSba14hv
 tXMxwZs5iGaPcT38oRVMZz8I1kJM4Dz3SyI7twVvi4RUut64EG4=
 =9i4I
 -----END PGP SIGNATURE-----

Merge tag 'net-next-6.5' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next

Pull networking changes from Jakub Kicinski:
 "WiFi 7 and sendpage changes are the biggest pieces of work for this
  release. The latter will definitely require fixes but I think that we
  got it to a reasonable point.

  Core:

   - Rework the sendpage & splice implementations

     Instead of feeding data into sockets page by page extend sendmsg
     handlers to support taking a reference on the data, controlled by a
     new flag called MSG_SPLICE_PAGES

     Rework the handling of unexpected-end-of-file to invoke an
     additional callback instead of trying to predict what the right
     combination of MORE/NOTLAST flags is

     Remove the MSG_SENDPAGE_NOTLAST flag completely

   - Implement SCM_PIDFD, a new type of CMSG type analogous to
     SCM_CREDENTIALS, but it contains pidfd instead of plain pid

   - Enable socket busy polling with CONFIG_RT

   - Improve reliability and efficiency of reporting for ref_tracker

   - Auto-generate a user space C library for various Netlink families

  Protocols:

   - Allow TCP to shrink the advertised window when necessary, prevent
     sk_rcvbuf auto-tuning from growing the window all the way up to
     tcp_rmem[2]

   - Use per-VMA locking for "page-flipping" TCP receive zerocopy

   - Prepare TCP for device-to-device data transfers, by making sure
     that payloads are always attached to skbs as page frags

   - Make the backoff time for the first N TCP SYN retransmissions
     linear. Exponential backoff is unnecessarily conservative

   - Create a new MPTCP getsockopt to retrieve all info
     (MPTCP_FULL_INFO)

   - Avoid waking up applications using TLS sockets until we have a full
     record

   - Allow using kernel memory for protocol ioctl callbacks, paving the
     way to issuing ioctls over io_uring

   - Add nolocalbypass option to VxLAN, forcing packets to be fully
     encapsulated even if they are destined for a local IP address

   - Make TCPv4 use consistent hash in TIME_WAIT and SYN_RECV. Ensure
     in-kernel ECMP implementation (e.g. Open vSwitch) select the same
     link for all packets. Support L4 symmetric hashing in Open vSwitch

   - PPPoE: make number of hash bits configurable

   - Allow DNS to be overwritten by DHCPACK in the in-kernel DHCP client
     (ipconfig)

   - Add layer 2 miss indication and filtering, allowing higher layers
     (e.g. ACL filters) to make forwarding decisions based on whether
     packet matched forwarding state in lower devices (bridge)

   - Support matching on Connectivity Fault Management (CFM) packets

   - Hide the "link becomes ready" IPv6 messages by demoting their
     printk level to debug

   - HSR: don't enable promiscuous mode if device offloads the proto

   - Support active scanning in IEEE 802.15.4

   - Continue work on Multi-Link Operation for WiFi 7

  BPF:

   - Add precision propagation for subprogs and callbacks. This allows
     maintaining verification efficiency when subprograms are used, or
     in fact passing the verifier at all for complex programs,
     especially those using open-coded iterators

   - Improve BPF's {g,s}setsockopt() length handling. Previously BPF
     assumed the length is always equal to the amount of written data.
     But some protos allow passing a NULL buffer to discover what the
     output buffer *should* be, without writing anything

   - Accept dynptr memory as memory arguments passed to helpers

   - Add routing table ID to bpf_fib_lookup BPF helper

   - Support O_PATH FDs in BPF_OBJ_PIN and BPF_OBJ_GET commands

   - Drop bpf_capable() check in BPF_MAP_FREEZE command (used to mark
     maps as read-only)

   - Show target_{obj,btf}_id in tracing link fdinfo

   - Addition of several new kfuncs (most of the names are
     self-explanatory):
      - Add a set of new dynptr kfuncs: bpf_dynptr_adjust(),
        bpf_dynptr_is_null(), bpf_dynptr_is_rdonly(), bpf_dynptr_size()
        and bpf_dynptr_clone().
      - bpf_task_under_cgroup()
      - bpf_sock_destroy() - force closing sockets
      - bpf_cpumask_first_and(), rework bpf_cpumask_any*() kfuncs

  Netfilter:

   - Relax set/map validation checks in nf_tables. Allow checking
     presence of an entry in a map without using the value

   - Increase ip_vs_conn_tab_bits range for 64BIT builds

   - Allow updating size of a set

   - Improve NAT tuple selection when connection is closing

  Driver API:

   - Integrate netdev with LED subsystem, to allow configuring HW
     "offloaded" blinking of LEDs based on link state and activity
     (i.e. packets coming in and out)

   - Support configuring rate selection pins of SFP modules

   - Factor Clause 73 auto-negotiation code out of the drivers, provide
     common helper routines

   - Add more fool-proof helpers for managing lifetime of MDIO devices
     associated with the PCS layer

   - Allow drivers to report advanced statistics related to Time Aware
     scheduler offload (taprio)

   - Allow opting out of VF statistics in link dump, to allow more VFs
     to fit into the message

   - Split devlink instance and devlink port operations

  New hardware / drivers:

   - Ethernet:
      - Synopsys EMAC4 IP support (stmmac)
      - Marvell 88E6361 8 port (5x1GE + 3x2.5GE) switches
      - Marvell 88E6250 7 port switches
      - Microchip LAN8650/1 Rev.B0 PHYs
      - MediaTek MT7981/MT7988 built-in 1GE PHY driver

   - WiFi:
      - Realtek RTL8192FU, 2.4 GHz, b/g/n mode, 2T2R, 300 Mbps
      - Realtek RTL8723DS (SDIO variant)
      - Realtek RTL8851BE

   - CAN:
      - Fintek F81604

  Drivers:

   - Ethernet NICs:
      - Intel (100G, ice):
         - support dynamic interrupt allocation
         - use meta data match instead of VF MAC addr on slow-path
      - nVidia/Mellanox:
         - extend link aggregation to handle 4, rather than just 2 ports
         - spawn sub-functions without any features by default
      - OcteonTX2:
         - support HTB (Tx scheduling/QoS) offload
         - make RSS hash generation configurable
         - support selecting Rx queue using TC filters
      - Wangxun (ngbe/txgbe):
         - add basic Tx/Rx packet offloads
         - add phylink support (SFP/PCS control)
      - Freescale/NXP (enetc):
         - report TAPRIO packet statistics
      - Solarflare/AMD:
         - support matching on IP ToS and UDP source port of outer
           header
         - VxLAN and GENEVE tunnel encapsulation over IPv4 or IPv6
         - add devlink dev info support for EF10

   - Virtual NICs:
      - Microsoft vNIC:
         - size the Rx indirection table based on requested
           configuration
         - support VLAN tagging
      - Amazon vNIC:
         - try to reuse Rx buffers if not fully consumed, useful for ARM
           servers running with 16kB pages
      - Google vNIC:
         - support TCP segmentation of >64kB frames

   - Ethernet embedded switches:
      - Marvell (mv88e6xxx):
         - enable USXGMII (88E6191X)
      - Microchip:
         - lan966x: add support for Egress Stage 0 ACL engine
         - lan966x: support mapping packet priority to internal switch
           priority (based on PCP or DSCP)

   - Ethernet PHYs:
      - Broadcom PHYs:
         - support for Wake-on-LAN for BCM54210E/B50212E
         - report LPI counter
      - Microsemi PHYs: support RGMII delay configuration (VSC85xx)
      - Micrel PHYs: receive timestamp in the frame (LAN8841)
      - Realtek PHYs: support optional external PHY clock
      - Altera TSE PCS: merge the driver into Lynx PCS which it is a
        variant of

   - CAN: Kvaser PCIEcan:
      - support packet timestamping

   - WiFi:
      - Intel (iwlwifi):
         - major update for new firmware and Multi-Link Operation (MLO)
         - configuration rework to drop test devices and split the
           different families
         - support for segmented PNVM images and power tables
         - new vendor entries for PPAG (platform antenna gain) feature
      - Qualcomm 802.11ax (ath11k):
         - Multiple Basic Service Set Identifier (MBSSID) and Enhanced
           MBSSID Advertisement (EMA) support in AP mode
         - support factory test mode
      - RealTek (rtw89):
         - add RSSI based antenna diversity
         - support U-NII-4 channels on 5 GHz band
      - RealTek (rtl8xxxu):
         - AP mode support for 8188f
         - support USB RX aggregation for the newer chips"

* tag 'net-next-6.5' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1602 commits)
  net: scm: introduce and use scm_recv_unix helper
  af_unix: Skip SCM_PIDFD if scm->pid is NULL.
  net: lan743x: Simplify comparison
  netlink: Add __sock_i_ino() for __netlink_diag_dump().
  net: dsa: avoid suspicious RCU usage for synced VLAN-aware MAC addresses
  Revert "af_unix: Call scm_recv() only after scm_set_cred()."
  phylink: ReST-ify the phylink_pcs_neg_mode() kdoc
  libceph: Partially revert changes to support MSG_SPLICE_PAGES
  net: phy: mscc: fix packet loss due to RGMII delays
  net: mana: use vmalloc_array and vcalloc
  net: enetc: use vmalloc_array and vcalloc
  ionic: use vmalloc_array and vcalloc
  pds_core: use vmalloc_array and vcalloc
  gve: use vmalloc_array and vcalloc
  octeon_ep: use vmalloc_array and vcalloc
  net: usb: qmi_wwan: add u-blox 0x1312 composition
  perf trace: fix MSG_SPLICE_PAGES build error
  ipvlan: Fix return value of ipvlan_queue_xmit()
  netfilter: nf_tables: fix underflow in chain reference counter
  netfilter: nf_tables: unbind non-anonymous set if rule construction fails
  ...
2023-06-28 16:43:10 -07:00
Linus Torvalds
6a8cbd9253 v6.5-rc1-sysctl-next
The changes queued up for v6.5-rc1 for sysctl are in line with
 prior efforts to stop usage of deprecated routines which incur
 recursion and also make it hard to remove the empty array element
 in each sysctl array declaration. The most difficult user to modify
 was parport which required a bit of re-thinking of how to declare shared
 sysctls there, Joel Granados has stepped up to the plate to do most of
 this work and eventual removal of register_sysctl_table(). That work
 ended up saving us about 1465 bytes according to bloat-o-meter. Since
 we gained a few bloat-o-meter karma points I moved two rather small
 sysctl arrays from kernel/sysctl.c leaving us only two more sysctl
 arrays to move left.
 
 Most changes have been tested on linux-next for about a month. The last
 straggler patches are a minor parport fix, changes to the sysctl
 kernel selftest so to verify correctness and prevent regressions for
 the future change he made to provide an alternative solution for the
 special sysctl mount point target which was using the now deprecated
 sysctl child element.
 
 This is all prep work to now finally be able to remove the empty
 array element in all sysctl declarations / registrations which is
 expected to save us a bit of bytes all over the kernel. That work
 will be tested early after v6.5-rc1 is out.
 -----BEGIN PGP SIGNATURE-----
 
 iQJGBAABCgAwFiEENnNq2KuOejlQLZofziMdCjCSiKcFAmSceh0SHG1jZ3JvZkBr
 ZXJuZWwub3JnAAoJEM4jHQowkoinBFQQAK9WdpcU8ODoDzoSls4jsCQpZUCfZ+ED
 pbCgQqUqu9VPs6bnJ+aXVa6Fh3uCr6+TIfNFM55qI/Sbo2issZ7bm0nvKmGgc6/m
 giqDP7btvHqiAsEootci8DVdbBXKkdH4dx3pSwleyN8pdinewH0hrKImaPpahyo6
 1mB1du0iI89yjsZmheHVVSyfXXYAnP0PqRVy5Y+qxY7yYlIegQ5uAZmwRE62lfTf
 TuiV7OFuDZ2DBYOmqIhfGKGRnfOL5ZVF3iHCrfUpX3p+fEFzDmwvm3vr73PTSrFw
 /aRRLa/hOWr5ilw1bvnMcazgQzFEOlQb3DMhBKH7gLl3XHVrM+TaaqYHjUia1+6Y
 e2axz/duA2q9uLMW81daRApvHMCgy0exkpC7prfOxF5bgTe4TjA7ZWvGpqG1kPKT
 PPSxw80XvG5hLZm4tB0ZWJ5rOfFpiUGGneSeRQwyuClBt73SIO+F03jyGpt83slU
 jFE50ac14Zwh1oxpCQtYoR1+bXWdq1QwM5vQBNEuaoTSnJfVjrXqBz/BnqJChtjr
 m1vA27+4/dfki2P3gVWF1lGx43ir3uJvqk+BjWXm2CDDJqpRi3N0qcUwZwLuqAAz
 /LEgFqK61bpHi/C8c2NWAxIoeWRU4NUOaoiKmZwyt0sKAWU1Yzg70xssYeg7VYqZ
 3pvFNVBqkV+F
 =sXUU
 -----END PGP SIGNATURE-----

Merge tag 'v6.5-rc1-sysctl-next' of git://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux

Pull sysctl updates from Luis Chamberlain:
 "The changes for sysctl are in line with prior efforts to stop usage of
  deprecated routines which incur recursion and also make it hard to
  remove the empty array element in each sysctl array declaration.

  The most difficult user to modify was parport which required a bit of
  re-thinking of how to declare shared sysctls there, Joel Granados has
  stepped up to the plate to do most of this work and eventual removal
  of register_sysctl_table(). That work ended up saving us about 1465
  bytes according to bloat-o-meter. Since we gained a few bloat-o-meter
  karma points I moved two rather small sysctl arrays from
  kernel/sysctl.c leaving us only two more sysctl arrays to move left.

  Most changes have been tested on linux-next for about a month. The
  last straggler patches are a minor parport fix, changes to the sysctl
  kernel selftest so to verify correctness and prevent regressions for
  the future change he made to provide an alternative solution for the
  special sysctl mount point target which was using the now deprecated
  sysctl child element.

  This is all prep work to now finally be able to remove the empty array
  element in all sysctl declarations / registrations which is expected
  to save us a bit of bytes all over the kernel. That work will be
  tested early after v6.5-rc1 is out"

* tag 'v6.5-rc1-sysctl-next' of git://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux:
  sysctl: replace child with an enumeration
  sysctl: Remove debugging dump_stack
  test_sysclt: Test for registering a mount point
  test_sysctl: Add an option to prevent test skip
  test_sysctl: Add an unregister sysctl test
  test_sysctl: Group node sysctl test under one func
  test_sysctl: Fix test metadata getters
  parport: plug a sysctl register leak
  sysctl: move security keys sysctl registration to its own file
  sysctl: move umh sysctl registration to its own file
  signal: move show_unhandled_signals sysctl to its own file
  sysctl: remove empty dev table
  sysctl: Remove register_sysctl_table
  sysctl: Refactor base paths registrations
  sysctl: stop exporting register_sysctl_table
  parport: Removed sysctl related defines
  parport: Remove register_sysctl_table from parport_default_proc_register
  parport: Remove register_sysctl_table from parport_device_proc_register
  parport: Remove register_sysctl_table from parport_proc_register
  parport: Move magic number "15" to a define
2023-06-28 16:05:21 -07:00
Linus Torvalds
77b1a7f7a0 - Arnd Bergmann has fixed a bunch of -Wmissing-prototypes in
top-level directories.
 
 - Douglas Anderson has added a new "buddy" mode to the hardlockup
   detector.  It permits the detector to work on architectures which
   cannot provide the required interrupts, by having CPUs periodically
   perform checks on other CPUs.
 
 - Zhen Lei has enhanced kexec's ability to support two crash regions.
 
 - Petr Mladek has done a lot of cleanup on the hard lockup detector's
   Kconfig entries.
 
 - And the usual bunch of singleton patches in various places.
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCZJelTAAKCRDdBJ7gKXxA
 juDkAP0VXWynzkXoojdS/8e/hhi+htedmQ3v2dLZD+vBrctLhAEA7rcH58zAVoWa
 2ejqO6wDrRGUC7JQcO9VEjT0nv73UwU=
 =F293
 -----END PGP SIGNATURE-----

Merge tag 'mm-nonmm-stable-2023-06-24-19-23' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull non-mm updates from Andrew Morton:

 - Arnd Bergmann has fixed a bunch of -Wmissing-prototypes in top-level
   directories

 - Douglas Anderson has added a new "buddy" mode to the hardlockup
   detector. It permits the detector to work on architectures which
   cannot provide the required interrupts, by having CPUs periodically
   perform checks on other CPUs

 - Zhen Lei has enhanced kexec's ability to support two crash regions

 - Petr Mladek has done a lot of cleanup on the hard lockup detector's
   Kconfig entries

 - And the usual bunch of singleton patches in various places

* tag 'mm-nonmm-stable-2023-06-24-19-23' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (72 commits)
  kernel/time/posix-stubs.c: remove duplicated include
  ocfs2: remove redundant assignment to variable bit_off
  watchdog/hardlockup: fix typo in config HARDLOCKUP_DETECTOR_PREFER_BUDDY
  powerpc: move arch_trigger_cpumask_backtrace from nmi.h to irq.h
  devres: show which resource was invalid in __devm_ioremap_resource()
  watchdog/hardlockup: define HARDLOCKUP_DETECTOR_ARCH
  watchdog/sparc64: define HARDLOCKUP_DETECTOR_SPARC64
  watchdog/hardlockup: make HAVE_NMI_WATCHDOG sparc64-specific
  watchdog/hardlockup: declare arch_touch_nmi_watchdog() only in linux/nmi.h
  watchdog/hardlockup: make the config checks more straightforward
  watchdog/hardlockup: sort hardlockup detector related config values a logical way
  watchdog/hardlockup: move SMP barriers from common code to buddy code
  watchdog/buddy: simplify the dependency for HARDLOCKUP_DETECTOR_PREFER_BUDDY
  watchdog/buddy: don't copy the cpumask in watchdog_next_cpu()
  watchdog/buddy: cleanup how watchdog_buddy_check_hardlockup() is called
  watchdog/hardlockup: remove softlockup comment in touch_nmi_watchdog()
  watchdog/hardlockup: in watchdog_hardlockup_check() use cpumask_copy()
  watchdog/hardlockup: don't use raw_cpu_ptr() in watchdog_hardlockup_kick()
  watchdog/hardlockup: HAVE_NMI_WATCHDOG must implement watchdog_hardlockup_probe()
  watchdog/hardlockup: keep kernel.nmi_watchdog sysctl as 0444 if probe fails
  ...
2023-06-28 10:59:38 -07:00
Linus Torvalds
6e17c6de3d - Yosry Ahmed brought back some cgroup v1 stats in OOM logs.
- Yosry has also eliminated cgroup's atomic rstat flushing.
 
 - Nhat Pham adds the new cachestat() syscall.  It provides userspace
   with the ability to query pagecache status - a similar concept to
   mincore() but more powerful and with improved usability.
 
 - Mel Gorman provides more optimizations for compaction, reducing the
   prevalence of page rescanning.
 
 - Lorenzo Stoakes has done some maintanance work on the get_user_pages()
   interface.
 
 - Liam Howlett continues with cleanups and maintenance work to the maple
   tree code.  Peng Zhang also does some work on maple tree.
 
 - Johannes Weiner has done some cleanup work on the compaction code.
 
 - David Hildenbrand has contributed additional selftests for
   get_user_pages().
 
 - Thomas Gleixner has contributed some maintenance and optimization work
   for the vmalloc code.
 
 - Baolin Wang has provided some compaction cleanups,
 
 - SeongJae Park continues maintenance work on the DAMON code.
 
 - Huang Ying has done some maintenance on the swap code's usage of
   device refcounting.
 
 - Christoph Hellwig has some cleanups for the filemap/directio code.
 
 - Ryan Roberts provides two patch series which yield some
   rationalization of the kernel's access to pte entries - use the provided
   APIs rather than open-coding accesses.
 
 - Lorenzo Stoakes has some fixes to the interaction between pagecache
   and directio access to file mappings.
 
 - John Hubbard has a series of fixes to the MM selftesting code.
 
 - ZhangPeng continues the folio conversion campaign.
 
 - Hugh Dickins has been working on the pagetable handling code, mainly
   with a view to reducing the load on the mmap_lock.
 
 - Catalin Marinas has reduced the arm64 kmalloc() minimum alignment from
   128 to 8.
 
 - Domenico Cerasuolo has improved the zswap reclaim mechanism by
   reorganizing the LRU management.
 
 - Matthew Wilcox provides some fixups to make gfs2 work better with the
   buffer_head code.
 
 - Vishal Moola also has done some folio conversion work.
 
 - Matthew Wilcox has removed the remnants of the pagevec code - their
   functionality is migrated over to struct folio_batch.
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCZJejewAKCRDdBJ7gKXxA
 joggAPwKMfT9lvDBEUnJagY7dbDPky1cSYZdJKxxM2cApGa42gEA6Cl8HRAWqSOh
 J0qXCzqaaN8+BuEyLGDVPaXur9KirwY=
 =B7yQ
 -----END PGP SIGNATURE-----

Merge tag 'mm-stable-2023-06-24-19-15' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull mm updates from Andrew Morton:

 - Yosry Ahmed brought back some cgroup v1 stats in OOM logs

 - Yosry has also eliminated cgroup's atomic rstat flushing

 - Nhat Pham adds the new cachestat() syscall. It provides userspace
   with the ability to query pagecache status - a similar concept to
   mincore() but more powerful and with improved usability

 - Mel Gorman provides more optimizations for compaction, reducing the
   prevalence of page rescanning

 - Lorenzo Stoakes has done some maintanance work on the
   get_user_pages() interface

 - Liam Howlett continues with cleanups and maintenance work to the
   maple tree code. Peng Zhang also does some work on maple tree

 - Johannes Weiner has done some cleanup work on the compaction code

 - David Hildenbrand has contributed additional selftests for
   get_user_pages()

 - Thomas Gleixner has contributed some maintenance and optimization
   work for the vmalloc code

 - Baolin Wang has provided some compaction cleanups,

 - SeongJae Park continues maintenance work on the DAMON code

 - Huang Ying has done some maintenance on the swap code's usage of
   device refcounting

 - Christoph Hellwig has some cleanups for the filemap/directio code

 - Ryan Roberts provides two patch series which yield some
   rationalization of the kernel's access to pte entries - use the
   provided APIs rather than open-coding accesses

 - Lorenzo Stoakes has some fixes to the interaction between pagecache
   and directio access to file mappings

 - John Hubbard has a series of fixes to the MM selftesting code

 - ZhangPeng continues the folio conversion campaign

 - Hugh Dickins has been working on the pagetable handling code, mainly
   with a view to reducing the load on the mmap_lock

 - Catalin Marinas has reduced the arm64 kmalloc() minimum alignment
   from 128 to 8

 - Domenico Cerasuolo has improved the zswap reclaim mechanism by
   reorganizing the LRU management

 - Matthew Wilcox provides some fixups to make gfs2 work better with the
   buffer_head code

 - Vishal Moola also has done some folio conversion work

 - Matthew Wilcox has removed the remnants of the pagevec code - their
   functionality is migrated over to struct folio_batch

* tag 'mm-stable-2023-06-24-19-15' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (380 commits)
  mm/hugetlb: remove hugetlb_set_page_subpool()
  mm: nommu: correct the range of mmap_sem_read_lock in task_mem()
  hugetlb: revert use of page_cache_next_miss()
  Revert "page cache: fix page_cache_next/prev_miss off by one"
  mm/vmscan: fix root proactive reclaim unthrottling unbalanced node
  mm: memcg: rename and document global_reclaim()
  mm: kill [add|del]_page_to_lru_list()
  mm: compaction: convert to use a folio in isolate_migratepages_block()
  mm: zswap: fix double invalidate with exclusive loads
  mm: remove unnecessary pagevec includes
  mm: remove references to pagevec
  mm: rename invalidate_mapping_pagevec to mapping_try_invalidate
  mm: remove struct pagevec
  net: convert sunrpc from pagevec to folio_batch
  i915: convert i915_gpu_error to use a folio_batch
  pagevec: rename fbatch_count()
  mm: remove check_move_unevictable_pages()
  drm: convert drm_gem_put_pages() to use a folio_batch
  i915: convert shmem_sg_free_table() to use a folio_batch
  scatterlist: add sg_set_folio()
  ...
2023-06-28 10:28:11 -07:00
Linus Torvalds
582c161cf3 hardening updates for v6.5-rc1
- Fix KMSAN vs FORTIFY in strlcpy/strlcat (Alexander Potapenko)
 
 - Convert strreplace() to return string start (Andy Shevchenko)
 
 - Flexible array conversions (Arnd Bergmann, Wyes Karny, Kees Cook)
 
 - Add missing function prototypes seen with W=1 (Arnd Bergmann)
 
 - Fix strscpy() kerndoc typo (Arne Welzel)
 
 - Replace strlcpy() with strscpy() across many subsystems which were
   either Acked by respective maintainers or were trivial changes that
   went ignored for multiple weeks (Azeem Shaikh)
 
 - Remove unneeded cc-option test for UBSAN_TRAP (Nick Desaulniers)
 
 - Add KUnit tests for strcat()-family
 
 - Enable KUnit tests of FORTIFY wrappers under UML
 
 - Add more complete FORTIFY protections for strlcat()
 
 - Add missed disabling of FORTIFY for all arch purgatories.
 
 - Enable -fstrict-flex-arrays=3 globally
 
 - Tightening UBSAN_BOUNDS when using GCC
 
 - Improve checkpatch to check for strcpy, strncpy, and fake flex arrays
 
 - Improve use of const variables in FORTIFY
 
 - Add requested struct_size_t() helper for types not pointers
 
 - Add __counted_by macro for annotating flexible array size members
 -----BEGIN PGP SIGNATURE-----
 
 iQJKBAABCgA0FiEEpcP2jyKd1g9yPm4TiXL039xtwCYFAmSbftQWHGtlZXNjb29r
 QGNocm9taXVtLm9yZwAKCRCJcvTf3G3AJj0MD/9X9jzJzCmsAU+yNldeoAzC84Sk
 GVU3RBxGcTNysL1gZXynkIgigw7DWc4htMGeSABHHwQRVP65JCH1Kw/VqIkyumbx
 9LdX6IklMJb4pRT4PVU3azebV4eNmSjlur2UxMeW54Czm91/6I8RHbJOyAPnOUmo
 2oomGdP/hpEHtKR7hgy8Axc6w5ySwQixh2V5sVZG3VbvCS5WKTmTXbs6puuRT5hz
 iHt7v+7VtEg/Qf1W7J2oxfoghvVBsaRrSLrExWT/oZYh1ZxM7DsCAAoG/IsDgHGA
 9LBXiRECgAFThbHVxLvvKZQMXdVk0i8iXLX43XMKC0wTA+NTyH7wlcQQ4RWNMuo8
 sfA9Qm9gMArXaf64aymr3Uwn20Zan0391HdlbhOJZAE6v3PPJbleUnM58AzD2d3r
 5Lz6AIFBxDImy+3f9iDWgacCT5/PkeiXTHzk9QnKhJyKKtRA58XJxj4q2+rPnGJP
 n4haXqoxD5FJbxdXiGKk31RS0U5HBug7wkOcUrTqDHUbc/QNU2b7dxTKUx+zYtCU
 uV5emPzpF4H4z+91WpO47n9gkMAfwV0lt9S2dwS8pxsgqctbmIan+Jgip7rsqZ2G
 OgLXBsb43eEs+6WgO8tVt/ZHYj9ivGMdrcNcsIfikzNs/xweUJ53k2xSEn2xEa5J
 cwANDmkL6QQK7yfeeg==
 =s0j1
 -----END PGP SIGNATURE-----

Merge tag 'hardening-v6.5-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux

Pull hardening updates from Kees Cook:
 "There are three areas of note:

  A bunch of strlcpy()->strscpy() conversions ended up living in my tree
  since they were either Acked by maintainers for me to carry, or got
  ignored for multiple weeks (and were trivial changes).

  The compiler option '-fstrict-flex-arrays=3' has been enabled
  globally, and has been in -next for the entire devel cycle. This
  changes compiler diagnostics (though mainly just -Warray-bounds which
  is disabled) and potential UBSAN_BOUNDS and FORTIFY _warning_
  coverage. In other words, there are no new restrictions, just
  potentially new warnings. Any new FORTIFY warnings we've seen have
  been fixed (usually in their respective subsystem trees). For more
  details, see commit df8fc4e934.

  The under-development compiler attribute __counted_by has been added
  so that we can start annotating flexible array members with their
  associated structure member that tracks the count of flexible array
  elements at run-time. It is possible (likely?) that the exact syntax
  of the attribute will change before it is finalized, but GCC and Clang
  are working together to sort it out. Any changes can be made to the
  macro while we continue to add annotations.

  As an example of that last case, I have a treewide commit waiting with
  such annotations found via Coccinelle:

    https://git.kernel.org/linus/adc5b3cb48a049563dc673f348eab7b6beba8a9b

  Also see commit dd06e72e68 for more details.

  Summary:

   - Fix KMSAN vs FORTIFY in strlcpy/strlcat (Alexander Potapenko)

   - Convert strreplace() to return string start (Andy Shevchenko)

   - Flexible array conversions (Arnd Bergmann, Wyes Karny, Kees Cook)

   - Add missing function prototypes seen with W=1 (Arnd Bergmann)

   - Fix strscpy() kerndoc typo (Arne Welzel)

   - Replace strlcpy() with strscpy() across many subsystems which were
     either Acked by respective maintainers or were trivial changes that
     went ignored for multiple weeks (Azeem Shaikh)

   - Remove unneeded cc-option test for UBSAN_TRAP (Nick Desaulniers)

   - Add KUnit tests for strcat()-family

   - Enable KUnit tests of FORTIFY wrappers under UML

   - Add more complete FORTIFY protections for strlcat()

   - Add missed disabling of FORTIFY for all arch purgatories.

   - Enable -fstrict-flex-arrays=3 globally

   - Tightening UBSAN_BOUNDS when using GCC

   - Improve checkpatch to check for strcpy, strncpy, and fake flex
     arrays

   - Improve use of const variables in FORTIFY

   - Add requested struct_size_t() helper for types not pointers

   - Add __counted_by macro for annotating flexible array size members"

* tag 'hardening-v6.5-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: (54 commits)
  netfilter: ipset: Replace strlcpy with strscpy
  uml: Replace strlcpy with strscpy
  um: Use HOST_DIR for mrproper
  kallsyms: Replace all non-returning strlcpy with strscpy
  sh: Replace all non-returning strlcpy with strscpy
  of/flattree: Replace all non-returning strlcpy with strscpy
  sparc64: Replace all non-returning strlcpy with strscpy
  Hexagon: Replace all non-returning strlcpy with strscpy
  kobject: Use return value of strreplace()
  lib/string_helpers: Change returned value of the strreplace()
  jbd2: Avoid printing outside the boundary of the buffer
  checkpatch: Check for 0-length and 1-element arrays
  riscv/purgatory: Do not use fortified string functions
  s390/purgatory: Do not use fortified string functions
  x86/purgatory: Do not use fortified string functions
  acpi: Replace struct acpi_table_slit 1-element array with flex-array
  clocksource: Replace all non-returning strlcpy with strscpy
  string: use __builtin_memcpy() in strlcpy/strlcat
  staging: most: Replace all non-returning strlcpy with strscpy
  drm/i2c: tda998x: Replace all non-returning strlcpy with strscpy
  ...
2023-06-27 21:24:18 -07:00
Linus Torvalds
7ab044a4f4 workqueue: Changes for v6.5
* Concurrency-managed per-cpu work items that hog CPUs and delay the
   execution of other work items are now automatically detected and excluded
   from concurrency management. Reporting on such work items can also be
   enabled through a config option.
 
 * Added tools/workqueue/wq_monitor.py which improves visibility into
   workqueue usages and behaviors.
 
 * Includes Arnd's minimal fix for gcc-13 enum warning on 32bit compiles.
   This conflicts with afa4bb778e ("workqueue: clean up WORK_* constant
   types, clarify masking") in master. Can be resolved by picking the master
   version.
 -----BEGIN PGP SIGNATURE-----
 
 iIQEABYIACwWIQTfIjM1kS57o3GsC/uxYfJx3gVYGQUCZJoGvw4cdGpAa2VybmVs
 Lm9yZwAKCRCxYfJx3gVYGZu0AP9IGK2opAzO9i3i1/Ys81b3sHi9PwrYWH3g252T
 Oe3O6QD/Wh0wYBVl0o7IdW6BGdd5iNwIEs420G53UmmPrATqsgQ=
 =TffY
 -----END PGP SIGNATURE-----

Merge tag 'wq-for-6.5' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq

Pull workqueue updates from Tejun Heo:

 - Concurrency-managed per-cpu work items that hog CPUs and delay the
   execution of other work items are now automatically detected and
   excluded from concurrency management. Reporting on such work items
   can also be enabled through a config option.

 - Added tools/workqueue/wq_monitor.py which improves visibility into
   workqueue usages and behaviors.

 - Arnd's minimal fix for gcc-13 enum warning on 32bit compiles,
   superseded by commit afa4bb778e in mainline.

* tag 'wq-for-6.5' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq:
  workqueue: Disable per-cpu CPU hog detection when wq_cpu_intensive_thresh_us is 0
  workqueue: Fix WARN_ON_ONCE() triggers in worker_enter_idle()
  workqueue: fix enum type for gcc-13
  workqueue: Track and monitor per-workqueue CPU time usage
  workqueue: Report work funcs that trigger automatic CPU_INTENSIVE mechanism
  workqueue: Automatically mark CPU-hogging work items CPU_INTENSIVE
  workqueue: Improve locking rule description for worker fields
  workqueue: Move worker_set/clr_flags() upwards
  workqueue: Re-order struct worker fields
  workqueue: Add pwq->stats[] and a monitoring script
  Further upgrade queue_work_on() comment
2023-06-27 16:32:52 -07:00
Linus Torvalds
bc6cb4d5bc Locking changes for v6.5:
- Introduce cmpxchg128() -- aka. the demise of cmpxchg_double().
 
   The cmpxchg128() family of functions is basically & functionally
   the same as cmpxchg_double(), but with a saner interface: instead
   of a 6-parameter horror that forced u128 - u64/u64-halves layout
   details on the interface and exposed users to complexity,
   fragility & bugs, use a natural 3-parameter interface with u128 types.
 
 - Restructure the generated atomic headers, and add
   kerneldoc comments for all of the generic atomic{,64,_long}_t
   operations. Generated definitions are much cleaner now,
   and come with documentation.
 
 - Implement lock_set_cmp_fn() on lockdep, for defining an ordering
   when taking multiple locks of the same type. This gets rid of
   one use of lockdep_set_novalidate_class() in the bcache code.
 
 - Fix raw_cpu_generic_try_cmpxchg() bug due to an unintended
   variable shadowing generating garbage code on Clang on certain
   ARM builds.
 
 Signed-off-by: Ingo Molnar <mingo@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAmSav3wRHG1pbmdvQGtl
 cm5lbC5vcmcACgkQEnMQ0APhK1gDyxAAjCHQjpolrre7fRpyiTDwqzIKT27H04vQ
 zrQVlVc42WBnn9pe8LthGy43/RvYvqlZvLoLONA4fMkuYriM6nSMsoZjeUmE+6Rs
 QAElQC74P5YvEBOa67VNY3/M7sj22ftDe7ODtVV8OrnPjMk1sQNRvaK025Cs3yig
 8MAI//hHGNmyVAp1dPYZMJNqxGCvluReLZ4SaUJFCMrg7YgUXgCBj/5Gi07TlKxn
 sT8BFCssoEW/B9FXkh59B1t6FBCZoSy4XSZfsZe0uVAUJ4XDEOO+zBgaWFCedNQT
 wP323ryBgMrkzUKA8j2/o5d3QnMA1GcBfHNNlvAl/fOfrxWXzDZnOEY26YcaLMa0
 YIuRF/JNbPZlt6DCUVBUEvMPpfNYi18dFN0rat1a6xL2L4w+tm55y3mFtSsg76Ka
 r7L2nWlRrAGXnuA+VEPqkqbSWRUSWOv5hT2Mcyb5BqqZRsxBETn6G8GVAzIO6j6v
 giyfUdA8Z9wmMZ7NtB6usxe3p1lXtnZ/shCE7ZHXm6xstyZrSXaHgOSgAnB9DcuJ
 7KpGIhhSODQSwC/h/J0KEpb9Pr/5jCWmXAQ2DWnZK6ndt1jUfFi8pfK58wm0AuAM
 o9t8Mx3o8wZjbMdt6up9OIM1HyFiMx2BSaZK+8f/bWemHQ0xwez5g4k5O5AwVOaC
 x9Nt+Tp0Ze4=
 =DsYj
 -----END PGP SIGNATURE-----

Merge tag 'locking-core-2023-06-27' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull locking updates from Ingo Molnar:

 - Introduce cmpxchg128() -- aka. the demise of cmpxchg_double()

   The cmpxchg128() family of functions is basically & functionally the
   same as cmpxchg_double(), but with a saner interface.

   Instead of a 6-parameter horror that forced u128 - u64/u64-halves
   layout details on the interface and exposed users to complexity,
   fragility & bugs, use a natural 3-parameter interface with u128
   types.

 - Restructure the generated atomic headers, and add kerneldoc comments
   for all of the generic atomic{,64,_long}_t operations.

   The generated definitions are much cleaner now, and come with
   documentation.

 - Implement lock_set_cmp_fn() on lockdep, for defining an ordering when
   taking multiple locks of the same type.

   This gets rid of one use of lockdep_set_novalidate_class() in the
   bcache code.

 - Fix raw_cpu_generic_try_cmpxchg() bug due to an unintended variable
   shadowing generating garbage code on Clang on certain ARM builds.

* tag 'locking-core-2023-06-27' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (43 commits)
  locking/atomic: scripts: fix ${atomic}_dec_if_positive() kerneldoc
  percpu: Fix self-assignment of __old in raw_cpu_generic_try_cmpxchg()
  locking/atomic: treewide: delete arch_atomic_*() kerneldoc
  locking/atomic: docs: Add atomic operations to the driver basic API documentation
  locking/atomic: scripts: generate kerneldoc comments
  docs: scripts: kernel-doc: accept bitwise negation like ~@var
  locking/atomic: scripts: simplify raw_atomic*() definitions
  locking/atomic: scripts: simplify raw_atomic_long*() definitions
  locking/atomic: scripts: split pfx/name/sfx/order
  locking/atomic: scripts: restructure fallback ifdeffery
  locking/atomic: scripts: build raw_atomic_long*() directly
  locking/atomic: treewide: use raw_atomic*_<op>()
  locking/atomic: scripts: add trivial raw_atomic*_<op>()
  locking/atomic: scripts: factor out order template generation
  locking/atomic: scripts: remove leftover "${mult}"
  locking/atomic: scripts: remove bogus order parameter
  locking/atomic: xtensa: add preprocessor symbols
  locking/atomic: x86: add preprocessor symbols
  locking/atomic: sparc: add preprocessor symbols
  locking/atomic: sh: add preprocessor symbols
  ...
2023-06-27 14:14:30 -07:00
Linus Torvalds
4baa098a14 - Remove the local symbols prefix of the get/put_user() exception
handling symbols so that tools do not get confused by the presence of
   code belonging to the wrong symbol/not belonging to any symbol
 
 - Improve csum_partial()'s performance
 
 - Some improvements to the kcpuid tool
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmSawNoACgkQEsHwGGHe
 VUppXw//YezVoWUUUeTedZl8nRbotwXUlATjsIGcRGe2rZQ/7Ud/NUagWiLmKcpy
 fAEt+Rd0MbukCNPmTjcw04NN9djs2avVXJS3CCsGNDv/Q6AsBpMcOD4dESxbWIgh
 NpkNvO3bKRKxtaoJukmxiiIBlMFzXXtKg/fgzB8FeYZDhGMfS7wBlcDeJIdmWWxO
 T5hykFoc/47e8SPG+K/VLT8hoQCg4KPpi3aSN6n+eq8nnlosABr95JKvgqeq1mXf
 UPdITYzKHDiny0ZqL2nqsx1MGh24CLc3QCxi5qMDE27NVFokRdfyCiK3DVZvgrNo
 IA5BsiKJ0Ddeo2F1Weu+rBI7Hhf+OBZlw7WmWpqQ3rEbeEJ4L1iWeeHwrBNzyuZq
 ftb7OScukusaGAMamhhnErR2GwdP3SBDnnUtsue3qqPK1acYPdFfCJCqXvYsCczQ
 Pn6eKE2Vlp/3febce7QtZtcz7qlv60UZvj3OpYbECIKcD1/8BWEidquSgPASxs9e
 WH+MvDlV/tgwzLVAG0Zp5x7DE/VzDPIKtMMRzgx1clSSPyRwzW0jhp+C4/xPsDCT
 2lLHZu/ay7O2A1kiH6m0/ULAm/gUzRNsKCNRlP/HVVXl7+U6lZeZR3D14QOPl8n8
 F1W/seOCLxnxx8dVF/hHmirDQuwSjF9vRewmWvvOUgzmYBid8j0=
 =6U2J
 -----END PGP SIGNATURE-----

Merge tag 'x86_misc_for_v6.5' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull misc x86 updates from Borislav Petkov:

 - Remove the local symbols prefix of the get/put_user() exception
   handling symbols so that tools do not get confused by the presence of
   code belonging to the wrong symbol/not belonging to any symbol

 - Improve csum_partial()'s performance

 - Some improvements to the kcpuid tool

* tag 'x86_misc_for_v6.5' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/lib: Make get/put_user() exception handling a visible symbol
  x86/csum: Fix clang -Wuninitialized in csum_partial()
  x86/csum: Improve performance of `csum_partial`
  tools/x86/kcpuid: Add .gitignore
  tools/x86/kcpuid: Dump the correct CPUID function in error
2023-06-27 12:25:42 -07:00
Linus Torvalds
9ba92dc1de linux-kselftest-kunit-6.5-rc1
This KUnit update for Linux 6.5-rc1 consists of:
 
 - kunit_add_action() API to defer a call until test exit.
 - Update document to add kunit_add_action() usage notes.
 - Changes to always run cleanup from a test kthread.
 - Documentation updates to clarify cleanup usage
   - assertions should not be used in cleanup
 - Documentation update to clearly indicate that exit
   functions should run even if init fails
 - Several fixes and enhancements to existing tests.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEPZKym/RZuOCGeA/kCwJExA0NQxwFAmSYWVcACgkQCwJExA0N
 QxwbxA//eGx3xkFN9CWb8ryBTZhs8DZrzc+JlqWEDpk7GQTSlErd3DtInzY0jM2a
 GWKV4BJCX6uI2JiyG+cof7nWtnv//L4LxRCpYlY/n7sJeYwZyd1s745nM8lfYTh9
 UtAHPmZplAqMCOHgfeUQ6wMxiUc7VGC8Spu82nFzRuSLzf+q5BpK7LPHSJiJ4ea+
 kkM+5ygHzBW2cfvULIglb8jQPgPRoVR4RhmmHMF7CYTZQkrU/z7ZZlFTx7LowrxC
 p2zWVuH0KJONn4L8rB4QI8oqCZejU2qV2bealCnKY3/atSLUvrnYxyPQbbxCNqmi
 EY1XyQFbGsvmgy77IeEXKWhiUmAfD7/Hcvh8M/vLk2wHzQG8+428DAQ7sGRHHqZX
 6DvDUo8Z2TE7585glxkbiXhuGsY0y8dkeNURw4URys+TvucNHGrmDfKp0UIEAJW1
 iqopMGmM/MDfV5gPUlUEg6jKhTkZOn6OlVwZ8moUaAeAKV7qGGuMrNSZJ6Jw1Gc9
 LjI2ma3uZ3hOahyqwU+zwO4CeTJHOq6JjXJZt9aiGwqJPrbjvVCUtikz4QSptU2z
 vCjVEV/e7tTGXl+suDb48cu/pyh+z3t5/Gz7eOHMId7S3MENTauxyBXDm1WzoV0c
 HuBEsmWXetYuXXkh66LJ/8fzUeWvaGrQPM9hXi2fn1hmPLxOnxw=
 =rYT5
 -----END PGP SIGNATURE-----

Merge tag 'linux-kselftest-kunit-6.5-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest

Pull KUnit updates from Shuah Khan:

 - kunit_add_action() API to defer a call until test exit

 - Update document to add kunit_add_action() usage notes

 - Changes to always run cleanup from a test kthread

 - Documentation updates to clarify cleanup usage (assertions should not
   be used in cleanup)

 - Documentation update to clearly indicate that exit functions should
   run even if init fails

 - Several fixes and enhancements to existing tests

* tag 'linux-kselftest-kunit-6.5-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
  MAINTAINERS: Add source tree entry for kunit
  Documentation: kunit: Rename references to kunit_abort()
  kunit: Move kunit_abort() call out of kunit_do_failed_assertion()
  kunit: Fix obsolete name in documentation headers (func->action)
  Documentation: Kunit: add MODULE_LICENSE to sample code
  kunit: Update kunit_print_ok_not_ok function
  kunit: Fix reporting of the skipped parameterized tests
  kunit/test: Add example test showing parameterized testing
  Documentation: kunit: Add usage notes for kunit_add_action()
  kunit: kmalloc_array: Use kunit_add_action()
  kunit: executor_test: Use kunit_add_action()
  kunit: Add kunit_add_action() to defer a call until test exit
  kunit: example: Provide example exit functions
  Documentation: kunit: Warn that exit functions run even if init fails
  Documentation: kunit: Note that assertions should not be used in cleanup
  kunit: Always run cleanup from a test kthread
  Documentation: kunit: Modular tests should not depend on KUNIT=y
  kunit: tool: undo type subscripts for subprocess.Popen
2023-06-27 11:12:55 -07:00
Jakub Kicinski
3674fbf045 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Merge in late fixes to prepare for the 6.5 net-next PR.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-27 09:45:22 -07:00
Linus Torvalds
cef2dd7653 A single update for debug objects:
- Recheck whether debug objects is enabled before reporting a problem to
     avoid spamming the logs with messages which are caused by a concurrent
     OOM.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAmSZaDsTHHRnbHhAbGlu
 dXRyb25peC5kZQAKCRCmGPVMDXSYoYaOD/4q1DuDAQx5roamXVK5/Rq1ywVJsZnQ
 ilTOR2w8LSA6EqUDEGK117Zsp/W9NZSPvmcGHTuxqKIvsSw0bqvZM62aoFMWDP0y
 FeIqYOK28QALzV9gNQsvhI+b6tTUOx5BPHJrZGbfRr4Z0+EAKv28E3UndLLFRHys
 ijpyIpNPOvvwHbMau7kwr7BhTir5FkMm8YpQDv6WdUkPdAXKNOHA9x+W6LscmAai
 M6ymfcX182HgrhI+6r1lnHnG1frAASWEdv/P8gkbtod9DwB1hqcMkIa3287LJpBL
 GoySJhu0B/7svvAzbNkFt8DGRDpqEcHOzkwTBv+bfBG5IDvG4ePJi2BA371zCwDo
 IX2zTvywo45gE5u4nH9zkiSAX0B05QhsdPA24VrpZKAx/LmZ8MZlelgyzr2gLOBe
 +aBqueBs2ZZ02KVZcQc2ZLFS2AYD2rVxAZ37VOK/Z/KphnUBfBGcqDkxF4sIVqWW
 A2vpLQXiC/TPZC88ZndtUX3taUIly9oSHmX0KOM9EjIlq3G03vYG+oZJAtedvpND
 r2tNyzfeIvFCkOnD5sHbbDHTstVsz+VtfIsGgcVGt0T9NqT5w9Av4sF2P9ZCkMVH
 TcfwQe3G2epILFBtq/wRVq/dZcUR44iFcgqsBtJh8LTtppTxxovut96OebLMeSNK
 iNM8Vz5tep5WpQ==
 =bxLt
 -----END PGP SIGNATURE-----

Merge tag 'core-debugobjects-2023-06-26' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull debugobjects update from Thomas Gleixner:
 "A single update for debug objects:

   - Recheck whether debug objects is enabled before reporting a problem
     to avoid spamming the logs with messages which are caused by a
     concurrent OOM"

* tag 'core-debugobjects-2023-06-26' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  debugobjects: Recheck debug_objects_enabled before reporting
2023-06-26 13:33:00 -07:00
Linus Torvalds
a0433f8cae for-6.5/block-2023-06-23
-----BEGIN PGP SIGNATURE-----
 
 iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAmSV8dwQHGF4Ym9lQGtl
 cm5lbC5kawAKCRD301j7KXHgpilGD/9Yys1oxIXJpRf00fzrylAlBthRxMjFQVWw
 zAut106hAQiBHvU8IkmGA3MvEFVHxtzwYhHI7IR8K3aZBIqscweCqmVI9JyogJw9
 U9Twnzel47VmuKdM94FeoN+hbj1fP8EWTjzmy67/zEEfFCdmHvNlMi3lSrGYIpFy
 39LxTB99Y4UarM5PtWbes37GYYljzMSWKuo4AfBkvq1eQa+sZ0Vq2xAABKq3UM7f
 apqhgHtkJooRePDP0eQp+kAyyVMgW2jIK+oIdJDxNF3CKTu2w40RzaYz6fp+jVSU
 H4R/xS59GW4/xql+VBJDh/qJg9K62DPPYjlW8BmSR8+IjvfFpsyH3/MacE50CD3P
 20fs/Mnj49H79fDrQEHJI53cOOb2EmUitbwLbvOcColNTPpt8loBtdQxjF2RMU8R
 Nyort9DJPFclYCxky1LYg1CNEC2Ln4Zy/jD47wPvqRmOQphOoVlV/hPnOEqvjaZC
 49Vn70W2DeE9cXvYI7ha+XIg6/oj+Gs3iusEbV08Ci7EAtXgI+ZUUsQ97K8UNiUh
 h2lqSJtuI7lBpYP9sf+BeCch5UCC+xGYyTdoM5f58lehWBBPtbs0g7S9RyRyOYxe
 n+yxEUo3dAGzJ/xsKAjinbZfeWIpr0b1TkAh4w3Cq/BKzRr9Bp8lBAxYuancbQ+Y
 1ADPteUOTA==
 =zP4Y
 -----END PGP SIGNATURE-----

Merge tag 'for-6.5/block-2023-06-23' of git://git.kernel.dk/linux

Pull block updates from Jens Axboe:

 - NVMe pull request via Keith:
      - Various cleanups all around (Irvin, Chaitanya, Christophe)
      - Better struct packing (Christophe JAILLET)
      - Reduce controller error logs for optional commands (Keith)
      - Support for >=64KiB block sizes (Daniel Gomez)
      - Fabrics fixes and code organization (Max, Chaitanya, Daniel
        Wagner)

 - bcache updates via Coly:
      - Fix a race at init time (Mingzhe Zou)
      - Misc fixes and cleanups (Andrea, Thomas, Zheng, Ye)

 - use page pinning in the block layer for dio (David)

 - convert old block dio code to page pinning (David, Christoph)

 - cleanups for pktcdvd (Andy)

 - cleanups for rnbd (Guoqing)

 - use the unchecked __bio_add_page() for the initial single page
   additions (Johannes)

 - fix overflows in the Amiga partition handling code (Michael)

 - improve mq-deadline zoned device support (Bart)

 - keep passthrough requests out of the IO schedulers (Christoph, Ming)

 - improve support for flush requests, making them less special to deal
   with (Christoph)

 - add bdev holder ops and shutdown methods (Christoph)

 - fix the name_to_dev_t() situation and use cases (Christoph)

 - decouple the block open flags from fmode_t (Christoph)

 - ublk updates and cleanups, including adding user copy support (Ming)

 - BFQ sanity checking (Bart)

 - convert brd from radix to xarray (Pankaj)

 - constify various structures (Thomas, Ivan)

 - more fine grained persistent reservation ioctl capability checks
   (Jingbo)

 - misc fixes and cleanups (Arnd, Azeem, Demi, Ed, Hengqi, Hou, Jan,
   Jordy, Li, Min, Yu, Zhong, Waiman)

* tag 'for-6.5/block-2023-06-23' of git://git.kernel.dk/linux: (266 commits)
  scsi/sg: don't grab scsi host module reference
  ext4: Fix warning in blkdev_put()
  block: don't return -EINVAL for not found names in devt_from_devname
  cdrom: Fix spectre-v1 gadget
  block: Improve kernel-doc headers
  blk-mq: don't insert passthrough request into sw queue
  bsg: make bsg_class a static const structure
  ublk: make ublk_chr_class a static const structure
  aoe: make aoe_class a static const structure
  block/rnbd: make all 'class' structures const
  block: fix the exclusive open mask in disk_scan_partitions
  block: add overflow checks for Amiga partition support
  block: change all __u32 annotations to __be32 in affs_hardblocks.h
  block: fix signed int overflow in Amiga partition support
  block: add capacity validation in bdev_add_partition()
  block: fine-granular CAP_SYS_ADMIN for Persistent Reservation
  block: disallow Persistent Reservation on partitions
  reiserfs: fix blkdev_put() warning from release_journal_dev()
  block: fix wrong mode for blkdev_get_by_dev() from disk_scan_partitions()
  block: document the holder argument to blkdev_get_by_path
  ...
2023-06-26 12:47:20 -07:00
Linus Torvalds
3eccc0c886 for-6.5/splice-2023-06-23
-----BEGIN PGP SIGNATURE-----
 
 iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAmSV8QgQHGF4Ym9lQGtl
 cm5lbC5kawAKCRD301j7KXHgpupIEADKEZvpxDyaxHjYZFFeoSJRkh+AEJHe0Xtr
 J5vUL8t8zmAV3F7i8XaoAEcR0dC0VQcoTc8fAOty71+5hsc7gvtyyNjqU/YWRVqK
 Xr+VJuSJ+OGx3MzpRWEkepagfPyqP5cyyCOK6gqIgqzc3IwqkR/3QHVRc6oR8YbY
 AQd7tqm2fQXK9WDHEy5hcaQeqb9uKZjQQoZejpPPerpJM+9RMgKxpCGtnLLIUhr/
 sgl7KyLIQPBmveO2vfOR+dmsJBqsLqneqkXDKMAIfpeVEEkHHAlCH4E5Ne1XUS+s
 ie4If+reuyn1Ktt5Ry1t7w2wr8cX1fcay3K28tgwjE2Bvremc5YnYgb3pyUDW38f
 tXXkpg/eTXd/Pn0Crpagoa9zJ927tt5JXIO1/PagPEP1XOqUuthshDFsrVqfqbs+
 36gqX2JWB4NJTg9B9KBHA3+iVCJyZLjUqOqws7hOJOvhQytZVm/IwkGBg1Slhe1a
 J5WemBlqX8lTgXz0nM7cOhPYTZeKe6hazCcb5VwxTUTj9SGyYtsMfqqTwRJO9kiF
 j1VzbOAgExDYe+GvfqOFPh9VqZho66+DyOD/Xtca4eH7oYyHSmP66o8nhRyPBPZA
 maBxQhUkPQn4/V/0fL2TwIdWYKsbj8bUyINKPZ2L35YfeICiaYIctTwNJxtRmItB
 M3VxWD3GZQ==
 =KhW4
 -----END PGP SIGNATURE-----

Merge tag 'for-6.5/splice-2023-06-23' of git://git.kernel.dk/linux

Pull splice updates from Jens Axboe:
 "This kills off ITER_PIPE to avoid a race between truncate,
  iov_iter_revert() on the pipe and an as-yet incomplete DMA to a bio
  with unpinned/unref'ed pages from an O_DIRECT splice read. This causes
  memory corruption.

  Instead, we either use (a) filemap_splice_read(), which invokes the
  buffered file reading code and splices from the pagecache into the
  pipe; (b) copy_splice_read(), which bulk-allocates a buffer, reads
  into it and then pushes the filled pages into the pipe; or (c) handle
  it in filesystem-specific code.

  Summary:

   - Rename direct_splice_read() to copy_splice_read()

   - Simplify the calculations for the number of pages to be reclaimed
     in copy_splice_read()

   - Turn do_splice_to() into a helper, vfs_splice_read(), so that it
     can be used by overlayfs and coda to perform the checks on the
     lower fs

   - Make vfs_splice_read() jump to copy_splice_read() to handle
     direct-I/O and DAX

   - Provide shmem with its own splice_read to handle non-existent pages
     in the pagecache. We don't want a ->read_folio() as we don't want
     to populate holes, but filemap_get_pages() requires it

   - Provide overlayfs with its own splice_read to call down to a lower
     layer as overlayfs doesn't provide ->read_folio()

   - Provide coda with its own splice_read to call down to a lower layer
     as coda doesn't provide ->read_folio()

   - Direct ->splice_read to copy_splice_read() in tty, procfs, kernfs
     and random files as they just copy to the output buffer and don't
     splice pages

   - Provide wrappers for afs, ceph, ecryptfs, ext4, f2fs, nfs, ntfs3,
     ocfs2, orangefs, xfs and zonefs to do locking and/or revalidation

   - Make cifs use filemap_splice_read()

   - Replace pointers to generic_file_splice_read() with pointers to
     filemap_splice_read() as DIO and DAX are handled in the caller;
     filesystems can still provide their own alternate ->splice_read()
     op

   - Remove generic_file_splice_read()

   - Remove ITER_PIPE and its paraphernalia as generic_file_splice_read
     was the only user"

* tag 'for-6.5/splice-2023-06-23' of git://git.kernel.dk/linux: (31 commits)
  splice: kdoc for filemap_splice_read() and copy_splice_read()
  iov_iter: Kill ITER_PIPE
  splice: Remove generic_file_splice_read()
  splice: Use filemap_splice_read() instead of generic_file_splice_read()
  cifs: Use filemap_splice_read()
  trace: Convert trace/seq to use copy_splice_read()
  zonefs: Provide a splice-read wrapper
  xfs: Provide a splice-read wrapper
  orangefs: Provide a splice-read wrapper
  ocfs2: Provide a splice-read wrapper
  ntfs3: Provide a splice-read wrapper
  nfs: Provide a splice-read wrapper
  f2fs: Provide a splice-read wrapper
  ext4: Provide a splice-read wrapper
  ecryptfs: Provide a splice-read wrapper
  ceph: Provide a splice-read wrapper
  afs: Provide a splice-read wrapper
  9p: Add splice_read wrapper
  net: Make sock_splice_read() use copy_splice_read() by default
  tty, proc, kernfs, random: Use copy_splice_read()
  ...
2023-06-26 11:52:12 -07:00
Jeremy Sowden
6f67fbf819 lib/ts_bm: reset initial match offset for every block of text
The `shift` variable which indicates the offset in the string at which
to start matching the pattern is initialized to `bm->patlen - 1`, but it
is not reset when a new block is retrieved.  This means the implemen-
tation may start looking at later and later positions in each successive
block and miss occurrences of the pattern at the beginning.  E.g.,
consider a HTTP packet held in a non-linear skb, where the HTTP request
line occurs in the second block:

  [... 52 bytes of packet headers ...]
  GET /bmtest HTTP/1.1\r\nHost: www.example.com\r\n\r\n

and the pattern is "GET /bmtest".

Once the first block comprising the packet headers has been examined,
`shift` will be pointing to somewhere near the end of the block, and so
when the second block is examined the request line at the beginning will
be missed.

Reinitialize the variable for each new block.

Fixes: 8082e4ed0a ("[LIB]: Boyer-Moore extension for textsearch infrastructure strike #2")
Link: https://bugzilla.netfilter.org/show_bug.cgi?id=1390
Signed-off-by: Jeremy Sowden <jeremy@azazel.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2023-06-26 13:26:39 +02:00
Jakub Kicinski
a685d0df75 bpf-next-for-netdev
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQTFp0I1jqZrAX+hPRXbK58LschIgwUCZJX+ygAKCRDbK58LschI
 g0/2AQDHg12smf9mPfK9wOFDNRIIX8r2iufB8LUFQMzCwltN6gEAkAdkAyfbof7P
 TMaNUiHABijAFtChxoSI35j3OOSRrwE=
 =GJgN
 -----END PGP SIGNATURE-----

Merge tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next

Daniel Borkmann says:

====================
pull-request: bpf-next 2023-06-23

We've added 49 non-merge commits during the last 24 day(s) which contain
a total of 70 files changed, 1935 insertions(+), 442 deletions(-).

The main changes are:

1) Extend bpf_fib_lookup helper to allow passing the route table ID,
   from Louis DeLosSantos.

2) Fix regsafe() in verifier to call check_ids() for scalar registers,
   from Eduard Zingerman.

3) Extend the set of cpumask kfuncs with bpf_cpumask_first_and()
   and a rework of bpf_cpumask_any*() kfuncs. Additionally,
   add selftests, from David Vernet.

4) Fix socket lookup BPF helpers for tc/XDP to respect VRF bindings,
   from Gilad Sever.

5) Change bpf_link_put() to use workqueue unconditionally to fix it
   under PREEMPT_RT, from Sebastian Andrzej Siewior.

6) Follow-ups to address issues in the bpf_refcount shared ownership
   implementation, from Dave Marchevsky.

7) A few general refactorings to BPF map and program creation permissions
   checks which were part of the BPF token series, from Andrii Nakryiko.

8) Various fixes for benchmark framework and add a new benchmark
   for BPF memory allocator to BPF selftests, from Hou Tao.

9) Documentation improvements around iterators and trusted pointers,
   from Anton Protopopov.

10) Small cleanup in verifier to improve allocated object check,
    from Daniel T. Lee.

11) Improve performance of bpf_xdp_pointer() by avoiding access
    to shared_info when XDP packet does not have frags,
    from Jesper Dangaard Brouer.

12) Silence a harmless syzbot-reported warning in btf_type_id_size(),
    from Yonghong Song.

13) Remove duplicate bpfilter_umh_cleanup in favor of umd_cleanup_helper,
    from Jarkko Sakkinen.

14) Fix BPF selftests build for resolve_btfids under custom HOSTCFLAGS,
    from Viktor Malik.

* tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (49 commits)
  bpf, docs: Document existing macros instead of deprecated
  bpf, docs: BPF Iterator Document
  selftests/bpf: Fix compilation failure for prog vrf_socket_lookup
  selftests/bpf: Add vrf_socket_lookup tests
  bpf: Fix bpf socket lookup from tc/xdp to respect socket VRF bindings
  bpf: Call __bpf_sk_lookup()/__bpf_skc_lookup() directly via TC hookpoint
  bpf: Factor out socket lookup functions for the TC hookpoint.
  selftests/bpf: Set the default value of consumer_cnt as 0
  selftests/bpf: Ensure that next_cpu() returns a valid CPU number
  selftests/bpf: Output the correct error code for pthread APIs
  selftests/bpf: Use producer_cnt to allocate local counter array
  xsk: Remove unused inline function xsk_buff_discard()
  bpf: Keep BPF_PROG_LOAD permission checks clear of validations
  bpf: Centralize permissions checks for all BPF map types
  bpf: Inline map creation logic in map_create() function
  bpf: Move unprivileged checks into map_create() and bpf_prog_load()
  bpf: Remove in_atomic() from bpf_link_put().
  selftests/bpf: Verify that check_ids() is used for scalars in regsafe()
  bpf: Verify scalar ids mapping in regsafe() using check_ids()
  selftests/bpf: Check if mark_chain_precision() follows scalar ids
  ...
====================

Link: https://lore.kernel.org/r/20230623211256.8409-1-daniel@iogearbox.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-24 14:52:28 -07:00
Lukas Bulwahn
a8992d8ad7 watchdog/hardlockup: fix typo in config HARDLOCKUP_DETECTOR_PREFER_BUDDY
Commit a5fcc2367e ("watchdog/hardlockup: make HAVE_NMI_WATCHDOG
sparc64-specific") accidentially introduces a typo in one of the config
dependencies of HARDLOCKUP_DETECTOR_PREFER_BUDDY.

Fix this accidental typo.

Link: https://lkml.kernel.org/r/20230623040717.8645-1-lukas.bulwahn@gmail.com
Fixes: a5fcc2367e ("watchdog/hardlockup: make HAVE_NMI_WATCHDOG sparc64-specific")
Signed-off-by: Lukas Bulwahn <lukas.bulwahn@gmail.com>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Cc: Douglas Anderson <dianders@chromium.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-06-23 17:04:04 -07:00
Ben Dooks
875e0c31f8 devres: show which resource was invalid in __devm_ioremap_resource()
The other error prints in this call show the resource which wsan't valid,
so add this to the first print when it checks for basic validity of the
resource.

Link: https://lkml.kernel.org/r/20230621163050.477668-1-ben.dooks@codethink.co.uk
Signed-off-by: Ben Dooks <ben.dooks@codethink.co.uk>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-06-23 17:04:04 -07:00
Andrew Morton
63773d2b59 Merge mm-hotfixes-stable into mm-stable to pick up depended-upon changes. 2023-06-23 16:58:19 -07:00
Randy Dunlap
839cad5fa5 cpumask: fix function description kernel-doc notation
Use kernel-doc notation for the function description to prevent
a warning:

lib/cpumask.c:160: warning: This comment starts with '/**', but isn't a kernel-doc comment. Refer Documentation/doc-guide/kernel-doc.rst
 * Returns an arbitrary cpu within srcp1 & srcp2.

Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Yury Norov <yury.norov@gmail.com>
2023-06-22 13:57:41 -07:00
Yury Norov
c1d2ba10f5 lib/bitmap: drop optimization of bitmap_{from,to}_arr64
bitmap_{from,to}_arr64() optimization is overly optimistic on 32-bit LE
architectures when it's wired to bitmap_copy_clear_tail().

bitmap_copy_clear_tail() takes care of unused bits in the bitmap up to
the next word boundary. But on 32-bit machines when copying bits from
bitmap to array of 64-bit words, it's expected that the unused part of
a recipient array must be cleared up to 64-bit boundary, so the last 4
bytes may stay untouched when nbits % 64 <= 32.

While the copying part of the optimization works correct, that clear-tail
trick makes corresponding tests reasonably fail:

test_bitmap: bitmap_to_arr64(nbits == 1): tail is not safely cleared: 0xa5a5a5a500000001 (must be 0x0000000000000001)

Fix it by removing bitmap_{from,to}_arr64() optimization for 32-bit LE
arches.

Reported-by: Guenter Roeck <linux@roeck-us.net>
Link: https://lore.kernel.org/lkml/20230225184702.GA3587246@roeck-us.net/
Fixes: 0a97953fd2 ("lib: add bitmap_{from,to}_arr64")
Signed-off-by: Yury Norov <yury.norov@gmail.com>
Tested-by: Guenter Roeck <linux@roeck-us.net>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-by: Alexander Lobakin <aleksander.lobakin@intel.com>
2023-06-22 13:57:41 -07:00
Yury Norov
c4c14c2906 lib/test_bitmap: increment failure counter properly
The tests that don't use expect_eq() macro to determine that a test is
failured must increment failed_tests explicitly.

Reported-by: Guenter Roeck <linux@roeck-us.net>
Link: https://lore.kernel.org/lkml/20230225184702.GA3587246@roeck-us.net/
Signed-off-by: Yury Norov <yury.norov@gmail.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-by: Alexander Lobakin <aleksander.lobakin@intel.com>
2023-06-22 13:57:41 -07:00
Petr Mladek
7ca8fe94aa watchdog/hardlockup: define HARDLOCKUP_DETECTOR_ARCH
The HAVE_ prefix means that the code could be enabled. Add another
variable for HAVE_HARDLOCKUP_DETECTOR_ARCH without this prefix.
It will be set when it should be built. It will make it compatible
with the other hardlockup detectors.

The change allows to clean up dependencies of PPC_WATCHDOG
and HAVE_HARDLOCKUP_DETECTOR_PERF definitions for powerpc.

As a result HAVE_HARDLOCKUP_DETECTOR_PERF has the same dependencies
on arm, x86, powerpc architectures.

Link: https://lkml.kernel.org/r/20230616150618.6073-7-pmladek@suse.com
Signed-off-by: Petr Mladek <pmladek@suse.com>
Reviewed-by: Douglas Anderson <dianders@chromium.org>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-06-19 16:25:29 -07:00
Petr Mladek
47f4cb4339 watchdog/sparc64: define HARDLOCKUP_DETECTOR_SPARC64
The HAVE_ prefix means that the code could be enabled. Add another
variable for HAVE_HARDLOCKUP_DETECTOR_SPARC64 without this prefix.
It will be set when it should be built. It will make it compatible
with the other hardlockup detectors.

Before, it is far from obvious that the SPARC64 variant is actually used:

$> make ARCH=sparc64 defconfig
$> grep HARDLOCKUP_DETECTOR .config
CONFIG_HAVE_HARDLOCKUP_DETECTOR_BUDDY=y
CONFIG_HAVE_HARDLOCKUP_DETECTOR_SPARC64=y

After, it is more clear:

$> make ARCH=sparc64 defconfig
$> grep HARDLOCKUP_DETECTOR .config
CONFIG_HAVE_HARDLOCKUP_DETECTOR_BUDDY=y
CONFIG_HAVE_HARDLOCKUP_DETECTOR_SPARC64=y
CONFIG_HARDLOCKUP_DETECTOR_SPARC64=y

Link: https://lkml.kernel.org/r/20230616150618.6073-6-pmladek@suse.com
Signed-off-by: Petr Mladek <pmladek@suse.com>
Reviewed-by: Douglas Anderson <dianders@chromium.org>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-06-19 16:25:29 -07:00
Petr Mladek
a5fcc2367e watchdog/hardlockup: make HAVE_NMI_WATCHDOG sparc64-specific
There are several hardlockup detector implementations and several Kconfig
values which allow selection and build of the preferred one.

CONFIG_HARDLOCKUP_DETECTOR was introduced by the commit 23637d477c
("lockup_detector: Introduce CONFIG_HARDLOCKUP_DETECTOR") in v2.6.36.
It was a preparation step for introducing the new generic perf hardlockup
detector.

The existing arch-specific variants did not support the to-be-created
generic build configurations, sysctl interface, etc. This distinction
was made explicit by the commit 4a7863cc2e ("x86, nmi_watchdog:
Remove ARCH_HAS_NMI_WATCHDOG and rely on CONFIG_HARDLOCKUP_DETECTOR")
in v2.6.38.

CONFIG_HAVE_NMI_WATCHDOG was introduced by the commit d314d74c69
("nmi watchdog: do not use cpp symbol in Kconfig") in v3.4-rc1. It replaced
the above mentioned ARCH_HAS_NMI_WATCHDOG. At that time, it was still used
by three architectures, namely blackfin, mn10300, and sparc.

The support for blackfin and mn10300 architectures has been completely
dropped some time ago. And sparc is the only architecture with the historic
NMI watchdog at the moment.

And the old sparc implementation is really special. It is always built on
sparc64. It used to be always enabled until the commit 7a5c8b57ce
("sparc: implement watchdog_nmi_enable and watchdog_nmi_disable") added
in v4.10-rc1.

There are only few locations where the sparc64 NMI watchdog interacts
with the generic hardlockup detectors code:

  + implements arch_touch_nmi_watchdog() which is called from the generic
    touch_nmi_watchdog()

  + implements watchdog_hardlockup_enable()/disable() to support
    /proc/sys/kernel/nmi_watchdog

  + is always preferred over other generic watchdogs, see
    CONFIG_HARDLOCKUP_DETECTOR

  + includes asm/nmi.h into linux/nmi.h because some sparc-specific
    functions are needed in sparc-specific code which includes
    only linux/nmi.h.

The situation became more complicated after the commit 05a4a95279
("kernel/watchdog: split up config options") and commit 2104180a53
("powerpc/64s: implement arch-specific hardlockup watchdog") in v4.13-rc1.
They introduced HAVE_HARDLOCKUP_DETECTOR_ARCH. It was used for powerpc
specific hardlockup detector. It was compatible with the perf one
regarding the general boot, sysctl, and programming interfaces.

HAVE_HARDLOCKUP_DETECTOR_ARCH was defined as a superset of
HAVE_NMI_WATCHDOG. It made some sense because all arch-specific
detectors had some common requirements, namely:

  + implemented arch_touch_nmi_watchdog()
  + included asm/nmi.h into linux/nmi.h
  + defined the default value for /proc/sys/kernel/nmi_watchdog

But it actually has made things pretty complicated when the generic
buddy hardlockup detector was added. Before the generic perf detector
was newer supported together with an arch-specific one. But the buddy
detector could work on any SMP system. It means that an architecture
could support both the arch-specific and buddy detector.

As a result, there are few tricky dependencies. For example,
CONFIG_HARDLOCKUP_DETECTOR depends on:

  ((HAVE_HARDLOCKUP_DETECTOR_PERF || HAVE_HARDLOCKUP_DETECTOR_BUDDY) && !HAVE_NMI_WATCHDOG) || HAVE_HARDLOCKUP_DETECTOR_ARCH

The problem is that the very special sparc implementation is defined as:

  HAVE_NMI_WATCHDOG && !HAVE_HARDLOCKUP_DETECTOR_ARCH

Another problem is that the meaning of HAVE_NMI_WATCHDOG is far from clear
without reading understanding the history.

Make the logic less tricky and more self-explanatory by making
HAVE_NMI_WATCHDOG specific for the sparc64 implementation. And rename it to
HAVE_HARDLOCKUP_DETECTOR_SPARC64.

Note that HARDLOCKUP_DETECTOR_PREFER_BUDDY, HARDLOCKUP_DETECTOR_PERF,
and HARDLOCKUP_DETECTOR_BUDDY may conflict only with
HAVE_HARDLOCKUP_DETECTOR_ARCH. They depend on HARDLOCKUP_DETECTOR
and it is not longer enabled when HAVE_NMI_WATCHDOG is set.

Link: https://lkml.kernel.org/r/20230616150618.6073-5-pmladek@suse.com
Signed-off-by: Petr Mladek <pmladek@suse.com>
Reviewed-by: Douglas Anderson <dianders@chromium.org>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-06-19 16:25:29 -07:00
Petr Mladek
1356d0b966 watchdog/hardlockup: make the config checks more straightforward
There are four possible variants of hardlockup detectors:

  + buddy: available when SMP is set.

  + perf: available when HAVE_HARDLOCKUP_DETECTOR_PERF is set.

  + arch-specific: available when HAVE_HARDLOCKUP_DETECTOR_ARCH is set.

  + sparc64 special variant: available when HAVE_NMI_WATCHDOG is set
	and HAVE_HARDLOCKUP_DETECTOR_ARCH is not set.

The check for the sparc64 variant is more complicated because
HAVE_NMI_WATCHDOG is used to #ifdef code used by both arch-specific
and sparc64 specific variant. Therefore it is automatically
selected with HAVE_HARDLOCKUP_DETECTOR_ARCH.

This complexity is partly hidden in HAVE_HARDLOCKUP_DETECTOR_NON_ARCH.
It reduces the size of some checks but it makes them harder to follow.

Finally, the other temporary variable HARDLOCKUP_DETECTOR_NON_ARCH
is used to re-compute HARDLOCKUP_DETECTOR_PERF/BUDDY when the global
HARDLOCKUP_DETECTOR switch is enabled/disabled.

Make the logic more straightforward by the following changes:

  + Better explain the role of HAVE_HARDLOCKUP_DETECTOR_ARCH and
    HAVE_NMI_WATCHDOG in comments.

  + Add HAVE_HARDLOCKUP_DETECTOR_BUDDY so that there is separate
    HAVE_* for all four hardlockup detector variants.

    Use it in the other conditions instead of SMP. It makes it
    clear that it is about the buddy detector.

  + Open code HAVE_HARDLOCKUP_DETECTOR_NON_ARCH in HARDLOCKUP_DETECTOR
    and HARDLOCKUP_DETECTOR_PREFER_BUDDY. It helps to understand
    the conditions between the four hardlockup detector variants.

  + Define the exact conditions when HARDLOCKUP_DETECTOR_PERF/BUDDY
    can be enabled. It explains the dependency on the other
    hardlockup detector variants.

    Also it allows to remove HARDLOCKUP_DETECTOR_NON_ARCH by using "imply".
    It triggers re-evaluating HARDLOCKUP_DETECTOR_PERF/BUDDY when
    the global HARDLOCKUP_DETECTOR switch is changed.

  + Add dependency on HARDLOCKUP_DETECTOR so that the affected variables
    disappear when the hardlockup detectors are disabled.

    Another nice side effect is that HARDLOCKUP_DETECTOR_PREFER_BUDDY
    value is not preserved when the global switch is disabled.
    The user has to make the decision again when it gets re-enabled.

Link: https://lkml.kernel.org/r/20230616150618.6073-3-pmladek@suse.com
Signed-off-by: Petr Mladek <pmladek@suse.com>
Reviewed-by: Douglas Anderson <dianders@chromium.org>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-06-19 16:25:28 -07:00
Petr Mladek
4917a25f83 watchdog/hardlockup: sort hardlockup detector related config values a logical way
Patch series "watchdog/hardlockup: Cleanup configuration of hardlockup
detectors", v2.

Clean up watchdog Kconfig after introducing the buddy detector.


This patch (of 6):

There are four possible variants of hardlockup detectors:

  + buddy: available when SMP is set.

  + perf: available when HAVE_HARDLOCKUP_DETECTOR_PERF is set.

  + arch-specific: available when HAVE_HARDLOCKUP_DETECTOR_ARCH is set.

  + sparc64 special variant: available when HAVE_NMI_WATCHDOG is set
	and HAVE_HARDLOCKUP_DETECTOR_ARCH is not set.

Only one hardlockup detector can be compiled in. The selection is done
using quite complex dependencies between several CONFIG variables.
The following patches will try to make it more straightforward.

As a first step, reorder the definitions of the various CONFIG variables.
The logical order is:

   1. HAVE_* variables define available variants. They are typically
      defined in the arch/ config files.

   2. HARDLOCKUP_DETECTOR y/n variable defines whether the hardlockup
      detector is enabled at all.

   3. HARDLOCKUP_DETECTOR_PREFER_BUDDY y/n variable defines whether
      the buddy detector should be preferred over the perf one.
      Note that the arch specific variants are always preferred when
      available.

   4. HARDLOCKUP_DETECTOR_PERF/BUDDY variables define whether the given
      detector is enabled in the end.

   5. HAVE_HARDLOCKUP_DETECTOR_NON_ARCH and HARDLOCKUP_DETECTOR_NON_ARCH
      are temporary variables that are going to be removed in
      a followup patch.

This is a preparation step for further cleanup. It will change the logic
without shuffling the definitions.

This change temporary breaks the C-like ordering where the variables are
declared or defined before they are used. It is not really needed for
Kconfig. Also the following patches will rework the logic so that
the ordering will be C-like in the end.

The patch just shuffles the definitions. It should not change the existing
behavior.

Link: https://lkml.kernel.org/r/20230616150618.6073-1-pmladek@suse.com
Link: https://lkml.kernel.org/r/20230616150618.6073-2-pmladek@suse.com
Signed-off-by: Petr Mladek <pmladek@suse.com>
Reviewed-by: Douglas Anderson <dianders@chromium.org>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-06-19 16:25:28 -07:00
Douglas Anderson
7ece48b7b4 watchdog/buddy: simplify the dependency for HARDLOCKUP_DETECTOR_PREFER_BUDDY
The dependency for HARDLOCKUP_DETECTOR_PREFER_BUDDY was more complicated
than it needed to be.  If the "perf" detector is available and we have SMP
then we have a choice, so enable the config based on just those two config
items.

Link: https://lkml.kernel.org/r/20230526184139.8.I49d5b483336b65b8acb1e5066548a05260caf809@changeid
Signed-off-by: Douglas Anderson <dianders@chromium.org>
Suggested-by: Petr Mladek <pmladek@suse.com>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-06-19 16:25:28 -07:00
Joel Granados
f2e7a6265e test_sysclt: Test for registering a mount point
Test that target gets created by register_sysctl_mount_point and that no
additional target can be created "on top" of a permanently empty sysctl
table.

Create a mount point target (mnt) in the sysctl test driver; try to
create another on top of that (mnt_error). Output an error if
"mnt_error" is present when we run the sysctl selftests.

Signed-off-by: Joel Granados <j.granados@samsung.com>
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
2023-06-18 02:32:54 -07:00
Joel Granados
3557643859 test_sysctl: Add an unregister sysctl test
Add a test that checks that the unregistered directory is removed from
/proc/sys/debug

Signed-off-by: Joel Granados <j.granados@samsung.com>
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
2023-06-18 02:32:54 -07:00
Joel Granados
e009bd5efe test_sysctl: Group node sysctl test under one func
Preparation commit to add a new type of test to test_sysctl.c. We
want to differentiate between node and (sub)directory tests.

Signed-off-by: Joel Granados <j.granados@samsung.com>
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
2023-06-18 02:32:53 -07:00
Christoph Hellwig
84bd06c632 iov_iter: remove iov_iter_get_pages and iov_iter_get_pages_alloc
Now that the direct I/O helpers have switched to use
iov_iter_extract_pages, these helpers are unused.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Christian Brauner <brauner@kernel.org>
Reviewed-by: David Howells <dhowells@redhat.com>
Link: https://lore.kernel.org/r/20230614140341.521331-5-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-06-16 10:08:09 -06:00
Jakub Kicinski
173780ff18 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Cross-merge networking fixes after downstream PR.

Conflicts:

include/linux/mlx5/driver.h
  617f5db1a6 ("RDMA/mlx5: Fix affinity assignment")
  dc13180824 ("net/mlx5: Enable devlink port for embedded cpu VF vports")
https://lore.kernel.org/all/20230613125939.595e50b8@canb.auug.org.au/

tools/testing/selftests/net/mptcp/mptcp_join.sh
  47867f0a7e ("selftests: mptcp: join: skip check if MIB counter not supported")
  425ba80312 ("selftests: mptcp: join: support RM_ADDR for used endpoints or not")
  45b1a1227a ("mptcp: introduces more address related mibs")
  0639fa230a ("selftests: mptcp: add explicit check for new mibs")
https://lore.kernel.org/netdev/20230609-upstream-net-20230610-mptcp-selftests-support-old-kernels-part-3-v1-0-2896fe2ee8a3@tessares.net/

No adjacent changes.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-15 22:19:41 -07:00
Mirsad Goran Todorovac
7dae593cd2 test_firmware: return ENOMEM instead of ENOSPC on failed memory allocation
In a couple of situations like

	name = kstrndup(buf, count, GFP_KERNEL);
	if (!name)
		return -ENOSPC;

the error is not actually "No space left on device", but "Out of memory".

It is semantically correct to return -ENOMEM in all failed kstrndup()
and kzalloc() cases in this driver, as it is not a problem with disk
space, but with kernel memory allocator failing allocation.

The semantically correct should be:

        name = kstrndup(buf, count, GFP_KERNEL);
        if (!name)
                return -ENOMEM;

Cc: Dan Carpenter <error27@gmail.com>
Cc: Takashi Iwai <tiwai@suse.de>
Cc: Kees Cook <keescook@chromium.org>
Cc: "Luis R. Rodriguez" <mcgrof@ruslug.rutgers.edu>
Cc: Scott Branden <sbranden@broadcom.com>
Cc: Hans de Goede <hdegoede@redhat.com>
Cc: Brian Norris <briannorris@chromium.org>
Fixes: c92316bf8e ("test_firmware: add batched firmware tests")
Fixes: 0a8adf5847 ("test: add firmware_class loader test")
Fixes: 548193cba2 ("test_firmware: add support for firmware_request_platform")
Fixes: eb910947c8 ("test: firmware_class: add asynchronous request trigger")
Fixes: 061132d2b9 ("test_firmware: add test custom fallback trigger")
Fixes: 7feebfa487 ("test_firmware: add support for request_firmware_into_buf")
Signed-off-by: Mirsad Goran Todorovac <mirsad.todorovac@alu.unizg.hr>
Reviewed-by: Dan Carpenter <dan.carpenter@linaro.org>
Message-ID: <20230606070808.9300-1-mirsad.todorovac@alu.unizg.hr>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-06-15 13:42:18 +02:00
Arnd Bergmann
3de13550a2 raid6: neon: add missing prototypes
The raid6 syndrome functions are generated for different sizes and have
no generic prototype, while in the inner functions have a prototype
in a header that cannot be included from the correct file. In both
cases, the compiler warns about missing prototypes:

lib/raid6/recov_neon_inner.c:27:6: warning: no previous prototype for '__raid6_2data_recov_neon' [-Wmissing-prototypes]
lib/raid6/recov_neon_inner.c:77:6: warning: no previous prototype for '__raid6_datap_recov_neon' [-Wmissing-prototypes]
lib/raid6/neon1.c:56:6: warning: no previous prototype for 'raid6_neon1_gen_syndrome_real' [-Wmissing-prototypes]
lib/raid6/neon1.c:86:6: warning: no previous prototype for 'raid6_neon1_xor_syndrome_real' [-Wmissing-prototypes]
lib/raid6/neon2.c:56:6: warning: no previous prototype for 'raid6_neon2_gen_syndrome_real' [-Wmissing-prototypes]
lib/raid6/neon2.c:97:6: warning: no previous prototype for 'raid6_neon2_xor_syndrome_real' [-Wmissing-prototypes]
lib/raid6/neon4.c:56:6: warning: no previous prototype for 'raid6_neon4_gen_syndrome_real' [-Wmissing-prototypes]
lib/raid6/neon4.c:119:6: warning: no previous prototype for 'raid6_neon4_xor_syndrome_real' [-Wmissing-prototypes]
lib/raid6/neon8.c:56:6: warning: no previous prototype for 'raid6_neon8_gen_syndrome_real' [-Wmissing-prototypes]
lib/raid6/neon8.c:163:6: warning: no previous prototype for 'raid6_neon8_xor_syndrome_real' [-Wmissing-prototypes]

Add a new header file that contains the prototypes for both to avoid
the warnings.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Song Liu <song@kernel.org>
Link: https://lore.kernel.org/r/20230517132220.937200-1-arnd@kernel.org
2023-06-13 15:13:20 -07:00
Linus Torvalds
fb054096ae 19 hotfixes. 14 are cc:stable and the remainder address issues which were
introduced during this -rc cycle or which were considered inappropriate
 for a backport.
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCZIdw7QAKCRDdBJ7gKXxA
 jki4AQCygi1UoqVPq4N/NzJbv2GaNDXNmcJIoLvPpp3MYFhucAEAtQNzAYO9z6CT
 iLDMosnuh+1KLTaKNGL5iak3NAxnxQw=
 =mTdI
 -----END PGP SIGNATURE-----

Merge tag 'mm-hotfixes-stable-2023-06-12-12-22' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull misc fixes from Andrew Morton:
 "19 hotfixes. 14 are cc:stable and the remainder address issues which
  were introduced during this development cycle or which were considered
  inappropriate for a backport"

* tag 'mm-hotfixes-stable-2023-06-12-12-22' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
  zswap: do not shrink if cgroup may not zswap
  page cache: fix page_cache_next/prev_miss off by one
  ocfs2: check new file size on fallocate call
  mailmap: add entry for John Keeping
  mm/damon/core: fix divide error in damon_nr_accesses_to_accesses_bp()
  epoll: ep_autoremove_wake_function should use list_del_init_careful
  mm/gup_test: fix ioctl fail for compat task
  nilfs2: reject devices with insufficient block count
  ocfs2: fix use-after-free when unmounting read-only filesystem
  lib/test_vmalloc.c: avoid garbage in page array
  nilfs2: fix possible out-of-bounds segment allocation in resize ioctl
  riscv/purgatory: remove PGO flags
  powerpc/purgatory: remove PGO flags
  x86/purgatory: remove PGO flags
  kexec: support purgatories with .text.hot sections
  mm/uffd: allow vma to merge as much as possible
  mm/uffd: fix vma operation where start addr cuts part of vma
  radix-tree: move declarations to header
  nilfs2: fix incomplete buffer cleanup in nilfs_btnode_abort_change_key()
2023-06-12 16:14:34 -07:00
Lorenzo Stoakes
9f6c6ad161 lib/test_vmalloc.c: avoid garbage in page array
It turns out that alloc_pages_bulk_array() does not treat the page_array
parameter as an output parameter, but rather reads the array and skips any
entries that have already been allocated.

This is somewhat unexpected and breaks this test, as we allocate the pages
array uninitialised on the assumption it will be overwritten.

As a result, the test was referencing uninitialised data and causing the
PFN to not be valid and thus a WARN_ON() followed by a null pointer deref
and panic.

In addition, this is an array of pointers not of struct page objects, so we
need only allocate an array with elements of pointer size.

We solve both problems by simply using kcalloc() and referencing
sizeof(struct page *) rather than sizeof(struct page).

Link: https://lkml.kernel.org/r/20230524082424.10022-1-lstoakes@gmail.com
Fixes: 869cb29a61 ("lib/test_vmalloc.c: add vm_map_ram()/vm_unmap_ram() test case")
Signed-off-by: Lorenzo Stoakes <lstoakes@gmail.com>
Reviewed-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
Reviewed-by: Baoquan He <bhe@redhat.com>
Cc: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-06-12 11:31:51 -07:00
Arnd Bergmann
bde1597d0f radix-tree: move declarations to header
The xarray.c file contains the only call to radix_tree_node_rcu_free(),
and it comes with its own extern declaration for it.  This means the
function definition causes a missing-prototype warning:

lib/radix-tree.c:288:6: error: no previous prototype for 'radix_tree_node_rcu_free' [-Werror=missing-prototypes]

Instead, move the declaration for this function to a new header that can
be included by both, and do the same for the radix_tree_node_cachep
variable that has the same underlying problem but does not cause a warning
with gcc.

[zhangpeng.00@bytedance.com: fix building radix tree test suite]
  Link: https://lkml.kernel.org/r/20230521095450.21332-1-zhangpeng.00@bytedance.com
Link: https://lkml.kernel.org/r/20230516194212.548910-1-arnd@kernel.org
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Peng Zhang <zhangpeng.00@bytedance.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-06-12 11:31:50 -07:00
Douglas Anderson
1f423c905a watchdog/hardlockup: detect hard lockups using secondary (buddy) CPUs
Implement a hardlockup detector that doesn't doesn't need any extra
arch-specific support code to detect lockups.  Instead of using something
arch-specific we will use the buddy system, where each CPU watches out for
another one.  Specifically, each CPU will use its softlockup hrtimer to
check that the next CPU is processing hrtimer interrupts by verifying that
a counter is increasing.

NOTE: unlike the other hard lockup detectors, the buddy one can't easily
show what's happening on the CPU that locked up just by doing a simple
backtrace.  It relies on some other mechanism in the system to get
information about the locked up CPUs.  This could be support for NMI
backtraces like [1], it could be a mechanism for printing the PC of locked
CPUs at panic time like [2] / [3], or it could be something else.  Even
though that means we still rely on arch-specific code, this arch-specific
code seems to often be implemented even on architectures that don't have a
hardlockup detector.

This style of hardlockup detector originated in some downstream Android
trees and has been rebased on / carried in ChromeOS trees for quite a long
time for use on arm and arm64 boards.  Historically on these boards we've
leveraged mechanism [2] / [3] to get information about hung CPUs, but we
could move to [1].

Although the original motivation for the buddy system was for use on
systems without an arch-specific hardlockup detector, it can still be
useful to use even on systems that _do_ have an arch-specific hardlockup
detector.  On x86, for instance, there is a 24-part patch series [4] in
progress switching the arch-specific hard lockup detector from a scarce
perf counter to a less-scarce hardware resource.  Potentially the buddy
system could be a simpler alternative to free up the perf counter but
still get hard lockup detection.

Overall, pros (+) and cons (-) of the buddy system compared to an
arch-specific hardlockup detector (which might be implemented using
perf):
+ The buddy system is usable on systems that don't have an
  arch-specific hardlockup detector, like arm32 and arm64 (though it's
  being worked on for arm64 [5]).
+ The buddy system may free up scarce hardware resources.
+ If a CPU totally goes out to lunch (can't process NMIs) the buddy
  system could still detect the problem (though it would be unlikely
  to be able to get a stack trace).
+ The buddy system uses the same timer function to pet the hardlockup
  detector on the running CPU as it uses to detect hardlockups on
  other CPUs. Compared to other hardlockup detectors, this means it
  generates fewer interrupts and thus is likely better able to let
  CPUs stay idle longer.
- If all CPUs are hard locked up at the same time the buddy system
  can't detect it.
- If we don't have SMP we can't use the buddy system.
- The buddy system needs an arch-specific mechanism (possibly NMI
  backtrace) to get info about the locked up CPU.

[1] https://lore.kernel.org/r/20230419225604.21204-1-dianders@chromium.org
[2] https://issuetracker.google.com/172213129
[3] https://docs.kernel.org/trace/coresight/coresight-cpu-debug.html
[4] https://lore.kernel.org/lkml/20230301234753.28582-1-ricardo.neri-calderon@linux.intel.com/
[5] https://lore.kernel.org/linux-arm-kernel/20220903093415.15850-1-lecopzer.chen@mediatek.com/

Link: https://lkml.kernel.org/r/20230519101840.v5.14.I6bf789d21d0c3d75d382e7e51a804a7a51315f2c@changeid
Signed-off-by: Colin Cross <ccross@android.com>
Signed-off-by: Matthias Kaehlcke <mka@chromium.org>
Signed-off-by: Guenter Roeck <groeck@chromium.org>
Signed-off-by: Tzung-Bi Shih <tzungbi@chromium.org>
Signed-off-by: Douglas Anderson <dianders@chromium.org>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Chen-Yu Tsai <wens@csie.org>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: Daniel Thompson <daniel.thompson@linaro.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Ian Rogers <irogers@google.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Masayoshi Mizuma <msys.mizuma@gmail.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Petr Mladek <pmladek@suse.com>
Cc: Pingfan Liu <kernelfans@gmail.com>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: "Ravi V. Shankar" <ravi.v.shankar@intel.com>
Cc: Ricardo Neri <ricardo.neri@intel.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Stephen Boyd <swboyd@chromium.org>
Cc: Sumit Garg <sumit.garg@linaro.org>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-06-09 17:44:21 -07:00
Arnd Bergmann
0044444861 decompressor: provide missing prototypes
The entry points for the decompressor don't always have a prototype
included in the .c file:

lib/decompress_inflate.c:42:17: error: no previous prototype for '__gunzip' [-Werror=missing-prototypes]
lib/decompress_unxz.c:251:17: error: no previous prototype for 'unxz' [-Werror=missing-prototypes]
lib/decompress_unzstd.c:331:17: error: no previous prototype for 'unzstd' [-Werror=missing-prototypes]

Include the correct headers for unxz and unzstd, and mark the inflate
function above as unconditionally 'static' to avoid these warnings.

Link: https://lkml.kernel.org/r/20230517131936.936840-1-arnd@kernel.org
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Cc: Nick Terrell <terrelln@fb.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-06-09 17:44:17 -07:00
Arnd Bergmann
23108f6aac kunit: include debugfs header file
An extra #include statement is needed to ensure the prototypes for debugfs
interfaces are visible, avoiding this warning:

lib/kunit/debugfs.c:28:6: error: no previous prototype for 'kunit_debugfs_cleanup' [-Werror=missing-prototypes]
lib/kunit/debugfs.c:33:6: error: no previous prototype for 'kunit_debugfs_init' [-Werror=missing-prototypes]
lib/kunit/debugfs.c:102:6: error: no previous prototype for 'kunit_debugfs_create_suite' [-Werror=missing-prototypes]
lib/kunit/debugfs.c:118:6: error: no previous prototype for 'kunit_debugfs_destroy_suite' [-Werror=missing-prototypes]

Link: https://lkml.kernel.org/r/20230517131102.934196-10-arnd@kernel.org
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: David Gow <davidgow@google.com>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Dennis Zhou <dennis@kernel.org>
Cc: Eric Paris <eparis@redhat.com>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Helge Deller <deller@gmx.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Moore <paul@paul-moore.com>
Cc: Pavel Machek <pavel@ucw.cz>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rafael J. Wysocki <rafael@kernel.org>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Tejun Heo <tj@kernel.org>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Waiman Long <longman@redhat.com>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-06-09 17:44:16 -07:00
Arnd Bergmann
6b76ca2ab9 lib: devmem_is_allowed: include linux/io.h
The devmem_is_allowed() function is defined in a file of the same name,
but the declaration is in asm/io.h, which is not included there, causing a
W=1 warning:

lib/devmem_is_allowed.c:20:5: error: no previous prototype for 'devmem_is_allowed' [-Werror=missing-prototypes]

Include the appropriate header to avoid the warning.

Link: https://lkml.kernel.org/r/20230517131102.934196-6-arnd@kernel.org
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Dennis Zhou <dennis@kernel.org>
Cc: Eric Paris <eparis@redhat.com>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Helge Deller <deller@gmx.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Moore <paul@paul-moore.com>
Cc: Pavel Machek <pavel@ucw.cz>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rafael J. Wysocki <rafael@kernel.org>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Tejun Heo <tj@kernel.org>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Waiman Long <longman@redhat.com>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-06-09 17:44:15 -07:00
Alexey Dobriyan
3db55767da add intptr_t
Add signed intptr_t given that a) it is standard type and b) uintptr_t is
in tree.

Link: https://lkml.kernel.org/r/ed66b9e4-1fb7-45be-9bb9-d4bc291c691f@p183
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-06-09 17:44:13 -07:00
Peng Zhang
7a03ae3920 maple_tree: simplify and clean up mas_wr_node_store()
Simplify and clean up mas_wr_node_store(), remove unnecessary code.

Link: https://lkml.kernel.org/r/20230524031247.65949-10-zhangpeng.00@bytedance.com
Signed-off-by: Peng Zhang <zhangpeng.00@bytedance.com>
Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-06-09 16:25:45 -07:00
Peng Zhang
e6d1ffd611 maple_tree: rework mas_wr_slot_store() to be cleaner and more efficient.
Get whether the two gaps to be overwritten are empty to avoid calling
mas_update_gap() all the time.  Also clean up the code and add comments.

Link: https://lkml.kernel.org/r/20230524031247.65949-9-zhangpeng.00@bytedance.com
Signed-off-by: Peng Zhang <zhangpeng.00@bytedance.com>
Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-06-09 16:25:45 -07:00
Peng Zhang
2e1da329b4 maple_tree: add comments and some minor cleanups to mas_wr_append()
Add comment for mas_wr_append(), move mas_update_gap() into
mas_wr_append(), and other cleanups to make mas_wr_modify() cleaner.

Link: https://lkml.kernel.org/r/20230524031247.65949-8-zhangpeng.00@bytedance.com
Signed-off-by: Peng Zhang <zhangpeng.00@bytedance.com>
Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-06-09 16:25:45 -07:00
Peng Zhang
c6fc9e4a5c maple_tree: add mas_wr_new_end() to calculate new_end accurately
The previous new_end calculation is inaccurate, because it assumes that
two new pivots must be added (this is inaccurate), and sometimes it will
miss the fast path and enter the slow path.  Add mas_wr_new_end() to
accurately calculate new_end to make the conditions for entering the fast
path more accurate.

Link: https://lkml.kernel.org/r/20230524031247.65949-7-zhangpeng.00@bytedance.com
Signed-off-by: Peng Zhang <zhangpeng.00@bytedance.com>
Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-06-09 16:25:45 -07:00
Peng Zhang
8c995a6314 maple_tree: make the code symmetrical in mas_wr_extend_null()
Just make the code symmetrical to improve readability.

Link: https://lkml.kernel.org/r/20230524031247.65949-6-zhangpeng.00@bytedance.com
Signed-off-by: Peng Zhang <zhangpeng.00@bytedance.com>
Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-06-09 16:25:44 -07:00
Peng Zhang
bc147f0f70 maple_tree: simplify mas_is_span_wr()
Make the code for detecting spanning writes more concise.

Link: https://lkml.kernel.org/r/20230524031247.65949-5-zhangpeng.00@bytedance.com
Signed-off-by: Peng Zhang <zhangpeng.00@bytedance.com>
Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-06-09 16:25:44 -07:00
Peng Zhang
14c4b5ab6a maple_tree: fix the arguments to __must_hold()
Fix the arguments to __must_hold() to make sparse work.

Link: https://lkml.kernel.org/r/20230524031247.65949-4-zhangpeng.00@bytedance.com
Signed-off-by: Peng Zhang <zhangpeng.00@bytedance.com>
Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-06-09 16:25:44 -07:00
Peng Zhang
c2aa6f5328 maple_tree: drop mas_{rev_}alloc() and mas_fill_gap()
mas_{rev_}alloc() and mas_fill_gap() are no longer used, delete them.

Link: https://lkml.kernel.org/r/20230524031247.65949-3-zhangpeng.00@bytedance.com
Signed-off-by: Peng Zhang <zhangpeng.00@bytedance.com>
Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-06-09 16:25:44 -07:00
Peng Zhang
523716770e maple_tree: rework mtree_alloc_{range,rrange}()
Patch series "Clean ups for maple tree", v4.

Some clean ups, mainly to make the code of maple tree more concise.
This patchset has passed the self-test.


This patch (of 10):

Use mas_empty_area{_rev}() to refactor mtree_alloc_{range,rrange}()

Link: https://lkml.kernel.org/r/20230524031247.65949-2-zhangpeng.00@bytedance.com
Signed-off-by: Peng Zhang <zhangpeng.00@bytedance.com>
Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-06-09 16:25:43 -07:00
Liam R. Howlett
eb2e817f38 maple_tree: update testing code for mas_{next,prev,walk}
Now that the functions have changed the limits, update the testing of the
maple tree to test these new settings.

Link: https://lkml.kernel.org/r/20230518145544.1722059-34-Liam.Howlett@oracle.com
Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: David Binderman <dcb314@hotmail.com>
Cc: Peng Zhang <zhangpeng.00@bytedance.com>
Cc: Sergey Senozhatsky <senozhatsky@chromium.org>
Cc: Vernon Yang <vernon2gm@gmail.com>
Cc: Wei Yang <richard.weiyang@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-06-09 16:25:35 -07:00
Liam R. Howlett
6b23a29061 maple_tree: clear up index and last setting in single entry tree
When there is a single entry tree (range of 0-0 pointing to an entry),
then ensure the limit is either 0-0 or 1-oo, depending on where the user
walks.  Ensure the correct node setting as well; either MAS_ROOT or
MAS_NONE.

Link: https://lkml.kernel.org/r/20230518145544.1722059-33-Liam.Howlett@oracle.com
Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Peng Zhang <zhangpeng.00@bytedance.com>
Cc: David Binderman <dcb314@hotmail.com>
Cc: Sergey Senozhatsky <senozhatsky@chromium.org>
Cc: Vernon Yang <vernon2gm@gmail.com>
Cc: Wei Yang <richard.weiyang@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-06-09 16:25:34 -07:00
Liam R. Howlett
6b9e93e010 maple_tree: add mas_prev_range() and mas_find_range_rev interface
Some users of the maple tree may want to move to the previous range
regardless of the value stored there.  Add this interface as well as the
'find' variant to support walking to the first value, then iterating over
the previous ranges.

Link: https://lkml.kernel.org/r/20230518145544.1722059-32-Liam.Howlett@oracle.com
Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Vernon Yang <vernon2gm@gmail.com>
Cc: David Binderman <dcb314@hotmail.com>
Cc: Peng Zhang <zhangpeng.00@bytedance.com>
Cc: Sergey Senozhatsky <senozhatsky@chromium.org>
Cc: Wei Yang <richard.weiyang@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-06-09 16:25:34 -07:00
Liam R. Howlett
dd9a851382 maple_tree: introduce mas_prev_slot() interface
Sometimes the user needs to revert to the previous slot, regardless of if
it is empty or not.  Add an interface to go to the previous slot.

Since there can't be two consecutive NULLs in the tree, the mas_prev()
function can be implemented by calling mas_prev_slot() a maximum of 2
times.  Change the underlying interface to use mas_prev_slot() to align
the code.

Link: https://lkml.kernel.org/r/20230518145544.1722059-31-Liam.Howlett@oracle.com
Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: David Binderman <dcb314@hotmail.com>
Cc: Peng Zhang <zhangpeng.00@bytedance.com>
Cc: Sergey Senozhatsky <senozhatsky@chromium.org>
Cc: Vernon Yang <vernon2gm@gmail.com>
Cc: Wei Yang <richard.weiyang@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-06-09 16:25:34 -07:00
Liam R. Howlett
de6e386c06 maple_tree: relocate mas_rewalk() and mas_rewalk_if_dead()
These functions need to move for future use.

Link: https://lkml.kernel.org/r/20230518145544.1722059-30-Liam.Howlett@oracle.com
Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: David Binderman <dcb314@hotmail.com>
Cc: Peng Zhang <zhangpeng.00@bytedance.com>
Cc: Sergey Senozhatsky <senozhatsky@chromium.org>
Cc: Vernon Yang <vernon2gm@gmail.com>
Cc: Wei Yang <richard.weiyang@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-06-09 16:25:34 -07:00
Liam R. Howlett
6169b55319 maple_tree: add mas_next_range() and mas_find_range() interfaces
Some users of the maple tree may want to move to the next range in the
tree, even if it stores a NULL.  This family of function provides that
functionality by advancing one slot at a time and returning the result,
while mas_contiguous() will iterate over the range and stop on
encountering the first NULL.

Link: https://lkml.kernel.org/r/20230518145544.1722059-29-Liam.Howlett@oracle.com
Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: David Binderman <dcb314@hotmail.com>
Cc: Peng Zhang <zhangpeng.00@bytedance.com>
Cc: Sergey Senozhatsky <senozhatsky@chromium.org>
Cc: Vernon Yang <vernon2gm@gmail.com>
Cc: Wei Yang <richard.weiyang@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-06-09 16:25:34 -07:00
Liam R. Howlett
fff4a58cc0 maple_tree: introduce mas_next_slot() interface
Sometimes, during a tree walk, the user needs the next slot regardless of
if it is empty or not.  Add an interface to get the next slot.

Since there are no consecutive NULLs allowed in the tree, the mas_next()
function can only advance two slots at most.  So use the new
mas_next_slot() interface to align both implementations.  Use this method
for mas_find() as well.

Link: https://lkml.kernel.org/r/20230518145544.1722059-28-Liam.Howlett@oracle.com
Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: David Binderman <dcb314@hotmail.com>
Cc: Peng Zhang <zhangpeng.00@bytedance.com>
Cc: Sergey Senozhatsky <senozhatsky@chromium.org>
Cc: Vernon Yang <vernon2gm@gmail.com>
Cc: Wei Yang <richard.weiyang@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-06-09 16:25:33 -07:00
Liam R. Howlett
17e7436bd3 maple_tree: fix testing mas_empty_area()
Empty area will return -EINVAL if the search window is smaller than the
requested size.  Fix the test case to check for this error code.

Link: https://lkml.kernel.org/r/20230518145544.1722059-27-Liam.Howlett@oracle.com
Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: David Binderman <dcb314@hotmail.com>
Cc: Peng Zhang <zhangpeng.00@bytedance.com>
Cc: Sergey Senozhatsky <senozhatsky@chromium.org>
Cc: Vernon Yang <vernon2gm@gmail.com>
Cc: Wei Yang <richard.weiyang@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-06-09 16:25:33 -07:00
Liam R. Howlett
ba9972121a maple_tree: revise limit checks in mas_empty_area{_rev}()
Since the maple tree is inclusive in range, ensure that a range of 1 (min
= max) works for searching for a gap in either direction, and make sure
the size is at least 1 but not larger than the delta between min and max.

This commit also updates the testing.  Unfortunately there isn't a way to
safely update the tests and code without a test failure.

Link: https://lkml.kernel.org/r/20230518145544.1722059-26-Liam.Howlett@oracle.com
Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Suggested-by: Peng Zhang <zhangpeng.00@bytedance.com>
Cc: David Binderman <dcb314@hotmail.com>
Cc: Sergey Senozhatsky <senozhatsky@chromium.org>
Cc: Vernon Yang <vernon2gm@gmail.com>
Cc: Wei Yang <richard.weiyang@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-06-09 16:25:33 -07:00
Liam R. Howlett
39193685d5 maple_tree: try harder to keep active node with mas_prev()
Keep a reference to the node when possible with mas_prev().  This will
avoid re-walking the tree.  In keeping a reference to the node, keep the
last/index accurate to the range being referenced.  This means the limit
may be within the range, but the range may extend outside of the limit.

Also fix the single entry tree to respect the range (of 0), or set the
node to MAS_NONE in the case of shifting beyond 0.

Link: https://lkml.kernel.org/r/20230518145544.1722059-25-Liam.Howlett@oracle.com
Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: David Binderman <dcb314@hotmail.com>
Cc: Peng Zhang <zhangpeng.00@bytedance.com>
Cc: Sergey Senozhatsky <senozhatsky@chromium.org>
Cc: Vernon Yang <vernon2gm@gmail.com>
Cc: Wei Yang <richard.weiyang@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-06-09 16:25:33 -07:00
Liam R. Howlett
ca80f61004 maple_tree: try harder to keep active node after mas_next()
Clean up the mas_next() call to try and keep a node reference when
possible.  This will avoid re-walking the tree in most cases.

Also clean up the single entry tree handling to ensure index/last are
consistent with what one would expect.  (returning NULL with limit of
1-oo).

Link: https://lkml.kernel.org/r/20230518145544.1722059-24-Liam.Howlett@oracle.com
Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: David Binderman <dcb314@hotmail.com>
Cc: Peng Zhang <zhangpeng.00@bytedance.com>
Cc: Sergey Senozhatsky <senozhatsky@chromium.org>
Cc: Vernon Yang <vernon2gm@gmail.com>
Cc: Wei Yang <richard.weiyang@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-06-09 16:25:32 -07:00
Liam R. Howlett
d04118605f maple_tree: mas_start() reset depth on dead node
When a dead node is detected, the depth has already been set to 1 so reset
it to 0.

Link: https://lkml.kernel.org/r/20230518145544.1722059-22-Liam.Howlett@oracle.com
Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Reviewed-by: Peng Zhang <zhangpeng.00@bytedance.com>
Cc: David Binderman <dcb314@hotmail.com>
Cc: Sergey Senozhatsky <senozhatsky@chromium.org>
Cc: Vernon Yang <vernon2gm@gmail.com>
Cc: Wei Yang <richard.weiyang@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-06-09 16:25:32 -07:00
Liam R. Howlett
23e734ecd9 maple_tree: remove unnecessary check from mas_destroy()
mas_destroy currently checks if mas->node is MAS_START prior to calling
mas_start(), but this is unnecessary as mas_start() will do nothing if the
node is anything but MAS_START.

Link: https://lkml.kernel.org/r/20230518145544.1722059-21-Liam.Howlett@oracle.com
Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Reviewed-by: Peng Zhang <zhangpeng.00@bytedance.com>
Cc: David Binderman <dcb314@hotmail.com>
Cc: Sergey Senozhatsky <senozhatsky@chromium.org>
Cc: Vernon Yang <vernon2gm@gmail.com>
Cc: Wei Yang <richard.weiyang@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-06-09 16:25:32 -07:00
Liam R. Howlett
eaf9790d3b maple_tree: add __init and __exit to test module
The test functions are not needed after the module is removed, so mark
them as such.  Add __exit to the module removal function.  Some other
variables have been marked as const static as well.

Link: https://lkml.kernel.org/r/20230518145544.1722059-20-Liam.Howlett@oracle.com
Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Suggested-by: Andrew Morton <akpm@linux-foundation.org>
Cc: David Binderman <dcb314@hotmail.com>
Cc: Peng Zhang <zhangpeng.00@bytedance.com>
Cc: Sergey Senozhatsky <senozhatsky@chromium.org>
Cc: Vernon Yang <vernon2gm@gmail.com>
Cc: Wei Yang <richard.weiyang@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-06-09 16:25:31 -07:00
Liam R. Howlett
a5199577b1 maple_tree: make test code work without debug enabled
The test code is less useful without debug, but can still do general
validations.  Define mt_dump(), mas_dump() and mas_wr_dump() as a noop if
debug is not enabled and document it in the test module information that
more information can be obtained with another kernel config option.

MT_BUG_ON() will report a failures without tree dumps, and the output will
be less useful.

Link: https://lkml.kernel.org/r/20230518145544.1722059-17-Liam.Howlett@oracle.com
Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: David Binderman <dcb314@hotmail.com>
Cc: Peng Zhang <zhangpeng.00@bytedance.com>
Cc: Sergey Senozhatsky <senozhatsky@chromium.org>
Cc: Vernon Yang <vernon2gm@gmail.com>
Cc: Wei Yang <richard.weiyang@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-06-09 16:25:31 -07:00
Liam R. Howlett
acd4de60dd maple_tree: return error on mte_pivots() out of range
Rename mte_pivots() to mas_pivots() and pass through the ma_state to set
the error code to -EIO when the offset is out of range for the node type. 
Change the WARN_ON() to MAS_WARN_ON() to log the maple state.

Link: https://lkml.kernel.org/r/20230518145544.1722059-16-Liam.Howlett@oracle.com
Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Reviewed-by: Peng Zhang <zhangpeng.00@bytedance.com>
Cc: David Binderman <dcb314@hotmail.com>
Cc: Sergey Senozhatsky <senozhatsky@chromium.org>
Cc: Vernon Yang <vernon2gm@gmail.com>
Cc: Wei Yang <richard.weiyang@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-06-09 16:25:30 -07:00
Liam R. Howlett
bec1b51efb maple_tree: use MAS_BUG_ON() prior to calling mas_meta_gap()
Replace the call to BUG_ON() in mas_meta_gap() with calls before the
function call MAS_BUG_ON() to get more information on error condition.

Link: https://lkml.kernel.org/r/20230518145544.1722059-15-Liam.Howlett@oracle.com
Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: David Binderman <dcb314@hotmail.com>
Cc: Peng Zhang <zhangpeng.00@bytedance.com>
Cc: Sergey Senozhatsky <senozhatsky@chromium.org>
Cc: Vernon Yang <vernon2gm@gmail.com>
Cc: Wei Yang <richard.weiyang@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-06-09 16:25:30 -07:00
Liam R. Howlett
1c414c6a4b maple_tree: use MAS_WR_BUG_ON() in mas_store_prealloc()
mas_store_prealloc() should never fail, but if it does due to internal
tree issues then get as much debug information as possible prior to
crashing the kernel.

Link: https://lkml.kernel.org/r/20230518145544.1722059-14-Liam.Howlett@oracle.com
Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: David Binderman <dcb314@hotmail.com>
Cc: Peng Zhang <zhangpeng.00@bytedance.com>
Cc: Sergey Senozhatsky <senozhatsky@chromium.org>
Cc: Vernon Yang <vernon2gm@gmail.com>
Cc: Wei Yang <richard.weiyang@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-06-09 16:25:30 -07:00
Liam R. Howlett
4bbd1748c1 maple_tree: use MAS_BUG_ON() from mas_topiary_range()
In the even of trying to remove data from a leaf node by use of
mas_topiary_range(), log the maple state.

Link: https://lkml.kernel.org/r/20230518145544.1722059-13-Liam.Howlett@oracle.com
Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: David Binderman <dcb314@hotmail.com>
Cc: Peng Zhang <zhangpeng.00@bytedance.com>
Cc: Sergey Senozhatsky <senozhatsky@chromium.org>
Cc: Vernon Yang <vernon2gm@gmail.com>
Cc: Wei Yang <richard.weiyang@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-06-09 16:25:30 -07:00
Liam R. Howlett
5950ada963 maple_tree: use MAS_BUG_ON() in mas_set_height()
Use MAS_BUG_ON() instead of MT_BUG_ON() to get the maple state
information.  In the unlikely event of a tree height of > 31, try to
increase the probability of useful information being logged.

Link: https://lkml.kernel.org/r/20230518145544.1722059-12-Liam.Howlett@oracle.com
Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: David Binderman <dcb314@hotmail.com>
Cc: Peng Zhang <zhangpeng.00@bytedance.com>
Cc: Sergey Senozhatsky <senozhatsky@chromium.org>
Cc: Vernon Yang <vernon2gm@gmail.com>
Cc: Wei Yang <richard.weiyang@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-06-09 16:25:29 -07:00
Liam R. Howlett
bf96715eb4 maple_tree: use MAS_BUG_ON() when setting a leaf node as a parent
Use MAS_BUG_ON() to dump the maple state and tree in the unlikely event of
an issue.

Link: https://lkml.kernel.org/r/20230518145544.1722059-11-Liam.Howlett@oracle.com
Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: David Binderman <dcb314@hotmail.com>
Cc: Peng Zhang <zhangpeng.00@bytedance.com>
Cc: Sergey Senozhatsky <senozhatsky@chromium.org>
Cc: Vernon Yang <vernon2gm@gmail.com>
Cc: Wei Yang <richard.weiyang@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-06-09 16:25:29 -07:00
Liam R. Howlett
e6d6792a5c maple_tree: convert debug code to use MT_WARN_ON() and MAS_WARN_ON()
Using MT_WARN_ON() allows for the removal of if statements before logging.
Using MAS_WARN_ON() will provide more information when issues are
encountered.

Link: https://lkml.kernel.org/r/20230518145544.1722059-10-Liam.Howlett@oracle.com
Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: David Binderman <dcb314@hotmail.com>
Cc: Peng Zhang <zhangpeng.00@bytedance.com>
Cc: Sergey Senozhatsky <senozhatsky@chromium.org>
Cc: Vernon Yang <vernon2gm@gmail.com>
Cc: Wei Yang <richard.weiyang@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-06-09 16:25:29 -07:00
Liam R. Howlett
0d7c52bb29 maple_tree: convert BUG_ON() to MT_BUG_ON()
Use MT_BUG_ON() to get more information when running with MAPLE_TREE_DEBUG
enabled.

Link: https://lkml.kernel.org/r/20230518145544.1722059-8-Liam.Howlett@oracle.com
Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: David Binderman <dcb314@hotmail.com>
Cc: Peng Zhang <zhangpeng.00@bytedance.com>
Cc: Sergey Senozhatsky <senozhatsky@chromium.org>
Cc: Vernon Yang <vernon2gm@gmail.com>
Cc: Wei Yang <richard.weiyang@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-06-09 16:25:28 -07:00
Liam R. Howlett
f0a1f866ab maple_tree: add debug BUG_ON and WARN_ON variants
Add debug macros to dump the maple state and/or the tree for both warning
and bug_on calls.

Link: https://lkml.kernel.org/r/20230518145544.1722059-7-Liam.Howlett@oracle.com
Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: David Binderman <dcb314@hotmail.com>
Cc: Peng Zhang <zhangpeng.00@bytedance.com>
Cc: Sergey Senozhatsky <senozhatsky@chromium.org>
Cc: Vernon Yang <vernon2gm@gmail.com>
Cc: Wei Yang <richard.weiyang@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-06-09 16:25:28 -07:00
Liam R. Howlett
89f499f35c maple_tree: add format option to mt_dump()
Allow different formatting strings to be used when dumping the tree. 
Currently supports hex and decimal.

Link: https://lkml.kernel.org/r/20230518145544.1722059-6-Liam.Howlett@oracle.com
Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: David Binderman <dcb314@hotmail.com>
Cc: Peng Zhang <zhangpeng.00@bytedance.com>
Cc: Sergey Senozhatsky <senozhatsky@chromium.org>
Cc: Vernon Yang <vernon2gm@gmail.com>
Cc: Wei Yang <richard.weiyang@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-06-09 16:25:28 -07:00
Liam R. Howlett
c3eb787e88 maple_tree: clean up mas_dfs_postorder()
Convert loop type to ensure all variables are set to make the compiler
happy, and use the mas_is_none() function instead of explicitly checking
the node in the maple state.

Link: https://lkml.kernel.org/r/20230518145544.1722059-5-Liam.Howlett@oracle.com
Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: David Binderman <dcb314@hotmail.com>
Cc: Peng Zhang <zhangpeng.00@bytedance.com>
Cc: Sergey Senozhatsky <senozhatsky@chromium.org>
Cc: Vernon Yang <vernon2gm@gmail.com>
Cc: Wei Yang <richard.weiyang@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-06-09 16:25:28 -07:00
Liam R. Howlett
633769c926 maple_tree: avoid unnecessary ascending
The maple tree node limits are implied by the parent.  When walking up the
tree, the limit may not be known until a slot that does not have implied
limits are encountered.  However, if the node is the left-most or
right-most node, the walking up to find that limit can be skipped.

This commit also fixes the debug/testing code that was not setting the
limit on walking down the tree as that optimization is not compatible with
this change.

Link: https://lkml.kernel.org/r/20230518145544.1722059-4-Liam.Howlett@oracle.com
Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Reviewed-by: Peng Zhang <zhangpeng.00@bytedance.com>
Cc: David Binderman <dcb314@hotmail.com>
Cc: Sergey Senozhatsky <senozhatsky@chromium.org>
Cc: Vernon Yang <vernon2gm@gmail.com>
Cc: Wei Yang <richard.weiyang@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-06-09 16:25:27 -07:00
Liam R. Howlett
afc754c651 maple_tree: clean up mas_parent_enum() and rename to mas_parent_type()
mas_parent_enum() is a simple wrapper for mte_parent_enum() which is only
called from that wrapper.  Remove the wrapper and inline mte_parent_enum()
into mas_parent_enum().

At the same time, clean up the bit masking of the root pointer since it
cannot be set by the time the bit masking occurs.  Change the check on the
root bit to a WARN_ON(), and fix the verification code to not trigger the
WARN_ON() before checking if the node is root.

Align the name to mas_parent_type() since mas_node_type() exists already.

Link: https://lkml.kernel.org/r/20230518145544.1722059-3-Liam.Howlett@oracle.com
Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Reported-by: Wei Yang <richard.weiyang@gmail.com>
Reviewed-by: Wei Yang <richard.weiyang@gmail.com>
Cc: David Binderman <dcb314@hotmail.com>
Cc: Peng Zhang <zhangpeng.00@bytedance.com>
Cc: Sergey Senozhatsky <senozhatsky@chromium.org>
Cc: Vernon Yang <vernon2gm@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-06-09 16:25:27 -07:00
Liam R. Howlett
5729e06c81 maple_tree: fix static analyser cppcheck issue
Patch series "Maple tree mas_{next,prev}_range() and cleanup", v4.

This patchset contains a number of clean ups to the code to make it more
usable (next/prev range), the addition of debug output formatting, the
addition of printing the maple state information in the WARN_ON/BUG_ON
code.

There is also work done here to keep nodes active during iterations to
reduce the necessity of re-walking the tree.

Finally, there is a new interface added to move to the next or previous
range in the tree, even if it is empty.

The organisation of the patches is as follows:

0001-0004 - Small clean ups
0005-0018 - Additional debug options and WARN_ON/BUG_ON changes
0019      - Test module __init and __exit addition
0020-0021 - More functional clean ups
0022-0026 - Changes to keep nodes active
0027-0034 - Add new mas_{prev,next}_range()
0035      - Use new mas_{prev,next}_range() in mmap_region()


This patch (of 35):

Static analyser of the maple tree code noticed that the split variable is
being used to dereference into an array prior to checking the variable
itself.  Fix this issue by changing the order of the statement to check
the variable first.

Link: https://lkml.kernel.org/r/20230518145544.1722059-1-Liam.Howlett@oracle.com
Link: https://lkml.kernel.org/r/20230518145544.1722059-2-Liam.Howlett@oracle.com
Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Reported-by: David Binderman <dcb314@hotmail.com>
Reviewed-by: Peng Zhang<zhangpeng.00@bytedance.com>
Cc: Sergey Senozhatsky <senozhatsky@chromium.org>
Cc: Vernon Yang <vernon2gm@gmail.com>
Cc: Wei Yang <richard.weiyang@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-06-09 16:25:27 -07:00
Kefeng Wang
e9aae17092 mm: page_alloc: collect mem statistic into show_mem.c
Let's move show_mem.c from lib to mm, as it belongs memory subsystem, also
split some memory statistic related functions from page_alloc.c to
show_mem.c, and we cleanup some unneeded include.

There is no functional change.

Link: https://lkml.kernel.org/r/20230516063821.121844-5-wangkefeng.wang@huawei.com
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: "Huang, Ying" <ying.huang@intel.com>
Cc: Iurii Zaikin <yzaikin@google.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Len Brown <len.brown@intel.com>
Cc: Luis Chamberlain <mcgrof@kernel.org>
Cc: Mike Rapoport (IBM) <rppt@kernel.org>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Pavel Machek <pavel@ucw.cz>
Cc: Rafael J. Wysocki <rafael@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-06-09 16:25:22 -07:00
Peng Zhang
cd00dd2585 maple_tree: fix potential out-of-bounds access in mas_wr_end_piv()
Check the write offset end bounds before using it as the offset into the
pivot array.  This avoids a possible out-of-bounds access on the pivot
array if the write extends to the last slot in the node, in which case the
node maximum should be used as the end pivot.

akpm: this doesn't affect any current callers, but new users of mapletree
may encounter this problem if backported into earlier kernels, so let's
fix it in -stable kernels in case of this.

Link: https://lkml.kernel.org/r/20230506024752.2550-1-zhangpeng.00@bytedance.com
Fixes: 54a611b605 ("Maple Tree: add new data structure")
Signed-off-by: Peng Zhang <zhangpeng.00@bytedance.com>
Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-06-09 16:25:18 -07:00
Jakub Kicinski
449f6bc17a Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Cross-merge networking fixes after downstream PR.

Conflicts:

net/sched/sch_taprio.c
  d636fc5dd6 ("net: sched: add rcu annotations around qdisc->qdisc_sleeping")
  dced11ef84 ("net/sched: taprio: don't overwrite "sch" variable in taprio_dump_class_stats()")

net/ipv4/sysctl_net_ipv4.c
  e209fee411 ("net/ipv4: ping_group_range: allow GID from 2147483648 to 4294967294")
  ccce324dab ("tcp: make the first N SYN RTO backoffs linear")
https://lore.kernel.org/all/20230605100816.08d41a7b@canb.auug.org.au/

No adjacent changes.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-08 11:35:14 -07:00
Linus Torvalds
25041a4c02 Networking fixes for 6.4-rc6, including fixes from can, wifi, netfilter,
bluetooth and ebpf.
 
 Current release - regressions:
 
   - bpf: sockmap: avoid potential NULL dereference in sk_psock_verdict_data_ready()
 
   - wifi: iwlwifi: fix -Warray-bounds bug in iwl_mvm_wait_d3_notif()
 
   - phylink: actually fix ksettings_set() ethtool call
 
   - eth: dwmac-qcom-ethqos: fix a regression on EMAC < 3
 
 Current release - new code bugs:
 
   - wifi: mt76: fix possible NULL pointer dereference in mt7996_mac_write_txwi()
 
 Previous releases - regressions:
 
   - netfilter: fix NULL pointer dereference in nf_confirm_cthelper
 
   - wifi: rtw88/rtw89: correct PS calculation for SUPPORTS_DYNAMIC_PS
 
   - openvswitch: fix upcall counter access before allocation
 
   - bluetooth:
     - fix use-after-free in hci_remove_ltk/hci_remove_irk
     - fix l2cap_disconnect_req deadlock
 
   - nic: bnxt_en: prevent kernel panic when receiving unexpected PHC_UPDATE event
 
 Previous releases - always broken:
 
   - core: annotate rfs lockless accesses
 
   - sched: fq_pie: ensure reasonable TCA_FQ_PIE_QUANTUM values
 
   - netfilter: add null check for nla_nest_start_noflag() in nft_dump_basechain_hook()
 
   - bpf: fix UAF in task local storage
 
   - ipv4: ping_group_range: allow GID from 2147483648 to 4294967294
 
   - ipv6: rpl: fix route of death.
 
   - tcp: gso: really support BIG TCP
 
   - mptcp: fixes for user-space PM address advertisement
 
   - smc: avoid to access invalid RMBs' MRs in SMCRv1 ADD LINK CONT
 
   - can: avoid possible use-after-free when j1939_can_rx_register fails
 
   - batman-adv: fix UaF while rescheduling delayed work
 
   - eth: qede: fix scheduling while atomic
 
   - eth: ice: make writes to /dev/gnssX synchronous
 
 Signed-off-by: Paolo Abeni <pabeni@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 
 iQJGBAABCAAwFiEEg1AjqC77wbdLX2LbKSR5jcyPE6QFAmSBsv4SHHBhYmVuaUBy
 ZWRoYXQuY29tAAoJECkkeY3MjxOkMXUP/jisT2xvTFRmtshX3h+xxPkBxZSo9ovx
 ujviqZkyCNep9fu7Njv+5WWp0V8cy3Ui6G6RiGNHDV24vBtISlX21yQt+VANOPjH
 7x8oqqnANxn3PXjL5hp6YZhNaxiwfAfQGJiU+TngVo1jTJopnWEt2x8Q3EhF/k0S
 id8VaHGh/ugC8lRZSJBK/b+FsJjWY0sxTcsoRSjp6gg1WHUVO8mJXlCfHFhNJcQQ
 /8ghieuskLUs4V6UX3TGg4smGxgl2HPdA79+ohvrVhcB1WoGCsWV83SfUTBWgHkU
 IZrIfM4BFCThcN88IgRgJioeX95D54SK0RzEZdCnJx+elmgTK1ZdUGlBh1Vybh+v
 iQel2dgJI+8zyIl/4lXYdhHogLwnONVrkszMrx+Ds2PzNecmnFWg4LUK01xLjW7J
 poAFsZGVBk0BuTkEqXtxv/8Cc7wU/PMOmy4ZVBrHkNIyGgOLbt5eM0T/pArYoKvr
 +34del2Us2vGVk6i89F/GgRuNCvevO0Y+HyAArOJr2XwpakwQYQHdBdj/77FGjFZ
 PyR/bVJZhxdUMv+J7BdKQK+mwt+ZFBVwIRfU2gvHcDa2XQJe2Eg8GXRtcJ1P7hpr
 Q2A+AgiHSoAn6GrgYNHNZVBhWywQFCsu2ZpH7J0uo4zOyTUl3+4O8jyfDrD7o56D
 BodtDJKZit3B
 =X6b2
 -----END PGP SIGNATURE-----

Merge tag 'net-6.4-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net

Pull networking fixes from Paolo Abeni:
 "Including fixes from can, wifi, netfilter, bluetooth and ebpf.

  Current release - regressions:

   - bpf: sockmap: avoid potential NULL dereference in
     sk_psock_verdict_data_ready()

   - wifi: iwlwifi: fix -Warray-bounds bug in iwl_mvm_wait_d3_notif()

   - phylink: actually fix ksettings_set() ethtool call

   - eth: dwmac-qcom-ethqos: fix a regression on EMAC < 3

  Current release - new code bugs:

   - wifi: mt76: fix possible NULL pointer dereference in
     mt7996_mac_write_txwi()

  Previous releases - regressions:

   - netfilter: fix NULL pointer dereference in nf_confirm_cthelper

   - wifi: rtw88/rtw89: correct PS calculation for SUPPORTS_DYNAMIC_PS

   - openvswitch: fix upcall counter access before allocation

   - bluetooth:
      - fix use-after-free in hci_remove_ltk/hci_remove_irk
      - fix l2cap_disconnect_req deadlock

   - nic: bnxt_en: prevent kernel panic when receiving unexpected
     PHC_UPDATE event

  Previous releases - always broken:

   - core: annotate rfs lockless accesses

   - sched: fq_pie: ensure reasonable TCA_FQ_PIE_QUANTUM values

   - netfilter: add null check for nla_nest_start_noflag() in
     nft_dump_basechain_hook()

   - bpf: fix UAF in task local storage

   - ipv4: ping_group_range: allow GID from 2147483648 to 4294967294

   - ipv6: rpl: fix route of death.

   - tcp: gso: really support BIG TCP

   - mptcp: fixes for user-space PM address advertisement

   - smc: avoid to access invalid RMBs' MRs in SMCRv1 ADD LINK CONT

   - can: avoid possible use-after-free when j1939_can_rx_register fails

   - batman-adv: fix UaF while rescheduling delayed work

   - eth: qede: fix scheduling while atomic

   - eth: ice: make writes to /dev/gnssX synchronous"

* tag 'net-6.4-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (83 commits)
  bnxt_en: Implement .set_port / .unset_port UDP tunnel callbacks
  bnxt_en: Prevent kernel panic when receiving unexpected PHC_UPDATE event
  bnxt_en: Skip firmware fatal error recovery if chip is not accessible
  bnxt_en: Query default VLAN before VNIC setup on a VF
  bnxt_en: Don't issue AP reset during ethtool's reset operation
  bnxt_en: Fix bnxt_hwrm_update_rss_hash_cfg()
  net: bcmgenet: Fix EEE implementation
  eth: ixgbe: fix the wake condition
  eth: bnxt: fix the wake condition
  lib: cpu_rmap: Fix potential use-after-free in irq_cpu_rmap_release()
  bpf: Add extra path pointer check to d_path helper
  net: sched: fix possible refcount leak in tc_chain_tmplt_add()
  net: sched: act_police: fix sparse errors in tcf_police_dump()
  net: openvswitch: fix upcall counter access before allocation
  net: sched: move rtm_tca_policy declaration to include file
  ice: make writes to /dev/gnssX synchronous
  net: sched: add rcu annotations around qdisc->qdisc_sleeping
  rfs: annotate lockless accesses to RFS sock flow table
  rfs: annotate lockless accesses to sk->sk_rxhash
  virtio_net: use control_buf for coalesce params
  ...
2023-06-08 09:27:19 -07:00
David Howells
f5f82cd187 Move netfs_extract_iter_to_sg() to lib/scatterlist.c
Move netfs_extract_iter_to_sg() to lib/scatterlist.c as it's going to be
used by more than just network filesystems (AF_ALG, for example).

Signed-off-by: David Howells <dhowells@redhat.com>
cc: Jeff Layton <jlayton@kernel.org>
cc: Steve French <sfrench@samba.org>
cc: Shyam Prasad N <nspmangalore@gmail.com>
cc: Rohith Surabattula <rohiths.msft@gmail.com>
cc: Jens Axboe <axboe@kernel.dk>
cc: Herbert Xu <herbert@gondor.apana.org.au>
cc: "David S. Miller" <davem@davemloft.net>
cc: Eric Dumazet <edumazet@google.com>
cc: Jakub Kicinski <kuba@kernel.org>
cc: Paolo Abeni <pabeni@redhat.com>
cc: Matthew Wilcox <willy@infradead.org>
cc: linux-crypto@vger.kernel.org
cc: linux-cachefs@redhat.com
cc: linux-cifs@vger.kernel.org
cc: linux-fsdevel@vger.kernel.org
cc: netdev@vger.kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-06-08 13:42:33 +02:00
Ben Hutchings
7c5d4801ec lib: cpu_rmap: Fix potential use-after-free in irq_cpu_rmap_release()
irq_cpu_rmap_release() calls cpu_rmap_put(), which may free the rmap.
So we need to clear the pointer to our glue structure in rmap before
doing that, not after.

Fixes: 4e0473f106 ("lib: cpu_rmap: Avoid use after free on rmap->obj array entries")
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Link: https://lore.kernel.org/r/ZHo0vwquhOy3FaXc@decadent.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-07 21:25:00 -07:00
Tetsuo Handa
8b64d420fe debugobjects: Recheck debug_objects_enabled before reporting
syzbot is reporting false a positive ODEBUG message immediately after
ODEBUG was disabled due to OOM.

  [ 1062.309646][T22911] ODEBUG: Out of memory. ODEBUG disabled
  [ 1062.886755][ T5171] ------------[ cut here ]------------
  [ 1062.892770][ T5171] ODEBUG: assert_init not available (active state 0) object: ffffc900056afb20 object type: timer_list hint: process_timeout+0x0/0x40

  CPU 0 [ T5171]                CPU 1 [T22911]
  --------------                --------------
  debug_object_assert_init() {
    if (!debug_objects_enabled)
      return;
    db = get_bucket(addr);
                                lookup_object_or_alloc() {
                                  debug_objects_enabled = 0;
                                  return NULL;
                                }
                                debug_objects_oom() {
                                  pr_warn("Out of memory. ODEBUG disabled\n");
                                  // all buckets get emptied here, and
                                }
    lookup_object_or_alloc(addr, db, descr, false, true) {
      // this bucket is already empty.
      return ERR_PTR(-ENOENT);
    }
    // Emits false positive warning.
    debug_print_object(&o, "assert_init");
  }

Recheck debug_object_enabled in debug_print_object() to avoid that.

Reported-by: syzbot <syzbot+7937ba6a50bdd00fffdf@syzkaller.appspotmail.com>
Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/492fe2ae-5141-d548-ebd5-62f5fe2e57f7@I-love.SAKURA.ne.jp
Closes: https://syzkaller.appspot.com/bug?extid=7937ba6a50bdd00fffdf
2023-06-07 14:16:12 +02:00
Masami Hiramatsu (Google)
cb16330d12 fprobe: Pass return address to the handlers
Pass return address as 'ret_ip' to the fprobe entry and return handlers
so that the fprobe user handler can get the reutrn address without
analyzing arch-dependent pt_regs.

Link: https://lore.kernel.org/all/168507467664.913472.11642316698862778600.stgit@mhiramat.roam.corp.google.com/

Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
2023-06-06 21:39:55 +09:00
Andy Shevchenko
8d2b2281ae mac_pton: Clean up the header inclusions
Since hex_to_bin() is provided by hex.h there is no need to require
kernel.h. Replace the latter by the former and add missing export.h.

Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Link: https://lore.kernel.org/r/20230604132858.6650-1-andriy.shevchenko@linux.intel.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-06-06 13:18:32 +02:00
Andy Shevchenko
b2f10148ec kobject: Use return value of strreplace()
Since strreplace() returns the pointer to the string itself,
we may use it directly in the code.

Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20230605170553.7835-4-andriy.shevchenko@linux.intel.com
2023-06-05 15:31:12 -07:00
Andy Shevchenko
d01a77afd6 lib/string_helpers: Change returned value of the strreplace()
It's more useful to return the pointer to the string itself
with strreplace(), so it may be used like

	attr->name = strreplace(name, '/', '_');

While at it, amend the kernel documentation.

Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20230605170553.7835-3-andriy.shevchenko@linux.intel.com
2023-06-05 15:31:12 -07:00
Andrzej Hajda
acd8f0e5d7 lib/ref_tracker: remove warnings in case of allocation failure
Library can handle allocation failures. To avoid allocation warnings
__GFP_NOWARN has been added everywhere. Moreover GFP_ATOMIC has been
replaced with GFP_NOWAIT in case of stack allocation on tracker free
call.

Signed-off-by: Andrzej Hajda <andrzej.hajda@intel.com>
Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-05 15:28:42 -07:00
Andrzej Hajda
227c6c8323 lib/ref_tracker: add printing to memory buffer
Similar to stack_(depot|trace)_snprint the patch
adds helper to printing stats to memory buffer.
It will be helpful in case of debugfs.

Signed-off-by: Andrzej Hajda <andrzej.hajda@intel.com>
Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-05 15:28:42 -07:00
Andrzej Hajda
b6d7c0eb2d lib/ref_tracker: improve printing stats
In case the library is tracking busy subsystem, simply
printing stack for every active reference will spam log
with long, hard to read, redundant stack traces. To improve
readabilty following changes have been made:
- reports are printed per stack_handle - log is more compact,
- added display name for ref_tracker_dir - it will differentiate
  multiple subsystems,
- stack trace is printed indented, in the same printk call,
- info about dropped references is printed as well.

Signed-off-by: Andrzej Hajda <andrzej.hajda@intel.com>
Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-05 15:28:42 -07:00
Andrzej Hajda
7a113ff635 lib/ref_tracker: add unlocked leak print helper
To have reliable detection of leaks, caller must be able to check under
the same lock both: tracked counter and the leaks. dir.lock is natural
candidate for such lock and unlocked print helper can be called with this
lock taken.
As a bonus we can reuse this helper in ref_tracker_dir_exit.

Signed-off-by: Andrzej Hajda <andrzej.hajda@intel.com>
Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-05 15:28:42 -07:00
Peter Zijlstra
224d80c584 types: Introduce [us]128
Introduce [us]128 (when available). Unlike [us]64, ensure they are
always naturally aligned.

This also enables 128bit wide atomics (which require natural
alignment) such as cmpxchg128().

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Tested-by: Mark Rutland <mark.rutland@arm.com>
Link: https://lore.kernel.org/r/20230531132323.385005581@infradead.org
2023-06-05 09:36:35 +02:00
David Gow
260755184c kunit: Move kunit_abort() call out of kunit_do_failed_assertion()
KUnit aborts the current thread when an assertion fails. Currently, this
is done conditionally as part of the kunit_do_failed_assertion()
function, but this hides the kunit_abort() call from the compiler
(particularly if it's in another module). This, in turn, can lead to
both suboptimal code generation (the compiler can't know if
kunit_do_failed_assertion() will return), and to static analysis tools
like smatch giving false positives.

Moving the kunit_abort() call into the macro should give the compiler
and tools a better chance at understanding what's going on. Doing so
requires exporting kunit_abort(), though it's recommended to continue to
use assertions in lieu of aborting directly.

In addition, kunit_abort() and kunit_do_failed_assertion() are renamed
to make it clear they they're intended for internal KUnit use, to:
__kunit_do_failed_assertion() and __kunit_abort()

Suggested-by: Dan Carpenter <dan.carpenter@linaro.org>
Signed-off-by: David Gow <davidgow@google.com>
Reviewed-by: Miguel Ojeda <ojeda@kernel.org>
Reviewed-by: Daniel Latypov <dlatypov@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2023-06-01 13:04:46 -06:00
Alexander Potapenko
f9cfb1910e string: use __builtin_memcpy() in strlcpy/strlcat
lib/string.c is built with -ffreestanding, which prevents the compiler
from replacing certain functions with calls to their library versions.

On the other hand, this also prevents Clang and GCC from instrumenting
calls to memcpy() when building with KASAN, KCSAN or KMSAN:
 - KASAN normally replaces memcpy() with __asan_memcpy() with the
   additional cc-param,asan-kernel-mem-intrinsic-prefix=1;
 - KCSAN and KMSAN replace memcpy() with __tsan_memcpy() and
   __msan_memcpy() by default.

To let the tools catch memory accesses from strlcpy/strlcat, replace
the calls to memcpy() with __builtin_memcpy(), which KASAN, KCSAN and
KMSAN are able to replace even in -ffreestanding mode.

This preserves the behavior in normal builds (__builtin_memcpy() ends up
being replaced with memcpy()), and does not introduce new instrumentation
in unwanted places, as strlcpy/strlcat are already instrumented.

Suggested-by: Marco Elver <elver@google.com>
Signed-off-by: Alexander Potapenko <glider@google.com>
Reviewed-by: Marco Elver <elver@google.com>
Link: https://lore.kernel.org/all/20230224085942.1791837-1-elver@google.com/
Acked-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20230530083911.1104336-1-glider@google.com
2023-06-01 11:24:50 -07:00
Mirsad Goran Todorovac
48e1560230 test_firmware: fix the memory leak of the allocated firmware buffer
The following kernel memory leak was noticed after running
tools/testing/selftests/firmware/fw_run_tests.sh:

[root@pc-mtodorov firmware]# cat /sys/kernel/debug/kmemleak
.
.
.
unreferenced object 0xffff955389bc3400 (size 1024):
  comm "test_firmware-0", pid 5451, jiffies 4294944822 (age 65.652s)
  hex dump (first 32 bytes):
    47 48 34 35 36 37 0a 00 00 00 00 00 00 00 00 00  GH4567..........
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
  backtrace:
    [<ffffffff962f5dec>] slab_post_alloc_hook+0x8c/0x3c0
    [<ffffffff962fcca4>] __kmem_cache_alloc_node+0x184/0x240
    [<ffffffff962704de>] kmalloc_trace+0x2e/0xc0
    [<ffffffff9665b42d>] test_fw_run_batch_request+0x9d/0x180
    [<ffffffff95fd813b>] kthread+0x10b/0x140
    [<ffffffff95e033e9>] ret_from_fork+0x29/0x50
unreferenced object 0xffff9553c334b400 (size 1024):
  comm "test_firmware-1", pid 5452, jiffies 4294944822 (age 65.652s)
  hex dump (first 32 bytes):
    47 48 34 35 36 37 0a 00 00 00 00 00 00 00 00 00  GH4567..........
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
  backtrace:
    [<ffffffff962f5dec>] slab_post_alloc_hook+0x8c/0x3c0
    [<ffffffff962fcca4>] __kmem_cache_alloc_node+0x184/0x240
    [<ffffffff962704de>] kmalloc_trace+0x2e/0xc0
    [<ffffffff9665b42d>] test_fw_run_batch_request+0x9d/0x180
    [<ffffffff95fd813b>] kthread+0x10b/0x140
    [<ffffffff95e033e9>] ret_from_fork+0x29/0x50
unreferenced object 0xffff9553c334f000 (size 1024):
  comm "test_firmware-2", pid 5453, jiffies 4294944822 (age 65.652s)
  hex dump (first 32 bytes):
    47 48 34 35 36 37 0a 00 00 00 00 00 00 00 00 00  GH4567..........
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
  backtrace:
    [<ffffffff962f5dec>] slab_post_alloc_hook+0x8c/0x3c0
    [<ffffffff962fcca4>] __kmem_cache_alloc_node+0x184/0x240
    [<ffffffff962704de>] kmalloc_trace+0x2e/0xc0
    [<ffffffff9665b42d>] test_fw_run_batch_request+0x9d/0x180
    [<ffffffff95fd813b>] kthread+0x10b/0x140
    [<ffffffff95e033e9>] ret_from_fork+0x29/0x50
unreferenced object 0xffff9553c3348400 (size 1024):
  comm "test_firmware-3", pid 5454, jiffies 4294944822 (age 65.652s)
  hex dump (first 32 bytes):
    47 48 34 35 36 37 0a 00 00 00 00 00 00 00 00 00  GH4567..........
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
  backtrace:
    [<ffffffff962f5dec>] slab_post_alloc_hook+0x8c/0x3c0
    [<ffffffff962fcca4>] __kmem_cache_alloc_node+0x184/0x240
    [<ffffffff962704de>] kmalloc_trace+0x2e/0xc0
    [<ffffffff9665b42d>] test_fw_run_batch_request+0x9d/0x180
    [<ffffffff95fd813b>] kthread+0x10b/0x140
    [<ffffffff95e033e9>] ret_from_fork+0x29/0x50
[root@pc-mtodorov firmware]#

Note that the size 1024 corresponds to the size of the test firmware
buffer. The actual number of the buffers leaked is around 70-110,
depending on the test run.

The cause of the leak is the following:

request_partial_firmware_into_buf() and request_firmware_into_buf()
provided firmware buffer isn't released on release_firmware(), we
have allocated it and we are responsible for deallocating it manually.
This is introduced in a number of context where previously only
release_firmware() was called, which was insufficient.

Reported-by: Mirsad Goran Todorovac <mirsad.todorovac@alu.unizg.hr>
Fixes: 7feebfa487 ("test_firmware: add support for request_firmware_into_buf")
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Dan Carpenter <error27@gmail.com>
Cc: Takashi Iwai <tiwai@suse.de>
Cc: Luis Chamberlain <mcgrof@kernel.org>
Cc: Russ Weight <russell.h.weight@intel.com>
Cc: Tianfei zhang <tianfei.zhang@intel.com>
Cc: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Cc: Zhengchao Shao <shaozhengchao@huawei.com>
Cc: Colin Ian King <colin.i.king@gmail.com>
Cc: linux-kernel@vger.kernel.org
Cc: Kees Cook <keescook@chromium.org>
Cc: Scott Branden <sbranden@broadcom.com>
Cc: Luis R. Rodriguez <mcgrof@kernel.org>
Cc: linux-kselftest@vger.kernel.org
Cc: stable@vger.kernel.org # v5.4
Signed-off-by: Mirsad Goran Todorovac <mirsad.todorovac@alu.unizg.hr>
Link: https://lore.kernel.org/r/20230509084746.48259-3-mirsad.todorovac@alu.unizg.hr
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-05-31 20:31:07 +01:00
Mirsad Goran Todorovac
be37bed754 test_firmware: fix a memory leak with reqs buffer
Dan Carpenter spotted that test_fw_config->reqs will be leaked if
trigger_batched_requests_store() is called two or more times.
The same appears with trigger_batched_requests_async_store().

This bug wasn't trigger by the tests, but observed by Dan's visual
inspection of the code.

The recommended workaround was to return -EBUSY if test_fw_config->reqs
is already allocated.

Fixes: 7feebfa487 ("test_firmware: add support for request_firmware_into_buf")
Cc: Luis Chamberlain <mcgrof@kernel.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Russ Weight <russell.h.weight@intel.com>
Cc: Tianfei Zhang <tianfei.zhang@intel.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Colin Ian King <colin.i.king@gmail.com>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: linux-kselftest@vger.kernel.org
Cc: stable@vger.kernel.org # v5.4
Suggested-by: Dan Carpenter <error27@gmail.com>
Suggested-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Mirsad Goran Todorovac <mirsad.todorovac@alu.unizg.hr>
Reviewed-by: Dan Carpenter <dan.carpenter@linaro.org>
Acked-by: Luis Chamberlain <mcgrof@kernel.org>
Link: https://lore.kernel.org/r/20230509084746.48259-2-mirsad.todorovac@alu.unizg.hr
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-05-31 20:31:07 +01:00
Mirsad Goran Todorovac
4acfe3dfde test_firmware: prevent race conditions by a correct implementation of locking
Dan Carpenter spotted a race condition in a couple of situations like
these in the test_firmware driver:

static int test_dev_config_update_u8(const char *buf, size_t size, u8 *cfg)
{
        u8 val;
        int ret;

        ret = kstrtou8(buf, 10, &val);
        if (ret)
                return ret;

        mutex_lock(&test_fw_mutex);
        *(u8 *)cfg = val;
        mutex_unlock(&test_fw_mutex);

        /* Always return full write size even if we didn't consume all */
        return size;
}

static ssize_t config_num_requests_store(struct device *dev,
                                         struct device_attribute *attr,
                                         const char *buf, size_t count)
{
        int rc;

        mutex_lock(&test_fw_mutex);
        if (test_fw_config->reqs) {
                pr_err("Must call release_all_firmware prior to changing config\n");
                rc = -EINVAL;
                mutex_unlock(&test_fw_mutex);
                goto out;
        }
        mutex_unlock(&test_fw_mutex);

        rc = test_dev_config_update_u8(buf, count,
                                       &test_fw_config->num_requests);

out:
        return rc;
}

static ssize_t config_read_fw_idx_store(struct device *dev,
                                        struct device_attribute *attr,
                                        const char *buf, size_t count)
{
        return test_dev_config_update_u8(buf, count,
                                         &test_fw_config->read_fw_idx);
}

The function test_dev_config_update_u8() is called from both the locked
and the unlocked context, function config_num_requests_store() and
config_read_fw_idx_store() which can both be called asynchronously as
they are driver's methods, while test_dev_config_update_u8() and siblings
change their argument pointed to by u8 *cfg or similar pointer.

To avoid deadlock on test_fw_mutex, the lock is dropped before calling
test_dev_config_update_u8() and re-acquired within test_dev_config_update_u8()
itself, but alas this creates a race condition.

Having two locks wouldn't assure a race-proof mutual exclusion.

This situation is best avoided by the introduction of a new, unlocked
function __test_dev_config_update_u8() which can be called from the locked
context and reducing test_dev_config_update_u8() to:

static int test_dev_config_update_u8(const char *buf, size_t size, u8 *cfg)
{
        int ret;

        mutex_lock(&test_fw_mutex);
        ret = __test_dev_config_update_u8(buf, size, cfg);
        mutex_unlock(&test_fw_mutex);

        return ret;
}

doing the locking and calling the unlocked primitive, which enables both
locked and unlocked versions without duplication of code.

The similar approach was applied to all functions called from the locked
and the unlocked context, which safely mitigates both deadlocks and race
conditions in the driver.

__test_dev_config_update_bool(), __test_dev_config_update_u8() and
__test_dev_config_update_size_t() unlocked versions of the functions
were introduced to be called from the locked contexts as a workaround
without releasing the main driver's lock and thereof causing a race
condition.

The test_dev_config_update_bool(), test_dev_config_update_u8() and
test_dev_config_update_size_t() locked versions of the functions
are being called from driver methods without the unnecessary multiplying
of the locking and unlocking code for each method, and complicating
the code with saving of the return value across lock.

Fixes: 7feebfa487 ("test_firmware: add support for request_firmware_into_buf")
Cc: Luis Chamberlain <mcgrof@kernel.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Russ Weight <russell.h.weight@intel.com>
Cc: Takashi Iwai <tiwai@suse.de>
Cc: Tianfei Zhang <tianfei.zhang@intel.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Colin Ian King <colin.i.king@gmail.com>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: linux-kselftest@vger.kernel.org
Cc: stable@vger.kernel.org # v5.4
Suggested-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: Mirsad Goran Todorovac <mirsad.todorovac@alu.unizg.hr>
Link: https://lore.kernel.org/r/20230509084746.48259-1-mirsad.todorovac@alu.unizg.hr
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-05-31 20:31:07 +01:00
Su Hui
0d2da4b595 bpf/tests: Use struct_size()
Use struct_size() instead of hand writing it. This is less verbose and
more informative.

Signed-off-by: Su Hui <suhui@nfschina.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20230531043251.989312-1-suhui@nfschina.com
2023-05-31 12:58:38 +02:00
Arnd Bergmann
26f15e5de1 ubsan: add prototypes for internal functions
Most of the functions in ubsan that are only called from generated
code don't have a prototype, which W=1 builds warn about:

lib/ubsan.c:226:6: error: no previous prototype for '__ubsan_handle_divrem_overflow' [-Werror=missing-prototypes]
lib/ubsan.c:307:6: error: no previous prototype for '__ubsan_handle_type_mismatch' [-Werror=missing-prototypes]
lib/ubsan.c:321:6: error: no previous prototype for '__ubsan_handle_type_mismatch_v1' [-Werror=missing-prototypes]
lib/ubsan.c:335:6: error: no previous prototype for '__ubsan_handle_out_of_bounds' [-Werror=missing-prototypes]
lib/ubsan.c:352:6: error: no previous prototype for '__ubsan_handle_shift_out_of_bounds' [-Werror=missing-prototypes]
lib/ubsan.c:394:6: error: no previous prototype for '__ubsan_handle_builtin_unreachable' [-Werror=missing-prototypes]
lib/ubsan.c:404:6: error: no previous prototype for '__ubsan_handle_load_invalid_value' [-Werror=missing-prototypes]

Add prototypes for all of these to lib/ubsan.h, and remove the
one that was already present in ubsan.c.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Fangrui Song <maskray@google.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20230517125102.930491-1-arnd@kernel.org
2023-05-30 16:42:01 -07:00
Linus Torvalds
d8f14b84fe Two fixes for debugobjects:
- Prevent that the allocation path wakes up kswapd. That's a long
     standing issue due to the GFP_ATOMIC allocation flag. As debug objects
     can be invoked from pretty much any context waking kswapd can end up
     in arbitrary lock chains versus the waitqueue lock.
 
   - Correct the explicit lockdep wait-type violation in
     debug_object_fill_pool().
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAmRzCBQTHHRnbHhAbGlu
 dXRyb25peC5kZQAKCRCmGPVMDXSYoa8FD/sFaHGSVtNTYgkV75umETMWbx+nR0Sp
 Y/i62MswIWU/DWmD9IKaBxlHpBByHgopBAozDnUix6RfQvf8V/GSU6PWa9HAR2QH
 rYwQCN/2/e8yQNAFv+9AiYGzPU3fRI/z7rYgfhhiWoLjivMFUCXypjBG0BAiCBxC
 pYKZDMhBeySIUjtEL6xjcflA8XXKuLUPGy1WeKBxRgJeNvM0GlbifNXoy0JaXBso
 NK+1FOG7zm05r2RqZjN0rAVRrrdgA4JYygpYC8YmzePoFQVXLeUnlbjjW9uYX+hz
 MoLuVeF+rKk9NHNu3NoD4kFgrNp3NXAAAzH1MJwIADy9THtsyWAeEgyUkkie9aiX
 Oa8eSjpJQjUv5h+VRKpMhh2RAAAhCYDuX/QC2FLImLy+GRF3dMhsAmuYgKXN2kHa
 CFkM84vStMiMVxKhwtLpxVE7VOrxzXxbqMO65kMrCXYxK1SfKtEZr8FrORvUjU7G
 MmH+D9sB034nkCBU+oGMsMYAAzB4rLp5Cw9qqvwWLfJvWLcUoPxjgUV6hLR6mNXx
 6+2133Tf68Fz4TgyEDN9XhQ7QEsKKGTTDMJ5JYolnrRe54sUJSsX+44khrbocSde
 WcEfcwhR+mjDDx0eVB2oT9bedxMf639mqPNn//EqJkzS4s+sECC8OiHbdvL3ArUq
 S92nrMxvyMB42Q==
 =7B4m
 -----END PGP SIGNATURE-----

Merge tag 'core-debugobjects-2023-05-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull debugobjects fixes from Thomas Gleixner:
 "Two fixes for debugobjects:

   - Prevent the allocation path from waking up kswapd.

     That's a long standing issue due to the GFP_ATOMIC allocation flag.
     As debug objects can be invoked from pretty much any context waking
     kswapd can end up in arbitrary lock chains versus the waitqueue
     lock

   - Correct the explicit lockdep wait-type violation in
     debug_object_fill_pool()"

* tag 'core-debugobjects-2023-05-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  debugobjects: Don't wake up kswapd from fill_pool()
  debugobjects,locking: Annotate debug_object_fill_pool() wait type violation
2023-05-28 07:15:33 -04:00
Kees Cook
d67790ddf0 overflow: Add struct_size_t() helper
While struct_size() is normally used in situations where the structure
type already has a pointer instance, there are places where no variable
is available. In the past, this has been worked around by using a typed
NULL first argument, but this is a bit ugly. Add a helper to do this,
and replace the handful of instances of the code pattern with it.

Instances were found with this Coccinelle script:

@struct_size_t@
identifier STRUCT, MEMBER;
expression COUNT;
@@

-       struct_size((struct STRUCT *)\(0\|NULL\),
+       struct_size_t(struct STRUCT,
                MEMBER, COUNT)

Suggested-by: Christoph Hellwig <hch@infradead.org>
Cc: Jesse Brandeburg <jesse.brandeburg@intel.com>
Cc: Tony Nguyen <anthony.l.nguyen@intel.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Paolo Abeni <pabeni@redhat.com>
Cc: James Smart <james.smart@broadcom.com>
Cc: Keith Busch <kbusch@kernel.org>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Sagi Grimberg <sagi@grimberg.me>
Cc: HighPoint Linux Team <linux@highpoint-tech.com>
Cc: "James E.J. Bottomley" <jejb@linux.ibm.com>
Cc: "Martin K. Petersen" <martin.petersen@oracle.com>
Cc: Kashyap Desai <kashyap.desai@broadcom.com>
Cc: Sumit Saxena <sumit.saxena@broadcom.com>
Cc: Shivasharan S <shivasharan.srikanteshwara@broadcom.com>
Cc: Don Brace <don.brace@microchip.com>
Cc: "Darrick J. Wong" <djwong@kernel.org>
Cc: Dave Chinner <dchinner@redhat.com>
Cc: Guo Xuenan <guoxuenan@huawei.com>
Cc: Gwan-gyeong Mun <gwan-gyeong.mun@intel.com>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Daniel Latypov <dlatypov@google.com>
Cc: kernel test robot <lkp@intel.com>
Cc: intel-wired-lan@lists.osuosl.org
Cc: netdev@vger.kernel.org
Cc: linux-nvme@lists.infradead.org
Cc: linux-scsi@vger.kernel.org
Cc: megaraidlinux.pdl@broadcom.com
Cc: storagedev@microchip.com
Cc: linux-xfs@vger.kernel.org
Cc: linux-hardening@vger.kernel.org
Signed-off-by: Kees Cook <keescook@chromium.org>
Acked-by: Martin K. Petersen <martin.petersen@oracle.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Acked-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Alexander Lobakin <aleksander.lobakin@intel.com>
Link: https://lore.kernel.org/r/20230522211810.never.421-kees@kernel.org
2023-05-26 13:52:19 -07:00
Michal Wajdeczko
b1eaa8b2a5 kunit: Update kunit_print_ok_not_ok function
There is no need use opaque test_or_suite pointer and is_test flag
as we don't use anything from the suite struct. Always expect test
pointer and use NULL as indication that provided results are from
the suite so we can treat them differently.

Since results could be from nested tests, like parameterized tests,
add explicit level parameter to properly indent output messages and
thus allow to reuse this function from other places.

While around, remove small code duplication near skip directive.

Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: David Gow <davidgow@google.com>
Cc: Rae Moar <rmoar@google.com>
Reviewed-by: David Gow <davidgow@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2023-05-26 08:44:09 -06:00
Michal Wajdeczko
b08f75b9bb kunit: Fix reporting of the skipped parameterized tests
Logs from the parameterized tests that were skipped don't include
SKIP directive thus they are displayed as PASSED. Fix that.

Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: David Gow <davidgow@google.com>
Reviewed-by: Rae Moar <rmoar@google.com>
Reviewed-by: David Gow <davidgow@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2023-05-26 08:44:03 -06:00
Michal Wajdeczko
d273b72846 kunit/test: Add example test showing parameterized testing
Use of parameterized testing is documented [1] but such use case
is not present in demo kunit test. Add small subtest for that.

[1] https://kernel.org/doc/html/latest/dev-tools/kunit/usage.html#parameterized-testing

Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: David Gow <davidgow@google.com>
Reviewed-by: David Gow <davidgow@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2023-05-26 08:43:57 -06:00
Noah Goldstein
688eb8191b x86/csum: Improve performance of csum_partial
1) Add special case for len == 40 as that is the hottest value. The
   nets a ~8-9% latency improvement and a ~30% throughput improvement
   in the len == 40 case.

2) Use multiple accumulators in the 64-byte loop. This dramatically
   improves ILP and results in up to a 40% latency/throughput
   improvement (better for more iterations).

Results from benchmarking on Icelake. Times measured with rdtsc()
 len   lat_new   lat_old      r    tput_new  tput_old      r
   8      3.58      3.47  1.032        3.58      3.51  1.021
  16      4.14      4.02  1.028        3.96      3.78  1.046
  24      4.99      5.03  0.992        4.23      4.03  1.050
  32      5.09      5.08  1.001        4.68      4.47  1.048
  40      5.57      6.08  0.916        3.05      4.43  0.690
  48      6.65      6.63  1.003        4.97      4.69  1.059
  56      7.74      7.72  1.003        5.22      4.95  1.055
  64      6.65      7.22  0.921        6.38      6.42  0.994
  96      9.43      9.96  0.946        7.46      7.54  0.990
 128      9.39     12.15  0.773        8.90      8.79  1.012
 200     12.65     18.08  0.699       11.63     11.60  1.002
 272     15.82     23.37  0.677       14.43     14.35  1.005
 440     24.12     36.43  0.662       21.57     22.69  0.951
 952     46.20     74.01  0.624       42.98     53.12  0.809
1024     47.12     78.24  0.602       46.36     58.83  0.788
1552     72.01    117.30  0.614       71.92     96.78  0.743
2048     93.07    153.25  0.607       93.28    137.20  0.680
2600    114.73    194.30  0.590      114.28    179.32  0.637
3608    156.34    268.41  0.582      154.97    254.02  0.610
4096    175.01    304.03  0.576      175.89    292.08  0.602

There is no such thing as a free lunch, however, and the special case
for len == 40 does add overhead to the len != 40 cases. This seems to
amount to be ~5% throughput and slightly less in terms of latency.

Testing:
Part of this change is a new kunit test. The tests check all
alignment X length pairs in [0, 64) X [0, 512).
There are three cases.
    1) Precomputed random inputs/seed. The expected results where
       generated use the generic implementation (which is assumed to be
       non-buggy).
    2) An input of all 1s. The goal of this test is to catch any case
       a carry is missing.
    3) An input that never carries. The goal of this test si to catch
       any case of incorrectly carrying.

More exhaustive tests that test all alignment X length pairs in
[0, 8192) X [0, 8192] on random data are also available here:
https://github.com/goldsteinn/csum-reproduction

The reposity also has the code for reproducing the above benchmark
numbers.

Signed-off-by: Noah Goldstein <goldstein.w.n@gmail.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Link: https://lore.kernel.org/all/20230511011002.935690-1-goldstein.w.n%40gmail.com
2023-05-25 10:55:18 -07:00
David Gow
57e3cded99 kunit: kmalloc_array: Use kunit_add_action()
The kunit_add_action() function is much simpler and cleaner to use that
the full KUnit resource API for simple things like the
kunit_kmalloc_array() functionality.

Replacing it allows us to get rid of a number of helper functions, and
leaves us with no uses of kunit_alloc_resource(), which has some
usability problems and is going to have its behaviour modified in an
upcoming patch.

Note that we need to use kunit_defer_trigger_all() to implement
kunit_kfree().

Reviewed-by: Benjamin Berg <benjamin.berg@intel.com>
Reviewed-by: Maxime Ripard <maxime@cerno.tech>
Tested-by: Maxime Ripard <maxime@cerno.tech>
Signed-off-by: David Gow <davidgow@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2023-05-25 08:53:07 -06:00
David Gow
00e63f8afc kunit: executor_test: Use kunit_add_action()
Now we have the kunit_add_action() function, we can use it to implement
kfree_at_end() and free_subsuite_at_end() without the need for extra
helper functions.

Reviewed-by: Benjamin Berg <benjamin.berg@intel.com>
Reviewed-by: Maxime Ripard <maxime@cerno.tech>
Tested-by: Maxime Ripard <maxime@cerno.tech>
Signed-off-by: David Gow <davidgow@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2023-05-25 08:53:01 -06:00
David Gow
b9dce8a1ed kunit: Add kunit_add_action() to defer a call until test exit
Many uses of the KUnit resource system are intended to simply defer
calling a function until the test exits (be it due to success or
failure). The existing kunit_alloc_resource() function is often used for
this, but was awkward to use (requiring passing NULL init functions, etc),
and returned a resource without incrementing its reference count, which
-- while okay for this use-case -- could cause problems in others.

Instead, introduce a simple kunit_add_action() API: a simple function
(returning nothing, accepting a single void* argument) can be scheduled
to be called when the test exits. Deferred actions are called in the
opposite order to that which they were registered.

This mimics the devres API, devm_add_action(), and also provides
kunit_remove_action(), to cancel a deferred action, and
kunit_release_action() to trigger one early.

This is implemented as a resource under the hood, so the ordering
between resource cleanup and deferred functions is maintained.

Reviewed-by: Benjamin Berg <benjamin.berg@intel.com>
Reviewed-by: Maxime Ripard <maxime@cerno.tech>
Tested-by: Maxime Ripard <maxime@cerno.tech>
Signed-off-by: David Gow <davidgow@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2023-05-25 08:52:55 -06:00
David Howells
3fc40265ae iov_iter: Kill ITER_PIPE
The ITER_PIPE-type iterator was only used by generic_file_splice_read() and
that has been replaced and removed.  This leaves ITER_PIPE unused - so
remove it too.

Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Christian Brauner <brauner@kernel.org>
cc: Jens Axboe <axboe@kernel.dk>
cc: Al Viro <viro@zeniv.linux.org.uk>
cc: David Hildenbrand <david@redhat.com>
cc: John Hubbard <jhubbard@nvidia.com>
cc: linux-mm@kvack.org
cc: linux-block@vger.kernel.org
cc: linux-fsdevel@vger.kernel.org
Link: https://lore.kernel.org/r/20230522135018.2742245-31-dhowells@redhat.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-05-24 08:42:17 -06:00
Tetsuo Handa
eb799279fb debugobjects: Don't wake up kswapd from fill_pool()
syzbot is reporting a lockdep warning in fill_pool() because the allocation
from debugobjects is using GFP_ATOMIC, which is (__GFP_HIGH | __GFP_KSWAPD_RECLAIM)
and therefore tries to wake up kswapd, which acquires kswapd_wait::lock.

Since fill_pool() might be called with arbitrary locks held, fill_pool()
should not assume that acquiring kswapd_wait::lock is safe.

Use __GFP_HIGH instead and remove __GFP_NORETRY as it is pointless for
!__GFP_DIRECT_RECLAIM allocation.

Fixes: 3ac7fe5a4a ("infrastructure to debug (dynamic) objects")
Reported-by: syzbot <syzbot+fe0c72f0ccbb93786380@syzkaller.appspotmail.com>
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/6577e1fa-b6ee-f2be-2414-a2b51b1c5e30@I-love.SAKURA.ne.jp
Closes: https://syzkaller.appspot.com/bug?extid=fe0c72f0ccbb93786380
2023-05-22 14:52:58 +02:00
Herbert Xu
6c19f3bfff crypto: lib/sha256 - Use generic code from sha256_base
Instead of duplicating the sha256 block processing code, reuse
the common code from crypto/sha256_base.h.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2023-05-19 16:45:43 +08:00
Herbert Xu
70d391a863 crypto: lib/sha256 - Remove redundant and unused sha224_update
The function sha224_update is exactly the same as sha256_update.
Moreover it's not even used in the kernel so it can be removed.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2023-05-19 16:45:43 +08:00
Linus Torvalds
f4a8871f9f Eight hotfixes. Four are cc:stable, the other four are for post-6.4
issues, or aren't considered suitable for backporting.
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCZGasdgAKCRDdBJ7gKXxA
 jpTFAQC2WlV6CbEsy46jJK2XzCypzLLxHiRmVCw5pmAucki4awEAjllEuzK6vw61
 ytBZ/O2sMB5AbCf31c6UYxgLS32oyAo=
 =IDcO
 -----END PGP SIGNATURE-----

Merge tag 'mm-hotfixes-stable-2023-05-18-15-52' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull misc fixes from Andrew Morton:
 "Eight hotfixes. Four are cc:stable, the other four are for post-6.4
  issues, or aren't considered suitable for backporting"

* tag 'mm-hotfixes-stable-2023-05-18-15-52' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
  MAINTAINERS: Cleanup Arm Display IP maintainers
  MAINTAINERS: repair pattern in DIALOG SEMICONDUCTOR DRIVERS
  nilfs2: fix use-after-free bug of nilfs_root in nilfs_evict_inode()
  mm: fix zswap writeback race condition
  mm: kfence: fix false positives on big endian
  zsmalloc: move LRU update from zs_map_object() to zs_malloc()
  mm: shrinkers: fix race condition on debugfs cleanup
  maple_tree: make maple state reusable after mas_empty_area()
2023-05-18 17:06:04 -07:00
Tejun Heo
6363845005 workqueue: Report work funcs that trigger automatic CPU_INTENSIVE mechanism
Workqueue now automatically marks per-cpu work items that hog CPU for too
long as CPU_INTENSIVE, which excludes them from concurrency management and
prevents stalling other concurrency-managed work items. If a work function
keeps running over the thershold, it likely needs to be switched to use an
unbound workqueue.

This patch adds a debug mechanism which tracks the work functions which
trigger the automatic CPU_INTENSIVE mechanism and report them using
pr_warn() with exponential backoff.

v3: Documentation update.

v2: Drop bouncing to kthread_worker for printing messages. It was to avoid
    introducing circular locking dependency through printk but not effective
    as it still had pool lock -> wci_lock -> printk -> pool lock loop. Let's
    just print directly using printk_deferred().

Signed-off-by: Tejun Heo <tj@kernel.org>
Suggested-by: Peter Zijlstra <peterz@infradead.org>
2023-05-17 17:02:08 -10:00
Peng Zhang
0257d9908d maple_tree: make maple state reusable after mas_empty_area()
Make mas->min and mas->max point to a node range instead of a leaf entry
range.  This allows mas to still be usable after mas_empty_area() returns.
Users would get unexpected results from other operations on the maple
state after calling the affected function.

For example, x86 MAP_32BIT mmap() acts as if there is no suitable gap when
there should be one.

Link: https://lkml.kernel.org/r/20230505145829.74574-1-zhangpeng.00@bytedance.com
Fixes: 54a611b605 ("Maple Tree: add new data structure")
Signed-off-by: Peng Zhang <zhangpeng.00@bytedance.com>
Reported-by: "Edgecombe, Rick P" <rick.p.edgecombe@intel.com>
Reported-by: Tad <support@spotco.us>
Reported-by: Michael Keyes <mgkeyes@vigovproductions.net>
  Link: https://lore.kernel.org/linux-mm/32f156ba80010fd97dbaf0a0cdfc84366608624d.camel@intel.com/
  Link: https://lore.kernel.org/linux-mm/e6108286ac025c268964a7ead3aab9899f9bc6e9.camel@spotco.us/
Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Tested-by: Rick Edgecombe <rick.p.edgecombe@intel.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-05-17 15:24:32 -07:00
Nick Desaulniers
08e4044243 ubsan: remove cc-option test for UBSAN_TRAP
-fsanitize-undefined-trap-on-error has been supported since GCC 5.1 and
Clang 3.2.  The minimum supported version of these according to
Documentation/process/changes.rst is 5.1 and 11.0.0 respectively. Drop
this cc-option check.

Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
Reviewed-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20230407215406.768464-1-ndesaulniers@google.com
2023-05-17 12:01:54 -07:00
Kees Cook
3bf301e1ab string: Add Kunit tests for strcat() family
Add tests to make sure the strcat() family of functions behave
correctly.

Signed-off-by: Kees Cook <keescook@chromium.org>
2023-05-16 14:08:02 -07:00
Kees Cook
a9dc8d0442 fortify: Allow KUnit test to build without FORTIFY
In order for CI systems to notice all the skipped tests related to
CONFIG_FORTIFY_SOURCE, allow the FORTIFY_SOURCE KUnit tests to build
with or without CONFIG_FORTIFY_SOURCE.

Signed-off-by: Kees Cook <keescook@chromium.org>
2023-05-16 14:07:49 -07:00
Kees Cook
2d47c6956a ubsan: Tighten UBSAN_BOUNDS on GCC
The use of -fsanitize=bounds on GCC will ignore some trailing arrays,
leaving a gap in coverage. Switch to using -fsanitize=bounds-strict to
match Clang's stricter behavior.

Cc: Marco Elver <elver@google.com>
Cc: Masahiro Yamada <masahiroy@kernel.org>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Nicolas Schier <nicolas@fjasle.eu>
Cc: Tom Rix <trix@redhat.com>
Cc: Josh Poimboeuf <jpoimboe@kernel.org>
Cc: Miroslav Benes <mbenes@suse.cz>
Cc: linux-kbuild@vger.kernel.org
Cc: llvm@lists.linux.dev
Signed-off-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20230405022356.gonna.338-kees@kernel.org
2023-05-16 13:57:14 -07:00
David Gow
a5ce66ad29 kunit: example: Provide example exit functions
Add an example .exit and .suite_exit function to the KUnit example
suite. Given exit functions are a bit more subtle than init functions
(due to running in a different kthread, and running even after tests or
test init functions fail), providing an easy place to experiment with
them is useful.

Reviewed-by: Rae Moar <rmoar@google.com>
Signed-off-by: David Gow <davidgow@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2023-05-11 18:17:41 -06:00
David Gow
55e8c1b49a kunit: Always run cleanup from a test kthread
KUnit tests run in a kthread, with the current->kunit_test pointer set
to the test's context. This allows the kunit_get_current_test() and
kunit_fail_current_test() macros to work. Normally, this pointer is
still valid during test shutdown (i.e., the suite->exit function, and
any resource cleanup). However, if the test has exited early (e.g., due
to a failed assertion), the cleanup is done in the parent KUnit thread,
which does not have an active context.

Instead, in the event test terminates early, run the test exit and
cleanup from a new 'cleanup' kthread, which sets current->kunit_test,
and better isolates the rest of KUnit from issues which arise in test
cleanup.

If a test cleanup function itself aborts (e.g., due to an assertion
failing), there will be no further attempts to clean up: an error will
be logged and the test failed. For example:
	 # example_simple_test: test aborted during cleanup. continuing without cleaning up

This should also make it easier to get access to the KUnit context,
particularly from within resource cleanup functions, which may, for
example, need access to data in test->priv.

Reviewed-by: Benjamin Berg <benjamin.berg@intel.com>
Reviewed-by: Maxime Ripard <maxime@cerno.tech>
Tested-by: Maxime Ripard <maxime@cerno.tech>
Signed-off-by: David Gow <davidgow@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2023-05-11 18:17:24 -06:00
Linus Torvalds
6e27831b91 Networking fixes for 6.4-rc2, including fixes from netfilter
Current release - regressions:
 
   - mtk_eth_soc: fix NULL pointer dereference
 
 Previous releases - regressions:
 
   - core:
     - skb_partial_csum_set() fix against transport header magic value
     - fix load-tearing on sk->sk_stamp in sock_recv_cmsgs().
     - annotate sk->sk_err write from do_recvmmsg()
     - add vlan_get_protocol_and_depth() helper
 
   - netlink: annotate accesses to nlk->cb_running
 
   - netfilter: always release netdev hooks from notifier
 
 Previous releases - always broken:
 
   - core: deal with most data-races in sk_wait_event()
 
   - netfilter: fix possible bug_on with enable_hooks=1
 
   - eth: bonding: fix send_peer_notif overflow
 
   - eth: xpcs: fix incorrect number of interfaces
 
   - eth: ipvlan: fix out-of-bounds caused by unclear skb->cb
 
   - eth: stmmac: Initialize MAC_ONEUS_TIC_COUNTER register
 
 Signed-off-by: Paolo Abeni <pabeni@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 
 iQJGBAABCAAwFiEEg1AjqC77wbdLX2LbKSR5jcyPE6QFAmRcxawSHHBhYmVuaUBy
 ZWRoYXQuY29tAAoJECkkeY3MjxOkv6wQAJgfOBlDAkZNKHzwtMuFiLxECeEMWY9h
 wJCyiq0qXnz9p5ZqjdmTmA8B+jUp9VkpgN5Z3lid5hXDfzDrvXL1KGZW4pc4ooz9
 GUzrp0EUzO5UsyrlZRS9vJ9mbCGN5M1ZWtWH93g8OzGJPRnLs0Q/Tr4IFTBVKzVb
 GmJPy/ZYWYDjnvx3BgewRDuYeH3Rt9lsIt4Pxq/E+D8W3ypvVM0m3GvrO5eEzMeu
 EfeilAdmJGJUufeoGguKt0hheqILS3kNCjQO25XS2Lq1OqetnR/wqTwXaaVxL2du
 Eb2ca7wKkihDpl2l8bQ3ss6vqM0HEpZ63Y2PJaNBS8ASdLsMq4n2L6j2JMfT8hWY
 RG3nJS7F2UFLyYmCJjNL1/I+Z9XeMyFKnHORzHK1dAkMlhd+8NauKWAxdjlxMbxX
 p1msyTl54bG0g6FrU/zAirCWNAAZYCPdZG/XvA/2Jj9mdy64OlGlv/QdJvfjcx+C
 L6nkwZfwXU7QUwKeeTfP8abte2SLrXIxkJrnNEAntPnFOSmd16+/yvQ8JVlbWTMd
 JugJrSAIxjOglIr/1fsnUuV+Ab+JDYQv/wkoyzvtcY2tjhTAHzgmTwwSfeYiCTJE
 rEbjyVvVgMcLTUIk/R9QC5/k6nX/7/KRDHxPOMBX4boOsuA0ARVjzt8uKRvv/7cS
 dRV98RwvCKvD
 =MoPD
 -----END PGP SIGNATURE-----

Merge tag 'net-6.4-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net

Pull networking fixes from Paolo Abeni:
 "Including fixes from netfilter.

  Current release - regressions:

   - mtk_eth_soc: fix NULL pointer dereference

  Previous releases - regressions:

   - core:
      - skb_partial_csum_set() fix against transport header magic value
      - fix load-tearing on sk->sk_stamp in sock_recv_cmsgs().
      - annotate sk->sk_err write from do_recvmmsg()
      - add vlan_get_protocol_and_depth() helper

   - netlink: annotate accesses to nlk->cb_running

   - netfilter: always release netdev hooks from notifier

  Previous releases - always broken:

   - core: deal with most data-races in sk_wait_event()

   - netfilter: fix possible bug_on with enable_hooks=1

   - eth: bonding: fix send_peer_notif overflow

   - eth: xpcs: fix incorrect number of interfaces

   - eth: ipvlan: fix out-of-bounds caused by unclear skb->cb

   - eth: stmmac: Initialize MAC_ONEUS_TIC_COUNTER register"

* tag 'net-6.4-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (31 commits)
  af_unix: Fix data races around sk->sk_shutdown.
  af_unix: Fix a data race of sk->sk_receive_queue->qlen.
  net: datagram: fix data-races in datagram_poll()
  net: mscc: ocelot: fix stat counter register values
  ipvlan:Fix out-of-bounds caused by unclear skb->cb
  docs: networking: fix x25-iface.rst heading & index order
  gve: Remove the code of clearing PBA bit
  tcp: add annotations around sk->sk_shutdown accesses
  net: add vlan_get_protocol_and_depth() helper
  net: pcs: xpcs: fix incorrect number of interfaces
  net: deal with most data-races in sk_wait_event()
  net: annotate sk->sk_err write from do_recvmmsg()
  netlink: annotate accesses to nlk->cb_running
  kselftest: bonding: add num_grat_arp test
  selftests: forwarding: lib: add netns support for tc rule handle stats get
  Documentation: bonding: fix the doc of peer_notif_delay
  bonding: fix send_peer_notif overflow
  net: ethernet: mtk_eth_soc: fix NULL pointer dereference
  selftests: nft_flowtable.sh: check ingress/egress chain too
  selftests: nft_flowtable.sh: monitor result file sizes
  ...
2023-05-11 08:42:47 -05:00
Roy Novich
162bd18eb5 linux/dim: Do nothing if no time delta between samples
Add return value for dim_calc_stats. This is an indication for the
caller if curr_stats was assigned by the function. Avoid using
curr_stats uninitialized over {rdma/net}_dim, when no time delta between
samples. Coverity reported this potential use of an uninitialized
variable.

Fixes: 4c4dbb4a73 ("net/mlx5e: Move dynamic interrupt coalescing code to include/linux")
Fixes: cb3c7fd4f8 ("net/mlx5e: Support adaptive RX coalescing")
Signed-off-by: Roy Novich <royno@nvidia.com>
Reviewed-by: Aya Levin <ayal@nvidia.com>
Reviewed-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Reviewed-by: Michal Kubiak <michal.kubiak@intel.com>
Link: https://lore.kernel.org/r/20230507135743.138993-1-tariqt@nvidia.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-05-09 11:06:45 +02:00
Linus Torvalds
17784de648 A single fix for debugobjects:
The recent fix to ensure atomicity of lookup and allocation inadvertently
   broke the pool refill mechanism, so that debugobject OOMs now in certain
   situations. The reason is that the functions which got updated no longer
   invoke debug_objecs_init(), which is now the only place to care about
   refilling the tracking object pool.
 
   Restore the original behaviour by adding explicit refill opportunities to
   those places.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAmRWoFATHHRnbHhAbGlu
 dXRyb25peC5kZQAKCRCmGPVMDXSYocNID/9e1fU2Nf32woHokzBGgARKb69Kl/hb
 6yVdMpOnZtxmluheJLnqCWI4WbAB6NjulEMFv+KkwRZ+QndBKVEo8NMZ9RjbXDBb
 HEehI6DvsqRDjaytOLEZj+/8afcZ7bUBKk7JuUK+y5B1gZViazfp1eF3hpiKsIV9
 aowpH6c9lL/9sPgFe2qpp21MUmNTUQbHpz0vbYC0QjqSEU2zTlu8p//P6VLA3xpl
 qoh8Gu5qo/L8lPspN2v8TRVXdiqH67J+KpbGO9IuUQWYPQqFdc6WchhHwomAk8nr
 Nyn9Q1Lred96pTdW3B0Cumnxuf0VPt4X/uQxPSP0kCo/h0Q0Mh6fq59Z66H/Mhjk
 TAvM52w3VzfTmQB6WgaCD1HyRRqIK5Nd+XqXnenCkHN4kjmGXNLg9MUGxua5CVgF
 iQTSRYtN18rF9OevDOFGzsEig2RN1JFi9MnJg9Q/L8SoDUn5ZUfhPaSA/HcOBnSe
 m+9aeRxlb0hAP7+upFKsJkDYzJTtbP6LSx6qqZMyQWqYdsUVHpdiPtJpXb7mLIqQ
 wo83i/Ohq8+dF6ykd89ZcKJ8vLBrnE1rPFKKmvS5ov1eRt/hZbtR3tmMviCNna0M
 2nrJE2fKClbs8Dmc6NNboJdz51ASgZEi32XmdFkATiuZqiD1id7ne0f85ju7DHD9
 sOjfo4ZtIKD/Fw==
 =/0Kc
 -----END PGP SIGNATURE-----

Merge tag 'core-debugobjects-2023-05-06' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull debugobjects fix from Thomas Gleixner:
 "A single fix for debugobjects:

  The recent fix to ensure atomicity of lookup and allocation
  inadvertently broke the pool refill mechanism, so that debugobject
  OOMs now in certain situations. The reason is that the functions which
  got updated no longer invoke debug_objecs_init(), which is now the
  only place to care about refilling the tracking object pool.

  Restore the original behaviour by adding explicit refill opportunities
  to those places"

* tag 'core-debugobjects-2023-05-06' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  debugobject: Ensure pool refill (again)
2023-05-07 11:04:26 -07:00
Linus Torvalds
15fb96a35d - Some DAMON cleanups from Kefeng Wang
- Some KSM work from David Hildenbrand, to make the PR_SET_MEMORY_MERGE
   ioctl's behavior more similar to KSM's behavior.
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCZFLsxAAKCRDdBJ7gKXxA
 jl8yAQCqjstPsOULf9QN0z4bGAUhY+Wj4ERz1jbKSIuhFCJWiQEAgQvgRXObKjmi
 OtUB0Ek4CMDCQzbyIQ1Bhp3kxi6+Jgs=
 =AbyC
 -----END PGP SIGNATURE-----

Merge tag 'mm-stable-2023-05-03-16-22' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull more MM updates from Andrew Morton:

 - Some DAMON cleanups from Kefeng Wang

 - Some KSM work from David Hildenbrand, to make the PR_SET_MEMORY_MERGE
   ioctl's behavior more similar to KSM's behavior.

[ Andrew called these "final", but I suspect we'll have a series fixing
  up the fact that the last commit in the dmapools series in the
  previous pull seems to have unintentionally just reverted all the
  other commits in the same series..   - Linus ]

* tag 'mm-stable-2023-05-03-16-22' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
  mm: hwpoison: coredump: support recovery from dump_user_range()
  mm/page_alloc: add some comments to explain the possible hole in __pageblock_pfn_to_page()
  mm/ksm: move disabling KSM from s390/gmap code to KSM code
  selftests/ksm: ksm_functional_tests: add prctl unmerge test
  mm/ksm: unmerge and clear VM_MERGEABLE when setting PR_SET_MEMORY_MERGE=0
  mm/damon/paddr: fix missing folio_sz update in damon_pa_young()
  mm/damon/paddr: minor refactor of damon_pa_mark_accessed_or_deactivate()
  mm/damon/paddr: minor refactor of damon_pa_pageout()
2023-05-04 13:09:43 -07:00
Kefeng Wang
245f092268 mm: hwpoison: coredump: support recovery from dump_user_range()
dump_user_range() is used to copy the user page to a coredump file, but if
a hardware memory error occurred during copy, which called from
__kernel_write_iter() in dump_user_range(), it crashes,

  CPU: 112 PID: 7014 Comm: mca-recover Not tainted 6.3.0-rc2 #425

  pc : __memcpy+0x110/0x260
  lr : _copy_from_iter+0x3bc/0x4c8
  ...
  Call trace:
   __memcpy+0x110/0x260
   copy_page_from_iter+0xcc/0x130
   pipe_write+0x164/0x6d8
   __kernel_write_iter+0x9c/0x210
   dump_user_range+0xc8/0x1d8
   elf_core_dump+0x308/0x368
   do_coredump+0x2e8/0xa40
   get_signal+0x59c/0x788
   do_signal+0x118/0x1f8
   do_notify_resume+0xf0/0x280
   el0_da+0x130/0x138
   el0t_64_sync_handler+0x68/0xc0
   el0t_64_sync+0x188/0x190

Generally, the '->write_iter' of file ops will use copy_page_from_iter()
and copy_page_from_iter_atomic(), change memcpy() to copy_mc_to_kernel()
in both of them to handle #MC during source read, which stop coredump
processing and kill the task instead of kernel panic, but the source
address may not always a user address, so introduce a new copy_mc flag in
struct iov_iter{} to indicate that the iter could do a safe memory copy,
also introduce the helpers to set/cleck the flag, for now, it's only used
in coredump's dump_user_range(), but it could expand to any other
scenarios to fix the similar issue.

Link: https://lkml.kernel.org/r/20230417045323.11054-1-wangkefeng.wang@huawei.com
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Miaohe Lin <linmiaohe@huawei.com>
Cc: Naoya Horiguchi <naoya.horiguchi@nec.com>
Cc: Tong Tiangen <tongtiangen@huawei.com>
Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-05-02 17:21:50 -07:00
Peter Zijlstra
0cce06ba85 debugobjects,locking: Annotate debug_object_fill_pool() wait type violation
There is an explicit wait-type violation in debug_object_fill_pool()
for PREEMPT_RT=n kernels which allows them to more easily fill the
object pool and reduce the chance of allocation failures.

Lockdep's wait-type checks are designed to check the PREEMPT_RT
locking rules even for PREEMPT_RT=n kernels and object to this, so
create a lockdep annotation to allow this to stand.

Specifically, create a 'lock' type that overrides the inner wait-type
while it is held -- allowing one to temporarily raise it, such that
the violation is hidden.

Reported-by: Vlastimil Babka <vbabka@suse.cz>
Reported-by: Qi Zheng <zhengqi.arch@bytedance.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Qi Zheng <zhengqi.arch@bytedance.com>
Link: https://lkml.kernel.org/r/20230429100614.GA1489784@hirez.programming.kicks-ass.net
2023-05-02 14:48:14 +02:00
Thomas Gleixner
0af462f19e debugobject: Ensure pool refill (again)
The recent fix to ensure atomicity of lookup and allocation inadvertently
broke the pool refill mechanism.

Prior to that change debug_objects_activate() and debug_objecs_assert_init()
invoked debug_objecs_init() to set up the tracking object for statically
initialized objects. That's not longer the case and debug_objecs_init() is
now the only place which does pool refills.

Depending on the number of statically initialized objects this can be
enough to actually deplete the pool, which was observed by Ido via a
debugobjects OOM warning.

Restore the old behaviour by adding explicit refill opportunities to
debug_objects_activate() and debug_objecs_assert_init().

Fixes: 63a759694e ("debugobject: Prevent init race with static objects")
Reported-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Ido Schimmel <idosch@nvidia.com>
Link: https://lore.kernel.org/r/871qk05a9d.ffs@tglx
2023-05-02 10:07:04 +02:00
Linus Torvalds
10de638d8e s390 updates for the 6.4 merge window
- Add support for stackleak feature. Also allow specifying
   architecture-specific stackleak poison function to enable faster
   implementation. On s390, the mvc-based implementation helps decrease
   typical overhead from a factor of 3 to just 25%
 
 - Convert all assembler files to use SYM* style macros, deprecating the
   ENTRY() macro and other annotations. Select ARCH_USE_SYM_ANNOTATIONS
 
 - Improve KASLR to also randomize module and special amode31 code
   base load addresses
 
 - Rework decompressor memory tracking to support memory holes and improve
   error handling
 
 - Add support for protected virtualization AP binding
 
 - Add support for set_direct_map() calls
 
 - Implement set_memory_rox() and noexec module_alloc()
 
 - Remove obsolete overriding of mem*() functions for KASAN
 
 - Rework kexec/kdump to avoid using nodat_stack to call purgatory
 
 - Convert the rest of the s390 code to use flexible-array member instead
   of a zero-length array
 
 - Clean up uaccess inline asm
 
 - Enable ARCH_HAS_MEMBARRIER_SYNC_CORE
 
 - Convert to using CONFIG_FUNCTION_ALIGNMENT and enable
   DEBUG_FORCE_FUNCTION_ALIGN_64B
 
 - Resolve last_break in userspace fault reports
 
 - Simplify one-level sysctl registration
 
 - Clean up branch prediction handling
 
 - Rework CPU counter facility to retrieve available counter sets just
   once
 
 - Other various small fixes and improvements all over the code
 -----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCAAdFiEE3QHqV+H2a8xAv27vjYWKoQLXFBgFAmRM8pwACgkQjYWKoQLX
 FBjV1AgAlvAhu1XkwOdwqdT4GqE8pcN4XXzydog1MYihrSO2PdgWAxpEW7o2QURN
 W+3xa6RIqt7nX2YBiwTanMZ12TYaFY7noGl3eUpD/NhueprweVirVl7VZUEuRoW/
 j0mbx77xsVzLfuDFxkpVwE6/j+tTO78kLyjUHwcN9rFVUaL7/orJneDJf+V8fZG0
 sHLOv0aljF7Jr2IIkw82lCmW/vdk7k0dACWMXK2kj1H3dIK34B9X4AdKDDf/WKXk
 /OSElBeZ93tSGEfNDRIda6iR52xocROaRnQAaDtargKFl9VO0/dN9ADxO+SLNHjN
 pFE/9VD6xT/xo4IuZZh/Z3TcYfiLvA==
 =Geqx
 -----END PGP SIGNATURE-----

Merge tag 's390-6.4-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux

Pull s390 updates from Vasily Gorbik:

 - Add support for stackleak feature. Also allow specifying
   architecture-specific stackleak poison function to enable faster
   implementation. On s390, the mvc-based implementation helps decrease
   typical overhead from a factor of 3 to just 25%

 - Convert all assembler files to use SYM* style macros, deprecating the
   ENTRY() macro and other annotations. Select ARCH_USE_SYM_ANNOTATIONS

 - Improve KASLR to also randomize module and special amode31 code base
   load addresses

 - Rework decompressor memory tracking to support memory holes and
   improve error handling

 - Add support for protected virtualization AP binding

 - Add support for set_direct_map() calls

 - Implement set_memory_rox() and noexec module_alloc()

 - Remove obsolete overriding of mem*() functions for KASAN

 - Rework kexec/kdump to avoid using nodat_stack to call purgatory

 - Convert the rest of the s390 code to use flexible-array member
   instead of a zero-length array

 - Clean up uaccess inline asm

 - Enable ARCH_HAS_MEMBARRIER_SYNC_CORE

 - Convert to using CONFIG_FUNCTION_ALIGNMENT and enable
   DEBUG_FORCE_FUNCTION_ALIGN_64B

 - Resolve last_break in userspace fault reports

 - Simplify one-level sysctl registration

 - Clean up branch prediction handling

 - Rework CPU counter facility to retrieve available counter sets just
   once

 - Other various small fixes and improvements all over the code

* tag 's390-6.4-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (118 commits)
  s390/stackleak: provide fast __stackleak_poison() implementation
  stackleak: allow to specify arch specific stackleak poison function
  s390: select ARCH_USE_SYM_ANNOTATIONS
  s390/mm: use VM_FLUSH_RESET_PERMS in module_alloc()
  s390: wire up memfd_secret system call
  s390/mm: enable ARCH_HAS_SET_DIRECT_MAP
  s390/mm: use BIT macro to generate SET_MEMORY bit masks
  s390/relocate_kernel: adjust indentation
  s390/relocate_kernel: use SYM* macros instead of ENTRY(), etc.
  s390/entry: use SYM* macros instead of ENTRY(), etc.
  s390/purgatory: use SYM* macros instead of ENTRY(), etc.
  s390/kprobes: use SYM* macros instead of ENTRY(), etc.
  s390/reipl: use SYM* macros instead of ENTRY(), etc.
  s390/head64: use SYM* macros instead of ENTRY(), etc.
  s390/earlypgm: use SYM* macros instead of ENTRY(), etc.
  s390/mcount: use SYM* macros instead of ENTRY(), etc.
  s390/crc32le: use SYM* macros instead of ENTRY(), etc.
  s390/crc32be: use SYM* macros instead of ENTRY(), etc.
  s390/crypto,chacha: use SYM* macros instead of ENTRY(), etc.
  s390/amode31: use SYM* macros instead of ENTRY(), etc.
  ...
2023-04-30 11:43:31 -07:00
Linus Torvalds
d579c468d7 tracing updates for 6.4:
- User events are finally ready!
   After lots of collaboration between various parties, we finally locked
   down on a stable interface for user events that can also work with user
   space only tracing. This is implemented by telling the kernel (or user
   space library, but that part is user space only and not part of this
   patch set), where the variable is that the application uses to know if
   something is listening to the trace. There's also an interface to tell
   the kernel about these events, which will show up in the
   /sys/kernel/tracing/events/user_events/ directory, where it can be
    enabled. When it's enabled, the kernel will update the variable, to tell
   the application to start writing to the kernel.
   See https://lwn.net/Articles/927595/
 
 - Cleaned up the direct trampolines code to simplify arm64 addition of
   direct trampolines. Direct trampolines use the ftrace interface but
   instead of jumping to the ftrace trampoline, applications (mostly BPF)
   can register their own trampoline for performance reasons.
 
 - Some updates to the fprobe infrastructure. fprobes are more efficient than
   kprobes, as it does not need to save all the registers that kprobes on
   ftrace do. More work needs to be done before the fprobes will be exposed
   as dynamic events.
 
 - More updates to references to the obsolete path of
   /sys/kernel/debug/tracing for the new /sys/kernel/tracing path.
 
 - Add a seq_buf_do_printk() helper to seq_bufs, to print a large buffer line
   by line instead of all at once. There's users in production kernels that
   have a large data dump that originally used printk() directly, but the
   data dump was larger than what printk() allowed as a single print.
   Using seq_buf() to do the printing fixes that.
 
 - Add /sys/kernel/tracing/touched_functions that shows all functions that
   was every traced by ftrace or a direct trampoline. This is used for
   debugging issues where a traced function could have caused a crash by
   a bpf program or live patching.
 
 - Add a "fields" option that is similar to "raw" but outputs the fields of
   the events. It's easier to read by humans.
 
 - Some minor fixes and clean ups.
 -----BEGIN PGP SIGNATURE-----
 
 iIoEABYIADIWIQRRSw7ePDh/lE+zeZMp5XQQmuv6qgUCZEr36xQccm9zdGVkdEBn
 b29kbWlzLm9yZwAKCRAp5XQQmuv6quZHAQCzuqnn2S8DsPd3Sy1vKIYaj0uajW5D
 Kz1oUJH4F0H7kgEA8XwXkdtfKpOXWc/ZH4LWfL7Orx2wJZJQMV9dVqEPDAE=
 =w0Z1
 -----END PGP SIGNATURE-----

Merge tag 'trace-v6.4' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace

Pull tracing updates from Steven Rostedt:

 - User events are finally ready!

   After lots of collaboration between various parties, we finally
   locked down on a stable interface for user events that can also work
   with user space only tracing.

   This is implemented by telling the kernel (or user space library, but
   that part is user space only and not part of this patch set), where
   the variable is that the application uses to know if something is
   listening to the trace.

   There's also an interface to tell the kernel about these events,
   which will show up in the /sys/kernel/tracing/events/user_events/
   directory, where it can be enabled.

   When it's enabled, the kernel will update the variable, to tell the
   application to start writing to the kernel.

   See https://lwn.net/Articles/927595/

 - Cleaned up the direct trampolines code to simplify arm64 addition of
   direct trampolines.

   Direct trampolines use the ftrace interface but instead of jumping to
   the ftrace trampoline, applications (mostly BPF) can register their
   own trampoline for performance reasons.

 - Some updates to the fprobe infrastructure. fprobes are more efficient
   than kprobes, as it does not need to save all the registers that
   kprobes on ftrace do. More work needs to be done before the fprobes
   will be exposed as dynamic events.

 - More updates to references to the obsolete path of
   /sys/kernel/debug/tracing for the new /sys/kernel/tracing path.

 - Add a seq_buf_do_printk() helper to seq_bufs, to print a large buffer
   line by line instead of all at once.

   There are users in production kernels that have a large data dump
   that originally used printk() directly, but the data dump was larger
   than what printk() allowed as a single print.

   Using seq_buf() to do the printing fixes that.

 - Add /sys/kernel/tracing/touched_functions that shows all functions
   that was every traced by ftrace or a direct trampoline. This is used
   for debugging issues where a traced function could have caused a
   crash by a bpf program or live patching.

 - Add a "fields" option that is similar to "raw" but outputs the fields
   of the events. It's easier to read by humans.

 - Some minor fixes and clean ups.

* tag 'trace-v6.4' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace: (41 commits)
  ring-buffer: Sync IRQ works before buffer destruction
  tracing: Add missing spaces in trace_print_hex_seq()
  ring-buffer: Ensure proper resetting of atomic variables in ring_buffer_reset_online_cpus
  recordmcount: Fix memory leaks in the uwrite function
  tracing/user_events: Limit max fault-in attempts
  tracing/user_events: Prevent same address and bit per process
  tracing/user_events: Ensure bit is cleared on unregister
  tracing/user_events: Ensure write index cannot be negative
  seq_buf: Add seq_buf_do_printk() helper
  tracing: Fix print_fields() for __dyn_loc/__rel_loc
  tracing/user_events: Set event filter_type from type
  ring-buffer: Clearly check null ptr returned by rb_set_head_page()
  tracing: Unbreak user events
  tracing/user_events: Use print_format_fields() for trace output
  tracing/user_events: Align structs with tabs for readability
  tracing/user_events: Limit global user_event count
  tracing/user_events: Charge event allocs to cgroups
  tracing/user_events: Update documentation for ABI
  tracing/user_events: Use write ABI in example
  tracing/user_events: Add ABI self-test
  ...
2023-04-28 15:57:53 -07:00
Linus Torvalds
f20730efbd SMP cross-CPU function-call updates for v6.4:
- Remove diagnostics and adjust config for CSD lock diagnostics
 
  - Add a generic IPI-sending tracepoint, as currently there's no easy
    way to instrument IPI origins: it's arch dependent and for some
    major architectures it's not even consistently available.
 
 Signed-off-by: Ingo Molnar <mingo@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAmRK438RHG1pbmdvQGtl
 cm5lbC5vcmcACgkQEnMQ0APhK1jJ5Q/5AZ0HGpyqwdFK8GmGznyu5qjP5HwV9pPq
 gZQScqSy4tZEeza4TFMi83CoXSg9uJ7GlYJqqQMKm78LGEPomnZtXXC7oWvTA9M5
 M/jAvzytmvZloSCXV6kK7jzSejMHhag97J/BjTYhZYQpJ9T+hNC87XO6J6COsKr9
 lPIYqkFrIkQNr6B0U11AQfFejRYP1ics2fnbnZL86G/zZAc6x8EveM3KgSer2iHl
 KbrO+xcYyGY8Ef9P2F72HhEGFfM3WslpT1yzqR3sm4Y+fuMG0oW3qOQuMJx0ZhxT
 AloterY0uo6gJwI0P9k/K4klWgz81Tf/zLb0eBAtY2uJV9Fo3YhPHuZC7jGPGAy3
 JusW2yNYqc8erHVEMAKDUsl/1KN4TE2uKlkZy98wno+KOoMufK5MA2e2kPPqXvUi
 Jk9RvFolnWUsexaPmCftti0OCv3YFiviVAJ/t0pchfmvvJA2da0VC9hzmEXpLJVF
 25nBTV/1uAOrWvOpCyo3ElrC2CkQVkFmK5rXMDdvf6ib0Nid4vFcCkCSLVfu+ePB
 11mi7QYro+CcnOug1K+yKogUDmsZgV/u1kUwgQzTIpZ05Kkb49gUiXw9L2RGcBJh
 yoDoiI66KPR7PWQ2qBdQoXug4zfEEtWG0O9HNLB0FFRC3hu7I+HHyiUkBWs9jasK
 PA5+V7HcQRk=
 =Wp7f
 -----END PGP SIGNATURE-----

Merge tag 'smp-core-2023-04-27' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull SMP cross-CPU function-call updates from Ingo Molnar:

 - Remove diagnostics and adjust config for CSD lock diagnostics

 - Add a generic IPI-sending tracepoint, as currently there's no easy
   way to instrument IPI origins: it's arch dependent and for some major
   architectures it's not even consistently available.

* tag 'smp-core-2023-04-27' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  trace,smp: Trace all smp_function_call*() invocations
  trace: Add trace_ipi_send_cpu()
  sched, smp: Trace smp callback causing an IPI
  smp: reword smp call IPI comment
  treewide: Trace IPIs sent via smp_send_reschedule()
  irq_work: Trace self-IPIs sent via arch_irq_work_raise()
  smp: Trace IPIs sent via arch_send_call_function_ipi_mask()
  sched, smp: Trace IPIs sent via send_call_function_single_ipi()
  trace: Add trace_ipi_send_cpumask()
  kernel/smp: Make csdlock_debug= resettable
  locking/csd_lock: Remove per-CPU data indirection from CSD lock debugging
  locking/csd_lock: Remove added data from CSD lock debugging
  locking/csd_lock: Add Kconfig option for csd_debug default
2023-04-28 15:03:43 -07:00
Linus Torvalds
33afd4b763 Mainly singleton patches all over the place. Series of note are:
- updates to scripts/gdb from Glenn Washburn
 
 - kexec cleanups from Bjorn Helgaas
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCZEr+6wAKCRDdBJ7gKXxA
 jn4NAP4u/hj/kR2dxYehcVLuQqJspCRZZBZlAReFJyHNQO6voAEAk0NN9rtG2+/E
 r0G29CJhK+YL0W6mOs8O1yo9J1rZnAM=
 =2CUV
 -----END PGP SIGNATURE-----

Merge tag 'mm-nonmm-stable-2023-04-27-16-01' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull non-MM updates from Andrew Morton:
 "Mainly singleton patches all over the place.

  Series of note are:

   - updates to scripts/gdb from Glenn Washburn

   - kexec cleanups from Bjorn Helgaas"

* tag 'mm-nonmm-stable-2023-04-27-16-01' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (50 commits)
  mailmap: add entries for Paul Mackerras
  libgcc: add forward declarations for generic library routines
  mailmap: add entry for Oleksandr
  ocfs2: reduce ioctl stack usage
  fs/proc: add Kthread flag to /proc/$pid/status
  ia64: fix an addr to taddr in huge_pte_offset()
  checkpatch: introduce proper bindings license check
  epoll: rename global epmutex
  scripts/gdb: add GDB convenience functions $lx_dentry_name() and $lx_i_dentry()
  scripts/gdb: create linux/vfs.py for VFS related GDB helpers
  uapi/linux/const.h: prefer ISO-friendly __typeof__
  delayacct: track delays from IRQ/SOFTIRQ
  scripts/gdb: timerlist: convert int chunks to str
  scripts/gdb: print interrupts
  scripts/gdb: raise error with reduced debugging information
  scripts/gdb: add a Radix Tree Parser
  lib/rbtree: use '+' instead of '|' for setting color.
  proc/stat: remove arch_idle_time()
  checkpatch: check for misuse of the link tags
  checkpatch: allow Closes tags with links
  ...
2023-04-27 19:57:00 -07:00
Linus Torvalds
7fa8a8ee94 - Nick Piggin's "shoot lazy tlbs" series, to improve the peformance of
switching from a user process to a kernel thread.
 
 - More folio conversions from Kefeng Wang, Zhang Peng and Pankaj Raghav.
 
 - zsmalloc performance improvements from Sergey Senozhatsky.
 
 - Yue Zhao has found and fixed some data race issues around the
   alteration of memcg userspace tunables.
 
 - VFS rationalizations from Christoph Hellwig:
 
   - removal of most of the callers of write_one_page().
 
   - make __filemap_get_folio()'s return value more useful
 
 - Luis Chamberlain has changed tmpfs so it no longer requires swap
   backing.  Use `mount -o noswap'.
 
 - Qi Zheng has made the slab shrinkers operate locklessly, providing
   some scalability benefits.
 
 - Keith Busch has improved dmapool's performance, making part of its
   operations O(1) rather than O(n).
 
 - Peter Xu adds the UFFD_FEATURE_WP_UNPOPULATED feature to userfaultd,
   permitting userspace to wr-protect anon memory unpopulated ptes.
 
 - Kirill Shutemov has changed MAX_ORDER's meaning to be inclusive rather
   than exclusive, and has fixed a bunch of errors which were caused by its
   unintuitive meaning.
 
 - Axel Rasmussen give userfaultfd the UFFDIO_CONTINUE_MODE_WP feature,
   which causes minor faults to install a write-protected pte.
 
 - Vlastimil Babka has done some maintenance work on vma_merge():
   cleanups to the kernel code and improvements to our userspace test
   harness.
 
 - Cleanups to do_fault_around() by Lorenzo Stoakes.
 
 - Mike Rapoport has moved a lot of initialization code out of various
   mm/ files and into mm/mm_init.c.
 
 - Lorenzo Stoakes removd vmf_insert_mixed_prot(), which was added for
   DRM, but DRM doesn't use it any more.
 
 - Lorenzo has also coverted read_kcore() and vread() to use iterators
   and has thereby removed the use of bounce buffers in some cases.
 
 - Lorenzo has also contributed further cleanups of vma_merge().
 
 - Chaitanya Prakash provides some fixes to the mmap selftesting code.
 
 - Matthew Wilcox changes xfs and afs so they no longer take sleeping
   locks in ->map_page(), a step towards RCUification of pagefaults.
 
 - Suren Baghdasaryan has improved mmap_lock scalability by switching to
   per-VMA locking.
 
 - Frederic Weisbecker has reworked the percpu cache draining so that it
   no longer causes latency glitches on cpu isolated workloads.
 
 - Mike Rapoport cleans up and corrects the ARCH_FORCE_MAX_ORDER Kconfig
   logic.
 
 - Liu Shixin has changed zswap's initialization so we no longer waste a
   chunk of memory if zswap is not being used.
 
 - Yosry Ahmed has improved the performance of memcg statistics flushing.
 
 - David Stevens has fixed several issues involving khugepaged,
   userfaultfd and shmem.
 
 - Christoph Hellwig has provided some cleanup work to zram's IO-related
   code paths.
 
 - David Hildenbrand has fixed up some issues in the selftest code's
   testing of our pte state changing.
 
 - Pankaj Raghav has made page_endio() unneeded and has removed it.
 
 - Peter Xu contributed some rationalizations of the userfaultfd
   selftests.
 
 - Yosry Ahmed has fixed an issue around memcg's page recalim accounting.
 
 - Chaitanya Prakash has fixed some arm-related issues in the
   selftests/mm code.
 
 - Longlong Xia has improved the way in which KSM handles hwpoisoned
   pages.
 
 - Peter Xu fixes a few issues with uffd-wp at fork() time.
 
 - Stefan Roesch has changed KSM so that it may now be used on a
   per-process and per-cgroup basis.
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCZEr3zQAKCRDdBJ7gKXxA
 jlLoAP0fpQBipwFxED0Us4SKQfupV6z4caXNJGPeay7Aj11/kQD/aMRC2uPfgr96
 eMG3kwn2pqkB9ST2QpkaRbxA//eMbQY=
 =J+Dj
 -----END PGP SIGNATURE-----

Merge tag 'mm-stable-2023-04-27-15-30' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull MM updates from Andrew Morton:

 - Nick Piggin's "shoot lazy tlbs" series, to improve the peformance of
   switching from a user process to a kernel thread.

 - More folio conversions from Kefeng Wang, Zhang Peng and Pankaj
   Raghav.

 - zsmalloc performance improvements from Sergey Senozhatsky.

 - Yue Zhao has found and fixed some data race issues around the
   alteration of memcg userspace tunables.

 - VFS rationalizations from Christoph Hellwig:
     - removal of most of the callers of write_one_page()
     - make __filemap_get_folio()'s return value more useful

 - Luis Chamberlain has changed tmpfs so it no longer requires swap
   backing. Use `mount -o noswap'.

 - Qi Zheng has made the slab shrinkers operate locklessly, providing
   some scalability benefits.

 - Keith Busch has improved dmapool's performance, making part of its
   operations O(1) rather than O(n).

 - Peter Xu adds the UFFD_FEATURE_WP_UNPOPULATED feature to userfaultd,
   permitting userspace to wr-protect anon memory unpopulated ptes.

 - Kirill Shutemov has changed MAX_ORDER's meaning to be inclusive
   rather than exclusive, and has fixed a bunch of errors which were
   caused by its unintuitive meaning.

 - Axel Rasmussen give userfaultfd the UFFDIO_CONTINUE_MODE_WP feature,
   which causes minor faults to install a write-protected pte.

 - Vlastimil Babka has done some maintenance work on vma_merge():
   cleanups to the kernel code and improvements to our userspace test
   harness.

 - Cleanups to do_fault_around() by Lorenzo Stoakes.

 - Mike Rapoport has moved a lot of initialization code out of various
   mm/ files and into mm/mm_init.c.

 - Lorenzo Stoakes removd vmf_insert_mixed_prot(), which was added for
   DRM, but DRM doesn't use it any more.

 - Lorenzo has also coverted read_kcore() and vread() to use iterators
   and has thereby removed the use of bounce buffers in some cases.

 - Lorenzo has also contributed further cleanups of vma_merge().

 - Chaitanya Prakash provides some fixes to the mmap selftesting code.

 - Matthew Wilcox changes xfs and afs so they no longer take sleeping
   locks in ->map_page(), a step towards RCUification of pagefaults.

 - Suren Baghdasaryan has improved mmap_lock scalability by switching to
   per-VMA locking.

 - Frederic Weisbecker has reworked the percpu cache draining so that it
   no longer causes latency glitches on cpu isolated workloads.

 - Mike Rapoport cleans up and corrects the ARCH_FORCE_MAX_ORDER Kconfig
   logic.

 - Liu Shixin has changed zswap's initialization so we no longer waste a
   chunk of memory if zswap is not being used.

 - Yosry Ahmed has improved the performance of memcg statistics
   flushing.

 - David Stevens has fixed several issues involving khugepaged,
   userfaultfd and shmem.

 - Christoph Hellwig has provided some cleanup work to zram's IO-related
   code paths.

 - David Hildenbrand has fixed up some issues in the selftest code's
   testing of our pte state changing.

 - Pankaj Raghav has made page_endio() unneeded and has removed it.

 - Peter Xu contributed some rationalizations of the userfaultfd
   selftests.

 - Yosry Ahmed has fixed an issue around memcg's page recalim
   accounting.

 - Chaitanya Prakash has fixed some arm-related issues in the
   selftests/mm code.

 - Longlong Xia has improved the way in which KSM handles hwpoisoned
   pages.

 - Peter Xu fixes a few issues with uffd-wp at fork() time.

 - Stefan Roesch has changed KSM so that it may now be used on a
   per-process and per-cgroup basis.

* tag 'mm-stable-2023-04-27-15-30' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (369 commits)
  mm,unmap: avoid flushing TLB in batch if PTE is inaccessible
  shmem: restrict noswap option to initial user namespace
  mm/khugepaged: fix conflicting mods to collapse_file()
  sparse: remove unnecessary 0 values from rc
  mm: move 'mmap_min_addr' logic from callers into vm_unmapped_area()
  hugetlb: pte_alloc_huge() to replace huge pte_alloc_map()
  maple_tree: fix allocation in mas_sparse_area()
  mm: do not increment pgfault stats when page fault handler retries
  zsmalloc: allow only one active pool compaction context
  selftests/mm: add new selftests for KSM
  mm: add new KSM process and sysfs knobs
  mm: add new api to enable ksm per process
  mm: shrinkers: fix debugfs file permissions
  mm: don't check VMA write permissions if the PTE/PMD indicates write permissions
  migrate_pages_batch: fix statistics for longterm pin retry
  userfaultfd: use helper function range_in_vma()
  lib/show_mem.c: use for_each_populated_zone() simplify code
  mm: correct arg in reclaim_pages()/reclaim_clean_pages_from_list()
  fs/buffer: convert create_page_buffers to folio_create_buffers
  fs/buffer: add folio_create_empty_buffers helper
  ...
2023-04-27 19:42:02 -07:00
Linus Torvalds
8ccd54fe45 virtio,vhost,vdpa: features, fixes, cleanups
reduction in interrupt rate in virtio
 perf improvement for VDUSE
 scalability for vhost-scsi
 non power of 2 ring support for packed rings
 better management for mlx5 vdpa
 suspend for snet
 VIRTIO_F_NOTIFICATION_DATA
 shared backend with vdpa-sim-blk
 user VA support in vdpa-sim
 better struct packing for virtio
 
 fixes, cleanups all over the place
 
 Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 
 iQFDBAABCAAtFiEEXQn9CHHI+FuUyooNKB8NuNKNVGkFAmRG+QcPHG1zdEByZWRo
 YXQuY29tAAoJECgfDbjSjVRpMyAIALpq8Z9ljl7ADGLuvt/xeCnIdifo7NXam71s
 +algalRplF3QplnMxZ0vH19Z8Gvyl18fkk/l0tHoCrZZgyseYR6DbyZXPv8YIfFh
 NSBokhil+ZURH6eNJc2PLcBUF3QIL3rSv7tBq7/++PN3KIqdHIePbyUFLlwqb272
 NLkOkHT30QBtncRWJORj/GqDxi/4H1zHDmfMd6xD/1B6IrC3gin205RnLuCa2H65
 bP0IE025VrmrRqNGX7nhi7dIFo6SmMPwG5O0YWeEhFHaSOL9PJM/Z9EN4tLhC1v1
 Y34fryH9e+MMSgBnCK2ExxTq/pGWsbhPbvisDfDf3M1m1HHfhYI=
 =N1SV
 -----END PGP SIGNATURE-----

Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost

Pull virtio updates from Michael Tsirkin:
 "virtio,vhost,vdpa: features, fixes, and cleanups:

   - reduction in interrupt rate in virtio

   - perf improvement for VDUSE

   - scalability for vhost-scsi

   - non power of 2 ring support for packed rings

   - better management for mlx5 vdpa

   - suspend for snet

   - VIRTIO_F_NOTIFICATION_DATA

   - shared backend with vdpa-sim-blk

   - user VA support in vdpa-sim

   - better struct packing for virtio

  and fixes, cleanups all over the place"

* tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost: (52 commits)
  vhost_vdpa: fix unmap process in no-batch mode
  MAINTAINERS: make me a reviewer of VIRTIO CORE AND NET DRIVERS
  tools/virtio: fix build caused by virtio_ring changes
  virtio_ring: add a struct device forward declaration
  vdpa_sim_blk: support shared backend
  vdpa_sim: move buffer allocation in the devices
  vdpa/snet: use likely/unlikely macros in hot functions
  vdpa/snet: implement kick_vq_with_data callback
  virtio-vdpa: add VIRTIO_F_NOTIFICATION_DATA feature support
  virtio: add VIRTIO_F_NOTIFICATION_DATA feature support
  vdpa/snet: support the suspend vDPA callback
  vdpa/snet: support getting and setting VQ state
  MAINTAINERS: add vringh.h to Virtio Core and Net Drivers
  vringh: address kdoc warnings
  vdpa: address kdoc warnings
  virtio_ring: don't update event idx on get_buf
  vdpa_sim: add support for user VA
  vdpa_sim: replace the spinlock with a mutex to protect the state
  vdpa_sim: use kthread worker
  vdpa_sim: make devices agnostic for work management
  ...
2023-04-27 17:05:34 -07:00
Linus Torvalds
b6a7828502 modules-6.4-rc1
The summary of the changes for this pull requests is:
 
  * Song Liu's new struct module_memory replacement
  * Nick Alcock's MODULE_LICENSE() removal for non-modules
  * My cleanups and enhancements to reduce the areas where we vmalloc
    module memory for duplicates, and the respective debug code which
    proves the remaining vmalloc pressure comes from userspace.
 
 Most of the changes have been in linux-next for quite some time except
 the minor fixes I made to check if a module was already loaded
 prior to allocating the final module memory with vmalloc and the
 respective debug code it introduces to help clarify the issue. Although
 the functional change is small it is rather safe as it can only *help*
 reduce vmalloc space for duplicates and is confirmed to fix a bootup
 issue with over 400 CPUs with KASAN enabled. I don't expect stable
 kernels to pick up that fix as the cleanups would have also had to have
 been picked up. Folks on larger CPU systems with modules will want to
 just upgrade if vmalloc space has been an issue on bootup.
 
 Given the size of this request, here's some more elaborate details
 on this pull request.
 
 The functional change change in this pull request is the very first
 patch from Song Liu which replaces the struct module_layout with a new
 struct module memory. The old data structure tried to put together all
 types of supported module memory types in one data structure, the new
 one abstracts the differences in memory types in a module to allow each
 one to provide their own set of details. This paves the way in the
 future so we can deal with them in a cleaner way. If you look at changes
 they also provide a nice cleanup of how we handle these different memory
 areas in a module. This change has been in linux-next since before the
 merge window opened for v6.3 so to provide more than a full kernel cycle
 of testing. It's a good thing as quite a bit of fixes have been found
 for it.
 
 Jason Baron then made dynamic debug a first class citizen module user by
 using module notifier callbacks to allocate / remove module specific
 dynamic debug information.
 
 Nick Alcock has done quite a bit of work cross-tree to remove module
 license tags from things which cannot possibly be module at my request
 so to:
 
   a) help him with his longer term tooling goals which require a
      deterministic evaluation if a piece a symbol code could ever be
      part of a module or not. But quite recently it is has been made
      clear that tooling is not the only one that would benefit.
      Disambiguating symbols also helps efforts such as live patching,
      kprobes and BPF, but for other reasons and R&D on this area
      is active with no clear solution in sight.
 
   b) help us inch closer to the now generally accepted long term goal
      of automating all the MODULE_LICENSE() tags from SPDX license tags
 
 In so far as a) is concerned, although module license tags are a no-op
 for non-modules, tools which would want create a mapping of possible
 modules can only rely on the module license tag after the commit
 8b41fc4454 ("kbuild: create modules.builtin without Makefile.modbuiltin
 or tristate.conf").  Nick has been working on this *for years* and
 AFAICT I was the only one to suggest two alternatives to this approach
 for tooling. The complexity in one of my suggested approaches lies in
 that we'd need a possible-obj-m and a could-be-module which would check
 if the object being built is part of any kconfig build which could ever
 lead to it being part of a module, and if so define a new define
 -DPOSSIBLE_MODULE [0]. A more obvious yet theoretical approach I've
 suggested would be to have a tristate in kconfig imply the same new
 -DPOSSIBLE_MODULE as well but that means getting kconfig symbol names
 mapping to modules always, and I don't think that's the case today. I am
 not aware of Nick or anyone exploring either of these options. Quite
 recently Josh Poimboeuf has pointed out that live patching, kprobes and
 BPF would benefit from resolving some part of the disambiguation as
 well but for other reasons. The function granularity KASLR (fgkaslr)
 patches were mentioned but Joe Lawrence has clarified this effort has
 been dropped with no clear solution in sight [1].
 
 In the meantime removing module license tags from code which could never
 be modules is welcomed for both objectives mentioned above. Some
 developers have also welcomed these changes as it has helped clarify
 when a module was never possible and they forgot to clean this up,
 and so you'll see quite a bit of Nick's patches in other pull
 requests for this merge window. I just picked up the stragglers after
 rc3. LWN has good coverage on the motivation behind this work [2] and
 the typical cross-tree issues he ran into along the way. The only
 concrete blocker issue he ran into was that we should not remove the
 MODULE_LICENSE() tags from files which have no SPDX tags yet, even if
 they can never be modules. Nick ended up giving up on his efforts due
 to having to do this vetting and backlash he ran into from folks who
 really did *not understand* the core of the issue nor were providing
 any alternative / guidance. I've gone through his changes and dropped
 the patches which dropped the module license tags where an SPDX
 license tag was missing, it only consisted of 11 drivers.  To see
 if a pull request deals with a file which lacks SPDX tags you
 can just use:
 
   ./scripts/spdxcheck.py -f \
 	$(git diff --name-only commid-id | xargs echo)
 
 You'll see a core module file in this pull request for the above,
 but that's not related to his changes. WE just need to add the SPDX
 license tag for the kernel/module/kmod.c file in the future but
 it demonstrates the effectiveness of the script.
 
 Most of Nick's changes were spread out through different trees,
 and I just picked up the slack after rc3 for the last kernel was out.
 Those changes have been in linux-next for over two weeks.
 
 The cleanups, debug code I added and final fix I added for modules
 were motivated by David Hildenbrand's report of boot failing on
 a systems with over 400 CPUs when KASAN was enabled due to running
 out of virtual memory space. Although the functional change only
 consists of 3 lines in the patch "module: avoid allocation if module is
 already present and ready", proving that this was the best we can
 do on the modules side took quite a bit of effort and new debug code.
 
 The initial cleanups I did on the modules side of things has been
 in linux-next since around rc3 of the last kernel, the actual final
 fix for and debug code however have only been in linux-next for about a
 week or so but I think it is worth getting that code in for this merge
 window as it does help fix / prove / evaluate the issues reported
 with larger number of CPUs. Userspace is not yet fixed as it is taking
 a bit of time for folks to understand the crux of the issue and find a
 proper resolution. Worst come to worst, I have a kludge-of-concept [3]
 of how to make kernel_read*() calls for modules unique / converge them,
 but I'm currently inclined to just see if userspace can fix this
 instead.
 
 [0] https://lore.kernel.org/all/Y/kXDqW+7d71C4wz@bombadil.infradead.org/
 [1] https://lkml.kernel.org/r/025f2151-ce7c-5630-9b90-98742c97ac65@redhat.com
 [2] https://lwn.net/Articles/927569/
 [3] https://lkml.kernel.org/r/20230414052840.1994456-3-mcgrof@kernel.org
 -----BEGIN PGP SIGNATURE-----
 
 iQJGBAABCgAwFiEENnNq2KuOejlQLZofziMdCjCSiKcFAmRG4m0SHG1jZ3JvZkBr
 ZXJuZWwub3JnAAoJEM4jHQowkoinQ2oP/0xlvKwJg6Ey8fHZF0qv8VOskE80zoLF
 hMazU3xfqLA+1TQvouW1YBxt3jwS3t1Ehs+NrV+nY9Yzcm0MzRX/n3fASJVe7nRr
 oqWWQU+voYl5Pw1xsfdp6C8IXpBQorpYby3Vp0MAMoZyl2W2YrNo36NV488wM9KC
 jD4HF5Z6xpnPSZTRR7AgW9mo7FdAtxPeKJ76Bch7lH8U6omT7n36WqTw+5B1eAYU
 YTOvrjRs294oqmWE+LeebyiOOXhH/yEYx4JNQgCwPdxwnRiGJWKsk5va0hRApqF/
 WW8dIqdEnjsa84lCuxnmWgbcPK8cgmlO0rT0DyneACCldNlldCW1LJ0HOwLk9pea
 p3JFAsBL7TKue4Tos6I7/4rx1ufyBGGIigqw9/VX5g0Iif+3BhWnqKRfz+p9wiMa
 Fl7cU6u7yC68CHu1HBSisK16cYMCPeOnTSd89upHj8JU/t74O6k/ARvjrQ9qmNUt
 c5U+OY+WpNJ1nXQydhY/yIDhFdYg8SSpNuIO90r4L8/8jRQYXNG80FDd1UtvVDuy
 eq0r2yZ8C0XHSlOT9QHaua/tWV/aaKtyC/c0hDRrigfUrq8UOlGujMXbUnrmrWJI
 tLJLAc7ePWAAoZXGSHrt0U27l029GzLwRdKqJ6kkDANVnTeOdV+mmBg9zGh3/Mp6
 agiwdHUMVN7X
 =56WK
 -----END PGP SIGNATURE-----

Merge tag 'modules-6.4-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux

Pull module updates from Luis Chamberlain:
 "The summary of the changes for this pull requests is:

   - Song Liu's new struct module_memory replacement

   - Nick Alcock's MODULE_LICENSE() removal for non-modules

   - My cleanups and enhancements to reduce the areas where we vmalloc
     module memory for duplicates, and the respective debug code which
     proves the remaining vmalloc pressure comes from userspace.

  Most of the changes have been in linux-next for quite some time except
  the minor fixes I made to check if a module was already loaded prior
  to allocating the final module memory with vmalloc and the respective
  debug code it introduces to help clarify the issue. Although the
  functional change is small it is rather safe as it can only *help*
  reduce vmalloc space for duplicates and is confirmed to fix a bootup
  issue with over 400 CPUs with KASAN enabled. I don't expect stable
  kernels to pick up that fix as the cleanups would have also had to
  have been picked up. Folks on larger CPU systems with modules will
  want to just upgrade if vmalloc space has been an issue on bootup.

  Given the size of this request, here's some more elaborate details:

  The functional change change in this pull request is the very first
  patch from Song Liu which replaces the 'struct module_layout' with a
  new 'struct module_memory'. The old data structure tried to put
  together all types of supported module memory types in one data
  structure, the new one abstracts the differences in memory types in a
  module to allow each one to provide their own set of details. This
  paves the way in the future so we can deal with them in a cleaner way.
  If you look at changes they also provide a nice cleanup of how we
  handle these different memory areas in a module. This change has been
  in linux-next since before the merge window opened for v6.3 so to
  provide more than a full kernel cycle of testing. It's a good thing as
  quite a bit of fixes have been found for it.

  Jason Baron then made dynamic debug a first class citizen module user
  by using module notifier callbacks to allocate / remove module
  specific dynamic debug information.

  Nick Alcock has done quite a bit of work cross-tree to remove module
  license tags from things which cannot possibly be module at my request
  so to:

   a) help him with his longer term tooling goals which require a
      deterministic evaluation if a piece a symbol code could ever be
      part of a module or not. But quite recently it is has been made
      clear that tooling is not the only one that would benefit.
      Disambiguating symbols also helps efforts such as live patching,
      kprobes and BPF, but for other reasons and R&D on this area is
      active with no clear solution in sight.

   b) help us inch closer to the now generally accepted long term goal
      of automating all the MODULE_LICENSE() tags from SPDX license tags

  In so far as a) is concerned, although module license tags are a no-op
  for non-modules, tools which would want create a mapping of possible
  modules can only rely on the module license tag after the commit
  8b41fc4454 ("kbuild: create modules.builtin without
  Makefile.modbuiltin or tristate.conf").

  Nick has been working on this *for years* and AFAICT I was the only
  one to suggest two alternatives to this approach for tooling. The
  complexity in one of my suggested approaches lies in that we'd need a
  possible-obj-m and a could-be-module which would check if the object
  being built is part of any kconfig build which could ever lead to it
  being part of a module, and if so define a new define
  -DPOSSIBLE_MODULE [0].

  A more obvious yet theoretical approach I've suggested would be to
  have a tristate in kconfig imply the same new -DPOSSIBLE_MODULE as
  well but that means getting kconfig symbol names mapping to modules
  always, and I don't think that's the case today. I am not aware of
  Nick or anyone exploring either of these options. Quite recently Josh
  Poimboeuf has pointed out that live patching, kprobes and BPF would
  benefit from resolving some part of the disambiguation as well but for
  other reasons. The function granularity KASLR (fgkaslr) patches were
  mentioned but Joe Lawrence has clarified this effort has been dropped
  with no clear solution in sight [1].

  In the meantime removing module license tags from code which could
  never be modules is welcomed for both objectives mentioned above. Some
  developers have also welcomed these changes as it has helped clarify
  when a module was never possible and they forgot to clean this up, and
  so you'll see quite a bit of Nick's patches in other pull requests for
  this merge window. I just picked up the stragglers after rc3. LWN has
  good coverage on the motivation behind this work [2] and the typical
  cross-tree issues he ran into along the way. The only concrete blocker
  issue he ran into was that we should not remove the MODULE_LICENSE()
  tags from files which have no SPDX tags yet, even if they can never be
  modules. Nick ended up giving up on his efforts due to having to do
  this vetting and backlash he ran into from folks who really did *not
  understand* the core of the issue nor were providing any alternative /
  guidance. I've gone through his changes and dropped the patches which
  dropped the module license tags where an SPDX license tag was missing,
  it only consisted of 11 drivers. To see if a pull request deals with a
  file which lacks SPDX tags you can just use:

    ./scripts/spdxcheck.py -f \
	$(git diff --name-only commid-id | xargs echo)

  You'll see a core module file in this pull request for the above, but
  that's not related to his changes. WE just need to add the SPDX
  license tag for the kernel/module/kmod.c file in the future but it
  demonstrates the effectiveness of the script.

  Most of Nick's changes were spread out through different trees, and I
  just picked up the slack after rc3 for the last kernel was out. Those
  changes have been in linux-next for over two weeks.

  The cleanups, debug code I added and final fix I added for modules
  were motivated by David Hildenbrand's report of boot failing on a
  systems with over 400 CPUs when KASAN was enabled due to running out
  of virtual memory space. Although the functional change only consists
  of 3 lines in the patch "module: avoid allocation if module is already
  present and ready", proving that this was the best we can do on the
  modules side took quite a bit of effort and new debug code.

  The initial cleanups I did on the modules side of things has been in
  linux-next since around rc3 of the last kernel, the actual final fix
  for and debug code however have only been in linux-next for about a
  week or so but I think it is worth getting that code in for this merge
  window as it does help fix / prove / evaluate the issues reported with
  larger number of CPUs. Userspace is not yet fixed as it is taking a
  bit of time for folks to understand the crux of the issue and find a
  proper resolution. Worst come to worst, I have a kludge-of-concept [3]
  of how to make kernel_read*() calls for modules unique / converge
  them, but I'm currently inclined to just see if userspace can fix this
  instead"

Link: https://lore.kernel.org/all/Y/kXDqW+7d71C4wz@bombadil.infradead.org/ [0]
Link: https://lkml.kernel.org/r/025f2151-ce7c-5630-9b90-98742c97ac65@redhat.com [1]
Link: https://lwn.net/Articles/927569/ [2]
Link: https://lkml.kernel.org/r/20230414052840.1994456-3-mcgrof@kernel.org [3]

* tag 'modules-6.4-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux: (121 commits)
  module: add debugging auto-load duplicate module support
  module: stats: fix invalid_mod_bytes typo
  module: remove use of uninitialized variable len
  module: fix building stats for 32-bit targets
  module: stats: include uapi/linux/module.h
  module: avoid allocation if module is already present and ready
  module: add debug stats to help identify memory pressure
  module: extract patient module check into helper
  modules/kmod: replace implementation with a semaphore
  Change DEFINE_SEMAPHORE() to take a number argument
  module: fix kmemleak annotations for non init ELF sections
  module: Ignore L0 and rename is_arm_mapping_symbol()
  module: Move is_arm_mapping_symbol() to module_symbol.h
  module: Sync code of is_arm_mapping_symbol()
  scripts/gdb: use mem instead of core_layout to get the module address
  interconnect: remove module-related code
  interconnect: remove MODULE_LICENSE in non-modules
  zswap: remove MODULE_LICENSE in non-modules
  zpool: remove MODULE_LICENSE in non-modules
  x86/mm/dump_pagetables: remove MODULE_LICENSE in non-modules
  ...
2023-04-27 16:36:55 -07:00
Linus Torvalds
556eb8b791 Driver core changes for 6.4-rc1
Here is the large set of driver core changes for 6.4-rc1.
 
 Once again, a busy development cycle, with lots of changes happening in
 the driver core in the quest to be able to move "struct bus" and "struct
 class" into read-only memory, a task now complete with these changes.
 
 This will make the future rust interactions with the driver core more
 "provably correct" as well as providing more obvious lifetime rules for
 all busses and classes in the kernel.
 
 The changes required for this did touch many individual classes and
 busses as many callbacks were changed to take const * parameters
 instead.  All of these changes have been submitted to the various
 subsystem maintainers, giving them plenty of time to review, and most of
 them actually did so.
 
 Other than those changes, included in here are a small set of other
 things:
   - kobject logging improvements
   - cacheinfo improvements and updates
   - obligatory fw_devlink updates and fixes
   - documentation updates
   - device property cleanups and const * changes
   - firwmare loader dependency fixes.
 
 All of these have been in linux-next for a while with no reported
 problems.
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 -----BEGIN PGP SIGNATURE-----
 
 iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCZEp7Sw8cZ3JlZ0Brcm9h
 aC5jb20ACgkQMUfUDdst+ykitQCfamUHpxGcKOAGuLXMotXNakTEsxgAoIquENm5
 LEGadNS38k5fs+73UaxV
 =7K4B
 -----END PGP SIGNATURE-----

Merge tag 'driver-core-6.4-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core

Pull driver core updates from Greg KH:
 "Here is the large set of driver core changes for 6.4-rc1.

  Once again, a busy development cycle, with lots of changes happening
  in the driver core in the quest to be able to move "struct bus" and
  "struct class" into read-only memory, a task now complete with these
  changes.

  This will make the future rust interactions with the driver core more
  "provably correct" as well as providing more obvious lifetime rules
  for all busses and classes in the kernel.

  The changes required for this did touch many individual classes and
  busses as many callbacks were changed to take const * parameters
  instead. All of these changes have been submitted to the various
  subsystem maintainers, giving them plenty of time to review, and most
  of them actually did so.

  Other than those changes, included in here are a small set of other
  things:

   - kobject logging improvements

   - cacheinfo improvements and updates

   - obligatory fw_devlink updates and fixes

   - documentation updates

   - device property cleanups and const * changes

   - firwmare loader dependency fixes.

  All of these have been in linux-next for a while with no reported
  problems"

* tag 'driver-core-6.4-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (120 commits)
  device property: make device_property functions take const device *
  driver core: update comments in device_rename()
  driver core: Don't require dynamic_debug for initcall_debug probe timing
  firmware_loader: rework crypto dependencies
  firmware_loader: Strip off \n from customized path
  zram: fix up permission for the hot_add sysfs file
  cacheinfo: Add use_arch[|_cache]_info field/function
  arch_topology: Remove early cacheinfo error message if -ENOENT
  cacheinfo: Check cache properties are present in DT
  cacheinfo: Check sib_leaf in cache_leaves_are_shared()
  cacheinfo: Allow early level detection when DT/ACPI info is missing/broken
  cacheinfo: Add arm64 early level initializer implementation
  cacheinfo: Add arch specific early level initializer
  tty: make tty_class a static const structure
  driver core: class: remove struct class_interface * from callbacks
  driver core: class: mark the struct class in struct class_interface constant
  driver core: class: make class_register() take a const *
  driver core: class: mark class_release() as taking a const *
  driver core: remove incorrect comment for device_create*
  MIPS: vpe-cmp: remove module owner pointer from struct class usage.
  ...
2023-04-27 11:53:57 -07:00
Linus Torvalds
6e98b09da9 Networking changes for 6.4.
Core
 ----
 
  - Introduce a config option to tweak MAX_SKB_FRAGS. Increasing the
    default value allows for better BIG TCP performances.
 
  - Reduce compound page head access for zero-copy data transfers.
 
  - RPS/RFS improvements, avoiding unneeded NET_RX_SOFTIRQ when possible.
 
  - Threaded NAPI improvements, adding defer skb free support and unneeded
    softirq avoidance.
 
  - Address dst_entry reference count scalability issues, via false
    sharing avoidance and optimize refcount tracking.
 
  - Add lockless accesses annotation to sk_err[_soft].
 
  - Optimize again the skb struct layout.
 
  - Extends the skb drop reasons to make it usable by multiple
    subsystems.
 
  - Better const qualifier awareness for socket casts.
 
 BPF
 ---
 
  - Add skb and XDP typed dynptrs which allow BPF programs for more
    ergonomic and less brittle iteration through data and variable-sized
    accesses.
 
  - Add a new BPF netfilter program type and minimal support to hook
    BPF programs to netfilter hooks such as prerouting or forward.
 
  - Add more precise memory usage reporting for all BPF map types.
 
  - Adds support for using {FOU,GUE} encap with an ipip device operating
    in collect_md mode and add a set of BPF kfuncs for controlling encap
    params.
 
  - Allow BPF programs to detect at load time whether a particular kfunc
    exists or not, and also add support for this in light skeleton.
 
  - Bigger batch of BPF verifier improvements to prepare for upcoming BPF
    open-coded iterators allowing for less restrictive looping capabilities.
 
  - Rework RCU enforcement in the verifier, add kptr_rcu and enforce BPF
    programs to NULL-check before passing such pointers into kfunc.
 
  - Add support for kptrs in percpu hashmaps, percpu LRU hashmaps and in
    local storage maps.
 
  - Enable RCU semantics for task BPF kptrs and allow referenced kptr
    tasks to be stored in BPF maps.
 
  - Add support for refcounted local kptrs to the verifier for allowing
    shared ownership, useful for adding a node to both the BPF list and
    rbtree.
 
  - Add BPF verifier support for ST instructions in convert_ctx_access()
    which will help new -mcpu=v4 clang flag to start emitting them.
 
  - Add ARM32 USDT support to libbpf.
 
  - Improve bpftool's visual program dump which produces the control
    flow graph in a DOT format by adding C source inline annotations.
 
 Protocols
 ---------
 
  - IPv4: Allow adding to IPv4 address a 'protocol' tag. Such value
    indicates the provenance of the IP address.
 
  - IPv6: optimize route lookup, dropping unneeded R/W lock acquisition.
 
  - Add the handshake upcall mechanism, allowing the user-space
    to implement generic TLS handshake on kernel's behalf.
 
  - Bridge: support per-{Port, VLAN} neighbor suppression, increasing
    resilience to nodes failures.
 
  - SCTP: add support for Fair Capacity and Weighted Fair Queueing
    schedulers.
 
  - MPTCP: delay first subflow allocation up to its first usage. This
    will allow for later better LSM interaction.
 
  - xfrm: Remove inner/outer modes from input/output path. These are
    not needed anymore.
 
  - WiFi:
    - reduced neighbor report (RNR) handling for AP mode
    - HW timestamping support
    - support for randomized auth/deauth TA for PASN privacy
    - per-link debugfs for multi-link
    - TC offload support for mac80211 drivers
    - mac80211 mesh fast-xmit and fast-rx support
    - enable Wi-Fi 7 (EHT) mesh support
 
 Netfilter
 ---------
 
  - Add nf_tables 'brouting' support, to force a packet to be routed
    instead of being bridged.
 
  - Update bridge netfilter and ovs conntrack helpers to handle
    IPv6 Jumbo packets properly, i.e. fetch the packet length
    from hop-by-hop extension header. This is needed for BIT TCP
    support.
 
  - The iptables 32bit compat interface isn't compiled in by default
    anymore.
 
  - Move ip(6)tables builtin icmp matches to the udptcp one.
    This has the advantage that icmp/icmpv6 match doesn't load the
    iptables/ip6tables modules anymore when iptables-nft is used.
 
  - Extended netlink error report for netdevice in flowtables and
    netdev/chains. Allow for incrementally add/delete devices to netdev
    basechain. Allow to create netdev chain without device.
 
 Driver API
 ----------
 
  - Remove redundant Device Control Error Reporting Enable, as PCI core
    has already error reporting enabled at enumeration time.
 
  - Move Multicast DB netlink handlers to core, allowing devices other
    then bridge to use them.
 
  - Allow the page_pool to directly recycle the pages from safely
    localized NAPI.
 
  - Implement lockless TX queue stop/wake combo macros, allowing for
    further code de-duplication and sanitization.
 
  - Add YNL support for user headers and struct attrs.
 
  - Add partial YNL specification for devlink.
 
  - Add partial YNL specification for ethtool.
 
  - Add tc-mqprio and tc-taprio support for preemptible traffic classes.
 
  - Add tx push buf len param to ethtool, specifies the maximum number
    of bytes of a transmitted packet a driver can push directly to the
    underlying device.
 
  - Add basic LED support for switch/phy.
 
  - Add NAPI documentation, stop relaying on external links.
 
  - Convert dsa_master_ioctl() to netdev notifier. This is a preparatory
    work to make the hardware timestamping layer selectable by user
    space.
 
  - Add transceiver support and improve the error messages for CAN-FD
    controllers.
 
 New hardware / drivers
 ----------------------
 
  - Ethernet:
    - AMD/Pensando core device support
    - MediaTek MT7981 SoC
    - MediaTek MT7988 SoC
    - Broadcom BCM53134 embedded switch
    - Texas Instruments CPSW9G ethernet switch
    - Qualcomm EMAC3 DWMAC ethernet
    - StarFive JH7110 SoC
    - NXP CBTX ethernet PHY
 
  - WiFi:
    - Apple M1 Pro/Max devices
    - RealTek rtl8710bu/rtl8188gu
    - RealTek rtl8822bs, rtl8822cs and rtl8821cs SDIO chipset
 
  - Bluetooth:
    - Realtek RTL8821CS, RTL8851B, RTL8852BS
    - Mediatek MT7663, MT7922
    - NXP w8997
    - Actions Semi ATS2851
    - QTI WCN6855
    - Marvell 88W8997
 
  - Can:
    - STMicroelectronics bxcan stm32f429
 
 Drivers
 -------
  - Ethernet NICs:
    - Intel (1G, icg):
      - add tracking and reporting of QBV config errors.
      - add support for configuring max SDU for each Tx queue.
    - Intel (100G, ice):
      - refactor mailbox overflow detection to support Scalable IOV
      - GNSS interface optimization
    - Intel (i40e):
      - support XDP multi-buffer
    - nVidia/Mellanox:
      - add the support for linux bridge multicast offload
      - enable TC offload for egress and engress MACVLAN over bond
      - add support for VxLAN GBP encap/decap flows offload
      - extend packet offload to fully support libreswan
      - support tunnel mode in mlx5 IPsec packet offload
      - extend XDP multi-buffer support
      - support MACsec VLAN offload
      - add support for dynamic msix vectors allocation
      - drop RX page_cache and fully use page_pool
      - implement thermal zone to report NIC temperature
    - Netronome/Corigine:
      - add support for multi-zone conntrack offload
    - Solarflare/Xilinx:
      - support offloading TC VLAN push/pop actions to the MAE
      - support TC decap rules
      - support unicast PTP
 
  - Other NICs:
    - Broadcom (bnxt): enforce software based freq adjustments only
 		on shared PHC NIC
    - RealTek (r8169): refactor to addess ASPM issues during NAPI poll.
    - Micrel (lan8841): add support for PTP_PF_PEROUT
    - Cadence (macb): enable PTP unicast
    - Engleder (tsnep): add XDP socket zero-copy support
    - virtio-net: implement exact header length guest feature
    - veth: add page_pool support for page recycling
    - vxlan: add MDB data path support
    - gve: add XDP support for GQI-QPL format
    - geneve: accept every ethertype
    - macvlan: allow some packets to bypass broadcast queue
    - mana: add support for jumbo frame
 
  - Ethernet high-speed switches:
    - Microchip (sparx5): Add support for TC flower templates.
 
  - Ethernet embedded switches:
    - Broadcom (b54):
      - configure 6318 and 63268 RGMII ports
    - Marvell (mv88e6xxx):
      - faster C45 bus scan
    - Microchip:
      - lan966x:
        - add support for IS1 VCAP
        - better TX/RX from/to CPU performances
      - ksz9477: add ETS Qdisc support
      - ksz8: enhance static MAC table operations and error handling
      - sama7g5: add PTP capability
    - NXP (ocelot):
      - add support for external ports
      - add support for preemptible traffic classes
    - Texas Instruments:
      - add CPSWxG SGMII support for J7200 and J721E
 
  - Intel WiFi (iwlwifi):
    - preparation for Wi-Fi 7 EHT and multi-link support
    - EHT (Wi-Fi 7) sniffer support
    - hardware timestamping support for some devices/firwmares
    - TX beacon protection on newer hardware
 
  - Qualcomm 802.11ax WiFi (ath11k):
    - MU-MIMO parameters support
    - ack signal support for management packets
 
  - RealTek WiFi (rtw88):
    - SDIO bus support
    - better support for some SDIO devices
      (e.g. MAC address from efuse)
 
  - RealTek WiFi (rtw89):
    - HW scan support for 8852b
    - better support for 6 GHz scanning
    - support for various newer firmware APIs
    - framework firmware backwards compatibility
 
  - MediaTek WiFi (mt76):
    - P2P support
    - mesh A-MSDU support
    - EHT (Wi-Fi 7) support
    - coredump support
 
 Signed-off-by: Paolo Abeni <pabeni@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 
 iQJGBAABCAAwFiEEg1AjqC77wbdLX2LbKSR5jcyPE6QFAmRI/mUSHHBhYmVuaUBy
 ZWRoYXQuY29tAAoJECkkeY3MjxOkgO0QAJGxpuN67YgYV0BIM+/atWKEEexJYG7B
 9MMpU4jMO3EW/pUS5t7VRsBLUybLYVPmqCZoHodObDfnu59jiPOegb6SikJv/ZwJ
 Zw62PVk5MvDnQjlu4e6kDcGwkplteN08TlgI+a49BUTedpdFitrxHAYGW8f2fRO6
 cK2XSld+ZucMoym5vRwf8yWS1BwdxnslPMxDJ+/8ZbWBZv44qAnG2vMB/kIx7ObC
 Vel/4m6MzTwVsLYBsRvcwMVbNNlZ9GuhztlTzEbfGA4ZhTadIAMgb5VTWXB84Ws7
 Aic5wTdli+q+x6/2cxhbyeoVuB9HHObYmLBAciGg4GNljP5rnQBY3X3+KVZ/x9TI
 HQB7CmhxmAZVrO9pLARFV+ECrMTH2/dy3NyrZ7uYQ3WPOXJi8hJZjOTO/eeEGL7C
 eTjdz0dZBWIBK2gON/6s4nExXVQUTEF2ZsPi52jTTClKjfe5pz/ddeFQIWaY1DTm
 pInEiWPAvd28JyiFmhFNHsuIBCjX/Zqe2JuMfMBeBibDAC09o/OGdKJYUI15AiRf
 F46Pdb7use/puqfrYW44kSAfaPYoBiE+hj1RdeQfen35xD9HVE4vdnLNeuhRlFF9
 aQfyIRHYQofkumRDr5f8JEY66cl9NiKQ4IVW1xxQfYDNdC6wQqREPG1md7rJVMrJ
 vP7ugFnttneg
 =ITVa
 -----END PGP SIGNATURE-----

Merge tag 'net-next-6.4' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next

Pull networking updates from Paolo Abeni:
 "Core:

   - Introduce a config option to tweak MAX_SKB_FRAGS. Increasing the
     default value allows for better BIG TCP performances

   - Reduce compound page head access for zero-copy data transfers

   - RPS/RFS improvements, avoiding unneeded NET_RX_SOFTIRQ when
     possible

   - Threaded NAPI improvements, adding defer skb free support and
     unneeded softirq avoidance

   - Address dst_entry reference count scalability issues, via false
     sharing avoidance and optimize refcount tracking

   - Add lockless accesses annotation to sk_err[_soft]

   - Optimize again the skb struct layout

   - Extends the skb drop reasons to make it usable by multiple
     subsystems

   - Better const qualifier awareness for socket casts

  BPF:

   - Add skb and XDP typed dynptrs which allow BPF programs for more
     ergonomic and less brittle iteration through data and
     variable-sized accesses

   - Add a new BPF netfilter program type and minimal support to hook
     BPF programs to netfilter hooks such as prerouting or forward

   - Add more precise memory usage reporting for all BPF map types

   - Adds support for using {FOU,GUE} encap with an ipip device
     operating in collect_md mode and add a set of BPF kfuncs for
     controlling encap params

   - Allow BPF programs to detect at load time whether a particular
     kfunc exists or not, and also add support for this in light
     skeleton

   - Bigger batch of BPF verifier improvements to prepare for upcoming
     BPF open-coded iterators allowing for less restrictive looping
     capabilities

   - Rework RCU enforcement in the verifier, add kptr_rcu and enforce
     BPF programs to NULL-check before passing such pointers into kfunc

   - Add support for kptrs in percpu hashmaps, percpu LRU hashmaps and
     in local storage maps

   - Enable RCU semantics for task BPF kptrs and allow referenced kptr
     tasks to be stored in BPF maps

   - Add support for refcounted local kptrs to the verifier for allowing
     shared ownership, useful for adding a node to both the BPF list and
     rbtree

   - Add BPF verifier support for ST instructions in
     convert_ctx_access() which will help new -mcpu=v4 clang flag to
     start emitting them

   - Add ARM32 USDT support to libbpf

   - Improve bpftool's visual program dump which produces the control
     flow graph in a DOT format by adding C source inline annotations

  Protocols:

   - IPv4: Allow adding to IPv4 address a 'protocol' tag. Such value
     indicates the provenance of the IP address

   - IPv6: optimize route lookup, dropping unneeded R/W lock acquisition

   - Add the handshake upcall mechanism, allowing the user-space to
     implement generic TLS handshake on kernel's behalf

   - Bridge: support per-{Port, VLAN} neighbor suppression, increasing
     resilience to nodes failures

   - SCTP: add support for Fair Capacity and Weighted Fair Queueing
     schedulers

   - MPTCP: delay first subflow allocation up to its first usage. This
     will allow for later better LSM interaction

   - xfrm: Remove inner/outer modes from input/output path. These are
     not needed anymore

   - WiFi:
      - reduced neighbor report (RNR) handling for AP mode
      - HW timestamping support
      - support for randomized auth/deauth TA for PASN privacy
      - per-link debugfs for multi-link
      - TC offload support for mac80211 drivers
      - mac80211 mesh fast-xmit and fast-rx support
      - enable Wi-Fi 7 (EHT) mesh support

  Netfilter:

   - Add nf_tables 'brouting' support, to force a packet to be routed
     instead of being bridged

   - Update bridge netfilter and ovs conntrack helpers to handle IPv6
     Jumbo packets properly, i.e. fetch the packet length from
     hop-by-hop extension header. This is needed for BIT TCP support

   - The iptables 32bit compat interface isn't compiled in by default
     anymore

   - Move ip(6)tables builtin icmp matches to the udptcp one. This has
     the advantage that icmp/icmpv6 match doesn't load the
     iptables/ip6tables modules anymore when iptables-nft is used

   - Extended netlink error report for netdevice in flowtables and
     netdev/chains. Allow for incrementally add/delete devices to netdev
     basechain. Allow to create netdev chain without device

  Driver API:

   - Remove redundant Device Control Error Reporting Enable, as PCI core
     has already error reporting enabled at enumeration time

   - Move Multicast DB netlink handlers to core, allowing devices other
     then bridge to use them

   - Allow the page_pool to directly recycle the pages from safely
     localized NAPI

   - Implement lockless TX queue stop/wake combo macros, allowing for
     further code de-duplication and sanitization

   - Add YNL support for user headers and struct attrs

   - Add partial YNL specification for devlink

   - Add partial YNL specification for ethtool

   - Add tc-mqprio and tc-taprio support for preemptible traffic classes

   - Add tx push buf len param to ethtool, specifies the maximum number
     of bytes of a transmitted packet a driver can push directly to the
     underlying device

   - Add basic LED support for switch/phy

   - Add NAPI documentation, stop relaying on external links

   - Convert dsa_master_ioctl() to netdev notifier. This is a
     preparatory work to make the hardware timestamping layer selectable
     by user space

   - Add transceiver support and improve the error messages for CAN-FD
     controllers

  New hardware / drivers:

   - Ethernet:
      - AMD/Pensando core device support
      - MediaTek MT7981 SoC
      - MediaTek MT7988 SoC
      - Broadcom BCM53134 embedded switch
      - Texas Instruments CPSW9G ethernet switch
      - Qualcomm EMAC3 DWMAC ethernet
      - StarFive JH7110 SoC
      - NXP CBTX ethernet PHY

   - WiFi:
      - Apple M1 Pro/Max devices
      - RealTek rtl8710bu/rtl8188gu
      - RealTek rtl8822bs, rtl8822cs and rtl8821cs SDIO chipset

   - Bluetooth:
      - Realtek RTL8821CS, RTL8851B, RTL8852BS
      - Mediatek MT7663, MT7922
      - NXP w8997
      - Actions Semi ATS2851
      - QTI WCN6855
      - Marvell 88W8997

   - Can:
      - STMicroelectronics bxcan stm32f429

  Drivers:

   - Ethernet NICs:
      - Intel (1G, icg):
         - add tracking and reporting of QBV config errors
         - add support for configuring max SDU for each Tx queue
      - Intel (100G, ice):
         - refactor mailbox overflow detection to support Scalable IOV
         - GNSS interface optimization
      - Intel (i40e):
         - support XDP multi-buffer
      - nVidia/Mellanox:
         - add the support for linux bridge multicast offload
         - enable TC offload for egress and engress MACVLAN over bond
         - add support for VxLAN GBP encap/decap flows offload
         - extend packet offload to fully support libreswan
         - support tunnel mode in mlx5 IPsec packet offload
         - extend XDP multi-buffer support
         - support MACsec VLAN offload
         - add support for dynamic msix vectors allocation
         - drop RX page_cache and fully use page_pool
         - implement thermal zone to report NIC temperature
      - Netronome/Corigine:
         - add support for multi-zone conntrack offload
      - Solarflare/Xilinx:
         - support offloading TC VLAN push/pop actions to the MAE
         - support TC decap rules
         - support unicast PTP

   - Other NICs:
      - Broadcom (bnxt): enforce software based freq adjustments only on
        shared PHC NIC
      - RealTek (r8169): refactor to addess ASPM issues during NAPI poll
      - Micrel (lan8841): add support for PTP_PF_PEROUT
      - Cadence (macb): enable PTP unicast
      - Engleder (tsnep): add XDP socket zero-copy support
      - virtio-net: implement exact header length guest feature
      - veth: add page_pool support for page recycling
      - vxlan: add MDB data path support
      - gve: add XDP support for GQI-QPL format
      - geneve: accept every ethertype
      - macvlan: allow some packets to bypass broadcast queue
      - mana: add support for jumbo frame

   - Ethernet high-speed switches:
      - Microchip (sparx5): Add support for TC flower templates

   - Ethernet embedded switches:
      - Broadcom (b54):
         - configure 6318 and 63268 RGMII ports
      - Marvell (mv88e6xxx):
         - faster C45 bus scan
      - Microchip:
         - lan966x:
            - add support for IS1 VCAP
            - better TX/RX from/to CPU performances
         - ksz9477: add ETS Qdisc support
         - ksz8: enhance static MAC table operations and error handling
         - sama7g5: add PTP capability
      - NXP (ocelot):
         - add support for external ports
         - add support for preemptible traffic classes
      - Texas Instruments:
         - add CPSWxG SGMII support for J7200 and J721E

   - Intel WiFi (iwlwifi):
      - preparation for Wi-Fi 7 EHT and multi-link support
      - EHT (Wi-Fi 7) sniffer support
      - hardware timestamping support for some devices/firwmares
      - TX beacon protection on newer hardware

   - Qualcomm 802.11ax WiFi (ath11k):
      - MU-MIMO parameters support
      - ack signal support for management packets

   - RealTek WiFi (rtw88):
      - SDIO bus support
      - better support for some SDIO devices (e.g. MAC address from
        efuse)

   - RealTek WiFi (rtw89):
      - HW scan support for 8852b
      - better support for 6 GHz scanning
      - support for various newer firmware APIs
      - framework firmware backwards compatibility

   - MediaTek WiFi (mt76):
      - P2P support
      - mesh A-MSDU support
      - EHT (Wi-Fi 7) support
      - coredump support"

* tag 'net-next-6.4' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (2078 commits)
  net: phy: hide the PHYLIB_LEDS knob
  net: phy: marvell-88x2222: remove unnecessary (void*) conversions
  tcp/udp: Fix memleaks of sk and zerocopy skbs with TX timestamp.
  net: amd: Fix link leak when verifying config failed
  net: phy: marvell: Fix inconsistent indenting in led_blink_set
  lan966x: Don't use xdp_frame when action is XDP_TX
  tsnep: Add XDP socket zero-copy TX support
  tsnep: Add XDP socket zero-copy RX support
  tsnep: Move skb receive action to separate function
  tsnep: Add functions for queue enable/disable
  tsnep: Rework TX/RX queue initialization
  tsnep: Replace modulo operation with mask
  net: phy: dp83867: Add led_brightness_set support
  net: phy: Fix reading LED reg property
  drivers: nfc: nfcsim: remove return value check of `dev_dir`
  net: phy: dp83867: Remove unnecessary (void*) conversions
  net: ethtool: coalesce: try to make user settings stick twice
  net: mana: Check if netdev/napi_alloc_frag returns single page
  net: mana: Rename mana_refill_rxoob and remove some empty lines
  net: veth: add page_pool stats
  ...
2023-04-26 16:07:23 -07:00
Linus Torvalds
9dd6956b38 for-6.4/block-2023-04-21
-----BEGIN PGP SIGNATURE-----
 
 iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAmRCvcIQHGF4Ym9lQGtl
 cm5lbC5kawAKCRD301j7KXHgpk+JEACj01t7Xen2+Razagu3aTx9tmRGFnTNR3MY
 raFG6B1TADk1TgCWWa2C4Dj67SOispPLm8hbIcOxqB1UscDWCCwjmnr/debADFzW
 Ap6shv/IRwVGmDp+F7ocYas0ynwooOJg4WJTwkSKz2o4m4p3vzlwAKi4fLiSjbXp
 gJTrA7WEvDOVjzajlTFUtjr8rc6PdunbGm25cPIufAxUEhvttYex2VbVqjDmfNsE
 8tyyk9RWbe4AY/ZYaGXVn4yQ/CgL/sXFkVc5noRXNfAQ/K3CVLQrFLJ3JlwUHpiA
 xXBor21TUWCZEo33Y2G5NConAYqE7etoPTkaTDO3/aZ+dAMFyhC/WAYLz1KZGMh1
 +g1fDX1QKEd40H2lfDXvqF1ob7Ut8EzUx+gvBXcc3/AiRpJ5rjfOcj6LPUMUqQJk
 nucLLFTiMKecnDMBERbvixqbaTyrjvkFEj2wYJvgj1LKXAd+x/bj8SGajs9r88Nb
 9YT9ai/+Yl7Ppfb67rCgXJU7oNZQSAQ2H+X/l2jbiqImOgq1u/45AmINnbanS7HH
 Y1I8pbH45AcnCgkJRoQwrNX3BnTOTBJ+D/4Fl4b8jsihq0D3UtwCwPCObHP4LW9S
 MUNPhP3tUuYsAgXqX80+Sao6SYvXDwnbWOM+LOaaZXgjb1ndwDUZXpto8Ra8WB1u
 8kM6s6ZR7g==
 =W1Zb
 -----END PGP SIGNATURE-----

Merge tag 'for-6.4/block-2023-04-21' of git://git.kernel.dk/linux

Pull block updates from Jens Axboe:

 - drbd patches, bringing us closer to unifying the out-of-tree version
   and the in tree one (Andreas, Christoph)

 - support for auto-quiesce for the s390 dasd driver (Stefan)

 - MD pull request via Song:
      - md/bitmap: Optimal last page size (Jon Derrick)
      - Various raid10 fixes (Yu Kuai, Li Nan)
      - md: add error_handlers for raid0 and linear (Mariusz Tkaczyk)

 - NVMe pull request via Christoph:
      - Drop redundant pci_enable_pcie_error_reporting (Bjorn Helgaas)
      - Validate nvmet module parameters (Chaitanya Kulkarni)
      - Fence TCP socket on receive error (Chris Leech)
      - Fix async event trace event (Keith Busch)
      - Minor cleanups (Chaitanya Kulkarni, zhenwei pi)
      - Fix and cleanup nvmet Identify handling (Damien Le Moal,
        Christoph Hellwig)
      - Fix double blk_mq_complete_request race in the timeout handler
        (Lei Yin)
      - Fix irq locking in nvme-fcloop (Ming Lei)
      - Remove queue mapping helper for rdma devices (Sagi Grimberg)

 - use structured request attribute checks for nbd (Jakub)

 - fix blk-crypto race conditions between keyslot management (Eric)

 - add sed-opal support for reading read locking range attributes
   (Ondrej)

 - make fault injection configurable for null_blk (Akinobu)

 - clean up the request insertion API (Christoph)

 - clean up the queue running API (Christoph)

 - blkg config helper cleanups (Tejun)

 - lazy init support for blk-iolatency (Tejun)

 - various fixes and tweaks to ublk (Ming)

 - remove hybrid polling. It hasn't really been useful since we got
   async polled IO support, and these days we don't support sync polled
   IO at all (Keith)

 - misc fixes, cleanups, improvements (Zhong, Ondrej, Colin, Chengming,
   Chaitanya, me)

* tag 'for-6.4/block-2023-04-21' of git://git.kernel.dk/linux: (118 commits)
  nbd: fix incomplete validation of ioctl arg
  ublk: don't return 0 in case of any failure
  sed-opal: geometry feature reporting command
  null_blk: Always check queue mode setting from configfs
  block: ublk: switch to ioctl command encoding
  blk-mq: fix the blk_mq_add_to_requeue_list call in blk_kick_flush
  block, bfq: Fix division by zero error on zero wsum
  fault-inject: fix build error when FAULT_INJECTION_CONFIGFS=y and CONFIGFS_FS=m
  block: store bdev->bd_disk->fops->submit_bio state in bdev
  block: re-arrange the struct block_device fields for better layout
  md/raid5: remove unused working_disks variable
  md/raid10: don't call bio_start_io_acct twice for bio which experienced read error
  md/raid10: fix memleak of md thread
  md/raid10: fix memleak for 'conf->bio_split'
  md/raid10: fix leak of 'r10bio->remaining' for recovery
  md/raid10: don't BUG_ON() in raise_barrier()
  md: fix soft lockup in status_resync
  md: add error_handlers for raid0 and linear
  md: Use optimal I/O size for last bitmap page
  md: Fix types in sb writer
  ...
2023-04-26 12:52:58 -07:00
Linus Torvalds
85d7ab2463 for-6.4-tag
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEE8rQSAMVO+zA4DBdWxWXV+ddtWDsFAmRHC3gACgkQxWXV+ddt
 WDvI/A//ZzREEE0wNexbuidoTacDVXVJ6LBb2K1eP+HUKfsmd6GYWQDJ9x/ExpKb
 T1ehLibCYWLeYxEREFbjXI3x9G8mrvLzvzsqXs/MzJPkmEF1igPddFztidBwvLQH
 ey/Bh+cra2bpVhRhkX0Cf09/q/YWp17/d14ZxxW60PMfyhx8RWXejXhHkulOPVv8
 +3FL8E0kc2Zjx9ioUwOy/i18LR6YzsCNVXoHzUZuWyWM4A7NG2TZR6FhuLSjlWSZ
 3RAnROwr+8i5nR0xchcyYaVMO2LMbqH6mBtHnXCtxCr+4pFrfrvKym+CQco/Xriz
 v1y/xDc23XeYXLCVhb0beJ6uRcjaM9+gvDF1oVBSJEv6V7sQr/tEGo/8QRehfEfT
 FTro7Lf89R1GOa1IBSkv/T5S25d9LlIID3/g7PbcUBtXNKvLAjDAGTH9bzL4HS5x
 /MKwN80GvaGs1KyEfUndbVPIpAwNFDYZPHM7nw1x+JTkIBcHgfjRyAMAC9jrJd0D
 730W04c+0nXZtQGtKKsxc3U8y4ewzSJAKx9t7Vgo7+1P6dSRnzvJee3x/5kXV9Yn
 MhxxzYDfIN9EcWbASdSm11gY5WZdG3an609pO7nc1T2K4Tuo0SPs4xOR7c3xuZrY
 MN5z3QFWyI2ustUuTG+nsd5J81j76DEmj5ymWQfG3SBplTneDM0=
 =Jt7p
 -----END PGP SIGNATURE-----

Merge tag 'for-6.4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux

Pull btrfs updates from David Sterba:
 "Mostly core changes and cleanups, some notable fixes and two
  performance improvements in directory logging.

  The IO path cleanups are removing or refactoring old code, scrub main
  loop has been completely rewritten also refactoring old code.

  There are some changes to non-btrfs code, mostly trivial, the cgroup
  punt bio logic is only moved from generic code.

  Performance improvements:

   - improve logging changes in a directory during one transaction,
     avoid iterating over items and reduce lock contention (fsync time
     4x lower)

   - when logging directory entries during one transaction, reduce
     locking of subvolume trees by checking tree-log instead
     (improvement in throughput and latency for concurrent access to a
     subvolume)

  Notable fixes:

   - dev-replace:
      - properly honor read mode when requested to avoid reading from
        source device
      - target device won't be used for eventual read repair, this is
        unreliable for NODATASUM files
      - when there are unpaired (and unrepairable) metadata during
        replace, exit early with error and don't try to finish whole
        operation

   - scrub ioctl properly rejects unknown flags

   - fix global block reserve calculations

   - fix partial direct io write when there's a page fault in the
     middle, iomap will try to continue with partial request but the
     btrfs part did not match that, this can lead to zeros written
     instead of data

  Core changes:

   - io path:
      - continued cleanups and refactoring around bio handling
      - extent io submit path simplifications and cleanups
      - flush write path simplifications and cleanups
      - rework logic of passing sync mode of bio, with further cleanups

   - rewrite scrub code flow, restructure how the stripes are enumerated
     and verified in a more unified way

   - allow to set lower threshold for block group reclaim in debug mode
     to aid zoned mode testing

   - remove obsolete time-based delayed ref throttling logic when
     truncating items

   - DREW locks are not using percpu variables anymore

   - more warning fixes (-Wmaybe-uninitialized)

   - u64 division simplifications

   - error handling improvements

  Non-btrfs code changes:

   - push cgroup punt bio logic to btrfs code (there was no other user
     of that), the functionality can be now selected separately by
     BLK_CGROUP_PUNT_BIO

   - crc32c_impl removed after removing last uses in btrfs code

   - add btrfs_assertfail() to objtool table"

* tag 'for-6.4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: (147 commits)
  btrfs: mark btrfs_assertfail() __noreturn
  btrfs: fix uninitialized variable warnings
  btrfs: use log root when iterating over index keys when logging directory
  btrfs: avoid iterating over all indexes when logging directory
  btrfs: dev-replace: error out if we have unrepaired metadata error during
  btrfs: remove pointless loop at btrfs_get_next_valid_item()
  btrfs: scrub: reject unsupported scrub flags
  btrfs: reinterpret async discard iops_limit=0 as no delay
  btrfs: set default discard iops_limit to 1000
  btrfs: remove unused raid56 functions which were dedicated for scrub
  btrfs: scrub: remove scrub_bio structure
  btrfs: scrub: remove scrub_block and scrub_sector structures
  btrfs: scrub: remove the old scrub recheck code
  btrfs: scrub: remove the old writeback infrastructure
  btrfs: scrub: remove scrub_parity structure
  btrfs: scrub: use scrub_stripe to implement RAID56 P/Q scrub
  btrfs: scrub: switch scrub_simple_mirror() to scrub_stripe infrastructure
  btrfs: scrub: introduce helper to queue a stripe for scrub
  btrfs: scrub: introduce error reporting functionality for scrub_stripe
  btrfs: scrub: introduce a writeback helper for scrub_stripe
  ...
2023-04-26 09:13:44 -07:00
Linus Torvalds
733f7e9c18 This update includes the following changes:
API:
 
 - Total usage stats now include all that returned error (instead of some).
 - Remove maximum hash statesize limit.
 - Add cloning support for hmac and unkeyed hashes.
 - Demote BUG_ON in crypto_unregister_alg to a WARN_ON.
 
 Algorithms:
 
 - Use RIP-relative addressing on x86 to prepare for PIE build.
 - Add accelerated AES/GCM stitched implementation on powerpc P10.
 - Add some test vectors for cmac(camellia).
 - Remove failure case where jent is unavailable outside of FIPS mode in drbg.
 - Add permanent and intermittent health error checks in jitter RNG.
 
 Drivers:
 
 - Add support for 402xx devices in qat.
 - Add support for HiSTB TRNG.
 - Fix hash concurrency issues in stm32.
 - Add OP-TEE firmware support in caam.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEn51F/lCuNhUwmDeSxycdCkmxi6cFAmRGCjcACgkQxycdCkmx
 i6d6JA//ZmwgEqAKA8qWpHnNKZylTLqFhLxnKZwr4Hhp1KzManh/T9pepXiD2zAY
 D92wU60v0hfGAazeUWQRmrIZxcjyd3b3Tr7WiFuNoZbkPsuXWZAoz8iHgMq69dqb
 DXZhKJnlmVlcr+qTSk9MP8HODL5kU6Ug2pk+r8hL/WsBI+JGfZEXKcJhhMqYLYls
 nl+NN4fkE5tgcTh2lp/9dQsQRylhESZuqb8L2wItQmripSbhPGwYf24I7B7xcGrn
 o7X4XG//cQO6zQErgnOJOosIgJEEynW27CN4ZiHB8WhRAk0YLXydQBs6EjZgNA8H
 EvZC/bIx2YOt8ngG99q4kRg4OgKp4c7UnV6l1pxuJWbIyXrFh4djxHdq9pTYr3UB
 P3pVEX38Wu7U5Tfgy3y1QqZzsvrPjmnI3NQ8QBrcFzNRDan5K6nH4kQyk9Cv7LQm
 GlE1JOThU5U2G33ZWKCluJUjVUCRceMWQYla1X5R4uWMCwSqRMpmx8Ib9QvbYlWe
 iUI+RatLnlIobx+lgaC8mtij9dQddFjk6YwFYhQcD3Bl30DhTeIlbnOUY9YOTXps
 H6V9X2inVUjyZr1uJ4a7rPdCUuzQxR6HWPyp6fXMlbLrEhL8e6c4/QbEoTubRQeS
 WTtoIFt4ezd2SG6hI6dTCscgFc5EAyEMDD5GtQmJeyozu0Gqtpo=
 =ITkW
 -----END PGP SIGNATURE-----

Merge tag 'v6.4-p1' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6

Pull crypto updates from Herbert Xu:
 "API:
   - Total usage stats now include all that returned errors (instead of
     just some)
   - Remove maximum hash statesize limit
   - Add cloning support for hmac and unkeyed hashes
   - Demote BUG_ON in crypto_unregister_alg to a WARN_ON

  Algorithms:
   - Use RIP-relative addressing on x86 to prepare for PIE build
   - Add accelerated AES/GCM stitched implementation on powerpc P10
   - Add some test vectors for cmac(camellia)
   - Remove failure case where jent is unavailable outside of FIPS mode
     in drbg
   - Add permanent and intermittent health error checks in jitter RNG

  Drivers:
   - Add support for 402xx devices in qat
   - Add support for HiSTB TRNG
   - Fix hash concurrency issues in stm32
   - Add OP-TEE firmware support in caam"

* tag 'v6.4-p1' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (139 commits)
  i2c: designware: Add doorbell support for Mendocino
  i2c: designware: Use PCI PSP driver for communication
  powerpc: Move Power10 feature PPC_MODULE_FEATURE_P10
  crypto: p10-aes-gcm - Remove POWER10_CPU dependency
  crypto: testmgr - Add some test vectors for cmac(camellia)
  crypto: cryptd - Add support for cloning hashes
  crypto: cryptd - Convert hash to use modern init_tfm/exit_tfm
  crypto: hmac - Add support for cloning
  crypto: hash - Add crypto_clone_ahash/shash
  crypto: api - Add crypto_clone_tfm
  crypto: api - Add crypto_tfm_get
  crypto: x86/sha - Use local .L symbols for code
  crypto: x86/crc32 - Use local .L symbols for code
  crypto: x86/aesni - Use local .L symbols for code
  crypto: x86/sha256 - Use RIP-relative addressing
  crypto: x86/ghash - Use RIP-relative addressing
  crypto: x86/des3 - Use RIP-relative addressing
  crypto: x86/crc32c - Use RIP-relative addressing
  crypto: x86/cast6 - Use RIP-relative addressing
  crypto: x86/cast5 - Use RIP-relative addressing
  ...
2023-04-26 08:32:52 -07:00
Sergey Senozhatsky
96928d9032 seq_buf: Add seq_buf_do_printk() helper
Sometimes we use seq_buf to format a string buffer, which
we then pass to printk(). However, in certain situations
the seq_buf string buffer can get too big, exceeding the
PRINTKRB_RECORD_MAX bytes limit, and causing printk() to
truncate the string.

Add a new seq_buf helper. This helper prints the seq_buf
string buffer line by line, using \n as a delimiter,
rather than passing the whole string buffer to printk()
at once.

Link: https://lkml.kernel.org/r/20230415100110.1419872-1-senozhatsky@chromium.org

Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Sergey Senozhatsky <senozhatsky@chromium.org>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Tested-by: Yosry Ahmed <yosryahmed@google.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2023-04-25 21:03:14 -04:00
Linus Torvalds
7ec85f3e08 printk changes for 6.4
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEESH4wyp42V4tXvYsjUqAMR0iAlPIFAmRGaygACgkQUqAMR0iA
 lPKlGBAAqn0yS8E2CP16Oo8nCB5AjoPVzohh6pQ6O8G0CFhvu47EKVTHPTa1BEFE
 YAz94geN5crpAmEcQyBcqkcJuLRXmYBOqE1x9M4PcCUUXTjcyYEzBYsOZO+j5jB7
 LUPX6jBbm2PpbT/e1ZSr90R8MhblVfBTD7DJHmXGhibYHj5D4KOwxQnhx8uWz9aT
 dgTWm1AgwEX85wUpXil5phD+YnvI/TxGlyV4AVOYh3y3K7Kc4CAeHFzCsg3h/Amr
 c2RR1dzvmMcEvg8lF3U9MsnVNF/2i0Tg9BXLRxSe1c20CKhtzNNPH5krPa3vHGeP
 P//FWDAd9S2hev54TN7LO92V+IsDh8nlU++HwRua50wflzJU/tkyWDtcmmlkGU6A
 hqtMUWE4libAaAW7FBJomRFirmEtEA4GwXN5WH3+B6htgVwKKrKhL9U/PtQtZxZ1
 GUEvtjmnBIfGndu7fHv70a1sLc9LuebOfmOQs3W6p6KUZkmL1Hqg1WGQoYwmUz4A
 bZRbCwMYNJCG4iO2jDmPU27D6tWMbQdt1kZ20svP6p3PRGy8EuI1C5tnO5Jhkw3E
 FCFudMMZEuZmBoztWWqEkZSfbMDlH6kc1+6+HMuCfSrpg6QD87TzO5CONIHCZyk9
 f3UD04R//BubTdiKQ4y/g6OwctihX7F8i3O71hTj5etuYqPs0nI=
 =t0d6
 -----END PGP SIGNATURE-----

Merge tag 'printk-for-6.4' of git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux

Pull printk updates from Petr Mladek:

 - Code cleanup and dead code removal

* tag 'printk-for-6.4' of git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux:
  printk: Remove obsoleted check for non-existent "user" object
  lib/vsprintf: Use isodigit() for the octal number check
  Remove orphaned CONFIG_PRINTK_SAFE_LOG_BUF_SHIFT
2023-04-25 12:46:48 -07:00
Linus Torvalds
53b5e72b9d asm-generic updates for 6.4
These are various cleanups, fixing a number of uapi header files to no
 longer reference CONFIG_* symbols, and one patch that introduces the
 new CONFIG_HAS_IOPORT symbol for architectures that provide working
 inb()/outb() macros, as a preparation for adding driver dependencies
 on those in the following release.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEiK/NIGsWEZVxh/FrYKtH/8kJUicFAmRG8IkACgkQYKtH/8kJ
 Uid15Q/9E/neIIEqEk6IvtyhUicrJiIZUM0rGoYtWXiz75ggk6Kx9+3I+j8zIQ/E
 kf2TzAG7q9Md7nfTDFLr4FSr0IcNDj+VG4nYxUyDHdKGcARO+g9Kpdvscxip3lgU
 Rw5w74Gyd30u4iUKGS39OYuxcCgl9LaFjMA9Gh402Oiaoh+OYLmgQS9h/goUD5KN
 Nd+AoFvkdbnHl0/SpxthLRyL5rFEATBmAY7apYViPyMvfjS3gfDJwXJR9jkKgi6X
 Qs4t8Op8BA3h84dCuo6VcFqgAJs2Wiq3nyTSUnkF8NxJ2RFTpeiVgfsLOzXHeDgz
 SKDB4Lp14o3mlyZyj00MWq1uMJRRetUgNiVb6iHOoKQ/E4demBdh+mhIFRybjM5B
 XNTWFcg9PWFCMa4W9jnLfZBc881X4+7T+qUF8I0W/1AbRJUmyGj8HO6jLceC4yGD
 UYLn5oFPM6OWXHp6DqJrCr9Yw8h6fuviQZFEbl/ARlgVGt+J4KbYweJYk8DzfX6t
 PZIj8LskOqyIpRuC2oDA1PHxkaJ1/z+N5oRBHq1uicSh4fxY5HW7HnyzgF08+R3k
 cf+fjAhC3TfGusHkBwQKQJvpxrxZjPuvYXDZ0GxTvNKJRB8eMeiTm1n41E5oTVwQ
 swSblSCjZj/fMVVPXLcjxEW4SBNWRxa9Lz3tIPXb3RheU10Lfy8=
 =H3k4
 -----END PGP SIGNATURE-----

Merge tag 'asm-generic-6.4' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic

Pull asm-generic updates from Arnd Bergmann:
 "These are various cleanups, fixing a number of uapi header files to no
  longer reference CONFIG_* symbols, and one patch that introduces the
  new CONFIG_HAS_IOPORT symbol for architectures that provide working
  inb()/outb() macros, as a preparation for adding driver dependencies
  on those in the following release"

* tag 'asm-generic-6.4' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic:
  Kconfig: introduce HAS_IOPORT option and select it as necessary
  scripts: Update the CONFIG_* ignore list in headers_install.sh
  pktcdvd: Remove CONFIG_CDROM_PKTCDVD_WCACHE from uapi header
  Move bp_type_idx to include/linux/hw_breakpoint.h
  Move ep_take_care_of_epollwakeup() to fs/eventpoll.c
  Move COMPAT_ATM_ADDPARTY to net/atm/svc.c
2023-04-25 12:22:11 -07:00
Linus Torvalds
e7989789c6 Timers and timekeeping updates:
- Improve the VDSO build time checks to cover all dynamic relocations
 
     VDSO does not allow dynamic relcations, but the build time check is
     incomplete and fragile.
 
     It's based on architectures specifying the relocation types to search
     for and does not handle R_*_NONE relocation entries correctly.
     R_*_NONE relocations are injected by some GNU ld variants if they fail
     to determine the exact .rel[a]/dyn_size to cover trailing zeros.
     R_*_NONE relocations must be ignored by dynamic loaders, so they
     should be ignored in the build time check too.
 
     Remove the architecture specific relocation types to check for and
     validate strictly that no other relocations than R_*_NONE end up
     in the VSDO .so file.
 
   - Prefer signal delivery to the current thread for
     CLOCK_PROCESS_CPUTIME_ID based posix-timers
 
     Such timers prefer to deliver the signal to the main thread of a
     process even if the context in which the timer expires is the current
     task. This has the downside that it might wake up an idle thread.
 
     As there is no requirement or guarantee that the signal has to be
     delivered to the main thread, avoid this by preferring the current
     task if it is part of the thread group which shares sighand.
 
     This not only avoids waking idle threads, it also distributes the
     signal delivery in case of multiple timers firing in the context
     of different threads close to each other better.
 
   - Align the tick period properly (again)
 
     For a long time the tick was starting at CLOCK_MONOTONIC zero, which
     allowed users space applications to either align with the tick or to
     place a periodic computation so that it does not interfere with the
     tick. The alignement of the tick period was more by chance than by
     intention as the tick is set up before a high resolution clocksource is
     installed, i.e. timekeeping is still tick based and the tick period
     advances from there.
 
     The early enablement of sched_clock() broke this alignement as the time
     accumulated by sched_clock() is taken into account when timekeeping is
     initialized. So the base value now(CLOCK_MONOTONIC) is not longer a
     multiple of tick periods, which breaks applications which relied on
     that behaviour.
 
     Cure this by aligning the tick starting point to the next multiple of
     tick periods, i.e 1000ms/CONFIG_HZ.
 
  - A set of NOHZ fixes and enhancements
 
    - Cure the concurrent writer race for idle and IO sleeptime statistics
 
      The statitic values which are exposed via /proc/stat are updated from
      the CPU local idle exit and remotely by cpufreq, but that happens
      without any form of serialization. As a consequence sleeptimes can be
      accounted twice or worse.
 
      Prevent this by restricting the accumulation writeback to the CPU
      local idle exit and let the remote access compute the accumulated
      value.
 
    - Protect idle/iowait sleep time with a sequence count
 
      Reading idle/iowait sleep time, e.g. from /proc/stat, can race with
      idle exit updates. As a consequence the readout may result in random
      and potentially going backwards values.
 
      Protect this by a sequence count, which fixes the idle time
      statistics issue, but cannot fix the iowait time problem because
      iowait time accounting races with remote wake ups decrementing the
      remote runqueues nr_iowait counter. The latter is impossible to fix,
      so the only way to deal with that is to document it properly and to
      remove the assertion in the selftest which triggers occasionally due
      to that.
 
    - Restructure struct tick_sched for better cache layout
 
    - Some small cleanups and a better cache layout for struct tick_sched
 
  - Implement the missing timer_wait_running() callback for POSIX CPU timers
 
    For unknown reason the introduction of the timer_wait_running() callback
    missed to fixup posix CPU timers, which went unnoticed for almost four
    years.
 
    While initially only targeted to prevent livelocks between a timer
    deletion and the timer expiry function on PREEMPT_RT enabled kernels, it
    turned out that fixing this for mainline is not as trivial as just
    implementing a stub similar to the hrtimer/timer callbacks.
 
    The reason is that for CONFIG_POSIX_CPU_TIMERS_TASK_WORK enabled systems
    there is a livelock issue independent of RT.
 
    CONFIG_POSIX_CPU_TIMERS_TASK_WORK=y moves the expiry of POSIX CPU timers
    out from hard interrupt context to task work, which is handled before
    returning to user space or to a VM. The expiry mechanism moves the
    expired timers to a stack local list head with sighand lock held. Once
    sighand is dropped the task can be preempted and a task which wants to
    delete a timer will spin-wait until the expiry task is scheduled back
    in. In the worst case this will end up in a livelock when the preempting
    task and the expiry task are pinned on the same CPU.
 
    The timer wheel has a timer_wait_running() mechanism for RT, which uses
    a per CPU timer-base expiry lock which is held by the expiry code and the
    task waiting for the timer function to complete blocks on that lock.
 
    This does not work in the same way for posix CPU timers as there is no
    timer base and expiry for process wide timers can run on any task
    belonging to that process, but the concept of waiting on an expiry lock
    can be used too in a slightly different way.
 
    Add a per task mutex to struct posix_cputimers_work, let the expiry task
    hold it accross the expiry function and let the deleting task which
    waits for the expiry to complete block on the mutex.
 
    In the non-contended case this results in an extra mutex_lock()/unlock()
    pair on both sides.
 
    This avoids spin-waiting on a task which is scheduled out, prevents the
    livelock and cures the problem for RT and !RT systems.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAmRGrj4THHRnbHhAbGlu
 dXRyb25peC5kZQAKCRCmGPVMDXSYoZhdEAC/lwfDWCnTXHC8ExQQRDIVNyXmDlLb
 EHB8ZY7Wc4gNZ8UEXEOLOXJHMG9bsbtPGctVewJwRGnXZWKVhpPwQba6kCRycyX0
 0J6l5DlvUaGGrpoOzOZwgETRmtIZE9tEArZR8xlfRScYd93a7yLhwIjO8JaV9vKs
 IQpAQMeJ/ysp6gHrS59qakYfoHU/ERUAu3Tk4GqHUtPtcyz3nX3eTlLWV8LySqs+
 00qr2yc0bQFUFoKzTCxtM8lcEi9ja9SOj1rw28348O+BXE4d0HC12Ie7eU/CDN2Y
 OAlWYxVjy4LMh24LDrRQKTzoVqx9MXDx2g+09B3t8NK5LgeS+EJIjujDhZF147/H
 5y906nplZUKa8BiZW5Rpm/HKH8tFI80T9XWSQCRBeMgTEJyRyRU1yASAwO4xw+dY
 Dn3tGmFGymcV/72o4ic9JFKQd8cTSxPjEJS3qqzMkEAtyI/zPBmKxj/Tce50OH40
 6FSZq1uU21ZQzszwSHISwgFtNr75laUSK4Z1te5OhPOOz+C7O9YqHvqS/1jwhPj2
 tMd8X17fRW3UTUBlBj+zqxqiEGBl/Yk2AvKrJIXGUtfWYCtjMJ7ieCf0kZ7NSVJx
 9ewubA0gqseMD783YomZsy8LLtMKnhclJeslUOVb1oKs1q/WF1R/k6qjy9vUwYaB
 nIJuHl8mxSetag==
 =SVnj
 -----END PGP SIGNATURE-----

Merge tag 'timers-core-2023-04-24' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull timers and timekeeping updates from Thomas Gleixner:

 - Improve the VDSO build time checks to cover all dynamic relocations

   VDSO does not allow dynamic relocations, but the build time check is
   incomplete and fragile.

   It's based on architectures specifying the relocation types to search
   for and does not handle R_*_NONE relocation entries correctly.
   R_*_NONE relocations are injected by some GNU ld variants if they
   fail to determine the exact .rel[a]/dyn_size to cover trailing zeros.
   R_*_NONE relocations must be ignored by dynamic loaders, so they
   should be ignored in the build time check too.

   Remove the architecture specific relocation types to check for and
   validate strictly that no other relocations than R_*_NONE end up in
   the VSDO .so file.

 - Prefer signal delivery to the current thread for
   CLOCK_PROCESS_CPUTIME_ID based posix-timers

   Such timers prefer to deliver the signal to the main thread of a
   process even if the context in which the timer expires is the current
   task. This has the downside that it might wake up an idle thread.

   As there is no requirement or guarantee that the signal has to be
   delivered to the main thread, avoid this by preferring the current
   task if it is part of the thread group which shares sighand.

   This not only avoids waking idle threads, it also distributes the
   signal delivery in case of multiple timers firing in the context of
   different threads close to each other better.

 - Align the tick period properly (again)

   For a long time the tick was starting at CLOCK_MONOTONIC zero, which
   allowed users space applications to either align with the tick or to
   place a periodic computation so that it does not interfere with the
   tick. The alignement of the tick period was more by chance than by
   intention as the tick is set up before a high resolution clocksource
   is installed, i.e. timekeeping is still tick based and the tick
   period advances from there.

   The early enablement of sched_clock() broke this alignement as the
   time accumulated by sched_clock() is taken into account when
   timekeeping is initialized. So the base value now(CLOCK_MONOTONIC) is
   not longer a multiple of tick periods, which breaks applications
   which relied on that behaviour.

   Cure this by aligning the tick starting point to the next multiple of
   tick periods, i.e 1000ms/CONFIG_HZ.

 - A set of NOHZ fixes and enhancements:

     * Cure the concurrent writer race for idle and IO sleeptime
       statistics

       The statitic values which are exposed via /proc/stat are updated
       from the CPU local idle exit and remotely by cpufreq, but that
       happens without any form of serialization. As a consequence
       sleeptimes can be accounted twice or worse.

       Prevent this by restricting the accumulation writeback to the CPU
       local idle exit and let the remote access compute the accumulated
       value.

     * Protect idle/iowait sleep time with a sequence count

       Reading idle/iowait sleep time, e.g. from /proc/stat, can race
       with idle exit updates. As a consequence the readout may result
       in random and potentially going backwards values.

       Protect this by a sequence count, which fixes the idle time
       statistics issue, but cannot fix the iowait time problem because
       iowait time accounting races with remote wake ups decrementing
       the remote runqueues nr_iowait counter. The latter is impossible
       to fix, so the only way to deal with that is to document it
       properly and to remove the assertion in the selftest which
       triggers occasionally due to that.

     * Restructure struct tick_sched for better cache layout

     * Some small cleanups and a better cache layout for struct
       tick_sched

 - Implement the missing timer_wait_running() callback for POSIX CPU
   timers

   For unknown reason the introduction of the timer_wait_running()
   callback missed to fixup posix CPU timers, which went unnoticed for
   almost four years.

   While initially only targeted to prevent livelocks between a timer
   deletion and the timer expiry function on PREEMPT_RT enabled kernels,
   it turned out that fixing this for mainline is not as trivial as just
   implementing a stub similar to the hrtimer/timer callbacks.

   The reason is that for CONFIG_POSIX_CPU_TIMERS_TASK_WORK enabled
   systems there is a livelock issue independent of RT.

   CONFIG_POSIX_CPU_TIMERS_TASK_WORK=y moves the expiry of POSIX CPU
   timers out from hard interrupt context to task work, which is handled
   before returning to user space or to a VM. The expiry mechanism moves
   the expired timers to a stack local list head with sighand lock held.
   Once sighand is dropped the task can be preempted and a task which
   wants to delete a timer will spin-wait until the expiry task is
   scheduled back in. In the worst case this will end up in a livelock
   when the preempting task and the expiry task are pinned on the same
   CPU.

   The timer wheel has a timer_wait_running() mechanism for RT, which
   uses a per CPU timer-base expiry lock which is held by the expiry
   code and the task waiting for the timer function to complete blocks
   on that lock.

   This does not work in the same way for posix CPU timers as there is
   no timer base and expiry for process wide timers can run on any task
   belonging to that process, but the concept of waiting on an expiry
   lock can be used too in a slightly different way.

   Add a per task mutex to struct posix_cputimers_work, let the expiry
   task hold it accross the expiry function and let the deleting task
   which waits for the expiry to complete block on the mutex.

   In the non-contended case this results in an extra
   mutex_lock()/unlock() pair on both sides.

   This avoids spin-waiting on a task which is scheduled out, prevents
   the livelock and cures the problem for RT and !RT systems

* tag 'timers-core-2023-04-24' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  posix-cpu-timers: Implement the missing timer_wait_running callback
  selftests/proc: Assert clock_gettime(CLOCK_BOOTTIME) VS /proc/uptime monotonicity
  selftests/proc: Remove idle time monotonicity assertions
  MAINTAINERS: Remove stale email address
  timers/nohz: Remove middle-function __tick_nohz_idle_stop_tick()
  timers/nohz: Add a comment about broken iowait counter update race
  timers/nohz: Protect idle/iowait sleep time under seqcount
  timers/nohz: Only ever update sleeptime from idle exit
  timers/nohz: Restructure and reshuffle struct tick_sched
  tick/common: Align tick period with the HZ tick.
  selftests/timers/posix_timers: Test delivery of signals across threads
  posix-timers: Prefer delivery of signals to the current thread
  vdso: Improve cmd_vdso_check to check all dynamic relocations
2023-04-25 11:22:46 -07:00
Linus Torvalds
29e95a4b26 A single update to debugobjects:
Prevent a race vs. statically initialized objects. Such objects are
   usually not initialized via an init() function. They are special cased
   and detected on first use under the assumption that they are already
   correctly initialized via the static initializer.
 
   This works correctly unless there are two concurrent debug object
   operations on such an object.
 
   The first one detects that the object is not yet tracked and tries to
   establish a tracking object after dropping the debug objects hash bucket
   lock. The concurrent operation does the same. The one which wins the
   race ends up modifying the state of the object which makes the other one
   fail resulting in a bogus debug objects warning.
 
   Prevent this by making the detection of a static object and the
   allocation of a tracking object atomic under the hash bucket lock. So the
   first one to acquire the hash bucket lock will succeed and the second one
   will observe the correct tracking state.
 
   This race existed forever but was only exposed when the timer wheel code
   added a debug_object_assert_init() call outside of the timer base locked
   region. This replaced the previous warning about timer::function being
   NULL which had to be removed when the timer_shutdown() mechanics were
   added.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAmRGfx8THHRnbHhAbGlu
 dXRyb25peC5kZQAKCRCmGPVMDXSYod4YD/98pgjxl9zht0tJpjOQv1GHQeKWGOnS
 T2NcK7UBF7IZnGVCaQovs1TLPEiHZMY9TgSmefP9UuYNCUthdzgxUv1hljb915zI
 xcQmqFopUyFF+F+qE7ti1C4HvXzbdss14XK97EcsoooS1ALq5xTkUJEcmdLFRL85
 /ACkHz0/iMEHT9QVX6WoAOptg7HLoscb30CEZGa8skStAIRZfMIFqmN5GXzKUsPH
 oLUldSjoXyqq2ZBu9jiO9GoPmei3VuaZO3qWtN4KYY0C37BvKavgS2N/NsOh7s+0
 I51G5+R8o6kQgr3RSll6frsPcy1EXsgPDZXO5tC1W9bp6+yrQ97ztdG0QS52fcPb
 fcCQtAX3L+K38vf4GfvboDyf7x21leJSYE3u+HCXUlyC2Es8QZgWw4U7Bi8IwSZg
 /BKC6QkQD/YyG/aQyZq6ZGiLgbJt8g53WiR8HGx35P3RUEy5Mit3bBSuq1dSuGR0
 RozFlWswUif3Teticq33MR6Mv9M3866lX4iTMGT50xjJZirb8ongpKkRxIOHVeXV
 4//0V/GOswyTwkY884Q6zJCZZq2FEudn6/Vtjh97zLxvJzLbdIEnEPC5HG75Jed0
 a9NISg+NT9VOx4PLwgMWgW6dlT5SNUeWD4ddC879c4ELbyNd1i4AY54pMrcwEVVj
 fGdL6pFfFzZI5w==
 =19cg
 -----END PGP SIGNATURE-----

Merge tag 'core-debugobjects-2023-04-24' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull core debugobjects update from Thomas Gleixner:
 "A single update to debugobjects:

  Prevent a race vs statically initialized objects. Such objects are
  usually not initialized via an init() function. They are special cased
  and detected on first use under the assumption that they are already
  correctly initialized via the static initializer.

  This works correctly unless there are two concurrent debug object
  operations on such an object.

  The first one detects that the object is not yet tracked and tries to
  establish a tracking object after dropping the debug objects hash
  bucket lock. The concurrent operation does the same. The one which
  wins the race ends up modifying the state of the object which makes
  the other one fail resulting in a bogus debug objects warning.

  Prevent this by making the detection of a static object and the
  allocation of a tracking object atomic under the hash bucket lock. So
  the first one to acquire the hash bucket lock will succeed and the
  second one will observe the correct tracking state.

  This race existed forever but was only exposed when the timer wheel
  code added a debug_object_assert_init() call outside of the timer base
  locked region. This replaced the previous warning about
  timer::function being NULL which had to be removed when the
  timer_shutdown() mechanics were added"

* tag 'core-debugobjects-2023-04-24' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  debugobject: Prevent init race with static objects
2023-04-25 11:00:45 -07:00
Linus Torvalds
1be89faab3 linux-kselftest-kunit-6.4-rc1
This KUnit update Linux 6.4-rc1 consists of:
 
 - several fixes to kunit tool
 - new klist structure test
 - support for m68k under QEMU
 - support for overriding the QEMU serial port
 - support for SH under QEMU
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEPZKym/RZuOCGeA/kCwJExA0NQxwFAmRFYFsACgkQCwJExA0N
 Qxw+PxAA1KHnHool3QbzZouFgLgTS2N/hxsOIoWKeUl6guUPX0XYu67FEIyt7p5k
 a1eFLjt+q4URW/heHKYdffP+Up6xhN5yVP8xJEcbn6GD13lz1clI9RAjObiPOehc
 KOV90PeAEfzosEGRIp97g4Gzu8NUMZqN7BsKBdzYJ4rEftlcjaILBVp4OfSuCyAi
 UbYBdRjK4eIOwGXuHVfhNqzH1HRSbzcoSRTywj5qW0Qhpe6KnZBRuZESXYBsxzGb
 G0nd4+OttjZyplI/xQYwaU0XGAI6roG5G4nAT5YGHLp5g8rTaHetTi+i3iK4iEru
 wEL0NgywkA0ujAge97RldOjtU97vvSFk7FwxdS9lxaMW/Ut2sN72I2ThI8dBvVRZ
 fcw8t8mmT1gUv3SCq+s1X13vz22IedXLOfvOY2o/fLk2zxOw5e8FirAz/aFeOf3K
 ++hK+IQvDmeMMv08bz0ORzdRQcjdwQNQ3klnfdrUVFN9yK+iAllOJ/nrXHLNIXu4
 c3ITlAMldcAf2W+LRWzvqqKyT4H8MCXL3L0bBc1M1reRu9nM89AZedO8MHCB0R9Q
 2ic0rOxIwZzPJuk0qPDxEVmN7Rpyx85I96YOwRemJTEfdkB/ZX+BfOU0KzinOVHC
 3qrHuIw/SyRTlUEDAr53gJ5WHbdjhKAmrd1/FuplyoOSX0w6VVA=
 =COQn
 -----END PGP SIGNATURE-----

Merge tag 'linux-kselftest-kunit-6.4-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest

Pull KUnit updates from Shuah Khan:

 - several fixes to kunit tool

 - new klist structure test

 - support for m68k under QEMU

 - support for overriding the QEMU serial port

 - support for SH under QEMU

* tag 'linux-kselftest-kunit-6.4-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
  kunit: add tests for using current KUnit test field
  kunit: tool: Add support for SH under QEMU
  kunit: tool: Add support for overriding the QEMU serial port
  .gitignore: Unignore .kunitconfig
  list: test: Test the klist structure
  kunit: increase KUNIT_LOG_SIZE to 2048 bytes
  kunit: Use gfp in kunit_alloc_resource() kernel-doc
  kunit: tool: fix pre-existing `mypy --strict` errors and update run_checks.py
  kunit: tool: remove unused imports and variables
  kunit: tool: add subscripts for type annotations where appropriate
  kunit: fix bug of extra newline characters in debugfs logs
  kunit: fix bug in the order of lines in debugfs logs
  kunit: fix bug in debugfs logs of parameterized tests
  kunit: tool: Add support for m68k under QEMU
2023-04-24 12:31:32 -07:00
Linus Torvalds
5dfb75e842 RCU Changes for 6.4:
o  MAINTAINERS files additions and changes.
  o  Fix hotplug warning in nohz code.
  o  Tick dependency changes by Zqiang.
  o  Lazy-RCU shrinker fixes by Zqiang.
  o  rcu-tasks stall reporting improvements by Neeraj.
  o  Initial changes for renaming of k[v]free_rcu() to its new k[v]free_rcu_mightsleep()
     name for robustness.
  o  Documentation Updates:
  o  Significant changes to srcu_struct size.
  o  Deadlock detection for srcu_read_lock() vs synchronize_srcu() from Boqun.
  o  rcutorture and rcu-related tool, which are targeted for v6.4 from Boqun's tree.
  o  Other misc changes.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEcoCIrlGe4gjE06JJqA4nf2o45hAFAmQuBnIACgkQqA4nf2o4
 5hACVRAAoXu7/gfh5Pjw9O4E4pCdPJKsZZVYrcrVGrq6NAxRn6M1SgurAdC5grj2
 96x0waoGaiO82V0H5iJMcKdAVu67x9R8WaQ1JoxN75Efn8h9W4TguB87TV1gk0xS
 eZ18b/CyEaM5mNb80DFFF4FLohy5737p/kNTMqXQdUyR1BsDl16iRMgjiBiFhNUx
 yPo8Y2kC2U2OTbldZgaE7s9bQO3xxEcifx93sGWsAex/gx54FYNisiwSlCOSgOE+
 XkYo/OKk8Xvr82tLVX8XQVEPCMJ+rxea8T5zSs8/alvsPq7gA8wW3y6fsoa3vUU/
 +Gd+W+Q/OsONIDtp8rQAY1qsD0ScDpaR8052RSH0zTa7pj8HsQgE5PjZ+cJW0SEi
 cKN+Oe8+ETqKald+xZ6PDf58O212VLrru3RpQWrOQcJ7fmKmfT4REK0RcbLgg4qT
 CBgOo6eg+ub4pxq2y11LZJBNTv1/S7xAEzFE0kArew64KB2gyVud0VJRZVAJnEfe
 93QQVDFrwK2bhgWQZ6J6IbTvGeQW0L93IibuaU6jhZPR283VtUIIvM7vrOylN7Fq
 4jsae0T7YGYfKUhgTpm7rCnm8A/D3Ni8MY0sKYYgDSyKmZUsnpI5wpx1xke4lwwV
 ErrY46RCFa+k8wscc6iWfB4cGXyyFHyu+wtyg0KpFn5JAzcfz4A=
 =Rgbj
 -----END PGP SIGNATURE-----

Merge tag 'rcu.6.4.april5.2023.3' of git://git.kernel.org/pub/scm/linux/kernel/git/jfern/linux

Pull RCU updates from Joel Fernandes:

 - Updates and additions to MAINTAINERS files, with Boqun being added to
   the RCU entry and Zqiang being added as an RCU reviewer.

   I have also transitioned from reviewer to maintainer; however, Paul
   will be taking over sending RCU pull-requests for the next merge
   window.

 - Resolution of hotplug warning in nohz code, achieved by fixing
   cpu_is_hotpluggable() through interaction with the nohz subsystem.

   Tick dependency modifications by Zqiang, focusing on fixing usage of
   the TICK_DEP_BIT_RCU_EXP bitmask.

 - Avoid needless calls to the rcu-lazy shrinker for CONFIG_RCU_LAZY=n
   kernels, fixed by Zqiang.

 - Improvements to rcu-tasks stall reporting by Neeraj.

 - Initial renaming of k[v]free_rcu() to k[v]free_rcu_mightsleep() for
   increased robustness, affecting several components like mac802154,
   drbd, vmw_vmci, tracing, and more.

   A report by Eric Dumazet showed that the API could be unknowingly
   used in an atomic context, so we'd rather make sure they know what
   they're asking for by being explicit:

      https://lore.kernel.org/all/20221202052847.2623997-1-edumazet@google.com/

 - Documentation updates, including corrections to spelling,
   clarifications in comments, and improvements to the srcu_size_state
   comments.

 - Better srcu_struct cache locality for readers, by adjusting the size
   of srcu_struct in support of SRCU usage by Christoph Hellwig.

 - Teach lockdep to detect deadlocks between srcu_read_lock() vs
   synchronize_srcu() contributed by Boqun.

   Previously lockdep could not detect such deadlocks, now it can.

 - Integration of rcutorture and rcu-related tools, targeted for v6.4
   from Boqun's tree, featuring new SRCU deadlock scenarios, test_nmis
   module parameter, and more

 - Miscellaneous changes, various code cleanups and comment improvements

* tag 'rcu.6.4.april5.2023.3' of git://git.kernel.org/pub/scm/linux/kernel/git/jfern/linux: (71 commits)
  checkpatch: Error out if deprecated RCU API used
  mac802154: Rename kfree_rcu() to kvfree_rcu_mightsleep()
  rcuscale: Rename kfree_rcu() to kfree_rcu_mightsleep()
  ext4/super: Rename kfree_rcu() to kfree_rcu_mightsleep()
  net/mlx5: Rename kfree_rcu() to kfree_rcu_mightsleep()
  net/sysctl: Rename kvfree_rcu() to kvfree_rcu_mightsleep()
  lib/test_vmalloc.c: Rename kvfree_rcu() to kvfree_rcu_mightsleep()
  tracing: Rename kvfree_rcu() to kvfree_rcu_mightsleep()
  misc: vmw_vmci: Rename kvfree_rcu() to kvfree_rcu_mightsleep()
  drbd: Rename kvfree_rcu() to kvfree_rcu_mightsleep()
  rcu: Protect rcu_print_task_exp_stall() ->exp_tasks access
  rcu: Avoid stack overflow due to __rcu_irq_enter_check_tick() being kprobe-ed
  rcu-tasks: Report stalls during synchronize_srcu() in rcu_tasks_postscan()
  rcu: Permit start_poll_synchronize_rcu_expedited() to be invoked early
  rcu: Remove never-set needwake assignment from rcu_report_qs_rdp()
  rcu: Register rcu-lazy shrinker only for CONFIG_RCU_LAZY=y kernels
  rcu: Fix missing TICK_DEP_MASK_RCU_EXP dependency check
  rcu: Fix set/clear TICK_DEP_BIT_RCU_EXP bitmask race
  rcu/trace: use strscpy() to instead of strncpy()
  tick/nohz: Fix cpu_is_hotpluggable() by checking with nohz subsystem
  ...
2023-04-24 12:16:14 -07:00
Linus Torvalds
487c20b016 iov: improve copy_iovec_from_user() code generation
Use the same pattern as the compat version of this code does: instead of
copying the whole array to a kernel buffer and then having a separate
phase of verifying it, just do it one entry at a time, verifying as you
go.

On Jens' /dev/zero readv() test this improves performance by ~6%.

[ This was obviously triggered by Jens' ITER_UBUF updates series ]

Reported-and-tested-by: Jens Axboe <axboe@kernel.dk>
Link: https://lore.kernel.org/all/de35d11d-bce7-e976-7372-1f2caf417103@kernel.dk/
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2023-04-24 10:37:17 -07:00
Linus Torvalds
b9dff2195f iter-ubuf.2-2023-04-21
-----BEGIN PGP SIGNATURE-----
 
 iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAmRCvdsQHGF4Ym9lQGtl
 cm5lbC5kawAKCRD301j7KXHgpg4oD/457EJ21Fm36NuyT/S0Cr8ok9Tdk7t9BeBh
 V/9CYThoXr5aqAox0Vq23FF+Rhzm81GzwYERN4493LBblliNeNOo2IaXF9/7qrUW
 11v9Bkug2J3k3hRGtEa6Zl0EpMu+FRLsNpchjFS2KPuOq+iMDxrvwuy50kidWg7n
 r25e4UwpExVO9fIoUSmzgWVfRHOTuj9yiG/UsaH2+2BRXerIX0Q1tyElwmcGh25M
 Ad2hN+yDnuIbNA5gNUpnzY32Dp0zjAsquc//QOvq9mltcNTElokB8idGliismvyd
 8qF0lkwQwewOBT/sSD5EY3K0Qd8IJu425bvT/yPUDScHz1chxHUoxo5eisIr2M9l
 5AL5KHAf7Zzs8ZuV+IYPzZ5qM6a/vF3mHUisKRNKYVhF46Nmd4cBratfXwWb1MxV
 clQM2qr0TLOYli9mOeTXph3hg/rBVqKqf90boAZoN8b2tWBKlMykpqRadbepjrgx
 bmBSwwAF99NxIHEjU3U5DMdUloCSiMZIfMfDxQrPNDrfWAW4xJs5Ym0VeOjEotTt
 oFEs1fr6c3Mn7KEuPPfOtnDxvs51IP/B8+gDgMt/edf+wHiCU1Zm31u2gxt2dsKh
 g73Y92i5SHjIf36H5szBTeioyMy1E1VA9HF14xWz2eKdQ+wxQ9VNWoctcJ85k3F4
 6AZDYRIrWA==
 =EaE9
 -----END PGP SIGNATURE-----

Merge tag 'iter-ubuf.2-2023-04-21' of git://git.kernel.dk/linux

Pull ITER_UBUF updates from Jens Axboe:
 "This turns singe vector imports into ITER_UBUF, rather than
  ITER_IOVEC.

  The former is more trivial to iterate and advance, and hence a bit
  more efficient. From some very unscientific testing, ~60% of all iovec
  imports are single vector"

* tag 'iter-ubuf.2-2023-04-21' of git://git.kernel.dk/linux:
  iov_iter: Mark copy_compat_iovec_from_user() noinline
  iov_iter: import single vector iovecs as ITER_UBUF
  iov_iter: convert import_single_range() to ITER_UBUF
  iov_iter: overlay struct iovec and ubuf/len
  iov_iter: set nr_segs = 1 for ITER_UBUF
  iov_iter: remove iov_iter_iovec()
  iov_iter: add iter_iov_addr() and iter_iov_len() helpers
  ALSA: pcm: check for user backed iterator, not specific iterator type
  IB/qib: check for user backed iterator, not specific iterator type
  IB/hfi1: check for user backed iterator, not specific iterator type
  iov_iter: add iter_iovec() helper
  block: ensure bio_alloc_map_data() deals with ITER_UBUF correctly
2023-04-24 10:29:28 -07:00
Peng Zhang
29ad6bb313 maple_tree: fix allocation in mas_sparse_area()
In the case of reverse allocation, mas->index and mas->last do not point
to the correct allocation range, which will cause users to get incorrect
allocation results, so fix it.  If the user does not use it in a specific
way, this bug will not be triggered.

This is a bug, but only VMA uses it now, the way VMA is used now will
not trigger it.  There is a possibility that a user will trigger it in
the future.

Also re-check whether the size is still satisfied after the lower bound
was increased, which is a corner case and is incorrect in previous
versions.

Link: https://lkml.kernel.org/r/20230419093625.99201-1-zhangpeng.00@bytedance.com
Fixes: 54a611b605 ("Maple Tree: add new data structure")
Signed-off-by: Peng Zhang <zhangpeng.00@bytedance.com>
Cc: Liam R. Howlett <Liam.Howlett@Oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-04-21 14:52:04 -07:00
Yajun Deng
13215e8a4b lib/show_mem.c: use for_each_populated_zone() simplify code
__show_mem() needs to iterate over all zones that have memory, we can
simplify the code by using for_each_populated_zone().

Link: https://lkml.kernel.org/r/20230417035226.4013584-1-yajun.deng@linux.dev
Signed-off-by: Yajun Deng <yajun.deng@linux.dev>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-04-21 14:52:02 -07:00
Xie Yongji
aaf0594829 lib/group_cpus: Export group_cpus_evenly()
Export group_cpus_evenly() so that some modules
can make use of it to group CPUs evenly according
to NUMA and CPU locality.

Signed-off-by: Xie Yongji <xieyongji@bytedance.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20230323053043.35-2-xieyongji@bytedance.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-04-21 03:02:31 -04:00
Jakub Kicinski
681c5b51dc Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Adjacent changes:

net/mptcp/protocol.h
  63740448a3 ("mptcp: fix accept vs worker race")
  2a6a870e44 ("mptcp: stops worker on unaccepted sockets at listener close")
  ddb1a072f8 ("mptcp: move first subflow allocation at mpc access time")

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-04-20 16:29:51 -07:00
Noah Goldstein
b0687c1119 lib/rbtree: use '+' instead of '|' for setting color.
This has a slight benefit for x86 and has no effect on other targets.

The benefit to x86 is it change the codegen for setting a node to block
from `mov %r0, %r1; or $RB_BLACK, %r1` to `lea RB_BLACK(%r0), %r1` which
saves an instructions.

In all other cases it just replace ALU with ALU (or -> and) which
perform the same on all machines I am aware of.

Total instructions in rbtree.o:
    Before  - 802
    After   - 782

so it saves about 20 `mov` instructions.

Link: https://lkml.kernel.org/r/20230404221350.3806566-1-goldstein.w.n@gmail.com
Signed-off-by: Noah Goldstein <goldstein.w.n@gmail.com>
Cc: Michel Lespinasse <michel@lespinasse.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-04-18 16:39:33 -07:00
Peng Zhang
fb20e99a74 maple_tree: use correct variable type in sizeof
The type of variable pointed to by pivs is unsigned long, but the type
used in sizeof is a pointer type.  Change it to unsigned long.

This change has no runtime effect, as sizeof(ul) == sizeof(ul *).

Link: https://lkml.kernel.org/r/20230411023513.15227-1-zhangpeng.00@bytedance.com
Fixes: 54a611b605 ("Maple Tree: add new data structure")
Signed-off-by: Peng Zhang <zhangpeng.00@bytedance.com>
Reported-by: David Binderman <dcb314@hotmail.com>
Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-04-18 16:29:56 -07:00
Peng Zhang
97f7e09481 maple_tree: simplify mas_wr_node_walk()
Simplify code of mas_wr_node_walk() without changing functionality, and
improve readability.  Remove some special judgments.  Instead of
dynamically recording the min and max in the loop, get the final min and
max directly at the end.

Link: https://lkml.kernel.org/r/20230314124203.91572-3-zhangpeng.00@bytedance.com
Signed-off-by: Peng Zhang <zhangpeng.00@bytedance.com>
Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-04-18 16:29:53 -07:00
Uladzislau Rezki (Sony)
869cb29a61 lib/test_vmalloc.c: add vm_map_ram()/vm_unmap_ram() test case
Add vm_map_ram()/vm_unmap_ram() test case to our stress test-suite.

[akpm@linux-foundation.org: fix whitespace, per Lorenzo]
Link: https://lkml.kernel.org/r/20230330190639.431589-2-urezki@gmail.com
Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
Reviewed-by: Lorenzo Stoakes <lstoakes@gmail.com>
Reviewed-by: Baoquan He <bhe@redhat.com>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Oleksiy Avramchenko <oleksiy.avramchenko@sony.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-04-18 16:29:47 -07:00
Andrew Morton
f8f238ffe5 sync mm-stable with mm-hotfixes-stable to pick up depended-upon upstream changes 2023-04-18 14:53:49 -07:00
Liam R. Howlett
06e8fd9993 maple_tree: fix mas_empty_area() search
The internal function of mas_awalk() was incorrectly skipping the last
entry in a node, which could potentially be NULL.  This is only a problem
for the left-most node in the tree - otherwise that NULL would not exist.

Fix mas_awalk() by using the metadata to obtain the end of the node for
the loop and the logical pivot as apposed to the raw pivot value.

Link: https://lkml.kernel.org/r/20230414145728.4067069-2-Liam.Howlett@oracle.com
Fixes: 54a611b605 ("Maple Tree: add new data structure")
Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Reported-by: Rick Edgecombe <rick.p.edgecombe@intel.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-04-18 14:22:13 -07:00
Liam R. Howlett
fad8e4291d maple_tree: make maple state reusable after mas_empty_area_rev()
Stop using maple state min/max for the range by passing through pointers
for those values.  This will allow the maple state to be reused without
resetting.

Also add some logic to fail out early on searching with invalid
arguments.

Link: https://lkml.kernel.org/r/20230414145728.4067069-1-Liam.Howlett@oracle.com
Fixes: 54a611b605 ("Maple Tree: add new data structure")
Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Reported-by: Rick Edgecombe <rick.p.edgecombe@intel.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-04-18 14:22:13 -07:00
Christoph Hellwig
7533583e12 libcrc32c: remove crc32c_impl
This was only ever used by btrfs, and the usage just went away.
This effectively reverts df91f56adc ("libcrc32c: Add crc32c_impl
function").

Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2023-04-17 18:01:23 +02:00
Andrew Morton
e492cd61b9 sync mm-stable with mm-hotfixes-stable to pick up depended-upon upstream changes 2023-04-16 12:31:58 -07:00
Akinobu Mita
d325c16263 fault-inject: fix build error when FAULT_INJECTION_CONFIGFS=y and CONFIGFS_FS=m
This fixes a build error when CONFIG_FAULT_INJECTION_CONFIGFS=y and
CONFIG_CONFIGFS_FS=m.

Since the fault-injection library cannot built as a module, avoid building
configfs as a module.

Fixes: 4668c7a294 ("fault-inject: allow configuration via configfs")
Reported-by: kernel test robot <lkp@intel.com>
Link: https://lore.kernel.org/oe-kbuild-all/202304150025.K0hczLR4-lkp@intel.com/
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-04-16 13:01:42 -06:00
Peng Zhang
1f5f12ece7 maple_tree: fix a potential memory leak, OOB access, or other unpredictable bug
In mas_alloc_nodes(), "node->node_count = 0" means to initialize the
node_count field of the new node, but the node may not be a new node.  It
may be a node that existed before and node_count has a value, setting it
to 0 will cause a memory leak.  At this time, mas->alloc->total will be
greater than the actual number of nodes in the linked list, which may
cause many other errors.  For example, out-of-bounds access in
mas_pop_node(), and mas_pop_node() may return addresses that should not be
used.  Fix it by initializing node_count only for new nodes.

Also, by the way, an if-else statement was removed to simplify the code.

Link: https://lkml.kernel.org/r/20230411041005.26205-1-zhangpeng.00@bytedance.com
Fixes: 54a611b605 ("Maple Tree: add new data structure")
Signed-off-by: Peng Zhang <zhangpeng.00@bytedance.com>
Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-04-16 10:41:26 -07:00
Thomas Gleixner
63a759694e debugobject: Prevent init race with static objects
Statically initialized objects are usually not initialized via the init()
function of the subsystem. They are special cased and the subsystem
provides a function to validate whether an object which is not yet tracked
by debugobjects is statically initialized. This means the object is started
to be tracked on first use, e.g. activation.

This works perfectly fine, unless there are two concurrent operations on
that object. Schspa decoded the problem:

T0 	          	    	    T1

debug_object_assert_init(addr)
  lock_hash_bucket()
  obj = lookup_object(addr);
  if (!obj) {
  	unlock_hash_bucket();
	- > preemption
			            lock_subsytem_object(addr);
				      activate_object(addr)
				      lock_hash_bucket();
				      obj = lookup_object(addr);
				      if (!obj) {
				    	unlock_hash_bucket();
					if (is_static_object(addr))
					   init_and_track(addr);
				      lock_hash_bucket();
				      obj = lookup_object(addr);
				      obj->state = ACTIVATED;
				      unlock_hash_bucket();

				    subsys function modifies content of addr,
				    so static object detection does
				    not longer work.

				    unlock_subsytem_object(addr);
				    
        if (is_static_object(addr)) <- Fails

	  debugobject emits a warning and invokes the fixup function which
	  reinitializes the already active object in the worst case.

This race exists forever, but was never observed until mod_timer() got a
debug_object_assert_init() added which is outside of the timer base lock
held section right at the beginning of the function to cover the lockless
early exit points too.

Rework the code so that the lookup, the static object check and the
tracking object association happens atomically under the hash bucket
lock. This prevents the issue completely as all callers are serialized on
the hash bucket lock and therefore cannot observe inconsistent state.

Fixes: 3ac7fe5a4a ("infrastructure to debug (dynamic) objects")
Reported-by: syzbot+5093ba19745994288b53@syzkaller.appspotmail.com
Debugged-by: Schspa Shi <schspa@gmail.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Stephen Boyd <swboyd@chromium.org>
Link: https://syzkaller.appspot.com/bug?id=22c8a5938eab640d1c6bcc0e3dc7be519d878462
Link: https://lore.kernel.org/lkml/20230303161906.831686-1-schspa@gmail.com
Link: https://lore.kernel.org/r/87zg7dzgao.ffs@tglx
2023-04-15 23:13:36 +02:00
Jakub Kicinski
800e68c44f Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Conflicts:

tools/testing/selftests/net/config
  62199e3f16 ("selftests: net: Add VXLAN MDB test")
  3a0385be13 ("selftests: add the missing CONFIG_IP_SCTP in net config")

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-04-13 16:04:28 -07:00
Nick Alcock
7f82b39dc3 treewide: remove MODULE_LICENSE in non-modules
Since commit 8b41fc4454 ("kbuild: create modules.builtin without
Makefile.modbuiltin or tristate.conf"), MODULE_LICENSE declarations
are used to identify modules. As a consequence, uses of the macro
in non-modules will cause modprobe to misidentify their containing
object file as a module when it is not (false positives), and modprobe
might succeed rather than failing with a suitable error message.

So remove it in the files in this commit, none of which can be built as
modules.

Signed-off-by: Nick Alcock <nick.alcock@oracle.com>
Suggested-by: Luis Chamberlain <mcgrof@kernel.org>
Cc: Luis Chamberlain <mcgrof@kernel.org>
Cc: linux-modules@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: Hitomi Hasegawa <hasegawa-hitomi@fujitsu.com>
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
2023-04-13 13:13:54 -07:00
Nick Alcock
0c9bf64c5b btree: remove MODULE_LICENSE in non-modules
Since commit 8b41fc4454 ("kbuild: create modules.builtin without
Makefile.modbuiltin or tristate.conf"), MODULE_LICENSE declarations
are used to identify modules. As a consequence, uses of the macro
in non-modules will cause modprobe to misidentify their containing
object file as a module when it is not (false positives), and modprobe
might succeed rather than failing with a suitable error message.

So remove it in the files in this commit, none of which can be built as
modules.

Signed-off-by: Nick Alcock <nick.alcock@oracle.com>
Suggested-by: Luis Chamberlain <mcgrof@kernel.org>
Cc: Luis Chamberlain <mcgrof@kernel.org>
Cc: linux-modules@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: Hitomi Hasegawa <hasegawa-hitomi@fujitsu.com>
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
2023-04-13 13:13:54 -07:00
Nick Alcock
5e0266f0e5 lib: remove MODULE_LICENSE in non-modules
Since commit 8b41fc4454 ("kbuild: create modules.builtin without
Makefile.modbuiltin or tristate.conf"), MODULE_LICENSE declarations
are used to identify modules. As a consequence, uses of the macro
in non-modules will cause modprobe to misidentify their containing
object file as a module when it is not (false positives), and modprobe
might succeed rather than failing with a suitable error message.

So remove it in the files in this commit, none of which can be built as
modules.

Signed-off-by: Nick Alcock <nick.alcock@oracle.com>
Suggested-by: Luis Chamberlain <mcgrof@kernel.org>
Acked-by: Jacob Keller <jacob.e.keller@intel.com>
Cc: Luis Chamberlain <mcgrof@kernel.org>
Cc: linux-modules@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: Hitomi Hasegawa <hasegawa-hitomi@fujitsu.com>
Cc: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
2023-04-13 13:13:53 -07:00
Nick Alcock
ef5bbd1172 crypto: blake2s: remove module-related code
Now blake2s-generic.c can no longer be a module, drop all remaining
module-related code as well.

Signed-off-by: Nick Alcock <nick.alcock@oracle.com>
Requested-by: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Luis Chamberlain <mcgrof@kernel.org>
Cc: linux-modules@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: Hitomi Hasegawa <hasegawa-hitomi@fujitsu.com>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: linux-crypto@vger.kernel.org
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
2023-04-13 13:13:51 -07:00
Nick Alcock
3714878005 crypto: remove MODULE_LICENSE in non-modules
Since commit 8b41fc4454 ("kbuild: create modules.builtin without
Makefile.modbuiltin or tristate.conf"), MODULE_LICENSE declarations
are used to identify modules. As a consequence, uses of the macro
in non-modules will cause modprobe to misidentify their containing
object file as a module when it is not (false positives), and modprobe
might succeed rather than failing with a suitable error message.

So remove it in the files in this commit, none of which can be built as
modules.

Signed-off-by: Nick Alcock <nick.alcock@oracle.com>
Suggested-by: Luis Chamberlain <mcgrof@kernel.org>
Cc: Luis Chamberlain <mcgrof@kernel.org>
Cc: linux-modules@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: Hitomi Hasegawa <hasegawa-hitomi@fujitsu.com>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: linux-crypto@vger.kernel.org
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
2023-04-13 13:13:51 -07:00
Akinobu Mita
4668c7a294 fault-inject: allow configuration via configfs
This provides a helper function to allow configuration of fault-injection
for configfs-based drivers.

The config items created by this function have the same interface as the
one created under debugfs by fault_create_debugfs_attr().

Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Link: https://lore.kernel.org/r/20230327143733.14599-2-akinobu.mita@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-04-13 07:38:54 -06:00
Josh Poimboeuf
50f9a76ef1 iov_iter: Mark copy_compat_iovec_from_user() noinline
After commit 6376ce56feb6 ("iov_iter: import single vector iovecs as
ITER_UBUF"), GCC does an inter-procedural compiler optimization which
moves the user_access_begin() out of copy_compat_iovec_from_user() and
into its callers:

  lib/iov_iter.o: warning: objtool: .altinstr_replacement+0x0: redundant UACCESS disable
  lib/iov_iter.o: warning: objtool: iovec_from_user.part.0+0xc7: call to copy_compat_iovec_from_user.part.0() with UACCESS enabled
  lib/iov_iter.o: warning: objtool: __import_iovec+0x21d: call to copy_compat_iovec_from_user.part.0() with UACCESS enabled

Enforce the "no UACCESS enable across function boundaries" rule by
disabling cloning for copy_compat_iovec_from_user().

Fixes: 6376ce56feb6 ("iov_iter: import single vector iovecs as ITER_UBUF")
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
https://lkml.kernel.org/lkml/20230327120017.6bb826d7@canb.auug.org.au
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
Tested-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-04-12 10:46:48 -06:00
Andy Shevchenko
ef55ef3e64 lib/test-string_helpers: replace UNESCAPE_ANY by UNESCAPE_ALL_MASK
When we get a random number to generate a flag in the valid range of
UNESCAPE flags, use UNESCAPE_ALL_MASK, It's more correct and prevents from
missed updates of the test coverage in the future if any.

Link: https://lkml.kernel.org/r/20230327142604.48213-1-andriy.shevchenko@linux.intel.com
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-04-08 13:45:39 -07:00
Alexey Dobriyan
70e79866ab ELF: fix all "Elf" typos
ELF is acronym and therefore should be spelled in all caps.

I left one exception at Documentation/arm/nwfpe/nwfpe.rst which looks like
being written in the first person.

Link: https://lkml.kernel.org/r/Y/3wGWQviIOkyLJW@p183
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-04-08 13:45:37 -07:00
Lorenzo Stoakes
4f80818b4a iov_iter: add copy_page_to_iter_nofault()
Provide a means to copy a page to user space from an iterator, aborting if
a page fault would occur.  This supports compound pages, but may be passed
a tail page with an offset extending further into the compound page, so we
cannot pass a folio.

This allows for this function to be called from atomic context and _try_
to user pages if they are faulted in, aborting if not.

The function does not use _copy_to_iter() in order to not specify
might_fault(), this is similar to copy_page_from_iter_atomic().

This is being added in order that an iteratable form of vread() can be
implemented while holding spinlocks.

Link: https://lkml.kernel.org/r/19734729defb0f498a76bdec1bef3ac48a3af3e8.1679511146.git.lstoakes@gmail.com
Signed-off-by: Lorenzo Stoakes <lstoakes@gmail.com>
Reviewed-by: Baoquan He <bhe@redhat.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: David Hildenbrand <david@redhat.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Liu Shixin <liushixin2@huawei.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Uladzislau Rezki (Sony) <urezki@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-04-05 19:42:57 -07:00
Peng Zhang
c45ea315a6 maple_tree: fix a potential concurrency bug in RCU mode
There is a concurrency bug that may cause the wrong value to be loaded
when a CPU is modifying the maple tree.

CPU1:
mtree_insert_range()
  mas_insert()
    mas_store_root()
      ...
      mas_root_expand()
        ...
        rcu_assign_pointer(mas->tree->ma_root, mte_mk_root(mas->node));
        ma_set_meta(node, maple_leaf_64, 0, slot);    <---IP

CPU2:
mtree_load()
  mtree_lookup_walk()
    ma_data_end();

When CPU1 is about to execute the instruction pointed to by IP, the
ma_data_end() executed by CPU2 may return the wrong end position, which
will cause the value loaded by mtree_load() to be wrong.

An example of triggering the bug:

Add mdelay(100) between rcu_assign_pointer() and ma_set_meta() in
mas_root_expand().

static DEFINE_MTREE(tree);
int work(void *p) {
	unsigned long val;
	for (int i = 0 ; i< 30; ++i) {
		val = (unsigned long)mtree_load(&tree, 8);
		mdelay(5);
		pr_info("%lu",val);
	}
	return 0;
}

mt_init_flags(&tree, MT_FLAGS_USE_RCU);
mtree_insert(&tree, 0, (void*)12345, GFP_KERNEL);
run_thread(work)
mtree_insert(&tree, 1, (void*)56789, GFP_KERNEL);

In RCU mode, mtree_load() should always return the value before or after
the data structure is modified, and in this example mtree_load(&tree, 8)
may return 56789 which is not expected, it should always return NULL.  Fix
it by put ma_set_meta() before rcu_assign_pointer().

Link: https://lkml.kernel.org/r/20230314124203.91572-4-zhangpeng.00@bytedance.com
Fixes: 54a611b605 ("Maple Tree: add new data structure")
Signed-off-by: Peng Zhang <zhangpeng.00@bytedance.com>
Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-04-05 18:06:25 -07:00
Peng Zhang
ec07967d75 maple_tree: fix get wrong data_end in mtree_lookup_walk()
if (likely(offset > end))
	max = pivots[offset];

The above code should be changed to if (likely(offset < end)), which is
correct.  This affects the correctness of ma_data_end().  Now it seems
that the final result will not be wrong, but it is best to change it. 
This patch does not change the code as above, because it simplifies the
code by the way.

Link: https://lkml.kernel.org/r/20230314124203.91572-1-zhangpeng.00@bytedance.com
Link: https://lkml.kernel.org/r/20230314124203.91572-2-zhangpeng.00@bytedance.com
Fixes: 54a611b605 ("Maple Tree: add new data structure")
Signed-off-by: Peng Zhang <zhangpeng.00@bytedance.com>
Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-04-05 18:06:25 -07:00
Liam R. Howlett
790e1fa86b maple_tree: add RCU lock checking to rcu callback functions
Dereferencing RCU objects within the RCU callback without the RCU check
has caused lockdep to complain.  Fix the RCU dereferencing by using the
RCU callback lock to ensure the operation is safe.

Also stop creating a new lock to use for dereferencing during destruction
of the tree or subtree.  Instead, pass through a pointer to the tree that
has the lock that is held for RCU dereferencing checking.  It also does
not make sense to use the maple state in the freeing scenario as the tree
walk is a special case where the tree no longer has the normal encodings
and parent pointers.

Link: https://lkml.kernel.org/r/20230227173632.3292573-8-surenb@google.com
Fixes: 54a611b605 ("Maple Tree: add new data structure")
Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Reported-by: Suren Baghdasaryan <surenb@google.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-04-05 18:06:22 -07:00
Liam R. Howlett
0a2b18d948 maple_tree: add smp_rmb() to dead node detection
Add an smp_rmb() before reading the parent pointer to ensure that anything
read from the node prior to the parent pointer hasn't been reordered ahead
of this check.

The is necessary for RCU mode.

Link: https://lkml.kernel.org/r/20230227173632.3292573-7-surenb@google.com
Fixes: 54a611b605 ("Maple Tree: add new data structure")
Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-04-05 18:06:22 -07:00
Liam R. Howlett
c13af03de4 maple_tree: fix write memory barrier of nodes once dead for RCU mode
During the development of the maple tree, the strategy of freeing multiple
nodes changed and, in the process, the pivots were reused to store
pointers to dead nodes.  To ensure the readers see accurate pivots, the
writers need to mark the nodes as dead and call smp_wmb() to ensure any
readers can identify the node as dead before using the pivot values.

There were two places where the old method of marking the node as dead
without smp_wmb() were being used, which resulted in RCU readers seeing
the wrong pivot value before seeing the node was dead.  Fix this race
condition by using mte_set_node_dead() which has the smp_wmb() call to
ensure the race is closed.

Add a WARN_ON() to the ma_free_rcu() call to ensure all nodes being freed
are marked as dead to ensure there are no other call paths besides the two
updated paths.

This is necessary for the RCU mode of the maple tree.

Link: https://lkml.kernel.org/r/20230227173632.3292573-6-surenb@google.com
Fixes: 54a611b605 ("Maple Tree: add new data structure")
Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-04-05 18:06:21 -07:00
Liam Howlett
8372f4d83f maple_tree: remove extra smp_wmb() from mas_dead_leaves()
The call to mte_set_dead_node() before the smp_wmb() already calls
smp_wmb() so this is not needed.  This is an optimization for the RCU mode
of the maple tree.

Link: https://lkml.kernel.org/r/20230227173632.3292573-5-surenb@google.com
Fixes: 54a611b605 ("Maple Tree: add new data structure")
Signed-off-by: Liam Howlett <Liam.Howlett@oracle.com>
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-04-05 18:06:21 -07:00
Liam Howlett
2e5b4921f8 maple_tree: fix freeing of nodes in rcu mode
The walk to destroy the nodes was not always setting the node type and
would result in a destroy method potentially using the values as nodes. 
Avoid this by setting the correct node types.  This is necessary for the
RCU mode of the maple tree.

Link: https://lkml.kernel.org/r/20230227173632.3292573-4-surenb@google.com
Fixes: 54a611b605 ("Maple Tree: add new data structure")
Signed-off-by: Liam Howlett <Liam.Howlett@oracle.com>
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-04-05 18:06:21 -07:00
Liam Howlett
a7b92d59c8 maple_tree: detect dead nodes in mas_start()
When initially starting a search, the root node may already be in the
process of being replaced in RCU mode.  Detect and restart the walk if
this is the case.  This is necessary for RCU mode of the maple tree.

Link: https://lkml.kernel.org/r/20230227173632.3292573-3-surenb@google.com
Fixes: 54a611b605 ("Maple Tree: add new data structure")
Signed-off-by: Liam Howlett <Liam.Howlett@oracle.com>
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-04-05 18:06:21 -07:00
Liam Howlett
39d0bd86c4 maple_tree: be more cautious about dead nodes
Patch series "Fix VMA tree modification under mmap read lock".

Syzbot reported a BUG_ON in mm/mmap.c which was found to be caused by an
inconsistency between threads walking the VMA maple tree.  The
inconsistency is caused by the page fault handler modifying the maple tree
while holding the mmap_lock for read.

This only happens for stack VMAs.  We had thought this was safe as it only
modifies a single pivot in the tree.  Unfortunately, syzbot constructed a
test case where the stack had no guard page and grew the stack to abut the
next VMA.  This causes us to delete the NULL entry between the two VMAs
and rewrite the node.

We considered several options for fixing this, including dropping the
mmap_lock, then reacquiring it for write; and relaxing the definition of
the tree to permit a zero-length NULL entry in the node.  We decided the
best option was to backport some of the RCU patches from -next, which
solve the problem by allocating a new node and RCU-freeing the old node. 
Since the problem exists in 6.1, we preferred a solution which is similar
to the one we intended to merge next merge window.

These patches have been in -next since next-20230301, and have received
intensive testing in Android as part of the RCU page fault patchset.  They
were also sent as part of the "Per-VMA locks" v4 patch series.  Patches 1
to 7 are bug fixes for RCU mode of the tree and patch 8 enables RCU mode
for the tree.

Performance v6.3-rc3 vs patched v6.3-rc3: Running these changes through
mmtests showed there was a 15-20% performance decrease in
will-it-scale/brk1-processes.  This tests creating and inserting a single
VMA repeatedly through the brk interface and isn't representative of any
real world applications.


This patch (of 8):

ma_pivots() and ma_data_end() may be called with a dead node.  Ensure to
that the node isn't dead before using the returned values.

This is necessary for RCU mode of the maple tree.

Link: https://lkml.kernel.org/r/20230327185532.2354250-1-Liam.Howlett@oracle.com
Link: https://lkml.kernel.org/r/20230227173632.3292573-1-surenb@google.com
Link: https://lkml.kernel.org/r/20230227173632.3292573-2-surenb@google.com
Fixes: 54a611b605 ("Maple Tree: add new data structure")
Signed-off-by: Liam Howlett <Liam.Howlett@oracle.com>
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Arjun Roy <arjunroy@google.com>
Cc: Axel Rasmussen <axelrasmussen@google.com>
Cc: Chris Li <chriscli@google.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: David Rientjes <rientjes@google.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: freak07 <michalechner92@googlemail.com>
Cc: Greg Thelen <gthelen@google.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jann Horn <jannh@google.com>
Cc: Joel Fernandes <joelaf@google.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Kent Overstreet <kent.overstreet@linux.dev>
Cc: Laurent Dufour <ldufour@linux.ibm.com>
Cc: Lorenzo Stoakes <lstoakes@gmail.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Minchan Kim <minchan@google.com>
Cc: Paul E. McKenney <paulmck@kernel.org>
Cc: Peter Oskolkov <posk@google.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Punit Agrawal <punit.agrawal@bytedance.com>
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: Soheil Hassas Yeganeh <soheil@google.com>
Cc: Song Liu <songliubraving@fb.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Will Deacon <will@kernel.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-04-05 18:06:20 -07:00
Niklas Schnelle
fcbfe8121a
Kconfig: introduce HAS_IOPORT option and select it as necessary
We introduce a new HAS_IOPORT Kconfig option to indicate support for I/O
Port access. In a future patch HAS_IOPORT=n will disable compilation of
the I/O accessor functions inb()/outb() and friends on architectures
which can not meaningfully support legacy I/O spaces such as s390.

The following architectures do not select HAS_IOPORT:

* ARC
* C-SKY
* Hexagon
* Nios II
* OpenRISC
* s390
* User-Mode Linux
* Xtensa

All other architectures select HAS_IOPORT at least conditionally.

The "depends on" relations on HAS_IOPORT in drivers as well as ifdefs
for HAS_IOPORT specific sections will be added in subsequent patches on
a per subsystem basis.

Co-developed-by: Arnd Bergmann <arnd@kernel.org>
Signed-off-by: Arnd Bergmann <arnd@kernel.org>
Acked-by: Johannes Berg <johannes@sipsolutions.net> # for ARCH=um
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2023-04-05 22:15:19 +02:00
Rae Moar
a42077b787 kunit: add tests for using current KUnit test field
Create test suite called "kunit_current" to add test coverage for the use
of current->kunit_test, which returns the current KUnit test.

Add two test cases:
- kunit_current_test to test current->kunit_test and the method
  kunit_get_current_test(), which utilizes current->kunit_test.

- kunit_current_fail_test to test the method
  kunit_fail_current_test(), which utilizes current->kunit_test.

Signed-off-by: Rae Moar <rmoar@google.com>
Reviewed-by: Daniel Latypov <dlatypov@google.com>
Reviewed-by: David Gow <davidgow@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2023-04-05 12:51:30 -06:00
Uladzislau Rezki (Sony)
c779b97281 lib/test_vmalloc.c: Rename kvfree_rcu() to kvfree_rcu_mightsleep()
The kvfree_rcu() macro's single-argument form is deprecated.  Therefore
switch to the new kvfree_rcu_mightsleep() variant. The goal is to
avoid accidental use of the single-argument forms, which can introduce
functionality bugs in atomic contexts and latency bugs in non-atomic
contexts.

Acked-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
2023-04-05 13:48:03 +00:00
Andy Shevchenko
48e1a66fec lib/vsprintf: Use isodigit() for the octal number check
Use isodigit() to test the octal number instead of homegrown approach.

Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-by: Sergey Senozhatsky <senozhatsky@chromium.org>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Signed-off-by: Petr Mladek <pmladek@suse.com>
Link: https://lore.kernel.org/r/20230327142721.48378-1-andriy.shevchenko@linux.intel.com
2023-04-03 10:51:25 +02:00
Greg Kroah-Hartman
cd8fe5b6db Merge 6.3-rc5 into driver-core-next
We need the fixes in here for testing, as well as the driver core
changes for documentation updates to build on.

Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-04-03 09:33:30 +02:00
Sadiya Kazi
57b4f760f9 list: test: Test the klist structure
Add KUnit tests to the klist linked-list structure.
These perform testing for different variations of node add
and node delete in the klist data structure (<linux/klist.h>).

Limitation: Since we use a static global variable, and if
multiple instances of this test are run concurrently, the test may fail.

Signed-off-by: Sadiya Kazi <sadiyakazi@google.com>
Reviewed-by: Brendan Higgins <brendanhiggins@google.com>
Reviewed-by: David Gow <davidgow@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2023-03-31 09:21:35 -06:00
Herbert Xu
c616fb0cba crypto: lib/utils - Move utilities into new header
The utilities have historically resided in algapi.h as they were
first used internally before being exported.  Move them into a
new header file so external users don't see internal API details.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2023-03-31 17:50:09 +08:00
Jakub Kicinski
79548b7984 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Conflicts:

drivers/net/ethernet/mediatek/mtk_ppe.c
  3fbe4d8c0e ("net: ethernet: mtk_eth_soc: ppe: add support for flow accounting")
  924531326e ("net: ethernet: mtk_eth_soc: add missing ppe cache flush when deleting a flow")

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-03-30 14:43:03 -07:00
Jens Axboe
3b2deb0e46 iov_iter: import single vector iovecs as ITER_UBUF
Add a special case to __import_iovec(), which imports a single segment
iovec as an ITER_UBUF rather than an ITER_IOVEC. ITER_UBUF is cheaper
to iterate than ITER_IOVEC, and for a single segment iovec, there's no
point in using a segmented iterator.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-03-30 08:12:29 -06:00
Jens Axboe
e03ad4ee27 iov_iter: convert import_single_range() to ITER_UBUF
Since we're just importing a single vector, we don't have to turn it
into an ITER_IOVEC. Instead turn it into an ITER_UBUF, which is cheaper
to iterate.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-03-30 08:12:29 -06:00
Jens Axboe
de4f5fed3f iov_iter: add iter_iovec() helper
This returns a pointer to the current iovec entry in the iterator. Only
useful with ITER_IOVEC right now, but it prepares us to treat ITER_UBUF
and ITER_IOVEC identically for the first segment.

Rename struct iov_iter->iov to iov_iter->__iov to find any potentially
troublesome spots, and also to prevent anyone from adding new code that
accesses iter->iov directly.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-03-30 08:12:29 -06:00
Jakub Kicinski
de7494524d mlx5-updates-2023-03-20
mlx5 dynamic msix
 
 This patch series adds support for dynamic msix vectors allocation in mlx5.
 
 Eli Cohen Says:
 
 ================
 
 The following series of patches modifies mlx5_core to work with the
 dynamic MSIX API. Currently, mlx5_core allocates all the interrupt
 vectors it needs and distributes them amongst the consumers. With the
 introduction of dynamic MSIX support, which allows for allocation of
 interrupts more than once, we now allocate vectors as we need them.
 This allows other drivers running on top of mlx5_core to allocate
 interrupt vectors for their own use. An example for this is mlx5_vdpa,
 which uses these vectors to propagate interrupts directly from the
 hardware to the vCPU [1].
 
 As a preparation for using this series, a use after free issue is fixed
 in lib/cpu_rmap.c and the allocator for rmap entries has been modified.
 A complementary API for irq_cpu_rmap_add() has also been introduced.
 
 [1] https://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux.git/patch/?id=0f2bf1fcae96a83b8c5581854713c9fc3407556e
 
 ================
 -----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCAAdFiEEGhZs6bAKwk/OTgTpSD+KveBX+j4FAmQeTIUACgkQSD+KveBX
 +j7oCQgAx9yNHM4BZD2UfIx/P+W13v1B+xOds04Vezl9JlakoqvviPxm3vvuKkl+
 j/8DdyoqMUbWV0j5XxgZ+GG91bc14jN1GQ+4fUf63SzA99vAGb9GJPV2aQt5roGh
 JmMqI2utDfoz+29qtQ+kVchY5AN5AoPXSQH2zkEZmJaPUjYb9Dr/4IayL0JaViAw
 S31QLHKkSJ8bL8Wc6Op1emNVV7eXs18f7IIjVs3sYOb3WJRPVpmdKneRqLgVYplf
 Td40Gwobl1elpjEqSSRTJI5YUSR8gcAJlBqIwHeJzFFpO3Pnciopl761osNKKs/a
 5ctES5DS6JHqqFGbWV1gKYcRMil3LA==
 =9i8l
 -----END PGP SIGNATURE-----

Merge tag 'mlx5-updates-2023-03-20' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux

Saeed Mahameed says:

====================
mlx5-updates-2023-03-20

mlx5 dynamic msix

This patch series adds support for dynamic msix vectors allocation in mlx5.

Eli Cohen Says:

================

The following series of patches modifies mlx5_core to work with the
dynamic MSIX API. Currently, mlx5_core allocates all the interrupt
vectors it needs and distributes them amongst the consumers. With the
introduction of dynamic MSIX support, which allows for allocation of
interrupts more than once, we now allocate vectors as we need them.
This allows other drivers running on top of mlx5_core to allocate
interrupt vectors for their own use. An example for this is mlx5_vdpa,
which uses these vectors to propagate interrupts directly from the
hardware to the vCPU [1].

As a preparation for using this series, a use after free issue is fixed
in lib/cpu_rmap.c and the allocator for rmap entries has been modified.
A complementary API for irq_cpu_rmap_add() has also been introduced.

[1] https://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux.git/patch/?id=0f2bf1fcae96a83b8c5581854713c9fc3407556e

================

* tag 'mlx5-updates-2023-03-20' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux:
  net/mlx5: Provide external API for allocating vectors
  net/mlx5: Use one completion vector if eth is disabled
  net/mlx5: Refactor calculation of required completion vectors
  net/mlx5: Move devlink registration before mlx5_load
  net/mlx5: Use dynamic msix vectors allocation
  net/mlx5: Refactor completion irq request/release code
  net/mlx5: Improve naming of pci function vectors
  net/mlx5: Use newer affinity descriptor
  net/mlx5: Modify struct mlx5_irq to use struct msi_map
  net/mlx5: Fix wrong comment
  net/mlx5e: Coding style fix, add empty line
  lib: cpu_rmap: Add irq_cpu_rmap_remove to complement irq_cpu_rmap_add
  lib: cpu_rmap: Use allocator for rmap entries
  lib: cpu_rmap: Avoid use after free on rmap->obj array entries
====================

Link: https://lore.kernel.org/r/20230324231341.29808-1-saeed@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-03-28 23:52:12 -07:00
Jakub Kicinski
b133fffe57 Merge branch 'locking/rcuref' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pulling rcurefs from Peter for tglx's work.

Link: https://lore.kernel.org/all/20230328084534.GE4253@hirez.programming.kicks-ass.net/
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-03-28 18:49:35 -07:00
Danilo Krummrich
5c63a7c32a maple_tree: export symbol mas_preallocate()
Fix missing EXPORT_SYMBOL_GPL() statement for mas_preallocate().

It isn't actually used by anything yet, but mas_preallocate() is part of
the maple tree's 'Advanced API'.  All other functions of this API are
exported already.

Link: https://lkml.kernel.org/r/20230302011035.4928-1-dakr@redhat.com
Signed-off-by: Danilo Krummrich <dakr@redhat.com>
Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>

Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-03-28 16:20:14 -07:00
Alexander Potapenko
8e00b2dffd lib/stackdepot: kmsan: mark API outputs as initialized
KMSAN does not instrument stackdepot and may treat memory allocated by it
as uninitialized.  This is not a problem for KMSAN itself, because its
functions calling stackdepot API are also not instrumented.  But other
kernel features (e.g.  netdev tracker) may access stack depot from
instrumented code, which will lead to false positives, unless we
explicitly mark stackdepot outputs as initialized.

Link: https://lkml.kernel.org/r/20230306111322.205724-1-glider@google.com
Signed-off-by: Alexander Potapenko <glider@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
Suggested-by: Dmitry Vyukov <dvyukov@google.com>
Reviewed-by: Andrey Konovalov <andreyknvl@gmail.com>
Cc: Marco Elver <elver@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-03-28 16:20:13 -07:00
Hyeonggon Yoo
4c85c0be3d mm, printk: introduce new format %pGt for page_type
%pGp format is used to display 'flags' field of a struct page.  However,
some page flags (i.e.  PG_buddy, see page-flags.h for more details) are
stored in page_type field.  To display human-readable output of page_type,
introduce %pGt format.

It is important to note the meaning of bits are different in page_type. 
if page_type is 0xffffffff, no flags are set.  Setting PG_buddy
(0x00000080) flag results in a page_type of 0xffffff7f.  Clearing a bit
actually means setting a flag.  Bits in page_type are inverted when
displaying type names.

Only values for which page_type_has_type() returns true are considered as
page_type, to avoid confusion with mapcount values.  if it returns false,
only raw values are displayed and not page type names.

Link: https://lkml.kernel.org/r/20230130042514.2418-3-42.hyeyoo@gmail.com
Signed-off-by: Hyeonggon Yoo <42.hyeyoo@gmail.com>
Reviewed-by: Petr Mladek <pmladek@suse.com>	[vsprintf part]
Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Joe Perches <joe@perches.com>
Cc: John Ogness <john.ogness@linutronix.de>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Sergey Senozhatsky <senozhatsky@chromium.org>
Cc: Steven Rostedt (Google) <rostedt@goodmis.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-03-28 16:20:09 -07:00
Nicholas Piggin
2655421ae6 lazy tlb: shoot lazies, non-refcounting lazy tlb mm reference handling scheme
On big systems, the mm refcount can become highly contented when doing a
lot of context switching with threaded applications.  user<->idle switch
is one of the important cases.  Abandoning lazy tlb entirely slows this
switching down quite a bit in the common uncontended case, so that is not
viable.

Implement a scheme where lazy tlb mm references do not contribute to the
refcount, instead they get explicitly removed when the refcount reaches
zero.

The final mmdrop() sends IPIs to all CPUs in the mm_cpumask and they
switch away from this mm to init_mm if it was being used as the lazy tlb
mm.  Enabling the shoot lazies option therefore requires that the arch
ensures that mm_cpumask contains all CPUs that could possibly be using mm.
A DEBUG_VM option IPIs every CPU in the system after this to ensure there
are no references remaining before the mm is freed.

Shootdown IPIs cost could be an issue, but they have not been observed to
be a serious problem with this scheme, because short-lived processes tend
not to migrate CPUs much, therefore they don't get much chance to leave
lazy tlb mm references on remote CPUs.  There are a lot of options to
reduce them if necessary, described in comments.

The near-worst-case can be benchmarked with will-it-scale:

  context_switch1_threads -t $(($(nproc) / 2))

This will create nproc threads (nproc / 2 switching pairs) all sharing the
same mm that spread over all CPUs so each CPU does thread->idle->thread
switching.

[ Rik came up with basically the same idea a few years ago, so credit
  to him for that. ]

Link: https://lore.kernel.org/linux-mm/20230118080011.2258375-1-npiggin@gmail.com/
Link: https://lore.kernel.org/all/20180728215357.3249-11-riel@surriel.com/
Link: https://lkml.kernel.org/r/20230203071837.1136453-5-npiggin@gmail.com
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Nadav Amit <nadav.amit@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-03-28 16:20:08 -07:00
Masami Hiramatsu (Google)
87de2163a3 lib/test_fprobe: Add a testcase for skipping exit_handler
Add a testcase for skipping exit_handler if entry_handler
returns !0.

Link: https://lkml.kernel.org/r/167526700658.433354.12922388040490848613.stgit@mhiramat.roam.corp.google.com

Cc: Florent Revest <revest@chromium.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2023-03-28 18:52:22 -04:00
Masami Hiramatsu (Google)
39d954200b fprobe: Skip exit_handler if entry_handler returns !0
Skip hooking function return and calling exit_handler if the
entry_handler() returns !0.

Link: https://lkml.kernel.org/r/167526699798.433354.10998365726830117303.stgit@mhiramat.roam.corp.google.com

Cc: Florent Revest <revest@chromium.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2023-03-28 18:52:22 -04:00
Masami Hiramatsu (Google)
7e7ef1bfe5 lib/test_fprobe: Add a test case for nr_maxactive
Add a test case for nr_maxactive. If the number of active
functions is more than nr_maxactive, it must be skipped.

Link: https://lkml.kernel.org/r/167526698856.433354.4430007340787176666.stgit@mhiramat.roam.corp.google.com

Cc: Florent Revest <revest@chromium.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2023-03-28 18:52:22 -04:00
Masami Hiramatsu (Google)
34cabf8fd1 lib/test_fprobe: Add private entry_data testcases
Add test cases for checking whether private entry_data is
correctly passed or not.

Link: https://lkml.kernel.org/r/167526697074.433354.17790288501657876219.stgit@mhiramat.roam.corp.google.com

Cc: Florent Revest <revest@chromium.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2023-03-28 18:52:22 -04:00
Masami Hiramatsu (Google)
76d0de5729 fprobe: Pass entry_data to handlers
Pass the private entry_data to the entry and exit handlers so that
they can share the context data, something like saved function
arguments etc.
User must specify the private entry_data size by @entry_data_size
field before registering the fprobe.

Link: https://lkml.kernel.org/r/167526696173.433354.17408372048319432574.stgit@mhiramat.roam.corp.google.com

Cc: Florent Revest <revest@chromium.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2023-03-28 18:52:22 -04:00
Tiezhu Yang
f478b9987c lib/Kconfig.debug: correct help info of LOCKDEP_STACK_TRACE_HASH_BITS
We can see the following definition in kernel/locking/lockdep_internals.h:

  #define STACK_TRACE_HASH_SIZE	(1 << CONFIG_LOCKDEP_STACK_TRACE_HASH_BITS)

CONFIG_LOCKDEP_STACK_TRACE_HASH_BITS is related with STACK_TRACE_HASH_SIZE
instead of MAX_STACK_TRACE_ENTRIES, fix it.

Link: https://lkml.kernel.org/r/1679380508-20830-1-git-send-email-yangtiezhu@loongson.cn
Fixes: 5dc33592e9 ("lockdep: Allow tuning tracing capacity constants.")
Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-03-28 15:24:31 -07:00
ye xingchen
35260cf545 Kconfig.debug: fix SCHED_DEBUG dependency
The path for SCHED_DEBUG is /sys/kernel/debug/sched.  So, SCHED_DEBUG
should depend on DEBUG_FS, not PROC_FS.

Link: https://lkml.kernel.org/r/202301291110098787982@zte.com.cn
Signed-off-by: ye xingchen <ye.xingchen@zte.com.cn>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Geert Uytterhoeven <geert+renesas@glider.be>
Cc: Josh Poimboeuf <jpoimboe@kernel.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: Miguel Ojeda <ojeda@kernel.org>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Zhaoyang Huang <zhaoyang.huang@unisoc.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-03-28 15:24:31 -07:00
Thomas Gleixner
ee1ee6db07 atomics: Provide rcuref - scalable reference counting
atomic_t based reference counting, including refcount_t, uses
atomic_inc_not_zero() for acquiring a reference. atomic_inc_not_zero() is
implemented with a atomic_try_cmpxchg() loop. High contention of the
reference count leads to retry loops and scales badly. There is nothing to
improve on this implementation as the semantics have to be preserved.

Provide rcuref as a scalable alternative solution which is suitable for RCU
managed objects. Similar to refcount_t it comes with overflow and underflow
detection and mitigation.

rcuref treats the underlying atomic_t as an unsigned integer and partitions
this space into zones:

  0x00000000 - 0x7FFFFFFF	valid zone (1 .. (INT_MAX + 1) references)
  0x80000000 - 0xBFFFFFFF	saturation zone
  0xC0000000 - 0xFFFFFFFE	dead zone
  0xFFFFFFFF   			no reference

rcuref_get() unconditionally increments the reference count with
atomic_add_negative_relaxed(). rcuref_put() unconditionally decrements the
reference count with atomic_add_negative_release().

This unconditional increment avoids the inc_not_zero() problem, but
requires a more complex implementation on the put() side when the count
drops from 0 to -1.

When this transition is detected then it is attempted to mark the reference
count dead, by setting it to the midpoint of the dead zone with a single
atomic_cmpxchg_release() operation. This operation can fail due to a
concurrent rcuref_get() elevating the reference count from -1 to 0 again.

If the unconditional increment in rcuref_get() hits a reference count which
is marked dead (or saturated) it will detect it after the fact and bring
back the reference count to the midpoint of the respective zone. The zones
provide enough tolerance which makes it practically impossible to escape
from a zone.

The racy implementation of rcuref_put() requires to protect rcuref_put()
against a grace period ending in order to prevent a subtle use after
free. As RCU is the only mechanism which allows to protect against that, it
is not possible to fully replace the atomic_inc_not_zero() based
implementation of refcount_t with this scheme.

The final drop is slightly more expensive than the atomic_dec_return()
counterpart, but that's not the case which this is optimized for. The
optimization is on the high frequeunt get()/put() pairs and their
scalability.

The performance of an uncontended rcuref_get()/put() pair where the put()
is not dropping the last reference is still on par with the plain atomic
operations, while at the same time providing overflow and underflow
detection and mitigation.

The performance of rcuref compared to plain atomic_inc_not_zero() and
atomic_dec_return() based reference counting under contention:

 -  Micro benchmark: All CPUs running a increment/decrement loop on an
    elevated reference count, which means the 0 to -1 transition never
    happens.

    The performance gain depends on microarchitecture and the number of
    CPUs and has been observed in the range of 1.3X to 4.7X

 - Conversion of dst_entry::__refcnt to rcuref and testing with the
    localhost memtier/memcached benchmark. That benchmark shows the
    reference count contention prominently.

    The performance gain depends on microarchitecture and the number of
    CPUs and has been observed in the range of 1.1X to 2.6X over the
    previous fix for the false sharing issue vs. struct
    dst_entry::__refcnt.

    When memtier is run over a real 1Gb network connection, there is a
    small gain on top of the false sharing fix. The two changes combined
    result in a 2%-5% total gain for that networked test.

Reported-by: Wangyang Guo <wangyang.guo@intel.com>
Reported-by: Arjan Van De Ven <arjan.van.de.ven@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/20230323102800.158429195@linutronix.de
2023-03-28 10:39:29 +02:00
Linus Torvalds
f768b35a23 Fixes for 6.3-rc3:
* Fix a race in the percpu counters summation code where the summation
    failed to add in the values for any CPUs that were dying but not yet
    dead.  This fixes some minor discrepancies and incorrect assertions
    when running generic/650.
 
 Signed-off-by: Darrick J. Wong <djwong@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQQ2qTKExjcn+O1o2YRKO3ySh0YRpgUCZBdAbgAKCRBKO3ySh0YR
 pkltAQCs4QO5LjYReqjUxd4cSsLtNnNon09qswRsl2GuRyI36AEAxI9QMq4Q6D9V
 ZasNbiTCkV3KPKfmp6gf1mQNLk1lGQ0=
 =Bz3q
 -----END PGP SIGNATURE-----

Merge tag 'xfs-6.3-fixes-4' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux

Pull xfs percpu counter fixes from Darrick Wong:
 "We discovered a filesystem summary counter corruption problem that was
  traced to cpu hot-remove racing with the call to percpu_counter_sum
  that sets the free block count in the superblock when writing it to
  disk. The root cause is that percpu_counter_sum doesn't cull from
  dying cpus and hence misses those counter values if the cpu shutdown
  hooks have not yet run to merge the values.

  I'm hoping this is a fairly painless fix to the problem, since the
  dying cpu mask should generally be empty. It's been in for-next for a
  week without any complaints from the bots.

   - Fix a race in the percpu counters summation code where the
     summation failed to add in the values for any CPUs that were dying
     but not yet dead. This fixes some minor discrepancies and incorrect
     assertions when running generic/650"

* tag 'xfs-6.3-fixes-4' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
  pcpcntr: remove percpu_counter_sum_all()
  fork: remove use of percpu_counter_sum_all
  pcpcntrs: fix dying cpu summation race
  cpumask: introduce for_each_cpu_or
2023-03-25 12:57:34 -07:00
Eli Cohen
71f0a24786 lib: cpu_rmap: Add irq_cpu_rmap_remove to complement irq_cpu_rmap_add
Add a function to complement irq_cpu_rmap_add(). It removes the irq from
the reverse mapping by setting the notifier to NULL. The function calls
irq_set_affinity_notifier() with NULL at the notify argument which then
cancel any pending notifier work and decrement reference on the
notifier. When ref count reaches zero, the glue pointer is kfree and the
rmap entry is set to NULL serving both to avoid second attempt to
release it and also making the rmap entry available for subsequent
mapping.

It should be noted the drivers usually creates the reverse mapping at
initialization time and remove it at unload time so we do not expect
failures in allocating rmap due to kref holding the glue entry.

Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Eli Cohen <elic@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
2023-03-24 16:04:21 -07:00
Eli Cohen
9821d8d462 lib: cpu_rmap: Use allocator for rmap entries
Use a proper allocator for rmap entries using a naive for loop. The
allocator relies on whether an entry is NULL to be considered free.
Remove the used field of rmap which is not needed.

Also, avoid crashing the kernel if an entry is not available.

Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Eli Cohen <elic@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
2023-03-24 16:04:10 -07:00
Eli Cohen
4e0473f106 lib: cpu_rmap: Avoid use after free on rmap->obj array entries
When calling irq_set_affinity_notifier() with NULL at the notify
argument, it will cause freeing of the glue pointer in the
corresponding array entry but will leave the pointer in the array. A
subsequent call to free_irq_cpu_rmap() will try to free this entry again
leading to possible use after free.

Fix that by setting NULL to the array entry and checking that we have
non-zero at the array entry when iterating over the array in
free_irq_cpu_rmap().

The current code does not suffer from this since there are no cases
where irq_set_affinity_notifier(irq, NULL) (note the NULL passed for the
notify arg) is called, followed by a call to free_irq_cpu_rmap() so we
don't hit and issue. Subsequent patches in this series excersize this
flow, hence the required fix.

Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Eli Cohen <elic@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
2023-03-24 16:03:52 -07:00
Paul E. McKenney
c521986016 locking/csd_lock: Add Kconfig option for csd_debug default
The csd_debug kernel parameter works well, but is inconvenient in cases
where it is more closely associated with boot loaders or automation than
with a particular kernel version or release.  Thererfore, provide a new
CSD_LOCK_WAIT_DEBUG_DEFAULT Kconfig option that defaults csd_debug to
1 when selected and 0 otherwise, with this latter being the default.

Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Juergen Gross <jgross@suse.com>
Link: https://lore.kernel.org/r/20230321005516.50558-1-paulmck@kernel.org
2023-03-24 11:01:25 +01:00
Geert Uytterhoeven
13684e966d lib: dhry: fix unstable smp_processor_id(_) usage
When running the in-kernel Dhrystone benchmark with
CONFIG_DEBUG_PREEMPT=y:

    BUG: using smp_processor_id() in preemptible [00000000] code: bash/938

Fix this by not using smp_processor_id() directly, but instead wrapping
the whole benchmark inside a get_cpu()/put_cpu() pair.  This makes sure
the whole benchmark is run on the same CPU core, and the reported values
are consistent.

Link: https://lkml.kernel.org/r/b0d29932bb24ad82cea7f821e295c898e9657be0.1678890070.git.geert+renesas@glider.be
Fixes: d5528cc168 ("lib: add Dhrystone benchmark test")
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reported-by: Tobias Klausmann <klausman@schwarzvogel.de>
  Link: https://bugzilla.kernel.org/show_bug.cgi?id=217179
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-03-23 17:18:35 -07:00