Commit Graph

1153827 Commits

Author SHA1 Message Date
Anton Protopopov
55171f2930 bpftool: Fix linkage with statically built libllvm
Since the commit eb9d1acf63 ("bpftool: Add LLVM as default library for
disassembling JIT-ed programs") we might link the bpftool program with the
libllvm library. This works fine when a shared libllvm library is available,
but fails if we want to link bpftool with a statically built LLVM:

  [...]
  /usr/bin/ld: /usr/local/lib/libLLVMSupport.a(CrashRecoveryContext.cpp.o): in function `llvm::CrashRecoveryContextCleanup::~CrashRecoveryContextCleanup()':
  CrashRecoveryContext.cpp:(.text._ZN4llvm27CrashRecoveryContextCleanupD0Ev+0x17): undefined reference to `operator delete(void*, unsigned long)'
  /usr/bin/ld: /usr/local/lib/libLLVMSupport.a(CrashRecoveryContext.cpp.o): in function `llvm::CrashRecoveryContext::~CrashRecoveryContext()':
  CrashRecoveryContext.cpp:(.text._ZN4llvm20CrashRecoveryContextD2Ev+0xc8): undefined reference to `operator delete(void*, unsigned long)'
  [...]

So in the case of static libllvm we need to explicitly link bpftool with
required libraries, namely, libstdc++ and those provided by the `llvm-config
--system-libs` command. We can distinguish between the shared and static cases
by using the `llvm-config --shared-mode` command.
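
A minimal sketch of that decision, written in Python purely for illustration (the actual change is made in the bpftool build files). The function name `extra_link_flags` and the idea of passing the two `llvm-config` outputs in as strings are assumptions made here, not part of the patch:

```python
def extra_link_flags(shared_mode: str, system_libs: str) -> list[str]:
    """Return the extra linker flags needed besides the LLVM libraries.

    shared_mode: output of `llvm-config --shared-mode` ("shared" or "static").
    system_libs: output of `llvm-config --system-libs`, e.g. "-lrt -ldl -lm".
    """
    if shared_mode.strip() != "static":
        # A shared libLLVM records its own dependencies, nothing to add.
        return []
    # Static archives do not pull in their dependencies, so they have to be
    # named explicitly: the system libraries plus the C++ runtime.
    return system_libs.split() + ["-lstdc++"]


if __name__ == "__main__":
    print(extra_link_flags("static", "-lrt -ldl -lm"))
    print(extra_link_flags("shared", ""))
```

The order in which the flags are appended is a choice made for this sketch only.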

Fixes: eb9d1acf63 ("bpftool: Add LLVM as default library for disassembling JIT-ed programs")
Signed-off-by: Anton Protopopov <aspsk@isovalent.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Stanislav Fomichev <sdf@google.com>
Link: https://lore.kernel.org/bpf/20221222102627.1643709-1-aspsk@isovalent.com
2022-12-22 20:09:43 +01:00
Linus Torvalds
d1ac1a2b14 perf tools fixes and improvements for v6.2: 2nd batch
 Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

Merge tag 'perf-tools-for-v6.2-2-2022-12-22' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux

Pull more perf tools updates from Arnaldo Carvalho de Melo:
 "perf tools fixes and improvements:

   - Don't stop building perf if python setuptools isn't installed, just
     disable the affected perf feature.

   - Remove explicit reference to python 2.x devel files, that warning
     is about python-devel, no matter what version, being unavailable
     and thus disabling the linking with libpython.

   - Don't use -Werror=switch-enum when building the python support that
     handles libtraceevent enumerations, as there is no good way to test
     if some specific enum entry is available with the libtraceevent
     installed on the system.

   - Introduce 'perf lock contention' --type-filter and --lock-filter,
     to filter by lock type and lock name:

        $ sudo ./perf lock record -a -- ./perf bench sched messaging

        $ sudo ./perf lock contention -E 5 -Y spinlock
         contended  total wait   max wait  avg wait      type  caller

               802     1.26 ms   11.73 us   1.58 us  spinlock  __wake_up_common_lock+0x62
                13   787.16 us  105.44 us  60.55 us  spinlock  remove_wait_queue+0x14
                12   612.96 us   78.70 us  51.08 us  spinlock  prepare_to_wait+0x27
               114   340.68 us   12.61 us   2.99 us  spinlock  try_to_wake_up+0x1f5
                83   226.38 us    9.15 us   2.73 us  spinlock  folio_lruvec_lock_irqsave+0x5e

        $ sudo ./perf lock contention -l
         contended  total wait  max wait  avg wait           address  symbol

                57     1.11 ms  42.83 us  19.54 us  ffff9f4140059000
                15   280.88 us  23.51 us  18.73 us  ffffffff9d007a40  jiffies_lock
                 1    20.49 us  20.49 us  20.49 us  ffffffff9d0d50c0  rcu_state
                 1     9.02 us   9.02 us   9.02 us  ffff9f41759e9ba0

        $ sudo ./perf lock contention -L jiffies_lock,rcu_state
         contended  total wait  max wait  avg wait      type  caller

                15   280.88 us  23.51 us  18.73 us  spinlock  tick_sched_do_timer+0x93
                 1    20.49 us  20.49 us  20.49 us  spinlock  __softirqentry_text_start+0xeb

        $ sudo ./perf lock contention -L ffff9f4140059000
         contended  total wait  max wait  avg wait      type  caller

                38   779.40 us  42.83 us  20.51 us  spinlock  worker_thread+0x50
                11   216.30 us  39.87 us  19.66 us  spinlock  queue_work_on+0x39
                 8   118.13 us  20.51 us  14.77 us  spinlock  kthread+0xe5

   - Fix splitting CC into compiler and options when checking if an
     option is present in clang to build the python binding, needed in
     systems such as yocto that set CC to, e.g.: "gcc --sysroot=/a/b/c".

   - Refresh metrics and events for Intel systems: alderlake,
     alderlake-n, bonnell, broadwell, broadwellde, broadwellx,
     cascadelakex, elkhartlake, goldmont, goldmontplus, haswell,
     haswellx, icelake, icelakex, ivybridge, ivytown, jaketown,
     knightslanding, meteorlake, nehalemep, nehalemex, sandybridge,
     sapphirerapids, silvermont, skylake, skylakex, snowridgex,
     tigerlake, westmereep-dp, westmereep-sp, westmereex.

   - Add vendor events files (JSON) for AMD Zen 4, from sections
     2.1.15.4 "Core Performance Monitor Counters", 2.1.15.5 "L3 Cache
     Performance Monitor Counters" and Section 7.1 "Fabric Performance
     Monitor Counter (PMC) Events" in the Processor Programming
     Reference (PPR) for AMD Family 19h Model 11h Revision B1
     processors.

     This constitutes events which capture op dispatch, execution and
     retirement, branch prediction, L1 and L2 cache activity, TLB
     activity, L3 cache activity and data bandwidth for various links
     and interfaces in the Data Fabric.

   - Also, from the same PPR are metrics taken from Section 2.1.15.2
     "Performance Measurement", including pipeline utilization, which
     are new to Zen 4 processors and useful for finding performance
     bottlenecks by analyzing activity at different stages of the
     pipeline.

   - Greatly improve the 'srcline', 'srcline_from', 'srcline_to' and
     'srcfile' sort keys performance by postponing calling the external
     addr2line utility to the collapse phase of histogram bucketing.

   - Fix 'perf test' "all PMU test" to skip parametrized events, which
     require setup and are not supported by this test.

   - Update tools/ copies of kernel headers: features,
     disabled-features, fscrypt.h, i915_drm.h, msr-index.h, power pc
     syscall table and kvm.h.

   - Add .DELETE_ON_ERROR special Makefile target to clean up partially
     updated files on error.

   - Simplify the mksyscalltbl script for arm64 by no longer running the
     host compiler to create the syscall table; do it all with the
     shell script.

   - Further fixes to honour quiet mode (-q)"

* tag 'perf-tools-for-v6.2-2-2022-12-22' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux: (67 commits)
  perf python: Fix splitting CC into compiler and options
  perf scripting python: Don't be strict at handling libtraceevent enumerations
  perf arm64: Simplify mksyscalltbl
  perf build: Remove explicit reference to python 2.x devel files
  perf vendor events amd: Add Zen 4 mapping
  perf vendor events amd: Add Zen 4 metrics
  perf vendor events amd: Add Zen 4 uncore events
  perf vendor events amd: Add Zen 4 core events
  perf vendor events intel: Refresh westmereex events
  perf vendor events intel: Refresh westmereep-sp events
  perf vendor events intel: Refresh westmereep-dp events
  perf vendor events intel: Refresh tigerlake metrics and events
  perf vendor events intel: Refresh snowridgex events
  perf vendor events intel: Refresh skylakex metrics and events
  perf vendor events intel: Refresh skylake metrics and events
  perf vendor events intel: Refresh silvermont events
  perf vendor events intel: Refresh sapphirerapids metrics and events
  perf vendor events intel: Refresh sandybridge metrics and events
  perf vendor events intel: Refresh nehalemex events
  perf vendor events intel: Refresh nehalemep events
  ...
2022-12-22 11:07:29 -08:00
Mario Limonciello
e555c85792 ACPI: x86: s2idle: Stop using AMD specific codepath for Rembrandt+
After we introduced a module parameter and quirk infrastructure for
picking the Microsoft GUID over the SOC vendor GUID we discovered
that lots and lots of systems are getting this wrong.

The table continues to grow, and is becoming unwieldy.

There is no real benefit in forcing vendors to populate the
AMD GUID. This is just extra work, and more and more vendors seem
to mess it up.  As the Microsoft GUID is used by Windows as well,
it's very likely that it won't be messed up like this.

So drop all the quirks forcing it and the Rembrandt behavior. This
means that Cezanne or later effectively only run the Microsoft GUID
codepath with the exception of HP Elitebook 8*5 G9.

Fixes: fd894f05cf ("ACPI: x86: s2idle: If a new AMD _HID is missing assume Rembrandt")
Cc: stable@vger.kernel.org # 6.1
Reported-by: Benjamin Cheng <ben@bcheng.me>
Reported-by: bilkow@tutanota.com
Reported-by: Paul <paul@zogpog.com>
Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2292
Link: https://bugzilla.kernel.org/show_bug.cgi?id=216768
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Reviewed-by: Philipp Zabel <philipp.zabel@gmail.com>
Tested-by: Philipp Zabel <philipp.zabel@gmail.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-12-22 17:39:31 +01:00
Mario Limonciello
3ea45390e9 ACPI: x86: s2idle: Force AMD GUID/_REV 2 on HP Elitebook 865
HP Elitebook 865 supports both the AMD GUID w/ _REV 2 and Microsoft
GUID with _REV 0. Both have very similar code but the AMD GUID
has a special workaround that is specific to a problem with
spurious wakeups on systems with Qualcomm WLAN.

This is believed to be a bug in the Qualcomm WLAN F/W (it doesn't
affect any other WLAN H/W). If this WLAN firmware is fixed this
quirk can be dropped.

Cc: stable@vger.kernel.org # 6.1
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-12-22 17:39:31 +01:00
Hans de Goede
3cf3b7f012 ACPI: video: Fix Apple GMUX backlight detection
The apple-gmux driver only binds to old GMUX devices which have an
IORESOURCE_IO resource (using inb()/outb()) rather than memory-mapped
IO (IORESOURCE_MEM).

T2 MacBooks use the new style GMUX devices (with IORESOURCE_MEM access),
so these are not supported by the apple-gmux driver. This is not a problem
since they have working ACPI video backlight support.

But the apple_gmux_present() helper only checks if an ACPI device with
the "APP000B" HID is present, causing acpi_video_get_backlight_type()
to return acpi_backlight_apple_gmux, disabling the acpi_video backlight
device.

Add a new apple_gmux_backlight_present() helper which checks that
the "APP000B" device actually is an old GMUX device with an IORESOURCE_IO
resource.

This fixes the acpi_video0 backlight no longer registering on T2 MacBooks.

Note people are working to add support for the new style GMUX to Linux:
https://github.com/kekrby/linux-t2/commits/wip/hybrid-graphics

Once this lands this patch should be reverted so that
acpi_video_get_backlight_type() also prefers the gmux on new style GMUX
MacBooks, but for now this is necessary to avoid regressing backlight
control on T2 Macs.

Fixes: 21245df307 ("ACPI: video: Add Apple GMUX brightness control detection")
Reported-and-tested-by: Aditya Garg <gargaditya08@live.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-12-22 17:36:49 +01:00
Hans de Goede
7203481fd1 ACPI: resource: Add Asus ExpertBook B2502 to Asus quirks
The Asus ExpertBook B2502 has the same keyboard issue as Asus Vivobook
K3402ZA/K3502ZA. The kernel overrides IRQ 1 to Edge_High when it
should be Active_Low.

This patch adds the ExpertBook B2502 model to the existing
quirk list of Asus laptops with this issue.

Fixes: b5f9223a10 ("ACPI: resource: Skip IRQ override on Asus Vivobook S5602ZA")
Link: https://bugzilla.redhat.com/show_bug.cgi?id=2142574
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-12-22 17:35:29 +01:00
Adrian Freund
f3cb9b7408 ACPI: resource: do IRQ override on Lenovo 14ALC7
Commit bfcdf58380 ("ACPI: resource: do IRQ override on LENOVO IdeaPad")
added an override for Lenovo IdeaPad 5 16ALC7. The 14ALC7 variant also
suffers from a broken touchscreen and trackpad.

Fixes: 9946e39fe8 ("ACPI: resource: skip IRQ override on AMD Zen platforms")
Link: https://bugzilla.kernel.org/show_bug.cgi?id=216804
Signed-off-by: Adrian Freund <adrian@freund.io>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-12-22 17:32:34 +01:00
Erik Schumacher
7592b79ba4 ACPI: resource: do IRQ override on XMG Core 15
The Schenker XMG CORE 15 (M22) is Ryzen-6 based and needs IRQ overriding
for the keyboard to work. Adding an entry for this laptop to the
override_table makes the internal keyboard functional again.

Signed-off-by: Erik Schumacher <ofenfisch@googlemail.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-12-22 17:29:49 +01:00
Mario Limonciello
5aa9d943e9 ACPI: video: Don't enable fallback path for creating ACPI backlight by default
The ACPI video detection code has a module parameter
`register_backlight_delay` which is currently configured to 8 seconds.
This means that if after 8 seconds of booting no native driver has created
a backlight device then the code will attempt to make an ACPI video
backlight device.

This was intended as a safety mechanism with the backlight overhaul that
occurred in kernel 6.1, but as it doesn't appear necessary, disable it
by default.
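
The fallback described above can be modeled roughly as follows. This is an illustrative Python sketch, not the kernel code: the function name and its arguments are hypothetical, and treating a delay of 0 as "fallback disabled" is an assumption made for the sketch:

```python
def fallback_registers_backlight(register_backlight_delay: int,
                                 seconds_since_boot: float,
                                 native_backlight_present: bool) -> bool:
    """Decide whether the ACPI video fallback would create a backlight device."""
    if register_backlight_delay == 0:
        # Assumption of this sketch: 0 means the fallback path is disabled.
        return False
    if native_backlight_present:
        # A native driver already created a backlight device.
        return False
    # No native device showed up before the delay expired.
    return seconds_since_boot >= register_backlight_delay


if __name__ == "__main__":
    print(fallback_registers_backlight(8, 9.0, False))  # old default of 8 seconds
    print(fallback_registers_backlight(0, 9.0, False))  # fallback disabled
```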

Suggested-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-12-22 17:26:42 +01:00
Mario Limonciello
c573e24060 drm/amd/display: Report to ACPI video if no panels were found
On desktop APUs amdgpu doesn't create a native backlight device
as no eDP panels are found.  However if the BIOS has reported
backlight control methods in the ACPI tables then an acpi_video0
backlight device will be made 8 seconds after boot.

This has manifested in a power slider on a number of desktop APUs
ranging from Ryzen 5000 through Ryzen 7000 on various motherboard
manufacturers. To avoid this, report to the acpi video detection
that the system does not have any panel connected in the native
driver.

Link: https://bugzilla.redhat.com/show_bug.cgi?id=1783786
Reported-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-12-22 17:26:42 +01:00
Mario Limonciello
00a734104a ACPI: video: Allow GPU drivers to report no panels
The current ACPI backlight detection logic creates a backlight
device if no native or vendor driver has created one within
8 seconds of boot, provided the ACPI tables include backlight
control methods.

If the GPU drivers have loaded, they may be able to report whether
any LCD panels were found.  Allow using this information to factor
in whether to enable the fallback logic for making an acpi_video0
backlight device.

Suggested-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-12-22 17:26:41 +01:00
Jens Axboe
fb857b0bb2 nvme fixes for Linux 6.2

Merge tag 'nvme-6.2-2022-12-22' of git://git.infradead.org/nvme into block-6.2

Pull NVMe fixes from Christoph:

"nvme fixes for Linux 6.2

 - fix doorbell buffer value endianness (Klaus Jensen)
 - fix Linux vs NVMe page size mismatch (Keith Busch)
 - fix a potential memory access beyond the allocation limit
   (Keith Busch)
 - fix a multipath vs blktrace NULL pointer dereference
   (Yanjun Zhang)"

* tag 'nvme-6.2-2022-12-22' of git://git.infradead.org/nvme:
  nvme: fix multipath crash caused by flush request when blktrace is enabled
  nvme-pci: fix page size checks
  nvme-pci: fix mempool alloc size
  nvme-pci: fix doorbell buffer value endianness
2022-12-22 09:22:35 -07:00
Jeff Layton
789e1e10f2 nfsd: shut down the NFSv4 state objects before the filecache
Currently, we shut down the filecache before trying to clean up the
stateids that depend on it. This leads to the kernel trying to free an
nfsd_file twice, and a refcount overput on the nf_mark.

Change the shutdown procedure to tear down all of the stateids prior
to shutting down the filecache.

Reported-and-tested-by: Wang Yugui <wangyugui@e16-tech.com>
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Fixes: 5e113224c1 ("nfsd: nfsd_file cache entries should be per net namespace")
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2022-12-22 10:12:56 -05:00
Arnaldo Carvalho de Melo
09e6f9f983 perf python: Fix splitting CC into compiler and options
Noticed this build failure on archlinux:base when building with clang:

  clang-14: error: optimization flag '-ffat-lto-objects' is not supported [-Werror,-Wignored-optimization-argument]

In tools/perf/util/setup.py we check if clang supports that option, but
since commit 3cad53a6f9 ("perf python: Account for multiple words
in CC") this got broken as in the common case where CC="clang":

  >>> cc="clang"
  >>> print(cc.split()[0])
  clang
  >>> option="-ffat-lto-objects"
  >>> print(str(cc.split()[1:]) + option)
  []-ffat-lto-objects
  >>>

Popen then calls clang with that bogus option name, which does not
produce the b"unknown argument" or b"is not supported" output that this
function uses to detect an unavailable option, so clang is later called
with an unknown/unsupported option.

Fix it by checking whether the provided CC variable really contains
options, and if so override 'cc' with the first token and append the
options to the 'option' variable.
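
A minimal sketch of the splitting idea, not the code from tools/perf/util/setup.py. The helper names `split_cc` and `probe_argv` are hypothetical, and unlike the fix described above this variant keeps the extra CC words as separate argv entries instead of prepending them to the option string. It assumes CC is non-empty:

```python
def split_cc(cc: str) -> tuple[str, list[str]]:
    """Split a CC value such as "gcc --sysroot=/a/b/c" into compiler and options."""
    tokens = cc.split()
    # First word is the compiler, any remaining words are its options.
    return tokens[0], tokens[1:]


def probe_argv(cc: str, option: str, source: str = "test-hello-world.c") -> list[str]:
    """Build the argument vector used to probe whether `option` is supported."""
    compiler, cc_options = split_cc(cc)
    # With CC="clang" cc_options is an empty list, so nothing bogus such as
    # "[]-ffat-lto-objects" can end up on the command line.
    return [compiler, *cc_options, option, source]


if __name__ == "__main__":
    print(probe_argv("clang", "-ffat-lto-objects"))
    print(probe_argv("gcc --sysroot=/a/b/c", "-ffat-lto-objects"))
```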

Fixes: 3cad53a6f9 ("perf python: Account for multiple words in CC")
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Fangrui Song <maskray@google.com>
Cc: Florian Fainelli <f.fainelli@gmail.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Keeping <john@metanate.com>
Cc: Khem Raj <raj.khem@gmail.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Sedat Dilek <sedat.dilek@gmail.com>
Link: http://lore.kernel.org/lkml/Y6Rq5F5NI0v1QQHM@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-12-22 11:34:30 -03:00
Shawn Bohrer
fa349e396e veth: Fix race with AF_XDP exposing old or uninitialized descriptors
When AF_XDP is used on a veth interface the RX ring is updated in two
steps.  veth_xdp_rcv() removes packet descriptors from the FILL ring,
fills them, and places them in the RX ring, updating the cached_prod
pointer.  Later xdp_do_flush() syncs the RX ring prod pointer with the
cached_prod pointer allowing user-space to see the recently filled in
descriptors.  The rings are intended to be SPSC, however the existing
order in veth_poll allows the xdp_do_flush() to run concurrently with
another CPU creating a race condition that allows user-space to see old
or uninitialized descriptors in the RX ring.  This bug has been observed
in production systems.

To summarize, we are expecting this ordering:

CPU 0 __xsk_rcv_zc()
CPU 0 __xsk_map_flush()
CPU 2 __xsk_rcv_zc()
CPU 2 __xsk_map_flush()

But we are seeing this order:

CPU 0 __xsk_rcv_zc()
CPU 2 __xsk_rcv_zc()
CPU 0 __xsk_map_flush()
CPU 2 __xsk_map_flush()

This occurs because we rely on NAPI to ensure that only one napi_poll
handler is running at a time for the given veth receive queue.
napi_schedule_prep() will prevent multiple instances from getting
scheduled. However calling napi_complete_done() signals that this
napi_poll is complete and allows subsequent calls to
napi_schedule_prep() and __napi_schedule() to succeed in scheduling a
concurrent napi_poll before the xdp_do_flush() has been called.  For the
veth driver a concurrent call to napi_schedule_prep() and
__napi_schedule() can occur on a different CPU because the veth xmit
path can additionally schedule a napi_poll creating the race.

The fix as suggested by Magnus Karlsson, is to simply move the
xdp_do_flush() call before napi_complete_done().  This syncs the
producer ring pointers before another instance of napi_poll can be
scheduled on another CPU.  It will also slightly improve performance by
moving the flush closer to when the descriptors were placed in the
RX ring.
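
The two orderings can be replayed with a toy model. This is a simplified Python sketch under stated assumptions, not kernel code: the `RxRing` class and its methods are hypothetical, and a receive is split into a "reserve" step (advance cached_prod) and a "fill" step (write the descriptor) so that the interleaving can be shown deterministically:

```python
class RxRing:
    """Toy single-producer ring; `prod` is how far user space may read."""

    def __init__(self, size: int) -> None:
        self.desc = [None] * size  # None stands for a stale or unwritten slot
        self.cached_prod = 0       # producer-private index
        self.prod = 0              # index published to the consumer

    def reserve(self) -> int:
        idx = self.cached_prod
        self.cached_prod += 1
        return idx

    def fill(self, idx: int, payload: str) -> None:
        self.desc[idx % len(self.desc)] = payload

    def flush(self) -> None:
        # Publish everything reserved so far, written or not.
        self.prod = self.cached_prod

    def visible(self) -> list:
        return [self.desc[i % len(self.desc)] for i in range(self.prod)]


def expected_order() -> list:
    ring = RxRing(4)
    ring.fill(ring.reserve(), "cpu0")  # CPU 0 receive
    ring.flush()                       # CPU 0 flush
    ring.fill(ring.reserve(), "cpu2")  # CPU 2 receive
    ring.flush()                       # CPU 2 flush
    return ring.visible()


def racy_order() -> list:
    ring = RxRing(4)
    ring.fill(ring.reserve(), "cpu0")  # CPU 0 receive
    idx = ring.reserve()               # CPU 2 reserved a slot, not written yet
    ring.flush()                       # CPU 0 flush also publishes CPU 2's slot
    snapshot = ring.visible()          # what user space can see at this point
    ring.fill(idx, "cpu2")             # CPU 2 writes too late
    return snapshot


if __name__ == "__main__":
    print(expected_order())
    print(racy_order())
```

In the racy run the consumer's view contains an unwritten slot, which is the symptom the commit message describes.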

Fixes: d1396004dd ("veth: Add XDP TX and REDIRECT")
Suggested-by: Magnus Karlsson <magnus.karlsson@gmail.com>
Signed-off-by: Shawn Bohrer <sbohrer@cloudflare.com>
Link: https://lore.kernel.org/r/20221220185903.1105011-1-sbohrer@cloudflare.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-12-22 15:06:10 +01:00
David Howells
a9eb558a5b afs: Stop implementing ->writepage()
We're trying to get rid of the ->writepage() hook[1].  Stop afs from using
it by unlocking the page and calling afs_writepages_region() rather than
folio_write_one().

A flag is passed to afs_writepages_region() to indicate that it should only
write a single region so that we don't flush the entire file in
->write_begin(), but do add other dirty data to the region being written to
try and reduce the number of RPC ops.

This requires ->migrate_folio() to be implemented, so point that at
filemap_migrate_folio() for files and also for symlinks and directories.

This can be tested by turning on the afs_folio_dirty tracepoint and then
doing something like:

   xfs_io -c "w 2223 7000" -c "w 15000 22222" -c "w 23 7" /afs/my/test/foo

and then looking in the trace to see if the write at position 15000 gets
stored before page 0 gets dirtied for the write at position 23.

Signed-off-by: David Howells <dhowells@redhat.com>
cc: Marc Dionne <marc.dionne@auristor.com>
cc: Christoph Hellwig <hch@lst.de>
cc: Matthew Wilcox <willy@infradead.org>
cc: linux-afs@lists.infradead.org
Link: https://lore.kernel.org/r/20221113162902.883850-1-hch@lst.de/ [1]
Link: https://lore.kernel.org/r/166876785552.222254.4403222906022558715.stgit@warthog.procyon.org.uk/ # v1
2022-12-22 11:40:35 +00:00
Gaosheng Cui
b3d3ca5567 afs: remove afs_cache_netfs and afs_zap_permits() declarations
afs_zap_permits() has been removed since
commit be080a6f43 ("afs: Overhaul permit caching").

afs_cache_netfs has been removed since
commit 523d27cda1 ("afs: Convert afs to use the new fscache API").

So remove their declarations from the header file.

Signed-off-by: Gaosheng Cui <cuigaosheng1@huawei.com>
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Marc Dionne <marc.dionne@auristor.com>
cc: linux-afs@lists.infradead.org
Link: https://lore.kernel.org/r/20220909070353.1160228-1-cuigaosheng1@huawei.com/
2022-12-22 11:40:35 +00:00
Colin Ian King
318b83b712 afs: remove variable nr_servers
Variable nr_servers is no longer being used, the last reference
to it was removed in commit 45df846273 ("afs: Fix server list handling")
so clean up the code by removing it.

Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Marc Dionne <marc.dionne@auristor.com>
cc: linux-afs@lists.infradead.org
Link: https://lore.kernel.org/r/20221020173923.21342-1-colin.i.king@gmail.com/
2022-12-22 11:40:35 +00:00
David Howells
36f82c93ee afs: Fix lost servers_outstanding count
The afs_fs_probe_dispatcher() work function is passed a count on
net->servers_outstanding when it is scheduled (which may come via its
timer).  This is passed back to the work_item, passed to the timer or
dropped at the end of the dispatcher function.

But, at the top of the dispatcher function, there are two checks which
skip the rest of the function: if the network namespace is being destroyed
or if there are no fileservers to probe.  These two return paths, however,
do not drop the count passed to the dispatcher, and so, sometimes, the
destruction of a network namespace, such as induced by rmmod of the kafs
module, may get stuck in afs_purge_servers(), waiting for
net->servers_outstanding to become zero.

Fix this by adding the missing decrements in afs_fs_probe_dispatcher().
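
The leak can be illustrated with a small counter model. This is a hypothetical Python sketch, not the afs code: `Net`, `schedule_dispatcher` and `dispatcher` are names invented here, and the normal path is simplified to always drop the count instead of handing it on to the work item or timer:

```python
from dataclasses import dataclass, field


@dataclass
class Net:
    live: bool = True
    probe_queue: list = field(default_factory=list)
    servers_outstanding: int = 0


def schedule_dispatcher(net: Net) -> None:
    # The dispatcher is handed one count when it is scheduled.
    net.servers_outstanding += 1


def dispatcher(net: Net, fixed: bool = True) -> None:
    if not net.live or not net.probe_queue:
        # Early return: before the fix the count was not dropped here.
        if fixed:
            net.servers_outstanding -= 1
        return
    net.probe_queue.pop(0)        # probe one fileserver
    net.servers_outstanding -= 1  # simplified: drop the count at the end


if __name__ == "__main__":
    for fixed in (False, True):
        net = Net(live=False)
        schedule_dispatcher(net)
        dispatcher(net, fixed=fixed)
        print(fixed, net.servers_outstanding)
```

With `fixed=False` the counter never returns to zero, which corresponds to teardown waiting forever in the scenario described above.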

Fixes: f6cbb368bc ("afs: Actively poll fileservers to maintain NAT or firewall openings")
Reported-by: Marc Dionne <marc.dionne@auristor.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Tested-by: Marc Dionne <marc.dionne@auristor.com>
cc: linux-afs@lists.infradead.org
Link: https://lore.kernel.org/r/167164544917.2072364.3759519569649459359.stgit@warthog.procyon.org.uk/
2022-12-22 11:40:35 +00:00
Horatiu Vultur
d717f9474e net: lan966x: Fix configuration of the PCS
When the PCS was taken out of reset, the speed was also changed to
100 Mbit by mistake. If the link went down, the link-up routine set the
link speed correctly again. But if the link never went down, the port
was forced to run at 100 Mbit even if the actual speed was different.
On lan966x, to set the speed link to 1G or 2.5G a value of 1 needs to be
written in DEV_CLOCK_CFG_LINK_SPEED. This is similar to the procedure in
lan966x_port_init.

The issue was reproduced using 1000base-x sfp module using the commands:
ip link set dev eth2 up
ip addr add 10.97.10.2/24 dev eth2
ethtool -s eth2 speed 1000 autoneg off

Fixes: d28d6d2e37 ("net: lan966x: add port module support")
Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com>
Reviewed-by: Piotr Raczynski <piotr.raczynski@intel.com>
Link: https://lore.kernel.org/r/20221221093315.939133-1-horatiu.vultur@microchip.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-12-22 12:21:05 +01:00
Eric Dumazet
42c7ded0ee bonding: fix lockdep splat in bond_miimon_commit()
bond_miimon_commit() is run while RTNL is held, not RCU.

WARNING: suspicious RCU usage
6.1.0-syzkaller-09671-g89529367293c #0 Not tainted
-----------------------------
drivers/net/bonding/bond_main.c:2704 suspicious rcu_dereference_check() usage!

Fixes: e95cc44763 ("bonding: do failover when high prio link up")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Cc: Hangbin Liu <liuhangbin@gmail.com>
Cc: Jay Vosburgh <j.vosburgh@gmail.com>
Cc: Veaceslav Falico <vfalico@gmail.com>
Cc: Andy Gospodarek <andy@greyhouse.net>
Link: https://lore.kernel.org/r/20221220130831.1480888-1-edumazet@google.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-12-22 10:40:35 +01:00
Pablo Neira Ayuso
123b99619c netfilter: nf_tables: honor set timeout and garbage collection updates
Updates to the set timeout and garbage collection interval are
currently ignored. Add a transaction to update the global set element
timeout and garbage collection interval.

Fixes: 96518518cc ("netfilter: add nftables")
Suggested-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2022-12-22 10:36:37 +01:00
Yanjun Zhang
3659fb5ac2 nvme: fix multipath crash caused by flush request when blktrace is enabled
The flush request initialized by blk_kick_flush() has a NULL bio,
and it may be handled by nvme_end_req() during I/O completion.
When blktrace is enabled and multipath is activated,
nvme_trace_bio_complete() tries to access the NULL bio pointer of the
flush request, which results in the following crash:

[ 2517.831677] BUG: kernel NULL pointer dereference, address: 000000000000001a
[ 2517.835213] #PF: supervisor read access in kernel mode
[ 2517.838724] #PF: error_code(0x0000) - not-present page
[ 2517.842222] PGD 7b2d51067 P4D 0
[ 2517.845684] Oops: 0000 [#1] SMP NOPTI
[ 2517.849125] CPU: 2 PID: 732 Comm: kworker/2:1H Kdump: loaded Tainted: G S                5.15.67-0.cl9.x86_64 #1
[ 2517.852723] Hardware name: XFUSION 2288H V6/BC13MBSBC, BIOS 1.13 07/27/2022
[ 2517.856358] Workqueue: nvme_tcp_wq nvme_tcp_io_work [nvme_tcp]
[ 2517.859993] RIP: 0010:blk_add_trace_bio_complete+0x6/0x30
[ 2517.863628] Code: 1f 44 00 00 48 8b 46 08 31 c9 ba 04 00 10 00 48 8b 80 50 03 00 00 48 8b 78 50 e9 e5 fe ff ff 0f 1f 44 00 00 41 54 49 89 f4 55 <0f> b6 7a 1a 48 89 d5 e8 3e 1c 2b 00 48 89 ee 4c 89 e7 5d 89 c1 ba
[ 2517.871269] RSP: 0018:ff7f6a008d9dbcd0 EFLAGS: 00010286
[ 2517.875081] RAX: ff3d5b4be00b1d50 RBX: 0000000002040002 RCX: ff3d5b0a270f2000
[ 2517.878966] RDX: 0000000000000000 RSI: ff3d5b0b021fb9f8 RDI: 0000000000000000
[ 2517.882849] RBP: ff3d5b0b96a6fa00 R08: 0000000000000001 R09: 0000000000000000
[ 2517.886718] R10: 000000000000000c R11: 000000000000000c R12: ff3d5b0b021fb9f8
[ 2517.890575] R13: 0000000002000000 R14: ff3d5b0b021fb1b0 R15: 0000000000000018
[ 2517.894434] FS:  0000000000000000(0000) GS:ff3d5b42bfc80000(0000) knlGS:0000000000000000
[ 2517.898299] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 2517.902157] CR2: 000000000000001a CR3: 00000004f023e005 CR4: 0000000000771ee0
[ 2517.906053] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 2517.909930] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 2517.913761] PKRU: 55555554
[ 2517.917558] Call Trace:
[ 2517.921294]  <TASK>
[ 2517.924982]  nvme_complete_rq+0x1c3/0x1e0 [nvme_core]
[ 2517.928715]  nvme_tcp_recv_pdu+0x4d7/0x540 [nvme_tcp]
[ 2517.932442]  nvme_tcp_recv_skb+0x4f/0x240 [nvme_tcp]
[ 2517.936137]  ? nvme_tcp_recv_pdu+0x540/0x540 [nvme_tcp]
[ 2517.939830]  tcp_read_sock+0x9c/0x260
[ 2517.943486]  nvme_tcp_try_recv+0x65/0xa0 [nvme_tcp]
[ 2517.947173]  nvme_tcp_io_work+0x64/0x90 [nvme_tcp]
[ 2517.950834]  process_one_work+0x1e8/0x390
[ 2517.954473]  worker_thread+0x53/0x3c0
[ 2517.958069]  ? process_one_work+0x390/0x390
[ 2517.961655]  kthread+0x10c/0x130
[ 2517.965211]  ? set_kthread_struct+0x40/0x40
[ 2517.968760]  ret_from_fork+0x1f/0x30
[ 2517.972285]  </TASK>

To avoid this situation, add a NULL check for req->bio before
calling trace_block_bio_complete.
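
A minimal userspace sketch of the guard, assuming simplified stand-in types:
struct bio_model, struct request_model, trace_bio_complete() and
end_request() are hypothetical names for illustration, not the nvme or block
layer API. The trace hook reads through the bio pointer, so the caller skips
it for requests that carry no bio.

```c
#include <stddef.h>

/* Hypothetical stand-ins for struct bio and struct request. */
struct bio_model {
	int status;
};

struct request_model {
	struct bio_model *bio; /* NULL for a flush request in this model */
};

static int traced_completions;

/* Reads through req->bio, so it must only be called with a non-NULL bio. */
static void trace_bio_complete(struct request_model *req)
{
	(void)req->bio->status;
	traced_completions++;
}

/* Completion path: only trace when the request actually has a bio. */
static void end_request(struct request_model *req)
{
	if (req->bio)
		trace_bio_complete(req);
}
```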

Signed-off-by: Yanjun Zhang <zhangyanjun@cestc.cn>
Signed-off-by: Christoph Hellwig <hch@lst.de>
2022-12-22 09:40:27 +01:00
Takashi Iwai
6bf5f9a8b4 ASoC: Updates for v6.2
Merge tag 'asoc-v6.2-3' of https://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound into for-linus

ASoC: Updates for v6.2

Some more small fixes and board quirks that came in since my last
update, the main one being the fixes from Kai for issues around the
attempts to get kexec working well on SOF based systems.
2022-12-22 09:18:38 +01:00
Jaroslav Kysela
fd28941cff ALSA: usb-audio: Add new quirk FIXED_RATE for JBL Quantum810 Wireless
It seems that the firmware is broken and does not accept
the UAC_EP_CS_ATTR_SAMPLE_RATE URB. There is only one rate (48000Hz)
available in the descriptors for the output endpoint.

Create a new quirk QUIRK_FLAG_FIXED_RATE to skip the rate setup
when only one rate is available (fixed).

BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=216798
Signed-off-by: Jaroslav Kysela <perex@perex.cz>
Link: https://lore.kernel.org/r/20221215153037.1163786-1-perex@perex.cz
Signed-off-by: Takashi Iwai <tiwai@suse.de>
2022-12-22 09:13:54 +01:00
Jiapeng Chong
a95e163a4b ALSA: azt3328: Remove the unused function snd_azf3328_codec_outl()
The function snd_azf3328_codec_outl is defined in the azt3328.c file, but
not called elsewhere, so remove this unused function.

sound/pci/azt3328.c:367:1: warning: unused function 'snd_azf3328_codec_outl'.

Link: https://bugzilla.openanolis.cn/show_bug.cgi?id=3432
Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>
Link: https://lore.kernel.org/r/20221213061355.62856-1-jiapeng.chong@linux.alibaba.com
Signed-off-by: Takashi Iwai <tiwai@suse.de>
2022-12-22 09:12:26 +01:00
Takashi Iwai
2d78eb0342 Merge branch 'for-next' into for-linus 2022-12-22 09:11:48 +01:00
Linus Torvalds
9d2f6060fe Tracing fix for 6.2:
Merge tag 'trace-v6.2-1' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace

Pull tracing fix from Steven Rostedt:
 "I missed this minor hardening of the kernel in the first pull.

   - Make monitor structures read only"

* tag 'trace-v6.2-1' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
  rv/monitors: Move monitor structure in rodata
2022-12-21 19:03:42 -08:00
Linus Torvalds
af9b3fa15d Trace probes updates for 6.2:
Merge tag 'trace-probes-v6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace

Pull trace probes updates from Steven Rostedt:

 - New "symstr" type for dynamic events that writes the name of the
   function+offset into the ring buffer and not just the address

 - Prevent kernel symbol processing on addresses in user space probes
   (uprobes).

 - And minor fixes and clean ups

* tag 'trace-probes-v6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
  tracing/probes: Reject symbol/symstr type for uprobe
  tracing/probes: Add symstr type for dynamic events
  kprobes: kretprobe events missing on 2-core KVM guest
  kprobes: Fix check for probe enabled in kill_kprobe()
  test_kprobes: Fix implicit declaration error of test_kprobes
  tracing: Fix race where eprobes can be called before the event
2022-12-21 18:57:24 -08:00
Linus Torvalds
7a5189c58b KVM/riscv changes for 6.2
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull RISC-V kvm updates from Paolo Bonzini:

 - Allow unloading KVM module

 - Allow KVM user-space to set mvendorid, marchid, and mimpid

 - Several fixes and cleanups

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  RISC-V: KVM: Add ONE_REG interface for mvendorid, marchid, and mimpid
  RISC-V: KVM: Save mvendorid, marchid, and mimpid when creating VCPU
  RISC-V: Export sbi_get_mvendorid() and friends
  RISC-V: KVM: Move sbi related struct and functions to kvm_vcpu_sbi.h
  RISC-V: KVM: Use switch-case in kvm_riscv_vcpu_set/get_reg()
  RISC-V: KVM: Remove redundant includes of asm/csr.h
  RISC-V: KVM: Remove redundant includes of asm/kvm_vcpu_timer.h
  RISC-V: KVM: Fix reg_val check in kvm_riscv_vcpu_set_reg_config()
  RISC-V: KVM: Simplify kvm_arch_prepare_memory_region()
  RISC-V: KVM: Exit run-loop immediately if xfer_to_guest fails
  RISC-V: KVM: use vma_lookup() instead of find_vma_intersection()
  RISC-V: KVM: Add exit logic to main.c
2022-12-21 18:52:15 -08:00
Jakub Kicinski
43ae218f69 Merge branch 'mptcp-locking-fixes'
Mat Martineau says:

====================
mptcp: Locking fixes

Two separate locking fixes for the networking tree:

Patch 1 addresses a MPTCP fastopen error-path deadlock that was found
with syzkaller.

Patch 2 works around a lockdep false-positive between MPTCP listening and
non-listening sockets at socket destruct time.
====================

Link: https://lore.kernel.org/r/20221220195215.238353-1-mathew.j.martineau@linux.intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-12-21 18:06:02 -08:00
Paolo Abeni
fec3adfd75 mptcp: fix lockdep false positive
MattB reported a lockdep splat in the mptcp listener code cleanup:

 WARNING: possible circular locking dependency detected
 packetdrill/14278 is trying to acquire lock:
 ffff888017d868f0 ((work_completion)(&msk->work)){+.+.}-{0:0}, at: __flush_work (kernel/workqueue.c:3069)

 but task is already holding lock:
 ffff888017d84130 (sk_lock-AF_INET){+.+.}-{0:0}, at: mptcp_close (net/mptcp/protocol.c:2973)

 which lock already depends on the new lock.

 the existing dependency chain (in reverse order) is:

 -> #1 (sk_lock-AF_INET){+.+.}-{0:0}:
        __lock_acquire (kernel/locking/lockdep.c:5055)
        lock_acquire (kernel/locking/lockdep.c:466)
        lock_sock_nested (net/core/sock.c:3463)
        mptcp_worker (net/mptcp/protocol.c:2614)
        process_one_work (kernel/workqueue.c:2294)
        worker_thread (include/linux/list.h:292)
        kthread (kernel/kthread.c:376)
        ret_from_fork (arch/x86/entry/entry_64.S:312)

 -> #0 ((work_completion)(&msk->work)){+.+.}-{0:0}:
        check_prev_add (kernel/locking/lockdep.c:3098)
        validate_chain (kernel/locking/lockdep.c:3217)
        __lock_acquire (kernel/locking/lockdep.c:5055)
        lock_acquire (kernel/locking/lockdep.c:466)
        __flush_work (kernel/workqueue.c:3070)
        __cancel_work_timer (kernel/workqueue.c:3160)
        mptcp_cancel_work (net/mptcp/protocol.c:2758)
        mptcp_subflow_queue_clean (net/mptcp/subflow.c:1817)
        __mptcp_close_ssk (net/mptcp/protocol.c:2363)
        mptcp_destroy_common (net/mptcp/protocol.c:3170)
        mptcp_destroy (include/net/sock.h:1495)
        __mptcp_destroy_sock (net/mptcp/protocol.c:2886)
        __mptcp_close (net/mptcp/protocol.c:2959)
        mptcp_close (net/mptcp/protocol.c:2974)
        inet_release (net/ipv4/af_inet.c:432)
        __sock_release (net/socket.c:651)
        sock_close (net/socket.c:1367)
        __fput (fs/file_table.c:320)
        task_work_run (kernel/task_work.c:181 (discriminator 1))
        exit_to_user_mode_prepare (include/linux/resume_user_mode.h:49)
        syscall_exit_to_user_mode (kernel/entry/common.c:130)
        do_syscall_64 (arch/x86/entry/common.c:87)
        entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:120)

 other info that might help us debug this:

  Possible unsafe locking scenario:

        CPU0                    CPU1
        ----                    ----
   lock(sk_lock-AF_INET);
                                lock((work_completion)(&msk->work));
                                lock(sk_lock-AF_INET);
   lock((work_completion)(&msk->work));

  *** DEADLOCK ***

The report is actually a false positive, since the only existing lock
nesting is the msk socket lock acquired by the mptcp work.
cancel_work_sync() is invoked without the relevant socket lock being
held, but under a different (the msk listener) socket lock.

We could silence the splat adding a per workqueue dynamic lockdep key,
but that looks overkill. Instead just tell lockdep the msk socket lock
is not held around cancel_work_sync().

Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/322
Fixes: 30e51b923e ("mptcp: fix unreleased socket in accept queue")
Reported-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-12-21 18:05:47 -08:00
Paolo Abeni
7d803344fd mptcp: fix deadlock in fastopen error path
MatM reported a deadlock at fastopening time:

INFO: task syz-executor.0:11454 blocked for more than 143 seconds.
      Tainted: G S                 6.1.0-rc5-03226-gdb0157db5153 #1
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
task:syz-executor.0  state:D stack:25104 pid:11454 ppid:424    flags:0x00004006
Call Trace:
 <TASK>
 context_switch kernel/sched/core.c:5191 [inline]
 __schedule+0x5c2/0x1550 kernel/sched/core.c:6503
 schedule+0xe8/0x1c0 kernel/sched/core.c:6579
 __lock_sock+0x142/0x260 net/core/sock.c:2896
 lock_sock_nested+0xdb/0x100 net/core/sock.c:3466
 __mptcp_close_ssk+0x1a3/0x790 net/mptcp/protocol.c:2328
 mptcp_destroy_common+0x16a/0x650 net/mptcp/protocol.c:3171
 mptcp_disconnect+0xb8/0x450 net/mptcp/protocol.c:3019
 __inet_stream_connect+0x897/0xa40 net/ipv4/af_inet.c:720
 tcp_sendmsg_fastopen+0x3dd/0x740 net/ipv4/tcp.c:1200
 mptcp_sendmsg_fastopen net/mptcp/protocol.c:1682 [inline]
 mptcp_sendmsg+0x128a/0x1a50 net/mptcp/protocol.c:1721
 inet6_sendmsg+0x11f/0x150 net/ipv6/af_inet6.c:663
 sock_sendmsg_nosec net/socket.c:714 [inline]
 sock_sendmsg+0xf7/0x190 net/socket.c:734
 ____sys_sendmsg+0x336/0x970 net/socket.c:2476
 ___sys_sendmsg+0x122/0x1c0 net/socket.c:2530
 __sys_sendmmsg+0x18d/0x460 net/socket.c:2616
 __do_sys_sendmmsg net/socket.c:2645 [inline]
 __se_sys_sendmmsg net/socket.c:2642 [inline]
 __x64_sys_sendmmsg+0x9d/0x110 net/socket.c:2642
 do_syscall_x64 arch/x86/entry/common.c:50 [inline]
 do_syscall_64+0x38/0x90 arch/x86/entry/common.c:80
 entry_SYSCALL_64_after_hwframe+0x63/0xcd
RIP: 0033:0x7f5920a75e7d
RSP: 002b:00007f59201e8028 EFLAGS: 00000246 ORIG_RAX: 0000000000000133
RAX: ffffffffffffffda RBX: 00007f5920bb4f80 RCX: 00007f5920a75e7d
RDX: 0000000000000001 RSI: 0000000020002940 RDI: 0000000000000005
RBP: 00007f5920ae7593 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000020004050 R11: 0000000000000246 R12: 0000000000000000
R13: 000000000000000b R14: 00007f5920bb4f80 R15: 00007f59201c8000
 </TASK>

In the error path, tcp_sendmsg_fastopen() ends up calling
mptcp_disconnect(), and the latter tries to close each
subflow, acquiring the socket lock on each of them.

At fastopen time, we have a single subflow, and that subflow's
socket lock is already held by the caller, causing the deadlock.

We already track the 'fastopen in progress' status inside the msk
socket. Use it to address the issue, making mptcp_disconnect() a
no op when invoked from the fastopen (error) path and doing the
relevant cleanup after releasing the subflow socket lock.

While at the above, rename the fastopen status bit to something
more meaningful.

Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/321
Fixes: fa9e57468a ("mptcp: fix abba deadlock on fastopen")
Reported-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-12-21 18:05:39 -08:00
Yinjun Zhang
e20aa071cd nfp: fix schedule in atomic context when sync mc address
The callback `.ndo_set_rx_mode` is called in atomic context, so sleeping
is not allowed in its implementation. Use the workqueue mechanism
to avoid this issue.

Fixes: de62486449 ("nfp: add support for multicast filter")
Signed-off-by: Yinjun Zhang <yinjun.zhang@corigine.com>
Reviewed-by: Louis Peens <louis.peens@corigine.com>
Signed-off-by: Simon Horman <simon.horman@corigine.com>
Link: https://lore.kernel.org/r/20221220152100.1042774-1-simon.horman@corigine.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-12-21 18:03:42 -08:00
Ronak Doshi
3d8f2c4269 vmxnet3: correctly report csum_level for encapsulated packet
Commit dacce2be33 ("vmxnet3: add geneve and vxlan tunnel offload
support") added support for encapsulation offload. However, the
patch did not correctly report the csum_level for encapsulated packets.

This patch fixes the issue by reporting the correct csum_level for
encapsulated packets.

Fixes: dacce2be33 ("vmxnet3: add geneve and vxlan tunnel offload support")
Signed-off-by: Ronak Doshi <doshir@vmware.com>
Acked-by: Peng Li <lpeng@vmware.com>
Link: https://lore.kernel.org/r/20221220202556.24421-1-doshir@vmware.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-12-21 17:55:30 -08:00
Aaron Conole
95637d91fe net: openvswitch: release vport resources on failure
A recent commit introducing upcall packet accounting failed to properly
release the vport object when the per-cpu stats struct couldn't be
allocated.  This can cause dangling pointers to dp objects long after
they've been released.
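
The error-path pattern described above can be sketched in userspace C. The
names struct vport_model, vport_alloc(), vport_free() and new_vport() are
hypothetical and only model the idea; they are not the openvswitch functions.
When the secondary allocation fails, the object created earlier is released
before returning.

```c
#include <stdlib.h>

/* Hypothetical stand-in for a vport with separately allocated stats. */
struct vport_model {
	void *stats;
};

static int live_vports; /* number of vport objects not yet released */

static struct vport_model *vport_alloc(void)
{
	struct vport_model *vport = calloc(1, sizeof(*vport));

	if (vport)
		live_vports++;
	return vport;
}

static void vport_free(struct vport_model *vport)
{
	free(vport->stats); /* free(NULL) is a no-op */
	free(vport);
	live_vports--;
}

/* stats_alloc_fails simulates the stats allocation failing. */
static struct vport_model *new_vport(int stats_alloc_fails)
{
	struct vport_model *vport = vport_alloc();

	if (!vport)
		return NULL;

	vport->stats = stats_alloc_fails ? NULL : malloc(64);
	if (!vport->stats) {
		vport_free(vport); /* release the vport on failure */
		return NULL;
	}
	return vport;
}
```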

Cc: wangchuanlei <wangchuanlei@inspur.com>
Fixes: 1933ea365a ("net: openvswitch: Add support to count upcall packets")
Reported-by: syzbot+8f4e2dcfcb3209ac35f9@syzkaller.appspotmail.com
Signed-off-by: Aaron Conole <aconole@redhat.com>
Acked-by: Eelco Chaudron <echaudro@redhat.com>
Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Link: https://lore.kernel.org/r/20221220212717.526780-1-aconole@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-12-21 17:48:12 -08:00
Antoine Tenart
f2575c8f40 net: vrf: determine the dst using the original ifindex for multicast
Multicast packets received on an interface bound to a VRF are marked as
belonging to the VRF and the skb device is updated to point to the VRF
device itself. This was fine even when a route was associated to a
device as when performing a fib table lookup 'oif' in fib6_table_lookup
(coming from 'skb->dev->ifindex' in ip6_route_input) was set to 0 when
FLOWI_FLAG_SKIP_NH_OIF was set.

With commit 40867d74c3 ("net: Add l3mdev index to flow struct and
avoid oif reset for port devices") this is no longer true and multicast
traffic is not received on the original interface.

Instead of adding back a similar check in fib6_table_lookup determine
the dst using the original ifindex for multicast VRF traffic. To make
things consistent across the function do the above for all strict
packets, which was the logic before commit 6f12fa7755 ("vrf: mark skb
for multicast or link-local as enslaved to VRF"). Note that reverting to
this behavior should be fine as the change was about marking packets
belonging to the VRF, not about their dst.

Fixes: 40867d74c3 ("net: Add l3mdev index to flow struct and avoid oif reset for port devices")
Reported-by: Jianlin Shi <jishi@redhat.com>
Signed-off-by: Antoine Tenart <atenart@kernel.org>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://lore.kernel.org/r/20221220171825.1172237-1-atenart@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-12-21 17:47:37 -08:00
Maciej Fijalkowski
53fc61be27 ice: xsk: do not use xdp_return_frame() on tx_buf->raw_buf
Previously ice XDP xmit routine was changed in a way that it avoids
xdp_buff->xdp_frame conversion as it is simply not needed for handling
XDP_TX action and what is more it saves us CPU cycles. This routine is
re-used on ZC driver to handle XDP_TX action.

Although for XDP_TX on Rx ZC xdp_buff that comes from xsk_buff_pool is
converted to xdp_frame, xdp_frame itself is not stored inside
ice_tx_buf, we only store raw data pointer. Casting this pointer to
xdp_frame and calling against it xdp_return_frame in
ice_clean_xdp_tx_buf() results in undefined behavior.

To fix this, simply call page_frag_free() on tx_buf->raw_buf.
Later intention is to remove the buff->frame conversion in order to
simplify the codebase and improve XDP_TX performance on ZC.

Fixes: 126cdfe100 ("ice: xsk: Improve AF_XDP ZC Tx and use batching API")
Reported-and-tested-by: Robin Cowley <robin.cowley@thehutgroup.com>
Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Tested-by: Chandan Kumar Rout <chandanx.rout@intel.com> (A Contingent Worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Reviewed-by: Piotr Raczynski <piotr.raczynski@intel.com>
Link: https://lore.kernel.org/r/20221220175448.693999-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-12-21 17:46:49 -08:00
Dave Airlie
fe8f5b2f7b Merge tag 'amd-drm-fixes-6.2-2022-12-21' of https://gitlab.freedesktop.org/agd5f/linux into drm-next
amd-drm-fixes-6.2-2022-12-21:

amdgpu:
- Avoid large variable on the stack
- S0ix fixes
- SMU 13.x fixes
- VCN fix
- Add missing fence reference

amdkfd:
- Fix init vm error handling
- Fix double release of compute pasid

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221221205828.6093-1-alexander.deucher@amd.com
2022-12-22 11:02:56 +10:00
Jakub Kicinski
aa6c3961a3 wireless fixes for v6.2
Merge tag 'wireless-2022-12-21' of git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless

Kalle Valo says:

====================
wireless fixes for v6.2

First set of fixes for v6.2. Fix for a link error in mt76, fix for an
iwlwifi firmware crash and two cleanups.

* tag 'wireless-2022-12-21' of git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless:
  wifi: ath9k: use proper statements in conditionals
  wifi: mt76: mt7996: select CONFIG_RELAY
  wifi: iwlwifi: fw: skip PPAG for JF
  wifi: ti: remove obsolete lines in the Makefile
====================

Link: https://lore.kernel.org/r/20221221180808.96A8AC433EF@smtp.kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-12-21 16:44:56 -08:00
Linus Torvalds
569c3a283c block-6.2-2022-12-19
Merge tag 'block-6.2-2022-12-19' of git://git.kernel.dk/linux

Pull block fixes from Jens Axboe:

 - Various fixes for BFQ (Yu, Yuwei)

 - Fix for loop command line parsing (Isaac)

 - No need to specifically clear REQ_ALLOC_CACHE on IOPOLL downgrade
   anymore (me)

 - blk-iocost enum fix for newer gcc (Jiri)

 - UAF fix for queue release (Ming)

 - blk-iolatency error handling memory leak fix (Tejun)

* tag 'block-6.2-2022-12-19' of git://git.kernel.dk/linux:
  block: don't clear REQ_ALLOC_CACHE for non-polled requests
  block: fix use-after-free of q->q_usage_counter
  block, bfq: only do counting of pending-request for BFQ_GROUP_IOSCHED
  blk-iolatency: Fix memory leak on add_disk() failures
  loop: Fix the max_loop commandline argument treatment when it is set to 0
  block/blk-iocost (gcc13): keep large values in a new enum
  block, bfq: replace 0/1 with false/true in bic apis
  block, bfq: don't return bfqg from __bfq_bic_change_cgroup()
  block, bfq: fix possible uaf for 'bfqq->bic'
2022-12-21 16:35:26 -08:00
Linus Torvalds
5d4740fc78 io_uring-6.2-2022-12-19
Merge tag 'io_uring-6.2-2022-12-19' of git://git.kernel.dk/linux

Pull io_uring fixes from Jens Axboe:

 - Improve the locking for timeouts. This was originally queued up for
   the initial pull, but I messed up and it got missed. (Pavel)

 - Fix an issue with running task_work from the wait path, causing some
   inefficiencies (me)

 - Add a clear of ->free_iov upfront in the 32-bit compat data
   importing, so we ensure that it's always sane at completion time (me)

 - Use call_rcu_hurry() for the eventfd signaling (Dylan)

 - Ordering fix for multishot recv completions (Pavel)

 - Add the io_uring trace header to the MAINTAINERS entry (Ammar)

* tag 'io_uring-6.2-2022-12-19' of git://git.kernel.dk/linux:
  MAINTAINERS: io_uring: Add include/trace/events/io_uring.h
  io_uring/net: fix cleanup after recycle
  io_uring/net: ensure compat import handlers clear free_iov
  io_uring: include task_work run after scheduling in wait for events
  io_uring: don't use TIF_NOTIFY_SIGNAL to test for availability of task_work
  io_uring: use call_rcu_hurry if signaling an eventfd
  io_uring: fix overflow handling regression
  io_uring: ease timeout flush locking requirements
  io_uring: revise completion_lock locking
  io_uring: protect cq_timeouts with timeout_lock
2022-12-21 16:28:25 -08:00
Martin KaFai Lau
70a00e2f1d selftests/bpf: Test bpf_skb_adjust_room on CHECKSUM_PARTIAL
When the bpf_skb_adjust_room() shrinks the skb such that its csum_start
is invalid, the skb->ip_summed should be reset from CHECKSUM_PARTIAL to
CHECKSUM_NONE.

The commit 54c3f1a814 ("bpf: pull before calling skb_postpull_rcsum()")
fixed it.

This patch adds a test to ensure the skb->ip_summed changed from
CHECKSUM_PARTIAL to CHECKSUM_NONE after bpf_skb_adjust_room().

Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Stanislav Fomichev <sdf@google.com>
Link: https://lore.kernel.org/bpf/20221221185653.1589961-1-martin.lau@linux.dev
2022-12-22 00:56:27 +01:00
Rickard x Andersson
e96b95c2b7 gcov: add support for checksum field
In GCC version 12.1 a checksum field was added.

This patch fixes a kernel crash occurring during boot when using
gcov-kernel with GCC version 12.2.  The crash occurred on a system running
on i.MX6SX.

Link: https://lkml.kernel.org/r/20221220102318.3418501-1-rickaran@axis.com
Fixes: 977ef30a7d ("gcov: support GCC 12.1 and newer compilers")
Signed-off-by: Rickard x Andersson <rickaran@axis.com>
Reviewed-by: Peter Oberparleiter <oberpar@linux.ibm.com>
Tested-by: Peter Oberparleiter <oberpar@linux.ibm.com>
Reviewed-by: Martin Liska <mliska@suse.cz>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-12-21 14:31:52 -08:00
Liam Howlett
c5651b31f5 test_maple_tree: add test for mas_spanning_rebalance() on insufficient data
Add a test to the maple tree test suite so that the spanning rebalance
insufficient node issue does not go undetected again.

Link: https://lkml.kernel.org/r/20221219161922.2708732-3-Liam.Howlett@oracle.com
Fixes: 54a611b605 ("Maple Tree: add new data structure")
Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Andrei Vagin <avagin@gmail.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Muhammad Usama Anjum <usama.anjum@collabora.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-12-21 14:31:52 -08:00
Liam Howlett
0abb964aae maple_tree: fix mas_spanning_rebalance() on insufficient data
Mike Rapoport contacted me off-list with a regression in running criu. 
Periodic tests fail with an RCU stall during execution.  Although rare, it
is possible to hit this with other uses so this patch should be backported
to fix the regression.

This patchset adds the fix and a test case to the maple tree test
suite.


This patch (of 2):

An insufficient node was causing an out-of-bounds access on the node in
mas_leaf_max_gap().  The cause was the faulty detection of the new node
being a root node when overwriting many entries at the end of the tree.

Fix the detection of a new root and ensure there is sufficient data prior
to entering the spanning rebalance loop.

Link: https://lkml.kernel.org/r/20221219161922.2708732-1-Liam.Howlett@oracle.com
Link: https://lkml.kernel.org/r/20221219161922.2708732-2-Liam.Howlett@oracle.com
Fixes: 54a611b605 ("Maple Tree: add new data structure")
Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Reported-by: Mike Rapoport <rppt@kernel.org>
Tested-by: Mike Rapoport <rppt@kernel.org>
Cc: Andrei Vagin <avagin@gmail.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Muhammad Usama Anjum <usama.anjum@collabora.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-12-21 14:31:52 -08:00
Mike Kravetz
e700898fa0 hugetlb: really allocate vma lock for all sharable vmas
Commit bbff39cc6c ("hugetlb: allocate vma lock for all sharable vmas")
removed the pmd sharable checks in the vma lock helper routines.  However,
it left the functional version of helper routines behind #ifdef
CONFIG_ARCH_WANT_HUGE_PMD_SHARE.  Therefore, the vma lock is not being
used for sharable vmas on architectures that do not support pmd sharing. 
On these architectures, a potential fault/truncation race is exposed that
could leave pages in a hugetlb file past i_size until the file is removed.

Move the functional vma lock helpers outside the ifdef, and remove the
non-functional stubs.  Since the vma lock is not just for pmd sharing,
rename the routine __vma_shareable_flags_pmd.
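
As a rough userspace illustration of the layout problem (not the actual
mm/hugetlb.c code), the sketch below keeps the functional lock check outside
the conditional block and leaves only the pmd-sharing decision behind it.
Every name here (struct vma, VM_SHARED_FLAG, VM_MAYSHARE_FLAG,
HYPOTHETICAL_PMD_SHARE, vma_shareable_lock, want_pmd_share) is a hypothetical
stand-in chosen for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag values and vma layout, for illustration only. */
#define VM_SHARED_FLAG   0x1u
#define VM_MAYSHARE_FLAG 0x2u

struct vma {
	unsigned int flags;
	void *lock;		/* stand-in for the per-vma lock pointer */
};

/*
 * Functional helper, defined outside any #ifdef so that every
 * configuration uses the lock for sharable vmas.
 */
static bool vma_shareable_lock(const struct vma *vma)
{
	assert(vma != NULL);
	return (vma->flags & (VM_MAYSHARE_FLAG | VM_SHARED_FLAG)) &&
	       vma->lock != NULL;
}

/* Only the pmd-sharing decision itself depends on the config option. */
#ifdef HYPOTHETICAL_PMD_SHARE
static bool want_pmd_share(const struct vma *vma)
{
	return vma_shareable_lock(vma);
}
#else
static bool want_pmd_share(const struct vma *vma)
{
	(void)vma;
	return false;	/* no pmd sharing on this configuration */
}
#endif
```

With HYPOTHETICAL_PMD_SHARE undefined, want_pmd_share() always returns false,
while vma_shareable_lock() still reports true for a sharable vma that has a
lock attached.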

Link: https://lkml.kernel.org/r/20221212235042.178355-1-mike.kravetz@oracle.com
Fixes: bbff39cc6c ("hugetlb: allocate vma lock for all sharable vmas")
Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
Reviewed-by: Miaohe Lin <linmiaohe@huawei.com>
Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: James Houghton <jthoughton@google.com>
Cc: Mina Almasry <almasrymina@google.com>
Cc: Muchun Song <songmuchun@bytedance.com>
Cc: Naoya Horiguchi <naoya.horiguchi@linux.dev>
Cc: Peter Xu <peterx@redhat.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-12-21 14:31:52 -08:00
Arnd Bergmann
7ba594d700 kmsan: export kmsan_handle_urb
USB support can be in a loadable module, and this causes a link failure
with KMSAN:

ERROR: modpost: "kmsan_handle_urb" [drivers/usb/core/usbcore.ko] undefined!

Export the symbol so it can be used by this module.

Link: https://lkml.kernel.org/r/20221215162710.3802378-1-arnd@kernel.org
Fixes: 553a80188a ("kmsan: handle memory sent to/from USB")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Alexander Potapenko <glider@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Marco Elver <elver@google.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-12-21 14:31:52 -08:00
Arnd Bergmann
aaa746ad8b kmsan: include linux/vmalloc.h
This is needed for the vmap/vunmap declarations:

mm/kmsan/kmsan_test.c:316:9: error: implicit declaration of function 'vmap' is invalid in C99 [-Werror,-Wimplicit-function-declaration]
        vbuf = vmap(pages, npages, VM_MAP, PAGE_KERNEL);
               ^
mm/kmsan/kmsan_test.c:316:29: error: use of undeclared identifier 'VM_MAP'
        vbuf = vmap(pages, npages, VM_MAP, PAGE_KERNEL);
                                   ^
mm/kmsan/kmsan_test.c:322:3: error: implicit declaration of function 'vunmap' is invalid in C99 [-Werror,-Wimplicit-function-declaration]
                vunmap(vbuf);
                ^

Link: https://lkml.kernel.org/r/20221215163046.4079767-1-arnd@kernel.org
Fixes: 8ed691b02a ("kmsan: add tests for KMSAN")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Alexander Potapenko <glider@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Marco Elver <elver@google.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-12-21 14:31:51 -08:00
Mathieu Desnoyers
38ce7c9bdf mm/mempolicy: fix memory leak in set_mempolicy_home_node system call
When encountering any vma in the range with a policy other than MPOL_BIND or
MPOL_PREFERRED_MANY, an error is returned without calling mpol_put() on
the policy just allocated with mpol_dup().

This allows arbitrary users to leak kernel memory.
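
The leak pattern can be sketched in userspace C, assuming a simplified model
rather than the real mm/mempolicy.c code: policy_dup() and policy_put() are
hypothetical stand-ins for mpol_dup() and mpol_put(), the MODE_* values stand
in for the MPOL_* modes, and live_policies is a counter added only to make
the leak observable.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the kernel's policy modes and object. */
enum mode { MODE_BIND, MODE_PREFERRED_MANY, MODE_OTHER };

struct policy {
	enum mode mode;
	int home_node;
};

/* Number of duplicated policies that have not been released yet. */
static int live_policies;

/* Stand-in for mpol_dup(): returns a private copy the caller must release. */
static struct policy *policy_dup(const struct policy *old)
{
	struct policy *new = malloc(sizeof(*new));

	if (!new)
		return NULL;
	*new = *old;
	live_policies++;
	return new;
}

/* Stand-in for mpol_put(): releases a copy obtained from policy_dup(). */
static void policy_put(struct policy *pol)
{
	assert(live_policies > 0);
	live_policies--;
	free(pol);
}

/* Walk the per-vma policies; every exit path releases the copy it made. */
static int set_home_node(const struct policy *vma_policies, size_t n,
			 int home_node)
{
	for (size_t i = 0; i < n; i++) {
		struct policy *new = policy_dup(&vma_policies[i]);

		if (!new)
			return -ENOMEM;
		if (new->mode != MODE_BIND &&
		    new->mode != MODE_PREFERRED_MANY) {
			policy_put(new);	/* release before the error return */
			return -EOPNOTSUPP;
		}
		new->home_node = home_node;
		/* A real implementation would install the copy on the vma;
		 * this sketch simply drops it again. */
		policy_put(new);
	}
	return 0;
}
```

Dropping the policy_put() call in the error branch leaves live_policies
nonzero after a failed call, which is the userspace analogue of the leak
described above.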

Link: https://lkml.kernel.org/r/20221215194621.202816-1-mathieu.desnoyers@efficios.com
Fixes: c6018b4b25 ("mm/mempolicy: add set_mempolicy_home_node syscall")
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Reviewed-by: Randy Dunlap <rdunlap@infradead.org>
Reviewed-by: "Huang, Ying" <ying.huang@intel.com>
Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Feng Tang <feng.tang@intel.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Huang Ying <ying.huang@intel.com>
Cc: <stable@vger.kernel.org>	[5.17+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-12-21 14:31:51 -08:00