Commit Graph

1266966 Commits

Author SHA1 Message Date
Edward Liaw
176517c931 selftests/mm: include strings.h for ffsl
Got a compilation error on Android for ffsl after 91b80cc5b3
("selftests: mm: fix map_hugetlb failure on 64K page size systems")
included vm_util.h.

Link: https://lkml.kernel.org/r/20240329185814.16304-1-edliaw@google.com
Fixes: af605d26a8 ("selftests/mm: merge util.h into vm_util.h")
Signed-off-by: Edward Liaw <edliaw@google.com>
Reviewed-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
Cc: Axel Rasmussen <axelrasmussen@google.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: "Mike Rapoport (IBM)" <rppt@kernel.org>
Cc: Peter Xu <peterx@redhat.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-04-05 11:21:30 -07:00
Uladzislau Rezki (Sony)
fc2c22693c mm: vmalloc: fix lockdep warning
A lockdep reports a possible deadlock in the find_vmap_area_exceed_addr_lock()
function:

============================================
WARNING: possible recursive locking detected
6.9.0-rc1-00060-ged3ccc57b108-dirty #6140 Not tainted
--------------------------------------------
drgn/455 is trying to acquire lock:
ffff0000c00131d0 (&vn->busy.lock/1){+.+.}-{2:2}, at: find_vmap_area_exceed_addr_lock+0x64/0x124

but task is already holding lock:
ffff0000c0011878 (&vn->busy.lock/1){+.+.}-{2:2}, at: find_vmap_area_exceed_addr_lock+0x64/0x124

other info that might help us debug this:
 Possible unsafe locking scenario:

       CPU0
       ----
  lock(&vn->busy.lock/1);
  lock(&vn->busy.lock/1);

 *** DEADLOCK ***

indeed it can happen if the find_vmap_area_exceed_addr_lock() gets called
concurrently because it tries to acquire two nodes locks.  It was done to
prevent removing a lowest VA found on a previous step.

To address this a lowest VA is found first without holding a node lock
where it resides.  As a last step we check if a VA still there because it
can go away, if removed, proceed with next lowest.

[akpm@linux-foundation.org: fix comment typos, per Baoquan]
Link: https://lkml.kernel.org/r/20240328140330.4747-1-urezki@gmail.com
Fixes: 53becf32ae ("mm: vmalloc: support multiple nodes in vread_iter")
Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
Tested-by: Jens Axboe <axboe@kernel.dk>
Tested-by: Omar Sandoval <osandov@fb.com>
Reported-by: Jens Axboe <axboe@kernel.dk>
Cc: Baoquan He <bhe@redhat.com>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Lorenzo Stoakes <lstoakes@gmail.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Oleksiy Avramchenko <oleksiy.avramchenko@sony.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-04-05 11:21:30 -07:00
Uladzislau Rezki (Sony)
4ed91fa917 mm: vmalloc: bail out early in find_vmap_area() if vmap is not init
During the boot the s390 system triggers "spinlock bad magic" messages
if the spinlock debugging is enabled:

[    0.465445] BUG: spinlock bad magic on CPU#0, swapper/0
[    0.465490]  lock: single+0x1860/0x1958, .magic: 00000000, .owner: <none>/-1, .owner_cpu: 0
[    0.466067] CPU: 0 PID: 0 Comm: swapper Not tainted 6.8.0-12955-g8e938e398669 #1
[    0.466188] Hardware name: QEMU 8561 QEMU (KVM/Linux)
[    0.466270] Call Trace:
[    0.466470]  [<00000000011f26c8>] dump_stack_lvl+0x98/0xd8
[    0.466516]  [<00000000001dcc6a>] do_raw_spin_lock+0x8a/0x108
[    0.466545]  [<000000000042146c>] find_vmap_area+0x6c/0x108
[    0.466572]  [<000000000042175a>] find_vm_area+0x22/0x40
[    0.466597]  [<000000000012f152>] __set_memory+0x132/0x150
[    0.466624]  [<0000000001cc0398>] vmem_map_init+0x40/0x118
[    0.466651]  [<0000000001cc0092>] paging_init+0x22/0x68
[    0.466677]  [<0000000001cbbed2>] setup_arch+0x52a/0x708
[    0.466702]  [<0000000001cb6140>] start_kernel+0x80/0x5c8
[    0.466727]  [<0000000000100036>] startup_continue+0x36/0x40

it happens because such system tries to access some vmap areas
whereas the vmalloc initialization is not even yet done:

[    0.465490] lock: single+0x1860/0x1958, .magic: 00000000, .owner: <none>/-1, .owner_cpu: 0
[    0.466067] CPU: 0 PID: 0 Comm: swapper Not tainted 6.8.0-12955-g8e938e398669 #1
[    0.466188] Hardware name: QEMU 8561 QEMU (KVM/Linux)
[    0.466270] Call Trace:
[    0.466470] dump_stack_lvl (lib/dump_stack.c:117)
[    0.466516] do_raw_spin_lock (kernel/locking/spinlock_debug.c:87 kernel/locking/spinlock_debug.c:115)
[    0.466545] find_vmap_area (mm/vmalloc.c:1059 mm/vmalloc.c:2364)
[    0.466572] find_vm_area (mm/vmalloc.c:3150)
[    0.466597] __set_memory (arch/s390/mm/pageattr.c:360 arch/s390/mm/pageattr.c:393)
[    0.466624] vmem_map_init (./arch/s390/include/asm/set_memory.h:55 arch/s390/mm/vmem.c:660)
[    0.466651] paging_init (arch/s390/mm/init.c:97)
[    0.466677] setup_arch (arch/s390/kernel/setup.c:972)
[    0.466702] start_kernel (init/main.c:899)
[    0.466727] startup_continue (arch/s390/kernel/head64.S:35)
[    0.466811] INFO: lockdep is turned off.
...
[    0.718250] vmalloc init - busy lock init 0000000002871860
[    0.718328] vmalloc init - busy lock init 00000000028731b8

Some background. It worked before because the lock that is in question
was statically defined and initialized. As of now, the locks and data
structures are initialized in the vmalloc_init() function.

To address that issue add the check whether the "vmap_initialized"
variable is set, if not find_vmap_area() bails out on entry returning NULL.

Link: https://lkml.kernel.org/r/20240323141544.4150-1-urezki@gmail.com
Fixes: 72210662c5 ("mm: vmalloc: offload free_vmap_area_lock lock")
Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
Tested-by: Guenter Roeck <linux@roeck-us.net>
Reviewed-by: Baoquan He <bhe@redhat.com>
Acked-by: Heiko Carstens <hca@linux.ibm.com>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Lorenzo Stoakes <lstoakes@gmail.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Oleksiy Avramchenko <oleksiy.avramchenko@sony.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-04-05 11:21:30 -07:00
John Sperbeck
8434f9aa6b init: open output files from cpio unpacking with O_LARGEFILE
If a member of a cpio archive for an initrd or initrams is larger than
2Gb, we'll eventually fail to write to that file when we get to that
limit, unless O_LARGEFILE is set.

The problem can be seen with this recipe, assuming that BLK_DEV_RAM
is not configured:

cd /tmp
dd if=/dev/zero of=BIGFILE bs=1048576 count=2200
echo BIGFILE | cpio -o -H newc -R root:root > initrd.img
kexec -l /boot/vmlinuz-$(uname -r) --initrd=initrd.img --reuse-cmdline
kexec -e

The console will show 'Initramfs unpacking failed: write error'.  With
the patch, the error is gone.

Link: https://lkml.kernel.org/r/20240323152934.3307391-1-jsperbeck@google.com
Signed-off-by: John Sperbeck <jsperbeck@google.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-04-05 11:21:30 -07:00
David Hildenbrand
65291dcfcf mm/secretmem: fix GUP-fast succeeding on secretmem folios
folio_is_secretmem() currently relies on secretmem folios being LRU
folios, to save some cycles.

However, folios might reside in a folio batch without the LRU flag set, or
temporarily have their LRU flag cleared.  Consequently, the LRU flag is
unreliable for this purpose.

In particular, this is the case when secretmem_fault() allocates a fresh
page and calls filemap_add_folio()->folio_add_lru().  The folio might be
added to the per-cpu folio batch and won't get the LRU flag set until the
batch was drained using e.g., lru_add_drain().

Consequently, folio_is_secretmem() might not detect secretmem folios and
GUP-fast can succeed in grabbing a secretmem folio, crashing the kernel
when we would later try reading/writing to the folio, because the folio
has been unmapped from the directmap.

Fix it by removing that unreliable check.

Link: https://lkml.kernel.org/r/20240326143210.291116-2-david@redhat.com
Fixes: 1507f51255 ("mm: introduce memfd_secret system call to create "secret" memory areas")
Signed-off-by: David Hildenbrand <david@redhat.com>
Reported-by: xingwei lee <xrivendell7@gmail.com>
Reported-by: yue sun <samsun1006219@gmail.com>
Closes: https://lore.kernel.org/lkml/CABOYnLyevJeravW=QrH0JUPYEcDN160aZFb7kwndm-J2rmz0HQ@mail.gmail.com/
Debugged-by: Miklos Szeredi <miklos@szeredi.hu>
Tested-by: Miklos Szeredi <mszeredi@redhat.com>
Reviewed-by: Mike Rapoport (IBM) <rppt@kernel.org>
Cc: Lorenzo Stoakes <lstoakes@gmail.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-04-05 11:21:29 -07:00
Miguel Ojeda
d3bbc4dfcc drm/msm: fix the CRASHDUMP_READ target of a6xx_get_shader_block()
Clang 14 in an (essentially) defconfig arm64 build for next-20240326
reports [1]:

    drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c:843:6: error:
    variable 'out' set but not used [-Werror,-Wunused-but-set-variable]

The variable `out` in these functions is meant to compute the `target` of
`CRASHDUMP_READ()`, but in this case only the initial value (`dumper->iova
+ A6XX_CD_DATA_OFFSET`) was being passed.

Thus use `out` as it was intended by Connor [2].

There was an alternative patch at [3] that removed the variable
altogether, but that would only use the initial value.

Fixes: 64d6255650 ("drm/msm: More fully implement devcoredump for a7xx")
Closes: https://lore.kernel.org/lkml/CANiq72mjc5t4n25SQvYSrOEhxxpXYPZ4pPzneSJHEnc3qApu2Q@mail.gmail.com/ [1]
Link: https://lore.kernel.org/lkml/CACu1E7HhCKMJd6fixZSPiNAz6ekoZnkMTHTcLFVmbZ-9VoLxKg@mail.gmail.com/ [2]
Link: https://lore.kernel.org/lkml/20240307093727.1978126-1-colin.i.king@gmail.com/ [3]
Signed-off-by: Miguel Ojeda <ojeda@kernel.org>
Reviewed-by: Abhinav Kumar <quic_abhinavk@quicinc.com>
Patchwork: https://patchwork.freedesktop.org/patch/584955/
Signed-off-by: Rob Clark <robdclark@chromium.org>
2024-04-05 11:21:26 -07:00
Rafael J. Wysocki
6f824c9fcc Merge branch 'acpi-thermal'
* acpi-thermal:
  ACPI: thermal: Register thermal zones without valid trip points
2024-04-05 20:17:48 +02:00
Jeff Layton
10396f4df8 nfsd: hold a lighter-weight client reference over CB_RECALL_ANY
Currently the CB_RECALL_ANY job takes a cl_rpc_users reference to the
client. While a callback job is technically an RPC that counter is
really more for client-driven RPCs, and this has the effect of
preventing the client from being unhashed until the callback completes.

If nfsd decides to send a CB_RECALL_ANY just as the client reboots, we
can end up in a situation where the callback can't complete on the (now
dead) callback channel, but the new client can't connect because the old
client can't be unhashed. This usually manifests as a NFS4ERR_DELAY
return on the CREATE_SESSION operation.

The job is only holding a reference to the client so it can clear a flag
after the RPC completes. Fix this by having CB_RECALL_ANY instead hold a
reference to the cl_nfsdfs.cl_ref. Typically we only take that sort of
reference when dealing with the nfsdfs info files, but it should work
appropriately here to ensure that the nfs4_client doesn't disappear.

Fixes: 44df6f439a ("NFSD: add delegation reaper to react to low memory condition")
Reported-by: Vladimir Benes <vbenes@redhat.com>
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2024-04-05 14:05:35 -04:00
Björn Töpel
76cd338994 MAINTAINERS: bpf: Add Lehui and Puranjay as riscv64 reviewers
Lehui and Puranjay have been active RISC-V 64-bit BPF JIT
contributors/reviewers for a long time!

Let's make it more official by adding them as reviewers in
MAINTAINERS.

Thank you for your hard work!

Signed-off-by: Björn Töpel <bjorn@kernel.org>
Link: https://lore.kernel.org/r/20240405123352.2852393-1-bjorn@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-04-05 10:34:57 -07:00
Andy Shevchenko
ace0ebe5c9 gpio: crystalcove: Use -ENOTSUPP consistently
The GPIO library expects the drivers to return -ENOTSUPP in some
cases and not using analogue POSIX code. Make the driver to follow
this.

Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
2024-04-05 20:12:39 +03:00
Andy Shevchenko
0c3b532ad3 gpio: wcove: Use -ENOTSUPP consistently
The GPIO library expects the drivers to return -ENOTSUPP in some
cases and not using analogue POSIX code. Make the driver to follow
this.

Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
2024-04-05 20:12:03 +03:00
Linus Torvalds
e8b0ccb2a7 minor 9p cleanups:
- kernel doc fix & removal of unused flag
 - some bogus debug statement for read/write
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEE/IPbcYBuWt0zoYhOq06b7GqY5nAFAmYP3tIACgkQq06b7GqY
 5nAd0hAAhQz7J+JGBWTdVhOTiKATaLsw1rioaVgs6FCV6qTn1BISzBGGaJ1pRYI6
 fg9lqbEDsL8ZRk16GNHqKXbh5+mwN827VtZ6geAWEpFGEjlq7sGBNYymWJkl9Db2
 hEsTlztiTX0oW3FgTQD9zQcYa5HJSaIXBKlxxCoSaPluMNEFREYRPLqwIpplGn/p
 Tgi/DxVHjYGew+N0jTfklC4Ofu/yoA438/H9RQ2fvS1Jm/1AsCgNrcnuxCRy6d/4
 JWh5p0r02ReCTWBy9J0UyY6R/UufYadfyrTaioJkL6hTEnY1xOpeB6w98bgUscWy
 I3bT/jXYTzPiFg1ooLT6kVzziXOnZCKlLPwbUw2CCZFSA3xYqfD/w3zzruZG77rW
 2cHGPpWWht2MamzjUUWA4uDd3ObhJ52K208UKRb5OS2EfJ45mRtY6IneyQrX+LZr
 zIiCkwhnqrOuEgwAzqiALTrsVptXYmhUPWjDUhgUbMCMq4Jf37I7bnCqbJIbajTL
 CZQHEAbwqQgbkxk8Ea4zizDwVHYDvDoYccwiqakCjVphuuWcY/D5dg7Im7MUKtaZ
 j+YT6zm6gk7AtuKZiqXEFIX9O1ZcbrFLoq5bsWxr9wPiuTJWPn6BYxMrya9jUzl/
 Ow6AxJXG/U6Okse2nEh4e5Y/+Lh5Z8P+jV0VanY2R1uuLKoGJ20=
 =u9st
 -----END PGP SIGNATURE-----

Merge tag '9p-for-6.9-rc3' of https://github.com/martinetd/linux

Pull minor 9p cleanups from Dominique Martinet:

 - kernel doc fix & removal of unused flag

 - fix some bogus debug statement for read/write

* tag '9p-for-6.9-rc3' of https://github.com/martinetd/linux:
  9p: remove SLAB_MEM_SPREAD flag usage
  9p: Fix read/write debug statements to report server reply
  9p/trans_fd: remove Excess kernel-doc comment
2024-04-05 10:05:42 -07:00
Linus Torvalds
405ac6a572 3 ksmbd changesets, all also for stable
-----BEGIN PGP SIGNATURE-----
 
 iQGzBAABCgAdFiEE6fsu8pdIjtWE/DpLiiy9cAdyT1EFAmYPU0MACgkQiiy9cAdy
 T1HmrQv/cw5YUwg3O1Ai4GWDJ07hJAZoYP6IdZGJGZSx2TMEJKh0gxAnUTk/Idnv
 TvK80d9GOOrPTUH7ToDYGnFOUH3T1/chx/R/t2OMs9w1Rc02Kq4+XZhv/1HnflEm
 NbXvYG0ZhZOV331GwydqATpq+IjzNkdf1rzb2Agy1YZGpd2uNU1cgb30FGvqIHAQ
 SyYxX7v9uBLBrBU/IPUxHyUKabiLwz7nFZYDLtGNu88oIZaDbTen/lJju5+Dt0Uz
 x5lL6h3kwwqHNa/1BFe84/h/EKBBUM9ha2VL7lZP3S2imBcspBeY8N+Vb8+z18PS
 7KexqIq9tfkTw2FRne4gqcjxF5fSA4n9hXldFf0t+kz35tosU8akpSpsPkwv0cmT
 dbH1u6vS18WilzqIjCDCNfP/e8/G4HwH0DuAIWbWG/IPasDJLeHJ9fiswzuWPJbG
 Nblqu98I6kHiOwTyOuDHcrrqRUBK9AkspgU3bS3PYh5PwYXMCd8+wHkybm6LhJPH
 dpc2BKHG
 =l/a7
 -----END PGP SIGNATURE-----

Merge tag '6.9-rc2-ksmbd-server-fixes' of git://git.samba.org/ksmbd

Pull smb server fixes from Steve French:
 "Three fixes, all also for stable:

   - encryption fix

   - memory overrun fix

   - oplock break fix"

* tag '6.9-rc2-ksmbd-server-fixes' of git://git.samba.org/ksmbd:
  ksmbd: do not set SMB2_GLOBAL_CAP_ENCRYPTION for SMB 3.1.1
  ksmbd: validate payload size in ipc response
  ksmbd: don't send oplock break if rename fails
2024-04-05 10:02:09 -07:00
Mikhail Kobuk
627207703b phy: marvell: a3700-comphy: Fix hardcoded array size
Replace hardcoded 'gbe_phy_init' array size by explicit one.

Fixes: 934337080c ("phy: marvell: phy-mvebu-a3700-comphy: Add native kernel implementation")
Signed-off-by: Mikhail Kobuk <m.kobuk@ispras.ru>
Link: https://lore.kernel.org/r/20240321164734.49273-2-m.kobuk@ispras.ru
Signed-off-by: Vinod Koul <vkoul@kernel.org>
2024-04-05 22:32:00 +05:30
Mikhail Kobuk
e4308bc22b phy: marvell: a3700-comphy: Fix out of bounds read
There is an out of bounds read access of 'gbe_phy_init_fix[fix_idx].addr'
every iteration after 'fix_idx' reaches 'ARRAY_SIZE(gbe_phy_init_fix)'.

Make sure 'gbe_phy_init[addr]' is used when all elements of
'gbe_phy_init_fix' array are handled.

Found by Linux Verification Center (linuxtesting.org) with SVACE.

Fixes: 934337080c ("phy: marvell: phy-mvebu-a3700-comphy: Add native kernel implementation")
Signed-off-by: Mikhail Kobuk <m.kobuk@ispras.ru>
Reviewed-by: Miquel Raynal <miquel.raynal@bootlin.com>
Link: https://lore.kernel.org/r/20240321164734.49273-1-m.kobuk@ispras.ru
Signed-off-by: Vinod Koul <vkoul@kernel.org>
2024-04-05 22:32:00 +05:30
Linus Torvalds
fae0268777 vfs-6.9-rc3.fixes
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCZg/C8wAKCRCRxhvAZXjc
 oljxAQCneq62ginESgeQLw88fzSBTV4C50xXUA+Qz18AEgA/fgD+J3DlWquEHhMM
 tJmfs3aUn9w7+wDpukcsLjJfJEiSYA8=
 =f2Z6
 -----END PGP SIGNATURE-----

Merge tag 'vfs-6.9-rc3.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs

Pull vfs fixes from Christian Brauner:
 "This contains a few small fixes. This comes with some delay because I
  wanted to wait on people running their reproducers and the Easter
  Holidays meant that those replies came in a little later than usual:

   - Fix handling of preventing writes to mounted block devices.

     Since last kernel we allow to prevent writing to mounted block
     devices provided CONFIG_BLK_DEV_WRITE_MOUNTED isn't set and the
     block device is opened with restricted writes. When we switched to
     opening block devices as files we altered the mechanism by which we
     recognize when a block device has been opened with write
     restrictions.

     The detection logic assumed that only read-write mounted
     filesystems would apply write restrictions to their block devices
     from other openers. That of course is not true since it also makes
     sense to apply write restrictions for filesystems that are
     read-only.

     Fix the detection logic using an FMODE_* bit. We still have a few
     left since we freed up a couple a while ago. I also picked up a
     patch to free up four additional FMODE_* bits scheduled for the
     next merge window.

   - Fix counting the number of writers to a block device. This just
     changes the logic to be consistent.

   - Fix a bug in aio causing a NULL pointer derefernce after we
     implemented batched processing in aio.

   - Finally, add the changes we discussed that allows to yield block
     devices early even though file closing itself is deferred.

     This also allows us to remove two holder operations to get and
     release the holder to align lifetime of file and holder of the
     block device"

* tag 'vfs-6.9-rc3.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
  aio: Fix null ptr deref in aio_complete() wakeup
  fs,block: yield devices early
  block: count BLK_OPEN_RESTRICT_WRITES openers
  block: handle BLK_OPEN_RESTRICT_WRITES correctly
2024-04-05 09:47:26 -07:00
Arnd Bergmann
185fdb4697 nouveau: fix function cast warning
Calling a function through an incompatible pointer type causes breaks
kcfi, so clang warns about the assignment:

drivers/gpu/drm/nouveau/nvkm/subdev/bios/shadowof.c:73:10: error: cast from 'void (*)(const void *)' to 'void (*)(void *)' converts to incompatible function type [-Werror,-Wcast-function-type-strict]
   73 |         .fini = (void(*)(void *))kfree,

Avoid this with a trivial wrapper.

Fixes: c39f472e9f ("drm/nouveau: remove symlinks, move core/ to nvkm/ (no code changes)")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Danilo Krummrich <dakr@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240404160234.2923554-1-arnd@kernel.org
2024-04-05 18:30:29 +02:00
Kees Cook
838ae9f45c nouveau/gsp: Avoid addressing beyond end of rpc->entries
Using the end of rpc->entries[] for addressing runs into both compile-time
and run-time detection of accessing beyond the end of the array. Use the
base pointer instead, since was allocated with the additional bytes for
storing the strings. Avoids the following warning in future GCC releases
with support for __counted_by:

In function 'fortify_memcpy_chk',
    inlined from 'r535_gsp_rpc_set_registry' at ../drivers/gpu/drm/nouveau/nvkm/subdev/gsp/r535.c:1123:3:
../include/linux/fortify-string.h:553:25: error: call to '__write_overflow_field' declared with attribute warning: detected write beyond size of field (1st parameter); maybe use struct_group()? [-Werror=attribute-warning]
  553 |                         __write_overflow_field(p_size_field, size);
      |                         ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

for this code:

	strings = (char *)&rpc->entries[NV_GSP_REG_NUM_ENTRIES];
	...
                memcpy(strings, r535_registry_entries[i].name, name_len);

Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Danilo Krummrich <dakr@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240330141159.work.063-kees@kernel.org
2024-04-05 18:30:29 +02:00
Dave Jiang
648dae58a8 cxl: Remove checking of iter in cxl_endpoint_get_perf_coordinates()
The while() loop in cxl_endpoint_get_perf_coordinates() checks to see if
'iter' is valid as part of the condition breaking out of the loop.
is_cxl_root() will stop the loop before the next iteration could go NULL.
Remove the iter check.

The presence of the iter or removing the iter does not impact the behavior
of the code. This is a code clean up and not a bug fix.

Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Davidlohr Bueso <dave@stgolabs.net>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Link: https://lore.kernel.org/r/20240403154844.3403859-2-dave.jiang@intel.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2024-04-05 08:15:50 -07:00
Puranjay Mohan
cfddb04804 MAINTAINERS: Update email address for Puranjay Mohan
I would like to use the kernel.org address for kernel development from
now on.

Signed-off-by: Puranjay Mohan <puranjay@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20240405132337.71950-1-puranjay@kernel.org
2024-04-05 15:46:00 +02:00
Damien Le Moal
24cfd86433 ata: ahci: Add mask_port_map module parameter
Commits 0077a504e1 ("ahci: asm1166: correct count of reported ports")
and 9815e39617 ("ahci: asm1064: correct count of reported ports")
attempted to limit the ports of the ASM1166 and ASM1064 AHCI controllers
to avoid long boot times caused by the fact that these adapters report
a port map larger than the number of physical ports. The excess ports
are "virtual" to hide port multiplier devices and probing these ports
takes time. However, these commits caused a regression for users that do
use PMP devices, as the ATA devices connected to the PMP cannot be
scanned. These commits have thus been reverted by commit 6cd8adc3e1
("ahci: asm1064: asm1166: don't limit reported ports") to allow the
discovery of devices connected through a port multiplier. But this
revert re-introduced the long boot times for users that do not use a
port multiplier setup.

This patch adds the mask_port_map ahci module parameter to allow users
to manually specify port map masks for controllers. In the case of the
ASMedia 1166 and 1064 controllers, users that do not have port
multiplier devices can mask the excess virtual ports exposed by the
controller to speedup port scanning, thus reducing boot time.

The mask_port_map parameter accepts 2 different formats:
 - mask_port_map=<mask>
   This applies the same mask to all AHCI controllers
   present in the system. This format is convenient for small systems
   that have only a single AHCI controller.
 - mask_port_map=<pci_dev>=<mask>,<pci_dev>=mask,...
   This applies the specified masks only to the PCI device listed. The
   <pci_dev> field is a regular PCI device ID (domain🚌dev.func).
   This ID can be seen following "ahci" in the kernel messages. E.g.
   for "ahci 0000:01:00.0: 2/2 ports implemented (port mask 0x3)", the
   <pci_dev> field is "0000:01:00.0".

When used, the function ahci_save_initial_config() indicates that a
port map mask was applied with the message "masking port_map ...".
E.g.: without a mask:
modprobe ahci
dmesg | grep ahci
...
ahci 0000:00:17.0: AHCI vers 0001.0301, 32 command slots, 6 Gbps, SATA mode
ahci 0000:00:17.0: (0000:00:17.0) 8/8 ports implemented (port mask 0xff)

With a mask:
modprobe ahci mask_port_map=0000:00:17.0=0x1
dmesg | grep ahci
...
ahci 0000:00:17.0: masking port_map 0xff -> 0x1
ahci 0000:00:17.0: AHCI vers 0001.0301, 32 command slots, 6 Gbps, SATA mode
ahci 0000:00:17.0: (0000:00:17.0) 1/8 ports implemented (port mask 0x1)

Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
Reviewed-by: Niklas Cassel <cassel@kernel.org>
2024-04-05 22:22:27 +09:00
Andrii Nakryiko
229087f6f1 bpf, kconfig: Fix DEBUG_INFO_BTF_MODULES Kconfig definition
Turns out that due to CONFIG_DEBUG_INFO_BTF_MODULES not having an
explicitly specified "menu item name" in Kconfig, it's basically
impossible to turn it off (see [0]).

This patch fixes the issue by defining menu name for
CONFIG_DEBUG_INFO_BTF_MODULES, which makes it actually adjustable
and independent of CONFIG_DEBUG_INFO_BTF, in the sense that one can
have DEBUG_INFO_BTF=y and DEBUG_INFO_BTF_MODULES=n.

We still keep it as defaulting to Y, of course.

Fixes: 5f9ae91f7c ("kbuild: Build kernel module BTFs if BTF is enabled and pahole supports it")
Reported-by: Vincent Li <vincent.mc.li@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/CAK3+h2xiFfzQ9UXf56nrRRP=p1+iUxGoEP5B+aq9MDT5jLXDSg@mail.gmail.com [0]
Link: https://lore.kernel.org/bpf/20240404220344.3879270-1-andrii@kernel.org
2024-04-05 15:12:47 +02:00
Alex Constantino
07ed11afb6
Revert "drm/qxl: simplify qxl_fence_wait"
This reverts commit 5a838e5d58.

Changes from commit 5a838e5d58 ("drm/qxl: simplify qxl_fence_wait") would
result in a '[TTM] Buffer eviction failed' exception whenever it reached a
timeout.
Due to a dependency to DMA_FENCE_WARN this also restores some code deleted
by commit d72277b6c3 ("dma-buf: nuke DMA_FENCE_TRACE macros v2").

Fixes: 5a838e5d58 ("drm/qxl: simplify qxl_fence_wait")
Link: https://lore.kernel.org/regressions/ZTgydqRlK6WX_b29@eldamar.lan/
Reported-by: Timo Lindfors <timo.lindfors@iki.fi>
Closes: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1054514
Signed-off-by: Alex Constantino <dreaming.about.electric.sheep@gmail.com>
Signed-off-by: Maxime Ripard <mripard@kernel.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20240404181448.1643-2-dreaming.about.electric.sheep@gmail.com
2024-04-05 15:10:31 +02:00
Kent Overstreet
caeb4b0a11
aio: Fix null ptr deref in aio_complete() wakeup
list_del_init_careful() needs to be the last access to the wait queue
entry - it effectively unlocks access.

Previously, finish_wait() would see the empty list head and skip taking
the lock, and then we'd return - but the completion path would still
attempt to do the wakeup after the task_struct pointer had been
overwritten.

Fixes: 71eb6b6b0b ("fs/aio: obey min_nr when doing wakeups")
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/linux-fsdevel/CAHTA-ubfwwB51A5Wg5M6H_rPEQK9pNf8FkAGH=vr=FEkyRrtqw@mail.gmail.com/
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Link: https://lore.kernel.org/stable/20240331215212.522544-1-kent.overstreet%40linux.dev
Link: https://lore.kernel.org/r/20240331215212.522544-1-kent.overstreet@linux.dev
Signed-off-by: Christian Brauner <brauner@kernel.org>
2024-04-05 11:20:28 +02:00
Anna-Maria Behnsen
7a96a84bfb timers/migration: Return early on deactivation
Commit 4b6f4c5a67 ("timer/migration: Remove buggy early return on
deactivation") removed the logic to return early in tmigr_update_events()
on deactivation. With this the problem with a not properly updated first
global event in a hierarchy containing only a single group was fixed.

But when having a look at this code path with a hierarchy with more than a
single level, now unnecessary work is done (example is partially copied
from the message of the commit mentioned above):

                            [GRP1:0]
                         migrator = GRP0:0
                         active   = GRP0:0
                         nextevt  = T0:0i, T0:1
                         /              \
              [GRP0:0]                  [GRP0:1]
           migrator = 0              migrator = NONE
           active   = 0              active   = NONE
           nextevt  = T0i, T1        nextevt  = T2
           /         \                /         \
          0 (T0i)     1 (T1)         2 (T2)      3
      active         idle            idle       idle

0) CPU 0 is active thus its event is ignored (the letter 'i') and so are
upper levels' events. CPU 1 is idle and has the timer T1 enqueued.
CPU 2 also has a timer. The expiry order is T0 (ignored) < T1 < T2

                            [GRP1:0]
                         migrator = GRP0:0
                         active   = GRP0:0
                         nextevt  = T0:0i, T0:1
                         /              \
              [GRP0:0]                  [GRP0:1]
           migrator = NONE           migrator = NONE
           active   = NONE           active   = NONE
           nextevt  = T1             nextevt  = T2
           /         \                /         \
          0 (T0i)     1 (T1)         2 (T2)      3
        idle         idle            idle         idle

1) CPU 0 goes idle without global event queued. Therefore KTIME_MAX is
pushed as its next expiry and its own event kept as "ignore". Without this
early return the following steps happen in tmigr_update_events() when
child = null and group = GRP0:0 :

  lock(GRP0:0->lock);
  timerqueue_del(GRP0:0, T0i);
  unlock(GRP0:0->lock);


                            [GRP1:0]
                         migrator = NONE
                         active   = NONE
                         nextevt  = T0:0, T0:1
                         /              \
              [GRP0:0]                  [GRP0:1]
           migrator = NONE           migrator = NONE
           active   = NONE           active   = NONE
           nextevt  = T1             nextevt  = T2
           /         \                /         \
          0 (T0i)     1 (T1)         2 (T2)      3
        idle         idle            idle         idle

2) The change now propagates up to the top. Then tmigr_update_events()
updates the group event of GRP0:0 and executes the following steps
(child = GRP0:0 and group = GRP0:0):

  lock(GRP0:0->lock);
  lock(GRP1:0->lock);
  evt = tmigr_next_groupevt(GRP0:0); -> this removes the ignored events
					in GRP0:0
  ... update GRP1:0 group event and timerqueue ...
  unlock(GRP1:0->lock);
  unlock(GRP0:0->lock);

So the dance in 1) with locking the GRP0:0->lock and removing the T0i from
the timerqueue is redundand as this is done nevertheless in 2) when
tmigr_next_groupevt(GRP0:0) is executed.

Revert commit 4b6f4c5a67 ("timer/migration: Remove buggy early return on
deactivation") and add a condition into return path to skip the return
only, when hierarchy contains a single group. Adapt comments accordingly.

Fixes: 4b6f4c5a67 ("timer/migration: Remove buggy early return on deactivation")
Signed-off-by: Anna-Maria Behnsen <anna-maria@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Frederic Weisbecker <frederic@kernel.org>
Link: https://lore.kernel.org/r/87cyr49on2.fsf@somnus
2024-04-05 11:05:16 +02:00
Frederic Weisbecker
61f7fdf8fd timers/migration: Fix ignored event due to missing CPU update
When a group event is updated with its expiry unchanged but a different
CPU, that target change may go unnoticed and the event may be propagated
up with a stale CPU value. The following depicts a scenario that has
been actually observed:

                       [GRP2:0]
                   migrator = GRP1:1
                   active   = GRP1:1
                   nextevt  = TGRP1:0 (T0)
                    /              \
               [GRP1:0]           [GRP1:1]
            migrator = NONE       [...]
            active   = NONE
            nextevt  = TGRP0:0 (T0)
            /           \
        [GRP0:0]       [...]
      migrator = NONE
      active   = NONE
      nextevt  = T0
      /         \
    0 (T0)       1 (T1)
    idle         idle

0) The hierarchy has 3 levels. The left part (GRP1:0) is all idle,
including CPU 0 and CPU 1 which have a timer each: T0 and T1. They have
the same expiry value.

                       [GRP2:0]
                   migrator = GRP1:1
                   active   = GRP1:1
                   nextevt  = KTIME_MAX
                    /              \
               [GRP1:0]           [GRP1:1]
            migrator = NONE       [...]
            active   = NONE
            nextevt  = TGRP0:0 (T0)
            /           \
        [GRP0:0]       [...]
      migrator = NONE
      active   = NONE
      nextevt  = T0
      /         \
    0 (T0)       1 (T1)
    idle         idle

1) The migrator in GRP1:1 handles remotely T0. The event is dequeued
from the top and T0 executed.

                       [GRP2:0]
                   migrator = GRP1:1
                   active   = GRP1:1
                   nextevt  = KTIME_MAX
                    /              \
               [GRP1:0]           [GRP1:1]
            migrator = NONE       [...]
            active   = NONE
            nextevt  = TGRP0:0 (T0)
            /           \
        [GRP0:0]       [...]
      migrator = NONE
      active   = NONE
      nextevt  = T1
      /         \
    0            1 (T1)
    idle         idle

2) The migrator in GRP1:1 fetches the next timer for CPU 0 and finds
none. But it updates the events from its groups, starting with GRP0:0
which now has T1 as its next event. So far so good.

                       [GRP2:0]
                   migrator = GRP1:1
                   active   = GRP1:1
                   nextevt  = KTIME_MAX
                    /              \
               [GRP1:0]           [GRP1:1]
            migrator = NONE       [...]
            active   = NONE
            nextevt  = TGRP0:0 (T0)
            /           \
        [GRP0:0]       [...]
      migrator = NONE
      active   = NONE
      nextevt  = T1
      /         \
    0            1 (T1)
    idle         idle

3) The migrator in GRP1:1 proceeds upward and updates the events in
GRP1:0. The child event TGRP0:0 is found queued with the same expiry
as before. And therefore it is left unchanged. However the target CPU
is not the same but that fact is ignored so TGRP0:0 still points to
CPU 0 when it should point to CPU 1.

                       [GRP2:0]
                   migrator = GRP1:1
                   active   = GRP1:1
                   nextevt  = TGRP1:0 (T0)
                    /              \
               [GRP1:0]           [GRP1:1]
            migrator = NONE       [...]
            active   = NONE
            nextevt  = TGRP0:0 (T0)
            /           \
        [GRP0:0]       [...]
      migrator = NONE
      active   = NONE
      nextevt  = T1
      /         \
    0            1 (T1)
    idle         idle

4) The propagation has reached the top level and TGRP1:0, having TGRP0:0
as its first event, also wrongly points to CPU 0. TGRP1:0 is added to
the top level group.

                       [GRP2:0]
                   migrator = GRP1:1
                   active   = GRP1:1
                   nextevt  = KTIME_MAX
                    /              \
               [GRP1:0]           [GRP1:1]
            migrator = NONE       [...]
            active   = NONE
            nextevt  = TGRP0:0 (T0)
            /           \
        [GRP0:0]       [...]
      migrator = NONE
      active   = NONE
      nextevt  = T1
      /         \
    0            1 (T1)
    idle         idle

5) The migrator in GRP1:1 dequeues the next event in top level pointing
to CPU 0. But since it actually doesn't see any real event in CPU 0, it
early returns.

6) T1 is left unhandled until either CPU 0 or CPU 1 wake up.

Some other bad scenario may involve trees with just two levels.

Fix this with unconditionally updating the CPU of the child event before
considering to early return while updating a queued event with an
unchanged expiry value.

Fixes: 7ee9887703 ("timers: Implement the hierarchical pull model")
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Anna-Maria Behnsen <anna-maria@linutronix.de>
Link: https://lore.kernel.org/r/Zg2Ct6M2RJAYHgCB@localhost.localdomain
2024-04-05 11:05:16 +02:00
Jammy Huang
bc004f5038 drm/ast: Fix soft lockup
There is a while-loop in ast_dp_set_on_off() that could lead to
infinite-loop. This is because the register, VGACRI-Dx, checked in
this API is a scratch register actually controlled by a MCU, named
DPMCU, in BMC.

These scratch registers are protected by scu-lock. If suc-lock is not
off, DPMCU can not update these registers and then host will have soft
lockup due to never updated status.

DPMCU is used to control DP and relative registers to handshake with
host's VGA driver. Even the most time-consuming task, DP's link
training, is less than 100ms. 200ms should be enough.

Signed-off-by: Jammy Huang <jammy_huang@aspeedtech.com>
Fixes: 594e9c04b5 ("drm/ast: Create the driver for ASPEED proprietory Display-Port")
Reviewed-by: Jocelyn Falempe <jfalempe@redhat.com>
Reviewed-by: Thomas Zimmermann <tzimmermann@suse.de>
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Cc: KuoHsiang Chou <kuohsiang_chou@aspeedtech.com>
Cc: Thomas Zimmermann <tzimmermann@suse.de>
Cc: Dave Airlie <airlied@redhat.com>
Cc: Jocelyn Falempe <jfalempe@redhat.com>
Cc: dri-devel@lists.freedesktop.org
Cc: <stable@vger.kernel.org> # v5.19+
Link: https://patchwork.freedesktop.org/patch/msgid/20240403090246.1495487-1-jammy_huang@aspeedtech.com
2024-04-05 10:22:23 +02:00
Rafał Miłecki
3baac7291e
arm64: dts: mediatek: mt2712: fix validation errors
1. Fixup infracfg clock controller binding
   It also acts as reset controller so #reset-cells is required.
2. Use -pins suffix for pinctrl

This fixes:
arch/arm64/boot/dts/mediatek/mt2712-evb.dtb: syscon@10001000: '#reset-cells' is a required property
        from schema $id: http://devicetree.org/schemas/arm/mediatek/mediatek,infracfg.yaml#
arch/arm64/boot/dts/mediatek/mt2712-evb.dtb: pinctrl@1000b000: 'eth_default', 'eth_sleep', 'usb0_iddig', 'usb1_iddig' do not match any of the regexes: 'pinctrl-[0-9]+', 'pins$'
        from schema $id: http://devicetree.org/schemas/pinctrl/mediatek,mt65xx-pinctrl.yaml#

Signed-off-by: Rafał Miłecki <rafal@milecki.pl>
Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
Link: https://lore.kernel.org/r/20240301074741.8362-1-zajec5@gmail.com
[Angelo: Added Fixes tags]
Fixes: 5d4839709c ("arm64: dts: mt2712: Add clock controller device nodes")
Fixes: 1724f4cc51 ("arm64: dts: Add USB3 related nodes for MT2712")
Signed-off-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
2024-04-05 09:52:47 +02:00
Rafał Miłecki
f8c65a5e45
arm64: dts: mediatek: mt7986: prefix BPI-R3 cooling maps with "map-"
This fixes:
arch/arm64/boot/dts/mediatek/mt7986a-bananapi-bpi-r3.dtb: thermal-zones: cpu-thermal:cooling-maps: 'cpu-active-high', 'cpu-active-low', 'cpu-active-med' do not match any of the regexes: '^map[-a-zA-Z0-9]*$', 'pinctrl-[0-9]+'
        from schema $id: http://devicetree.org/schemas/thermal/thermal-zones.yaml#

Fixes: c26f779a22 ("arm64: dts: mt7986: add pwm-fan and cooling-maps to BPI-R3 dts")
Cc: Daniel Golle <daniel@makrotopia.org>
Signed-off-by: Rafał Miłecki <rafal@milecki.pl>
Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
Link: https://lore.kernel.org/r/20240213061459.17917-1-zajec5@gmail.com
Signed-off-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
2024-04-05 09:49:51 +02:00
Rafał Miłecki
970f8b01bd
arm64: dts: mediatek: mt7986: drop invalid thermal block clock
Thermal block uses only two clocks. Its binding doesn't document or
allow "adc_32k". Also Linux driver doesn't support it.

It has been additionally verified by Angelo by his detailed research on
MT7981 / MT7986 clocks (thanks!).

This fixes:
arch/arm64/boot/dts/mediatek/mt7986a-bananapi-bpi-r3.dtb: thermal@1100c800: clocks: [[4, 27], [4, 44], [4, 45]] is too long
        from schema $id: http://devicetree.org/schemas/thermal/mediatek,thermal.yaml#
arch/arm64/boot/dts/mediatek/mt7986a-bananapi-bpi-r3.dtb: thermal@1100c800: clock-names: ['therm', 'auxadc', 'adc_32k'] is too long
        from schema $id: http://devicetree.org/schemas/thermal/mediatek,thermal.yaml#

Fixes: 0a9615d58d ("arm64: dts: mt7986: add thermal and efuse")
Cc: Daniel Golle <daniel@makrotopia.org>
Link: https://lore.kernel.org/linux-devicetree/17d143aa-576e-4d67-a0ea-b79f3518b81c@collabora.com/
Signed-off-by: Rafał Miłecki <rafal@milecki.pl>
Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
Link: https://lore.kernel.org/r/20240213053739.14387-3-zajec5@gmail.com
Signed-off-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
2024-04-05 09:49:23 +02:00
Rafał Miłecki
9bd88afc94
arm64: dts: mediatek: mt7986: drop "#reset-cells" from Ethernet controller
Ethernet block doesn't include or act as a reset controller.
Documentation also doesn't document "#reset-cells" for it.

This fixes:
arch/arm64/boot/dts/mediatek/mt7986a-bananapi-bpi-r3.dtb: ethernet@15100000: Unevaluated properties are not allowed ('#reset-cells' was unexpected)
        from schema $id: http://devicetree.org/schemas/net/mediatek,net.yaml#

Fixes: 082ff36bd5 ("arm64: dts: mediatek: mt7986: introduce ethernet nodes")
Cc: Lorenzo Bianconi <lorenzo@kernel.org>
Signed-off-by: Rafał Miłecki <rafal@milecki.pl>
Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
Link: https://lore.kernel.org/r/20240213053739.14387-2-zajec5@gmail.com
Signed-off-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
2024-04-05 09:49:23 +02:00
Rafał Miłecki
3b449bfd2f
arm64: dts: mediatek: mt7986: drop invalid properties from ethsys
Mediatek ethsys controller / syscon binding doesn't allow any subnodes
so "#address-cells" and "#size-cells" are redundant (actually:
disallowed).

This fixes:
arch/arm64/boot/dts/mediatek/mt7986a-bananapi-bpi-r3.dtb: syscon@15000000: '#address-cells', '#size-cells' do not match any of the regexes: 'pinctrl-[0-9]+'
        from schema $id: http://devicetree.org/schemas/clock/mediatek,ethsys.yaml#

Fixes: 1f9986b258 ("arm64: dts: mediatek: add clock support for mt7986a")
Cc: Sam Shih <sam.shih@mediatek.com>
Signed-off-by: Rafał Miłecki <rafal@milecki.pl>
Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
Link: https://lore.kernel.org/r/20240213053739.14387-1-zajec5@gmail.com
Signed-off-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
2024-04-05 09:49:22 +02:00
Kent Overstreet
5957e0a28b bcachefs: Fix rebalance from durability=0 device
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-04-05 03:05:30 -04:00
Takashi Iwai
100c85421b ASoC: Fixes for v6.9
A relatively large set of fixes here, the biggest piece of it is a
 series correcting some problems with the delay reporting for Intel SOF
 cards but there's a bunch of other things.  Everything here is driver
 specific except for a fix in the core for an issue with sign extension
 handling volume controls.
 -----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCgAdFiEEreZoqmdXGLWf4p/qJNaLcl1Uh9AFAmYPJIUACgkQJNaLcl1U
 h9Ap8Qf/Xr2YP+KJxiD7g52grelyqjUdMCBkoB5Ndf4BU/mBl5X1yZkBuiTEB24D
 dcjamLuxNHGJhrHalYEfok6wRK3RMlkDZ8SNgpCP9rmOH0cSdt/8I3vUXA/XUKrx
 SnceTSZC1S7BSRk26IKf2UdFRULMSGpC87mVLxi7DT+nmuIAigGau3yBPXveG00p
 OLOYmJK9vwdlxAkfIp0ddYx3iTqfaq55W5ttWadoLG9gpoGbDzkvapBsZebqlGeo
 MkZrur70Vi7ousGATkzQHCkEUD8atNOTRrrAgVmzgBbUy3Y6LOmeMwiH306wXJAq
 XHiYYLCPlPI5myNYHeKmOQbCK1AzpA==
 =/2l+
 -----END PGP SIGNATURE-----

Merge tag 'asoc-fix-v6.9-rc2' of https://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound into for-linus

ASoC: Fixes for v6.9

A relatively large set of fixes here, the biggest piece of it is a
series correcting some problems with the delay reporting for Intel SOF
cards but there's a bunch of other things.  Everything here is driver
specific except for a fix in the core for an issue with sign extension
handling volume controls.
2024-04-05 08:48:12 +02:00
Uwe Kleine-König
d464dac472 usb: gadget: fsl: Initialize udc before using it
fsl_ep_queue() is only called by usb_ep_queue() (as ep->ops->queue()).
So _ep isn't NULL.

As ep->ops->queue = fsl_ep_queue, the ep was initialized by
struct_ep_setup() and so ep->udc isn't NULL either.

Drop the check for _ep being NULL and assign udc earlier to prevent
following an uninitialized pointer in the two dev_vdbg()s in lines 878
and 882. This fixes a compiler warning when using clang and
CONFIG_USB_GADGET_VERBOSE=y.

Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202404050227.TTvcCPBu-lkp@intel.com/
Fixes: 6025f20f16 ("usb: gadget: fsl-udc: Replace custom log wrappers by dev_{err,warn,dbg,vdbg}")
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Link: https://lore.kernel.org/r/20240405055812.694123-2-u.kleine-koenig@pengutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-04-05 08:08:37 +02:00
Dave Airlie
4c8595741b Display fixes:
- A few DisplayPort related fixes (Imre, Arun, Ankit, Ville)
 - eDP PSR fixes (Jouni)
 
 Core/GT fixes:
 - Remove some VM space restrictions on older platforms (Andi)
 - Disable automatic load CCS load balancing (Andi)
 -----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCAAdFiEEbSBwaO7dZQkcLOKj+mJfZA7rE8oFAmYO5zgACgkQ+mJfZA7r
 E8pqSAf9Hn3buQyYsRbNOpQnSI3/PhDlU0mfmPFRUJ4TykmdHyIJH1a5F0v0aHV3
 uyqZzYUbMzfDdHT3Mw0TRqihJCYq/pTo7mOaJj3kysZXWQQgmuKdZxaYe0SpQoLz
 OnGHqYhMD7nuVjs7hRPmUMEXtFB2KtcFeU6io9bSJnY3y0xnBGJuOOtd7LcsB1un
 Rsmgnvm2T/3xkz4M6gCQm3s1j2U0Fr7R1drz5dVJIQJcEXE91rIQf+dwipr9MqYX
 yoFO9cBhekW7mnD9MshSMjxlGGnwewg0ZrwJRrwDWIU3Z9GJ1113HjvuFy4LTTsx
 27TFd0ErqthqVkCN2VGXYAWYJBpVDA==
 =sska
 -----END PGP SIGNATURE-----

Merge tag 'drm-intel-fixes-2024-04-04' of https://anongit.freedesktop.org/git/drm/drm-intel into drm-fixes

Display fixes:
- A few DisplayPort related fixes (Imre, Arun, Ankit, Ville)
- eDP PSR fixes (Jouni)

Core/GT fixes:
- Remove some VM space restrictions on older platforms (Andi)
- Disable automatic load CCS load balancing (Andi)

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/Zg7nSK5oTmWfKPPI@intel.com
2024-04-05 12:32:14 +10:00
Dave Airlie
a5b5ab33d2 - Stop using system_unbound_wq for preempt fences,
as this can cause starvation when reaching more
   than max_active defined by workqueue
 - Fix saving unordered rebinding fences by attaching
   them as kernel feces to the vm's resv
 - Fix TLB invalidation fences completing out of order
 - Move rebind TLB invalidation to the ring ops to reduce
   the latency
 -----BEGIN PGP SIGNATURE-----
 
 iQJNBAABCAA3FiEE6rM8lpABPHM5FqyDm6KlpjDL6lMFAmYOugUZHGx1Y2FzLmRl
 bWFyY2hpQGludGVsLmNvbQAKCRCboqWmMMvqU2zCEACaMkMai6ugrvDfScG+jPb0
 kgTTTbEsOlAyaZJgiIrelLOFh45cqQVORa+dIRs4SkJA4O/Kt4Mlwen/Yf3yshGY
 ppAQRxbehKQ9abMJIkUkEsntpnMrAU03Y1sYiOoMp62BRVUL/4Qm6bBQ+wGgxoUm
 vKhk0p9GxD7l95gnRk3r1PGjdg72kmrLm9D8i6qxymhDqBV6dgTpB+JvO/p6NJ+F
 JJo8rllSAaoV+p3baaimdWRNsa3DDWcsO3aGWYyt8Q8ssTK6JlE3kuwl3ImEPg9E
 zxaAPLZWKsawD2SEkyj8AJK3+UM5Kw1AbM7n0fhev5Ts7tekgmiYFxkiFw+p65Sz
 BM+ay3UXtfh3qWKNwJ0sj1TfqIMEfHtHl/yec1WyJ7sk/E8VyDvJeSImHdvJggQN
 RHYG4wIs/OpT9U1bIeOp/TGpfARFGZggHz5wXPClfxHZp9pQ4d8mb7Q7MSmp+XjN
 j6Av1AXWKLtvtnFqVgT0npbzcfbONBMCcXMnAjc02gPFy9eD76FaKJMY9mEHZpvS
 3ZiKTTGFSlyvI9E76yYXbBa+L3+Vnl92fM251P1fDCbkIWHxIk746ejipmaTshoB
 MEUK7K5pIqYMPSyZFrVyoNa7fVY/vMyStYKMOqAEAfOGgdzq7uI+15GIeef2ONGY
 q42m281/dWMS0dyFaGEIlA==
 =WIRn
 -----END PGP SIGNATURE-----

Merge tag 'drm-xe-fixes-2024-04-04' of https://gitlab.freedesktop.org/drm/xe/kernel into drm-fixes

- Stop using system_unbound_wq for preempt fences,
  as this can cause starvation when reaching more
  than max_active defined by workqueue
- Fix saving unordered rebinding fences by attaching
  them as kernel feces to the vm's resv
- Fix TLB invalidation fences completing out of order
- Move rebind TLB invalidation to the ring ops to reduce
  the latency

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/tizan6wdpxu4ayudeikjglxdgzmnhdzj3li3z2pgkierjtozzw@lbfddeg43a7h
2024-04-05 12:25:35 +10:00
Dave Airlie
4cf09f171f Short summary of fixes pull:
display:
 - fix typos in kerneldoc
 
 nouveau:
 - uvmm: fix remap address calculation
 - minor cleanups
 
 panfrost:
 - fix power-transition timeouts
 
 prime:
 - unbreak dma-buf export for virt-gpu
 -----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCgAdFiEEchf7rIzpz2NEoWjlaA3BHVMLeiMFAmYOhS0ACgkQaA3BHVML
 eiP64Af/WbbZAk/rZWS4yqIqvEsylVqDxofKAOakzmNrU26c493smRvRl77ma2XP
 cjNF4Go0LiLtaqmFe/s3FSGMgleIC36yiBNBJlXFrzeNYWiJf8bblrftKfiCKplJ
 ODlLcpKru3u5kRqgo+kr9SgWhSj0N5nRcnTFb6COYjjWMZFqYoLTF++rZ44Jfiyk
 EwoZqC8cIu4nQr2tbKW7qfAZKgKSAkKC55uRmWg1l7hOqmYHS1Pr8+NWTLhWZXix
 5aiCNLp7/0qYU/lWgio0qYSZYJ40DI/aAh9xJdP7UsL4n+MrX2hprejOjQO2YI/H
 wXjX1V2iiksiThkfRPHRONI8qbFM7g==
 =DTm7
 -----END PGP SIGNATURE-----

Merge tag 'drm-misc-fixes-2024-04-04' of https://gitlab.freedesktop.org/drm/misc/kernel into drm-fixes

Short summary of fixes pull:

display:
- fix typos in kerneldoc

nouveau:
- uvmm: fix remap address calculation
- minor cleanups

panfrost:
- fix power-transition timeouts

prime:
- unbreak dma-buf export for virt-gpu

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Thomas Zimmermann <tzimmermann@suse.de>
Link: https://patchwork.freedesktop.org/patch/msgid/20240404104813.GA27376@localhost.localdomain
2024-04-05 11:59:10 +10:00
Sean Christopherson
8cb4a9a82b x86/cpufeatures: Add CPUID_LNX_5 to track recently added Linux-defined word
Add CPUID_LNX_5 to track cpufeatures' word 21, and add the appropriate
compile-time assert in KVM to prevent direct lookups on the features in
CPUID_LNX_5.  KVM uses X86_FEATURE_* flags to manage guest CPUID, and so
must translate features that are scattered by Linux from the Linux-defined
bit to the hardware-defined bit, i.e. should never try to directly access
scattered features in guest CPUID.

Opportunistically add NR_CPUID_WORDS to enum cpuid_leafs, along with a
compile-time assert in KVM's CPUID infrastructure to ensure that future
additions update cpuid_leafs along with NCAPINTS.

No functional change intended.

Fixes: 7f274e609f ("x86/cpufeatures: Add new word for scattered features")
Cc: Sandipan Das <sandipan.das@amd.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Acked-by: Dave Hansen <dave.hansen@linux.intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2024-04-04 17:42:19 -07:00
Kuniyuki Iwashima
24457f1be2 nfs: Handle error of rpc_proc_register() in nfs_net_init().
syzkaller reported a warning [0] triggered while destroying immature
netns.

rpc_proc_register() was called in init_nfs_fs(), but its error
has been ignored since at least the initial commit 1da177e4c3
("Linux-2.6.12-rc2").

Recently, commit d47151b79e ("nfs: expose /proc/net/sunrpc/nfs
in net namespaces") converted the procfs to per-netns and made
the problem more visible.

Even when rpc_proc_register() fails, nfs_net_init() could succeed,
and thus nfs_net_exit() will be called while destroying the netns.

Then, remove_proc_entry() will be called for non-existing proc
directory and trigger the warning below.

Let's handle the error of rpc_proc_register() properly in nfs_net_init().

[0]:
name 'nfs'
WARNING: CPU: 1 PID: 1710 at fs/proc/generic.c:711 remove_proc_entry+0x1bb/0x2d0 fs/proc/generic.c:711
Modules linked in:
CPU: 1 PID: 1710 Comm: syz-executor.2 Not tainted 6.8.0-12822-gcd51db110a7e #12
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
RIP: 0010:remove_proc_entry+0x1bb/0x2d0 fs/proc/generic.c:711
Code: 41 5d 41 5e c3 e8 85 09 b5 ff 48 c7 c7 88 58 64 86 e8 09 0e 71 02 e8 74 09 b5 ff 4c 89 e6 48 c7 c7 de 1b 80 84 e8 c5 ad 97 ff <0f> 0b eb b1 e8 5c 09 b5 ff 48 c7 c7 88 58 64 86 e8 e0 0d 71 02 eb
RSP: 0018:ffffc9000c6d7ce0 EFLAGS: 00010286
RAX: 0000000000000000 RBX: ffff8880422b8b00 RCX: ffffffff8110503c
RDX: ffff888030652f00 RSI: ffffffff81105045 RDI: 0000000000000001
RBP: 0000000000000000 R08: 0000000000000001 R09: 0000000000000000
R10: 0000000000000001 R11: ffffffff81bb62cb R12: ffffffff84807ffc
R13: ffff88804ad6fcc0 R14: ffffffff84807ffc R15: ffffffff85741ff8
FS:  00007f30cfba8640(0000) GS:ffff88807dd00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007ff51afe8000 CR3: 000000005a60a005 CR4: 0000000000770ef0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
PKRU: 55555554
Call Trace:
 <TASK>
 rpc_proc_unregister+0x64/0x70 net/sunrpc/stats.c:310
 nfs_net_exit+0x1c/0x30 fs/nfs/inode.c:2438
 ops_exit_list+0x62/0xb0 net/core/net_namespace.c:170
 setup_net+0x46c/0x660 net/core/net_namespace.c:372
 copy_net_ns+0x244/0x590 net/core/net_namespace.c:505
 create_new_namespaces+0x2ed/0x770 kernel/nsproxy.c:110
 unshare_nsproxy_namespaces+0xae/0x160 kernel/nsproxy.c:228
 ksys_unshare+0x342/0x760 kernel/fork.c:3322
 __do_sys_unshare kernel/fork.c:3393 [inline]
 __se_sys_unshare kernel/fork.c:3391 [inline]
 __x64_sys_unshare+0x1f/0x30 kernel/fork.c:3391
 do_syscall_x64 arch/x86/entry/common.c:52 [inline]
 do_syscall_64+0x4f/0x110 arch/x86/entry/common.c:83
 entry_SYSCALL_64_after_hwframe+0x46/0x4e
RIP: 0033:0x7f30d0febe5d
Code: ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 73 9f 1b 00 f7 d8 64 89 01 48
RSP: 002b:00007f30cfba7cc8 EFLAGS: 00000246 ORIG_RAX: 0000000000000110
RAX: ffffffffffffffda RBX: 00000000004bbf80 RCX: 00007f30d0febe5d
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 000000006c020600
RBP: 00000000004bbf80 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000002
R13: 000000000000000b R14: 00007f30d104c530 R15: 0000000000000000
 </TASK>

Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Reported-by: syzkaller <syzkaller@googlegroups.com>
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2024-04-04 18:27:13 -04:00
Alexander Wetzel
d4e655c49f scsi: sg: Avoid race in error handling & drop bogus warn
Commit 27f58c04a8 ("scsi: sg: Avoid sg device teardown race") introduced
an incorrect WARN_ON_ONCE() and missed a sequence where sg_device_destroy()
was used after scsi_device_put().

sg_device_destroy() is accessing the parent scsi_device request_queue which
will already be set to NULL when the preceding call to scsi_device_put()
removed the last reference to the parent scsi_device.

Drop the incorrect WARN_ON_ONCE() - allowing more than one concurrent
access to the sg device - and make sure sg_device_destroy() is not used
after scsi_device_put() in the error handling.

Link: https://lore.kernel.org/all/5375B275-D137-4D5F-BE25-6AF8ACAE41EF@linux.ibm.com
Fixes: 27f58c04a8 ("scsi: sg: Avoid sg device teardown race")
Cc: stable@vger.kernel.org
Signed-off-by: Alexander Wetzel <Alexander@wetzel-home.de>
Link: https://lore.kernel.org/r/20240401191038.18359-1-Alexander@wetzel-home.de
Tested-by: Sachin Sant <sachinp@linux.ibm.com>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-04-04 18:22:12 -04:00
Linus Torvalds
c88b9b4cde Including fixes from netfilter, bluetooth and bpf.
Fairly usual collection of driver and core fixes. The large selftest
 accompanying one of the fixes is also becoming a common occurrence.
 
 Current release - regressions:
 
  - ipv6: fix infinite recursion in fib6_dump_done()
 
  - net/rds: fix possible null-deref in newly added error path
 
 Current release - new code bugs:
 
  - net: do not consume a full cacheline for system_page_pool
 
  - bpf: fix bpf_arena-related file descriptor leaks in the verifier
 
  - drv: ice: fix freeing uninitialized pointers, fixing misuse of
    the newfangled __free() auto-cleanup
 
 Previous releases - regressions:
 
  - x86/bpf: fixes the BPF JIT with retbleed=stuff
 
  - xen-netfront: add missing skb_mark_for_recycle, fix page pool
    accounting leaks, revealed by recently added explicit warning
 
  - tcp: fix bind() regression for v6-only wildcard and v4-mapped-v6
    non-wildcard addresses
 
  - Bluetooth:
    - replace "hci_qca: Set BDA quirk bit if fwnode exists in DT"
      with better workarounds to un-break some buggy Qualcomm devices
    - set conn encrypted before conn establishes, fix re-connecting
      to some headsets which use slightly unusual sequence of msgs
 
  - mptcp:
    - prevent BPF accessing lowat from a subflow socket
    - don't account accept() of non-MPC client as fallback to TCP
 
  - drv: mana: fix Rx DMA datasize and skb_over_panic
 
  - drv: i40e: fix VF MAC filter removal
 
 Previous releases - always broken:
 
  - gro: various fixes related to UDP tunnels - netns crossing problems,
    incorrect checksum conversions, and incorrect packet transformations
    which may lead to panics
 
  - bpf: support deferring bpf_link dealloc to after RCU grace period
 
  - nf_tables:
    - release batch on table validation from abort path
    - release mutex after nft_gc_seq_end from abort path
    - flush pending destroy work before exit_net release
 
  - drv: r8169: skip DASH fw status checks when DASH is disabled
 
 Signed-off-by: Jakub Kicinski <kuba@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEE6jPA+I1ugmIBA4hXMUZtbf5SIrsFAmYO91wACgkQMUZtbf5S
 IrvHBQ/+PH/hobI+o3aLqwtdVlyxhmA31bVQ0I3aTIZV7c3ideMBcfgYa8TiZM2g
 pLiBiWoJXCN0h33wgUmlUee+sBvpoPCdPjGD/g99OJyKWjVt2D7ObnSwxMfjHUoq
 dtcN2JupqHP0SHz6wPPCmnWtTLxSGUsDdKjmkHQcCRhQIGTYFkYyHcOmPgNbBjaB
 6jvmH1kE9WQTFD8QcOMaZmXQ5omoafpxxQLsgundtOWxPWHL7XNvk0B5k/ESDRG1
 ujbxwtNnOESzpxZMQ6OyZlsnN/1tWfnEvLJFYVwf9BMrOlahJT/f5b/EJ9/Xy4dC
 zkAp7Tul3uAvNRKhBNhVBTWQbnIykmiNMp1VBFmiScQAy8hcnX+6d4LKTIHxbXZK
 V3AqcUS6YU2nyMdLRkhvq9f3uxD6hcY19gQdyqgCUPOtyUAs/JPv7lXQjCuuEqkq
 urEZkigUApnEqPIrIqANJ7nXUy3U0K8qU6evOZoGZ5OdiKeNKC3+tIr+g2f1ZUZq
 a7Dkat7JH9WQ7IG8Geody6Z30K9EpSqYMTKzB5wTfmuqw6cV8bl9OAW9UOSRK0GL
 pyG8GwpkpFPkNiZdu9Zt44Pno5xdLIa1+C3QZR0r5CJWYAzCbI80MppP5veF9Mw+
 v+2v8iBWuh9iv0AUj9KJOwG5QQ+EXLUuSlhtx/DFnmn2CJ9plXI=
 =6bQI
 -----END PGP SIGNATURE-----

Merge tag 'net-6.9-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net

Pull networking fixes from Jakub Kicinski:
 "Including fixes from netfilter, bluetooth and bpf.

  Fairly usual collection of driver and core fixes. The large selftest
  accompanying one of the fixes is also becoming a common occurrence.

  Current release - regressions:

   - ipv6: fix infinite recursion in fib6_dump_done()

   - net/rds: fix possible null-deref in newly added error path

  Current release - new code bugs:

   - net: do not consume a full cacheline for system_page_pool

   - bpf: fix bpf_arena-related file descriptor leaks in the verifier

   - drv: ice: fix freeing uninitialized pointers, fixing misuse of the
     newfangled __free() auto-cleanup

  Previous releases - regressions:

   - x86/bpf: fixes the BPF JIT with retbleed=stuff

   - xen-netfront: add missing skb_mark_for_recycle, fix page pool
     accounting leaks, revealed by recently added explicit warning

   - tcp: fix bind() regression for v6-only wildcard and v4-mapped-v6
     non-wildcard addresses

   - Bluetooth:
      - replace "hci_qca: Set BDA quirk bit if fwnode exists in DT" with
        better workarounds to un-break some buggy Qualcomm devices
      - set conn encrypted before conn establishes, fix re-connecting to
        some headsets which use slightly unusual sequence of msgs

   - mptcp:
      - prevent BPF accessing lowat from a subflow socket
      - don't account accept() of non-MPC client as fallback to TCP

   - drv: mana: fix Rx DMA datasize and skb_over_panic

   - drv: i40e: fix VF MAC filter removal

  Previous releases - always broken:

   - gro: various fixes related to UDP tunnels - netns crossing
     problems, incorrect checksum conversions, and incorrect packet
     transformations which may lead to panics

   - bpf: support deferring bpf_link dealloc to after RCU grace period

   - nf_tables:
      - release batch on table validation from abort path
      - release mutex after nft_gc_seq_end from abort path
      - flush pending destroy work before exit_net release

   - drv: r8169: skip DASH fw status checks when DASH is disabled"

* tag 'net-6.9-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (81 commits)
  netfilter: validate user input for expected length
  net/sched: act_skbmod: prevent kernel-infoleak
  net: usb: ax88179_178a: avoid the interface always configured as random address
  net: dsa: sja1105: Fix parameters order in sja1110_pcs_mdio_write_c45()
  net: ravb: Always update error counters
  net: ravb: Always process TX descriptor ring
  netfilter: nf_tables: discard table flag update with pending basechain deletion
  netfilter: nf_tables: Fix potential data-race in __nft_flowtable_type_get()
  netfilter: nf_tables: reject new basechain after table flag update
  netfilter: nf_tables: flush pending destroy work before exit_net release
  netfilter: nf_tables: release mutex after nft_gc_seq_end from abort path
  netfilter: nf_tables: release batch on table validation from abort path
  Revert "tg3: Remove residual error handling in tg3_suspend"
  tg3: Remove residual error handling in tg3_suspend
  net: mana: Fix Rx DMA datasize and skb_over_panic
  net/sched: fix lockdep splat in qdisc_tree_reduce_backlog()
  net: phy: micrel: lan8814: Fix when enabling/disabling 1-step timestamping
  net: stmmac: fix rx queue priority assignment
  net: txgbe: fix i2c dev name cannot match clkdev
  net: fec: Set mac_managed_pm during probe
  ...
2024-04-04 14:49:10 -07:00
Linus Torvalds
ec25bd8d98 bcachefs repair code for 6.9-rc3
A couple more small fixes, and new repair code.
 
 We can now automatically recover from arbitrary corrupted interior btree
 nodes by scanning, and we can reconstruct metadata as needed to bring a
 filesystem back into a working, consistent, read-write state and
 preserve access to whatevver wasn't corrupted.
 
 Meaning - you can blow away all metadata except for extents and dirents
 leaf nodes, and repair will reconstruct everything else and give you
 your data, and under the correct paths. If inodes are missing i_size
 will be slightly off and permissions/ownership/timestamps will be gone,
 and we do still need the snapshots btree if snapshots were in use - in
 the future we'll be able to guess the snapshot tree structure in some
 situations.
 
 IOW - aside from shaking out remaining bugs (fuzz testing is still
 coming), repair code should be complete and if repair ever doesn't work
 that's the highest priority bug that I want to know about immediately.
 
 This patchset was kindly tested by a user from India who accidentally
 wiped one drive out of a three drive filesystem with no replication on
 the family computer - it took a couple weeks but we got everything
 important back.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEKnAFLkS8Qha+jvQrE6szbY3KbnYFAmYNq9IACgkQE6szbY3K
 bnaG9w/+Od0iq4Nqx62Mf8+O5DLnZZNu3c9aUOEiuzdXlNrpUr4S9j4WwDxTb/EN
 2a3ldXY5AhauZqEW7Qv+WBZvVVbm3GYH+oOYQo8V+yf1oGNB3+AGxBCCmruHJGLk
 5nmwsRyVm1ihAKxn1oxwrDDPtOlxbGOlc4peR+nCY/b5QnlXegGkGfRAHO/z9bul
 4JdBYBqR4KBGdevIV8EG2WVa6ASA6mF1QOboeB6INekD4klDpm41gK/0S9Uf2oXm
 q1PiN655YHquXbJTT9k/HtVX4WhlcaHv+R4KeZ5TEReJjB57ot/M8Rx57lgsYHP6
 TeyR4Y5VYGLYqlwMK5RiKyGLB92qNFcSlg5inASyTCUNi1KKu12SpqS3+Nel6+tF
 gu4F4ElSvAcsmJ6LrfsfP9B8u0ULDkIyq9xBFFbLTIpLuDOqz8FcgFpZrpiO445w
 F6FcYXqt2/fP7gxA3GzdFjeUojIjWNMJapgpsePg/HGNArBsoAsBL8rAhAyetG3Z
 EOJlrJ8m59/QoPgXBpScfbS7cxk3JgrUzfSI/oKaEr2lS0YNlYjQANYHoEHTFaxA
 bMWKXwMkvqz49MMm5WLaMIOYDJRDtrt0qpnW7x+qU7ik/VkHeUTJr07bSRIKT0z1
 yNCynYtdbeQVfekZQS6JwsyTs/ehbI1OVN8MGwVRCrQTonYz+BA=
 =7/rR
 -----END PGP SIGNATURE-----

Merge tag 'bcachefs-2024-04-03' of https://evilpiepirate.org/git/bcachefs

Pull bcachefs repair code from Kent Overstreet:
 "A couple more small fixes, and new repair code.

  We can now automatically recover from arbitrary corrupted interior
  btree nodes by scanning, and we can reconstruct metadata as needed to
  bring a filesystem back into a working, consistent, read-write state
  and preserve access to whatevver wasn't corrupted.

  Meaning - you can blow away all metadata except for extents and
  dirents leaf nodes, and repair will reconstruct everything else and
  give you your data, and under the correct paths. If inodes are missing
  i_size will be slightly off and permissions/ownership/timestamps will
  be gone, and we do still need the snapshots btree if snapshots were in
  use - in the future we'll be able to guess the snapshot tree structure
  in some situations.

  IOW - aside from shaking out remaining bugs (fuzz testing is still
  coming), repair code should be complete and if repair ever doesn't
  work that's the highest priority bug that I want to know about
  immediately.

  This patchset was kindly tested by a user from India who accidentally
  wiped one drive out of a three drive filesystem with no replication on
  the family computer - it took a couple weeks but we got everything
  important back"

* tag 'bcachefs-2024-04-03' of https://evilpiepirate.org/git/bcachefs:
  bcachefs: reconstruct_inode()
  bcachefs: Subvolume reconstruction
  bcachefs: Check for extents that point to same space
  bcachefs: Reconstruct missing snapshot nodes
  bcachefs: Flag btrees with missing data
  bcachefs: Topology repair now uses nodes found by scanning to fill holes
  bcachefs: Repair pass for scanning for btree nodes
  bcachefs: Don't skip fake btree roots in fsck
  bcachefs: bch2_btree_root_alloc() -> bch2_btree_root_alloc_fake()
  bcachefs: Etyzinger cleanups
  bcachefs: bch2_shoot_down_journal_keys()
  bcachefs: Clear recovery_passes_required as they complete without errors
  bcachefs: ratelimit informational fsck errors
  bcachefs: Check for bad needs_discard before doing discard
  bcachefs: Improve bch2_btree_update_to_text()
  mean_and_variance: Drop always failing tests
  bcachefs: fix nocow lock deadlock
  bcachefs: BCH_WATERMARK_interior_updates
  bcachefs: Fix btree node reserve
2024-04-04 14:36:32 -07:00
Kent Overstreet
9802ff48f3 bcachefs: Print shutdown journal sequence number
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-04-04 16:56:44 -04:00
Kent Overstreet
d880a43836 bcachefs: Further improve btree_update_to_text()
Print start and end level of the btree update; also a bit of cleanup.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-04-04 16:56:44 -04:00
Kent Overstreet
9fb3036fe3 bcachefs: Move btree_updates to debugfs
sysfs is limited to PAGE_SIZE, and when we're debugging strange
deadlocks/priority inversions we need to see the full list.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-04-04 16:56:44 -04:00
Kent Overstreet
be42e4a621 bcachefs: Bump limit in btree_trans_too_many_iters()
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-04-04 16:53:12 -04:00
Kent Overstreet
01e5f4fc0f bcachefs: Make snapshot_is_ancestor() safe
Snapshot table accesses generally need to be checking for invalid
snapshot ID now, fix one that was missed.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-04-04 16:52:42 -04:00
Stefan O'Rear
d14fa1fcf6
riscv: process: Fix kernel gp leakage
childregs represents the registers which are active for the new thread
in user context. For a kernel thread, childregs->gp is never used since
the kernel gp is not touched by switch_to. For a user mode helper, the
gp value can be observed in user space after execve or possibly by other
means.

[From the email thread]

The /* Kernel thread */ comment is somewhat inaccurate in that it is also used
for user_mode_helper threads, which exec a user process, e.g. /sbin/init or
when /proc/sys/kernel/core_pattern is a pipe. Such threads do not have
PF_KTHREAD set and are valid targets for ptrace etc. even before they exec.

childregs is the *user* context during syscall execution and it is observable
from userspace in at least five ways:

1. kernel_execve does not currently clear integer registers, so the starting
   register state for PID 1 and other user processes started by the kernel has
   sp = user stack, gp = kernel __global_pointer$, all other integer registers
   zeroed by the memset in the patch comment.

   This is a bug in its own right, but I'm unwilling to bet that it is the only
   way to exploit the issue addressed by this patch.

2. ptrace(PTRACE_GETREGSET): you can PTRACE_ATTACH to a user_mode_helper thread
   before it execs, but ptrace requires SIGSTOP to be delivered which can only
   happen at user/kernel boundaries.

3. /proc/*/task/*/syscall: this is perfectly happy to read pt_regs for
   user_mode_helpers before the exec completes, but gp is not one of the
   registers it returns.

4. PERF_SAMPLE_REGS_USER: LOCKDOWN_PERF normally prevents access to kernel
   addresses via PERF_SAMPLE_REGS_INTR, but due to this bug kernel addresses
   are also exposed via PERF_SAMPLE_REGS_USER which is permitted under
   LOCKDOWN_PERF. I have not attempted to write exploit code.

5. Much of the tracing infrastructure allows access to user registers. I have
   not attempted to determine which forms of tracing allow access to user
   registers without already allowing access to kernel registers.

Fixes: 7db91e57a0 ("RISC-V: Task implementation")
Cc: stable@vger.kernel.org
Signed-off-by: Stefan O'Rear <sorear@fastmail.com>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/r/20240327061258.2370291-1-sorear@fastmail.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-04-04 12:35:05 -07:00
Alexandre Ghiti
a370c2419e
riscv: Disable preemption when using patch_map()
patch_map() uses fixmap mappings to circumvent the non-writability of
the kernel text mapping.

The __set_fixmap() function only flushes the current cpu tlb, it does
not emit an IPI so we must make sure that while we use a fixmap mapping,
the current task is not migrated on another cpu which could miss the
newly introduced fixmap mapping.

So in order to avoid any task migration, disable the preemption.

Reported-by: Andrea Parri <andrea@rivosinc.com>
Closes: https://lore.kernel.org/all/ZcS+GAaM25LXsBOl@andrea/
Reported-by: Andy Chiu <andy.chiu@sifive.com>
Closes: https://lore.kernel.org/linux-riscv/CABgGipUMz3Sffu-CkmeUB1dKVwVQ73+7=sgC45-m0AE9RCjOZg@mail.gmail.com/
Fixes: cad539baa4 ("riscv: implement a memset like function for text")
Fixes: 0ff7c3b331 ("riscv: Use text_mutex instead of patch_lock")
Co-developed-by: Andy Chiu <andy.chiu@sifive.com>
Signed-off-by: Andy Chiu <andy.chiu@sifive.com>
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Acked-by: Puranjay Mohan <puranjay12@gmail.com>
Link: https://lore.kernel.org/r/20240326203017.310422-3-alexghiti@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-04-04 12:33:38 -07:00