Commit Graph

98333 Commits

Author SHA1 Message Date
Linus Torvalds
e0f8e1a7c1 3 ksmbd SMB3 server fixes
-----BEGIN PGP SIGNATURE-----
 
 iQGzBAABCgAdFiEE6fsu8pdIjtWE/DpLiiy9cAdyT1EFAmgumngACgkQiiy9cAdy
 T1G11wwAmJTHyRxLx/koFy7cKCwL4+ttza0TdXD+NhRSjtb3Bd69OvDLWtefpa+0
 NERFUM43/kJ4vIBeIw0mTaW8BVuVc5zna/yzkbPRN0ehdRnN1iuZTjHLJeqMvDt8
 jNKfuRpC0IaC5DQxtVWMDmNDNSEQmK9zlcRVb2LhDTTBPSLlFIUcmUfJDXysyzwB
 4I7Q2xG/OkpBx7oA7QQZEvzj4S80kcDRkWZK2sQ0EgwuEflc+rU6EUSJsEkenUTx
 fQiv6rKY5ul6QV29VytXGOK3yHDDVNmUQ/g0FcbdAwpXEXz2H+ZmjDCVpRkuGlVZ
 9IZA6lYy6eEheyCMYcVWmhtQgs+96Oqez6Z+snQf4Sjq+op37bWotFBaB5/gRuPr
 lue43XfUbjRey8a8xTb0zNCwph9y5WaEez9dPR3BoHMHgTvvhrgFT4VL/iffGK9v
 fLcMWb+HgHasvRag07khSRNp1pJEnYPEMzhyFyHTVExtgr3jhktP2k6jtSVRScrW
 Dc2/pjfM
 =vrEs
 -----END PGP SIGNATURE-----

Merge tag 'v6.15-rc8-ksmbd-server-fixes' of git://git.samba.org/ksmbd

Pull smb server fixes from Steve French:

 - Fix for rename regression due to the recent VFS lookup changes

 - Fix write failure

 - locking fix for oplock handling

* tag 'v6.15-rc8-ksmbd-server-fixes' of git://git.samba.org/ksmbd:
  ksmbd: use list_first_entry_or_null for opinfo_get_list()
  ksmbd: fix rename failure
  ksmbd: fix stream write failure
2025-05-23 08:42:29 -07:00
Linus Torvalds
eccf6f2f6a vfs-6.15-rc8.fixes
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCaDBLdgAKCRCRxhvAZXjc
 oh61AP43WQ/Y0OrRqzKDPHGaFb4wGCdJwTKM2ZIjo8bSSXucZgD/ZcX6ksmmLp5/
 XMsPzB7e5vrnkY5Y1jRdPn1fBWqlHQk=
 =dTHV
 -----END PGP SIGNATURE-----

Merge tag 'vfs-6.15-rc8.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs

Pull vfs fixes from Christian Brauner:
 "This contains a small set of fixes for the blocking buffer lookup
  conversion done earlier this cycle.

  It adds a missing conversion in the getblk slowpath and a few minor
  optimizations and cleanups"

* tag 'vfs-6.15-rc8.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
  fs/buffer: optimize discard_buffer()
  fs/buffer: remove superfluous statements
  fs/buffer: avoid redundant lookup in getblk slowpath
  fs/buffer: use sleeping lookup in __getblk_slowpath()
2025-05-23 07:51:05 -07:00
Linus Torvalds
040c0f6a18 bcachefs fixes for 6.15
Small stuff, main ones users will be interested in:
 
 - Couple more casefolding fixes; we can now detect and repair casefolded
   dirents in non-casefolded dir and vice versa
 - Fix for massive write inflation with mmapped io, which hit certain
   databases
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEKnAFLkS8Qha+jvQrE6szbY3KbnYFAmgvz74ACgkQE6szbY3K
 bnY2mw/+Ioz+7dQz4vw2DY48o6Qn9jGst1XANOb2PQD/1RaEcnJF2d8C5gmjxVN6
 ysl/TQf+BHLkiXb6TTOgXVgtLjGBMOk3NNHrv8ebthacFcIfPC7r/cE5KmECRPiT
 0Fs6KTDeQscjLzg8YHI1PHnJ6EU/uYy76wK1nIcF5VJxLYSH6UQ2ZAo7NT0sAc2K
 PyCsvmOb8mev3wrCHgYYUWZyS0Cla6a44bpG83km5NhIC4dcrL8On8TO2E2VwVpX
 A9otPSLmmdIuJd+Q6nUNiEFOVcdtZn/rH9jz9QWokH/kpdoqCysZneTHXEy0kVCG
 NAKuEKv/mv+cQ45pn+4JhWf+FaLih/46w+C/10gNWShHoqdxIZFYOlb3S5ZCvK7L
 ij5zlcQF/ZrT7bZJTov1J6SUM9LShgTvt4gjqcaiokb2BBYqLn6u/TBaNgW9wu0Q
 4BiYfdO/BfCaLiBvIz2WuMmUfh2i+rBYjetW0R+Sq8f/vFfFTw7tWipO/u4zrE7J
 D5FVT9+XGwAvRULk1/jG5df9ZsF/g4L2CMdQuWh/8wXfTp/GrKirFN5x6Ywkz6ew
 rb/m/ukSOLLUcRBJCapZmDrRwBnaQ0MTFi6z8qNb125ENR0WwbYkV0XdkBOi3U/g
 1OM2eC/SWOrWkNXl4mLMv0MUSLEsVyuYUtf75FvVm4zNH9Spqos=
 =a06t
 -----END PGP SIGNATURE-----

Merge tag 'bcachefs-2025-05-22' of git://evilpiepirate.org/bcachefs

Pull bcachefs fixes from Kent Overstreet:
 "Small stuff, main ones users will be interested in:

   - Couple more casefolding fixes; we can now detect and repair
     casefolded dirents in non-casefolded dir and vice versa

   - Fix for massive write inflation with mmapped io, which hit certain
     databases"

* tag 'bcachefs-2025-05-22' of git://evilpiepirate.org/bcachefs:
  bcachefs: Check for casefolded dirents in non casefolded dirs
  bcachefs: Fix bch2_dirent_create_snapshot() for casefolding
  bcachefs: Fix casefold opt via xattr interface
  bcachefs: mkwrite() now only dirties one page
  bcachefs: fix extent_has_stripe_ptr()
  bcachefs: Fix bch2_btree_path_traverse_cached() when paths realloced
2025-05-22 19:17:55 -07:00
Linus Torvalds
e85dea591f two smb3 client fixes
-----BEGIN PGP SIGNATURE-----
 
 iQGzBAABCgAdFiEE6fsu8pdIjtWE/DpLiiy9cAdyT1EFAmgvRZwACgkQiiy9cAdy
 T1FhHgwAmCyhbUzJkszIn3KCVrcxPmydM4zf7fniiAEk9uUX58FdovQ7fbrt6wxY
 joN3dtvoCu5A6zOAyzBWt8V6gnWqz2EH6nve9bMo+WRk380RbIisSYnZC0NaYjQb
 oM/5zuyBxIqvN30CkLVMp/6Ps6wdGmdyOcjtK4xeyW7BPnM7pd74Z2ttEy9QsxlT
 PCETHtL1wM+iKKf3ua5N7Sti11mXyTOe/6X3Kl65rmiyiNQ2F6L/qTtswbu4QOzv
 mVsxoEOSxPu52KIostZsWloP2vQuvE8Cuk4z3UoC1Osd/xmvMAoOiMbB72vyAmHW
 4dJgvZei+D3gKUQslIZSCIG0cQfneBxhp/z4+YxSGAnWgDx/5g3IJuyZ6bk5SQXA
 PNJu80fOe683QudxNzmQN3WioYdgRatxPxZFjqW8uhovWRM9EPydB3vi+oCdEQcH
 KXJNAR3pUSaavVRiLdm8JbLkqVchjEuTj/Ba1Ws9Z4LVJpVFAqhIynHDoxqPIhsh
 jGcJJA4X
 =Agsj
 -----END PGP SIGNATURE-----

Merge tag '6.15-rc8-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6

Pull smb client fixes from Steve French:

 - Two fixes for use after free in readdir code paths

* tag '6.15-rc8-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6:
  smb: client: Reset all search buffer pointers when releasing buffer
  smb: client: Fix use-after-free in cifs_fill_dirent
2025-05-22 12:35:16 -07:00
Namjae Jeon
10379171f3 ksmbd: use list_first_entry_or_null for opinfo_get_list()
The list_first_entry() macro never returns NULL.  If the list is
empty then it returns an invalid pointer.  Use list_first_entry_or_null()
to check if the list is empty.

Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
Closes: https://lore.kernel.org/r/202505080231.7OXwq4Te-lkp@intel.com/
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
2025-05-21 22:30:39 -05:00
Namjae Jeon
68477b5dc5 ksmbd: fix rename failure
I found that rename fails after cifs mount due to update of
lookup_one_qstr_excl().

 mv a/c b/
mv: cannot move 'a/c' to 'b/c': No such file or directory

In order to rename to a new name regardless of whether the dentry is
negative, we need to get the dentry through lookup_one_qstr_excl().
So It will not return error if the name doesn't exist.

Fixes: 204a575e91 ("VFS: add common error checks to lookup_one_qstr_excl()")
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
2025-05-21 22:30:30 -05:00
Kent Overstreet
010c894681 bcachefs: Check for casefolded dirents in non casefolded dirs
Check for mismatches between casefold dirents and casefold directories.

A mismatch will cause lookups to fail, as we'll be doing the lookup with
the casefolded name, which won't match the non-casefolded dirent, and
vice versa.

Reported-by: Christopher Snowhill <chris@kode54.net>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-21 20:13:14 -04:00
Kent Overstreet
ecd76c5f10 bcachefs: Fix bch2_dirent_create_snapshot() for casefolding
bch2_dirent_create_snapshot(), used in fsck, neglected to create a
casefolded dirent.

Just move this into dirent_create_key().

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-21 20:13:13 -04:00
Kent Overstreet
8d5ac187da bcachefs: Fix casefold opt via xattr interface
Changing the casefold option requires extra checks/work - factor out a
helper from bch2_fileattr_set() for the xattr code to use.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-21 20:13:09 -04:00
Davidlohr Bueso
8e184bf1cd
fs/buffer: optimize discard_buffer()
While invalidating, the clearing of the bits in discard_buffer()
is done in one fully ordered CAS operation. In the past this was
done via individual clear_bit(), until e7470ee89f (fs: buffer:
do not use unnecessary atomic operations when discarding buffers).
This implies that there were never strong ordering requirements
outside of being serialized by the buffer lock.

As such relax the ordering for archs that can benefit. Further,
the implied ordering in buffer_unlock() makes current cmpxchg
implied barrier redundant due to release semantics. And while in
theory the unlock could be part of the bulk clearing, it is
best to leave it explicit, but without the double barriers.

Signed-off-by: Davidlohr Bueso <dave@stgolabs.net>
Link: https://lore.kernel.org/20250515173925.147823-5-dave@stgolabs.net
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-05-21 09:34:29 +02:00
Davidlohr Bueso
d11a249996
fs/buffer: remove superfluous statements
Get rid of those unnecessary return statements.

Signed-off-by: Davidlohr Bueso <dave@stgolabs.net>
Link: https://lore.kernel.org/20250515173925.147823-4-dave@stgolabs.net
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-05-21 09:34:29 +02:00
Davidlohr Bueso
98a6ca1633
fs/buffer: avoid redundant lookup in getblk slowpath
__getblk_slow() already implies failing a first lookup
as the fastpath, so try to create the buffers immediately
and avoid the redundant lookup. This saves 5-10% of the
total cost/latency of the slowpath.

Signed-off-by: Davidlohr Bueso <dave@stgolabs.net>
Link: https://lore.kernel.org/20250515173925.147823-3-dave@stgolabs.net
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-05-21 09:34:29 +02:00
Davidlohr Bueso
fb27226c38
fs/buffer: use sleeping lookup in __getblk_slowpath()
Just as with the fast path, call the lookup variant depending
on the gfp flags.

Signed-off-by: Davidlohr Bueso <dave@stgolabs.net>
Link: https://lore.kernel.org/20250515173925.147823-2-dave@stgolabs.net
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-05-21 09:34:28 +02:00
Linus Torvalds
b36ddb9210 Hello Linus.
A 6.14-rc7 commit (665575cf) broke orangefs for 6.14. I have a patch,
 and it has been in linux-next for several days, but if I wait until
 the merge window to ask for it to be pulled, orangefs will be broken
 in 6.15 too.
 
 I hope you can pull this patch during 6.15-rc7.
 
 Thanks...
 
 Mike Marshall
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEWLOQAZkv9m2xtUCf/Vh1C1u0YnAFAmgsoBYACgkQ/Vh1C1u0
 YnCFCA//cAe+YVGhzo90X2qGbgxYl4AnoR8q8d9IhhJulGmJivvF0wFsccKf5S9g
 jSHqQD315fJ1io9cS3fwUFH5xGYYaBT7CK+x5mzP24aN8cGdHut1boEiW+aaGpwi
 VIarX2pSPtgQ/KSrfv+Y7KpbkBER4sX3mHXa2XFw/PGseJnniaH8HTO5dIjkl4JL
 rt8Xtba3zsb6R8/HbfFGX/QA14b9ZfdSpZAY8+OEl9cPghhZjKUhDuw8dJnH4M0x
 ilEWGd0wl6q9ghhNXtrNYRHVaDiO300P4lSphbRDA++zH1bCigzWIlZNnE3lPvXb
 lm5Y+HnpE/bgQAI1l5+wjKSoBt4zRY2T6cUtf7mANbsvNKoc9D8dEIE3hSbBpb2A
 zroEbqKt+OPhrZDt7YE6jM/66RpcX7ng81oFEVkfmEPeRXNRgYD39RtrIknZ7Kgr
 CqsF5xtdFzmxmYeOyB89wEFkewUn1vsh86t4wwk74cbJnSwgZF7KQ+TUD8Gdh21X
 h6746x1ycP3dT8RPdRgVNAfbV3IeIufKqIZWfItu9PG4g+G2LRhbxAI8ralqO/4p
 zj9TRNPCcCCczPzrG/8z2u9d70jkechsQ5V0oipN9TsG55elADjaVcEPXfCvvrAs
 p+MwM5wJZywLgUqUEwyIU7bXRdwf8sq1kUAiFcUhvuHTL0SuTCo=
 =BM1w
 -----END PGP SIGNATURE-----

Merge tag 'for-linus-6.15-ofs2' of git://git.kernel.org/pub/scm/linux/kernel/git/hubcap/linux

Pull orangefs fix from Mike Marshall:
 "Fix for orangefs page writeout counting"

* tag 'for-linus-6.15-ofs2' of git://git.kernel.org/pub/scm/linux/kernel/git/hubcap/linux:
  orangefs: adjust counting code to recover from 665575cf
2025-05-20 09:03:34 -07:00
Mike Marshall
219bf6edd7 orangefs: adjust counting code to recover from 665575cf
A late commit to 6.14-rc7! broke orangefs. 665575cf seems like a
good change, but maybe should have been introduced during the merge
window. This patch adjusts the counting code associated with
writing out pages so that orangefs works in a 665575cf world.

Signed-off-by: Mike Marshall <hubcap@omnibond.com>
2025-05-20 11:07:00 -04:00
Namjae Jeon
1f4bbedd4e ksmbd: fix stream write failure
If there is no stream data in file, v_len is zero.
So, If position(*pos) is zero, stream write will fail
due to stream write position validation check.
This patch reorganize stream write position validation.

Fixes: 0ca6df4f40 ("ksmbd: prevent out-of-bounds stream writes by validating *pos")
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
2025-05-19 20:35:08 -05:00
Wang Zhaolong
e48f9d849b smb: client: Reset all search buffer pointers when releasing buffer
Multiple pointers in struct cifs_search_info (ntwrk_buf_start,
srch_entries_start, and last_entry) point to the same allocated buffer.
However, when freeing this buffer, only ntwrk_buf_start was set to NULL,
while the other pointers remained pointing to freed memory.

This is defensive programming to prevent potential issues with stale
pointers. While the active UAF vulnerability is fixed by the previous
patch, this change ensures consistent pointer state and more robust error
handling.

Signed-off-by: Wang Zhaolong <wangzhaolong1@huawei.com>
Cc: stable@vger.kernel.org
Reviewed-by: Paulo Alcantara (Red Hat) <pc@manguebit.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2025-05-19 20:29:06 -05:00
Kent Overstreet
cbed8287e5 bcachefs: mkwrite() now only dirties one page
Don't dirty the whole folio - fixes write amplification with
applications doing mmaped writes.

https://www.reddit.com/r/bcachefs/comments/1klzcg1/incredible_amounts_of_write_amplification_when/

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-19 08:28:41 -04:00
Kent Overstreet
494d458cfa bcachefs: fix extent_has_stripe_ptr()
This wasn't checking indirect extents.

Fixes: https://github.com/koverstreet/bcachefs/issues/887
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-18 22:35:33 -04:00
Wang Zhaolong
a7a8fe56e9 smb: client: Fix use-after-free in cifs_fill_dirent
There is a race condition in the readdir concurrency process, which may
access the rsp buffer after it has been released, triggering the
following KASAN warning.

 ==================================================================
 BUG: KASAN: slab-use-after-free in cifs_fill_dirent+0xb03/0xb60 [cifs]
 Read of size 4 at addr ffff8880099b819c by task a.out/342975

 CPU: 2 UID: 0 PID: 342975 Comm: a.out Not tainted 6.15.0-rc6+ #240 PREEMPT(full)
 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.1-2.fc37 04/01/2014
 Call Trace:
  <TASK>
  dump_stack_lvl+0x53/0x70
  print_report+0xce/0x640
  kasan_report+0xb8/0xf0
  cifs_fill_dirent+0xb03/0xb60 [cifs]
  cifs_readdir+0x12cb/0x3190 [cifs]
  iterate_dir+0x1a1/0x520
  __x64_sys_getdents+0x134/0x220
  do_syscall_64+0x4b/0x110
  entry_SYSCALL_64_after_hwframe+0x76/0x7e
 RIP: 0033:0x7f996f64b9f9
 Code: ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 89 f8 48 89
 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01
 f0 ff ff  0d f7 c3 0c 00 f7 d8 64 89 8
 RSP: 002b:00007f996f53de78 EFLAGS: 00000207 ORIG_RAX: 000000000000004e
 RAX: ffffffffffffffda RBX: 00007f996f53ecdc RCX: 00007f996f64b9f9
 RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000003
 RBP: 00007f996f53dea0 R08: 0000000000000000 R09: 0000000000000000
 R10: 0000000000000000 R11: 0000000000000207 R12: ffffffffffffff88
 R13: 0000000000000000 R14: 00007ffc8cd9a500 R15: 00007f996f51e000
  </TASK>

 Allocated by task 408:
  kasan_save_stack+0x20/0x40
  kasan_save_track+0x14/0x30
  __kasan_slab_alloc+0x6e/0x70
  kmem_cache_alloc_noprof+0x117/0x3d0
  mempool_alloc_noprof+0xf2/0x2c0
  cifs_buf_get+0x36/0x80 [cifs]
  allocate_buffers+0x1d2/0x330 [cifs]
  cifs_demultiplex_thread+0x22b/0x2690 [cifs]
  kthread+0x394/0x720
  ret_from_fork+0x34/0x70
  ret_from_fork_asm+0x1a/0x30

 Freed by task 342979:
  kasan_save_stack+0x20/0x40
  kasan_save_track+0x14/0x30
  kasan_save_free_info+0x3b/0x60
  __kasan_slab_free+0x37/0x50
  kmem_cache_free+0x2b8/0x500
  cifs_buf_release+0x3c/0x70 [cifs]
  cifs_readdir+0x1c97/0x3190 [cifs]
  iterate_dir+0x1a1/0x520
  __x64_sys_getdents64+0x134/0x220
  do_syscall_64+0x4b/0x110
  entry_SYSCALL_64_after_hwframe+0x76/0x7e

 The buggy address belongs to the object at ffff8880099b8000
  which belongs to the cache cifs_request of size 16588
 The buggy address is located 412 bytes inside of
  freed 16588-byte region [ffff8880099b8000, ffff8880099bc0cc)

 The buggy address belongs to the physical page:
 page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x99b8
 head: order:3 mapcount:0 entire_mapcount:0 nr_pages_mapped:0 pincount:0
 anon flags: 0x80000000000040(head|node=0|zone=1)
 page_type: f5(slab)
 raw: 0080000000000040 ffff888001e03400 0000000000000000 dead000000000001
 raw: 0000000000000000 0000000000010001 00000000f5000000 0000000000000000
 head: 0080000000000040 ffff888001e03400 0000000000000000 dead000000000001
 head: 0000000000000000 0000000000010001 00000000f5000000 0000000000000000
 head: 0080000000000003 ffffea0000266e01 00000000ffffffff 00000000ffffffff
 head: ffffffffffffffff 0000000000000000 00000000ffffffff 0000000000000008
 page dumped because: kasan: bad access detected

 Memory state around the buggy address:
  ffff8880099b8080: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
  ffff8880099b8100: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
 >ffff8880099b8180: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
                             ^
  ffff8880099b8200: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
  ffff8880099b8280: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
 ==================================================================

POC is available in the link [1].

The problem triggering process is as follows:

Process 1                       Process 2
-----------------------------------------------------------------
cifs_readdir
  /* file->private_data == NULL */
  initiate_cifs_search
    cifsFile = kzalloc(sizeof(struct cifsFileInfo), GFP_KERNEL);
    smb2_query_dir_first ->query_dir_first()
      SMB2_query_directory
        SMB2_query_directory_init
        cifs_send_recv
        smb2_parse_query_directory
          srch_inf->ntwrk_buf_start = (char *)rsp;
          srch_inf->srch_entries_start = (char *)rsp + ...
          srch_inf->last_entry = (char *)rsp + ...
          srch_inf->smallBuf = true;
  find_cifs_entry
    /* if (cfile->srch_inf.ntwrk_buf_start) */
    cifs_small_buf_release(cfile->srch_inf // free

                        cifs_readdir  ->iterate_shared()
                          /* file->private_data != NULL */
                          find_cifs_entry
                            /* in while (...) loop */
                            smb2_query_dir_next  ->query_dir_next()
                              SMB2_query_directory
                                SMB2_query_directory_init
                                cifs_send_recv
                                  compound_send_recv
                                    smb_send_rqst
                                    __smb_send_rqst
                                      rc = -ERESTARTSYS;
                                      /* if (fatal_signal_pending()) */
                                      goto out;
                                      return rc
                            /* if (cfile->srch_inf.last_entry) */
                            cifs_save_resume_key()
                              cifs_fill_dirent // UAF
                            /* if (rc) */
                            return -ENOENT;

Fix this by ensuring the return code is checked before using pointers
from the srch_inf.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=220131 [1]
Fixes: a364bc0b37 ("[CIFS] fix saving of resume key before CIFSFindNext")
Cc: stable@vger.kernel.org
Reviewed-by: Paulo Alcantara (Red Hat) <pc@manguebit.com>
Signed-off-by: Wang Zhaolong <wangzhaolong1@huawei.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2025-05-18 16:53:06 -05:00
Kent Overstreet
49771a7578 bcachefs: Fix bch2_btree_path_traverse_cached() when paths realloced
btree_key_cache_fill() will allocate and traverse another path (for the
underlying btree), so we can't hold pointers to paths across a call - we
have to pass indices.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-17 18:46:17 -04:00
Linus Torvalds
172a9d9433 two smb3 client fixes
-----BEGIN PGP SIGNATURE-----
 
 iQGzBAABCgAdFiEE6fsu8pdIjtWE/DpLiiy9cAdyT1EFAmgny7kACgkQiiy9cAdy
 T1GbEgv+LW38yTQdLWzWTUQgNl7XcnOFGE6vqSaRVr1A5oomkjHbTfBGc2XxbSbj
 iBABTSqhcSHNnc5vDk4t6+SK8rYUBXUiLA2o5Ol0KxLznrMU2sVaSC6tAqV1MBzG
 c9AgoNLZf16QsWmRQbZRXz0wiMJrcmXR4tOrKYisA9uBP43BaYNdGII/KTOV+HYb
 4ytT7RPyDHiyVF+4dewk591mVUU9DqLawN+FysLzDOuJmt/hbh2xd6MEDC9KwD4u
 jOocz++K+VbHxotE+XTYYFszyC+GVEtGhL9HLoXq3v1KPJanUZgiCdWBTdJwMhrv
 zXFd9d87PJ5vjFMbriEx2RVQ7e7/oq+HleMOcW4z2puODegNqU+ZVhmXrvTSRoVw
 Rl1BzqP4MONK/70T0v7lwbw9B9JykhLQPzM0elP/uWQYDlHhAlGn4rsWD+asUTtH
 TJM0q/JmUlDxgVBMLz6EBNHTxc52bwM5tJe0r4d4U93NVlAaNkt8+PQUBctgWicd
 9L+FOLHu
 =0j+F
 -----END PGP SIGNATURE-----

Merge tag '6.15-rc6-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6

Pull smb client fixes from Steve French:

 - Fix memory leak in mkdir error path

 - Fix max rsize miscalculation after channel reconnect

* tag '6.15-rc6-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6:
  smb: client: fix zero rsize error messages
  smb: client: fix memory leak during error handling for POSIX mkdir
2025-05-16 18:02:41 -07:00
Linus Torvalds
450d2f6e88 NFS client bugfixes for Linux 6.15
Bugfixes:
 - NFS: Fix a couple of missed handlers for the ENETDOWN and ENETUNREACH
   transport errors.
 - NFS: Handle Oopsable failure of nfs_get_lock_context in the unlock
   path.
 - NFSv4: Fix a race in nfs_local_open_fh().
 - NFSv4/pNFS: Fix a couple of layout segment leaks in layoutreturn.
 - NFSv4/pNFS Avoid sharing pNFS DS connections between net namespaces
   since IP addresses are not guaranteed to refer to the same nodes.
 - NFS: Don't flush file data while holding multiple directory locks in
   nfs_rename().
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEESQctxSBg8JpV8KqEZwvnipYKAPIFAmgnoW8ACgkQZwvnipYK
 APIDMBAAnt47W2zXdurecdQ9Kd5NUvcsyiqSdBf7PbnW5+pYjDEnl+0+P3UXWCle
 hZZgR330zglD4cTQX8Ci6EhwNSmitNPWljzTiWrb0W4Y24Eh5qXHMqN8qn9K7lUr
 68PvJ4QlHYY2AMzNYnPTo8oPCfn/o1x7Ne6Qj615wobgXbWFHIBl/67HQ/Um9lMp
 G7M2bUsqYXrILlhh8olh3XxTn327WbnO7RSFyzix4XyvQZ/12HckERPw/qFDb78y
 ChnHQn0WlyRtEmgxnTGDAFJelh7evAyPsrCOoUGJh7rTbQw7Z7HybaawBwNPhd07
 oPuJABwVJOZr7wxRmv7HBWGgVBJPnnPkc0ohVwKRVSVMBW+ujUeQpQTdQpwRz49U
 XzHD43dfk3p2SezTEfZN+b4nE3phK2s367eWAmhCwM4gHmm6LpsroxyAOdc34GpC
 7slPfGPLxqY6WL9ZDTiW6hX/ayOwGjpjSkU/VtX60CENKnBvmbmXlDR7n36oou3Y
 nc45ZViy4ndG7a2eXfZ2GANp64RBS4XvNUbs4Y9XQfXEzDiolHzCrl3y1wXzJfFG
 AxrGRbWAkkUHIuBhrTD0SH91IkwAeTx9RMzyrWxiPHQvAWVXDkv7fyzQ+EdXCedE
 FtD6mUqU3jj4dZ/m5zkunsRaFdMntLrD+WGtBjM31uOFJJ23orM=
 =FIBm
 -----END PGP SIGNATURE-----

Merge tag 'nfs-for-6.15-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs

Pull NFS client bugfixes from Trond Myklebust:

 - NFS: Fix a couple of missed handlers for the ENETDOWN and ENETUNREACH
   transport errors

 - NFS: Handle Oopsable failure of nfs_get_lock_context in the unlock
   path

 - NFSv4: Fix a race in nfs_local_open_fh()

 - NFSv4/pNFS: Fix a couple of layout segment leaks in layoutreturn

 - NFSv4/pNFS Avoid sharing pNFS DS connections between net namespaces
   since IP addresses are not guaranteed to refer to the same nodes

 - NFS: Don't flush file data while holding multiple directory locks in
   nfs_rename()

* tag 'nfs-for-6.15-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs:
  NFS: Avoid flushing data while holding directory locks in nfs_rename()
  NFS/pnfs: Fix the error path in pnfs_layoutreturn_retry_later_locked()
  NFSv4/pnfs: Reset the layout state after a layoutreturn
  NFS/localio: Fix a race in nfs_local_open_fh()
  nfs: nfs3acl: drop useless assignment in nfs3_get_acl()
  nfs: direct: drop useless initializer in nfs_direct_write_completion()
  nfs: move the nfs4_data_server_cache into struct nfs_net
  nfs: don't share pNFS DS connections between net namespaces
  nfs: handle failure of nfs_get_lock_context in unlock path
  pNFS/flexfiles: Record the RPC errors in the I/O tracepoints
  NFSv4/pnfs: Layoutreturn on close must handle fatal networking errors
  NFSv4: Handle fatal ENETDOWN and ENETUNREACH errors
2025-05-16 14:29:12 -07:00
Trond Myklebust
dcd21b609d NFS: Avoid flushing data while holding directory locks in nfs_rename()
The Linux client assumes that all filehandles are non-volatile for
renames within the same directory (otherwise sillyrename cannot work).
However, the existence of the Linux 'subtree_check' export option has
meant that nfs_rename() has always assumed it needs to flush writes
before attempting to rename.

Since NFSv4 does allow the client to query whether or not the server
exhibits this behaviour, and since knfsd does actually set the
appropriate flag when 'subtree_check' is enabled on an export, it
should be OK to optimise away the write flushing behaviour in the cases
where it is clearly not needed.

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
2025-05-16 22:31:35 +02:00
Trond Myklebust
28511504f3 NFS/pnfs: Fix the error path in pnfs_layoutreturn_retry_later_locked()
If there isn't a valid layout, or the layout stateid has changed, the
cleanup after a layout return should clear out the old data.

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2025-05-16 22:31:35 +02:00
Trond Myklebust
6d6d7f91cc NFSv4/pnfs: Reset the layout state after a layoutreturn
If there are still layout segments in the layout plh_return_lsegs list
after a layout return, we should be resetting the state to ensure they
eventually get returned as well.

Fixes: 68f744797e ("pNFS: Do not free layout segments that are marked for return")
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2025-05-16 22:31:35 +02:00
Linus Torvalds
1524cb2830 xfs fixes for 6.15-rc7
Signed-off-by: Carlos Maiolino <cem@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iJUEABMJAB0WIQSmtYVZ/MfVMGUq1GNcsMJ8RxYuYwUCaCcCuwAKCRBcsMJ8RxYu
 Y4y9AX0eXpTqbVMGs+D1KG6s1ryQz3vhinZxwxZJgkHysrpXQklVXQXeOgRWV9gi
 pxM2EjoBgOr0qwMxleAvM4OqUI/2TFwBhl59CBcxnFFruU5H+snUzEWWsNw645/C
 9ONJI05uXA==
 =Wc8h
 -----END PGP SIGNATURE-----

Merge tag 'xfs-fixes-6.15-rc7' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux

Pull xfs fixes from Carlos Maiolino:
 "This includes a bug fix for a possible data corruption vector on the
  zoned allocator garbage collector"

* tag 'xfs-fixes-6.15-rc7' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
  xfs: Fix comment on xfs_trans_ail_update_bulk()
  xfs: Fix a comment on xfs_ail_delete
  xfs: Fail remount with noattr2 on a v5 with v4 enabled
  xfs: fix zoned GC data corruption due to wrong bv_offset
  xfs: free up mp->m_free[0].count in error case
2025-05-16 09:51:49 -07:00
Linus Torvalds
fee3e843b3 bcachefs fixes for 6.15-rc7
The main user reported ones are:
 
 - Fix a btree iterator locking inconsistency that's been causing us to
   go emergency read-only in evacuate: "Fix broken btree_path lock
   invariants in next_node()"
 
 - Minor btree noed cache reclaim tweak that should help with OOMs: don't
   set btree nodes as accessed on fill
 
 - Fix a bch2_bkey_clear_rebalance() issue that was causing rebalance to
   do needless work
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEKnAFLkS8Qha+jvQrE6szbY3KbnYFAmgmVZUACgkQE6szbY3K
 bnZxpQ/+Pvcj3zsccJgh68MHHKfccDHu5XkHE+zm0Dmr9B+Uq/Rvi5+2mIlcW/Bc
 j2fUK2je4+BnEkCTEeZI1yGLIAYtryPE9p/bOgrlui5eIJ1x4BhgZoF/BiVC6cLM
 9Fvi+Uq5+j+1CPMUlpWRO86D113Za/EpEKZs1LVYeBzc70WNCWWVzwrZQAJaTddH
 y5zyZpUVifnlv5+OyEey81+kVVrWhvN+Ggh/Yd+aieV6Ll8Xj6JjbWn92JB/CFy4
 le2Y2iXz7Bw1jTm8huTqtQpjqlJJTB2HrpbBvxwk+Hsacwmw6tY0gKdF+gGi1nfW
 ONkYdSDZhYkTcFv3LsVyRWOrD96RCNVZd7oBkX1P8k5rU8BgOTKroh2dsaszLs74
 xJK5mrOzo/OrnlWJdxBOwSmAN3QL3CvEFX+KqZVpAbxycHAKMYQvnyuSmuBAsZXa
 gW6zNvQE5umNbLmSkV6l0l7Bvao2zMwxra9AiGM+uK+RB8RWuTy3VphABWvvdVex
 5QLUyvKa3Q5A6APdfPawuWWvjphcj1hmktyiOBLRETBxhEJxbjNZJ98wAlpr+2yX
 5JQ9Y6kZ8t4gsMTT5Jp/SvS3vzGU6IW4HPKwQX3v+AhtrHgD17Vjl2UvqifW91V5
 6p1L4z0x783pLNoPTuaHbU3s/lVw2bBdxDq+9951WSxvA1nHSAU=
 =Vjm2
 -----END PGP SIGNATURE-----

Merge tag 'bcachefs-2025-05-15' of git://evilpiepirate.org/bcachefs

Pull bcachefs fixes from Kent Overstreet:
 "The main user reported ones are:

   - Fix a btree iterator locking inconsistency that's been causing us
     to go emergency read-only in evacuate: "Fix broken btree_path lock
     invariants in next_node()"

   - Minor btree node cache reclaim tweak that should help with OOMs:
     don't set btree nodes as accessed on fill

   - Fix a bch2_bkey_clear_rebalance() issue that was causing rebalance
     to do needless work"

* tag 'bcachefs-2025-05-15' of git://evilpiepirate.org/bcachefs:
  bcachefs: fix wrong arg to fsck_err()
  bcachefs: Fix missing commit in backpointer to missing target
  bcachefs: Fix accidental O(n^2) in fiemap
  bcachefs: Fix set_should_be_locked() call in peek_slot()
  bcachefs: Fix self deadlock
  bcachefs: Don't set btree nodes as accessed on fill
  bcachefs: Fix livelock in journal_entry_open()
  bcachefs: Fix broken btree_path lock invariants in next_node()
  bcachefs: Don't strip rebalance_opts from indirect extents
2025-05-15 14:20:48 -07:00
Linus Torvalds
74a6325597 for-6.15-rc6-tag
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEE8rQSAMVO+zA4DBdWxWXV+ddtWDsFAmglNBkACgkQxWXV+ddt
 WDtOeA/+Ifj7fYP6feVya+KF5qLXg4H0x6p+IpoBhgzOyrRFiBR9yPbOADt3MEX4
 ATpG7cHhOd8Mxaegbpz6zArHcZqO1VlPWbl+HpVJ6Ji7+N+u+eiHcSFyUT5yFIl7
 HLrJ7bxpc8xVLLsPeBOrk3c7LKkiaeAw4EmuMAY70d0oqaMJ5nqSiYFvLislTETR
 DaOoInem16WvjfEwHgXXZcfxxjqc/R8WFW1Tud+jJSkrxSQ/V1viP0G06IGq8ucz
 cHx7SM9D/myqoHa/dTwx3DeZglcsYQN5tBk0aW3HkylcXLPueFf70cGxzk1mRUw5
 zavKJ31mW73zNJs4hIFQiy2rbfyi7g/LuOFlhNT+AbDRX4HDP88+42anVlQl3VdC
 FcKL+VEtY5sgfn4kslsyo4fMbNpt0VXA7wy0qOEmHbHdnBgaYTIjqwu1LUnU/eLJ
 WQQstUkfuo+pZffaaKsR7S5r5i5xUzYjqHXF9qf1Dju9rEKYbLVtu/T3EVziO1Mc
 vdVE2xxdnuf8UTeJ+gJtcyeUJT54SihaR2qm8tErMdILMjSTPmaAQFhtRV14nQTp
 upqsJ5gesbc3++VPPmsBgcLP7UL9uN7s6NeRRanj1Zg2bZY8B+zGwhr8/k1ZmR8T
 uMr0qFrYx5SVCS2g47FRK6dWrnYgAdT5LaXA5cx02nTynU2hw1o=
 =8C8t
 -----END PGP SIGNATURE-----

Merge tag 'for-6.15-rc6-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux

Pull btrfs fixes from David Sterba:

 - fix potential endless loop when discarding a block group when
   disabling discard

 - reinstate message when setting a large value of mount option 'commit'

 - fix a folio leak when async extent submission fails

* tag 'for-6.15-rc6-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
  btrfs: add back warning for mount option commit values exceeding 300
  btrfs: fix folio leak in submit_one_async_extent()
  btrfs: fix discard worker infinite loop after disabling discard
2025-05-14 18:39:12 -07:00
Paulo Alcantara
3965c23773 smb: client: fix zero rsize error messages
cifs_prepare_read() might be called with a disconnected channel, where
TCP_Server_Info::max_read is set to zero due to reconnect, so calling
->negotiate_rize() will set @rsize to default min IO size (64KiB) and
then logging

	CIFS: VFS: SMB: Zero rsize calculated, using minimum value
	65536

If the reconnect happens in cifsd thread, cifs_renegotiate_iosize()
will end up being called and then @rsize set to the expected value.

Since we can't rely on the value of @server->max_read by the time we
call cifs_prepare_read(), try to ->negotiate_rize() only if
@cifs_sb->ctx->rsize is zero.

Reported-by: Steve French <stfrench@microsoft.com>
Fixes: c59f7c9661 ("smb: client: ensure aligned IO sizes")
Signed-off-by: Paulo Alcantara (Red Hat) <pc@manguebit.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2025-05-14 19:26:38 -05:00
Jethro Donaldson
1fe4a44b7f smb: client: fix memory leak during error handling for POSIX mkdir
The response buffer for the CREATE request handled by smb311_posix_mkdir()
is leaked on the error path (goto err_free_rsp_buf) because the structure
pointer *rsp passed to free_rsp_buf() is not assigned until *after* the
error condition is checked.

As *rsp is initialised to NULL, free_rsp_buf() becomes a no-op and the leak
is instead reported by __kmem_cache_shutdown() upon subsequent rmmod of
cifs.ko if (and only if) the error path has been hit.

Pass rsp_iov.iov_base to free_rsp_buf() instead, similar to the code in
other functions in smb2pdu.c for which *rsp is assigned late.

Cc: stable@vger.kernel.org
Signed-off-by: Jethro Donaldson <devel@jro.nz>
Signed-off-by: Steve French <stfrench@microsoft.com>
2025-05-14 19:26:15 -05:00
Kent Overstreet
9c09e59cc5 bcachefs: fix wrong arg to fsck_err()
fsck_err() needs the btree transaction passed to it if there is one - so
that it can unlock/relock around prompting userspace for fixing the
error.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-14 18:59:15 -04:00
Kent Overstreet
d1041d8eab bcachefs: Fix missing commit in backpointer to missing target
Fsck wants to do transaction commits from an outer context; it may have
other repair to do (i.e. duplicate backpointers).

But when calling backpointer_not_found() from runtime code, i.e. runtime
self healing, we should be doing the commit - the outer context expects
to just be doing lookups.

This fixes bugs where we get stuck spinning, reported as "RCU lock hold
time warnings.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-14 17:05:19 -04:00
Kent Overstreet
a12cb6f758 bcachefs: Fix accidental O(n^2) in fiemap
Since bch2_seek_pagecache_data() searches for dirty data, we only want
to call it for holes in the extents btree - otherwise we have an
accidental O(n^2), as we repeatedly search the same range.

Reported-by: Marcin Mirosław <marcin@mejor.pl>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-14 17:05:19 -04:00
Kent Overstreet
43b9fece2d bcachefs: Fix set_should_be_locked() call in peek_slot()
set_should_be_locked() needs to be called before peek_key_cache(), which
traverses other paths and may do a trans unlock/relock.

This fixes an assertion pop in path_peek_slot(), when the path we're
using is unexpectedly not uptodate.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-14 17:05:19 -04:00
Alan Huang
61198e6287 bcachefs: Fix self deadlock
Before invoking bch2_accounting_mem_mod_locked in
bch2_gc_accounting_done, we already write locked mark_lock,
in bch2_accounting_mem_insert, we lock mark_lock again.

Signed-off-by: Alan Huang <mmpgouride@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-14 17:05:19 -04:00
Kent Overstreet
19b22d04cd bcachefs: Don't set btree nodes as accessed on fill
Prevent jobs that do lots of scanning (i.e. evacuatee, scrub) from
causing OOMs.

The shrinker code seems to be having issues when it doesn't do any
freeing because it's just flipping off the acccessed bit - and the
accessed bit shouldn't be set on first use anyways.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-14 17:05:19 -04:00
Kent Overstreet
7b6759b199 bcachefs: Fix livelock in journal_entry_open()
When the journal is low on space, we might do discards from
journal_res_get() -> journal_entry_open().

Make sure we set j->can_discard correctly, so that if we're low on space
but not because discards aren't keeping up we don't livelock.

Fixes: 8e4d28036c ("bcachefs: Don't aggressively discard the journal")
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-14 17:05:19 -04:00
Kent Overstreet
b1c71cb492 bcachefs: Fix broken btree_path lock invariants in next_node()
This fixes btree locking assert pops users were seeing during evacuate:

https://github.com/koverstreet/bcachefs/issues/878

May 09 22:45:02 sharon kernel: bcachefs (68116e25-fa2d-4c6f-86c7-e8b431d792ae):   bch2_btree_insert_node(): node not locked at level 1
May 09 22:45:02 sharon kernel:   bch2_btree_node_rewrite [bcachefs]: watermark=btree no_check_rw alloc l=0-1 mode=none nodes_written=0 cl.remaining=2 journal_seq=0
May 09 22:45:02 sharon kernel:   path: idx   1 ref 1:0   S B btree=alloc level=0 pos 0:3699637:0 0:3698012:1-0:3699637:0 bch2_move_btree.isra.0+0x1db/0x490 [bcachefs] uptodate 0 locks_want 2
May 09 22:45:02 sharon kernel:     l=0 locks intent seq 4 node ffff8bd700c93600
May 09 22:45:02 sharon kernel:     l=1 locks unlocked seq 1712 node ffff8bd6fd5e7a00
May 09 22:45:02 sharon kernel:     l=2 locks unlocked seq 2295 node ffff8bd6cc725400
May 09 22:45:02 sharon kernel:     l=3 locks unlocked seq 0 node 0000000000000000

Evacuate walks btree nodes with bch2_btree_iter_next_node() and rewrites
them, bch2_btree_update_start() upgrades the path to take intent locks
as far as it needs to.

But next_node() does low level unlock/relock calls on individual nodes,
and didn't handle the case where a path is supposed to be holding
multiple intent locks. If a path has locks_want > 1, it needs to be
either holding locks on all the btree nodes (at each level) requested,
or none of them.

Fix this with a bch2_btree_path_downgrade().

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-14 17:05:19 -04:00
Kent Overstreet
cd52cc3544 bcachefs: Don't strip rebalance_opts from indirect extents
Fix bch2_bkey_clear_needs_rebalance(): indirect extents are never
supposed to have bch_extent_rebalance stripped off, because that's how
we get the IO path options when we don't have the original inode it
belonged to.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-14 17:05:19 -04:00
Linus Torvalds
1a80a098c6 execve fix for v6.15-rc7
- binfmt_elf: Move brk for static PIE even if ASLR disabled
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQRSPkdeREjth1dHnSE2KwveOeQkuwUCaCS8WQAKCRA2KwveOeQk
 u2XQAQDmhWxSh1m16/N3Utlph0h1+YpzJaJltYmrb4IV7w/WKgD/W7o1359gzhw4
 VfDc6nEbjpf678pAUc+seLnX82OYwwI=
 =da0V
 -----END PGP SIGNATURE-----

Merge tag 'execve-v6.15-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux

Pull execve fix from Kees Cook:
 "This fixes a corner case for ASLR-disabled static-PIE brk collision
  with vdso allocations:

   - binfmt_elf: Move brk for static PIE even if ASLR disabled"

* tag 'execve-v6.15-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
  binfmt_elf: Move brk for static PIE even if ASLR disabled
2025-05-14 09:15:16 -07:00
Carlos Maiolino
08c73a4b2e xfs: Fix comment on xfs_trans_ail_update_bulk()
This function doesn't take the AIL lock, but should be called
with AIL lock held. Also (hopefuly) simplify the comment.

Signed-off-by: Carlos Maiolino <cmaiolino@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Chandan Babu R <chandanbabu@kernel.org>
Signed-off-by: Carlos Maiolino <cem@kernel.org>
2025-05-14 15:37:50 +02:00
Carlos Maiolino
fa8deae92f xfs: Fix a comment on xfs_ail_delete
It doesn't return anything.

Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Carlos Maiolino <cmaiolino@redhat.com>
Reviewed-by: Chandan Babu R <chandanbabu@kernel.org>
Signed-off-by: Carlos Maiolino <cem@kernel.org>
2025-05-14 15:37:50 +02:00
Nirjhar Roy (IBM)
95b613339c xfs: Fail remount with noattr2 on a v5 with v4 enabled
Bug: When we compile the kernel with CONFIG_XFS_SUPPORT_V4=y,
remount with "-o remount,noattr2" on a v5 XFS does not
fail explicitly.

Reproduction:
mkfs.xfs -f /dev/loop0
mount /dev/loop0 /mnt/scratch
mount -o remount,noattr2 /dev/loop0 /mnt/scratch

However, with CONFIG_XFS_SUPPORT_V4=n, the remount
correctly fails explicitly. This is because the way the
following 2 functions are defined:

static inline bool xfs_has_attr2 (struct xfs_mount *mp)
{
	return !IS_ENABLED(CONFIG_XFS_SUPPORT_V4) ||
		(mp->m_features & XFS_FEAT_ATTR2);
}
static inline bool xfs_has_noattr2 (const struct xfs_mount *mp)
{
	return mp->m_features & XFS_FEAT_NOATTR2;
}

xfs_has_attr2() returns true when CONFIG_XFS_SUPPORT_V4=n
and hence, the following if condition in
xfs_fs_validate_params() succeeds and returns -EINVAL:

/*
 * We have not read the superblock at this point, so only the attr2
 * mount option can set the attr2 feature by this stage.
 */

if (xfs_has_attr2(mp) && xfs_has_noattr2(mp)) {
	xfs_warn(mp, "attr2 and noattr2 cannot both be specified.");
	return -EINVAL;
}

With CONFIG_XFS_SUPPORT_V4=y, xfs_has_attr2() always return
false and hence no error is returned.

Fix: Check if the existing mount has crc enabled(i.e, of
type v5 and has attr2 enabled) and the
remount has noattr2, if yes, return -EINVAL.

I have tested xfs/{189,539} in fstests with v4
and v5 XFS with both CONFIG_XFS_SUPPORT_V4=y/n and
they both behave as expected.

This patch also fixes remount from noattr2 -> attr2 (on a v4 xfs).

Related discussion in [1]

[1] https://lore.kernel.org/all/Z65o6nWxT00MaUrW@dread.disaster.area/

Signed-off-by: Nirjhar Roy (IBM) <nirjhar.roy.lists@gmail.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
Signed-off-by: Carlos Maiolino <cem@kernel.org>
2025-05-14 15:37:50 +02:00
Christoph Hellwig
fbecd731de xfs: fix zoned GC data corruption due to wrong bv_offset
xfs_zone_gc_write_chunk writes out the data buffer read in earlier using
the same bio, and currenly looks at bv_offset for the offset into the
scratch folio for that.  But commit 26064d3e2b ("block: fix adding
folio to bio") changed how bv_page and bv_offset are calculated for
adding larger folios, breaking this fragile logic.

Switch to extracting the full physical address from the old bio_vec,
and calculate the offset into the folio from that instead.

This fixes data corruption during garbage collection with heavy rockdsb
workloads.  Thanks to Hans for tracking down the culprit commit during
long bisection sessions.

Fixes: 26064d3e2b ("block: fix adding folio to bio")
Fixes: 080d01c41d ("xfs: implement zoned garbage collection")
Reported-by: Hans Holmberg <Hans.Holmberg@wdc.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hans Holmberg <Hans.Holmberg@wdc.com>
Tested-by: Hans Holmberg <Hans.Holmberg@wdc.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Carlos Maiolino <cem@kernel.org>
2025-05-14 15:37:49 +02:00
Wengang Wang
09dab6ce02 xfs: free up mp->m_free[0].count in error case
In xfs_init_percpu_counters(), memory for mp->m_free[0].count wasn't freed
in error case. Free it up in this patch.

Signed-off-by: Wengang Wang <wen.gang.wang@oracle.com>
Fixes: 712bae9663 ("xfs: generalize the freespace and reserved blocks handling")
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Carlos Maiolino <cem@kernel.org>
2025-05-14 15:37:49 +02:00
Kyoji Ogasawara
4ce2affc6e btrfs: add back warning for mount option commit values exceeding 300
The Btrfs documentation states that if the commit value is greater than
300 a warning should be issued. The warning was accidentally lost in the
new mount API update.

Fixes: 6941823cc8 ("btrfs: remove old mount API code")
CC: stable@vger.kernel.org # 6.12+
Reviewed-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: Kyoji Ogasawara <sawara04.o@gmail.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2025-05-12 21:39:34 +02:00
Boris Burkov
a0fd1c6098 btrfs: fix folio leak in submit_one_async_extent()
If btrfs_reserve_extent() fails while submitting an async_extent for a
compressed write, then we fail to call free_async_extent_pages() on the
async_extent and leak its folios. A likely cause for such a failure
would be btrfs_reserve_extent() failing to find a large enough
contiguous free extent for the compressed extent.

I was able to reproduce this by:

1. mount with compress-force=zstd:3
2. fallocating most of a filesystem to a big file
3. fragmenting the remaining free space
4. trying to copy in a file which zstd would generate large compressed
   extents for (vmlinux worked well for this)

Step 4. hits the memory leak and can be repeated ad nauseam to
eventually exhaust the system memory.

Fix this by detecting the case where we fallback to uncompressed
submission for a compressed async_extent and ensuring that we call
free_async_extent_pages().

Fixes: 131a821a24 ("btrfs: fallback if compressed IO fails for ENOSPC")
CC: stable@vger.kernel.org # 6.1+
Reviewed-by: Filipe Manana <fdmanana@suse.com>
Co-developed-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Boris Burkov <boris@bur.io>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2025-05-12 21:39:13 +02:00
Filipe Manana
54db6d1bdd btrfs: fix discard worker infinite loop after disabling discard
If the discard worker is running and there's currently only one block
group, that block group is a data block group, it's in the unused block
groups discard list and is being used (it got an extent allocated from it
after becoming unused), the worker can end up in an infinite loop if a
transaction abort happens or the async discard is disabled (during remount
or unmount for example).

This happens like this:

1) Task A, the discard worker, is at peek_discard_list() and
   find_next_block_group() returns block group X;

2) Block group X is in the unused block groups discard list (its discard
   index is BTRFS_DISCARD_INDEX_UNUSED) since at some point in the past
   it become an unused block group and was added to that list, but then
   later it got an extent allocated from it, so its ->used counter is not
   zero anymore;

3) The current transaction is aborted by task B and we end up at
   __btrfs_handle_fs_error() in the transaction abort path, where we call
   btrfs_discard_stop(), which clears BTRFS_FS_DISCARD_RUNNING from
   fs_info, and then at __btrfs_handle_fs_error() we set the fs to RO mode
   (setting SB_RDONLY in the super block's s_flags field);

4) Task A calls __add_to_discard_list() with the goal of moving the block
   group from the unused block groups discard list into another discard
   list, but at __add_to_discard_list() we end up doing nothing because
   btrfs_run_discard_work() returns false, since the super block has
   SB_RDONLY set in its flags and BTRFS_FS_DISCARD_RUNNING is not set
   anymore in fs_info->flags. So block group X remains in the unused block
   groups discard list;

5) Task A then does a goto into the 'again' label, calls
   find_next_block_group() again we gets block group X again. Then it
   repeats the previous steps over and over since there are not other
   block groups in the discard lists and block group X is never moved
   out of the unused block groups discard list since
   btrfs_run_discard_work() keeps returning false and therefore
   __add_to_discard_list() doesn't move block group X out of that discard
   list.

When this happens we can get a soft lockup report like this:

  [71.957] watchdog: BUG: soft lockup - CPU#0 stuck for 27s! [kworker/u4:3:97]
  [71.957] Modules linked in: xfs af_packet rfkill (...)
  [71.957] CPU: 0 UID: 0 PID: 97 Comm: kworker/u4:3 Tainted: G        W          6.14.2-1-default #1 openSUSE Tumbleweed 968795ef2b1407352128b466fe887416c33af6fa
  [71.957] Tainted: [W]=WARN
  [71.957] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.2-3-gd478f380-rebuilt.opensuse.org 04/01/2014
  [71.957] Workqueue: btrfs_discard btrfs_discard_workfn [btrfs]
  [71.957] RIP: 0010:btrfs_discard_workfn+0xc4/0x400 [btrfs]
  [71.957] Code: c1 01 48 83 (...)
  [71.957] RSP: 0018:ffffafaec03efe08 EFLAGS: 00000246
  [71.957] RAX: ffff897045500000 RBX: ffff8970413ed8d0 RCX: 0000000000000000
  [71.957] RDX: 0000000000000001 RSI: ffff8970413ed8d0 RDI: 0000000a8f1272ad
  [71.957] RBP: 0000000a9d61c60e R08: ffff897045500140 R09: 8080808080808080
  [71.957] R10: ffff897040276800 R11: fefefefefefefeff R12: ffff8970413ed860
  [71.957] R13: ffff897045500000 R14: ffff8970413ed868 R15: 0000000000000000
  [71.957] FS:  0000000000000000(0000) GS:ffff89707bc00000(0000) knlGS:0000000000000000
  [71.957] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  [71.957] CR2: 00005605bcc8d2f0 CR3: 000000010376a001 CR4: 0000000000770ef0
  [71.957] PKRU: 55555554
  [71.957] Call Trace:
  [71.957]  <TASK>
  [71.957]  process_one_work+0x17e/0x330
  [71.957]  worker_thread+0x2ce/0x3f0
  [71.957]  ? __pfx_worker_thread+0x10/0x10
  [71.957]  kthread+0xef/0x220
  [71.957]  ? __pfx_kthread+0x10/0x10
  [71.957]  ret_from_fork+0x34/0x50
  [71.957]  ? __pfx_kthread+0x10/0x10
  [71.957]  ret_from_fork_asm+0x1a/0x30
  [71.957]  </TASK>
  [71.957] Kernel panic - not syncing: softlockup: hung tasks
  [71.987] CPU: 0 UID: 0 PID: 97 Comm: kworker/u4:3 Tainted: G        W    L     6.14.2-1-default #1 openSUSE Tumbleweed 968795ef2b1407352128b466fe887416c33af6fa
  [71.989] Tainted: [W]=WARN, [L]=SOFTLOCKUP
  [71.989] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.2-3-gd478f380-rebuilt.opensuse.org 04/01/2014
  [71.991] Workqueue: btrfs_discard btrfs_discard_workfn [btrfs]
  [71.992] Call Trace:
  [71.993]  <IRQ>
  [71.994]  dump_stack_lvl+0x5a/0x80
  [71.994]  panic+0x10b/0x2da
  [71.995]  watchdog_timer_fn.cold+0x9a/0xa1
  [71.996]  ? __pfx_watchdog_timer_fn+0x10/0x10
  [71.997]  __hrtimer_run_queues+0x132/0x2a0
  [71.997]  hrtimer_interrupt+0xff/0x230
  [71.998]  __sysvec_apic_timer_interrupt+0x55/0x100
  [71.999]  sysvec_apic_timer_interrupt+0x6c/0x90
  [72.000]  </IRQ>
  [72.000]  <TASK>
  [72.001]  asm_sysvec_apic_timer_interrupt+0x1a/0x20
  [72.002] RIP: 0010:btrfs_discard_workfn+0xc4/0x400 [btrfs]
  [72.002] Code: c1 01 48 83 (...)
  [72.005] RSP: 0018:ffffafaec03efe08 EFLAGS: 00000246
  [72.006] RAX: ffff897045500000 RBX: ffff8970413ed8d0 RCX: 0000000000000000
  [72.006] RDX: 0000000000000001 RSI: ffff8970413ed8d0 RDI: 0000000a8f1272ad
  [72.007] RBP: 0000000a9d61c60e R08: ffff897045500140 R09: 8080808080808080
  [72.008] R10: ffff897040276800 R11: fefefefefefefeff R12: ffff8970413ed860
  [72.009] R13: ffff897045500000 R14: ffff8970413ed868 R15: 0000000000000000
  [72.010]  ? btrfs_discard_workfn+0x51/0x400 [btrfs 23b01089228eb964071fb7ca156eee8cd3bf996f]
  [72.011]  process_one_work+0x17e/0x330
  [72.012]  worker_thread+0x2ce/0x3f0
  [72.013]  ? __pfx_worker_thread+0x10/0x10
  [72.014]  kthread+0xef/0x220
  [72.014]  ? __pfx_kthread+0x10/0x10
  [72.015]  ret_from_fork+0x34/0x50
  [72.015]  ? __pfx_kthread+0x10/0x10
  [72.016]  ret_from_fork_asm+0x1a/0x30
  [72.017]  </TASK>
  [72.017] Kernel Offset: 0x15000000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
  [72.019] Rebooting in 90 seconds..

So fix this by making sure we move a block group out of the unused block
groups discard list when calling __add_to_discard_list().

Fixes: 2bee7eb8bb ("btrfs: discard one region at a time in async discard")
Link: https://bugzilla.suse.com/show_bug.cgi?id=1242012
CC: stable@vger.kernel.org # 5.10+
Reviewed-by: Boris Burkov <boris@bur.io>
Reviewed-by: Daniel Vacek <neelx@suse.com>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2025-05-12 21:38:56 +02:00
Linus Torvalds
8b64199a7f \n
-----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCAAdFiEEq1nRK9aeMoq1VSgcnJ2qBz9kQNkFAmgiIbAACgkQnJ2qBz9k
 QNlbuQf+K3r9pPtR2IOfeWj5lJtY0ZYjd51qazTZ48OdGsSnjU9raKsWChbqaSAZ
 90TPK7/7QHXvebuT2akpC5ko50k6gVwGpEgFTJP1/b6SnuUelm8bKy6i4FK+hvpR
 9hrdZfWCe6Nm3oCJ/4buAJVXgNDbnZYeD7dFElJ863MxyytNYyvuKaUN4tlMTDMA
 9Z0N2FfOXYwRWRF7a5o9JulJh0cgdlq3w82w/ZePo7qNuVTL127ESRQpqu2/31js
 f2oOeXJ2qsMGhzyvouZkCseFfbHDyg/XqVBTFpE11smqKJf6/+0Uf4RABPxvmsIU
 /vy4YRXWQDRSAUj9CedYa6S/8VgNMw==
 =xAt3
 -----END PGP SIGNATURE-----

Merge tag 'udf_for_v6.15-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs

Pull UDF fix from Jan Kara:
 "Fix a bug in UDF inode eviction leading to spewing pointless
  error messages"

* tag 'udf_for_v6.15-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
  udf: Make sure i_lenExtents is uptodate on inode eviction
2025-05-12 10:23:20 -07:00