Fix a regression in overlayfs caused by an fsverity API change
-----BEGIN PGP SIGNATURE-----
iIoEABYIADIWIQSacvsUNc7UX4ntmEPzXCl4vpKOKwUCaf97zhQcZWJpZ2dlcnNA
a2VybmVsLm9yZwAKCRDzXCl4vpKOK+PtAQDzevIJ21gYwhimSRxpbB/eDRQO3gD2
wz4bz2lXZiCjIwEA8rktrlFgrLeqo1W252YXGpvlczf5qgupos3t4Hm8LAk=
=WsMl
-----END PGP SIGNATURE-----
Merge tag 'fsverity-for-linus' of git://git.kernel.org/pub/scm/fs/fsverity/linux
Pull fsverity fix from Eric Biggers:
"Fix a regression in overlayfs caused by an fsverity API change"
* tag 'fsverity-for-linus' of git://git.kernel.org/pub/scm/fs/fsverity/linux:
ovl: fix verity lazy-load guard broken by fsverity_active() semantic change
-----BEGIN PGP SIGNATURE-----
iQGzBAABCgAdFiEE6fsu8pdIjtWE/DpLiiy9cAdyT1EFAmn+DfMACgkQiiy9cAdy
T1FlwQv/bOScs7kYk5M5cCUf8kvA3kHBBmXcewSXYVzEaspJFd49IOrbejh07UXR
KmfJ4zgX3usbFNzXkmm8AKrax9ZJd8vmdey7/+ELxuBoYiyyDTATZ/VG+yDae0Cu
zU7pZNv99LppFkkxQM+7hpBtbazRUTZu3VYprFZ+UCWPupKZs/fQm9huBzJPf2bn
dMkojp/AAOGmhuRok3DWA1fu/BvFgslXPk4QohIfWxd0zRGVXQLRkOXvVI34bhR2
IOLH1PohkFsajqWClEyikCaFjhW8ZpmmHVl2t+NZer/wYoq2Mp2Ad9NkILmfrWR1
w4NSxh73emsllZpDkXYULlM9voxnjIXpvg/wPP+DA4yhuThwluJyCgsEkoInMw6X
mLM8JiD4EMQhxKiZwtrO4gd/TshSBhm01ly0a6VwvV2p1mvW2cJH2VAZyoC+xN8d
CabEmVnJuiwh4SPwKwsJN3bePwvjp30j1oVRspQthTQRrunyY4hkXr3z2Hpo6TNb
tMudF/Qh
=A8aI
-----END PGP SIGNATURE-----
Merge tag 'v7.1-rc3-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6
Pull smb client fixes from Steve French:
- Fix for two ACL issues (security fix to validate dacloffset better
and chmod fix)
- Fix out of bounds reads (in check_wsl_eas and smb2_check_msg for
symlinks)
- Two Kerberos fixes including an important one when AES-256 encryption
chosen
- Fix open_cached_dir problem when directory leases disabled
* tag 'v7.1-rc3-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6:
smb: client: validate dacloffset before building DACL pointers
smb/client: fix out-of-bounds read in smb2_compound_op()
smb/client: fix out-of-bounds read in symlink_data()
smb: client: Zero-pad short GSS session keys per MS-SMB2
smb: client: Use FullSessionKey for AES-256 encryption key derivation
smb: client: use kzalloc to zero-initialize security descriptor buffer
cifs: abort open_cached_dir if we don't request leases
parse_sec_desc(), build_sec_desc(), and the chown path in
id_mode_to_cifs_acl() all add the server-supplied dacloffset to pntsd
before proving a DACL header fits inside the returned security
descriptor.
On 32-bit builds a malicious server can return dacloffset near
U32_MAX, wrap the derived DACL pointer below end_of_acl, and then slip
past the later pointer-based bounds checks. build_sec_desc() and
id_mode_to_cifs_acl() can then dereference DACL fields from the wrapped
pointer in the chmod/chown rewrite paths.
Validate dacloffset numerically before building any DACL pointer and
reuse the same helper at the three DACL entry points.
Fixes: bc3e9dd9d1 ("cifs: Change SIDs in ACEs while transferring file ownership.")
Cc: stable@vger.kernel.org
Assisted-by: Claude:claude-opus-4-6
Signed-off-by: Michael Bommarito <michael.bommarito@gmail.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
If a server sends a truncated response but a large OutputBufferLength, and
terminates the EA list early, check_wsl_eas() returns success without
validating that the entire OutputBufferLength fits within iov_len.
Then smb2_compound_op() does:
memcpy(idata->wsl.eas, data[0], size[0]);
Where size[0] is OutputBufferLength. If iov_len is smaller than size[0],
memcpy can read beyond the end of the rsp_iov allocation and leak adjacent
kernel heap memory.
Link: https://lore.kernel.org/linux-cifs/d998240c-aca9-420d-9dbd-f5ba24af19e0@chenxiaosong.com/
Fixes: ea41367b2a ("smb: client: introduce SMB2_OP_QUERY_WSL_EA")
Cc: stable@vger.kernel.org
Signed-off-by: Zisen Ye <zisenye@stu.xidian.edu.cn>
Reviewed-by: ChenXiaoSong <chenxiaosong@kylinos.cn>
Signed-off-by: Steve French <stfrench@microsoft.com>
Since smb2_check_message() returns success without length validation for
the symlink error response, in symlink_data() it is possible for
iov->iov_len to be smaller than sizeof(struct smb2_err_rsp). If the buffer
only contains the base SMB2 header (64 bytes), accessing
err->ErrorContextCount (at offset 66) or err->ByteCount later in
symlink_data() will cause an out-of-bounds read.
Link: https://lore.kernel.org/linux-cifs/297d8d9b-adf7-42fd-a1c2-5b1f230032bc@chenxiaosong.com/
Fixes: 76894f3e2f ("cifs: improve symlink handling for smb2+")
Cc: Stable@vger.kernel.org
Signed-off-by: Zisen Ye <zisenye@stu.xidian.edu.cn>
Reviewed-by: ChenXiaoSong <chenxiaosong@kylinos.cn>
Signed-off-by: Steve French <stfrench@microsoft.com>
Per MS-SMB2 section 3.2.5.3, Session.SessionKey is the first 16 bytes
of the GSS cryptographic key, right-padded with zero bytes if the key
is shorter than 16 bytes.
SMB2_auth_kerberos() copies the GSS session key from the cifs.upcall
response using kmemdup(msg->data, msg->sesskey_len, ...) and stores
the GSS-reported length verbatim in ses->auth_key.len. generate_key()
reads SMB2_NTLMV2_SESSKEY_SIZE bytes from this buffer when feeding the
HMAC-SHA256 KDF for signing key derivation. If a GSS mechanism returns
a session key shorter than 16 bytes (e.g. a deprecated single-DES
Kerberos enctype with an 8-byte session key), the KDF call performs an
out-of-bounds slab read and derives keys that do not match the server,
which pads per the spec.
Modern KDCs disable short-key enctypes by default, so this is latent
rather than reachable in production, but it is still a kernel heap
over-read.
Allocate auth_key.response with kzalloc() at a length of
max(msg->sesskey_len, SMB2_NTLMV2_SESSKEY_SIZE), copy the GSS key in,
and rely on kzalloc()'s zero initialization for the spec-mandated
padding. Set ses->auth_key.len to the padded length. Larger GSS keys
(e.g. the 32-byte aes256-cts-hmac-sha1-96 session key) continue to be
stored at their natural length, preserving the FullSessionKey path.
Emit a cifs_dbg(VFS, ...) message when a short key is encountered to
surface deprecated-enctype usage.
NTLMv2 and NTLMSSP code paths produce a 16-byte session key by
construction and are unaffected.
Signed-off-by: Piyush Sachdeva <psachdeva@microsoft.com>
Signed-off-by: Piyush Sachdeva <s.piyush1024@gmail.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
When Kerberos authentication is used with AES-256 encryption (AES-256-CCM
or AES-256-GCM), the SMB3 encryption and decryption keys must be derived
using the full session key (Session.FullSessionKey) rather than just the
first 16 bytes (Session.SessionKey).
Per MS-SMB2 section 3.2.5.3.1, when Connection.Dialect is "3.1.1" and
Connection.CipherId is AES-256-CCM or AES-256-GCM, Session.FullSessionKey
must be set to the full cryptographic key from the GSS authentication
context. The encryption and decryption key derivation (SMBC2SCipherKey,
SMBS2CCipherKey) must use this FullSessionKey as the KDF input. The
signing key derivation continues to use Session.SessionKey (first 16
bytes) in all cases.
Previously, generate_key() hardcoded SMB2_NTLMV2_SESSKEY_SIZE (16) as the
HMAC-SHA256 key input length for all derivations. When Kerberos with
AES-256 provides a 32-byte session key, the KDF for encryption/decryption
was using only the first 16 bytes, producing keys that did not match the
server's, causing mount failures with sec=krb5 and require_gcm_256=1.
Add a full_key_size parameter to generate_key() and pass the appropriate
size from generate_smb3signingkey():
- Signing: always SMB2_NTLMV2_SESSKEY_SIZE (16 bytes)
- Encryption/Decryption: ses->auth_key.len when AES-256, otherwise 16
Also fix cifs_dump_full_key() to report the actual session key length for
AES-256 instead of hardcoded CIFS_SESS_KEY_SIZE, so that userspace tools
like Wireshark receive the correct key for decryption.
Cc: <stable@vger.kernel.org>
Reviewed-by: Bharath SM <bharathsm@microsoft.com>
Signed-off-by: Piyush Sachdeva <psachdeva@microsoft.com>
Signed-off-by: Piyush Sachdeva <s.piyush1024@gmail.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
Commit f77f281b61 ("fsverity: use a hashtable to find the
fsverity_info") made fsverity_active() check whether the inode has the
verity flag, rather than whether the inode's fsverity_info is loaded.
This broke ovl_ensure_verity_loaded(), which wants to load the
fsverity_info for any verity inodes that haven't had it loaded yet.
Therefore, to check that the fsverity_info hasn't yet been loaded, use
fsverity_get_info(inode) == NULL instead of !fsverity_active(inode).
Also, since fsverity_get_info() now involves a hash table lookup, put
the more lightweight IS_VERITY() flag check first.
Fixes: f77f281b61 ("fsverity: use a hashtable to find the fsverity_info")
Cc: stable@vger.kernel.org
Link: https://github.com/bootc-dev/bootc/issues/2174
Signed-off-by: Colin Walters <walters@verbum.org>
Acked-by: Amir Goldstein <amir73il@gmail.com>
Link: https://patch.msgid.link/20260505224257.23213-1-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
- Fix issues in EFI graceful recovery on x86 introduced by changes to
the kernel mode FPU APIs
- I-cache coherency fixes for the LoongArch EFI stub
- Locking fix for EFI pstore
- Code tweak for efivarfs
-----BEGIN PGP SIGNATURE-----
iHUEABYKAB0WIQQQm/3uucuRGn1Dmh0wbglWLn0tXAUCafml9AAKCRAwbglWLn0t
XJSnAQD400URJjhvoFRAkGNEt+ETSGYU03s07wzv8efsfjLCyAEA9gRhsa3D0ArK
21zd5xeLHgeMhLCA5ZkO+HkhFsd8hQk=
=45fv
-----END PGP SIGNATURE-----
Merge tag 'efi-fixes-for-v7.1-1' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi
Pull EFI fixes from Ard Biesheuvel:
- Fix issues in EFI graceful recovery on x86 introduced by changes to
the kernel mode FPU APIs
- I-cache coherency fixes for the LoongArch EFI stub
- Locking fix for EFI pstore
- Code tweak for efivarfs
* tag 'efi-fixes-for-v7.1-1' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi:
x86/efi: Restore IRQ state in EFI page fault handler
x86/efi: Fix graceful fault handling after FPU softirq changes
efi/libstub: Synchronize instruction cache after kernel relocation
efi/loongarch: Implement efi_cache_sync_image()
efi/libstub: Move efi_relocate_kernel() into its only remaining user
efi: pstore: Drop efivar lock when efi_pstore_open() returns with an error
efivarfs: use QSTR() in efivarfs_alloc_dentry
Commit 62e7dd0a39 ("smb: common: change the data type of num_aces
to le16") split struct smb_acl's __le32 num_aces field into __le16
num_aces and __le16 reserved. The reserved field corresponds to Sbz2
in the MS-DTYP ACL wire format, which must be zero [1].
When building an ACL descriptor in build_sec_desc(), we are using a
kmalloc()'ed descriptor buffer and writing the fields explicitly using
le16() writes now. This never writes to the 2 byte reserved field,
leaving it as uninitialized heap data.
When the reserved field happens to contain non-zero slab garbage,
Samba rejects the security descriptor with "ndr_pull_security_descriptor
failed: Range Error", causing chmod to fail with EINVAL.
Change kmalloc() to kzalloc() to ensure the entire buffer is
zero-initialized.
Fixes: 62e7dd0a39 ("smb: common: change the data type of num_aces to le16")
Cc: stable@vger.kernel.org
Signed-off-by: Bjoern Doebel <doebel@amazon.de>
Assisted-by: Kiro:claude-opus-4.6
[1] https://learn.microsoft.com/en-us/openspecs/windows_protocols/ms-dtyp/20233ed8-a6c6-4097-aafa-dd545ed24428
Signed-off-by: Steve French <stfrench@microsoft.com>
It is possible that SMB2_open_init may not set lease context based
on the requested oplock level. This can happen when leases have been
temporarily or permanently disabled. When this happens, we will have
open_cached_dir making an open without lease context and the response
will anyway be rejected by open_cached_dir (thereby forcing a close to
discard this open). That's unnecessary two round-trips to the server.
This change adds a check before making the open request to the server
to make sure that SMB2_open_init did add the expected lease context
to the open in open_cached_dir.
Cc: <stable@vger.kernel.org>
Reviewed-by: Bharath SM <bharathsm@microsoft.com>
Signed-off-by: Shyam Prasad N <sprasad@microsoft.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
- Fix a NULL pointer dereference in ntfs_index_walk_down() by validating
index block allocation.
- Fix a memory leak of the symlink target string in
ntfs_reparse_set_wsl_symlink() during error paths.
- Prevent VCN overflow and validate lowest_vcn in
ntfs_mapping_pairs_decompress() to avoid runlist corruption.
- Fix a page reference leak in ntfs_write_iomap_end_resident() when
attribute search context allocation fails.
- Fix an invalid PTR_ERR() usage on a valid folio pointer in
__ntfs_bitmap_set_bits_in_run().
- Correct directory link counting by dropping nlink only when the MFT
record link count reaches zero for WIN32/DOS aliases.
- Fix an uninitialized variable usage in ntfs_mapping_pairs_decompress()
by returning an error pointer directly.
-----BEGIN PGP SIGNATURE-----
iQJKBAABCgA0FiEE6NzKS6Uv/XAAGHgyZwv7A1FEIQgFAmn1SX4WHGxpbmtpbmpl
b25Aa2VybmVsLm9yZwAKCRBnC/sDUUQhCLFtEACQou87tSAG0pjuOe4FDW2/ijTJ
B4CWQ5AxSU/G8Mts1Or9bvjKMA2zI8A/N8Bx0kzZviB8G1TiIs2y8KWqJajLCXsX
dEvLwu1UUvtYlclw3sVdo+7oA8lB9NQB5LNlaubTzkDeCXHpkfQ5/+zgbU2Bdpjf
5qe34klrr8jU6KHIJnQlpiqJj8wYvNXizDRYkYZw0tMzNGlzM5csO8cZ4HNW8ENK
+D7CAKBDW4JA8AaaBC9eGL3cpl/a8a1X46O1LoEoCeH14FKGEGAoSa5z5aWBDJpg
X84v/19iP9Ti2poh2I5KZZfgKxFjsQodXYoPRofrXCGpVYUveTRmfEZ//qt33mr/
Y+bX5iTBjP0H4OLr5o8TZNgHXqjsR5/kkbnz71VEZey53U3/fFLC6L0tt9S9vLnb
mC2YghFgmcgQEIYz3S79F8K0JBEl4gSUsMNQtM8+vjqpYRsqFSSUYSEUqEJWgdaK
1tnzbZlGMTgiiNO5EdqZXLIGqsJsckUfi0Qr3tnzdw2CWqj6Q0fCbBV0KVfeLYuY
LtFfG6W2A8KUAvX+Nc6+MiQ887A9F8VYjR4sIC633IISiU05Kfd3OWP4Bx+05Yty
wt6cccm+gCMBVMVacRDccfK+ovIDN50r+7Flbuw1jw28rxcbe5tVmoKrC3HQ/RYr
hIXUXwqqCX5VMxsAOA==
=MosZ
-----END PGP SIGNATURE-----
Merge tag 'ntfs-for-7.1-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/linkinjeon/ntfs
Pull ntfs fixes from Namjae Jeon:
- Fix a NULL pointer dereference in ntfs_index_walk_down() by
validating index block allocation
- Fix a memory leak of the symlink target string in
ntfs_reparse_set_wsl_symlink() during error paths
- Prevent VCN overflow and validate lowest_vcn in
ntfs_mapping_pairs_decompress() to avoid runlist corruption
- Fix a page reference leak in ntfs_write_iomap_end_resident()
when attribute search context allocation fails
- Fix an invalid PTR_ERR() usage on a valid folio pointer in
__ntfs_bitmap_set_bits_in_run()
- Correct directory link counting by dropping nlink only when
the MFT record link count reaches zero for WIN32/DOS aliases
- Fix an uninitialized variable in ntfs_mapping_pairs_decompress()
by returning an error pointer directly
* tag 'ntfs-for-7.1-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/linkinjeon/ntfs:
ntfs: Use return instead of goto in ntfs_mapping_pairs_decompress()
ntfs: drop nlink once for WIN32/DOS aliases
ntfs: fix invalid PTR_ERR() usage in __ntfs_bitmap_set_bits_in_run()
ntfs: fix error handling in ntfs_write_iomap_end_resident()
ntfs: fix VCN overflow in ntfs_mapping_pairs_decompress()
ntfs: fix WSL symlink target leak on reparse failure
ntfs: fix NULL dereference in ntfs_index_walk_down()
smb_inherit_dacl() walks the parent directory DACL loaded from the
security descriptor xattr. It verifies that each ACE contains the fixed
SID header before using it, but does not verify that the variable-length
SID described by sid.num_subauth is fully contained in the ACE.
A malformed inheritable ACE can advertise more subauthorities than are
present in the ACE. compare_sids() may then read past the ACE.
smb_set_ace() also clamps the copied destination SID, but used the
unchecked source SID count to compute the inherited ACE size. That could
advance the temporary inherited ACE buffer pointer and nt_size accounting
past the allocated buffer.
Fix this by validating the parent ACE SID count and SID length before
using the SID during inheritance. Compute the inherited ACE size from the
copied SID so the size matches the bounded destination SID. Reject the
inherited DACL if size accumulation would overflow smb_acl.size or the
security descriptor allocation size.
Fixes: e2f34481b2 ("cifsd: add server-side procedures for SMB3")
Signed-off-by: Shota Zaizen <s@zaizen.me>
Acked-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
The kernel test robot reported W=1 build warnings for ksmbd_conn_get()
and ksmbd_conn_put() due to missing parameter descriptions.
Add the @conn description to fix these warnings.
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
Non-pipe shares must have a duplicated backing path before they can be
published. share_config_request() currently calls kstrndup() for that
path, but if the allocation fails it leaves ret unchanged. If veto list
parsing succeeds and share->name exists, the partially built share is
still inserted into the global share table with share->path left NULL.
A later share-root SMB2 create uses tree_conn->share_conf->path as the
lookup root. If the share was published with path == NULL, that request
passes a NULL pathname into do_getname_kernel()/strlen() and can crash
the ksmbd worker.
Set ret = -ENOMEM when path duplication fails so the incomplete share is
destroyed before publication.
Fixes: e2f34481b2 ("cifsd: add server-side procedures for SMB3")
Signed-off-by: Shuhao Fu <sfual@cse.ust.hk>
Acked-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
ksmbd_durable_scavenger() has two related races against any walker
that iterates f_ci->m_fp_list, including ksmbd_lookup_fd_inode()
(used by ksmbd_vfs_rename) and the share-mode checks in
fs/smb/server/smb_common.c.
(1) fp->node list-head reuse. Durable-preserved handles can remain
linked on f_ci->m_fp_list after session teardown so share-mode checks
still see them while the handle is reconnectable. The scavenger
collected expired handles by adding fp->node to a local
scavenger_list after removing them from the global durable idr.
Because fp->node is the same list_head used by m_fp_list,
list_add(&fp->node, &scavenger_list) overwrites the m_fp_list links
and corrupts both lists. CONFIG_DEBUG_LIST can report this on the
share-mode walk path.
(2) Refcount race against m_fp_list walkers. The scavenger qualifies
an expired durable handle with atomic_read(&fp->refcount) > 1 and
fp->conn under global_ft.lock, removes fp from global_ft, then drops
global_ft.lock before unlinking fp from m_fp_list and freeing it.
During that gap fp is still linked on m_fp_list with f_state ==
FP_INITED. ksmbd_lookup_fd_inode() under m_lock read calls
ksmbd_fp_get() (atomic_inc_not_zero on refcount that is still 1) and
takes a live reference; the scavenger then unlinks and frees fp
while the holder owns a reference, leading to UAF on the holder's
subsequent ksmbd_fd_put() and on any field reads performed by a
concurrent share-mode walker that iterates m_fp_list without taking
ksmbd_fp_get() (smb_check_perm_dleases-like paths).
Fix both:
* Stop reusing fp->node as a scavenger-private list node. Remove
one expired handle from global_ft under global_ft.lock, take an
explicit transient reference, drop the lock, unlink fp->node
from m_fp_list under f_ci->m_lock, then drop both the durable
lifetime and transient references with atomic_sub_and_test(2,
&fp->refcount). If the scavenger is the last putter the close
runs there; otherwise an in-flight holder that already raced
through the m_fp_list lookup owns the final close via its
ksmbd_fd_put() path. The one-at-a-time disposal can rescan the
durable idr when multiple handles expire in the same pass, but
durable scavenging is a background expiration path and the final
full scan recomputes min_timeout before the next wait.
* Clear fp->persistent_id inside __ksmbd_remove_durable_fd() right
after idr_remove(), so a delayed final close from a holder that
snatched fp does not re-issue idr_remove() on a persistent id
that idr_alloc_cyclic() in ksmbd_open_durable_fd() may have
already handed out to a brand-new durable handle.
* Bypass the per-conn open_files_count decrement in
__put_fd_final() when fp is detached from any session table
(fp->conn cleared by session_fd_check() at durable preserve --
paired with the volatile_id clear at unpublish, so checking
fp->conn alone is sufficient). The walker that owns the final
close runs from an unrelated work->conn whose
stats.open_files_count never tracked this durable fp; without
this guard the holder would underflow that unrelated counter.
The two races are folded into one patch because patch (1) alone
cleans up the corrupted list but leaves a deterministic UAF window
for m_fp_list walkers that the transient-reference and
persistent_id discipline in (2) close; bisecting onto an
intermediate state would land on a UAF that pre-patch chaos merely
made less reproducible.
Validation:
* CONFIG_DEBUG_LIST coverage for the list_head reuse path.
* KASAN-enabled direct SMB2 durable-handle coverage that exercised
ksmbd_durable_scavenger() and non-NULL ksmbd_lookup_fd_inode()
returns while durable handles expired under concurrent rename
lookups, with no KASAN, UAF, list-corruption, ODEBUG, or WARNING
reports.
* checkpatch --strict
* make -j$(nproc) M=fs/smb/server
Fixes: d484d621d4 ("ksmbd: add durable scavenger timer")
Signed-off-by: DaeMyung Kang <charsyam@gmail.com>
Acked-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
__close_file_table_ids() is the per-session teardown that closes every
fp belonging to a session (or to one tree connect on that session) by
walking the session's volatile-id idr. The current loop has three
related problems on busy or racing workloads:
* Sleeping under ft->lock. The session-teardown skip callback,
session_fd_check(), already sleeps in ksmbd_vfs_copy_durable_owner()
-> kstrdup(GFP_KERNEL) and down_write(&fp->f_ci->m_lock) (a
rw_semaphore). Running the callback inside write_lock(&ft->lock)
trips CONFIG_DEBUG_ATOMIC_SLEEP / CONFIG_PROVE_LOCKING on a
durable-fd workload.
* Refcount accounting blind to f_state. The unconditional
atomic_dec_and_test(&fp->refcount) does not distinguish
FP_INITED (idr-owned reference still intact) from FP_CLOSED (an
earlier ksmbd_close_fd() already consumed the idr-owned reference
while leaving fp in the idr because a holder kept refcount
non-zero). When the latter races with teardown the same path
over-decrements into a holder reference and ksmbd_fd_put() later
UAFs that holder.
* FP_NEW window. Between __open_id() publishing fp into the
session idr and ksmbd_update_fstate(..., FP_INITED) committing the
transition at the end of smb2_open(), an fp is in FP_NEW and an
intervening teardown that takes a transient reference and
unpublishes the volatile id leaves the original idr-owned
reference orphaned -- the opener is unaware that fp has been
unpublished, returns success to the client, and the fp leaks at
refcount = 1.
Refactor __close_file_table_ids() to take a transient reference on fp
and unpublish fp from the session idr *under ft->lock* before calling
skip() outside the lock. A transient ref protects lifetime but not
concurrent field mutation, so the idr_remove() is what keeps
__ksmbd_lookup_fd() through this session's idr from granting a new
ksmbd_fp_get() reference to an fp whose fp->conn / fp->tcon /
fp->volatile_id / op->conn / lock_list links are about to be rewritten
by session_fd_check(). Durable reconnect is unaffected because it
reaches fp through the global durable table (ksmbd_lookup_durable_fd
-> global_ft).
Decide n_to_drop together with any FP_INITED -> FP_CLOSED transition
under ft->lock so teardown and ksmbd_close_fd() never both consume the
idr-owned reference. See ksmbd_mark_fp_closed() for the per-state
accounting. For the FP_NEW path to be safe, the opener has to learn
that fp was unpublished: ksmbd_update_fstate() now returns -ENOENT
when an FP_NEW -> FP_INITED transition finds f_state already advanced
or the volatile id cleared (both committed by teardown under
ft->lock); smb2_open() propagates that as STATUS_OBJECT_NAME_INVALID
and drops the original reference via ksmbd_fd_put().
The list removal cannot be left for a deferred final putter because
fp->volatile_id has already been cleared and __ksmbd_remove_fd() will
intentionally skip both idr_remove() and list_del_init(). Move the
m_fp_list unlink in __ksmbd_remove_fd() above the volatile-id check so
that an FP_NEW fp that happened to be added to m_fp_list (smb2_open()
adds fp->node before ksmbd_update_fstate() runs) is still cleaned up
on the deferred putter path; list_del_init() on an empty node is a
no-op and remains safe for fps that were never added.
Add a defensive guard in session_fd_check() that refuses non-FP_INITED
fps so that even if a teardown reaches an FP_NEW fp it falls into the
close branch (where the n_to_drop = 1 accounting keeps the opener's
reference alive) instead of the durable-preserve branch (which mutates
fp->conn / fp->tcon).
Validation on a debug kernel additionally built with CONFIG_DEBUG_LIST
and CONFIG_DEBUG_OBJECTS_WORK used a same-session two-tcon workload
(open/write storm on one tcon, 50 tree disconnects on the other) and
reported no list-corruption, work_struct ODEBUG, sleep-in-atomic,
lockdep or kmemleak reports. Reverting only the
__close_file_table_ids() hunk while keeping a forced-is_reconnectable()
harness produced the expected sleep-in-atomic at vfs_cache.c:1095,
confirming the ft->lock-out-of-sleepable-skip discipline.
KASAN-enabled direct SMB2 coverage with durable handles enabled
exercised ksmbd_close_tree_conn_fds(), ksmbd_close_session_fds(),
the FP_NEW failure path, tree_conn_fd_check(), and a non-zero
session_fd_check() durable-preserve return. This produced no KASAN,
DEBUG_LIST, ODEBUG, or WARNING reports.
Fixes: f441584858 ("cifsd: add file operations")
Signed-off-by: DaeMyung Kang <charsyam@gmail.com>
Acked-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
ksmbd_conn_free() is one of four sites that can observe the last
refcount drop of a struct ksmbd_conn. The other three
fs/smb/server/connection.c ksmbd_conn_r_count_dec()
fs/smb/server/oplock.c __free_opinfo()
fs/smb/server/vfs_cache.c session_fd_check()
end the conn with a bare kfree(), skipping
ida_destroy(&conn->async_ida) and
conn->transport->ops->free_transport(conn->transport). Whenever one
of them is the last putter, the embedded async_ida and the entire
transport struct leak -- for TCP, that is also the struct socket and
the kvec iov.
__free_opinfo() being a final putter is not theoretical. opinfo_put()
queues the callback via call_rcu(&opinfo->rcu, free_opinfo_rcu), so
ksmbd_server_terminate_conn() can deposit N opinfo releases in RCU and
have ksmbd_conn_free() run in the handler thread before any of them
fire. ksmbd_conn_free() then observes refcnt > 0 and short-circuits;
the last RCU-delivered __free_opinfo() falls onto its bare kfree(conn)
branch and the transport is lost.
A/B validation in a QEMU/virtme guest, mounting //127.0.0.1/testshare:
each iteration holds 8 files open via sleep processes, force-closes
TCP with "ss -K sport = :445", kills the holders, lazy-umounts;
repeated 10 times, then ksmbd shutdown and kmemleak scan.
state conn_alloc conn_free tcp_free opi_rcu kmemleak
---------- ---------- --------- -------- ------- --------
pre-patch 20 20 10 160 7
with patch 20 20 20 160 0
Pre-patch conn_free=20 with tcp_free=10 directly demonstrates the
bare-kfree paths skipping transport cleanup; kmemleak backtraces point
into struct tcp_transport / iov. With this patch tcp_free matches
conn_free at 20/20 and kmemleak is clean.
Move the per-struct final release into __ksmbd_conn_release_work() and
route the three bare-kfree final-put sites through a new
ksmbd_conn_put(). Those sites now pair ida_destroy() and
free_transport() with kfree(conn) regardless of which holder happens
to release the last reference. stop_sessions() only triggers the
transport shutdown and does not itself drop the last conn reference,
so it is unaffected.
The centralized release reaches sock_release() -> tcp_close() ->
lock_sock_nested() (might_sleep) from every final putter, including
__free_opinfo() invoked from an RCU softirq callback, which trips
CONFIG_DEBUG_ATOMIC_SLEEP. Defer the release to a dedicated
ksmbd_conn_wq workqueue so ksmbd_conn_put() is safe from any
non-sleeping context.
Make ksmbd_file own a strong connection reference while fp->conn is
non-NULL so durable-preserve and final-close paths cannot dereference
a stale connection. ksmbd_open_fd() and ksmbd_reopen_durable_fd()
take the reference via ksmbd_conn_get() (the latter also reorders the
fp->conn / fp->tcon assignments before __open_id() so the published fp
is never observed with fp->conn == NULL); session_fd_check() and
__ksmbd_close_fd() drop it via ksmbd_conn_put(). With that invariant,
session_fd_check() can take a local conn pointer once and use it
across the m_op_list and lock_list iterations even though op->conn
puts may otherwise drop the last reference.
At module exit the workqueue is flushed and destroyed after
rcu_barrier(), so any release queued by a trailing RCU callback is
drained before the inode hash and module text go away.
Fixes: ee426bfb9d ("ksmbd: add refcnt to ksmbd_conn struct")
Signed-off-by: DaeMyung Kang <charsyam@gmail.com>
Acked-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
ib_dma_map_sg() modifies the provided scatterlist and returns the
number of mapped entries, which can be fewer than the requested
mr->sgt.nents if the DMA controller coalesces contiguous memory
segments. Passing the original, uncoalesced count to ib_map_mr_sg()
causes memory registration failures if coalescing actually occurs.
Capture the actual mapped count returned by ib_dma_map_sg() and pass it
to ib_map_mr_sg() to ensure correct MR registration.
Also update the ib_dma_map_sg() error logging to drop the error
pointer formatting, since the return value is an integer count
rather than an error code.
Ensure a proper error code (-EIO) is assigned when DMA mapping or
MR registration fails.
Fixes: de5ef8ec3c ("smb: smbdirect: introduce smbdirect_mr.c with client mr code")
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=221408
Reviewed-by: Stefan Metzmacher <metze@samba.org>
Acked-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Yi Kuo <yi@yikuo.dev>
Signed-off-by: Steve French <stfrench@microsoft.com>
This makes it easier to rebuild cifs.ko and ksmbd.ko against
a running kernel.
Suggested-by: Christoph Hellwig <hch@infradead.org>
Link: https://lore.kernel.org/linux-cifs/aehrPuY60VMcYGU8@infradead.org/
Cc: Steve French <smfrench@gmail.com>
Cc: Tom Talpey <tom@talpey.com>
Cc: Long Li <longli@microsoft.com>
Cc: Namjae Jeon <linkinjeon@kernel.org>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: linux-cifs@vger.kernel.org
Cc: samba-technical@lists.samba.org
Signed-off-by: Stefan Metzmacher <metze@samba.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
This is a better solution than
EXPORT_SYMBOL_FOR_MODULES(__sym, "cifs,ksmbd") as it makes
it possible to rebuild smbdirect.ko against a
running kernel and then load the existing cifs.ko and ksmbd.ko
from the running kernel.
Suggested-by: Christoph Hellwig <hch@infradead.org>
Link: https://lore.kernel.org/linux-cifs/aehrPuY60VMcYGU8@infradead.org/
Cc: Steve French <smfrench@gmail.com>
Cc: Tom Talpey <tom@talpey.com>
Cc: Long Li <longli@microsoft.com>
Cc: Namjae Jeon <linkinjeon@kernel.org>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: linux-cifs@vger.kernel.org
Cc: samba-technical@lists.samba.org
Signed-off-by: Stefan Metzmacher <metze@samba.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
-----BEGIN PGP SIGNATURE-----
iQGzBAABCgAdFiEE6fsu8pdIjtWE/DpLiiy9cAdyT1EFAmnzfsoACgkQiiy9cAdy
T1E8ogv+Kx7TMahO+6RJpSHknPbwHxmEQjvz6SZkSFG7WgtupqRofAjnxIJyiqo5
PuoH2LPd9ggWvzZC3spz/J/XqcwiqY+u94h3pudGClJuLU7p1AH7eH5aS+GgFePW
FFymUOWUaqwPp6NTBHKfEFg6byPfqzm7e256WpSQSDqKPiEEcrLZqxiZ0H6iOoBK
4asWO/0P6a1MMWf+rUeNq0IduHt8R1tTsukuF5Ye/B919eA3zvnlRTGjhW0X35Qc
BxPaGO4eIrBvmPHSZUS2XN9tBES7kFK+lEdYpDIHkOhD67BKIqJ7rPOgoXrgJwtK
MxZbTNm1Zfkrh7wbxOCbyfHLs1ckPKmOWzfa3Qjls2SyohmwaV6u2EJ2xu/A11r5
4O31gDTunwZ1f1v72k/mXbC2Bi1rIdBVzzfRxqSzMApfSeouk5PLjedvekAoqrO9
0EXGWIp/uIbs8Be1+YIEfXkitjff0znC2VFd+1N7nifF8zqBfYQJyyJpigdjhLB4
WFi1zp9C
=CAio
-----END PGP SIGNATURE-----
Merge tag 'v7.1-rc2-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6
Pull smb client fixes from Steve French:
- multichannel crediting fix
- memory allocation improvement for smb2_compound_op
- remove some dead code
* tag 'v7.1-rc2-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6:
cifs: change_conf needs to be called for session setup
smb: client: change allocation requirements in smb2_compound_op
smb/client: remove unused smb3_parse_opt()
Today we skip calling change_conf for negotiates and session setup
requests. This can be a problem for mchan as the immediate next call
after session setup could be due to an I/O that is made on the
mount point. For single channel, this is not a problem as
there will be several calls after setting up session.
This change enforces calling change_conf when the total credits contain
enough for reservations for echoes and oplocks. We expect this to happen
during the last session setup response. This way, echoes and oplocks are
not disabled before the first request to the server. So if that first
request is an open, it does not need to disable requesting leases.
Cc: <stable@vger.kernel.org>
Reviewed-by: Bharath SM <bharathsm@microsoft.com>
Signed-off-by: Shyam Prasad N <sprasad@microsoft.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
Currently, smb2_compound_op() allocates
struct smb2_compound_vars *vars using GFP_ATOMIC, although
smb2_compound_op() can sleep when it calls compound_send_recv()
before vars is freed.
Allocate vars using GFP_KERNEL.
Signed-off-by: Fredric Cover <fredric.cover.lkernel@gmail.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
Clang warns (or errors with CONFIG_WERROR=y / W=e):
fs/ntfs/runlist.c:755:6: error: variable 'rl' is used uninitialized whenever 'if' condition is true [-Werror,-Wsometimes-uninitialized]
755 | if (overflows_type(lowest_vcn, vcn)) {
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
...
fs/ntfs/runlist.c:971:9: note: uninitialized use occurs here
971 | kvfree(rl);
| ^~
...
rl has not been allocated at this point so the 'goto err_out' should
really just be a return of the error pointer -EIO.
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Reviewed-by: Hyunchul Lee <hyc.lee@gmail.com>
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
NTFS could store a filename as paired WIN32 and DOS $FILE_NAME attributes
for directories. But ntfs_delete() deleted both attributes for unlinking
a directory, but it also called drop_nlink() for each attributes.
This could trigger warnings when unlinking directories.
Signed-off-by: Hyunchul Lee <hyc.lee@gmail.com>
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
stop_sessions() walks conn_list with hash_for_each() and, for every
entry, drops conn_list_lock across the transport ->shutdown() call
before re-acquiring the read lock to continue the loop. The hash
walk relies on cross-iteration state (the current bucket and the
hlist position), which is not preserved across unlock/relock: if
another thread performs a list mutation during the unlocked window,
the ongoing iteration becomes unreliable and can re-visit
connections that have already been handled or skip connections that
have not. The outer `if (!hash_empty(conn_list)) goto again;` retry
masks the symptom in the common case but does not address the
unsafe iteration itself.
Reframe the loop so it never relies on iterator state across
unlock/relock. Under conn_list_lock held for read, pick the first
connection whose ->shutdown() has not yet been issued by this path,
pin it by taking an extra reference, record that fact on the
connection and mark it EXITING while still inside the locked walk,
then drop the lock. Then call ->shutdown() outside the lock, drop
the pin (freeing the connection if the handler already released its
reference), and restart from the top.
Use a new per-connection flag, conn->stop_called, as the "shutdown
issued from stop_sessions()" marker rather than reusing the status
state. ksmbd_conn_set_exiting() is also invoked by
ksmbd_sessions_deregister() on sibling channels of a multichannel
session without issuing a transport shutdown, so treating
KSMBD_SESS_EXITING as "already handled here" would skip connections
that still need shutdown() to wake their handler out of recv(),
leaving the outer retry waiting indefinitely for the hash to drain.
stop_sessions() is serialised by init_lock in
ksmbd_conn_transport_destroy(), so writing stop_called under the
read lock has no other writer.
Set EXITING inside the locked walk so the selection, the stop_called
marker, and the status transition all happen together, and guard
against regressing a connection that has already advanced to
KSMBD_SESS_RELEASING on its own (for example, if the handler exited
its receive loop for an unrelated reason between teardown steps).
When the pin drop is the last put, release the transport and pair
ida_destroy(&target->async_ida) with the ida_init() done in
ksmbd_conn_alloc(), so stop_sessions() retiring a connection on its
own does not leak the xarray backing of the embedded async_ida.
The outer retry with msleep() is kept to wait for handler threads to
reach ksmbd_conn_free() and drain the hash.
Observed with an instrumented build that logs one line per visit and
widens the unlocked window before ->shutdown() by 200 ms, under
five concurrent cifs mounts (nosharesock, one connection each):
* Current code: the same connection address is revisited many
times during a single stop_sessions() call and ->shutdown() is
invoked well beyond the number of live connections before the
hash finally drains.
* Rewritten code: each live connection produces exactly one
->shutdown() call; the function returns as soon as the hash is
empty.
Functional teardown via `ksmbd.control --shutdown` with the same
five mounts completes cleanly on the rewritten path.
Performance is observably unchanged. Tearing down N concurrent
nosharesock cifs connections with `ksmbd.control --shutdown` +
`rmmod ksmbd` takes essentially the same wall time before and after
the rewrite:
N before after
10 4.93s 5.34s
30 7.34s 7.03s
50 7.31s 7.01s (3-run avg: 7.04s vs 7.25s)
100 6.98s 6.78s
200 6.77s 6.89s
and the number of ->shutdown() calls equals the number of live
connections on both paths when the race is not widened. The
teardown is dominated by the msleep(100)-based outer retry waiting
for handler threads to run ksmbd_conn_free(), not by the iteration
itself; the restartable loop's worst-case O(N^2) visit cost is in
the microseconds even at N=200 and sits far below the msleep(100)
granularity.
Applied alone on top of ksmbd-for-next-next, this patch does not
introduce a new leak site. Under the same reproducer (10x
concurrent-holders + ss -K + ksmbd.control --shutdown + rmmod), the
tree still shows the pre-existing per-connection transport leak
count that arises when the last refcount drop lands in one of
ksmbd_conn_r_count_dec(), __free_opinfo() or session_fd_check() -
all of which end with a bare kfree() today. kmemleak backtraces
for the unreferenced objects point into the TCP accept path
(sk_clone -> inet_csk_clone_lock, sock_alloc_inode) and none
involve stop_sessions(). Plugging those bare-kfree sites is the
responsibility of the follow-up patch.
Fixes: e2f34481b2 ("cifsd: add server-side procedures for SMB3")
Cc: stable@vger.kernel.org
Signed-off-by: DaeMyung Kang <charsyam@gmail.com>
Acked-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
early exit in smb2_populate_readdir_entry() if the requested info_level
is unknown.
Signed-off-by: Marios Makassikis <mmakassikis@freebox.fr>
Acked-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
The Smatch reported a warning in __ntfs_bitmap_set_bits_in_run():
"warn: passing a valid pointer to 'PTR_ERR'"
This occurs because the 'folio' variable might contain a valid pointer
when jumping to the 'rollback' label, specifically when 'cnt <= 0' is
detected during the subsequent page mapping loop. In such cases,
calling PTR_ERR(folio) is incorrect as it does not contain an error
code.
Fix this by introducing an explicit 'err' variable to track the error
status. This ensures that the rollback logic and the return value
consistently use a proper error code regardless of the state of the
folio pointer.
Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
Use QSTR() and drop strlen() in efivarfs_alloc_dentry().
Signed-off-by: Thorsten Blum <thorsten.blum@linux.dev>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
-----BEGIN PGP SIGNATURE-----
iQEzBAABCAAdFiEEq1nRK9aeMoq1VSgcnJ2qBz9kQNkFAmnvhuAACgkQnJ2qBz9k
QNn7NQgAhNZiUl5HQ2/+6XjrhrbaLTiwYLFYX36MlVcy6jZZ7PABEn4Iq+dKn19L
EQ6WL97yEAOV8A3wdXLnm3J/euf37JlWApuNWqmgA5ZsJ0p14nXKbhd/jcqSf5LF
K4afO0OlN6WrPPJxxY2KfiaElgf9bPAhSkX/JvV5pEVyz0CcAjFfgwmNhxrYxIY1
DfQJ/2w7G1VdgpxeO1kVW5REH5NvbsWj9IQSSRS9r1HzTa7E28e3Zn75XWcdab3I
pt3E03nuUiyfVTmJXi2/HGNb40XZjH5TCeDrsbjo759ZdiPIvWDUpcTx0+5acOaj
b039wWZKKBFTag4KA4yPMkEV37ROiA==
=+sU8
-----END PGP SIGNATURE-----
Merge tag 'fs_for_v7.1-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs
Pull isofs and udf fixes from Jan Kara:
"Several isofs and udf fixes"
* tag 'fs_for_v7.1-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
docs: isofs: replace dead ECMA-119 FTP link
udf: reject descriptors with oversized CRC length
isofs: use QSTR_LEN() in isofs_cmp
isofs: validate block number from NFS file handle in isofs_export_iget
isofs: validate Rock Ridge CE continuation extent against volume size
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEE8rQSAMVO+zA4DBdWxWXV+ddtWDsFAmnvdO8ACgkQxWXV+ddt
WDuLqA//fcHDOnClWHRRUaIWhhkMYm7gNZkXf2d+qyYLMtAP2Cv2sZ+aV+OkHp5D
/Gq1W1mUXZLabu0EV0xKICn01nwzWtbZwDO8Bo3+QEdLoAi2gITODsYyY8yeW9KO
GfSBPsom+d7ktVrjaYE7Ppcm6YifBjWNDDcC+MX7Kpy+OUqhyOtsJIaEeTwii9+P
eiyAAC1zqrHZtaQfLsY3WvM0baNaqlm1xURMjJPyRCAtjGpjZy1hK/iFsGcHRlfc
SR//WT/MRnUAFn8zlIBG0wNrk1IEIgPPiA7hAXMRGXFKo0C6ICYLl5MJQh/o/MUs
tFBdkBhtcX/Kynvwb059SyalXZzVhQvzaRN89ZGuDyalNiejRzb8F2oVCfKAVKIU
MdkKOjnR5b7BUzCcZ1cJf1LgX4SngYKTnXrNGHpW0fuUzX6moJEd4wbrgmHjk9ke
+TVdl2vcpAduvBU9idkpDAcUW998tcYmX/LyQhGYpR6k/4n2UdFZJPINqco3pOAO
RIFbIgEAq9rUi+GMSJdEDMO6xLmUYoI6vaw7uZSU6E04zJPiVIcixfRDCBKGPV5Q
Yl9PC3ViLSlgKWaG7UVl8PVaSkCQ7esbfPAnNI/+RBCUeehhSFygePcY+kH1k4LA
0qMne1abDysUVwolb/1de/fqkznLlB3SlA447HwdvwMI0mCSb7w=
=ajKs
-----END PGP SIGNATURE-----
Merge tag 'for-7.1-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux
Pull btrfs fixes from David Sterba:
- space reservation fixes:
- correctly undo 'may_use' accounting for remap tree
- avoid double decrement of 'may_use' when submitting async io
- actually enable the shutdown ioctl callback (not just the superblock
ops)
- raid stripe tree fixes when deleting extents
- add missing error handling
- fix various incorrect values set
- fix transaction state when removing a directory, possibly leading to
EIO during log replay
- additional b-tree node key checks during metadata readahead
- error handling and transaction abort updates
* tag 'for-7.1-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
btrfs: fix double-decrement of bytes_may_use in submit_one_async_extent()
btrfs: check return value of btrfs_partially_delete_raid_extent()
btrfs: handle -EAGAIN from btrfs_duplicate_item and refresh stale leaf pointer
btrfs: replace ASSERT with proper error handling in stripe lookup fallback
btrfs: fix wrong min_objectid in btrfs_previous_item() call
btrfs: fix raid stripe search missing entries at leaf boundaries
btrfs: copy devid in btrfs_partially_delete_raid_extent()
btrfs: handle unexpected free-space-tree key types
btrfs: fix missing last_unlink_trans update when removing a directory
btrfs: don't clobber errors in add_remap_tree_entries()
btrfs: enable shutdown ioctl for non-experimental builds
btrfs: apply first key check for readahead when possible
btrfs: abort transaction in do_remap_reloc_trans() on failure
btrfs: fix bytes_may_use leak in do_remap_reloc_trans()
btrfs: fix bytes_may_use leak in move_existing_remap()
When ntfs_attr_get_search_ctx() fails and returns NULL, the function
returned early without calling put_page(ipage).
Fix this by jumping to err_out label on error. The err_out path now
properly releases the page and the mutex, with a NULL check for
the search context.
Reported-by: DaeMyung Kang <charsyam@gmail.com>
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
In ntfs_mapping_pairs_decompress(), lowest_vcn is read from
on-disk metadata and used as the initial vcn without validation.
A malformed value can introduce an invalid (e.g. negative) vcn,
corrupting the runlist from the start.
Additionally, the accumulation
vcn += deltaxcn
does not check for s64 overflow. A crafted mapping pairs array
can wrap vcn to a negative value, breaking the monotonically-
increasing invariant relied upon by ntfs_rl_vcn_to_lcn() and
related helpers.
Fix this by validating lowest_vcn and using check_add_overflow()
for vcn accumulation.
Signed-off-by: Zhan Xusheng <zhanxusheng@xiaomi.com>
Reviewed-by: Hyunchul Lee <hyc.lee@gmail.com>
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
ntfs_reparse_set_wsl_symlink() converts the symlink target into an
allocated NLS string and transfers ownership to ni->target only after
ntfs_set_ntfs_reparse_data() succeeds. If setting the reparse data fails,
the converted target is left unreferenced and leaks.
Free the converted target on the reparse update failure path. Use kfree()
for the other local failure path as well, matching the ntfs_ucstonls()
allocation contract.
Fixes: fc053f05ca ("ntfs: add reparse and ea operations")
Signed-off-by: DaeMyung Kang <charsyam@gmail.com>
Reviewed-by: Hyunchul Lee <hyc.lee@gmail.com>
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
ntfs_index_walk_down() allocates ictx->ib when descending from the root
into an index allocation block. If that allocation fails, the old code
still passes the NULL buffer to ntfs_ib_read(), which can write through
it via ntfs_inode_attr_pread().
Allocate the index block into a temporary pointer and return -ENOMEM
before changing the index context on allocation failure. Also propagate
ERR_PTR() through ntfs_index_next() and ntfs_readdir() so walk-down
allocation or index block read failures are not mistaken for normal
index iteration inside the filesystem.
ntfs_readdir() keeps the existing userspace-visible behavior of
suppressing readdir errors after marking end_in_iterate; this change only
prevents the walk-down failure path from dereferencing NULL internally.
The failure was reproduced with failslab fail-nth injection on getdents64;
the original module hits a NULL pointer dereference in memcpy_orig through
ntfs_ib_read(), while the patched module reaches the same
ntfs_index_walk_down() allocation failure without crashing.
Fixes: 0a8ac0c1fa ("ntfs: update directory operations")
Signed-off-by: DaeMyung Kang <charsyam@gmail.com>
Reviewed-by: Hyunchul Lee <hyc.lee@gmail.com>
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
Commit abdb1742a3 ("cifs: get rid of mount options string parsing")
removed the last caller.
Signed-off-by: David Disseldorp <ddiss@suse.de>
Signed-off-by: Steve French <stfrench@microsoft.com>
Several development updates for 7.1-rc1:
- fix a race condition handling PG_dcache_clean
- further cleanups for the fault handling, allowing RT to be enabled
- fixing nzones validation in adfs filesystem driver
- fix for module unwinding
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEuNNh8scc2k/wOAE+9OeQG+StrGQFAmnrfHEACgkQ9OeQG+St
rGTtMA/+JmEQuVvtJ4ZJ+PhFYdI/VVPTpTwf4tP1dhlLEKaeX4vzlw09+mW/dYeM
jH65rkQ3wcLC4gRH7ZI1SRMm6n0lAf0ROLjpwagd+4U+xnhLCmIHxo8FM2v3Rmyl
34WnCmWitiEtUN41t0EfbUBxDoX5pV42vpPkYIdnQtLkjObCcXsVFMhegBPCTfp2
UOozgH7yNlkFwWEOCEacNne22MmAheOU0R4Jb/OiFAaE3z4i/9+nFoHd7NCKpmDf
hgWaOalg+jsclxn5kyE0KbraP7Uvx44eX8OeQk1rdvL+F9ItGFXtXKD7FUUj+71/
S5yHT3YrRYvAEM1GDKAieZhVWFxKPm5osnlJwrrrnfLWR7JvwrsU7SGS8FqSsqqf
i/Ie5rY1hV9faXJikY8MDSzKC7gwdX0wtZ8JZanZtGwFlPvI17oaE4K3hbiD13Dk
WIJpd7kfYEWG0SjWtyrW1E2plqy3lfVN4MZfjW2ID18uo0PwVtgE5GFf/HFGxugN
WLV+/DhSkc40EUC4bVflWDLtnJYhsQYCJoHwZXVI8YpNK+tShia0BVCSj+Llv8Tv
f33Ae7IARZad+ty2MsTW1oG0zcIjHI8A4OIoRfbRwf3z9Bs8S3o6chlTOi7YYV5y
fbSxCg9AIYRcUZ0ACsnrIgrVd/YF57FD8AvJ/BHvTm7nVTeoLKY=
=arB6
-----END PGP SIGNATURE-----
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rmk/linux
Pull ARM updates from Russell King:
- fix a race condition handling PG_dcache_clean
- further cleanups for the fault handling, allowing RT to be enabled
- fixing nzones validation in adfs filesystem driver
- fix for module unwinding
* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rmk/linux:
ARM: 9463/1: Allow to enable RT
ARM: 9472/1: fix race condition on PG_dcache_clean in __sync_icache_dcache()
ARM: 9471/1: module: fix unwind section relocation out of range error
fs/adfs: validate nzones in adfs_validate_bblk()
ARM: provide individual is_translation_fault() and is_permission_fault()
ARM: move FSR fault status definitions before fsr_fs()
ARM: use BIT() and GENMASK() for fault status register fields
ARM: move is_permission_fault() and is_translation_fault() to fault.h
ARM: move vmalloc() lazy-page table population
ARM: ensure interrupts are enabled in __do_user_fault()
Highlights include:
Bugfixes:
- NFS: Fix handling of ENOSPC so that if we have to resend writes, they
are written synchronously.
- SUNRPC: RDMA transport fixes from Chuck
- NFSv4.2: Several fixes for delegated timestamps
- NFSv4: Failure to obtain a directory delegation should not cause
stat() to fail.
- NFSv4: Rename was failing to update timestamps when a directory
delegation is held.
- NFSv4: Ensure we check rsize/wsize after crossing a NFSv4 filesystem
boundary.
- NFSv4/pnfs: If the server is down, retry the layout returns on reboot
- NFSv4/pnfs: Fallback to MDS could result in a short write being
incorrectly logged.
Cleanups:
- NFS: use memcpy_and_pad in decode_fh
-----BEGIN PGP SIGNATURE-----
iHUEABYKAB0WIQR8xgHcVzJNfOYElJo6EXfx2a6V0QUCaevSUgAKCRA6EXfx2a6V
0ewIAQD+23uMo5sxY10btKATcBBxswY5YMtN1qQBMyn88N0XfwEAz0+zoEbRv4L2
39goJ/WeJ0/gqhfJV9F+Oe2U1DbsEgM=
=l9y/
-----END PGP SIGNATURE-----
Merge tag 'nfs-for-7.1-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs
Pull NFS client updates from Trond Myklebust:
"Bugfixes:
- Fix handling of ENOSPC so that if we have to resend writes, they
are written synchronously
- SUNRPC RDMA transport fixes from Chuck
- Several fixes for delegated timestamps in NFSv4.2
- Failure to obtain a directory delegation should not cause stat() to
fail with NFSv4
- Rename was failing to update timestamps when a directory delegation
is held on NFSv4
- Ensure we check rsize/wsize after crossing a NFSv4 filesystem
boundary
- NFSv4/pnfs:
- If the server is down, retry the layout returns on reboot
- Fallback to MDS could result in a short write being incorrectly
logged
Cleanups:
- Use memcpy_and_pad in decode_fh"
* tag 'nfs-for-7.1-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: (21 commits)
NFS: Fix RCU dereference of cl_xprt in nfs_compare_super_address
NFS: remove redundant __private attribute from nfs_page_class
NFSv4.2: fix CLONE/COPY attrs in presence of delegated attributes
NFS: fix writeback in presence of errors
nfs: use memcpy_and_pad in decode_fh
NFSv4.1: Apply session size limits on clone path
NFSv4: retry GETATTR if GET_DIR_DELEGATION failed
NFS: fix RENAME attr in presence of directory delegations
pnfs/flexfiles: validate ds_versions_cnt is non-zero
NFS/blocklayout: print each device used for SCSI layouts
xprtrdma: Post receive buffers after RPC completion
xprtrdma: Scale receive batch size with credit window
xprtrdma: Replace rpcrdma_mr_seg with xdr_buf cursor
xprtrdma: Decouple frwr_wp_create from frwr_map
xprtrdma: Close lost-wakeup race in xprt_rdma_alloc_slot
xprtrdma: Avoid 250 ms delay on backlog wakeup
xprtrdma: Close sendctx get/put race that can block a transport
nfs: update inode ctime after removexattr operation
nfs: fix utimensat() for atime with delegated timestamps
NFS: improve "Server wrote zero bytes" error
...
support for per-subvolume data I/O performance and latency tracking
(metadata operations aren't included) and a good variety of fixes and
cleanups across RBD and CephFS.
-----BEGIN PGP SIGNATURE-----
iQFHBAABCgAxFiEEydHwtzie9C7TfviiSn/eOAIR84sFAmnrq1YTHGlkcnlvbW92
QGdtYWlsLmNvbQAKCRBKf944AhHzi+WFCACA2Yc6oj6W4yXX2LSGCFCN3FOanSb3
6ZvPeSrmAALzwD9ZXdef6j50An6w05P7kXKmAyKTgmW2tpiRciJs6uT6y7By/aph
uGZCaPoJWPDvTlo8d05MAVuyfoKH5eU8pwx2YiEMN5W6kfo7VJQze6BgLbvQt7yH
ToIzzBLifYONH4vF3nfsHj/uCr38Cbpr6GWY8LIPo8QtInWKJTcwF7HWVVicCaMs
yqf1t+/CWzlIsnnIQtp+aSxWlpoA5lAqWxGt3jSfd3eVTCAL8eDzw5fkbGMRJYgM
paH3kZ+LuJWkRXe2ts/RMrXWLJF3ZWOVD6sWU6sfnXf+vBe4SkiwwcUt
=Ooc5
-----END PGP SIGNATURE-----
Merge tag 'ceph-for-7.1-rc1' of https://github.com/ceph/ceph-client
Pull ceph updates from Ilya Dryomov:
"We have a series from Alex which extends CephFS client metrics with
support for per-subvolume data I/O performance and latency tracking
(metadata operations aren't included) and a good variety of fixes and
cleanups across RBD and CephFS"
* tag 'ceph-for-7.1-rc1' of https://github.com/ceph/ceph-client:
ceph: add subvolume metrics collection and reporting
ceph: parse subvolume_id from InodeStat v9 and store in inode
ceph: handle InodeStat v8 versioned field in reply parsing
libceph: Fix slab-out-of-bounds access in auth message processing
rbd: fix null-ptr-deref when device_add_disk() fails
crush: cleanup in crush_do_rule() method
ceph: clear s_cap_reconnect when ceph_pagelist_encode_32() fails
ceph: only d_add() negative dentries when they are unhashed
libceph: update outdated comment in ceph_sock_write_space()
libceph: Remove obsolete session key alignment logic
ceph: fix num_ops off-by-one when crypto allocation fails
libceph: Prevent potential null-ptr-deref in ceph_handle_auth_reply()
- Fix potential data leakage by zeroing the portion of the straddle block
beyond initialized_size when reading non-resident attributes.
- Remove unnecessary zeroing in ntfs_punch_hole() for ranges beyond
initialized_size, as they are already returned as zeros on read.
- Fix writable check in ntfs_file_mmap_prepare() to correctly handle
shared mappings using VMA_SHARED_BIT | VMA_MAYWRITE_BIT.
- Use page allocation instead of kmemdup() for IOMAP_INLINE data to
ensure page-aligned address and avoid BUG trap in
iomap_inline_data_valid() caused by the page boundary check.
- Add a size check before memory allocation in ntfs_attr_readall() and
reject overly large attributes.
- Remove unneeded noop_direct_IO from ntfs_aops as it is no longer
required following the FMODE_CAN_ODIRECT flag.
- Fix seven static analysis warnings reported by Smatch.
-----BEGIN PGP SIGNATURE-----
iQJKBAABCgA0FiEE6NzKS6Uv/XAAGHgyZwv7A1FEIQgFAmnrdBoWHGxpbmtpbmpl
b25Aa2VybmVsLm9yZwAKCRBnC/sDUUQhCMNDD/9K6oUrK67F9vOhTSD/dDlLVZXu
mKP7idXDxsGQRo4xbQCKdR5F0f7fHyRYUW40+vwhcyf8eGg8pcIN8ATSh9eXoLXo
uGggm0Wf1mnVhcAnlrLCYESVF1OUrYqZWL0mdY+T5UfdimmlOZvgBT0tPMKbVt3A
6wVGpo216/ttjgEbw6txDOS9qXtoUWD+AfMnHkSmHwdxABD3Hnv082qaWeAlCMvZ
T0QJdBFW4zfynqL3x2l5jgMau0fUJOEGsghxklRjFTHLFGcwgJ8b51wO0vkIYIb8
d+Ty78LpaMDShxS9P3aGSDewMgMztRHSf4F1Lhey9ZyU+0TYRTQ+lvjpjv/tND8o
I/zYl/hUWyasIKzGXoa1oMQ9IQews+hguX/tWnal4vWB95coqH+DPSoa5/pA54hd
t5ib7asDnEFeoRiebeB/zo6bY4LspfTQwLh8/O22mHdcc2/MLLriAVFx5I0Rx9d8
eumexuAbaLSD+1fIB4F/dNZW8GsIXu8xRJEPh6NqLJqqy9fQSOoDhbUP9nNpUUvx
ZmFhQKcn/1eGk6qYJgahZ683PJskvXKtu/w6qOyGmWf44RAaOo8DgAyg2a7kt1Eh
CIa5xdvXS3RDVSwY3yOtyA9PsW1v1H80iFVbYRO/kx+92HnAkH4fCCFWd07fX7z/
X94ISuRPFzGdkIInBg==
=JZSI
-----END PGP SIGNATURE-----
Merge tag 'ntfs-for-7.1-rc1-part2' of git://git.kernel.org/pub/scm/linux/kernel/git/linkinjeon/ntfs
Pull ntfs updates from Namjae Jeon:
- Fix potential data leakage by zeroing the portion of the straddle
block beyond initialized_size when reading non-resident attributes
- Remove unnecessary zeroing in ntfs_punch_hole() for ranges beyond
initialized_size, as they are already returned as zeros on read
- Fix writable check in ntfs_file_mmap_prepare() to correctly handle
shared mappings using VMA_SHARED_BIT | VMA_MAYWRITE_BIT
- Use page allocation instead of kmemdup() for IOMAP_INLINE data to
ensure page-aligned address and avoid BUG trap in
iomap_inline_data_valid() caused by the page boundary check
- Add a size check before memory allocation in ntfs_attr_readall() and
reject overly large attributes
- Remove unneeded noop_direct_IO from ntfs_aops as it is no longer
required following the FMODE_CAN_ODIRECT flag
- Fix seven static analysis warnings reported by Smatch
* tag 'ntfs-for-7.1-rc1-part2' of git://git.kernel.org/pub/scm/linux/kernel/git/linkinjeon/ntfs:
ntfs: use page allocation for resident attribute inline data
ntfs: fix mmap_prepare writable check for shared mappings
ntfs: fix potential 32-bit truncation in ntfs_write_cb()
ntfs: fix uninitialized variable in ntfs_map_runlist_nolock
ntfs: delete dead code
ntfs: add missing error code in ntfs_mft_record_alloc()
ntfs: fix uninitialized variables in ntfs_ea_set_wsl_inode()
ntfs: fix uninitialized pointer in ntfs_write_mft_block
ntfs: fix uninitialized variable in ntfs_write_simple_iomap_begin_non_resident
ntfs: remove noop_direct_IO from address_space_operations
ntfs: limit memory allocation in ntfs_attr_readall
ntfs: not zero out range beyond init in punch_hole
ntfs: zero out stale data in straddle block beyond initialized_size
- some minor cleanup
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEE/IPbcYBuWt0zoYhOq06b7GqY5nAFAmnrRHgACgkQq06b7GqY
5nDgGw/7BG4ry89+Q2UR5T+/zqNj/kISpemz2F0Q62lPiQEIv3IvAAGErsGR6DiR
dGvIzvsPLUAWv8gzwhfCn9zFCO92gmMRav5BxOhFpaHfqy6TTJIqe7DoRjbjjjhR
7wUId3WZ8U1V9Z7Ea3oE2gpf/rR+iwACu/O0U1ZYhuJIcmJRsBIWGnWsaZr+PWSf
VuM74YrdGcdrgMkB2hGxezjZ16MBGekWmVBjbWbQVWiUg1FBiewug9syUlJxgEnv
WatEdPu/gWcfGj9bY3RAAlUnP5YJRX22xfmLNVayBOV1LThvKDRd7RKicyDFLIWe
2NbvIaEtyMvupS8w+n0fsDFqL4/yRTwO6p26YV5nDOOMWr5yrQ6bNNmMJrJvJroP
rtG5ww8c7mKLv7wBDJua7m6IIwHxAjzpmWbvX51+ap6uN2oQL7BWbGT3NrDuRVFj
CblbXS4GzTMo5EbmOjuqO0HbA9a2gAPn6g8ElxFfKBUOJx6U86hDJ3CDfRs5i8ft
SUXXJQYJqyhniK7INPvINSoIYn/+5cQavyFay+LgMBPZ8yNY9qSZVskb/Q5AugWI
LdPoJ9DthzMzaSEiEn/sJcq8FGt8w7cOlOABYCMRazSAGWEGIwzHxOtXZp8UfifG
ULbtc9uAGxZClbaO7p2T7cWHveAz+kvvhQn0+LxLJZ+/RzuRc/w=
=miGY
-----END PGP SIGNATURE-----
Merge tag '9p-for-7.1-rc1' of https://github.com/martinetd/linux
Pull 9p updates from Dominique Martinet:
- 9p access flag fix (cannot change access flag since new mount API implem)
- some minor cleanup
* tag '9p-for-7.1-rc1' of https://github.com/martinetd/linux:
9p/trans_xen: replace simple_strto* with kstrtouint
9p/trans_xen: make cleanup idempotent after dataring alloc errors
9p: document missing enum values in kernel-doc comments
9p: fix access mode flags being ORed instead of replaced
9p: fix memory leak in v9fs_init_fs_context error path
Please consider pulling these changes from the signed vfs-7.1-rc1.fixes tag.
Thanks!
Christian
-----BEGIN PGP SIGNATURE-----
iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCaeqfYAAKCRCRxhvAZXjc
oltyAP4y1SFYvmoy2mPM3jrSbYuT2rX0q4OZ/GDbuWOvir/bcgEAoPI9JHraS1+2
xFj/7JJFWzuDXlFoaX6g+nv42pfatgU=
=BnjA
-----END PGP SIGNATURE-----
Merge tag 'vfs-7.1-rc1.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs
Pull vfs fixes from Christian Brauner:
- eventpoll: fix ep_remove() UAF and follow-up cleanup
- fs: aio: set VMA_DONTCOPY_BIT in mmap to fix NULL-pointer-dereference
error
- writeback: Fix use after free in inode_switch_wbs_work_fn()
- fuse: reject oversized dirents in page cache
- fs: aio: reject partial mremap to avoid Null-pointer-dereference
error
- nstree: fix func. parameter kernel-doc warnings
- fs: Handle multiply claimed blocks more gracefully with mmb
* tag 'vfs-7.1-rc1.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
eventpoll: drop vestigial epi->dying flag
eventpoll: drop dead bool return from ep_remove_epi()
eventpoll: refresh eventpoll_release() fast-path comment
eventpoll: move f_lock acquisition into ep_remove_file()
eventpoll: fix ep_remove struct eventpoll / struct file UAF
eventpoll: move epi_fget() up
eventpoll: rename ep_remove_safe() back to ep_remove()
eventpoll: drop vestigial __ prefix from ep_remove_{file,epi}()
eventpoll: kill __ep_remove()
eventpoll: split __ep_remove()
eventpoll: use hlist_is_singular_node() in __ep_remove()
fs: Handle multiply claimed blocks more gracefully with mmb
nstree: fix func. parameter kernel-doc warnings
fs: aio: reject partial mremap to avoid Null-pointer-dereference error
fuse: reject oversized dirents in page cache
writeback: Fix use after free in inode_switch_wbs_work_fn()
fs: aio: set VMA_DONTCOPY_BIT in mmap to fix NULL-pointer-dereference error
-----BEGIN PGP SIGNATURE-----
iQGzBAABCgAdFiEE6fsu8pdIjtWE/DpLiiy9cAdyT1EFAmnqdE8ACgkQiiy9cAdy
T1E4uwwAtRDds+fqFZHgqEW/0Vd8O1RJCvUGomoampb4z9rzHOMrekofAjRT6OJs
M6jVx5L/22TT9Vf+Ya+WMrOkxQbjyFy8j6IjdefJi2SxX5Z9QM7ZvEWhQlDhdVUV
Hfb7Zd3jsdDk6GvvIfVzlEXMLbWtkD5zhGYCVOfNuh/RlyGy+orkjbUfbGEI56c4
WPXkVUvGqHnniU/AB4/9pDFFMOwy4IAY9Bs8u2b65FWoxDsPFbz8ntJ1+Ehcy+Er
Try0JqSQT7uJNHN7O334NeylbsxLyszkqDyYUv3A8un7Txzi4OIZFNJuHE4Av95S
XVbmrkCgZ7Bm4wvPBxc35usZk+7WFdIgLM5vA37pG93zQ/n/zdjTdOiGQ6+8qw/L
rHgG6A1ti6/f48Lk5vj01fOcCoNIEBcwVYtajSGU45f44cGjyqgnDe0Id8OZe1yY
DMhsthL45kK1mmYPaq9h85mdxCeec3aKTrMd79dgwiyVMIFEeVJwd6Yvs8SH6qFb
+OSVq87r
=WKB0
-----END PGP SIGNATURE-----
Merge tag 'v7.1-rc-part2-ksmbd-fixes' of git://git.samba.org/ksmbd
Pull more smb server updates from Steve French:
- move fs/smb/common/smbdirect to fs/smb/smbdirect
- change signature calc to use AES-CMAC library, simpler and faster
- invalid signature fix
- multichannel fix
- open create options fix
- fix durable handle leak
- cap maximum lock count to avoid potential denial of service
- four connection fixes: connection free and session destroy IDA fixes,
refcount fix, connection leak fix, max_connections off by one fix
- IPC validation fix
- fix out of bounds write in getting xattrs
- fix use after free in durable handle reconnect
- three ACL fixes: fix potential ACL overflow, harden num_aces check,
and fix minimum ACE size check
* tag 'v7.1-rc-part2-ksmbd-fixes' of git://git.samba.org/ksmbd:
smb: smbdirect: move fs/smb/common/smbdirect/ to fs/smb/smbdirect/
smb: server: stop sending fake security descriptors
ksmbd: scope conn->binding slowpath to bound sessions only
ksmbd: fix CreateOptions sanitization clobbering the whole field
ksmbd: fix durable fd leak on ClientGUID mismatch in durable v2 open
ksmbd: fix O(N^2) DoS in smb2_lock via unbounded LockCount
ksmbd: destroy async_ida in ksmbd_conn_free()
ksmbd: destroy tree_conn_ida in ksmbd_session_destroy()
ksmbd: Use AES-CMAC library for SMB3 signature calculation
ksmbd: reset rcount per connection in ksmbd_conn_wait_idle_sess_id()
ksmbd: fix out-of-bounds write in smb2_get_ea() EA alignment
ksmbd: use check_add_overflow() to prevent u16 DACL size overflow
ksmbd: fix use-after-free in smb2_open during durable reconnect
ksmbd: validate num_aces and harden ACE walk in smb_inherit_dacl()
smb: server: fix max_connections off-by-one in tcp accept path
ksmbd: require minimum ACE size in smb_check_perm_dacl()
ksmbd: validate response sizes in ipc_validate_msg()
smb: server: fix active_num_conn leak on transport allocation failure
With ep_remove() now pinning @file via epi_fget() across the
f_ep clear and hlist_del_rcu(), the dying flag no longer
orchestrates anything: it was set in eventpoll_release_file()
(which only runs from __fput(), i.e. after @file's refcount has
reached zero) and read in __ep_remove() / ep_remove() as a cheap
bail before attempting the same synchronization epi_fget() now
provides unconditionally.
The implication is simple: epi->dying == true always coincides
with file_ref_get(&file->f_ref) == false, because __fput() is
reachable only once the refcount hits zero and the refcount is
monotone in that state. The READ_ONCE(epi->dying) in ep_remove()
therefore selects exactly the same callers that epi_fget() would
reject, just one atomic cheaper. That's not worth a struct
field, a second coordination mechanism, and the comments on
both.
Refresh the eventpoll_release_file() comment to describe what
actually makes the path race-free now (the pin in ep_remove()).
No functional change: the correctness argument is unchanged,
only the mechanism is now a single one instead of two.
Link: https://patch.msgid.link/20260423-work-epoll-uaf-v1-10-2470f9eec0f5@kernel.org
Signed-off-by: Christian Brauner (Amutable) <brauner@kernel.org>
ep_remove_epi() always returns true -- the "can be disposed"
answer was meaningful back when the dying-check lived inside the
pre-split __ep_remove(), but after that check moved to ep_remove()
the return value is just noise. Both callers gate on it
unconditionally:
if (ep_remove_epi(ep, epi))
WARN_ON_ONCE(ep_refcount_dec_and_test(ep));
dispose = ep_remove_epi(ep, epi);
...
if (dispose && ep_refcount_dec_and_test(ep))
ep_free(ep);
Make ep_remove_epi() return void, drop the dispose local in
eventpoll_release_file(), and the useless conditionals at both
callers. No functional change.
Link: https://patch.msgid.link/20260423-work-epoll-uaf-v1-9-2470f9eec0f5@kernel.org
Signed-off-by: Christian Brauner (Amutable) <brauner@kernel.org>