mirror of
https://github.com/torvalds/linux.git
synced 2026-05-12 16:18:45 +02:00
__close_file_table_ids() is the per-session teardown that closes every
fp belonging to a session (or to one tree connect on that session) by
walking the session's volatile-id idr. The current loop has three
related problems on busy or racing workloads:
* Sleeping under ft->lock. The session-teardown skip callback,
session_fd_check(), already sleeps in ksmbd_vfs_copy_durable_owner()
-> kstrdup(GFP_KERNEL) and down_write(&fp->f_ci->m_lock) (a
rw_semaphore). Running the callback inside write_lock(&ft->lock)
trips CONFIG_DEBUG_ATOMIC_SLEEP / CONFIG_PROVE_LOCKING on a
durable-fd workload.
* Refcount accounting blind to f_state. The unconditional
atomic_dec_and_test(&fp->refcount) does not distinguish
FP_INITED (idr-owned reference still intact) from FP_CLOSED (an
earlier ksmbd_close_fd() already consumed the idr-owned reference
while leaving fp in the idr because a holder kept refcount
non-zero). When the latter races with teardown the same path
over-decrements into a holder reference and ksmbd_fd_put() later
UAFs that holder.
* FP_NEW window. Between __open_id() publishing fp into the
session idr and ksmbd_update_fstate(..., FP_INITED) committing the
transition at the end of smb2_open(), an fp is in FP_NEW and an
intervening teardown that takes a transient reference and
unpublishes the volatile id leaves the original idr-owned
reference orphaned -- the opener is unaware that fp has been
unpublished, returns success to the client, and the fp leaks at
refcount = 1.
Refactor __close_file_table_ids() to take a transient reference on fp
and unpublish fp from the session idr *under ft->lock* before calling
skip() outside the lock. A transient ref protects lifetime but not
concurrent field mutation, so the idr_remove() is what keeps
__ksmbd_lookup_fd() through this session's idr from granting a new
ksmbd_fp_get() reference to an fp whose fp->conn / fp->tcon /
fp->volatile_id / op->conn / lock_list links are about to be rewritten
by session_fd_check(). Durable reconnect is unaffected because it
reaches fp through the global durable table (ksmbd_lookup_durable_fd
-> global_ft).
Decide n_to_drop together with any FP_INITED -> FP_CLOSED transition
under ft->lock so teardown and ksmbd_close_fd() never both consume the
idr-owned reference. See ksmbd_mark_fp_closed() for the per-state
accounting. For the FP_NEW path to be safe, the opener has to learn
that fp was unpublished: ksmbd_update_fstate() now returns -ENOENT
when an FP_NEW -> FP_INITED transition finds f_state already advanced
or the volatile id cleared (both committed by teardown under
ft->lock); smb2_open() propagates that as STATUS_OBJECT_NAME_INVALID
and drops the original reference via ksmbd_fd_put().
The list removal cannot be left for a deferred final putter because
fp->volatile_id has already been cleared and __ksmbd_remove_fd() will
intentionally skip both idr_remove() and list_del_init(). Move the
m_fp_list unlink in __ksmbd_remove_fd() above the volatile-id check so
that an FP_NEW fp that happened to be added to m_fp_list (smb2_open()
adds fp->node before ksmbd_update_fstate() runs) is still cleaned up
on the deferred putter path; list_del_init() on an empty node is a
no-op and remains safe for fps that were never added.
Add a defensive guard in session_fd_check() that refuses non-FP_INITED
fps so that even if a teardown reaches an FP_NEW fp it falls into the
close branch (where the n_to_drop = 1 accounting keeps the opener's
reference alive) instead of the durable-preserve branch (which mutates
fp->conn / fp->tcon).
Validation on a debug kernel additionally built with CONFIG_DEBUG_LIST
and CONFIG_DEBUG_OBJECTS_WORK used a same-session two-tcon workload
(open/write storm on one tcon, 50 tree disconnects on the other) and
reported no list-corruption, work_struct ODEBUG, sleep-in-atomic,
lockdep or kmemleak reports. Reverting only the
__close_file_table_ids() hunk while keeping a forced-is_reconnectable()
harness produced the expected sleep-in-atomic at vfs_cache.c:1095,
confirming the ft->lock-out-of-sleepable-skip discipline.
KASAN-enabled direct SMB2 coverage with durable handles enabled
exercised ksmbd_close_tree_conn_fds(), ksmbd_close_session_fds(),
the FP_NEW failure path, tree_conn_fd_check(), and a non-zero
session_fd_check() durable-preserve return. This produced no KASAN,
DEBUG_LIST, ODEBUG, or WARNING reports.
Fixes:
|
||
|---|---|---|
| .. | ||
| 9p | ||
| adfs | ||
| affs | ||
| afs | ||
| autofs | ||
| befs | ||
| bfs | ||
| btrfs | ||
| cachefiles | ||
| ceph | ||
| coda | ||
| configfs | ||
| cramfs | ||
| crypto | ||
| debugfs | ||
| devpts | ||
| dlm | ||
| ecryptfs | ||
| efivarfs | ||
| efs | ||
| erofs | ||
| exfat | ||
| exportfs | ||
| ext2 | ||
| ext4 | ||
| f2fs | ||
| fat | ||
| freevxfs | ||
| fuse | ||
| gfs2 | ||
| hfs | ||
| hfsplus | ||
| hostfs | ||
| hpfs | ||
| hugetlbfs | ||
| iomap | ||
| isofs | ||
| jbd2 | ||
| jffs2 | ||
| jfs | ||
| kernfs | ||
| lockd | ||
| minix | ||
| netfs | ||
| nfs | ||
| nfs_common | ||
| nfsd | ||
| nilfs2 | ||
| nls | ||
| notify | ||
| ntfs | ||
| ntfs3 | ||
| ocfs2 | ||
| omfs | ||
| openpromfs | ||
| orangefs | ||
| overlayfs | ||
| proc | ||
| pstore | ||
| qnx4 | ||
| qnx6 | ||
| quota | ||
| ramfs | ||
| resctrl | ||
| romfs | ||
| smb | ||
| squashfs | ||
| sysfs | ||
| tests | ||
| tracefs | ||
| ubifs | ||
| udf | ||
| ufs | ||
| unicode | ||
| vboxsf | ||
| verity | ||
| xfs | ||
| zonefs | ||
| aio.c | ||
| anon_inodes.c | ||
| attr.c | ||
| backing-file.c | ||
| bad_inode.c | ||
| binfmt_elf_fdpic.c | ||
| binfmt_elf.c | ||
| binfmt_flat.c | ||
| binfmt_misc.c | ||
| binfmt_script.c | ||
| bpf_fs_kfuncs.c | ||
| buffer.c | ||
| char_dev.c | ||
| compat_binfmt_elf.c | ||
| coredump.c | ||
| d_path.c | ||
| dax.c | ||
| dcache.c | ||
| direct-io.c | ||
| drop_caches.c | ||
| eventfd.c | ||
| eventpoll.c | ||
| exec.c | ||
| fcntl.c | ||
| fhandle.c | ||
| file_attr.c | ||
| file_table.c | ||
| file.c | ||
| filesystems.c | ||
| fs_context.c | ||
| fs_dirent.c | ||
| fs_parser.c | ||
| fs_pin.c | ||
| fs_struct.c | ||
| fs-writeback.c | ||
| fserror.c | ||
| fsopen.c | ||
| init.c | ||
| inode.c | ||
| internal.h | ||
| ioctl.c | ||
| Kconfig | ||
| Kconfig.binfmt | ||
| kernel_read_file.c | ||
| libfs.c | ||
| locks.c | ||
| Makefile | ||
| mbcache.c | ||
| mnt_idmapping.c | ||
| mount.h | ||
| mpage.c | ||
| namei.c | ||
| namespace.c | ||
| nsfs.c | ||
| nullfs.c | ||
| open.c | ||
| pidfs.c | ||
| pipe.c | ||
| pnode.c | ||
| pnode.h | ||
| posix_acl.c | ||
| proc_namespace.c | ||
| read_write.c | ||
| readdir.c | ||
| remap_range.c | ||
| select.c | ||
| seq_file.c | ||
| signalfd.c | ||
| splice.c | ||
| stack.c | ||
| stat.c | ||
| statfs.c | ||
| super.c | ||
| sync.c | ||
| sysctls.c | ||
| timerfd.c | ||
| userfaultfd.c | ||
| utimes.c | ||
| xattr.c | ||