linux/fs
Linus Torvalds 426b4ca2d6 fs.setgid.v6.0
-----BEGIN PGP SIGNATURE-----
 
 iHQEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCYvIYIgAKCRCRxhvAZXjc
 omE8AQDAZG2YjNJfMnUUaaWYO3+zTaHlQp7OQkQTXIHfcfViXQD4vPt3Wxh3rrF+
 J8fwNcWmXhSNei8HP6cA06QmSajnDQ==
 =GF9/
 -----END PGP SIGNATURE-----

Merge tag 'fs.setgid.v6.0' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux

Pull setgid updates from Christian Brauner:
 "This contains the work to move setgid stripping out of individual
  filesystems and into the VFS itself.

  Creating files that have both the S_IXGRP and S_ISGID bit raised in
  directories that themselves have the S_ISGID bit set requires
  additional privileges to avoid security issues.

  When a filesystem creates a new inode it needs to take care that the
  caller is either in the group of the newly created inode or they have
  CAP_FSETID in their current user namespace and are privileged over the
  parent directory of the new inode. If any of these two conditions is
  true then the S_ISGID bit can be raised for an S_IXGRP file and if not
  it needs to be stripped.

  However, there are several key issues with the current implementation:

   - S_ISGID stripping logic is entangled with umask stripping.

     For example, if the umask removes the S_IXGRP bit from the file
     about to be created then the S_ISGID bit will be kept.

     The inode_init_owner() helper is responsible for S_ISGID stripping
     and is called before posix_acl_create(). So we can end up with two
     different orderings:

     1. FS without POSIX ACL support

        First strip umask then strip S_ISGID in inode_init_owner().

        In other words, if a filesystem doesn't support or enable POSIX
        ACLs then umask stripping is done directly in the vfs before
        calling into the filesystem:

     2. FS with POSIX ACL support

        First strip S_ISGID in inode_init_owner() then strip umask in
        posix_acl_create().

        In other words, if the filesystem does support POSIX ACLs then
        unmask stripping may be done in the filesystem itself when
        calling posix_acl_create().

     Note that technically filesystems are free to impose their own
     ordering between posix_acl_create() and inode_init_owner() meaning
     that there's additional ordering issues that influence S_ISGID
     inheritance.

     (Note that the commit message of commit 1639a49ccd ("fs: move
     S_ISGID stripping into the vfs_*() helpers") gets the ordering
     between inode_init_owner() and posix_acl_create() the wrong way
     around. I realized this too late.)

   - Filesystems that don't rely on inode_init_owner() don't get S_ISGID
     stripping logic.

     While that may be intentional (e.g. network filesystems might just
     defer setgid stripping to a server) it is often just a security
     issue.

     Note that mandating the use of inode_init_owner() was proposed as
     an alternative solution but that wouldn't fix the ordering issues
     and there are examples such as afs where the use of
     inode_init_owner() isn't possible.

     In any case, we should also try the cleaner and generalized
     solution first before resorting to this approach.

   - We still have S_ISGID inheritance bugs years after the initial
     round of S_ISGID inheritance fixes:

       e014f37db1 ("xfs: use setattr_copy to set vfs inode attributes")
       01ea173e10 ("xfs: fix up non-directory creation in SGID directories")
       fd84bfdddd ("ceph: fix up non-directory creation in SGID directories")

  All of this led us to conclude that the current state is too messy.
  While we won't be able to make it completely clean as
  posix_acl_create() is still a filesystem specific call we can improve
  the S_SIGD stripping situation quite a bit by hoisting it out of
  inode_init_owner() and into the respective vfs creation operations.

  The obvious advantage is that we don't need to rely on individual
  filesystems getting S_ISGID stripping right and instead can
  standardize the ordering between S_ISGID and umask stripping directly
  in the VFS.

  A few short implementation notes:

   - The stripping logic needs to happen in vfs_*() helpers for the sake
     of stacking filesystems such as overlayfs that rely on these
     helpers taking care of S_ISGID stripping.

   - Security hooks have never seen the mode as it is ultimately seen by
     the filesystem because of the ordering issue we mentioned. Nothing
     is changed for them. We simply continue to strip the umask before
     passing the mode down to the security hooks.

   - The following filesystems use inode_init_owner() and thus relied on
     S_ISGID stripping: spufs, 9p, bfs, btrfs, ext2, ext4, f2fs,
     hfsplus, hugetlbfs, jfs, minix, nilfs2, ntfs3, ocfs2, omfs,
     overlayfs, ramfs, reiserfs, sysv, ubifs, udf, ufs, xfs, zonefs,
     bpf, tmpfs.

     We've audited all callchains as best as we could. More details can
     be found in the commit message to 1639a49ccd ("fs: move S_ISGID
     stripping into the vfs_*() helpers")"

* tag 'fs.setgid.v6.0' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux:
  ceph: rely on vfs for setgid stripping
  fs: move S_ISGID stripping into the vfs_*() helpers
  fs: Add missing umask strip in vfs_tmpfile
  fs: add mode_strip_sgid() helper
2022-08-09 09:52:28 -07:00
..
9p 9p: Fix some kernel-doc comments 2022-07-02 18:52:21 +09:00
adfs fs: Convert block_read_full_page() to block_read_full_folio() 2022-05-09 16:21:44 -04:00
affs affs: use memcpy_to_page and remove replace kmap_atomic() 2022-08-01 19:53:31 +02:00
afs Folio changes for 6.0 2022-08-03 10:35:43 -07:00
autofs autofs: remove unused ino field inode 2022-07-17 17:31:42 -07:00
befs befs: Convert befs_symlink_read_folio() to use a folio 2022-08-02 12:34:03 -04:00
bfs fs: Convert block_read_full_page() to block_read_full_folio() 2022-05-09 16:21:44 -04:00
btrfs - The usual batches of cleanups from Baoquan He, Muchun Song, Miaohe 2022-08-05 16:32:45 -07:00
cachefiles cachefiles: narrow the scope of flushed requests when releasing fd 2022-07-05 16:12:21 +01:00
ceph fs.setgid.v6.0 2022-08-09 09:52:28 -07:00
cifs iov_iter stuff, part 2, rebased 2022-08-08 20:04:35 -07:00
coda coda: Convert coda_symlink_filler() to use a folio 2022-08-02 12:34:03 -04:00
configfs configfs: fix a race in configfs_{,un}register_subsystem() 2022-02-22 18:30:28 +01:00
cramfs cramfs: read_mapping_page() is synchronous 2022-08-02 12:34:02 -04:00
crypto fscrypt: Add HCTR2 support for filename encryption 2022-06-10 16:40:18 +08:00
debugfs debugfs: Document that debugfs_create functions need not be error checked 2022-02-25 11:56:13 +01:00
devpts fsnotify: fix fsnotify hooks in pseudo filesystems 2022-01-24 14:17:02 +01:00
dlm fs: dlm: move kref_put assert for lkb structs 2022-08-01 09:31:46 -05:00
ecryptfs ecryptfs: Convert ecryptfs to read_folio 2022-05-09 16:21:45 -04:00
efivarfs efi: vars: Move efivar caching layer into efivarfs 2022-06-24 20:40:19 +02:00
efs efs: Convert efs symlinks to read_folio 2022-05-09 16:21:45 -04:00
erofs - The usual batches of cleanups from Baoquan He, Muchun Song, Miaohe 2022-08-05 16:32:45 -07:00
exfat exfat: Drop superfluous new line for error messages 2022-08-01 10:14:07 +09:00
exportfs exportfs: support idmapped mounts 2022-04-28 16:31:10 +02:00
ext2 - The usual batches of cleanups from Baoquan He, Muchun Song, Miaohe 2022-08-05 16:32:45 -07:00
ext4 - The usual batches of cleanups from Baoquan He, Muchun Song, Miaohe 2022-08-05 16:32:45 -07:00
f2fs f2fs-for-6.0 2022-08-08 11:18:31 -07:00
fat Updates to various subsystems which I help look after. lib, ocfs2, 2022-08-07 10:03:24 -07:00
freevxfs freevxfs: Convert vxfs_immed_read_folio() to use a folio 2022-08-02 12:34:03 -04:00
fscache fscache: Fix invalidation/lookup race 2022-07-05 16:12:55 +01:00
fuse iov_iter stuff, part 2, rebased 2022-08-08 20:04:35 -07:00
gfs2 iov_iter stuff, part 2, rebased 2022-08-08 20:04:35 -07:00
hfs hfs: Remove check for PageError 2022-06-29 08:51:06 -04:00
hfsplus Folio changes for 6.0 2022-08-03 10:35:43 -07:00
hostfs hostfs: Handle page write errors correctly 2022-08-02 12:34:02 -04:00
hpfs hpfs: Convert symlinks to read_folio 2022-05-09 16:21:45 -04:00
hugetlbfs iov_iter stuff, part 2, rebased 2022-08-08 20:04:35 -07:00
iomap new iov_iter flavour - ITER_UBUF 2022-08-08 22:37:15 -04:00
isofs fs/buffer: Combine two submit_bh() and ll_rw_block() arguments 2022-07-14 12:14:32 -06:00
jbd2 - The usual batches of cleanups from Baoquan He, Muchun Song, Miaohe 2022-08-05 16:32:45 -07:00
jffs2 This pull request contains fixes for JFFS2, UBI and UBIFS 2022-06-03 14:42:24 -07:00
jfs Folio changes for 6.0 2022-08-03 10:35:43 -07:00
kernfs kernfs: Fix typo 'the the' in comment 2022-07-28 10:57:25 +02:00
ksmbd 12 server fixes, including for oopses, memory leaks and adding a channel lock 2022-08-08 20:15:13 -07:00
lockd lockd: fix nlm_close_files 2022-07-11 15:49:56 -04:00
minix fs: Convert block_read_full_page() to block_read_full_folio() 2022-05-09 16:21:44 -04:00
netfs netfs: do not unlock and put the folio twice 2022-07-14 10:10:12 +02:00
nfs iov_iter stuff, part 2, rebased 2022-08-08 20:04:35 -07:00
nfs_common
nfsd - The usual batches of cleanups from Baoquan He, Muchun Song, Miaohe 2022-08-05 16:32:45 -07:00
nilfs2 Folio changes for 6.0 2022-08-03 10:35:43 -07:00
nls
notify fsnotify: Fix comment typo 2022-07-26 13:38:47 +02:00
ntfs Folio changes for 6.0 2022-08-03 10:35:43 -07:00
ntfs3 Folio changes for 6.0 2022-08-03 10:35:43 -07:00
ocfs2 fs.setgid.v6.0 2022-08-09 09:52:28 -07:00
omfs fs: Convert block_read_full_page() to block_read_full_folio() 2022-05-09 16:21:44 -04:00
openpromfs fs: allocate inode by using alloc_inode_sb() 2022-03-22 15:57:03 -07:00
orangefs orangefs: Remove test for folio error 2022-06-29 08:51:07 -04:00
overlayfs overlayfs update for 6.0 2022-08-08 11:03:11 -07:00
proc Updates to various subsystems which I help look after. lib, ocfs2, 2022-08-07 10:03:24 -07:00
pstore EFI updates for v5.20 2022-08-03 14:38:02 -07:00
qnx4 fs: Convert block_read_full_page() to block_read_full_folio() 2022-05-09 16:21:44 -04:00
qnx6 fs: Convert mpage_readpage to mpage_read_folio 2022-05-09 16:21:44 -04:00
quota - The usual batches of cleanups from Baoquan He, Muchun Song, Miaohe 2022-08-05 16:32:45 -07:00
ramfs
reiserfs Folio changes for 6.0 2022-08-03 10:35:43 -07:00
romfs romfs: Convert romfs to read_folio 2022-05-09 16:21:46 -04:00
smbfs_common Add various fsctl structs 2022-05-23 20:24:12 -05:00
squashfs Updates to various subsystems which I help look after. lib, ocfs2, 2022-08-07 10:03:24 -07:00
sysfs kobject: kobj_type: remove default_attrs 2022-04-05 15:39:19 +02:00
sysv Not a lot of material this cycle. Many singleton patches against various 2022-05-27 11:22:03 -07:00
tracefs tracefs: Fix syntax errors in comments 2022-06-17 19:01:28 -04:00
ubifs - The usual batches of cleanups from Baoquan He, Muchun Song, Miaohe 2022-08-05 16:32:45 -07:00
udf fs/buffer: Combine two submit_bh() and ll_rw_block() arguments 2022-07-14 12:14:32 -06:00
ufs Folio changes for 6.0 2022-08-03 10:35:43 -07:00
unicode kbuild: unify cmd_copy and cmd_shipped 2022-02-14 10:37:32 +09:00
vboxsf vboxsf: Convert vboxsf to read_folio 2022-05-09 16:21:46 -04:00
verity fs-verity: mention btrfs support 2022-07-15 23:42:30 -07:00
xfs - The usual batches of cleanups from Baoquan He, Muchun Song, Miaohe 2022-08-05 16:32:45 -07:00
zonefs zonefs changes for 5.20-rc1 2022-08-03 15:21:53 -07:00
aio.c iov_iter work, part 1 - isolated cleanups and optimizations. 2022-08-03 13:50:22 -07:00
anon_inodes.c
attr.c vfs: Check the truncate maximum size in inode_newsize_ok() 2022-08-08 10:39:29 -07:00
bad_inode.c
binfmt_aout.c
binfmt_elf_fdpic.c coredump: Snapshot the vmas in do_coredump 2022-03-08 12:55:29 -06:00
binfmt_elf_test.c binfmt_elf: Introduce KUnit test 2022-03-03 20:38:56 -08:00
binfmt_elf.c revert "fs/binfmt_elf: use PT_LOAD p_align values for static PIE" 2022-04-15 14:49:56 -07:00
binfmt_flat.c binfmt_flat: Remove shared library support 2022-04-22 10:57:18 -07:00
binfmt_misc.c Fix regression due to "fs: move binfmt_misc sysctl to its own file" 2022-02-09 09:50:02 -08:00
binfmt_script.c
buffer.c Folio changes for 6.0 2022-08-03 10:35:43 -07:00
char_dev.c
compat_binfmt_elf.c binfmt_elf: Introduce KUnit test 2022-03-03 20:38:56 -08:00
coredump.c fs: do not compare against ->llseek 2022-07-16 09:19:15 -04:00
d_path.c
dax.c - The usual batches of cleanups from Baoquan He, Muchun Song, Miaohe 2022-08-05 16:32:45 -07:00
dcache.c fs/dcache: Move wakeup out of i_seq_dir write held region. 2022-07-30 00:38:16 -04:00
direct-io.c iov_iter: advancing variants of iov_iter_get_pages{,_alloc}() 2022-08-08 22:37:22 -04:00
drop_caches.c
eventfd.c
eventpoll.c epoll: autoremove wakers even more aggressively 2022-07-17 17:31:40 -07:00
exec.c execve updates for v5.20-rc1 2022-08-02 14:36:19 -07:00
fcntl.c keep iocb_flags() result cached in struct file 2022-06-10 16:10:23 -04:00
fhandle.c
file_table.c iov_iter work, part 1 - isolated cleanups and optimizations. 2022-08-03 13:50:22 -07:00
file.c fix the breakage in close_fd_get_file() calling conventions change 2022-06-05 15:03:03 -04:00
filesystems.c
fs_context.c vfs: fs_context: fix up param length parsing in legacy_parse_param 2022-01-18 09:23:19 +02:00
fs_parser.c
fs_pin.c
fs_struct.c
fs_types.c
fs-writeback.c writeback: Fix inode->i_io_list not be protected by inode->i_lock error 2022-06-06 09:54:30 +02:00
fsopen.c uninline may_mount() and don't opencode it in fspick(2)/fsopen(2) 2022-05-19 23:25:10 -04:00
init.c
inode.c fs.setgid.v6.0 2022-08-09 09:52:28 -07:00
internal.h Cleanups (and one fix) around struct mount handling. 2022-06-04 19:00:05 -07:00
ioctl.c Fixes for 5.18-rc1: 2022-04-01 19:35:56 -07:00
Kconfig mm: hugetlb_vmemmap: cleanup CONFIG_HUGETLB_PAGE_FREE_VMEMMAP* 2022-04-28 23:16:15 -07:00
Kconfig.binfmt m68knommu: changes for linux 5.19 2022-05-30 10:56:18 -07:00
kernel_read_file.c fs/kernel_read_file: allow to read files up-to ssize_t 2022-06-16 19:58:21 -07:00
libfs.c fs: Convert simple_readpage to simple_read_folio 2022-05-09 16:21:44 -04:00
locks.c fs/lock: Rearrange ops in flock syscall. 2022-07-18 10:01:47 -04:00
Makefile io_uring: move to separate directory 2022-07-24 18:39:10 -06:00
mbcache.c - The usual batches of cleanups from Baoquan He, Muchun Song, Miaohe 2022-08-05 16:32:45 -07:00
mount.h switch try_to_unlazy_next() to __legitimize_mnt() 2022-07-05 16:18:21 -04:00
mpage.c Folio changes for 6.0 2022-08-03 10:35:43 -07:00
namei.c fs.setgid.v6.0 2022-08-09 09:52:28 -07:00
namespace.c switch try_to_unlazy_next() to __legitimize_mnt() 2022-07-05 16:18:21 -04:00
no-block.c
nsfs.c
open.c iov_iter work, part 1 - isolated cleanups and optimizations. 2022-08-03 13:50:22 -07:00
pipe.c Not a lot of material this cycle. Many singleton patches against various 2022-05-27 11:22:03 -07:00
pnode.c
pnode.h
posix_acl.c acl: make posix_acl_clone() available to overlayfs 2022-07-15 22:09:57 +02:00
proc_namespace.c
read_write.c switch new_sync_{read,write}() to ITER_UBUF 2022-08-08 22:37:15 -04:00
readdir.c
remap_range.c - The usual batches of cleanups from Baoquan He, Muchun Song, Miaohe 2022-08-05 16:32:45 -07:00
select.c select: Fix indefinitely sleeping task in poll_schedule_timeout() 2022-01-11 09:03:05 -08:00
seq_file.c rxrpc: Fix locking issue 2022-05-22 21:03:01 +01:00
signalfd.c Merge branch 'signal-for-v5.17' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace 2022-01-17 05:49:30 +02:00
splice.c iter_to_pipe(): switch to advancing variant of iov_iter_get_pages() 2022-08-08 22:37:23 -04:00
stack.c
stat.c RISC-V Patches for the 5.19 Merge Window, Part 1 2022-05-31 14:10:54 -07:00
statfs.c
super.c fuse update for 6.0 2022-08-08 11:10:02 -07:00
sync.c riscv: compat: syscall: Add compat_sys_call_table implementation 2022-04-26 13:36:25 -07:00
sysctls.c fs: move namespace sysctls and declare fs base directory 2022-01-22 08:33:36 +02:00
timerfd.c
userfaultfd.c - The usual batches of cleanups from Baoquan He, Muchun Song, Miaohe 2022-08-05 16:32:45 -07:00
utimes.c
xattr.c acl: move idmapped mount fixup into vfs_{g,s}etxattr() 2022-07-15 22:08:59 +02:00