linux

mirror of https://github.com/torvalds/linux.git synced 2026-06-08 14:42:37 +02:00

Author	SHA1	Message	Date
David Sterba	4d95b9efd7	btrfs: handle unexpected free-space-tree key types Replace the conditional assertions with proper error handling and transaction abort if we find an unexpected key type in the free space tree. Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: David Sterba <dsterba@suse.com>	2026-04-21 04:02:02 +02:00
Filipe Manana	999757231c	btrfs: fix missing last_unlink_trans update when removing a directory When removing a directory we are not updating its last_unlink_trans field, which can result in incorrect fsync behaviour in case some one fsyncs the directory after it was removed because it's holding a file descriptor on it. Example scenario: mkdir /mnt/dir1 mkdir /mnt/dir1/dir2 mkdir /mnt/dir3 sync -f /mnt # Do some change to the directory and fsync it. chmod 700 /mnt/dir1 xfs_io -c fsync /mnt/dir1 # Move dir2 out of dir1 so that dir1 becomes empty. mv /mnt/dir1/dir2 /mnt/dir3/ open fd on /mnt/dir1 call rmdir(2) on path "/mnt/dir1" fsync fd <trigger power failure> When attempting to mount the filesystem, the log replay will fail with an -EIO error and dmesg/syslog has the following: [445771.626482] BTRFS info (device dm-0): first mount of filesystem 0368bbea-6c5e-44b5-b409-09abe496e650 [445771.626486] BTRFS info (device dm-0): using crc32c checksum algorithm [445771.627912] BTRFS info (device dm-0): start tree-log replay [445771.628335] page: refcount:2 mapcount:0 mapping:0000000061443ddc index:0x1d00 pfn:0x7072a5 [445771.629453] memcg:ffff89f400351b00 [445771.629892] aops:btree_aops [btrfs] ino:1 [445771.630737] flags: 0x17fffc00000402a(uptodate\|lru\|private\|writeback\|node=0\|zone=2\|lastcpupid=0x1ffff) [445771.632359] raw: 017fffc00000402a fffff47284d950c8 fffff472907b7c08 ffff89f458e412b8 [445771.633713] raw: 0000000000001d00 ffff89f6c51d1a90 00000002ffffffff ffff89f400351b00 [445771.635029] page dumped because: eb page dump [445771.635825] BTRFS critical (device dm-0): corrupt leaf: root=5 block=30408704 slot=10 ino=258, invalid nlink: has 2 expect no more than 1 for dir [445771.638088] BTRFS info (device dm-0): leaf 30408704 gen 10 total ptrs 17 free space 14878 owner 5 [445771.638091] BTRFS info (device dm-0): refs 4 lock_owner 0 current 3581087 [445771.638094] item 0 key (256 INODE_ITEM 0) itemoff 16123 itemsize 160 [445771.638097] inode generation 3 transid 9 
size 16 nbytes 16384 [445771.638098] block group 0 mode 40755 links 1 uid 0 gid 0 [445771.638100] rdev 0 sequence 2 flags 0x0 [445771.638102] atime 1775744884.0 [445771.660056] ctime 1775744885.645502983 [445771.660058] mtime 1775744885.645502983 [445771.660060] otime 1775744884.0 [445771.660062] item 1 key (256 INODE_REF 256) itemoff 16111 itemsize 12 [445771.660064] index 0 name_len 2 [445771.660066] item 2 key (256 DIR_ITEM 1843588421) itemoff 16077 itemsize 34 [445771.660068] location key (259 1 0) type 2 [445771.660070] transid 9 data_len 0 name_len 4 [445771.660075] item 3 key (256 DIR_ITEM 2363071922) itemoff 16043 itemsize 34 [445771.660076] location key (257 1 0) type 2 [445771.660077] transid 9 data_len 0 name_len 4 [445771.660078] item 4 key (256 DIR_INDEX 2) itemoff 16009 itemsize 34 [445771.660079] location key (257 1 0) type 2 [445771.660080] transid 9 data_len 0 name_len 4 [445771.660081] item 5 key (256 DIR_INDEX 3) itemoff 15975 itemsize 34 [445771.660082] location key (259 1 0) type 2 [445771.660083] transid 9 data_len 0 name_len 4 [445771.660084] item 6 key (257 INODE_ITEM 0) itemoff 15815 itemsize 160 [445771.660086] inode generation 9 transid 9 size 8 nbytes 0 [445771.660087] block group 0 mode 40777 links 1 uid 0 gid 0 [445771.660088] rdev 0 sequence 2 flags 0x0 [445771.660089] atime 1775744885.641174097 [445771.660090] ctime 1775744885.645502983 [445771.660091] mtime 1775744885.645502983 [445771.660105] otime 1775744885.641174097 [445771.660106] item 7 key (257 INODE_REF 256) itemoff 15801 itemsize 14 [445771.660107] index 2 name_len 4 [445771.660108] item 8 key (257 DIR_ITEM 2676584006) itemoff 15767 itemsize 34 [445771.660109] location key (258 1 0) type 2 [445771.660110] transid 9 data_len 0 name_len 4 [445771.660111] item 9 key (257 DIR_INDEX 2) itemoff 15733 itemsize 34 [445771.660112] location key (258 1 0) type 2 [445771.660113] transid 9 data_len 0 name_len 4 [445771.660114] item 10 key (258 INODE_ITEM 0) itemoff 15573 itemsize 160 
[445771.660115] inode generation 9 transid 10 size 0 nbytes 0 [445771.660116] block group 0 mode 40755 links 2 uid 0 gid 0 [445771.660117] rdev 0 sequence 0 flags 0x0 [445771.660118] atime 1775744885.645502983 [445771.660119] ctime 1775744885.645502983 [445771.660120] mtime 1775744885.645502983 [445771.660121] otime 1775744885.645502983 [445771.660122] item 11 key (258 INODE_REF 257) itemoff 15559 itemsize 14 [445771.660123] index 2 name_len 4 [445771.660124] item 12 key (258 INODE_REF 259) itemoff 15545 itemsize 14 [445771.660125] index 2 name_len 4 [445771.660126] item 13 key (259 INODE_ITEM 0) itemoff 15385 itemsize 160 [445771.660127] inode generation 9 transid 10 size 8 nbytes 0 [445771.660128] block group 0 mode 40755 links 1 uid 0 gid 0 [445771.660129] rdev 0 sequence 1 flags 0x0 [445771.660130] atime 1775744885.645502983 [445771.660130] ctime 1775744885.645502983 [445771.660131] mtime 1775744885.645502983 [445771.660132] otime 1775744885.645502983 [445771.660133] item 14 key (259 INODE_REF 256) itemoff 15371 itemsize 14 [445771.660134] index 3 name_len 4 [445771.660135] item 15 key (259 DIR_ITEM 2676584006) itemoff 15337 itemsize 34 [445771.660136] location key (258 1 0) type 2 [445771.660137] transid 10 data_len 0 name_len 4 [445771.660138] item 16 key (259 DIR_INDEX 2) itemoff 15303 itemsize 34 [445771.660139] location key (258 1 0) type 2 [445771.660140] transid 10 data_len 0 name_len 4 [445771.660144] BTRFS error (device dm-0): block=30408704 write time tree block corruption detected [445771.661650] ------------[ cut here ]------------ [445771.662358] WARNING: fs/btrfs/disk-io.c:326 at btree_csum_one_bio+0x217/0x230 [btrfs], CPU#8: mount/3581087 [445771.663588] Modules linked in: btrfs f2fs xfs (...) 
[445771.671229] CPU: 8 UID: 0 PID: 3581087 Comm: mount Tainted: G W 7.0.0-rc6-btrfs-next-230+ #2 PREEMPT(full) [445771.672575] Tainted: [W]=WARN [445771.672987] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.2-0-gea1b7a073390-prebuilt.qemu.org 04/01/2014 [445771.674460] RIP: 0010:btree_csum_one_bio+0x217/0x230 [btrfs] [445771.675222] Code: 89 44 24 (...) [445771.677364] RSP: 0018:ffffd23882247660 EFLAGS: 00010246 [445771.678029] RAX: 0000000000000000 RBX: ffff89f6c51d1a90 RCX: 0000000000000000 [445771.678975] RDX: 0000000000000000 RSI: 0000000000000001 RDI: ffff89f406020000 [445771.679983] RBP: ffff89f821204000 R08: 0000000000000000 R09: 00000000ffefffff [445771.680905] R10: ffffd23882247448 R11: 0000000000000003 R12: ffffd23882247668 [445771.681978] R13: ffff89f458e40fc0 R14: ffff89f737f4f500 R15: ffff89f737f4f500 [445771.682912] FS: 00007f0447a98840(0000) GS:ffff89fb9771d000(0000) knlGS:0000000000000000 [445771.684393] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [445771.685230] CR2: 00007f0447bf1330 CR3: 000000017cb02002 CR4: 0000000000370ef0 [445771.686273] Call Trace: [445771.686646] <TASK> [445771.686969] btrfs_submit_bbio+0x83f/0x860 [btrfs] [445771.687750] ? write_one_eb+0x28f/0x340 [btrfs] [445771.688428] btree_writepages+0x2e3/0x550 [btrfs] [445771.689180] ? kmem_cache_alloc_noprof+0x12a/0x490 [445771.689963] ? alloc_extent_state+0x19/0x120 [btrfs] [445771.690801] ? kmem_cache_free+0x135/0x380 [445771.691328] ? preempt_count_add+0x69/0xa0 [445771.691831] ? set_extent_bit+0x252/0x8e0 [btrfs] [445771.692468] ? xas_load+0x9/0xc0 [445771.692873] ? xas_find+0x14d/0x1a0 [445771.693304] do_writepages+0xc6/0x160 [445771.693756] filemap_writeback+0xb8/0xe0 [445771.694274] btrfs_write_marked_extents+0x61/0x170 [btrfs] [445771.694999] btrfs_write_and_wait_transaction+0x4e/0xc0 [btrfs] [445771.695818] btrfs_commit_transaction+0x5c8/0xd10 [btrfs] [445771.696530] ? kmem_cache_free+0x135/0x380 [445771.697120] ? 
release_extent_buffer+0x34/0x160 [btrfs] [445771.697786] btrfs_recover_log_trees+0x7be/0x7e0 [btrfs] [445771.698525] ? __pfx_replay_one_buffer+0x10/0x10 [btrfs] [445771.699206] open_ctree+0x11e5/0x1810 [btrfs] [445771.699776] btrfs_get_tree.cold+0xb/0x162 [btrfs] [445771.700463] ? fscontext_read+0x165/0x180 [445771.701146] ? rw_verify_area+0x50/0x180 [445771.701866] vfs_get_tree+0x25/0xd0 [445771.702491] vfs_cmd_create+0x59/0xe0 [445771.703125] __do_sys_fsconfig+0x303/0x610 [445771.703603] do_syscall_64+0xe9/0xf20 [445771.703974] entry_SYSCALL_64_after_hwframe+0x76/0x7e [445771.704700] RIP: 0033:0x7f0447cbd4aa [445771.705108] Code: 73 01 c3 (...) [445771.707263] RSP: 002b:00007ffc4e528318 EFLAGS: 00000246 ORIG_RAX: 00000000000001af [445771.708107] RAX: ffffffffffffffda RBX: 00005561585d8c20 RCX: 00007f0447cbd4aa [445771.708931] RDX: 0000000000000000 RSI: 0000000000000006 RDI: 0000000000000003 [445771.709744] RBP: 00005561585d9120 R08: 0000000000000000 R09: 0000000000000000 [445771.710674] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000 [445771.711477] R13: 00007f0447e4f580 R14: 00007f0447e5126c R15: 00007f0447e36a23 [445771.712277] </TASK> [445771.712541] ---[ end trace 0000000000000000 ]--- [445771.713382] BTRFS error (device dm-0): error while writing out transaction: -5 [445771.714679] BTRFS warning (device dm-0): Skipping commit of aborted transaction. [445771.715562] BTRFS error (device dm-0 state A): Transaction aborted (error -5) [445771.716459] BTRFS: error (device dm-0 state A) in cleanup_transaction:2068: errno=-5 IO failure [445771.717936] BTRFS error (device dm-0 state EA): failed to recover log trees with error: -5 [445771.719681] BTRFS error (device dm-0 state EA): open_ctree failed: -5 The problem is that such a fsync should have result in a fallback to a transaction commit, but that did not happen because through the btrfs_rmdir() we never update the directory's last_unlink_trans field. 
Any inode that had a link removed must have its last_unlink_trans updated to the ID of transaction used for the operation, otherwise fsync and log replay will not work correctly. btrfs_rmdir() calls btrfs_unlink_inode() and through that call chain we never call btrfs_record_unlink_dir() in order to update last_unlink_trans. However btrfs_unlink(), which is used for unlinking regular files, calls btrfs_record_unlink_dir() and then calls btrfs_unlink_inode(). So fix this by moving the call to btrfs_record_unlink_dir() from btrfs_unlink() to btrfs_unlink_inode(). A test case for fstests will follow soon. Reported-by: Slava0135 <slava.kovalevskiy.2014@gmail.com> Link: https://lore.kernel.org/linux-btrfs/CAAJYhww5ov62Hm+n+tmhcL-e_4cBobg+OWogKjOJxVUXivC=MQ@mail.gmail.com/ CC: stable@vger.kernel.org Signed-off-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2026-04-21 04:01:48 +02:00
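The restructuring described in the commit above can be pictured with a toy model. This is only a sketch: `toy_trans`, `toy_inode` and every `toy_*` function are hypothetical stand-ins, not the kernel's types or functions, and the real btrfs_record_unlink_dir() does more than store a transaction ID. The point is the shape of the fix: once the shared helper records the transaction itself, both callers get the update.

```c
#include <assert.h>
#include <stdint.h>

/* Toy stand-ins (assumptions for illustration, not kernel structures). */
struct toy_trans { uint64_t transid; };
struct toy_inode { uint64_t last_unlink_trans; unsigned int nlink; };

/* Plays the role of btrfs_record_unlink_dir(): remember the unlinking transaction. */
void toy_record_unlink(const struct toy_trans *trans, struct toy_inode *inode)
{
	inode->last_unlink_trans = trans->transid;
}

/* Old shape: the shared helper only drops the link. */
void toy_unlink_inode_old(const struct toy_trans *trans, struct toy_inode *inode)
{
	(void)trans;
	assert(inode->nlink > 0);
	inode->nlink--;
}

/* New shape: the shared helper records the transaction itself. */
void toy_unlink_inode_new(const struct toy_trans *trans, struct toy_inode *inode)
{
	assert(inode->nlink > 0);
	toy_record_unlink(trans, inode);
	inode->nlink--;
}

/* Old file-unlink path: the caller records, then calls the helper. */
void toy_unlink_old(const struct toy_trans *trans, struct toy_inode *inode)
{
	toy_record_unlink(trans, inode);
	toy_unlink_inode_old(trans, inode);
}

/* Old rmdir path: only the helper is called, so nothing is recorded. */
void toy_rmdir_old(const struct toy_trans *trans, struct toy_inode *dir)
{
	toy_unlink_inode_old(trans, dir);
}

/* New paths: both simply go through the recording helper. */
void toy_unlink_new(const struct toy_trans *trans, struct toy_inode *inode)
{
	toy_unlink_inode_new(trans, inode);
}

void toy_rmdir_new(const struct toy_trans *trans, struct toy_inode *dir)
{
	toy_unlink_inode_new(trans, dir);
}
```

In this model, a directory removed through `toy_rmdir_old()` keeps a stale `last_unlink_trans`, while `toy_rmdir_new()` leaves it equal to the transaction ID, which is the property the commit message says fsync and log replay rely on.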
Mark Harmstone	44366af740	btrfs: don't clobber errors in add_remap_tree_entries() In add_remap_tree_entries(), we only process a certain number of entries at a time, meaning we may need to loop. But because we weren't checking the return value of btrfs_insert_empty_items() within the loop, this meant that if the last iteration of the loop succeeded but a previous iteration failed, we were erroneously returning 0. Fix this by breaking the loop early if btrfs_insert_empty_items() fails. Fixes: `b56f35560b` ("btrfs: handle setting up relocation of block group with remap-tree") Signed-off-by: Mark Harmstone <mark@harmstone.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2026-04-21 04:01:43 +02:00
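The error-clobbering pattern described in the commit above can be reproduced in a few lines. This is a minimal sketch, not the code from add_remap_tree_entries(): `insert_batch()` is a hypothetical stand-in for btrfs_insert_empty_items(), and `BATCH_SIZE` and the `failing_batch` parameter exist only to make the failure reproducible.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define BATCH_SIZE 4	/* hypothetical per-iteration limit */

/* Hypothetical stand-in for btrfs_insert_empty_items(): fails on one chosen batch. */
int insert_batch(size_t batch_index, size_t failing_batch)
{
	return batch_index == failing_batch ? -ENOMEM : 0;
}

/* Buggy shape: ret is overwritten each iteration, so only the last result survives. */
int add_entries_clobbering(size_t nr_entries, size_t failing_batch)
{
	int ret = 0;
	size_t batch = 0;

	while (nr_entries > 0) {
		size_t n = nr_entries < BATCH_SIZE ? nr_entries : BATCH_SIZE;

		assert(n > 0 && n <= BATCH_SIZE);
		ret = insert_batch(batch++, failing_batch);
		/* No check here: an earlier failure is lost if a later batch succeeds. */
		nr_entries -= n;
	}
	return ret;
}

/* Fixed shape: leave the loop as soon as one batch fails. */
int add_entries_fixed(size_t nr_entries, size_t failing_batch)
{
	int ret = 0;
	size_t batch = 0;

	while (nr_entries > 0) {
		size_t n = nr_entries < BATCH_SIZE ? nr_entries : BATCH_SIZE;

		assert(n > 0 && n <= BATCH_SIZE);
		ret = insert_batch(batch++, failing_batch);
		if (ret)
			break;
		nr_entries -= n;
	}
	return ret;
}
```

With ten entries (three batches) and a failure injected into the first batch, the clobbering variant returns 0 while the fixed variant returns the error.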
Qu Wenruo	41e706c07e	btrfs: enable shutdown ioctl for non-experimental builds Although commit `304076527c` ("btrfs: move shutdown and remove_bdev callbacks out of experimental features") tries to move both shutdown and remove_bdev out of experimental features, that commit has only addressed the super block operation callback, the ioctl one is left untouched. Fix that missing aspect by also moving shutdown ioctl out of experimental features. Since we're here, also add unknown flag detection to reject any unsupported shutdown flags. Fixes: `304076527c` ("btrfs: move shutdown and remove_bdev callbacks out of experimental features") Signed-off-by: Qu Wenruo <wqu@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2026-04-21 04:01:31 +02:00
Qu Wenruo	a86a283430	btrfs: apply first key check for readahead when possible Currently for tree block readahead we never pass a btrfs_tree_parent_check with @has_first_key set. Without @has_first_key set, btrfs will skip the following extra checks: - Header generation check This is a minor one. - Empty leaf/node checks This is more serious, for certain trees like the csum tree, they are allowed to be empty, thus an empty leaf can pass the tree checker. But if there is a parent node for such an empty leaf, it indicates corruption. Without @has_first_key set, we can no longer detect such a problem. In fact there is already a fuzzed image report that a corrupted csum leaf which has zero nritems but still has a parent node can trigger a BUG_ON() during csum deletion. However there are only two call sites of btrfs_readahead_tree_block(): - Inside relocate_tree_blocks() At this call site we are trying to grab the first key of the tree block, thus we are not able to pass a @first_key parameter. - Inside btrfs_readahead_node_child() This is the more common call site, where we have the parent node and want to readahead the child tree blocks. In this case we can easily grab the node key and pass it for checks. Add a new parameter @first_key to btrfs_readahead_tree_block() and pass the node key to it inside btrfs_readahead_node_child(). This should plug the gap in empty leaf detection during readahead. Link: https://lore.kernel.org/linux-btrfs/20260409071255.3358044-1-gality369@gmail.com/ Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: Qu Wenruo <wqu@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2026-04-21 04:01:24 +02:00
Mark Harmstone	73db0fad67	btrfs: abort transaction in do_remap_reloc_trans() on failure If one of the calls made by do_remap_reloc_trans() fails, we can leave the remap tree in an inconsistent state. Abort the transaction if this happens, to prevent the corrupt state from reaching the disk. Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Mark Harmstone <mark@harmstone.com> Signed-off-by: David Sterba <dsterba@suse.com>	2026-04-21 04:00:52 +02:00
Mark Harmstone	9b8824533d	btrfs: fix bytes_may_use leak in do_remap_reloc_trans() If the call to btrfs_reserve_extent() in do_remap_reloc_trans() returns a smaller extent than we asked for, currently we're not undoing the bytes_may_use change that we made. Fix this by calling btrfs_space_info_update_bytes_may_use() again for the difference. Fixes: `fd6594b144` ("btrfs: replace identity remaps with actual remaps when doing relocations") Reviewed-by: Boris Burkov <boris@bur.io> Signed-off-by: Mark Harmstone <mark@harmstone.com> Signed-off-by: David Sterba <dsterba@suse.com>	2026-04-21 04:00:39 +02:00
Mark Harmstone	68a135013b	btrfs: fix bytes_may_use leak in move_existing_remap() If the call to btrfs_reserve_extent() in move_existing_remap() returns a smaller extent than we asked for, currently we're not undoing the bytes_may_use change that we made. Fix this by calling btrfs_space_info_update_bytes_may_use() again for the difference. Fixes: `bbea42dfb9` ("btrfs: move existing remaps before relocating block group") Reviewed-by: Boris Burkov <boris@bur.io> Signed-off-by: Mark Harmstone <mark@harmstone.com> Signed-off-by: David Sterba <dsterba@suse.com>	2026-04-21 04:00:32 +02:00
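The two bytes_may_use fixes above share one accounting pattern, sketched here as a toy model. Everything in it is an assumption made for illustration: `toy_space_info`, `toy_reserve_extent()` and the `max_grant` parameter are hypothetical, and the real btrfs space accounting (including what btrfs_reserve_extent() and btrfs_space_info_update_bytes_may_use() actually do) is more involved.

```c
#include <assert.h>
#include <stdint.h>

/* Toy counters (assumption: real btrfs space_info tracks far more state). */
struct toy_space_info {
	uint64_t bytes_may_use;
	uint64_t bytes_reserved;
};

/*
 * Hypothetical allocator: grants at most max_grant bytes and moves the
 * granted amount from bytes_may_use to bytes_reserved.
 */
uint64_t toy_reserve_extent(struct toy_space_info *si, uint64_t wanted,
			    uint64_t max_grant)
{
	uint64_t got = wanted < max_grant ? wanted : max_grant;

	assert(si->bytes_may_use >= got);
	si->bytes_may_use -= got;
	si->bytes_reserved += got;
	return got;
}

/* Leaky shape: the part of the request that was not granted stays in bytes_may_use. */
uint64_t toy_reserve_leaky(struct toy_space_info *si, uint64_t wanted,
			   uint64_t max_grant)
{
	si->bytes_may_use += wanted;
	return toy_reserve_extent(si, wanted, max_grant);
}

/* Fixed shape: give back the difference when the extent is smaller than requested. */
uint64_t toy_reserve_fixed(struct toy_space_info *si, uint64_t wanted,
			   uint64_t max_grant)
{
	uint64_t got;

	si->bytes_may_use += wanted;
	got = toy_reserve_extent(si, wanted, max_grant);
	if (got < wanted)
		si->bytes_may_use -= wanted - got;
	return got;
}
```

Requesting 100 bytes when only 60 can be granted leaves 40 bytes stranded in `bytes_may_use` in the leaky variant and 0 in the fixed one.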
Linus Torvalds	b4e07588e7	tracing: tell git to ignore the generated 'undefsyms_base.c' file This odd file was added to automatically figure out tool-generated symbols. Honestly, it should have been just a real honest-to-goodness regular file in git, instead of having strange code to generate it in the Makefile, but that is not how that silly thing works. So now we need to ignore it explicitly. Fixes: `1211907ac0` ("tracing: Generate undef symbols allowlist for simple_ring_buffer") Cc: Vincent Donnefort <vdonnefort@google.com> Cc: Nathan Chancellor <nathan@kernel.org> Cc: Steven Rostedt (Google) <rostedt@goodmis.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Marc Zyngier <maz@kernel.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2026-04-20 17:25:56 -07:00
Linus Torvalds	f154634e42	linux_kselftest-next-7.1-next-fixes Fixes regressions in non-bash shells and busybox support, and reverts a commit that regressed in build and installation when one or more tests fail to build. Fixes duplicated test number reporting introduced in ktap support patch. - selftests: Fix duplicated test number reporting - selftests: Fix runner.sh for non-bash shells - selftests: Fix runner.sh busybox support - selftests: Deescalate error reporting Merge tag 'linux_kselftest-next-7.1-next-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest Pull kselftest fixes from Shuah Khan: "Fix regressions in non-bash shells and busybox support, and revert a commit that regressed in build and installation when one or more tests fail to build. 
Fix duplicated test number reporting introduced in ktap support patch" * tag 'linux_kselftest-next-7.1-next-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest: selftests: Fix duplicated test number reporting selftests: Fix runner.sh for non-bash shells selftests: Fix runner.sh busybox support selftests: Deescalate error reporting	2026-04-20 17:19:30 -07:00
Linus Torvalds	13f24586a2	arm64 updates for 7.1 (second round): Core features: - Add workaround for C1-Pro erratum 4193714 - early CME (SME unit) DVMSync acknowledgement. The fix consists of sending IPIs on TLB maintenance to those CPUs running in user space with SME enabled - Include kernel-hwcap.h in list of generated files (missed in a recent commit generating the KERNEL_HWCAP_* macros) CCA: - Fix RSI_INCOMPLETE error check in arm-cca-guest MPAM: - Fix an unmount->remount problem with the CDP emulation, uninitialised variable and checker warnings Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull more arm64 updates from Catalin Marinas: "The main 'feature' is a workaround for C1-Pro erratum 4193714 requiring IPIs during TLB maintenance if a process is running in user space with SME enabled. The hardware acknowledges the DVMSync messages before completing in-flight SME accesses, with security implications. The workaround makes use of the mm_cpumask() to track the cores that need interrupting (arm64 hasn't used this mask before). 
The rest are fixes for MPAM, CCA and generated header that turned up during the merging window or shortly before. Summary: Core features: - Add workaround for C1-Pro erratum 4193714 - early CME (SME unit) DVMSync acknowledgement. The fix consists of sending IPIs on TLB maintenance to those CPUs running in user space with SME enabled - Include kernel-hwcap.h in list of generated files (missed in a recent commit generating the KERNEL_HWCAP_* macros) CCA: - Fix RSI_INCOMPLETE error check in arm-cca-guest MPAM: - Fix an unmount->remount problem with the CDP emulation, uninitialised variable and checker warnings" * tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: arm_mpam: resctrl: Make resctrl_mon_ctx_waiters static arm_mpam: resctrl: Fix the check for no monitor components found arm_mpam: resctrl: Fix MBA CDP alloc_capable handling on unmount virt: arm-cca-guest: fix error check for RSI_INCOMPLETE arm64/hwcap: Include kernel-hwcap.h in list of generated files arm64: errata: Work around early CME DVMSync acknowledgement arm64: cputype: Add C1-Pro definitions arm64: tlb: Pass the corresponding mm to __tlbi_sync_s1ish() arm64: tlb: Introduce __tlbi_sync_s1ish_{kernel,batch}() for TLB maintenance	2026-04-20 16:46:22 -07:00
Linus Torvalds	ce9e93383a	sh updates for v7.1 - sh: Drop CONFIG_FIRMWARE_EDID from defconfig files - sh: Remove CONFIG_VSYSCALL reference from UAPI - sh: Fix typo in SPDX license ID lines - sh: Include <linux/io.h> in dac.h Merge tag 'sh-for-v7.1-tag1' of git://git.kernel.org/pub/scm/linux/kernel/git/glaubitz/sh-linux Pull sh updates from John Paul Adrian Glaubitz: "Two patches from Thomas Zimmermann, one by Tim Bird and one by Thomas Weißschuh. The first patch by Thomas Zimmermann adds a missing include in dac.h for SH-3 which became necessary after `243ce64b2b` ("backlight: Do not include <linux/fb.h> in header file") which made __raw_readb() and __raw_writeb() inaccessible in dac.h. Thomas' second patch drops CONFIG_FIRMWARE_EDID for SH as it depends on X86 or EFI_GENERIC_STUB which are not defined on SH for obvious reasons. The patch by Tim Bird fixes just a small typo in two SPDX ID lines which he stumbled over by accident. And, least but not last, the patch by Thomas Weißschuh removes the CONFIG_VSYSCALL reference from UAPI. 
This was necessary as the definition of AT_SYSINFO_EHDR was gated between CONFIG_VSYSCALL to avoid a default gate VMA to be created. However that default gate VMA was removed entirely in commit `a6c19dfe39` (arm64,ia64,ppc,s390, sh,tile,um,x86,mm: remove default gate area)" * tag 'sh-for-v7.1-tag1' of git://git.kernel.org/pub/scm/linux/kernel/git/glaubitz/sh-linux: sh: Drop CONFIG_FIRMWARE_EDID from defconfig files sh: Remove CONFIG_VSYSCALL reference from UAPI sh: Fix typo in SPDX license ID lines sh: Include <linux/io.h> in dac.h	2026-04-20 16:41:19 -07:00
Linus Torvalds	065c4e67cc	Mostly cleanups and small things, notably: - musl libc compatibility - vDSO installation fix - TLB sync race fix for recent SMP support - build fix for 32-bit with Clang 20/21 Merge tag 'uml-for-7.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/uml/linux Pull uml updates from Johannes Berg: "Mostly cleanups and small things, notably: - musl libc compatibility - vDSO installation fix - TLB sync race fix for recent SMP support - build fix for 32-bit with Clang 20/21" * tag 'uml-for-7.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/uml/linux: um: Disable GCOV_PROFILE_ALL on 32-bit UML with Clang 20/21 um: drivers: call kernel_strrchr() explicitly in cow_user.c um: Replace strncpy() with strnlen()+memcpy_and_pad() in strncpy_chunk_from_user() x86/um: fix vDSO installation um: Remove CONFIG_FRAME_WARN from x86_64_defconfig um: Fix pte_read() and pte_exec() for kernel mappings um: Fix potential race condition in TLB sync um: time-travel: clean up kernel-doc warnings um: avoid struct sigcontext redefinition with musl um: fix address-of CMSG_DATA() rvalue in stub	2026-04-20 16:36:46 -07:00
Linus Torvalds	b66cb4f156	printk changes for 7.1 Merge tag 'printk-for-7.1' of git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux Pull printk updates from Petr Mladek: - Fix printk ring buffer initialization and sanity checks - Workaround printf kunit test compilation with gcc < 12.1 - Add IPv6 address printf format tests - Misc code and documentation cleanup * tag 'printk-for-7.1' of git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux: printf: Compile the kunit test with DISABLE_BRANCH_PROFILING DISABLE_BRANCH_PROFILING lib/vsprintf: use bool for local decode variable lib/hexdump: print_hex_dump_bytes() calls print_hex_dump_debug() printk: ringbuffer: fix errors in comments printk_ringbuffer: Add sanity check for 0-size data printk_ringbuffer: Fix get_data() size sanity check printf: add IPv6 address format tests printk: Fix _DESCS_COUNT type for 64-bit systems	2026-04-20 15:42:18 -07:00
Linus Torvalds	ccbc9fdb32	Fix timer stalls caused by incorrect handling of the dev->next_event_forced flag. Signed-off-by: Ingo Molnar <mingo@kernel.org> Merge tag 'timers-urgent-2026-04-20' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull timer fix from Ingo Molnar: "Fix timer stalls caused by incorrect handling of the dev->next_event_forced flag" * tag 'timers-urgent-2026-04-20' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: clockevents: Add missing resets of the next_event_forced flag	2026-04-20 15:30:08 -07:00
Keenan Dong	3bfdc63936	rtmutex: Use waiter::task instead of current in remove_waiter() remove_waiter() is used by the slowlock paths, but it is also used for proxy-lock rollback in rt_mutex_start_proxy_lock() when invoked from futex_requeue(). In the latter case waiter::task is not current, but remove_waiter() operates on current for the dequeue operation. That results in several problems: 1) the rbtree dequeue happens without waiter::task::pi_lock being held 2) the waiter task's pi_blocked_on state is not cleared, which leaves a dangling pointer primed for UAF around. 3) rt_mutex_adjust_prio_chain() operates on the wrong top priority waiter task Use waiter::task instead of current in all related operations in remove_waiter() to cure those problems. [ tglx: Fixup rt_mutex_adjust_prio_chain(), add a comment and amend the changelog ] Fixes: `8161239a8b` ("rtmutex: Simplify PI algorithm and make highest prio task get lock") Reported-by: Yuan Tan <yuantan098@gmail.com> Reported-by: Yifan Wu <yifanwucs@gmail.com> Reported-by: Juefei Pu <tomapufckgml@gmail.com> Reported-by: Xin Liu <bird@lzu.edu.cn> Signed-off-by: Keenan Dong <keenanat2000@gmail.com> Signed-off-by: Thomas Gleixner <tglx@kernel.org> Cc: stable@vger.kernel.org	2026-04-21 00:22:31 +02:00
Linus Torvalds	65e9974ae2	Remove the unused ARCH_SYSCALL_WORK_{ENTER,EXIT} flags. Signed-off-by: Ingo Molnar <mingo@kernel.org> Merge tag 'core-urgent-2026-04-20' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull entry cleanup from Ingo Molnar: "Remove the unused ARCH_SYSCALL_WORK_{ENTER,EXIT} flags" * tag 'core-urgent-2026-04-20' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: entry: Kill ARCH_SYSCALL_WORK_{ENTER,EXIT}	2026-04-20 15:07:28 -07:00
Sean Christopherson	dfd2a8b07c	KVM: selftests: Replace "paddr" with "gpa" throughout Replace all variations of "paddr" variables in KVM selftests with "gpa", with the exception of the ELF structures, as those fields are not specific to guest virtual addresses, to complete the conversion from vm_paddr_t to gpa_t. No functional change intended. Link: https://patch.msgid.link/20260420212004.3938325-20-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>	2026-04-20 14:54:17 -07:00
Sean Christopherson	abc374191d	KVM: selftests: Replace "u64 nested_paddr" with "gpa_t l2_gpa" In x86's nested TDP APIs, use the appropriate gpa_t typedef and rename variables from nested_paddr to l2_gpa to match KVM x86's nomenclature. No functional change intended. Link: https://patch.msgid.link/20260420212004.3938325-19-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>	2026-04-20 14:54:17 -07:00
Sean Christopherson	df079910f9	KVM: selftests: Replace "u64 gpa" with "gpa_t" throughout Use gpa_t instead of u64 for obvious declarations of GPA variables. No functional change intended. Link: https://patch.msgid.link/20260420212004.3938325-18-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>	2026-04-20 14:54:17 -07:00
Sean Christopherson	014dfb7b9b	KVM: selftests: Replace "vaddr" with "gva" throughout Replace all variations of "vaddr" variables in KVM selftests with "gva", with the exception of the ELF structures, as those fields are not specific to guest virtual addresses, to complete the conversion from vm_vaddr_t to gva_t. Opportunistically use gva_t instead of u64 for relevant variables, and fixup indentation as appropriate. No functional change intended. Link: https://patch.msgid.link/20260420212004.3938325-17-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>	2026-04-20 14:54:17 -07:00
Sean Christopherson	a662c4e038	KVM: selftests: Clarify that arm64's inject_uer() takes a host PA, not a guest PA Rename inject_uer()'s @paddr to @hpa to make it more obvious that it injects an error using a host PA, not a guest PA. No functional change intended. Link: https://patch.msgid.link/20260420212004.3938325-16-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>	2026-04-20 14:54:17 -07:00
Sean Christopherson	4babae4ca1	KVM: selftests: Rename translate_to_host_paddr() => translate_hva_to_hpa() Rename arm64's translate_to_host_paddr() to translate_hva_to_hpa() and update variable names to match, as using "vaddr" and "paddr" terminology is super confusing due to selftests using those exact names for guest addresses. Opportunistically drop superfluous local page_addr and paddr variables. No functional change intended. Link: https://patch.msgid.link/20260420212004.3938325-15-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>	2026-04-20 14:54:17 -07:00
Sean Christopherson	3fd995905b	KVM: selftests: Rename vm_vaddr_populate_bitmap() => vm_populate_gva_bitmap() Now that KVM selftests use gva_t instead of vm_vaddr_t, rename the helper for populating the initial GVA bitmap to drop the defunct terminology and use "vm" for the scope. Opportunistically fixup the declaration of the API, which has been broken since day 1. The flaw went unnoticed because the sole caller is defined after the weak version, i.e. can see the prototype without a previous declaration. No functional change intended. Fixes: `e8b9a055fa` ("KVM: arm64: selftests: Align VA space allocator with TTBR0") Link: https://patch.msgid.link/20260420212004.3938325-14-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>	2026-04-20 14:54:17 -07:00
Sean Christopherson	48321f609a	KVM: selftests: Rename vm_vaddr_unused_gap() => vm_unused_gva_gap() Now that KVM selftests use gva_t instead of vm_vaddr_t, rename the API for finding an unused range of virtual memory to drop the defunct terminology and use "vm" for the scope. Opportunistically clean up the function comment to drop superfluous and redundant information. No functional change intended. Link: https://patch.msgid.link/20260420212004.3938325-13-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>	2026-04-20 14:54:17 -07:00
Sean Christopherson	85819fa0e3	KVM: selftests: Drop "vaddr_" from APIs that allocate memory for a given VM Now that KVM selftests use gva_t instead of vm_vaddr_t, drop "vaddr_" from the core memory allocation APIs as the information is extraneous and does more harm than good. E.g. the APIs don't _just_ allocate virtual memory, they allocate backing physical memory and install mappings in the guest page tables. And as proven by kmalloc() and malloc(), developers generally expect that allocations come with a working virtual address. Opportunistically clean up the function comment for vm_alloc(), and drop the misleading and superfluous comments for its wrappers. No functional change intended. Link: https://patch.msgid.link/20260420212004.3938325-12-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>	2026-04-20 14:54:17 -07:00
David Matlack	6ec982b5a2	KVM: selftests: Use u8 instead of uint8_t Use u8 instead of uint8_t to make the KVM selftests code more concise and more similar to the kernel (since selftests are primarily developed by kernel developers). This commit was generated with the following command: git ls-files tools/testing/selftests/kvm \| xargs sed -i 's/uint8_t/u8/g' Then by manually adjusting whitespace to make checkpatch.pl happy. No functional change intended. Signed-off-by: David Matlack <dmatlack@google.com> Link: https://patch.msgid.link/20260420212004.3938325-11-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>	2026-04-20 14:54:17 -07:00
David Matlack	2540ebd603	KVM: selftests: Use s16 instead of int16_t Use s16 instead of int16_t to make the KVM selftests code more concise and more similar to the kernel (since selftests are primarily developed by kernel developers). This commit was generated with the following command: git ls-files tools/testing/selftests/kvm \| xargs sed -i 's/int16_t/s16/g' Then by manually adjusting whitespace to make checkpatch.pl happy. No functional change intended. Signed-off-by: David Matlack <dmatlack@google.com> Link: https://patch.msgid.link/20260420212004.3938325-10-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>	2026-04-20 14:54:17 -07:00
David Matlack	19d0914920	KVM: selftests: Use u16 instead of uint16_t Use u16 instead of uint16_t to make the KVM selftests code more concise and more similar to the kernel (since selftests are primarily developed by kernel developers). This commit was generated with the following command: git ls-files tools/testing/selftests/kvm \| xargs sed -i 's/uint16_t/u16/g' Then by manually adjusting whitespace to make checkpatch.pl happy. No functional change intended. Signed-off-by: David Matlack <dmatlack@google.com> Link: https://patch.msgid.link/20260420212004.3938325-9-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>	2026-04-20 14:54:17 -07:00
David Matlack	7b60918768	KVM: selftests: Use s32 instead of int32_t Use s32 instead of int32_t to make the KVM selftests code more concise and more similar to the kernel (since selftests are primarily developed by kernel developers). This commit was generated with the following command: git ls-files tools/testing/selftests/kvm \| xargs sed -i 's/int32_t/s32/g' Then by manually adjusting whitespace to make checkpatch.pl happy. No functional change intended. Signed-off-by: David Matlack <dmatlack@google.com> Link: https://patch.msgid.link/20260420212004.3938325-8-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>	2026-04-20 14:54:16 -07:00
David Matlack	0c3a877469	KVM: selftests: Use u32 instead of uint32_t Use u32 instead of uint32_t to make the KVM selftests code more concise and more similar to the kernel (since selftests are primarily developed by kernel developers). This commit was generated with the following command: git ls-files tools/testing/selftests/kvm \| xargs sed -i 's/uint32_t/u32/g' Then by manually adjusting whitespace to make checkpatch.pl happy. No functional change intended. Signed-off-by: David Matlack <dmatlack@google.com> Link: https://patch.msgid.link/20260420212004.3938325-7-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>	2026-04-20 14:54:16 -07:00
David Matlack	286e8903ae	KVM: selftests: Use s64 instead of int64_t Use s64 instead of int64_t to make the KVM selftests code more concise and more similar to the kernel (since selftests are primarily developed by kernel developers). This commit was generated with the following command: git ls-files tools/testing/selftests/kvm \| xargs sed -i 's/int64_t/s64/g' Then by manually adjusting whitespace to make checkpatch.pl happy. No functional change intended. Signed-off-by: David Matlack <dmatlack@google.com> Link: https://patch.msgid.link/20260420212004.3938325-6-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>	2026-04-20 14:54:16 -07:00
David Matlack	26f8453288	KVM: selftests: Use u64 instead of uint64_t Use u64 instead of uint64_t to make the KVM selftests code more concise and more similar to the kernel (since selftests are primarily developed by kernel developers). This commit was generated with the following command: git ls-files tools/testing/selftests/kvm \| xargs sed -i 's/uint64_t/u64/g' Then by manually adjusting whitespace to make checkpatch.pl happy. Include <linux/types.h> in include/kvm_util_types.h, include/test_util.h, and include/x86/pmu.h to pick up the tools-defined u64. Arguably, all headers (especially kvm_util_types.h) should have already been including stdint.h to get uint64_t from the libc headers, but the missing dependency only rears its head once KVM uses u64 instead of uint64_t. No functional change intended. Signed-off-by: David Matlack <dmatlack@google.com> [sean: rename pread_uint64() => pread_u64, expand on types.h include] Link: https://patch.msgid.link/20260420212004.3938325-5-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>	2026-04-20 14:54:16 -07:00
David Matlack	6d3494255a	KVM: selftests: Use gpa_t for GPAs in Hyper-V selftests Fix various Hyper-V selftests to use gpa_t for variables that contain guest physical addresses, rather than gva_t. In practice, the bugs are benign as both gva_t and gpa_t are u64 typedefs, i.e. gpa_t and gva_t are interchangeable from a functional perspective, the code is just confusing. No functional change intended. Signed-off-by: David Matlack <dmatlack@google.com> [sean: call out that both are u64 typedefs] Link: https://patch.msgid.link/20260420212004.3938325-4-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>	2026-04-20 14:54:16 -07:00
David Matlack	97dcda3fdc	KVM: selftests: Use gpa_t instead of vm_paddr_t Replace all occurrences of vm_paddr_t with gpa_t to align with KVM code and with the conversion helpers (e.g. addr_hva2gpa()). This commit was generated with the following command: git ls-files tools/testing/selftests/kvm \| xargs sed -i 's/vm_paddr_/gpa_/g' Then by manually adjusting whitespace to make checkpatch.pl happy. No functional change intended. Signed-off-by: David Matlack <dmatlack@google.com> [sean: drop bogus changelog blurb about renaming functions] Link: https://patch.msgid.link/20260420212004.3938325-3-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>	2026-04-20 14:54:16 -07:00
David Matlack	5567fc9dcd	KVM: selftests: Use gva_t instead of vm_vaddr_t Replace all occurrences of vm_vaddr_t with gva_t to align with KVM code and with the conversion helpers (e.g. addr_gva2hva()). This commit was generated with the following command: git ls-files tools/testing/selftests/kvm \| xargs sed -i 's/vm_vaddr_/gva_/g' Then by manually adjusting whitespace to make checkpatch.pl happy, and dropping renames of functions that allocate memory within a given VM. No functional change intended. Signed-off-by: David Matlack <dmatlack@google.com> [sean: drop renames of allocator APIs] Link: https://patch.msgid.link/20260420212004.3938325-2-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>	2026-04-20 14:54:16 -07:00
Fernando Fernandez Mancera	711987ba28	netfilter: nfnetlink_osf: fix potential NULL dereference in ttl check The nf_osf_ttl() function accessed skb->dev to perform a local interface address lookup without verifying that the device pointer was valid. Additionally, the implementation utilized an in_dev_for_each_ifa_rcu loop to match the packet source address against local interface addresses. It assumed that packets from the same subnet should not see a decrement on the initial TTL. A packet might appear to be from the same subnet when it actually is not, especially in modern environments with containers and virtual switching. Remove the device dereference and interface loop. Replace the logic with a switch statement that evaluates the TTL according to the ttl_check. Fixes: `11eeef41d5` ("netfilter: passive OS fingerprint xtables match") Reported-by: Kito Xu (veritas501) <hxzene@gmail.com> Closes: https://lore.kernel.org/netfilter-devel/20260414074556.2512750-1-hxzene@gmail.com/ Signed-off-by: Fernando Fernandez Mancera <fmancera@suse.de> Reviewed-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2026-04-20 23:45:44 +02:00
Fernando Fernandez Mancera	f5ca450087	netfilter: nfnetlink_osf: fix out-of-bounds read on option matching In nf_osf_match(), the nf_osf_hdr_ctx structure is initialized once and passed by reference to nf_osf_match_one() for each fingerprint checked. During TCP option parsing, nf_osf_match_one() advances the shared ctx->optp pointer. If a fingerprint perfectly matches, the function returns early without restoring ctx->optp to its initial state. If the user has configured NF_OSF_LOGLEVEL_ALL, the loop continues to the next fingerprint. However, because ctx->optp was not restored, the next call to nf_osf_match_one() starts parsing from the end of the options buffer. This causes subsequent matches to read garbage data and fail immediately, making it impossible to log more than one match or logging incorrect matches. Instead of using a shared ctx->optp pointer, pass the context as a constant pointer and use a local pointer (optp) for TCP option traversal. This makes nf_osf_match_one() strictly stateless from the caller's perspective, ensuring every fingerprint check starts at the correct option offset. Fixes: `1a6a0951fc` ("netfilter: nfnetlink_osf: add missing fmatch check") Suggested-by: Florian Westphal <fw@strlen.de> Signed-off-by: Fernando Fernandez Mancera <fmancera@suse.de> Reviewed-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2026-04-20 23:45:43 +02:00
Yingnan Zhang	67bf42cae4	ipvs: fix MTU check for GSO packets in tunnel mode Currently, IPVS skips MTU checks for GSO packets by excluding them with the !skb_is_gso(skb) condition. This creates problems when IPVS tunnel mode encapsulates GSO packets with IPIP headers. The issue manifests in two ways: 1. MTU violation after encapsulation: When a GSO packet passes through IPVS tunnel mode, the original MTU check is bypassed. After adding the IPIP tunnel header, the packet size may exceed the outgoing interface MTU, leading to unexpected fragmentation at the IP layer. 2. Fragmentation with problematic IP IDs: When net.ipv4.vs.pmtu_disc=1 and a GSO packet with multiple segments is fragmented after encapsulation, each segment gets a sequentially incremented IP ID (0, 1, 2, ...). This happens because: a) The GSO packet bypasses MTU check and gets encapsulated b) At __ip_finish_output, the oversized GSO packet is split into separate SKBs (one per segment), with IP IDs incrementing c) Each SKB is then fragmented again based on the actual MTU This sequential IP ID allocation differs from the expected behavior and can cause issues with fragment reassembly and packet tracking. Fix this by properly validating GSO packets using skb_gso_validate_network_len(). This function correctly validates whether the GSO segments will fit within the MTU after segmentation. If validation fails, send an ICMP Fragmentation Needed message to enable proper PMTU discovery. Fixes: `4cdd34084d` ("netfilter: nf_conntrack_ipv6: improve fragmentation handling") Signed-off-by: Yingnan Zhang <342144303@qq.com> Acked-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2026-04-20 23:45:43 +02:00
Pablo Neira Ayuso	6eda0d771f	netfilter: nat: use kfree_rcu to release ops Florian Westphal says: "Historically this is not an issue, even for normal base hooks: the data path doesn't use the original nf_hook_ops that are used to register the callbacks. However, in v5.14 I added the ability to dump the active netfilter hooks from userspace. This code will peek back into the nf_hook_ops that are available at the tail of the pointer-array blob used by the datapath. The nat hooks are special, because they are called indirectly from the central nat dispatcher hook. They are currently invisible to the nfnl hook dump subsystem though. But once that changes the nat ops structures have to be deferred too." Update nf_nat_register_fn() to deal with partial exposition of the hooks from error path which can be also an issue for nfnetlink_hook. Fixes: `e2cf17d377` ("netfilter: add new hook nfnl subsystem") Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2026-04-20 23:45:41 +02:00
Pablo Neira Ayuso	b6fe26f86a	netfilter: xtables: restrict several matches to inet family This is a partial revert of: commit `ab4f21e6fb` ("netfilter: xtables: use NFPROTO_UNSPEC in more extensions") to allow ipv4 and ipv6 only. - xt_mac - xt_owner - xt_physdev These extensions are not used by ebtables in userspace. Moreover, xt_realm is only for ipv4, since dst->tclassid is ipv4 specific. Fixes: `ab4f21e6fb` ("netfilter: xtables: use NFPROTO_UNSPEC in more extensions") Reported-by: "Kito Xu (veritas501)" <hxzene@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2026-04-20 23:27:52 +02:00
Florian Westphal	6e7066bdb4	netfilter: conntrack: remove sprintf usage Replace it with scnprintf, the buffer sizes are expected to be large enough to hold the result, no need for snprintf+overflow check. Increase buffer size in mangle_content_len() while at it. BUG: KASAN: stack-out-of-bounds in vsnprintf+0xea5/0x1270 Write of size 1 at addr [..] vsnprintf+0xea5/0x1270 sprintf+0xb1/0xe0 mangle_content_len+0x1ac/0x280 nf_nat_sdp_session+0x1cc/0x240 process_sdp+0x8f8/0xb80 process_invite_request+0x108/0x2b0 process_sip_msg+0x5da/0xf50 sip_help_tcp+0x45e/0x780 nf_confirm+0x34d/0x990 [..] Fixes: `9fafcd7b20` ("[NETFILTER]: nf_conntrack/nf_nat: add SIP helper port") Reported-by: Yiming Qian <yimingqian591@gmail.com> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2026-04-20 23:27:46 +02:00
Xiang Mei	2195574dc6	netfilter: nfnetlink_osf: fix divide-by-zero in OSF_WSS_MODULO nf_osf_match_one() computes ctx->window % f->wss.val in the OSF_WSS_MODULO branch with no guard for f->wss.val == 0. A CAP_NET_ADMIN user can add such a fingerprint via nfnetlink; a subsequent matching TCP SYN divides by zero and panics the kernel. Reject the bogus fingerprint in nfnl_osf_add_callback() above the per-option for-loop. f->wss is per-fingerprint, not per-option, so the check must run regardless of f->opt_num (including 0). Also reject wss.wc >= OSF_WSS_MAX; nf_osf_match_one() already treats that as "should not happen". Crash: Oops: divide error: 0000 [#1] SMP KASAN NOPTI RIP: 0010:nf_osf_match_one (net/netfilter/nfnetlink_osf.c:98) Call Trace: <IRQ> nf_osf_match (net/netfilter/nfnetlink_osf.c:220) xt_osf_match_packet (net/netfilter/xt_osf.c:32) ipt_do_table (net/ipv4/netfilter/ip_tables.c:348) nf_hook_slow (net/netfilter/core.c:622) ip_local_deliver (net/ipv4/ip_input.c:265) ip_rcv (include/linux/skbuff.h:1162) __netif_receive_skb_one_core (net/core/dev.c:6181) process_backlog (net/core/dev.c:6642) __napi_poll (net/core/dev.c:7710) net_rx_action (net/core/dev.c:7945) handle_softirqs (kernel/softirq.c:622) Fixes: `11eeef41d5` ("netfilter: passive OS fingerprint xtables match") Reported-by: Weiming Shi <bestswngs@gmail.com> Suggested-by: Florian Westphal <fw@strlen.de> Suggested-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Xiang Mei <xmei5@asu.edu> Reviewed-by: Fernando Fernandez Mancera <fmancera@suse.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2026-04-20 23:27:41 +02:00
Pablo Neira Ayuso	b336fdbb71	netfilter: nft_osf: restrict it to ipv4 This expression only supports ipv4, so restrict it to that family. Fixes: `b96af92d6e` ("netfilter: nf_tables: implement Passive OS fingerprint module in nft_osf") Acked-by: Florian Westphal <fw@strlen.de> Reviewed-by: Fernando Fernandez Mancera <fmancera@suse.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2026-04-20 23:27:36 +02:00
Jens Axboe	42a702aaed	io_uring: fix iowq_limits data race in tctx node addition __io_uring_add_tctx_node() reads ctx->int_flags and ctx->iowq_limits[0..1] without holding ctx->uring_lock, while io_register_iowq_max_workers() writes these same fields under the lock. Mostly an application problem if you try and make these race, but let's silence KCSAN by just grabbing the ->uring_lock around the operation. This is a slow path operation anyway, and ->uring_lock will be grabbed by submission right after anyway. Fixes: `2e480058dd` ("io-wq: provide a way to limit max number of workers") Signed-off-by: Jens Axboe <axboe@kernel.dk>	2026-04-20 14:57:21 -06:00
Rick Edgecombe	9874b2917b	x86/shstk: Prevent deadlock during shstk sigreturn During sigreturn the shadow stack signal frame is popped. The kernel does this by reading the shadow stack using normal read accesses. When it can't assume the memory is shadow stack, it takes extra steps to make sure it is reading actual shadow stack memory and not other normal readable memory. It does this by holding the mmap read lock while doing the access and checking the flags of the VMA. Unfortunately that is not safe. If the read of the shadow stack sigframe hits a page fault, the fault handler will try to recursively grab another mmap read lock. This normally works ok, but if a writer on another CPU is also waiting, the second read lock could fail and cause a deadlock. Fix this by not holding mmap lock during the read access to userspace. Instead use mmap_lock_speculate_...() to watch for changes between dropping mmap lock and the userspace access. Retry if anything grabbed an mmap write lock in between and could have changed the VMA. These mmap_lock_speculate_...() helpers use mm::mm_lock_seq, which is only available when PER_VMA_LOCK is configured. So make X86_USER_SHADOW_STACK depend on it. On x86, PER_VMA_LOCK is a default configuration for SMP kernels. So drop support for the other configs under the assumption that the !SMP shadow stack user base does not exist. Currently there is a check that skips the lookup work when the SSP can be assumed to be on a shadow stack. While reorganizing the function, remove the optimization to make the tricky code flows more common, such that issues like this cannot escape detection for so long.
Fixes: `7fad2a432c` ("x86/shstk: Check that signal frame is shadow stack mem") Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Rick Edgecombe <rick.p.edgecombe@intel.com> Signed-off-by: Thomas Gleixner <tglx@kernel.org> Reviewed-by: Dave Hansen <dave.hansen@intel.com> Reviewed-by: Thomas Gleixner <tglx@kernel.org> Cc: stable@vger.kernel.org	2026-04-20 22:54:24 +02:00
Jens Axboe	41859843f2	io_uring/tctx: mark io_wq as exiting before error path teardown syzbot reports that it's hitting the below condition for exiting an io_wq context: WARN_ON_ONCE(!test_bit(IO_WQ_BIT_EXIT, &wq->state)) in io_wq_put_and_exit(), which can be triggered with memory allocation fault injection. Ensure that the io_wq is marked as exiting to silence this warning trigger. Reported-by: syzbot+79a4cc863a8db58cd92b@syzkaller.appspotmail.com Fixes: `7880174e1e` ("io_uring/tctx: clean up __io_uring_add_tctx_node() error handling") Reviewed-by: Clément Léger <cleger@meta.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2026-04-20 14:47:37 -06:00
Jens Axboe	ee5417fd02	io_uring/tctx: check for setup tctx->io_wq before teardown As with the idling code before it, the error exit path should check for a NULL tctx->io_wq before calling io_wq_put_and_exit(). Fixes: `7880174e1e` ("io_uring/tctx: clean up __io_uring_add_tctx_node() error handling") Reported-by: Dan Carpenter <error27@gmail.com> Reviewed-by: Clément Léger <cleger@meta.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2026-04-20 14:47:29 -06:00
Greg Kroah-Hartman	2fc87d37be	drm/nouveau: fix u32 overflow in pushbuf reloc bounds check nouveau_gem_pushbuf_reloc_apply() validates each relocation with if (r->reloc_bo_offset + 4 > nvbo->bo.base.size) but reloc_bo_offset is __u32 (uapi/drm/nouveau_drm.h) and the integer literal 4 promotes to unsigned int, so the addition is performed in 32 bits and wraps before the comparison against the size_t bo size. Cast to u64 so the addition happens in 64-bit arithmetic. Cc: Lyude Paul <lyude@redhat.com> Cc: Danilo Krummrich <dakr@kernel.org> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Cc: Maxime Ripard <mripard@kernel.org> Cc: Thomas Zimmermann <tzimmermann@suse.de> Cc: David Airlie <airlied@gmail.com> Cc: Simona Vetter <simona@ffwll.ch> Reported-by: Anthropic Cc: stable <stable@kernel.org> Assisted-by: gkh_clanker_t1000 Fixes: `a1606a9596` ("drm/nouveau: new gem pushbuf interface, bump to 0.0.16") Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> [ Add Fixes: tag. - Danilo ] Signed-off-by: Danilo Krummrich <dakr@kernel.org>	2026-04-20 21:23:14 +02:00
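The wraparound described in the nouveau commit above can be reproduced in isolation. The following is a minimal userspace sketch, not the driver code: the helper names reloc_fits_wrapping() and reloc_fits_widened() are hypothetical, and it assumes an ABI where unsigned int is 32 bits wide. It contrasts a bounds check whose addition happens in 32 bits with one that widens to 64 bits first.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for the pre-fix check: offset + 4 is evaluated in
 * 32-bit unsigned arithmetic (assuming 32-bit unsigned int), so it can wrap
 * to a small value before being compared against the buffer size.
 */
static bool reloc_fits_wrapping(uint32_t offset, size_t bo_size)
{
	return !(offset + 4 > bo_size);
}

/*
 * Hypothetical stand-in for the fixed check: widening to 64 bits before the
 * addition means the sum cannot wrap for any 32-bit offset.
 */
static bool reloc_fits_widened(uint32_t offset, size_t bo_size)
{
	return !((uint64_t)offset + 4 > bo_size);
}

/* Small self-check showing where the two variants disagree. */
static void reloc_check_demo(void)
{
	const size_t bo_size = 4096;

	/* 0xfffffffe + 4 wraps to 2, so the 32-bit check wrongly accepts it. */
	assert(reloc_fits_wrapping(0xfffffffeu, bo_size));
	/* The widened check sees 0x100000002 and rejects it. */
	assert(!reloc_fits_widened(0xfffffffeu, bo_size));
	/* Both agree on an offset that is genuinely in range. */
	assert(reloc_fits_wrapping(4092, bo_size));
	assert(reloc_fits_widened(4092, bo_size));
}
```

Calling reloc_check_demo() from a test harness exercises both variants; the only difference between them is the cast applied before the addition.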
Steven Rostedt	932cdaf3e2	ktest: Add logfile to failure directory The logfile contains a lot of useful information about the tests being run. Add it to the stored failure directory when the test fails. Cc: John 'Warthog9' Hawley <warthog9@kernel.org> Link: https://patch.msgid.link/20260420142315.7bbc3624@fedora Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2026-04-20 15:23:13 -04:00


1447055 Commits