linux

mirror of https://github.com/torvalds/linux.git synced 2026-05-28 17:13:52 +02:00

Author	SHA1	Message	Date
Filipe Manana	6fa9729568	btrfs: pass literal booleans to functions that take boolean arguments We have several functions with parameters defined as booleans but then we have callers passing integers, 0 or 1, instead of false and true. While this isn't a bug since 0 and 1 are converted to false and true, it is odd and less readable. Change the callers to pass true and false literals instead. Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2026-04-07 18:55:56 +02:00
Kees Cook	189f164e57	Convert remaining multi-line kmalloc_obj/flex GFP_KERNEL uses Conversion performed via this Coccinelle script: // SPDX-License-Identifier: GPL-2.0-only // Options: --include-headers-for-types --all-includes --include-headers --keep-comments virtual patch @gfp depends on patch && !(file in "tools") && !(file in "samples")@ identifier ALLOC = {kmalloc_obj,kmalloc_objs,kmalloc_flex, kzalloc_obj,kzalloc_objs,kzalloc_flex, kvmalloc_obj,kvmalloc_objs,kvmalloc_flex, kvzalloc_obj,kvzalloc_objs,kvzalloc_flex}; @@ ALLOC(... - , GFP_KERNEL ) $ make coccicheck MODE=patch COCCI=gfp.cocci Build and boot tested x86_64 with Fedora 42's GCC and Clang: Linux version 6.19.0+ (user@host) (gcc (GCC) 15.2.1 20260123 (Red Hat 15.2.1-7), GNU ld version 2.44-12.fc42) #1 SMP PREEMPT_DYNAMIC 1970-01-01 Linux version 6.19.0+ (user@host) (clang version 20.1.8 (Fedora 20.1.8-4.fc42), LLD 20.1.8) #1 SMP PREEMPT_DYNAMIC 1970-01-01 Signed-off-by: Kees Cook <kees@kernel.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2026-02-22 08:26:33 -08:00
Linus Torvalds	bf4afc53b7	Convert 'alloc_obj' family to use the new default GFP_KERNEL argument This was done entirely with mindless brute force, using git grep -l '\<k[vmz]alloc_objs(., GFP_KERNEL)' \| xargs sed -i 's/$alloc_objs(.*$, GFP_KERNEL)/\1)/' to convert the new alloc_obj() users that had a simple GFP_KERNEL argument to just drop that argument. Note that due to the extreme simplicity of the scripting, any slightly more complex cases spread over multiple lines would not be triggered: they definitely exist, but this covers the vast bulk of the cases, and the resulting diff is also then easier to check automatically. For the same reason the 'flex' versions will be done as a separate conversion. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2026-02-21 17:09:51 -08:00
Kees Cook	69050f8d6d	treewide: Replace kmalloc with kmalloc_obj for non-scalar types This is the result of running the Coccinelle script from scripts/coccinelle/api/kmalloc_objs.cocci. The script is designed to avoid scalar types (which need careful case-by-case checking), and instead replace kmalloc-family calls that allocate struct or union object instances: Single allocations: kmalloc(sizeof(TYPE), ...) are replaced with: kmalloc_obj(TYPE, ...) Array allocations: kmalloc_array(COUNT, sizeof(TYPE), ...) are replaced with: kmalloc_objs(TYPE, COUNT, ...) Flex array allocations: kmalloc(struct_size(PTR, FAM, COUNT), ...) are replaced with: kmalloc_flex(PTR, FAM, COUNT, ...) (where TYPE may also be VAR) The resulting allocations no longer return "void ", instead returning "TYPE ". Signed-off-by: Kees Cook <kees@kernel.org>	2026-02-21 01:02:28 -08:00
Filipe Manana	ccba88cb6a	btrfs: remove pointless out labels from send.c Some functions (process_extent(), process_recorded_refs_if_needed(), changed_inode(), compare_refs() and changed_cb()) have an 'out' label that does nothing but return, making it pointless. Simplify this by removing the label and returning instead of gotos plus setting the 'ret' variable. Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2026-02-03 07:56:20 +01:00
Linus Torvalds	e84d960149	for-6.19-rc5-tag -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEE8rQSAMVO+zA4DBdWxWXV+ddtWDsFAmlsRSEACgkQxWXV+ddt WDtXqg/+IGNYVrdRU6Ds/m1T7RO7uAYTTk9X4+21u7cYfVtZzC2gkBVfRwjv3gkD JVVCjXrKmdcQl1QMazeALauF0NBjLtA46SH4mm6tF9ZJu2TNxOmumCYpqjMtGjRG npAJW3LrdNxyClm9AbtJrvGcEQ78XuaGoctIHIvJH5NakA7pgOjEGMTGuCcKMMGH e/cQ6rEriOIAl9hPVJRx4odI1zd7AY0MMiLsSLKJdad1/TcJxFT36lg7SNc2Eqvz bBUlRqb6KOWLItU1c+KBx3yMuFpGYe7OLLqOtZo1Xdb0jDF0cU54eM0OgSkgMhRJ pFGFnGjfGtmpwDuthjNObnLTUzfsR2Yj7Yc56e0Ah7KAbhvWr7npAkN7bid+bagP 1MnnRNrb4zqGNbzZ5EwX2MYcHV5EiwlzSZx9H0MIP0umCNVa4vubQ9S4UCHk4Twj BbAK7No3pgicwXvp0HI/4lYQldYBhp6ni+lh1HLgScYVReUCpmG1X2rDkf8viJ9d dYnKHnoyzPDIDQcVDkPOMIDnBofTGHux3aUIhE9va/ysALyNHn5uDr+idQt+eM1E SwnK+l8V+0kr55cHNZ0l3+e9gnDuD27pgQv0oHCDe2KKiUJCMBwaepH+P+QZjsHx bgCXnau6897qjxc3oDjQUZArkPPXw+8rqvkzv2aSFEMnhSOBqmI= =qIBc -----END PGP SIGNATURE----- Merge tag 'for-6.19-rc5-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux Pull btrfs fixes from David Sterba: - with large folios in use, fix partial incorrect update of a reflinked range - fix potential deadlock in iget when lookup fails and eviction is needed - in send, validate inline extent type while detecting file holes - fix memory leak after an error when creating a space info - remove zone statistics from sysfs again, the output size limitations make it unusable, we'll do it in another way in another release - test fixes: - return proper error codes from block remapping tests - fix tree root leaks in qgroup tests after errors * tag 'for-6.19-rc5-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: btrfs: remove zoned statistics from sysfs btrfs: fix memory leaks in create_space_info() error paths btrfs: invalidate pages instead of truncate after reflinking btrfs: update the Kconfig string for CONFIG_BTRFS_EXPERIMENTAL btrfs: send: check for inline extents in range_is_hole_in_parent() btrfs: tests: fix return 0 on rmap test failure btrfs: tests: fix root tree leak in btrfs_test_qgroups() btrfs: release path before iget_failed() in btrfs_read_locked_inode()	2026-01-17 19:29:32 -08:00
Qu Wenruo	08b096c137	btrfs: send: check for inline extents in range_is_hole_in_parent() Before accessing the disk_bytenr field of a file extent item we need to check if we are dealing with an inline extent. This is because for inline extents their data starts at the offset of the disk_bytenr field. So accessing the disk_bytenr means we are accessing inline data or in case the inline data is less than 8 bytes we can actually cause an invalid memory access if this inline extent item is the first item in the leaf or access metadata from other items. Fixes: `82bfb2e7b6` ("Btrfs: incremental send, fix unnecessary hole writes for sparse files") Reviewed-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: Qu Wenruo <wqu@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2026-01-09 17:41:53 +01:00
Linus Torvalds	7696286034	for-6.19-tag -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEE8rQSAMVO+zA4DBdWxWXV+ddtWDsFAmko/DUACgkQxWXV+ddt WDtyCw//UaFOTX/k72HgA1n2MWfegWbWyD+OGNbGosoZljKOrAe/mRnjXTyF9lyW 8GDzGvJzF4Tkl5lyuGyiequlrO2F3veTpwHo94xnBTOYCeiTpMqTN/e/5SkasBpN 4YlWq7OGYR4hwghRvZpaW7nsmVCKDLIlZVkH77x9Bmvx0NLO24EJlEZusQT4zYew ntC/i9x3DW0ZxYyfRhFIFvk9JUUdgXfxJ6dNexz0zi3dKUSUIR9hI0J9Nwl++1cF SgjAzbtO064htWoCvsKykgA6YGbJCZjw8XO8D2eJonkN24VbqSMaY44TPXmCMLVs ZXw871jV2E/urfWhRNdxv/kJdCFudPk0qXG5ZtfHO4UUwS/nZ+qAig+LHawgAOCJ 9CgWy4zrfiYCqULRuqF1wzWu/z22++zIlZC552VAZd1RQ+JjqJY/aje4xhY5nUF4 n1uVBReZaI9sH3jJOsMWpwLMptbhpH9RZp3QPgqZlUHo6GtPJJmNKfw8KgMAhZ7L wf7iy6v9yo+7VZ2ACwu2qJ+lZRxbZ0yvCnFatN3O5G1O0kkIrZFUM3MwdKtufZ0u LHWkGfoaq7zR6E6DhIaxIhiTTXMlOfLTikNKgBUO3NEdrRZwrDhr7K07S25jFxSx ZCNV6OdSCeziShPqT0ntcwecnJ41/kOcm13732NHF+QgzMK5LrI= =rO4x -----END PGP SIGNATURE----- Merge tag 'for-6.19-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux Pull btrfs updates from David Sterba: "Features: - shutdown ioctl support (needs CONFIG_BTRFS_EXPERIMENTAL for now): - set filesystem state as being shut down (also named going down in other filesystems), where all active operations return EIO and this cannot be changed until unmount - pending operations are attempted to be finished but error messages may still show up depending on where exactly the shutdown happened - scrub (and device replace) vs suspend/hibernate: - a running scrub will prevent suspend, which can be annoying as suspend is an immediate request and scrub is not critical - filesystem freezing before suspend was not sufficient as the problem was in process freezing - behaviour change: on suspend scrub and device replace are cancelled, where scrub can record the last state and continue from there; the device replace has to be restarted from the beginning - zone stats exported in sysfs, from the perspective of the filesystem this includes active, reclaimable, relocation etc zones Performance: - improvements when processing space reservation tickets by optimizing locking and shrinking critical sections, cumulative improvements in lockstat numbers show +15% Notable fixes: - use vmalloc fallback when allocating bios as high order allocations can happen with wide checksums (like sha256) - scrub will always track the last position of progress so it's not starting from zero after an error Core: - under experimental config, checksum calculations are offloaded to process context, simplifies locking and allows to remove compression write worker kthread(s): - speed improvement in direct IO throughput with buffered IO fallback is +15% when not offloaded but this is more related to internal crypto subsystem improvements - this will be probably default in the future removing the sysfs tunable - (experimental) block size > page size updates: - support more operations when not using large folios (encoded read/write and send) - raid56 - more preparations for fscrypt support Other: - more conversions to auto-cleaned variables - parameter cleanups and removals - extended warning fixes - improved printing of structured values like keys - lots of other cleanups and refactoring" * tag 'for-6.19-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: (147 commits) btrfs: remove unnecessary inode key in btrfs_log_all_parents() btrfs: remove redundant zero/NULL initializations in btrfs_alloc_root() btrfs: remaining BTRFS_PATH_AUTO_FREE conversions btrfs: send: do not allocate memory for xattr data when checking it exists btrfs: send: add unlikely to all unexpected overflow checks btrfs: reduce arguments to btrfs_del_inode_ref_in_log() btrfs: remove root argument from btrfs_del_dir_entries_in_log() btrfs: use test_and_set_bit() in btrfs_delayed_delete_inode_ref() btrfs: don't search back for dir inode item in INO_LOOKUP_USER btrfs: don't rewrite ret from inode_permission btrfs: add orig_logical to btrfs_bio for encryption btrfs: disable verity on encrypted inodes btrfs: disable various operations on encrypted inodes btrfs: remove redundant level reset in btrfs_del_items() btrfs: simplify leaf traversal after path release in btrfs_next_old_leaf() btrfs: optimize balance_level() path reference handling btrfs: factor out root promotion logic into promote_child_to_root() btrfs: raid56: remove the "_step" infix btrfs: raid56: enable bs > ps support btrfs: raid56: prepare finish_parity_scrub() to support bs > ps cases ...	2025-12-03 20:03:46 -08:00
Linus Torvalds	2ddcf4962c	Kbuild updates for v6.19 - Enable -fms-extensions, allowing anonymous use of tagged struct or union in struct/union (tag kbuild-ms-extensions-6.19). An exemplary conversion patch is added here, too (btrfs). - Introduce architecture-specific CC_CAN_LINK and flags for userprogs - Add new packaging target 'modules-cpio-pkg' for building a initramfs cpio w/ kmods - Handle included .c files in gen_compile_commands - Minor kbuild changes: - Use objtree for module signing key path, fixing oot kmod signing - Improve documentation of KBUILD_BUILD_TIMESTAMP - Reuse KBUILD_USERCFLAGS for UAPI, instead of defining twice - Rename scripts/Makefile.extrawarn to Makefile.warn - Drop obsolete types.h check from headers_check.pl - Remove outdated config leak ignore entries Signed-off-by: Nicolas Schier <nsc@kernel.org> -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEEh0E3p4c3JKeBvsLGB1IKcBYmEmkFAmkttSoACgkQB1IKcBYm Emkt6g/+NLW1rJBCcONXIe3UlfHvo4MBziA+ItuLkYx7FwhhTTOivQgav68wXSQK /JqQ5N8O0nO4Gznv5aJWoW+GUCKFjqaDbJDXW9pRdDNGQjXJfL82OHu9gPiUAiDJ ffZwyeRfXtBaglWv6ovM5z6817OqUCLErqjESAMHZv4nXbXjFNi2YKwgJGmAjfNM kiNUU1ieO2/dxgI/mMCW0UuK0jjiC0EsY1as3IkxBdVZi3dRPobLNaL4JiKHGFlZ BlrtAoGHrqwqEQgFACVJE2BzWhW+Hwz7SyfBvF4c8T0eM/02ogcL43COh6scq0pV 2om7hFwvE/NBdbCvz0KB/q3khx9jJRagCm0o/+YEexHb1rn+LcCbomQZYRjIEcHU mBc3DTD+B+jwiN6mow6e6E1uuKvpTrmPGBNVyQ5BMFpdlhfga7zsttLO5kMIXqAH Fc8MsSx06YggrwGevc/39C+3uzI501Rcxu9BWdyHXfeZKeq0gZGNtkXVN2edzzKm c4uFeDbbF3M0oeGXmtlm8I//jqeJuc0wK8d8tSXjKodo+vgppXj0s2okAwALY2jW NVAlxqKwS7CnjBbXOsHIXSCDDL5IQ+np/gIaOkSbl6DXU50BvFV+KGvqW9hdjmwR AEQXemqUhSvGFmYG92mLwewXWAivnxUvGTCifOhXipyikXhX/wI= =Gn4V -----END PGP SIGNATURE----- Merge tag 'kbuild-6.19-1' of git://git.kernel.org/pub/scm/linux/kernel/git/kbuild/linux Pull Kbuild updates from Nicolas Schier: - Enable -fms-extensions, allowing anonymous use of tagged struct or union in struct/union (tag kbuild-ms-extensions-6.19). An exemplary conversion patch is added here, too (btrfs). [ Editor's note: the core of this actually came in early through a shared branch and a few other trees - Linus ] - Introduce architecture-specific CC_CAN_LINK and flags for userprogs - Add new packaging target 'modules-cpio-pkg' for building a initramfs cpio w/ kmods - Handle included .c files in gen_compile_commands - Minor kbuild changes: - Use objtree for module signing key path, fixing oot kmod signing - Improve documentation of KBUILD_BUILD_TIMESTAMP - Reuse KBUILD_USERCFLAGS for UAPI, instead of defining twice - Rename scripts/Makefile.extrawarn to Makefile.warn - Drop obsolete types.h check from headers_check.pl - Remove outdated config leak ignore entries * tag 'kbuild-6.19-1' of git://git.kernel.org/pub/scm/linux/kernel/git/kbuild/linux: kbuild: add target to build a cpio containing modules initramfs: add gen_init_cpio to hostprogs unconditionally kbuild: allow architectures to override CC_CAN_LINK init: deduplicate cc-can-link.sh invocations kbuild: don't enable CC_CAN_LINK if the dummy program generates warnings scripts: headers_install.sh: Remove two outdated config leak ignore entries scripts/clang-tools: Handle included .c files in gen_compile_commands kbuild: uapi: Drop types.h check from headers_check.pl kbuild: Rename Makefile.extrawarn to Makefile.warn MAINTAINERS, .mailmap: Update mail address for Nicolas Schier kbuild: uapi: reuse KBUILD_USERCFLAGS kbuild: doc: improve KBUILD_BUILD_TIMESTAMP documentation kbuild: Use objtree for module signing key path btrfs: send: make use of -fms-extensions for defining struct fs_path	2025-12-03 14:42:21 -08:00
Filipe Manana	5c9cac55b7	btrfs: send: do not allocate memory for xattr data when checking it exists When checking if xattrs were deleted we don't care about their data, but we are allocating memory for the data and copying it, which only wastes time and can result in an unnecessary error in case the allocation fails. So stop allocating memory and copying data by making find_xattr() and __find_xattr() skip those steps if the given data buffer is NULL. Reviewed-by: Boris Burkov <boris@bur.io> Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2025-11-25 01:53:33 +01:00
Filipe Manana	7c3acdb998	btrfs: send: add unlikely to all unexpected overflow checks There are several checks for unexpected overflows of buffers and path lengths that makes us fail the send operation with an error if for some highly unexpected reason they happen. So add the unlikely tag to those checks to hint the compiler to generate better code, while also making it more explicit in the source that it's highly unexpected. With gcc 14.2.0-19 from Debian on x86_64, I also got a small reduction the text size of the btrfs module. Before: $ size fs/btrfs/btrfs.ko text data bss dec hex filename 1936917 162723 15592 2115232 2046a0 fs/btrfs/btrfs.ko After: $ size fs/btrfs/btrfs.ko text data bss dec hex filename 1936789 162723 15592 2115104 204620 fs/btrfs/btrfs.ko Reviewed-by: Qu Wenruo <wqu@suse.com> Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2025-11-25 01:53:33 +01:00
Filipe Manana	d7fe41044b	btrfs: use bool type for btrfs_path members used as booleans Many fields of struct btrfs_path are used as booleans but their type is an unsigned int (of one 1 bit width to save space). Change the type to bool keeping the :1 suffix so that they combine with the previous u8 fields in order to save space. This makes the code more clear by using explicit true/false and more in line with the preferred style, preserving the size of the structure. Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2025-11-24 22:42:25 +01:00
Qu Wenruo	ec20799064	btrfs: enable encoded read/write/send for bs > ps cases Since the read verification and read repair are all supporting bs > ps without large folios now, we can enable encoded read/write/send. Now we can relax the alignment in assert_bbio_alignment() to min(blocksize, PAGE_SIZE). But also add the extra blocksize based alignment check for the logical and length of the bbio. There is a pitfall in btrfs_add_compress_bio_folios(), which relies on the folios passed in to meet the minimal folio order. But now we can pass regular page sized folios in, update it to check each folio's size instead of using the minimal folio size. This allows btrfs_add_compress_bio_folios() to even handle folios array with different sizes, thankfully we don't yet need to handle such crazy situation. Signed-off-by: Qu Wenruo <wqu@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2025-11-24 22:42:24 +01:00
Miquel Sabaté Solà	7ab5d01d58	btrfs: apply the AUTO_K(V)FREE macros throughout the code Apply the AUTO_KFREE and AUTO_KVFREE macros wherever it makes sense. Since this macro is expected to improve code readability, it has been avoided in places where the lifetime of objects wasn't easy to follow and a cleanup attribute would've made things worse; or when the cleanup section of a function involved many other things and thus there was no readability impact anyways. This change has also not been applied in extremely short functions where readability was clearly not an issue. Signed-off-by: Miquel Sabaté Solà <mssola@mssola.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2025-11-24 22:34:51 +01:00
Filipe Manana	af1e800c02	btrfs: use the key format macros when printing keys Change all locations that print a key to use the new macros to print them in order to ensure a consistent style and avoid repetitive code. Reviewed-by: Qu Wenruo <wqu@suse.com> Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2025-11-24 22:03:02 +01:00
Rasmus Villemoes	d599f571b3	btrfs: send: make use of -fms-extensions for defining struct fs_path The newly introduced -fms-extensions compiler flag allows defining struct fs_path in such a way that inline_buf becomes a proper array with a size known to the compiler. This also makes the problem fixed by commit `8aec9dbf2d` ("btrfs: send: fix -Wflex-array-member-not-at-end warning in struct send_ctx") go away. Whether cur_inode_path should be put back to its original place in struct send_ctx I don't know, but at least the comment no longer applies. Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk> Acked-by: David Sterba <dsterba@suse.com> Link: https://patch.msgid.link/20251020142228.1819871-3-linux@rasmusvillemoes.dk Signed-off-by: Nathan Chancellor <nathan@kernel.org> Signed-off-by: Nicolas Schier <nsc@kernel.org>	2025-11-08 12:17:58 +01:00
Ting-Chang Hou	1fabe43b4e	btrfs: send: fix duplicated rmdir operations when using extrefs Commit `29d6d30f5c` ("Btrfs: send, don't send rmdir for same target multiple times") has fixed an issue that a send stream contained a rmdir operation for the same directory multiple times. After that fix we keep track of the last directory for which we sent a rmdir operation and compare with it before sending a rmdir for the parent inode of a deleted hardlink we are processing. But there is still a corner case that in between rmdir dir operations for the same inode we find deleted hardlinks for other parent inodes, so tracking just the last inode for which we sent a rmdir operation is not enough. Hardlinks of a file in the same directory are stored in the same INODE_REF item, but if the number of hardlinks is too large and can not fit in a leaf, we use INODE_EXTREF items to store them. The key of an INODE_EXTREF item is (inode_id, INODE_EXTREF, hash[name, parent ino]), so between two hardlinks for the same parent directory, we can find others for other parent directories. For example for the reproducer below we get the following (from a btrfs inspect-internal dump-tree output): item 0 key (259 INODE_EXTREF 2309449) itemoff 16257 itemsize 26 index 6925 parent 257 namelen 8 name: foo.6923 item 1 key (259 INODE_EXTREF 2311350) itemoff 16231 itemsize 26 index 6588 parent 258 namelen 8 name: foo.6587 item 2 key (259 INODE_EXTREF 2457395) itemoff 16205 itemsize 26 index 6611 parent 257 namelen 8 name: foo.6609 (...) So tracking the last directory's inode number does not work in this case since we process a link for parent inode 257, then for 258 and then back again for 257, and that second time we process a deleted link for 257 we think we have not yet sent a rmdir operation. Fix this by using a rbtree to keep track of all the directories for which we have already sent rmdir operations, and add those directories to the 'check_dirs' ref list in process_recorded_refs() only if the directory is not yet in the rbtree, otherwise skip it since it means we have already sent a rmdir operation for that directory. The following test script reproduces the problem: $ cat test.sh #!/bin/bash DEV=/dev/sdi MNT=/mnt/sdi mkfs.btrfs -f $DEV mount $DEV $MNT mkdir $MNT/a $MNT/b echo 123 > $MNT/a/foo for ((i = 1; i <= 1000; i++)); do ln $MNT/a/foo $MNT/a/foo.$i ln $MNT/a/foo $MNT/b/foo.$i done btrfs subvolume snapshot -r $MNT $MNT/snap1 btrfs send $MNT/snap1 -f /tmp/base.send rm -r $MNT/a $MNT/b btrfs subvolume snapshot -r $MNT $MNT/snap2 btrfs send -p $MNT/snap1 $MNT/snap2 -f /tmp/incremental.send umount $MNT mkfs.btrfs -f $DEV mount $DEV $MNT btrfs receive $MNT -f /tmp/base.send btrfs receive $MNT -f /tmp/incremental.send rm -f /tmp/base.send /tmp/incremental.send umount $MNT When running it, it fails like this: $ ./test.sh (...) At subvol snap1 At snapshot snap2 ERROR: rmdir o257-9-0 failed: No such file or directory CC: <stable@vger.kernel.org> Reviewed-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: Ting-Chang Hou <tchou@synology.com> [ Updated changelog ] Signed-off-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2025-10-17 18:33:34 +02:00
Gustavo A. R. Silva	8aec9dbf2d	btrfs: send: fix -Wflex-array-member-not-at-end warning in struct send_ctx The warning -Wflex-array-member-not-at-end was introduced in GCC-14, and we are getting ready to enable it, globally. Fix the following warning: fs/btrfs/send.c:181:24: warning: structure containing a flexible array member is not at the end of another structure [-Wflex-array-member-not-at-end] and move the declaration of send_ctx::cur_inode_path to the end. Notice that struct fs_path contains a flexible array member inline_buf, but also a padding array and a limit calculated for the usable space of inline_buf (FS_PATH_INLINE_SIZE). It is not the pattern where flexible array is in the middle of a structure and could potentially overwrite other members. Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2025-10-13 22:36:38 +02:00
David Sterba	cc53bd2085	btrfs: add unlikely annotations to branches leading to EIO The unlikely() annotation is a static prediction hint that compiler may use to reorder code out of hot path. We use it elsewhere (namely tree-checker.c) for error branches that almost never happen, where EIO is one of them. Reviewed-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2025-09-23 08:49:26 +02:00
David Sterba	9264d004a6	btrfs: add unlikely annotations to branches leading to EUCLEAN The unlikely() annotation is a static prediction hint that compiler may use to reorder code out of hot path. We use it elsewhere (namely tree-checker.c) for error branches that almost never happen, where EUCLEAN (a corruption) is one of them. Reviewed-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2025-09-23 08:49:26 +02:00
Sun YangKai	4ca6f24a52	btrfs: more trivial BTRFS_PATH_AUTO_FREE conversions Trivial pattern for the auto freeing with goto -> return conversions if possible. The following cases are considered trivial in this patch: 1. Cases where there are no operations between btrfs_free_path() and the function returns. 2. Cases where only simple cleanup operations (such as kfree(), kvfree(), clear_bit(), and fs_path_free()) are present between btrfs_free_path() and the function return. Signed-off-by: Sun YangKai <sunk67188@gmail.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2025-09-23 08:49:26 +02:00
Qu Wenruo	98077f7f21	btrfs: enable experimental bs > ps support With all the preparation patches, we're able to finally enable btrfs block size (sector size) larger than page size support and give it a full fstests run. And obviously this new feature is hidden behind experimental flags, and should not be considered as a core feature yet as btrfs' default block size is still 4K. But this is still a feature that will shine in the future where 16K block sized device are widely adopted. For now there are some features explicitly disabled: - Direct IO This is the most complex part to support, the root reason is we can not control the pages of iov iter passed in. User space programs can only ensure the virtual addresses are contiguous, but have no control on their physical addresses. Our bs > ps support heavily relies on large folios, and direct IO memory can easily break it. So direct IO is disabled and will always fall back to buffered IO. - RAID56 In theory we can convert RAID56 to use large folios, but it will need to be converted back to page based if we want to support direct IO in the future. So just reject it for now. - Encoded send - Encoded read Both are utilizing btrfs_encoded_read_regular_fill_pages(), and send is utilizing vmallocated memory. Unfortunately for vmallocated memory we can not guarantee the minimal folio order. For send, it will just always fallback to regular writes, which reads from page cache and will follow the existing folio order requirement. - Encoded write Encoded write itself is allocating pages by themselves, and we can easily change it to follow the minimal order. But since encoded read is already disabled, there is no need to only enable encoded write. Finally just like what we did for bs < ps support in the past, add a warning message for bs > ps mounts. Signed-off-by: Qu Wenruo <wqu@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2025-09-23 08:49:25 +02:00
Filipe Manana	0dc93e4652	btrfs: send: index backref cache by node number instead of by sector number We now have a nodesize_bits member in fs_info so we can index an extent buffer in the backref cache by node number instead of by sector number. While this allows for a denser index space with the possibility of using less maple tree nodes, in practice it's unlikely to hit such benefits since we currently limit the maximum number of keys in the cache to 128, so unless all extent buffers are contiguous we are unlikely to see a memory usage reduction in the backing maple tree due to fewer nodes. Nevertheless it doesn't cost anything to index by node number and it's more logical. Reviewed-by: Qu Wenruo <wqu@suse.com> Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2025-09-23 08:49:21 +02:00
David Sterba	17dc82dc1e	btrfs: fix typos in comments and strings Annual typo fixing pass. Strangely codespell found only about 30% of what is in this patch, the rest was done manually using text spellchecker with a custom dictionary of acceptable terms. Reviewed-by: Neal Gompa <neal@gompa.dev> Signed-off-by: David Sterba <dsterba@suse.com>	2025-09-23 08:49:16 +02:00
David Sterba	67e78f983e	btrfs: convert several int parameters to bool We're almost done cleaning misused int/bool parameters. Convert a bunch of them, found by manual grepping. Note that btrfs_sync_fs() needs an int as it's mandated by the struct super_operations prototype. Reviewed-by: Boris Burkov <boris@bur.io> Signed-off-by: David Sterba <dsterba@suse.com>	2025-09-22 10:54:32 +02:00
Filipe Manana	005b0a0c24	btrfs: send: use fallocate for hole punching with send stream v2 Currently holes are sent as writes full of zeroes, which results in unnecessarily using disk space at the receiving end and increasing the stream size. In some cases we avoid sending writes of zeroes, like during a full send operation where we just skip writes for holes. But for some cases we fill previous holes with writes of zeroes too, like in this scenario: 1) We have a file with a hole in the range [2M, 3M), we snapshot the subvolume and do a full send. The range [2M, 3M) stays as a hole at the receiver since we skip sending write commands full of zeroes; 2) We punch a hole for the range [3M, 4M) in our file, so that now it has a 2M hole in the range [2M, 4M), and snapshot the subvolume. Now if we do an incremental send, we will send write commands full of zeroes for the range [2M, 4M), removing the hole for [2M, 3M) at the receiver. We could improve cases such as this last one by doing additional comparisons of file extent items (or their absence) between the parent and send snapshots, but that's a lot of code to add plus additional CPU and IO costs. Since the send stream v2 already has a fallocate command and btrfs-progs implements a callback to execute fallocate since the send stream v2 support was added to it, update the kernel to use fallocate for punching holes for V2+ streams. Test coverage is provided by btrfs/284 which is a version of btrfs/007 that exercises send stream v2 instead of v1, using fsstress with random operations and fssum to verify file contents. Link: https://github.com/kdave/btrfs-progs/issues/1001 CC: stable@vger.kernel.org # 6.1+ Reviewed-by: Boris Burkov <boris@bur.io> Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2025-07-22 01:23:14 +02:00
Filipe Manana	2b759eea98	btrfs: send: directly return strcmp() result when comparing recorded refs There's no point in converting the return values from strcmp() as all we need is that it returns a negative value if the first argument is less than the second, a positive value if it's greater and 0 if equal. We do not have a need for -1 instead of any other negative value and no need for 1 instead of any other positive value - that's all that rb_find() needs and no where else we need specific negative and positive values. So remove the intermediate local variable and checks and return directly the result from strcmp(). This also reduces the module's text size. Before: $ size fs/btrfs/btrfs.ko text data bss dec hex filename 1888116 161347 16136 2065599 1f84bf fs/btrfs/btrfs.ko After: $ size fs/btrfs/btrfs.ko text data bss dec hex filename 1888052 161347 16136 2065535 1f847f fs/btrfs/btrfs.ko Reviewed-by: Boris Burkov <boris@bur.io> Reviewed-by: Qu Wenruo <wqu@suse.com> Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2025-07-22 00:09:20 +02:00
Brahmajit Das	164299ba11	btrfs: replace strcpy() with strscpy() strcpy() is discouraged from use due to lack of bounds checking. Replaces it with strscpy(), the recommended alternative for null terminated strings, to follow best practices. There are instances where strscpy() cannot be used such as where both the source and destination are character pointers. In that instance we can use sysfs_emit(). Link: https://github.com/KSPP/linux/issues/88 Suggested-by: Anthony Iliopoulos <ailiop@suse.com> Signed-off-by: Brahmajit Das <bdas@suse.de> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2025-07-22 00:05:00 +02:00
Dmitry Antipov	2fb5e56f52	btrfs: send: avoid extra calls to strlen() in gen_unique_name() Since 'snprintf()' returns the number of characters which would be emitted and output truncation is handled by 'ASSERT()', it should be safe to use that return value instead of the subsequent calls to 'strlen()' in 'gen_unique_name()'. This also reduces the module's text size. Before: $ size fs/btrfs/btrfs.ko text data bss dec hex filename 1897006 161571 16136 2074713 1fa859 fs/btrfs/btrfs.ko After: $ size fs/btrfs/btrfs.ko text data bss dec hex filename 1896848 161571 16136 2074555 1fa7bb fs/btrfs/btrfs.ko Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Dmitry Antipov <dmantipov@yandex.ru> Reviewed-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2025-07-21 23:58:05 +02:00
David Sterba	585e944a31	btrfs: send: remove btrfs_debug() calls There are debugging prints for each emitted send command and other related actions. This does not seem right as the number of commands can be high and dumping that to the system log will likely hit some rate limiting. This should be done by trace points that are more lightweight and can keep up with high frequency. Signed-off-by: David Sterba <dsterba@suse.com>	2025-05-15 14:30:56 +02:00
David Sterba	2d44a15afd	btrfs: use list_first_entry() everywhere Using the helper makes it a bit more clear that we're accessing the first list entry. Reviewed-by: Qu Wenruo <wqu@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2025-05-15 14:30:47 +02:00
David Sterba	9e0a739a9e	btrfs: convert ASSERT(0) with handled errors to DEBUG_WARN() The use of ASSERT(0) is maybe useful for some cases but more like a notice for developers. Assertions can be compiled in independently so convert it to a debugging helper. The difference is that it's just a warning and will not end up in BUG(). The converted cases are in connection with proper error handling. Reviewed-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: David Sterba <dsterba@suse.com>	2025-05-15 14:30:47 +02:00
Qu Wenruo	a4a636a437	btrfs: send: prepare put_file_data() for large data folios Currently put_file_data() can only accept a page sized folio. However the function itself is not that complex, it's just copying data from filemap folio into the send buffer. Make it support large data folios: - Change the loop to use file offset instead of page index - Calculate @pg_offset and @cur_len after getting the folio - Remove the "WARN_ON(folio_order(folio));" line Signed-off-by: Qu Wenruo <wqu@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2025-05-15 14:30:42 +02:00
Qu Wenruo	70a376475d	btrfs: send: remove the again label inside put_file_data() The again label is here to retry to get the folio for the current index. When triggering that label, there is no advance of the iterator. So it can be replaced by a simple "continue" and remove the again label. Signed-off-by: Qu Wenruo <wqu@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2025-05-15 14:30:42 +02:00
Filipe Manana	b204e5c7d4	btrfs: make btrfs_iget() return a btrfs inode instead It's an internal function and most of the time the callers are doing a lot of BTRFS_I() calls on the returned VFS inode to get the btrfs inode, so change the return type to struct btrfs_inode instead. Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2025-03-18 20:35:50 +01:00
Filipe Manana	08f340767d	btrfs: send: simplify return logic from send_encoded_extent() The 'out' label is pointless as we don't have anything to cleanup anymore (we used to have an inode to iput), so remove it and make error paths directly return an error. Reviewed-by: Qu Wenruo <wqu@suse.com> Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2025-03-18 20:35:50 +01:00
Filipe Manana	0c8337c220	btrfs: send: remove unnecessary inode lookup at send_encoded_inline_extent() We are doing a lookup of the inode but we don't use it at all. So just remove this pointless lookup. Reviewed-by: Qu Wenruo <wqu@suse.com> Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2025-03-18 20:35:50 +01:00
David Sterba	4e043cd196	btrfs: pass btrfs_root pointers to send ioctl parameters The ioctl switch btrfs_ioctl() provides several parameter types for convenience so we don't have to do the conversion in the callbacks. Pass root pointers to the send related functions. Signed-off-by: David Sterba <dsterba@suse.com>	2025-03-18 20:35:49 +01:00
Filipe Manana	25aff7b964	btrfs: send: simplify return logic from send_set_xattr() There's no longer any need for the 'out' label as there are no resources to cleanup anymore in case of an error and we can directly return if begin_cmd() fails. Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2025-03-18 20:35:46 +01:00
Filipe Manana	374d45af64	btrfs: send: avoid path allocation for the current inode when issuing commands Whenever we issue a command we allocate a path and then compute it. For the current inode this is not necessary since we have one preallocated and computed in the send context structure, so we can use it instead and avoid allocating and freeing a path. For example if we have 100 extents to send (100 write commands) for a file, we are allocating and freeing paths 100 times. So improve on this by avoiding path allocation and freeing whenever a command is for the current inode by using the current inode's path stored in the send context structure. A test was run before applying this patch and the previous one in the series: "btrfs: send: keep the current inode's path cached" The test script is the following: $ cat test.sh #!/bin/bash DEV=/dev/nullb0 MNT=/mnt/nullb0 mkfs.btrfs -f $DEV > /dev/null mount $DEV $MNT DIR="$MNT/one/two/three/four" FILE="$DIR/foobar" mkdir -p $DIR # Create some empty files to get a deeper btree and therefore make # path computations slower. for ((i = 1; i <= 30000; i++)); do echo -n > "$DIR/filler_$i" done for ((i = 0; i < 10000; i += 2)); do offset=$(( i * 4096 )) xfs_io -f -c "pwrite -S 0xab $offset 4K" $FILE > /dev/null done btrfs subvolume snapshot -r $MNT $MNT/snap start=$(date +%s%N) btrfs send -f /dev/null $MNT/snap end=$(date +%s%N) echo -e "\nsend took $(( (end - start) / 1000000 )) milliseconds" umount $MNT Result before applying the 2 patches: 1121 milliseconds Result after applying the 2 patches: 815 milliseconds (-31.6%) Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2025-03-18 20:35:46 +01:00
Filipe Manana	fc746acb7a	btrfs: send: keep the current inode's path cached Whenever we need to send a command for the current inode, like sending writes, xattr updates, truncates, utimes, etc, we compute the inode's path each time, which implies doing some memory allocations and traversing the inode hierarchy to extract the name of the inode and each ancestor directory, and that implies doing lookups in the subvolume tree amongst other operations. Most of the time, by far, the current inode's path doesn't change while we are processing it (like if we need to issue 100 write commands, the path remains the same and it's pointless to compute it 100 times). To avoid this keep the current inode's path cached in the send context and invalidate it or update it whenever it's needed (after unlinks or renames). A performance test, and its results, is mentioned in the next patch in the series (subject: "btrfs: send: avoid path allocation for the current inode when issuing commands"). Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2025-03-18 20:35:46 +01:00
Filipe Manana	d7d56ccf10	btrfs: send: simplify return logic from send_rmdir() There is no need to have an 'out' label and jump into it since there are no resource cleanups to perform (release locks, free memory, etc), so make this simpler by removing the label and goto and instead return directly. Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2025-03-18 20:35:46 +01:00
Filipe Manana	26605cc9d0	btrfs: send: simplify return logic from send_unlink() There is no need to have an 'out' label and jump into it since there are no resource cleanups to perform (release locks, free memory, etc), so make this simpler by removing the label and goto and instead return directly. Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2025-03-18 20:35:46 +01:00
Filipe Manana	711584496f	btrfs: send: simplify return logic from send_link() There is no need to have an 'out' label and jump into it since there are no resource cleanups to perform (release locks, free memory, etc), so make this simpler by removing the label and goto and instead return directly. Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2025-03-18 20:35:46 +01:00
Filipe Manana	264515c7cb	btrfs: send: simplify return logic from send_rename() There is no need to have an 'out' label and jump into it since there are no resource cleanups to perform (release locks, free memory, etc), so make this simpler by removing the label and goto and instead return directly. Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2025-03-18 20:35:46 +01:00
Filipe Manana	cb474665f9	btrfs: send: simplify return logic from send_verity() There's no need for the 'out' label as there are no resources to cleanup in case of an error and we can directly return if begin_cmd() fails. Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2025-03-18 20:35:46 +01:00
Filipe Manana	ab12858161	btrfs: send: simplify return logic from process_changed_xattr() There is no need to have an 'out' label and jump into it since there are no resource cleanups to perform (release locks, free memory, etc), so make this simpler by removing the label and goto and instead return directly. Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2025-03-18 20:35:46 +01:00
Filipe Manana	31db3e17e2	btrfs: send: remove unnecessary return variable from process_new_xattr() There's no need for the 'ret' variable, we can just return directly the result of the call to iterate_dir_item(). Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2025-03-18 20:35:46 +01:00
Filipe Manana	892772c389	btrfs: send: simplify return logic from record_changed_ref() There is no need to have an 'out' label and jump into it since there are no resource cleanups to perform (release locks, free memory, etc), so make this simpler by removing the label and goto and instead return directly. Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2025-03-18 20:35:46 +01:00
Filipe Manana	43090f2ca9	btrfs: send: simplify return logic from record_deleted_ref() There is no need to have an 'out' label and jump into it since there are no resource cleanups to perform (release locks, free memory, etc), so make this simpler by removing the label and goto and instead return directly. Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2025-03-18 20:35:46 +01:00

1 2 3 4 5 ...

487 Commits