When indx_create_allocate() fails after
attr_allocate_clusters() succeeds, run_deallocate()
frees the disk clusters but never frees the memory
allocated by run_add_entry() via kvmalloc() for the
runs_tree structure.
Fix this by adding run_close() at the out: label to
free the run.runs memory on all error paths. The
success path is unaffected as it returns 0 directly
without going through out:, transferring ownership
of the run memory to indx->alloc_run via memcpy().
Reported-by: syzbot+7adcddaeeb860e5d3f2f@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=7adcddaeeb860e5d3f2f
Signed-off-by: Deepanshu Kartikey <Kartikey406@gmail.com>
Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
Previously the comparator was stored in struct ntfs_index and
used by low-level helpers such as hdr_find_e(). This creates
a dependency on index state in private helpers.
Resolve the compare function in the public index APIs and pass
it explicitly to internal helpers.
This should make the ownership of the comparator explicit and keeps
low-level index code independent of index-root policy.
This also resolves the TODO comment about dropping the stored
comparator from struct ntfs_index.
Signed-off-by: Adarsh Das <adarshdas950@gmail.com>
Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
This is the result of running the Coccinelle script from
scripts/coccinelle/api/kmalloc_objs.cocci. The script is designed to
avoid scalar types (which need careful case-by-case checking), and
instead replace kmalloc-family calls that allocate struct or union
object instances:
Single allocations: kmalloc(sizeof(TYPE), ...)
are replaced with: kmalloc_obj(TYPE, ...)
Array allocations: kmalloc_array(COUNT, sizeof(TYPE), ...)
are replaced with: kmalloc_objs(TYPE, COUNT, ...)
Flex array allocations: kmalloc(struct_size(PTR, FAM, COUNT), ...)
are replaced with: kmalloc_flex(*PTR, FAM, COUNT, ...)
(where TYPE may also be *VAR)
The resulting allocations no longer return "void *", instead returning
"TYPE *".
Signed-off-by: Kees Cook <kees@kernel.org>
This patch implements delayed allocation (delalloc) in ntfs3 driver.
It introduces an in-memory delayed-runlist (run_da) and the helpers to
track, reserve and later convert those delayed reservations into real
clusters at writeback time. The change keeps on-disk formats untouched and
focuses on pagecache integration, correctness and safe interaction with
fallocate, truncate, and dio/iomap paths.
Key points:
- add run_da (delay-allocated run tree) and bookkeeping for delayed clusters.
- mark ranges as delalloc (DELALLOC_LCN) instead of immediately allocating.
Actual allocation performed later (writeback / attr_set_size_ex / explicit
flush paths).
- direct i/o / iomap paths updated to avoid dio collisions with
delalloc: dio falls back or forces allocation of delayed blocks before
proceeding.
- punch/collapse/truncate/fallocate check and cancel delay-alloc reservations.
Sparse/compressed files handled specially.
- free-space checks updated (ntfs_check_free_space) to account for reserved
delalloc clusters and MFT record budgeting.
- delayed allocations are committed on last writer (file release) and on
explicit allocation flush paths.
Tested-by: syzbot@syzkaller.appspotmail.com
Reported-by: syzbot+2bd8e813c7f767aa9bb1@syzkaller.appspotmail.com
Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
This patch introduces a per-directory version counter that increments on
each directory modification (indx_insert_entry() / indx_delete_entry()).
ntfs_readdir() uses this version to detect whether the directory has
changed since enumeration began. If readdir() reaches end-of-directory
but the version has changed, the walk restarts from the beginning of the
index tree instead of returning prematurely. This provides rmdir-like
behavior for tools that remove entries as they enumerate them.
Prior to this change, bonnie++ directory operations could fail due to
premature termination of readdir() during concurrent index updates.
With this patch applied, bonnie++ completes successfully with no errors.
Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
Previously sequential reads operations relied solely on single-page reads,
causing the block layer to perform many synchronous I/O requests,
especially for large volumes or large directories. This patch introduces
explicit readahead via page_cache_sync_readahead() and file_ra_state to
reduce I/O latency and improve sequential throughput.
Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
We found an infinite loop bug in the ntfs3 file system that can lead to a
Denial-of-Service (DoS) condition.
A malformed dentry in the ntfs3 filesystem can cause the kernel to hang
during the lookup operations. By setting the HAS_SUB_NODE flag in an
INDEX_ENTRY within a directory's INDEX_ALLOCATION block and manipulating the
VCN pointer, an attacker can cause the indx_find() function to repeatedly
read the same block, allocating 4 KB of memory each time. The kernel lacks
VCN loop detection and depth limits, causing memory exhaustion and an OOM
crash.
This patch adds a return value check for fnd_push() to prevent a memory
exhaustion vulnerability caused by infinite loops. When the index exceeds the
size of the fnd->nodes array, fnd_push() returns -EINVAL. The indx_find()
function checks this return value and stops processing, preventing further
memory allocation.
Co-developed-by: Seunghun Han <kkamagui@gmail.com>
Signed-off-by: Seunghun Han <kkamagui@gmail.com>
Co-developed-by: Jihoon Kwon <kjh010315@gmail.com>
Signed-off-by: Jihoon Kwon <kjh010315@gmail.com>
Signed-off-by: Jaehun Gou <p22gone@gmail.com>
Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
There are more entries after the structure, use unsafe_memcpy() to avoid
this warning.
syzbot reported:
memcpy: detected field-spanning write (size 3656) of single field "hdr1" at fs/ntfs3/index.c:1927 (size 16)
Call Trace:
indx_insert_entry+0x1a0/0x460 fs/ntfs3/index.c:1996
ni_add_name+0x4dd/0x820 fs/ntfs3/frecord.c:2995
ni_rename+0x98/0x170 fs/ntfs3/frecord.c:3026
ntfs_rename+0xab9/0xf00 fs/ntfs3/namei.c:332
Reported-by: syzbot+3a1878433bc1cb97b42a@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=3a1878433bc1cb97b42a
Signed-off-by: Lizhi Xu <lizhi.xu@windriver.com>
Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
Index allocation requires at least one bit in the $BITMAP attribute to
track usage of index entries. If the bitmap is empty while index blocks
are already present, this reflects on-disk corruption.
syzbot triggered this condition using a malformed NTFS image. During a
rename() operation involving a long filename (which spans multiple
index entries), the empty bitmap allowed the name to be added without
valid tracking. Subsequent deletion of the original entry failed with
-ENOENT, due to unexpected index state.
Reject such cases by verifying that the bitmap is not empty when index
blocks exist.
Reported-by: syzbot+b0373017f711c06ada64@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=b0373017f711c06ada64
Fixes: d99208b919 ("fs/ntfs3: cancle set bad inode after removing name fails")
Tested-by: syzbot+b0373017f711c06ada64@syzkaller.appspotmail.com
Signed-off-by: Moon Hee Lee <moonhee.lee.ca@gmail.com>
Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
The hdr_first_de() function returns a pointer to a struct NTFS_DE. This
pointer may be NULL. To handle the NULL error effectively, it is important
to implement an error handler. This will help manage potential errors
consistently.
Additionally, error handling for the return value already exists at other
points where this function is called.
Found by Linux Verification Center (linuxtesting.org) with SVACE.
Fixes: 82cae269cf ("fs/ntfs3: Add initialization of super block")
Signed-off-by: Andrey Vatoropin <a.vatoropin@crpt.ru>
Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
On 32bit systems the "off + sizeof(struct NTFS_DE)" addition can
have an integer wrapping issue. Fix it by using size_add().
Fixes: 82cae269cf ("fs/ntfs3: Add initialization of super block")
Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
Fields flags and res[3] replaced with one 4 byte flags.
Fixes: 4534a70b70 ("fs/ntfs3: Add headers and misc files")
Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEh0DEKNP0I9IjwfWEqbAzH4MkB7YFAmZQY1sACgkQqbAzH4Mk
B7bMZQ/9Ea7GFiNcIS+tCOIn5VhmGP7Dwd92T0jS9W0AiTkM83hql4lzamrSTR7B
7tQ2lNakUYIwMUqrZgTrzx5eEgWXU4YH43Az+0m/lfTpBGr0wKiXB/3yu84sdoCG
87PkxOO+hDdK11zQ+hHkVvNBpmYSNh9GIPET4L+SMNgJztJ3ZnYWSmne0EBdI3S7
PAeFMgxag67p1zqg4qIVrO4PMeXd9WBsKL3oO1E/abXoQnHV6I5PBdYHQaafVRy8
Meokx+uyJALScI/AEEigBlCDQ6MYBIWvtb4T4+eSJCSaMMqCRj66zynEPL5oY6Z3
Ah2IR+8wiwKjDMyYODZI8nqwJtZOVne0EprHNLzu+dQ4e60N5gQbPWRWlcWJmHaT
2rhNv1uUSnXnU3oLaqkXza9nayuWPstQlgG/I92B/GHPk4c+K5Fxtiv/wmzNn77h
qzqNuXPFC0tnvbmNGMxEOzRM/duaoyf5nurBpO0Xlo2dl29RAhWfXpyGu3IVSt7P
03bkXt9FdoAyyDHH8i24yKqPLnLcDZwH+63qDPPz6EgD07DZDdBIpjosbJF/AGW7
eGvsFKcGM5zH0ktZ8ZYN5a66/jaW8NP/e+vUa5RIwipdhwCn7ZEf1ERjfS14w+yD
cbgKdbORm6DV3psM9mI/XxWLeihr0daeZWWB2GaSy+8JuYCFYVI=
=IALr
-----END PGP SIGNATURE-----
Merge tag 'ntfs3_for_6.10' of https://github.com/Paragon-Software-Group/linux-ntfs3
Pull ntfs3 updates from Konstantin Komarov:
"Fixes:
- reusing of the file index (could cause the file to be trimmed)
- infinite dir enumeration
- taking DOS names into account during link counting
- le32_to_cpu conversion, 32 bit overflow, NULL check
- some code was refactored
Changes:
- removed max link count info display during driver init
Remove:
- atomic_open has been removed for lack of use"
* tag 'ntfs3_for_6.10' of https://github.com/Paragon-Software-Group/linux-ntfs3:
fs/ntfs3: Break dir enumeration if directory contents error
fs/ntfs3: Fix case when index is reused during tree transformation
fs/ntfs3: Mark volume as dirty if xattr is broken
fs/ntfs3: Always make file nonresident on fallocate call
fs/ntfs3: Redesign ntfs_create_inode to return error code instead of inode
fs/ntfs3: Use variable length array instead of fixed size
fs/ntfs3: Use 64 bit variable to avoid 32 bit overflow
fs/ntfs3: Check 'folio' pointer for NULL
fs/ntfs3: Missed le32_to_cpu conversion
fs/ntfs3: Remove max link count info display during driver init
fs/ntfs3: Taking DOS names into account during link counting
fs/ntfs3: remove atomic_open
fs/ntfs3: use kcalloc() instead of kzalloc()
In most cases when adding a cluster to the directory index,
they are placed at the end, and in the bitmap, this cluster corresponds
to the last bit. The new directory size is calculated as follows:
data_size = (u64)(bit + 1) << indx->index_bits;
In the case of reusing a non-final cluster from the index,
data_size is calculated incorrectly, resulting in the directory size
differing from the actual size.
A check for cluster reuse has been added, and the size update is skipped.
Fixes: 82cae269cf ("fs/ntfs3: Add initialization of super block")
Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
Cc: stable@vger.kernel.org
bitmap_size() is a pretty generic name and one may want to use it for
a generic bitmap API function. At the same time, its logic is
NTFS-specific, as it aligns to the sizeof(u64), not the sizeof(long)
(although it uses ideologically right ALIGN() instead of division).
Add the prefix 'ntfs3_' used for that FS (not just 'ntfs_' to not mix
it with the legacy module) and use generic BITS_TO_U64() while at it.
Suggested-by: Yury Norov <yury.norov@gmail.com> # BITS_TO_U64()
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Reviewed-by: Yury Norov <yury.norov@gmail.com>
Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Upon investigation of the C reproducer provided by Syzbot, it seemed
the reproducer was trying to mount a corrupted NTFS filesystem, then
issue a rename syscall to some nodes in the filesystem. This can be
shown by modifying the reproducer to only include the mount syscall,
and investigating the filesystem by e.g. `ls` and `rm` commands. As a
result, during the problematic call to `hdr_fine_e`, the `inode` being
supplied did not go through `indx_init`, hence the `cmp` function
pointer was never set.
The fix is simply to check whether `cmp` is not set, and return NULL
if that's the case, in order to be consistent with other error
scenarios of the `hdr_find_e` method. The rationale behind this patch
is that:
- We should prevent crashing the kernel even if the mounted filesystem
is corrupted. Any syscalls made on the filesystem could return
invalid, but the kernel should be able to sustain these calls.
- Only very specific corruption would lead to this bug, so it would be
a pretty rare case in actual usage anyways. Therefore, introducing a
check to specifically protect against this bug seems appropriate.
Because of its rarity, an `unlikely` clause is used to wrap around
this nullity check.
Reported-by: syzbot+60cf892fc31d1f4358fc@syzkaller.appspotmail.com
Signed-off-by: Ziqi Zhao <astrajoan@yahoo.com>
Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
Added minor refactoring.
Added and fixed some comments.
In some places, the code has been reformatted to fit into 80 columns.
clang-format-12 was used to format code according kernel's .clang-format.
Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
Added null pointer checks in function ntfs_security_init.
Also added le32_to_cpu in functions ntfs_security_init and indx_read.
Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
Here is a BUG report from syzbot:
BUG: KASAN: slab-out-of-bounds in hdr_delete_de+0xe0/0x150 fs/ntfs3/index.c:806
Read of size 16842960 at addr ffff888079cc0600 by task syz-executor934/3631
Call Trace:
memmove+0x25/0x60 mm/kasan/shadow.c:54
hdr_delete_de+0xe0/0x150 fs/ntfs3/index.c:806
indx_delete_entry+0x74f/0x3670 fs/ntfs3/index.c:2193
ni_remove_name+0x27a/0x980 fs/ntfs3/frecord.c:2910
ntfs_unlink_inode+0x3d4/0x720 fs/ntfs3/inode.c:1712
ntfs_rename+0x41a/0xcb0 fs/ntfs3/namei.c:276
Before using the meta-data in struct INDEX_HDR, we need to
check index header valid or not. Otherwise, the corruptedi
(or malicious) fs image can cause out-of-bounds access which
could make kernel panic.
Fixes: 82cae269cf ("fs/ntfs3: Add initialization of super block")
Reported-by: syzbot+9c2811fd56591639ff5f@syzkaller.appspotmail.com
Signed-off-by: Zeng Heng <zengheng4@huawei.com>
Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
Syzbot reported a OOB read bug:
BUG: KASAN: slab-out-of-bounds in indx_insert_into_buffer+0xaa3/0x13b0
fs/ntfs3/index.c:1755
Read of size 17168 at addr ffff8880255e06c0 by task syz-executor308/3630
Call Trace:
<TASK>
memmove+0x25/0x60 mm/kasan/shadow.c:54
indx_insert_into_buffer+0xaa3/0x13b0 fs/ntfs3/index.c:1755
indx_insert_entry+0x446/0x6b0 fs/ntfs3/index.c:1863
ntfs_create_inode+0x1d3f/0x35c0 fs/ntfs3/inode.c:1548
ntfs_create+0x3e/0x60 fs/ntfs3/namei.c:100
lookup_open fs/namei.c:3413 [inline]
If the member struct INDEX_BUFFER *index of struct indx_node is
incorrect, that is, the value of __le32 used is greater than the value
of __le32 total in struct INDEX_HDR. Therefore, OOB read occurs when
memmove is called in indx_insert_into_buffer().
Fix this by adding a check in hdr_find_e().
Fixes: 82cae269cf ("fs/ntfs3: Add initialization of super block")
Reported-by: syzbot+d882d57193079e379309@syzkaller.appspotmail.com
Signed-off-by: ZhangPeng <zhangpeng362@huawei.com>
Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
Added new functions index_hdr_check and index_buf_check.
Now we check all stuff for correctness while reading from disk.
Also fixed bug with stale nfs data.
Reported-by: van fantasy <g1042620637@gmail.com>
Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
There were 2 problems:
- in some cases we lost dirty flag;
- cluster allocation can be called even when it wasn't needed.
Fixes xfstest generic/465
Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
Remove ntfs_sparse_cluster.
Zero clusters in attr_allocate_clusters.
Fixes xfstest generic/263
Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
The functions from bitops.h already have _le variants so use them to
prevent invalid reads/writes of the bitmap on big endian systems.
Signed-off-by: Thomas Kühnel <thomas.kuehnel@avm.de>
Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
There is no need to initialize with NULL as it'll be rewritten later.
Signed-off-by: Li kunyu <kunyu@nfschina.com>
Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
This value is checked in indx_read, so it must be initialized
Fixes: 82cae269cf ("fs/ntfs3: Add initialization of super block")
Signed-off-by: Yan Lei <chinayanlei2002@163.com>
Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
This commit makes function a bit more readable
Cc: Joe Perches <joe@perches.com>
Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
'fnd' has been dereferenced several time before, so testing it here is
pointless.
Moreover, all callers of 'indx_find()' already have some error handling
code that makes sure that no NULL 'fnd' is passed.
So, remove the useless test.
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Reviewed-by: Kari Argillander <kari.argillander@gmail.com>
Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
We do not have any reason to keep old linear search in. Before this was
used for error path or if table was so big that it cannot be allocated.
Current binary search implementation won't need error path. Remove old
references to linear entry search.
Signed-off-by: Kari Argillander <kari.argillander@gmail.com>
Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
We could try to optimize algorithm to first fill just small table and
after that use bigger table all the way up to ARRAY_SIZE(offs). This
way we can use bigger search array, but not lose benefits with entry
count smaller < ARRAY_SIZE(offs).
Signed-off-by: Kari Argillander <kari.argillander@gmail.com>
Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
Current binary search allocates memory for table and fill whole table
before we start actual binary search. This is quite inefficient because
table fill will always be O(n). Also if table is huge we need to
reallocate memory which is costly.
This implementation use just stack memory and always when table is full
we will check if last element is <= and if not start table fill again.
The idea was that it would be same cost as table reallocation.
Signed-off-by: Kari Argillander <kari.argillander@gmail.com>
Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
We have lot of unnecessary headers in these files. Remove them so that
we help compiler a little bit.
Signed-off-by: Kari Argillander <kari.argillander@gmail.com>
Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
The variable err is being initialized with a value that is never read, it
is being updated later on. The assignment is redundant and can be removed.
Addresses-Coverity: ("Unused value")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Reviewed-by: Kari Argillander <kari.argillander@gmail.com>
Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
Rename now works "Add new name and remove old name".
"Remove old name and add new name" may result in bad inode
if we can't add new name and then can't restore (add) old name.
Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
Capitalize comments and end with period for better reading.
Also function comments are now little more kernel-doc style. This way we
can easily convert them to kernel-doc style if we want. Note that these
are not yet complete with this style. Example function comments start
with /* and in kernel-doc style they start /**.
Use imperative mood in function descriptions.
Change words like ntfs -> NTFS, linux -> Linux.
Use "we" not "I" when commenting code.
Signed-off-by: Kari Argillander <kari.argillander@gmail.com>
Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
There are three bugs in this code:
1) If indx_get_root() fails, then return -EINVAL instead of success.
2) On the "/* make root external */" -EOPNOTSUPP; error path it should
free "re" but it has a memory leak.
3) If indx_new() fails then it will lead to an error pointer dereference
when we call put_indx_node().
I've re-written the error handling to be more clear.
Fixes: 82cae269cf ("fs/ntfs3: Add initialization of super block")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Kari Argillander <kari.argillander@gmail.com>
Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
The "e" pointer is dereferenced before it has been checked for NULL.
Move the dereference after the NULL check to prevent an Oops.
Fixes: 82cae269cf ("fs/ntfs3: Add initialization of super block")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Kari Argillander <kari.argillander@gmail.com>
Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>