linux

mirror of https://github.com/torvalds/linux.git synced 2026-05-28 00:53:34 +02:00

Author	SHA1	Message	Date
Jens Axboe	02b8d41c17	io_uring/rsrc: use kvfree() for the imu cache Currently anything that requires kvmalloc_flex() for allocations will not get re-cached, and hence the cache freeing path is correct in that it always uses kfree() to free the allocated memory. But this seems a bit fragile as it's something that could get mix should that situation change, so switch io_free_imu() and io_alloc_cache_free() to use kvfree as the desctructor. Signed-off-by: Jens Axboe <axboe@kernel.dk>	2026-04-21 12:19:01 -06:00
Jens Axboe	53262c91f7	io_uring/rsrc: unify nospec indexing for direct descriptors For file updates, the node reset isn't capping the value via array_index_nospec() like the other paths do. Ensure it's all sane and have the update path do the proper capping as well. Reviewed-by: Gabriel Krisman Bertazi <krisman@suse.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2026-04-21 12:18:54 -06:00
Linus Torvalds	23acda7c22	for-7.1/io_uring-20260411 -----BEGIN PGP SIGNATURE----- iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAmna0vIQHGF4Ym9lQGtl cm5lbC5kawAKCRD301j7KXHgpu8MEACN6owH/1suaJp5HBhrKseVIPQl1ldmsGF3 ZDwZndUE6pWXaeuI3g5QjSPcfWIUuLG6vs/btkIh4M32zAcFsSD8zYPItvgFzMVp X762WPCrUcfFwKt5GqeNn6IblO8BrsbzoJWNCaSVRhWqCdzQRVktq6684nNy/fj1 JBFnMsRpwGhoKzpg1oCLOrs0V57CRdJqFdmMzQHwRTWHemvfHf6SD2+h9axfKCaV baqvXGOLQXLwr8qHFo1LIu8lqEltHUa7boU8EMFQn/v8sPjUv46EuqZ8VVtzXH08 fY2zqWI5atA3DZCfORCHnK0qh6tPiSUtVUilXbIffhqd6lCTs891RJf3TegRCGTZ k8WfBFVKzVlhbgGk0Km6+tiHTaK1ZmcKU0Q+uucnb3RlOdOoPvXJy3u+I5BK74aV 36JmNPWRQfzh5icmrrGKySBTX0z7NPtMiEA+qHEndIO5FWrkf5pf9U5C5gu0WEMh iK2gotbd0Vym3EpqKQnefxflce6IpYteOACeYPXAprcQOzPK+WYjiVUJ9JcH6DhP RPUIXXck8+GkHnM9vWtBXBKaoR7gcATHUzLX8ZnhDkAhsTJ+tOXN8skq28gglUtj 8kLMzyXklbhAJsykxKn0rqcNUOcVMatFyK4VIFyp2tWRhzMDAY4xyXYSz0lRowkd pZAm4eSkmw== =IoaB -----END PGP SIGNATURE----- Merge tag 'for-7.1/io_uring-20260411' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux Pull io_uring updates from Jens Axboe: - Add a callback driven main loop for io_uring, and BPF struct_ops on top to allow implementing custom event loop logic - Decouple IOPOLL from being a ring-wide all-or-nothing setting, allowing IOPOLL use cases to also issue certain white listed non-polled opcodes - Timeout improvements. Migrate internal timeout storage from timespec64 to ktime_t for simpler arithmetic and avoid copying of timespec data - Zero-copy receive (zcrx) updates: - Add a device-less mode (ZCRX_REG_NODEV) for testing and experimentation where data flows through the copy fallback path - Fix two-step unregistration regression, DMA length calculations, xarray mark usage, and a potential 32-bit overflow in id shifting - Refactoring toward multi-area support: dedicated refill queue struct, consolidated DMA syncing, netmem array refilling format, and guard-based locking - Zero-copy transmit (zctx) cleanup: - Unify io_send_zc() and io_sendmsg_zc() into a single function - Add vectorized registered buffer send for IORING_OP_SEND_ZC - Add separate notification user_data via sqe->addr3 so notification and completion CQEs can be distinguished without extra reference counting - Switch struct io_ring_ctx internal bitfields to explicit flag bits with atomic-safe accessors, and annotate the known harmless races on those flags - Various optimizations caching ctx and other request fields in local variables to avoid repeated loads, and cleanups for tctx setup, ring fd registration, and read path early returns * tag 'for-7.1/io_uring-20260411' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux: (58 commits) io_uring: unify getting ctx from passed in file descriptor io_uring/register: don't get a reference to the registered ring fd io_uring/tctx: clean up __io_uring_add_tctx_node() error handling io_uring/tctx: have io_uring_alloc_task_context() return tctx io_uring/timeout: use 'ctx' consistently io_uring/rw: clean up __io_read() obsolete comment and early returns io_uring/zcrx: use correct mmap off constants io_uring/zcrx: use dma_len for chunk size calculation io_uring/zcrx: don't clear not allocated niovs io_uring/zcrx: don't use mark0 for allocating xarray io_uring: cast id to u64 before shifting in io_allocate_rbuf_ring() io_uring/zcrx: reject REG_NODEV with large rx_buf_size io_uring/cancel: validate opcode for IORING_ASYNC_CANCEL_OP io_uring/rsrc: use io_cache_free() to free node io_uring/zcrx: rename zcrx [un]register functions io_uring/zcrx: check ctrl op payload struct sizes io_uring/zcrx: cache fallback availability in zcrx ctx io_uring/zcrx: warn on a repeated area append io_uring/zcrx: consolidate dma syncing io_uring/zcrx: netmem array as refiling format ...	2026-04-13 16:22:30 -07:00
Jens Axboe	c5e9f6a96b	io_uring: unify getting ctx from passed in file descriptor io_uring_enter() and io_uring_register() end up having duplicated code for getting a ctx from a passed in file descriptor, for either a registered ring descriptor or a normal file descriptor. Move the io_uring_register_get_file() into io_uring.c and name it a bit more generically, and use it from both callsites rather than have that logic and handling duplicated. Signed-off-by: Jens Axboe <axboe@kernel.dk>	2026-04-08 13:21:35 -06:00
Jens Axboe	b4d893d636	io_uring/register: don't get a reference to the registered ring fd This isn't necessary and was only done because the register path isn't a hot path and hence the extra ref/put doesn't matter, and to have the exit path be able to unconditionally put whatever file was gotten regardless of the type. In preparation for sharing this code with the main io_uring_enter(2) syscall, drop the reference and have the caller conditionally put the file if it was a normal file descriptor. Signed-off-by: Jens Axboe <axboe@kernel.dk>	2026-04-08 13:21:35 -06:00
Linus Torvalds	e41255ce7a	io_uring-7.0-20260403 -----BEGIN PGP SIGNATURE----- iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAmnPokUQHGF4Ym9lQGtl cm5lbC5kawAKCRD301j7KXHgpjW6D/91Xg/mGvYUBVvwEhP0ydPncuAsThnkoDHY 6Pu+VxawKW480yAC06nktAeDgJNnpFpJXatPEtk2n8r7Ol3Cx0sDWdQjzoKSlBC7 9wj+MVpCcU970Gb1G6PNLKQoW+DxKuD9Iq6Ph434uCx/bgC2EKthj0vYpssoU48S OxyFGBTjhbgnmiZaAEMHpLC/LJP27eH24QQbobeVWyY7C6jy6YI0WQaoG4Qt+UMd S2XdFe97xrVaCVS3E5X5BAyHCcMX4e1D6/Y7bNDGG3Ke673RuUJHhqvk8P1NJnTI CaMlfoGhNw36FpkzTYIvoZlkCFl48axXmscRcekTg4d9ssnY9aSFVY+xMSHmkhKu zs1r1tZK970xUbQK0NAoD9T+LsFKU1S0PaEaCL2KMHwz9vG0uY7iUYteKqdM8L/f jUpYcxn9R6AhdeL77eEu3w6vCdMqP2+OgDv1uEpyJv6oWSdhfI38+EIwmMoEq+Az BkDipYNh4lAiI23qbS9CDe5aam6pv+hwecDn3x7MZVpGZ6cJjs43QWwk+jZz+KQj gacQu01q/TN7rpyaFYhkxPGKHVs259/uSLJY647ORgpJNXg4a+6DlB7/4YdW35il O4gnKECSflmoePm7B4QFh8Q89XPma74hVqtDB7opz0xL3PQMq07EQKQmoNP7RTOp GLW71uD2MA== =q/wK -----END PGP SIGNATURE----- Merge tag 'io_uring-7.0-20260403' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux Pull io_uring fixes from Jens Axboe: - A previous fix in this release covered the case of the rings being RCU protected during resize, but it missed a few spots. This covers the rest - Fix the cBPF filters when COW'ed, introduced in this merge window - Fix for an attempt to import a zero sized buffer - Fix for a missing clamp in importing bundle buffers * tag 'io_uring-7.0-20260403' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux: io_uring/bpf_filters: retain COW'ed settings on parse failures io_uring: protect remaining lockless ctx->rings accesses with RCU io_uring/rsrc: reject zero-length fixed buffer import io_uring/net: fix slab-out-of-bounds read in io_bundle_nbufs()	2026-04-03 11:58:04 -07:00
Jackie Liu	19a8cc6cda	io_uring/rsrc: use io_cache_free() to free node Replace kfree(node) with io_cache_free() in io_buffer_register_bvec() to match all other error paths that free nodes allocated via io_rsrc_node_alloc(). The node is allocated through io_cache_alloc() internally, so it should be returned to the cache via io_cache_free() for proper object reuse. Signed-off-by: Jackie Liu <liuyun01@kylinos.cn> Link: https://patch.msgid.link/20260331104509.7055-1-liu.yun@linux.dev [axboe: remove fixes tag, it's not a fix, it's a cleanup] Signed-off-by: Jens Axboe <axboe@kernel.dk>	2026-04-01 10:21:13 -06:00
Qi Tang	111a12b422	io_uring/rsrc: reject zero-length fixed buffer import validate_fixed_range() admits buf_addr at the exact end of the registered region when len is zero, because the check uses strict greater-than (buf_end > imu->ubuf + imu->len). io_import_fixed() then computes offset == imu->len, which causes the bvec skip logic to advance past the last bio_vec entry and read bv_offset from out-of-bounds slab memory. Return early from io_import_fixed() when len is zero. A zero-length import has no data to transfer and should not walk the bvec array at all. BUG: KASAN: slab-out-of-bounds in io_import_reg_buf+0x697/0x7f0 Read of size 4 at addr ffff888002bcc254 by task poc/103 Call Trace: io_import_reg_buf+0x697/0x7f0 io_write_fixed+0xd9/0x250 __io_issue_sqe+0xad/0x710 io_issue_sqe+0x7d/0x1100 io_submit_sqes+0x86a/0x23c0 __do_sys_io_uring_enter+0xa98/0x1590 Allocated by task 103: The buggy address is located 12 bytes to the right of allocated 584-byte region [ffff888002bcc000, ffff888002bcc248) Fixes: `8622b20f23` ("io_uring: add validate_fixed_range() for validate fixed buffer") Signed-off-by: Qi Tang <tpluszz77@gmail.com> Link: https://patch.msgid.link/20260329164936.240871-1-tpluszz77@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2026-03-29 14:03:55 -06:00
Jens Axboe	f1a424e21c	io_uring: switch struct io_ring_ctx internal bitfields to flags Bitfields cannot be set and checked atomically, and this makes it more clear that these are indeed in shared storage and must be checked and set in a sane fashion. This is in preparation for annotating a few of the known racy, but harmless, flags checking. No intended functional changes in this patch. Reviewed-by: Gabriel Krisman Bertazi <krisman@suse.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2026-03-16 15:32:59 -06:00
Linus Torvalds	323bbfcf1e	Convert 'alloc_flex' family to use the new default GFP_KERNEL argument This is the exact same thing as the 'alloc_obj()' version, only much smaller because there are a lot fewer users of the alloc_flex() interface. As with alloc_obj() version, this was done entirely with mindless brute force, using the same script, except using 'flex' in the pattern rather than 'objs'. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2026-02-21 17:09:51 -08:00
Linus Torvalds	bf4afc53b7	Convert 'alloc_obj' family to use the new default GFP_KERNEL argument This was done entirely with mindless brute force, using git grep -l '\<k[vmz]alloc_objs(., GFP_KERNEL)' \| xargs sed -i 's/\(alloc_objs(.*\), GFP_KERNEL)/\1)/' to convert the new alloc_obj() users that had a simple GFP_KERNEL argument to just drop that argument. Note that due to the extreme simplicity of the scripting, any slightly more complex cases spread over multiple lines would not be triggered: they definitely exist, but this covers the vast bulk of the cases, and the resulting diff is also then easier to check automatically. For the same reason the 'flex' versions will be done as a separate conversion. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2026-02-21 17:09:51 -08:00
Kees Cook	69050f8d6d	treewide: Replace kmalloc with kmalloc_obj for non-scalar types This is the result of running the Coccinelle script from scripts/coccinelle/api/kmalloc_objs.cocci. The script is designed to avoid scalar types (which need careful case-by-case checking), and instead replace kmalloc-family calls that allocate struct or union object instances: Single allocations: kmalloc(sizeof(TYPE), ...) are replaced with: kmalloc_obj(TYPE, ...) Array allocations: kmalloc_array(COUNT, sizeof(TYPE), ...) are replaced with: kmalloc_objs(TYPE, COUNT, ...) Flex array allocations: kmalloc(struct_size(PTR, FAM, COUNT), ...) are replaced with: kmalloc_flex(PTR, FAM, COUNT, ...) (where TYPE may also be VAR) The resulting allocations no longer return "void ", instead returning "TYPE ". Signed-off-by: Kees Cook <kees@kernel.org>	2026-02-21 01:02:28 -08:00
Pavel Begunkov	2e02f9efdb	io_uring/rsrc: improve regbuf iov validation Deduplicate io_buffer_validate() calls by moving the checks into io_sqe_buffer_register(). Now we also don't need special handling in io_buffer_validate() passing through buffer removal requests. I also was using it as a cleanup before some other changes. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2026-02-16 08:15:38 -07:00
Pavel Begunkov	0efc331d78	io_uring/rsrc: replace reg buffer bit field with flags I'll need a flag in the registered buffer struct for dmabuf work, and it'll be more convenient to have a flags field rather than bit fields, especially for io_mapped_ubuf initialisation. We might want to add more flags in the future as well. For example, it might be useful for debugging and potentially optimisations to split out a flag indicating the shape of the buffer to gate iov_iter_advance() walks vs bit/mask arithmetics. It can also be combined with the direction mask field. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2026-02-10 05:26:15 -07:00
Linus Torvalds	0c00ed308d	for-7.0/block-20260206 -----BEGIN PGP SIGNATURE----- iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAmmGLwcQHGF4Ym9lQGtl cm5lbC5kawAKCRD301j7KXHgpv+TD/48S2HTnMhmW6AtFYWErQ+sEKXpHrxbYe7S +qR8/g/T+QSfhfqPwZEuagndFKtIP3LJfaXGSP1Lk1RfP9NLQy91v33Ibe4DjHkp etWSfnMHA9MUAoWKmg8EvncB2G+ZQFiYCpjazj5tKHD9S2+psGMuL8kq6qzMJE83 uhpb8WutUl4aSIXbMSfyGlwBhI1MjjRbbWlIBmg4yC8BWt1sH8Qn2L2GNVylEIcX U8At3KLgPGn0axSg4yGMAwTqtGhL/jwdDyeczbmRlXuAr4iVL9UX/yADCYkazt6U ttQ2/H+cxCwfES84COx9EteAatlbZxo6wjGvZ3xOMiMJVTjYe1x6Gkcckq+LrZX6 tjofi2KK78qkrMXk1mZMkZjpyUWgRtCswhDllbQyqFs0SwzQtno2//Rk8HU9dhbt pkpryDbGFki9X3upcNyEYp5TYflpW6YhAzShYgmE6KXim2fV8SeFLviy0erKOAl+ fwjTE6KQ5QoQv0s3WxkWa4lREm34O6IHrCUmbiPm5CruJnQDhqAN2QZIDgYC4WAf 0gu9cR/O4Vxu7TQXrumPs5q+gCyDU0u0B8C3mG2s+rIo+PI5cVZKs2OIZ8HiPo0F x73kR/pX3DMe35ZQkQX22ymMuowV+aQouDLY9DTwakP5acdcg7h7GZKABk6VLB06 gUIsnxURiQ== =jNzW -----END PGP SIGNATURE----- Merge tag 'for-7.0/block-20260206' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux Pull block updates from Jens Axboe: - Support for batch request processing for ublk, improving the efficiency of the kernel/ublk server communication. This can yield nice 7-12% performance improvements - Support for integrity data for ublk - Various other ublk improvements and additions, including a ton of selftests additions and updated - Move the handling of blk-crypto software fallback from below the block layer to above it. This reduces the complexity of dealing with bio splitting - Series fixing a number of potential deadlocks in blk-mq related to the queue usage counter and writeback throttling and rq-qos debugfs handling - Add an async_depth queue attribute, to resolve a performance regression that's been around for a qhilw related to the scheduler depth handling - Only use task_work for IOPOLL completions on NVMe, if it is necessary to do so. An earlier fix for an issue resulted in all these completions being punted to task_work, to guarantee that completions were only run for a given io_uring ring when it was local to that ring. With the new changes, we can detect if it's necessary to use task_work or not, and avoid it if possible. - rnbd fixes: - Fix refcount underflow in device unmap path - Handle PREFLUSH and NOUNMAP flags properly in protocol - Fix server-side bi_size for special IOs - Zero response buffer before use - Fix trace format for flags - Add .release to rnbd_dev_ktype - MD pull requests via Yu Kuai - Fix raid5_run() to return error when log_init() fails - Fix IO hang with degraded array with llbitmap - Fix percpu_ref not resurrected on suspend timeout in llbitmap - Fix GPF in write_page caused by resize race - Fix NULL pointer dereference in process_metadata_update - Fix hang when stopping arrays with metadata through dm-raid - Fix any_working flag handling in raid10_sync_request - Refactor sync/recovery code path, improve error handling for badblocks, and remove unused recovery_disabled field - Consolidate mddev boolean fields into mddev_flags - Use mempool to allocate stripe_request_ctx and make sure max_sectors is not less than io_opt in raid5 - Fix return value of mddev_trylock - Fix memory leak in raid1_run() - Add Li Nan as mdraid reviewer - Move phys_vec definitions to the kernel types, mostly in preparation for some VFIO and RDMA changes - Improve the speed for secure erase for some devices - Various little rust updates - Various other minor fixes, improvements, and cleanups * tag 'for-7.0/block-20260206' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux: (162 commits) blk-mq: ABI/sysfs-block: fix docs build warnings selftests: ublk: organize test directories by test ID block: decouple secure erase size limit from discard size limit block: remove redundant kill_bdev() call in set_blocksize() blk-mq: add documentation for new queue attribute async_dpeth block, bfq: convert to use request_queue->async_depth mq-deadline: covert to use request_queue->async_depth kyber: covert to use request_queue->async_depth blk-mq: add a new queue sysfs attribute async_depth blk-mq: factor out a helper blk_mq_limit_depth() blk-mq-sched: unify elevators checking for async requests block: convert nr_requests to unsigned int block: don't use strcpy to copy blockdev name blk-mq-debugfs: warn about possible deadlock blk-mq-debugfs: add missing debugfs_mutex in blk_mq_debugfs_register_hctxs() blk-mq-debugfs: remove blk_mq_debugfs_unregister_rqos() blk-mq-debugfs: make blk_mq_debugfs_register_rqos() static blk-rq-qos: fix possible debugfs_mutex deadlock blk-mq-debugfs: factor out a helper to register debugfs for all rq_qos blk-wbt: fix possible deadlock to nest pcpu_alloc_mutex under q_usage_counter ...	2026-02-09 17:57:21 -08:00
Jens Axboe	e26f51f6f6	io_uring/rsrc: use GFP_KERNEL_ACCOUNT consistently For potential long term allocations, ensure that we play nicer with memcg and use the accounting variant of the GFP_KERNEL allocation. Signed-off-by: Jens Axboe <axboe@kernel.dk>	2026-01-25 10:07:35 -07:00
Ming Lei	15f506a77a	io_uring: remove nr_segs recalculation in io_import_kbuf() io_import_kbuf() recalculates iter->nr_segs to reflect only the bvecs needed for the requested byte range. This was added to provide an accurate segment count to bio_iov_bvec_set(), which copied nr_segs to bio->bi_vcnt for use as a bio split hint. The previous two patches eliminated this dependency: - bio_may_need_split() now uses bi_iter instead of bi_vcnt for split decisions - bio_iov_bvec_set() no longer copies nr_segs to bi_vcnt Since nr_segs is no longer used for bio split decisions, the recalculation loop is unnecessary. The iov_iter already has the correct bi_size to cap iteration, so an oversized nr_segs is harmless. Link: https://lkml.org/lkml/2025/4/16/351 Signed-off-by: Ming Lei <ming.lei@redhat.com> Reviewed-by: Nitesh Shetty <nj.shetty@samsung.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2026-01-07 08:06:33 -07:00
huang-jl	114ea9bbaf	io_uring: fix nr_segs calculation in io_import_kbuf io_import_kbuf() calculates nr_segs incorrectly when iov_offset is non-zero after iov_iter_advance(). It doesn't account for the partial consumption of the first bvec. The problem comes when meet the following conditions: 1. Use UBLK_F_AUTO_BUF_REG feature of ublk. 2. The kernel will help to register the buffer, into the io uring. 3. Later, the ublk server try to send IO request using the registered buffer in the io uring, to read/write to fuse-based filesystem, with O_DIRECT. >From a userspace perspective, the ublk server thread is blocked in the kernel, and will see "soft lockup" in the kernel dmesg. When ublk registers a buffer with mixed-size bvecs like [4K]6 + [12K] and a request partially consumes a bvec, the next request's nr_segs calculation uses bvec->bv_len instead of (bv_len - iov_offset). This causes fuse_get_user_pages() to loop forever because nr_segs indicates fewer pages than actually needed. Specifically, the infinite loop happens at: fuse_get_user_pages() -> iov_iter_extract_pages() -> iov_iter_extract_bvec_pages() Since the nr_segs is miscalculated, the iov_iter_extract_bvec_pages returns when finding that i->nr_segs is zero. Then iov_iter_extract_pages returns zero. However, fuse_get_user_pages does still not get enough data/pages, causing infinite loop. Example: - Bvecs: [4K, 4K, 4K, 4K, 4K, 4K, 12K, ...] - Request 1: 32K at offset 0, uses 64K + 8K of the 12K bvec - Request 2: 32K at offset 32K - iov_offset = 8K (8K already consumed from 12K bvec) - Bug: calculates using 12K, not (12K - 8K) = 4K - Result: nr_segs too small, infinite loop in fuse_get_user_pages. Fix by accounting for iov_offset when calculating the first segment's available length. Fixes: `b419bed4f0` ("io_uring/rsrc: ensure segments counts are correct on kbuf buffers") Signed-off-by: huang-jl <huang-jl@deepseek.com> Reviewed-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2025-12-17 07:35:42 -07:00
Joanne Koong	525916ce49	io_uring/rsrc: fix lost entries after cloned range When cloning with node replacements (IORING_REGISTER_DST_REPLACE), destination entries after the cloned range are not copied over. Add logic to copy them over to the new destination table. Fixes: `c1329532d5` ("io_uring/rsrc: allow cloning with node replacements") Cc: stable@vger.kernel.org Signed-off-by: Joanne Koong <joannelkoong@gmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2025-12-04 15:46:13 -07:00
Joanne Koong	e29af2aba2	io_uring/rsrc: rename misleading src_node variable in io_clone_buffers() The variable holds nodes from the destination ring's existing buffer table. In io_clone_buffers(), the term "src" is used to refer to the source ring. Rename to node for clarity. Signed-off-by: Joanne Koong <joannelkoong@gmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2025-12-04 15:46:13 -07:00
Joanne Koong	b8201b50e4	io_uring/rsrc: clean up buffer cloning arg validation Get rid of some redundant checks and move the src arg validation to before the buffer table allocation, which simplifies error handling. Signed-off-by: Joanne Koong <joannelkoong@gmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2025-12-04 15:46:13 -07:00
Jens Axboe	ecb8490b2f	Merge branch 'io_uring-6.18' into for-6.19/io_uring Merge 6.18-rc io_uring fixes, as certain coming changes depend on some of these. * io_uring-6.18: io_uring/rsrc: don't use blk_rq_nr_phys_segments() as number of bvecs io_uring/query: return number of available queries io_uring/rw: ensure allocated iovec gets cleared for early failure io_uring: fix regbuf vector size truncation io_uring: fix types for region size calulation io_uring/zcrx: remove sync refill uapi io_uring: fix buffer auto-commit for multishot uring_cmd io_uring: correct __must_hold annotation in io_install_fixed_file io_uring zcrx: add MAINTAINERS entry io_uring: Fix code indentation error io_uring/sqpoll: be smarter on when to update the stime usage io_uring/sqpoll: switch away from getrusage() for CPU accounting io_uring: fix incorrect unlikely() usage in io_waitid_prep() Signed-off-by: Jens Axboe <axboe@kernel.dk>	2025-11-13 07:26:37 -07:00
Caleb Sander Mateos	2d0e88f3fd	io_uring/rsrc: don't use blk_rq_nr_phys_segments() as number of bvecs io_buffer_register_bvec() currently uses blk_rq_nr_phys_segments() as the number of bvecs in the request. However, bvecs may be split into multiple segments depending on the queue limits. Thus, the number of segments may overestimate the number of bvecs. For ublk devices, the only current users of io_buffer_register_bvec(), virt_boundary_mask, seg_boundary_mask, max_segments, and max_segment_size can all be set arbitrarily by the ublk server process. Set imu->nr_bvecs based on the number of bvecs the rq_for_each_bvec() loop actually yields. However, continue using blk_rq_nr_phys_segments() as an upper bound on the number of bvecs when allocating imu to avoid needing to iterate the bvecs a second time. Link: https://lore.kernel.org/io-uring/20251111191530.1268875-1-csander@purestorage.com/ Signed-off-by: Caleb Sander Mateos <csander@purestorage.com> Fixes: `27cb27b6d5` ("io_uring: add support for kernel registered bvecs") Reviewed-by: Ming Lei <ming.lei@redhat.com> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2025-11-12 08:25:33 -07:00
Pavel Begunkov	146eb58629	io_uring: fix regbuf vector size truncation There is a report of io_estimate_bvec_size() truncating the calculated number of segments that leads to corruption issues. Check it doesn't overflow "int"s used later. Rough but simple, can be improved on top. Cc: stable@vger.kernel.org Fixes: `9ef4cbbcb4` ("io_uring: add infra for importing vectored reg buffers") Reported-by: Google Big Sleep <big-sleep-vuln-reports+bigsleep-458654612@google.com> Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Reviewed-by: Günther Noack <gnoack@google.com> Tested-by: Günther Noack <gnoack@google.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2025-11-07 17:17:13 -07:00
David Wei	6ab39b392e	io_uring/rsrc: refactor io_{un}account_mem() to take {user,mm}_struct param Refactor io_{un}account_mem() to take user_struct and mm_struct directly, instead of accessing it from the ring ctx. Signed-off-by: David Wei <dw@davidwei.uk> Reviewed-by: Pavel Begunkov <asml.silence@gmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2025-11-06 16:23:21 -07:00
Jens Axboe	3615e3f794	io_uring/rsrc: use get/put_user() for integer copy It's just getting an integer from userspace, installing a file, then copying the output direct descriptor back. No need to use the full copy_to/from_user() for that. Signed-off-by: Jens Axboe <axboe@kernel.dk>	2025-11-03 11:02:54 -07:00
Caleb Sander Mateos	2f076a453f	io_uring/rsrc: respect submitter_task in io_register_clone_buffers() io_ring_ctx's enabled with IORING_SETUP_SINGLE_ISSUER are only allowed a single task submitting to the ctx. Although the documentation only mentions this restriction applying to io_uring_enter() syscalls, commit `d7cce96c44` ("io_uring: limit registration w/ SINGLE_ISSUER") extends it to io_uring_register(). Ensuring only one task interacts with the io_ring_ctx will be important to allow this task to avoid taking the uring_lock. There is, however, one gap in these checks: io_register_clone_buffers() may take the uring_lock on a second (source) io_ring_ctx, but __io_uring_register() only checks the current thread against the destination io_ring_ctx's submitter_task. Fail the IORING_REGISTER_CLONE_BUFFERS with -EEXIST if the source io_ring_ctx has a registered submitter_task other than the current task. Signed-off-by: Caleb Sander Mateos <csander@purestorage.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2025-09-08 13:21:24 -06:00
Caleb Sander Mateos	5d4c52bfa8	io_uring: don't include filetable.h in io_uring.h io_uring/io_uring.h doesn't use anything declared in io_uring/filetable.h, so drop the unnecessary #include. Add filetable.h includes in .c files previously relying on the transitive include from io_uring.h. Signed-off-by: Caleb Sander Mateos <csander@purestorage.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2025-09-08 13:20:46 -06:00
Pavel Begunkov	11fbada718	io_uring: export io_[un]account_mem Export pinned memory accounting helpers, they'll be used by zcrx shortly. Cc: stable@vger.kernel.org Fixes: `cf96310c5f` ("io_uring/zcrx: add io_zcrx_area") Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/9a61e54bd89289b39570ae02fe620e12487439e4.1752699568.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2025-07-16 16:23:28 -06:00
Jens Axboe	1bc8890264	Merge branch 'io_uring-6.16' into for-6.17/io_uring Merge in 6.16 io_uring fixes, to avoid clashes with pending net and settings changes. * io_uring-6.16: io_uring: gate REQ_F_ISREG on !S_ANON_INODE as well io_uring/kbuf: flag partial buffer mappings io_uring/net: mark iov as dynamically allocated even for single segments io_uring: fix resource leak in io_import_dmabuf() io_uring: don't assume uaddr alignment in io_vec_fill_bvec io_uring/rsrc: don't rely on user vaddr alignment io_uring/rsrc: fix folio unpinning io_uring: make fallocate be hashed work	2025-07-06 16:42:23 -06:00
Caleb Sander Mateos	daa01d954b	io_uring/rsrc: skip atomic refcount for uncloned buffers io_buffer_unmap() performs an atomic decrement of the io_mapped_ubuf's reference count in case it has been cloned into another io_ring_ctx's registered buffer table. This is an expensive operation and unnecessary in the common case that the io_mapped_ubuf is only registered once. Load the reference count first and check whether it's 1. In that case, skip the atomic decrement and immediately free the io_mapped_ubuf. Signed-off-by: Caleb Sander Mateos <csander@purestorage.com> Link: https://lore.kernel.org/r/20250619143435.3474028-1-csander@purestorage.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2025-07-02 17:11:23 -06:00
Pavel Begunkov	e1d7727b73	io_uring: don't assume uaddr alignment in io_vec_fill_bvec There is no guaranteed alignment for user pointers. Don't use mask trickery and adjust the offset by bv_offset. Cc: stable@vger.kernel.org Reported-by: David Hildenbrand <david@redhat.com> Fixes: `9ef4cbbcb4` ("io_uring: add infra for importing vectored reg buffers") Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/io-uring/19530391f5c361a026ac9b401ff8e123bde55d98.1750771718.git.asml.silence@gmail.com/ Signed-off-by: Jens Axboe <axboe@kernel.dk>	2025-06-24 20:51:08 -06:00
Pavel Begunkov	3a3c6d6157	io_uring/rsrc: don't rely on user vaddr alignment There is no guaranteed alignment for user pointers, however the calculation of an offset of the first page into a folio after coalescing uses some weird bit mask logic, get rid of it. Cc: stable@vger.kernel.org Reported-by: David Hildenbrand <david@redhat.com> Fixes: `a8edbb424b` ("io_uring/rsrc: enable multi-hugepage buffer coalescing") Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/io-uring/e387b4c78b33f231105a601d84eefd8301f57954.1750771718.git.asml.silence@gmail.com/ Signed-off-by: Jens Axboe <axboe@kernel.dk>	2025-06-24 20:50:59 -06:00
Pavel Begunkov	5afb4bf9fc	io_uring/rsrc: fix folio unpinning syzbot complains about an unmapping failure: [ 108.070381][ T14] kernel BUG at mm/gup.c:71! [ 108.070502][ T14] Internal error: Oops - BUG: 00000000f2000800 [#1] SMP [ 108.123672][ T14] Hardware name: QEMU KVM Virtual Machine, BIOS edk2-20250221-8.fc42 02/21/2025 [ 108.127458][ T14] Workqueue: iou_exit io_ring_exit_work [ 108.174205][ T14] Call trace: [ 108.175649][ T14] sanity_check_pinned_pages+0x7cc/0x7d0 (P) [ 108.178138][ T14] unpin_user_page+0x80/0x10c [ 108.180189][ T14] io_release_ubuf+0x84/0xf8 [ 108.182196][ T14] io_free_rsrc_node+0x250/0x57c [ 108.184345][ T14] io_rsrc_data_free+0x148/0x298 [ 108.186493][ T14] io_sqe_buffers_unregister+0x84/0xa0 [ 108.188991][ T14] io_ring_ctx_free+0x48/0x480 [ 108.191057][ T14] io_ring_exit_work+0x764/0x7d8 [ 108.193207][ T14] process_one_work+0x7e8/0x155c [ 108.195431][ T14] worker_thread+0x958/0xed8 [ 108.197561][ T14] kthread+0x5fc/0x75c [ 108.199362][ T14] ret_from_fork+0x10/0x20 We can pin a tail page of a folio, but then io_uring will try to unpin the head page of the folio. While it should be fine in terms of keeping the page actually alive, mm folks say it's wrong and triggers a debug warning. Use unpin_user_folio() instead of unpin_user_page*. Cc: stable@vger.kernel.org Debugged-by: David Hildenbrand <david@redhat.com> Reported-by: syzbot+1d335893772467199ab6@syzkaller.appspotmail.com Closes: https://lkml.kernel.org/r/683f1551.050a0220.55ceb.0017.GAE@google.com Fixes: `a8edbb424b` ("io_uring/rsrc: enable multi-hugepage buffer coalescing") Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/io-uring/a28b0f87339ac2acf14a645dad1e95bbcbf18acd.1750771718.git.asml.silence@gmail.com/ [axboe: adapt to current tree, massage commit message] Signed-off-by: Jens Axboe <axboe@kernel.dk>	2025-06-24 20:49:39 -06:00
Penglei Jiang	e1c75831f6	io_uring: fix potential page leak in io_sqe_buffer_register() If allocation of the 'imu' fails, then the existing pages aren't unpinned in the error path. This is mostly a theoretical issue, requiring fault injection to hit. Move unpin_user_pages() to unified error handling to fix the page leak issue. Fixes: `d8c2237d0a` ("io_uring: add io_pin_pages() helper") Signed-off-by: Penglei Jiang <superman.xpt@gmail.com> Link: https://lore.kernel.org/r/20250617165644.79165-1-superman.xpt@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2025-06-18 05:09:46 -06:00
Jens Axboe	1d27f11bf0	io_uring/rsrc: validate buffer count with offset for cloning syzbot reports that it can trigger a WARN_ON() for kmalloc() attempt that's too big: WARNING: CPU: 0 PID: 6488 at mm/slub.c:5024 __kvmalloc_node_noprof+0x520/0x640 mm/slub.c:5024 Modules linked in: CPU: 0 UID: 0 PID: 6488 Comm: syz-executor312 Not tainted 6.15.0-rc7-syzkaller-gd7fa1af5b33e #0 PREEMPT Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 05/07/2025 pstate: 20400005 (nzCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--) pc : __kvmalloc_node_noprof+0x520/0x640 mm/slub.c:5024 lr : __do_kmalloc_node mm/slub.c:-1 [inline] lr : __kvmalloc_node_noprof+0x3b4/0x640 mm/slub.c:5012 sp : ffff80009cfd7a90 x29: ffff80009cfd7ac0 x28: ffff0000dd52a120 x27: 0000000000412dc0 x26: 0000000000000178 x25: ffff7000139faf70 x24: 0000000000000000 x23: ffff800082f4cea8 x22: 00000000ffffffff x21: 000000010cd004a8 x20: ffff0000d75816c0 x19: ffff0000dd52a000 x18: 00000000ffffffff x17: ffff800092f39000 x16: ffff80008adbe9e4 x15: 0000000000000005 x14: 1ffff000139faf1c x13: 0000000000000000 x12: 0000000000000000 x11: ffff7000139faf21 x10: 0000000000000003 x9 : ffff80008f27b938 x8 : 0000000000000002 x7 : 0000000000000000 x6 : 0000000000000000 x5 : 00000000ffffffff x4 : 0000000000400dc0 x3 : 0000000200000000 x2 : 000000010cd004a8 x1 : ffff80008b3ebc40 x0 : 0000000000000001 Call trace: __kvmalloc_node_noprof+0x520/0x640 mm/slub.c:5024 (P) kvmalloc_array_node_noprof include/linux/slab.h:1065 [inline] io_rsrc_data_alloc io_uring/rsrc.c:206 [inline] io_clone_buffers io_uring/rsrc.c:1178 [inline] io_register_clone_buffers+0x484/0xa14 io_uring/rsrc.c:1287 __io_uring_register io_uring/register.c:815 [inline] __do_sys_io_uring_register io_uring/register.c:926 [inline] __se_sys_io_uring_register io_uring/register.c:903 [inline] __arm64_sys_io_uring_register+0x42c/0xea8 io_uring/register.c:903 __invoke_syscall arch/arm64/kernel/syscall.c:35 [inline] invoke_syscall+0x98/0x2b8 arch/arm64/kernel/syscall.c:49 el0_svc_common+0x130/0x23c arch/arm64/kernel/syscall.c:132 do_el0_svc+0x48/0x58 arch/arm64/kernel/syscall.c:151 el0_svc+0x58/0x17c arch/arm64/kernel/entry-common.c:767 el0t_64_sync_handler+0x78/0x108 arch/arm64/kernel/entry-common.c:786 el0t_64_sync+0x198/0x19c arch/arm64/kernel/entry.S:600 which is due to offset + buffer_count being too large. The registration code checks only the total count of buffers, but given that the indexing is an array, it should also check offset + count. That can't exceed IORING_MAX_REG_BUFFERS either, as there's no way to reach buffers beyond that limit. There's no issue with registrering a table this large, outside of the fact that it's pointless to register buffers that cannot be reached, and that it can trigger this kmalloc() warning for attempting an allocation that is too large. Cc: stable@vger.kernel.org Fixes: `b16e920a19` ("io_uring/rsrc: allow cloning at an offset") Reported-by: syzbot+cb4bf3cb653be0d25de8@syzkaller.appspotmail.com Link: https://lore.kernel.org/io-uring/684e77bd.a00a0220.279073.0029.GAE@google.com/ Signed-off-by: Jens Axboe <axboe@kernel.dk>	2025-06-15 08:09:14 -06:00
Jens Axboe	8bb9d6ccd3	io_uring: finish IOU_OK -> IOU_COMPLETE transition IOU_COMPLETE is more descriptive, in that it explicitly says that the return value means "please post a completion for this request". This patch completes the transition from IOU_OK to IOU_COMPLETE, replacing existing IOU_OK users. This is a purely mechanical change. Signed-off-by: Jens Axboe <axboe@kernel.dk>	2025-05-21 08:41:16 -06:00
Pavel Begunkov	d760d3f59f	io_uring/zcrx: improve area validation dmabuf backed area will be taking an offset instead of addresses, and io_buffer_validate() is not flexible enough to facilitate it. It also takes an iovec, which may truncate the u64 length zcrx takes. Add a new helper function for validation. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/0b3b735391a0a8f8971bf0121c19765131fddd3b.1746097431.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2025-05-02 09:24:42 -06:00
Pavel Begunkov	be6bad57b2	io_uring/rsrc: remove null check on import WARN_ON_ONCE() checking imu for NULL in io_import_fixed() is in the hot path and too protective, it's time to get rid of it. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/3782c01e311dbd474bca45aefaf79a2f2822fafb.1745083025.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2025-04-21 05:10:04 -06:00
Pavel Begunkov	9cebcf7b0c	io_uring/rsrc: clean up io_coalesce_buffer() We don't need special handling for the first page in io_coalesce_buffer(), move it inside the loop. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Reviewed-by: Anuj Gupta <anuj20.g@samsung.com> Link: https://lore.kernel.org/r/ad698cddc1eadb3d92a7515e95bb13f79420323d.1745083025.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2025-04-21 05:10:04 -06:00
Pavel Begunkov	ea76925614	io_uring/rsrc: use unpin_user_folio We want to have a full folio to be left pinned but with only one reference, for that we "unpin" all but the first page with unpin_user_pages(), which can be confusing. There is a new helper to achieve that called unpin_user_folio(), so use that. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Reviewed-by: Anuj Gupta <anuj20.g@samsung.com> Link: https://lore.kernel.org/r/e0b2be8f9ea68f6b351ec3bb046f04f437f68491.1745083025.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2025-04-21 05:10:04 -06:00
Jens Axboe	8a2dacd49f	io_uring/rsrc: remove node assignment helpers There are two helpers here, one assigns and increments the node ref count, and the other is simply a wrapper around that for the buffer node handling. The buffer node assignment benefits from checking and setting REQ_F_BUF_NODE together, otherwise stalls have been observed on setting that flag later in the process. Hence re-do it so that it's set when checked, and cleared in case of (unlikely) failure. With that, the buffer node helper can go, and then drop the generic io_req_assign_rsrc_node() helper as well as there's only a single user of it left at that point. Signed-off-by: Jens Axboe <axboe@kernel.dk>	2025-04-21 05:06:58 -06:00
Jens Axboe	b419bed4f0	io_uring/rsrc: ensure segments counts are correct on kbuf buffers kbuf imports have the front offset adjusted and segments removed, but the tail segments are still included in the segment count that gets passed in the iov_iter. As the segments aren't necessarily all the same size, move importing to a separate helper and iterate the mapped length to get an exact count. Reviewed-by: Nitesh Shetty <nj.shetty@samsung.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2025-04-17 11:59:12 -06:00
Nitesh Shetty	80c7378f94	io_uring/rsrc: send exact nr_segs for fixed buffer Sending exact nr_segs, avoids bio split check and processing in block layer, which takes around 5%[1] of overall CPU utilization. In our setup, we see overall improvement of IOPS from 7.15M to 7.65M [2] and 5% less CPU utilization. [1] 3.52% io_uring [kernel.kallsyms] [k] bio_split_rw_at 1.42% io_uring [kernel.kallsyms] [k] bio_split_rw 0.62% io_uring [kernel.kallsyms] [k] bio_submit_split [2] sudo taskset -c 0,1 ./t/io_uring -b512 -d128 -c32 -s32 -p1 -F1 -B1 -n2 -r4 /dev/nvme0n1 /dev/nvme1n1 Signed-off-by: Nitesh Shetty <nj.shetty@samsung.com> [Pavel: fixed for kbuf, rebased and reworked on top of cleanups] Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/7a1a49a8d053bd617c244291d63dbfbc07afde36.1744882081.git.asml.silence@gmail.com [axboe: fold in fix factoring in buf reg offset] Signed-off-by: Jens Axboe <axboe@kernel.dk>	2025-04-17 07:42:14 -06:00
Pavel Begunkov	59852ebad9	io_uring/rsrc: refactor io_import_fixed io_import_fixed is a mess. Even though we know the final len of the iterator, we still assign offset + len and do some magic after to correct for that. Do offset calculation first and finalise it with iov_iter_bvec at the end. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/2d5107fed24f8b23245ef2ede9a5a7f7c426df61.1744882081.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2025-04-17 06:21:13 -06:00
Pavel Begunkov	50169d0754	io_uring/rsrc: separate kbuf offset adjustments Kernel registered buffers are special because segments are not uniform in size, and we have a bunch of optimisations based on that uniformity for normal buffers. Handle kbuf separately, it'll be cleaner this way. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/4e9e5990b0ab5aee723c0be5cd9b5bcf810375f9.1744882081.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2025-04-17 06:21:13 -06:00
Pavel Begunkov	1ac5712888	io_uring/rsrc: don't skip offset calculation Don't optimise for requests with offset=0. Large registered buffers are the preference and hence the user is likely to pass an offset, and the adjustments are not expensive and will be made even cheaper in following patches. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/1c2beb20470ee3c886a363d4d8340d3790db19f3.1744882081.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2025-04-17 06:21:13 -06:00
Pavel Begunkov	ab6005f391	io_uring: don't post tag CQEs on file/buffer registration failure Buffer / file table registration is all or nothing, if it fails all resources we might have partially registered are dropped and the table is killed. If that happens, it doesn't make sense to post any rsrc tag CQEs. That would be confusing to the application, which should not need to handle that case. Cc: stable@vger.kernel.org Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Fixes: `7029acd8a9` ("io_uring/rsrc: get rid of per-ring io_rsrc_node list") Link: https://lore.kernel.org/r/c514446a8dcb0197cddd5d4ba8f6511da081cf1f.1743777957.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2025-04-04 09:40:44 -06:00
Ming Lei	1045afae4b	io_uring: support vectored kernel fixed buffer io_uring has supported fixed kernel buffer via io_buffer_register_bvec() and io_buffer_unregister_bvec(). The vectored fixed buffer has been ready, so it is natural to support fixed kernel buffer, one use case is ublk. Signed-off-by: Ming Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/20250325135155.935398-4-ming.lei@redhat.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2025-04-02 07:06:59 -06:00
Ming Lei	8622b20f23	io_uring: add validate_fixed_range() for validate fixed buffer Add helper of validate_fixed_range() for validating fixed buffer range. Signed-off-by: Ming Lei <ming.lei@redhat.com> Reviewed-by: Caleb Sander Mateos <csander@purestorage.com> Link: https://lore.kernel.org/r/20250325135155.935398-2-ming.lei@redhat.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2025-04-02 07:06:43 -06:00

1 2 3 4

196 Commits