linux

mirror of https://github.com/torvalds/linux.git synced 2026-07-27 01:32:21 +02:00

Author	SHA1	Message	Date
Bryam Vargas	fda6a1f3c3	scsi: target: core: Fix iSCSI ISID use-after-free in REGISTER AND MOVE core_scsi3_emulate_pro_register_and_move() maps the PERSISTENT RESERVE OUT parameter list with transport_kmap_data_sg() and parses the destination TransportID with target_parse_pr_out_transport_id(). For an iSCSI TransportID (FORMAT CODE 01b), iscsi_parse_pr_out_transport_id() returns the ISID in iport_ptr as a raw pointer into that mapped buffer. The function then unmaps the buffer with transport_kunmap_data_sg() before dereferencing iport_ptr in strcmp(), __core_scsi3_locate_pr_reg() and core_scsi3_alloc_registration(). When the parameter list spans more than one page (PARAMETER LIST LENGTH > 4096), transport_kmap_data_sg() uses vmap() and transport_kunmap_data_sg() does vunmap(), so the kernel virtual address backing iport_ptr is torn down and every subsequent dereference is a use-after-free read of the unmapped region. Keep the parameter list mapped until iport_ptr is no longer needed: drop the early transport_kunmap_data_sg() and unmap once on the success path, right before returning. The error paths already unmap through the existing "if (buf) transport_kunmap_data_sg(cmd)" at the out: label, which now runs on every post-map error exit because buf is no longer cleared early. Only reads of the mapping happen while spinlocks are held; the map and unmap calls remain outside any lock. The sibling caller core_scsi3_decode_spec_i_port() already uses the buffer before unmapping it and is left unchanged. Fixes: `4949314c72` ("target: Allow control CDBs with data > 1 page") Cc: stable@vger.kernel.org Signed-off-by: Bryam Vargas <hexlabsecurity@proton.me> Reviewed-by: John Garry <john.g.garry@oracle.com> Reviewed-by: David Disseldorp <ddiss@suse.de> Link: https://patch.msgid.link/20260610042245.35473-1-hexlabsecurity@proton.me Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2026-07-12 22:21:22 -04:00
Bryam Vargas	d04a179085	scsi: target: Bound PR-OUT TransportID parsing to the received buffer core_scsi3_decode_spec_i_port() and core_scsi3_emulate_register_and_move() hand the raw PERSISTENT RESERVE OUT parameter buffer to target_parse_pr_out_transport_id() without telling it how many bytes are valid. For an iSCSI TransportID (FORMAT CODE 01b), iscsi_parse_pr_out_transport_id() locates the ",i,0x" ISID separator with an unbounded strstr() (and on the error path prints the name with a further unbounded "%s"). An initiator can submit a TransportID whose iSCSI name contains neither a ",i,0x" substring nor a NUL terminator, filling the parameter list to its end, so the scan runs off the end of the buffer. When the parameter list spans more than one page the buffer is a multi-page vmap (transport_kmap_data_sg()), so the over-read walks into the trailing vmalloc guard page and oopses (KASAN: vmalloc-out-of-bounds in strstr). It is reachable by any fabric that delivers a PR OUT to a device exported through an iSCSI TPG, including a guest via vhost-scsi. Pass the number of received bytes down to the parser and validate the iSCSI TransportID's own self-described length (ADDITIONAL LENGTH + 4) once, up front: reject it if it is below the spc4r17 minimum or larger than the received buffer, then bound the separator search, the ISID walk and the name copy by that length. This is the length check the callers already perform after the parse (core_scsi3_decode_spec_i_port() compares tid_len against tpdl, core_scsi3_emulate_register_and_move() validates it against data_length), moved ahead of the scan. Also drop the unbounded "%s" of the unterminated name. Add per-format explicit name-length checks before copying into i_str, rather than silently truncating with min_t: for FORMAT CODE 00b reject if the descriptor body (tid_len - 4 bytes) cannot fit in i_str[TRANSPORT_IQN_LEN]; for FORMAT CODE 01b reject if the name portion (from &buf[4] up to the separator) cannot fit. Both checks make the bounds intent explicit at each format branch. While here, also reject a FORMAT CODE 01b TransportID whose ",i,0x" separator sits at the very end of the descriptor: that leaves an empty ISID and points the returned port nexus pointer at buf + tid_len, one past the descriptor, which the registration code (__core_scsi3_locate_pr_reg(), __core_scsi3_alloc_registration()) then dereferences as the ISID string -- the same over-read of the parameter buffer for a malformed descriptor. Fixes: `c66ac9db8d` ("[SCSI] target: Add LIO target core v4.0.0-rc6") Cc: stable@vger.kernel.org Signed-off-by: Bryam Vargas <hexlabsecurity@proton.me> Reviewed-by: John Garry <john.g.garry@oracle.com> Reviewed-by: David Disseldorp <ddiss@suse.de> Link: https://patch.msgid.link/20260611-b4-disp-9f20739e-v6-1-f6630e2aae44@proton.me Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2026-07-12 22:21:22 -04:00
Martin K. Petersen	4f87e9068b	Merge branch 7.1/scsi-fixes into 7.2/scsi-staging Pull in outstanding commits from 7.1 branch. Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2026-06-15 21:01:30 -04:00
Mike Christie	7c08d43083	scsi: target: Remove tcm_loop target reset handling tcm_loop_target_reset is supposed to handle all the LUNs on a target but it's only doing a TMR_LUN_RESET so only that one LUN is handled. This will cause us to return early while IOs to other LUNs are still hung in lower layers. This just removes the target reset handler for the driver because LIO doesn't support target resets and for the common case where this is run from the scsi-ml error hamdler we have already tried an abort and lun reset so waiting again is most likely useless. Fixes: `1333eee56c` ("scsi: target: tcm_loop: Drain commands in target_reset handler") Signed-off-by: Mike Christie <michael.christie@oracle.com> Reviewed-by: Hannes Reinecke <hare@kernel.org> Link: https://patch.msgid.link/20260530052349.5134-1-michael.christie@oracle.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2026-06-08 17:50:48 -04:00
David Disseldorp	cf14fc2be8	scsi: target: Use constant-time crypto_memneq() for CHAP digests A constant-time memory comparison is more suitable than plain memcmp() for authentication digest comparison. CHAP digests use an authenticator-provided random challenge, so any timing side-channel shouldn't be easily exploitable. Reported-by: Sashiko (gemini/gemini-3.1-pro-preview) Link: https://sashiko.dev/#/patchset/20260521151121.808477-1-hossu.alexandru%40gmail.com Signed-off-by: David Disseldorp <ddiss@suse.de> Reviewed-by: Lee Duncan <lduncan@suse.com> Link: https://patch.msgid.link/20260605122019.24146-3-ddiss@suse.de Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2026-06-08 17:20:33 -04:00
David Disseldorp	7e161211f1	scsi: target: Fix hexadecimal CHAP_I handling A mutual CHAP handshake requires target processing of an initiator-sent CHAP_I identifier. The RFC 3720 specification states: 11.1.4. Challenge Handshake Authentication Protocol (CHAP) ... CHAP_A=<A> CHAP_I=<I> CHAP_C=<C> ... Where N, (A,A1,A2), I, C, and R are (correspondingly) the Name, Algorithm, Identifier, Challenge, and Response as defined in [RFC1994], N is a text string, A,A1,A2, and I are numbers CHAP_I parsing currently calls extract_param(), which returns the @identifier string (stripped of any 0b/0B or 0x/0X prefix) and a @type which indicates DECIMAL, HEX, or BASE64 encoding (based on any stripped prefix). Any HEX encoded CHAP_I string is further processed via: ret = kstrtoul(&identifier[2], 0, &id); This is incorrect for two reasons: * The @identifier string has already been stripped of the 0x/0X prefix, so skipping the first two bytes omits part of the number. * The kstrtoul() call specifies a base of 0, which will see &identifier[2] parsed as a decimal, unless a '0x' or (octal) '0' is erroneously present at that offset. Fix this by passing the (zero-offset) identifier string to kstrtoul() along with a base=16 parameter. Also add an explicit error handler for BASE64 encoding. Hex-encoded CHAP_I handling can be testing using the libiscsi EncodedI test linked below. Reported-by: Sashiko (gemini/gemini-3.1-pro-preview) Link: https://sashiko.dev/#/patchset/20260521151121.808477-1-hossu.alexandru%40gmail.com Link: https://github.com/sahlberg/libiscsi/pull/473 Fixes: `85db739131` ("scsi: target: iscsi: Validate CHAP_R length before base64 decode") Signed-off-by: David Disseldorp <ddiss@suse.de> Reviewed-by: Lee Duncan <lduncan@suse.com> Reviewed-by: John Garry <john.g.garry@oracle.com> Link: https://patch.msgid.link/20260605122019.24146-2-ddiss@suse.de Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2026-06-08 17:20:33 -04:00
Alexandru Hossu	85db739131	scsi: target: iscsi: Validate CHAP_R length before base64 decode chap_server_compute_hash() allocates client_digest as kzalloc(chap->digest_size) and then, for BASE64-encoded responses, passes chap_r directly to chap_base64_decode() without checking whether the input length could produce more than digest_size bytes of output. chap_base64_decode() writes to the destination unconditionally as long as there is input to consume. With MAX_RESPONSE_LENGTH set to 128 and the "0b" prefix stripped by extract_param(), up to 127 base64 characters can reach the decoder. 127 characters decode to 95 bytes. For SHA-256 (digest_size=32) this overflows client_digest by 63 bytes; for MD5 (digest_size=16) the overflow is 79 bytes. The length check at line 344 fires after the write has already happened. The HEX branch in the same switch statement already validates the length up front. Apply the same approach to the BASE64 branch: strip trailing base64 padding characters, then reject any input whose data length exceeds DIV_ROUND_UP(digest_size * 4, 3) before calling the decoder. Stripping trailing '=' before the comparison handles both padded and unpadded encodings. chap_base64_decode() already returns early on '=', so the full original string is still passed to the decoder unchanged. The mutual CHAP path decodes CHAP_C into initiatorchg_binhex, which is kzalloc(CHAP_CHALLENGE_STR_LEN). extract_param() caps initiatorchg at CHAP_CHALLENGE_STR_LEN characters, so at most CHAP_CHALLENGE_STR_LEN-1 base64 characters reach the decoder. The maximum decoded size, DIV_ROUND_UP((CHAP_CHALLENGE_STR_LEN-1) * 3, 4), is less than CHAP_CHALLENGE_STR_LEN, so no overflow is possible there. A comment is added at the call site to document this. Fixes: `1e57338834` ("scsi: target: iscsi: Support base64 in CHAP") Cc: stable@vger.kernel.org Signed-off-by: Alexandru Hossu <hossu.alexandru@gmail.com> Reviewed-by: David Disseldorp <ddiss@suse.de> Link: https://patch.msgid.link/20260521151121.808477-1-hossu.alexandru@gmail.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2026-05-22 23:06:00 -04:00
Michael Bommarito	bf33e01f88	scsi: target: iscsi: Bound iscsi_encode_text_output() appends to rsp_buf iscsi_encode_text_output() concatenates "key=value\0" records into login->rsp_buf, an 8192-byte kzalloc(MAX_KEY_VALUE_PAIRS) buffer allocated in iscsit_alloc_login_setup_buffer(). The three sprintf() call sites in this function (lines 1398, 1411, 1424 in v7.1-rc2) never check the remaining buffer capacity: length += sprintf(output_buf, "%s=%s", er->key, er->value); length += 1; output_buf = textbuf + length; The 8192-byte ceiling at iscsi_target_check_login_request() bounds the input* Login PDU payload, but a single PDU can carry up to 2048 minimal four-byte "a=b\0" pairs, each unknown key expanding to a 16-byte "a=NotUnderstood\0" output record via iscsi_add_notunderstood_response(). 2048 * 16 = 32 KiB of output into an 8 KiB buffer, producing a ~24 KiB heap overrun in the kmalloc-8k slab. The fix introduces a static iscsi_encode_text_record() helper that uses snprintf() with a per-call bounds check against the remaining buffer, and threads a u32 textbuf_size parameter through iscsi_encode_text_output(). Both call sites in iscsi_target_handle_csg_zero() (PHASE_SECURITY) and iscsi_target_handle_csg_one() (PHASE_OPERATIONAL) pass MAX_KEY_VALUE_PAIRS. On overflow the encoder logs the condition, calls iscsi_release_extra_responses() to drop queued records, and returns -1; both caller sites now emit ISCSI_STATUS_CLS_INITIATOR_ERR / ISCSI_LOGIN_STATUS_INIT_ERR via iscsit_tx_login_rsp() before returning, so the initiator sees an explicit failed-login response rather than a silent connection drop. (Prior to this patch only the PHASE_OPERATIONAL caller did that; the PHASE_SECURITY caller is converted to the same shape.) Fixes: `e48354ce07` ("iscsi-target: Add iSCSI fabric support for target v4.1") Cc: stable@vger.kernel.org Assisted-by: Claude:claude-opus-4-7 Signed-off-by: Michael Bommarito <michael.bommarito@gmail.com> Tested-by: John Garry <john.g.garry@oracle.com> Reviewed-by: John Garry <john.g.garry@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2026-05-22 23:04:36 -04:00
Michael Bommarito	778c2ab142	scsi: target: iscsi: Fix CRC overread and double-free in iscsit_handle_text_cmd() Two latent bugs in the Text-phase handler, both present since the original LIO integration in commit `e48354ce07` ("iscsi-target: Add iSCSI fabric support for target v4.1"): 1) DataDigest CRC buffer overread (4 bytes past text_in). text_in is kzalloc()'d at ALIGN(payload_length, 4). rx_size is then incremented by ISCSI_CRC_LEN to make room for the received DataDigest in the iovec, but the same (now-bumped) rx_size is passed as the buffer length to iscsit_crc_buf(): if (conn->conn_ops->DataDigest) { ... rx_size += ISCSI_CRC_LEN; } ... if (conn->conn_ops->DataDigest) { data_crc = iscsit_crc_buf(text_in, rx_size, 0, NULL); iscsit_crc_buf() walks rx_size bytes of text_in with crc32c(), so when DataDigest is negotiated it reads 4 bytes past the end of the text_in allocation. KASAN reproduces this directly on the unpatched mainline tree as slab-out-of-bounds in crc32c() called from the Text PDU path. The OOB bytes feed crc32c() and are then compared against the initiator-supplied checksum, so the value does not flow back to the attacker, but the kernel does read past the buffer on every Text PDU with DataDigest=CRC32C. Fix by passing the actual padded payload length (ALIGN(payload_length, 4)) that was used for the kzalloc(). 2) Stale cmd->text_in_ptr re-free (double-free) on ERL>0 bad DataDigest drop. On DataDigest mismatch with ErrorRecoveryLevel > 0 the handler silently drops the PDU and lets the initiator plug the CmdSN gap: kfree(text_in); return 0; cmd->text_in_ptr still points at the freed buffer. The next Text Request on the same ITT re-enters iscsit_setup_text_cmd(), which unconditionally does kfree(cmd->text_in_ptr); cmd->text_in_ptr = NULL; freeing the same pointer a second time. Session teardown via iscsit_release_cmd() has the same shape and hits the same double-free if the connection is dropped before a second Text Request arrives. On an unmodified mainline tree the bug-1 CRC overread fires first on the initial valid Text Request and perturbs the subsequent state, so #4 was isolated by building a kernel with only the bug-1 hunk of this patch applied plus temporary printk() observability around the three relevant kfree() sites. The observability prints are not part of this patch. On that build, a three-PDU Text Request sequence after login produces two back-to-back splats: BUG: KASAN: double-free in iscsit_setup_text_cmd+0x?? BUG: KASAN: double-free in iscsit_release_cmd+0x?? showing the same pointer freed in the ERL>0 drop path and again in iscsit_setup_text_cmd() (next Text Request on the same ITT) and once more in iscsit_release_cmd() (session teardown). On distro kernels with CONFIG_SLAB_FREELIST_HARDENED=y (default) the double-free becomes a remote kernel BUG(); on non-hardened kernels it corrupts the slab freelist. Fix by clearing cmd->text_in_ptr after the kfree() in the ERL>0 drop path. With both hunks applied #4 is directly observable on the stock tree without observability printks; fixing bug-1 alone would mask #4 less, not more, so the hunks are submitted together. Both fixes are one-liners. The Text PDU state machine is unchanged and the wire protocol is unaffected. Fixes: `e48354ce07` ("iscsi-target: Add iSCSI fabric support for target v4.1") Cc: stable@vger.kernel.org Assisted-by: Claude:claude-opus-4-7 Signed-off-by: Michael Bommarito <michael.bommarito@gmail.com> Tested-by: John Garry <john.g.garry@oracle.com> Reviewed-by: John Garry <john.g.garry@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2026-05-22 22:24:16 -04:00
Guixin Liu	b71cb088b2	scsi: target: tcm_loop: Fix NULL ptr dereference The TCM_LOOP LUN creation process calls device_register() to create the device, which in turn invokes tcm_loop_driver_probe() registered with the TCM_LOOP bus to create and register the scsi_host. However, if the scsi_host memory allocation fails or scsi_add_host() fails, the device_register() process still returns success. Subsequently, when the user binds the LUN to a specific backend device, it accesses the NULL or freed scsi_host. Crash Call Trace: RIP: 0010:scsi_is_host_device+0x7/0x20 scsi_alloc_target+0x32/0x2c0 __scsi_add_device+0x41/0xf0 scsi_add_device+0xd/0x30 tcm_loop_port_link+0x25/0x50 [tcm_loop] target_fabric_port_link+0x9c/0xb0 [target_core_mod] ... This issue is fixed by: 1. Setting the tcm_loop_hba's scsi_host to NULL, if scsi_add_host() fails. 2. Checking the tcm_loop_hba's scsi_host after device_register(). 3. Checking the tcm_loop_hba's scsi_host in tcm_loop_driver_remove(). Fixes: `3703b2c5d0` ("[SCSI] tcm_loop: Add multi-fabric Linux/SCSI LLD fabric module") Signed-off-by: Guixin Liu <kanie@linux.alibaba.com> Reviewed-by: Mike Christie <michael.christie@oracle.com> Link: https://patch.msgid.link/20260424013923.25998-1-kanie@linux.alibaba.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2026-05-14 22:21:01 -04:00
Martin K. Petersen	98f69975d4	Merge branch '7.1/scsi-queue' into 7.1/scsi-fixes Pull in remaining commits from 7.1/scsi-queue. Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2026-04-26 21:15:04 -04:00
Carlos Bilbao	2f3835771d	scsi: target: iscsi: reject invalid size Extended CDB AHS If ecdb_ahdr->ahslength is zero, two bugs follow: kmalloc(be16_to_cpu(ecdb_ahdr->ahslength) + 15, ...) allocates 15 bytes, but the immediately following memcpy writes ISCSI_CDB_SIZE (16) bytes into it, a one-byte heap overflow. Also: memcpy(cdb + ISCSI_CDB_SIZE, ecdb_ahdr->ecdb, be16_to_cpu(ecdb_ahdr->ahslength) - 1); (u16)0 - 1 promotes to (int)-1 which converts to SIZE_MAX as size_t, causing a massive out-of-bounds write. Reject ahslength == 0 with ISCSI_REASON_PROTOCOL_ERROR before the kmalloc. Also reject ahslength values that exceed the actual AHS buffer advertised. Fixes: `8f1f7d297b` ("scsi: target: iscsi: Add support for extended CDB AHS") Signed-off-by: Carlos Bilbao <carlos.bilbao@kernel.org> Reviewed-by: Dmitry Bogdanov <d.bogdanov@yadro.com> Link: https://patch.msgid.link/20260415040728.187680-1-carlos.bilbao@kernel.org Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2026-04-21 21:08:25 -04:00
Linus Torvalds	a85d6ff994	SCSI misc on 20260421 Usual driver updates (ufs, lpfc, fnic, target, mpi3mr). The substantive core changes are adding a 'serial' sysfs attribute and getting sd to support > PAGE_SIZE sectors. Signed-off-by: James E.J. Bottomley <James.Bottomley@HansenPartnership.com> -----BEGIN PGP SIGNATURE----- iLgEABMIAGAWIQTnYEDbdso9F2cI+arnQslM7pishQUCaeeTSRsUgAAAAAAEAA5t YW51MiwyLjUrMS4xMiwyLDImHGphbWVzLmJvdHRvbWxleUBoYW5zZW5wYXJ0bmVy c2hpcC5jb20ACgkQ50LJTO6YrIXxbAEA+LYW6sZ4JHAKRAlU1kPxlPFP18eBjcHa HqWm8/sllyoBAIBt6ldCACW8sP0bLCDTpLz21XZqJGHm7bvR8xkyca0Q =oglt -----END PGP SIGNATURE----- Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull SCSI updates from James Bottomley: "Usual driver updates (ufs, lpfc, fnic, target, mpi3mr). The substantive core changes are adding a 'serial' sysfs attribute and getting sd to support > PAGE_SIZE sectors" * tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (98 commits) scsi: target: Don't validate ignored fields in PROUT PREEMPT scsi: qla2xxx: Use nr_cpu_ids instead of NR_CPUS for qp_cpu_map allocation scsi: ufs: core: Disable timestamp for Kioxia THGJFJT0E25BAIP scsi: mpi3mr: Fix typo scsi: sd: fix missing put_disk() when device_add(&disk_dev) fails scsi: libsas: Delete unused to_dom_device() and to_dev_attr() scsi: storvsc: Handle PERSISTENT_RESERVE_IN truncation for Hyper-V vFC scsi: iscsi_tcp: Remove unneeded selections of CRYPTO and CRYPTO_MD5 scsi: lpfc: Update lpfc version to 15.0.0.0 scsi: lpfc: Add PCI ID support for LPe42100 series adapters scsi: lpfc: Introduce 128G link speed selection and support scsi: lpfc: Check ASIC_ID register to aid diagnostics during failed fw updates scsi: lpfc: Update construction of SGL when XPSGL is enabled scsi: lpfc: Remove deprecated PBDE feature scsi: lpfc: Add REG_VFI mailbox cmd error handling scsi: lpfc: Log MCQE contents for mbox commands with no context scsi: lpfc: Select mailbox rq_create cmd version based on SLI4 if_type scsi: lpfc: Break out of IRQ affinity assignment when mask reaches nr_cpu_ids scsi: ufs: core: Make the header files self-contained scsi: ufs: core: Remove an include directive from ufshcd-crypto.h ...	2026-04-21 08:22:18 -07:00
Linus Torvalds	334fbe734e	mm.git review status for linus..mm-stable Everything: Total patches: 368 Reviews/patch: 1.56 Reviewed rate: 74% Excluding DAMON: Total patches: 316 Reviews/patch: 1.77 Reviewed rate: 81% Excluding DAMON and zram: Total patches: 306 Reviews/patch: 1.81 Reviewed rate: 82% Excluding DAMON, zram and maple_tree: Total patches: 276 Reviews/patch: 2.01 Reviewed rate: 91% Significant patch series in this merge: - The 30 patch series "maple_tree: Replace big node with maple copy" from Liam Howlett is mainly prepararatory work for ongoing development but it does reduce stack usage and is an improvement. - The 12 patch series "mm, swap: swap table phase III: remove swap_map" from Kairui Song offers memory savings by removing the static swap_map. It also yields some CPU savings and implements several cleanups. - The 2 patch series "mm: memfd_luo: preserve file seals" from Pratyush Yadav adds file seal preservation to LUO's memfd code. - The 2 patch series "mm: zswap: add per-memcg stat for incompressible pages" from Jiayuan Chen adds additional userspace stats reportng to zswap. - The 4 patch series "arch, mm: consolidate empty_zero_page" from Mike Rapoport implements some cleanups for our handling of ZERO_PAGE() and zero_pfn. - The 2 patch series "mm/kmemleak: Improve scan_should_stop() implementation" from Zhongqiu Han provides an robustness improvement and some cleanups in the kmemleak code. - The 4 patch series "Improve khugepaged scan logic" from Vernon Yang "improves the khugepaged scan logic and reduces CPU consumption by prioritizing scanning tasks that access memory frequently". - The 2 patch series "Make KHO Stateless" from Jason Miu simplifies Kexec Handover by "transitioning KHO from an xarray-based metadata tracking system with serialization to a radix tree data structure that can be passed directly to the next kernel" - The 3 patch series "mm: vmscan: add PID and cgroup ID to vmscan tracepoints" from Thomas Ballasi and Steven Rostedt enhances vmscan's tracepointing. - The 5 patch series "mm: arch/shstk: Common shadow stack mapping helper and VM_NOHUGEPAGE" from Catalin Marinas is a cleanup for the shadow stack code: remove per-arch code in favour of a generic implementation. - The 2 patch series "Fix KASAN support for KHO restored vmalloc regions" from Pasha Tatashin fixes a WARN() which can be emitted the KHO restores a vmalloc area. - The 4 patch series "mm: Remove stray references to pagevec" from Tal Zussman provides several cleanups, mainly udpating references to "struct pagevec", which became folio_batch three years ago. - The 17 patch series "mm: Eliminate fake head pages from vmemmap optimization" from Kiryl Shutsemau simplifies the HugeTLB vmemmap optimization (HVO) by changing how tail pages encode their relationship to the head page. - The 2 patch series "mm/damon/core: improve DAMOS quota efficiency for core layer filters" from SeongJae Park improves two problematic behaviors of DAMOS that makes it less efficient when core layer filters are used. - The 3 patch series "mm/damon: strictly respect min_nr_regions" from SeongJae Park improves DAMON usability by extending the treatment of the min_nr_regions user-settable parameter. - The 3 patch series "mm/page_alloc: pcp locking cleanup" from Vlastimil Babka is a proper fix for a previously hotfixed SMP=n issue. Code simplifications and cleanups ennsed. - The 16 patch series "mm: cleanups around unmapping / zapping" from David Hildenbrand implements "a bunch of cleanups around unmapping and zapping. Mostly simplifications, code movements, documentation and renaming of zapping functions". - The 6 patch series "support batched checking of the young flag for MGLRU" from Baolin Wang supports batched checking of the young flag for MGLRU. It's part cleanups; one benchmark shows large performance benefits for arm64. - The 5 patch series "memcg: obj stock and slab stat caching cleanups" from Johannes Weiner provides memcg cleanup and robustness improvements. - The 5 patch series "Allow order zero pages in page reporting" from Yuvraj Sakshith enhances page_reporting's free page reporting - it is presently and undesirably order-0 pages when reporting free memory. - The 6 patch series "mm: vma flag tweaks" from Lorenzo Stoakes is cleanup work following from the recent conversion of the VMA flags to a bitmap. - The 10 patch series "mm/damon: add optional debugging-purpose sanity checks" from SeongJae Park adds some more developer-facing debug checks into DAMON core. - The 2 patch series "mm/damon: test and document power-of-2 min_region_sz requirement" from SeongJae Park adds an additional DAMON kunit test and makes some adjustments to the addr_unit parameter handling. - The 3 patch series "mm/damon/core: make passed_sample_intervals comparisons overflow-safe" from SeongJae Park fixes a hard-to-hit time overflow issue in DAMON core. - The 7 patch series "mm/damon: improve/fixup/update ratio calculation, test and documentation" from SeongJae Park is a "batch of misc/minor improvements and fixups" for DAMON. - The 4 patch series "mm: move vma_(kernel\|mmu)_pagesize() out of hugetlb.c" from David Hildenbrand fixes a possible issue with dax-device when CONFIG_HUGETLB=n. Some code movement was required. - The 6 patch series "zram: recompression cleanups and tweaks" from Sergey Senozhatsky provides "a somewhat random mix of fixups, recompression cleanups and improvements" in the zram code. - The 11 patch series "mm/damon: support multiple goal-based quota tuning algorithms" from SeongJae Park extend DAMOS quotas goal auto-tuning to support multiple tuning algorithms that users can select. - The 4 patch series "mm: thp: reduce unnecessary start_stop_khugepaged()" from Breno Leitao fixes the khugpaged sysfs handling so we no longer spam the logs with reams of junk when starting/stopping khugepaged. - The 3 patch series "mm: improve map count checks" from Lorenzo Stoakes provides some cleanups and slight fixes in the mremap, mmap and vma code. - The 5 patch series "mm/damon: support addr_unit on default monitoring targets for modules" from SeongJae Park extends the use of DAMON core's addr_unit tunable. - The 5 patch series "mm: khugepaged cleanups and mTHP prerequisites" from Nico Pache provides cleanups in the khugepaged and is a base for Nico's planned khugepaged mTHP support. - The 15 patch series "mm: memory hot(un)plug and SPARSEMEM cleanups" from David Hildenbrand implements code movement and cleanups in the memhotplug and sparsemem code. - The 2 patch series "mm: remove CONFIG_ARCH_ENABLE_MEMORY_HOTREMOVE and cleanup CONFIG_MIGRATION" from David Hildenbrand rationalizes some memhotplug Kconfig support. - The 6 patch series "change young flag check functions to return bool" from Baolin Wang is "a cleanup patchset to change all young flag check functions to return bool". - The 3 patch series "mm/damon/sysfs: fix memory leak and NULL dereference issues" from Josh Law and SeongJae Park fixes a few potential DAMON bugs. - The 25 patch series "mm/vma: convert vm_flags_t to vma_flags_t in vma code" from "converts a lot of the existing use of the legacy vm_flags_t data type to the new vma_flags_t type which replaces it". Mainly in the vma code. - The 21 patch series "mm: expand mmap_prepare functionality and usage" from Lorenzo Stoakes "expands the mmap_prepare functionality, which is intended to replace the deprecated f_op->mmap hook which has been the source of bugs and security issues for some time". Cleanups, documentation, extension of mmap_prepare into filesystem drivers. - The 13 patch series "mm/huge_memory: refactor zap_huge_pmd()" from Lorenzo Stoakes simplifies and cleans up zap_huge_pmd(). Additional cleanups around vm_normal_folio_pmd() and the softleaf functionality are performed. -----BEGIN PGP SIGNATURE----- iHUEABYIAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCad3HDQAKCRDdBJ7gKXxA jrUQAPwNhPk5nPSxnyxjAeQtOBHqgCdnICeEismLajPKd9aYRgEA0s2XAu3tSUYi GrBnWImHG3s4ePQxVcPCegWTsOUrXgQ= =1Q7o -----END PGP SIGNATURE----- Merge tag 'mm-stable-2026-04-13-21-45' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull MM updates from Andrew Morton: - "maple_tree: Replace big node with maple copy" (Liam Howlett) Mainly prepararatory work for ongoing development but it does reduce stack usage and is an improvement. - "mm, swap: swap table phase III: remove swap_map" (Kairui Song) Offers memory savings by removing the static swap_map. It also yields some CPU savings and implements several cleanups. - "mm: memfd_luo: preserve file seals" (Pratyush Yadav) File seal preservation to LUO's memfd code - "mm: zswap: add per-memcg stat for incompressible pages" (Jiayuan Chen) Additional userspace stats reportng to zswap - "arch, mm: consolidate empty_zero_page" (Mike Rapoport) Some cleanups for our handling of ZERO_PAGE() and zero_pfn - "mm/kmemleak: Improve scan_should_stop() implementation" (Zhongqiu Han) A robustness improvement and some cleanups in the kmemleak code - "Improve khugepaged scan logic" (Vernon Yang) Improve khugepaged scan logic and reduce CPU consumption by prioritizing scanning tasks that access memory frequently - "Make KHO Stateless" (Jason Miu) Simplify Kexec Handover by transitioning KHO from an xarray-based metadata tracking system with serialization to a radix tree data structure that can be passed directly to the next kernel - "mm: vmscan: add PID and cgroup ID to vmscan tracepoints" (Thomas Ballasi and Steven Rostedt) Enhance vmscan's tracepointing - "mm: arch/shstk: Common shadow stack mapping helper and VM_NOHUGEPAGE" (Catalin Marinas) Cleanup for the shadow stack code: remove per-arch code in favour of a generic implementation - "Fix KASAN support for KHO restored vmalloc regions" (Pasha Tatashin) Fix a WARN() which can be emitted the KHO restores a vmalloc area - "mm: Remove stray references to pagevec" (Tal Zussman) Several cleanups, mainly udpating references to "struct pagevec", which became folio_batch three years ago - "mm: Eliminate fake head pages from vmemmap optimization" (Kiryl Shutsemau) Simplify the HugeTLB vmemmap optimization (HVO) by changing how tail pages encode their relationship to the head page - "mm/damon/core: improve DAMOS quota efficiency for core layer filters" (SeongJae Park) Improve two problematic behaviors of DAMOS that makes it less efficient when core layer filters are used - "mm/damon: strictly respect min_nr_regions" (SeongJae Park) Improve DAMON usability by extending the treatment of the min_nr_regions user-settable parameter - "mm/page_alloc: pcp locking cleanup" (Vlastimil Babka) The proper fix for a previously hotfixed SMP=n issue. Code simplifications and cleanups ensued - "mm: cleanups around unmapping / zapping" (David Hildenbrand) A bunch of cleanups around unmapping and zapping. Mostly simplifications, code movements, documentation and renaming of zapping functions - "support batched checking of the young flag for MGLRU" (Baolin Wang) Batched checking of the young flag for MGLRU. It's part cleanups; one benchmark shows large performance benefits for arm64 - "memcg: obj stock and slab stat caching cleanups" (Johannes Weiner) memcg cleanup and robustness improvements - "Allow order zero pages in page reporting" (Yuvraj Sakshith) Enhance free page reporting - it is presently and undesirably order-0 pages when reporting free memory. - "mm: vma flag tweaks" (Lorenzo Stoakes) Cleanup work following from the recent conversion of the VMA flags to a bitmap - "mm/damon: add optional debugging-purpose sanity checks" (SeongJae Park) Add some more developer-facing debug checks into DAMON core - "mm/damon: test and document power-of-2 min_region_sz requirement" (SeongJae Park) An additional DAMON kunit test and makes some adjustments to the addr_unit parameter handling - "mm/damon/core: make passed_sample_intervals comparisons overflow-safe" (SeongJae Park) Fix a hard-to-hit time overflow issue in DAMON core - "mm/damon: improve/fixup/update ratio calculation, test and documentation" (SeongJae Park) A batch of misc/minor improvements and fixups for DAMON - "mm: move vma_(kernel\|mmu)_pagesize() out of hugetlb.c" (David Hildenbrand) Fix a possible issue with dax-device when CONFIG_HUGETLB=n. Some code movement was required. - "zram: recompression cleanups and tweaks" (Sergey Senozhatsky) A somewhat random mix of fixups, recompression cleanups and improvements in the zram code - "mm/damon: support multiple goal-based quota tuning algorithms" (SeongJae Park) Extend DAMOS quotas goal auto-tuning to support multiple tuning algorithms that users can select - "mm: thp: reduce unnecessary start_stop_khugepaged()" (Breno Leitao) Fix the khugpaged sysfs handling so we no longer spam the logs with reams of junk when starting/stopping khugepaged - "mm: improve map count checks" (Lorenzo Stoakes) Provide some cleanups and slight fixes in the mremap, mmap and vma code - "mm/damon: support addr_unit on default monitoring targets for modules" (SeongJae Park) Extend the use of DAMON core's addr_unit tunable - "mm: khugepaged cleanups and mTHP prerequisites" (Nico Pache) Cleanups to khugepaged and is a base for Nico's planned khugepaged mTHP support - "mm: memory hot(un)plug and SPARSEMEM cleanups" (David Hildenbrand) Code movement and cleanups in the memhotplug and sparsemem code - "mm: remove CONFIG_ARCH_ENABLE_MEMORY_HOTREMOVE and cleanup CONFIG_MIGRATION" (David Hildenbrand) Rationalize some memhotplug Kconfig support - "change young flag check functions to return bool" (Baolin Wang) Cleanups to change all young flag check functions to return bool - "mm/damon/sysfs: fix memory leak and NULL dereference issues" (Josh Law and SeongJae Park) Fix a few potential DAMON bugs - "mm/vma: convert vm_flags_t to vma_flags_t in vma code" (Lorenzo Stoakes) Convert a lot of the existing use of the legacy vm_flags_t data type to the new vma_flags_t type which replaces it. Mainly in the vma code. - "mm: expand mmap_prepare functionality and usage" (Lorenzo Stoakes) Expand the mmap_prepare functionality, which is intended to replace the deprecated f_op->mmap hook which has been the source of bugs and security issues for some time. Cleanups, documentation, extension of mmap_prepare into filesystem drivers - "mm/huge_memory: refactor zap_huge_pmd()" (Lorenzo Stoakes) Simplify and clean up zap_huge_pmd(). Additional cleanups around vm_normal_folio_pmd() and the softleaf functionality are performed. * tag 'mm-stable-2026-04-13-21-45' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (369 commits) mm: fix deferred split queue races during migration mm/khugepaged: fix issue with tracking lock mm/huge_memory: add and use has_deposited_pgtable() mm/huge_memory: add and use normal_or_softleaf_folio_pmd() mm: add softleaf_is_valid_pmd_entry(), pmd_to_softleaf_folio() mm/huge_memory: separate out the folio part of zap_huge_pmd() mm/huge_memory: use mm instead of tlb->mm mm/huge_memory: remove unnecessary sanity checks mm/huge_memory: deduplicate zap deposited table call mm/huge_memory: remove unnecessary VM_BUG_ON_PAGE() mm/huge_memory: add a common exit path to zap_huge_pmd() mm/huge_memory: handle buggy PMD entry in zap_huge_pmd() mm/huge_memory: have zap_huge_pmd return a boolean, add kdoc mm/huge: avoid big else branch in zap_huge_pmd() mm/huge_memory: simplify vma_is_specal_huge() mm: on remap assert that input range within the proposed VMA mm: add mmap_action_map_kernel_pages[_full]() uio: replace deprecated mmap hook with mmap_prepare in uio_info drivers: hv: vmbus: replace deprecated mmap hook with mmap_prepare mm: allow handling of stacked mmap_prepare hooks in more drivers ...	2026-04-15 12:59:16 -07:00
Greg Kroah-Hartman	772a896a56	scsi: target: configfs: Bound snprintf() return in tg_pt_gp_members_show() target_tg_pt_gp_members_show() formats LUN paths with snprintf() into a 256-byte stack buffer, then will memcpy() cur_len bytes from that buffer. snprintf() returns the length the output would have had, which can exceed the buffer size when the fabric WWN is long because iSCSI IQN names can be up to 223 bytes. The check at the memcpy() site only guards the destination page write, not the source read, so memcpy() will read past the stack buffer and copy adjacent stack contents to the sysfs reader, which when CONFIG_FORTIFY_SOURCE is enabled, fortify_panic() will be triggered. Commit `27e06650a5` ("scsi: target: target_core_configfs: Add length check to avoid buffer overflow") added the same bound to the target_lu_gp_members_show() but the tg_pt_gp variant was missed so resolve that here. Cc: Martin K. Petersen <martin.petersen@oracle.com> Fixes: `c66ac9db8d` ("[SCSI] target: Add LIO target core v4.0.0-rc6") Assisted-by: gregkh_clanker_t1000 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Link: https://patch.msgid.link/2026041159-garter-theft-3be0@gregkh Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2026-04-13 22:43:56 -04:00
Linus Torvalds	7fe6ac157b	for-7.1/block-20260411 -----BEGIN PGP SIGNATURE----- iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAmna0tgQHGF4Ym9lQGtl cm5lbC5kawAKCRD301j7KXHgptEbD/0ZMEsz5pcN+/bpM9Qva5lVVkByRieua+JA T7L+JMcEigp1Hf2idAPlv1e9dbrtgOGhkjZNlbZenP2MHXBmbUTnzTWDKW5w0ZQ4 UqnVC7fMmxzI57DPt7iG/1WQo8O6QPHWwBof5ZXn0b83qwByTB2oVkAb9ysT7CdM wGk5KnPRLIAWf5o+aZ4LoWE+196jQiszx1m6U58FTqnCgvJ/GyKyrgzx+uvGUgF+ owZT/6TrN7cN9A68fOnmcjEZ7beZXygOQPTn32sF9rEOi8JsgK71EE2LofdVVSNU ES/tyKVJbSNDgUH2b0T84rErT4MtZcw5J29V3k7CVndC+DcT2uLSroPz3lYQjDg9 TLeq7ZLjnyoBG+muboWdXcvBKn3aKLec3nfVSbz6J1xb/Z22gWYy5TZbrGnGH8fJ zBiyKkHMaZi55IdTDWQT3a48h36qFh0Y2wbvZ6uhyYOfXHyj4pA4ccJZgFfmf4ZG flVRFGEL9Tqc82lB8dfy9DBp0ZQSjeBUCd+gyDKjiuWVau5L5iTUeMMkt8yr7qbg PY+ATJcHk5S5zwM2xcZUt5EcHBBbCaKQ6DdRZKwzMMUvCjHlvnWvENVjUtRa9Dng 1vUKpB/e5NGpqD05Iqgyai+OD9/tALc4sUEI2yQ7/dk9pKIXQ4RE9HR/pSkgbjeR LGokj08cgg== =ga3t -----END PGP SIGNATURE----- Merge tag 'for-7.1/block-20260411' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux Pull block updates from Jens Axboe: - Add shared memory zero-copy I/O support for ublk, bypassing per-I/O copies between kernel and userspace by matching registered buffer PFNs at I/O time. Includes selftests. - Refactor bio integrity to support filesystem initiated integrity operations and arbitrary buffer alignment. - Clean up bio allocation, splitting bio_alloc_bioset() into clear fast and slow paths. Add bio_await() and bio_submit_or_kill() helpers, unify synchronous bi_end_io callbacks. - Fix zone write plug refcount handling and plug removal races. Add support for serializing zone writes at QD=1 for rotational zoned devices, yielding significant throughput improvements. - Add SED-OPAL ioctls for Single User Mode management and a STACK_RESET command. - Add io_uring passthrough (uring_cmd) support to the BSG layer. - Replace pp_buf in partition scanning with struct seq_buf. - zloop improvements and cleanups. - drbd genl cleanup, switching to pre_doit/post_doit. - NVMe pull request via Keith: - Fabrics authentication updates - Enhanced block queue limits support - Workqueue usage updates - A new write zeroes device quirk - Tagset cleanup fix for loop device - MD pull requests via Yu Kuai: - Fix raid5 soft lockup in retry_aligned_read() - Fix raid10 deadlock with check operation and nowait requests - Fix raid1 overlapping writes on writemostly disks - Fix sysfs deadlock on array_state=clear - Proactive RAID-5 parity building with llbitmap, with write_zeroes_unmap optimization for initial sync - Fix llbitmap barrier ordering, rdev skipping, and bitmap_ops version mismatch fallback - Fix bcache use-after-free and uninitialized closure - Validate raid5 journal metadata payload size - Various cleanups - Various other fixes, improvements, and cleanups * tag 'for-7.1/block-20260411' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux: (146 commits) ublk: fix tautological comparison warning in ublk_ctrl_reg_buf scsi: bsg: fix buffer overflow in scsi_bsg_uring_cmd() block: refactor blkdev_zone_mgmt_ioctl MAINTAINERS: update ublk driver maintainer email Documentation: ublk: address review comments for SHMEM_ZC docs ublk: allow buffer registration before device is started ublk: replace xarray with IDA for shmem buffer index allocation ublk: simplify PFN range loop in __ublk_ctrl_reg_buf ublk: verify all pages in multi-page bvec fall within registered range ublk: widen ublk_shmem_buf_reg.len to __u64 for 4GB buffer support xfs: use bio_await in xfs_zone_gc_reset_sync block: add a bio_submit_or_kill helper block: factor out a bio_await helper block: unify the synchronous bi_end_io callbacks xfs: fix number of GC bvecs selftests/ublk: add read-only buffer registration test selftests/ublk: add filesystem fio verify test for shmem_zc selftests/ublk: add hugetlbfs shmem_zc test for loop target selftests/ublk: add shared memory zero-copy test selftests/ublk: add UBLK_F_SHMEM_ZC support for loop target ...	2026-04-13 15:51:31 -07:00
Stefan Hajnoczi	070ec6f691	scsi: target: Don't validate ignored fields in PROUT PREEMPT The PERSISTENT RESERVE OUT command's PREEMPT service action provides two different functions: 1. preempting persistent reservations and 2. removing registrations. In the latter case the spec says: b) ignore the contents of the SCOPE field and the TYPE field; The code currently validates the SCOPE and TYPE fields even when PREEMPT is called to remove registrations. This patch achieves spec compliance by validating the SCOPE and TYPE fields only when they will actually be used. To confirm my interpretation of the specification I tested against HPE 3PAR storage and found the TYPE field is indeed ignored in this case. Cc: Maurizio Lombardi <mlombard@redhat.com> Cc: Dmitry Bogdanov <d.bogdanov@yadro.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Link: https://patch.msgid.link/20260402180342.126583-1-stefanha@redhat.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2026-04-08 22:38:18 -04:00
Lorenzo Stoakes (Oracle)	933f05f58a	uio: replace deprecated mmap hook with mmap_prepare in uio_info The f_op->mmap interface is deprecated, so update uio_info to use its successor, mmap_prepare. Therefore, replace the uio_info->mmap hook with a new uio_info->mmap_prepare hook, and update its one user, target_core_user, to both specify this new mmap_prepare hook and also to use the new vm_ops->mapped() hook to continue to maintain a correct udev->kref refcount. Then update uio_mmap() to utilise the mmap_prepare compatibility layer to invoke this callback from the uio mmap invocation. Link: https://lkml.kernel.org/r/157583e4477705b496896c7acd4ac88a937b8fa6.1774045440.git.ljs@kernel.org Signed-off-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org> Acked-by: Vlastimil Babka (SUSE) <vbabka@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Torgue <alexandre.torgue@foss.st.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Bodo Stroesser <bostroesser@gmail.com> Cc: Christian Brauner <brauner@kernel.org> Cc: Clemens Ladisch <clemens@ladisch.de> Cc: David Hildenbrand <david@kernel.org> Cc: David Howells <dhowells@redhat.com> Cc: Dexuan Cui <decui@microsoft.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Haiyang Zhang <haiyangz@microsoft.com> Cc: Jan Kara <jack@suse.cz> Cc: Jann Horn <jannh@google.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: K. Y. Srinivasan <kys@microsoft.com> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Long Li <longli@microsoft.com> Cc: Marc Dionne <marc.dionne@auristor.com> Cc: "Martin K. Petersen" <martin.petersen@oracle.com> Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Miquel Raynal <miquel.raynal@bootlin.com> Cc: Pedro Falcato <pfalcato@suse.de> Cc: Richard Weinberger <richard@nod.at> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Vignesh Raghavendra <vigneshr@ti.com> Cc: Wei Liu <wei.liu@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2026-04-05 13:53:44 -07:00
Martin K. Petersen	fc0a1d05e4	Merge branch 7.0/scsi-fixes into 7.1/scsi-staging Pull in fixes to resolve mpi3mr merge conflict. Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2026-04-02 20:31:22 -04:00
Kees Cook	665fb6a643	scsi: target: Replace strncpy() with strscpy() in VPD dump functions Replace the deprecated[1] strncpy() with strscpy() in transport_dump_vpd_proto_id(), transport_dump_vpd_assoc(), transport_dump_vpd_ident_type(), and transport_dump_vpd_ident(). All four functions follow the same pattern: a local buf[VPD_TMP_BUF_SIZE] (254 bytes) is zeroed with memset(), populated via sprintf()/snprintf() (always NUL-terminated), then conditionally copied to p_buf with strncpy(). The p_buf destination is used as a C string by all callers in target_core_configfs.c: strlen(buf) and sprintf(page+len, "%s", buf) to build sysfs output. NUL-padding is not required: callers in target_core_configfs.c pre-zero p_buf with memset() or initializer before calling these functions, and consume p_buf only as a NUL-terminated C string via strlen() and "%s", never exposing trailing bytes. No behavioral change: the source buf is always NUL-terminated and shorter than VPD_TMP_BUF_SIZE, so strscpy() produces identical output. Link: https://github.com/KSPP/linux/issues/90 [1] Signed-off-by: Kees Cook <kees@kernel.org> Reviewed-by: Kees Cook <keescook@chromium.org> Link: https://patch.msgid.link/20260323171311.work.101-kees@kernel.org Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2026-03-27 16:32:44 -04:00
Thinh Nguyen	01f784fc9d	scsi: target: file: Use kzalloc_flex for aio_cmd The target_core_file doesn't initialize the aio_cmd->iocb for the ki_write_stream. When a write command fd_execute_rw_aio() is executed, we may get a bogus ki_write_stream value, causing unintended write failure status when checking iocb->ki_write_stream > max_write_streams in the block device. Let's just use kzalloc_flex when allocating the aio_cmd and let ki_write_stream=0 to fix this issue. Fixes: `732f25a289` ("fs: add a write stream field to the kiocb") Fixes: `c27683da64` ("block: expose write streams for block device nodes") Cc: stable@vger.kernel.org Signed-off-by: Thinh Nguyen <Thinh.Nguyen@synopsys.com> Link: https://patch.msgid.link/f1a2f81c62f043e31f80bb92d5f29893400c8ee2.1773450782.git.Thinh.Nguyen@synopsys.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2026-03-19 22:07:39 -04:00
Josef Bacik	1333eee56c	scsi: target: tcm_loop: Drain commands in target_reset handler tcm_loop_target_reset() violates the SCSI EH contract: it returns SUCCESS without draining any in-flight commands. The SCSI EH documentation (scsi_eh.rst) requires that when a reset handler returns SUCCESS the driver has made lower layers "forget about timed out scmds" and is ready for new commands. Every other SCSI LLD (virtio_scsi, mpt3sas, ipr, scsi_debug, mpi3mr) enforces this by draining or completing outstanding commands before returning SUCCESS. Because tcm_loop_target_reset() doesn't drain, the SCSI EH reuses in-flight scsi_cmnd structures for recovery commands (e.g. TUR) while the target core still has async completion work queued for the old se_cmd. The memset in queuecommand zeroes se_lun and lun_ref_active, causing transport_lun_remove_cmd() to skip its percpu_ref_put(). The leaked LUN reference prevents transport_clear_lun_ref() from completing, hanging configfs LUN unlink forever in D-state: INFO: task rm:264 blocked for more than 122 seconds. rm D 0 264 258 0x00004000 Call Trace: __schedule+0x3d0/0x8e0 schedule+0x36/0xf0 transport_clear_lun_ref+0x78/0x90 [target_core_mod] core_tpg_remove_lun+0x28/0xb0 [target_core_mod] target_fabric_port_unlink+0x50/0x60 [target_core_mod] configfs_unlink+0x156/0x1f0 [configfs] vfs_unlink+0x109/0x290 do_unlinkat+0x1d5/0x2d0 Fix this by making tcm_loop_target_reset() actually drain commands: 1. Issue TMR_LUN_RESET via tcm_loop_issue_tmr() to drain all commands that the target core knows about (those not yet CMD_T_COMPLETE). 2. Use blk_mq_tagset_busy_iter() to iterate all started requests and flush_work() on each se_cmd — this drains any deferred completion work for commands that already had CMD_T_COMPLETE set before the TMR (which the TMR skips via __target_check_io_state()). This is the same pattern used by mpi3mr, scsi_debug, and libsas to drain outstanding commands during reset. Fixes: `e0eb5d38b7` ("scsi: target: tcm_loop: Use block cmd allocator for se_cmds") Cc: stable@vger.kernel.org Assisted-by: Claude:claude-opus-4-6 Signed-off-by: Josef Bacik <josef@toxicpanda.com> Link: https://patch.msgid.link/27011aa34c8f6b1b94d2e3cf5655b6d037f53428.1773706803.git.josef@toxicpanda.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2026-03-19 21:54:51 -04:00
Junrui Luo	2bf2d65f76	scsi: target: core: Fix integer overflow in UNMAP bounds check sbc_execute_unmap() checks LBA + range does not exceed the device capacity, but does not guard against LBA + range wrapping around on 64-bit overflow. Add an overflow check matching the pattern already used for WRITE_SAME in the same file. Fixes: `86d7182985` ("target: Add sbc_execute_unmap() helper") Reported-by: Yuhao Jiang <danisjiang@gmail.com> Signed-off-by: Junrui Luo <moonafterrain@outlook.com> Link: https://patch.msgid.link/SYBPR01MB7881593C61AD52C69FBDB0BDAF7CA@SYBPR01MB7881.ausprd01.prod.outlook.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2026-03-10 21:56:39 -04:00
Mike Christie	3e70441fb5	scsi: target: core: Fix complete_type use There were two copy and paste type of errors where I swapped 'complete' with 'submit' names. Use the correct variables and definitions. This problem was found by Dmitry Bogdanov and this patch builds on top of his patch to fix a second instance that was found. Fixes: `06933066d8` ("scsi: target: Add support for completing commands from backend context") Signed-off-by: Dmitry Bogdanov <d.bogdanov@yadro.com> [mnc: Fixed second instance] Signed-off-by: Mike Christie <michael.christie@oracle.com> Link: https://patch.msgid.link/20260307220630.131008-1-michael.christie@oracle.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2026-03-10 21:32:01 -04:00
Damien Le Moal	ecd92cfec5	block: remove bdev_nonrot() bdev_nonrot() is simply the negative return value of bdev_rot(). So replace all call sites of bdev_nonrot() with calls to bdev_rot() and remove bdev_nonrot(). Signed-off-by: Damien Le Moal <dlemoal@kernel.org> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2026-03-09 14:30:00 -06:00
Mike Christie	e1502d990c	scsi: target: Allow userspace to set the completion type This allows userspace to request if we complete in the backend context or the frontend driver. It works the same as submission where you can write 0 to 2 to complete_type: 0 - Use the fabric driver's preference. 1 - Complete from the backend calling context if the fabric supports it. 2 - Queue the completion to LIO's completion workqueue. Signed-off-by: Mike Christie <michael.christie@oracle.com> Link: https://patch.msgid.link/20260222232946.7637-4-michael.christie@oracle.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2026-02-28 21:04:03 -05:00
Mike Christie	89663fb2e5	scsi: target: Use driver completion preference by default This has us use the driver's completion preference by default. There is no behavior changes with this patch and we queue completion to LIO's completion workqueue by default. Signed-off-by: Mike Christie <michael.christie@oracle.com> Link: https://patch.msgid.link/20260222232946.7637-3-michael.christie@oracle.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2026-02-28 21:04:02 -05:00
Mike Christie	06933066d8	scsi: target: Add support for completing commands from backend context To complete a command several drivers just drop their reference and add it to list to be processed by a driver specific thread. So there's no need to go from backend context to the LIO thread then to the driver's thread. When avoiding the LIO thread, IOPS can increase from 20-30% for workloads like: fio --filename=/dev/sdb --direct=1 --rw=randrw --bs=8K \ --ioengine=libaio --iodepth=128 --numjobs=$jobs where increasing jobs increases the performance improvement (this is using NVMe drives with LIO's submit_type=1 to directly submit). Add the infrastructure so drivers and userspace can control how to complete a command like is done for the submission path. In this commit there is no behavior change and we continue to defer to the LIO workqueue thread. In the subsequent commits we will allow drivers to report what they support and allow userspace to control the behavior. Signed-off-by: Mike Christie <michael.christie@oracle.com> Link: https://patch.msgid.link/20260222232946.7637-2-michael.christie@oracle.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2026-02-28 21:04:02 -05:00
Prithvi Tambewagh	14d4ac19d1	scsi: target: Fix recursive locking in __configfs_open_file() In flush_write_buffer, &p->frag_sem is acquired and then the loaded store function is called, which, here, is target_core_item_dbroot_store(). This function called filp_open(), following which these functions were called (in reverse order), according to the call trace: down_read __configfs_open_file do_dentry_open vfs_open do_open path_openat do_filp_open file_open_name filp_open target_core_item_dbroot_store flush_write_buffer configfs_write_iter target_core_item_dbroot_store() tries to validate the new file path by trying to open the file path provided to it; however, in this case, the bug report shows: db_root: not a directory: /sys/kernel/config/target/dbroot indicating that the same configfs file was tried to be opened, on which it is currently working on. Thus, it is trying to acquire frag_sem semaphore of the same file of which it already holds the semaphore obtained in flush_write_buffer(), leading to acquiring the semaphore in a nested manner and a possibility of recursive locking. Fix this by modifying target_core_item_dbroot_store() to use kern_path() instead of filp_open() to avoid opening the file using filesystem-specific function __configfs_open_file(), and further modifying it to make this fix compatible. Reported-by: syzbot+f6e8174215573a84b797@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=f6e8174215573a84b797 Tested-by: syzbot+f6e8174215573a84b797@syzkaller.appspotmail.com Cc: stable@vger.kernel.org Signed-off-by: Prithvi Tambewagh <activprithvi@gmail.com> Reviewed-by: Dmitry Bogdanov <d.bogdanov@yadro.com> Link: https://patch.msgid.link/20260216062002.61937-1-activprithvi@gmail.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2026-02-28 20:41:52 -05:00
Linus Torvalds	32a92f8c89	Convert more 'alloc_obj' cases to default GFP_KERNEL arguments This converts some of the visually simpler cases that have been split over multiple lines. I only did the ones that are easy to verify the resulting diff by having just that final GFP_KERNEL argument on the next line. Somebody should probably do a proper coccinelle script for this, but for me the trivial script actually resulted in an assertion failure in the middle of the script. I probably had made it a bit _too_ trivial. So after fighting that far a while I decided to just do some of the syntactically simpler cases with variations of the previous 'sed' scripts. The more syntactically complex multi-line cases would mostly really want whitespace cleanup anyway. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2026-02-21 20:03:00 -08:00
Linus Torvalds	323bbfcf1e	Convert 'alloc_flex' family to use the new default GFP_KERNEL argument This is the exact same thing as the 'alloc_obj()' version, only much smaller because there are a lot fewer users of the alloc_flex() interface. As with alloc_obj() version, this was done entirely with mindless brute force, using the same script, except using 'flex' in the pattern rather than 'objs'. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2026-02-21 17:09:51 -08:00
Linus Torvalds	bf4afc53b7	Convert 'alloc_obj' family to use the new default GFP_KERNEL argument This was done entirely with mindless brute force, using git grep -l '\<k[vmz]alloc_objs(., GFP_KERNEL)' \| xargs sed -i 's/$alloc_objs(.*$, GFP_KERNEL)/\1)/' to convert the new alloc_obj() users that had a simple GFP_KERNEL argument to just drop that argument. Note that due to the extreme simplicity of the scripting, any slightly more complex cases spread over multiple lines would not be triggered: they definitely exist, but this covers the vast bulk of the cases, and the resulting diff is also then easier to check automatically. For the same reason the 'flex' versions will be done as a separate conversion. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2026-02-21 17:09:51 -08:00
Kees Cook	69050f8d6d	treewide: Replace kmalloc with kmalloc_obj for non-scalar types This is the result of running the Coccinelle script from scripts/coccinelle/api/kmalloc_objs.cocci. The script is designed to avoid scalar types (which need careful case-by-case checking), and instead replace kmalloc-family calls that allocate struct or union object instances: Single allocations: kmalloc(sizeof(TYPE), ...) are replaced with: kmalloc_obj(TYPE, ...) Array allocations: kmalloc_array(COUNT, sizeof(TYPE), ...) are replaced with: kmalloc_objs(TYPE, COUNT, ...) Flex array allocations: kmalloc(struct_size(PTR, FAM, COUNT), ...) are replaced with: kmalloc_flex(PTR, FAM, COUNT, ...) (where TYPE may also be VAR) The resulting allocations no longer return "void ", instead returning "TYPE ". Signed-off-by: Kees Cook <kees@kernel.org>	2026-02-21 01:02:28 -08:00
Linus Torvalds	d4a379a52c	SCSI misc on 20260212 Usual driver updates (qla2xxx, mpi3mr, mpt3sas, ufs) plus assorted cleanups and fixes. The biggest core change is the massive code motion in the sd driver to remove forward declarations and the most significant change is to enumify the queuecommand return. Signed-off-by: James E.J. Bottomley <James.Bottomley@HansenPartnership.com> -----BEGIN PGP SIGNATURE----- iLgEABMIAGAWIQTnYEDbdso9F2cI+arnQslM7pishQUCaY4ljBsUgAAAAAAEAA5t YW51MiwyLjUrMS4xMSwyLDImHGphbWVzLmJvdHRvbWxleUBoYW5zZW5wYXJ0bmVy c2hpcC5jb20ACgkQ50LJTO6YrIWFlwEAr9nc1ntxH4UNPgFCVjKyAOa5IE+p5o5C 2lwQIufcihEBAORvI9KO6AoEK6v9TmMKZXoyVsDRFe79fQE5NwCjfAA3 =My8f -----END PGP SIGNATURE----- Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull SCSI updates from James Bottomley: "Usual driver updates (qla2xxx, mpi3mr, mpt3sas, ufs) plus assorted cleanups and fixes. The biggest core change is the massive code motion in the sd driver to remove forward declarations and the most significant change is to enumify the queuecommand return" * tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (78 commits) scsi: csiostor: Fix dereference of null pointer rn scsi: buslogic: Reduce stack usage scsi: ufs: host: mediatek: Require CONFIG_PM scsi: ufs: mediatek: Fix page faults in ufs_mtk_clk_scale() trace event scsi: smartpqi: Fix memory leak in pqi_report_phys_luns() scsi: mpi3mr: Make driver probing asynchronous scsi: ufs: core: Flush exception handling work when RPM level is zero scsi: efct: Use IRQF_ONESHOT and default primary handler scsi: ufs: core: Use a host-wide tagset in SDB mode scsi: qla2xxx: target: Add WQ_PERCPU to alloc_workqueue() users scsi: qla2xxx: Add WQ_PERCPU to alloc_workqueue() users scsi: qla4xxx: Add WQ_PERCPU to alloc_workqueue() users scsi: mpi3mr: Driver version update to 8.17.0.3.50 scsi: mpi3mr: Fixed the W=1 compilation warning scsi: mpi3mr: Record and report controller firmware faults scsi: mpi3mr: Update MPI Headers to revision 39 scsi: mpi3mr: Use negotiated link rate from DevicePage0 scsi: mpi3mr: Avoid redundant diag-fault resets scsi: mpi3mr: Rename log data save helper to reflect threaded/BH context scsi: mpi3mr: Add module parameter to control threaded IRQ polling ...	2026-02-12 15:43:02 -08:00
Linus Torvalds	136114e0ab	mm.git review status for linus..mm-nonmm-stable Total patches: 107 Reviews/patch: 1.07 Reviewed rate: 67% - The 2 patch series "ocfs2: give ocfs2 the ability to reclaim suballocator free bg" from Heming Zhao saves disk space by teaching ocfs2 to reclaim suballocator block group space. - The 4 patch series "Add ARRAY_END(), and use it to fix off-by-one bugs" from Alejandro Colomar adds the ARRAY_END() macro and uses it in various places. - The 2 patch series "vmcoreinfo: support VMCOREINFO_BYTES larger than PAGE_SIZE" from Pnina Feder makes the vmcore code future-safe, if VMCOREINFO_BYTES ever exceeds the page size. - The 7 patch series "kallsyms: Prevent invalid access when showing module buildid" from Petr Mladek cleans up kallsyms code related to module buildid and fixes an invalid access crash when printing backtraces. - The 3 patch series "Address page fault in ima_restore_measurement_list()" from Harshit Mogalapalli fixes a kexec-related crash that can occur when booting the second-stage kernel on x86. - The 6 patch series "kho: ABI headers and Documentation updates" from Mike Rapoport updates the kexec handover ABI documentation. - The 4 patch series "Align atomic storage" from Finn Thain adds the __aligned attribute to atomic_t and atomic64_t definitions to get natural alignment of both types on csky, m68k, microblaze, nios2, openrisc and sh. - The 2 patch series "kho: clean up page initialization logic" from Pratyush Yadav simplifies the page initialization logic in kho_restore_page(). - The 6 patch series "Unload linux/kernel.h" from Yury Norov moves several things out of kernel.h and into more appropriate places. - The 7 patch series "don't abuse task_struct.group_leader" from Oleg Nesterov removes the usage of ->group_leader when it is "obviously unnecessary". - The 5 patch series "list private v2 & luo flb" from Pasha Tatashin adds some infrastructure improvements to the live update orchestrator. -----BEGIN PGP SIGNATURE----- iHUEABYKAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCaY4giAAKCRDdBJ7gKXxA jgusAQDnKkP8UWTqXPC1jI+OrDJGU5ciAx8lzLeBVqMKzoYk9AD/TlhT2Nlx+Ef6 0HCUHUD0FMvAw/7/Dfc6ZKxwBEIxyww= =mmsH -----END PGP SIGNATURE----- Merge tag 'mm-nonmm-stable-2026-02-12-10-48' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull non-MM updates from Andrew Morton: - "ocfs2: give ocfs2 the ability to reclaim suballocator free bg" saves disk space by teaching ocfs2 to reclaim suballocator block group space (Heming Zhao) - "Add ARRAY_END(), and use it to fix off-by-one bugs" adds the ARRAY_END() macro and uses it in various places (Alejandro Colomar) - "vmcoreinfo: support VMCOREINFO_BYTES larger than PAGE_SIZE" makes the vmcore code future-safe, if VMCOREINFO_BYTES ever exceeds the page size (Pnina Feder) - "kallsyms: Prevent invalid access when showing module buildid" cleans up kallsyms code related to module buildid and fixes an invalid access crash when printing backtraces (Petr Mladek) - "Address page fault in ima_restore_measurement_list()" fixes a kexec-related crash that can occur when booting the second-stage kernel on x86 (Harshit Mogalapalli) - "kho: ABI headers and Documentation updates" updates the kexec handover ABI documentation (Mike Rapoport) - "Align atomic storage" adds the __aligned attribute to atomic_t and atomic64_t definitions to get natural alignment of both types on csky, m68k, microblaze, nios2, openrisc and sh (Finn Thain) - "kho: clean up page initialization logic" simplifies the page initialization logic in kho_restore_page() (Pratyush Yadav) - "Unload linux/kernel.h" moves several things out of kernel.h and into more appropriate places (Yury Norov) - "don't abuse task_struct.group_leader" removes the usage of ->group_leader when it is "obviously unnecessary" (Oleg Nesterov) - "list private v2 & luo flb" adds some infrastructure improvements to the live update orchestrator (Pasha Tatashin) * tag 'mm-nonmm-stable-2026-02-12-10-48' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (107 commits) watchdog/hardlockup: simplify perf event probe and remove per-cpu dependency procfs: fix missing RCU protection when reading real_parent in do_task_stat() watchdog/softlockup: fix sample ring index wrap in need_counting_irqs() kcsan, compiler_types: avoid duplicate type issues in BPF Type Format kho: fix doc for kho_restore_pages() tests/liveupdate: add in-kernel liveupdate test liveupdate: luo_flb: introduce File-Lifecycle-Bound global state liveupdate: luo_file: Use private list list: add kunit test for private list primitives list: add primitives for private list manipulations delayacct: fix uapi timespec64 definition panic: add panic_force_cpu= parameter to redirect panic to a specific CPU netclassid: use thread_group_leader(p) in update_classid_task() RDMA/umem: don't abuse current->group_leader drm/pan*: don't abuse current->group_leader drm/amd: kill the outdated "Only the pthreads threading model is supported" checks drm/amdgpu: don't abuse current->group_leader android/binder: use same_thread_group(proc->tsk, current) in binder_mmap() android/binder: don't abuse current->group_leader kho: skip memoryless NUMA nodes when reserving scratch areas ...	2026-02-12 12:13:01 -08:00
Linus Torvalds	0c00ed308d	for-7.0/block-20260206 -----BEGIN PGP SIGNATURE----- iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAmmGLwcQHGF4Ym9lQGtl cm5lbC5kawAKCRD301j7KXHgpv+TD/48S2HTnMhmW6AtFYWErQ+sEKXpHrxbYe7S +qR8/g/T+QSfhfqPwZEuagndFKtIP3LJfaXGSP1Lk1RfP9NLQy91v33Ibe4DjHkp etWSfnMHA9MUAoWKmg8EvncB2G+ZQFiYCpjazj5tKHD9S2+psGMuL8kq6qzMJE83 uhpb8WutUl4aSIXbMSfyGlwBhI1MjjRbbWlIBmg4yC8BWt1sH8Qn2L2GNVylEIcX U8At3KLgPGn0axSg4yGMAwTqtGhL/jwdDyeczbmRlXuAr4iVL9UX/yADCYkazt6U ttQ2/H+cxCwfES84COx9EteAatlbZxo6wjGvZ3xOMiMJVTjYe1x6Gkcckq+LrZX6 tjofi2KK78qkrMXk1mZMkZjpyUWgRtCswhDllbQyqFs0SwzQtno2//Rk8HU9dhbt pkpryDbGFki9X3upcNyEYp5TYflpW6YhAzShYgmE6KXim2fV8SeFLviy0erKOAl+ fwjTE6KQ5QoQv0s3WxkWa4lREm34O6IHrCUmbiPm5CruJnQDhqAN2QZIDgYC4WAf 0gu9cR/O4Vxu7TQXrumPs5q+gCyDU0u0B8C3mG2s+rIo+PI5cVZKs2OIZ8HiPo0F x73kR/pX3DMe35ZQkQX22ymMuowV+aQouDLY9DTwakP5acdcg7h7GZKABk6VLB06 gUIsnxURiQ== =jNzW -----END PGP SIGNATURE----- Merge tag 'for-7.0/block-20260206' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux Pull block updates from Jens Axboe: - Support for batch request processing for ublk, improving the efficiency of the kernel/ublk server communication. This can yield nice 7-12% performance improvements - Support for integrity data for ublk - Various other ublk improvements and additions, including a ton of selftests additions and updated - Move the handling of blk-crypto software fallback from below the block layer to above it. This reduces the complexity of dealing with bio splitting - Series fixing a number of potential deadlocks in blk-mq related to the queue usage counter and writeback throttling and rq-qos debugfs handling - Add an async_depth queue attribute, to resolve a performance regression that's been around for a qhilw related to the scheduler depth handling - Only use task_work for IOPOLL completions on NVMe, if it is necessary to do so. An earlier fix for an issue resulted in all these completions being punted to task_work, to guarantee that completions were only run for a given io_uring ring when it was local to that ring. With the new changes, we can detect if it's necessary to use task_work or not, and avoid it if possible. - rnbd fixes: - Fix refcount underflow in device unmap path - Handle PREFLUSH and NOUNMAP flags properly in protocol - Fix server-side bi_size for special IOs - Zero response buffer before use - Fix trace format for flags - Add .release to rnbd_dev_ktype - MD pull requests via Yu Kuai - Fix raid5_run() to return error when log_init() fails - Fix IO hang with degraded array with llbitmap - Fix percpu_ref not resurrected on suspend timeout in llbitmap - Fix GPF in write_page caused by resize race - Fix NULL pointer dereference in process_metadata_update - Fix hang when stopping arrays with metadata through dm-raid - Fix any_working flag handling in raid10_sync_request - Refactor sync/recovery code path, improve error handling for badblocks, and remove unused recovery_disabled field - Consolidate mddev boolean fields into mddev_flags - Use mempool to allocate stripe_request_ctx and make sure max_sectors is not less than io_opt in raid5 - Fix return value of mddev_trylock - Fix memory leak in raid1_run() - Add Li Nan as mdraid reviewer - Move phys_vec definitions to the kernel types, mostly in preparation for some VFIO and RDMA changes - Improve the speed for secure erase for some devices - Various little rust updates - Various other minor fixes, improvements, and cleanups * tag 'for-7.0/block-20260206' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux: (162 commits) blk-mq: ABI/sysfs-block: fix docs build warnings selftests: ublk: organize test directories by test ID block: decouple secure erase size limit from discard size limit block: remove redundant kill_bdev() call in set_blocksize() blk-mq: add documentation for new queue attribute async_dpeth block, bfq: convert to use request_queue->async_depth mq-deadline: covert to use request_queue->async_depth kyber: covert to use request_queue->async_depth blk-mq: add a new queue sysfs attribute async_depth blk-mq: factor out a helper blk_mq_limit_depth() blk-mq-sched: unify elevators checking for async requests block: convert nr_requests to unsigned int block: don't use strcpy to copy blockdev name blk-mq-debugfs: warn about possible deadlock blk-mq-debugfs: add missing debugfs_mutex in blk_mq_debugfs_register_hctxs() blk-mq-debugfs: remove blk_mq_debugfs_unregister_rqos() blk-mq-debugfs: make blk_mq_debugfs_register_rqos() static blk-rq-qos: fix possible debugfs_mutex deadlock blk-mq-debugfs: factor out a helper to register debugfs for all rq_qos blk-wbt: fix possible deadlock to nest pcpu_alloc_mutex under q_usage_counter ...	2026-02-09 17:57:21 -08:00
Kery Qi	b2d6b1d443	scsi: firewire: sbp-target: Fix overflow in sbp_make_tpg() The code in sbp_make_tpg() limits "tpgt" to UINT_MAX but the data type of "tpg->tport_tpgt" is u16. This causes a type truncation issue. When a user creates a TPG via configfs mkdir, for example: mkdir /sys/kernel/config/target/sbp/<wwn>/tpgt_70000 The value 70000 passes the "tpgt > UINT_MAX" check since 70000 is far less than 4294967295. However, when assigned to the u16 field tpg->tport_tpgt, the value is silently truncated to 4464 (70000 & 0xFFFF). This causes the value the user specified to differ from what is actually stored, leading to confusion and potential unexpected behavior. Fix this by changing the type of "tpgt" to u16 and using kstrtou16() which will properly reject values outside the u16 range. Fixes: `a511ce3397` ("sbp-target: Initial merge of firewire/ieee-1394 target mode support") Signed-off-by: Kery Qi <qikeyu2017@gmail.com> Link: https://patch.msgid.link/20260121114515.1829-2-qikeyu2017@gmail.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2026-01-23 22:41:21 -05:00
Bart Van Assche	0db3f51839	scsi: Change the return type of the .queuecommand() callback In clang version 21.1 and later the -Wimplicit-enum-enum-cast warning option has been introduced. This warning is enabled by default and can be used to catch .queuecommand() implementations that return another value than 0 or one of the SCSI_MLQUEUE_* constants. Hence this patch that changes the return type of the .queuecommand() implementations from 'int' into 'enum scsi_qc_status'. No functionality has been changed. Cc: Damien Le Moal <dlemoal@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Link: https://patch.msgid.link/20260115210357.2501991-6-bvanassche@acm.org Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2026-01-23 21:32:34 -05:00
Randy Dunlap	24c776355f	kernel.h: drop hex.h and update all hex.h users Remove <linux/hex.h> from <linux/kernel.h> and update all users/callers of hex.h interfaces to directly #include <linux/hex.h> as part of the process of putting kernel.h on a diet. Removing hex.h from kernel.h means that 36K C source files don't have to pay the price of parsing hex.h for the roughly 120 C source files that need it. This change has been build-tested with allmodconfig on most ARCHes. Also, all users/callers of <linux/hex.h> in the entire source tree have been updated if needed (if not already #included). Link: https://lkml.kernel.org/r/20251215005206.2362276-1-rdunlap@infradead.org Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Reviewed-by: Andy Shevchenko <andriy.shevchenko@intel.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Yury Norov (NVIDIA) <yury.norov@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2026-01-20 19:44:19 -08:00
Ming Lei	5e2fde1a94	block: pass io_comp_batch to rq_end_io_fn callback Add a third parameter 'const struct io_comp_batch *' to the rq_end_io_fn callback signature. This allows end_io handlers to access the completion batch context when requests are completed via blk_mq_end_request_batch(). The io_comp_batch is passed from blk_mq_end_request_batch(), while NULL is passed from __blk_mq_end_request() and blk_mq_put_rq_ref() which don't have batch context. This infrastructure change enables drivers to detect whether they're being called from a batched completion path (like iopoll) and access additional context stored in the io_comp_batch. Update all rq_end_io_fn implementations: - block/blk-mq.c: blk_end_sync_rq - block/blk-flush.c: flush_end_io, mq_flush_data_end_io - drivers/nvme/host/ioctl.c: nvme_uring_cmd_end_io - drivers/nvme/host/core.c: nvme_keep_alive_end_io - drivers/nvme/host/pci.c: abort_endio, nvme_del_queue_end, nvme_del_cq_end - drivers/nvme/target/passthru.c: nvmet_passthru_req_done - drivers/scsi/scsi_error.c: eh_lock_door_done - drivers/scsi/sg.c: sg_rq_end_io - drivers/scsi/st.c: st_scsi_execute_end - drivers/target/target_core_pscsi.c: pscsi_req_done - drivers/md/dm-rq.c: end_clone_request Signed-off-by: Ming Lei <ming.lei@redhat.com> Reviewed-by: Kanchan Joshi <joshi.k@samsung.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2026-01-20 10:12:54 -07:00
Maurizio Lombardi	84dc603739	scsi: target: iscsi: Fix use-after-free in iscsit_dec_session_usage_count() In iscsit_dec_session_usage_count(), the function calls complete() while holding the sess->session_usage_lock. Similar to the connection usage count logic, the waiter signaled by complete() (e.g., in the session release path) may wake up and free the iscsit_session structure immediately. This creates a race condition where the current thread may attempt to execute spin_unlock_bh() on a session structure that has already been deallocated, resulting in a KASAN slab-use-after-free. To resolve this, release the session_usage_lock before calling complete() to ensure all dereferences of the sess pointer are finished before the waiter is allowed to proceed with deallocation. Signed-off-by: Maurizio Lombardi <mlombard@redhat.com> Reported-by: Zhaojuan Guo <zguo@redhat.com> Reviewed-by: Mike Christie <michael.christie@oracle.com> Link: https://patch.msgid.link/20260112165352.138606-3-mlombard@redhat.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2026-01-16 23:02:15 -05:00
Maurizio Lombardi	9411a89e9e	scsi: target: iscsi: Fix use-after-free in iscsit_dec_conn_usage_count() In iscsit_dec_conn_usage_count(), the function calls complete() while holding the conn->conn_usage_lock. As soon as complete() is invoked, the waiter (such as iscsit_close_connection()) may wake up and proceed to free the iscsit_conn structure. If the waiter frees the memory before the current thread reaches spin_unlock_bh(), it results in a KASAN slab-use-after-free as the function attempts to release a lock within the already-freed connection structure. Fix this by releasing the spinlock before calling complete(). Signed-off-by: Maurizio Lombardi <mlombard@redhat.com> Reported-by: Zhaojuan Guo <zguo@redhat.com> Reviewed-by: Mike Christie <michael.christie@oracle.com> Link: https://patch.msgid.link/20260112165352.138606-2-mlombard@redhat.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2026-01-16 23:02:15 -05:00
Christophe JAILLET	ae62d62b1c	scsi: target: Constify struct configfs_item_operations and configfs_group_operations 'struct configfs_item_operations' and 'configfs_group_operations' are not modified in this driver. Constifying these structures moves some data to a read-only section, so increases overall security, especially when the structure holds some function pointers. On a x86_64, with allmodconfig: Before: ====== text data bss dec hex filename 151831 80058 4832 236721 39cb1 drivers/target/target_core_configfs.o 45200 16658 0 61858 f1a2 drivers/target/target_core_fabric_configfs.o After: ===== text data bss dec hex filename 152599 79290 4832 236721 39cb1 drivers/target/target_core_configfs.o 46352 15506 0 61858 f1a2 drivers/target/target_core_fabric_configfs.o Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Link: https://patch.msgid.link/a0f25237ae86b8c4dd7a3876c4ed2dc3de200173.1767008082.git.christophe.jaillet@wanadoo.fr Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2026-01-04 15:46:10 -05:00
ReBeating	8e8e8e7e84	scsi: target: sbp: Potential integer overflow in sbp_make_tpg() The variable tpgt in sbp_make_tpg() is defined as unsigned long and is assigned to tpgt->tport_tpgt, which is defined as u16. This may cause an integer overflow when tpgt is greater than USHRT_MAX (65535). I haven't tried to trigger it myself, but it is possible to trigger it by calling sbp_make_tpg() with a large value for tpgt. Modify the type of tpgt to match tpgt->tport_tpgt and adjusted the relevant code accordingly. This patch is similar to commit `59c816c1f2` ("vhost/scsi: potential memory corruption"). Signed-off-by: ReBeating <rebeating@163.com> Link: https://patch.msgid.link/20251226031936.852-1-rebeating@163.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2026-01-04 15:44:17 -05:00
Gulam Mohamed	7011e8aafe	scsi: target: core: Add emulation for REPORT IDENTIFYING INFORMATION Add emulation for REPORT IDENTIFYING INFORMATION command using the configfs file 'pd_text_id_info' in target core module. The configfs file is created in /sys/kernel/config/target/core/<backend type>/ <backing_store_name>/wwn/. An emulation function, spc_emulate_report_id_info(), is defined to return the identification string based on the contents of 'pd_text_id_info'. The details of the REPORT IDENTIFYING INFORMATION command is defined in section 6.32 of SPC4. [mkp: checkpatch tweaks] Signed-off-by: Gulam Mohamed <gulam.mohamed@oracle.com> Reviewed-by: John Garry <john.g.garry@oracle.com> Link: https://patch.msgid.link/20251201110716.227588-1-gulam.mohamed@oracle.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2025-12-16 21:07:22 -05:00
Linus Torvalds	6a1636e066	SCSI misc on 20251214 The only core fix is in doc; all the others are in drivers, with the biggest impacts in libsas being the rollback on error handling and in ufs coming from a couple of error handling fixes, one causing a crash if it's activated before scanning and the other fixing W-LUN resumption. Signed-off-by: James E.J. Bottomley <James.Bottomley@HansenPartnership.com> -----BEGIN PGP SIGNATURE----- iLgEABMIAGAWIQTnYEDbdso9F2cI+arnQslM7pishQUCaT4RWBsUgAAAAAAEAA5t YW51MiwyLjUrMS4xMSwyLDImHGphbWVzLmJvdHRvbWxleUBoYW5zZW5wYXJ0bmVy c2hpcC5jb20ACgkQ50LJTO6YrIVfSgEA23oY0kPjLLkNj4KAjLoWVPmZ/5efKw49 m6YUwfvzTS8BAIq4eFHVecwAM8FAe9NwuOKgO5d9u6epAy6wzBeP2jly =cpHq -----END PGP SIGNATURE----- Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull SCSI fixes from James Bottomley: "The only core fix is in doc; all the others are in drivers, with the biggest impacts in libsas being the rollback on error handling and in ufs coming from a couple of error handling fixes, one causing a crash if it's activated before scanning and the other fixing W-LUN resumption" * tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: scsi: ufs: qcom: Fix confusing cleanup.h syntax scsi: libsas: Add rollback handling when an error occurs scsi: device_handler: Return error pointer in scsi_dh_attached_handler_name() scsi: ufs: core: Fix a deadlock in the frequency scaling code scsi: ufs: core: Fix an error handler crash scsi: Revert "scsi: libsas: Fix exp-attached device scan after probe failure scanned in again after probe failed" scsi: ufs: core: Fix RPMB link error by reversing Kconfig dependencies scsi: qla4xxx: Use time conversion macros scsi: qla2xxx: Enable/disable IRQD_NO_BALANCING during reset scsi: ipr: Enable/disable IRQD_NO_BALANCING during reset scsi: imm: Fix use-after-free bug caused by unfinished delayed work scsi: target: sbp: Remove KMSG_COMPONENT macro scsi: core: Correct documentation for scsi_device_quiesce() scsi: mpi3mr: Prevent duplicate SAS/SATA device entries in channel 1 scsi: target: Reset t_task_cdb pointer in error case scsi: ufs: core: Fix EH failure after W-LUN resume error	2025-12-14 15:35:35 +12:00
Linus Torvalds	7eb7f5723d	SCSI misc on 20251204 Usual driver updates (ufs, lpfc, target, qla2xxx) plus assorted cleanups and fixes including the WQ_PERCPU series. The biggest core change is the new allocation of pseudo-devices which allow the sending of internal commands to a given SCSI target. Signed-off-by: James E.J. Bottomley <James.Bottomley@HansenPartnership.com> -----BEGIN PGP SIGNATURE----- iLgEABMIAGAWIQTnYEDbdso9F2cI+arnQslM7pishQUCaTJf+BsUgAAAAAAEAA5t YW51MiwyLjUrMS4xMSwyLDImHGphbWVzLmJvdHRvbWxleUBoYW5zZW5wYXJ0bmVy c2hpcC5jb20ACgkQ50LJTO6YrIUC9QEA+q+UqGr7hPTs9C4kdLoxjDpG6tmXo11H ZkuXKjR5rDABAPPe3HV0Bk9C9jpLVPLp+fGJvgw//rib+XtYMjwxAJ71 =bQZL -----END PGP SIGNATURE----- Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull SCSI updates from James Bottomley: "Usual driver updates (ufs, lpfc, target, qla2xxx) plus assorted cleanups and fixes including the WQ_PERCPU series. The biggest core change is the new allocation of pseudo-devices which allow the sending of internal commands to a given SCSI target" * tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (147 commits) scsi: MAINTAINERS: Add the UFS include directory scsi: scsi_debug: Support injecting unaligned write errors scsi: qla2xxx: Fix improper freeing of purex item scsi: ufs: rockchip: Fix compile error without CONFIG_GPIOLIB scsi: ufs: rockchip: Reset controller on PRE_CHANGE of hce enable notify scsi: ufs: core: Use scsi_device_busy() scsi: ufs: core: Fix single doorbell mode support scsi: pm80xx: Add WQ_PERCPU to alloc_workqueue() users scsi: target: Add WQ_PERCPU to alloc_workqueue() users scsi: qedi: Add WQ_PERCPU to alloc_workqueue() users scsi: target: ibmvscsi: Add WQ_PERCPU to alloc_workqueue() users scsi: qedf: Add WQ_PERCPU to alloc_workqueue() users scsi: bnx2fc: Add WQ_PERCPU to alloc_workqueue() users scsi: be2iscsi: Add WQ_PERCPU to alloc_workqueue() users scsi: message: fusion: Add WQ_PERCPU to alloc_workqueue() users scsi: lpfc: WQ_PERCPU added to alloc_workqueue() users scsi: scsi_transport_fc: WQ_PERCPU added to alloc_workqueue users() scsi: scsi_dh_alua: WQ_PERCPU added to alloc_workqueue() users scsi: qla2xxx: WQ_PERCPU added to alloc_workqueue() users scsi: target: sbp: Replace use of system_unbound_wq with system_dfl_wq ...	2025-12-05 19:56:50 -08:00
Linus Torvalds	8f7aa3d3c7	Networking changes for 6.19. Core & protocols ---------------- - Replace busylock at the Tx queuing layer with a lockless list. Resulting in a 300% (4x) improvement on heavy TX workloads, sending twice the number of packets per second, for half the cpu cycles. - Allow constantly busy flows to migrate to a more suitable CPU/NIC queue. Normally we perform queue re-selection when flow comes out of idle, but under extreme circumstances the flows may be constantly busy. Add sysctl to allow periodic rehashing even if it'd risk packet reordering. - Optimize the NAPI skb cache, make it larger, use it in more paths. - Attempt returning Tx skbs to the originating CPU (like we already did for Rx skbs). - Various data structure layout and prefetch optimizations from Eric. - Remove ktime_get() from the recvmsg() fast path, ktime_get() is sadly quite expensive on recent AMD machines. - Extend threaded NAPI polling to allow the kthread busy poll for packets. - Make MPTCP use Rx backlog processing. This lowers the lock pressure, improving the Rx performance. - Support memcg accounting of MPTCP socket memory. - Allow admin to opt sockets out of global protocol memory accounting (using a sysctl or BPF-based policy). The global limits are a poor fit for modern container workloads, where limits are imposed using cgroups. - Improve heuristics for when to kick off AF_UNIX garbage collection. - Allow users to control TCP SACK compression, and default to 33% of RTT. - Add tcp_rcvbuf_low_rtt sysctl to let datacenter users avoid unnecessarily aggressive rcvbuf growth and overshot when the connection RTT is low. - Preserve skb metadata space across skb_push / skb_pull operations. - Support for IPIP encapsulation in the nftables flowtable offload. - Support appending IP interface information to ICMP messages (RFC 5837). - Support setting max record size in TLS (RFC 8449). - Remove taking rtnl_lock from RTM_GETNEIGHTBL and RTM_SETNEIGHTBL. - Use a dedicated lock (and RCU) in MPLS, instead of rtnl_lock. - Let users configure the number of write buffers in SMC. - Add new struct sockaddr_unsized for sockaddr of unknown length, from Kees. - Some conversions away from the crypto_ahash API, from Eric Biggers. - Some preparations for slimming down struct page. - YAML Netlink protocol spec for WireGuard. - Add a tool on top of YAML Netlink specs/lib for reporting commonly computed derived statistics and summarized system state. Driver API ---------- - Add CAN XL support to the CAN Netlink interface. - Add uAPI for reporting PHY Mean Square Error (MSE) diagnostics, as defined by the OPEN Alliance's "Advanced diagnostic features for 100BASE-T1 automotive Ethernet PHYs" specification. - Add DPLL phase-adjust-gran pin attribute (and implement it in zl3073x). - Refactor xfrm_input lock to reduce contention when NIC offloads IPsec and performs RSS. - Add info to devlink params whether the current setting is the default or a user override. Allow resetting back to default. - Add standard device stats for PSP crypto offload. - Leverage DSA frame broadcast to implement simple HSR frame duplication for a lot of switches without dedicated HSR offload. - Add uAPI defines for 1.6Tbps link modes. Device drivers -------------- - Add Motorcomm YT921x gigabit Ethernet switch support. - Add MUCSE driver for N500/N210 1GbE NIC series. - Convert drivers to support dedicated ops for timestamping control, and away from the direct IOCTL handling. While at it support GET operations for PHY timestamping. - Add (and convert most drivers to) a dedicated ethtool callback for reading the Rx ring count. - Significant refactoring efforts in the STMMAC driver, which supports Synopsys turn-key MAC IP integrated into a ton of SoCs. - Ethernet high-speed NICs: - Broadcom (bnxt): - support PPS in/out on all pins - Intel (100G, ice, idpf): - ice: implement standard ethtool and timestamping stats - i40e: support setting the max number of MAC addresses per VF - iavf: support RSS of GTP tunnels for 5G and LTE deployments - nVidia/Mellanox (mlx5): - reduce downtime on interface reconfiguration - disable being an XDP redirect target by default (same as other drivers) to avoid wasting resources if feature is unused - Meta (fbnic): - add support for Linux-managed PCS on 25G, 50G, and 100G links - Wangxun: - support Rx descriptor merge, and Tx head writeback - support Rx coalescing offload - support 25G SPF and 40G QSFP modules - Ethernet virtual: - Google (gve): - allow ethtool to configure rx_buf_len - implement XDP HW RX Timestamping support for DQ descriptor format - Microsoft vNIC (mana): - support HW link state events - handle hardware recovery events when probing the device - Ethernet NICs consumer, and embedded: - usbnet: add support for Byte Queue Limits (BQL) - AMD (amd-xgbe): - add device selftests - NXP (enetc): - add i.MX94 support - Broadcom integrated MACs (bcmgenet, bcmasp): - bcmasp: add support for PHY-based Wake-on-LAN - Broadcom switches (b53): - support port isolation - support BCM5389/97/98 and BCM63XX ARL formats - Lantiq/MaxLinear switches: - support bridge FDB entries on the CPU port - use regmap for register access - allow user to enable/disable learning - support Energy Efficient Ethernet - support configuring RMII clock delays - add tagging driver for MaxLinear GSW1xx switches - Synopsys (stmmac): - support using the HW clock in free running mode - add Eswin EIC7700 support - add Rockchip RK3506 support - add Altera Agilex5 support - Cadence (macb): - cleanup and consolidate descriptor and DMA address handling - add EyeQ5 support - TI: - icssg-prueth: support AF_XDP - Airoha access points: - add missing Ethernet stats and link state callback - add AN7583 support - support out-of-order Tx completion processing - Power over Ethernet: - pd692x0: preserve PSE configuration across reboots - add support for TPS23881B devices - Ethernet PHYs: - Open Alliance OATC14 10BASE-T1S PHY cable diagnostic support - Support 50G SerDes and 100G interfaces in Linux-managed PHYs - micrel: - support for non PTP SKUs of lan8814 - enable in-band auto-negotiation on lan8814 - realtek: - cable testing support on RTL8224 - interrupt support on RTL8221B - motorcomm: support for PHY LEDs on YT853 - microchip: support for LAN867X Rev.D0 PHYs w/ SQI and cable diag - mscc: support for PHY LED control - CAN drivers: - m_can: add support for optional reset and system wake up - remove can_change_mtu() obsoleted by core handling - mcp251xfd: support GPIO controller functionality - Bluetooth: - add initial support for PASTa - WiFi: - split ieee80211.h file, it's way too big - improvements in VHT radiotap reporting, S1G, Channel Switch Announcement handling, rate tracking in mesh networks - improve multi-radio monitor mode support, and add a cfg80211 debugfs interface for it - HT action frame handling on 6 GHz - initial chanctx work towards NAN - MU-MIMO sniffer improvements - WiFi drivers: - RealTek (rtw89): - support USB devices RTL8852AU and RTL8852CU - initial work for RTL8922DE - improved injection support - Intel: - iwlwifi: new sniffer API support - MediaTek (mt76): - WED support for >32-bit DMA - airoha NPU support - regdomain improvements - continued WiFi7/MLO work - Qualcomm/Atheros: - ath10k: factory test support - ath11k: TX power insertion support - ath12k: BSS color change support - ath12k: statistics improvements - brcmfmac: Acer A1 840 tablet quirk - rtl8xxxu: 40 MHz connection fixes/support Signed-off-by: Jakub Kicinski <kuba@kernel.org> -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEE6jPA+I1ugmIBA4hXMUZtbf5SIrsFAmkveRQACgkQMUZtbf5S IrvY7A/+Nb0o4BxLHjPkAl1m3t3q2d0Y29B7SNkwnwEtxAV8EkNeZ3GWrdtDnTQY MYhmc7LEzvz8/lihapr7UJkcokzSASUV54hbez5jDBKC8EEoyUk8FdWDPerwlcRI zmCFNAVFyh9GX8i7wcrzKbDTHT5+GZLbSlGl9U5mhLsDdRlJgH7d8PJ7vWcmtLFY XN0paDyaeHfCl8wReWNAYx4C/I0ODOvlscpO0tnAKhB0ngJbQCKY2t6tn3rOYdif ZSQ5KwVRnJtQ4fYOFMOy9+FSCjVXtyrxF8KLxD+mqom2ZhmO00UpOMl09tqhq3uT WnvwoHUVBt6F+iITHwg5kMgIDPUq1kpUvL4S4UbVSuUm9ZKD+4KRU2ZHRBYMx+MU bsqmtY8/IULClUoRz+tZhltA8eb0NEqNZE2JPOFDiJHn1YiCCkFwxibhir893oM3 sB7x65D7LQI2ty2BBGVGYnwYDPtyaxOA/s3WTwPvLEi3+Y/TGNIIrS9lBLA4U+Yr Gi93WQGVjttMmVyaHgXBUGmi3L52hvolm0AZ8zSRGrnIEpecjhly2KfYuaOzuxXC IHEQ6AFLdRh6JzafXGb/mQwGCHNmhwsY8A49i94fakWQamaL/L6A+1dyPu4LXMqi NwqCmlVb/LKGlfNG+V4wT27srJ+yBA2Vk3tpR1sZQQytFh0LKHI= =UoDR -----END PGP SIGNATURE----- Merge tag 'net-next-6.19' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next Pull networking updates from Jakub Kicinski: "Core & protocols: - Replace busylock at the Tx queuing layer with a lockless list. Resulting in a 300% (4x) improvement on heavy TX workloads, sending twice the number of packets per second, for half the cpu cycles. - Allow constantly busy flows to migrate to a more suitable CPU/NIC queue. Normally we perform queue re-selection when flow comes out of idle, but under extreme circumstances the flows may be constantly busy. Add sysctl to allow periodic rehashing even if it'd risk packet reordering. - Optimize the NAPI skb cache, make it larger, use it in more paths. - Attempt returning Tx skbs to the originating CPU (like we already did for Rx skbs). - Various data structure layout and prefetch optimizations from Eric. - Remove ktime_get() from the recvmsg() fast path, ktime_get() is sadly quite expensive on recent AMD machines. - Extend threaded NAPI polling to allow the kthread busy poll for packets. - Make MPTCP use Rx backlog processing. This lowers the lock pressure, improving the Rx performance. - Support memcg accounting of MPTCP socket memory. - Allow admin to opt sockets out of global protocol memory accounting (using a sysctl or BPF-based policy). The global limits are a poor fit for modern container workloads, where limits are imposed using cgroups. - Improve heuristics for when to kick off AF_UNIX garbage collection. - Allow users to control TCP SACK compression, and default to 33% of RTT. - Add tcp_rcvbuf_low_rtt sysctl to let datacenter users avoid unnecessarily aggressive rcvbuf growth and overshot when the connection RTT is low. - Preserve skb metadata space across skb_push / skb_pull operations. - Support for IPIP encapsulation in the nftables flowtable offload. - Support appending IP interface information to ICMP messages (RFC 5837). - Support setting max record size in TLS (RFC 8449). - Remove taking rtnl_lock from RTM_GETNEIGHTBL and RTM_SETNEIGHTBL. - Use a dedicated lock (and RCU) in MPLS, instead of rtnl_lock. - Let users configure the number of write buffers in SMC. - Add new struct sockaddr_unsized for sockaddr of unknown length, from Kees. - Some conversions away from the crypto_ahash API, from Eric Biggers. - Some preparations for slimming down struct page. - YAML Netlink protocol spec for WireGuard. - Add a tool on top of YAML Netlink specs/lib for reporting commonly computed derived statistics and summarized system state. Driver API: - Add CAN XL support to the CAN Netlink interface. - Add uAPI for reporting PHY Mean Square Error (MSE) diagnostics, as defined by the OPEN Alliance's "Advanced diagnostic features for 100BASE-T1 automotive Ethernet PHYs" specification. - Add DPLL phase-adjust-gran pin attribute (and implement it in zl3073x). - Refactor xfrm_input lock to reduce contention when NIC offloads IPsec and performs RSS. - Add info to devlink params whether the current setting is the default or a user override. Allow resetting back to default. - Add standard device stats for PSP crypto offload. - Leverage DSA frame broadcast to implement simple HSR frame duplication for a lot of switches without dedicated HSR offload. - Add uAPI defines for 1.6Tbps link modes. Device drivers: - Add Motorcomm YT921x gigabit Ethernet switch support. - Add MUCSE driver for N500/N210 1GbE NIC series. - Convert drivers to support dedicated ops for timestamping control, and away from the direct IOCTL handling. While at it support GET operations for PHY timestamping. - Add (and convert most drivers to) a dedicated ethtool callback for reading the Rx ring count. - Significant refactoring efforts in the STMMAC driver, which supports Synopsys turn-key MAC IP integrated into a ton of SoCs. - Ethernet high-speed NICs: - Broadcom (bnxt): - support PPS in/out on all pins - Intel (100G, ice, idpf): - ice: implement standard ethtool and timestamping stats - i40e: support setting the max number of MAC addresses per VF - iavf: support RSS of GTP tunnels for 5G and LTE deployments - nVidia/Mellanox (mlx5): - reduce downtime on interface reconfiguration - disable being an XDP redirect target by default (same as other drivers) to avoid wasting resources if feature is unused - Meta (fbnic): - add support for Linux-managed PCS on 25G, 50G, and 100G links - Wangxun: - support Rx descriptor merge, and Tx head writeback - support Rx coalescing offload - support 25G SPF and 40G QSFP modules - Ethernet virtual: - Google (gve): - allow ethtool to configure rx_buf_len - implement XDP HW RX Timestamping support for DQ descriptor format - Microsoft vNIC (mana): - support HW link state events - handle hardware recovery events when probing the device - Ethernet NICs consumer, and embedded: - usbnet: add support for Byte Queue Limits (BQL) - AMD (amd-xgbe): - add device selftests - NXP (enetc): - add i.MX94 support - Broadcom integrated MACs (bcmgenet, bcmasp): - bcmasp: add support for PHY-based Wake-on-LAN - Broadcom switches (b53): - support port isolation - support BCM5389/97/98 and BCM63XX ARL formats - Lantiq/MaxLinear switches: - support bridge FDB entries on the CPU port - use regmap for register access - allow user to enable/disable learning - support Energy Efficient Ethernet - support configuring RMII clock delays - add tagging driver for MaxLinear GSW1xx switches - Synopsys (stmmac): - support using the HW clock in free running mode - add Eswin EIC7700 support - add Rockchip RK3506 support - add Altera Agilex5 support - Cadence (macb): - cleanup and consolidate descriptor and DMA address handling - add EyeQ5 support - TI: - icssg-prueth: support AF_XDP - Airoha access points: - add missing Ethernet stats and link state callback - add AN7583 support - support out-of-order Tx completion processing - Power over Ethernet: - pd692x0: preserve PSE configuration across reboots - add support for TPS23881B devices - Ethernet PHYs: - Open Alliance OATC14 10BASE-T1S PHY cable diagnostic support - Support 50G SerDes and 100G interfaces in Linux-managed PHYs - micrel: - support for non PTP SKUs of lan8814 - enable in-band auto-negotiation on lan8814 - realtek: - cable testing support on RTL8224 - interrupt support on RTL8221B - motorcomm: support for PHY LEDs on YT853 - microchip: support for LAN867X Rev.D0 PHYs w/ SQI and cable diag - mscc: support for PHY LED control - CAN drivers: - m_can: add support for optional reset and system wake up - remove can_change_mtu() obsoleted by core handling - mcp251xfd: support GPIO controller functionality - Bluetooth: - add initial support for PASTa - WiFi: - split ieee80211.h file, it's way too big - improvements in VHT radiotap reporting, S1G, Channel Switch Announcement handling, rate tracking in mesh networks - improve multi-radio monitor mode support, and add a cfg80211 debugfs interface for it - HT action frame handling on 6 GHz - initial chanctx work towards NAN - MU-MIMO sniffer improvements - WiFi drivers: - RealTek (rtw89): - support USB devices RTL8852AU and RTL8852CU - initial work for RTL8922DE - improved injection support - Intel: - iwlwifi: new sniffer API support - MediaTek (mt76): - WED support for >32-bit DMA - airoha NPU support - regdomain improvements - continued WiFi7/MLO work - Qualcomm/Atheros: - ath10k: factory test support - ath11k: TX power insertion support - ath12k: BSS color change support - ath12k: statistics improvements - brcmfmac: Acer A1 840 tablet quirk - rtl8xxxu: 40 MHz connection fixes/support" * tag 'net-next-6.19' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1381 commits) net: page_pool: sanitise allocation order net: page pool: xa init with destroy on pp init net/mlx5e: Support XDP target xmit with dummy program net/mlx5e: Update XDP features in switch channels selftests/tc-testing: Test CAKE scheduler when enqueue drops packets net/sched: sch_cake: Fix incorrect qlen reduction in cake_drop wireguard: netlink: generate netlink code wireguard: uapi: generate header with ynl-gen wireguard: uapi: move flag enums wireguard: uapi: move enum wg_cmd wireguard: netlink: add YNL specification selftests: drv-net: Fix tolerance calculation in devlink_rate_tc_bw.py selftests: drv-net: Fix and clarify TC bandwidth split in devlink_rate_tc_bw.py selftests: drv-net: Set shell=True for sysfs writes in devlink_rate_tc_bw.py selftests: drv-net: Use Iperf3Runner in devlink_rate_tc_bw.py selftests: drv-net: introduce Iperf3Runner for measurement use cases selftests: drv-net: Add devlink_rate_tc_bw.py to TEST_PROGS net: ps3_gelic_net: Use napi_alloc_skb() and napi_gro_receive() Documentation: net: dsa: mention simple HSR offload helpers Documentation: net: dsa: mention availability of RedBox ...	2025-12-03 17:24:33 -08:00
Linus Torvalds	1d18101a64	kernel-6.19-rc1.cred -----BEGIN PGP SIGNATURE----- iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCaSmOZQAKCRCRxhvAZXjc orJLAP9UD+dX6cicJDkzFZowDakmoIQkR5ZSDwChSlmvLcmquwEAlSq4svVd9Bdl 7kOFUk71DqhVHrPAwO7ap0BxehokEAA= =Cli6 -----END PGP SIGNATURE----- Merge tag 'kernel-6.19-rc1.cred' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs Pull cred guard updates from Christian Brauner: "This contains substantial credential infrastructure improvements adding guard-based credential management that simplifies code and eliminates manual reference counting in many subsystems. Features: - Kernel Credential Guards Add with_kernel_creds() and scoped_with_kernel_creds() guards that allow using the kernel credentials without allocating and copying them. This was requested by Linus after seeing repeated prepare_kernel_creds() calls that duplicate the kernel credentials only to drop them again later. The new guards completely avoid the allocation and never expose the temporary variable to hold the kernel credentials anywhere in callers. - Generic Credential Guards Add scoped_with_creds() guards for the common override_creds() and revert_creds() pattern. This builds on earlier work that made override_creds()/revert_creds() completely reference count free. - Prepare Credential Guards Add prepare credential guards for the more complex pattern of preparing a new set of credentials and overriding the current credentials with them: - prepare_creds() - modify new creds - override_creds() - revert_creds() - put_cred() Cleanups: - Make init_cred static since it should not be directly accessed - Add kernel_cred() helper to properly access the kernel credentials - Fix scoped_class() macro that was introduced two cycles ago - coredump: split out do_coredump() from vfs_coredump() for cleaner credential handling - coredump: move revert_cred() before coredump_cleanup() - coredump: mark struct mm_struct as const - coredump: pass struct linux_binfmt as const - sev-dev: use guard for path" * tag 'kernel-6.19-rc1.cred' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: (36 commits) trace: use override credential guard trace: use prepare credential guard coredump: use override credential guard coredump: use prepare credential guard coredump: split out do_coredump() from vfs_coredump() coredump: mark struct mm_struct as const coredump: pass struct linux_binfmt as const coredump: move revert_cred() before coredump_cleanup() sev-dev: use override credential guards sev-dev: use prepare credential guard sev-dev: use guard for path cred: add prepare credential guard net/dns_resolver: use credential guards in dns_query() cgroup: use credential guards in cgroup_attach_permissions() act: use credential guards in acct_write_process() smb: use credential guards in cifs_get_spnego_key() nfs: use credential guards in nfs_idmap_get_key() nfs: use credential guards in nfs_local_call_write() nfs: use credential guards in nfs_local_call_read() erofs: use credential guards ...	2025-12-01 13:45:41 -08:00
Heiko Carstens	971bb08704	scsi: target: sbp: Remove KMSG_COMPONENT macro The KMSG_COMPONENT macro is a leftover of the s390 specific "kernel message catalog" from 2008 [1] which never made it upstream. The macro was added to s390 code to allow for an out-of-tree patch which used this to generate unique message ids. Also this out-of-tree doesn't exist anymore. The pattern of how the KMSG_COMPONENT is used was partially also used for non s390 specific code, for whatever reasons. Remove the macro in order to get rid of a pointless indirection. [1] https://lwn.net/Articles/292650/ Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Link: https://patch.msgid.link/20251126144027.2213895-1-hca@linux.ibm.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2025-11-29 15:27:33 -05:00

1 2 3 4 5 ...

2603 Commits